freki | Analyze XML extracted from PDFs | Document Editor library

 by xigt | Python | Version: v0.3.0 | License: MIT

kandi X-RAY | freki Summary

freki is a Python library typically used in editor and document-editor applications. kandi reports no bugs and no vulnerabilities for freki; it has a build file, a permissive license, and low support. You can download it from GitHub.

Freki is a package that takes the markup-language formatted output of a PDF-to-text extraction tool (either PDFlib TET or PDFMiner) and detects text blocks such as paragraphs, headers, and figures. Each block is assigned attributes (identifiers, bounding boxes, etc.) for later analysis. Freki was developed for the detection of interlinear glossed text (IGT), but it could serve other purposes as well. It also includes a method to convert plain-text documents to the Freki format for IGT detection and language identification.
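To make the idea of "blocks with identifiers and bounding boxes" concrete, here is a minimal Python sketch. It is not freki's API: the `Line` and `Block` classes, the `group_lines` helper, and the `max_gap` threshold are all hypothetical names chosen for illustration, and the grouping rule (start a new block at a large vertical gap) is a simple stand-in for whatever freki actually does. Coordinates are assumed to be `(x0, y0, x1, y1)` with y increasing down the page.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed layout: (x0, y0, x1, y1), with y growing downward.
BBox = Tuple[float, float, float, float]


@dataclass
class Line:
    """One extracted text line and its position (hypothetical type)."""
    text: str
    bbox: BBox


@dataclass
class Block:
    """A group of lines with an identifier (hypothetical type)."""
    block_id: str
    lines: List[Line]

    @property
    def bbox(self) -> BBox:
        # Smallest box that encloses every line in the block.
        x0s, y0s, x1s, y1s = zip(*(ln.bbox for ln in self.lines))
        return (min(x0s), min(y0s), max(x1s), max(y1s))


def group_lines(lines: List[Line], max_gap: float = 6.0) -> List[Block]:
    """Start a new block whenever the vertical gap between
    consecutive lines exceeds max_gap (an illustrative rule only)."""
    blocks: List[Block] = []
    current: List[Line] = []
    for line in sorted(lines, key=lambda ln: ln.bbox[1]):
        if current and line.bbox[1] - current[-1].bbox[3] > max_gap:
            blocks.append(Block(f"block-{len(blocks) + 1}", current))
            current = []
        current.append(line)
    if current:
        blocks.append(Block(f"block-{len(blocks) + 1}", current))
    return blocks


page = [
    Line("Title", (72, 50, 300, 62)),
    Line("First line", (72, 100, 400, 110)),
    Line("Second line", (72, 113, 380, 123)),
]
for block in group_lines(page):
    print(block.block_id, block.bbox, [ln.text for ln in block.lines])
```

The bounding box is computed on demand from the lines, so it stays correct if lines are added to a block later.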

            Support

              freki has a low active ecosystem.
              It has 19 stars and 3 forks. There are 6 watchers for this library.
              It had no major release in the last 12 months.
              There are 5 open issues and 11 closed issues. On average, issues are closed in 25 days. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of freki is v0.3.0.

            Quality

              freki has 0 bugs and 0 code smells.

            Security

              freki has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              freki code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            License

              freki is licensed under the MIT License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            Reuse

              freki releases are available to install and integrate.
              Build file is available. You can build the component from source.
              Installation instructions are not available. Examples and code snippets are available.
              It has 1142 lines of code, 128 functions and 14 files.
              It has medium code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed freki and identified the functions below as its top functions. The list is intended to give you a quick view of the functionality freki implements and to help you decide whether it suits your requirements.
            • Run analyzer
            • Processes a freki document
            • Return the document id given a path
            • Calculate the average character width of a block
            • Analyze the content of the page
            • Convert a zone to a Block
            • Make a bitmap
            • Computes the parameters for each bitmap
            • Reads the XML page
            • Merge two ranges
            • Append a new item
            • Return dict of spans
            • Iterate over all lines
            • List of fonts
            • Read a file from a string
            • List of bounding boxes
            • List of fonts in the block
            • List of font fonts
            • List of lines

            freki Key Features

            No Key Features are available at this moment for freki.

            freki Examples and Code Snippets

            No Code Snippets are available at this moment for freki.

            Community Discussions

            QUESTION

            Trying to pull the Name and/or ID of the code below, but can only pull the Job-Base-Cost
            Asked 2018-Dec-29 at 13:47

            Below is the code I have now. It pulls the Job-Base-Cost just fine; however, I cannot get it to pull the ID and/or Name of the item. Can you help?

            Link to the site's XML pull.

            ...

            ANSWER

            Answered 2018-Dec-29 at 13:47

            This is a sample of one line of the OP's XML file:
            109555912.69

            The OP wants to use the IMPORTXML function to report the ID and Name as well as the Job Cost from the XML data. Presently, the OP's formula is:
            =importxml("link","//job-base-cost")

            There are two options:
            1 - One long column
            =importxml("link","//@id | //@name | //job-base-cost")

            Note //@id and //@name in the XPath query: // selects nodes anywhere in the document (at any level, not just the root level) and @ selects attributes. The pipe | is the XPath union operator: it combines the node sets matched by each expression into a single result. So, in plain English, the query is to display the id, the name and the job-base-cost.

            2 - Three columns (table format)
            ={IMPORTXML("link","//@name"),IMPORTXML("link","//job-base-cost"),IMPORTXML("link","//@id")}

            This creates an array that displays the name, job-base-cost and id fields side by side in three columns.
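For readers who want to check what the two layouts return outside a spreadsheet, the sketch below reproduces them in Python with the standard library's `xml.etree.ElementTree`. The sample XML is an assumption: the `items` and `item` element names and the `id` and `name` values are invented for illustration, and only the `job-base-cost` element name, the attribute names, and the first cost value come from the question. ElementTree's limited XPath support has no `|` operator and cannot select attribute nodes directly, so the code walks the tree instead of passing the query string through.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the OP's real file is not reproduced here.
SAMPLE_XML = """
<items>
  <item id="101" name="Widget"><job-base-cost>109555912.69</job-base-cost></item>
  <item id="102" name="Gadget"><job-base-cost>2500.00</job-base-cost></item>
</items>
"""


def one_long_column(root):
    """Approximate //@id | //@name | //job-base-cost:
    every match collected into one list, walking the tree top to bottom."""
    values = []
    for elem in root.iter():
        if "id" in elem.attrib:
            values.append(elem.get("id"))
        if "name" in elem.attrib:
            values.append(elem.get("name"))
        if elem.tag == "job-base-cost":
            values.append(elem.text)
    return values


def three_columns(root):
    """Approximate {//@name, //job-base-cost, //@id}:
    three separate queries placed side by side, one row per item."""
    names = [e.get("name") for e in root.iter() if "name" in e.attrib]
    costs = [e.text for e in root.iter("job-base-cost")]
    ids = [e.get("id") for e in root.iter() if "id" in e.attrib]
    return list(zip(names, costs, ids))


root = ET.fromstring(SAMPLE_XML)
print(one_long_column(root))
for row in three_columns(root):
    print(row)
```

The three-column version only lines up correctly when every item carries all three fields; if one item lacks a name or a cost, the columns shift against each other, which is worth checking in the spreadsheet as well.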

            Note: there is an arrayformula that uses a single importXML function described in How do I return multiple columns of data using ImportXML in Google Spreadsheets?. Readers may want to look at whether that option can be implemented.

            My thanks to @Tanaike for his comment which spurred me to look at how xpath works.

            Source https://stackoverflow.com/questions/53868515

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install freki

            You can download it from GitHub.
            You can use freki like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.

            Support

            For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, check the Stack Overflow community page and ask there.
            CLONE
          • HTTPS

            https://github.com/xigt/freki.git

          • CLI

            gh repo clone xigt/freki

          • SSH

            git@github.com:xigt/freki.git
