PDFLayoutTextStripper | pdf file into a text file | Document Editor library

by JonathanLink | Java | Version: 2.2.4 | License: Apache-2.0

kandi X-RAY | PDFLayoutTextStripper Summary

PDFLayoutTextStripper is a Java library typically used in editor and document-editor applications. kandi reports no bugs and no vulnerabilities for it. It has a build file, a permissive license, and medium support. You can download it from GitHub or Maven.

Converts a PDF file into a text file while keeping the layout of the original PDF. Useful to extract the content from a table or a form in a PDF file. PDFLayoutTextStripper is a subclass of PDFTextStripper class (from the Apache PDFBox library).
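
To make the layout-preserving idea concrete, here is a minimal, self-contained Java sketch. It is not the library's source and does not use PDFBox: the `Fragment` class, the `render` method, and the `CHAR_WIDTH` and `LINE_HEIGHT` values are hypothetical, chosen only for illustration. It assumes each text fragment carries page coordinates in points, with y measured from the top of the page, and maps those coordinates onto a character grid so that columns stay aligned.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

/** Illustrative sketch only; not the PDFLayoutTextStripper source. */
final class LayoutGridSketch {

    /** Hypothetical positioned text fragment; x and y are in points, y from the page top. */
    static final class Fragment {
        final double x;
        final double y;
        final String text;

        Fragment(double x, double y, String text) {
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    /** Assumed width of one output character cell, in points. */
    static final double CHAR_WIDTH = 6.0;
    /** Assumed height of one output line, in points. */
    static final double LINE_HEIGHT = 12.0;

    /** Places each fragment at the grid column and row implied by its coordinates. */
    static String render(List<Fragment> fragments) {
        // Process fragments left to right so padding is appended in order.
        List<Fragment> sorted = new ArrayList<>(fragments);
        sorted.sort(Comparator.comparingDouble((Fragment f) -> f.x));

        // TreeMap keeps the rows ordered from top to bottom.
        TreeMap<Integer, StringBuilder> rows = new TreeMap<>();
        for (Fragment f : sorted) {
            int row = (int) Math.round(f.y / LINE_HEIGHT);
            int col = (int) Math.round(f.x / CHAR_WIDTH);
            StringBuilder line = rows.computeIfAbsent(row, r -> new StringBuilder());
            // Pad with spaces until the fragment's column is reached.
            while (line.length() < col) {
                line.append(' ');
            }
            line.append(f.text);
        }

        // Join the rows; this sketch does not emit blank lines for vertical gaps.
        StringBuilder out = new StringBuilder();
        for (StringBuilder line : rows.values()) {
            out.append(line).append('\n');
        }
        return out.toString();
    }
}
```

With these assumed cell sizes, a fragment at x = 60 lands in column 10, so a header cell and the value beneath it start at the same character position in the text output.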

            Support

              PDFLayoutTextStripper has a medium active ecosystem.
              It has 1446 stars and 200 forks. There are 49 watchers for this library.
              It had no major release in the last 12 months.
              There are 17 open issues and 15 have been closed. On average issues are closed in 7 days. There are 4 open pull requests and 0 closed requests.
              It has a neutral sentiment in the developer community.
              The latest version of PDFLayoutTextStripper is 2.2.4.

            Quality

              PDFLayoutTextStripper has 0 bugs and 0 code smells.

            Security

              PDFLayoutTextStripper has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              PDFLayoutTextStripper code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            License

              PDFLayoutTextStripper is licensed under the Apache-2.0 License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            Reuse

              PDFLayoutTextStripper releases are available to install and integrate.
              Deployable package is available in Maven.
              Build file is available. You can build the component from source.
              Installation instructions are not available. Examples and code snippets are available.
              kandi estimates that PDFLayoutTextStripper saves you 149 person hours of effort compared with developing the same functionality from scratch.
              It has 372 lines of code, 49 functions and 2 files.
              It has high code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed PDFLayoutTextStripper and identified the functions below as its top functions. The list is intended to give you a quick view of the functionality PDFLayoutTextStripper implements and to help you decide whether it suits your requirements.
            • Write page
            • Get number of new lines from previous text position
            • Returns the index of the given character
            • Iterate through text and create new lines
            • Processes a single page
            • Set the current page width
            • Complete the line with spaces
            • Gets the line length
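
Two of the operations listed above, padding a line with spaces and counting the line breaks implied by a vertical gap, can be sketched in a few lines of Java. This is an illustration under stated assumptions, not the library's code: the class name `LineHelpersSketch`, both method names, and their signatures are hypothetical.

```java
/** Illustrative sketch only; names and signatures are hypothetical. */
final class LineHelpersSketch {

    /** Pads a line with trailing spaces up to the given width; longer lines are returned unchanged. */
    static String completeLineWithSpaces(String line, int width) {
        StringBuilder padded = new StringBuilder(line);
        while (padded.length() < width) {
            padded.append(' ');
        }
        return padded.toString();
    }

    /** Estimates how many line breaks separate two vertical positions, given an assumed line height. */
    static int newLinesBetween(double previousY, double currentY, double lineHeight) {
        if (lineHeight <= 0) {
            throw new IllegalArgumentException("lineHeight must be positive");
        }
        double gap = Math.abs(currentY - previousY);
        return (int) Math.round(gap / lineHeight);
    }
}
```

Padding every line to the same width keeps later columns addressable by character index, and rounding the vertical gap to whole lines lets blank lines in the PDF survive as blank lines in the text.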

            Community Discussions

            QUESTION

            Extract text from PDF files and preserve the original layout, in Python
            Asked 2021-Jul-17 at 09:24

            I want to extract text from PDF files while maintaining the layout of the text, as in the images below. The images show results from [github.com/JonathanLink/PDFLayoutTextStripper]. I tried the code below, but it doesn't maintain the layout. I want to get exactly the results shown in the images using a Python library such as PyPDF2, PDFPlumber, or PDFminer. I tried all of these libraries but didn't get the desired results. I need help extracting the text from the PDF file exactly as shown in the images.

            ...

            ANSWER

            Answered 2021-Jul-17 at 09:24

            You can preserve the layout and indentation using the pdftotext package.

            Source https://stackoverflow.com/questions/68407261

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install PDFLayoutTextStripper

            You can download it from GitHub or Maven.
            You can use PDFLayoutTextStripper like any standard Java library. Include the jar files in your classpath. You can also use any IDE to run and debug the PDFLayoutTextStripper component as you would any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, refer to maven.apache.org. For Gradle installation, refer to gradle.org.

            Support

            To request new features, make suggestions, or report bugs, create an issue on GitHub. If you have questions, search for or ask them on the Stack Overflow community page.
            Clone

            • HTTPS: https://github.com/JonathanLink/PDFLayoutTextStripper.git
            • GitHub CLI: gh repo clone JonathanLink/PDFLayoutTextStripper
            • SSH: git@github.com:JonathanLink/PDFLayoutTextStripper.git
