pdf2json | PDF file parser that converts PDF binaries | Document Editor library
kandi X-RAY | pdf2json Summary
pdf2json is a Node.js module that parses PDF files and converts them from binary to JSON format. It is built with pdf.js and extends it with interactive form elements and text content parsing outside the browser. The goal is to enable server-side PDF parsing with interactive form elements when wrapped in a web service, and to enable parsing local PDFs to JSON files when used as a command-line utility.
pdf2json Key Features
pdf2json Examples and Code Snippets
var pdf2json = require("form-pdf2json");
// or
import pdf2json from "form-pdf2json";
// => { convertPdf2Json: [Function], convertJson2Fdf: [Function], ... }
Community Discussions
Trending Discussions on pdf2json
QUESTION
When I try to run pdf2json (without any parameters at all) I'm getting this error:
ANSWER
Answered 2022-Mar-26 at 02:56
Quoting https://github.com/modesty/pdf2json/issues/250:
The problem here is that these were breaking changes but only marked as a new (non-breaking) feature with a minor version increment. So calling npm audit fix actually forces this upgrade and breaks everything. This needs to be re-released as a MAJOR version increment to v2.0 and/or a bug patch released on the 1.x line to make this backwards compatible again.
The solution is to do npm install --save-exact pdf2json@1.2.5.
QUESTION
I'm trying to upload a PDF file and then parse it using pdf2json without saving the file to a directory, but pdf2json requests a file path in loadPDF. Any idea what I can do, please?
...
ANSWER
Answered 2021-Aug-30 at 18:27
Just use pdfParser.parseBuffer(pdfBuffer); and pass in the file buffer.
You'll need to use the built-in File Interceptor instead of the AzureStorageFileInterceptor.
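A minimal sketch of the buffer-based approach, assuming the pdfParser_dataReady and pdfParser_dataError events that pdf2json emits. The helper name parsePdfBuffer is hypothetical, and the parser is passed in as an argument so the snippet does not depend on how the upload reaches your handler:

```javascript
// Hypothetical helper: wraps pdf2json's event-based API in a Promise.
// `parser` is expected to be a pdf2json PDFParser instance, e.g.
//   const PDFParser = require("pdf2json");
//   const parser = new PDFParser();
function parsePdfBuffer(parser, pdfBuffer) {
  return new Promise((resolve, reject) => {
    // Register both handlers before parsing starts.
    parser.once("pdfParser_dataReady", resolve);
    parser.once("pdfParser_dataError", reject);
    // Parse from memory; no file path is involved.
    parser.parseBuffer(pdfBuffer);
  });
}
```

In an upload handler this would typically be called with the in-memory buffer of the uploaded file, for example parsePdfBuffer(new PDFParser(), file.buffer), where file.buffer is an assumption about how your framework exposes the upload.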
QUESTION
I tried pdf2json:
...
ANSWER
Answered 2020-Jul-09 at 11:34
Try calipers. Code example:
QUESTION
I have an Electron app with VueJS and I try to download a generated XLS with sheetjs but the provided workaround does not work in my case. Here is what I have been trying:
...
ANSWER
Answered 2020-Mar-23 at 10:46
The dialog method dialog.showSaveDialog returns a promise which resolves to an object; it does not return a string:
dialog.showSaveDialog([browserWindow, ]options)
Returns Promise - Resolve with an object containing the following:
canceled Boolean - whether or not the dialog was canceled.
filePath String (optional) - If the dialog is canceled, this will be undefined.
bookmark String (optional) macOS mas - Base64 encoded string which contains the security scoped bookmark data for the saved file. securityScopedBookmarks must be enabled for this to be present. (For return values, see table here.)
You have to await dialog.showSaveDialog or use dialog.showSaveDialogSync. You can also make exportData asynchronous rather than creating an inner async function.
Try to refactor your code like this:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install pdf2json
More details can be found at the bottom of this document.
If you have an early version of pdf2json, please remove your local node_modules directory and re-run npm install to upgrade to pdf2json@1.0.x.
v1.x.x upgraded dependency packages, removed some unnecessary dependencies, and started to assume ES6 / ES2015 with node ~v4.x. More PDFs were added for unit testing.
Note: pdf2json has been in production for over 3 years. It's pretty reliable and solid when parsing hundreds (sometimes tens of thousands) of PDF forms every day, thanks to everybody's help.
Starting with v1.0.3, I'm trying to address a long-overdue, annoying problem with broken text blocks. It's the biggest problem hindering the efficiency of PDF content creation in our projects. Although the root cause lies in the original PDF streams, the client doesn't render JSON character by character, so the problem often appears in the final rendered web content. We had to work around it by manually merging those text blocks. With the solution in v1.0.x, the need for manual text block merging is greatly reduced.
The solution is a post-parsing stage that identifies and auto-merges adjacent blocks. It's not ideal, but it works in most of my tests with the 261 PDFs under the test directory. The auto-merge solution still needs some fine-tuning, so I keep it as an experimental feature for now: it's off by default and can be turned on with the "-m" switch on the command line. To support auto-merging, text block objects have an additional "sw" (space width of the font) property together with x, y, clr and R. If you have a more effective use of this new property for merging text blocks, please drop me a line.
v1.1.4 unified the event data structure. This matters only if you handle these top-level events; nothing changes if you use the command line.
event "pdfParser_dataError": {"parserError": errObj}
event "pdfParser_dataReady": {"formImage": parseOutput}
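As an illustration of those two payload shapes, here is a small sketch that unwraps them. The function name unwrapParserEvent and its return shape are hypothetical, and the payload keys are taken from the v1.1.4 note only; other versions may differ:

```javascript
// Hypothetical helper: normalizes the two v1.1.4 event payloads.
function unwrapParserEvent(eventName, payload) {
  switch (eventName) {
    case "pdfParser_dataReady":
      // Parsed output is wrapped in a `formImage` key.
      return { ok: true, value: payload.formImage };
    case "pdfParser_dataError":
      // The error object is wrapped in a `parserError` key.
      return { ok: false, error: payload.parserError };
    default:
      throw new Error("Unexpected event: " + eventName);
  }
}
```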
v1.0.8 fixed issue 27: it converts the x coordinate with the same ratio as y, which is 24 (96/4), rather than 8.7 (96/11). Please adjust your client renderer accordingly when positioning elements' x coordinates.
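To make the arithmetic concrete, here is a small sketch that assumes the ratio means 24 pixels per page unit (96 dpi divided by 4 units per inch) on both axes. The function name toPixels is hypothetical:

```javascript
// Assumption: one page unit maps to 96 / 4 = 24 pixels on both axes.
const PIXELS_PER_UNIT = 96 / 4;

// Hypothetical helper for a client renderer: page units to pixels.
function toPixels(point) {
  return {
    x: point.x * PIXELS_PER_UNIT,
    y: point.y * PIXELS_PER_UNIT,
  };
}

console.log(toPixels({ x: 2, y: 1.5 })); // { x: 48, y: 36 }
```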
Make sure nodejs is installed. Detailed installation steps can be found at http://stackoverflow.com/a/16303380/433814.
Create a symbolic link from node to nodejs
Verify the version of node and installation
Proceed with the install of pdf2json as described above