pdf-tool | offers PDF file encryption in command line | Security Testing library
kandi X-RAY | pdf-tool Summary
A command-line tool that encrypts PDF files and brute-forces encrypted PDFs using a local wordlist.
Top functions reviewed by kandi - BETA
- Encrypt a file.
- Use multiprocessing.
- Reverse password.
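The three functions listed above suggest how the brute-force side might work: read candidates from a wordlist, try simple variants such as the reversed word, and spread the work across processes. The following Python sketch shows that loop under stated assumptions. It is not pdf-tool's actual code: `candidates`, `find_password`, and the `check` callable are hypothetical names, and `check` stands in for whatever PDF library call tests a password against the encrypted file.

```python
from typing import Callable, Iterable, Iterator, Optional


def candidates(words: Iterable[str], try_reversed: bool = True) -> Iterator[str]:
    """Yield each wordlist entry and, optionally, its reverse."""
    for raw in words:
        word = raw.strip()
        if not word:
            continue  # skip blank lines in the wordlist
        yield word
        # Palindromes would only repeat the same candidate, so skip them.
        if try_reversed and word[::-1] != word:
            yield word[::-1]


def find_password(
    words: Iterable[str],
    check: Callable[[str], bool],
    try_reversed: bool = True,
) -> Optional[str]:
    """Return the first candidate accepted by `check`, or None."""
    for candidate in candidates(words, try_reversed):
        if check(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Stand-in checker (assumption): a real one would try to open the PDF.
    secret = "terces"
    wordlist = ["password", "secret", "letmein"]
    print(find_password(wordlist, lambda pw: pw == secret))
```

To parallelize, the candidate stream could be split into chunks and handed to `multiprocessing.Pool`, provided `check` is a top-level function so that it can be pickled.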
Community Discussions
Trending Discussions on pdf-tool
QUESTION
I'm trying to generate proper, conformant PDF/A files from normal PDF documents, and after spending some hours investigating, we've decided to use Ghostscript. This business requirement was set for a bigger C# project I'm working on. First, though, I ran some tests with Ghostscript commands on Windows, and I also created an isolated console application that uses Ghostscript.NET, to test the viability of this feature.
We concentrated our efforts on the PDF/A-1b format for this first test, and used VeraPDF and PDF-Tools to check the conformance of the generated files.
The following tests have been completed with a few different PDF files, some of which were actually generated by our project application. For simplicity, and in case anyone wants to check, I provide a simple PDF (with only a few lines of text) which has been tested in the same way and reproduces the same behavior.
Ghostscript command testing
Execution
Using Ghostscript 9.52, I tried the following command:
...ANSWER
Answered 2020-Nov-14 at 15:07
You should not use -dNOSAFER; you should instead add files/directories to the permitted file reading list using the --permit-file-read switch. The file which needs to be read is the OutputIntent profile, which is one of the main ingredients of the pdfa_def.ps file. See below.
If you do not include the pdfa_def.ps file then you will not get an OutputIntent in the final PDF/A file and it will not be PDF/A compliant (unless you specify UseDeviceIndependentColor as the ColorConversionStrategy). That's why your code example doesn't work. Noticing that PDF-Tools still says the file is valid, I would stop using that as a validator; it clearly isn't reliable. I've found VeraPDF to be the best validator personally (it's better than the Acrobat built-in verification).
I'm surprised that the command line you have shown at the top of the question produces a valid PDF/A file, unless you have modified the pdfa_def.ps file? You are supposed to, and in particular you must modify the value associated with the /ICCProfile key. That value (a string inside parentheses) needs to be a fully qualified path to the ICC profile, and either the ICC profile file or the directory it resides in needs to be added to the permitted list of files to read; see the documentation here under -dSAFER.
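The pieces described above (no -dNOSAFER, the ICC profile added with --permit-file-read, and pdfa_def.ps included in the argument list) can be assembled as in the following Python sketch. The function name `build_pdfa_args` and all paths are placeholders. The remaining switches (-dPDFA=1, -sColorConversionStrategy=RGB, -dPDFACompatibilityPolicy=1) are the commonly documented Ghostscript PDF/A options, not the asker's exact command, which is not shown here.

```python
from pathlib import Path
from typing import List


def build_pdfa_args(
    input_pdf: str,
    output_pdf: str,
    pdfa_def: str,
    icc_profile: str,
    gs: str = "gs",  # assumption: on Windows this is typically gswin64c
) -> List[str]:
    """Build a Ghostscript argument list for PDF/A-1b without -dNOSAFER."""
    # The /ICCProfile key inside pdfa_def.ps must hold this same fully
    # qualified path; this function does not edit pdfa_def.ps for you.
    icc = Path(icc_profile).resolve()
    return [
        gs,
        "-dPDFA=1",
        "-dBATCH",
        "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        "-sColorConversionStrategy=RGB",
        "-dPDFACompatibilityPolicy=1",
        # Permit reading the ICC profile instead of disabling SAFER mode.
        f"--permit-file-read={icc}",
        # Write to a regular file, not a pipe.
        f"-sOutputFile={output_pdf}",
        str(pdfa_def),   # PostScript prologue that sets up the OutputIntent
        str(input_pdf),
    ]
```

The list can be passed to `subprocess.run(args, check=True)`. In Ghostscript.NET the same switches would go into the argument list in the same order.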
Assuming you have done so, then the resulting PDF file should be a PDF/A-1b conformant file. And indeed according to your question, VeraPDF says it is conformant so I'm unclear on what your problem is there. It would be much more useful to share the input and output PDF files rather than a picture of (part of) the Acrobat display.
So to answer your questions:
Yes there is a way to generate a PDF/A file with ICC information (it isn't valid if it doesn't have an OutputIntent) and your command line does so. If you have not modified the pdfa_def.ps file appropriately you may still have a problem.
As far as I know you run the pdfa_def.ps file using Ghostscript.NET in exactly the same way as you do on the command line, you just put it in the list of arguments. So you just need to uncomment the line you've commented. Of course, you haven't included -dNOSAFER, nor added the ICC profile to the list of permitted files to read, so you will get an error. I am surprised you are getting a fatal error though, I'd expect an invalidaccess, but the obvious thing to do is to add -dNOSAFER to the arguments. The back channel output might be useful, it may have more information, and you haven't included that.
Oh, and I would not write to a pipe either. The pdfwrite device expects to be writing to a file and it may try to seek within the file while writing it. If it does and you've written to a pipe (or other non-seekable output), then it's going to fail.
You don't need to add -f to the argument list, and this:
QUESTION
I have the following code, which counts the number of PDFs in specific folders, and counts the number of sheets in those specific PDFs, and sends an email with this data.
I've anonymised part of the script.
...ANSWER
Answered 2020-May-08 at 12:18
It's an HTML encoding issue. I think you need to either use the following code.
QUESTION
I have a PDF with an embedded XML file. I want to access the embedded XML file in R.
One way to solve the problem manually would be to open the PDF file with Adobe Acrobat and save the embedded XML file from there manually (see here). The saved XML file could then be accessed in R using the XML package.
However, as I have to run this for numerous PDFs and the rest of the code is in R, I'm looking for a solution in R. The pdftools package doesn't seem to provide a solution, nor does pdftk seem to be implemented for R.
...ANSWER
Answered 2020-May-07 at 20:26
It seems like pdftools has a pdf_attachments() function. Using the example PDF file you provided:
QUESTION
The Preflight tool (version 2.0.15) validated the generated PDF file (created with PDFBox 2.0.15) correctly, but online pdf-tools (e.g. https://www.pdf-online.com/osa/validate.aspx) do not validate it. I am getting the error below:
Compliance: pdfa-1b
Result: Document does not conform to PDF/A.
Details:
Validating file "file.pdf" for conformance level pdfa-1b
Anonymous RDF resources (rdf:Description without rdf:about attribute) are not allowed in XMP Metadata.
The appearance dictionary doesn't contain an entry.
The appearance dictionary doesn't contain an entry.
The appearance dictionary doesn't contain an entry.
The appearance dictionary doesn't contain an entry.
The appearance dictionary doesn't contain an entry.
The document does not conform to the requested standard.
The document contains annotations or form fields with ambigous or without appropriate appearances.
The document's meta data is either missing or inconsistent or corrupt. The document does not conform to the PDF/A-1b standard.
Done.
To generate the metadata I use the code below:
...ANSWER
Answered 2019-Jul-08 at 12:20
As discussed in the comments:
1) The failure to report "The appearance dictionary doesn't contain an entry" is a bug in PDFBox preflight that will be fixed in 2.0.17, see PDFBOX-4586. According to this document:
An ISO 19005-1 validator shall FAIL otherwise conforming files in which a widget annotation lacks an appearance dictionary
2) The "rdf:Description without rdf:about attribute" may or may not be a bug. VeraPDF doesn't consider it to be one. Your code used a 1.8.* version. For these, you can call dcSchema.setAbout("") to fix this. In 2.0.* the problem doesn't occur if you created the schema with metadata.createAndAddDublinCoreSchema().
I have created an issue in the VeraPDF project and they will bring this question for discussion at the next meeting of the Validation technical working group.
3) That the widgets didn't contain an entry is because at the time setValue() was called, not enough information was present (e.g. the rectangle). That is why you got the message "widget of field aa has no rectangle, no appearance stream created".
QUESTION
I'm using CefSharp in a 32-bit WPF application. I use CefSharp as a document viewer, which displays local HTML and PDF files. Each Browser instance is embedded in a Tab (using WPF TabControl).
With version 63.0.3 everything worked fine. After updating to 73.1.130, I encounter the following issue: after opening some tabs and switching between them, the browser displays a blank page in all tabs.
Note: in WPF switching from and back to a tab results in a reload of all controls inside the tab.
The only way I found to fix the issue is downgrading to 63.0.3 again.
CefSharp is initialized in App.xaml.cs:
...ANSWER
Answered 2019-Jul-04 at 07:05
Thanks to amaitland (see comments): it's an issue in version 73 - https://github.com/cefsharp/CefSharp/issues/2779. Version 71 doesn't have this issue.
QUESTION
I have a Docker container setup that keeps failing to install pytest-django==3.4.8 from requirements.txt. If I comment it out, everything else installs correctly. I've tried everything from tearing down the setup and rebuilding, to upgrading pip, to deleting the pip cache, and still nothing. Any help is appreciated!
ANSWER
Answered 2019-Jul-03 at 15:51
The fix comes down to updating pip and the symbolic link in the Dockerfile:
QUESTION
I am using xpdf to convert a PDF to text, then using a regex to search for the words after each colon, then looping over that data with PHP's strpos function and storing it in a database. It works for a single record, but when the same fields occur multiple times I can't work out how to add the data to the database.
I will show you my code and the responses step by step.
I am using xpdf to convert my PDF into text format with the code below.
...ANSWER
Answered 2018-Dec-01 at 13:34
You cannot have duplicate keys, so you could create a multidimensional array. If the data for each row is always there, you could use array_chunk with a size of 3:
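The answer's PHP snippet is not reproduced on this page. Purely to illustrate the chunking idea, here is a Python sketch. `chunk` is a hypothetical helper that mirrors what PHP's array_chunk does with a size of 3, and the sample values and field names are made up.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def chunk(values: Sequence[T], size: int) -> List[List[T]]:
    """Split a flat sequence into consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(values[i:i + size]) for i in range(0, len(values), size)]


# Made-up flat data: three fields per record, repeated for each record.
flat = ["Alice", "30", "Paris", "Bob", "25", "Rome"]
rows = chunk(flat, 3)

# Each inner list is one database row, so repeated field names
# no longer collide as duplicate keys.
records = [dict(zip(("name", "age", "city"), row)) for row in rows]
```

This only works cleanly when every record has all three fields; a missing field would shift every later value into the wrong column.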
QUESTION
...I am creating a regex to find the words after a colon in my pdftotext output. I am getting data like: I am using xpdf to convert the PDF uploaded by the user into text format.
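To illustrate the task described in this question, the following Python sketch pulls out the text before and after the first colon on each line. The pattern, the helper name `values_after_colon`, and the sample text are all assumptions made for this example; they are not the asker's data or the pattern from the answer.

```python
import re
from typing import List, Tuple

# One "label: value" pair per line; lines without a colon are ignored.
PAIR = re.compile(r"^[ \t]*([^:\n]+?)[ \t]*:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def values_after_colon(text: str) -> List[Tuple[str, str]]:
    """Return (label, value) tuples, keeping repeated labels in order."""
    return PAIR.findall(text)


# Made-up text standing in for pdftotext output.
sample = "Name: Alice\nCity : Paris\nno colon here\nName: Bob\n"
pairs = values_after_colon(sample)
```

Returning a list of tuples rather than a dictionary keeps repeated labels such as `Name`, which matters when the same field appears more than once in the PDF.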
ANSWER
Answered 2018-Dec-01 at 07:01
Note: The OP has changed his question after several answers were given. This is an answer to the original question.
Here is one solution, using preg_match_all. We can try matching on the following pattern:
QUESTION
This is my code for converting a .pdf file into a .txt file for text mining purposes. Note that I used pdftotext.exe to convert the .pdf to a .txt file.
...ANSWER
Answered 2018-Oct-31 at 05:10
Try it like this:
QUESTION
I have a Rails 5.2 app that generates PDF files.
I use a PDF document as a template, generate a new temporary PDF, and merge the two into a final PDF file with pdf-toolkit.
These are the file paths in development:
...ANSWER
Answered 2018-Aug-09 at 05:58
The problem was that I needed to install pdftk manually on my Ubuntu 16.04 server. I logged in over SSH and ran:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install pdf-tool
You can use pdf-tool like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.