pdf2doi | python library/command-line tool | Document Editor library

by MicheleCotrufo | Language: Python | Version: 1.6rc1 | License: No License

kandi X-RAY | pdf2doi Summary

pdf2doi is a Python library typically used in Editor and Document Editor applications. It has no reported bugs or vulnerabilities, a build file is available, and it has low support. You can install it with 'pip install pdf2doi' or download it from GitHub or PyPI.

Automatically associating a DOI or another identifier (e.g. an arXiv ID) with a PDF file can be very easy or very difficult (sometimes nearly impossible), depending on how carefully the file was prepared. In the simplest case, which works for most recent publications, it is enough to look in the file metadata. For older publications, the identifier is often found in the text of the PDF and can be extracted with regular expressions. In the unluckiest cases, the only method left is to search Google for some details of the publication (e.g. the title or parts of the text) and hope that a valid identifier appears in one of the first results.

The pdf2doi library applies these methods in sequence, starting from the simplest, until an identifier is found and validated. Any time a possible identifier is found, it is validated by querying a relevant website (a different one for DOIs and for arXiv IDs). The validation returns raw BibTeX info when the identifier is valid.

When a valid identifier is found with any method other than the first, it is also stored in the metadata of the PDF file. Future lookups of the same file can then extract the identifier with the first method, which speeds up the search. This feature can be disabled by the user if edits to the PDF file are not desired.

The library is far from perfect. Often, especially for old publications, none of the currently implemented methods will work. At other times the wrong DOI may be extracted: this can happen, for example, if the DOI of another paper appears in the PDF text before the correct DOI. A quick and dirty solution is to look up the identifier manually and add it to the metadata of the file, either from the Python console or from the command line. pdf2doi will then always retrieve the correct DOI in future requests, which is useful when pdf2doi is used to automate bibliographic procedures for a large number of files (e.g. via pdf2bib or pdf-renamer).

Currently, only the arXiv identifier format in use since 1 April 2007 is supported.
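The text-based step and the "try each method in order" strategy described above can be sketched as follows. This is a minimal illustration, not pdf2doi's own code: the regular expressions, the function names `find_in_text` and `first_valid`, and the type labels "doi" and "arxiv" are assumptions made for this example, and the validator is passed in as a plain callable instead of performing a real web query.

```python
import re
from typing import Callable, Iterable, Optional, Tuple

# Illustrative patterns only; pdf2doi's own regular expressions may differ.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")
# New-style arXiv IDs (in use since 1 April 2007): YYMM.NNNN or YYMM.NNNNN,
# optionally followed by a version suffix such as v2.
ARXIV_PATTERN = re.compile(r"arXiv:\s*(\d{4}\.\d{4,5}(?:v\d+)?)", re.IGNORECASE)


def find_in_text(text: str) -> Optional[Tuple[str, str]]:
    """Return (identifier, type) for a DOI if one is present, else for an arXiv ID."""
    doi = DOI_PATTERN.search(text)
    if doi:
        # Trailing punctuation is usually sentence punctuation, not part of the DOI.
        return doi.group(0).rstrip(".,;)"), "doi"
    arxiv = ARXIV_PATTERN.search(text)
    if arxiv:
        return arxiv.group(1), "arxiv"
    return None


def first_valid(
    sources: Iterable[str],
    is_valid: Callable[[str, str], bool],
) -> Optional[Tuple[str, str]]:
    """Try each text source in order; stop at the first identifier that validates."""
    for text in sources:
        found = find_in_text(text)
        if found and is_valid(*found):
            return found
    return None
```

Here `sources` stands for the texts produced by each method in turn (metadata fields first, then the extracted PDF text, then search results), so cheaper sources are consulted before more expensive ones. Note that this sketch takes the first DOI that appears in a text, which is exactly the situation in which the DOI of another paper could be picked up by mistake.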

Support

pdf2doi has a low active ecosystem.

It has 53 stars and 8 forks. There is 1 watcher for this library.

There were 2 major releases in the last 12 months.

There are 2 open issues and 19 have been closed. On average issues are closed in 23 days. There are no pull requests.

It has a neutral sentiment in the developer community.

The latest version of pdf2doi is 1.6rc1.

Quality

pdf2doi has no bugs reported.

Security

pdf2doi has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

pdf2doi does not have a standard license declared.

Check the repository for any license declaration and review the terms closely.

Without a license, all rights are reserved, and you cannot use the library in your applications.

Reuse

pdf2doi releases are available to install and integrate.

Deployable package is available in PyPI.

Build file is available. You can build the component from source.

Installation instructions, examples and code snippets are available.

Top functions reviewed by kandi - BETA

kandi has reviewed pdf2doi and discovered the below as its top functions. This is intended to give you an instant insight into pdf2doi implemented functionality, and help decide if they suit your requirements.

Reads parameters in config file
Converts the params to booleans
Converts parameters to Numb
Writes the INI file
Validate a DOI
Validate DOI using DOI
Validate the arXiv ID
Standardise a DOI identifier
Tries to extract the first n characters from the PDF file
Given a list of texts and a list of texts find the corresponding DOI
Perform a Google search
Extract arXiv ID from text
Convert a PDF file to a DOI
Find a DOI from a PDF file
Convert a single file to DOI identifier
Adds an identifier to the target metadata
Add a key to a folder
Find the identifier in the pdfinfo dictionary
Get information about a PDF
Try to find a valid identifier for a given file
Find possible titles
Find the identifier in a file
Find the identifier in the PDF text extracted from the PDF
Saves pdf2doi results
Set a parameter
Removes all keys associated with the right click
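Several of the functions listed above validate an identifier, and the summary notes that validation returns raw BibTeX. One way such a check could work for DOIs is HTTP content negotiation against doi.org, sketched below with the standard library only. The endpoint choice and the function names `build_bibtex_request` and `fetch_bibtex` are assumptions for this example; pdf2doi may query a different service or use different code.

```python
from typing import Optional
from urllib.error import URLError
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_bibtex_request(doi: str) -> Request:
    """Build a request asking the DOI resolver for a BibTeX record (assumed endpoint)."""
    url = "https://doi.org/" + quote(doi, safe="/")
    return Request(url, headers={"Accept": "application/x-bibtex"})


def fetch_bibtex(doi: str, timeout: float = 10.0) -> Optional[str]:
    """Return the BibTeX text for a DOI, or None if the lookup fails."""
    try:
        with urlopen(build_bibtex_request(doi), timeout=timeout) as response:
            return response.read().decode("utf-8")
    except URLError:
        # HTTPError (e.g. 404 for an unknown DOI) is a subclass of URLError.
        return None
```

Under these assumptions, a candidate DOI would count as valid when `fetch_bibtex` returns text instead of None. Separating request construction from the network call keeps the first function easy to test offline.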

pdf2doi Key Features

No Key Features are available at this moment for pdf2doi.

pdf2doi Examples and Code Snippets

pdf2doi,Usage,Command line usage

Python

Lines of Code : 109

License : No License

$ pdf2doi 'path/to/target'

$ pdf2doi ".\examples" -v
[pdf2doi]: Looking for pdf files in the folder D:\Dropbox (Personal)\PythonScripts\pdf2doi\examples...
[pdf2doi]: Found 4 pdf files.
[pdf2doi]: ................
[pdf2doi]: Trying to retrieve a DOI

pdf2doi,Usage,Usage inside a python script

Python

Lines of Code : 28

License : No License

>>> import pdf2doi
>>> pdf2doi.config.set('verbose',False)
>>> results = pdf2doi.pdf2doi(r'.\examples')

result['identifier'] = DOI or other identifier (or None if nothing is found)
result['identifier_type'] = string sp
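Once results are available, they can be post-processed with ordinary Python. The sketch below assumes, based only on the two keys shown above, that `results` is a list of dictionaries with `'identifier'` and `'identifier_type'` entries. The helper name `summarize` and the labels "not found" and "unknown" are hypothetical and introduced for this example.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def summarize(results: Iterable[Mapping]) -> Dict[str, int]:
    """Count results per identifier type; entries with no identifier count as 'not found'."""
    counts: Counter = Counter()
    for result in results:
        if result.get("identifier") is None:
            counts["not found"] += 1
        else:
            # Fall back to 'unknown' if the type is missing (an assumption of this sketch).
            counts[result.get("identifier_type") or "unknown"] += 1
    return dict(counts)
```

A summary like this gives a quick view of how many files in a folder still need a manual lookup.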

pdf2doi,Installing the shortcuts in the right-click context menu of Windows

Python

Lines of Code : 2

License : No License

$ pdf2doi  -install--right--click

$ pdf2doi  -uninstall--right--click

Vulnerabilities

No vulnerabilities reported

Install pdf2doi

Use the package manager pip to install pdf2doi. Under Windows, it is also possible to add shortcuts to the right-click context menu.

Support

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Install

PyPI

pip install pdf2doi

CLONE

HTTPS

https://github.com/MicheleCotrufo/pdf2doi.git

CLI

gh repo clone MicheleCotrufo/pdf2doi

SSH

git@github.com:MicheleCotrufo/pdf2doi.git

Open Weaver – Develop Applications Faster with Open Source
