Comments (3)
Hi, it would be useful if you could provide the full traceback, rather than only the error at the end.
from refextract.
@tiffsea refextract uses pdftotext in the background. The error appears to occur because refextract cannot find pdftotext on your system. Try following the OS dependency instructions here:
https://pypi.org/project/pdftotext/
then install pdftotext:
pip install pdftotext
and poppler:
conda install -c conda-forge poppler
This solved the issue for me.
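A quick way to confirm whether a pdftotext executable is visible from the same Python environment that runs refextract is sketched below. It assumes the problem is simply that no executable named pdftotext is on PATH; the thread does not confirm exactly how refextract locates the binary, and the helper name find_pdftotext is made up for illustration.

```python
import shutil


def find_pdftotext(name="pdftotext"):
    """Return the full path of `name` if it is on PATH, otherwise None.

    Hypothetical helper: this only mirrors a plain PATH lookup and is an
    assumption about how the binary is found, not refextract's own code.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_pdftotext()
    if path is None:
        print("pdftotext was not found on PATH")
    else:
        print("pdftotext found at", path)
```

If this reports that nothing was found after installing, the install probably went into a different environment than the one running refextract.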
@tiffsea To my limited knowledge, pip install pdftotext installs a different package from the one needed here (correct me if I am wrong). refextract needs pdftotext(1) version 3.00 to be installed.
So I installed XpdfReader instead (https://www.xpdfreader.com/pdftotext-man.html) using the commands:
wget http://security.ubuntu.com/ubuntu/pool/main/p/poppler/libpoppler73_0.62.0-2ubuntu2.12_amd64.deb
sudo apt-get install ./libpoppler73_0.62.0-2ubuntu2.12_amd64.deb
wget http://archive.ubuntu.com/ubuntu/pool/universe/x/xpdf/xpdf_3.04-7_amd64.deb
sudo apt-get install ./xpdf_3.04-7_amd64.deb
(ref: https://askubuntu.com/questions/1245518/how-to-install-xpdf-on-ubuntu-20-04)
The above solved the issue for me.
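Since the two answers install different pdftotext builds, it can help to check which version ends up on PATH. The sketch below runs pdftotext -v and pulls the version out of the banner. It assumes the banner contains a line of the form "pdftotext version X"; the function names are hypothetical and this is not part of refextract.

```python
import re
import shutil
import subprocess


def parse_pdftotext_version(banner):
    """Extract the version string from a `pdftotext -v` banner, or None.

    Assumes the banner has the form "pdftotext version X".
    """
    match = re.search(r"pdftotext version (\S+)", banner)
    return match.group(1) if match else None


def installed_pdftotext_version():
    """Return the version of the pdftotext found on PATH, or None."""
    exe = shutil.which("pdftotext")
    if exe is None:
        return None
    # The banner may go to stderr and the exit code may be non-zero,
    # so merge both streams and do not raise on failure.
    proc = subprocess.run(
        [exe, "-v"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_pdftotext_version(proc.stdout)


if __name__ == "__main__":
    print("pdftotext version:", installed_pdftotext_version())
```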