Comments (25)
Comment by andreasotto
Tue Nov 4 10:51:46 2014
# python -c 'import reportlab' && echo "installed"
installed
from ocrmypdf.
Comment by andreasotto
Tue Nov 4 10:54:20 2014
Ah, I see that OCRmyPDF requires reportlab version >= 3.0.
Under Debian 6 (squeeze) the packaged version is 2.4-4.
Is there a real dependency on >= 3.0?
Comment by jbarlow83
Tue Jul 28 12:15:31 2015
OCRmyPDF v3.0-rc2 needs reportlab >= 3.0 (although there is a workaround to avoid reportlab: --pdf-renderer tesseract, if you have Tesseract 3.03). In both v2.0 and v3.0 of OCRmyPDF, it's a 'firm' dependency because older versions of reportlab had a serious bug in image handling that really bloated the sizes of PDFs.
Comment by Wikinaut
Tue Jul 28 17:47:13 2015
Current problem with
- openSUSE Tumbleweed
- python-reportlab is installed
- OCRmyPDF still says "Please install the python library reportlab. Exiting..."
Having hit this problem and read this issue, I checked the Tumbleweed reportlab version (2.7-3.3). My suggestion is to change the OCRmyPDF dependency-checker message to "Please install the python library reportlab version >= 3.0", i.e. to notify the user that they must install a suitable version.
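One way such a check could tell a missing library apart from an outdated one is sketched below. This is a hypothetical helper, not OCRmyPDF's actual dependency checker: the name `describe_module`, the reliance on a `__version__` attribute, and the wording for the "cannot determine" case are assumptions made for illustration.

```python
import importlib
import re


def numeric_version(text):
    """Turn a version string such as '2.7' or '3.0.1' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def describe_module(name, minimum):
    """Return None if `name` is importable and new enough, else a message."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return "Please install the python library {}. Exiting...".format(name)

    # Assumption: the module exposes its version as `__version__`.
    installed = getattr(module, "__version__", None)
    if installed is None:
        return "Cannot determine the installed version of {}.".format(name)

    if numeric_version(installed) < numeric_version(minimum):
        return (
            "Please install the python library {} version >= {} "
            "(found {}). Exiting...".format(name, minimum, installed)
        )
    return None
```

Calling `describe_module("reportlab", "3.0")` would then give a different message for "not installed" than for "installed but too old", which is the distinction that was missing here.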
Comment by Wikinaut
Tue Jul 28 17:50:34 2015
@jbarlow83 Could you please explain where the option --pdf-renderer tesseract is to be added? It does not work on the OCRmyPDF command line, and the help does not mention where such an option could be added.
Comment by jbarlow83
Tue Jul 28 19:02:34 2015
Only the new version (a pre-release) supports it:
https://github.com/fritz-hh/OCRmyPDF/releases version v3.0-rc2.
Or download the latest source from the "master" branch.
Comment by Wikinaut
Tue Jul 28 19:27:40 2015
Hmm, one problem is solved: when checking out here, the default branch is v2.x, and I have now switched to master.
But now I get
# sh ./OCRmyPDF.sh -h
Traceback (most recent call last):
  File "/usr/lib64/python3.4/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.4/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/src/OCRmyPDF/ocrmypdf/main.py", line 16, in <module>
    import PyPDF2 as pypdf
ImportError: No module named 'PyPDF2'
When I run
./OCRmyPDF.sh -h
bash: ./OCRmyPDF.sh: Keine Berechtigung [Permission denied]
All files and subdirectories belong to the current user.
Comment by jbarlow83
Tue Jul 28 19:33:49 2015
It's a Python 3 package now. Run the installer in the current directory: pip3 install -e .
Comment by Wikinaut
Tue Jul 28 19:34:53 2015
Oops, while you were writing your answer, I read the README and ran pip3, but:
FileNotFoundError: [Errno 2] No such file or directory: 'mutool'
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /usr/local/src/OCRmyPDF
So I have to install this, too. (Should it be added to the dependency checks?)
Comment by Wikinaut
Tue Jul 28 19:35:51 2015
Uh, `mutool` is not in Tumbleweed. I have to look for it. (Even Tesseract is easier to install.)
Comment by jbarlow83
Tue Jul 28 19:36:44 2015
It's mupdf-tools. If it's a pain to get, how about qpdf? Both do the same thing.
You'll need tesseract, ghostscript, unpaper, poppler, and java too.
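A quick way to see which of these command-line tools are reachable before running the installer, sketched with the standard library's `shutil.which`. The executable names are the ones that the setup.py checks print later in this thread (`gs` for Ghostscript, `pdfseparate` from poppler); the helper name `missing_programs` is made up for this example.

```python
import shutil

# Executable names as printed by the setup.py checks in this thread.
REQUIRED_PROGRAMS = ["tesseract", "gs", "unpaper", "pdfseparate", "java", "mutool"]


def missing_programs(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]
```

`missing_programs(REQUIRED_PROGRAMS)` returns an empty list when everything is on the PATH. It only checks presence, not versions.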
Comment by Wikinaut
Tue Jul 28 19:38:16 2015
It is part of mupdf in openSUSE.
Comment by Wikinaut
Tue Jul 28 19:39:05 2015
Now this is gone, but I have
pip3 install -e .
Obtaining file:///usr/local/src/OCRmyPDF
Complete output from command python setup.py egg_info:
Checking for tesseract >= 3.02.02...
Found tesseract 3.04
Checking for gs >= 9.14...
Found gs 9.16
Checking for unpaper >= 6.1...
Found unpaper 6.2
Checking for pdfseparate >= 0.29.0...
Found pdfseparate 0.33.0
Checking for java >= 1.5.0...
Found java 1.8.0
Checking for mutool >= 1.7a...
Traceback (most recent call last):
  File "/usr/local/src/OCRmyPDF/setup.py", line 117, in check_external_program
    version = version_scrape_regex.search(result).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<string>", line 20, in <module>
  File "/usr/local/src/OCRmyPDF/setup.py", line 167, in <module>
    package='mupdf-tools'
  File "/usr/local/src/OCRmyPDF/setup.py", line 119, in check_external_program
    error_unknown_version(program, package, optional, minimum_version)
TypeError: error_unknown_version() takes 3 positional arguments but 4 were given
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /usr/local/src/OCRmyPDF
(updated with the complete output)
Comment by Wikinaut
Tue Jul 28 19:41:40 2015
(post above updated with the complete output)
Comment by jbarlow83
Tue Jul 28 19:41:55 2015
Added a possible fix - do a git pull.
Comment by Wikinaut
Tue Jul 28 19:44:59 2015
I was (and am) already on 6e6f918. This gives the above error.
Comment by jbarlow83
Tue Jul 28 19:48:49 2015
Apologies, I pushed it to the wrong repo. Commit 6901550 should now be available on the main repo.
Comment by Wikinaut
Tue Jul 28 19:51:23 2015
Different error output:
pip3 install -e .
Obtaining file:///usr/local/src/OCRmyPDF
Complete output from command python setup.py egg_info:
Checking for tesseract >= 3.02.02...
Found tesseract 3.04
Checking for gs >= 9.14...
Found gs 9.16
Checking for unpaper >= 6.1...
Found unpaper 6.2
Checking for pdfseparate >= 0.29.0...
Found pdfseparate 0.33.0
Checking for java >= 1.5.0...
Found java 1.8.0
Checking for mutool >= 1.7a...
Traceback (most recent call last):
  File "/usr/local/src/OCRmyPDF/setup.py", line 117, in check_external_program
    version = version_scrape_regex.search(result).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<string>", line 20, in <module>
  File "/usr/local/src/OCRmyPDF/setup.py", line 167, in <module>
    package='mupdf-tools'
  File "/usr/local/src/OCRmyPDF/setup.py", line 119, in check_external_program
    error_unknown_version(program, package, optional)
  File "/usr/local/src/OCRmyPDF/setup.py", line 83, in error_unknown_version
    print(unknown_version.format(**locals()), file=sys.stderr)
KeyError: 'need_version'
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /usr/local/src/OCRmyPDF
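Both tracebacks so far start from the same place: the version regex does not match the program's output, so `search()` returns None and `.group(1)` fails, and then the error-reporting path itself breaks. A minimal sketch of a more defensive shape follows. The helper names `scrape_version` and `unknown_version_message` and the template constant are hypothetical, and this is not the fix that was committed to OCRmyPDF. Passing the template fields as explicit keywords, rather than `format(**locals())`, keeps the placeholder names and the values that fill them side by side.

```python
import re

# Assumed template, modelled on the wording of the installer's message.
UNKNOWN_VERSION = (
    "OCRmyPDF requires '{program}' {need_version} or higher. Your system has\n"
    "'{program}' but we cannot tell what version is installed."
)


def scrape_version(output, pattern):
    """Return the first captured group, or None when the pattern does not match."""
    match = re.search(pattern, output)
    return match.group(1) if match else None


def unknown_version_message(program, need_version):
    """Build the 'unknown version' text with explicitly named fields."""
    return UNKNOWN_VERSION.format(program=program, need_version=need_version)
```

With this shape, a caller can test `scrape_version(...) is None` and report the unknown version without raising a second exception.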
Comment by jbarlow83
Tue Jul 28 20:08:52 2015
Thanks for your patience. Please pull again and give it another shot.
Comment by Wikinaut
Tue Jul 28 20:29:31 2015
uh, now I have
pip3 install -e .
Obtaining file:///usr/local/src/OCRmyPDF
Complete output from command python setup.py egg_info:
Checking for tesseract >= 3.02.02...
Found tesseract 3.04
Checking for gs >= 9.14...
Found gs 9.16
Checking for unpaper >= 6.1...
Found unpaper 6.2
Checking for pdfseparate >= 0.29.0...
Found pdfseparate 0.33.0
Checking for java >= 1.5.0...
Found java 1.8.0
Checking for mutool >= 1.7a...
OCRmyPDF requires 'mutool' 1.7a or higher. Your system has
'mutool' but we cannot tell what version is installed. Contact the
package maintainer.
This program is REQUIRED for OCRmyPDF to work. Installation will abort.
On systems with the aptitude package manager (Debian, Ubuntu), try these
commands:
sudo apt-get update
sudo apt-get install mupdf-tools
On RPM-based systems (Red Hat, Fedora), search for instructions on
installing the RPM for mupdf-tools.
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /usr/local/src/OCRmyPDF
Comment by Wikinaut
Tue Jul 28 20:30:18 2015
mupdf 1.7-1.3 on Opensuse Tumbleweed
Comment by jbarlow83
Tue Jul 28 20:37:56 2015
I don't know what to think when Linux distributions make up arbitrary version numbers that don't follow the package's own conventions, as in this case.
I dropped the version requirement to mupdf 1.7.
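The mismatch here is between an upstream-style requirement (1.7a) and a distribution package version (1.7-1.3). One tolerant way to compare such strings, as a sketch only: drop the packaging revision after the first hyphen and compare just the numeric components. This is an assumption about what should count as "new enough", not how OCRmyPDF's setup.py compares versions, and it deliberately ignores letter suffixes such as the "a" in 1.7a. The function names are made up for this example.

```python
import re


def numeric_version(text):
    """Return the numeric components of the upstream part of a version string.

    '1.7-1.3' -> (1, 7)   (the '-1.3' packaging revision is dropped)
    '1.7a'    -> (1, 7)   (the letter suffix is ignored)
    """
    upstream = text.split("-", 1)[0]
    return tuple(int(part) for part in re.findall(r"\d+", upstream))


def is_at_least(installed, minimum):
    """True if `installed` is numerically >= `minimum`."""
    return numeric_version(installed) >= numeric_version(minimum)
```

Under these rules `is_at_least("1.7-1.3", "1.7a")` holds, while a genuinely older package such as reportlab 2.7-3.3 still fails a 3.0 requirement.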
Comment by Wikinaut
Tue Jul 28 20:47:37 2015
better, but still buggy:
writing manifest file 'ruffus.egg-info/SOURCES.txt'
running install_lib
creating /usr/lib/python3.4/site-packages/ruffus
error: could not create '/usr/lib/python3.4/site-packages/ruffus': Permission denied
----------------------------------------
Command "/usr/bin/python3 -c "import setuptools, tokenize;__file__='/tmp/pip-build-1ekvbx92/ruffus/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-8nnv8idr-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-1ekvbx92/ruffus
Comment by jbarlow83
Tue Jul 28 23:24:17 2015
You need to install with sudo, or create a virtual environment with pyvenv and
install into that environment. As the error message says, it doesn't have
permission to write to /usr/lib/python3.4/site-packages.
Comment by jbarlow83
Wed Jul 29 00:49:10 2015
With a virtual environment:
pyvenv venv
source venv/bin/activate
pip3 install -e .
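The same environment can be created from Python with the standard library's `venv` module, which is what the `pyvenv` script wrapped. `pyvenv` was deprecated in Python 3.6 and later removed, so on current Pythons the equivalent command is `python3 -m venv venv`. A small sketch, where the function name `create_env` is made up for illustration:

```python
import venv
from pathlib import Path


def create_env(path, with_pip=True):
    """Create a virtual environment at `path` and return its directory."""
    env_dir = Path(path)
    venv.create(str(env_dir), with_pip=with_pip)
    return env_dir
```

After `create_env("venv")`, activate it with `source venv/bin/activate` and run `pip3 install -e .` as above.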