Comments (4)
The `unicode` built-in (a type, not a keyword) does not exist in Python 3, because `str` is already Unicode. That change needs to be made.
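A minimal sketch of how code can refer to the text type on both major versions. The names `text_type`, `binary_type`, and `to_text` are hypothetical and chosen for illustration; nothing in this thread shows that PyPDF2 uses them.

```python
import sys

# Assumption: alias names are illustrative, not taken from PyPDF2.
if sys.version_info[0] >= 3:
    text_type = str        # Python 3: str is already Unicode
    binary_type = bytes
else:
    text_type = unicode    # noqa: F821 (only evaluated on Python 2)
    binary_type = str      # Python 2: str holds raw bytes


def to_text(value, encoding="utf-8"):
    """Return `value` as a text string on either major Python version."""
    if isinstance(value, binary_type):
        # Raw bytes must be decoded explicitly.
        return value.decode(encoding)
    return text_type(value)
```

Calls such as `unicode(x)` could then be replaced with `to_text(x)` or `text_type(x)`, so the same source runs unchanged on Python 2 and 3.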
from pypdf.
The extractText issue (#17) appears to be related. The method works relatively smoothly, but it often fails in Python 3.
The goal for the Python3-3 branch is for all features of PyPDF2 to be compatible with all Python versions from 2.5 to 3.3 (and soon 3.4). This may make the code a little 'less clean'. I pushed a few experimental changes in commit df5d5f2. The main error that persists is that, in Python 3+, PyPDF2 is unable to find the endstream marker when parsing streams. Also, we need to merge this branch with PyPDF2 1.18, which will cause many merge conflicts in pdf.py.
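One plausible cause of a missed marker in Python 3 is comparing `bytes` read from the file against a `str` literal. This is an assumption and is not confirmed anywhere in this thread. The sketch below shows the pitfall and a bytes-only search; `ENDSTREAM` and `find_endstream` are hypothetical names, not PyPDF2 functions.

```python
ENDSTREAM = b"endstream"  # bytes literal, matches data read in binary mode


def find_endstream(data, start=0):
    """Return the index of the endstream marker in `data`, or -1 if absent."""
    if not isinstance(data, bytes):
        raise TypeError("stream data must be bytes")
    return data.find(ENDSTREAM, start)


raw = b"stream\nhello\nendstream\nendobj"

# In Python 3 a bytes object never equals a str, so this check silently fails.
naive_match = raw[13:22] == "endstream"

# Comparing bytes with bytes behaves the same on Python 2.6+ and 3.
marker_pos = find_endstream(raw)
```

Keeping every marker as a bytes literal avoids the silent mismatch without any per-version branching.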
This project is nearly finished (although support for Python 2.5 has been dropped). The only major compatibility issue that remains is encryption, which is listed in a separate issue.