Comments (3)
Just had a first look at some of your files. The ones I checked out fall into one of these two categories:
- Bit-flipping causes jpylyzer to expect an ICC profile that contains millions of entries. Examples are cc-16-kdu.jp2-killed-1971, cc-16-kdu.jp2-killed-1961, cc-16-kdu.jp2-killed-1937. Analyzing these files with ExifTool gives you something like this:
Bad ICC_Profile table (67108875 entries)
The solution is to impose a sensible upper limit on the number of entries in an ICC profile.
- For all other images, the codestream header field with the number of tiles is corrupted so that it has a very large value (millions!). Internally jpylyzer creates a dictionary which has an entry for each (expected) tile, and then loops over all of them. The result is a seemingly endless loop + excessive memory usage. Examples are cc-16-kdu.jp2-killed-11218 and cc-16-kdu.jp2-killed-10918.
Solution: impose a sensible upper limit here as well. Already did a quick test, and this resulted in a secondary problem:
TypeError: cannot serialize 2147483664L (type long)
Solution: add long type to remap function in ETpatch, i.e.:
elif textType in [int, long, float, bool]: textOut = str(remappedValue)
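Note that `long` only exists in Python 2. As a rough sketch of the same idea that does not depend on the interpreter version, the check can be done against `numbers.Real` instead of a list of concrete types. The function name `to_xml_text` is hypothetical and is not part of jpylyzer or ETpatch:

```python
import numbers


def to_xml_text(value):
    """Convert a property value to text before handing it to an XML serializer.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, str):
        return value
    # numbers.Real covers bool, int and float (and long on Python 2),
    # so very large integers are converted like any other number.
    if isinstance(value, numbers.Real):
        return str(value)
    raise TypeError("cannot convert %r (type %s) to text"
                    % (value, type(value).__name__))
```

With this, a value such as 2147483664 is turned into the string "2147483664", while unsupported types still fail loudly instead of being written out silently.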
Then there's the endless loop in cc-16-kdu.jp2-killed-22181, will look at that later.
I'll fix this in the next release + I'll also run tests on the full set of files.
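Both categories above come down to the same pattern: a count read from untrusted bytes is used to size a loop or a dictionary without being checked first. A minimal sketch of such a check follows. It assumes the standard ICC layout (128-byte header followed by a 4-byte big-endian tag count) and the SIZ-field arithmetic for the tile grid. The function names and the two limits are placeholders chosen for illustration, not jpylyzer's actual code:

```python
import struct

# Placeholder limits for illustration; not taken from any specification.
MAX_ICC_TAGS = 4096
MAX_TILES = 65535


def icc_tag_count(profile, limit=MAX_ICC_TAGS):
    """Read the tag count of an ICC profile and reject implausible values."""
    if len(profile) < 132:
        raise ValueError("ICC profile too short to contain a tag count")
    # The tag count is a big-endian uint32 right after the 128-byte header.
    (count,) = struct.unpack(">I", profile[128:132])
    if count > limit:
        raise ValueError("implausible ICC tag count: %d" % count)
    return count


def tile_count(xsiz, ysiz, xtsiz, ytsiz, xtosiz=0, ytosiz=0, limit=MAX_TILES):
    """Compute the expected number of tiles and reject implausible values."""
    if xtsiz <= 0 or ytsiz <= 0:
        raise ValueError("tile dimensions must be positive")
    # Ceiling division using integers only.
    tiles_x = -(-(xsiz - xtosiz) // xtsiz)
    tiles_y = -(-(ysiz - ytosiz) // ytsiz)
    count = tiles_x * tiles_y
    if count > limit:
        raise ValueError("implausible number of tiles: %d" % count)
    return count
```

The caller can then report the file as invalid when `ValueError` is raised, rather than allocating one entry per claimed tag or tile. As an aside, 67108875 equals 11 + 2**26, which would be consistent with a single flipped bit in a tag count of 11.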
from jpylyzer.
Fixed in version 1.10.3. These were actually 3 separate issues:
- ICC profile issue: fixed by imposing upper limit of 4096 on tagCount in ICC module (ExifTool does this as well)
- Tile issue: fixed by imposing upper limit to number of tiles of 65,535 (Kakadu also uses this as an upper limit; it's unclear if this is actually a limit imposed by the standard as I can't find this anywhere in the codestream spec)
- Infinite looping over Unknown Box: the culprit here was a corrupted Box Length field that caused the parsing of the box structure to go all wrong. Fixed by checking for the -9999 value of boxLengthValue when looping over child boxes.
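To illustrate the third fix, here is a rough sketch of a box loop that cannot get stuck on a corrupted length. It is not jpylyzer's implementation (which, per the note above, checks for a -9999 value of boxLengthValue). The function name `iter_boxes` is hypothetical, and the sketch raises an exception instead of using a sentinel. It assumes the usual JP2 box header: a 4-byte length, a 4-byte type, and an optional 8-byte extended length when the length field is 1:

```python
import struct


def iter_boxes(data):
    """Yield (box_type, payload) pairs from a byte string of JP2 boxes.

    Raises ValueError on a length that would stall or overrun the loop.
    """
    offset = 0
    while offset < len(data):
        if len(data) - offset < 8:
            raise ValueError("truncated box header at offset %d" % offset)
        length, box_type = struct.unpack(">I4s", data[offset:offset + 8])
        header = 8
        if length == 1:
            # Extended length stored in the following 8 bytes.
            if len(data) - offset < 16:
                raise ValueError("truncated extended length at offset %d" % offset)
            (length,) = struct.unpack(">Q", data[offset + 8:offset + 16])
            header = 16
        elif length == 0:
            # A length of 0 means the box runs to the end of the data.
            length = len(data) - offset
        # A length shorter than the header would never advance the offset
        # properly; a length past the end of the data points into nothing.
        if length < header or offset + length > len(data):
            raise ValueError("corrupted box length %d at offset %d"
                             % (length, offset))
        yield box_type, data[offset + header:offset + length]
        offset += length
```

Because every accepted length is at least as large as the header, the offset strictly increases on each pass, so the loop always terminates.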
In case anyone wants to take this kind of tools testing/analysis further, here's a link to my write-up of this technique: Understanding Tools & Formats Via Bitwise Analysis