
Comments (3)

bitsgalore commented on June 7, 2024

I just had a first look at some of your files. The ones I checked fall into one of these two categories:

  1. Bit-flipping causes jpylyzer to expect an ICC profile that contains millions of entries. Examples are cc-16-kdu.jp2-killed-1971, cc-16-kdu.jp2-killed-1961 and cc-16-kdu.jp2-killed-1937. Analyzing these files with ExifTool gives you something like this:

    Bad ICC_Profile table (67108875 entries)

    The solution is to impose a sensible upper limit to the number of entries in an ICC profile.

  2. For all other images, the codestream header field with the number of tiles is corrupted so that it has a very large value (millions!). Internally, jpylyzer creates a dictionary that has an entry for each (expected) tile, and then loops over all of them. The result is a seemingly endless loop plus excessive memory usage. Examples are cc-16-kdu.jp2-killed-11218 and cc-16-kdu.jp2-killed-10918.
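Both categories come down to trusting a count read from a damaged file. As a minimal sketch (not jpylyzer's actual code), the checks could look like this. The function names are hypothetical; the limits of 4096 tags and 65,535 tiles are the ones reported further down in this thread. The sketch assumes the ICC tag count is the big-endian 32-bit integer that follows the 128-byte profile header, and that the tile count is derived from the SIZ image and tile dimensions.

```python
import struct

MAX_ICC_TAGS = 4096   # limit reported later in this thread (ExifTool uses it too)
MAX_TILES = 65535     # limit reported later in this thread (Kakadu uses it too)


def read_icc_tag_count(profile):
    """Return the tag count stored right after the 128-byte ICC header."""
    if len(profile) < 132:
        raise ValueError("ICC profile too short to hold a tag count")
    (tag_count,) = struct.unpack(">I", profile[128:132])
    if tag_count > MAX_ICC_TAGS:
        raise ValueError("implausible ICC tag count: %d" % tag_count)
    return tag_count


def ceil_div(a, b):
    """Integer ceiling division that avoids floating-point rounding."""
    return -(-a // b)


def tile_count(xsiz, ysiz, xtsiz, ytsiz, xtosiz=0, ytosiz=0):
    """Number of tiles implied by the SIZ image and tile dimensions."""
    if xtsiz <= 0 or ytsiz <= 0:
        raise ValueError("tile dimensions must be positive")
    return ceil_div(xsiz - xtosiz, xtsiz) * ceil_div(ysiz - ytosiz, ytsiz)


def checked_tile_count(xsiz, ysiz, xtsiz, ytsiz, xtosiz=0, ytosiz=0):
    """Reject implausible tile counts before any per-tile structure is built."""
    count = tile_count(xsiz, ysiz, xtsiz, ytsiz, xtosiz, ytosiz)
    if count > MAX_TILES:
        raise ValueError("implausible number of tiles: %d" % count)
    return count
```

For illustration: a 4096 x 4096 image with 1024 x 1024 tiles has 16 tiles, but setting bit 31 of the width turns that into 8388624 expected tiles. Likewise, the 67108875 entries reported by ExifTool equal 2^26 + 11, which would be consistent with a single flipped bit in a tag count of 11.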

Solution: impose a sensible upper limit here as well. I already did a quick test, which exposed a secondary problem:

    TypeError: cannot serialize 2147483664L (type long)

Solution: add the long type to the remap function in ETpatch, i.e.:

    elif textType in [int, long, float, bool]:
        textOut = str(remappedValue)
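The snippet above is Python 2, where int and long are separate types. A rough sketch of the same idea that runs on both Python 2 and Python 3 is shown below: numeric values are converted to text before they are handed to ElementTree, which only serializes strings. The names value_to_text and make_element and the numberOfTiles tag are made up for illustration and are not taken from ETpatch.

```python
import xml.etree.ElementTree as ET

try:
    INTEGER_TYPES = (int, long)   # Python 2: int and long are distinct types
except NameError:
    INTEGER_TYPES = (int,)        # Python 3: int has arbitrary precision

# bool is a subclass of int, so it is covered by INTEGER_TYPES as well
NUMERIC_TYPES = INTEGER_TYPES + (float,)


def value_to_text(value):
    """Convert numeric values to text; pass anything else through unchanged."""
    if isinstance(value, NUMERIC_TYPES):
        return str(value)
    return value


def make_element(tag, value):
    """Build an element whose text is safe to serialize."""
    elem = ET.Element(tag)
    elem.text = value_to_text(value)
    return elem


if __name__ == "__main__":
    # Hypothetical tag name; 2147483664 is the value from the error message
    print(ET.tostring(make_element("numberOfTiles", 2147483664)))
```

Using isinstance with a tuple of types rather than comparing against a list of exact types means a new numeric type only has to be added in one place.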

Then there's the endless loop in cc-16-kdu.jp2-killed-22181; I'll look at that later.

I'll fix this in the next release, and I'll also run tests on the full set of files.

from jpylyzer.

bitsgalore commented on June 7, 2024

Fixed in version 1.10.3. These were actually 3 separate issues:

  • ICC profile issue: fixed by imposing an upper limit of 4096 on tagCount in the ICC module (ExifTool does this as well).
  • Tile issue: fixed by imposing an upper limit of 65,535 on the number of tiles (Kakadu also uses this as an upper limit; it's unclear whether this limit is actually imposed by the standard, as I can't find it anywhere in the codestream spec).
  • Infinite looping over Unknown Box: the culprit here was a corrupted Box Length field that caused the parsing of the box structure to go all wrong. Fixed by adding a check on the -9999 value of boxLengthValue when looping over child boxes.
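The third fix can be illustrated with a small box walker. This is only a sketch under stated assumptions, not the code that went into 1.10.3: the function names are hypothetical, and BAD_LENGTH simply echoes the -9999 value mentioned above. It assumes the usual JP2 box layout of a 4-byte big-endian length followed by a 4-byte type, where a length of 1 means an 8-byte extended length follows and a length of 0 means the box runs to the end of the data.

```python
import struct

BAD_LENGTH = -9999  # sentinel for an unusable length, echoing the value above


def read_box_header(data, offset):
    """Return (box_length, box_type); box_length is BAD_LENGTH if unusable."""
    if offset + 8 > len(data):
        return BAD_LENGTH, b""
    lbox, tbox = struct.unpack(">I4s", data[offset:offset + 8])
    header_size = 8
    if lbox == 1:
        # Extended length: the real length is in the following 8 bytes
        if offset + 16 > len(data):
            return BAD_LENGTH, tbox
        (lbox,) = struct.unpack(">Q", data[offset + 8:offset + 16])
        header_size = 16
    elif lbox == 0:
        # Box extends to the end of the enclosing data
        lbox = len(data) - offset
    # A box can be neither shorter than its header nor longer than the data
    if lbox < header_size or offset + lbox > len(data):
        return BAD_LENGTH, tbox
    return lbox, tbox


def list_child_boxes(data):
    """Walk sibling boxes; stop at the first box with a corrupt length."""
    boxes = []
    offset = 0
    while offset < len(data):
        length, box_type = read_box_header(data, offset)
        if length == BAD_LENGTH:
            return boxes, False   # report what was parsed so far
        boxes.append((offset, box_type, length))
        offset += length          # length >= 8 here, so the loop always advances
    return boxes, True
```

Because every accepted length is at least the header size, the offset strictly increases on each pass, so a corrupted Box Length field can end the walk early but cannot make it loop forever.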


anjackson commented on June 7, 2024

In case anyone wants to take this kind of tools testing/analysis further, here's a link to my write-up of this technique: Understanding Tools & Formats Via Bitwise Analysis

