marshalparser's Introduction

marshalparser's People

Contributors

encukou, frenzymadness, hroncok, keszybz

marshalparser's Issues

Python 3.10.0a2 breakage

See https://src.fedoraproject.org/rpms/python3.10/pull-request/8#comment-59965

+ marshalparser /usr/lib64/python3.10/email/mime/__pycache__/audio.cpython-310.opt-1.pyc
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/marshalparser/marshalparser.py", line 205, in read_object
    self.record_object_result(result)
UnboundLocalError: local variable 'result' referenced before assignment

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/marshalparser", line 8, in <module>
    sys.exit(main())
  File "/usr/lib/python3.9/site-packages/marshalparser/marshalparser.py", line 373, in main
    parser.parse()
  File "/usr/lib/python3.9/site-packages/marshalparser/marshalparser.py", line 57, in parse
    self.read_object()
  File "/usr/lib/python3.9/site-packages/marshalparser/marshalparser.py", line 207, in read_object
    raise RuntimeError(
RuntimeError: Error: type [TYPE_FLOAT] is recognized but result is not present
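
The chained traceback above is what Python prints when a name is assigned only in some branches and then used unconditionally. A minimal sketch of that pattern follows; the `read_object` function here is a hypothetical stand-in, not marshalparser's actual code, and it only assumes that TYPE_FLOAT is listed as a known type without a branch that reads it.

```python
def read_object(type_name):
    # Hypothetical, simplified reader used only to reproduce the error pattern.
    known_types = {"TYPE_INT", "TYPE_FLOAT"}
    if type_name not in known_types:
        raise ValueError(f"unsupported type [{type_name}]")
    try:
        if type_name == "TYPE_INT":
            result = 42  # placeholder for the real decoding step
        # No branch handles TYPE_FLOAT, so `result` is never bound for it.
        return result
    except UnboundLocalError:
        # Raising inside `except` produces the "During handling of the
        # above exception, another exception occurred" chain.
        raise RuntimeError(
            f"Error: type [{type_name}] is recognized but result is not present"
        )
```

Under that assumption, the fix would be either a dedicated branch for the type or a parser that does not reach this byte in the first place.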

Add Python 3.12

The marshalparser tests failed during https://src.fedoraproject.org/rpms/python3.12/pull-request/1

+ marshalparser /usr/lib64/python3.12/test/test_importlib/resources/__pycache__/test_open.cpython-312.pyc
Cannot read/parse byte 53 b'5' on possition 0
Might be error or unsupported TYPE

This is expected, since magic.py does not recognize 3.12 yet. Hence this issue.

"Content is the same, nothing to fix…" is hard to understand

In Fedora, when we run find ... | xargs marshalparser --fix --overwrite I see:

+ /usr/lib/rpm/redhat/brp-fix-pyc-reproducibility /builddir/build/BUILDROOT/glib2-2.70.0-1.fc36.x86_64/usr/share
Content is the same, nothing to fix…
Content is the same, nothing to fix…
Content is the same, nothing to fix…
Content is the same, nothing to fix…
Content is the same, nothing to fix…
Content is the same, nothing to fix…
Content is the same, nothing to fix…
Content is the same, nothing to fix…
Content is the same, nothing to fix…
Content is the same, nothing to fix…
Content is the same, nothing to fix…
Content is the same, nothing to fix…

At that point, I think it is just noise. Would you consider not printing this line at all or changing the line to something more explanatory (i.e. what content is the same as what)?

I think something like this would be ideal:

No unused FLAG_REFs in /builddir/build/BUILDROOT/glib2-2.70.0-1.fc36.x86_64/usr/share/spam.py
Removing unused FLAG_REFs from /builddir/build/BUILDROOT/glib2-2.70.0-1.fc36.x86_64/usr/share/eggs.py

WDYT?
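
A minimal sketch of how the two proposed messages could be produced. The `fix_message` helper and its `removed_count` argument are hypothetical and only illustrate the wording; they are not part of marshalparser.

```python
def fix_message(path, removed_count):
    # `removed_count` is an assumed input: how many unused FLAG_REFs
    # were stripped from the file at `path`.
    if removed_count == 0:
        return f"No unused FLAG_REFs in {path}"
    return f"Removing unused FLAG_REFs from {path}"


print(fix_message("usr/share/spam.py", 0))
print(fix_message("usr/share/eggs.py", 3))
```

Naming the file in each line makes the output useful even when xargs runs the tool over many files.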

Adjust magic ranges for Python 3.11

Python 3.10 final is out. The magic range is inclusive_range(3430, 3439).
A new range for 3.11 should be something like inclusive_range(3450, 3500).

I can do that, but I am not sure how to set up tests for Python 3.11.
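
For illustration, a range table along these lines could look roughly like the sketch below. The body of `inclusive_range`, the `MAGIC_RANGES` name and the `version_for_magic` helper are assumptions; the 3.10 range is the one quoted above and the 3.11 range is only the guess from this issue, not a confirmed value.

```python
def inclusive_range(start, stop):
    # Like range(), but `stop` is included.
    return range(start, stop + 1)


# Assumed layout: Python version -> range of pyc magic numbers.
MAGIC_RANGES = {
    (3, 10): inclusive_range(3430, 3439),
    (3, 11): inclusive_range(3450, 3500),  # proposed, not final
}


def version_for_magic(magic):
    # Return the (major, minor) version owning `magic`, or None if unknown.
    for version, magic_range in MAGIC_RANGES.items():
        if magic in magic_range:
            return version
    return None
```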

marshalparser cannot parse pyc files if not named *.pyc

I've played with marshalparser a bit and I have trouble using it on files with arbitrary filenames.

$ touch foo.py 
$ python3.8 -m foo
$ mv __pycache__/foo.cpython-38.pyc test.dat
$ python -m marshalparser -p test.dat 
Cannot read/parse byte 85 b'U' on possition 0
Might be error or unsupported TYPE

Or even:

$ touch foo.py 
$ python3.9 -m foo
$ mv __pycache__/foo.cpython-39.pyc test.dat
$ python -m marshalparser -p test.dat 
Traceback (most recent call last):
  File "/usr/lib64/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib64/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File ".../marshalparser/marshalparser/__main__.py", line 3, in <module>
    main()
  File ".../marshalparser/marshalparser/marshalparser.py", line 359, in main
    parser.parse()
  File ".../marshalparser/marshalparser/marshalparser.py", line 50, in parse
    self.read_object()
  File ".../marshalparser/marshalparser/marshalparser.py", line 119, in read_object
    result = self.read_string()
  File ".../marshalparser/marshalparser/marshalparser.py", line 221, in read_string
    bytes = self.read_bytes(size)
  File ".../marshalparser/marshalparser/marshalparser.py", line 208, in read_bytes
    index, byte = next(self.iterator)
StopIteration

It seems that marshalparser relies on the filename to skip the pyc header:

# skip pyc header (first n bytes)
if filename.suffix == ".pyc":
    self.python_version = get_pyc_python_version(filename)
    pyc_header_len = get_pyc_header_lenght(self.python_version)
    for x in range(pyc_header_len):
        next(iterator)

I find that a tad confusing. I would assume all processed files have the header. Can a non-default --pure-marshal or --no-header switch be added to handle the cases where the pyc header is missing? Or even better, can the header be detected?
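
One possible way to detect the header from the content instead of the filename, as a rough sketch: CPython 3 magic numbers are two bytes followed by carriage return and line feed, so bytes 2 and 3 can serve as a marker. The helper names below are hypothetical, and the 16-byte header length is an assumption that holds for Python 3.7 and newer (older versions use shorter headers).

```python
import importlib.util
import marshal

# Assumption: 16-byte header, as used by Python 3.7 and newer (PEP 552).
PYC_HEADER_LEN = 16


def has_pyc_header(data):
    # The 4-byte magic ends with carriage return + line feed.
    return len(data) >= 4 and data[2:4] == b"\r\n"


def marshal_payload(data, header_len=PYC_HEADER_LEN):
    # Hypothetical helper: drop the header only when one is detected.
    if has_pyc_header(data):
        return data[header_len:]
    return data


# Build a fake pyc in memory: magic + 12 zeroed header bytes + marshal data.
code = compile("x = 1", "<demo>", "exec")
payload = marshal.dumps(code)
fake_pyc = importlib.util.MAGIC_NUMBER + bytes(12) + payload

assert marshal_payload(fake_pyc) == payload  # header stripped
assert marshal_payload(payload) == payload   # pure marshal left alone
```

This is a heuristic: a pure marshal stream could in principle carry the same two bytes at that offset, so an explicit switch would still be useful as an override.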

Find a better way to maintain compatibility with the list of magic numbers from CPython

I can imagine some kind of range for each Python version, but I'm afraid that it might end up identifying pure marshal files as pyc files.

I have to investigate whether some kind of random conflict might happen. If not, we can maintain future-proof ranges of magic numbers for every supported Python version and adjust them only when something important changes there, so we won't have to release a new version after every change.
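
As a starting point for that investigation, here is a small sketch showing that a conflict is possible in principle. It assumes a hypothetical `looks_like_pyc` check (magic number inside a version range, followed by carriage return and line feed) and uses a hand-picked integer whose marshal encoding happens to satisfy it.

```python
import marshal


def looks_like_pyc(data, magic_range):
    # Hypothetical check: first two bytes (little-endian) fall in the
    # range and the next two bytes are carriage return + line feed.
    if len(data) < 4 or data[2:4] != b"\r\n":
        return False
    return int.from_bytes(data[:2], "little") in magic_range


# Pure marshal data for the int 658701 (0x0A0D0D), with no pyc header:
# type byte b"i" (0x69) followed by the value as 4 little-endian bytes.
pure_marshal = b"i\r\r\n\x00"
value = marshal.loads(pure_marshal)

# The first two bytes read as 0x0D69 == 3433, inside the 3.10 range.
collides = looks_like_pyc(pure_marshal, range(3430, 3440))
print(value, collides)
```

The example is artificial, so it says nothing about how likely such a conflict is in real files; it only shows that ranges alone cannot rule it out.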
