ua-nick / fleep-py

File format determination library for Python

Home Page: https://pypi.python.org/pypi/fleep

License: MIT License

Python 100.00%
magic-numbers magic-number filetype fileformat mimetype extension file-extensions file-types file-format file-format-detection

fleep-py's Issues

Has fleep been abandoned?

I was looking for a way to identify file types in Python, and this library gets the most hits on the net. However, it looks like it was abandoned some time ago.

Can anyone confirm whether it is still being actively maintained?

doc mime type seems wrong

The MIME type for Microsoft Word looks wrong:
fleep\data.json: line 70
{"type": "document", "extension": "doc", "mime": "application/vnd.ms-excel", "offset": 0, "signature": ["D0 CF 11 E0 A1 B1 1A E1", "50 4B 03 04 14 00 06 00"]},

Should be:
application/msword
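
For readers unfamiliar with the entry format quoted above, here is a minimal sketch of how such an entry could be matched against file bytes. It is not fleep's actual implementation: `ENTRIES` and `match_entries` are hypothetical names, and the entry uses the corrected MIME type proposed in this issue.

```python
# Hypothetical entry list shaped like the data.json line quoted above.
# The "mime" value is the corrected one proposed in this issue.
ENTRIES = [
    {"type": "document", "extension": "doc", "mime": "application/msword",
     "offset": 0,
     "signature": ["D0 CF 11 E0 A1 B1 1A E1", "50 4B 03 04 14 00 06 00"]},
]


def match_entries(data: bytes, entries=ENTRIES):
    """Return every entry that has a signature present at the entry's offset."""
    matches = []
    for entry in entries:
        start = entry["offset"]
        for signature in entry["signature"]:
            magic = bytes.fromhex(signature)  # fromhex ignores the spaces
            if data[start:start + len(magic)] == magic:
                matches.append(entry)
                break  # one matching signature per entry is enough
    return matches


# An OLE compound file header followed by padding, as a stand-in for a .doc file.
ole_header = bytes.fromhex("D0 CF 11 E0 A1 B1 1A E1") + b"\x00" * 120
print([entry["mime"] for entry in match_entries(ole_header)])
```

Note that the first signature is the generic OLE compound file header, which older .xls and .ppt files also use, so a signature match alone cannot tell those formats apart.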

MP3 file not being recognized properly.

I am using the following MP3 file, but fleep returns an empty list for type, mime, and extension:

import fleep
# Read the file in binary mode and close it once the bytes are loaded
with open("test.mp3", "rb") as f:
    info = fleep.get(f.read())
assert info.type == ["audio"]
assert info.extension == ["mp3"]

It returns the following error:

AssertionError: assert [] == ['audio']

MP4 is indicated for M4A files

Audio-only .m4a files are reported with mp4 as the file extension:

    with open(full_file, "rb") as f:
        info = fleep.get(f.read(128))

    print(info.type, info.extension, info.mime, file)

['video'] ['mp4'] ['video/mp4'] Bob Marley & The Wailers - Jamming-VF30VS5FZ6c.m4a
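
One possible way to tell the two apart is the major brand stored in the `ftyp` box at the start of ISO base media files. The sketch below is an assumption about how this could be done, not something fleep does: the function names are hypothetical, and only the `M4A ` brand is treated as audio. Files with other brands are still reported as mp4.

```python
def ftyp_major_brand(data: bytes):
    """Return the 4-character major brand of an ISO base media file, or None."""
    # Bytes 0-3 hold the box size, bytes 4-7 the box type, bytes 8-11 the brand.
    if len(data) < 12 or data[4:8] != b"ftyp":
        return None
    return data[8:12].decode("ascii", errors="replace")


def guess_mp4_family_extension(data: bytes):
    """Guess 'm4a' or 'mp4' from the major brand (hypothetical helper)."""
    brand = ftyp_major_brand(data)
    if brand is None:
        return None
    # Assumption: only the "M4A " brand marks an audio-only file.
    return "m4a" if brand == "M4A " else "mp4"


sample = b"\x00\x00\x00\x20ftypM4A \x00\x00\x00\x00"
print(ftyp_major_brand(sample), guess_mp4_family_extension(sample))
```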

fleep.get returns nothing regarding type, extension, or mime

I have the following code

with open("back number - sister.mp3", "rb") as file:
    info = fleep.get(file.read(128))
    print(info.type)
    print(info.extension)
    print(info.mime)

But the output is three empty lists:

[]
[]
[]

I also tried exiftool, and it shows the correct info:

ExifTool Version Number         : 11.11
File Name                       : back number - sister.mp3
Directory                       : .
File Size                       : 3.8 MB
File Modification Date/Time     : 2018:12:15 18:57:00+09:00
File Access Date/Time           : 2019:02:02 18:00:44+09:00
File Inode Change Date/Time     : 2018:12:15 18:57:28+09:00
File Permissions                : rw-r--r--
File Type                       : MP3
File Type Extension             : mp3
MIME Type                       : audio/mpeg
MPEG Audio Version              : 1
Audio Layer                     : 3
Audio Bitrate                   : 128 kbps
Sample Rate                     : 44100
Channel Mode                    : Joint Stereo
MS Stereo                       : On
Intensity Stereo                : Off
Copyright Flag                  : False
Original Media                  : True
Emphasis                        : None
Encoder                         : LAME3.99r
Lame VBR Quality                : 4
Lame Quality                    : 3
Lame Method                     : CBR
Lame Low Pass Filter            : 17 kHz
Lame Bitrate                    : 128 kbps
Lame Stereo Mode                : Joint Stereo
ID3 Size                        : 128
Title                           :
Artist                          :
Album                           :
Year                            :
Comment                         :
Genre                           : None
Duration                        : 0:04:07 (approx)

Link to the problem file: back number - sister.mp3

Could you please help?

Not working for specific jpeg image

Greetings!

First of all, thanks for the great lib, it's very simple and well defined.

However, I've been having problems with a specific image, the one found in this URL. Here's the code:

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) ' +
               'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
response = requests.get(url, headers=headers)
content = ContentFile(response.content)
info = fleep.get(content.read(128))

# < ----- debug line ----- >

if not info.type_matches('raster-image'):
    raise ValidationError('some error here')

When debugging at the specified point of the code, I get:

ipdb> info.type
[]
ipdb> info.extension
[]
ipdb> info.mime
[]

Do you know what could be happening?

Cheers!
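
When a result comes back empty like this, it can help to look at the bytes that were actually passed to fleep. The sketch below is a debugging aid only, with hypothetical function names. It relies on the fact that JPEG data starts with the bytes FF D8 FF.

```python
def describe_leading_bytes(data: bytes, count: int = 16) -> str:
    """Format the first `count` bytes as spaced upper-case hex (Python 3.8+)."""
    return data[:count].hex(" ").upper()


def looks_like_jpeg(data: bytes) -> bool:
    """JPEG data starts with the SOI marker FF D8 followed by another FF marker."""
    return data[:3] == b"\xff\xd8\xff"


# Stand-in for response.content: the start of a JFIF-style JPEG file.
jfif_start = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"
print(describe_leading_bytes(jfif_start))
print(looks_like_jpeg(jfif_start))
```

If the bytes do not start with FF D8 FF, the server may have returned something other than JPEG data, such as an error page or a different image format, despite the URL. If they do, the library's signature list may lack the particular marker that follows.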

Fleep wheels won't build

Here's the log; I don't know what else to post.

Defaulting to user installation because normal site-packages is not writeable
Collecting fleep
  Using cached fleep-1.0.1.tar.gz (6.5 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: fleep
  Building wheel for fleep (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [79 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib
      creating build/lib/fleep
      copying fleep/__init__.py -> build/lib/fleep
      running egg_info
      writing fleep.egg-info/PKG-INFO
      writing dependency_links to fleep.egg-info/dependency_links.txt
      writing top-level names to fleep.egg-info/top_level.txt
      reading manifest file 'fleep.egg-info/SOURCES.txt'
      reading manifest template 'MANIFEST.in'
      writing manifest file 'fleep.egg-info/SOURCES.txt'
      copying fleep/data.json -> build/lib/fleep
      /usr/lib/python3/dist-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
      !!
      
              ********************************************************************************
              Please avoid running ``setup.py`` directly.
              Instead, use pypa/build, pypa/installer or other
              standards-based tools.
      
              See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
              ********************************************************************************
      
      !!
        self.initialize_options()
      installing to build/bdist.linux-x86_64/wheel
      running install
      running install_lib
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-xzyizbr1/fleep_7188a3aba8414fa595f6b9510b7e1289/setup.py", line 4, in <module>
          setuptools.setup(
        File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 107, in setup
          return distutils.core.setup(**attrs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/lib/python3/dist-packages/setuptools/_distutils/core.py", line 185, in setup
          return run_commands(dist)
                 ^^^^^^^^^^^^^^^^^^
        File "/usr/lib/python3/dist-packages/setuptools/_distutils/core.py", line 201, in run_commands
          dist.run_commands()
        File "/usr/lib/python3/dist-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)
        File "/usr/lib/python3/dist-packages/setuptools/dist.py", line 1233, in run_command
          super().run_command(command)
        File "/usr/lib/python3/dist-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/usr/lib/python3/dist-packages/wheel/bdist_wheel.py", line 381, in run
          self.run_command("install")
        File "/usr/lib/python3/dist-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/usr/lib/python3/dist-packages/setuptools/dist.py", line 1233, in run_command
          super().run_command(command)
        File "/usr/lib/python3/dist-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/usr/lib/python3/dist-packages/setuptools/command/install.py", line 78, in run
          return orig.install.run(self)
                 ^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/lib/python3/dist-packages/setuptools/_distutils/command/install.py", line 708, in run
          self.run_command(cmd_name)
        File "/usr/lib/python3/dist-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/usr/lib/python3/dist-packages/setuptools/dist.py", line 1233, in run_command
          super().run_command(command)
        File "/usr/lib/python3/dist-packages/setuptools/_distutils/dist.py", line 987, in run_command
          cmd_obj.ensure_finalized()
        File "/usr/lib/python3/dist-packages/setuptools/_distutils/cmd.py", line 111, in ensure_finalized
          self.finalize_options()
        File "/usr/lib/python3/dist-packages/setuptools/command/install_lib.py", line 17, in finalize_options
          self.set_undefined_options('install',('install_layout','install_layout'))
        File "/usr/lib/python3/dist-packages/setuptools/_distutils/cmd.py", line 296, in set_undefined_options
          setattr(self, dst_option, getattr(src_cmd_obj, src_option))
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/lib/python3/dist-packages/setuptools/_distutils/cmd.py", line 107, in __getattr__
          raise AttributeError(attr)
      AttributeError: install_layout. Did you mean: 'install_platlib'?
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for fleep
  Running setup.py clean for fleep
Failed to build fleep
ERROR: Could not build wheels for fleep, which is required to install pyproject.toml-based projects

LAME encoder ID

Files produced by the LAME encoder start with a different magic number than the ones fleep provides, so an empty (falsy) list is returned.

FF FB 90 64 with no offset
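
Those four bytes are an MPEG audio frame header rather than a fixed magic number, so they can be decoded field by field. The sketch below is not part of fleep, and all names in it are hypothetical. It only looks up bitrate and sample rate for MPEG 1 Layer III, which is the case reported here.

```python
VERSIONS = {0b00: "MPEG 2.5", 0b10: "MPEG 2", 0b11: "MPEG 1"}  # 0b01 is reserved
LAYERS = {0b01: 3, 0b10: 2, 0b11: 1}                           # 0b00 is reserved
# Bitrates in kbps for MPEG 1 Layer III only (index 0 = free format, 15 = invalid).
V1_L3_BITRATES = [None, 32, 40, 48, 56, 64, 80, 96,
                  112, 128, 160, 192, 224, 256, 320, None]
V1_SAMPLE_RATES = [44100, 48000, 32000, None]
CHANNEL_MODES = ["Stereo", "Joint Stereo", "Dual Channel", "Mono"]


def decode_frame_header(header: bytes):
    """Decode a 4-byte MPEG audio frame header, or return None if it is invalid."""
    if len(header) < 4:
        return None
    word = int.from_bytes(header[:4], "big")
    if word >> 21 != 0x7FF:  # the top 11 bits are the frame sync, all set
        return None
    version = VERSIONS.get((word >> 19) & 0b11)
    layer = LAYERS.get((word >> 17) & 0b11)
    if version is None or layer is None:
        return None
    info = {"version": version, "layer": layer,
            "channel_mode": CHANNEL_MODES[(word >> 6) & 0b11]}
    if version == "MPEG 1" and layer == 3:
        info["bitrate_kbps"] = V1_L3_BITRATES[(word >> 12) & 0b1111]
        info["sample_rate"] = V1_SAMPLE_RATES[(word >> 10) & 0b11]
    return info


print(decode_frame_header(bytes.fromhex("FF FB 90 64")))
```

Because the bitrate, sample rate, and channel mode are encoded in these bytes, files with other encoder settings start with different values. That is one reason a single fixed signature misses many MP3 files.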

MP3 Info Tag revision

Since MP3 is a streaming format, it doesn't have an official magic number (as you can see here), so it can be a bit tricky to get a correct identification, especially when the file was made with some wonky online tool.

       Magic number(s): none
       File extension(s): .mp1, .mp2, .mp3
       Macintosh File Type Code(s): MPEG
       Object Identifier(s) or OID(s): none
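
With no registered magic number, detection has to rely on a heuristic. One common approach, sketched below under stated assumptions, is to accept either an ID3v2 tag or a plausible MPEG audio frame header at the start of the data. The function name is hypothetical, and the check can still produce false positives or miss files that begin with other data.

```python
def looks_like_mpeg_audio(data: bytes) -> bool:
    """Heuristic check for MPEG audio (.mp1/.mp2/.mp3) at the start of `data`."""
    # Many MP3 files begin with an ID3v2 tag, which starts with the ASCII "ID3".
    if data[:3] == b"ID3":
        return True
    if len(data) < 3:
        return False
    b0, b1, b2 = data[0], data[1], data[2]
    # Frame sync: 11 set bits (all of byte 0 plus the top 3 bits of byte 1).
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        return False
    version_ok = ((b1 >> 3) & 0b11) != 0b01   # 01 is a reserved version
    layer_ok = ((b1 >> 1) & 0b11) != 0b00     # 00 is a reserved layer
    bitrate_ok = (b2 >> 4) != 0b1111          # 1111 is an invalid bitrate index
    rate_ok = ((b2 >> 2) & 0b11) != 0b11      # 11 is a reserved sample rate
    return version_ok and layer_ok and bitrate_ok and rate_ok


print(looks_like_mpeg_audio(bytes.fromhex("FF FB 90 64")))
print(looks_like_mpeg_audio(b"\xff\xd8\xff\xe0"))
```

Rejecting the reserved field values keeps other formats that start with FF, such as JPEG, from being reported as audio.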

NEF is wrong

I tested this on some NEF (Nikon Electronic Format) files. It detects them as raster-image and raw types. However, NEF isn't in any of the extensions listed.

Tagging versions in git

At Gentoo, we prefer to have the test suite available so that we can test the package across Python versions.
As the test suite isn't shipped in the PyPI tarballs, could you tag each version in GitHub to make version tarballs easier to download?
