Mimetics determines the file type, MIME type, and media type of a given file using magic numbers and content analysis to detect the most likely file type and then maps that to the appropriate MIME and media type.

License: MIT License

Mimetics

Know thy files

Mimetics determines the file type, MIME type, and media type of a given file. It uses magic numbers and content analysis to detect the most likely file type, then maps that to the appropriate MIME and media type. I've included all of the file types that GPT could think of; if I'm missing any, please submit a pull request.

Features

  • Detects file type using magic numbers
  • Detects file type using text content analysis
  • Extracts the MIME and media types
  • Supports a wide variety of file types
  • Extendable with user-defined magic numbers, MIME types and file types
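
To make the first and third features concrete, here is a minimal, self-contained sketch of magic-number detection followed by a MIME and media type lookup. It is not Mimetics's actual implementation: the `SIGNATURES` and `MIME_TYPES` tables and the `detect` function are hypothetical names used only for illustration, and the media type is assumed to be the part of the MIME type before the slash.

```javascript
// Hypothetical signature table: extension -> leading bytes (magic number).
const SIGNATURES = {
  jpg: [0xff, 0xd8, 0xff],
  png: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a],
  pdf: [0x25, 0x50, 0x44, 0x46], // "%PDF"
  gif: [0x47, 0x49, 0x46, 0x38], // "GIF8"
}

// Hypothetical extension -> MIME type table.
const MIME_TYPES = {
  jpg: 'image/jpeg',
  png: 'image/png',
  pdf: 'application/pdf',
  gif: 'image/gif',
}

// Returns true when `bytes` starts with every byte in `signature`.
function startsWith(bytes, signature) {
  if (bytes.length < signature.length) return false
  return signature.every((byte, i) => bytes[i] === byte)
}

function detect(input) {
  // Accept a Node Buffer, a Uint8Array, or an ArrayBuffer.
  const bytes = input instanceof ArrayBuffer ? new Uint8Array(input) : input
  const ext = Object.keys(SIGNATURES).find((key) => startsWith(bytes, SIGNATURES[key]))
  if (!ext) return null
  // Fall back to a generic MIME type when the extension has no mapping.
  const mime = MIME_TYPES[ext] || 'application/octet-stream'
  // Assumption: the media type is the top-level part of the MIME type.
  const media = mime.split('/')[0]
  return { ext, mime, media }
}

console.log(detect(Buffer.from([0xff, 0xd8, 0xff, 0xe0])))
// Logs { ext: 'jpg', mime: 'image/jpeg', media: 'image' }
```

A real detector needs more than a prefix check, since some formats share a signature or place it at an offset, which is where content analysis comes in.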

Installation

npm install mimetics

Usage

Basic Usage

Read a file and pass the resulting buffer to Mimetics, which analyzes it to determine its file type, MIME type, and media type:

const mimetics = require('mimetics')
const fs = require('fs')

const buffer = fs.readFileSync('example.jpg')

// call it as a function:
const fileInfo = mimetics(buffer)

// or, equivalently, call the `parse` method instead:
// const fileInfo = mimetics.parse(buffer)

console.log(fileInfo) // Logs { ext: 'jpg', mime: 'image/jpeg', media: 'image' }

In the Browser

// Assumes `mimetics` is already available on the page (for example, via a bundler).
const fileInput = document.getElementById('myFileInput')
fileInput.addEventListener('change', (event) => {
  const file = event.target.files[0]
  const reader = new FileReader()
  // Runs once the file has been read into memory
  reader.onload = (loadEvent) => {
    const arrayBuffer = loadEvent.target.result
    const fileInfo = mimetics(arrayBuffer)
    console.log(fileInfo)
  }
  reader.readAsArrayBuffer(file)
})

Adding Custom Magic Numbers

Here, we're adding a custom magic number for a hypothetical file type and then using Mimetics to analyze a file of that type:

const Mimetics = require('mimetics')
const fs = require('fs')

const buffer = fs.readFileSync('example.custom')
// Register the magic number for the `custom` extension when creating the instance
const mm = new Mimetics({
  magic: { custom: [0x43, 0x55, 0x53, 0x54] }
})
const fileInfo = mm.parse(buffer)

console.log(fileInfo) // Logs { ext: 'custom', mime: 'application/octet-stream', media: 'application' }
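
The bytes `0x43, 0x55, 0x53, 0x54` used above are the ASCII codes for `CUST`. As a small sketch, assuming your format starts with a fixed ASCII signature, you could derive the magic-number array from the string instead of typing the bytes by hand. The helper name `signatureToBytes` is hypothetical and not part of Mimetics:

```javascript
// Hypothetical helper: turn an ASCII signature into an array of byte values.
function signatureToBytes(signature) {
  return Array.from(Buffer.from(signature, 'ascii'))
}

const magic = signatureToBytes('CUST')
console.log(magic.map((b) => '0x' + b.toString(16))) // [ '0x43', '0x55', '0x53', '0x54' ]
```

The resulting array has the same shape as the value passed under `magic` in the example above.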

Adding Custom MIME Types

Here, we're adding a custom MIME type for a specific file extension:

const Mimetics = require('mimetics')
const fs = require('fs')

const buffer = fs.readFileSync('example.custom')
const mm = new Mimetics()
mm.addMimeType('custom', 'application/x-custom')

const fileInfo = mm.parse(buffer)

console.log(fileInfo) // Logs { ext: 'custom', mime: 'application/x-custom', media: 'application' }

API

Mimetics exports a single class with the following methods:

  • parse(buffer: Buffer): Object - Takes in a Buffer and returns an object with the file type (ext), MIME type (mime), and media type (media).
  • getFileType(buffer: Buffer): string - Determines the file type from the provided buffer.
  • getMimeType(extension: string): string - Returns the MIME type for the provided file extension.
  • getMediaType(extension: string): string - Returns the media type for the provided file extension.

Mimetics also provides a comprehensive API to customize and extend the default behavior:

  • setOptions(opts: Object): Mimetics - Sets options for the instance, enabling the addition of custom magic numbers, MIME types, file types and edge cases.
  • addMagicNumber(ext: string, magicNumber: Array<number>): void - Adds a new magic number to the instance's magicNumbers map.
  • addMimeType(ext: string, mimeType: string): void - Adds a new MIME type to the instance's mimeTypeMap.
  • addFileType(ext: string, regex: RegExp): void - Adds a new file type to the instance's fileTypeMap.
  • addEdgeCase(specialExt: string, extList: Array<string>): void - Adds a new edge case to the instance's edgeCases map.
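
`addFileType` pairs an extension with a regular expression, which suggests that text-based formats are recognized by matching patterns against the decoded content. The sketch below shows that general idea in isolation. It does not reproduce Mimetics's internals: `TEXT_PATTERNS`, `detectTextType`, and the specific patterns are assumptions chosen for illustration.

```javascript
// Hypothetical pattern table, checked in order: [extension, RegExp].
const TEXT_PATTERNS = [
  ['svg', /<svg[\s>]/i],
  ['html', /<!doctype html|<html[\s>]/i],
  ['json', /^\s*[\[{]/], // deliberately loose: only checks the first character
]

function detectTextType(input, sampleSize = 1024) {
  // Accept a Node Buffer, a Uint8Array, or an ArrayBuffer.
  const bytes = input instanceof ArrayBuffer ? new Uint8Array(input) : input
  // Only decode the head of the file; these patterns do not need more.
  const head = new TextDecoder('utf-8').decode(bytes.subarray(0, sampleSize))
  const match = TEXT_PATTERNS.find(([, regex]) => regex.test(head))
  return match ? match[0] : null
}

console.log(detectTextType(Buffer.from('<!DOCTYPE html><html></html>'))) // html
console.log(detectTextType(Buffer.from('{"name": "mimetics"}'))) // json
```

Order matters in a table like this: more specific patterns (SVG) are listed before more general ones, since an SVG document could otherwise be claimed by a broader rule.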

Mimetics.parse(buffer)

Takes a buffer as input, identifies the file type, MIME type, and media type, and returns an object with three properties: ext, mime, and media.

Parameters

  • buffer - Buffer - The input buffer to be parsed.

Returns

An object containing the determined file type (ext), MIME type (mime), and media type (media).


For more detailed API documentation, see the API reference and the comments in the code.

Contributing

Contributions are welcome. Submit a pull request or open an issue to discuss any changes. Please read contributing.md for details on our code of conduct and the process for submitting pull requests.

Testing

Mimetics includes a test suite built with testr.

To run the tests, first clone the repository:

git clone https://github.com/basedwon/mimetics.git

Install the dependencies, then run npm test:

npm install
npm test

Donations

If you find this project useful and want to help support further development, please send us some coin. We greatly appreciate any and all contributions. Thank you!

Bitcoin (BTC):

1JUb1yNFH6wjGekRUW6Dfgyg4J4h6wKKdF

Monero (XMR):

46uV2fMZT3EWkBrGUgszJCcbqFqEvqrB4bZBJwsbx7yA8e2WBakXzJSUK8aqT4GoqERzbg4oKT2SiPeCgjzVH6VpSQ5y7KQ

License

Mimetics is MIT licensed.

mimetics's Issues

zip files are read as docx files

When you parse a zip file, it is detected as a docx file because they share the same magic number; the same happens with pptx and xlsx. Is there a way to identify them individually?

thanks!
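
One possible approach to the question above, offered as a sketch and not as something Mimetics is known to do: docx, xlsx, and pptx files are ZIP archives whose entries include `[Content_Types].xml` plus paths under `word/`, `xl/`, or `ppt/`. Because entry names are stored uncompressed, a byte search can tell them apart after the shared magic number matches. The function name `refineZipType` is hypothetical, the check is a heuristic that can misfire on unusual archives, and it relies on Node's `Buffer.includes`.

```javascript
// ZIP local file header signature: "PK\x03\x04".
const ZIP_SIGNATURE = [0x50, 0x4b, 0x03, 0x04]

// Entry-name prefixes that identify each Office Open XML package type.
const OOXML_MARKERS = { docx: 'word/', xlsx: 'xl/', pptx: 'ppt/' }

function refineZipType(buffer) {
  const isZip = ZIP_SIGNATURE.every((byte, i) => buffer[i] === byte)
  if (!isZip) return null
  // Every OOXML package carries this entry; plain ZIP files usually do not.
  if (!buffer.includes('[Content_Types].xml')) return 'zip'
  const found = Object.keys(OOXML_MARKERS).find((ext) => buffer.includes(OOXML_MARKERS[ext]))
  return found || 'zip'
}

// Fake bytes that only imitate the parts of an xlsx file this check looks at.
const sample = Buffer.concat([
  Buffer.from(ZIP_SIGNATURE),
  Buffer.from('[Content_Types].xml ... xl/workbook.xml'),
])
console.log(refineZipType(sample)) // xlsx
```

Parsing the ZIP central directory would be more reliable than a raw search, at the cost of more code.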

thanks for the library

I was trying file type, but I was having issues with TypeScript and Jest.

you saved me :D