rusty-ferris-club / decompress

Extracting archives made easy for Rust 🦀

License: Apache License 2.0

Rust 98.74% Shell 1.26%
decompress rust rust-lang tar unpack zip

decompress's People

Contributors

jondot

Forkers

arnovsky tecc dsully

decompress's Issues

Refactoring: use `anyhow` for all internal error handling and propagation, but use a final `Error` for API

Suggestion / Feature Request

The various decompressors use a "central" error enum that converts from named, specific error types. Although this is the recommended best practice, it makes the error-handling code overly verbose.

anyhow chains and wraps errors, and it encourages adding context, which is something we need here.
The only drawback of using anyhow in a library is that we should not expose any anyhow type in the public API.

This is why we suggest a refactoring: use anyhow for all internal error handling and propagation, but convert it to a final Error for the public API.

Notes:

  • get familiar with anyhow first, and see how context is useful for "attaching" information to an error
  • convert all internal Result types to anyhow::Result; this is expected to be easy
  • locate every public API function and return a std::result::Result with the crate's own Error. This means converting from anyhow::Result to a standard one at the API boundary; the existing thiserror dependency can be used for that
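
The boundary described in these notes could look roughly like the sketch below. To stay dependency-free it uses a boxed `std::error::Error` as a stand-in for `anyhow::Error`, and a hand-written `Display` impl where thiserror would normally derive one. The names `DecompressError`, `read_archive` and `archive_len` are made up for illustration and are not part of the crate.

```rust
use std::error::Error as StdError;
use std::fmt;
use std::fs;
use std::path::Path;

// Stand-in for `anyhow::Error` (assumption): any boxed error can flow
// through internal code without a dedicated enum variant.
type InternalError = Box<dyn StdError + Send + Sync + 'static>;
type InternalResult<T> = Result<T, InternalError>;

/// Hypothetical public error type: the only error the API exposes.
#[derive(Debug)]
pub enum DecompressError {
    /// An internal failure, flattened to a message that carries its context.
    Internal(String),
}

impl fmt::Display for DecompressError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DecompressError::Internal(msg) => write!(f, "decompression failed: {msg}"),
        }
    }
}

impl StdError for DecompressError {}

// Internal helper: attaches context to the I/O error, similar in spirit
// to what anyhow's context mechanism does.
fn read_archive(path: &Path) -> InternalResult<Vec<u8>> {
    fs::read(path).map_err(|e| format!("reading archive {}: {e}", path.display()).into())
}

/// Hypothetical public entry point: the internal error is converted
/// into the public type at the API boundary and never leaks out.
pub fn archive_len(path: &Path) -> Result<usize, DecompressError> {
    let bytes = read_archive(path).map_err(|e| DecompressError::Internal(e.to_string()))?;
    Ok(bytes.len())
}
```

With the real crates, the internal alias would be `anyhow::Result` and the public enum would most likely be derived with thiserror; the shape of the boundary stays the same.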

test

Suggestion / Feature Request

Your suggestion here

add a `can_decompress` checker on public API

Suggestion / Feature Request

When going over a massive list of files, it may be preferable to run a cheap check for whether a file can be decompressed before attempting it.
While there's no difference in performance, the API might be useful to some users for making decisions in other parts of their program logic.
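
A minimal sketch of such a checker, assuming each decompressor exposes a path-only `test` method. The trait, the `BySuffix` type and `default_stack` are hypothetical and only illustrate the idea; the crate's actual types may differ.

```rust
use std::path::Path;

/// Hypothetical decompressor interface: `test` only inspects the path.
pub trait Decompressor {
    fn test(&self, archive: &Path) -> bool;
}

/// Matches files whose name ends with one of the given suffixes.
pub struct BySuffix {
    pub suffixes: &'static [&'static str],
}

impl Decompressor for BySuffix {
    fn test(&self, archive: &Path) -> bool {
        // Compare case-insensitively against the file name only.
        let name = archive
            .file_name()
            .map(|n| n.to_string_lossy().to_lowercase())
            .unwrap_or_default();
        self.suffixes.iter().any(|s| name.ends_with(*s))
    }
}

/// Returns true if any registered decompressor claims the file.
/// Only the path is inspected; no file is opened.
pub fn can_decompress(decompressors: &[Box<dyn Decompressor>], archive: &Path) -> bool {
    decompressors.iter().any(|d| d.test(archive))
}

/// Hypothetical default stack used for this example.
pub fn default_stack() -> Vec<Box<dyn Decompressor>> {
    vec![
        Box::new(BySuffix { suffixes: &[".tar.gz", ".tgz"] }),
        Box::new(BySuffix { suffixes: &[".zip"] }),
    ]
}
```

A caller could then filter a large file list with `can_decompress` and only hand the matching paths to the actual decompression call.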

[Bug]: println in find_decompressor

What happened?

There's a println in the find_decompressor function of decompress 0.6.0.
It definitely shouldn't be there, at least not for the finished product.

What type of Operating System?

Windows

Steps to produce this issue.

1. Use decompress. Specifically the can_decompress function.

Add `filter` and `map` to file paths

Suggestion / Feature Request

Just like in the npm version of decompress: https://github.com/kevva/decompress#filter

The idea is to let users:

  • filter out items from decompression, in archives that support listing of entries (filter)
  • rename destination paths or file names (map)

There are two primary challenges:

  1. Accepting such a function in a clean, easy way through Opts. It can be an Fn signature, or a trait called FilterMap that implements an identity filter and map by default and is passed as a Box<dyn ..>
  2. Understanding which decompressors can support such operations and which cannot (don't fight with those that cannot; just don't implement it for them)
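
One way the first challenge might be approached is with boxed closures on the options struct, as sketched below. The `Opts` fields and the `plan_outputs` helper are assumptions made for illustration; they are not the crate's existing API.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical options struct carrying the two callbacks.
pub struct Opts {
    /// Return false to skip an entry.
    pub filter: Box<dyn Fn(&Path) -> bool>,
    /// Rewrite the destination-relative path of an entry.
    pub map: Box<dyn Fn(&Path) -> PathBuf>,
}

impl Default for Opts {
    /// Identity behaviour: keep every entry and leave paths unchanged.
    fn default() -> Self {
        Opts {
            filter: Box::new(|_: &Path| true),
            map: Box::new(|p: &Path| p.to_path_buf()),
        }
    }
}

/// Applies filter, then map, to the entry names an archive lists.
/// A real decompressor would do this while iterating its entries.
pub fn plan_outputs(entries: &[&str], opts: &Opts) -> Vec<PathBuf> {
    entries
        .iter()
        .map(|e| Path::new(*e))
        .filter(|p| (opts.filter)(*p))
        .map(|p| (opts.map)(p))
        .collect()
}
```

Because `Default` supplies identity callbacks, callers who need neither feature can pass `Opts::default()`, and decompressors that cannot list entries can simply ignore both fields.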

Add a `bz2`, `xz`, `zstd` standalone decompressors

Suggestion / Feature Request

Add the following decompressors:

  • bz2
  • xz
  • zstd

These are standalone: each compresses only a single file and has no concept of file listing or "archiving" like tar.

Similar to the recently added gz decompressor: 651effa

Important:

  • feature flags should match the decompressor and its dependencies
  • the decompressor should come after the tar.<..> decompressor in the decompressor array stack, so that it only runs when the file is not a tar.<something> file
  • in the test suite, create a demo archive and show that decompression works
  • it's OK to copy the gz decompressor to start with
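
The ordering requirement can be illustrated with a small first-match lookup. The `Entry` struct, the `STACK` table and both functions are hypothetical, and the suffix lists are only examples; the point is that the tar.<..> entries are checked before the standalone ones.

```rust
/// Hypothetical stack entry: a label plus the suffixes it claims.
pub struct Entry {
    pub name: &'static str,
    pub suffixes: &'static [&'static str],
}

/// Order matters: tar.<..> variants come before the standalone formats,
/// otherwise "x.tar.bz2" would be claimed by the plain bz2 entry.
pub const STACK: &[Entry] = &[
    Entry { name: "tar.bz2", suffixes: &[".tar.bz2", ".tbz2"] },
    Entry { name: "tar.xz", suffixes: &[".tar.xz", ".txz"] },
    Entry { name: "tar.zst", suffixes: &[".tar.zst"] },
    Entry { name: "bz2", suffixes: &[".bz2"] },
    Entry { name: "xz", suffixes: &[".xz"] },
    Entry { name: "zstd", suffixes: &[".zst"] },
];

/// Returns the label of the first entry whose suffix matches the file name.
pub fn find_decompressor(file_name: &str) -> Option<&'static str> {
    let lower = file_name.to_lowercase();
    STACK
        .iter()
        .find(|e| e.suffixes.iter().any(|s| lower.ends_with(*s)))
        .map(|e| e.name)
}

/// Output name for a standalone file: the name minus its last extension.
pub fn output_name(file_name: &str) -> &str {
    file_name.rsplit_once('.').map_or(file_name, |(stem, _)| stem)
}
```

Since a standalone format holds a single file, the output name has to be derived from the input name; stripping the last extension is one simple convention.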

add an option to auto detect decompressor from magic bytes

Suggestion / Feature Request

Add an option to detect the decompressor from magic bytes, instead of extensions.

This means every decompressor should now have an additional test_magic, which applies magic number detection (for relevant archive types) and returns whether the file is compatible for decompression.

After this is done we will have two strategies:

  • by extension
  • by magic bytes

The public API should allow choosing a strategy explicitly, or ordering the strategies as "magic, fallback to extension" or "extension, fallback to magic". This may involve a small refactor in how decompressors are tested for validity.

In addition, including magic bytes now opens up the question of "what to do if no decompressors are found?". For example, it's completely possible for no magic bytes to match, which may mean this is an uncompressed file.
If this is an uncompressed file, should we copy it to the target folder? Some users would prefer that over "no decompressor found". This kind of result type ("no decompressor", "copied", etc.) should be modeled and refactored as well.

A clever trick is to add a "PassthroughDecompressor" to the stack, so that it copies files when no other decompressor is found.
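
As a rough sketch, the two strategies and the passthrough outcome could be modeled as below. The `Format`, `Strategy` and `Outcome` enums and all function names are assumptions for illustration. The magic numbers are the well-known signatures of each format, and tar variants are left out to keep the example short.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Gzip,
    Bzip2,
    Xz,
    Zstd,
    Zip,
}

/// Hypothetical detection order exposed to callers.
#[derive(Debug, Clone, Copy)]
pub enum Strategy {
    Extension,
    Magic,
    MagicThenExtension,
    ExtensionThenMagic,
}

/// Hypothetical outcome: what should happen to the file.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Decompress(Format),
    /// Nothing matched; a passthrough decompressor would copy the file as-is.
    Passthrough,
}

/// Checks the first bytes of a file against known signatures.
pub fn by_magic(header: &[u8]) -> Option<Format> {
    if header.starts_with(&[0x1F, 0x8B]) {
        Some(Format::Gzip)
    } else if header.starts_with(b"BZh") {
        Some(Format::Bzip2)
    } else if header.starts_with(&[0xFD, b'7', b'z', b'X', b'Z', 0x00]) {
        Some(Format::Xz)
    } else if header.starts_with(&[0x28, 0xB5, 0x2F, 0xFD]) {
        Some(Format::Zstd)
    } else if header.starts_with(b"PK\x03\x04") {
        // Local file header signature; an empty zip starts differently.
        Some(Format::Zip)
    } else {
        None
    }
}

/// Checks the file name suffix only.
pub fn by_extension(name: &str) -> Option<Format> {
    let lower = name.to_lowercase();
    [
        (".gz", Format::Gzip),
        (".bz2", Format::Bzip2),
        (".xz", Format::Xz),
        (".zst", Format::Zstd),
        (".zip", Format::Zip),
    ]
    .into_iter()
    .find(|(suffix, _)| lower.ends_with(*suffix))
    .map(|(_, format)| format)
}

/// Runs the chosen strategy and falls back to passthrough when nothing matches.
pub fn detect(name: &str, header: &[u8], strategy: Strategy) -> Outcome {
    let found = match strategy {
        Strategy::Extension => by_extension(name),
        Strategy::Magic => by_magic(header),
        Strategy::MagicThenExtension => by_magic(header).or_else(|| by_extension(name)),
        Strategy::ExtensionThenMagic => by_extension(name).or_else(|| by_magic(header)),
    };
    found.map_or(Outcome::Passthrough, Outcome::Decompress)
}
```

Returning an explicit `Outcome` rather than an error lets callers decide whether "nothing matched" means copying the file or reporting a failure.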

Nodejs native bindings with `neon`

Suggestion / Feature Request

Use neon to wrap the main decompress API for Node.js. This does not require any knowledge of decompress other than its public API, and the API lends itself to FFI very well.

  • Build an ergonomic Node API (play with Neon, provide a suggestion first)
  • CI build for multi-platform binaries for the binding (see how other projects do it)
