GithubHelp home page GithubHelp logo

lz4tools's Introduction

Lz4tools and Lz4f

https://travis-ci.org/darkdragn/lz4tools.svg?branch=master

Overview

This package consists of two parts:

  1. lz4f - C-Module containing python bindings for all lz4frame functions.
  2. lz4tools - a zipfile-like file wrapper class and tarfile-like class for lz4 compressed files.
  3. lz4toolsCli - a quick cli for using lz4tools static functions.

Before going any further, I recommend reading up on lz4 at: https://code.google.com/p/lz4/

It is an awesome compression algorithm and I can't thank Yann Collet enough for putting together the C implementation and lz4frame.

Usage

Cli:
New addition, there is now a simple Cli for anyone wishing these capabilities would exist directly on the command line.
$lz4toolsCli
usage: lz4toolsCli [-h] [-f] [-t] [-d] input [output]

positional arguments:
  input       The targeted input.
  output      Optional output target.

optional arguments:
  -h, --help  show this help message and exit
  -f          Compress file. Default action if input is a file.
  -t          Compress directory to .tar.lz4. Default action if input is a
              directory
  -d          Decompress. Default action if the file ends in .lz4.
C-Module / Bindings:

I would recommend against using lz4f directly except in debug/testing situations. If necessary, a compress or decompress operation first needs a context that will be used with all lz4f functions:

lz4f compression the hard way:
>>> import lz4f
>>> inputFile = open('fileIn', 'rb')
>>> cCtx = lz4f.createCompressionContext
>>> header = lz4f.compressBegin(cCtx)
>>> data = lz4f.compressUpdate(inputFile.read(), cCtx)
>>> end = lz4f.compressEnd(cCtx)
>>> with open('output.lz4', 'wb') as out:
...     out.write(header)
...     out.write(data)
...     out.write(end)
...     out.flush()
...     out.close()
>>> lz4f.freeCompContext(cCtx)
>>> inputFile.close()
>>> del header, data, end
lz4f compression the easy way:
>>> import lz4f
>>> with open('output.lz4', 'wb') as out:
...     with open('fileIn', 'rb') as inFile:
...         out.write(lz4f.compressFrame(inFile.read())
...     out.flush()
...     out.close()

Advantages and disadvantages: The easy way takes more ram. It reads the contents of the file into a buffer, passes it and compresses it all in one go. With the hard way you can have it read as little or as much as you like. For instance, you can break up the input into 64kiB chunks. Each chunk could be read, compressed and dropped to disk to conserve ram.

Lz4Tools Module:

The lz4file class is currently read only. Right now it is a bit rough around the edges, however over the next couple of weeks, I will finish adding some docstrings, and such to make it more user friendly. As soon as I get a chance I will make it write capable. The easiest way to use it is with either the open or openTar methods. That's right! There is a lz4Tar class in the module that is a subclass of tarfile.

lz4tools tar example:
>>> import lz4tools
>>> lz4tools.compressTarDefault('src')
>>> testTar = lz4tools.openTar('src.tar.lz4')
>>> testTar.list()
-rwxr-xr-x darkdragn/darkdragn          0 2014-10-02 23:06:09 src/
-rw-r--r-- darkdragn/darkdragn      29905 2014-09-16 18:29:45 src/lz4hc.c
-rw-r--r-- darkdragn/darkdragn       6781 2014-09-16 18:29:45 src/  xxhash.h
-rw-r--r-- darkdragn/darkdragn      25662 2014-09-16 18:29:45 src/  xxhash.c
-rw-rw-r-- darkdragn/darkdragn      13894 2014-10-02 20:22:09 src/lz4frame.h
-rw-rw-r-- darkdragn/darkdragn      46241 2014-10-02 20:22:09 src/lz4.c
-rw-r--r-- darkdragn/darkdragn       8832 2014-09-16 18:29:45 src/lz4hc.h
-rw-rw-r-- darkdragn/darkdragn      11734 2014-10-02 23:06:08 src/python-lz4f.c
-rw-rw-r-- darkdragn/darkdragn       2554 2014-10-02 20:22:09 src/python-lz4f.h
-rw-r--r-- darkdragn/darkdragn      14882 2014-09-18 01:28:06 src/lz4.h
-rw-rw-r-- darkdragn/darkdragn      50141 2014-10-02 23:04:05 src/lz4frame.c
lz4tools file example:
>>> import lz4tools
>>> lz4tools.compressFileDefault('setup.py')
>>> testFile = lz4tools.open('setup.py.lz4')
>>> testFile.blkDict
{0: {'decomp_e': 1445, 'compressed_begin': 7, 'blkSize': 923}}
>>> testFile.seek(1002)
>>> print testFile.read()
    test_suite = "nose.collector",
    keywords = ['lz4', 'lz4frame', 'lz4file', 'lz4tar'],
    classifiers=[
        'Development Status :: 5 - Production/Stable',
        'License :: OSI Approved :: BSD License',
        'Intended Audience :: Developers',
        'Programming Language :: C',
        'Programming Language :: Python',
        'Programming Language :: Python :: 2.6',
        'Programming Language :: Python :: 2.7',
    ],
)

And thus ends the brief tutorial.

Notes

Version :
The first two digits of the version will always correspond with the version of lz4 that is included. Current version is r123, thus 1.2. The next digit corresponds to milestone improvements. Example: Once lz4file supports write. The last digit will be slight improvements. Usually some contextual error, or syntax error. Perhaps even a quick fix for python3.4, since I don't use it often, if an issue is brought to my attention, I will provide a quick fix as quickly as possible.

lz4tools's People

Contributors

darkdragn avatar qiujianmt avatar thomaswaldmann avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.