srcomp

Simple real-time text file compressor.

This is an experimental text file compressor that is simple enough to be implemented in hardware.

It is not the best compressor, but it has an acceptable compression ratio and runs in real time. With a 1 MB block size, its compression ratio on text files is similar to that of LZOP (based on the LZO algorithm).

The compression algorithm

The compression algorithm is a simplified version of the one used by Bzip2.

Bzip2 applies the Burrows-Wheeler Transform, then the Move-To-Front transform, and finally compresses the result with Huffman coding.

Srcomp replaces the Burrows-Wheeler Transform with a simple 16-bit word sort (every word is sorted according to the previous one) followed by a byte sort (the lower byte of every 16-bit word is sorted according to the upper byte).
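The first of these two sorts can be illustrated with a minimal sketch. This is only one possible reading of "every word is sorted according to the previous one", not the actual srcomp code: the function name `sort_words_by_previous`, the big-endian word layout, the zero padding and the zero context for the first word are all assumptions made for the example. The second stage (the byte sort) is not shown.

```python
def sort_words_by_previous(data: bytes) -> list[int]:
    """Stable-sort 16-bit words, using the preceding word as the sort key.

    Hypothetical illustration only; srcomp's real layout may differ.
    """
    if len(data) % 2:
        data += b"\x00"  # assumption: pad odd-length input with a zero byte

    # Split the input into big-endian 16-bit words (assumed byte order).
    words = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

    # The context of each word is the word before it; the first word
    # gets an assumed context of 0.
    contexts = [0] + words[:-1]

    # sorted() is stable, so words sharing a context keep their input order.
    order = sorted(range(len(words)), key=lambda i: contexts[i])
    return [words[i] for i in order]


words = sort_words_by_previous(b"abXYabXZabXY")
print([w.to_bytes(2, "big") for w in words])
# [b'ab', b'ab', b'ab', b'XY', b'XZ', b'XY']
```

In this toy input, the words that follow "ab" end up next to each other, which is the kind of grouping the later stages benefit from.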

This replacement is the main reason for the lower compression ratio, but also for the higher compression speed. It is also simple enough to be implemented in hardware.

After this first stage, the Move-To-Front transform is applied. Although it is slow in software, it is very easy to implement in hardware, where it runs in real time. Finally, Huffman coding is replaced by Elias gamma coding.

So every step of the compression algorithm is real-time and simple enough to be implemented in hardware.

One advantage over other real-time compressors is that compression is slightly faster than decompression.
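As a rough illustration of these last two stages, here is a minimal sketch of the textbook Move-To-Front transform followed by Elias gamma coding. It is not taken from the srcomp sources: the function names are hypothetical, the bit stream is returned as a string of "0" and "1" characters for readability, and shifting every index by one (because Elias gamma cannot encode zero) is an assumption about how the two stages might be joined.

```python
def move_to_front(data: bytes) -> list[int]:
    """Replace each byte by its current position in a recency list."""
    table = list(range(256))  # initial order: 0, 1, ..., 255
    out = []
    for b in data:
        idx = table.index(b)  # position of the byte in the list
        out.append(idx)
        table.pop(idx)        # move the byte to the front
        table.insert(0, b)
    return out


def elias_gamma(n: int) -> str:
    """Elias gamma code of a positive integer, as a bit string."""
    if n < 1:
        raise ValueError("Elias gamma is only defined for n >= 1")
    binary = bin(n)[2:]  # binary digits without the '0b' prefix
    # One leading zero for every digit after the first, then the digits.
    return "0" * (len(binary) - 1) + binary


def encode(data: bytes) -> str:
    """MTF followed by Elias gamma; indices are shifted by 1 (assumption)."""
    return "".join(elias_gamma(i + 1) for i in move_to_front(data))


print(move_to_front(b"aaab"))  # [97, 0, 0, 98]
print(elias_gamma(5))          # 00101
```

Repeated bytes turn into runs of small indices after Move-To-Front, and Elias gamma gives small numbers the shortest codes, which is why the two fit together without needing a Huffman table.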

Build the software

To build the software you will need libtool, autoconf and automake (and, of course, GCC). You will also need libiberty and cmocka (for the unit tests).

The build process is:

aclocal
autoconf
automake --add-missing

./configure
make

After a successful build you can run the unit tests if you want:

tests/tests

Run the software

The software runs from the command line with the following options:

USAGE: srcomp [c|d] -i <input_file> -o <output_file> 
 -h           print this message.
 -c           compress.
 -d           decompress.
 -p           use previous data to compress more.
 -i <file>    specify the input file.
 -o <file>    specify the output file.
 -b <size>    specify the block size (in kilobytes).

Compressing a file

You can compress a file by running:

./srcomp -c -i enwik8 -o enwik8.srz

Alternatively, you can compress the file through stdin and stdout:

cat enwik8 | ./srcomp -c > enwik8.srz

In compression mode, you can choose the block size (up to 64 megabytes). The bigger the block size, the better the compression.

# Example with 1 MB block size (1024 kilobytes)
./srcomp -c -b 1024 -i enwik8 -o enwik8.srz

You can also use the -p option to reuse previous data and compress a little more.

# Example with 1 MB block size and previous data usage
./srcomp -c -p -b 1024 -i enwik8 -o enwik8.srz

Decompressing a file

You can decompress a file by running:

./srcomp -d -i enwik8.srz -o enwik8.txt

Alternatively, you can decompress the file through stdin and stdout:

cat enwik8.srz | ./srcomp -d > enwik8.txt
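To check that a compress/decompress cycle returns the original file, the commands above can be wrapped in a small script. This is only a sketch: the helper names `srcomp_cmd` and `round_trip_ok`, the `./srcomp` path and the `.out` suffix are made up for the example, and it assumes srcomp exits with a non-zero status on failure, which this README does not state. Only options listed in the usage message are used.

```python
import filecmp
import subprocess
from pathlib import Path


def srcomp_cmd(mode, src, dst, block_kb=None, use_previous=False, binary="./srcomp"):
    """Build an srcomp command line from the documented options."""
    if mode not in ("-c", "-d"):
        raise ValueError("mode must be '-c' or '-d'")
    cmd = [binary, mode]
    if use_previous:
        cmd.append("-p")                 # use previous data to compress more
    if block_kb is not None:
        cmd += ["-b", str(block_kb)]     # block size in kilobytes
    cmd += ["-i", str(src), "-o", str(dst)]
    return cmd


def round_trip_ok(src, block_kb=1024, binary="./srcomp"):
    """Compress, decompress, and compare the result with the original."""
    src = Path(src)
    packed = src.with_name(src.name + ".srz")
    restored = src.with_name(src.name + ".out")  # hypothetical output name
    # check=True raises CalledProcessError on a non-zero exit status
    # (assumed here to signal failure).
    subprocess.run(srcomp_cmd("-c", src, packed, block_kb, binary=binary), check=True)
    subprocess.run(srcomp_cmd("-d", packed, restored, binary=binary), check=True)
    return filecmp.cmp(src, restored, shallow=False)  # byte-by-byte comparison
```

For example, `round_trip_ok("enwik8")` would run both commands with a 1 MB block size and return True if the restored file matches the original.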
