
huffman-ml's Introduction

Romeo and Juliet script as a Huffman tree

Huffman (de)compression implemented in OCaml in a single file

You are strongly advised to read the Wikipedia article on Huffman coding first. The project centers on a single file, huffman.ml, a command-line tool for compressing and decompressing files.
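For readers who want a feel for the algorithm before opening huffman.ml, here is a minimal sketch of Huffman tree construction and codeword derivation. It is not taken from huffman.ml: the names `tree`, `frequencies`, `build` and `codewords` are hypothetical, and a sorted list stands in for the priority queue a real implementation would more likely use.

```ocaml
(* Illustrative sketch only; none of these names come from huffman.ml. *)
type tree =
  | Leaf of char * int          (* symbol and its frequency *)
  | Node of tree * tree * int   (* left, right, combined frequency *)

let weight = function Leaf (_, w) | Node (_, _, w) -> w

(* Count how often each byte occurs in the input string. *)
let frequencies (s : string) : (char * int) list =
  let counts = Array.make 256 0 in
  String.iter (fun c -> counts.(Char.code c) <- counts.(Char.code c) + 1) s;
  let acc = ref [] in
  for i = 255 downto 0 do
    if counts.(i) > 0 then acc := (Char.chr i, counts.(i)) :: !acc
  done;
  !acc

(* Insert a tree into a list kept sorted by increasing weight. *)
let rec insert t = function
  | [] -> [ t ]
  | x :: rest as l -> if weight t <= weight x then t :: l else x :: insert t rest

(* Repeatedly merge the two lightest trees until a single tree remains. *)
let build (freqs : (char * int) list) : tree option =
  let rec merge = function
    | [] -> None
    | [ t ] -> Some t
    | a :: b :: rest -> merge (insert (Node (a, b, weight a + weight b)) rest)
  in
  merge (List.fold_left (fun acc (c, w) -> insert (Leaf (c, w)) acc) [] freqs)

(* Walk the tree: going left appends '0', going right appends '1'.
   A tree made of a single leaf gets the one-bit codeword "0". *)
let codewords (t : tree) : (char * string) list =
  let rec go prefix = function
    | Leaf (c, _) -> [ (c, if prefix = "" then "0" else prefix) ]
    | Node (l, r, _) -> go (prefix ^ "0") l @ go (prefix ^ "1") r
  in
  go "" t

let () =
  match build (frequencies "aaaabbc") with
  | None -> print_endline "empty input"
  | Some t ->
    List.iter (fun (c, w) -> Printf.printf "%c -> %s\n" c w) (codewords t)
```

Frequent symbols end up near the root and get short codewords, which is where the size reduction comes from.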

Installation and usage

This program can only be compiled on UNIX-like systems, although you are free to modify it to run on other, possibly non-free, operating systems. The dependencies are:

  • ocaml >= 5.2 (because huffman.ml uses the Dynarray module),
  • dune (I use 3.15.2),
  • unix (ships with the OCaml compiler, as far as I know),
  • optionally dot, one of Graphviz's renderers, to see the Huffman tree as an SVG (see examples/romeo-and-juliet.txt.svg for an example).

If you have never used OPAM or OCaml, start by downloading OPAM, then run

$ opam init
...
$ opam switch create 5.2.0   # or a newer version, see `opam switch list-available`
...
$ opam install dune
...

Once you have a working environment, you can compile huffman.ml and add it to your path with

$ dune build --profile=release
$ dune install
$ huffman
Not enough arguments.

Usage: huffman <option> <file>

Synopsis: Compress and decompress mainly text files through Huffman encoding.

Options:
  - g, graph       Generate a Graphviz dot SVG out of the Huffman tree of the given file
  - w, codewords   Display the corresponding codewords of a file and exit
  - c, compress    Compress the given file
  - d, decompress  Decompress the given file
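The graph option relies on Graphviz's dot to draw the tree. As a rough idea of what producing dot input can look like, the sketch below serializes a binary tree to dot's text format. The simplified `tree` type and the `to_dot` function are assumptions made for illustration; huffman.ml may structure this differently.

```ocaml
(* Illustrative sketch only; not the code used by huffman.ml. *)
type tree =
  | Leaf of char
  | Node of tree * tree

(* Serialize a tree as a Graphviz digraph; every node gets a fresh integer id. *)
let to_dot (t : tree) : string =
  let buf = Buffer.create 256 in
  let next = ref 0 in
  let fresh () = let id = !next in incr next; id in
  let rec emit t =
    let id = fresh () in
    (match t with
     | Leaf c ->
       (* Char.escaped makes control characters printable; String.escaped
          then escapes quotes and backslashes for dot's quoted strings. *)
       Buffer.add_string buf
         (Printf.sprintf "  n%d [shape=box, label=\"%s\"];\n" id
            (String.escaped (Char.escaped c)))
     | Node (l, r) ->
       Buffer.add_string buf (Printf.sprintf "  n%d [label=\"\"];\n" id);
       let li = emit l in
       let ri = emit r in
       (* Edge labels mirror the codeword bits: 0 for left, 1 for right. *)
       Buffer.add_string buf (Printf.sprintf "  n%d -> n%d [label=\"0\"];\n" id li);
       Buffer.add_string buf (Printf.sprintf "  n%d -> n%d [label=\"1\"];\n" id ri));
    id
  in
  Buffer.add_string buf "digraph huffman {\n";
  ignore (emit t);
  Buffer.add_string buf "}\n";
  Buffer.contents buf

let () = print_string (to_dot (Node (Leaf 'a', Node (Leaf 'b', Leaf 'c'))))
```

Saving the printed text to a file such as tree.dot and running `dot -Tsvg tree.dot -o tree.svg` should then produce an SVG.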

Examples

We will compress and decompress Romeo and Juliet. From the root of this project:

$ cd examples/

$ huffman compress romeo-and-juliet.txt
Compressed file saved to examples/romeo-and-juliet.txt.huff
Compressed file is 37.482621% smaller than the original.

$ mv romeo-and-juliet.txt romeo-and-juliet.txt.bak

$ huffman decompress romeo-and-juliet.txt.huff
Decompressed to romeo-and-juliet.txt

$ cmp romeo-and-juliet.txt romeo-and-juliet.txt.bak
$ echo $?
0
$ # zero means that the files have exactly the same content (i.e. lossless compression)
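The two checks in this session, the size reduction and the byte-for-byte comparison, can also be expressed in OCaml. This is a sketch under assumptions: `file_size`, `percent_smaller` and `same_content` are hypothetical helpers, and the formula is only a guess at how the reported percentage is derived.

```ocaml
(* Illustrative helpers; none of them are part of huffman.ml. *)

(* Size of a file in bytes. *)
let file_size (path : string) : int =
  let ic = open_in_bin path in
  Fun.protect ~finally:(fun () -> close_in ic) (fun () -> in_channel_length ic)

(* Percentage by which [compressed] is smaller than [original], both in bytes.
   Assumed formula: 100 * (1 - compressed / original). *)
let percent_smaller ~(original : int) ~(compressed : int) : float =
  100.0 *. (1.0 -. float_of_int compressed /. float_of_int original)

(* Byte-for-byte comparison, the equivalent of `cmp` exiting with status 0.
   Reads both files fully into memory, which is fine for small text files. *)
let same_content (a : string) (b : string) : bool =
  let read path =
    let ic = open_in_bin path in
    Fun.protect ~finally:(fun () -> close_in ic)
      (fun () -> really_input_string ic (in_channel_length ic))
  in
  String.equal (read a) (read b)
```

For instance, `percent_smaller ~original:(file_size "romeo-and-juliet.txt.bak") ~compressed:(file_size "romeo-and-juliet.txt.huff")` would give the reduction for the files above, assuming the tool measures it the same way.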

Contributors

raegnald
