
Experiments with the HOSVD for video compression

License: MIT License



Tensor-Compress

About

This is a small command-line tool for experimenting with tensors and video compression. A video can be interpreted as a 4th-order tensor with dimensions

color channel x width x height x frame
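
To make the tensor view concrete, here is a minimal sketch of how one entry of such a 4th-order tensor could be addressed in a flat buffer. The names Extents, numElements and flatIndex are hypothetical, and the memory layout (color index fastest, frame index slowest) is an assumption chosen for illustration; the tool itself may store its data differently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Extents of a video tensor in the order: color channel, width, height, frame.
using Extents = std::array<std::size_t, 4>;

// Total number of entries in the tensor.
inline std::size_t numElements(const Extents& n) {
    return n[0] * n[1] * n[2] * n[3];
}

// Flat offset of entry (c, x, y, t).
// Assumption: the color index varies fastest and the frame index slowest.
inline std::size_t flatIndex(const Extents& n, std::size_t c, std::size_t x,
                             std::size_t y, std::size_t t) {
    assert(c < n[0] && x < n[1] && y < n[2] && t < n[3]);
    return c + n[0] * (x + n[1] * (y + n[2] * t));
}
```

With extents {3, 640, 360, 500}, the first entry maps to offset 0 and the last entry to numElements(n) - 1, so every pixel value of every frame has exactly one slot.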

For matrices (second-order tensors), the best low-rank approximation with respect to the Frobenius norm is given by the singular value decomposition (SVD). Keeping only the parts that correspond to the k largest singular values yields the best rank-k approximation. If k is sufficiently small, the singular values together with the associated basis vectors require less space to store than the initial matrix: for an m x n matrix, k(m + n + 1) numbers instead of mn.
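
The storage argument can be checked with a few lines of arithmetic. The sketch below only counts stored numbers (it does not compute an SVD), and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Numbers stored for a dense m x n matrix.
inline std::size_t denseStorage(std::size_t m, std::size_t n) {
    return m * n;
}

// Numbers stored for a rank-k truncated SVD:
// k singular values, k left vectors of length m, k right vectors of length n.
inline std::size_t truncatedSvdStorage(std::size_t m, std::size_t n,
                                       std::size_t k) {
    assert(k <= (m < n ? m : n));
    return k * (m + n + 1);
}

// Largest k for which the truncated SVD is strictly smaller than the dense
// matrix, i.e. k * (m + n + 1) < m * n. Returns 0 if no such k exists.
inline std::size_t largestSavingRank(std::size_t m, std::size_t n) {
    if (m == 0 || n == 0) return 0;
    return (m * n - 1) / (m + n + 1);
}
```

For a single 640 x 360 matrix, largestSavingRank gives 230: up to that rank the factors take fewer numbers than the 230400 entries of the matrix, and beyond it the decomposition costs more than it saves.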

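A generalization of the SVD to tensors of arbitrary order is the higher-order singular value decomposition (HOSVD). While it does not have the same optimality guarantee, it can be used in a similar fashion. The decomposition yields one factor matrix per dimension, so as many matrices as the tensor order, and a core tensor of the same order whose entries take the role of the singular values. Truncating the core tensor together with the corresponding columns of the factor matrices gives a quasi-optimal solution to the low-rank approximation problem: for a tensor of order d, the error is at most a factor of sqrt(d) larger than that of the best approximation with the same ranks.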
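
As with the matrix case, the potential saving can be estimated by counting entries. The following sketch assumes that a truncated decomposition is stored as a dense core tensor plus one dense factor matrix per dimension; the names are hypothetical and the count ignores number formats and any further encoding.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Extents or ranks in the order: color channel, width, height, frame.
using Extents = std::array<std::size_t, 4>;

// Entries of the full tensor.
inline std::size_t fullStorage(const Extents& n) {
    return n[0] * n[1] * n[2] * n[3];
}

// Entries of a truncated HOSVD with ranks r:
// a core tensor of size r0 x r1 x r2 x r3 plus one n_i x r_i factor
// matrix for each dimension i.
inline std::size_t truncatedHosvdStorage(const Extents& n, const Extents& r) {
    std::size_t core = 1;
    std::size_t factors = 0;
    for (std::size_t i = 0; i < n.size(); ++i) {
        assert(r[i] >= 1 && r[i] <= n[i]);
        core *= r[i];
        factors += n[i] * r[i];
    }
    return core + factors;
}
```

For a 3x640x360x500 video truncated to rank (3x512x288x400), this count gives 177,578,569 entries against 345,600,000 in the full tensor, or roughly 51%. The core tensor dominates the total, so the ranks have to drop considerably before the representation becomes small.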

Build

Tensor-Compress requires a C++17 compiler (tested with msvc-17.1 and gcc-10.3) and CMake (>= 3.12). Most dependencies are integrated as submodules and are therefore handled by a recursive clone or a submodule update. However, FFmpeg must be available on the system; in particular, the components libavformat-dev, libavcodec-dev and libswscale-dev should be installed. On Windows, a development library that includes all of these components can be obtained through vcpkg. Once everything is set up, run the following (the -DCMAKE_TOOLCHAIN_FILE argument is only needed when FFmpeg comes from vcpkg)

$ git clone --recursive https://github.com/Thanduriel/tensorCompress
$ cd tensorCompress
$ mkdir build
$ cd build
$ cmake .. -DCMAKE_TOOLCHAIN_FILE=<path/to/vcpkg.cmake>
$ cmake --build .

Usage

To get an overview of the available options, run

tensorComp --help

Any video file that ffmpeg can handle is valid as an input file. For the output file, the extension should be .avi to store a lossless version of the video reconstructed from the truncated tensor; other video formats are not supported. If the output file has the extension .ten, the truncated tensor itself is stored instead. This option is currently of limited use, as the resulting file is quite large and the only operation that can be done with it is to load it again and encode it as an .avi file.

Supported pixel formats for --pix_fmt are YUV444 and RGB. The choice of color space affects the result, especially if the color dimension is truncated.

Truncation modes for --trunc are rank, tolerance and tolerance_sum. Together with the truncation threshold values, the mode determines how many singular values are kept in each dimension. Four values are expected, one per dimension, in the order color, width, height, frames. For tolerance and tolerance_sum, the values are interpreted as floating-point thresholds for the singular values to keep; for rank, they should be integers that give the size of the resulting tensor.
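
The following sketch shows one plausible reading of the three modes for a single dimension. It is not taken from the tool's source: the names TruncationMode and numKept are hypothetical, and the exact rules are assumptions (tolerance keeps every singular value strictly above the threshold; tolerance_sum drops the smallest values for as long as their sum does not exceed the threshold).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical names; the tool's own implementation may differ.
enum class TruncationMode { Rank, Tolerance, ToleranceSum };

// Number of leading singular values to keep in one dimension.
// sigma must be non-negative and sorted in descending order.
inline std::size_t numKept(const std::vector<double>& sigma,
                           TruncationMode mode, double threshold) {
    assert(std::is_sorted(sigma.begin(), sigma.end(), std::greater<double>()));
    const std::size_t n = sigma.size();
    switch (mode) {
    case TruncationMode::Rank: {
        // The threshold is the requested rank, clamped to [0, n].
        if (threshold <= 0.0) return 0;
        if (threshold >= static_cast<double>(n)) return n;
        return static_cast<std::size_t>(threshold);
    }
    case TruncationMode::Tolerance: {
        // Assumption: keep values strictly greater than the threshold.
        std::size_t k = 0;
        while (k < n && sigma[k] > threshold) ++k;
        return k;
    }
    case TruncationMode::ToleranceSum: {
        // Assumption: drop the smallest values while the sum of the
        // dropped values stays at or below the threshold.
        std::size_t k = n;
        double dropped = 0.0;
        while (k > 0 && dropped + sigma[k - 1] <= threshold) {
            dropped += sigma[k - 1];
            --k;
        }
        return k;
    }
    }
    return n; // not reached for valid modes
}
```

Applied once per dimension with the four threshold values, a rule like this yields the four ranks of the truncated tensor.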

Example

The example video shown here is Video32_Country_panorama from the ITEC Short Casual Video Dataset (license CC-4) with resolution 640x360 and 500 frames.

Country_panorama_original.mp4

For easier viewing, the following approximated videos are also encoded as H.264, but with a higher bit rate.

Unfortunately, even a minor reduction to rank (3x512x288x400) already results in prominent artifacts, which shows that this representation is not suited for video compression.

Country_panorama_3_512_288_400.mp4

When the video is reduced to rank (1x32x18x25), the axis-aligned segments that form the basis become visible in the spatial dimensions. Furthermore, the adaptive frame rate that results from truncating the time dimension can be seen.

short_1_32_18_25.mp4
