
This project forked from thekoshkina/learned_image_compression


Joint autoregressive and hierarchical priors for learned image compression

PyTorch implementation of the paper

Minnen, David, Johannes Ballé, and George D. Toderici. "Joint autoregressive and hierarchical priors for learned image compression." Advances in Neural Information Processing Systems. 2018.

Model structure

Required packages:

torch~=1.3.1
torchvision~=0.4.2
torchsummary~=1.5.1
Pillow~=5.1.0
scipy~=1.4.0

Implemented model summary

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1        [-1, 192, 381, 381]          14,592
               GDN-2        [-1, 192, 381, 381]               0
            Conv2d-3        [-1, 192, 189, 189]         921,792
               GDN-4        [-1, 192, 189, 189]               0
            Conv2d-5          [-1, 192, 93, 93]         921,792
               GDN-6          [-1, 192, 93, 93]               0
            Conv2d-7          [-1, 192, 45, 45]         921,792
           Encoder-8          [-1, 192, 45, 45]               0
            Conv2d-9          [-1, 192, 43, 43]         331,968
           Conv2d-10          [-1, 192, 21, 21]         921,792
           Conv2d-11            [-1, 192, 9, 9]         921,792
     HyperEncoder-12            [-1, 192, 9, 9]               0
     MaskedConv2d-13          [-1, 384, 45, 45]       1,843,584
ContextPrediction-14          [-1, 384, 45, 45]               0
  ConvTranspose2d-15          [-1, 192, 21, 21]         921,792
  ConvTranspose2d-16          [-1, 288, 43, 43]       1,382,688
  ConvTranspose2d-17          [-1, 384, 45, 45]         995,712
     HyperDecoder-18          [-1, 384, 45, 45]               0
           Conv2d-19          [-1, 640, 45, 45]         492,160
           Conv2d-20          [-1, 512, 45, 45]         328,192
           Conv2d-21          [-1, 384, 45, 45]         196,992
EntropyParameters-22          [-1, 384, 45, 45]               0
  ConvTranspose2d-23          [-1, 192, 93, 93]         921,792
              GDN-24          [-1, 192, 93, 93]               0
  ConvTranspose2d-25        [-1, 192, 189, 189]         921,792
              GDN-26        [-1, 192, 189, 189]               0
  ConvTranspose2d-27        [-1, 192, 381, 381]         921,792
              GDN-28        [-1, 192, 381, 381]               0
  ConvTranspose2d-29          [-1, 3, 765, 765]          14,403
          Decoder-30          [-1, 3, 765, 765]               0
================================================================
Total params: 13,896,419
Trainable params: 13,896,419
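
The shape and parameter counts in the table follow from standard convolution arithmetic. A small sketch below reproduces the encoder's 765 → 381 → 189 → 93 → 45 progression and two of the parameter counts; the 5×5 kernels, stride 2, and zero padding are inferred from those numbers, not read from the code:

```python
def conv_out(size, kernel, stride, padding=0):
    """Spatial output size of a convolution: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_params(in_ch, out_ch, kernel):
    """Parameter count of a 2-D conv with bias: in*out*k*k weights + out biases."""
    return in_ch * out_ch * kernel * kernel + out_ch

# Encoder shape progression for a 765x765 input, matching the table above
size = 765
for _ in range(4):
    size = conv_out(size, kernel=5, stride=2)  # 381, 189, 93, 45
print(size)  # 45

# Parameter counts from the table
print(conv_params(3, 192, 5))    # 14592  (Conv2d-1)
print(conv_params(192, 192, 5))  # 921792 (Conv2d-3)
```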

Preliminary results

Training the model with a simplified hyper-latent rate term shows that the model is able to learn:

Figures: pairs of original/reconstructed images at epochs 0, 1, and 2.
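
For context, models of this family are trained on a rate-distortion Lagrangian, L = λ·D + R. A minimal sketch of that objective follows; the function name, the use of plain MSE as the distortion, and the fixed λ value are illustrative assumptions, not taken from this repository:

```python
def rd_loss(mse, bits_latent, bits_hyperlatent, lam=0.01):
    """Rate-distortion Lagrangian: lam * distortion + total estimated rate.

    `mse` is the reconstruction error; the two `bits_*` terms are the
    estimated code lengths of the latents and hyper-latents. A "simplified
    hyper-latent rate" amounts to approximating the second rate term.
    """
    return lam * mse + bits_latent + bits_hyperlatent

print(rd_loss(100.0, 2.0, 1.0))  # 0.01 * 100 + 2 + 1 = 4.0
```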

Future work

  • add padding so the model can handle images of any resolution
  • optimise the update of dynamic Lagrange multipliers versus using constants
  • add a Compress function that encodes an image into a bitstream
  • add a Decompress function that reconstructs an image from a bitstream
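
The padding item above can be sketched as padding each dimension up to a multiple of the network's total downsampling factor before encoding, then cropping after decoding. The factor 64 = 2⁶ (four stride-2 encoder layers plus two stride-2 hyper-encoder layers) is an assumption about the exact architecture:

```python
def pad_to_multiple(height, width, multiple=64):
    """Padding amounts needed so both dims become multiples of `multiple`.

    The padding would be applied (e.g. reflectively) before encoding and
    cropped away from the reconstruction after decoding.
    """
    pad_h = (multiple - height % multiple) % multiple
    pad_w = (multiple - width % multiple) % multiple
    return pad_h, pad_w

print(pad_to_multiple(765, 765))  # (3, 3): 768 is the next multiple of 64
```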

References:

Minnen, David, Johannes Ballé, and George D. Toderici. "Joint autoregressive and hierarchical priors for learned image compression." Advances in Neural Information Processing Systems. 2018.

J. Ballé, D. Minnen, S. Singh, S. J. Hwang, and N. Johnston. "Variational image compression with a scale hyperprior." 6th International Conference on Learning Representations. 2018.

Implementation of Masked convolution was taken from: https://www.codeproject.com/Articles/5061271/PixelCNN-in-Autoregressive-Models

Implementation of the GDN non-linearity was from https://github.com/jorge-pessoa/pytorch-gdn/blob/master/pytorch_gdn/__init__.py
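
The masked convolution referenced above enforces causality in the context model: the kernel may only see pixels that have already been decoded. A minimal, framework-free sketch of the PixelCNN-style mask (type "A" also masks the centre pixel, as used in the first layer; type "B" keeps it):

```python
def causal_mask(kernel=5, mask_type="A"):
    """Binary mask for a PixelCNN-style masked convolution.

    Entries strictly below the centre row, and to the right of the centre
    on that row, are zeroed; type "A" additionally zeros the centre itself.
    In a real layer this mask would be multiplied into the conv weights.
    """
    c = kernel // 2
    mask = [[1] * kernel for _ in range(kernel)]
    for i in range(kernel):
        for j in range(kernel):
            if i > c or (i == c and (j > c or (j == c and mask_type == "A"))):
                mask[i][j] = 0
    return mask

for row in causal_mask(3):
    print(row)
# [1, 1, 1]
# [1, 0, 0]
# [0, 0, 0]
```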

Contributors

  • thekoshkina
