drao2 / vae-for-image-generation

This project forked from chaitanya100100/vae-for-image-generation

Variational AutoEncoder - Keras implementation on mnist and cifar10 datasets

License: MIT License

Python 100.00%

VAE for Image Generation

Dependencies

  • keras
  • tensorflow / theano (the current implementation is written for tensorflow; it can be used with theano after a few changes to the code)
  • numpy, matplotlib, scipy

Implementation Details

The code is heavily inspired by the keras vae examples: vae, vae_deconv
(the source files contain some code duplication)

MNIST

  • images are flattened so that they can be treated as 1D vectors
  • the encoder and the decoder are both ordinary (fully connected) neural networks
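
The flattening step can be sketched with NumPy as follows. This is a minimal illustration, not the repository's code: the random array stands in for a batch of MNIST digits (28 x 28 grayscale), `flatten_images` is a hypothetical helper, and scaling the pixels to [0, 1] is an assumption about the preprocessing that this README does not state.

```python
import numpy as np

def flatten_images(images):
    """Scale uint8 images to [0, 1] and flatten each one into a 1D vector.

    Hypothetical helper; the scaling step is an assumption.
    """
    images = np.asarray(images, dtype=np.float32) / 255.0
    return images.reshape(len(images), -1)

# Stand-in for a batch of MNIST digits: 4 grayscale images of 28 x 28 pixels.
batch = np.random.default_rng(0).integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

flat = flatten_images(batch)
print(flat.shape)  # (4, 784)
```

Each row of `flat` is then a 784-dimensional vector that a fully connected encoder can take as input.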
network architecture: [figure: mnist_vae_architecture]
src/mnist_train.py
  • it trains the vae model according to the hyperparameters defined in src/mnist_params.py
  • the entire vae model, the encoder and the decoder are each stored as keras models in the models directory as ld_<latent_dim>_id_<intermediate_dim>_e_<epochs>_<vae/encoder/decoder>.h5, where <latent_dim> is the number of latent dimensions, <intermediate_dim> is the number of neurons in the hidden layer and <epochs> is the number of training epochs
  • after training, the saved models can be used to analyse the latent distribution and to generate new images
  • it also stores the training history in ld_<latent_dim>_id_<intermediate_dim>_e_<epochs>_history.pkl
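
The naming scheme above can be sketched as a small helper. This is only an illustration: `artifact_paths` and its argument names are hypothetical, the values 512 and 50 are placeholders, and the names actually used in src/mnist_params.py may differ. Placing the history file in the models directory and pickling it as a plain dict (as keras exposes it through `History.history`) are both assumptions.

```python
import os
import pickle
import tempfile

def artifact_paths(latent_dim, intermediate_dim, epochs, models_dir="models"):
    """Build file paths that follow the naming pattern described above.

    Hypothetical helper, not part of the repository.
    """
    prefix = "ld_%d_id_%d_e_%d" % (latent_dim, intermediate_dim, epochs)
    paths = {
        part: os.path.join(models_dir, "%s_%s.h5" % (prefix, part))
        for part in ("vae", "encoder", "decoder")
    }
    # Assumption: the README does not say which directory holds the history file.
    paths["history"] = os.path.join(models_dir, prefix + "_history.pkl")
    return paths

paths = artifact_paths(latent_dim=2, intermediate_dim=512, epochs=50)
print(os.path.basename(paths["decoder"]))  # ld_2_id_512_e_50_decoder.h5

# Round-trip a history dict through pickle; the loss values are placeholders.
history = {"loss": [1.0, 0.5]}
with tempfile.TemporaryDirectory() as tmp:
    history_file = os.path.join(tmp, os.path.basename(paths["history"]))
    with open(history_file, "wb") as f:
        pickle.dump(history, f)
    with open(history_file, "rb") as f:
        restored = pickle.load(f)
```

With keras installed, a saved `.h5` file of this kind can typically be reloaded with `keras.models.load_model(paths["decoder"])`.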
src/mnist_2d_latent_space_and_generate.py
  • it works only for a 2-dimensional latent space
  • it loads the trained model according to the hyperparameters defined in mnist_params.py
  • it displays the latent space distribution and then generates images from latent variables entered by the user (see the code, which is almost self-explanatory)
  • it can also generate images from latent vectors randomly sampled from the 2D latent space (comment out the user input lines) and display them in a grid
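
Generating a grid of images from 2D latent vectors can be sketched as below. This is not the script's code: `latent_grid`, `tile_images` and `toy_decoder` are hypothetical names, the sampling range of -3 to 3 is an assumption, and `toy_decoder` is a stand-in with fixed random weights so that the example runs without a trained model. With a real model, the decoder loaded from the models directory would take its place.

```python
import numpy as np

def latent_grid(n, low=-3.0, high=3.0):
    """Return n*n latent vectors laid out on a uniform 2D grid."""
    axis = np.linspace(low, high, n)
    zx, zy = np.meshgrid(axis, axis)
    return np.stack([zx.ravel(), zy.ravel()], axis=1)

def tile_images(images, rows, cols):
    """Arrange rows*cols images of shape (h, w) into one (rows*h, cols*w) canvas."""
    n, h, w = images.shape
    assert n == rows * cols
    return images.reshape(rows, cols, h, w).transpose(0, 2, 1, 3).reshape(rows * h, cols * w)

def toy_decoder(z, side=28, seed=0):
    """Stand-in for a trained decoder: maps (n, 2) latent vectors to (n, side, side) images."""
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((z.shape[1], side * side))
    pixels = 1.0 / (1.0 + np.exp(-(z @ weights)))  # sigmoid keeps values between 0 and 1
    return pixels.reshape(len(z), side, side)

n = 5
uniform_z = latent_grid(n)                                        # uniform sampling
random_z = np.random.default_rng(1).standard_normal((n * n, 2))   # random sampling

canvas = tile_images(toy_decoder(uniform_z), n, n)
print(canvas.shape)  # (140, 140)
```

The resulting canvas can be shown with matplotlib, for example `plt.imshow(canvas, cmap="gray")`. Passing `random_z` instead of `uniform_z` gives the randomly sampled variant.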
src/mnist_3d_latent_space_and_generate.py
  • it is the same as mnist_2d_latent_space_and_generate.py, but for a 3D latent space
src/mnist_general_latent_space_and_generate.py
  • it loads the trained model according to the hyperparameters defined in mnist_params.py
  • if the latent space is either 2D or 3D, it displays it
  • it displays a grid of images generated from randomly sampled latent vectors

results

2D latent space
[figures: latent space; uniform sampling]

3D latent space
[figure: 3D latent space]

3D latent space results
[figures: uniform sampling; random sampling]
  • more results are in the images directory

CIFAR10

  • images are kept as 2D inputs instead of being flattened
  • the encoder is a convolutional neural network and the decoder is a deconvolutional network
  • the network architectures of the encoder and the decoder are as follows
encoder: [figure: cifar10_vae_encoder]
decoder: [figure: cifar10_vae_decoder]
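
How the spatial size changes through a convolutional encoder and a deconvolutional decoder can be sketched with a little arithmetic. The stride pattern below is hypothetical, since the architecture figures are not reproduced here, and the layers in this repository may differ. The two formulas are the ones keras applies for `padding="same"` in `Conv2D` and `Conv2DTranspose`.

```python
import math

def conv_same(size, stride):
    """Spatial size after a strided convolution with 'same' padding."""
    return math.ceil(size / stride)

def deconv_same(size, stride):
    """Spatial size after a transposed convolution with 'same' padding."""
    return size * stride

# Hypothetical stride pattern; the layers in this repository may differ.
strides = [1, 2, 1, 2]

size = 32  # CIFAR10 images are 32 x 32 pixels with 3 colour channels
encoder_sizes = [size]
for s in strides:
    size = conv_same(size, s)
    encoder_sizes.append(size)

decoder_sizes = [size]
for s in reversed(strides):
    size = deconv_same(size, s)
    decoder_sizes.append(size)

print(encoder_sizes)  # [32, 32, 16, 16, 8]
print(decoder_sizes)  # [8, 16, 16, 32, 32]
```

Mirroring the encoder strides in the decoder is one simple way to make the reconstructed image come back at the input resolution.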
src/cifar10_train.py, src/cifar10_generate.py

The implementation structure is the same as for the mnist files

results - 16 latent dimensions

[figures: cifar10 results after 25, 50, 75 and 600 epochs]

CALTECH101

  • caltech101_<sz>_train.py and caltech101_<sz>_generate.py (where <sz> is the size of the input image; training was done for two sizes, 92x92 and 128x128) follow the same structure as the cifar10 files
  • as the images are larger, more computation power is needed to train the model
  • results obtained with little training are qualitatively poor
  • src/caltech101_preprocess.py is provided to preprocess the dataset in the dataset directory
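
The growth in input size can be made concrete with a short calculation. This is a rough sketch: `values_per_image` is a hypothetical helper, and treating the Caltech101 images as 3-channel colour images is an assumption, since this README does not state the channel count.

```python
def values_per_image(height, width, channels):
    """Number of input values the network sees for one image."""
    return height * width * channels

inputs = {
    "mnist 28x28x1": values_per_image(28, 28, 1),
    "cifar10 32x32x3": values_per_image(32, 32, 3),
    # Assumption: Caltech101 images are fed in as 3-channel colour images.
    "caltech101 92x92x3": values_per_image(92, 92, 3),
    "caltech101 128x128x3": values_per_image(128, 128, 3),
}

baseline = inputs["cifar10 32x32x3"]
for name, count in inputs.items():
    # Print the raw count and the ratio relative to a CIFAR10 image.
    print(name, count, round(count / baseline, 2))
```

Under these assumptions a 128x128 image carries 16 times as many input values as a CIFAR10 image, which gives a feel for the extra computation per training step.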

vae-for-image-generation's People

Contributors

chaitanya100100, gopal86
