This project forked from prakashpandey9/bicyclegan

Tensorflow implementation of the NIPS paper "Toward Multimodal Image-to-Image Translation"

License: MIT License

Multimodal Image-to-Image Translation

This is a TensorFlow implementation of the NIPS paper "Toward Multimodal Image-to-Image Translation". The aim is to generate a distribution of plausible output images for a single input image, rather than one deterministic result. It extends the image-to-image translation model based on Conditional Generative Adversarial Networks (cGANs).

The idea is to learn a low-dimensional latent representation of the target images with an encoder network, that is, a distribution over a latent vector z that captures the variation among possible target images, and to match this distribution to a prior P(z). The model encourages the mapping between the latent vector and the output image to be bijective, so that each latent vector yields a distinct output and can be recovered from it. The overall architecture consists of two cycles, B->z->B' and z->B'->z', hence the name BicycleGAN.
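
As a rough illustration of the two cycles, the following NumPy sketch traces the data flow with toy linear stand-ins for the encoder and generator. The functions `encode`, `generate`, `sample_z`, and `kl_to_standard_normal`, the weight matrices, and the dimensions are all hypothetical and are not taken from this repository; the real networks are convolutional and are trained adversarially, which this sketch leaves out.

```python
import numpy as np

rng = np.random.RandomState(0)

IMG_DIM = 16  # size of a flattened toy "image" (assumption)
Z_DIM = 4     # size of the latent vector (assumption)

# Hypothetical random weights standing in for trained networks.
W_mu = rng.randn(IMG_DIM, Z_DIM) * 0.1
W_logvar = rng.randn(IMG_DIM, Z_DIM) * 0.1
W_a = rng.randn(IMG_DIM, IMG_DIM) * 0.1
W_z = rng.randn(Z_DIM, IMG_DIM) * 0.1


def encode(b):
    """Toy encoder E: image B -> (mean, log-variance) of q(z | B)."""
    return b.dot(W_mu), b.dot(W_logvar)


def generate(a, z):
    """Toy generator G: (input image A, latent z) -> output image B'."""
    return np.tanh(a.dot(W_a) + z.dot(W_z))


def sample_z(mu, logvar):
    """Reparameterization: z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.randn(*mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar, axis=-1)


a = rng.rand(1, IMG_DIM)  # input image A (batch size 1)
b = rng.rand(1, IMG_DIM)  # ground-truth output image B

# Cycle 1: B -> z -> B'
mu, logvar = encode(b)
z_enc = sample_z(mu, logvar)
b_rec = generate(a, z_enc)
image_l1 = np.mean(np.abs(b - b_rec))   # image reconstruction error
kl = kl_to_standard_normal(mu, logvar)  # pulls q(z | B) toward the prior

# Cycle 2: z -> B' -> z'
z_prior = rng.randn(1, Z_DIM)            # z sampled from the prior
b_fake = generate(a, z_prior)
z_rec, _ = encode(b_fake)                # encoder mean used as z'
latent_l1 = np.mean(np.abs(z_prior - z_rec))  # latent reconstruction error
```

The first cycle measures how well B is reconstructed from its own encoding, and the second measures how well a sampled z is recovered from the image it produced. Together these are what tie the latent vector and the output image to each other.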

Model Architecture

Image source: "Toward Multimodal Image-to-Image Translation" paper

Description

  • There are three networks: a) Discriminator, b) Encoder, and c) Generator.
  • A cGAN-VAE (Conditional Generative Adversarial Network - Variational Autoencoder; called cVAE-GAN in the paper) encodes the ground-truth output image B into a latent vector z, which is then used to reconstruct the output image B', i.e., B -> z -> B'.
  • For the inverse mapping (z -> B' -> z'), an LR-GAN (Latent Regressor Generative Adversarial Network; called cLR-GAN in the paper) is used: the Generator produces B' from the input image A and a sampled z, and the Encoder then recovers z' from B'.
  • Combining the two models gives BicycleGAN.
  • The Generator follows the U-Net architecture: an encoder and a decoder joined by symmetric skip connections.
  • The Encoder uses several residual blocks to encode its input image efficiently.
  • The model is trained with the Adam optimizer and batch normalization, using a batch size of 1.
  • Leaky ReLU (LReLU) activations are used in all three networks.
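
The building blocks named in the list above can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's code: the helper names `leaky_relu`, `residual_block`, and `unet_skip` are hypothetical, the slope `alpha=0.2` is an assumed value, and plain matrix products stand in for the convolutions a real network would use.

```python
import numpy as np


def leaky_relu(x, alpha=0.2):
    """Leaky ReLU: identity for x >= 0, slope `alpha` (assumed) otherwise."""
    return np.where(x >= 0, x, alpha * x)


def residual_block(x, w1, w2):
    """Toy residual block: output = x + F(x), where F is two linear layers."""
    h = leaky_relu(x.dot(w1))
    return x + h.dot(w2)


def unet_skip(decoder_feat, encoder_feat):
    """U-Net style skip connection: concatenate along the channel axis."""
    return np.concatenate([decoder_feat, encoder_feat], axis=-1)


rng = np.random.RandomState(0)

# Residual block on a toy feature vector (batch size 1, 8 features).
x = rng.randn(1, 8)
w1 = rng.randn(8, 8) * 0.1
w2 = rng.randn(8, 8) * 0.1
y = residual_block(x, w1, w2)

# Skip connection between same-resolution encoder and decoder feature maps,
# laid out as (batch, height, width, channels).
enc = rng.randn(1, 32, 32, 64)
dec = rng.randn(1, 32, 32, 64)
merged = unet_skip(dec, enc)  # channel count doubles from 64 to 128
```

The residual block adds its input back to the transformed features, and the skip connection lets the decoder reuse encoder features from the same spatial resolution.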

Requirements

  • Python 2.7
  • NumPy
  • TensorFlow
  • SciPy

Training / Testing

After cloning this repository, train the network by running the following commands:

$ mkdir test_results
$ python main.py
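
Because `mkdir` fails when the directory already exists, repeated runs may need the first command skipped. A small helper along the following lines creates the directory only when it is missing. This is a sketch: `ensure_dir` is a hypothetical name and not part of this repository, and it assumes, as the `mkdir` step suggests, that results are written to `test_results`. It avoids the `exist_ok` argument so that it also runs on Python 2.7.

```python
import os


def ensure_dir(path):
    """Create `path` if it is missing; do nothing if it already exists."""
    if not os.path.isdir(path):
        os.makedirs(path)
    return path


if __name__ == "__main__":
    # Directory name taken from the commands above.
    ensure_dir("test_results")
```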

References

  • Toward Multimodal Image-to-Image Translation (paper)
  • pix2pix (paper)

License

MIT
