atharva-phatak / image2image

This is a PyTorch implementation of CycleGAN and Pix2Pix for unpaired and paired image-to-image translation.

License: MIT License

Season-Transfer

Image to Image Translation

Image-to-image translation is the task of learning a mapping between input images and output images, i.e. converting an image from one domain to another. A great deal of research has been done on this subject, most of it using a supervised approach that relies on corresponding pairs of images from the two domains whose mapping we want to learn. The problem with this approach is that for many tasks no such paired image data exists. The image below shows some image-to-image translation tasks.

Here's what image-to-image translation tasks look like

CycleGAN to the Rescue!

Researchers at Berkeley Artificial Intelligence Research (BAIR) published the paper "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks" in 2017, presenting an approach that does not require paired image data for image-to-image translation tasks. A set of images from each domain is of course still required, but the images do not need to correspond directly to one another.

How do they work?

Two Generators

CycleGAN is a generative adversarial network with two generators and two discriminators. Let's call one generator G, which converts images from domain X to domain Y (G : X -> Y), and the other generator F, which converts images from domain Y to domain X (F : Y -> X). Each generator has a corresponding discriminator that tries to tell synthesized images apart from real ones.

The Loss Function

There are two loss functions, namely the adversarial loss and the cycle consistency loss. The adversarial loss will come as no surprise to anyone who has worked with GANs. Both generators try to 'fool' their discriminators, i.e. they try to make them classify fake images as real ones. The image below shows the adversarial loss.

adversarial loss
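
To make the 'fooling' idea concrete, here is a minimal PyTorch sketch of the adversarial loss. It assumes the standard binary cross-entropy GAN formulation on raw discriminator logits; the function names and the toy logits are hypothetical, and the notebooks in this repository may use a different variant (the CycleGAN paper itself trains with a least-squares loss).

```python
import torch
import torch.nn.functional as nnf


def discriminator_loss(d_real_logits, d_fake_logits):
    """Discriminator side: label real images 1 and generated images 0."""
    real_loss = nnf.binary_cross_entropy_with_logits(
        d_real_logits, torch.ones_like(d_real_logits))
    fake_loss = nnf.binary_cross_entropy_with_logits(
        d_fake_logits, torch.zeros_like(d_fake_logits))
    return real_loss + fake_loss


def generator_adversarial_loss(d_fake_logits):
    """Generator side: reward fakes that the discriminator labels as real."""
    return nnf.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))


# Hypothetical discriminator outputs (raw logits) for a batch of 4 images.
says_real = torch.full((4, 1), 10.0)   # discriminator is sure: "real"
says_fake = torch.full((4, 1), -10.0)  # discriminator is sure: "fake"

# The generator's loss is small when its fakes are judged real ...
fooled = generator_adversarial_loss(says_real)
# ... and large when the discriminator catches them.
caught = generator_adversarial_loss(says_fake)

# A discriminator that gets both cases right has a loss close to zero.
d_loss = discriminator_loss(says_real, says_fake)
```

In a CycleGAN this pair of losses is applied twice: once for G with the discriminator on domain Y, and once for F with the discriminator on domain X.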

The cycle consistency loss (The big gun!)

What does cycle consistent mean? For example, if you translate a sentence from English to French and then translate it back, you should arrive at the original sentence. This is what cycle consistency means. More formally, we have a generator G:X-->Y and another generator F:Y-->X, and G and F should be inverse mappings of each other. Under this assumption, both generators G and F are trained simultaneously with a loss function that encourages F(G(x)) ≈ x and G(F(y)) ≈ y. The image below shows what cycle consistency looks like.

cycle

The cycle consistency loss enforces the property x --> G(x) --> F(G(x)) ≈ x, and likewise y --> F(y) --> G(F(y)) ≈ y.

cycle loss
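
A minimal sketch of this loss in PyTorch, assuming an L1 (mean absolute error) distance between each image and its round-trip reconstruction. The toy "generators" below are plain functions I made up so the behaviour is easy to check by hand; they are not the networks from this repository.

```python
import torch
import torch.nn as nn


def cycle_consistency_loss(G, F, real_x, real_y):
    """L1 distance between each image and its round-trip reconstruction."""
    l1 = nn.L1Loss()
    forward_cycle = l1(F(G(real_x)), real_x)   # x -> G(x) -> F(G(x))
    backward_cycle = l1(G(F(real_y)), real_y)  # y -> F(y) -> G(F(y))
    return forward_cycle + backward_cycle


# Toy stand-ins for the generators (hypothetical, not real networks).
def double(t):
    return t * 2.0   # plays the role of G


def halve(t):
    return t / 2.0   # exact inverse of `double`


def add_one(t):
    return t + 1.0   # not an inverse of `double`


x = torch.rand(2, 3, 8, 8)  # fake batch from domain X
y = torch.rand(2, 3, 8, 8)  # fake batch from domain Y

# Inverse mappings reconstruct the inputs, so the loss vanishes.
loss_inverse = cycle_consistency_loss(double, halve, x, y)
# Mappings that are not inverses of each other are penalised.
loss_mismatch = cycle_consistency_loss(double, add_one, x, y)
```

The loss is zero only when the two mappings undo each other, which is exactly the property the training pushes G and F toward.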

Combining the adversarial and cycle consistency losses, we get our total loss function.

total loss

The objective is to minimize the combined loss with respect to the generators and maximize it with respect to the discriminators.

obj
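
Since the equation images do not always render, here are the losses written out as in the CycleGAN paper. The symbols D_X and D_Y (the discriminators for domains X and Y) and the weight λ on the cycle term are notation taken from the paper, not names used elsewhere in this README.

```latex
\begin{aligned}
\mathcal{L}_{GAN}(G, D_Y, X, Y) &= \mathbb{E}_{y \sim p_{data}(y)}\big[\log D_Y(y)\big]
  + \mathbb{E}_{x \sim p_{data}(x)}\big[\log\big(1 - D_Y(G(x))\big)\big] \\
\mathcal{L}_{cyc}(G, F) &= \mathbb{E}_{x \sim p_{data}(x)}\big[\lVert F(G(x)) - x \rVert_1\big]
  + \mathbb{E}_{y \sim p_{data}(y)}\big[\lVert G(F(y)) - y \rVert_1\big] \\
\mathcal{L}(G, F, D_X, D_Y) &= \mathcal{L}_{GAN}(G, D_Y, X, Y)
  + \mathcal{L}_{GAN}(F, D_X, Y, X)
  + \lambda \, \mathcal{L}_{cyc}(G, F) \\
G^{*}, F^{*} &= \arg\min_{G, F} \; \max_{D_X, D_Y} \; \mathcal{L}(G, F, D_X, D_Y)
\end{aligned}
```

The adversarial term for F is the same expression as for G with the roles of X and Y swapped.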

The Network Architecture

Generator Architecture

The CycleGAN generator is composed of three sections: an encoder, residual blocks, and a decoder. The input image is fed directly into the encoder, which shrinks the spatial size while increasing the number of channels. The encoder consists of three convolutional layers. The resulting output is then passed to six residual blocks. The output from the residual blocks is expanded by the decoder, which comprises three transpose convolutions.

Architecture
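
The description above can be sketched in PyTorch roughly as follows. Only the overall layout (three convolutions, six residual blocks, three transpose convolutions) comes from the text; the kernel sizes, channel widths, instance normalization, and the final tanh are assumptions borrowed from common CycleGAN implementations and may differ from the notebooks in this repository.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with a skip connection; the shape is preserved."""

    def __init__(self, channels):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels),
        )

    def forward(self, x):
        return x + self.block(x)


class Generator(nn.Module):
    """Encoder -> residual blocks -> decoder (layer sizes are assumed)."""

    def __init__(self, in_channels=3, base=64, n_residual=6):
        super().__init__()
        # Encoder: three convolutions, spatial size / 4, channels up to 4*base.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, base, kernel_size=7, stride=1, padding=3),
            nn.InstanceNorm2d(base), nn.ReLU(),
            nn.Conv2d(base, base * 2, kernel_size=3, stride=2, padding=1),
            nn.InstanceNorm2d(base * 2), nn.ReLU(),
            nn.Conv2d(base * 2, base * 4, kernel_size=3, stride=2, padding=1),
            nn.InstanceNorm2d(base * 4), nn.ReLU(),
        )
        # Transformer: residual blocks at the bottleneck resolution.
        self.residuals = nn.Sequential(
            *[ResidualBlock(base * 4) for _ in range(n_residual)])
        # Decoder: three transpose convolutions back to the input size.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(base * 4, base * 2, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
            nn.InstanceNorm2d(base * 2), nn.ReLU(),
            nn.ConvTranspose2d(base * 2, base, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
            nn.InstanceNorm2d(base), nn.ReLU(),
            nn.ConvTranspose2d(base, in_channels, kernel_size=7, stride=1,
                               padding=3),
            nn.Tanh(),  # outputs in [-1, 1]
        )

    def forward(self, x):
        return self.decoder(self.residuals(self.encoder(x)))


generator = Generator()
image = torch.randn(1, 3, 64, 64)       # one random 64x64 RGB "image"
with torch.no_grad():
    bottleneck = generator.encoder(image)
    translated = generator(image)
```

Here `bottleneck` has 256 channels at a quarter of the input resolution, and `translated` has the same shape as `image`, so the output of one generator can be fed straight into the other for the cycle.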

Discriminator Architecture

The discriminators are PatchGANs. PatchGANs are fully convolutional neural networks that look at a patch of an image and classify it as real or fake, rather than classifying the whole image at once. This approach is more computationally efficient, and it lets the discriminator focus on surface-level features such as texture, which matter more in image translation tasks.

Disc Arch
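
A rough PyTorch sketch of a PatchGAN-style discriminator, to show what "classifying patches" means in practice: the network outputs a grid of scores, one per overlapping image patch, instead of a single number. The number of layers, channel widths, and normalization below are assumptions for illustration and are not taken from this repository.

```python
import torch
import torch.nn as nn


class PatchDiscriminator(nn.Module):
    """Fully convolutional net that scores overlapping patches (sizes assumed)."""

    def __init__(self, in_channels=3, base=64):
        super().__init__()
        self.model = nn.Sequential(
            nn.Conv2d(in_channels, base, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(base, base * 2, kernel_size=4, stride=2, padding=1),
            nn.InstanceNorm2d(base * 2),
            nn.LeakyReLU(0.2),
            nn.Conv2d(base * 2, base * 4, kernel_size=4, stride=2, padding=1),
            nn.InstanceNorm2d(base * 4),
            nn.LeakyReLU(0.2),
            # One output channel: a real/fake logit for each patch.
            nn.Conv2d(base * 4, 1, kernel_size=4, stride=1, padding=1),
        )

    def forward(self, x):
        return self.model(x)


discriminator = PatchDiscriminator()
with torch.no_grad():
    scores_small = discriminator(torch.randn(1, 3, 128, 128))
    scores_large = discriminator(torch.randn(1, 3, 256, 256))

print(scores_small.shape)  # torch.Size([1, 1, 15, 15])
print(scores_large.shape)  # torch.Size([1, 1, 31, 31])
```

Because there are no fully connected layers, the same weights work on any input size; a larger image simply yields a larger grid of patch scores, and the loss is averaged over that grid.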


My Results

I trained the network for 4000 epochs, with an average time per epoch of ≈ 12 seconds on a GTX 1050 Ti.

results

Dependencies

  • Python 3.6
  • PyTorch
  • Torchvision
  • NumPy
  • scikit-image
