

clipping-CLIP-to-GAN

modelfig

Introduction

Recently, OpenAI proposed CLIP: a multimodal, transformer-based model that can perform an incredibly wide range of zero-shot tasks. You can read all about it in OpenAI's blog and its paper. DALL-E, a generative model, was announced on the same date, but it is currently not released and will probably end up like GPT-3.

More recently, Ryan Murdock proposed that good feature visualization should generate an image that matches the text: mainly, he used SIREN as the set of parameters to optimize over and used autograd to learn the parameters that generate an image matching the given text.

In general, this could be done with any kind of deterministic generative model, such as a GAN or a VAE. (I think VQVAE would be really good here too, but the gradient ascent part is still something to implement.)

This repository contains test.py, which takes a generative model and a learnable latent vector and finds an image matching the input text.
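The idea of optimizing a latent vector against a similarity score can be sketched with toy stand-ins. This is not the code in test.py: `generate` is a hypothetical fixed linear map standing in for the GAN, the `target` vector stands in for a CLIP text embedding, and a finite-difference gradient stands in for autograd. Only the shape of the loop (score the generated output, step the latent uphill) is meant to carry over.

```python
import math

def generate(weights, z):
    """Toy 'generator' (assumption): a fixed linear map from latent z to features."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in weights]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)

def score(weights, z, target):
    """Similarity between the generated features and the target embedding."""
    return cosine(generate(weights, z), target)

def numeric_grad(f, z, eps=1e-5):
    """Central finite-difference gradient (stand-in for autograd)."""
    grad = []
    for i in range(len(z)):
        z_plus = list(z)
        z_minus = list(z)
        z_plus[i] += eps
        z_minus[i] -= eps
        grad.append((f(z_plus) - f(z_minus)) / (2 * eps))
    return grad

def optimize_latent(weights, target, z0, lr=0.1, steps=300):
    """Gradient ascent on the latent vector; the generator stays frozen."""
    z = list(z0)
    for _ in range(steps):
        g = numeric_grad(lambda v: score(weights, v, target), z)
        z = [zi + lr * gi for zi, gi in zip(z, g)]
    return z

# Hypothetical toy setup: 2-d latent, 3-d feature space.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [1.0, 2.0, 3.0]   # stands in for the text embedding
z0 = [1.0, -1.0]           # initial latent, far from the target

before = score(weights, z0, target)
z_best = optimize_latent(weights, target, z0)
after = score(weights, z_best, target)
print(f"similarity before: {before:.3f}, after: {after:.3f}")
```

In a real setup the generator and the image/text encoders would be neural networks and the gradient would come from backpropagation, but the optimized variable is still only the latent vector.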

The models I used here are CLIP (obviously, https://github.com/openai/CLIP), and FastGAN.

Sampled Examples with GAN on FFHQ dataset

modelfig

Input Text : This is an image of young boy. He is 2 years old. He is just a baby.

modelfig

Input Text : This is an image of old woman. She is from France.

modelfig

Input Text : This is an image of old man, he is wearing glasses.

modelfig

Input Text : This is an image of young woman, she is african american.

Ethnicity, gender, age, and other features weren't disentangled. They were just "found" by the text prompt, which is really interesting. I also found that if the initial latent vector (so the initial person, I guess...?) is very different from what we are aiming for (such as example 3 in this case), then it is very difficult to learn with small learning rates.

How To Use

You can get this running immediately: git clone the repository, download the pre-trained model from here (test.py uses FFHQ), and put it in the /models folder. In test.py, everything is fixed with a random seed, so it yields the same results for all the example text inputs. I tried to make everything very easy and comprehensible! If you have any trouble, let me know!
