hsinger04 / vogue-reimplementation

A reimplementation of the VOGUE paper for the IANNwTF WS20/21 course at the University of Osnabrück

Jupyter Notebook 97.06% Python 2.72% Cuda 0.22% Batchfile 0.01% Shell 0.01%


vogue-reimplementation's Issues

Learn try-on

  1. Have a 2-layer MLP map from w to sigma. Only a single MLP is needed, since the w latent stays the same across all styles / resolutions.
  2. Have p as a trainable vector, with q = sigmoid(p) and Q = DiagonalMatrix(q) for each style.
  3. Calculate the losses and optimize (see the sketch below).
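A minimal sketch of these trainable pieces, assuming a 512-dimensional w latent and 18 style layers (both assumed, not taken from the repo); the loss terms themselves are omitted:

```python
import tensorflow as tf

W_DIM = 512      # dimensionality of the w latent (assumption)
N_STYLES = 18    # number of style / resolution layers (assumption)

# 1. A single 2-layer MLP mapping w to sigma, shared across all styles.
sigma_mlp = tf.keras.Sequential([
    tf.keras.layers.Dense(W_DIM, activation="relu", input_shape=(W_DIM,)),
    tf.keras.layers.Dense(W_DIM),
])

# 2. One trainable vector p per style; q = sigmoid(p), Q = diag(q).
p = tf.Variable(tf.zeros([N_STYLES, W_DIM]), trainable=True, name="p")

def interpolation_matrices(p):
    q = tf.sigmoid(p)          # shape [N_STYLES, W_DIM]
    return tf.linalg.diag(q)   # Q: shape [N_STYLES, W_DIM, W_DIM]

# 3. The losses would be computed on generator outputs and optimized
#    w.r.t. sigma_mlp.trainable_variables + [p].
```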

Reducing pixel size?

Either reduce the image and segmentation dimensions, or change StyleGAN to input / output 1024×1024 images. A resizing sketch follows below.
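A sketch of the first option, resizing images and segmentations to whatever resolution StyleGAN is configured for (TARGET_RES is an assumed parameter):

```python
import tensorflow as tf

TARGET_RES = 256  # assumed target resolution

def resize_pair(image, segmentation):
    # Bilinear for images; nearest-neighbour for label maps so that
    # class indices are not interpolated into non-existent classes.
    image = tf.image.resize(image, [TARGET_RES, TARGET_RES], method="bilinear")
    segmentation = tf.image.resize(segmentation, [TARGET_RES, TARGET_RES],
                                   method="nearest")
    return image, segmentation
```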

Project images to latent vector z

  1. Create a CNN that maps from an image to the latent vector z.
  2. Train it by minimizing the perceptual loss between the input image and the corresponding StyleGAN output (unclear: how exactly to define the perceptual loss). A training-loop sketch follows below.
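A sketch of the projection setup under these assumptions: `stylegan_generator` and `perceptual_loss` are assumed to exist elsewhere in the project, and the encoder architecture and shapes are placeholders:

```python
import tensorflow as tf

def build_encoder(latent_dim=512, img_res=256):
    # Small placeholder CNN mapping an image to a latent vector z.
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(64, 3, strides=2, activation="relu",
                               input_shape=(img_res, img_res, 3)),
        tf.keras.layers.Conv2D(128, 3, strides=2, activation="relu"),
        tf.keras.layers.Conv2D(256, 3, strides=2, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(latent_dim),
    ])

def train_step(encoder, stylegan_generator, optimizer, images):
    with tf.GradientTape() as tape:
        z = encoder(images)
        reconstructions = stylegan_generator(z)          # assumed generator call
        loss = perceptual_loss(images, reconstructions)  # see the "Perceptual loss" issue
    grads = tape.gradient(loss, encoder.trainable_variables)
    optimizer.apply_gradients(zip(grads, encoder.trainable_variables))
    return loss
```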

Creating try_on dataset

  • Things to optimize – prefer a generator-based implementation: it is simple to write and memory-efficient, and try_on may not need long to train anyway (see the sketch after this list)
    • Speed
    • Memory
  • What the dataset should look like
    • Returns: latent_p, latent_g, seg_p, seg_g
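A sketch of a generator-backed tf.data pipeline yielding the four tensors listed above; the loading of `latent_pairs` / `seg_pairs` and all shapes are assumptions:

```python
import tensorflow as tf

LATENT_DIM = 512  # assumed
SEG_RES = 256     # assumed

def try_on_generator(latent_pairs, seg_pairs):
    # latent_pairs / seg_pairs: sequences of (person, garment) items,
    # consumed lazily to keep memory usage low.
    for (latent_p, latent_g), (seg_p, seg_g) in zip(latent_pairs, seg_pairs):
        yield latent_p, latent_g, seg_p, seg_g

def make_dataset(latent_pairs, seg_pairs, batch_size=4):
    output_signature = (
        tf.TensorSpec([LATENT_DIM], tf.float32),
        tf.TensorSpec([LATENT_DIM], tf.float32),
        tf.TensorSpec([SEG_RES, SEG_RES], tf.int32),
        tf.TensorSpec([SEG_RES, SEG_RES], tf.int32),
    )
    ds = tf.data.Dataset.from_generator(
        lambda: try_on_generator(latent_pairs, seg_pairs),
        output_signature=output_signature)
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```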

Editing localization loss

  1. Calculate A (the style matrix; see Fig. 2 in Editing in Style)
  2. Get the segmentation (tf.image.resize)
  3. Normalize A
  4. (Downsample the segmentation)
  5. Follow Eq. 3 from VOGUE (just multiply A^2 elementwise with U and then use reduce_mean); a sketch follows below
  6. Unclear: right side of page 4
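A heavily hedged sketch of the steps above. The shapes and the exact definition of U are assumptions: A is taken to be a per-channel spatial contribution map of shape [H, W, C] (the style matrix of Editing in Style) and U a binary mask derived from the segmentation:

```python
import tensorflow as tf

def localization_loss(A, segmentation, feat_res):
    # 2./4. Resize the segmentation down to the feature-map resolution
    #       (nearest neighbour so the labels stay discrete).
    seg = tf.image.resize(segmentation[..., tf.newaxis],
                          [feat_res, feat_res], method="nearest")
    U = tf.cast(seg > 0, tf.float32)   # assumed mask term, shape [H, W, 1]

    # 3. Normalize A spatially (assumed normalization).
    A = A / (tf.reduce_sum(A, axis=[0, 1], keepdims=True) + 1e-8)

    # 5. Eq. 3: elementwise product of A^2 with U, then reduce_mean.
    return tf.reduce_mean(tf.square(A) * U)
```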

General TODOs

  • Refactor code
  • Fix the encoder (find out why it fails, e.g. by inspecting the loss via TensorBoard, and find an alternative architecture)
  • GitHub README and .yaml
  • Analysis of results

Project dataset

Ideally, we would use the original authors' dataset. However, I also found https://github.com/royorel/FFHQ-Aging-Dataset, which is good for the following two reasons:

1.) It builds on the FFHQ dataset, which StyleGAN was also trained on; this might allow us to be more flexible with regard to transfer learning
2.) It contains segmentation labels

Modulated_conv2d

Initializing self.p with random_normal or random_uniform leads to an error, even though I am 99% sure I am doing it correctly. A possible initialization pattern is sketched below.
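One way such a trainable vector is commonly created inside a Keras layer so that a random-normal initializer works; the class, shape, and forward pass here are assumptions about the surrounding code, not the repo's actual implementation:

```python
import tensorflow as tf

class ModulatedConv2D(tf.keras.layers.Layer):
    def build(self, input_shape):
        # Create p inside build() via add_weight so Keras tracks it.
        self.p = self.add_weight(
            name="p",
            shape=(input_shape[-1],),
            initializer=tf.keras.initializers.RandomNormal(stddev=0.02),
            trainable=True,
        )
        super().build(input_shape)

    def call(self, inputs):
        # Placeholder forward pass: scale channels by sigmoid(p).
        return inputs * tf.sigmoid(self.p)
```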

How to do segmentation?

Encoder model

  • Currently seems to map only to positive numbers.
  • Batch Normalization and tanh are missing (see the sketch after this list).
  • Make sure to save the current model files somewhere else before starting to retrain.
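A sketch of an output block with the missing pieces: BatchNormalization followed by a tanh-activated Dense layer, so the encoder can produce values in (-1, 1) rather than only positive numbers. Layer sizes and the function name are assumptions:

```python
import tensorflow as tf

def encoder_head(features, latent_dim=512):
    # Normalize the incoming features, project to the latent size,
    # and squash to (-1, 1) with tanh.
    x = tf.keras.layers.BatchNormalization()(features)
    x = tf.keras.layers.Dense(latent_dim)(x)
    return tf.keras.layers.Activation("tanh")(x)
```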

Perceptual loss

  1. As a first attempt, try the L2 norm between the unit-scaled activations of the last layer of the two networks, or something similar (sketch below).
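A sketch of this simple variant. Using VGG16 as the fixed feature network is an assumption (any pretrained network would do), and input preprocessing is omitted:

```python
import tensorflow as tf

# Fixed, pretrained feature extractor (assumed choice).
_feature_net = tf.keras.applications.VGG16(include_top=False, weights="imagenet")

def perceptual_loss(img_a, img_b):
    feat_a = _feature_net(img_a)
    feat_b = _feature_net(img_b)
    # Unit-scale the last-layer activations before comparing them.
    feat_a = tf.math.l2_normalize(feat_a, axis=-1)
    feat_b = tf.math.l2_normalize(feat_b, axis=-1)
    return tf.reduce_mean(tf.square(feat_a - feat_b))
```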
