
Unofficial PyTorch implementation of the paper titled "Progressive Growing of GANs for Improved Quality, Stability, and Variation"

License: MIT License

Python 100.00%
gan pytorch convolutional-neural-network adversarial-machine-learning progressive-growing-of-gans

pro_gan_pytorch's Introduction

pro_gan_pytorch

Unofficial PyTorch implementation of the paper titled "Progressive Growing of GANs for Improved Quality, Stability, and Variation".
For the official TensorFlow code, please refer to this repo


How to use:

Using the package

Requirements (i.e., what we tested with):

  1. Ubuntu 20.04.3 or above
  2. Python 3.8.3
  3. Nvidia GPU GeForce 1080 Ti or above (min. 8 GB GPU memory)
  4. Nvidia drivers >= 470.86
  5. Nvidia CUDA 11.3 (can be skipped, since PyTorch ships with its own CUDA, cuDNN, etc.)

Installing the package

  1. The easiest way is to create a new virtual env so that your global Python environment doesn't get corrupted
  2. Create and switch to your new virtual environment
    (your-machine):~$ python3 -m venv <env-store-path>/pro_gan_pth_env 
    (your-machine):~$ source <env-store-path>/pro_gan_pth_env/bin/activate
  3. Install the pro-gan-pth package from PyPI, if you meet all the above dependencies
    (pro_gan_pth_env)(your-machine):~$ pip install pro-gan-pth 
  4. Once installed, you can either use the installed command-line tools progan_train, progan_lsid and progan_fid. Note that progan_train can be used with multiple GPUs (if you have many 😄). Just ensure that the GPUs are visible via the CUDA_VISIBLE_DEVICES=0,1,2 environment variable. The other two tools only use a single GPU.
    (your-machine):~$ progan_train --help
usage: Train Progressively grown GAN
       [-h]
       [--retrain RETRAIN]
       [--generator_path GENERATOR_PATH]
       [--discriminator_path DISCRIMINATOR_PATH]
       [--rec_dir REC_DIR]
       [--flip_horizontal FLIP_HORIZONTAL]
       [--depth DEPTH]
       [--num_channels NUM_CHANNELS]
       [--latent_size LATENT_SIZE]
       [--use_eql USE_EQL]
       [--use_ema USE_EMA]
       [--ema_beta EMA_BETA]
       [--epochs EPOCHS [EPOCHS ...]]
       [--batch_sizes BATCH_SIZES [BATCH_SIZES ...]]
       [--batch_repeats BATCH_REPEATS]
       [--fade_in_percentages FADE_IN_PERCENTAGES [FADE_IN_PERCENTAGES ...]]
       [--loss_fn LOSS_FN]
       [--g_lrate G_LRATE]
       [--d_lrate D_LRATE]
       [--num_feedback_samples NUM_FEEDBACK_SAMPLES]
       [--start_depth START_DEPTH]
       [--num_workers NUM_WORKERS]
       [--feedback_factor FEEDBACK_FACTOR]
       [--checkpoint_factor CHECKPOINT_FACTOR]
       train_path
       output_dir

    positional arguments:
      train_path            Path to the images folder for training the ProGAN
      output_dir            Path to the directory for saving the logs and models

    optional arguments:
      -h, --help            show this help message and exit
      --retrain RETRAIN     whether you want to resume training from saved models (default: False)
      --generator_path GENERATOR_PATH
                            Path to the generator model for retraining the ProGAN (default: None)
      --discriminator_path DISCRIMINATOR_PATH
                            Path to the discriminator model for retraining the ProGAN (default: None)
      --rec_dir REC_DIR     whether images are stored under one folder or have a recursive dir structure (default: True)
      --flip_horizontal FLIP_HORIZONTAL
                            whether to apply mirror augmentation (default: True)
      --depth DEPTH         depth of the generator and the discriminator (default: 10)
      --num_channels NUM_CHANNELS
                            number of channels in the image data (default: 3)
      --latent_size LATENT_SIZE
                            latent size of the generator and the discriminator (default: 512)
      --use_eql USE_EQL     whether to use the equalized learning rate (default: True)
      --use_ema USE_EMA     whether to use the exponential moving averages (default: True)
      --ema_beta EMA_BETA   value of the ema beta (default: 0.999)
      --epochs EPOCHS [EPOCHS ...]
                            number of epochs over the training dataset per stage (default: [42, 42, 42, 42, 42, 42, 42, 42, 42])
      --batch_sizes BATCH_SIZES [BATCH_SIZES ...]
                            batch size used for training the model per stage (default: [32, 32, 32, 32, 16, 16, 8, 4, 2])
      --batch_repeats BATCH_REPEATS
                            number of G and D steps executed per training iteration (default: 4)
      --fade_in_percentages FADE_IN_PERCENTAGES [FADE_IN_PERCENTAGES ...]
                            number of iterations for which fading of new layer happens. Measured in percentage (default: [50, 50, 50, 50, 50, 50, 50, 50, 50])
      --loss_fn LOSS_FN     loss function used for training the GAN. Current options: [wgan_gp, standard_gan] (default: wgan_gp)
      --g_lrate G_LRATE     learning rate used by the generator (default: 0.003)
      --d_lrate D_LRATE     learning rate used by the discriminator (default: 0.003)
      --num_feedback_samples NUM_FEEDBACK_SAMPLES
                            number of samples used for fixed seed gan feedback (default: 4)
      --start_depth START_DEPTH
                            resolution to start the training from. Example: 2 --> (4x4) | 3 --> (8x8) ... | 10 --> (1024x1024). Note that this is not a way to restart a partial training. Resuming is not
                            supported currently, but will be soon. (default: 2)
      --num_workers NUM_WORKERS
                            number of dataloader subprocesses. It's a pytorch thing, you can ignore it ;). Leave it to the default value unless things are weirdly slow for you. (default: 4)
      --feedback_factor FEEDBACK_FACTOR
                            number of feedback logs written per epoch (default: 10)
      --checkpoint_factor CHECKPOINT_FACTOR
                            number of epochs after which a model snapshot is saved per training stage (default: 10)

------------------------------------------------------------------------------------------------------------------------------------------------------------------

    (your-machine):~$ progan_lsid --help
    usage: ProGAN latent-space walk demo video creation tool [-h] [--output_path OUTPUT_PATH] [--generation_depth GENERATION_DEPTH] [--time TIME] [--fps FPS] [--smoothing SMOOTHING] model_path

    positional arguments:
      model_path            path to the trained_model.bin file

    optional arguments:
      -h, --help            show this help message and exit
      --output_path OUTPUT_PATH
                            path to the output video file location. Please only use mp4 format with this tool (.mp4 extension). I have banged my head too much to get anything else to work :(. (default:
                            ./latent_space_walk.mp4)
      --generation_depth GENERATION_DEPTH
                            depth at which the images should be generated. Starts from 2 --> (4x4) | 3 --> (8x8) etc. (default: None)
      --time TIME           number of seconds in the video (default: 30)
      --fps FPS             fps of the generated video (default: 60)
      --smoothing SMOOTHING
                            smoothness of walking in the latent-space. High values corresponds to more smoothing. (default: 0.75)

------------------------------------------------------------------------------------------------------------------------------------------------------------------

    (your-machine):~$ progan_fid --help
    usage: ProGAN fid_score computation tool [-h] [--generated_images_path GENERATED_IMAGES_PATH] [--batch_size BATCH_SIZE] [--num_generated_images NUM_GENERATED_IMAGES] model_path dataset_path

    positional arguments:
      model_path            path to the trained_model.bin file
      dataset_path          path to the directory containing the images from the dataset. Note that this needs to be a flat directory

    optional arguments:
      -h, --help            show this help message and exit
      --generated_images_path GENERATED_IMAGES_PATH
                            path to the directory where the generated images are to be written. Uses a temporary directory by default. Provide this path if you'd like to see the generated images yourself
                            :). (default: None)
      --batch_size BATCH_SIZE
                            batch size used for generating random images (default: 4)
      --num_generated_images NUM_GENERATED_IMAGES
                            number of generated images used for computing the FID (default: 50000)
  5. Or, you could import this as a Python package in your code for more advanced use cases:
    import pro_gan_pytorch as pg 

You can use all the modules in the package, such as pg.networks.Generator, pg.networks.Discriminator, pg.gan.ProGAN, etc. Mostly, you'll only need the pg.gan.ProGAN module for training. For inference, you will probably only need pg.networks.Generator. Please refer to the scripts behind the tools from step 4, under pro_gan_pytorch_scripts/, for examples of how to use the package. Besides, please feel free to just read the code. It's really easy to follow (or at least I hope so 😅 😬).
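For instance, a minimal inference sketch could look something like the following. Note that the constructor arguments and the forward() call below are assumptions on my part (mirroring the CLI flags above), and the checkpoint path is hypothetical; check the scripts under pro_gan_pytorch_scripts/ for the exact API.

import torch
import pro_gan_pytorch as pg

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# build a generator with the same settings used for training
# (depth / num_channels / latent_size mirror the CLI defaults above;
#  the exact constructor signature is an assumption, not verified here)
generator = pg.networks.Generator(depth=10, num_channels=3, latent_size=512).to(device)

# load previously trained weights (hypothetical path)
generator.load_state_dict(torch.load("saved_models/generator.pth", map_location=device))
generator.eval()

with torch.no_grad():
    latent = torch.randn(4, 512, device=device)
    images = generator(latent)  # assumption: forward() takes just the latent batch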

Developing the package

For more advanced use cases in your project, or if you'd like to contribute new features to this project, the following steps will help you get it set up for development. There is no standard set of rules for contributing here (no CONTRIBUTING.md), but let's try to maintain the overall ethos of the codebase 😄.

  1. Clone this repository
    (your-machine):~$ cd <path to project>
    (your-machine):<path to project>$ git clone https://github.com/akanimax/pro_gan_pytorch.git
  2. Apologies in advance, since step 1 will take a while: I ended up pushing gifs and other large binary assets to git back then. I didn't know better :sad:. I'll see if this can be sorted out somehow. Once the clone is done, set up a development virtual env:
    (your-machine):<path to project>$ python3 -m venv pro-gan-pth-dev-env
    (your-machine):<path to project>$ source pro-gan-pth-dev-env/bin/activate
  3. Install the package in development mode:
    (pro-gan-pth-dev-env)(your-machine):<path to project>$ pip install -e .
  4. Also install the dev requirements:
    (pro-gan-pth-dev-env)(your-machine):<path to project>$ pip install -r requirements-dev.txt
  5. Now open the project in the editor of your choice, and you are good to go. I use pytest for testing and black for code formatting. Check out this_link for how to set up black with various IDEs.

  6. There is no fancy CI, automated testing, or docs build, since this is a fairly tiny project. But we are open to adding these tools if more features keep getting added to this project.

Trained Models

We will be training models using this package on different datasets over time. Also, please feel free to open PRs against the following table if you end up training models on your own datasets. If you are contributing, please set up a file-hosting solution for serving the trained models.

| Courtesy | Dataset  | Size  | Resolution  | GPUs used   | #Epochs per stage | Training time | FID score | Link       | Qualitative samples |
|----------|----------|-------|-------------|-------------|-------------------|---------------|-----------|------------|---------------------|
| @owang   | Metfaces | ~1.3K | 1024 x 1024 | 1 V100-32GB | 42                | 24 hrs        | 101.624   | model_link | image               |

Note that we compute the FID using the clean-fid version from Parmar et al.
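If you want to compute the same metric outside of progan_fid, a rough sketch using the clean-fid package might look like the following (the directory paths are hypothetical, and the cleanfid call is an assumption based on that package's documented usage):

from cleanfid import fid

# compare a folder of generated images against a flat folder of dataset images
score = fid.compute_fid("generated_images/", "dataset_images/")
print(f"clean-FID: {score:.3f}")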

General cool stuff 😄

Training timelapse (fixed latent points):

The training timelapse created from the images logged during the training looks really cool.


Check out this YT video for a 4K version 😄.

If interested, please feel free to check out this Medium blog I wrote explaining the progressive growing technique.

References

1. Tero Karras, Timo Aila, Samuli Laine, & Jaakko Lehtinen (2018). 
Progressive Growing of GANs for Improved Quality, Stability, and Variation. 
In International Conference on Learning Representations.

2. Parmar, Gaurav, Richard Zhang, and Jun-Yan Zhu. 
"On Buggy Resizing Libraries and Surprising Subtleties in FID Calculation." 
arXiv preprint arXiv:2104.11222 (2021).

Feature requests

  • Conditional GAN support
  • Tool for generating time-lapse video from the log images
  • Integrating FID-metric computation as part of the training logging

Thanks

As always,
please feel free to open PRs/issues/suggestions here. Hope this work is useful in your project 😄.

cheers 🍻!
@akanimax 😎

pro_gan_pytorch's People

Contributors

akanimax, dependabot[bot], joebradly, joshalbrecht, mauricio-repetto, saoyan, srihari-humbarwadi, tomasheiskanen, zszazi


pro_gan_pytorch's Issues

Can I generate a class-specific image using your code?

Hello @akanimax
Thanks a lot for your implementation of Progressive Growing GANs!

I have a question: I understand that you are training the images based on classes. So, when testing, how do I specify the class label to the generator so that I can obtain an image of that class?

Thanks in advance!

Edit: As a follow-up I am also curious about the author's paper where they use LSUN dataset. The paper does not get into details of how they did class-specific training but they have shown examples of images generated with class labels (eg. bedroom). Any insights on this would be much appreciated! Thanks.

Why defining bias term yourself?

Hi there, in CustomLayers.py, the bias term is defined by hand:

self.bias = th.nn.Parameter(th.FloatTensor(c_out).fill_(0))

Why not just use bias=True/False in nn.Conv2d and nn.Linear?

implementation of MinibatchStdDev

The official implementation is here.

  1. It seems that in your implementation there are multiple options, which don't exist in the official code. May I ask why?

  2. Your default option is "all", which points to this line. However, the size of vals here is 1x1xHxW, that is you keep the spatial dimension. But according to both the paper and the official code, vals should be one single value (the average over all channels and pixels).

  3. In your option "group", the code is actually not correct.

vals = vals.view(self.n, self.shape[1] / self.n, self.shape[2], self.shape[3])
vals = th.mean(vals, 0, keepdim=True).view(1, self.n, 1, 1)

The size of tensors would be incompatible.
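For reference, a minimal sketch of the single-statistic variant described in the paper and the official code (not this repository's own layer) could look like this:

import torch
import torch.nn as nn

class MinibatchStdDev(nn.Module):
    """Append one scalar statistic (batch std averaged over all channels and
    pixels) as an extra feature map, as described in the ProGAN paper."""

    def __init__(self, eps: float = 1e-8):
        super().__init__()
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        # std over the batch dimension, per channel and pixel: [C, H, W]
        std = torch.sqrt(x.var(dim=0, unbiased=False) + self.eps)
        # collapse to a single scalar and broadcast it to every sample
        mean_std = std.mean().view(1, 1, 1, 1).expand(b, 1, h, w)
        return torch.cat([x, mean_std], dim=1)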

Output size

Hi @akanimax
Is there anywhere that the output size of generated samples is specified? Can I increase it at all?
Thanks!

About network parameter setting

Hi,
I see that there are some parameter settings which are somewhat different from the original paper, e.g., the latent_size is set to 128 throughout the whole G and D networks. However, in the original paper, it increases/decreases from 16 to 512 or vice versa. Is there any specific reason that this implementation has such a different setting? Or is it just a simplification with limited influence on performance?

Thanks!

WGAN_GP Discriminator loss

For WGAN_GP, the Discriminator's loss is calculated as follows:
loss = (th.mean(fake_out) - th.mean(real_out) + (self.drift * th.mean(real_out ** 2)))

Could you please explain what the last bit (self.drift * th.mean(real_out ** 2)) does and where it comes from? I could not find any information in either the paper or the presentation. Thank you in advance!
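For context, that term is the small "drift" penalty from Appendix A.1 of the ProGAN paper: eps_drift * E[D(x)^2], which keeps the critic's output on real samples from drifting arbitrarily far from zero. A rough sketch of how the pieces fit together (the gradient penalty is assumed to be computed elsewhere; the helper below is illustrative, not the repository's exact code):

import torch

def wgan_gp_dis_loss(real_out: torch.Tensor,
                     fake_out: torch.Tensor,
                     gradient_penalty: torch.Tensor,
                     drift: float = 0.001) -> torch.Tensor:
    # Wasserstein critic objective: push real scores up, fake scores down
    wgan_term = fake_out.mean() - real_out.mean()
    # drift penalty: discourages |D(real)| from growing without bound
    drift_term = drift * (real_out ** 2).mean()
    return wgan_term + gradient_penalty + drift_term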

Training the model

Is there a way I can train the model?
I used the code as it is and also the example mentioned in the readme.
You have mentioned a pg.ConditionalProGAN() in that CIFAR example. I could not find the mentioned function in the code.
Am I missing something?

Some questions about generating face samples( using pre-trained model)

First of all, thank you for this pytorch implementation and this whole package, it is truly amazing!
However, I don't know if I am using the pretrained model in the correct way.
I tried to generate some image samples using "generate samples.py", and the generator model is loaded from saved_models/best_model/GAN_GEN_SHADOW_8.pth.
I just ran the script and generated 300 image samples, but the colors of the images are somewhat strange, and I don't know how to fix it. Maybe I did this in the wrong way.
However, I also ran "demo.py", and the demo result is pretty good. The generator model is loaded from the same path, and thus I don't know what made this difference.
I attach some of the generated image samples below.


Which size GPU for full resolution training?

Hi - sorry if this is a broad question, but I've looked around the repo and can't find the answer anywhere.

Assuming I only have 1 Titan XP GPU (12 GB), can I train this model at full resolution? I've been trying to modify the official TensorFlow implementation of progressive growing of GANs to only use 12 GB, but I have to seriously cut back on the number of filters / batch size to make it work.

Thanks so much.

Multi-GPU version runs much slower than on single card

I found this problem last weekend. The volatile GPU utilization is only 11% on each card (I am using 4 Titan XPs). However, when I use a single card, the usage is 45% (not that optimal either; maybe >= 80% would be better). I guess it has something to do with the EMA thing. I will try to debug it. Just an FYI.

Best,
Shu

Samples not showing any progress during training for 1024x1024 model

I tried to train ProGAN on a multi-GPU instance, but the generated sample images seemed to be copies of each other.

Left side: first sample at the first level; right side: last epoch (27) at the first level. The pixels seem exactly the same.

The same continues for higher resolutions, with some color variation, probably due to fade-ins.

I was using pytorch=0.4.1 with cuda90 and the pro_gan_pytorch-examples/implementation/train_network.py training script, with the following config:

# Hyperparameters for the Model
img_dims:
  - 1024
  - 1024

# Pro GAN hyperparameters
use_eql: True
depth: 9
latent_size: 512
learning_rate: 0.003
beta_1: 0
beta_2: 0.99
eps: 0.00000001
drift: 0.001
n_critic: 1
use_ema: True
ema_decay: 0.999

# Training hyperparameters:
epochs:
  - 27
  - 54
  - 54
  - 54
  - 54
  - 54
  - 54
  - 54
  - 300

# % of epochs for fading in the new layer
fade_in_percentage:
  - 50
  - 50
  - 50
  - 50
  - 50
  - 50
  - 50
  - 50
  - 50

batch_sizes:
  - 512
  - 512
  - 512
  - 512
  - 512
  - 256
  - 128
  - 64
  - 32

loss_function: "wgan-gp"  # loss function to be used

num_samples: 16
num_workers: 90
feedback_factor: 5   # number of logs generated per epoch
checkpoint_factor: 10  # save the models after these many epochs

Error when running example: 'list' object has no attribute 'to'

When attempting to run the example from the readme, I saw the same error as when attempting to run the example at https://github.com/akanimax/pro_gan_pytorch-examples.

Possibly an incompatibility with Pytorch v1.0.0?

Files already downloaded and verified
<class 'torchvision.datasets.cifar.CIFAR10'>
Starting the training process ... 


Currently working on Depth:  0
Current resolution: 4 x 4

Epoch: 1
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-ca3c71ce5bd6> in <module>()
     25     epochs=num_epochs,
     26     fade_in_percentage=fade_ins,
---> 27     batch_sizes=batch_sizes
     28 )
     29 # ======================================================================

/usr/local/lib/python3.5/dist-packages/pro_gan_pytorch/PRO_GAN.py in train(self, dataset, epochs, batch_sizes, fade_in_percentage, num_samples, start_depth, num_workers, feedback_factor, log_dir, sample_dir, save_dir, checkpoint_factor)
    605 
    606                     # extract current batch of data for training
--> 607                     images = batch.to(self.device)
    608 
    609                     gan_input = th.randn(images.shape[0], self.latent_size).to(self.device)

AttributeError: 'list' object has no attribute 'to'

Possible typo in LSGAN loss

The current code of the LSGAN loss does the following:

0.5 * ((th.mean(self.dis(fake_samps, height, alpha)) - 1) ** 2)

which is the square of the mean of the error; it probably should be the mean of the squared error, as follows:

0.5 * (th.mean((self.dis(fake_samps, height, alpha) - 1) ** 2))

If the above is correct, the same applies to both the generator and discriminator losses of LSGAN and LSGANSigmoid.

Note: I haven't actually run the above code, just noticed it while looking at the other losses in your code.

I met this error when running your CIFAR-10 training code.

Hi, @joebradly @joshalbrecht @akanimax @SaoYan

I met this error when running your CIFAR-10 training code.

--
Files already downloaded and verified
Starting the training process ...

Currently working on Depth: 0
Current resolution: 4 x 4

Epoch: 1
Traceback (most recent call last):
File "depth4.py", line 61, in
batch_sizes=batch_sizes
File "/home/oem/pro_gan_pytorch/pro_gan_pytorch/PRO_GAN.py", line 1046, in train
labels, current_depth, alpha)
File "/home/oem/pro_gan_pytorch/pro_gan_pytorch/PRO_GAN.py", line 865, in optimize_discriminator
labels, depth, alpha)
File "/home/oem/pro_gan_pytorch/pro_gan_pytorch/Losses.py", line 345, in dis_loss
fake_out = self.dis(fake_samps, labels, height, alpha)
File "/home/oem/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/home/oem/.local/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 123, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/oem/.local/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 133, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/oem/.local/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 77, in parallel_apply
raise output
File "/home/oem/.local/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 53, in _worker
output = module(*input, **kwargs)
File "/home/oem/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/home/oem/pro_gan_pytorch/pro_gan_pytorch/PRO_GAN.py", line 305, in forward
out = self.final_block(y, labels)
File "/home/oem/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/home/oem/pro_gan_pytorch/pro_gan_pytorch/CustomLayers.py", line 445, in forward
labels = self.label_embedder(labels) # [B x C]
File "/home/oem/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/home/oem/.local/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 110, in forward
self.norm_type, self.scale_grad_by_freq, self.sparse)
File "/home/oem/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 1110, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: torch/csrc/autograd/variable.cpp:166: get_grad_fn: Assertion output_nr_ == 0 failed.

What am I doing wrong?

Thanks.

Help restarting after GPU out of memory error at 3 days

Hi, I'm too much of a noob to figure this out myself, and would love to know if there is a simple answer.

I was training for 3 days on my own dataset and was loving the results at 256x256; unfortunately, as soon as we progressed up, the batch size was too large to be handled by my GPU. I guess I'll have to make it bs=2 or 1 (currently 4).

Is there a way to restart the training from the end point of 256x256? I don't want to start all over again...

THANK YOU!

Djproc

P.S. This is some fantastic work you have done, and I'm really appreciative that you've made it so easy to get started!

256x256 pretrained model

Hello @akanimax ,

Because I couldn't reopen issue #25,
I was wondering if you could provide me with a pretrained model for the discriminator and generator at a resolution of 256x256 on the CelebA-HQ dataset. It would really improve my research.

Kind Regards,
Guus

model training plateaus

Regarding the latent size, can you still get good results using a value of 256? I tried to train ProGAN on an art dataset, and it seems to make progress until Depth 6. Please see the attached pictures showing the output at different depths. I ran these using 64 epochs, the default fade-in, a batch size of 64, wgan-gp for the loss function, and trained on 8 GPUs. Thanks!

Attached samples: Depth 2, Depth 3, Depth 4, Depth 5, and Depth 6.

handling grayscale image

For users who curate their own datasets to train the model, sometimes there will be grayscale images. In that case, I've run into the error RuntimeError: output with shape [1, 1024, 1024] doesn't match the broadcast shape [3, 1024, 1024] from pro_gan_pytorch-examples/implementation/data_processing/DataLoader.py. I would like to open a PR and make some suggested changes.

Resuming from snapshot and using start_depth

I have a question for my understanding.
I have set feedback_factor: 10.
When I interrupt training at depth 3 (32x32) and want to resume training from the saved model, which is correct?
--start_depth=3 --generator_file=training_runs/dataset/saved_models/GAN_GEN_3.pth
or
--start_depth=3 --generator_file=training_runs/dataset/saved_models/GAN_GEN_4.pth

or do I skip depth 3 and start with depth 4?
Should I only interrupt when the depth is changing, say during the transition from depth 3 to 4, so that I don't lose or overwrite epochs? For example, if I interrupt during depth 3 with a max epoch count of 64 and stop training at epoch 30, is this bad? Should I stop at epoch 1 instead?

Training the model

Hi, I have been trying to use the CIFAR example you have given. I used it as it is, and the output I am getting is:

Also, can you explain how to change the other parameters if I decide to increase the depth?

Very large d_loss

Thanks for your great work. I'm training on my own dataset using the default WGAN-GP loss. The depth is 5. I am wondering why the d_loss is always very large, like this:

`
Epoch: 10
Elapsed: [0:16:56.978964] batch: 20 d_loss: 360.280609 g_loss: 1.696454
Elapsed: [0:17:05.162741] batch: 40 d_loss: 1318.774780 g_loss: 46.433090
Elapsed: [0:17:12.699862] batch: 60 d_loss: 369.273987 g_loss: -0.842132
Elapsed: [0:17:20.655461] batch: 80 d_loss: 687.216553 g_loss: 14.159639
Elapsed: [0:17:28.525324] batch: 100 d_loss: 1313.480713 g_loss: 34.156487
Elapsed: [0:17:36.623373] batch: 120 d_loss: 347.785248 g_loss: 4.414964
Elapsed: [0:17:44.439503] batch: 140 d_loss: 689.839966 g_loss: -9.050404
Elapsed: [0:17:51.356449] batch: 160 d_loss: 387.812683 g_loss: 8.951473
Time taken for epoch: 66.575 secs

Epoch: 11
Elapsed: [0:18:03.436184] batch: 20 d_loss: 536.160645 g_loss: 29.115032
Elapsed: [0:18:11.773171] batch: 40 d_loss: 333.525940 g_loss: 7.787774
Elapsed: [0:18:18.837069] batch: 60 d_loss: 265.996277 g_loss: 7.262208
`

I have tried to tune hyperparameters like the learning rate, but it still cannot be fixed. The generated images look like this:

Any suggestions? Thank you.

About the alpha for fade-in effect

In the process of training, there is a fade-in parameter called alpha for both the generator loss gen_loss and the discriminator loss dis_loss, whose value is calculated at the beginning of every epoch:

            # calculate the value of aplha for fade-in effect
            alpha = int(epoch / num_epochs)

However, this alpha will always be 0 except at the last epoch. It seems very weird, and this fade-in effect is not discussed in the original paper.

Is there any explanation for this? Thanks!
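As a point of comparison, a schedule where alpha ramps smoothly per iteration over a configurable fraction of each stage (and then stays at 1) would look roughly like this; it is an illustration of the intended fade-in behaviour, not the repository's exact code:

def fade_in_alpha(step: int, total_steps: int, fade_in_percentage: float = 50.0) -> float:
    """Linearly ramp alpha from 0 to 1 over the first `fade_in_percentage`
    percent of the stage's iterations, then hold it at 1."""
    fade_steps = max(1, int(total_steps * fade_in_percentage / 100))
    return min(step / fade_steps, 1.0)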

square() not exist?

AttributeError: 'Tensor' object has no attribute 'square'

y = torch.reshape(x, [group_size, -1, channels, height, width])
y = torch.sqrt(y.square().mean(dim=0, keepdim=False) + alpha)

square() not exist?
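A possible workaround, assuming the error comes from an older PyTorch release that predates Tensor.square(), is to use an equivalent expression:

# .pow(2) is equivalent to .square() and works on older PyTorch releases
y = torch.sqrt(y.pow(2).mean(dim=0, keepdim=False) + alpha)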

Error running demo code: forward() missing 1 required positional argument: 'x'

Thanks for publishing this code. I'm getting an error message while running the demo.py code described in the README. I'm using pytorch 1.0.0

For this line of code:

shower = plt.imshow(get_image(start_point))

I get the following error message: forward() missing 1 required positional argument: 'x'

Any ideas on the cause?


TypeError Traceback (most recent call last)
in
----> 1 shower = plt.imshow(get_image(start_point))

in get_image(point)
1 # function to generate an image given a latent_point
2 def get_image(point):
----> 3 img = gen(point, depth=depth, alpha=1).detach().squeeze(0).permute(1, 2, 0)
4 img = (img - img.min()) / (img.max() - img.min())
5 return img.cpu().numpy()

~/anaconda3/envs/pro_gan_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)

~/anaconda3/envs/pro_gan_pytorch/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py in forward(self, *inputs, **kwargs)
141 return self.module(*inputs[0], **kwargs[0])
142 replicas = self.replicate(self.module, self.device_ids[:len(inputs)])
--> 143 outputs = self.parallel_apply(replicas, inputs, kwargs)
144 return self.gather(outputs, self.output_device)
145

~/anaconda3/envs/pro_gan_pytorch/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py in parallel_apply(self, replicas, inputs, kwargs)
151
152 def parallel_apply(self, replicas, inputs, kwargs):
--> 153 return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
154
155 def gather(self, outputs, output_device):

~/anaconda3/envs/pro_gan_pytorch/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py in parallel_apply(modules, inputs, kwargs_tup, devices)
81 output = results[i]
82 if isinstance(output, Exception):
---> 83 raise output
84 outputs.append(output)
85 return outputs

~/anaconda3/envs/pro_gan_pytorch/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py in _worker(i, module, input, kwargs, device)
57 if not isinstance(input, (list, tuple)):
58 input = (input,)
---> 59 output = module(*input, **kwargs)
60 with lock:
61 results[i] = output

~/anaconda3/envs/pro_gan_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)

TypeError: forward() missing 1 required positional argument: 'x'

Memory issues/requirements

Hi, I am trying to work with a grid of 3x3 images at 1024 resolution (each image in the grid is 1024 px). Somehow it is not enough, even with a GPU with 24 GB of video memory. I can do one image at 1024, during which memory consumption is ~16 GB, but if I add more it quickly jumps over the limit.

I am working with a small set of images (around 200).

Reading around here, I have the impression that something is not right; some people are making huge grids with less video memory, but not me :( Any tips?

Cannot train on GPU

When I run the progan using pytorch for GPU, I get:

Starting the training process ...


Currently working on Depth:  0
Current resolution: 4 x 4

Epoch: 1
Traceback (most recent call last):
  File "progan.py", line 39, in <module>
    feedback_factor=2
  File "/scratch2/virtualenv/lib/python3.7/site-packages/pro_gan_pytorch/PRO_GAN.py", line 1046, in train
    labels, current_depth, alpha)
  File "/scratch2/virtualenv/lib/python3.7/site-packages/pro_gan_pytorch/PRO_GAN.py", line 865, in optimize_discriminator
    labels, depth, alpha)
  File "/scratch2/virtualenv/lib/python3.7/site-packages/pro_gan_pytorch/Losses.py", line 345, in dis_loss
    fake_out = self.dis(fake_samps, labels, height, alpha)
  File "/scratch2/virtualenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/scratch2/virtualenv/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/scratch2/virtualenv/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/scratch2/virtualenv/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
    raise output
  File "/scratch2/virtualenv/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 59, in _worker
    output = module(*input, **kwargs)
  File "/scratch2/virtualenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/scratch2/virtualenv/lib/python3.7/site-packages/pro_gan_pytorch/PRO_GAN.py", line 305, in forward
    out = self.final_block(y, labels)
  File "/scratch2/virtualenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/scratch2/virtualenv/lib/python3.7/site-packages/pro_gan_pytorch/CustomLayers.py", line 445, in forward
    labels = self.label_embedder(labels)  # [B x C]
  File "/scratch2/virtualenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/scratch2/virtualenv/lib/python3.7/site-packages/torch/nn/modules/sparse.py", line 117, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/scratch2/virtualenv/lib/python3.7/site-packages/torch/nn/functional.py", line 1506, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: diff_view_meta->output_nr_ == 0 ASSERT FAILED at /pytorch/torch/csrc/autograd/variable.cpp:209, please report a bug to PyTorch.

But when I run it using PyTorch for CPU only, it works, albeit very, very slowly. Any idea what could be causing this, and is there any way I can get it to work with GPU support?

Can I use a well-trained ProGAN generator to do GAN inversion?

I have used my own custom dataset to train a ProGAN generator, and now I'd like to use the ProGAN to do GAN inversion, so I can get the latent code of a related image through GAN inversion.

Can anyone help me with how I should do this?

Thanks very much.

Recalculation in __progressive_downsampling

It's strange that both optimize_discriminator and optimize_generator call __progressive_downsampling to downsample the real images. Why not downsample the real images once and then pass them to optimize_discriminator and optimize_generator?

Mnist dataset

Where can I download the MNIST dataset?
Should it be the normal CSV (28x28) dataset?

Thanks,
Or

Greyscale modifications?

Hi Akanimax!
Thanks for providing this code. I'm a PyTorch beginner trying to use your code for an undergrad project. I'm trying to use your implementation with a grayscale single-channel dataset (i.e. the same format as MNIST), but I get an error because the network expects 3-channel inputs. Should I transform my data to 3-channel first, or is there a way to toggle to single channel in the code? I assume that would make training much faster.

Sorry for a basic question for you.
Matthew
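One possible workaround, without touching the network code, is to replicate the single channel to three channels in the data pipeline. The sketch below uses standard torchvision transforms with a hypothetical target resolution (the newer package also exposes a --num_channels flag, as listed in the progan_train help above):

from torchvision import transforms

# replicate the single grayscale channel into 3 identical channels
transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize((256, 256)),  # hypothetical target resolution
    transforms.ToTensor(),
])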

Is multi-GPU supported?

As the title.
Is it easy to expand this code to multi-GPU support by applying nn.DataParallel to all network forward passes?

Best,

How do I get celebA-HQ data?

I assume that, for training and getting the quality of results you have listed, you used the CelebA-HQ dataset. May I know how to generate the 1024x1024 images? I know the original authors provide a way to generate deltas, but I'm not sure how these are used to generate the 1024x1024 images.

I tried using https://github.com/nperraud/download-celebA-HQ , but that script no longer works. Sorry, I know that this is not directly related to your repo, but I can't figure out what to do next.

EDIT: Related question, I saw that many references to the celebA-HQ data mention 128x128, 256x256 & 512x512 versions as well. Are these required when training my own version of ProGAN?

Runtime error related to tensor shapes when training ProGAN

I am trying to train a pg.ProGAN module on my created dataset, however when trying to use the train function I run into the following error:

/opt/venv/lib/python3.7/site-packages/pro_gan_pytorch/Losses.py in __gradient_penalty(self, real_samps, fake_samps, height, alpha, reg_lambda)
    124 
    125         # create the merge of both real and fake samples
--> 126         merged = epsilon * real_samps + ((1 - epsilon) * fake_samps)
    127         merged.requires_grad_(True)
    128 

RuntimeError: The size of tensor a (16) must match the size of tensor b (4) at non-singleton dimension 3

Looking at the complete error it seems that the shapes of the real samples and fake sample are not the same. I am using the following training code using pro-gan-pth version 2.1.1 installed using pip:

import torch
import pro_gan_pytorch.PRO_GAN as pg

device = "cuda" if torch.cuda.is_available() else "cpu"
dataset = ImageDataset('full_dataset.hdf5')

depth = 4
num_epochs = [10, 10, 10, 10]
fade_ins = [50, 50, 50, 50]
batch_sizes = [128, 128, 128, 128]
latent_size = 128

pro_gan = pg.ProGAN(depth=depth, latent_size=latent_size, device=device)

pro_gan.train(
    dataset=dataset, 
    epochs=num_epochs,
    fade_in_percentage=fade_ins,
    batch_sizes=batch_sizes,
    num_workers=1
)

Missing LReLU activation after from_rgb conv layers

Correct me if I am wrong, but are we supposed to apply an LReLU activation after the from_rgb 1x1 conv?

Table from the Appendix of the paper

Relevant code from this repository

self.from_rgb = ModuleList(
    reversed(
        [
            ConvBlock(num_channels, nf(stage), kernel_size=(1, 1))
            for stage in range(1, depth)
        ]
    )
)

if depth > 2:
    residual = self.from_rgb[-(depth - 2)](
        avg_pool2d(x, kernel_size=2, stride=2)
    )
    straight = self.layers[-(depth - 1)](self.from_rgb[-(depth - 1)](x))

@akanimax

Wrong colors in the generated images

I copied your code (exactly) and downloaded your pre-trained model, but my generated images' colors are odd.

I haven't used scipy's imsave; is there something different about it compared to PIL or OpenCV?

If you don't mind, please let me know what environment you are using.

Hi akanimax,

I ran your code in my environment (torch 1.4.0, torchvision 0.5.0) and it didn't work due to a bug.
So I would like to know what environment you are using. If you don't mind, please let me know in requirements.txt or something like that.

import torch as th
import torchvision as tv
import pro_gan_pytorch.PRO_GAN as pg

# select the device to be used for training
device = th.device("cuda" if th.cuda.is_available() else "cpu")
data_path = "cifar-10/"

def setup_data(download=False):
    """
    setup the CIFAR-10 dataset for training the CNN
    :param batch_size: batch_size for sgd
    :param num_workers: num_readers for data reading
    :param download: Boolean for whether to download the data
    :return: classes, trainloader, testloader => training and testing data loaders
    """
    # data setup:
    classes = ('plane', 'car', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck')

    transforms = tv.transforms.ToTensor()

    trainset = tv.datasets.CIFAR10(root=data_path,
                                   transform=transforms,
                                   download=download)

    testset = tv.datasets.CIFAR10(root=data_path,
                                  transform=transforms, train=False,
                                  download=False)

    return classes, trainset, testset


if __name__ == '__main__':

    # some parameters:
    depth = 4
    # hyper-parameters per depth (resolution)
    num_epochs = [10, 20, 20, 20]
    fade_ins = [50, 50, 50, 50]
    batch_sizes = [128, 128, 128, 128]
    latent_size = 128

    # get the data. Ignore the test data and their classes
    _, dataset, _ = setup_data(download=True)

    # ======================================================================
    # This line creates the PRO-GAN
    # ======================================================================
    pro_gan = pg.ConditionalProGAN(num_classes=10, depth=depth, 
                                   latent_size=latent_size, device=device)
    # ======================================================================

    # ======================================================================
    # This line trains the PRO-GAN
    # ======================================================================
    pro_gan.train(
        dataset=dataset,
        epochs=num_epochs,
        fade_in_percentage=fade_ins,
        batch_sizes=batch_sizes
    )
    # ======================================================================  
Files already downloaded and verified
Starting the training process ... 


Currently working on Depth:  0
Current resolution: 4 x 4

Epoch: 1
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-1-c94a729137e7> in <module>
     59         epochs=num_epochs,
     60         fade_in_percentage=fade_ins,
---> 61         batch_sizes=batch_sizes
     62     )
     63     # ======================================================================

~/pro_gan_pytorch/pro_gan_pytorch/PRO_GAN.py in train(self, dataset, epochs, batch_sizes, fade_in_percentage, start_depth, num_workers, feedback_factor, log_dir, sample_dir, save_dir, checkpoint_factor)
   1044                     # optimize the discriminator:
   1045                     dis_loss = self.optimize_discriminator(gan_input, images,
-> 1046                                                            labels, current_depth, alpha)
   1047 
   1048                     # optimize the generator:

~/pro_gan_pytorch/pro_gan_pytorch/PRO_GAN.py in optimize_discriminator(self, noise, real_batch, labels, depth, alpha)
    863 
    864             loss = self.loss.dis_loss(real_samples, fake_samples,
--> 865                                       labels, depth, alpha)
    866 
    867             # optimize discriminator

~/pro_gan_pytorch/pro_gan_pytorch/Losses.py in dis_loss(self, real_samps, fake_samps, labels, height, alpha)
    343     def dis_loss(self, real_samps, fake_samps, labels, height, alpha):
    344         # define the (Wasserstein) loss
--> 345         fake_out = self.dis(fake_samps, labels, height, alpha)
    346         real_out = self.dis(real_samps, labels, height, alpha)
    347 

~/.pyenv/versions/3.7.5/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

~/.pyenv/versions/3.7.5/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py in forward(self, *inputs, **kwargs)
    150             return self.module(*inputs[0], **kwargs[0])
    151         replicas = self.replicate(self.module, self.device_ids[:len(inputs)])
--> 152         outputs = self.parallel_apply(replicas, inputs, kwargs)
    153         return self.gather(outputs, self.output_device)
    154 

~/.pyenv/versions/3.7.5/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py in parallel_apply(self, replicas, inputs, kwargs)
    160 
    161     def parallel_apply(self, replicas, inputs, kwargs):
--> 162         return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
    163 
    164     def gather(self, outputs, output_device):

~/.pyenv/versions/3.7.5/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py in parallel_apply(modules, inputs, kwargs_tup, devices)
     83         output = results[i]
     84         if isinstance(output, ExceptionWrapper):
---> 85             output.reraise()
     86         outputs.append(output)
     87     return outputs

~/.pyenv/versions/3.7.5/lib/python3.7/site-packages/torch/_utils.py in reraise(self)
    392             # (https://bugs.python.org/issue2651), so we work around it.
    393             msg = KeyErrorMessage(msg)
--> 394         raise self.exc_type(msg)

RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/USER/.pyenv/versions/3.7.5/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/home/USER/.pyenv/versions/3.7.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/USER/pro_gan_pytorch/pro_gan_pytorch/PRO_GAN.py", line 305, in forward
    out = self.final_block(y, labels)
  File "/home/USER/.pyenv/versions/3.7.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/USER/pro_gan_pytorch/pro_gan_pytorch/CustomLayers.py", line 445, in forward
    labels = self.label_embedder(labels)  # [B x C]
  File "/home/USER/.pyenv/versions/3.7.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/USER/.pyenv/versions/3.7.5/lib/python3.7/site-packages/torch/nn/modules/sparse.py", line 114, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/home/USER/.pyenv/versions/3.7.5/lib/python3.7/site-packages/torch/nn/functional.py", line 1484, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: diff_view_meta->output_nr_ == 0 INTERNAL ASSERT FAILED at /pytorch/torch/csrc/autograd/variable.cpp:326, please report a bug to PyTorch. 

Reproducing the paper's results

I've been trying to reproduce the 8.80 IS on CIFAR-10; however, I only managed to get to 6.60. It looks like the following code uses the same hyperparameters as the ones in the paper:

# some parameters:
depth = 4
# hyper-parameters per depth (resolution)
num_epochs = [16, 32, 32, 32]
fade_ins = [50, 50, 50, 50]
batch_sizes = [16, 16, 16, 16]
latent_size = 512

Apart from the fact that lambda in the loss is 10 instead of 750 (tested with 750 and it did not produce better results).

800k*2 images per epoch, which is 50k (Cifar-10 train set) per epoch * 32.

I am wondering if somebody managed to reproduce the reported results by using this code ?

Inference images do not match samples during training

Inference images generated with this code:

import torch as th
import pro_gan_pytorch.PRO_GAN as pg
import torchvision.utils as vutils

fdepth = 7
depth = 4
latent_size = 512

device = th.device("cuda" if th.cuda.is_available() 
                   else "cpu")
gen = th.nn.DataParallel(pg.Generator(fdepth, latent_size))
gen.load_state_dict(th.load("training_runs/portrait_large_homogen_256/saved_models/GAN_GEN_4.pth", map_location=str(device)))
for x in range(5): 
    noise = th.randn(1, 512).to(device)
    sample_image = gen(noise, depth, alpha=1).detach()
    vutils.save_image(sample_image[0, :, :, :], 'portaits_' + str(x) + '_' + str(depth) + '_' + str(latent_size) + '.png'.format(4))

Result (arranged and upsampled with Photoshop)
inference_collage

Do not resemble generated samples during training:
gen_4_245_48

What is wrong with the code?
The inference samples do not resemble the training samples in color or in shape.
Should I avoid saving images with torchvision?
I can provide the .pth model if desired.

fan-in calculation in _equalized_deconv2d

Hi! First of all, thanks for your implementation - the code is very nice! I am using it as a reference for some of the special blocks, and I noticed that in _equalized_deconv2d (here) you define the fan_in variable as the number of input channels, so the kernel size is not taken into consideration. After seeing this, I compared it to the official TF implementation, where they use the kernel size (here).
So I wanted to ask whether this is intentional, and if so, why.
Thanks in advance.
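For reference, when the kernel size is included (as in the official TF implementation), the He-style runtime scale for an equalized layer would be computed roughly like this, shown here for a regular conv layer; an illustrative sketch, not this repository's exact layer:

import math
import torch.nn as nn

conv = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=3, padding=1)

# fan_in including the kernel size, as in the official TF implementation
fan_in = conv.in_channels * conv.kernel_size[0] * conv.kernel_size[1]
he_scale = math.sqrt(2.0 / fan_in)  # weights get multiplied by this at runtime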

Pretrained models for lower resolutions

Great results! Is there any chance you could provide the pretrained weights for the lower-resolution versions of CelebA-HQ? (in particular, I'm keen to see the 256x256 version)

Thanks,
Will

Q: Equalized convolutions: do they make a big difference?

Thank you for the nice implementation of ProGAN; it is far more readable than the TensorFlow version.

I was just wondering whether, in your experience, using the special equalized convolutions made any big difference in the visual results.
