nashory / pggan-pytorch

:fire::fire: PyTorch implementation of "Progressive growing of GANs (PGGAN)" :fire::fire:

License: MIT License

Python 99.96% Shell 0.04%

generative-adversarial-network progressive-gan pytorch celeba-hq-dataset gan progressively-growing-gan tensorboard

pggan-pytorch's Introduction

PyTorch Implementation of "Progressive growing GAN (PGGAN)"

PyTorch implementation of PROGRESSIVE GROWING OF GANS FOR IMPROVED QUALITY, STABILITY, AND VARIATION
YOUR CONTRIBUTION IS INVALUABLE FOR THIS PROJECT :)

What's different from official paper?

original: trans(G)-->trans(D)-->stab / my code: trans(G)-->stab-->trans(D)-->stab
No NIN layer is used. Unnecessary layers (such as low-resolution blocks) are automatically flushed out as the network grows.
torch.nn.utils.weight_norm is used for the to_rgb_layer of the generator.
No need to prepare the CelebA data; just bring your own dataset :)

How to use?

[step 1.] Prepare dataset
The authors of Progressive GAN released the CelebA-HQ dataset, and Nash is working on supporting it on the branch that I forked this from. For my version, just make sure that all images are children of the folder that you declare in config.py. Also, I warn you that if you use multiple classes, they should be similar so that you do not end up with atrocities.

---------------------------------------------
The training data folder should look like : 
<train_data_root>
                |--Your Folder
                        |--image 1
                        |--image 2
                        |--image 3 ...
---------------------------------------------
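
Before training, it can help to confirm that the folder layout above is what the loader will see. The sketch below uses only the standard library and counts image files in each direct subfolder of the training root. The function name `count_images` and the extension list are assumptions made for illustration; they are not part of this repository.

```python
import os

# Extensions treated as images here; an assumption, adjust to your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images(train_data_root):
    """Return {subfolder_name: number_of_image_files} for each direct subfolder."""
    counts = {}
    for entry in sorted(os.listdir(train_data_root)):
        folder = os.path.join(train_data_root, entry)
        if not os.path.isdir(folder):
            continue  # files placed directly in the root are ignored
        counts[entry] = sum(
            1
            for name in os.listdir(folder)
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_images("<train_data_root>")` should list every class folder with a non-zero count; an empty result suggests the images sit directly in the root instead of in a subfolder.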

[step 2.] Prepare environment using virtualenv

You can easily set up the PyTorch (v0.3) and TensorFlow environment using virtualenv.
CAUTION: if you have trouble installing PyTorch, install it manually using pip. [PyTorch Install]
When installing, please take your time, install all dependencies of PyTorch, and also install TensorFlow.

$ virtualenv --python=python2.7 venv
$ . venv/bin/activate
$ pip install -r requirements.txt
$ conda install pytorch torchvision -c pytorch

[step 3.] Run training

Edit config.py to change parameters (don't forget to change the path to the training images).
Specify which GPU devices to use, and change the "n_gpu" option in config.py to support multi-GPU training.
Run and enjoy!

  (example)
  If using Single-GPU (device_id = 0):
  $ vim config.py   -->   change "n_gpu=1"
  $ CUDA_VISIBLE_DEVICES=0 python trainer.py
  
  If using Multi-GPUs (device id = 1,3,7):
  $ vim config.py   -->   change "n_gpu=3"
  $ CUDA_VISIBLE_DEVICES=1,3,7 python trainer.py
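
In both examples the value of "n_gpu" equals the number of device ids listed in CUDA_VISIBLE_DEVICES. A small sketch of a check for that, assuming you want to compare the two before launching; `count_visible_devices` is a hypothetical helper and is not part of config.py or trainer.py.

```python
import os


def count_visible_devices(env=None):
    """Count the device ids listed in CUDA_VISIBLE_DEVICES.

    Returns 0 when the variable is unset or empty. In that case CUDA does
    not restrict the devices, so the real GPU count cannot be read from
    the environment alone.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES", "")
    return len([d for d in value.split(",") if d.strip()])
```

For the multi-GPU example above, `count_visible_devices({"CUDA_VISIBLE_DEVICES": "1,3,7"})` gives 3, which matches "n_gpu=3".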

[step 4.] Display on tensorboard (At the moment skip this part)

you can check the results on tensorboard.

$ tensorboard --logdir repo/tensorboard --port 8888
Then open <host_ip>:8888 in your browser.

[step 5.] Generate fake images using linear interpolation

CUDA_VISIBLE_DEVICES=0 python generate_interpolated.py
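
Linear interpolation here means walking in a straight line between two latent vectors and feeding each intermediate vector to the generator. The sketch below shows only the interpolation step on plain Python lists; `lerp` and `interpolation_path` are illustrative names, and generate_interpolated.py may implement this differently.

```python
def lerp(z0, z1, t):
    """Linear interpolation between two latent vectors, with 0 <= t <= 1."""
    return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]


def interpolation_path(z0, z1, n_steps):
    """Return n_steps vectors from z0 to z1 inclusive (n_steps >= 2)."""
    # float() keeps the division exact on Python 2 as well as Python 3.
    return [lerp(z0, z1, float(i) / (n_steps - 1)) for i in range(n_steps)]
```

Each vector in the returned path would then be passed through the trained generator to produce one frame of the interpolation.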

Experimental results

Results at higher resolutions (larger than 256x256) will be added soon.

Generated Images

Loss Curve

To-Do List (will be implemented soon)

Support for WGAN-GP loss.
Training resuming functionality.
Loading the CelebA-HQ dataset (for 512x512 and 1024x1024 training).

Compatibility

CUDA v8.0 (if you don't have it, don't worry)
Tesla P40 (you may need more than 12GB Memory. If not, please adjust the batch_table in dataloader.py)

Acknowledgement

Author

MinchulShin, @nashory

Contributors

DeMarcus Edwards, @Djmcflush
MakeDirtyCode, @MakeDirtyCode
Yuan Zhao, @yuanzhaoYZ
zhanpengpan, @szupzp


pggan-pytorch's Issues

How many seconds per iteration should be expected?

I'm only getting ~1.0 iterations per second, which seems really slow, and this is only the 4x4 stage

Problem when training on my dataset

When training on my own dataset (8.8k 512*512 images), like this:

the process goes well at first but collapses after a while.
Right after initialization, the output of the model looks like this:

But as training continues, the output becomes strange and loses all diversity:

Could anyone help me out? I cannot determine where the problem is: the scale of dataset, or the training parameters?

The config I use:

training parameters.

parser.add_argument('--lr', type=float, default=0.001) # learning rate.
parser.add_argument('--lr_decay', type=float, default=0.87) # learning rate decay at every resolution transition.
parser.add_argument('--eps_drift', type=float, default=0.001) # coeff for the drift loss.
parser.add_argument('--smoothing', type=float, default=0.997) # smoothing factor for smoothed generator.
parser.add_argument('--nc', type=int, default=3) # number of input channel.
parser.add_argument('--nz', type=int, default=512) # input dimension of noise.
parser.add_argument('--ngf', type=int, default=512) # feature dimension of final layer of generator.
parser.add_argument('--ndf', type=int, default=512) # feature dimension of first layer of discriminator.
parser.add_argument('--TICK', type=int, default=1000) # 1 tick = 1000 images = (1000/batch_size) iter.
parser.add_argument('--max_resl', type=int, default=9) # 10-->1024, 9-->512, 8-->256
parser.add_argument('--trns_tick', type=int, default=200) # transition tick
parser.add_argument('--stab_tick', type=int, default=100) # stabilization tick

network structure.

parser.add_argument('--flag_wn', type=bool, default=True) # use of equalized-learning rate.
parser.add_argument('--flag_bn', type=bool, default=False) # use of batch-normalization. (not recommended)
parser.add_argument('--flag_pixelwise', type=bool, default=True) # use of pixelwise normalization for generator.
parser.add_argument('--flag_gdrop', type=bool, default=True) # use of generalized dropout layer for discriminator.
parser.add_argument('--flag_leaky', type=bool, default=True) # use of leaky relu instead of relu.
parser.add_argument('--flag_tanh', type=bool, default=False) # use of tanh at the end of the generator.
parser.add_argument('--flag_sigmoid', type=bool, default=False) # use of sigmoid at the end of the discriminator.
parser.add_argument('--flag_add_noise', type=bool, default=True) # add noise to the real image(x)
parser.add_argument('--flag_norm_latent', type=bool, default=False) # pixelwise normalization of latent vector (z)
parser.add_argument('--flag_add_drift', type=bool, default=True) # add drift loss

optimizer setting.

parser.add_argument('--optimizer', type=str, default='adam') # optimizer type.
parser.add_argument('--beta1', type=float, default=0.0) # beta1 for adam.
parser.add_argument('--beta2', type=float, default=0.99) # beta2 for adam.
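
The comment on --TICK says that 1 tick = 1000 images = (1000/batch_size) iterations, so the length of a transition or stabilization phase in iterations depends on the batch size. A minimal sketch of that arithmetic; the function names are hypothetical, and the batch size of 16 used below is an assumed value, since the actual batch size comes from the batch_table in dataloader.py.

```python
def iterations_per_tick(tick_images, batch_size):
    """Iterations needed to show `tick_images` images at a given batch size."""
    return float(tick_images) / batch_size


def iterations_per_phase(n_ticks, tick_images, batch_size):
    """Iterations in a phase that lasts `n_ticks` ticks."""
    return n_ticks * iterations_per_tick(tick_images, batch_size)


# Values from the config above, with an assumed batch size of 16.
trns_iters = iterations_per_phase(200, 1000, 16)   # transition phase
stab_iters = iterations_per_phase(100, 1000, 16)   # stabilization phase
```

With an 8.8k-image dataset, a 200-tick transition corresponds to 200,000 images shown, that is, roughly 23 passes over the data per transition.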

Which version of Pytorch should we use ?

PyTorch 1.0, 0.3 and 0.4 all gave me annoying "compatibility" bugs.

For example, Pytorch 1.0 gave me a

pggan-pytorch/custom_layers.py", line 114, in forward
    return x + self.bias.view(1,-1,1,1).expand_as(x)
RuntimeError: expected type torch.cuda.FloatTensor but got torch.FloatTensor

Pytorch 0.4 gives me a RandomSampler based error, and after fixing it, I get :

  File "torch/nn/functional.py", line 1537, in 
    return lambd_optimized(input, target, size_average, reduce)
RuntimeError: input and target shapes do not match: input [32 x 1], target [32] at /pytorch/aten/src/THCUNN/generic/MSECriterion.cu:15

Similarly with Pytorch 0.3 (it was a .item() error). So which version do you want us to use ?

Questions about PyTorch version

Why should the Pytorch version be 0.3 ? Problems occur when using PyTorch 0.3. What's more, .item() and with torch.no_grad(): should be used in version 0.4 and higher.

How to implement for custom own dataset of say 100 custom images?

so much errors in code

Is there some way to download the images for Celeba-HQ

Or if you can make them available to me I can host them somewhere for people to download. I'd really like the dataset but don't particularly want to go through the process to generate it myself.

issue in dataloader.py

row32: transforms.Scale(size=(self.imsize,self.imsize), interpolation=Image.NEAREST),

size is expected to be an int, but a tuple is passed here.

AttributeError: 'Generator' object has no attribute 'module'

The following is the error I get when the generator begins to grow:

Traceback (most recent call last):
  File "trainer.py", line 343, in <module>
    trainer.train()
  File "trainer.py", line 251, in train
    self.resl_scheduler()
  File "trainer.py", line 149, in resl_scheduler
    self.G.module.grow_network(floor(self.resl))
  File "/Users/rahulbhalley/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 366, in __getattr__
    type(self).__name__, name))
AttributeError: 'Generator' object has no attribute 'module'

Please help! Thanks in advance.
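
Both this error and the reverse one ('DataParallel' object has no attribute 'grow_network') come from calling a method on the wrong object: nn.DataParallel keeps the wrapped network in its `.module` attribute, while an unwrapped network has no such attribute. A minimal sketch of a helper that handles both cases, assuming the network itself has no attribute named `module`. `unwrap` and the two stand-in classes are hypothetical and are not part of trainer.py.

```python
def unwrap(model):
    """Return the underlying network whether or not it is wrapped."""
    # nn.DataParallel exposes the wrapped network as `.module`;
    # a bare network falls through to the default and is returned as-is.
    return getattr(model, "module", model)


class FakeGenerator(object):
    """Stand-in for the real Generator (hypothetical, for illustration)."""

    def grow_network(self, resl):
        return "grown to resl %d" % resl


class FakeDataParallel(object):
    """Stand-in that mimics only the `.module` attribute of nn.DataParallel."""

    def __init__(self, module):
        self.module = module


bare = FakeGenerator()
wrapped = FakeDataParallel(bare)
```

In trainer.py the failing call would then presumably read `unwrap(self.G).grow_network(floor(self.resl))`, which works for single-GPU, multi-GPU, and CPU runs alike.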

Why even use PyTorch's DataLoader in dataloader.py

It seems that the only way to get data from self.dataloader is through get_batch(), and get_batch() creates a new iterator over self.dataloader every time it is called. Therefore we only ever use the first item returned by self.dataloader, which makes creating self.dataloader in the first place look pointless.
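
One way to avoid rebuilding the iterator on every call is to keep a single iterator alive and restart it only when a pass over the data ends. The sketch below works with any re-iterable object, which includes torch.utils.data.DataLoader; the class name `BatchStream` is hypothetical and this is not the code in dataloader.py.

```python
class BatchStream(object):
    """Keeps one iterator alive and restarts it only when it is exhausted."""

    def __init__(self, loader):
        self.loader = loader            # any re-iterable, e.g. a DataLoader
        self._iterator = iter(loader)

    def get_batch(self):
        try:
            return next(self._iterator)
        except StopIteration:
            # One full pass is done: start a new pass over the data.
            self._iterator = iter(self.loader)
            return next(self._iterator)
```

When the resolution changes and a new loader is built, a new `BatchStream` would be created around it.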

network structure

Generator structure:
Sequential (
  (first_block): Sequential (
    (0): ConvTranspose2d(512, 512, kernel_size=(4, 4), stride=(1, 1))
    (1): LeakyReLU (0.2)
    (2): ConvTranspose2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): LeakyReLU (0.2)
  )
  (to_rgb_block): Sequential (
    (0): ConvTranspose2d(512, 3, kernel_size=(1, 1), stride=(1, 1))
    (1): Tanh ()
  )
)
Discriminator structure:
Sequential (
  (from_rgb_block): Sequential (
    (0): Conv2d(3, 512, kernel_size=(1, 1), stride=(1, 1))
    (1): LeakyReLU (0.2)
  )
  (last_block): Sequential (
    (0): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): LeakyReLU (0.2)
    (2): Conv2d(512, 512, kernel_size=(4, 4), stride=(1, 1))
    (3): LeakyReLU (0.2)
    (4): Flatten (
    )
    (5): Linear (512 -> 1)
    (6): Sigmoid ()
  )
)

how to load pretrained generator ?

while running:
CUDA_VISIBLE_DEVICES=1 python generate_interpolated.py

I got the following error:

KeyError: 'unexpected key "model.concat_block.layer1.low_resl_to_rgb.to_rgb_block.high_resl_to_rgb.0.bias" in state_dict'

Of course I made sure the path of the generator model was the right one.
Any idea how to solve this ?
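
An "unexpected key" error means the checkpoint contains parameter names that the freshly built model does not have. One plausible cause here, which is an assumption and not confirmed by the traceback, is that the checkpoint was saved from a grown network (note the concat_block in the key) while the script builds a generator with a different structure. A sketch for listing the differences before loading; `diff_state_dict_keys` is a hypothetical helper that only compares key names.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, both sorted.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not know
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected
```

A typical call would be `diff_state_dict_keys(model.state_dict().keys(), checkpoint.keys())`, where `checkpoint` is the dictionary returned by `torch.load`. If both lists are long, the model was most likely built at a different resolution than the one the checkpoint was saved at.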

about multi-gpu

Hi.
I found that the code for multi-GPU training may be wrong.
In trainer.py, when you refresh the network in multi-GPU mode, you should set up data parallelism again; otherwise the renewed model cannot be replicated to the other GPUs.

Question

Memory leak during network growing?

Thanks for providing your code; it's much more readable than the original one.
I have observed a severe memory leak during training.
During training, previously allocated batch tensors of the smaller network are never removed from GPU memory.

I notice that you use a very small batch size to work around this, but it makes the training painfully slow :(
Have you found any better solution?

Multi-GPU training error

Error in trainer.py with indentation

I am executing in windows and get this error

C:\Users\admin\Documents\pggan-pytorch-master\pggan-pytorch-master>python3 trainer.py
  File "trainer.py", line 270
    loss_d = self.mse(self.fx.squeeze(), self.real_label) + \
                                                            ^
TabError: inconsistent use of tabs and spaces in indentation
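
Python 3 raises TabError when a file mixes tabs and spaces in its indentation in an ambiguous way, whereas Python 2 is more tolerant; the usual fix is to convert the file to a single style. A small sketch for locating the offending lines, assuming the file is meant to be indented with spaces; `lines_with_tab_indent` is a hypothetical helper.

```python
def lines_with_tab_indent(lines):
    """Return 1-based numbers of lines whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        # Leading whitespace only; tabs inside strings or code are ignored.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            flagged.append(number)
    return flagged
```

Running it over `open("trainer.py").readlines()` shows which lines to re-indent.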

How to train a custom dataset of 4:3 800x600 image dataset?

Is it possible to train a custom image dataset of this size? How?

KeyError: 'fadein_block'

Hi, I was running the code, and i come with the problem below:

growing network[4x4 to 8x8]. It may take few seconds...
growing network[4x4 to 8x8]. It may take few seconds...
[*] Renew dataloader configuration, load data from ../../input.
Traceback (most recent call last):
  File "trainer.py", line 392, in <module>
    trainer.train()
  File "trainer.py", line 266, in train
    self.resl_scheduler()
  File "trainer.py", line 164, in resl_scheduler
    self.fadein['gen'] = dict(self.G.module.named_children())['fadein_block']
KeyError: 'fadein_block'

I set trns_tick=4 and stab_tick=2 to speed up the process and see whether there are any additional bugs in the code. Is that the problem? Are there any constraints on the values of trns_tick and stab_tick?

Is config.smoothing ever used?

Multi-gpu not working?

Hello,
I'm trying to train with multi gpu but it simply hangs at trainer.train() without throwing exception.
Any idea?
Thank you

What do you mean by 'the model is being trained'?

Does this code work? Are you able to attain high-res quality images, as in the paper?

[Clarification] Why are we looping until self.max_resl+1+5 ?

Hi,

In the main training loop, why do we loop until self.max_resl+1+5?

https://github.com/nashory/pggan-pytorch/blob/master/trainer.py#L249

for step in range(2, self.max_resl+1+5):

Thanks!

I met some errors in trainer.py

Thanks for sharing your code with us!
I have trained for more than 2500 epochs on one GTX 1080 Ti, but the resolution is still at the initial 4x4 phase and doesn't grow. After 2557 epochs I got this error:
AttributeError: 'DataParallel' object has no attribute 'grow_network'.
I don't know how to solve it.

Difference between Grow_network and Flush_network

Can someone explain the difference between grow and flush components? At a higher level, from what i see, both add layers during training highlighting the progressive part.
Also, what are the to_rgb, from_rgb, fade_in and concat blocks ?

Possible mismatch in runtime weight scaling implementation (equalized learning rate section)

In your code here: x = self.conv(x.mul(self.scale)), the input x is multiplied by the scale, which is equal to scale = sqrt(2 / fan_in) from the He initializer. I am a bit confused about the multiplication. The paper states that w_i_hat = w / scale, which in the case of convolution can be achieved by doing out = conv(x / scale).

My question is: why is the scale multiplied by the x, instead of dividing? Please help.
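
A convolution without bias is linear in both its input and its weights, so scaling the input by a constant gives the same output as scaling the weights by that constant. The sketch below checks this numerically with a hand-written 1-D correlation; `correlate1d` and the sample values are made up for illustration and are not taken from custom_layers.py. It only shows that conv(x * scale) agrees with a convolution whose weights are w * scale; it does not settle how the constant in the paper is defined.

```python
import math


def correlate1d(x, w):
    """Valid 1-D cross-correlation of list x with kernel w (no bias)."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(len(x) - k + 1)]


fan_in = 3                          # kernel size times input channels (1 here)
scale = math.sqrt(2.0 / fan_in)     # He constant, as described in the question

x = [1.0, 2.0, 3.0, 4.0]            # sample input (made up)
w = [0.5, -1.0, 0.25]               # sample kernel (made up)

scaled_input = correlate1d([v * scale for v in x], w)
scaled_weight = correlate1d(x, [v * scale for v in w])
```

The two results agree up to floating-point rounding, so multiplying x by the scale acts as a runtime rescaling of the weights by the same factor.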

Module tkinter is not found

training on smaller dataset

python3.6

thanks for your great project. I have a question about the python version. Can the project run on python3.6?

Some questions about using the code!

Hi there, thanks a lot for providing such wonderful code!
I was trying to run your code,
but after I run the command for training,
which command should I use to generate new images from trained model??

Trying to understand the training on LSUN dataset with multi-class labels

Hi @nashory
Thanks a lot for this implementation! I am trying to use PG-GAN for multi-class dataset (such as LSUN).

I have a question, and I was wondering if I can get your thoughts: The paper mentions that their training is unsupervised - meaning that it was not label-conditioned. Then how come they were able to generate label-specific images for LSUN dataset? Did they train separate networks for each label or is their network a multi-class generator?

Any information on multi-class PGGAN training will be of great help.
Thanks in advance!

License?

Please add a license to this project. Thanks!

have you reproduced the results in thesis?

Why does the deconv function call Equalized_conv instead of Equalized_deconv?

That is probably a mistake; otherwise you are using convolution instead of transposed convolution in the Generator,
which is absurd from my perspective. I looked at the original TensorFlow implementation that was published with the article, and it supports my idea.

https://github.com/tkarras/progressive_growing_of_gans/blob/master/networks.py

-- Additionally, can you please explain generalized_drop_out? There are many modes, but only the "prop" mode is used!

I need answers badly. Thank you for your effort and for sharing.

weight_scale for equalized_learning

Hi. I checked the original code supplied by the paper's authors, and weight_scale there is not like yours.
There, the weight is multiplied by a constant from He's initializer.
I just want to know why you set the weight differently.
THX

Resuming Training

I see that check box in the readme updated for resuming training. Is there an example for the same?

AttributeError: 'DataParallel' object has no attribute 'grow_network'

When I run the model using only 100 images for testing I got this error :

Traceback (most recent call last):
  File "trainer.py", line 380, in <module>
    trainer.train()
  File "trainer.py", line 261, in train
    self.resl_scheduler()
  File "trainer.py", line 157, in resl_scheduler
    self.G.grow_network(floor(self.resl))
  File "/home/miri/anaconda3/envs/mirilab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 576, in __getattr__
    type(self).__name__, name))
AttributeError: 'DataParallel' object has no attribute 'grow_network'

not working well on other datasets

I've been trying to train this code on other datasets like ImageNet but the results are very bad. The images turn into almost black and grey blobs. However, using the original author's theano implementation, results on ImageNet are good. Have you tried out other datasets?

have error when growing network[32x32 to 64x64]

I am training on another dataset of cell pathology images, and it does not work well. I set --trns_tick to 80 and --stab_tick to 40 because my dataset is very small; all other settings are default. When the network starts to grow from 32x32 to 64x64, the program stops with this error:
Traceback (most recent call last):
  File "trainer.py", line 352, in <module>
    trainer.train()
  File "trainer.py", line 258, in train
    self.resl_scheduler()
  File "trainer.py", line 150, in resl_scheduler
    self.G.module.grow_network(floor(self.resl))
  File "/media/sdc/zhanpeng/PycharmProjects/GAN/PG_GAN/pggan-pytorch/network.py", line 142, in grow_network
    inter_block, ndim, self.layer_name = self.intermediate_block(resl)
  File "/media/sdc/zhanpeng/PycharmProjects/GAN/PG_GAN/pggan-pytorch/network.py", line 105, in intermediate_block
    layers = deconv(layers, ndim*2, ndim, 3, 1, 1, self.flag_leaky, self.flag_bn, self.flag_wn, self.flag_pixelwise)
  File "/media/sdc/zhanpeng/PycharmProjects/GAN/PG_GAN/pggan-pytorch/network.py", line 13, in deconv
    if wn: layers.append(equalized_conv2d(c_in, c_out, k_size, stride, pad))
  File "/media/sdc/zhanpeng/PycharmProjects/GAN/PG_GAN/pggan-pytorch/custom_layers.py", line 103, in __init__
    self.conv = nn.Conv2d(c_in, c_out, k_size, stride, pad, bias=False)
  File "/home/panzhanpeng/sdc_link/anaconda3/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 273, in __init__
    False, _pair(0), groups, bias)
  File "/home/panzhanpeng/sdc_link/anaconda3/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 33, in __init__
    out_channels, in_channels // groups, *kernel_size))

TypeError: torch.cuda.FloatTensor constructor received an invalid combination of arguments - got (float, float, int, int), but expected one of:
* no arguments
* (int ...)
didn't match because some of the arguments have invalid types: (float, float, int, int)
* (torch.cuda.FloatTensor viewed_tensor)
* (torch.Size size)
* (torch.cuda.FloatStorage data)
* (Sequence data)
So what should I do? How can I improve the network?
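
The error message shows nn.Conv2d receiving float channel counts, (float, float, int, int), on Python 3.5. One likely cause, which is an assumption because the line that computes ndim is not shown, is a division written with `/`: it returns an int for int operands on Python 2 but always a float on Python 3. A minimal sketch of a version-independent way to halve a channel count; `halve_channels` is a hypothetical name and not a function in network.py.

```python
def halve_channels(ndim):
    """Halve a channel count and keep it an int on both Python 2 and 3."""
    # `//` is floor division; int() also covers the case where ndim
    # has already become a float earlier in the computation.
    return int(ndim // 2)
```

Replacing `/` with `//` wherever channel counts are derived in network.py would be the corresponding change, provided that is indeed where the floats originate.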
