
endlesssora / tsit

274 stars · 22 watchers · 32 forks · 6.39 MB

[ECCV 2020 Spotlight] A Simple and Versatile Framework for Image-to-Image Translation

License: Other

Python 92.51% Shell 7.49%
generative-adversarial-network gan image-to-image-translation image-generation image-manipulation two-stream-networks versatile feature-transformation multi-scale style-transfer

tsit's People

Contributors

endlesssora


tsit's Issues

Parameters

I want to know why there is such a big gap between the preset parameter counts of G and D. Is this a suitable default from your experiments? When I set ngf and ndf to 32, G has about 100M parameters while D has only 0.3M (I'm afraid that's too small). Thank you.
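A minimal sketch for double-checking such numbers, assuming standard PyTorch modules (netG/netD are placeholder names for however the networks are built):

import torch.nn as nn

def count_params_m(net: nn.Module) -> float:
    """Number of trainable parameters, in millions."""
    return sum(p.numel() for p in net.parameters() if p.requires_grad) / 1e6

# Hypothetical usage once the networks are built:
# print(f"G: {count_params_m(netG):.1f}M params, D: {count_params_m(netD):.1f}M params")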

Alpha is not working

Thanks for your nice work.
If I change alpha at test time, the model produces wrong results: it generates black images instead of stylized ones. What could be the possible reasons?

I trained the network with alpha 1.0, but I assumed alpha could still be changed at inference time.

Thank you.
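For context, alpha in AdaIN-based style transfer (Huang and Belongie) usually interpolates between the content features and the stylized features. Below is a minimal sketch of that convention; whether TSIT's FAdaIN applies alpha exactly this way is an assumption to verify against models/networks/architecture.py. If test-time alpha changes yield black images, checking the blended features for NaN/Inf values is a reasonable first step.

import torch

def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Match per-channel mean/std of the content features to the style features."""
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps  # eps avoids divide-by-zero
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True)
    return s_std * (content - c_mean) / c_std + s_mean

def blend(content: torch.Tensor, style: torch.Tensor, alpha: float) -> torch.Tensor:
    # alpha = 1.0 -> fully stylized features; alpha = 0.0 -> pure content features
    return alpha * adain(content, style) + (1.0 - alpha) * content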

What is the network structure in the box?

Hello, I have some questions about the paper. What is the network structure in the box between the two Content ResBlks? I couldn't find a description of it in the paper. Also, what do you mean by using Spectral Norm for all layers in the network? Isn't that Batch Normalization or Instance Normalization?
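For reference, spectral normalization is a reparameterization of a layer's weights (dividing them by their largest singular value), not a feature normalization, so it is orthogonal to Batch/Instance Norm and a layer can use both. A minimal illustration:

import torch.nn as nn
from torch.nn.utils import spectral_norm

# Spectral norm constrains the conv's weight matrix; InstanceNorm normalizes the
# activations. They operate on different things.
layer = nn.Sequential(
    spectral_norm(nn.Conv2d(64, 64, kernel_size=3, padding=1)),
    nn.InstanceNorm2d(64),
    nn.LeakyReLU(0.2, inplace=True),
)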

The result may not be correct.

Thanks for sharing this amazing paper. However, I think the segmentation accuracy on Cityscapes may not be correct (94.4 in Table 2 of your paper), according to xh-liu/CC-FPSE#4. Can you double-check it? Thanks.

cuda runtime error (700) : an illegal memory access was encountered

I am trying out your code on a custom dataset using the following command:

python train.py \
    --name sis_maps \
    --task SIS \
    --gpu_ids 0,1 \
    --checkpoints_dir ./checkpoints \
    --batchSize 4 \
    --dataset_mode custom \
    --croot ../../datasets/maps \
    --sroot ../../datasets/maps \
    --nThreads 4 \
    --gan_mode hinge \
    --num_upsampling_layers more \
    --use_vae \
    --alpha 1.0 \
    --display_freq 1000 \
    --save_epoch_freq 20 \
    --niter 100 \
    --niter_decay 100 \
    --lambda_vgg 20 \
    --lambda_feat 10 \
    --no_instance \
    --label_dir ../../datasets/maps/trainA \
    --image_dir ../../datasets/maps/trainB

The goal is to translate images from the domain of trainA into images from the domain of trainB. Do I understand the flags label_dir and image_dir correctly?

The dataset and networks are created without errors:

dataset [CustomDataset] of size 1096 was created
Network [TSITGenerator] was created. Total number of parameters: 332.4 million. To see the architecture, do print(network).
Network [MultiscaleDiscriminator] was created. Total number of parameters: 5.6 million. To see the architecture, do print(network).
Network [ConvEncoder] was created. Total number of parameters: 10.5 million. To see the architecture, do print(network).
create web directory ./checkpoints/sis_maps/web...
  0%|                                                                                  | 0/200 [00:00<?, ?it/s]
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCTensorScatterGather.cu line=384 error=700 : an illegal memory access was encountered

However, the code then fails during the first generator step:

Traceback (most recent call last):
  File "train.py", line 37, in <module>
    trainer.run_generator_one_step(data_i)
  File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/leander/tsit/TSIT/trainers/pix2pix_trainer.py", line 30, in run_generator_one_step
    g_losses, generated = self.pix2pix_model(data, mode='generator')
  File "/n/home05/lauenburg/.conda/envs/tsit/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/n/home05/lauenburg/.conda/envs/tsit/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 153, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/n/home05/lauenburg/.conda/envs/tsit/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/leander/tsit/TSIT/models/pix2pix_model.py", line 37, in forward
    input_semantics, real_image = self.preprocess_input(data)
  File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/leander/tsit/TSIT/models/pix2pix_model.py", line 118, in preprocess_input
    input_semantics = input_label.scatter_(1, label_map, 1.0)
RuntimeError: cuda runtime error (700) : an illegal memory access was encountered at /pytorch/aten/src/THC/generic/THCTensorScatterGather.cu:384
terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: an illegal memory access was encountered (insert_events at /pytorch/c10/cuda/CUDACachingAllocator.cpp:771)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x46 (0x2b41150c3536 in /n/home05/lauenburg/.conda/envs/tsit/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0x7ae (0x2b4114e7dfbe in /n/home05/lauenburg/.conda/envs/tsit/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::TensorImpl::release_resources() + 0x4d (0x2b41150b3abd in /n/home05/lauenburg/.conda/envs/tsit/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #3: <unknown function> + 0x5236b2 (0x2b40c90696b2 in /n/home05/lauenburg/.conda/envs/tsit/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x523756 (0x2b40c9069756 in /n/home05/lauenburg/.conda/envs/tsit/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #20: __libc_start_main + 0xf5 (0x2b40c08cd555 in /lib64/libc.so.6)

I tried resolving the issue by following similar reports, but without success.
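A common cause of an illegal memory access in scatter_ is a label id greater than or equal to opt.label_nc, e.g. when plain RGB images rather than index-valued label maps are supplied as --label_dir, as the maps dataset above suggests. A hedged sanity check; the path, extension, and label_nc below are taken from the command above and may need adjusting:

import numpy as np
from pathlib import Path
from PIL import Image

label_nc = 182                                  # whatever you pass via --label_nc
label_dir = Path("../../datasets/maps/trainA")  # the directory given to --label_dir

# scatter_-based one-hot encoding indexes channel `label_id`; any pixel with a
# value >= label_nc reads/writes out of bounds on the GPU.
max_id = max(int(np.array(Image.open(p)).max()) for p in label_dir.glob("*.jpg"))
print(f"max label id: {max_id} (must be < label_nc = {label_nc})")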

Requirements and device versions

Hi there,

First of all, thank you for your work. I believe your model can produce wonderful results, but when I try to run it I face the same problem every time.

Before describing the issue, I'd recommend pinning the exact version of Pillow in your requirements.txt, since the code doesn't work with the latest Pillow version and produces the following error:

ImportError: cannot import name 'PILLOW_VERSION' from 'PIL' (/home/ubuntu/anaconda3/envs/tsit/lib/python3.7/site-packages/PIL/__init__.py)

In fact, Pillow should be pinned to <6.0.0; otherwise you have to edit PIL/__init__.py yourself so that PILLOW_VERSION resolves correctly.

Coming back to my problem: when I run bash test_scripts/ast_summer2winteryosemite.sh (or any other test command), whatever version of CUDA (10...12), cudatoolkit, and Ubuntu (18, 20, 22) I use, the model produces the following error:

Traceback (most recent call last):
  File "test.py", line 34, in <module>
    generated = model(data_i, mode='inference')
  File "/home/ubuntu/anaconda3/envs/tsit/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/ML-OpenPet/tsit/models/pix2pix_model.py", line 51, in forward
    fake_image, _ = self.generate_fake(input_semantics, real_image)
  File "/home/ubuntu/ML-OpenPet/tsit/models/pix2pix_model.py", line 197, in generate_fake
    fake_image = self.netG(input_semantics, real_image, z=z)
  File "/home/ubuntu/anaconda3/envs/tsit/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/ML-OpenPet/tsit/models/networks/generator.py", line 85, in forward
    ft0, ft1, ft2, ft3, ft4, ft5, ft6, ft7 = self.content_stream(content)
  File "/home/ubuntu/anaconda3/envs/tsit/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/ML-OpenPet/tsit/models/networks/stream.py", line 29, in forward
    x0 = self.res_0(input)  # (n,64,256,512)
  File "/home/ubuntu/anaconda3/envs/tsit/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/ML-OpenPet/tsit/models/networks/architecture.py", line 108, in forward
    x_s = self.shortcut(x)
  File "/home/ubuntu/ML-OpenPet/tsit/models/networks/architecture.py", line 119, in shortcut
    x_s = self.actvn(self.norm_layer_s(self.conv_s(x)))
  File "/home/ubuntu/anaconda3/envs/tsit/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    hook(self, input)
  File "/home/ubuntu/anaconda3/envs/tsit/lib/python3.7/site-packages/torch/nn/utils/spectral_norm.py", line 99, in __call__
    setattr(module, self.name, self.compute_weight(module, do_power_iteration=module.training))
  File "/home/ubuntu/anaconda3/envs/tsit/lib/python3.7/site-packages/torch/nn/utils/spectral_norm.py", line 85, in compute_weight
    sigma = torch.dot(u, torch.mv(weight_mat, v))
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:117

I hope you can help me deal with this issue. All tests are performed on AWS. It would be extremely helpful if you shared your full list of working requirements (e.g. the output of pip freeze) and your versions of CUDA, the NVIDIA driver, and Ubuntu.

Thank you in advance!
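One plausible cause, offered as an assumption: a cublas failure on code that runs elsewhere often means the installed PyTorch wheel was not built for the GPU's compute capability (wheels as old as torch 1.x predate the Turing/Ampere GPUs common on AWS). A quick environment report to attach to such bug reports:

import torch

# Report the versions that matter for "cublas runtime error" style failures.
print("torch:", torch.__version__, "| built against CUDA:", torch.version.cuda)
print("device:", torch.cuda.get_device_name(0),
      "| compute capability:", torch.cuda.get_device_capability(0))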

Do we gain speed due to the simplicity of the model?

Hi guys, thank you and congratulation for the great work! I have a small question.

If I understand correctly, with the two-stream architecture TSIT accomplishes image-to-image translation without constraints like cycle consistency, which are heavy and time-consuming. I would therefore guess that the training and inference times of TSIT are shorter than those of related works, but I couldn't find this information in your paper. Have you compared training and inference speed? I can't wait to see the results.

Thank you very much!
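For anyone who wants to measure this themselves, a minimal timing sketch; the model(data_i, mode='inference') call mirrors how test.py invokes the model, and the warm-up plus torch.cuda.synchronize() are needed for honest GPU timings:

import time
import torch

@torch.no_grad()
def avg_inference_ms(model, data_i, n_warmup: int = 5, n_runs: int = 50) -> float:
    """Average per-call inference time in milliseconds."""
    for _ in range(n_warmup):          # warm-up: kernel selection, cache filling
        model(data_i, mode='inference')
    torch.cuda.synchronize()           # flush queued GPU work before timing
    start = time.time()
    for _ in range(n_runs):
        model(data_i, mode='inference')
    torch.cuda.synchronize()
    return (time.time() - start) / n_runs * 1e3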

cityscapes sis pretrained model

I am getting an error when I run the test script for Cityscapes SIS.

The loaded model is missing keys:

Missing key(s) in state_dict: every spectral-norm tensor of the style stream, i.e. "weight_orig", "weight", "weight_u", "weight_v", and "bias" for "style_stream.res_0" through "style_stream.res_7" (each block's "conv_0" and "conv_1", plus "conv_s" for res_0 through res_4).

License

Can you please add a license to this repo? Thanks!

latest_net_G.pth

Thank you very much for your excellent work. I want to fine-tune your Sunny2Diffweather model, which requires the parameters of both G and D. However, you only provided the parameters of G, so I would like to request D's as well. I just need sunny2diffweather latest_net_D.pth. Thank you very much!

results are broken

Hello,

I am writing because I am interested in using your code for my undergraduate thesis. I have tested the code, but the results are as follows, and I am not sure what the problem is. My environment is Ubuntu 22.04 / PyTorch 1.1.0 / cudatoolkit 10.0. The GPU works fine, but the results are broken.

[attached: two broken result images]

I would be grateful if you could help me to identify the problem.

Thank you for your time and consideration.

Reducing Generator size

Thank you for sharing your research!
I am training my own model on a dataset of paired 256x256 images.
The saved generator is 1.59 GB. Is there a way to reduce the size of the model based on the size of the images, or in any other way?

Thank you for your advice.
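For what it's worth, the checkpoint size is set by the parameter count (about 4 bytes per float32 weight), not by the training image resolution, since the generator is convolutional; parameter count grows roughly quadratically with --ngf, so lowering ngf is the main lever. A back-of-the-envelope check:

import torch.nn as nn

def approx_checkpoint_gb(net: nn.Module) -> float:
    """Rough .pth size: 4 bytes per float32 parameter (ignores buffers/overhead)."""
    return sum(p.numel() for p in net.parameters()) * 4 / 1024**3

# The default TSIT generator reports ~332.4M parameters elsewhere in this page,
# i.e. about 332.4e6 * 4 / 1024**3 ≈ 1.24 GB of raw weights.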

Request for the processed BDD100K dataset

Hi, thanks for your excellent work.

Sorry to disturb you. Recently I have been reproducing your code, but I found that the BDD100K dataset you used in the paper differs from the original dataset. You also mention in the paper, "We classify the dataset into different times," and "We further classify the images in BDD100K [48] dataset into different time and weather conditions."
Therefore, I would like to ask for a copy of the BDD100K split you are using. My email is: [email protected]

Thanks.

How to train on a custom dataset?

Thanks for your great work and your efforts.

My question is: is there a way to train a model on a custom dataset? Suppose I want to perform "MMIS" on my own data, where I have trainA and trainB and want to translate trainA to trainB with any of the MMIS operations. How can I do that?

Another thing I'd like to know is whether we can perform additional transformations, such as fog to a normal image or vice versa.

Thank You

Why do you put FADE before Conv?

I see that you inherited the approach from SPADE, where the SPADE layer is placed before the Conv in SPADEResnetBlock. Can you explain the motivation?
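For readers of this thread: SPADE, and FADE by inheritance, follows a pre-activation layout (normalize-and-modulate, activate, then convolve), so each convolution consumes freshly modulated features rather than modulating after the fact. A schematic sketch of that ordering; norm_layer stands in for a FADE/SPADE module and is not TSIT's actual class:

import torch
import torch.nn as nn
import torch.nn.functional as F

class PreActModulatedBlock(nn.Module):
    """norm(+modulation) -> activation -> conv, as in SPADE-style ResBlocks."""
    def __init__(self, channels: int, norm_layer: nn.Module):
        super().__init__()
        self.norm = norm_layer  # placeholder: called as norm(x, feat)
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor, feat: torch.Tensor) -> torch.Tensor:
        # the conv sees freshly modulated, activated features
        return x + self.conv(F.leaky_relu(self.norm(x, feat), 0.2))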

Inception Score

Hi, thanks for your excellent work.

Sorry to disturb you, but how did you obtain the Inception Score in your experiments on the summer2differentweather mappings? Where is the code?

Thanks.

Three questions about network details

Hello, and thanks for your great paper. I am trying to reproduce your work, and three questions came up while coding.

  1. What is the network structure of the input layer and the output layer? As I understand it, the input layer at the entrance adjusts the feature channels to 64 while the resolution remains unchanged, and the output layer reduces the 64-channel features back to an image.

For now, I use the following conv blocks:

import torch.nn as nn

# conv_block is my own helper: it wraps nn.Conv2d and applies spectral norm
# when use_spectral is True.

# input layer: 7x7 conv, stride 1 (resolution unchanged), in_channels -> 64
nn.Sequential(
    conv_block(self.use_spectral, in_channels, out_channels, kernel_size=7, stride=1,
               padding_mode="zeros", padding=3, bias=True),
    nn.InstanceNorm2d(out_channels),
    nn.ReLU(inplace=True)
)

# output layer: 7x7 conv reducing the 64 base channels back to an image
nn.Sequential(
    conv_block(self.use_spectral, base_channels, out_channels, kernel_size=7,
               padding=3, padding_mode="zeros"),
    nn.Tanh()
)
  2. In your network, FAdaIN is placed right before the BatchNorm inside FADE. For the arbitrary style transfer task the batch size is 1, so BatchNorm behaves the same as InstanceNorm (is this correct?). Doesn't the style injected by FAdaIN then disappear? (See the numerical check after this list.)

Related code in my implementation:

# main code in my FAdaIN: normalize, then apply the style-derived affine params
self.norm = nn.InstanceNorm2d(num_features, eps, momentum, affine, track_running_stats)
x = self.gamma * self.norm(x) + self.beta  # gamma/beta are computed from the style features

# parameter-free norm inside FADE
nn.BatchNorm2d(num_features=in_channels, affine=False, track_running_stats=True)
  3. Setting aside the two questions above, I tried my implementation on an unpaired dataset (6,000 human faces to 3,400 anime faces) and got the result below (during training, after ~90,000 pairs):

[attached: content image / style image / generated image]

Batch size = 1, feature matching loss weight = 1, perceptual loss weight = 1.

All images were generated on the test dataset. The generated images seem to have an average style rather than the style of the input style image. Is this a problem with my implementation, or with the network in the paper?
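Regarding question 2 above, the batch-size-1 intuition can be verified numerically: in training mode, a parameter-free BatchNorm at N = 1 normalizes each channel over H*W, exactly like InstanceNorm, which supports the concern that channel-wise statistics injected just before it would be normalized away:

import torch
import torch.nn as nn

x = torch.randn(1, 8, 16, 16)
bn = nn.BatchNorm2d(8, affine=False, track_running_stats=True).train()
inorm = nn.InstanceNorm2d(8, affine=False)

# At batch size 1, BatchNorm's per-channel statistics over (N, H, W) coincide
# with InstanceNorm's per-sample statistics over (H, W).
print(torch.allclose(bn(x), inorm(x), atol=1e-5))  # True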

Other GPU IDs throw an error

When using any GPU device ID other than 0, the code throws an error:

Traceback (most recent call last):
  File "test.py", line 12, in <module>
    opt = TestOptions().parse()
  File "/home/jang_sa/phd/AI/domain_adaptation/TSIT/options/base_options.py", line 178, in parse
    torch.cuda.set_device(opt.gpu_ids[0])
  File "/home/jang_sa/Software/anaconda3/envs/tsit/lib/python3.7/site-packages/torch/cuda/__init__.py", line 263, in set_device
    torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal
The GPUs are available and the device IDs are valid, but I still get the error. Any solution to this problem?
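A common workaround, offered as an assumption about the setup: restrict which physical GPUs CUDA exposes and address the remaining one as device 0. The variable must be set before CUDA is initialized; equivalently, export it in the shell before launching (CUDA_VISIBLE_DEVICES=1 python test.py --gpu_ids 0 ...):

import os

# Expose only physical GPU 1; it is then visible to PyTorch as cuda:0.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import torch  # import after setting the variable, before CUDA is initialized
torch.cuda.set_device(0)          # valid now: only one (remapped) device is visible
print(torch.cuda.device_count())  # 1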
