tamarott / singan

Official PyTorch implementation of the paper: "SinGAN: Learning a Generative Model from a Single Natural Image"

Home Page: https://tamarott.github.io/SinGAN.htm

License: Other

Python 100.00%
animation arbitrery-sizes gan harmonization image-edit official singan single-image single-image-animation single-image-generation single-image-super-resolution super-resolution

singan's People

Contributors

alemelis, cclauss, feinsteinben, sharkezz, tamarott

singan's Issues

Objective function

Why did you use the WGAN-GP adversarial loss for L_adv instead of another adversarial loss? Looking forward to your reply.

--cuda argument is apparently being ignored

The project theoretically was designed to run without using CUDA, since it has the --cuda command-line argument. However, in practice it doesn't seem to work that way.

I'm trying to run the project on a device without CUDA support and it's not working. When I run main_train.py, the np2torch function in functions invokes the torch.cuda.FloatTensor class without checking whether the --cuda option was used, which raises an exception.

The imresize module also has a np2torch function with the same problem.

Also, the generate_noise function in functions is always invoked without named parameters, which sets the device parameter to the default value of 'cuda'.

What is the purpose of the --cuda CLI argument?
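One way the flag could be honored is sketched below. This is only an illustration, not the repository's code: the helper names resolve_device and cuda_is_available are hypothetical, and the integer form of --cuda (default 1) is an assumption based on another issue on this page that mentions a default of 1 in config.py.

```python
import argparse


def cuda_is_available() -> bool:
    """Ask PyTorch whether CUDA is usable; report False if torch is not installed."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def resolve_device(use_cuda, cuda_available=None) -> str:
    """Return 'cuda' only when it is both requested and available (hypothetical helper)."""
    if not use_cuda:
        return "cpu"
    if cuda_available is None:
        cuda_available = cuda_is_available()
    return "cuda" if cuda_available else "cpu"


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Assumed flag shape: an integer switch, 1 = use the GPU, 0 = stay on the CPU.
    parser.add_argument("--cuda", type=int, default=1, help="use the GPU if 1")
    return parser.parse_args(argv)


opt = parse_args(["--cuda", "0"])
device = resolve_device(opt.cuda)
print(device)  # the resolved string would then be passed on explicitly
```

The resolved string would then need to be passed explicitly to every helper that creates tensors (for example as a device argument to generate_noise), rather than relying on a 'cuda' default.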

More about SIFID score

Your GAN paper is potentially very useful. I am trying to train a model on a single image and then compute the SIFID score between that image and another test image. Both the training and test images have the same resolution, i.e. 128 x 128. I would like to ask a question:

In the README:
python SIFID/sifid_score.py --path2real --path2fake --images_suffix <e.g. jpg, png>

Are path2real and path2fake the directories that contain the images to compare? I am having trouble running sifid_score.py.

How to use Harmonization code?

Hi, there are already some questions about using harmonization, but they do not solve my problem and the answers are unclear, so I am opening a new issue.

How I use it:
Following the README, I first trained on an image called "seascape.png" in Input/Images:
python main_train.py --input_name seacape.png
When training finished, I ran:
python harmonization.py --input_name seacape.png --ref_name tree.png --harmonization_start_scale 2

Why did I set those parameters? Because the README gives this form:
python harmonization.py --input_name <training_image_file_name> --ref_name <naively_pasted_reference_image_file_name> --harmonization_start_scale <scale to inject>

I take <training_image_file_name> to be the image I trained on, so it is "seascape.png",
and <naively_pasted_reference_image_file_name> to be Input/Harmonization/tree.jpg, so I set tree.jpg.
For <scale to inject> I set 2.

With the settings above, I got:
Screenshot from 2019-11-22 10-20-31
It is named tree_mask_dilated.png and is located in Input/Harmonization. The other output is:
Screenshot from 2019-11-22 10-23-03
It is named start_scale=2.png and is located in Output/Harmonization/seascape/tree_out.
These differ from the results in the paper, so I would like to know how to use it.

start_scale=5.png image:
Screenshot from 2019-11-22 11-26-19

start_scale=6.png image:
Screenshot from 2019-11-22 11-26-26

start_scale=7.png image:
Screenshot from 2019-11-22 11-26-32

start_scale=8.png image:
Screenshot from 2019-11-22 11-26-38

If you can reproduce the results from the paper, please tell me how. Thank you.

SIFID from paper unreproducible

I've slightly modified calculate_sifid_given_paths(path1, path2, batch_size, cuda, dims, suffix) in sifid_score.py so that the function calculates mu and sigma of a single real image and calculates mu and sigma of 50 fake images generated from that real image. This was an attempt to decrease the variation of mu and sigma of the generated output:

def calculate_sifid_given_paths(path1, path2, batch_size, cuda, dims, suffix):
    # some code...
    # ...


    # get mu and sigma of single training image
    ref_m, ref_s = calculate_activation_statistics([img_path], model, batch_size, dims, cuda)

    fid_values = []

    # get fid of multiple images that were based on the single reference image,
    # then calculate distance between each one and ref_m, ref_s
    for i in range(len(files2)):
        generated_m, generated_s = calculate_activation_statistics([files2[i]], model, batch_size, dims, cuda)
        fid_values.append(calculate_frechet_distance(ref_m, ref_s, generated_m, generated_s))

    return fid_values

The values I get are too small compared to the published SIFID:

When using balloons.png as training image, I get 1.650822e-05 for n=N and 1.0670579e-05 for n=N-1 with 50 generated samples.

The published SIFID values are:

Screen Shot 2019-12-04 at 12 17 32 PM
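For reference, FID-style scores are based on the Fréchet distance between two Gaussians. Assuming calculate_frechet_distance follows the standard definition (an assumption, since its body is not shown in this issue), ref_m and ref_s play the role of the real statistics and generated_m and generated_s the role of the generated ones:

```latex
d^2\big((\mu_r, \Sigma_r), (\mu_g, \Sigma_g)\big)
  = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

The absolute value depends on which feature layer the activations are taken from (the dims argument) and on how those activations are scaled, so scores are only comparable when computed under the same settings.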

Size error in Harmonization mode, discrepancy between noise and input image shapes

Dear all

I am trying to use the harmonization example. I specified images as required.

However, I get an error in SinGAN_generate(): the shapes do not match on the code line

z_in = noise_amp*(z_curr)+I_prev

I wonder, should the harmonization image I insert have exactly the same size as the training image? Or is there a way to make it work with an image of any size for harmonization? The model architecture seems flexible enough, but I am a bit confused by the overall code structure and by where exactly these tensor size changes should be handled.

Single Image Frechet Inception Distance

Thank you for the valuable work done with this project.

In my own experiments/research, I am interested in using the newly introduced SIFID metric. However, I could not find an implementation of this metric in the provided code. Could I please ask for the code used to generate those values to be pushed, if it is still possible?

Many thanks

Retargeting

Hi, in the paper there are retargeting (non-homogeneous resizing) examples, but I can't see how to reproduce them in the repo. Is there a way to use the code as-is for retargeting?

Thanks!
Mark

Questions about the paper

Thanks for your great work and sharing the code.

In your paper, for the reconstruction loss, you set the input noise of Gn to a fixed noise map, while for the other generators it is set to zero. Could you explain why you do that?

Looking forward to your reply. Thanks

About super-resolution image size and output results

First of all, congratulations on your paper winning the ICCV 2019 best paper award, and thank you for sharing your code. I have a question. When I run image super-resolution with python SR.py --input_name <LR_image_file_name>, I find that the maximum size of the output image is 1000. Since the images I trained on are 1200 or 800, the final output is not ideal. How should I run it? Wishing you a happy life!

Summary of your paper

Hi,
I really liked the idea of the paper and have written a summary of it, which can be found here.
I hope people find it helpful, and I would really love any feedback on it 😄

about the cuda

Hello, I'm Zhou Chuanyou from Yangtze University, Wuhan, China. I'm studying your research. It's great. I have some questions and would like to ask for your help.
2. I'm trying to install CUDA. Which CUDA version is installed on your computer, and which CUDA build of PyTorch did you install?
This is my LinkedIn contact:
https://www.linkedin.com/in/chuanyou-zhou-8734a1196/
I have tried to add you; when you are free, you can check your LinkedIn.
And my email:
[email protected]

Kernel size can't be greater than actual input size

I tried this code using the original data, but the following error occurred:
RuntimeError: Calculated padded input size per channel: (1 x 4). Kernel size: (3 x 3). Kernel size can't be greater than actual input size

Could you tell me what I should do?
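This message generally means that a feature map has shrunk below the kernel size somewhere inside a stack of convolutions. The sketch below illustrates the arithmetic for one spatial axis using the standard output-size formula. The layer count of 5 and the input sizes are hypothetical values chosen for illustration, not taken from the repository.

```python
def conv_output_size(size: int, kernel: int = 3, stride: int = 1, padding: int = 0) -> int:
    """Standard convolution output-size formula along one spatial axis."""
    padded = size + 2 * padding
    if padded < kernel:
        raise ValueError(
            f"padded input size {padded} is smaller than kernel size {kernel}"
        )
    return (padded - kernel) // stride + 1


def sizes_through_stack(size: int, num_layers: int, kernel: int = 3) -> list:
    """Track one spatial axis through a stack of unpadded convolutions."""
    sizes = [size]
    for _ in range(num_layers):
        sizes.append(conv_output_size(sizes[-1], kernel))
    return sizes


# An 11-pixel axis survives five unpadded 3x3 convolutions ...
print(sizes_through_stack(11, 5))  # [11, 9, 7, 5, 3, 1]

# ... but a 7-pixel axis is down to 1 pixel after three layers, so the fourth fails.
try:
    sizes_through_stack(7, 5)
except ValueError as err:
    print(err)
```

If this is the cause, the coarsest-scale image is probably too small along one axis, for example because the input is very narrow or the minimum size setting is too low for it.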

Method Limitations in random_samples_arbitrary_sizes

Hello, I have a question. When the image contains a person or a single salient object, generating images at an arbitrary aspect ratio does not give realistic results. For example, with the zebra image, the whole animal is broken apart when I generate at an arbitrary size ratio. Is the method unable to handle this kind of image? For images such as the mountain group and the bird group, where the content is continuous, the result is acceptable.

About output

Hello, thanks for your source code.
In the output images, what do G(z_opt).png and fake_sample.png mean?
Which one is the output of the network?

List dependencies

Hello and thanks for your work.

Is it possible to list the exact dependencies for running the scripts? Wrong dependency versions may lead to wrong results. For example using the latest Pytorch version (1.3.0) led to the following warning message:

UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate

Listing dependencies can be done by running pip freeze and copying the output, or better, by running pip freeze > requirements.txt and committing the file.

Thanks in advance.

Changing the dimensions of the noise maps

Quoting the paper: "Because the generators are fully convolutional, we can generate images of arbitrary size and aspect ratio at test time (by changing the dimensions of the noise maps)." Do I understand correctly that the mentioned dimension of noise maps is driven by nc_z? I tried changing this parameter to a greater value (10), but got an exception:

  File "projects/python/ai/SinGAN/SinGAN/manipulate.py", line 132, in SinGAN_generate
    z_in = noise_amp*(z_curr)+I_prev
RuntimeError: The size of tensor a (10) must match the size of tensor b (3) at non-singleton dimension 1

Could someone please clarify this for me? My model is trained using default nc_z, because the paper says I can increase the dimensions of the noise maps only at test stage to generate larger samples.
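PyTorch image tensors are conventionally laid out as (batch, channels, height, width), so "dimension 1" in the error above is the channel axis. That suggests, as an assumption rather than a confirmed reading of the code, that nc_z sets the number of noise channels, while the "dimensions of the noise maps" in the quoted sentence are the height and width, which the --scale_h and --scale_v options used in another issue on this page appear to control. The sketch below uses a hypothetical helper and made-up spatial sizes to show why only the channel change breaks the addition.

```python
def first_broadcast_mismatch(shape_a, shape_b):
    """Return the first axis where two equal-rank shapes cannot broadcast, or None."""
    if len(shape_a) != len(shape_b):
        raise ValueError("this sketch only compares shapes of equal rank")
    for dim, (a, b) in enumerate(zip(shape_a, shape_b)):
        # Element-wise addition needs equal sizes, or a size of 1 on one side.
        if a != b and a != 1 and b != 1:
            return dim
    return None


# Changing the channel count of the noise (10 vs. 3) fails at axis 1.
print(first_broadcast_mismatch((1, 10, 33, 44), (1, 3, 33, 44)))  # 1

# Changing height and width of both tensors together stays compatible.
print(first_broadcast_mismatch((1, 3, 66, 88), (1, 3, 66, 88)))  # None
```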

RuntimeError: The size of tensor a (276) must match the size of tensor b (282) at non-singleton dimension 3

I am having problems running SR.py:

(sinGAN) B:\Program_Files\Anaconda3\envs\sinGAN\SinGAN-master>python SR.py --input_name TEST_12_variant.png --out R:\
Random Seed: 1708
4.000000
Traceback (most recent call last):
  File "SR.py", line 64, in <module>
    out = SinGAN_generate(Gs_sr, Zs_sr, reals_sr, NoiseAmp_sr, opt, in_s=reals_sr[0], num_samples=1)
  File "B:\Program_Files\Anaconda3\envs\sinGAN\SinGAN-master\SinGAN\manipulate.py", line 131, in SinGAN_generate
    z_in = noise_amp*(z_curr)+I_prev
RuntimeError: The size of tensor a (276) must match the size of tensor b (282) at non-singleton dimension 3

Windows 10, conda version : 4.7.10, GTX 1080Ti, driver version: 26.21.14.3160

Scaling from 25px width to 250px width size training and inference on 680px image width

Is it possible to train up to a certain scale, such as 250px width, and then use the trained model at inference time to scale to 480px width or higher?

Random Samples

I tried to generate random samples:
python random_samples.py --input_name mountain.jpeg --mode random_samples_arbitrary_sizes --scale_h 480 --scale_v 600
But I get an out-of-memory error. If it is possible, how can I generate higher-resolution images without an out-of-memory crash?

Edit/Harmonize

The same question applies to edit and harmonize: if I train a model on a low-resolution image, can I upscale the output of edit/harmonize?

Continuing training that stopped

Is it possible to continue scaling and training if the training stopped partway through?
Or do I have to remove the training folder and start from the beginning?

Training time

According to your supplementary material, training for 256 * 256 images takes about 30 minutes on a single 1080 Ti GPU, and generating each image takes less than one second. But when I run main_train.py on a 224 * 224 training image, training takes nearly two hours. Why? My setup is a 2080 Ti, CUDA 9.0, PyTorch 1.1.0 and Python 3.6. I've also tested on a 1080 Ti, where training takes even longer.
Looking forward to your reply.

ValueError: math domain error

python main_train.py --input_name 33039_LR.png

('Random Seed: ', 5309)
functions.adjust_scales2image(real, opt)
in adjust_scales2image
opt.num_scales = int((math.log(math.pow(opt.min_size / (real_.shape[2]), 1), opt.scale_factor_init))) + 1
ValueError: math domain error
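The tuple-style output ('Random Seed: ', 5309) suggests the script was run under Python 2. If so, one plausible cause (an assumption, not confirmed by the maintainers) is that opt.min_size / real_.shape[2] is an integer division there and evaluates to 0, so the logarithm is taken of zero. The sketch below mirrors the expression from the traceback for one axis; the function name and the values 25, 250 and 0.75 are hypothetical.

```python
import math


def num_scales(min_size: int, image_size: int, scale_factor: float,
               floor_division: bool = False) -> int:
    """Mirror the traceback's expression; floor_division emulates Python 2's int / int."""
    ratio = min_size // image_size if floor_division else min_size / image_size
    return int(math.log(math.pow(ratio, 1), scale_factor)) + 1


# True division (Python 3 behaviour): the ratio is 0.1 and the log is defined.
print(num_scales(25, 250, 0.75))  # 9

# Floor division (Python 2 behaviour for ints): the ratio is 0 and math.log fails.
try:
    num_scales(25, 250, 0.75, floor_division=True)
except ValueError as err:
    print("ValueError:", err)
```

If this is the cause, running the script with Python 3 (or adding from __future__ import division under Python 2) should avoid the error.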

I have some questions about the papers and procedures

Hello, your SinGAN is very appealing. I am very interested in SR, but I have some questions about the paper and the procedure that I hope you can answer.

  1. What is the meaning of the Effective Patch Size in Figure 4 in your paper?
  2. In the GAN at the coarsest scale, why is the fixed noise you input all 0 when the program is running?
  3. At the other scales, the input of train_single_scale in your program is noise plus the upsampled image. Why is the input of each GAN all 0 when the program is running?
  4. After training is completed, is the last-scale generator used to perform the super-resolution reconstruction? Gs_sr contains the same model several times. Why do you add the same model to the list multiple times?

You can reply to me in this post or send me an email. My email address is [email protected]
I hope to receive a reply soon. Thank you

Traceback shown in cmd when training the model

Hi,
it shows: python37\lib\site-packages\torch\jit\frontend.py
in build_subscript raise NotSupportedError(base.range(), "slicing multiple dimensions at the same time isn't supported yet")
I do not know how to fix that.
Thanks~

No residual connections

The original paper mentions a residual connection in the generator, which is probably why you used tanh instead of sigmoid at the generator output. But the given code in the file
SinGAN/models.py
does not seem to contain any residual connections.

About the model already exists

Dear tamarott!
First of all, congratulations on your paper winning the ICCV 2019 best paper award. I recently read your paper and ran into a problem when running the code. I first ran python main_train.py --input_name huge.jpg without problems, but the image generated starting from scale 0 is distorted, so I wanted to generate from a finer scale. I then ran a second command, python random_samples.py --input_name huge.jpg-mode random_samples --gen_start_scale 2, but the output says that the model already exists. I later switched to another file name (still an image), and it still says the model exists. May I ask why? Thank you very much! Wishing you a happy life!

Training stuck at scale 9:[1999/2000]

Training constantly gets stuck at [1999/2000],
such as "scale 7:[1999/2000]" or "scale 9:[1999/2000]".

I can't interrupt it even with Ctrl+C; it's totally dead.

I used a mountain picture, resized to the same size as one of your sample images.

I'm using:
python 3.6.8
torch 1.3.0

GPU rtx2080ti
NVIDIA Driver 419.35
CUDA 10.1

SIFID:nan

SIFID/sifid_score.py:262: RuntimeWarning: Mean of empty slice.
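"Mean of empty slice" is the NumPy warning emitted when the mean of an empty array is taken, and the result is nan. One possible cause, stated here as an assumption, is that no image files matched the given directories and --images_suffix, leaving an empty list of scores. A minimal pre-check along those lines is sketched below; list_images is a hypothetical helper, not a function from the repository.

```python
import glob
import os


def list_images(directory: str, suffix: str) -> list:
    """Return sorted paths in `directory` ending in `.suffix`; fail loudly if none match."""
    pattern = os.path.join(directory, "*." + suffix.lstrip("."))
    files = sorted(glob.glob(pattern))
    if not files:
        raise FileNotFoundError(f"no files matching {pattern!r}")
    return files
```

Calling this on both the real and the fake directory before scoring would turn a silent nan into an explicit error that names the pattern that matched nothing.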

Random_samples outputs nothing

I trained the network for 7 scales, 2000 epochs each, and force-stopped the training when scale 8 started.
I tried this command:
!python random_samples.py --input_name "file_1.jpg" --mode random_samples --gen_start_scale 2

  • the ! is because I'm using a Jupyter notebook

Some paths were created: Output/RandomSamples/file_1/gen_start_scale=2
The last folder is empty and there is no image output. Why?

Clarification on painting procedure

I would like to know about the procedure to create the painting.
For example, the mountains.png, in your results:

image_singan

How did you know that the border of the mountain needed to be painted green instead of being kept grey?

I tried to replicate the results, but since the drawing above was not available, I cropped it from the image above (and erased the black arrow). I also painted some mountains myself to see the results, as shown below:

email_singan

See pdf for better resolution.

Note the difference in results, mainly between the second column and the first figure (your results).
The best result came from inserting the original image as the paint.

What was the procedure used to paint the images from your results?

Thanks!
[email protected]

About “trained model already exist” and a solution

Congratulations on your paper winning the best paper award at ICCV 2019!
When I ran your code, I passed the path to my image and was told that the file could not be found.
Later I realized I should pass only the file name and put the image in /Input/Images/.
However, I then got the “trained model already exist” message.
I read the source code, found that the output path had already been created, and deleted that path.
After that, I could run it normally.

Random samples may be generating similar output to input?

When trying to use random_samples, I get results that appear very similar to the input at most generation scales.
47

Except gen scale of 0
9

Not sure if this is expected or not.
Also, maybe that's just a bad test image / pattern for this type of use? I guess I was expecting it to rearrange the apples or something. It sort of does that with the other function; I posted an image here: #2 (comment)

I don't know if this is useful or not, but I put the models and related files into a folder here:
https://drive.google.com/open?id=1xe1nbplb0O2hGOt5Y4sFLcjm0UfLtodf

How to generate interesting images at higher scale > 1

I am able to train and generate interesting images at scale = 0 (lowest resolution). But images with scale > 1 look almost exactly like the original training image, with only very slight perturbations. My original training image is 900x600 in size. Can someone suggest how to generate images at a similar resolution that look different?

Excuse me

I am afraid that some code is wrong in SinGAN/functions.adjust_scales2image.
Lines 204 and 205 seem to repeat lines 198 and 199.
Should opt.scale_factor_init in line 199 be replaced by opt.scale_factor,
or should lines 204 and 205 just be deleted?
Looking forward to your reply.

Potential code error in training.py

Hello

I have a concern about the sampling of the fixed noise z* used in the rec loss (training.py, line 97) during the training of the first generator/discriminator pair.

If I understand the paper correctly, that noise should be sampled once and kept fixed.
It appears that the noise is sampled multiple times (once per epoch) during the training of the first generator/discriminator pair.

Thank you for your help

MULTI-GPU

First of all, thank you for your excellent work, and congratulations on your paper winning the best paper award at ICCV 2019. I'm very interested in your work, but I have some problems running your code. I hope you can help me.

  1. Can the program run on multiple GPUs? I changed the default value of CUDA in config.py from 1 to 2, but it had no effect.
  2. Can only one training image be passed to the GAN at a time, with the next training image changed by hand? If I have 1000 images, can I put them in a folder and have the program process them automatically, instead of the program stopping after each training run so that I have to manually fill in the name of the next image to train on?
  3. What's the difference between G(z_opt).png and fake_sample.png in the saved files? What are the z_opt.pth, Gs.pth, NoiseAmp.pth, reals.pth and Zs.pth files used for? What do they contain?

Thank you very much for your help. Your work is really excellent!
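For the second question, one workaround that needs no changes to the code is to drive main_train.py from a small wrapper script, one run per image. The sketch below only builds the command lines. It assumes the images sit in Input/Images (the folder mentioned in other issues on this page) and that main_train.py is launched from the repository root; training_commands and IMAGE_SUFFIXES are hypothetical names.

```python
import os
import sys

# Assumed set of image extensions to pick up; adjust as needed.
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg")


def training_commands(image_dir: str) -> list:
    """Build one `main_train.py` command per image file found in `image_dir`."""
    names = sorted(
        name for name in os.listdir(image_dir)
        if name.lower().endswith(IMAGE_SUFFIXES)
    )
    return [[sys.executable, "main_train.py", "--input_name", name] for name in names]
```

Each command could then be executed in turn with subprocess.run(cmd, check=True), so that a failed run stops the batch instead of being skipped silently.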

min_size and max_size in random_samples.py?

What is the intended use of [--min_size MIN_SIZE] and [--max_size MAX_SIZE] in random_samples.py? Setting min_size larger than 256 seems to impact image generation but does not result in larger images.
