tamarott / singan
Official pytorch implementation of the paper: "SinGAN: Learning a Generative Model from a Single Natural Image"
Home Page: https://tamarott.github.io/SinGAN.htm
License: Other
Why did you use the WGAN-GP adversarial loss instead of another L_adv? Looking forward to your reply.
The project was, in theory, designed to run without CUDA, since it has the --cuda command-line argument. In practice, however, it doesn't seem to work that way.
I'm trying to run the project on a device without CUDA support and it's not working. When I run main_train.py, the function np2torch in functions invokes the torch.cuda.FloatTensor class without checking whether the --cuda option was used, which raises an exception.
The imresize module also has a np2torch function with the same problem.
Also, the generate_noise function in functions is always invoked without named parameters, which sets the device parameter to its default value of 'cuda'.
What is the purpose of the --cuda CLI argument?
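A minimal sketch of the kind of device fallback this issue is asking for. The helper name pick_device and its arguments are hypothetical and not part of the SinGAN code; in practice cuda_available would come from torch.cuda.is_available(), and the returned string could be passed to torch.device(...) and on to tensor-creating helpers instead of relying on a hard-coded 'cuda' default.

```python
def pick_device(want_cuda: bool, cuda_available: bool) -> str:
    """Return a device string, falling back to CPU when CUDA cannot be used.

    Hypothetical helper for illustration; not part of the SinGAN repository.
    """
    if want_cuda and cuda_available:
        return "cuda:0"
    return "cpu"


if __name__ == "__main__":
    # On a machine without CUDA the request is downgraded to CPU.
    print(pick_device(want_cuda=True, cuda_available=False))
```

Passing the resolved device explicitly to every helper that allocates tensors avoids depending on a default argument value.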
The program fails after the first 25*25 picture has been output.
Your GAN paper is potentially very useful. I am trying to train a model on a single image and then find the SIFID score between that image and another test image. Both the training and test images have the same resolution, i.e. 128 x 128. I would like to pose a question:
In the README:
python SIFID/sifid_score.py --path2real --path2fake --images_suffix <e.g. jpg, png>
Are path2real and path2fake the directories that contain the images to compare? I have an issue running sifid_score.py.
Hi, I know there are already some questions about using Harmonization, but they do not solve my problem and the answers are vague, so I am opening a new issue.
How I use it:
Following the README, I first train on an image called "seascape.png" in Input/Images:
python main_train.py --input_name seascape.png
When training has finished, I run:
python harmonization.py --input_name seascape.png --ref_name tree.png --harmonization_start_scale 2
Why did I set those parameters? Because the README gives this form:
python harmonization.py --input_name <training_image_file_name> --ref_name <naively_pasted_reference_image_file_name> --harmonization_start_scale <scale to inject>
I take <training_image_file_name> to be my training image, so it is "seascape.png", and <naively_pasted_reference_image_file_name> to be Input/Harmonization/tree.jpg, so I set tree.jpg. For the scale I set 2.
With the settings above, I got one image named tree_mask_dilated.png, located in Input/Harmonization, and another output named start_scale=2.png, located in Output/Harmonization/seascape/tree_out.
These are different from the paper, so I want to know how to use it.
If you can reproduce the results from the paper, please tell me how. Thank you.
I've modified calculate_sifid_given_paths(path1, path2, batch_size, cuda, dims, suffix) slightly in sifid_score.py so that the function calculates mu and sigma of a single real image and calculates mu and sigma of 50 fake images generated from that single real image. This was an attempt to decrease the variation of mu and sigma of the generated output:
def calculate_sifid_given_paths(path1, path2, batch_size, cuda, dims, suffix):
    # some code...
    # ...
    # get mu and sigma of single training image
    ref_m, ref_s = calculate_activation_statistics([img_path], model, batch_size, dims, cuda)
    fid_values = []
    # get fid of multiple images that were based on the single reference image,
    # then calculate distance between each one and ref_m, ref_s
    for fake_path in files2:
        generated_m, generated_s = calculate_activation_statistics([fake_path], model, batch_size, dims, cuda)
        fid_values.append(calculate_frechet_distance(ref_m, ref_s, generated_m, generated_s))
    return fid_values
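For reference, and assuming calculate_frechet_distance follows the standard Fréchet distance between two Gaussians (the formulation used for FID), the quantity computed for each real/generated pair would be:

```latex
d^2\big((\mu_r, \Sigma_r), (\mu_g, \Sigma_g)\big)
  = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Here (\mu_r, \Sigma_r) correspond to ref_m, ref_s and (\mu_g, \Sigma_g) to generated_m, generated_s in the function above.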
The values I get are too small compared to the published SIFID:
When using balloons.png as the training image, I get 1.650822e-05 for n=N and 1.0670579e-05 for n=N-1 with 50 generated samples.
The published SIFID are:
Dear all
I am trying to use the harmonization example. I specified images as required.
However, I get an error in SinGAN_generate(): the shapes do not match on the code line
z_in = noise_amp*(z_curr)+I_prev
I wonder, should the harmonization image I insert have exactly the same size as the training image? Or is there a way to make it work with an image of any size for harmonization? The model architecture seems flexible enough, but I am a bit confused by the overall code structure and by where exactly these tensor size changes should be refactored.
Thank you for the valuable work done with this project.
In my own experiments/research, I am interested in using the newly introduced SIFID metric. However, I could not find an implementation of this metric in the provided code. Could I please ask for the code used to generate those values to be pushed, if that is still possible?
Many thanks
Hi, in the paper there are retargeting (non-homogeneous resizing) examples, but I can't see how to reproduce them in the repo. Is there a way to use the code as-is for retargeting?
Thanks!
Mark
Thanks for your great work and sharing the code.
In your paper, for the reconstruction loss, you set the input noise of Gn to a fixed noise map, while for the other generators it is set to zero. Could you explain why you do that?
Looking forward to your reply. Thanks
First of all, congratulations on your paper winning the best paper award at ICCV 2019, and thank you for sharing your code. I have some questions I need to ask you. When I run image super-resolution with python SR.py --input_name <LR_image_file_name>, I find that the maximum size of the output image can only be 1000, but since the images I trained on are 1200 or 800, the final output is not ideal. How should I run it? Wish you a happy life!
Hi,
I really liked the idea of the paper and have written a summary of it, which can be found here.
I hope people find it helpful, and I would really love any feedback on it 😄
Hello, I'm Zhou Chuanyou from Yangtze University, Wuhan, China. I'm studying your research. It's great. I have some questions to ask for your help.
2. I'm trying to install CUDA. Which CUDA version do you have installed on your computer, and which CUDA version did you choose when installing PyTorch?
This is my LinkedIn contact:
https://www.linkedin.com/in/chuanyou-zhou-8734a1196/
I have tried to add you, when you are free, you can check your LinkedIn.
And my email:
[email protected]
I tried this code using the original data, but the following error occurred:
RuntimeError: Calculated padded input size per channel: (1 x 4). Kernel size: (3 x 3). Kernel size can't be greater than actual input size
Could you tell me what I should do?
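The error message itself is plain convolution arithmetic: after padding, one spatial dimension (here 1) is smaller than the 3 x 3 kernel. The sketch below shows how a stack of unpadded stride-1 convolutions shrinks an input until that happens. The function names and the choice of five layers are illustrative assumptions, not values read from the repository.

```python
def conv_output_size(size: int, kernel: int = 3, padding: int = 0, stride: int = 1) -> int:
    """Spatial output size of a convolution along one dimension."""
    padded = size + 2 * padding
    if padded < kernel:
        raise ValueError(
            f"padded input size {padded} is smaller than kernel size {kernel}"
        )
    return (padded - kernel) // stride + 1


def size_after_layers(size: int, num_layers: int, kernel: int = 3, padding: int = 0) -> int:
    """Apply the same convolution num_layers times along one dimension."""
    for _ in range(num_layers):
        size = conv_output_size(size, kernel, padding)
    return size


if __name__ == "__main__":
    print(size_after_layers(11, 5))  # each unpadded 3x3 layer removes 2 pixels
    try:
        size_after_layers(9, 5)      # runs out of pixels before the last layer
    except ValueError as err:
        print("ValueError:", err)
```

Under these assumptions, an input whose height or width is too small at the coarsest scale fails in the same way as the error quoted above.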
Hello, I have a question: when there is a person or only one salient object in the image, generating images with an arbitrary aspect ratio does not look realistic. For example, with the zebra image, when I generate an image with a different aspect ratio, the whole animal is broken apart. Is this method unable to handle this kind of image? For images such as the mountain group and the birds group, where the content is continuous, the result is acceptable.
Hello, thanks for your source code.
In the output images, what is the meaning of G(z_opt).png and fake_sample.png?
Which one is the output from the net?
Hello and thanks for your work.
Is it possible to list the exact dependencies for running the scripts? Wrong dependency versions may lead to wrong results. For example using the latest Pytorch version (1.3.0) led to the following warning message:
UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
Listing dependencies can be done by running pip freeze and copying the content, or better, by running pip freeze > requirements.txt and committing the file.
Thanks in advance.
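The same information can also be collected from inside Python. A minimal sketch using the standard library's importlib.metadata (Python 3.8+); the function name freeze_lines is made up for illustration, and the output only approximates what pip freeze prints:

```python
from importlib.metadata import distributions


def freeze_lines():
    """Return sorted 'name==version' strings for installed distributions."""
    lines = set()
    for dist in distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.version
        except Exception:
            # Skip distributions whose metadata cannot be read.
            continue
        if name and version:
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    # Redirect this output to a file to get a requirements-style listing.
    print("\n".join(freeze_lines()))
```

Running it inside the environment used for training records the exact package versions alongside any results.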
I cannot figure out what this function is doing. What does the parameter scale2stop mean?
Does it support 1024px?
Can I set --max_size=1024?
Quoting the paper: "Because the generators are fully convolutional, we can generate images of arbitrary size and aspect ratio at test time (by changing the dimensions of the noise maps)." Do I understand correctly that the mentioned dimension of the noise maps is driven by nc_z? I tried changing this parameter to a greater value (10), but got an exception:
File "projects/python/ai/SinGAN/SinGAN/manipulate.py", line 132, in SinGAN_generate
z_in = noise_amp*(z_curr)+I_prev
RuntimeError: The size of tensor a (10) must match the size of tensor b (3) at non-singleton dimension 1
Could someone please clarify this for me? My model is trained using the default nc_z, because the paper says I can increase the dimensions of the noise maps only at the test stage to generate larger samples.
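One possible reading of the error, offered as an assumption rather than as the authors' answer: the traceback complains about dimension 1, which in the usual (batch, channels, height, width) layout is the channel axis, so nc_z appears to control the number of noise channels rather than the spatial size. The sketch below uses plain tuples (no PyTorch) and a hypothetical helper, first_mismatch, to show which shapes can be added element-wise.

```python
def first_mismatch(shape_a, shape_b):
    """Return the first axis where two same-rank shapes cannot be added
    element-wise (sizes differ and neither is 1), or None if compatible."""
    if len(shape_a) != len(shape_b):
        raise ValueError("this sketch only compares shapes of equal rank")
    for axis, (a, b) in enumerate(zip(shape_a, shape_b)):
        if a != b and a != 1 and b != 1:
            return axis
    return None


# Illustrative shapes in (batch, channels, height, width) order.
image = (1, 3, 64, 64)
more_channels = (1, 10, 64, 64)   # more channels, same height and width
larger_noise = (1, 3, 128, 96)    # same channels, larger height and width
larger_image = (1, 3, 128, 96)

print(first_mismatch(more_channels, image))        # conflict on the channel axis
print(first_mismatch(larger_noise, larger_image))  # compatible
```

Under this reading, "changing the dimensions of the noise maps" would mean changing their height and width while keeping the channel count the model was trained with.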
What is the meaning of the Effective Patch Size in Figure 4 in your paper?
I've been trying to get paint2image and super scaling working but the code seems not to be complete.
Do you have a version of this code base with these features available?
I am having problems with running the SR.py:
(sinGAN) B:\Program_Files\Anaconda3\envs\sinGAN\SinGAN-master>python SR.py --input_name TEST_12_variant.png --out R:\
Random Seed: 1708
4.000000
Traceback (most recent call last):
File "SR.py", line 64, in
out = SinGAN_generate(Gs_sr, Zs_sr, reals_sr, NoiseAmp_sr, opt, in_s=reals_sr[0], num_samples=1)
File "B:\Program_Files\Anaconda3\envs\sinGAN\SinGAN-master\SinGAN\manipulate.py", line 131, in SinGAN_generate
z_in = noise_amp*(z_curr)+I_prev
RuntimeError: The size of tensor a (276) must match the size of tensor b (282) at non-singleton dimension 3
Windows 10, conda version : 4.7.10, GTX 1080Ti, driver version: 26.21.14.3160
Is it possible to train up to a certain scale, such as 250px width, and then use the trained model at inference time to scale to 480px width or higher?
I tried to generate random samples:
python random_samples.py --input_name mountain.jpeg --mode random_samples_arbitrary_sizes --scale_h 480 --scale_v 600
But I get an out-of-memory error. So if it is possible, how can I generate higher-resolution images without running out of memory?
The same question applies to edit and harmonize: if I train a model on a small-resolution image, can I upscale the output of edit/harmonize?
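A rough sketch of why this command may run out of memory, under the assumption (not confirmed in this thread) that --scale_h and --scale_v are multiplicative scaling factors applied to the trained image size rather than target sizes in pixels. The 250 x 188 training size and the helper name scaled_output are illustrative.

```python
def scaled_output(width, height, scale_h, scale_v, channels=3, bytes_per_value=4):
    """Return (out_width, out_height, bytes) for one float32 image tensor."""
    out_w = round(width * scale_h)
    out_h = round(height * scale_v)
    return out_w, out_h, out_w * out_h * channels * bytes_per_value


if __name__ == "__main__":
    # Treating 480 and 600 as factors on a 250 x 188 image (illustrative size):
    w, h, n_bytes = scaled_output(250, 188, scale_h=480, scale_v=600)
    print(w, h, f"{n_bytes / 2**30:.0f} GiB for a single tensor")

    # Reaching a width of about 480 px from 250 px needs a factor near 1.92:
    print(scaled_output(250, 188, scale_h=1.92, scale_v=1.92)[:2])
```

If that assumption holds, small factors such as 1.92 would be the way to request a 480px-wide sample from a 250px-wide model.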
Is it possible to train a neural network on a group of similar images rather than on a single image?
When I run python random_samples.py in the terminal, no random samples are generated: the output directory Output/random_samples/mountains/gen_start_scale=0 is empty. No error is reported at runtime. What is going wrong?
Is it possible to continue scaling and training if the training stopped midway?
Or do I have to remove the training folder and start from the beginning?
the syntax of the command is incorrect
According to your supplementary material, when generating 256 * 256 images on a single 1080ti GPU, training takes about 30 minutes and generating each image takes less than one second. But when I run main_train.py to train on a 224 * 224 image, training actually takes nearly two hours. Why? My training image is 224 * 224. My device is a 2080ti with cuda9.0, pytorch1.1.0 and python3.6. I've also tested on a 1080ti, where training takes even longer.
Looking forward to your reply.
python main_train.py --input_name 33039_LR.png
('Random Seed: ', 5309)
functions.adjust_scales2image(real, opt)
in adjust_scales2image
opt.num_scales = int((math.log(math.pow(opt.min_size / (real_.shape[2]), 1), opt.scale_factor_init))) + 1
ValueError: math domain error
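One plausible cause, stated as an assumption: the tuple-style output ('Random Seed: ', 5309) is what Python 2 prints for print(a, b), and under Python 2 the expression opt.min_size / real_.shape[2] is an integer division that yields 0 whenever the image is wider than min_size, so math.log receives 0. The Python 3 sketch below reproduces that effect with //. The values 25, 250 and 0.75 are illustrative, and num_scales is a stand-in, not the repository function.

```python
import math


def num_scales(min_size, width, scale_factor_init, python2_division=False):
    """Mirror the failing expression; python2_division emulates int / int in Python 2."""
    ratio = min_size // width if python2_division else min_size / width
    return int(math.log(math.pow(ratio, 1), scale_factor_init)) + 1


if __name__ == "__main__":
    print(num_scales(25, 250, 0.75))  # true division: a positive scale count

    try:
        num_scales(25, 250, 0.75, python2_division=True)
    except ValueError as err:
        print("ValueError:", err)  # the logarithm of 0 is undefined
```

If this is the cause, running the script with Python 3 should avoid the error.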
Hello, your SinGAN is very attractive. I am very interested in SR, but I have some questions about the paper and the procedure that I hope you can answer.
Hi,
it shows: python37\lib\site-packages\torch\jit\frontend.py
in build_subscript: raise NotSupportedError(base.range(), "slicing multiple dimensions at the same time isn't supported yet")
I do not know how to fix that.
thanks~
The original paper mentions a residual connection in the generator, and I guess that is probably why you used tanh instead of sigmoid at the generator output. But the code in the file SinGAN/models.py does not seem to contain any residual connections.
Dear tamarott!
First of all, congratulations on your paper winning the best paper award at ICCV 2019. I recently read your paper and ran into some problems at runtime, so I would like to ask you about them. I first ran python main_train.py --input_name huge.jpg without problems, but the image generated starting from scale 0 is distorted, so I wanted to generate from a finer scale. I then ran a second command, python random_samples.py --input_name huge.jpg --mode random_samples --gen_start_scale 2, but the output says that the model already exists. Later I switched to another file name (still an image), and in the end it still says the model already exists. May I ask why? Thank you very much! Wish you a happy life!
Training constantly stuck at [1999/2000]
such as "scale 7:[1999/2000]" or "scale 9:[1999/2000]".
Can't interrupt it even if I used ctrl+c, it's totally dead.
I used a mountain picture, resized to same size as one of your sample image.
I'm using :
python 3.6.8
torch 1.3.0
GPU rtx2080ti
NVIDIA Driver 419.35
CUDA 10.1
SIFID/sifid_score.py:262: RuntimeWarning: Mean of empty slice.
I trained the network for 7 scales, 2000 epochs each, and I force-stopped the training when scale 8 started.
I tried this command:
!python random_samples.py --input_name "file_1.jpg" --mode random_samples --gen_start_scale 2
Output/RandomSamples/file_1/gen_start_scale=2
and it shows: ModuleNotFoundError: No module named 'torch'
I would like to know about the procedure to create the painting.
For example, the mountains.png, in your results:
How did you know that it was necessary to paint the border of the mountain green instead of keeping it grey?
I tried to replicate the results, but since the above drawing was not available, I cropped from the above image (and erased the black arrow). Also, I painted some mountains to see the results, as shown below:
See pdf for better resolution.
Note the difference in results, mainly with the second column with the first figure (your results).
The best result came from inserting the original image as the paint.
What was the procedure used to paint the images from your results?
Thanks!
Congratulations on your paper winning the best paper award at ICCV 2019!
When I ran your code, I passed the path to my image and was told that the file could not be found.
Later I realized I should pass only the file name and put the image in /Input/Images/.
However, I then got the "trained model already exist" message.
I read the source code, found that the output path had already been created, and deleted it.
Finally, I could run it normally.
When trying to use random_samples, I get results that appear very similar to the input at most generation scales.
I'm not sure if this is expected or not.
Also, maybe that's just a bad test image / pattern for this type of use? I guess mentally I was thinking it would rearrange the apples or something. It does sort of do that when using the other function; I posted an image here: #2 (comment)
I don't know if this is useful or not but I tossed the models and stuff into folder here:
https://drive.google.com/open?id=1xe1nbplb0O2hGOt5Y4sFLcjm0UfLtodf
I am able to train and generate interesting images at scale = 0 (lowest resolution). But images with scale > 1 look almost exactly like the original training image, with only very slight perturbations. My original training image is 900x600 in size. Can someone suggest how to generate images of similar resolution that look different?
I am afraid that some code is wrong in SinGAN/functions.adjust_scales2image.
Lines 204 and 205 seem to repeat lines 198 and 199.
Should opt.scale_factor_init in line 199 be replaced by opt.scale_factor, or should lines 204 and 205 just be deleted?
Looking forward to your reply.
Hello
I have a concern about the sampling of the fixed noise z* used in the reconstruction loss (training.py, line 97) during the training of the first generator/discriminator pair.
If I understand the paper correctly, that noise should be sampled once and kept fixed.
It appears that the noise is sampled multiple times (once per epoch) during the training of the first generator/discriminator pair.
Thank you for your help
First of all, thank you for your excellent work and congratulations on your paper winning the best paper award in ICCV2019. I'm very interested in your work. But I have some problems running your code. I hope you can help me.
What is the intended use of [--min_size MIN_SIZE] and [--max_size MAX_SIZE] in random_samples.py? Setting min_size larger than 256 seems to impact image generation but does not result in larger images.
I am having problems with running the SR.py:
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 32, 1, 1])
And I got some answers on the web:
Most likely you have an nn.BatchNorm layer somewhere in your model, which expects more than 1 value to calculate the running mean and std of the current batch.
How to solve this problem?
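The quoted explanation can be made concrete with a little arithmetic. Batch normalization in training mode computes its statistics over the batch and spatial axes of each channel, so it needs more than one value per channel; an input of size [1, 32, 1, 1] provides exactly one. The helper below is hypothetical and only counts values; it does not call PyTorch.

```python
from math import prod


def values_per_channel(shape):
    """Number of values averaged per channel for an (N, C, *spatial) input:
    everything except the channel axis."""
    n, _channels, *spatial = shape
    return n * prod(spatial)


if __name__ == "__main__":
    print(values_per_channel((1, 32, 1, 1)))  # a single value: statistics are undefined
    print(values_per_channel((1, 32, 4, 4)))  # enough values for a mean and std
```

So the failure suggests that the feature map reaching the batch-norm layer has shrunk to 1 x 1, which points at an input that is too small at that scale.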