
dps2022 / diffusion-posterior-sampling

326 stars · 5 watchers · 29 forks · 16.18 MB

Official pytorch repository for "Diffusion Posterior Sampling for General Noisy Inverse Problems"

Home Page: https://dps2022.github.io/diffusion-posterior-sampling-page/

Languages: Python 99.28%, Dockerfile 0.56%, Shell 0.16%
Topics: diffusion-model, inverse-problems, pytorch

diffusion-posterior-sampling's People

Contributors

dps2022, hj-harry, jeongsol-kim

diffusion-posterior-sampling's Issues

Got exception: invalid load key, '<'.

While running the task below (and others), I get the error "Got exception: invalid load key, '<'." but the program proceeds to execute anyway.

python3 sample_condition.py --model_config=configs/model_config.yaml --diffusion_config=configs/diffusion_config.yaml --task_config=configs/gaussian_deblur_config.yaml
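For context, in pickle terms this error means the first byte of the file being loaded is '<', which usually indicates an HTML page (e.g. from an interrupted Google Drive download) saved where a checkpoint was expected. A minimal sanity check, where the checkpoint path is an assumption and not taken from the issue:

# The path below is a placeholder for whichever checkpoint the config loads.
with open("models/ffhq_10m.pt", "rb") as f:
    head = f.read(16)

# A valid PyTorch checkpoint starts with a ZIP magic (b"PK\x03\x04") or a
# pickle opcode; a leading b"<" means an HTML page was saved instead.
print(head)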

Details about pretrained neural network

I am trying to use the pretrained neural network with my own inputs. From my understanding, the output has 6 channels: the first 3 are the mean and the last 3 are the variance. I believe the network is trained with T=1000 steps, so when the input is a clean image at t=999, the output should be almost unchanged. But when I ran it, although the shape of the face and its features are there, the coloring and contrast are completely different. So I am wondering: what are the expected scales of the input and output images? The most realistic output I get is when I scale the input to be between 0 and 1, but even then the output is mostly between -1 and 1.
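For reference, a minimal sketch of the [-1, 1] scaling that guided-diffusion-style models typically expect; the file name and the commented model call are illustrative assumptions, not the repo's exact API:

import numpy as np
import torch
from PIL import Image

# Scale an RGB image from [0, 255] to [-1, 1], the range these UNets are
# usually trained on; "face.png" is a placeholder input.
img = np.asarray(Image.open("face.png").convert("RGB"), dtype=np.float32)
x = torch.from_numpy(img / 127.5 - 1.0).permute(2, 0, 1).unsqueeze(0)

t = torch.tensor([999])  # last index of the T=1000 timesteps
# out = model(x, t)      # assumed: out[:, :3] is the mean, out[:, 3:] the variance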

Reproducing results in the paper

Hi, I am trying to reproduce the results from the paper, but I cannot find exactly which 1k images of the FFHQ and ImageNet datasets were used for the tables in the paper. Can you please clarify the exact split used for comparing DPS with the other methods?
Thank you!

Calculating FID

Hello, thanks for publishing this paper and repo.

I am curious about reproducing the results in the paper. I applied the Gaussian blur model to the first 1,000 images of FFHQ-256 as per Issue #4, but when using torch-fidelity I cannot reproduce the FID numbers. With torch-fidelity's image resizing I get 29.3; without it I get 37.0. Both are far from the paper's value of 44.05.

Could you provide some more details on how to reproduce the numbers in Table 1?
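For reference, a minimal sketch of such an FID computation with torch-fidelity; both directory paths are assumptions standing in for the reconstructions and the first 1k FFHQ-256 images:

import torch_fidelity

# FID between two image folders; the paths below are placeholders.
metrics = torch_fidelity.calculate_metrics(
    input1="results/gaussian_deblur/recon",
    input2="data/ffhq256_first1k",
    cuda=True,
    fid=True,
)
print(metrics["frechet_inception_distance"])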

ModuleNotFoundError: No module named 'models.arch_util'

Hi, I am getting this error while trying to run the code with the nonlinear deblur task. The super-resolution task works fine.

python3 sample_condition.py \                                              
--model_config=configs/model_config.yaml \
--diffusion_config=configs/diffusion_config.yaml \
--task_config=configs/nonlinear_deblur_config.yaml;
Device set to cpu.
Traceback (most recent call last):
  File "sample_condition.py", line 121, in <module>
    main()
  File "sample_condition.py", line 57, in main
    operator = get_operator(device=device, **measure_config['operator'])
  File "/home/ethan/diffusion-posterior-sampling/guided_diffusion/measurements.py", line 32, in get_operator
    return __OPERATOR__[name](**kwargs)
  File "/home/ethan/diffusion-posterior-sampling/guided_diffusion/measurements.py", line 178, in __init__
    self.blur_model = self.prepare_nonlinear_blur_model(opt_yml_path)     
  File "/home/ethan/diffusion-posterior-sampling/guided_diffusion/measurements.py", line 184, in prepare_nonlinear_blur_model
    from bkse.models.kernel_encoding.kernel_wizard import KernelWizard
  File "/home/ethan/diffusion-posterior-sampling/bkse/models/kernel_encoding/kernel_wizard.py", line 3, in <module>
    import models.arch_util as arch_util
ModuleNotFoundError: No module named 'models.arch_util'
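A hedged workaround sketch: kernel_wizard.py imports models.arch_util relative to the bkse submodule root, so that directory has to be on sys.path before the import runs. One way to do that (an assumption, not an official fix) near the top of guided_diffusion/measurements.py:

import os
import sys

# Assumes measurements.py sits in guided_diffusion/ with bkse/ as a sibling
# at the repo root; adjust the relative path if the layout differs.
repo_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, os.path.join(repo_root, "bkse"))

from bkse.models.kernel_encoding.kernel_wizard import KernelWizard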

Paper & implementation differences

Hi,
There are a few differences between the paper and this repository, and it would be wonderful if you could clarify the reasons behind them:

  1. The reported Gaussian-noise experiments in the paper use sigma_y=0.05, and indeed in the config files config['noise']['sigma']=0.05.
    But while the images are stretched from [0,1] to [-1,1], sigma is left unchanged – meaning that in practice the noise is added with std sigma/2 relative to the image range, i.e. y_n is cleaner than the settings reported in the paper.
    This can easily be checked by computing torch.std(y - y_n) after y and y_n are created in sample_condition.py (see the sketch after this list).
  2. The paper defines the step-size scalar as a constant divided by the norm of the gradient (Appendix C.2), meaning the gradient is always normalized before scaling.
    In the code, the constant is defined in config['conditioning']['params']['scale'] and used in PosteriorSampling.conditioning() to scale the gradient, but the gradient is never normalized in the first place (e.g. in PosteriorSampling.grad_and_value()).
    Adding the gradient normalization seems to break the method.
  3. For the Gaussian FFHQ-SRx4 case, Appendix D.1 defines the scale as 1.0, but configs/super_resolution_config.yaml uses 0.3.
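For point 1, a minimal sketch of the check, assuming the images have already been normalized to [-1, 1] as in the repo; the tensor shapes are illustrative:

import torch

sigma = 0.05                               # config['noise']['sigma']
y = torch.rand(1, 3, 256, 256) * 2 - 1     # stand-in clean measurement in [-1, 1]
y_n = y + sigma * torch.randn_like(y)      # noise added with the unscaled sigma

# Prints ~0.05 in [-1, 1] units, i.e. only sigma/2 = 0.025 in [0, 1] units.
print(torch.std(y_n - y))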

Thank you for your time and effort!

Impact of batchsize on performance during sampling

Thanks for your nice work. I tried this method on my own dataset and observed that, during sampling, the model's performance depends on the batch size: a small batch size seems to give better results, but it increases the time spent evaluating the test set. Is this expected, or is there a way to fix it?

Issue with phase retrieval and Poisson noise

Hello, I am trying to reproduce the phase retrieval results with Poisson noise.
I have tried different rates for the noise, but I never get a decent result, while the results with Gaussian noise are good.
Are the hyperparameter values needed for phase retrieval with Poisson noise different from those needed for phase retrieval with Gaussian noise?

Thanks!

Using DDIM sampling method

I am trying to use the DDIM sampling method to decrease the number of sampling steps required. When I change the sampler to ddim in diffusion_config.yaml (and change nothing else) with gaussian_deblur_config, I get an output which is just a black image. Do I have to change some of the other parameters too?
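For reference, a hedged sketch of the relevant diffusion_config.yaml entries, assuming the repo follows guided-diffusion conventions; the timestep_respacing value is an assumption about what typically needs to change along with the sampler:

sampler: ddim                  # changed from ddpm
steps: 1000
noise_schedule: linear
timestep_respacing: "ddim100"  # assumption: DDIM usually also needs respaced steps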
