bubbliiiing / ddpm-pytorch


This is a PyTorch repository for DDPM that can be used to train on your own dataset.

License: MIT License

Python 100.00%

ddpm-pytorch's Introduction

Hi, nice to meet you 👋

🧡 Focusing on Deep Learning
🔨 Reproducing all kinds of deep learning algorithms, mainly for images; an outsider when it comes to NLP
🍬 Looking forward to a trip to the seaside
🥩 Want to eat, but want to lose weight even more
📯 My Bilibili channel: https://space.bilibili.com/472467171
📚 My CSDN blog: https://blog.csdn.net/weixin_44791964
🍱 My Zhihu: https://www.zhihu.com/people/bubbliiiing
📜 My WeChat Official Account: Bubbliiiing的深度学习小课堂

ddpm-pytorch's Issues

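Resuming training

To resume training, is it enough to set diffusion_model_path to the .pth file produced by the previous run? After changing it, training still seems to start from scratch.

My training set has just over 100 images, and the colors of the test outputs differ too much from the originals. Is this because the dataset is too small?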

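Loss suddenly increases during training

A question for the author: I am using the LSUN dataset with an initial learning rate of 2e-04 and the Adam optimizer. Why does the loss suddenly increase partway through training?
Looking forward to your reply.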

segmentation fault (core dumped)

Has anyone run into this error during training? Partway through training it fails with: segmentation fault (core dumped)...

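About model_path

Question 1: after training finishes, nothing appears after "show result" (first screenshot). What could cause this?
Question 2: after training I wanted to load the weights from logs, but logs contains only the three files shown in the second screenshot. I assume the right one is the second file (but it is not in .pth format; why is that?). I loaded it and ran predict.py, but it raised an error (third screenshot). What could be the cause?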

Hi, I saved the Gaussian noise used in the forward noising process and reused it during denoising, but the final image is nothing but noise. What is going wrong? My code is below. I am a fan of yours on Bilibili.
`import numpy as np
import torch
from PIL import Image
import os

def preprocess_input(x):
    x /= 255
    x -= 0.5
    x /= 0.5
    return x

def cvtColor(image):
    if len(np.shape(image)) == 3 and np.shape(image)[2] == 3:
        return image
    else:
        image = image.convert('RGB')
        return image

def postprocess_output(x):
    x *= 0.5
    x += 0.5
    x *= 255
    return x

def extract(a, t, x_shape):
    b, *_ = t.shape
    out = a.gather(-1, t)
    return out.reshape(b, *((1,) * (len(x_shape) - 1)))

def perturb_x(sqrt_alphas_cumprod, x, t, noise, sqrt_one_minus_alphas_cumprod):
    return (
        extract(sqrt_alphas_cumprod, t, x.shape) * x +
        extract(sqrt_one_minus_alphas_cumprod, t, x.shape) * noise
    )

def remove_noise(remove_noise_coeff, noise, reciprocal_sqrt_alphas, x, t, use_ema=False):
    if use_ema:
        return (
            (x - extract(remove_noise_coeff, t, x.shape) * noise) *
            extract(reciprocal_sqrt_alphas, t, x.shape)
        )
    else:
        return (
            (x - extract(remove_noise_coeff, t, x.shape) * noise) *
            extract(reciprocal_sqrt_alphas, t, x.shape)
        )

num_timesteps = 100
save_path = "tmp.jpg"
if not os.path.exists("original_pic"):
    os.makedirs("original_pic")
if not os.path.exists("after_pic"):
    os.makedirs("after_pic")
image = Image.open("0_clean.png")
image = cvtColor(image).resize([128, 128], Image.BICUBIC)
image = np.array(image, dtype=np.float32)
image = np.transpose(preprocess_input(image), (2, 0, 1))
x = torch.from_numpy(np.array(image, np.float32))
x = x[None,:,:,:]
betas = torch.linspace(start=0.0001, end=0.02, steps=1000)
alphas = 1 - betas
alphas_cumprod = torch.cumprod(alphas,dim=0)
sqrt_alphas_cumprod = torch.sqrt(alphas_cumprod)
sqrt_one_minus_alphas_cumprod = torch.sqrt(1 - alphas_cumprod)
reciprocal_sqrt_alphas = torch.sqrt(1 / alphas)
remove_noise_coeff = betas / torch.sqrt(1 - alphas_cumprod)
sigma = torch.sqrt(betas)

# Keep the epsilon from the noising process so it can be used for restoration in the next stage
epsilon_list = []
for t in range(num_timesteps):
    t = torch.tensor([t])
    epsilon = torch.randn_like(x)
    epsilon_list.append(epsilon)
    x_t = perturb_x(sqrt_alphas_cumprod, x, t, epsilon, sqrt_one_minus_alphas_cumprod)
    tmp1 = x_t.clone()
    test_images = postprocess_output(tmp1[0].cpu().data.numpy().transpose(1, 2, 0))
    Image.fromarray(np.uint8(test_images)).save(os.path.join("original_pic", str(t) + ".png"))

# x_t used as the starting sample of the denoising process
x = x_t
#x = torch.randn((1, 3, 128, 128))
for t in range(num_timesteps - 1, -1, -1):
    t_batch = torch.tensor([t]).repeat(1)
    x = remove_noise(remove_noise_coeff, epsilon_list[t], reciprocal_sqrt_alphas, x, t_batch)
    if t > 0:
        x += extract(sigma, t_batch, x.shape) * torch.randn_like(x)
    tmp = x.clone()
    test_images = postprocess_output(tmp[0].cpu().data.numpy().transpose(1, 2, 0))
    Image.fromarray(np.uint8(test_images)).save(os.path.join("after_pic", str(t) + ".png"))

test_images = postprocess_output(x[0].cpu().data.numpy().transpose(1, 2, 0))
Image.fromarray(np.uint8(test_images)).save(save_path)`
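
A likely explanation, offered as a sketch rather than a confirmed diagnosis: the reverse step expects the noise that the current x_t contains relative to x_0 (this is what the trained network predicts), whereas the snippet above draws an independent epsilon for every t in the closed-form forward jump and then replays those draws. The plain-Python toy below uses a handful of scalar "pixels" instead of tensors and compares both choices under the same linear beta schedule. The helper names (`forward`, `implied_noise`, `reverse_step`, `run`) are hypothetical and not part of the repository, and the oracle variant uses x_0 directly, which is only possible in a toy setting.

```python
import math
import random

random.seed(0)  # fixed seed so repeated runs match

# Linear beta schedule with the same endpoints as the snippet above.
T = 1000
betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]
alphas = [1.0 - b for b in betas]
alphas_cumprod = []
running = 1.0
for a in alphas:
    running *= a
    alphas_cumprod.append(running)


def forward(x0, t, eps):
    """Closed-form q(x_t | x_0): noise x_0 directly to step t."""
    s = math.sqrt(alphas_cumprod[t])
    n = math.sqrt(1.0 - alphas_cumprod[t])
    return [s * v + n * e for v, e in zip(x0, eps)]


def implied_noise(x_t, x0, t):
    """Noise that the *current* x_t contains relative to x_0.

    This is the quantity a trained eps_theta(x_t, t) tries to predict.
    """
    s = math.sqrt(alphas_cumprod[t])
    n = math.sqrt(1.0 - alphas_cumprod[t])
    return [(xt - s * v) / n for xt, v in zip(x_t, x0)]


def reverse_step(x_t, eps, t):
    """One DDPM reverse step: the remove_noise formula plus sigma_t * z for t > 0."""
    coeff = betas[t] / math.sqrt(1.0 - alphas_cumprod[t])
    scale = 1.0 / math.sqrt(alphas[t])
    out = [(xt - coeff * e) * scale for xt, e in zip(x_t, eps)]
    if t > 0:
        sigma = math.sqrt(betas[t])
        out = [o + sigma * random.gauss(0.0, 1.0) for o in out]
    return out


def run(x0, num_steps, use_saved_noise):
    """Noise x0 to the last step, denoise back to step 0, return the max abs error."""
    # Independent draws per step, as in the snippet above.
    saved = [[random.gauss(0.0, 1.0) for _ in x0] for _ in range(num_steps)]
    x = forward(x0, num_steps - 1, saved[-1])
    for t in range(num_steps - 1, -1, -1):
        eps = saved[t] if use_saved_noise else implied_noise(x, x0, t)
        x = reverse_step(x, eps, t)
    return max(abs(a - b) for a, b in zip(x, x0))


x0 = [-0.9, -0.6, -0.3, 0.0, 0.2, 0.5, 0.7, 1.0]
err_saved = run(x0, 100, use_saved_noise=True)
err_oracle = run(x0, 100, use_saved_noise=False)
print(f"saved forward noise : max error {err_saved:.4f}")
print(f"noise implied by x_t: max error {err_oracle:.2e}")
```

With the implied noise, the last step (t = 0) maps back to x_0 up to floating-point error, because the coefficient there reduces to sqrt(beta_0). With the replayed draws, each step removes noise that is unrelated to what x_t actually contains, so the error does not shrink. In real sampling x_0 is unknown, which is why the network's prediction on the current x_t is used instead.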

The download link for Diffusion_Flower.pth returns 404

The download link for Diffusion_Flower.pth returns 404. Could the author update it? Thanks.

FileNotFoundError: [Errno 2] No such file or directory: 'D:\\graduation'. The error is raised in Image.py in PIL

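Can grayscale images be used for training?

I saw def cvtColor(image) in utils.py, but that function seems to be used only at prediction time, so I am curious whether training can use a grayscale dataset. If it can, which parts need to change? (Even a rough answer would be much appreciated.)

What does the image output by the show_result() function mean?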

ddpm-pytorch/utils/utils.py

Line 36 in 6f3369c

def show_result(num_epoch, net, device):

I see a 2x2 image in the output folder. Are these four images the results generated from four random noise samples? Many thanks~

About the torch version

With torch 1.2.0 as specified in requirement.txt, I get module 'torch.nn.functional' has no attribute 'silu'. How can this be fixed?
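
For context: to the best of my knowledge `silu` was added to `torch.nn.functional` in PyTorch 1.7, so 1.2.0 would not have it, and upgrading torch is the simplest route. If upgrading is not possible, the activation itself is just x * sigmoid(x), which in PyTorch can be written as `x * torch.sigmoid(x)`. The plain-Python sketch below only illustrates the formula on scalars; the function name `silu` here is a stand-in, not the repository's code.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x), written to avoid overflow.

    Tensor equivalent in PyTorch: x * torch.sigmoid(x)
    """
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe for negative x: exp(x) is in (0, 1)
    return x * z / (1.0 + z)


for value in (-2.0, 0.0, 2.0):
    print(value, silu(value))
```

A drop-in fallback under this assumption would replace each `F.silu(x)` call with `x * torch.sigmoid(x)`.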

Why do the generated images contain many colored noise dots after training for 1000 epochs?
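
One thing worth ruling out, as a guess rather than a confirmed cause: if sampled values fall slightly outside [-1, 1] and are mapped to [0, 255] and cast to 8-bit without clipping (the `postprocess_output` plus `np.uint8` pattern quoted in another issue on this page does no clipping), out-of-range pixels can end up with arbitrary values and show as isolated colored dots. The sketch below uses plain Python scalars; `to_pixel` is a hypothetical helper, and whether the repository's own sampling code already clips is not verified here.

```python
def to_pixel(value: float) -> int:
    """Map a model output in the nominal range [-1, 1] to an 8-bit pixel value."""
    scaled = (value * 0.5 + 0.5) * 255.0      # same affine map as postprocess_output
    clamped = min(max(scaled, 0.0), 255.0)    # clip before converting to an integer
    return int(clamped)
    # NumPy equivalent for an array x: np.clip(x, 0, 255).astype(np.uint8)


samples = [-1.3, -1.0, 0.0, 1.0, 1.2]
pixels = [to_pixel(v) for v in samples]
print(pixels)  # [0, 0, 127, 255, 255]
```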

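Which code is the test code based on to restore the image? I could not find it in the code.

Why are the colors of the four test images different?

This is the data I used for training

These are the test results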

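Error during a run

After running for n epochs, I get TypeError: 'Tensor' object is not iterable. Has anyone else encountered this?
It is raised by optimizer.step() in utils_fit.py, but training ran for that many epochs without problems. Why would it suddenly fail?

This code generates 32x32 images. If I want to generate 128x128, do I only need to change the input size in train.py and diffusion.py?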

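The weights are gone

Could you update the GitHub link for the weights (it no longer works)? Baidu Netdisk is too slow.

Why do the colors of the generated images differ so much from the originals?

The question is as in the title.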

Error in DDP mode

Hello, I get an error in DDP mode on Linux with PyTorch 1.13. Is there a way to fix this?

Epoch:1/1000
Total_loss: 0.5074
Show_result:
Traceback (most recent call last):
File "train.py", line 246, in <module>
fit_one_epoch(diffusion_model_train, diffusion_model, loss_history, optimizer,
File "/mnt/disk86/cyc/ddpm-pytorch-master/utils/utils_fit.py", line 58, in fit_one_epoch
show_result(epoch + 1, diffusion_model, images.device)
File "/mnt/disk86/cyc/ddpm-pytorch-master/utils/utils.py", line 37, in show_result
test_images = net.sample(4, device)
File "/home/cyc/anaconda3/envs/stylegan3/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/mnt/disk86/cyc/ddpm-pytorch-master/nets/diffusion.py", line 107, in sample
x = self.remove_noise(x, t_batch, y, use_ema)
File "/home/cyc/anaconda3/envs/stylegan3/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/mnt/disk86/cyc/ddpm-pytorch-master/nets/diffusion.py", line 89, in remove_noise
(x - extract(self.remove_noise_coeff, t, x.shape) * self.ema_model(x, t, y)) *
File "/mnt/disk86/cyc/ddpm-pytorch-master/nets/diffusion.py", line 12, in extract
out = a.gather(-1, t)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_gather)
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 78549 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 78550 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 78551 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 78548) of binary: /home/cyc/anaconda3/envs/stylegan3/bin/python
Traceback (most recent call last):
File "/home/cyc/anaconda3/envs/stylegan3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/cyc/anaconda3/envs/stylegan3/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/cyc/anaconda3/envs/stylegan3/lib/python3.8/site-packages/torch/distributed/launch.py", line 195, in <module>
main()
File "/home/cyc/anaconda3/envs/stylegan3/lib/python3.8/site-packages/torch/distributed/launch.py", line 191, in main
launch(args)
File "/home/cyc/anaconda3/envs/stylegan3/lib/python3.8/site-packages/torch/distributed/launch.py", line 176, in launch
run(args)
File "/home/cyc/anaconda3/envs/stylegan3/lib/python3.8/site-packages/torch/distributed/run.py", line 753, in run
elastic_launch(
File "/home/cyc/anaconda3/envs/stylegan3/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/cyc/anaconda3/envs/stylegan3/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

train.py FAILED

Failures:
<NO_OTHER_FAILURES>

Root Cause (first observed failure):
[0]:
time : 2023-03-30_10:55:30
host : localhost.localdomain
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 78548)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html