Comments (10)
@anton-l @patrickvonplaten Thanks for your input thus far.
I took the latest commit (as of this moment) and made a minimal reproduction with a 1D MLP model and training script.
I had to make an additional modification to pipeline_ddpm.py to support noise samples of the right shape.
python3 train_unconditional_gaussian_test.py
This runs a test on a bimodal Gaussian distribution with modes centered at +33 and -33, each with low variance.
It does not seem to capture the -33 mode after an epoch or two. I am running the training overnight to see what happens.
You are welcome to try running this to see if there's anything I did wrong.
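For readers who want to reproduce the setup without the original script, here is a rough sketch of the kind of data described above. It is not the code from train_unconditional_gaussian_test.py; the function names, the standard deviation of 1.0, and the (batch, 1) sample shape are all assumptions made for illustration.

```python
import numpy as np

def sample_bimodal(n, mean=33.0, std=1.0, seed=0):
    """Draw n scalar samples from an equal mixture of N(+mean, std^2) and N(-mean, std^2).

    The std value is an assumption; the thread only says "low variance".
    """
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=n)            # pick one mode per sample
    samples = signs * mean + std * rng.standard_normal(n)
    return samples.reshape(n, 1).astype(np.float32)    # (batch, 1), as a 1D MLP would expect

def mode_fractions(samples):
    """Return the fraction of samples below zero and above zero."""
    flat = np.asarray(samples).reshape(-1)
    return float((flat < 0).mean()), float((flat > 0).mean())

data = sample_bimodal(10_000)
neg, pos = mode_fractions(data)
print(f"negative mode: {neg:.2f}, positive mode: {pos:.2f}")
```

Running `mode_fractions` on samples generated by a trained model is one simple way to check whether the -33 mode has been dropped: a healthy model should put roughly half of its samples on each side of zero.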
from diffusers.
Taking a look at the function as is:
def training_step(self, original_samples: torch.Tensor, noise: torch.Tensor, timesteps: torch.Tensor):
    if timesteps.dim() != 1:
        raise ValueError("`timesteps` must be a 1D tensor")
    device = original_samples.device
    batch_size = original_samples.shape[0]
    timesteps = timesteps.reshape(batch_size, 1, 1, 1)
    sqrt_alpha_prod = self.alphas_cumprod[timesteps] ** 0.5
    sqrt_one_minus_alpha_prod = (1 - self.alphas_cumprod[timesteps]) ** 0.5
    noisy_samples = sqrt_alpha_prod.to(device) * original_samples + sqrt_one_minus_alpha_prod.to(device) * noise
    return noisy_samples
Note that the input can be either torch or numpy tensors, so the torch.Tensor type hints should be changed.
Also, there shouldn't be any .to(device) statements, nor any framework- and modality-specific .reshape(...) operation.
I'd be in favor of implementing framework-specific functions (one for PT, one for TF) called def extract(....) in SchedulerMixin that have if framework == "pt" statements in them. Also note that we shouldn't assume we know the dimension of the input original_samples.
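To make the proposal concrete, here is a minimal sketch of what such an extract helper might look like. It is not the diffusers implementation: the signature, the `like` parameter, and the dispatch on tensor type (rather than on a `framework` string) are assumptions, and only NumPy and PyTorch are covered.

```python
import numpy as np

try:
    import torch  # optional; the NumPy path works without it
except ImportError:
    torch = None

def extract(coefficients, timesteps, like):
    """Pick coefficients[timesteps] and reshape to (batch, 1, ..., 1) so it broadcasts against `like`.

    Hypothetical helper for illustration only.
    """
    if torch is not None and isinstance(like, torch.Tensor):
        # PyTorch path: follow the dtype and device of the samples instead of calling .to(device) later.
        coefficients = torch.as_tensor(coefficients, dtype=like.dtype, device=like.device)
        timesteps = torch.as_tensor(timesteps, dtype=torch.long, device=like.device)
        picked = coefficients[timesteps]
        return picked.reshape(-1, *([1] * (like.dim() - 1)))
    # NumPy path: same logic, no device handling needed.
    picked = np.asarray(coefficients)[np.asarray(timesteps)]
    return picked.reshape(-1, *([1] * (np.ndim(like) - 1)))

alphas_cumprod = np.linspace(0.99, 0.01, 10)
t = np.array([0, 9])
print(extract(alphas_cumprod, t, np.zeros((2, 1))).shape)        # (2, 1)
print(extract(alphas_cumprod, t, np.zeros((2, 3, 8, 8))).shape)  # (2, 1, 1, 1)
```

The number of trailing singleton axes comes from the sample tensor itself, so the same call serves 1D data and images without a hard-coded reshape(batch_size, 1, 1, 1).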
@anton-l - we need to make sure that training_step is both framework-agnostic and shape-agnostic.
BTW, it's super nice to get all your feedback here @richardrl - thanks a lot!
Thanks for reporting @richardrl !
Indeed the plan is to support multiple modalities, but we haven't yet tested the schedulers with 1D data.
Now we use match_shape(timesteps, original_samples) for everything, which is framework- and shape-agnostic: https://github.com/huggingface/diffusers/blob/main/src/diffusers/schedulers/scheduling_ddpm.py#L146
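As a rough illustration of the idea (not the code at the link above), the sketch below pads the per-timestep coefficients with trailing singleton axes so the forward-noising formula works for any sample shape. The helper name `pad_dims`, the standalone `add_noise` function, and the linear beta schedule are assumptions, and only NumPy is shown.

```python
import numpy as np

def pad_dims(values, like):
    """Append singleton axes to `values` until it has as many dimensions as `like`.

    Hypothetical stand-in for a shape-matching helper.
    """
    values = np.asarray(values)
    return values.reshape(values.shape + (1,) * (np.ndim(like) - values.ndim))

def add_noise(original_samples, noise, timesteps, alphas_cumprod):
    """Forward diffusion: sqrt(abar_t) * x0 + sqrt(1 - abar_t) * noise, for any sample shape."""
    abar = alphas_cumprod[timesteps]  # shape (batch,)
    sqrt_alpha_prod = pad_dims(abar ** 0.5, original_samples)
    sqrt_one_minus_alpha_prod = pad_dims((1 - abar) ** 0.5, original_samples)
    return sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise

betas = np.linspace(1e-4, 0.02, 1000)      # assumed linear schedule
alphas_cumprod = np.cumprod(1.0 - betas)
rng = np.random.default_rng(0)
timesteps = np.array([0, 500, 999])

# The same function handles 1D data of shape (batch, 1) and image-like data.
for shape in [(3, 1), (3, 3, 8, 8)]:
    x0 = rng.standard_normal(shape)
    noisy = add_noise(x0, rng.standard_normal(shape), timesteps, alphas_cumprod)
    print(shape, "->", noisy.shape)
```

Unlike the earlier training_step, nothing here assumes four dimensions, so the 1D reproduction from the start of the thread would not need a separate code path.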
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@anton-l as you've re-opened the issue -> are you planning on doing something with it?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I think we have support for all shapes now, agreed with the stalebot :)