New modalities about diffusers HOT 10 CLOSED

huggingface avatar huggingface commented on May 21, 2024 1
New modalities

from diffusers.

Comments (10)

richardrl avatar richardrl commented on May 21, 2024 3

@anton-l @patrickvonplaten Thanks for your input thus far.

I took the latest commit (as of this moment) and made a minimal reproduction of a 1D MLP model and its training.

I had to make an additional modification to pipeline_ddpm.py to support noise samples of the right shape.

bimodal_testt.zip

    python3 train_unconditional_gaussian_test.py

This runs a test on a bimodal Gaussian distribution with modes centered at +33 and -33, each with low variance.

It does not seem to capture the -33 mode after an epoch or two. I am running the training overnight to see what happens.

You're welcome to try running this to see whether I did anything wrong.
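For readers without the attached zip, the test setup described above can be approximated as follows. This is only a sketch, not the code from the attachment: the function names, the standard deviation of 1.0, and the tolerance of 5.0 are assumptions chosen for illustration. The second helper gives a quick way to check whether generated samples cover both modes.

```python
import numpy as np

def sample_bimodal(n, center=33.0, std=1.0, seed=0):
    """Draw n scalars from an equal-weight mixture of N(+center, std^2) and N(-center, std^2).

    `std=1.0` is an assumed stand-in for "low variance"; the thread gives no value.
    """
    rng = np.random.default_rng(seed)
    signs = rng.choice(np.array([-1.0, 1.0]), size=n)  # pick a mode for each sample
    return signs * center + std * rng.standard_normal(n)

def mode_fractions(samples, center=33.0, tol=5.0):
    """Return the fraction of samples within `tol` of +center and of -center.

    A mode the model failed to learn shows up as a fraction near 0.
    """
    samples = np.asarray(samples, dtype=float)
    near_pos = np.abs(samples - center) < tol
    near_neg = np.abs(samples + center) < tol
    return float(near_pos.mean()), float(near_neg.mean())

data = sample_bimodal(10_000)
pos_frac, neg_frac = mode_fractions(data)
print(f"near +33: {pos_frac:.2f}, near -33: {neg_frac:.2f}")
```

Running `mode_fractions` on samples drawn from the trained pipeline, rather than on the training data, would make the "missing -33 mode" symptom measurable instead of something read off a plot.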

patrickvonplaten avatar patrickvonplaten commented on May 21, 2024 2

Taking a look at the function as is:

    def training_step(self, original_samples: torch.Tensor, noise: torch.Tensor, timesteps: torch.Tensor):
        if timesteps.dim() != 1:
            raise ValueError("`timesteps` must be a 1D tensor")

        device = original_samples.device
        batch_size = original_samples.shape[0]
        timesteps = timesteps.reshape(batch_size, 1, 1, 1)

        sqrt_alpha_prod = self.alphas_cumprod[timesteps] ** 0.5
        sqrt_one_minus_alpha_prod = (1 - self.alphas_cumprod[timesteps]) ** 0.5
        noisy_samples = sqrt_alpha_prod.to(device) * original_samples + sqrt_one_minus_alpha_prod.to(device) * noise
        return noisy_samples

Note that the input can be either a torch tensor or a NumPy array, so this should be changed.

Also, there shouldn't be any .to(device) statements, nor any framework- or modality-specific .reshape(...) operations.

I'd be in favor of implementing framework-specific functions (one for PT, one for TF) called def extract(....) in SchedulerMixin that have if framework == "pt" statements in them. Also note that we shouldn't assume we know the number of dimensions of the input original_samples.
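To make the shape concern concrete, here is a small NumPy sketch of what the hard-coded `reshape(batch_size, 1, 1, 1)` does to non-image input. It is a transcription of the pattern in the function above, not the library code itself; the linear beta schedule and the function name `add_noise_hardcoded_4d` are assumptions used only to have something to index into.

```python
import numpy as np

# Stand-in for a scheduler's cumulative alpha products (assumed linear beta schedule).
betas = np.linspace(1e-4, 0.02, 1000)
alphas_cumprod = np.cumprod(1.0 - betas)

def add_noise_hardcoded_4d(original_samples, noise, timesteps):
    """NumPy transcription of the reshape pattern above; implicitly assumes (B, C, H, W) input."""
    batch_size = original_samples.shape[0]
    timesteps = timesteps.reshape(batch_size, 1, 1, 1)
    sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5
    sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5
    return sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise

rng = np.random.default_rng(0)
timesteps = rng.integers(0, 1000, size=8)

# Image-like input: the (8, 1, 1, 1) coefficients broadcast as intended.
images = rng.standard_normal((8, 3, 4, 4))
noisy_images = add_noise_hardcoded_4d(images, rng.standard_normal(images.shape), timesteps)
print(noisy_images.shape)  # (8, 3, 4, 4)

# 1D feature vectors: no error is raised, but the output has the wrong shape.
vectors = rng.standard_normal((8, 2))
noisy_vectors = add_noise_hardcoded_4d(vectors, rng.standard_normal(vectors.shape), timesteps)
print(noisy_vectors.shape)  # (8, 1, 8, 2)
```

The second case is the troublesome one: broadcasting silently pairs every sample with every timestep instead of failing, so the bug can surface later as a shape mismatch or as poor training results.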

patrickvonplaten avatar patrickvonplaten commented on May 21, 2024 1

@anton-l - we need to make sure that training_step is both framework agnostic and shape agnostic

patrickvonplaten avatar patrickvonplaten commented on May 21, 2024 1

BTW, it's super nice to get all your feedback here @richardrl - thanks a lot!

patil-suraj avatar patil-suraj commented on May 21, 2024

Thanks for reporting @richardrl !
Indeed the plan is to support multiple modalities, but we haven't yet tested the schedulers with 1D data.

cc @patrickvonplaten @anton-l

anton-l avatar anton-l commented on May 21, 2024

Now we use match_shape(timesteps, original_samples) for everything, which is framework- and shape-agnostic: https://github.com/huggingface/diffusers/blob/main/src/diffusers/schedulers/scheduling_ddpm.py#L146
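The idea behind a shape-agnostic helper can be sketched as below. This is an illustration of the technique, not the implementation at the linked line: `match_shape_sketch` and `add_noise_sketch` are hypothetical names. The helper relies only on `.reshape`, `.ndim`, and `.shape`, which NumPy arrays and torch tensors both provide, but it is exercised here with NumPy only and does not handle device placement.

```python
import numpy as np

def match_shape_sketch(values, broadcast_array):
    """Flatten per-sample values, then append trailing singleton axes
    until they have as many dimensions as `broadcast_array`."""
    values = values.reshape(-1)
    extra_dims = broadcast_array.ndim - values.ndim
    return values.reshape(tuple(values.shape) + (1,) * extra_dims)

def add_noise_sketch(alphas_cumprod, original_samples, noise, timesteps):
    """Forward noising step that makes no assumption about the rank of the input."""
    sqrt_alpha_prod = match_shape_sketch(alphas_cumprod[timesteps] ** 0.5, original_samples)
    sqrt_one_minus_alpha_prod = match_shape_sketch(
        (1 - alphas_cumprod[timesteps]) ** 0.5, original_samples
    )
    return sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise

# Assumed linear beta schedule, used only to have coefficients to index into.
alphas_cumprod = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))
rng = np.random.default_rng(0)
timesteps = rng.integers(0, 1000, size=8)

# Scalars per sample, feature vectors, and image-like tensors all keep their shape.
for shape in [(8,), (8, 2), (8, 3, 4, 4)]:
    samples = rng.standard_normal(shape)
    noisy = add_noise_sketch(alphas_cumprod, samples, rng.standard_normal(shape), timesteps)
    print(shape, "->", noisy.shape)
```

Appending axes based on the rank of the input, rather than reshaping to a fixed `(batch_size, 1, 1, 1)`, is what lets the same code path serve 1D data and images.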

github-actions avatar github-actions commented on May 21, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

patrickvonplaten avatar patrickvonplaten commented on May 21, 2024

@anton-l as you've re-opened the issue -> are you planning on doing something with it?

github-actions avatar github-actions commented on May 21, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

anton-l avatar anton-l commented on May 21, 2024

I think we have support for all shapes now, agreed with the stalebot :)
