
Comments (8)

sayakpaul commented on June 16, 2024

#6552

from diffusers.

sayakpaul commented on June 16, 2024

Thanks for bringing this up. Could you show a comparison between zeroing out the embeddings the way you mentioned and the existing approach?

I've checked the values of the embeds, and classifier-free guidance at inference time definitely makes use of the zero embed and not just "", and the two end up producing very different results.

The SD IP2P pipeline uses "", though, when a negative prompt is not provided:

However, it makes use of zeros_like for the unconditional image embeddings:

uncond_image_latents = torch.zeros_like(image_latents)
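A minimal sketch of what that line does, assuming a made-up latent shape. The tensor shape and the batching order below are assumptions for illustration only, not taken from the pipeline:

```python
import torch

# Hypothetical latent shape for illustration: (batch, channels, height, width).
image_latents = torch.randn(2, 4, 8, 8)

# zeros_like keeps the shape, dtype, and device of its input, so the
# unconditional branch can be batched together with the conditional one.
uncond_image_latents = torch.zeros_like(image_latents)

# Assumed ordering for this sketch only: conditional first, unconditional second.
batched_latents = torch.cat([image_latents, uncond_image_latents], dim=0)
```

The point is only that the unconditional image input is an all-zero tensor rather than an encoding of some "empty" image.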


bghira commented on June 16, 2024

the base model was trained using it, so i figured aligning with the base model's training and inference would give better results.

from my own tests, i can now reduce the step count required when running the default config on the SDXL pipelines, i.e. with force_zeros_for_empty_prompt set to True

I also see much better learning. this model started from ptx0/terminus-xl-velocity-v1, which was unable to spell.

1000 steps of tuning later:

[image]

the base model was trained using "" and it never really ends up with better CFG performance... but now it does!
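For readers who want to see the idea concretely, here is a rough sketch of training-time conditioning dropout that zeroes the text embeddings instead of encoding "". The function name `drop_text_conditioning`, its parameters, and the tensor shapes are hypothetical and not taken from any diffusers training script:

```python
import torch


def drop_text_conditioning(prompt_embeds, pooled_prompt_embeds, dropout_prob, generator=None):
    """Zero the text conditioning for a random subset of the batch (hypothetical helper)."""
    batch_size = prompt_embeds.shape[0]
    # One uniform draw per sample; True means this sample trains unconditionally.
    drop = torch.rand(batch_size, generator=generator) < dropout_prob
    drop = drop.to(prompt_embeds.device)
    # torch.where yields exact zeros even if the embeddings contain inf or NaN.
    prompt_embeds = torch.where(
        drop.view(-1, 1, 1), torch.zeros_like(prompt_embeds), prompt_embeds
    )
    pooled_prompt_embeds = torch.where(
        drop.view(-1, 1), torch.zeros_like(pooled_prompt_embeds), pooled_prompt_embeds
    )
    return prompt_embeds, pooled_prompt_embeds, drop


# Example call with made-up shapes: (batch, tokens, dim) and (batch, pooled_dim).
embeds = torch.randn(4, 77, 16)
pooled = torch.randn(4, 8)
new_embeds, new_pooled, dropped = drop_text_conditioning(embeds, pooled, dropout_prob=0.1)
```

With this shape of dropout, the unconditional samples seen in training match the zeroed negative embeddings the SDXL pipeline uses at inference when force_zeros_for_empty_prompt is enabled.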


bghira commented on June 16, 2024

see the base SDXL pipeline:

        # get unconditional embeddings for classifier free guidance
        zero_out_negative_prompt = negative_prompt is None and self.config.force_zeros_for_empty_prompt
        if do_classifier_free_guidance and negative_prompt_embeds is None and zero_out_negative_prompt:
            negative_prompt_embeds = torch.zeros_like(prompt_embeds)
            negative_pooled_prompt_embeds = torch.zeros_like(pooled_prompt_embeds)

and the config:

{
   "_class_name": "StableDiffusionXLPipeline",
   "_diffusers_version": "0.19.0.dev0",
   "force_zeros_for_empty_prompt": true
}
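Put together, the snippet and the config flag amount to a choice between zeros and an encoded negative prompt. The following standalone sketch mirrors that decision outside the pipeline; `resolve_negative_embeds` and the `encode_prompt` callable are hypothetical stand-ins, and the fallback to encoding "" is an assumption about the non-zeroed path:

```python
import torch


def resolve_negative_embeds(
    prompt_embeds,
    pooled_prompt_embeds,
    negative_prompt,
    force_zeros_for_empty_prompt,
    encode_prompt,
):
    """Pick unconditional embeddings for classifier-free guidance (hypothetical helper).

    encode_prompt is an assumed callable mapping a string to a
    (prompt_embeds, pooled_prompt_embeds) pair.
    """
    if negative_prompt is None and force_zeros_for_empty_prompt:
        # Same behavior as the pipeline excerpt above: all-zero embeddings.
        return torch.zeros_like(prompt_embeds), torch.zeros_like(pooled_prompt_embeds)
    # Assumed fallback: encode the given negative prompt, or "" if none was given.
    return encode_prompt(negative_prompt or "")
```

Only the first branch is confirmed by the excerpt; the second is there to make the contrast with the "" approach visible.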


sayakpaul commented on June 16, 2024

Ah, that makes a ton of sense. Do you want to take a stab at opening a PR to fix this?


bghira commented on June 16, 2024

can i also open the pull request for all of the other training examples, to add general dropout capabilities to them?


sayakpaul commented on June 16, 2024

We can open that up for the community. This way everyone gets to participate.


bghira commented on June 16, 2024

like the ticket for updating the fp16 error? #6231

