
SDXL Training (tokencompose issue, closed)

scarbain commented on June 23, 2024
SDXL Training

from tokencompose.

Comments (4)

zwcolin commented on June 23, 2024

Hi Sébastien, our research lab currently uses a non-commercial license for research projects, so our codebase may not be suitable for direct use in commercial products. I'm closing this issue for now. If you have any further questions, feel free to start a new one or reopen this issue. Thank you!


zwcolin commented on June 23, 2024

Hi Sébastien, finetuning SD 1.4 is pretty fast and can be done on a single RTX 3090 (with a batch size of 1, gradient accumulation of 4, and gradient checkpointing).
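For readers new to gradient accumulation, here is a minimal sketch of the idea in plain Python. It is not the project's training code: the one-parameter model, the data, and the learning rate are made up for illustration. It shows that four micro-batches of size 1, each with its gradient scaled by 1/4, give the same update as one batch of 4.

```python
# Toy illustration of gradient accumulation (hypothetical model and data,
# not TokenCompose's training loop).
# Model: y_hat = w * x, loss = (y_hat - y)^2, so d(loss)/dw = 2 * (w*x - y) * x.

def grad(w, x, y):
    """Gradient of the squared error for a single sample."""
    return 2.0 * (w * x - y) * x

def accumulated_step(w, samples, lr, accum_steps):
    """One optimizer step built from `accum_steps` micro-batches of size 1."""
    total = 0.0
    for x, y in samples[:accum_steps]:
        # Scale each micro-batch gradient so the running sum is a mean.
        total += grad(w, x, y) / accum_steps
    return w - lr * total

def full_batch_step(w, samples, lr):
    """Reference: one step on the whole batch at once (mean gradient)."""
    mean_grad = sum(grad(w, x, y) for x, y in samples) / len(samples)
    return w - lr * mean_grad

samples = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w_accum = accumulated_step(0.0, samples, lr=0.01, accum_steps=4)
w_full = full_batch_step(0.0, samples, lr=0.01)
effective_batch_size = 1 * 4  # batch size 1 x 4 accumulation steps
```

The memory saving comes from holding activations for only one sample at a time. Gradient checkpointing is a separate technique that trades extra compute for lower activation memory, and it is not shown here.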

We also provide SD 2.1 checkpoints finetuned at 768×768 resolution on A6000 GPUs (this requires more than 24 GB of VRAM).

Unfortunately, we don't have checkpoints trained on SDXL because it's too large and likely requires GPUs such as the A100. Our training pipeline uses more VRAM than a typical training pipeline because we have additional objectives on the cross-attention maps.

If you want to train SDXL, my guess would be:
(1) increase the number of optimization steps (we observed that we needed more steps when training 2.1 than when training 1.4)
(2) change the grounding loss ratios, e.g., $\lambda$ and $\gamma$. Essentially, you want the token loss to decrease as much as possible without compromising the denoising objective. At the same time, you want the pixel loss to remain constant or decrease slowly as the model is finetuned with the token loss, in order to preserve good image quality and grounding capabilities. You can experiment with different hyperparameters by observing the training loss curves. Empirically, we use the same set of $\lambda$ and $\gamma$ for training both SD 1.4 and 2.1, which you can use as a starting point too if you want to experiment with SDXL!
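The loss-curve check described above can be sketched in plain Python. This is an illustration only, under stated assumptions: that the total objective is a weighted sum in which $\lambda$ scales the token loss and $\gamma$ scales the pixel loss, and that a simple least-squares slope is a good enough trend measure. The function names, the tolerance, and all numeric values are hypothetical placeholders, not the project's settings.

```python
# Sketch of a weighted objective and a loss-curve sanity check.
# Assumption: total = denoising + lam * token + gamma * pixel.
# All weights and loss values used with these helpers are placeholders.

def total_loss(denoise, token, pixel, lam, gamma):
    """Weighted sum of the denoising loss and the two grounding losses."""
    return denoise + lam * token + gamma * pixel

def slope(values):
    """Least-squares slope of values against step index (needs >= 2 points)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den

def curves_look_healthy(token_curve, pixel_curve, denoise_curve, tol=1e-3):
    """Token loss falling; pixel and denoising losses flat or falling."""
    return (slope(token_curve) < 0
            and slope(pixel_curve) <= tol
            and slope(denoise_curve) <= tol)

# Made-up logged losses at four checkpoints.
ok = curves_look_healthy(
    token_curve=[1.0, 0.8, 0.6, 0.5],
    pixel_curve=[0.5, 0.5, 0.49, 0.49],
    denoise_curve=[0.1, 0.1, 0.1, 0.1],
)
bad = curves_look_healthy(
    token_curve=[1.0, 0.8, 0.6, 0.5],
    pixel_curve=[0.5, 0.5, 0.49, 0.49],
    denoise_curve=[0.1, 0.2, 0.3, 0.4],  # denoising objective degrading
)
```

If the denoising curve starts to rise while the token loss falls, that would suggest lowering $\lambda$ or $\gamma$ before training longer.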


scarbain commented on June 23, 2024

Thanks for your insights! I'm currently trying to reproduce your training on a finetune of SD 1.5, but I'm getting errors about missing text keys in the data. I'll keep digging!

About SDXL, I'll run some tests, but if you're willing to work with me on this, I can probably provide cloud GPUs (depending on the cost, of course). We could then open-source it. I see a lot of value in this!


scarbain commented on June 23, 2024

Oh, I just checked your license and it's non-commercial.

