
mlpc-ucsd / tokencompose

Stars: 81 · Watchers: 3 · Forks: 2 · Size: 137.3 MB

(CVPR 2024) 🧩 TokenCompose: Grounding Diffusion with Token-level Supervision

Home Page: https://mlpc-ucsd.github.io/TokenCompose/

License: Other

Python 10.51% Shell 0.57% Jupyter Notebook 88.92%
Topics: computer-vision, diffusion-models, generative-ai, machine-learning, text-to-image, artificial-intelligence, latent-diffusion, multimodal, stable-diffusion, image-generation

tokencompose's People

Contributors

jamessand, zwcolin



Forkers

sorokinvld, whuhxb

tokencompose's Issues

GPU memory usage

Hi, thank you for this great work. I wonder which GPU you used in your experiments? I'm hitting OOM on a V100 when the UNet is equipped with the controller.

Why can the batch size only be 1?

Hi, congrats on your great work!

I wonder why the batch size can only be set to 1. It seems to me that the controller could store attention maps for a larger batch. Is this due to memory cost, or is there another reason?
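For a rough sense of how the stored cross-attention maps scale with batch size, here is a back-of-envelope estimate. The shapes below (latent resolutions, layer counts per resolution, 8 heads, 77 CLIP text tokens, fp32) are my own assumptions about an SD 1.4-style UNet, not values taken from this repo's code:

```python
# Rough, assumption-laden estimate of cross-attention map memory per batch
# for an SD 1.4-style UNet. All shape choices below are guesses for
# illustration, not TokenCompose's actual configuration.

TEXT_TOKENS = 77      # CLIP context length
HEADS = 8             # assumed heads per cross-attention layer
BYTES = 4             # fp32
# (latent side length -> assumed number of cross-attention layers there)
LAYERS = {64: 2, 32: 4, 16: 4, 8: 6}

def attn_map_bytes(batch: int) -> int:
    """Total bytes to hold one attention map per cross-attn layer."""
    total = 0
    for side, n_layers in LAYERS.items():
        spatial = side * side                          # query (image) tokens
        per_map = spatial * TEXT_TOKENS * HEADS * BYTES
        total += n_layers * per_map
    return batch * total

for b in (1, 4):
    print(f"batch={b}: ~{attn_map_bytes(b) / 2**20:.0f} MiB of stored maps")
```

The cost is linear in batch size, and in practice these maps are kept alive alongside activations for the backward pass, which is where the OOM pressure would come from.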

SDXL Training

Well, congrats! I'm really impressed that the generations are so well grounded without any extra network or anything!
And you only finetuned SD 1.4 for 24K steps? That was pretty fast training, right?

Have you considered training SDXL? I didn't see anything about it in your paper.
Do you have recommendations on the parameters to use for training SDXL?

Compatibility with LoRA

Hi, I wonder if you have tried using LoRA to finetune the model? It should need less GPU memory than the current full-finetuning strategy.
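To illustrate why LoRA would cut trainable-parameter memory, here is a generic parameter-count comparison (not this repo's code; the projection width and rank below are illustrative assumptions):

```python
# Generic LoRA vs full-finetune trainable-parameter comparison.
# Dimensions are illustrative (roughly SD-scale attention projections),
# not taken from TokenCompose's actual configuration.

def full_params(d_out: int, d_in: int) -> int:
    return d_out * d_in                  # every weight entry is trainable

def lora_params(d_out: int, d_in: int, rank: int) -> int:
    # The base weight W is frozen; only the low-rank factors
    # A (rank x d_in) and B (d_out x rank) are trained, giving the
    # effective update W + B @ A.
    return rank * d_in + d_out * rank

d = 768   # assumed projection width
r = 4     # a typical small LoRA rank
print(f"full finetune: {full_params(d, d):,} trainable params")
print(f"LoRA (r={r}):  {lora_params(d, d, r):,} trainable params")
```

Optimizer state (e.g. Adam's two moments per parameter) scales with the trainable count, so the savings compound beyond the weights themselves.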
