
zichengduan / thechosenone


Unofficial implementation of the paper "The Chosen One: Consistent Characters in Text-to-Image Diffusion Models"

Home Page: https://arxiv.org/abs/2311.10093

Python 100.00%
deep-learning diffusion dinov2 generative-art generative-model

thechosenone's Introduction

Welcome to my GitHub main page 👋

Sic Parvis Magna

thechosenone's People

Contributors

zichengduan


thechosenone's Issues

Tried to allocate 20.00 MiB (GPU 0; 14.76 GiB total capacity; 13.90 GiB already allocated; 14.75 MiB free; 14.14 GiB reserved in total by PyTorch

I keep getting out-of-memory errors no matter how I set PYTORCH_CUDA_ALLOC_CONF.
This is the error:
File "/opt/saturncloud/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1143, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 14.76 GiB total capacity; 13.90 GiB already allocated; 14.75 MiB free; 14.14 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
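
The figures in the message above can be checked before tuning the allocator. The following is a minimal sketch, assuming the error text has the format quoted in this issue; `parse_oom` is a hypothetical helper, not part of PyTorch. It extracts the figures and computes how much memory is reserved but not allocated, which is the quantity that `max_split_size_mb` is meant to address.

```python
import re

# Matches figures such as "13.90 GiB already allocated" in the quoted error.
_FIELD = re.compile(
    r"(\d+(?:\.\d+)?)\s*(MiB|GiB)\s+"
    r"(total capacity|already allocated|free|reserved in total)"
)


def parse_oom(message: str) -> dict:
    """Return the memory figures of a CUDA OOM message, converted to MiB."""
    scale = {"MiB": 1.0, "GiB": 1024.0}
    return {
        name: float(value) * scale[unit]
        for value, unit, name in _FIELD.findall(message)
    }


msg = (
    "Tried to allocate 20.00 MiB (GPU 0; 14.76 GiB total capacity; "
    "13.90 GiB already allocated; 14.75 MiB free; "
    "14.14 GiB reserved in total by PyTorch)"
)
stats = parse_oom(msg)
# Memory PyTorch holds in its cache but has not handed out to tensors.
slack_mib = stats["reserved in total"] - stats["already allocated"]
print(f"reserved but unallocated: {slack_mib:.0f} MiB")
```

Here only about 246 MiB is reserved but unallocated, against 13.90 GiB held by live tensors. Reserved memory is therefore not much larger than allocated memory, so allocator settings alone are unlikely to help; reducing the memory footprint of the model or the batch is the more likely fix.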

Notebook for Google Colab

Hi,
Thank you for this implementation!

I'm trying to run training on Google Colab, and I'm running into dependency issues.
It would be greatly helpful if you could provide a working notebook.
Thank you!

Consistency in character clothes

In the paper, the results show that the character's clothes stay consistent across different backgrounds, which is very nice.
image

But in the GitHub repository, the example image shows the character's clothing changing with the background. Why is that? Are you controlling the clothes in some way?
image

Supporting negative prompts, and using other diffusion models

Hi ZichengDuan!
Can you please support a negative prompt? It may help to converge to the wanted character, for example a real human being versus a CGI character.

Additional question: only stable-diffusion-xl-base-1.0 (among the models I know) has tokenizer_2 and text_encoder_2.
How can I modify the code to work with other diffusion models?

Thanks!
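
One way to make prompt-encoding code tolerate pipelines with a single text encoder is to collect whichever tokenizer/encoder pairs the pipeline exposes instead of assuming both exist. The following is a minimal sketch, not the repository's code: `text_encoder_pairs` is a hypothetical helper, and the `SimpleNamespace` objects merely stand in for loaded pipelines. The attribute names (`tokenizer`, `text_encoder`, `tokenizer_2`, `text_encoder_2`) are the ones mentioned in the question above.

```python
from types import SimpleNamespace


def text_encoder_pairs(pipe):
    """Return the (tokenizer, text_encoder) pairs that `pipe` actually has."""
    pairs = []
    for suffix in ("", "_2"):
        tokenizer = getattr(pipe, f"tokenizer{suffix}", None)
        encoder = getattr(pipe, f"text_encoder{suffix}", None)
        if tokenizer is not None and encoder is not None:
            pairs.append((tokenizer, encoder))
    return pairs


# Stand-ins for real pipelines (assumption: strings replace model objects).
sdxl_like = SimpleNamespace(
    tokenizer="tok", text_encoder="enc",
    tokenizer_2="tok2", text_encoder_2="enc2",
)
single_encoder = SimpleNamespace(tokenizer="tok", text_encoder="enc")

print(len(text_encoder_pairs(sdxl_like)))       # two pairs found
print(len(text_encoder_pairs(single_encoder)))  # one pair found
```

Training code could then loop over the returned pairs. Other SDXL-specific parts of a training script would likely still need separate changes, so this only addresses the missing-attribute problem.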

Questions about the training code part

Hi. Thank you for your implementation of the paper.

While I was looking at your code, I couldn't tell whether the model trained in each loop is the one trained in the previous loop, or whether you start from a vanilla SDXL model every loop.

Can you tell me which is right, and where the relevant code is?

Train errors

Hello!
When I train with the config setting "resume_from_checkpoint: latest", the LoRA checkpoint loads, but I get the error "No inf checks were recorded for this optimizer." However, if I train without a checkpoint, I get an error saying that bf16 is not supported (sorry, I forget the complete error message).

How to install diffusers==0.24.0.dev0

Thank you for your work. Upon checking the official repository, I couldn't find diffusers version 0.24.0.dev0, so I cannot use the text_encoder_lora_state_dict API. What should I do? Are there alternative APIs available? Currently, I am using diffusers version 0.23.1.

Error related to models when I try running main.py

When I try to run python3 main.py, I get the following error:

ImportError: cannot import name 'text_encoder_lora_state_dict' from 'diffusers.models.lora' (/usr/local/lib/python3.11/site-packages/diffusers/models/lora.py)

I had to change a couple of things to get to this point.
I changed the requirements file to this in order to make it work:

accelerate==0.24.1
bitsandbytes==0.41.2.post2
datasets==2.15.0
diffusers==0.25.0.dev0

and deleted diffusers.egg-info

Time For training process

Hi there. Wonderful work! I am now running the training process, but it seems to take a long time. What is a normal duration for training on a single V100, say with the default config provided in the config file?

text_encoder_lora_state_dict not found

Facing an env problem: ImportError: cannot import name 'text_encoder_lora_state_dict' from 'diffusers.models.lora'

May I ask which diffusers commit you use? Thanks.
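
Before changing the requirements, it can help to confirm which diffusers version is installed and whether that version still exports the name. The following is a minimal sketch using only the standard library; `probe` is a hypothetical helper, and the package, module, and symbol names are the ones from the error message above.

```python
import importlib
from importlib import metadata


def probe(package: str, module: str, name: str) -> dict:
    """Report the installed version of `package` and whether `module` exports `name`."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # package is not installed in this environment
    try:
        exported = hasattr(importlib.import_module(module), name)
    except ImportError:
        exported = False  # module itself is missing or failed to import
    return {"version": version, "exported": exported}


print(probe("diffusers", "diffusers.models.lora", "text_encoder_lora_state_dict"))
```

If `exported` is false for the installed version, the import in the training script cannot succeed in that environment, whatever the requirements file says.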

Bad results

I did not modify any code, I simply ran the training and inference programs directly, and the results were very poor. I'm not sure what the problem is. As the loop increases, the results get worse. Is there a problem somewhere?
loop=0:
photo_man__sitting_on_a_rocket
loop=1:
photo_man__sitting_on_a_rocket

GPU requirements

What are the minimum GPU requirements for training the model?
Currently, I'm running out of CUDA memory with a batch size of 1 on a GPU with 16 GB of memory.
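
A rough lower bound on the memory needed is the size of the model weights alone. The following is a back-of-the-envelope sketch, not a measurement of this repository: the parameter count is an assumption (roughly 2.6 billion for the SDXL UNet, the figure reported in the SDXL paper), and the text encoders, the VAE, activations, gradients, and optimizer state all come on top of it.

```python
def weights_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold `num_params` weights, in GiB."""
    return num_params * bytes_per_param / 2**30


UNET_PARAMS = 2.6e9  # assumption: approximate size of the SDXL UNet

for label, nbytes in (("fp32", 4), ("fp16/bf16", 2)):
    print(f"{label}: {weights_gib(UNET_PARAMS, nbytes):.1f} GiB")
```

Under this assumption, full-precision UNet weights alone take well over half of a 16 GB card, which is consistent with the out-of-memory reports here. Half precision or other memory-saving options are likely needed on such a GPU.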
