zeqiang-lai / mini-dalle3 Goto Github PK

View Code? Open in Web Editor NEW

291.0 4.0 26.0 172 KB

Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models

Home Page: https://minidalle3.github.io/

Python 100.00%

dalle dalle-3 dalle3 interactive-text-to-image mini-dalle3 dall-e-3

mini-dalle3's Introduction

Technical Report • Project page • Demo (Temporarily Unavailable)

minidalle3.mp4

An experimental attempt to obtain the interactive and interleave text-to-image and text-to-text experience of DALL•E 3 and ChatGPT.

Try Yourself 🤗

Download the checkpoint and save it as following

checkpoints
   - models
   - sdxl_models

run the following commands, and you will get a gradio-based web demo.

export OPENAI_API_KEY="your key"
python -m minidalle3.web

To use other LLM rather than ChatGPT, such as baichuan.

python -m minidalle3.llm.baichuan
export OPENAI_API_BASE="http://0.0.0.0:10039/v1"
python -m minidalle3.web

chatglm, baichuan, internlm are tested. llama have not supported yet. qwen is not tested.

TODO

Support generating image interleaved in the conversations.
Support generating multiple images at once.
Support selecting image.
Support refinement.
Support prompt refinement/variation.
Instruct tuned LLM/SD.

Citation

If you find this repo helpful, please consider citing us.

@misc{minidalle3,
    author={Lai, Zeqiang and Zhu, Xizhou and Dai, Jifeng and Qiao, Yu and Wang, Wenhai},
    title={Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models},
    year={2023},
    url={https://github.com/Zeqiang-Lai/Mini-DALLE3},
}

Acknowledgement

IP-Adapter • Stable Diffusion XL

mini-dalle3's People

Contributors

Stargazers

Watchers

mini-dalle3's Issues

Cool! but it's too slow

I use a 3060 12g graphics card, but image creation is very slow (takes more than 5 minutes). How can I improve performance?

checkpoint is not found

hello, would mind please add the checkpoint?thanks

依赖包的版本

requirement 里没有写这些依赖包的版本，运行的时候各种报错，能更新一下依赖包版本么？

About Figure 4 in the paper. Illustration of 6 types of interactions in interactive text-to-image workflow

Are all of these generated by mini-DALLE3? I wonder how the 3. Selecting is done. Because in my opinion, every time mini-DALLE3 just generates a new image according to the prompted input, so it‘s kind of impossible for the 2 images to be the same. thx

How to use it?

Can someone please share a step-by-step procedure to get started?

I would prefer a Non-Gradio Approach (CLI or HF Pipeline)

How to use other open source ones such as Mistral to incorporate stable diffusion

运行最低的显存要求大概是多少？我用RTX A4000 16GB 报显存不足

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 14.00 MiB (GPU 0; 15.73 GiB total capacity; 15.12 GiB already allocated; 3.19 MiB free; 15.54 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

fire.Fire(main) failed, a bug?

@whai362 When I use the command "python -m minidalle3.web" to launch the demo, I encounter the following error. It seems that there is a failure in starting "fire.Fire(main)". Could you please help me understand the reason behind this?

Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:34<00:00, 4.97s/it]
Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████████████████████████████| 7/7 [02:27<00:00, 21.08s/it]
/home/mi/anaconda3/envs/dfuxl/lib/python3.8/site-packages/gradio/blocks.py:928: UserWarning: api_name add_text already exists, using add_text_1
warnings.warn(f"api_name {api_name} already exists, using {api_name_}")
/home/mi/anaconda3/envs/dfuxl/lib/python3.8/site-packages/gradio/blocks.py:928: UserWarning: api_name bot already exists, using bot_1
warnings.warn(f"api_name {api_name} already exists, using {api_name_}")
/home/mi/anaconda3/envs/dfuxl/lib/python3.8/site-packages/gradio/blocks.py:928: UserWarning: api_name lambda already exists, using lambda_1
warnings.warn(f"api_name {api_name} already exists, using {api_name_}")
/home/mi/anaconda3/envs/dfuxl/lib/python3.8/site-packages/gradio/blocks.py:928: UserWarning: api_name lambda already exists, using lambda_2
warnings.warn(f"api_name {api_name} already exists, using {api_name_}")
Traceback (most recent call last):
File "/home/mi/anaconda3/envs/dfuxl/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/mi/anaconda3/envs/dfuxl/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/mi/Desktop/project/Mini-DALLE3/minidalle3/web.py", line 137, in
fire.Fire(main)
File "/home/mi/anaconda3/envs/dfuxl/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/mi/anaconda3/envs/dfuxl/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/mi/anaconda3/envs/dfuxl/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/mi/Desktop/project/Mini-DALLE3/minidalle3/web.py", line 133, in main
demo.queue(concurrency_count=max_users).launch(server_port=port, server_name="0.0.0.0", share=share)
File "/home/mi/anaconda3/envs/dfuxl/lib/python3.8/site-packages/gradio/blocks.py", line 1676, in queue
raise DeprecationWarning(
DeprecationWarning: concurrency_count has been deprecated. Set the concurrency_limit directly on event listeners e.g. btn.click(fn, ..., concurrency_limit=10) or gr.Interface(concurrency_limit=10). If necessary, the total number of workers can be configured via max_threads in launch().

zeqiang-lai / mini-dalle3 Goto Github PK

mini-dalle3's Introduction

Try Yourself 🤗

TODO

Citation

Acknowledgement

mini-dalle3's People

Contributors

Stargazers

Watchers

Forkers

mini-dalle3's Issues

Recommend Projects

Recommend Topics

Recommend Org

Jobs