limuloo / migc

[CVPR 2024 Highlight] "MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis" (Official Implementation)

License: Other

Python 57.30% JavaScript 28.95% CSS 7.00% HTML 6.75%

aigc cvpr2024 stable-diffusion text-to-image computer-vision cvpr

migc's Issues

Inference Time

I tried to generate a single image on my PC, and it takes about 7 minutes on a single RTX 2060. Is this normal?

training code

I'm here to ask for an update: when will the training code be released? May is almost over.

Can this method support any base model like a plug-in?

Any plans to release the training code?

When will the pretrained weights of the SDXL model be released?

Hello, thanks for your excellent work. Will you release the pretrained weights for SDXL in the future?

about the training code

Hello, thanks for the excellent work. May I ask when the training code will be released, and whether it will be soon? Thank you!

Supporting new diffusers

Thanks for the great work! However, the pipeline does not seem to support the latest diffusers. When I use diffusers v0.25.0, I get:

  File "/home_dir/miniconda3/envs/env_name/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/home_dir/miniconda3/envs/env_name/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home_dir/MIGC/migc_gui/app.py", line 82, in process_request
    pipe = offlinePipelineSetupWithSafeTensor(sd_safetensors_path=sd_safetensors_path)
  File "/home_dir/MIGC/migc/migc_utils.py", line 174, in offlinePipelineSetupWithSafeTensor
    pipe = StableDiffusionMIGCPipeline.from_single_file(sd_safetensors_path,
  File "/home_dir/miniconda3/envs/env_name/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home_dir/miniconda3/envs/env_name/lib/python3.10/site-packages/diffusers/loaders/single_file.py", line 263, in from_single_file
    pipe = download_from_original_stable_diffusion_ckpt(
  File "/home_dir/miniconda3/envs/env_name/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py", line 1687, in download_from_original_stable_diffusion_ckpt
    pipe = pipeline_class(
  File "/home_dir/MIGC/migc/migc_pipeline.py", line 241, in __init__
    super().__init__(
  File "/home_dir/miniconda3/envs/env_name/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 237, in __init__
    self.register_modules(
  File "/home_dir/miniconda3/envs/env_name/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 571, in register_modules
    library = not_compiled_module.__module__.split(".")[0]
AttributeError: 'bool' object has no attribute '__module__'

This is similar to the following error:
huggingface/diffusers#6094

The solution is discussed here:
huggingface/diffusers#5993 (comment)

Are you interested in solving this and making MIGC compatible with the updated diffusers?
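Until the pipeline is updated, a small version guard can make the incompatibility visible before the traceback above appears. The following is a minimal sketch, not part of MIGC: the helper names are hypothetical, and the threshold of 0.25.0 is only the version reported failing in this issue, not necessarily the first release where the behaviour changed.

```python
from importlib import metadata

# Assumption: 0.25.0 is the version reported as failing in this issue; the
# exact release where the behaviour changed is not stated here.
REPORTED_FAILING = (0, 25, 0)


def parse_version(text):
    """Turn '0.25.0' or '0.25.0.dev0' into a 3-tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)  # pad so that '0.25' compares equal to '0.25.0'
    return tuple(parts[:3])


def at_or_after_reported(version_text, threshold=REPORTED_FAILING):
    """True if the given version is >= the version reported as failing."""
    return parse_version(version_text) >= threshold


def installed_diffusers_version():
    """Return the installed diffusers version string, or None if absent."""
    try:
        return metadata.version("diffusers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_diffusers_version()
    if found is None:
        print("diffusers is not installed")
    elif at_or_after_reported(found):
        print(f"diffusers {found}: at or after the version reported failing")
    else:
        print(f"diffusers {found}: older than the version reported failing")
```

This only reports the installed version relative to the one in the report; it does not fix the underlying error.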

A failure case occurred when MIGC generates "2 cats and 3 dogs".

When I used MIGC to generate "2 cats and 3 dogs", I found that the first dog in the images below still looks like a cat. Is there any way to improve this result?

I am using realisticVisionV51_v51VAE.safetensors, here are my parameters:

prompt_final = [['4k, best quality, masterpiece, ultra high res, ultra detailed,a cat,a cat,a dog,a dog,a dog,grass',
                 'a cat', 'a cat', 'a dog', 'a dog', 'a dog', 'grass']]
bboxes = [[[0.078125, 0.09375, 0.390625, 0.359375], [0.515625, 0.09375, 0.859375, 0.359375], [0.078125, 0.515625, 0.34375, 0.90625],
           [0.421875, 0.515625, 0.671875, 0.921875], [0.71875, 0.484375, 0.953125, 0.921875], [0.015625, 0.015625, 0.984375, 0.96875]]]
negative_prompt = 'worst quality, low quality, watermark, text, blurry'
seed = 12573842233801288171
seed_everything(seed)
image = pipe(prompt_final, bboxes, num_inference_steps=50, guidance_scale=8,
             MIGCsteps=25, NaiveFuserSteps=25, aug_phase_with_and=False, negative_prompt=negative_prompt).images[0]

And here are the images generated by MIGC:
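When debugging a layout like the one in this report, it can help to rule out input mistakes first. The sketch below is a hypothetical helper, not part of MIGC. It assumes that the first entry of each prompt list is the global prompt, that the remaining entries are instance phrases paired one-to-one with the boxes, and that each box is [x_min, y_min, x_max, y_max] in normalized [0, 1] coordinates. The values in this issue are consistent with those assumptions, but the source does not confirm them.

```python
def validate_layout(prompts, boxes):
    """Return a list of problems found in one image's prompts and boxes.

    Assumptions (not confirmed by the source): prompts[0] is the global
    prompt, prompts[1:] are instance phrases, and each box is
    [x_min, y_min, x_max, y_max] in normalized [0, 1] coordinates.
    """
    problems = []
    phrases = prompts[1:]
    if len(phrases) != len(boxes):
        problems.append(f"{len(phrases)} instance phrases but {len(boxes)} boxes")
    for i, box in enumerate(boxes):
        if len(box) != 4:
            problems.append(f"box {i} has {len(box)} values, expected 4")
            continue
        x0, y0, x1, y1 = box
        if not all(0.0 <= v <= 1.0 for v in box):
            problems.append(f"box {i} has a coordinate outside [0, 1]")
        if x0 >= x1 or y0 >= y1:
            problems.append(f"box {i} has non-positive width or height")
    return problems


# The layout from this issue (global prompt shortened for readability).
example_prompts = ['a cat,a cat,a dog,a dog,a dog,grass',
                   'a cat', 'a cat', 'a dog', 'a dog', 'a dog', 'grass']
example_boxes = [[0.078125, 0.09375, 0.390625, 0.359375],
                 [0.515625, 0.09375, 0.859375, 0.359375],
                 [0.078125, 0.515625, 0.34375, 0.90625],
                 [0.421875, 0.515625, 0.671875, 0.921875],
                 [0.71875, 0.484375, 0.953125, 0.921875],
                 [0.015625, 0.015625, 0.984375, 0.96875]]

for problem in validate_layout(example_prompts, example_boxes):
    print(problem)
```

The layout in this issue passes these checks, so the cat-like dog is not explained by a malformed box or a phrase/box count mismatch.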

Batch inference

Great job! Are there any tips on setting up bounding boxes for batch inference?
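The parameters posted in the "2 cats and 3 dogs" issue wrap both the prompts and the boxes in an extra outer list, which looks like a per-image dimension. Assuming that reading is right, the nested inputs for several images could be assembled as sketched below. `build_batch` is a hypothetical helper, and whether the pipeline actually accepts more than one entry in the outer list is not confirmed anywhere in this thread.

```python
def build_batch(images):
    """Assemble nested prompt and box lists for several images.

    `images` is a list of (global_prompt, instances) pairs, where
    `instances` is a list of (phrase, box) pairs. The returned structure
    mirrors the nested lists shown in the issue above; treating the outer
    list as a batch dimension is an assumption.
    """
    prompt_final, bboxes = [], []
    for global_prompt, instances in images:
        # First entry is the global prompt, followed by one phrase per box.
        prompt_final.append([global_prompt] + [phrase for phrase, _ in instances])
        bboxes.append([list(box) for _, box in instances])
    return prompt_final, bboxes


prompt_final, bboxes = build_batch([
    ("a cat,grass", [("a cat", [0.1, 0.1, 0.4, 0.4]),
                     ("grass", [0.0, 0.5, 1.0, 1.0])]),
    ("a dog", [("a dog", (0.2, 0.2, 0.8, 0.8))]),
])
```

Keeping each phrase next to its box in the input makes it harder for the two lists to drift out of step when a layout is edited.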
