
controlnet-xs's People

Contributors

nicolasbender, sipirius

controlnet-xs's Issues

About ControlNet being harmless

When I load both the SDXL and Control checkpoints, the model produces pictures normally. But the model generates noise when I load only SDXL's ckpt. According to the zero-layer processing in the paper, shouldn't the model be consistent with SDXL in this case?
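A minimal illustration of the zero-initialization idea this question refers to (a generic sketch, not code from this repository): a zero-initialized projection makes the control branch contribute nothing at initialization, so the combined model starts out identical to the base model; once that projection has been trained, dropping the control weights no longer reproduces plain SDXL.

import torch
import torch.nn as nn

# Zero-initialized 1x1 projection on the control features (hypothetical shapes).
zero_proj = nn.Conv2d(320, 320, kernel_size=1)
nn.init.zeros_(zero_proj.weight)
nn.init.zeros_(zero_proj.bias)

base_feat = torch.randn(1, 320, 64, 64)
control_feat = torch.randn(1, 320, 64, 64)
# At initialization the control path adds exactly zero:
assert torch.equal(base_feat + zero_proj(control_feat), base_feat)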

What model of size 10.2 GB is it downloading?

Hi,

Thanks for the release of both the code and models.

I've set the paths to the SDXL checkpoint and the canny checkpoint in the config file, but the inference code is still trying to download a 10.2 GB checkpoint, "ip_pytorch_model.bin".

What is it?

Thanks

Is ControlNet-XS a new LoRA?

Hi! Awesome job! This is not an issue but a discussion. I think the idea in your paper is essentially two things: more connections (or control "just in time") and smaller control nets. I saw that the diffusers team trained smaller control nets for SDXL here, and that Stability uses a LoRA as a control net here.

I haven't taken a closer look at the code of those two, but they make me wonder whether your smaller control nets can be seen as a LoRA that not only modifies the behaviour of SD but also extends it, i.e. lets it see the control images.

I also wonder whether we could just use LoRAs to build your smaller control nets; then we could probably get even faster training and even smaller weights. I guess we could replace your networks with LoRAs while keeping your connections between the generative encoder and the control encoder (a rough LoRA sketch follows below).
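For reference, a minimal LoRA-style module looks roughly like the sketch below (a generic illustration, not code from this repository): a frozen pretrained layer plus a trainable low-rank, zero-initialized update, which is the kind of block the discussion above imagines substituting for the small control networks while keeping the encoder-to-encoder connections.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Frozen linear layer with a trainable low-rank update (generic sketch).
    def __init__(self, base: nn.Linear, rank: int = 4, scale: float = 1.0):
        super().__init__()
        self.base = base.requires_grad_(False)  # pretrained weights stay frozen
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)          # starts as a no-op, like a zero layer
        self.scale = scale

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))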

What does "learn_embedding" do?

if self.learn_embedding:
    emb = (self.control_model.time_embed(t_emb) * self.control_scale ** 0.3
           + base_model.time_embed(t_emb) * (1 - self.control_scale ** 0.3))

Why do you use this learned embedding?
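My reading of the interpolation above, as a quick worked example: because control_scale is raised to the power 0.3, the control model's time embedding dominates for most scale values.

control_scale = 0.95
w_control = control_scale ** 0.3      # ~0.985, weight of control_model.time_embed
w_base = 1 - control_scale ** 0.3     # ~0.015, weight of base_model.time_embed
print(w_control, w_base)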

TwoStreamControlNet: control_scale is always set to 1.0

It is initialized in the constructor and never changes afterwards.

As a consequence, setting control_scale in the high-level get_sdxl_sample API has no effect on it, even though this factor is involved in the timestep embedding computation.

(In other words, the base model's time_embed has no effect, whatever the input control_scale.)

But also, setting control_scale=0.0 when calling get_sdxl_sample is different from completely disabling ControlNet-XS via no_control = True.

It looks to me like it should be changed via something like:

--- a/scripts/control_utils.py
+++ b/scripts/control_utils.py
@@ -185,6 +185,7 @@ def get_sdxl_sample(

     if float(control_scale) != 1.0:
         model.model.scale_list *= control_scale
+        model.model.control_scale = control_scale
         print(f'[CONTROL CORRECTION OF {type(model).__name__} SCALED WITH {control_scale}]')

     control = torch.stack([tt.ToTensor()(ds['hint'][..., None].repeat(3, 2))] * num_samples).float().to('cuda')
@@ -231,6 +232,7 @@ def get_sdxl_sample(

     #  reset scales
     model.model.scale_list = model.model.scale_list * 0. + 1.
+    model.model.control_scale = 1.

     return x_samples, control

--

Is it intended (bug or feature)? Am I missing something?
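For context, a hedged sketch of the two code paths being compared, reusing the variables from the README example quoted later on this page and assuming no_control is accepted as a keyword by get_sdxl_sample, as this report implies (neither has been verified against the current code):

# control_scale=0.0 zeroes scale_list (and, with the patch above, control_scale too):
samples_scaled, _ = cu.get_sdxl_sample(
    guidance=edges, ddim_steps=10, num_samples=1, model=model,
    shape=[4, size // 8, size // 8], control_scale=0.0,
    prompt=prompt, n_prompt=n_prompt,
)

# no_control=True is reported to bypass the ControlNet-XS branch entirely:
samples_off, _ = cu.get_sdxl_sample(
    guidance=edges, ddim_steps=10, num_samples=1, model=model,
    shape=[4, size // 8, size // 8], no_control=True,
    prompt=prompt, n_prompt=n_prompt,
)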

Multiple Controlnets

Is there a way to combine multiple controlnets, i.e. depth and canny, with ControlNet-XS?

How to train?

How to train? I haven't seen any tutorials about training.

Weird results after switching from controlnet + sdxl to controlnet-xs + sdxl

I have been fine-tuning sdxl with controlnet for remote-sensing images of buildings, with the seg-map as control, but when I tried your work the sample results became weird...
This is my controlnet + sdxl result below:
[image]
And this is controlnet-XS + sdxl:
[image]
We can see a difference between the two results: controlnet looks more natural and controlnet-xs looks weird.
Is that normal, or did I miss some detail?

Dataset usage

Hello, please tell me how to download and use the datasets in this project. I downloaded the datasets in sgm, but it says "Datasets not yet available". Can you explain?

About model size and human pose

Thanks for your wonderful work.
I want to ask two questions:

  1. If I want a larger ControlNet-XS, do I only need to change the control_model_ratio configuration? (See the sketch after this list.)
  2. Does your pipeline work well for human poses?
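A minimal sketch for question 1, assuming the repository's YAML configs load with OmegaConf (as in sgm) and that the size ratio is exposed as control_model_ratio; the key path shown in the comment is a guess and should be checked against the actual config file:

from omegaconf import OmegaConf

cfg = OmegaConf.load('configs/inference/sdxl/sdxl_encD_canny_48m.yaml')
print(OmegaConf.to_yaml(cfg))  # locate the control_model_ratio entry first
# Hypothetical override; the real key path may differ:
# cfg.model.params.control_model_ratio = 0.2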

open_clip.create_model_and_transforms function problem during inference

When I ran the example code in README.md, I hit a strange problem.

import scripts.control_utils as cu
import torch
from PIL import Image

path_to_config = 'configs/inference/sdxl/sdxl_encD_canny_48m.yaml'
model = cu.create_model(path_to_config).to('cuda')

image_path = 'IMAGES/00007.png'

canny_high_th = 250
canny_low_th = 100
size = 768
num_samples=2

image = cu.get_image(image_path, size=size)
edges = cu.get_canny_edges(image, low_th=canny_low_th, high_th=canny_high_th)

samples, controls = cu.get_sdxl_sample(
    guidance=edges,
    ddim_steps=10,
    num_samples=num_samples,
    model=model,
    shape=[4, size // 8, size // 8],
    control_scale=0.95,
    prompt='cinematic, shoe in the streets, made from meat, photorealistic shoe, highly detailed',
    n_prompt='lowres, bad anatomy, worst quality, low quality',
)


Image.fromarray(cu.create_image_grid(samples)).save('00007.png')

The error occurred at the following line in "ControlNet-XS\sgm\modules\encoders\modules.py", line 428:

model, _, _ = open_clip.create_model_and_transforms(

To fix it, I downloaded the laion CLIP-ViT-H-14-laion2B-s32B-b79K model manually, put it in a directory, and then used it like this:

model, _, _ = open_clip.create_model_and_transforms(
            arch,
            device=torch.device("cpu"),
            # pretrained=version,
            pretrained="laion/CLIP-ViT-H-14-laion2B-s32B-b79K/open_clip_pytorch_model.bin",
        )

Then I hit an error I could not fix:

RuntimeError: Error(s) in loading state_dict for CLIP:
	Missing key(s) in state_dict: "visual.transformer.resblocks.32.ln_1.weight", "visual.transformer.resblocks.32.ln_1.bias", "visual.transformer.resblocks.32.attn.in_proj_weight", "visual.transformer.resblocks.32.attn.in_proj_bias", "visual.transformer.resblocks.32.attn.out_proj.weight", "visual.transformer.resblocks.32.attn.out_proj.bias", 
......
"transformer.resblocks.31.mlp.c_proj.weight", "transformer.resblocks.31.mlp.c_proj.bias". 
	size mismatch for positional_embedding: copying a param with shape torch.Size([77, 1024]) from checkpoint, the shape in current model is torch.Size([77, 1280]).
	size mismatch for text_projection: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
	......

I wonder if this is a problem with the model or something else.
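One hedged observation on the error above: the size mismatches (1024 vs. 1280, plus the missing resblocks past index 31) suggest the manually downloaded checkpoint belongs to a smaller text/vision tower than the architecture the config requests. SDXL's second text encoder is the larger OpenCLIP ViT-bigG-14 (1280-dim text width), so a matching local checkpoint would look roughly like this; the local path is illustrative, not taken from the repository:

model, _, _ = open_clip.create_model_and_transforms(
            arch,  # whatever the config passes, e.g. 'ViT-bigG-14' for a 1280-dim text tower
            device=torch.device("cpu"),
            # a checkpoint that matches the requested arch, e.g. a local copy of
            # laion/CLIP-ViT-bigG-14-laion2B-39B-b160k:
            pretrained="laion/CLIP-ViT-bigG-14-laion2B-39B-b160k/open_clip_pytorch_model.bin",
        )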

Cannot use sd21 model

First, when trying to use sd21 I got

cannot import name 'rank_zero_only' from 'pytorch_lightning.utilities.distributed'

which needed pytorch-lightning to be rolled back to v1.6.5 to get past that.

But now I get...

Traceback (most recent call last):
  File "D:\ControlNet-XS\ControlNet-XS.py", line 69, in <module>
    samples, controls = cu.get_sdxl_sample(
  File "D:\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:ControlNet-XS\scripts\control_utils.py", line 180, in get_sdxl_sample
    model.sampler.num_steps = ddim_steps
  File "D:\venv\lib\site-packages\torch\nn\modules\module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'TwoStreamControlLDM' object has no attribute 'sampler'. Did you mean: 'sample'?

Any ideas on what needs to be tweaked for sd21 to work? sdxl works fine.
Thanks.

Error in Colab

Hi,

Thanks for the release of both the code and models.

I've tried to run the code on Google Colab and got an error. I'm probably missing something, but I'm reporting it in case it is a bug.
You can see the Colab notebook.

Also, it looks like it needs too much VRAM for the free Colab tier. Can you tell me if that is true?

Thanks

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-2-1b395f27c622>](https://localhost:8080/#) in <cell line: 1>()
----> 1 import scripts.control_utils as cu
      2 import torch
      3 from PIL import Image
      4 
      5 path_to_config = '/content/ControlNet-XS/configs/inference/sdxl/sdxl_encD_canny_48m.yaml'

13 frames
[/usr/local/lib/python3.10/dist-packages/numpy/testing/_private/utils.py](https://localhost:8080/#) in <module>
     55 IS_PYSTON = hasattr(sys, "pyston_version_info")
     56 HAS_REFCOUNT = getattr(sys, 'getrefcount', None) is not None and not IS_PYSTON
---> 57 HAS_LAPACK64 = numpy.linalg._umath_linalg._ilp64
     58 
     59 _OLD_PROMOTION = lambda: np._get_promotion_state() == 'legacy'

AttributeError: module 'numpy.linalg._umath_linalg' has no attribute '_ilp64'
