
controlnet-xs's People

Contributors

nicolasbender, sipirius

controlnet-xs's Issues

About ControlNet being harmless

When I load both the SDXL and Control checkpoints, the model produces pictures normally. But the model generates noise when I load only SDXL's ckpt. According to the zero-layer processing in the paper, shouldn't the model be consistent with SDXL in this case?
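A minimal illustration of the zero-initialization idea this question refers to (a generic sketch, not code from this repository): a zero-initialized projection makes the control branch contribute nothing at initialization, so the combined model starts out identical to the base model; once that projection has been trained, dropping the control weights no longer reproduces plain SDXL.

import torch
import torch.nn as nn

# Zero-initialized 1x1 projection on the control features (hypothetical shapes).
zero_proj = nn.Conv2d(320, 320, kernel_size=1)
nn.init.zeros_(zero_proj.weight)
nn.init.zeros_(zero_proj.bias)

base_feat = torch.randn(1, 320, 64, 64)
control_feat = torch.randn(1, 320, 64, 64)
# At initialization the control path adds exactly zero:
assert torch.equal(base_feat + zero_proj(control_feat), base_feat)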

What model of size 10.2 GB is it downloading?

Hi,

Thanks for the release of both the code and models.

I've set the paths to the SDXL checkpoint and the canny checkpoint in the config file, but the inference code is still trying to download a 10.2 GB checkpoint, "ip_pytorch_model.bin".

What is it?

Thanks

Is ControlNet-XS a new LoRA?

Hi! Awesome job! This is not an issue but a discussion. I think the idea in your paper is essentially two things: more connections (or control "just in time") and smaller control nets. I saw that the diffusers team trained smaller control nets for SDXL here, and that Stability uses a LoRA as a control net here.

I haven't taken a closer look at the code of those two, but they make me wonder whether your smaller control nets can be seen as a LoRA that not only modifies the behaviour of SD but also extends it, i.e. lets it see the control images.

I also wonder whether we could just use LoRAs to build your smaller control nets; then we could probably get even faster training and even smaller weights. I guess we could replace your networks with LoRAs while keeping your connections between the generative encoder and the control encoder (a rough LoRA sketch follows below).
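For reference, a minimal LoRA-style module looks roughly like the sketch below (a generic illustration, not code from this repository): a frozen pretrained layer plus a trainable low-rank, zero-initialized update, which is the kind of block the discussion above imagines substituting for the small control networks while keeping the encoder-to-encoder connections.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Frozen linear layer with a trainable low-rank update (generic sketch).
    def __init__(self, base: nn.Linear, rank: int = 4, scale: float = 1.0):
        super().__init__()
        self.base = base.requires_grad_(False)  # pretrained weights stay frozen
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)          # starts as a no-op, like a zero layer
        self.scale = scale

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))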

What does "learn_embedding" do?

if self.learn_embedding:
    emb = (self.control_model.time_embed(t_emb) * self.control_scale ** 0.3
           + base_model.time_embed(t_emb) * (1 - self.control_scale ** 0.3))

Why do you use this learned embedding?
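My reading of the interpolation above, as a quick worked example: because control_scale is raised to the power 0.3, the control model's time embedding dominates for most scale values.

control_scale = 0.95
w_control = control_scale ** 0.3      # ~0.985, weight of control_model.time_embed
w_base = 1 - control_scale ** 0.3     # ~0.015, weight of base_model.time_embed
print(w_control, w_base)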

TwoStreamControlNet: control_scale is always set to 1.0

It is initialized in the constructor and never changes afterwards.

As a consequence, setting control_scale in the high-level get_sdxl_sample API has no effect on it, even though this factor is involved in the timestep embedding computation.

(In other words, the base model's time_embed has no effect, whatever the input control_scale.)

But also, setting control_scale=0.0 when calling get_sdxl_sample is different from completely disabling ControlNet-XS via no_control = True.

It looks to me like it should be changed via something like:

--- a/scripts/control_utils.py
+++ b/scripts/control_utils.py
@@ -185,6 +185,7 @@ def get_sdxl_sample(

     if float(control_scale) != 1.0:
         model.model.scale_list *= control_scale
+        model.model.control_scale = control_scale
         print(f'[CONTROL CORRECTION OF {type(model).__name__} SCALED WITH {control_scale}]')

     control = torch.stack([tt.ToTensor()(ds['hint'][..., None].repeat(3, 2))] * num_samples).float().to('cuda')
@@ -231,6 +232,7 @@ def get_sdxl_sample(

     #  reset scales
     model.model.scale_list = model.model.scale_list * 0. + 1.
+    model.model.control_scale = 1.

     return x_samples, control

--

Is it intended (bug or feature)? Am I missing something?
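For context, a hedged sketch of the two code paths being compared, reusing the variables from the README example quoted later on this page and assuming no_control is accepted as a keyword by get_sdxl_sample, as this report implies (neither has been verified against the current code):

# control_scale=0.0 zeroes scale_list (and, with the patch above, control_scale too):
samples_scaled, _ = cu.get_sdxl_sample(
    guidance=edges, ddim_steps=10, num_samples=1, model=model,
    shape=[4, size // 8, size // 8], control_scale=0.0,
    prompt=prompt, n_prompt=n_prompt,
)

# no_control=True is reported to bypass the ControlNet-XS branch entirely:
samples_off, _ = cu.get_sdxl_sample(
    guidance=edges, ddim_steps=10, num_samples=1, model=model,
    shape=[4, size // 8, size // 8], no_control=True,
    prompt=prompt, n_prompt=n_prompt,
)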

Multiple Controlnets

Is there a way to combine multiple controlnets, i.e. depth and canny, with ControlNet-XS?

How to train?

How to train? I haven't seen any tutorials about training.

Weird results after switching from controlnet + sdxl to controlnet-xs + sdxl

I have been fine-tuning sdxl with controlnet for remote-sensing images of buildings, with the seg-map as control, but when I tried your work the sample results became weird...
This is my controlnet + sdxl result below:
[image]
And this is controlnet-XS + sdxl:
[image]
We can see a difference between the two results: controlnet looks more natural and controlnet-xs looks weird.
Is that normal, or did I miss some detail?

Dataset usage

Hello, please tell me how to download and use the datasets in this project. I downloaded the datasets in sgm, but it says "Datasets not yet available". Can you explain?

About model size and human pose

Thanks for your wonderful work.
I want to ask two questions:

  1. If I want a larger ControlNet-XS, do I only need to change the control_model_ratio configuration? (See the sketch after this list.)
  2. Does your pipeline work well for human poses?
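A minimal sketch for question 1, assuming the repository's YAML configs load with OmegaConf (as in sgm) and that the size ratio is exposed as control_model_ratio; the key path shown in the comment is a guess and should be checked against the actual config file:

from omegaconf import OmegaConf

cfg = OmegaConf.load('configs/inference/sdxl/sdxl_encD_canny_48m.yaml')
print(OmegaConf.to_yaml(cfg))  # locate the control_model_ratio entry first
# Hypothetical override; the real key path may differ:
# cfg.model.params.control_model_ratio = 0.2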

open_clip.create_model_and_transforms function problem during inference

When I ran the example code in README.md, I hit a strange problem.

import scripts.control_utils as cu
import torch
from PIL import Image

path_to_config = 'configs/inference/sdxl/sdxl_encD_canny_48m.yaml'
model = cu.create_model(path_to_config).to('cuda')

image_path = 'IMAGES/00007.png'

canny_high_th = 250
canny_low_th = 100
size = 768
num_samples=2

image = cu.get_image(image_path, size=size)
edges = cu.get_canny_edges(image, low_th=canny_low_th, high_th=canny_high_th)

samples, controls = cu.get_sdxl_sample(
    guidance=edges,
    ddim_steps=10,
    num_samples=num_samples,
    model=model,
    shape=[4, size // 8, size // 8],
    control_scale=0.95,
    prompt='cinematic, shoe in the streets, made from meat, photorealistic shoe, highly detailed',
    n_prompt='lowres, bad anatomy, worst quality, low quality',
)


Image.fromarray(cu.create_image_grid(samples)).save('00007.png')

The error occurred at the following line in "ControlNet-XS\sgm\modules\encoders\modules.py", line 428:

model, _, _ = open_clip.create_model_and_transforms(

To fix it, I downloaded the laion CLIP-ViT-H-14-laion2B-s32B-b79K model manually, put it in a directory, and then used it like this:

model, _, _ = open_clip.create_model_and_transforms(
            arch,
            device=torch.device("cpu"),
            # pretrained=version,
            pretrained="laion/CLIP-ViT-H-14-laion2B-s32B-b79K/open_clip_pytorch_model.bin",
        )

Then I hit an error I could not fix:

RuntimeError: Error(s) in loading state_dict for CLIP:
	Missing key(s) in state_dict: "visual.transformer.resblocks.32.ln_1.weight", "visual.transformer.resblocks.32.ln_1.bias", "visual.transformer.resblocks.32.attn.in_proj_weight", "visual.transformer.resblocks.32.attn.in_proj_bias", "visual.transformer.resblocks.32.attn.out_proj.weight", "visual.transformer.resblocks.32.attn.out_proj.bias", 
......
"transformer.resblocks.31.mlp.c_proj.weight", "transformer.resblocks.31.mlp.c_proj.bias". 
	size mismatch for positional_embedding: copying a param with shape torch.Size([77, 1024]) from checkpoint, the shape in current model is torch.Size([77, 1280]).
	size mismatch for text_projection: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
	......

I wonder if this is a problem with the model or something else.
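One hedged observation on the error above: the size mismatches (1024 vs. 1280, plus the missing resblocks past index 31) suggest the manually downloaded checkpoint belongs to a smaller text/vision tower than the architecture the config requests. SDXL's second text encoder is the larger OpenCLIP ViT-bigG-14 (1280-dim text width), so a matching local checkpoint would look roughly like this; the local path is illustrative, not taken from the repository:

model, _, _ = open_clip.create_model_and_transforms(
            arch,  # whatever the config passes, e.g. 'ViT-bigG-14' for a 1280-dim text tower
            device=torch.device("cpu"),
            # a checkpoint that matches the requested arch, e.g. a local copy of
            # laion/CLIP-ViT-bigG-14-laion2B-39B-b160k:
            pretrained="laion/CLIP-ViT-bigG-14-laion2B-39B-b160k/open_clip_pytorch_model.bin",
        )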

Cannot use sd21 model

First, when trying to use sd21 I got

cannot import name 'rank_zero_only' from 'pytorch_lightning.utilities.distributed'

which needed pytorch-lightning to be rolled back to v1.6.5 to get past that.

But now I get...

Traceback (most recent call last):
  File "D:\ControlNet-XS\ControlNet-XS.py", line 69, in <module>
    samples, controls = cu.get_sdxl_sample(
  File "D:\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:ControlNet-XS\scripts\control_utils.py", line 180, in get_sdxl_sample
    model.sampler.num_steps = ddim_steps
  File "D:\venv\lib\site-packages\torch\nn\modules\module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'TwoStreamControlLDM' object has no attribute 'sampler'. Did you mean: 'sample'?

Any ideas on what needs to be tweaked for sd21 to work? sdxl works fine.
Thanks.

Error in Colab

Hi,

Thanks for the release of both the code and models.

I've tried to run the code on Google Colab and got an error. I'm probably missing something, but I'm reporting it in case it is a bug.
You can see the Colab notebook.

Also, it looks like it needs too much VRAM for the free Colab tier. Can you tell me if that is true?

Thanks

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-2-1b395f27c622>](https://localhost:8080/#) in <cell line: 1>()
----> 1 import scripts.control_utils as cu
      2 import torch
      3 from PIL import Image
      4 
      5 path_to_config = '/content/ControlNet-XS/configs/inference/sdxl/sdxl_encD_canny_48m.yaml'

13 frames
[/usr/local/lib/python3.10/dist-packages/numpy/testing/_private/utils.py](https://localhost:8080/#) in <module>
     55 IS_PYSTON = hasattr(sys, "pyston_version_info")
     56 HAS_REFCOUNT = getattr(sys, 'getrefcount', None) is not None and not IS_PYSTON
---> 57 HAS_LAPACK64 = numpy.linalg._umath_linalg._ilp64
     58 
     59 _OLD_PROMOTION = lambda: np._get_promotion_state() == 'legacy'

AttributeError: module 'numpy.linalg._umath_linalg' has no attribute '_ilp64'
