sgzqc / controlnext

This project forked from dvlab-research/controlnext

Controllable video and image Generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA

License: Apache License 2.0

Shell 1.47% Python 98.53%

controlnext's Introduction

🌀 ControlNeXt

ControlNeXt is our official implementation for controllable generation, supporting both images and videos while incorporating diverse forms of control information. In this project, we propose a new method that reduces trainable parameters by up to 90% compared with ControlNet, achieving faster convergence and outstanding efficiency. This method can be directly combined with other LoRA techniques to alter style and ensure more stable generation. Please refer to the examples for more details.
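To make the "up to 90%" figure concrete, the short sketch below shows how a relative reduction in trainable parameters is computed. The function name `parameter_reduction` and both parameter counts are hypothetical placeholders chosen for illustration; they are not measurements of ControlNet or ControlNeXt.

```python
def parameter_reduction(baseline_params: int, new_params: int) -> float:
    """Return the percentage reduction in trainable parameters versus a baseline."""
    if baseline_params <= 0:
        raise ValueError("baseline_params must be positive")
    if new_params < 0:
        raise ValueError("new_params must be non-negative")
    return 100.0 * (baseline_params - new_params) / baseline_params


# Placeholder counts for illustration only (assumed, NOT measured values).
baseline_trainable = 1_000_000_000  # hypothetical baseline control module
reduced_trainable = 100_000_000     # hypothetical lighter control module

reduction = parameter_reduction(baseline_trainable, reduced_trainable)
print(f"{reduction:.1f}% fewer trainable parameters")
```

With real models, the two counts would typically come from summing the sizes of the parameters that are left trainable in each setup.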

We provide an online demo of ControlNeXt-SDXL. Due to the high resource requirements of SVD, we are unable to offer it online.

This project is still undergoing iterative development. The code and model may be updated at any time. More information will be provided later.

Experiences

We share more training experiences there and in the Issues. We spent a lot of time finding these, and we now share them with all of you. We hope they help!

Model Zoo

  • ControlNeXt-SDXL [ Link ] : Controllable image generation. Our model is built upon Stable Diffusion XL. It has fewer trainable parameters, converges faster, is more efficient, and can be integrated with LoRA.

  • ControlNeXt-SVD-v2 [ Link ] : Generates video controlled by a sequence of human poses. Version 2 includes several improvements: a higher-quality collected training dataset, more frames per batch during training and inference, higher generation resolution, better human-related video generation through continual training, and pose alignment at inference to improve overall performance.

  • ControlNeXt-SVD-v2-Training [ Link ] : The training scripts for our ControlNeXt-SVD-v2 [ Link ].

  • ControlNeXt-SVD [ Link ] : Generates video controlled by a sequence of human poses. This can be seen as an attempt to replicate AnimateAnyone. However, our model is built upon Stable Video Diffusion and uses a more concise architecture.

  • ControlNeXt-SD1.5 [ Link ] : Controllable image generation. Our model is built upon Stable Diffusion 1.5. It has fewer trainable parameters, converges faster, is more efficient, and can be integrated with LoRA.

  • ControlNeXt-SD1.5-Training : The process is quite simple, so we do not plan to invest additional effort into it. You can directly use the HuggingFace examples. Please refer to the SDXL and SVD sections for our newly updated versions!

  • ControlNeXt-SD3 [ Link ] : Stay tuned.

🎥 Examples

For more examples, please refer to our Project page.

[Images: demo1, demo2, demo3, demo5]

If the videos fail to load, you can download them directly from here and here, or view them on our Project Page or BiliBili.

[Videos: 02.mp4, 02-1.mp4, 01.mp4, 01-1.mp4, 03-1.mp4, 04-1.mp4]

If the videos fail to load, you can download them directly from here.

[Videos: tiktok.mp4, spiderman.mp4, star.mp4, chair.mp4]

[Images: DreamShaper, Anythingv3 (two examples)]

If you find this work useful, please consider citing:

@article{peng2024controlnext,
  title={ControlNeXt: Powerful and Efficient Control for Image and Video Generation},
  author={Peng, Bohao and Wang, Jian and Zhang, Yuechen and Li, Wenbo and Yang, Ming-Chang and Jia, Jiaya},
  journal={arXiv preprint arXiv:2408.06070},
  year={2024}
}

Contributors

eugeoter, pbihao, yukang2017
