lihuibng / videocrafter

This project forked from ailab-cvc/videocrafter

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Home Page: https://ailab-cvc.github.io/videocrafter2/

License: Other

Shell 0.44% Python 99.56%

🔥🔥 VideoCrafter2 brings large improvements over VideoCrafter1 despite limited data: better motion and better concept combination!

Please join us and create your own film on Discord/Floor33.


🎥 Exquisite film, produced by VideoCrafter2, directed by Human

🔆 Introduction

🤗🤗🤗 VideoCrafter is an open-source video generation and editing toolbox for crafting video content.
It currently includes the Text2Video and Image2Video models:

1. Generic Text-to-video Generation

Click the GIF to access the high-resolution video.

  • "Tom Cruise's face reflects focus, his eyes filled with purpose and drive."
  • "A child excitedly swings on a rusty swing set, laughter filling the air."
  • "A young woman with glasses is jogging in the park wearing a pink headband."
  • "With the style of van gogh, A young couple dances under the moonlight by the lake."
  • "A rabbit, low-poly game art style"
  • "Impressionist style, a yellow rubber duck floating on the wave on the sunset"

2. Generic Image-to-video Generation

  • "a black swan swims on the pond"
  • "a girl is riding a horse fast on grassland"
  • "a boy sits on a chair facing the sea"
  • "two galleons moving in the wind at sunset"

📝 Changelog

  • [2024.01.18]: 🔥🔥 Release the VideoCrafter2 and Tech Report!

  • [2023.10.30]: Release VideoCrafter1 Technical Report!

  • [2023.10.13]: 🔥🔥 Release the VideoCrafter1, High Quality Video Generation!

  • [2023.08.14]: Release a new version of VideoCrafter on Discord/Floor33. Please join us to create your own film!

  • [2023.04.18]: Release a VideoControl model with most of the watermarks removed!

  • [2023.04.05]: Release pretrained Text-to-Video models, VideoLora models, and inference code.


⏳ Models

T2V-Models      Resolution   Checkpoints
VideoCrafter2                Coming soon
VideoCrafter1   576x1024     Hugging Face
VideoCrafter1   320x512      Hugging Face

I2V-Models      Resolution   Checkpoints
VideoCrafter1   320x512      Hugging Face

⚙️ Setup

1. Install Environment via Anaconda (Recommended)

conda create -n videocrafter python=3.8.5
conda activate videocrafter
pip install -r requirements.txt
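
After the install step, it can be handy to confirm that the packages listed in requirements.txt are actually present in the active environment. The snippet below is a minimal sketch of such a check, not part of this repository: the helper names `requirement_names` and `missing_packages` are hypothetical, and the parsing is deliberately rough (it only reads the leading package name on each line and skips pip options, so URL-style requirements are not handled properly).

```python
import re
from importlib import metadata  # in the standard library since Python 3.8


def requirement_names(lines):
    """Extract bare package names from requirements.txt-style lines (rough parse)."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options (-r, -e, ...)
            continue
        # Keep only the leading name, i.e. everything before a version specifier.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    import os
    import sys

    print("Python", sys.version.split()[0])
    if os.path.exists("requirements.txt"):
        with open("requirements.txt") as fh:
            print("Missing packages:", missing_packages(requirement_names(fh)))
```

Run it from the repository root with the videocrafter environment activated; an empty list means every named package was found.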

💫 Inference

1. Text-to-Video

  1. Download the pretrained T2V model from Hugging Face and save the checkpoint as checkpoints/base_1024_v1/model.ckpt.
  2. Run the following command in a terminal:
  sh scripts/run_text2video.sh

2. Image-to-Video

  1. Download the pretrained I2V model from Hugging Face and save the checkpoint as checkpoints/i2v_512_v1/model.ckpt.
  2. Run the following command in a terminal:
  sh scripts/run_image2video.sh
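
Both scripts expect a checkpoint at a fixed location, so a misplaced file is an easy mistake to make. The following is a small sketch, assuming it is run from the repository root, that reports which of the two checkpoint paths named above are missing. The function name `missing_checkpoints` and the task labels are made up for illustration and do not come from this codebase.

```python
from pathlib import Path

# Paths taken from the download steps above, relative to the repository root.
EXPECTED_CHECKPOINTS = {
    "text2video": Path("checkpoints/base_1024_v1/model.ckpt"),
    "image2video": Path("checkpoints/i2v_512_v1/model.ckpt"),
}


def missing_checkpoints(repo_root="."):
    """Return {task: full_path} for every expected checkpoint that is not a file."""
    root = Path(repo_root)
    return {
        task: root / rel
        for task, rel in EXPECTED_CHECKPOINTS.items()
        if not (root / rel).is_file()
    }


if __name__ == "__main__":
    for task, path in missing_checkpoints().items():
        print(f"[{task}] missing checkpoint: {path}")
```

No output means both files are in place; the check only looks at file existence and does not validate the checkpoint contents.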

3. Local Gradio demo

  1. Download the pretrained T2V and I2V models and place them in the directories given in the previous two sections.
  2. Run the following command in a terminal:
  python gradio_app.py
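
The three entry points above can also be driven from Python, for example when scripting batch runs. This is only a sketch under stated assumptions: the mode names `t2v`, `i2v`, and `demo` and the helpers `build_command` and `run` are hypothetical, the commands are simply the ones listed in the sections above, and `sys.executable` stands in for `python` so the demo uses the currently active interpreter.

```python
import subprocess
import sys

# Commands copied from the three inference sections above.
# The mode names on the left are illustrative labels, not repository options.
COMMANDS = {
    "t2v": ["sh", "scripts/run_text2video.sh"],
    "i2v": ["sh", "scripts/run_image2video.sh"],
    "demo": [sys.executable, "gradio_app.py"],
}


def build_command(mode):
    """Return the argument list for a mode, or raise ValueError if it is unknown."""
    try:
        return list(COMMANDS[mode])
    except KeyError:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(COMMANDS)}")


def run(mode, repo_root="."):
    """Run the command for `mode` from the repository root and fail on a non-zero exit."""
    return subprocess.run(build_command(mode), cwd=repo_root, check=True)
```

Calling `run("t2v")` from the repository root is then equivalent to typing `sh scripts/run_text2video.sh`, and `check=True` turns a failing script into a Python exception.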

📋 Technical Report

😉 VideoCrafter2 Tech report: VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

😉 VideoCrafter1 Tech report: VideoCrafter1: Open Diffusion Models for High-Quality Video Generation

😉 Citation

The technical report is currently unavailable as it is still in preparation. You can cite the paper of our image-to-video model and related base model.

@misc{chen2024videocrafter2,
      title={VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models}, 
      author={Haoxin Chen and Yong Zhang and Xiaodong Cun and Menghan Xia and Xintao Wang and Chao Weng and Ying Shan},
      year={2024},
      eprint={2401.09047},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@misc{chen2023videocrafter1,
      title={VideoCrafter1: Open Diffusion Models for High-Quality Video Generation}, 
      author={Haoxin Chen and Menghan Xia and Yingqing He and Yong Zhang and Xiaodong Cun and Shaoshu Yang and Jinbo Xing and Yaofang Liu and Qifeng Chen and Xintao Wang and Chao Weng and Ying Shan},
      year={2023},
      eprint={2310.19512},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@article{xing2023dynamicrafter,
      title={DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors}, 
      author={Jinbo Xing and Menghan Xia and Yong Zhang and Haoxin Chen and Xintao Wang and Tien-Tsin Wong and Ying Shan},
      year={2023},
      eprint={2310.12190},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@article{he2022lvdm,
      title={Latent Video Diffusion Models for High-Fidelity Long Video Generation}, 
      author={Yingqing He and Tianyu Yang and Yong Zhang and Ying Shan and Qifeng Chen},
      year={2022},
      eprint={2211.13221},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

🤗 Acknowledgements

Our codebase builds on Stable Diffusion. Thanks to the authors for sharing their awesome codebase!

📢 Disclaimer

We develop this repository for RESEARCH purposes, so it can only be used for personal/research/non-commercial purposes.

