
FreeInit: Bridging Initialization Gap in Video Diffusion Models


This repository contains the implementation of the following paper:

FreeInit: Bridging Initialization Gap in Video Diffusion Models
Tianxing Wu, Chenyang Si, Yuming Jiang, Ziqi Huang, Ziwei Liu

From MMLab@NTU affiliated with S-Lab, Nanyang Technological University

📖 Overview

overall_structure

We propose FreeInit, a concise yet effective method to improve the temporal consistency of videos generated by diffusion models. FreeInit requires no additional training, introduces no learnable parameters, and can be easily incorporated into arbitrary video diffusion models at inference time.

🔥 Updates

📃 Usage

In this repository, we use AnimateDiff as an example to demonstrate how to integrate FreeInit into current text-to-video inference pipelines.

In pipeline_animation.py, we define a class AnimationFreeInitPipeline inherited from AnimationPipeline, showing how to modify the original pipeline.

In freeinit_utils.py, we provide frequency filtering code for Noise Reinitialization.
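At its core, Noise Reinitialization is a frequency-domain blend: the low-frequency band of the diffused latent is kept, while the high-frequency band is replaced with freshly sampled noise. The blending step could be sketched as follows (a minimal NumPy version for illustration; the function and argument names are not the repo's exact API):

```python
import numpy as np

def freq_mix_3d(z, noise, low_pass_mask):
    """Keep low frequencies of `z`, take high frequencies from `noise`.

    z, noise      : video latents, shape (..., T, H, W)
    low_pass_mask : values in [0, 1]; 1 means "keep z's frequency"
    """
    axes = (-3, -2, -1)
    # Move to the frequency domain, centring the DC component
    z_freq = np.fft.fftshift(np.fft.fftn(z, axes=axes), axes=axes)
    n_freq = np.fft.fftshift(np.fft.fftn(noise, axes=axes), axes=axes)
    # Blend: low band from the latent, high band from fresh noise
    mixed = z_freq * low_pass_mask + n_freq * (1 - low_pass_mask)
    # Back to the spatio-temporal domain
    mixed = np.fft.ifftshift(mixed, axes=axes)
    return np.fft.ifftn(mixed, axes=axes).real
```

With an all-ones mask the output is just `z`; with an all-zeros mask it is `noise` — the interesting behaviour lies in between, where the mask passes only the low-frequency band.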

An example inference script is provided at animate_with_freeinit.py.

Please refer to the above scripts as a reference when integrating FreeInit into other video diffusion models.

🔨 Quick Start

1. Clone Repo

git clone https://github.com/TianxingWu/FreeInit.git
cd FreeInit
cd examples/AnimateDiff

2. Prepare Environment

conda env create -f environment.yaml
conda activate animatediff

3. Download Checkpoints

Please refer to the setup guide in the official AnimateDiff repository for downloading the required checkpoints.

4. Inference with FreeInit

After downloading the base model, motion module and personalized T2I checkpoints, run the following command to generate animations with FreeInit. The generation results are saved to the outputs folder.

python -m scripts.animate_with_freeinit \
    --config "configs/prompts/freeinit_examples/RealisticVision_v2.yaml" \
    --num_iters 5 \
    --save_intermediate \
    --use_fp16

where num_iters is the number of FreeInit iterations. We recommend using 3-5 iterations to balance quality and efficiency. For faster inference, the use_fast_sampling argument can be enabled to use the Coarse-to-Fine Sampling strategy, though this may lead to inferior results.
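Conceptually, each FreeInit iteration denoises the latent, diffuses the result back to noise, and reinitialises that noise before sampling again; num_iters controls how many such refinement rounds run. A schematic sketch of this loop, with stub functions standing in for the real diffusion model (all names and arithmetic here are illustrative, not the repo's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def denoise(latent):
    """Stub for the full sampling pass (DDIM in AnimateDiff)."""
    return latent * 0.5

def forward_diffuse(latent):
    """Stub for the DDPM forward process back to the final timestep."""
    return 0.1 * latent + rng.standard_normal(latent.shape)

def noise_reinit(z, fresh_noise):
    """Stub for Noise Reinitialization: low freq of z + high freq of fresh noise."""
    return 0.5 * z + 0.5 * fresh_noise

def sample_with_freeinit(shape, num_iters=5):
    latent = rng.standard_normal(shape)      # initial Gaussian noise
    for _ in range(num_iters):
        clean = denoise(latent)              # 1) sample a video latent
        noisy = forward_diffuse(clean)       # 2) diffuse it back to noise
        latent = noise_reinit(noisy,         # 3) refine the initial noise
                              rng.standard_normal(shape))
    return clean
```

Each extra iteration costs one full sampling pass, which is why a small num_iters (3-5) is the recommended trade-off.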

You can change the text prompts in the config file. To tune the frequency filter parameters for better results, change the filter_params settings in the config file. The 'butterworth' filter with n=4 and d_s=d_t=0.25 is the default. For base models with larger temporal inconsistencies, consider using the 'gaussian' filter.
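For intuition on what these parameters do, a Butterworth low-pass mask of the kind filter_params describes could look like this (a hypothetical NumPy sketch; the repo's exact normalisation may differ). d_s and d_t set the spatial and temporal cutoff, and n controls how sharply the response rolls off:

```python
import numpy as np

def butterworth_low_pass_mask(shape, n=4, d_s=0.25, d_t=0.25):
    """Build a centred 3D Butterworth low-pass mask for (T, H, W) latents."""
    T, H, W = shape
    # Normalised distances from the (shifted) DC component, in [-1, 1]
    t = np.linspace(-1, 1, T)[:, None, None]
    h = np.linspace(-1, 1, H)[None, :, None]
    w = np.linspace(-1, 1, W)[None, None, :]
    d2 = (t / d_t) ** 2 + (h / d_s) ** 2 + (w / d_s) ** 2
    # Classic Butterworth response: 1 at the centre, falling to 0 at high frequency
    return 1.0 / (1.0 + d2 ** n)
```

Smaller d_s/d_t values narrow the pass band, so less of the previous latent's structure survives reinitialization; larger n makes the transition between kept and replaced frequencies more abrupt.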

More .yaml files with different motion module / personalized T2I settings are also provided for testing.

🤗 Gradio Demo

We also provide a Gradio demo to demonstrate our method with a UI. Running the following command launches the demo. Feel free to play around with the parameters to improve generation quality.

python app.py

Alternatively, you can try the online demo hosted on Hugging Face: [demo link].

๐Ÿ–ผ๏ธ Generation Results

Please refer to our project page (https://tianxingwu.github.io/pages/FreeInit/) for more visual comparisons.

๐Ÿ–‹๏ธ Citation

If you find our repo useful for your research, please consider citing our paper:

@article{wu2023freeinit,
    title={FreeInit: Bridging Initialization Gap in Video Diffusion Models},
    author={Wu, Tianxing and Si, Chenyang and Jiang, Yuming and Huang, Ziqi and Liu, Ziwei},
    journal={arXiv preprint arXiv:2312.07537},
    year={2023}
}

๐Ÿค Acknowledgement

This project is distributed under the S-Lab License. See LICENSE for more information.

The example code is built upon AnimateDiff. Thanks to the team for their impressive work!
