tiger-ai-lab / consisti2v

ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation

Home Page: https://tiger-ai-lab.github.io/ConsistI2V/

License: MIT License

Language: Python (100%)

Topics: diffusion-models, image-to-video-generation, video-generation, video-synthesis

consisti2v's Issues

Where to download the Training Dataset

Hi authors,
Thanks for this awesome work! In the paper, ConsistI2V is trained on the WebVid-10M dataset. If I want to reproduce the training, where should I download this dataset from? Thanks!

Discussion on Computing Resources and Training Details

Thank you for your significant contributions. I tried using ConsistI2V to train on our tasks and found that each iteration takes approximately 24.14 s/it with the default parameters (8 GPUs, batch size 3, 256x256 resolution). I'm curious how long each iteration takes when you use the default training YAML. Could you share your experience?

Question about negative prompt

Thank you for sharing this excellent work! I am confused about the "negative prompt" shown in your demo and code. It doesn't seem to be mentioned in your paper. What is it used for?
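For context on what a negative prompt typically does in diffusion pipelines: under classifier-free guidance, the denoiser is evaluated on both the prompt embedding and a negative/unconditional embedding, and the two noise predictions are extrapolated away from the negative direction. The sketch below is a generic illustration of that combination step, not ConsistI2V's actual API; all names are illustrative.

```python
# Minimal sketch of classifier-free guidance (CFG) with a negative prompt.
# eps = eps_neg + s * (eps_pos - eps_neg): with guidance scale s > 1 the
# sample is pushed toward the prompt and away from the negative prompt
# (e.g. "blurry, low quality, watermark").

def cfg_combine(eps_pos, eps_neg, guidance_scale):
    """Combine positive- and negative-prompt noise predictions."""
    return [n + guidance_scale * (p - n) for p, n in zip(eps_pos, eps_neg)]

# Toy two-component "noise predictions".
print(cfg_combine([0.8, -0.2], [0.5, 0.1], 7.5))
```

Because the combined prediction is pushed *away* from the negative embedding, concepts listed in the negative prompt are actively suppressed in the output, which is why it often appears in demos even when the paper only describes standard guidance.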

The camera motion cannot be used

Hello, thanks for your nice work! I want to use the code to achieve camera motion. When I simply set a camera motion (such as pan_left), a dimension mismatch occurs in the z_T calculation. How should I use the code correctly to get camera motion results?
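For readers unfamiliar with the idea: camera panning in this family of methods is often simulated at initialization by making the initial latent z_T for frame t a spatially shifted copy of the first frame's latent, with the shift growing over time, so the low-frequency layout drifts in one direction. The sketch below illustrates that shifting scheme on a toy 1-D "latent"; the function name and shift rule are illustrative assumptions, not ConsistI2V's exact implementation.

```python
# Hedged sketch: build per-frame initial "latents" for a pan_left motion
# by circularly shifting a 1-D latent row further left at each frame.

def pan_left_latents(z0_row, num_frames, step=1):
    """Shift a 1-D latent row left by step*t positions for frame t."""
    w = len(z0_row)
    frames = []
    for t in range(num_frames):
        s = (step * t) % w
        frames.append(z0_row[s:] + z0_row[:s])  # wrap-around left shift
    return frames

# Each later frame sees content shifted one position further left.
print(pan_left_latents([1, 2, 3, 4], 3))
```

In a real 2-D latent the shift would be applied along the width axis of a tensor of shape (frames, channels, height, width), which is also where a dimension mismatch like the one described above would typically arise if the shift is applied along the wrong axis.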

Issue with blurry results in fine-tuned Model

Hello, your work is really cool!

I have been fine-tuning your model on my dataset of 25k videos, starting from your TIGER-Lab/ConsistI2V checkpoint. Due to limited resources, I used a batch size of 2, training on 2 RTX 6000 GPUs, while keeping the rest of the configuration the same. However, I noticed that the geometry of moving objects is blurry.

Is this an expected outcome, given that I cannot replicate the batch size of 192? Do the number of GPUs or the dataset size matter here? Did you observe this problem while training the model, and did it go away after training for longer?
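One common way to approximate a large effective batch on few GPUs is gradient accumulation: average gradients over k micro-batches before each optimizer step, so the effective batch becomes per_gpu_batch x num_gpus x k. The sketch below demonstrates the equivalence on a toy scalar loss; the helper names are illustrative, not part of the ConsistI2V training code.

```python
# Hedged sketch of gradient accumulation: averaging per-micro-batch
# gradients reproduces the gradient of one large batch (for mean losses
# over equal-sized micro-batches).

def accumulated_grad(micro_batches, grad_fn):
    """Average per-micro-batch gradients, as accumulation would."""
    grads = [grad_fn(mb) for mb in micro_batches]
    return sum(grads) / len(grads)

# Toy loss L(w) = mean((w - x)^2) at w = 0, so dL/dw = mean(-2x).
grad_fn = lambda batch: sum(-2 * x for x in batch) / len(batch)
micro = [[1.0, 2.0], [3.0, 4.0]]
full = [1.0, 2.0, 3.0, 4.0]
# Accumulating over the two micro-batches matches one big-batch gradient.
assert abs(accumulated_grad(micro, grad_fn) - grad_fn(full)) < 1e-9
```

Note that accumulation only matches the large-batch *gradient*; batch-dependent statistics (e.g. normalization layers computed per micro-batch) can still differ, so it is an approximation rather than an exact replication of batch size 192.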

Code Availability for ConsistI2V Project?

Hi there!

I'm excited about the ConsistI2V project's ability to generate videos that stay consistent with the source image. I noticed the code isn't currently available in the repository. While I understand it's still under development, I'm curious whether there's any information about a potential release timeframe.

I appreciate any insights you can share about the code's availability. Thanks for your time and the awesome project!

Discussion on Computing Resources and Training Details

Hello, I am very interested in your work, and I am really impressed with your demo. I would like to ask how many GPUs were used to train the diffusion model and how long training took. Additionally, the dataset is sampled from WebVid-10M, and I noticed that you sampled only 16 frames per video. How do you ensure that the sampled sequences are sufficiently dynamic, and is the 16-frame sampling a trade-off? Looking forward to your response!

watermark problem

Hi,

I wonder why there is always a watermark-like pattern appearing in the generated videos. Any idea how to get rid of it?

autoregressive doesn't work?

Hi there - thanks for this amazing project and releasing the code!

I'm trying to run autoregressive inference using the default YAML file inference_autoregress, but the resulting video ends up being the same length as with regular inference.

Any ideas what I might be doing wrong?
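For context, autoregressive long-video generation as described in the ConsistI2V paper chains inference calls: the last frame of each generated clip becomes the conditioning image for the next clip, so the video length should grow with the number of chunks. The sketch below shows the chaining logic with a stand-in for the pipeline call; `generate_clip` and the other names are illustrative, not the repo's actual functions.

```python
# Hedged sketch of autoregressive clip chaining for long-video generation.

def generate_long_video(first_frame, prompt, generate_clip, num_chunks, clip_len):
    frames = [first_frame]
    cond = first_frame
    for _ in range(num_chunks):
        clip = generate_clip(cond, prompt, clip_len)
        frames.extend(clip[1:])  # drop the repeated conditioning frame
        cond = clip[-1]          # last frame conditions the next chunk
    return frames

# Toy stand-in: each "clip" is clip_len integers continuing from cond.
toy_clip = lambda cond, prompt, n: [cond + i for i in range(n)]
video = generate_long_video(0, "a cat", toy_clip, num_chunks=3, clip_len=4)
print(len(video))  # 1 + 3 * (4 - 1) = 10 frames
```

If the output has the same length as a single regular inference call, a likely culprit is that only one chunk is being generated, e.g. a chunk-count or total-length option in the autoregressive YAML being left at its single-clip default.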

memory

Hello, could you please tell me how much CUDA memory I should prepare for training?

Checkpoint can't be downloaded

Dear Authors,
Thank you for your great work. Could you please check the checkpoint? It is not currently available on Hugging Face. Thank you!

low resolution output

Firstly, excellent work. Consistency with the first frame is very important in practical image-animation generation. I played with ConsistI2V on Replicate using different images. However, the animations have a low-resolution issue: the input images are high resolution, but the output videos are low resolution. Even the demo outputs on Replicate are low resolution.
1. What are the prompt settings for the outputs in the video gallery on the project page?
2. Is the low resolution related to the prompt settings, or is it a limitation of the model itself?
[Screenshots of low-resolution Replicate outputs]
Again, thank you for your excellent work.
