vchitect / lavie
LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
License: Apache License 2.0
I want to express my appreciation for your impressive project and its release.
Thank you for your valuable contribution. :)
I am particularly interested in exploring long video generation,
and I was wondering if you could share your fine-tuned models for research purposes.
Thank you.
There are no training scripts.
My GPU is an RTX 4090 with 24 GB of VRAM.
The system is Ubuntu 22.
Here is the error message:
args.input_path = ../results/base/a_panda_taking_a_selfie,_2k,_high_quality.mp4
args.prompt = ['a_panda_taking_a_selfie,_2k,_high_quality']
loading video from ../results/base/a_panda_taking_a_selfie,_2k,high_quality.mp4
Traceback (most recent call last):
File "/home/vantage/apps/vchitect-lavie/interpolation/sample.py", line 307, in
main(**OmegaConf.load(args.config))
File "/home/vantage/apps/vchitect-lavie/interpolation/sample.py", line 279, in main
video_clip = auto_inpainting_copy_no_mask(args, video_input, prompt, vae, text_encoder, diffusion, model, device,)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vantage/apps/vchitect-lavie/interpolation/sample.py", line 142, in auto_inpainting_copy_no_mask
video_input = vae.encode(video_input).latent_dist.sample().mul(0.18215)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vantage/miniconda3/envs/lavie/lib/python3.11/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vantage/miniconda3/envs/lavie/lib/python3.11/site-packages/diffusers/models/autoencoder_kl.py", line 164, in encode
h = self.encoder(x)
^^^^^^^^^^^^^^^
File "/home/vantage/miniconda3/envs/lavie/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vantage/miniconda3/envs/lavie/lib/python3.11/site-packages/diffusers/models/vae.py", line 129, in forward
sample = down_block(sample)
^^^^^^^^^^^^^^^^^^
File "/home/vantage/miniconda3/envs/lavie/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vantage/miniconda3/envs/lavie/lib/python3.11/site-packages/diffusers/models/unet_2d_blocks.py", line 1014, in forward
hidden_states = resnet(hidden_states, temb=None)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vantage/miniconda3/envs/lavie/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vantage/miniconda3/envs/lavie/lib/python3.11/site-packages/diffusers/models/resnet.py", line 599, in forward
output_tensor = (input_tensor + hidden_states) / self.output_scale_factor
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.77 GiB (GPU 0; 23.65 GiB total capacity; 18.59 GiB already allocated; 4.41 GiB free; 18.71 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
In my tests, the base and VSR steps run normally; only the interpolation step runs out of memory.
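The traceback above fails inside a single vae.encode(video_input) call, which pushes every frame through the encoder at once. One common mitigation is to encode a few frames at a time and join the results. The sketch below shows only the chunking logic in plain Python; encode_in_chunks and its parameters are hypothetical names, not part of LaVie or diffusers. With PyTorch, encode_fn would presumably wrap the vae.encode(...).latent_dist.sample().mul(0.18215) expression from the traceback on a slice of the frame dimension, and the pieces would be joined with torch.cat; how sample.py shapes its input is an assumption here.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def encode_in_chunks(
    encode_fn: Callable[[Sequence[T]], Sequence[R]],
    frames: Sequence[T],
    chunk_size: int,
) -> List[R]:
    """Apply encode_fn to consecutive slices of frames and collect the results.

    Hypothetical helper: peak memory is bounded by one chunk instead of
    the whole clip, at the cost of more encoder calls.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    outputs: List[R] = []
    for start in range(0, len(frames), chunk_size):
        chunk = frames[start:start + chunk_size]
        outputs.extend(encode_fn(chunk))
    return outputs
```

A smaller chunk_size lowers peak memory further but makes encoding slower, so it is worth starting large and reducing it only until the error disappears.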
Hello,
Any expected timeline for when the code will be publicly available?
When training the VSR model, the spatial layers are frozen. How is joint image-video fine-tuning implemented in that case? The image input seems to lose its purpose.
Hello, I am getting the following:
(lavie) C:\Users\gsusm\Documents\GitHub\LaVie>lavie
'lavie' is not recognized as an internal or external command,
operable program or batch file.
How can I solve this? I have already activated the environment.
Thank you.
When I run the video super-resolution model, I get this error:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 400.00 GiB (GPU 0; 44.52 GiB total capacity; 12.21 GiB already allocated; 31.33 GiB free; 12.83 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Why does it try to allocate 400 GiB? Should I change some settings?
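The numbers in a CUDA out-of-memory message already indicate which kind of fix can help. The sketch below parses the fields that appear in the messages quoted on this page and applies a rough rule of thumb. parse_oom and diagnose are hypothetical helpers, the patterns assume values reported in GiB as in these messages, and the fragmentation check is a heuristic based on the hint printed in the message itself, not an official PyTorch rule.

```python
import re
from typing import Dict

# Patterns match the wording of the messages quoted above (values in GiB).
_PATTERNS = {
    "requested": r"Tried to allocate ([\d.]+) GiB",
    "total": r"([\d.]+) GiB total capacity",
    "allocated": r"([\d.]+) GiB already allocated",
    "free": r"([\d.]+) GiB free",
    "reserved": r"([\d.]+) GiB reserved",
}


def parse_oom(message: str) -> Dict[str, float]:
    """Extract the GiB figures from a CUDA out-of-memory message."""
    values: Dict[str, float] = {}
    for name, pattern in _PATTERNS.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"field {name!r} not found in message")
        values[name] = float(match.group(1))
    return values


def diagnose(message: str) -> str:
    """Return a rough, heuristic reading of the parsed figures."""
    v = parse_oom(message)
    if v["requested"] > v["total"]:
        # No allocator setting can satisfy a request larger than the GPU.
        return "request exceeds total GPU memory; reduce the input size"
    if v["reserved"] - v["allocated"] >= v["requested"]:
        # Reserved memory far above allocated memory hints at fragmentation.
        return "possible fragmentation; max_split_size_mb may help"
    return "too little free memory; reduce frames, resolution, or batch size"
```

For the message above, the 400.00 GiB request is larger than the 44.52 GiB card, so the first branch applies: tuning max_split_size_mb cannot help, and the input (for example resolution or number of frames) would have to shrink.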
Thanks for the project ❤️ I made a colab. 🥳 I hope you like it. https://github.com/camenduru/LaVie-colab
chosen wrong model?
Dear LaVie Development Team,
I hope this message finds you well. I am reaching out to propose an enhancement to the video interpolation step in the LaVie high-quality video generation pipeline. Having delved into the impressive capabilities of LaVie and its cascaded latent diffusion models, I believe that the interpolation component could benefit from an advanced frame synthesis approach that potentially increases the fluidity of generated video sequences.
Currently, the interpolation process serves to augment the temporal resolution of videos by increasing the frame count, thereby creating smoother transitions and motion. However, I have observed that certain complex scenarios, particularly those involving rapid movement or intricate textures, could exhibit minor artefacts or a less than seamless flow.
To address this, I suggest exploring the integration of machine learning-based frame prediction algorithms that leverage temporal and spatial information more effectively. Such algorithms could include but are not limited to, bidirectional predictive models that estimate intermediate frames using both past and future context or the employment of more sophisticated motion estimation techniques that account for non-linear movements within the scene.
The objective of this enhancement is to further refine the temporal coherence and visual quality of the generated videos, ensuring that the output aligns with the high standards set by LaVie's text-to-video generation framework. I believe this could significantly enhance the user experience, especially for applications requiring high-fidelity video output.
I am keen to hear your thoughts on this suggestion and would be delighted to contribute further to the discussion or preliminary research, should you find this proposal of interest.
Thank you for considering my input, and I commend you on the remarkable work accomplished thus far with LaVie.
Best regards,
yihong1120
I mean, I want to generate videos up to 3 minutes long. Is that possible?
Hi, when is the dataset planned to be released?
@wyhsirius @maxin-cn @xinyuanc91 @pooya-mohammadi I'm trying to reproduce the zero-shot UCF101 FVD score of LaVie (526.3) reported in Table 2. However, I'm getting a much higher FVD score (689.6), and I hope you can help me with this.
Below are some details of my trial:
Where do you think the score difference comes from? Some missing information that I think might cause the difference are:
If you can find anything that I'm missing or doing incorrectly here, please let me know.
Thank you.
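For context when comparing such numbers: FVD is commonly computed as the Fréchet distance between two Gaussians fitted to features of real and generated videos, typically taken from a pretrained I3D network. The symbols below follow the usual convention and are not taken from the LaVie paper: $\mu_r, \Sigma_r$ are the mean and covariance of the real-video features, and $\mu_g, \Sigma_g$ those of the generated-video features.

```latex
\mathrm{FVD} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g
  - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Because the means and covariances are estimated from finite samples, the feature extractor, the number of videos, the clip length, and the resolution can all plausibly shift the score. This is a general property of the metric, not a statement about how the reported 526.3 was obtained.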
I encountered an error while trying to run the first step.
I created my environment with conda env create -f environment.yml, as described in the README.
I would appreciate any help. My environment information is listed below.
python pipelines/sample.py --config configs/sample.yaml
/datadisk/zjh/anaconda3/lib/python3.11/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
/datadisk/zjh/anaconda3/lib/python3.11/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
Traceback (most recent call last):
File "/home/zjh/LaVie/base/pipelines/sample.py", line 6, in <module>
from pipeline_videogen import VideoGenPipeline
File "/home/zjh/LaVie/base/pipelines/pipeline_videogen.py", line 40, in <module>
from diffusers.pipeline_utils import DiffusionPipeline
ModuleNotFoundError: No module named 'diffusers.pipeline_utils'
Package Version Editable project location
----------------------------- --------------- -------------------------------
about-time 4.2.1
addict 2.4.0
aiobotocore 2.5.0
aiofiles 22.1.0
aiohttp 3.8.3
aioitertools 0.7.1
aiosignal 1.2.0
aiosqlite 0.18.0
alabaster 0.7.12
alive-progress 3.1.5
altair 5.1.2
anaconda-catalogs 0.2.0
anaconda-client 1.12.0
anaconda-navigator 2.4.2
anaconda-project 0.11.1
annotated-types 0.6.0
anyio 3.7.1
appdirs 1.4.4
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
arrow 1.2.3
astroid 2.14.2
astropy 5.1
asttokens 2.0.5
async-timeout 4.0.2
atomicwrites 1.4.0
attrs 22.1.0
Automat 20.2.0
autopep8 1.6.0
Babel 2.11.0
backcall 0.2.0
backports.functools-lru-cache 1.6.4
backports.tempfile 1.0
backports.weakref 1.0.post1
bcrypt 3.2.0
beautifulsoup4 4.12.2
binaryornot 0.4.4
black 0.0
bleach 4.1.0
bokeh 3.2.1
boltons 23.0.0
botocore 1.29.76
Bottleneck 1.3.5
brotlipy 0.7.0
certifi 2023.7.22
cffi 1.15.1
chardet 4.0.0
charset-normalizer 2.0.4
click 8.0.4
cloudpickle 2.2.1
clyent 1.2.2
colorama 0.4.6
colorcet 3.0.1
comm 0.1.2
conda 23.7.2
conda-build 3.26.0
conda-content-trust 0+unknown
conda_index 0.2.3
conda-libmamba-solver 23.5.0
conda-pack 0.6.0
conda-package-handling 2.2.0
conda_package_streaming 0.9.0
conda-repo-cli 1.0.41
conda-token 0.4.0
conda-verify 3.4.2
constantly 15.1.0
contourpy 1.0.5
cookiecutter 1.7.3
cryptography 41.0.2
cssselect 1.1.0
cycler 0.11.0
cytoolz 0.12.0
daal4py 2023.1.1
dask 2023.6.0
datasets 2.12.0
datashader 0.15.1
datashape 0.5.4
debugpy 1.6.7
decorator 5.1.1
defusedxml 0.7.1
diff-match-patch 20200713
diffusers 0.26.3
dill 0.3.6
distlib 0.3.7
distributed 2023.6.0
docstring-to-markdown 0.11
docutils 0.18.1
einops 0.7.0
entrypoints 0.4
et-xmlfile 1.1.0
executing 0.8.3
fastapi 0.104.1
fastjsonschema 2.16.2
ffmpy 0.3.1
filelock 3.13.1
flake8 6.0.0
Flask 2.2.2
fonttools 4.25.0
frozenlist 1.3.3
fsspec 2024.2.0
future 0.18.3
gensim 4.3.0
glob2 0.7
gmpy2 2.1.2
gradio 4.5.0
gradio_client 0.7.0
grapheme 0.6.0
greenlet 2.0.1
h11 0.14.0
h5py 3.7.0
HeapDict 1.0.1
holoviews 1.17.0
httpcore 1.0.2
httpx 0.25.1
huggingface-hub 0.21.4
hvplot 0.8.4
hyperlink 21.0.0
idna 3.4
imagecodecs 2021.8.26
imageio 2.31.1
imagesize 1.4.1
imbalanced-learn 0.10.1
importlib-metadata 6.0.0
importlib-resources 6.1.1
incremental 21.3.0
inflection 0.5.1
iniconfig 1.1.1
intake 0.6.8
intervaltree 3.1.0
ipykernel 6.19.2
ipython 8.12.0
ipython-genutils 0.2.0
ipywidgets 8.0.4
isort 5.9.3
itemadapter 0.3.0
itemloaders 1.0.4
itsdangerous 2.0.1
ivi-utils 2.0.0 /datadisk/zjh/project/AiS_utils
jaraco.classes 3.2.1
jedi 0.18.1
jeepney 0.7.1
jellyfish 0.9.0
Jinja2 3.1.2
jinja2-time 0.2.0
jmespath 0.10.0
joblib 1.2.0
json5 0.9.6
jsonpatch 1.32
jsonpointer 2.1
jsonschema 4.17.3
jupyter 1.0.0
jupyter_client 7.4.9
jupyter-console 6.6.3
jupyter_core 5.3.0
jupyter-events 0.6.3
jupyter-server 1.23.4
jupyter_server_fileid 0.9.0
jupyter_server_ydoc 0.8.0
jupyter-ydoc 0.2.4
jupyterlab 3.6.3
jupyterlab-pygments 0.1.2
jupyterlab_server 2.22.0
jupyterlab-widgets 3.0.5
keyring 23.13.1
kiwisolver 1.4.4
lazy_loader 0.2
lazy-object-proxy 1.6.0
libarchive-c 2.9
libmambapy 1.4.1
linkify-it-py 2.0.0
llvmlite 0.40.0
lmdb 1.4.1
locket 1.0.0
lxml 4.9.1
lz4 4.3.2
Markdown 3.4.1
markdown-it-py 2.2.0
MarkupSafe 2.1.1
matplotlib 3.7.1
matplotlib-inline 0.1.6
mccabe 0.7.0
mdit-py-plugins 0.3.0
mdurl 0.1.0
mistune 0.8.4
mkl-fft 1.3.6
mkl-random 1.2.2
mkl-service 2.4.0
more-itertools 8.12.0
mpmath 1.3.0
msgpack 1.0.3
multidict 6.0.2
multipledispatch 0.6.0
multiprocess 0.70.14
munkres 1.1.4
mypy-extensions 0.4.3
navigator-updater 0.4.0
nbclassic 0.5.5
nbclient 0.5.13
nbconvert 6.5.4
nbformat 5.7.0
nest-asyncio 1.5.6
networkx 3.1
nltk 3.8.1
notebook 6.5.4
notebook_shim 0.2.2
numba 0.57.0
numexpr 2.8.4
numpy 1.24.3
numpydoc 1.5.0
nvidia-cublas-cu11 11.11.3.6
nvidia-cuda-cupti-cu11 11.8.87
nvidia-cuda-nvrtc-cu11 11.8.89
nvidia-cuda-runtime-cu11 11.8.89
nvidia-cudnn-cu11 8.7.0.84
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.3.0.86
nvidia-cusolver-cu11 11.4.1.48
nvidia-cusparse-cu11 11.7.5.86
nvidia-nccl-cu11 2.19.3
nvidia-nvtx-cu11 11.8.86
opencv-python 4.8.1.78
openpyxl 3.0.10
orjson 3.9.10
packaging 23.0
pandas 1.5.3
pandocfilters 1.5.0
panel 1.2.1
param 1.13.0
parsel 1.6.0
parso 0.8.3
partd 1.2.0
pathlib 1.0.1
pathspec 0.10.3
patsy 0.5.3
pep8 1.7.1
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.4.0
pip 23.2.1
pkginfo 1.9.6
platformdirs 4.0.0
plotly 5.9.0
pluggy 1.0.0
ply 3.11
pooch 1.4.0
poyo 0.5.0
prometheus-client 0.14.1
prompt-toolkit 3.0.36
Protego 0.1.16
psutil 5.9.0
ptyprocess 0.7.0
pure-eval 0.2.2
py-cpuinfo 8.0.0
pyarrow 11.0.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycodestyle 2.10.0
pycosat 0.6.4
pycparser 2.21
pycryptodome 3.19.0
pyct 0.5.0
pycurl 7.45.2
pydantic 2.5.1
pydantic_core 2.14.3
PyDispatcher 2.0.5
pydocstyle 6.3.0
pydub 0.25.1
pyerfa 2.0.0
pyflakes 3.0.1
Pygments 2.15.1
PyJWT 2.4.0
pylint 2.16.2
pylint-venv 2.3.0
pyls-spyder 0.4.0
pyodbc 4.0.34
pyOpenSSL 23.2.0
pyparsing 3.0.9
PyQt5 5.15.10
PyQt5-Qt5 5.15.2
PyQt5-sip 12.13.0
PyQtWebEngine 5.15.6
PyQtWebEngine-Qt5 5.15.2
pyrsistent 0.18.0
PySocks 1.7.1
pytest 7.4.0
python-dateutil 2.8.2
python-json-logger 2.0.7
python-lsp-black 1.2.1
python-lsp-jsonrpc 1.0.0
python-lsp-server 1.7.2
python-multipart 0.0.6
python-slugify 5.0.2
python-snappy 0.6.1
pytoolconfig 1.2.5
pytz 2022.7
pyviz-comms 2.3.0
PyWavelets 1.4.1
pyxdg 0.27
PyYAML 6.0
pyzmq 23.2.0
QDarkStyle 3.0.2
qstylizer 0.2.2
QtAwesome 1.2.2
qtconsole 5.4.2
QtPy 2.2.0
queuelib 1.5.0
regex 2022.7.9
requests 2.31.0
requests-file 1.5.1
requests-toolbelt 1.0.0
responses 0.13.3
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rich 13.7.0
rope 1.7.0
Rtree 1.0.1
ruamel.yaml 0.17.21
ruamel-yaml-conda 0.17.21
s3fs 2023.4.0
sacremoses 0.0.43
safetensors 0.4.2
scikit-image 0.20.0
scikit-learn 1.3.0
scikit-learn-intelex 20230426.111612
scipy 1.10.1
Scrapy 2.8.0
seaborn 0.12.2
SecretStorage 3.3.1
semantic-version 2.10.0
Send2Trash 1.8.0
service-identity 18.1.0
setuptools 68.0.0
shellingham 1.5.4
sip 6.6.2
six 1.16.0
smart-open 5.2.1
sniffio 1.2.0
snowballstemmer 2.2.0
sortedcontainers 2.4.0
soupsieve 2.4
Sphinx 5.0.2
sphinx-multiversion 0.2.4
sphinxcontrib-applehelp 1.0.2
sphinxcontrib-devhelp 1.0.2
sphinxcontrib-htmlhelp 2.0.0
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.3
sphinxcontrib-serializinghtml 1.1.5
spyder 5.4.3
spyder-kernels 2.4.3
SQLAlchemy 1.4.39
stack-data 0.2.0
starlette 0.27.0
statsmodels 0.14.0
sympy 1.11.1
tables 3.8.0
tabulate 0.8.10
TBB 0.2
tblib 1.7.0
tenacity 8.2.2
terminado 0.17.1
text-unidecode 1.3
textdistance 4.2.1
threadpoolctl 2.2.0
three-merge 0.1.1
tifffile 2021.7.2
tinycss2 1.2.1
tldextract 3.2.0
tokenizers 0.13.2
toml 0.10.2
tomlkit 0.12.0
toolz 0.12.0
torch 2.2.1+cu118
torchaudio 2.2.1+cu118
torchvision 0.17.1+cu118
tornado 6.3.2
tqdm 4.65.0
traitlets 5.7.1
transformers 4.29.2
triton 2.2.0
Twisted 22.10.0
typer 0.9.0
typing_extensions 4.8.0
uc-micro-py 1.0.1
ujson 5.4.0
Unidecode 1.2.0
urllib3 1.26.16
uvicorn 0.24.0.post1
virtualenv 20.24.7
w3lib 1.21.0
watchdog 2.1.6
wcwidth 0.2.5
webencodings 0.5.1
websocket-client 0.58.0
websockets 11.0.3
Werkzeug 2.2.3
whatthepatch 1.0.2
wheel 0.38.4
widgetsnbextension 4.0.5
wrapt 1.14.1
wurlitzer 3.0.2
xarray 2023.6.0
xxhash 2.0.2
xyzservices 2022.9.0
y-py 0.5.9
yapf 0.31.0
yarl 1.8.1
ypy-websocket 0.8.2
zict 2.2.0
zipp 3.11.0
zope.interface 5.4.0
zstandard 0.19.0
python --version
Python 3.11.4
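The traceback above shows pipeline_videogen.py importing DiffusionPipeline from diffusers.pipeline_utils, while the package list shows diffusers 0.26.3, where that module path is evidently missing. DiffusionPipeline is also exported from the top-level diffusers package, so one possible workaround is to try several import locations in order. The helper below is a hypothetical sketch of that fallback, not code from the LaVie repository. The site-packages path in the traceback also suggests the base Anaconda install rather than a dedicated environment, so activating the environment created from environment.yml may be the simpler fix.

```python
import importlib
from typing import Any, Iterable, Tuple


def import_first(candidates: Iterable[Tuple[str, str]]) -> Any:
    """Return the first attribute that can be imported from the candidates.

    Each candidate is a (module_name, attribute_name) pair; locations
    that are missing are skipped. Hypothetical helper for illustration.
    """
    tried = []
    for module_name, attr_name in candidates:
        tried.append(f"{module_name}.{attr_name}")
        try:
            module = importlib.import_module(module_name)
        except ImportError:  # also covers ModuleNotFoundError
            continue
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    raise ImportError("none of these could be imported: " + ", ".join(tried))


# Possible use in pipeline_videogen.py (assumes diffusers is installed):
# DiffusionPipeline = import_first([
#     ("diffusers", "DiffusionPipeline"),
#     ("diffusers.pipeline_utils", "DiffusionPipeline"),
# ])
```

Other imports in the same file may have moved as well, so this only addresses the one line shown in the traceback.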
First step
python pipelines/sample.py --config configs/sample.yaml
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
How can I fix this? Or could someone share a wheel?
Thanks in advance!
I have seen web UI versions elsewhere. Is there no official one?
Hi, @wyhsirius , thanks for your great work "LaVie"!
I found some typos in the config files while running the code:
# ckpt_path: "../pretrained_models/lavie_interpolation.pt"
ckpt_path: "../pretrained_models/stable-diffusion-v1-4"
# ckpt_path: "../pretained_models/lavie_vsr.pt"
ckpt_path: "../pretrained_models/lavie_vsr.pt"
Best wishes,
MqLeet.
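A misspelled directory such as pretained_models only shows up when a checkpoint fails to load. A small check run before sampling can surface it earlier. The sketch below assumes the config values have already been read into a plain dict of path strings; missing_paths and its arguments are hypothetical names, not part of the LaVie code.

```python
from pathlib import Path
from typing import Dict, List


def missing_paths(paths: Dict[str, str], base_dir: str = ".") -> List[str]:
    """Return the config keys whose paths do not exist.

    Relative paths are resolved against base_dir, for example the
    directory the sampling script is started from.
    """
    base = Path(base_dir)
    return [key for key, value in paths.items() if not (base / value).exists()]
```

Called with {"ckpt_path": "../pretained_models/lavie_vsr.pt"}, it would return ["ckpt_path"] unless a directory with that misspelled name actually exists.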
Which configuration option controls the length of the output video?
Well done to the team on the release; the results are great! I'm wondering whether quality can be improved further with different base models.
About LaVie-2: what results does it give? Are there any test cases?