hpcaitech / open-sora
Open-Sora: Democratizing Efficient Video Production for All
Home Page: https://hpcaitech.github.io/Open-Sora/
License: Apache License 2.0
Running the inference sample
python sample.py -m "DiT/XL-2" --text "a person is walking on the street" --ckpt /path/to/checkpoint --height 256 --width 256 --fps 10 --sec 5 --disable-cfg
I got the following error:
usage: sample.py [-h]
[-m {DiT-XL/2,DiT-XL/4,DiT-XL/8,DiT-L/2,DiT-L/4,DiT-L/8,DiT-B/2,DiT-B/4,DiT-B/8,DiT-S/2,DiT-S/4,DiT-S/8}]
[--text TEXT] [--cfg-scale CFG_SCALE] [--num-sampling-steps NUM_SAMPLING_STEPS]
[--seed SEED] --ckpt CKPT [-c {raw,vqvae,vae}] [--text_model TEXT_MODEL]
[--width WIDTH] [--height HEIGHT] [--fps FPS] [--sec SEC] [--disable-cfg]
sample.py: error: argument -m/--model: invalid choice: 'DiT/XL-2' (choose from 'DiT-XL/2', 'DiT-XL/4', 'DiT-XL/8', 'DiT-L/2', 'DiT-L/4', 'DiT-L/8', 'DiT-B/2', 'DiT-B/4', 'DiT-B/8', 'DiT-S/2', 'DiT-S/4', 'DiT-S/8')
Then, I changed it to
!python sample.py -m "DiT-XL/2" --text "a person is walking on the street" --ckpt pretrained_models/DiT-XL-2-256x256.pt --height 256 --width 256 --fps 10 --sec 5 --disable-cfg
But I got a different error:
Traceback (most recent call last):
File "/content/Open-Sora/sample.py", line 136, in
main(args)
File "/content/Open-Sora/sample.py", line 39, in main
model.load_state_dict(torch.load(args.ckpt))
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DiT:
Missing key(s) in state_dict: "video_embedder.proj.weight", "video_embedder.proj.bias", "blocks.0.attn.to_q.weight", "blocks.0.attn.to_q.bias", "blocks.0.attn.to_k.weight", "blocks.0.attn.to_k.bias", "blocks.0.attn.to_v.weight", "blocks.0.attn.to_v.bias", "blocks.0.attn.to_out.0.weight", "blocks.0.attn.to_out.0.bias", "blocks.1.attn.to_q.weight", "blocks.1.attn.to_q.bias", "blocks.1.attn.to_k.weight", "blocks.1.attn.to_k.bias", "blocks.1.attn.to_v.weight", "blocks.1.attn.to_v.bias", "blocks.1.attn.to_out.0.weight", "blocks.1.attn.to_out.0.bias", "blocks.2.attn.to_q.weight", "blocks.2.attn.to_q.bias", "blocks.2.attn.to_k.weight", "blocks.2.attn.to_k.bias", "blocks.2.attn.to_v.weight", "blocks.2.attn.to_v.bias", "blocks.2.attn.to_out.0.weight", "blocks.2.attn.to_out.0.bias", "blocks.3.attn.to_q.weight", "blocks.3.attn.to_q.bias", "blocks.3.attn.to_k.weight", "blocks.3.attn.to_k.bias", "blocks.3.attn.to_v.weight", "blocks.3.attn.to_v.bias", "blocks.3.attn.to_out.0.weight", "blocks.3.attn.to_out.0.bias", "blocks.4.attn.to_q.weight", "blocks.4.attn.to_q.bias", "blocks.4.attn.to_k.weight", "blocks.4.attn.to_k.bias", "blocks.4.attn.to_v.weight", "blocks.4.attn.to_v.bias", "blocks.4.attn.to_out.0.weight", "blocks.4.attn.to_out.0.bias", "blocks.5.attn.to_q.weight", "blocks.5.attn.to_q.bias", "blocks.5.attn.to_k.weight", "blocks.5.attn.to_k.bias", "blocks.5.attn.to_v.weight", "blocks.5.attn.to_v.bias", "blocks.5.attn.to_out.0.weight", "blocks.5.attn.to_out.0.bias", "blocks.6.attn.to_q.weight", "blocks.6.attn.to_q.bias", "blocks.6.attn.to_k.weight", "blocks.6.attn.to_k.bias", "blocks.6.attn.to_v.weight", "blocks.6.attn.to_v.bias", "blocks.6.attn.to_out.0.weight", "blocks.6.attn.to_out.0.bias", "blocks.7.attn.to_q.weight", "blocks.7.attn.to_q.bias", "blocks.7.attn.to_k.weight", "blocks.7.attn.to_k.bias", "blocks.7.attn.to_v.weight", "blocks.7.attn.to_v.bias", "blocks.7.attn.to_out.0.weight", "blocks.7.attn.to_out.0.bias", "blocks.8.attn.to_q.weight", "blocks.8.attn.to_q.bias", "blocks.8.attn.to_k.weight", "blocks.8.attn.to_k.bias", "blocks.8.attn.to_v.weight", "blocks.8.attn.to_v.bias", "blocks.8.attn.to_out.0.weight", "blocks.8.attn.to_out.0.bias", "blocks.9.attn.to_q.weight", "blocks.9.attn.to_q.bias", "blocks.9.attn.to_k.weight", "blocks.9.attn.to_k.bias", "blocks.9.attn.to_v.weight", "blocks.9.attn.to_v.bias", "blocks.9.attn.to_out.0.weight", "blocks.9.attn.to_out.0.bias", "blocks.10.attn.to_q.weight", "blocks.10.attn.to_q.bias", "blocks.10.attn.to_k.weight", "blocks.10.attn.to_k.bias", "blocks.10.attn.to_v.weight", "blocks.10.attn.to_v.bias", "blocks.10.attn.to_out.0.weight", "blocks.10.attn.to_out.0.bias", "blocks.11.attn.to_q.weight", "blocks.11.attn.to_q.bias", "blocks.11.attn.to_k.weight", "blocks.11.attn.to_k.bias", "blocks.11.attn.to_v.weight", "blocks.11.attn.to_v.bias", "blocks.11.attn.to_out.0.weight", "blocks.11.attn.to_out.0.bias", "blocks.12.attn.to_q.weight", "blocks.12.attn.to_q.bias", "blocks.12.attn.to_k.weight", "blocks.12.attn.to_k.bias", "blocks.12.attn.to_v.weight", "blocks.12.attn.to_v.bias", "blocks.12.attn.to_out.0.weight", "blocks.12.attn.to_out.0.bias", "blocks.13.attn.to_q.weight", "blocks.13.attn.to_q.bias", "blocks.13.attn.to_k.weight", "blocks.13.attn.to_k.bias", "blocks.13.attn.to_v.weight", "blocks.13.attn.to_v.bias", "blocks.13.attn.to_out.0.weight", "blocks.13.attn.to_out.0.bias", "blocks.14.attn.to_q.weight", "blocks.14.attn.to_q.bias", "blocks.14.attn.to_k.weight", "blocks.14.attn.to_k.bias", "blocks.14.attn.to_v.weight", "blocks.14.attn.to_v.bias", 
"blocks.14.attn.to_out.0.weight", "blocks.14.attn.to_out.0.bias", "blocks.15.attn.to_q.weight", "blocks.15.attn.to_q.bias", "blocks.15.attn.to_k.weight", "blocks.15.attn.to_k.bias", "blocks.15.attn.to_v.weight", "blocks.15.attn.to_v.bias", "blocks.15.attn.to_out.0.weight", "blocks.15.attn.to_out.0.bias", "blocks.16.attn.to_q.weight", "blocks.16.attn.to_q.bias", "blocks.16.attn.to_k.weight", "blocks.16.attn.to_k.bias", "blocks.16.attn.to_v.weight", "blocks.16.attn.to_v.bias", "blocks.16.attn.to_out.0.weight", "blocks.16.attn.to_out.0.bias", "blocks.17.attn.to_q.weight", "blocks.17.attn.to_q.bias", "blocks.17.attn.to_k.weight", "blocks.17.attn.to_k.bias", "blocks.17.attn.to_v.weight", "blocks.17.attn.to_v.bias", "blocks.17.attn.to_out.0.weight", "blocks.17.attn.to_out.0.bias", "blocks.18.attn.to_q.weight", "blocks.18.attn.to_q.bias", "blocks.18.attn.to_k.weight", "blocks.18.attn.to_k.bias", "blocks.18.attn.to_v.weight", "blocks.18.attn.to_v.bias", "blocks.18.attn.to_out.0.weight", "blocks.18.attn.to_out.0.bias", "blocks.19.attn.to_q.weight", "blocks.19.attn.to_q.bias", "blocks.19.attn.to_k.weight", "blocks.19.attn.to_k.bias", "blocks.19.attn.to_v.weight", "blocks.19.attn.to_v.bias", "blocks.19.attn.to_out.0.weight", "blocks.19.attn.to_out.0.bias", "blocks.20.attn.to_q.weight", "blocks.20.attn.to_q.bias", "blocks.20.attn.to_k.weight", "blocks.20.attn.to_k.bias", "blocks.20.attn.to_v.weight", "blocks.20.attn.to_v.bias", "blocks.20.attn.to_out.0.weight", "blocks.20.attn.to_out.0.bias", "blocks.21.attn.to_q.weight", "blocks.21.attn.to_q.bias", "blocks.21.attn.to_k.weight", "blocks.21.attn.to_k.bias", "blocks.21.attn.to_v.weight", "blocks.21.attn.to_v.bias", "blocks.21.attn.to_out.0.weight", "blocks.21.attn.to_out.0.bias", "blocks.22.attn.to_q.weight", "blocks.22.attn.to_q.bias", "blocks.22.attn.to_k.weight", "blocks.22.attn.to_k.bias", "blocks.22.attn.to_v.weight", "blocks.22.attn.to_v.bias", "blocks.22.attn.to_out.0.weight", "blocks.22.attn.to_out.0.bias", "blocks.23.attn.to_q.weight", "blocks.23.attn.to_q.bias", "blocks.23.attn.to_k.weight", "blocks.23.attn.to_k.bias", "blocks.23.attn.to_v.weight", "blocks.23.attn.to_v.bias", "blocks.23.attn.to_out.0.weight", "blocks.23.attn.to_out.0.bias", "blocks.24.attn.to_q.weight", "blocks.24.attn.to_q.bias", "blocks.24.attn.to_k.weight", "blocks.24.attn.to_k.bias", "blocks.24.attn.to_v.weight", "blocks.24.attn.to_v.bias", "blocks.24.attn.to_out.0.weight", "blocks.24.attn.to_out.0.bias", "blocks.25.attn.to_q.weight", "blocks.25.attn.to_q.bias", "blocks.25.attn.to_k.weight", "blocks.25.attn.to_k.bias", "blocks.25.attn.to_v.weight", "blocks.25.attn.to_v.bias", "blocks.25.attn.to_out.0.weight", "blocks.25.attn.to_out.0.bias", "blocks.26.attn.to_q.weight", "blocks.26.attn.to_q.bias", "blocks.26.attn.to_k.weight", "blocks.26.attn.to_k.bias", "blocks.26.attn.to_v.weight", "blocks.26.attn.to_v.bias", "blocks.26.attn.to_out.0.weight", "blocks.26.attn.to_out.0.bias", "blocks.27.attn.to_q.weight", "blocks.27.attn.to_q.bias", "blocks.27.attn.to_k.weight", "blocks.27.attn.to_k.bias", "blocks.27.attn.to_v.weight", "blocks.27.attn.to_v.bias", "blocks.27.attn.to_out.0.weight", "blocks.27.attn.to_out.0.bias".
Unexpected key(s) in state_dict: "y_embedder.embedding_table.weight", "x_embedder.proj.weight", "x_embedder.proj.bias", "blocks.0.attn.qkv.weight", "blocks.0.attn.qkv.bias", "blocks.0.attn.proj.weight", "blocks.0.attn.proj.bias", "blocks.1.attn.qkv.weight", "blocks.1.attn.qkv.bias", "blocks.1.attn.proj.weight", "blocks.1.attn.proj.bias", "blocks.2.attn.qkv.weight", "blocks.2.attn.qkv.bias", "blocks.2.attn.proj.weight", "blocks.2.attn.proj.bias", "blocks.3.attn.qkv.weight", "blocks.3.attn.qkv.bias", "blocks.3.attn.proj.weight", "blocks.3.attn.proj.bias", "blocks.4.attn.qkv.weight", "blocks.4.attn.qkv.bias", "blocks.4.attn.proj.weight", "blocks.4.attn.proj.bias", "blocks.5.attn.qkv.weight", "blocks.5.attn.qkv.bias", "blocks.5.attn.proj.weight", "blocks.5.attn.proj.bias", "blocks.6.attn.qkv.weight", "blocks.6.attn.qkv.bias", "blocks.6.attn.proj.weight", "blocks.6.attn.proj.bias", "blocks.7.attn.qkv.weight", "blocks.7.attn.qkv.bias", "blocks.7.attn.proj.weight", "blocks.7.attn.proj.bias", "blocks.8.attn.qkv.weight", "blocks.8.attn.qkv.bias", "blocks.8.attn.proj.weight", "blocks.8.attn.proj.bias", "blocks.9.attn.qkv.weight", "blocks.9.attn.qkv.bias", "blocks.9.attn.proj.weight", "blocks.9.attn.proj.bias", "blocks.10.attn.qkv.weight", "blocks.10.attn.qkv.bias", "blocks.10.attn.proj.weight", "blocks.10.attn.proj.bias", "blocks.11.attn.qkv.weight", "blocks.11.attn.qkv.bias", "blocks.11.attn.proj.weight", "blocks.11.attn.proj.bias", "blocks.12.attn.qkv.weight", "blocks.12.attn.qkv.bias", "blocks.12.attn.proj.weight", "blocks.12.attn.proj.bias", "blocks.13.attn.qkv.weight", "blocks.13.attn.qkv.bias", "blocks.13.attn.proj.weight", "blocks.13.attn.proj.bias", "blocks.14.attn.qkv.weight", "blocks.14.attn.qkv.bias", "blocks.14.attn.proj.weight", "blocks.14.attn.proj.bias", "blocks.15.attn.qkv.weight", "blocks.15.attn.qkv.bias", "blocks.15.attn.proj.weight", "blocks.15.attn.proj.bias", "blocks.16.attn.qkv.weight", "blocks.16.attn.qkv.bias", "blocks.16.attn.proj.weight", "blocks.16.attn.proj.bias", "blocks.17.attn.qkv.weight", "blocks.17.attn.qkv.bias", "blocks.17.attn.proj.weight", "blocks.17.attn.proj.bias", "blocks.18.attn.qkv.weight", "blocks.18.attn.qkv.bias", "blocks.18.attn.proj.weight", "blocks.18.attn.proj.bias", "blocks.19.attn.qkv.weight", "blocks.19.attn.qkv.bias", "blocks.19.attn.proj.weight", "blocks.19.attn.proj.bias", "blocks.20.attn.qkv.weight", "blocks.20.attn.qkv.bias", "blocks.20.attn.proj.weight", "blocks.20.attn.proj.bias", "blocks.21.attn.qkv.weight", "blocks.21.attn.qkv.bias", "blocks.21.attn.proj.weight", "blocks.21.attn.proj.bias", "blocks.22.attn.qkv.weight", "blocks.22.attn.qkv.bias", "blocks.22.attn.proj.weight", "blocks.22.attn.proj.bias", "blocks.23.attn.qkv.weight", "blocks.23.attn.qkv.bias", "blocks.23.attn.proj.weight", "blocks.23.attn.proj.bias", "blocks.24.attn.qkv.weight", "blocks.24.attn.qkv.bias", "blocks.24.attn.proj.weight", "blocks.24.attn.proj.bias", "blocks.25.attn.qkv.weight", "blocks.25.attn.qkv.bias", "blocks.25.attn.proj.weight", "blocks.25.attn.proj.bias", "blocks.26.attn.qkv.weight", "blocks.26.attn.qkv.bias", "blocks.26.attn.proj.weight", "blocks.26.attn.proj.bias", "blocks.27.attn.qkv.weight", "blocks.27.attn.qkv.bias", "blocks.27.attn.proj.weight", "blocks.27.attn.proj.bias".
size mismatch for final_layer.linear.weight: copying a param with shape torch.Size([32, 1152]) from checkpoint, the shape in current model is torch.Size([24, 1152]).
size mismatch for final_layer.linear.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([24]).
I would appreciate help solving this.
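For anyone debugging a similar mismatch, here is a small diagnostic sketch (generic PyTorch, path is a placeholder) that peeks at which parameter layout a checkpoint actually contains before calling load_state_dict; the missing "video_embedder.*" keys versus the unexpected "x_embedder.*"/"qkv" keys above suggest the checkpoint was saved from a different DiT variant than the one sample.py builds.

# Diagnostic sketch (placeholder path): inspect the checkpoint's parameter names
# and shapes before calling model.load_state_dict() on it.
import torch

state = torch.load("pretrained_models/DiT-XL-2-256x256.pt", map_location="cpu")
print(len(state), "tensors in checkpoint")
for name in sorted(state)[:10]:  # peek at the first few parameter names
    print(name, tuple(state[name].shape))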
巍峨的大秦岭
File "/data/Sora/Open-Sora-main/train.py", line 122, in main
ema = deepcopy(model)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 271, in _reconstruct
state = deepcopy(state, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 231, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 297, in _reconstruct
value = deepcopy(value, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 271, in _reconstruct
state = deepcopy(state, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 231, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 297, in _reconstruct
value = deepcopy(value, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 271, in _reconstruct
state = deepcopy(state, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 231, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 297, in _reconstruct
value = deepcopy(value, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 271, in _reconstruct
state = deepcopy(state, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 231, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/root/miniconda3/envs/opensora/lib/python3.10/copy.py", line 161, in deepcopy
rv = reductor(4)
TypeError: cannot pickle 'torch._C._distributed_c10d.ProcessGroup' object
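For context, a minimal illustration of one common workaround pattern; make_ema and model_factory below are hypothetical helpers sketched for this issue, not Open-Sora's actual fix. deepcopy breaks because the model object keeps a handle to a torch.distributed ProcessGroup, and ProcessGroup objects cannot be pickled; copying only the weights into a freshly constructed model sidesteps that.

import torch.nn as nn

def make_ema(model: nn.Module, model_factory) -> nn.Module:
    # model_factory: hypothetical zero-argument callable that builds a fresh,
    # identically configured model (e.g. a lambda around the DiT constructor).
    ema = model_factory()
    ema.load_state_dict(model.state_dict())  # copies tensors only, no process groups
    for p in ema.parameters():
        p.requires_grad_(False)  # EMA weights are not trained directly
    return ema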
Does this project support training on one's own dataset from scratch?
When I run the script mentioned above, the following bug occurs.
It seems to be related to the code "colossalai.launch_from_torch({})" (from inference.py).
How can I solve it? Thanks!
[W socket.cpp:601] [c10d] The IPv6 network addresses of (nma08-101-c-07-sev-nf5468-04u04, 52925) cannot be retrieved (gai error: -2 - Name or service not known).
(The same warning is repeated once per worker process.)
I am running the inference and this is what I am getting.
The command that I ran: python sample.py -m "DiT-XL/2" --text "a person is walking on the street" --ckpt /home/nlp/open_sora/Open-Sora/pretrained_models/DiT-XL-2-256x256.pt --height 256 --width 256 --fps 10 --sec 5 --disable-cfg
I downloaded the checkpoints using download.py in the given repo.
I ran the training scripts (using DiT-S/8 by default) successfully, and the loss curve is shown below.
However, the sampled results (also using the default sampling parameters, with DiT-XL/2 changed to DiT-S/8) are random noise.
Is that because the model is weak?
Could you please provide a recommended setting (hyper-parameters such as the model architecture, compression, etc.) that we should start with?
FreeInit is a method of improving temporal consistency with no extra training.
Project Page - https://tianxingwu.github.io/pages/FreeInit/
Code - https://github.com/TianxingWu/FreeInit
Demo - https://huggingface.co/spaces/TianxingWu/FreeInit
FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling
What is the expected run-time (in minutes or hours) for processing an image? Can it be done on CPU only, without a GPU?
Hi, will the checkpoints you trained be open-sourced?
Mainly, training requires GPUs and time. Could you provide a stable checkpoint so that we can run the demo examples day to day? Right now I have no idea what the results look like, so I would have to train on the data first and then check, which takes too long.
[03/07/24 14:50:30] INFO colossalai - colossalai - INFO: train.py:155 main
INFO colossalai - colossalai - INFO: Dataset contains 105060 samples
[03/07/24 14:52:00] INFO colossalai - colossalai - INFO: train.py:165 main
INFO colossalai - colossalai - INFO: Booster init max device memory: 1222.12 MB
Epoch 0: 0%| | 0/410 [00:00<?, ?it/s]/root/miniconda3/envs/opensora/lib/python3.8/site-packages/torch/utils/checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
(The same use_reentrant warning is repeated once per rank.)
When will the VQVAE training pipeline be open-sourced?
ControlNet is pretty much a must-have feature for diffusion models, so it would be nice to have it implemented.
For example, PixArt-alpha has a ControlNet-Transformer module (adapted from the UNet version) that allows it to take various conditionings.
Additionally, the authors of AnimateDiff have released the SparseCtrl ControlNet modification specifically for text2video, enabling it to take conditions sparsely at given frames instead of requiring them to be duplicated/interpolated across all frames.
https://github.com/guoyww/AnimateDiff#202312-animatediff-v3-and-sparsectrl
[2024-03-07 12:26:19,748] torch.distributed.run: [WARNING] master_addr is only used for static rdzv_backend and when rdzv_endpoint is not specified.
/root/miniconda3/lib/python3.10/site-packages/colossalai/shardformer/layer/normalization.py:45: UserWarning: Please install apex from source (https://github.com/NVIDIA/apex) to use the fused layernorm kernel
warnings.warn("Please install apex from source (https://github.com/NVIDIA/apex) to use the fused layernorm kernel")
unable to import lightllm kernels
/root/miniconda3/lib/python3.10/site-packages/colossalai/initialize.py:48: UserWarning: config is deprecated and will be removed soon.
warnings.warn("config is deprecated and will be removed soon.")
(The apex and lightllm messages above are repeated once per rank.)
Traceback (most recent call last):
File "/tmp/pycharm_project_146/train.py", line 267, in <module>
main(args)
File "/tmp/pycharm_project_146/train.py", line 96, in main
launch_from_torch({})
File "/root/miniconda3/lib/python3.10/site-packages/colossalai/initialize.py", line 173, in launch_from_torch
launch(
File "/root/miniconda3/lib/python3.10/site-packages/colossalai/initialize.py", line 61, in launch
cur_accelerator.set_device(local_rank)
File "/root/miniconda3/lib/python3.10/site-packages/colossalai/accelerator/cuda_accelerator.py", line 50, in set_device
torch.cuda.set_device(device)
File "/root/miniconda3/lib/python3.10/site-packages/torch/cuda/__init__.py", line 404, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
(The same traceback is repeated for each of the remaining ranks.)
train.py FAILED
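For reference, "invalid device ordinal" generally means the local rank index is larger than the number of GPUs visible to the process, e.g. when --nproc_per_node or CUDA_VISIBLE_DEVICES does not match the machine. A quick generic sanity check (plain PyTorch, not an Open-Sora utility):

# Generic sanity check: each rank must map to a visible GPU.
import os
import torch

local_rank = int(os.environ.get("LOCAL_RANK", 0))
visible = torch.cuda.device_count()
print(f"LOCAL_RANK={local_rank}, visible GPUs={visible}")
assert local_rank < visible, "launcher asks for more ranks than there are visible GPUs"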
Right now I only have a single A100 40G with 128 GB of RAM and a 1 TB NVMe drive. The official docs say training works on 8x A100 80G; with ZeRO-Infinity, my machine should also be able to train. Can this hardware support full-parameter training?
Also, does the project support LoRA or other PEFT fine-tuning methods?
Hi, hpcaitech!
I saw SD3 is on your todo list. While it hasn't been officially released yet, I made an unofficial MMDiT implementation based on their paper and OpenDiT (supporting joint CLIP and T5 embeddings as well).
I think it will be useful for you and save you some time.
SD3 will be released on 12th June, so it might be better to refer to their implementation
The MaskDiT project shows that it's possible to accelerate the training of a DiT by using masked transformers
Command: python sample.py -m "DiT/XL-2" --text "a person is walking on the street" --ckpt /path/to/checkpoint --height 256 --width 256 --fps 10 --sec 5 --disable-cfg
ERROR:
(open312) eduardo@eduardo-Creator-15M-A9SD:~/Documents/Open-Sora$ python sample.py -m "DiT/XL-2" --text "a person is walking on the street" --ckpt /path/to/checkpoint --height 256 --width 256 --fps 10 --sec 5 --disable-cfg
Traceback (most recent call last):
File "/home/eduardo/Documents/Open-Sora/sample.py", line 21, in
from open_sora.modeling import DiT_models
File "/home/eduardo/Documents/Open-Sora/open_sora/modeling/init.py", line 1, in
from .dit import DiT, DiT_models
File "/home/eduardo/Documents/Open-Sora/open_sora/modeling/dit/init.py", line 1, in
from .dit import SUPPORTED_SEQ_PARALLEL_MODES, DiT, DiT_models
File "/home/eduardo/Documents/Open-Sora/open_sora/modeling/dit/dit.py", line 22, in
from open_sora.utils.comm import gather_seq, split_seq
File "/home/eduardo/Documents/Open-Sora/open_sora/utils/comm.py", line 6, in
from colossalai.moe._operation import MoeInGradScaler, MoeOutGradScaler
ModuleNotFoundError: No module named 'colossalai.moe'
As you know from SD3's paper, they used Rectified Flow to make training and sampling faster. However, in the past month a new modification of Rectified Flow, named Piecewise Rectified Flow, was released.
Project page: https://piecewise-rectified-flow.github.io/
Github: https://github.com/magic-research/piecewise-rectified-flow/tree/main
It claims to be faster than normal Rectified Flow (used in PKU-YuanGroup/Open-Sora-Plan#43).
I believe it would be a huge quality/speed win compared to the vanilla diffusion pipeline that is used here at the moment.
Thanks for open-sourcing this incredible repo!
I found that when specifying 'vqvae' for the --compressor argument in train.py, it requires access to a pretrained model on Hugging Face. Could you please provide access to that model?
Best
Hi,
We have contributed the first dataset featuring 1.67 million unique text-to-video prompts and 6.69 million videos generated by 4 different state-of-the-art diffusion models. We hope it can help your Open-Sora plan.
Title: VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models
arXiv: https://arxiv.org/abs/2403.06098
Hello, I was wondering why NaViT, or an architecture similar to it, was not used as the vision transformer architecture. NaViT natively supports multi-resolution training (hence "native resolution") as one of its defining features, and a similar architecture was used for OpenAI's Sora to allow good visual fidelity at differing resolutions. In the Latte paper, section 4.1 states that the model was trained only on square images/videos and would require resizing to process non-square images/videos.
I first used VQVAE for video compression and the code ran fine, but the loss dropped very slowly.
So I changed the AE to a VAE, and I got an OOM error, even though I set batch_size and accumulation_steps to 1.
Has anyone encountered this problem too?
The expand_mask_4d function makes a huge allocation for large tensor sizes. When the sequence length is ~200k, it tries to allocate 720 GB.
How is it possible to reach such high sequence lengths without going OOM when creating the masks?
Thanks
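For context, a back-of-the-envelope sketch of why a dense 4D attention mask blows up; the shape [batch, heads, seq_len, seq_len] and float32 dtype below are illustrative assumptions, not the exact tensors expand_mask_4d builds.

# Rough memory estimate for a dense 4D attention mask (illustrative assumption).
def dense_mask_bytes(batch: int, heads: int, seq_len: int, bytes_per_elem: int = 4) -> int:
    # Memory for a mask of shape [batch, heads, seq_len, seq_len].
    return batch * heads * seq_len * seq_len * bytes_per_elem

# Even with batch=1 and a single broadcast head, 200k tokens in float32 already
# needs ~149 GiB; every extra batch or head dimension multiplies that further,
# which is why long sequences usually rely on block-sparse or implicit masks.
print(dense_mask_bytes(1, 1, 200_000) / 2**30, "GiB")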
Hello!
As you probably know, there are developments proposing to switch away from the traditional transformer attention architecture due to its quadratic context cost. While approaches such as Mamba are rather exotic and may be too complicated for existing pipelines such as ControlNet-Transformer, other sub-quadratic alternatives have been proposed recently. An example is ReBased (Linear Transformers with Learnable Kernels, https://github.com/corl-team/rebased), which seems to fare better than Mamba.
It may also be worth taking a look at Large World Model's ring attention (https://github.com/lucidrains/ring-attention-pytorch), which extends the context window to millions of tokens while reliably passing the needle-in-a-haystack test.
Here's my implementation for Latte Vchitect/Latte#51
Hi, thanks for your great work! This is super useful. However, one minor issue: it seems that this framework only supports A100 nodes and gets stuck on H100 nodes. I wonder whether H100 support is in progress?
If I use 2 or more GPUs for inference, the following error occurs.
Traceback (most recent call last):
File "/hub_data1/minhyuk/diffusion/opensora/scripts/inference.py", line 114, in <module>
main()
File "/hub_data1/minhyuk/diffusion/opensora/scripts/inference.py", line 95, in main
samples = scheduler.sample(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/__init__.py", line 72, in sample
samples = self.p_sample_loop(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 434, in p_sample_loop
for sample in self.p_sample_loop_progressive(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 485, in p_sample_loop_p
rogressive
out = self.p_sample(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 388, in p_sample
out = self.p_mean_variance(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/respace.py", line 94, in p_mean_variance
return super().p_mean_variance(self._wrap_model(model), *args, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 267, in p_mean_variance
model_output = model(x, t, **model_kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/respace.py", line 127, in __call__
return self.model(x, new_ts, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/__init__.py", line 89, in forward_with_cfg
model_out = model.forward(combined, timestep, y, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/models/stdit/stdit.py", line 267, in forward
x = auto_grad_checkpoint(block, x, y, t0, y_lens, tpe)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/acceleration/checkpoint.py", line 24, in auto_grad_checkpoint
return module(*args, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/models/stdit/stdit.py", line 111, in forward
x = x + self.cross_attn(x, y, mask)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/models/layers/blocks.py", line 313, in forward
kv = self.kv_linear(cond).view(B, -1, 2, self.num_heads, self.head_dim)
RuntimeError: shape '[4, -1, 2, 16, 72]' is invalid for input of size 105523
I tested on 2/3/4 GPUs, and all give the same error.
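For reference on why the .view() fails: the total element count must be divisible by the product of the fixed dimensions so the -1 dimension can be inferred, and here it is not, which suggests the conditioning tensor does not have the size the reshape expects in the multi-GPU setting. A tiny check using the numbers from the error message:

# The numbers come straight from the error message above.
total = 105523                 # elements in the cond tensor handed to kv_linear
fixed = 4 * 2 * 16 * 72        # B * 2 (k and v) * num_heads * head_dim
print(total % fixed)           # non-zero, so "-1" cannot be inferred and view() raises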
Dear Authors,
Thanks for your great work!
I've just read your report and have some questions regarding the choice of the VAE. You mentioned that VideoGPT yields poor performance, so you chose a 2D VAE because state-of-the-art 3D VAEs like MAGVIT-v1/v2 are not open-sourced.
My question is: have you ever tried other 3D VAE variants like TATS (Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer)?
Thanks in advance!
The earlier diffusers problem is gone, but a new one has appeared. I'd appreciate help with it.
Is there any restriction on the torch version? The version I'm using is torch==2.1.2+cu121.
The current error message:
TypeError: cannot pickle 'torch._C._distributed_c10d.ProcessGroup' object
[2024-03-06 10:13:06,886] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 52488) of binary: /root/miniconda3/bin/python
I suspect it is a torch version issue.
Edit: it seems we can't use the smaller models, so it would be handy to have a way to load the xxl model in 8-bit format for smaller-VRAM GPUs. It's doable for the PixArt image-generation models using the diffusers library.
I tried to use the google/t5-v1_1-large model as the text encoder instead of DeepFloyd/t5-v1_1-xxl, but encountered the following error.
RuntimeError: Error(s) in loading state_dict for STDiT:
size mismatch for y_embedder.y_embedding: copying a param with shape torch.Size([120, 4096]) from checkpoint, the shape in current model is torch.Size([120, 1024]).
size mismatch for y_embedder.y_proj.fc1.weight: copying a param with shape torch.Size([1152, 4096]) from checkpoint, the shape in current model is torch.Size([1152, 1024]).
It seems the output embedding dimension for the large model is 1024 and for xxl it is 4096, and the Open-Sora weights only accept embeddings from the xxl model, i.e. 4096-dim.
Is there any way we can use the t5-large model instead of the xxl model? I want to run inference on cloud GPUs, e.g. a T4 in Colab notebooks.
Here's the notebook, as a gist, that I used to run on Colab:
https://gist.github.com/sandeshrajbhandari/ac3857cd2aaae5e3a9de0d7c219ac351
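For what it's worth, here is a hedged sketch of loading the xxl text encoder in 8-bit via bitsandbytes; this is the generic transformers recipe, not something wired into the Open-Sora scripts, and the model id and tokenizer source are assumptions.

# Sketch: load the T5 xxl encoder in 8-bit to fit smaller-VRAM GPUs.
# Requires: pip install transformers accelerate bitsandbytes
import torch
from transformers import AutoTokenizer, BitsAndBytesConfig, T5EncoderModel

model_id = "DeepFloyd/t5-v1_1-xxl"  # assumption: same checkpoint Open-Sora's t5.py loads

tokenizer = AutoTokenizer.from_pretrained(model_id)  # if tokenizer files are missing, try google/t5-v1_1-xxl
encoder = T5EncoderModel.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",  # let accelerate place layers on the available GPU/CPU
)

with torch.no_grad():
    tokens = tokenizer("a person is walking on the street", return_tensors="pt").to(encoder.device)
    text_emb = encoder(**tokens).last_hidden_state  # shape [1, seq_len, 4096]
print(text_emb.shape)

An 8-bit encoder takes roughly a quarter of the fp32 footprint, so it has a chance of fitting a T4-class card while still producing the 4096-dim embeddings the released weights expect.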
When I run inference, there is an ImportError:
scripts/inference.py FAILED
python sample.py -m "DiT/XL-2" --text "a person is walking on the street" --ckpt /path/to/checkpoint --height 256 --width 256 --fps 10 --sec 5 --disable-cfg
What is the path to the checkpoint? Can you provide the weights or tell us what weights to use?
FileNotFoundError: [Errno 2] No such file or directory: '/root/miniconda3/lib/python3.10/site-packages/colossalai/kernel/extensions/csrc/cuda/cpu_adam.cpp'
Here is the server info: Linux #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
The module cannot be found: ModuleNotFoundError: No module named 'colossalai._C.cpu_adam_x86'
I trained with the default settings on 8 GPUs for two or three days on roughly 50k videos, but the outputs don't look like video at all, just patch-like mosaics. Is the training insufficient, or did something go wrong somewhere?
/data/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/models/text_encoder/t5.py(145)__init__()
-> self.model = T5EncoderModel.from_pretrained(path, **t5_model_kwargs).eval()
(Pdb) n
ModuleNotFoundError: No module named 'fused_layer_norm_cuda'
/data/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/models/text_encoder/t5.py(145)__init__()
-> self.model = T5EncoderModel.from_pretrained(path, **t5_model_kwargs).eval()
AdaptiveDetector
For improved results in scene detection, I recommend using the AdaptiveDetector instead of the ContentDetector. The AdaptiveDetector provides a more nuanced approach, especially for videos with varying lighting or content. Here's how you can use it in your project:
from scenedetect import AdaptiveDetector, detect, split_video_ffmpeg

# Path to the input video
video_path = 'your_video_path_here.mp4'
# Directory to save the output clips (must already exist)
video_dir = 'your_output_directory_here'

# Perform scene detection with the adaptive detector
scene_list = detect(
    video_path,
    AdaptiveDetector(
        luma_only=True,
        adaptive_threshold=1.5,
        min_scene_len=3,
    ),
)

# Split and save the detected scenes into separate clips
split_video_ffmpeg(
    video_path,
    scene_list,
    output_file_template=f'{video_dir}/clip_$SCENE_NUMBER.mp4',
)