nivha / single_video_generation

Home Page: https://nivha.github.io/vgpnn

License: Creative Commons Zero v1.0 Universal


single_video_generation's Introduction

VGPNN:
Diverse Generation from a Single Video Made Possible

Accepted to ECCV 2022

PyTorch implementation of the paper "Diverse Generation from a Single Video Made Possible".

Code

Data

You can download videos from this Dropbox Videos Folder into the ./data folder.

Note that a video is represented as a directory of PNG files named <frame number>.png (a sketch for producing this layout from a raw video file follows the example below).

For example:

some/path/my_video/
   1.png
   2.png
   3.png
   ...
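
If you start from a raw video file rather than frames, a minimal sketch for dumping it into this layout, assuming OpenCV is available (this helper is not part of the repo):

# dump_frames.py -- not part of the repo; a minimal sketch using OpenCV
import os
import cv2

def dump_frames(video_path, out_dir):
    """Write each frame of video_path as <frame number>.png under out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    i = 1  # frame numbering starts at 1, matching the layout above
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, f"{i}.png"), frame)
        i += 1
    cap.release()

dump_frames("my_video.mp4", "some/path/my_video")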

Video generation

To generate a new sample from a single video:

python run_generation.py --gpu 0 --frames_dir <path to frames dir> --start_frame <number of first frame> --end_frame <number of last frame>

Examples:

python run_generation.py --frames_dir=data/airballoons_QGAMTlI6XxY --start_frame=66 --end_frame=80
python run_generation.py --frames_dir=data/airballoons_QGAMTlI6XxY --start_frame=66 --end_frame=165 --max_size=360 --sthw='(0.5,1,1)'
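
Since each run produces one sample, several diverse samples can be drawn simply by invoking the script repeatedly. A sketch, not part of the repo (the per-sample results_dir naming here is an assumption to keep outputs separate):

# sample_many.py -- not part of the repo; re-runs run_generation.py to draw
# several samples (each run draws its own noise)
import subprocess

for i in range(5):
    subprocess.run([
        "python", "run_generation.py",
        "--gpu", "0",
        "--frames_dir", "data/airballoons_QGAMTlI6XxY",
        "--start_frame", "66",
        "--end_frame", "80",
        "--results_dir", f"results/generation/sample_{i}",  # hypothetical naming
    ], check=True)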

Video analogies

Please download the raft-sintel.pth model from RAFT (or directly from here) and place it at ./raft/models/raft-sintel.pth

To compute a new video with the spatio-temporal layout of video A and the appearance of video B:

python run_analogies.py --a_frames_dir <A frames dir> --b_frames_dir <B frames dir> --a_n_bins <A: number of dynamic bins> --b_n_bins <B: number of dynamic bins> --results_dir <results dir>

For example:

python run_analogies.py --a_frames_dir data/waterfall_Qo3OM5sPUPM --b_frames_dir data/lava_m_e7jUfvt-I --a_n_bins 4 --b_n_bins 8 --results_dir results/wfll2lava

Video Retargeting

Retargeting is similar to generation, but with a different aspect ratio for the output and without adding any noise:

python run_generation.py --gpu 0 --frames_dir <path to frames dir> --start_frame <number of first frame> --end_frame <number of last frame> --use_noise False --sthw '(ST,SH,SW)'

Here (ST,SH,SW) are the desired scales for the temporal, height, and width dimensions, respectively. E.g., (1,1,1) leaves the output unchanged, whereas (1,1,0.5) generates a retargeted result with the same number of frames and height, but half the width of the input.

For example:

python run_generation.py --gpu 0 --frames_dir data/airballoons_QGAMTlI6XxY --start_frame 66 --end_frame 80 --max_size 360 --use_noise False --min_size '(3,40)' --kernel_size '(3,7,7)' --downfactor '(0.87,0.82)' --sthw '(1,1,0.6)'
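
As a quick sanity check on sthw, the output dimensions it implies can be estimated as follows (a sketch of the arithmetic described above; the repo's exact rounding may differ):

# Estimate the retargeted output shape implied by sthw = (ST, SH, SW).
# Input dimensions here are hypothetical; exact rounding may differ.
T, H, W = 15, 360, 600            # frames, height, width
st, sh, sw = 1, 1, 0.6            # scales used in the example above
out_shape = (round(T * st), round(H * sh), round(W * sw))
print(out_shape)                  # -> (15, 360, 360)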

Citation

If you find our project useful for your work, please cite:

@inproceedings{haim2022diverse,
  title={Diverse generation from a single video made possible},
  author={Haim, Niv and Feinstein, Ben and Granot, Niv and Shocher, Assaf and Bagon, Shai and Dekel, Tali and Irani, Michal},
  booktitle={European Conference on Computer Vision},
  pages={491--509},
  year={2022},
  organization={Springer}
}

single_video_generation's People

Contributors

nivha

single_video_generation's Issues

AssertionError in run_generation.py

Hello, I'm having some problems trying to run the code.

First of all, I ran into a device issue in utils/resize_right.py and had to set the device explicitly in get_field_of_view:

mirror = fw_cat((fw.arange(in_sz), fw.arange(in_sz - 1, -1, step=-1)), fw)
mirror = fw_set_device(mirror, projected_grid.device, fw) # <-- avoids device error in the following line
field_of_view = mirror[fw.remainder(field_of_view, mirror.shape[0])]

Now I'm running into an AssertionError related to the temporal dimension:

Namespace(gpu='0', results_dir='output/', frames_dir='data/', start_frame=1, end_frame=15, max_size=144, min_size=(3, 15), downfactor=(0.85, 0.85), J=5, J_start_from=1, kernel_size=(3, 7, 7), sthw=(0.5, 1, 1), reduce='median', vgpnn_type='pm', use_noise=True, noise_amp=5, verbose=True, save_intermediate=True, save_intermediate_path='output/')
Traceback (most recent call last):
  File "/home/hans/code/vgpnn/run_generation.py", line 82, in <module>
    VGPNN, orig_vid = vgpnn.get_vgpnn(
  File "/home/hans/code/vgpnn/vgpnn.py", line 293, in get_vgpnn
    assert (
AssertionError: smallest pyramid level has less frames 2 than temporal kernel-size 3. You may want to increase min_size of the temporal dimension

I'm using all the default settings; I've only set --frames_dir and --results_dir. My video is square and about 300 frames long (but the default settings only look at the first 15 frames).

I've also tried setting --min_size 4,15 but ran into exactly the same error.

Any idea what could be going wrong?
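
For reference, repeated downscaling plus rounding can push the coarsest temporal size below the kernel size even when min_size looks safe. A rough illustration, assuming (a guess, not the repo's exact schedule in scale_utils) that the temporal size at level k is roughly round(T * downfactor**k):

# Rough illustration of how 15 frames can shrink to 2 at the coarsest level.
# The per-level schedule here is an assumption, not the repo's computation.
T, downfactor = 15, 0.85
sizes = [round(T * downfactor ** k) for k in range(14)]
print(sizes)  # the tail dips to 2, below the temporal kernel size of 3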

Thank you very much for your work. While running your code, I encountered some problems. May I know how to solve them?

(vgpnn) C:\ai\single_video_generation-main>python run_generation.py --gpu 0 --frames_dir data/airballoons_QGAMTlI6XxY --start_frame 66 --end_frame 80 --max_size 360 --use_noise False --min_size '(6,40)'
Namespace(J=5, J_start_from=1, downfactor=(0.85, 0.85), end_frame=80, frames_dir='data/airballoons_QGAMTlI6XxY', gpu='0', kernel_size=(3, 7, 7), max_size=360, min_size='(6,40)', noise_amp=5, reduce='median', results_dir='./results/generation', save_intermediate=True, save_intermediate_path='./results/generation', start_frame=66, sthw=(0.5, 1, 1), use_noise=False, verbose=True, vgpnn_type='pm')
Traceback (most recent call last):
  File "run_generation.py", line 70, in <module>
    VGPNN, orig_vid = vgpnn.get_vgpnn(
  File "C:\ai\single_video_generation-main\vgpnn.py", line 217, in get_vgpnn
    downscales, upscale_factors, out_shapes = scale_utils.get_scales_out_shapes(T, H, W, o.downfactor, o.min_size)
  File "C:\ai\single_video_generation-main\utils\scale_utils.py", line 48, in get_scales_out_shapes
    assert T >= min_size[0], f"min_size ({min_size[0]},{min_size[1]}) larger than original size ({T},{H},{W}) (it must be smaller)"
TypeError: '>=' not supported between instances of 'int' and 'str'

(vgpnn) C:\ai\single_video_generation-main>python run_generation.py --gpu 0 --frames_dir data/airballoons_QGAMTlI6XxY --start_frame 6 --end_frame 80 --use_noise False
Namespace(J=5, J_start_from=1, downfactor=(0.85, 0.85), end_frame=80, frames_dir='data/airballoons_QGAMTlI6XxY', gpu='0', kernel_size=(3, 7, 7), max_size=144, min_size=(3, 15), noise_amp=5, reduce='median', results_dir='./results/generation', save_intermediate=True, save_intermediate_path='./results/generation', start_frame=6, sthw=(0.5, 1, 1), use_noise=False, verbose=True, vgpnn_type='pm')
Traceback (most recent call last):
  File "run_generation.py", line 70, in <module>
    VGPNN, orig_vid = vgpnn.get_vgpnn(
  File "C:\ai\single_video_generation-main\vgpnn.py", line 226, in get_vgpnn
    assert ret_pyr[0].shape[2] >= o.kernel_size[0], f'smallest pyramid level has less frames {ret_pyr[0].shape[2]} than temporal kernel-size {o.kernel_size[0]}. You may want to increase min_size of the temporal dimension'
AssertionError: smallest pyramid level has less frames 2 than temporal kernel-size 3. You may want to increase min_size of the temporal dimension
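
The TypeError above is consistent with --min_size reaching the code as the literal string '(6,40)' (note the quotes in the echoed Namespace): Windows cmd passes single quotes through verbatim. A minimal sketch of tolerant tuple parsing for such a flag, assuming the script uses argparse (this parse_tuple helper is hypothetical, not the repo's code):

# Hypothetical tolerant parser for tuple-valued flags like --min_size;
# a sketch of one way to avoid the TypeError, not the repo's code.
import argparse
import ast

def parse_tuple(s):
    # Accept "(6,40)", "'(6,40)'" (Windows cmd keeps the quotes), or "6,40".
    return tuple(ast.literal_eval(s.strip().strip("'\"")))

parser = argparse.ArgumentParser()
parser.add_argument("--min_size", type=parse_tuple, default=(3, 15))
print(parser.parse_args(["--min_size", "'(6,40)'"]).min_size)  # -> (6, 40)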

run_generation.py stops with an error

When I run it on the ballet example like this:

python3 run_generation.py --frames_dir=/home/frank/tmp/ballet/ballet_Wz_f9B4pPtg --start_frame=1 --end_frame=60

Then I get this error:

Namespace(gpu='0', results_dir='./results/generation', frames_dir='/home/frank/tmp/ballet/ballet_Wz_f9B4pPtg', start_frame=1, end_frame=60, max_size=144, min_size=(3, 15), downfactor=(0.85, 0.85), J=5, J_start_from=1, kernel_size=(3, 7, 7), sthw=(0.5, 1, 1), reduce='median', vgpnn_type='pm', use_noise=True, noise_amp=5, verbose=True, save_intermediate=True, save_intermediate_path='./results/generation')
/home/frank/.local/lib/python3.11/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
Traceback (most recent call last):
  File "/home/frank/tmp/single_video_generation/run_generation.py", line 70, in <module>
    VGPNN, orig_vid = vgpnn.get_vgpnn(
                      ^^^^^^^^^^^^^^^^
  File "/home/frank/tmp/single_video_generation/vgpnn.py", line 215, in get_vgpnn
    orig_vid = read_original_video(o.frames_dir, o.start_frame, o.end_frame, o.max_size, o.device, verbose=verbose, ext=ext)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/frank/tmp/single_video_generation/utils/main_utils.py", line 48, in read_original_video
    frames = read_frames(frames_dir, start_frame, end_frame, resizer, device=device, verbose=verbose, ext=ext)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/frank/tmp/single_video_generation/utils/main_utils.py", line 35, in read_frames
    x = frame_resizer(x)
        ^^^^^^^^^^^^^^^^
  File "/home/frank/tmp/single_video_generation/utils/resize_right.py", line 71, in resize
    field_of_view, weights = prepare_weights_and_field_of_view_1d(
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/frank/tmp/single_video_generation/utils/resize_right.py", line 165, in prepare_weights_and_field_of_view_1d
    field_of_view = get_field_of_view(projected_grid, cur_support_sz, in_sz,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/frank/tmp/single_video_generation/utils/resize_right.py", line 284, in get_field_of_view
    field_of_view = mirror[fw.remainder(field_of_view, mirror.shape[0])]
                    ~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
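
This RuntimeError is the same device mismatch worked around in the first issue above; the workaround posted there (moving mirror onto the device of the indexing tensor just before this line in get_field_of_view) should apply here as well:

# Workaround from the first issue above: put `mirror` on the same device as
# the indexing tensor before the fancy-indexing line in get_field_of_view.
mirror = fw_set_device(mirror, projected_grid.device, fw)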
