
CURE: Learning Cross-Video Neural Representations for High-Quality Frame Interpolation

Computational Imaging Group (CIG), Washington University in St. Louis

Accepted to ECCV 2022

arXiv | Project Page | Video

Official PyTorch implementation of "Learning Cross-Video Neural Representations for High-Quality Frame Interpolation"

Requirements and Dependencies

*Other package versions might affect performance.

opencv-python==4.5.4.58
torchvision==0.11.1
numpy==1.22.0
scipy==1.7.1
pillow>=9.0.1
tqdm==4.62.3
h5py==3.6.0
pytorch-lightning>=1.5.10
CUDA>=11.3
PyTorch>=1.9.1
CUDNN>=8.0.5
Python==3.8.10

Installation

Create conda environment

$ conda update -n base -c defaults conda
$ conda create -n CURE python=3.8.10 anaconda
$ conda activate CURE
$ conda install pytorch=1.10.2 torchvision torchaudio cudatoolkit=11.3 -c pytorch
$ pip install -r requirements.txt
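
Before moving on, it may help to confirm that the environment sees the GPU. A minimal check (not part of the repository), assuming the versions listed above:

$ python -c "print('check below')"  # or run the snippet in a Python shell

    # Sanity check: confirm PyTorch and CUDA versions match the requirements.
    import torch

    print("PyTorch:", torch.__version__)              # expected >= 1.9.1
    print("CUDA:", torch.version.cuda)                 # expected >= 11.3
    print("GPU available:", torch.cuda.is_available())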

Download the repository

$ git clone https://github.com/wustl-cig/CURE.git

Download the pre-trained model checkpoint, CURE.pth.tar (OneDrive) or CURE.pth.tar (Google Drive), and place the file in the root of the CURE project directory.
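
As a quick sanity check (not part of the repository), the downloaded file can be inspected with torch.load; the exact key layout depends on how the checkpoint was saved:

    # Hedged sanity check: peek inside the downloaded checkpoint.
    import torch

    ckpt = torch.load("CURE.pth.tar", map_location="cpu")
    if isinstance(ckpt, dict):
        print(list(ckpt.keys())[:10])  # peek at the top-level keys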

Usage

Test on custom frames: synthesize the intermediate frame between two inputs:

$ python test_custom_frames.py -f0 path/to/the/first/frame -f1 path/to/the/second/frame -fo path/to/the/output/frame -t 0.5

This synthesizes the middle frame between the first and second input frames.

Or just run

$ python test_custom_frames.py

This generates a result from the provided sample images; the output image is written to predict.png.
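
To run the script over many frame pairs, it can be driven from a short wrapper. A minimal sketch (not part of the repository), assuming a hypothetical my_frames/ directory of alphabetically ordered PNGs; only the CLI flags documented above are used:

    # Sketch: interpolate every consecutive frame pair in a directory.
    import subprocess
    from pathlib import Path

    frames = sorted(Path("my_frames").glob("*.png"))   # hypothetical input dir
    Path("out").mkdir(exist_ok=True)
    for i, (f0, f1) in enumerate(zip(frames, frames[1:])):
        subprocess.run(
            ["python", "test_custom_frames.py",
             "-f0", str(f0), "-f1", str(f1),
             "-fo", f"out/mid_{i:04d}.png", "-t", "0.5"],
            check=True,
        )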

Interpolate a custom video:

  1. Place your custom video at any path

  2. Run the following command:

    $ python intv.py -f 2 -vdir /path/to/video -rs 1920,1080 -nf -1 -d 1
    
    • -f: FPS multiplication factor (e.g., 2 doubles the frame rate)

    • -vdir: the path to the video

    • -rs: output resolution (1K resolution requires at least 24 GB of GPU memory)

    • -nf: crop the video, keeping only this many frames from the beginning

    • -d: temporal downsampling factor (reduces the input FPS before interpolation)

    The output video is written to the same directory as the input video; see the sketch below for estimating the output frame rate.
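
    As a back-of-the-envelope aid, the flag descriptions above imply a simple relation between input and output frame rates. The helper below is an assumption based only on those descriptions, not on intv.py internals:

        # Assumption: -d divides the input FPS, -f multiplies it back up.
        def estimate_output_fps(input_fps: float, f: int, d: int) -> float:
            return input_fps / d * f

        # Example: a 30 FPS clip with -d 1 and -f 2 comes out at ~60 FPS.
        print(estimate_output_fps(30.0, f=2, d=1))  # 60.0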

Test on datasets: obtain quantitative results on the benchmark datasets.

  1. Download the dataset and place it in CURE/data/, or specify the dataset root directory with the -dbdir argument.


    The data directory should look like this:

       CURE
       └── data
           ├── vimeo_triplet
           │   ├── sequences
           │   └── tri_testlist.txt
           ├── ucf101
           │   ├── 1
           │   └── ...
           ├── Xiph
           │   └── test_4k
           ├── SNU-FILM
           │   ├── eval_modes
           │   │   ├── test-easy.txt
           │   │   └── ...
           │   └── test
           │       ├── GOPRO_test
           │       └── YouTube_test
           ├── x4k
           │   ├── test
           │   └── x4k.txt
           └── nvidia_data_full
               ├── Balloon1-2
               ├── Balloon2-2
               ├── DynamicFace-2
               └── ...

  2. Run the following command with the argument -d <dataset name>, where the dataset name can be any of "ucf101", "vimeo90k", "sfeasy", "sfmedium", "sfhard", "sfextreme", "nvidia", "xiph4k", "x4k".

    For example, to test on the Vimeo90K dataset, run:

    $ python test.py -d vimeo90k
    

    After completion, it prints the test results and generates a text file named result.txt. A scripted run over all datasets is sketched below.
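
    To benchmark every supported dataset in one go, the command can be looped. A minimal sketch (not part of the repository) that archives each result.txt under a per-dataset name; the flag and file names are those documented above:

        # Sketch: run all benchmarks and keep each report.
        import shutil
        import subprocess

        DATASETS = ["ucf101", "vimeo90k", "sfeasy", "sfmedium", "sfhard",
                    "sfextreme", "nvidia", "xiph4k", "x4k"]

        for name in DATASETS:
            subprocess.run(["python", "test.py", "-d", name], check=True)
            shutil.copy("result.txt", f"result_{name}.txt")  # per-dataset copy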

Experiment Results

If the environment is properly set up, you should obtain the following results:

[Table: quantitative experiment results]

Citation

@misc{shangguan2022learning,
      title={Learning Cross-Video Neural Representations for High-Quality Frame Interpolation}, 
      author={Wentao Shangguan and Yu Sun and Weijie Gan and Ulugbek S. Kamilov},
      year={2022},
      eprint={2203.00137},
      archivePrefix={arXiv},
      primaryClass={eess.IV}
}

Acknowledgment

RAFT is adapted from the official implementation: https://github.com/princeton-vl/RAFT

Some functions are adapted from LIIF: https://github.com/yinboc/liif

PyTorch Lightning: https://www.pytorchlightning.ai

Official dataset sources:

Vimeo90K: http://toflow.csail.mit.edu

SNU-FILM: https://myungsub.github.io/CAIN/

UCF101: https://www.crcv.ucf.edu/data/UCF101.php

XIPH4k: https://media.xiph.org/video/derf/

Nvidia Dynamic Scene: https://research.nvidia.com/publication/2020-06_novel-view-synthesis-dynamic-scenes-globally-coherent-depths

X4k: https://github.com/JihyongOh/XVFI
