This project forked from mcg-nju/ema-vfi


[CVPR 2023] Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation

License: Apache License 2.0


Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
Accepted by CVPR 2023
Guozhen Zhang, Yuhan Zhu, Haonan Wang, Youxin Chen, Gangshan Wu, Limin Wang


💥 News

  • [2023.03.12] We compared our method with other methods (VFIFormer and M2M) under extreme cases such as large motion and scene transitions. The video demonstrating our results can be found here (Bilibili).
  • [2023.03.12] Thanks to @jhogsett, our model now has a more user-friendly WebUI!

😆 Highlights

In this work, we propose to exploit inter-frame attention for extracting motion and appearance information in video frame interpolation. In particular, we use the correlation information hidden within the attention map to simultaneously enhance the appearance information and model motion. We also devise a hybrid CNN and Transformer framework to achieve a better trade-off between performance and efficiency. Experimental results show that our proposed module achieves state-of-the-art performance on both fixed- and arbitrary-timestep interpolation while remaining efficient compared with the previous SOTA method.
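To make the idea concrete, here is a toy sketch of inter-frame attention on a 1-D "image". It is not the authors' implementation: it assumes identity query/key/value projections, and the names `inter_frame_attention` and `one_hot` are hypothetical. One attention map is used twice: to aggregate appearance from the other frame, and to estimate motion as the attention-weighted position in the other frame minus the token's own position.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def inter_frame_attention(feat0, feat1, pos0, pos1):
    """Toy 1-D inter-frame attention (identity Q/K/V projections are an assumption)."""
    d = len(feat0[0])
    appearance, motion = [], []
    for q, p in zip(feat0, pos0):
        # Similarity of this frame-0 token to every frame-1 token.
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in feat1]
        attn = softmax(logits)
        # Appearance: attention-weighted sum of frame-1 features.
        appearance.append([sum(w * v[c] for w, v in zip(attn, feat1)) for c in range(d)])
        # Motion: expected matched position in frame 1 minus own position.
        motion.append(sum(w * x for w, x in zip(attn, pos1)) - p)
    return appearance, motion

def one_hot(i, d=4, scale=10.0):
    # Hypothetical, strongly distinctive token features for the demo.
    return [scale if c == i else 0.0 for c in range(d)]

# Frame 1 contains the frame-0 content shifted one position to the right.
feat0 = [one_hot(0), one_hot(1), one_hot(2)]
feat1 = [one_hot(3), one_hot(0), one_hot(1), one_hot(2)]
appearance, motion = inter_frame_attention(feat0, feat1, [0, 1, 2], [0, 1, 2, 3])
print(motion)  # each value is very close to +1
```

Because the features are nearly orthogonal, each frame-0 token attends almost entirely to its shifted copy, so the recovered motion is approximately one position to the right.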

Runtime and memory usage compared with previous SOTA method:

💕 Dependencies

  • torch 1.8.0
  • python 3.8
  • skimage 0.19.2
  • numpy 1.23.1
  • opencv-python 4.6.0
  • timm 0.6.11
  • tqdm
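A quick way to compare an environment against the list above is sketched below. It is an illustrative helper, not part of the repository: the function name `check_dependencies` is hypothetical, and the PyPI distribution name `scikit-image` is used for the `skimage` import. A prefix match is used so that builds such as `1.8.0+cu111` still count as `1.8.0`.

```python
from importlib import metadata

# Versions from the list above; tqdm has no pinned version.
PINNED = {
    "torch": "1.8.0",
    "scikit-image": "0.19.2",  # imported as `skimage`
    "numpy": "1.23.1",
    "opencv-python": "4.6.0",
    "timm": "0.6.11",
    "tqdm": None,
}

def check_dependencies(pins=PINNED):
    """Return {name: (expected, installed, ok)} for each distribution."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        ok = installed is not None and (
            expected is None or installed.startswith(expected)
        )
        report[name] = (expected, installed, ok)
    return report

if __name__ == "__main__":
    for name, (expected, installed, ok) in check_dependencies().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {expected or 'any'}, found {installed} [{status}]")
```

A mismatch is only a hint: newer versions may still work, but the versions listed above are the ones stated for this project.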

😎 Play with Demos

  1. Download the model checkpoints (Baidu, code: gi5j) and put the ckpt folder into the root dir.
  2. Run the following commands to generate 2x and Nx (arbitrary) frame interpolation demos:
python demo_2x.py        # for 2x interpolation
python demo_Nx.py --n 8  # for 8x interpolation
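For intuition about the `--n` flag, the sketch below lists the timesteps at which intermediate frames would be generated, assuming they are evenly spaced between the two input frames (t = 0 and t = 1). The helper name `interpolation_timesteps` is hypothetical and is not taken from `demo_Nx.py`.

```python
def interpolation_timesteps(n):
    """Timesteps in (0, 1) for Nx interpolation, assuming uniform spacing."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # Nx interpolation inserts n - 1 frames between each input pair.
    return [i / n for i in range(1, n)]

print(interpolation_timesteps(2))  # [0.5]
print(interpolation_timesteps(8))  # 7 timesteps: 0.125, 0.25, ..., 0.875
```

Under this assumption, 2x interpolation is just the special case of a single frame at t = 0.5.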

Running the demo commands above should produce the following examples by default:

✨ Training for Fixed-timestep Interpolation

  1. Download Vimeo90K dataset
  2. Run the following command at the root dir:
python -m torch.distributed.launch --nproc_per_node=4 train.py --world_size 4 --batch_size 8 --data_path **YOUR_VIMEO_DATASET_PATH**

The default training setting is Ours. If you want to train Ours_small or your own model, you can modify the MODEL_CONFIG in config.py.
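When changing the number of GPUs, it helps to keep track of the resulting global batch size. The sketch below assumes a single node, that `--world_size` equals the number of launched processes, and that `--batch_size` is applied per process; these are assumptions about `train.py`, not confirmed here. The helper name `training_plan` and the sample count are hypothetical.

```python
def training_plan(nproc_per_node, world_size, batch_size, num_samples):
    """Global batch size and steps per epoch for a single-node launch."""
    if world_size != nproc_per_node:
        # Assumption: single node, so both values should agree.
        raise ValueError("--world_size should match --nproc_per_node on one node")
    global_batch = world_size * batch_size
    # Assumes the last incomplete batch is dropped.
    steps_per_epoch = num_samples // global_batch
    return {"global_batch": global_batch, "steps_per_epoch": steps_per_epoch}

# Values from the command above; num_samples is a made-up dataset size.
plan = training_plan(nproc_per_node=4, world_size=4, batch_size=8, num_samples=10_000)
print(plan)
```

If fewer GPUs are available, raising the per-process batch size keeps the global batch size the same under these assumptions, at the cost of more memory per GPU.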

🏃 Evaluation

  1. Download the dataset you need:

  2. Download the model checkpoints and put the ckpt folder into the root dir.

For 2x interpolation benchmarks:

python benchmark/**dataset**.py --model **model[ours/ours_small]** --path /where/is/your/**dataset**

For 4x interpolation benchmarks:

python benchmark/**dataset**.py --model **model[ours_t/ours_small_t]** --path /where/is/your/**dataset**
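The model names differ between the 2x and 4x benchmarks, which is easy to mix up. The following helper is a small sketch that builds the command line and rejects a mismatched model name; the function name `benchmark_command` and the dataset name in the example are hypothetical placeholders, and only the flags shown above are used.

```python
# Model names accepted for each interpolation factor, as listed above.
MODELS_BY_SCALE = {
    2: ("ours", "ours_small"),
    4: ("ours_t", "ours_small_t"),
}

def benchmark_command(dataset, model, path, scale=2):
    """Build the argv list for a benchmark run, validating the model name."""
    allowed = MODELS_BY_SCALE.get(scale)
    if allowed is None:
        raise ValueError(f"unsupported scale: {scale}")
    if model not in allowed:
        raise ValueError(f"{scale}x benchmarks expect one of {allowed}, got {model!r}")
    return ["python", f"benchmark/{dataset}.py", "--model", model, "--path", path]

# "SomeDataset" stands in for an actual script name under benchmark/.
cmd = benchmark_command("SomeDataset", "ours_t", "/where/is/your/SomeDataset", scale=4)
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the root dir.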

You can also test the inference time of our methods on an $H\times W$ image with the following command:

python benchmark/TimeTest.py --model **model[ours/ours_small]** --H **SIZE** --W **SIZE**
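For reference, a generic way to measure average runtime is sketched below. It is not the code in `benchmark/TimeTest.py`; the helper name `average_runtime` and the default counts are assumptions. It runs a few warm-up calls first so that one-time setup costs do not distort the average.

```python
import time

def average_runtime(fn, warmup=5, runs=20):
    """Average wall-clock seconds per call of fn, after warm-up calls."""
    for _ in range(warmup):
        fn()
    # Note: when timing a CUDA model, synchronize the device (for example
    # with torch.cuda.synchronize()) before reading the clock, because GPU
    # kernels are launched asynchronously.
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs

# Stand-in workload; replace with a call that runs the model on an H x W input.
seconds = average_runtime(lambda: sum(range(10_000)))
print(f"{seconds * 1000:.3f} ms per call")
```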

💪 Citation

If you find this project helpful for your research or applications, please feel free to leave a star⭐️ and cite our paper:

@inproceedings{zhang2023extracting,
  title={Extracting motion and appearance via inter-frame attention for efficient video frame interpolation},
  author={Zhang, Guozhen and Zhu, Yuhan and Wang, Haonan and Chen, Youxin and Wu, Gangshan and Wang, Limin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5682--5692},
  year={2023}
}

💗 License and Acknowledgement

This project is released under the Apache 2.0 license. The code is based on RIFE, PvT, IFRNet, Swin and HRFormer. Please also follow their licenses. Thanks to the authors for their awesome work.
