Code and models for the paper "ECO: Efficient Convolutional Network for Online Video Understanding".
By Mohammadreza Zolfaghari, Kamaljeet Singh, Thomas Brox
- 2018.4.17: Repository for ECO.
This repository will contain all the required models and scripts for the paper ECO: Efficient Convolutional Network for Online Video Understanding.
In this work, we introduce a network architecture that takes long-term content into account and enables fast per-video processing at the same time. The architecture is based on merging long-term content already in the network rather than in a post-hoc fusion. Together with a sampling strategy, which exploits that neighboring frames are largely redundant, this yields high-quality action classification and video captioning at up to 230 videos per second, where each video can consist of a few hundred frames. The approach achieves competitive performance across all datasets while being 10x to 80x faster than state-of-the-art methods.
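As a rough illustration of the sampling idea, the sketch below splits a video into equal-sized segments and draws one frame index from each, so that a long video is covered by a small, fixed number of frames. This is a minimal sketch under stated assumptions, not the code shipped in this repository: the function name `sample_frame_indices`, its parameters, and the choice of 16 segments in the usage line are hypothetical and chosen only for illustration.

```python
import random


def sample_frame_indices(num_frames, num_segments, rng=None):
    """Return one frame index per segment (hypothetical helper, not the repo's API).

    The video is divided into `num_segments` contiguous chunks of (nearly)
    equal length, and a single index is drawn uniformly from each chunk.
    """
    if num_frames <= 0 or num_segments <= 0:
        raise ValueError("num_frames and num_segments must be positive")
    rng = rng or random.Random()

    indices = []
    for i in range(num_segments):
        # Integer boundaries of the i-th chunk: [start, end)
        start = (i * num_frames) // num_segments
        end = ((i + 1) * num_frames) // num_segments
        if end <= start:
            # Video shorter than the number of segments: reuse the boundary frame.
            indices.append(start)
        else:
            indices.append(rng.randrange(start, end))
    return indices


# Example: cover a 300-frame video with 16 sampled frames (16 is an assumed value).
frame_ids = sample_frame_indices(300, 16, random.Random(0))
```

Because each index comes from a different chunk, the sampled frames are spread over the whole video and returned in temporal order, which is what allows a fixed-size input to summarize a few hundred frames.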
Action Recognition on UCF101 and HMDB51 | Video Captioning on MSVD dataset |
---|---|
Model trained on UCF101 dataset | Model trained on Something-Something dataset |
---|---|
- Code and Models
- Data
- Tables and Results
- Demo
Questions can also be left as issues in the repository. We will be happy to answer them.