fancy-video-classification

Video Classification Paper List

Papers

Recommendations!

arXiv

  • Cao, Haoyuan, Shining Yu, and Jiashi Feng.
    "Compressed Video Action Recognition with Refined Motion Vector." arXiv(2019).[PDF]

CVPR2020

  • TEA: Yan Li, Bin Ji, et al.
    "TEA: Temporal Excitation and Aggregation for Action Recognition." CVPR(2020).[PDF]
  • TPN: Ceyuan Yang, Yinghao Xu, et al.
    "TPN: Temporal Pyramid Network for Action Recognition." CVPR(2020).[PDF][Code]

CVPR2019

  • DMC-Net: Shou, Zheng, et al.
    "DMC-Net: Generating discriminative motion cues for fast compressed video action recognition." CVPR(2019).[PDF][Code]

ICCV2019

  • SlowFast: Feichtenhofer C, Fan H, Malik J, et al.
    "SlowFast Networks for Video Recognition." ICCV(2019, oral).[PDF][Code]
  • TSM: Ji Lin, Chuang Gan, Song Han.
    "TSM: Temporal Shift Module for Efficient Video Understanding." ICCV(2019).[PDF][Code]
  • STM: Jiang, Boyuan, et al.
    "STM: Spatiotemporal and motion encoding for action recognition." ICCV(2019).[PDF]

NIPS2019

  • bLVNet-TAM: Quanfu Fan, Chun-Fu (Richard) Chen, Hilde Kuehne, Marco Pistoia, David Cox
    "More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation." NIPS(2019).[PDF][Code]

CVPR 2018

  • R(2+1)D: Tran, Du, et al.
    "A closer look at spatiotemporal convolutions for action recognition." CVPR(2018).[PDF][Code]
  • CoViAR: Wu, Chao-Yuan, et al.
    "Compressed video action recognition." CVPR(2018).[PDF][Code]
  • Non-local: Wang, Xiaolong, et al.
    "Non-local neural networks." CVPR(2018).[PDF][Code]

NIPS 2018

  • TrajectoryNet: Zhao, Yue, Yuanjun Xiong, and Dahua Lin.
    "Trajectory convolution for action recognition." NIPS(2018)[PDF]

CVPR 2017

  • I3D: Carreira, Joao, and Andrew Zisserman.
    "Quo vadis, action recognition? A new model and the Kinetics dataset." CVPR(2017).[PDF][Code]

ECCV2016

  • TSN: Wang, Limin, et al.
    "Temporal segment networks: Towards good practices for deep action recognition." ECCV(2016).[PDF][Code]

NIPS2014

  • Two Stream: Simonyan, Karen, and Andrew Zisserman.
    "Two-stream convolutional networks for action recognition in videos." NIPS(2014).[PDF][Code]

ICCV2013

  • IDT: Wang, Heng, and Cordelia Schmid.
    "Action recognition with improved trajectories." ICCV(2013).[PDF]

Competitions

Datasets

  • UCF101
    13,320 videos; average length ~10 s; 101 human action categories. Each class has 25 groups, and videos in the same group share some common features. The clips are realistic, user-uploaded videos rather than scenes staged by actors.
  • HMDB51
    6,849 videos; average length ~5 s; 51 human action categories, each containing a minimum of 101 videos. Most clips come from movies, and a small proportion come from other public datasets and web videos.
  • Kinetics (some videos listed in the Kinetics source CSV are missing, so the Non-local Net researchers offer a pre-downloaded version of Kinetics-400; see the relevant issue)
    650,000 videos; average length ~10 s; 700/600/400 human action categories; each action class has at least 600 video clips. Most videos come from YouTube.
  • Something-something v2
    220,847 videos; average length 2~6 s; 174 basic human action categories. The dataset focuses on fine-grained human actions, such as "Putting something on a surface".
  • Charades
    average ~30s per video, long-term video dataset.
  • Moments in Time
    about one million videos; average length ~3 s; clips involve people, animals, objects, or natural phenomena and capture the gist of a dynamic scene.
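
Because UCF101 videos in the same group share common features, a random clip-level split can leak near-duplicate footage between train and test. The sketch below shows one way to hold out whole groups instead. It assumes the usual UCF101 file naming `v_<Class>_g<group>_c<clip>.avi`; the function name `split_by_group` and the choice of held-out groups are hypothetical and are not part of any official split.

```python
import re

# Assumption: UCF101 clips are named v_<Class>_g<group>_c<clip>.avi
_NAME = re.compile(r"v_(?P<cls>[A-Za-z0-9]+)_g(?P<group>\d+)_c(?P<clip>\d+)\.avi$")


def split_by_group(filenames, test_groups):
    """Send every clip of a held-out group to the test set.

    filenames:   iterable of clip file names (or paths)
    test_groups: set of integer group ids to hold out (hypothetical choice)
    """
    train, test = [], []
    for name in filenames:
        match = _NAME.search(name)
        if match is None:
            raise ValueError(f"unexpected file name: {name}")
        group = int(match.group("group"))
        # Clips from the same group never end up on both sides of the split.
        (test if group in test_groups else train).append(name)
    return train, test


if __name__ == "__main__":
    clips = [
        "v_ApplyEyeMakeup_g01_c01.avi",
        "v_ApplyEyeMakeup_g01_c02.avi",
        "v_ApplyEyeMakeup_g08_c01.avi",
    ]
    train, test = split_by_group(clips, test_groups={1})
    print(len(train), len(test))
```

The official UCF101 train/test lists should be preferred when comparing against published results; this only illustrates why the group structure matters.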

Benchmarks

Distinguished Researchers & Teams
