This project forked from swathikirans/gsm

Gate-Shift Networks for Video Action Recognition - CVPR 2020

License: Other

Python 68.19% Shell 1.56% Lua 1.45% Jupyter Notebook 1.67% C++ 25.22% Makefile 0.24% Dockerfile 0.05% Starlark 1.63%

Gate-Shift Networks for Video Action Recognition

We release the code and trained models for our paper Gate-Shift Networks for Video Action Recognition. If you find our work useful for your research, please cite:

@InProceedings{gsm,
author = {Sudhakaran, Swathikiran and Escalera, Sergio and Lanz, Oswald},
title = {{Gate-Shift Networks for Video Action Recognition}},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2020}
} 

Prerequisites

Data preparation

  • Something Something-v1: Download the frames from the official website. Copy the directory containing the frames and the train-val files to dataset-->something-v1. Run python data_scripts/process_dataset_something.py to create the train/val list files.

  • Diving48: Download the videos and the annotations from the official website. Copy the directory containing the videos and the annotations to dataset-->Diving48. Run python data_scripts/extract_frames_diving48.py to extract the frames from the videos, then run python data_scripts/process_dataset_diving.py to create the train/test list files.
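The format of the generated list files is not described here. As a rough sketch only, assuming the TSN-style convention of one line per video holding the frame directory, the frame count, and the class index (an assumption, not confirmed by this README), a list file could be assembled as below. The function `build_list_lines`, the directory layout, and the sample labels are hypothetical and are not taken from the repository's scripts.

```python
import os
import tempfile


def build_list_lines(frames_root, labels):
    """Return one 'video_id num_frames label' line per video.

    Assumed (hypothetical) layout: frames_root/<video_id>/<frame images>.
    `labels` maps video_id -> integer class index.
    """
    lines = []
    for video_id in sorted(labels):
        video_dir = os.path.join(frames_root, video_id)
        # Count only image files; ignore anything else in the directory.
        num_frames = sum(
            1
            for name in os.listdir(video_dir)
            if name.lower().endswith((".jpg", ".jpeg", ".png"))
        )
        lines.append(f"{video_id} {num_frames} {labels[video_id]}")
    return lines


def demo():
    # Build a tiny fake dataset in a temporary directory and list it.
    with tempfile.TemporaryDirectory() as root:
        for video_id, count in (("100", 3), ("200", 2)):
            os.makedirs(os.path.join(root, video_id))
            for i in range(1, count + 1):
                frame_path = os.path.join(root, video_id, f"{i:05d}.jpg")
                open(frame_path, "w").close()
        return build_list_lines(root, {"100": 0, "200": 1})


if __name__ == "__main__":
    print("\n".join(demo()))
```

The scripts in data_scripts/ remain the reference for the actual file names and line format; the sketch only illustrates the general idea of pairing each frame directory with its frame count and label.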

Training

python main.py something-v1 RGB --arch BNInception \
               --num_segments 8 --consensus_type avg \
               --batch-size 16 --iter_size 2 --dropout 0.5 \
               --lr 0.01 --warmup 10 --epochs 60 --eval-freq 5 \
               --gd 20 --run_iter 1 -j 16 --npb --gsm
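Two of the flags above are worth a small worked example. Assuming --iter_size is the number of gradient-accumulation steps per optimizer update (an assumption; this README does not define it), the effective batch size is the product of --batch-size and --iter_size. Assuming --num_segments follows TSN-style sparse sampling, each video is divided into equal-length segments and one frame is taken from each. The helper names below are hypothetical, and the sampler is a deterministic center-frame variant rather than the repository's own implementation.

```python
def effective_batch_size(batch_size, iter_size):
    """Samples contributing to one optimizer step under gradient accumulation."""
    return batch_size * iter_size


def center_segment_indices(num_frames, num_segments):
    """Pick the middle frame (0-based) of each of `num_segments` equal segments."""
    seg_len = num_frames / num_segments
    return [int(seg_len * i + seg_len / 2) for i in range(num_segments)]


if __name__ == "__main__":
    # Values taken from the training command above.
    print(effective_batch_size(16, 2))
    # A hypothetical 40-frame video sampled with 8 segments.
    print(center_segment_indices(40, 8))
```

Under these assumptions, the command above would update the weights once per 32 samples, and a 40-frame video would contribute frames 2, 7, 12, 17, 22, 27, 32, and 37.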

Testing

python test_models.py something-v1 RGB models/something-v1_RGB_InceptionV3_avg_segment16_checkpoint.pth.tar \
               --arch InceptionV3 --crop_fusion_type avg \
               --test_segments 16 --test_crops 1 --num_clips 1 --gsm

To evaluate using 2 clips sampled from each video, change --num_clips 1 to --num_clips 2. To predict with an ensemble of models, evaluate each model with the option --save_scores to save its prediction scores, then run python average_scores.py.
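The idea behind score averaging can be illustrated with a minimal sketch. It assumes each model's saved scores can be loaded as a list of per-video class-score lists with identical shapes; the actual file format written by --save_scores and the interface of average_scores.py are not described in this README. The function names and the toy scores are hypothetical.

```python
def average_scores(score_sets):
    """Average per-class scores across models (or clips).

    score_sets: one score table per model; each table is a list of
    per-video score lists, and all tables have the same shape.
    """
    num_models = len(score_sets)
    averaged = []
    for per_video in zip(*score_sets):
        # per_video holds the score list of one video from every model.
        averaged.append([sum(vals) / num_models for vals in zip(*per_video)])
    return averaged


def top1_accuracy(scores, labels):
    """Percentage of videos whose highest-scoring class equals the label."""
    correct = sum(
        1
        for row, label in zip(scores, labels)
        if max(range(len(row)), key=row.__getitem__) == label
    )
    return 100.0 * correct / len(labels)


if __name__ == "__main__":
    # Toy scores for two videos and two classes (made-up numbers).
    model_a = [[0.6, 0.4], [0.7, 0.3]]
    model_b = [[0.5, 0.5], [0.2, 0.8]]
    labels = [0, 1]
    print(top1_accuracy(model_a, labels))
    print(top1_accuracy(average_scores([model_a, model_b]), labels))
```

In this toy case the first model alone misclassifies the second video, while the averaged scores classify both correctly. The same averaging applies to multiple clips of one video, if clip scores are combined by their mean.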

Models

The models can be downloaded by running python download_models.py or from Google Drive. The table shows the results reported in the paper. To reproduce a result, run the script linked from the corresponding accuracy score.

Something Something-v1

No. of frames            Top-1 Accuracy (%)
8                        49.01
12                       51.58
16                       50.63
24                       49.63
8x2                      50.43
12x2                     51.98
8x2 + 12x2 + 16 + 24     55.16

[Figure: Visualization]

To reproduce the results on the Diving48 dataset, use the scripts linked from the accuracy scores: 39.03% (16 frames) and 40.27% (16x2 frames).

Acknowledgements

This implementation is built upon the TRN-pytorch codebase, which is based on TSN-pytorch. We thank Yuanjun Xiong and Bolei Zhou for releasing the TSN-pytorch and TRN-pytorch repositories.
