mdongbenben / two-stream-action-recognition-1

This project forked from mohammed-elkomy/two-stream-action-recognition

My re-implementation of two stream action recognition

License: Apache License 2.0

Jupyter Notebook 15.47% Python 84.32% Shell 0.20%

two-stream-action-recognition-1's Introduction

Action Recognition

This repo studies the problem of action recognition (recognizing actions in videos) on the well-known UCF101 dataset.

Here, I reimplemented the two-stream approach for action recognition using pre-trained Xception networks for both streams (see the references).

Get started:

A full demo of the code in this repo can be found in the Action_Recognition_Walkthrough.ipynb notebook.

Please copy the Action_Recognition_Walkthrough.ipynb notebook to your Google Drive account and run it on Google Colab on a Python 3 GPU-enabled instance.

Environment and requirements:

This code requires Python 3.6 and the following packages:

  • TensorFlow 1.11.0 (GPU-enabled; the code uses the Keras bundled with TensorFlow)
  • imgaug 0.2.6
  • OpenCV 3.4.2.17
  • NumPy 1.14.1

All of these requirements are satisfied by a Python 3 GPU-enabled Colab instance. Just use it, and the Action_Recognition_Walkthrough.ipynb notebook will install the rest :)

Dataset:

I used the UCF101 dataset, originally found here.

The dataset is also processed and published by feichtenhofer/twostreamfusion:

  • RGB images (single zip file split into three parts)
wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_jpegs_256.zip.001
wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_jpegs_256.zip.002
wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_jpegs_256.zip.003
  • Optical flow u/v frames (single zip file split into three parts)
wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_tvl1_flow.zip.001
wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_tvl1_flow.zip.002
wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_tvl1_flow.zip.003
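
Each archive has to be joined back into a single zip before it can be extracted. The following is a minimal sketch, assuming the .001 to .003 files are plain byte-wise splits of one zip archive. The helper names join_parts and extract_split_zip are hypothetical and not part of this repo.

```python
import shutil
import zipfile
from pathlib import Path


def join_parts(part_paths, joined_path):
    """Concatenate split archive parts, in the given order, into one file."""
    with open(joined_path, "wb") as joined:
        for part in part_paths:
            with open(part, "rb") as chunk:
                shutil.copyfileobj(chunk, joined)
    return Path(joined_path)


def extract_split_zip(prefix, out_dir):
    """Join prefix.001, prefix.002, ... and extract the resulting zip into out_dir."""
    prefix = Path(prefix)
    # Sorting keeps the parts in .001, .002, .003 order.
    parts = sorted(prefix.parent.glob(prefix.name + ".[0-9][0-9][0-9]"))
    if not parts:
        raise FileNotFoundError(f"no parts found for {prefix}")
    joined = join_parts(parts, prefix)
    with zipfile.ZipFile(joined) as archive:
        archive.extractall(out_dir)
    return joined
```

For example, extract_split_zip("ucf101_jpegs_256.zip", "data") would join the three RGB parts in the current directory and unpack them into a (hypothetical) data folder. Note that joining keeps the parts on disk, so it temporarily needs roughly twice the archive size in free space.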

Code Features:

  • A variety of models is available, and you can switch between them easily.
  • Saves checkpoints at regular intervals and synchronizes them to Google Drive using the Drive API, so you can resume training anywhere on any Google Colab instance.
  • Accesses the public models on my drive, so you can resume and fine-tune them from different checkpoints. Every checkpoint is named EPOCH-BEST_TOP_1_ACC-CURRENT_TOP_1_ACC. For example, 300-0.84298-0.84166.zip in the folder heavy-mot-xception-adam-1e-05-imnet means that at this checkpoint:
    • epoch=300
    • the best top-1 accuracy was 0.84298 (obtained at a checkpoint before epoch 300)
    • the current top-1 accuracy is 0.84166
    • the experiment is heavy-mot-xception-adam-1e-05-imnet
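
The naming scheme can be read back programmatically, for instance to pick the checkpoint to resume from. This is a small sketch, assuming names follow the hyphen-separated form of the example above. CheckpointInfo, parse_checkpoint_name and latest_checkpoint are hypothetical names, not functions from this repo.

```python
from typing import NamedTuple


class CheckpointInfo(NamedTuple):
    epoch: int
    best_top1: float
    current_top1: float


def parse_checkpoint_name(filename):
    """Parse names like '300-0.84298-0.84166.zip' into their three fields."""
    stem = filename[:-len(".zip")] if filename.endswith(".zip") else filename
    epoch, best, current = stem.split("-")
    return CheckpointInfo(int(epoch), float(best), float(current))


def latest_checkpoint(filenames):
    """Return the filename with the highest epoch number."""
    return max(filenames, key=lambda name: parse_checkpoint_name(name).epoch)


info = parse_checkpoint_name("300-0.84298-0.84166.zip")
print(info)  # CheckpointInfo(epoch=300, best_top1=0.84298, current_top1=0.84166)
```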

Models:

I used models pre-trained on ImageNet, provided by Keras Applications here.

The best results are obtained using the Xception architecture.

Network                     Top1-Acc
------------------------    --------
Spatial VGG19 stream        ~75%
Spatial Resnet50 stream     81.2%
Spatial Xception stream     86.04%
------------------------    --------
Motion Resnet50 stream      ~75%
Motion Xception stream      84.4%
------------------------    --------
Average fusion              91.25%
------------------------    --------
Recurrent network fusion    91.7%
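
As a rough illustration of the "Average fusion" row, the sketch below averages the per-class scores of the spatial and motion streams for one video and picks the top class. It is an assumption about the general idea only: the fusion code in this repo may weight the streams or aggregate over frames differently, and average_fusion is a hypothetical name.

```python
def average_fusion(spatial_probs, motion_probs):
    """Average per-class scores from the two streams; return (label, fused scores)."""
    if len(spatial_probs) != len(motion_probs):
        raise ValueError("both streams must score the same set of classes")
    fused = [(s + m) / 2.0 for s, m in zip(spatial_probs, motion_probs)]
    # Index of the highest fused score is the predicted class.
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


# Toy scores over three made-up classes (not real model outputs).
spatial = [0.6, 0.3, 0.1]
motion = [0.2, 0.7, 0.1]
label, fused = average_fusion(spatial, motion)
print(label)  # 1
```

Here the spatial stream alone would pick class 0, but the motion stream is confident enough about class 1 to change the fused prediction.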

Pre-trained models:

All the pre-trained models can be found here.

It's the same Drive folder that the code accesses while training and when resuming training from a checkpoint.

Reference Papers:

Nice implementations of two-stream approach:

Future directions:

Useful links:
