xiaoanshi / anet2016-cuhk

This project forked from yjxiong/anet2016-cuhk

Action Recognition Toolbox for CUHK&ETHZ&SIAT submission to ActivityNet 2016

License: BSD 2-Clause "Simplified" License

Shell 6.29% Python 69.71% CSS 5.38% HTML 18.62%

anet2016-cuhk's Introduction

CUHK & ETH & SIAT Solution to ActivityNet Challenge 2016

This repository holds the materials necessary to reproduce the results of our solution to the ActivityNet Challenge 2016. We won 1st place in the untrimmed video classification task.

Although initially designed for the challenge, the repository also aims to provide an accessible framework for general video classification tasks.

  • We are currently organizing the codebase. Please stay tuned.

  • Jul 14 - The correct reference flow model is available for download. See here.

  • Jul 11 - Demo website is now online!

  • Jul 10 - Web demo code released

Functionalities & Release Status

  • Basic utilities
  • Action recognition with single video
    • Web demo for action recognition
  • ActivityNet validation set evaluation
  • Training action recognition system - We use the TSN framework to train our models.

Dependencies

The codebase is written in Python. We recommend using the Anaconda distribution.

We also use Caffe and OpenCV. In particular, OpenCV must be compiled with VideoIO support, and GPU support is recommended. If you use build_all.sh, it will install these dependencies locally for you.

Requirements

An NVIDIA GPU with CUDA support. At least 4 GB of GPU memory is needed to run the reference models.
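One way to check the memory requirement before building is to ask nvidia-smi for the total memory of each GPU. The sketch below is illustrative and not part of the repository. The helper names are hypothetical, it is written for Python 3, and the default threshold of 4000 MiB is an assumption, chosen slightly below 4096 because drivers may report a little less than the nominal card size.

```python
import subprocess

# Assumption: "4 GB" is approximated as 4000 MiB, since reported totals
# can be slightly below the nominal size of the card.
MIN_MEMORY_MIB = 4000


def parse_gpu_memory(nvidia_smi_output):
    """Parse one integer (MiB) per non-empty line of nvidia-smi CSV output."""
    return [int(line.strip())
            for line in nvidia_smi_output.splitlines()
            if line.strip()]


def gpus_with_enough_memory(memories_mib, minimum_mib=MIN_MEMORY_MIB):
    """Return the indices of GPUs whose total memory meets the threshold."""
    return [i for i, mem in enumerate(memories_mib) if mem >= minimum_mib]


def query_gpu_memory():
    """Run nvidia-smi and return the total memory of each GPU in MiB."""
    output = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    return parse_gpu_memory(output)
```

Calling gpus_with_enough_memory(query_gpu_memory()) on a machine with the NVIDIA driver installed returns the indices of the GPUs that pass the check. An empty list suggests the reference models may not fit.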

Get the code

Use Git

git clone --recursive https://github.com/yjxiong/anet2016-cuhk

If you forgot to add --recursive to the command, you can still go to the project directory and run

git submodule update --init
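To see whether a clone is missing its submodules, you can inspect the output of git submodule status, which prefixes uninitialized submodules with a minus sign. The following is a minimal sketch in Python 3 and not part of the repository. Both function names are hypothetical.

```python
import subprocess


def uninitialized_submodules(status_output):
    """Return paths of submodules that `git submodule status` marks with '-'."""
    paths = []
    for line in status_output.splitlines():
        if line.startswith("-"):
            # An uninitialized entry looks like "-<sha1> <path>".
            fields = line[1:].split()
            if len(fields) >= 2:
                paths.append(fields[1])
    return paths


def check_repo(repo_dir="."):
    """Run `git submodule status` in repo_dir and list uninitialized paths."""
    output = subprocess.check_output(
        ["git", "submodule", "status"],
        cwd=repo_dir,
        universal_newlines=True,
    )
    return uninitialized_submodules(output)
```

If check_repo() returns a non-empty list, running git submodule update --init in the project directory should fetch the missing submodules.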

Single Video Classification

  • Build all modules. In the root directory of the project, run the following command:
bash build_all.sh
  • Get the reference models:
bash models/get_reference_models.sh
  • Run the classification. An example video clip is provided at data/plastering.avi. To classify a single video with the RGB model, run:
python examples/classify_video.py data/plastering.avi

It should print the top 3 predictions. To use the two-stream model, add the --use_flow flag to the command. The framework will then extract optical flow on the fly.

python examples/classify_video.py --use_flow data/plastering.avi

You can use your own video files by specifying the filename.

You can also pass a YouTube URL to classify an online video, for example

python examples/classify_video.py https://www.youtube.com/watch?v=QkuC0lvMAX0
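Since the same argument can be either a local file or a video URL, a script needs some way to tell the two apart. The sketch below shows one plausible approach using the Python 3 standard library. It is an assumption for illustration, not the logic actually used by examples/classify_video.py, and is_video_url is a hypothetical name.

```python
from urllib.parse import urlparse


def is_video_url(source):
    """Return True if source looks like an http(s) URL, False for file paths."""
    parsed = urlparse(source)
    # Require both a web scheme and a host so that plain paths are rejected.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

With this check, an input such as data/plastering.avi would be opened as a local file, while the YouTube address above would be treated as a URL to download first.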

The two-stream model here consists of one ResNet-200 model for RGB input and one BN-Inception model for optical flow input. The model specs and parameter files can be found in models/.
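To illustrate how two streams can produce a single top 3 list, the sketch below fuses per-class scores from an RGB model and a flow model by a weighted sum and then ranks the classes. This is a generic late-fusion example in Python 3. The fusion rule and weights used by the repository are not stated here, so the equal weights and all function names are assumptions.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(rgb_scores, flow_scores, rgb_weight=1.0, flow_weight=1.0):
    """Weighted sum of per-class scores from the two streams.

    The equal default weights are an assumption, not the repository's values.
    """
    if len(rgb_scores) != len(flow_scores):
        raise ValueError("both streams must score the same set of classes")
    return [rgb_weight * r + flow_weight * f
            for r, f in zip(rgb_scores, flow_scores)]


def top_k(scores, labels, k=3):
    """Return the k (label, score) pairs with the highest scores."""
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

A typical use would apply softmax to each stream's raw output, pass the two probability lists to fuse_scores, and call top_k on the result with the class names.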

Web Demo

We also provide a lightweight demo server built on Flask.

python demo_server.py

It runs on 127.0.0.1:5000 and supports uploading local files as well as directly analyzing YouTube-style video URLs.
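After starting the server, a quick way to confirm that it is listening is to try a TCP connection to that address. The sketch below uses only the Python 3 standard library. It checks that the port accepts connections and nothing more, and is_server_up is a hypothetical helper that is not part of the repository.

```python
import socket


def is_server_up(host="127.0.0.1", port=5000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If is_server_up() returns False shortly after launching demo_server.py, the server may still be loading the models, so retrying after a few seconds is reasonable.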

For a quick start, we have set up a public demo server at

Action Recognition Web Demo

The server runs on the Titan X GPU awarded for winning the challenge. Thanks to the organizers!

Related Projects

LICENSE

Released under BSD 2-Clause license.
