GithubHelp home page GithubHelp logo

lcdc_release's Introduction

Learning Motion in Feature Space

About

This is the official repository for the paper "Learning Motion in Feature Space: Locally-Consistent Deformable Convolution Networks for Fine-Grained Action Detection" in ICCV 2019.

Links

Experimental results:

  • 50 Salads dataset:
  • GTEA dataset:
  • Ablation study:

Before running

Everything must be run from the root directory, e.g.

./scripts/50_salads/build_tfrecords_train.sh

To view the usage of python code, please use the help feature, e.g.

python ./src/trainer.py --help

Please consider copying and modifying the scripts in scripts/ directory for other use cases.

Installation

Option 1: Docker

A docker for this project is available at: link.

Option 2: Manual install

It is recommended to create a separate virtual environment for this project.

Dependencies

Pip packages (can be installed using pip install -r requirements.txt):

tensorflow-gpu==1.3.0
matplotlib
opencv-python
pyyaml
progressbar2
cython
easydict
scikit-image
scikit-learn

Other requirements:

python3
ffmpeg
gcc4.9
cuda8.0
cudnn6.0

(The code may work with gcc5, but it has not been fully tested.)

Setup TF_Deformable_Net:

Clone TF_Deformable_Net

git clone https://github.com/Zardinality/TF_Deformable_Net $TF_DEFORMABLE_NET

where $TF_DEFORMABLE_NET is the target directory.

Add $TF_DEFORMABLE_NET to your environment (e.g. .bashrc), e.g.

export PYTHONPATH=$PYTHONPATH:$TF_DEFORMABLE_NET

Continue installing the framework by following the instruction on the original Zardinality's repository TF_Deformable_Net.

Extra:

In case the installation does not work, please replace with the files inside ./extra/TF_Deformable_Net/. Remember to backup before overwriting.

Weights and features

  • Pretrained models (for initialization):

The pretrained weights are extracted from the weights provided by TF_Deformable_Net. Since the default deformable convolution network is implemented with 4 deformable group, we extract each of those to match the implementation of LCDC. The weights are indexed as g0, g1, g2, g3. The difference between choosing which one to use is minimal.

The weights are available here.

  • Trained models:

Trained models are available here.

  • Extracted features (for TCN):

Short-temporal features for TCN networks are available here.

Usage

Step 1: Download data

This section shows you how to set up dataset, using 50 Salads dataset as an example. Other datasets should follow a similar routine.

Create data/ directory, then download 50 Salads dataset and link it in data/, e.g.

ln -s [/path/to/salads/dataset] ./data/50_salads_dataset

Copy extra contents from ./extra/50_salads_dataset/ to ./data/50_salads_dataset/.

Credits: The labels and splits are originally from Colin Lea's repository TemporalConvolutionalNetworks.

You should have something similar to this:

data/
└── 50_Salads_dataset/
    ├── rgb/
    ├── labels/
    ├── splits/
    └── timestamps/

Step 2: Extract frames and segment activities

Extract the frames and segment in fine-grained level. Modify the script accordingly.

./scripts/50_salads/prepare_data_50salads.sh

Step 3: Create tfrecords

Prepare tfrecord files. Modify the script accordingly.

./scripts/50_salads/build_tfrecords_train.sh

Step 4: Train models

Modify the training scripts accordingly before running. All training and testing scripts require GPU ID as an input argument, e.g.

./scripts/50_salads/resnet_motionconstraint/train_nomotion_g0.sh $GPU_ID

Results are stored in logs/ directory. You can use TensorBoard to visualize training and testing processes, e.g.

tensorboard --logdir [path/to/log/dir] --port [your/choice/of/port]

Further details of how to use TensorBoard are available at this link.

Step 5: Validate/test models

It is recommmended to run the testing script in parallel with the training script, using a different GPU. The default configuration will go through the checkpoints you have generated and wait for new checkpoints coming. Press Ctrl+C to stop the testing process. You can customize the number of checkpoints or which checkpoints to run by switching to other testing mode. Please inspect the help of tester_fd.py for more information.

./scripts/50_salads/resnet_motionconstraint/test_nomotion_g0.sh $GPU_ID

Other tools and usage

Extract features

Please change the output dir and other parameters accordingly.

./scripts/50_salads/resnet_motionconstraint/extract_feat_nomotion_g0.sh $GPU_ID

Citation

If you are interested in using this project, please cite our paper as

@InProceedings{Mac_2019_ICCV,
    author    = {Mac, Khoi-Nguyen C. and 
                 Joshi, Dhiraj and 
                 Yeh, Raymond A. and 
                 Xiong, Jinjun and 
                 Feris, Rogerio S. and 
                 Do, Minh N.},
    title     = {Learning Motion in Feature Space: Locally-Consistent Deformable Convolution Networks for Fine-Grained Action Detection},
    booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2019}
}

lcdc_release's People

Contributors

knmac avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.