MiniROAD: Minimal RNN Framework for Online Action Detection

Introduction

This is a PyTorch implementation of our ICCV 2023 paper "MiniROAD: Minimal RNN Framework for Online Action Detection".

Data Preparation

THUMOS14 and TVSeries

To prepare the features and targets yourself, refer to LSTR. Alternatively, download the pre-extracted features and targets directly from TeSTra.

FineAction

Download the officially available pre-extracted features from FineAction. As mentioned in the paper, we linearly interpolate the features along the temporal dimension by a factor of four, because the official features are too condensed (16 frames are converted into one feature).
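
The interpolation step can be sketched as follows. This is a minimal illustration, not the preprocessing script used for the paper: the function name `upsample_features`, the output length of exactly `T * factor`, and the choice to keep the first and last feature fixed as endpoints are all assumptions.

```python
import numpy as np


def upsample_features(features: np.ndarray, factor: int = 4) -> np.ndarray:
    """Linearly interpolate a (T, D) feature array to (T * factor, D) along time.

    Hypothetical helper: the endpoint convention is an assumption.
    """
    num_steps, dim = features.shape
    # Original sample positions on the time axis.
    src = np.arange(num_steps, dtype=np.float64)
    # New positions cover the same [0, T - 1] range with `factor` times more samples.
    dst = np.linspace(0.0, num_steps - 1, num_steps * factor)
    out = np.empty((num_steps * factor, dim), dtype=features.dtype)
    for d in range(dim):
        # np.interp performs 1-D piecewise-linear interpolation per feature channel.
        out[:, d] = np.interp(dst, src, features[:, d])
    return out
```

Applied to an array of shape `(L, 2048)`, this returns an array of shape `(4 * L, 2048)`. The per-frame targets would need to be brought to the same temporal length.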

Data Structure

  1. If you want to use our dataloaders, arrange the files in the following structure:

    • THUMOS'14 dataset:

      $YOUR_PATH_TO_THUMOS_DATASET
      ├── rgb_FEATURETYPE/
      │   ├── video_validation_0000051.npy
      │   ├── ...
      ├── flow_FEATURETYPE/
      │   ├── video_validation_0000051.npy
      │   ├── ...
      ├── target_perframe/
      │   ├── video_validation_0000051.npy (of size L x 22)
      │   ├── ...
      
    • TVSeries dataset:

      $YOUR_PATH_TO_TVSERIES_DATASET
      ├── rgb_FEATURETYPE/
      │   ├── Breaking_Bad_ep1.npy
      │   ├── ...
      ├── flow_FEATURETYPE/
      │   ├── Breaking_Bad_ep1.npy
      │   ├── ...
      ├── target_perframe/
      │   ├── Breaking_Bad_ep1.npy (of size L x 31)
      │   ├── ...
      
    • FineAction dataset:

      $YOUR_PATH_TO_FINEACTION_DATASET
      ├── rgb_kinetics_i3d/
      │   ├── v_00008645.npy (of size L x 2048)
      │   ├── ...
      ├── flow_kinetics_i3d/
      │   ├── v_00008645.npy (of size L x 2048)
      │   ├── ...
      ├── target_perframe/
      │   ├── v_00008645.npy (of size L x 107)
      │   ├── ...
      

    For the appropriate FEATURETYPE, refer to datasets/dataset.py.

  2. Create symbolic links to the datasets:

    cd MiniROAD
    ln -s $YOUR_PATH_TO_THUMOS_DATASET data/THUMOS
    ln -s $YOUR_PATH_TO_TVSERIES_DATASET data/TVSERIES
    ln -s $YOUR_PATH_TO_FINEACTION_DATASET data/FINEACTION
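
Before training, it can help to confirm that a dataset directory matches the layout above. The following is a minimal sketch and not part of this repository: `check_dataset` is a hypothetical helper, and it assumes that the RGB features, flow features, and targets of a video all share the same temporal length L.

```python
from pathlib import Path
from typing import List

import numpy as np


def check_dataset(root: str, feature_type: str, num_classes: int) -> List[str]:
    """Return a list of problems found under `root` (empty if none were found).

    Hypothetical helper; assumes features and targets share the same length L.
    """
    root_dir = Path(root)
    rgb_dir = root_dir / f"rgb_{feature_type}"
    flow_dir = root_dir / f"flow_{feature_type}"
    target_dir = root_dir / "target_perframe"

    problems = [f"missing directory: {d}" for d in (rgb_dir, flow_dir, target_dir)
                if not d.is_dir()]
    if problems:
        return problems

    for target_path in sorted(target_dir.glob("*.npy")):
        name = target_path.name
        target = np.load(target_path)
        # Targets are expected to be (L, num_classes), e.g. L x 22 for THUMOS'14.
        if target.ndim != 2 or target.shape[1] != num_classes:
            problems.append(f"{name}: target shape {target.shape}, expected (L, {num_classes})")
            continue
        for directory in (rgb_dir, flow_dir):
            feature_path = directory / name
            if not feature_path.exists():
                problems.append(f"{name}: missing in {directory.name}")
                continue
            length = np.load(feature_path).shape[0]
            if length != target.shape[0]:
                problems.append(
                    f"{name}: {directory.name} has length {length}, target has {target.shape[0]}"
                )
    return problems
```

For example, `check_dataset("data/THUMOS", FEATURETYPE, 22)` would check the THUMOS'14 link created above, with `FEATURETYPE` set to the string listed in datasets/dataset.py.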
    

Training

```
cd MiniROAD
python main.py --config $PATH_TO_CONFIG_FILE 
```

Inference from checkpoint

```
cd MiniROAD
python main.py --config $PATH_TO_CONFIG_FILE --eval $PATH_TO_CHECKPOINT
```

Main Results and checkpoints

THUMOS14

| method | feature | mAP (%) | config | checkpoint |
| --- | --- | --- | --- | --- |
| MiniROAD | kinetics | 71.8 | yaml | Download |
| MiniROAD | nv_kinetics | 68.4 | yaml | Download |

FINEACTION

| method | feature | mAP (%) | config | checkpoint |
| --- | --- | --- | --- | --- |
| MiniROAD | kinetics | 37.1 | yaml | Download |

TVSERIES

| method | feature | mcAP (%) | config | checkpoint |
| --- | --- | --- | --- | --- |
| MiniROAD | kinetics | 89.6 | yaml | Download |

Citations

If you are using the data/code/model provided here in a publication, please cite our paper:

@inproceedings{miniroad,
	title={MiniROAD: Minimal RNN Framework for Online Action Detection},
	author={An, Joungbin and Kang, Hyolim and Han, Su Ho and Yang, Ming-Hsuan and Kim, Seon Joo},
	booktitle={International Conference on Computer Vision (ICCV)},
	year={2023}
}

License

This project is licensed under the Apache-2.0 License.

Acknowledgements

Much of the codebase is from LSTR.

miniroad's Issues

Feature extraction

I have another question about feature extraction. I want to apply MiniROAD to other datasets, so I am trying to extract the features myself. Section 4 (Implementation details) of the paper's supplementary material states: "For THUMOS and TVSeries, we first sample the videos into 24 FPS and feed nonoverlapping snippets to the feature extractor. The snippet size is set to 6 and TSN is adopted as the feature extractor." I have resampled the videos to 24 FPS and extracted the RGB frames and optical flow using mmaction2, but I don't know how to set the video snippet size to 6. How did you do that, and are there any tools in mmaction2 that can help with this step?
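
As an illustration of what "nonoverlapping snippets" of size 6 means, the grouping can be sketched in plain Python. This is not the authors' extraction pipeline and does not use any mmaction2 API: `split_into_snippets` is a hypothetical helper, and dropping a trailing partial snippet is an assumption.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_snippets(frames: Sequence[T], snippet_size: int = 6) -> List[Sequence[T]]:
    """Group frames into consecutive, non-overlapping snippets.

    Hypothetical helper: a trailing partial snippet is dropped (an assumption).
    """
    num_snippets = len(frames) // snippet_size
    return [
        frames[i * snippet_size:(i + 1) * snippet_size]
        for i in range(num_snippets)
    ]
```

Each snippet would then be passed to the feature extractor to produce one feature vector. At 24 FPS with a snippet size of 6, this gives 24 / 6 = 4 feature vectors per second of video.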

about target_perframe file

I'm not sure how to create the target_perframe files. Could you describe how to create them and share the relevant code? It would help us a great deal.
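
One way such a file could be built from temporal annotations is sketched below. This is not the authors' code: `build_target_perframe` is a hypothetical helper, and the segment format `(start_sec, end_sec, class_index)`, the rounding of segment boundaries, and the use of index 0 as the background class are all assumptions. The scripts in LSTR remain the reference for the actual targets.

```python
from typing import Iterable, Tuple

import numpy as np


def build_target_perframe(
    segments: Iterable[Tuple[float, float, int]],
    num_steps: int,
    num_classes: int,
    steps_per_second: float = 4.0,  # 24 FPS / snippet size 6
    background_index: int = 0,      # assumption: class 0 is background
) -> np.ndarray:
    """Build an (L, num_classes) one-hot target array from timed action segments."""
    target = np.zeros((num_steps, num_classes), dtype=np.float32)
    for start_sec, end_sec, class_index in segments:
        # Convert seconds to feature steps and clamp to the valid range.
        start = max(int(np.floor(start_sec * steps_per_second)), 0)
        end = min(int(np.ceil(end_sec * steps_per_second)), num_steps)
        target[start:end, class_index] = 1.0
    # Steps not covered by any action are labeled as background.
    no_action = target.sum(axis=1) == 0
    target[no_action, background_index] = 1.0
    return target
```

The result could be saved with `np.save` under `target_perframe/`, using the same file name as the corresponding feature file.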
