leonhlj / mmsd

The official implementation of Multi-Modality Self-Distillation for Weakly Supervised Temporal Action Localization

License: MIT License

Python 100.00%
deep-learning pytorch weakly-supervised-learning temporal-action-localization

MMSD

Multi-Modality Self-Distillation for Weakly Supervised Temporal Action Localization
Linjiang Huang (CUHK), Liang Wang (CASIA), Hongsheng Li (CUHK)

Paper: TIP

Overview

We propose a pseudo-label-based method that takes full advantage of multiple modalities, i.e., RGB and optical flow sequences, to generate high-quality pseudo labels. The experimental results on THUMOS14 are shown below.

Method \ mAP(%)   @0.1  @0.2  @0.3  @0.4  @0.5  @0.6  @0.7   AVG
UntrimmedNet      44.4  37.7  28.2  21.1  13.7     -     -     -
STPN              52.0  44.7  35.5  25.8  16.9   9.9   4.3  27.0
W-TALC            55.2  49.6  40.1  31.1  22.8     -   7.6     -
AutoLoc              -     -  35.8  29.0  21.2  13.4   5.8     -
CleanNet             -     -  37.0  30.9  23.9  13.9   7.1     -
MAAN              59.8  50.8  41.1  30.6  20.3  12.0   6.9  31.6
CMCS              57.4  50.8  41.2  32.1  23.1  15.0   7.0  32.4
BM                60.4  56.0  46.6  37.5  26.8  17.6   9.0  36.3
RPN               62.3  57.0  48.2  37.2  27.9  16.7   8.1  36.8
DGAM              60.0  54.2  46.8  38.2  28.8  19.8  11.4  37.0
TSCN              63.4  57.6  47.8  37.7  28.7  19.4  10.2  37.8
EM-MIL            59.1  52.7  45.5  36.8  30.5  22.7  16.4  37.7
BaS-Net           58.2  52.3  44.6  36.0  27.0  18.6  10.4  35.3
A2CL-PT           61.2  56.1  48.1  39.0  30.1  19.2  10.6  37.8
ACM-BANet         64.6  57.7  48.9  40.9  32.3  21.9  13.5  39.9
HAM-Net           65.4  59.0  50.3  41.1  31.0  20.7  11.1  39.8
ACSNet               -     -  51.4  42.7  32.4  22.0  11.7     -
WUM               67.5  61.2  52.3  43.4  33.7  22.9  12.1  41.9
AUMN              66.2  61.9  54.9  44.4  33.3  20.5   9.0  41.5
CoLA              66.2  59.5  51.5  41.9  32.2  22.0  13.1  40.9
ASL               67.0     -  51.8     -  31.1     -  11.4     -
MMSD (Ours)       69.7  64.3  54.6  45.0  36.4  23.0  12.3  43.6
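
The AVG column appears to be the arithmetic mean of the mAP values over the seven thresholds @0.1 to @0.7. This is inferred from the numbers rather than stated in the table. A minimal check, using the MMSD and STPN rows copied from the table:

```python
# mAP (%) at thresholds @0.1 ... @0.7, copied from the table rows.
mmsd_map = [69.7, 64.3, 54.6, 45.0, 36.4, 23.0, 12.3]
stpn_map = [52.0, 44.7, 35.5, 25.8, 16.9, 9.9, 4.3]


def average_map(values):
    """Return the mean of per-threshold mAP values, rounded to one decimal."""
    return round(sum(values) / len(values), 1)


print(average_map(mmsd_map))  # 43.6, the AVG entry for MMSD (Ours)
print(average_map(stpn_map))  # 27.0, the AVG entry for STPN
```

Rows with a missing threshold entry also report "-" for AVG, which is consistent with this reading.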

Prerequisites

Recommended Environment

  • Python 3.6
  • PyTorch 1.2
  • Tensorboard Logger
  • CUDA 10.0
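
To compare a local setup against the list above, a small script can print the installed versions. This is only a sketch: the function name `environment_report` is made up for illustration, and PyTorch is treated as optional so the script still runs when it is not installed.

```python
import sys


def environment_report():
    """Collect the versions relevant to the recommended environment."""
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        import torch  # only importable if PyTorch is installed

        report["torch"] = torch.__version__
        report["cuda"] = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        report["torch"] = None
        report["cuda"] = None
    return report


if __name__ == "__main__":
    for name, version in environment_report().items():
        print(name, version)
```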

Data Preparation

  1. Prepare the THUMOS'14 dataset.

    • We recommend using the features and annotations provided by this repo.
  2. Place the features and annotations inside a dataset/Thumos14reduced/ folder.
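
Before training, it can help to confirm that the folder is where the code expects it. The helper below is a hypothetical convenience, not part of the repo. It makes no assumption about which file names the features and annotations use, and only checks that the folder exists and is not empty.

```python
from pathlib import Path


def check_dataset_root(root="dataset/Thumos14reduced"):
    """Return the sorted entry names under `root`; raise if it is missing or empty."""
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError("dataset folder not found: {}".format(root))
    names = sorted(p.name for p in root.iterdir())
    if not names:
        raise FileNotFoundError("dataset folder is empty: {}".format(root))
    return names
```

Calling `check_dataset_root()` from the repo root lists what was placed in the folder, or fails early with a readable message.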

Usage

Training

You can easily train the model by running the provided script.

  • Refer to train_options.py. Set the dataset-root argument to the path of your dataset folder.

  • Run the command below.

$ python train_main.py --run-type 0 --model-id 1

Models are saved in ./ckpt/dataset_name/model_id/
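
The contents of train_options.py are not reproduced here. As a rough sketch of how options with these names are commonly declared with argparse, and how the checkpoint folder could be derived from them: the types, defaults, and the helper `checkpoint_dir` are assumptions for illustration, and the actual file may differ.

```python
import argparse
from pathlib import Path


def build_parser():
    """Hypothetical parser mirroring the flag names used in this README."""
    parser = argparse.ArgumentParser(description="MMSD options (illustrative sketch)")
    # Assumed default: the folder named under Data Preparation.
    parser.add_argument("--dataset-root", type=str, default="dataset/Thumos14reduced")
    # The commands in this README pass 0 when training and 1 when evaluating.
    parser.add_argument("--run-type", type=int, default=0)
    parser.add_argument("--model-id", type=int, default=1)
    return parser


def checkpoint_dir(dataset_name, model_id):
    """Build ./ckpt/dataset_name/model_id/ from its two components."""
    return Path("ckpt") / dataset_name / str(model_id)


args = build_parser().parse_args(["--run-type", "0", "--model-id", "1"])
print(args.dataset_root, args.run_type, args.model_id)
```

argparse converts the dashes in `--dataset-root` to underscores, so the value is read as `args.dataset_root`.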

Evaluation

The trained model can be found here. Please put it into ./ckpt/dataset_name/model_id/.

  • Run the command below.
$ python train_main.py --pretrained --run-type 1 --model-id 1 --load-epoch 240

load-epoch refers to the epoch of the best model. The best model does not always occur at epoch 240; check the log in the same folder as the saved models and set load-epoch to the best epoch recorded there. Make sure model-id matches the model-id used during training.
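
Choosing load-epoch amounts to finding the epoch with the highest evaluation score in the training log. The log format is not described here, so the sketch below assumes the scores have already been read into a dictionary keyed by epoch. The function name and the numbers are placeholders for illustration, not results from the paper or the repo.

```python
def best_epoch(scores):
    """Return the epoch with the highest score; ties resolve to the earliest epoch."""
    if not scores:
        raise ValueError("no scores given")
    # max() keeps the first maximal item, so iterate epochs in ascending order.
    return max(sorted(scores), key=lambda epoch: scores[epoch])


# Placeholder values for illustration only.
example_scores = {220: 0.41, 230: 0.43, 240: 0.42}
print(best_epoch(example_scores))  # 230
```

The returned epoch is the value to pass as --load-epoch.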

References

We referenced the repos below for the code.

Contact

If you have any questions or comments, please contact the first author of the paper, Linjiang Huang ([email protected]).
