pokameng / acar-net

This project forked from siyu-c/acar-net

[CVPR 2021] Actor-Context-Actor Relation Network for Spatio-temporal Action Localization

License: Apache License 2.0

This repository contains the official PyTorch implementation of Actor-Context-Actor Relation Network for Spatio-temporal Action Localization (CVPR 2021), the 1st place solution of the AVA-Kinetics Crossover Challenge 2020. The codebase also provides a general pipeline for training and evaluation on AVA-style datasets, as well as state-of-the-art action detection models.

Junting Pan Siyu Chen Zheng Shou Jing Shao Hongsheng Li

Requirements

Some key dependencies are listed below, while others are given in requirements.txt.

  • Python >= 3.6
  • PyTorch >= 1.3, and a corresponding version of torchvision
  • ffmpeg (used in data preparation)

Before running the code, also complete these setup steps:

  • Download the pre-trained models listed in pretrained/README.md to the pretrained folder.
  • Prepare the data as described in DATA.md.
  • Download the annotation files to the annotations folder. See annotations/README.md for details.
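
The dependencies listed above can be checked with a short script before starting. The following is a minimal sketch, not part of the repository: the helper names `version_tuple` and `check_environment` are made up for illustration, and the script does not verify that torchvision matches the installed PyTorch version.

```python
import re
import shutil
import sys


def version_tuple(version):
    """Extract (major, minor) from strings such as '1.13.1+cu117'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_environment():
    """Return a dict mapping each listed requirement to True or False."""
    results = {"python>=3.6": sys.version_info >= (3, 6)}
    try:
        import torch  # imported lazily so the check still runs without PyTorch
        results["torch>=1.3"] = version_tuple(torch.__version__) >= (1, 3)
    except ImportError:
        results["torch>=1.3"] = False
    # ffmpeg is only needed for data preparation; here we just look it up on PATH.
    results["ffmpeg on PATH"] = shutil.which("ffmpeg") is not None
    return results


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Comparing only the major and minor components keeps the check tolerant of local build suffixes such as `+cu117`.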

Usage

The arguments nproc_per_node, backend, and master_port default to 8, nccl, and 31114, respectively.

python main.py --config CONFIG_FILE [--nproc_per_node N_PROCESSES] [--backend BACKEND] [--master_addr MASTER_ADDR] [--master_port MASTER_PORT]
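
To make the defaults concrete, here is a small sketch of an argument parser that mirrors the documented flags. It is an illustration only and may differ from the actual main.py. The `None` default for `master_addr` is an assumption (no default is documented above), and `configs/example.yaml` is a placeholder path.

```python
import argparse


def build_parser():
    # Illustrative parser only; the real main.py may define these differently.
    parser = argparse.ArgumentParser(description="Launcher argument sketch")
    parser.add_argument("--config", required=True, help="path to CONFIG_FILE")
    parser.add_argument("--nproc_per_node", type=int, default=8)
    parser.add_argument("--backend", type=str, default="nccl")
    # Assumption: no default is documented for master_addr, so None is used here.
    parser.add_argument("--master_addr", type=str, default=None)
    parser.add_argument("--master_port", type=int, default=31114)
    return parser


# Override one value and let the others fall back to their documented defaults.
args = build_parser().parse_args(
    ["--config", "configs/example.yaml", "--nproc_per_node", "4"]
)
print(args.nproc_per_node, args.backend, args.master_port)  # prints: 4 nccl 31114
```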

Running with Multiple Machines

In this case, the master_addr argument must be provided. The arguments nnodes and node_rank can also be specified, as with torch.distributed.launch. If they are omitted, the program tries to obtain their values from environment variables. See distributed_utils.py for details.
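
The fallback described above can be sketched as follows. This is a hedged illustration, not the code in distributed_utils.py: the environment variable names `EXAMPLE_NNODES` and `EXAMPLE_NODE_RANK` are hypothetical placeholders, and the helpers `resolve_int` and `global_rank` are invented for this example. The rank arithmetic follows the convention of torch.distributed.launch, where ranks are contiguous within each node.

```python
import os


def resolve_int(cli_value, env_name, environ=None):
    """Return the CLI value if given, otherwise read an integer from the environment."""
    if cli_value is not None:
        return int(cli_value)
    environ = os.environ if environ is None else environ
    if env_name not in environ:
        raise RuntimeError(f"{env_name} is not set and no CLI value was given")
    return int(environ[env_name])


def global_rank(node_rank, nproc_per_node, local_rank):
    # torch.distributed.launch convention: ranks are contiguous within each node.
    return node_rank * nproc_per_node + local_rank


# Placeholder environment for a two-machine job; the variable names are hypothetical.
fake_env = {"EXAMPLE_NNODES": "2", "EXAMPLE_NODE_RANK": "1"}
nnodes = resolve_int(None, "EXAMPLE_NNODES", fake_env)
node_rank = resolve_int(None, "EXAMPLE_NODE_RANK", fake_env)
world_size = nnodes * 8  # 8 is the documented default for nproc_per_node
print(world_size, global_rank(node_rank, 8, local_rank=0))  # prints: 16 8
```

An explicitly passed value always wins over the environment, so the same launch command works both under a scheduler and when started by hand.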

To-do List

  • Model zoo
  • More advanced backbone
  • Data preparation for Kinetics dataset, and training on AVA-Kinetics
  • Implementation for ACFB

License

ACAR-Net is released under the Apache 2.0 license.

CVPR 2020 AVA-Kinetics Challenge

Slides and a video presentation of our winning solution are available at [Google Slides] [Youtube Video] [Bilibili Video] (starting from 18:20).

Preprint

Find our work on arXiv.

[Figure: architecture]

Please cite our work with the following BibTeX entry:

@article{pan2020actorcontextactor,
  title={Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization},
  author={Junting Pan and Siyu Chen and Zheng Shou and Jing Shao and Hongsheng Li},
  journal={arXiv preprint arXiv:2006.07976},
  year={2020}
}

You may also want to refer to our publication with the more human-friendly Chicago style:

Junting Pan, Siyu Chen, Zheng Shou, Jing Shao, Hongsheng Li. "Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization." arXiv 2020.

Contact

If you have general questions about our work or code that may be of interest to other researchers, please use the public issues section of this repository. Alternatively, send us an e-mail at [email protected] and [email protected].
