osvai / ske2grid

The official project website of "Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition" (The paper of Ske2Grid is published in ICML 2023)

License: Apache License 2.0

Python 100.00%

Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition

By Dongqi Cai, Yangyuxuan Kang, Anbang Yao and Yurong Chen.

This repository is the official PyTorch implementation of "Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition", dubbed Ske2Grid, published in ICML 2023.

Overview

Ske2Grid is a progressive representation learning framework built on transforming the human skeleton graph into an up-sampled grid representation. It is dedicated to skeleton-based human action recognition and shows leading performance on six mainstream benchmarks.

Comparison of convolution operations in GCNs and in our Ske2Grid. In Ske2Grid, we construct a regular grid patch for skeleton representation via an up-sampling transform (UPT) and a graph-node index transform (GIT). The convolution operation on this grid patch convolves every grid cell using a shared regular kernel. It operates on a set of grid cells within a square sub-patch, which may be filled by a set of nodes that are far apart on the graph, achieving a learnable receptive field on the skeleton for action feature modeling. In the figure, the up-sampled skeleton graph is visualized assuming the locations of the original graph nodes are unchanged, for better illustration.
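
As a rough illustration of the description above, the sketch below builds a grid patch from joint features and applies one shared 3x3 kernel to a single cell. It is not the repository's implementation: the function names, the random matrix standing in for a learned UPT, the random permutation standing in for a learned GIT, and the sizes (17 joints, a 5x5 patch) are all assumptions made for the example.

```python
import numpy as np


def skeleton_to_grid(x, upt_weight, git_index, height, width):
    """Map joint features x of shape (N, C) to a grid patch of shape (H, W, C).

    upt_weight: (H*W, N) matrix standing in for the up-sampling transform (UPT).
    git_index:  permutation of range(H*W) standing in for the graph-node
                index transform (GIT); grid cell k receives node git_index[k].
    """
    upsampled = upt_weight @ x          # (H*W, C): N joints -> H*W nodes
    ordered = upsampled[git_index]      # reorder nodes into row-major cell order
    return ordered.reshape(height, width, -1)


def conv3x3_at(grid, kernel, row, col):
    """Apply a shared 3x3 kernel of shape (3, 3, C_in, C_out) at an interior cell."""
    patch = grid[row - 1:row + 2, col - 1:col + 2]   # (3, 3, C_in) square sub-patch
    return np.einsum("ijc,ijco->o", patch, kernel)   # (C_out,)


rng = np.random.default_rng(0)
N, C, H, W, C_OUT = 17, 3, 5, 5, 8      # hypothetical sizes
x = rng.normal(size=(N, C))
upt_weight = rng.normal(size=(H * W, N))
git_index = rng.permutation(H * W)
kernel = rng.normal(size=(3, 3, C, C_OUT))

grid = skeleton_to_grid(x, upt_weight, git_index, H, W)
out = conv3x3_at(grid, kernel, row=2, col=2)

# The nine cells under the kernel hold up-sampled nodes chosen by git_index,
# so they need not be neighbours on the original skeleton graph.
covered_nodes = git_index.reshape(H, W)[1:4, 1:4]
print(grid.shape, out.shape, covered_nodes.shape)
```

Because the cell-to-node assignment comes from `git_index` rather than from graph adjacency, changing that assignment changes which nodes a fixed square kernel sees, which is the sense in which the receptive field is learnable.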

(a) The overall framework of Ske2Grid: the input skeleton graph with $N$ joints is converted to a grid patch of size $H\times W$ using a pair of up-sampling transform (UPT) and graph-node index transform (GIT), which is then fed into the Ske2Grid convolution network for action recognition. (b) Ske2Grid with progressive learning strategy (PLS): the input skeleton is converted to a larger grid patch ($H'>H, W'>W$) using two-stage UPT plus GIT pairs. The well-trained Ske2Grid convolution network for the first-stage grid patch as in (a) is re-used to initialize the network for the second-stage grid patch as in (b), and the first-stage UPT plus GIT pair is fixed during training. PLS is used in a cascaded way to boost the performance of our Ske2Grid convolution network with increasing grid patch size.
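
One way to read the two-stage conversion in (b) is as a composition of two UPT plus GIT pairs, with the first pair held fixed. The sketch below follows that reading with NumPy. The helper names, the random matrices and permutations standing in for learned transforms, and the sizes (17 joints, 5x5 then 6x6) are assumptions, and the actual training code in the repository may differ.

```python
import numpy as np


def upt_git(nodes, upt_weight, git_index):
    """Up-sample nodes (M, C) to (K, C) with upt_weight (K, M), then reorder by git_index."""
    return (upt_weight @ nodes)[git_index]


def two_stage_grid(x, stage1, stage2, height, width):
    """Convert joints x (N, C) to an (H', W', C) grid patch via two cascaded pairs."""
    nodes1 = upt_git(x, *stage1)        # N -> H*W, first-stage pair (kept fixed)
    nodes2 = upt_git(nodes1, *stage2)   # H*W -> H'*W', second-stage pair (trained)
    return nodes2.reshape(height, width, -1)


rng = np.random.default_rng(0)
N, C = 17, 3                 # hypothetical joint count and channels
H1 = W1 = 5                  # first-stage grid patch
H2 = W2 = 6                  # second-stage grid patch

# Each stage is a (UPT matrix, GIT permutation) pair.
stage1 = (rng.normal(size=(H1 * W1, N)), rng.permutation(H1 * W1))
stage2 = (rng.normal(size=(H2 * W2, H1 * W1)), rng.permutation(H2 * W2))

x = rng.normal(size=(N, C))
grid = two_stage_grid(x, stage1, stage2, H2, W2)
print(grid.shape)
```

In a training loop, only the second-stage pair and the convolution network (initialized from the first-stage weights) would receive gradient updates under this reading.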

Usage

  • Download PYSKL:
      git clone https://github.com/kennymckormick/pyskl.git
  • Prepare datasets following the PYSKL data format, or download the pre-processed 2D or 3D skeletons from the PYSKL repository.
  • Merge our "models", "utils" and "configs" folders into the corresponding "pyskl/models", "pyskl/utils" and "pyskl/configs" folders, respectively.
  • Replace the "pyskl/datasets" folder with our "datasets" folder (to avoid mismatches caused by PYSKL version updates).
  • Install PYSKL as officially instructed:
      pip3 install -e .
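
The merge and replace steps above can be scripted. The sketch below is one way to do it with the Python standard library (Python 3.8 or later for `dirs_exist_ok`). The function name `merge_into_pyskl` and both arguments are hypothetical, and the script takes the "pyskl/..." destination paths literally, so check them against your local PYSKL checkout before running it.

```python
import shutil
from pathlib import Path


def merge_into_pyskl(ske2grid_root, pyskl_parent):
    """Copy the Ske2Grid folders into PYSKL as the steps above describe.

    ske2grid_root: checkout of this repository (contains models/, utils/,
                   configs/ and datasets/).
    pyskl_parent:  directory against which the "pyskl/..." paths in the
                   steps above resolve (an assumption; verify it locally).
    """
    src = Path(ske2grid_root)
    dst = Path(pyskl_parent) / "pyskl"

    # Merge: add or overwrite files, keep existing PYSKL files that have
    # no counterpart in the Ske2Grid folders.
    for name in ("models", "utils", "configs"):
        shutil.copytree(src / name, dst / name, dirs_exist_ok=True)

    # Replace: drop PYSKL's datasets folder entirely, then copy the new one.
    if (dst / "datasets").exists():
        shutil.rmtree(dst / "datasets")
    shutil.copytree(src / "datasets", dst / "datasets")
```

A call such as `merge_into_pyskl("ske2grid", ".")` (with placeholder paths) would then be followed by the install step.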

Models & Results

Results comparison on the NTU-60 XSub validation set.

Method     Grid Patch Size   Config   Top-1 Acc (%)   Model   Log
ST-GCN     --                config   85.15           model   log
Ske2Grid   5x5               config   86.20           model   log
Ske2Grid   6x6               config   87.87           model   log
Ske2Grid   7x7               config   88.26           model   log
Ske2Grid   8x8               config   88.55           model   log
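
Reading the table as gains over the ST-GCN baseline makes the trend with grid patch size easier to see. The snippet below only redoes that subtraction on the accuracies reported above; the variable names are illustrative.

```python
BASELINE = 85.15  # ST-GCN top-1 accuracy (%) from the table above

# Ske2Grid top-1 accuracy (%) per grid patch size, copied from the table.
ske2grid_top1 = {"5x5": 86.20, "6x6": 87.87, "7x7": 88.26, "8x8": 88.55}

# Absolute gain in percentage points over the baseline.
gains = {size: round(acc - BASELINE, 2) for size, acc in ske2grid_top1.items()}
for size, gain in gains.items():
    print(f"{size}: +{gain:.2f}")
```

This gives gains of 1.05, 2.72, 3.11 and 3.40 points for the 5x5 through 8x8 grid patches.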

Training & Testing

Training Ske2Grid using grid patch representation of $D_{5\times 5}$:

bash tools/dist_train.sh configs/Ske2Grid/d5_stgcn2cn_pyskl_ntu60_xsub_hrnet/j.py 4 --validate

Progressively training Ske2Grid of $D_{6\times 6}$ from the previously trained Ske2Grid model of $D_{5\times 5}$:

bash tools/dist_train.sh configs/Ske2Grid/d5tod6_stgcn2cn_pyskl_ntu60_xsub_hrnet/j.py 4 --validate --cfg-options load_from='work_dirs/ske2grid/d5_stgcn2cn_pyskl_ntu60_xsub_hrnet/j/best_top1_acc_epoch_*.pth'

Evaluating Ske2Grid of grid patch $D_{5\times 5}$:

bash tools/dist_test.sh configs/Ske2Grid/d5_stgcn2cn_pyskl_ntu60_xsub_hrnet/j.py work_dirs/ske2grid/d5_stgcn2cn_pyskl_ntu60_xsub_hrnet/j/best_top1_acc_epoch_*.pth 4 --out results_d5.pkl --eval top_k_accuracy mean_class_accuracy
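
The `*` in the checkpoint paths above stands for the epoch number of the best checkpoint. In the progressive-training command it sits inside single quotes, so the shell will not expand it there. A small helper such as the hypothetical `latest_best_checkpoint` below can resolve the concrete file name to pass to `load_from`. It assumes the `best_top1_acc_epoch_<N>.pth` naming seen in the commands and is not part of PYSKL or this repository.

```python
import glob
import re


def latest_best_checkpoint(work_dir):
    """Return the best_top1_acc checkpoint with the highest epoch number in work_dir."""
    pattern = f"{work_dir}/best_top1_acc_epoch_*.pth"
    candidates = glob.glob(pattern)
    if not candidates:
        raise FileNotFoundError(f"no checkpoint matches {pattern}")

    def epoch_of(path):
        # Compare epochs numerically so that epoch 12 ranks above epoch 7.
        match = re.search(r"best_top1_acc_epoch_(\d+)\.pth$", path)
        return int(match.group(1)) if match else -1

    return max(candidates, key=epoch_of)
```

For example, `latest_best_checkpoint("work_dirs/ske2grid/d5_stgcn2cn_pyskl_ntu60_xsub_hrnet/j")` returns a single path that can replace the wildcard in either command.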

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{cai2023ske2grid,
  title={Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition},
  author={Cai, Dongqi and Kang, Yangyuxuan and Yao, Anbang and Chen, Yurong},
  booktitle={International Conference on Machine Learning},
  year={2023},
  url={https://openreview.net/forum?id=SQtp4uUByd}
}

License

Ske2Grid is released under the Apache License 2.0. We encourage use for both research and commercial purposes, as long as proper attribution is given.

Acknowledgement

This repository is built on the PYSKL repository. We thank the authors for releasing their excellent code.
