This project forked from bmartacho/unipose

We propose UniPose, a unified framework for human pose estimation, based on our “Waterfall” Atrous Spatial Pooling architecture, that achieves state-of-the-art results on several pose estimation metrics. Current pose estimation methods utilizing standard CNN architectures heavily rely on statistical postprocessing or predefined anchor poses for joint localization. UniPose incorporates contextual segmentation and joint localization to estimate the human pose in a single stage, with high accuracy, without relying on statistical postprocessing methods. The Waterfall module in UniPose leverages the efficiency of progressive filtering in the cascade architecture, while maintaining multi-scale fields-of-view comparable to spatial pyramid configurations. Additionally, our method is extended to UniPose-LSTM for multi-frame processing and achieves state-of-the-art results for temporal pose estimation in video. Our results on multiple datasets demonstrate that UniPose, with a ResNet backbone and Waterfall module, is a robust and efficient architecture for pose estimation, obtaining state-of-the-art results in single-person pose detection for both single images and videos.

License: Other

UniPose

UniPose: Unified Human Pose Estimation in Single Images and Videos.


NEW!: BAPose: Bottom-Up Pose Estimation with Disentangled Waterfall Representations

Our novel framework for bottom-up multi-person pose estimation achieves state-of-the-art results on several datasets. The pre-print of our new method, BAPose, is available at the following link: BAPose pre-print. Full code for the BAPose framework is scheduled to be released in the near future.


NEW!: UniPose+: A unified framework for 2D and 3D human pose estimation in images and videos

Our novel and improved UniPose+ framework for pose estimation achieves state-of-the-art results on several datasets. UniPose+ is available at the following link: UniPose+ at PAMI. Full code for the UniPose+ framework is scheduled to be released in the near future.


NEW!: OmniPose: A Multi-Scale Framework for Multi-Person Pose Estimation

Our novel framework for multi-person pose estimation achieves state-of-the-art results on several datasets. The pre-print of our new method, OmniPose, is available at the following link: OmniPose pre-print. Full code for the OmniPose framework is scheduled to be released in the near future. GitHub: https://github.com/bmartacho/OmniPose.


Figure 1: UniPose architecture for single frame pose detection. The input color image of dimensions (HxW) is fed through the ResNet backbone and WASP module to obtain 256 feature channels at reduced resolution by a factor of 8. The decoder module generates K heatmaps, one per joint, at the original resolution, and the locations of the joints are determined by a local max operation.
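The final step in the Figure 1 caption, reading joint locations off the K heatmaps, can be sketched as follows. This is a minimal illustration, not the repository's code: `decode_joints` and the toy heatmaps are hypothetical, and the sketch simplifies the "local max operation" to a global maximum per heatmap, which is a reasonable assumption only for the single-person case. It is written in plain Python to stay self-contained; in practice an array library's argmax would normally do this.

```python
from typing import List, Tuple

Heatmap = List[List[float]]  # H rows by W columns of per-pixel joint scores


def decode_joints(heatmaps: List[Heatmap]) -> List[Tuple[int, int, float]]:
    """Return (x, y, score) of the highest-scoring pixel in each heatmap.

    Assumption: one person per image, so each of the K heatmaps has a
    single peak of interest and a global maximum is enough.
    """
    joints = []
    for hm in heatmaps:
        best = (0, 0, float("-inf"))
        for y, row in enumerate(hm):
            for x, score in enumerate(row):
                if score > best[2]:
                    best = (x, y, score)
        joints.append(best)
    return joints


# Two toy 3x4 heatmaps, i.e. K = 2 joints (values are made up).
toy_heatmaps = [
    [[0.0, 0.1, 0.0, 0.0],
     [0.0, 0.2, 0.9, 0.0],
     [0.0, 0.0, 0.1, 0.0]],
    [[0.7, 0.1, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.3]],
]

joints = decode_joints(toy_heatmaps)
print(joints)  # [(2, 1, 0.9), (0, 0, 0.7)]
```

Because the decoder outputs heatmaps at the original resolution, the returned (x, y) pairs are already image coordinates and need no rescaling.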


Figure 2: UniPose-LSTM architecture for pose estimation in videos. The joint heatmaps from the decoder of UniPose are fed into the LSTM along with the final heatmaps from the previous LSTM state. The convolutional layers following the LSTM reorganize the outputs into the final heatmaps used for joint localization.


We propose UniPose, a unified framework for human pose estimation, based on our "Waterfall" Atrous Spatial Pooling architecture, that achieves state-of-the-art results on several pose estimation metrics. UniPose incorporates contextual segmentation and joint localization to estimate the human pose in a single stage, with high accuracy, without relying on statistical postprocessing methods. The Waterfall module in UniPose leverages the efficiency of progressive filtering in the cascade architecture, while maintaining multi-scale fields-of-view comparable to spatial pyramid configurations. Additionally, our method is extended to UniPose-LSTM for multi-frame processing and achieves state-of-the-art results for temporal pose estimation in video. Our results on multiple datasets demonstrate that UniPose, with a ResNet backbone and Waterfall module, is a robust and efficient architecture for pose estimation, obtaining state-of-the-art results in single-person pose detection for both single images and videos.

We propose the “Waterfall Atrous Spatial Pyramid” (WASP) module, shown in Figure 3. WASP is a novel architecture built on atrous convolutions that leverages both the larger field-of-view of the Atrous Spatial Pyramid Pooling configuration and the reduced size of the cascade approach.


Figure 3: WASP Module.
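To make the field-of-view argument concrete, the sketch below compares the fields-of-view of parallel atrous branches with those of a cascade in which each atrous convolution filters the previous one's output. It rests on several assumptions that this README does not state: 3x3 kernels, stride 1, and dilation rates of 6, 12, 18 and 24. The helper names are hypothetical. It uses the standard effective kernel size of a dilated convolution, k + (k - 1)(r - 1).

```python
from typing import List, Sequence


def effective_kernel(kernel: int, rate: int) -> int:
    """Effective kernel size of a dilated (atrous) convolution."""
    return kernel + (kernel - 1) * (rate - 1)


def parallel_fov(rates: Sequence[int], kernel: int = 3) -> List[int]:
    """Field-of-view of each branch when all branches see the same input."""
    return [effective_kernel(kernel, r) for r in rates]


def cascade_fov(rates: Sequence[int], kernel: int = 3) -> List[int]:
    """Field-of-view after each stage when stages are chained (stride 1)."""
    fov = 1
    per_stage = []
    for r in rates:
        # Each stride-1 layer widens the field-of-view by (k_eff - 1).
        fov += effective_kernel(kernel, r) - 1
        per_stage.append(fov)
    return per_stage


RATES = (6, 12, 18, 24)  # assumed dilation rates, for illustration only

parallel = parallel_fov(RATES)
cascade = cascade_fov(RATES)
print(parallel)  # [13, 25, 37, 49]
print(cascade)   # [13, 37, 73, 121]
```

Under these assumptions, the chained stages still provide a range of scales, one per stage output, which is the multi-scale property the text attributes to pyramid configurations. The numbers count only the atrous stages and ignore the field-of-view that the ResNet backbone already contributes.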


Examples of pose estimation with the UniPose architecture, for both single images and videos, are shown in Figure 4.


Figure 4: Pose estimation samples for UniPose in images and videos.

Link to the published article at CVPR 2020.


Datasets:

Datasets used in this paper and required for training, validation, and testing can be downloaded directly from the dataset websites below:
LSP Dataset: https://sam.johnson.io/research/lsp.html
MPII Dataset: http://human-pose.mpi-inf.mpg.de/
PennAction Dataset: http://dreamdragon.github.io/PennAction/
BBC Pose Dataset: https://www.robots.ox.ac.uk/~vgg/data/pose/


Pre-trained Models:

The pre-trained weights can be downloaded here.


Contact:

Bruno Artacho:
E-mail: [email protected]
Website: https://www.brunoartacho.com

Andreas Savakis:
E-mail: [email protected]
Website: https://www.rit.edu/directory/axseec-andreas-savakis

Citation:

Artacho, B.; Savakis, A. UniPose: Unified Human Pose Estimation in Single Images and Videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

Latex:
@InProceedings{Artacho_2020_CVPR,
title = {UniPose: Unified Human Pose Estimation in Single Images and Videos},
author = {Artacho, Bruno and Savakis, Andreas},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020},
}
