lrxjason / attention3dhumanpose

This project is the official implementation of our CVPR 2020 paper "Attention Mechanism Exploits Temporal Contexts: Real-time 3D Human Pose Reconstruction".

Home Page: https://sites.google.com/a/udayton.edu/jshen1/cvpr2020

attention3dhumanpose's Introduction

Attention Mechanism Exploits Temporal Contexts: Real-time 3D Human Pose Reconstruction (CVPR 2020 Oral)

More extensive evaluation and code can be found at our lab website: https://sites.google.com/a/udayton.edu/jshen1/cvpr2020

PyTorch code for the paper "Attention Mechanism Exploits Temporal Contexts: Real-time 3D Human Pose Reconstruction" (pdf).

If you find this code useful, please cite the following paper:

@inproceedings{liu2020attention,
  title={Attention Mechanism Exploits Temporal Contexts: Real-Time 3D Human Pose Reconstruction},
  author={Liu, Ruixu and Shen, Ju and Wang, He and Chen, Chen and Cheung, Sen-ching and Asari, Vijayan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5064--5073},
  year={2020}
}

Environment

The code was developed and tested in the following environment:

  • Python 3.6
  • PyTorch 1.1 or higher
  • CUDA 10

Dataset

The source code supports training and evaluation on the Human3.6M dataset. Our code is compatible with the dataset setup introduced by Martinez et al. and Pavllo et al. Please refer to VideoPose3D to set up the Human3.6M dataset (in the ./data directory). We provide the training 2D CPN data here and the 3D ground-truth data here. The 3D Avatar model and code are available here.
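
For reference, here is a minimal sketch of loading the resulting files, assuming the standard VideoPose3D file names and .npz layout (these names come from the VideoPose3D setup and are not guaranteed to match this repository exactly):

import numpy as np

# 3D ground-truth poses, keyed by subject and then by action
data_3d = np.load('data/data_3d_h36m.npz', allow_pickle=True)['positions_3d'].item()

# 2D CPN detections with the same subject/action layout
data_2d = np.load('data/data_2d_h36m_cpn_ft_h36m_dbb.npz', allow_pickle=True)['positions_2d'].item()

print(data_3d['S1'].keys())               # actions available for subject S1
print(data_2d['S1']['Walking'][0].shape)  # (frames, joints, 2) for camera 0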

Training new models

To train a model from scratch, run:

python run.py -da -tta

-da enables data augmentation during training, and -tta enables test-time augmentation.
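
To illustrate what pose-flipping augmentation does, here is a minimal sketch; the left/right joint index lists are hypothetical placeholders, not the repository's actual values:

import numpy as np

LEFT = [4, 5, 6, 11, 12, 13]   # hypothetical left-side joint indices
RIGHT = [1, 2, 3, 14, 15, 16]  # hypothetical right-side joint indices

def flip_pose(pose):
    """Mirror a (frames, joints, dims) pose array about the x axis and
    swap left/right joints so the result stays anatomically valid."""
    flipped = pose.copy()
    flipped[..., 0] *= -1                                # negate x
    flipped[:, LEFT + RIGHT] = flipped[:, RIGHT + LEFT]  # swap sides
    return flipped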

For example, to train the 243-frame ground-truth model or the causal model from our paper, run:

python run.py -k gt

or

python run.py -k cpn_ft_h36m_dbb --causal

Training takes about 24 hours on a single TITAN RTX GPU.
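
The --causal flag trains a model that looks only at past frames, which is what makes real-time use possible. The sketch below illustrates the padding difference between offline and causal temporal convolutions; it is an illustration, not the repository's actual model code:

import torch
import torch.nn.functional as F

x = torch.randn(1, 16, 243)  # (batch, channels, frames)
k = 3                        # temporal kernel width

# Offline: pad both sides, so each output frame sees past and future.
symmetric = F.pad(x, (k // 2, k // 2))

# Causal: pad only the past, so frame t never depends on frame t+1.
causal = F.pad(x, (k - 1, 0))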

Evaluating pre-trained models

We provide the pre-trained CPN model here and the ground-truth model here. To evaluate them, put them in the ./checkpoint directory and run:

For the CPN model:

python run.py -tta --evaluate cpn.bin

For the ground-truth model:

python run.py -k gt -tta --evaluate gt.bin
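
To sanity-check a downloaded checkpoint before evaluating, something like the following should work, assuming the .bin files are ordinary PyTorch checkpoints (as the --evaluate flag suggests):

import torch

# Load on the CPU just to inspect the contents
ckpt = torch.load('checkpoint/cpn.bin', map_location='cpu')
print(ckpt.keys())  # e.g. model weights plus training metadata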

Visualization and other functions

We keep our code consistent with VideoPose3D. Please refer to their project page for further information.

attention3dhumanpose's Issues

We could not find common/causal_model.py ...

Thank you for your amazing work! But...
I ran into some problems when I tried to run the real-time human pose reconstruction code.
Lines 45-46 of run.py (if args.causal: \n from common.causal_model import *) import causal_model, but it does not exist in the common folder. Was it missed when the code was packaged?
Looking forward to your reply.

How to animate the skeleton?

I am interested in your model framework. The animation model in your demo system is Kung Fu Panda; I would like to know how to animate the skeleton. Can you share the tools or code? Thank you very much!

how to animate the avatar model

Thank you for your amazing work!
I do not know how to animate the avatar; could you share the tools or code for it?

CPN detections for HumanEva

Hey! Thank you for the amazing work. I am looking forward to the release of your code.
In your paper you report results from training your model on CPN detections for HumanEva. Would it be possible to share the CPN 2D detections file (similarly to the CPN detections for Human3.6M, which you are already sharing)?
Additionally, you reference the HumanEva results from Pavllo et al. [35] (VideoPose3D). For Walk S3 you report 27.2, yet their paper notes that "The high error on "Walk" of S3 is due to corrupted mocap data", which is why they report 46.6. How did you fix this issue?
Thank you

The test method seems different from VideoPose3D

VideoPose3D uses UnchunkedGenerator during validation and testing, while Attention3DHumanPose uses ChunkedGenerator. With ChunkedGenerator, sequences are cropped into chunks of equal length, and padding is added when a chunk does not have enough frames. Does this make your test dataset contain some "fake data pairs"?
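
For readers unfamiliar with the distinction, the sketch below shows why fixed-length chunking needs padding at sequence boundaries; it is a simplified illustration, not the repository's actual generator code:

import numpy as np

seq = np.arange(10)  # a 10-frame sequence
chunk_len = 4

chunks = []
for start in range(0, len(seq), chunk_len):
    chunk = seq[start:start + chunk_len]
    if len(chunk) < chunk_len:  # the last chunk is short
        chunk = np.pad(chunk, (0, chunk_len - len(chunk)), mode='edge')
    chunks.append(chunk)

print(chunks)  # the last chunk becomes [8 9 9 9] -- repeated edge frames

The repeated edge frames in the last chunk are presumably the "fake data pairs" the question refers to.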

How to test other receptive field

Thank you for your great work.

I tried -arc 3x3x3 or 3x3x3x3 to test a 27- or 81-frame receptive field, but the script gives an error. It seems the code only supports 3x3x3x3x3.
May I know what I should modify to test other receptive fields, besides changing "-arc"?

Thank you in advance.
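
For context: in VideoPose3D-style dilated temporal convolutions, the receptive field is the product of the per-layer kernel widths, so 3x3x3 covers 27 frames, 3x3x3x3 covers 81, and 3x3x3x3x3 covers 243. A quick check:

from functools import reduce
from operator import mul

for arc in ([3, 3, 3], [3, 3, 3, 3], [3, 3, 3, 3, 3]):
    print(arc, '->', reduce(mul, arc))  # 27, 81, 243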

the test time of real-time inference

Hi~ Thank you for sharing such great work. I have a question about the test time of real-time inference.
Can you provide the details of the test time for real-time inference?
It would be great if you could provide the test code.

Best wishes

generating visual effects with custom video input

Thanks for sharing your great work! But I am still confused about generating visual effects for my custom video. I have made all the modifications needed to generate a video, but I found that the code in this project seems to assume the 2D joint positions are in Human3.6M style, while the instructions from VideoPose3D only generate custom detection results in COCO style. Is there a possible solution for this situation, or is this code designed only for Human3.6M data, as described in the README?

Thanks in advance to anyone who is willing to answer.
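
One rough direction is to derive Human3.6M-style joints from COCO keypoints, as sketched below; the mapping is a coarse, hypothetical approximation (COCO has no pelvis, spine, or thorax joints), not a workflow supported by this repository:

import numpy as np

def coco_to_h36m_like(kpts):
    """kpts: (frames, 17, 2) keypoints in standard COCO order.
    Returns a few H3.6M-style joints derived by averaging."""
    pelvis = kpts[:, [11, 12]].mean(axis=1)  # midpoint of the hips
    thorax = kpts[:, [5, 6]].mean(axis=1)    # midpoint of the shoulders
    spine = (pelvis + thorax) / 2            # very rough spine estimate
    return {'pelvis': pelvis, 'thorax': thorax, 'spine': spine}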

Pretrained 2D CPN model

Hi! Very nice work, and thanks for sharing the code. I would really appreciate it if you could share the weights of the pretrained 2D CPN model used to produce the 2D H3.6M-format keypoints for in-the-wild inference. Thanks!!

Why does test-time augmentation improve the result

In this paper, the test-time augmentation flips the action and adds it to the test set. In my understanding, this kind of data expansion should lead to worse test results, so why is it better here? Is it because of the higher accuracy after flipping?
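
For what it's worth, flip test-time augmentation usually means averaging the predictions on the input and its mirror image rather than enlarging the test set; a minimal sketch, reusing the hypothetical flip_pose helper from the training section above:

def predict_with_tta(model, pose_2d):
    """Average the model's predictions on the input and its mirror."""
    pred = model(pose_2d)
    pred_flipped = model(flip_pose(pose_2d))
    return (pred + flip_pose(pred_flipped)) / 2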
