
This project forked from idea-research/osx


[CVPR 2023] Official implementation of the paper "One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer"

Home Page: https://osx-ubody.github.io/

License: MIT License


One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer

Authors

Jing Lin, Ailing Zeng, Haoqian Wang, Lei Zhang, Yu Li


The proposed UBody dataset

1. Introduction

This repo is the official PyTorch implementation of One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer (CVPR 2023). We propose the first one-stage whole-body mesh recovery method (OSX) and build a large-scale upper-body dataset (UBody). As of March 2023, OSX ranks first on the AGORA benchmark SMPL-X leaderboard.

2. Create Environment

  • PyTorch >= 1.7 + CUDA

    Recommended installation:

    pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
  • Python packages:

    pip install -r requirements.txt
  • mmcv-full:

    pip install openmim
    mim install mmcv-full==1.7.1
  • mmpose:

    cd main/transformer_utils
    python setup.py develop
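
After installation, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch, not part of the repo: it only reads installed package metadata, and the `report_versions` helper and the `PACKAGES` list are illustrative names based on the install commands above.

```python
from importlib import metadata

# Distribution names taken from the install commands above.
PACKAGES = ["torch", "torchvision", "torchaudio", "mmcv-full"]


def report_versions(packages=PACKAGES):
    """Return {package: installed version, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name:12s} {version or 'NOT INSTALLED'}")
```

Reading metadata rather than importing the packages keeps the check fast and avoids initializing CUDA just to print a version string.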

3. Quick demo

  • Slightly modify the torchgeometry kernel code following here.
  • Download the pre-trained OSX model from here.
  • Place input.png and the pre-trained snapshot in the demo folder.
  • Prepare the human_model_files folder as described in the Directory section below and place it at common/utils/human_model_files.
  • Go to any of the main folders and edit the bbox in demo.py.
  • Run python demo.py --gpu 0.
  • If you run this code over SSH without a display device, do the following:
    1. Install OSMesa following https://pyrender.readthedocs.io/en/latest/install/
    2. Reinstall the specific PyOpenGL fork: https://github.com/mmatl/pyopengl
    3. Set the OpenGL backend to egl or osmesa, e.g. os.environ["PYOPENGL_PLATFORM"] = "egl"
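
For the headless case, the environment variable has to be set before pyrender (and therefore PyOpenGL) is imported, otherwise the backend choice is ignored. A minimal sketch of that ordering follows; `use_headless_opengl` is a hypothetical helper, not a function from this repo, and where exactly to call it in demo.py is left to the reader.

```python
import os

SUPPORTED_BACKENDS = ("egl", "osmesa")


def use_headless_opengl(backend="egl"):
    """Select an off-screen PyOpenGL backend; call before importing pyrender."""
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"backend must be one of {SUPPORTED_BACKENDS}, got {backend!r}"
        )
    os.environ["PYOPENGL_PLATFORM"] = backend
    return backend


use_headless_opengl("egl")
# import pyrender  # import only after the variable has been set
```

EGL typically uses the GPU, while OSMesa renders in software, so trying "egl" first and falling back to "osmesa" is a reasonable order.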

4. Directory

(1) Root

The ${ROOT} directory is organized as follows.

${ROOT}  
|-- data  
|-- dataset
|-- demo
|-- main  
|-- pretrained_models
|-- tool
|-- output  
|-- common
|   |-- utils
|   |   |-- human_model_files
|   |   |   |-- smpl
|   |   |   |   |-- SMPL_NEUTRAL.pkl
|   |   |   |   |-- SMPL_MALE.pkl
|   |   |   |   |-- SMPL_FEMALE.pkl
|   |   |   |-- smplx
|   |   |   |   |-- MANO_SMPLX_vertex_ids.pkl
|   |   |   |   |-- SMPL-X__FLAME_vertex_ids.npy
|   |   |   |   |-- SMPLX_NEUTRAL.pkl
|   |   |   |   |-- SMPLX_to_J14.pkl
|   |   |   |-- mano
|   |   |   |   |-- MANO_LEFT.pkl
|   |   |   |   |-- MANO_RIGHT.pkl
|   |   |   |-- flame
|   |   |   |   |-- flame_dynamic_embedding.npy
|   |   |   |   |-- flame_static_embedding.pkl
|   |   |   |   |-- FLAME_NEUTRAL.pkl
  • data contains the data loading code.
  • dataset contains soft links to the image and annotation directories.
  • pretrained_models contains pretrained models.
  • demo contains the demo code.
  • main contains high-level code for training or testing the network.
  • tool contains pre-processing code for AGORA and PyTorch model editing code.
  • output contains logs, trained models, visualized outputs, and test results.
  • common contains the kernel code for Hand4Whole.
  • human_model_files contains the SMPL, SMPL-X, MANO, and FLAME 3D model files. Download the files from [smpl] [smplx] [SMPLX_to_J14.pkl] [mano] [flame]. If you have problems preparing the models, please refer to this issue, where I provide a link for each file.
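
Because the model files come from several separate downloads, a quick check that every file in the tree above is in place can save a failed run. The sketch below is an illustration only: the file list is copied from the tree, and `find_missing_model_files` is a hypothetical helper rather than anything shipped with the repo.

```python
from pathlib import Path

# File names copied from the human_model_files tree above.
EXPECTED_FILES = {
    "smpl": ["SMPL_NEUTRAL.pkl", "SMPL_MALE.pkl", "SMPL_FEMALE.pkl"],
    "smplx": [
        "MANO_SMPLX_vertex_ids.pkl",
        "SMPL-X__FLAME_vertex_ids.npy",
        "SMPLX_NEUTRAL.pkl",
        "SMPLX_to_J14.pkl",
    ],
    "mano": ["MANO_LEFT.pkl", "MANO_RIGHT.pkl"],
    "flame": [
        "flame_dynamic_embedding.npy",
        "flame_static_embedding.pkl",
        "FLAME_NEUTRAL.pkl",
    ],
}


def find_missing_model_files(model_dir):
    """Return the relative paths of expected model files that are absent."""
    model_dir = Path(model_dir)
    return [
        f"{sub}/{name}"
        for sub, names in EXPECTED_FILES.items()
        for name in names
        if not (model_dir / sub / name).is_file()
    ]


if __name__ == "__main__":
    missing = find_missing_model_files("common/utils/human_model_files")
    print("All model files found." if not missing else "Missing:\n  " + "\n  ".join(missing))
```

Run it from ${ROOT} so that the relative path common/utils/human_model_files resolves correctly.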

(2) Data

Organize the datasets in the following directory structure.

${ROOT}  
|-- dataset  
|   |-- AGORA
|   |   |-- data
|   |   |   |-- AGORA_train.json
|   |   |   |-- AGORA_validation.json
|   |   |   |-- AGORA_test_bbox.json
|   |   |   |-- 1280x720
|   |   |   |-- 3840x2160
|   |-- EHF
|   |   |-- data
|   |   |   |-- EHF.json
|   |-- Human36M  
|   |   |-- images  
|   |   |-- annotations  
|   |-- MPII
|   |   |-- data
|   |   |   |-- images
|   |   |   |-- annotations
|   |-- MPI_INF_3DHP
|   |   |-- data
|   |   |   |-- images_1k
|   |   |   |-- MPI-INF-3DHP_1k.json
|   |   |   |-- MPI-INF-3DHP_camera_1k.json
|   |   |   |-- MPI-INF-3DHP_joint_3d.json
|   |   |   |-- MPI-INF-3DHP_SMPL_NeuralAnnot.json
|   |-- MSCOCO  
|   |   |-- images  
|   |   |   |-- train2017  
|   |   |   |-- val2017  
|   |   |-- annotations 
|   |-- PW3D
|   |   |-- data
|   |   |   |-- 3DPW_train.json
|   |   |   |-- 3DPW_validation.json
|   |   |   |-- 3DPW_test.json
|   |   |-- imageFiles

(3) Output

The output folder should have the following directory structure.

${ROOT}  
|-- output  
|   |-- log  
|   |-- model_dump  
|   |-- result  
|   |-- vis  
  • Creating the output folder as a soft link rather than a regular folder is recommended, because it can take up a large amount of storage.
  • log folder contains the training log file.
  • model_dump folder contains saved checkpoints for each epoch.
  • result folder contains final estimation files generated in the testing stage.
  • vis folder contains visualized results.
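
The soft-link recommendation above can be sketched as follows. This is an illustration under assumptions: `link_output_dir` is a hypothetical helper, the storage location is whatever large disk you choose, and creating the four subfolders up front is optional since the training code may create them itself.

```python
import os
from pathlib import Path

OUTPUT_SUBDIRS = ("log", "model_dump", "result", "vis")


def link_output_dir(root, storage_dir):
    """Create the output subfolders in storage_dir and link <root>/output to it."""
    root, storage_dir = Path(root), Path(storage_dir)
    for sub in OUTPUT_SUBDIRS:
        (storage_dir / sub).mkdir(parents=True, exist_ok=True)
    link = root / "output"
    # Refuse to overwrite an existing folder or link (including a broken link).
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists; remove or rename it first")
    os.symlink(storage_dir.resolve(), link, target_is_directory=True)
    return link


# Example call (paths are placeholders):
# link_output_dir(".", "/path/to/large/disk/osx_output")
```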

5. Training OSX

(1) Download Pretrained Encoder

Download the pretrained encoders osx_vit_l.pth and osx_vit_b.pth from here and place them in pretrained_models/.

(2) Train on MSCOCO, Human3.6M, MPII and Test on EHF and AGORA-val

In the main folder, run

python train.py --gpu 0,1,2,3 --lr 1e-4 --exp_name output/train_setting1 --end_epoch 14 --train_batch_size 16

After training, run the following commands to evaluate your trained model on EHF and AGORA-val:

# test on EHF
python test.py --gpu 0,1,2,3 --exp_name output/train_setting1/ --pretrained_model_path ../output/train_setting1/model_dump/snapshot_13.pth --testset EHF
# test on AGORA-val
python test.py --gpu 0,1,2,3 --exp_name output/train_setting1/ --pretrained_model_path ../output/train_setting1/model_dump/snapshot_13.pth --testset AGORA

To speed things up, you can use a lightweight version of OSX by changing the encoder setting with --encoder_setting osx_b, or by changing the decoder setting with --decoder_setting wo_face_decoder.
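
In the commands above, training with --end_epoch 14 is evaluated with snapshot_13.pth, which suggests that checkpoints are numbered from zero. Assuming that convention holds (it is inferred from the commands, not confirmed by the code), the test command for a given run can be assembled as in this sketch; both helper names are hypothetical.

```python
import shlex


def last_snapshot_path(exp_name, end_epoch):
    """Path of the final checkpoint, assuming zero-indexed snapshot names."""
    if end_epoch < 1:
        raise ValueError("end_epoch must be at least 1")
    return f"../{exp_name.rstrip('/')}/model_dump/snapshot_{end_epoch - 1}.pth"


def build_test_command(exp_name, end_epoch, testset, gpus="0,1,2,3"):
    """Assemble the test.py invocation using only flags shown in this README."""
    return [
        "python", "test.py",
        "--gpu", gpus,
        "--exp_name", exp_name,
        "--pretrained_model_path", last_snapshot_path(exp_name, end_epoch),
        "--testset", testset,
    ]


if __name__ == "__main__":
    for testset in ("EHF", "AGORA"):
        print(shlex.join(build_test_command("output/train_setting1", 14, testset)))
```

The script only prints the commands; run them from the main folder, since the checkpoint path is relative to it.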

(3) Train on AGORA and Test on AGORA-test

In the main folder, run

python train.py --gpu 0,1,2,3 --lr 1e-4 --exp_name output/train_setting2 --end_epoch 140 --train_batch_size 16  --agora_benchmark --decoder_setting wo_decoder

After training, run the following command to evaluate your trained model on AGORA-test:

python test.py --gpu 0,1,2,3 --exp_name output/train_setting2/ --pretrained_model_path ../output/train_setting2/model_dump/snapshot_139.pth --testset AGORA --agora_benchmark --test_batch_size 64 --decoder_setting wo_decoder

The reconstruction result will be saved at output/train_setting2/result/.

You can zip the predictions folder into predictions.zip and submit it to the AGORA benchmark to obtain the evaluation metrics.
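
The zipping step can also be done from Python. The sketch below assumes the predictions folder sits directly inside the result directory and that the archive should contain the folder itself at its top level; check the AGORA submission instructions for the exact layout they expect. `zip_predictions` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def zip_predictions(result_dir):
    """Pack <result_dir>/predictions into <result_dir>/predictions.zip."""
    result_dir = Path(result_dir)
    if not (result_dir / "predictions").is_dir():
        raise FileNotFoundError(f"no predictions folder in {result_dir}")
    archive = shutil.make_archive(
        str(result_dir / "predictions"),  # archive path without the .zip suffix
        "zip",
        root_dir=str(result_dir),
        base_dir="predictions",  # keep the folder name inside the archive
    )
    return Path(archive)


# Example call (run from ${ROOT}):
# zip_predictions("output/train_setting2/result")
```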

You can use a lightweight version of OSX by adding --encoder_setting osx_b.

6. Testing OSX

(1) Download Pretrained Models

Download the pretrained models osx_l.pth.tar and osx_l_agora.pth.tar from here and place them in pretrained_models/.

(2) Test on EHF

In the main folder, run

python test.py --gpu 0,1,2,3 --exp_name output/test_setting1 --pretrained_model_path ../pretrained_models/osx_l.pth.tar --testset EHF

(3) Test on AGORA-val

In the main folder, run

python test.py --gpu 0,1,2,3 --exp_name output/test_setting1 --pretrained_model_path ../pretrained_models/osx_l.pth.tar --testset AGORA

(4) Test on AGORA-test

In the main folder, run

python test.py --gpu 0,1,2,3 --exp_name output/test_setting2  --pretrained_model_path ../pretrained_models/osx_l_agora.pth.tar --testset AGORA --agora_benchmark --test_batch_size 64

The reconstruction result will be saved at output/test_setting2/result/.

You can zip the predictions folder into predictions.zip and submit it to the AGORA benchmark to obtain the evaluation metrics.

7. Results

(1) AGORA test set

[Image: results on the AGORA test set]

(2) AGORA-val, EHF, 3DPW

[Images: results on AGORA-val, EHF, and 3DPW]

Troubleshooting

  • RuntimeError: Subtraction, the '-' operator, with a bool tensor is not supported. If you are trying to invert a mask, use the '~' or 'logical_not()' operator instead.: See the fix here.

  • TypeError: startswith first arg must be bytes or a tuple of bytes, not str.: See the fix here. This solution seems to work only on the RTX 3090. If it works on a V100 or A100 in your case, please let me know in the issues :)

Acknowledgement

This repo is mainly based on Hand4Whole. We thank Gyeongsik Moon for the well-organized code and his patient answers in the issues!

Reference

@article{lin2023osx,
  author    = {Lin, Jing and Zeng, Ailing and Wang, Haoqian and Zhang, Lei and Li, Yu},
  title     = {One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer},
  journal   = {CVPR},
  year      = {2023},
}
