This project forked from shubham-goel/4d-humans

4DHumans: Reconstructing and Tracking Humans with Transformers

Home Page: https://shubham-goel.github.io/4dhumans/

License: MIT License

Shell 0.39% Python 99.61%

4DHumans: Reconstructing and Tracking Humans with Transformers

Code repository for the paper: Humans in 4D: Reconstructing and Tracking Humans with Transformers Shubham Goel, Georgios Pavlakos, Jathushan Rajasegaran, Angjoo Kanazawa*, Jitendra Malik*

Installation and Setup

First, clone the repo. Then, we recommend creating a clean conda environment, installing all dependencies, and finally activating the environment, as follows:

git clone https://github.com/shubham-goel/4D-Humans.git
cd 4D-Humans
conda env create -f environment.yml
conda activate 4D-humans

If conda is too slow, you can use pip:

conda create --name 4D-humans python=3.10
conda activate 4D-humans
pip install torch
pip install -e ".[all]"

All checkpoints and data will automatically be downloaded to $HOME/.cache/4DHumans the first time you run the demo code.
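To check whether the download has already happened, and how much disk space it takes, a small script along these lines may help. It is only a sketch: the cache path is the one stated above, while the helper names `cache_dir` and `dir_size_bytes` are made up for illustration and are not part of this repository.

```python
from pathlib import Path


def cache_dir() -> Path:
    """Return the cache location named in this README: $HOME/.cache/4DHumans."""
    return Path.home() / ".cache" / "4DHumans"


def dir_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root (0 if root is missing)."""
    if not root.is_dir():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    root = cache_dir()
    size_mb = dir_size_bytes(root) / 1e6
    state = "present" if root.is_dir() else "not created yet"
    print(f"{root}: {state}, {size_mb:.1f} MB")
```

If the directory is reported as not created yet, the checkpoints have not been fetched, and the first demo run will need network access.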

Run demo on images

The following command runs ViTDet and HMR2.0 on all images in the specified --img_folder and saves renderings of the reconstructions in --out_folder. --batch_size batches the images together for faster processing. The --side_view flag additionally renders a side view of the reconstructed mesh, --full_frame renders all people together in the front view, and --save_mesh saves the meshes as .obj files.

python demo.py \
    --img_folder example_data/images \
    --out_folder demo_out \
    --batch_size=48 --side_view --save_mesh --full_frame
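When the demo has to be launched from another Python script (for example, once per image folder), the same command line can be assembled programmatically. The sketch below uses only the flags described above. The function name `build_demo_command` is hypothetical, and the script is assumed to run from the repository root so that `demo.py` resolves.

```python
import subprocess
import sys


def build_demo_command(img_folder, out_folder, batch_size=48,
                       side_view=True, save_mesh=True, full_frame=True):
    """Assemble the demo.py argument list from the flags described in this README."""
    cmd = [
        sys.executable, "demo.py",
        "--img_folder", str(img_folder),
        "--out_folder", str(out_folder),
        f"--batch_size={batch_size}",
    ]
    # Each optional rendering flag is a plain on/off switch.
    if side_view:
        cmd.append("--side_view")
    if save_mesh:
        cmd.append("--save_mesh")
    if full_frame:
        cmd.append("--full_frame")
    return cmd


if __name__ == "__main__":
    cmd = build_demo_command("example_data/images", "demo_out")
    print(" ".join(cmd))
    # Uncomment to actually launch the demo from the repository root:
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with folder names that contain spaces.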

Run tracking demo on videos

Our tracker builds on PHALP; please install it first:

pip install git+https://github.com/brjathu/PHALP.git

Now, run track.py to reconstruct and track humans in any video. The input video source may be a video file, a folder of frames, or a YouTube link:

# Run on video file
python track.py video.source="example_data/videos/gymnasts.mp4"

# Run on extracted frames
python track.py video.source="/path/to/frames_folder/"

# Run on a YouTube link (depends on pytube working properly)
python track.py video.source=\'"https://www.youtube.com/watch?v=xEH_5T9jMVU"\'

The output directory (./outputs by default) will contain a video rendering of the tracklets and a .pkl file containing the tracklets with 3D pose and shape. Please see the PHALP repository for details.
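For a first look at the saved tracklets, a script like the following can locate the .pkl files and report their top-level structure. This is a sketch under assumptions: it makes no claim about the schema PHALP writes, and it assumes the file can be read with the standard `pickle` module. If the file was written with joblib, `joblib.load(path)` would be the call to swap in. The helper names are illustrative only.

```python
import pickle
from pathlib import Path


def find_pkl_files(output_dir="outputs"):
    """Return all .pkl files under output_dir, sorted by path."""
    return sorted(Path(output_dir).rglob("*.pkl"))


def load_pickle(path):
    """Load one file with the standard library (only unpickle files you trust)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def describe(obj, max_keys=5):
    """Summarize a loaded object in one line without assuming its schema."""
    if isinstance(obj, dict):
        keys = list(obj)[:max_keys]
        return f"dict with {len(obj)} entries, first keys: {keys}"
    if isinstance(obj, (list, tuple)):
        return f"{type(obj).__name__} with {len(obj)} items"
    return type(obj).__name__


if __name__ == "__main__":
    for path in find_pkl_files("outputs"):
        print(path, "->", describe(load_pickle(path)))
```

The PHALP repository remains the reference for what each field in the file means.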

Training

Download the training data to ./hmr2_training_data/, then start training using the following commands:

bash fetch_training_data.sh
python train/train.py exp_name=hmr2 data=mix_all experiment=hmr_vit_transformer trainer=gpu launcher=local

Checkpoints and logs will be saved to ./logs/. We trained on 8 A100 GPUs for 7 days using PyTorch 1.13.1 and PyTorch-Lightning 1.8.1 with CUDA 11.6 on a Linux system. You may adjust the batch size and the number of GPUs to suit your hardware.
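When moving to fewer GPUs, one common approach (not something this README prescribes) is to keep the global batch size constant by changing the per-GPU batch size or by accumulating gradients over several steps. The arithmetic is sketched below. The global batch size of 384 is a hypothetical placeholder, since the value used for the released model is not stated here, and `per_gpu_batch_size` is an illustrative helper rather than part of the training code.

```python
def per_gpu_batch_size(global_batch_size: int, num_gpus: int,
                       accumulate_steps: int = 1) -> int:
    """Per-GPU batch size such that
    per_gpu * num_gpus * accumulate_steps == global_batch_size."""
    if num_gpus < 1 or accumulate_steps < 1:
        raise ValueError("num_gpus and accumulate_steps must be at least 1")
    denom = num_gpus * accumulate_steps
    if global_batch_size % denom != 0:
        raise ValueError(
            f"global batch {global_batch_size} is not divisible by {denom}")
    return global_batch_size // denom


if __name__ == "__main__":
    hypothetical_global_batch = 384  # placeholder value, not from the paper
    for gpus, accum in [(8, 1), (4, 1), (2, 4)]:
        size = per_gpu_batch_size(hypothetical_global_batch, gpus, accum)
        print(f"{gpus} GPUs, accumulate {accum} step(s): {size} per GPU")
```

If the global batch size itself changes, the learning rate may need retuning as well.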

Evaluation

Coming soon.

Acknowledgements

Parts of the code are taken or adapted from the following repos:

Additionally, we thank StabilityAI for a generous compute grant that enabled this work.

Citing

If you find this code useful for your research, please consider citing the following paper:

@article{goel2023humans,
    title={Humans in 4{D}: Reconstructing and Tracking Humans with Transformers},
    author={Goel, Shubham and Pavlakos, Georgios and Rajasegaran, Jathushan and Kanazawa, Angjoo and Malik, Jitendra},
    journal={arXiv preprint arXiv:2305.20091},
    year={2023}
}
