
caizhongang / smpler-x


Official Code for "SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation"

Home Page: https://caizhongang.github.io/projects/SMPLer-X/

License: Other

Python 92.01% C++ 2.17% C 1.74% Cuda 3.81% Shell 0.27%

smpler-x's People

Contributors

caizhongang, wei-chen-hub, wqyin


smpler-x's Issues

How to directly output the BVH file?

Hi, thanks for your great work.
As described in the README, we can get the OBJ file and the rendered mesh JPG. But my goal is to obtain a BVH file that contains the pose keypoints.
I assume the project already has a step that extracts the pose keypoints, as in the attached screenshot.
What should we do if we want a BVH file?

about hand token

Hello, I just found that the hand_token variable is not used during inference. Is this intentional? I ask because the hand results I get are not good.

About Paper

Thanks for sharing this amazing work! I am wondering whether the corresponding paper has been released.

Described installation protocol does not work?!

We were not able to get the project to run given the provided instructions at all (maybe fixing requirements would help).

The only thing that worked for us was to work based on the Dockerfile within the provided image, which boils down to (if in doubt, look at the Dockerfile) - NOTE: only tested this for inference:

conda create -n smplerx python=3.9.12 -y
conda activate smplerx
sudo apt-get update && sudo apt-get install -y ffmpeg
sudo apt-get install -y python3-opengl libosmesa6
pip install mmcv-full==1.7.1 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.12.0/index.html
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.3 -c pytorch -y
cd SMPLer-X/main/transformer_utils/
pip install -v -e .
pip install torchgeometry --find-links .
pip install -r {PATH_TO_FILE}/requirements_fixed.txt
conda update ffmpeg

Extract config.py and conversions.py from the Docker image (because of version conflicts) and place them wherever the respective packages, mmcv and torchgeometry, are installed, replacing the existing files. The following commands are run from within the Docker container (assuming it runs on the same machine); {COPY} is a placeholder for your copy command.

{COPY} smplerx/main/transformer_utils/mmcv/utils/config.py {WHEREVER_YOUR_CONDA_PACKAGES_ARE}/site-packages/mmcv/utils/config.py
{COPY} smplerx/main/transformer_utils/torchgeometry/core/conversions.py {WHEREVER_YOUR_CONDA_PACKAGES_ARE}/site-packages/torchgeometry/core/conversions.py
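To fill in the {WHEREVER_YOUR_CONDA_PACKAGES_ARE} placeholder above, the active environment can report where its packages live. The following is a minimal sketch using only the Python standard library; the helper names `site_packages_dir` and `package_file` are hypothetical, and it only locates the target files without copying anything.

```python
import importlib.util
import sysconfig
from pathlib import Path
from typing import Optional


def site_packages_dir() -> Path:
    """Return the pure-Python site-packages directory of the active interpreter."""
    return Path(sysconfig.get_paths()["purelib"])


def package_file(package: str, relative: str) -> Optional[Path]:
    """Return <package dir>/<relative> if the package can be found, else None."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0]) / relative


if __name__ == "__main__":
    print("site-packages:", site_packages_dir())
    # The two files named in the post; run this inside the smplerx environment.
    for pkg, rel in (("mmcv", "utils/config.py"),
                     ("torchgeometry", "core/conversions.py")):
        print(pkg, "->", package_file(pkg, rel))
```

Run it with the smplerx environment activated so that the paths refer to that environment and not to the system Python.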

 

If you are not familiar with Docker, this is how you get into their container (you may leave out the nvidia arg):
sudo docker run -it --rm --runtime=nvidia --entrypoint="" {CONTAINER_ID} bash

requirements_fixed.txt

Is it possible to get 3D pose under a global coordinate?

Currently, the reconstruction is done on the image patch of the detected human bounding box. I tried directly rendering the reconstructed poses of a video with fixed camera parameters, but the result kept jittering. Is there an easy way to transfer from the predicted local 3D pose (with respect to the image patch) to a global 3D pose based on the bounding box parameters?
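One possible way to relate the crop-level prediction to the full frame is to re-express the predicted root translation in terms of a single full-image camera. The sketch below is an assumption-laden illustration, not the repository's method: it assumes a pinhole camera, a crop that is the bounding box resized with one uniform scale, and hypothetical names (`focal_crop`, `princpt_crop`, `focal_full`, `princpt_full`). The conversion is exact for the root point only; other points are approximated because the shift depends on depth.

```python
import numpy as np


def project(point_cam, focal, princpt):
    """Pinhole projection of a 3D camera-space point to pixel coordinates."""
    x, y, z = point_cam
    return np.array([focal * x / z + princpt[0], focal * y / z + princpt[1]])


def crop_to_full_translation(t_crop, bbox_xywh, crop_size, focal_crop,
                             princpt_crop, focal_full, princpt_full):
    """Re-express a root translation predicted for a crop in full-image camera terms.

    Assumes the crop is the bbox resized with one uniform scale (same aspect ratio).
    crop_size is (width, height) in pixels.
    """
    x0, y0, w, _ = bbox_xywh
    s = w / crop_size[0]              # full-image pixels per crop pixel
    tx, ty, tz = t_crop
    f_eff = s * focal_crop            # crop focal length measured in full-image pixels
    # Offset between the crop principal point (in full-image pixels) and the full one.
    dx = x0 + s * princpt_crop[0] - princpt_full[0]
    dy = y0 + s * princpt_crop[1] - princpt_full[1]
    tz_full = tz * focal_full / f_eff
    return np.array([tx + tz * dx / f_eff, ty + tz * dy / f_eff, tz_full])
```

With this translation, the root projects to the same full-image pixel as the crop prediction mapped back through the bounding box. Jitter caused by noisy per-frame bounding boxes is a separate matter and would still need temporal smoothing.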

bone point data of the human body

This is really exciting work. Can I directly apply the joint (bone point) data from the model's inference results to a human body model I built myself? I visualized the joint data from the inference results, but the output did not look like a skeleton. @caizhongang

Custom Data

Thanks for your nice work! I have some data in a format different from the datasets used in your project. Can I run this model on my own data? How could I do this? Any advice would be appreciated!

Unity model binding

Hello, I saw a related video on YouTube, and I want to ask how to apply the model's inference results to a character model in Unity. This should be very interesting. Is there any expert who can offer some help? Thank you.

Which version of mmdet?

There are some errors when using mmdet. The installed mmcv version is 1.7.1, which only supports mmdet 2.27.2. However, even simple code like this does not work:
model = init_detector(config_file, checkpoint_file, device='cuda:0')
Which version of mmdet should be used?
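When reporting a version problem like this, it helps to list exactly what is installed. The following is a small sketch using the standard-library `importlib.metadata` module (Python 3.8+); the list of distribution names is an assumption about which packages matter here, and the helper name `installed_versions` is hypothetical.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(
            ["torch", "torchvision", "mmcv-full", "mmdet"]).items():
        print(f"{name}: {version or 'not installed'}")
```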

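How to create animation for a virtual character

I would like to ask how to convert the results into an animation. I am a beginner and just want to know the general path. It does not need to be very detailed; a rough idea of how to proceed would be enough. Thank you very much.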

It seems that joint_image can't be utilized as 2D joints

Hi, SMPLer-X already provides 3D information, but I still want to obtain 2D joints for my work (I would rather not add a separate 2D joint detector). Here is my modification (based on InterWild).
SMPLer_X.py

def project_to_body_space(part_name, bbox):
    # Scale hand joints from hand-heatmap resolution to the hand bbox size,
    # then shift them by the bbox's top-left corner (modifies joint_img in place).
    hand_bbox_w = bbox[:, None, 2] - bbox[:, None, 0]
    hand_bbox_h = bbox[:, None, 3] - bbox[:, None, 1]
    joint_img[:, smpl_x.pos_joint_part[part_name], 0] *= (
        (hand_bbox_w / cfg.output_hand_hm_shape[2]))
    joint_img[:, smpl_x.pos_joint_part[part_name], 1] *= (
        (hand_bbox_h / cfg.output_hand_hm_shape[1]))
    joint_img[:, smpl_x.pos_joint_part[part_name], 0] += bbox[:, None, 0]
    joint_img[:, smpl_x.pos_joint_part[part_name], 1] += bbox[:, None, 1]

for part_name, bbox in (('lhand', lhand_bbox), ('rhand', rhand_bbox)):
    project_to_body_space(part_name, bbox)

# Scale body joints from body-heatmap resolution to the body input resolution.
joint_img[:,smpl_x.pos_joint_part['body'],0] *= (cfg.input_body_shape[1]/ cfg.output_hm_shape[2])
joint_img[:,smpl_x.pos_joint_part['body'],1] *= (cfg.input_body_shape[0]/ cfg.output_hm_shape[1])

# Scale all joints from the body input resolution to the input image resolution.
joint_img[:,:,0] *= (cfg.input_img_shape[1] / cfg.input_body_shape[1])
joint_img[:,:,1] *= (cfg.input_img_shape[0] / cfg.input_body_shape[0])

inference.py

joint_img = out['joint_img'].cpu().numpy()[0]
joint_img_xy1 = np.concatenate((joint_img[:,:2], np.ones_like(joint_img[:,:1])),1)
joint_img = np.round(np.dot(bb2img_trans, joint_img_xy1.transpose(1,0)).transpose(1,0)).astype(int)

lhand_bbox = out['lhand_bbox'].cpu().numpy().reshape(2,2)
hand_bbox_xyl = np.concatenate((lhand_bbox, np.ones_like(lhand_bbox[:,:1])),1)
lhand_bbox = np.round(np.dot(bb2img_trans, hand_bbox_xyl.transpose(1,0)).transpose(1,0)).astype(int)
lhand_bbox = [lhand_bbox[0,0], lhand_bbox[0,1], lhand_bbox[1,0], lhand_bbox[1,1]]  

rhand_bbox = out['rhand_bbox'].cpu().numpy().reshape(2,2)
hand_bbox_xyr = np.concatenate((rhand_bbox, np.ones_like(rhand_bbox[:,:1])),1)
rhand_bbox = np.round(np.dot(bb2img_trans, hand_bbox_xyr.transpose(1,0)).transpose(1,0)).astype(int)  
rhand_bbox = [rhand_bbox[0,0], rhand_bbox[0,1], rhand_bbox[1,0], rhand_bbox[1,1]]  

When I use the default parameters, the results look weird: the joints are not accurate.
[image: 420335]
After modifying the code as below:

# lhand_bbox = restore_bbox(lhand_bbox_center, lhand_bbox_size, cfg.input_hand_shape[1] / cfg.input_hand_shape[0], 2).detach()
# rhand_bbox = restore_bbox(rhand_bbox_center, rhand_bbox_size, cfg.input_hand_shape[1] / cfg.input_hand_shape[0], 2).detach()
lhand_bbox = restore_bbox(lhand_bbox_center, lhand_bbox_size, cfg.input_hand_shape[1] / cfg.input_hand_shape[0], 1.5).detach()
rhand_bbox = restore_bbox(rhand_bbox_center, rhand_bbox_size, cfg.input_hand_shape[1] / cfg.input_hand_shape[0], 1.5).detach()
# bbox = process_bbox(mmdet_box_xywh, original_img_width, original_img_height)
bbox = process_bbox(mmdet_box_xywh, original_img_width, original_img_height, ratio=1.1)

Things get better, but the joints are still not accurate.
[image: 42033_]

How can this be explained?
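The two coordinate changes used in the post (rescaling from heatmap resolution to input resolution, then applying the 2x3 `bb2img_trans` matrix through homogeneous coordinates) can be checked in isolation on known points. The sketch below is a standalone NumPy illustration with hypothetical helper names. It assumes heatmap shapes are ordered (depth, height, width) and input shapes (height, width), which matches the indices used in the post, and it is not taken from the repository.

```python
import numpy as np


def heatmap_to_input(joints_hm, hm_shape, input_shape):
    """Scale (x, y) from heatmap resolution (D, H, W) to input resolution (H, W)."""
    out = np.asarray(joints_hm, dtype=np.float64).copy()
    out[:, 0] *= input_shape[1] / hm_shape[2]   # x uses the width ratio
    out[:, 1] *= input_shape[0] / hm_shape[1]   # y uses the height ratio
    return out


def apply_affine(points_xy, trans_2x3):
    """Apply a 2x3 affine matrix to an (N, 2+) array via homogeneous coordinates."""
    pts = np.asarray(points_xy, dtype=np.float64)
    homo = np.concatenate([pts[:, :2], np.ones((len(pts), 1))], axis=1)
    return homo @ np.asarray(trans_2x3, dtype=np.float64).T
```

Feeding in a point whose position is known, such as a bounding-box corner, shows which of the two steps introduces an offset or a wrong scale.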

recover mesh

During inference, an npz file is written to the output folder for each image. How can I recover the 3D model image from the npz file?
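A reasonable first step is to look at what the npz file actually contains. The following is a minimal sketch with NumPy; the helper name `summarize_npz` is hypothetical, and the key names depend entirely on what the inference script wrote, so none are assumed here.

```python
import numpy as np


def summarize_npz(path):
    """Return {key: (shape, dtype)} for every array stored in an .npz file."""
    with np.load(path) as data:
        return {key: (data[key].shape, str(data[key].dtype)) for key in data.files}
```

Once the stored parameter names and shapes are known, they could be passed to an SMPL-X body model implementation (for example the `smplx` Python package, which needs the SMPL-X model files) to regenerate the mesh vertices for rendering. If the file stores object arrays, `np.load` raises an error unless `allow_pickle=True` is passed.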

about humandata

Great work! When will the training data be released in the humandata format?

ValueError: buffer size does not match array size

python inference.py --num_gpus 1 --exp_name output/demo_inference_单人半身 --pretrained_model smpler_x_h32 --agora_benchmark agora_model --img_path ../demo/images/单人半身 --start 1 --end 514 --output_folder ../demo/results/单人半身 --show_verts --show_bbox --save_mesh
/home/bruce/anaconda3/envs/smplerx/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  warnings.warn(
Traceback (most recent call last):
  File "inference.py", line 297, in <module>
    main()
  File "inference.py", line 78, in main
    from base import Demoer
  File "/media/bruce/3696C50196C4C31B/zxx/SMPLer-X/main/../common/base.py", line 11, in <module>
    from SMPLer_X import get_model
  File "/media/bruce/3696C50196C4C31B/zxx/SMPLer-X/main/../main/SMPLer_X.py", line 4, in <module>
    from nets.smpler_x import PositionNet, HandRotationNet, FaceRegressor, BoxNet, HandRoI, BodyRotationNet
  File "/media/bruce/3696C50196C4C31B/zxx/SMPLer-X/main/../common/nets/smpler_x.py", line 6, in <module>
    from utils.human_models import smpl_x
  File "/media/bruce/3696C50196C4C31B/zxx/SMPLer-X/main/../common/utils/human_models.py", line 175, in <module>
    smpl_x = SMPLX()
  File "/media/bruce/3696C50196C4C31B/zxx/SMPLer-X/main/../common/utils/human_models.py", line 20, in __init__
    self.j14_regressor = pickle.load(f, encoding='latin1')
ValueError: buffer size does not match array size

Does anyone know how to solve this problem? Thank you very much!

Image normalization and VIT

I noticed that the DataLoader applies only one transform: ToTensor().
Why is there no image normalization (mean, std) before the first ViT layers?
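For reference, the difference between the two preprocessing variants being asked about can be sketched in NumPy. This only illustrates what ToTensor-style scaling and a per-channel mean/std step do; the ImageNet statistics below are the commonly used values and an assumption here, and the sketch says nothing about whether SMPLer-X normalizes elsewhere (for example inside the model).

```python
import numpy as np

# Commonly used ImageNet channel statistics (RGB order); an assumption here.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_tensor_like(img_hwc_uint8):
    """Mimic ToTensor(): HWC uint8 in [0, 255] -> CHW float32 in [0, 1]."""
    return img_hwc_uint8.astype(np.float32).transpose(2, 0, 1) / 255.0


def normalize(img_chw, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Per-channel (x - mean) / std on a CHW float image."""
    return (img_chw - mean[:, None, None]) / std[:, None, None]
```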

Problem regarding estimation accuracy in Foot Region

Hi, I ran the project on other videos and it worked generally well. But I observed some differences between the pose estimation output and the actual human pose, especially in the foot area. The attached image shows a clear divergence, where the estimated pose points are not aligned with the physical position of the foot:

[images: 000001, 000200]

I tried models s to h and the same problem occurred in all models. Can you give your opinion on this matter? Is this a known issue or could you recommend any method to improve the performance?

Thanks in advance.

Any reason for choosing mmcv_full from sensetime?

Hi,

Do you have any reason for choosing mmcv_full from sensetime?

wget http://download.openmmlab.sensetime.com/mmcv/dist/cu113/torch1.12.0/mmcv_full-1.7.1-cp38-cp38-manylinux1_x86_64.whl

I couldn't connect to that link, so I chose to use openmmlab's but I'm curious about this.

Thanks!

Error when running "init_detector" in inference

Thanks for the nice work.

I followed the installation instructions but ran into an error when running inference. The error occurs in init_detector, which calls into mmdet, mmcv, and importlib.

Do you have any idea why this happens? Is it a package version issue?

Traceback (most recent call last):
  File "inference.py", line 188, in <module>
    main()
  File "inference.py", line 71, in main
    model = init_detector(config_file, checkpoint_file, device='cuda:0')  # or device='cuda:0'
  File "/home/anaconda3/envs/smplerx/lib/python3.8/site-packages/mmdet/apis/inference.py", line 33, in init_detector
    config = mmcv.Config.fromfile(config)
  File "/home/anaconda3/envs/smplerx/lib/python3.8/site-packages/mmcv/utils/config.py", line 340, in fromfile
    cfg_dict, cfg_text = Config._file2dict(filename,
  File "/home/anaconda3/envs/smplerx/lib/python3.8/site-packages/mmcv/utils/config.py", line 208, in _file2dict
    mod = import_module(temp_module_name)
  File "/home/anaconda3/envs/smplerx/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/tmp/tmpngpqtodm/tmpti3mss0b.py", line 1, in <module>
NameError: name 'false' is not defined
Exception ignored in: <function _TemporaryFileCloser.__del__ at 0x7fcdc5affa60>
Traceback (most recent call last):
  File "/home/anaconda3/envs/smplerx/lib/python3.8/tempfile.py", line 440, in __del__
    self.close()
  File "/home/anaconda3/envs/smplerx/lib/python3.8/tempfile.py", line 436, in close
    unlink(self.name)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpngpqtodm/tmpti3mss0b.py'
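The NameError is raised on line 1 of the temporary module that mmcv builds from the config file, which suggests that the file passed as config_file contains a bare `false`, as JSON-style content does, instead of Python's `False`. This is an inference from the traceback, not a confirmed diagnosis. The sketch below uses the standard-library `ast` module to scan a config's source for such names; the helper name `undefined_json_literals` is hypothetical.

```python
import ast


def undefined_json_literals(source: str):
    """Return sorted bare names such as false/true/null used in Python source."""
    tree = ast.parse(source)
    suspicious = {"false", "true", "null"}
    return sorted({node.id for node in ast.walk(tree)
                   if isinstance(node, ast.Name) and node.id in suspicious})
```

For example, `undefined_json_literals(open(config_file).read())` returning a non-empty list would mean the config is not valid Python as written, and re-fetching the config file would be the first thing to try.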

The hand details are not extracted

Hi, thanks for your great work again.
But in my results the hand details do not seem to be extracted. Why is that? I know that SMPL officially provides MANO, which can capture hand details. Does your pretrained model not include this module? All of your YouTube videos show hand details.
The model I use is "SMPLer-X-H32" (screenshot attached).
Looking forward to your reply.

Error: libcuda.so: wrong ELF class: ELFCLASS32

When I run
sh slurm_inference.sh test_video mp4 24 smpler_x_s32
the following error occurs:
Could not load library libcudnn_cnn_infer.so.8. Error: libcuda.so: wrong ELF class: ELFCLASS32
How can I solve it?
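The message indicates that the dynamic loader picked up a 32-bit libcuda.so where a 64-bit one is needed. To find out which copy on the system is the 32-bit one, the ELF class can be read from the fifth byte of the file header (1 means 32-bit, 2 means 64-bit). The following is a minimal sketch; the helper name `elf_class` is hypothetical, and which paths to check depends on the machine.

```python
def elf_class(path):
    """Return 32 or 64 for an ELF file, or None if the file is not ELF."""
    with open(path, "rb") as f:
        header = f.read(5)
    # ELF files start with the magic bytes 0x7F 'E' 'L' 'F'; byte 4 is EI_CLASS.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    return {1: 32, 2: 64}.get(header[4])
```

Calling it on every libcuda.so found on the library search path shows which one reports 32.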

Use Pytorch3D to render visualization

Firstly, I would like to express my admiration for your impressive work. It has been a great resource for me.

I am currently trying to understand how to use PyTorch3D for 2D overlap visualization.

I have attempted to write my own code to achieve this, but the results have been unsatisfactory, with strange overlapping occurring. I suspect that I might be missing some crucial steps or misusing some parameters.

Could you possibly provide a detailed description or guide on how to use PyTorch3D for 2D overlap visualization, particularly with respect to the use of predicted camera parameters? Any examples or references would be greatly appreciated.

Thank you in advance for your time and assistance. I look forward to your response.

Best regards

Some question about the docker support

docker run --gpus all -v <vid_input_folder>:/smplerx_inference/vid_input \
    -v <vid_output_folder>:/smplerx_inference/vid_output \
    wcwcw/smplerx_inference:v0.2 --vid <video_name>.mp4
I wonder what <vid_output_folder> and <vid_input_folder> mean. Are they the folders /SMPLer-X-main/demo/results and /SMPLer-X-main/demo/videos?

about training (video/photo) and different size of betas in datasets

  1. I saw on the main page that the SMPLer-X inference scripts expect video data as input. Is it possible to modify the model to support single images?
  2. Do you train your model on images treated as video (in strict sequence)?
  3. The BEDLAM dataset has 11 betas and only one neutral gender (according to the base model). AGORA has only 10 betas. How do you combine different numbers of shape parameters during training and in the model's head? Does the SMPL-X regression layer have a dynamic size?

Thank you!

random output on inference with pretrained model

Hi,

I've tested inference with your pretrained model, but the results look like random values.
When I tried to debug this, the cropped input image looked fine.

[image: 000471]

I've tested all the pretrained models, but none of them worked.

Could you check them again?

ModuleNotFoundError: No module named 'utils.inference_utils'

Hi! I'm having some trouble running inference on a video. I'm using the inference.py script, but I get the following error in the console:

/home/seba/anaconda3/envs/smplerx/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  warnings.warn(
Traceback (most recent call last):
  File "main/inference.py", line 18, in <module>
    from utils.inference_utils import process_mmdet_results, non_max_suppression
ModuleNotFoundError: No module named 'utils.inference_utils'

I followed the instructions in the README to set up the environment.
The command I used is as follows:
python main/inference.py --num_gpus 1 --pretrained_model smpler_x_s32 --agora_benchmark agora_model --image_path ./demo/images --start 1 --end 453 --output_folder ./demo/results/salida.mp4 --show_verts --show_bbox

Thanks for your help!

RuntimeError: Caught RuntimeError in replica 0 on device 0.

When I run

sh slurm_inference.sh test_video mp4 24 smpler_x_h32

the following error occurs:

RuntimeError: Caught RuntimeError in replica 0 on device 0.

I set

num_gpus = 1

and in /main/common/base.py I used

ckpt = torch.load(cfg.pretrained_model_path, map_location=torch.device('cpu'))

Pretrained model?

Hi, thanks for sharing your great work!

Is there any pretrained model I can test?

Why output is different from repository

Hi, thanks for the great work. I am new to SMPL models. May I ask why the output I get looks like points rather than the blended mesh output shown in the README? How can I get that output? Thanks.

Data converting scripts

Thank you for your work! Do you have any plans to release converters to the human_data format? (Most interested in the BEDLAM converter)
