caizhongang / smpler-x
Official Code for "SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation"
Home Page: https://caizhongang.github.io/projects/SMPLer-X/
License: Other
During the Docker support stage, when I run
docker run --gpus all -v <vid_input_folder>:/smplerx_inference/vid_input \
-v <vid_output_folder>:/smplerx_inference/vid_output \
wcwcw/smplerx_inference:v0.2 --vid <video_name>.mp4
It shows "No adapter used", but my directory structure seems to be fine. Is there anything wrong?
Hello, I just found that the hand_token variable is not used during inference. Is this intentional? I ask because my hand results are not good.
When running inference with adapter_name="vit_adapter" specified, I keep running into this error.
Thanks for sharing this amazing work! I am wondering whether the corresponding paper has been released?
We were not able to get the project to run at all with the provided instructions (fixing the requirements might help).
The only thing that worked for us was to follow the Dockerfile inside the provided image, which boils down to the steps below (if in doubt, look at the Dockerfile). NOTE: we only tested this for inference:
conda create -n smplerx python=3.9.12 -y
conda activate smplerx
sudo apt-get update && sudo apt-get install -y ffmpeg
sudo apt-get install -y python3-opengl libosmesa6
pip install mmcv-full==1.7.1 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.12.0/index.html
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.3 -c pytorch -y
cd SMPLer-X/main/transformer_utils/
pip install -v -e .
pip install torchgeometry --find-links .
pip install -r {PATH_TO_FILE}/requirements_fixed.txt
conda update ffmpeg
Extract config.py and conversions.py from the Docker image (because of version conflicts) and place them wherever you installed the respective packages mmcv and torchgeometry, replacing the existing files. The following are the source paths from within the Docker container (assuming it runs on the same machine):
{COPY} smplerx/main/transformer_utils/mmcv/utils/config.py {WHEREVER_YOUR_CONDA_PACKAGES_ARE}/site-packages/mmcv/utils/config.py
{COPY} smplerx/main/transformer_utils/torchgeometry/core/conversions.py {WHEREVER_YOUR_CONDA_PACKAGES_ARE}/site-packages/torchgeometry/core/conversions.py
If you are not familiar with Docker, this is how you get into their container (you may leave out the nvidia arg):
sudo docker run -it --rm --runtime=nvidia --entrypoint="" {CONTAINER_ID} bash
Currently, the reconstruction is done on the image patch of the detected human bounding box. I tried directly rendering the reconstructed poses of a video with fixed camera parameters, but the result kept jittering. Is there an easy way to convert the predicted local 3D pose (with respect to the image patch) into a global 3D pose based on the bounding-box parameters?
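A common way to lift patch-local predictions back to the full image is to undo the crop-and-resize transform using the bounding-box parameters. Here is a minimal NumPy sketch for the 2D part of that mapping; the names `patch_to_image`, `joints_patch`, and the patch size are illustrative assumptions, not taken from the SMPLer-X code:

```python
import numpy as np

def patch_to_image(joints_patch, bbox, patch_size=(256, 256)):
    """Map 2D joints from crop space back to full-image pixels.

    joints_patch: (J, 2) xy coordinates inside the resized crop
    bbox:         (x_min, y_min, width, height) of the crop in the image
    patch_size:   (width, height) the crop was resized to
    """
    x0, y0, w, h = bbox
    pw, ph = patch_size
    joints_img = joints_patch.astype(np.float64).copy()
    joints_img[:, 0] = joints_img[:, 0] / pw * w + x0  # undo x resize, shift by bbox origin
    joints_img[:, 1] = joints_img[:, 1] / ph * h + y0  # undo y resize, shift by bbox origin
    return joints_img

# a joint at the patch centre lands at the bbox centre in the image
center = patch_to_image(np.array([[128.0, 128.0]]), bbox=(100, 50, 80, 160))
```

Since the per-frame bbox itself jitters, smoothing the bbox parameters over time (or fixing one bbox for the whole clip) before applying this inverse transform often reduces the jitter noticeably.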
Hi! I found that some frames of the results are flipped. I checked and realized it was due to incorrect modifications I had made to the PyTorch conversions.py script. I used this notebook as a reference to make sure the math is right: https://github.com/camenduru/SMPLer-X-colab/blob/main/SMPLer_X_colab.ipynb
This is really exciting work. Can I directly apply the human keypoint data from the model's inference results to a self-built human body model? I visualized the keypoint data from the inference results, but the output did not resemble a skeleton. @caizhongang
Thanks for your nice work! I have some data in a format different from the datasets in your project. Can I run this model on my own data? How should I go about it? Any advice would be much appreciated!
Thanks for the project ❤️ I made a colab. 🥳 I hope you like it. https://github.com/camenduru/SMPLer-X-colab
Hello, I saw a related video on YouTube, and I want to ask how to map the model's inference results onto a character model in Unity? This should be very interesting. Is there any expert who can offer some help? Thank you.
There are some errors when using mmdet. My mmcv version is 1.7.1, which only supports mmdet 2.27.2. However, even simple code like
model = init_detector(config_file, checkpoint_file, device='cuda:0')
does not work. So which version of mmdet should I use?
I would like to ask how to convert the results into an animation. I am rather new to this; I just need a rough path, nothing too detailed. A general idea of how to proceed would be enough. Thank you very much!
Dear Author:
Thanks a lot for your work. When I downloaded the files following your README to construct human_models_files, I could not find the file "SMPLX_to_J14.pkl" in the smplx download. Could you please help me with that?
The link in the README is https://smpl-x.is.tue.mpg.de/download.php
Hi, SMPLer-X already provides 3D info, but I still want to obtain 2D information for my work (I would rather not add another 2D joint detector). Here is my modification (adapted from InterWild).
SMPLer_X.py
# map hand joints from hand-heatmap space back into the body-crop space
def project_to_body_space(part_name, bbox):
    hand_bbox_w = bbox[:, None, 2] - bbox[:, None, 0]
    hand_bbox_h = bbox[:, None, 3] - bbox[:, None, 1]
    # rescale from hand-heatmap resolution to the hand-bbox size
    joint_img[:, smpl_x.pos_joint_part[part_name], 0] *= (hand_bbox_w / cfg.output_hand_hm_shape[2])
    joint_img[:, smpl_x.pos_joint_part[part_name], 1] *= (hand_bbox_h / cfg.output_hand_hm_shape[1])
    # shift to the hand-bbox origin
    joint_img[:, smpl_x.pos_joint_part[part_name], 0] += bbox[:, None, 0]
    joint_img[:, smpl_x.pos_joint_part[part_name], 1] += bbox[:, None, 1]

for part_name, bbox in (('lhand', lhand_bbox), ('rhand', rhand_bbox)):
    project_to_body_space(part_name, bbox)

# body joints: heatmap space -> body-crop space
joint_img[:, smpl_x.pos_joint_part['body'], 0] *= (cfg.input_body_shape[1] / cfg.output_hm_shape[2])
joint_img[:, smpl_x.pos_joint_part['body'], 1] *= (cfg.input_body_shape[0] / cfg.output_hm_shape[1])
# all joints: body-crop space -> input-image space
joint_img[:, :, 0] *= (cfg.input_img_shape[1] / cfg.input_body_shape[1])
joint_img[:, :, 1] *= (cfg.input_img_shape[0] / cfg.input_body_shape[0])
inference.py
# joints: crop space -> original image space via the inverse bbox transform
joint_img = out['joint_img'].cpu().numpy()[0]
joint_img_xy1 = np.concatenate((joint_img[:, :2], np.ones_like(joint_img[:, :1])), 1)
joint_img = np.round(np.dot(bb2img_trans, joint_img_xy1.transpose(1, 0)).transpose(1, 0)).astype(int)
# left-hand bbox: crop space -> original image space
lhand_bbox = out['lhand_bbox'].cpu().numpy().reshape(2, 2)
hand_bbox_xy1 = np.concatenate((lhand_bbox, np.ones_like(lhand_bbox[:, :1])), 1)
lhand_bbox = np.round(np.dot(bb2img_trans, hand_bbox_xy1.transpose(1, 0)).transpose(1, 0)).astype(int)
lhand_bbox = [lhand_bbox[0, 0], lhand_bbox[0, 1], lhand_bbox[1, 0], lhand_bbox[1, 1]]
# right-hand bbox: same transform
rhand_bbox = out['rhand_bbox'].cpu().numpy().reshape(2, 2)
hand_bbox_xy1 = np.concatenate((rhand_bbox, np.ones_like(rhand_bbox[:, :1])), 1)
rhand_bbox = np.round(np.dot(bb2img_trans, hand_bbox_xy1.transpose(1, 0)).transpose(1, 0)).astype(int)
rhand_bbox = [rhand_bbox[0, 0], rhand_bbox[0, 1], rhand_bbox[1, 0], rhand_bbox[1, 1]]
When I use the default params, the results seem weird; the joints are not accurate.
After modifying as below:
# lhand_bbox = restore_bbox(lhand_bbox_center, lhand_bbox_size, cfg.input_hand_shape[1] / cfg.input_hand_shape[0], 2).detach()
# rhand_bbox = restore_bbox(rhand_bbox_center, rhand_bbox_size, cfg.input_hand_shape[1] / cfg.input_hand_shape[0], 2).detach()
lhand_bbox = restore_bbox(lhand_bbox_center, lhand_bbox_size, cfg.input_hand_shape[1] / cfg.input_hand_shape[0], 1.5).detach()
rhand_bbox = restore_bbox(rhand_bbox_center, rhand_bbox_size, cfg.input_hand_shape[1] / cfg.input_hand_shape[0], 1.5).detach()
# bbox = process_bbox(mmdet_box_xywh, original_img_width, original_img_height)
bbox = process_bbox(mmdet_box_xywh, original_img_width, original_img_height, ratio=1.1)
Things get better, but the joints are still not accurate.
How can this be explained?
For each image, an npz file is written to the output during inference. How can I recover the 3D model image from that npz file?
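The npz holds the per-frame parameters; to get a mesh back, the usual route is to load it and feed the parameters to an SMPL-X layer (e.g. the smplx package), then render the resulting vertices. The key names below are assumptions for illustration only; inspect `.files` to see what your output actually contains. A minimal NumPy sketch that writes and re-reads a dummy file:

```python
import os
import tempfile

import numpy as np

# write a dummy file mimicking (assumed, not verified) SMPLer-X output keys
path = os.path.join(tempfile.gettempdir(), 'demo_smplx.npz')
np.savez(path,
         betas=np.zeros(10),          # shape coefficients
         body_pose=np.zeros(21 * 3),  # axis-angle body pose
         global_orient=np.zeros(3),   # root rotation
         transl=np.zeros(3))          # root translation

data = np.load(path)
print(sorted(data.files))  # inspect which parameters are actually stored
betas = data['betas']
# these arrays can then be passed to a smplx.SMPLX model to recover the
# mesh vertices, which any mesh renderer can turn back into an image
```

Checking `data.files` first is worthwhile because different export scripts name the pose and shape arrays differently.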
Hi!
Do you have any comparison of inference speed (fps) between your code and other SOTAs like OSX?
Thanks!
Great work! When will the training data be released in HumanData format?
Does anyone know how to solve this problem? Thank you very much!
I noticed that the pretrained model is trained on many datasets and has strong generalization capabilities. I want to apply it to real indoor scenes, but I don't have my own dataset. Do you think I still need to finetune on the 5 public datasets you used?
Thanks!
SMPLX_to_J14.pkl not found. Where is the file?
I noticed that there is only one transform, ToTensor(), in the DataLoader.
Why don't you use image normalization (mean, std) before the first ViT layers?
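For reference, the normalization in question is just a per-channel shift and scale applied after ToTensor(); whether it helps depends on which statistics the ViT backbone was pretrained with. A NumPy sketch of the operation, using the usual ImageNet values (an assumption, not taken from this repo's config):

```python
import numpy as np

# per-channel ImageNet statistics (assumed; check the backbone's pretraining)
MEAN = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
STD = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)

def normalize(img_chw):
    """img_chw: float array in [0, 1], shape (3, H, W), as ToTensor() produces."""
    return (img_chw - MEAN) / STD

img = np.full((3, 4, 4), 0.5)  # dummy mid-grey image
out = normalize(img)
```

In torchvision this is the equivalent of appending transforms.Normalize(mean, std) after ToTensor(); if the backbone was pretrained without it, adding it at finetune or inference time would actually hurt.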
Hi, I ran the project on other videos and it worked generally well. But I observed some differences between the pose estimation output and the actual human pose, especially in the foot area. The attached image shows a clear divergence, where the estimated pose points are not aligned with the physical position of the foot:
I tried models s through h, and the same problem occurred with all of them. Could you give your opinion on this matter? Is this a known issue, or could you recommend any method to improve the performance?
Thanks in advance.
Hi,
Do you have any reason for choosing mmcv_full from SenseTime?
wget http://download.openmmlab.sensetime.com/mmcv/dist/cu113/torch1.12.0/mmcv_full-1.7.1-cp38-cp38-manylinux1_x86_64.whl
I couldn't connect to that link, so I used OpenMMLab's instead, but I'm curious about this.
Thanks!
Thanks for the nice work.
I followed the installation instructions but ran into an error when running inference. The error happens in init_detector, which calls into mmdet, mmcv, and importlib.
Do you have any idea why this happens? Is it a package version issue?
Traceback (most recent call last):
File "inference.py", line 188, in <module>
main()
File "inference.py", line 71, in main
model = init_detector(config_file, checkpoint_file, device='cuda:0') # or device='cuda:0'
File "/home/anaconda3/envs/smplerx/lib/python3.8/site-packages/mmdet/apis/inference.py", line 33, in init_detector
config = mmcv.Config.fromfile(config)
File "/home/anaconda3/envs/smplerx/lib/python3.8/site-packages/mmcv/utils/config.py", line 340, in fromfile
cfg_dict, cfg_text = Config._file2dict(filename,
File "/home/anaconda3/envs/smplerx/lib/python3.8/site-packages/mmcv/utils/config.py", line 208, in _file2dict
mod = import_module(temp_module_name)
File "/home/anaconda3/envs/smplerx/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/tmp/tmpngpqtodm/tmpti3mss0b.py", line 1, in <module>
NameError: name 'false' is not defined
Exception ignored in: <function _TemporaryFileCloser.__del__ at 0x7fcdc5affa60>
Traceback (most recent call last):
File "/home/anaconda3/envs/smplerx/lib/python3.8/tempfile.py", line 440, in __del__
self.close()
File "/home/anaconda3/envs/smplerx/lib/python3.8/tempfile.py", line 436, in close
unlink(self.name)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpngpqtodm/tmpti3mss0b.py'
I only want to use inference to get the npz from a video, so I commented out lines 158 to 184 of inference.py. Then this error came up. Does anyone have any clues?
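For what it's worth, the NameError in the traceback suggests the config file that mmcv executes contains a JSON-style lowercase `false` instead of Python's `False`: mmcv 1.x runs .py config files as Python modules, so a bare `false` is an undefined name. A sketch of the symptom and a text-level fix (the sample config line and the regex are illustrative; check your actual config file):

```python
import re

# mmcv executes config files as Python code, so a bare lowercase `false` fails:
try:
    eval('false')
except NameError as exc:
    print(exc)  # name 'false' is not defined

# fixing the config text is enough: replace standalone false/true tokens
cfg_text = 'use_adapter = false\nwith_cp = true\n'
fixed = re.sub(r'\bfalse\b', 'False', cfg_text)
fixed = re.sub(r'\btrue\b', 'True', fixed)
```

The word-boundary `\b` anchors keep the substitution from touching identifiers that merely contain "false" or "true" as a substring.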
Hi, thanks for your great work again.
But for me it seems that the hand details are not captured. Why is that? I know that SMPL officially provides MANO, which can capture hand details. Does your pre-trained model not include this module? I see that all of your YouTube videos have hand details.
The model I use is "SMPLer-X-H32".
Looking forward to your early reply.
When I run
sh slurm_inference.sh test_video mp4 24 smpler_x_s32
I get this error:
Could not load library libcudnn_cnn_infer.so.8. Error: libcuda.so: wrong ELF class: ELFCLASS32
How can I solve it?
Firstly, I would like to express my admiration for your impressive work. It has been a great resource for me.
I am currently trying to understand how to use PyTorch3D for 2D overlap visualization.
I have attempted to write my own code to achieve this, but the results have been unsatisfactory, with strange overlapping occurring. I suspect that I might be missing some crucial steps or misusing some parameters.
Could you possibly provide a detailed description or guide on how to use PyTorch3D for 2D overlap visualization, particularly with respect to the use of predicted camera parameters? Any examples or references would be greatly appreciated.
Thank you in advance for your time and assistance. I look forward to your response.
Best regards
docker run --gpus all -v <vid_input_folder>:/smplerx_inference/vid_input
-v <vid_output_folder>:/smplerx_inference/vid_output
wcwcw/smplerx_inference:v0.2 --vid <video_name>.mp4
I wonder what <vid_output_folder> and <vid_input_folder> mean. Are they the folders /SMPLer-X-main/demo/results and /SMPLer-X-main/demo/videos?
Thank you!
Hello, your work is excellent. Could you describe the detailed file tree under the data folder? Thank you very much!
hi,
This result is very impressive. I would like to know: other than the difference in training data, is there any difference between this work and OSX?
Hi! I'm having some trouble trying to run inference on a video. I'm using the inference.py script, but I get the following error in the console:
/home/seba/anaconda3/envs/smplerx/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
Traceback (most recent call last):
File "main/inference.py", line 18, in <module>
from utils.inference_utils import process_mmdet_results, non_max_suppression
ModuleNotFoundError: No module named 'utils.inference_utils'
I followed the instructions to set up the environment as described in the README file.
The command I used is as follows:
python main/inference.py --num_gpus 1 --pretrained_model smpler_x_s32 --agora_benchmark agora_model --image_path ./demo/images --start 1 --end 453 --output_folder ./demo/results/salida.mp4 --show_verts --show_bbox
Thanks for your help!
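A ModuleNotFoundError for `utils.inference_utils` usually means the `main/` directory is not on Python's module search path when inference.py resolves its imports; running the script from inside `main/`, or putting that directory on sys.path first, typically resolves it. A minimal sketch (the SMPLer-X path is a placeholder for your own checkout):

```python
import os
import sys

# prepend the repo's main/ directory so `utils.inference_utils` is importable
repo_main = os.path.abspath(os.path.join('SMPLer-X', 'main'))  # placeholder path
if repo_main not in sys.path:
    sys.path.insert(0, repo_main)
```

Equivalently, `cd SMPLer-X/main && python inference.py ...` or exporting PYTHONPATH to include `main/` achieves the same thing without code changes.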
When I run
sh slurm_inference.sh test_video mp4 24 smpler_x_h32
I get this error:
RuntimeError: Caught RuntimeError in replica 0 on device 0.
I used
num_gpus = 1
and in main/common/base.py I used
ckpt = torch.load(cfg.pretrained_model_path,map_location=torch.device('cpu'))
Hi, thanks for sharing your great work!
Is there any pretrained model I can test?
Hi, thanks for the great work; I am new to SMPL models. May I ask why the output in your work looks like points rather than the blended mesh output shown in the README? How can I get that output? Thanks.
Thank you for your work! Do you have any plans to release converters to the human_data format? (Most interested in the BEDLAM converter)
How do I run Slurm on Ubuntu?