jeff-sjtu / hybrik

1.2K stars · 26 watchers · 145 forks · 95.18 MB

Official code of "HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation", CVPR 2021

License: MIT License

Python 99.71% Shell 0.29%
3d-pose-estimation smpl inverse-kinematics pose-estimation pytorch cvpr cvpr21

hybrik's People

Contributors

biansy000, jeff-sjtu, loyalblanc, wangzheallen

hybrik's Issues

SMPL model not found

Hello, I only found basicmodel_neutral_lbs_10_207_0_v1.1.0.pkl on the webpage you listed; there is no v1.0.0 neutral model there. Can you share this file?

Training is extremely slow

Are the default configs recommended for training? Using 8 3090 GPUs with default configs results in extremely slow training speed (8 seconds per batch).

Is it possible to have a demo code?

Hi,

Just a suggestion.

I am wondering if it would be possible to release a single demo script like
https://github.com/mks0601/3DMPPE_POSENET_RELEASE/blob/3f92ebaef214a0eb1574b7265e836456fbf3508a/demo/demo.py

Given the bbox, root_depth, intrinsics, pre-trained model, and the image, the script would output the keypoints and the rendered mesh. I am working on this right now, but it is a little tedious and I cannot be sure it is totally correct. It would be extremely helpful if you could release an official one!

Thanks!

How can I test?

In a past issue, I found your answer saying:
"Hi, if you want to test with your own data, you need to run object detection to generate human bounding boxes. Then use RootNet to predict the root joint in each bounding box. Combining the box and root joint, you can run our model to predict the final SMPL results."

So I used RootNet to output the bbox and root joint.
Still, there are other unknown inputs, such as trans_inv and depth_factor.
How can I obtain them?
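For anyone else stuck here, this is my current understanding, pieced together from the dataset code rather than confirmed by the authors: trans_inv is the inverse of the affine transform used to crop the detected bbox to the network input, and depth_factor comes from bbox_3d_shape[2] (2000 mm). A minimal sketch:

import cv2
import numpy as np

def crop_transforms(bbox, input_w=192, input_h=256):
    # bbox = (x_min, y_min, x_max, y_max) from the person detector
    x1, y1, x2, y2 = bbox
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    w, h = x2 - x1, y2 - y1
    # pad the box to the network's aspect ratio (simplified: no scale jitter)
    if w / h > input_w / input_h:
        h = w * input_h / input_w
    else:
        w = h * input_w / input_h
    src = np.float32([[cx, cy], [cx, cy - h / 2], [cx - w / 2, cy]])
    dst = np.float32([[input_w / 2, input_h / 2], [input_w / 2, 0], [0, input_h / 2]])
    trans = cv2.getAffineTransform(src, dst)      # image -> cropped input
    trans_inv = cv2.getAffineTransform(dst, src)  # cropped input -> image
    return trans, trans_inv

depth_factor = np.array([2000.0], dtype=np.float32)  # bbox_3d_shape[2] in the configs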

About depth_factor

Hi,
I'm trying to train the model from scratch on another dataset. I noticed that you get depth_factor from this line:
depth_factor = np.array([self.bbox_3d_shape[2]]).astype(np.float32)
I have 2 questions regarding depth_factor:

  1. What's the usage of depth_factor? Why not just scale z by heatmap_size * 4 to transform xyz to uvd space, just like what's done to x and y?
  2. Why choose the value 2000 for depth_factor?
    Thanks a lot!
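To illustrate what I mean by the two different scalings (my reading of the code, which may be wrong): x and y are divided by the input resolution (heatmap_size * 4), while z is divided by the fixed metric range depth_factor:

import numpy as np

def xyz_to_uvd(xyz, input_w=192, input_h=256, depth_factor=2000.0):
    # xyz: (..., 3); x, y in input-image pixels, z as root-relative depth in mm
    u = xyz[..., 0] / input_w - 0.5    # -> roughly [-0.5, 0.5]
    v = xyz[..., 1] / input_h - 0.5
    d = xyz[..., 2] / depth_factor     # z scaled by a fixed 2000 mm range
    return np.stack([u, v, d], axis=-1)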

Source for Human3.6M Images?

Hello,

I was wondering if you had the script to generate the images that correspond to your annotations for the Human3.6M dataset. I didn't see any information on it and would like them so I can train your model using your train script. I already have access to the Human3.6M raw videos so only need the script to convert to images.

About SMPL gt and H36M gt !!

Your json file provides the H36M joint coordinates obtained from the SMPL gt parameters.
But I found that the joints3D data you provide differs from what is given on the official Human3.6M site. For example:

[screenshots comparing the provided joints3D with the official Human3.6M values]

How to reproject smpl verts to 2D?

Hello, thanks for your great work! When I test your code, I can't find a way to reproject the 3D SMPL vertices (6890) to 2D. Can you provide an example?
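What I have tried so far is a standard pinhole reprojection; the f and c inputs are my assumption based on the dataset annotations, so please correct me if the codebase does this differently:

import numpy as np

def project_verts_to_2d(verts_cam, f, c):
    # verts_cam: (6890, 3) SMPL vertices in camera coordinates
    # f: (2,) focal lengths in pixels; c: (2,) principal point in pixels
    uv = verts_cam[:, :2] / verts_cam[:, 2:3]  # perspective divide by depth
    return uv * f + c                          # (6890, 2) pixel coordinates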

How to obtain 3dhp image frames

Hi @Jeff-sjtu,
Thanks for your paper and code, they are very helpful. When I try to train your network, I encounter some issues with 3dhp. I find that the code uses annotation_mpi_inf_3dhp_train_v2.json, in which the file names of the 3dhp frames look like this:

S1/Seq2/images/S1_Seq2_V7/img_S1_Seq2_V7_006791.jpg
S8/Seq2/images/S8_Seq2_V8/img_S8_Seq2_V8_005711.jpg

I have 2 questions:

  1. What is the meaning of V7 and V8 in the lines above?
  2. Can you share the scripts used to extract frames from the 3dhp video sources, or, if possible, could you share the frame images you used when training HybrIK? (A sketch of my own attempt is included below.)

Thanks for your work again and hope to hear from you.
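For reference, my own extraction attempt assumes the standard 3DHP release layout, where each SN/SeqM folder contains imageSequence/video_K.avi per camera (so V7 would correspond to video_7.avi); the paths and naming here are my guesses, not the authors' script:

import os
import cv2

def extract_frames(video_path, out_dir, prefix):
    # e.g. video_path = 'S1/Seq2/imageSequence/video_7.avi' (assumed layout)
    #      prefix     = 'img_S1_Seq2_V7' to match the annotation file names
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, '%s_%06d.jpg' % (prefix, idx)), frame)
        idx += 1
    cap.release()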

Strange mesh in testing

Hi, I ran testing on in-the-wild images and found that it sometimes outputs an extremely unnatural body mesh.
Here is the input image:

[input image]

Below are screenshots of the reconstruction from the front and side views:

[front- and side-view screenshots]

As you can see, the body appears to be twisted 180 degrees near the SPINE_3 joint; as a result, the upper body is inside-out.
I also tested the model on several other images and found that this problem is very likely to occur when the input image does not include the full body (for example, only a partial or half body).
May I ask why this weird problem happens? Thanks a lot!

how to test?

Hi~ great job!
I want to know how to test with my own data to obtain the SMPL parameters.

Questions about annotations of Human3.6M.

Great Job and thank you for your excellent work.

However, I have a question. The joints3D data you provide differs from what is given on the official Human3.6M site. I have checked the Positions_D3_positions_mono file, which is also used in SPIN. Is there some additional operation being applied?

About numerical stability

Hi! Thank you for your great work!
I tried to implement your paper a while ago and ran into a numerical-stability problem.
When using SVD to compute the rotation matrix of the root joint, it may produce a matrix whose determinant is -1 because of errors in the predicted pose. Due to those errors, the three joints (spine, left hip, and right hip) do not form a rigid body part.
In this case, the optimal matrix is not a rotation matrix but one containing a mirror transform.
Hence, when I convert this matrix to an angle representation, NaN is produced.
Do you have any suggestions?
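In case it helps others, the usual remedy is the sign correction from the Kabsch/orthogonal-Procrustes algorithm: flip the last singular vector whenever the determinant comes out negative, so the result is a proper rotation rather than a reflection. A minimal PyTorch sketch of that correction (my workaround, not the authors' code):

import torch

def svd_rotation(H):
    # H: (B, 3, 3) cross-covariance between rest-pose and predicted joint offsets
    U, S, Vh = torch.linalg.svd(H)
    V, Ut = Vh.transpose(-2, -1), U.transpose(-2, -1)
    det = torch.det(V @ Ut)                 # -1 when the best fit is a reflection
    ones = torch.ones_like(det)
    D = torch.diag_embed(torch.stack([ones, ones, det.sign()], dim=-1))
    return V @ D @ Ut                       # proper rotation with det = +1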

h36m smpl_joint_29

Hi author, thanks for the great work!

I was able to derive the SMPL parameters and the joint positions smpl_joints_24 using the regressor. However, my derived values for smpl_joints_29 differ slightly from the ground-truth values you provide in Sample_5_train_Human36M_smpl_leaf_twist_protocol_2.json.

I would like to check how you obtained the 29 joints (24 regular joints + 5 vertices based on the pre-selected leaf_number = [411, 2445, 5905, 3216, 6617]) from the 24 joints given by the SMPL regressor. Would you be willing to share the code for this step?

Hope to get your advice on this, thank you!
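For reference, my current attempt simply appends the posed vertices at those five indices after the 24 regressed joints; I am not sure this matches your procedure:

import torch

LEAF_VERTEX_IDS = [411, 2445, 5905, 3216, 6617]  # leaf_number from above

def joints_24_to_29(joints_24, verts):
    # joints_24: (B, 24, 3) joints from the SMPL regressor
    # verts:     (B, 6890, 3) posed SMPL vertices
    leaf = verts[:, LEAF_VERTEX_IDS, :]          # head/hand/foot "leaf joints"
    return torch.cat([joints_24, leaf], dim=1)   # (B, 29, 3)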

Some issues about data

Hi, may I ask the following questions:

  1. How can one obtain 'twist_phi' and 'twist_weight' in the H36M dataset, and 'root_cam' in the H36M, 3DPW, and 3DHP datasets? May I know the process and code for calculating them?
  2. Does your method use the 'action_idx' data in H36M and 'segmentation' in MSCOCO? If so, how are they obtained?

Thanks so much.

validation code stuck in loading models

I set up all the datasets and ran the validation script,
but it gets stuck loading the models, like below:

tcp://127.0.0.1:23456, ws:4, rank:1
Loading model from ./pretrained_res34.pth...
tcp://127.0.0.1:23456, ws:4, rank:0
Loading model from ./pretrained_res34.pth...

Is there anything else I should do?
I have set up all the pretrained weights and SMPL files.

Inference bugs with trained model

Hi, when I run inference with the trained models, such as best_h36m_model.pth and best_3dpw_model.pth, the following error is reported when using ./configs/256x192_adam_lr1e-3-res34_smpl_24_3d_base_2x_mix.yaml as suggested:

RuntimeError: Error(s) in loading state_dict for Simple3DPoseBaseSMPL24:
Missing key(s) in state_dict: "decleaf.weight", "decleaf.bias".
size mismatch for final_layer.weight: copying a param with shape torch.Size([1856, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1536, 256, 1, 1]).
size mismatch for final_layer.bias: copying a param with shape torch.Size([1856]) from checkpoint, the shape in current model is torch.Size([1536]).
size mismatch for smpl.children_map: copying a param with shape torch.Size([29]) from checkpoint, the shape in current model is torch.Size([24]).
size mismatch for smpl.parents: copying a param with shape torch.Size([29]) from checkpoint, the shape in current model is torch.Size([24]).

So I switched to the config file used in training, ./configs/256x192_adam_lr1e-3-res34_smpl_3d_base_2x_mix.yaml, but the following error is still reported when testing on 3dpw after setting test_vertice=True in validate_smpl.py:

Traceback (most recent call last):
File "/mnt/lustre/shaorui/anaconda3/envs/hybrik/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "xxx/hybrik/scripts/validate_smpl.py", line 251, in main_worker
gt_tot_err = validate_gt(m, opt, cfg, gt_val_dataset_3dpw, heatmap_to_coord, opt.batch, test_vertice=True)
File "xxx/hybrik/scripts/validate_smpl.py", line 102, in validate_gt
gt_output = m.module.forward_gt_theta(gt_thetas, gt_betas)
File "xxx/hybrik/hybrik/models/simple3dposeBaseSMPL.py", line 346, in forward_gt_theta
return_verts=True
File "xxx/hybrik/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "xxx/hybrik/hybrik/models/layers/smpl/SMPL.py", line 207, in forward
self.lbs_weights, pose2rot=pose2rot, dtype=self.dtype)
File "xxx/hybrik/hybrik/models/layers/smpl/lbs.py", line 269, in lbs
J_transformed, A = batch_rigid_transform(rot_mats, J, parents, dtype=dtype)
File "xxx/hybrik/hybrik/models/layers/smpl/lbs.py", line 512, in batch_rigid_transform
rel_joints[:, 1:] -= joints[:, parents[1:]].clone()
RuntimeError: The size of tensor a (23) must match the size of tensor b (28) at non-singleton dimension 1

It seems there are some dimension mismatches at inference time; may I ask how to fix them?

Thanks so much.

Error when training

Hello. When I was training with
./scripts/train_smpl.sh train_res34 ./configs/256x192_adam_lr1e-3-res34_smpl_3d_base_2x_mix.yaml
an error occurred:

Traceback (most recent call last):
File "./scripts/train_smpl.py", line 373, in
main()
File "./scripts/train_smpl.py", line 236, in main
mp.spawn(main_worker, nprocs=ngpus_per_node, args=(opt, cfg))
File "/home/jack/anaconda3/envs/hybrik/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 167, in spawn
while not spawn_context.join():
File "/home/jack/anaconda3/envs/hybrik/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 114, in join
raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/jack/anaconda3/envs/hybrik/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/media/jack/6e1ff86c-973a-4c7d-acc8-acd6f9fe7b45/research/HybrIK/scripts/train_smpl.py", line 321, in main_worker
loss, acc17 = train(opt, train_loader, m, criterion, optimizer, writer)
File "/media/jack/6e1ff86c-973a-4c7d-acc8-acd6f9fe7b45/research/HybrIK/scripts/train_smpl.py", line 57, in train
output = m(inps, trans_inv, intrinsic_param, root, depth_factor, None)
File "/home/jack/anaconda3/envs/hybrik/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/jack/anaconda3/envs/hybrik/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 376, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/jack/anaconda3/envs/hybrik/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/media/jack/6e1ff86c-973a-4c7d-acc8-acd6f9fe7b45/research/HybrIK/hybrik/models/simple3dposeBaseSMPL.py", line 311, in forward
return_verts=True
File "/media/jack/6e1ff86c-973a-4c7d-acc8-acd6f9fe7b45/research/HybrIK/hybrik/models/layers/smpl/SMPL.py", line 261, in hybrik
leaf_thetas=leaf_thetas)
File "/media/jack/6e1ff86c-973a-4c7d-acc8-acd6f9fe7b45/research/HybrIK/hybrik/models/layers/smpl/lbs.py", line 341, in hybrik
v_shaped = v_template + blend_shapes(betas, shapedirs)
File "/media/jack/6e1ff86c-973a-4c7d-acc8-acd6f9fe7b45/research/HybrIK/hybrik/models/layers/smpl/lbs.py", line 436, in blend_shapes
blend_shape = torch.einsum('bl,mkl->bmk', [betas, shape_disps])
File "/home/jack/anaconda3/envs/hybrik/lib/python3.6/site-packages/torch/functional.py", line 211, in einsum
return torch._C._VariableFunctions.einsum(equation, operands)
RuntimeError: size of dimension does not match previous size, operand 1, dim 2

Question about 'Sample_20_test_Human36M_smpl'

Thank you for your excellent work.
'Sample_20_test_Human36M_smpl' is required by the project, but only 'Sample_64_test_Human36M_protocol_2' was found at the given link. Also, the given file doesn't include keys like 'h36m_joints' or 'smpl_joints'.
Thanks again.

Process to generate the JSON annotation files

Hi, HybrIK is indeed an awesome and elegant work. I really like it! Congratulations!

But, there's something I'm curious about, which is the process to generate JSON annotation files from the original datasets.
I would surely appreciate it if you would share some hints about the procedure.

Thanks again.

RuntimeError: NCCL error

Thank you for the great work.

I am having issues evaluating the pretrained model with ./validate_smpl.sh.

The script validate_smpl.sh looks like

cd ..

CONFIG=./configs/256x192_adam_lr1e-3-res34_smpl_24_3d_base_2x_mix.yaml
CKPT=./models/pretrained_res34.pth
PORT=${3:-23456}

HOST=$(hostname -i)

NCCL_DEBUG=INFO python ./scripts/validate_smpl.py \
    --batch 32 \
    --gpus 0,1,2,3 \
    --world-size 4 \
    --flip-test \
    --launcher pytorch --rank 0 \
    --dist-url tcp://${HOST}:${PORT} \
    --cfg ${CONFIG} \
    --checkpoint ${CKPT}

The error log:

ued37c44e4ea65b:6861:6861 [7] init.cc:981 NCCL WARN Invalid rank requested : 7/4
ued37c44e4ea65b:6859:6859 [5] init.cc:981 NCCL WARN Invalid rank requested : 5/4
ued37c44e4ea65b:6860:6860 [6] NCCL INFO Bootstrap : Using [0]enp1s0f0:10.49.66.196<0>
ued37c44e4ea65b:6860:6860 [6] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so).
ued37c44e4ea65b:6860:6860 [6] NCCL INFO NET/IB : No device found.
ued37c44e4ea65b:6860:6860 [6] NCCL INFO NET/Socket : Using [0]enp1s0f0:10.49.66.196<0>

ued37c44e4ea65b:6860:6860 [6] init.cc:981 NCCL WARN Invalid rank requested : 6/4
ued37c44e4ea65b:6854:7877 [0] NCCL INFO Setting affinity for GPU 0 to 03,fffff000,003fffff
ued37c44e4ea65b:6856:7878 [2] NCCL INFO Setting affinity for GPU 2 to 03,fffff000,003fffff
ued37c44e4ea65b:6857:7879 [3] NCCL INFO Setting affinity for GPU 3 to 03,fffff000,003fffff
ued37c44e4ea65b:6855:6855 [1] NCCL INFO Bootstrap : Using [0]enp1s0f0:10.49.66.196<0>
ued37c44e4ea65b:6855:6855 [1] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so).
ued37c44e4ea65b:6855:6855 [1] NCCL INFO NET/IB : No device found.
ued37c44e4ea65b:6855:6855 [1] NCCL INFO NET/Socket : Using [0]enp1s0f0:10.49.66.196<0>
ued37c44e4ea65b:6855:7880 [1] NCCL INFO Setting affinity for GPU 1 to 03,fffff000,003fffff
Traceback (most recent call last):
  File "./scripts/validate_smpl.py", line 250, in <module>
    main()
  File "./scripts/validate_smpl.py", line 201, in main
    mp.spawn(main_worker, nprocs=ngpus_per_node, args=(opt, cfg))
  File "/home/ANT.AMAZON.COM/khirawal/anaconda3/envs/hybrik/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
    while not spawn_context.join():
  File "/home/ANT.AMAZON.COM/khirawal/anaconda3/envs/hybrik/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 118, in join
    raise Exception(msg)
Exception: 

-- Process 5 terminated with the following error:
Traceback (most recent call last):
  File "/home/ANT.AMAZON.COM/khirawal/anaconda3/envs/hybrik/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/home/ANT.AMAZON.COM/khirawal/Desktop/ochmr/hybrik/scripts/validate_smpl.py", line 219, in main_worker
    m = torch.nn.parallel.DistributedDataParallel(m, device_ids=[opt.gpu])
  File "/home/ANT.AMAZON.COM/khirawal/anaconda3/envs/hybrik/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 298, in __init__
    self.broadcast_bucket_size)
  File "/home/ANT.AMAZON.COM/khirawal/anaconda3/envs/hybrik/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 480, in _distributed_broadcast_coalesced
    dist._broadcast_coalesced(self.process_group, tensors, buffer_size)
RuntimeError: NCCL error in: /tmp/pip-req-build-58y_cjjl/torch/lib/c10d/../c10d/NCCLUtils.hpp:39, invalid argument

Any suggestions? Thank you.

About R and t in h36m

Hi, thanks for the great work!

I have a few questions regarding the data preprocessing:

  1. How did you obtain the camera parameters R, t, f, c for H36M? I've tried to derive them from the metadata.xml provided on the H36M website, but I got a different t from the one in your annotation file.
  2. Would you mind providing the code you use for deriving the camera parameters?

Backward gradients through HybrIK

I had assumed that, by making use of an analytical solver, gradients would flow back through HybrIK. However, when I set
pred_xyz_jts_24.register_hook(lambda grad: print(grad)), I do not see any gradients flowing back. Also, by visualizing the gradient flow, I could find only two MseLossBackward nodes, corresponding to loss_beta and loss_twist, while there is no flow through loss_theta.

[gradient-flow visualization]

Am I missing out on some config settings?
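For completeness, this is the generic register_hook check I used to test whether gradients reach a tensor; it is a self-contained sketch, not tied to the HybrIK code:

import torch

x = torch.randn(2, 24, 3, requires_grad=True)
x.register_hook(lambda g: print('grad norm reaching x:', g.norm().item()))
loss = (x * 2).sum()   # stand-in for the downstream loss
loss.backward()        # the hook fires only if x is part of the backward graph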

How to generate the thetas and betas parameter?

Hi, I have downloaded the Human3.6M dataset; however, I couldn't find a way to produce the thetas and betas parameters. The only clue we know of comes from a SPIN issue, which uses the sensor data and the MoSh method to generate the theta and beta parameters, but the sensor data of Human3.6M and the code of MoSh are not public. We are very anxious to generate the pose parameters for both the training and testing datasets. Could you please advise?

About the data you provide!

Your json file provides thetas, betas, smpl_joints, and h36m_joints.
I calculated the SMPL joints from thetas, betas, and the J_regressor (basicModel_neutral_lbs_10_207_0_v1.0.0.pkl), and they match the smpl_joints you provide:
smpl_joints - root coordinate (mine) = smpl_joints - root coordinate (yours)

But when I calculate the h36m joints from thetas, betas, and J_regressor_h36m (from J_regressor_h36m_correct.npy), they differ from the h36m_joints you provide:
h36m_joints - root coordinate (mine) ≠ h36m_joints - root coordinate (yours)
Why is that?
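For reference, this is roughly how I computed the H36M joints; the regressor file is the one named above, but the root-centering step (pelvis as joint 0) is my assumption:

import numpy as np

J_regressor_h36m = np.load('J_regressor_h36m_correct.npy')  # (17, 6890)

def h36m_joints_from_verts(verts):
    # verts: (6890, 3) posed SMPL vertices computed from thetas and betas
    joints = J_regressor_h36m @ verts                        # (17, 3)
    return joints - joints[0]                                # root-centered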

Missing Test Model ?

Hi, is the model for testing missing? I used the following command to evaluate:
./scripts/validate_smpl.sh ./configs/256x192_adam_lr1e-3-res34_smpl_3d_base_2x_mix.yaml
Where can I download the model for evaluation?

Thanks

Training bugs based on your updated lbs.py

Hi,

After training with your updated lbs.py, I got the following error:

File "./scripts/train_smpl.py", line 359, in main_worker
loss, acc17 = train(opt, train_loader, m, criterion, optimizer, writer)
File "./scripts/train_smpl.py", line 91, in train
loss.backward()
File "/mnt/lustre/shaorui/anaconda3/envs/hybrik/lib/python3.6/site-packages/torch/tensor.py", line 118, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/mnt/lustre/shaorui/anaconda3/envs/hybrik/lib/python3.6/site-packages/torch/autograd/init.py", line 93, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1536]], which is output 0 of IndexPutBackward, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

I have verified that this bug appears after lbs.py was updated in the 'use batch operation' commit.

May I ask how to fix this?

Thanks.

Code to get Twist phi

Hello, I am working on a body model following your implementation. Thanks for providing a great implementation.

I want to check whether there is any code to obtain the twist phi annotation from the H36M data. Could you provide it?

Thanks.

test image name

I want to use the H36M dataset for evaluation, but I get the error "FileNotFoundError: [Errno 2] No such file or directory: './data/h36m/images/s_09_act_09_subact_01_ca_04/s_09_act_09_subact_01_ca_04_001756.jpg'". The image names in the S9 subject of H36M look like this:

[screenshot of the S9 image filenames]

I want to know which file in the S9 subject "s_09_act_09_subact_01_ca_04/s_09_act_09_subact_01_ca_04_001756.jpg" corresponds to.

3DPW data

Hello, thanks for the great work!

I tried to derive the same values as your parsed 3DPW data, but I was unable to obtain the same h36m_joints and smpl_joints values.

I have a few questions:
(1) What are your inputs into the SMPL and H36M regressor?
(2) Was any translation applied?
(3) How did you obtain the root coordinate?

With regards to (3), I've attached a screenshot of my outputs from the H36M regressor. They differ from your ground-truth h36m_joints by an offset, which seems to be the "root coordinate". However, I'm unable to find this "root" in the original dataset, so I was wondering how you derived it. The same applies to your ground-truth smpl_joints.

[screenshot of the regressor output offsets]

Hope to get your advice on this, thanks!

Torch size mismatch error

Hi, I see the following error message when running the evaluation script. I have placed basicModel_neutral_lbs_10_207_0_v1.0.0.pkl in the specified location. Any suggestions? Thanks.

File "D:\HybrIK\scripts\validate_smpl.py", line 216, in main_worker
m.load_state_dict(torch.load(opt.checkpoint, map_location='cpu'), strict=False)
File "C:\Users\wzhang2\Miniconda3\envs\hybrik\lib\site-packages\torch\nn\modules\module.py", line 777, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Simple3DPoseBaseSMPL:
size mismatch for final_layer.weight: copying a param with shape torch.Size([1536, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1856, 256, 1, 1]).
size mismatch for final_layer.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([1856]).
size mismatch for smpl.children_map: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([29]).
size mismatch for smpl.parents: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([29]).

Weird pose from SMPL mesh and error between pred_xyz_jts_24 (3D pose) and pred_xyz_jts_24_struct (SMPL pose)

Hello, thanks for your great work! When I run your validate_smpl.py with your pretrained model (pretrained_res34.pth) on 3DPW_test_new.json, I noticed some issues.

  1. The model occasionally generates weird poses, especially around the hands and head.
  2. There are errors between pred_xyz_jts_24 (3D pose) and pred_xyz_jts_24_struct (SMPL pose).
    As I understand it, we compute rotation matrices from pred_xyz_jts_24 and then compute pred_xyz_jts_24_struct from those rotation matrices with SMPL, but the two are never the same, especially at the hands and feet.

The images are shown in Weird-pose-from-HybrIK.

Can you help me solve the above problems?
Thanks for your great work again!

Several Doubts

Hi, thanks for your great work!

May I ask several questions as follows:

  1. Since MSCOCO does not provide ground-truth 3D joint annotations, it seems the code just uses the 2D keypoint annotations for the joint loss?

  2. In criterion.py, the implementation uses labels['target_uvd_29'] for the joint loss. This is regression of the 3D heatmap coordinates rather than of 3D joints (since the 3D joints are represented by pred_xyz_jts_29)? But the paper says it does 3D keypoint estimation.

Thanks.

Flip_item is not applied during training

Dear authors, thanks for the great work!

I noticed that the parameter flip_item passed into the model is set to None during train(), but is popped from the is_flipped key during validate_gt().

In train:

output = m(inps, trans_inv, intrinsic_param, root, depth_factor, None)

in validate_gt:

flip_output = labels.pop('is_flipped', None)
output = m(inps, trans_inv, intrinsic_param, root, depth_factor, flip_output)

When flip_item is set to None, the phi, leaf, shape, and uvd coordinates will not be flipped during training, even though the image and keypoints3d might be flipped during augmentation.

if flip_item is not None:
    assert flip_output
    pred_uvd_jts_24_orig, pred_phi_orig, pred_leaf_orig, pred_shape_orig = flip_item

if flip_output:
    pred_uvd_jts_24 = self.flip_uvd_coord(pred_uvd_jts_24, flatten=False, shift=True)
if flip_output and flip_item is not None:
    pred_uvd_jts_24 = (pred_uvd_jts_24 + pred_uvd_jts_24_orig.reshape(batch_size, 24, 3)) / 2

pred_uvd_jts_24_flat = pred_uvd_jts_24.reshape((batch_size, self.num_joints * 3))

# -0.5 ~ 0.5
# Rotate back
pred_xyz_jts_24 = self.uvd_to_cam(pred_uvd_jts_24[:, :24, :], trans_inv, intrinsic_param, joint_root, depth_factor)
assert torch.sum(torch.isnan(pred_xyz_jts_24)) == 0, ('pred_xyz_jts_24', pred_xyz_jts_24)

pred_xyz_jts_24 = pred_xyz_jts_24 - pred_xyz_jts_24[:, self.root_idx_24, :].unsqueeze(1)

pred_phi = pred_phi.reshape(batch_size, 23, 2)
pred_leaf = pred_leaf.reshape(batch_size, 5, 4)

if flip_output:
    pred_phi = self.flip_phi(pred_phi)
    pred_leaf = self.flip_leaf(pred_leaf)
if flip_output and flip_item is not None:
    pred_phi = (pred_phi + pred_phi_orig) / 2
    pred_leaf = (pred_leaf + pred_leaf_orig) / 2
    pred_shape = (pred_shape + pred_shape_orig) / 2

I would also like to confirm whether flip_item is indeed set to None for all training. If so, why is the flipping of phi, shapes, and uvd coordinates not applied during training but applied during validation?

I am rather confused regarding this procedure, and hope to get some clarification on this. Thank you!

error in validate

Hi,

I face this error when validating:

-- Process 3 terminated with the following error:
Traceback (most recent call last):
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/dump/algopre/c-szan/github/HybrIK/scripts/validate_smpl.py", line 239, in main_worker
    gt_tot_err = validate_gt(m, opt, cfg, gt_val_dataset_hp3d, heatmap_to_coord, opt.batch)
  File "/dump/algopre/c-szan/github/HybrIK/scripts/validate_smpl.py", line 169, in validate_gt
    with open(os.path.join('exp', f'test_gt_kpt_rank_{opt.rank}.pkl'), 'wb') as fid:
FileNotFoundError: [Errno 2] No such file or directory: 'exp/test_gt_kpt_rank_3.pkl'

What is this exp directory? Am I supposed to train first? I only want to use the pre-trained model to run validation. Any ideas?

Thanks!
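For what it's worth, the traceback suggests the script writes into an exp/ directory that is never created; a likely workaround (my assumption, not a confirmed fix) is to create it before running validation:

import os

# create the output directory that validate_gt assumes exists
os.makedirs('exp', exist_ok=True)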

Does this work require multiple datasets to train?

Hi,

Great work!

I wonder if this framework can be trained using only Human3.6M and tested on Human3.6M.

Will the results still be as promising as stated in the paper, or does it require all the datasets for training?

one of the variables needed for gradient computation has been modified by an inplace operation

Hello, I am very interested in your work. I have run into the following problem:
Traceback (most recent call last):
File "/home/ubuntu/data/nsga/HybrIK/scripts/train_smpl.py", line 375, in
main()
File "/home/ubuntu/data/nsga/HybrIK/scripts/train_smpl.py", line 238, in main
mp.spawn(main_worker, nprocs=ngpus_per_node, args=(opt, cfg))
File "/home/ubuntu/anaconda3/envs/lrwf/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
while not spawn_context.join():
File "/home/ubuntu/anaconda3/envs/lrwf/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/lrwf/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/home/ubuntu/data/nsga/HybrIK/scripts/train_smpl.py", line 323, in main_worker
loss, acc17 = train(opt, train_loader, m, criterion, optimizer, writer)
File "/home/ubuntu/data/nsga/HybrIK/scripts/train_smpl.py", line 79, in train
loss.backward()
File "/home/ubuntu/anaconda3/envs/lrwf/lib/python3.6/site-packages/torch/tensor.py", line 118, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/ubuntu/anaconda3/envs/lrwf/lib/python3.6/site-packages/torch/autograd/init.py", line 93, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [768]], which is output 0 of IndexPutBackward, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Looking forward to your reply!

About decleaf in Simple3DPoseBaseSMPL24

Hi,
I noticed that there is a layer named decleaf in the inference model:

self.decleaf = nn.Linear(1024, 5 * 4) # rot_mat quat

which does not exist in the training model simple3dposeBaseSMPL.
Looking into the code, it seems that you extend the number of joints from 24 to 29 to compute the rotations of the five distal (leaf) joints. Thus I have 2 questions:

  1. Why not solve the leaf joints' rotations at inference time with the same method as at training time?
  2. How do I obtain the decleaf layer after training the model from scratch?

Thanks a lot!
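For context, from the layer definition above, decleaf appears to regress one quaternion per leaf joint; a minimal sketch of how its output would be decoded, where the feature shape and the normalization step are my assumptions:

import torch
import torch.nn as nn
import torch.nn.functional as F

decleaf = nn.Linear(1024, 5 * 4)             # as defined in the model: 5 leaf quaternions
feat = torch.randn(2, 1024)                  # pooled backbone features (illustrative)
pred_leaf = decleaf(feat).reshape(-1, 5, 4)  # matches pred_leaf.reshape(batch_size, 5, 4)
pred_leaf = F.normalize(pred_leaf, dim=-1)   # unit quaternions (assumed)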
