yohanshin / WHAM
License: MIT License
The project looks amazing.
What is the expected date for releasing the code?
I have been trying since last week to install on Windows, but I was having problems with DPVO and lietorch.
A great guy called @psiydown helped a lot, as did another person from the lietorch GitHub.
I created this issue so it will be easier for people to find how to work through that part.
I'll add here the notes I used as reference and the files I had to change to compile DPVO.
Dear authors,
Thank you for this amazing work! May I know if you have a plan to release the code for training?
Thanks for the great work! I have a question about the evaluation. Currently, the joint positions are computed for both GT and prediction by applying the joint regressor to the posed mesh vertices. This won't be the same as the actual joints of SMPL given as the joint output from the SMPL model (since the standard way is to apply the regressor after applying the shape blendshapes but before posing). Is there any particular reason for not using the joint output from SMPL? Thanks!
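For reference, here is a minimal sketch of the two conventions using the smplx package (the model path is illustrative):

```python
import torch
import smplx

# Neutral SMPL model; "models/" must contain the SMPL model files.
smpl = smplx.create("models/", model_type="smpl")
out = smpl(betas=torch.zeros(1, 10),
           body_pose=torch.zeros(1, 69),
           global_orient=torch.zeros(1, 3))

# (a) SMPL's own joint output: the regressor is applied after the shape
#     blendshapes but before posing; the kinematic chain then poses the
#     joints. smplx appends extra landmarks after the first 24 joints.
joints_model = out.joints[:, :24]

# (b) Evaluation-style joints: the regressor is applied to the posed mesh.
J_regressor = smpl.J_regressor  # (24, 6890)
joints_from_verts = torch.einsum("jv,bvc->bjc", J_regressor, out.vertices)
```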
Hello authors, I would like to ask whether this method could do real-time motion capture on a streamed video input, by caching the RNN state and using it to initialize the next frame 🤔
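For illustration only, a generic sketch of that streaming idea in PyTorch; this is not WHAM's actual API, and all sizes and names are placeholders:

```python
import torch
import torch.nn as nn

# Generic streaming RNN: the hidden state from frame t initializes frame t+1,
# so frames can be processed as they arrive instead of as a whole clip.
rnn = nn.GRU(input_size=512, hidden_size=512, batch_first=True)
hidden = None  # carried across incoming frames
for _ in range(100):                 # stand-in for a live video stream
    feat = torch.randn(1, 1, 512)    # per-frame image feature (placeholder)
    out, hidden = rnn(feat, hidden)  # hidden is reused to init the next frame
```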
Dear authors, thanks for sharing your work. But I cannot find the dpvo.pth file, which is required to run demo.py.
Dear authors, the checkerboard size becomes 0 if the person barely moves in the video: https://github.com/yohanshin/WHAM/blob/main/lib/vis/tools.py#L174C51-L174C51
So I recommend setting a minimum row and column count: num_rows = num_cols = max(2, int(length / tile_width)). A sketch of this guard follows.
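For illustration, a minimal sketch of the suggested guard; the actual surrounding code in lib/vis/tools.py may differ, and tile_counts is a hypothetical helper name:

```python
# Hypothetical excerpt; checkerboard_geometry's real structure may differ.
def tile_counts(length, tile_width):
    # Without the max(2, ...) guard, int(length / tile_width) can be 0 when
    # the subject barely moves, so no tiles are generated and the later
    # np.concatenate(vertices, axis=0) raises
    # "ValueError: need at least one array to concatenate".
    num_rows = num_cols = max(2, int(length / tile_width))
    return num_rows, num_cols
```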
Hi,
The Colab notebook has finally been released, and I got to try this model on tennis footage. I was really impressed by its performance on tennis videos; this is the most accurate model I have seen (I have been looking for 3D models that perform well in tennis scenarios for the past few months)!
Kudos to the team for making such an excellent model! :)
Here's a video of the model's performance, although it took me about 2 hours on Colab to produce it!
I have a few questions regarding this output for the team, numbered below.
Once again, I was really impressed by the model's performance. I am sure this model will set a new benchmark for other HMR models in the future :)
```
python demo.py --video dance.mp4 --visualize
2024-03-04 19:44:25.282 | INFO | main::27 - DPVO is not properly installed. Only estimate in local coordinates !
2024-03-04 19:44:25.316 | INFO | main::209 - GPU name -> NVIDIA GeForce RTX 2060
2024-03-04 19:44:25.316 | INFO | main::210 - GPU feat -> CudaDeviceProperties(name='NVIDIA GeForce RTX 2060', major=7, minor=5, total_memory=12287MB, multi_processor_count=34)
Traceback (most recent call last):
  File "C:\ia\wham\WHAM\demo.py", line 214, in <module>
    smpl = build_body_model(cfg.DEVICE, smpl_batch_size)
  File "C:\ia\wham\WHAM\lib\models\__init__.py", line 12, in build_body_model
    body_model = SMPL(
  File "C:\ia\wham\WHAM\lib\models\smpl.py", line 25, in __init__
    J_regressor_wham = np.load(_C.BMODEL.JOINTS_REGRESSOR_WHAM)
  File "C:\Users\ultim\miniconda3\envs\wham\lib\site-packages\numpy\lib\npyio.py", line 427, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: 'dataset/body_models/J_regressor_wham.npy'
```
any idea?
```
(wham) ubuntu@ip-172-31-38-235:~/WHAM$ python demo.py --video examples/tennis1.mp4 --visualize --run_smplify
apex is not installed
apex is not installed
apex is not installed
/home/ubuntu/miniconda3/envs/wham/lib/python3.9/site-packages/mmcv/cnn/bricks/transformer.py:27: UserWarning: Fail to import ``MultiScaleDeformableAttention`` from ``mmcv.ops.multi_scale_deform_attn``, You should install ``mmcv-full`` if you need this module.
  warnings.warn('Fail to import ``MultiScaleDeformableAttention`` from '
Traceback (most recent call last):
  File "/home/ubuntu/WHAM/demo.py", line 207, in <module>
    cfg.merge_from_file('configs/yamls/demo_w_fit.yaml')
  File "/home/ubuntu/miniconda3/envs/wham/lib/python3.9/site-packages/yacs/config.py", line 211, in merge_from_file
    with open(cfg_filename, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'configs/yamls/demo_w_fit.yaml'
```
It was working before I pulled the new version.
I was able to install and run the evaluation code, but where is the output? I couldn't find it; I just found a yaml file in the experiments folder.
Sorry, found it: mmpose 0.24.0. It was in the vitpose part.
Hi,
This is really great work, one of the best repositories that use SMPL.
I wanted to know: is there a way to visualize the joints on top of the 3D mesh using WHAM?
Also, is there a way to check the angles between the joints?
I want to demo a video using WHAM where I can show angles overlaid on the 3D model.
Thank you!
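Not an official answer, but as one possible approach, the angle at a joint can be computed from three 3D joint positions; a generic sketch (the joint indices follow the standard SMPL ordering and are used here only as an example):

```python
import numpy as np

def joint_angle(parent, joint, child):
    """Angle in degrees at `joint`, given the 3D positions of its neighbors."""
    v1 = parent - joint
    v2 = child - joint
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# e.g., left knee angle from left hip (1), left knee (4), left ankle (7):
# angle = joint_angle(joints3d[1], joints3d[4], joints3d[7])
```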
Thank you for sharing this nice work!
I got this error in the preprocessing stage:
File "/root/WHAM/demo.py", line 223, in <module>
run(cfg,
File "/root/WHAM/demo.py", line 76, in run
slam_results = slam.process()
File "/root/WHAM/lib/models/preproc/slam.py", line 70, in process
return self.slam.terminate()[0]
File "/root/miniconda3/envs/wham/lib/python3.9/site-packages/dpvo/dpvo.py", line 166, in terminate
poses = [self.get_pose(t) for t in range(self.counter)]
File "/root/miniconda3/envs/wham/lib/python3.9/site-packages/dpvo/dpvo.py", line 166, in <listcomp>
poses = [self.get_pose(t) for t in range(self.counter)]
File "/root/miniconda3/envs/wham/lib/python3.9/site-packages/dpvo/dpvo.py", line 158, in get_pose
return dP * self.get_pose(t0)
File "/root/miniconda3/envs/wham/lib/python3.9/site-packages/dpvo/dpvo.py", line 158, in get_pose
return dP * self.get_pose(t0)
File "/root/miniconda3/envs/wham/lib/python3.9/site-packages/dpvo/dpvo.py", line 158, in get_pose
return dP * self.get_pose(t0)
[Previous line repeated 986 more times]
File "/root/miniconda3/envs/wham/lib/python3.9/site-packages/dpvo/lietorch/groups.py", line 203, in __mul__
return self.mul(other)
File "/root/miniconda3/envs/wham/lib/python3.9/site-packages/dpvo/lietorch/groups.py", line 151, in mul
return self.__class__(self.apply_op(Mul, self.data, other.data))
File "/root/miniconda3/envs/wham/lib/python3.9/site-packages/dpvo/lietorch/groups.py", line 127, in apply_op
inputs, out_shape = broadcast_inputs(x, y)
File "/root/miniconda3/envs/wham/lib/python3.9/site-packages/dpvo/lietorch/broadcasting.py", line 28, in broadcast_inputs
x1 = x.repeat(x_expand + [1]).reshape(-1, xd).contiguous()
RecursionError: maximum recursion depth exceeded while calling a Python object
Hi @yohanshin, I'm here again with some new questions 😄
Which HMR2.0 variant (a or b) are you using to extract ViT features? I've noticed that both variants yield a feature dimension of 1280 at the end of the image backbone: the input is 256 x 192 x 3, while the output is (16*12) x 1280. I am confused as to why your pre-saved ViT feature for an image is 1 x 1024. Could you give me some advice?
When I use the official docker image yusun9/wham-vitpose-dpvo-cuda11.3-python3.9:latest,
the WHAM output is fine and the animation is smooth.
But when I build my own docker image based on pytorch/pytorch:2.2.1-cuda12.1-cudnn8-runtime,
the WHAM output suddenly has a jitter problem.
I suspect the reason is the different PyTorch version; does the code support PyTorch 2.2.1?
Thanks for your work! I have some questions about self.aug_dict = torch.load(_C.KEYPOINTS.COCO_AUG_DICT):
pmask, jittering, peak, and bias each have shape [1, 1, 17].
What criteria did you follow to generate these parameters? Could you provide the related code?
Hello @yohanshin
The video shows walking on flat ground, but the estimated motion becomes walking on a slope, with the feet higher than the ground level (see Line 83 in b25b6a4; attachment: \examples\IMG_9730.mov). Change the "global-R [frame_i3]" in that line of code to "default-R" to output the video from a zero-angle camera, and you will be able to see the issue I mentioned.

Thanks for sharing such a great work! When will custom video inference be supported?
Dear authors,
Thank you for the amazing paper. I have a question regarding the output: if I want to extract the 3D pose from your model, is it pred["pose"] or pred["poses_body"]? And can you please tell me the difference between the two?
Thanks for this amazing work.
Do you know the unit of the checkerboard in the demo videos? How can we get body displacements in meters?
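For what it's worth, SMPL works in meters, so if the demo's trans_world output holds the per-frame world translation, displacement can be read off directly; a sketch under that assumption:

```python
import numpy as np

# Assuming results[0]["trans_world"] from the WHAM demo is an (F, 3) world
# translation in meters (SMPL's native unit):
trans = results[0]["trans_world"]
step = np.linalg.norm(np.diff(trans, axis=0), axis=1)  # (F-1,) meters/frame
total_path = step.sum()                                # total meters traveled
```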
Thanks for your great work. Do you have any idea why the foot_contact confidence can be lower than 0 or higher than 1? Thank you.
Nice work! I can frankly say that WHAM is the best-performing method I have ever seen in this literature, and I'm now trying to train WHAM and evaluate it on my own dataset. I noticed that when running stage-1 training following the guidance in the train branch, "3dpw_val_vit.pth" is missing. Though the problem can be solved by simply renaming "3dpw_test_vit.pth" to "3dpw_val_vit.pth", as mentioned in another issue, I'm still wondering whether I can generate the parsed ViT version of the datasets on my own :)
I noticed a closed comment about the output stating that
The future demo code with custom videos will support visualization and storing SMPL parameters.
Would it be possible for either yourself (owner) or another member of the opensource community to write a guide on importing the resulting data to visualize as a 3D model in Blender?
I'm just starting to research this field so I apologize if that is a simple or complicated task.
Best regards
While following the installation instructions, I got this error.
I thought, OK, I'm using Windows, which this was not made to work with, so I went to the link from the pip installation, and I got this error.
In any case, I'll try to install the default pytorch3d, but I wanted to let you know about what happened; maybe it can help with something.
@dalgu90 @yohanshin @RohaanA Thanks for your amazing method.
I'm really looking forward to the training code release; I can't wait to train the model on my own dataset. So, when could you release the training code? Thanks a lot.
My understanding from the paper is that the joint angles (results["poses_body"] = wham_inference(.)) are deltas from the nominal pose of SMPL; is this correct? If so, is it OK to get the nominal position of the joints with smpl.get_output(.).original_pose? And what about the nominal orientation of the joint frames?
Additionally, results["poses_body"] is a 23 x R_{3x3} matrix while the SMPL joint count is 24; are poses_body the joint angles of j1 to j23, leaving j0 as the root of the body?
Thank you in advance for the response, and great work!!
Hi @yohanshin,
Can you give me some suggestions on how to use the transl in rich_test_vit.pth to recover the ground-truth global motion?
I found the transls are the same for index 6 and 37 (test/Gym_012_lunge1/cam_05 and test/Gym_012_lunge1/cam_04): (labels["transl"][6][1:] != labels["transl"][37][1:]).sum() gives 0. Therefore, I think the transl is in world coordinates. However, the computed joints3d differ from the provided joints3d (joints3d = labels["joints3D"][index][1:]). Could you help me understand the differences?
```python
smplx_out = smplx_models[gender](
    body_pose=pose[:, 3:-6].reshape(-1, 63),
    global_orient=pose[:, :3],
    betas=betas,
)
# Map SMPL-X vertices to SMPL topology; in camera view, but zero-centered
smpl_verts = torch.matmul(smplx2smpl, smplx_out.vertices)
# Bring the world translation into the camera frame, then apply it
transl_ = (cam_poses[:, :, :3] @ transl[:, :, None] + cam_poses[:, :, 3:]).squeeze(2)
smpl_verts = smpl_verts + transl_[:, None, :]
smpl_joints_c = smpl_J_regressor @ smpl_verts  # (F, 24, 3)
(smpl_joints_c - joints3d).abs().max()  # 0.0971
```
Hi!
We just figured out the issue of the heavy foot sliding and jittering: it's because of the shrinking!
Please see this video below.
Thank you!
How can I map the 31 network-predicted keypoints pred_kp3d (ranging from -1 to 1) back to the original image? I attempted the following projection: ratio = 1.0 / 224; points_3d = (points_3d + 1.0) / (2 * ratio), but found that the positions were not accurate.
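Not the authors' answer, but many HMR pipelines predict in the normalized space of a 224 x 224 square crop, so mapping back needs the crop's bounding-box center and size; a sketch under that assumption (all names are illustrative):

```python
import numpy as np

def norm_kp_to_image(kp_norm, bbox_center, bbox_size):
    """Map keypoints from [-1, 1] crop space back to image pixels.

    Assumes the network saw a square crop of side `bbox_size` (pixels)
    centered at `bbox_center`; only x/y are mapped, depth is left alone.
    """
    kp = np.asarray(kp_norm, dtype=np.float32)
    return kp[..., :2] * (bbox_size / 2.0) + np.asarray(bbox_center, dtype=np.float32)
```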
I'd like to extract SMPL coefficients like HumanNeRF does (https://github.com/chungyiweng/humannerf?tab=readme-ov-file#metadatajson).
After running WHAM, I get results with keys ['poses_body', 'poses_root_cam', 'betas', 'verts_cam', 'poses_root_world', 'trans_world', 'frame_id'], and the coefficients HumanNeRF needs are "poses", "betas", "cam_intrinsics", and "cam_extrinsics".
I could easily see that "betas" is exactly the same as results[0]["betas"], and it seems "cam_extrinsics" corresponds to results[0]["poses_root_world"] + results[0]["trans_world"].
But in HumanNeRF or VIBE, "poses" is a (72,) array, while results[0]['poses_body'] is (23, 3, 3).
I'm also not sure how to recover cam_intrinsics. Maybe results[0]["poses_root_cam"] means cam_intrinsics?
Could you please let me know how to recover these coefficients?
It would also be great if you could explain poses_root_cam and verts_cam.
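In case it helps, HumanNeRF's (72,) "poses" is the 24 SMPL joint rotations in axis-angle form, so one plausible recovery is to stack the root rotation with poses_body and convert; a sketch assuming both keys hold per-frame 3 x 3 rotation matrices:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def rotmats_to_smpl_pose(root_rotmat, body_rotmats):
    """(3, 3) root rotation + (23, 3, 3) body rotations -> (72,) axis-angle."""
    rotmats = np.concatenate([root_rotmat[None], body_rotmats], axis=0)  # (24, 3, 3)
    return R.from_matrix(rotmats).as_rotvec().reshape(72)

# Hypothetical usage for frame f of subject 0:
# pose_72 = rotmats_to_smpl_pose(results[0]["poses_root_cam"][f],
#                                results[0]["poses_body"][f])
```

(cam_intrinsics, by contrast, is the camera's focal length and principal point; poses_root_cam looks like a root orientation, so it would not be the intrinsics.)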
Hi, I would like to ask: what is the order of the keypoints regressed by J_regressor_wham?
Great work!
I'd like to know if you could release the pretrained models with the other backbones (WHAM (Res)/(HR)) from the paper. I want to run the demo on my own PC in real time (>30 fps), but WHAM (ViT) is too slow.
Thanks a lot!
Thank you for sharing this nice work!
I tried the Google Colab the day before yesterday, and it worked nicely.
But when I tried to install locally today, there were some problems.
In the data preparation process, the fetch_demo_data.sh file runs:
```
wget "https://drive.google.com/uc?id=1pbmzRbWGgae6noDIyQOnohzaVnX_csUZ&export=download&confirm=t" -O 'dataset/body_models.tar.gz'
tar -xvf dataset/body_models.tar.gz -C dataset/
```
When I ran the data-fetching command, I got this message.
Hello, thanks for the great work! I am currently trying to understand your paper and have a couple of questions:
1. Can faster methods be used for feature extraction? Currently, a single batch of feature extraction runs at only 15 fps.
2. Can we obtain 3D joint positions in the feature-integration stage without using SMPL, and extract them from the context? What is the format?
3. Can the smoothing filter parameters be adjusted? If it is too smooth by default, the details of small motions cannot be seen.
4. Can it support both palm and toe inference simultaneously?
Hi @yohanshin ,
Thanks for your outstanding work!
I tried to run some demos on the 3DPW dataset. However, the output videos from demo.py (I used ground-truth detection and tracking) and from evaluate_3dpw.py are quite different. Do you have any suggestions for resolving this issue?
Hi,
I noticed that in any video where the person doesn't move, WHAM's pelvis still moves, which makes the stationary person drift down/up and sometimes left/right.
The problem is with the pelvis bone, which can't recognize that the person is stationary and not moving.
A sample video is not required; any recording of someone moving only their hands, not their feet, will make this bug appear.
Thank you.
Thank you for the code and repo! When running on some custom videos, I get the following errors, but the example file works as expected:
```
Traceback (most recent call last):
  File "/home/WHAM/demo.py", line 145, in <module>
    run(cfg, args.video, output_pth, network, args.calib, visualize=args.visualize)
  File "/home/WHAM/demo.py", line 107, in run
    run_vis_on_demo(cfg, video, results, output_pth, network.smpl, vis_global=True)
  File "/home/WHAM/lib/vis/run_vis.py", line 43, in run_vis_on_demo
    renderer.set_ground(scale, cx.item(), cz.item())
  File "/home/WHAM/lib/vis/renderer.py", line 171, in set_ground
    v, f, vc, fc = map(torch.from_numpy, checkerboard_geometry(length=length, c1=center_x, c2=center_z, up="y"))
  File "/home/WHAM/lib/vis/tools.py", line 207, in checkerboard_geometry
    vertices = np.concatenate(vertices, axis=0).astype(np.float32)
  File "<__array_function__ internals>", line 180, in concatenate
ValueError: need at least one array to concatenate
```
Dear authors, would you be willing to upload the implementation of RTE, ROE, and ERVE for testing the global trajectory?
I have already checked many repos and papers that work with SMPL fitting, and I didn't get appropriate results for hand/arm/forearm tracking at extreme negative angles. Here are a few screenshots (left: 4D Humans, middle: WHAM without the new flag, right: WHAM with the 'run_smplify' flag). Is there any way to improve this or fine-tune the model?
Nice work! I am wondering whether the method can support reconstructing world-frame 3D SMPL-X motions?
Thank you!
I saw there was another issue with the tip to adapt the code from VIBE; I rewrote that code and put it in a .blend file to make it easier to use.
You just need to install joblib in Blender (there is a tab with the installation script), add the male SMPL FBX to the same folder as the .blend file, and set the path to the .pkl file.
Hope it's useful.
loading_WHAM_import_SMPL_FINAL.zip
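For anyone who prefers not to use the .blend file, here is a rough sketch of the same loading step; the pip/ensurepip calls and the .pkl path are illustrative, not the zip's exact code:

```python
# Run inside Blender's scripting tab: install joblib into Blender's bundled
# Python, then load the WHAM output .pkl.
import subprocess, sys
subprocess.check_call([sys.executable, "-m", "ensurepip"])
subprocess.check_call([sys.executable, "-m", "pip", "install", "joblib"])

import joblib
results = joblib.load("/path/to/wham_output.pkl")  # hypothetical path
print(results[0].keys())  # poses_body, betas, trans_world, ... (see above)
```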