
patrick-swk / d3dp


[ICCV2023] The PyTorch implementation for "Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis Aggregation"

License: MIT License

MATLAB 3.69% Python 96.31%


d3dp's Issues

A simple operational issue, a newbie seeks help

Hello, sorry to bother you.
I have a potentially basic question. After downloading the code and running it locally, the imports `from common.humaneva_dataset import HumanEvaDataset` and `from common.custom_dataset import CustomizaDataset` fail because those two files do not exist, and I could not find them anywhere in your project. What is going on? Do I need to download them separately, and if so, what do I need to do?

Looking forward to your reply, thank you! (GPU: 2080 Ti)

Clarification for using in the wild

Hello,
We are preparing to use your model in production:

  1. What are the input and output formats of the in_the_wild_best_epoch.bin model? We need enough detail to integrate it with our code. We have to output results as a CSV with column headers (e.g. 'nose x', 'nose y', ...), so we need to map tensor indices to particular keypoints.
  2. Why is video-to-pose3D needed? Can you briefly explain what it provides and why it is done this way, as opposed to just using your own serving code?
  3. Are there any known issues with compressing the model via TorchScript for serving?

Note: we are using ViTPose as the 2D detector, as the others do not perform well enough for our needs.

Thank you for your help and for creating this project
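Regarding point 1, a minimal sketch of flattening a (frames, 17, 3) pose tensor into a CSV with per-keypoint column headers. The joint names below follow the common Human3.6M 17-joint convention and are an assumption; the actual index order used by this checkpoint would need to be confirmed against the repo's skeleton definition.

```python
import csv
import io

# Assumed Human3.6M 17-joint order -- verify against the repo's
# skeleton definition before relying on these names.
H36M_JOINTS = [
    "hip", "right hip", "right knee", "right ankle",
    "left hip", "left knee", "left ankle", "spine",
    "thorax", "neck", "head", "left shoulder",
    "left elbow", "left wrist", "right shoulder",
    "right elbow", "right wrist",
]

def poses_to_csv(poses, joints=H36M_JOINTS):
    """poses: nested list/array of shape (frames, 17, 3) -> CSV string."""
    header = [f"{name} {axis}" for name in joints for axis in ("x", "y", "z")]
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    for frame in poses:
        # each frame of shape (17, 3) becomes one row of 51 values
        writer.writerow([coord for joint in frame for coord in joint])
    return buf.getvalue()
```

Each frame then maps to one CSV row of 51 values, in the same joint order as the header.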

training evaluation CUDA out of memory

Hi, when I train from scratch on a single 3090 GPU, I get:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 23.11 GiB (GPU 0; 23.70 GiB total capacity; 1.54 GiB already allocated; 20.83 GiB free; 1.87 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

The training process itself is completely fine, but after training, during the evaluation step, this CUDA memory error occurs.
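A common workaround when only evaluation runs out of memory is to split the evaluation batch into smaller chunks and concatenate the results (and, in PyTorch, wrap the forward passes in torch.no_grad() so no activations are kept). A framework-agnostic sketch with NumPy, where `model` stands in for any per-chunk callable:

```python
import numpy as np

def evaluate_in_chunks(model, inputs, chunk_size=64):
    """Run `model` over `inputs` in chunks along the batch axis and
    concatenate the results, keeping peak memory proportional to
    `chunk_size` rather than to the full batch."""
    outputs = []
    for start in range(0, len(inputs), chunk_size):
        outputs.append(model(inputs[start:start + chunk_size]))
    return np.concatenate(outputs, axis=0)
```

In a PyTorch script one would additionally wrap the loop body in `with torch.no_grad():` and move each chunk to the GPU one at a time.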

Some codes can't be found

Thanks for your great work! I want to ask when you can release an updated version; the current version does not work in the wild.

About the best results

I tried to train your model using the 2D keypoints obtained by CPN as input, but I cannot reproduce the results of your best epoch. I trained on a single 3090 and used 1 as the seed.
Did you use any different training procedure, or different hyperparameters?
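For what it's worth, exact reproduction usually depends on seeding every RNG in play, not just passing a seed flag. A sketch of the usual recipe (the standard-library and NumPy parts are runnable here; the torch calls mentioned in the comment are the conventional additions in a PyTorch training script):

```python
import random
import numpy as np

def seed_everything(seed=1):
    """Seed the Python and NumPy RNGs. In a PyTorch script one would
    additionally call torch.manual_seed(seed) and
    torch.cuda.manual_seed_all(seed), and may need
    torch.backends.cudnn.deterministic = True for bitwise repeatability."""
    random.seed(seed)
    np.random.seed(seed)

seed_everything(1)
a = np.random.rand(3)
seed_everything(1)
b = np.random.rand(3)
# identical draws after re-seeding
assert np.array_equal(a, b)
```

Even with identical seeds, nondeterministic CUDA kernels can still cause small run-to-run differences.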

training loss

Hi, could you provide the script that uses noise prediction (with the corresponding reverse process) as the training loss? Thanks.
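This is not the authors' script, but the standard DDPM noise-prediction objective is straightforward to sketch: sample a timestep t, diffuse the clean pose x0 to x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, and regress the network output against eps with MSE. A NumPy sketch with a placeholder predictor (the schedule values are illustrative, not the repo's):

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear beta schedule and cumulative alpha-bar, as in DDPM.
T = 1000
betas = np.linspace(1e-4, 2e-2, T)
alpha_bar = np.cumprod(1.0 - betas)

def noise_prediction_loss(model, x0):
    """MSE between the injected noise and the model's noise estimate."""
    t = int(rng.integers(0, T))
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    eps_hat = model(x_t, t)   # the network predicts the noise
    return float(np.mean((eps_hat - eps) ** 2))
```

A perfect oracle returning eps would give zero loss; a zero predictor gives roughly E[eps^2], i.e. about 1 for standard normal noise.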

Question about the paper

Hi, thanks for your great work!
I have a question about the paper. In Section 2.2, you write:

Note that concurrent methods [20, 16, 12] also use diffusion models for this task, but they only report the upper bound of performance, which is not available in real-world applications.

For example, I cannot find any mention of an 'upper bound' in the evaluation procedure of [16].
Hoping for your reply, thanks!

About the reverse process

Thanks for your interesting work~

I would like to know some details about the reverse process. Why did you consider applying the one-step solution for the reverse process? In my opinion, the multi-step method can produce higher quality results for generation tasks.
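For context on the trade-off being asked about: in a DDIM-style sampler, each reverse step first forms the estimate x0_hat = (x_t - sqrt(1 - abar_t) * eps_hat) / sqrt(abar_t), and a "one-step" solution simply takes x0_hat as the output instead of re-noising toward the next timestep. A NumPy sketch of both (deterministic DDIM, eta = 0; the schedule and predictor are placeholders, not the repo's code):

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 2e-2, T)
alpha_bar = np.cumprod(1.0 - betas)

def ddim_step(x_t, t, t_prev, eps_hat):
    """Deterministic DDIM update (eta = 0) from timestep t to t_prev."""
    ab_t, ab_prev = alpha_bar[t], alpha_bar[t_prev]
    x0_hat = (x_t - np.sqrt(1.0 - ab_t) * eps_hat) / np.sqrt(ab_t)
    return np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps_hat

def one_step_solution(x_t, t, eps_hat):
    """'One-step' reverse process: jump straight to the x0 estimate."""
    ab_t = alpha_bar[t]
    return (x_t - np.sqrt(1.0 - ab_t) * eps_hat) / np.sqrt(ab_t)
```

One plausible intuition: for pose estimation the target distribution conditioned on 2D keypoints is much narrower than in open-ended image generation, so a direct x0 estimate can be competitive with multi-step refinement.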

where is the model.py file??

Hi!
First of all, thank you for your work. I am following your instructions, but I got stuck at one step.
When I run videopose_diffusion.py, I get an error because a model.py file cannot be found.

  1. Put other files in ./in_the_wild folder to the ./common folder of their repo.

I also wonder what this instruction means: do I have to copy that folder too?

Inference on in the wild videos

Hello,
I was looking into the inference part and noted that there were multiple skeletons inferenced, the exact shape was (5,5,no of frames,17,3). Now i looked into the other issue in which you told about the part of code in main.py which solves this problem, but i was unable to find the suitable code which is basically aggregating all the skeleton. Can you please guide me.

Video handling for video frames less than receptive field

Thanks @paTRICK-swk for the amazing work. When I run in-the-wild videos with fewer frames than the receptive field (243), the 2D and 3D predictions go out of sync. I found that you handle this case by duplicating the last frame, and this is what causes the desync. Could you please guide me on how I should handle this?
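One common workaround, sketched here as an assumption about how the pipeline could be patched rather than as the repo's intended fix: pad the 2D sequence up to the receptive field by repeating the last frame, run the model, then trim the 3D output back to the original frame count so it stays aligned with the 2D input:

```python
import numpy as np

RECEPTIVE_FIELD = 243  # number of frames the model expects

def pad_run_trim(model, keypoints_2d):
    """keypoints_2d: (frames, 17, 2). Pads to the receptive field by
    repeating the last frame, runs `model`, and trims the 3D output
    back to the original length so 2D and 3D stay in sync."""
    n = keypoints_2d.shape[0]
    if n < RECEPTIVE_FIELD:
        pad = np.repeat(keypoints_2d[-1:], RECEPTIVE_FIELD - n, axis=0)
        keypoints_2d = np.concatenate([keypoints_2d, pad], axis=0)
    poses_3d = model(keypoints_2d)
    return poses_3d[:n]  # drop predictions for the duplicated frames
```

If the desync comes from the 3D output keeping the padded frames, trimming like this restores one 3D pose per original 2D frame.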

bug for rearrange

When I run "python main_draw.py -k cpn_ft_h36m_dbb -b 2 -c checkpoint -gpu 0 --nolog --evaluate h36m_best_epoch.bin -num_proposals 5 -sampling_timesteps 5 --render --viz-subject S11 --viz-action SittingDown --viz-camera 1", there is a bug at inputs_2d_p = rearrange(inputs_2d_p, 'b f c -> f c b'): the pattern expects three axes, but I find that inputs_2d_p is torch.Size([2, 2356, 17, 2]).
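For reference, the pattern 'b f c -> f c b' assumes a 3-axis tensor, while the tensor here has four axes (batch, frames, joints, coords), so the pattern needs a joint axis, e.g. 'b f j c -> f j c b'. The equivalent permutation, sketched with NumPy (the intended output layout is an assumption; check what the drawing code downstream expects):

```python
import numpy as np

x = np.zeros((2, 2356, 17, 2))     # (batch, frames, joints, coords)

# The 3-axis pattern 'b f c -> f c b' cannot match a 4-axis tensor; the
# joint axis must appear in the pattern. With einops this would be
# rearrange(x, 'b f j c -> f j c b'); the pure-NumPy equivalent is:
y = np.transpose(x, (1, 2, 3, 0))  # -> (frames, joints, coords, batch)
```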

MPI-INF-3DHP Generalization Better than Human3.6M?

Hi @paTRICK-swk ,

Thanks for your great work and public contribution. May I ask: it seems the 3DHP (GT 2D) result of 28.1 is even better than the 35.4 on Human3.6M (Det 2D), even though the model is trained on Human3.6M (K differs, but I don't think that changes the picture). I would then guess the performance on Human3.6M (GT) could be below 20.

Any elaboration would be appreciated.:)

Thanks & regards,

Question about Reimplementation of MixSTE

I noticed in your paper that you report results for MixSTE replicated on your machine. I wonder whether you changed anything in the source code released by MixSTE. I am also confused about the initial loss-weight hyperparameters used for MixSTE in your replication.
