HowieMa / NSRMhand
[WACV 2020] "Nonparametric Structure Regularization Machine for 2D Hand Pose Estimation"
Hi, I have a question about the tightest hand bounding box. What is your definition of it? Is it the tightest box enclosing all hand joints, or is that box expanded by some margin?
Thanks.
Hello, thanks for releasing the code. I really appreciate your work.
I have a question: when I train the code, the LM loss descends gradually at the beginning, but it starts to ascend after the 50th epoch. Have you ever met this problem? I look forward to your reply.
Hello, could you provide your pretrained models?
It looks like the augmentation code is missing, and no augmentation is done in the data loader.
@HowieMa Can you provide the OneHand 10k dataset? Thank you very much!
hi, @HowieMa
Thanks for sharing your code; I am very interested in your ideas. You said in README.md that NSRM can be used with HRNet, but that network has no intermediate supervision, so how can a cascaded multi-task architecture learn the hand structure and keypoint representations jointly? Could you give some advice? Looking forward to your reply.
Hello and thank you for making the code available. I greatly appreciate your efforts.
According to your paper, the NSRMhand model focuses only on hand pose estimation. May I ask whether the proposed method is capable of both hand detection and hand landmark localization? In other words, does the model need the hand bounding-box coordinates, or can it predict the bounding boxes on its own? Furthermore, does the model always assume there is a hand in the input image?
Thank you in advance!
Hi, thanks for your wonderful work. I want to ask whether the model can predict two hands from a single RGB picture.
Running inference.py on Ubuntu, I encountered a segfault on line 117:
state_dict = torch.load(args.resume)
Would you mind suggesting any possible causes?
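A segfault inside torch.load is often caused by a checkpoint that was saved on a GPU being deserialized on a machine without a matching CUDA setup, or by a partially downloaded/corrupted checkpoint file. A minimal sketch of a more defensive load (the function name and path are illustrative, not from this repo):

```python
import torch

def load_state_dict_safely(path):
    # map_location='cpu' forces every tensor onto the CPU during
    # deserialization, avoiding CUDA device mismatches on machines
    # without a (compatible) GPU.
    state_dict = torch.load(path, map_location='cpu')
    return state_dict
```

If this still crashes, re-downloading the checkpoint is worth trying, since a truncated file can also break deserialization.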
Hi, thank you for your great work. Have you tested the MediaPipe hand-tracking demo? It works very well. I analyzed the MediaPipe hand-tracking model; it is very simple, but I can't match its precision. Can you give me some ideas about this? Thanks.
Hi @HowieMa
Thanks for your great work. I was confused about how you obtained the 21-keypoint annotations after cropping the original image. Was it manual annotation, or some trick that transforms the original annotations of the CMU dataset? I want to train on other datasets, so I would like to know how you preprocessed this one. Looking forward to your reply.
Hi,
Thank you for such an amazing project and repository!
I was wondering if you could provide the preprocessing file for the CMU Panoptic dataset, as I want to learn more about its preprocessing steps and the heatmap generation in your code.
Hoping for a positive response!
Is only the Panoptic Hand dataset needed to train the model? The paper says OneHand 10K is also needed, but we have not figured out how to configure it.
Thanks for the nice project. Where can I download the training data? I am looking forward to your reply.
@HowieMa
I don't know how to run the program on the OneHand10K dataset. Could you describe the workflow? Thank you very much!
That is, there is a cmuhand.py; could you provide onehand.py and the other relevant files?
Thanks for the nice project. I have a question about PCK accuracy. Using your project on the Panoptic data, the training step reports "Current Best EPOCH is : 32, PCK is : 0.9577505546445453", but the test step reports:
"0.04": 0.5353691038114515,
"0.06": 0.5957357993770676,
"0.08": 0.6186944096586716,
"0.1": 0.6288090421603569,
"0.12": 0.6367080884950065
Testing on some pictures, the results are quite poor. Hoping for your answer, thanks.
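For reference, PCK (Percentage of Correct Keypoints) at the thresholds listed above can be sketched as follows. This is a generic illustration, not the repo's exact evaluation code; in particular, the normalization by a bounding-box size is an assumption, and the repo may normalize differently:

```python
import numpy as np

def pck(pred, gt, bbox_size, thresholds=(0.04, 0.06, 0.08, 0.1, 0.12)):
    # pred, gt: (N, 21, 2) arrays of predicted / ground-truth keypoints.
    # Euclidean error per keypoint, normalized by the hand box size.
    dist = np.linalg.norm(pred - gt, axis=-1) / bbox_size
    # Fraction of keypoints whose normalized error is within each threshold.
    return {t: float((dist <= t).mean()) for t in thresholds}
```

A large gap between training and test PCK, as reported here, usually points to either a train/test mismatch (different normalization or cropping) or overfitting.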
It says the ground truth of the limb representations for missing keypoints is set to zero maps, so the limb structure at those keypoints will be missing too. Will this affect the subsequent pose-module regression? How can we get the better and more stable hand keypoints you describe?
When generating the limb structure, we use:
mask1 = cross <= 0       # 46 * 46; pixels whose projection falls before P1
mask2 = cross >= length2  # pixels whose projection falls beyond P2
mask3 = 1 - (mask1 | mask2)  # pixels projecting onto the segment itself
# note: the parentheses are required, since `1 - mask1 | mask2`
# parses as `(1 - mask1) | mask2`
D2 = np.zeros((self.label_size, self.label_size))
D2 += mask1.astype('float32') * ((x - x1) * (x - x1) + (y - y1) * (y - y1))  # squared distance to P1
D2 += mask2.astype('float32') * ((x - x2) * (x - x2) + (y - y2) * (y - y2))  # squared distance to P2
D2 += mask3.astype('float32') * ((x - px) * (x - px) + (py - y) * (py - y))  # squared distance to the projection point
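Put together, the snippet computes a squared point-to-segment distance map over the label grid. A self-contained sketch, keeping the snippet's names (label_size, the endpoints (x1, y1)/(x2, y2), cross, length2) and filling in the surrounding definitions, which are assumptions on my part:

```python
import numpy as np

def segment_distance_map(x1, y1, x2, y2, label_size=46):
    # Pixel coordinate grids (row = y, column = x).
    y, x = np.mgrid[0:label_size, 0:label_size].astype('float32')
    dx, dy = x2 - x1, y2 - y1
    length2 = dx * dx + dy * dy                # squared segment length
    cross = (x - x1) * dx + (y - y1) * dy      # dot((P - P1), (P2 - P1))
    # Projection of each pixel onto the infinite line through P1 and P2.
    t = cross / max(length2, 1e-8)
    px, py = x1 + t * dx, y1 + t * dy
    mask1 = cross <= 0                         # closest to endpoint P1
    mask2 = cross >= length2                   # closest to endpoint P2
    mask3 = ~(mask1 | mask2)                   # closest to the segment interior
    D2 = np.zeros((label_size, label_size), dtype='float32')
    D2 += mask1 * ((x - x1) ** 2 + (y - y1) ** 2)
    D2 += mask2 * ((x - x2) ** 2 + (y - y2) ** 2)
    D2 += mask3 * ((x - px) ** 2 + (y - py) ** 2)
    return D2
```

Thresholding D2 against a squared width then yields the binary limb mask for one bone.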
Thank you very much for your work. I have a problem: if nn.MSELoss() is used as the loss function and the heatmap output by the network is compared directly with the generated heatmap, the loss value becomes abnormally large. How should this be handled?
I found that your code is:
criterion = nn.MSELoss(reduction='sum')
loss = criterion(pred, target)
return loss / (pred.shape[0] * 46.0 * 46.0)
Do you normalize the loss this way to avoid excessively large values?
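The quoted normalization divides the summed squared error by batch size times the 46x46 heatmap area (but not by the number of channels), which keeps the loss magnitude independent of batch size and map resolution. A minimal sketch, assuming heatmaps shaped (batch, channels, 46, 46):

```python
import torch
import torch.nn as nn

def heatmap_mse(pred, target):
    # Sum-reduced MSE over all elements, as in the quoted code.
    criterion = nn.MSELoss(reduction='sum')
    loss = criterion(pred, target)
    # Normalize by batch * H * W; the per-channel sum is kept, so this
    # equals the per-pixel mean scaled by the channel count.
    return loss / (pred.shape[0] * 46.0 * 46.0)
```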
Is the model in the ./CPM folder used anywhere in the training process of the limb model? There seems to be quite a bit of duplicated code in ./CPM, as well as in other parts of the code base such as main.py and hand_ldm.py, and I got confused about which files are actually used in training.
In the paper, the network layout is defined as 3 structure stages and 3 keypoint stages, while in the code the 6 stages are divided into 1 structure stage and 5 keypoint stages. Is there an implementation reason for this difference, or is it merely a typo?