
hand-graph-cnn's Introduction

3D Hand Shape and Pose Estimation from a Single RGB Image

Open-source code for our CVPR 2019 paper "3D Hand Shape and Pose Estimation from a Single RGB Image".

prediction example

Introduction

This work is based on our CVPR 2019 paper. You can also check our project webpage and supplementary video for a deeper introduction.

This work addresses a novel and challenging problem: estimating the full 3D hand shape and pose from a single RGB image. Most current methods for 3D hand analysis from monocular RGB images focus only on estimating the 3D locations of hand keypoints, which cannot fully express the 3D shape of the hand. In contrast, we propose a Graph Convolutional Neural Network (Graph CNN) based method to reconstruct a full 3D mesh of the hand surface that contains richer information about both 3D hand shape and pose. To train the networks with full supervision, we create a large-scale synthetic dataset containing both ground-truth 3D meshes and 3D poses. When fine-tuning the networks on real-world datasets without 3D ground truth, we propose a weakly-supervised approach that leverages the depth map as weak supervision during training. Through extensive evaluations on our proposed new datasets and two public datasets, we show that our method can produce accurate and reasonable 3D hand meshes, and can achieve superior 3D hand pose estimation accuracy compared with state-of-the-art methods.

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{ge2019handshapepose,
  title={3D Hand Shape and Pose Estimation from a Single RGB Image},
  author={Ge, Liuhao and Ren, Zhou and Li, Yuncheng and Xue, Zehao and Wang, Yingying and Cai, Jianfei and Yuan, Junsong},
  booktitle={CVPR},
  year={2019}
}

Installation

  1. Install PyTorch >= v0.4.0 following the official instructions.
  2. Clone this repo; we'll call the directory that you cloned ${HAND_ROOT}.
  3. Install dependencies:
    pip install -r requirements.txt
    

Running the code

  1. Evaluate on our real-world dataset and visualize the resulting hand meshes and poses.

    python eval_script.py --config-file "configs/eval_real_world_testset.yaml"
    

    The visualization results will be saved to ${HAND_ROOT}/output/configs/eval_real_world_testset.yaml/

  2. Evaluate on STB dataset.

    Download the STB dataset to ${HAND_ROOT}/data/STB.

    Run the following script:

    python eval_script.py --config-file "configs/eval_STB_dataset.yaml"
    

    The pose estimation results will be saved to ${HAND_ROOT}/output/configs/eval_STB_dataset.yaml/pose_estimations.mat

3D hand shape and pose dataset

We release the 3D hand shape and pose dataset. It contains a large-scale synthetic image dataset for training and validation, and a small real-world image dataset for testing. For details, please go to the data folder in this repository.


hand-graph-cnn's Issues

How to write to obj file

Hi Author,
I finished testing on my own image, and I have a question: how do I write the hand mesh to an OBJ file given the (8, 954, 3) vertices? How do I generate vt and f? Can you help me? Thanks.
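
For reference, here is a minimal sketch of an OBJ writer, assuming verts is one (954, 3) slice of the predicted batch and faces is the triangle index array of the repo's hand mesh template (both names are assumptions, not the repo's own variables). A plain mesh needs only v and f lines; vt texture coordinates are optional and can be omitted.

    import numpy as np

    def save_obj(path, verts, faces):
        # verts: (N, 3) float array of xyz vertex positions.
        # faces: (M, 3) int array of 0-based vertex indices per triangle.
        with open(path, "w") as f:
            for v in verts:
                f.write("v {:.6f} {:.6f} {:.6f}\n".format(v[0], v[1], v[2]))
            for tri in faces:
                # OBJ face indices are 1-based.
                f.write("f {} {} {}\n".format(tri[0] + 1, tri[1] + 1, tri[2] + 1))

    # Hypothetical usage: write the first mesh of the (8, 954, 3) batch.
    # save_obj("hand_0.obj", est_mesh_cam_xyz[0], mesh_faces)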

About training data

Hello ~

Thank you for your great work!

Could you provide me with data for training?

model is overfitted?

Hi @3d-hand-shape ,
First of all , thanks for your great work!
However, I've got a problem using your model. I managed to change the code to test my own data, and I tried with and without a hand detector; the results are really bad. Here I am sharing two outputs: the results on your data are good, but on other images they are bad.
Please help,

Question on figure 9

Hi @3d-hand-shape

Thanks in advance,
Would you please help me obtain Figure 9 from your paper?
For plotting this curve, do you use a template? How did you get the results from the other works?
Is it possible to provide access to your code for this part of the evaluation, for a fair comparison?
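
For what it's worth, curves like this are usually 3D PCK plotted over a range of error thresholds. Below is a minimal sketch, assuming errors_mm holds per-joint Euclidean errors in millimeters (the variable name and threshold range are assumptions; curves for other works are typically taken from their released result files or digitized from their papers).

    import numpy as np
    import matplotlib.pyplot as plt

    def pck_curve(errors_mm, thresholds_mm):
        # Fraction of joints whose error falls under each threshold.
        errors_mm = np.asarray(errors_mm).ravel()
        return np.array([(errors_mm <= t).mean() for t in thresholds_mm])

    thresholds = np.linspace(20, 50, 31)  # a typical 3D PCK range for hands
    # plt.plot(thresholds, pck_curve(errors_mm, thresholds), label="ours")
    # plt.xlabel("error threshold (mm)"); plt.ylabel("3D PCK"); plt.legend(); plt.show()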

evaluating on RHD

Hi @geliuhao ,

Regarding the reported AUC on the RHD dataset: how do you treat both hands in one image? Did you consider both available hands in an image for training and testing, or did you choose the more dominant hand in each image?

Windows 10 , Opendr

I'm trying this on Windows 10 using Git.
But I've faced a problem at the step of running 'python eval_script.py --config-file "configs/eval_real_world_testset.yaml"'.
The error message indicates that it occurs at 'import opendr.camera'.
So I've tried installing it with 'pip install' and 'conda install', and I've tried the opendr GitHub, but I didn't find a solution.
Could you give me some help? What should I do?

I'm also wondering whether I can use this with a webcam or with my other video data!

error results

image
I ran 'python eval_script.py --config-file "configs/eval_real_world_testset.yaml"' and got these results (see the image above). What am I doing wrong?

Question on STB preprocessing

Hi @raingo ,
Thanks for your great paper,
Thank you so much in advance

As mentioned in the STB readme, we have the GT coordinates in the world coordinate system, right? And the provided rotation & translation vectors are for transferring coordinates from the depth camera to the color camera, right? So why do you use them for transforming from world coordinates?

Would you please give me a reference for writing the SK_rot_mx function, which should be Rodrigues' rotation formula?

Second, when you transfer the network estimation back to camera coordinates, why have you used just the depth of the root, and not all coordinates of the root (the uvd2xyz function in image_util.py)?
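
On the Rodrigues part: the standard formula is R = I + sin(theta) K + (1 - cos(theta)) K^2, where theta is the norm of the rotation vector and K is the skew-symmetric cross-product matrix of its unit axis. Below is a minimal sketch of what a function like SK_rot_mx would compute under that assumption (cv2.Rodrigues implements the same mapping); this is the textbook formula, not code from the repo.

    import numpy as np

    def rodrigues(rot_vec):
        # Rotation matrix from an axis-angle vector via Rodrigues' formula.
        theta = np.linalg.norm(rot_vec)
        if theta < 1e-8:
            return np.eye(3)
        k = rot_vec / theta  # unit rotation axis
        K = np.array([[0.0, -k[2], k[1]],
                      [k[2], 0.0, -k[0]],
                      [-k[1], k[0], 0.0]])  # cross-product matrix of k
        return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)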

About dataset

Hello ~

Nice work!

Could you share the training data with me?

Training Code

Hi, I was wondering whether you might consider releasing the training code. It could increase your work's citations (since more people would base their research on yours) and would contribute to the community. Thank you!

Error - 'ColoredRenderer' object has no attribute 'vbo_verts_face'

When running the script on real world data as per example I get the following error.

2019-07-16 16:13:25,003 hand_shape_pose_inference INFO: Evaluate: [8/583]	Average pose estimation error: 10.21 (mm)
2019-07-16 16:13:25,003 hand_shape_pose_inference INFO: Saving image: ./output\configs/eval_real_world_testset.yaml\pred_0.jpg
Traceback (most recent call last):
  File "/hand-graph-cnn/eval_script.py", line 111, in <module>
    main()
  File "/hand-graph-cnn/eval_script.py", line 99, in main
    est_pose_cam_xyz, file_name)
  File "\hand-graph-cnn\hand_shape_pose\util\vis.py", line 176, in save_batch_image_with_mesh_joints
    rend_img_overlay, rend_img_vp1, rend_img_vp2 = draw_mesh(mesh_renderer, image, cam_param, box, mesh_xyz)
  File "\hand-graph-cnn\hand_shape_pose\util\vis.py", line 55, in draw_mesh
    rend_img_overlay = mesh_renderer(mesh_xyz, cam=cam_for_render, img=image, do_alpha=True)
  File "\hand-graph-cnn\hand_shape_pose\util\renderer.py", line 79, in __call__
    color_id=color_id)
  File "\hand-graph-cnn\hand_shape_pose\util\renderer.py", line 225, in render_model
    imtmp = simple_renderer(rn, verts, faces, color=color)
  File "\hand-graph-cnn\hand_shape_pose\util\renderer.py", line 179, in simple_renderer
    return rn.r
  File "\lib\chumpy\ch.py", line 555, in r
    self._call_on_changed()
  File "\lib\chumpy\ch.py", line 550, in _call_on_changed
    self.on_changed(self._dirty_vars)
  File "\lib\opendr\renderer.py", line 1080, in on_changed
    self.vbo_verts_face.set_array(np.array(self.verts_by_face).astype(np.float32))
AttributeError: 'ColoredRenderer' object has no attribute 'vbo_verts_face'

Process finished with exit code 1

I found online a solution for this problem described here: polmorenoc/opendr#1

and added the following lines in util\renderer.py, just before "return rn.r" in def simple_renderer(...):

    flipXRotation = np.array([[1.0, 0.0, 0.0, 0.0],
                              [0.0, -1.0, 0., 0.0],
                              [0.0, 0., -1.0, 0.0],
                              [0.0, 0.0, 0.0, 1.0]])
    rn.camera.openglMat = flipXRotation  # this is from setupcamera in utils
    rn.glMode = 'glfw'
    rn.sharedWin = None
    rn.overdraw = True
    rn.nsamples = 8
    rn.msaa = True  # Without anti-aliasing optimization often does not work.
    rn.initGL()

Now I get a crash in "draw_color_image()" in opendr\renderer saying I am missing the "visibility_image" variable.
This is where I got stuck; can you please advise?

thank you

pose_root

Thanks for your amazing idea. I trained a YOLOv4 model to detect hands and want to run your project with a USB camera, but I don't know how to feed a value for pose_root. Can you explain how to calculate this value?

Data Augmentation

Hi @3d-hand-shape
Thanks in advance for your helpful comments
Did you use data augmentation (random rotation/translation of the image and 3D labels) during training? I found no specific augmentation in the data loader provided for evaluation; is this data loader used for training too?

How could I calculate pose_root?

Hi, I am trying to run your pre-trained model on wild images. I noticed that cam_params & pose_roots are read from params.mat. May I know how you calculate the pose_root?

Best
Tian
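
As far as can be told from the evaluation code, pose_root is the absolute camera-space location of the root (wrist) joint, used to place the root-relative network output in camera coordinates. For wild images you would need the wrist's pixel location and some source of its depth (a depth sensor, or an assumed camera-to-hand distance, which then only fixes scale and translation). A minimal pinhole back-projection sketch follows; all names here are hypothetical, not the repo's own:

    import numpy as np

    def back_project(u, v, depth, fx, fy, cx, cy):
        # Pinhole camera model: pixel (u, v) at a given depth -> camera xyz.
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        return np.array([x, y, depth])

    # Hypothetical usage: wrist pixel from a 2D detector, depth from a sensor or a guess.
    # pose_root = back_project(wrist_u, wrist_v, wrist_depth, fx, fy, cx, cy)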

where is the part of inverse kinematics?

In your paper, you said you can get the joint rotations from the joint locations by solving an inverse kinematics problem. Is there any corresponding part in your code?

ValueError: unsupported pickle protocol: 3

I got this error, and I think it's because of a different Python version.
I ran the inference code, eval_script.py, with Python 2.


Traceback (most recent call last):
  File "eval_script.py", line 111, in <module>
    main()
  File "eval_script.py", line 52, in main
    model = ShapePoseNetwork(cfg, output_dir)
  File "/home/ubuntu/3d_pose_estimation/hand-graph-cnn/hand_shape_pose/model/shape_pose_network.py", line 41, in __init__
    build_hand_graph(cfg.GRAPH.TEMPLATE_PATH, output_dir)
  File "/home/ubuntu/3d_pose_estimation/hand-graph-cnn/hand_shape_pose/util/graph_util.py", line 154, in build_hand_graph
    graph_dict = np.load(graph_dict_path, allow_pickle=True).item()
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/numpy/lib/npyio.py", line 447, in load
    pickle_kwargs=pickle_kwargs)
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/numpy/lib/format.py", line 701, in read_array
    array = pickle.load(fp, **pickle_kwargs)
ValueError: unsupported pickle protocol: 3

How can I fix it?
I'm looking forward to your reply.
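
For context: protocol 3 pickles can only be read by Python 3, so this .npy file (which contains a pickled dict) was written under Python 3 and cannot be loaded under Python 2. The simplest fix is to run the code under Python 3. Alternatively, here is a hedged sketch of re-saving the graph dict in a Python 2-readable protocol; the file path is hypothetical (the real one comes from cfg.GRAPH.TEMPLATE_PATH):

    # Run once under Python 3.
    import pickle
    import numpy as np

    graph_dict = np.load("graph_dict.npy", allow_pickle=True).item()  # hypothetical path
    with open("graph_dict_py2.pkl", "wb") as f:
        pickle.dump(graph_dict, f, protocol=2)  # protocols <= 2 are Python 2 readable

The Python 2 side would then need to load the new file with pickle.load instead of np.load.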

How to run on GPU

eval_real_world_testset.yaml
image

eval_script.py
image

real_world_testset.py
image

I changed the code as shown above, but I couldn't run the project.

I get the errors:

RuntimeError: CUDA error: out of memory 

or

Killed

image

Can you give me some help?

I'd be deeply grateful!

My machine information:

Tesla P100-PCIE
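
For reference, the generic PyTorch pattern for GPU inference is below; model and images stand in for the objects in eval_script.py (assumptions, not the repo's exact variables). "CUDA error: out of memory" usually means the evaluation batch is too large for the GPU, and "Killed" usually means host RAM was exhausted, so reducing the batch size is worth trying first.

    import torch

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device)    # move all parameters onto the GPU
    images = images.to(device)  # inputs must live on the same device as the model

    with torch.no_grad():       # inference only: skip storing activations
        output = model(images)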

error when python eval_script.py

hand-graph-cnn-master/hand_shape_pose/util/renderer.py", line 223
color = color_list[color_id % len(color_list)]
TypeError: 'dict_values' object does not support indexing
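
This is a Python 2/3 incompatibility: in Python 3, dict.values() returns a view that does not support indexing. A minimal sketch of the fix, applied where color_list is built or just before the indexing:

    # Python 3: dict.values() is a view, not a list, so materialize it first.
    color_list = list(color_list)
    color = color_list[color_id % len(color_list)]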

Error : "TypeError: forward() missing 1 required positional argument: 'x'"

I followed the instructions and ran the first command in my CMD: python eval_script.py --config-file "configs/eval_real_world_testset.yaml", and I got the following error.

Traceback (most recent call last):
  File "eval_script.py", line 111, in <module>
    main()
  File "eval_script.py", line 78, in main
    model(images, cam_params, bboxes, pose_roots, pose_scales)
  File "C:\Users\ROHIT SANJAY\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\ROHIT SANJAY\Desktop\hand-graph-cnn-master\hand_shape_pose\model\shape_pose_network.py", line 87, in forward
    est_hm_list, encoding = self.net_hm(images)
  File "C:\Users\ROHIT SANJAY\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'x'

any suggestions?

error when loading state_dict for Net_HM_HG

RuntimeError: Error(s) in loading state_dict for Net_HM_HG:
Missing key(s) in state_dict: "bn1.num_batches_tracked", "r1.bn.num_batches_tracked", "r1.bn1.num_batches_tracked", "r1.bn2.num_batches_tracked", "r4.bn.num_batches_tracked", "r4.bn1.num_batches_tracked", "r4.bn2.num_batches_tracked", "r5.bn.num_batches_tracked", "r5.bn1.num_batches_tracked", "r5.bn2.num_batches_tracked", "hourglass.0.low2.low2.low2.low2_.0.bn.num_batches_tracked", "hourglass.0.low2.low2.low2.low2_.0.bn1.num_batches_tracked", "hourglass.0.low2.low2.low2.low2_.0.bn2.num_batches_tracked", "hourglass.0.low2.low2.low2.low2_.1.bn.num_batches_tracked", "hourglass.0.low2.low2.low2.low2_.1.bn1.num_batches_tracked"

How could I fix it?
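
For context: the num_batches_tracked buffers were added to BatchNorm in PyTorch 0.4.1, so a checkpoint saved with 0.4.0 will not contain them. Loading with strict=False (a standard torch.nn.Module option) ignores the missing keys; a minimal sketch with a hypothetical checkpoint path:

    import torch

    state_dict = torch.load("net_hm.pth", map_location="cpu")  # hypothetical path
    model.load_state_dict(state_dict, strict=False)  # tolerate missing BN buffers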

Is the unit of measurement correct ?

Hi,
You mentioned that in your dataset the unit of measurement is centimeters. Is this the correct unit?

I believe it could be millimeters, due to the following observation.

Observation description

I convert the global locations to hand-joint depths. Then I take the relative depth from the wrist joint, i.e. I subtract the depth of joint i from the depth of the wrist:

x' = depth_wrist - x

Note: all units are converted from cm to mm.
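
In code, the computation described above amounts to the following minimal numpy sketch, assuming joints_cm is a (21, 3) camera-space array in centimeters with the wrist at index 0 (the index is an assumption):

    import numpy as np

    joints_mm = joints_cm * 10.0                    # cm -> mm
    rel_depth = joints_mm[0, 2] - joints_mm[:, 2]   # depth_wrist - depth_joint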

The visualisation plot is
image

The issues I am thinking of:

  • The difference in depth between a closed hand and the wrist is quite large (again, the units are converted from cm to mm).
  • Due to the above, I think the unit could be mm.

Could you please confirm ? I might be wrong.

Producing the same type of visualisation for the InterHand dataset gives different results. (For the sake of clarity: the unit in InterHand is given as mm, which I confirmed with the creator of the dataset.)

image

Further info:

  • The conversion of InterHand and of your dataset to depth format is the same: I only extract the intrinsic and extrinsic parameters and provide them to a defined API to convert to depth format.
  • I have considered the 3D scaling factor while resizing.

Model not rendered

Hello !
First of all, congratulations on your great work!
I tried to run eval_script.py with exactly the same command lines as in your tutorial, but the estimated shape is not rendered on the images saved by the script. I checked the output of the mesh-rendering step and got a bunch of 0s in the rendered images.
I'm wondering if this is because of incorrect requirement versions, but I'm not quite sure. So, what are the exact requirement versions for your code? It seems that your pkl files are saved with Python 3, while opendr 0.76 does not support installation with Python 3.

Grateful in advance for your response!

bbox for STB

Hi @3d-hand-shape

Is the bbox used for cropping the image the ground truth, or is it from a detector? And are the reported results in the paper based on the ground-truth hand area?

MLP network

For the full model you used to train the MLP network, is that the model fine-tuned on the STB dataset, or the model trained only on the synthetic dataset?
Thanks in advance.
