Home Page: https://signavatars.github.io/

Topics: human-pose-estimation, mano, motion-generation, sign-language, sign-language-datasets, smpl-x, smplx, vqvae

SignAvatars: A Large-scale 3D Sign Language Holistic Motion Dataset and Benchmark

Zhengdi Yu (1,2) · Shaoli Huang (2) · Yongkang Cheng (2) · Tolga Birdal (1)

(1) Imperial College London · (2) Tencent AI Lab



SignAvatars is the first large-scale 3D sign language holistic motion dataset with mesh annotations. It comprises 8.34M precise 3D whole-body SMPL-X annotations covering 70K motion sequences; a corresponding MANO hand version is also provided.

News 🚩

  • [2023/11/2] Paper is now available. ⭐

TODO

  • Initial release of annotations.
  • Release the visualization code.
  • Release videos, pending agreement from the video owners.
  • Enrich the dataset.

Application examples on SLP

[Blender renderings: SLP from HamNoSys · SLP from Word · SLP from ASL · SLP from GSL]

Instructions 📜

Dataset description

Dataset download

For annotations, please fill out this form to request access to SignAvatars for non-commercial research purposes. By submitting the form, you confirm that you have read and agree to the terms of the Data license. You will then receive an email with links to download the motion and text labels.

We do not distribute the original RGB videos due to licensing restrictions; we provide only the high-quality 3D motion labels annotated by our team. To download the original videos for the four subsets, please follow the instructions below:

  1. For the ASL subset, download the Green Screen RGB clips from the how2sign dataset and place them in language2motion/.
  2. For the HamNoSys subset, download the original videos listed in the downloaded HamNoSys/data.json (see the sketch after this list).
  3. For the GSL subset, follow the official instructions to download the videos and place them in language2motion/.
  4. For the Word subset, follow the official instructions to download the videos and place them in word2motion/.
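
Since the exact schema of data.json is not documented here, the following is only a minimal download sketch; the dict layout and the "url" key are assumptions that should be adjusted to match the actual file:

import json
import os
import urllib.request

# Fetch the HamNoSys source videos listed in data.json.
with open("dataset/hamnosys2motion/data.json") as f:
    entries = json.load(f)

os.makedirs("dataset/hamnosys2motion/videos", exist_ok=True)
for name, entry in entries.items():   # assumed: dict keyed by video name
    url = entry["url"]                # assumed: key holding the source URL
    out_path = os.path.join("dataset/hamnosys2motion/videos", f"{name}.mp4")
    if not os.path.exists(out_path):
        urllib.request.urlretrieve(url, out_path)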

Dataset Structure

After downloading the data, please construct the layout of dataset/ as follows:

|-- dataset
|   |-- hamnosys2motion/  
|   |   |-- images/
|   |   |   |-- <video_name>/
|   |   |   |   |-- <frame_number.jpg>   [ starts from 000000.jpg ]
|   |   |-- videos/
|   |   |   |-- <video_name>/  [ ..... ]   
|   |   |-- annotations/
|   |   |   |-- <annotation_type>  [ SMPL-X, MANO, ...]
|   |   |   |   |-- <video_name.pkl>
|   |   |-- data.json  [Text annotations]
|   |   |-- split.pkl
|   |   |
|   |-- language2motion/  
|   |   |-- images/
|   |   |   |-- <video_name>/
|   |   |   |   |-- <frame_number.jpg>   [ starts from 000000.jpg ]
|   |   |-- videos/
|   |   |   |-- <video_name>/  [ ..... ]   
|   |   |-- annotations/
|   |   |   |-- <annotation_type>  [ SMPL-X, MANO, ...]
|   |   |   |   |-- <video_name.pkl>
|   |   |-- text/
|   |   |   |-- how2sign_train.csv   [Text annotations]
|   |   |   |-- how2sign_test.csv    [Text annotations]
|   |   |   |-- how2sign_val.csv     [Text annotations]
|   |   |   |-- PHOENIX-2014-T.train.corpus.csv     [Text annotations]
|   |   |   |-- PHOENIX-2014-T.test.corpus.csv     [Text annotations]
|   |   |
|   |-- word2motion/  
|   |   |-- images/
|   |   |   |-- <video_name>/
|   |   |   |   |-- <frame_number.jpg>   [ starts from 000000.jpg ]
|   |   |-- videos/
|   |   |   |-- <video_name>/  [ ..... ]   
|   |   |-- annotations/
|   |   |   |-- <annotation_type>  [ SMPL-X, MANO, ...]
|   |   |   |   |-- <video_name.pkl>
|   |   |-- text/
|   |   |   |-- WLASL_v0.3.json   [Text annotations]
|   |   |
|-- common
|   |-- utils
|   |   |-- human_model_files
|   |   |   |-- smpl
|   |   |   |   |-- SMPL_NEUTRAL.pkl
|   |   |   |   |-- SMPL_MALE.pkl
|   |   |   |   |-- SMPL_FEMALE.pkl
|   |   |   |-- smplx
|   |   |   |   |-- MANO_SMPLX_vertex_ids.pkl
|   |   |   |   |-- SMPL-X__FLAME_vertex_ids.npy
|   |   |   |   |-- SMPLX_NEUTRAL.pkl
|   |   |   |   |-- SMPLX_to_J14.pkl
|   |   |   |   |-- SMPLX_NEUTRAL.npz
|   |   |   |   |-- SMPLX_MALE.npz
|   |   |   |   |-- SMPLX_FEMALE.npz
|   |   |   |-- mano
|   |   |   |   |-- MANO_LEFT.pkl
|   |   |   |   |-- MANO_RIGHT.pkl
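
The images/<video_name>/ folders hold per-frame JPEGs numbered from 000000.jpg. If you only have the source videos, a sketch like the following (using OpenCV; the paths are illustrative) reproduces that layout:

import os

import cv2  # pip install opencv-python

def extract_frames(video_path, out_dir):
    """Dump every frame of a video as 000000.jpg, 000001.jpg, ..."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, "%06d.jpg" % idx), frame)
        idx += 1
    cap.release()

# e.g. extract_frames("dataset/word2motion/videos/<video_name>.mp4",
#                     "dataset/word2motion/images/<video_name>")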

In the common/ folder, human_model_files contains the SMPL, SMPL-X, and MANO 3D model files (including the FLAME vertex-ID mapping for SMPL-X). Download the files from [SMPL_NEUTRAL] [SMPL_MALE.pkl and SMPL_FEMALE.pkl] [smplx] [SMPLX_to_J14.pkl] [mano]. Alternatively, you can download our packed model files from Dropbox and unzip them into human_model_files/.

Data Description

SMPL-X Annotation

Each .pkl file is a dictionary whose keys and array shapes are:

width, height: (1,), (1,) (the video width and height)
focal: (num_frames, 2)
princpt: (num_frames, 2)
2d: (num_frames, 106, 3)
pred2d: (num_frames, 106, 3)
total_valid_index: (num_frames,)
left_valid: (num_frames,)
right_valid: (num_frames,)
bb2img_trans: (num_frames, 2, 3)
smplx: (num_frames, 182)
unsmooth_smplx: (num_frames, 169)

For motion generation and motion-prior learning tasks, you should use the data in smplx for better stability, whilst unsmooth_smplx can be used for pose estimation tasks. Please refer to the code for more details. For example, you can extract the smplx parameters as follows:

import pickle

# Load one annotation file (the path follows the dataset layout above).
with open("dataset/hamnosys2motion/annotations/SMPL-X/<video_name>.pkl", "rb") as f:
    results_dict = pickle.load(f)

# Smoothed parameters, (num_frames, 182):
# [root(3) | body(63) | lhand(45) | rhand(45) | jaw(3) | shape(10) | expression(10) | cam_trans(3)]
all_parameters = results_dict['smplx']
root_pose, body_pose, left_hand_pose, right_hand_pose, jaw_pose, shape, expression, cam_trans = \
    all_parameters[:, :3], all_parameters[:, 3:66], all_parameters[:, 66:111], all_parameters[:, 111:156], \
    all_parameters[:, 156:159], all_parameters[:, 159:169], all_parameters[:, 169:179], all_parameters[:, 179:182]

# Unsmoothed parameters, (num_frames, 169): no jaw pose or expression.
all_parameters = results_dict['unsmooth_smplx']
root_pose, body_pose, lhand_pose, rhand_pose, shape, cam_trans = \
    all_parameters[:, :3], all_parameters[:, 3:66], all_parameters[:, 66:111], all_parameters[:, 111:156], \
    all_parameters[:, 156:166], all_parameters[:, 166:169]

The resulting parameter shapes are:

root_pose: (num_frames, 3)
body_pose: (num_frames, 63)
expression: (num_frames, 10)
jaw_pose: (num_frames, 3)
betas: (num_frames, 10)
left_hand_pose: (num_frames, 45)
right_hand_pose: (num_frames, 45)
cam_trans: (num_frames, 3)

Please note that transl is set to 0 in these subsets, as there is no root-position change in the videos.
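
To recover posed meshes from the smoothed parameters extracted above, a minimal sketch using the smplx Python package might look as follows; the use_pca=False, num_betas=10, and num_expression_coeffs=10 settings are assumptions chosen to match the slice sizes above:

import smplx
import torch

num_frames = root_pose.shape[0]

# Neutral SMPL-X model loaded from common/utils/human_model_files/.
# use_pca=False lets the 45-dim axis-angle hand poses be passed in directly.
model = smplx.create(
    "common/utils/human_model_files", model_type="smplx", gender="neutral",
    use_pca=False, num_betas=10, num_expression_coeffs=10,
    batch_size=num_frames,
)

t = lambda a: torch.as_tensor(a, dtype=torch.float32)
output = model(
    global_orient=t(root_pose), body_pose=t(body_pose),
    left_hand_pose=t(left_hand_pose), right_hand_pose=t(right_hand_pose),
    jaw_pose=t(jaw_pose), betas=t(shape), expression=t(expression),
)
vertices = output.vertices   # (num_frames, 10475, 3) mesh vertices per frame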

Text Annotations

HamNoSys2Motion

  • The signers are standing and doing a single sign.
  • Each video is annotated with a HamNoSys glyph and HamNoSys text:
    • "hamsymmlr,hamflathand,hamextfingero,hampalml"
  • The average video length is 60 frames at 24 fps.

Language2Motion

  • The signers are sitting and doing multiple signs.
  • Each video is annotated with natural language translations:
    • "So we're going to start again on this one."
  • The average video length is 162 frames at 24 fps.

Word2Motion

  • The signers are standing and doing a single sign.
  • Each video is annotated with a word-level English gloss.
  • The average video length is 57 frames at 24 fps.
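
As a quick sanity check on the text annotations, a sketch like the following can be used; the tab separator for the How2Sign CSVs and the list-of-glosses layout of WLASL_v0.3.json are assumptions to verify against the downloaded files:

import json

import pandas as pd

# How2Sign sentence-level labels (assumed tab-separated; inspect df.columns).
df = pd.read_csv("dataset/language2motion/text/how2sign_train.csv", sep="\t")
print(len(df), "rows with columns:", df.columns.tolist())

# WLASL word-level labels (assumed: a list of entries, one per gloss).
with open("dataset/word2motion/text/WLASL_v0.3.json") as f:
    wlasl = json.load(f)
print(len(wlasl), "glosses; first entry keys:", list(wlasl[0].keys()))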

Citation

@article{yu2023signavatars,
  title   = {SignAvatars: A Large-scale 3D Sign Language Holistic Motion Dataset and Benchmark},
  author  = {Yu, Zhengdi and Huang, Shaoli and Cheng, Yongkang and Birdal, Tolga},
  journal = {arXiv preprint arXiv:2310.20436},
  month   = {November},
  year    = {2023}
}

Contact

For technical questions, please contact [email protected] or [email protected]. For license questions, please contact [email protected].

