Sign language recognition with RNN and Mediapipe

Sign language gesture recognition using a recurrent neural network (RNN) with Mediapipe hand tracking.

This project is for academic purposes. Thanks to Google's Mediapipe team :)

Data Preprocessing with hand tracking (Desktop)

Create training data on Desktop from input video using Hand Tracking. Gesture recognition with a deep learning model can be done by training an RNN on only the 42 hand landmarks extracted per frame.

CUSTOMIZE:

  • Use video input instead of the webcam on Desktop, so the model can be trained on video data
  • Extract the hand landmarks for every frame of a word and write them to one txt file
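
As a rough illustration of the "one txt file per word" idea, the sketch below groups a flat stream of numbers into per-frame rows that an RNN could consume as a sequence. The file layout (whitespace-separated numbers) and the name `parse_landmark_text` are assumptions made for this example; the txt format written by the modified Mediapipe build may differ, so check a generated file before relying on it.

```python
from typing import List


def parse_landmark_text(text: str, values_per_frame: int = 42) -> List[List[float]]:
    """Group whitespace-separated numbers into frames of `values_per_frame` values.

    ASSUMPTION: the txt content is a flat stream of numbers. The real
    output format of the modified Mediapipe build is not documented here.
    """
    values = [float(token) for token in text.split()]
    if len(values) % values_per_frame != 0:
        raise ValueError("the last frame is incomplete")
    # Each inner list is one frame; the outer list is the time sequence.
    return [
        values[start:start + values_per_frame]
        for start in range(0, len(values), values_per_frame)
    ]
```

Each returned row is one time step, so a video of N frames becomes an N x 42 sequence.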

1. Set up Hand Tracking framework

  • Install Mediapipe
  git clone https://github.com/google/mediapipe.git

See the Mediapipe installation documents for the rest of the setup.

  • Replace the tflite_tensors_to_landmarks_calculator.cc file
  cd mediapipe/mediapipe/calculators/tflite
  rm tflite_tensors_to_landmarks_calculator.cc

Then copy our new tflite_tensors_to_landmarks_calculator.cc file from the modified_mediapipe folder into this directory.

  • Replace the demo_run_graph_main.cc file
  cd mediapipe/mediapipe/examples/desktop
  rm demo_run_graph_main.cc

Then copy our new demo_run_graph_main.cc file from the modified_mediapipe folder into this directory.

2. Create your own training data

Make train_videos and test_videos folders, each containing one folder per sign language word. Copy the build.py file from the util folder to your mediapipe directory. (Currently it may raise a TabError. Please fix the tabs manually.)
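
To find the lines that may trigger the TabError, a small helper like the one below can list every line whose indentation contains a tab character. This is only a sketch for locating candidates; the function name `lines_with_tab_indent` is made up for this example, and the lines still need to be fixed by hand.

```python
import re
from typing import List


def lines_with_tab_indent(source: str) -> List[int]:
    """Return the 1-based numbers of lines whose leading whitespace contains a tab."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Capture only the leading run of spaces and tabs.
        indent = re.match(r"[ \t]*", line).group(0)
        if "\t" in indent:
            hits.append(number)
    return hits
```

For example, passing the contents of build.py (read with `open("build.py").read()`) returns the line numbers to inspect.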

  • Usage

To generate the mp4 and txt files with Mediapipe automatically, run

  python build.py --input_data_path=[INPUT_PATH] --output_data_path=[OUTPUT_PATH]

inside the mediapipe directory.

Replace INPUT_PATH and OUTPUT_PATH with your own directory paths. INPUT_PATH is the path to your input videos. OUTPUT_PATH is where all the hand-tracked mp4 files and the txt files of 42 landmarks will be saved.

For example:

input_videos
├── Apple
│   ├── IMG_2733.MOV
│   ├── IMG_2734.MOV
│   ├── IMG_2735.MOV
│   └── IMG_2736.MOV
├── Bird
│   ├── IMG_2631.MOV
│   ├── IMG_2632.MOV
│   ├── IMG_2633.MOV
│   └── IMG_2634.MOV
└── Sorry
    ├── IMG_2472.MOV
    ├── IMG_2473.MOV
    ├── IMG_2474.MOV
    └── IMG_2475.MOV
    ...
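
Before running the build, it can help to confirm that the input folder has the expected layout of one folder per word. The following is a minimal sketch using only the standard library; `list_videos_by_word` is a hypothetical helper and is not part of build.py.

```python
from pathlib import Path
from typing import Dict, List


def list_videos_by_word(input_path: str) -> Dict[str, List[str]]:
    """Map each word folder under `input_path` to its sorted file names."""
    root = Path(input_path)
    result: Dict[str, List[str]] = {}
    for word_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Every regular file in a word folder is treated as one video.
        result[word_dir.name] = sorted(
            f.name for f in word_dir.iterdir() if f.is_file()
        )
    return result
```

With the layout above, the result would contain the keys Apple, Bird and Sorry, each mapped to its list of .MOV files.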

OUTPUT_PATH is initially an empty directory. When the build is done, the mp4 and txt files will be written to that path.

Created folder example:

output_data
├── _Apple
│   ├── IMG_2733.mp4
│   ├── IMG_2734.mp4
│   ├── IMG_2735.mp4
│   └── IMG_2736.mp4
└── Apple
    ├── IMG_2472.txt
    ├── IMG_2473.txt
    ├── IMG_2474.txt
    └── IMG_2475.txt
    ...

(Do NOT use spaces or '_' in your folder paths and video names, e.g. Apple_pie (X).)
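
The naming rule can be checked up front with a short sketch like the one below. It checks word folder names only, which is an interpretation: the example video names above (such as IMG_2733.MOV) themselves contain '_', so how strictly the rule applies to video file names is unclear. `invalid_folder_names` is a hypothetical helper name.

```python
from typing import Iterable, List

# Characters the README warns against in folder names.
FORBIDDEN = (" ", "_")


def invalid_folder_names(names: Iterable[str]) -> List[str]:
    """Return the names that contain a space or an underscore."""
    return [name for name in names if any(ch in name for ch in FORBIDDEN)]
```

An empty result means every folder name passes the check.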

3. Train RNN model

  • Train
  python LSTM.py --input_file=[PKL_FILE]

Replace PKL_FILE with the path to the preprocessed pkl file.

Watch this video for the overall workflow. More details
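
Since the structure of the pkl file is not described here, a quick way to inspect it before training is to load it with the standard pickle module and report its top-level type. This is a sketch under that assumption only; `describe_pickle` is a made-up helper and says nothing about what LSTM.py expects. Only unpickle files you created yourself, because loading a pickle can execute arbitrary code.

```python
import pickle


def describe_pickle(path: str) -> str:
    """Load a pickle file and summarise its top-level object."""
    with open(path, "rb") as handle:
        data = pickle.load(handle)  # only safe for trusted files
    kind = type(data).__name__
    try:
        return f"{kind} with {len(data)} items"
    except TypeError:
        # Objects without a length (e.g. a single number) report the type only.
        return kind
```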

Contributors

kjonguk, rabbit64
