Hand_Gesture_Recognition

Hand Gesture Recognition using Deep Learning

1. Problem Statement:


This project develops a model that recognizes five hand gestures, letting the user control the video player on a television without a remote. The gestures are continuously monitored by a webcam mounted on the TV:
• Thumbs Up: Increase the volume
• Thumbs Down: Decrease the volume
• Left Swipe: Jump backwards 10 sec
• Right Swipe: Jump forward 10 sec
• Stop: Pause the movie
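
The gesture-to-action mapping above can be sketched as a small dispatch function. This is only an illustration: the label strings, the `state` dictionary and the `VOLUME_STEP` value are assumptions, not part of the original project.

```python
VOLUME_STEP = 5    # assumption: the source does not say how much the volume changes
SEEK_SECONDS = 10  # from the gesture list: swipes jump 10 seconds


def apply_gesture(state, gesture):
    """Return a new player state after applying one recognized gesture."""
    new_state = dict(state)
    if gesture == "thumbs_up":
        new_state["volume"] = min(100, state["volume"] + VOLUME_STEP)
    elif gesture == "thumbs_down":
        new_state["volume"] = max(0, state["volume"] - VOLUME_STEP)
    elif gesture == "left_swipe":
        new_state["position"] = max(0, state["position"] - SEEK_SECONDS)
    elif gesture == "right_swipe":
        new_state["position"] = state["position"] + SEEK_SECONDS
    elif gesture == "stop":
        new_state["paused"] = True
    else:
        raise ValueError(f"unknown gesture: {gesture}")
    return new_state


# Hypothetical player state: volume in percent, position in seconds.
state = {"volume": 50, "position": 42, "paused": False}
for g in ["thumbs_up", "right_swipe", "stop"]:
    state = apply_gesture(state, g)
print(state)  # {'volume': 55, 'position': 52, 'paused': True}
```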

2. Training Data and Validation Data


The training and validation sets consist of 663 and 100 videos respectively. Each video is divided into 30 frames capturing the hand movement. The frame images come in two sizes, (120, 160, 3) and (360, 360, 3): the first two dimensions are the number of rows and columns, and the third is the number of RGB channels.

3. Data Generators


In most deep learning projects, the data is fed into the model in batches. Data generators are used to set up this data ingestion pipeline, and creating them is probably the most important part of building a training pipeline. Although libraries like Keras provide built-in generator functionality, it is often limited in scope, so you have to write your own generators from scratch. For example, in this problem you need to feed batches of videos, not images.

The data generator yields a batch of videos at each iteration, with each video having shape (nFrames, nRows, nColumns, Channels). The frames come in two sizes, 120x160 and 360x360. While processing the frames in the generator, we crop each 120x160 image to 120x120 and then pass it to the resize function to produce an 80x80 image. Images of size 360x360 are passed directly to the resize function.
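
The crop-and-resize steps described above could look roughly like the following NumPy sketch. It is not the project's actual generator. The center crop, the nearest-neighbour `resize_nearest` helper, the scaling to [0, 1], the even sampling of 15 frames and the in-memory `videos` list are all assumptions made for illustration; the original presumably reads frames from disk and uses a library resize function.

```python
import numpy as np


def center_crop_square(frame):
    """Crop the longer side so the frame becomes square (120x160 -> 120x120)."""
    h, w = frame.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    return frame[top:top + side, left:left + side]


def resize_nearest(frame, size=80):
    """Nearest-neighbour resize to (size, size); a stand-in for a library resize."""
    h, w = frame.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return frame[rows[:, None], cols[None, :]]


def preprocess(frame, size=80):
    """Crop to square, resize, and scale pixel values to [0, 1] (assumed)."""
    return resize_nearest(center_crop_square(frame), size).astype(np.float32) / 255.0


def video_batch_generator(videos, labels, batch_size, n_classes=5, n_frames=15, size=80):
    """Yield (x, y) batches forever.

    x has shape (batch, n_frames, size, size, 3); y is one-hot, shape (batch, n_classes).
    `videos` is a list of uint8 arrays of shape (frames, rows, cols, 3).
    """
    n = len(videos)
    while True:
        order = np.random.permutation(n)  # reshuffle at the start of every epoch
        for start in range(0, n, batch_size):
            ids = order[start:start + batch_size]  # the last batch may be smaller
            x = np.zeros((len(ids), n_frames, size, size, 3), dtype=np.float32)
            y = np.zeros((len(ids), n_classes), dtype=np.float32)
            for i, vid in enumerate(ids):
                frames = videos[vid]
                # Pick n_frames evenly spaced frame indices from the video.
                picks = np.linspace(0, len(frames) - 1, n_frames).round().astype(int)
                for j, f in enumerate(picks):
                    x[i, j] = preprocess(frames[f], size)
                y[i, labels[vid]] = 1.0
            yield x, y
```

Because the generator loops forever, the training loop needs to be told how many batches make up one epoch; with `batch_size` videos per batch that is the number of videos divided by `batch_size`, rounded up.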

4. Model Architecture:

Two types of architecture are commonly used to analyze videos with neural networks:

• 3D Convolution Network (Conv3D)
3D convolutions are the natural extension of 2D convolutions. In a 2D convolution you move the filter in two directions (x and y); in a 3D convolution you move it in three directions (x, y and z), where for video the third direction is the frame (time) axis.

• Convolutions + RNN
The 2D convolutional network extracts a feature vector for each image, and the sequence of these feature vectors is then fed to an RNN-based network. The output of the RNN is passed to a regular softmax layer.
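
As a rough illustration of the 3D convolution option described above, the following NumPy sketch slides a single filter along the frame, row and column axes of a one-channel clip. The function name `conv3d_valid`, the averaging kernel and the random clip are illustrative assumptions; a real model would use a framework layer with many learned filters.

```python
import numpy as np


def conv3d_valid(volume, kernel):
    """Naive single-channel 3D cross-correlation with stride 1 and no padding."""
    t_len, height, width = volume.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((t_len - kt + 1, height - kh + 1, width - kw + 1))
    for t in range(out.shape[0]):          # slide along the frame (time) axis
        for y in range(out.shape[1]):      # slide along rows
            for x in range(out.shape[2]):  # slide along columns
                patch = volume[t:t + kt, y:y + kh, x:x + kw]
                out[t, y, x] = np.sum(patch * kernel)
    return out


clip = np.random.rand(15, 80, 80)    # 15 grayscale frames of 80x80 (random data)
kernel = np.ones((3, 3, 3)) / 27.0   # 3x3x3 averaging filter
features = conv3d_valid(clip, kernel)
print(features.shape)  # (13, 78, 78)
```

Each output value depends on a 3x3 spatial neighbourhood across 3 consecutive frames, which is how a 3D filter can respond to motion as well as appearance.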

Note: For all the models, we used an image size of (80, 80, 3) and 15 frames per video.
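
With that input size, the Convolutions + RNN data flow can be sketched in NumPy as below. This is a toy forward pass with random, untrained weights, not the project's model: `frame_features` is a hypothetical stand-in for the 2D CNN (average pooling plus a linear projection), the recurrence is a plain tanh RNN, and all layer sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)


def frame_features(frame, proj):
    """Stand-in for a 2D CNN: average-pool the frame to 8x8, flatten, project, ReLU."""
    h, w, c = frame.shape
    pooled = frame.reshape(8, h // 8, 8, w // 8, c).mean(axis=(1, 3))  # (8, 8, c)
    return np.maximum(pooled.reshape(-1) @ proj, 0.0)


def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()


def classify_video(video, params):
    """Run a simple tanh RNN over per-frame features, then apply softmax."""
    hidden = np.zeros(params["Wh"].shape[0])
    for frame in video:  # one recurrent step per frame
        x = frame_features(frame, params["proj"])
        hidden = np.tanh(x @ params["Wx"] + hidden @ params["Wh"] + params["b"])
    return softmax(hidden @ params["Wo"] + params["bo"])


n_feat, n_hidden, n_classes = 32, 16, 5  # sizes chosen only for illustration
params = {
    "proj": rng.normal(0, 0.1, (8 * 8 * 3, n_feat)),
    "Wx": rng.normal(0, 0.1, (n_feat, n_hidden)),
    "Wh": rng.normal(0, 0.1, (n_hidden, n_hidden)),
    "b": np.zeros(n_hidden),
    "Wo": rng.normal(0, 0.1, (n_hidden, n_classes)),
    "bo": np.zeros(n_classes),
}

video = rng.random((15, 80, 80, 3))  # one video: 15 frames of (80, 80, 3)
probs = classify_video(video, params)
print(probs.shape)  # (5,)
```

The five output values sum to 1 and can be read as one probability per gesture class; since the weights are random here, the values themselves carry no meaning.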
