muskanm1 / the-handler

This project forked from jeethub-official/the-handler

A Webcam based Gesture Control Tool

License: MIT License

Python 99.42% Shell 0.58%

The Handler: WebCam based Hand Gesture Control Tool

Technologies Used

Programming Language: Python
Libraries Used: PyTorch, OpenCV, PyAutoGUI

Summary

This project is a tool that recognizes hand gestures from a live webcam and performs mouse functions accordingly. The motivation is to replace traditional, cumbersome ways of interacting with computers with gesture-based interaction. The system recognizes dynamic hand gestures in real time from a webcam video stream. The proposed solution uses 3D-CNN models for live-video (gesture) classification because of their superior ability to extract spatio-temporal features from video frames. Based on the detected gesture, actions such as scrolling, switching slides, and zooming in and out are automated using Python libraries. Additionally, we used transfer learning with the MobileNetV2 model to give each user the option to customise their own gestures for each class of actions. With this solution, we achieved more than 90% accuracy in recognizing user-defined custom gestures in the real-time system.
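The step from a predicted gesture to a mouse or keyboard action can be sketched as a small dispatch table. This is only an illustration: the gesture labels, the scroll amount, and the helper names `build_actions` and `perform` are assumptions, not names taken from the project. The `gui` argument is expected to expose PyAutoGUI's `scroll`, `press`, and `hotkey` functions.

```python
from typing import Callable, Dict


def build_actions(gui) -> Dict[str, Callable[[], None]]:
    """Map hypothetical gesture labels to GUI actions.

    `gui` is any object exposing PyAutoGUI-style `scroll`, `press`, and
    `hotkey` functions; passing the `pyautogui` module itself works.
    """
    return {
        "scroll_up": lambda: gui.scroll(120),       # positive clicks scroll up
        "scroll_down": lambda: gui.scroll(-120),    # negative clicks scroll down
        "next_slide": lambda: gui.press("right"),
        "previous_slide": lambda: gui.press("left"),
        "zoom_in": lambda: gui.hotkey("ctrl", "+"),
        "zoom_out": lambda: gui.hotkey("ctrl", "-"),
    }


def perform(label: str, actions: Dict[str, Callable[[], None]]) -> bool:
    """Run the action for a predicted label; return False for unknown labels."""
    action = actions.get(label)
    if action is None:
        return False
    action()
    return True
```

In a live setup one would call `build_actions(pyautogui)` once after `import pyautogui`, then pass each label produced by the classifier to `perform`. Keeping the GUI backend as a parameter makes the mapping easy to try out without moving the real cursor.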

The project is divided into two parts: Server and Client (the code can be found in the src folder).
First, the client runs input_utils.py, which collects the user's data. Next, client.py is executed; it zips the collected data and sends it to the server in an HTTP POST request (a REST API call). The server responds to the POST request with a model trained on the user's data. Inference then runs on the client side through controls.py, which classifies the live hand gesture using two files, yolo.py and main.py, and performs the corresponding function with the PyAutoGUI library. yolo.py performs finger tracking, which allows scrolling to be controlled accurately. main.py uses the MobileNetV2 model returned by the server to classify the hand gesture into the predefined classes.
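The upload step on the client can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the contents of client.py: the function names, the idea of sending the archive as a raw `application/zip` request body, and the server URL passed in by the caller are all hypothetical.

```python
import io
import urllib.request
import zipfile
from pathlib import Path


def zip_dataset(data_dir: str) -> bytes:
    """Zip every file under data_dir in memory, keeping relative paths."""
    root = Path(data_dir)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(root).as_posix())
    return buffer.getvalue()


def build_request(url: str, payload: bytes) -> urllib.request.Request:
    """Build the HTTP POST request carrying the zipped data (assumed raw body)."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/zip"},
        method="POST",
    )


def fetch_model(url: str, data_dir: str, model_path: str) -> None:
    """Send the zipped data and save the response body as the trained model."""
    request = build_request(url, zip_dataset(data_dir))
    with urllib.request.urlopen(request) as response:
        Path(model_path).write_bytes(response.read())
```

Training may take a while on the server, so a real client would likely pass a generous `timeout` to `urlopen` and handle HTTP errors; both are left out here to keep the sketch short.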
The server-side code is responsible for receiving the data from the client, training the model on the received data, and sending the model back to the client. The server uses the Apache2 HTTP server, the Gunicorn WSGI server, and the Flask micro-framework. The model is written with the PyTorch library. app.py receives requests from clients and calls the model training function defined in main.py, which trains the model on the received data; the trained model is then returned to the client.
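The receive, train, and return flow on the server could look roughly like the following. It is a hedged sketch rather than the project's app.py: the `/train` route, the helper names, and the assumption that the request body is the raw zip archive are all illustrative, and `train_fn` stands in for the training function in main.py by taking the extracted data directory and returning the serialized model as bytes.

```python
import io
import tempfile
import zipfile
from pathlib import Path
from typing import Callable


def handle_training_request(
    zip_bytes: bytes, train_fn: Callable[[Path], bytes]
) -> bytes:
    """Extract the uploaded archive, train on it, and return the model bytes."""
    with tempfile.TemporaryDirectory() as workdir:
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
            archive.extractall(workdir)
        # train_fn is a placeholder for the real training routine.
        return train_fn(Path(workdir))


def create_app(train_fn: Callable[[Path], bytes]):
    """Wire the helper into a Flask app (hypothetical /train route)."""
    # Flask is imported lazily so the helper above stays usable without it.
    from flask import Flask, Response, request

    app = Flask(__name__)

    @app.route("/train", methods=["POST"])
    def train():
        model_bytes = handle_training_request(request.get_data(), train_fn)
        return Response(model_bytes, mimetype="application/octet-stream")

    return app
```

With Gunicorn in front, an app built this way would typically be exposed through a module-level `app = create_app(...)` object. Separating the request handling from the Flask wiring keeps the extraction and training step testable on its own.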

Contributors

jeethub-official, kavyaa-official, muskanm1
