
mediapipe's Introduction

☝️☝️

Move your fingers in the air to interact with the web



Hi 👋!

This is a TypeScript/JavaScript library that uses AI (Google's MediaPipe) to recognize the position of fingers captured by a webcam and emits HTML events similar to mouse or touch events (mousedown, mousemove, touchstart, touchmove, ...), so we can use our fingers to interact with the web.

Examples

Check out the examples folder.

Events

There are two main classes of events:

Airfingers

airfingerstart, airfingermove, airfingerend: These are meant to work like ontouchstart, ontouchmove and ontouchend (or mouse events). From Mediapipe we get an estimation of the position of the fingers in space, but we need something like a "click" to select, move, drag, paint...

So we have defined an "airfinger" as an event that happens when the tip of the index finger is closer to the camera than the wrist. This makes "touch" actions quite intuitive after a little practice.

See this image: the hand on the left qualifies as an airfinger and will trigger events when it starts, moves and ends, and the one on the right won't (so you can move freely without triggering events, like hovering).

hands
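The rule described above can be sketched as two small functions. This is only an illustration, not the library's actual code: `isAirfinger` and `nextAirfingerEvent` are hypothetical names, and the way `activeThreshold` is applied here is an assumption. What it relies on from MediaPipe is that hand landmark 0 is the wrist, landmark 8 is the tip of the index finger, and a smaller z means the landmark is closer to the camera.

```typescript
// Minimal landmark shape, as produced per point by MediaPipe hand tracking.
interface Landmark { x: number; y: number; z: number; }

// Indices in MediaPipe's 21-point hand model.
const WRIST = 0;
const INDEX_FINGER_TIP = 8;

// Hypothetical check: the index tip counts as "active" when it is closer
// to the camera than the wrist by more than `activeThreshold`.
// Smaller z means closer to the camera.
function isAirfinger(landmarks: Landmark[], activeThreshold = 0): boolean {
  const wrist = landmarks[WRIST];
  const tip = landmarks[INDEX_FINGER_TIP];
  if (!wrist || !tip) return false; // incomplete hand, nothing to report
  return wrist.z - tip.z > activeThreshold;
}

type AirfingerEventName = "airfingerstart" | "airfingermove" | "airfingerend";

// Decide which event (if any) a frame produces, given the previous frame's state.
function nextAirfingerEvent(
  wasActive: boolean,
  isActive: boolean
): AirfingerEventName | null {
  if (!wasActive && isActive) return "airfingerstart";
  if (wasActive && isActive) return "airfingermove";
  if (wasActive && !isActive) return "airfingerend";
  return null; // hovering: no event
}
```

Calling `nextAirfingerEvent` once per video frame with the previous and current result of `isAirfinger` yields one start, any number of moves, and one end for each "touch".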

These events contain this data:

export interface AirfingerEventParams {
  airpoint: Point3D; // x, y, z (normalized 0..1 position of the index finger)
  hand: Hand;        // Left or right
}
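Since the airpoint is normalized to 0..1, it usually has to be scaled before it can be used to draw or hit-test. A minimal sketch, assuming `Point3D` is a plain `{ x, y, z }` object as the comment above suggests; `airpointToPixels` and its `mirror` flag are hypothetical helpers, not part of the library:

```typescript
// Assumed shape of Point3D, based on the "x, y, z" comment in the interface.
interface Point3D { x: number; y: number; z: number; }

// Scale a normalized airpoint (0..1) to pixel coordinates inside an area
// of the given size. `mirror` flips x, which may be needed when the webcam
// image is displayed like a mirror (an assumption; check your own setup).
function airpointToPixels(
  p: Point3D,
  width: number,
  height: number,
  mirror = false
): { x: number; y: number } {
  // Keep values inside 0..1 in case the estimate falls outside the frame.
  const clamp = (v: number): number => Math.min(1, Math.max(0, v));
  const nx = mirror ? 1 - clamp(p.x) : clamp(p.x);
  return {
    x: Math.round(nx * width),
    y: Math.round(clamp(p.y) * height),
  };
}
```

For example, an airpoint of `{ x: 0.5, y: 0.25, z: 0 }` on an 800 by 600 canvas maps to the pixel `(400, 150)`.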

Gestures

When a gesture is recognized, we send these events: gesturestart, gesturemove, gestureend. With its default training model, MediaPipe can recognize seven gestures (👍, 👎, ✌️, ☝️, ✊, 👋, 🤟):

  • Closed fist (Closed_Fist)
  • Open palm (Open_Palm)
  • Pointing up (Pointing_Up)
  • Thumbs down (Thumb_Down)
  • Thumbs up (Thumb_Up)
  • Victory (Victory)
  • Love (ILoveYou)

These events contain this data:

export interface GestureEventParams {
  gesture: string;    // Name of the gesture 
  hand: Hand;         // Left or right
  airpoint: Point3D;  // x, y, z (normalized 0..1 position of the index finger)
}
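One way to react to these events is a lookup from gesture name to action. The sketch below is an assumption-heavy illustration: the `Hand` and `Point3D` shapes are guessed from the comments in the interface, and the action strings and the `actionFor` helper are made up for the example. Only the seven gesture names come from the list above.

```typescript
// Assumed shapes; the library's real Hand and Point3D types may differ.
type Hand = "Left" | "Right";
interface Point3D { x: number; y: number; z: number; }

interface GestureEventParams {
  gesture: string;
  hand: Hand;
  airpoint: Point3D;
}

// The seven names recognized by the default model.
const DEFAULT_GESTURES = [
  "Closed_Fist", "Open_Palm", "Pointing_Up",
  "Thumb_Down", "Thumb_Up", "Victory", "ILoveYou",
] as const;
type DefaultGesture = typeof DEFAULT_GESTURES[number];

// Type guard: custom models may report names outside the default set.
function isDefaultGesture(name: string): name is DefaultGesture {
  return (DEFAULT_GESTURES as readonly string[]).indexOf(name) !== -1;
}

// Hypothetical application actions, one per default gesture.
const ACTIONS: Record<DefaultGesture, string> = {
  Closed_Fist: "grab",
  Open_Palm: "release",
  Pointing_Up: "select",
  Thumb_Down: "dislike",
  Thumb_Up: "like",
  Victory: "screenshot",
  ILoveYou: "favorite",
};

// Return the action for an event payload, or null for unknown gestures.
function actionFor(params: GestureEventParams): string | null {
  const name = params.gesture;
  return isDefaultGesture(name) ? ACTIONS[name] : null;
}
```

Returning `null` for unknown names keeps the handler safe when a custom model introduces gestures the application does not handle yet.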

Note: It is possible to train MediaPipe to recognize more gestures. See the MediaPipe docs, then provide the trained model through modelAssetPath in the config (see Configure).

Configure

The method init() accepts an optional argument with a configuration object:

interface ManitasConfig {
  gestureThreshold: number;
  handednessThreshold: number;
  activeThreshold: number;
  videoHeight: string;
  videoWidth: string;
  videoId: string;
  delegate: "GPU" | "CPU";
  modelAssetPath: string;
  mediapipeWasmPath: string;
}
  • gestureThreshold: Confidence threshold to decide if a gesture has been detected.
  • handednessThreshold: Confidence threshold to decide which hand (left or right) has been detected.
  • activeThreshold: Threshold to decide if the user is pointing.
  • videoId: Id of a video element to attach the webcam stream.
  • videoHeight: Height of the video element.
  • videoWidth: Width of the video element.
  • delegate: "GPU" | "CPU". Whether estimation runs on the GPU or the CPU.
  • modelAssetPath: Custom model if you have defined custom gestures.
  • mediapipeWasmPath: Path to mediapipe wasm.
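Because the configuration argument is optional, a caller will typically override only a few fields. The sketch below shows one way such overrides could be merged with defaults. It is not the library's implementation: `resolveConfig` is a hypothetical helper, and every value in `EXAMPLE_DEFAULTS` is a placeholder, not the library's real default.

```typescript
interface ManitasConfig {
  gestureThreshold: number;
  handednessThreshold: number;
  activeThreshold: number;
  videoHeight: string;
  videoWidth: string;
  videoId: string;
  delegate: "GPU" | "CPU";
  modelAssetPath: string;
  mediapipeWasmPath: string;
}

// Placeholder values for illustration only; NOT the library's real defaults.
const EXAMPLE_DEFAULTS: ManitasConfig = {
  gestureThreshold: 0.6,
  handednessThreshold: 0.8,
  activeThreshold: 0,
  videoHeight: "360px",
  videoWidth: "480px",
  videoId: "webcam",
  delegate: "GPU",
  modelAssetPath: "path/to/custom_model.task",
  mediapipeWasmPath: "path/to/mediapipe/wasm",
};

// Hypothetical helper: fields given by the caller win over the defaults.
function resolveConfig(overrides: Partial<ManitasConfig> = {}): ManitasConfig {
  return { ...EXAMPLE_DEFAULTS, ...overrides };
}

// Override two fields and keep the rest.
const config = resolveConfig({ delegate: "CPU", gestureThreshold: 0.75 });
```

Typing the overrides as `Partial<ManitasConfig>` lets the compiler reject misspelled keys or a `delegate` value other than "GPU" or "CPU".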

Caveats

Q: Do I need to have a video element in the page displaying the signal from the camera?

A: I haven't figured out how to run MediaPipe without it, so a video element is needed. But you can hide it (display:none)!
