ycace / virtual-background

This project forked from volcomix/virtual-background

Demo on adding virtual background to a live video stream in the browser

Home Page: https://volcomix.github.io/virtual-background

License: Apache License 2.0

HTML 1.91% TypeScript 57.25% Dockerfile 0.31% C++ 1.77% Starlark 33.08% Shell 0.29% Python 5.39%

Virtual Background

Demo on adding virtual background to a live video stream in the browser.

👉 Try it live here!

Implementation details

In this demo you can switch between two pre-trained ML segmentation models: BodyPix and MediaPipe Meet Segmentation.

BodyPix

The drawing utilities provided by BodyPix are not optimized for the simple background image use case of this demo. To get a higher framerate, I use neither the toMask nor the drawMask method from the API.

The drawBokehEffect method from the BodyPix API is not used either. Instead, the CanvasRenderingContext2D.filter property is configured with blur, and CanvasRenderingContext2D.globalCompositeOperation is set up to blend the different layers according to the segmentation mask.
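The compositing approach described above can be sketched as follows. This is a minimal illustration and not the demo's actual code: the function name, the blur radii, and the assumption that the mask is a canvas whose alpha channel encodes the person probability are all hypothetical. The context is typed with a small interface so that the sketch can be exercised without a browser; a real CanvasRenderingContext2D satisfies it.

```typescript
// Minimal subset of CanvasRenderingContext2D used by this sketch.
interface Compositor2D {
  filter: string
  globalCompositeOperation: string
  drawImage(image: unknown, dx: number, dy: number, dw: number, dh: number): void
}

// Hypothetical helper: keeps the person sharp and blurs everything else.
// `personMask` is assumed to be opaque where a person was detected and
// transparent elsewhere.
function drawBlurredBackground(
  ctx: Compositor2D,
  frame: unknown,
  personMask: unknown,
  width: number,
  height: number,
  maskBlurPx = 4,
  backgroundBlurPx = 8
): void {
  // 1. Replace the canvas content with a slightly blurred mask to soften edges.
  ctx.globalCompositeOperation = 'copy'
  ctx.filter = `blur(${maskBlurPx}px)`
  ctx.drawImage(personMask, 0, 0, width, height)

  // 2. Keep the frame only where the mask is opaque (the person).
  ctx.globalCompositeOperation = 'source-in'
  ctx.filter = 'none'
  ctx.drawImage(frame, 0, 0, width, height)

  // 3. Draw the blurred frame behind the existing pixels (the background).
  ctx.globalCompositeOperation = 'destination-over'
  ctx.filter = `blur(${backgroundBlurPx}px)`
  ctx.drawImage(frame, 0, 0, width, height)

  // Restore the default state for later drawing.
  ctx.globalCompositeOperation = 'source-over'
  ctx.filter = 'none'
}
```

To use a background image instead of a blur, step 3 would presumably draw that image with the filter set to none.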

The result reaches a decent framerate on a laptop (~20 FPS on MacBook Pro 2017 in Chrome) but is not really usable on mobile (~8 FPS on Pixel 3 in Chrome). On both devices, the segmentation lacks precision compared to the Meet segmentation model.

Note: BodyPix relies on the default TensorFlow.js backend for your device (i.e. webgl usually). The WASM backend seems to be slower for this model, at least on MacBook Pro.

MediaPipe Meet Segmentation

The Meet segmentation model is only available as a TensorFlow Lite model file. A few approaches to convert it and use it with TensorFlow.js are discussed in this issue, but I decided to try implementing something closer to Google's original approach described in this post. Hence the demo relies on a small WebAssembly tool built on top of TFLite, along with the XNNPACK delegate and SIMD support.

Note: The Meet segmentation model card was initially released under the Apache 2.0 license (read more here and here) but seems to have been switched to the Google Terms of Service since Jan 21, 2021. I am not sure what this means for this demo.

Building TFLite to WebAssembly

You can find the source of the TFLite inference tool in the tflite directory of this repository. Instructions to build TFLite using Docker are described in a dedicated section: Building TensorFlow Lite tool.

Canvas 2D + CPU

This rendering pipeline is pretty much the same as for BodyPix. It relies on Canvas compositing properties to blend rendering layers according to the segmentation mask.

The interactions with the TFLite inference tool run on the CPU: the model input is converted from UInt8 to Float32, and softmax is applied to the model output.
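As a rough sketch of these two CPU steps, the helpers below convert RGBA UInt8 pixels to an RGB Float32 buffer and apply a two-class softmax per pixel. The function names, the normalization to [0, 1], and the output layout (two interleaved logits per pixel, background first and person second) are assumptions made for illustration, not details confirmed by this README.

```typescript
// Converts RGBA bytes (as found in ImageData.data) to RGB floats.
// Assumption: the model expects values normalized to [0, 1].
function rgbaToRgbFloat32(rgba: Uint8ClampedArray): Float32Array {
  const pixelCount = rgba.length / 4
  const rgb = new Float32Array(pixelCount * 3)
  for (let i = 0; i < pixelCount; i++) {
    rgb[i * 3] = rgba[i * 4] / 255
    rgb[i * 3 + 1] = rgba[i * 4 + 1] / 255
    rgb[i * 3 + 2] = rgba[i * 4 + 2] / 255
    // The alpha channel is dropped.
  }
  return rgb
}

// Applies softmax to each (background, person) logit pair and returns
// the person probability for every pixel.
function personProbabilities(logits: Float32Array): Float32Array {
  const pixelCount = logits.length / 2
  const probabilities = new Float32Array(pixelCount)
  for (let i = 0; i < pixelCount; i++) {
    const background = logits[i * 2]
    const person = logits[i * 2 + 1]
    // Subtract the max logit for numerical stability.
    const shift = Math.max(background, person)
    const backgroundExp = Math.exp(background - shift)
    const personExp = Math.exp(person - shift)
    probabilities[i] = personExp / (backgroundExp + personExp)
  }
  return probabilities
}
```

The resulting probabilities could then be scaled to 0..255 and written to the alpha channel of a mask image for compositing.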

The framerate is higher and the quality looks better than BodyPix even with the 160x96 model:

| Model | MacBook Pro 2017 (Chrome) | Pixel 3 (Chrome) |
| --- | --- | --- |
| 256x144 | ~36 FPS | ~14 FPS |
| 160x96 | ~60 FPS | ~29 FPS |

WebGL 2

The WebGL 2 rendering pipeline relies entirely on webgl2 canvas context and GLSL shaders for:

  • Resizing inputs to fit the segmentation model (there are still CPU operations to copy from RGBA UInt8Array to RGB Float32Array in TFLite WASM memory).
  • Softmax on segmentation model output to get the probability of each pixel to be a person.
  • Joint bilateral filter to smooth the segmentation mask and to preserve edges from the original input frame (implementation based on MediaPipe repository).
  • Blending background image with light wrapping.
  • Original input frame background blur. Great articles here and here.
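To illustrate what the joint bilateral filter step computes, here is a small CPU reference sketch. It is not the demo's GLSL shader nor the MediaPipe implementation: the function name, the grayscale guide image in [0, 1], and the default radius and sigma values are assumptions chosen for readability. Each mask value is replaced by a weighted average of its neighbors, where a neighbor's weight drops with its distance and with how much the guide (original frame) differs from the center pixel, so mask edges tend to follow edges in the frame.

```typescript
// CPU reference of a joint bilateral filter on a single-channel mask.
// `guide` holds one grayscale value in [0, 1] per pixel (assumption).
function jointBilateralFilter(
  mask: Float32Array,
  guide: Float32Array,
  width: number,
  height: number,
  radius = 2,
  sigmaSpace = 2,
  sigmaColor = 0.1
): Float32Array {
  const out = new Float32Array(mask.length)
  const spaceDenominator = 2 * sigmaSpace * sigmaSpace
  const colorDenominator = 2 * sigmaColor * sigmaColor
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const center = guide[y * width + x]
      let sum = 0
      let weightSum = 0
      for (let dy = -radius; dy <= radius; dy++) {
        for (let dx = -radius; dx <= radius; dx++) {
          const nx = x + dx
          const ny = y + dy
          // Skip neighbors that fall outside the image.
          if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue
          const index = ny * width + nx
          const diff = guide[index] - center
          // Spatial weight times range weight computed on the guide image.
          const weight =
            Math.exp(-(dx * dx + dy * dy) / spaceDenominator) *
            Math.exp(-(diff * diff) / colorDenominator)
          sum += weight * mask[index]
          weightSum += weight
        }
      }
      // weightSum is at least 1 because the center pixel has weight 1.
      out[y * width + x] = sum / weightSum
    }
  }
  return out
}
```

A shader version would perform the same weighted sum per fragment, sampling the mask and the frame as textures.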

Possible improvements

  • Rely on alpha channel to save texture fetches from the segmentation mask.
  • Blur the background image outside of the rendering loop and use it for light wrapping instead of the original background image. This should produce better rendering results for large light wrapping masks.
  • Optimize joint bilateral filter shader to prevent unnecessary variables, calculations and costly functions like exp.
  • Blur the background at low resolution for efficiency. Also give linear sampling a try.
  • Try separable approximation for joint bilateral filter.
  • Compute everything on lower source resolution (scaling down at the beginning of the pipeline).
  • Build TFLite and XNNPACK with multithreading support. A few configuration examples are available in the TensorFlow.js WASM backend.
  • Detect WASM features to automatically load the right TFLite WASM runtime. Inspiration could be taken from the TensorFlow.js WASM backend, which is based on GoogleChromeLabs/wasm-feature-detect.
  • Experiment with DeepLabv3+ and maybe retrain MobileNetv3-small model directly.
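The feature-detection improvement listed above could look roughly like the sketch below. The helper names and the two runtime file names are hypothetical placeholders. The detector is passed in as a function so the sketch stays self-contained; in a browser it could be the simd function exported by the wasm-feature-detect package.

```typescript
// Returns true when the feature is supported, e.g. `simd` from
// the wasm-feature-detect package.
type FeatureDetector = () => Promise<boolean>

// Hypothetical file names for the two builds of the TFLite tool.
const RUNTIME_URLS = {
  simd: 'tflite/tflite-simd.js',
  plain: 'tflite/tflite.js',
}

// Pure selection logic, kept separate so it is easy to test.
function runtimeUrlFor(simdSupported: boolean): string {
  return simdSupported ? RUNTIME_URLS.simd : RUNTIME_URLS.plain
}

// Picks the SIMD build when available and falls back to the plain
// build if detection reports no support or fails.
async function selectTfliteRuntime(detectSimd: FeatureDetector): Promise<string> {
  try {
    return runtimeUrlFor(await detectSimd())
  } catch {
    return RUNTIME_URLS.plain
  }
}

// Usage in a browser (assuming wasm-feature-detect is installed):
//   import { simd } from 'wasm-feature-detect'
//   const url = await selectTfliteRuntime(simd)
```

Falling back to the non-SIMD build on detection errors keeps the demo usable on browsers where the check itself is unavailable.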

Related work

You can learn more about a pre-trained TensorFlow.js model in the BodyPix repository.

Here is a technical overview of background features in Google Meet which relies on:

Running locally

In the project directory, you can run:

yarn start

Runs the app in the development mode.
Open http://localhost:3000 to view it in the browser.

The page will reload if you make edits.
You will also see any lint errors in the console.

yarn test

Launches the test runner in the interactive watch mode.
See the section about running tests for more information.

yarn build

Builds the app for production to the build folder.
It correctly bundles React in production mode and optimizes the build for the best performance.

The build is minified and the filenames include the hashes.
Your app is ready to be deployed!

See the section about deployment for more information.

Building TensorFlow Lite tool

A Docker development environment must be initialized before building the TensorFlow Lite inference tool.

yarn init:tflite

Builds a Docker development image, starts the container and initializes the dependencies required for building the TFLite tool.

yarn start:tflite:container

Starts the container, then updates the TensorFlow and MediaPipe repositories inside the container.

yarn build:tflite:all

Builds WASM functions that can infer Meet segmentation models. The TFLite tool is built both with and without SIMD support.

