Real-time emotion evaluator

Live demo:

fastai.mollyai.com

To run locally:

Install fastai, starlette and uvicorn
python server.py
Go to http://localhost:8000/ (many browsers require https for webRTC)

The Model

In order to distinguish different facial expressions, we chose to train the model on the AffectNet dataset. This is a phenomenal data set of 450,000 manually annotated face images.

For example:

For much more information on training, please check out the notebook.

Inference Server

Once we had a trained model, it was time to share it with the world. Creating a server was pretty simple. We used the Starlette framework.

First, in our server file outside of any route, we had to create a Learner similar to the one we used to train our model. So, we (1) define our classes in the same way we did when training, (2) instantiate an ImageDataBunch and (3) instantiate a Learner from that ImageDataBunch.

classes = ['Anger', 'Disgust', 'Surprise', 'Sadness', 'Happiness', 'Neutral', 'Contempt', 'Fear']

data = ImageDataBunch.single_from_classes('', classes, tfms=get_transforms(), size=196).normalize(imagenet_stats)
learner = create_cnn(data, models.resnet34)
learner.load('sentiment-model')

Note that when calling ImageDatabunch.single_from_classes, the first argument is the path to your model. The function expects the model to be in a models directory under that path. So, if your models directory is in the same directory as your server file, you can just use an empty string for your path as we did here. Then when calling learner.load('sentiment-model'), you will be loading ./models/sentiment-model.pth.

Now that we have our weights loaded and ready to make inferences, we set up a route to accept images from a client and return a set of predictions.

@app.route('/face', methods=["GET","POST"])
async def face(request):
    body = await request.form()
    binary_data = a2b_base64(body['imgBase64'])
    img = open_image(BytesIO(binary_data))
    _,_,losses = learner.predict(img)
    analysis = {
        "predictions": dict(sorted(
            zip(learner.data.classes, map(float, losses)),
            key=lambda p: p[1],
            reverse=True
        ))}
    return JSONResponse(analysis)

Every time the /face route is hit, our Learner will evaluate the image sent in the request. From calling learner.predict(img) we get a tensor of probabilities for each class in the same order of the classes as defined above. We then zip these predictions to the class names and order by the probabilities in descending order and transform into a dictionary (for ease of use by a javascript client). So, if we had a happy image, the resulting analysis might look like:

{'predictions': {'Happiness': 29.584692001342773, 'Sadness': 1.511526346206665, 'Neutral': 0.5242078900337219, 'Fear': 0.33813756704330444, 'Contempt': 0.29101505875587463, 'Surprise': 0.282543420791626, 'Anger': 0.1381775140762329, 'Disgust': 0.10192008316516876}}

Web Client

We want the recognition to be as smooth and real-time as possible. We also want to distribute on the web. We can take advantage of webRTC to capture frames from the camera video stream, send them to our inference server and then display the results. However, in order to maximize the responsiveness, we need to send images fairly frequently. In order to reduce the network overhead, we use trackingjs to first track the face, crop it, and then finally only send the face to the server. By first cropping the face, we can drastically reduce the latency.

We get the cropped face, draw it to a different canvas, and then use toDataURL() to send the base64 face image to the server.

etown / dl1 Goto Github PK

dl1's Introduction

Real-time emotion evaluator

Live demo:

To run locally:

The Model

Inference Server

Web Client

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent

Jobs