mehmandarov / findyourcandy

This project forked from brainpad/findyourcandy

License: Apache License 2.0

Python 70.47% Shell 1.67% CSS 8.56% JavaScript 13.63% HTML 4.33% Dockerfile 1.33%

findyourcandy's Introduction

"Find Your Candy with Cloud ML" system diagram

Understanding your request with ML APIs

The demo starts by listening to a voice request such as "I like a dark chewy chocolate" and analyzing it with the Cloud Speech API, which converts the audio data into text using Google's voice recognition technology. The text is then processed with the Cloud Natural Language API, whose syntactic analysis lets the system extract the important verbs, adjectives, and nouns in order to understand the meaning of your command.
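The keyword-extraction step can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's own code: it assumes the token layout of the Natural Language API's syntax analysis JSON response (a `text.content` string and a `partOfSpeech.tag` for each token). The sample response is hand-written for this example rather than real API output, and the helper name `extract_keywords` is hypothetical.

```python
# Part-of-speech tags the system is said to care about: verbs, adjectives, nouns.
KEEP_TAGS = {"VERB", "ADJ", "NOUN"}


def extract_keywords(syntax_response):
    """Return the lowercased words tagged as verb, adjective, or noun."""
    keywords = []
    for token in syntax_response.get("tokens", []):
        tag = token["partOfSpeech"]["tag"]
        if tag in KEEP_TAGS:
            keywords.append(token["text"]["content"].lower())
    return keywords


# Hand-written stand-in (an assumption, not real API output) for the
# syntax analysis of "I like a dark chewy chocolate".
sample_response = {
    "tokens": [
        {"text": {"content": "I"}, "partOfSpeech": {"tag": "PRON"}},
        {"text": {"content": "like"}, "partOfSpeech": {"tag": "VERB"}},
        {"text": {"content": "a"}, "partOfSpeech": {"tag": "DET"}},
        {"text": {"content": "dark"}, "partOfSpeech": {"tag": "ADJ"}},
        {"text": {"content": "chewy"}, "partOfSpeech": {"tag": "ADJ"}},
        {"text": {"content": "chocolate"}, "partOfSpeech": {"tag": "NOUN"}},
    ]
}

keywords = extract_keywords(sample_response)
print(keywords)
```

Pronouns and determiners are dropped, so only the words that carry the request's meaning are passed on to the recommendation step.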

The system then uses word2vec and regression (Inception-v3 + transfer learning) on Cloud Machine Learning (Cloud ML) to choose the best candy for your request. The algorithm can recognize the similarity of meaning between two words, such as "milky" and "creamy" or "hot" and "spicy", based on an analysis of natural-language text. With this technology, the system tries to recommend the candy most likely to fulfill the request.
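To illustrate how word vectors capture similarity of meaning, here is a small sketch that picks a candy by cosine similarity between word vectors. It is not the project's model: the 3-dimensional vectors, the candy catalog, and the `recommend` helper are all invented for this example, and a plain nearest-match rule stands in for the regression step the text mentions.

```python
import math

# Toy 3-dimensional vectors, invented for illustration. Real word2vec
# embeddings have hundreds of dimensions and are learned from text.
TOY_VECTORS = {
    "milky": [0.9, 0.1, 0.0],
    "creamy": [0.8, 0.2, 0.1],
    "spicy": [0.0, 0.9, 0.3],
    "hot": [0.1, 0.8, 0.4],
    "sour": [0.1, 0.1, 0.9],
}

# Hypothetical candy catalog: each candy is described by a few label words.
CANDY_LABELS = {
    "milk chocolate": ["milky"],
    "chili gum": ["spicy"],
    "lemon drop": ["sour"],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recommend(request_words):
    """Return the candy whose labels are most similar to the request words.

    Words without a vector are skipped; if none are known, every candy
    scores 0.0 and the first catalog entry is returned.
    """
    def score(labels):
        total = 0.0
        for word in request_words:
            if word not in TOY_VECTORS:
                continue
            total += max(
                cosine_similarity(TOY_VECTORS[word], TOY_VECTORS[label])
                for label in labels
            )
        return total

    return max(CANDY_LABELS, key=lambda candy: score(CANDY_LABELS[candy]))


print(recommend(["creamy"]))
print(recommend(["hot"]))
```

In this toy setup, "creamy" never appears as a candy label, yet it still matches the candy labeled "milky" because the two vectors point in nearly the same direction.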

Locating and serving a candy with image recognition on Cloud ML

Once the system has determined which type of candy to serve, it runs a deep learning model for image recognition on the camera image of the candies on the table. The model locates the target candy, and the robot arm serves it to you.
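The source does not describe the model's output format, so the following is only a sketch of the locating step under an assumed format: a list of detections, each with a label, a confidence score, and a pixel bounding box `(x_min, y_min, x_max, y_max)`. The function name `locate_target`, the `min_score` threshold, and the sample detections are all hypothetical.

```python
def locate_target(detections, target_label, min_score=0.5):
    """Return the (x, y) pixel center of the best match, or None if absent."""
    candidates = [
        d for d in detections
        if d["label"] == target_label and d["score"] >= min_score
    ]
    if not candidates:
        return None
    # Prefer the detection the model is most confident about.
    best = max(candidates, key=lambda d: d["score"])
    x_min, y_min, x_max, y_max = best["box"]
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


# Made-up detections for one camera frame (assumed format, not real output).
detections = [
    {"label": "chocolate", "score": 0.91, "box": (100, 40, 160, 100)},
    {"label": "chocolate", "score": 0.62, "box": (300, 200, 360, 260)},
    {"label": "gum", "score": 0.88, "box": (20, 220, 80, 280)},
]

print(locate_target(detections, "chocolate"))
print(locate_target(detections, "mint"))
```

A real controller would still have to convert the pixel center into robot arm coordinates, which depends on camera calibration and is outside this sketch.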

In training mode, the system uses the Inception model with transfer learning on Cloud ML to train the model within a couple of minutes, drawing on the computing power of Google Cloud. You can therefore bring any objects you like and train the system on the fly with high accuracy. It is designed not only for picking up candies but also as a versatile image recognition system for a wide variety of applications.
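The idea behind transfer learning is that a pretrained network is kept frozen as a feature extractor and only a small final classifier is trained on the new objects, which is why training can finish quickly. The sketch below shows that last step in plain Python: a softmax classifier trained on fixed feature vectors. It is an illustration under assumptions, not the project's training code: the 2-dimensional features stand in for the much larger feature vectors a network such as Inception-v3 would produce, and all names and hyperparameters are hypothetical.

```python
import math


def softmax(values):
    """Convert raw scores to probabilities (shifted by max for stability)."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def logits(weights, biases, x):
    """Raw class scores of the final layer for feature vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def train_last_layer(features, labels, num_classes, epochs=200, lr=0.5):
    """Train only a final softmax layer with stochastic gradient descent."""
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = softmax(logits(weights, biases, x))
            for c in range(num_classes):
                # Gradient of cross-entropy loss with respect to logit c.
                grad = probs[c] - (1.0 if c == y else 0.0)
                for i in range(dim):
                    weights[c][i] -= lr * grad * x[i]
                biases[c] -= lr * grad
    return weights, biases


def predict(weights, biases, x):
    """Return the index of the highest-scoring class."""
    scores = logits(weights, biases, x)
    return scores.index(max(scores))


# Toy stand-ins for frozen-network features of two object classes.
FEATURES = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
LABELS = [0, 0, 1, 1]

weights, biases = train_last_layer(FEATURES, LABELS, num_classes=2)
print(predict(weights, biases, [0.95, 0.15]))
print(predict(weights, biases, [0.15, 0.90]))
```

Because only the small final layer is updated, adding a new object class means collecting a few images, extracting their features once, and rerunning this inexpensive training loop.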

Serving mode

  • Android tablet: provides the UI and recognizes the voice command with the Speech and NL APIs
  • Controller PC (Linux) and camera: recognizes the candies with Cloud ML and controls the robot arm
  • Robot arm: picks up the candy, carries it to a designated location, and drops it

Candy type

The robot can handle two types of candy. Small candies are picked up with the gripper, which the robot correctly adjusts to account for rotation and so on. Boxed or smooth candies are picked up with a suction cup.

Learning mode

  • Android tablet: shows a UI with training progress updates
  • Controller PC: runs Inception-v3 + transfer learning on Cloud ML to train a model from scratch, with the camera image
  • More on learning mode

Setting things up

Troubleshooting

  • See this page in case you encounter any errors or unexpected behaviour.

Note

  • Learning mode uses Cloud ML training. Serving mode does not use Cloud ML prediction.
