This is my personal sandbox for testing various machine learning models in Python. It started with experiments on human pose detection (hence the name).
To use these scripts, you need Intel's OpenVino toolkit installed on your machine, as well as OpenCV. You can read more about both below.
- OpenCV - a simple install should look like:
```
pip install opencv-python
```
- OpenVino toolkit - see the website for installation instructions depending on your configuration.
The scripts main.py and mainpi.py are my (Windows) desktop and Raspberry Pi tests on human pose detection.
For these tests, I used the human-pose-estimation-0001 model, included in the standard distribution of OpenVino.
This is a multi-person 2D pose estimation network (based on the OpenPose approach) with a tuned MobileNet v1 as a feature extractor. It finds a human pose, i.e. a body skeleton consisting of keypoints and the connections between them, for every person in the image. The pose may contain up to 18 keypoints: ears, eyes, nose, neck, shoulders, elbows, wrists, hips, knees and ankles.
More information here.
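As a rough illustration of what the model's output represents, here is a plain-Python sketch of the 18 keypoints and how they can be paired into drawable skeleton limbs. The keypoint order follows the OpenPose convention; the exact pair list below is illustrative and should be checked against the model documentation.

```python
# 18 keypoints in the OpenPose-style order used by
# human-pose-estimation-0001 (illustrative; verify against the docs).
KEYPOINTS = [
    "nose", "neck",
    "r_shoulder", "r_elbow", "r_wrist",
    "l_shoulder", "l_elbow", "l_wrist",
    "r_hip", "r_knee", "r_ankle",
    "l_hip", "l_knee", "l_ankle",
    "r_eye", "l_eye", "r_ear", "l_ear",
]

# Limb connections as (from, to) indices into KEYPOINTS.
SKELETON = [
    (1, 0), (1, 2), (2, 3), (3, 4), (1, 5), (5, 6), (6, 7),
    (1, 8), (8, 9), (9, 10), (1, 11), (11, 12), (12, 13),
    (0, 14), (0, 15), (14, 16), (15, 17),
]

def skeleton_segments(points):
    """Given one (x, y) tuple per keypoint (or None when a keypoint
    was not detected), return the line segments to draw."""
    segments = []
    for a, b in SKELETON:
        if points[a] is not None and points[b] is not None:
            segments.append((points[a], points[b]))
    return segments
```

In practice the (x, y) coordinates come from the network's heatmaps, one per keypoint and per person.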
The scripts emotion.py and emotionpi.py are my (Windows) desktop and Raspberry Pi tests on emotion recognition.
For these tests, I used the emotions-recognition-retail-0003 model, included in the standard distribution of OpenVino.
It's a fully convolutional network for recognition of five emotions ('neutral', 'happy', 'sad', 'surprise', 'anger').
More information here.
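To show what decoding such a network's output amounts to, here is a minimal sketch that maps a raw 1x5 score vector to one of the five emotion labels. The assumption that the output needs a softmax is mine; depending on the model version the network may already emit probabilities.

```python
import math

# The five classes recognised by emotions-recognition-retail-0003.
EMOTIONS = ["neutral", "happy", "sad", "surprise", "anger"]

def decode_emotion(scores):
    """Return (label, probability) for the strongest emotion.

    Applies a numerically stable softmax first; skip it if your
    model already outputs probabilities.
    """
    exp = [math.exp(v - max(scores)) for v in scores]
    total = sum(exp)
    probs = [v / total for v in exp]
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs[best]
```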
The script bert.py is my test of similar-question recognition with BERT.
The script bert_questions.py takes the file questions.txt for reference and asks the user to input a new question. It tries to find the 5 most similar questions in its reference file.
BERT (Bidirectional Encoder Representations from Transformers) provides dense vector representations for natural language by using a deep, pre-trained neural network with the Transformer architecture. It was originally published by Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova: "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding", 2018.
I based my tests on the "multilingual_L-12_H-768_A-12" model that I found on the Google Research Github page.
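The retrieval step of bert_questions.py boils down to comparing sentence embeddings. The sketch below shows the idea with cosine similarity over plain lists of floats; the function names are illustrative, not the actual script's API.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k(query_vec, reference_vecs, k=5):
    """Indices of the k reference vectors most similar to the query,
    most similar first."""
    ranked = sorted(range(len(reference_vecs)),
                    key=lambda i: cosine(query_vec, reference_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

In the real script, each vector would be the BERT embedding of one question from questions.txt.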
The script cat.py is the test base of what became the AceVINOtura project, built with help from some other Udacity students (see acknowledgements).
A lot of things are hardcoded in this script:
- the paths to the source video file and the generated output file
- the path to the (Windows) OpenVino CPU extension
- the path to the model (`frozen_inference_graph.xml`)
- the coordinates of the "forbidden zone", i.e. the zone where the cat should not go
- the confidence threshold for the model detections
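The zone check itself is simple geometry: the detector yields bounding boxes, and an alert can fire when a confident detection overlaps the hardcoded zone. This is a sketch under my own conventions, with boxes and the zone as (x_min, y_min, x_max, y_max) tuples; the coordinates and threshold are illustrative, not the script's actual values.

```python
def boxes_overlap(a, b):
    """True if two (x_min, y_min, x_max, y_max) rectangles intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def cat_in_forbidden_zone(detections, zone, threshold=0.5):
    """True if any detection above the confidence threshold overlaps
    the forbidden zone. Each detection is a (confidence, box) pair."""
    return any(conf >= threshold and boxes_overlap(box, zone)
               for conf, box in detections)
```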
The model used (and not included in this repository) is an OpenVino IR converted from the ssdlite_mobilenet_v2_coco model of the TensorFlow model zoo.
Conversion was made with the following command:
```
python mo_tf.py --input_model ssdlite_mobilenet_v2_coco\frozen_inference_graph.pb --tensorflow_use_custom_operations_config extensions\front\tf\ssd_v2_support.json --tensorflow_object_detection_api_pipeline_config ssd_mobilenet_v2_coco.config --data_type FP16
```
The script perf_counts.py is just a quick and simple example of running the get_perf_counts() method on an inference request with the OpenVino framework. It is a very nice diagnostic tool and I'm glad to have discovered it.
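get_perf_counts() reports per-layer timing, which makes it easy to spot bottlenecks. Below is a small helper that ranks layers by execution time; the dict layout assumed here (one entry per layer with a "real_time" field, in microseconds) matches what I observed with my OpenVino version, but check yours.

```python
def slowest_layers(perf_counts, n=3):
    """Return [(layer_name, real_time_us)] for the n slowest layers,
    slowest first. perf_counts is the dict returned by
    get_perf_counts() on an inference request (layout assumed)."""
    timed = [(name, info.get("real_time", 0))
             for name, info in perf_counts.items()]
    return sorted(timed, key=lambda t: t[1], reverse=True)[:n]
```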
These scripts are my work for personal testing of various models. They are based on course content from the Udacity "Intel Edge AI Scholarship Foundation Course".
The tokenisation.py and input_feature.py scripts used for my BERT tests are the work of the Google AI Language Team and licensed under the Apache License, Version 2.0 (the "License").