eora-ai / inferoxy
Service for quickly deploying and using dockerized Computer Vision models
License: GNU General Public License v3.0
There is no API documentation at this time.
There are currently up and down triggers for models. However, if Inferoxy is used to host a single model, there is no need to release that model: it should run without interruption to avoid the "cold start" effect. Otherwise, we wait about 10 seconds for a model to start even when a very simple model is used.
We need to send multiple images per request item, for example when running a model for an image-retrieval task that accepts images of a product from different viewing angles. The number of images per product is arbitrary, so it looks like we need a request object holding a list of tensors of different sizes.
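A minimal sketch of such a request item (the field names here are hypothetical, not Inferoxy's real API): one item carries an arbitrary number of image tensors, stubbed below as nested lists with different shapes.

```python
# Hypothetical request-item structure: an arbitrary number of tensors,
# possibly of different sizes, bundled into one item.
def make_item(source_id, tensors, **parameters):
    """Bundle a variable number of tensors into one request item."""
    return {"source_id": source_id, "parameters": parameters, "inputs": tensors}

front = [[0] * 224 for _ in range(224)]  # stand-in for a 224x224 image
side = [[0] * 200 for _ in range(300)]   # a differently sized 300x200 image
item = make_item("shop-42", [front, side], product_id=7)

print(len(item["inputs"]))  # 2 tensors of different sizes in one item
```

The batcher would then have to pad or process such items tensor-by-tensor, since they cannot be stacked into a single fixed-shape array.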
I have a problem with very slow request processing.
Ten parallel (or sequential) requests, each with a single image as input, take about 8 minutes to process, while one single (first) request completes in about 10 seconds.
Judging by the logs, Inferoxy seems to wait for something before processing the image.
There is a log:
log.log
The model can be downloaded from Docker Hub:
docker pull smthngslv/clip_vit-b32_no_proj:latest
Code, image, log.
Archive.zip
Also, is there a setting that prevents the container from being shut down between requests (i.e. runs sequential requests on the same container)?
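Judging by the `keep_model` check quoted from the load analyzer code elsewhere in these issues, raising that timeout may approximate "keep the container alive between requests". A sketch, assuming the config is YAML and that this key path is correct:

```yaml
# Assumed config fragment: keep an idle instance alive much longer
# (here 24 h) so sequential requests reuse the same container.
load_analyzer:
  stateful_checker:
    keep_model: 86400
```

This is a workaround, not a documented "never shut down" switch; the instance would still be stopped after the timeout expires.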
It would be helpful to have a Helm chart that automatically performs the steps described in this tutorial.
Pass a video to Inferoxy for processing. It is processed using some model and returned the same way it was sent.
Examples:
Input: https://api.dev.visionhub.ru/public_media/66218c4f-7b14-45b0-8669-370dee03ffa6
Output: https://api.dev.visionhub.ru/public_media/1eb3f365-00d7-4b66-a134-0d3b952eb89c
There is frame dropping.
We cannot guarantee order because of retriable errors. For example, suppose three batches are passed to Inferoxy in order:
3 -> 2 -> 1 -> Inferoxy
The first batch is processed. When the second batch starts processing, the model fails because its processing instance has disappeared.
We put the second batch at the end of the queue:
2 -> 3 -> 1 -> Inferoxy
So again we don't guarantee order, but users can send an index of the input in the parameters
and sort the output on their side.
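The client-side workaround above can be sketched as follows (field names are hypothetical): attach an `index` parameter to every input and restore the original order after Inferoxy returns results in whatever order retries produced.

```python
# Restore input order on the client side after out-of-order delivery.
def restore_order(responses):
    """Sort response items by the index the client put in parameters."""
    return sorted(responses, key=lambda r: r["parameters"]["index"])

# Responses arrive out of order (e.g. batch 2 was retried after batch 3):
responses = [
    {"parameters": {"index": 2}, "output": "b"},
    {"parameters": {"index": 3}, "output": "c"},
    {"parameters": {"index": 1}, "output": "a"},
]
ordered = restore_order(responses)
print([r["output"] for r in ordered])  # ['a', 'b', 'c']
```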
Let's imagine a stateful model that consumes 2 GB of GPU memory while processing an RTSP stream. On a GPU with 16 GB of memory we could theoretically run up to 8 copies of such a model. This is currently possible with stateless models, but RTSP streams are usually processed by people trackers or similar, which are stateful. The issue was first mentioned by @tz3, who is designing a system where 8 RTSP streams are processed in parallel.
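The back-of-the-envelope replica count from the example above, ignoring framework and CUDA overhead:

```python
# How many copies of a model fit on one GPU (overhead ignored).
def max_replicas(gpu_memory_gb, model_memory_gb):
    return gpu_memory_gb // model_memory_gb

print(max_replicas(16, 2))  # 8 stateful instances, one per RTSP stream
```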
Even a simple REST API request to a CPU model is processed slowly (about 10 seconds). The cause is probably latency, since the computational overhead is minimal. We need to profile the network during simple requests.
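A minimal profiling sketch for this: time each phase of a request separately to see where the ~10 s goes. `send_request` below is a stand-in stub; in practice it would be the actual HTTP call to the Inferoxy REST endpoint.

```python
import time

def profile(phases):
    """Run named phases and report how long each one took, in seconds."""
    timings = {}
    for name, fn in phases:
        start = time.perf_counter()
        fn()
        timings[name] = time.perf_counter() - start
    return timings

def send_request():
    time.sleep(0.05)  # stub standing in for the real network round trip

timings = profile([("connect", lambda: None), ("request", send_request)])
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1000:.1f} ms")
```

If "request" dominates while the model's own inference time is small, the bottleneck is in transport or queueing rather than computation.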
In the documentation:
Down triggers (↓):
time of last use for source_id > T_max: in this case either the model is released or the instance is stopped, depending on whether there are incoming requests to this model
But now in code:
if (
    time.time() - model_instance.sender.get_time_of_last_sent_batch()
    > self.config.load_analyzer.stateful_checker.keep_model
):
    triggers += [self.make_decrease_trigger(model_instance=model_instance)]
There is no check for incoming requests, just the deletion procedure.
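A self-contained sketch of what the documented behavior would look like: only emit a decrease trigger when the instance is idle past `keep_model` AND there are no incoming requests for the model. The `has_incoming_requests` flag is a hypothetical input, not an existing Inferoxy method.

```python
import time

KEEP_MODEL = 10.0  # seconds; stands in for stateful_checker.keep_model

def should_decrease(last_sent_batch_at, has_incoming_requests, now=None):
    """Return True only if the instance is idle AND nothing is queued."""
    now = time.time() if now is None else now
    idle_too_long = now - last_sent_batch_at > KEEP_MODEL
    return idle_too_long and not has_incoming_requests

# Idle past the threshold but requests are queued: keep the instance.
print(should_decrease(0.0, has_incoming_requests=True, now=100.0))   # False
# Idle and nothing queued: releasing/stopping is fine.
print(should_decrease(0.0, has_incoming_requests=False, now=100.0))  # True
```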
Add a GitHub Action that will automatically create releases.
The following links may be useful:
https://github.com/thomaseizinger/github-action-gitflow-release-workflow
https://blog.eizinger.io/12274/using-github-actions-and-gitflow-to-automate-your-release-process
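A minimal workflow sketch for this, assuming tag-triggered releases (the tag pattern and release action would need adjusting to the gitflow setup described in the links above):

```yaml
# Hypothetical workflow: create a GitHub release when a version tag is pushed.
name: release
on:
  push:
    tags:
      - "v*"
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Create GitHub release
        uses: softprops/action-gh-release@v1
        with:
          generate_release_notes: true
```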