
mosecorg / mosec


A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine

Home Page: https://mosec.readthedocs.io/

License: Apache License 2.0

Makefile 0.94% Python 68.89% Rust 28.76% Dockerfile 1.42%
model-serving deep-learning machine-learning neural-network mlops machine-learning-platform hacktoberfest gpu python pytorch

mosec's People

Contributors

cutecutecat, delonleo, dependabot[bot], ferdinandzhong, gaocegege, kemingy, lkevinzc, n063h, secsilm, thinkcache

mosec's Issues

[BUG] Invalid output when pyarrow plasma is used and ValidationError is raised

Describe the bug
When pyarrow plasma is used in a multi-stage setting and mosec.errors.ValidationError is raised, we get an invalid response. Example: ~��6,�&B���Ӣ��

To Reproduce

  1. Using the official example
from functools import partial

from pyarrow import plasma  # type: ignore

from mosec import Server, Worker
from mosec.errors import ValidationError
from mosec.plugins import PlasmaShmWrapper


class DataProducer(Worker):
    def forward(self, data: dict) -> bytes:
        try:
            data_bytes = b"a" * data["size"]
        except KeyError as err:
            raise ValidationError(err)
        return data_bytes


class DataConsumer(Worker):
    def forward(self, data: bytes) -> dict:
        return {"ipc test data length": len(data)}


if __name__ == "__main__":
    """
    We start a subprocess for the plasma server, and pass the path
    to the plasma client which serves as the shm wrapper.
    We also register the plasma server process as a daemon, so
    that when it exits, the service can shut down gracefully
    and be restarted by the orchestrator.
    """
    # 200 Mb store, adjust the size according to your requirement
    with plasma.start_plasma_store(plasma_store_memory=200 * 1000 * 1000) as (
        shm_path,
        shm_process,
    ):
        server = Server(
            ipc_wrapper=partial(  # defer the wrapper init to worker processes
                PlasmaShmWrapper,
                shm_path=shm_path,
            )
        )
        server.register_daemon("plasma_server", shm_process)
        server.append_worker(DataProducer, num=2)
        server.append_worker(DataConsumer, num=2)
        server.run()
  2. Command to run
curl -X POST http://localhost:8000/inference -d '{"wrong_key": 2}'
  3. Response from curl
1���6��.��2���{��r
  4. Expected output
validation error: 'size'
  5. Log from stdout
2022-03-04T05:59:40.629248Z  INFO mosec::protocol: abnormal tasks ids=[4] code=ValidationError
2022-03-04T06:00:22.816739Z  INFO mosec::protocol: abnormal tasks ids=[7] code=ValidationError
2022-03-04T06:00:23.513732Z  INFO mosec::protocol: abnormal tasks ids=[8] code=ValidationError
  6. Additional note
  • Only applies when pyarrow plasma is used.

Desktop (please complete the following information):

  • OS: Ubuntu 20.04
  • Library Version: 0.3.0
  • Python Version: 3.8.12
  • Python Plasma Version: 7.0.0

[BUG] Security vulnerabilities

cargo audit                                                                                                    
    Fetching advisory database from `https://github.com/RustSec/advisory-db.git`
      Loaded 395 security advisories (from /home/keming/.cargo/advisory-db)
    Updating crates.io index
    Scanning Cargo.lock for vulnerabilities (100 crate dependencies)
Crate:         chrono
Version:       0.4.19
Title:         Potential segfault in `localtime_r` invocations
Date:          2020-11-10
ID:            RUSTSEC-2020-0159
URL:           https://rustsec.org/advisories/RUSTSEC-2020-0159
Solution:      No safe upgrade is available!
Dependency tree:
chrono 0.4.19
└── tracing-subscriber 0.2.19
    └── mosec 0.3.1

Crate:         thread_local
Version:       1.1.3
Title:         Data race in `Iter` and `IterMut`
Date:          2022-01-23
ID:            RUSTSEC-2022-0006
URL:           https://rustsec.org/advisories/RUSTSEC-2022-0006
Solution:      Upgrade to >=1.1.4
Dependency tree:
thread_local 1.1.3
└── tracing-subscriber 0.2.19
    └── mosec 0.3.1

Crate:         tokio
Version:       1.9.0
Title:         Data race when sending and receiving after closing a `oneshot` channel
Date:          2021-11-16
ID:            RUSTSEC-2021-0124
URL:           https://rustsec.org/advisories/RUSTSEC-2021-0124
Solution:      Upgrade to >=1.8.4, <1.9.0 OR >=1.13.1
Dependency tree:
tokio 1.9.0
├── mosec 0.3.1
└── hyper 0.14.11
    └── mosec 0.3.1

error: 3 vulnerabilities found!

Will create a PR to upgrade the versions.

[FIX] apply `pylint` to tests and examples

We switched to pylint in #134, but it has only been applied to the code under the mosec/ directory.


[FEATURE] Support mini-batch request

Is your feature request related to a problem? Please describe.
Currently we are batching at the request level, ignoring the fact that a single request may already contain a mini-batch formed by the client.

Describe the solution you'd like
One option is to add a header specifying the batch size as an integer, and take it into account when batching.
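A minimal client-side sketch of that idea, assuming a hypothetical batch-size header and the default /inference route on port 8000 (neither the header name nor the server-side handling exists yet):

import json
import urllib.request

# Hypothetical request carrying a client-formed mini-batch of 4 items.
# The "batch-size" header is an assumption for illustration; the server
# would read it to weight this request when forming dynamic batches.
payload = json.dumps([{"text": f"sample {i}"} for i in range(4)]).encode()
req = urllib.request.Request(
    "http://localhost:8000/inference",
    data=payload,
    headers={"Content-Type": "application/json", "batch-size": "4"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read())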

chore: better README

The current README doesn't give users a very intuitive introduction to the features of MOSEC.

Let's make it more user-friendly.

  • image to demonstrate the features
  • better tutorial example

[FEATURE] Async Inference

This can offer the client side the flexibility to do independent computation while waiting for the model inference result.

Maybe two APIs:

  1. /inference_async/put, which returns a request id (rid);
  2. /inference_async/get.

Python code should not be affected; only minor modifications are needed on the Rust side.
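A client-side sketch of the proposed flow, assuming the two endpoints above, a JSON response from put containing an rid field, and a 202 status from get while the result is still pending (all of these shapes are assumptions, not an existing API):

import json
import time
import urllib.request

BASE = "http://localhost:8000"

# Submit the job; the proposed put endpoint returns a request id (rid).
put_req = urllib.request.Request(f"{BASE}/inference_async/put", data=b'{"size": 2}')
rid = json.loads(urllib.request.urlopen(put_req).read())["rid"]  # assumed response shape

# The client is free to do independent work here, then poll for the result.
while True:
    get_resp = urllib.request.urlopen(f"{BASE}/inference_async/get?rid={rid}")  # assumed query param
    if get_resp.status == 200:  # assume 202 means "not ready yet"
        print(get_resp.read())
        break
    time.sleep(0.1)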

[FEATURE] Quick and easy demo

  • Provide a Colab example, which should be much more efficient than having users copy and run the code themselves
  • Provide an out-of-the-box sample YAML for cloud deployment (low priority)

[FEATURE] Benchmark for AI systems

Our community lacks benchmarks for server-side, end-to-end AI systems, for example face recognition systems, smart traffic systems with tens of models, and OCR systems. A complete comparison of throughput and latency across different hardware platforms and libraries is necessary. Usability might also be considered. This would benefit both hardware manufacturers and the democratization of AI.

make sure all the examples can run with the latest code

How about we lint all the code locally, including the code under examples, but skip that in CI? The local dev environment should have those third-party packages installed, and fully linting the examples benefits code style.
It's better to have a CI job that makes sure all the examples can run with the latest code.

Originally posted by @kemingy in #180 (comment)
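A rough sketch of what such a check could look like, assuming the examples are standalone scripts under an examples/ directory and that surviving a short timeout counts as a successful start (both are assumptions about how the CI job would be wired up):

import pathlib
import subprocess
import sys

# Smoke-test every example: each script should start and stay alive briefly
# without crashing; hitting the timeout means the server kept running, i.e. a pass.
failed = []
for script in sorted(pathlib.Path("examples").glob("*.py")):  # assumed layout
    try:
        proc = subprocess.run(
            [sys.executable, str(script)], timeout=10, capture_output=True
        )
        if proc.returncode != 0:
            failed.append(script.name)
    except subprocess.TimeoutExpired:
        pass  # still serving after 10 seconds: treat as healthy

if failed:
    sys.exit(f"examples failed to start: {failed}")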

[BUG] NameError: name 'plasma' is not defined

Describe the bug
NameError: name 'plasma' is not defined

To Reproduce
Steps to reproduce the behavior:

import base64
import json
import os

import cv2
import numpy as np
import yaml
from mosec import Worker, Server
from mosec.errors import ValidationError

from inference import Inference


def _get_model():
    if os.path.exists('config.yaml'):
        # Load the YAML config with a context manager so the file is closed properly.
        with open('config.yaml', mode='r', encoding='utf-8') as config_file:
            config = yaml.load(config_file, Loader=yaml.FullLoader)

        model = Inference(
            det_model_path=config['det_model_path'],
            rec_model_path=config['rec_model_path'],
            device=config['device'],
            dict_path=config['dict_path'],
            rec_std=0.5, rec_mean=0.5, threshold=0.7,
            angle_classes=config['angle_classes'],
            angle_classify_model_path=config['angle_model_path'],
            object_classes=None,
            object_classify_model_path=None
        )
        return model, config
    else:
        raise FileNotFoundError('must have a config.yaml file!')


class OCRInference(Worker):
    def __init__(self):
        super(OCRInference, self).__init__()
        self.model, self.config = _get_model()

    def forward(self, req: dict):
        try:
            image = req["image"]
            save_name = req['saveName']
            im = np.frombuffer(base64.b64decode(image), np.uint8)
            im = cv2.imdecode(im, 1)
            result = self.model.infer(img=im,
                                      img_save_name=save_name,
                                      cut_image_save_path=self.config['cut_image_save_path'],
                                      need_angle=self.config['need_angle'],
                                      need_object=self.config['need_object'])
            return json.dumps({'status': 1, 'result': result})

        except KeyError as err:
            raise ValidationError(f"cannot find key {err}")
        except Exception as err:
            raise ValidationError(f"cannot decode as image data: {err}")


if __name__ == "__main__":
    server = Server()

    server.append_worker(OCRInference, num=2, max_batch_size=16)
    server.run()

Desktop (please complete the following information):

  • OS: [e.g. Ubuntu 20.04]
  • Library Version: [e.g. 0.1.0]
  • Rust Version: [e.g. 1.55.0]
  • Python Version: [e.g. 3.8.5]

Additional context
Add any other context about the problem here.

Thanks for your great contribution :)

2021/12/09 Update: I sincerely apologize for abusing the GitHub issue system, and we'll find more proper ways to communicate with contributors in the future.

Hi kemingy
We are so appreciative of your contribution to fixing pylint rules through the "Let's become Taichi contributors within 10 minutes" program. Please send an email to [email protected] with your mailing address, and we will deliver a hoodie to you. :) Looking forward to hearing from you!

Thank you very much for participating in the 0th "Become a Taichi contributor in ten minutes" event and for your valuable contribution to fixing the pylint command issues!
As a token of our thanks, some Taichi swag will be heading your way soon. Please send your shipping details to our community operator at [email protected].
We welcome you to keep following the Taichi community!
Sincerely, the Taichi open source community

[BUG] `GLIBC_2.29` not found in Ubuntu 18.04

Describe the bug
mosec raises the following error on Ubuntu 18.04:

/data/anaconda3/envs/mosec-p38/lib/python3.8/site-packages/mosec/bin/mosec: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by /data/anaconda3/envs/mosec-p38/lib/python3.8/site-packages/mosec/bin/mosec)

To Reproduce
Just run the Sentiment Analysis example.

Desktop (please complete the following information):

  • OS: Ubuntu 18.04
  • Library Version: 0.2.1
  • Rust Version: unknown
  • Python Version: 3.8.12

Additional context
I tried to use mosec on Ubuntu 20.04 and it worked. So if Ubuntu 20.04 or above is required, I think it's better to state that in the documentation.

[FEATURE] add gRPC service support

Requirements:

  • the Rust part doesn't parse the content; it passes all the bytes to the Python workers
  • users can define their own protobuf file to serialize or deserialize the data (see the sketch below)
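A sketch of what the Python side could look like under this design, reusing mosec's existing Worker serialize/deserialize hooks for illustration; the my_pb2 module and its Request/Response messages are hypothetical user-generated protobuf code, and the Rust side would simply forward the raw bytes:

from mosec import Server, Worker

# Hypothetical classes generated by protoc from a user-defined .proto file.
from my_pb2 import Request, Response  # assumption: user-provided protobuf module


class ProtoWorker(Worker):
    def deserialize(self, data: bytes) -> Request:
        # Decode the raw bytes forwarded by the Rust side.
        req = Request()
        req.ParseFromString(data)
        return req

    def serialize(self, data: Response) -> bytes:
        return data.SerializeToString()

    def forward(self, req: Request) -> Response:
        # Placeholder model logic: echo the request field back in upper case.
        return Response(text=req.text.upper())


if __name__ == "__main__":
    server = Server()
    server.append_worker(ProtoWorker, num=1)
    server.run()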
