
This project forked from kserve/modelmesh


Distributed Model Serving Framework

License: Apache License 2.0


ModelMesh

The ModelMesh framework is a mature, general-purpose model serving management/routing layer designed for high-scale, high-density and frequently-changing model use cases. In essence, it acts as a distributed LRU cache for serving runtime models.
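To make the "distributed LRU cache" analogy concrete, here is a minimal single-process sketch in Python. It is not ModelMesh's implementation: the ModelCache class, the abstract capacity units, and the load_fn/unload_fn callbacks are all hypothetical names chosen for illustration, and the real framework makes these decisions across a cluster of instances rather than in one process.

```python
from collections import OrderedDict


class ModelCache:
    """Toy LRU cache of loaded models (illustrative only, not ModelMesh code)."""

    def __init__(self, capacity, load_fn, unload_fn):
        self.capacity = capacity      # total size budget, in arbitrary units
        self.load_fn = load_fn        # hypothetical: model_id -> (model, size)
        self.unload_fn = unload_fn    # hypothetical: model_id -> None
        self.entries = OrderedDict()  # model_id -> (model, size), oldest first
        self.used = 0

    def get(self, model_id):
        if model_id in self.entries:
            # Cache hit: mark the model as most recently used.
            self.entries.move_to_end(model_id)
            return self.entries[model_id][0]
        # Cache miss: load the model, then evict least recently used
        # models until the new one fits within the budget.
        model, size = self.load_fn(model_id)
        while self.entries and self.used + size > self.capacity:
            evicted_id, (_, evicted_size) = self.entries.popitem(last=False)
            self.unload_fn(evicted_id)
            self.used -= evicted_size
        # Note: a model larger than the whole budget is still admitted here.
        self.entries[model_id] = (model, size)
        self.used += size
        return model
```

With a capacity of 2 and models of size 1, requesting a, b, a, c in that order unloads b, because a was touched more recently.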

If you want to deploy and manage ModelMesh, use the ModelMesh Serving repo instead, which hosts the controller for it. The instructions here are for development purposes only.

Quick-Start

  1. Wrap your model-loading and invocation logic in this model-runtime.proto gRPC service interface
    • runtimeStatus() - called only during startup to obtain some basic configuration parameters from the runtime, such as version, capacity, model-loading timeout
    • loadModel() - load the specified model into memory from backing storage, returning when complete
    • modelSize() - determine the size (memory usage) of a previously loaded model. If the size can be computed very quickly, this method can be omitted and the size returned in the response from loadModel instead
    • unloadModel() - unload a previously loaded model, returning when complete
    • Use a separate, arbitrary gRPC service interface for model inferencing requests. It can have any number of methods and they are assumed to be idempotent. See predictor.proto for a very simple example.
    • The methods of your custom applier interface will be called only for already fully-loaded models.
  2. Build a gRPC server Docker container which exposes these interfaces on localhost port 8085 or via a mounted Unix domain socket
  3. Extend the Kustomize-based Kube configs to use your Docker image, with appropriate memory and CPU resource allocations for your container
  4. Deploy to a Kubernetes cluster as a regular Service, which exposes this gRPC service interface via kube-dns (you do not implement this interface yourself). Consume it from your upstream service components using the gRPC client of your choice:
    • registerModel() and unregisterModel() for registering/removing models managed by the cluster
    • Any custom inferencing interface methods, to invoke a previously registered model at runtime. Make sure to set an mm-model-id or mm-vmodel-id metadata header (or the -bin suffix equivalents for UTF-8 ids)
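The runtime-side contract in step 1 and the header requirement in step 4 can be sketched in plain Python as follows. This is a sketch under stated assumptions, not the real gRPC service: InMemoryRuntime, its storage dict, and model_id_metadata are hypothetical names, and choosing the -bin header only for non-ASCII ids is an assumption about how the two header forms are meant to be used.

```python
class InMemoryRuntime:
    """Toy stand-in for the model-runtime lifecycle (not the real gRPC service)."""

    def __init__(self, storage):
        self.storage = storage  # hypothetical backing store: model_id -> bytes
        self.loaded = {}        # model_id -> model bytes held in memory

    def load_model(self, model_id):
        # Returns only once the model is fully in memory, as loadModel() should.
        self.loaded[model_id] = self.storage[model_id]
        # The size is cheap to compute here, so report it with the load result.
        return self.model_size(model_id)

    def model_size(self, model_id):
        # Memory usage of a previously loaded model, in bytes.
        return len(self.loaded[model_id])

    def unload_model(self, model_id):
        # Returns once the model is released; unknown ids are ignored.
        self.loaded.pop(model_id, None)


def model_id_metadata(model_id, vmodel=False):
    """Build the request metadata pair that names the target model."""
    key = "mm-vmodel-id" if vmodel else "mm-model-id"
    if model_id.isascii():
        return [(key, model_id)]
    # gRPC binary headers end in "-bin" and carry bytes values.
    return [(key + "-bin", model_id.encode("utf-8"))]
```

With the grpcio package, a list of this shape is what a client would pass as the metadata argument of a stub call on the custom inferencing interface.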

Deployment and Upgrades

Prerequisites:

  • An etcd cluster (shared or otherwise)
  • A Kubernetes namespace with the etcd cluster connection details configured as a secret key in this JSON format
    • Note that if provided, the root_prefix attribute is used as a key prefix for all of the framework's use of etcd

From an operational standpoint, ModelMesh behaves just like any other homogeneous clustered microservice. This means it can be deployed, scaled, migrated and upgraded as a regular Kubernetes deployment without any special coordination needed, and without any impact to live service usage.

In particular, the procedure for live-upgrading either the framework container or the service runtime container is the same: change the image version in the deployment config YAML, then apply the change with kubectl apply -f model-mesh-deploy.yaml

Build

Sample build:

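# Full commit SHA, passed to the image as the commitSha build arg
GIT_COMMIT=$(git rev-parse HEAD)
# Build id: current date plus the first five characters of the commit SHA
BUILD_ID=$(date '+%Y%m%d')-$(git rev-parse HEAD | cut -c -5)
IMAGE_TAG_VERSION=0.0.1
# Tag format: <version>-<branch>_<build id>
IMAGE_TAG=${IMAGE_TAG_VERSION}-$(git branch --show-current)_${BUILD_ID}

docker build -t "model-mesh:${IMAGE_TAG}" \
    --build-arg imageVersion="${IMAGE_TAG}" \
    --build-arg buildId="${BUILD_ID}" \
    --build-arg commitSha="${GIT_COMMIT}" .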

Contributors

amnpandey, kserve-oss-bot, njhill, pvaneck
