Model Server for Apache MXNet

Apache MXNet Model Server (MMS) is a flexible and easy to use tool for serving deep learning models exported from MXNet or the Open Neural Network Exchange (ONNX).

Use the MMS Server CLI, or the pre-configured Docker images, to start a service that sets up HTTP endpoints to handle model inference requests.

A quick overview and examples for both serving and exporting are provided below. Detailed documentation and examples are provided in the docs folder.

Contents of this Document

  • Quick Start
  • Serve a Model
  • Export a Model
  • Production Deployments
  • Other Features
  • Contributing

Quick Start

Install with pip

If you plan to use the ONNX features, you will need to have the protobuf compiler installed.

To install MMS with ONNX support, make sure you have Python installed, then for Ubuntu run:

sudo apt-get install protobuf-compiler libprotoc-dev
pip install mxnet-model-server

Or for Mac run:

conda install -c conda-forge protobuf
pip install mxnet-model-server

See the advanced installation page for more options and troubleshooting.

Serve a Model

Once installed, you can get MMS model serving up and running very quickly. Try out --help to see the kind of features that are available.

mxnet-model-server --help

For this quick start, we'll skip over most of the features, but be sure to take a look at the full server docs when you're ready.

Here is an easy example for serving an object classification model:

mxnet-model-server \
  --models squeezenet=https://s3.amazonaws.com/model-server/models/squeezenet_v1.1/squeezenet_v1.1.model

With the command above executed, you have MMS running on your host, listening for inference requests.

To test it out, open a new terminal window next to the one running MMS. Then use curl to download one of these cute pictures of a kitten; curl's -O flag saves it under its remote name, kitten.jpg. Then curl a POST to the MMS predict endpoint with the kitten image.

In the example below, we provide a shortcut for these steps.

curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
curl -X POST http://127.0.0.1:8080/squeezenet/predict -F "data@kitten.jpg"

The predict endpoint will return a prediction response in JSON. It will look something like the following result:

{
  "prediction": [
    [
      {
        "class": "n02124075 Egyptian cat",
        "probability": 0.9408261179924011
      },
      {
        "class": "n02127052 lynx, catamount",
        "probability": 0.055966004729270935
      },
      {
        "class": "n02123045 tabby, tabby cat",
        "probability": 0.0025502564385533333
      },
      {
        "class": "n02123159 tiger cat",
        "probability": 0.00034320182749070227
      },
      {
        "class": "n02123394 Persian cat",
        "probability": 0.00026897044153884053
      }
    ]
  ]
}

You will see this result in the response to your curl call to the predict endpoint, as well as in the server logs in the terminal window running MMS. The result is also logged locally, along with metrics.
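
If you would rather call the endpoint from Python, here is a minimal sketch using the third-party requests library. It assumes the quick-start defaults from above (host 127.0.0.1, port 8080, a model named squeezenet, and a multipart form field named data, matching the curl example):

import requests  # third-party HTTP client: pip install requests

# Predict endpoint from the quick-start example above.
url = "http://127.0.0.1:8080/squeezenet/predict"

# POST the image as a multipart form field named "data",
# mirroring curl's -F "data@kitten.jpg".
with open("kitten.jpg", "rb") as f:
    response = requests.post(url, files={"data": f})

print(response.json())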

Other models can be downloaded from the model zoo, so try out some of those as well.

Now you've seen how easy it can be to serve a deep learning model with MMS! Would you like to know more?

Export a Model

MMS enables you to package all of your model artifacts into a single model archive that you can then easily share or distribute. To export a model, follow these two steps:

1. Download Model Artifacts (if you don't have them handy)

Model-Artifacts.zip - 5 MB

Then extract the zip file to see the following model artifacts:

  • Model Definition (json file) - contains the layers and overall structure of the neural network
  • Model Params and Weights (params file) - contains the parameters and the weights
  • Model Signature (json file) - defines the inputs and outputs that MMS expects to hand off to the API (a sketch appears after this list)
  • assets (text files) - auxiliary files that support model inference such as vocabularies, labels, etc. and vary depending on the model
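
As a rough illustration (not extracted from the download above), a model signature for an image classifier might look something like the sketch below; the field names, content types, and shapes are assumptions for a single 224x224 RGB input, not the actual contents of the zip:

{
  "input_type": "image/jpeg",
  "inputs": [
    {
      "data_name": "data",
      "data_shape": [1, 3, 224, 224]
    }
  ],
  "output_type": "application/json",
  "outputs": [
    {
      "data_name": "softmax",
      "data_shape": [1, 1000]
    }
  ]
}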

Further details on these files, custom services, and advanced exporting features can be found on the Exporting Models for Use with MMS page in the docs folder.

2. Export Your Model

With the model artifacts available locally, you can use the mxnet-model-export CLI to generate a .model file that can be used to serve an inference API with MMS.

Open your terminal and go to the folder that has the files you just downloaded.

In this next step, we'll run mxnet-model-export, setting the model's prefix to squeezenet_v1.1 with the --model-name argument and pointing --model-path at the model's artifacts.

mxnet-model-export --model-name squeezenet_v1.1 --model-path .

This will output squeezenet_v1.1.model in the current working directory, and it assumes all of the model artifacts are also in the current working directory. Otherwise, instead of . you would use a path to the artifacts. This file is all you need to run MMS, serving inference requests for a simple image recognition API. Go back to the Serve a Model tutorial above and try to run this model that you just exported!
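
For example, assuming the export above produced squeezenet_v1.1.model in your current working directory, serving it locally should look like this (the same --models syntax as the quick start, just pointing at a local file instead of a URL):

mxnet-model-server \
  --models squeezenet=squeezenet_v1.1.model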

To learn more about exporting, check out the MMS export documentation.

Production Deployments

When launched directly, MMS uses a standalone Flask server. This is handy for testing and development, but for production deployments we recommend using Gunicorn, which should provide lower latency, higher throughput, and more efficient use of memory.
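
As a rough sketch of that idea (the WSGI module path below is a placeholder, not the actual MMS entry point; see the Dockerfiles mentioned below for the supported setup):

# Illustrative only: run a WSGI app under Gunicorn with 4 worker processes.
# Replace <wsgi_module>:app with the real application entry point.
gunicorn --workers 4 --bind 0.0.0.0:8080 <wsgi_module>:app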

This project includes Dockerfiles to build containers recommended for production deployments. These containers demonstrate how to set up a production stack consisting of nginx, gunicorn, and MMS. The basic usage can be found on the Docker readme.

Other Features

Browse over to the Docs readme for the full index of documentation. This includes more examples, how to customize the API service, API endpoint details, and more.

Contributing

We welcome all contributions!

To report a bug or request a feature, please open a GitHub issue. Pull requests are welcome.
