zchipitt / inference-server

This project forked from xilinx/inference-server

Home Page: https://xilinx.github.io/inference-server/

License: Apache License 2.0

AMD Inference Server

The AMD Inference Server is an open-source tool to deploy your machine learning models and make them accessible to clients for inference. Out of the box, the server supports selected models that run on AMD CPUs, GPUs or FPGAs by leveraging existing libraries. For all these models and hardware accelerators, the server presents a common user interface based on community standards, so clients can make requests to any of them using the same API. The server provides HTTP/REST and gRPC interfaces for clients to submit requests. For both, there are C++ and Python bindings to simplify writing client programs. You can also call the server backend directly through the native C++ API to write local applications.

Features

  • Supports client requests over HTTP/REST, gRPC and WebSocket protocols, with an API based on KServe's v2 specification
  • Custom applications can call the backend directly through the native C++ API, bypassing the other protocols
  • C++ library with Python bindings to simplify making requests to the server
  • Incoming requests are transparently batched based on the user specifications
  • Users can define how many models, and how many instances of each, to run in parallel
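Because the API follows KServe's v2 specification, an HTTP/REST inference request is a JSON document posted to `/v2/models/<model name>/infer`. The sketch below only builds the URL and body with the Python standard library and does not contact a server. The base URL `http://127.0.0.1:8998`, the input tensor name `input0` and the helper `build_infer_request` are assumptions made for illustration, not part of the server's client library.

```python
import json


def build_infer_request(model_name, input_name, shape, datatype, data,
                        base_url="http://127.0.0.1:8998"):
    """Return (url, body) for a KServe v2 style inference request.

    base_url is an assumed address; adjust it to your deployment.
    """
    # The flat data list must hold exactly as many elements as the shape implies.
    expected = 1
    for dim in shape:
        expected *= dim
    if len(data) != expected:
        raise ValueError(f"shape {shape} needs {expected} values, got {len(data)}")

    url = f"{base_url}/v2/models/{model_name}/infer"
    payload = {
        "inputs": [
            {
                "name": input_name,      # hypothetical tensor name
                "shape": list(shape),
                "datatype": datatype,    # e.g. "FP32" in the v2 specification
                "data": list(data),
            }
        ]
    }
    return url, json.dumps(payload)


url, body = build_infer_request("resnet50", "input0", [1, 4], "FP32",
                                [0.1, 0.2, 0.3, 0.4])
print(url)
print(body)
```

The same body could be sent with any HTTP client. The bundled C++ and Python bindings exist so that you do not have to assemble these requests by hand.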

The AMD Inference Server is integrated with the following libraries out of the box:

  • TensorFlow and PyTorch models with ZenDNN on CPUs (optimized for AMD CPUs)
  • ONNX models with MIGraphX on AMD GPUs
  • XModel models with Vitis AI on AMD FPGAs
  • Computation graphs, including pre- and post-processing, can be written using AKS on AMD FPGAs for end-to-end inference

Quick Start Deployment and Inference

The following example demonstrates how to deploy the server locally and run a sample inference. This example runs on the CPU and does not require any special hardware. You can see a more detailed version of this example in the quickstart.

# Step 1: Download the example files and create a model repository
wget https://github.com/Xilinx/inference-server/raw/main/examples/resnet50/quickstart-setup.sh
chmod +x ./quickstart-setup.sh
./quickstart-setup.sh

# Step 2: Launch the AMD Inference Server
docker run -d --net=host -v ${PWD}/model_repository:/mnt/models:rw amdih/serve:uif1.1_zendnn_amdinfer_0.3.0 amdinfer-server --enable-repository-watcher

# Step 3: Install the Python client library
pip install amdinfer

# Step 4: Send an inference request
python3 tfzendnn.py --endpoint resnet50 --image ./dog-3619020_640.jpg --labels ./imagenet_classes.txt

# Inference should print the following:
#
#     Running the TF+ZenDNN example for ResNet50 in Python
#     Waiting until the server is ready...
#     Making inferences...
#     Top 5 classes for ../../tests/assets/dog-3619020_640.jpg:
#       n02112018 Pomeranian
#       n02112350 keeshond
#       n02086079 Pekinese, Pekingese, Peke
#       n02112137 chow, chow chow
#       n02113023 Pembroke, Pembroke Welsh corgi
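The example output includes a "Waiting until the server is ready..." step. A minimal sketch of such a wait loop follows, assuming the server exposes the KServe v2 readiness route `/v2/health/ready` at `http://127.0.0.1:8998`. The address and the helper names `http_ready_probe` and `wait_until_ready` are hypothetical and are not taken from the example script.

```python
import time
import urllib.error
import urllib.request


def http_ready_probe(base_url, timeout=2.0):
    """Return True if GET <base_url>/v2/health/ready answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/v2/health/ready",
                                    timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout or a non-2xx status: not ready yet.
        return False


def wait_until_ready(probe, attempts=30, delay=1.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    # Assumed address; change it to match how the container was started.
    ready = wait_until_ready(lambda: http_ready_probe("http://127.0.0.1:8998"),
                             attempts=3)
    print("server ready" if ready else "server not reachable")
```

Passing the probe in as a function keeps the retry logic independent of the transport, so the same loop could wrap a gRPC or client-library check instead.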

Learn more

The documentation for the AMD Inference Server is available online.

Check out the quickstart guides online to help you get started based on your use case(s): inference, deployment and development.

Support

Raise issues if you find a bug or need help. Refer to Contributing for more information.

License

The AMD Inference Server is licensed under the terms of Apache 2.0 (see LICENSE). The LICENSE file contains additional license information for third-party files distributed with this work. Further license information is listed in the dependencies.

Contributors

amuralee-amd, bpickrel, dependabot[bot], mattsnow-amd, rradjabi, varunsh-xilinx, zchipitt
