
FIL backend for the Triton Inference Server

License: Apache License 2.0

Triton Inference Server FIL Backend

Triton is a machine learning inference server for easy and highly optimized deployment of models trained in almost any major framework. This backend specifically facilitates use of tree models in Triton (including models trained with XGBoost, LightGBM, Scikit-Learn, and cuML).

If you want to deploy a tree-based model for optimized real-time or batched inference in production, the FIL backend for Triton will allow you to do just that.

Table of Contents

Usage Information

Contributor Docs

Not sure where to start?

If you aren't sure where to start with this documentation, consider one of the following paths:

I currently use XGBoost/LightGBM or other tree models, and I am trying to assess whether Triton is the right solution for production deployment of my models.

  1. Check out the FIL backend's blog post announcement
  2. Make sure your model is supported by looking at the model support section
  3. Look over the introductory example
  4. Try deploying your own model locally by consulting the FAQ notebook.
  5. Check out the main Triton documentation for additional features and helpful tips on deployment (including example Helm charts).

I am familiar with Triton, but I am using it to deploy an XGBoost/LightGBM model for the first time.

  1. Look over the introductory example
  2. Try deploying your own model locally by consulting the FAQ notebook. Note that it includes specific example code for serialization of XGBoost and LightGBM models.
  3. Review the FAQ notebook's tips for optimizing model performance.

I am familiar with Triton and the FIL backend, but I am using it to deploy a Scikit-Learn or cuML tree model for the first time.

  1. Look at the section on preparing Scikit-Learn/cuML models for Triton.
  2. Try deploying your model by consulting the FAQ notebook, especially the sections on Scikit-Learn and cuML.

I am a data scientist familiar with tree model training, and I am trying to understand how Triton might be used with my models.

  1. Take a glance at the Triton product page to get a sense of what Triton is used for.
  2. Download and run the introductory example for yourself. If you do not have access to a GPU locally, you can just look over this notebook and then jump to the FAQ notebook, which has specific information on CPU-only training and deployment.

I have never worked with tree models before.

  1. Take a look at XGBoost's documentation.
  2. Download and run the introductory example for yourself.
  3. Try deploying your own model locally by consulting the FAQ notebook.

I don't like reading docs.

  1. Look at the Quickstart below
  2. Open the FAQ notebook in a browser.
  3. Try deploying your model. If you get stuck, Ctrl-F for keywords on the FAQ page.

Quickstart: Deploying a tree model in 3 steps

  1. Copy your model into the following directory structure. In this example, we show an XGBoost JSON file, but XGBoost binary files, LightGBM text files, and Treelite checkpoint files are also supported.
model_repository/
├─ example/
│  ├─ 1/
│  │  ├─ model.json
│  ├─ config.pbtxt
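The layout above can be created with a few shell commands. This is a sketch: `model.json` is written as an empty placeholder here, and you would substitute your actual serialized XGBoost model.

```shell
# Version directory "1" holds the serialized model itself
mkdir -p model_repository/example/1

# Substitute your real XGBoost JSON export for this placeholder file
echo '{}' > model_repository/example/1/model.json

# config.pbtxt lives next to the version directories (filled out in step 2)
touch model_repository/example/config.pbtxt
```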
  2. Fill out config.pbtxt as follows, replacing $NUM_FEATURES with the number of input features, $MODEL_TYPE with xgboost, xgboost_json, lightgbm, or treelite_checkpoint, and $IS_A_CLASSIFIER with true or false depending on whether the model is a classifier or a regressor.
backend: "fil"
max_batch_size: 32768
input [
  {
    name: "input__0"
    data_type: TYPE_FP32
    dims: [ $NUM_FEATURES ]
  }
]
output [
  {
    name: "output__0"
    data_type: TYPE_FP32
    dims: [ 1 ]
  }
]
instance_group [{ kind: KIND_AUTO }]
parameters [
  {
    key: "model_type"
    value: { string_value: "$MODEL_TYPE" }
  },
  {
    key: "output_class"
    value: { string_value: "$IS_A_CLASSIFIER" }
  }
]

dynamic_batching {}
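For concreteness, with a hypothetical XGBoost JSON classifier taking 32 input features (the feature count is illustrative), the three substituted lines of the template would read:

```
    dims: [ 32 ]                              # $NUM_FEATURES
    ...
    value: { string_value: "xgboost_json" }   # $MODEL_TYPE
    ...
    value: { string_value: "true" }           # $IS_A_CLASSIFIER
```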
  3. Start the server:
docker run -p 8000:8000 -p 8001:8001 --gpus all \
  -v ${PWD}/model_repository:/models \
  nvcr.io/nvidia/tritonserver:23.08-py3 \
  tritonserver --model-repository=/models

The Triton server will now be serving your model over both HTTP (port 8000) and gRPC (port 8001), using NVIDIA GPUs if they are available or the CPU if they are not. For information on how to submit inference requests, how to deploy other tree model types, or advanced configuration options, check out the FAQ notebook.
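As a minimal sketch of what an inference request looks like on the wire, the snippet below builds the URL path and JSON body of a KServe v2 HTTP inference call for the quickstart model. The model name `example` and the four feature values are illustrative, and only the standard library is used; in practice NVIDIA's `tritonclient` package is the more convenient option.

```python
import json

def build_infer_request(model_name, features):
    """Build the URL path and JSON body for a Triton v2 HTTP inference call.

    `features` is a list of float feature values for a single input row.
    """
    path = f"/v2/models/{model_name}/infer"
    body = {
        "inputs": [
            {
                "name": "input__0",           # must match config.pbtxt
                "shape": [1, len(features)],  # a batch of one row
                "datatype": "FP32",
                "data": features,
            }
        ]
    }
    return path, json.dumps(body)

path, body = build_infer_request("example", [0.1, 0.2, 0.3, 0.4])
print(path)
```

The body can then be POSTed to `http://localhost:8000` at that path with `Content-Type: application/json`; the response carries an `outputs` array in the same layout. With `tritonclient` installed, `tritonclient.http.InferenceServerClient` wraps this protocol for you.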

