jeremyjordan / ml-monitoring

A demo of Prometheus+Grafana for monitoring an ML model served with FastAPI.

Home Page: https://www.jeremyjordan.me/ml-monitoring/

License: MIT License

Languages: Python 95.01%, Dockerfile 4.99%

ml-monitoring's Introduction

ml-monitoring

Jeremy Jordan

This repository provides an example setup for monitoring an ML system deployed on Kubernetes.

Blog post: https://www.jeremyjordan.me/ml-monitoring/

Components:

  • ML model served via FastAPI
  • Export server metrics via prometheus-fastapi-instrumentator (a minimal sketch follows this list)
  • Simulate production traffic via locust
  • Monitor and store metrics via Prometheus
  • Visualize metrics via Grafana
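
For concreteness, here is a minimal sketch of how the model server and the metrics exporter fit together. The request schema and prediction logic are illustrative assumptions; the repository's actual application lives in model/.

from fastapi import FastAPI
from pydantic import BaseModel
from prometheus_fastapi_instrumentator import Instrumentator

app = FastAPI()

# Instrument all routes and expose Prometheus metrics at /metrics.
Instrumentator().instrument(app).expose(app)

class WineFeatures(BaseModel):
    # Hypothetical request schema; the real app defines its own fields.
    fixed_acidity: float
    volatile_acidity: float
    alcohol: float

@app.post("/predict")
def predict(features: WineFeatures):
    # Model inference would happen here; a constant stands in for it.
    return {"quality": 5.0}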

Setup

  1. Ensure you can connect to a Kubernetes cluster and have kubectl and helm installed.
    • You can easily spin up a Kubernetes cluster on your local machine using minikube.
minikube start --driver=docker --memory 4g --nodes 2
  2. Deploy Prometheus and Grafana onto the cluster using the community Helm chart.
kubectl create namespace monitoring
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring
  3. Verify the resources were deployed successfully.
kubectl get all -n monitoring
  4. Connect to the Grafana dashboard.
kubectl port-forward svc/prometheus-stack-grafana 8000:80 -n monitoring
  • Go to http://127.0.0.1:8000/
  • Log in with the credentials:
    • Username: admin
    • Password: prom-operator
    • (This password can be configured in the Helm chart values.yaml file)
  5. Import the model dashboard.
    • On the left sidebar, click the "+" and select "Import".
    • Copy and paste the JSON defined in dashboards/model.json into the text area.

Deploy a model

This repository includes an example REST service which exposes an ML model trained on the UCI Wine Quality dataset.

You can launch the service on Kubernetes by running:

kubectl apply -f kubernetes/models/

You can also build and run the Docker container locally.

docker build -t wine-quality-model -f model/Dockerfile model/
docker run -d -p 3000:80 -e ENABLE_METRICS=true wine-quality-model
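
With the container running locally, you can send a test request. The exact feature names are defined by the model's request schema, so the payload below is illustrative:

curl -X POST http://localhost:3000/predict \
  -H "Content-Type: application/json" \
  -d '{"fixed_acidity": 7.4, "volatile_acidity": 0.7, "alcohol": 9.4}'

You can also confirm that metrics are being exported:

curl http://localhost:3000/metrics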

Note: In order for Prometheus to scrape metrics from this service, we need to define a ServiceMonitor resource. This resource must have the label release: prometheus-stack in order to be discovered. This is configured in the Prometheus resource spec via the serviceMonitorSelector attribute.

You can verify the required label by running:

kubectl get prometheuses.monitoring.coreos.com prometheus-stack-kube-prom-prometheus -n monitoring -o yaml
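
For reference, a minimal ServiceMonitor of the required shape is sketched below; the names here are illustrative, and the actual manifest lives in kubernetes/models/.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: wine-quality-model
  labels:
    release: prometheus-stack  # required so Prometheus discovers this monitor
spec:
  selector:
    matchLabels:
      app: wine-quality-model  # must match the model Service's labels
  endpoints:
    - port: http  # named port on the Service that serves /metrics
      path: /metrics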

Simulate production traffic

We can simulate production traffic using a Python load testing tool called locust. This will make HTTP requests to our model server and provide us with data to view in the monitoring dashboard.

You can begin the load test by running:

kubectl apply -f kubernetes/load_tests/

By default, production traffic will be simulated for a duration of 5 minutes. This can be changed by updating the image arguments in the kubernetes/load_tests/locust_master.yaml manifest.
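
For reference, the traffic generator boils down to a locustfile along the lines of the sketch below; the endpoint and payload are assumptions, and the real locustfile lives in load_test/.

from locust import HttpUser, task, between

class ModelUser(HttpUser):
    # Each simulated user waits 1-2 seconds between requests.
    wait_time = between(1, 2)

    @task
    def predict(self):
        # Hypothetical payload; the real locustfile defines the actual schema.
        self.client.post("/predict", json={"fixed_acidity": 7.4, "volatile_acidity": 0.7})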

You can also modify the community Helm chart instead of using the manifests defined in this repo.

Uploading new images

This process could eventually be automated with a GitHub Action, but remains manual for now.
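
Such a workflow might look like the sketch below; the trigger, tag convention, and action versions are assumptions.

name: publish-images
on:
  push:
    tags: ["model-v*"]  # hypothetical tag convention
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - run: |
          TAG=${GITHUB_REF_NAME#model-v}
          docker build -t ghcr.io/jeremyjordan/wine-quality-model:$TAG -f model/Dockerfile model/
          docker push ghcr.io/jeremyjordan/wine-quality-model:$TAG

Until then, the manual process is: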

  1. Obtain a personal access token to connect with the GitHub Container Registry.
echo "INSERT_TOKEN_HERE" >> ~/.github/cr_token
  2. Authenticate with the GitHub Container Registry.
cat ~/.github/cr_token | docker login ghcr.io -u jeremyjordan --password-stdin
  3. Build and tag new Docker images.
MODEL_TAG=0.3
docker build -t wine-quality-model:$MODEL_TAG -f model/Dockerfile model/
docker tag wine-quality-model:$MODEL_TAG ghcr.io/jeremyjordan/wine-quality-model:$MODEL_TAG
LOAD_TAG=0.2
docker build -t locust-load-test:$LOAD_TAG -f load_test/Dockerfile load_test/
docker tag locust-load-test:$LOAD_TAG ghcr.io/jeremyjordan/locust-load-test:$LOAD_TAG
  4. Push Docker images to the container registry.
docker push ghcr.io/jeremyjordan/wine-quality-model:$MODEL_TAG
docker push ghcr.io/jeremyjordan/locust-load-test:$LOAD_TAG
  5. Update Kubernetes manifests to use the new image tag.
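
The update itself is a one-line change to the image field in the relevant manifest, e.g. (hypothetical excerpt from a deployment spec in kubernetes/models/):

    image: ghcr.io/jeremyjordan/wine-quality-model:0.3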

Teardown instructions

To stop the model REST server, run:

kubectl delete -f kubernetes/models/

To stop the load tests, run:

kubectl delete -f kubernetes/load_tests/

To remove the Prometheus stack, run:

helm uninstall prometheus-stack -n monitoring

ml-monitoring's People

Contributors

bateman, dependabot[bot], jeremyjordan, rajgupt


ml-monitoring's Issues

Latency & Counter Metrics Not Detected By Prometheus

Hello @jeremyjordan,

I've been following your FastAPI ml-monitoring repository as a template for my own project, and it's been super helpful! Thanks so much for setting this up. Unfortunately, I'm having a lot of trouble getting Prometheus to scrape my Counter metric and latency. Interestingly, when I run your wine-quality application and add a Counter metric, it works fine; but my application, which follows nearly the same approach (the only difference being that I set it up using the application factory design pattern), doesn't seem to work. Histogram and Summary metrics do seem to be going through, though.

Do you have any insight as to what the issue could be? Would really appreciate your guidance as I've been trying to figure this out for 3 days.

Here is my monitoring.py file: https://github.com/rileyhun/fastapi-ml-example/blob/main/app/core/monitoring.py

Reproducible example:

git clone https://github.com/rileyhun/fastapi-ml-example.git

docker build -t ${IMAGE_NAME}:${IMAGE_TAG} -f Dockerfile .
docker tag ${IMAGE_NAME}:${IMAGE_TAG} rhun/${IMAGE_NAME}:${IMAGE_TAG}
docker push rhun/${IMAGE_NAME}:${IMAGE_TAG}

minikube start --driver=docker --memory 4g --nodes 2
kubectl create namespace monitoring
helm install prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring

kubectl apply -f deployment/wine-model-local.yaml
kubectl port-forward svc/wine-model-service 8080:80

python api_call.py

Adding the Prometheus instrumentation package results in some requests taking a long time

Hello again @jeremyjordan,

We are trying to decrease the latency of our BERT model prediction service, which is deployed using FastAPI. Predictions are served through the /predict endpoint. We looked into the tracing and found that one of the bottlenecks is prometheus-fastapi-instrumentator: about 1% of requests time out because they exceed 10s.

We also discovered that some metrics are not reported at 4 requests/second. Some requests took 30-50 seconds, with Starlette/FastAPI internals accounting for most of that time. It seems that under high load the /metrics endpoint doesn't get enough resources, so all /metrics requests wait for some time and eventually fail. Having a separate container for metrics could help, or, if possible, delaying/pausing metrics collection under high load. Any insight/guidance would be much appreciated.

[Three screenshots of request traces attached]

How to Monitor NLP Models?

Hi Jeremy,

I'm following your template for a POC, and it's been very helpful. I'm creating a REST API for an NLP model (Multinomial Naive Bayes), and I'm not sure how to monitor this particular model, since its predictions are classes rather than float values like those of the wine quality model. How would the Prometheus instrumentation be used to capture metrics for classification models?

Thanks,

Riley
