meg-huggingface / model-recommender

This project forked from philschmid/model-recommender

License: MIT License

Python 4.88% Makefile 0.04% Jupyter Notebook 94.98% Dockerfile 0.10%

Hugging Face TGI Recommender

Hugging Face recommender is a utility package that estimates resource requirements and provides helpful information for deploying and training Hugging Face models. The package includes:

  • recommender: Python library to estimate the resources required for a given model, accounting for model memory, KV cache, and generation.
  • api: FastAPI app to expose the recommender as a REST API.
  • gradio: Gradio app to expose the recommender as a web app.
  • notebooks: Jupyter notebooks to test and experiment with the recommender, used for our partners, e.g. AWS, Cloudflare, etc.
  • tests: Unit tests for the recommender, not much here.
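
To make the kind of estimate described above concrete, here is a minimal sketch of the usual back-of-the-envelope arithmetic for model weights and KV cache. It is not taken from the recommender library: the function names, the default of 2 bytes per value (fp16/bf16), and the example model shape are all assumptions for illustration.

```python
def weights_bytes(num_params: int, bytes_per_param: int = 2) -> int:
    """Memory for the model weights (2 bytes per parameter assumes fp16/bf16)."""
    return num_params * bytes_per_param


def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_value: int = 2,
) -> int:
    """Memory for the KV cache of a decoder-only transformer.

    The leading factor 2 counts one key tensor and one value tensor per layer.
    """
    return (
        2 * num_layers * num_kv_heads * head_dim
        * seq_len * batch_size * bytes_per_value
    )


# Illustrative shape only (hypothetical 7B-parameter model with
# 32 layers, 8 KV heads, head dimension 128, 4096-token context).
model = weights_bytes(7_000_000_000)
cache = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=4096)
print(f"weights: {model / 1e9:.1f} GB, kv-cache: {cache / 2**20:.0f} MiB")
```

A real estimate would also need headroom for activations and runtime overhead, which this sketch leaves out.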

API

Routes

GET /v1/tgi/config

Summary: Returns a Hugging Face Text Generation Inference (TGI) configuration

Query Parameters:

  • model_id (string, optional): Model ID
  • gpu_memory (integer, optional, default=0): GPU memory
  • num_gpus (integer, optional, default=1): Number of GPUs
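
As a sketch of how the query parameters above map onto a request URL, the helper below builds one with the Python standard library. The helper name `tgi_config_url` and the base URL `http://localhost:8000` are assumptions; the defaults mirror the ones listed above, and the unit of `gpu_memory` is not stated in this README, so the value 24 is only a placeholder.

```python
from typing import Optional
from urllib.parse import urlencode


def tgi_config_url(
    base_url: str,
    model_id: Optional[str] = None,
    gpu_memory: int = 0,
    num_gpus: int = 1,
) -> str:
    """Build the URL for GET /v1/tgi/config (hypothetical helper)."""
    params = {}
    if model_id is not None:
        params["model_id"] = model_id
    params["gpu_memory"] = gpu_memory
    params["num_gpus"] = num_gpus
    # urlencode percent-encodes the "/" in model IDs such as "org/name".
    return f"{base_url.rstrip('/')}/v1/tgi/config?{urlencode(params)}"


print(tgi_config_url("http://localhost:8000", "HuggingFaceH4/zephyr-7b-beta", gpu_memory=24))
```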

GET /v1/provider/{provider}/recommend

Summary: Generates an Instance Type recommendation for a given provider (gcp, huggingface) and returns the instance type and TGI configuration.

Path Parameters:

  • provider (string, required): Provider name (one of: huggingface, gcp, aws (sagemaker))

Query Parameters:

  • model_id (string, optional): Model ID
  • gpu_memory (integer, optional, default=999): GPU memory
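
A minimal client sketch for this route, using only the standard library. The names `recommend_url`, `fetch_recommendation`, and `KNOWN_PROVIDERS` are hypothetical. The provider set is copied from the list above, and treating `aws` as the literal path value for SageMaker is an assumption. The response is assumed to be JSON, but its fields are not documented here, so the sketch returns the decoded body as-is.

```python
import json
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Provider names as listed in the route description; "aws" as the path
# value for SageMaker is an assumption.
KNOWN_PROVIDERS = {"huggingface", "gcp", "aws"}


def recommend_url(
    base_url: str,
    provider: str,
    model_id: Optional[str] = None,
    gpu_memory: Optional[int] = None,
) -> str:
    """Build the URL for GET /v1/provider/{provider}/recommend."""
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    params = {}
    if model_id is not None:
        params["model_id"] = model_id
    if gpu_memory is not None:
        params["gpu_memory"] = gpu_memory
    url = f"{base_url.rstrip('/')}/v1/provider/{quote(provider, safe='')}/recommend"
    return f"{url}?{urlencode(params)}" if params else url


def fetch_recommendation(url: str, timeout: float = 10.0):
    """Send the GET request and decode the JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the app running locally, `fetch_recommendation(recommend_url("http://localhost:8000", "gcp", model_id="HuggingFaceH4/zephyr-7b-beta"))` would issue the request; omitting `gpu_memory` leaves the server-side default in effect.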

Development

  1. Install the requirements:
pip install -e ".[api]"
  2. Run the app with in-memory cache:
cd api && uvicorn app.main:app --reload
  3. Build the app:
docker build -t recommender-api -f api/Dockerfile .
  4. Run the app:
docker run -p 8000:8000 recommender-api
  5. curl the app:

To test the dummy route, you can use curl or any HTTP client:

curl -X GET 'localhost:8000/v1/provider/gcp/recommend?model_id=HuggingFaceH4%2Fzephyr-7b-beta'

Deployment

Push to main and ArgoCD will deploy the app to the cluster.
