
jimburtoft / nos


This project forked from autonomi-ai/nos


⚡️ A fast and flexible PyTorch inference server that runs on any cloud or AI HW.

Home Page: https://docs.nos.run/

License: Apache License 2.0

Shell 0.19% Python 97.51% Makefile 1.99% Jinja 0.31%

nos's Introduction

Nitro Boost for your AI Infrastructure



NOS is a fast and flexible PyTorch inference server that runs on any cloud or AI HW.

🛠️ Key Features

  • 👩‍💻 Easy-to-use: Built for PyTorch and designed to optimize, serve and auto-scale PyTorch models in production without compromising on developer experience.
  • 🥷 Multi-modal & Multi-model: Serve multiple foundational AI models (LLMs, Diffusion, Embeddings, Speech-to-Text and Object Detection) simultaneously, in a single server.
  • ⚙️ HW-aware Runtime: Deploy PyTorch models effortlessly on modern AI accelerators (NVIDIA GPUs, AWS Inferentia2, and even CPUs; AMD support is coming soon).
  • ☁️ Cloud-agnostic Containers: Run on any cloud (AWS, GCP, Azure, Lambda Labs, On-Prem) with our ready-to-use inference server containers.

🚀 Quickstart

We recommend starting with our quickstart guide. To install the NOS client, run the following commands:

conda create -n nos python=3.8 -y
conda activate nos
pip install torch-nos

Once the client is installed, you can start the NOS server via the nos serve CLI. This automatically detects your local environment, downloads the Docker runtime image and spins up the NOS server:

nos serve up --http --logging-level INFO

You are now ready to run your first inference request with NOS! Try any of the examples below. If you want more detailed information from the server, set the logging level to DEBUG instead of INFO.

👩‍💻 What can NOS do?

💬 Chat / LLM Agents (ChatGPT-as-a-Service)


NOS provides an OpenAI-compatible server with streaming support so that you can connect your favorite OpenAI-compatible LLM client to talk to NOS.


API / Usage

gRPC API ⚡

from nos.client import Client

client = Client("[::]:50051")

model = client.Module("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
response = model.chat(message="Tell me a story of 1000 words with emojis", _stream=True)

REST API

curl \
-X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
    "model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    "messages": [{
        "role": "user",
        "content": "Tell me a story of 1000 words with emojis"
    }],
    "temperature": 0.7,
    "stream": true
  }'
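
Because the endpoint is described as OpenAI-compatible, a streamed response can presumably be read as a sequence of server-sent events. The following is a minimal sketch, assuming the stream follows the OpenAI chat-completions convention (`data: {json}` lines, terminated by `data: [DONE]`). The helper name `iter_stream_text` and the sample lines are hypothetical and have not been checked against NOS itself.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style streaming chunks.

    Assumes each event is a line of the form `data: {json}` and that the
    stream ends with `data: [DONE]` (the OpenAI convention, an assumption
    here rather than documented NOS behavior).
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hypothetical sample of what a streamed response body might look like.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Once upon"}}]}',
    'data: {"choices": [{"delta": {"content": " a time"}}]}',
    "data: [DONE]",
]
story = "".join(iter_stream_text(sample))
print(story)
```

In practice the `lines` argument would be the decoded lines of the HTTP response body, read incrementally so that text can be shown as it arrives.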

🏞️ Image Generation (Stable-Diffusion-as-a-Service)


Build Midjourney-style Discord bots in seconds.


API / Usage

gRPC API ⚡

from nos.client import Client

client = Client("[::]:50051")

sdxl = client.Module("stabilityai/stable-diffusion-xl-base-1-0")
image, = sdxl(prompts=["hippo with glasses in a library, cartoon styling"],
              width=1024, height=1024, num_images=1)

REST API

curl \
-X POST http://localhost:8000/v1/infer \
-H 'Content-Type: application/json' \
-d '{
    "model_id": "stabilityai/stable-diffusion-xl-base-1-0",
    "inputs": {
        "prompts": ["hippo with glasses in a library, cartoon styling"],
        "width": 1024, "height": 1024,
        "num_images": 1
    }
}'
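
The same request can be assembled from Python with only the standard library. This is a rough sketch that mirrors the curl call above: it builds the request without sending it, so it makes no claim about the response format. The helper name `build_infer_request` and its parameters are hypothetical, and the base URL is the local default used in the examples.

```python
import json
import urllib.request


def build_infer_request(model_id, inputs, method=None,
                        base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for the /v1/infer endpoint."""
    body = {"model_id": model_id, "inputs": inputs}
    if method is not None:
        body["method"] = method  # optional, e.g. "encode_text" for CLIP
    return urllib.request.Request(
        url=f"{base_url}/v1/infer",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_infer_request(
    "stabilityai/stable-diffusion-xl-base-1-0",
    {
        "prompts": ["hippo with glasses in a library, cartoon styling"],
        "width": 1024,
        "height": 1024,
        "num_images": 1,
    },
)
print(request.get_method(), request.full_url)
# To send it against a running server (response handling is up to you):
#     with urllib.request.urlopen(request) as resp:
#         raw = resp.read()
```

Keeping construction separate from sending makes the payload easy to inspect or log before any GPU time is spent.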

🧠 Text & Image Embedding (CLIP-as-a-Service)


Build scalable semantic search of images/videos in minutes.


API / Usage

gRPC API ⚡

from nos.client import Client

client = Client("[::]:50051")

clip = client.Module("openai/clip-vit-base-patch32")
txt_vec = clip.encode_text(texts=["fox jumped over the moon"])

REST API

curl \
-X POST http://localhost:8000/v1/infer \
-H 'Content-Type: application/json' \
-d '{
    "model_id": "openai/clip-vit-base-patch32",
    "method": "encode_text",
    "inputs": {
        "texts": ["fox jumped over the moon"]
    }
}'
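
Semantic search over embeddings usually comes down to ranking stored vectors by cosine similarity against a query vector. The sketch below shows that ranking step in plain Python. The three-dimensional vectors are made-up toy values, not real CLIP output, and `cosine_similarity` and `rank_by_similarity` are hypothetical helpers rather than part of NOS.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], index: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (item, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for image embeddings computed ahead of time.
index = {
    "fox.jpg": [0.9, 0.1, 0.0],
    "moon.jpg": [0.1, 0.9, 0.1],
    "car.jpg": [0.0, 0.1, 0.9],
}
query = [0.8, 0.3, 0.0]  # stand-in for a text embedding of the search phrase
ranking = rank_by_similarity(query, index)
print(ranking[0][0])
```

For large collections a vector index or a batched matrix product would replace the Python loop, but the scoring rule stays the same.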

🎙️ Audio Transcription (Whisper-as-a-Service)


Perform real-time audio transcription using Whisper.


API / Usage

gRPC API ⚡

from pathlib import Path
from nos.client import Client

client = Client("[::]:50051")

model = client.Module("openai/whisper-small.en")
with client.UploadFile(Path("audio.wav")) as remote_path:
    response = model(path=remote_path)
# {"chunks": ...}

REST API

curl \
-X POST http://localhost:8000/v1/infer/file \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'model_id=openai/whisper-small.en' \
-F 'file=@audio.wav'

🧐 Object Detection (YOLOX-as-a-Service)


Run classical computer-vision tasks in 2 lines of code.


API / Usage

gRPC API ⚡

from PIL import Image  # Pillow; needed for Image.open below
from nos.client import Client

client = Client("[::]:50051")

model = client.Module("yolox/medium")
response = model(images=[Image.open("image.jpg")])

REST API

curl \
-X POST http://localhost:8000/v1/infer/file \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'model_id=yolox/medium' \
-F 'file=@image.jpg'

⚒️ Custom models


Want to run models not supported by NOS? You can easily add your own models following the examples in the NOS Playground.

📄 License

This project is licensed under the Apache-2.0 License.

📡 Telemetry

NOS collects anonymous usage data using Sentry. This helps us understand how the community is using NOS and prioritize features. You can opt out of telemetry by setting NOS_TELEMETRY_ENABLED=0.
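
If NOS is launched from a Python script, the variable can be set for the child process only. This is a small sketch under that assumption: the helper name `env_without_telemetry` is hypothetical, and the text above does not say whether the variable is read by the client, the server, or both, so setting it wherever NOS runs is the safer choice.

```python
import os


def env_without_telemetry(base_env=None):
    """Return a copy of the environment with NOS telemetry switched off."""
    env = dict(os.environ if base_env is None else base_env)
    env["NOS_TELEMETRY_ENABLED"] = "0"
    return env


env = env_without_telemetry()
print(env["NOS_TELEMETRY_ENABLED"])
# Pass it to a child process, for example:
#     subprocess.run(["nos", "serve", "up", "--http"], env=env, check=True)
```

Copying the environment instead of mutating `os.environ` keeps the opt-out scoped to the processes you start with it.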

🤝 Contributing

We welcome contributions! Please see our contributing guide for more information.


nos's People

Contributors

spillai, outtanames, jiexiong2016, jimburtoft
