
py-ai-server

Overview

This is an API server to support my ai-experiments collection of demos/prototypes.

Instead of finding and running a growing number of separate server projects to serve separate APIs, I made one API server that does what I need.

This was/is mostly a learning exercise to introduce myself to lower-level inference with AI models.

The rest of this README (and likely other places) is out of date. I had the project working, but I haven't used it for at least a few months, so it is currently unmaintained.

Setup

../setup-api.sh

Structure Info

For each of LLM, TTS, or STT (or any other AI model type in the future), the following structure is used to (hopefully) manage the variety of the AI ecosystem (a code sketch of the pattern follows the list):

  • api/ (e.g. llm_api) - The API routes for that AI type. The endpoints use that AI's manager to do the work.
  • client/[AI]_client_manager.py - The manager for that AI type. The manager exposes methods to power the API and decides which client to use.
  • client/[AI]/ - The clients for that AI type. There's a base.py class that all clients inherit from, plus a separate client for each backend. Each client contains the code specific to running its model.
  • models/ - Contains the pydantic models that the API routes and clients use.
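
A minimal sketch of that pattern, using the LLM case (class and method names here are illustrative, not the repo's actual code):

```python
# Illustrative sketch of the base-client/manager pattern described above.
# Names are hypothetical; the real code lives in client/llm/ and
# client/llm_client_manager.py.
from abc import ABC, abstractmethod


class BaseLLMClient(ABC):
    """The role of client/llm/base.py: the interface every client implements."""

    @abstractmethod
    def load_model(self, model_path: str) -> None: ...

    @abstractmethod
    def generate(self, prompt: str, **options) -> str: ...


class LLMClientManager:
    """The role of the manager: expose methods to power the API and pick the client."""

    def __init__(self, clients: dict[str, BaseLLMClient]):
        self.clients = clients
        self.active: BaseLLMClient | None = None

    def load(self, client_name: str, model_path: str) -> None:
        self.active = self.clients[client_name]
        self.active.load_model(model_path)

    def generate(self, prompt: str, **options) -> str:
        if self.active is None:
            raise RuntimeError("no model loaded")
        return self.active.generate(prompt, **options)
```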

As of now, the LLM API is the only one with more than one client for the manager to choose from. TODO clients:

  • TTS/XTTS - XTTS actually supports other models too (YourTTS is one that does voice cloning).
  • TTS/Silero - A fast TTS that can be used for testing or when low on VRAM.
  • STT/OpenAI - Should behave the same as running locally, but uses OpenAI's API, so you wouldn't have to run the model yourself.

The manager handles converting options to the correct format for the client; currently only the LLM manager does this.
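
As a rough illustration of what that conversion could look like (the option names and mappings here are made up):

```python
# Hypothetical sketch of option conversion: the manager translates generic
# option names into whatever keyword arguments each backend expects.
# These mappings are illustrative, not the actual parameter names.
LLAMACPP_OPTIONS = {"max_tokens": "max_tokens", "temperature": "temperature"}
TRANSFORMERS_OPTIONS = {"max_tokens": "max_new_tokens", "temperature": "temperature"}


def convert_options(options: dict, mapping: dict) -> dict:
    """Keep only recognized options, renamed for the target client."""
    return {mapping[key]: value for key, value in options.items() if key in mapping}


print(convert_options({"max_tokens": 128, "temperature": 0.7}, TRANSFORMERS_OPTIONS))
# -> {'max_new_tokens': 128, 'temperature': 0.7}
```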

Models, Clients/Loaders, Sources

This is an attempt to treat different AI types/projects in a similar way.

Every type of AI has a single Manager, as well as one or more Clients (or Loaders). Through the manager, a model is picked and loaded, and that Model is what the appropriate Client uses to actually run inference (the "generate" in Generative AI).

Models are a bit of a mix: different clients may support models that other clients also support. To handle this, there are Sources, which simply group the different origins that models for that AI can come from.

Example: GGUF models can be used by either LlamaCppPython or Transformers.

Further, certain AI types like TTS might have additional assets (like Voices) which can be used across clients or models (for TTS, voices would be supported by any voice-cloning model).
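
One way the Source idea could be expressed in data, as a sketch (the registry shape and paths are assumptions):

```python
# Sketch of the Sources idea: each source names where models of a given
# format live and which clients can load them. Shapes/paths are assumptions.
SOURCES = {
    "gguf": {
        "clients": ["llamacpp_python", "transformers"],  # per the GGUF example above
        "models_dir": "models/llm/gguf",
    },
    "xtts": {
        "clients": ["xtts"],
        "models_dir": "models/tts/xtts",
        # extra assets usable across clients/models, like TTS voices:
        "assets": {"voices": "assets/voices"},
    },
}


def clients_for(source: str) -> list[str]:
    """Which clients can serve a model from this source?"""
    return SOURCES[source]["clients"]
```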

Ideas

  • --hot-[llm|tts|...] - hot-load the model (using websockets?)
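
A speculative sketch of the hot-load idea, assuming a FastAPI websocket endpoint (nothing here exists in the repo yet):

```python
# Speculative sketch of the --hot-llm idea: a websocket that loads a model
# on demand and reports progress. Assumes FastAPI; the endpoint and message
# protocol are made up.
from fastapi import FastAPI, WebSocket

app = FastAPI()


@app.websocket("/hot/llm")
async def hot_load_llm(ws: WebSocket) -> None:
    await ws.accept()
    model_path = await ws.receive_text()  # client sends the model to load
    await ws.send_json({"status": "loading", "model": model_path})
    # manager.load(...) would run here, possibly sending progress updates
    await ws.send_json({"status": "ready", "model": model_path})
```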

Resources

LLM

SD (rename? maybe /img)

TTS

STT

py-ai-server's Issues

Packaging / Deployment

  • Use bat/sh scripts, like A1111 and similar projects
  • How to allow specifying only the needed deps? The venv is ~7 GB
    • I guess with A1111 you get the base and can install extensions, which have their own reqs

Model Management

The goal is to have a UI through which you can download models. For this repo, we need to flesh out an API to facilitate this.

Need to be able to (a route sketch follows the list):

  • List models available to download
    • LLMs come from HF (& some others)
    • SD models from CivitAI
  • Get model info
    • User pastes link/id and we pull its info
    • Do CivitAI/HF return the model type?
  • Download model from link
  • List installed models
  • Delete model
  • Get installed model info
    • Is there any extra info we could get about models?
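
Sketched as FastAPI routes, that list might look roughly like this (paths, parameters, and response shapes are all assumptions):

```python
# Hypothetical sketch of the model-management endpoints listed above.
# Paths, parameter names, and response shapes are assumptions.
from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter(prefix="/models")


class ModelInfo(BaseModel):
    id: str
    source: str  # e.g. "hf" or "civitai"
    kind: str    # e.g. "llm", "sd", "tts" -- auto-filled but overridable


@router.get("/available")
def list_available(source: str = "hf") -> list[ModelInfo]:
    return []  # would query HF / CivitAI here


@router.get("/info")
def get_model_info(url: str) -> ModelInfo:
    # user pastes a link/id and we pull its info from the source's API
    return ModelInfo(id=url, source="hf", kind="llm")


@router.post("/download")
def download_model(url: str) -> ModelInfo:
    return ModelInfo(id=url, source="hf", kind="llm")  # would download here


@router.get("/installed")
def list_installed() -> list[ModelInfo]:
    return []  # would scan the models/ directories


@router.delete("/installed/{model_id}")
def delete_model(model_id: str) -> dict:
    return {"deleted": model_id}
```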

Overall Thoughts

  • Maybe we support CivitAI and HF and allow overriding what kind of model it is? Auto-fill from the info returned by the API, but allow changing it.
  • So, thinking of making model management its own set of endpoints? Use options or alternate paths to choose between llm, tts, etc.

Consider switching to OpenAI-compatible API

Right now we expose a custom endpoint for LLMs. I think it's worth considering using an OpenAI-compatible API server, maybe just llama-cpp-python.

I'm not sure about this. It's basically outsourcing all of the LLM side to a package, but that might not be bad.

So, I want to use an OpenAI-compatible API so that I can use Vercel's AI SDK. I feel like I'd still want my own API, though.
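
For reference, the appeal of an OpenAI-compatible server is that any OpenAI client can point at it. A sketch using the official openai package against a local llama-cpp-python server (the URL and model name are assumptions):

```python
# Sketch of what OpenAI compatibility buys: the official openai client can
# target a local server via base_url. Assumes llama-cpp-python's server is
# running locally on port 8000; the model name depends on its config.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```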

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.