
Serving ALREADYME.md AI Model


This repository serves the ALREADYME.md AI model with FastAPI.

Requirements

  • torch
  • fastapi[all]
  • omegaconf
  • transformers
  • loguru

Prerequisites

Before starting the server, a fine-tuned model weight is required. Because building a transformers pipeline from scratch is extremely slow, we pickle the entire pipeline to speed up initialization. This requires a one-time conversion:

import torch
from transformers import pipeline

# Load the fine-tuned model as an fp16 text-generation pipeline on GPU 0.
pipe = pipeline("text-generation", "bloom-1b7-finetuned-readme-270k-steps", torch_dtype=torch.float16, device=0)
# Pickle the whole pipeline so the server can restore it quickly at startup.
torch.save(pipe, "bloom-1b7-finetuned-readme-270k-steps/pipeline.pt")

Move the transformer model to app/resources and change the path in app/resources/config.yaml.
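For reference, here is a minimal sketch of how the server side might restore the pickled pipeline at startup. It assumes a hypothetical model_path key in app/resources/config.yaml and that the code runs from the app directory; the actual key names and paths in this repository may differ.

import torch
from omegaconf import OmegaConf

# Hypothetical config layout (check app/resources/config.yaml for the real keys):
#   model_path: resources/bloom-1b7-finetuned-readme-270k-steps/pipeline.pt
config = OmegaConf.load("resources/config.yaml")

# Restoring the pickled pipeline skips rebuilding the tokenizer and model,
# which is why the conversion step above makes initialization much faster.
pipe = torch.load(config.model_path)
print(pipe("Generate a README for this repository:", max_new_tokens=16)[0]["generated_text"])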

Run the server

We recommend building a Docker image rather than running the server locally. Still, it is a good idea to run the server locally before building the image, to check for bugs in the code and in your fine-tuned model.

Start locally

$ cd app
$ uvicorn main:app --host [your ip address] --port [your port]

Build docker

We do not provide any pre-built image yet. Build your own image with your custom fine-tuned model!

$ docker build -t alreadyme-ai-serving:v0.1.2 -f Dockerfile \
    --build-arg CUDA_VER=11.6.1 \
    --build-arg CUDNN_VER=8 \
    --build-arg UBUNTU_VER=18.04 \
    --build-arg PYTHON_VER=39 \
    .

You can change the versions of CUDA, cuDNN, Ubuntu, and Python, which is useful for compatibility with different cloud environments. After building your image, run Docker with:

$ docker run --gpus all -p 8080:80 alreadyme-ai-serving:v0.1.2

The Docker container launches the server on port 80, so you should bind it to your own port number (e.g. 8080).

Documentation

alreadyme-ai-serving supports OpenAPI, so you can browse the API documentation directly from your server. If the server is running locally, check out http://127.0.0.1:8080/docs for Swagger UI or http://127.0.0.1:8080/redoc for ReDoc.
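If you prefer to inspect the schema programmatically, FastAPI also serves the raw OpenAPI document at /openapi.json by default. A minimal sketch using only the standard library, assuming the server is bound to 127.0.0.1:8080:

import json
import urllib.request

# FastAPI exposes the generated OpenAPI schema at /openapi.json by default.
with urllib.request.urlopen("http://127.0.0.1:8080/openapi.json") as response:
    schema = json.load(response)

# Print every available endpoint and its HTTP methods.
for path, methods in schema["paths"].items():
    print(path, sorted(methods))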

For convenience, we host a free ReDoc documentation page. You may log in to see the details.

License

alreadyme-ai-serving is released under the Apache License 2.0. The license can be found here.

