Alpaca-7B Truss

This is a Truss for Alpaca-7B, a fine-tuned variant of Llama-7B. Llama is a family of language models released by Meta. This README will walk you through how to deploy this Truss on Baseten to get your own instance of Alpaca-7B.

Truss

Truss is an open-source model serving framework developed by Baseten. It lets you develop machine learning models and deploy them to Baseten (and other platforms such as AWS or GCP). Using Truss, you can develop a GPU model with live reload, package models and their associated code, build Docker containers, and deploy to Baseten.

Deploying Alpaca-7B

To deploy the Alpaca-7B Truss, you'll need to follow these steps:

  1. Prerequisites:
  • Make sure you have a Baseten account and API key. You can sign up for a Baseten account here.
  • Note that as of today, Baseten requires a Business Plan subscription to get GPU resources for your model. However, this won't be the case soon. If you need Llama now and want access to GPU compute, feel free to send a direct message to @aqaderb or @aaronrelph on Twitter, and we'll help you set it up.
  2. Install Truss and the Baseten Python client: If you haven't already, install both in your development environment:
pip install --upgrade baseten truss
  3. Load the Alpaca-7B Truss: Assuming you've cloned this repo, start an IPython shell and load the Truss into memory:
import truss

alpaca7b_truss = truss.load("path/to/alpaca7b_truss")
  4. Log in to Baseten: Log in to your Baseten account using your API key (key found here):
import baseten

baseten.login("PASTE_API_KEY_HERE")
  5. Deploy the Alpaca-7B Truss: Deploy the Truss to Baseten with the following command:
baseten.deploy(alpaca7b_truss)

Once your Truss is deployed, you can start using the Alpaca-7B model through the Baseten platform! Navigate to the Baseten UI to watch the model build and deploy, then invoke it via the REST API.
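
If you'd rather not paste your API key into a shell session, the load, login, and deploy steps above can be collected into one small script. This is a minimal sketch, not part of the Truss itself: the environment variable name BASETEN_API_KEY and the helper names read_api_key and deploy_alpaca are assumptions made for illustration. Only truss.load, baseten.login, and baseten.deploy come from the steps above.

```python
import os


def read_api_key(env=os.environ, var_name="BASETEN_API_KEY"):
    """Return the API key stored in `var_name`, or raise if it is missing.

    The variable name is an assumption for this sketch; use any name you like.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} to your Baseten API key first")
    return key


def deploy_alpaca(truss_dir, api_key):
    """Run the load, login, and deploy steps from this README in order."""
    # Imported lazily so read_api_key works even before the packages are installed.
    import baseten
    import truss

    alpaca7b_truss = truss.load(truss_dir)  # step 3: load the Truss
    baseten.login(api_key)                  # step 4: log in to Baseten
    # step 5: deploy; we simply pass along whatever baseten.deploy returns
    return baseten.deploy(alpaca7b_truss)


# Typical use, once the packages are installed and the variable is set:
# deploy_alpaca("path/to/alpaca7b_truss", read_api_key())
```

Keeping the key in the environment means it never ends up in shell history or in a committed notebook.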

Alpaca-7B API Documentation

This section provides an overview of the Alpaca-7B API, its parameters, and how to use it. The API consists of a single route named predict, which you can invoke to generate text based on the provided instruction.

API Route: predict

The predict route is the primary method for generating text completions based on a given instruction. It takes several parameters:

  • prompt: The instruction text that you want the model to generate a response for.
  • temperature (optional, default=0.1): Controls the randomness of the generated text. Higher values produce more diverse results, while lower values produce more deterministic results.
  • top_p (optional, default=0.75): The cumulative probability threshold for nucleus sampling. The model samples only from the smallest set of highest-probability tokens whose combined probability reaches this threshold.
  • top_k (optional, default=40): The number of top tokens to consider when sampling. The model will only consider the top_k highest-probability tokens.
  • num_beams (optional, default=4): The number of beams used for beam search. Increasing this value can result in higher-quality output but will increase the computational cost.
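
To make the sampling parameters above more concrete, here is a small pure-Python sketch of how temperature, top_k, and top_p narrow down the candidate tokens for one generation step. It is an illustration of the general technique only, not the code Alpaca-7B or Transformers actually runs; the function names and the toy logits are made up for this example, and beam search (num_beams) is not modeled.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def candidate_tokens(logits, temperature=0.1, top_k=40, top_p=0.75):
    """Return {token_index: probability} for the tokens that survive filtering."""
    probs = softmax(logits, temperature)
    # top_k: keep only the k most probable tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    kept_mass = sum(probs[i] for i in ranked)
    # top_p: keep the smallest prefix whose (renormalized) mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / kept_mass
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


toy_logits = [2.0, 1.0, 0.0, -1.0]  # hypothetical scores for a 4-token vocabulary
print(candidate_tokens(toy_logits, temperature=1.0, top_k=3, top_p=0.75))
print(candidate_tokens(toy_logits))  # the default temperature of 0.1 is near-greedy
```

With the low default temperature, almost all probability mass sits on the top token, which is why the defaults give nearly deterministic output.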

The API also accepts any other parameter supported by the generate method in Hugging Face Transformers.
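
A small helper can fill in the documented defaults and pass any extra generation parameters through unchanged. The sketch below is an assumption-laden convenience, not part of the API: build_request and DEFAULTS are hypothetical names, the range checks are client-side sanity checks chosen for this example, and the input text is sent under the prompt key because that is the key the request examples in this README use.

```python
# Defaults as documented in the parameter list above.
DEFAULTS = {"temperature": 0.1, "top_p": 0.75, "top_k": 40, "num_beams": 4}


def build_request(prompt, **overrides):
    """Build a predict payload; extra keyword arguments are passed through as-is."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    request = {"prompt": prompt, **DEFAULTS, **overrides}
    # Sanity checks picked for this sketch; the server may enforce different rules.
    if request["temperature"] <= 0:
        raise ValueError("temperature must be positive")
    if not 0 < request["top_p"] <= 1:
        raise ValueError("top_p must be in (0, 1]")
    if request["top_k"] < 1 or request["num_beams"] < 1:
        raise ValueError("top_k and num_beams must be at least 1")
    return request


# max_new_tokens is a standard generate() parameter, forwarded unchanged.
print(build_request("What's the meaning of life?", max_new_tokens=64))
```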

Example Usage

You can use the baseten Python package to invoke your model from Python:

import baseten
# You can retrieve your deployed model ID from the UI
model = baseten.deployed_model_version_id('YOUR_MODEL_ID')

request = {
    "prompt": "What's the meaning of life?",
    "temperature": 0.1,
    "top_p": 0.75,
    "top_k": 40,
    "num_beams": 4,
}

response = model.predict(request)

You can also invoke your model via the REST API:

curl -X POST "https://app.baseten.co/models/YOUR_MODEL_ID/predict" \
     -H "Content-Type: application/json" \
     -d '{
           "prompt": "What is the meaning of life?",
           "temperature": 0.1,
           "top_p": 0.75,
           "top_k": 40,
           "num_beams": 4
         }'
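
The same REST call can be made from Python with only the standard library. This is a hedged sketch rather than an official client: build_predict_request is a hypothetical helper, and extra_headers is there in case your deployment expects headers the curl example does not show (for instance, authentication). The request is built but not sent, because sending requires a real model ID.

```python
import json
import urllib.request


def build_predict_request(model_id, payload, extra_headers=None):
    """Build (but do not send) a POST request to the predict route."""
    url = f"https://app.baseten.co/models/{model_id}/predict"
    headers = {"Content-Type": "application/json", **(extra_headers or {})}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


req = build_predict_request(
    "YOUR_MODEL_ID",
    {"prompt": "What's the meaning of life?", "temperature": 0.1},
)

# Sending is left commented out because it needs a real model ID:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Because json.dumps handles the escaping, apostrophes in the prompt need no special treatment here, unlike in a single-quoted shell string.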

Contributors

aspctu, pankajroark
