manote101 / llama_index_ray

This project forked from amogkam/llama_index_ray

Using LlamaIndex with Ray for productionizing LLM applications

llama_index_ray

Overview of Ray and LlamaIndex

An example of how to use LlamaIndex with Ray to productionize LLM applications.

LlamaIndex is a data framework for building LLM applications. It provides abstractions for ingesting data from various sources, data structures for storing and indexing data, and a retrieval and query interface.

Ray is a general-purpose, open-source, distributed compute framework for scaling AI applications in native Python.

By using Ray with LlamaIndex, we can easily build production-quality LLM applications that can scale out to large amounts of data.

Example

In this example, we build a Q&A system from two data sources: the Ray documentation and the Ray/Anyscale blog posts.

In particular, we create a sub-question query engine that can handle complex queries involving multiple data sources. For example, a query like "Compare and contrast how the Ray docs and the Ray blogs present Ray Serve" requires both data sources to be queried.
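
The idea behind a sub-question query engine can be sketched in plain Python. The following is a toy illustration, not LlamaIndex's implementation: the two source functions, the fixed decomposition, and the string-joining synthesis step are hypothetical stand-ins for what an LLM and real query engines would do.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for the two per-source query engines.
def ask_docs(question: str) -> str:
    return f"[docs] answer to: {question}"


def ask_blogs(question: str) -> str:
    return f"[blogs] answer to: {question}"


SOURCES: Dict[str, Callable[[str], str]] = {"docs": ask_docs, "blogs": ask_blogs}


def decompose(query: str) -> List[Tuple[str, str]]:
    """Split a query into (source, sub-question) pairs.

    A real engine would ask an LLM to do this. Here every source simply
    receives one sub-question derived from the original query.
    """
    return [
        (name, f"What does the {name} source say about: {query}")
        for name in SOURCES
    ]


def answer(query: str) -> str:
    """Answer each sub-question against its source, then combine the results."""
    partials = [SOURCES[name](sub_q) for name, sub_q in decompose(query)]
    # A real engine would synthesize a final answer with an LLM;
    # this sketch only joins the partial answers.
    return "\n".join(partials)
```

Calling `answer("Ray Serve")` yields one partial answer per source, which mirrors the per-source sub-questions shown in the sample output of Step 3.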

Step 1: Scalable Data Indexing

The first step is to load our data sources and create our data ingestion pipeline. This involves parsing the source data and embedding it using GPUs. The embeddings are then persisted in a vector store.

LlamaIndex provides the abstraction for reading and loading the data, while Ray Datasets is used to scale out the processing/embedding across multiple GPUs.
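
The shape of this pipeline (parse, split into chunks, embed in batches) can be sketched without a cluster. This is a minimal single-process sketch under stated assumptions: `chunk_text`, `toy_embed`, `batches`, and `embed_in_batches` are hypothetical names, and the hash-based embedding is a placeholder for a real GPU embedding model. In the actual pipeline, the per-batch step is the part that would be distributed across workers.

```python
import hashlib
from typing import Dict, Iterator, List


def chunk_text(text: str, chunk_size: int = 200) -> List[str]:
    """Split parsed text into fixed-size character chunks."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def toy_embed(chunk: str, dim: int = 8) -> List[float]:
    """Placeholder embedding: deterministic values from a hash, NOT a real model."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def batches(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


def embed_in_batches(chunks: List[str], batch_size: int = 4) -> List[Dict[str, object]]:
    """Embed chunks one batch at a time and collect the records to persist."""
    records: List[Dict[str, object]] = []
    for batch in batches(chunks, batch_size):
        # In a distributed setting, each batch would be handled by a worker.
        for chunk in batch:
            records.append({"text": chunk, "embedding": toy_embed(chunk)})
    return records
```

Processing in batches matters because embedding models are far more efficient on a batch of inputs than on one input at a time, and a batch is a natural unit of work to hand to a worker.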

To run the data indexing:

python create_vector_index.py

Step 2: Deploy the Q&A application

Next, we use LlamaIndex and Ray Serve to deploy our Q&A application.

Using LlamaIndex, we define three query engines, each answering questions from a different set of sources:

  1. Ray documentation only
  2. Ray blog posts only
  3. Both Ray documentation and Ray blog posts

The default LLM for LlamaIndex is OpenAI GPT-3.5.
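
One simple way to expose several engines behind a single entry point is a lookup table keyed by engine name. The sketch below is an assumption about structure, not the repository's code: the three lambda functions stand in for real LlamaIndex query engines, and `run_query` is a hypothetical helper.

```python
from typing import Callable, Dict

# Hypothetical stand-ins: each value would be a real query engine in practice.
ENGINES: Dict[str, Callable[[str], str]] = {
    "docs": lambda q: f"docs engine: {q}",
    "blogs": lambda q: f"blogs engine: {q}",
    "subquestion": lambda q: f"subquestion engine: {q}",
}


def run_query(engine: str, query: str) -> str:
    """Route a query to the engine selected by name."""
    if engine not in ENGINES:
        raise ValueError(
            f"Unknown engine {engine!r}; expected one of {sorted(ENGINES)}"
        )
    return ENGINES[engine](query)
```

Rejecting unknown engine names up front keeps a typo in a request from silently falling through to a default engine.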

Using Ray Serve, we can deploy this app so that it can receive query requests. For production settings, Ray Serve has built-in support for load balancing and autoscaling.

serve run deploy_app:deployment

Step 3: Query the application

Finally, we can query the application. We provide a simple query script: query.py.

The first argument selects the engine to use: docs, blogs, or subquestion, corresponding to the three engines defined in Step 2. The second argument is the query we want to send.

python query.py "subquestion" "Can you tell me more about Aviary?"

Response: 
Aviary is an open source multi-LLM serving solution developed by Anyscale. It is designed to make it easier to deploy and manage large-scale machine learning models. It provides a unified API for managing models, as well as a set of tools for monitoring and debugging model performance. Aviary also supports multiple languages, including Python, Java, and Go. 


Sub-question 1:
Sub question: Does the Ray documentation mention Aviary?
Response: 
No, the Ray documentation does not mention Aviary.

Sub-question 2:
Sub question: Are there any Ray blog posts about Aviary?
Response: 
Yes, there is a Ray blog post about Aviary. It is titled "Announcing Aviary: Open Source Multi-LLM Serving Solution" and can be found at the path /home/ray/default/llama_index_ray/www.anyscale.com/blog/announcing-aviary-open-source-multi-llm-serving-solution.html.
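
A command-line client with this two-argument interface could look roughly like the following. This is a hedged sketch, not the contents of query.py: the URL `http://localhost:8000/` and the `engine`/`query` parameter names are assumptions about how the deployed app is reached, so adjust them to match the actual deployment.

```python
import argparse
import urllib.parse
import urllib.request
from typing import List, Optional

# Assumption: the app is served locally at this address.
BASE_URL = "http://localhost:8000/"


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the engine name and the query text from the command line."""
    parser = argparse.ArgumentParser(description="Query the Q&A application.")
    parser.add_argument("engine", choices=["docs", "blogs", "subquestion"])
    parser.add_argument("query")
    return parser.parse_args(argv)


def build_url(engine: str, query: str) -> str:
    """Encode both arguments as URL query parameters (hypothetical names)."""
    params = urllib.parse.urlencode({"engine": engine, "query": query})
    return f"{BASE_URL}?{params}"


def main(argv: Optional[List[str]] = None) -> None:
    """Send the request and print the response body."""
    args = parse_args(argv)
    with urllib.request.urlopen(build_url(args.engine, args.query)) as response:
        print(response.read().decode("utf-8"))
```

With the app running, `main(["subquestion", "Can you tell me more about Aviary?"])` sends the request. Restricting the first argument with `choices` lets argparse reject an invalid engine name before any network call is made.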

Contributors

amogkam