GithubHelp home page GithubHelp logo

rag's Introduction

My Retrieval Augmented Generation (RAG) Implementations

Goals

The first goal of this project is to build a robust RAG system that encapsulates the newest methods and models in an intuitive python interface. The second goal is to put all this logic on a server and expose REST APIs to a Next.js frontend that will include a Generative User Intergace. TBD

Core

This is a work in progress. Right now I'm finishing up the core abstractions. Everything is broken down into a few core services:

  • data.py
    • this includes the Index class (embedding database)
    • and the Reranker class
  • generators.py
    • this containes the generators that use instructor to output schema-validated data
    • models.py contains the data models
  • chat.py
    • this class wraps around popular LLM APIs and exposes certain methods for interface
  • rag.py
    • TODO: this class will have an end to end configurable rag solution and leverage each of the aforementioned abstractions
  • utils.py
    • pretty self explanatory, just utils and configuration

Running Tests

  • use pytest to run all local tests
  • use pytest --external to run all local tests, including those that make external API calls (which can take a while)

Changelog

Week of 3/27

  • Update the generator code to use Anthropic
  • add reranking to the index class
  • Implement the multiple query inside the index by searching all queries, then removing duplicates, before finally reranking

Week of 4/3

  • finish the Cohere class in chat.py
  • finish the Chat factory class
  • add tests for Chat factory class
  • add ask_stream function to the abstract chat class
  • sanity check all the new abstractions (see notebooks/tests/sanity-check-chat.ipynb)
    • finish cohere stream, others are working
  • make a full RAG abstraction
  • get all chat.py tests passing
  • updated chat_stream and print_stream to yield/return mulitple response objects
  • got a FastAPI server running, configured globals, middleware, routes, and models
  • got the /chat endpoint working with OAI
  • figure out how to make the temp, model, and max_tokens params optional in server/models/chat
  • sanity check in a notebook the /chat endpoint with OAI, Anthropic, and Cohere
  • implement tests in tests/server/routes/chat
  • move server prints to logging?
  • get the streaming endpoint working
  • expose an all purpose chat endpoint that takes in params and returns a stream (FastAPI)

Week of 4/30

  • rebuild logging to use one folder in the root dir
  • implement the message logic in the chat functions
    • get all tests passing
    • merge into main
  • added more robust error handling to the chat routes
  • added message validation rules for order
  • add usage/cost monitoring per request
  • add logging config for lib
  • check multiple message support for all models (Claude is not working rn)

TODO

  • integrate LlaMA 3 70B and 400B when it drops!! (with GROQ)
    • write in the docs the process for integraging a new model (add the specific class, add to Chat factory class, add message convert messages, add to model config)
  • change the default RAG to persist=Fals
  • add rerank 3
  • build and text the ability to generate queries that include the best location/db to search for
  • Add tool calls for openai and Anthropic
  • add Instructor support for Chat models
  • add rate limits to model config
  • make model config into a data class, add embedding models and rerank models

rag's People

Contributors

beverm2391 avatar

Watchers

Kostas Georgiou avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.