brag's Introduction

bRAG [WIP]

bRAG is an example repo showing how you can do basic Retrieval Augmented Generation the alternative way. No langchain. No ORMs. No alembic/migration generation tools. Instead we go with PDF. What in the world is that? pugsql, dbmate & fastapi. That's it.

To better understand this project, you should read the following article: Build your own LLM RAG API without langchain using OpenAI, qdrant and PDF stack

More coming.

Or to be more precise, we will use:

  1. plain python openai client to interact with our LLM
  2. qdrant as our vector database of choice and their client.
  3. for the embeddings we will use Qdrant's fastembed lib (not openai)
  4. the API will be written using fastapi
  5. no ORM for us, instead we will roll with pugsql
  6. migrations will be done using plain sql and dbmate
  7. instead of poetry, we will go for pyenv + pip-tools
  8. on the db front - postgres
  9. the app will be containerised - docker
  10. simple ingestion with faststream and nats
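
To make the "no langchain" idea concrete, here is a minimal sketch of the retrieve-then-generate flow in plain Python. It is not the project's code: the bag-of-words `embed` function stands in for fastembed, the in-memory ranking stands in for qdrant, and the prompt is only built, not sent to an LLM. All function names and sample documents are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by similarity to the question (the vector DB's job)."""
    query_vector = embed(question)
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vector, embed(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, context: list[str]) -> str:
    """Place the retrieved context in front of the question for the LLM."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Hypothetical sample data, only here to exercise the functions above.
DOCUMENTS = [
    "The battery lasts two days on a single charge.",
    "Shipping took three weeks and the box arrived damaged.",
    "The screen is bright and easy to read outdoors.",
]

if __name__ == "__main__":
    question = "How long does the battery last?"
    print(build_prompt(question, retrieve(question, DOCUMENTS, top_k=1)))
```

In the real stack each stand-in is swapped for the corresponding tool from the list above, but the shape stays the same: embed, search, build a prompt, call the model.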

Does that sound weird to you? Because it is! Even more fun.

Basics

One click deployment

docker-compose up

Local development for the api + db & vdb in docker

docker compose up -d database qdrant nats api worker
make run-dev

That's it. Then head over to http://localhost:8000/docs to see the Swagger docs.

To run the worker that processes the reviews locally with faststream:

faststream run --reload app.process.subscriber:app

Lastly, to ingest:

python -m app.process.ingest
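
The split between the ingest script and the worker can be pictured as a publisher and a subscriber sharing a queue. The sketch below is only an illustration of that shape: it uses an in-memory `asyncio.Queue` as a stand-in for nats and does not use faststream at all. The function names and the stored record format are assumptions, not taken from the repo.

```python
import asyncio


async def ingest(queue: asyncio.Queue, reviews: list[str]) -> None:
    """Publisher side: push each raw review onto the queue, then a stop marker."""
    for review in reviews:
        await queue.put(review)
    await queue.put(None)  # sentinel: nothing more to ingest


async def worker(queue: asyncio.Queue, store: list[dict]) -> None:
    """Subscriber side: consume reviews and store a processed record for each."""
    while True:
        review = await queue.get()
        if review is None:
            break
        text = review.strip()
        # A real worker would embed the text and write it to the vector DB here.
        store.append({"text": text, "length": len(text)})


async def run_pipeline(reviews: list[str]) -> list[dict]:
    """Run publisher and subscriber concurrently over one in-memory queue."""
    queue: asyncio.Queue = asyncio.Queue()
    store: list[dict] = []
    await asyncio.gather(ingest(queue, reviews), worker(queue, store))
    return store


if __name__ == "__main__":
    print(asyncio.run(run_pipeline(["  Great phone ", "Battery died fast"])))
```

With a real broker the two halves run as separate processes, which is why the worker and the ingest step are started with separate commands.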

To get run-dev working proceed through the following steps:

To start, let's make sure you have pyenv installed. What is pyenv? You can read a bit about it on my blog: Pyenv, poetry and other rascals - modern Python dependency and version management.

Long story short, it's like a virtualenv but for python versions. It keeps everything isolated, without polluting your system interpreter or the interpreters of other projects, which would be bad. You don't need to know much more than that.

If you are on a Mac you can just run:

brew install pyenv

or, if you do not like homebrew or are on Linux:

curl https://pyenv.run | bash

Remember to set up your shell for pyenv. In short, you have to add a few lines to either your .bashrc, .zshrc or .profile. Why? So that the pyenv commands are available in your terminal. How?

# For bash:
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init -)"' >> ~/.bashrc

# For Zsh:
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.zshrc
echo 'command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.zshrc
echo 'eval "$(pyenv init -)"' >> ~/.zshrc

Done? Now you just need to get pyenv to download & install python 3.11 locally (no worries, it won't change your system interpreter):

pyenv install 3.11

When you are done installing pyenv, we can get going.

We will name our project bRAG - the name coming from basic Retrieval Augmented Generation API. First of all, let's create a directory of the project:

mkdir bRAG
cd bRAG

Now we can set up the python version we want to use in this particular directory. How? With pyenv and pyenv-virtualenv, which is nowadays installed by default with the basic pyenv installer. If you need to, check the article I referred to before to understand what's happening.

pyenv virtualenv 3.11 bRAG-3-11  # this creates a 'virtualenv' based on python 3.11 named bRAG-3-11
pyenv local bRAG-3-11

After that, just install the dependencies from requirements.txt.
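
If you want to confirm that the directory really picked up the intended interpreter, a small check like the one below can help. It is a sketch, not part of the repo; the `matches_required` helper is a made-up name, and the required version is simply the 3.11 used in the steps above.

```python
import sys

REQUIRED = (3, 11)  # the version installed with `pyenv install 3.11`


def matches_required(version: tuple, required: tuple = REQUIRED) -> bool:
    """True when the major.minor part of `version` equals `required`."""
    return tuple(version[:2]) == tuple(required)


if __name__ == "__main__":
    current = sys.version_info
    status = "OK" if matches_required(current) else "unexpected"
    print(f"Python {current.major}.{current.minor}.{current.micro}: {status}")
```

Run it from inside the project directory so that the `pyenv local` setting applies.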

Project structure

The file structure is:

 |-- app/  # Main codebase directory
 |---- chat/
 |------ __init__.py
 |------ api.py
 |------ constants.py
 |------ message.py
 |------ models.py
 |------ streams.py
 
 |---- core/
 |------ __init__.py
 |------ api.py
 |------ logs.py
 |------ middlewares.py
 |------ models.py
 
 |---- __init__.py
 |---- db.py
 |---- main.py
 
 
 |-- db/  # Database/migration/pugsql related code
 |---- migrations/
 |---- queries/
 |---- schema.sql
 
 |-- settings/  # Settings files directory
 |---- base.py
 |---- gunicorn.conf.py
 |-- tests/  # Main tests directory
 |-- requirements/
 
 |-- .dockerignore
 |-- .env.example
 |-- .gitignore
 |-- pre-commit-config.yaml  # for linting during development
 |-- docker-compose.yml
 |-- Dockerfile
 |-- Makefile  # useful shortcuts
 |-- README.md
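
To give a feel for what "no ORM" means for the db/queries/ directory, here is a rough sketch of the idea of named plain-SQL queries. It is loosely modelled on the `-- :name` comment convention and does not use pugsql itself; it uses the standard library's sqlite3 instead of postgres so it can run anywhere. The table, the query names and the `load_queries` helper are all hypothetical.

```python
import re
import sqlite3

# Hypothetical contents of a file under db/queries/; names are made up.
QUERIES_SQL = """
-- :name create_table
CREATE TABLE message (id INTEGER PRIMARY KEY, content TEXT NOT NULL);

-- :name add_message
INSERT INTO message (content) VALUES (:content);

-- :name list_messages
SELECT id, content FROM message ORDER BY id;
"""


def load_queries(sql_text: str) -> dict[str, str]:
    """Split SQL text into {name: statement} using the '-- :name' comment lines."""
    parts = re.split(r"^-- :name (\w+)[^\n]*\n", sql_text, flags=re.MULTILINE)
    # parts looks like [preamble, name1, body1, name2, body2, ...]
    return {name: body.strip() for name, body in zip(parts[1::2], parts[2::2])}


def run_demo() -> list[tuple]:
    """Create the table, insert one row and read it back using named queries."""
    queries = load_queries(QUERIES_SQL)
    conn = sqlite3.connect(":memory:")
    conn.execute(queries["create_table"])
    conn.execute(queries["add_message"], {"content": "hello"})
    rows = conn.execute(queries["list_messages"]).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_demo())
```

The point of the design is that the SQL stays in plain files you can read and review, and Python only looks queries up by name and passes parameters.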

brag's People

Contributors

grski
