
llm_rag's Introduction

A simple demo of RAG (Retrieval-Augmented Generation).

  1. Use a BERT-based model developed by ckiplab to extract text embeddings.
  2. Use ResNet50 or EfficientNet_B7 to extract image embeddings.
  3. Create a vector database with Milvus to store both the text and image embeddings.
  4. Use similarity search to find similar images or QA texts.
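
To illustrate what step 4 does conceptually, here is a minimal pure-Python sketch of a top-k similarity search. It is not the repository's code: Milvus performs this search inside the database, and the metric chosen here (cosine similarity), the helper names `cosine_similarity` and `top_k`, and the toy 3-dimensional vectors are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the k (item_id, score) pairs most similar to the query."""
    scored = [(item_id, cosine_similarity(query, vec))
              for item_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" (hypothetical); real ones would come from the
# BERT-based or ResNet50 / EfficientNet_B7 models listed above.
store = {
    "faq_1": [1.0, 0.0, 0.0],
    "faq_2": [0.9, 0.1, 0.0],
    "img_1": [0.0, 1.0, 0.0],
}
results = top_k([1.0, 0.0, 0.0], store, k=2)
print(results)
```

A brute-force scan like this is only practical for tiny collections; a vector database such as Milvus is used so that the same kind of query stays fast over many stored embeddings.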

How to run it

  • First, download the Docker Compose YAML file
    wget https://github.com/milvus-io/milvus/releases/download/v2.3.2/milvus-standalone-docker-compose.yml -O docker-compose.yml
  • To start Milvus, run the command below
    sudo docker-compose up -d
  • Check that all the containers are running. There should be three Docker containers (milvus-etcd, milvus-minio, milvus-standalone)
    sudo docker compose ps
  • Download the demo data found on the Internet (texts and images)
    mkdir data
    cd data
    ## Get text data
    wget https://github.com/A-baoYang/finetune-with-openai/raw/main/example_data/faq.jsonl
    
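    ## Get image data (download manually from the website or use the Kaggle API)
    kaggle datasets download -d jehanbhathena/weather-dataset
    ## The Kaggle API saves weather-dataset.zip; a manual download from the website is named archive.zip
    unzip weather-dataset.zip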
  • Run create_img_db.py to create the collection for images
    python create_img_db.py
  • Run create_QA_db.py to create the collection for texts
    python create_QA_db.py
  • Run img_search.py or QA_text_search.py to perform the similarity search
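
Before running the create_*.py scripts, it can help to confirm that Milvus is accepting connections. The following is a minimal sketch using only the Python standard library. It assumes the standalone compose file exposes Milvus on its default port 19530 on localhost; the helper name `milvus_port_open` is hypothetical and not part of this repository.

```python
import socket


def milvus_port_open(host="localhost", port=19530, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures
        return False


if __name__ == "__main__":
    print("Milvus reachable:", milvus_port_open())
```

An open port only shows that something is listening; Milvus may still be initializing, so if the scripts fail right after startup, waiting a few seconds and retrying is a reasonable first step.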
