halilergul1 / qa-app

Question-Answering App Over Your Own Data with LlamaIndex and Elasticsearch

qa-app's Introduction

QA-app

Given a folder of PDF files, this app deploys Elasticsearch together with LlamaIndex so that you can ask questions about your own documents. Through FastAPI, you submit a query and receive a response along with its sources. Companies can also use it as a QA app over long policy documents, for example to automate answers to customer queries about parental leave. Comments and feedback are welcome!

For custom use, add your own documents to the /pdfs folder. Two articles are included by default.

How to Set Up and Use the Service Easily

This section details the steps required to set up and run the service locally for development, testing, and deployment using Docker. Follow these instructions to get your environment ready.

Running the Application

To run the application using Docker, make sure you are in the project directory and follow these steps:

  1. Set Your OPENAI_API_KEY:

    • Set OPENAI_API_KEY as an environment variable before running Docker. On macOS and Linux, run the following command in the terminal (in the Windows Command Prompt, use "set" instead of "export" and omit the quotes):
      export OPENAI_API_KEY='your-api-key'
      
  2. Build the Docker Application:

    • Use Docker Compose to build the application. This will read the docker-compose.yml file and set up the necessary Docker containers:
      docker-compose build
      
  3. Start the Application:

    • Once the build is complete, start the application by running:
      docker-compose up
      
    • This command starts all the services defined in your Docker Compose configuration. The application runs at http://localhost:8000, and you can try out queries against the API at http://localhost:8000/docs#/default/perform_query_query__post.
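
Once the containers are up, the API can also be called from a script instead of the interactive docs page. The following is a minimal sketch under stated assumptions: the route /query/ is inferred from the operation id in the docs URL above, and the JSON field name "query" is a guess at the request schema, so check both against http://localhost:8000/docs before relying on them. The function names are illustrative and not part of the repository.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # address given in the steps above


def build_query_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for the query endpoint.

    Assumptions (not confirmed by the README): the route is /query/ and the
    body is a JSON object with a single "query" field.
    """
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = BASE_URL, timeout: float = 60.0) -> dict:
    """Send the question to the running service and return the decoded JSON reply."""
    request = build_query_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))


# Example (needs `docker-compose up` running in another terminal):
#     print(ask("How long is parental leave?"))
```

Building the request separately from sending it keeps the payload easy to inspect and adjust if the actual schema differs from the assumed one.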

Local Development Setup (optional, for DEVs)

  1. Create a Virtual Environment:

    • It's recommended to use a virtual environment to isolate package dependencies. To create a virtual environment, run:
      python3 -m venv venv
      
    • Activate the virtual environment:
      • On Windows:
        .\venv\Scripts\activate
        
      • On macOS and Linux:
        source venv/bin/activate
        
  2. Install Required Packages:

    • Install all the necessary packages listed in requirements.txt using pip:
      pip install -r requirements.txt
      
  3. Install Test Dependencies:

    • Ensure that your testing environment is also ready by installing the required packages for testing:
      pip install pytest
      

Running Test Cases defined in e2e_test.py (optional, for DEVs)

To execute the tests, first ensure that the Docker containers are up and running. Also make sure you have followed the "Local Development Setup" steps above to create the virtual environment and install the requirements. Then proceed as follows:

  1. Start Docker Containers:

    • If not already running, start your Docker containers:
      docker-compose up
      
  2. Run Tests:

    • Execute the tests using pytest by running the following command in the terminal while you are in the project directory and the Docker containers are running:
      pytest
      

These steps will help you set up the development environment, run the application, and execute tests efficiently. Adjust the commands according to your specific configurations if necessary.
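
The contents of e2e_test.py are not reproduced here. As a rough sketch of what an end-to-end check can look like, the helper below validates the shape of a reply from the service. The key names "response" and "sources" are assumptions based on the description that the service returns a response and its sources, and the commented test at the end relies on a hypothetical ask helper that posts a question to the API.

```python
def validate_reply(payload: dict) -> list:
    """Return a list of problems found in a reply; an empty list means it looks fine.

    Assumed (hypothetical) shape: {"response": <non-empty str>, "sources": <list>}.
    """
    problems = []
    answer = payload.get("response")
    if not isinstance(answer, str) or not answer.strip():
        problems.append("missing or empty 'response'")
    if not isinstance(payload.get("sources"), list):
        problems.append("'sources' is not a list")
    return problems


# In a pytest file, this could be used against the running containers, e.g.:
#
# def test_query_returns_answer_and_sources():
#     reply = ask("How long is parental leave?")  # hypothetical HTTP helper
#     assert validate_reply(reply) == []
```

Returning a list of problems rather than a boolean makes a failing test report which part of the reply was wrong.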

Directory Structure (optional)

This section provides an overview of the main directories and files in this repository, explaining how they contribute to the project.

📂 Base Directory

  • app.py: The entry point of the program. Contains the main functionality executed when the project is served through FastAPI.
  • IndexManager.py: Contains the IndexManager class, whose attributes and methods handle index loading, initialization, and creation, query engine creation, and output formatting.
  • ingest.py: Implements data ingestion from PDFs into LlamaIndex nodes.
  • index.py: Takes the generated nodes, sets up an Elasticsearch vector store, and assembles a LlamaIndex query engine.
  • querying.py: Takes a query and generates a response.
  • evaluate.py: Generates synthetic data out of document nodes for model evaluation via RelevancyEvaluator.
  • e2e_test.py: Contains end-to-end test cases covering different possible scenarios when handling user queries.
  • config.py: Contains the configuration variables for the model.
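
The contents of config.py are not shown here. Purely as an illustration of how a configuration module might pick up the OPENAI_API_KEY variable from the setup steps and fail early when it is missing, consider the sketch below. The Settings class, the load_settings function, and the PDF_DIR variable are hypothetical names, not taken from the repository.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values the service needs at startup."""
    openai_api_key: str
    pdf_dir: str


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "")
    if not key:
        # Fail fast with a clear message instead of erroring on the first query.
        raise RuntimeError("OPENAI_API_KEY is not set; export it before starting the service")
    # PDF_DIR is an assumed, optional override; the README's default folder is "pdfs".
    return Settings(openai_api_key=key, pdf_dir=env.get("PDF_DIR", "pdfs"))
```

Passing the environment in as a mapping keeps the function easy to exercise in tests without touching real environment variables.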

📂 /Documentation

  • Description: Includes a single pdf file that explains the experiments and the final solution method.

📂 /pdfs

  • Description: Includes data files (which are shared pdf files) used in the project.

📂 /experiments

  • Description: Includes Jupyter notebooks with 2-3 different solutions for the QA task. The file semanticChunker.ipynb is the final solution method. The details are explained in experiments.pdf.

📂 /manual-test

  • Description: Contains index checker and loader files that implement the main pipeline with hardcoded variables.

📂 /results

  • Description: Includes only the raw RelevancyEvaluator results for the final solution (semanticChunker).

📄 README.md

  • Description: Provides an overview of the project, installation instructions, and usage examples.

📄 .gitignore

  • Description: Specifies intentionally untracked files to ignore.
