
Wahlwave - Backend

Welcome to the Wahlwave (formerly Chat-Your-Gesetzentwurf) Backend repository. Wahlwave is designed for the interactive exploration of German election programs. The backend is a crucial part of the Wahlwave application, working behind the scenes to process queries, manage data, and integrate various technologies.

Core Features

  • Query Processing: The backend handles user queries from the Wahlwave Frontend, ensuring efficient processing and response generation.
  • Data Management and Storage: Using PostgreSQL for database management and AWS S3 for file storage, the backend handles large volumes of election program PDFs and their associated data, keeping the data up to date with Scrapy.
  • Vector Database with Qdrant: Qdrant is employed for managing and querying vector data, enhancing the efficiency of data retrieval.
  • Azure-Hosted OpenAI Models: The backend integrates with Azure-hosted OpenAI services to run ChatGPT models in a dedicated environment, contributing to the robust and GDPR-compliant AI capabilities of the Wahlwave platform.
  • Monitoring and Evaluation Tools: Langfuse for monitoring and Ragas for response evaluation are integral to maintaining the quality and reliability of the platform's outputs.

The Wahlwave Backend, with its use of different technologies and data management strategies, is essential for delivering a seamless and interactive user experience on the Wahlwave platform. It's where the technical intricacies meet user-focused functionalities, creating a reliable and efficient system for political content exploration.

Prerequisites

  • Install Python 3.11 or higher
  • Install poetry

Running Locally

Clone the Project

Clone the project repository from GitHub:

git clone https://github.com/RichardKruemmel/chat-your-gesetzentwurf.git

Set Environment Variables

A .sample.env file is included in the project repository as a template for your own .env file. Copy the .sample.env file and rename the copy to .env:

cp .sample.env .env

Edit the .env file to set your own environment variables.
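Once the variables are set, the backend can read them at startup. As a minimal sketch of this pattern (the variable names below are illustrative assumptions, not the actual keys — check .sample.env for the real required variables):

```python
import os

# NOTE: DATABASE_URL and AWS_S3_BUCKET are hypothetical names used for
# illustration; the authoritative list of variables lives in .sample.env.
DATABASE_URL = os.getenv("DATABASE_URL", "postgresql://localhost:5432/wahlwave")
AWS_S3_BUCKET = os.getenv("AWS_S3_BUCKET", "")


def require_env(name: str) -> str:
    """Fail fast with a clear error if a required variable is missing."""
    value = os.getenv(name)
    if value is None:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

Failing fast on missing configuration makes misconfigured deployments surface at startup rather than as obscure runtime errors later.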

Running the Backend Independently (Optional)

Note: Running the backend by itself won't establish a connection with the database. Make sure to run the database service separately or together with the backend for full functionality.

We are using [poetry](https://python-poetry.org/) for dependency management and uvicorn as the ASGI server. Run the following commands:

Every time you run the backend:

  # Activate the project's virtual environment (created on first run)
  $ poetry shell
  # Install all dependencies
  $ poetry install
  # Start the API server on port 8000 with auto-reload
  $ poetry run uvicorn app.main:app --reload

Docker (coming soon)

Ensure you have Docker Desktop installed on your machine. Docker Desktop is a comprehensive solution for running Docker on Windows and macOS systems. It includes Docker Compose, which is required to orchestrate our multi-container application.

To build and run the backend container:

  docker build -t wahlwave-backend .
  docker run -p 8000:8000 wahlwave-backend
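Since official Docker support is still marked as coming soon, a Dockerfile along these lines should work as a starting point (paths and options are assumptions based on the commands above, not the project's actual Dockerfile):

```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install poetry and dependencies first so Docker can cache this layer
RUN pip install --no-cache-dir poetry
COPY pyproject.toml poetry.lock ./
RUN poetry config virtualenvs.create false \
    && poetry install --no-root --no-interaction

# Copy the application code
COPY . .

EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Copying only the lock files before `poetry install` keeps dependency installation cached across rebuilds that touch application code only.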

Database Schema

Database ERD

Architecture

Architecture

  1. Users access the WebApp via the Streamlit Frontend, initiating requests that are resolved by the backend.
  2. Requests are routed to the Elastic Beanstalk Environment containing EC2 instances within a Virtual Private Cloud (VPC), and then directed to the Load Balancer. The Load Balancer allocates incoming requests to the optimal EC2 instance based on health checks and configuration.
  3. Within the chosen EC2 instance, the FastAPI Backend handles API requests.
  4. The Python Scraper collects data from the AbgeordnetenWatch API as required to keep the database up to date.
  5. If new election program PDFs are available, the scraper downloads them from the Abgeordnetenwatch servers and stores them in the S3 bucket.
  6. For data retrieval and storage, the server communicates with the Postgres Database hosted on AWS RDS.
  7. For operations requiring the language model (LLM), the server reaches out to the LLM hosted in the Azure Cloud, ensuring GDPR compliance and data privacy.
  8. For vector similarity searches and other vector operations, the FastAPI backend equips the LLMs hosted on Azure with vector tools.
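The vector operations in the last step rest on similarity search over embeddings. Qdrant handles this at scale; the underlying idea can be sketched with a stdlib-only toy (function names are illustrative, not the actual backend code):

```python
from math import sqrt


def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def top_k(query: list, docs: list, k: int = 2) -> list:
    """Return the ids of the k documents most similar to the query.

    docs is a list of (doc_id, vector) pairs.
    """
    scored = sorted(docs, key=lambda d: cosine_similarity(query, d[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]
```

A production vector database replaces the linear scan with an approximate nearest-neighbor index, which is what makes retrieval over thousands of election program chunks fast.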

CI/CD

CI/CD

License

This project is licensed under the GNU General Public License. See LICENSE for details.

Future Development

This project is under development. Future releases will include improved AI Agent structures and integration with new frontend features requested by users.

Contributors

  • richardkruemmel

Issues

Implement Scraper for Database Population and API Monitoring

Description

Implement a scraper service to handle two primary tasks:

1. Database Population:

  • Create a script for initial database population from the third-party API.
  • Set up logic for incremental updates to the database based on new records from the third-party API.

2. API Monitoring and Database Updating:

  • Implement a scheduled task to regularly check the third-party API for updates.
  • Develop logic to detect changes in the API data and update the database accordingly.
  • Ensure robust error handling and logging for all scraper operations.

Tasks:

  • Set up a new directory /backend/app/scraper for the scraper service.
  • Implement a script for the initial population of the database.
  • Set up logic for incremental updates to the database.
  • Create a scheduled task for regular API monitoring.
  • Develop change detection logic to identify updates in the API data.
  • Implement database update functionality to reflect changes detected in the API data.
  • Ensure robust error handling and logging for scraper operations.
  • Write tests to verify the correct functionality of the scraper service.
  • Document the scraper service functionality, usage, and error handling.

Acceptance Criteria:

  • The scraper service should successfully populate the database initially with data from the third-party API.
  • The scraper should regularly check the API for updates and apply any updates to the database.
  • All scraper operations should have error handling to manage potential issues and should log errors and successful operations for troubleshooting and monitoring purposes.
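The change-detection requirement above can be sketched by fingerprinting each API record and comparing against stored fingerprints. A minimal stdlib sketch, assuming records are dicts with an `id` key (the function names are hypothetical, not the actual scraper code):

```python
import hashlib
import json


def record_fingerprint(record: dict) -> str:
    """Stable content hash of a record; sort_keys normalizes key order."""
    payload = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def detect_changes(stored: dict, fetched: list) -> tuple:
    """Split fetched records into new and updated ids.

    stored maps record_id -> fingerprint from the last run;
    fetched is the list of records returned by the API.
    """
    new, updated = [], []
    for rec in fetched:
        rid = rec["id"]
        fp = record_fingerprint(rec)
        if rid not in stored:
            new.append(rid)
        elif stored[rid] != fp:
            updated.append(rid)
    return new, updated
```

Persisting the fingerprints alongside the records lets the scheduled task touch only rows that actually changed, instead of rewriting the whole table on every run.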
