
konradszafer / hugging-face-qa-bot


Open source Hugging Face Question Answering Bot to aid users in developing and troubleshooting ML solutions.

License: MIT License

Languages: Python 60.60%, Jupyter Notebook 38.70%, Dockerfile 0.55%, Shell 0.16%
Topics: chatbot, huggingface, question-answering, information-retrieval, langchain, vector-database, rag

hugging-face-qa-bot's Introduction

Hugging Face Documentation Question Answering System

A multi-interface Q&A system that combines a Hugging Face LLM with Retrieval Augmented Generation (RAG) to answer questions based on the Hugging Face documentation. It can run as an API, a Discord bot, or a Gradio app, and it also provides links to the documentation used to formulate each answer.
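
To make the retrieval step concrete, here is a toy sketch of how a RAG prompt and its source links could be assembled. It is not the project's implementation: the real system relies on an embedding model and a prebuilt index, while this sketch swaps in plain word-count cosine similarity. The DOCS corpus, its example.com URLs, and the function names retrieve and build_prompt are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical mini corpus of (source URL, text) pairs, for illustration only.
DOCS = [
    ("https://example.com/docs/pipelines",
     "Pipelines are a simple way to use models for inference."),
    ("https://example.com/docs/datasets/loading",
     "Load a dataset from the Hub with load_dataset."),
    ("https://example.com/docs/spaces",
     "Spaces host machine learning demo apps."),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, docs, k):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(docs,
                    key=lambda d: cosine(q, Counter(tokenize(d[1]))),
                    reverse=True)
    return ranked[:k]


def build_prompt(query, docs, k=2, add_sources=True):
    """Build a prompt from retrieved context and collect the source links."""
    hits = retrieve(query, docs, k)
    context = "\n".join(text for _, text in hits)
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    sources = [url for url, _ in hits] if add_sources else []
    return prompt, sources
```

Here k plays the role of NUM_RELEVANT_DOCS and add_sources the role of ADD_SOURCES_TO_RESPONSE, both described in the Setting up section.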


Setting up

To run any of the available interfaces, set the required parameters in the .env file, based on the .env.example file in the config/ directory. Alternatively, you can set them as environment variables:

  • QUESTION_ANSWERING_MODEL_ID - (str) A string that specifies either the model ID from the Hugging Face Hub or the directory containing the model weights
  • EMBEDDING_MODEL_ID - (str) Embedding model ID from the Hugging Face Hub. We recommend hkunlp/instructor-large
  • INDEX_REPO_ID - (str) Repository ID from the Hugging Face Hub where the index is stored. The most current indexes are listed in the Indexes List section
  • PROMPT_TEMPLATE_NAME - (str) Name of the model prompt template used for question answering; templates are stored in the config/api/prompt_templates directory
  • USE_DOCS_FOR_CONTEXT - (bool) Use retrieved documents as context for a given query
  • NUM_RELEVANT_DOCS - (int) Number of retrieved documents to use as context when USE_DOCS_FOR_CONTEXT is enabled
  • ADD_SOURCES_TO_RESPONSE - (bool) Include sources of the retrieved documents used as a context for a given query
  • USE_MESSAGES_IN_CONTEXT - (bool) Use chat history for conversational experience
  • DEBUG - (bool) Provides additional logging
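
As an illustration of the types listed above, the following sketch reads these settings from environment variables and converts them. It is an assumption about how such parsing could look, not the project's own loader: the names SETTINGS, parse_bool, and load_settings are hypothetical, and so is the set of spellings accepted as "true".

```python
import os

# Setting names and types as listed in the Setting up section.
SETTINGS = {
    "QUESTION_ANSWERING_MODEL_ID": str,
    "EMBEDDING_MODEL_ID": str,
    "INDEX_REPO_ID": str,
    "PROMPT_TEMPLATE_NAME": str,
    "USE_DOCS_FOR_CONTEXT": bool,
    "NUM_RELEVANT_DOCS": int,
    "ADD_SOURCES_TO_RESPONSE": bool,
    "USE_MESSAGES_IN_CONTEXT": bool,
    "DEBUG": bool,
}


def parse_bool(value):
    # Assumption: these spellings mean "true"; the project may accept others.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None):
    """Read every setting from env (os.environ by default) and convert it."""
    env = os.environ if env is None else env
    missing = [name for name in SETTINGS if name not in env]
    if missing:
        raise KeyError("Missing settings: " + ", ".join(missing))
    parsed = {}
    for name, kind in SETTINGS.items():
        raw = env[name]
        # bool("false") is True in Python, so booleans need explicit parsing.
        parsed[name] = parse_bool(raw) if kind is bool else kind(raw)
    return parsed
```

Calling load_settings() with no arguments reads from the process environment; passing a dict makes the function easy to try out without touching real variables.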

Install the necessary dependencies from the requirements file:

pip install -r requirements.txt

Running

Gradio

After completing all the steps described in the Setting up section, set the APP_MODE environment variable to gradio and run the following command:

python3 app.py

API Serving

By default, the API is served at http://0.0.0.0:8000. To launch it, complete all the steps outlined in the Setting up section, then execute the following command:

python3 -m api

Discord Bot

To interact with the system as a Discord bot, add the additional required environment variables listed in the Discord bot section of the .env.example file in the config/ directory:

  • DISCORD_TOKEN - (str) API key for the bot application
  • QA_SERVICE_URL - (str) URL of the API service. We recommend using: http://0.0.0.0:8000
  • NUM_LAST_MESSAGES - (int) Number of messages used for context in conversations
  • USE_NAMES_IN_CONTEXT - (bool) Include usernames in the conversation context
  • ENABLE_COMMANDS - (bool) Allow commands, e.g., channel cleanup
  • DEBUG - (bool) Provides additional logging
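
The sketch below shows one way NUM_LAST_MESSAGES and USE_NAMES_IN_CONTEXT could shape the conversation context sent along with a question. The function build_context and the (author, text) message format are hypothetical; the bot's actual context format may differ.

```python
def build_context(messages, num_last_messages, use_names):
    """Join the most recent messages into a context string.

    messages is a list of (author, text) tuples, oldest first.
    """
    # messages[-0:] would return the whole list, so handle zero explicitly.
    recent = messages[-num_last_messages:] if num_last_messages > 0 else []
    lines = []
    for author, text in recent:
        lines.append(f"{author}: {text}" if use_names else text)
    return "\n".join(lines)
```

For example, with num_last_messages set to 2 only the two newest messages are kept, and use_names decides whether each line is prefixed with its author.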

After completing all steps, run:

python3 -m bot

To host the bot on Hugging Face Spaces, set the APP_MODE environment variable to discord; the bot will then be started automatically from the app.py file.
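
A single entry point that switches on APP_MODE could be organized roughly as follows. This is only a sketch of the dispatch idea, not the contents of the project's app.py: run_gradio, run_discord, and select_runner are hypothetical placeholders.

```python
def run_gradio():
    # Placeholder: the real entry point would launch the Gradio app here.
    return "gradio app started"


def run_discord():
    # Placeholder: the real entry point would start the Discord bot here.
    return "discord bot started"


def select_runner(app_mode):
    """Map an APP_MODE value to the function that starts that interface."""
    runners = {"gradio": run_gradio, "discord": run_discord}
    key = app_mode.strip().lower()
    if key not in runners:
        raise ValueError(f"Unsupported APP_MODE: {app_mode!r}")
    return runners[key]
```

An entry point would then read the variable, for instance with os.environ["APP_MODE"], and call the function returned by select_runner.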

Indexes List

The following list contains the most current indexes that can be used for the system:

Development Instructions

We use Python 3.10.

To install all necessary Python packages, run the following command:

pip install -r requirements.txt

We use pipreqsnb to generate the requirements.txt file. To install pipreqsnb, run the following command:

pip install pipreqsnb

To generate the requirements.txt file, run the following command:

pipreqsnb --force .

To run unit tests, you can use the following command:

pytest -o "testpaths=tests" --noconftest

hugging-face-qa-bot's People

Contributors

janekdev, konradszafer


hugging-face-qa-bot's Issues

Add support for sending long responses on Discord (>2000 characters)

The feature should detect when a response exceeds the 2000-character limit.
When a response exceeds the limit, it should be split into multiple messages, each containing no more than 2000 characters.
The answer should be subdivided by new lines, not characters, and preferably in a way that maintains the coherence and readability of the answer.
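
The requested behavior could be sketched as below. This is one possible approach and not code from the repository: the name split_message is hypothetical, and the fallback of cutting a single over-long line by characters is an assumption, since the issue does not say what to do when one line alone exceeds the limit.

```python
DISCORD_LIMIT = 2000  # character limit per message, as stated in the issue


def split_message(text, limit=DISCORD_LIMIT):
    """Split text into chunks of at most `limit` characters, on line breaks."""
    chunks = []
    current = ""
    for line in text.split("\n"):
        # Assumption: a single line longer than the limit is cut by characters.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        candidate = line if not current else current + "\n" + line
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this line would overflow, so start a new chunk with it.
            chunks.append(current)
            current = line
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as its own message, in order. Splitting on paragraph boundaries first, and on single line breaks only when needed, would be a natural refinement for readability.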
