mikecavallo / know-my-doc

This project forked from jainsid24/know-my-doc

KnowMyDoc is a GPT-3.5-powered, Python-based conversational AI utility for building a chatbot on top of your own data sources and web pages. The resulting chatbot answers complex questions using machine learning and natural language processing (NLP) techniques.

Home Page: https://jainsid24.github.io/know-my-doc/

Python 45.29% CSS 24.95% HTML 23.72% Dockerfile 6.04%

know-my-doc's Introduction

KnowMyDoc

KnowMyDoc is a GPT-3.5-powered, Python-based conversational AI tool for building a chatbot that cites its references, using machine learning and natural language processing (NLP) techniques. The utility is fully containerized and API-driven, so a chatbot can be set up quickly.

KnowMyDoc uses the LangChain library for LLM prompt engineering and conversation chaining. Users can customize the chatbot's prompts and tailor its responses to the context and tone of the conversation. This LLM-based approach is intended to keep the conversation consistent and coherent even over large amounts of data, and to return relevant sources with each response. The chatbot also stays within the confines of the provided knowledge.
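
As a rough illustration of how persona, tone, and retrieved context might be combined into a single prompt, here is a minimal sketch in plain Python. The template wording and the `build_prompt` helper are hypothetical and written only for this example; the project's actual LangChain prompt may look quite different.

```python
# Hypothetical prompt template; the project's real prompt may differ.
PROMPT_TEMPLATE = (
    "You are {persona}. Answer in a {tone} tone.\n"
    "Use only the context below. If the answer is not in the context, "
    "say that you do not know.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(persona, tone, context_chunks, question):
    """Fill the template with the persona, tone, retrieved chunks, and question."""
    # Separate chunks visibly so the model can tell the sources apart.
    context = "\n---\n".join(context_chunks)
    return PROMPT_TEMPLATE.format(
        persona=persona, tone=tone, context=context, question=question
    )
```

Restricting the answer to the supplied context is one simple way to keep a chatbot within its knowledge sources; it relies on the model following the instruction rather than on any hard guarantee.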

In addition, KnowMyDoc utilizes the Chroma vector similarity search engine to enable fast and efficient lookup of relevant data. By creating embeddings of users' documents and web pages, KnowMyDoc can quickly identify and retrieve the most relevant information for the user's queries.
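
The idea behind embedding-based lookup can be sketched without any vector database. The example below is not the Chroma API; it is a small stand-in that ranks texts by cosine similarity, using made-up three-dimensional vectors in place of real embeddings. The names `cosine_similarity`, `top_k`, and `INDEX` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=2):
    """Return the k (text, score) pairs most similar to query_vec."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real embeddings (values invented for illustration).
INDEX = [
    ("launch notes", [0.9, 0.1, 0.0]),
    ("mirror design", [0.1, 0.9, 0.0]),
    ("budget report", [0.0, 0.1, 0.9]),
]
```

Calling `top_k([1.0, 0.2, 0.0], INDEX)` ranks "launch notes" first and "mirror design" second. A real vector store performs the same kind of ranking, typically with approximate indexes so it scales beyond a linear scan.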

Other features of KnowMyDoc include:

  • Support for loading documents from local data sources and web URLs
  • Support for persona and message tone
  • AI question answering (QA) limited to the provided knowledge sources
  • Text splitting to optimize indexing and similarity search
  • NLTK support for text processing and tokenization
  • Support for OpenAI embeddings and vector stores, including Chroma
  • Logging support for troubleshooting and analysis
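
To make the text-splitting feature concrete, the following sketch cuts a document into fixed-size character chunks with a small overlap, so that a sentence falling on a boundary still appears intact in at least one chunk. The function name `split_text` and the default sizes are assumptions for illustration; the project may split on tokens or sentences instead.

```python
def split_text(text, chunk_size=200, overlap=20):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Smaller chunks tend to give more precise matches during similarity search, while larger chunks carry more context into the prompt; the overlap trades some index size for fewer broken passages.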

Configuration

Before you can use the utility, you need to set up the configuration file. The configuration file is a YAML file that contains the following options:

  • openai_api_key: Your OpenAI API key.
  • data_directory: The directory where your local data sources are located.
  • data_files_glob: A glob pattern that specifies which files in data_directory to use as data sources.
  • webpages: A list of URLs of webpages to use as data sources.
  • tone: The tone to use for the chatbot's responses (e.g., "formal", "informal", "friendly", etc.).
  • persona: The persona to use for the chatbot.

You can copy the config.example.yaml file to config.yaml and modify the options as needed.
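
A quick sanity check of the options listed above might look like the sketch below. It assumes the YAML file has already been parsed into a Python dict (for instance with PyYAML's `yaml.safe_load`), that every listed option is required, and that each has the type shown; these are assumptions, and `validate_config` and all example values are hypothetical.

```python
# Expected option names come from the list above; the types are assumptions.
REQUIRED_KEYS = {
    "openai_api_key": str,
    "data_directory": str,
    "data_files_glob": str,
    "webpages": list,
    "tone": str,
    "persona": str,
}


def validate_config(config):
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in config:
            problems.append(f"missing option: {key}")
        elif not isinstance(config[key], expected_type):
            problems.append(f"{key} should be of type {expected_type.__name__}")
    return problems


# Placeholder values for illustration only.
EXAMPLE_CONFIG = {
    "openai_api_key": "sk-placeholder",
    "data_directory": "data",
    "data_files_glob": "*.pdf",
    "webpages": ["https://jainsid24.github.io/know-my-doc/"],
    "tone": "friendly",
    "persona": "a helpful assistant",
}
```

Running `validate_config(EXAMPLE_CONFIG)` returns an empty list, while a dict with a missing or mistyped option yields one message per problem.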

Getting Started

To use this utility:

  1. Clone the repository:
git clone https://github.com/jainsid24/know-my-doc
  2. Build the Docker image by running the following command in the terminal:
docker build -t know-my-doc:latest .
  3. Once the image is built, run the Docker container using the following command:
docker run -p 5001:5001 know-my-doc
  4. Call the API with curl or Postman, replacing <pods-ip-address> with the address of the host running the container (localhost when Docker runs on your own machine):
curl --header "Content-Type: application/json" \
     --request POST \
     --data '{"question": "When was JWST launched?"}' \
     http://<pods-ip-address>:5001/api/chat
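
The same request can be sent from Python using only the standard library. This is a minimal sketch mirroring the curl call above; the helper names `build_chat_request` and `ask` are hypothetical, and because the response format is not documented here, the body is returned as raw text.

```python
import json
import urllib.request


def build_chat_request(host, question, port=5001):
    """Build a POST request for the /api/chat endpoint shown in the curl example."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(host, question, port=5001, timeout=30):
    """Send the question and return the raw response body as text."""
    request = build_chat_request(host, question, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example (assumes the container is running on this machine):
# print(ask("localhost", "When was JWST launched?"))
```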

Contributing

If you find a bug or have an idea for a new feature, please open an issue or submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

know-my-doc's People

Contributors

github-prathma, jainsid24
