GPT chatbot that will learn documents and instruction manuals uploaded to it

Home Page: https://github.com/squarecat/doc-buddy

JavaScript 100.00%
chatbot embeddings gpt-3 gpt-4 openai pinecone

doc-buddy's Introduction

Documentation Buddy

Documentation Buddy is a Telegram chatbot powered by OpenAI's GPT models. Upload PDFs and other documents, and the bot learns from them so that you can ask it questions about their contents.

Useful in a variety of situations where you have too much information to learn and need a quick reference guide!

Requirements

  • Your own OpenAI account and API Key
  • A Pinecone account and API Key (for embeddings)
  • S3-compatible storage for the uploaded files (so they can be re-indexed if needed)
  • A Telegram key from BotFather

Optional Requirements:

  • If you don't use the DigitalOcean one-click deploy, you will also need to run OpenAI's embeddings project separately. You can get it from here. In that case, set the EMBEDDINGS_URL env variable to the URL where it is running. If you use the Deploy to DigitalOcean button below, it is deployed for you as the memory component.
  • A DigitalOcean account if you want to deploy directly to DigitalOcean's App Platform. If you don't already have one, you can sign up here.

Getting Started

  1. Create a new Telegram bot with BotFather. Step-by-step guide here.
  2. You will get a token from BotFather that looks like this: 4839574812:AAFD39kkdpWt3ywyRZergyOLMaJhac60qc. You will use this for the TELEGRAM_TOKEN env variable later.
  3. Create a Pinecone account. This is used to store the embeddings data of the uploaded documents.
  4. Get a Pinecone API key from the settings pages. You will use this for the PINECONE_API_KEY env variable. The index will be created automatically later so you don't need to do anything else.
  5. Create a DigitalOcean Space (or other S3-compatible storage). This stores the uploaded documents in case they need to be re-indexed later.
  6. Click the button below to deploy directly as a DigitalOcean App. This is a simple step-by-step process that takes less than 5 minutes.
  7. On the Environment Variables page of the DigitalOcean App creation process, replace the default values with the values you've generated.
  8. The app should start automatically and connect to the Telegram API.
  9. Open a chat in Telegram with the bot you created, and try chatting to it!
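
Before pasting the token from step 2 into TELEGRAM_TOKEN, a quick shape check can catch copy-paste mistakes. The sketch below is not part of doc-buddy: the helper name is hypothetical, and the pattern is an assumption inferred from the example token above (a numeric bot ID, a colon, then a secret). It cannot tell whether Telegram will actually accept the token.

```javascript
// Assumption: BotFather tokens look like "<numeric bot id>:<secret>",
// inferred from the example token in step 2. This only catches typos.
const TOKEN_SHAPE = /^\d+:[A-Za-z0-9_-]+$/;

// Hypothetical helper, not part of doc-buddy.
function looksLikeTelegramToken(value) {
  return typeof value === 'string' && TOKEN_SHAPE.test(value.trim());
}

console.log(looksLikeTelegramToken('4839574812:AAFD39kkdpWt3ywyRZergyOLMaJhac60qc')); // true
console.log(looksLikeTelegramToken('AAFD39kkdpWt3ywyRZergyOLMaJhac60qc')); // false
```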

Deploying the App

Click this button to deploy the app to the DigitalOcean App Platform. If you are not logged in, you will be prompted to log in with your DigitalOcean account.

Deploy to DigitalOcean

When you get to the "Review" section of the deploy, make sure to set the plan to Basic, as it defaults to Pro. The chat component should run fine on a $5/mo instance. The memory component will probably also be fine at $5/mo, but it is much faster at $10/mo.

Environment Variables

You'll need to add these to the DigitalOcean app env variables.

Key                    Default            Description
OPEN_AI_MODEL          "gpt-3.5-turbo"    The AI model that the assistant will use to reply. GPT-3.5-Turbo will be good enough for most cases
OPEN_AI_KEY                               Your OpenAI API key
TELEGRAM_TOKEN                            The token you get from BotFather
BEARER_TOKEN                              Set this to a randomly generated string
STORAGE_NAME                              The name of the S3 storage bucket, e.g. "doc-buddy"
STORAGE_URL                               The URL of the S3 bucket, e.g. https://sfo3.digitaloceanspaces.com/
STORAGE_KEY                               The API key of the S3 bucket
STORAGE_SECRET                            The secret key of the S3 bucket
DATASTORE              pinecone
PINECONE_API_KEY                          Your Pinecone API key
PINECONE_ENVIRONMENT                      The environment of your Pinecone index, e.g. "northamerica-northeast1-gcp"
PINECONE_INDEX                            The name of your Pinecone index, e.g. "doc-buddy-memory"
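
If you run the app outside DigitalOcean, it can help to check the variables above before starting it. The following is a minimal sketch, not code from the repository: `missingKeys` and `generateBearerToken` are hypothetical helpers, and the key list is copied from the table (OPEN_AI_MODEL and DATASTORE are left out because the table already lists a value for them). It also shows one way to produce the random string for BEARER_TOKEN using Node's built-in crypto module.

```javascript
const crypto = require('node:crypto');

// Keys copied from the Environment Variables table; the ones that already
// have a value listed there (OPEN_AI_MODEL, DATASTORE) are omitted.
const REQUIRED_KEYS = [
  'OPEN_AI_KEY', 'TELEGRAM_TOKEN', 'BEARER_TOKEN',
  'STORAGE_NAME', 'STORAGE_URL', 'STORAGE_KEY', 'STORAGE_SECRET',
  'PINECONE_API_KEY', 'PINECONE_ENVIRONMENT', 'PINECONE_INDEX',
];

// Hypothetical helper: returns the required keys that are unset or blank.
function missingKeys(env) {
  return REQUIRED_KEYS.filter((key) => !env[key] || env[key].trim() === '');
}

// Hypothetical helper: a random hex string suitable for BEARER_TOKEN.
function generateBearerToken(bytes = 32) {
  return crypto.randomBytes(bytes).toString('hex');
}

const missing = missingKeys(process.env);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(', ')}`);
}
console.log(`Example BEARER_TOKEN: ${generateBearerToken()}`);
```

The 32-byte default yields a 64-character hex string; any sufficiently long random value should work equally well.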

Usage - uploading documentation

Upload a document to the Telegram chat and Documentation Buddy will learn its contents.

Customizing the assistant

You can edit the prompt that is given to the assistant in the prompt.md file.
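
As a rough illustration of what editing prompt.md changes, the sketch below shows one common way a prompt file ends up in a chat request: as the system message placed ahead of the user's question. This is an assumption about the general pattern, not a description of doc-buddy's actual code; `buildMessages` and the fallback prompt text are hypothetical.

```javascript
const fs = require('node:fs');

// Hypothetical helper: doc-buddy's real wiring may differ.
// Puts the prompt text first as the system message, then the user's question.
function buildMessages(promptText, question) {
  return [
    { role: 'system', content: promptText.trim() },
    { role: 'user', content: question },
  ];
}

// Read prompt.md if it exists in the working directory; otherwise fall back
// to a placeholder (not the project's real prompt).
const promptPath = 'prompt.md';
const promptText = fs.existsSync(promptPath)
  ? fs.readFileSync(promptPath, 'utf8')
  : 'You are a helpful documentation assistant.';

console.log(buildMessages(promptText, 'How do I reset the device?'));
```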

Sponsor

Sponsored by Ellie - Your AI Email assistant. Ellie learns from your writing style and crafts replies as if they were written by you!

doc-buddy's People

Contributors

jivings

doc-buddy's Issues

tlsv1 unrecognized alert 112

Trying to get this to run locally instead of on DO since I prefer to selfhost. However am running into this error:

$ node index.js
[server]: started on 1333
[server]: Bot listening...
Error: write EPROTO 139860701919168:error:14094458:SSL routines:ssl3_read_bytes:tlsv1 unrecognized name:../deps/openssl/openssl/ssl/record/rec_layer_s3.c:1565:SSL alert number 112

    at WriteWrap.onWriteComplete [as oncomplete] (node:internal/stream_base_commons:94:16) {
  errno: -71,
  code: 'EPROTO',
  syscall: 'write',
  '$metadata': { attempts: 1, totalRetryDelay: 0 }
}
null

Any ideas? It looks like it's complaining about my S3 config, but as far as I'm able to tell, my S3 config is correct. I'm using MinIO to back it, with a Let's Encrypt cert.

STORAGE_NAME="docbuddy-bucket-name"
STORAGE_URL="https://s3.my.domain/"
STORAGE_KEY="my_access_key"
STORAGE_SECRET="my_secret_key"

What am I missing here? I have several other applications using this s3 endpoint, so I know it works, and I've confirmed the bucket manually via commandline too.

Based on the error, it looks like it's trying to use TLSv1? How can I configure it to use TLSv1.2?
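
One possible reading of the log above, offered as a guess rather than a confirmed diagnosis: TLS alert 112 is `unrecognized_name`, which a server sends when it does not recognize the hostname the client asked for via SNI. It does not indicate a TLS version mismatch. The `$metadata` field in the error suggests the AWS SDK for JavaScript v3, whose S3 client by default addresses buckets as a subdomain of the endpoint (virtual-hosted style). If the MinIO host or its certificate only covers `s3.my.domain`, requests to `docbuddy-bucket-name.s3.my.domain` could fail in this way. The SDK's `S3Client` accepts a `forcePathStyle: true` option that puts the bucket in the path instead; whether doc-buddy exposes that setting is not known from this page. The sketch below uses a hypothetical `bucketUrl` helper only to show the two URL shapes; the SDK builds these internally.

```javascript
// Hypothetical helper for illustration; it mimics the two S3 addressing
// styles and does not call any S3 API.
function bucketUrl(endpoint, bucket, { forcePathStyle }) {
  const url = new URL(endpoint);
  if (forcePathStyle) {
    // Path style: the host stays the same, the bucket goes in the path.
    url.pathname = `/${bucket}/`;
  } else {
    // Virtual-hosted style: the bucket becomes a subdomain of the endpoint.
    url.hostname = `${bucket}.${url.hostname}`;
  }
  return url.toString();
}

const endpoint = 'https://s3.my.domain/';
const bucket = 'docbuddy-bucket-name';

console.log(bucketUrl(endpoint, bucket, { forcePathStyle: false }));
// https://docbuddy-bucket-name.s3.my.domain/
console.log(bucketUrl(endpoint, bucket, { forcePathStyle: true }));
// https://s3.my.domain/docbuddy-bucket-name/
```

Only the first form needs DNS and a certificate that cover the bucket subdomain.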

"I had some trouble with that: Failed to save embeddings"

Any hints on why the Telegram bot is telling me "I had some trouble with that: Failed to save embeddings"? I've double and triple checked my Pinecone API key. It seems to be getting hung up there, as nothing is making it to my S3 bucket.

Digital Ocean Deploy Error

Hey,

I want to play a little bit with your Telegram bot, but the DigitalOcean deploy is not working properly. I receive the following error message when I click on the create resource button: "GitHub user does not have access to squarecat/documentation-buddy"

Thanks,
Csaba

Two quick questions

First. Is it possible to upload PDFs in bulk, and then direct the bot to digest the repository? Loading one at a time via chat may take a while.

Second. I’m leaning toward trying this on Discord rather than Telegram. Any tips/advice?

Hope you’re well.
