mayooear / langchain-supabase-website-chatbot

Build a ChatGPT chatbot for your website using LangChain, Supabase, TypeScript, OpenAI, and Next.js.

Home Page: https://www.youtube.com/watch?v=R2FMzcsmQY8

langchain-supabase-website-chatbot's Introduction

LangChain & Supabase - Create a ChatGPT Chatbot for Your Website

Create a ChatGPT chatbot for your website using LangChain, Supabase, TypeScript, OpenAI, and Next.js. LangChain is a framework that makes it easier to build scalable AI/LLM apps. Supabase is an open source Postgres database that can store embeddings using the pgvector extension.

Tutorial video

Get in touch via Twitter if you need help.

The visual guide of this repo and tutorial is in the visual guide folder.

Development

  1. Clone the repo
git clone [github https url]
  2. Install packages
pnpm install
  3. Set up your .env file
  • Copy .env.local.example into .env. Your .env file should look like this:
OPENAI_API_KEY=

NEXT_PUBLIC_SUPABASE_URL=
NEXT_PUBLIC_SUPABASE_ANON_KEY=
SUPABASE_SERVICE_ROLE_KEY=

  • Visit OpenAI to retrieve your API key and insert it into your .env file.
  • Visit Supabase to create a database and retrieve your keys from the user dashboard, as per the docs instructions.
  4. In the config folder, replace the URLs in the array with your website URLs (the script requires more than one URL).

  5. In utils/custom_web_loader.ts, inside the load function, replace the values of title, date and content with the CSS selectors for the text you'd like to extract from a given webpage. You can learn more about how to use Cheerio in its documentation.

You can add custom elements to the metadata to meet your needs. Note, however, that the default loader format shown below expects the returned value to contain at least a string for pageContent and metadata with a source property:

async load(): Promise<Document[]> {
  const $ = await this.scrape();
  const text = $("body").text();
  const metadata = { source: this.webPath };
  return [new Document({ pageContent: text, metadata })];
}

The pageContent and metadata will later be stored in your supabase database table.
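
As a rough illustration of the custom loader step above, the sketch below pulls a title, date and content out of a page and returns the pageContent/metadata shape the default loader expects. The selectors (h1.entry-title and so on), the Select type and the extractDocument helper are assumptions made for this example and are not part of the repo. Select only mimics the $(selector).text() call shown above so the snippet runs without Cheerio installed.

```typescript
// Minimal stand-in for the Cheerio-style function returned by scrape().
// Declared locally (assumption) so the sketch has no dependencies.
type Select = (selector: string) => { text(): string };

interface ScrapedDocument {
  pageContent: string;
  metadata: { source: string; title?: string; date?: string };
}

// Hypothetical selectors: replace them with ones that match your own pages.
const SELECTORS = {
  title: "h1.entry-title",
  date: "time.entry-date",
  content: "div.entry-content",
};

// Collapse runs of whitespace left over from HTML layout.
function collapseWhitespace(s: string): string {
  return s.replace(/\s+/g, " ").trim();
}

function extractDocument($: Select, webPath: string): ScrapedDocument {
  const title = collapseWhitespace($(SELECTORS.title).text());
  const date = collapseWhitespace($(SELECTORS.date).text());
  const content = collapseWhitespace($(SELECTORS.content).text());

  // Fall back to the whole body when the content selector matches nothing.
  const pageContent = content || collapseWhitespace($("body").text());

  // source is always set; title and date are added only when found.
  const metadata: ScrapedDocument["metadata"] = { source: webPath };
  if (title) metadata.title = title;
  if (date) metadata.date = date;

  return { pageContent, metadata };
}
```

Inside the real load function you would pass the $ returned by this.scrape() and wrap the result in new Document(...), as in the default loader above.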

  6. Copy and run schema.sql in your Supabase SQL editor
  • Cross-check that the documents table and the match_documents function both exist in the database.
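
In the match_documents function quoted in the issues further down this page, similarity is computed as 1 - (embedding <=> query_embedding), where <=> is pgvector's cosine distance operator. Assuming your schema.sql follows the same pattern, the value returned for each row is plain cosine similarity. The helper below is a hypothetical sketch of that computation, handy for sanity-checking results outside the database. It is not code from the repo.

```typescript
// Cosine similarity between two vectors of equal length.
// Equivalent to 1 - cosine distance, which is what the SQL expression
// 1 - (a <=> b) evaluates to with pgvector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Vectors pointing the same way score close to 1, orthogonal vectors score 0, and opposite vectors score close to -1.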

Instructions for scraping and embedding

To run the scraping and embedding script in scripts/scrape-embed.ts simply run:

npm run scrape-embed

This script will visit all the urls noted in the config folder and extract the data you specified in the custom_web_loader.ts file.

Then it will use OpenAI's embeddings model (text-embedding-ada-002) to convert your scraped data into vectors.
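
text-embedding-ada-002 returns 1536-dimensional vectors, which matches the vector(1536) column in the documents table quoted in the issues below. If you ever insert rows yourself, a small guard like the following can catch a mismatched model or a malformed vector before it reaches the database. This is a sketch under assumptions: DocumentRow and toDocumentRow are hypothetical names, and the column names are taken from that quoted table definition.

```typescript
// Dimension of text-embedding-ada-002 vectors and of the vector(1536) column.
const EMBEDDING_DIMENSIONS = 1536;

// Row shape assumed from the documents table: content, metadata, embedding.
interface DocumentRow {
  content: string;
  metadata: Record<string, unknown>;
  embedding: number[];
}

// Build a row for insertion, rejecting vectors the column cannot hold.
function toDocumentRow(
  pageContent: string,
  metadata: Record<string, unknown>,
  embedding: number[],
): DocumentRow {
  if (embedding.length !== EMBEDDING_DIMENSIONS) {
    throw new Error(
      `Expected ${EMBEDDING_DIMENSIONS} dimensions, got ${embedding.length}`,
    );
  }
  if (!embedding.every((x) => Number.isFinite(x))) {
    throw new Error("Embedding contains a non-finite value");
  }
  return { content: pageContent, metadata, embedding };
}
```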

Run the app

Once you've verified that the embeddings and content have been successfully added to your Supabase table, you can run the app with npm run dev and type a question to ask your website.

Credit

Frontend of this repo is inspired by langchain-chat-nextjs

This repo uses in-depth Notion guides from the website of productivity expert, Thomas Frank.

langchain-supabase-website-chatbot's Issues

Error: supabaseUrl is required.

Error: supabaseUrl is required.

NEXT_PUBLIC_SUPABASE_URL=https://ciwcasymbzrxeddwtfyt.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzdXBhYmFzZSIsInJlZiI6ImNpd2Nhc3ltYnpyeGVkZHd0Znl0Iiwicm9sZSI6ImFub24iLCJpYXQiOjE2OTE1ODgxOTQsImV4cCI6MjAwNzE2NDE5NH0.gFH2szBvK8MXP9A7YDnAX_dfj8x9K3pzO9Rox0IC6do

SUPABASE_SERVICE_ROLE_KEY=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzdXBhYmFzZSIsInJlZiI6ImNpd2Nhc3ltYnpyeGVkZHd0Znl0Iiwicm9sZSI6ImFub24iLCJpYXQiOjE2OTE1ODgxOTQsImV4cCI6MjAwNzE2NDE5NH0.gFH2szBvK8MXP9A7YDnAX_dfj8x9K3pzO9Rox0IC6do
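
This error is raised by the Supabase client when it is created with an empty URL, so a likely cause is that the variables are not actually loaded into the process that runs the script, even though they are present in the file. One way to narrow it down is to fail early with a clearer message. The sketch below is an assumption-laden example and not part of the repo: requireEnv is a hypothetical helper, and the variable names are the four listed in the setup section above.

```typescript
// The four variables listed in the .env setup step.
const REQUIRED_VARS = [
  "OPENAI_API_KEY",
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
  "SUPABASE_SERVICE_ROLE_KEY",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type Env = Record<string, string | undefined>;

// Return the required variables, or throw one error naming every missing one.
function requireEnv(env: Env): Record<RequiredVar, string> {
  const missing = REQUIRED_VARS.filter(
    (name) => (env[name] ?? "").trim() === "",
  );
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const out = {} as Record<RequiredVar, string>;
  for (const name of REQUIRED_VARS) {
    out[name] = env[name] as string;
  }
  return out;
}

// Typical use at the top of a script (requires Node's process.env):
// const env = requireEnv(process.env);
```

If this reports NEXT_PUBLIC_SUPABASE_URL as missing, check that the file name and location match what the script loads. Note also that the two key values posted above are identical strings, so the service role entry appears to be a copy of the anon key.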

Set owner for document while embedding and querying (Supabase)

Hello,

This is not an issue, just a question I need help with:

I added owner_sid to the documents table to determine the ownership of documents by users and only allow users with sid (uuid) to query their own documents. Here is the SQL snippet I have customized. I am unsure how to pass the user sid when 'embedding' and 'querying' though.

-- Enable the pgvector extension to work with embedding vectors
create extension vector;

-- Create a table to store your documents
create table documents (
  id bigserial primary key,
  content text, -- corresponds to Document.pageContent
  metadata jsonb, -- corresponds to Document.metadata
  embedding vector(1536), -- 1536 works for OpenAI embeddings, change if needed
  -- customize
  owner_sid uuid not null -- only users can query their documents
);

-- Create a function to search for documents
create function match_documents (
  query_embedding vector(1536),
  match_count int,
  -- owner
  document_owner uuid
) returns table (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
language plpgsql
as $$
#variable_conflict use_column
begin
  return query
  select
    id,
    content,
    metadata,
    1 - (documents.embedding <=> query_embedding) as similarity,
    -- owner
    owner_sid
  from documents
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;

-- Create an index to be used by the search function
create index on documents
  using ivfflat (embedding vector_cosine_ops)
  with (lists = 100);
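
On the querying side, one possible approach is to pass the owner as a named argument when calling the function through the client's rpc method. The sketch below assumes the parameter names from the SQL above (query_embedding, match_count, document_owner). RpcClient is a locally declared stand-in that mirrors the shape of the Supabase client's rpc call so the example runs without the SDK, and buildMatchArgs and matchOwnedDocuments are hypothetical helpers. Two caveats about the function as posted: it never filters on the owner, so it would likely need a where owner_sid = document_owner clause, and its select list returns five columns while returns table declares four.

```typescript
// Stand-in for the part of the Supabase client used here (assumption).
interface RpcResult {
  data: unknown;
  error: { message: string } | null;
}
interface RpcClient {
  rpc(fn: string, args: Record<string, unknown>): PromiseLike<RpcResult>;
}

// Columns declared by "returns table" in match_documents.
interface MatchRow {
  id: number;
  content: string;
  metadata: Record<string, unknown>;
  similarity: number;
}

// Argument names must match the SQL function's parameter names.
function buildMatchArgs(
  queryEmbedding: number[],
  matchCount: number,
  ownerSid: string,
): Record<string, unknown> {
  return {
    query_embedding: queryEmbedding,
    match_count: matchCount,
    document_owner: ownerSid,
  };
}

// Call match_documents for a single owner and return the matched rows.
async function matchOwnedDocuments(
  client: RpcClient,
  queryEmbedding: number[],
  matchCount: number,
  ownerSid: string,
): Promise<MatchRow[]> {
  const { data, error } = await client.rpc(
    "match_documents",
    buildMatchArgs(queryEmbedding, matchCount, ownerSid),
  );
  if (error) {
    throw new Error(error.message);
  }
  return (data ?? []) as MatchRow[];
}
```

On the embedding side, the owner would have to be written when rows are inserted, for example by including owner_sid in each inserted row, since the column is declared not null.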
