Next.js OpenAI Doc Search Starter

This starter takes all the .mdx files in the pages directory and processes them to use as custom context within OpenAI Text Completion prompts.

Deploy

Deploy this starter to Vercel. The Supabase integration will automatically set the required environment variables and configure your Database Schema. All you have to do is set your OPENAI_KEY and you're ready to go!

Deploy with Vercel

Technical Details

Building your own custom ChatGPT involves four steps:

  1. [👷 Build time] Pre-process the knowledge base (your .mdx files in your pages folder).
  2. [👷 Build time] Store embeddings in Postgres with pgvector.
  3. [🏃 Runtime] Perform vector similarity search to find the content that's relevant to the question.
  4. [🏃 Runtime] Inject content into OpenAI GPT-3 text completion prompt and stream response to the client.

👷 Build time

Steps 1 and 2 happen at build time, e.g. when Vercel builds your Next.js app. During the build, the generate-embeddings script runs and performs the following tasks:

sequenceDiagram
    participant Vercel
    participant DB (pgvector)
    participant OpenAI (API)
    loop 1. Pre-process the knowledge base
        Vercel->>Vercel: Chunk .mdx pages into sections
        loop 2. Create & store embeddings
            Vercel->>OpenAI (API): create embedding for page section
            OpenAI (API)->>Vercel: embedding vector(1536)
            Vercel->>DB (pgvector): store embedding for page section
        end
    end
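
The chunking step in the diagram above can be illustrated with a small sketch. It assumes sections are delimited by Markdown heading lines, which is a simplification: the starter's generate-embeddings script may parse .mdx files differently. The names `Section` and `splitIntoSections` are hypothetical and exist only for this example.

```typescript
// Hypothetical shape of one chunk; not taken from the starter's source.
interface Section {
  heading: string | null; // null for content that precedes the first heading
  content: string;
}

// Split a Markdown/MDX document into sections at heading lines (#, ##, ...).
// Assumption: headings start at column 0 and code samples contain no '#' lines.
function splitIntoSections(mdx: string): Section[] {
  const sections: Section[] = [];
  let current: Section = { heading: null, content: "" };

  // Push the section being built, unless it is an empty preamble.
  const flush = (): void => {
    if (current.heading !== null || current.content.trim() !== "") {
      sections.push({ heading: current.heading, content: current.content.trim() });
    }
  };

  for (const line of mdx.split("\n")) {
    const match = /^#{1,6}\s+(.*)$/.exec(line);
    if (match) {
      // A heading line closes the previous section and opens a new one.
      flush();
      current = { heading: (match[1] ?? "").trim(), content: "" };
    } else {
      current.content += line + "\n";
    }
  }
  flush();
  return sections;
}
```

Each returned section would then be sent to the embeddings endpoint on its own, so that a later search can return a focused passage rather than a whole page.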

In addition to storing the embeddings, this script generates a checksum for each of your .mdx files and stores it in a separate database table, so that embeddings are regenerated only when a file has changed.
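
The checksum comparison might look roughly like the sketch below. The hash algorithm (SHA-256) is an assumption, since the text does not say which one the script uses, and the `Map` stands in for the checksum table. `checksum`, `needsRegeneration`, and `markProcessed` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Hex digest of the file contents. SHA-256 is an assumption for this sketch.
function checksum(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// `stored` stands in for the checksum table: file path -> last known checksum.
// A file needs new embeddings if it is unknown or its contents have changed.
function needsRegeneration(
  stored: Map<string, string>,
  path: string,
  content: string
): boolean {
  return stored.get(path) !== checksum(content);
}

// Record the checksum once the embeddings for `path` have been regenerated.
function markProcessed(
  stored: Map<string, string>,
  path: string,
  content: string
): void {
  stored.set(path, checksum(content));
}
```

Skipping unchanged files this way avoids repeated embedding requests for content that has not been edited between builds.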

๐Ÿƒ Runtime

Steps 3 and 4 happen at runtime, whenever the user submits a question. The following sequence of tasks is then performed:

sequenceDiagram
    participant Client
    participant Edge Function
    participant DB (pgvector)
    participant OpenAI (API)
    Client->>Edge Function: { query: lorem ipsum }
    critical 3. Perform vector similarity search
        Edge Function->>OpenAI (API): create embedding for query
        OpenAI (API)->>Edge Function: embedding vector(1536)
        Edge Function->>DB (pgvector): vector similarity search
        DB (pgvector)->>Edge Function: relevant docs content
    end
    critical 4. Inject content into prompt
        Edge Function->>OpenAI (API): completion request prompt: query + relevant docs content
        OpenAI (API)-->>Client: text/event-stream: completions response
    end
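
In the starter, the similarity search in step 3 runs inside Postgres through pgvector. To show what such a search computes, here is an in-memory sketch that ranks sections by cosine similarity. The choice of cosine similarity is an assumption (pgvector supports several distance operators), and `StoredSection`, `cosineSimilarity`, and `topMatches` are hypothetical names.

```typescript
// Hypothetical in-memory stand-in for a row in the embeddings table.
interface StoredSection {
  content: string;
  embedding: number[];
}

// Cosine similarity: 1 for the same direction, 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as dissimilar to everything instead of dividing by 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k sections whose embeddings are closest to the query embedding.
function topMatches(
  query: number[],
  sections: StoredSection[],
  k: number
): StoredSection[] {
  return sections
    .map((section) => ({ section, score: cosineSimilarity(query, section.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.section);
}
```

A linear scan like this is only practical for small collections. Delegating the search to the database keeps the embeddings out of the Edge Function's memory.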

The relevant files for this are the SearchDialog (Client) component and the vector-search (Edge Function).
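
Step 4 combines the query with the matched content into a single completion prompt. The sketch below shows one way to do that under stated assumptions: the prompt wording is invented for illustration, and a character budget stands in for the token limit a real implementation would more likely enforce. `buildPrompt` is a hypothetical name, not the function used in vector-search.

```typescript
// Build a completion prompt from the user's query and the matched sections.
// Assumption: `sections` is ordered from most to least relevant.
function buildPrompt(
  query: string,
  sections: string[],
  maxContextChars: number
): string {
  const included: string[] = [];
  let used = 0;
  for (const section of sections) {
    const text = section.trim();
    // Stop at the first section that would exceed the context budget.
    if (used + text.length > maxContextChars) break;
    included.push(text);
    used += text.length;
  }
  // The wording below is illustrative only.
  return [
    "Answer the question using only the context sections below.",
    "",
    "Context sections:",
    included.join("\n---\n"),
    "",
    `Question: ${query.trim()}`,
    "",
    "Answer:",
  ].join("\n");
}
```

The resulting string would be sent as the prompt of the completion request, and the streamed response forwarded to the client.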

The database initialization, including the setup of the pgvector extension, is stored in the supabase/migrations folder, which is automatically applied to your local Postgres instance when you run supabase start.

Local Development

Configuration

  • cp .env.example .env
  • Set your OPENAI_KEY in the newly created .env file.
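
A startup check along the following lines can surface a missing OPENAI_KEY early. This is a minimal sketch and not part of the starter: `requireEnv` is a hypothetical helper, and the environment is passed in as a plain record so the function can be exercised without touching the real process environment.

```typescript
// Hypothetical helper: return the named variable or fail with a clear message.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing environment variable ${name}; set it in your .env file`);
  }
  return value;
}

// In a Node.js context this might be called as requireEnv(process.env, "OPENAI_KEY").
```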

Start Supabase

Make sure you have Docker installed and running locally. Then run:

supabase start

Start the Next.js App

In a new terminal window, run:

pnpm dev

Learn More

Video: How I Built Supabase’s OpenAI Doc Search
