GithubHelp home page GithubHelp logo

hyili / chatgptassistant Goto Github PK

View Code? Open in Web Editor NEW
28.0 3.0 4.0 39 KB

Voice2voice ChatGPT Assistant built through OpenAI Whisper (speech2text) + OpenAI ChatGPT API + Google Text2Speech Service (text2speech)

Python 83.52% Shell 4.61% CSS 1.89% HTML 9.98%
chatgpt chatgpt-api text2speech whisper assistant chat google openai python speech2text voice voice-assistant

chatgptassistant's Introduction

ChatGPT Assistant

A voice2voice chatgpt assistant
Constructed by using OpenAI Whisper + OpenAI ChatGPT API + Google Text2Speech Service

Introduce

  • Speech2Text through OpenAI's Whisper Model (currently using local CPU)
  • Chat with ChatGPT through its API
  • Text2Speech through Google's Text2Speech Service

News

  • 2023/03/26:
    • Replace sox with pydub for playing the speech from Google
    • Move the prompting from system to user role, which is more effective
  • 2023/03/12:
    • SIMPLE WebUI support for chat history with automatically websocket notification
    • Mute the code blocks before get into text2speech service
  • 2023/03/07:
    • We can now ask ChatGPT to reset the session for us. Therefore it will clear out the current session, preventing spend the quota on unrelated history messages.
    • Use PyAudio instead of using arecord/lame which is only available for specific platform

Known Issues:

  • 2023/03/12:
    • Code blocks might be corrupted, if it contains "\n" "\t"
    • Websocket active notify has large delay. Don't know why... need time to survey

Attention

  • Whisper would automatically download model for the first time
  • Make sure use a python virtual env before start
  • Currently, only 1 background session available at any time

Requirements

Run the following command manually or using scripts/install.sh

$ pip3 insntall -r requirements.txt
$ apt install portaudio19-dev
$ mkdir record private audio markdown

Preparation (ChatGPT API Key)

Get your api key here: https://platform.openai.com/account/api-keys

$ echo "{CHATGPT_ACCESS_KEY}" > private/api_keys

Simple Run (ChatGPT + Text2Speech)

You can input text and send to ChatGPT through API
Then, you can hear the response

$ ./scripts/run_simple.sh

ChatGPTAssistant in the background (Speech2Text + ChatGPT + Text2Speech)

Start/Restart a ChatGPT session (wait for your voice audio file in the background)

$ ./scripts/start_background_session.sh

Stop the previous ChatGPT session if there is one

$ ./scripts/stop_background_session.sh

Start to record voice after it runs, ctrl+c when finished

$ ./scripts/record_audio.sh

ChatGPTAssistant UI (Speech2Text + ChatGPT + TextUI for response)

Under Construction ...

TBD ...

  • keyboard shortcut to record the user's voice
  • keyboard shortcut to restart the ChatGPT session
  • be able to load previous session from history
  • ...

Reference Sites

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.