GithubHelp home page GithubHelp logo

hyili / chatgptassistant Goto Github PK

View Code? Open in Web Editor NEW
29.0 3.0 4.0 39 KB

Voice2voice ChatGPT Assistant built through OpenAI Whisper (speech2text) + OpenAI ChatGPT API + Google Text2Speech Service (text2speech)

Python 83.52% Shell 4.61% CSS 1.89% HTML 9.98%
chatgpt chatgpt-api text2speech whisper assistant chat google openai python speech2text voice voice-assistant

chatgptassistant's Introduction

ChatGPT Assistant

A voice2voice chatgpt assistant
Constructed by using OpenAI Whisper + OpenAI ChatGPT API + Google Text2Speech Service

Introduce

  • Speech2Text through OpenAI's Whisper Model (currently using local CPU)
  • Chat with ChatGPT through its API
  • Text2Speech through Google's Text2Speech Service

News

  • 2023/03/26:
    • Replace sox with pydub for playing the speech from Google
    • Move the prompting from system to user role, which is more effective
  • 2023/03/12:
    • SIMPLE WebUI support for chat history with automatically websocket notification
    • Mute the code blocks before get into text2speech service
  • 2023/03/07:
    • We can now ask ChatGPT to reset the session for us. Therefore it will clear out the current session, preventing spend the quota on unrelated history messages.
    • Use PyAudio instead of using arecord/lame which is only available for specific platform

Known Issues:

  • 2023/03/12:
    • Code blocks might be corrupted, if it contains "\n" "\t"
    • Websocket active notify has large delay. Don't know why... need time to survey

Attention

  • Whisper would automatically download model for the first time
  • Make sure use a python virtual env before start
  • Currently, only 1 background session available at any time

Requirements

Run the following command manually or using scripts/install.sh

$ pip3 insntall -r requirements.txt
$ apt install portaudio19-dev
$ mkdir record private audio markdown

Preparation (ChatGPT API Key)

Get your api key here: https://platform.openai.com/account/api-keys

$ echo "{CHATGPT_ACCESS_KEY}" > private/api_keys

Simple Run (ChatGPT + Text2Speech)

You can input text and send to ChatGPT through API
Then, you can hear the response

$ ./scripts/run_simple.sh

ChatGPTAssistant in the background (Speech2Text + ChatGPT + Text2Speech)

Start/Restart a ChatGPT session (wait for your voice audio file in the background)

$ ./scripts/start_background_session.sh

Stop the previous ChatGPT session if there is one

$ ./scripts/stop_background_session.sh

Start to record voice after it runs, ctrl+c when finished

$ ./scripts/record_audio.sh

ChatGPTAssistant UI (Speech2Text + ChatGPT + TextUI for response)

Under Construction ...

TBD ...

  • keyboard shortcut to record the user's voice
  • keyboard shortcut to restart the ChatGPT session
  • be able to load previous session from history
  • ...

Reference Sites

chatgptassistant's People

Contributors

hyili avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.