Welcome to gpt-voice-assistant

This software builds on the carter-voice-assistant project and replaces the Carter API with the 👾 OpenAI API. With this integration, the assistant can give more accurate and sophisticated responses to user input.

gpt-voice-assistant pixel-art by JVPC0D3R

🛠 how it works

GPT-3.5 is the core of the assistant, but this project uses other AI models to extract more data from the user and their environment:

  • The first model is 🦻 Whisper, which was already built into the original Carter project. Whisper listens to the user and transcribes their voice into text.

  • To give the assistant vision, I used the 👁 Ultralytics YOLOv8 model, which can detect, classify and track objects in real time.

  • To give the assistant access to the Internet, I implemented a 🔍 SerpAPI-based module.

  • So that the assistant knows which action the user wants, I implemented a 📑 text classification model, which decides whether the user input is a chat, a vision query, a Google search or a farewell.

  • If the user command needs a Google search before calling GPT, the assistant has to get the arguments for the SerpAPI call. For that I used a 🔑 keyword extraction model.
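
The routing described in the list above can be pictured with a small sketch. This is not the project's code: the real assistant uses trained text classification and keyword extraction models, while the hypothetical `classify_intent`, `extract_keywords` and `route` functions below stand in for them with plain keyword rules, just to show how an input could be sorted into chat, vision, search or farewell and how a search query could be reduced to its keywords.

```python
import re
from typing import List, Tuple

# Words ignored by the stand-in keyword extractor (an illustrative list, not the project's).
STOPWORDS = {"a", "an", "the", "for", "of", "to", "me", "please",
             "search", "google", "on", "in", "is", "what"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def classify_intent(text: str) -> str:
    """Stand-in for the text classification model: keyword rules only."""
    words = set(tokenize(text))
    if words & {"bye", "goodbye", "farewell"}:
        return "farewell"
    if words & {"see", "look", "camera"}:
        return "vision"
    if words & {"search", "google"}:
        return "search"
    return "chat"


def extract_keywords(text: str) -> List[str]:
    """Stand-in for the keyword extraction model: drop stopwords."""
    return [token for token in tokenize(text) if token not in STOPWORDS]


def route(text: str) -> Tuple[str, str]:
    """Return the chosen intent and the payload a handler would receive."""
    intent = classify_intent(text)
    if intent == "search":
        # A search handler would pass this query on to the search module.
        return intent, " ".join(extract_keywords(text))
    return intent, text
```

With these rules, `route("Search Google for the weather in Madrid")` yields the intent `"search"` with the query `"weather madrid"`, while an input without any trigger word falls through to `"chat"`.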

🛹 getting started

To run the gpt-voice-assistant, you will need an OpenAI API key and a SerpAPI key. I suggest creating a Python file named keys.py to store the API key variables.

📦 installation

To install the gpt-voice-assistant, first clone the repository:

git clone https://github.com/JVPRUGBIER/gpt-voice-assistant

Install the required dependencies:

pip install -r requirements.txt

Create a 'keys.py' file in the project directory and add your OpenAI and SerpAPI keys:

OPENAI_API_KEY = "your_api_key"
SERP_API_KEY = "your_api_key"
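
As a rough sketch of how these variables could be read at startup, the hypothetical `load_api_keys` helper below imports the `keys` module and returns both values. The fallback to environment variables of the same names is an assumption added for illustration, not something the project is known to do.

```python
import importlib
import os


def load_api_keys(module_name: str = "keys") -> dict:
    """Read both API keys from keys.py, falling back to environment variables."""
    names = ("OPENAI_API_KEY", "SERP_API_KEY")
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        module = None  # no keys.py found; only the environment is consulted

    loaded = {
        name: getattr(module, name, None) or os.environ.get(name)
        for name in names
    }
    missing = [name for name, value in loaded.items() if not value]
    if missing:
        # Fail early with a clear message instead of at the first API call.
        raise RuntimeError(f"Missing API keys: {', '.join(missing)}")
    return loaded
```

Whichever way the keys are stored, keeping keys.py out of version control avoids publishing them by accident.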

๐Ÿƒ run the assistant:

Chat using text with GPT

python chat.py -t

Chat using text with GPT and let the assistant read the response out loud

python chat.py -t -v

Have a full speech chat with the gpt-voice-assistant

python chat.py -l -v
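
To make the flag combinations above easier to read, here is a minimal sketch of how such switches could be parsed with `argparse`. Only the flags `-t`, `-v` and `-l` come from the commands above; the destination names `text`, `voice` and `listen`, the help strings and the `describe_mode` helper are assumptions, and the real chat.py may parse its arguments differently.

```python
import argparse
from typing import List, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three switches used in the commands above."""
    parser = argparse.ArgumentParser(prog="chat.py")
    parser.add_argument("-t", dest="text", action="store_true",
                        help="type the input as text (assumed meaning)")
    parser.add_argument("-v", dest="voice", action="store_true",
                        help="read the response out loud (assumed meaning)")
    parser.add_argument("-l", dest="listen", action="store_true",
                        help="listen to the microphone (assumed meaning)")
    return parser


def describe_mode(argv: List[str]) -> Tuple[str, str]:
    """Map the parsed flags to an input source and an output style."""
    args = build_parser().parse_args(argv)
    source = "microphone" if args.listen else "keyboard"
    output = "spoken" if args.voice else "printed"
    return source, output
```

Under these assumptions, `describe_mode(["-l", "-v"])` corresponds to the full speech chat: microphone input and spoken output.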
