zaida.ai's Introduction

Zaida AI voice assistant

Architecture

The server side of this project uses a microservice architecture, chosen to keep client-side resource usage minimal, to scale well, and to make new features easy to add. Each integration is implemented as a decoupled microservice running in its own container, with models and other resources preloaded at startup so that inference requests do not pay a loading cost. The microservices are currently orchestrated with Docker Compose.

Details on current integrations:

  • offline Speech-to-Text with OpenAI Whisper
    • configured for processing audio device input streams in real time
    • custom Docker image with a minimal serve.py module
    • WebSocket protocol support with websockets library
    • asynchronous voice recording (client side) and transcription (server side) with asyncio library
  • offline Text-to-Speech with MycroftAI Mimic3
  • offline Natural Language Understanding with Rasa Open Source and spaCy
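
The asynchronous split between recording and transcription can be illustrated with a small producer/consumer sketch. This is not the project's serve.py: in the real setup the two halves run in separate processes connected over a WebSocket, whereas here an asyncio.Queue stands in for the connection so the example runs with the standard library only. The names record_chunks, transcribe_chunks, fake_transcribe and run_pipeline are hypothetical, and fake_transcribe is a placeholder where a Whisper call would go.

```python
import asyncio


async def record_chunks(queue, chunks):
    """Stand-in for the client-side recorder: push audio chunks as they arrive."""
    for chunk in chunks:
        await queue.put(chunk)
        await asyncio.sleep(0)  # yield to the event loop between chunks
    await queue.put(None)  # sentinel: recording has finished


async def fake_transcribe(chunk):
    """Placeholder for a speech-to-text call (assumption: not the real model)."""
    await asyncio.sleep(0)
    return f"<{len(chunk)} bytes>"


async def transcribe_chunks(queue, transcribe):
    """Stand-in for the server side: consume chunks until the sentinel."""
    results = []
    while True:
        chunk = await queue.get()
        if chunk is None:
            break
        results.append(await transcribe(chunk))
    return results


async def run_pipeline(chunks):
    # A bounded queue applies back-pressure if transcription falls behind.
    queue = asyncio.Queue(maxsize=8)
    recorder = asyncio.create_task(record_chunks(queue, chunks))
    texts = await transcribe_chunks(queue, fake_transcribe)
    await recorder
    return texts


if __name__ == "__main__":
    print(asyncio.run(run_pipeline([b"\x00" * 320, b"\x00" * 640])))
```

Because recording and transcription are separate coroutines, a slow transcription step does not block the recorder until the queue fills up.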

Features

  • real-time Speech-To-Speech interaction
  • text summarization
  • current time lookup for any country, city, or state recognized by spaCy, falling back to the local time if no location is given
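
As a rough illustration of the time feature, the sketch below resolves a location name to a timezone and falls back to local time when none is given. It is an assumption-laden example, not the project's code: the LOCATION_TO_ZONE table and the function names are made up for illustration, and the step that extracts the location from the user's utterance (done by the NLU service) is not shown.

```python
from datetime import datetime
from typing import Optional
from zoneinfo import ZoneInfo

# Hypothetical lookup table; a real assistant would need a much larger mapping
# from recognized place names to IANA timezone names.
LOCATION_TO_ZONE = {
    "tokyo": "Asia/Tokyo",
    "romania": "Europe/Bucharest",
    "california": "America/Los_Angeles",
}


def resolve_zone_name(location: Optional[str]) -> Optional[str]:
    """Return the IANA timezone name for a location, or None if unknown."""
    if not location:
        return None
    return LOCATION_TO_ZONE.get(location.strip().lower())


def current_time(location: Optional[str] = None) -> datetime:
    """Return the current time at the location, or local time if none is given."""
    if not location:
        return datetime.now().astimezone()  # timezone-aware local time
    zone_name = resolve_zone_name(location)
    if zone_name is None:
        raise LookupError(f"no timezone known for {location!r}")
    return datetime.now(ZoneInfo(zone_name))


if __name__ == "__main__":
    print(current_time().isoformat())
```

Note that zoneinfo relies on the system's timezone database (or the tzdata package) being available when a named zone is requested.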

Setup

git clone https://github.com/mcleonte/zaida.ai.git
cd zaida.ai
docker compose up  # add --detach flag or run it in another terminal
./nlu-train.sh
poetry install

# Option 1
poetry run zaida

# Option 2
poetry shell  # or "source .venv/bin/activate"
zaida

If ./nlu-train.sh fails with a ConnectionError, wait a few more seconds and run it again: the Docker services need time to start up, and the NLU service takes slightly longer than the others. Once initialization is complete, interactions should feel real-time.
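
The wait-and-retry advice above can also be automated. The following is a minimal sketch of a generic retry helper, not something shipped with the project; wait_until_ready and nlu_port_open are hypothetical names, and the host and port in nlu_port_open are assumptions (5005 is Rasa's default port, but the project's Docker Compose file may map a different one).

```python
import socket
import time


def wait_until_ready(check, attempts=10, delay=2.0, sleep=time.sleep):
    """Call check() until it stops raising ConnectionError.

    Returns the number of attempts used. If every attempt fails,
    the last ConnectionError is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            check()
            return attempt
        except ConnectionError:
            if attempt == attempts:
                raise
            sleep(delay)  # give the service more time to start


def nlu_port_open(host="localhost", port=5005):
    """Assumed readiness probe: succeeds once something accepts TCP connections.

    A refused connection raises ConnectionRefusedError, a ConnectionError subclass.
    """
    socket.create_connection((host, port), timeout=1).close()


if __name__ == "__main__":
    try:
        used = wait_until_ready(nlu_port_open, attempts=3, delay=1.0)
        print(f"service reachable after {used} attempt(s)")
    except ConnectionError:
        print("service still unreachable")
```

An open port only shows that the container is listening, so the training script may still need a moment after this check passes.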

Notes

This is still a very new project: I have only just finished polishing the STT and TTS integrations. There isn't much on the NLU side yet, which is what I want to focus on next. So far I also plan to develop features for:

  • daily tasks and workflows
  • window & environment management
  • filesystem & browser navigation

Many others will follow.
