
nyegyesa's Introduction

Nyegyesa: a speech-based language-learning assistant!

Scenario

pending update

Why does this project exist?

I started this project to immerse myself in the world of open-source AI models. My goal is to learn a language by practicing the sentences I actually want to say, rather than the ones a language-learning app picks for me. It would entail:

  • a speech-to-text model
  • a language-to-language translation model
  • a text-to-speech model that will 'speak' the result to me in MY intonation [a voice-cloning model] so I can copy it directly.
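
A minimal sketch of how the three pieces listed above could chain together, assuming each model is wrapped in a plain Python function. The names practice_sentence and PracticeResult, and the stand-in stages in the usage lines, are hypothetical and are not code from this repo.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PracticeResult:
    heard: str           # what the speech-to-text stage transcribed
    translated: str      # the sentence in the language being learned
    spoken_audio: bytes  # audio from the (voice-cloning) text-to-speech stage


def practice_sentence(
    audio: bytes,
    stt: Callable[[bytes], str],
    translate: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> PracticeResult:
    """Run one recording through STT, then translation, then TTS."""
    heard = stt(audio)
    translated = translate(heard)
    spoken_audio = tts(translated)
    return PracticeResult(heard, translated, spoken_audio)


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without any model installed.
    demo = practice_sentence(
        b"raw-audio-bytes",
        stt=lambda audio: "good morning",
        translate=lambda text: "buenos días",
        tts=lambda text: text.encode("utf-8"),
    )
    print(demo.heard, "->", demo.translated)
```

Keeping each stage behind a plain callable means a model can be swapped out without touching the other two stages.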

How to set it up

setting up a virtual environment

I'm using Python 3.10.0rc2 with Python's built-in venv module.

To set up a venv on CMD:

python -m venv name_of_virtual_env

To activate the venv, use .\name_of_virtual_env\Scripts\Activate.ps1 on PowerShell and .\name_of_virtual_env\Scripts\activate.bat on CMD (though this might work for either?).

If you're using VS Code, install the Python extension and set up the venv inside your project folder. This way, VS Code will use the venv automatically when you run a Python file.
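
A quick way to confirm the venv is actually active before installing anything. This is only a sketch: the helper name in_virtualenv is hypothetical, and it relies on the standard behaviour that sys.prefix differs from sys.base_prefix inside a venv created with python -m venv.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("venv active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If this prints False after activation, the shell (or VS Code) is probably still using the global interpreter.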

downloading the dependencies

I used a requirements file because it's cleaner for me to keep track of. Since I'm using more than one model, I added comments to it noting which model needs which deps.

pip install -r .\requirements.txt

I also added a req_list_from_pip.txt file that lists every installed version, including the dependencies of dependencies.
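
A sketch for checking what actually got installed, assuming the goal is a listing similar to the one in req_list_from_pip.txt. It uses the standard-library importlib.metadata rather than pip itself, so the output may differ in small ways from what pip reports. The function name installed_versions is hypothetical.

```python
from importlib.metadata import distributions


def installed_versions() -> list[str]:
    """List installed packages as name==version lines, sorted by name."""
    lines = []
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with broken or missing metadata
            lines.append(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    for line in installed_versions():
        print(line)
```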

downloading and installing transformers

git clone https://github.com/huggingface/transformers.git
cd transformers
pip install -e .

installing ffmpeg for the text-to-speech model [currently using Bark]

Follow https://www.wikihow.com/Install-FFmpeg-on-Windows to download and install FFmpeg. I then added ffmpeg and ffmpeg-python to the deps.
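
After installing, it can help to confirm that the ffmpeg executable is reachable from Python, since the Python packages in the deps do not replace the FFmpeg program itself. A minimal sketch using only the standard library (the helper name find_ffmpeg is hypothetical):

```python
import shutil
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg not found on PATH; check the install and restart the terminal")
    else:
        print("ffmpeg found at", path)
```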

thoughts / troubles / todos:

  • ✅ Finished setting up Whisper STT
  • The Whisper version I used seems to be old. I'm getting an "ERROR: Exception in ASGI application" message that doesn't stop the app from running, but it is definitely not happy. Other than that, it works fine.
  • Example output:
    • Result: STT result: Buenos días, buenos tardes, buenos noches, hola
    • Time taken: STT finished in 0.0136 minutes
  • *️⃣ Next step is to add keyword arguments like language='es' so that whatever I speak is transcribed as Spanish.
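
One way the language step could look, as a sketch only. It assumes Whisper is loaded through the Hugging Face transformers automatic-speech-recognition pipeline on a recent transformers release; the README doesn't say which Whisper build the app uses, so the model name openai/whisper-small and all function names here are assumptions. The timing line mirrors the "finished in ... minutes" format from the example output.

```python
import time


def whisper_generate_kwargs(language: str = "es", task: str = "transcribe") -> dict:
    """Decoding options that pin Whisper to one language instead of auto-detecting."""
    return {"language": language, "task": task}


def format_minutes(seconds: float) -> str:
    """Format an elapsed time in seconds the way the example output reports it."""
    return f"STT finished in {seconds / 60:.4f} minutes"


def transcribe(audio_path: str,
               model_name: str = "openai/whisper-small",  # assumed checkpoint
               language: str = "es") -> str:
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_name)
    start = time.perf_counter()
    result = asr(audio_path, generate_kwargs=whisper_generate_kwargs(language))
    print("STT result:", result["text"])
    print(format_minutes(time.perf_counter() - start))
    return result["text"]
```

Calling transcribe("my_recording.wav") (a hypothetical file name) downloads the model on first use. With task set to "transcribe" the text stays in the spoken language; "translate" would produce English instead.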

snapshot of resource usage:
