martins6 / speaker_recognition

Project to recognize a user's voice through Deep Learning with Python, using the LibriSpeech data.

License: MIT License

Jupyter Notebook 100.00%
deep-learning neural-network voice-recognition signal-processing speaker-recognition

speaker_recognition's Introduction

Voice Recognition

Project to recognize voices with Neural Networks in Python. The main purpose of this project is to explore the capabilities of Neural Networks and the extraction of features from signals in order to tell different voices apart. The project was born out of the idea of giving a virtual assistant the ability to recognize solely your own voice. This idea came to me while contributing to the Jarvis open-source project, which is a virtual assistant for the desktop.

Data preparation

The dataset that this project uses comes solely from LibriSpeech, whose files are very well organized. I've built two Jupyter Notebooks: the first processes the dataset into a set of files grouped by speaker id (LibriSpeech_Files_Pre_Processing.ipynb), and the second extracts features from the voices (Signal_Feature_Extraction.ipynb). You can define hyperparameters in order to prepare more data automatically. In the most recent run, I used 30 different speakers with approximately 30 seconds of audio recordings from each.
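
As a rough illustration of the first step (grouping files by speaker id), here is a minimal sketch. It is not the code from the notebooks. It assumes the usual LibriSpeech file naming of speaker-chapter-utterance (for example 19-198-0001.flac), and the function names and the "most utterances first" selection rule are hypothetical choices made for this example.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_speaker(paths):
    """Map speaker id -> sorted list of utterance paths.

    Assumes file names look like "<speaker>-<chapter>-<utterance>.flac".
    """
    groups = defaultdict(list)
    for path in paths:
        stem = PurePosixPath(path).stem      # e.g. "19-198-0001"
        speaker_id = stem.split("-")[0]      # e.g. "19"
        groups[speaker_id].append(path)
    return {speaker: sorted(files) for speaker, files in groups.items()}


def select_speakers(groups, n_speakers):
    """Keep the n_speakers with the most utterances (hypothetical rule).

    Ties are broken by speaker id so the result is deterministic.
    """
    ranked = sorted(groups, key=lambda s: (-len(groups[s]), s))
    return {speaker: groups[speaker] for speaker in ranked[:n_speakers]}
```

Here n_speakers plays the role of one of the hyperparameters mentioned above; in the most recent run it would be 30.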

Neural Network Model

I've chosen to model the features extracted from the signal with a Neural Network built from dense layers (a.k.a. Deep Learning). So far I have been able to achieve 84% accuracy. I'm trying to avoid feeding in a whole lot of data, and I'm exploring how to achieve more with less.
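
To make the idea of a dense-layer classifier concrete, here is a minimal forward-pass sketch in plain Python. It is not the model from this repository: the layer sizes, the ReLU activation, the random initialization, and the 13-value feature vector are all assumptions made for illustration, and the weights are untrained. Only the 30 output classes come from the 30 speakers mentioned above. A real model would be built and trained with a deep learning library.

```python
import math
import random


def dense(x, weights, biases):
    """One dense layer: each output is a weighted sum of inputs plus a bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)                       # subtract max for stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (hypothetical initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict_proba(features, layers):
    """Run features through hidden ReLU layers, then a softmax output layer."""
    hidden = features
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(hidden, weights, biases))


rng = random.Random(0)
# 13 input features (assumed), one hidden layer of 16 units, 30 speakers.
layers = [init_layer(13, 16, rng), init_layer(16, 30, rng)]
probabilities = predict_proba([0.5] * 13, layers)
predicted_speaker = max(range(len(probabilities)), key=probabilities.__getitem__)
```

The predicted speaker is simply the index with the highest probability; with untrained weights the result is meaningless, but the shape of the computation is the same after training.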

Future Works and Contribution

I hope to further explore the Fourier Transform and Wavelets for feature extraction from signals, and to try different types of Neural Networks to achieve better results.
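
As a small illustration of what Fourier-based features look like, the sketch below computes the magnitude spectrum of one audio frame with a naive discrete Fourier transform. It is only a teaching example under stated assumptions: the function names are hypothetical, the O(n²) loop is far too slow for real audio, and a real pipeline would use an FFT from a numerical library instead.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..n/2 for a real-valued frame (naive O(n^2))."""
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = 0j
        for t, sample in enumerate(frame):
            total += sample * cmath.exp(-2j * math.pi * k * t / n)
        magnitudes.append(abs(total))
    return magnitudes


def dominant_frequency(frame, sample_rate):
    """Frequency in Hz of the strongest non-DC bin."""
    magnitudes = dft_magnitudes(frame)
    strongest = max(range(1, len(magnitudes)), key=magnitudes.__getitem__)
    return strongest * sample_rate / len(frame)


# A 1000 Hz sine sampled at 8000 Hz; 64 samples hold exactly 8 cycles.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(64)]
peak_hz = dominant_frequency(frame, sample_rate)
```

The list returned by dft_magnitudes is the kind of vector that could be fed to a classifier as features, one vector per frame.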

Acknowledgment

Jurgen Arias has documented a similar project very well. It really helped me kickstart this project.
