Scribe

Simple speech recognition for Python. Run the script, say some things into your microphone, and then see what you said (or an approximation).

Powered by pyaudio and Sphinx.

Installation

Sphinxbase

Download sphinxbase and extract the files.

Now, run:

cd sphinxbase
./configure && make clean all && make install
cd python
python setup.py install

You may need to use sudo for make install or python setup.py install.

Pocketsphinx

Download pocketsphinx and extract the files.

Now, run:

cd pocketsphinx
./configure && make clean all && make install
cd python
python setup.py install

Packages (Linux only)

Now, run:

cd speech-recognizer
sudo xargs -a apt-packages.txt apt-get install

Pyaudio

Now, download the version of pyaudio that matches your operating system and Python version, and install it.
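
Once the three installs above are done, a quick import check can confirm that Python sees them. This is only a sketch: the helper missing_modules is a hypothetical name, and the module names sphinxbase, pocketsphinx, and pyaudio are assumed from the packages installed above.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module names are assumptions based on the packages installed above.
    for name in missing_modules(["sphinxbase", "pocketsphinx", "pyaudio"]):
        print("Not importable: %s" % name)
```

If nothing is printed, all three modules were importable.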

Language files

If you want to speak English, you need to get the English language model and the English acoustic model.

You will need to put the acoustic model into scribe/hmm, and the language model into scribe/lm.

The file tree should look like this for English:

scribe
├── dict
│   └── cmu07a.dic
├── hmm
│   ├── feat.params
│   ├── feature_transform
│   ├── mdef
│   ├── means
│   ├── mixture_weights
│   ├── noisedict
│   ├── README
│   ├── transition_matrices
│   └── variances
└── lm
    └── cmusphinx-5.0-en-us.lm.dmp

For other languages, check here, or see below on training your own model. If you use different language models, acoustic models, or dictionaries, you will want to change these paths in recognizer.py:

HMDIR = os.path.join(BASE_PATH, "hmm")
LMDIR = os.path.join(BASE_PATH, "lm/cmusphinx-5.0-en-us.lm.dmp")
DICTD = os.path.join(BASE_PATH, "dict/cmu07a.dic")
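
Before running the recognizer, it can help to confirm that the model files are where these paths expect them. The following is a minimal sketch and not part of recognizer.py: the helper missing_model_paths is a hypothetical name, and its base_path argument stands in for BASE_PATH, whose definition is not shown here.

```python
import os


def missing_model_paths(base_path):
    """Return the expected model paths that do not exist under base_path.

    Hypothetical helper; the relative paths mirror the English layout above.
    """
    expected = [
        os.path.join(base_path, "hmm"),
        os.path.join(base_path, "lm", "cmusphinx-5.0-en-us.lm.dmp"),
        os.path.join(base_path, "dict", "cmu07a.dic"),
    ]
    return [path for path in expected if not os.path.exists(path)]


if __name__ == "__main__":
    # Assumes the script is run from the directory containing hmm/, lm/ and dict/.
    for path in missing_model_paths(os.getcwd()):
        print("Missing: %s" % path)
```

If you use other models or dictionaries, adjust the entries in expected to match the paths you set in recognizer.py.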

Run

To run, you just have to:

cd speech-recognizer
python recognizer.py

You should be able to talk for a few seconds, after which it will spend some time processing, and then show you what you said.

Configure

There are some options that you can modify at the top of recognizer.py. The easiest one to modify is RECORD_SECONDS.
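
As a rough illustration of what RECORD_SECONDS controls: in a typical pyaudio recording loop, the recording length determines how many fixed-size buffers are read from the microphone. The sketch below shows only that arithmetic. RATE, CHUNK, and all three values are assumptions for illustration and may not match what recognizer.py uses.

```python
RECORD_SECONDS = 5   # recording length in seconds (value assumed for illustration)
RATE = 16000         # assumed sample rate in Hz; check recognizer.py for the real value
CHUNK = 1024         # assumed number of frames read per buffer


def chunks_to_read(record_seconds, rate=RATE, chunk=CHUNK):
    """Number of whole buffers needed to cover record_seconds of audio."""
    return int(rate / float(chunk) * record_seconds)


if __name__ == "__main__":
    print(chunks_to_read(RECORD_SECONDS))
```

Under these assumptions, doubling RECORD_SECONDS doubles the number of buffers read, so both the recording and the processing that follows take longer.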

More reading

To find out more, read up on sphinx.

You can train the language models to make them more accurate, use unsupported languages, or be more domain-specific.

Contributors

vikparuchuri
