This project forked from mlspeech/deepphonetictoolstutorial

Tutorial on {Deep} Phonetic Tools given in BigPhon @ LabPhon15

Home Page: http://mlspeech.github.io

{Deep} Phonetic Tools Tutorial

The repository contains the scripts, data and links to the repositories used in the tutorial presented at BigPhon, a LabPhon15 Satellite Workshop.

Installation

The code is compatible with Mac OS X and Linux and was tested on OS X El Capitan and Ubuntu 14.04. To install the tools, run the following in a terminal:

git clone --recursive https://github.com/MLSpeech/DeepPhoneticToolsTutorial.git

Then install scikits.talkbox:

sudo pip install scikits.talkbox

Finally, build AutoVOT:

cd DeepPhoneticToolsTutorial/AutoVOT/autovot/code/
make

Dependencies

The code uses the following dependencies:

  • Torch7 with RNN package
git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; bash install-deps;
./install.sh 

# On Linux with bash
source ~/.bashrc
# On Linux with zsh
source ~/.zshrc
# On OSX or in Linux with none of the above.
source ~/.profile

# For rnn package installation
luarocks install rnn

Model Installation

Download the model for DeepWDM (the "RNN model" link), then copy it to DeepWDM/back_end/results/ as follows:

cp ~/Downloads/1_layer_model.net ~/DeepPhoneticToolsTutorial/DeepWDM/back_end/results
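To confirm that the model landed where DeepWDM expects it, a small check along the following lines may help. This is only a sketch: the relative path is taken from the cp command above, while the helper name `model_is_installed` and the assumption that the repository sits in the home directory are illustrative and not part of the tutorial.

```python
from pathlib import Path

# Relative location of the model, taken from the cp command above.
MODEL_RELATIVE_PATH = Path("DeepWDM") / "back_end" / "results" / "1_layer_model.net"


def model_is_installed(repo_root):
    """Return True if the DeepWDM model file exists under repo_root.

    Hypothetical helper for illustration; it is not shipped with the tutorial.
    """
    return (Path(repo_root).expanduser() / MODEL_RELATIVE_PATH).is_file()


if __name__ == "__main__":
    # Assumption: the repository was cloned into the home directory.
    root = Path.home() / "DeepPhoneticToolsTutorial"
    print("model found" if model_is_installed(root) else "model missing")
```

If the check reports a missing model, repeat the copy step before running DeepWDM.py.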

Usage and Examples

Example 1: Processing a single file

In the first part of the usage example, we process a waveform in which the word goose /g uw s/ is pronounced in isolation by a male speaker.

In order to find the duration of the whole word, type:

python DeepWDM.py sampleFiles/waveforms/goose_male.wav sampleFiles/word_durations/goose_male.TextGrid sampleFiles/goose_male.csv

The resulting TextGrid will contain a tier called WORD. To extract the vowel duration from the waveform, type:

python AutoVowelDuration.py sampleFiles/waveforms/goose_male.wav sampleFiles/vowel_durations/goose_male.TextGrid sampleFiles/goose_male.csv

The resulting TextGrid will contain a tier called VOWEL. In order to estimate the formants of the vowel defined by the previous step, type:

python DeepFormants.py sampleFiles/waveforms/goose_male.wav sampleFiles/vowel_durations/goose_male.TextGrid sampleFiles/goose_male.csv --tier_name VOWEL

To estimate the voice onset time (VOT) of the stop consonant at the beginning of the word, we first need to define a search window. The following commands locate the word with DeepWDM and then define a search window from 180 msec before the beginning of the word to 100 msec after it:

python DeepWDM.py sampleFiles/waveforms/goose_male.wav sampleFiles/word_durations/goose_male.TextGrid sampleFiles/goose_male_duration.csv
python GenerateSearchWindows.py sampleFiles/word_durations/goose_male.TextGrid --before 0.18 --after 0.1
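The arithmetic behind the search window can be sketched as follows. This illustrates the idea only and is not the code in GenerateSearchWindows.py: the function name `search_window` is hypothetical, and clamping the start at 0 seconds is an assumption about how a window near the start of the file might be handled.

```python
def search_window(word_onset, before=0.18, after=0.1):
    """Return (start, end), in seconds, of a window around word_onset.

    The defaults mirror the --before 0.18 and --after 0.1 values used above.
    """
    if before < 0 or after < 0:
        raise ValueError("before and after must be non-negative")
    # Assumption: the window may not begin before the start of the recording.
    start = max(0.0, word_onset - before)
    end = word_onset + after
    return start, end


if __name__ == "__main__":
    # A word that begins 0.5 s into the recording.
    start, end = search_window(0.5)
    print("window: %.2f s to %.2f s" % (start, end))
```

For a word beginning at 0.5 s, the window runs from roughly 0.32 s to 0.60 s, which is the region in which the VOT is then searched for.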

The resulting TextGrid will include a new tier called WINDOW. The actual extraction of the VOT can be done as follows:

python AutoVOT.py sampleFiles/waveforms/goose_male.wav sampleFiles/word_durations/goose_male.TextGrid sampleFiles/vot.csv
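The three VOT steps have to run in this order, because each one reads the TextGrid written by the step before it. A minimal sketch of that ordering, assuming only the script names and arguments shown above (the helper `vot_commands` is hypothetical, and the sketch itself targets Python 3):

```python
import shlex


def vot_commands(wav, textgrid, durations_csv, vot_csv, before=0.18, after=0.1):
    """Build the three commands of the VOT pipeline, in execution order."""
    return [
        # 1. Locate the word; writes the WORD tier to the TextGrid.
        ["python", "DeepWDM.py", wav, textgrid, durations_csv],
        # 2. Add the WINDOW tier around the word onset.
        ["python", "GenerateSearchWindows.py", textgrid,
         "--before", str(before), "--after", str(after)],
        # 3. Estimate the VOT inside the search window.
        ["python", "AutoVOT.py", wav, textgrid, vot_csv],
    ]


if __name__ == "__main__":
    commands = vot_commands(
        "sampleFiles/waveforms/goose_male.wav",
        "sampleFiles/word_durations/goose_male.TextGrid",
        "sampleFiles/goose_male_duration.csv",
        "sampleFiles/vot.csv",
    )
    for cmd in commands:
        # Print each command; it could instead be passed to subprocess.run(cmd, check=True).
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the commands as argument lists rather than as one shell string keeps paths with spaces intact if they are later handed to subprocess.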

Example 2: Processing a directory of files

We now show how to process a whole directory of files. To extract vowel durations and formants, type:

python AutoVowelDuration.py sampleFiles/waveforms sampleFiles/vowel_durations sampleFiles/vowel_durations.csv

python DeepFormants.py sampleFiles/waveforms sampleFiles/vowel_durations sampleFiles/formants.csv

In order to extract VOT from the initial stop consonant of each word, type:

python DeepWDM.py sampleFiles/waveforms sampleFiles/word_durations sampleFiles/word_durations.csv

python GenerateSearchWindows.py sampleFiles/word_durations

python AutoVOT.py sampleFiles/waveforms sampleFiles/word_durations sampleFiles/vot.csv
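After a directory run it can be useful to check that every waveform received a TextGrid. The sketch below assumes that outputs share the base name of the input file, as in the single-file example (goose_male.wav and goose_male.TextGrid); the helper `missing_textgrids` is hypothetical and not part of the tutorial.

```python
from pathlib import Path


def missing_textgrids(wav_dir, textgrid_dir):
    """Return the sorted base names of .wav files that have no matching .TextGrid."""
    wav_stems = {p.stem for p in Path(wav_dir).glob("*.wav")}
    textgrid_stems = {p.stem for p in Path(textgrid_dir).glob("*.TextGrid")}
    return sorted(wav_stems - textgrid_stems)


if __name__ == "__main__":
    # Directory names taken from the commands above.
    for name in missing_textgrids("sampleFiles/waveforms", "sampleFiles/word_durations"):
        print("no TextGrid for", name)
```

An empty result means that each waveform in the input directory has a corresponding TextGrid in the output directory.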

For more details, please refer to: https://mlspeech.github.io

Contributors

adiyoss, ecibelli, jkeshet
