This project forked from timmahrt/praatio

A Python library for working with Praat, textgrids, time-aligned audio transcripts, and audio files. It is primarily used for extracting features from and manipulating audio files given hierarchical time-aligned transcriptions (utterance > word > syllable > phone, etc.).

License: Other

praatIO

Questions? Comments? Feedback? Chat with us on gitter!

Join the chat at https://gitter.im/praatio/Lobby

A library for working with Praat, time-aligned audio transcripts, and audio files that comes with batteries included. Praat uses a file format called textgrids, which are time-aligned speech transcripts. This library isn't just a data structure for reading and writing textgrids; many utilities are provided to make it easy to work with transcripts and their associated audio files. This library also provides some other tools for use with Praat.

Praat is an open source software program for doing phonetic analysis and annotation of speech. Praat can be downloaded here

What can you do with this library?

  • query a textgrid to get information about the tiers or intervals contained within:

    tg = tgio.openTextGrid("path_to_textgrid")
    entryList = tg.tierDict["speaker_1_tier"].entryList  # Get all intervals
    entryList = tg.tierDict["phone_tier"].find("a")  # Get all instances of 'a'
    
  • create or augment textgrids using data from other sources

  • found that you clipped your audio file five seconds early and have added it back to your wavefile but now your textgrid is misaligned? Add five seconds to every interval in the textgrid:

    tg = tgio.openTextGrid("path_to_textgrid")
    moddedTG = tg.editTimestamps(5, 5, 5)
    moddedTG.save('output_path_to_textgrid')
    
  • manipulate an audio file based on information in a textgrid:

    see splitAudioOnTier() in /praatio/praatio_scripts.py
    
  • remove all intervals (and associated intervals in other tiers) that don't match a query:

    # This would remove all words that are not content words from the word_tier
    # and also remove their associated phone listings in the phone_tier
    tg = tgio.openTextGrid("path_to_textgrid")

    print(tg.tierNameList)
    >> ["word_tier", "phone_tier"]

    subTG = tg.getSubtextgrid("word_tier", isContentWord, True)
    subTG.save('output_path_to_textgrid')
    
  • utilize the klattgrid interface to raise all speech formants by 20% (among other possible manipulations):

    # assumes `from os.path import join` and an existing outputPath directory
    kg = kgio.openKlattGrid("path_to_klattgrid")
    incrTwenty = lambda x: x * 1.2
    kg.tierDict["oral_formants"].modifySubtiers("formants", incrTwenty)
    kg.save(join(outputPath, "bobby_twenty_percent_less.KlattGrid"))
    
  • replace labeled segments in a recording with silence or delete them

    see /examples/deleteVowels.py

  • use set operations (union, intersection, difference) on textgrid tiers

    see /examples/textgrid_set_operations.py
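
Most of the operations listed above come down to simple manipulations of lists of labeled time intervals. The block below is a minimal sketch of three of them (looking up a label, shifting every boundary by an offset, and filtering by a predicate) on plain tuples. It does not use or reproduce PraatIO's code: `Interval`, `find_label`, `shift`, and `keep_if` are hypothetical names chosen for illustration, and the sample data is made up.

```python
from typing import Callable, List, NamedTuple


class Interval(NamedTuple):
    """One labeled stretch of time in seconds (hypothetical stand-in for a tier entry)."""
    start: float
    end: float
    label: str


def find_label(entries: List[Interval], label: str) -> List[int]:
    """Return the positions of all intervals whose label equals `label`."""
    return [i for i, entry in enumerate(entries) if entry.label == label]


def shift(entries: List[Interval], offset: float) -> List[Interval]:
    """Return a new list with every start and end time moved by `offset` seconds."""
    shifted = [Interval(e.start + offset, e.end + offset, e.label) for e in entries]
    if any(e.start < 0 for e in shifted):
        raise ValueError("offset would move an interval before time 0")
    return shifted


def keep_if(entries: List[Interval], predicate: Callable[[str], bool]) -> List[Interval]:
    """Return only the intervals whose label satisfies `predicate`."""
    return [e for e in entries if predicate(e.label)]


# Made-up phone tier: the word "cat" followed by an unlabeled pause
phones = [
    Interval(0.0, 0.25, "k"),
    Interval(0.25, 0.5, "a"),
    Interval(0.5, 0.75, "t"),
    Interval(0.75, 1.0, ""),
]

print(find_label(phones, "a"))                      # positions of every 'a'
print(shift(phones, 5)[0])                          # first interval, five seconds later
print(keep_if(phones, lambda label: label != ""))   # drop the unlabeled pause
```

Returning new lists rather than editing in place mirrors the examples above, where the modified textgrid is bound to a new name before it is saved.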

There are tutorials available for learning how to use PraatIO. These are in the form of IPython Notebooks which can be found in the /tutorials/ folder distributed with PraatIO.

You can view them online using the external website Jupyter:

Tutorial 1: An introduction and tutorial

Ver 3.6 (May 05, 2017)

  • Major clean up of tgio

    • Ver 3.6 is not backwards compatible with previous versions of PraatIO. Lots of changes to tgio.
  • Tutorials folder added

Ver 3.5 (April 04, 2017)

  • Added code for reading, writing, and manipulating audio files (praatio.audioio)
  • eraseRegion() and insertRegion() added to textgrids and textgrid tiers

Ver 3.4 (February 04, 2017)

  • Added place for very specific scripts (praatio.applied_scripts)

    • added code for working with the input and output textgrids of SPPAS, a forced aligner
  • Lots of minor features and bugfixes

Ver 3.3 (June 27, 2016)

  • Find zero-crossings in a wave file

    • for shifting all boundaries in a textgrid see praatio_scripts.tgBoundariesToZeroCrossings()
    • for finding individual zero crossings, see praatio_scripts.findNearestZeroCrossing()
  • Pitch features

    • pitch extraction is now ten times faster
    • automatic pitch halving/doubling detection
    • median filtering
  • Textgrid features

    • set operations over two tiers (union, difference, or intersection)
    • erase a section of a textgrid (and a section of the corresponding wave file)
  • Extraction of pitch and formants using praat

  • Lots of small bugfixes
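
To illustrate the zero-crossing feature mentioned above, here is a minimal sketch of finding the zero crossing nearest to a given sample position. Moving a cut point to a zero crossing is a common way to avoid an audible click at the edit. This is not the implementation behind praatio_scripts.findNearestZeroCrossing(); the function name, its signature, and the sample values below are assumptions made for illustration.

```python
from typing import Optional, Sequence


def find_nearest_zero_crossing(samples: Sequence[float], index: int) -> Optional[int]:
    """Return the sample position closest to `index` where the signal crosses zero.

    Position i counts as a crossing if samples[i] is exactly 0, or if
    samples[i] and samples[i + 1] have opposite signs. Returns None when
    the signal never crosses zero.
    """
    crossings = []
    for i, value in enumerate(samples):
        if value == 0:
            crossings.append(i)
        elif i + 1 < len(samples) and value * samples[i + 1] < 0:
            crossings.append(i)
    if not crossings:
        return None
    # On a tie, min() keeps the earlier crossing.
    return min(crossings, key=lambda i: abs(i - index))


# Made-up samples: the sign changes after positions 1 and 3
wave = [3, 2, -1, -2, 1, 4]
print(find_nearest_zero_crossing(wave, 0))
print(find_nearest_zero_crossing(wave, 5))
```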

Ver 3.2 (January 29, 2016)

  • Float precision is now preserved in file I/O
  • Integration tests added; using Travis CI and Coveralls for build automation.
  • Lots of small bugfixes
  • Moved point processes into 1D and 2D point objects

Ver 3.1 (December 16, 2015)

  • Support for reading/writing point processes

Ver 3.0 (November 10, 2015)

  • Support for reading and writing klattgrids

Ver 2.1 (July 27, 2015)

  • Addition of praatio_scripts.py where commonly used scripts will be placed
  • Import clash led to praatio.py being renamed to tgio.py

Ver 2.0 (February 5, 2015)

  • Support for reading, writing, and manipulating point tiers
  • Ported to python 3
  • Major cleanup/reorganizing of code

Ver 1.0 (August 31, 2014)

  • Reading and writing of textgrids
  • Support for reading, writing, and manipulating interval tiers

PraatIO requires Python 2.6.* or above, or Python 3.3.* or above (probably any version of Python 3).

Click here to see the specific versions of python that praatIO is tested under

99% of the time you're going to want to run:

from praatio import tgio
tg = tgio.openTextGrid(r"C:\Users\tim\Documents\transcript.TextGrid")

Or if you want to work with KlattGrid files:

from praatio import kgio
kg = kgio.openKlattGrid(r"C:\Users\tim\Documents\transcript.KlattGrid")

See /test for example usages
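
If a script has to handle both file types, the two calls above can be wrapped in one helper that picks the reader from the file extension. The following is a sketch under stated assumptions: `open_praat_file` is a hypothetical helper, not part of PraatIO, and it assumes files follow the `.TextGrid` / `.KlattGrid` naming used in the paths above and that the `tgio` and `kgio` modules are available as shown in this README.

```python
import os


def open_praat_file(path):
    """Open a TextGrid or KlattGrid file, choosing the reader by file extension.

    Hypothetical convenience wrapper; the praatio imports are deferred so the
    extension check runs before the library is needed.
    """
    extension = os.path.splitext(path)[1].lower()
    if extension == ".textgrid":
        from praatio import tgio
        return tgio.openTextGrid(path)
    if extension == ".klattgrid":
        from praatio import kgio
        return kgio.openKlattGrid(path)
    raise ValueError("Unrecognized extension %r in %r" % (extension, path))
```

A call such as `open_praat_file(r"C:\Users\tim\Documents\transcript.TextGrid")` would then behave like the first example above, while a path with any other extension raises a `ValueError` instead of failing inside the parser.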

If you are on Windows, you can use the Windows installer (check that it is up to date, though).

PraatIO is on pypi and can be installed or upgraded from the command-line shell with pip like so:

pip install praatio --upgrade

Otherwise, to manually install, after downloading the source from github, from a command-line shell, navigate to the directory containing setup.py and type:

python setup.py install

If python is not in your path, you'll need to enter the full path e.g.:

C:\Python27\python.exe setup.py install
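
After installing, it can be useful to confirm which version ended up in the active environment. The snippet below is one way to check, assuming Python 3.8 or later (where `importlib.metadata` is in the standard library); `installed_version` is a hypothetical helper name, not something PraatIO provides.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if praatio is installed, otherwise None
print(installed_version("praatio"))
```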

PraatIO is general-purpose code and doesn't need to be cited, but if you would like to cite it, you can do so as follows:

Tim Mahrt. PraatIO. https://github.com/timmahrt/praatIO, 2016.

Development of PraatIO was possible thanks to NSF grant BCS 12-51343 to Jennifer Cole, José I. Hualde, and Caroline Smith and to the A*MIDEX project (n° ANR-11-IDEX-0001-02) to James Sneed German funded by the Investissements d'Avenir French Government program, managed by the French National Research Agency (ANR).

Contributors

timmahrt, sih4sing5hong5
