This project forked from mravanelli/pyspeechrev

pySpeechRev

This Python code efficiently reverberates speech, starting from a dataset of close-talking speech signals and a collection of acoustic impulse responses.

The reverberated signal y[n] is computed as follows:

y[n] = x[n] * h[n]

where x[n] is the clean signal, h[n] is the acoustic impulse response, and * is the convolution operator.
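
As a minimal sketch of the formula above, the following pure-Python function evaluates the convolution sum y[n] = sum_k x[k] h[n-k] directly. The function name and the toy signal values are assumptions made for illustration and are not taken from pySpeechRev.py, which may compute the convolution differently (for real audio, numpy.convolve or scipy.signal.fftconvolve give the same result far faster).

```python
def convolve(x, h):
    """Full linear convolution: y[n] = sum_k x[k] * h[n - k]."""
    if not x or not h:
        return []
    # The output has len(x) + len(h) - 1 samples (the reverberation "tail"
    # extends the signal beyond the length of the clean input).
    y = [0.0] * (len(x) + len(h) - 1)
    for i, x_i in enumerate(x):
        for j, h_j in enumerate(h):
            y[i + j] += x_i * h_j
    return y


# Toy data (hypothetical values, not taken from any dataset).
clean = [1.0, 0.5, 0.0, -0.5]
impulse_response = [1.0, 0.0, 0.25]  # direct path plus one delayed echo

reverberated = convolve(clean, impulse_response)
print(reverberated)
```

Note that the output is longer than the clean input by the length of the impulse response minus one sample.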

The script takes the following arguments:

  • in_folder: folder where the original close-talk dataset is stored.
  • out_folder: folder where the reverberated dataset will be stored.
  • list.txt: a text file in which each row contains: original_wav_file IR_file.
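
To make the list format concrete, here is a hedged sketch of how such a file could be read. It assumes the two fields are separated by whitespace, that paths contain no spaces, and that each wav path lies under in_folder so its sub-path can be mirrored into out_folder. The helper names (parse_list, output_path) and the example file names are hypothetical and are not taken from pySpeechRev.py.

```python
import os


def parse_list(text):
    """Return (wav_file, ir_file) pairs from the contents of the list file."""
    pairs = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank rows
        if len(fields) != 2:
            raise ValueError(
                "row %d: expected 2 fields, got %d" % (line_number, len(fields))
            )
        pairs.append((fields[0], fields[1]))
    return pairs


def output_path(wav_file, in_folder, out_folder):
    """Mirror wav_file's location under in_folder into out_folder (assumed layout)."""
    relative = os.path.relpath(wav_file, in_folder)
    return os.path.join(out_folder, relative)


# Hypothetical list contents, for illustration only.
example = """clean_examples/spk1/utt1.wav IRs/room1.wav
clean_examples/spk2/utt2.wav IRs/room2.wav
"""

for wav_file, ir_file in parse_list(example):
    print(wav_file, ir_file, output_path(wav_file, "clean_examples", "rev_examples"))
```

Raising an error on malformed rows, rather than skipping them silently, makes a mistyped path pair visible before any audio is processed.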

Before running it, make sure that all the required Python packages are installed. In particular:

  • pysoundfile: pip install pysoundfile
  • numpy
  • scipy
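
One way to check for the packages listed above before launching the script is sketched below. It assumes Python 3.4 or later (importlib.util.find_spec is not available on Python 2.7) and relies on the fact that the pysoundfile package is imported under the module name soundfile. The function name missing_modules is hypothetical.

```python
import importlib.util

# Import names; the pysoundfile package is imported as "soundfile".
REQUIRED_MODULES = ["soundfile", "numpy", "scipy"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```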

Example:

python pySpeechRev.py clean_examples/ rev_examples/ list.txt

Reverberated TIMIT

To create a reverberated version of TIMIT, follow these steps:

  • Make sure you have the TIMIT dataset. If not, it can be downloaded from the LDC website (https://catalog.ldc.upenn.edu/LDC93S1).
  • Edit lst_TIMIT.txt so that it matches the paths of your TIMIT dataset.
  • Run:
python pySpeechRev.py $path_TIMIT $path_TIMIT_rev lst_TIMIT.txt

The current version of TIMIT has been contaminated with some high-quality impulse responses from the DIRHA-English dataset [3].

Tested on: Python 2.7, Ubuntu

This code has been used in the following papers (please cite them if you use this code):

[1] M. Ravanelli, P. Svaizer, M. Omologo, "Realistic Multi-Microphone Data Simulation for Distant Speech Recognition", in Proceedings of Interspeech 2016. https://arxiv.org/abs/1711.09470

[2] M. Ravanelli, M. Omologo, "Contaminated speech training methods for robust DNN-HMM distant speech recognition", in Proceedings of Interspeech 2015. https://arxiv.org/abs/1710.03538

[3] M. Ravanelli, M. Omologo, "The DIRHA-English corpus and related tasks for distant-speech recognition in domestic environments", in Proceedings of ASRU 2015. https://arxiv.org/abs/1710.02560
