GithubHelp home page GithubHelp logo

zelaki / sail_align Goto Github PK

View Code? Open in Web Editor NEW

This project forked from nassosoassos/sail_align

0.0 0.0 0.0 139.92 MB

SailAlign is an open-source software toolkit for robust long speech-text alignment implementing an adaptive, iterative speech recognition and text alignment scheme that allows for the processing of very long (and possibly noisy) audio and is robust to transcription errors. It is mainly written as a perl library but its functionality also depends on freely available software, namely HTK, srilm and sclite.

Perl 77.37% Shell 21.37% Python 1.26%

sail_align's Introduction

SailAlign

Long speech-text alignment can facilitate large-scale study of rich spoken 
language resources that have recently become widely accessible, e.g., collections 
of audio books, or multimedia documents. SailAlign is an open-source software toolkit 
for robust long speech-text alignment implementing an adaptive, iterative speech recognition 
and text alignment scheme that allows for the processing of very long (and possibly noisy) 
audio and is robust to transcription errors.

The software package was originally presented at VLSP-2011:
A. Katsamanis, M. Black, P. Georgiou, L. Goldstein and S. Narayanan, 
SailAlign: Robust long speech-text alignment,
in Proc. of Workshop on New Tools and Methods for Very-Large Scale Phonetics Research, Jan. 2011.

INSTALLATION

To install this module, run the following command while inside the SailAlign distribution folder. 
The module can currently only work on linux - like operating systems. It has been tested in the following architectures:
linux_i686, linux_x86_64 
Compiled HTK binaries, including HDecode have to be available in your system, e.g., in the folder $HTK_BIN.
You may need to be connected to the internet to install dependencies :

       1) cpan Module::Build
       	  This will install the Module::Build perl package from CPAN. You may skip this step if you have 
	  the package already installed. It is used for the installation of the SailAlign package. If you haven't run the
	  command cpan before you may have to go through a setup process. You can just
	  select the default options when user input is required.  
       2) perl Build.PL
       	  This command configures the building process. Among others it will try to detect dependencies
	  that are not installed in your system. Ignore any messages for missing dependencies in the beginning. 
	  Provide the folder $HTK_BIN when asked. 
          Internet connection is required at this point to download the acoustic models. 
       3) sudo ./Build installdeps
       	  Run this command if you got a message for missing dependencies in the previous step.
       3) ./Build
       	  This command will build the module.
       4) sudo ./Build install
          This command will install the module. The following perl scripts will be added to your path:
	  sail_align, sail_align_parallel


TESTING

To test that the module has been properly installed you may run the following from inside the distribution folder:
>> sail_align -i support/data/timit_5.wav -t support/data/timit_5.txt -w support/test/local \
   	      -e timit_sample_test -c config/timit_alignment.cfg

This will run the alignment algorithm for the audio file timit_5.wav and the corresponding transcription timit_5.txt. The 
results will be in the working directory 'support/test/local/timit_sample_test'. You may check the tutorial in the 'docs'
folder for additional details.
For spanish:
>> sail_align -i support/data/spanish/voxforge_spanish.wav -t support/data/spanish/voxforge_spanish.txt -w support/test/local_spanish \
   	      -e spanish_test -c config/spanish_alignment.cfg
For greek:
>> sail_align -i support/data/greek/greeksyn_05.wav -t support/data/greek/greeksyn_05.txt -w support/test/local_greek \
   	      -e greek_test -c config/greek_alignment.cfg

FIXES

In case you get a warning like the following, there is no need to worry:
"Ambiguous call resolved as CORE::read(), qualify as such or use & at (eval 18) line 4.
 Subroutine Audio::Wav::Read::read redefined at /usr/lib/perl5/site_perl/5.10.0/Audio/Wav/Read.pm line 316."
It is due to a minor bug in the Audio::Wav external dependency and it does not affect sail_align's result.

To avoid it you may apply the following patch:
sudo patch /usr/lib/perl5/site_perl/5.10.0/Audio/Wav/Read.pm < support/patches/audio_wav_read.patch 


SUPPORT AND DOCUMENTATION

You may find additional documentation inside the docs subfolder, including the paper on SailAlign presented at 
VLSP-2011 and a relevant tutorial that was given there as well.

Contact Nassos Katsamanis for technical support.

LICENSE AND COPYRIGHT

Copyright (C) 2010-2013 Athanasios Katsamanis (http://sipi.usc.edu/~nkatsam)

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 dated June, 1991 or at your option
any later version. SRILM binaries are only included in the distribution 
for convenience and you may find the corresponding license, srilm-license, in the folder
LICENSES.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

A copy of the GNU General Public License is available in the source tree;
if not, write to the Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.