Alibaba-MIT-Speech

This repository provides a patch file containing the DFSMN-related code and example scripts for the LibriSpeech task.

Apply Patch

The patch is built against the Kaldi speech recognition toolkit at commit "04b1f7d6658bc035df93d53cb424edc127fab819".

You can apply this patch to your own Kaldi branch with the following commands:

##Take a look at what changes are in the patch

git apply --stat Alibaba_MIT_Speech_DFSMN.patch

##Test the patch before you actually apply it

git apply --check Alibaba_MIT_Speech_DFSMN.patch

##If you don’t get any errors, the patch can be applied cleanly.

git am --signoff < Alibaba_MIT_Speech_DFSMN.patch
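The three steps above can be wrapped in one small helper so that the patch is only applied when the dry run succeeds. This is a minimal sketch, not part of the patch or of Kaldi: the function name `apply_patch_checked` and the paths in the usage note are hypothetical, and it assumes the patch file is in the mailbox format that `git am` expects.

```shell
#!/usr/bin/env bash
# Hypothetical helper: show, check, then apply a patch inside a git repository.
# Usage: apply_patch_checked <repo_dir> <patch_file>
apply_patch_checked() {
  local repo_dir="$1" patch_file="$2"

  if [ ! -f "$patch_file" ]; then
    echo "patch file not found: $patch_file" >&2
    return 2
  fi

  # Resolve the patch to an absolute path so it is still valid after cd.
  patch_file="$(cd "$(dirname "$patch_file")" && pwd)/$(basename "$patch_file")"

  (
    cd "$repo_dir" || exit 2
    # Summarize what the patch changes.
    git apply --stat "$patch_file" || exit 1
    # Dry run: stop here if the patch does not apply cleanly.
    git apply --check "$patch_file" || exit 1
    # Apply the patch as a commit with a Signed-off-by line.
    git am --signoff < "$patch_file"
  )
}
```

For example, `apply_patch_checked /path/to/kaldi /path/to/Alibaba_MIT_Speech_DFSMN.patch` (both paths are placeholders). If the check fails on a newer Kaldi branch, checking out the commit named above before applying is the likeliest fix.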

Run Example Scripts:

The training scripts and experimental results for the LibriSpeech task are available in kaldi/egs/librispeech/s5.

There are three DFSMN configurations of different model sizes: DFSMN_S, DFSMN_M, and DFSMN_L.


#Training FSMN models on the cleaned-up data

#Three configurations of DFSMN with different model sizes: DFSMN_S, DFSMN_M, DFSMN_L

local/nnet/run_fsmn_ivector.sh DFSMN_S

local/nnet/run_fsmn_ivector.sh DFSMN_M

local/nnet/run_fsmn_ivector.sh DFSMN_L


DFSMN_S is a small model with six DFSMN components, while DFSMN_L is a large model consisting of 10 DFSMN components.

For the 960-hour setting, training DFSMN_S takes about 2-3 days on a single M40 GPU.

Detailed experimental results are listed in the RESULTS file.
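Because each configuration is a multi-day job, it can help to run the three training commands back to back and keep a separate log for each. The following is a minimal sketch under stated assumptions: the helper name `run_dfsmn_configs` and the `dfsmn_logs` directory are hypothetical, the function is meant to be called from kaldi/egs/librispeech/s5, and it assumes `run_fsmn_ivector.sh` exits with a non-zero status on failure.

```shell
#!/usr/bin/env bash
# Hypothetical helper: train the three DFSMN configurations one after another.
# Usage: run_dfsmn_configs [training_script] [log_dir]
run_dfsmn_configs() {
  local script="${1:-local/nnet/run_fsmn_ivector.sh}"
  local log_dir="${2:-dfsmn_logs}"   # hypothetical log directory
  local config

  mkdir -p "$log_dir" || return 1

  for config in DFSMN_S DFSMN_M DFSMN_L; do
    echo "Training $config"
    # Capture stdout and stderr of each run in its own log file.
    if ! "$script" "$config" > "$log_dir/$config.log" 2>&1; then
      # Stop at the first failure so later runs do not start on a broken setup.
      echo "Training $config failed; see $log_dir/$config.log" >&2
      return 1
    fi
  done
}
```

Calling `run_dfsmn_configs` with no arguments uses the script path given above. Running the configurations sequentially suits a single-GPU machine; with several GPUs, launching each configuration separately may be preferable.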

Contributors

tramphero, leiming99
