This project forked from gary083/gan_harmonized_with_hmms

Code: Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models

Home Page: https://arxiv.org/abs/1904.04100

Python 38.11% Shell 52.82% Perl 9.07%

GAN_Harmonized_with_HMMs

This is the implementation of our paper. In this paper, we propose an unsupervised speech (phoneme) recognition system that achieves a 33.1% phoneme error rate on TIMIT. The method uses a GAN-based model for unsupervised phoneme recognition, and a set of HMMs that works in harmony with the GAN.
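For readers unfamiliar with the metric, phoneme error rate is conventionally the edit distance between the reference and predicted phoneme sequences (substitutions, deletions, and insertions), divided by the number of reference phonemes. The snippet below is a minimal sketch of that standard definition, not the evaluation code of this repository. The function names and the toy phoneme sequences are made up for illustration.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: ref phoneme missing from hyp
                curr[j - 1] + 1,     # insertion: extra phoneme in hyp
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by the reference length."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Toy sequences (illustrative only, not taken from TIMIT).
ref = ["sil", "dh", "ih", "s", "sil"]
hyp = ["sil", "d", "ih", "sil"]

# One substitution (dh -> d) plus one deletion (s) over 5 reference phonemes.
print(phoneme_error_rate(ref, hyp))  # 0.4
```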

How to use

Dependencies

  1. tensorflow 1.13

  2. kaldi

  3. srilm (can be built with kaldi/tools/install_srilm.sh)

  4. librosa

Data preprocess

  • Usage:
  1. Modify path.sh with the paths to your Kaldi and SRILM installations.
  2. Modify config.sh with the path to this code and the path to your TIMIT data.
  3. Run $ bash preprocess.sh
  • This script extracts features and splits the dataset into train and test sets.

  • The data needed by the WFST decoder is also generated here.

Train model

  • Usage:
  1. Modify the experimental settings in config.sh.
  2. Modify the GAN-based model's parameters in src/GAN-based-model/config.yaml.
  3. Run $ bash run.sh
  • This script contains the training flow for the GAN-based model and the HMM model.

  • The GAN-based model generates the transcriptions used to train the HMM model.

  • The HMM model refines the phoneme boundaries used to train the GAN-based model.
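The alternation described above can be pictured as a simple loop. The sketch below only illustrates the data flow and is not the contents of run.sh: the function `harmonize`, the callables `train_gan` and `train_hmm`, the `iterations` argument, and the toy stand-ins are all hypothetical names introduced here.

```python
from typing import Callable, List, Tuple

Boundaries = List[List[int]]   # per utterance: frame indices of phoneme boundaries
Transcripts = List[List[str]]  # per utterance: predicted phoneme labels


def harmonize(
    initial_boundaries: Boundaries,
    train_gan: Callable[[Boundaries], Transcripts],
    train_hmm: Callable[[Transcripts], Boundaries],
    iterations: int,
) -> Tuple[Transcripts, Boundaries]:
    """Alternate GAN and HMM training (hypothetical sketch of the flow)."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    boundaries = initial_boundaries
    transcripts: Transcripts = []
    for _ in range(iterations):
        # The GAN-based model is trained on the current boundaries and
        # produces transcriptions.
        transcripts = train_gan(boundaries)
        # The HMMs are trained on those transcriptions and produce
        # refined boundaries for the next round.
        boundaries = train_hmm(transcripts)
    return transcripts, boundaries


# Toy stand-ins that only record the call order; real training is far heavier.
call_log: List[str] = []


def toy_train_gan(boundaries: Boundaries) -> Transcripts:
    call_log.append("gan")
    # One placeholder label per segment between consecutive boundaries.
    return [["ph"] * (len(b) - 1) for b in boundaries]


def toy_train_hmm(transcripts: Transcripts) -> Boundaries:
    call_log.append("hmm")
    # Placeholder boundaries: one frame per label.
    return [list(range(len(t) + 1)) for t in transcripts]


final_transcripts, final_boundaries = harmonize(
    [[0, 3, 7]], toy_train_gan, toy_train_hmm, iterations=2
)
print(call_log)
```

Passing the two training steps in as callables keeps the sketch independent of how either model is actually implemented.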

Note

  • Training with boundaries generated by GAS (bnd_type=uns) is unstable, so it may take several training attempts to reach satisfactory performance.

Hyperparameters in config.sh

bnd_type : type of initial phoneme boundaries (orc/uns).

setting : matched or nonmatched case in our paper (match/nonmatch).

jobs : number of jobs in parallel (depends on your device).
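Because config.sh is edited by hand, a quick check of the three values before a long run can catch typos. The following is a minimal sketch of such a check. It is not part of this repository: `validate_settings` is a hypothetical helper, the allowed values are copied from the list above, and the requirement that jobs be a positive integer is an assumption.

```python
from typing import List

# Allowed values as listed in this README.
ALLOWED_BND_TYPES = {"orc", "uns"}
ALLOWED_SETTINGS = {"match", "nonmatch"}


def validate_settings(bnd_type: str, setting: str, jobs: int) -> List[str]:
    """Return a list of problems; an empty list means the values look valid."""
    errors = []
    if bnd_type not in ALLOWED_BND_TYPES:
        errors.append(f"bnd_type must be one of {sorted(ALLOWED_BND_TYPES)}, got {bnd_type!r}")
    if setting not in ALLOWED_SETTINGS:
        errors.append(f"setting must be one of {sorted(ALLOWED_SETTINGS)}, got {setting!r}")
    # Assumption: jobs is a positive integer (bool is rejected explicitly
    # because it is a subclass of int in Python).
    if isinstance(jobs, bool) or not isinstance(jobs, int) or jobs < 1:
        errors.append(f"jobs must be a positive integer, got {jobs!r}")
    return errors


ok = validate_settings("uns", "match", 8)
bad = validate_settings("unsup", "match", 0)
print(ok)        # []
print(len(bad))  # 2
```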

Reference

Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models, Kuan-Yu Chen, Che-Ping Tsai et al.

Links

  1. The WFST decoder for phoneme classifier [1].
  2. The training scripts for Unsupervised HMM [1].

Acknowledgement

Special thanks to Che-Ping Tsai (jackyyy0228) for the Kaldi parts! Special thanks to Sung-Feng Huang (b02901071) for the PyTorch version!
