This project forked from sidgupta234/indian_english_asr

An Indian English ASR system based on Hidden Markov Models (HMM), built with Kaldi (Povey et al., 2011).

Languages: Shell 48.21%, Python 34.39%, Perl 7.87%, Jupyter Notebook 0.01%, Roff 9.53%

Indian English ASR

About the project

With growing computational power and an ever-increasing amount of transcribed speech data available, the accuracy of Automatic Speech Recognition (ASR) systems has improved substantially over the last few years. Given a large amount of transcribed data, these systems perform especially well on speech produced by native speakers. When a language such as English is spoken by second-language (L2) speakers, their native language (L1) can heavily influence their accent, which makes it harder for an ASR system to produce correct transcriptions. Accent variation is a particular challenge when building an ASR system for a country like India, one of the most linguistically diverse countries, with a large number of multilingual non-native English speakers. For example, if person A's L1 is Malayalam and person B's L1 is Telugu, their accents when speaking English can be completely different.

In this project, we designed an Indian English ASR system based on Hidden Markov Models (HMM) using Kaldi (Povey et al., 2011). We use the available transcribed continuous English speech data from non-native Indian English speakers to build the system.

This project was made as part of the IIIT Hyderabad Advanced Summer School on Natural Language Processing (IASNLP 2022).

Installation and Testing Process

  • Install Kaldi by following its official documentation.
  • Go to the Kaldi folder on your system and clone this repository into /egs using the following command:
    git clone https://github.com/sidgupta234/Indian_English_ASR
  • Add the audio files you want to test to Indian_English_ASR/summer_asr_nptel/custom_test_dataset. Make sure the .wav files are 16 kHz, mono-channel. (To convert to the required format, you can use this script.)
  • Create the text, utt2spk, spk2utt and wav.scp files in data/custom_test_dataset. (A script to ease this process will be added soon.)
  • Run the script run_testcases.sh to get the transcriptions of the audio files.
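
Until the helper script mentioned above is available, the data-preparation step can be sketched in Python. The sketch below is not part of this repository, and the function names check_wav and write_kaldi_data_dir are hypothetical. It assumes that the utterance ID is the file name without its extension and that each file is treated as its own speaker. It skips any .wav file that is not 16 kHz mono, then writes the four files in the usual Kaldi layout (one entry per line, sorted by utterance ID). Transcripts are optional here; whether run_testcases.sh accepts entries with an empty transcript has not been verified.

```python
import wave
from pathlib import Path


def check_wav(path, rate=16000, channels=1):
    """Return True if the WAV file has the expected sample rate and channel count."""
    with wave.open(str(path), "rb") as w:
        return w.getframerate() == rate and w.getnchannels() == channels


def write_kaldi_data_dir(wav_dir, data_dir, transcripts=None):
    """Write wav.scp, utt2spk, spk2utt and text for each 16 kHz mono .wav in wav_dir.

    Assumptions (not taken from the repository): the utterance ID is the file
    stem, and every utterance is its own speaker.
    Returns the list of utterance IDs that were written.
    """
    transcripts = transcripts or {}
    wav_dir, data_dir = Path(wav_dir), Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    # Keep only files in the required format, sorted by utterance ID.
    wavs = sorted(
        (p for p in wav_dir.glob("*.wav") if check_wav(p)),
        key=lambda p: p.stem,
    )
    utts = [(p.stem, p.resolve()) for p in wavs]

    files = {
        # <utterance-id> <absolute path to audio>
        "wav.scp": [f"{u} {p}" for u, p in utts],
        # <utterance-id> <speaker-id>; speaker == utterance in this sketch
        "utt2spk": [f"{u} {u}" for u, _ in utts],
        # <speaker-id> <utterance-id> ...
        "spk2utt": [f"{u} {u}" for u, _ in utts],
        # <utterance-id> <transcript>; left empty when no transcript is given
        "text": [f"{u} {transcripts.get(u, '')}".rstrip() for u, _ in utts],
    }
    for name, rows in files.items():
        (data_dir / name).write_text("".join(row + "\n" for row in rows))
    return [u for u, _ in utts]
```

Assuming it is run from inside summer_asr_nptel, a call such as write_kaldi_data_dir("custom_test_dataset", "data/custom_test_dataset", {"sample1": "hello world"}) would populate the data directory; "sample1" is a placeholder utterance ID. Standard Kaldi recipes ship utils/validate_data_dir.sh, which can be used to check the result if the utils directory is present in your checkout.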

Project File and Slides

The project file is available here.
The slides presented for the project during the summer school are available here.

Team

Siddharth Gupta

Rohit Reddy

Deepu C.
