Traditional ASR (Signal & Cepstral Analysis, DTW, HMM) & DNNs (Custom Models + DeepSpeech) on Indian Accent Speech

Home Page: https://towardsdatascience.com/indian-accent-speech-recognition-2d433eb7edac

License: Creative Commons Zero v1.0 Universal

Jupyter Notebook 99.08% Python 0.92%

accent accented-speech asr cepstral-analysis custom-training deepspeech dnn hmm indian indian-language

indian-accent-speech-recognition's Issues

Bundle and release model checkpoints

Hi @AdroitAnandAI! This is some impressive work, and it would be great to make it more usable by releasing model checkpoints as well. The current protobuf model files are harder to use with newer DeepSpeech releases, but checkpoints can be re-formatted more easily.

CreateModel failed with error code 15

Very good article, and I hope the model is just as good. However, I am not able to run it because of an error. I downloaded the model you put in this repo. Please suggest a fix:

deepspeech --model output_graph_indian_test.pbmm --lm models/lm.binary --audio chunk02_speaker0_16.wav --alphabet models/alphabet.txt --trie models/trie

TensorFlow: v1.12.0-10-ge232881
DeepSpeech: v0.4.1-0-g0e40db6
Data loss: Corrupted memmapped model file: output_graph_indian_test.pbmm Invalid directory offset
Traceback (most recent call last):
  File "/anaconda3/envs/berttorchenv_dev/bin/deepspeech", line 8, in <module>
    sys.exit(main())
  File "/anaconda3/envs/berttorchenv_dev/lib/python3.6/site-packages/deepspeech/client.py", line 80, in main
    ds = Model(args.model, N_FEATURES, N_CONTEXT, args.alphabet, BEAM_WIDTH)
  File "/anaconda3/envs/berttorchenv_dev/lib/python3.6/site-packages/deepspeech/__init__.py", line 14, in __init__
    raise RuntimeError("CreateModel failed with error code {}".format(status))
RuntimeError: CreateModel failed with error code 15
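
One thing worth ruling out for the "Corrupted memmapped model file" message is a bad download. The sketch below is only a guess at a diagnostic, not a confirmed cause: it reports the size and SHA-256 of a local file and flags two common placeholder cases, a Git LFS pointer file or a saved HTML page. The function name `describe_download` and the demo file are hypothetical, and the digest only helps if there is a known-good copy to compare against. A model exported for a different DeepSpeech version than the installed client is another possibility that this check does not cover.

```python
import hashlib
import tempfile
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_MAGIC = b"version https://git-lfs.github.com/spec/v1"


def describe_download(path):
    """Return size, SHA-256, and placeholder hints for a downloaded file.

    Hypothetical helper for illustration; it is not part of DeepSpeech.
    """
    data_path = Path(path)
    digest = hashlib.sha256()
    with data_path.open("rb") as handle:
        head = handle.read(64)  # enough bytes to recognise a placeholder
        digest.update(head)
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return {
        "bytes": data_path.stat().st_size,
        "sha256": digest.hexdigest(),
        "lfs_pointer": head.startswith(LFS_MAGIC),
        "html_page": head.lstrip().lower().startswith((b"<!doctype", b"<html")),
    }


if __name__ == "__main__":
    # Demo on a throwaway file that imitates a Git LFS pointer.
    with tempfile.TemporaryDirectory() as tmp:
        fake = Path(tmp) / "model.pbmm"
        fake.write_bytes(LFS_MAGIC + b"\noid sha256:0000\nsize 123\n")
        print(describe_download(fake))
```

If `lfs_pointer` or `html_page` comes back true, or the byte count is far smaller than expected for a model file, re-downloading the file is a reasonable first step.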

deepspeech: error: argument --lm_alpha: invalid float value: 'speech/lm.binary'

Hi,
Thank you for the article and the model. When I run the command given in the doc, I get
deepspeech: error: ambiguous option: --lm could match --lm_alpha, --lm_beta
and on adding lm-alpha like so
deepspeech --model speech/output_graph.pbmm --lm speech/lm.binary --trie speech/trie --audio content/8455-210777-0068.wav,
I get
deepspeech: error: argument --lm_alpha: invalid float value: 'speech/lm.binary'.
My DeepSpeech version is 0.8.2.
I am fairly new at this, and any help will be appreciated.
Thank you
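
Both messages look like standard Python `argparse` behaviour rather than anything specific to the model: when a parser has no `--lm` option, `argparse` treats `--lm` as an abbreviation and finds two candidates, and once the flag is spelled out as `--lm_alpha` it expects a number, not a file path. The sketch below reproduces the two messages with a hypothetical stand-in parser (the class `DemoParser` and the option list are assumptions for illustration, not the real DeepSpeech client). As far as I know, DeepSpeech releases from 0.7 onward take a single `--scorer` file in place of `--lm` and `--trie`, so a command written for 0.4.x would need adapting; please check that against the documentation for the installed version.

```python
import argparse


class ParseError(Exception):
    """Raised instead of exiting so the message can be inspected."""


class DemoParser(argparse.ArgumentParser):
    def error(self, message):
        # argparse normally prints the message and exits; raise instead.
        raise ParseError(message)


def build_parser():
    # Hypothetical stand-in for a client that defines no --lm option.
    parser = DemoParser(prog="deepspeech")
    parser.add_argument("--model")
    parser.add_argument("--audio")
    parser.add_argument("--lm_alpha", type=float)
    parser.add_argument("--lm_beta", type=float)
    return parser


def try_parse(argv):
    """Return 'ok' or the error message argparse would have printed."""
    try:
        build_parser().parse_args(argv)
        return "ok"
    except ParseError as exc:
        return str(exc)


if __name__ == "__main__":
    # --lm is a prefix of both --lm_alpha and --lm_beta.
    print(try_parse(["--lm", "speech/lm.binary"]))
    # Spelled out, the option wants a float, so the path is rejected.
    print(try_parse(["--lm_alpha", "speech/lm.binary"]))
    # A numeric value parses cleanly.
    print(try_parse(["--lm_alpha", "0.75"]))
```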

Cannot find pre-trained model

Hi,
I'm trying to access your pre-trained model, but it says the "file is in owner's trash". Can you please re-upload it?
Thanks

pre-trained model is not Available

Hello sir, "output_graph.pbmm" is currently not available in your repository. Can you please make it available or provide the file? It is the pretrained model and is needed for this. Please accept this request.

Missing licence File

I am assuming that the code and model provided in the public repository are available for reuse in other applications and personal projects, but it would be great to have a licence such as MIT added to the repository to make the decision easier.

Does this also support voice cloning with indian accent?

Greetings,

The repo is awesome, and thanks also for the Google Colab implementation; it's easy to use. After testing the repo, I got the text output from the .wav file (as mentioned), but I am also curious whether I can clone my own voice in real time on the generated text in an Indian accent?

Any help/guidance would be really helpful,
Thanks,
satyam.

Provide Pretrained model

Hello,
Loved your article and your work. Can you provide the trained model along with access to the raw data for reference?
Really appreciate it.

adroitanandai / indian-accent-speech-recognition