Comments (8)
Are you using external LM shallow fusion for decoding? Shallow fusion tends to cause this problem. Check whether the deletion errors are reduced without shallow fusion.
Anyway, I think the length mismatch between training and decoding is the cause. Several works in the literature try to mitigate this, e.g.:
https://arxiv.org/pdf/1911.02242.pdf
https://arxiv.org/pdf/1910.11455.pdf
from espresso.
OK, I will turn it off. Thanks for the pointers. Do you have any plans to implement these? Or, if you point me to the appropriate modules and give me some high-level instructions, maybe I will try it myself. I assume the second paper (on streaming RNN-Ts) is less relevant?
To implement the first paper, I think you might need to modify espresso/data/asr_dataset.py or add a new dataset class to chop utterances into overlapping segments, and then modify espresso/speech_recognize.py to merge hyps from all the segments within a long utterance.
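To make the chop-and-merge idea concrete, here is a minimal standalone sketch. It is not Espresso code and not the algorithm from the paper: `segment_bounds` and `merge_hyps` are hypothetical helper names, the segment length and overlap are arbitrary, and the merge rule (drop the longest token run shared by the end of one hyp and the start of the next) is just one simple assumption about how overlapping hyps could be stitched together.

```python
from typing import List, Tuple


def segment_bounds(num_frames: int, seg_len: int, overlap: int) -> List[Tuple[int, int]]:
    """Return [start, end) frame ranges that cover num_frames with the given overlap."""
    if seg_len <= 0 or not 0 <= overlap < seg_len:
        raise ValueError("need seg_len > 0 and 0 <= overlap < seg_len")
    step = seg_len - overlap
    bounds = []
    start = 0
    while True:
        end = min(start + seg_len, num_frames)
        bounds.append((start, end))
        if end >= num_frames:
            break
        start += step
    return bounds


def merge_hyps(hyps: List[List[str]], max_overlap: int = 10) -> List[str]:
    """Concatenate per-segment token lists, dropping tokens repeated at the seams.

    For each new hyp, find the longest k (up to max_overlap) such that the last
    k merged tokens equal the first k tokens of the hyp, and skip those k tokens.
    """
    merged: List[str] = []
    for hyp in hyps:
        shared = 0
        limit = min(max_overlap, len(merged), len(hyp))
        for k in range(limit, 0, -1):
            if merged[-k:] == hyp[:k]:
                shared = k
                break
        merged.extend(hyp[shared:])
    return merged


if __name__ == "__main__":
    print(segment_bounds(1000, 400, 100))
    print(merge_hyps([["a", "b", "c"], ["b", "c", "d"], ["d", "e"]]))
```

In practice the hyps in overlapping regions rarely match exactly, so a real implementation would probably need an alignment-based or time-stamp-based merge rather than exact token matching.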
Thanks. How about the attention aspects? (forcing monotonic attention).
Maybe you can use https://github.com/freewym/espresso/tree/master/examples/simultaneous_translation/modules as a reference.
OK, I turned off shallow fusion, but it still stops decoding after about 8-10 seconds for the longer utterances. WER is about 60%. Note that Kaldi decoding with the TDNN hybrid model gives about 24% on this corpus. Are there any other parameters to try before I have to resort to more extreme measures?
Here is a typical attention plot for a long utterance, in case it suggests something.
OK, so I think there is no obvious way to avoid this issue without specially designed algorithms.