Comments (4)
Hypothesis is a NamedTuple object, so you can access its attributes directly.
https://github.com/espnet/espnet/blob/master/espnet/nets/beam_search.py#L19-L33
from espnet_model_zoo.
Hi, thanks for your response.
Following the link, I modified the code as follows:
nbests = speech2text(speech)
text, *_, score_bundle = nbests[0]
By executing the following:
print(score_bundle.score)
print(score_bundle.scores)
I got:
tensor(-57.1623, device='cuda:0')
{'decoder': tensor(-2.6879, device='cuda:0'), 'lm': tensor(-55.0374, device='cuda:0'), 'ctc': tensor(-0.8112, device='cuda:0')}
I think the number "-57.1623" is the result of log P_encdec(y|x) + log P_ctc(y|x) + log P_lm(y), where log P_encdec(y|x) is -2.6879, log P_ctc(y|x) is -0.8112 and log P_lm(y) is -55.0374, although the plain sum of these three is -58.5365, so it does not match exactly.
If I denote -57.1623 as nbests[0].score, can I just take nbests[0] through nbests[100] and use nbests[0].score / (nbests[0].score + nbests[1].score + ... + nbests[100].score) to roughly obtain the decoding confidence score?
Thanks a lot
score is the weighted sum of the individual scores. You choose the weights when you instantiate the Speech2Text class.
You can get an arbitrary number of n-best scores by passing the nbest argument to Speech2Text, but I think it's not trivial to treat them as confidence scores.
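As a rough check of the weighted-sum description, the sketch below recombines the three per-scorer values printed earlier in the thread. The weights (0.7 for the decoder, 0.3 for CTC, 1.0 for the LM) are an assumption, not something stated in the thread; they are shown only because they happen to reproduce the reported total of about -57.1623, whereas the unweighted sum is -58.5365. The helper name weighted_score is hypothetical and is not part of ESPnet.

```python
# Per-scorer log scores copied from the output printed in the thread.
scores = {"decoder": -2.6879, "lm": -55.0374, "ctc": -0.8112}

# ASSUMED weights (not given in the thread). They are chosen for
# illustration because they reproduce the reported total score.
weights = {"decoder": 0.7, "ctc": 0.3, "lm": 1.0}


def weighted_score(scores, weights):
    """Return the weighted sum of per-scorer log scores (hypothetical helper)."""
    return sum(weights[name] * value for name, value in scores.items())


total = weighted_score(scores, weights)
unweighted = sum(scores.values())

print(f"weighted sum:   {total:.4f}")
print(f"unweighted sum: {unweighted:.4f}")
```

If the weights used for a given model differ, the total will differ as well, so the actual decoding configuration is the place to check.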
Thanks for the comment.
I currently treat the "score" (i.e., -57.1623) as a rough confidence score indicating how confident the model is in its prediction of the audio's semantic meaning. From my observation, the score of nbests[0] is higher than that of nbests[1]. I guess that is adequate for my purpose.
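One caveat about the ratio proposed earlier in the thread: the scores are log-domain values, so dividing one raw score by the sum of raw scores does not give a probability. A common heuristic, sketched below, is to exponentiate and normalize the n-best scores (a softmax) to get relative weights among the returned hypotheses. This is only an illustration under stated assumptions: it is not an ESPnet feature, the result is not a calibrated confidence, and every value in log_scores except the first is made up for the example.

```python
import math


def nbest_posteriors(log_scores):
    """Softmax over n-best log scores (hypothetical helper, not ESPnet API).

    Subtracting the maximum before exponentiating keeps the computation
    numerically stable for large negative scores.
    """
    m = max(log_scores)
    exps = [math.exp(s - m) for s in log_scores]
    z = sum(exps)
    return [e / z for e in exps]


# First value is the score reported in the thread; the others are
# invented placeholders standing in for nbests[1].score, nbests[2].score.
log_scores = [-57.1623, -58.0, -60.5]
posteriors = nbest_posteriors(log_scores)

for rank, p in enumerate(posteriors):
    print(f"hypothesis {rank}: relative weight {p:.3f}")
```

The weights only compare the hypotheses that beam search returned, so they depend on the nbest and beam size settings and should be read as a relative ranking signal rather than a true confidence.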