
Comments (4)

kamo-naoyuki avatar kamo-naoyuki commented on August 11, 2024

Hypothesis is a NamedTuple, so you can access its fields as attributes.

https://github.com/espnet/espnet/blob/master/espnet/nets/beam_search.py#L19-L33

from espnet_model_zoo.

pengcheng-tech avatar pengcheng-tech commented on August 11, 2024

Hi, thanks for your response.

Following the link, I modified the code as follows:

nbests = speech2text(speech)

text, *_, score_bundle = nbests[0]

By executing the following:

print(score_bundle.score)
print(score_bundle.scores)

I got:
tensor(-57.1623, device='cuda:0')
{'decoder': tensor(-2.6879, device='cuda:0'), 'lm': tensor(-55.0374, device='cuda:0'), 'ctc': tensor(-0.8112, device='cuda:0')}

I think "-57.1623" is the result of log P_encdec(y|x) + log P_ctc(y|x) + log P_lm(y), where log P_encdec(y|x) is -2.6879, log P_ctc(y|x) is -0.8112, and log P_lm(y) is -55.0374. The numbers do not quite match, though: the plain sum is -58.5365.

If I denote -57.1623 as nbests[0].score, can I take nbests[0] through nbests[100] and use nbests[0].score / (nbests[0].score + nbests[1].score + ... + nbests[100].score) as a rough decoding confidence score?

Thanks a lot


kamo-naoyuki avatar kamo-naoyuki commented on August 11, 2024

score is the weighted sum of the individual scores. You set the weights when you instantiate the Speech2Text class.
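To illustrate the weighted sum, here is a minimal sketch using the component scores printed earlier in the thread. The weights are an assumption and are not stated anywhere in the thread: decoder 0.7, CTC 0.3, and LM 1.0 were picked only because they reproduce the reported total. The names `weights` and `weighted_score` are hypothetical and not part of ESPnet.

```python
# Component log-scores reported in the thread, as plain floats.
scores = {"decoder": -2.6879, "lm": -55.0374, "ctc": -0.8112}

# ASSUMPTION: these weights are not given in the thread. They are chosen
# because they happen to reproduce the reported total of -57.1623.
weights = {"decoder": 0.7, "ctc": 0.3, "lm": 1.0}


def weighted_score(scores, weights):
    """Sum each component score multiplied by its weight."""
    return sum(weights[name] * value for name, value in scores.items())


total = weighted_score(scores, weights)
unweighted = sum(scores.values())

print(f"weighted sum:   {total:.4f}")
print(f"unweighted sum: {unweighted:.4f}")
```

Under these assumed weights the weighted sum agrees with the printed score to four decimal places, while the unweighted sum does not, which would explain the mismatch noted above.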

You can get an arbitrary number of n-best scores by passing the nbest argument to Speech2Text, but I think it is not trivial to treat them as a confidence score.
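For reference, dividing one log score by the sum of log scores does not give a probability, because the scores are negative log values. One common heuristic, sketched below, is to apply a softmax over the n-best scores instead. This is only an illustration and not an ESPnet feature: the function name `nbest_posteriors` is hypothetical, the second and third scores are made up, and the result is a relative weight within the n-best list, not a calibrated confidence.

```python
import math


def nbest_posteriors(log_scores):
    """Softmax over n-best log scores (heuristic, not calibrated)."""
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores)
    exps = [math.exp(s - top) for s in log_scores]
    norm = sum(exps)
    return [e / norm for e in exps]


# The first score is from the thread; the others are made-up placeholders.
log_scores = [-57.1623, -58.0, -60.5]
probs = nbest_posteriors(log_scores)

for rank, (score, prob) in enumerate(zip(log_scores, probs)):
    print(f"rank {rank}: score={score:.4f} relative weight={prob:.3f}")
```

Because the weighted score mixes decoder, CTC, and LM terms and depends on the beam contents, these values are best used only to compare hypotheses for the same utterance.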


pengcheng-tech avatar pengcheng-tech commented on August 11, 2024

Thanks for the comment.

I currently treat the "score" (i.e., -57.1623) as a rough confidence score that indicates how confident the model is in its prediction for the audio. From my observation, the score of nbests[0] is higher than that of nbests[1]. I guess that is adequate for my purpose.
