suber's Issues

Verbatim SubER

Hi again,
This is not a real issue but an "enhancement" request.
I am using SubER for a paper and would like to know whether there is a way to obtain more detailed information about the results. Since the metric is Levenshtein-based, could the tool also report the number of deletions, insertions, etc.?
This would be useful for running analyses and gaining insight into system behavior.
Thank you
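
To illustrate the kind of breakdown requested above, here is a minimal sketch of counting word-level edit operations with a plain Levenshtein alignment. It is not SubER's implementation: the function name `edit_operation_counts` is hypothetical, and the sketch ignores anything beyond basic insertions, deletions and substitutions (for example shifts or subtitle break tokens).

```python
def edit_operation_counts(hyp_words, ref_words):
    """Count the edit operations that turn hyp_words into ref_words.

    Hypothetical helper for illustration; plain word-level Levenshtein only.
    """
    n, m = len(hyp_words), len(ref_words)
    # dist[i][j] = minimum number of edits to turn hyp_words[:i] into ref_words[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if hyp_words[i - 1] == ref_words[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # delete a hypothesis word
                dist[i][j - 1] + 1,         # insert a reference word
            )

    # Walk back through the table to recover one optimal sequence of operations.
    counts = {"match": 0, "substitution": 0, "deletion": 0, "insertion": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if hyp_words[i - 1] == ref_words[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                counts["match" if cost == 0 else "substitution"] += 1
                i -= 1
                j -= 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletion"] += 1
            i -= 1
        else:
            counts["insertion"] += 1
            j -= 1
    return counts


counts = edit_operation_counts(
    "the cat sat on mat".split(),
    "the cat sat on the mat".split(),
)
print(counts)
```

When several optimal alignments exist, the split between substitutions, deletions and insertions depends on the order in which the backtrace prefers them; only the total number of edits is unique.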

Statistical Significance / Confidence Intervals

Hi @patrick-wilken ,

I think it would be great to offer a way to test the statistical significance of the difference between two hypotheses. This can be done with bootstrap resampling, although the main challenge is deciding how to sample from SRT files: in general, the SRTs generated by two systems are aligned neither with each other nor with the references. Do you have comments or ideas on how to do this? I can also assist with the implementation.

Thanks,
Marco
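
As a rough sketch of the resampling step only, the following assumes the alignment problem raised above has already been solved somehow, so that both systems have per-segment statistics over the same list of segments. The function name `paired_bootstrap` and the `(num_edits, ref_length)` tuples are hypothetical and not part of the SubER tool.

```python
import random


def paired_bootstrap(stats_a, stats_b, num_samples=1000, seed=0):
    """Estimate how often system A gets a lower edit rate than system B.

    stats_a, stats_b: lists of (num_edits, ref_length) tuples, one per segment,
    where index k refers to the same segment in both lists (an assumption).
    """
    assert len(stats_a) == len(stats_b)
    rng = random.Random(seed)
    n = len(stats_a)

    def edit_rate(stats, indices):
        # Corpus-level rate on the resampled set: pooled edits / pooled length.
        edits = sum(stats[k][0] for k in indices)
        length = sum(stats[k][1] for k in indices)
        return edits / length if length else 0.0

    a_wins = 0
    for _ in range(num_samples):
        # Draw n segments with replacement, using the same draw for both systems.
        indices = [rng.randrange(n) for _ in range(n)]
        if edit_rate(stats_a, indices) < edit_rate(stats_b, indices):
            a_wins += 1
    return a_wins / num_samples


# Toy numbers, invented purely to show the call.
system_a = [(1, 10), (2, 12), (0, 8), (3, 15)]
system_b = [(2, 10), (2, 12), (1, 8), (5, 15)]
print(paired_bootstrap(system_a, system_b))
```

A fraction close to 1.0 suggests that A's lower edit rate is stable under resampling; how to define the segments for SRT output remains the open question of this issue.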

SubER computation on more than 1 srt file

Hi, this is more a question than a problem: if I have more than one .srt file to compare, how can I compute the SubER metric (and also the AS-/T-BLEU metrics)?
Is it sufficient to concatenate the files and then compute the metrics, or do we need something more sophisticated?

Thanks for your work.
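
One point that may matter for this question is that, for any edit-rate style metric, pooling counts over all files is not the same as averaging per-file scores. The sketch below only illustrates that arithmetic; the `(num_edits, ref_length)` pairs and both function names are hypothetical, and it makes no claim about what the SubER tool does when given concatenated files.

```python
def pooled_rate(per_file_stats):
    """Edit rate from counts summed over all files (longer files weigh more)."""
    total_edits = sum(edits for edits, _ in per_file_stats)
    total_length = sum(length for _, length in per_file_stats)
    return total_edits / total_length


def averaged_rate(per_file_stats):
    """Plain mean of per-file edit rates (every file weighs the same)."""
    rates = [edits / length for edits, length in per_file_stats]
    return sum(rates) / len(rates)


# Invented counts: (num_edits, ref_length) for two files.
stats = [(10, 100), (30, 50)]
print(pooled_rate(stats))    # 40 / 150
print(averaged_rate(stats))  # (0.1 + 0.6) / 2
```

Which of the two is appropriate depends on whether the evaluation should weight files by length or treat them equally.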

Punctuation and case sensitive

Hi @patrick-wilken,
I am back with some observations on the SubER scores I obtained when analyzing different models.
I know that you found no significant difference in the correlation of SubER with and without punctuation and true casing, as reported in the paper, but I think it would be very useful to add an option to the SubER tool for choosing whether punctuation and true casing are taken into account. Currently the text is normalized, so tokenization is not needed to compute TER (as far as I understood from your implementation), but it would become necessary if the normalization step were skipped (as they do in the sacrebleu tool).
I noticed that computing SubER on normalized text strongly favors systems that are poor at inserting punctuation and capitalizing words correctly, and the resulting SubER scores contradict not only BLEU scores but also Sigma scores.
To give you an idea, I found that a system scoring 5 BLEU points lower than all the other systems I tested can still achieve a lower (thus better) SubER. The difference in translation quality also emerges in manual evaluation, where the absence of punctuation strongly affects comprehension. Therefore, I suggest adding the option mentioned above and perhaps exploring this aspect further.

Thanks
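
To make the requested option concrete, here is a minimal sketch of a normalization step with two switches. It is an assumption-laden illustration: the function `normalize` and its parameters are hypothetical, it strips ASCII punctuation only via `string.punctuation`, and SubER's actual normalization may differ.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize(text, lowercase=True, remove_punctuation=True):
    """Optionally lowercase and strip ASCII punctuation, then squeeze spaces."""
    if remove_punctuation:
        text = text.translate(_PUNCT_TABLE)
    if lowercase:
        text = text.lower()
    return " ".join(text.split())


with_punct = "Hello, world! How are you?"
without_punct = "hello world how are you"

# With both switches on, the two outputs become indistinguishable.
print(normalize(with_punct) == normalize(without_punct))

# With both switches off, the difference is preserved for scoring.
print(normalize(with_punct, lowercase=False, remove_punctuation=False))
```

With normalization enabled, a system that emits no punctuation or casing is scored the same as one that gets both right, which is the effect described above.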
