
Comments (2)

patrick-wilken commented on September 15, 2024

Hi Marco,
thanks for the proposal, definitely sounds like a useful addition! You are in particular referring to this https://aclanthology.org/W04-3250.pdf, right?
Regarding sampling from subtitles: yes, that seems to be much less obvious than sampling from sentences. For the SubER calculation the files are already split into parts at points in time where both hypothesis and reference agree that there is no subtitle. So far this is an implementation detail for more efficient computation. But this is the closest thing to parallel segments that currently exists and those could maybe be used as units for sampling? There are several problems with this though: 1. segmentation depends on the hypotheses; 2. probably too few segments, depending on specific subtitle content; 3. length of segments varies greatly.
Another idea that comes to my mind is to calculate the SubER edit operations on the whole file, sample a subset of reference subtitle blocks, and calculate SubER scores using only the edit operations (and reference length) corresponding to those blocks. But this is only brainstorming right now, have to think it through...
I will be travelling the next two weeks, so can only really look into this after that. 🙃


mgaido91 commented on September 15, 2024

Hi @patrick-wilken ! Thanks for your reply. Yes, that is the paper I was referring to. I have looked into the code over the past few days, and the easiest approach that comes to my mind is the following:

In the SubER for loop (https://github.com/apptek/SubER/blob/main/suber/metrics/suber.py#L29), we can keep track of the individual edit counts and reference lengths, instead of just accumulating them. Once we have these fine-grained stats, we can bootstrap with them. I already have a rough implementation doing this. The main issues in this case would be:

  1. How to integrate this in a clean way in the tool?
  2. In this way we can only compute confidence intervals, rather than the statistical significance of the difference between two hypotheses. But that second part is very hard because of all the alignment issues. So as a first step, CIs may be enough. What do you think?
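To make the idea concrete, here is a minimal sketch of a percentile-bootstrap confidence interval computed from per-segment statistics. It assumes we already collected, per segment, the edit count and the reference length from the SubER loop; the function name and signature are illustrative, not part of the actual SubER code:

```python
# Hypothetical sketch: percentile-bootstrap CI for a corpus-level SubER score,
# given per-segment edit counts and reference lengths collected from the loop.
import random

def bootstrap_suber_ci(edits, ref_lengths, n_resamples=1000, alpha=0.05, seed=0):
    """Return (lower, upper) percentile-bootstrap bounds for the corpus score.

    edits[i] and ref_lengths[i] belong to segment i; the corpus-level score
    is assumed to be 100 * sum(edits) / sum(ref_lengths).
    """
    assert len(edits) == len(ref_lengths) and ref_lengths
    rng = random.Random(seed)
    n = len(edits)
    scores = []
    for _ in range(n_resamples):
        # Resample segments with replacement and recompute the corpus score.
        idx = [rng.randrange(n) for _ in range(n)]
        total_edits = sum(edits[i] for i in idx)
        total_ref = sum(ref_lengths[i] for i in idx)
        scores.append(100.0 * total_edits / total_ref)
    scores.sort()
    lower = scores[int(alpha / 2 * n_resamples)]
    upper = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper
```

Note that resampling whole segments (rather than individual edit operations) keeps the within-segment correlation of edits intact, which is the usual reason to bootstrap at the segment level.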

Thanks,
Marco
