Comments (17)
Need to move the code for computing false positives/negatives of event times over a window into util. It's used in both beat and onset, in the respective f_measure functions. Is this the same as boundary.detection, @bmcfee?
from mir_eval.
Tough question: are the semantics of finding a boundary the same as those of detecting an onset/beat? boundary.detection now enforces that each estimated/reference boundary be used at most once in the precision/recall calculation, and I'm not sure that's what you want for onsets. If so, we can abstract out the guts of the boundary metric into a function util.window_matching, which computes the size of the maximal matching between reference and estimates within a specified window.
Onsets/beats only allow each onset/beat to be counted once. The difference is that the matching is done greedily, I think, but I'm not 100% sure whether that matters. We should discuss in person.
Ah. I suspect it doesn't matter too much for beats (as with segments), but it might be a big deal for onsets, especially if there are rapid sequences of onsets (e.g., in an arpeggio).
Can you walk through a degenerate case where it would matter? It seems like if you're just counting true/false positive/negatives it won't matter. But it would obviously matter if you care about distance to closest event.
We're going to fix on the non-greedy approach for all submodules.
Just to document the discussion from earlier: the greedy matching strategy can fail when the gap between two events is smaller than twice the window length. In these cases, an estimate could feasibly map to either reference event, and mapping greedily to the closest one can result in a smaller global matching.
This probably does not affect boundary detection at 0.5 seconds. It might affect boundary detection at 3 seconds (<=6sec segments are not uncommon).
Beat tracking is probably safe, given the relatively small window length, but a degenerate beat tracker might output many events which cause trouble anyway. (Although, in this case, the score would probably be so low as to not matter if we have erroneous assignments...)
Onset detection is probably the most troublesome case, given the relatively lax constraints over the timing between subsequent events (compared to beats or segments, at least).
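To make the failure mode concrete, here is a toy sketch (the helper names are made up for illustration, not mir_eval code): two references 0.4 s apart with a 0.5 s window, so the gap is smaller than twice the window length.

```python
from itertools import permutations

def greedy_match(ref, est, window):
    """Match each estimate to its closest unused reference (greedy)."""
    unused = list(ref)
    hits = 0
    for e in est:
        candidates = [r for r in unused if abs(e - r) <= window]
        if candidates:
            unused.remove(min(candidates, key=lambda r: abs(e - r)))
            hits += 1
    return hits

def best_match(ref, est, window):
    """Brute-force the optimal matching (tiny inputs only)."""
    return max(sum(abs(e - r) <= window for e, r in zip(est, perm))
               for perm in permutations(ref))

# References 0.4s apart, window 0.5s: the gap is under twice the window
ref, est = [0.0, 0.4], [0.3, 0.7]
print(greedy_match(ref, est, 0.5))  # 1 -- 0.3 greedily grabs 0.4, stranding 0.7
print(best_match(ref, est, 0.5))    # 2 -- 0.3 -> 0.0 and 0.7 -> 0.4 both fit
```

Greedy loses a match exactly because the estimate at 0.3 takes the nearer reference (0.4) that the estimate at 0.7 needed.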
I can't help wanting to know what the "non-greedy approach" is.
Greedy is usually used as a heuristic when the optimal solution (i.e.,
the correspondence that minimizes total error) requires an exponential
search, and no optimal polynomial algorithm is known. I don't suppose
we're doing the exponential search. So what are we doing?
DAn.
It's here:
https://github.com/craffel/mir_eval/blob/master/mir_eval/boundary.py#L114
Greedy algorithms are also used when the non-greedy algorithms are non-obvious, even when they're non-exponential.
Or, put succinctly: "We find the answer using magic"
I assume this tests out correctly?
# L. Lovasz. On determinants, matchings and random algorithms.
# In L. Budach, editor, Fundamentals of Computation Theory, pages
# 565-574. Akademie-Verlag, 1979.
#
# If we build the skew-symmetric adjacency matrix
# D[i, n_ref+j] = 1 <=> ref[i] within window of est[j]
# D[n_ref + j, i] = -1 <=> same
#
# then rank(D) = 2 * maximum matching
#
# This way, we find the optimal assignment of reference and
# annotation boundaries.
I'm trying to figure this out...
If we have really broad matching windows, so that all of ref[i] are
within the windows of all of est[j], then we end up with a
block-constant square matrix:
n_r rows [  0 | 1 ]
n_e rows [ -1 | 0 ]
That looks like rank 2 to me, so max matching = 1, so the algorithm
returns precision and recall = 1/n_boundaries
yet I think if n_r == n_e, you can make a complete assignment, so the
best-case precision and recall should be 1.
If my interpretation is right, this algorithm is more like a
worst-case matching, that penalizes ambiguous assignments, whereas I
think the point of the metric is to find the best-case assignment.
The quick way to check is to see how the metrics vary as the window
increases. The scores should go up, but I'm guessing it will go down.
DAn.
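DAn's block-matrix argument is easy to check numerically. The sketch below (sizes chosen arbitrarily) confirms that the ±1 construction collapses to rank 2; the Lovász result instead places random values on the edges, in which case rank(D) = 2 × maximum matching holds with high probability.

```python
import numpy as np

n_r = n_e = 3  # assume every reference is within the window of every estimate
D = np.zeros((n_r + n_e, n_r + n_e))
D[:n_r, n_r:] = 1     # D[i, n_ref+j] = 1
D[n_r:, :n_r] = -1    # skew-symmetric counterpart

print(np.linalg.matrix_rank(D))  # 2 -> "matching" of 1, despite a perfect matching of size 3

# With random edge weights, as in Lovasz's construction, the generic
# rank recovers 2 * maximum matching (here 2 * 3 = 6):
X = np.random.default_rng(0).random((n_r, n_e))
D[:n_r, n_r:] = X
D[n_r:, :n_r] = -X.T
print(np.linalg.matrix_rank(D))  # 6
```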
can't help wanting to know what the "non-greedy approach" is.
Greedy is usually used as a heuristic when the optimal solution (i.e.,
the correspondence that minimizes total error) requires an exponential
search, and no optimal polynomial algorithm is known. I don't suppose
we're doing the exponential search. So what are we doing?
Ok, let's formalize the problem. Given a set of estimated boundaries P and a set of reference boundaries R, we want to find the largest matching between P and R subject to the window constraint. This is an instance of maximal matching in a bipartite graph. Let the vertices V = P + R and add edges (i, j) <=> |P[i] - R[j]| <= window. Then the goal is to find the largest subset of edges such that each vertex is contained in at most one edge. The size of this set (the matching) is M, the precision is M / |P|, and recall is M / |R|.
This problem is well-studied, and there exist several polynomial time algorithms to solve it. See: http://en.wikipedia.org/wiki/Matching_(graph_theory)#In_unweighted_bipartite_graphs
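As a sketch of that computation (the helper name window_match_scores is made up), one can also reduce it to an assignment problem: give within-window pairs cost 0 and everything else cost 1, so a minimum-cost assignment maximizes the number of valid pairs.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def window_match_scores(ref, est, window):
    """Precision/recall from the maximum matching under a window constraint."""
    ref, est = np.asarray(ref), np.asarray(est)
    # cost 0 iff |est[i] - ref[j]| <= window, else 1
    cost = (np.abs(est[:, None] - ref[None, :]) > window).astype(float)
    rows, cols = linear_sum_assignment(cost)
    # only zero-cost assigned pairs are real matches
    m = int(np.sum(cost[rows, cols] == 0))
    return m / len(est), m / len(ref)  # precision = M/|P|, recall = M/|R|

# Two references 0.4s apart, window 0.5s: both estimates are matchable
print(window_match_scores([0.0, 0.4], [0.3, 0.7], 0.5))  # (1.0, 1.0)
```

Any matching extends to a full assignment (extra pairs cost 1), and the zero-cost pairs of any assignment form a matching, so minimizing total cost is equivalent to maximizing the matching size.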
That looks like rank 2 to me, so max matching = 1, so the algorithm
returns precision and recall = 1/n_boundaries
Yup, you're right. Looks like I misinterpreted the paper I pulled this idea from!
However, the maximum matching idea is still correct, and we can plug in a Hopcroft-Karp solver to compute it correctly.
OK, so we need a maximum bipartite matching, and the Augmenting Path
Algorithm appears to be the conceptually simplest (least efficient)
polynomial algorithm. Anyone have a copy of West, Douglas Brent
(1999), Introduction to Graph Theory (2nd ed.) to hand?
DAn.
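The augmenting-path approach fits in a few lines of Python (a sketch, not mir_eval's actual implementation): try to place each estimate in turn, recursively re-routing previously matched estimates when a conflict arises.

```python
def max_matching(ref, est, window):
    """Maximum bipartite matching via augmenting paths (Kuhn's algorithm)."""
    # adjacency: estimate i can match any reference within the window
    adj = [[j for j, r in enumerate(ref) if abs(e - r) <= window] for e in est]
    match_ref = [-1] * len(ref)  # match_ref[j] = estimate matched to ref j

    def augment(i, visited):
        # try to place estimate i, re-routing earlier matches if necessary
        for j in adj[i]:
            if j not in visited:
                visited.add(j)
                if match_ref[j] == -1 or augment(match_ref[j], visited):
                    match_ref[j] = i
                    return True
        return False

    return sum(augment(i, set()) for i in range(len(est)))

print(max_matching([0.0, 0.4], [0.3, 0.7], window=0.5))  # 2
```

This is the O(V*E) variant; Hopcroft-Karp improves it to O(E*sqrt(V)) by augmenting along many shortest paths at once, but for typical event counts the simple version is plenty fast.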
Hopcroft-Karp is now implemented in mir_eval.util as of d7df358. The boundary detection metric has been rewritten to use it, and it now does The Right Thing.
@ejhumphrey Currently intervals are validated in the chord.score decorator by

# Intervals should be (n, 2) array
if intervals.ndim != 2 or intervals.shape[1] != 2:
    raise ValueError('intervals should be an ndarray'
                     ' of size (n, 2)')
# There should be as many intervals as labels
if intervals.shape[0] != N:
    raise ValueError('intervals contains {} entries but '
                     'len(reference_labels) = len(estimated_labels)'
                     ' = {}'.format(intervals.shape[0], N))
if 0 in np.diff(np.array(intervals), axis=1):
    warnings.warn('Zero-duration interval')

The first and last checks are included in util.validate_intervals:
https://github.com/craffel/mir_eval/blob/master/mir_eval/util.py#L560
except that instead of checking for empty intervals, it checks for any intervals of length less than or equal to zero. It also checks for negative interval times, which seems helpful. Can you think of any reason I shouldn't replace the first and last checks with a call to util.validate_intervals?
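For comparison, the consolidated check would look roughly like the sketch below. This is a paraphrase of the behavior described above, not the actual util.validate_intervals source; the error messages and the warn-vs-raise choice for non-positive durations are assumptions.

```python
import warnings
import numpy as np

def validate_intervals(intervals):
    """Sketch of the consolidated interval checks (behavior assumed)."""
    # shape check: intervals must be an (n, 2) array
    if intervals.ndim != 2 or intervals.shape[1] != 2:
        raise ValueError('intervals should be an ndarray of size (n, 2)')
    # negative times are never valid
    if (intervals < 0).any():
        raise ValueError('negative interval times found')
    # flag intervals of length <= 0 (subsumes the zero-duration check)
    if (np.diff(intervals, axis=1) <= 0).any():
        warnings.warn('intervals of length <= 0 found')
```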
At the moment, no, but I'm shutting down for the day... lemme pin this to the top of my todo list for tomorrow and give it more brain power then.