Comments (17)
Need to move the code for computing false positives/negatives of event times over a window into util. It's used in both beat and onset, in the respective f_measure functions. Is this the same as boundary.detection, @bmcfee?
from mir_eval.
Tough question: are the semantics of finding a boundary the same as those of detecting an onset/beat? boundary.detection now enforces that each estimated/reference boundary be used at most once in the precision/recall calculation, and I'm not sure that's what you want for onsets. If so, we can abstract out the guts of the boundary metric into a function util.window_matching, which computes the size of the maximal matching between reference and estimates within a specified window.
Onsets/beats only allow each onset/beat to be counted once. The difference is that the matching is done greedily, I think, but I'm not 100% sure whether that matters. We should discuss in person.
Ah. I suspect it doesn't matter too much for beats (as with segments), but it might be a big deal for onsets, especially if there are rapid sequences of onsets (e.g., in an arpeggio).
Can you walk through a degenerate case where it would matter? It seems like if you're just counting true/false positive/negatives it won't matter. But it would obviously matter if you care about distance to closest event.
We're going to fix on the non-greedy approach for all submodules.
Just to document the discussion from earlier: the greedy matching strategy can fail when the gap between two events is smaller than twice the window length. In these cases, an estimate could feasibly map to either reference event, and mapping greedily to the closest one can result in a smaller global matching.
This probably does not affect boundary detection at 0.5 seconds. It might affect boundary detection at 3 seconds (<=6sec segments are not uncommon).
Beat tracking is probably safe, given the relatively small window length, but a degenerate beat tracker might output many events which cause trouble anyway. (Although, in this case, the score would probably be so low as to not matter if we have erroneous assignments...)
Onset detection is probably the most troublesome case, given the relatively lax constraints over the timing between subsequent events (compared to beats or segments, at least).
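To make the failure mode concrete, here is a toy sketch (the helper names are made up for illustration, not mir_eval code): two references 0.4 s apart with a 0.5 s window, so the gap is smaller than twice the window length.

```python
from itertools import permutations

def greedy_match(ref, est, window):
    """Match each estimate to its closest unused reference (greedy)."""
    unused = list(ref)
    hits = 0
    for e in est:
        candidates = [r for r in unused if abs(e - r) <= window]
        if candidates:
            unused.remove(min(candidates, key=lambda r: abs(e - r)))
            hits += 1
    return hits

def best_match(ref, est, window):
    """Brute-force the optimal matching (tiny inputs only)."""
    return max(sum(abs(e - r) <= window for e, r in zip(est, perm))
               for perm in permutations(ref))

# References 0.4s apart, window 0.5s: the gap is under twice the window
ref, est = [0.0, 0.4], [0.3, 0.7]
print(greedy_match(ref, est, 0.5))  # 1 -- 0.3 greedily grabs 0.4, stranding 0.7
print(best_match(ref, est, 0.5))    # 2 -- 0.3 -> 0.0 and 0.7 -> 0.4 both fit
```

Greedy loses a match exactly because the estimate at 0.3 takes the nearer reference (0.4) that the estimate at 0.7 needed.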
I can't help wanting to know what the "non-greedy approach" is.
Greedy is usually used as a heuristic when the optimal solution (i.e.,
the correspondence that minimizes total error) requires an exponential
search, and no optimal polynomial algorithm is known. I don't suppose
we're doing the exponential search. So what are we doing?
DAn.
It's here:
https://github.com/craffel/mir_eval/blob/master/mir_eval/boundary.py#L114
Greedy algorithms are also used when the non-greedy algorithms are non-obvious, even when they're non-exponential.
Or, put succinctly: "We find the answer using magic"
I assume this tests out correctly?
# L. Lovasz. On determinants, matchings and random algorithms.
# In L. Budach, editor, Fundamentals of Computation Theory, pages
# 565-574. Akademie-Verlag, 1979.
#
# If we build the skew-symmetric adjacency matrix
# D[i, n_ref+j] = 1 <=> ref[i] within window of est[j]
# D[n_ref + j, i] = -1 <=> same
#
# then rank(D) = 2 * maximum matching
#
# This way, we find the optimal assignment of reference and
# annotation boundaries.
I'm trying to figure this out...
If we have really broad matching windows, so that all of ref[i] are
within the windows of all of est[j], then we end up with a
block-constant square matrix:
n_r rows [  0 | 1 ]
n_e rows [ -1 | 0 ]
That looks like rank 2 to me, so max matching = 1, so the algorithm
returns precision and recall = 1/n_boundaries
yet I think if n_r == n_e, you can make a complete assignment, so the
best-case precision and recall should be 1.
If my interpretation is right, this algorithm is more like a
worst-case matching, that penalizes ambiguous assignments, whereas I
think the point of the metric is to find the best-case assignment.
The quick way to check is to see how the metrics vary as the window
increases. The scores should go up, but I'm guessing it will go down.
DAn.
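DAn's block-matrix argument is easy to check numerically. The sketch below (sizes chosen arbitrarily) confirms that the ±1 construction collapses to rank 2; the Lovász result instead places random values on the edges, in which case rank(D) = 2 × maximum matching holds with high probability.

```python
import numpy as np

n_r = n_e = 3  # assume every reference is within the window of every estimate
D = np.zeros((n_r + n_e, n_r + n_e))
D[:n_r, n_r:] = 1     # D[i, n_ref+j] = 1
D[n_r:, :n_r] = -1    # skew-symmetric counterpart

print(np.linalg.matrix_rank(D))  # 2 -> "matching" of 1, despite a perfect matching of size 3

# With random edge weights, as in Lovasz's construction, the generic
# rank recovers 2 * maximum matching (here 2 * 3 = 6):
X = np.random.default_rng(0).random((n_r, n_e))
D[:n_r, n_r:] = X
D[n_r:, :n_r] = -X.T
print(np.linalg.matrix_rank(D))  # 6
```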
can't help wanting to know what the "non-greedy approach" is.
Greedy is usually used as a heuristic when the optimal solution (i.e.,
the correspondence that minimizes total error) requires an exponential
search, and no optimal polynomial algorithm is known. I don't suppose
we're doing the exponential search. So what are we doing?
Ok, let's formalize the problem. Given a set of estimated boundaries P and a set of reference boundaries R, we want to find the largest matching between P and R subject to the window constraint. This is an instance of maximal matching in a bipartite graph. Let the vertices V = P + R and add edges (i, j) <=> |P[i] - R[j]| <= window. Then the goal is to find the largest subset of edges such that each vertex is contained in at most one edge. The size of this set (the matching) is M, the precision is M / |P|, and recall is M / |R|.
This problem is well-studied, and there exist several polynomial time algorithms to solve it. See: http://en.wikipedia.org/wiki/Matching_(graph_theory)#In_unweighted_bipartite_graphs
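As a sketch of that computation (the helper name window_match_scores is made up), one can also reduce it to an assignment problem: give within-window pairs cost 0 and everything else cost 1, so a minimum-cost assignment maximizes the number of valid pairs.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def window_match_scores(ref, est, window):
    """Precision/recall from the maximum matching under a window constraint."""
    ref, est = np.asarray(ref), np.asarray(est)
    # cost 0 iff |est[i] - ref[j]| <= window, else 1
    cost = (np.abs(est[:, None] - ref[None, :]) > window).astype(float)
    rows, cols = linear_sum_assignment(cost)
    # only zero-cost assigned pairs are real matches
    m = int(np.sum(cost[rows, cols] == 0))
    return m / len(est), m / len(ref)  # precision = M/|P|, recall = M/|R|

# Two references 0.4s apart, window 0.5s: both estimates are matchable
print(window_match_scores([0.0, 0.4], [0.3, 0.7], 0.5))  # (1.0, 1.0)
```

Any matching extends to a full assignment (extra pairs cost 1), and the zero-cost pairs of any assignment form a matching, so minimizing total cost is equivalent to maximizing the matching size.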
That looks like rank 2 to me, so max matching = 1, so the algorithm
returns precision and recall = 1/n_boundaries
Yup, you're right. Looks like I misinterpreted the paper I pulled this idea from!
However, the maximum matching idea is still correct, and we can plug in a Hopcroft-Karp solver to compute it correctly.
OK, so we need a maximum bipartite matching, and the Augmenting Path
Algorithm appears to be the conceptually simplest (least efficient)
polynomial algorithm. Anyone have a copy of West, Douglas Brent
(1999), Introduction to Graph Theory (2nd ed.) to hand?
DAn.
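The augmenting-path approach fits in a few lines of Python (a sketch, not mir_eval's actual implementation): try to place each estimate in turn, recursively re-routing previously matched estimates when a conflict arises.

```python
def max_matching(ref, est, window):
    """Maximum bipartite matching via augmenting paths (Kuhn's algorithm)."""
    # adjacency: estimate i can match any reference within the window
    adj = [[j for j, r in enumerate(ref) if abs(e - r) <= window] for e in est]
    match_ref = [-1] * len(ref)  # match_ref[j] = estimate matched to ref j

    def augment(i, visited):
        # try to place estimate i, re-routing earlier matches if necessary
        for j in adj[i]:
            if j not in visited:
                visited.add(j)
                if match_ref[j] == -1 or augment(match_ref[j], visited):
                    match_ref[j] = i
                    return True
        return False

    return sum(augment(i, set()) for i in range(len(est)))

print(max_matching([0.0, 0.4], [0.3, 0.7], window=0.5))  # 2
```

This is the O(V*E) variant; Hopcroft-Karp improves it to O(E*sqrt(V)) by augmenting along many shortest paths at once, but for typical event counts the simple version is plenty fast.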
Hopcroft-Karp is now implemented in mir_eval.util as of d7df358. The boundary detection metric has been rewritten to use it, and it now does The Right Thing.
@ejhumphrey Currently intervals are validated in the chord.score decorator by

# Intervals should be (n, 2) array
if intervals.ndim != 2 or intervals.shape[1] != 2:
    raise ValueError('intervals should be an ndarray'
                     ' of size (n, 2)')
# There should be as many intervals as labels
if intervals.shape[0] != N:
    raise ValueError('intervals contains {} entries but '
                     'len(reference_labels) = len(estimated_labels)'
                     ' = {}'.format(intervals.shape[0], N))
if 0 in np.diff(np.array(intervals), axis=1):
    warnings.warn('Zero-duration interval')

The first and last checks are included in util.validate_intervals:
https://github.com/craffel/mir_eval/blob/master/mir_eval/util.py#L560
except that instead of checking for empty intervals, it checks for any intervals of length less than or equal to zero. It also checks for negative interval times, which seems helpful. Can you think of any reason I shouldn't replace the first and last checks with a call to util.validate_intervals?
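For comparison, the consolidated check would look roughly like the sketch below. This is a paraphrase of the behavior described above, not the actual util.validate_intervals source; the error messages and the warn-vs-raise choice for non-positive durations are assumptions.

```python
import warnings
import numpy as np

def validate_intervals(intervals):
    """Sketch of the consolidated interval checks (behavior assumed)."""
    # shape check: intervals must be an (n, 2) array
    if intervals.ndim != 2 or intervals.shape[1] != 2:
        raise ValueError('intervals should be an ndarray of size (n, 2)')
    # negative times are never valid
    if (intervals < 0).any():
        raise ValueError('negative interval times found')
    # flag intervals of length <= 0 (subsumes the zero-duration check)
    if (np.diff(intervals, axis=1) <= 0).any():
        warnings.warn('intervals of length <= 0 found')
```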
At the moment, no, but I'm shutting down for the day... lemme pin this to the top of my todo list for tomorrow and give it more brain power then.