Comments (7)
At the moment the code specifically loads both columns as floats:
https://github.com/craffel/mir_eval/blob/master/mir_eval/input_output.py#L229
Can you use np.loadtxt to load columns of different datatypes? It seems as though you have to specify a type and all columns are expected to be of that type.
I wrote this loader with melody in mind - it acts as a first check to ensure the data is of the correct type before proceeding. If you make this function return values that are not necessarily floats, you'd have to add a new check for the melody eval code or it could break...
from mir_eval.
I believe that yes, you can load different datatypes using that function. The docstring has this example:
>>> d = StringIO("M 21 72\nF 35 58")
>>> np.loadtxt(d, dtype={'names': ('gender', 'age', 'weight'),
...                      'formats': ('S1', 'i4', 'f4')})
array([('M', 21, 72.0), ('F', 35, 58.0)],
      dtype=[('gender', '|S1'), ('age', '<i4'), ('weight', '<f4')])
One, it's overkill for what I'm asking, but I think you could massage it to behave as intended. Two, while convenient, there's nothing that strictly says it's necessary to use this function (np.loadtxt) versus iterating over a file handle; it's what the rest of the loaders do anyways.
Sure, format assertions are always good, and we'd need a different one if the implementation changes.
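For illustration, here is a minimal sketch of the "iterate over a file handle" alternative, assuming whitespace-delimited columns. The helper name load_mixed_columns and its parameters are hypothetical and are not part of mir_eval; each column gets its own converter, and a failed conversion doubles as the format check.

```python
import io


def load_mixed_columns(handle, converters=(float, str), delimiter=None):
    """Hypothetical helper (not a mir_eval function): parse each column
    of a delimited text file with its own converter."""
    columns = [[] for _ in converters]
    for line_number, line in enumerate(handle, 1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(delimiter)
        if len(fields) != len(converters):
            raise ValueError(
                "line {}: expected {} columns, got {}".format(
                    line_number, len(converters), len(fields)))
        # A converter such as float raises ValueError on malformed input,
        # so the type check happens while loading.
        for column, convert, field in zip(columns, converters, fields):
            column.append(convert(field))
    return columns


# One float column (times) and one string column (labels)
times, labels = load_mixed_columns(io.StringIO("0.0 A\n1.5 B\n3.0 A\n"))
```

Passing converters=(float, float) would give the all-float behaviour the melody code expects, so the existing check would not have to be lost.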
from mir_eval.
I think lists of strings are saner than np.arrays of strings, so I'd prefer that it just dealt with list-like objects of any type. Feel free to change at will.
from mir_eval.
So, currently there are @bmcfee's functions for loading annotations (ranges)/events: mir_eval.io.load_events and mir_eval.io.load_intervals. load_events loads in one- or two-column files; the first column always gets read in as floats (times), the second (if it exists) gets read in as a list of strings (labels). Similarly, load_annotation loads in two- or three-column files; the first two columns are floats specifying the intervals (ranges), and the last column, if it exists, is a list of string labels. @justinsalamon's load_time_series does exactly the same thing as load_events except that the second column reads in floats. It seems like these functions should be merged, and the last column should be either returned as a np.ndarray or a list of strings.
Similarly, mir_eval.util.adjust_intervals is a useful function across many tasks but currently is only really suitable for segments. It'd be useful to use for chords, but the "padding" label should be just 'N' (no chord), I would argue; and the function only allows for setting a prefix. Similarly, it'd be useful for time series, but the padding label is always a string. It seems like for both of these cases it'd be nice if it could handle labels as a string or float list-like, and that the padding label could be set arbitrarily.
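A rough sketch of what an arbitrary padding label could look like, assuming sorted, non-overlapping intervals given as (start, end) tuples. The function pad_intervals and its pad_label parameter are hypothetical; this is not the current adjust_intervals implementation.

```python
def pad_intervals(intervals, labels, t_min=0.0, t_max=None, pad_label='N'):
    """Hypothetical sketch: crop intervals to [t_min, t_max] and fill the
    uncovered start/end of that range with a caller-chosen pad_label."""
    out_intervals, out_labels = [], []
    for (start, end), label in zip(intervals, labels):
        # Crop each interval to the requested range
        start = max(start, t_min)
        if t_max is not None:
            end = min(end, t_max)
        if start < end:  # drop intervals that fall entirely outside
            out_intervals.append((start, end))
            out_labels.append(label)
    # Pad the beginning if the first interval starts after t_min
    if out_intervals and out_intervals[0][0] > t_min:
        out_intervals.insert(0, (t_min, out_intervals[0][0]))
        out_labels.insert(0, pad_label)
    # Pad the end if the last interval stops before t_max
    if t_max is not None and out_intervals and out_intervals[-1][1] < t_max:
        out_intervals.append((out_intervals[-1][1], t_max))
        out_labels.append(pad_label)
    return out_intervals, out_labels


# Chords: pad with 'N' (no chord)
chord_ivs, chord_labels = pad_intervals(
    [(1.0, 2.0), (2.0, 4.0)], ['C:maj', 'G:maj'], t_min=0.0, t_max=5.0)

# Time series: pad with a float instead of a string
f0_ivs, f0_values = pad_intervals(
    [(1.0, 2.0)], [440.0], t_min=0.0, t_max=3.0, pad_label=0.0)
```

Because pad_label is passed through untouched, the same code path serves string labels and float values.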
Is everyone comfortable with me trying to merge all of this functionality?
from mir_eval.
> load_time_series does exactly the same thing as load_events except that the second column reads in floats. It seems like these functions should be merged, and the last column should be either returned as a np.ndarray or a list of strings.
Sure. load_events allows one argument (converter) to be passed in to specify the event index type (defaults to float). Seems like the quick fix here is to replicate this functionality for label parsing, so we have event_converter and label_converter (defaults to str) in https://github.com/craffel/mir_eval/blob/master/mir_eval/input_output.py#L60-L61 .
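The two-converter idea could look roughly like the sketch below. It is not the actual load_events code; the function name load_events_sketch is made up, and it assumes whitespace-delimited files in which either every line has a label or none does.

```python
import io


def load_events_sketch(handle, event_converter=float, label_converter=str):
    """Hypothetical sketch of a loader taking separate converters for the
    event column and the optional label column."""
    events, labels = [], []
    for line in handle:
        fields = line.split(None, 1)  # split off the first column only
        if not fields:
            continue  # skip blank lines
        events.append(event_converter(fields[0]))
        if len(fields) > 1:
            # Everything after the first column is treated as the label
            labels.append(label_converter(fields[1].strip()))
    return events, labels


# Default: string labels, as load_events does now
times, names = load_events_sketch(io.StringIO("0.0 verse\n0.5 chorus\n"))

# label_converter=float reproduces the load_time_series behaviour
f0_times, f0_values = load_events_sketch(
    io.StringIO("0.0 440.0\n0.5 220.0\n"), label_converter=float)
```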
> It'd be useful to use for chords, but the "padding" label should be just 'N' (no chord), I would argue; and the function only allows for setting a prefix.
Agreed, but I'm not sure of an elegant way to do this right now. The __%s notation is a crutch following the dummy label generation in the loader, where you want synthetic labels to be unique (numbered). I figured we should have consistent keying for synthetic labels. But, in adjust_*, we only ever use __T_MIN and __T_MAX. I really wish Python supported the %*s formatter right about now...
from mir_eval.
We still need to merge the loaders. load_time_series loads two float columns. load_events loads one float column and optionally one string column. load_intervals loads two float columns and optionally one string column.
from mir_eval.
OK, the changes proposed above (#35 (comment)) have been implemented and all code/example usage has been updated to reflect this change.
@bmcfee @ejhumphrey @urinieto Important: io.load_intervals now does not return labels; if you want to load labeled intervals you need to use io.load_labeled_intervals. Same is true for load_events vs load_labeled_events, but AFAIK no one was using load_events for loading labels. Also, none of these functions will generate synthetic labels themselves. If you want synthetic labels like these functions used to return when no labels were present, use util.generate_labels. Take a look at the evaluators/example usage for updated usage examples.
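As a rough picture of the new split, the sketch below uses a stand-in for util.generate_labels so that it runs without mir_eval installed. The stand-in's signature and the '__' prefix are assumptions based on the numbered __%s keys discussed earlier in this thread, not a copy of the library code.

```python
def generate_labels_sketch(items, prefix='__'):
    """Stand-in for util.generate_labels (assumed behaviour): one unique,
    numbered synthetic label per item."""
    return ['{}{}'.format(prefix, n) for n in range(len(items))]


# Intervals as an unlabeled loader might return them
intervals = [(0.0, 1.0), (1.0, 2.5), (2.5, 4.0)]

# Labels are no longer produced by the loader; request them explicitly
labels = generate_labels_sketch(intervals)
```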
from mir_eval.