Introduction
This year we introduce a newly annotated polyphonic dataset. It covers a wider range of real-world music than the previous dataset, which has been in use since 2009. Specifically, the new dataset contains 3 clips of piano solo, 3 clips of string quartet, 2 clips of piano quintet, and 2 clips of violin sonata (violin with piano accompaniment), all selected from real-world recordings. Each clip is between 20 and 30 seconds long. The dataset was annotated using the method described in the following paper:
Li Su and Yi-Hsuan Yang, "Escaping from the Abyss of Manual Annotation: New Methodology of Building Polyphonic Datasets for Automatic Music Transcription," in Int. Symp. Computer Music Multidisciplinary Research (CMMR), June 2015.
As also mentioned in the paper, we did our best to correct errors in the preliminary annotation by hand (mostly mismatched onset and offset time stamps). Since some annotation errors may remain undetected, we have decided to make the data and the annotation publicly available after this year's MIREX results are announced. We encourage every participant to help us check the annotation. The results of each competing algorithm will be updated based on the revised annotation. We hope this gives participants more detailed information about how their algorithms behave on the dataset. Moreover, in this way we can join efforts to create a better dataset for research on multiple-F0 estimation and tracking.
General Legend
| Sub code | Submission name | Abstract | Contributors |
| BW1 | doMultiF0 | PDF | Emmanouil Benetos, Tillman Weyde |
| BW2 | NoteTracking1 | PDF | Emmanouil Benetos, Tillman Weyde |
| BW3 | NoteTracking2 | PDF | Emmanouil Benetos, Tillman Weyde |
| CB1 | Silvet1 | PDF | Chris Cannam, Emmanouil Benetos, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, and Dan Stowell |
| CB2 | Silvet2 | PDF | Chris Cannam, Emmanouil Benetos, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, and Dan Stowell |
| SY1 | MPE1 | PDF | Li Su, Yi-Hsuan Yang |
| SY2 | MPE2 | PDF | Li Su, Yi-Hsuan Yang |
| SY3 | MPE3 | PDF | Li Su, Yi-Hsuan Yang |
| SY4 | MPE4 | PDF | Li Su, Yi-Hsuan Yang |
Task 1: Multiple Fundamental Frequency Estimation (MF0E)
MF0E Overall Summary Results
| Metric | BW1 | CB1 | CB2 | SY1 | SY2 | SY3 | SY4 |
| Accuracy | 0.354 | 0.233 | 0.237 | 0.39 | 0.375 | 0.369 | 0.359 |
| Accuracy Chroma | 0.425 | 0.275 | 0.298 | 0.462 | 0.454 | 0.444 | 0.438 |
Detailed Results
| System | Precision | Recall | Accuracy | Etot | Esubs | Emiss | Efa |
| BW1 | 0.614 | 0.480 | 0.356 | 0.684 | 0.183 | 0.337 | 0.165 |
| CB1 | 0.617 | 0.315 | 0.259 | 0.736 | 0.166 | 0.520 | 0.051 |
| CB2 | 0.585 | 0.299 | 0.240 | 0.757 | 0.184 | 0.518 | 0.056 |
| SY1 | 0.516 | 0.626 | 0.385 | 0.779 | 0.242 | 0.133 | 0.404 |
| SY2 | 0.500 | 0.620 | 0.375 | 0.795 | 0.254 | 0.125 | 0.415 |
| SY3 | 0.535 | 0.567 | 0.369 | 0.742 | 0.241 | 0.192 | 0.310 |
| SY4 | 0.532 | 0.556 | 0.364 | 0.732 | 0.247 | 0.198 | 0.288 |
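The page does not define the columns of the detailed results table. As a rough illustration only, the sketch below computes them using the frame-level definitions commonly used in multiple-F0 evaluation; that these are the exact formulas behind the table is an assumption. The function name `frame_metrics` and the use of integer pitch labels per frame (instead of F0 values matched within a tolerance) are simplifications introduced here.

```python
from typing import Dict, List, Set


def frame_metrics(ref: List[Set[int]], est: List[Set[int]]) -> Dict[str, float]:
    """Frame-level scores from per-frame pitch sets (hypothetical helper).

    ref[t] and est[t] hold the reference and estimated pitches in frame t,
    here simplified to integer labels such as MIDI note numbers.
    """
    tp = fp = fn = 0
    subs = miss = fa = 0
    n_ref_total = 0
    for r, e in zip(ref, est):
        n_corr = len(r & e)          # correctly returned pitches in this frame
        n_ref, n_sys = len(r), len(e)
        tp += n_corr
        fp += n_sys - n_corr
        fn += n_ref - n_corr
        subs += min(n_ref, n_sys) - n_corr   # wrong pitch returned in place of a reference pitch
        miss += max(0, n_ref - n_sys)        # reference pitches with no counterpart
        fa += max(0, n_sys - n_ref)          # extra pitches beyond the reference count
        n_ref_total += n_ref

    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "accuracy": ratio(tp, tp + fp + fn),
        "Esubs": ratio(subs, n_ref_total),
        "Emiss": ratio(miss, n_ref_total),
        "Efa": ratio(fa, n_ref_total),
        "Etot": ratio(subs + miss + fa, n_ref_total),
    }


# Toy example with three frames (made-up data, not from the evaluation set).
scores = frame_metrics(
    ref=[{60, 64}, {60}, {62}],
    est=[{60, 65}, {60, 67}, set()],
)
```

Under these definitions Etot is the sum of Esubs, Emiss and Efa, which agrees with the rows of the table up to rounding.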
Detailed Chroma Results
Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluation).
| System | Precision | Recall | Accuracy | Etot | Esubs | Emiss | Efa |
| BW1 | 0.722 | 0.509 | 0.425 | 0.591 | 0.097 | 0.394 | 0.1 |
| CB1 | 0.705 | 0.311 | 0.275 | 0.718 | 0.101 | 0.587 | 0.029 |
| CB2 | 0.732 | 0.335 | 0.298 | 0.696 | 0.091 | 0.574 | 0.031 |
| SY1 | 0.601 | 0.667 | 0.462 | 0.61 | 0.166 | 0.168 | 0.277 |
| SY2 | 0.586 | 0.669 | 0.454 | 0.626 | 0.178 | 0.154 | 0.294 |
| SY3 | 0.626 | 0.604 | 0.444 | 0.596 | 0.16 | 0.236 | 0.201 |
| SY4 | 0.623 | 0.596 | 0.438 | 0.598 | 0.167 | 0.237 | 0.194 |
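The chroma results above rely on folding every F0 into a single octave. A minimal sketch of one way to do that is shown below; the A4 = 440 Hz reference, the rounding to the nearest semitone, and the helper names `hz_to_midi` and `hz_to_chroma` are assumptions made for illustration, and the actual evaluation code may fold pitches differently.

```python
import math


def hz_to_midi(f0_hz: float) -> float:
    """Convert a frequency in Hz to a (fractional) MIDI note number, A4 = 440 Hz."""
    return 69.0 + 12.0 * math.log2(f0_hz / 440.0)


def hz_to_chroma(f0_hz: float) -> int:
    """Pitch class in 0..11 (0 = C), after rounding to the nearest semitone."""
    return int(round(hz_to_midi(f0_hz))) % 12


# A3, A4 and A5 all fall into the same pitch class once the octave is discarded.
chroma_of_a = [hz_to_chroma(f) for f in (220.0, 440.0, 880.0)]
```

With this folding, an estimate that is one octave off counts as correct, which is why the chroma scores are never lower than the corresponding full-pitch scores.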
Individual Results Files for Task 1
BW1= Emmanouil Benetos, Tillman Weyde
CB1= Chris Cannam, Emmanouil Benetos, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Dan Stowell
CB2= Chris Cannam, Emmanouil Benetos, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Dan Stowell
SY1= Li Su, Yi-Hsuan Yang
SY2= Li Su, Yi-Hsuan Yang
SY3= Li Su, Yi-Hsuan Yang
SY4= Li Su, Yi-Hsuan Yang
Info about the filenames
The first two letters of the filename represent the music type:
PQ = piano quintet,
PS = piano solo,
SQ = string quartet,
VS = violin sonata (with piano accompaniment)
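When grouping per-file results by music type, the prefix convention above can be applied with a small lookup. This is only a sketch: the function name `music_type` and the example filename are hypothetical and do not come from the results files.

```python
# Two-letter filename prefixes as listed above.
MUSIC_TYPES = {
    "PQ": "piano quintet",
    "PS": "piano solo",
    "SQ": "string quartet",
    "VS": "violin sonata (with piano accompaniment)",
}


def music_type(filename: str) -> str:
    """Return the music type encoded in the first two letters of a filename."""
    return MUSIC_TYPES.get(filename[:2].upper(), "unknown")


# "PS_example.wav" is a made-up name used only to show the call.
kind = music_type("PS_example.wav")
```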
Run Times
Friedman tests for Multiple Fundamental Frequency Estimation (MF0E)
The Friedman test was run in MATLAB to test for significant differences among systems in their accuracy on individual files.
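For readers who want to see what the test operates on, the following sketch ranks systems within each file and computes the Friedman chi-square statistic from the rank sums. It is a plain-Python illustration, not the MATLAB procedure used for these results: it omits the tie correction that statistical packages usually apply, and the function names and toy scores are made up.

```python
from typing import List, Tuple


def average_ranks(scores: List[float]) -> List[float]:
    """Rank the scores of one file (1 = lowest); tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0   # average of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman(per_file: List[List[float]]) -> Tuple[List[float], float]:
    """per_file[f][s] is the score of system s on file f.

    Returns the mean rank of each system and the Friedman chi-square
    statistic (without tie correction).
    """
    n = len(per_file)        # number of files (blocks)
    k = len(per_file[0])     # number of systems (treatments)
    rank_sums = [0.0] * k
    for row in per_file:
        for s, r in enumerate(average_ranks(row)):
            rank_sums[s] += r
    mean_ranks = [total / n for total in rank_sums]
    chi2 = 12.0 / (n * k * (k + 1)) * sum(t * t for t in rank_sums) - 3.0 * n * (k + 1)
    return mean_ranks, chi2


# Toy accuracies for three systems on three files (not real results).
mean_ranks, chi2 = friedman([
    [0.1, 0.2, 0.3],
    [0.2, 0.3, 0.4],
    [0.1, 0.3, 0.2],
])
```

If the Mean column in the multi-comparison table is read as a difference between the mean ranks of the two systems in a row, which is an interpretation rather than something the page states, then `mean_ranks` is the quantity those differences are taken from.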
Tukey-Kramer HSD Multi-Comparison
| TeamID | TeamID | Lowerbound | Mean | Upperbound | Significance |
| SY1 | SY2 | -2.1483 | 0.7000 | 3.5483 | FALSE |
| SY1 | SY3 | -1.3483 | 1.5000 | 4.3483 | FALSE |
| SY1 | SY4 | -1.5483 | 1.3000 | 4.1483 | FALSE |
| SY1 | BW1 | -1.6483 | 1.2000 | 4.0483 | FALSE |
| SY1 | CB1 | 1.1517 | 4.0000 | 6.8483 | TRUE |
| SY1 | CB2 | 1.7517 | 4.6000 | 7.4483 | TRUE |
| SY2 | SY3 | -2.0483 | 0.8000 | 3.6483 | FALSE |
| SY2 | SY4 | -2.2483 | 0.6000 | 3.4483 | FALSE |
| SY2 | BW1 | -2.3483 | 0.5000 | 3.3483 | FALSE |
| SY2 | CB1 | 0.4517 | 3.3000 | 6.1483 | TRUE |
| SY2 | CB2 | 1.0517 | 3.9000 | 6.7483 | TRUE |
| SY3 | SY4 | -3.0483 | -0.2000 | 2.6483 | FALSE |
| SY3 | BW1 | -3.1483 | -0.3000 | 2.5483 | FALSE |
| SY3 | CB1 | -0.3483 | 2.5000 | 5.3483 | FALSE |
| SY3 | CB2 | 0.2517 | 3.1000 | 5.9483 | TRUE |
| SY4 | BW1 | -2.9483 | -0.1000 | 2.7483 | FALSE |
| SY4 | CB1 | -0.1483 | 2.7000 | 5.5483 | FALSE |
| SY4 | CB2 | 0.4517 | 3.3000 | 6.1483 | TRUE |
| BW1 | CB1 | -0.0483 | 2.8000 | 5.6483 | FALSE |
| BW1 | CB2 | 0.5517 | 3.4000 | 6.2483 | TRUE |
| CB1 | CB2 | -2.2483 | 0.6000 | 3.4483 | FALSE |
Task 2: Note Tracking (NT)
NT Mixed Set Overall Summary Results
This subtask is evaluated in two different ways. In the first setup, a returned note is assumed correct if its onset is within ±50 ms of a reference note and its F0 is within a quarter tone of the corresponding reference note, ignoring the returned offset values. In the second setup, on top of the above requirements, a correct returned note is required to have an offset value within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger.
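The two matching setups can be sketched as a per-pair check between one returned note and one reference note. This is an illustration under stated assumptions, not the evaluation code: the `Note` class and function names are introduced here, a quarter tone is taken as 50 cents, and the pairing of returned notes with reference notes (so that each reference note is matched at most once) is left out.

```python
import math
from dataclasses import dataclass


@dataclass
class Note:
    onset: float   # seconds
    offset: float  # seconds
    f0: float      # Hz


def pitch_matches(est: Note, ref: Note) -> bool:
    """F0 within a quarter tone (50 cents) of the reference."""
    return abs(1200.0 * math.log2(est.f0 / ref.f0)) <= 50.0


def onset_matches(est: Note, ref: Note, tol: float = 0.05) -> bool:
    """Onset within 50 ms of the reference onset."""
    return abs(est.onset - ref.onset) <= tol


def offset_matches(est: Note, ref: Note) -> bool:
    """Offset within 20% of the reference duration or 50 ms, whichever is larger."""
    tol = max(0.2 * (ref.offset - ref.onset), 0.05)
    return abs(est.offset - ref.offset) <= tol


def is_correct(est: Note, ref: Note, use_offset: bool) -> bool:
    """First setup: use_offset=False. Second setup: use_offset=True."""
    ok = onset_matches(est, ref) and pitch_matches(est, ref)
    if use_offset:
        ok = ok and offset_matches(est, ref)
    return ok


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    total = precision + recall
    return 2.0 * precision * recall / total if total else 0.0


# Made-up notes: the estimate starts 30 ms late and ends 300 ms late.
ref_note = Note(onset=1.0, offset=2.0, f0=440.0)
est_note = Note(onset=1.03, offset=2.3, f0=445.0)
onset_only_ok = is_correct(est_note, ref_note, use_offset=False)
onset_offset_ok = is_correct(est_note, ref_note, use_offset=True)
```

In this toy case the note passes the onset-only check but fails the onset-offset check, which mirrors why the onset-offset F-measures reported below are much lower than the onset-only ones.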
| Metric | BW2 | BW3 | CB1 | CB2 | SY1 | SY2 | SY3 | SY4 |
| Ave. F-Measure Onset-Offset | 0.0752 | 0.0652 | 0.0562 | 0.0404 | 0.0485 | 0.0416 | 0.0499 | 0.0461 |
| Ave. F-Measure Onset Only | 0.3190 | 0.2855 | 0.2267 | 0.1572 | 0.2338 | 0.2278 | 0.2248 | 0.2223 |
| Ave. F-Measure Chroma | 0.0911 | 0.0822 | 0.0707 | 0.0588 | 0.0620 | 0.0542 | 0.0665 | 0.0630 |
| Ave. F-Measure Onset Only Chroma | 0.3625 | 0.3344 | 0.2637 | 0.2019 | 0.2790 | 0.2786 | 0.2757 | 0.2770 |
Detailed Results
| System | Precision | Recall | Ave. F-measure | Ave. Overlap |
| BW2 | 0.087 | 0.070 | 0.075 | 0.764 |
| BW3 | 0.080 | 0.058 | 0.065 | 0.762 |
| CB1 | 0.070 | 0.048 | 0.056 | 0.752 |
| CB2 | 0.044 | 0.040 | 0.040 | 0.836 |
| SY1 | 0.042 | 0.060 | 0.049 | 0.755 |
| SY2 | 0.036 | 0.052 | 0.042 | 0.837 |
| SY3 | 0.041 | 0.069 | 0.050 | 0.836 |
| SY4 | 0.039 | 0.063 | 0.046 | 0.833 |
Detailed Chroma Results
Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluation).
| System | Precision | Recall | Ave. F-measure | Ave. Overlap |
| BW2 | 0.108 | 0.083 | 0.091 | 0.763 |
| BW3 | 0.101 | 0.073 | 0.082 | 0.755 |
| CB1 | 0.09 | 0.06 | 0.071 | 0.752 |
| CB2 | 0.067 | 0.057 | 0.059 | 0.83 |
| SY1 | 0.054 | 0.077 | 0.062 | 0.826 |
| SY2 | 0.048 | 0.067 | 0.054 | 0.834 |
| SY3 | 0.055 | 0.09 | 0.067 | 0.832 |
| SY4 | 0.054 | 0.085 | 0.063 | 0.826 |
Results Based on Onset Only
| System | Precision | Recall | Ave. F-measure | Ave. Overlap |
| BW2 | 0.367 | 0.298 | 0.319 | 0.541 |
| BW3 | 0.341 | 0.260 | 0.286 | 0.523 |
| CB1 | 0.289 | 0.195 | 0.227 | 0.509 |
| CB2 | 0.182 | 0.152 | 0.157 | 0.511 |
| SY1 | 0.206 | 0.291 | 0.234 | 0.490 |
| SY2 | 0.201 | 0.291 | 0.228 | 0.478 |
| SY3 | 0.190 | 0.301 | 0.225 | 0.495 |
| SY4 | 0.193 | 0.296 | 0.222 | 0.499 |
Chroma Results Based on Onset Only
| System | Precision | Recall | Ave. F-measure | Ave. Overlap |
| BW2 | 0.420 | 0.337 | 0.363 | 0.512 |
| BW3 | 0.402 | 0.305 | 0.334 | 0.500 |
| CB1 | 0.340 | 0.226 | 0.264 | 0.490 |
| CB2 | 0.238 | 0.192 | 0.202 | 0.510 |
| SY1 | 0.247 | 0.354 | 0.279 | 0.462 |
| SY2 | 0.245 | 0.364 | 0.279 | 0.441 |
| SY3 | 0.234 | 0.370 | 0.276 | 0.462 |
| SY4 | 0.241 | 0.368 | 0.277 | 0.453 |
Run Times
Friedman Tests for Note Tracking
The Friedman test was run in MATLAB to test for significant differences among systems in their F-measure on individual files.
Tukey-Kramer HSD Multi-Comparison for Task2
| TeamID | TeamID | Lowerbound | Mean | Upperbound | Significance |
| BW2 | BW3 | -1.1202 | 2.2000 | 5.5202 | FALSE |
| BW2 | SY1 | -0.4202 | 2.9000 | 6.2202 | FALSE |
| BW2 | SY2 | 0.1798 | 3.5000 | 6.8202 | TRUE |
| BW2 | CB1 | 0.8798 | 4.2000 | 7.5202 | TRUE |
| BW2 | SY3 | -0.3202 | 3.0000 | 6.3202 | FALSE |
| BW2 | SY4 | 0.0798 | 3.4000 | 6.7202 | TRUE |
| BW2 | CB2 | 2.2798 | 5.6000 | 8.9202 | TRUE |
| BW3 | SY1 | -2.6202 | 0.7000 | 4.0202 | FALSE |
| BW3 | SY2 | -2.0202 | 1.3000 | 4.6202 | FALSE |
| BW3 | CB1 | -1.3202 | 2.0000 | 5.3202 | FALSE |
| BW3 | SY3 | -2.5202 | 0.8000 | 4.1202 | FALSE |
| BW3 | SY4 | -2.1202 | 1.2000 | 4.5202 | FALSE |
| BW3 | CB2 | 0.0798 | 3.4000 | 6.7202 | TRUE |
| SY1 | SY2 | -2.7202 | 0.6000 | 3.9202 | FALSE |
| SY1 | CB1 | -2.0202 | 1.3000 | 4.6202 | FALSE |
| SY1 | SY3 | -3.2202 | 0.1000 | 3.4202 | FALSE |
| SY1 | SY4 | -2.8202 | 0.5000 | 3.8202 | FALSE |
| SY1 | CB2 | -0.6202 | 2.7000 | 6.0202 | FALSE |
| SY2 | CB1 | -2.6202 | 0.7000 | 4.0202 | FALSE |
| SY2 | SY3 | -3.8202 | -0.5000 | 2.8202 | FALSE |
| SY2 | SY4 | -3.4202 | -0.1000 | 3.2202 | FALSE |
| SY2 | CB2 | -1.2202 | 2.1000 | 5.4202 | FALSE |
| CB1 | SY3 | -4.5202 | -1.2000 | 2.1202 | FALSE |
| CB1 | SY4 | -4.1202 | -0.8000 | 2.5202 | FALSE |
| CB1 | CB2 | -1.9202 | 1.4000 | 4.7202 | FALSE |
| SY3 | SY4 | -2.9202 | 0.4000 | 3.7202 | FALSE |
| SY3 | CB2 | -0.7202 | 2.6000 | 5.9202 | FALSE |
| SY4 | CB2 | -1.1202 | 2.2000 | 5.5202 | FALSE |
NT Piano-Only Overall Summary Results
This subtask is evaluated in two different ways. In the first setup, a returned note is assumed correct if its onset is within ±50 ms of a reference note and its F0 is within a quarter tone of the corresponding reference note, ignoring the returned offset values. In the second setup, on top of the above requirements, a correct returned note is required to have an offset value within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger.
6 piano recordings are evaluated separately for this subtask.
| Metric | BW2 | BW3 | CB1 | CB2 | SY1 | SY2 | SY3 | SY4 |
| Ave. F-Measure Onset-Offset | 0.0993 | 0.0873 | 0.0789 | 0.0597 | 0.0570 | 0.0404 | 0.0691 | 0.0570 |
| Ave. F-Measure Onset Only | 0.5000 | 0.4751 | 0.3582 | 0.2305 | 0.3026 | 0.2840 | 0.2823 | 0.2730 |
| Ave. F-Measure Chroma | 0.1106 | 0.1072 | 0.0862 | 0.0747 | 0.0687 | 0.0517 | 0.0816 | 0.0706 |
| Ave. F-Measure Onset Only Chroma | 0.5200 | 0.5062 | 0.3720 | 0.2500 | 0.3259 | 0.3191 | 0.3104 | 0.3076 |
Detailed Results
| System | Precision | Recall | Ave. F-measure | Ave. Overlap |
| BW2 | 0.095 | 0.106 | 0.099 | 0.847 |
| BW3 | 0.087 | 0.090 | 0.087 | 0.844 |
| CB1 | 0.086 | 0.073 | 0.079 | 0.862 |
| CB2 | 0.056 | 0.066 | 0.060 | 0.822 |
| SY1 | 0.045 | 0.080 | 0.057 | 0.837 |
| SY2 | 0.031 | 0.060 | 0.040 | 0.831 |
| SY3 | 0.051 | 0.112 | 0.069 | 0.843 |
| SY4 | 0.041 | 0.101 | 0.057 | 0.835 |
Detailed Chroma Results
Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluation).
| System | Precision | Recall | Ave. F-measure | Ave. Overlap |
| BW2 | 0.107 | 0.117 | 0.111 | 0.844 |
| BW3 | 0.107 | 0.110 | 0.107 | 0.835 |
| CB1 | 0.094 | 0.081 | 0.086 | 0.854 |
| CB2 | 0.071 | 0.081 | 0.075 | 0.828 |
| SY1 | 0.054 | 0.097 | 0.069 | 0.830 |
| SY2 | 0.040 | 0.076 | 0.052 | 0.823 |
| SY3 | 0.060 | 0.132 | 0.082 | 0.839 |
| SY4 | 0.052 | 0.123 | 0.071 | 0.827 |
Results Based on Onset Only
| System | Precision | Recall | Ave. F-measure | Ave. Overlap |
| BW2 | 0.495 | 0.512 | 0.500 | 0.548 |
| BW3 | 0.483 | 0.475 | 0.475 | 0.541 |
| CB1 | 0.399 | 0.328 | 0.358 | 0.558 |
| CB2 | 0.220 | 0.248 | 0.231 | 0.539 |
| SY1 | 0.239 | 0.420 | 0.303 | 0.528 |
| SY2 | 0.222 | 0.419 | 0.284 | 0.495 |
| SY3 | 0.210 | 0.439 | 0.282 | 0.548 |
| SY4 | 0.206 | 0.438 | 0.273 | 0.536 |
Chroma Results Based on Onset Only
| System | Precision | Recall | Ave. F-measure | Ave. Overlap |
| BW2 | 0.516 | 0.532 | 0.520 | 0.537 |
| BW3 | 0.514 | 0.506 | 0.506 | 0.544 |
| CB1 | 0.415 | 0.340 | 0.372 | 0.548 |
| CB2 | 0.240 | 0.268 | 0.250 | 0.562 |
| SY1 | 0.257 | 0.451 | 0.326 | 0.520 |
| SY2 | 0.249 | 0.472 | 0.319 | 0.476 |
| SY3 | 0.232 | 0.480 | 0.310 | 0.531 |
| SY4 | 0.233 | 0.490 | 0.308 | 0.509 |
Individual Results Files for Task 2
BW2= Emmanouil Benetos, Tillman Weyde
BW3= Emmanouil Benetos, Tillman Weyde
CB1= Chris Cannam, Emmanouil Benetos, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Dan Stowell
CB2= Chris Cannam, Emmanouil Benetos, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Dan Stowell
SY1= Li Su, Yi-Hsuan Yang
SY2= Li Su, Yi-Hsuan Yang
SY3= Li Su, Yi-Hsuan Yang
SY4= Li Su, Yi-Hsuan Yang