2011:Multiple Fundamental Frequency Estimation & Tracking Results
Introduction
These are the results for the 2011 running of the Multiple Fundamental Frequency Estimation and Tracking task. For background information about this task set, please refer to the 2011:Multiple Fundamental Frequency Estimation & Tracking page.
General Legend
Task 1: Multiple Fundamental Frequency Estimation (MF0E)
MF0E Overall Summary Results
Below are the average scores across 40 test files. The files come from three different sources: a woodwind quintet recording of bassoon, clarinet, horn, flute and oboe (UIUC); rendered MIDI from the RWC database donated by IRCAM; and a quartet recording of bassoon, clarinet, violin and saxophone donated by Dr. Bryan Pardo's Interactive Audio Lab (IAL). Twenty files come from five sections of the woodwind recording, where each section has four files ranging from polyphony 2 to polyphony 5; 12 files come from IAL, drawn from four different songs ranging from polyphony 2 to polyphony 4; and 8 files come from the RWC synthesized MIDI, drawn from two different songs ranging from polyphony 2 to polyphony 5.
Metric | BD1 | KD1 | LYC1 | RFF1 | RFF2 | YR1 | YR2 | YR3 | YR4
---|---|---|---|---|---|---|---|---|---
Accuracy | 0.574 | 0.634 | 0.474 | 0.492 | 0.485 | 0.662 | 0.683 | 0.653 | 0.678 |
Accuracy Chroma | 0.629 | 0.664 | 0.557 | 0.55 | 0.542 | 0.689 | 0.702 | 0.681 | 0.696 |
Detailed Results
System | Precision | Recall | Accuracy | Etot | Esubs | Emiss | Efa
---|---|---|---|---|---|---|---
BD1 | 0.637 | 0.683 | 0.574 | 0.530 | 0.204 | 0.113 | 0.213 | |
KD1 | 0.850 | 0.657 | 0.634 | 0.384 | 0.083 | 0.261 | 0.041 | |
LYC1 | 0.555 | 0.593 | 0.474 | 0.711 | 0.243 | 0.164 | 0.304 | |
RFF1 | 0.627 | 0.570 | 0.492 | 0.592 | 0.193 | 0.238 | 0.161 | |
RFF2 | 0.567 | 0.602 | 0.485 | 0.657 | 0.217 | 0.181 | 0.259 | |
YR1 | 0.732 | 0.800 | 0.662 | 0.433 | 0.094 | 0.106 | 0.233 | |
YR2 | 0.733 | 0.840 | 0.683 | 0.416 | 0.083 | 0.076 | 0.256 | |
YR3 | 0.714 | 0.804 | 0.653 | 0.458 | 0.100 | 0.096 | 0.263 | |
YR4 | 0.724 | 0.843 | 0.678 | 0.429 | 0.084 | 0.073 | 0.271 |
Detailed Chroma Results
Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluating).
System | Precision | Recall | Accuracy | Etot | Esubs | Emiss | Efa
---|---|---|---|---|---|---|---
BD1 | 0.700 | 0.754 | 0.629 | 0.460 | 0.134 | 0.113 | 0.213 | |
KD1 | 0.892 | 0.688 | 0.664 | 0.353 | 0.051 | 0.261 | 0.041 | |
LYC1 | 0.651 | 0.703 | 0.557 | 0.601 | 0.134 | 0.164 | 0.304 | |
RFF1 | 0.704 | 0.637 | 0.550 | 0.525 | 0.126 | 0.238 | 0.161 | |
RFF2 | 0.635 | 0.674 | 0.542 | 0.586 | 0.146 | 0.181 | 0.259 | |
YR1 | 0.761 | 0.833 | 0.689 | 0.400 | 0.061 | 0.106 | 0.233 | |
YR2 | 0.753 | 0.864 | 0.702 | 0.392 | 0.059 | 0.076 | 0.256 | |
YR3 | 0.744 | 0.840 | 0.681 | 0.422 | 0.064 | 0.096 | 0.263 | |
YR4 | 0.744 | 0.867 | 0.696 | 0.404 | 0.060 | 0.073 | 0.271 |
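The octave mapping behind these chroma scores can be sketched as follows; the helper names and the Hz-to-MIDI conversion are illustrative assumptions, not code from the MIREX evaluator:

```python
import math

def f0_to_midi(f0_hz: float) -> float:
    """Convert a fundamental frequency in Hz to a (fractional) MIDI pitch."""
    return 69.0 + 12.0 * math.log2(f0_hz / 440.0)

def to_chroma(f0_hz: float) -> int:
    """Fold an F0 into a single octave: its pitch class (0 = C, ..., 11 = B)."""
    return int(round(f0_to_midi(f0_hz))) % 12

# Octave errors are forgiven: A4 (440 Hz) and A3 (220 Hz) share a chroma.
assert to_chroma(440.0) == to_chroma(220.0)
```

Because substitution errors caused purely by octave confusion disappear under this mapping, the chroma accuracies above are uniformly higher than the plain accuracies.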
Individual Results Files for Task 1
AR1 = Chunghsin Yeh, Axel Roebel
AR2 = Chunghsin Yeh, Axel Roebel
AR3 = Chunghsin Yeh, Axel Roebel
AR4 = Chunghsin Yeh, Axel Roebel
BD1 = Emmanouil Benetos, Simon Dixon
CRCRV1 = F. Quesada et al.
CRCRV3 = F. Quesada et al.
DCL1 = Arnaud Dessein, Arshia Cont, Guillaume Lemaitre
DHP1 = Zhiyao Duan, Jinyu Han, Bryan Pardo
JW1 = Jun Wu et al.
JW2 = Jun Wu et al.
LYLC1 = C. T. Lee et al.
NNTOS1 = M. Nakano et al.
Info about the filenames
The filenames starting with part* come from the acoustic woodwind recording; the ones starting with RWC are synthesized. The legend for the instrument abbreviations is:
bs = bassoon, cl = clarinet, fl = flute, hn = horn, ob = oboe, vl = violin, cel = cello, gtr = guitar, sax = saxophone, bass = electric bass guitar
Run Times
TBA
Friedman tests for Multiple Fundamental Frequency Estimation (MF0E)
The Friedman test was run in MATLAB to test significant differences amongst systems with regard to the performance (accuracy) on individual files.
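A rough Python equivalent of that per-file test (using scipy rather than MATLAB, and with made-up accuracy values in place of the real per-file scores) might look like:

```python
# Illustrative accuracies: rows = 40 test files, columns = 9 systems.
# The values are random stand-ins, not the actual MIREX scores.
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(0)
accuracies = rng.uniform(0.4, 0.7, size=(40, 9))

# Friedman test: are there significant rank differences among systems
# when each file is treated as a repeated measure?
stat, p = friedmanchisquare(*accuracies.T)
print(f"Friedman chi-square = {stat:.3f}, p = {p:.4f}")
```

The pairwise comparisons in the table below then follow from a Tukey-Kramer HSD post-hoc analysis of the same per-file ranks (MATLAB's `multcompare` provides this directly).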
Tukey-Kramer HSD Multi-Comparison
TeamID | TeamID | Lowerbound | Mean | Upperbound | Significance |
---|---|---|---|---|---|
YR2 | YR4 | -0.9994 | 0.9000 | 2.7994 | FALSE |
YR2 | YR1 | -0.3494 | 1.5500 | 3.4494 | FALSE |
YR2 | YR3 | 1.0006 | 2.9000 | 4.7994 | TRUE |
YR2 | KD1 | 0.4006 | 2.3000 | 4.1994 | TRUE |
YR2 | BD1 | 1.6506 | 3.5500 | 5.4494 | TRUE |
YR2 | RFF1 | 3.7506 | 5.6500 | 7.5494 | TRUE |
YR2 | RFF2 | 4.0506 | 5.9500 | 7.8494 | TRUE |
YR2 | LYC1 | 3.8756 | 5.7750 | 7.6744 | TRUE |
YR4 | YR1 | -1.2494 | 0.6500 | 2.5494 | FALSE |
YR4 | YR3 | 0.1006 | 2.0000 | 3.8994 | TRUE |
YR4 | KD1 | -0.4994 | 1.4000 | 3.2994 | FALSE |
YR4 | BD1 | 0.7506 | 2.6500 | 4.5494 | TRUE |
YR4 | RFF1 | 2.8506 | 4.7500 | 6.6494 | TRUE |
YR4 | RFF2 | 3.1506 | 5.0500 | 6.9494 | TRUE |
YR4 | LYC1 | 2.9756 | 4.8750 | 6.7744 | TRUE |
YR1 | YR3 | -0.5494 | 1.3500 | 3.2494 | FALSE |
YR1 | KD1 | -1.1494 | 0.7500 | 2.6494 | FALSE |
YR1 | BD1 | 0.1006 | 2.0000 | 3.8994 | TRUE |
YR1 | RFF1 | 2.2006 | 4.1000 | 5.9994 | TRUE |
YR1 | RFF2 | 2.5006 | 4.4000 | 6.2994 | TRUE |
YR1 | LYC1 | 2.3256 | 4.2250 | 6.1244 | TRUE |
YR3 | KD1 | -2.4994 | -0.6000 | 1.2994 | FALSE |
YR3 | BD1 | -1.2494 | 0.6500 | 2.5494 | FALSE |
YR3 | RFF1 | 0.8506 | 2.7500 | 4.6494 | TRUE |
YR3 | RFF2 | 1.1506 | 3.0500 | 4.9494 | TRUE |
YR3 | LYC1 | 0.9756 | 2.8750 | 4.7744 | TRUE |
KD1 | BD1 | -0.6494 | 1.2500 | 3.1494 | FALSE |
KD1 | RFF1 | 1.4506 | 3.3500 | 5.2494 | TRUE |
KD1 | RFF2 | 1.7506 | 3.6500 | 5.5494 | TRUE |
KD1 | LYC1 | 1.5756 | 3.4750 | 5.3744 | TRUE |
BD1 | RFF1 | 0.2006 | 2.1000 | 3.9994 | TRUE |
BD1 | RFF2 | 0.5006 | 2.4000 | 4.2994 | TRUE |
BD1 | LYC1 | 0.3256 | 2.2250 | 4.1244 | TRUE |
RFF1 | RFF2 | -1.5994 | 0.3000 | 2.1994 | FALSE |
RFF1 | LYC1 | -1.7744 | 0.1250 | 2.0244 | FALSE |
RFF2 | LYC1 | -2.0744 | -0.1750 | 1.7244 | FALSE |
Task 2: Note Tracking (NT)
NT Mixed Set Overall Summary Results
This subtask is evaluated in two different ways. In the first setup, a returned note is counted as correct if its onset is within ±50 ms of a reference note's onset and its F0 is within ± a quarter tone of the reference note's F0; the returned offset values are ignored. In the second setup, in addition to the above requirements, a correct returned note must have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger.
A total of 34 files were used in this subtask: 16 from the woodwind recording, 8 from the IAL quartet recording and 6 from piano recordings.
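Under one plausible reading of these rules, the two matching criteria can be sketched as follows (the function names and comparison details are assumptions for illustration, not the official evaluation code):

```python
import math

ONSET_TOL = 0.050  # seconds (+-50 ms)

def pitch_matches(f0_est: float, f0_ref: float) -> bool:
    """True if the estimate is within a quarter tone (0.5 semitone) of the reference."""
    return abs(12.0 * math.log2(f0_est / f0_ref)) <= 0.5

def onset_matches(onset_est: float, onset_ref: float) -> bool:
    return abs(onset_est - onset_ref) <= ONSET_TOL

def offset_matches(offset_est: float, offset_ref: float, ref_duration: float) -> bool:
    """Offset tolerance: 20% of the reference note's duration, but at least 50 ms."""
    return abs(offset_est - offset_ref) <= max(0.2 * ref_duration, 0.050)

def note_matches(est, ref, check_offset: bool) -> bool:
    """est and ref are (onset, offset, f0) triples; setup 1 ignores offsets."""
    ok = onset_matches(est[0], ref[0]) and pitch_matches(est[2], ref[2])
    if check_offset:  # setup 2 adds the offset requirement
        ok = ok and offset_matches(est[1], ref[1], ref[1] - ref[0])
    return ok
```

This is why the "Onset Only" F-measures below are uniformly higher than the "Onset-Offset" ones: the second setup simply adds a constraint, so every onset-offset match is also an onset-only match.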
Metric | BD2 | BD3 | LYC1 | RFF1 | RFF2 | YR1 | YR3
---|---|---|---|---|---|---|---
Ave. F-Measure Onset-Offset | 0.2036 | 0.2077 | 0.2076 | 0.1767 | 0.1414 | 0.3493 | 0.3392 |
Ave. F-Measure Onset Only | 0.4465 | 0.4506 | 0.3862 | 0.4078 | 0.3564 | 0.5601 | 0.5465 |
Ave. F-Measure Chroma | 0.2307 | 0.2438 | 0.2573 | 0.2029 | 0.1655 | 0.3579 | 0.3470 |
Ave. F-Measure Onset Only Chroma | 0.5026 | 0.5232 | 0.4649 | 0.4566 | 0.3986 | 0.5647 | 0.5519 |
Detailed Results
System | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
BD2 | 0.206 | 0.212 | 0.204 | 0.856 |
BD3 | 0.200 | 0.230 | 0.208 | 0.853 |
LYC1 | 0.190 | 0.237 | 0.208 | 0.829 |
RFF1 | 0.143 | 0.248 | 0.177 | 0.864 |
RFF2 | 0.103 | 0.243 | 0.141 | 0.864 |
YR1 | 0.276 | 0.489 | 0.349 | 0.890 |
YR3 | 0.264 | 0.484 | 0.339 | 0.890 |
Detailed Chroma Results
Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluating).
System | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
BD2 | 0.232 | 0.242 | 0.231 | 0.854 |
BD3 | 0.232 | 0.273 | 0.244 | 0.852 |
LYC1 | 0.234 | 0.297 | 0.257 | 0.823 |
RFF1 | 0.165 | 0.283 | 0.203 | 0.864 |
RFF2 | 0.121 | 0.281 | 0.166 | 0.863 |
YR1 | 0.282 | 0.502 | 0.358 | 0.886 |
YR3 | 0.270 | 0.496 | 0.347 | 0.886 |
Results Based on Onset Only
System | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
BD2 | 0.460 | 0.451 | 0.447 | 0.692 |
BD3 | 0.444 | 0.483 | 0.451 | 0.695 |
LYC1 | 0.361 | 0.430 | 0.386 | 0.657 |
RFF1 | 0.339 | 0.550 | 0.408 | 0.617 |
RFF2 | 0.268 | 0.576 | 0.356 | 0.596 |
YR1 | 0.445 | 0.778 | 0.560 | 0.736 |
YR3 | 0.429 | 0.773 | 0.547 | 0.735 |
Chroma Results Based on Onset Only
System | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
BD2 | 0.517 | 0.510 | 0.503 | 0.677 |
BD3 | 0.512 | 0.565 | 0.523 | 0.679 |
LYC1 | 0.432 | 0.523 | 0.465 | 0.640 |
RFF1 | 0.380 | 0.615 | 0.457 | 0.599 |
RFF2 | 0.300 | 0.642 | 0.399 | 0.565 |
YR1 | 0.448 | 0.787 | 0.565 | 0.708 |
YR3 | 0.433 | 0.783 | 0.552 | 0.701 |
Friedman Tests for Note Tracking
The Friedman test was run in MATLAB to test significant differences amongst systems with regard to the F-measure on individual files.
Tukey-Kramer HSD Multi-Comparison for Task 2
TeamID | TeamID | Lowerbound | Mean | Upperbound | Significance |
---|---|---|---|---|---|
YR1 | YR3 | -0.8389 | 0.7059 | 2.2506 | FALSE |
YR1 | BD3 | 1.3670 | 2.9118 | 4.4565 | TRUE |
YR1 | LYC1 | 1.4259 | 2.9706 | 4.5153 | TRUE |
YR1 | BD2 | 1.4553 | 3.0000 | 4.5447 | TRUE |
YR1 | RFF1 | 1.9259 | 3.4706 | 5.0153 | TRUE |
YR1 | RFF2 | 2.8964 | 4.4412 | 5.9859 | TRUE |
YR3 | BD3 | 0.6611 | 2.2059 | 3.7506 | TRUE |
YR3 | LYC1 | 0.7200 | 2.2647 | 3.8094 | TRUE |
YR3 | BD2 | 0.7494 | 2.2941 | 3.8389 | TRUE |
YR3 | RFF1 | 1.2200 | 2.7647 | 4.3094 | TRUE |
YR3 | RFF2 | 2.1906 | 3.7353 | 5.2800 | TRUE |
BD3 | LYC1 | -1.4859 | 0.0588 | 1.6036 | FALSE |
BD3 | BD2 | -1.4565 | 0.0882 | 1.6330 | FALSE |
BD3 | RFF1 | -0.9859 | 0.5588 | 2.1036 | FALSE |
BD3 | RFF2 | -0.0153 | 1.5294 | 3.0741 | FALSE |
LYC1 | BD2 | -1.5153 | 0.0294 | 1.5741 | FALSE |
LYC1 | RFF1 | -1.0447 | 0.5000 | 2.0447 | FALSE |
LYC1 | RFF2 | -0.0741 | 1.4706 | 3.0153 | FALSE |
BD2 | RFF1 | -1.0741 | 0.4706 | 2.0153 | FALSE |
BD2 | RFF2 | -0.1036 | 1.4412 | 2.9859 | FALSE |
RFF1 | RFF2 | -0.5741 | 0.9706 | 2.5153 | FALSE |
NT Piano-Only Overall Summary Results
This subtask is evaluated in two different ways. In the first setup, a returned note is counted as correct if its onset is within ±50 ms of a reference note's onset and its F0 is within ± a quarter tone of the reference note's F0; the returned offset values are ignored. In the second setup, in addition to the above requirements, a correct returned note must have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger. Six piano recordings are evaluated separately for this subtask.
Metric | BD2 | BD3 | LYC1 | RFF1 | RFF2 | YR1 | YR3
---|---|---|---|---|---|---|---
Ave. F-Measure Onset-Offset | 0.1003 | 0.1136 | 0.1926 | 0.1941 | 0.1550 | 0.2127 | 0.1913 |
Ave. F-Measure Onset Only | 0.5263 | 0.5890 | 0.5260 | 0.5205 | 0.4435 | 0.6055 | 0.5881 |
Ave. F-Measure Chroma | 0.1098 | 0.1205 | 0.2068 | 0.2261 | 0.1944 | 0.1966 | 0.1800 |
Ave. F-Measure Onset Only Chroma | 0.5400 | 0.5996 | 0.5412 | 0.5645 | 0.4930 | 0.5547 | 0.5391 |
Detailed Results
System | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
BD2 | 0.113 | 0.091 | 0.100 | 0.818 |
BD3 | 0.127 | 0.103 | 0.114 | 0.819 |
LYC1 | 0.198 | 0.188 | 0.193 | 0.791 |
RFF1 | 0.182 | 0.210 | 0.194 | 0.796 |
RFF2 | 0.124 | 0.209 | 0.155 | 0.787 |
YR1 | 0.180 | 0.263 | 0.213 | 0.821 |
YR3 | 0.160 | 0.243 | 0.191 | 0.818 |
Detailed Chroma Results
Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluating).
System | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
BD2 | 0.123 | 0.100 | 0.110 | 0.817 |
BD3 | 0.135 | 0.109 | 0.120 | 0.818 |
LYC1 | 0.212 | 0.203 | 0.207 | 0.778 |
RFF1 | 0.211 | 0.246 | 0.226 | 0.794 |
RFF2 | 0.155 | 0.263 | 0.194 | 0.787 |
YR1 | 0.165 | 0.249 | 0.197 | 0.801 |
YR3 | 0.149 | 0.235 | 0.180 | 0.798 |
Results Based on Onset Only
System | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
BD2 | 0.588 | 0.479 | 0.526 | 0.522 |
BD3 | 0.663 | 0.532 | 0.589 | 0.523 |
LYC1 | 0.531 | 0.525 | 0.526 | 0.558 |
RFF1 | 0.487 | 0.563 | 0.521 | 0.558 |
RFF2 | 0.355 | 0.596 | 0.443 | 0.541 |
YR1 | 0.504 | 0.793 | 0.606 | 0.545 |
YR3 | 0.484 | 0.783 | 0.588 | 0.541 |
Chroma Results Based on Onset Only
System | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
BD2 | 0.603 | 0.492 | 0.540 | 0.521 |
BD3 | 0.675 | 0.541 | 0.600 | 0.519 |
LYC1 | 0.547 | 0.539 | 0.541 | 0.551 |
RFF1 | 0.528 | 0.612 | 0.565 | 0.552 |
RFF2 | 0.394 | 0.665 | 0.493 | 0.530 |
YR1 | 0.460 | 0.734 | 0.555 | 0.537 |
YR3 | 0.442 | 0.726 | 0.539 | 0.533 |
Individual Results Files for Task 2
AR5 = Chunghsin Yeh, Axel Roebel
AR6 = Chunghsin Yeh, Axel Roebel
CRCRV2 = F. Quesada et al.
CRCRV4 = F. Quesada et al.
DCL2 = Arnaud Dessein, Arshia Cont, Guillaume Lemaitre
DHP2 = Zhiyao Duan, Jinyu Han, Bryan Pardo
JW3 = Jun Wu et al.
JW4 = Jun Wu et al.
LYLC2 = C. T. Lee et al.
Task 3: Instrument Tracking
The same dataset was used as in Task 1. The evaluations were performed by first matching the detected contours one-to-one to the ground-truth contours; this is done by selecting the best-scoring pairs of detected and ground-truth contours. Any detected contours left unmatched to a ground-truth contour have all of their returned F0s counted as false positives, and any ground-truth contours left unmatched to a detected contour have all of their F0s counted as false negatives.
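One standard way to realize the best-scoring one-to-one pairing described above is the Hungarian algorithm; the sketch below uses scipy with an illustrative score matrix and is an assumed reconstruction, not the actual MIREX matching code:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# scores[i, j] = hypothetical match score between detected contour i
# and ground-truth contour j (higher = better match).
scores = np.array([[0.9, 0.1, 0.0],
                   [0.2, 0.8, 0.3]])

# The Hungarian algorithm minimizes cost, so negate to maximize total score.
det_idx, gt_idx = linear_sum_assignment(-scores)
matched = set(zip(det_idx.tolist(), gt_idx.tolist()))

# Unmatched detected contours would contribute false positives;
# unmatched ground-truth contours would contribute false negatives.
unmatched_det = set(range(scores.shape[0])) - set(det_idx.tolist())
unmatched_gt = set(range(scores.shape[1])) - set(gt_idx.tolist())
```

In practice the evaluator would presumably also reject pairings whose score is too low rather than force every contour into a match; that thresholding detail is omitted here.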
MF0It Detailed Results
Detailed Chroma Results
Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluating).