2012:Multiple Fundamental Frequency Estimation & Tracking Results

From MIREX Wiki
Revision as of 20:30, 2 October 2012 by Mertbay (talk | contribs) (General Legend)

Introduction

These are the results for the 2012 running of the Multiple Fundamental Frequency Estimation and Tracking task. For background information about this task set, please refer to the 2012:Multiple Fundamental Frequency Estimation & Tracking page.

General Legend

Sub code Submission name Abstract Contributors
BD1 BenetosDixon MultiF0 PDF Emmanouil Benetos, Simon Dixon
BD2 BenetosDixon NoteTracking1 PDF Emmanouil Benetos, Simon Dixon
BD3 BenetosDixon NoteTracking2 PDF Emmanouil Benetos, Simon Dixon
CPG1 BenetosDixon NoteTracking2 PDF Zhuo Chen, Daniel P.W. Ellis, Graham Grindlay
KD1 Karin Dressler PDF Karin Dressler

Task 1: Multiple Fundamental Frequency Estimation (MF0E)

MF0E Overall Summary Results

Below are the average scores across 40 test files. The files come from three sources: a woodwind quintet recording of bassoon, clarinet, horn, flute and oboe (UIUC); rendered MIDI using the RWC database, donated by IRCAM; and a quartet recording of bassoon, clarinet, violin and sax donated by Dr. Bryan Pardo's Interactive Audio Lab (IAL). 20 files come from 5 sections of the woodwind recording, where each section has 4 files ranging in polyphony from 2 to 5. 12 files come from IAL, drawn from 4 different songs ranging in polyphony from 2 to 4. 8 files come from RWC synthesized MIDI, drawn from 2 different songs ranging in polyphony from 2 to 5.

BD1 CPG1 CPG2 CPG3 FBR1 KD1 KD2
Accuracy 0.579 0.272 0.268 0.268 0.561 0.641 0.642
Accuracy Chroma 0.603 0.311 0.306 0.308 0.573 0.673 0.668


Detailed Results

Precision Recall Accuracy Etot Esubs Emiss Efa
BD1 0.644 0.719 0.579 0.562 0.159 0.122 0.281
CPG1 0.584 0.278 0.272 0.744 0.193 0.529 0.022
CPG2 0.576 0.274 0.268 0.750 0.195 0.533 0.021
CPG3 0.574 0.274 0.268 0.748 0.197 0.529 0.022
FBR1 0.576 0.883 0.561 0.764 0.080 0.037 0.648
KD1 0.844 0.664 0.641 0.378 0.087 0.249 0.043
KD2 0.855 0.667 0.642 0.376 0.077 0.257 0.043
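The page does not define the columns above. The following is a minimal sketch that assumes the frame-level definitions commonly used for multi-F0 evaluation, where each frame compares the set of reference pitches with the set of returned pitches. Under that assumption Etot is the sum of Esubs, Emiss and Efa, which agrees with the table (for BD1, 0.159 + 0.122 + 0.281 = 0.562). The function name `frame_metrics` and the toy frames are hypothetical and are not taken from the MIREX evaluation code.

```python
def frame_metrics(ref_frames, est_frames):
    """Frame-level scores from two equal-length lists of pitch sets.

    Each list element is the set of pitches (e.g. MIDI note numbers)
    active in one analysis frame. Assumes at least one reference pitch
    and one returned pitch overall, so no denominator is zero.
    """
    tp = fp = fn = 0
    subs = miss = fa = n_ref_total = 0
    for ref, est in zip(ref_frames, est_frames):
        n_ref, n_sys = len(ref), len(est)
        n_corr = len(ref & est)                 # correctly returned pitches
        tp += n_corr
        fp += n_sys - n_corr
        fn += n_ref - n_corr
        subs += min(n_ref, n_sys) - n_corr      # wrong pitch in place of a right one
        miss += max(0, n_ref - n_sys)           # too few pitches returned
        fa += max(0, n_sys - n_ref)             # too many pitches returned
        n_ref_total += n_ref
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "accuracy": tp / (tp + fp + fn),
        "e_subs": subs / n_ref_total,
        "e_miss": miss / n_ref_total,
        "e_fa": fa / n_ref_total,
        "e_tot": (subs + miss + fa) / n_ref_total,
    }


# Made-up three-frame example (not MIREX data).
ref = [{60, 64}, {60, 64, 67}, {62}]
est = [{60, 65}, {60}, {62, 69}]
scores = frame_metrics(ref, est)
print(scores)
```

Note that accuracy penalizes both missed and extra pitches, which is why a system with very high recall but many false alarms (such as FBR1 above) can still score lower than a more conservative one.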


Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e. all F0s are mapped to a single octave before evaluating).
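As an illustration of mapping F0s to a single octave, the sketch below converts a frequency in Hz to a pitch class between 0 and 11. It assumes equal temperament with A4 = 440 Hz and rounds to the nearest semitone; the function name `f0_to_chroma` is hypothetical, and the actual evaluation code may handle tolerances differently.

```python
import math


def f0_to_chroma(f0_hz, ref_hz=440.0):
    """Map an F0 in Hz to a pitch class 0-11 (0 = C), discarding the octave."""
    # MIDI note number rounded to the nearest semitone (A4 = 69 at ref_hz).
    midi = round(69 + 12 * math.log2(f0_hz / ref_hz))
    return midi % 12


# A4 (440 Hz) and A3 (220 Hz) are an octave apart, so they share a chroma.
print(f0_to_chroma(440.0), f0_to_chroma(220.0))  # prints "9 9"
```

With this mapping an octave error no longer counts as a substitution, which is consistent with Esubs dropping in the chroma table while Emiss and Efa stay the same.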

Precision Recall Accuracy Etot Esubs Emiss Efa
BD1 0.672 0.756 0.603 0.525 0.123 0.122 0.281
CPG1 0.669 0.318 0.311 0.704 0.153 0.529 0.022
CPG2 0.663 0.314 0.306 0.710 0.155 0.533 0.021
CPG3 0.663 0.315 0.308 0.707 0.156 0.529 0.022
FBR1 0.589 0.905 0.573 0.743 0.058 0.037 0.648
KD1 0.889 0.698 0.673 0.345 0.053 0.249 0.043
KD2 0.891 0.694 0.668 0.349 0.050 0.257 0.043


Individual Results Files for Task 1

BD1 = Emmanouil Benetos, Simon Dixon
KD1 = Karin Dressler
LYC1 = Cheng-Te Lee, Yi-Hsuan Yang, Homer Chen
RFF1 = Gustavo Reis, Francisco Fernandéz, Aníbal Ferreira
RFF2 = Gustavo Reis, Francisco Fernandéz, Aníbal Ferreira
YR1 = Chunghsin YEH, Axel Roebel
YR2 = Chunghsin YEH, Axel Roebel
YR3 = Chunghsin YEH, Axel Roebel
YR4 = Chunghsin YEH, Axel Roebel


Info about the filenames

Filenames starting with part* come from the acoustic woodwind recording; those starting with RWC are synthesized. The instrument abbreviations are:

bs = bassoon, cl = clarinet, fl = flute, hn = horn, ob = oboe, vl = violin, cel = cello, gtr = guitar, sax = saxophone, bass = electric bass guitar

Run Times

Run-time data are not available: the file /nema-raid/www/mirex/results/2012/mf0/est/runtimes_mf0_2011.csv was not found.

Friedman tests for Multiple Fundamental Frequency Estimation (MF0E)

The Friedman test was run in MATLAB to test for significant differences among systems with regard to their performance (accuracy) on individual files.
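The Friedman test works on ranks: within each file the systems are ranked by score, and the ranks are then averaged per system. The sketch below computes only those mean ranks in plain Python; it does not compute the test statistic or a p-value, and it is not the MATLAB code used for this page. It assumes that rank 1 goes to the lowest score and that ties share the average rank. The system names and scores are made up.

```python
def mean_ranks(scores):
    """Mean within-file rank per system.

    scores maps a system name to a list of per-file scores, with all
    lists in the same file order. Rank 1 is the lowest score in a file;
    tied systems share the average of the ranks they span.
    """
    systems = list(scores)
    n_files = len(scores[systems[0]])
    totals = {s: 0.0 for s in systems}
    for i in range(n_files):
        vals = {s: scores[s][i] for s in systems}
        for s in systems:
            lower = sum(1 for t in systems if vals[t] < vals[s])
            equal = sum(1 for t in systems if vals[t] == vals[s])  # includes s itself
            totals[s] += lower + (equal + 1) / 2
    return {s: totals[s] / n_files for s in systems}


# Hypothetical per-file accuracies for three systems on three files.
example = {
    "A": [0.9, 0.8, 0.7],
    "B": [0.5, 0.8, 0.6],
    "C": [0.1, 0.2, 0.65],
}
ranks = mean_ranks(example)
print(ranks)
```

Because ranks are computed per file, a system that is consistently slightly better across many files can end up with a clearly higher mean rank even when the average scores are close.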

Tukey-Kramer HSD Multi-Comparison

TeamID TeamID Lowerbound Mean Upperbound Significance
KD2 KD1 -1.4110 0.0125 1.4360 FALSE
KD2 BD1 -0.5610 0.8625 2.2860 FALSE
KD2 FBR1 0.0515 1.4750 2.8985 TRUE
KD2 CPG1 2.2265 3.6500 5.0735 TRUE
KD2 CPG3 2.5015 3.9250 5.3485 TRUE
KD2 CPG2 2.5640 3.9875 5.4110 TRUE
KD1 BD1 -0.5735 0.8500 2.2735 FALSE
KD1 FBR1 0.0390 1.4625 2.8860 TRUE
KD1 CPG1 2.2140 3.6375 5.0610 TRUE
KD1 CPG3 2.4890 3.9125 5.3360 TRUE
KD1 CPG2 2.5515 3.9750 5.3985 TRUE
BD1 FBR1 -0.8110 0.6125 2.0360 FALSE
BD1 CPG1 1.3640 2.7875 4.2110 TRUE
BD1 CPG3 1.6390 3.0625 4.4860 TRUE
BD1 CPG2 1.7015 3.1250 4.5485 TRUE
FBR1 CPG1 0.7515 2.1750 3.5985 TRUE
FBR1 CPG3 1.0265 2.4500 3.8735 TRUE
FBR1 CPG2 1.0890 2.5125 3.9360 TRUE
CPG1 CPG3 -1.1485 0.2750 1.6985 FALSE
CPG1 CPG2 -1.0860 0.3375 1.7610 FALSE
CPG3 CPG2 -1.3610 0.0625 1.4860 FALSE
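One way to read the table above: the Mean column is the midpoint of the two bounds, and a pair is marked TRUE exactly when the interval between Lowerbound and Upperbound excludes zero. The short sketch below checks this reading against three rows copied from the table; the helper name `is_significant` is hypothetical.

```python
def is_significant(lower, upper):
    """A pairwise difference is significant if its interval excludes zero."""
    return lower > 0 or upper < 0


# (team, team, lower bound, mean, upper bound, significance) from the table above.
rows = [
    ("KD2", "KD1", -1.4110, 0.0125, 1.4360, False),
    ("KD2", "FBR1", 0.0515, 1.4750, 2.8985, True),
    ("BD1", "CPG1", 1.3640, 2.7875, 4.2110, True),
]

for a, b, lower, mean, upper, reported in rows:
    midpoint = (lower + upper) / 2
    print(a, b, round(midpoint, 4), is_significant(lower, upper), reported)
```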


Figure: Accuracy Per Song Friedman Mean Ranks (task1.friedman.Friedman Mean Ranks.png)

Task 2: Note Tracking (NT)

NT Mixed Set Overall Summary Results

This subtask is evaluated in two different ways. In the first setup, a returned note is assumed correct if its onset is within ±50 ms of a reference note and its F0 is within a quarter tone of the corresponding reference note, ignoring the returned offset values. In the second setup, on top of the above requirements, a correct returned note is also required to have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger.
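The two matching rules can be sketched as a single predicate. This is a minimal illustration, not the MIREX evaluation code: the function name `note_is_correct`, the tuple layout, and the choice to treat the tolerance boundaries as inclusive are all assumptions. A quarter tone is taken to be 50 cents.

```python
import math


def note_is_correct(ref, est, check_offset=False):
    """Compare one returned note against one reference note.

    ref and est are (onset_s, offset_s, f0_hz) tuples, times in seconds.
    With check_offset=False only onset and F0 are tested (first setup);
    with check_offset=True the offset is tested as well (second setup).
    """
    ref_on, ref_off, ref_f0 = ref
    est_on, est_off, est_f0 = est

    # Onset must be within 50 ms of the reference onset.
    if abs(est_on - ref_on) > 0.050:
        return False

    # F0 must be within a quarter tone (50 cents) of the reference F0.
    cents = 1200 * abs(math.log2(est_f0 / ref_f0))
    if cents > 50:
        return False

    if check_offset:
        # Offset tolerance: 20% of the reference duration or 50 ms, whichever is larger.
        tolerance = max(0.2 * (ref_off - ref_on), 0.050)
        if abs(est_off - ref_off) > tolerance:
            return False
    return True


# Hypothetical notes: a 1-second reference A4 and a slightly late, slightly sharp estimate.
ref_note = (1.0, 2.0, 440.0)
est_note = (1.03, 2.3, 445.0)
print(note_is_correct(ref_note, est_note))                     # onset-only setup
print(note_is_correct(ref_note, est_note, check_offset=True))  # onset-offset setup
```

The example note passes the onset-only setup but fails the onset-offset setup, because its offset is 0.3 s late against a tolerance of 0.2 s. This kind of case is why the onset-offset F-measures reported below are lower than the onset-only ones.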

A total of 34 files were used in this subtask: 16 from the woodwind recording, 8 from the IAL quintet recording and 6 piano recordings.

BD2 BD3 CPG1 CPG2 CPG3 FBR2 FT1 KD3 SB5
Ave. F-Measure Onset-Offset 0.234 0.226 0.128 0.128 0.113 0.393 0.021 0.451 0.087
Ave. F-Measure Onset Only 0.430 0.411 0.219 0.225 0.273 0.613 0.055 0.646 0.498
Ave. F-Measure Chroma 0.254 0.259 0.133 0.132 0.117 0.428 0.031 0.468 0.102
Ave. F-Measure Onset Only Chroma 0.473 0.483 0.240 0.244 0.297 0.638 0.082 0.666 0.558


Detailed Results

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.202 0.296 0.234 0.856
BD3 0.204 0.267 0.226 0.854
CPG1 0.300 0.086 0.128 0.893
CPG2 0.288 0.087 0.128 0.859
CPG3 0.181 0.085 0.113 0.682
FBR2 0.348 0.470 0.393 0.893
FT1 0.084 0.013 0.021 0.360
KD3 0.451 0.456 0.451 0.889
SB5 0.077 0.104 0.087 0.797


Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e. all F0s are mapped to a single octave before evaluating).

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.218 0.324 0.254 0.850
BD3 0.233 0.308 0.259 0.850
CPG1 0.312 0.090 0.133 0.892
CPG2 0.299 0.090 0.132 0.858
CPG3 0.192 0.088 0.117 0.737
FBR2 0.379 0.512 0.428 0.886
FT1 0.110 0.019 0.031 0.536
KD3 0.469 0.473 0.468 0.887
SB5 0.090 0.123 0.102 0.796



Results Based on Onset Only

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.381 0.524 0.430 0.685
BD3 0.382 0.468 0.411 0.695
CPG1 0.545 0.145 0.219 0.698
CPG2 0.540 0.151 0.225 0.671
CPG3 0.515 0.199 0.273 0.465
FBR2 0.553 0.716 0.613 0.766
FT1 0.218 0.033 0.055 0.312
KD3 0.647 0.652 0.646 0.782
SB5 0.423 0.635 0.498 0.573


Chroma Results Based on Onset Only

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.419 0.581 0.473 0.659
BD3 0.447 0.554 0.483 0.659
CPG1 0.602 0.159 0.240 0.664
CPG2 0.590 0.163 0.244 0.644
CPG3 0.573 0.215 0.297 0.472
FBR2 0.575 0.747 0.638 0.751
FT1 0.302 0.049 0.082 0.386
KD3 0.668 0.673 0.666 0.769
SB5 0.473 0.717 0.558 0.576


Run Times

TBD

Friedman Tests for Note Tracking

The Friedman test was run in MATLAB to test for significant differences among systems with regard to the F-measure on individual files.

Tukey-Kramer HSD Multi-Comparison for Task2
TeamID TeamID Lowerbound Mean Upperbound Significance
KD3 FBR2 -1.5876 0.4706 2.5288 FALSE
KD3 BD2 0.3830 2.4412 4.4994 TRUE
KD3 BD3 0.2948 2.3529 4.4111 TRUE
KD3 CPG1 2.1330 4.1912 6.2494 TRUE
KD3 CPG2 2.1183 4.1765 6.2347 TRUE
KD3 CPG3 2.8536 4.9118 6.9699 TRUE
KD3 SB5 3.5595 5.6176 7.6758 TRUE
KD3 FT1 4.4859 6.5441 8.6023 TRUE
FBR2 BD2 -0.0876 1.9706 4.0288 FALSE
FBR2 BD3 -0.1758 1.8824 3.9405 FALSE
FBR2 CPG1 1.6624 3.7206 5.7788 TRUE
FBR2 CPG2 1.6477 3.7059 5.7641 TRUE
FBR2 CPG3 2.3830 4.4412 6.4994 TRUE
FBR2 SB5 3.0889 5.1471 7.2052 TRUE
FBR2 FT1 4.0153 6.0735 8.1317 TRUE
BD2 BD3 -2.1464 -0.0882 1.9699 FALSE
BD2 CPG1 -0.3082 1.7500 3.8082 FALSE
BD2 CPG2 -0.3229 1.7353 3.7935 FALSE
BD2 CPG3 0.4124 2.4706 4.5288 TRUE
BD2 SB5 1.1183 3.1765 5.2347 TRUE
BD2 FT1 2.0448 4.1029 6.1611 TRUE
BD3 CPG1 -0.2199 1.8382 3.8964 FALSE
BD3 CPG2 -0.2347 1.8235 3.8817 FALSE
BD3 CPG3 0.5006 2.5588 4.6170 TRUE
BD3 SB5 1.2065 3.2647 5.3229 TRUE
BD3 FT1 2.1330 4.1912 6.2494 TRUE
CPG1 CPG2 -2.0729 -0.0147 2.0435 FALSE
CPG1 CPG3 -1.3376 0.7206 2.7788 FALSE
CPG1 SB5 -0.6317 1.4265 3.4847 FALSE
CPG1 FT1 0.2948 2.3529 4.4111 TRUE
CPG2 CPG3 -1.3229 0.7353 2.7935 FALSE
CPG2 SB5 -0.6170 1.4412 3.4994 FALSE
CPG2 FT1 0.3095 2.3676 4.4258 TRUE
CPG3 SB5 -1.3523 0.7059 2.7641 FALSE
CPG3 FT1 -0.4258 1.6324 3.6905 FALSE
SB5 FT1 -1.1317 0.9265 2.9847 FALSE


Figure: 2012 Accuracy Per Song Friedman Mean Ranks (task2.friedman.Friedman Mean Ranks.png)

NT Piano-Only Overall Summary Results

This subtask is evaluated in two different ways. In the first setup, a returned note is assumed correct if its onset is within ±50 ms of a reference note and its F0 is within a quarter tone of the corresponding reference note, ignoring the returned offset values. In the second setup, on top of the above requirements, a correct returned note is also required to have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger. The 6 piano recordings are evaluated separately for this subtask.

BD2 BD3 CPG1 CPG2 CPG3 FBR2 FT1 KD3 SB5
Ave. F-Measure Onset-Offset 0.1295 0.1853 0.1409 0.1371 0.1591 0.1625 0.0724 0.2680 0.0398
Ave. F-Measure Onset Only 0.5001 0.6078 0.2970 0.3058 0.3841 0.6158 0.1728 0.6565 0.6638
Ave. F-Measure Chroma 0.1471 0.1946 0.1423 0.1384 0.1633 0.1946 0.0740 0.2508 0.0423
Ave. F-Measure Onset Only Chroma 0.5245 0.6223 0.3082 0.3128 0.3905 0.6450 0.1763 0.6141 0.6746


Detailed Results

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.125 0.135 0.129 0.789
BD3 0.193 0.179 0.185 0.796
CPG1 0.360 0.094 0.141 0.820
CPG2 0.309 0.092 0.137 0.803
CPG3 0.269 0.120 0.159 0.800
FBR2 0.158 0.170 0.163 0.831
FT1 0.258 0.043 0.072 0.568
KD3 0.263 0.275 0.268 0.842
SB5 0.036 0.045 0.040 0.569


Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e. all F0s are mapped to a single octave before evaluating).

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.141 0.155 0.147 0.764
BD3 0.202 0.188 0.195 0.784
CPG1 0.363 0.095 0.142 0.821
CPG2 0.311 0.093 0.138 0.803
CPG3 0.284 0.123 0.163 0.808
FBR2 0.187 0.207 0.195 0.805
FT1 0.262 0.045 0.074 0.563
KD3 0.246 0.257 0.251 0.839
SB5 0.039 0.048 0.042 0.568


Results Based on Onset Only

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.477 0.531 0.500 0.526
BD3 0.627 0.594 0.608 0.550
CPG1 0.736 0.200 0.297 0.635
CPG2 0.710 0.206 0.306 0.577
CPG3 0.725 0.282 0.384 0.535
FBR2 0.606 0.639 0.616 0.556
FT1 0.593 0.105 0.173 0.421
KD3 0.643 0.674 0.657 0.659
SB5 0.595 0.754 0.664 0.419


Chroma Results Based on Onset Only

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.501 0.556 0.525 0.523
BD3 0.641 0.608 0.622 0.548
CPG1 0.775 0.207 0.308 0.604
CPG2 0.736 0.211 0.313 0.559
CPG3 0.761 0.286 0.391 0.539
FBR2 0.632 0.673 0.645 0.549
FT1 0.602 0.107 0.176 0.410
KD3 0.602 0.630 0.614 0.656
SB5 0.605 0.766 0.675 0.425


Individual Results Files for Task 2

BD2 = Emmanouil Benetos, Simon Dixon
BD3 = Emmanouil Benetos, Simon Dixon
LYC1 = Cheng-Te Lee, Yi-Hsuan Yang, Homer Chen
RFF1 = Gustavo Reis, Francisco Fernandéz, Aníbal Ferreira
RFF2 = Gustavo Reis, Francisco Fernandéz, Aníbal Ferreira
YR1 = Chunghsin YEH, Axel Roebel
YR3 = Chunghsin YEH, Axel Roebel