2017:Multiple Fundamental Frequency Estimation & Tracking Results - Su Dataset


Introduction

Since last year, a newly annotated polyphonic dataset has been included in this task. It covers a wider range of real-world music than the old dataset, which had been in use since 2009. Specifically, the new dataset contains 3 clips of piano solo, 3 clips of string quartet, 2 clips of piano quintet, and 2 clips of violin sonata (violin with piano accompaniment), all selected from real-world recordings. Each clip is between 20 and 30 seconds long. The dataset was annotated using the method described in the following paper:

Li Su and Yi-Hsuan Yang, "Escaping from the Abyss of Manual Annotation: New Methodology of Building Polyphonic Datasets for Automatic Music Transcription," in Int. Symp. Computer Music Multidisciplinary Research (CMMR), June 2015.

As also mentioned in the paper, we did our best to correct the errors in the preliminary annotation (mostly mismatches between onset and offset time stamps) by manual checking. Since there may still be annotation errors that we have not found, we have decided to make the data and the annotations publicly available after this year's MIREX results are announced. Specifically, we encourage every participant to help us check the annotations. The results of each competing algorithm will be updated based on the revised annotations. We hope this gives participants more detailed information about how their algorithms behave on the dataset. Moreover, in this way we can join our efforts to create a better dataset for research on multiple-F0 estimation and tracking.

General Legend

Sub code Submission name Abstract Contributors
CB1 Silvet PDF Chris Cannam, Emmanouil Benetos
CB2 Silvet Live PDF Chris Cannam, Emmanouil Benetos
KD1 multiF0_sampled PDF Karin Dressler
KD2 multiF0_midi PDF Karin Dressler
MHMTM1 End-to-End Multi-instrumental ConvNet PDF Gaku Hatanaka, Shinjiro Mita, Alexis Meneses, Daiki Miura, Nattapong Thammasan
MHMTM2 Ensemble category ConvNet to F0 ConvNet PDF Gaku Hatanaka, Shinjiro Mita, Alexis Meneses, Daiki Miura, Nattapong Thammasan
PR1 LPCR PDF Leonid Pogorelyuk, Clarence Rowley
PRGR1 SOT MFFE&T 901 PDF Katarzyna Rokicka, Adam Pluta, Rafal Rokicki, Marcin Gawrysz
PRGR2 SOT MFFE&T 902 PDF Katarzyna Rokicka, Adam Pluta, Rafal Rokicki, Marcin Gawrysz
THK1 Spectral Convolutions PDF John Thickstun, Zaid Harchaoui, Dean Foster, Sham Kakade
WCS1 Piano_Transcription PDF Li Su, Derek Wu, Berlin Chen
ZCY2 Multiple pitch estimation PDF Fuliang Yin, Weiwei Zhang, Zhe Chen
CT1 convlstm PDF Carl Thomé
SL1 samuel-li-onset-detector PDF Samuel Li

Task 1: Multiple Fundamental Frequency Estimation (MF0E)

MF0E Overall Summary Results

Detailed Results

Precision Recall Accuracy Etot Esubs Emiss Efa
CB1.results.task1 0.617 0.236 0.234 0.773 0.150 0.614 0.009
CB2.results.task1 0.586 0.224 0.221 0.788 0.162 0.614 0.011
KD1.results.task1 0.459 0.450 0.381 0.745 0.371 0.179 0.195
KD2.results.task1 0.459 0.450 0.381 0.745 0.371 0.179 0.195
MHMTM1.results.task1 0.612 0.368 0.352 0.676 0.216 0.416 0.044
MHMTM2.results.task1 0.287 0.057 0.057 0.945 0.150 0.793 0.002
PR1.results.task1 0.460 0.332 0.300 0.770 0.306 0.362 0.102
PRGR1.results.task1 0.238 0.060 0.062 0.926 0.183 0.726 0.017
PRGR2.results.task1 0.263 0.102 0.096 1.023 0.215 0.652 0.156
THK1.results.task1 0.701 0.546 0.510 0.529 0.176 0.278 0.075
WCS1.results.task1 0.636 0.397 0.357 0.700 0.186 0.417 0.097
ZCY2.results.task1 0.409 0.282 0.262 0.799 0.362 0.356 0.081
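
For readers unfamiliar with the error columns: their definitions are not given on this page, but they follow the standard frame-level measures used in past MIREX multiple-F0 evaluations (after Poliner & Ellis), where $N_{\text{ref}}(t)$, $N_{\text{sys}}(t)$, and $N_{\text{corr}}(t)$ denote the number of reference, reported, and correctly reported F0s in frame $t$:

$$E_{\text{tot}} = \frac{\sum_t \bigl[\max(N_{\text{ref}}(t), N_{\text{sys}}(t)) - N_{\text{corr}}(t)\bigr]}{\sum_t N_{\text{ref}}(t)}, \qquad E_{\text{subs}} = \frac{\sum_t \bigl[\min(N_{\text{ref}}(t), N_{\text{sys}}(t)) - N_{\text{corr}}(t)\bigr]}{\sum_t N_{\text{ref}}(t)},$$
$$E_{\text{miss}} = \frac{\sum_t \max(0, N_{\text{ref}}(t) - N_{\text{sys}}(t))}{\sum_t N_{\text{ref}}(t)}, \qquad E_{\text{fa}} = \frac{\sum_t \max(0, N_{\text{sys}}(t) - N_{\text{ref}}(t))}{\sum_t N_{\text{ref}}(t)},$$

so that $E_{\text{tot}} = E_{\text{subs}} + E_{\text{miss}} + E_{\text{fa}}$, which can be checked against any row above (e.g. CB1: 0.150 + 0.614 + 0.009 = 0.773).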


Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluation).
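
As an illustration of this octave folding, here is a minimal sketch (not the official evaluation code; the function name and the A4 = 440 Hz reference are assumptions of mine) of how an F0 in Hz could be mapped to a single pitch class before chroma scoring:

import math

def f0_to_chroma(f0_hz, ref_hz=440.0):
    """Fold an F0 in Hz onto a single octave as a pitch class 0-11 (C = 0)."""
    # Convert to a MIDI-style pitch number (A4 = 69 at ref_hz), then wrap modulo 12.
    midi = 69 + 12 * math.log2(f0_hz / ref_hz)
    return int(round(midi)) % 12

# 220 Hz (A3), 440 Hz (A4), and 880 Hz (A5) all fold to the same chroma bin (A = 9).
assert f0_to_chroma(220.0) == f0_to_chroma(440.0) == f0_to_chroma(880.0) == 9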

Precision Recall Accuracy Etot Esubs Emiss Efa
CB1.results.task1 0.735 0.284 0.281 0.725 0.102 0.614 0.009
CB2.results.task1 0.736 0.284 0.280 0.728 0.102 0.614 0.011
KD1.results.task1 0.571 0.560 0.473 0.635 0.261 0.179 0.195
KD2.results.task1 0.571 0.560 0.473 0.635 0.261 0.179 0.195
MHMTM1.results.task1 0.729 0.440 0.421 0.603 0.144 0.416 0.044
MHMTM2.results.task1 0.494 0.101 0.101 0.901 0.106 0.793 0.002
PR1.results.task1 0.626 0.448 0.406 0.655 0.191 0.362 0.102
PRGR1.results.task1 0.431 0.110 0.114 0.876 0.133 0.726 0.017
PRGR2.results.task1 0.421 0.171 0.158 0.954 0.146 0.652 0.156
THK1.results.task1 0.757 0.591 0.552 0.484 0.131 0.278 0.075
WCS1.results.task1 0.722 0.451 0.405 0.646 0.132 0.417 0.097
ZCY2.results.task1 0.606 0.422 0.391 0.659 0.222 0.356 0.081


Individual Results Files for Task 1

CB1= Chris Cannam, Emmanouil Benetos
CB2= Chris Cannam, Emmanouil Benetos
KD1= Karin Dressler
KD2= Karin Dressler
MHMTM1= Gaku Hatanaka, Shinjiro Mita, Alexis Meneses, Daiki Miura, Nattapong Thammasan
MHMTM2= Gaku Hatanaka, Shinjiro Mita, Alexis Meneses, Daiki Miura, Nattapong Thammasan
PR1= Leonid Pogorelyuk, Clarence Rowley
PRGR1= Katarzyna Rokicka, Adam Pluta, Rafal Rokicki, Marcin Gawrysz
PRGR2= Katarzyna Rokicka, Adam Pluta, Rafal Rokicki, Marcin Gawrysz
THK1= John Thickstun, Zaid Harchaoui, Dean Foster, Sham Kakade
WCS1= Li Su, Derek Wu, Berlin Chen
ZCY2= Fuliang Yin, Weiwei Zhang, Zhe Chen

Info about the filenames

The first two letters of the filename represent the music type:

PQ = piano quintet, PS = piano solo, SQ = string quartet, VS = violin sonata (with piano accompaniment)
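
A trivial sketch for resolving this prefix to its music type (the mapping is copied from the line above; the helper name and the example filename are hypothetical):

MUSIC_TYPES = {
    "PQ": "piano quintet",
    "PS": "piano solo",
    "SQ": "string quartet",
    "VS": "violin sonata (with piano accompaniment)",
}

def music_type(filename):
    """Return the music type encoded in the first two letters of a result filename."""
    return MUSIC_TYPES[filename[:2].upper()]

print(music_type("SQ2.results.task1.txt"))  # hypothetical filename -> "string quartet"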

Run Times

Friedman tests for Multiple Fundamental Frequency Estimation (MF0E)

The Friedman test was run in MATLAB to test for significant differences among systems with respect to accuracy on individual files.
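
The evaluation scripts themselves are not posted here; as a rough, non-authoritative sketch of the procedure (the page states MATLAB's Friedman test was used, so the exact routines differ), the same kind of test can be run with SciPy on a files-by-systems accuracy matrix. The matrix below is synthetic placeholder data:

import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Placeholder accuracy matrix: one row per test file, one column per submission.
rng = np.random.default_rng(0)
acc = rng.uniform(0.2, 0.8, size=(10, 12))  # 10 files x 12 systems (synthetic data)

# Friedman test across systems, treating files as repeated measures.
stat, p = friedmanchisquare(*[acc[:, j] for j in range(acc.shape[1])])
print(f"Friedman chi-square = {stat:.3f}, p = {p:.4f}")

# Per-file ranks (1 = lowest accuracy within a file), averaged per system;
# differences of these mean ranks are what a Tukey-Kramer HSD comparison like
# the table below operates on.
mean_ranks = np.apply_along_axis(rankdata, 1, acc).mean(axis=0)
print("mean ranks per system:", np.round(mean_ranks, 2))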

Tukey-Kramer HSD Multi-Comparison

TeamID TeamID Lowerbound Mean Upperbound Significance
THK1.results.task1 KD2.results.task1 -2.7603 2.5000 7.7603 FALSE
THK1.results.task1 KD1.results.task1 -2.7603 2.5000 7.7603 FALSE
THK1.results.task1 WCS1.results.task1 -2.1603 3.1000 8.3603 FALSE
THK1.results.task1 MHMTM1.results.task1 -1.8603 3.4000 8.6603 FALSE
THK1.results.task1 PR1.results.task1 -0.7603 4.5000 9.7603 FALSE
THK1.results.task1 ZCY2.results.task1 0.8397 6.1000 11.3603 TRUE
THK1.results.task1 CB1.results.task1 1.6397 6.9000 12.1603 TRUE
THK1.results.task1 CB2.results.task1 2.1397 7.4000 12.6603 TRUE
THK1.results.task1 PRGR2.results.task1 3.8397 9.1000 14.3603 TRUE
THK1.results.task1 PRGR1.results.task1 5.3397 10.6000 15.8603 TRUE
THK1.results.task1 MHMTM2.results.task1 4.6397 9.9000 15.1603 TRUE
KD2.results.task1 KD1.results.task1 -5.2603 0.0000 5.2603 FALSE
KD2.results.task1 WCS1.results.task1 -4.6603 0.6000 5.8603 FALSE
KD2.results.task1 MHMTM1.results.task1 -4.3603 0.9000 6.1603 FALSE
KD2.results.task1 PR1.results.task1 -3.2603 2.0000 7.2603 FALSE
KD2.results.task1 ZCY2.results.task1 -1.6603 3.6000 8.8603 FALSE
KD2.results.task1 CB1.results.task1 -0.8603 4.4000 9.6603 FALSE
KD2.results.task1 CB2.results.task1 -0.3603 4.9000 10.1603 FALSE
KD2.results.task1 PRGR2.results.task1 1.3397 6.6000 11.8603 TRUE
KD2.results.task1 PRGR1.results.task1 2.8397 8.1000 13.3603 TRUE
KD2.results.task1 MHMTM2.results.task1 2.1397 7.4000 12.6603 TRUE
KD1.results.task1 WCS1.results.task1 -4.6603 0.6000 5.8603 FALSE
KD1.results.task1 MHMTM1.results.task1 -4.3603 0.9000 6.1603 FALSE
KD1.results.task1 PR1.results.task1 -3.2603 2.0000 7.2603 FALSE
KD1.results.task1 ZCY2.results.task1 -1.6603 3.6000 8.8603 FALSE
KD1.results.task1 CB1.results.task1 -0.8603 4.4000 9.6603 FALSE
KD1.results.task1 CB2.results.task1 -0.3603 4.9000 10.1603 FALSE
KD1.results.task1 PRGR2.results.task1 1.3397 6.6000 11.8603 TRUE
KD1.results.task1 PRGR1.results.task1 2.8397 8.1000 13.3603 TRUE
KD1.results.task1 MHMTM2.results.task1 2.1397 7.4000 12.6603 TRUE
WCS1.results.task1 MHMTM1.results.task1 -4.9603 0.3000 5.5603 FALSE
WCS1.results.task1 PR1.results.task1 -3.8603 1.4000 6.6603 FALSE
WCS1.results.task1 ZCY2.results.task1 -2.2603 3.0000 8.2603 FALSE
WCS1.results.task1 CB1.results.task1 -1.4603 3.8000 9.0603 FALSE
WCS1.results.task1 CB2.results.task1 -0.9603 4.3000 9.5603 FALSE
WCS1.results.task1 PRGR2.results.task1 0.7397 6.0000 11.2603 TRUE
WCS1.results.task1 PRGR1.results.task1 2.2397 7.5000 12.7603 TRUE
WCS1.results.task1 MHMTM2.results.task1 1.5397 6.8000 12.0603 TRUE
MHMTM1.results.task1 PR1.results.task1 -4.1603 1.1000 6.3603 FALSE
MHMTM1.results.task1 ZCY2.results.task1 -2.5603 2.7000 7.9603 FALSE
MHMTM1.results.task1 CB1.results.task1 -1.7603 3.5000 8.7603 FALSE
MHMTM1.results.task1 CB2.results.task1 -1.2603 4.0000 9.2603 FALSE
MHMTM1.results.task1 PRGR2.results.task1 0.4397 5.7000 10.9603 TRUE
MHMTM1.results.task1 PRGR1.results.task1 1.9397 7.2000 12.4603 TRUE
MHMTM1.results.task1 MHMTM2.results.task1 1.2397 6.5000 11.7603 TRUE
PR1.results.task1 ZCY2.results.task1 -3.6603 1.6000 6.8603 FALSE
PR1.results.task1 CB1.results.task1 -2.8603 2.4000 7.6603 FALSE
PR1.results.task1 CB2.results.task1 -2.3603 2.9000 8.1603 FALSE
PR1.results.task1 PRGR2.results.task1 -0.6603 4.6000 9.8603 FALSE
PR1.results.task1 PRGR1.results.task1 0.8397 6.1000 11.3603 TRUE
PR1.results.task1 MHMTM2.results.task1 0.1397 5.4000 10.6603 TRUE
ZCY2.results.task1 CB1.results.task1 -4.4603 0.8000 6.0603 FALSE
ZCY2.results.task1 CB2.results.task1 -3.9603 1.3000 6.5603 FALSE
ZCY2.results.task1 PRGR2.results.task1 -2.2603 3.0000 8.2603 FALSE
ZCY2.results.task1 PRGR1.results.task1 -0.7603 4.5000 9.7603 FALSE
ZCY2.results.task1 MHMTM2.results.task1 -1.4603 3.8000 9.0603 FALSE
CB1.results.task1 CB2.results.task1 -4.7603 0.5000 5.7603 FALSE
CB1.results.task1 PRGR2.results.task1 -3.0603 2.2000 7.4603 FALSE
CB1.results.task1 PRGR1.results.task1 -1.5603 3.7000 8.9603 FALSE
CB1.results.task1 MHMTM2.results.task1 -2.2603 3.0000 8.2603 FALSE
CB2.results.task1 PRGR2.results.task1 -3.5603 1.7000 6.9603 FALSE
CB2.results.task1 PRGR1.results.task1 -2.0603 3.2000 8.4603 FALSE
CB2.results.task1 MHMTM2.results.task1 -2.7603 2.5000 7.7603 FALSE
PRGR2.results.task1 PRGR1.results.task1 -3.7603 1.5000 6.7603 FALSE
PRGR2.results.task1 MHMTM2.results.task1 -4.4603 0.8000 6.0603 FALSE
PRGR1.results.task1 MHMTM2.results.task1 -5.9603 -0.7000 4.5603 FALSE


Figure: 2017 Su Accuracy Per Song Friedman Mean Ranks (task1.friedman.Friedman Mean Ranks.png)

Task 2: Note Tracking (NT)

NT Mixed Set Overall Summary Results

This subtask is evaluated in two different ways. In the first setup, a returned note is counted as correct if its onset is within ±50 ms of a reference note and its F0 is within a quarter tone of that reference note's F0; the returned offset values are ignored. In the second setup, in addition to the above requirements, a correct returned note must also have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger.
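
To make the two matching rules concrete, the sketch below (my own illustration with hypothetical names, not the official evaluator) checks one returned note against one reference note under both setups, using the ±50 ms onset tolerance, the ±quarter-tone (±0.5 semitone) pitch tolerance, and the max(20% of duration, 50 ms) offset tolerance described above:

from collections import namedtuple
import math

Note = namedtuple("Note", ["onset", "offset", "f0"])  # times in seconds, f0 in Hz

def onset_match(est, ref, onset_tol=0.05, pitch_tol_semitones=0.5):
    """Setup 1: onset within +-50 ms and F0 within a quarter tone (+-0.5 semitone)."""
    pitch_dev = abs(12.0 * math.log2(est.f0 / ref.f0))
    return abs(est.onset - ref.onset) <= onset_tol and pitch_dev <= pitch_tol_semitones

def onset_offset_match(est, ref, offset_ratio=0.2, min_offset_tol=0.05):
    """Setup 2: setup 1 plus offset within 20% of the reference duration or 50 ms."""
    if not onset_match(est, ref):
        return False
    offset_tol = max(offset_ratio * (ref.offset - ref.onset), min_offset_tol)
    return abs(est.offset - ref.offset) <= offset_tol

# Example: a note 30 ms late with a slightly sharp pitch passes setup 1 but
# fails setup 2, because its offset is 300 ms early on a 1-second reference note.
ref = Note(onset=1.00, offset=2.00, f0=440.0)
est = Note(onset=1.03, offset=1.70, f0=446.0)
print(onset_match(est, ref), onset_offset_match(est, ref))  # True False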

CB1 CB2 CT1 KD2 PR1 PRGR1 PRGR2 SL1 ZCY2
Ave. F-Measure Onset-Offset 0.0614 0.0491 0.1511 0.0471 0.0595 0.0098 0.0115 0.0096 0.0343
Ave. F-Measure Onset Only 0.2280 0.1653 0.3017 0.3334 0.2558 0.0543 0.0502 0.2763 0.2054
Ave. F-Measure Chroma 0.0771 0.0707 0.1697 0.0585 0.0829 0.0141 0.0209 0.0181 0.0482
Ave. F-Measure Onset Only Chroma 0.2676 0.2089 0.3308 0.3734 0.3196 0.0843 0.0836 0.3188 0.2676


Detailed Results

Precision Recall Ave. F-measure Ave. Overlap
CB1 0.077 0.053 0.061 0.731
CB2 0.055 0.047 0.049 0.803
CT1 0.188 0.134 0.151 0.785
KD2 0.051 0.047 0.047 0.841
PR1 0.060 0.064 0.059 0.734
PRGR1 0.015 0.007 0.010 0.491
PRGR2 0.014 0.011 0.012 0.696
SL1 0.015 0.007 0.010 -0.003
ZCY2 0.035 0.036 0.034 0.872


Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluation).

Precision Recall Ave. F-measure Ave. Overlap
CB1 0.100 0.065 0.077 0.731
CB2 0.081 0.067 0.071 0.803
CT1 0.211 0.150 0.170 0.782
KD2 0.063 0.058 0.059 0.845
PR1 0.084 0.088 0.083 0.722
PRGR1 0.020 0.011 0.014 0.583
PRGR2 0.026 0.020 0.021 0.658
SL1 0.028 0.014 0.018 -0.002
ZCY2 0.050 0.051 0.048 0.862



Results Based on Onset Only

Precision Recall Ave. F-measure Ave. Overlap
CB1 0.291 0.195 0.228 0.508
CB2 0.191 0.159 0.165 0.516
CT1 0.364 0.269 0.302 0.632
KD2 0.373 0.320 0.333 0.485
PR1 0.248 0.291 0.256 0.515
PRGR1 0.086 0.043 0.054 -2.415
PRGR2 0.065 0.047 0.050 0.560
SL1 0.320 0.255 0.276 -0.024
ZCY2 0.217 0.209 0.205 0.488


Chroma Results Based on Onset Only

Precision Recall Ave. F-measure Ave. Overlap
CB1 0.345 0.229 0.268 0.494
CB2 0.246 0.198 0.209 0.510
CT1 0.404 0.293 0.331 0.587
KD2 0.418 0.359 0.373 0.475
PR1 0.314 0.363 0.320 0.490
PRGR1 0.131 0.068 0.084 -2.426
PRGR2 0.108 0.079 0.084 0.506
SL1 0.374 0.291 0.319 -0.029
ZCY2 0.282 0.275 0.268 0.470


Run Times

Friedman Tests for Note Tracking

The Friedman test was run in MATLAB to test for significant differences among systems with respect to the F-measure on individual files.

Tukey-Kramer HSD Multi-Comparison for Task 2

TeamID TeamID Lowerbound Mean Upperbound Significance
KD2 CT1 -2.9988 0.8000 4.5988 FALSE
KD2 SL1 -1.2988 2.5000 6.2988 FALSE
KD2 PR1 -1.7988 2.0000 5.7988 FALSE
KD2 CB1 -0.2988 3.5000 7.2988 FALSE
KD2 ZCY2 -1.3988 2.4000 6.1988 FALSE
KD2 CB2 0.7012 4.5000 8.2988 TRUE
KD2 PRGR1 1.9012 5.7000 9.4988 TRUE
KD2 PRGR2 2.7012 6.5000 10.2988 TRUE
CT1 SL1 -2.0988 1.7000 5.4988 FALSE
CT1 PR1 -2.5988 1.2000 4.9988 FALSE
CT1 CB1 -1.0988 2.7000 6.4988 FALSE
CT1 ZCY2 -2.1988 1.6000 5.3988 FALSE
CT1 CB2 -0.0988 3.7000 7.4988 FALSE
CT1 PRGR1 1.1012 4.9000 8.6988 TRUE
CT1 PRGR2 1.9012 5.7000 9.4988 TRUE
SL1 PR1 -4.2988 -0.5000 3.2988 FALSE
SL1 CB1 -2.7988 1.0000 4.7988 FALSE
SL1 ZCY2 -3.8988 -0.1000 3.6988 FALSE
SL1 CB2 -1.7988 2.0000 5.7988 FALSE
SL1 PRGR1 -0.5988 3.2000 6.9988 FALSE
SL1 PRGR2 0.2012 4.0000 7.7988 TRUE
PR1 CB1 -2.2988 1.5000 5.2988 FALSE
PR1 ZCY2 -3.3988 0.4000 4.1988 FALSE
PR1 CB2 -1.2988 2.5000 6.2988 FALSE
PR1 PRGR1 -0.0988 3.7000 7.4988 FALSE
PR1 PRGR2 0.7012 4.5000 8.2988 TRUE
CB1 ZCY2 -4.8988 -1.1000 2.6988 FALSE
CB1 CB2 -2.7988 1.0000 4.7988 FALSE
CB1 PRGR1 -1.5988 2.2000 5.9988 FALSE
CB1 PRGR2 -0.7988 3.0000 6.7988 FALSE
ZCY2 CB2 -1.6988 2.1000 5.8988 FALSE
ZCY2 PRGR1 -0.4988 3.3000 7.0988 FALSE
ZCY2 PRGR2 0.3012 4.1000 7.8988 TRUE
CB2 PRGR1 -2.5988 1.2000 4.9988 FALSE
CB2 PRGR2 -1.7988 2.0000 5.7988 FALSE
PRGR1 PRGR2 -2.9988 0.8000 4.5988 FALSE


Figure: 2017 Su Note.png

NT Piano-Only Overall Summary Results

This subtask is evaluated in two different ways. In the first setup, a returned note is counted as correct if its onset is within ±50 ms of a reference note and its F0 is within a quarter tone of that reference note's F0; the returned offset values are ignored. In the second setup, in addition to the above requirements, a correct returned note must also have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger. The 3 piano solo recordings are evaluated separately for this subtask.

CB1 CB2 CT1 KD2 PR1 PRGR1 PRGR2 SL1 ZCY2
Ave. F-Measure Onset-Offset 0.0892 0.0744 0.1817 0.0530 0.0543 0.0056 0.0137 0.0000 0.0154
Ave. F-Measure Onset Only 0.3686 0.2522 0.4122 0.4508 0.3778 0.0261 0.0400 0.5433 0.2073
Ave. F-Measure Chroma 0.0940 0.0894 0.1860 0.0697 0.0720 0.0130 0.0253 0.0000 0.0249
Ave. F-Measure Onset Only Chroma 0.3834 0.2692 0.4178 0.4767 0.4124 0.0505 0.0696 0.5542 0.2517


Detailed Results

Precision Recall Ave. F-measure Ave. Overlap
CB1 0.098 0.083 0.089 0.838
CB2 0.072 0.079 0.074 0.774
CT1 0.182 0.182 0.182 0.801
KD2 0.049 0.059 0.053 0.829
PR1 0.043 0.075 0.054 0.818
PRGR1 0.007 0.005 0.006 0.510
PRGR2 0.012 0.017 0.014 0.883
SL1 0.000 0.000 0.000 0.000
ZCY2 0.013 0.019 0.015 0.875


Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluation).

Precision Recall Ave. F-measure Ave. Overlap
CB1 0.103 0.088 0.094 0.839
CB2 0.088 0.094 0.089 0.784
CT1 0.187 0.186 0.186 0.800
KD2 0.064 0.077 0.070 0.839
PR1 0.058 0.098 0.072 0.807
PRGR1 0.016 0.012 0.013 0.801
PRGR2 0.023 0.031 0.025 0.854
SL1 0.000 0.000 0.000 0.000
ZCY2 0.021 0.031 0.025 0.864


Results Based on Onset Only

Precision Recall Ave. F-measure Ave. Overlap
CB1 0.412 0.336 0.369 0.556
CB2 0.244 0.267 0.252 0.540
DT1 0.009 0.003 0.004 0.220
KB1 0.078 0.055 0.063 0.101
MM1 0.554 0.415 0.470 0.523


Chroma Results Based on Onset Only

Precision Recall Ave. F-measure Ave. Overlap
CB1 0.429 0.349 0.383 0.538
CB2 0.262 0.284 0.269 0.550
CT1 0.426 0.411 0.418 0.569
KD2 0.454 0.508 0.477 0.494
PR1 0.335 0.555 0.412 0.518
PRGR1 0.056 0.050 0.050 -5.976
PRGR2 0.063 0.085 0.070 0.582
SL1 0.608 0.526 0.554 -0.023
ZCY2 0.223 0.296 0.252 0.446


Individual Results Files for Task 2

CB1= Chris Cannam, Emmanouil Benetos
CB2= Chris Cannam, Emmanouil Benetos
CT1= Carl Thomé
KD2= Karin Dressler
PR1= Leonid Pogorelyuk, Clarence Rowley
PRGR1= Katarzyna Rokicka, Adam Pluta, Rafal Rokicki, Marcin Gawrysz
PRGR2= Katarzyna Rokicka, Adam Pluta, Rafal Rokicki, Marcin Gawrysz
SL1= Samuel Li
ZCY2= Fuliang Yin, Weiwei Zhang, Zhe Chen