2018:Multiple Fundamental Frequency Estimation & Tracking Results - Su Dataset
Introduction
Since 2015, a newly annotated polyphonic dataset has been included in this task. It covers a wider range of real-world music than the older dataset used since 2009: 3 clips of piano solo, 3 clips of string quartet, 2 clips of piano quintet, and 2 clips of violin sonata (violin with piano accompaniment), all selected from real-world recordings. Each clip is between 20 and 30 seconds long. The dataset was annotated using the method described in the following paper:
Li Su and Yi-Hsuan Yang, "Escaping from the Abyss of Manual Annotation: New Methodology of Building Polyphonic Datasets for Automatic Music Transcription," in Int. Symp. Computer Music Multidisciplinary Research (CMMR), June 2015.
As also mentioned in the paper, we did our best to manually correct the errors in the preliminary annotation (mostly mismatches between onset and offset time stamps). Since annotation errors we did not catch may remain, we have decided to make the data and the annotation publicly available after this year's MIREX results are announced. Specifically, we encourage every participant to help us check the annotation; the result of each competing algorithm will be updated based on the revised annotation. We hope this gives participants more detailed insight into how their algorithms behave on the dataset, and that in this way we can join efforts to create a better dataset for research on multiple-F0 estimation and tracking.
General Legend
Sub code | Submission name | Abstract | Contributors
---|---|---|---
CB1 | Silvet | | Chris Cannam, Emmanouil Benetos
CB2 | Silvet Live | | Chris Cannam, Emmanouil Benetos
KB1 (Note Tracking subtask only) | PianoTranscriptor | | Rainer Kelz, Sebastian Böck
Task 1: Multiple Fundamental Frequency Estimation (MF0E)
MF0E Overall Summary Results
Detailed Results
Sub code | Precision | Recall | Accuracy | Etot | Esubs | Emiss | Efa
---|---|---|---|---|---|---|---
CB1 | 0.617 | 0.236 | 0.234 | 0.773 | 0.150 | 0.614 | 0.009
CB2 | 0.586 | 0.224 | 0.221 | 0.788 | 0.162 | 0.614 | 0.011
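Here Etot, Esubs, Emiss, and Efa are the frame-level total, substitution, miss, and false-alarm error rates, and Accuracy is the frame-level TP/(TP+FP+FN) score. The Python sketch below shows one common way to compute these quantities, following the usual MIREX-style frame-level definitions; the `frame_metrics` helper, its input format, and the ±50-cent (quarter-tone) tolerance are illustrative assumptions, not the actual evaluation code behind these tables.

```python
import numpy as np

def frame_metrics(ref_frames, est_frames, tol_cents=50.0):
    """Frame-level multi-F0 metrics (illustrative sketch).
    ref_frames / est_frames: lists of NumPy arrays of F0 values in Hz,
    one array per analysis frame (typically 10 ms)."""
    tp = fp = fn = 0
    n_ref = 0
    e_subs = e_miss = e_fa = 0.0
    for ref, est in zip(ref_frames, est_frames):
        # Count reference F0s matched by an estimate within the pitch tolerance.
        matched = 0
        used = np.zeros(len(est), dtype=bool)
        for f in ref:
            if len(est):
                cents = 1200.0 * np.abs(np.log2(est / f))
                cents[used] = np.inf          # each estimate may match only once
                j = int(np.argmin(cents))
                if cents[j] <= tol_cents:
                    matched += 1
                    used[j] = True
        tp += matched
        fp += len(est) - matched
        fn += len(ref) - matched
        n_ref += len(ref)
        e_subs += min(len(ref), len(est)) - matched   # substitutions
        e_miss += max(0, len(ref) - len(est))         # misses
        e_fa += max(0, len(est) - len(ref))           # false alarms
    n_ref = max(n_ref, 1)
    return {
        "Precision": tp / (tp + fp) if tp + fp else 0.0,
        "Recall":    tp / (tp + fn) if tp + fn else 0.0,
        "Accuracy":  tp / (tp + fp + fn) if tp + fp + fn else 0.0,
        "Esubs": e_subs / n_ref,
        "Emiss": e_miss / n_ref,
        "Efa":   e_fa / n_ref,
        "Etot":  (e_subs + e_miss + e_fa) / n_ref,
    }
```

Under these definitions Etot = Esubs + Emiss + Efa, which is consistent with the rows above (e.g. 0.150 + 0.614 + 0.009 = 0.773 for CB1).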
Detailed Chroma Results
Here, accuracy is assessed on chroma results (i.e. all F0s are mapped to a single octave before evaluation)
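As a concrete illustration of this octave folding, here is a minimal Python sketch; the `fold_to_octave` name and the A4 = 440 Hz reference tuning are assumptions for the example, not part of the official evaluation code.

```python
import numpy as np

def fold_to_octave(f0_hz, fref=440.0):
    """Map F0 values (Hz) to fractional pitch classes in a single octave.
    Assumes A4 = 440 Hz (MIDI 69); octave errors then no longer count."""
    midi = 69.0 + 12.0 * np.log2(np.asarray(f0_hz, dtype=float) / fref)
    return midi % 12.0   # value in semitones within one octave
```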
Sub code | Precision | Recall | Accuracy | Etot | Esubs | Emiss | Efa
---|---|---|---|---|---|---|---
CB1 | 0.735 | 0.284 | 0.281 | 0.725 | 0.102 | 0.614 | 0.009
CB2 | 0.736 | 0.284 | 0.280 | 0.728 | 0.102 | 0.614 | 0.011
Individual Results Files for Task 1
CB1= Chris Cannam, Emmanouil Benetos
CB2= Chris Cannam, Emmanouil Benetos
Info about the filenames
The first two letters of the filename represent the music type:
PQ = piano quintet, PS = piano solo, SQ = string quartet, VS = violin sonata (with piano accompaniment)
Run Times
Friedman tests for Multiple Fundamental Frequency Estimation (MF0E)
The Friedman test was run in MATLAB to test significant differences amongst systems with regard to the performance (accuracy) on individual files.
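As an illustration only, a comparable test can be sketched outside MATLAB. The snippet below uses SciPy with made-up per-file accuracies and three hypothetical system columns (SciPy's friedmanchisquare requires at least three groups, whereas only two systems were compared on this task this year); it is not the script used to produce the table below.

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Hypothetical per-file accuracies: rows = audio clips, columns = systems.
acc = np.array([
    [0.23, 0.22, 0.30],
    [0.25, 0.21, 0.28],
    [0.20, 0.24, 0.27],
    [0.28, 0.23, 0.31],
    [0.24, 0.20, 0.29],
])

# One argument per system (column).
stat, p = friedmanchisquare(*acc.T)
print(f"Friedman chi-square = {stat:.3f}, p = {p:.4f}")
```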
Tukey-Kramer HSD Multi-Comparison
TeamID | TeamID | Lowerbound | Mean | Upperbound | Significance |
---|---|---|---|---|---|
CB1 | CB2 | -0.2198 | 0.4000 | 1.0198 | FALSE |
Task 2: Note Tracking (NT)
NT Mixed Set Overall Summary Results
This subtask is evaluated in two different ways. In the first setup, a returned note is counted as correct if its onset is within ±50 ms of a reference note's onset and its F0 is within ± a quarter tone of that reference note's F0; returned offset values are ignored. In the second setup, in addition to the above requirements, a correct returned note must also have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger.
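As an illustration of this matching rule for a single reference/estimated note pair, here is a minimal Python sketch; the `note_matches` helper and its (onset, offset, F0) tuple format are assumptions for the example, not the evaluation code used here.

```python
import numpy as np

def note_matches(ref, est, with_offset=True):
    """Check whether an estimated note matches a reference note.
    Each note is a tuple (onset_s, offset_s, f0_hz)."""
    onset_ok = abs(est[0] - ref[0]) <= 0.05                    # within +/- 50 ms
    pitch_ok = abs(1200.0 * np.log2(est[2] / ref[2])) <= 50.0  # within a quarter tone
    if not with_offset:
        return onset_ok and pitch_ok                           # first setup
    ref_dur = ref[1] - ref[0]
    off_tol = max(0.05, 0.2 * ref_dur)                         # 50 ms or 20% of duration
    return onset_ok and pitch_ok and abs(est[1] - ref[1]) <= off_tol  # second setup
```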
Metric | CB1 | CB2
---|---|---
Ave. F-Measure Onset-Offset | 0.0614 | 0.0491
Ave. F-Measure Onset Only | 0.2280 | 0.1653
Ave. F-Measure Chroma | 0.0771 | 0.0707
Ave. F-Measure Onset Only Chroma | 0.2676 | 0.2089
Detailed Results
Sub code | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
CB1 | 0.077 | 0.053 | 0.061 | 0.731
CB2 | 0.055 | 0.047 | 0.049 | 0.803
Detailed Chroma Results
Here, accuracy is assessed on chroma results (i.e. all F0s are mapped to a single octave before evaluation)
Sub code | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
CB1 | 0.100 | 0.065 | 0.077 | 0.731
CB2 | 0.081 | 0.067 | 0.071 | 0.803
Results Based on Onset Only
Sub code | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
CB1 | 0.291 | 0.195 | 0.228 | 0.508
CB2 | 0.191 | 0.159 | 0.165 | 0.516
Chroma Results Based on Onset Only
Sub code | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
CB1 | 0.345 | 0.229 | 0.268 | 0.494
CB2 | 0.246 | 0.198 | 0.209 | 0.510
Run Times
Friedman Tests for Note Tracking
The Friedman test was run in MATLAB to test significant differences amongst systems with regard to the F-measure on individual files.
Tukey-Kramer HSD Multi-Comparison for Task2
TeamID | TeamID | Lowerbound | Mean | Upperbound | Significance |
---|---|---|---|---|---|
CB1 | CB2 | -0.0198 | 0.6000 | 1.2198 | FALSE |
NT Piano-Only Overall Summary Results
This subtask is evaluated using the same two setups described above for the mixed set. The 3 piano solo recordings are evaluated separately for this subtask.
Metric | CB1 | CB2 | KB1
---|---|---|---
Ave. F-Measure Onset-Offset | 0.0892 | 0.0744 | 0.2136
Ave. F-Measure Onset Only | 0.3686 | 0.2522 | 0.6090
Ave. F-Measure Chroma | 0.0940 | 0.0894 | 0.2180
Ave. F-Measure Onset Only Chroma | 0.3834 | 0.2692 | 0.6090
Detailed Results
Sub code | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
CB1 | 0.098 | 0.083 | 0.089 | 0.838
CB2 | 0.072 | 0.079 | 0.074 | 0.774
KB1 | 0.227 | 0.203 | 0.214 | 0.837
Detailed Chroma Results
Here, accuracy is assessed on chroma results (i.e. all F0s are mapped to a single octave before evaluation)
Sub code | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
CB1 | 0.103 | 0.088 | 0.094 | 0.839
CB2 | 0.088 | 0.094 | 0.089 | 0.784
KB1 | 0.232 | 0.207 | 0.218 | 0.839
Results Based on Onset Only
Sub code | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
CB1 | 0.412 | 0.336 | 0.369 | 0.556
CB2 | 0.245 | 0.267 | 0.252 | 0.540
KB1 | 0.658 | 0.571 | 0.609 | 0.595
Chroma Results Based on Onset Only
Sub code | Precision | Recall | Ave. F-measure | Ave. Overlap
---|---|---|---|---
CB1 | 0.429 | 0.349 | 0.383 | 0.538
CB2 | 0.262 | 0.284 | 0.269 | 0.550
KB1 | 0.658 | 0.571 | 0.609 | 0.589
Individual Results Files for Task 2
CB1= Chris Cannam, Emmanouil Benetos
CB2= Chris Cannam, Emmanouil Benetos
KB1= Rainer Kelz, Sebastian Böck