Difference between revisions of "2005:Audio Onset Detect"

From MIREX Wiki
==Review 1==
  
Besides being useful per se, onset detection is a pre-processing step for further music processing: rhythm analysis, beat tracking, instrument classification, and so on. It would be useful if the proposal briefly discussed whether the evaluation metrics are unbiased with respect to these different potential applications.
  
In order to decide which algorithm is the winner, a single number should ultimately be extracted. One possibility is to tune the algorithms to a single operating point on the ROC curve, e.g. by allowing a difference between false positives (FP) and false negatives (FN) of less than 1%.
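As an illustration of this kind of operating-point selection, the sketch below picks, from a hypothetical threshold sweep, the point where FP and FN counts are closest; all numbers are invented, not taken from any actual detector.

```python
# Illustration only: selecting a single ROC operating point where false
# positives and false negatives are (nearly) balanced. The threshold
# sweep below is invented for the example.

def pick_balanced_point(sweep):
    """Return the (threshold, fp, fn) entry with the smallest |FP - FN|."""
    return min(sweep, key=lambda entry: abs(entry[1] - entry[2]))

# Hypothetical (threshold, false positives, false negatives) triples:
sweep = [(0.1, 40, 5), (0.3, 22, 12), (0.5, 14, 15), (0.7, 6, 31)]
threshold, fp, fn = pick_balanced_point(sweep)  # -> 0.5, 14, 15
```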
The evaluation should include a measure of statistical significance. I suppose McNemar's test could do the job.
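A minimal sketch of McNemar's test for comparing two detectors on the same files follows; the discordant-pair counts used here are invented for illustration.

```python
# Sketch of McNemar's test for paired per-onset outcomes of two detectors.
# b = onsets detector A got right and B missed; c = the reverse.
# The counts below are invented for illustration.

def mcnemar_statistic(b, c):
    """Chi-square statistic with Yates' continuity correction (1 d.o.f.)."""
    return (abs(b - c) - 1) ** 2 / (b + c)

stat = mcnemar_statistic(b=25, c=10)
# The 5% critical value of chi-square with 1 degree of freedom is 3.84,
# so stat > 3.84 indicates a significant difference between the detectors.
significant = stat > 3.84
```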
 
  
The proposal does not mention whether training data will be available to participants.
To my understanding, evaluation on the following three subcategories would be enough: monophonic instruments, polyphonic solo instruments, and complex mixes.
 
  
I cannot tell whether the suggested participants are willing to take part. Other potential candidates could be Simon Dixon, Harvey Thornburg, and Masataka Goto.

Revision as of 15:00, 19 September 2005

Description

The aim of this contest is to compare state-of-the-art onset detection algorithms on music recordings. The methods will be evaluated on a large, varied, and reliably annotated dataset, composed of sub-datasets grouping files of the same type.

1) Input data

Audio format:

The data are monophonic sound files with their associated onset times and information about the annotation robustness.

  • CD-quality (PCM, 16-bit, 44100 Hz)
  • single channel (mono)
  • file length between 2 and 36 seconds (total time: 14 minutes)
  • File names:

Audio content:

The dataset is subdivided into classes because onset detection is sometimes performed in applications dedicated to a single type of signal (e.g. segmentation of a single track in a mix, drum transcription, segmentation of databases of complex mixes). The performance of each algorithm will be assessed on the whole dataset, but also on each class separately.

The dataset contains 85 files from 5 classes annotated as follows:

  • 30 solo drum excerpts cross-annotated by 3 people
  • 30 solo monophonic pitched instrument excerpts cross-annotated by 3 people
  • 10 solo polyphonic pitched instrument excerpts cross-annotated by 3 people
  • 15 complex mixes cross-annotated by 5 people

Moreover, the monophonic pitched instrument class is divided into 6 sub-classes: brass (2 excerpts), winds (4), sustained strings (6), plucked strings (9), bars and bells (4), and singing voice (5).
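Since results are reported both globally and per class, per-class aggregation could look like the following sketch; the class labels and scores used here are invented, and the per-file score could be any metric the contest settles on.

```python
# Sketch: averaging a per-file score within each dataset class, mirroring
# the per-class evaluation described above. All scores here are invented.
from collections import defaultdict

def per_class_mean(scores):
    """scores: iterable of (class_name, score) pairs -> mean score per class."""
    buckets = defaultdict(list)
    for cls, score in scores:
        buckets[cls].append(score)
    return {cls: sum(v) / len(v) for cls, v in buckets.items()}

means = per_class_mean([("drums", 0.9), ("drums", 0.7), ("complex mixes", 0.6)])
```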

Nomenclature

<AudioFileName>.wav for the audio file


2) Output data

The onset detection algorithms will return onset times in a text file: <Results of evaluated Algo path>/<AudioFileName>.output.
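A minimal sketch of producing such an output file is shown below; the onset times and the file name are placeholders, not values from the contest.

```python
# Sketch: writing detected onset times, one per line, in the format the
# contest expects. The times and the output path are placeholders.

def write_onsets(path, onset_times):
    """Write one onset time in seconds per line, newline-terminated."""
    with open(path, "w") as f:
        for t in onset_times:
            f.write("%.6f\n" % t)

write_onsets("example.output", [0.023, 0.517, 1.204])
```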


Onset file format

<onset time(in seconds)>\n

where \n denotes the end of line. The < and > characters are not included.
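To connect this format to the evaluation, here is a sketch of parsing an onset file and matching detections against reference annotations; the 50 ms tolerance window is an assumption made for this illustration, not a value given in the contest description.

```python
# Sketch: parsing an onset file (one time in seconds per line) and scoring
# it against reference annotations. The 50 ms tolerance is an assumption.

def read_onsets(path):
    """Parse one float onset time per non-empty line."""
    with open(path) as f:
        return [float(line) for line in f if line.strip()]

def match_onsets(detected, reference, tol=0.050):
    """Greedy one-to-one matching; returns (true pos., false pos., false neg.)."""
    unmatched = sorted(reference)
    tp = 0
    for t in sorted(detected):
        for i, r in enumerate(unmatched):
            if abs(t - r) <= tol:
                del unmatched[i]
                tp += 1
                break
    return tp, len(detected) - tp, len(unmatched)

tp, fp, fn = match_onsets([0.02, 0.51, 1.30], [0.023, 0.517, 1.204])  # -> 2, 1, 1
```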