2007:Audio Drum Detection

From MIREX Wiki
Revision as of 01:14, 6 June 2007 by Paulus

Note

The material below is largely taken from the 2005 page.

Proposer

2005: Koen Tanghe (Ghent University). 2007 resurrection: Jouni Paulus (Tampere University of Technology).

Participants

Jouni Paulus (Tampere University of Technology) jouni[dot]paulus[at]tut[dot]fi

Description

The task consists of determining the positions (localization) and the corresponding drum class names (labeling) of drum events in polyphonic music. This rhythmic information is especially relevant for today's popular music genres: it can help in determining tempo and (sub)genre, and it can also be queried directly (for example, by searching for typical rhythmic sequences or patterns).

1) Input data

The only input for this task is a set of sound file excerpts adhering to the format and content requirements listed below.

Audio format:

  • CD-quality (PCM, 16-bit, 44100 Hz)
  • mono and stereo
  • 30-second excerpts (longer excerpts of whole pieces?)
  • files are named "001.wav" to "999.wav" (or with another extension, depending on the chosen format)
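
As a minimal sketch of how a participant might check a received excerpt against the audio format listed above, the following reads a WAV header with Python's standard `wave` module. The function name `describe_excerpt` and the keys of the returned dictionary are illustrative assumptions, not part of the task definition, and the sketch only covers uncompressed PCM WAV files.

```python
import wave


def describe_excerpt(path):
    """Read a WAV header and report whether it matches the listed audio format.

    Hypothetical helper: the name and the returned keys are assumptions
    made for illustration only.
    """
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample; 2 bytes = 16 bits
        rate = wav.getframerate()
        frames = wav.getnframes()
    return {
        "channels": channels,
        "bits": sample_width * 8,
        "rate_hz": rate,
        "duration_s": frames / rate if rate else 0.0,
        # CD quality as listed above: 16-bit PCM, 44100 Hz, mono or stereo
        "cd_quality": sample_width == 2 and rate == 44100 and channels in (1, 2),
    }
```

The duration is reported rather than enforced, since the excerpt length is still marked as an open question in the list above.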

Audio content:

  • polyphonic music with drums (most)
  • polyphonic music without drums (some)
  • different genres / playing styles
  • both live performances and sequenced music
  • different types of drum sets (acoustic, electronic, ...)
  • at least 50 files
  • participants receive a representative subset in advance

2) Output results

The output of this task is, for each sound file, an ASCII text file containing 2 columns, where each line represents a drum event. The first column is the position (in seconds) of the drum event, and the second column is the label for the drum event at that position. Multiple drum events may occur at the same time, so there may be multiple lines having the same value in the first column. The file names of the output files are the same as those of the audio files, but the extension is ".txt" (so: "001.txt" for "001.wav").

Classes and labels that are considered:

  • BD (bass drum)
  • SD (snare drum)
  • HH (hihat)
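
The two-column output format and the file naming rule can be illustrated with a short sketch. The task description does not specify a column separator, a time precision, or a line ordering, so the tab separator, the three decimals, and the sorting by time used here are assumptions; the helper names `output_name`, `format_events`, and `parse_events` are likewise hypothetical.

```python
import os

# Labels considered in this task: bass drum, snare drum, hihat
LABELS = ("BD", "SD", "HH")


def output_name(audio_name):
    """Map an audio file name such as "001.wav" to "001.txt"."""
    return os.path.splitext(audio_name)[0] + ".txt"


def format_events(events):
    """Render (time_in_seconds, label) pairs as two-column text.

    Assumptions: tab-separated columns, three decimals, sorted by time.
    Simultaneous events simply produce several lines with the same time.
    """
    lines = []
    for time_s, label in sorted(events):
        if label not in LABELS:
            raise ValueError(f"unknown drum label: {label}")
        lines.append(f"{time_s:.3f}\t{label}\n")
    return "".join(lines)


def parse_events(text):
    """Parse two-column text back into (time_in_seconds, label) pairs."""
    events = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        time_str, label = line.split()
        events.append((float(time_str), label))
    return events
```

For example, `format_events([(0.5, "SD"), (0.5, "BD")])` yields two lines that share the time 0.500, one for each label, and the result would be written to the file named by `output_name("001.wav")`.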