2007:Audio Drum Detection
Note
The material below is largely taken from the 2005 page.
Proposer
2005: Koen Tanghe (Ghent University). 2007 resurrection: Jouni Paulus (Tampere University of Technology).
Participants
Jouni Paulus (Tampere University of Technology) jouni[dot]paulus[at]tut[dot]fi
Description
The task consists of determining the positions (localization) and corresponding drum class names (labeling) of drum events in polyphonic music. This rhythmic information is of particular interest for today's popular music genres: it can help in determining tempo and (sub)genre, and it can also be queried directly (for example, for typical rhythmic sequences or patterns).
1) Input data
The only input for this task is a set of sound file excerpts adhering to the format and content requirements mentioned below.
Audio format:
- CD-quality (PCM, 16-bit, 44100 Hz)
- mono and stereo
- 30-second excerpts (longer excerpts of whole pieces?)
- files are named "001.wav" to "999.wav" (or with another extension depending on the chosen format)
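The audio format requirements above can be checked mechanically before a run. The following is a minimal sketch, assuming the excerpts are delivered as WAV files; the function name check_excerpt is hypothetical and not part of the task definition. It uses only Python's standard wave module and does not enforce the 30-second length, since the excerpt length is still an open question above.

```python
import wave


def check_excerpt(source):
    """Return a list of format problems for one WAV excerpt (empty if none).

    `source` may be a file path or an open binary file object.
    Hypothetical helper: checks 16-bit PCM, 44100 Hz, mono or stereo.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        # Sample width is reported in bytes; 16-bit audio is 2 bytes.
        if wav.getsampwidth() != 2:
            problems.append(f"expected 16-bit samples, got {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() != 44100:
            problems.append(f"expected 44100 Hz, got {wav.getframerate()} Hz")
        if wav.getnchannels() not in (1, 2):
            problems.append(f"expected mono or stereo, got {wav.getnchannels()} channels")
    return problems
```

Calling check_excerpt("001.wav") on a conforming file would return an empty list; any non-empty result names the requirement that was not met.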
Audio content:
- polyphonic music with drums (most)
- polyphonic music without drums (some)
- different genres / playing styles
- both live performances and sequenced music
- different types of drum sets (acoustic, electronic, ...)
- at least 50 files
- participants receive a representative subset in advance
2) Output results
The output of this task is, for each sound file, an ASCII text file containing 2 columns, where each line represents a drum event. The first column is the position (in seconds) of the drum event, and the second column is the label for the drum event at that position. Multiple drum events may occur at the same time, so there may be multiple lines having the same value in the first column. The file names of the output files are the same as the audio files, but the extension is ".txt" (so: "001.txt" for "001.wav").
Classes and labels that are considered:
- BD (bass drum)
- SD (snare drum)
- HH (hihat)
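As an illustration of the output format, here is a minimal sketch of how a submission might serialize its detected events. The function names output_name and format_events are hypothetical. The column separator and the number of decimals are not specified on this page, so the tab character and millisecond precision used here are assumptions.

```python
from pathlib import Path

# The three drum classes considered in this task.
LABELS = ("BD", "SD", "HH")


def output_name(audio_name):
    """Map an audio file name to its result file name, e.g. "001.wav" to "001.txt"."""
    return str(Path(audio_name).with_suffix(".txt"))


def format_events(events):
    """Render (time_in_seconds, label) pairs as two-column text, one event per line.

    Assumptions: columns are tab-separated and times are written with three decimals.
    Simultaneous events simply produce several lines with the same time.
    """
    lines = []
    for time, label in sorted(events):
        if label not in LABELS:
            raise ValueError(f"unknown drum label: {label!r}")
        lines.append(f"{time:.3f}\t{label}\n")
    return "".join(lines)
```

For example, format_events([(0.5, "SD"), (0.5, "BD"), (0.0, "HH")]) yields three lines, with the bass drum and snare drum sharing the time 0.500, and the text would be written to the file named by output_name("001.wav").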