Difference between revisions of "2010:Audio Melody Extraction"
From MIREX Wiki
Revision as of 14:04, 23 March 2010
Description
The aim of the MIREX audio melody extraction evaluation is to identify the melody pitch contour from polyphonic musical audio.
The task consists of two parts:
- Voicing detection (deciding whether a particular time frame contains a "melody pitch" or not),
- Pitch detection (deciding the most likely melody pitch for each time frame).
We structure the submission so that these parts can be done independently: by reporting a negative pitch value, an algorithm can still guess a pitch for a frame that it judges unvoiced. Algorithms that do not discriminate between melodic and non-melodic parts are also welcome!
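As an illustration of how the two parts can be read back from a single signed value, here is a minimal Python sketch. The function name `split_voicing_and_pitch` is hypothetical and not part of any MIREX tooling; it only assumes the convention described above (positive value: voiced; negative value: unvoiced with a pitch guess; 0: unvoiced with no guess).

```python
def split_voicing_and_pitch(signed_hz):
    """Split one reported value into (voiced, pitch_hz).

    Assumed convention, taken from the task description:
      signed_hz > 0  -> frame judged voiced, pitch = signed_hz
      signed_hz < 0  -> frame judged unvoiced, pitch guess = -signed_hz
      signed_hz == 0 -> frame judged unvoiced, no pitch guess
    """
    voiced = signed_hz > 0
    pitch_hz = abs(signed_hz)  # 0.0 means "no pitch guess"
    return voiced, pitch_hz


# Three example frames: voiced, unvoiced with a guess, unvoiced without one.
frames = [220.0, -196.0, 0.0]
decoded = [split_voicing_and_pitch(f) for f in frames]
```

With this separation, the voicing decision and the pitch estimate of each frame can be inspected independently of each other.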
Discussions for 2010
Discussions from 2009
https://www.music-ir.org/mirex/2009/index.php/Audio_Melody_Extraction#Discussions_for_2009
Dataset
- MIR-1K database: dataset for MIREX, karaoke recordings of Chinese songs. Instruments: singing voice (male, female), synthetic accompaniment.
- MIREX08 database : 4 excerpts of 1 min. from "north Indian classical vocal performances", instruments: singing voice (male, female), tanpura (Indian instrument, perpetual background drone), harmonium (secondary melodic instrument) and tablas (pitched percussions).
- MIREX05 database : 25 phrase excerpts of 10-40 sec from the following genres: Rock, R&B, Pop, Jazz, Solo classical piano.
- ISMIR04 database : 20 excerpts of about 20s each.
- CD-quality (PCM, 16-bit, 44100 Hz)
- single channel (mono)
- manually annotated reference data (10 ms time grid)
Output Format
- In order to allow for generalization among potential approaches (e.g. frame size, hop size, etc.), submitted algorithms should output pitch estimates, in Hz, at discrete instants in time.
- Each line of the output file therefore contains the time stamp [space or tab] the corresponding frequency value [new line].
- the time grid of the reference file is 10 ms, yet the submission may use a different time grid as output (for example 5.8 ms)
- Instants which are identified as unvoiced (there is no dominant melody) can be reported either as 0 Hz or as a negative pitch value. If negative pitch values are given, the statistics for Raw Pitch Accuracy and Raw Chroma Accuracy may be improved.
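The output format described above could be produced along the following lines. This is only a sketch under stated assumptions: the function name `format_melody_lines` is hypothetical, time stamps are assumed to be in seconds (the unit is not stated above), and the number of decimals is an arbitrary choice.

```python
def format_melody_lines(times_s, pitches_hz, voiced_flags):
    """Build the text of an output file, one frame per line.

    Each line is: time stamp, tab, frequency value, newline.
    Unvoiced frames with a pitch guess are written as negative values;
    unvoiced frames without a guess are written as 0.
    Assumptions: time stamps in seconds, tab as the separator.
    """
    lines = []
    for t, f0, voiced in zip(times_s, pitches_hz, voiced_flags):
        if voiced:
            value = f0
        elif f0 > 0:
            value = -f0   # unvoiced, but keep the pitch guess
        else:
            value = 0.0   # unvoiced, no guess
        lines.append(f"{t:.4f}\t{value:.2f}")
    return "\n".join(lines) + "\n"


# Example with a 5.8 ms time grid, which differs from the 10 ms reference grid.
text = format_melody_lines(
    [0.0, 0.0058, 0.0116],
    [220.0, 196.0, 0.0],
    [True, False, False],
)
```

The resulting string can be written to a file as-is; the second line carries a negative value, i.e. a pitch guess for a frame judged unvoiced.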
Relevant Development Collections
- Graham's collection: you can find the test set here, and further explanations on the pages http://www.ee.columbia.edu/~graham/mirex_melody/ and http://labrosa.ee.columbia.edu/projects/melody/
- For the ISMIR 2004 Audio Description Contest, the Music Technology Group of the Pompeu Fabra University assembled a diverse set of audio segments and corresponding melody transcriptions, including audio excerpts from such genres as Rock, R&B, Pop, Jazz, Opera, and MIDI. (full test set with the reference transcriptions (28.6 MB))