2006:Symbolic Melodic Similarity
Revision as of 02:45, 12 June 2006
Overview
This page is devoted to discussions of the MIREX 06 Symbolic Melodic Similarity contest. Discussions on the MIREX 06 Symbolic Melodic Similarity contest planning list will be briefly digested on this page. A full digest of the discussions is available to subscribers from the MIREX 06 Symbolic Melodic Similarity contest planning list archives.
Task suggestion: Symbolic Melodic Similarity
Proposed tasks
1. Retrieve the most similar incipits from the UK subset of the RISM A/II collection (about 15,000 incipits), given one of the incipits as a query, and rank them by melodic similarity. Both the query and the collection are monophonic.
2. Like Task 1, but with two collections of mostly polyphonic MIDI files to be searched for matches. The query would still be monophonic. The first collection would be 10,000 MIDI files picked at random from a collection of about 60,000 MIDI files harvested from the Web; they span different genres (Western and Asian popular music, classical music, and ringtones, to name a few). The second collection would be more focused: about 1,000 .kar files (Karaoke MIDI files), mostly Western popular music, stemming from the same web harvest.
Inputs/Outputs
Task 1: Input: about 15,000 MIDI files containing mostly monophonic incipits, plus a MIDI file containing the monophonic query. Expected output: a list of the names of the X most similar matching MIDI files, ordered by similarity. (The value of X is to be decided.)
Task 2: Input: about 10,000 mostly polyphonic MIDI files (or 1,000 Karaoke files), plus a MIDI file containing a monophonic query. Expected output: a list of the X most similar file names, ordered by similarity, plus, for each file, the times (offsets from the beginning, in seconds) where the match begins and where it ends.
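For illustration, here is a minimal sketch of how a submission might serialize such a result list, assuming a tab-separated plain-text format; the field layout, file names, and the write_results helper are hypothetical, since the proposal does not fix an output format:

```python
# Hypothetical result writer for Task 2. The tab-separated layout
# (query, rank, file name, match start/end in seconds) is an assumption,
# not a format fixed by the proposal.

def write_results(query_id, ranked_matches, out_path, top_x=10):
    """ranked_matches: list of (file_name, start_sec, end_sec) tuples,
    already ordered from most to least similar."""
    with open(out_path, "w") as out:
        for rank, (name, start, end) in enumerate(ranked_matches[:top_x], 1):
            out.write(f"{query_id}\t{rank}\t{name}\t{start:.2f}\t{end:.2f}\n")

# Example with made-up file names and offsets:
write_results("query01.mid",
              [("pop/song123.mid", 12.50, 19.75),
               ("kar/tune042.kar", 0.00, 6.30)],
              "query01.results.txt")
```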
Building the ground truth
Unlike last year, it is nearly impossible to build a proper ground truth manually in advance.
Suggestion: Pool the top M results of all participating algorithms and let every participant judge the relevance of the matches for some queries. To make that a manageable burden, it is important that for Task 2 the algorithms return not only the names of the matching MIDI files, but also where the match starts and ends within each file. We can then automatically extract those matching segments into small new MIDI files whose relevance can be checked quickly.
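As a rough sketch of the pooling step, assuming each participant's results are available as ranked lists per query (the run/query data structure below is hypothetical):

```python
# Pool the top M results across all participating algorithms: for each
# query, take the union of every algorithm's top M matches, so each pooled
# item is judged only once no matter how many systems returned it.

def pool_top_m(runs, m=10):
    """runs: {algorithm_name: {query_id: ranked list of file names}}.
    Returns {query_id: set of file names to be judged}."""
    pools = {}
    for ranked_by_query in runs.values():
        for query_id, ranked in ranked_by_query.items():
            pools.setdefault(query_id, set()).update(ranked[:m])
    return pools

# Example with two made-up runs:
runs = {"algoA": {"q1": ["a.mid", "b.mid", "c.mid"]},
        "algoB": {"q1": ["b.mid", "d.mid"]}}
print(pool_top_m(runs, m=2))  # {'q1': {'a.mid', 'b.mid', 'd.mid'}}
```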
Measures
Use the same measures as last year (https://www.music-ir.org/evaluation/mirex-results/sym-melody/index.html) to compare the search results of the various algorithms.
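For illustration, here is a sketch of non-interpolated average precision, one typical rank-based retrieval measure of this kind; whether it exactly matches last year's measure set should be checked against the linked results page:

```python
# Non-interpolated average precision for one query: the mean of the
# precision values at the ranks where relevant items occur. Illustrative
# only; see the linked 2005 results page for the measures actually used.

def average_precision(ranked, relevant):
    """ranked: list of file names in result order.
    relevant: set of file names judged relevant."""
    hits, precision_sum = 0, 0.0
    for rank, name in enumerate(ranked, 1):
        if name in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

print(average_precision(["a.mid", "x.mid", "b.mid"], {"a.mid", "b.mid"}))
# (1/1 + 2/3) / 2 = 0.833...
```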
Potential participants
The following people have confirmed their interest:
- Klaus Frieler
- Nicola Orio
- Kjell Lemström
- Rainer Typke
Nicola Orio
- Pool the first M results retrieved by each participant (as in the proposal), with binary relevance judgments; if not binary, the judgments should at least be quantized into a small number of classes (such as 0-5 or 0-3).
- Identifying the part of the melody to be presented to the evaluator is possible (and makes sense) only for local-alignment approaches, not for approaches based on more global properties of the melodies. I'm afraid this difference in presenting the retrieval results would bias the assessments, so I suggest using complete melodies when pooling the results.
Kjell Lemström
- When evaluating the time that algorithms spend on the task, the time spent by the system should be excluded from the algorithms' running times (this was not the case last year).
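A sketch of how a run could be timed so that only the matching phase is reported, along the lines of this suggestion; load_collection and search are hypothetical placeholders for a participant's own code:

```python
import time

# Hypothetical timing harness: only the search/matching phase is timed;
# loading the collection (system/I-O work) is excluded, per the suggestion
# above. load_collection() and search() stand in for a participant's code.

def timed_run(query, collection_path, load_collection, search):
    collection = load_collection(collection_path)  # excluded from timing
    start = time.perf_counter()
    results = search(query, collection)            # only this is timed
    elapsed = time.perf_counter() - start
    return results, elapsed

# Example with trivial stand-ins:
results, secs = timed_run("q", None,
                          lambda path: ["b.mid", "a.mid"],
                          lambda q, coll: sorted(coll))
print(results, f"{secs:.6f} s")
```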