2010:Symbolic Music Similarity and Retrieval
Description
Given a query, retrieve the most similar items from a collection of symbolic documents and rank them by melodic similarity. There is only one task this year, monophonic to monophonic: both the query and the documents in the collection are monophonic.
Each system will be given a query and must return the 10 most melodically similar songs from the Essen Collection (5274 pieces in MIDI format; see the ESAC Data Homepage for more information). For each query, we made four classes of error mutations, so the query set comprises the following classes:
- 0. No errors
- 1. One note deleted
- 2. One note inserted
- 3. One interval enlarged
- 4. One interval compressed
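As an illustration only, the four mutation classes can be sketched on a melody represented as a list of MIDI pitch numbers. The function names, the list representation, and the choice to transpose all following notes so that exactly one interval changes are assumptions made for this sketch; the page does not specify how the task queries were actually generated.

```python
from typing import List


def delete_note(pitches: List[int], i: int) -> List[int]:
    """Class 1: remove the note at index i."""
    return pitches[:i] + pitches[i + 1:]


def insert_note(pitches: List[int], i: int, pitch: int) -> List[int]:
    """Class 2: insert a new note with the given pitch before index i."""
    return pitches[:i] + [pitch] + pitches[i:]


def resize_interval(pitches: List[int], i: int, amount: int) -> List[int]:
    """Classes 3 and 4: change the interval from note i-1 to note i.

    amount > 0 enlarges the interval by that many semitones and
    amount < 0 compresses it. Every note from index i onward is
    transposed by the same shift, so all other intervals are kept
    (an assumption of this sketch). A large compression can flip
    the direction of the interval; that case is not guarded here.
    """
    if not 1 <= i < len(pitches):
        raise IndexError("i must point at a note that has a predecessor")
    interval = pitches[i] - pitches[i - 1]
    direction = 1 if interval >= 0 else -1
    shift = direction * amount
    return pitches[:i] + [p + shift for p in pitches[i:]]


melody = [60, 62, 64, 65, 67]            # C D E F G
print(delete_note(melody, 2))            # [60, 62, 65, 67]
print(insert_note(melody, 2, 63))        # [60, 62, 63, 64, 65, 67]
print(resize_interval(melody, 2, 1))     # [60, 62, 65, 66, 68]
print(resize_interval(melody, 2, -1))    # [60, 62, 63, 64, 66]
```

Changing a single pitch without transposing the tail would alter two intervals at once, which is why the sketch shifts the remainder of the melody.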
Task Specific Mailing List
You can subscribe to this list to participate in the discussion.
Data
- 5,274 tunes belonging to the Essen folksong collection. The tunes are in standard MIDI file format. Download (< 1 MB)
Evaluation
The same method for building the ground truth as last year will be used. This method has the advantage that no ground truth needs to be built in advance. After the algorithms have been submitted, their results are pooled for every query, and human evaluators are asked to judge the relevance of the matches for some queries.
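The pooling step described above can be sketched roughly as follows. The data layout (system name mapped to query name mapped to a ranked list of file names) and the function name `pool_results` are hypothetical; the sketch only shows the idea of merging every system's returned matches per query into one candidate set for the human evaluators.

```python
from typing import Dict, List, Set


def pool_results(runs: Dict[str, Dict[str, List[str]]]) -> Dict[str, List[str]]:
    """Merge the ranked lists of all systems into one pool per query.

    runs maps system name -> query name -> ranked list of file names.
    The result maps query name -> sorted list of unique candidates.
    """
    pools: Dict[str, Set[str]] = {}
    for per_query in runs.values():
        for query, ranked in per_query.items():
            pools.setdefault(query, set()).update(ranked)
    return {query: sorted(candidates) for query, candidates in pools.items()}


# Hypothetical results from two systems for one query.
runs = {
    "systemA": {"q1.mid": ["a.mid", "b.mid", "c.mid"]},
    "systemB": {"q1.mid": ["b.mid", "d.mid", "a.mid"]},
}
print(pool_results(runs))  # {'q1.mid': ['a.mid', 'b.mid', 'c.mid', 'd.mid']}
```

Each pooled candidate needs to be judged only once per query, regardless of how many systems returned it.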
Submission Format
Input
Parameters:
- the name of a directory holding about 5,000 MIDI files of monophonic folk songs, and
- the name of one MIDI file containing a monophonic query.
The program will be called once for each query.
Expected output
A list of the names of the 10 most similar matching MIDI files, ordered by melodic similarity. Write each file name on a separate line, with no empty lines in between.
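A minimal sketch of a program following this calling convention is shown below, assuming the two parameters arrive as positional command-line arguments and that collection files end in `.mid` or `.midi`. The names `top_matches`, `main`, and `placeholder_similarity` are hypothetical, and the placeholder score (closeness of file sizes) is not a melodic measure; it stands in for whatever similarity algorithm a submission actually implements.

```python
import os
import sys
from typing import Callable, List


def placeholder_similarity(query_path: str, candidate_path: str) -> float:
    """Stand-in score, NOT a melodic measure: closer file sizes score higher."""
    return -abs(os.path.getsize(query_path) - os.path.getsize(candidate_path))


def top_matches(
    collection_dir: str,
    query_path: str,
    score: Callable[[str, str], float] = placeholder_similarity,
    k: int = 10,
) -> List[str]:
    """Return the names of the k best-scoring MIDI files, best first."""
    names = sorted(
        name for name in os.listdir(collection_dir)
        if name.lower().endswith((".mid", ".midi"))
    )
    ranked = sorted(
        names,
        key=lambda name: score(query_path, os.path.join(collection_dir, name)),
        reverse=True,
    )
    return ranked[:k]


def main(argv: List[str]) -> int:
    """Expects argv = [program, collection directory, query file]."""
    if len(argv) != 3:
        print("usage: retrieve <collection_dir> <query.mid>", file=sys.stderr)
        return 2
    for name in top_matches(argv[1], argv[2]):
        print(name)  # one file name per line, no blank lines
    return 0
```

A script would call `sys.exit(main(sys.argv))` under the usual `if __name__ == "__main__":` guard. Because the program is called once per query, any index over the collection would have to be rebuilt or loaded from disk on every call.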
Packaging submissions
All submissions should be statically linked to all libraries (the presence of dynamically linked libraries cannot be guaranteed).
All submissions should include a README file containing the following information:
- Command line calling format for all executables and an example formatted set of commands
- Number of threads/cores used or whether this should be specified on the command line
- Expected memory footprint
- Expected runtime
- Any required environments (and versions), e.g. Python, Java, Bash, MATLAB.