2009:Audio Cover Song Identification
2009 AUDIO COVER SONG IDENTIFICATION TASK OVERVIEW
The Audio Cover Song task was introduced at MIREX 2006. It was closely related to the Audio Music Similarity and Retrieval (AMS) task, as the cover songs were embedded in the Audio Music Similarity and Retrieval test collection. However, AMS has changed its input format this year, so Audio Cover Song and AMS will not be interlinked tasks this year.
Task Description
Within the 1000 pieces in the Audio Cover Song database are embedded 30 different "cover songs", each represented by 11 different "versions", for a total of 330 audio files (16-bit, monophonic, 22.05 kHz, WAV). The "cover songs" represent a variety of genres (e.g., classical, jazz, gospel, rock, folk-rock, etc.), and the variations span a variety of styles and orchestrations.
Using each of these cover song files in turn as the "seed/query" file, we will examine the returned lists of items for the presence of the other 10 versions of the "seed/query" file.
Command Line Calling Format
$ /path/to/submission <collection_list_file> <query_list_file> <working_directory> <output_file>

<collection_list_file>: Text file containing 1000 full path file names for the 1000 audio files in the collection (including the 330 query documents). Example: /path/to/coversong/collection.txt

<query_list_file>: Text file containing the 330 full path file names for the 330 query documents. Example: /path/to/coversong/queries.txt

<working_directory>: Full path to a temporary directory where the submission will have write access for caching features or calculations. Example: /tmp/submission_id/

<output_file>: Full path to the file where the submission should output the distance matrix (1000 header rows + 330 x 1000 data matrix). Example: /path/to/coversong/results/submission_id.txt

(A sketch of parsing this calling format follows the input file examples below.)
Input Files
The collection lists file format will be of the form:
/path/to/audio/file/000.wav\n
/path/to/audio/file/001.wav\n
/path/to/audio/file/002.wav\n
... * 996 rows omitted * ...
/path/to/audio/file/999.wav\n
The query lists file format will be of the form:
/path/to/audio/file/182.wav\n
/path/to/audio/file/245.wav\n
/path/to/audio/file/432.wav\n
... * 326 rows omitted * ...
/path/to/audio/file/973.wav\n
There are 330 rows in total; the query IDs are randomly assigned from the pool of 1000 collection IDs.
Lines will be terminated by a '\n' character.
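For illustration, here is a minimal Python sketch of a submission skeleton that parses the calling format above and reads these list files. The function names and layout are assumptions for this sketch, not a required structure:

#!/usr/bin/env python
# Minimal submission skeleton (sketch): parse the four command line arguments
# and read the newline-terminated list files described above.
import sys

def read_list(path):
    # One full file path per '\n'-terminated line.
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def main():
    collection_list_file, query_list_file, working_directory, output_file = sys.argv[1:5]
    collection = read_list(collection_list_file)  # 1000 paths
    queries = read_list(query_list_file)          # 330 paths, a subset of the collection
    assert len(collection) == 1000 and len(queries) == 330
    # ... extract features for each file (caching under working_directory),
    # compute the 330 x 1000 distance matrix, and write it to output_file ...

if __name__ == "__main__":
    main()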
Output File
The only output will be a distance matrix file that is 330 rows by 1000 columns in the following format:
Example distance matrix 0.1 (replace this line with your system name)
1   path/to/audio/file/1.wav
2   path/to/audio/file/2.wav
3   path/to/audio/file/3.wav
...
N   path/to/audio/file/N.wav
Q/R   1         2         3        ...   N
1     0.0       1.241     0.2e-4   ...   0.4255934
2     1.241     0.000     0.6264   ...   0.2356447
3     50.2e-4   0.6264    0.0000   ...   0.3800000
...   ...       ...       ...      ...   0.7172300
5     0.42559   0.23567   0.38     ...   0.000
All distances should be zero or positive (0.0+) and should not be infinite or NaN. Values should be separated by a TAB.
As N (the number of collection tracks searched for covers) is 1000 and there are 330 query tracks, the distance matrix file should begin with 1000 rows of file paths and should then contain 330 rows (one for each query track) of 1000 tab-separated distance values. Each row corresponds to a particular query song (the track to find covers of). Please ensure that the query songs are listed in exactly the same order as they appear in the query list file your submission is passed.
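As an illustration, a minimal Python sketch of writing a file in this format; collection, queries and dist are hypothetical names for the 1000 collection paths, the 330 query paths (in query-list order) and a submission's non-negative distance function:

# Sketch: write the distance matrix file in the format described above.
def write_matrix(output_file, collection, queries, dist):
    with open(output_file, "w") as out:
        out.write("MySystem 0.1\n")  # first line: replace with your system name
        # 1000 header rows mapping column numbers to collection file paths
        for i, path in enumerate(collection, start=1):
            out.write("%d\t%s\n" % (i, path))
        # column header row
        out.write("Q/R\t" + "\t".join(str(i) for i in range(1, len(collection) + 1)) + "\n")
        # 330 data rows of tab-separated, non-negative, finite distances
        for qi, q in enumerate(queries, start=1):
            out.write("%d\t%s\n" % (qi, "\t".join("%.6f" % dist(q, c) for c in collection)))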
Evaluation
We could employ the same measures used in 2006:Audio Cover Song.
... Should it be 2007? --Jserra 03:48, 25 July 2008 (CDT)
... Yes, it should - a number of additional evaluation metrics were introduced in 2007; they include (a computation sketch follows this discussion):
- Total number of covers identified in top 10
- Mean number of covers identified in top 10 (average performance)
- Mean (arithmetic) of Avg. Precisions
- Mean rank of first correctly identified cover
--Kriswest 07:02, 18 August 2008 (CDT)
Any way to get evaluation databases from 2006 and 2007? --Gene Linetsky
For copyright reasons IMIRSEL does not distribute its evaluation databases. Further, keeping the evaluation dataset (somewhat) closed enables it to be used year on year, with less chance of any system being particularly over-fitted to it and thereby achieving inflated performance estimates. --Kriswest 07:02, 18 August 2008 (CDT)
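For concreteness, here is a minimal Python sketch of how these 2007-style metrics could be computed from a submission's 330 x 1000 distance matrix. The ground-truth mapping clique_of, the decision to exclude a query from its own ranked list, and all names below are assumptions for illustration, not the official IMIRSEL evaluation code:

# Sketch: 2007-style cover song metrics from a 330 x 1000 distance matrix.
# distances[i][c] is the distance from the i-th query to collection item c;
# query_ids[i] is that query's index in the collection; clique_of[c] maps a
# collection index to its cover-song group (30 groups of 11 versions each).
def evaluate(distances, query_ids, clique_of):
    total_top10 = 0
    avg_precisions = []
    first_ranks = []
    for q, row in zip(query_ids, distances):
        # rank collection items by ascending distance, excluding the query itself
        ranked = sorted((c for c in range(len(row)) if c != q), key=lambda c: row[c])
        # 1-based ranks of the 10 other versions of the query's song
        relevant = [rank for rank, c in enumerate(ranked, start=1)
                    if clique_of[c] == clique_of[q]]
        total_top10 += sum(1 for r in relevant if r <= 10)
        # average precision: mean of precision at each relevant rank
        avg_precisions.append(sum(k / float(r) for k, r in enumerate(relevant, start=1))
                              / len(relevant))
        first_ranks.append(relevant[0])
    n = float(len(query_ids))
    return {
        "total covers in top 10": total_top10,
        "mean covers in top 10": total_top10 / n,
        "mean average precision": sum(avg_precisions) / n,
        "mean rank of first cover": sum(first_ranks) / n,
    }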
Potential Participants
- Joan Serrà, Emilia Gómez & Perfecto Herrera
- Alexey Egorov (CBMS Networks) <-- made the submission
- Chuan Cao and Ming Li (ThinkIT Lab., IOA), ccao <at> hccl.ioa.ac.cn, mli <at> hccl.ioa.ac.cn