Difference between revisions of "2008:Query by Tapping"

From MIREX Wiki

Revision as of 16:30, 18 August 2008

Overview

The main purpose of QBT (Query by Tapping) is to evaluate MIR systems on retrieving ground-truth MIDI files from a tapped rhythm. This task provides query rhythm files in both WAV and CSV (symbolic) formats. The evaluation database and query files can be downloaded from http://210.68.135.13/ki/QBT.rar (revised on 2008/8/18).

Task description

  • Test database: 129 ground-truth mono MIDI files.
  • WAV queries: 272 query files to retrieve 129 known targets from the collection. So far, each target song has been tapped by 1 to 6 human assessors, who listened to the song and tapped a 15-second query rhythm from its beginning.
  • CSV queries: 129 query files in CSV format, derived from the original MIDI files. Each file contains the time intervals (in ms) between the first 25 NoteOn events, separated by commas. Rest notes and time intervals shorter than 100 ms are discarded.
  • Evaluation: Mean reciprocal rank (MRR). Return the top 10 candidates for each query file.
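
The CSV rule above can be sketched as follows. This is a minimal Python illustration, not the organizers' extraction tool: the function names are hypothetical, onsets are assumed to be given in milliseconds, and the order of operations (take the first 25 onsets, then drop intervals under 100 ms) is only one possible reading of the description.

```python
def onset_intervals_ms(onsets_ms, max_onsets=25, min_interval_ms=100):
    """Inter-onset intervals (ms) for the first `max_onsets` onsets.

    Assumption: intervals shorter than `min_interval_ms` are simply dropped.
    """
    first = sorted(onsets_ms)[:max_onsets]
    intervals = [b - a for a, b in zip(first, first[1:])]
    return [iv for iv in intervals if iv >= min_interval_ms]


def to_csv_line(intervals):
    """Serialize intervals as one comma-separated line of integer ms values."""
    return ",".join(str(int(round(iv))) for iv in intervals)


def parse_csv_line(line):
    """Read one comma-separated line of intervals back into floats."""
    return [float(tok) for tok in line.strip().split(",") if tok.strip()]
```

For example, onsets at 0, 500, 550 and 1050 ms give the intervals 500, 50 and 500 ms, and the 50 ms interval is dropped under this reading.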

Data processing proposal for calling formats

Indexing the MIDI collection

Indexing_exe <var1> <var2>

where

<var1>==<directory_of_MIDIs> 
<var2>==<indexing_files_output_and_working_directory>

Running the query files

Running_exe <var3> <var4> <var5>

where

<var3>==<directory_of_indexed_file> 
<var4>==<directory_of_query_rhythm_files> 
<var5>==<answer_file_output_directory> 
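
One way a submission could accept the three arguments above is sketched below in Python. This is an illustration only: the parser and argument names are hypothetical, and the calling format itself does not prescribe any implementation language.

```python
import argparse


def build_query_parser():
    """Positional arguments mirroring <var3>, <var4> and <var5> above."""
    parser = argparse.ArgumentParser(prog="Running_exe")
    parser.add_argument("index_dir", help="directory of indexed files (<var3>)")
    parser.add_argument("query_dir", help="directory of query rhythm files (<var4>)")
    parser.add_argument("answer_dir", help="answer file output directory (<var5>)")
    return parser


def parse_query_args(argv):
    """Parse an argument list such as ['idx', 'queries', 'answers']."""
    return build_query_parser().parse_args(argv)
```

Calling `parse_query_args(["idx", "queries", "answers"])` returns an object whose `index_dir`, `query_dir` and `answer_dir` attributes hold the three paths in the order given on the command line.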

Data processing for output answer file formats

The answer file for each run would look like:

Q0001:0003,0567,0999,<insert X more responses>,XXXX
Q0002:0103,0567,0998,<insert X more responses>,XXXX
Q000X:0002,0567,0999,<insert X more responses>,XXXX

Each line corresponds to one query in a given task run.
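
A minimal Python sketch of reading such an answer file and scoring it with mean reciprocal rank, counting only the top 10 candidates per query. The `truth` mapping from query ID to target ID is an assumption, since this page does not say how the ground truth is stored, and the function names are hypothetical.

```python
def parse_answer_line(line):
    """Split 'Q0001:0003,0567,...' into ('Q0001', ['0003', '0567', ...])."""
    query_id, _, ranked = line.strip().partition(":")
    candidates = [c.strip() for c in ranked.split(",") if c.strip()]
    return query_id, candidates


def mean_reciprocal_rank(answers, truth, top_n=10):
    """Average of 1/rank of the correct target; 0 if it is not in the top_n."""
    total = 0.0
    for query_id, target in truth.items():
        ranked = answers.get(query_id, [])[:top_n]
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(truth) if truth else 0.0


def score_answer_lines(lines, truth, top_n=10):
    """Parse all non-empty answer lines and return the MRR against `truth`."""
    answers = dict(parse_answer_line(ln) for ln in lines if ln.strip())
    return mean_reciprocal_rank(answers, truth, top_n)
```

With a target ranked first for one query and third for another, the reciprocal ranks are 1 and 1/3, so the MRR is 2/3.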

Submission closing date

22nd August 2008.

Interested Participants

  • Shu-Jen Show Hsiao (show.cs95g at nctu.edu.tw)
  • Rainer Typke: I would be interested if the query data can also be made available in symbolic form so we can see what part of the performance comes from good onset detection from audio, and what comes from a good matching algorithm.