Difference between revisions of "2014:Query by Tapping"

From MIREX Wiki

Revision as of 12:58, 8 May 2014

Overview

The text of this section is copied from the 2013 page. Please add your comments and discussions for 2014.

The main purpose of QBT (Query by Tapping) is to evaluate MIR systems in retrieving ground-truth MIDI files from queries made by tapping the onsets of music notes into a microphone. This task provides query files in wave format as well as the corresponding human-labeled onset times in symbolic format. For this year's QBT task, we have three corpora for evaluation:

  • Roger Jang's MIR-QBT: This dataset contains both wav files (recorded via microphone) and onset files (human-labeled onset times).
    • 890 onset & .wav queries; 136 ground-truth MIDI files
  • Show Hsiao's QBT_symbolic: This dataset contains only onset files (obtained from users tapping on a keyboard).
    • 410 onset queries; 143 ground-truth MIDI files (128 of which have at least one query)
  • CCRMA's QBT-Extended: This dataset contains only onset files (obtained from users tapping on a touchscreen). Documentation can be found here.
    • 3,365 onset queries (1,412 from long-term memory and 1,953 from short-term memory); 51 ground-truth MIDI files

Task description

Subtask 1: QBT with symbolic input

  • Evaluation is performed separately on each dataset
  • Test database: The set of ground-truth MIDI files corresponding to each dataset.
  • Query files: Text files of onset times used to retrieve the target MIDIs. These onset files let participants concentrate on similarity matching instead of onset detection. Onset files derived from .wav files are not guaranteed to be perfect detections of the onsets in the original wav query files.
  • Evaluation: Return top 10 candidates for each query file. 1 point is scored for a hit in the top 10 and 0 is scored otherwise (Top-10 hit rate). We may also consider Top-5 and Top-1 scoring.
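
The scoring rule above can be sketched in a few lines of Python. This is only an illustration, not the official MIREX evaluation code: the function name `top_n_hit_rate` and the dictionary-based inputs are assumptions made for the example.

```python
def top_n_hit_rate(results, truth, n=10):
    """Fraction of queries whose target appears in the first n candidates.

    results: dict mapping query name -> ranked list of candidate MIDI names
    truth:   dict mapping query name -> ground-truth MIDI name
    (Both structures are hypothetical; they are not defined by the task page.)
    """
    if not truth:
        return 0.0
    hits = 0
    for query, target in truth.items():
        # A query with no returned candidates simply scores 0.
        candidates = results.get(query, [])[:n]
        if target in candidates:
            hits += 1  # 1 point for a hit in the top n, 0 otherwise
    return hits / len(truth)
```

Calling the same function with `n=5` or `n=1` gives the Top-5 and Top-1 variants mentioned above.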

Subtask 2: QBT with wave input

  • Test database: About 150 ground-truth monophonic MIDI files in MIR-QBT.
  • Query files: About 800 wave files of tapping recordings to retrieve MIDIs in MIR-QBT.
  • Evaluation: Return top 10 candidates for each query file. 1 point is scored for a hit in the top 10 and 0 is scored otherwise (Top-10 hit rate).


Command formats

Indexing the MIDI collection

The command format should look like this:

indexing %dbMidi.list% %dir_workspace_root%

where %dbMidi.list% is the input list of database MIDI files, each named uniq_key.mid. For example:

QBT/database/00001.mid
QBT/database/00002.mid
QBT/database/00003.mid
QBT/database/00004.mid
...

Output indexed files are placed into %dir_workspace_root%. (Note that this step is not required unless you want to index or preprocess the midi database.)
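
As a rough illustration of how an indexing program might read the list file, the sketch below maps each uniq_key to its MIDI path. The function name `midi_keys` is hypothetical, and the sketch assumes one forward-slash path per line, as in the example above; what is actually stored in the workspace is left to each participant.

```python
import posixpath


def midi_keys(list_lines):
    """Map each uniq_key (file name without .mid) to its path in the list.

    list_lines: any iterable of text lines, e.g. an open list file.
    """
    keys = {}
    for line in list_lines:
        path = line.strip()
        if not path.lower().endswith(".mid"):
            continue  # skip blank lines and anything that is not a .mid path
        # "QBT/database/00001.mid" -> "00001"
        key = posixpath.splitext(posixpath.basename(path))[0]
        keys[key] = path
    return keys
```

An open file object can be passed directly, for example `midi_keys(open(list_path))`, where `list_path` is the first command-line argument.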

Test the query files

The command format should be like this:

qbtProgram %dbMidi_list% %query_file_list% %resultFile% %dir_workspace_root%

You can use %dir_workspace_root% to store any temporary indexing/database structures. (You can omit %dir_workspace_root% if you do not need it at all.) If the input query files are onset files (for subtask 1), then the format of %query_file_list% is like this:

qbtQuery/query_00001.onset   00001.mid
qbtQuery/query_00002.onset   00001.mid
qbtQuery/query_00003.onset   00002.mid
...

(Please refer to the readme.txt of the downloaded MIR-QBT corpus for the format of onset files.)

If the input query files are wave files (for subtask 2), then the format of %query_file_list% is like this:

qbtQuery/query_00001.wav   00001.mid
qbtQuery/query_00002.wav   00001.mid
qbtQuery/query_00003.wav   00002.mid
...
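
Both list formats pair a query path with the name of its ground-truth MIDI file, so one parser can serve both subtasks. The following is a minimal sketch, assuming the two fields are separated by whitespace and that paths contain no spaces; `parse_query_list` is a hypothetical helper, not part of any provided tool.

```python
def parse_query_list(lines):
    """Return (query_path, target_key) pairs from a query list.

    Works for both .onset and .wav lists, since only the extension differs.
    """
    pairs = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        query_path, target = parts
        # Strip the .mid extension so the key matches the result-file names.
        key = target[:-4] if target.lower().endswith(".mid") else target
        pairs.append((query_path, key))
    return pairs
```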

The result file gives the top-10 candidates for each query. For instance, for onset query files (subtask 1), the result file should have the following format:

qbtQuery/query_00001.onset: 00025 01003 02200 ... 
qbtQuery/query_00002.onset: 01547 02313 07653 ... 
qbtQuery/query_00003.onset: 03142 00320 00973 ... 
...

And for subtask 2:

qbtQuery/query_00001.wav: 00025 01003 02200 ... 
qbtQuery/query_00002.wav: 01547 02313 07653 ... 
qbtQuery/query_00003.wav: 03142 00320 00973 ... 
...

Note that the output should be the names of the MIDI files (e.g., 00025 means 00025.mid); they are not necessarily 5-digit numbers.
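
For illustration, one result line could be written and read back as sketched below. The helper names are hypothetical, and the sketch assumes the separator is a colon followed by a space, with candidate names separated by single spaces, as in the examples above.

```python
def format_result_line(query_path, ranked_names, top_n=10):
    """Build one result line: the query path, a colon, then up to top_n names."""
    return f"{query_path}: " + " ".join(ranked_names[:top_n])


def parse_result_line(line):
    """Split a result line back into (query_path, list of candidate names)."""
    query_path, _, rest = line.partition(": ")
    return query_path, rest.split()
```

Because names are treated as plain strings, this works whether or not they are 5-digit numbers.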

Potential Participants

name / email

Discussions for 2014