2009:Real-time Audio to Score Alignment (a.k.a Score Following)
Title
Real-time Audio to Score Alignment, also known as Score Following
Description
The text of this section is copied from the 2008 page. Please add your comments and discussions for 2009.
Score Following is the real-time alignment of an incoming music signal to the music score. The music signal can be symbolic (MIDI) or audio. We will concentrate here on audio following, unless some candidates want their symbolic followers to be evaluated and can propose reference data.
This page describes a proposal for evaluation of score following systems. Discussion of the evaluation procedures on the Score Following contest planning list will be documented on the 2009:Score Following page. A full digest of the discussions is available to subscribers from the Score Following contest planning list archives.
Submissions will be required to estimate alignment precision according to the indexed times. In order for your system to participate, please specify the type of alignment (monophonic or polyphonic), the type of training, and the real-time performance. Given enough submissions, systems will also be separated into two domains, symbolic and audio. Note that we also accept systems that do not run in real time in practice, as long as their algorithm is on-line, i.e. it makes no use of global knowledge of the input.
Discussions for 2009
Your comments here.
Evolution
This year's changes are proposed here and on the list, and are currently under discussion. Proposed changes are mainly about the score and reference file formats and the evaluation metrics:
- the proposed new score and reference file format is described here: 2007:Score File Format
- evaluation metrics will more closely reflect the different approaches and applications of score following
See the details of last year's proposal on the MIREX 2006 Wiki
Evolution
This year's changes are proposed here and on the list, and are currently under discussion. Proposed changes are mainly about the score and reference file formats and the evaluation metrics:
- the proposed new score and reference file format is described here: 2009:Score File Format
- evaluation metrics will more closely reflect the different approaches and applications of score following
See the details of last year's proposal on the MIREX 2006 Wiki
Evaluation procedures
The evaluation procedure consists of running score followers on a database of audio aligned to scores. For each piece, the database contains the score and the performance audio (both used for the system call) and a reference alignment (used for the evaluation). See below for details.
I/O Format
Each system should conform to the following format:
doScofo.sh "/path/to/audiofile.wav" "/path/to/midi_score_file.mid" "/path/to/result/filename.txt"
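As an illustration only, an evaluation harness could invoke a submission along the following lines. This is a minimal sketch, not the actual MIREX evaluation code: the function names `build_command` and `run_follower`, the log file naming, and the `log_dir` parameter are all assumptions made for this example. Only the argument order (audio file, score file, result file) comes from the call format above.

```python
import subprocess
from pathlib import Path


def build_command(script, audio_path, score_path, result_path):
    """Return the argument list for one run: audio, score, result (in that order)."""
    return [str(script), str(audio_path), str(score_path), str(result_path)]


def run_follower(script, audio_path, score_path, result_path, log_dir):
    """Run one follower and save its stdout and stderr (hypothetical log layout)."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(audio_path).stem  # log files are named after the audio file

    completed = subprocess.run(
        build_command(script, audio_path, score_path, result_path),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,
        check=False,  # a failing follower should not abort the whole evaluation
    )

    (log_dir / f"{stem}.stdout.log").write_text(completed.stdout)
    (log_dir / f"{stem}.stderr.log").write_text(completed.stderr)
    return completed.returncode
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces, which is why the sketch does not build a single command string.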
The stdout and stderr will be logged.
"/path/to/result/filename.txt" should have one line per detected note, with the following 4 columns:
1. estimated note onset time in performance audio file (ms)
2. detection time relative to performance audio file (ms)
3. note start time in score (ms)
4. MIDI note number in score (int)
Example:
1800 1800 0 75
2021 2022 187.5 73
... ... ... ...
Remarks: The third column with the detected note's start time in score serves as the unique identifier of a note (or chord for polyphonic scores) that links it to the ground truth onset of that note within the reference alignment files. The fourth column of MIDI note number is there only for your convenience, to know your way around in the result files, if you know the melody in MIDI.
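To make the role of the third column concrete, the following sketch parses a result file and links each detected note to a reference onset through its score time. It is a hedged illustration under stated assumptions: the reference alignment format is not specified on this page, so the reference is represented here as a hypothetical dictionary from score time (ms) to reference onset (ms), and the names `DetectedNote`, `parse_result_text` and `compare_to_reference` are invented for this example. The two differences it computes are illustrative and are not presented as the official evaluation metrics.

```python
from typing import NamedTuple


class DetectedNote(NamedTuple):
    onset_ms: float     # column 1: estimated note onset in the performance audio
    detected_ms: float  # column 2: time at which the follower reported the note
    score_ms: float     # column 3: note start time in the score (note identifier)
    midi_note: int      # column 4: MIDI note number, for orientation only


def parse_result_text(text):
    """Parse result-file text into DetectedNote records, skipping unusable lines."""
    notes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # blank or malformed line
        try:
            onset, detected, score = (float(f) for f in fields[:3])
            midi = int(fields[3])
        except ValueError:
            continue  # non-numeric placeholder such as "..."
        notes.append(DetectedNote(onset, detected, score, midi))
    return notes


def compare_to_reference(notes, reference):
    """Match notes to reference onsets via the score time (column 3).

    `reference` is a hypothetical dict {score_ms: reference_onset_ms}.
    Returns tuples (score_ms, onset minus reference, detection minus onset).
    Exact float keys are used for simplicity; real data may need a tolerance.
    """
    rows = []
    for note in notes:
        ref_onset = reference.get(note.score_ms)
        if ref_onset is None:
            continue  # note not present in the reference alignment
        rows.append((
            note.score_ms,
            note.onset_ms - ref_onset,
            note.detected_ms - note.onset_ms,
        ))
    return rows


# Usage with the example lines above and invented reference onsets.
example = "1800 1800 0 75\n2021 2022 187.5 73\n"
invented_reference = {0.0: 1790.0, 187.5: 2030.0}  # not real ground truth
for row in compare_to_reference(parse_result_text(example), invented_reference):
    print(row)
```

Keying on the score time rather than on the MIDI note number follows the remark above: the same pitch can occur many times in a piece, while the score time identifies one note or chord.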
Potential Participants
1. Andreas Arzt (Johannes Kepler University Linz, Austria), andreas[dot]arzt[at]students[dot]jku[dot]at
Your name here.