2010:Score File Format

This page describes the score file format proposed for the MIREX 2007 Real-time Audio to Score Alignment (a.k.a. Score Following) task and likely to be used for the MIREX 2010 Real-time Audio to Score Alignment (a.k.a. Score Following) task.


Description

The proposed MIREX score file format is in ASCII text, with one line per note or other event such as trill, tempo change, or signature change. Lines are separated by newline characters and contain 10 fields, separated by whitespace.
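
As a rough illustration of the line layout described above, the following Python sketch splits a score file into lists of 10 fields. The function names are hypothetical, and skipping blank lines is an assumption of this sketch, not part of the format description.

```python
def split_score_line(line):
    """Split one score line into its 10 whitespace-separated fields."""
    fields = line.split()  # splits on any run of spaces or tabs
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}: {line!r}")
    return fields


def read_score(text):
    """Return one field list per event line (blank lines are skipped; an assumption)."""
    return [split_score_line(line) for line in text.splitlines() if line.strip()]
```

Splitting on arbitrary whitespace means that a mix of tabs and spaces between fields is handled the same way.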

Notes are time-ordered; tempo or signature changes can come at any place. (This could be restricted to "must come at the beginning of the file", if this helps someone.)

The MIREX text score file format will also be used as the alignment reference format, where the clock time is no longer the score time but the aligned time in the performance, and possibly as the alignment result file format.

Columns

Note that we introduced an event ID field that unambiguously links events across all three types of files.

The last field, stream-id, can later serve to separate different channels, voices, or streams of events.

The 9th field, cue, serves to mark events that are musically important, e.g. because they synchronise accompaniment with the performance. This could later be used for a more detailed evaluation.
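
To show how the event ID and cue fields might be used, here is a minimal Python sketch that pairs score onsets with reference onsets by event ID and picks out the cued events. The helper names are hypothetical, the rows are assumed to be already split into 10 fields, and the reference onset times are invented for illustration only.

```python
def index_by_event_id(rows):
    """Map event ID -> fields, leaving out tempo/meter lines (event ID 0)."""
    return {int(r[0]): r for r in rows if r[0] != "0"}


def cue_events(rows):
    """Return the events whose 9th field (cue) is greater than 0."""
    return [r for r in rows if r[0] != "0" and int(r[8]) > 0]


def onset_pairs(score_rows, reference_rows):
    """Pair score onset [ms] with reference onset [ms] for shared event IDs."""
    score = index_by_event_id(score_rows)
    ref = index_by_event_id(reference_rows)
    return {eid: (float(score[eid][2]), float(ref[eid][2]))
            for eid in score if eid in ref}


# Score rows taken from the note example on this page.
score_rows = [line.split() for line in [
    "1 1 0 note 72 0 2 4000 1 0",
    "2 3+1/4 4500 note 60 0 0+1/4 500 2 0",
    "4 3+3/4 5000 note 48 0 0+1/2 1000 0 0",
]]
# Reference rows with invented aligned onset times (illustration only).
reference_rows = [line.split() for line in [
    "1 1 120 note 72 0 2 4000 1 0",
    "2 3+1/4 4710 note 60 0 0+1/4 500 2 0",
]]

pairs = onset_pairs(score_rows, reference_rows)
cued_ids = [r[0] for r in cue_events(score_rows)]
```

Events missing from the reference (here event 4) simply do not appear in the pairing.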


Example Score Files

Here is a zip with a few examples. Please have a look at them and use them to test your parsers. The corresponding MIDI files are included as well, except for Anthemes 2, but that one has a trill...

The metric positions are not very pretty, but it's the best a simple algorithm can do. The absolute time positions are good, anyway.

File:2007 mirex2007-score-examples.zip

Note on higher-level formats

It would of course be great if there were an even better format, maybe XML-based, such as the MX format that was mentioned on the list. From such a format, the text-based format could easily be generated, and even MIDI for those participants who couldn't otherwise implement a parser. We'd be glad to hear a concrete proposal.

Event Types

NOTE EVENTS

Template

event-id onset-position onset-ms type pitch interval duration-beat duration-ms cue-num stream-id

Example

1 1	 0	note	72	0	2	4000	1	0
2 3+1/4	 4500	note	60	0	0+1/4	500	2	0	
3 3+3/4	 5000	note	58	0	0+1/2	1000	3	0	
4 3+3/4	 5000	note	48	0	0+1/2	1000	0	0	

Columns for note events

  1. event ID [int > 0] (0 is reserved for tempo and meter changes)
  2. event onset position [measure+rational] (ex. 42+1/4, 28+3/7)
  3. event onset clock time [ms] (must be consistent with the onset position)
  4. event type [symbol: note, trill, tremolo, ...] (for tempo and signature, see below)
  5. event pitch [float MIDI note number]
  6. event interval [float halftones] (0 for non-trill)
  7. event duration [measure+rational]
  8. event duration [ms]
  9. cue number [int > 0, 0 = no cue]
  10. stream ID [int]
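
The column list above could be parsed along the following lines. This is only a sketch: the `NoteEvent` class and the function names are hypothetical, and representing a measure+rational value as a pair of (measure, fraction) is an assumption of this sketch.

```python
from dataclasses import dataclass
from fractions import Fraction


def parse_position(text):
    """Parse 'measure+rational' (e.g. '3+1/4') or a bare measure (e.g. '1')."""
    measure, _, frac = text.partition("+")
    return (int(measure), Fraction(frac) if frac else Fraction(0))


@dataclass
class NoteEvent:
    event_id: int
    onset_position: tuple   # (measure, fraction)
    onset_ms: float
    type: str               # note, trill, tremolo, ...
    pitch: float            # MIDI note number
    interval: float         # halftones, 0 for non-trill
    duration: tuple         # (measure, fraction)
    duration_ms: float
    cue: int                # 0 = no cue
    stream_id: int


def parse_note_event(line):
    """Turn one note-event line into a NoteEvent."""
    f = line.split()
    if len(f) != 10:
        raise ValueError(f"expected 10 fields, got {len(f)}: {line!r}")
    return NoteEvent(int(f[0]), parse_position(f[1]), float(f[2]), f[3],
                     float(f[4]), float(f[5]), parse_position(f[6]),
                     float(f[7]), int(f[8]), int(f[9]))
```

For instance, `parse_note_event("2 3+1/4 4500 note 60 0 0+1/4 500 2 0")` yields an onset position of measure 3 plus 1/4 and a duration of 0 plus 1/4. Using `Fraction` keeps values such as 3/7 exact.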


TEMPO CHANGES

Template

0 onset-position onset-ms tempo tempo-bpm - - - - stream-id

Example

0 1	0	tempo	120	-	-	-	-	0

This signifies a tempo of 120 beats per minute.

Columns for tempo changes

  1. event ID [constant int = 0]
  2. event onset position [measure+rational] (ex. 42+1/4, 28+3/7)
  3. event onset clock time [ms] (must be consistent with the onset position)
  4. event type [constant symbol = tempo]
  5. tempo [float bpm]
  6. unused
  7. unused
  8. unused
  9. unused
  10. stream ID [int]
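
A tempo line could be read as in the sketch below, which also converts the tempo to a beat length in milliseconds. The function names and the dictionary layout are hypothetical and chosen for illustration only.

```python
def parse_tempo_change(line):
    """Parse a tempo-change line; fields 6 to 9 are unused and ignored."""
    f = line.split()
    if len(f) != 10 or f[0] != "0" or f[3] != "tempo":
        raise ValueError(f"not a tempo-change line: {line!r}")
    return {
        "onset_position": f[1],   # kept as text, e.g. '42+1/4'
        "onset_ms": float(f[2]),
        "bpm": float(f[4]),
        "stream_id": int(f[9]),
    }


def ms_per_beat(bpm):
    """Length of one beat in milliseconds at the given tempo."""
    return 60000.0 / bpm
```

With the example line above, the tempo of 120 bpm corresponds to 60000 / 120 = 500 ms per beat.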

METER CHANGES

Template

0 onset-position onset-ms meter numerator denominator - - - stream-id

Example

0 1	0	meter	4	4	-	-	-	0

This signifies a time signature of 4/4.

Columns for signature changes

  1. event ID [constant int = 0]
  2. event onset position [measure+rational] (ex. 42+1/4, 28+3/7)
  3. event onset clock time [ms] (must be consistent with the onset position)
  4. event type [constant symbol = meter]
  5. meter numerator [int]
  6. meter denominator [int]
  7. unused
  8. unused
  9. unused
  10. stream ID [int]
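
Meter lines can be handled in the same way. The sketch below also estimates the length of one measure in milliseconds; it assumes that one beat of the tempo corresponds to the note value named by the meter denominator, which this page does not specify. All names are hypothetical.

```python
def parse_meter_change(line):
    """Parse a meter-change line; fields 7 to 9 are unused and ignored."""
    f = line.split()
    if len(f) != 10 or f[0] != "0" or f[3] != "meter":
        raise ValueError(f"not a meter-change line: {line!r}")
    return {
        "onset_position": f[1],
        "onset_ms": float(f[2]),
        "numerator": int(f[4]),
        "denominator": int(f[5]),
        "stream_id": int(f[9]),
    }


def measure_ms(numerator, bpm):
    """Measure length in ms, assuming the beat is the denominator note value."""
    return numerator * 60000.0 / bpm
```

Under that assumption, a 4/4 measure at 120 bpm lasts 4 * 500 = 2000 ms.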

Evaluation Metrics

times:

  • t_a: reference alignment time
  • t_r: reporting time
  • t_e: estimated time

measures:

  • d_l = t_r - t_e: system latency (between reporting and estimation)
  • d_o = t_a - t_r: offset or lag between reporting time and reference
  • d_e = t_a - t_e: error between estimation time and reference
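
The three measures follow directly from the three times of one event, as in this short sketch. The function name is hypothetical, and all times are assumed to be in the same unit, e.g. milliseconds.

```python
def alignment_measures(t_a, t_r, t_e):
    """Compute latency, offset and error for one event from its three times."""
    return {
        "d_l": t_r - t_e,  # system latency
        "d_o": t_a - t_r,  # offset between reporting time and reference
        "d_e": t_a - t_e,  # error between estimation and reference
    }
```

Note that by these definitions d_e = d_o + d_l, so any two of the measures determine the third.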