2009:Audio Beat Tracking Results

Introduction

This year was the first time since MIREX 2006 that the Audio Beat Tracking (ABT) task has been run. ABT 2009 used two collections: the original McKinney Collection and the new Sapp's Mazurka Collection. Several new scoring metrics beyond McKinney's P-Score were also introduced this year.

McKinney Collection Information

The McKinney Collection comprises 140 files of mixed genres. Each file has 40 ground-truth annotation sets, and performance was averaged across them.

Sapp's Mazurka Collection Information

New this year, we used 322 files drawn from the Mazurka.org dataset assembled by Craig Sapp, who also created the high-quality ground-truth files.

Understanding the Metrics

The evaluation methods were taken from the Beat Evaluation Toolbox and are described in the following technical report:

M. E. P. Davies, N. Degara, and M. D. Plumbley, "Evaluation methods for musical audio beat tracking algorithms," Centre for Digital Music, Queen Mary University of London, Tech. Rep. C4DM-TR-09-06, 2009.

For further details on the specifics of the methods, please refer to that report. A brief summary with appropriate references follows; each metric is also accompanied by a short illustrative code sketch:


F-measure - the standard calculation as used in onset detection evaluation, but with a ±70 ms tolerance window.

S. Dixon, "Onset detection revisited," in Proceedings of the 9th International Conference on Digital Audio Effects (DAFx), Montreal, Canada, pp. 133–137, 2006.

S. Dixon, "Evaluation of the audio beat tracking system BeatRoot," Journal of New Music Research, vol. 36, no. 1, pp. 39–51, 2007.
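
To make the matching concrete, here is a minimal Python sketch of the beat F-measure. It assumes a greedy one-to-one matching of each detection to the closest still-unmatched annotation inside the ±70 ms window; the toolbox implementation may differ in detail.

```python
import numpy as np

def f_measure(detections, annotations, window=0.07):
    """Beat F-measure with a +/-70 ms tolerance window."""
    detections = np.sort(np.asarray(detections, dtype=float))
    annotations = np.sort(np.asarray(annotations, dtype=float))
    matched = np.zeros(len(annotations), dtype=bool)
    tp = 0
    for d in detections:
        errs = np.abs(annotations - d)
        # annotations still unmatched and within the tolerance window
        candidates = np.where((errs <= window) & ~matched)[0]
        if candidates.size:
            matched[candidates[np.argmin(errs[candidates])]] = True
            tp += 1
    fp = len(detections) - tp   # spurious detections
    fn = len(annotations) - tp  # missed beats
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Four hits and one missed beat: P = 1.0, R = 0.8, F ~ 0.89
print(f_measure([0.50, 1.02, 1.55, 2.01], [0.48, 1.00, 1.50, 2.00, 2.50]))
```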


Cemgil - beat accuracy is calculated using a Gaussian error function with a 40 ms standard deviation.

A. T. Cemgil, B. Kappen, P. Desain, and H. Honing, "On tempo tracking: Tempogram representation and Kalman filtering," Journal of New Music Research, vol. 28, no. 4, pp. 259–273, 2001.
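
A minimal sketch of the Cemgil accuracy: each annotated beat is credited according to its nearest detection under a Gaussian with sigma = 40 ms, and the total is normalised by the mean of the two beat counts. The normalisation follows the description in the technical report; treat the details as an approximation of the toolbox code.

```python
import numpy as np

def cemgil(detections, annotations, sigma=0.04):
    """Gaussian beat accuracy with a 40 ms standard deviation."""
    detections = np.asarray(detections, dtype=float)
    annotations = np.asarray(annotations, dtype=float)
    if detections.size == 0 or annotations.size == 0:
        return 0.0
    # distance from each annotated beat to its nearest detection
    errs = np.abs(annotations[:, None] - detections[None, :]).min(axis=1)
    score = np.exp(-(errs ** 2) / (2 * sigma ** 2)).sum()
    # normalise by the mean of the two beat counts
    return score / (0.5 * (len(detections) + len(annotations)))

# A detection one sigma (40 ms) off earns exp(-0.5) ~ 0.61 credit
print(cemgil([0.54, 1.00], [0.50, 1.00]))  # (0.61 + 1.0) / 2 ~ 0.80
```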


Goto - a binary decision of correct or incorrect tracking based on statistical properties of the beat error sequence.

M. Goto and Y. Muraoka, "Issues in evaluating beat tracking systems," in Working Notes of the IJCAI-97 Workshop on Issues in AI and Music - Evaluation and Assessment, 1997, pp. 9–16.
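
Goto's criterion yields a yes/no judgement per excerpt (the percentages in the tables are then the proportion of files judged correct). The sketch below is a heavily simplified stand-in: it accepts the tracking when the beat errors, normalised by the local inter-annotation interval, have small mean and standard deviation. The 0.2 thresholds and the omission of the published continuity conditions are assumptions of this sketch, not the exact published rules.

```python
import numpy as np

def goto_binary(detections, annotations, mu_max=0.2, sigma_max=0.2):
    """Simplified Goto-style decision: 1.0 if tracking is deemed correct."""
    d = np.sort(np.asarray(detections, dtype=float))
    a = np.sort(np.asarray(annotations, dtype=float))
    if len(d) == 0 or len(a) < 2:
        return 0.0
    errors = []
    for i in range(1, len(a)):
        interval = a[i] - a[i - 1]
        # nearest detection to this annotated beat, error expressed as
        # a fraction of the local inter-annotation interval
        errors.append(np.min(np.abs(d - a[i])) / interval)
    errors = np.asarray(errors)
    ok = errors.mean() < mu_max and errors.std() < sigma_max
    return 1.0 if ok else 0.0
```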


P-Score - McKinney's impulse train cross-correlation method, as used in MIREX 2006.

M. F. McKinney, D. Moelants, M. E. P. Davies, and A. Klapuri, "Evaluation of audio beat tracking and music tempo extraction algorithms," Journal of New Music Research, vol. 36, no. 1, pp. 1–16, 2007.
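
A minimal sketch of the P-Score, assuming both beat sequences are quantised to 10 ms impulse trains and cross-correlated over lags up to 20% of the median inter-annotation interval, normalised by the larger beat count. The 20% window follows the cited paper; the 10 ms quantisation is an assumption of this sketch.

```python
import numpy as np

def p_score(detections, annotations, fs=100):
    """Impulse-train cross-correlation score (P-Score)."""
    d = np.asarray(detections, dtype=float)
    a = np.asarray(annotations, dtype=float)
    if d.size == 0 or a.size < 2:
        return 0.0
    n = int(np.ceil(max(d.max(), a.max()) * fs)) + 1
    d_train = np.zeros(n)
    a_train = np.zeros(n)
    d_train[np.round(d * fs).astype(int)] = 1
    a_train[np.round(a * fs).astype(int)] = 1
    # allowable lag: 20% of the median inter-annotation interval
    w = int(np.round(0.2 * np.median(np.diff(a)) * fs))
    cc = np.correlate(d_train, a_train, mode="full")
    mid = n - 1  # zero-lag index of the full cross-correlation
    return cc[mid - w: mid + w + 1].sum() / max(d_train.sum(), a_train.sum())
```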


CMLc, CMLt, AMLc, AMLt - continuity-based evaluation methods. The 'c' variants score the longest continuously correctly tracked section, while the 't' variants credit all correctly tracked beats regardless of whether they form a single continuous run; the AML variants additionally accept related metrical levels such as double/half tempo and the off-beat.

S. Hainsworth, "Techniques for the automated analysis of musical audio," Ph.D. dissertation, Department of Engineering, Cambridge University, 2004.

A. P. Klapuri, A. Eronen, and J. Astola, "Analysis of the meter of acoustic musical signals," IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 1, pp. 342–355, 2006.
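
A minimal sketch of CMLc and CMLt, assuming a beat counts as correct when both its position and its implied period lie within 17.5% of the local annotated values (the tolerance used in the cited work). AMLc and AMLt repeat the same test against variants of the annotation sequence (double/half tempo, off-beat), which this sketch omits.

```python
import numpy as np

def continuity(detections, annotations, tol=0.175):
    """Continuity scores (CMLc, CMLt) at the correct metrical level."""
    d = np.sort(np.asarray(detections, dtype=float))
    a = np.sort(np.asarray(annotations, dtype=float))
    correct = np.zeros(len(d), dtype=bool)
    for i in range(1, len(d)):
        j = int(np.argmin(np.abs(a - d[i])))  # nearest annotation
        if j == 0:
            continue
        interval = a[j] - a[j - 1]            # local annotated period
        phase_ok = abs(d[i] - a[j]) < tol * interval
        period_ok = abs((d[i] - d[i - 1]) - interval) < tol * interval
        correct[i] = phase_ok and period_ok
    total = len(a)
    cml_t = correct.sum() / total             # all correctly tracked beats
    longest = run = 0
    for ok in correct:                        # longest continuous run
        run = run + 1 if ok else 0
        longest = max(longest, run)
    return longest / total, cml_t
```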


D, Dg - information-based criteria derived from the analysis of a beat error histogram (note that these results are measured in bits, not percentages); see the technical report for a full description.
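
A minimal sketch of the information gain D, under these assumptions: each detection's timing error is taken as a fraction of the local annotated interval, wrapped into [-0.5, 0.5), collected into a 40-bin histogram (the bin count is an assumption of this sketch), and scored as the Kullback-Leibler divergence from a uniform histogram, in bits. Dg, as we read the report, applies the same measure to the error histogram pooled over an entire collection.

```python
import numpy as np

def information_gain(detections, annotations, n_bins=40):
    """Information gain D (bits) of the beat error histogram over uniform."""
    d = np.sort(np.asarray(detections, dtype=float))
    a = np.sort(np.asarray(annotations, dtype=float))
    if len(d) == 0 or len(a) < 2:
        return 0.0
    errors = []
    for t in d:
        j = int(np.argmin(np.abs(a - t)))      # nearest annotation
        interval = a[j] - a[j - 1] if j > 0 else a[1] - a[0]
        e = (t - a[j]) / interval              # error in beat fractions
        errors.append((e + 0.5) % 1.0 - 0.5)   # wrap into [-0.5, 0.5)
    hist, _ = np.histogram(errors, bins=n_bins, range=(-0.5, 0.5))
    p = hist / hist.sum()
    p = p[p > 0]                               # drop empty bins
    entropy = -(p * np.log2(p)).sum()
    return np.log2(n_bins) - entropy
```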

General Legend

Team ID

DRP1 = Matthew Davies, Andrew Robertson, Mark Plumbley (Deterministic)
DRP2 = Matthew Davies, Andrew Robertson, Mark Plumbley (Dumb)
DRP3 = Matthew Davies, Andrew Robertson, Mark Plumbley (Flexible)
DRP4 = Matthew Davies, Andrew Robertson, Mark Plumbley (Standard)
GP1 = Geoffroy Peeters (VF)
GP2 = Geoffroy Peeters (VE)
GP3 = Geoffroy Peeters (CF)
GP4 = Geoffroy Peeters (CE)
OGM1 = Joao Lobato Oliveira, Fabien Gouyon, Luis Gustavo Martins (SC)
OGM2 = Joao Lobato Oliveira, Fabien Gouyon, Luis Gustavo Martins (R)
TL = Tsung-Chi Lee

Results

McKinney Collection

TeamID F-Measure Cemgil Goto P-Score CMLc CMLt AMLc AMLt D Dg
units -> (%) (%) (%) (%) (%) (%) (%) (%) (bits) (bits)
DRP1 24.6 17.5 0.4 37.4 3.8 9.6 7.8 19.2 0.625 0.008
DRP2 35.4 26.1 3.1 46.5 7.5 16.9 13.8 31.7 1.073 0.062
DRP3 47.0 35.6 16.2 52.8 19.4 27.6 45.3 61.3 1.671 0.245
DRP4 48.0 36.3 19.5 53.9 22.2 29.1 50.8 64.0 1.739 0.255
GP1 54.8 41.0 22.2 59.0 26.0 35.5 49.1 66.6 1.680 0.294
GP2 53.7 40.1 20.9 57.9 25.1 33.7 47.6 63.8 1.649 0.281
GP3 54.6 40.9 22.0 59.0 26.1 35.6 48.8 66.3 1.695 0.287
GP4 54.5 40.8 21.6 59.2 26.4 35.5 48.2 64.7 1.685 0.289
OGM1 39.7 29.1 6.5 47.8 11.9 19.9 29.5 47.4 1.283 0.100
OGM2 41.5 30.4 5.7 49.2 11.7 19.5 28.2 46.4 1.340 0.092
TL 50.0 37.0 14.1 53.6 17.9 24.3 35.9 50.8 1.480 0.150


Sapp's Mazurka Collection

TeamID F-Measure Cemgil Goto P-Score CMLc CMLt AMLc AMLt D Dg
units -> (%) (%) (%) (%) (%) (%) (%) (%) (bits) (bits)
DRP1 27.9 19.9 0.0 37.3 1.5 10.3 2.3 13.8 0.126 0.008
DRP2 48.4 32.5 0.0 55.5 2.6 26.0 3.1 28.1 0.683 0.183
DRP3 67.8 61.5 2.5 65.3 7.8 42.1 9.2 45.6 1.424 0.993
DRP4 45.3 37.6 0.6 50.1 3.9 24.9 5.3 28.5 0.393 0.198
GP1 54.2 42.5 0.0 59.5 4.4 33.2 5.6 36.6 0.447 0.252
GP2 54.7 43.1 0.0 59.9 4.5 33.7 5.7 37.0 0.467 0.269
GP3 44.7 34.8 0.0 51.2 3.2 24.7 4.0 27.9 0.262 0.111
GP4 45.4 35.6 0.0 51.9 3.3 25.2 4.0 28.4 0.284 0.124
OGM1 30.7 21.9 0.0 38.6 1.7 11.0 2.8 15.4 0.134 0.014
OGM2 32.1 23.1 0.0 39.7 1.7 11.5 2.7 15.6 0.134 0.016
TL 44.9 35.7 0.3 51.0 4.4 20.5 4.7 22.5 0.440 0.095


MIREX 2009 Audio Beat Tracking Runtime Data

Participant Machine Runtime (hh:mm)
DRP1 ALE 00:35
DRP2 ALE 00:42
DRP3 ALE 01:19
DRP4 ALE 01:16
GP1 ALE 00:33
GP2 ALE 00:34
GP3 ALE 00:34
GP4 ALE 00:32
OGM1 ALE 01:52
OGM2 ALE 01:51
TL ALE 00:51
