2025:Audio Beat Tracking
Description
The text of this section was copied from the 2012 Wiki.
The aim of the automatic beat tracking task is to track every beat location in a collection of sound files. Unlike the Audio Tempo Extraction task, whose aim is to detect the tempo of each file, the beat tracking task aims to detect all beat locations in each recording. Algorithms will be evaluated on how accurately they predict beat locations annotated by a group of listeners.
Dataset
Train Dataset
For the audio beat tracking task, we do not impose restrictions on the training data used by machine learning-based systems. However, all submissions must clearly state the specific training dataset(s) used in their extended abstract.
Test Datasets
TBA
TBA
TBA
TBA
Submission Format
Submissions to this task must conform to the format detailed below. Each submission should be packaged and contain at least two files: the algorithm itself and a README that contains contact information and describes, in full, how to use the algorithm.
Input Data
Participating algorithms will have to read audio in the following format:
- Sample rate: 44.1 kHz
- Sample size: 16 bit
- Number of channels: 1 (mono)
- Encoding: WAV
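As a convenience, the input format listed above can be checked before processing. The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav_format` is hypothetical and is not part of any official MIREX tooling.

```python
import wave


def check_wav_format(path):
    """Return a list of mismatches against the task's input format.

    An empty list means the file is 44.1 kHz, 16-bit, mono PCM WAV.
    """
    try:
        with wave.open(path, "rb") as wav:
            rate = wav.getframerate()
            width = wav.getsampwidth()  # bytes per sample
            channels = wav.getnchannels()
    except wave.Error as exc:
        return [f"not a readable PCM WAV file: {exc}"]

    problems = []
    if rate != 44100:
        problems.append(f"sample rate is {rate} Hz, expected 44100 Hz")
    if width != 2:
        problems.append(f"sample size is {8 * width} bit, expected 16 bit")
    if channels != 1:
        problems.append(f"{channels} channels found, expected 1 (mono)")
    return problems
```

A submission could call this on `%input` and log a warning if the returned list is not empty.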
Output Data
The beat tracking algorithms will return beat-times in an ASCII text file for each input .wav audio file. The specification of this output file is immediately below.
Output File Format (Audio Beat tracking)
The Beat Tracking output file format is an ASCII text format. Each beat time is specified, in seconds, on its own line. Specifically,
<beat time(in seconds)>\n
where \n denotes the end of line. The < and > characters are not included. An example output file would look something like:
0.243
0.486
0.729
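The output format described above can be produced with a few lines of code. This is a minimal sketch; the helper names `write_beats` and `read_beats` are hypothetical, and the three-decimal precision is an assumption based on the example rather than a stated requirement.

```python
def write_beats(beat_times, output_path):
    """Write one beat time (in seconds) per line as ASCII text."""
    with open(output_path, "w", encoding="ascii", newline="\n") as f:
        for t in beat_times:
            # Three decimals (millisecond resolution) is an assumed precision.
            f.write(f"{t:.3f}\n")


def read_beats(path):
    """Read a beat file back into a list of floats, skipping blank lines."""
    with open(path, encoding="ascii") as f:
        return [float(line) for line in f if line.strip()]
```

Calling `write_beats([0.243, 0.486, 0.729], "out.txt")` yields a file with the same three lines as the example.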
Algorithm Calling Format
The submitted algorithm must take as arguments a SINGLE .wav file on which to perform beat tracking, as well as the full output path and filename of the output file. The ability to specify the output path and file name is essential. Denoting the input .wav file path and name as %input and the output file path and name as %output, a program called foobar could be called from the command-line as follows:
foobar %input %output
foobar -i %input -o %output
Moreover, if your submission takes additional parameters, such as a detection threshold, foobar could be called like:
foobar .1 %input %output
foobar -param1 .1 -i %input -o %output
If your submission is in MATLAB, it should be submitted as a function. Once again, the function must accept string inputs for the full paths and names of the input and output files. Parameters can also be specified as input arguments of the function. For example:
foobar('%input','%output')
foobar(.1,'%input','%output')
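For submissions written in Python, the flag-based calling convention shown above could be handled as in the following sketch. The `track_beats` function is a hypothetical placeholder for the actual algorithm, and the default value of `-param1` is an assumption made for illustration only.

```python
import argparse


def build_parser():
    """Parser for calls such as: foobar -param1 .1 -i %input -o %output"""
    parser = argparse.ArgumentParser(prog="foobar")
    parser.add_argument("-i", dest="input", required=True,
                        help="full path of the input .wav file")
    parser.add_argument("-o", dest="output", required=True,
                        help="full path of the output text file")
    # Optional algorithm parameter; the default of 0.1 is an assumption.
    parser.add_argument("-param1", type=float, default=0.1,
                        help="example parameter, e.g. a detection threshold")
    return parser


def track_beats(input_path, param1):
    """Hypothetical placeholder: return beat times in seconds."""
    return []


def main(argv=None):
    args = build_parser().parse_args(argv)
    beats = track_beats(args.input, args.param1)
    with open(args.output, "w", encoding="ascii", newline="\n") as f:
        for t in beats:
            f.write(f"{t:.3f}\n")
    return 0
```

A script would typically call `main()` from its entry point so that the arguments are taken from the command line.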
README File
A README file accompanying each submission should contain explicit instructions on how to run the program (as well as contact information, etc.). In particular, each command line to run should be specified, using %input for the input sound file and %output for the resulting text file.
For instance, to test the program foobar with different values for parameters param1, the README file would look like:
foobar -param1 .1 -i %input -o %output
foobar -param1 .15 -i %input -o %output
foobar -param1 .2 -i %input -o %output
foobar -param1 .25 -i %input -o %output
foobar -param1 .3 -i %input -o %output
...
For a submission using MATLAB, the README file could look like:
matlab -r "foobar(.1,'%input','%output');quit;"
matlab -r "foobar(.15,'%input','%output');quit;"
matlab -r "foobar(.2,'%input','%output');quit;"
matlab -r "foobar(.25,'%input','%output');quit;"
matlab -r "foobar(.3,'%input','%output');quit;"
...
The different command lines to evaluate the performance of each parameter set over the whole database will be generated automatically from each line in the README file containing both '%input' and '%output' strings.
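To check that a README will expand as intended, the substitution described above can be simulated locally. The following sketch is an illustration only and is not the organisers' actual script; the function name `expand_commands` is hypothetical.

```python
def expand_commands(readme_text, input_path, output_path):
    """Return one concrete command per README line that contains both
    the '%input' and '%output' placeholders; other lines are ignored."""
    commands = []
    for line in readme_text.splitlines():
        if "%input" in line and "%output" in line:
            command = line.strip()
            command = command.replace("%input", input_path)
            command = command.replace("%output", output_path)
            commands.append(command)
    return commands
```

Lines that lack either placeholder, such as contact details, are skipped, which mirrors the rule that only lines containing both strings generate command lines.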
Evaluation Procedures
The evaluation methods are taken from the beat evaluation toolbox and are described in the following technical report:
M. E. P. Davies, N. Degara and M. D. Plumbley. "Evaluation methods for musical audio beat tracking algorithms". Technical Report C4DM-TR-09-06.
For further details on the specifics of the methods please refer to the paper. However, here is a brief summary with appropriate references:
- F-measure - the standard calculation as used in onset evaluation but with a 70ms window.
S. Dixon, "Onset detection revisited," in Proceedings of 9th International Conference on Digital Audio Effects (DAFx), Montreal, Canada, pp. 133-137, 2006.
S. Dixon, "Evaluation of audio beat tracking system beatroot," Journal of New Music Research, vol. 36, no. 1, pp. 39-51, 2007.
- Cemgil - beat accuracy is calculated using a Gaussian error function with 40ms standard deviation.
A. T. Cemgil, B. Kappen, P. Desain, and H. Honing, "On tempo tracking: Tempogram representation and Kalman filtering," Journal of New Music Research, vol. 28, no. 4, pp. 259-273, 2001.
- Goto - binary decision of correct or incorrect tracking based on statistical properties of a beat error sequence.
M. Goto and Y. Muraoka, "Issues in evaluating beat tracking systems," in Working Notes of the IJCAI-97 Workshop on Issues in AI and Music - Evaluation and Assessment, 1997, pp. 9-16.
- PScore - McKinney's impulse train cross-correlation method as used in 2006.
M. F. McKinney, D. Moelants, M. E. P. Davies, and A. Klapuri, "Evaluation of audio beat tracking and music tempo extraction algorithms," Journal of New Music Research, vol. 36, no. 1, pp. 1-16, 2007.
- CMLc, CMLt, AMLc, AMLt - continuity-based evaluation methods based on the longest continuously correctly tracked section.
S. Hainsworth, "Techniques for the automated analysis of musical audio," Ph.D. dissertation, Department of Engineering, Cambridge University, 2004.
A. P. Klapuri, A. Eronen, and J. Astola, "Analysis of the meter of acoustic musical signals," IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 1, pp. 342-355, 2006.
- D, Dg - information-based criteria derived from analysis of a beat error histogram (note that the results are measured in 'bits', not percentages); see the technical report for a description.
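To make the first two measures in the list above more concrete, here is a simplified sketch. It assumes the 70ms window is applied as a tolerance of plus or minus 70ms around each annotation, and that the Cemgil score is normalised by the mean of the number of detections and annotations. The official beat evaluation toolbox may differ in details, so results from this sketch should not be taken as official scores.

```python
import math


def f_measure(detections, annotations, window=0.07):
    """F-measure where each annotation matches at most one detection
    lying within +/- `window` seconds (all times in seconds)."""
    used = [False] * len(detections)
    hits = 0
    for a in annotations:
        best, best_err = None, window
        # Pick the closest still-unmatched detection inside the window.
        for i, d in enumerate(detections):
            err = abs(d - a)
            if not used[i] and err <= best_err:
                best, best_err = i, err
        if best is not None:
            used[best] = True
            hits += 1
    if hits == 0:
        return 0.0
    precision = hits / len(detections)
    recall = hits / len(annotations)
    return 2 * precision * recall / (precision + recall)


def cemgil_accuracy(detections, annotations, sigma=0.04):
    """Gaussian-weighted accuracy: each annotation is scored by its
    nearest detection, with a 40 ms standard deviation by default."""
    if not detections or not annotations:
        return 0.0
    total = 0.0
    for a in annotations:
        err = min(abs(d - a) for d in detections)
        total += math.exp(-(err ** 2) / (2 * sigma ** 2))
    # Normalise by the mean count so extra detections are penalised.
    return total / (0.5 * (len(detections) + len(annotations)))
```

For example, `f_measure(read, truth)` returns 1.0 only when every annotated beat has its own detection within the tolerance and there are no extra detections.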
Relevant Development Collections (need update)
You can find them here:
(data has been uploaded in both .tgz and .zip format)
User: beattrack Password: b34trx
https://www.music-ir.org/evaluation/MIREX/data/2006/beat/beattrack_train_2006.tgz OR
https://www.music-ir.org/evaluation/MIREX/data/2006/beat/beattrack_train_2006.zip
User: tempo Password: t3mp0
https://www.music-ir.org/evaluation/MIREX/data/2006/tempo/tempo_train_2006.tgz OR
https://www.music-ir.org/evaluation/MIREX/data/2006/tempo/tempo_train_2006.zip