2006:Audio Beat Tracking

From MIREX Wiki

Revision as of 13:01, 24 July 2006

Proposers

  • Paul M. Brossier (Queen Mary, University of London) <piem at altern.org>
  • Matthew Davies (Queen Mary, University of London) <matthew.davies at elec.qmul.ac.uk>
  • Martin F. McKinney (Philips) <mckinney at alum.mit.edu>

Description

The aim of the automatic beat tracking task is to track the location of each beat in a collection of sound files. Unlike the Audio Tempo Extraction task, whose aim is to detect one (or two) tempo values and the associated phase for each file, the beat tracking task aims to detect all beat locations in the recordings. The algorithms will be evaluated in terms of accuracy of tempo and phase detection on the one hand, and in terms of continuity on the other.

Input data

Audio Format:

The sound files in the database are monophonic (single channel) recordings of complex music, with the associated beat locations annotated by 60 different listeners. All recordings are encoded as 16-bit samples at a sampling rate of 44100 Hz (CD quality). The file lengths vary between 2 and 36 seconds, for a total duration of 14 minutes.
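
For illustration only, here is a minimal Python sketch (not part of the submission requirements) showing how a file in this format could be read with the standard library; the function name and file path are hypothetical:

 import wave
 import struct

 def read_mono_wav(path):
     """Read a 16-bit mono WAV file; return (sample rate, samples scaled to [-1, 1])."""
     with wave.open(path, "rb") as w:
         assert w.getnchannels() == 1 and w.getsampwidth() == 2   # mono, 16-bit
         rate = w.getframerate()                                   # expected: 44100 Hz
         raw = w.readframes(w.getnframes())
     samples = [s / 32768.0 for s in struct.unpack("<%dh" % (len(raw) // 2), raw)]
     return rate, samples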

Audio Content:

The audio recordings were selected to have a stable tempo within each excerpt, to cover a wide distribution of tempi, and to include a large variety of instrumentation and musical styles. About 20% of the files contain non-binary meters, and a small number of examples contain changing meters.

Output data

Submitted programs should output one beat location per line, each line terminated by a newline character (\n). The results should either be saved to a text file or printed to the standard output.

Example of possible output:

0.0123156
1.9388662
3.8777323
5.8165980
7.7554634

Each submission should be accompanied with a README file describing how the program should be used. For instance:

To run the program foobar on the file input.wav and store the results in the file output.txt, the following command should be used:

 foobar -i input.wav > output.txt
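
As a purely illustrative sketch, a Python submission could print its beat estimates in the required format along these lines (the beat times below are just the placeholder values from the example above):

 import sys

 # Hypothetical list of detected beat times, in seconds.
 beats = [0.0123156, 1.9388662, 3.8777323, 5.8165980, 7.7554634]

 for t in beats:
     sys.stdout.write("%.7f\n" % t)   # one beat location per line, newline-terminated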

Participants

  • Miguel Alonso and Gaël Richard (ENST, Paris), <miguel.alonso at enst.fr>, <gael.richard at enst.fr> (to be confirmed)
  • Paul Brossier (Queen Mary, University of London), <piem at altern.org>
  • Matthew Davies (Queen Mary, University of London), <matthew.davies at elec.qmul.ac.uk>
  • Douglas Eck (University of Montreal), <eckdoug at iro.umontreal.ca>
  • Geoffroy Peeters (IRCAM, Paris), <peeters at ircam.fr>

Other potential participants:

  • Fabien Gouyon (University Pompeu Fabra) and Simon Dixon (OFAI), <fabien.gouyon at iua.upf.es>, <simon at oefai.at>
  • Anssi Klapuri (Tampere International Center for Signal Processing, Finland), <klap at cs.tut.fi>
  • Martin F. McKinney (Philips) <mckinney at alum.mit.edu>
  • Dirk Moelants (IPEM, Ghent University) <dirk at moelants.net>
  • Bill Sethares (University of Wisconsin-Madison), <sethares at ece.wisc.edu>
  • George Tzanetakis (University of Victoria), <gtzan at cs.uvic.ca>
  • Christian Uhle (Fraunhofer Institut), <uhle at idmt.fhg.de>

Evaluation Procedures

The evaluation of the beat tracking algorithms should reflect their ability to track the tempo accurately (correct tempo value and correct phase alignment). Additionally, their ability to produce a continuous output will also be evaluated.

Paul's suggestions, after discussion with Matthew Davies and Martin F. McKinney

Note: These procedures are still being discussed and will be confirmed soon.

  • Tempo and phase accuracy: The output of an algorithm will be evaluated against each of the 60 annotation files, in a way similar to the Audio Onset Detection task: each matching beat location will be counted as a good detection, and each non-matching location as a false positive. Because the extracted beat locations are evaluated against the annotations of 60 different listeners, we assume that the subjective halving/doubling issue will be addressed in the process.
  • Continuity: To evaluate the ability of an algorithm to produce a continuous output, the ratio of the longest continuously tracked segment over the total length of the file will be used, in a way similar to that described in [1].

This continuity criterion will be the one used for the final ranking of the different algorithms.
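
Since the exact parameters are still under discussion, the following Python sketch only illustrates the general idea rather than the official procedure: detections are matched to one listener's annotations within a tolerance window (the ±0.07 s value is an assumption, not an agreed figure), and continuity is approximated as the longest run of consecutively tracked annotated beats over the total number of annotated beats, a simplification of the segment-length ratio of [1]:

 def match_counts(detections, annotations, tol=0.07):
     """Count detections within +/- tol seconds of some annotated beat (good
     detections); the remainder are false positives. tol is an assumed value."""
     good = sum(1 for d in detections
                if any(abs(d - a) <= tol for a in annotations))
     return good, len(detections) - good

 def continuity(detections, annotations, tol=0.07):
     """Longest run of consecutively tracked annotated beats, as a fraction of
     all annotated beats -- a simplified stand-in for the ratio used in [1]."""
     tracked = [any(abs(d - a) <= tol for d in detections) for a in annotations]
     longest = run = 0
     for ok in tracked:
         run = run + 1 if ok else 0
         longest = max(longest, run)
     return longest / len(annotations) if annotations else 0.0

A submission would, for example, be scored in this manner against each of the 60 annotation sequences for an excerpt, with the per-listener results then combined.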

Evaluation Database

A collection of 160 musical excerpts will be used for the evaluation procedure, the same collection used for the Audio Tempo Extraction contest. Each recording has been annotated by 60 different listeners. The annotation procedures are described in [2] and [3].

20 excerpts will be provided to the participants for training, and the remaining 140 excerpts, novel to all participants, will be used for the contest.

References

  1. Masataka Goto and Yoichi Muraoka. Issues in evaluating beat tracking systems. In Working Notes of the IJCAI-97 Workshop on Issues in AI and Music - Evaluation and Assessment, pages 9-16, 1997.
  2. McKinney, M.F. and Moelants, D. (2004). Deviations from the resonance theory of tempo induction. Conference on Interdisciplinary Musicology, Graz.
  3. Moelants, D. and McKinney, M.F. (2004). Tempo perception and musical content: What makes a piece slow, fast, or temporally ambiguous? International Conference on Music Perception & Cognition, Evanston, IL.

Comments

Paul's comments: as noted off-list by Matthew Davies, the evaluation metrics proposed by Goto for assessing beat tracking accuracy [1] are somewhat difficult to apply to all beat tracking algorithms without modification, since they assume the algorithms stabilise on a robust tempo value only after 45 seconds. Even after removing this 45 s constraint, the four different metrics obtained by this method remain somewhat difficult to interpret.