Difference between revisions of "2007:Audio Genre Classification"

From MIREX Wiki

Revision as of 09:44, 11 July 2007

Status

This is only a very basic draft of a task proposal. Once more people show interest, we can fill in the details.

Note that audio genre classification algorithms were evaluated at ISMIR 2004 and MIREX 2005. However, there was no genre classification task in 2006.

Related MIREX 2007 task proposals:

Please feel free to edit this page.

Data

The data used for last year's audio similarity retrieval task (USPOP + USCRAP) could be used. In addition, the Magnatune data used for the ISMIR 2004 genre classification contest could be used.

Please edit this if you have suggestions to add or if you disagree.

Evaluation

Training and test sets will contain different artists, so that no artist appears in both. Otherwise, standard techniques for evaluating genre classification performance will be used, including techniques to estimate error bars or statistical significance. In addition to classification accuracy, computation times will be measured.
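To make the artist constraint and the error bars concrete, here is a minimal Python sketch. It is only an illustration, not the agreed evaluation procedure: the `(track_id, artist, genre)` tuple layout, the function names, the 30% test fraction, and the binomial standard error as the error bar are all assumptions made for this example.

```python
import math
import random
from collections import defaultdict


def artist_disjoint_split(tracks, test_fraction=0.3, seed=0):
    """Split (track_id, artist, genre) tuples so that no artist is in both sets.

    The tuple layout and the default test fraction are assumptions.
    """
    by_artist = defaultdict(list)
    for track in tracks:
        by_artist[track[1]].append(track)

    # Shuffle artists (not tracks), then assign whole artists to one side.
    artists = sorted(by_artist)
    random.Random(seed).shuffle(artists)
    n_test = max(1, round(len(artists) * test_fraction))
    test_artists = set(artists[:n_test])

    train = [t for a in artists if a not in test_artists for t in by_artist[a]]
    test = [t for a in artists if a in test_artists for t in by_artist[a]]
    return train, test


def accuracy_with_error_bar(true_labels, predicted_labels):
    """Return (accuracy, standard error) under a binomial approximation."""
    n = len(true_labels)
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    accuracy = correct / n
    return accuracy, math.sqrt(accuracy * (1 - accuracy) / n)
```

Splitting by artist rather than by track keeps a classifier from scoring well simply by recognising an artist it has already seen in training. The binomial standard error is only one simple way to get an error bar; cross-validation over artist-disjoint folds or a significance test between systems would be alternatives.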

As Magnatune and USPOP are freely available, overfitting to these collections is possible. More interesting than the final ranking will be the accompanying papers in which the participants describe their work.

Please edit this if you have suggestions to add or if you disagree.

Audio format poll

<poll>
Use clips from tracks for analysis to reduce processing load (and perhaps increase size of dataset)?
Yes
No
</poll>

<poll>
What is your preferred clip length if we do end up using clips?
30 secs
60 secs
90 secs
120 secs
</poll>

<poll>
What is your preferred audio format? Remember that the less audio data we have to process the larger the dataset can be...
22 kHz mono WAV
22 kHz stereo WAV
44 kHz mono WAV
44 kHz stereo WAV
22 kHz mono MP3 128 kbps
22 kHz stereo MP3 128 kbps
44 kHz mono MP3 128 kbps
44 kHz stereo MP3 128 kbps
</poll>
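To give a rough feel for the trade-off between audio format and dataset size, the following Python sketch estimates how much data one clip takes in the formats polled above. It rests on several assumptions that the poll does not state: 16-bit PCM samples, "22 kHz" and "44 kHz" read as 22050 Hz and 44100 Hz, and "128kb" read as a constant 128 kbit/s. File headers and tags are ignored, and the function names are made up for this example.

```python
def pcm_bytes(sample_rate_hz, channels, seconds, bits_per_sample=16):
    """Approximate size of uncompressed PCM audio (WAV header ignored)."""
    return sample_rate_hz * channels * (bits_per_sample // 8) * seconds


def mp3_bytes(bitrate_kbps, seconds):
    """Approximate size of constant-bitrate MP3 audio (tags ignored)."""
    return bitrate_kbps * 1000 // 8 * seconds


if __name__ == "__main__":
    clip_seconds = 30
    print("22 kHz mono WAV:  ", pcm_bytes(22050, 1, clip_seconds), "bytes")
    print("44 kHz stereo WAV:", pcm_bytes(44100, 2, clip_seconds), "bytes")
    print("128 kbit/s MP3:   ", mp3_bytes(128, clip_seconds), "bytes")
```

Under these assumptions a 30 second clip is about 1.3 MB as 22 kHz mono WAV, about 5.3 MB as 44 kHz stereo WAV, and about 0.48 MB as 128 kbit/s MP3. The MP3 size depends only on the bitrate, not on the sample rate or channel count, and MP3 files have to be decoded before analysis, so smaller storage does not automatically mean less processing.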

Participants

If you think there is even a slight chance that you might want to participate, please add your name and email address here.

  • Thomas Lidy (lastname@ifs.tuwien.ac.at)
  • Francois Pachet and Pierre Roy (lastname@csl.sony.fr)
  • Elias Pampalk (firstname.lastname@gmail.com)
  • Tim Pohle (firstname.lastname@jku.at)
  • Kris West (kw at cmp dot uea dot ac dot uk)
  • Enric Guaus (firstname.lastname@iua.upf.edu)
  • Abhinav Singh (abhinavs at iitg.ernet.in) and S.R.M. Prasanna (prasanna at iitg.ernet.in)
  • Ben Fields (map01bf at gold dot ac dot uk)
  • Tom Diethe (initial.surname@cs.ucl.ac.uk)
  • James Bergstra (bergstrj at iro umontreal ca )
  • Vitor Soares (firstname.lastname@clustermedialabs.com)
  • Matt Hoffman (mdhoffma a t cs d o t princeton d o t edu)
  • ....