2005:Audio Genre Classification Results
Goal: To classify polyphonic music audio (in PCM format) into genre categories.
Dataset: Two datasets were used: Magnatune and USPOP. The Magnatune dataset has a hierarchical genre taxonomy, while the USPOP categories are at a single level. The audio was sampled at either 44.1 kHz or 22.05 kHz (mono). Further details on the data are given in the following table:
Dataset | Size (at 44.1 kHz) | Number of Training Files | Number of Testing Files |
---|---|---|---|
Magnatune | 34.3 GB | 1005 | 510 |
USPOP | 28.4 GB | 940 | 474 |
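
The train/test proportions implied by the table can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the file counts listed above; the names `datasets` and `held_out_fraction` are illustrative and not part of the evaluation itself.

```python
# File counts copied from the dataset table above.
datasets = {
    "Magnatune": {"train": 1005, "test": 510},
    "USPOP": {"train": 940, "test": 474},
}


def held_out_fraction(counts):
    """Return the share of a dataset's files that were used for testing."""
    total = counts["train"] + counts["test"]
    return counts["test"] / total


for name, counts in datasets.items():
    total = counts["train"] + counts["test"]
    share = held_out_fraction(counts)
    print(f"{name}: {total} files, {share:.1%} used for testing")
```

In both datasets roughly one third of the files were used for testing.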

OVERALL

Rank | Participant | Mean of Magnatune Hierarchical Classification Accuracy and USPOP Raw Classification Accuracy |
---|---|---|
1 | Bergstra, Casagrande & Eck (2) | 82.34% |
2 | Bergstra, Casagrande & Eck (1) | 81.77% |
3 | Mandel & Ellis | 78.81% |
4 | West, K. | 75.29% |
5 | Lidy & Rauber (SSD+RH) | 75.27% |
6 | Pampalk, E. | 75.14% |
7 | Lidy & Rauber (RP+SSD) | 74.78% |
8 | Lidy & Rauber (RP+SSD+RH) | 74.58% |
9 | Scaringella, N. | 73.11% |
10 | Ahrendt, P. | 71.55% |
11 | Burred, J. | 62.63% |
12 | Soares, V. | 60.98% |
13 | Tzanetakis, G. | 60.72% |
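
The overall score in the table is described as the mean of two per-dataset accuracies. The sketch below assumes this is a plain arithmetic mean of two percentages. The function name `overall_score` and the input values are hypothetical and are not taken from the results.

```python
def overall_score(magnatune_hierarchical_acc, uspop_raw_acc):
    """Arithmetic mean of the two per-dataset accuracies, both in percent.

    Assumption: the "mean" in the table header is an unweighted average.
    """
    return (magnatune_hierarchical_acc + uspop_raw_acc) / 2.0


# Hypothetical per-dataset accuracies, for illustration only.
example = overall_score(70.0, 80.0)
print(f"Overall score: {example:.2f}%")
```

Under this assumption each dataset contributes equally to the ranking, even though the two test sets differ in size (510 and 474 files).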