2005:Audio Genre Classification Results
From MIREX Wiki
Goal: To classify polyphonic music audio (in PCM format) into genre categories.
Dataset: Two datasets were used: Magnatune and USPOP. The Magnatune dataset has a hierarchical genre taxonomy, while the USPOP categories are at a single level. The audio was provided in mono at a sampling rate of either 44.1 kHz or 22.05 kHz. Further details of the datasets are given in the following table:
Dataset | Size (at 44.1 kHz) | Number of Training Files | Number of Testing Files |
---|---|---|---|
Magnatune | 34.3 GB | 1005 | 510 |
USPOP | 28.4 GB | 940 | 474 |

OVERALL

Rank | Participant | Mean of Magnatune Hierarchical Classification Accuracy and USPOP Raw Classification Accuracy |
---|---|---|
1 | Bergstra, Casagrande & Eck (2) | 82.34% |
2 | Bergstra, Casagrande & Eck (1) | 81.77% |
3 | Mandel & Ellis | 78.81% |
4 | West, K. | 75.29% |
5 | Lidy & Rauber (SSD+RH) | 75.27% |
6 | Pampalk, E. | 75.14% |
7 | Lidy & Rauber (RP+SSD) | 74.78% |
8 | Lidy & Rauber (RP+SSD+RH) | 74.58% |
9 | Scaringella, N. | 73.11% |
10 | Ahrendt, P. | 71.55% |
11 | Burred, J. | 62.63% |
12 | Soares, V. | 60.98% |
13 | Tzanetakis, G. | 60.72% |
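
The overall score above is described as the mean of two per-dataset figures. As a minimal sketch of that arithmetic, the snippet below averages two accuracies and inverts the mean to recover the USPOP raw accuracy implied by a row of the overall table and the matching Magnatune row. The function names are hypothetical, and because the published values are rounded to two decimals, the recovered figure is only approximate.

```python
def overall_score(magnatune_hierarchical: float, uspop_raw: float) -> float:
    """Mean of the two per-dataset accuracies, both given in percent."""
    return (magnatune_hierarchical + uspop_raw) / 2.0


def implied_uspop_raw(overall: float, magnatune_hierarchical: float) -> float:
    """Invert the mean: the USPOP raw accuracy implied by the other two values."""
    return 2.0 * overall - magnatune_hierarchical


# Bergstra, Casagrande & Eck (2): 82.34% overall, 77.75% Magnatune hierarchical.
uspop = implied_uspop_raw(82.34, 77.75)
print(f"Implied USPOP raw accuracy: {uspop:.2f}%")

# Round trip: averaging the two per-dataset values gives back the overall score.
print(f"Overall score: {overall_score(77.75, uspop):.2f}%")
```

The same inversion can be applied to any participant that appears in both the overall table and the Magnatune table.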

Magnatune Dataset

Rank | Participant | Hierarchical Classification Accuracy | Normalized Hierarchical Classification Accuracy | Raw Classification Accuracy | Normalized Raw Classification Accuracy | Runtime (s) | Machine | Confusion Matrix Files | |
---|---|---|---|---|---|---|---|---|---|
1 | Bergstra, Casagrande & Eck (2) | 77.75% | 73.04% | 75.10% | 69.49% | -- | -- | BCE_2_MTeval.txt | |
2 | Bergstra, Casagrande & Eck (1) | 77.25% | 72.13% | 74.71% | 68.73% | 23400 | B0 | BCE_1_MTeval.txt | |
3 | Mandel & Ellis | 71.96% | 69.63% | 67.65% | 63.99% | 8729 | R | ME_MTeval.txt | |
4 | West, K. | 71.67% | 68.33% | 68.43% | 63.87% | 43327 | B4 | W_MTeval.txt | |
5 | Lidy & Rauber (RP+SSD) | 71.08% | 70.90% | 67.65% | 66.85% | 6372 | B1 | LR_RP+SSD_MTeval.txt | |
6 | Lidy & Rauber (RP+SSD+RH) | 70.88% | 70.52% | 67.25% | 66.27% | 6372 | B1 | LR_RP+SSD+RH_MTeval.txt | |
7 | Lidy & Rauber (SSD+RH) | 70.78% | 69.31% | 67.65% | 65.54% | 6372 | B1 | LR_SSD+RH_MTeval.txt | |
8 | Scaringella, N. | 70.47% | 72.30% | 66.14% | 67.12% | 22740 | G | SN_MTeval.txt | |
9 | Pampalk, E. | 69.90% | 70.91% | 66.47% | 66.26% | 3312 | B0 | P_MTeval.txt | |
10 | Ahrendt, P. | 64.61% | 61.40% | 60.98% | 57.15% | 4920 | B1 | A_MTeval.txt | |
11 | Burred, J. | 59.22% | 61.96% | 54.12% | 55.68% | 12483 | B2 | B_MTeval.txt | |
12 | Tzanetakis, G. | 58.14% | 53.47% | 55.49% | 50.39% | 1312 | B0 | T_MTeval.txt | |
13 | Soares, V. | 55.29% | 60.73% | 49.41% | 53.54% | 23880 | Y | SV_MTeval.txt | |
14 | Li, M. | TO * | -- | -- | -- | -- | -- | -- | |
15 | Chen & Gao | DNC * | -- | -- | -- | -- | -- | -- |
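
The table reports both raw and normalized accuracies without defining them. The sketch below shows one common reading, which is an assumption here rather than something this page states: raw accuracy is the fraction of all test files classified correctly, while normalized accuracy is the mean of the per-class accuracies, so that large genres do not dominate. The confusion matrix is made-up toy data (rows are true classes, columns are predicted classes) and is not taken from any of the listed confusion matrix files.

```python
from typing import List

Matrix = List[List[int]]


def raw_accuracy(confusion: Matrix) -> float:
    """Correctly classified files divided by all files."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def normalized_accuracy(confusion: Matrix) -> float:
    """Mean of per-class accuracies (assumed definition); empty classes are skipped."""
    per_class = [
        row[i] / sum(row)
        for i, row in enumerate(confusion)
        if sum(row) > 0
    ]
    return sum(per_class) / len(per_class)


# Hypothetical two-genre example: 100 files of genre A, 10 files of genre B.
toy = [
    [90, 10],  # true A: 90 predicted A, 10 predicted B
    [5, 5],    # true B: 5 predicted A, 5 predicted B
]

print(f"raw:        {raw_accuracy(toy):.4f}")
print(f"normalized: {normalized_accuracy(toy):.4f}")
```

With unbalanced classes the two measures can differ noticeably, which is one reason the rankings by raw and by normalized accuracy in the table above do not coincide.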
Participant | Mean Accuracy |
---|---|
ANO | 41.77% |
BP1 | 55.66% |
BP2 | 54.76% |
CL1 | 60.97% |
CL2 | 60.03% |
GLR1 | 55.34% |
GLR2 | 45.92% |
GP | 48.85% |
GT1 | 43.69% |
GT2 | 51.48% |
HNOS1 | 43.33% |
HNOS2 | 15.84% |
HNOS3 | 42.24% |
HNOS4 | 29.04% |
HW1 | 56.35% |
HW2 | 53.10% |
LZG | 54.40% |
MTG1 | 54.73% |
MTG2 | 62.05% |
MTG3 | 48.12% |
MTG4 | 48.20% |
MTG5 | 49.75% |
MTG6 | 50.36% |
RK1 | 48.41% |
SS | 52.56% |
TTOS | 44.37% |
VA1 | 53.57% |
VA2 | 53.57% |
XLZZG | 53.54% |
XZZ | 57.18% |