2005:Audio Genre Classification Results
From MIREX Wiki
Goal: To classify polyphonic music audio (in PCM format) into genre categories.
Dataset: Two datasets were used: Magnatune and USPOP. The Magnatune dataset has a hierarchical genre taxonomy, while the USPOP categories are at a single level. The audio sampling rates used were either 44.1 kHz or 22.05 kHz (mono). Details of each dataset are given in the following table:
| Dataset | Size (@ 44.1 kHz) | Number of Training Files | Number of Testing Files |
|---|---|---|---|
| Magnatune | 34.3 GB | 1005 | 510 |
| USPOP | 28.4 GB | 940 | 474 |
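As a quick check on the train/test split, the file counts in the table above can be turned into held-out fractions. This is a minimal sketch using only the counts listed on this page; the dictionary and function names are illustrative and not part of the evaluation framework.

```python
# File counts copied from the dataset table above.
DATASETS = {
    "Magnatune": {"train": 1005, "test": 510},
    "USPOP": {"train": 940, "test": 474},
}


def held_out_fraction(train: int, test: int) -> float:
    """Return the share of all files that were reserved for testing."""
    return test / (train + test)


fractions = {
    name: held_out_fraction(counts["train"], counts["test"])
    for name, counts in DATASETS.items()
}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of files held out for testing")
```

Both datasets come out at roughly one third of the files held out for testing.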
OVERALL

| Rank | Participant | Mean of Magnatune Hierarchical Classification Accuracy and USPOP Raw Classification Accuracy |
|---|---|---|
| 1 | Bergstra, Casagrande & Eck (2) | 82.34% |
| 2 | Bergstra, Casagrande & Eck (1) | 81.77% |
| 3 | Mandel & Ellis | 78.81% |
| 4 | West, K. | 75.29% |
| 5 | Lidy & Rauber (SSD+RH) | 75.27% |
| 6 | Pampalk, E. | 75.14% |
| 7 | Lidy & Rauber (RP+SSD) | 74.78% |
| 8 | Lidy & Rauber (RP+SSD+RH) | 74.58% |
| 9 | Scaringella, N. | 73.11% |
| 10 | Ahrendt, P. | 71.55% |
| 11 | Burred, J. | 62.63% |
| 12 | Soares, V. | 60.98% |
| 13 | Tzanetakis, G. | 60.72% |
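The overall score is the plain average of two numbers: the Magnatune hierarchical accuracy and the USPOP raw accuracy. The sketch below shows that average and, since only the Magnatune figures appear on this page, also inverts it to estimate the USPOP raw accuracy implied for the top-ranked entry. The function names are illustrative, and the implied value is approximate because the published percentages are rounded to two decimals.

```python
def overall_score(magnatune_hier: float, uspop_raw: float) -> float:
    """Mean of the two per-dataset accuracies (both in percent)."""
    return (magnatune_hier + uspop_raw) / 2.0


def implied_uspop_raw(overall: float, magnatune_hier: float) -> float:
    """Invert the mean to recover the USPOP raw accuracy (in percent)."""
    return 2.0 * overall - magnatune_hier


# Bergstra, Casagrande & Eck (2): figures taken from the tables on this page.
overall = 82.34          # overall table
magnatune_hier = 77.75   # Magnatune table, hierarchical accuracy

# Estimate only: the inputs are rounded, so this can be off by about 0.01.
uspop_estimate = implied_uspop_raw(overall, magnatune_hier)
print(f"Implied USPOP raw accuracy: {uspop_estimate:.2f}%")

# Round trip: averaging the estimate with the Magnatune score gives the overall score.
round_trip = overall_score(magnatune_hier, uspop_estimate)
print(f"Round-trip overall score: {round_trip:.2f}%")
```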
Magnatune Dataset

| Rank | Participant | Hierarchical Classification Accuracy | Normalized Hierarchical Classification Accuracy | Raw Classification Accuracy | Normalized Raw Classification Accuracy | Runtime (s) | Machine | Confusion Matrix Files |
|---|---|---|---|---|---|---|---|---|
| 1 | Bergstra, Casagrande & Eck (2) | 77.75% | 73.04% | 75.10% | 69.49% | -- | -- | BCE_2_MTeval.txt |
| 2 | Bergstra, Casagrande & Eck (1) | 77.25% | 72.13% | 74.71% | 68.73% | 23400 | B0 | BCE_1_MTeval.txt |
| 3 | Mandel & Ellis | 71.96% | 69.63% | 67.65% | 63.99% | 8729 | R | ME_MTeval.txt |
| 4 | West, K. | 71.67% | 68.33% | 68.43% | 63.87% | 43327 | B4 | W_MTeval.txt |
| 5 | Lidy & Rauber (RP+SSD) | 71.08% | 70.90% | 67.65% | 66.85% | 6372 | B1 | LR_RP+SSD_MTeval.txt |
| 6 | Lidy & Rauber (RP+SSD+RH) | 70.88% | 70.52% | 67.25% | 66.27% | 6372 | B1 | LR_RP+SSD+RH_MTeval.txt |
| 7 | Lidy & Rauber (SSD+RH) | 70.78% | 69.31% | 67.65% | 65.54% | 6372 | B1 | LR_SSD+RH_MTeval.txt |
| 8 | Scaringella, N. | 70.47% | 72.30% | 66.14% | 67.12% | 22740 | G | SN_MTeval.txt |
| 9 | Pampalk, E. | 69.90% | 70.91% | 66.47% | 66.26% | 3312 | B0 | P_MTeval.txt |
| 10 | Ahrendt, P. | 64.61% | 61.40% | 60.98% | 57.15% | 4920 | B1 | A_MTeval.txt |
| 11 | Burred, J. | 59.22% | 61.96% | 54.12% | 55.68% | 12483 | B2 | B_MTeval.txt |
| 12 | Tzanetakis, G. | 58.14% | 53.47% | 55.49% | 50.39% | 1312 | B0 | T_MTeval.txt |
| 13 | Soares, V. | 55.29% | 60.73% | 49.41% | 53.54% | 23880 | Y | SV_MTeval.txt |
| 14 | Li, M. | TO * | -- | -- | -- | -- | -- | -- |
| 15 | Chen & Gao | DNC * | -- | -- | -- | -- | -- | -- |
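This page does not define how the normalized accuracies are computed. The sketch below assumes a common convention: raw accuracy is the fraction of all test files classified correctly, while normalized accuracy is the mean of the per-class accuracies, so that large genres do not dominate the score. The confusion matrix is a made-up toy example, not data from any submission, and the function names are hypothetical.

```python
def raw_accuracy(confusion):
    """Fraction of all files on the diagonal (rows = true genre, cols = predicted)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def normalized_accuracy(confusion):
    """Mean of per-class accuracies (ASSUMED definition, see the note above)."""
    per_class = [
        row[i] / sum(row)
        for i, row in enumerate(confusion)
        if sum(row) > 0  # skip genres with no test files
    ]
    return sum(per_class) / len(per_class)


# Hypothetical two-genre test set with a strong class imbalance.
toy_confusion = [
    [90, 10],  # 100 files of genre A, 90 classified correctly
    [5, 5],    # 10 files of genre B, 5 classified correctly
]

raw = raw_accuracy(toy_confusion)
normalized = normalized_accuracy(toy_confusion)
print(f"raw = {raw:.2%}, normalized = {normalized:.2%}")
```

Under this assumption the normalized score falls below the raw score when errors concentrate in small genres and rises above it when they concentrate in large ones. That would be consistent with the table above, where some entries (for example Scaringella, N.) have a higher normalized than raw accuracy.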
| Participant | Mean Accuracy |
|---|---|
| ANO | 41.77% |
| BP1 | 55.66% |
| BP2 | 54.76% |
| CL1 | 60.97% |
| CL2 | 60.03% |
| GLR1 | 55.34% |
| GLR2 | 45.92% |
| GP | 48.85% |
| GT1 | 43.69% |
| GT2 | 51.48% |
| HNOS1 | 43.33% |
| HNOS2 | 15.84% |
| HNOS3 | 42.24% |
| HNOS4 | 29.04% |
| HW1 | 56.35% |
| HW2 | 53.10% |
| LZG | 54.40% |
| MTG1 | 54.73% |
| MTG2 | 62.05% |
| MTG3 | 48.12% |
| MTG4 | 48.20% |
| MTG5 | 49.75% |
| MTG6 | 50.36% |
| RK1 | 48.41% |
| SS | 52.56% |
| TTOS | 44.37% |
| VA1 | 53.57% |
| VA2 | 53.57% |
| XLZZG | 53.54% |
| XZZ | 57.18% |