Difference between revisions of "2005:Audio Genre Classification Results"

Latest revision as of 11:41, 2 August 2010

Introduction

Goal

To classify polyphonic music audio (in PCM format) into genre categories.

Dataset

Two datasets were used: Magnatune and USPOP. The Magnatune dataset has a hierarchical genre taxonomy, while the USPOP categories are at a single level. The audio was sampled at either 44.1 kHz or 22.05 kHz (mono). Further details are given in the following table:

Dataset	Size (at 44.1 kHz)	Number of Training Files	Number of Testing Files
Magnatune	34.3 GB	1005	510
USPOP	28.4 GB	940	474

Results

Overall

OVERALL
Rank	Participant	Mean of Magnatune Hierarchical Classification Accuracy and USPOP Raw Classification Accuracy
1	Bergstra, Casagrande & Eck (2)	82.34%
2	Bergstra, Casagrande & Eck (1)	81.77%
3	Mandel & Ellis	78.81%
4	West, K.	75.29%
5	Lidy & Rauber (SSD+RH)	75.27%
6	Pampalk, E.	75.14%
7	Lidy & Rauber (RP+SSD)	74.78%
8	Lidy & Rauber (RP+SSD+RH)	74.58%
9	Scaringella, N.	73.11%
10	Ahrendt, P.	71.55%
11	Burred, J.	62.63%
12	Soares, V.	60.98%
13	Tzanetakis, G.	60.72%
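
The overall score is described in the column header as the mean of the Magnatune hierarchical accuracy and the USPOP raw accuracy. The following is a minimal sketch of that arithmetic, using the per-dataset figures published on this page. The rounding mode (half up, to two decimals) is an assumption that happens to reproduce the published values; the page itself does not state it. The names `RESULTS`, `overall_score`, and `rank_overall` are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

# (participant, Magnatune hierarchical accuracy %, USPOP raw accuracy %)
# Values are copied from the per-dataset tables on this page.
RESULTS = [
    ("Bergstra, Casagrande & Eck (2)", "77.75", "86.92"),
    ("Bergstra, Casagrande & Eck (1)", "77.25", "86.29"),
    ("Mandel & Ellis", "71.96", "85.65"),
    ("West, K.", "71.67", "78.90"),
    ("Lidy & Rauber (SSD+RH)", "70.78", "79.75"),
    ("Pampalk, E.", "69.90", "80.38"),
    ("Lidy & Rauber (RP+SSD)", "71.08", "78.48"),
    ("Lidy & Rauber (RP+SSD+RH)", "70.88", "78.27"),
    ("Scaringella, N.", "70.47", "75.74"),
    ("Ahrendt, P.", "64.61", "78.48"),
    ("Burred, J.", "59.22", "66.03"),
    ("Soares, V.", "55.29", "66.67"),
    ("Tzanetakis, G.", "58.14", "63.29"),
]


def overall_score(magnatune, uspop):
    """Mean of the two accuracies, rounded half up to two decimals (assumed)."""
    mean = (Decimal(magnatune) + Decimal(uspop)) / 2
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def rank_overall(results):
    """Return (participant, overall score) pairs, best score first."""
    scored = [(name, overall_score(m, u)) for name, m, u in results]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for rank, (name, score) in enumerate(rank_overall(RESULTS), start=1):
        print(f"{rank}\t{name}\t{score}%")
```

Decimal arithmetic is used instead of floats because several of the means fall exactly on a half-cent boundary (for example 82.335), where binary floating point could round either way.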

Magnatune Dataset

Magnatune Dataset
Rank	Participant	Hierarchical Classification Accuracy	Normalized Hierarchical Classification Accuracy	Raw Classification Accuracy	Normalized Raw Classification Accuracy	Runtime (s)	Machine	Confusion Matrix Files
1	Bergstra, Casagrande & Eck (2)	77.75%	73.04%	75.10%	69.49%	--	--	BCE_2_MTeval.txt
2	Bergstra, Casagrande & Eck (1)	77.25%	72.13%	74.71%	68.73%	23400	B0	BCE_1_MTeval.txt
3	Mandel & Ellis	71.96%	69.63%	67.65%	63.99%	8729	R	ME_MTeval.txt
4	West, K.	71.67%	68.33%	68.43%	63.87%	43327	B4	W_MTeval.txt
5	Lidy & Rauber (RP+SSD)	71.08%	70.90%	67.65%	66.85%	6372	B1	LR_RP+SSD_MTeval.txt
6	Lidy & Rauber (RP+SSD+RH)	70.88%	70.52%	67.25%	66.27%	6372	B1	LR_RP+SSD+RH_MTeval.txt
7	Lidy & Rauber (SSD+RH)	70.78%	69.31%	67.65%	65.54%	6372	B1	LR_SSD+RH_MTeval.txt
8	Scaringella, N.	70.47%	72.30%	66.14%	67.12%	22740	G	SN_MTeval.txt
9	Pampalk, E.	69.90%	70.91%	66.47%	66.26%	3312	B0	P_MTeval.txt
10	Ahrendt, P.	64.61%	61.40%	60.98%	57.15%	4920	B1	A_MTeval.txt
11	Burred, J.	59.22%	61.96%	54.12%	55.68%	12483	B2	B_MTeval.txt
12	Tzanetakis, G.	58.14%	53.47%	55.49%	50.39%	1312	B0	T_MTeval.txt
13	Soares, V.	55.29%	60.73%	49.41%	53.54%	23880	Y	SV_MTeval.txt
14	Li, M.	TO *	--	--	--	--	--	--
15	Chen & Gao	DNC *	--	--	--	--	--	--
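
The tables report both raw and normalized accuracies and link to confusion matrix files. As a rough sketch of how the two figures can differ, the code below assumes that "normalized" means the per-class accuracy averaged over classes, so that large genres do not dominate the score; this page does not define the term, so treat that reading as an assumption. The matrix layout (rows are true genres, columns are predicted genres) and the toy counts are hypothetical and are not taken from the linked files.

```python
def raw_accuracy(confusion):
    """Fraction of all test files that were classified correctly."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def normalized_accuracy(confusion):
    """Mean of per-class accuracies (assumed meaning of 'normalized')."""
    per_class = [
        row[i] / sum(row)
        for i, row in enumerate(confusion)
        if sum(row) > 0  # skip genres with no test files
    ]
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    # Hypothetical two-genre example: rows = true genre, columns = predicted.
    toy = [
        [90, 10],  # large genre, 90% correct
        [5, 5],    # small genre, 50% correct
    ]
    print(f"raw:        {raw_accuracy(toy):.4f}")
    print(f"normalized: {normalized_accuracy(toy):.4f}")
```

With an imbalanced test set such as this toy example, the raw figure is pulled toward the large genre's accuracy, while the normalized figure weights both genres equally.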

USPOP Dataset

USPOP Dataset
Rank	Participant	Raw Classification Accuracy	Normalized Raw Classification Accuracy	Runtime (s)	Machine	Confusion Matrix Files
1	Bergstra, Casagrande & Eck (2)	86.92%	82.91%	--	--	BCE_2_USeval.txt
2	Bergstra, Casagrande & Eck (1)	86.29%	82.50%	23400	B0	BCE_1_USeval.txt
3	Mandel & Ellis	85.65%	76.91%	7856	R	ME_USeval.txt
4	Pampalk, E.	80.38%	78.74%	3090	B0	P_USeval.txt
5	Lidy & Rauber (SSD+RH)	79.75%	75.45%	5164	B1	LR_SSD+RH_USeval.txt
6	West, K.	78.90%	74.67%	18557	B4	W_USeval.txt
7	Lidy & Rauber (RP+SSD)	78.48%	77.62%	5164	B1	LR_RP+SSD_USeval.txt
8	Ahrendt, P.	78.48%	73.23%	9702	B1	A_USeval.txt
9	Lidy & Rauber (RP+SSD+RH)	78.27%	76.84%	5194	B1	LR_RP+SSD+RH_USeval.txt
10	Scaringella, N.	75.74%	77.67%	24606	G	SN_USeval.txt
11	Soares, V.	66.67%	67.28%	14369	Y	SV_USeval.txt
12	Burred, J.	66.03%	72.50%	9233	B2	B_USeval.txt
13	Tzanetakis, G.	63.29%	50.19%	1320	B0	T_USeval.txt
14	Chen & Gao	22.93%	17.96%	N/A	Y	CG_USeval.txt
15	Li, M.	TO *

@@ Line 1: / Line 1: @@
-'''Goal:''' To classify polyphonic music audio (in PCM format) into genre categories.
+==Introduction==
-'''Dataset:''' Two sets of data were used: Magnatune and USPOP. The Magnatune dataset has a hierarchical genre taxonomy, while the USPOP categories are at a single level. The audio sampling rates used were either 44.1 KHz or 22.05 KHz (mono). More data information is in the following table:
+===Goal===
+To classify polyphonic music audio (in PCM format) into genre categories.
-{| border="1"
+===Dataset===
+Two sets of data were used: Magnatune and USPOP. The Magnatune dataset has a hierarchical genre taxonomy, while the USPOP categories are at a single level. The audio sampling rates used were either 44.1 KHz or 22.05 KHz (mono). More data information is in the following table:
+{| border="1"  cellspacing="0"
 |- style="background: yellow; text-align: center;"
 ! Dataset !! Size (@ 44.1 KHz) !! Number of Training Files !! Number of Testing Files
@@ Line 13: / Line 17: @@
 |}
-<br>
+==Results==
-{| border="1"
+===Overall===
+{| border="1"  cellspacing="0"
 |- style="background: yellow; text-align: center;"
 ! colspan="3" | OVERALL
 |-style="background: yellow;"
-! Rank !! Participant !! Mean of Magnatune Hierarchical Classification <br> Accuracy and USPOP Raw Classification Accuracy
+! Rank !! Participant !! Mean of Magnatune Hierarchical Classification Accuracy <br> and USPOP Raw Classification Accuracy
 |-
-| 1 || Bergstra, Casagrande & Eck (2) || 82.34%
+| 1 || [https://www.music-ir.org/mirex/abstracts/2005/bergstra.pdf Bergstra, Casagrande & Eck (2)] || 82.34%
 |-
-| 2 || Bergstra, Casagrande & Eck (1) || 81.77%
+| 2 || [https://www.music-ir.org/mirex/abstracts/2005/bergstra.pdf Bergstra, Casagrande & Eck (1)] || 81.77%
 |-
-| 3 || Mandel & Ellis || 78.81%
+| 3 || [https://www.music-ir.org/mirex/abstracts/2005/mandel.pdf Mandel & Ellis] || 78.81%
 |-
-| 4 || West, K. || 75.29%
+| 4 || [https://www.music-ir.org/mirex/abstracts/2005/west.pdf West, K.] || 75.29%
 |-
-| 5 || Lidy & Rauber (SSD+RH) || 75.27%
+| 5 || [https://www.music-ir.org/mirex/abstracts/2005/lidy.pdf Lidy & Rauber (SSD+RH)] || 75.27%
 |-
-| 6 || Pampalk, E. || 75.14%
+| 6 || [https://www.music-ir.org/mirex/abstracts/2005/pampalk.pdf Pampalk, E.] || 75.14%
 |-
-| 7 || Lidy & Rauber (RP+SSD) || 74.78%
+| 7 || [https://www.music-ir.org/mirex/abstracts/2005/lidy.pdf Lidy & Rauber (RP+SSD)] || 74.78%
 |-
-| 8 || Lidy & Rauber (RP+SSD+RH) || 74.58%
+| 8 || [https://www.music-ir.org/mirex/abstracts/2005/lidy.pdf Lidy & Rauber (RP+SSD+RH)] || 74.58%
 |-
-| 9 || Scaringella, N. || 73.11%
+| 9 || [https://www.music-ir.org/mirex/abstracts/2005/scaringella.pdf Scaringella, N.] || 73.11%
 |-
-| 10 || Ahrendt, P. || 71.55%
+| 10 || [https://www.music-ir.org/mirex/abstracts/2005/ahrendt.pdf Ahrendt, P.] || 71.55%
 |-
-| 11 || Burred, J. || 62.63%
+| 11 || [https://www.music-ir.org/mirex/abstracts/2005/burred.pdf Burred, J.] || 62.63%
 |-
-| 12 || Soares, V. || 60.98%
+| 12 || [https://www.music-ir.org/mirex/abstracts/2005/soares.pdf Soares, V.] || 60.98%
 |-
-| 13 || Tzanetakis, G. || 60.72%
+| 13 || [https://www.music-ir.org/mirex/abstracts/2005/tzanetakis.pdf Tzanetakis, G.] || 60.72%
 |-
 |}
-<br>
+===Magnatune Dataset===
+{| border="1"  cellspacing="0"
-{| border="1"
 |- style="background: yellow; text-align: center;"
-! colspan="3" | Magnatune Dataset
+! colspan="9" | Magnatune Dataset
 |-style="background: yellow;"
 ! Rank	!! Participant !! Hierarchical Classification Accuracy !! Normalized Hierarchical Classification Accuracy !!	Raw Classification Accuracy	!! Normalized Raw Classification Accuracy !! Runtime (s) !! Machine !! Confusion Matrix Files
 |-
-| 1 || Bergstra, Casagrande & Eck (2) || 77.75% || 73.04% || 75.10% || 69.49% || -- || -- || BCE_2_MTeval.txt
+| 1 || Bergstra, Casagrande & Eck (2) || 77.75% || 73.04% || 75.10% || 69.49% || -- || -- || [https://www.music-ir.org/mirex/results/2005/audio-genre/BCE_2_MTeval.txt BCE_2_MTeval.txt]
 |-
-| 2 || Bergstra, Casagrande & Eck (1) || 77.25%	|| 72.13% || 74.71% || 68.73% || 23400	|| B0 || BCE_1_MTeval.txt
+| 2 || Bergstra, Casagrande & Eck (1) || 77.25%	|| 72.13% || 74.71% || 68.73% || 23400	|| B0 || [https://www.music-ir.org/mirex/results/2005/audio-genre/BCE_1_MTeval.txt BCE_1_MTeval.txt]
 |-
-| 3 ||	Mandel & Ellis	||71.96%||	69.63%||	67.65%	||63.99%||	8729	||R	||ME_MTeval.txt
+| 3 ||	Mandel & Ellis	||71.96%||	69.63%||	67.65%	||63.99%||	8729	||R	||[https://www.music-ir.org/mirex/results/2005/audio-genre/ME_MTeval.txt	ME_MTeval.txt]
 |-
-| 4	||West, K.||	71.67%||	68.33%	||68.43%||	63.87%||	43327	||B4||	W_MTeval.txt
+| 4	||West, K.||	71.67%||	68.33%	||68.43%||63.87%||43327	||B4||	[https://www.music-ir.org/mirex/results/2005/audio-genre/W_MTeval.txt W_MTeval.txt]
 |-
-| 5 ||	Lidy & Rauber (RP+SSD)	|| 71.08% || 70.90%	|| 67.65%	||66.85%	||6372	||B1||	LR_RP+SSD_MTeval.txt
+| 5 ||	Lidy & Rauber (RP+SSD)	|| 71.08% || 70.90%|| 67.65%||66.85%||6372||B1||[https://www.music-ir.org/mirex/results/2005/audio-genre/LR_RP+SSD_MTeval.txt LR_RP+SSD_MTeval.txt]
 |-
-| 6 ||	Lidy & Rauber (RP+SSD+RH) ||	70.88% ||	70.52% ||	67.25% ||	66.27% ||	6372 ||	B1 ||	LR_RP+SSD+RH_MTeval.txt
+| 6 ||	Lidy & Rauber (RP+SSD+RH) ||	70.88% ||70.52% ||67.25% ||66.27% ||6372 ||B1 ||[https://www.music-ir.org/mirex/results/2005/audio-genre/LR_RP+SSD+RH_MTeval.txt	LR_RP+SSD+RH_MTeval.txt]
 |-
-| 7 ||	Lidy & Rauber (SSD+RH) ||	70.78% ||	69.31% ||	67.65% ||	65.54% ||	6372 ||	B1 ||	LR_SSD+RH_MTeval.txt
+| 7 ||	Lidy & Rauber (SSD+RH) ||70.78% ||69.31% ||67.65% ||65.54% ||	6372 ||	B1 ||[https://www.music-ir.org/mirex/results/2005/audio-genre/LR_SSD+RH_MTeval.txt LR_SSD+RH_MTeval.txt]
 |-
-| 8 ||	Scaringella, N.	|| 70.47% || 72.30%	|| 66.14%	|| 67.12%	|| 22740	|| G	|| SN_MTeval.txt
+| 8 ||	Scaringella, N.	|| 70.47% || 72.30%|| 66.14%|| 67.12%|| 22740|| G|| [https://www.music-ir.org/mirex/results/2005/audio-genre/SN_MTeval.txt SN_MTeval.txt]
 |-
-| 9 ||	Pampalk, E. || 69.90%	|| 70.91% || 66.47% ||	66.26%	|| 3312	|| B0	|| P_MTeval.txt
+| 9 ||	Pampalk, E. || 69.90%	|| 70.91% || 66.47% ||	66.26%	|| 3312	|| B0	|| [https://www.music-ir.org/mirex/results/2005/audio-genre/P_MTeval.txt P_MTeval.txt]
 |-
-| 10 || Ahrendt, P. || 64.61%	|| 61.40%	|| 60.98%	|| 57.15% || 4920 ||	B1 || A_MTeval.txt
+| 10 || Ahrendt, P. || 64.61%	|| 61.40%	|| 60.98%	|| 57.15% || 4920 ||	B1 || [https://www.music-ir.org/mirex/results/2005/audio-genre/A_MTeval.txt A_MTeval.txt]
 |-
-| 11 ||	Burred, J. || 59.22% ||	61.96% || 54.12% || 55.68% || 12483 ||	B2 ||	B_MTeval.txt
+| 11 ||	Burred, J. || 59.22% ||	61.96% || 54.12% || 55.68% || 12483 ||	B2 ||[https://www.music-ir.org/mirex/results/2005/audio-genre/B_MTeval.txt B_MTeval.txt]
 |-
-| 12 ||	Tzanetakis, G. || 58.14% ||	53.47% ||	55.49% ||	50.39%	|| 1312	|| B0	|| T_MTeval.txt
+| 12 ||	Tzanetakis, G. || 58.14% ||	53.47% ||	55.49% ||	50.39%	|| 1312	|| B0	|| [https://www.music-ir.org/mirex/results/2005/audio-genre/T_MTeval.txt T_MTeval.txt]
 |-
-| 13 ||	Soares, V. ||	55.29% ||	60.73% ||	49.41% ||	53.54% ||	23880 ||	Y ||	SV_MTeval.txt
+| 13 ||	Soares, V. ||	55.29% ||	60.73% ||	49.41% ||	53.54% ||	23880 ||	Y ||[https://www.music-ir.org/mirex/results/2005/audio-genre/SV_MTeval.txt SV_MTeval.txt]
 |-
-| 14 ||	Li, M. || TO * || -- || -- || -- || -- || -- || -- ||
+| 14 ||	Li, M. || TO * || -- || -- || -- || -- || -- || --
 |-
-| 15 ||	Chen & Gao ||	DNC * || -- || -- || -- || -- || -- || -- ||
+| 15 ||	Chen & Gao ||	DNC * || -- || -- || -- || -- || -- || --
 |-
+|}
+===USPOP Dataset===
+{| border="1"  cellspacing="0"
+|- style="background: yellow; text-align: center;"
+! colspan="7" | USPOP Dataset
+|-style="background: yellow;"
+|----
+!Rank
+!Participant
+!Raw Classification Accuracy
+!Normalized Raw Classification Accuracy
+!Runtime (s)
+!Machine
+!Confusion Matrix Files
+|----
+|1
+|Bergstra, Casagrande &amp; Eck (2)
+|86.92%
+|82.91%
+|
+|
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/BCE_2_USeval.txt BCE_2_USeval.txt]
+|----
+|2
+|Bergstra, Casagrande &amp; Eck (1)
+|86.29%
+|82.50%
+|23400
+|B0
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/BCE_1_USeval.txt BCE_1_USeval.txt]
+|----
+|3
+|Mandel &amp; Ellis
+|85.65%
+|76.91%
+|7856
+|R
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/ME_USeval.txt ME_USeval.txt]
+|----
+|4
+|Pampalk, E.
+|80.38%
+|78.74%
+|3090
+|B0
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/P_USeval.txt P_USeval.txt]
+|----
+|5
+|Lidy &amp; Rauber (SSD+RH)
+|79.75%
+|75.45%
+|5164
+|B1
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/LR_SSD+RH_USeval.txt LR_SSD+RH_USeval.txt]
+|----
+|6
+|West, K.
+|78.90%
+|74.67%
+|18557
+|B4
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/W_USeval.txt W_USeval.txt]
+|----
+|7
+|Lidy &amp; Rauber (RP+SSD)
+|78.48%
+|77.62%
+|5164
+|B1
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/LR_RP+SSD_USeval.txt LR_RP+SSD_USeval.txt]
+|----
+|8
+|Ahrendt, P.
+|78.48%
+|73.23%
+|9702
+|B1
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/A_USeval.txt A_USeval.txt]
+|----
+|9
+|Lidy &amp; Rauber (RP+SSD+RH)
+|78.27%
+|76.84%
+|5194
+|B1
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/LR_RP+SSD+RH_USeval.txt LR_RP+SSD+RH_USeval.txt]
+|----
+|10
+|Scaringella, N.
+|75.74%
+|77.67%
+|24606
+|G
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/SN_USeval.txt SN_USeval.txt]
+|----
+|11
+|Soares, V.
+|66.67%
+|67.28%
+|14369
+|Y
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/SV_USeval.txt SV_USeval.txt]
+|----
+|12
+|Burred, J.
+|66.03%
+|72.50%
+|9233
+|B2
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/B_USeval.txt B_USeval.txt]
+|----
+|13
+|Tzanetakis, G.
+|63.29%
+|50.19%
+|1320
+|B0
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/T_USeval.txt T_USeval.txt]
+|----
+|14
+|Chen &amp; Gao
+|22.93%
+|17.96%
+|N/A
+|Y
+|[https://www.music-ir.org/mirex/results/2005/audio-genre/CG_USeval.txt CG_USeval.txt]
+|----
+|15
+|Li, M.
+|TO *
+|
+|
+|
+|
+|----
 |}