Difference between revisions of "2011:Symbolic Melodic Similarity Results"

Latest revision as of 12:30, 4 November 2011

Introduction

These are the results for the 2011 running of the Symbolic Melodic Similarity task set. For background information about this task set please refer to the 2011:Symbolic Melodic Similarity page.

Each system was given a query and returned the 10 most melodically similar songs from the Essen Collection (5274 pieces in MIDI format; see the ESAC Data Homepage for more information). For each query, we created four classes of error mutations, so the set comprises the following query classes:

0. No errors
1. One note deleted
2. One note inserted
3. One interval enlarged
4. One interval compressed
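
As an illustration only, the four error-mutation classes can be sketched on a melody represented as a list of MIDI pitch numbers. The representation, the function names, and the choice to transpose every note after the altered interval (so that exactly one interval changes) are assumptions made for this sketch; the procedure actually used to mutate the queries is not described on this page.

```python
from typing import List


def delete_note(pitches: List[int], index: int) -> List[int]:
    """Class 1: drop the note at `index`."""
    return pitches[:index] + pitches[index + 1:]


def insert_note(pitches: List[int], index: int, pitch: int) -> List[int]:
    """Class 2: insert `pitch` before position `index`."""
    return pitches[:index] + [pitch] + pitches[index:]


def _resize_interval(pitches: List[int], index: int, delta: int) -> List[int]:
    """Grow (delta > 0) or shrink (delta < 0) the interval between notes
    `index` and `index + 1`, transposing the tail so no other interval changes."""
    interval = pitches[index + 1] - pitches[index]
    if abs(interval) + delta < 0:
        raise ValueError("interval is too small to compress by that amount")
    direction = 1 if interval >= 0 else -1
    shift = direction * delta
    return pitches[:index + 1] + [p + shift for p in pitches[index + 1:]]


def enlarge_interval(pitches: List[int], index: int, semitones: int = 1) -> List[int]:
    """Class 3: widen one interval by `semitones`."""
    return _resize_interval(pitches, index, semitones)


def compress_interval(pitches: List[int], index: int, semitones: int = 1) -> List[int]:
    """Class 4: narrow one interval by `semitones`."""
    return _resize_interval(pitches, index, -semitones)


melody = [60, 62, 64, 65, 67]            # hypothetical query, class 0 (no errors)
print(delete_note(melody, 2))            # [60, 62, 65, 67]
print(insert_note(melody, 2, 63))        # [60, 62, 63, 64, 65, 67]
print(enlarge_interval(melody, 1))       # [60, 62, 65, 66, 68]
print(compress_interval(melody, 1))      # [60, 62, 63, 64, 66]
```

Transposing the tail keeps every other interval intact, which matches the "one interval" wording of classes 3 and 4; moving a single note instead would change two intervals.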

For each query (and its 4 mutations), the returned results (candidates) from all systems were grouped together (query set) for evaluation by the human graders. The graders were provided with only the perfect version of the query against which to evaluate the candidates, and did not know whether a candidate came from a perfect or a mutated query. Each query/candidate set was evaluated by 1 individual grader. Using the Evalutron 6000 system, the graders gave each query/candidate pair two types of scores: 1 categorical score with 3 categories (NS, SS, VS, as explained below) and 1 fine score (in the range from 0 to 100).
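
The grouping of candidates into one query set per perfect query can be sketched as below. This is a minimal sketch under stated assumptions: the nested dictionary layout, the system names, and the candidate identifiers are hypothetical, and the base query is assumed to be the part of the identifier before the underscore (the q01, q01_1 naming seen in the Friedman table).

```python
from collections import defaultdict
from typing import Dict, List, Set


def pool_candidates(runs: Dict[str, Dict[str, List[str]]]) -> Dict[str, Set[str]]:
    """Merge the candidates returned by every system for a query and all of
    its mutations into one set of unique candidates per perfect query."""
    pools: Dict[str, Set[str]] = defaultdict(set)
    for per_query in runs.values():
        for query_id, candidates in per_query.items():
            base_query = query_id.split("_")[0]   # "q01_1" -> "q01" (assumed naming)
            pools[base_query].update(candidates)
    return dict(pools)


# Hypothetical runs: system -> query id -> ranked candidate ids
runs = {
    "sysA": {"q01": ["c1", "c2"], "q01_1": ["c2", "c3"]},
    "sysB": {"q01": ["c1", "c4"], "q01_1": ["c4", "c2"]},
}
pools = pool_candidates(runs)
returned = sum(len(c) for per_query in runs.values() for c in per_query.values())
unique_pairs = sum(len(p) for p in pools.values())
print(returned, unique_pairs)  # 8 4
```

Overlap of this kind between systems and between mutations of the same query would explain why the number of unique query/candidate pairs graded is much smaller than the total number of candidates returned.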

Evalutron 6000 Summary Data

Number of evaluators = 6
Number of evaluations per query/candidate pair = 1
Number of queries per grader = 1
Total number of candidates returned = 3900
Total number of unique query/candidate pairs graded = 895
Average number of query/candidate pairs evaluated per grader: 149
Number of queries = 6 (perfect) with each perfect query error-mutated 4 different ways = 30

General Legend

Sub code	Submission name	Abstract	Contributors
LJY1	LEE1	PDF	Juwan Lee,Seokhwan Jo,Chang D. Yoo
LJY2	LEE2	PDF	Juwan Lee,Seokhwan Jo,Chang D. Yoo
UL1	Shape	PDF	Julián Urbano, Juan Lloréns,Jorge Morato, Sonia Sánchez-Cuadrado
UL2	Pitch	PDF	Julián Urbano, Juan Lloréns,Jorge Morato, Sonia Sánchez-Cuadrado
UL3	Time	PDF	Julián Urbano, Juan Lloréns,Jorge Morato, Sonia Sánchez-Cuadrado
WK1	Tir'a'Mir - binary	PDF	Jacek Wolkowicz, Vlado Keselj
WK2	Tir'a'Mir - melodic cosine	PDF	Jacek Wolkowicz, Vlado Keselj
WK3	Tir'a'Mir - cosine combine	PDF	Jacek Wolkowicz, Vlado Keselj
WK4	Tir'a'Mir - tfidf	PDF	Jacek Wolkowicz, Vlado Keselj
WK5	Tir'a'Mir - melodic bm25	PDF	Jacek Wolkowicz, Vlado Keselj
WK6	Tir'a'Mir - bm25 combine	PDF	Jacek Wolkowicz, Vlado Keselj

Broad Categories

NS = Not Similar
SS = Somewhat Similar
VS = Very Similar

Table Headings

ADR = Average Dynamic Recall
NRGB = Normalized Recall at Group Boundaries
AP = Average Precision (non-interpolated)
PND = Precision at N Documents
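
For readers unfamiliar with the last two headings, the sketch below computes non-interpolated Average Precision and Precision at N Documents for one ranked list. It assumes binary relevance flags (1 = relevant, 0 = not relevant) in rank order; how relevance was derived from the human judgments, and the value of N used for PND in this evaluation, are not stated on this page, so both are treated as inputs here. ADR and NRGB are not covered by this sketch.

```python
from typing import Optional, Sequence


def precision_at_n(relevance: Sequence[int], n: int) -> float:
    """Fraction of the top-n returned documents that are relevant."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sum(relevance[:n]) / n


def average_precision(relevance: Sequence[int],
                      num_relevant: Optional[int] = None) -> float:
    """Non-interpolated AP: mean of precision@k over the ranks k that hold
    a relevant document. `num_relevant` is the total number of relevant
    documents; if omitted, only the relevant documents in the list count."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    denominator = hits if num_relevant is None else num_relevant
    return precision_sum / denominator if denominator else 0.0


ranked = [1, 0, 1, 0, 0]                    # hypothetical judgments, rank 1 first
print(precision_at_n(ranked, 5))            # 0.4
print(round(average_precision(ranked), 4))  # 0.8333
```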

Calculating Summary Measures

Fine⁽¹⁾ = Sum of fine-grained human similarity decisions (0-100).
PSum⁽¹⁾ = Sum of human broad similarity decisions: NS=0, SS=1, VS=2.
WCsum⁽¹⁾ = 'World Cup' scoring: NS=0, SS=1, VS=3 (rewards Very Similar).
SDsum⁽¹⁾ = 'Stephen Downie' scoring: NS=0, SS=1, VS=4 (strongly rewards Very Similar).
Greater0⁽¹⁾ = NS=0, SS=1, VS=1 (binary relevance judgment).
Greater1⁽¹⁾ = NS=0, SS=0, VS=1 (binary relevance judgment using only Very Similar).

⁽¹⁾Normalized to the range 0 to 1.
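
The measures above can be sketched for one list of graded candidates as follows. The function name and input layout are hypothetical, and the normalization (dividing each sum by the largest sum attainable for that number of candidates) is an assumption; it is, however, consistent with the relationships between the Psum, Wcsum, Sdsum, Greater0 and Greater1 values reported in the tables on this page.

```python
from typing import Dict, Sequence

# Weights for the broad categories, as listed above
WEIGHTS: Dict[str, Dict[str, int]] = {
    "PSum":     {"NS": 0, "SS": 1, "VS": 2},
    "WCsum":    {"NS": 0, "SS": 1, "VS": 3},
    "SDsum":    {"NS": 0, "SS": 1, "VS": 4},
    "Greater0": {"NS": 0, "SS": 1, "VS": 1},
    "Greater1": {"NS": 0, "SS": 0, "VS": 1},
}


def summary_measures(broad: Sequence[str], fine: Sequence[float]) -> Dict[str, float]:
    """Return each summary measure normalized to the range 0 to 1.

    `broad` holds one of "NS", "SS", "VS" per candidate and `fine` holds the
    matching fine score (0-100). Normalization by the maximum attainable sum
    is an assumption of this sketch."""
    if len(broad) != len(fine) or not broad:
        raise ValueError("need one broad and one fine score per candidate")
    n = len(broad)
    measures = {"Fine": sum(fine) / (100.0 * n)}
    for name, weights in WEIGHTS.items():
        best = max(weights.values()) * n          # all candidates judged VS
        measures[name] = sum(weights[b] for b in broad) / best
    return measures


# Hypothetical judgments for four candidates
scores = summary_measures(["VS", "SS", "NS", "SS"], [90, 50, 10, 50])
for name, value in scores.items():
    print(name, round(value, 3))
```

Because WCsum and SDsum weight VS more heavily than PSum does, a system that returns a few Very Similar candidates gains more under those two measures than one that returns many Somewhat Similar candidates.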

Summary Results

Overall Scores (Includes Perfect and Error Candidates)

SCORE	LJY1	LJY2	UL1	UL2	UL3	WK1	WK2	WK3	WK4	WK5	WK6
ADR	0.6446	0.6595	0.6508	0.6752	0.7257	0.6761	0.665	0.6565	0.6504	0.6543	0.6529
NRGB	0.6255	0.6396	0.6269	0.6512	0.6962	0.6522	0.644	0.6393	0.6319	0.6344	0.6332
AP	0.4807	0.498	0.6262	0.6241	0.6122	0.6104	0.581	0.4618	0.5381	0.5371	0.4969
PND	0.4974	0.5281	0.6625	0.6325	0.6211	0.6048	0.5914	0.497	0.5388	0.5519	0.5059
Fine	0.481	0.494	0.594	0.568	0.552	0.515	0.489	0.434	0.462	0.465	0.458
Wcsum	0.407	0.43	0.543	0.519	0.511	0.457	0.434	0.369	0.393	0.392	0.394
Psum	0.463	0.487	0.615	0.575	0.572	0.497	0.477	0.42	0.428	0.428	0.443
Sdsum	0.378	0.402	0.508	0.491	0.481	0.437	0.413	0.343	0.376	0.374	0.37
Greater0	0.633	0.657	0.83	0.743	0.753	0.617	0.603	0.573	0.533	0.537	0.59
Greater1	0.293	0.317	0.4	0.407	0.39	0.377	0.35	0.267	0.323	0.32	0.297

Scores by Query Error Types

SCORE	LJY1	LJY2	UL1	UL2	UL3	WK1	WK2	WK3	WK4	WK5	WK6
ADR	0.667	0.6883	0.6827	0.6983	0.7231	0.6817	0.692	0.6671	0.6781	0.6877	0.6658
NRGB	0.6446	0.6679	0.647	0.6729	0.6874	0.6627	0.6708	0.6577	0.6628	0.672	0.6463
AP	0.4849	0.5255	0.6537	0.6328	0.593	0.5977	0.6006	0.4694	0.5517	0.565	0.5053
PND	0.5143	0.5548	0.6952	0.6619	0.6048	0.6119	0.6119	0.5238	0.5548	0.5952	0.5238
Fine	0.472	0.503	0.631	0.595	0.564	0.522	0.473	0.434	0.42	0.445	0.477
Wcsum	0.4	0.428	0.589	0.533	0.528	0.444	0.411	0.372	0.328	0.35	0.417
Psum	0.458	0.492	0.667	0.6	0.592	0.492	0.458	0.425	0.367	0.392	0.467
Sdsum	0.371	0.396	0.55	0.5	0.496	0.421	0.388	0.346	0.308	0.329	0.392
Greater0	0.633	0.683	0.9	0.8	0.783	0.633	0.6	0.583	0.483	0.517	0.617
Greater1	0.283	0.3	0.433	0.4	0.4	0.35	0.317	0.267	0.25	0.267	0.317

SCORE	LJY1	LJY2	UL1	UL2	UL3	WK1	WK2	WK3	WK4	WK5	WK6
ADR	0.6675	0.6783	0.6247	0.6897	0.7581	0.7047	0.6656	0.6521	0.6417	0.6522	0.6211
NRGB	0.6462	0.6612	0.616	0.6674	0.7331	0.6826	0.6456	0.6312	0.6204	0.6306	0.6067
AP	0.4767	0.4851	0.6124	0.589	0.6036	0.6062	0.5331	0.4238	0.4781	0.4806	0.4584
PND	0.4722	0.4889	0.6444	0.5778	0.6333	0.5889	0.5222	0.4389	0.4722	0.4722	0.4833
Fine	0.472	0.503	0.631	0.595	0.564	0.522	0.473	0.434	0.42	0.445	0.477
Wcsum	0.4	0.428	0.589	0.533	0.528	0.444	0.411	0.372	0.328	0.35	0.417
Psum	0.458	0.492	0.667	0.6	0.592	0.492	0.458	0.425	0.367	0.392	0.467
Sdsum	0.371	0.396	0.55	0.5	0.496	0.421	0.388	0.346	0.308	0.329	0.392
Greater0	0.633	0.683	0.9	0.8	0.783	0.633	0.6	0.583	0.483	0.517	0.617
Greater1	0.283	0.3	0.433	0.4	0.4	0.35	0.317	0.267	0.25	0.267	0.317

SCORE	LJY1	LJY2	UL1	UL2	UL3	WK1	WK2	WK3	WK4	WK5	WK6
ADR	0.5947	0.6054	0.6572	0.6639	0.6978	0.6541	0.6426	0.6391	0.6262	0.6197	0.6524
NRGB	0.5819	0.5842	0.6373	0.6446	0.6645	0.6294	0.6289	0.624	0.6128	0.6136	0.6403
AP	0.4385	0.4497	0.5977	0.596	0.5875	0.5914	0.5511	0.4294	0.5161	0.5011	0.4977
PND	0.481	0.4952	0.6563	0.6119	0.5714	0.5786	0.6063	0.4571	0.5159	0.5048	0.4905
Fine	0.486	0.484	0.604	0.557	0.561	0.503	0.502	0.424	0.458	0.465	0.463
Wcsum	0.411	0.411	0.561	0.528	0.511	0.461	0.444	0.344	0.394	0.389	0.422
Psum	0.475	0.475	0.642	0.583	0.575	0.492	0.483	0.4	0.425	0.425	0.467
Sdsum	0.379	0.379	0.521	0.5	0.479	0.446	0.425	0.317	0.379	0.371	0.4
Greater0	0.667	0.667	0.883	0.75	0.767	0.583	0.6	0.567	0.517	0.533	0.6
Greater1	0.283	0.283	0.4	0.417	0.383	0.4	0.367	0.233	0.333	0.317	0.333

SCORE	LJY1	LJY2	UL1	UL2	UL3	WK1	WK2	WK3	WK4	WK5	WK6
ADR	0.6413	0.6543	0.6379	0.6933	0.7266	0.67	0.6579	0.6504	0.633	0.6416	0.653
NRGB	0.6181	0.6355	0.6119	0.6681	0.7034	0.6417	0.6315	0.6292	0.6199	0.6137	0.6217
AP	0.4774	0.4865	0.6142	0.6444	0.6303	0.5871	0.5831	0.4571	0.5296	0.5097	0.4656
PND	0.4694	0.5139	0.6333	0.6444	0.6333	0.5778	0.5667	0.4778	0.5389	0.5333	0.4611
Fine	0.465	0.473	0.537	0.537	0.556	0.465	0.479	0.417	0.454	0.431	0.404
Wcsum	0.383	0.417	0.478	0.483	0.528	0.4	0.411	0.35	0.406	0.35	0.317
Psum	0.425	0.467	0.533	0.525	0.583	0.433	0.45	0.392	0.433	0.383	0.358
Sdsum	0.363	0.392	0.45	0.463	0.5	0.383	0.392	0.329	0.392	0.333	0.296
Greater0	0.55	0.617	0.7	0.65	0.75	0.533	0.567	0.517	0.517	0.483	0.483
Greater1	0.3	0.317	0.367	0.4	0.417	0.333	0.333	0.267	0.35	0.283	0.233

SCORE	LJY1	LJY2	UL1	UL2	UL3	WK1	WK2	WK3	WK4	WK5	WK6
ADR	0.6526	0.671	0.6515	0.6309	0.7228	0.6698	0.667	0.674	0.6733	0.6699	0.6723
NRGB	0.6369	0.6492	0.622	0.6031	0.6928	0.6448	0.6431	0.6546	0.6436	0.6421	0.6513
AP	0.526	0.543	0.653	0.6585	0.6468	0.6697	0.6372	0.5294	0.615	0.6292	0.5577
PND	0.55	0.5875	0.6833	0.6667	0.6625	0.6667	0.65	0.5875	0.6125	0.6542	0.5708
Fine	0.485	0.495	0.571	0.546	0.522	0.534	0.479	0.44	0.479	0.497	0.458
Wcsum	0.417	0.439	0.517	0.494	0.478	0.483	0.433	0.389	0.417	0.439	0.389
Psum	0.475	0.492	0.583	0.55	0.533	0.525	0.475	0.442	0.45	0.475	0.442
Sdsum	0.388	0.413	0.483	0.467	0.45	0.463	0.413	0.363	0.4	0.421	0.363
Greater0	0.65	0.65	0.783	0.717	0.7	0.65	0.6	0.6	0.55	0.583	0.6
Greater1	0.3	0.333	0.383	0.383	0.367	0.4	0.35	0.283	0.35	0.367	0.283

Friedman Test with Multiple Comparisons Results (p=0.05)

The Friedman test was run in MATLAB against the Fine summary data over the 30 queries.
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);
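
For readers without MATLAB, the rank computation behind the Friedman test can be sketched in plain Python. This is not the code used for the results on this page: it ranks the systems within each query (1 = lowest score, ties share the average rank), averages the ranks per system, and computes the Friedman chi-square statistic without tie correction. The data below are made up, and the Tukey-Kramer multiple comparison step is not reproduced.

```python
from typing import List, Sequence


def rank_row(scores: Sequence[float]) -> List[float]:
    """Rank one query's scores in ascending order; tied scores share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1                 # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mean_ranks(table: Sequence[Sequence[float]]) -> List[float]:
    """Average rank of each system (column) over all queries (rows)."""
    sums = [0.0] * len(table[0])
    for row in table:
        for col, rank in enumerate(rank_row(row)):
            sums[col] += rank
    return [s / len(table) for s in sums]


def friedman_statistic(table: Sequence[Sequence[float]]) -> float:
    """Friedman chi-square statistic, without correction for ties."""
    n, k = len(table), len(table[0])
    centre = (k + 1) / 2
    return 12 * n / (k * (k + 1)) * sum((r - centre) ** 2 for r in mean_ranks(table))


# Hypothetical Fine scores: 4 queries (rows) x 3 systems (columns)
fine = [
    [0.2, 0.5, 0.9],
    [0.3, 0.4, 0.8],
    [0.6, 0.1, 0.7],
    [0.5, 0.5, 0.9],
]
print(mean_ranks(fine))          # [1.375, 1.625, 3.0]
print(friedman_statistic(fine))  # 6.125
```

Differences between these per-system mean ranks are the quantity that a multiple comparison on Friedman statistics then tests pair by pair.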

Row Labels	LJY1	LJY2	UL2	UL1	UL3	WK1	WK2	WK3	WK4	WK5	WK6
q01	42.2	48.3	61.7	64.2	44	46.9	50	44.2	47.8	46.8	51.9
q01_1	51.1	49.3	66.4	64.2	44	46.1	45.2	34.2	41.2	45	49.8
q01_2	32.3	20.9	42.3	45.6	43.3	38.9	42.7	45	36.8	39.2	46
q01_3	32.1	31.4	37.7	30.5	37.6	32	38.3	32.2	38.2	38	34.9
q01_4	35.5	27.9	42.4	43.9	26.8	39.2	40.3	33	40.5	40.5	42.5
q02	68.6	68.6	73	70.4	71.4	72.2	56.1	42.4	44.1	44.1	23.7
q02_1	64.5	64.5	73	70	71.3	70.1	53.1	42.4	46.7	53.3	23.7
q02_2	55.9	55.9	70.5	67.3	66.9	71.5	62.1	35.4	37.9	41.2	23.7
q02_3	68.6	68.6	76.2	69	71.4	57.9	56.1	42.4	39.7	44.1	23.7
q02_4	65.7	69.9	68.6	70.4	71.4	72.2	51.5	42.4	39.7	44.1	23.7
q03	42.7	50.6	56.3	62	50.4	67.9	62.2	31.1	56.3	61.8	45.7
q03_1	36.6	38.5	56.3	64.2	47.4	67.9	54.7	31.1	27.5	34.3	50
q03_2	52	62.3	56.3	62	50.4	73.2	64.7	28.5	56	53.9	36.7
q03_3	41.5	43.2	56.3	62	50.4	67.9	60	28.5	42.2	44	43.7
q03_4	43.8	55.7	56.3	62	50.4	67.9	62.2	31.1	60.6	66.5	41
q04	50.1	48.1	56.1	57.3	61.1	55.9	40.5	52.9	47.9	43.9	61.3
q04_1	44.6	50.1	52.3	54.9	61.1	45	38.6	49.2	41.8	47.1	53.5
q04_2	54.6	51	51.9	59.1	61.1	35.2	41.1	52.9	46.6	52.2	55.4
q04_3	46.7	47.1	47.8	59.2	54.1	46	35.9	48.5	42.5	38.8	51.6
q04_4	49.2	41.1	56.2	58.4	61.1	51.7	34.1	54	34.6	39.9	61
q05	45	46	56	64	64	52	56	48	53	56	69.5
q05_1	39	48	51	68	72	48	51	48	42	41	67.5
q05_2	45	47	55	64	64	47	49	37	54	45	69.5
q05_3	46	47	46	44	72	39	43	48	58	45	49
q05_4	46	53	60	64	64	53	56	53	62	62	60.5
q06	49.4	49	58.2	57.3	44.5	36.2	41.4	53	51.6	40.2	39.4
q06_1	47.6	51.5	58.2	57.3	42.3	36.2	41.4	55.2	52.7	46	41.4
q06_2	51.5	53.3	58.2	64.1	50.7	36.2	41.4	55.7	43.4	47.5	46.6
q06_3	44.3	46.5	58.2	57.3	48	36.2	53.8	50.7	51.6	48.7	39.4
q06_4	50.7	49.1	44.1	44	39.2	36.2	43.5	50.5	49.8	44.9	46.3

TeamID 1	TeamID 2	Lower bound	Mean	Upper bound	Significance
UL1	UL2	-1.9997	0.4667	2.9331	FALSE
UL1	UL3	-0.8664	1.6000	4.0664	FALSE
UL1	WK1	0.4003	2.8667	5.3331	TRUE
UL1	LJY2	0.8003	3.2667	5.7331	TRUE
UL1	WK2	0.7336	3.2000	5.6664	TRUE
UL1	LJY1	1.5503	4.0167	6.4831	TRUE
UL1	WK5	1.2503	3.7167	6.1831	TRUE
UL1	WK4	1.2836	3.7500	6.2164	TRUE
UL1	WK3	1.4836	3.9500	6.4164	TRUE
UL2	UL3	-1.3331	1.1333	3.5997	FALSE
UL2	WK1	-0.0664	2.4000	4.8664	FALSE
UL2	LJY2	0.3336	2.8000	5.2664	TRUE
UL2	WK2	0.2669	2.7333	5.1997	TRUE
UL2	LJY1	1.0836	3.5500	6.0164	TRUE
UL2	WK5	0.7836	3.2500	5.7164	TRUE
UL2	WK4	0.8169	3.2833	5.7497	TRUE
UL2	WK3	1.0169	3.4833	5.9497	TRUE
UL3	WK1	-1.1997	1.2667	3.7331	FALSE
UL3	LJY2	-0.7997	1.6667	4.1331	FALSE
UL3	WK2	-0.8664	1.6000	4.0664	FALSE
UL3	LJY1	-0.0497	2.4167	4.8831	FALSE
UL3	WK5	-0.3497	2.1167	4.5831	FALSE
UL3	WK4	-0.3164	2.1500	4.6164	FALSE
UL3	WK3	-0.1164	2.3500	4.8164	FALSE
WK1	LJY2	-2.0664	0.4000	2.8664	FALSE
WK1	WK2	-2.1331	0.3333	2.7997	FALSE
WK1	LJY1	-1.3164	1.1500	3.6164	FALSE
WK1	WK5	-1.6164	0.8500	3.3164	FALSE
WK1	WK4	-1.5831	0.8833	3.3497	FALSE
WK1	WK3	-1.3831	1.0833	3.5497	FALSE
LJY2	WK2	-2.5331	-0.0667	2.3997	FALSE
LJY2	LJY1	-1.7164	0.7500	3.2164	FALSE
LJY2	WK5	-2.0164	0.4500	2.9164	FALSE
LJY2	WK4	-1.9831	0.4833	2.9497	FALSE
LJY2	WK3	-1.7831	0.6833	3.1497	FALSE
WK2	LJY1	-1.6497	0.8167	3.2831	FALSE
WK2	WK5	-1.9497	0.5167	2.9831	FALSE
WK2	WK4	-1.9164	0.5500	3.0164	FALSE
WK2	WK3	-1.7164	0.7500	3.2164	FALSE
LJY1	WK5	-2.7664	-0.3000	2.1664	FALSE
LJY1	WK4	-2.7331	-0.2667	2.1997	FALSE
LJY1	WK3	-2.5331	-0.0667	2.3997	FALSE
WK5	WK4	-2.4331	0.0333	2.4997	FALSE
WK5	WK3	-2.2331	0.2333	2.6997	FALSE
WK4	WK3	-2.2664	0.2000	2.6664	FALSE
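
The Significance column in the table above can be read directly from the interval bounds: a pair is marked TRUE exactly when the interval between the lower and upper bound excludes zero. A minimal sketch of that check, using three rows copied from the table (the function name is hypothetical):

```python
def is_significant(lower: float, upper: float) -> bool:
    """A pairwise difference is significant when its interval excludes zero."""
    return lower > 0 or upper < 0


# (team 1, team 2, lower bound, upper bound) copied from the table above
rows = [
    ("UL1", "WK1", 0.4003, 5.3331),
    ("UL2", "WK1", -0.0664, 4.8664),
    ("LJY1", "WK5", -2.7664, 2.1664),
]
for first, second, lower, upper in rows:
    print(first, second, is_significant(lower, upper))
```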

TBA

Raw Scores

The raw data derived from the Evalutron 6000 human evaluations are located on the 2011:Symbolic Melodic Similarity Raw Data page.

Run Times

TBA

