2013:Symbolic Melodic Similarity Results

From MIREX Wiki
Revision as of 20:49, 27 October 2013 by Kahyun Choi

Introduction

These are the results for the 2011 running of the Symbolic Melodic Similarity task set. For background information about this task set, please refer to the 2011:Symbolic Melodic Similarity page.

Each system was given a query and returned the 10 most melodically similar songs from the Essen Collection (5274 pieces in MIDI format; see the ESAC Data Homepage for more information). For each query, we created four classes of error mutations, so the query set comprises the following classes:

  • 0. No errors
  • 1. One note deleted
  • 2. One note inserted
  • 3. One interval enlarged
  • 4. One interval compressed
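The mutation classes above can be sketched as simple transformations on a pitch sequence. The sketch below is illustrative only: this page does not specify the exact mutation procedure, so the choice of mutated position and interval sizes are assumptions.

```python
import random

def mutate(pitches, kind, rng=None):
    """Apply one error mutation to a sequence of MIDI pitch numbers.
    Illustrative sketch only: which note is mutated, and by how much
    intervals change, are assumptions not stated on this page."""
    rng = rng or random.Random(0)
    p = list(pitches)
    i = rng.randrange(1, len(p) - 1)   # avoid the first/last note
    if kind == "delete":               # class 1: one note deleted
        del p[i]
    elif kind == "insert":             # class 2: one note inserted
        p.insert(i, p[i] + rng.choice([-2, -1, 1, 2]))
    elif kind == "enlarge":            # class 3: one interval enlarged
        p[i] += 1 if p[i] >= p[i - 1] else -1
    elif kind == "compress":           # class 4: one interval compressed
        p[i] += -1 if p[i] > p[i - 1] else 1
    return p

print(mutate([60, 62, 64, 65, 67], "delete"))
```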

For each query (and its 4 mutations), the returned results (candidates) from all systems were grouped together into a query set for evaluation by the human graders. The graders heard only the perfect version of each query, against which they evaluated the candidates, and did not know whether a candidate came from a perfect or a mutated query. Each query/candidate set was evaluated by 1 individual grader. Using the Evalutron 6000 system, the graders gave each query/candidate pair two types of scores: one categorical score with 3 categories (NS, SS, VS, as explained below) and one fine score (in the range 0 to 100).

Evalutron 6000 Summary Data

Number of evaluators = 6
Number of evaluations per query/candidate pair = 1
Number of queries per grader = 1
Total number of candidates returned = 3900
Total number of unique query/candidate pairs graded = 895
Average number of query/candidate pairs evaluated per grader = 149
Number of queries = 6 perfect queries, each error-mutated 4 different ways, for 30 queries in total

General Legend

Sub code Submission name Abstract Contributors
DB1 PPM-DJ PDF Antonio de Carvalho Junior, Leonardo Batista
ULMS1 ShapeH PDF Julián Urbano, Juan Lloréns, Jorge Morato, Sonia Sánchez-Cuadrado
ULMS2 ShapeL PDF Julián Urbano, Juan Lloréns, Jorge Morato, Sonia Sánchez-Cuadrado
ULMS3 ShapeG PDF Julián Urbano, Juan Lloréns, Jorge Morato, Sonia Sánchez-Cuadrado
ULMS4 ShapeTime PDF Julián Urbano, Juan Lloréns, Jorge Morato, Sonia Sánchez-Cuadrado
ULMS5 Time PDF Julián Urbano, Juan Lloréns, Jorge Morato, Sonia Sánchez-Cuadrado

Broad Categories

NS = Not Similar
SS = Somewhat Similar
VS = Very Similar

Table Headings

ADR = Average Dynamic Recall
NRGB = Normalized Recall at Group Boundaries
AP = Average Precision (non-interpolated)
PND = Precision at N Documents
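Of these measures, non-interpolated Average Precision is the most widely used. A minimal sketch, assuming binary relevance flags over the ranked candidate list (normalizing by the number of relevant items retrieved is an assumption about the exact definition used here):

```python
def average_precision(relevance):
    """Non-interpolated AP over a ranked list of binary relevance flags
    (1 = relevant). Sketch only: dividing by the number of relevant
    items in the list is an assumption about the normalization."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank   # precision at each relevant rank
    return precision_sum / hits if hits else 0.0

print(average_precision([1, 0, 1, 0]))  # (1/1 + 2/3) / 2
```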

Calculating Summary Measures

Fine(1) = Sum of fine-grained human similarity decisions (0-100).
PSum(1) = Sum of human broad similarity decisions: NS=0, SS=1, VS=2.
WCSum(1) = 'World Cup' scoring: NS=0, SS=1, VS=3 (rewards Very Similar).
SDSum(1) = 'Stephen Downie' scoring: NS=0, SS=1, VS=4 (strongly rewards Very Similar).
Greater0(1) = NS=0, SS=1, VS=1 (binary relevance judgment).
Greater1(1) = NS=0, SS=0, VS=1 (binary relevance judgment using only Very Similar).

(1) Normalized to the range 0 to 1.
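The broad-score mappings above can be expressed directly in code. The sketch below is an illustration only: the category weights follow the definitions in the text, but averaging per candidate is an assumption about how the published tables were normalized.

```python
# Category weights as defined above; each candidate gets a broad
# category (NS/SS/VS) and a fine score (0-100) from its grader.
BROAD = {
    "PSum":     {"NS": 0, "SS": 1, "VS": 2},
    "WCSum":    {"NS": 0, "SS": 1, "VS": 3},  # 'World Cup' scoring
    "SDSum":    {"NS": 0, "SS": 1, "VS": 4},  # 'Stephen Downie' scoring
    "Greater0": {"NS": 0, "SS": 1, "VS": 1},  # binary: at least Somewhat
    "Greater1": {"NS": 0, "SS": 0, "VS": 1},  # binary: Very Similar only
}

def summary_measures(judgments):
    """judgments: list of (category, fine_score) pairs for the returned
    candidates of one query. Per-candidate averaging is an assumption."""
    n = len(judgments)
    out = {"Fine": sum(f for _, f in judgments) / n}
    for name, weights in BROAD.items():
        out[name] = sum(weights[c] for c, _ in judgments) / n
    return out

print(summary_measures([("VS", 85), ("SS", 40), ("NS", 5)]))
```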

Summary Results

Overall Scores (Includes Perfect and Error Candidates)

SCORE DB1 ULMS1 ULMS2 ULMS3 ULMS4 ULMS5
ADR 0.0033 0.6085 0.4830 0.5416 0.6706 0.6567
NRGB 0.0049 0.5339 0.4275 0.4707 0.5788 0.5670
AP 0.0014 0.5316 0.2728 0.4175 0.5414 0.4872
PND 0.0067 0.5243 0.3271 0.4464 0.5161 0.4865
Fine 25.1533 62.9367 49.56 54.5733 63.5167 62.6133
PSum 0.32667 1.36 0.93333 1.1633 1.37 1.3267
WCSum 0.34667 1.8867 1.1733 1.5967 1.9067 1.8267
SDSum 0.36667 2.4133 1.4133 2.03 2.4433 2.3267
Greater0 0.30667 0.83333 0.69333 0.73 0.83333 0.82667
Greater1 0.02 0.52667 0.24 0.43333 0.53667 0.5


Scores by Query Error Types

No Errors

SCORE DB1 ULMS1 ULMS2 ULMS3 ULMS4 ULMS5
ADR 0.0000 0.6201 0.5108 0.5760 0.6555 0.6464
NRGB 0.0000 0.5177 0.4471 0.4843 0.5389 0.5422
AP 0.0023 0.5165 0.2701 0.4372 0.5199 0.4814
PND 0.0000 0.5357 0.3196 0.4899 0.5077 0.4786
Fine 25.9833 65.85 50.2167 56.8833 66.7667 63.9167
PSum 0.33333 1.45 0.95 1.2167 1.4667 1.3667
WCSum 0.35 2.0167 1.2 1.6833 2.05 1.8833
SDSum 0.36667 2.5833 1.45 2.15 2.6333 2.4
Greater0 0.31667 0.88333 0.7 0.75 0.88333 0.85
Greater1 0.016667 0.56667 0.25 0.46667 0.58333 0.51667


Note Deletions

SCORE DB1 ULMS1 ULMS2 ULMS3 ULMS4 ULMS5
ADR 0.0000 0.6258 0.5599 0.5526 0.7231 0.6998
NRGB 0.0000 0.5605 0.5031 0.4754 0.6467 0.6068
AP 0.0000 0.5943 0.3646 0.3955 0.6309 0.5168
PND 0.0000 0.5841 0.3865 0.4238 0.5867 0.4756
Fine 24.05 67.2667 50.3167 52.5167 68.2667 63.45
PSum 0.26667 1.45 0.95 1.1167 1.4833 1.35
WCSum 0.26667 2.0167 1.2 1.5333 2.0833 1.85
SDSum 0.26667 2.5833 1.45 1.95 2.6833 2.35
Greater0 0.26667 0.88333 0.7 0.7 0.88333 0.85
Greater1 0 0.56667 0.25 0.41667 0.6 0.5


Note Insertions

SCORE DB1 ULMS1 ULMS2 ULMS3 ULMS4 ULMS5
ADR 0.0000 0.6066 0.4451 0.5154 0.6623 0.6439
NRGB 0.0000 0.5332 0.3777 0.4751 0.5687 0.5476
AP 0.0000 0.5314 0.2225 0.4254 0.5281 0.4953
PND 0.0000 0.4946 0.2714 0.4780 0.4917 0.4679
Fine 24.3667 63.8167 47.35 57.8833 65.0833 62.75
PSum 0.31667 1.3833 0.86667 1.2667 1.4 1.3167
WCSum 0.33333 1.9167 1.0833 1.7333 1.9333 1.8333
SDSum 0.35 2.45 1.3 2.2 2.4667 2.35
Greater0 0.3 0.85 0.65 0.8 0.86667 0.8
Greater1 0.016667 0.53333 0.21667 0.46667 0.53333 0.51667


Enlarged Intervals

SCORE DB1 ULMS1 ULMS2 ULMS3 ULMS4 ULMS5
ADR 0.0000 0.5970 0.4818 0.5452 0.6584 0.6576
NRGB 0.0000 0.5390 0.4286 0.4778 0.5622 0.5783
AP 0.0000 0.5270 0.2676 0.3973 0.5244 0.4769
PND 0.0000 0.5212 0.3450 0.3937 0.4878 0.5265
Fine 25.75 57.1 49.55 51.1667 55.9167 59.75
PSum 0.33333 1.2167 0.93333 1.05 1.1833 1.25
WCSum 0.35 1.6833 1.1667 1.4333 1.6333 1.7167
SDSum 0.36667 2.15 1.4 1.8167 2.0833 2.1833
Greater0 0.31667 0.75 0.7 0.66667 0.73333 0.78333
Greater1 0.016667 0.46667 0.23333 0.38333 0.45 0.46667


Compressed Intervals

SCORE DB1 ULMS1 ULMS2 ULMS3 ULMS4 ULMS5
ADR 0.0164 0.5929 0.4172 0.5188 0.6539 0.6357
NRGB 0.0243 0.5192 0.3809 0.4407 0.5776 0.5600
AP 0.0049 0.4887 0.2392 0.4319 0.5035 0.4655
PND 0.0333 0.4857 0.3127 0.4468 0.5063 0.4841
Fine 25.6167 60.65 50.3667 54.4167 61.55 63.2
PSum 0.38333 1.3 0.96667 1.1667 1.3167 1.35
WCSum 0.43333 1.8 1.2167 1.6 1.8333 1.85
SDSum 0.48333 2.3 1.4667 2.0333 2.35 2.35
Greater0 0.33333 0.8 0.71667 0.73333 0.8 0.85
Greater1 0.05 0.5 0.25 0.43333 0.51667 0.5


Friedman Test with Multiple Comparisons Results (p=0.05)

The Friedman test was run in MATLAB against the Fine summary data over the 30 queries.
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);
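A rough Python analogue of the omnibus step can be sketched in pure Python (without MATLAB's Tukey-Kramer multiple comparisons, and without the tie correction this data would strictly need, since several systems share identical Fine scores):

```python
def friedman_statistic(rows):
    """Friedman chi-square statistic over per-query scores.
    rows: one list per query, one score column per system.
    Sketch only: no tie correction and no post-hoc comparisons."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):  # rank 1 = lowest score
            rank_sums[j] += rank
    mean_ranks = [s / n for s in rank_sums]
    return 12 * n / (k * (k + 1)) * sum(
        (r - (k + 1) / 2) ** 2 for r in mean_ranks)

print(friedman_statistic([[1, 2, 3], [1, 2, 3], [2, 1, 3]]))
```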

Row Labels DB1 ULMS1 ULMS2 ULMS3 ULMS4 ULMS5
q01 40.2 72.9 60.1 60.1 72.9 57.2
q01_1 29.1 72.9 54.2 57.5 72.9 57.2
q01_2 34.3 62 56.1 60.4 55.4 54.6
q01_3 35.5 57.1 56.1 55.8 50 55.2
q01_4 43 55.9 56.1 51.3 55.8 47.9
q02 7.7 48.4 44 55.2 48.4 59.8
q02_1 8.5 54.2 50.5 40.3 54.2 56.6
q02_2 7.5 49.6 35.9 52.8 49.6 56.4
q02_3 8 51.1 44 52.4 51.1 53.5
q02_4 7.8 48.4 44 54.6 48.4 59.8
q03 22 60.5 42.5 54 60.5 55.5
q03_1 25 63.5 42.5 48 63.5 50.5
q03_2 26 60.5 39.5 56 60.5 55.5
q03_3 21 60.5 42.5 54 60.5 55.5
q03_4 19.3 60.5 42.5 54 60.5 55.5
q04 21.2 65.1 37.3 35.1 65.1 65.8
q04_1 19.5 62.9 37.3 30.3 62.9 65.8
q04_2 20 57.1 37.3 40.1 65.8 65.8
q04_3 26.6 56.9 37.3 21 56.9 51.5
q04_4 16.8 60.4 39.2 34.9 60.4 65.8
q05 31.3 77.2 45.9 67.9 82.7 82.7
q05_1 32.2 79.1 45.9 69 85.1 85.1
q05_2 31.9 77.2 43.8 69 82.7 82.7
q05_3 28.9 46 45.9 59.8 46 80.8
q05_4 33.3 77.2 48.9 64.7 82.7 82.7
q06 33.5 71 71.5 69 71 62.5
q06_1 30 71 71.5 70 71 65.5
q06_2 26.5 76.5 71.5 69 76.5 61.5
q06_3 34.5 71 71.5 64 71 62
q06_4 33.5 61.5 71.5 67 61.5 67.5


TeamID TeamID Lowerbound Mean Upperbound Significance
ULMS4 ULMS1 -1.3747 -0.0167 1.3414 FALSE
ULMS4 ULMS5 -1.0914 0.2667 1.6247 FALSE
ULMS4 ULMS3 -0.1247 1.2333 2.5914 FALSE
ULMS4 ULMS2 0.1086 1.4667 2.8247 TRUE
ULMS4 DB1 2.1919 3.5500 4.9081 TRUE
ULMS1 ULMS5 -1.0747 0.2833 1.6414 FALSE
ULMS1 ULMS3 -0.1081 1.2500 2.6081 FALSE
ULMS1 ULMS2 0.1253 1.4833 2.8414 TRUE
ULMS1 DB1 2.2086 3.5667 4.9247 TRUE
ULMS5 ULMS3 -0.3914 0.9667 2.3247 FALSE
ULMS5 ULMS2 -0.1581 1.2000 2.5581 FALSE
ULMS5 DB1 1.9253 3.2833 4.6414 TRUE
ULMS3 ULMS2 -1.1247 0.2333 1.5914 FALSE
ULMS3 DB1 0.9586 2.3167 3.6747 TRUE
ULMS2 DB1 0.7253 2.0833 3.4414 TRUE


Figure: Friedman test results for the Fine scores (2012 sms fine scores friedmans.png)