2009:Audio Music Similarity and Retrieval Results
Introduction
General Legend
Team ID
ANO = Anonymous
BF1 = Benjamin Fields (chr12)
BF2 = Benjamin Fields (mfcc10)
BSWH1 = Dmitry Bogdanov, Joan Serrà, Nicolas Wack, and Perfecto Herrera (clas)
BSWH2 = Dmitry Bogdanov, Joan Serrà, Nicolas Wack, and Perfecto Herrera (hybrid)
CL1 = Chuan Cao, Ming Li
CL2 = Chuan Cao, Ming Li
GT = George Tzanetakis
LR = Thomas Lidy, Andreas Rauber
ME1 = François Maillet, Douglas Eck (mlp)
ME2 = François Maillet, Douglas Eck (sda)
PS1 = Tim Pohle, Dominik Schnitzer (2007)
PS2 = Tim Pohle, Dominik Schnitzer (2009)
SH1 = Stephan Hübler
SH2 = Stephan Hübler
Broad Categories
NS = Not Similar
SS = Somewhat Similar
VS = Very Similar
Calculating Summary Measures
Fine(1) = Sum of fine-grained human similarity decisions (0-10).
PSum(1) = Sum of human broad similarity decisions: NS=0, SS=1, VS=2.
WCsum(1) = 'World Cup' scoring: NS=0, SS=1, VS=3 (rewards Very Similar).
SDsum(1) = 'Stephen Downie' scoring: NS=0, SS=1, VS=4 (strongly rewards Very Similar).
Greater0(1) = NS=0, SS=1, VS=1 (binary relevance judgement).
Greater1(1) = NS=0, SS=0, VS=1 (binary relevance judgement using only Very Similar).
(1) Normalized to the range 0 to 1 (see the sketch below).
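As an illustration only, here is a minimal MATLAB sketch of how these normalized measures could be computed for one system's candidate list. The variable names (broad, fine) and the coding of broad judgements as NS=0, SS=1, VS=2 are assumptions for the sketch, not taken from the actual evaluation scripts.

% Sketch (assumed variables): 'broad' holds one grader judgement per candidate
% coded NS=0, SS=1, VS=2; 'fine' holds the fine-grained scores (each 0-10).
n = numel(broad);
Fine     = sum(fine) / (10 * n);                   % fine-grained scores, each 0-10
PSum     = sum(broad) / (2 * n);                   % NS=0, SS=1, VS=2
WCsum    = sum(broad + (broad == 2)) / (3 * n);    % NS=0, SS=1, VS=3 ('World Cup')
SDsum    = sum(broad + 2*(broad == 2)) / (4 * n);  % NS=0, SS=1, VS=4 ('Stephen Downie')
Greater0 = sum(broad >= 1) / n;                    % NS=0, SS=1, VS=1
Greater1 = sum(broad == 2) / n;                    % NS=0, SS=0, VS=1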
Overall Summary Results
NB: The results for BK2 were interpolated from partial data due to a runtime error.
(Results table not available: file /nema-raid/www/mirex/results/ams/evalutron/summary_evalutron.csv not found.)
Friedman's Tests
Friedman's Test (FINE Scores)
The Friedman test was run in MATLAB against the FINE summary data over the 100 queries.
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);
(Results table not available: file /nema-raid/www/mirex/results/ams/evalutron/evalutron.fine.friedman.tukeyKramerHSD.csv not found.)
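For reference, a minimal sketch of the full MATLAB sequence implied by the command above is given here. The matrix name fineSummary (100 queries by N systems) is an assumption for the sketch, not the name used in the actual analysis.

% Sketch (assumed variable): fineSummary is a 100-by-N matrix of per-query
% Fine summary scores (rows = queries, columns = systems).
[p, tbl, stats] = friedman(fineSummary, 1, 'off');    % Friedman test across systems
[c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', ...
                                'estimate', 'friedman', 'alpha', 0.05);

The same two calls apply unchanged to the BROAD summary data in the next subsection.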
Friedman's Test (BROAD Scores)
The Friedman test was run in MATLAB against the BROAD summary data over the 100 queries.
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);
(Results table not available: file /nema-raid/www/mirex/results/ams/evalutron/evalutron.cat.friedman.tukeyKramerHSD.csv not found.)
Summary Results by Query
FINE Scores
These are the mean FINE scores per query assigned by Evalutron graders. The FINE scores for the 5 candidates returned per algorithm, per query, have been averaged. Values are bounded between 0.0 and 10.0. A perfect score would be 10. Genre labels have been included for reference.
(Results table not available: file /nema-raid/www/mirex/results/ams/evalutron/fine_scores.csv not found.)
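A minimal sketch of the averaging described above, assuming a hypothetical 100-by-5 matrix fineScores that holds one system's grader FINE scores for its 5 candidates per query:

% Sketch (assumed variable): fineScores is 100 queries x 5 candidates.
meanFinePerQuery = mean(fineScores, 2);   % one mean score (0-10) per query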
BROAD Scores
These are the mean BROAD scores per query assigned by Evalutron graders. The BROAD scores for the 5 candidates returned per algorithm, per query, have been averaged. Values are bounded between 0 (not similar) and 2 (very similar). A perfect score would be 2. Genre labels have been included for reference.
(Results table not available: file /nema-raid/www/mirex/results/ams/evalutron/cat_scores.csv not found.)
Anonymized Metadata
Raw Scores
The raw data derived from the Evalutron 6000 human evaluations are located on the Audio Music Similarity and Retrieval Raw Data page.