2012:Audio Music Similarity and Retrieval Results
Introduction
These are the results for the 2012 running of the Audio Music Similarity and Retrieval task set. For background information about this task set please refer to the Audio Music Similarity and Retrieval page.
These are the results for the 2012 running of the Audio Music Similarity and Retrieval task set. Each system was given 7000 songs chosen from IMIRSEL's "uspop", "uscrap", "american", "classical" and "sundry" collections, and each system then returned a 7000x7000 distance matrix. 50 songs were randomly selected from the 10 genre groups (5 per genre) as queries, and the 5 most highly ranked songs out of the 7000 were extracted for each query (after filtering out the query itself; results from the same artist as the query were also omitted). Then, for each query, the returned results (candidates) from all participants were grouped and evaluated by human graders using the Evalutron 6000 grading system. Each individual query/candidate set was evaluated by a single grader. For each query/candidate pair, graders provided two scores: one categorical BROAD score with 3 categories (NS, SS, VS, as explained below) and one FINE score in the range from 0 to 100. A description and analysis is provided below.

The systems read in 30-second audio clips as their raw data. The same 30-second clips were used in the grading stage.
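To make the candidate-selection step concrete, here is a minimal MATLAB sketch of how a top-5 candidate list could be drawn from a submitted distance matrix. This is not the official scoring code; the variable names D, artist and queries are assumptions for illustration only.

```matlab
% D       : 7000x7000 distance matrix returned by a system
% artist  : 7000x1 cell array of artist labels for the collection
% queries : indices of the 50 randomly selected query songs
K = 5;
for qi = 1:numel(queries)
    q = queries(qi);
    [~, order] = sort(D(q, :), 'ascend');      % rank all songs by distance to the query
    order(order == q) = [];                     % filter out the query itself
    keep = ~strcmp(artist(order), artist{q});   % omit results from the query's artist
    candidates = order(keep);
    top5 = candidates(1:K);                     % the 5 most highly ranked remaining songs
end
```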
General Legend
Team ID
Sub code | Submission name | Abstract | Contributors |
---|---|---|---|
DM6 | DM6 | [PDF](https://www.music-ir.org/mirex/abstracts/2012/DM6.pdf) | Franz de Leon, Kirk Martinez |
DM7 | DM7 | [PDF](https://www.music-ir.org/mirex/abstracts/2012/DM7.pdf) | Franz de Leon, Kirk Martinez |
GT3 | MarsyasSimilarity | [PDF](https://www.music-ir.org/mirex/abstracts/2012/GT3.pdf) | George Tzanetakis |
JR2 | modulationSim | [PDF](https://www.music-ir.org/mirex/abstracts/2012/JR2.pdf) | Jia-Min Ren, Jyh-Shing Roger Jang |
NHHL1 | AMSR_2012_1 | [PDF](https://www.music-ir.org/mirex/abstracts/2012/NHHL1.pdf) | Byeong-jun Han, Kyogu Lee, Juhan Nam, Jorge Herrera |
NHHL2 | AMSR_2012_2 | [PDF](https://www.music-ir.org/mirex/abstracts/2012/NHHL2.pdf) | Byeong-jun Han, Kyogu Lee, Juhan Nam, Jorge Herrera |
PS1 | PS09 | [PDF](https://www.music-ir.org/mirex/abstracts/2011/PS1.pdf) | Dominik Schnitzer, Tim Pohle |
RW4 | modulationSimFrameUBM | [PDF](https://www.music-ir.org/mirex/abstracts/2012/RW4.pdf) | Jia-Min Ren, Ming-Ju Wu, Jyh-Shing Roger Jang |
SSKP1 | cbmr_sim_2010 | [PDF](https://www.music-ir.org/mirex/abstracts/2012/SSPK1.pdf) | Klaus Seyerlehner, Markus Schedl, Peter Knees, Tim Pohle |
SSKS2 | cbmr_sim_2011 | [PDF](https://www.music-ir.org/mirex/abstracts/2012/SSKS2.pdf) | Klaus Seyerlehner, Markus Schedl, Peter Knees, Reinhard Sonnleitner |
Broad Categories
NS = Not Similar
SS = Somewhat Similar
VS = Very Similar
Understanding Summary Measures
Fine = Has a range from 0 (failure) to 100 (perfection).
Broad = Has a range from 0 (failure) to 2 (perfection) as each query/candidate pair is scored with either NS=0, SS=1 or VS=2.
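As a concrete illustration, the following minimal MATLAB sketch shows how one per-query, per-system summary value is formed from the grader judgments. The variable names fine_q and broad_q are hypothetical; this is not the evaluation code itself.

```matlab
% fine_q  : 1x5 vector of FINE grades (0-100) for the 5 candidates of one query
% broad_q : 1x5 cell array of BROAD labels ('NS', 'SS' or 'VS') for the same candidates
broad_map   = containers.Map({'NS', 'SS', 'VS'}, {0, 1, 2});
fine_score  = mean(fine_q);                                 % bounded between 0 and 100
broad_score = mean(cellfun(@(s) broad_map(s), broad_q));    % bounded between 0 and 2
```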
Human Evaluation
Overall Summary Results
Measure | DM6 | DM7 | GT3 | JR2 | NHHL1 | NHHL2 | PS1 | RW4 | SSKP1 | SSKS2 |
---|---|---|---|---|---|---|---|---|---|---|
Average Fine Score | 36.176 | 36.332 | 44.872 | 47.020 | 45.944 | 45.944 | 53.136 | 50.000 | 52.640 | 53.188 |
Average Cat Score | 0.680 | 0.682 | 0.894 | 0.956 | 0.926 | 0.926 | 1.128 | 1.048 | 1.138 | 1.132 |
Friedman's Tests
Friedman's Test (FINE Scores)
The Friedman test was run in MATLAB against the Fine summary data over the 50 queries.
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);
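For reference, a minimal MATLAB sketch of the full procedure is given below (the same calls are applied to the BROAD summary data further down). It assumes a hypothetical matrix fine_scores of size 50x10, with one row per query and one column per system, as in the per-query table later on this page.

```matlab
% Friedman test over queries (rows = queries/blocks, columns = systems).
[p, tbl, stats] = friedman(fine_scores, 1, 'off');

% Tukey-Kramer HSD multiple comparisons on the Friedman statistics.
[c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', ...
                                'estimate', 'friedman', 'alpha', 0.05);

% Each row of c is [system1 system2 lower estimate upper]; these map to the
% Lowerbound / Mean / Upperbound columns below, and a pair is marked TRUE
% (significantly different) when the interval does not contain zero.
```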
TeamID | TeamID | Lowerbound | Mean | Upperbound | Significance |
---|---|---|---|---|---|
SSKS2 | PS1 | -2.014 | -0.110 | 1.794 | FALSE |
SSKS2 | SSKP1 | -1.684 | 0.220 | 2.124 | FALSE |
SSKS2 | RW4 | -1.284 | 0.620 | 2.524 | FALSE |
SSKS2 | JR2 | -0.164 | 1.740 | 3.644 | FALSE |
SSKS2 | NHHL2 | 0.596 | 2.500 | 4.404 | TRUE |
SSKS2 | NHHL1 | 0.596 | 2.500 | 4.404 | TRUE |
SSKS2 | GT3 | 0.616 | 2.520 | 4.424 | TRUE |
SSKS2 | DM7 | 2.726 | 4.630 | 6.534 | TRUE |
SSKS2 | DM6 | 2.776 | 4.680 | 6.584 | TRUE |
PS1 | SSKP1 | -1.574 | 0.330 | 2.234 | FALSE |
PS1 | RW4 | -1.174 | 0.730 | 2.634 | FALSE |
PS1 | JR2 | -0.054 | 1.850 | 3.754 | FALSE |
PS1 | NHHL2 | 0.706 | 2.610 | 4.514 | TRUE |
PS1 | NHHL1 | 0.706 | 2.610 | 4.514 | TRUE |
PS1 | GT3 | 0.726 | 2.630 | 4.534 | TRUE |
PS1 | DM7 | 2.836 | 4.740 | 6.644 | TRUE |
PS1 | DM6 | 2.886 | 4.790 | 6.694 | TRUE |
SSKP1 | RW4 | -1.504 | 0.400 | 2.304 | FALSE |
SSKP1 | JR2 | -0.384 | 1.520 | 3.424 | FALSE |
SSKP1 | NHHL2 | 0.376 | 2.280 | 4.184 | TRUE |
SSKP1 | NHHL1 | 0.376 | 2.280 | 4.184 | TRUE |
SSKP1 | GT3 | 0.396 | 2.300 | 4.204 | TRUE |
SSKP1 | DM7 | 2.506 | 4.410 | 6.314 | TRUE |
SSKP1 | DM6 | 2.556 | 4.460 | 6.364 | TRUE |
RW4 | JR2 | -0.784 | 1.120 | 3.024 | FALSE |
RW4 | NHHL2 | -0.024 | 1.880 | 3.784 | FALSE |
RW4 | NHHL1 | -0.024 | 1.880 | 3.784 | FALSE |
RW4 | GT3 | -0.004 | 1.900 | 3.804 | FALSE |
RW4 | DM7 | 2.106 | 4.010 | 5.914 | TRUE |
RW4 | DM6 | 2.156 | 4.060 | 5.964 | TRUE |
JR2 | NHHL2 | -1.144 | 0.760 | 2.664 | FALSE |
JR2 | NHHL1 | -1.144 | 0.760 | 2.664 | FALSE |
JR2 | GT3 | -1.124 | 0.780 | 2.684 | FALSE |
JR2 | DM7 | 0.986 | 2.890 | 4.794 | TRUE |
JR2 | DM6 | 1.036 | 2.940 | 4.844 | TRUE |
NHHL2 | NHHL1 | -1.904 | 0.000 | 1.904 | FALSE |
NHHL2 | GT3 | -1.884 | 0.020 | 1.924 | FALSE |
NHHL2 | DM7 | 0.226 | 2.130 | 4.034 | TRUE |
NHHL2 | DM6 | 0.276 | 2.180 | 4.084 | TRUE |
NHHL1 | GT3 | -1.884 | 0.020 | 1.924 | FALSE |
NHHL1 | DM7 | 0.226 | 2.130 | 4.034 | TRUE |
NHHL1 | DM6 | 0.276 | 2.180 | 4.084 | TRUE |
GT3 | DM7 | 0.206 | 2.110 | 4.014 | TRUE |
GT3 | DM6 | 0.256 | 2.160 | 4.064 | TRUE |
DM7 | DM6 | -1.854 | 0.050 | 1.954 | FALSE |
Friedman's Test (BROAD Scores)
The Friedman test was run in MATLAB against the BROAD summary data over the 50 queries.
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);
TeamID | TeamID | Lowerbound | Mean | Upperbound | Significance |
---|---|---|---|---|---|
SSKP1 | SSKS2 | -2.052 | -0.210 | 1.632 | FALSE |
SSKP1 | PS1 | -1.682 | 0.160 | 2.002 | FALSE |
SSKP1 | RW4 | -1.022 | 0.820 | 2.662 | FALSE |
SSKP1 | JR2 | -0.262 | 1.580 | 3.422 | FALSE |
SSKP1 | NHHL2 | 0.488 | 2.330 | 4.172 | TRUE |
SSKP1 | NHHL1 | 0.488 | 2.330 | 4.172 | TRUE |
SSKP1 | GT3 | 0.388 | 2.230 | 4.072 | TRUE |
SSKP1 | DM7 | 2.538 | 4.380 | 6.222 | TRUE |
SSKP1 | DM6 | 2.538 | 4.380 | 6.222 | TRUE |
SSKS2 | PS1 | -1.472 | 0.370 | 2.212 | FALSE |
SSKS2 | RW4 | -0.812 | 1.030 | 2.872 | FALSE |
SSKS2 | JR2 | -0.052 | 1.790 | 3.632 | FALSE |
SSKS2 | NHHL2 | 0.698 | 2.540 | 4.382 | TRUE |
SSKS2 | NHHL1 | 0.698 | 2.540 | 4.382 | TRUE |
SSKS2 | GT3 | 0.598 | 2.440 | 4.282 | TRUE |
SSKS2 | DM7 | 2.748 | 4.590 | 6.432 | TRUE |
SSKS2 | DM6 | 2.748 | 4.590 | 6.432 | TRUE |
PS1 | RW4 | -1.182 | 0.660 | 2.502 | FALSE |
PS1 | JR2 | -0.422 | 1.420 | 3.262 | FALSE |
PS1 | NHHL2 | 0.328 | 2.170 | 4.012 | TRUE |
PS1 | NHHL1 | 0.328 | 2.170 | 4.012 | TRUE |
PS1 | GT3 | 0.228 | 2.070 | 3.912 | TRUE |
PS1 | DM7 | 2.378 | 4.220 | 6.062 | TRUE |
PS1 | DM6 | 2.378 | 4.220 | 6.062 | TRUE |
RW4 | JR2 | -1.082 | 0.760 | 2.602 | FALSE |
RW4 | NHHL2 | -0.332 | 1.510 | 3.352 | FALSE |
RW4 | NHHL1 | -0.332 | 1.510 | 3.352 | FALSE |
RW4 | GT3 | -0.432 | 1.410 | 3.252 | FALSE |
RW4 | DM7 | 1.718 | 3.560 | 5.402 | TRUE |
RW4 | DM6 | 1.718 | 3.560 | 5.402 | TRUE |
JR2 | NHHL2 | -1.092 | 0.750 | 2.592 | FALSE |
JR2 | NHHL1 | -1.092 | 0.750 | 2.592 | FALSE |
JR2 | GT3 | -1.192 | 0.650 | 2.492 | FALSE |
JR2 | DM7 | 0.958 | 2.800 | 4.642 | TRUE |
JR2 | DM6 | 0.958 | 2.800 | 4.642 | TRUE |
NHHL2 | NHHL1 | -1.842 | 0.000 | 1.842 | FALSE |
NHHL2 | GT3 | -1.942 | -0.100 | 1.742 | FALSE |
NHHL2 | DM7 | 0.208 | 2.050 | 3.892 | TRUE |
NHHL2 | DM6 | 0.208 | 2.050 | 3.892 | TRUE |
NHHL1 | GT3 | -1.942 | -0.100 | 1.742 | FALSE |
NHHL1 | DM7 | 0.208 | 2.050 | 3.892 | TRUE |
NHHL1 | DM6 | 0.208 | 2.050 | 3.892 | TRUE |
GT3 | DM7 | 0.308 | 2.150 | 3.992 | TRUE |
GT3 | DM6 | 0.308 | 2.150 | 3.992 | TRUE |
DM7 | DM6 | -1.842 | 0.000 | 1.842 | FALSE |
Summary Results by Query
FINE Scores
These are the mean FINE scores per query assigned by Evalutron graders. The FINE scores for the 5 candidates returned per algorithm, per query, have been averaged. Values are bounded between 0 and 100. A perfect score would be 100. Genre labels have been included for reference.
Genre | Query | DM6 | DM7 | GT3 | JR2 | NHHL1 | NHHL2 | PS1 | RW4 | SSKP1 | SSKS2 |
---|---|---|---|---|---|---|---|---|---|---|---|
BAROQUE | d005709 | 44.7 | 45.1 | 77.5 | 77.4 | 53.3 | 53.3 | 60.0 | 77.8 | 44.9 | 50.7 |
BAROQUE | d006218 | 9.9 | 9.9 | 27.0 | 31.2 | 34.3 | 34.3 | 54.8 | 31.2 | 42.2 | 34.6 |
BAROQUE | d010595 | 69.0 | 72.0 | 64.0 | 72.0 | 64.0 | 64.0 | 72.5 | 69.0 | 69.0 | 76.5 |
BAROQUE | d016827 | 21.9 | 21.4 | 30.5 | 16.4 | 11.6 | 11.6 | 19.7 | 29.5 | 25.2 | 24.0 |
BAROQUE | d019925 | 76.1 | 77.4 | 82.0 | 82.5 | 83.1 | 83.1 | 86.3 | 85.8 | 85.9 | 85.0 |
BLUES | e003462 | 13.1 | 13.1 | 25.8 | 21.3 | 24.9 | 24.9 | 22.6 | 24.0 | 25.2 | 19.6 |
BLUES | e006719 | 55.0 | 56.0 | 76.0 | 63.0 | 80.0 | 80.0 | 74.0 | 75.0 | 69.5 | 74.5 |
BLUES | e013942 | 55.5 | 52.0 | 69.0 | 64.0 | 57.0 | 57.0 | 71.0 | 72.0 | 77.0 | 73.0 |
BLUES | e014478 | 37.3 | 40.0 | 9.8 | 24.0 | 30.9 | 30.9 | 31.4 | 19.4 | 21.1 | 23.3 |
BLUES | e019782 | 62.7 | 59.2 | 74.8 | 74.4 | 82.6 | 82.6 | 88.0 | 75.8 | 87.9 | 76.0 |
CLASSICAL | d006152 | 61.1 | 53.9 | 91.3 | 91.3 | 88.4 | 88.4 | 91.4 | 91.7 | 76.9 | 91.4 |
CLASSICAL | d009811 | 12.0 | 12.0 | 21.8 | 14.1 | 3.4 | 3.4 | 22.7 | 31.0 | 17.3 | 26.7 |
CLASSICAL | d015395 | 13.0 | 13.0 | 60.6 | 63.9 | 64.2 | 64.2 | 67.0 | 66.3 | 68.8 | 69.0 |
CLASSICAL | d016084 | 33.0 | 33.0 | 69.0 | 64.5 | 50.0 | 50.0 | 67.5 | 72.0 | 59.5 | 71.5 |
CLASSICAL | d018315 | 20.0 | 20.0 | 63.0 | 63.5 | 64.5 | 64.5 | 70.7 | 64.5 | 60.5 | 63.0 |
COUNTRY | b003088 | 31.4 | 32.7 | 63.0 | 64.1 | 69.4 | 69.4 | 63.9 | 66.8 | 70.1 | 65.6 |
COUNTRY | e008540 | 29.3 | 29.3 | 54.0 | 63.2 | 51.0 | 51.0 | 51.5 | 66.9 | 52.0 | 63.0 |
COUNTRY | e012590 | 26.0 | 26.0 | 38.0 | 41.0 | 25.0 | 25.0 | 56.0 | 44.0 | 46.0 | 44.0 |
COUNTRY | e014995 | 35.2 | 35.2 | 41.5 | 41.6 | 43.3 | 43.3 | 43.3 | 43.5 | 40.6 | 42.6 |
COUNTRY | e016359 | 4.8 | 4.8 | 0.0 | 17.6 | 6.0 | 6.0 | 10.1 | 9.6 | 0.0 | 11.2 |
EDANCE | b006191 | 8.3 | 8.3 | 11.9 | 12.5 | 11.3 | 11.3 | 19.1 | 13.9 | 32.4 | 37.9 |
EDANCE | b011724 | 56.5 | 56.5 | 46.5 | 58.0 | 52.0 | 52.0 | 69.0 | 57.5 | 73.0 | 70.0 |
EDANCE | b013180 | 48.2 | 48.2 | 39.7 | 40.5 | 37.9 | 37.9 | 59.7 | 48.4 | 59.6 | 52.2 |
EDANCE | f010038 | 16.5 | 15.4 | 27.7 | 40.8 | 31.3 | 31.3 | 50.8 | 34.7 | 53.7 | 47.9 |
EDANCE | f016289 | 6.0 | 5.2 | 15.9 | 3.4 | 14.1 | 14.1 | 15.7 | 10.7 | 35.4 | 37.7 |
JAZZ | e002496 | 18.3 | 21.2 | 29.8 | 25.0 | 7.8 | 7.8 | 38.1 | 32.7 | 38.5 | 33.4 |
JAZZ | e003502 | 74.0 | 74.0 | 50.0 | 55.0 | 70.0 | 70.0 | 78.0 | 71.0 | 89.0 | 88.0 |
JAZZ | e011411 | 69.9 | 69.9 | 56.4 | 80.4 | 70.3 | 70.3 | 78.4 | 71.5 | 67.6 | 54.4 |
JAZZ | e014617 | 26.5 | 29.5 | 22.0 | 17.1 | 68.9 | 68.9 | 88.0 | 59.1 | 83.5 | 78.7 |
JAZZ | e019789 | 29.5 | 29.5 | 30.1 | 18.5 | 49.4 | 49.4 | 57.8 | 20.5 | 39.5 | 36.3 |
METAL | b006857 | 50.5 | 50.5 | 54.5 | 64.5 | 55.7 | 55.7 | 49.4 | 64.2 | 65.5 | 61.4 |
METAL | b009281 | 63.5 | 63.5 | 75.5 | 83.5 | 81.0 | 81.0 | 71.5 | 83.5 | 82.5 | 80.0 |
METAL | b014284 | 41.0 | 44.5 | 35.5 | 46.0 | 60.0 | 60.0 | 67.5 | 46.0 | 65.5 | 69.0 |
METAL | b014839 | 25.7 | 25.7 | 31.3 | 32.3 | 38.4 | 38.4 | 38.7 | 29.5 | 24.7 | 31.2 |
METAL | b017570 | 16.4 | 12.6 | 19.5 | 17.2 | 13.2 | 13.2 | 14.1 | 14.4 | 21.9 | 26.5 |
RAPHIPHOP | a002038 | 32.2 | 32.2 | 34.5 | 50.2 | 44.3 | 44.3 | 56.1 | 54.4 | 59.1 | 57.5 |
RAPHIPHOP | a002900 | 25.4 | 25.4 | 37.0 | 29.7 | 28.2 | 28.2 | 39.7 | 39.7 | 28.8 | 40.0 |
RAPHIPHOP | a007956 | 60.7 | 60.7 | 69.2 | 73.0 | 61.8 | 61.8 | 63.7 | 73.1 | 76.6 | 75.1 |
RAPHIPHOP | a009690 | 51.5 | 51.5 | 67.5 | 45.5 | 58.0 | 58.0 | 58.5 | 49.0 | 61.5 | 68.0 |
RAPHIPHOP | b004382 | 72.7 | 72.7 | 76.6 | 77.8 | 79.2 | 79.2 | 81.9 | 81.2 | 80.4 | 79.5 |
ROCKROLL | b000859 | 25.7 | 31.5 | 45.7 | 37.0 | 55.5 | 55.5 | 34.6 | 43.0 | 21.7 | 41.2 |
ROCKROLL | b008224 | 36.1 | 34.4 | 36.7 | 43.8 | 33.5 | 33.5 | 24.9 | 28.5 | 47.9 | 51.7 |
ROCKROLL | b010359 | 5.8 | 5.8 | 19.7 | 13.1 | 10.8 | 10.8 | 15.6 | 10.4 | 22.9 | 18.5 |
ROCKROLL | b010640 | 7.2 | 7.2 | 19.0 | 26.4 | 22.1 | 22.1 | 24.5 | 17.0 | 23.3 | 26.1 |
ROCKROLL | b017313 | 11.5 | 11.5 | 17.3 | 17.0 | 9.0 | 9.0 | 24.0 | 21.5 | 22.0 | 19.0 |
ROMANTIC | d000185 | 66.8 | 66.8 | 84.0 | 84.8 | 81.6 | 81.6 | 88.2 | 87.9 | 86.8 | 86.6 |
ROMANTIC | d007856 | 70.8 | 75.8 | 56.6 | 77.9 | 77.7 | 77.7 | 84.8 | 81.8 | 74.7 | 76.5 |
ROMANTIC | d011611 | 31.8 | 31.8 | 35.1 | 43.0 | 38.6 | 38.6 | 63.9 | 50.4 | 59.6 | 53.2 |
ROMANTIC | d011697 | 7.3 | 7.3 | 28.0 | 33.5 | 24.0 | 24.0 | 27.2 | 33.7 | 31.2 | 22.6 |
ROMANTIC | d012432 | 41.5 | 41.5 | 31.8 | 52.6 | 24.7 | 24.7 | 49.0 | 55.0 | 63.6 | 54.1 |
BROAD Scores
These are the mean BROAD scores per query assigned by Evalutron graders. The BROAD scores for the 5 candidates returned per algorithm, per query, have been averaged. Values are bounded between 0 (not similar) and 2 (very similar). A perfect score would be 2. Genre labels have been included for reference.
Genre | Query | DM6 | DM7 | GT3 | JR2 | NHHL1 | NHHL2 | PS1 | RW4 | SSKP1 | SSKS2 |
---|---|---|---|---|---|---|---|---|---|---|---|
BAROQUE | d005709 | 1.0 | 1.0 | 1.9 | 1.9 | 1.1 | 1.1 | 1.4 | 1.9 | 0.9 | 1.1 |
BAROQUE | d006218 | 0.0 | 0.0 | 0.3 | 0.4 | 0.5 | 0.5 | 1.2 | 0.4 | 0.8 | 0.6 |
BAROQUE | d010595 | 1.3 | 1.3 | 1.2 | 1.4 | 1.3 | 1.3 | 1.4 | 1.3 | 1.4 | 1.4 |
BAROQUE | d016827 | 0.4 | 0.4 | 0.9 | 0.4 | 0.2 | 0.2 | 0.4 | 0.9 | 0.4 | 0.5 |
BAROQUE | d019925 | 1.5 | 1.6 | 1.7 | 1.6 | 1.8 | 1.8 | 2.0 | 1.9 | 1.9 | 1.9 |
BLUES | e003462 | 0.0 | 0.0 | 0.5 | 0.3 | 0.6 | 0.6 | 0.4 | 0.4 | 0.5 | 0.2 |
BLUES | e006719 | 1.1 | 1.2 | 1.7 | 1.1 | 1.9 | 1.9 | 1.5 | 1.5 | 1.4 | 1.7 |
BLUES | e013942 | 1.1 | 1.0 | 1.5 | 1.3 | 1.2 | 1.2 | 1.5 | 1.5 | 1.6 | 1.6 |
BLUES | e014478 | 0.6 | 0.7 | 0.1 | 0.4 | 0.5 | 0.5 | 0.8 | 0.3 | 0.3 | 0.2 |
BLUES | e019782 | 1.3 | 1.2 | 1.6 | 1.6 | 1.9 | 1.9 | 2.0 | 1.6 | 2.0 | 1.6 |
CLASSICAL | d006152 | 1.4 | 1.2 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.8 | 2.0 |
CLASSICAL | d009811 | 0.3 | 0.3 | 0.5 | 0.3 | 0.0 | 0.0 | 0.5 | 0.7 | 0.4 | 0.7 |
CLASSICAL | d015395 | 0.2 | 0.2 | 1.4 | 1.7 | 1.4 | 1.4 | 1.6 | 1.6 | 1.7 | 1.7 |
CLASSICAL | d016084 | 0.6 | 0.6 | 1.5 | 1.4 | 0.9 | 0.9 | 1.3 | 1.4 | 1.2 | 1.6 |
CLASSICAL | d018315 | 0.0 | 0.0 | 1.1 | 1.0 | 1.0 | 1.0 | 1.2 | 1.0 | 1.0 | 1.1 |
COUNTRY | b003088 | 0.4 | 0.4 | 1.4 | 1.5 | 1.6 | 1.6 | 1.3 | 1.6 | 1.5 | 1.3 |
COUNTRY | e008540 | 0.5 | 0.5 | 1.0 | 1.3 | 1.2 | 1.2 | 1.1 | 1.4 | 1.2 | 1.4 |
COUNTRY | e012590 | 0.4 | 0.4 | 0.7 | 0.8 | 0.3 | 0.3 | 1.3 | 0.9 | 0.9 | 0.9 |
COUNTRY | e014995 | 0.7 | 0.7 | 1.0 | 0.9 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
COUNTRY | e016359 | 0.0 | 0.0 | 0.0 | 0.3 | 0.1 | 0.1 | 0.1 | 0.1 | 0.0 | 0.2 |
EDANCE | b006191 | 0.0 | 0.0 | 0.1 | 0.1 | 0.0 | 0.0 | 0.2 | 0.1 | 0.8 | 0.8 |
EDANCE | b011724 | 1.1 | 1.1 | 0.9 | 1.2 | 1.0 | 1.0 | 1.5 | 1.2 | 1.6 | 1.5 |
EDANCE | b013180 | 1.1 | 1.1 | 0.7 | 0.8 | 0.7 | 0.7 | 1.4 | 1.1 | 1.5 | 1.2 |
EDANCE | f010038 | 0.1 | 0.1 | 0.3 | 0.6 | 0.5 | 0.5 | 1.0 | 0.5 | 1.1 | 0.8 |
EDANCE | f016289 | 0.1 | 0.1 | 0.4 | 0.0 | 0.3 | 0.3 | 0.5 | 0.2 | 0.9 | 0.9 |
JAZZ | e002496 | 0.4 | 0.5 | 0.8 | 0.7 | 0.0 | 0.0 | 0.8 | 0.9 | 0.9 | 0.8 |
JAZZ | e003502 | 1.5 | 1.5 | 0.7 | 0.9 | 1.3 | 1.3 | 1.4 | 1.4 | 1.9 | 1.8 |
JAZZ | e011411 | 1.3 | 1.3 | 0.8 | 1.8 | 1.3 | 1.3 | 1.7 | 1.6 | 1.1 | 0.7 |
JAZZ | e014617 | 0.4 | 0.5 | 0.4 | 0.2 | 1.7 | 1.7 | 1.9 | 1.4 | 1.8 | 1.8 |
JAZZ | e019789 | 0.7 | 0.7 | 0.5 | 0.2 | 1.1 | 1.1 | 1.1 | 0.2 | 0.8 | 0.7 |
METAL | b006857 | 0.9 | 0.9 | 1.1 | 1.3 | 1.0 | 1.0 | 1.0 | 1.3 | 1.4 | 1.2 |
METAL | b009281 | 1.4 | 1.4 | 1.8 | 2.0 | 1.8 | 1.8 | 1.6 | 2.0 | 2.0 | 2.0 |
METAL | b014284 | 0.9 | 0.9 | 0.3 | 0.9 | 1.4 | 1.4 | 1.6 | 0.9 | 1.6 | 1.8 |
METAL | b014839 | 0.3 | 0.3 | 0.3 | 0.5 | 0.7 | 0.7 | 0.6 | 0.3 | 0.3 | 0.5 |
METAL | b017570 | 0.2 | 0.1 | 0.3 | 0.3 | 0.1 | 0.1 | 0.2 | 0.2 | 0.4 | 0.5 |
RAPHIPHOP | a002038 | 0.5 | 0.5 | 0.7 | 1.0 | 1.0 | 1.0 | 1.4 | 1.3 | 1.5 | 1.5 |
RAPHIPHOP | a002900 | 0.6 | 0.6 | 0.7 | 0.7 | 0.6 | 0.6 | 0.5 | 0.9 | 0.7 | 0.7 |
RAPHIPHOP | a007956 | 1.4 | 1.4 | 1.6 | 1.7 | 1.4 | 1.4 | 1.5 | 1.6 | 1.8 | 1.9 |
RAPHIPHOP | a009690 | 1.0 | 1.0 | 1.4 | 0.7 | 1.2 | 1.2 | 1.1 | 0.8 | 1.2 | 1.4 |
RAPHIPHOP | b004382 | 1.6 | 1.6 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.9 | 1.9 |
ROCKROLL | b000859 | 0.5 | 0.6 | 0.9 | 0.7 | 1.1 | 1.1 | 0.7 | 0.9 | 0.3 | 0.7 |
ROCKROLL | b008224 | 0.5 | 0.4 | 0.5 | 0.7 | 0.4 | 0.4 | 0.2 | 0.3 | 0.9 | 1.0 |
ROCKROLL | b010359 | 0.0 | 0.0 | 0.3 | 0.0 | 0.0 | 0.0 | 0.3 | 0.0 | 0.4 | 0.2 |
ROCKROLL | b010640 | 0.1 | 0.1 | 0.4 | 0.5 | 0.4 | 0.4 | 0.8 | 0.4 | 0.7 | 0.7 |
ROCKROLL | b017313 | 0.6 | 0.6 | 0.7 | 0.6 | 0.6 | 0.6 | 0.8 | 0.8 | 0.8 | 0.7 |
ROMANTIC | d000185 | 1.4 | 1.4 | 1.7 | 1.9 | 1.6 | 1.6 | 2.0 | 2.0 | 2.0 | 2.0 |
ROMANTIC | d007856 | 1.2 | 1.3 | 1.0 | 1.6 | 1.4 | 1.4 | 1.8 | 1.9 | 1.4 | 1.5 |
ROMANTIC | d011611 | 0.5 | 0.5 | 0.5 | 0.9 | 0.6 | 0.6 | 1.4 | 1.1 | 1.2 | 1.1 |
ROMANTIC | d011697 | 0.0 | 0.0 | 0.4 | 0.6 | 0.3 | 0.3 | 0.5 | 0.6 | 0.6 | 0.4 |
ROMANTIC | d012432 | 0.9 | 0.9 | 0.5 | 1.1 | 0.3 | 0.3 | 0.9 | 1.1 | 1.5 | 1.2 |
Raw Scores
The raw data derived from the Evalutron 6000 human evaluations are located on the 2012:Audio Music Similarity and Retrieval Raw Data page.
Metadata and Distance Space Evaluation
The following reports provide evaluation statistics based on an analysis of each system's distance space and of metadata matches. They include:
- Neighbourhood clustering by artist, album and genre
- Artist-filtered genre clustering
- How often the triangular inequality holds (a sampled check is sketched below)
- Statistics on 'hubs' (tracks that are similar to many tracks) and 'orphans' (tracks that are not similar to any other tracks at N results)
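As an illustration of one of these statistics, the triangular-inequality check can be estimated by sampling random triples from a distance matrix. This is a rough sketch assuming D holds a submitted 7000x7000 distance matrix; it is not the code used to generate the reports.

```matlab
n      = size(D, 1);
trials = 100000;
holds  = 0;
for t = 1:trials
    ijk = randperm(n, 3);                        % pick a random triple of distinct tracks
    i = ijk(1); j = ijk(2); k = ijk(3);
    holds = holds + (D(i, k) <= D(i, j) + D(j, k));
end
fprintf('Triangle inequality held for %.1f%% of sampled triples\n', 100 * holds / trials);
```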
Reports
DM6 = [Franz de Leon, Kirk Martinez](https://music-ir.org/mirex/results/2012/ams/statistics/DM6/report.txt)
DM7 = [Franz de Leon, Kirk Martinez](https://music-ir.org/mirex/results/2012/ams/statistics/DM7/report.txt)
GT3 = [George Tzanetakis](https://music-ir.org/mirex/results/2012/ams/statistics/GT3/report.txt)
JR2 = [Jia-Min Ren, Jyh-Shing Roger Jang](https://music-ir.org/mirex/results/2012/ams/statistics/JR2/report.txt)
NHHL1 = [Byeong-jun Han, Kyogu Lee, Juhan Nam, Jorge Herrera](https://music-ir.org/mirex/results/2012/ams/statistics/NHHL1/report.txt)
NHHL2 = [Byeong-jun Han, Kyogu Lee, Juhan Nam, Jorge Herrera](https://music-ir.org/mirex/results/2012/ams/statistics/NHHL2/report.txt)
PS1 = [Dominik Schnitzer, Tim Pohle](https://music-ir.org/mirex/results/2012/ams/statistics/PS1/report.txt)
RW4 = [Jia-Min Ren, Ming-Ju Wu, Jyh-Shing Roger Jang](https://music-ir.org/mirex/results/2012/ams/statistics/RW4/report.txt)
SSKP1 = [Klaus Seyerlehner, Markus Schedl, Peter Knees, Tim Pohle](https://music-ir.org/mirex/results/2012/ams/statistics/SSKP1/report.txt)
SSKS2 = [Klaus Seyerlehner, Markus Schedl, Peter Knees, Reinhard Sonnleitner](https://music-ir.org/mirex/results/2012/ams/statistics/SSKS2/report.txt)