2007:Audio Music Mood Classification
<hr />
=FINAL 2007 AMC EVALUATION SCENARIO OVERVIEW=<br />
This section clarifies what will happen for this year's "beta" run of the Audio Mood Classification (AMC) task.<br />
<br />
# We will operate the AMC task as a classic train-test classification task.<br />
# We will n-fold the runs with n to be determined by the size of the final data set, number of participants, etc.<br />
# We will hand-craft the n-fold test-train split lists.<br />
# We will NOT be doing post-run human mood judgments this year using the Evalutron 6000. <br />
# Audio files: 30 sec., 22kHz, mono, 16 bit<br />
<br />
Do take a look at the [[2007:Audio Genre Classification]] task wiki as we are basing the underlying structure of this task on Audio Genre. In fact, an Audio Genre submission should work out of the box with Audio Mood Classification. Note: we really want folks to do a FEATURE EXTRACTION phase first against all the files and then have these features cached some place for re-use during the TRAIN-TEST phase. This way we can really speed up the n-fold processing. Thus, like GENRE, we need to pass three input files to your algos:<br />
<br />
==== 1. Feature extraction list file ====<br />
The list file passed for feature extraction will be a simple ASCII list <br />
file. This file will contain one path per line with no header line.<br />
<br />
==== 2. Training list file ====<br />
The list file passed for model training will be a simple ASCII list <br />
file. This file will contain one path per line, followed by a tab character and <br />
the mood label, again with no header line. <br />
<br />
E.g. <example path and filename>\t<mood classification><br />
<br />
==== 3. Test (classification) list file ====<br />
The list file passed for testing classification will be a simple ASCII list <br />
file identical in format to the Feature extraction list file. This file will <br />
contain one path per line with no header line.<br />
<br />
==== Classification output files ====<br />
Participating algorithms should produce a simple ASCII list file identical in <br />
format to the Training list file. This file will contain one path per line, <br />
followed by a tab character and the MOOD label, again with no header line. <br />
E.g.:<br />
<example path and filename>\t<mood classification><br />
<br />
The path to which this list file should be written must be accepted as a <br />
parameter on the command line.<br />
<br />
********************************************<br />
<br />
== Audio collection poll ==<br />
<br />
<poll><br />
Would you like to use 30-second clips from tracks for analysis, to avoid mood changes within tracks and reduce processing load?<br />
Yes<br />
No, I prefer 60-second clips<br />
No, I prefer the whole track <br />
</poll><br />
<br />
<poll><br />
How important do you think cross-validation is?<br />
Very important<br />
Important<br />
Not important<br />
</poll><br />
<br />
<poll><br />
Would you like your algorithm(s) to be evaluated on a closed groundtruth set (as in traditional classification problems, both training and testing data are labeled well before the contest) or on an unlabeled audio pool (in the way described on this wiki page; please see sections 7, 8 and 9)?<br />
On a closed groundtruth set (the size of the set is smaller, but evaluation metrics are more rigorous and support cross-validation)<br />
On an unlabeled audio pool (the size of the pool can be very big, but only a small portion will be judged by human assessors)<br />
Both <br />
</poll><br />
<br />
<poll><br />
If you like a closed groundtruth set, what is the MINIMUM size of the set you can accept (including training and testing)?<br />
400 clips in total (~80 clips in each category)<br />
600 clips in total (~120 clips in each category)<br />
800 clips in total (~160 clips in each category)<br />
1000 clips in total (~200 clips in each category)<br />
more than 1000 clips<br />
</poll><br />
<br />
<poll><br />
If you like an unlabeled audio pool, what is the MINIMUM size of training audio you can accept?<br />
30 clips in each category<br />
50 clips in each category<br />
80 clips in each category<br />
100 clips in each category<br />
more than 100 clips in each category<br />
</poll><br />
<br />
<br />
<poll><br />
What is your preferred audio format? (the less audio data there is to process, the larger the dataset can be) <br />
22 khz mono WAV<br />
22 khz stereo WAV<br />
44 khz mono WAV<br />
44 khz stereo WAV<br />
22 khz mono MP3 128kb<br />
22 khz stereo MP3 128kb<br />
44 khz mono MP3 128kb<br />
44 khz stereo MP3 128kb<br />
</poll><br />
<br />
<poll><br />
How many algorithms are you likely to submit? (for estimating the number of human assessors needed)<br />
0<br />
1<br />
2<br />
3<br />
</poll><br />
<br />
== Introduction ==<br />
In music psychology and music education, the emotion component of music has been recognized as the element most strongly associated with music expressivity (e.g. Juslin et al. 2006 [[#Related Papers]]). Music information behavior studies (e.g. Cunningham, Jones and Jones 2004; Vignoli 2004; Cunningham, Bainbridge and Falconer 2006 [[#Related Papers]]) have also identified music mood/emotion as an important criterion used by people in music seeking and organization. Several experiments have been conducted in the MIR community to classify music by mood (e.g. Lu, Liu and Zhang 2006; Pohle, Pampalk and Widmer 2005; Mandel, Poliner and Ellis 2006; Feng, Zhuang and Pan 2003 [[#Related Papers]]). Please note: the MIR community tends to use the word "mood" while music psychologists prefer "emotion". We follow the MIR tradition and use "mood" hereafter. <br />
<br />
However, evaluation of music mood classification is difficult because music mood is a very subjective notion. Each of the aforementioned experiments used different mood categories and different datasets, making comparisons with previous work virtually impossible. A contest on music mood classification in MIREX will help build the first community-available test set and valuable ground truth.<br />
<br />
This is the first time MIREX has attempted a music mood classification evaluation. There are many issues involved in this evaluation task, so let us start discussing them on this wiki. If needed, we will set up a mailing list devoted to the discussion.<br />
<br />
== Mood Categories ==<br />
<br />
The IMIRSEL has derived a set of 5 mood clusters from the AMG mood repository (Hu & Downie 2007 [[#Related Papers]]). The mood clusters effectively reduce the diverse mood space into a tangible set of categories, yet remain rooted in the social-cultural context of pop music. Therefore, we propose to use the 5 mood clusters as the categories in this year's audio mood classification contest. Each of the clusters is a collection of the AMG mood labels which collectively define the cluster: <br />
<br />
*Cluster_1: passionate, rousing, confident, boisterous, rowdy <br />
*Cluster_2: rollicking, cheerful, fun, sweet, amiable/good natured <br />
*Cluster_3: literate, poignant, wistful, bittersweet, autumnal, brooding <br />
*Cluster_4: humorous, silly, campy, quirky, whimsical, witty, wry <br />
*Cluster_5: aggressive, fiery, tense/anxious, intense, volatile, visceral <br />
<br />
At this moment, the IMIRSEL and Cyril Laurier at the Music Technology Group of Barcelona have manually validated the mood clusters and exemplar songs in each cluster. Please see [[#Exemplar Songs in Each Category]] for details. <br />
<br />
We are still seeking additional songs across different genres to enrich this set. During this process, the cluster with the least cross-listener consistency may be dropped, or two clusters that are often confused with each other may be combined. <br />
<br />
<br />
[[2007:Previous Discussion on Mood Taxonomy]]<br />
<br />
[[2007:Discussion on Mood Categories]]<br />
<br />
== Exemplar Songs in Each Category == <br />
Exemplar songs for each mood cluster are manually selected by multiple human assessors. The purpose is to further clarify the perceptual identities of the mood clusters.<br />
<br />
There are 190 candidate songs in the intersection of the AMG mood repository and the USPOP collection in IMIRSEL, and each of these songs has a single unanimous mood cluster label assigned by AMG editors. The mood labels by AMG editors are an important benchmark that can help us reach cross-listener consistency on such a subjective task. So far, 6 human assessors have listened to the 190 songs and assigned cluster labels to them. 50 songs are unanimously labeled by all 6 human assessors, 42 songs are unanimously labeled by 5 of the 6 human assessors, and another 40 songs by 4 of the 6 human assessors. The song titles are listed in [[exemplar songs]]. <br />
<br />
The advantages of the exemplar songs are twofold: 1. they will help people better understand what kind of mood each cluster refers to; 2. they can possibly be taken as training data for the algorithms (see the section [[#Training Set]]). <br />
<br />
Note: Lyrics issue: when labeling the songs, the human assessors were asked to ignore lyrics. As this contest focuses on music audio, lyrics should not be taken into consideration. <br />
<br />
[[2007:Previous Discussion on Ground Truth]]<br />
<br />
== Two Evaluation Scenarios ==<br />
<br />
1. Evaluation on a closed groundtruth set.<br />
As in traditional classification problems, both training and testing data are labeled well before the contest. <br />
Pros: evaluation metrics are more rigorous and support cross-validation. <br />
Cons: the training/testing set is limited.<br />
<br />
2. Training on a labeled set, but testing on an unlabeled audio pool. <br />
As in the audio similarity and retrieval contest, each algorithm returns a list of candidates in each mood category, then human assessors make judgments on the returned candidates. <br />
Pros: the testing pool can be arbitrarily big; the training set is bigger as well (it can be the whole groundtruth set from scenario 1). <br />
Cons: innovative but limited evaluation metrics (see below).<br />
<br />
For both scenarios, this is a single-label classification contest, and thus each song can only be classified into one mood cluster. <br />
<br />
'''We will go for scenario 1'''<br />
<br />
== Groundtruth Set ==<br />
<br />
The IMIRSEL is preparing a ground-truth set of audio clips selected from the USPOP collection described above and the APM collection (www.apmmusic.com). The bibliographic information of the exemplar songs has been released above to help participants reach agreement on the meanings of the mood categories.<br />
<br />
The APM audio set has been pre-labeled with the 5 mood clusters according to the metadata provided by APM, and covers a variety of genres: each category covers about 7 major genres (with 20-30 tracks each) and a few minor genres. To make the problem more interesting, the distribution among major genres within each category is made as even as possible. <br />
<br />
To make sure the mood labels are correct, this APM audio collection will be subject to human validation before the contest. We prepared a set of 1250 audio clips (250 per category). Audio clips whose mood category assignments reach agreement among 2 out of 3 human assessors will serve as the ground truth set. We are aiming for at least 120 audio clips in each mood category. <br />
<br />
After the human validation on this audio set, participating algorithms/ models will be trained and tested within IMIRSEL.<br />
<br />
'''Audio format: 30 second clips, 22.05kHz, mono, 16bit, WAV files''' <br />
<br />
=== Human Validation ===<br />
Subjective judgments by human assessors will be collected for the above-mentioned APM audio set using a web-based system, Evalutron 6000, developed by the IMIRSEL. (An introduction to this Evalutron 6000 setup is available at [[2007:Evalutron6000_Walkthrough_For_Audio_Mood_Classification]].)<br />
<br />
Each audio clip is 30 seconds long, and 3 human judges will listen to it and choose which mood category it belongs to. If 2 of the 3 judges agree on its category, the clip will be selected into the groundtruth set.<br />
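<br />
As an informal illustration (hypothetical data and function names, not the Evalutron 6000 code), the 2-out-of-3 agreement rule could be applied to the collected judgments roughly as follows:<br />
<pre>
# Sketch: keep only clips where at least 2 of the 3 judges agree on a cluster.
# "judgments" maps a clip id to the list of 3 cluster labels given by the judges.
from collections import Counter

def select_groundtruth(judgments):
    groundtruth = {}
    for clip, labels in judgments.items():
        label, count = Counter(labels).most_common(1)[0]
        if count >= 2:                      # 2-of-3 agreement
            groundtruth[clip] = label
    return groundtruth

judgments = {
    "clip_001.wav": ["Cluster_1", "Cluster_1", "Cluster_4"],  # kept as Cluster_1
    "clip_002.wav": ["Cluster_2", "Cluster_3", "Cluster_5"],  # discarded, no agreement
}
print(select_groundtruth(judgments))
</pre>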
<br />
== Evaluation Metrics == <br />
<br />
Metrics frequently used in classification problems include accuracy, precision, recall and F-measure (combining precision and recall). The single most important metric is accuracy, which allows direct system comparison: <br />
<br />
''Accuracy = # of correctly classified songs / # of all songs'' <br />
<br />
Accuracy can be calculated over all clusters pooled together (micro average), or per cluster and then averaged across clusters (macro average).<br />
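<br />
For illustration only (not the official evaluation code), the pooled and per-cluster averaged accuracies could be computed from two parallel label lists as sketched below:<br />
<pre>
# Sketch: pooled (micro) and per-cluster averaged (macro) accuracy.
# "truth" and "predicted" are parallel lists of cluster labels, one entry per clip.
from collections import defaultdict

def accuracies(truth, predicted):
    correct = sum(t == p for t, p in zip(truth, predicted))
    micro = correct / len(truth)                       # all clips pooled together

    per_cluster = defaultdict(lambda: [0, 0])          # cluster -> [correct, total]
    for t, p in zip(truth, predicted):
        per_cluster[t][1] += 1
        per_cluster[t][0] += int(t == p)
    macro = sum(c / n for c, n in per_cluster.values()) / len(per_cluster)
    return micro, macro

truth     = ["Cluster_1", "Cluster_1", "Cluster_2", "Cluster_3"]
predicted = ["Cluster_1", "Cluster_2", "Cluster_2", "Cluster_3"]
print(accuracies(truth, predicted))   # (0.75, 0.8333...)
</pre>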
<br />
The significance of differences among systems will be tested, possibly using:<br />
<br />
*a) McNemar's test <br />
<br />
McNemar's test (Dietterich, 1997) is a statistical procedure that can validate the significance of differences between two classifiers. It was used in the Audio Genre Classification and Audio Artist Identification contests in MIREX 2005. <br />
<br />
*b) Friedman's test<br />
<br />
Friedman's test is used to detect differences in treatments across multiple test attempts (http://en.wikipedia.org/wiki/Friedman_test). It was used in the Audio Similarity, Audio Cover Song, and Query by Singing/Humming contests in MIREX 2006. <br />
<br />
In addition, run time can be recorded and compared.<br />
<br />
== Important Dates ==<br />
<br />
* Human Validation for Groundtruth Set: August 1 - August 15<br />
* Algorithm Submission Deadline: August 25<br />
<br />
== Packaging your Submission ==<br />
* Be sure that your submission follows the [[#Submission_Format]] outlined below.<br />
* Be sure that your submission accepts the proper [[#Input_File]] format<br />
* Be sure that your submission produces the proper [[#Output_File]] format<br />
* Be sure to follow the [[2006:Best_Coding_Practices_for_MIREX]]<br />
* Be sure to follow the [[2007:MIREX 2007 Submission Instructions]] <br />
* In the README file that is included with your submission, please answer the following additional questions:<br />
** Approximately how long will the submission take to process ~1000 wav files?<br />
** Approximately how much scratch disk space will the submission need to store any feature/cache files?<br />
** Any special notes regarding running your algorithm<br />
* Submit your system via the URL located at the bottom of [[2007:MIREX 2007 Submission Instructions]] page<br />
<br />
Note that the information that you place in the README file is '''extremely''' important in ensuring that your submission is evaluated properly.<br />
<br />
== Submission Format ==<br />
A submission to the Audio Music Mood Classification evaluation is expected to follow the [[2006:Best_Coding_Practices_for_MIREX]] and must conform to the following for execution:<br />
<br />
=== One Call Format ===<br />
The one call format is appropriate for systems that perform all phases of the classification (typically feature extraction, training and testing) in one step. A submission should be an executable program that takes 4 arguments: <br />
* path/to/fileContainingListOfTrainingAudioClips - the path to the list of training audio clips (see [[#File Formats]] below)<br />
* path/to/fileContainingListOfTestingAudioClips - the path to the list of testing audio clips (see [[#File Formats]] below)<br />
* path/to/cacheDir - a directory where the submission can place temporary or scratch files. Note that the contents of this directory can be retained across runs, so if, for whatever reason, the submission needs to be restarted, the submission could make use of the contents of this directory to eliminate the need for reprocessing some inputs.<br />
* path/to/output/Results - the file where the output classification results should be placed. (see [[#File Formats]] below)<br />
<br />
'''Example:'''<br />
<br />
<pre><br />
<br />
doAMC "path/to/fileContainingListOfTrainingAudioClips" "path/to/fileContainingListOfTestingAudioClips" "path/to/cacheDir" "path/to/output/Results" <br />
<br />
</pre><br />
<br />
<br />
=== Two Call Format ===<br />
The two call format is appropriate for systems that perform the training and testing separately. A submission should consist of two executable programs:<br />
*trainAMC - this takes 3 arguments: <br />
** path/to/fileContainingListOfTrainingAudioClips - the path to the list of training audio clips (see [[#File Formats]] below)<br />
** path/to/trainingCacheDir - a directory where the submission can place temporary or scratch files. Note that the contents of this directory can be retained across runs, so if, for whatever reason, the submission needs to be restarted, the submission could make use of the contents of this directory to eliminate the need for reprocessing some inputs.<br />
** path/to/trainedClassificationModel - the file where the classification model should be placed<br />
*testAMC - this takes 4 arguments:<br />
** path/to/trainedClassificationModel<br />
** path/to/fileContainingListofTestingAudioClips - the path to the list of testing audio clips (see [[#File Formats]] below)<br />
** path/to/testingCacheDir - a directory where the submission can place temporary or scratch files. <br />
** path/to/output/Results - the file where the output classification results should be placed. (see [[#File Formats]] below)<br />
<br />
'''Example:'''<br />
<br />
<pre><br />
<br />
trainAMC "path/to/fileContainingListOfTrainingAudioClips" "path/to/trainingcacheDir" "path/to/trainedClassificationModel" <br />
testAMC "path/to/trainedClassificationModel" "path/to/fileContainingListofTestingAudioClips" "path/to/testingCacheDir" "path/to/output/Results"<br />
<br />
</pre><br />
<br />
=== Matlab format ===<br />
<br />
Matlab will also be supported in the form of functions in the following formats:<br />
<br />
==== Matlab One call format ====<br />
<pre><br />
doMyMatlabAMC('path/to/fileContainingListOfTrainingAudioClips','path/to/fileContainingListOfTestingAudioClips','path/to/cacheDir','path/to/output/Results')<br />
</pre><br />
<br />
<br />
==== Matlab Two call format ====<br />
<pre><br />
doMyMatlabTrainAMC('path/to/fileContainingListOfTrainingAudioClips','path/to/trainingcacheDir','path/to/trainedClassificationModel')<br />
doMyMatlabTestAMC('path/to/trainedClassificationModel','path/to/fileContainingListofTestingAudioClips','path/to/testingCacheDir','path/to/output/Results')<br />
</pre><br />
<br />
== File Formats ==<br />
<br />
=== Input Files ===<br />
<br />
The input training list file format will be of the form: <br />
<br />
<pre><br />
path/to/training/audio/file/000001.wav\tCluster_3<br />
path/to/training/audio/file/000002.wav\tCluster_5<br />
path/to/training/audio/file/000003.wav\tCluster_2<br />
...<br />
path/to/training/audio/file/00000N.wav\tCluster_1<br />
</pre><br />
<br />
"\t" stands for tab.<br />
<br />
The input testing list file format will be of the form: <br />
<br />
<pre><br />
path/to/testing/audio/file/000010.wav<br />
path/to/testing/audio/file/000020.wav<br />
path/to/testing/audio/file/000030.wav<br />
...<br />
path/to/testing/audio/file/0000N0.wav<br />
</pre><br />
<br />
<br />
=== Output File ===<br />
The only output will be a file containing classification results in the following format: <br />
<br />
<pre><br />
Example Classification Results 0.1 (replace this line with your system name)<br />
path/to/testing/audio/file/000010.wav\tCluster_3<br />
path/to/testing/audio/file/000020.wav\tCluster_1<br />
path/to/testing/audio/file/000030.wav\tCluster_5<br />
...<br />
path/to/testing/audio/file/0000N0.wav\tCluster_2<br />
</pre><br />
<br />
"\t" indicates tab. All audio clips should have one and only one mood cluster label.<br />
<br />
==Evaluation Scenario 2==<br />
<br />
=== Training Set ===<br />
<br />
Under evaluation scenario 2, the training set would be the whole ground truth set in scenario 1 (see [[#Groundtruth Set]]).<br />
<br />
=== Unlabeled Song Pool ===<br />
Under evaluation scenario 2, the pool of testing audio to be classified is drawn from the same collections as the training set, i.e. USPOP and APM. We will make sure the audio covers a variety of genres in each mood cluster, which will make the contest harder and more interesting.<br />
<br />
We will randomly select a certain number (say, 1000) of songs from the collections as the audio pool. This number should make the contest interesting enough, but not too hard. And the songs need to cover all 5 mood clusters.<br />
<br />
=== Classification Results ===<br />
Each algorithm will return the top X songs in each cluster. <br />
<br />
This is a single-label classification contest, and thus each song can only be classified into one mood cluster. <br />
<br />
Note: unlike traditional classification problems where all testing samples have ground truth available, this scenario does not have a well-labeled testing set. Instead, we use a "pooling" approach as in TREC and last year's audio similarity and retrieval contest. This approach collects the top X results from each algorithm and asks human assessors to make judgments on this set of collected results while assuming all other samples are irrelevant or incorrect. This approach cannot measure an absolute "recall" metric, but it is valid for comparing relative performances among participating algorithms. <br />
<br />
The actual value of X depends on human assessment protocol and number of available human assessors (see next section [[#Human Assessment]]).<br />
<br />
=== Human Assessment===<br />
Subjective judgments by human assessors will be collected for the pooled results using a web-based system, Evalutron 6000, developed by the IMIRSEL. (An introduction to this Evalutron 6000 setup is available at [[2007:Evalutron6000_Walkthrough_For_Audio_Mood_Classification]].)<br />
<br />
==== How many judgments and assessors ====<br />
Each algorithm returns X songs for each of the 5 mood clusters. Suppose there are Y algorithms; in the worst case, each cluster will have X*Y songs to be judged, i.e. 5*X*Y songs in total. Suppose each song needs Z sets of ears; then there will be 5*X*Y*Z judgments in total. When making a judgment, a human assessor will listen to the 30 second clip of a song and label it with one of the 5 mood clusters. <br />
<br />
Human evaluators will be drawn from the participating labs and from volunteers from IMIRSEL or the MIREX lists. Suppose we can get W evaluators; then each evaluator will evaluate S = (5*X*Y*Z) / W songs.<br />
<br />
At this moment, there are 10 potential participants on the wiki, so let's say Y = 6. Suppose each candidate song will be evaluated by 3 judges (Z = 3), and suppose we can get 20 assessors (W = 20): <br />
<br />
*If X = 20, number of judgments for each assessor: S = 90<br />
*If X = 10, S = 45<br />
*If X = 30, S = 135 <br />
*If X = 50, S = 225<br />
*If X = 15, S = 67.5<br />
*…<br />
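<br />
The per-assessor values listed above follow directly from the formula; a tiny Python check (illustrative only):<br />
<pre>
# Sketch: judgments per assessor, S = (5 * X * Y * Z) / W.
def per_assessor(X, Y=6, Z=3, W=20):
    return 5 * X * Y * Z / W

print([per_assessor(X) for X in (10, 15, 20, 30, 50)])   # [45.0, 67.5, 90.0, 135.0, 225.0]
</pre>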
<br />
In the audio similarity contest last year, each assessor made 205 judgments on average. As mood judgments are trickier, we may need to place less burden on our assessors.<br />
<br />
To eliminate possible bias, we will try to equally distribute candidates returned by each algorithm among human assessors.<br />
<br />
=== Scoring ===<br />
Each algorithm is graded by the number of votes its candidate songs win from the judges. For example, if a song, A, is judged as belonging to Cluster_1 by 2 assessors and to Cluster_2 by 1 assessor, then the algorithm classifying A as Cluster_1 will score 2 on this song, while the algorithm classifying A as Cluster_2 will score 1 on this song. An algorithm's final score is the sum of scores on all the songs it submits. Since each algorithm can only submit 100 songs, the one which wins the most judge votes wins the contest.<br />
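<br />
A small Python sketch (hypothetical data structures, not the official scoring script) of this vote-based scoring:<br />
<pre>
# Sketch: score each algorithm by the judge votes its classifications receive.
# "judge_votes" maps a song to {cluster: number of judges voting for it};
# "submissions" maps an algorithm name to its {song: predicted cluster} output.
def score_algorithms(submissions, judge_votes):
    scores = {}
    for algo, predictions in submissions.items():
        scores[algo] = sum(judge_votes.get(song, {}).get(cluster, 0)
                           for song, cluster in predictions.items())
    return scores

judge_votes = {"A.wav": {"Cluster_1": 2, "Cluster_2": 1}}
submissions = {"algo_x": {"A.wav": "Cluster_1"},   # scores 2 on this song
               "algo_y": {"A.wav": "Cluster_2"}}   # scores 1 on this song
print(score_algorithms(submissions, judge_votes))  # {'algo_x': 2, 'algo_y': 1}
</pre>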
<br />
=== Evaluation Metrics ===<br />
The algorithm score described in the previous section is a metric that facilitates direct comparison. <br />
<br />
In addition, metrics frequently used in classification problems include accuracy, precision, recall and F-measure (combining precision and recall). As mentioned above, the pooling approach only yields a relative recall measure; therefore, the single most important metric is accuracy: <br />
<br />
The original definition of accuracy is:<br />
''Accuracy = # of correctly classified songs / # of all songs'' <br />
<br />
According to the above human assessment method, "correctly classified songs" in this scenario can be defined as songs classified as the majority vote of the judges and, in the case of ties, songs classified as any of the tied votes. For example, suppose each song has 3 judges. If a song is labeled as Cluster_1 by at least 2 judges, then this song will be counted as correct for algorithms classifying it as Cluster_1; if a song is labeled as Cluster_1, Cluster_2 and Cluster_3 once each by the judges, then this song will be counted as correct for algorithms classifying it as Cluster_1, Cluster_2 or Cluster_3. <br />
<br />
Accuracy can be calculated over all clusters pooled together (micro average), or per cluster and then averaged across clusters (macro average).<br />
<br />
The significance of differences among systems will be tested, possibly using:<br />
<br />
*a) McNemar's test <br />
*b) Friedman's test<br />
<br />
In addition, run time can be recorded and compared.<br />
<br />
== Challenging Issues == <br />
# Pieces with changing mood: some pieces may start in one mood but end in another. <br />
<br />
We will use 30 second clips instead of whole songs. The clips will be extracted automatically from the middle of the songs, which is more likely to be representative.<br />
<br />
# Multiple-label classification: it is possible that one piece has two or more correct mood labels, but as a start, we strongly suggest holding a less ambiguous contest and leaving this challenge to future MIREXes. So, for this year, this is a single-label classification problem.<br />
<br />
== Participants ==<br />
If you think there is a slight chance that you might consider participating, please add your name and email address here. <br />
<br />
* Kris West (kw at cmp dot uea dot ac dot uk)<br />
* Cyril Laurier (claurier at iua dot upf dot edu)<br />
* Elias Pampalk (<i>firstname.lastname</i>@gmail.com)<br />
* Yuriy Molchanyuk (molchanyuk at onu.edu.ua)<br />
* Shigeki Sagayama (sagayama at hil dot t.u-tokyo.ac.jp)<br />
* Guillaume Nargeot (killy971 at gmail dot com)<br />
* Zhongzhe Xiao (zhongzhe dot xiao at ec-lyon dot fr)<br />
* Kyogu Lee (kglee at ccrma.stanford.edu)<br />
* Vitor Soares (<i>firstname.lastname</i>@clustermedialabs.com)<br />
* Wai Cheung (wlche1@infotech.monash.edu.au)<br />
* Matt Hoffman (mdhoffma <i>a t</i> cs <i>d o t</i> princeton <i>d o t</i> edu)<br />
* Yi-Hsuan Yang (affige at gmail dot com)<br />
* Jose Fornari ( fornari at campus dot jyu dot fi )<br />
<br />
== Moderators ==<br />
* J. Stephen Downie (IMIRSEL, University of Illinois, USA) - [mailto:jdownie@uiuc.edu]<br />
* Xiao Hu (IMIRSEL, University of Illinois, USA) -[mailto:xiaohu@uiuc.edu]<br />
* Cyril Laurier (Music Technology Group, Barcelona, Spain) -[mailto:claurier@iua.upf.edu]<br />
<br />
== Related Papers ==<br />
#Dietterich, T. (1997). '''Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms'''. Neural Computation, 10(7), 1895-1924.<br />
#Hu, Xiao and J. Stephen Downie (2007). '''Exploring mood metadata: Relationships with genre, artist and usage metadata'''. Accepted in the Eighth International Conference on Music Information Retrieval (ISMIR 2007),Vienna, September 23-27, 2007.<br />
# Juslin, P. N., Karlsson, J., Lindström, E., Friberg, A. and Schoonderwaldt, E. (2006), '''Play It Again With Feeling: Computer Feedback in Musical Communication of Emotions'''. Journal of Experimental Psychology: Applied, Vol. 12, No. 2, 79-95.<br />
# [http://ismir2004.ismir.net/proceedings/p075-page-415-paper152.pdf Vignoli (ISMIR 2004)] '''Digital Music Interaction Concepts: A User Study'''<br />
# [http://ismir2004.ismir.net/proceedings/p082-page-447-paper221.pdf Cunningham, Jones and Jones (ISMIR 2004)] '''Organizing Digital Music For Use: An Examination of Personal Music Collections'''.<br />
# [http://ismir2006.ismir.net/PAPERS/ISMIR0685_Paper.pdf Cunningham, Bainbridge and Falconer (ISMIR 2006)] '''"More of an Art than a Science": Supporting the Creation of Playlists and Mixes'''.<br />
# Lu, Liu and Zhang (2006), '''Automatic Mood Detection and Tracking of Music Audio Signals'''. IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 1, JANUARY 2006 <br> Part of this paper appeared in ISMIR 2003 http://ismir2003.ismir.net/papers/Liu.PDF<br />
# [http://www.cp.jku.at/research/papers/Pohle_CBMI_2005.pdf Pohle, Pampalk, and Widmer (CBMI 2005)] '''Evaluation of Frequently Used Audio Features for Classification of Music into Perceptual Categories'''. <br> It separates "mood" and "emotion" into two classification dimensions, which are mostly combined in other studies.<br />
# [http://www.ee.columbia.edu/~dpwe/pubs/MandPE06-svm.pdf Mandel, Poliner and Ellis (2006)] '''Support vector machine active learning for music retrieval'''. Multimedia Systems, Vol.12(1). Aug.2006.<br />
# [http://doi.acm.org/10.1145/860435.860508 Feng, Zhuang and Pan (SIGIR 2003)] '''Popular music retrieval by detecting mood'''<br />
# [http://ismir2003.ismir.net/papers/Li.PDF Li and Ogihara (ISMIR 2003)] '''Detecting emotion in music'''<br />
# [http://pubdb.medien.ifi.lmu.de/cgi-bin//info.pl?hilliges2006audio Hilliges, Holzer, Klüber and Butz (2006)] '''AudioRadar: A metaphorical visualization for the navigation of large music collections'''. In Proceedings of the International Symposium on Smart Graphics 2006, Vancouver, Canada. <br> It summarizes implicit problems in traditional genre/artist based music organization.<br />
# Juslin, P. N., & Laukka, P. (2004). '''Expression, perception, and induction of musical emotions: A review and a questionnaire study of everyday listening'''. Journal of New Music Research, 33(3), 217-238.<br />
# [http://mpac.ee.ntu.edu.tw/~yihsuan/ Yang, Liu, and Chen (ACMMM 2006)] '''Music emotion classification: A fuzzy approach'''

2008:Multiple Fundamental Frequency Estimation & Tracking
<hr />
==Description==<br />
<br />
That a complex music signal can be represented by the F0 contours of its constituent sources is a very useful concept for most music information retrieval systems. There have been many attempts at multiple (aka polyphonic) F0 estimation and melody extraction, a related area. The goal of multiple F0 estimation and tracking is to identify the active F0s in each time frame and to track notes and timbres continuously in a complex music signal. In this task, we would like to evaluate state-of-the-art multiple-F0 estimation and tracking algorithms. Since F0 tracking of all sources in a complex audio mixture can be very hard, we are restricting the problem to 3 cases:<br />
<br />
1. Estimate active fundamental frequencies on a frame-by-frame basis.<br />
<br />
2. Track note contours on a continuous time basis (as in audio-to-MIDI). This task will also include a piano transcription subtask.<br />
<br />
3. Track timbre on a continuous time basis.<br />
<br />
The deadline for this task is August 22nd.<br />
<br />
==Data==<br />
<br />
A woodwind quintet transcription of the fifth variation from L. van Beethoven's Variations for String Quartet Op. 18 No. 5. Each part (flute, oboe, clarinet, horn, or bassoon) was recorded separately while the performer listened to the other parts (recorded previously) through headphones. The parts were later mixed to a monaural 44.1kHz/16-bit file.<br />
<br />
Synthesized pieces using RWC MIDI and RWC samples. Includes pieces from Classical and Jazz collections. Polyphony changes from 1 to 4 sources.<br />
<br />
Polyphonic piano recordings generated using a disklavier playback piano.<br />
<br />
So, there are six 30-sec clips for each polyphony (2-3-4-5) for a total of 30 examples, plus 10 30-sec polyphonic piano clips. Please email me your estimated running time (in terms of n times real time); if we believe everybody's algorithm is fast enough, we can increase the number of test samples. (There were 90x real-time algorithms in the melody extraction tasks in the past.)<br />
<br />
All files are in 44.1kHz / 16 bit wave format. The development set can be found at<br />
[https://www.music-ir.org/evaluation/MIREX/data/2007/multiF0/index.htm Development Set for MIREX 2007 MultiF0 Estimation Tracking Task]. <br />
<br />
Send an email to [mailto:mertbay@uiuc.edu mertbay@uiuc.edu] for the username and password.<br />
<br />
==Evaluation==<br />
<br />
This year, we would like to discuss different evaluation methods. From last year's results, it can be seen that for note tracking, algorithms performed poorly when evaluated using note offsets. Below are the evaluation methods we used last year: <br />
<br />
For Task 1 (frame level evaluation), systems will report the active pitches every 10ms. Precision (the proportion of retrieved pitches that are correct, per frame) and Recall (the ratio of correctly retrieved pitches to all ground truth pitches, per frame) will be reported. A returned pitch is assumed to be correct if it is within half a semitone (+/- 3%) of a ground-truth pitch for that frame. Only one ground-truth pitch can be associated with each returned pitch.<br />
Also, as suggested, an error score as described in [http://www.hindawi.com/GetArticle.aspx?doi=10.1155/2007/48317 Poliner and Ellis, p. 5] will be calculated. <br />
The frame level ground truth will be calculated by [http://www.ircam.fr/pcm/cheveign/sw/yin.zip YIN] and hand corrected.<br />
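<br />
As an unofficial illustration of the Task 1 criteria, frame-level precision, recall and the accuracy Acc = TP/(TP+FP+FN) discussed further down this page could be computed roughly as follows (greedy one-to-one matching within half a semitone is an assumption, not the exact IMIRSEL matching code):<br />
<pre>
# Sketch: frame-level multi-F0 scoring with a half-semitone (~3%) tolerance.
# "ref" and "est" are lists of F0 lists (Hz), one entry per 10ms frame.
def frame_scores(ref, est, tol=0.03):
    tp = fp = fn = 0
    for ref_f0s, est_f0s in zip(ref, est):
        remaining = list(ref_f0s)
        for f in est_f0s:
            match = next((r for r in remaining if abs(f - r) <= tol * r), None)
            if match is not None:
                tp += 1
                remaining.remove(match)   # each reference pitch used only once
            else:
                fp += 1
        fn += len(remaining)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, accuracy

print(frame_scores([[220.0, 440.0]], [[221.0, 330.0]]))  # one hit, one miss, one extra
</pre>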
<br />
For Task 2 (note tracking), Precision (the ratio of correctly transcribed notes to the number of transcribed notes for that input clip) and Recall (the ratio of correctly transcribed ground truth notes to the number of ground truth notes) will again be reported. A ground truth note is assumed to be correctly transcribed if the system returns a note that is within half a semitone (+/- 3%) of that note AND the returned note's onset is within a 50ms range (+/- 25ms) of the onset of the ground truth note, and its offset is within a 20% range of the ground truth note's offset. Again, one ground truth note can only be associated with one transcribed note.<br />
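<br />
A rough sketch of the Task 2 note-matching criterion (pitch within 3%, onset within +/- 25ms, offset within 20%, interpreted here as a fraction of the reference note's duration); this only illustrates the stated rules and is not the official scoring script:<br />
<pre>
# Sketch: match transcribed notes to ground truth notes (one-to-one).
# Notes are (onset_sec, offset_sec, f0_hz) tuples.
def match_notes(ref_notes, est_notes, pitch_tol=0.03, onset_tol=0.025, offset_frac=0.20):
    matched = 0
    available = list(est_notes)
    for r_on, r_off, r_f0 in ref_notes:
        off_tol = offset_frac * (r_off - r_on)   # 20% interpreted as a fraction of duration
        for note in available:
            e_on, e_off, e_f0 = note
            if (abs(e_f0 - r_f0) <= pitch_tol * r_f0
                    and abs(e_on - r_on) <= onset_tol
                    and abs(e_off - r_off) <= off_tol):
                matched += 1
                available.remove(note)            # one-to-one association
                break
    precision = matched / len(est_notes) if est_notes else 0.0
    recall = matched / len(ref_notes) if ref_notes else 0.0
    return precision, recall

ref = [(0.68, 1.20, 349.23), (0.72, 1.02, 220.00)]
est = [(0.69, 1.15, 350.00)]
print(match_notes(ref, est))   # (1.0, 0.5)
</pre>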
<br />
The ground truth for this task will be annotated by hand. An amplitude threshold relative to the file/instrument will be determined. The note onset is going to be set to the time where its amplitude rises above the threshold, and the offset is going to be set to the time where the note's amplitude decays below the threshold. The ground truth pitch is going to be set to the average F0 between the onset and the offset of the note.<br />
In the case of legato, the onset/offset is going to be set to the time where the F0 deviates more than 3% from the average F0 throughout the note up to that point. There will not be any vibrato larger than half a semitone in the test data.<br />
<br />
Different statistics can also be reported if agreed by the participants.<br />
<br />
== Submission Format ==<br />
<br />
Submissions have to conform to the specified format below:<br />
<br />
''doMultiF0 "path/to/file.wav" "path/to/output/file.F0" ''<br />
<br />
path/to/file.wav: Path to the input audio file.<br />
<br />
path/to/output/file.F0: The output file. <br />
<br />
Programs can use their working directory if they need to keep temporary cache files or internal debugging info. Stdout and stderr will be logged.<br />
<br />
For each task, the format of the output file is going to be different:<br />
For the first task, F0 estimation on a frame basis, the output will be a file where each row has a time stamp followed by the active F0s in that frame, separated by tabs, in 10ms increments. <br />
<br />
Example :<br />
''time F01 F02 F03 ''<br />
''time F01 F02 F03 F04''<br />
''time ... ... ... ...''<br />
<br />
which might look like:<br />
<br />
''0.78 146.83 220.00 349.23''<br />
''0.79 349.23 146.83 369.99 220.00 ''<br />
''0.80 ... ... ... ...''<br />
<br />
For the second task, each row of the file should contain the onset, offset and F0 of a note event, separated by tabs, ordered by onset time:<br />
<br />
onset offset F01<br />
onset offset F02<br />
... ... ...<br />
which might look like:<br />
<br />
0.68 1.20 349.23<br />
0.72 1.02 220.00<br />
... ... ...<br />
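<br />
A small Python sketch (hypothetical helper names) of writing the two output formats above:<br />
<pre>
# Sketch: write the frame-level (task 1) and note-level (task 2) output files.
def write_frames(path, frame_times, frame_f0s):
    # frame_times: list of time stamps (s); frame_f0s: list of F0 lists (Hz)
    with open(path, "w") as f:
        for t, f0s in zip(frame_times, frame_f0s):
            f.write("\t".join(["%.2f" % t] + ["%.2f" % f0 for f0 in f0s]) + "\n")

def write_notes(path, notes):
    # notes: list of (onset_sec, offset_sec, f0_hz); sorted by onset time before writing
    with open(path, "w") as f:
        for onset, offset, f0 in sorted(notes):
            f.write("%.2f\t%.2f\t%.2f\n" % (onset, offset, f0))

write_frames("file.F0", [0.78, 0.79], [[146.83, 220.00, 349.23], [349.23, 146.83]])
write_notes("file.F0", [(0.72, 1.02, 220.00), (0.68, 1.20, 349.23)])
</pre>
<br />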
The DEADLINE is Friday August 31.<br />
<br />
== Antonio's comments 24/07/08 ==<br />
<br />
First of all, thanks to the Mirex team for their effort to make it possible again. Just some comments about the evaluation this year. <br />
<br />
As the first multiple-F0 contest took place last year, it was very welcome and many researchers submitted their algorithms, so the MIREX 2007 results provide a very valuable resource for comparing different approaches. However, participation will probably not be as large this year, so in case the database used for evaluation is not the same as last year's, it would be nice to report (if possible) both the results obtained using the MIREX 2007 database and the results with the MIREX 2008 database, to directly compare the new approaches with the algorithms presented last year. <br />
<br />
== Emmanuel's comments 25/07/08 ==<br />
<br />
As last year, we will participate in tasks 1 and 2 only.<br />
<br />
For task 2, I understand that the proposed annotation method will take reverberation into account, so that for instance the offset of one note will happen after the onset of the following note in a legato context. Is that true? Computing the amplitudes of the notes is not trivial in the presence of overlapping partials, so I wonder if Mert could tell us a bit more.<br />
<br />
Results could also be evaluated with the onset-only metric used last year.<br />
<br />
== Mert's comments ==<br />
<br />
Thanks for the comments. Antonio, the new dataset will be the previous year's plus some more, so last year's labs can compare their new methods. <br />
Emmanuel, the current ground truth is annotated in a non-overlapping way, so within a source, the offset of the previous note cannot happen after the onset of the current one. The offset range in the evaluation criteria should be enough not to cause a false negative because of reverberation. However, we can come up with better criteria for evaluating with the offsets. <br />
<br />
== Antonio's comments 30/07/08 ==<br />
<br />
We will participate in tasks 1 and 2 too. Mert, any news about the deadline to send the algorithms?<br />
<br />
== Jean-Louis's comments 30/07/08 ==<br />
<br />
We will probably participate in task 1. One question about the development set: I know the ground truth is annotated at the frame level, with a hop size of 10ms between windows. However, could we know the size of the windows and the weighting window that were used? <br />
<br />
I would also like to know whether the first window starts at 0s or is centered around 0s. The ISMIR 2004 database had time stamps giving the center of each window. I wanted to check whether the annotation protocol was the same or not.<br />
<br />
== Mert's comments 05/08/08 ==<br />
<br />
The deadline will be Friday, August 22nd. It will be announced on the lists soon.<br />
The frame-level ground truth uses a 10ms hop size with a 46ms Hanning window. The first window is centered at 23ms, the second at 33ms, and so on. If this is a problem for the community let me know; I can readjust (or re-interpolate) the frame centers to match 10ms, 20ms, ...<br />
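<br />
To make the frame timing concrete, here is a tiny sketch of generating the ground-truth frame centers described above (assuming the 23ms first center and 10ms hop):<br />
<pre>
# Sketch: ground truth frame centers for 46ms Hanning windows with a 10ms hop,
# first window centered at 23ms (half the window length).
def frame_centers(duration_sec, first_center=0.023, hop=0.010):
    centers = []
    t = first_center
    while t <= duration_sec:
        centers.append(round(t, 3))
        t += hop
    return centers

print(frame_centers(0.08))   # [0.023, 0.033, 0.043, 0.053, 0.063, 0.073]
</pre>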
<br />
== Matti's comments 06/08/08 ==<br />
<br />
Hi all, I'm glad to see a re-run of this task and also potential new teams for this year. Should we fix the submission format so that people could start to prepare their submissions (I guess that the format is the same as last year)?<br />
<br />
<br />
== Mert's Comments ==<br />
<br />
Let's use the same I/O formats as last year. The deadline for this task will be August 22nd.<br />
<br />
== Gustavo's Comments ==<br />
<br />
Greetings! I would like to know what the submission format for task 3 is. Thanks in advance.<br />
<br />
== Jean-Louis's Comments ==<br />
Hi everyone,<br />
<br><br />
My concerns are about the evaluation of task 1: I just noticed that the definition of the accuracy on this page differs from the one on the results page ([[2007:Multiple_Fundamental_Frequency_Estimation_%26_Tracking_Results]]). I guess the latter is right (it seems to make more sense too), i.e. Acc = TP/(TP + FP + FN).<br />
<br><br />
I'd also like to warn people about a mistake I've been making when computing the Precision and Recall measures: I have been calculating them following the usual TP/(TP+FP) and TP/(TP+FN), but I think for this task it's a little bit more tricky... Especially for the recall, for which this way of computing does not take into account some of the FPs that are actually substitutions (and not additional positives where there should have been a 0, say)... Seeing the way it was computed for the audio melody extraction task a few years ago, I don't think my mistake was made during the 2007 evaluation. I would say the aforementioned accuracy does not suffer from this "mistake", such that the formula can be used as stated (even if the FP in it is somehow ambiguous). Am I right?<br />
<br><br />
I was wondering: would it be possible to know more precisely how the criteria are computed? Especially, what do you count as FP? I know in audio melody extraction (for which the groundtruth is "monophonic", such that the problems are not exactly the same either), the distinction was made between incorrect pitch (say "IP"): the frame was pitched but the estimated pitch was incorrect, and false positive (''the'' FP): the frame is unpitched, but a pitch was given (instead of 0, for instance). All in all, I mean that distinguishing between substitutions and additions seems to be necessary to obtain relevant measures. I guess everyone will agree about that, since the metrics by Ellis and Poliner were already taking into account this fact... Well, anyway, I am interested to know how you guys at MIREX are doing this! That would help me to "tune" my stuff the right way ! :D<br />
<br><br />
Concerning the output format, is it possible to put 0s instead of nothing for each frame? The system we are working on, based on source separation, outputs a fixed number of pitches (sources) per frame, giving 0 if a source is considered silent. Will it be taken into account in the evaluation or are all the 0s in our output going to count as "FPs"? :)<br />
<br />
==Mert's Comments==<br />
Hi Jean-Louis,<br />
Accuracy was calculated as Acc = TP/(TP+FP+FN). There was a typo in the page; it is fixed now. <br />
The evaluation is something we should discuss more this year. We can evaluate with many different criteria.<br />
As I look at my scripts, the recall is calculated as TP/Nref, where Nref is the number of non-zero elements in the ground truth vector. <br />
FP was calculated for each frame as the difference between the number of non-zero elements in the detected F0 vector and the number of non-zero elements in the intersection of the detected F0 vector with the ground truth F0 vector, then summed across all frames.<br />
<br />
By the way, you can put 0's; that is no problem.<br />
<br />
==Gustavo's Comments==<br />
<br />
Hi everyone,<br />
<br />
I just noticed that on the 2007 results page ([[2007:Multiple_Fundamental_Frequency_Estimation_%26_Tracking_Results]]) as well as on this wiki page it is stated that: "the returned note's onset is within a 50ms range (+ - 25ms) of the onset of the ground truth note".<br />
<br />
Which one is correct: +-25ms or +-50ms?<br />
<br />
==Potential Participants==<br />
If you might consider participating, please add your name and email address here and also please sign up for the Multi-F0 mail list:<br />
[https://mail.lis.uiuc.edu/mailman/listinfo/mrx-com03 Multi-F0 Estimation Tracking email list]<br />
<br />
<br />
1. Gustavo Reis (Polytecnic Institute of Leiria, Portugal) and Francisco Fernandez (University of Extremadura, Spain) and Anibal Ferreira (University of Porto, Portugal) (gustavo.reis (at) estg.ipleiria.pt, fcofdez (at) unex.es, ajf (at) fe.up.pt)<br><br />
2. Antonio Pertusa and José M. Iñesta (University of Alicante, Spain) (pertusa@ua.es, inesta@dlsi.ua.es)<br><br />
3. Pablo Cancela (pcancela@gmail.com)<br><br />
4. Emmanuel Vincent (emmanuel.vincent (at) irisa_fr) and Nancy Bertin (nancy.bertin (at) enst_fr)<br><br />
5. Jean-Louis Durrieu (durrieu AT enst DOT fr) (task 1)<br><br />
6. Matti Ryynänen and Anssi Klapuri (Tampere University of Technology) (matti.ryynanen (at) tut.fi, anssi.klapuri (at) tut.fi)<br><br />
7. Koji Egashira (University of Tokyo, Japan) (egashira (at) hil.t.u-tokyo.ac.jp) <br><br />
8. Ruohua Zhou and Josh Reiss (Queen Mary University of London) ( zhou.ruohua, Josh.Reiss@elec.qmul.ac.uk) <br><br />
9. Chunghsin Yeh, Axel Roebel (IRCAM) (cyeh, roebel (at) ircam dot fr) and Wei-Chen Chang (wcchang (at) gmail dot com)<br><br />
10. Valentin Emiya (TELECOM ParisTech - ENST) (valentin.emiya (at) enst_fr)<br><br />
11. Chuan Cao and Ming Li (ThinkIT Lab., IOA), ccao <at> hccl.ioa.ac.cn, mli <at> hccl.ioa.ac.cn <br><br />
12. Michael Groble (mg2467@columbia.edu)<br>

2008:Audio Music Mood Classification
<hr />
=2008 AMC EVALUATION SCENARIO OVERVIEW=<br />
This section clarifies what will happen for this year's run of the Audio Mood Classification (AMC) task.<br />
<br />
# We will operate the AMC task as a classic train-test classification task.<br />
# We will n-fold the runs with n to be determined by the size of the final data set, number of participants, etc.<br />
# We will hand-craft the n-fold test-train split lists.<br />
# We will NOT be doing post-run human mood judgments this year using the Evalutron 6000. <br />
# Audio files: 30 sec., 22kHz, mono, 16 bit<br />
<br />
Do take a look at the [[2008:Audio Genre Classification]] task wiki as we are basing the underlying structure of this task on Audio Genre. In fact, an Audio Genre submission should work out of the box with Audio Mood Classification. Note: we really want folks to do a FEATURE EXTRACTION phase first against all the files and then have these features cached some place for re-use during the TRAIN-TEST phase. This way we can really speed up the n-fold processing. Thus, like GENRE, we need to pass three input files to your algos:<br />
<br />
==== 1. Feature extraction list file ====<br />
The list file passed for feature extraction will be a simple ASCII list <br />
file. This file will contain one path per line with no header line.<br />
<br />
==== 2. Training list file ====<br />
The list file passed for model training will be a simple ASCII list <br />
file. This file will contain one path per line, followed by a tab character and <br />
the mood label, again with no header line. <br />
<br />
E.g. <example path and filename>\t<mood classification><br />
<br />
==== 3. Test (classification) list file ====<br />
The list file passed for testing classification will be a simple ASCII list <br />
file identical in format to the Feature extraction list file. This file will <br />
contain one path per line with no header line.<br />
<br />
==== Classification output files ====<br />
Participating algorithms should produce a simple ASCII list file identical in <br />
format to the Training list file. This file will contain one path per line, <br />
followed by a tab character and the MOOD label, again with no header line. <br />
E.g.:<br />
<example path and filename>\t<mood classification><br />
<br />
The path to which this list file should be written must be accepted as a <br />
parameter on the command line.<br />
<br />
== Participants ==<br />
If you think there is a slight chance that you might consider participating, please add your name and email address here.<br />
<br />
# Haiba Wang, haiba_access at yahoo.cn<br />
# IMIRSEL, xiaohu@illinois.edu<br />
# Michael Mandel, mim (at) ee.columbia.edu<br />
# Geoffroy Peeters, peeters (at) ircam.fr<br />
<br />
== Introduction ==<br />
In music psychology and music education, the emotion component of music has been recognized as the element most strongly associated with music expressivity (e.g. Juslin et al. 2006 [[#Related Papers]]). Music information behavior studies (e.g. Cunningham, Jones and Jones 2004; Vignoli 2004; Cunningham, Bainbridge and Falconer 2006 [[#Related Papers]]) have also identified music mood/emotion as an important criterion used by people in music seeking and organization. Several experiments have been conducted in the MIR community to classify music by mood (e.g. Lu, Liu and Zhang 2006; Pohle, Pampalk and Widmer 2005; Mandel, Poliner and Ellis 2006; Feng, Zhuang and Pan 2003 [[#Related Papers]]). Please note: the MIR community tends to use the word "mood" while music psychologists prefer "emotion". We follow the MIR tradition and use "mood" hereafter. <br />
<br />
However, evaluation of music mood classification is difficult because music mood is a very subjective notion. Each of the aforementioned experiments used different mood categories and different datasets, making comparisons with previous work virtually impossible. A contest on music mood classification in MIREX will help build the first community-available test set and valuable ground truth.<br />
<br />
Music mood classification was first evaluated in MIREX 2007. There are many issues involved in this evaluation task, so let us continue discussing them on this wiki. If needed, we will set up a mailing list devoted to the discussion.<br />
<br />
== Mood Categories ==<br />
<br />
The IMIRSEL has derived a set of 5 mood clusters from the AMG mood repository (Hu & Downie 2007 [[#Related Papers]]). The mood clusters effectively reduce the diverse mood space into a tangible set of categories, yet remain rooted in the social-cultural context of pop music. Therefore, we propose to use the 5 mood clusters as the categories in this year's audio mood classification contest. Each of the clusters is a collection of the AMG mood labels which collectively define the cluster: <br />
<br />
*Cluster_1: passionate, rousing, confident, boisterous, rowdy <br />
*Cluster_2: rollicking, cheerful, fun, sweet, amiable/good natured <br />
*Cluster_3: literate, poignant, wistful, bittersweet, autumnal, brooding <br />
*Cluster_4: humorous, silly, campy, quirky, whimsical, witty, wry <br />
*Cluster_5: aggressive, fiery, tense/anxious, intense, volatile, visceral <br />
<br />
At this moment, the IMIRSEL and Cyril Laurier at the Music Technology Group of Barcelona have manually validated the mood clusters and exemplar songs in each cluster. Please see [[#Exemplar Songs in Each Category]] for details. <br />
<br />
We are still seeking additional songs across different genres to enrich this set. During this process, the cluster with the least cross-listener consistency may be dropped, or two clusters that are often confused with each other may be combined. <br />
<br />
== Exemplar Songs in Each Category == <br />
Exemplar songs for each mood cluster are manually selected by multiple human assessors. The purpose is to further clarify the perceptual identities of the mood clusters.<br />
<br />
There are 190 candidate songs in the intersection of the AMG mood repository and the USPOP collection in IMIRSEL, and each of these songs has a single unanimous mood cluster label assigned by AMG editors. The mood labels by AMG editors are an important benchmark that can help us reach cross-listener consistency on such a subjective task. So far, 6 human assessors have listened to the 190 songs and assigned cluster labels to them. 50 songs are unanimously labeled by all 6 human assessors, 42 songs are unanimously labeled by 5 of the 6 human assessors, and another 40 songs by 4 of the 6 human assessors. <br />
<br />
The advantages of the exemplar songs are twofold: 1. they will help people better understand what kind of mood each cluster refers to; 2. they can possibly be taken as training data for the algorithms (see the section [[#Training Set]]). <br />
<br />
Note: Lyrics issue: when labeling the songs, the human assessors were asked to ignore lyrics. As this contest focuses on music audio, lyrics should not be taken into consideration. <br />
<br />
== Two Evaluation Scenarios ==<br />
<br />
1. Evaluation on a closed groundtruth set.<br />
As in traditional classification problems, both training and testing data are labeled well before the contest. <br />
Pros: evaluation metrics are more rigorous and support cross-validation. <br />
Cons: the training/testing set is limited.<br />
<br />
2. Training on a labeled set, but testing on an unlabeled audio pool. <br />
As in the audio similarity and retrieval contest, each algorithm returns a list of candidates in each mood category, then human assessors make judgments on the returned candidates. <br />
Pros: the testing pool can be arbitrarily big; the training set is bigger as well (it can be the whole groundtruth set from scenario 1). <br />
Cons: innovative but limited evaluation metrics (see below).<br />
<br />
For both scenarios, this is a single-label classification contest, and thus each song can only be classified into one mood cluster. <br />
<br />
'''We will go for scenario 1'''<br />
<br />
== Groundtruth Set ==<br />
<br />
The IMIRSEL is preparing a ground-truth set of audio clips selected from the USPOP collection described above and the APM collection (www.apmmusic.com). The bibliographic information of the exemplar songs has been released above to help participants reach agreement on the meanings of the mood categories.<br />
<br />
The APM audio set has been pre-labeled with the 5 mood clusters according to the metadata provided by APM, and covers a variety of genres: each category covers about 7 major genres (with 20-30 tracks each) and a few minor genres. To make the problem more interesting, the distribution among major genres within each category is made as even as possible. <br />
<br />
To make sure the mood labels are correct, this APM audio collection will be subject to human validation before the contest. We prepared a set of 1250 audio clips (250 per category). Audio clips whose mood category assignments reach agreement among 2 out of 3 human assessors will serve as the ground truth set. We are aiming for at least 120 audio clips in each mood category. <br />
<br />
After the human validation on this audio set, participating algorithms/ models will be trained and tested within IMIRSEL.<br />
<br />
'''Audio format: 30 second clips, 22.05kHz, mono, 16bit, WAV files''' <br />
<br />
=== Human Validation ===<br />
Subjective judgments by human assessors will be collected for the above-mentioned APM audio set using a web-based system, Evalutron 6000, developed by the IMIRSEL. <br />
<br />
Each audio clip is 30 seconds long, and 3 human judges will listen to it and choose which mood category it belongs to. If 2 of the 3 judges agree on its category, the clip will be selected into the groundtruth set.<br />
<br />
== Evaluation Metrics == <br />
<br />
Metrics frequently used in classification problems include accuracy, precision, recall and F-measure (combining precision and recall). The single most important metric is accuracy, which allows direct system comparison: <br />
<br />
''Accuracy = # of correctly classified songs / # of all songs'' <br />
<br />
Accuracy can be calculated over all clusters pooled together (micro average) or for each cluster separately and then averaged across clusters (macro average).<br />
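<br />
The sketch below (not the official IMIRSEL scoring code; the clip-to-label dictionaries are assumed inputs) illustrates how the pooled and per-cluster averages can be computed:<br />
<br />
<pre><br />
# Minimal accuracy sketch: `truth` and `predicted` map clip paths to cluster labels.<br />
from collections import defaultdict<br />
<br />
def micro_accuracy(truth, predicted):<br />
    # pooled over all clips<br />
    correct = sum(1 for clip, label in truth.items() if predicted.get(clip) == label)<br />
    return correct / len(truth)<br />
<br />
def macro_accuracy(truth, predicted):<br />
    # per-cluster accuracy first, then averaged over clusters<br />
    per_cluster = defaultdict(lambda: [0, 0])  # cluster -> [correct, total]<br />
    for clip, label in truth.items():<br />
        per_cluster[label][1] += 1<br />
        if predicted.get(clip) == label:<br />
            per_cluster[label][0] += 1<br />
    return sum(c / t for c, t in per_cluster.values()) / len(per_cluster)<br />
</pre><br />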
<br />
The significance of differences among systems will be tested, possibly using:<br />
<br />
*a) McNemar's test <br />
<br />
McNemar's test (Dietterich, 1997) is a statistical procedure that can assess the significance of differences between two classifiers. It was used in the Audio Genre Classification and Audio Artist Identification contests in MIREX 2005. <br />
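<br />
As a rough illustration (not the evaluation's actual code), McNemar's statistic can be computed from two systems' per-song correctness vectors:<br />
<br />
<pre><br />
# Minimal sketch of McNemar's test; inputs are hypothetical boolean vectors<br />
# indicating whether each system classified each song correctly.<br />
from scipy.stats import chi2<br />
<br />
def mcnemar(correct_a, correct_b):<br />
    only_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)  # A right, B wrong<br />
    only_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)  # B right, A wrong<br />
    n = only_a + only_b<br />
    if n == 0:<br />
        return 0.0, 1.0<br />
    stat = (abs(only_a - only_b) - 1) ** 2 / n   # chi-square with continuity correction<br />
    return stat, chi2.sf(stat, df=1)             # statistic and p-value (1 d.o.f.)<br />
</pre><br />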
<br />
*b) Friedman's test<br />
<br />
Friedman's test is used to detect differences in treatments across multiple test attempts (http://en.wikipedia.org/wiki/Friedman_test). It was used in the Audio Similarity, Audio Cover Song, and Query by Singing/Humming contests in MIREX 2006. <br />
<br />
In addition, run time can be recorded and compared.<br />
<br />
== Important Dates ==<br />
<br />
* Human Validation for Groundtruth Set: August 1 - August 15<br />
* Algorithm Submission Deadline: August 25<br />
<br />
== Packaging your Submission ==<br />
* Be sure that your submission follows the [[#Submission_Format]] outlined below.<br />
* Be sure that your submission accepts the proper [[#Input_File]] format<br />
* Be sure that your submission produces the proper [[#Output_File]] format<br />
* Be sure to follow the [[2006:Best_Coding_Practices_for_MIREX]]<br />
* Be sure to follow the [[2008:MIREX 2008 Submission Instructions]] <br />
* In the README file that is included with your submission, please answer the following additional questions:<br />
** Approximately how long will the submission take to process ~1000 wav files?<br />
** Approximately how much scratch disk space will the submission need to store any feature/cache files?<br />
** Any special notices regarding running your algorithm<br />
<br />
Note that the information that you place in the README file is '''extremely''' important in ensuring that your submission is evaluated properly.<br />
<br />
== Submission Format ==<br />
A submission to the Audio Music Mood Classification evaluation is expected to follow the [[2006:Best_Coding_Practices_for_MIREX]] and must conform to the following for execution:<br />
<br />
=== One Call Format ===<br />
The one call format is appropriate for systems that perform all phases of the classification (typically features extraction, training and testing) in one step. A submission should be an executable program that takes 4 arguments: <br />
* path/to/fileContainingListOfTrainingAudioClips - the path to the list of training audio clips (see [[#File Formats]] below)<br />
* path/to/fileContainingListOfTestingAudioClips - the path to the list of testing audio clips (see [[#File Formats]] below)<br />
* path/to/cacheDir - a directory where the submission can place temporary or scratch files. Note that the contents of this directory can be retained across runs, so if, for whatever reason, the submission needs to be restarted, the submission could make use of the contents of this directory to eliminate the need for reprocessing some inputs.<br />
* path/to/output/Results - the file where the output classification results should be placed. (see [[#File Formats]] below)<br />
<br />
'''Example:'''<br />
<br />
<pre><br />
<br />
doAMC "path/to/fileContainingListOfTrainingAudioClips" "path/to/fileContainingListOfTestingAudioClips" "path/to/cacheDir" "path/to/output/Results" <br />
<br />
</pre><br />
<br />
<br />
=== Two Call Format ===<br />
The two call format is appropriate for systems that perform the training and testing separately. A submission should consist of two executable programs:<br />
*trainAMC - this takes 3 arguments: <br />
** path/to/fileContainingListOfTrainingAudioClips - the path to the list of training audio clips (see [[#File Formats]] below)<br />
** path/to/trainingCacheDir - a directory where the submission can place temporary or scratch files. Note that the contents of this directory can be retained across runs, so if, for whatever reason, the submission needs to be restarted, the submission could make use of the contents of this directory to eliminate the need for reprocessing some inputs.<br />
** path/to/trainedClassificationModel - the file where the classification model should be placed<br />
*testAMC - this takes 4 arguments:<br />
** path/to/trainedClassificationModel<br />
** path/to/fileContainingListofTestingAudioClips - the path to the list of testing audio clips (see [[#File Formats]] below)<br />
** path/to/testingCacheDir - a directory where the submission can place temporary or scratch files. <br />
** path/to/output/Results - the file where the output classification results should be placed. (see [[#File Formats]] below)<br />
<br />
'''Example:'''<br />
<br />
<pre><br />
<br />
trainAMC "path/to/fileContainingListOfTrainingAudioClips" "path/to/trainingcacheDir" "path/to/trainedClassificationModel" <br />
testAMC "path/to/trainedClassificationModel" "path/to/fileContainingListofTestingAudioClips" "path/to/testingCacheDir" "path/to/output/Results"<br />
<br />
</pre><br />
<br />
=== Matlab format ===<br />
<br />
Matlab will also be supported in the form of functions in the following formats:<br />
<br />
==== Matlab One call format ====<br />
<pre><br />
doMyMatlabAMC('path/to/fileContainingListOfTrainingAudioClips','path/to/fileContainingListOfTestingAudioClips','path/to/cacheDir','path/to/output/Results')<br />
</pre><br />
<br />
<br />
==== Matlab Two call format ====<br />
<pre><br />
doMyMatlabTrainAMC('path/to/fileContainingListOfTrainingAudioClips','path/to/trainingcacheDir','path/to/trainedClassificationModel')<br />
doMyMatlabTestAMC('path/to/trainedClassificationModel','path/to/fileContainingListofTestingAudioClips','path/to/testingCacheDir','path/to/output/Results')<br />
</pre><br />
<br />
== File Formats ==<br />
<br />
=== Input Files ===<br />
<br />
The input training list file format will be of the form: <br />
<br />
<pre><br />
path/to/training/audio/file/000001.wav\tCluster_3<br />
path/to/training/audio/file/000002.wav\tCluster_5<br />
path/to/training/audio/file/000003.wav\tCluster_2<br />
...<br />
path/to/training/audio/file/00000N.wav\tCluster_1<br />
</pre><br />
<br />
"\t" stands for tab.<br />
<br />
The input testing list file format will be of the form: <br />
<br />
<pre><br />
path/to/testing/audio/file/000010.wav<br />
path/to/testing/audio/file/000020.wav<br />
path/to/testing/audio/file/000030.wav<br />
...<br />
path/to/testing/audio/file/0000N0.wav<br />
</pre><br />
<br />
"\t" stands for tab.<br />
<br />
=== Output File ===<br />
The only output will be a file containing classification results in the following format: <br />
<br />
<pre><br />
Example Classification Results 0.1 (replace this line with your system name)<br />
path/to/testing/audio/file/000010.wav\tCluster_3<br />
path/to/testing/audio/file/000020.wav\tCluster_1<br />
path/to/testing/audio/file/000030.wav\tCluster_5<br />
...<br />
path/to/testing/audio/file/0000N0.wav\tCluster_2<br />
</pre><br />
<br />
"\t" indicates tab. All audio clips should have one and only one mood cluster label.<br />
<br />
==Evaluation Scenario 2==<br />
<br />
=== Training Set ===<br />
<br />
Under evaluation scenario 2, the training set would be the whole ground truth set in scenario 1 (see [[#Groundtruth Set]]).<br />
<br />
=== Unlabeled Song Pool ===<br />
Under evaluation scenario 2, the pool of testing audio to be classified is drawn from the same collections as the training set, i.e. USPOP and APM. We will make sure the audio covers a variety of genres in each mood cluster, which will make the contest harder and more interesting.<br />
<br />
We will randomly select a certain number (say, 1000) of songs from the collections as the audio pool. This number should make the contest interesting enough, but not too hard. And the songs need to cover all 5 mood clusters.<br />
<br />
=== Classification Results ===<br />
Each algorithm will return the top X songs in each cluster. <br />
<br />
This is a single-label classification contest, and thus each song can only be classified into one mood cluster. <br />
<br />
Note: unlike traditional classification problems where all testing samples have ground truth available, this scenario does not have a well-labeled testing set. Instead, we use a "pooling" approach as in TREC and last year's audio similarity and retrieval contest. This approach collects the top X results from each algorithm and asks human assessors to make judgments on this set of collected results, while assuming all other samples are irrelevant or incorrect. This approach cannot measure the absolute "recall" metric, but it is valid for comparing relative performances among participating algorithms. <br />
<br />
The actual value of X depends on the human assessment protocol and the number of available human assessors (see the next section, [[#Human Assessment]]).<br />
<br />
=== Human Assessment===<br />
Subjective judgments by human assessors will be collected for the pooled results using a web-based system, Evalutron6000, developed by the IMIRSEL. (An introduction to this part of Evalutron 6000 is given at [[2008:Evalutron6000_Walkthrough_For_Audio_Mood_Classification]].)<br />
<br />
==== How many judgments and assessors ====<br />
Each algorithm returns X songs for each of the 5 mood clusters. Suppose there are Y algorithms; in the worst case, each cluster will have X*Y songs to be judged, i.e. 5*X*Y songs in total. Suppose each song needs Z sets of ears; then there will be 5*X*Y*Z judgments in total. When making a judgment, a human assessor will listen to the 30-second clip of a song and label it with one of the 5 mood clusters. <br />
<br />
Human evaluators will be drawn from the participating labs and from volunteers from IMIRSEL or on the MIREX lists. Suppose we can get W evaluators; then each evaluator will make S = (5*X*Y*Z) / W judgments.<br />
<br />
At this moment, there are 10 potential participants on the Wiki, so let's say Y = 6. Suppose each candidate song will be evaluated by 3 judges (Z = 3), and suppose we can get 20 assessors (W = 20): <br />
<br />
*If X = 20, number of judgments for each assessor: S = 90<br />
*If X = 10, S = 45<br />
*If X = 30, S = 135 <br />
*If X = 50, S = 225<br />
*If X = 15, S = 67.5<br />
*…<br />
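<br />
A quick arithmetic check of these workload figures (using the assumed values Y = 6, Z = 3, W = 20 from above):<br />
<br />
<pre><br />
# S = (5*X*Y*Z) / W judgments per assessor, for the X values listed above.<br />
Y, Z, W = 6, 3, 20<br />
for X in (10, 15, 20, 30, 50):<br />
    S = 5 * X * Y * Z / W<br />
    print("X=%d -> S=%.1f" % (X, S))   # e.g. X=20 gives S=90.0<br />
</pre><br />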
<br />
In the audio similarity contest last year, each assessor made 205 judgments on average. As judging mood is trickier, we may need to place less burden on our assessors.<br />
<br />
To eliminate possible bias, we will try to distribute the candidates returned by each algorithm equally among the human assessors.<br />
<br />
=== Scoring ===<br />
Each algorithm is graded by the number of votes its candidate songs win from the judges. For example, if a song A is judged to be in Cluster_1 by 2 assessors and in Cluster_2 by 1 assessor, then an algorithm classifying A as Cluster_1 will score 2 on this song, while an algorithm classifying A as Cluster_2 will score 1 on this song. An algorithm's final score is the sum of its scores over all the songs it submits. Since each algorithm can only submit 100 songs, the one that wins the most votes from the judges wins the contest.<br />
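<br />
A minimal sketch of this vote-count scoring (the data structures are assumed for illustration, not prescribed by the task):<br />
<br />
<pre><br />
# `submission` maps each returned song to the cluster the algorithm assigned;<br />
# `judgments` maps each song to a dict of {cluster: number of assessor votes}.<br />
def algorithm_score(submission, judgments):<br />
    return sum(judgments.get(song, {}).get(cluster, 0)<br />
               for song, cluster in submission.items())<br />
</pre><br />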
<br />
=== Evaluation Metrics ===<br />
The algorithm score described in the last section is a metric that facilitates direct comparison. <br />
<br />
In addition, metrics frequently used in classification problems include accuracy, precision, recall and F measures (combining precision and recall). As mentioned above, the pooling approach only yields a relative recall measure; therefore, the single most important metric is accuracy: <br />
<br />
The original definition of accuracy is:<br />
''Accuracy = # of correctly classified songs / # of all songs.'' <br />
<br />
According to the above human assessment method, "correctly classified songs" in this scenario can be defined as songs classified according to the majority vote of the judges and, in the case of ties, songs classified as any of the tied votes. For example, suppose each song has 3 judges. If a song is labeled as Cluster_1 by at least 2 judges, then this song will be counted as correct for algorithms classifying it as Cluster_1; if a song is labeled once each as Cluster_1, Cluster_2 and Cluster_3, then this song will be counted as correct for algorithms classifying it as Cluster_1, Cluster_2 or Cluster_3. <br />
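<br />
The majority/tie rule can be sketched as follows (a hypothetical helper, not the official scoring code):<br />
<br />
<pre><br />
# A predicted cluster counts as correct if it received the (possibly tied)<br />
# maximum number of judge votes for that song.<br />
def is_correct(predicted_cluster, votes):<br />
    # votes: dict mapping cluster label -> number of judges who chose it<br />
    return bool(votes) and votes.get(predicted_cluster, 0) == max(votes.values())<br />
</pre><br />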
<br />
Accuracy can be calculated over all clusters pooled together (micro average) or for each cluster separately and then averaged across clusters (macro average).<br />
<br />
The significance of differences among systems will be tested, possibly using:<br />
<br />
*a) McNemar's test <br />
*b) Friedman's test<br />
<br />
In addition, run time can be recorded and compared.<br />
<br />
== Challenging Issues == <br />
# Mood changes within pieces: some pieces may start in one mood but end in another. <br />
<br />
We will use 30-second clips instead of whole songs. The clips will be extracted automatically from the middle of the songs, which is more likely to be representative.<br />
<br />
# Multi-label classification: it is possible that one piece has two or more correct mood labels, but as a start, we strongly suggest holding a less ambiguous contest and leaving that challenge to future MIREXes. So, for this year, this is a single-label classification problem.<br />
<br />
== Moderators ==<br />
* J. Stephen Downie (IMIRSEL, University of Illinois, USA) - [mailto:jdownie@uiuc.edu]<br />
* Xiao Hu (IMIRSEL, University of Illinois, USA) -[mailto:xiaohu@uiuc.edu]<br />
* Cyril Laurier (Music Technology Group, Barcelona, Spain) -[mailto:claurier@iua.upf.edu]<br />
<br />
== Related Papers ==<br />
#Dietterich, T. (1997). '''Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms'''. Neural Computation, 10(7), 1895-1924.<br />
#Hu, Xiao and J. Stephen Downie (2007). '''Exploring mood metadata: Relationships with genre, artist and usage metadata'''. Accepted in the Eighth International Conference on Music Information Retrieval (ISMIR 2007),Vienna, September 23-27, 2007.<br />
# Juslin, P.N., Karlsson, J., Lindström, E., Friberg, A. and Schoonderwaldt, E. (2006), '''Play It Again With Feeling: Computer Feedback in Musical Communication of Emotions'''. In Journal of Experimental Psychology: Applied 2006, Vol.12, No.2, 79-95.<br />
# [http://ismir2004.ismir.net/proceedings/p075-page-415-paper152.pdf Vignoli (ISMIR 2004)] '''Digital Music Interaction Concepts: A User Study'''<br />
# [http://ismir2004.ismir.net/proceedings/p082-page-447-paper221.pdf Cunningham, Jones and Jones (ISMIR 2004)] '''Organizing Digital Music For Use: An Examination of Personal Music Collections'''.<br />
# [http://ismir2006.ismir.net/PAPERS/ISMIR0685_Paper.pdf Cunningham, Bainbridge and Falconer (ISMIR 2006)] ''''More of an Art than a Science': Supporting the Creation of Playlists and Mixes'''.<br />
# Lu, Liu and Zhang (2006), '''Automatic Mood Detection and Tracking of Music Audio Signals'''. IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 1, JANUARY 2006 <br> Part of this paper appeared in ISMIR 2003 http://ismir2003.ismir.net/papers/Liu.PDF<br />
# [http://www.cp.jku.at/research/papers/Pohle_CBMI_2005.pdf Pohle, Pampalk, and Widmer (CBMI 2005)] '''Evaluation of Frequently Used Audio Features for Classification of Music into Perceptual Categories'''. <br> It separates "mood" and "emotion" as two classification dimensions, which are mostly combined in other studies.<br />
# [http://www.ee.columbia.edu/~dpwe/pubs/MandPE06-svm.pdf Mandel, Poliner and Ellis (2006)] '''Support vector machine active learning for music retrieval'''. Multimedia Systems, Vol.12(1). Aug.2006.<br />
# [http://doi.acm.org/10.1145/860435.860508 Feng, Zhuang and Pan (SIGIR 2003)] '''Popular music retrieval by detecting mood'''<br />
# [http://ismir2003.ismir.net/papers/Li.PDF Li and Ogihara (ISMIR 2003)] '''Detecting emotion in music'''<br />
# [http://pubdb.medien.ifi.lmu.de/cgi-bin//info.pl?hilliges2006audio Hilliges, Holzer, Klüber and Butz (2006)] '''AudioRadar: A metaphorical visualization for the navigation of large music collections'''. In Proceedings of the International Symposium on Smart Graphics 2006, Vancouver, Canada. <br> It summarizes implicit problems in traditional genre/artist based music organization.<br />
# Juslin, P. N., & Laukka, P. (2004). '''Expression, perception, and induction of musical emotions: A review and a questionnaire study of everyday listening'''. Journal of New Music Research, 33(3), 217-238.<br />
# [http://mpac.ee.ntu.edu.tw/~yihsuan/ Yang, Liu, and Chen (ACMMM 2006)] '''Music emotion classification: A fuzzy approach '''</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Melody_Extraction&diff=71722008:Audio Melody Extraction2010-06-07T19:08:32Z<p>IMIRSELBot: Robot: Automated text replacement (-\[http:\/\/(www.)?music-ir.org\/mirex\/200([5-9])\/index.php\/([^\s]+)( .+)?\] +200\2:\3)</p>
<hr />
<div>[This page is for now a pale copy/paste of the MIREX06 webpage: [[2006:Audio_Melody_Extraction]].]<br />
=Goal=<br />
To extract the melody line from polyphonic audio.<br />
<br />
The deadline for this task is AUGUST 22nd.<br />
<br />
=Description=<br />
The aim of the MIREX audio melody extraction evaluation is to identify the melody pitch contour from polyphonic musical audio. The task consists of two parts: voicing detection (deciding whether a particular time frame contains a "melody pitch" or not), and pitch detection (deciding the most likely melody pitch for each time frame). We structure the submission to allow these parts to be done independently, i.e. it is possible (via a negative pitch value) to guess a pitch even for frames that were judged unvoiced. Algorithms which do not discriminate between melodic and non-melodic parts are also welcome!<br />
<br />
(The audio melody extraction evaluation will be essentially a re-run of last year's contest, i.e. the same test data is used.)<br />
<br />
'''Dataset''':<br />
* MIREX05 database : 25 phrase excerpts of 10-40 sec from the following genres: Rock, R&B, Pop, Jazz, Solo classical piano<br />
* ISMIR04 database : 20 excerpts of about 20s each<br />
* CD-quality (PCM, 16-bit, 44100 Hz)<br />
* single channel (mono)<br />
* manually annotated reference data (10 ms time grid) <br />
<br />
'''Output Format''':<br />
* In order to allow for generalization among potential approaches (i.e. frame size, hop size, etc), submitted algorithms should output pitch estimates, in Hz, at discrete instants in time<br />
* the output file should therefore contain, on each line, the time stamp followed by a space or tab and the corresponding frequency value (see the sketch after this list)<br />
* the time grid of the reference file is 10 ms, yet the submission may use a different time grid as output (for example 5.8 ms)<br />
* Instants which are identified unvoiced (there is no dominant melody) can either be scored as 0 Hz or as a negative pitch value. If negative pitch values are given the statistics for Raw Pitch Accuracy and Raw Chroma Accuracy may be improved. <br />
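<br />
A minimal sketch of a writer for this output format (variable names are hypothetical; 0 or a negative value marks an unvoiced frame):<br />
<br />
<pre><br />
def write_melody(path, times, freqs):<br />
    # one line per analysis frame: time_in_seconds \t frequency_in_Hz<br />
    with open(path, "w") as f:<br />
        for t, f0 in zip(times, freqs):<br />
            f.write("%.6f\t%.3f\n" % (t, f0))<br />
<br />
# e.g. a 10 ms grid: write_melody("out.txt", [i * 0.01 for i in range(n)], f0s)<br />
</pre><br />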
<br />
'''Relevant Test Collections'''<br />
* For the ISMIR 2004 Audio Description Contest, the Music Technology Group of the Pompeu Fabra University assembled a diverse set of audio segments and corresponding melody transcriptions, including audio excerpts from such genres as Rock, R&B, Pop, Jazz, Opera, and MIDI. (full test set with the reference transcriptions (28.6 MB))<br />
* Graham's collection: the test set and further explanations can be found on the pages http://www.ee.columbia.edu/~graham/mirex_melody/ and http://labrosa.ee.columbia.edu/projects/melody/ <br />
<br />
=Potential Participants=<br />
* Jean-Louis Durrieu (TELECOM ParisTech, formerly ENST), durrieu@enst.fr<br />
* Pablo Cancela (pcancela@gmail.com)<br />
* Vishweshwara Rao and Preeti Rao (Indian Institute of Technology Bombay), vishu_rao@iitb.ac.in, prao@ee.iitb.ac.in<br />
* Karin Dressler (kadressler@gmail.com)<br />
* Matti Ryynänen and Anssi Klapuri (Tampere University of Technology), matti.ryynanen <at> tut.fi, anssi.klapuri <at> tut.fi<br />
* Chuan Cao and Ming Li (ThinkIT Lab., IOA), ccao <at> hccl.ioa.ac.cn, mli <at> hccl.ioa.ac.cn<br />
<br />
=RESULTS=<br />
The results for the audio melody extraction task are available on the following page: [[2008:Audio_Melody_Extraction_Results]]<br />
<br />
=JL's Comments 11/07/08=<br />
We propose to re-run the Audio Melody Extraction task this year. <br />
It was dropped last year, but since 2006 there has probably been other research on this topic. Anyone interested?<br />
<br />
=Vishu's comments 14/07/08=<br />
May I also suggest that we additionally have a separate evaluation for cases where the main melody is carried by the human singing voice as opposed to other musical instruments? I ask this for two reasons, the first being that for most popular music the melody is indeed carried by the human voice. The second reason is that, while our predominant-F0 detector is quite generic, our voicing detector is 'tuned' to the human voice and so less likely to perform well for other instruments.<br />
<br />
=JL's Comments 15/07/08=<br />
Concerning the vocal/non-vocal distinction: this has been done in previous evaluations of audio melody extraction (see https://www.music-ir.org/mirex/2006/index.php/Audio_Melody_Extraction_Results for the results of the MIREX06 task).<br />
I guess separated results for vocal and vocal+non-vocal should be possible once again.<br />
<br />
I had another concern: does anyone know of some extra corpus? It would be nice to have some more material to test the algorithms. Maybe some more classical excerpts? Does anyone know a way to obtain such data, I mean, with a separated track of the main melody so that the work can be half done by some automatic algorithm?<br />
<br />
=Vishu's comments : Multi-track Audio available 22/07/08=<br />
We are in possession of about 4 min 15 sec of Indian classical vocal performances with separated tracks of the main melody. For a 10 ms hop, there are about 21000 vocal frames. Would this data be of interest?<br />
<br />
=Karin's comments 22/07/08=<br />
Hi Vishu and others! Any new data is appreciated - and a classical Indian performance would definitely add an interesting new genre :-) I have only made minor changes to my own melody extraction algorithm since I have shifted my priorities to midi note estimation (onset/offset and tone height) of the melody voice. Anyway, I am interested in a new evaluation of my algorithm.<br />
I know that the ISMIR 2004 dataset has annotated midi notes available. Maybe we could also evaluate the extracted midi melody notes - at least for this data set! Is there anyone else interested in this evaluation?<br />
<br />
=JL's Comments 30/07/08=<br />
Hi everyone!<br />
<br><br />
A few comments...<br />
<br><br />
To Vishu: could you upload anything to Mert? I would also like to know how you annotated the data. The people who did the groundtruth for ISMIR2004 (E. Gomez in particular) told me that they used 46.44ms long windows (for a 44.1kHz sampling rate, that's 2048 samples, hence the "strange" number), with a 5.8ms hopsize. This groundtruth was modified by Andreas (Ehmann) such that the hopsize became 10ms in MIREX05. <br />
<br><br />
The groundtruth for both collections gives as the first column the time stamp of the _center_ of the window (at least, that's what they did for ISMIR04), and as the second column the corresponding frequency in Hz.<br />
<br><br />
To Karin: It's nice to see former participants coming back and risking their algorithms on the same task! I think that's also rather important for further studies: that way, we can directly compare ourselves to the state of the art!<br />
=Vishu's Comments 04/08/08=<br />
Sorry for the delay, but I was travelling for a bit. I just uploaded our data to Mert. The ground truth format is the same as MIREX05, except that instead of every 10ms we generate ground truth values every 10.023ms. This is because our data is sampled at 22.05 kHz and 10ms corresponds to 220.5 samples at that sampling frequency. So this had to be rounded off to a hop of 221 samples (10.023ms).<br />
<br><br />
Regarding the window size for ground truth generation, for each of the four excerpts we used a window length that results in a main lobe width that is reliably able to resolve adjacent harmonics of the lowest expected F0 (known a priori) for that excerpt.<br />
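<br />
For clarity, the arithmetic behind the 10.023 ms hop can be checked with a few lines (plain Python, just restating the numbers above):<br />
<br />
<pre><br />
fs = 22050                          # sampling rate in Hz<br />
exact = 0.010 * fs                  # 10 ms = 220.5 samples<br />
hop = 221                           # rounded to a whole number of samples<br />
print(exact, 1000.0 * hop / fs)     # -> 220.5, 10.0226... ms (quoted as 10.023 ms)<br />
</pre><br />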
<br />
=JL's Comments 05/08/08=<br />
Hi Vishu, hi all !<br />
<br><br />
I was wondering if it would be possible to include some of our test set in the development set, so that we know what it is about. Maybe some excerpts of 30s each? Do you think that would be feasible?<br />
<br><br />
I am not sure about what you say for the window length... Could you be more precise? I was lately struggling a little bit with the multiF0 dev set, which led me to notice that the groundtruth sequences for the instruments were not completely aligned... I think for the sake of comparison that we should opt for a given window length that all the participants will use. In ISMIR04, that window length was 46.44 ms, which gave 2048 samples @ 44100Hz. This value seems reasonable to me, even though it might look rather long for our purpose (the pitch can evolve rather fast and even during 50ms, one can "see" this effect on the spectrogram of a _small_ chirp, where the lobes of the higher peaks - in frequency - are wider than those of the lower peaks). Most of the groundtruth was generated with windows this size, so I guess it would make more sense if everyone used this size. It might of course not be optimal in some ways, especially if one uses other representations (CQ transform for instance), in which case the participant would be penalized, even if that could lead to better results. Anyway! what do you all think about having one window size for all the participants?<br />
<br><br />
A last thing (for today :D): maybe we should convert all the files to the same sampling rate, for the sake of simplicity? Of course one can do it online, with Matlab's (bad) resample function. That, again, is about comparing the systems and just them: one should get rid of the extra processing needed (like the resampling step). Should we agree on a specific sampling rate for all the songs?<br />
<br />
=Mert's Comments 05/08/08=<br />
<br />
Hi everyone, thanks for writing your comments. JL, we appreciate your data set also. The deadline for this task will be August 22nd. Yes, Vishu uploaded the data. It consists of human singing, a background instrument and a percussive instrument. I'll reinterpolate the ground truth to match the 10ms hop size and also upsample it to 44.1 kHz. I can also recreate the ground truth using YIN/WaveSurfer/Praat with a 10ms hopsize and a 46ms window at 44.1 kHz if you want.<br />
<br />
=Vishu's Comments 07/08/08=<br />
Hi JL and others!<br />
As far as I understand, the window length and alignment of ground truth values<br />
are independent. The alignment would depend on the hop size and nothing<br />
else. <Br><br />
Regarding the window length, ideally for the ground truth computation the<br />
shortest possible window around an analysis time instant should be used in<br />
order to be robust to fast pitch modulations. The best option is to have a<br />
pitch-adaptive window. I would think that this would make your<br />
ground-truth all the more 'truthful', especially since the ground truth<br />
computation is also making use of some PDAs (YIN, PRAAT etc.). If this is<br />
the case then I do not think it would be fair to impose a standard<br />
window-length on all participants, since this might negatively affect their<br />
algorithm performance.<Br><br />
For the ground-truth values for our Indian music dataset, we have used<br />
shorter windows (23 ms) for female singers and longer (46 ms) windows for<br />
male singers. This reduces the effect of the faster (Hz/sec) modulations<br />
of the female singers, since they generally have higher pitch. <Br><br />
However, if the ground-truth values themselves are being extracted using<br />
some fixed analysis window length (eg. 46 ms) then I think it would be in<br />
the participants' best interests to use the same window length for their<br />
analysis.<br />
<br />
=JL's Comments 11/08/08=<br />
''the window length and alignment of ground truth values are independant. The alignment would depend on the hop size and nothing else.'':<br />
Once you have the hopsize, I agree that the alignment is straightforward... but only given a certain offset that, in my opinion, depends on the window size - that's really just a matter of aligning the first window. At least, that is relevant to the way we annotate the groundtruth in MIREX, if I understand correctly.<br />
<Br><br />
For your database, does it mean that the time at the center of the first window, for the female sung excerpts, is 11.5ms, while it is 23ms for the male sung ones? I guess we just need to know that so that we can evaluate accordingly.<br />
<br><br />
I would say the difference in window lengths for the male and female excerpts primarily helps to get a better resolution in frequency, the "tracking" ability of the groundtruth being more related to the hopsize you choose. As I understand it, what you mean is that the approximation that the pitch is constant within one analysis window is less false if the windows are small. I guess we just need a trade-off (a window size of 46ms seems right to me, but 23ms isn't bad either!) between this approximation and the precision in estimating the f_0 in the window.<br />
<br><br />
I think we can think of two possible scenarios for the analysis windows: one in which the f_0 is constant, and another in which the f_0 varies. For the first one, I'd say, no problem to annotate. For the second type, I would say the most "human" way of annotating it would be to choose the "mean" of the fundamental frequencies that are present. I may be talking about silly things here, but I was wondering whether other people had been thinking about that... If we wanted to annotate those frames correctly, we should give the instantaneous frequency, with the associated instantaneous time, and also give the slope (say the first-order derivative of the instantaneous frequency), which is what people sometimes want to estimate. Giving only one f_0 transforms the problem: check the opera excerpts from the ISMIR04 database, with their deep vibrato; on "transition" frames, the FFT is clearly different from a perfect "spectral comb", as would be the case with a constant f_0. Defining the f_0 for such frames as the maximum of the first lobe (first harmonic) may seem natural, but that is yet another convention.<br />
<br><br />
Another interesting point with those opera excerpts: some of the high frequency components on male performance have variations almost as fast as the ones for the female performances. That means that even if the fundamental frequencies for the male performers do not evolve as fast as for the female performers, their log_2 variations actually are quite close to the latter ones. And since the evaluation criteria are based on the musical scale, that remark has its importance, I think.<br />
<br><br />
But again, for our purpose, I guess the way the annotations have been done is more than sufficient! And new data for evaluation is always welcome! Thank you again for your effort!<br />
= Vishu's comments 12/08/08=<br />
Hi all! <Br><br />
I apologise if I was not clear enough before. When I said that "The alignment would depend on the hop size and nothing else.", I assumed that the center of the first analysis window is at 0 sec. This means that irrespective of window length, the window centers would always be at the same time instants i.e. 0, 10ms, 20ms...<Br><br />
On observing the ground truth files for the ISMIR 2004 testing dataset and the MIREX 2005 training dataset, this seems to be the convention they too follow since the time-stamp of the first ground truth value is 0 sec, which should correspond to the center of the first analysis window. This is the convention that we have followed for our Indian music data. Hope this clarifies things.</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2009:SpecialTagatuneEvaluation&diff=71712009:SpecialTagatuneEvaluation2010-06-07T19:07:59Z<p>IMIRSELBot: Robot: Automated text replacement (-\[http:\/\/(www.)?music-ir.org\/mirex\/200([5-9])\/index.php\/([^\s]+)( .+)?\] +200\2:\3)</p>
<hr />
<div>==Special Tagatune Submission System Open==<br />
<br />
Before submitting your system please read the MIREX submission instructions:<br />
<br />
[[2009:MIREX_2009_Submission_Instructions]] <br />
<br />
We will be using the same system as used for conventional MIREX evaluations, which can be accessed at:<br />
<br />
[https://www.music-ir.org/evaluation/MIREX/submission/ https://www.music-ir.org/evaluation/MIREX/submission/]<br />
<br />
== '''What is Tagatune?''' ==<br />
<br />
Tagatune is a two-player game designed to extract information about music. In each round of the game, two players are each shown a song; they are either shown the same song or two different songs. Each player describes his given song by typing in any number of tags, which are immediately revealed to the partner. After reviewing each other's tags, the players must each decide whether they have been given the same piece of music as their partner. After both players have voted, the game reveals the true answer (whether the songs given to the pair of players are the same or different) and prepares the next round. Tagatune is live at [http://www.gwap.com/gwap/gamesPreview/tagatune/ www.gwap.com]<br />
<br />
http://www.cs.cmu.edu/~elaw/tagatune.jpg<br />
<br />
Since Tagatune is a two-player game, when no partner is available for a player, a bot (a computer program or algorithm) is instituted to play against that player. In each round of the game, the bot generates a set of appropriate tags for a song and reveals these tags to the player. The player then decides his votes for same or different by comparing what he is listening to and the tags revealed by his bot partner. If the songs given to the bot and the player are identical, and the tags generated by the bot are accurate for the song, then the player will have a high probability of guessing correctly that the songs are the same. Otherwise, we would expect the player to make more mistakes in making this judgment. In short, the hypothesis is that better algorithms generate tags that are more fitting descriptions of songs, which in turn, allows players to have a higher chance of guessing correctly.<br />
<br />
== '''What is the goal of the MIREX Special Tagatune Evaluation?''' ==<br />
<br />
<br />
The goal of the MIREX Special Tagatune Evaluation competition is to investigate a new method of evaluating music tagging algorithms, by using them as bots in Tagatune, and measuring the number of mistakes players make in guessing whether they are listening to the same or different songs (we will call this the Tagatune metric) when paired against different algorithm bots. We are particularly interested in whether there is a statistical correlation between the ranking of the algorithms induced by the Tagatune metric versus the classical metrics used in MIREX. For the motivation behind this evaluation, see [http://www.cs.cmu.edu/~elaw/papers/ICML2008.pdf this paper].<br />
<br />
There are three main steps to this evaluation.<br />
<br />
'''Step 1: Algorithm to Tags'''<br />
<br />
All submitted algorithms will be <br />
<br />
(a) trained using the Tagatune training set and tested on the Tagatune test set, <br />
<br />
(b) trained using the MIREX 2008 training set (MajorMiner data) and tested on the Tagatune test set. <br />
<br />
The trained algorithm must generate a set of tags for each of the songs in the test set, and rank the tags in a particular order (e.g. by confidence, saliency, relevance etc). This part of the evaluation is very similar, if not identical, to the MIREX 2008 Audio Tag Classification task.<br />
<br />
'''Step 2: Tagatune Experiments'''<br />
<br />
These tags will subsequently be displayed to players of Tagatune in a controlled experiment as well as an internet-wide experiment. The number of mistakes players make in guessing whether the songs are same or different is recorded for each algorithm.<br />
<br />
'''Step 3: Ranking'''<br />
<br />
All submitted algorithms will receive two rankings:<br />
<br />
(1) ranking using the MIREX metrics<br />
<br />
(2) ranking using the Tagatune metric<br />
<br />
<br />
== '''The Tagatune Dataset''' ==<br />
<br />
<br />
The Tagatune training and test set consist of music clips that are 29 seconds long, and are associated with 6622 tracks, 517 albums and 270 artists. The genres include classical, new age, electronica, rock, pop, world, jazz, blues, metal, punk etc. The tags used in the experiments are each associated with more than fifty songs, where each song is associated with a tag by more than two players independently. The following table shows the minimum, maximum and average number of songs associated with any tags in the training set, test set and the complete set used in this evaluation.<br />
<br />
<br />
<table border=1><br />
<tr><br />
<td></td><br />
<td>Training Set</td><br />
<td>Test Set</td><br />
<td>Complete Set</td><br />
</tr><br />
<tr><br />
<td align="left">MIN</td><br />
<td align="center">18</td><br />
<td align="center">15</td><br />
<td align="center">50</td><br />
</tr><br />
<tr><br />
<td align="left">MAX</td><br />
<td align="center">2103</td><br />
<td align="center">3767</td><br />
<td align="center">5870</td><br />
</tr><br />
<tr><br />
<td align="left">AVG</td><br />
<td align="center">212</td><br />
<td align="center">288</td><br />
<td align="center">502</td><br />
</tr><br />
</table><br />
<br />
<br />
Number of samples in training set: 9598<br />
<br />
Number of samples in test set: 13194<br />
<br />
<br />
'''The following is a list of 160 tags found in the Tagatune dataset.'''<br />
<br />
<br />
<table><br />
<tr><td>no voice</td><td>singer</td><td>duet</td><td>hard rock</td></tr><br />
<tr><td>world</td><td>harpsichord</td><td>sitar</td><td>chorus</td></tr><br />
<tr><td>female opera</td><td>male vocal</td><td>vocals</td><td>clarinet</td></tr><br />
<tr><td>heavy</td><td>silence</td><td>beats</td><td>funky</td></tr><br />
<tr><td>no strings</td><td>chimes</td><td>foreign</td><td>no piano</td></tr><br />
<tr><td>horns</td><td>classical</td><td>female</td><td>spacey</td></tr><br />
<tr><td>jazz</td><td>guitar</td><td>quiet</td><td>no beat</td></tr><br />
<tr><td>banjo</td><td>electric</td><td>solo</td><td>violins</td></tr><br />
<tr><td>folk</td><td>female voice</td><td>wind</td><td>ambient</td></tr><br />
<tr><td>new age</td><td>synth</td><td>funk</td><td>no singing</td></tr><br />
<tr><td>middle eastern</td><td>trumpet</td><td>percussion</td><td>drum</td></tr><br />
<tr><td>airy</td><td>voice</td><td>repetitive</td><td>birds</td></tr><br />
<tr><td>strings</td><td>bass</td><td>harpsicord</td><td>medieval</td></tr><br />
<tr><td>male voice</td><td>girl</td><td>acoustic</td><td>loud</td></tr><br />
<tr><td>classic</td><td>string</td><td>drums</td><td>electronic</td></tr><br />
<tr><td>not classical</td><td>chanting</td><td>no violin</td><td>not rock</td></tr><br />
<tr><td>no guitar</td><td>organ</td><td>no vocal</td><td>talking</td></tr><br />
<tr><td>choral</td><td>weird</td><td>opera</td><td>fast</td></tr><br />
<tr><td>electric guitar</td><td>male singer</td><td>man singing</td><td>classical guitar</td></tr><br />
<tr><td>country</td><td>violin</td><td>electro</td><td>tribal</td></tr><br />
<tr><td>dark</td><td>male opera</td><td>no vocals</td><td>irish</td></tr><br />
<tr><td>electronica</td><td>horn</td><td>operatic</td><td>arabic</td></tr><br />
<tr><td>low</td><td>instrumental</td><td>trance</td><td>chant</td></tr><br />
<tr><td>strange</td><td>heavy metal</td><td>modern</td><td>bells</td></tr><br />
<tr><td>man</td><td>deep</td><td>fast beat</td><td>hard</td></tr><br />
<tr><td>harp</td><td>no flute</td><td>pop</td><td>lute</td></tr><br />
<tr><td>female vocal</td><td>oboe</td><td>mellow</td><td>orchestral</td></tr><br />
<tr><td>light</td><td>piano</td><td>celtic</td><td>male vocals</td></tr><br />
<tr><td>orchestra</td><td>eastern</td><td>old</td><td>flutes</td></tr><br />
<tr><td>punk</td><td>spanish</td><td>sad</td><td>sax</td></tr><br />
<tr><td>slow</td><td>male</td><td>blues</td><td>vocal</td></tr><br />
<tr><td>indian</td><td>india</td><td>woman</td><td>woman singing</td></tr><br />
<tr><td>rock</td><td>dance</td><td>piano solo</td><td>guitars</td></tr><br />
<tr><td>no drums</td><td>jazzy</td><td>singing</td><td>cello</td></tr><br />
<tr><td>calm</td><td>female vocals</td><td>voices</td><td>techno</td></tr><br />
<tr><td>clapping</td><td>house</td><td>flute</td><td>not opera</td></tr><br />
<tr><td>not english</td><td>oriental</td><td>beat</td><td>upbeat</td></tr><br />
<tr><td>soft</td><td>noise</td><td>choir</td><td>female singer</td></tr><br />
<tr><td>rap</td><td>metal</td><td>hip hop</td><td>water</td></tr><br />
<tr><td>baroque</td><td>women</td><td>fiddle</td><td>english</td></tr><br />
</table><br />
<br />
<br />
'''NOTE''': An interesting effect of Tagatune is that we have collected many negative tags, which indicates the absence of an instrument (e.g. no piano, no guitar) or the genre that the song does not belong to (e.g. not classical, not rock). Participants of this evaluation might want to tailor their algorithms to take advantage of these negative tags that are not available on the MIREX 2008 dataset.<br />
<br />
== '''Submission Format''' ==<br />
<br />
<br />
The submission format is identical to the one for Audio Tag Classification task in MIREX 2008 except for the audio formats, detailed descriptions to be found here: https://www.music-ir.org/mirex/2008/index.php/Audio_Tag_Classification.<br />
<br />
<br />
== '''Audio Formats''' ==<br />
<br />
<br />
Participating algorithms will have to read audio in the following format:<br />
<br />
* Sample rate: 44 kHz<br />
<br />
* Sample size: 16 bit<br />
<br />
* Number of channels: 2 (stereo)<br />
<br />
* Encoding: WAV (decoded from MP3 files by IMIRSEL)<br />
<br />
* Duration: '''10 or 29''' second clips<br />
<br />
'''NOTE''': Participants should make sure that their algorithms can be trained on audio files that are of a certain duration, but then tested on audio files that are of a different duration. For example, in Step 1(b) of the evaluation, algorithms are trained on the 10s audio files from the MajorMiner dataset and tested on the 29s audio files from the Tagatune dataset.<br />
<br />
== '''Deadlines and Timeline''' ==<br />
<br />
<br />
Submission opening date: Dec 15, 2008<br />
<br />
Submission closing date: Jan 30, 2009<br />
<br />
== '''Organizers''' ==<br />
<br />
J. Stephen Downie<br />
<br />
Edith Law<br />
<br />
Kris West<br />
<br />
Michael Mandel<br />
<br />
Mert Bay<br />
<br />
Andreas F. Ehmann<br />
<br />
M. Cameron Jones<br />
<br />
== '''Results''' ==<br />
<br />
The results of the competition are detailed in the paper [http://www.cs.cmu.edu/~elaw/papers/ismir2009.pdf Evaluation of Algorithms Using Games: The Case of Music Tagging]. The detailed results (thanks to Kris West) are posted here: https://www.music-ir.org/mirex/2009/index.php/Audio_Tag_Classification_Tagatune_Results<br />
<br />
http://www.cs.cmu.edu/~elaw/papers/result1.JPG <br />
<br />
<br />
http://www.cs.cmu.edu/~elaw/papers/result2.JPG</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2009:Audio_Tag_Classification&diff=71702009:Audio Tag Classification2010-06-07T19:07:49Z<p>IMIRSELBot: Robot: Automated text replacement (-\[http:\/\/(www.)?music-ir.org\/mirex\/200([5-9])\/index.php\/([^\s]+)( .+)?\] +200\2:\3)</p>
<hr />
<div>== Overview ==<br />
<br />
This task will compare various algorithms' abilities to associate tags with 10-second audio clips of songs. The tags come from the [http://majorminer.org MajorMiner game]. This task is very much related to the other audio classification tasks. One new twist, however, is that many tags can apply to the same clip, so instead of one N-way classification per clip, this task requires N binary classifications per clip. <br />
<br />
__TOC__<br />
<br />
== Description ==<br />
<br />
The text of this section is copied from the 2008 page. Please add your comments and discussions for 2009. This proposal may be refined based on feedback from the participants.<br />
<br />
Audio tag classification was first run at MIREX 2008 [[2008:Audio_Tag_Classification]] and as a special MIREX task at 2009<br />
[[2009:SpecialTagatuneEvaluation]] . <br />
<br />
Below is the task description from MIREX 2008. <br />
Please feel free to edit this page.<br />
<br />
== New Mood Multi Tag Dataset for 2009 ==<br />
This dataset is derived from mood related tags on last.fm. All tags in this set are identified by a general affect lexicon (WordNet-Affect) and by human experts. Similar tags are grouped together to define a mood tag group and each song may belong to multiple mood tag groups.<br />
<br />
There are 18 mood tag groups containing 135 unique tags. The dataset contains 3,469 unique songs. The following table lists the tag groups, their member tags and number of songs in each group: <br />
<br />
{| class="wikitable" style="margin: 1em auto 1em auto"<br />
! Group id || Tags || num. of tags || num. of songs<br />
|-<br />
| G12 || calm, comfort, quiet, serene, mellow, chill out, calm down, calming, chillout, comforting, content, cool down, mellow music, mellow rock, peace of mind, quietness, relaxation, serenity, solace, soothe, soothing, still, tranquil, tranquility, tranquility || 25 || 1,680<br />
|-<br />
| G15 || sad, sadness, unhappy, melancholic, melancholy, feeling sad, mood: sad – slightly, sad song || 8 || 1,178<br />
|-<br />
| G5 || happy, happiness, happy songs, happy music, glad, mood: happy || 6 || 749<br />
|-<br />
| G32 || romantic, romantic music || 2 || 619<br />
|-<br />
| G2 || upbeat, gleeful, high spirits, zest, enthusiastic, buoyancy, elation, mood: upbeat|| 8 || 543<br />
|-<br />
| G16 || depressed, blue, dark, depressive, dreary, gloom, darkness, depress, depression, depressing, gloomy || 11 || 471<br />
|-<br />
| G28 || anger, angry, choleric, fury, outraged, rage, angry music || 7 || 254<br />
|-<br />
| G17 || grief, heartbreak, mournful, sorrow, sorry, doleful, heartache, heartbreaking, heartsick, lachrymose, mourning, plaintive, regret, sorrowful || 14 || 183<br />
|-<br />
| G14 || dreamy || 1 || 146<br />
|-<br />
| G6 || cheerful, cheer up, festive, jolly, jovial, merry, cheer, cheering, cheery, get happy, rejoice, songs that are cheerful, sunny || 13 || 142<br />
|-<br />
| G8 || brooding, contemplative, meditative, reflective, broody, pensive, pondering, wistful || 8 || 116<br />
|-<br />
| G29 || aggression, aggressive || 2 || 115<br />
|-<br />
| G25 || angst, anxiety, anxious, jumpy, nervous, angsty || 6 || 80<br />
|-<br />
| G9 || confident, encouraging, encouragement, optimism, optimistic || 5 || 61<br />
|-<br />
| G7 || desire, hope, hopeful, mood: hopeful || 4 || 45<br />
|-<br />
| G11 || earnest, heartfelt || 2 || 40<br />
|-<br />
| G31 || pessimism, cynical, pessimistic, weltschmerz, cynical/sarcastic || 5 || 38<br />
|-<br />
| G1 || excitement, exciting, exhilarating, thrill, ardor, stimulating, thrilling, titillating || 8 || 30<br />
|-<br />
| TOTAL || || 135 || 6,490 <br />
|}<br />
<br />
The songs are mostly from the USPOP collection, a detailed breakdown of the songs are listed in the following table: <br />
<br />
{| class="wikitable" style="margin: 1em auto 1em auto"<br />
! Collection || num. of songs in the dataset || percentage of songs in the dataset<br />
|-<br />
| USPOP || 2764 || 80%<br />
|-<br />
| Assorted pop || 366 || 10%<br />
|-<br />
| American music || 145 || 4%<br />
|-<br />
| Beatles || 128 || 4%<br />
|-<br />
| USCRAP || 40 || 1%<br />
|-<br />
| Metal music || 25 || 1%<br />
|-<br />
| Magnatune || 1 || 0%<br />
|-<br />
| TOTAL || 3469 || 100%<br />
|}<br />
<br />
Details on how the mood tag groups were derived are described in [https://www.music-ir.org/archive/papers/ISMIR2009_MoodClassification.pdf X. Hu, J. S. Downie, A.Ehmann, Lyric Text Mining in Music Mood Classification, In Proceedings of the 10th International Symposium on Music Information Retrieval (ISMIR), Oct. 2009, Kobe , Japan] <br />
<br />
Details on how the songs were selected are available in the [https://www.music-ir.org/archive/papers/Mood_Multi_Tag_Data_Description.pdf description].<br />
<br />
== Discussions for 2009 ==<br />
<br />
Your comments here<br />
<br />
I don't know where to put this but the Marsyas team plans to participate in this task if it <br />
is run this year (George Tzanetakis)<br />
<br />
== Discussion from 2008==<br />
<br />
It is possible for each tag to be treated as a completely separate classification problem. It is also possible to present the tags "all at once" for training, but then separately for testing. The former is a subset of the latter, and learning separate classifiers can be done inside any "all at once" classifier. The separate approach, however, has the nice property of being almost identical to the other audio classification tasks.<br />
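<br />
As a rough sketch of the "separate binary classifiers" view (the feature matrix X and binary tag matrix Y are placeholders, and scikit-learn is used only as an example, not a required toolkit):<br />
<br />
<pre><br />
import numpy as np<br />
from sklearn.multiclass import OneVsRestClassifier<br />
from sklearn.linear_model import LogisticRegression<br />
<br />
X = np.random.rand(200, 20)                        # per-clip audio features (placeholder)<br />
Y = (np.random.rand(200, 43) < 0.2).astype(int)    # one binary column per tag (placeholder)<br />
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)<br />
scores = clf.predict_proba(X)                      # per-clip, per-tag affinities in [0, 1]<br />
</pre><br />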
<br />
Possible ways of presenting training tags<br />
* One at a time<br />
* All at once<br />
<br />
'''Consensus:''' all at once training<br />
<br />
This task could also be run as a retrieval task using a metric like precision-at-10. It could also be evaluated as a classifier on unbalanced test sets with other metrics like area under the ROC curve or F-measure. The choice of metric would obviously change the types of evaluations that could be performed. The fact that there are no definite negative tags might make an evaluation with many examples more difficult.<br />
<br />
Possible evaluation metrics<br />
* Classification accuracy on a balanced dataset<br />
* Precision-at-K<br />
* Area under the ROC curve<br />
* F-measure<br />
<br />
'''Consensus:''' perform both retrieval and classification tasks. See Doug Turnbull's proposal below<br />
<br />
== Data ==<br />
<br />
All of the data is browseable via the [http://majorminer.org/search MajorMiner search] page.<br />
<br />
=== Music ===<br />
<br />
The music consists of 2300 clips selected at random from 3900 tracks. Each clip is 10 seconds long. The 2300 clips represent a total of 1400 different tracks on 800 different albums by 500 different artists. To give a sense for the music collection, the following genre tags have been applied to these artists, albums, and tracks on Last.fm: electronica, rock, indie, alternative, pop, britpop, idm, new wave, hip-hop, singer-songwriter, trip-hop, post-punk, ambient, jazz.<br />
<br />
=== Tags ===<br />
<br />
The MajorMiner game has collected a total of about 73000 taggings, 12000 of which have been verified by at least two users. In these verified taggings, there are 43 tags that have been verified at least 35 times, for a total of about 9000 verified uses. These are the tags we will be using in this task.<br />
<br />
Note that these data do not include strict negative labels. While many clips are tagged ''rock'', none are tagged ''not rock''. Frequently, however, a clip will be tagged many times without being tagged ''rock''. We take this as an indication that ''rock'' does not apply to that clip. More specifically, a negative example of a particular tag is a clip on which another tag has been verified, but the tag in question has not.<br />
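<br />
This definition of negative examples can be sketched as follows (the verified-tag mapping is an assumed data structure, not part of the released data format):<br />
<br />
<pre><br />
# `verified` maps each clip to the set of tags verified on it.<br />
def split_examples(tag, verified):<br />
    positives = [c for c, tags in verified.items() if tag in tags]<br />
    negatives = [c for c, tags in verified.items() if tags and tag not in tags]<br />
    return positives, negatives<br />
</pre><br />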
<br />
Here is a list of the top 50 tags along with an approximate number of times each has been verified, how many times it's been used in total, and how many different users have ever used it:<br />
<br />
{| class="wikitable" style="margin: 1em auto 1em auto"<br />
! Tag || Verified || Total || Users<br />
|-<br />
| drums || 962 || 3223 || 127 <br />
|-<br />
| guitar || 845 || 3204 || 181 <br />
|-<br />
| male || 724 || 2452 || 95 <br />
|-<br />
| rock || 658 || 2619 || 198 <br />
|-<br />
| synth || 498 || 1889 || 105 <br />
|-<br />
| electronic || 490 || 1878 || 131 <br />
|-<br />
| pop || 479 || 1761 || 151 <br />
|-<br />
| bass || 417 || 1632 || 99 <br />
|-<br />
| vocal || 355 || 1378 || 99 <br />
|-<br />
| female || 342 || 1387 || 100 <br />
|-<br />
| dance || 322 || 1244 || 115 <br />
|-<br />
| techno || 246 || 943 || 104 <br />
|-<br />
| piano || 179 || 826 || 120 <br />
|-<br />
| electronica || 168 || 686 || 67 <br />
|-<br />
| hip hop || 166 || 701 || 126 <br />
|-<br />
| voice || 160 || 790 || 55 <br />
|-<br />
| slow || 157 || 727 || 90 <br />
|-<br />
| beat || 154 || 708 || 90 <br />
|-<br />
| rap || 151 || 723 || 129 <br />
|-<br />
| jazz || 136 || 735 || 154 <br />
|-<br />
| 80s || 130 || 601 || 94 <br />
|-<br />
| fast || 109 || 494 || 70 <br />
|-<br />
| instrumental || 103 || 539 || 62 <br />
|-<br />
| drum machine || 89 || 427 || 35 <br />
|-<br />
| british || 81 || 383 || 60 <br />
|-<br />
| country || 74 || 360 || 105 <br />
|-<br />
| distortion || 73 || 366 || 55 <br />
|-<br />
| saxophone || 70 || 316 || 86 <br />
|-<br />
| house || 65 || 298 || 66 <br />
|-<br />
| ambient || 61 || 335 || 78 <br />
|-<br />
| soft || 61 || 351 || 58 <br />
|-<br />
| silence || 57 || 200 || 35 <br />
|-<br />
| r&b || 57 || 242 || 59 <br />
|-<br />
| strings || 55 || 252 || 62 <br />
|-<br />
| quiet || 54 || 261 || 57 <br />
|-<br />
| solo || 53 || 268 || 56 <br />
|-<br />
| keyboard || 53 || 424 || 41 <br />
|-<br />
| punk || 51 || 242 || 76 <br />
|-<br />
| horns || 48 || 204 || 38 <br />
|-<br />
| drum and bass || 48 || 191 || 50 <br />
|-<br />
| noise || 46 || 249 || 61 <br />
|-<br />
| funk || 46 || 266 || 90 <br />
|-<br />
| acoustic || 40 || 193 || 58 <br />
|-<br />
| trumpet || 39 || 174 || 68 <br />
|-<br />
| end || 38 || 178 || 36 <br />
|-<br />
| loud || 37 || 218 || 62 <br />
|-<br />
| organ || 35 || 169 || 46 <br />
|-<br />
| metal || 35 || 178 || 64 <br />
|-<br />
| folk || 33 || 195 || 58 <br />
|-<br />
| trance || 33 || 226 || 49 <br />
|}<br />
<br />
== Evaluation ==<br />
<br />
Participating algorithms will be evaluated with 3-fold cross-validation. Artist filtering will be used for the test and training splits, i.e. training and test sets will contain different artists. The raw classification accuracy and standard deviation for each tag and each algorithm will be computed. <br />
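<br />
One way to realize such artist-filtered folds (a sketch assuming parallel lists of clips and artist names; scikit-learn's GroupKFold is just an example, not the evaluation's actual tooling):<br />
<br />
<pre><br />
from sklearn.model_selection import GroupKFold<br />
<br />
def artist_filtered_folds(clips, artists, n_folds=3):<br />
    # GroupKFold keeps all clips by one artist inside a single fold,<br />
    # so train and test never share an artist.<br />
    for train_idx, test_idx in GroupKFold(n_splits=n_folds).split(clips, groups=artists):<br />
        yield ([clips[i] for i in train_idx], [clips[i] for i in test_idx])<br />
</pre><br />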
<br />
=== Beta-Binomial model ===<br />
<br />
In order to make the variance of the accuracy estimates the same for all tags, the same number of test examples must be used. This unnecessarily reduces the amount of test data, a property that can be avoided if we use the beta-binomial empirical Bayes estimator of accuracy. The basic idea of the model is that for each submission, it is possible to intelligently combine the overall performance with performance on each tag in proportion to the number of examples of each tag. Basically, performance on tags with more examples will matter more, and performance on tags with fewer examples will be "shrunk" towards the mean of all of the tags. The [http://en.wikipedia.org/wiki/Beta-binomial_model wikipedia] page is a bit sparse, but slightly informative.<br />
<br />
More specifically, the beta-binomial model treats performance on each tag as a binomial random variable, with the parameter of that binomial (the probability of success) drawn from a beta distribution. The parameters of the beta distribution will be estimated and will yield a mean and variance that can be used to compare algorithms. See Chapter 5 of [http://www.amazon.com/gp/product/158488388X Bayesian Data Analysis] by Gelman, Carlin, Stern, and Rubin for even more detail.<br />
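<br />
A rough sketch (not the evaluation's actual estimator) of the idea: fit a Beta(a, b) prior to the per-tag accuracies by the method of moments and shrink each tag's accuracy towards the overall mean:<br />
<br />
<pre><br />
import numpy as np<br />
<br />
def beta_binomial_shrinkage(k, n):<br />
    # k[i] correct out of n[i] test clips for tag i<br />
    k, n = np.asarray(k, float), np.asarray(n, float)<br />
    p = k / n<br />
    m, v = p.mean(), max(p.var(), 1e-9)<br />
    s = m * (1 - m) / v - 1             # crude moment match, ignores unequal n[i]<br />
    a, b = m * s, (1 - m) * s           # fitted Beta(a, b) prior<br />
    return a, b, (k + a) / (n + a + b)  # posterior-mean accuracy per tag<br />
</pre><br />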
<br />
''[Turnbull] When considering tag-based performance, if we use '''Area Under the ROC Curve (AUC)''' instead of classification accuracy, we would not need to use the same number of positive and negative test examples, since AUC is a metric that is not related to the prior probability of a tag. That is, we can average over AUC values.'' <br />
<br />
'' A good reference for ROC curves and AUC is: [http://www.ailab.si/blaz/predavanja/ozp/gradivo/2003-fawcet-kddj-submitted.pdf 'ROC Graphs: Notes and Practical Considerations for Data Mining Researchers' Tom Fawcett, HP Labs] --[[User:Kriswest|Kriswest]] 08:52, 18 August 2008 (CDT)<br />
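<br />
For reference, per-tag AUC is straightforward to compute, e.g. with scikit-learn (the vectors below are hypothetical):<br />
<br />
<pre><br />
from sklearn.metrics import roc_auc_score<br />
<br />
y_true  = [1, 0, 0, 1, 0, 1, 0, 0]                    # clip has / lacks the tag<br />
y_score = [0.9, 0.2, 0.4, 0.7, 0.1, 0.6, 0.35, 0.05]  # system's affinity scores<br />
print(roc_auc_score(y_true, y_score))                 # insensitive to class priors<br />
</pre><br />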
<br />
''[Turnbull] "performance on tags with more examples will matter more" - This is a concern since, in our experience, we have found the tags that are most useful for tag-based retrieval are those that are neither the most common (e.g., generic) tags nor the very rare (e.g., obscure) tags.''<br />
<br />
''[TBM] Answer to Turnbull: if we use the top 50 tags from MajorMinor, all tags are fairly common, and there are no very rare ones. The problem comes from machine learning: I would not trust a model trained on fewer than 50 positive examples, therefore the model's performance on tags with more examples should matter more.''<br />
<br />
=== Ranking and significance testing ===<br />
<br />
Additionally, more standard tests could be performed on the average classification accuracy, although the cross-tag variance tends to inflate each algorithm's variance, interfering with significance tests unless it is handled explicitly. One such test is Friedman's ANOVA.<br />
<br />
We wish to compare a number of treatments/systems (the submissions) over a number of blocks/rows. We could compute average classification accuracy and/or precision metrics over all the tags and use the cross-validation folds as the blocks/rows, which would handle variance between folds. However, we are more interested in treating each tag (averaged over all folds) or, perhaps better, each tag on each fold as a separate block.<br />
<br />
The Friedman test should handle the variance between tags (caused by the different difficulty of modeling each tag and the different numbers of positive and negative examples per tag) by replacing the actual scores achieved by each system on each block (tag) with the rank achieved by that system on that tag amongst all the systems. Hence, we make the assumption that each tag (or combination of tag and fold) is of equal importance in the evaluation. This is an oft-used approach at TREC when considering retrieval results (where each query is of equal importance, but unequal variance/difficulty).<br />
<br />
Tukey-Kramer Honestly Significant Difference (HSD) multiple comparisons are made over the results of Friedman's ANOVA, as the ANOVA itself (like other tests, such as multiply applied Student's t-tests) can only safely tell you that one system is statistically significantly different from the rest. If you try to do the full NxN pairwise comparisons with such tests, the experiment-wide alpha value accumulates over all the tests. E.g. if we compared 12 systems at an alpha level of 0.05, a total of 66 pairwise comparisons would be made and the chance of incorrectly rejecting the hypothesis of no difference in error rates is: 1 - (0.95^66) = 0.97 = 97%. This explanation is lifted from a paper by Tague-Sutcliffe and Blustein:<br />
<br />
@article{taguesutcliffe1995sat,<br />
title={A Statistical Analysis of the TREC-3 Data},<br />
author={Tague-Sutcliffe, J. and Blustein, J.},<br />
journal={Overview of the Third Text Retrieval Conference (Trec-3)},<br />
year={1995},<br />
publisher={DIANE Publishing}<br />
}<br />
<br />
The use of Friedman's ANOVA and Tukey-Kramer HSD was originally proposed by Stephen Downie for the evaluation of the Audio Similarity queries, and I've used it for the same purpose in my thesis and in the evaluation of classification results (where it tends to give results similar to other tests such as McNemar's test). It also comes with easy-to-interpret column rank plots and can be used to put the submissions into equivalence groups; see Downie/M.C. Jones/IMIRSEL's papers on the evaluation of Audio Similarity at MIREX 2006:<br />
<br />
@InProceedings{jones2007hsj,<br />
title={"Human Similarity Judgements: Implications for the Design of Formal Evaluations"},<br />
author="M.C. Jones and J.S. Downie and A.F. Ehmann",<br />
BOOKTITLE ="Proceedings of ISMIR 2007 International Society of Music Information Retrieval", <br />
year="2007"<br />
}<br />
<br />
--[[User:Kriswest|Kriswest]] 17:54, 23 July 2008 (CDT)<br />
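<br />
As a rough illustration of the alpha accumulation and the rank-based comparison (the accuracy numbers are invented, and scipy's Friedman test is used here as a stand-in for the full Friedman ANOVA + Tukey-Kramer HSD procedure run in MATLAB):<br />
<br />
<pre><br />
# Sketch: experiment-wide error accumulation and a Friedman rank test over<br />
# systems (one sequence each) x blocks (tags, or tag/fold combinations).<br />
from scipy.stats import friedmanchisquare<br />
<br />
n_systems, alpha = 12, 0.05<br />
n_pairs = n_systems * (n_systems - 1) // 2      # 66 pairwise comparisons<br />
print(1 - (1 - alpha) ** n_pairs)               # ~0.97 experiment-wide error<br />
<br />
# Hypothetical per-block accuracies for three systems over five blocks (tags).<br />
sysA = [0.71, 0.65, 0.80, 0.55, 0.62]<br />
sysB = [0.69, 0.60, 0.78, 0.50, 0.61]<br />
sysC = [0.55, 0.52, 0.70, 0.45, 0.50]<br />
stat, p = friedmanchisquare(sysA, sysB, sysC)<br />
print(stat, p)<br />
# A post-hoc multiple comparison (e.g. Tukey-Kramer HSD over the Friedman<br />
# ranks) would then decide which individual systems differ significantly.<br />
</pre><br />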
<br />
=== Proposal for three Annotation and Retrieval Sub-tasks ===<br />
<br />
There seem to be three tasks that are worth considering:<br />
# Tag Classification (Annotation) - T binary classification problems (where T is the number of tags.) Given a tag, identify all relevant songs.<br />
# Tag Ranking (Annotation) - S ranking problems (where S is the number of songs.) Given a song, find the most salient tags. However, we do not want our system to simply output the most common tags (i.e., the tags with the largest prior probability.)<br />
# Song Ranking (Retrieval) - T ranking problems. Given a tag, rank order songs based on relevance.<br />
<br />
Our research to date has focused on #2 and #3 because we tend to think of a song-tag pair in terms of an affinity score rather than binary class labels. (While some of the more objective tags (e.g., instruments) can be regarded as all-or-nothing, music is subjective, and as such, tends to resist absolutes. See the genre classification literature for more on this topic - Tzanetakis, McKay, etc.) However, for systems that use binary classifiers (e.g., boosted decision stumps, SVMs), #1 and #3 seem like the most natural tasks. <br />
<br />
Our proposal for evaluation would be to split this into these three separate tasks. <br />
<br />
* Task #1: For each tag, we fix the proportion of positive and negative examples in the test set. Each participant outputs a binary labeling of songs for that tag which assumes an equal prior. The metric for #1 could be based on classification accuracy: 3 folds x 50 tags = a 150-dimensional vector of accuracies. We then use Kris' ANOVA suggestion to determine statistical significance, as well as report mean classification accuracy by taking the arithmetic mean of the 150-dimensional vector.<br />
<br />
* Task #2: the participant provides an affinity score for each song-tag pair (e.g., an SxT real-valued matrix). For each song, we examine the top N tags (e.g., N = 1, 10, or 25). For each tag that is relevant, we get an "Annotation Score" of 1 / Pr(tag), and 0 otherwise. That is, if the prior probability of a tag is 1/2, the Annotation Score is 2; if the prior probability is 1/10, the Annotation Score is 10. We then average over the top N tags. The expected value of the (average) Annotation Score under random annotation is 1, so an Annotation Score greater than 1 is better than random. We then get an S-dimensional vector of average Annotation Scores which can be used for an ANOVA test, and we can report the mean average Annotation Score (a small sketch of this computation appears after this proposal). <br />
<br />
* Task #3: the participant provides an affinity score for each song-tag pair (e.g., an SxT real-valued matrix). For each tag, we rank order the songs and compute the Area Under the ROC Curve (AUC). The ROC curve is a plot of the true positive rate as a function of the false positive rate as we move down this ranked list of songs. AUC is a good metric because it is not related to the prior probability of a tag, so we can average AUC across all the tags. Like tasks #1 & #2, we can report mean AUC and do an ANOVA test. <br />
<br />
--[[User:Dougturnbull|Dougturnbull]] 18:58, 24 July 2008 (CDT)<br />
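<br />
A minimal Python sketch of the Task #2 Annotation Score described in the proposal above; the affinities, tag priors and ground-truth tags for the example song are invented:<br />
<br />
<pre><br />
# Sketch: per-song Annotation Score from the top-N tags of an affinity matrix.<br />
# A relevant tag contributes 1/Pr(tag), an irrelevant one contributes 0, so a<br />
# score above 1 is better than annotating at random from the tag prior.<br />
def annotation_score(affinities, relevant, priors, n=2):<br />
    top_n = sorted(affinities, key=affinities.get, reverse=True)[:n]<br />
    return sum((1.0 / priors[t]) if t in relevant else 0.0 for t in top_n) / len(top_n)<br />
<br />
affinities = {"rock": 0.9, "guitar": 0.7, "vocal": 0.3, "jazz": 0.1}   # system output<br />
relevant   = {"rock", "guitar"}                                        # ground-truth tags<br />
priors     = {"rock": 0.5, "guitar": 0.25, "vocal": 0.2, "jazz": 0.1}  # Pr(tag) on test set<br />
<br />
print(annotation_score(affinities, relevant, priors, n=2))  # (1/0.5 + 1/0.25) / 2 = 3.0<br />
</pre><br />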
<br />
=== Runtime performance ===<br />
<br />
In addition, computation times for feature extraction and training/classification will be measured.<br />
<br />
== Submission format ==<br />
<br />
Submissions to this task will have to conform to the format detailed below, which is very similar to that of the audio genre classification task, among others.<br />
<br />
=== Audio formats ===<br />
<br />
Participating algorithms will have to read audio in the following format:<br />
<br />
* Sample rate: 44 kHz<br />
* Sample size: 16 bit<br />
* Number of channels: 2 (stereo)<br />
* Encoding: WAV (decoded from MP3 files by IMIRSEL)<br />
* Duration: 10 second clips<br />
<br />
=== Implementation details ===<br />
<br />
Scratch folders will be provided for all submissions for the storage of feature files and any model files to be produced. Executables will have to accept the path to their scratch folder as a command line parameter. Executables will also have to track which feature files correspond to which audio files internally. To facilitate this process, unique filenames will be assigned to each audio track.<br />
<br />
The audio files to be used in the task will be specified in a simple ASCII list file. For feature extraction and classification this file will contain one path per line with no header line. For model training this file will contain one path per line, followed by a tab character and the tag label, again with no header line. Executables will have to accept the path to these list files as a command line parameter. The formats for the list files are specified below.<br />
<br />
Algorithms should divide their feature extraction and training/classification into separate executables/scripts. This will facilitate a single feature extraction step for the task, while training and classification can be run for each cross-validation fold.<br />
<br />
Multi-processor compute nodes (2, 4 or 8 cores) will be used to run this task. Hence, participants should attempt to use parallelism wherever possible. Ideally, the number of threads to use should be specified as a command line parameter. Alternatively, implementations may be provided in hard-coded 2, 4 or 8 thread configurations. Single-threaded submissions will, of course, be accepted but may be disadvantaged by time constraints.<br />
<br />
<br />
=== I/O formats ===<br />
<br />
In this section, the input and output files used in this task are described, as are the command line calling format requirements for submissions.<br />
<br />
==== Feature extraction list file ====<br />
<br />
The list file passed for feature extraction will be a simple ASCII list file. This file will contain one path per line with no header line.<br />
<br />
==== Training list file ====<br />
<br />
The list file passed for model training will be a simple ASCII list file. This file will contain one path per line, followed by a tab character and a tag label, again with no header line.<br />
<br />
E.g. <br />
<br />
<example path and filename>\t<tag classification>\n<br />
<br />
In this way, the input file will represent the sparse ground truth matrix. While no line will be duplicated, multiple lines may contain the same path, one for each tag associated with that clip. Any tag that is not specified as applying to a clip does not apply to that clip. The ordering of the lines is arbitrary and should not be depended upon.<br />
<br />
==== Test (classification) list file ====<br />
<br />
The list file passed for testing classification will be a simple ASCII list file identical in format to the Feature extraction list file. This file will contain one path per line with no header line.<br />
<br />
==== Classification output files ====<br />
<br />
Participating algorithms should produce '''two''' simple ASCII list files similar in format to the Training list file. The path to which each list file should be written must be accepted as a parameter on the command line.<br />
<br />
===== Tag Affinity file =====<br />
<br />
The first file will contain one path per line, followed by a tab character and the tag label, followed by another tab character and the affinity of that tag for that file, again with no header line.<br />
<br />
I.e.:<br />
<br />
<example path and filename>\t<tag classification>\t<affinity>\n<br />
<br />
E.g.:<br />
<br />
/data/file1.wav rock 0.9<br />
/data/file1.wav guitar 0.7<br />
/data/file1.wav vocal 0.3<br />
/data/file2.wav rock 0.5<br />
...<br />
<br />
In this way, the output file will represent the sparse classification matrix. A path should be repeated on a separate line for each tag that the submission deems applies to it. If a (path, tag) pair is not specified, it will be assumed to have an affinity of 0. The ordering of the lines is not important and can be arbitrary.<br />
<br />
The affinity will be used for retrieval evaluation metrics; its only requirement is that, for a given tag, larger numbers (closer to +infinity) indicate that the tag is more appropriate to a clip than smaller numbers (closer to -infinity). As submissions are also asked to return a binary relevance listing, submissions that do not compute an affinity should provide only the binary relevance listing file.<br />
<br />
===== Binary relevance file =====<br />
<br />
The second file to be produced is a binary version of the tag classifications, where a tag must be marked as relevant or not relevant to a track. This file will contain one path per line, followed by a tab character and the tag label, followed by another tab character and either a 1 or a 0 indicating the relevance of that tag for that file, again with no header line.<br />
<br />
I.e.:<br />
<br />
<example path and filename>\t<tag classification>\t<relevant? [0 | 1]>\n<br />
<br />
E.g.:<br />
<br />
/data/file1.wav rock 1<br />
/data/file1.wav guitar 1<br />
/data/file1.wav vocal 0<br />
/data/file2.wav rock 1<br />
...<br />
<br />
If a (path, tag) pair is not specified, it will be assumed to be non-relevant (0). Any line with a path and tag but no numerical value will be assumed to mark the tag as relevant (1).<br />
<br />
Hence, the following is equivalent to the example above:<br />
<br />
/data/file1.wav rock<br />
/data/file1.wav guitar<br />
/data/file2.wav rock<br />
<br />
The ordering of the lines is not important and can be arbitrary.<br />
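<br />
To make the two output formats concrete, here is a minimal Python sketch that writes both files from an in-memory result set; the paths, tags, affinities and the 0.5 binarisation threshold are hypothetical, and a real submission would derive them from its own classifier:<br />
<br />
<pre><br />
# Sketch: writing the tag affinity file and the binary relevance file in the<br />
# tab-separated formats described above (no header line in either file).<br />
results = {                                   # hypothetical classifier output<br />
    "/data/file1.wav": {"rock": 0.9, "guitar": 0.7, "vocal": 0.3},<br />
    "/data/file2.wav": {"rock": 0.5},<br />
}<br />
threshold = 0.5                               # hypothetical binarisation rule<br />
<br />
aff = open("outputAffinityFile.txt", "w")<br />
rel = open("outputBinaryRelevanceFile.txt", "w")<br />
for path, tags in results.items():<br />
    for tag, affinity in tags.items():<br />
        aff.write("%s\t%s\t%f\n" % (path, tag, affinity))<br />
        rel.write("%s\t%s\t%d\n" % (path, tag, 1 if affinity >= threshold else 0))<br />
aff.close()<br />
rel.close()<br />
</pre><br />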
<br />
=== Example submission calling formats ===<br />
<br />
extractFeatures.sh /path/to/scratch/folder /path/to/featureExtractionListFile.txt<br />
TrainAndClassify.sh /path/to/scratch/folder /path/to/trainListFile.txt /path/to/testListFile.txt /path/to/outputAffinityFile.txt /path/to/outputBinaryRelevanceFile.txt<br />
<br />
extractFeatures.sh -numThreads 8 /path/to/scratch/folder /path/to/featureExtractionListFile.txt<br />
TrainAndClassify.sh -numThreads 8 /path/to/scratch/folder /path/to/trainListFile.txt /path/to/testListFile.txt /path/to/outputAffinityFile.txt /path/to/outputBinaryRelevanceFile.txt<br />
<br />
extractFeatures.sh /path/to/scratch/folder /path/to/featureExtractionListFile.txt<br />
Train.sh /path/to/scratch/folder /path/to/trainListFile.txt <br />
Classify.sh /path/to/scratch/folder /path/to/testListFile.txt /path/to/outputAffinityFile.txt /path/to/outputBinaryRelevanceFile.txt<br />
<br />
myAlgo.sh -extract -numThreads 8 /path/to/scratch/folder /path/to/featureExtractionListFile.txt<br />
myAlgo.sh -TrainAndClassify -numThreads 8 /path/to/scratch/folder /path/to/trainListFile.txt /path/to/testListFile.txt /path/to/outputAffinityFile.txt /path/to/outputBinaryRelevanceFile.txt<br />
<br />
myAlgo.sh -extract /path/to/scratch/folder /path/to/featureExtractionListFile.txt<br />
myAlgo.sh -train /path/to/scratch/folder /path/to/trainListFile.txt <br />
myAlgo.sh -classify /path/to/scratch/folder /path/to/testListFile.txt /path/to/outputAffinityFile.txt /path/to/outputBinaryRelevanceFile.txt<br />
<br />
=== Packaging submissions ===<br />
<br />
All submissions should be statically linked to all libraries (the presence of dynamically linked libraries cannot be guaranteed).<br />
<br />
All submissions should include a README file including the following information:<br />
<br />
* Command line calling format for all executables<br />
* Number of threads/cores used or whether this should be specified on the command line<br />
* Expected memory footprint<br />
* Expected runtime<br />
* Any required environments (and versions) such as Matlab, Java, Python, Bash, Ruby etc.<br />
<br />
=== Time and hardware limits ===<br />
<br />
Due to the potentially high number of participants in this and other audio tasks, hard limits on the runtime of submissions will be specified.<br />
<br />
A hard limit of 24 hours will be imposed on feature extraction times.<br />
<br />
A hard limit of 24 hours will be imposed on each training/classification cycle, leading to a total runtime limit of 72 hours.<br />
<br />
<br />
== Submission opening date ==<br />
TBA<br />
<br />
== Submission closing date ==<br />
TBA<br />
<br />
== Interested participants ==<br />
<br />
If this sounds interesting to you, please leave your name and email. Doing so is not binding in any way.<br />
<br />
1. Hung-Yi Lo, Ju-Chiang Wang, Hsin-Min Wang (IIS, Academia Sinica, Taiwan), hungyi[at]iis[dot]sinica[dot]edu[dot]tw<br />
<br />
2. James Bergstra bergstrj[at]iro[dot]umontreal[dot]ca<br />
<br />
3. The Marsyas team. Comment from the 2009 discussion section: "I don't know where to put this but the Marsyas team plans to participate in this task if it is run this year" (George Tzanetakis)<br />
<br />
4. Michael Mandel, Columbia University, mim[at]ee[dot]columbia[dot]edu<br />
<br />
5. Matt Hoffman, Princeton University, mdhoffma[at]cs[dot]princeton[dot]edu<br />
<br />
6. Juan Jose Burred, Geoffroy Peeters (IRCAM), burred[at]ircam[dot]fr</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2009:Audio_Music_Similarity_and_Retrieval&diff=71692009:Audio Music Similarity and Retrieval2010-06-07T19:07:39Z<p>IMIRSELBot: Robot: Automated text replacement (-\[http:\/\/(www.)?music-ir.org\/mirex\/200([5-9])\/index.php\/([^\s]+)( .+)?\] +200\2:\3)</p>
<hr />
<div>===Description===<br />
<br />
This is a suggestion to resurrect the [[2007:Audio_Music_Similarity_and_Retrieval]] task. The initial suggestion is to adapt the setup from 2007.<br />
<br />
== Comments ==<br />
<br />
Any thoughts/ideas as to when the target submit day will be for this task this year?<br />
[[User:Bfields|Bfields]] 13:22, 21 August 2009 (UTC)<br />
<br />
== Potential Participants ==<br />
<br />
# George Tzanetakis<br />
# Matt Hoffman<br />
# Thomas Lidy<br />
# Aurora Marsye<br />
# Dmitry Bogdanov (firstname dot lastname at upf dot edu)<br />
# Tim Pohle (firstname dot lastname at jku dot at)<br />
# Michael Mandel<br />
# François Maillet<br />
# Stephan Huebler (firstname dot lastname at ias dot et dot tu-dresden dot de)<br />
# Ben Fields / Mike Jewell (first initial dot lastname at gold dot ac dot uk)<br />
<br />
== Potential Graders ==<br />
<br />
# George Tzanetakis<br />
# Matt Hoffman<br />
# Thomas Lidy<br />
# Aurora Marsye<br />
# Dmitry Bogdanov (firstname dot lastname at upf dot edu)<br />
# Tim Pohle (firstname dot lastname at jku dot at)<br />
# Dominik Schnitzer (firstname dot lastname at ofai dot at)<br />
# Arthur Flexer (firstname dot lastname at ofai dot at)<br />
# Martin Gasser (firstname dot lastname at ofai dot at)<br />
# Bernhard Niedermayer (firstname dot lastname at jku dot at, in case there is a lack of graders)<br />
# Michael Mandel<br />
# François Maillet<br />
# Ben Fields</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2009:Audio_Music_Mood_Classification_Results&diff=71682009:Audio Music Mood Classification Results2010-06-07T19:07:30Z<p>IMIRSELBot: Robot: Automated text replacement (-\[http:\/\/(www.)?music-ir.org\/mirex\/200([5-9])\/index.php\/([^\s]+)( .+)?\] +200\2:\3)</p>
<hr />
<div>==Introduction==<br />
<br />
These are the results for the 2009 running of the Audio Music Mood Classification task. For background information about this task, please refer to the [[2009:Audio Music Mood Classification]] page. The data was created by Xiao Hu and consists of 600 files organized into 5 mood "clusters".<br />
=== Mood Clusters ===<br />
The 5 mood clusters were derived from the AMG mood repository.<br />
* Cluster_1: passionate, rousing, confident, boisterous, rowdy<br />
* Cluster_2: rollicking, cheerful, fun, sweet, amiable/good natured<br />
* Cluster_3: literate, poignant, wistful, bittersweet, autumnal, brooding<br />
* Cluster_4: humorous, silly, campy, quirky, whimsical, witty, wry<br />
* Cluster_5: aggressive, fiery, tense/anxious, intense, volatile, visceral <br />
For more information on the clusters, please see <br />
<br />
[http://ismir2007.ismir.net/proceedings/ISMIR2007_p067_hu.pdf Hu, Xiao and J. Stephen Downie (2007)] '''Exploring mood metadata: Relationships with genre, artist and usage metadata''', In the 8th International Conference on Music Information Retrieval (ISMIR 2007), Vienna, September 23-27, 2007.<br />
<br />
=== Data ===<br />
There are 600 audio clips with 120 in each mood cluster. Each clip belongs to only one mood cluster. <br />
The clips were chosen from the [http://www.apmmusic.com APM] audio set. <br />
<br />
The mood cluster labels of the clips were first suggested by the metadata provided by APM and then decided by human validation using the [[2007:Evalutron6000_Walkthrough_For_Audio_Mood_Classification]].<br />
<br />
Each mood cluster covers a variety of genres: each category covers about 7 major genres (with 20-30 tracks each) and a few minor genres, and the distribution among major genres within each category is made as even as possible.<br />
<br />
Audio format: 30 second clips, 22.05kHz, mono, 16bit, WAV files;<br />
The data were evenly split into 3 folds. <br />
<br />
For more information on the dataset and evaluation methods, please see<br />
<br />
[http://ismir2008.ismir.net/papers/ISMIR2008_263.pdf X. Hu, J. S. Downie, C. Laurier, M. Bay, A.Ehmann (2008)] '''The 2007 MIREX Audio Mood Classification Task: Lessons Learned''', In the 9th International Symposium on Music Information Retrieval (ISMIR 2008), Philadelphia, Sept. 2008<br />
<br />
<br />
---------------------------------------------------<br />
<br />
===General Legend===<br />
==== Team ID ====<br />
<br />
'''ANO'''= [https://www.music-ir.org/mirex/abstracts/2009/ANO_train_simi.pdf Anonymous]<br /><br />
'''BP1'''= [https://www.music-ir.org/mirex/abstracts/2009/BP_train_tag.pdf Juan José Burred, Geoffroy Peeters (file)]<br /><br />
'''BP2''' = [https://www.music-ir.org/mirex/abstracts/2009/BP_train_tag.pdf Juan José Burred, Geoffroy Peeters (tw)]<br /><br />
'''CL1''' = [https://www.music-ir.org/mirex/abstracts/2009/CL.pdf Chuan Cao, Ming Li]<br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/abstracts/2009/CL.pdf Chuan Cao, Ming Li]<br /><br />
'''FCY1''' = [https://www.music-ir.org/mirex/abstracts/2009/ Tao Feng, XiaoOu Chen, DeShun Yang]<br /><br />
'''FCY2''' = [https://www.music-ir.org/mirex/abstracts/2009/ Tao Feng, XiaoOu Chen, DeShun Yang]<br /><br />
'''GP''' = [https://www.music-ir.org/mirex/abstracts/2009/Peeters_2009_MIREX_classification.pdf Geoffroy Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/abstracts/2009/GTfinal.pdf George Tzanetakis (mono)]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/abstracts/2009/GTfinal.pdf George Tzanetakis (stereo)]<br /><br />
'''GLR1''' = [https://www.music-ir.org/mirex/abstracts/2009/GLR.pdf Andrei Grecu, Thomas Lidy, Andreas Rauber (full)]<br /><br />
'''GLR2''' = [https://www.music-ir.org/mirex/abstracts/2009/GLR.pdf Andrei Grecu, Thomas Lidy, Andreas Rauber (template)]<br /><br />
'''HNOS1''' = [https://www.music-ir.org/mirex/abstracts/2009/HNOS.pdf Takashi Hasegawa, Takuya Nishimoto, Nobutaka Ono, Shigeki Sagayama (tcca)]<br /><br />
'''HNOS2''' = [https://www.music-ir.org/mirex/abstracts/2009/HNOS.pdf Takashi Hasegawa, Takuya Nishimoto, Nobutaka Ono, Shigeki Sagayama (tcck)]<br /><br />
'''HNOS3''' = [https://www.music-ir.org/mirex/abstracts/2009/HNOS.pdf Takashi Hasegawa, Takuya Nishimoto, Nobutaka Ono, Shigeki Sagayama (tccl)]<br /><br />
'''HNOS4''' = [https://www.music-ir.org/mirex/abstracts/2009/HNOS.pdf Takashi Hasegawa, Takuya Nishimoto, Nobutaka Ono, Shigeki Sagayama (tcpk)]<br /><br />
'''HW1''' = [https://www.music-ir.org/mirex/abstracts/2009/HW_train.pdf Huaxin Wang]<br /><br />
'''HW2''' = [https://www.music-ir.org/mirex/abstracts/2009/HW_train.pdf Huaxin Wang]<br /><br />
'''VA1''' = [https://www.music-ir.org/mirex/abstracts/2009/VA.pdf Thomas Lidy, Andrei Grecu, Andreas Rauber, A. Pertusa, P. J. Ponce de Léon, J. M. Iñesta (WMV)]<br /><br />
'''VA2''' = [https://www.music-ir.org/mirex/abstracts/2009/VA.pdf Thomas Lidy, Andrei Grecu, Andreas Rauber, A. Pertusa, P. J. Ponce de Léon, J. M. Iñesta (BWWV)]<br /><br />
'''LZG''' = [https://www.music-ir.org/mirex/abstracts/2009/LZG.pdf Yi Liu, Tao Zheng, Yue Gao (RUC_1)]<br /><br />
'''RK1''' = [https://www.music-ir.org/mirex/abstracts/2009/RK.pdf Preeti Rao, Sujeet Kini]<br /><br />
'''RK2''' = [https://www.music-ir.org/mirex/abstracts/2009/K.pdf Preeti Rao, Sujeet Kini]<br /><br />
'''SS''' = [https://www.music-ir.org/mirex/abstracts/2009/SS.pdf Klaus Seyerlehner, Markus Schedl]<br /><br />
'''TAOS'''= [https://www.music-ir.org/mirex/abstracts/2009/TAOS.pdf Emiru Tsunoo, Taichi Akase, Nobutaka Ono, Shigeki Sagayama]<br /><br />
'''MTG1''' = [https://www.music-ir.org/mirex/abstracts/2009/MTG_train.pdf Nicolas Wack, Enric Guaus, Cyril Laurier, Owen Meyers, Ricard Marxer, Dmitry Bogdanov, Joan Serrà, Perfecto Herrera (false, rca)]<br /><br />
'''MTG2''' = [https://www.music-ir.org/mirex/abstracts/2009/MTG_train.pdf Nicolas Wack, Enric Guaus, Cyril Laurier, Owen Meyers, Ricard Marxer, Dmitry Bogdanov, Joan Serrà, Perfecto Herrera (true, rca)]<br /><br />
'''MTG3''' = [https://www.music-ir.org/mirex/abstracts/2009/MTG_train.pdf Nicolas Wack, Enric Guaus, Cyril Laurier, Owen Meyers, Ricard Marxer, Dmitry Bogdanov, Joan Serrà, Perfecto Herrera (false, simca)]<br /><br />
'''MTG4''' = [https://www.music-ir.org/mirex/abstracts/2009/MTG_train.pdf Nicolas Wack, Enric Guaus, Cyril Laurier, Owen Meyers, Ricard Marxer, Dmitry Bogdanov, Joan Serrà, Perfecto Herrera (true, simca)]<br /><br />
'''MTG5''' = [https://www.music-ir.org/mirex/abstracts/2009/MTG_train.pdf Nicolas Wack, Enric Guaus, Cyril Laurier, Owen Meyers, Ricard Marxer, Dmitry Bogdanov, Joan Serrà, Perfecto Herrera (false, svm)]<br /><br />
'''MTG6''' = [https://www.music-ir.org/mirex/abstracts/2009/MTG_train.pdf Nicolas Wack, Enric Guaus, Cyril Laurier, Owen Meyers, Ricard Marxer, Dmitry Bogdanov, Joan Serrà, Perfecto Herrera (true, svm)]<br /><br />
'''XLZZG''' = [https://www.music-ir.org/mirex/abstracts/2009/XLZZG.pdf Jieping Xu, Yi Liu, Tao Zheng, Chao Zhen, Yue Gao (RUC_1)]<br /><br />
'''XZZ''' = [https://www.music-ir.org/mirex/abstracts/2009/XZZ.pdf JiePing Xu, Chao Zhen, Tao Zheng (RUC_2)]<br /><br />
<br />
==Overall Summary Results==<br />
===Raw Classification Accuracy Averaged Over Three Train/Test Folds===<br />
<br />
<csv p=3>2009/audiomood/summary_audiomood.csv</csv><br />
<br />
===Accuracy Across Folds===<br />
<br />
<csv p=3>2009/audiomood/audiomood_Accuracy.csv</csv><br />
<br />
===Accuracy Across Categories===<br />
<br />
<csv p=3>2009/audiomood/audiomood_Accuracy_Per_Class.csv</csv><br />
<br />
==Friedman's Tests for Significant Differences==<br />
===Classes vs. System Tukey-Kramer HSD Multi-Comparisons ===<br />
The Friedman test was run in MATLAB against the average accuracy for each class. The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB command: <br /><br />
[c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv p=3>2009/audiomood/audiomood_Accuracy_Per_Class.friedman.tukeyKramerHSD.csv</csv><br />
<br />
https://music-ir.org/mirex/results/2009/audiomood/small.audiomood_Accuracy_Per_Class.friedman.tukeyKramerHSD.png<br />
<br />
===Folds vs. Systems Tukey-Kramer HSD Multi-Comparison===<br />
The Friedman test was run in MATLAB against the accuracy for each fold. The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB command: <br /> [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv p=3>2009/audiomood/audiomood_Accuracy.friedman.tukeyKramerHSD.csv<br />
</csv><br />
<br />
https://music-ir.org/mirex/results/2009/audiomood/small.audiomood_Accuracy.friedman.tukeyKramerHSD.png<br />
<br />
==Results By Algorithm==<br />
(.tgz) <br /><br />
<br />
'''ANO'''= [https://music-ir.org/mirex/results/2009/audiomood/ANO.tgz Anonymous]<br /><br />
'''BP1'''= [https://music-ir.org/mirex/results/2009/audiomood/BP1.tgz Juan José Burred, Geoffroy Peeters (file)]<br /><br />
'''BP2''' = [https://music-ir.org/mirex/results/2009/audiomood/BP2.tgz Juan José Burred, Geoffroy Peeters (tw)]<br /><br />
'''CL1''' = [https://music-ir.org/mirex/results/2009/audiomood/CL1.tgz Chuan Cao, Ming Li]<br /><br />
'''CL2''' = [https://music-ir.org/mirex/results/2009/audiomood/CL1.tgz Chuan Cao, Ming Li]<br /><br />
'''FCY1''' = [https://music-ir.org/mirex/results/2009/audiomood/FCY1.tgz Tao Feng, XiaoOu Chen, DeShun Yang]<br /><br />
'''FCY2''' = [https://music-ir.org/mirex/results/2009/audiomood/FCY2.tgz Tao Feng, XiaoOu Chen, DeShun Yang]<br /><br />
'''GP''' = [https://music-ir.org/mirex/results/2009/audiomood/GP.tgz Geoffroy Peeters]<br /><br />
'''GT1''' = [https://music-ir.org/mirex/results/2009/audiomood/GT1.tgz George Tzanetakis (mono)]<br /><br />
'''GT2''' = [https://music-ir.org/mirex/results/2009/audiomood/GT2.tgz George Tzanetakis (stereo)]<br /><br />
'''GLR1''' = [https://music-ir.org/mirex/results/2009/audiomood/GLR1.tgz A. Grecu, T. Lidy, A. Rauber (full)]<br /><br />
'''GLR2''' = [https://music-ir.org/mirex/results/2009/audiomood/GLR2.tgz A. Grecu, T. Lidy, A. Rauber (template)]<br /><br />
'''HNOS1''' = [https://music-ir.org/mirex/results/2009/audiomood/HNOS1.tgz Takashi Hasegawa, Takuya Nishimoto, Nobutaka Ono, Shigeki Sagayama (tcca)]<br /><br />
'''HNOS2''' = [https://music-ir.org/mirex/results/2009/audiomood/HNOS2.tgz Takashi Hasegawa, Takuya Nishimoto, Nobutaka Ono, Shigeki Sagayama (tcck)]<br /><br />
'''HNOS3''' = [https://music-ir.org/mirex/results/2009/audiomood/HNOS3.tgz Takashi Hasegawa, Takuya Nishimoto, Nobutaka Ono, Shigeki Sagayama (tccl)]<br /><br />
'''HNOS4''' = [https://music-ir.org/mirex/results/2009/audiomood/HNOS4.tgz Takashi Hasegawa, Takuya Nishimoto, Nobutaka Ono, Shigeki Sagayama (tcpk)]<br /><br />
'''HW1''' = [https://music-ir.org/mirex/results/2009/audiomood/HW1.tgz Huaxin Wang]<br /><br />
'''HW2''' = [https://music-ir.org/mirex/results/2009/audiomood/HW2.tgz Huaxin Wang]<br /><br />
'''VA1''' = [https://music-ir.org/mirex/results/2009/audiomood/VA1.tgz T. Lidy, A. Grecu, A. Rauber, A. Pertusa, P. J. Ponce de Léon, J. M. Iñesta (WMV)]<br /><br />
'''VA2''' = [https://music-ir.org/mirex/results/2009/audiomood/VA2.tgz T. Lidy, A. Grecu, A. Rauber, A. Pertusa, P. J. Ponce de Léon, J. M. Iñesta (BWWV)]<br /><br />
'''LZG''' = [https://music-ir.org/mirex/results/2009/audiomood/LZG.tgz Yi Liu, Tao Zheng, Yue Gao (RUC_1)]<br /><br />
'''RK1''' = [https://music-ir.org/mirex/results/2009/audiomood/RK1.tgz Preeti Rao, Sujeet Kini]<br /><br />
'''RK2''' = [https://music-ir.org/mirex/results/2009/audiomood/RK2.tgz Preeti Rao, Sujeet Kini]<br /><br />
'''SS''' = [https://music-ir.org/mirex/results/2009/audiomood/SS.tgz Klaus Seyerlehner, Markus Schedl]<br /><br />
'''TAOS'''= [https://music-ir.org/mirex/results/2009/audiomood/TAOS.tgz Emiru Tsunoo, Taichi Akase, Nobutaka Ono, Shigeki Sagayama]<br /><br />
'''MTG1''' = [https://music-ir.org/mirex/results/2009/audiomood/MTG1.tgz N. Wack, E. Guaus, C. Laurier, O. Meyers, R. Marxer, D. Bogdanov, J. Serrà, P. Herrera (false, rca)]<br /><br />
'''MTG2''' = [https://music-ir.org/mirex/results/2009/audiomood/MTG2.tgz N. Wack, E. Guaus, C. Laurier, O. Meyers, R. Marxer, D. Bogdanov, J. Serrà, P. Herrera (true, rca)]<br /><br />
'''MTG3''' = [https://music-ir.org/mirex/results/2009/audiomood/MTG3.tgz N. Wack, E. Guaus, C. Laurier, O. Meyers, R. Marxer, D. Bogdanov, J. Serrà, P. Herrera (false, simca)]<br /><br />
'''MTG4''' = [https://music-ir.org/mirex/results/2009/audiomood/MTG4.tgz N. Wack, E. Guaus, C. Laurier, O. Meyers, R. Marxer, D. Bogdanov, J. Serrà, P. Herrera (true, simca)]<br /><br />
'''MTG5''' = [https://music-ir.org/mirex/results/2009/audiomood/MTG5.tgz N. Wack, E. Guaus, C. Laurier, O. Meyers, R. Marxer, D. Bogdanov, J. Serrà, P. Herrera (false, svm)]<br /><br />
'''MTG6''' = [https://music-ir.org/mirex/results/2009/audiomood/MTG6.tgz N. Wack, E. Guaus, C. Laurier, O. Meyers, R. Marxer, D. Bogdanov, J. Serrà, P. Herrera (true, svm)]<br /><br />
'''XLZZG''' = [https://music-ir.org/mirex/results/2009/audiomood/XLZZG.tgz Jieping Xu, Yi Liu, Tao Zheng, Chao Zhen, Yue Gao (RUC_1)]<br /><br />
'''XZZ''' = [https://music-ir.org/mirex/results/2009/audiomood/XZZ.tgz JiePing Xu, Chao Zhen, Tao Zheng (RUC_2)]<br /><br />
<br />
==Run Times==<br />
<br />
<csv>2009/mood.runtime.csv</csv> TBA</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2009:Audio_Music_Mood_Classification&diff=71672009:Audio Music Mood Classification2010-06-07T19:07:19Z<p>IMIRSELBot: Robot: Automated text replacement (-\[http:\/\/(www.)?music-ir.org\/mirex\/200([5-9])\/index.php\/([^\s]+)( .+)?\] +200\2:\3)</p>
<hr />
<div>== Description ==<br />
<br />
The text of this section is copied from the 2008 page. Please add your comments and discussions for 2009. <br />
<br />
This section is put here to clarify what will happen for this year's run of the Audio Mood Classification (AMC) task.<br />
<br />
# We will operate the AMC task as a classic train-test classification task.<br />
# We will n-fold the runs with n to be determined by the size of the final data set, number of participants, etc.<br />
# We will hand-craft the n-fold test-train split lists.<br />
# We will NOT be doing post-run human mood judgments this year using the Evalutron 6000. <br />
# Audio files: 30 sec., 22kHz, mono, 16 bit<br />
<br />
Do take a look at the [[2009:Audio Genre Classification]] task wiki as we are basing the underlying structure of this task on Audio Genre. In fact, an Audio Genre submission should work out of the box with Audio Mood Classification. Note: we really want folks to do a FEATURE EXTRACTION phase first against all the files and then have these features cached some place for re-use during the TRAIN-TEST phase. This way we can really speed up the n-fold processing. Thus, like GENRE, we need to pass three input files to your algos:<br />
<br />
<br />
== Discussions for 2009 ==<br />
Your comments here.<br />
<br />
<br />
==== 1. Feature extraction list file ====<br />
The list file passed for feature extraction will be a simple ASCII list <br />
file. This file will contain one path per line with no header line.<br />
<br />
==== 2. Training list file ====<br />
The list file passed for model training will be a simple ASCII list <br />
file. This file will contain one path per line, followed by a tab character and <br />
the mood label, again with no header line. <br />
<br />
E.g. <example path and filename>\t<mood classification><br />
<br />
==== 3. Test (classification) list file ====<br />
The list file passed for testing classification will be a simple ASCII list <br />
file identical in format to the Feature extraction list file. This file will <br />
contain one path per line with no header line.<br />
<br />
==== Classification output files ====<br />
Participating algorithms should produce a simple ASCII list file identical in <br />
format to the Training list file. This file will contain one path per line, <br />
followed by a tab character and the MOOD label, again with no header line. <br />
E.g.:<br />
<example path and filename>\t<mood classification><br />
<br />
The path to which this list file should be written must be accepted as a <br />
parameter on the command line.<br />
<br />
== Participants ==<br />
If you think there is a slight chance that you might consider participating, please add your name and email address here.<br />
<br />
# Your name here<br />
<br />
== Introduction ==<br />
In music psychology and music education, the emotion component of music has been recognized as most strongly associated with music expressivity (e.g. Juslin et al. 2006 [[#Related Papers]]). Music information behavior studies (e.g. Cunningham, Jones and Jones 2004; Vignoli 2004; Cunningham, Bainbridge and Falconer 2006 [[#Related Papers]]) have also identified music mood/emotion as an important criterion used by people in music seeking and organization. Several experiments have been conducted in the MIR community to classify music by mood (e.g. Lu, Liu and Zhang 2006; Pohle, Pampalk and Widmer 2005; Mandel, Poliner and Ellis 2006; Feng, Zhuang and Pan 2003 [[#Related Papers]]). Please note: the MIR community tends to use the word "mood" while music psychologists like to use "emotion". We follow the MIR tradition and use "mood" hereafter. <br />
<br />
However, evaluation of music mood classification is difficult, as music mood is a very subjective notion. Each of the aforementioned experiments used different mood categories and different datasets, making comparison with previous work virtually impossible. A contest on music mood classification in MIREX will help build the first community-available test set and precious ground truth.<br />
<br />
This is the first time MIREX has attempted a music mood classification evaluation. There are many issues involved in this evaluation task; let us start discussing them on this wiki. If needed, we will set up a mailing list devoted to the discussion.<br />
<br />
== Mood Categories ==<br />
<br />
The IMIRSEL has derived a set of 5 mood clusters from the AMG mood repository (Hu & Downie 2007 [[#Related Papers]]). The mood clusters effectively reduce the diverse mood space into a tangible set of categories, and yet are rooted in the social-cultural context of pop music. Therefore, we propose to use the 5 mood clusters as the categories in this year's audio mood classification contest. Each of the clusters is a collection of the AMG mood labels which collectively define the cluster: <br />
<br />
*Cluster_1: passionate, rousing, confident, boisterous, rowdy <br />
*Cluster_2: rollicking, cheerful, fun, sweet, amiable/good natured <br />
*Cluster_3: literate, poignant, wistful, bittersweet, autumnal, brooding <br />
*Cluster_4: humorous, silly, campy, quirky, whimsical, witty, wry <br />
*Cluster_5: aggressive, fiery, tense/anxious, intense, volatile, visceral <br />
<br />
At this moment, the IMIRSEL and Cyril Laurier at the Music Technology Group of Barcelona have manually validated the mood clusters and exemplar songs in each cluster. Please see [[#Exemplar Songs in Each Category]] for details. <br />
<br />
We are still seeking additional songs across different genres to enrich this set. During this process, the cluster with the least cross-listener consistency may be dropped, or two clusters that are often confused with each other may be combined. <br />
<br />
== Exemplar Songs in Each Category == <br />
Exemplar songs for each mood cluster are manually selected by multiple human assessors. The purpose is to further clarify the perceptual identities of the mood clusters.<br />
<br />
There are 190 candidate songs in the intersection of the AMG mood repository and the USPOP collection at IMIRSEL, and each of these songs has only one unanimous mood cluster label assigned by AMG editors. The mood labels by AMG editors are an important benchmark which can help us reach cross-listener consistency on such a subjective task. So far, 6 human assessors have listened to the 190 songs and assigned cluster labels to them. 50 songs were labeled identically by all 6 human assessors, 42 songs by 5 of the 6 human assessors, and another 40 songs by 4 of the 6 human assessors. <br />
<br />
The advantages of the exemplar songs are twofold: 1. they will help people better understand what kind of mood each cluster refers to; 2. they can possibly be taken as training data for the algorithms (see the section on [[#Training Set]]). <br />
<br />
Note on lyrics: when labeling the songs, the human assessors were asked to ignore lyrics. As this contest focuses on music audio, lyrics should not be taken into consideration. <br />
<br />
== Two Evaluation Scenarios ==<br />
<br />
1. Evaluation on a closed groundtruth set.<br />
As in traditional classification problems, both training and testing data are labeled well before the contest. <br />
Pros: evaluation metrics are more rigorous; supports cross-validation <br />
Cons: the training/testing set is limited<br />
<br />
2. Training on a labeled set, but testing on an unlabeled audio pool <br />
As in audio similarity and retrieval contest, each algorithm returns a list of candidates in each mood category, then human assessors make judgments on the returned candidates. <br />
Pros: the testing pool can be arbitrarily big; the training set is bigger as well (it can be the whole groundtruth set from scenario 1). <br />
Cons: innovative but limited evaluation metrics (see below)<br />
<br />
For both scenarios, this is a single-label classification contest, and thus each song can only be classified into one mood cluster. <br />
<br />
'''We will go for scenario 1'''<br />
<br />
== Groundtruth Set ==<br />
<br />
The IMIRSEL is preparing a ground-truth set of audio clips selected from the USPOP collection described above and the APM collection (www.apmmusic.com). The bibliographic information of the exemplar songs has been released above, which should help participants reach agreement on the meanings of the mood categories.<br />
<br />
The APM audio set has been pre-labeled with the 5 mood clusters according to their metadata provided by APM, and covers a variety of genres: each category covers about 7 major genres (with 20-30 tracks each) and a few minor genres. To make the problem more interesting, the distribution among major genres within each category is made as even as possible. <br />
<br />
To make sure the mood labels are correct, this APM audio collection will be subject to human validation before the contest. We have prepared a set of 1250 audio clips (250 per category). The audio clips whose mood category assignments reach agreement among 2 out of 3 human assessors will serve as the ground truth set. We are aiming for at least 120 audio clips in each mood category. <br />
<br />
After the human validation on this audio set, participating algorithms/models will be trained and tested within IMIRSEL.<br />
<br />
'''Audio format: 30 second clips, 22.05kHz, mono, 16bit, WAV files''' <br />
<br />
=== Human Validation ===<br />
Subjective judgments by human assessors will be collected for the above mentioned APM audio set using a web-based system, Evalutron6000, developed by the IMIRSEL. <br />
<br />
Each audio clip is 30 seconds long, and will have 3 human judges listen to it and choose which mood category it belongs to. If 2 of the 3 judges agree on its category, an audio clip will be selected into the groundtruth set.<br />
<br />
== Evaluation Metrics == <br />
<br />
Metrics frequently used in classification problems include accuracy, precision, recall and F measures (combining precision and recall). The single most important metric is accuracy, which allows direct system comparison: <br />
<br />
''Accuracy = # of correctly classified songs / # of all songs.'' <br />
<br />
Accuracy can be calculated over all clusters pooled together (micro average) or for each cluster separately and then averaged across clusters (macro average).<br />
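<br />
A minimal Python sketch of the two averages, assuming predicted and true cluster labels are available for each clip (the example labels below are invented):<br />
<br />
<pre><br />
# Sketch: micro-averaged (all clips pooled) vs. macro-averaged (per-cluster,<br />
# then averaged) classification accuracy.<br />
truth = {"a.wav": "Cluster_1", "b.wav": "Cluster_1", "c.wav": "Cluster_2", "d.wav": "Cluster_3"}<br />
preds = {"a.wav": "Cluster_1", "b.wav": "Cluster_2", "c.wav": "Cluster_2", "d.wav": "Cluster_3"}<br />
<br />
micro = sum(preds[f] == truth[f] for f in truth) / len(truth)<br />
<br />
clusters = set(truth.values())<br />
per_cluster = [<br />
    sum(preds[f] == c for f in truth if truth[f] == c) /<br />
    sum(1 for f in truth if truth[f] == c)<br />
    for c in clusters<br />
]<br />
macro = sum(per_cluster) / len(per_cluster)<br />
<br />
print(micro, macro)   # 0.75 and (0.5 + 1.0 + 1.0) / 3 = 0.833...<br />
</pre><br />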
<br />
Test significance of differences among systems, possibly using<br />
<br />
*a) McNemar's test <br />
<br />
McNemar's test (Dietterich, 1997) is a statistical procedure that can validate the significance of differences between two classifiers. It was used in the Audio Genre Classification and Audio Artist Identification contests in MIREX 2005. <br />
<br />
*b) Friedman's test<br />
<br />
Friedman's test is used to detect differences in treatments across multiple test attempts (http://en.wikipedia.org/wiki/Friedman_test). It was used in the Audio Similarity, Audio Cover Song, and Query by Singing/Humming contests in MIREX 2006. <br />
<br />
Besides, run time can be recorded and compared.<br />
<br />
== Important Dates ==<br />
<br />
<br />
* Algorithm Submission Deadline: TBA<br />
<br />
== Packaging your Submission ==<br />
* Be sure that your submission follows the [[#Submission_Format]] outlined below.<br />
* Be sure that your submission accepts the proper [[#Input_File]] format<br />
* Be sure that your submission produces the proper [[#Output_File]] format<br />
* Be sure to follow the [[2006:Best_Coding_Practices_for_MIREX]]<br />
* Be sure to follow the [[2009:MIREX 2009 Submission Instructions]] <br />
* In the README file that is included with your submission, please answer the following additional questions:<br />
** Approximately how long will the submission take to process ~1000 wav files?<br />
** Approximately how much scratch disk space will the submission need to store any feature/cache files?<br />
** Any special notices regarding running your algorithm<br />
<br />
Note that the information that you place in the README file is '''extremely''' important in ensuring that your submission is evaluated properly.<br />
<br />
== Submission Format ==<br />
A submission to the Audio Music Mood Classification evaluation is expected to follow the [[2006:Best_Coding_Practices_for_MIREX]] and must conform to the following for execution:<br />
<br />
=== One Call Format ===<br />
The one call format is appropriate for systems that perform all phases of the classification (typically feature extraction, training and testing) in one step. A submission should be an executable program that takes 4 arguments: <br />
* path/to/fileContainingListOfTrainingAudioClips - the path to the list of training audio clips (see [[#File Formats]] below)<br />
* path/to/fileContainingListOfTestingAudioClips - the path to the list of testing audio clips (see [[#File Formats]] below)<br />
* path/to/cacheDir - a directory where the submission can place temporary or scratch files. Note that the contents of this directory can be retained across runs, so if, for whatever reason, the submission needs to be restarted, the submission could make use of the contents of this directory to eliminate the need for reprocessing some inputs.<br />
* path/to/output/Results - the file where the output classification results should be placed. (see [[#File Formats]] below)<br />
<br />
'''Example:'''<br />
<br />
<pre><br />
<br />
doAMC "path/to/fileContainingListOfTrainingAudioClips" "path/to/fileContainingListOfTestingAudioClips" "path/to/cacheDir" "path/to/output/Results" <br />
<br />
</pre><br />
<br />
<br />
=== Two Call Format ===<br />
The two call format is appropriate for systems that perform training and testing separately. A submission should consist of two executable programs:<br />
*trainAMC - this takes 3 arguments: <br />
** path/to/fileContainingListOfTrainingAudioClips - the path to the list of training audio clips (see [[#File Formats]] below)<br />
** path/to/trainingCacheDir - a directory where the submission can place temporary or scratch files. Note that the contents of this directory can be retained across runs, so if, for whatever reason, the submission needs to be restarted, the submission could make use of the contents of this directory to eliminate the need for reprocessing some inputs.<br />
** path/to/trainedClassificationModel - the file where the classification model should be placed<br />
*testAMC - this takes 4 arguments:<br />
** path/to/trainedClassificationModel<br />
** path/to/fileContainingListofTestingAudioClips - the path to the list of testing audio clips (see [[#File Formats]] below)<br />
** path/to/testingCacheDir - a directory where the submission can place temporary or scratch files. <br />
** path/to/output/Results - the file where the output classification results should be placed. (see [[#File Formats]] below)<br />
<br />
'''Example:'''<br />
<br />
<pre><br />
<br />
trainAMC "path/to/fileContainingListOfTrainingAudioClips" "path/to/trainingcacheDir" "path/to/trainedClassificationModel" <br />
testAMC "path/to/trainedClassificationModel" "path/to/fileContainingListofTestingAudioClips" "path/to/testingCacheDir" "path/to/output/Results"<br />
<br />
</pre><br />
<br />
=== Matlab format ===<br />
<br />
Matlab will also be supported in the form of functions in the following formats:<br />
<br />
==== Matlab One call format ====<br />
<pre><br />
doMyMatlabAMC('path/to/fileContainingListOfTrainingAudioClips','path/to/fileContainingListOfTestingAudioClips','path/to/cacheDir','path/to/output/Results')<br />
</pre><br />
<br />
<br />
==== Matlab Two call format ====<br />
<pre><br />
doMyMatlabTrainAMC('path/to/fileContainingListOfTrainingAudioClips','path/to/trainingcacheDir','path/to/trainedClassificationModel')<br />
doMyMatlabTestAMC('path/to/trainedClassificationModel','path/to/fileContainingListofTestingAudioClips','path/to/testingCacheDir','path/to/output/Results')<br />
</pre><br />
<br />
== File Formats ==<br />
<br />
=== Input Files ===<br />
<br />
The input training list file format will be of the form: <br />
<br />
<pre><br />
path/to/training/audio/file/000001.wav\tCluster_3<br />
path/to/training/audio/file/000002.wav\tCluster_5<br />
path/to/training/audio/file/000003.wav\tCluster_2<br />
...<br />
path/to/training/audio/file/00000N.wav\tCluster_1<br />
</pre><br />
<br />
"\t" stands for tab.<br />
<br />
The input testing list file format will be of the form: <br />
<br />
<pre><br />
path/to/testing/audio/file/000010.wav<br />
path/to/testing/audio/file/000020.wav<br />
path/to/testing/audio/file/000030.wav<br />
...<br />
path/to/testing/audio/file/0000N0.wav<br />
</pre><br />
<br />
"\t" stands for tab.<br />
<br />
=== Output File ===<br />
The only output will be a file containing classification results in the following format: <br />
<br />
<pre><br />
Example Classification Results 0.1 (replace this line with your system name)<br />
path/to/testing/audio/file/000010.wav\tCluster_3<br />
path/to/testing/audio/file/000020.wav\tCluster_1<br />
path/to/testing/audio/file/000030.wav\tCluster_5<br />
...<br />
path/to/testing/audio/file/0000N0.wav\tCluster_2<br />
</pre><br />
<br />
"\t" indicates tab. All audio clips should have one and only one mood cluster label.<br />
<br />
==Evaluation Scenario 2==<br />
<br />
=== Training Set ===<br />
<br />
Under evaluation scenario 2, the training set would be the whole ground truth set in scenario 1 (see [[#Groundtruth Set]]).<br />
<br />
=== Unlabeled Song Pool ===<br />
Under evaluation scenario 2, the pool of testing audio to be classified is drawn from the same collections as the training set, i.e. USPOP and APM. We will make sure the audio covers a variety of genres in each mood cluster, which will make the contest harder and more interesting.<br />
<br />
We will randomly select a certain number (say, 1000) of songs from the collections as the audio pool. This number should make the contest interesting enough, but not too hard, and the songs need to cover all 5 mood clusters.<br />
<br />
=== Classification Results ===<br />
Each algorithm will return the top X songs in each cluster. <br />
<br />
This is a single-label classification contest, and thus each song can only be classified into one mood cluster. <br />
<br />
Note: unlike traditional classification problems where all testing samples have ground truth available, this scenario does not have a well-labeled testing set. Instead, we use a "pooling" approach as in TREC and last year's audio similarity and retrieval contest. This approach collects the top X results from each algorithm and asks human assessors to make judgments on this set of collected results while assuming all other samples are irrelevant or incorrect. This approach cannot measure an absolute "recall" metric, but it is valid for comparing relative performance among participating algorithms. <br />
<br />
The actual value of X depends on human assessment protocol and number of available human assessors (see next section [[#Human Assessment]]).<br />
<br />
=== Human Assessment===<br />
Subjective judgments by human assessors will be collected for the pooled results using a web-based system, Evalutron6000, developed by the IMIRSEL. (An introduction of this piece of Evalutron 6000 is shown here [[2009:Evalutron6000_Walkthrough_For_Audio_Mood_Classification]]<br />
<br />
==== How many judgments and assessors ====<br />
Each algorithm returns X songs for each of the 5 mood clusters. Suppose there are Y algorithms; in the worst case, each cluster will have X*Y songs to be judged, i.e. 5*X*Y songs in total. Suppose each song needs Z sets of ears; there will be 5*X*Y*Z judgments in total. When making a judgment, a human assessor will listen to the 30 second clip of a song and label it with one of the 5 mood clusters. <br />
<br />
Human evaluators will be drawn from the participating labs and volunteers from IMIRSEL or on the MIREX lists. Suppose we can get W evaluators; each evaluator will make S = (5*X*Y*Z) / W judgments.<br />
<br />
At this moment, there are 10 potential participants on the wiki, so let's say Y = 6. Suppose each candidate song will be evaluated by 3 judges (Z = 3), and suppose we can get 20 assessors (W = 20): <br />
<br />
*If X = 20, number of judgments for each assessor: S = 90<br />
*If X = 10, S = 45<br />
*If X = 30, S = 135 <br />
*If X = 50, S = 225<br />
*If X = 15, S = 67.5<br />
*…<br />
<br />
In the audio similarity contest last year, each assessor made 205 judgments on average. As the judgment of mood is trickier, we may need to give our assessors a lighter burden.<br />
<br />
To eliminate possible bias, we will try to equally distribute candidates returned by each algorithm among human assessors.<br />
<br />
=== Scoring ===<br />
Each algorithm is graded by the number of votes its candidate songs win from the judges. For example, if a song, A, is judged as being in Cluster_1 by 2 assessors and in Cluster_2 by 1 assessor, then an algorithm classifying A into Cluster_1 will score 2 on this song, while an algorithm classifying A into Cluster_2 will score 1 on this song. An algorithm's final score is the sum of its scores on all the songs it submits. Since each algorithm can only submit 100 songs, the one which wins the most votes from the judges wins the contest.<br />
<br />
=== Evaluation Metrics ===<br />
Algorithm score, as mentioned in the last section, is a metric that facilitates direct comparison. <br />
<br />
Besides this, metrics frequently used in classification problems include accuracy, precision, recall and F measures (combining precision and recall). As mentioned above, the pooling approach yields only a relative recall measure; therefore, the single most important metric is accuracy. <br />
<br />
The original definition of accuracy is:<br />
''Accuracy = # of correctly classified songs / #. of all songs.'' <br />
<br />
According to the above human assessment method, "correctly classified songs" in this scenario can be defined as songs classified as the majority vote of the judges and, in the case of ties, songs classified as any of the tied votes. For example, suppose each song has 3 judges. If a song is labeled as Cluster_1 by at least 2 judges, then this song will be counted as correct for algorithms classifying it into Cluster_1; if a song is labeled as Cluster_1, Cluster_2 and Cluster_3 once each, then this song will be counted as correct for algorithms classifying it into Cluster_1, Cluster_2 or Cluster_3. <br />
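<br />
A minimal Python sketch of this majority/tie rule (the judge votes below are invented):<br />
<br />
<pre><br />
# Sketch: a prediction counts as correct if it matches any of the cluster<br />
# labels that received the maximum number of judge votes (a clear majority,<br />
# or any member of a tie).<br />
from collections import Counter<br />
<br />
def is_correct(prediction, judge_votes):<br />
    counts = Counter(judge_votes)<br />
    top = max(counts.values())<br />
    winners = {cluster for cluster, n in counts.items() if n == top}<br />
    return prediction in winners<br />
<br />
print(is_correct("Cluster_1", ["Cluster_1", "Cluster_1", "Cluster_3"]))  # True<br />
print(is_correct("Cluster_2", ["Cluster_1", "Cluster_2", "Cluster_3"]))  # True (tie)<br />
print(is_correct("Cluster_5", ["Cluster_1", "Cluster_1", "Cluster_3"]))  # False<br />
</pre><br />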
<br />
Accuracy can be calculated over all clusters pooled together (micro average) or for each cluster separately and then averaged across clusters (macro average).<br />
<br />
Test significance of differences among systems, possibly using<br />
<br />
*a) McNemar's test <br />
*b) Friedman's test<br />
<br />
Besides, run time can be recorded and compared.<br />
<br />
== Challenging Issues == <br />
# Mood-changeable pieces: some pieces may start in one mood but end up in another. <br />
<br />
We will use 30 second clips instead of whole songs. The clips will be extracted automatically from the middle of the songs, which has a better chance of being representative.<br />
<br />
# Multiple-label classification: it is possible that one piece can have two or more correct mood labels, but as a start, we strongly suggest holding a less confusing contest and leaving this challenge to future MIREXs. So, for this year, this is a single-label classification problem.<br />
<br />
== Moderators ==<br />
* J. Stephen Downie (IMIRSEL, University of Illinois, USA) - [mailto:jdownie@uiuc.edu]<br />
* Xiao Hu (IMIRSEL, University of Illinois, USA) -[mailto:xiaohu@uiuc.edu]<br />
* Cyril Laurier (Music Technology Group, Barcelona, Spain) -[mailto:claurier@iua.upf.edu]<br />
<br />
== Related Papers ==<br />
#Dietterich, T. (1997). '''Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms'''. Neural Computation, 10(7), 1895-1924.<br />
#Hu, Xiao and J. Stephen Downie (2007). '''Exploring mood metadata: Relationships with genre, artist and usage metadata'''. Accepted in the Eighth International Conference on Music Information Retrieval (ISMIR 2007),Vienna, September 23-27, 2007.<br />
# Juslin, P.N., Karlsson, J., Lindström, E., Friberg, A. and Schoonderwaldt, E. (2006), '''Play It Again With Feeling: Computer Feedback in Musical Communication of Emotions'''. In Journal of Experimental Psychology: Applied, 2006, Vol.12, No.2, 79-95.<br />
# [http://ismir2004.ismir.net/proceedings/p075-page-415-paper152.pdf Vignoli (ISMIR 2004)] '''Digital Music Interaction Concepts: A User Study'''<br />
# [http://ismir2004.ismir.net/proceedings/p082-page-447-paper221.pdf Cunningham, Jones and Jones (ISMIR 2004)] '''Organizing Digital Music For Use: An Examination of Personal Music Collections'''.<br />
# [http://ismir2006.ismir.net/PAPERS/ISMIR0685_Paper.pdf Cunningham, Bainbridge and Falconer (ISMIR 2006)] '''"More of an Art than a Science": Supporting the Creation of Playlists and Mixes'''.<br />
# Lu, Liu and Zhang (2006), '''Automatic Mood Detection and Tracking of Music Audio Signals'''. IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 1, JANUARY 2006 <br> Part of this paper appeared in ISMIR 2003 http://ismir2003.ismir.net/papers/Liu.PDF<br />
# [http://www.cp.jku.at/research/papers/Pohle_CBMI_2005.pdf Pohle, Pampalk, and Widmer (CBMI 2005)] '''Evaluation of Frequently Used Audio Features for Classification of Music into Perceptual Categories'''. <br> It separates "mood" and "emotion" as two classification dimensions, which are mostly combined in other studies.<br />
# [http://www.ee.columbia.edu/~dpwe/pubs/MandPE06-svm.pdf Mandel, Poliner and Ellis (2006)] '''Support vector machine active learning for music retrieval'''. Multimedia Systems, Vol.12(1). Aug.2006.<br />
# [http://doi.acm.org/10.1145/860435.860508 Feng, Zhuang and Pan (SIGIR 2003)] '''Popular music retrieval by detecting mood'''<br />
# [http://ismir2003.ismir.net/papers/Li.PDF Li and Ogihara (ISMIR 2003)] '''Detecting emotion in music'''<br />
# [http://pubdb.medien.ifi.lmu.de/cgi-bin//info.pl?hilliges2006audio Hilliges, Holzer, Klüber and Butz (2006)] '''AudioRadar: A metaphorical visualization for the navigation of large music collections'''. In Proceedings of the International Symposium on Smart Graphics 2006, Vancouver, Canada. <br> It summarizes implicit problems in traditional genre/artist-based music organization.<br />
# Juslin, P. N., & Laukka, P. (2004). '''Expression, perception, and induction of musical emotions: A review and a questionnaire study of everyday listening'''. Journal of New Music Research, 33(3), 217-238.<br />
# [http://mpac.ee.ntu.edu.tw/~yihsuan/ Yang, Liu, and Chen (ACMMM 2006)] '''Music emotion classification: A fuzzy approach '''<br />
<br />
<br />
<br />
== Potential Participants ==<br />
<br />
# Thomas Lidy (+ ...), Vienna University of Technology, Austria, lidy[at]ifs[dot]tuwien[dot]ac[dot]at<br />
# Nicolas Wack et al., MTG Universitat Pompeu Fabra, Spain, nicolas[dot]wack[at]upf[dot]edu (3 algos presented, 1 by me, 1 by Enric Guaus and 1 by Cyril Laurier)<br />
# Tao Feng, XiaoOu Chen, DeShun Yang. Peking University, China. fengtao[at]pku[dot]edu[dot]cn, chenxiaoou[at]icst[dot]pku[dot]edu[dot]cn, yangdeshun[at]icst[dot]pku[dot]edu[dot]cn<br />
# Michael Mandel, Columbia University, mim[at]ee[dot]columbia[dot]edu<br />
# Emiru Tsunoo, et. al., University of Tokyo, Japan, tsunoo[at]hil[dot]t[dot]u-tokyo[dot]ac[dot]jp<br />
# Juan Jose Burred, Geoffroy Peeters (IRCAM), burred[at]ircam[dot]fr<br />
</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2010:Audio_Tag_Classification&diff=71662010:Audio Tag Classification2010-06-07T19:06:52Z<p>IMIRSELBot: Robot: Automated text replacement (-\[http:\/\/(www.)?music-ir.org\/mirex\/200([5-9])\/index.php\/([^\s]+)( .+)?\] +200\2:\3)</p>
<hr />
<div>__TOC__<br />
<br />
== Description ==<br />
This task will compare various algorithms' abilities to associate descriptive tags with 10-second audio clips of songs. Two datasets, the MajorMiner and Mood tag datasets, are used to implement a pair of sub-tasks. This task is closely related to the other audio classification tasks; however, multiple tags may be applied to each example rather than a single label. <br />
<br />
Algorithms will be evaluated both on their ability to apply binary classifications of tags to examples and on their ability to rank tags for a track, by asking them to return an affinity score for each tag/track pair.<br />
<br />
Audio tag classification was first run at MIREX 2008 [[2008:Audio_Tag_Classification]] and as a special MIREX task in 2009<br />
[[2009:SpecialTagatuneEvaluation]]. <br />
<br />
<br />
=== Task specific mailing list ===<br />
A specific mailing list is provided for the discussion of this task and related tasks ( [[2010:Audio Classification (Test/Train) tasks]], [[2010:Audio_Cover_Song_Identification]], [[2010:Audio_Tag_Classification]], [[2010:Audio_Music_Similarity_and_Retrieval]]) at: [https://mail.lis.uiuc.edu/mailman/listinfo/mrx-com00 https://mail.lis.uiuc.edu/mailman/listinfo/mrx-com00]. If you wish to participate in any of these tasks please sign up to this mailing list as discussion of the task format and evaluation should be conducted there.<br />
<br />
== Data ==<br />
Two datasets will be used to evaluate tagging algorithms: The MajorMiner and Mood tag datasets.<br />
<br />
<br />
=== MajorMiner Tag Dataset ===<br />
The tags come from the [http://majorminer.org MajorMiner game]. <br />
All of the data is browseable via the [http://majorminer.org/search MajorMiner search] page.<br />
<br />
The music consists of 2300 clips selected at random from 3900 tracks. Each clip is 10 seconds long. The 2300 clips represent a total of 1400 different tracks on 800 different albums by 500 different artists. To give a sense for the music collection, the following genre tags have been applied to these artists, albums, and tracks on Last.fm: electronica, rock, indie, alternative, pop, britpop, idm, new wave, hip-hop, singer-songwriter, trip-hop, post-punk, ambient, jazz.<br />
<br />
<br />
The MajorMiner game has collected a total of about 73000 taggings, 12000 of which have been verified by at least two users. In these verified taggings, there are 43 tags that have been verified at least 35 times, for a total of about 9000 verified uses. These are the tags we will be using in this task.<br />
<br />
Note that these data do not include strict negative labels. While many clips are tagged ''rock'', none are tagged ''not rock''. Frequently, however, a clip will be tagged many times without being tagged ''rock''. We take this as an indication that ''rock'' does not apply to that clip. More specifically, a negative example of a particular tag is a clip on which another tag has been verified, but the tag in question has not.<br />
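<br />
For illustration, positive and negative examples for a tag could be derived from the verified taggings roughly as follows (the dictionary contents are made-up examples, not actual MajorMiner data):<br />
<br />
<pre><br />
# A clip is a negative example for a tag if some other tag has been<br />
# verified on it but the tag in question has not.<br />
verified_tags = {<br />
    "clip01.wav": {"rock", "guitar", "male"},<br />
    "clip02.wav": {"techno", "synth"},<br />
}<br />
<br />
def examples_for(tag, verified_tags):<br />
    positives = [c for c, tags in verified_tags.items() if tag in tags]<br />
    negatives = [c for c, tags in verified_tags.items() if tags and tag not in tags]<br />
    return positives, negatives<br />
<br />
pos, neg = examples_for("rock", verified_tags)<br />
print(pos)  # ['clip01.wav']<br />
print(neg)  # ['clip02.wav']<br />
</pre><br />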
<br />
Here is a list of the top 50 tags along with an approximate number of times each has been verified, how many times it's been used in total, and how many different users have ever used it:<br />
<br />
{| class="wikitable" style="margin: 1em auto 1em auto"<br />
! Tag || Verified || Total || Users<br />
|-<br />
| drums || 962 || 3223 || 127 <br />
|-<br />
| guitar || 845 || 3204 || 181 <br />
|-<br />
| male || 724 || 2452 || 95 <br />
|-<br />
| rock || 658 || 2619 || 198 <br />
|-<br />
| synth || 498 || 1889 || 105 <br />
|-<br />
| electronic || 490 || 1878 || 131 <br />
|-<br />
| pop || 479 || 1761 || 151 <br />
|-<br />
| bass || 417 || 1632 || 99 <br />
|-<br />
| vocal || 355 || 1378 || 99 <br />
|-<br />
| female || 342 || 1387 || 100 <br />
|-<br />
| dance || 322 || 1244 || 115 <br />
|-<br />
| techno || 246 || 943 || 104 <br />
|-<br />
| piano || 179 || 826 || 120 <br />
|-<br />
| electronica || 168 || 686 || 67 <br />
|-<br />
| hip hop || 166 || 701 || 126 <br />
|-<br />
| voice || 160 || 790 || 55 <br />
|-<br />
| slow || 157 || 727 || 90 <br />
|-<br />
| beat || 154 || 708 || 90 <br />
|-<br />
| rap || 151 || 723 || 129 <br />
|-<br />
| jazz || 136 || 735 || 154 <br />
|-<br />
| 80s || 130 || 601 || 94 <br />
|-<br />
| fast || 109 || 494 || 70 <br />
|-<br />
| instrumental || 103 || 539 || 62 <br />
|-<br />
| drum machine || 89 || 427 || 35 <br />
|-<br />
| british || 81 || 383 || 60 <br />
|-<br />
| country || 74 || 360 || 105 <br />
|-<br />
| distortion || 73 || 366 || 55 <br />
|-<br />
| saxophone || 70 || 316 || 86 <br />
|-<br />
| house || 65 || 298 || 66 <br />
|-<br />
| ambient || 61 || 335 || 78 <br />
|-<br />
| soft || 61 || 351 || 58 <br />
|-<br />
| silence || 57 || 200 || 35 <br />
|-<br />
| r&b || 57 || 242 || 59 <br />
|-<br />
| strings || 55 || 252 || 62 <br />
|-<br />
| quiet || 54 || 261 || 57 <br />
|-<br />
| solo || 53 || 268 || 56 <br />
|-<br />
| keyboard || 53 || 424 || 41 <br />
|-<br />
| punk || 51 || 242 || 76 <br />
|-<br />
| horns || 48 || 204 || 38 <br />
|-<br />
| drum and bass || 48 || 191 || 50 <br />
|-<br />
| noise || 46 || 249 || 61 <br />
|-<br />
| funk || 46 || 266 || 90 <br />
|-<br />
| acoustic || 40 || 193 || 58 <br />
|-<br />
| trumpet || 39 || 174 || 68 <br />
|-<br />
| end || 38 || 178 || 36 <br />
|-<br />
| loud || 37 || 218 || 62 <br />
|-<br />
| organ || 35 || 169 || 46 <br />
|-<br />
| metal || 35 || 178 || 64 <br />
|-<br />
| folk || 33 || 195 || 58 <br />
|-<br />
| trance || 33 || 226 || 49 <br />
|}<br />
<br />
<br />
=== Mood Tag Dataset ===<br />
The Mood tag dataset is derived from mood related tags on last.fm. All tags in this set are identified by a general affect lexicon (WordNet-Affect) and by human experts. Similar tags are grouped together to define a mood tag group and each song may belong to multiple mood tag groups.<br />
<br />
There are 18 mood tag groups containing 135 unique tags. The dataset contains 3,469 unique songs. The following table lists the tag groups, their member tags and number of songs in each group: <br />
<br />
{| class="wikitable" style="margin: 1em auto 1em auto"<br />
! Group id || Tags || num. of tags || num. of songs<br />
|-<br />
| G12 || calm, comfort, quiet, serene, mellow, chill out, calm down, calming, chillout, comforting, content, cool down, mellow music, mellow rock, peace of mind, quietness, relaxation, serenity, solace, soothe, soothing, still, tranquil, tranquility, tranquility || 25 || 1,680<br />
|-<br />
| G15 || sad, sadness, unhappy, melancholic, melancholy, feeling sad, mood: sad - slightly, sad song || 8 || 1,178<br />
|-<br />
| G5 || happy, happiness, happy songs, happy music, glad, mood: happy || 6 || 749<br />
|-<br />
| G32 || romantic, romantic music || 2 || 619<br />
|-<br />
| G2 || upbeat, gleeful, high spirits, zest, enthusiastic, buoyancy, elation, mood: upbeat|| 8 || 543<br />
|-<br />
| G16 || depressed, blue, dark, depressive, dreary, gloom, darkness, depress, depression, depressing, gloomy || 11 || 471<br />
|-<br />
| G28 || anger, angry, choleric, fury, outraged, rage, angry music || 7 || 254<br />
|-<br />
| G17 || grief, heartbreak, mournful, sorrow, sorry, doleful, heartache, heartbreaking, heartsick, lachrymose, mourning, plaintive, regret, sorrowful || 14 || 183<br />
|-<br />
| G14 || dreamy || 1 || 146<br />
|-<br />
| G6 || cheerful, cheer up, festive, jolly, jovial, merry, cheer, cheering, cheery, get happy, rejoice, songs that are cheerful, sunny || 13 || 142<br />
|-<br />
| G8 || brooding, contemplative, meditative, reflective, broody, pensive, pondering, wistful || 8 || 116<br />
|-<br />
| G29 || aggression, aggressive || 2 || 115<br />
|-<br />
| G25 || angst, anxiety, anxious, jumpy, nervous, angsty || 6 || 80<br />
|-<br />
| G9 || confident, encouraging, encouragement, optimism, optimistic || 5 || 61<br />
|-<br />
| G7 || desire, hope, hopeful, mood: hopeful || 4 || 45<br />
|-<br />
| G11 || earnest, heartfelt || 2 || 40<br />
|-<br />
| G31 || pessimism, cynical, pessimistic, weltschmerz, cynical/sarcastic || 5 || 38<br />
|-<br />
| G1 || excitement, exciting, exhilarating, thrill, ardor, stimulating, thrilling, titillating || 8 || 30<br />
|-<br />
| TOTAL || || 135 || 6,490 <br />
|}<br />
<br />
The songs are mostly from the USPOP collection; a detailed breakdown of the songs is listed in the following table: <br />
<br />
{| class="wikitable" style="margin: 1em auto 1em auto"<br />
! Collection || num. of songs in the dataset || percentage of songs in the dataset<br />
|-<br />
| USPOP || 2764 || 80%<br />
|-<br />
| Assorted pop || 366 || 10%<br />
|-<br />
| American music || 145 || 4%<br />
|-<br />
| Beatles || 128 || 4%<br />
|-<br />
| USCRAP || 40 || 1%<br />
|-<br />
| Metal music || 25 || 1%<br />
|-<br />
| Magnatune || 1 || 0%<br />
|-<br />
| TOTAL || 3469 || 100%<br />
|}<br />
<br />
Details on how the mood tag groups were derived are described in [https://www.music-ir.org/archive/papers/ISMIR2009_MoodClassification.pdf X. Hu, J. S. Downie, A.Ehmann, Lyric Text Mining in Music Mood Classification, In Proceedings of the 10th International Symposium on Music Information Retrieval (ISMIR), Oct. 2009, Kobe , Japan] <br />
<br />
Details on how the songs were selected are available in the [https://www.music-ir.org/archive/papers/Mood_Multi_Tag_Data_Description.pdf description].<br />
<br />
== Evaluation ==<br />
Participating algorithms will be evaluated with 3-fold artist-filtered cross-validation. An introduction to the evaluation statistics computed is given in the following subsections.<br />
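<br />
As an illustration of what the artist filter means in practice, the sketch below builds such folds with scikit-learn's GroupKFold; this is an assumption-laden example only, not the procedure IMIRSEL actually uses to generate the splits:<br />
<br />
<pre><br />
# Artist-filtered 3-fold split: clips by the same artist never appear in<br />
# both the training and the test fold.<br />
from sklearn.model_selection import GroupKFold<br />
<br />
clips   = ["c1.wav", "c2.wav", "c3.wav", "c4.wav", "c5.wav", "c6.wav"]<br />
artists = ["art_A",  "art_A",  "art_B",  "art_B",  "art_C",  "art_C"]<br />
<br />
gkf = GroupKFold(n_splits=3)<br />
for fold, (train_idx, test_idx) in enumerate(gkf.split(clips, groups=artists)):<br />
    print("fold", fold,<br />
          "train:", [clips[i] for i in train_idx],<br />
          "test:",  [clips[i] for i in test_idx])<br />
</pre><br />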
<br />
<br />
=== Binary (Classification) Evaluation ===<br />
Algorithms are evaluated on their performance at tag classification using the F-measure. Results are also reported for simple accuracy; however, as this statistic is dominated by the negative example accuracy, it is not a reliable indicator of performance (a system that returns no tags for any example will achieve a high score on this statistic). The accuracies are also reported for positive and negative examples separately, as these can help elucidate the behaviour of an algorithm (for example, demonstrating whether the system is under- or over-predicting).<br />
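<br />
For reference, the per-tag statistics named above could be computed from binary ground truth and predictions roughly as in the following sketch (illustrative only, not the official evaluation code):<br />
<br />
<pre><br />
# truth/pred are parallel lists of 0/1 values for one tag (1 = tag applies).<br />
def binary_stats(truth, pred):<br />
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)<br />
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)<br />
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)<br />
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)<br />
    precision = tp / (tp + fp) if tp + fp else 0.0<br />
    recall    = tp / (tp + fn) if tp + fn else 0.0<br />
    f_measure = (2 * precision * recall / (precision + recall)<br />
                 if precision + recall else 0.0)<br />
    accuracy     = (tp + tn) / len(truth)<br />
    pos_accuracy = tp / (tp + fn) if tp + fn else 0.0  # accuracy on positive examples<br />
    neg_accuracy = tn / (tn + fp) if tn + fp else 0.0  # accuracy on negative examples<br />
    return precision, recall, f_measure, accuracy, pos_accuracy, neg_accuracy<br />
<br />
print(binary_stats([1, 0, 0, 1, 0], [1, 0, 1, 0, 0]))<br />
</pre><br />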
<br />
<br />
=== Affinity (Ranking) Evaluation ===<br />
Algorithms are evaluated on their performance at tag ranking using the Area Under the Receiver Operating Characteristic Curve (AUC-ROC). The affinity scores for each tag to be applied to a track are sorted prior to the computation of the AUC-ROC statistic, which gives higher scores to ranked tag sets where the correct tags appear towards the top of the set.<br />
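<br />
One common way to compute such an AUC-ROC for a single clip is the rank-based (Mann-Whitney) formulation sketched below; this is an illustrative assumption about the computation, and the official implementation may differ in detail:<br />
<br />
<pre><br />
# Given the affinity a submission assigned to every tag for one clip and<br />
# the set of tags that truly apply, AUC-ROC is the probability that a true<br />
# tag outranks a false one (ties count one half).<br />
def clip_auc(affinities, true_tags):<br />
    pos = [s for t, s in affinities.items() if t in true_tags]<br />
    neg = [s for t, s in affinities.items() if t not in true_tags]<br />
    if not pos or not neg:<br />
        return None  # undefined without both positive and negative tags<br />
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0<br />
               for p in pos for n in neg)<br />
    return wins / (len(pos) * len(neg))<br />
<br />
affinities = {"rock": 0.9, "guitar": 0.7, "vocal": 0.3, "jazz": 0.1}<br />
print(clip_auc(affinities, {"rock", "guitar"}))  # 1.0 in this toy example<br />
</pre><br />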
<br />
<br />
=== Ranking and significance testing ===<br />
Additionally, more standard tests could be performed on the average classification accuracy, although the cross-tag variance tends to increase each algorithm's variance, interfering with significance tests without further handling. One test that can help resolve these issues is Friedman's ANOVA with Tukey-Kramer HSD.<br />
<br />
We wish to compare a number of treatments/systems (the submissions) over a number of blocks/rows. We could compute average classification accuracy and/or precision metrics over all the tags and use the cross-validation folds as the blocks/rows, which would handle variance between different folds. However, we are more interested in considering each tag (averaged over all folds) or (perhaps better) each tag on each fold as a separate block.<br />
<br />
The Friedman test should handle the variance between tags (caused by different difficulties of modeling each tag and different numbers of positive and negative examples per tag) by replacing the actual scores achieved by each system on each block (tag) with the rank achieved by that system on that tag amongst all the systems. Hence, we make the assumption that each tag (or combination of tag and fold) is of equal importance in the evaluation. This is an often used approach at TREC (Text Retrieval Conference) when considering retrieval results (where each query is of equal importance, but unequal variance/difficulty).<br />
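<br />
The rank replacement can be illustrated with the small sketch below (toy scores only, and scipy is assumed to be available; the actual MIREX analysis, including the Tukey-Kramer HSD step, is performed in MATLAB):<br />
<br />
<pre><br />
# Blocks-by-systems score matrix: rows = tags (or tag/fold combinations),<br />
# columns = submissions.  The Friedman test effectively replaces each row<br />
# by within-row ranks before testing for differences between systems.<br />
import numpy as np<br />
from scipy.stats import friedmanchisquare, rankdata<br />
<br />
scores = np.array([          # toy per-tag F-measures, 4 blocks x 3 systems<br />
    [0.60, 0.55, 0.70],<br />
    [0.40, 0.45, 0.50],<br />
    [0.80, 0.70, 0.85],<br />
    [0.30, 0.35, 0.40],<br />
])<br />
<br />
ranks = np.apply_along_axis(rankdata, 1, scores)<br />
print(ranks.mean(axis=0))            # mean rank per system<br />
<br />
stat, p_value = friedmanchisquare(*scores.T)<br />
print(stat, p_value)<br />
</pre><br />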
<br />
Tukey-Kramer Honestly Significant Difference (HSD) multiple comparisons are made over the results of Friedman's ANOVA because the ANOVA itself (like other omnibus tests) can only safely tell you that some system differs from the rest; if you instead try to do the full NxN pairwise comparisons with simple tests (such as repeatedly applied Student's t-tests), then the experiment-wide alpha value is cumulative over all the tests. E.g. if we compared 12 systems at an alpha level of 0.05, a total of 66 pairwise comparisons are made and the chance of incorrectly rejecting the hypothesis of no difference in error rates is: 1 - (0.95^66) = 0.97 = 97%. This explanation is lifted from a paper by Tague-Sutcliffe and Blustein:<br />
<br />
@article{taguesutcliffe1995sat,<br />
title={A Statistical Analysis of the TREC-3 Data},<br />
author={Tague-Sutcliffe, J. and Blustein, J.},<br />
journal={Overview of the Third Text Retrieval Conference (Trec-3)},<br />
year={1995},<br />
publisher={DIANE Publishing}<br />
}<br />
<br />
For further details on the use of Friedman's ANOVA with Tukey-Kramer HSD in MIR, please see:<br />
<br />
@InProceedings{jones2007hsj,<br />
title={"Human Similarity Judgments: Implications for the Design of Formal Evaluations"},<br />
author="M.C. Jones and J.S. Downie and A.F. Ehmann",<br />
BOOKTITLE ="Proceedings of ISMIR 2007 International Society of Music Information Retrieval", <br />
year="2007"<br />
}<br />
<br />
=== Runtime performance ===<br />
In addition, computation times for feature extraction and training/classification will be measured.<br />
<br />
<br />
== Submission format ==<br />
Submission to this task will have to conform to a specified format detailed below, which is very similar to the audio genre classification task, among others.<br />
<br />
<br />
=== Audio formats ===<br />
Participating algorithms will have to read audio in the following format:<br />
<br />
* Sample rate: 44 kHz<br />
* Sample size: 16 bit<br />
* Number of channels: 2 (stereo)<br />
* Encoding: WAV (decoded from MP3 files by IMIRSEL)<br />
* Duration: 10 second clips<br />
<br />
<br />
=== Implementation details ===<br />
Scratch folders will be provided for all submissions for the storage of feature files and any model files to be produced. Executables will have to accept the path to their scratch folder as a command line parameter. Executables will also have to track which feature files correspond to which audio files internally. To facilitate this process, unique filenames will be assigned to each audio track.<br />
<br />
The audio files to be used in the task will be specified in a simple ASCII list file. For feature extraction and classification this file will contain one path per line with no header line. For model training this file will contain one path per line, followed by a tab character and the tag label, again with no header line. Executables will have to accept the path to these list files as a command line parameter. The formats for the list files are specified below.<br />
<br />
Algorithms should divide their feature extraction and training/classification into separate executables/scripts. This will facilitate a single feature extraction step for the task, while training and classification can be run for each cross-validation fold.<br />
<br />
Multi-processor compute nodes (8 cores) will be used to run this task. Hence, participants should attempt to use parallelism wherever possible. Ideally, the number of threads to use should be specified as a command line parameter. Alternatively, implementations may be provided in hard-coded 2, 4 or 8 thread configurations. Single-threaded submissions will, of course, be accepted but may be disadvantaged by time constraints.<br />
<br />
<br />
=== I/O formats ===<br />
In this section the input and output files used in this task are described as are the command line calling format requirements for submissions.<br />
<br />
<br />
==== Feature extraction list file ====<br />
The list file passed for feature extraction will be a simple ASCII list file. This file will contain one path per line with no header line.<br />
<br />
I.e.<br />
<example path and filename><br />
<br />
E.g. <br />
/path/to/track1.wav<br />
/path/to/track2.wav<br />
...<br />
<br />
==== Training list file ====<br />
The list file passed for model training will be a simple ASCII list file. This file will contain one path per line, followed by a tab character and a tag label, again with no header line.<br />
<br />
I.e. <br />
<br />
<example path and filename>\t<tag classification>\n<br />
<br />
<br />
E.g.<br />
/path/to/track1.wav drum<br />
/path/to/track1.wav silence<br />
...<br />
<br />
<br />
In this way, the input file will represent the sparse ground truth matrix. While no line will be duplicated, multiple lines may contain the same path, one for each tag associated with that clip. Any tag that is not specified as applying to a clip does not apply to that clip. The ordering of the lines is arbitrary and should not be depended upon.<br />
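<br />
A submission might parse this list into the sparse ground-truth structure it represents roughly as follows (an illustrative sketch only; the file name and helper are assumptions):<br />
<br />
<pre><br />
# Read the tab-separated training list into a {clip: set(tags)} mapping;<br />
# any (clip, tag) pair not listed is a negative example.<br />
from collections import defaultdict<br />
<br />
def read_training_list(path):<br />
    tags_per_clip = defaultdict(set)<br />
    with open(path) as f:<br />
        for line in f:<br />
            line = line.rstrip("\n")<br />
            if not line:<br />
                continue<br />
            clip, tag = line.split("\t", 1)<br />
            tags_per_clip[clip].add(tag)<br />
    return tags_per_clip<br />
<br />
# tags = read_training_list("trainListFile.txt")<br />
# tags["/path/to/track1.wav"]  ->  {"drum", "silence"}<br />
</pre><br />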
<br />
==== Test (classification) list file ====<br />
The list file passed for testing classification will be a simple ASCII list file identical in format to the Feature extraction list file. This file will contain one path per line with no header line.<br />
<br />
I.e.<br />
<example path and filename><br />
<br />
E.g. <br />
/path/to/track1.wav<br />
/path/to/track2.wav<br />
...<br />
<br />
==== Classification output files ====<br />
Participating algorithms should produce '''two''' simple ASCII list files similar in format to the Training list file. The path to which each list file should be written must be accepted as a parameter on the command line.<br />
<br />
<br />
===== Tag Affinity file =====<br />
The first file will contain one path per line, followed by a tab character and the tag label, followed by another tab character and the affinity of that tag for that file, again with no header line.<br />
<br />
I.e.:<br />
<br />
<example path and filename>\t<tag classification>\t<affinity>\n<br />
<br />
E.g.:<br />
<br />
/data/file1.wav rock 0.9<br />
/data/file1.wav guitar 0.7<br />
/data/file1.wav vocal 0.3<br />
/data/file2.wav rock 0.5<br />
...<br />
<br />
In this way, the output file will represent the sparse classification matrix. A path should be repeated on a separate line for each tag that the submission deems applies to it. If a (path, tag) pair is not specified, it will be assumed to have an affinity of 0. The ordering of the lines is not important and can be arbitrary.<br />
<br />
The affinity will be used for retrieval evaluation metrics, and its only specification is that for a given tag, larger (closer to +infinity) numbers indicate that the tag is more appropriate to a clip than smaller (closer to -infinity) numbers. As submissions are asked to also return a binary relevance listing, submissions that do not compute an affinity should provide only the binary relevance listing file.<br />
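<br />
Writing this file from an in-memory score structure could look roughly like the sketch below (names and values are illustrative assumptions, not a prescribed implementation):<br />
<br />
<pre><br />
# Write the sparse affinity matrix in the format above; omitted<br />
# (path, tag) pairs are treated as affinity 0 by the evaluator.<br />
def write_affinity_file(path, affinities):<br />
    with open(path, "w") as f:<br />
        for clip, tag_scores in affinities.items():<br />
            for tag, score in tag_scores.items():<br />
                f.write("%s\t%s\t%f\n" % (clip, tag, score))<br />
<br />
write_affinity_file("outputAffinityFile.txt",<br />
                    {"/data/file1.wav": {"rock": 0.9, "guitar": 0.7},<br />
                     "/data/file2.wav": {"rock": 0.5}})<br />
</pre><br />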
<br />
<br />
===== Binary relevance file =====<br />
The second file to be produced is a binary version of the tag classifications, where a tag must be marked as relevant or not relevant to a track. This file will contain one path per line, followed by a tab character and the tag label, followed by another tab character and either a 1 or a 0 indicating the relevance of that tag for that file, again with no header line.<br />
<br />
I.e.:<br />
<br />
<example path and filename>\t<tag classification>\t<relevant? [0 | 1]>\n<br />
<br />
E.g.:<br />
<br />
/data/file1.wav rock 1<br />
/data/file1.wav guitar 1<br />
/data/file1.wav vocal 0<br />
/data/file2.wav rock 1<br />
...<br />
<br />
If a (path, tag) pair is not specified, it will be assumed to be non-relevant (0). Any line with a path and tag but no numerical value will be assumed to be relevant (1).<br />
<br />
Hence, the following is equivalent to the example above:<br />
<br />
/data/file1.wav rock<br />
/data/file1.wav guitar<br />
/data/file2.wav rock<br />
<br />
The ordering of the lines is not important and can be arbitrary.<br />
<br />
<br />
=== Example submission calling formats ===<br />
extractFeatures.sh /path/to/scratch/folder /path/to/featureExtractionListFile.txt<br />
TrainAndClassify.sh /path/to/scratch/folder /path/to/trainListFile.txt /path/to/testListFile.txt /path/to/outputAffinityFile.txt /path/to/outputBinaryRelevanceFile.txt<br />
<br />
extractFeatures.sh -numThreads 8 /path/to/scratch/folder /path/to/featureExtractionListFile.txt<br />
TrainAndClassify.sh -numThreads 8 /path/to/scratch/folder /path/to/trainListFile.txt /path/to/testListFile.txt /path/to/outputAffinityFile.txt /path/to/outputBinaryRelevanceFile.txt<br />
<br />
extractFeatures.sh /path/to/scratch/folder /path/to/featureExtractionListFile.txt<br />
Train.sh /path/to/scratch/folder /path/to/trainListFile.txt <br />
Classify.sh /path/to/scratch/folder /path/to/testListFile.txt /path/to/outputAffinityFile.txt /path/to/outputBinaryRelevanceFile.txt<br />
<br />
myAlgo.sh -extract -numThreads 8 /path/to/scratch/folder /path/to/featureExtractionListFile.txt<br />
myAlgo.sh -TrainAndClassify -numThreads 8 /path/to/scratch/folder /path/to/trainListFile.txt /path/to/testListFile.txt /path/to/outputAffinityFile.txt /path/to/outputBinaryRelevanceFile.txt<br />
<br />
myAlgo.sh -extract /path/to/scratch/folder /path/to/featureExtractionListFile.txt<br />
myAlgo.sh -train /path/to/scratch/folder /path/to/trainListFile.txt <br />
myAlgo.sh -classify /path/to/scratch/folder /path/to/testListFile.txt /path/to/outputAffinityFile.txt /path/to/outputBinaryRelevanceFile.txt<br />
<br />
<br />
=== Packaging submissions ===<br />
All submissions should be statically linked to all libraries (the presence of dynamically linked libraries cannot be guaranteed).<br />
<br />
All submissions should include a README file including the following information:<br />
<br />
* Command line calling format for all executables<br />
* Number of threads/cores used or whether this should be specified on the command line<br />
* Expected memory footprint<br />
* Expected runtime<br />
* Approximately how much scratch disk space will the submission need to store any feature/cache files?<br />
* Any required environments, libraries and architectures (including version information) such as Matlab, Java, Python, Bash, Ruby etc.<br />
* Any special notices regarding running your algorithm <br />
<br />
Note that the information that you place in the README file is extremely important in ensuring that your submission is evaluated properly.<br />
<br />
=== Time and hardware limits ===<br />
Due to the potentially high number of participants in this and other audio tasks, hard limits on the runtime of submissions will be specified.<br />
<br />
A hard limit of 72 hours will be imposed on the full execution of a submission on each dataset (including feature extraction time and the 3 training/testing cycles required for the 3-fold cross-validated experiment). <br />
<br />
These limits will likely be strictly imposed at MIREX 2010 (due to the very high level of participation that is expected).<br />
<br />
<br />
== Submission opening date ==<br />
<br />
Friday 4th June 2010<br />
<br />
== Submission closing date ==<br />
TBA</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2010:Audio_Music_Mood_Classification&diff=71652010:Audio Music Mood Classification2010-06-07T19:06:17Z<p>IMIRSELBot: Robot: Automated text replacement (-\[http:\/\/(www.)?music-ir.org\/mirex\/200([5-9])\/index.php\/([^\s]+)( .+)?\] +200\2:\3)</p>
<hr />
<div>== Description ==<br />
<br />
The text of this section is copied from the 2009 page. Please add your comments and discussions for 2010. <br />
<br />
This section is put here to clarify what will happen for this year's run of the Audio Mood Classification (AMC) task.<br />
<br />
# We will operate the AMC task as a classic train-test classification task.<br />
# We will n-fold the runs with n to be determined by the size of the final data set, number of participants, etc.<br />
# We will hand-craft the n-fold test-train split lists.<br />
# We will NOT be doing post-run human mood judgments this year using the Evalutron 6000. <br />
# Audio files: 30 sec., 22kHz, mono, 16 bit<br />
<br />
Do take a look at the [[Audio Genre Classification]] task wiki as we are basing the underlying structure of this task on Audio Genre. In fact, an Audio Genre submission should work out of the box with Audio Mood Classification. Note: we really want folks to do a FEATURE EXTRACTION phase first against all the files and then have these features cached some place for re-use during the TRAIN-TEST phase. This way we can really speed up the n-fold processing. Thus, like GENRE, we need to pass three input files to your algos:<br />
<br />
<br />
== Discussions for 2010 ==<br />
Your comments here.<br />
<br />
<br />
==== 1. Feature extraction list file ====<br />
The list file passed for feature extraction will be a simple ASCII list <br />
file. This file will contain one path per line with no header line.<br />
<br />
==== 2. Training list file ====<br />
The list file passed for model training will be a simple ASCII list <br />
file. This file will contain one path per line, followed by a tab character and <br />
the genre label, again with no header line. <br />
<br />
E.g. <example path and filename>\t<mood classification><br />
<br />
==== 3. Test (classification) list file ====<br />
The list file passed for testing classification will be a simple ASCII list <br />
file identical in format to the Feature extraction list file. This file will <br />
contain one path per line with no header line.<br />
<br />
==== Classification output files ====<br />
Participating algorithms should produce a simple ASCII list file identical in <br />
format to the Training list file. This file will contain one path per line, <br />
followed by a tab character and the MOOD label, again with no header line. <br />
E.g.:<br />
<example path and filename>\t<mood classification><br />
<br />
The path to which this list file should be written must be accepted as a <br />
parameter on the command line.<br />
<br />
== Introduction ==<br />
In music psychology and music education, the emotion component of music has been recognized as the one most strongly associated with music expressivity (e.g. Juslin et al. 2006 [[#Related Papers]]). Music information behavior studies (e.g. Cunningham, Jones and Jones 2004; Vignoli 2004; Cunningham, Bainbridge and Falconer 2006 [[#Related Papers]]) have also identified music mood/emotion as an important criterion used by people in music seeking and organization. Several experiments have been conducted in the MIR community to classify music by mood (e.g. Lu, Liu and Zhang 2006, Pohle, Pampalk, and Widmer 2005, Mandel, Poliner and Ellis 2006, Feng, Zhuang and Pan 2003 [[#Related Papers]]). Please note: the MIR community tends to use the word "mood" while music psychologists like to use "emotion". We follow the MIR tradition and use "mood" hereafter. <br />
<br />
However, evaluation of music mood classification is difficult, as music mood is a very subjective notion. Each aforementioned experiment used different mood categories and different datasets, making comparison with previous work virtually impossible. A contest on music mood classification in MIREX will help build the first community-available test set and precious ground truth.<br />
<br />
This is the first time a music mood classification evaluation has been attempted in MIREX. There are many issues involved in this evaluation task, so let us start discussing them on this wiki. If needed, we will set up a mailing list devoted to the discussion.<br />
<br />
== Mood Categories ==<br />
<br />
The IMIRSEL has derived a set of 5 mood clusters from the AMG mood repository (Hu & Downie 2007 [[#Related Papers]]). The mood clusters effectively reduce the diverse mood space to a tangible set of categories, yet are rooted in the social-cultural context of pop music. Therefore, we propose to use the 5 mood clusters as the categories in this year's audio mood classification contest. Each of the clusters is a collection of the AMG mood labels which collectively define the cluster: <br />
<br />
*Cluster_1: passionate, rousing, confident, boisterous, rowdy <br />
*Cluster_2: rollicking, cheerful, fun, sweet, amiable/good natured <br />
*Cluster_3: literate, poignant, wistful, bittersweet, autumnal, brooding <br />
*Cluster_4: humorous, silly, campy, quirky, whimsical, witty, wry <br />
*Cluster_5: aggressive, fiery, tense/anxious, intense, volatile, visceral <br />
<br />
At this moment, the IMIRSEL and Cyril Laurier at the Music Technology Group of Barcelona have manually validated the mood clusters and exemplar songs in each cluster. Please see [[#Exemplar Songs in Each Category]] for details. <br />
<br />
We are still seeking additional songs across different genres to enrich this set. During this process, the cluster with the least cross-listener consistency may be dropped, or two clusters that are often confused with each other may be combined. <br />
<br />
== Exemplar Songs in Each Category == <br />
Exemplar songs for each mood cluster are manually selected by multiple human assessors. The purpose is to further clarify the perceptual identities of the mood clusters.<br />
<br />
There are 190 candidate songs in the intersection of the AMG mood repository and the USPOP collection in IMIRSEL, and each of these songs has only one unanimous mood cluster label assigned by AMG editors. The mood labels by AMG editors are an important benchmark which can help us reach cross-listener consistency on such a subjective task. So far, 6 human assessors have listened to the 190 songs and assigned cluster labels to them. 50 songs are unanimously labeled by the 6 human assessors, 42 songs are unanimously labeled by 5 of the 6 human assessors, and another 40 songs by 4 of the 6 human assessors. <br />
<br />
The advantages of the exemplar songs are twofold: 1. they will help people better understand what kind of mood each cluster refers to; 2. they can possibly be taken as training data for the algorithms (see the section on the [[#Training Set]]). <br />
<br />
Note on lyrics: when labeling the songs, the human assessors were asked to ignore lyrics. As this contest focuses on music audio, lyrics should not be taken into consideration. <br />
<br />
== Two Evaluation Scenarios ==<br />
<br />
1. Evaluation on a closed groundtruth set.<br />
As in traditional classification problems, both training and testing data are labeled well before the contest. <br />
Pros: evaluation metrics are more rigorous; supports cross-validation <br />
Cons: training/testing set is limited<br />
<br />
2. Training on a labeled set, but testing on an unlabeled audio pool <br />
As in audio similarity and retrieval contest, each algorithm returns a list of candidates in each mood category, then human assessors make judgments on the returned candidates. <br />
Pros: the testing pool can be arbitrarily big; the training set is bigger as well (it can be the whole groundtruth set of scenario 1). <br />
Cons: innovative but limited evaluation metrics (see below)<br />
<br />
For both scenarios, this is a single-label classification contest, and thus each song can only be classified into one mood cluster. <br />
<br />
'''We will go for scenario 1'''<br />
<br />
== Groundtruth Set ==<br />
<br />
The IMIRSEL is preparing a ground-truth set of audio clips selected from the USPOP collection described above and the APM collection (www.apmmusic.com). The bibliographic information of the exemplar songs has been released above to help participants reach agreement on the meanings of the mood categories.<br />
<br />
The APM audio set has been pre-labeled with the 5 mood clusters according to their metadata provided by APM, and covers a variety of genres: each category covers about 7 major genres (with 20-30 tracks each) and a few minor genres. To make the problem more interesting, the distribution among major genres within each category is made as even as possible. <br />
<br />
To make sure the mood labels are correct, this APM audio collection will be subjected to human validation before the contest. We prepared a set of 1250 audio clips (250 per category). The audio clips whose mood category assignments reach agreement among 2 out of 3 human assessors will serve as the ground truth set. We are aiming for at least 120 audio clips in each mood category. <br />
<br />
After the human validation on this audio set, participating algorithms/models will be trained and tested within IMIRSEL.<br />
<br />
'''Audio format: 30 second clips, 22.05kHz, mono, 16bit, WAV files''' <br />
<br />
=== Human Validation ===<br />
Subjective judgments by human assessors will be collected for the above mentioned APM audio set using a web-based system, Evalutron6000, developed by the IMIRSEL. <br />
<br />
Each audio clip is 30 seconds long, and 3 human judges will listen to it and choose which mood category it belongs to. If 2 of the 3 judges agree on its category, the audio clip will be selected into the groundtruth set.<br />
<br />
== Evaluation Metrics == <br />
<br />
Metrics frequently used in classification problems include: accuracy, precision, recall and F-measures (combining precision and recall). The single most important metric would be accuracy, which allows direct system comparison: <br />
<br />
''Accuracy = # of correctly classified songs / # of all songs.'' <br />
<br />
Accuracy can be calculated over all clusters pooled together (micro average) or for each cluster separately and then averaged (macro average).<br />
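<br />
The two averages could be computed as in the following illustrative sketch (the label lists are made-up placeholders, not MIREX data):<br />
<br />
<pre><br />
# Micro average: accuracy pooled over all clips.<br />
# Macro average: per-cluster accuracies, then their mean.<br />
from collections import defaultdict<br />
<br />
truth = ["Cluster_1", "Cluster_1", "Cluster_2", "Cluster_3", "Cluster_3"]<br />
pred  = ["Cluster_1", "Cluster_2", "Cluster_2", "Cluster_3", "Cluster_1"]<br />
<br />
micro = sum(t == p for t, p in zip(truth, pred)) / len(truth)<br />
<br />
per_cluster = defaultdict(lambda: [0, 0])          # cluster -> [correct, total]<br />
for t, p in zip(truth, pred):<br />
    per_cluster[t][1] += 1<br />
    per_cluster[t][0] += int(t == p)<br />
macro = sum(c / n for c, n in per_cluster.values()) / len(per_cluster)<br />
<br />
print(micro, macro)   # 0.6 and (0.5 + 1.0 + 0.5) / 3 = 0.666...<br />
</pre><br />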
<br />
The significance of differences among systems will be tested, possibly using:<br />
<br />
*a) McNemar's test <br />
<br />
McNemar's test (Dietterich, 1997) is a statistical procedure that can validate the significance of differences between two classifiers. It was used in the Audio Genre Classification and Audio Artist Identification contests in MIREX 2005. <br />
<br />
*b) Friedman's test<br />
<br />
Friedman's test is used to detect differences in treatments across multiple test attempts (http://en.wikipedia.org/wiki/Friedman_test). It was used in the Audio Similarity, Audio Cover Song, and Query by Singing/Humming contests in MIREX 2006. <br />
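<br />
For illustration, McNemar's test for two classifiers evaluated on the same songs could be computed as sketched below (the disagreement counts are made up, scipy is assumed to be available, and this is not the official evaluation code):<br />
<br />
<pre><br />
# Only the disagreements matter: b = songs only classifier A gets right,<br />
# c = songs only classifier B gets right.<br />
from scipy.stats import chi2<br />
<br />
b, c = 30, 14   # illustrative disagreement counts<br />
<br />
# McNemar statistic with continuity correction, chi-square with 1 d.o.f.<br />
stat = (abs(b - c) - 1) ** 2 / (b + c)<br />
p_value = chi2.sf(stat, df=1)<br />
print(stat, p_value)   # a small p-value suggests the two classifiers differ<br />
</pre><br />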
<br />
In addition, run time can be recorded and compared.<br />
<br />
== Important Dates ==<br />
<br />
<br />
* Algorithm Submission Deadline: TBA<br />
<br />
== Packaging your Submission ==<br />
* Be sure that your submission follows the [[#Submission_Format]] outlined below.<br />
* Be sure that your submission accepts the proper [[#Input_File]] format<br />
* Be sure that your submission produces the proper [[#Output_File]] format<br />
* Be sure to follow the [[2006:Best_Coding_Practices_for_MIREX]]<br />
* Be sure to follow the [[MIREX 2010 Submission Instructions]] <br />
* In the README file that is included with your submission, please answer the following additional questions:<br />
** Approximately how long will the submission take to process ~1000 wav files?<br />
** Approximately how much scratch disk space will the submission need to store any feature/cache files?<br />
** Any special notices regarding running your algorithm<br />
<br />
Note that the information that you place in the README file is '''extremely''' important in ensuring that your submission is evaluated properly.<br />
<br />
== Submission Format ==<br />
A submission to the Audio Music Mood Classification evaluation is expected to follow the [[2006:Best_Coding_Practices_for_MIREX]] and must conform to the following for execution:<br />
<br />
=== One Call Format ===<br />
The one call format is appropriate for systems that perform all phases of the classification (typically features extraction, training and testing) in one step. A submission should be an executable program that takes 4 arguments: <br />
* path/to/fileContainingListOfTrainingAudioClips - the path to the list of training audio clips (see [[#File Formats]] below)<br />
* path/to/fileContainingListOfTestingAudioClips - the path to the list of testing audio clips (see [[#File Formats]] below)<br />
* path/to/cacheDir - a directory where the submission can place temporary or scratch files. Note that the contents of this directory can be retained across runs, so if, for whatever reason, the submission needs to be restarted, the submission could make use of the contents of this directory to eliminate the need for reprocessing some inputs.<br />
* path/to/output/Results - the file where the output classification results should be placed. (see [[#File Formats]] below)<br />
<br />
'''Example:'''<br />
<br />
<pre><br />
<br />
doAMC "path/to/fileContainingListOfTrainingAudioClips" "path/to/fileContainingListOfTestingAudioClips" "path/to/cacheDir" "path/to/output/Results" <br />
<br />
</pre><br />
<br />
<br />
=== Two Call Format ===<br />
The two call format is appropriate for systems that perform the training and testing separately. A submission should consist of two executable programs:<br />
*trainAMC - this takes 3 arguments: <br />
** path/to/fileContainingListOfTrainingAudioClips - the path to the list of training audio clips (see [[#File Formats]] below)<br />
** path/to/trainingCacheDir - a directory where the submission can place temporary or scratch files. Note that the contents of this directory can be retained across runs, so if, for whatever reason, the submission needs to be restarted, the submission could make use of the contents of this directory to eliminate the need for reprocessing some inputs.<br />
** path/to/trainedClassificationModel - the file where the classification model should be placed<br />
*testAMC - this takes 4 arguments:<br />
** path/to/trainedClassificationModel<br />
** path/to/fileContainingListofTestingAudioClips - the path to the list of testing audio clips (see [[#File Formats]] below)<br />
** path/to/testingCacheDir - a directory where the submission can place temporary or scratch files. <br />
** path/to/output/Results - the file where the output classification results should be placed. (see [[#File Formats]] below)<br />
<br />
'''Example:'''<br />
<br />
<pre><br />
<br />
trainAMC "path/to/fileContainingListOfTrainingAudioClips" "path/to/trainingcacheDir" "path/to/trainedClassificationModel" <br />
testAMC "path/to/trainedClassificationModel" "path/to/fileContainingListofTestingAudioClips" "path/to/testingCacheDir" "path/to/output/Results"<br />
<br />
</pre><br />
<br />
=== Matlab format ===<br />
<br />
Matlab will also be supported in the form of functions in the following formats:<br />
<br />
==== Matlab One call format ====<br />
<pre><br />
doMyMatlabAMC('path/to/fileContainingListOfTrainingAudioClips','path/to/fileContainingListOfTestingAudioClips','path/to/cacheDir','path/to/output/Results')<br />
</pre><br />
<br />
<br />
==== Matlab Two call format ====<br />
<pre><br />
doMyMatlabTrainAMC('path/to/fileContainingListOfTrainingAudioClips','path/to/trainingcacheDir','path/to/trainedClassificationModel')<br />
doMyMatlabTestAMC('path/to/trainedClassificationModel','path/to/fileContainingListofTestingAudioClips','path/to/testingCacheDir','path/to/output/Results')<br />
</pre><br />
<br />
== File Formats ==<br />
<br />
=== Input Files ===<br />
<br />
The input training list file format will be of the form: <br />
<br />
<pre><br />
path/to/training/audio/file/000001.wav\tCluster_3<br />
path/to/training/audio/file/000002.wav\tCluster_5<br />
path/to/training/audio/file/000003.wav\tCluster_2<br />
...<br />
path/to/training/audio/file/00000N.wav\tCluster_1<br />
</pre><br />
<br />
"\t" stands for tab.<br />
<br />
The input testing list file format will be of the form: <br />
<br />
<pre><br />
path/to/testing/audio/file/000010.wav<br />
path/to/testing/audio/file/000020.wav<br />
path/to/testing/audio/file/000030.wav<br />
...<br />
path/to/testing/audio/file/0000N0.wav<br />
</pre><br />
<br />
"\t" stands for tab.<br />
<br />
=== Output File ===<br />
The only output will be a file containing classification results in the following format: <br />
<br />
<pre><br />
Example Classification Results 0.1 (replace this line with your system name)<br />
path/to/testing/audio/file/000010.wav\tCluster_3<br />
path/to/testing/audio/file/000020.wav\tCluster_1<br />
path/to/testing/audio/file/000030.wav\tCluster_5<br />
...<br />
path/to/testing/audio/file/0000N0.wav\tCluster_2<br />
</pre><br />
<br />
"\t" indicates tab. All audio clips should have one and only one mood cluster label.<br />
<br />
==Evaluation Scenario 2==<br />
<br />
=== Training Set ===<br />
<br />
Under evaluation scenario 2, the training set would be the whole ground truth set in scenario 1 (see [[#Groundtruth Set]]).<br />
<br />
=== Unlabeled Song Pool ===<br />
Under evaluation scenario 2, the pool of testing audio to be classified is drawn from the same collections as the training set, i.e. USPOP and APM. We will make sure the audio covers a variety of genres in each mood cluster, which will make the contest harder and more interesting.<br />
<br />
We will randomly select a certain number (say, 1000) of songs from the collections as the audio pool. This number should make the contest interesting enough, but not too hard. And the songs need to cover all 5 mood clusters.<br />
<br />
=== Classification Results ===<br />
Each algorithm will return the top X songs in each cluster. <br />
<br />
This is a single-label classification contest, and thus each song can only be classified into one mood cluster. <br />
<br />
Note: unlike traditional classification problems where all testing samples have ground truth available, this scenario does not have a well-labeled testing set. Instead, we use a "pooling" approach as in TREC and last year's audio similarity and retrieval contest. This approach collects the top X results from each algorithm and asks human assessors to make judgments on this set of collected results, while assuming all other samples are irrelevant or incorrect. This approach cannot measure an absolute "recall" metric, but it is valid for comparing relative performances among participating algorithms. <br />
<br />
The actual value of X depends on human assessment protocol and number of available human assessors (see next section [[#Human Assessment]]).<br />
<br />
=== Human Assessment===<br />
Subjective judgments by human assessors will be collected for the pooled results using a web-based system, Evalutron6000, developed by the IMIRSEL. (An introduction of this piece of Evalutron 6000 is shown here [[Evalutron6000_Walkthrough_For_Audio_Mood_Classification]]<br />
<br />
==== How many judgments and assessors ====<br />
Each algorithm returns X songs for each of the 5 mood clusters. Suppose there are Y algorithms; in the worst case, each cluster will have X*Y songs to be judged, i.e. 5*X*Y songs in total. Suppose each song needs Z sets of ears; then there will be 5*X*Y*Z judgments in total. When making a judgment, a human assessor will listen to the 30 second clip of a song and label it with one of the 5 mood clusters. <br />
<br />
Human evaluators will be drawn from the participating labs and volunteers from IMIRSEL or on the MIREX lists. Suppose we can get W evaluators; then each evaluator will evaluate S = (5*X*Y*Z) / W songs.<br />
<br />
At this moment, there are 10 potential participants on the Wiki, so let's say Y = 6. Suppose each candidate song will be evaluated by 3 judges, Z = 3, and suppose we can get 20 assessors, W = 20 (a small sketch of this arithmetic follows the list below): <br />
<br />
*If X = 20, number of judgments for each assessor: S = 90<br />
*If X = 10, S = 45<br />
*If X = 30, S = 135 <br />
*If X = 50, S = 225<br />
*If X = 15, S = 67.5<br />
*…<br />
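<br />
The per-assessor load S follows directly from the formula above, as in this small illustrative sketch:<br />
<br />
<pre><br />
# S = (5 * X * Y * Z) / W, evaluated for the example values used above<br />
# (Y = 6 algorithms, Z = 3 judges per song, W = 20 assessors).<br />
def judgments_per_assessor(X, Y=6, Z=3, W=20):<br />
    return 5 * X * Y * Z / W<br />
<br />
for X in (10, 15, 20, 30, 50):<br />
    print(X, judgments_per_assessor(X))   # 45, 67.5, 90, 135, 225<br />
</pre><br />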
<br />
In the audio similarity contest last year, each assessor made 205 judgments on average. As the judgment for mood is trickier, we may need to place less of a burden on our assessors.<br />
<br />
To eliminate possible bias, we will try to distribute the candidates returned by each algorithm equally among the human assessors.<br />
<br />
=== Scoring ===<br />
Each algorithm is graded by the number of votes its candidate songs win from the judges. For example, if a song, A, is judged as being in Cluster_1 by 2 assessors and in Cluster_2 by 1 assessor, then the algorithm classifying A into Cluster_1 will score 2 on this song, while the algorithm classifying A into Cluster_2 will score 1 on this song. An algorithm's final score is the sum of its scores on all the songs it submits. Since each algorithm can only submit 100 songs, the one which wins the most votes from the judges wins the contest.<br />
<br />
=== Evaluation Metrics ===<br />
The algorithm score mentioned in the last section is a metric that facilitates direct comparison. <br />
<br />
In addition, metrics frequently used in classification problems include accuracy, precision, recall and F-measures (combining precision and recall). As mentioned above, the pooling approach only yields a relative recall measure; therefore, the single most important metric would be accuracy: <br />
<br />
The original definition of accuracy is:<br />
''Accuracy = # of correctly classified songs / # of all songs.'' <br />
<br />
According to the above human assessment method, "correctly classified songs" in this scenario can be defined as songs classified as the majority vote of the judges and, in the case of ties, songs classified as any of the tied votes. For example, suppose each song has 3 judges. If a song is labeled as Cluster_1 by at least 2 judges, then this song will be counted as correct for algorithms classifying it to Cluster_1; if a song is labeled as Cluster_1, Cluster_2 and Cluster_3 once each by the judges, then this song will be counted as correct for algorithms classifying it to Cluster_1, Cluster_2 or Cluster_3. <br />
<br />
Accuracy can be calculated over all clusters pooled together (micro average) or for each cluster separately and then averaged (macro average).<br />
<br />
The significance of differences among systems will be tested, possibly using:<br />
<br />
*a) McNemar's test <br />
*b) Friedman's test<br />
<br />
In addition, run time can be recorded and compared.<br />
<br />
== Challenging Issues == <br />
# Mood changeable pieces: some pieces may start from one mood but end up with another one. <br />
<br />
We will use 30 second clips instead of whole songs. The clips will be extracted automatically from the middle of the songs, which is more likely to be representative.<br />
<br />
# Multiple label classification: it is possible that one piece can have two or more correct mood labels, but as a start, we strongly suggest holding a less confusing contest and leaving this challenge to future MIREXs. So, for this year, this is a single label classification problem.<br />
<br />
== Moderators ==<br />
* J. Stephen Downie (IMIRSEL, University of Illinois, USA) - [mailto:jdownie@uiuc.edu]<br />
* Xiao Hu (IMIRSEL, University of Illinois, USA) -[mailto:xiaohu@uiuc.edu]<br />
* Cyril Laurier (Music Technology Group, Barcelona, Spain) -[mailto:claurier@iua.upf.edu]<br />
<br />
== Related Papers ==<br />
#Dietterich, T. (1997). '''Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms'''. Neural Computation, 10(7), 1895-1924.<br />
#Hu, Xiao and J. Stephen Downie (2007). '''Exploring mood metadata: Relationships with genre, artist and usage metadata'''. Accepted in the Eighth International Conference on Music Information Retrieval (ISMIR 2007),Vienna, September 23-27, 2007.<br />
# Juslin, P.N., Karlsson, J., Lindström, E., Friberg, A. and Schoonderwaldt, E. (2006), '''Play It Again With Feeling: Computer Feedback in Musical Communication of Emotions'''. In Journal of Experimental Psychology: Applied, 2006, Vol.12, No.2, 79-95.<br />
# [http://ismir2004.ismir.net/proceedings/p075-page-415-paper152.pdf Vignoli (ISMIR 2004)] '''Digital Music Interaction Concepts: A User Study'''<br />
# [http://ismir2004.ismir.net/proceedings/p082-page-447-paper221.pdf Cunningham, Jones and Jones (ISMIR 2004)] '''Organizing Digital Music For Use: An Examination of Personal Music Collections'''.<br />
# [http://ismir2006.ismir.net/PAPERS/ISMIR0685_Paper.pdf Cunningham, Bainbridge and Falconer (ISMIR 2006)] '''"More of an Art than a Science": Supporting the Creation of Playlists and Mixes'''.<br />
# Lu, Liu and Zhang (2006), '''Automatic Mood Detection and Tracking of Music Audio Signals'''. IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 1, JANUARY 2006 <br> Part of this paper appeared in ISMIR 2003 http://ismir2003.ismir.net/papers/Liu.PDF<br />
# [http://www.cp.jku.at/research/papers/Pohle_CBMI_2005.pdf Pohle, Pampalk, and Widmer (CBMI 2005)] '''Evaluation of Frequently Used Audio Features for Classification of Music into Perceptual Categories'''. <br> It separates "mood" and "emotion" as two classification dimensions, which are mostly combined in other studies.<br />
# [http://www.ee.columbia.edu/~dpwe/pubs/MandPE06-svm.pdf Mandel, Poliner and Ellis (2006)] '''Support vector machine active learning for music retrieval'''. Multimedia Systems, Vol.12(1). Aug.2006.<br />
# [http://doi.acm.org/10.1145/860435.860508 Feng, Zhuang and Pan (SIGIR 2003)] '''Popular music retrieval by detecting mood'''<br />
# [http://ismir2003.ismir.net/papers/Li.PDF Li and Ogihara (ISMIR 2003)] '''Detecting emotion in music'''<br />
# [http://pubdb.medien.ifi.lmu.de/cgi-bin//info.pl?hilliges2006audio Hilliges, Holzer, Klüber and Butz (2006)] '''AudioRadar: A metaphorical visualization for the navigation of large music collections'''. In Proceedings of the International Symposium on Smart Graphics 2006, Vancouver, Canada. <br> It summarizes implicit problems in traditional genre/artist-based music organization.<br />
# Juslin, P. N., & Laukka, P. (2004). '''Expression, perception, and induction of musical emotions: A review and a questionnaire study of everyday listening'''. Journal of New Music Research, 33(3), 217-238.<br />
# [http://mpac.ee.ntu.edu.tw/~yihsuan/ Yang, Liu, and Chen (ACMMM 2006)] '''Music emotion classification: A fuzzy approach '''<br />
<br />
<br />
<br />
== Potential Participants ==</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Music_Mood_Classification_Results&diff=71642008:Audio Music Mood Classification Results2010-06-07T18:57:20Z<p>IMIRSELBot: Robot: Automated text replacement (-/mirex/2008/results/ +/mirex/results/2008/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Music Mood Classification task. For background information about this task set please refer to the [[2008:Audio Music Mood Classification]] page. <br />
<br />
===General Legend===<br />
====Team ID====<br />
'''GP1''' = [https://www.music-ir.org/mirex/abstracts/2008/Peeters_2008_ISMIR_MIREX.pdf G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''HW''' = [https://www.music-ir.org/mirex/abstracts/2008/.pdf H. Wang]<br /><br />
'''KL''' = [https://www.music-ir.org/mirex/abstracts/2008/.pdf K. Lee]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/abstracts/2008/.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/abstracts/2008/.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/abstracts/2008/.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/abstracts/2008/.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
==Overall Summary Results==<br />
===MIREX 2008 Audio Mood Classification Summary Results - Raw Classification Accuracy Averaged Over Three Train/Test Folds===<br />
<br />
<csv>2008/mood/audiomood.avg.results.csv</csv><br />
<br />
=====Accuracy Across Folds=====<br />
<br />
<csv>2008/mood/audiomood.results.fold.csv</csv><br />
<br />
=====Accuracy Across Categories=====<br />
<br />
<csv>2008/mood/audiomood.results.class.csv</csv><br />
<br />
===MIREX 2008 Audio Artist Classification Evaluation Logs and Confusion Matrices===<br />
<br />
====MIREX 2008 Audio Mood Classification Run Times====<br />
<br />
<csv>2008/mood.runtime.csv</csv><br />
<br />
====CSV Files Without Rounding====<br />
[https://www.music-ir.org/mirex/results/2008/mood/audiomood_results_fold.csv audiomood_results_fold.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/mood/audiomood_results_class.csv audiomood_results_class.csv]<br /><br />
<br />
====Results By Algorithm====<br />
(.tar.gz) <br /><br />
'''GP1''' = [https://www.music-ir.org/mirex/results/2008/mood/GP1.tar.gz G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/results/2008/mood/GT1.tar.gz G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/results/2008/mood/GT2.tar.gz G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/results/2008/mood/GT3.tar.gz G. Tzanetakis]<br /><br />
'''HW''' = [https://www.music-ir.org/mirex/results/2008/mood/HW.tar.gz H. Wang]<br /><br />
'''KL''' = [https://www.music-ir.org/mirex/results/2008/mood/KL.tar.gz K. Lee]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/results/2008/mood/LRPPI1.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/results/2008/mood/LRPPI2.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/results/2008/mood/LRPPI3.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/results/2008/mood/LRPPI4.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/results/2008/mood/ME1.tar.gz I. M. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/results/2008/mood/ME2.tar.gz I. M. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/results/2008/mood/ME3.tar.gz I. M. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
===Friedman's Test for Significant Differences===<br />
====Classes vs. Systems====<br />
The Friedman test was run in MATLAB against the average accuracy for each class.<br />
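For reference, a rough Python/SciPy equivalent of this classes-vs-systems analysis is sketched below (the actual analysis used MATLAB's friedman and multcompare; per_class_accuracy is a hypothetical mapping from system ID to its list of per-class accuracies, with classes in the same order for every system):<br />
<pre>
from scipy.stats import friedmanchisquare

def friedman_classes_vs_systems(per_class_accuracy):
    """Friedman test with systems as treatments and classes as blocks."""
    systems = sorted(per_class_accuracy)
    # one argument per system; each is that system's per-class accuracies
    statistic, p_value = friedmanchisquare(*(per_class_accuracy[s] for s in systems))
    return systems, statistic, p_value
</pre>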
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/mood/perClassAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/mood/perClassAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_mood.perclassaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
====Folds vs. Systems====<br />
The Friedman test was run in MATLAB against the accuracy for each fold.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/mood/perFoldAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/mood/perFoldAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_mood.perfoldaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Melody_Extraction_Results&diff=71632008:Audio Melody Extraction Results2010-06-07T18:57:10Z<p>IMIRSELBot: Robot: Automated text replacement (-/mirex/2008/results/ +/mirex/results/2008/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Melody Extraction task set. For background information about this task set please refer to the [[2008:Audio Melody Extraction]] page. Special thanks to Jean-Louis Durrieu for doing the vocal/non-vocal split summaries.<br />
<br />
===General Legend===<br />
====Team ID==== <br />
<br />
'''PC''' = [https://www.music-ir.org/mirex/abstracts/2008/AudioMelodyExt_pcancela.pdf P. Cancela]<br /><br />
'''CLLY1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_multiF0_CC.pdf C. Cao, M. Li, J. Liu, Y. Yan 1]<br /><br />
'''CLLY2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_multiF0_CC.pdf C. Cao, M. Li, J. Liu, Y. Yan 2]<br /><br />
'''DRD1''' = [https://www.music-ir.org/mirex/abstracts/2008/durrieu_imm_gmm.pdf J-L. Durrieu, G. Richard, B. David 1]<br /><br />
'''DRD2''' = [https://www.music-ir.org/mirex/abstracts/2008/durrieu_imm_gmm.pdf J-L. Durrieu, G. Richard, B. David 2]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/abstracts/2008/ME_ryynanen.pdf M. Ryynänen, A. Klapuri]<br /><br />
'''VR''' = [https://www.music-ir.org/mirex/abstracts/2008/ME_rao.pdf V. Rao, P. Rao]<br /><br />
<br />
====Table Headings====<br />
'''Vx Recall''' = Voicing Detection<br /><br />
'''Vx False Alm''' = Voicing False Alarm<br /><br />
'''Vx d'''' = Voicing d-prime<br /><br />
'''Raw pitch''' = Raw Pitch Accuracy<br /><br />
'''Raw Chroma''' = Raw Chroma Accuracy<br /><br />
'''Overall Acc''' = Overall Accuracy<br /><br />
<br />
==Overall Summary Results==<br />
<br />
===MIREX 2008 Audio Melody Extraction Overall Summary results - Weighted (by Number of Files) Avg. of all Datasets - All===<br />
<csv>2008/am08_overall.csv</csv><br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - MIREX 2008 Dataset - All===<br />
<csv>2008/am08_m08_all.csv</csv><br />
<br />
[[Image:2008_am08_m08_all.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/results/2008/am08_persong_m08_all.xls Excel workbook] for MIREX 2008 Dataset - All (NB: all the songs in this dataset appear to be vocal, so there are no separate vocal/non-vocal results).<br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - MIREX 2005 Dataset - vocal===<br />
<csv>2008/am08_m05_vocal.csv</csv><br />
<br />
[[Image:2008_am08_m05_vocal.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/results/2008/am08_persong_m05_vocal.xls Excel Workbook] for MIREX 2005 Dataset - vocal.<br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - MIREX 2005 Dataset - nonvocal===<br />
<csv>2008/am08_m05_nonvocal.csv</csv><br />
<br />
[[Image:2008_am08_m05_nonvocal.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/results/2008/am08_persong_m05_nonvocal.xls Excel Workbook] for MIREX 2005 Dataset - nonvocal.<br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - MIREX 2005 Dataset - All===<br />
<csv>2008/am08_m05_all.csv</csv><br />
<br />
[[Image:2008_am08_m05_all2.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/results/2008/am08_persong_m05_all.xls Excel Workbook] for MIREX 2005 Dataset - All.<br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - ADC 2004 Dataset - vocal===<br />
<csv>2008/am08_adc04_vocal.csv</csv><br />
<br />
[[Image:2008_am08_adc04_vocal.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/results/2008/am08_persong_adc04_vocal.xls Excel Workbook] for ADC 2004 Dataset - vocal.<br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - ADC 2004 Dataset - nonvocal===<br />
<csv>2008/am08_adc04_nonvocal.csv</csv><br />
<br />
[[Image:2008_am08_adc04_nonvocal.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/results/2008/am08_persong_adc04_nonvocal.xls Excel Workbook] for ADC 2004 Dataset - nonvocal.<br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - ADC 2004 Dataset - All===<br />
<csv>2008/am08_adc04_all.csv</csv><br />
<br />
[[Image:2008_am08_adc04_all.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/results/2008/am08_persong_adc04_all.xls Excel workbook] for ADC 2004 Dataset - All.<br />
<br />
===MIREX 2008 Audio Melody Extraction Runtime Data===<br />
<csv>2008/am08_runtime.csv</csv><br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Genre_Classification_Results&diff=71622008:Audio Genre Classification Results2010-06-07T18:57:00Z<p>IMIRSELBot: Robot: Automated text replacement (-/mirex/2008/results/ +/mirex/results/2008/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Genre Classification task. For background information about this task set please refer to the [[2008:Audio Genre Classification]] page. <br />
<br />
===General Legend===<br />
====Team ID====<br />
'''CL1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_genre_CC.pdf C. Cao, M. Li 1]<br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_genre_CC.pdf C. Cao, M. Li 2]<br /><br />
'''GP1''' = [https://www.music-ir.org/mirex/abstracts/2008/Peeters_2008_ISMIR_MIREX.pdf G. Peeters]<br /><br />
'''GT1 (mono)''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT2 (stereo)''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT3 (multicore)''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf I. M. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf I. M. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf I. M. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
==Overall Summary Results==<br />
===Task 1 (MIXED) Results===<br />
<br />
====MIREX 2008 Audio Genre Classification Summary Results - Raw Classification Accuracy Averaged Over Three Train/Test Folds====<br />
<br />
<csv>2008/genremixed/audiogenre.avg.results.csv</csv><br />
<br />
=====Accuracy Across Folds=====<br />
<br />
<csv>2008/genremixed/audiogenre.results.fold.csv</csv><br />
<br />
=====Accuracy Across Categories=====<br />
<br />
<csv>2008/genremixed/audiogenre.results.class.csv</csv><br />
<br />
====MIREX 2008 Audio Genre Classification Evaluation Logs and Confusion Matrices====<br />
<br />
====MIREX 2008 Audio Genre Classification Run Times====<br />
<br />
<csv>2008/genre.runtime.csv</csv><br />
<br />
====CSV Files Without Rounding====<br />
[https://www.music-ir.org/mirex/results/2008/genremixed/audiogenre_results_fold.csv audiogenre_results_fold.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/genremixed/audiogenre_results_class.csv audiogenre_results_class.csv]<br /><br />
<br />
====Results By Algorithm====<br />
(.tar.gz) <br /><br />
'''CL1''' = [https://www.music-ir.org/mirex/results/2008/genremixed/CL1.tar.gz C. Cao, M. Li 1]<br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/results/2008/genremixed/CL2.tar.gz C. Cao, M. Li 2]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/results/2008/genremixed/LRPPI1.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/results/2008/genremixed/LRPPI2.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/results/2008/genremixed/LRPPI3.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/results/2008/genremixed/LRPPI4.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/results/2008/genremixed/ME1.tar.gz I. M. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/results/2008/genremixed/ME2.tar.gz I. M. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/results/2008/genremixed/ME3.tar.gz I. M. Mandel, D. P. W. Ellis 3]<br /><br />
'''GP''' = [https://www.music-ir.org/mirex/results/2008/genremixed/GP1.tar.gz G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/results/2008/genremixed/GT1.tar.gz G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/results/2008/genremixed/GT2.tar.gz G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/results/2008/genremixed/GT3.tar.gz G. Tzanetakis]<br /><br />
<br />
===Task 2 (LATIN) Results===<br />
<br />
====MIREX 2008 Audio Genre Classification Summary Results - Raw Classification Accuracy Averaged Over Three Train/Test Folds====<br />
<br />
<csv>2008/genrelatin/audiolatin.avg.results.csv</csv><br />
<br />
=====Accuracy Across Folds=====<br />
<br />
<csv>2008/genrelatin/audiolatin.results.fold.csv</csv><br />
<br />
=====Accuracy Across Categories=====<br />
<br />
<csv>2008/genrelatin/audiolatin.results.class.csv</csv><br />
<br />
====MIREX 2008 Audio Genre Classification Evaluation Logs and Confusion Matrices====<br />
<br />
====MIREX 2008 Audio Genre Classification Run Times====<br />
<csv>2008/latin.runtime.csv</csv><br />
<br />
====CSV Files Without Rounding====<br />
[https://www.music-ir.org/mirex/results/2008/genrelatin/audiolatin_results_fold.csv audiolatin_results_fold.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/genrelatin/audiolatin_results_class.csv audiolatin_results_class.csv]<br /><br />
<br />
====Results By Algorithm====<br />
(.tar.gz) <br /><br />
'''CL1''' = [https://www.music-ir.org/mirex/results/2008/genrelatin/CL1.tar.gz C. Cao, M. Li 1]<br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/results/2008/genrelatin/CL2.tar.gz C. Cao, M. Li 2]<br /><br />
'''GP1''' = [https://www.music-ir.org/mirex/results/2008/genrelatin/GP1.tar.gz G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/results/2008/genrelatin/GT1.tar.gz G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/results/2008/genrelatin/GT2.tar.gz G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/results/2008/genrelatin/GT3.tar.gz G. Tzanetakis]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/results/2008/genrelatin/LRPPI1.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/results/2008/genrelatin/LRPPI2.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/results/2008/genrelatin/LRPPI3.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/results/2008/genrelatin/LRPPI4.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/results/2008/genrelatin/ME1.tar.gz I. M. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/results/2008/genrelatin/ME2.tar.gz I. M. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/results/2008/genrelatin/ME3.tar.gz I. M. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
===Friedman's Test for Significant Differences===<br />
====Task 1 (Mixed) Classes vs. Systems====<br />
The Friedman test was run in MATLAB against the average accuracy for each class.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/genremixed/perClassAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/genremixed/perClassAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_genremixed.perclassaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
====Task 1 (Mixed) Folds vs. Systems====<br />
The Friedman test was run in MATLAB against the accuracy for each fold.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/genremixed/perFoldAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/genremixed/perFoldAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_genremixed.perfoldaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
<br />
====Task 2 (Latin) Classes vs. Systems====<br />
The Friedman test was run in MATLAB against the average accuracy for each class.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/genrelatin/perClassAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/genrelatin/perClassAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_genrelatin.perclassaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
====Task 2 (Latin) Folds vs. Systems====<br />
The Friedman test was run in MATLAB against the accuracy for each fold.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/genrelatin/perFoldAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/genrelatin/perFoldAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_genrelatin.perfoldaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Cover_Song_Identification_Results&diff=71612008:Audio Cover Song Identification Results2010-06-07T18:56:50Z<p>IMIRSELBot: Robot: Automated text replacement (-/mirex/2008/results/ +/mirex/results/2008/)</p>
<hr />
<div>Runtimes are still missing (JSD, Sept. 11, 2008).<br />
==Introduction==<br />
These are the results for the 2008 running of the Audio Cover Song Identification task. For background information about this task set please refer to the [[2008:Audio Cover Song Identification]] page.<br />
<br />
Each system was given a collection of 1000 songs that included 30 different classes (sets) of cover songs, each class/set represented by 11 different versions of a particular song. Each of the 330 cover songs was used as a query, and the systems were required to return 10 results per query. Systems were evaluated on the number of songs from the same class/set as the query that were returned in the list of 10 results for that query. Average precision, which considers the entire per-query rank-ordered list of all songs in the collection, was the metric introduced last year.<br />
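For clarity, a minimal sketch of these two measures is given below (an illustrative reimplementation, not the evaluator actually used; it assumes the usual definition of average precision and that each query has exactly 10 relevant items, i.e. the other versions in its class):<br />
<pre>
def covers_in_top_ten(ranked_ids, relevant_ids):
    """Number of same-class cover songs among the first 10 returned items."""
    return sum(1 for song in ranked_ids[:10] if song in relevant_ids)

def average_precision(ranked_ids, relevant_ids):
    """Mean of precision@k over the ranks k at which relevant items occur,
    normalised by the number of relevant items (10 per query here)."""
    hits, precision_sum = 0, 0.0
    for rank, song in enumerate(ranked_ids, start=1):
        if song in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0
</pre>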
<br />
<br />
===General Legend===<br />
====Team ID====<br />
'''CL1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_covsng.pdf C. Cao, M. Li] <br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_covsng.pdf C. Cao, M. Li] <br /><br />
'''EL1''' = [https://www.music-ir.org/mirex/abstracts/2008/cbms_cover_song_id.pdf A. Egorov, G. Linetsky] <br /><br />
'''EL2''' = [https://www.music-ir.org/mirex/abstracts/2008/cbms_cover_song_id.pdf A. Egorov, G. Linetsky] <br /><br />
'''EL3''' = [https://www.music-ir.org/mirex/abstracts/2008/cbms_cover_song_id.pdf A. Egorov, G. Linetsky] <br /><br />
'''JCJ''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract.pdf J. H. Jensen, M. G. Christensen, S. H. Jensen] <br /><br />
'''SGH1''' = [https://www.music-ir.org/mirex/abstracts/2008/CS_Serra.pdf J. Serrà, E. Gómez, P. Herrera] <br /><br />
'''SGH2''' = [https://www.music-ir.org/mirex/abstracts/2008/CS_Serra.pdf J. Serrà, E. Gómez, P. Herrera] <br /><br />
<br />
==Overall Summary Results==<br />
<csv>2008//cover/grand.summary.v2.csv</csv><br />
<br />
<br />
<br />
===Number of Correct Covers at Rank X Returned in Top Ten=== <br />
<csv>2008/cover/cover.toptendist.transposed.csv</csv><br />
<br />
===Run Times=== <br />
<csv>2008/cover/coversong_runtimes.csv</csv><br />
<br />
CL1 and CL2 ran on FAST2 and FAST3; all others ran on ALE nodes.<br />
<br />
===Friedman's Test for Significant Differences===<br />
The Friedman test was run in MATLAB against the Average Precision summary data over the 30 song groups.<br /> Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/cover/coversong.friedman.anova.csv</csv><br />
<br />
<csv>2008/cover/coversong.friedman.csv</csv><br />
<br />
[[Image:coversong.friedman.png]]<br />
<br />
===Average Performance per Query Group===<br />
These are the arithmetic means of the average precisions within each of the 30 query groups.<br />
<br />
<csv>2008/cover/cover.mapquerygroup.v2.csv</csv><br />
<br />
==Individual Results Files==<br />
===Average Precision Scores for Each Query===<br />
'''CL1''' = [https://www.music-ir.org/mirex/results/2008/cover.cl1.eval.csv C. Cao, M. Li] <br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/results/2008/cover.cl2.eval.csv C. Cao, M. Li] <br /><br />
'''EL1''' = [https://www.music-ir.org/mirex/results/2008/cover.el1.eval.csv A. Egorov, G. Linetsky] <br /><br />
'''EL2''' = [https://www.music-ir.org/mirex/results/2008/cover.el2.eval.csv A. Egorov, G. Linetsky] <br /><br />
'''EL3''' = [https://www.music-ir.org/mirex/results/2008/cover.el3.eval.csv A. Egorov, G. Linetsky] <br /><br />
'''JCJ''' = [https://www.music-ir.org/mirex/results/2008/cover.jcj.eval.csv J. H. Jensen, M. G. Christensen, S. H. Jensen] <br /><br />
'''SGH1''' = [https://www.music-ir.org/mirex/results/2008/cover.sgh1.eval.csv J. Serrà, E. Gómez, P. Herrera] <br /><br />
'''SGH2''' = [https://www.music-ir.org/mirex/results/2008/cover.sgh2.eval.csv J. Serrà, E. Gómez, P. Herrera] <br /><br />
<br />
===Ranks of the Ten Cover Songs Returned for Each Query===<br />
'''CL1''' = [https://www.music-ir.org/mirex/results/2008/cover.cl1.eval.debug.csv C. Cao, M. Li] <br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/results/2008/cover.cl2.eval.debug.csv C. Cao, M. Li] <br /><br />
'''EL1''' = [https://www.music-ir.org/mirex/results/2008/cover.el1.eval.debug.csv A. Egorov, G. Linetsky] <br /><br />
'''EL2''' = [https://www.music-ir.org/mirex/results/2008/cover.el2.eval.debug.csv A. Egorov, G. Linetsky] <br /><br />
'''EL3''' = [https://www.music-ir.org/mirex/results/2008/cover.el3.eval.debug.csv A. Egorov, G. Linetsky] <br /><br />
'''JCJ''' = [https://www.music-ir.org/mirex/results/2008/cover.jcj.eval.debug.csv J. H. Jensen, M. G. Christensen, S. H. Jensen] <br /><br />
'''SGH1''' = [https://www.music-ir.org/mirex/results/2008/cover.sgh1.eval.debug.csv J. Serrà, E. Gómez, P. Herrera] <br /><br />
'''SGH2''' = [https://www.music-ir.org/mirex/results/2008/cover.sgh2.eval.csv J. Serrà, E. Gómez, P. Herrera] <br /><br />
<br />
===Runtimes===<br />
Where algorithms have been multi-threaded, the longest runtime is reported.<br />
<br />
Where runtimes were not properly reported, file timestamps have been used to approximate a runtime.<br />
<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Classical_Composer_Identification_Results&diff=71602008:Audio Classical Composer Identification Results2010-06-07T18:56:40Z<p>IMIRSELBot: Robot: Automated text replacement (-/mirex/2008/results/ +/mirex/results/2008/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Classical Composer Identification task. For background information about this task set please refer to the [[2007:Audio Classical Composer Identification]] page. <br />
<br />
The data set consisted of 2772 30 second audio clips. The composers represented were:<br />
<br />
#Bach<br />
#Beethoven<br />
#Brahms<br />
#Chopin<br />
#Dvorak<br />
#Handel<br />
#Haydn<br />
#Mendelssohn<br />
#Mozart<br />
#Schubert<br />
#Vivaldi<br />
<br />
The goal was to correctly identify the composer who wrote each of the pieces represented.<br />
<br />
<br />
===General Legend===<br />
====Team ID====<br />
'''GP1''' = [https://www.music-ir.org/mirex/abstracts/2008/Peeters_2008_ISMIR_MIREX.pdf G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
==Overall Summary Results==<br />
===MIREX 2008 Audio Classical Composer Classification Summary Results - Raw Classification Accuracy Averaged Over Three Train/Test Folds===<br />
<br />
<csv>2008/composer/audiocomposer.avg.results.csv</csv><br />
<br />
=====Accuracy Across Folds=====<br />
<br />
<csv>2008/composer/audiocomposer.results.fold.csv</csv><br />
<br />
=====Accuracy Across Categories=====<br />
<br />
<csv>2008/composer/audiocomposer.results.class.csv</csv><br />
<br />
===MIREX 2008 Audio Classical Composer Classification Evaluation Logs and Confusion Matrices===<br />
<br />
====MIREX 2008 Audio Classical Composer Classification Run Times====<br />
<br />
<csv>2008/composer.runtime.csv</csv><br />
<br />
====CSV Files Without Rounding====<br />
[https://www.music-ir.org/mirex/results/2008/composer/audiocomposer_results_fold.csv audiocomposer_results_fold.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/composer/audiocomposer_results_class.csv audiocomposer_results_class.csv]<br /><br />
<br />
====Results By Algorithm====<br />
(.tar.gz) <br /><br />
'''GP1''' = [https://www.music-ir.org/mirex/results/2008/composer/GP1.tar.gz G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/results/2008/composer/GT1.tar.gz G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/results/2008/composer/GT2.tar.gz G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/results/2008/composer/GT3.tar.gz G. Tzanetakis]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/results/2008/composer/LRPPI1.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/results/2008/composer/LRPPI2.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/results/2008/composer/LRPPI3.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/results/2008/composer/LRPPI4.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/results/2008/composer/ME1.tar.gz I. M. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/results/2008/composer/ME2.tar.gz I. M. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/results/2008/composer/ME3.tar.gz I. M. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
===Friedman's Test for Significant Differences===<br />
====Classes vs. Systems====<br />
The Friedman test was run in MATLAB against the average accuracy for each class.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/composer/perClassAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/composer/perClassAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_composer.perclassaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
====Folds vs. Systems====<br />
The Friedman test was run in MATLAB against the accuracy for each fold.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/composer/perFoldAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/composer/perFoldAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_composer.perfoldaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Chord_Detection_Results&diff=71592008:Audio Chord Detection Results2010-06-07T18:56:30Z<p>IMIRSELBot: Robot: Automated text replacement (-/mirex/2008/results/ +/mirex/results/2008/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Chord Detection task set. For background information about this task set please refer to the [[2008:Audio Chord Detection]] page.<br />
<br />
===Task Descriptions===<br />
<br />
'''Task 1 (Pretrained Systems) [[#Task 1 Results|Go to Task 1 Results]]''':<br />
Systems were pretrained and they were tested against 176 Beatles songs. <br />
<br />
'''Task 2 (Train-Test Systems) [[#Task 2 Results|Go to Task 2 Results]]''': <br />
Systems were trained on ~2/3 of the Beatles dataset and tested on the remaining ~1/3. Album filtering was applied to each train-test fold so that songs from the same album could not appear in both the training and test sets. <br />
<br />
The overlap score was calculated as the ratio between the duration over which the detected chords overlap the ground-truth chords and the total ground-truth duration. A secondary overlap score was also calculated by ignoring the major/minor quality of the detected chord (e.g., C major is treated as equivalent to C minor).<br />
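A minimal sketch of this overlap score is given below (illustrative only, not the actual evaluation code; it assumes each annotation is a list of (start_time, end_time, chord_label) segments and that labels use a root:quality form such as "C:maj"):<br />
<pre>
def overlap_score(ground_truth, detected, merge_major_minor=False):
    """Ratio of the total time where the detected chord matches the ground
    truth to the total ground-truth duration."""
    def root(label):
        return label.split(':')[0] if merge_major_minor else label

    correct, total = 0.0, 0.0
    for gt_start, gt_end, gt_chord in ground_truth:
        total += gt_end - gt_start
        for det_start, det_end, det_chord in detected:
            if root(det_chord) == root(gt_chord):
                # add the overlap (if any) between the two segments
                correct += max(0.0, min(gt_end, det_end) - max(gt_start, det_start))
    return correct / total if total > 0 else 0.0
</pre>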
<br />
Note that 4 songs were excluded from the original Beatles dataset because of problems aligning the ground truth to the audio.<br />
The ground-truth-to-audio alignment was done automatically; the alignment script is to be released soon by Chris Harte.<br />
<br />
===General Legend===<br />
====Team ID for ChordPreTrained (Task 1)==== <br />
<br />
'''BP''' = [https://www.music-ir.org/mirex/abstracts/2008/CD_bello.pdf J. P. Bello, J. Pickens]<br /><br />
'''KO''' = [https://www.music-ir.org/mirex/abstracts/2008/khadkevich_omologo_final.pdf M. Khadkevich, M. Omologo]<br /><br />
'''KL1''' = [https://www.music-ir.org/mirex/abstracts/2008/XXX.pdf K. Lee 1]<br /><br />
'''KL2''' = [https://www.music-ir.org/mirex/abstracts/2008/XXX.pdf K. Lee 2]<br /><br />
'''MM''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08__mehnert_et_al__cps_based_chord_analysis.pdf M. Mehnert]<br /><br />
'''PP''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_chord_papadopoulos.pdf H.Papadopoulos, G. Peeters]<br /><br />
'''PVM''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2008-audio_chord_detection-ghent_university-johan_pauwels.pdf J. Pauwels, M. Varewyck, J-P. Martens]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/abstracts/2008/CD_ryynanen.pdf M. Ryynänen, A. Klapuri]<br /><br />
<br />
====Team ID for ChordTrainTest (Task 2)==== <br />
<br />
'''DE''' = [https://www.music-ir.org/mirex/abstracts/2008/Ellis08-chordid.pdf D. Ellis]<br /><br />
'''ZL''' = [https://www.music-ir.org/mirex/abstracts/2008/Abstract_xinglin.pdf X. Jhang, C. Lash]<br /><br />
'''KO''' = [https://www.music-ir.org/mirex/abstracts/2008/khadkevich_omologo_final.pdf M. Khadkevich, M. Omologo]<br /><br />
'''KL''' = [https://www.music-ir.org/mirex/abstracts/2008/XXX.pdf K. Lee (withtrain)]<br /><br />
'''UMS''' = [https://www.music-ir.org/mirex/abstracts/2008/uchiyamamirex2008.pdf Y. Uchiyama, K. Miyamoto, S. Sagayama]<br /><br />
'''WD1''' = [https://www.music-ir.org/mirex/abstracts/2008/Mirex08_AudioChordDetection_Weil_Durrieu.pdf J. Weil]<br /><br />
'''WD2''' = [https://www.music-ir.org/mirex/abstracts/2008/Mirex08_AudioChordDetection_Weil_Durrieu.pdf J. Weil, J-L. Durrieu]<br /><br />
<br />
==Overall Summary Results==<br />
===Task 1 Results===<br />
<br />
=====Task 1 Overall Results=====<br />
<br />
<csv>2008/chord/task1_results/pretrained_summary.csv</csv><br />
<br />
<csv>2008/chord/task1_results/pretrained_runtimes.csv</csv><br />
<br />
====Task 1 Summary Data for Download====<br />
[https://www.music-ir.org/mirex/results/2008/chord/task1_results/pretraineed_filenames.csv File Name Set (Pretrained runs)] <br /><br />
[https://www.music-ir.org/mirex/results/2008/chord/task1_results/ACD.task1.results.overlapScores.csv Summary Overlap Data (Pretrained runs)] <br /><br />
[https://www.music-ir.org/mirex/results/2008/chord/task1_results/ACD.task1.results.overlapScores.major_minor.csv Summary Overlap Data (Pretrained runs (Merged maj/min))] <br /><br />
====Task 1 Friedman's Test for Significant Differences====<br />
The Friedman test was run in MATLAB against the Task 1 Overlap Score data over the 176 ground truth songs.<br />
<br />
<br />
<csv>2008/chord/task1_results/task1_friedman.csv</csv><br />
<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
<br />
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<csv>2008/chord/task1_results/friedman_detailed.csv</csv><br />
<br />
[[Image:2008_task1.friedman.png]]<br />
<br />
===Task 2 Results===<br />
<br />
=====Task 2 Overall Results=====<br />
<csv>2008/chord/task2_results/summary.csv</csv><br />
<br />
<csv>2008/chord/task2_results/task2_runtimes.csv</csv><br />
<br />
====Task 2 Summary Data for Download====<br />
[https://www.music-ir.org/mirex/results/2008/chord/task2_results/all_filenames.csv File Name Set (Train-test runs)] <br /><br />
[https://www.music-ir.org/mirex/results/2008/chord/task2_results/all3folds_overlap_scores.csv Summary Overlap Data (Train-test runs)] <br /><br />
[https://www.music-ir.org/mirex/results/2008/chord/task2_results/all3folds_overlap_scores_majorMinor.csv Summary Overlap Data (Train-Test runs (Merged maj/min))] <br /><br />
[https://www.music-ir.org/mirex/results/2008/chord/task2_results/individualFriedmansForEachFold.zip Per Fold Summary Data (Train-Test runs (Zip archive))] <br /><br />
<br />
====Task Friedman's Test for Significant Differences====<br />
The Friedman test was run in MATLAB against the Task 2 Overlap Score data over the 176 ground truth songs.<br />
<br />
<br />
<csv>2008/chord/task2_results/all3folds_friedman.txt</csv><br />
<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
<br />
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<csv>2008/chord/task2_results/task2_allFolds_friedman_detailed.csv</csv><br />
<br />
[[Image:2008_task2.allfolds_friedman.png]]<br />
<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Artist_Identification_Results&diff=71582008:Audio Artist Identification Results2010-06-07T18:56:20Z<p>IMIRSELBot: Robot: Automated text replacement (-/mirex/2008/results/ +/mirex/results/2008/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Artist Identification task. For background information about this task set please refer to the [[2008:Audio Artist Identification]] page. <br />
<br />
===General Legend===<br />
====Team ID====<br />
'''GP2''' = [https://www.music-ir.org/mirex/abstracts/2008/Peeters_2008_ISMIR_MIREX.pdf G. Peeters]<br /><br />
'''GT1 (mono)''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT2 (stereo)''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT3 (multicore)''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
==Overall Summary Results==<br />
===MIREX 2008 Audio Artist Classification Summary Results - Raw Classification Accuracy Averaged Over Three Train/Test Folds===<br />
<br />
<csv>2008/artist/audioartist.avg.results.csv</csv><br />
<br />
=====Accuracy Across Folds=====<br />
<br />
<csv>2008/artist/audioartist.results.fold.csv</csv><br />
<br />
=====Accuracy Across Categories=====<br />
<br />
<csv>2008/artist/audioartist.results.class.csv</csv><br />
<br />
===MIREX 2008 Audio Artist Classification Evaluation Logs and Confusion Matrices===<br />
<br />
====MIREX 2008 Audio Artist Classification Run Times====<br />
<br />
<csv>2008/artist.runtime.csv</csv><br />
<br />
====CSV Files Without Rounding====<br />
[https://www.music-ir.org/mirex/results/2008/artist/audioartist_results_fold.csv audioartist_results_fold.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/artist/audioartist_results_class.csv audioartist_results_class.csv]<br /><br />
<br />
====Results By Algorithm====<br />
(.tar.gz) <br /><br />
'''GP2''' = [https://www.music-ir.org/mirex/results/2008/artist/GP2.tar.gz G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/results/2008/artist/GT1.tar.gz G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/results/2008/artist/GT2.tar.gz G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/results/2008/artist/GT3.tar.gz G. Tzanetakis]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/results/2008/artist/LRPPI1.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/results/2008/artist/LRPPI2.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/results/2008/artist/LRPPI3.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/results/2008/artist/LRPPI4.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/results/2008/artist/ME1.tar.gz I. M. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/results/2008/artist/ME2.tar.gz I. M. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/results/2008/artist/ME3.tar.gz I. M. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
===Friedman's Test for Significant Differences===<br />
====Classes vs. Systems====<br />
The Friedman test was run in MATLAB against the average accuracy for each class.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/artist/perClassAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/artist/perClassAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_artist.perclassaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
====Folds vs. Systems====<br />
The Friedman test was run in MATLAB against the accuracy for each fold.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/artist/perFoldAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/artist/perFoldAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_artist.perfoldaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2007:Multiple_Fundamental_Frequency_Estimation_%26_Tracking_Results&diff=71572007:Multiple Fundamental Frequency Estimation & Tracking Results2010-06-07T18:55:54Z<p>IMIRSELBot: Robot: Automated text replacement (-/mirex/2007/results/ +/mirex/results/2007/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2007 running of the Multiple Fundamental Frequency Estimation and Tracking task. For background information about this task set please refer to the [[2007:Multiple Fundamental Frequency Estimation & Tracking]] page.<br />
<br />
<br />
<br />
===General Legend===<br />
====Team ID====<br />
'''CC1''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_cao.pdf Chuan Cao, Ming Li, Jian Liu, Yonghong Yan 1]<br /><br />
'''CC2''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_cao.pdf Chuan Cao, Ming Li, Jian Liu, Yonghong Yan 2]<br /><br />
'''AC1''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_cont.pdf Arshia Cont 1]<br /><br />
'''AC2''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_cont.pdf Arshia Cont 2]<br /><br />
'''AC3''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_cont.pdf Arshia Cont 3]<br /><br />
'''AC4''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_cont.pdf Arshia Cont 4]<br /><br />
'''KE1''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_egashira.pdf Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama 1]<br /><br />
'''KE2''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_egashira.pdf Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama 2]<br /><br />
'''KE3''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_egashira.pdf Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama 3]<br /><br />
'''KE4''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_egashira.pdf Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama 4]<br /><br />
'''VE1''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_emiya.pdf Valentin Emiya, Roland Badeau, Bertrand David 1]<br /><br />
'''VE2''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_emiya.pdf Valentin Emiya, Roland Badeau, Bertrand David 2]<br /><br />
'''PL''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_leveau.pdf Pierre Leveau]<br /><br />
'''PI1''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_pertusa.pdf Antonio Pertusa, José Manuel Iñesta 1]<br /><br />
'''PI2''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_pertusa.pdf Antonio Pertusa, José Manuel Iñesta 2]<br /><br />
'''PI3''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_pertusa.pdf Antonio Pertusa, José Manuel Iñesta 3]<br /><br />
'''PE1''' = Graham Poliner, Daniel P. W. Ellis 1<br /><br />
'''PE2''' = Graham Poliner, Daniel P. W. Ellis 2<br /><br />
'''SR''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_raczynski.pdf Stanisław A. Raczyński, Nobutaka Ono, Shigeki Sagayama]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_ryynanen.pdf Matt Ryynänen, Anssi Klapuri]<br /><br />
'''EV1''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_vincent.pdf Emmanuel Vincent, Nancy Bertin, Roland Badeau 1]<br /><br />
'''EV2''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_vincent.pdf Emmanuel Vincent, Nancy Bertin, Roland Badeau 2]<br /><br />
'''EV3''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_vincent.pdf Emmanuel Vincent, Nancy Bertin, Roland Badeau 3]<br /><br />
'''EV4''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_vincent.pdf Emmanuel Vincent, Nancy Bertin, Roland Badeau 4]<br /><br />
'''CY''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_yeh.pdf Chunghsin Yeh]<br /><br />
'''ZR''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_zhou.pdf Ruohua Zhou, Joshua D. Reiss]<br /><br />
<br />
[[Category: Results]]<br />
<br />
==Overall Summary Results Task 1==<br />
Below are the average scores across 28 test files. The files were organized into 7 groups of 4 files each, with polyphony ranging from 2 to 5 voices; 20 were real recordings and 8 were synthesized from RWC samples.<br />
<br />
<csv>2007/multiF0Task1.results.csv</csv> <br />
<br />
<br />
Where<br />
<br />
[[Image:2007_ev_formulas.png]]<br />
<br />
*'''Nref''' is the number of non-zero elements in the ground truth data. <br />
*'''Nsys''' is the number of active elements returned by the system. <br />
*'''Ncorr''' is the number of correctly identified elements.<br />
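As a rough illustration only (not the evaluation code), the sketch below shows how such counts can be accumulated per frame and combined, assuming pitches have already been quantised so that a set intersection captures the matching rule, and using the standard definitions Precision = Ncorr/Nsys and Recall = Ncorr/Nref; see the formula image above for the exact expressions used:<br />
<pre>
def frame_counts(reference_frames, estimated_frames):
    """reference_frames / estimated_frames: one set of active pitches per
    analysis frame (e.g. MIDI note numbers)."""
    n_ref = n_sys = n_corr = 0
    for ref, est in zip(reference_frames, estimated_frames):
        n_ref += len(ref)
        n_sys += len(est)
        n_corr += len(ref & est)
    return n_ref, n_sys, n_corr

def precision_recall(n_ref, n_sys, n_corr):
    precision = n_corr / n_sys if n_sys else 0.0
    recall = n_corr / n_ref if n_ref else 0.0
    return precision, recall
</pre>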
<br />
==Individual Results Files for Task 1==<br />
===Individual results: Scores per Query===<br />
'''AC1''' = [https://www.music-ir.org/mirex/results/2007/AC1.results.csv Arshia Cont]<br /><br />
'''AC2''' = [https://www.music-ir.org/mirex/results/2007/AC2.results.csv Arshia Cont]<br/><br />
'''CC1''' = [https://www.music-ir.org/mirex/results/2007/CC1.results.csv Chuan Cao, Ming Li, Jian Liu, Yonghong Yan 1] <br/><br />
'''CC2''' = [https://www.music-ir.org/mirex/results/2007/CC2.results.csv Chuan Cao, Ming Li, Jian Liu, Yonghong Yan 2]<br /><br />
'''CY''' = [https://www.music-ir.org/mirex/results/2007/CY.results.csv Chunghsin Yeh]<br /><br />
'''EV1''' = [https://www.music-ir.org/mirex/results/2007/EV1.results.csv Emmanuel Vincent, Nancy Bertin, Roland Badeau]<br /><br />
'''EV2''' = [https://www.music-ir.org/mirex/results/2007/EV2.results.csv Emmanuel Vincent, Nancy Bertin, Roland Badeau]<br /><br />
'''KE1''' = [https://www.music-ir.org/mirex/results/2007/KE1.results.csv Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama]<br /><br />
'''KE2''' = [https://www.music-ir.org/mirex/results/2007/KE2.results.csv Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama]<br /><br />
'''PE''' = [https://www.music-ir.org/mirex/results/2007/PE.results.csv Graham Poliner, Daniel P. W. Ellis]<br /><br />
'''PI1''' = [https://www.music-ir.org/mirex/results/2007/PI1.results.csv Antonio Pertusa, José Manuel Iñesta 1]<br /><br />
'''PL''' = [https://www.music-ir.org/mirex/results/2007/PL.results.csv Pierre Leveau]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/results/2007/RK.results.csv Matt Ryynänen, Anssi Klapuri]<br /><br />
'''SR''' = [https://www.music-ir.org/mirex/results/2007/SR.results.csv Stanislaw A. Raczynski, Nobutaka Ono, Shigeki Sagayama]<br /><br />
'''VE1''' = [https://www.music-ir.org/mirex/results/2007/VE1.results.csv Valentin Emiya, Roland Badeau, Bertrand David ]<br /><br />
'''ZR''' = [https://www.music-ir.org/mirex/results/2007/ZR.results.csv Ruohua Zhou, Joshua D. Reiss]<br /><br />
<br />
====Info about the filenames====<br />
The filenames starting with part* come from an acoustic woodwind recording; the ones starting with RWC are synthesized. The instrument abbreviations are:<br />
<br />
'''bs''' = bassoon<br />
<br />
'''cl''' = clarinet<br />
<br />
'''fl''' = flute<br />
<br />
'''hn''' = horn<br />
<br />
'''ob''' = oboe<br />
<br />
'''vl''' = violin<br />
<br />
'''cel''' = cello<br />
<br />
'''gtr''' = guitar<br />
<br />
'''sax''' = saxophone<br />
<br />
'''bass''' = electric bass guitar<br />
<br />
===Run Times===<br />
<csv>2007/multiF0_task1_runtimes.csv</csv> <br />
<br />
<br />
==Overall Summary Results Task 2==<br />
This subtask is evaluated in two different ways. In the first setup, a returned note is counted as correct if its onset is within ±50 ms of a reference note's onset and its F0 is within ± a quarter tone of that reference note's F0; returned offset values are ignored. In the second setup, in addition to the above requirements, a correct returned note must also have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger.<br />
The Overlap ratio is calculated for an individual correctly identified note as <br />
[[Image:2007_overlap.png]]<br />
<br />
A total of 30 files were used in this task: 16 real recordings, 8 synthesized from RWC samples, and 6 piano. The results below are the average of these 30 files.<br />
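A minimal sketch of the note-matching rules described above follows (illustrative only, not the evaluation code; notes are taken to be (onset_sec, offset_sec, f0_hz) tuples and a quarter tone is treated as 50 cents):<br />
<pre>
import math

def matches_onset_only(est, ref):
    """Onset within +/-50 ms and F0 within a quarter tone of the reference."""
    onset_ok = abs(est[0] - ref[0]) <= 0.05
    pitch_ok = abs(1200.0 * math.log2(est[2] / ref[2])) <= 50.0
    return onset_ok and pitch_ok

def matches_onset_offset(est, ref):
    """Additionally require the offset to lie within 20% of the reference
    note's duration around its offset, or within 50 ms, whichever is larger."""
    offset_tol = max(0.2 * (ref[1] - ref[0]), 0.05)
    return matches_onset_only(est, ref) and abs(est[1] - ref[1]) <= offset_tol
</pre>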
<br />
===Results Based on Onset Only===<br />
<csv>2007/multiF0.note.onset.only.eval.csv</csv><br />
<br />
===Results Based on Onset-Offset===<br />
<csv>2007/multiF0.note.eval.csv</csv><br />
<br />
===Piano Results based on Onset Only===<br />
<csv>2007/multiF0.note.onset.only.eval.for.piano.csv</csv><br />
<br />
==Individual Results Files for Task 2==<br />
===For Onset only Evaluation===<br />
'''AC3''' = [https://www.music-ir.org/mirex/results/2007/AC3.note.onset.only.results.csv Arshia Cont]<br /><br />
'''AC4''' = [https://www.music-ir.org/mirex/results/2007/AC4.note.onset.only.results2.csv Arshia Cont]<br/><br />
'''EV3''' = [https://www.music-ir.org/mirex/results/2007/EV3.note.onset.only.results.csv Emmanuel Vincent, Nancy Bertin, Roland Badeau]<br /><br />
'''EV4''' = [https://www.music-ir.org/mirex/results/2007/EV4.note.onset.only.results.csv Emmanuel Vincent, Nancy Bertin, Roland Badeau]<br /><br />
'''KE3''' = [https://www.music-ir.org/mirex/results/2007/KE3.note.onset.only.results.csv Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama]<br /><br />
'''KE4''' = [https://www.music-ir.org/mirex/results/2007/KE4.note.onset.only.results.csv Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama]<br /><br />
'''PE2''' = [https://www.music-ir.org/mirex/results/2007/PE2.note.onset.only.results.csv Graham Poliner, Daniel P. W. Ellis]<br /><br />
'''PI2''' = [https://www.music-ir.org/mirex/results/2007/PI2.note.onset.only.results.csv Antonio Pertusa, José Manuel Iñesta 2]<br /><br />
'''PI3''' = [https://www.music-ir.org/mirex/results/2007/PI3.note.onset.only.results.csv Antonio Pertusa, José Manuel Iñesta 3]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/results/2007/RK.note.onset.only.results.csv Matt Ryynänen, Anssi Klapuri]<br /><br />
'''VE2''' = [https://www.music-ir.org/mirex/results/2007/VE2.note.onset.only.results.csv Valentin Emiya, Roland Badeau, Bertrand David]<br /><br />
<br />
===For Onset/Offset Evaluation===<br />
'''AC3''' = [https://www.music-ir.org/mirex/results/2007/AC3.note.results2.csv Arshia Cont]<br /><br />
'''AC4''' = [https://www.music-ir.org/mirex/results/2007/AC4.note.results2.csv Arshia Cont]<br/><br />
'''EV3''' = [https://www.music-ir.org/mirex/results/2007/EV3.note.results.csv Emmanuel Vincent, Nancy Bertin, Roland Badeau]<br /><br />
'''EV4''' = [https://www.music-ir.org/mirex/results/2007/EV4.note.results.csv Emmanuel Vincent, Nancy Bertin, Roland Badeau]<br /><br />
'''KE3''' = [https://www.music-ir.org/mirex/results/2007/KE3.note.results.csv Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama]<br /><br />
'''KE4''' = [https://www.music-ir.org/mirex/results/2007/KE4.note.results.csv Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama]<br /><br />
'''PE2''' = [https://www.music-ir.org/mirex/results/2007/PE2.note.results.csv Graham Poliner, Daniel P. W. Ellis]<br /><br />
'''PI2''' = [https://www.music-ir.org/mirex/results/2007/PI2.note.results.csv Antonio Pertusa, José Manuel Iñesta 2]<br /><br />
'''PI3''' = [https://www.music-ir.org/mirex/results/2007/PI3.note.results.csv Antonio Pertusa, José Manuel Iñesta 3]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/results/2007/RK.note.results.csv Matt Ryynänen, Anssi Klapuri]<br /><br />
'''VE2''' = [https://www.music-ir.org/mirex/results/2007/VE2.note.results.csv Valentin Emiya, Roland Badeau, Bertrand David ]<br /><br />
<br />
===Info About Filenames===<br />
The filenames starting with part* come from an acoustic woodwind recording; the ones starting with RWC are synthesized. The piano files are: RA_C030_align.wav, bach_847TESTp.wav, beet_pathetique_3TESTp.wav, mz_333_1TESTp.wav, scn_4TESTp.wav.note, ty_januarTESTp.wav.note<br />
<br />
===Run Times===<br />
<csv>2007/multiF0task2.runtimes.csv</csv></div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Real-time_Audio_to_Score_Alignment_(a.k.a._Score_Following)_Results&diff=71562008:Real-time Audio to Score Alignment (a.k.a. Score Following) Results2010-06-07T18:54:59Z<p>IMIRSELBot: Robot: Automated text replacement (-/mirex/2008/results/ +/mirex/results/2008/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Real-time Audio to Score Alignment (a.k.a Score Following) task. For background information about this task set please refer to the [[2008:Real-time Audio to Score Alignment (a.k.a Score Following)]] page. <br />
<br />
===General Legend===<br />
====Team ID====<br />
<br />
<br />
'''MO1''' = [https://www.music-ir.org/mirex/abstracts/2008/XXX.pdf N. Montecchio & Orio 1]<br /><br />
'''MO2''' = [https://www.music-ir.org/mirex/abstracts/2008/XXX.pdf N. Montecchio & Orio 2]<br /><br />
'''RM1''' = [https://www.music-ir.org/mirex/abstracts/2008/Scofo.pdf R. Macrae]<br /><br />
'''RM2''' = [https://www.music-ir.org/mirex/abstracts/2008/Scofo.pdf R. Macrae]<br /><br />
<br />
[[Category: Results]]<br />
<br />
===Summary Results===<br />
<csv>2008/scofo/scofo_summary_results.csv</csv><br />
<br />
===Individual Results===<br />
'''MO''' = [https://www.music-ir.org/mirex/results/2008/scofo/MOResults.zip N. Montecchio & Orio]<br /><br />
'''RM''' = [https://www.music-ir.org/mirex/results/2008/scofo/RMResults.zip R. Macrae ]<br /><br />
<br />
===Summary Results w.r.t. R. Macrae's Evaluation Script===<br />
<csv>2008/scofo/scofo_summary_results_withRobsEvalScript.csv</csv><br />
<br />
===Individual Results w.r.t. R. Macrae's Evaluation Script===<br />
'''MO''' = [https://www.music-ir.org/mirex/results/2008/scofo/MOresults_withRobsEvalScript.zip N. Montecchio & Orio]<br /><br />
'''RM''' = [https://www.music-ir.org/mirex/results/2008/scofo/RMresults_withRobsEvalScript.zip R. Macrae ]<br /><br />
<br />
<br />
The systems are evaluated against ground truth prepared by parsing the score files with each system's own MIDI parser (MO GT, RM GT).<br />
<br />
<br />
=== Issues with ground-truth ===</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Multiple_Fundamental_Frequency_Estimation_%26_Tracking_Results&diff=71552008:Multiple Fundamental Frequency Estimation & Tracking Results2010-06-07T18:54:49Z<p>IMIRSELBot: Robot: Automated text replacement (-/mirex/2008/results/ +/mirex/results/2008/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Multiple Fundamental Frequency Estimation and Tracking task. For background information about this task set please refer to the [[2008:Multiple Fundamental Frequency Estimation & Tracking]] page.<br />
<br />
===General Legend===<br />
====Team ID====<br />
<br />
'''CL1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_multiF0_Cao.pdf C. Cao, M. Li 1]<br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_multiF0_Cao.pdf C. Cao, M. Li 2]<br /><br />
'''DRD''' = [https://www.music-ir.org/mirex/abstracts/2008/durrieu_multi.pdf J-L. Durrieu, G. Richard, B. David]<br /><br />
'''EOS''' = [https://www.music-ir.org/mirex/abstracts/2008/Egashira2008MIREX09_ver1.pdf K. Egashira, N. Ono, S. Sagayama]<br /><br />
'''EBD1''' = [https://www.music-ir.org/mirex/abstracts/2008/080914_MIREX08_emiya.pdf V. Emiya, R. Badeau, B. David 1]<br /><br />
'''EBD2''' = [https://www.music-ir.org/mirex/abstracts/2008/080914_MIREX08_emiya.pdf V. Emiya, R. Badeau, B. David 2]<br /><br />
'''MG''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_groble.pdf M. Groble]<br /><br />
'''PI1''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_pertusa.pdf A. Pertusa, J. M. Iñesta 1]<br /><br />
'''PI2''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_pertusa.pdf A. Pertusa, J. M. Iñesta 2]<br /><br />
'''RFF1''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_reis.pdf G. Reis, F. Fernandez, A. Ferreira 1]<br /><br />
'''RFF2''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_reis.pdf G. Reis, F. Fernandez, A. Ferreira 2]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_ryynanen.pdf M. Ryynänen, A. Klapuri]<br /><br />
'''VBB''' = [https://www.music-ir.org/mirex/abstracts/2008/articleMIREX07.pdf E. Vincent, N. Bertin, R. Badeau]<br /><br />
'''YRC1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_yeh.pdf C. Yeh, A. Roebel, W-C. Chang 1]<br /><br />
'''YRC2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_yeh.pdf C. Yeh, A. Roebel, W-C. Chang 2]<br /><br />
'''ZR1''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_zhou.pdf R. Zhou, J. D. Reiss 1]<br /><br />
'''ZR2''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_zhou.pdf R. Zhou, J. D. Reiss 2]<br /><br />
'''ZR3''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_zhou.pdf R. Zhou, J. D. Reiss 3]<br /><br />
<br />
===Overall Summary Results Task 1===<br />
Below are the average scores across 36 test files. These files consisted of 9 groups, each group having 4 files ranging from polyphony 2 to polyphony 5: 28 real recordings and 8 synthesized from RWC samples.<br />
<br />
<csv>2008/multif0/task1_summary.csv</csv> <br />
<br />
====Detailed Results====<br />
<br />
<csv>2008/multif0/task1_res.csv</csv><br />
<br />
====Detailed Chroma Results====<br />
Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluating).<br />
<br />
<csv>2008/multif0/task1_res_chroma.csv</csv><br />
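The chroma folding mentioned above can be sketched as follows; this is an illustration only (the reference frequency and rounding are assumptions, not the actual evaluation code).<br />
<pre>
import math

def f0_to_chroma(f0_hz, ref_hz=440.0):
    """Fold an F0 in Hz onto a single octave (chroma class 0-11)."""
    semitones = 12.0 * math.log2(f0_hz / ref_hz)  # semitone distance from A4 (assumed reference)
    return int(round(semitones)) % 12

# 220 Hz, 440 Hz and 880 Hz all land on the same chroma class
assert f0_to_chroma(220.0) == f0_to_chroma(440.0) == f0_to_chroma(880.0)
</pre>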
<br />
====Individual Results Files for Task 1: Scores per Query====<br />
'''CL1''' = [https://www.music-ir.org/mirex/results/2008/multif0/CL1task1.tar.gz C. Cao, M. Li 1]<br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/results/2008/multif0/CL2task1.tar.gz C. Cao, M. Li 2]<br/><br />
'''DRD''' = [https://www.music-ir.org/mirex/results/2008/multif0/DRDtask1.tar.gz J-L. Durrieu, G. Richard, B. David] <br/><br />
'''EBD1''' = [https://www.music-ir.org/mirex/results/2008/multif0/EBD1task1.tar.gz V. Emiya, R. Badeau, B. David 1]<br /><br />
'''EBD2''' = [https://www.music-ir.org/mirex/results/2008/multif0/EBD2task1.tar.gz V. Emiya, R. Badeau, B. David 2]<br /><br />
'''EOS''' = [https://www.music-ir.org/mirex/results/2008/multif0/EOStask1.tar.gz K. Egashira, N. Ono, S. Sagayama]<br /><br />
'''MG''' = [https://www.music-ir.org/mirex/results/2008/multif0/MGtask1.tar.gz M. Groble]<br /><br />
'''PI1''' = [https://www.music-ir.org/mirex/results/2008/multif0/PI1task1.tar.gz A. Pertusa, J. M. Iñesta 1]<br /><br />
'''PI2''' = [https://www.music-ir.org/mirex/results/2008/multif0/PI2task1.tar.gz A. Pertusa, J. M. Iñesta 2]<br /><br />
'''RFF1''' = [https://www.music-ir.org/mirex/results/2008/multif0/RFF1task1.tar.gz G. Reis, F. Fernandez, A. Ferreira 1]<br /><br />
'''RFF2''' = [https://www.music-ir.org/mirex/results/2008/multif0/RFF2task1.tar.gz G. Reis, F. Fernandez, A. Ferreira 2]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/results/2008/multif0/RKtask1.tar.gz M. Ryynänen, A. Klapuri]<br /><br />
'''VBB''' = [https://www.music-ir.org/mirex/results/2008/multif0/VBBtask1.tar.gz E. Vincent, N. Bertin, R. Badeau]<br /><br />
'''YRC1''' = [https://www.music-ir.org/mirex/results/2008/multif0/YRC1task1.tar.gz C. Yeh, A. Roebel, W-C. Chang 1]<br /><br />
'''YRC2''' = [https://www.music-ir.org/mirex/results/2008/multif0/YRC2task1.tar.gz C. Yeh, A. Roebel, W-C. Chang 2]<br /><br />
<br />
=====Info about the filenames=====<br />
The filenames starting with part* come from acoustic woodwind recordings; the ones starting with RWC are synthesized. The legend for the instruments is:<br />
<br />
'''bs''' = bassoon,<br />
'''cl''' = clarinet,<br />
'''fl''' = flute,<br />
'''hn''' = horn,<br />
'''ob''' = oboe,<br />
'''vl''' = violin,<br />
'''cel''' = cello,<br />
'''gtr''' = guitar,<br />
'''sax''' = saxophone,<br />
'''bass''' = electric bass guitar<br />
<br />
====Run Times====<br />
<csv>2008/multif0/task1_runtimes.csv</csv><br />
<br />
MG ran on MAC, all other systems ran on ALE Nodes.<br />
<br />
===Overall Summary Results Task 2===<br />
This subtask is evaluated in two different ways. In the first setup, a returned note is assumed correct if its onset is within ±50 ms of a reference note and its F0 is within ± a quarter tone of the corresponding reference note, ignoring the returned offset values. In the second setup, on top of the above requirements, a correct returned note is also required to have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger.<br />
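As a minimal sketch of these two matching criteria (illustrative only; the field names and the quarter-tone check below are assumptions, not the exact evaluation code):<br />
<pre>
import math

def f0_within_quarter_tone(ret_f0, ref_f0):
    """True if the returned F0 is within half a semitone (a quarter tone) of the reference."""
    return abs(12.0 * math.log2(ret_f0 / ref_f0)) <= 0.5

def onset_only_match(ret, ref):
    """Setup 1: onset within +/-50 ms and F0 within a quarter tone; offsets ignored."""
    return abs(ret["onset"] - ref["onset"]) <= 0.050 and f0_within_quarter_tone(ret["f0"], ref["f0"])

def onset_offset_match(ret, ref):
    """Setup 2: setup-1 conditions plus an offset within 20% of the reference note's
    duration (or 50 ms, whichever is larger) around the reference offset."""
    tol = max(0.2 * (ref["offset"] - ref["onset"]), 0.050)
    return onset_only_match(ret, ref) and abs(ret["offset"] - ref["offset"]) <= tol
</pre>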
<br />
A total of 30 files were used in this task: 16 real recordings, 8 synthesized from RWC samples, and 6 piano. The results below are the average of these 30 files.<br />
<br />
<csv>2008/multif0/task2_summary.csv</csv><br />
<br />
====Detailed Results====<br />
<br />
<csv>2008/multif0/task2_res.csv</csv><br />
<br />
====Detailed Chroma Results====<br />
Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluating).<br />
<br />
<csv>2008/multif0/task2_res_chroma.csv</csv><br />
<br />
====Results Based on Onset Only====<br />
<br />
<csv>2008/multif0/task2_res_onsetonly.csv</csv><br />
<br />
====Chroma Results Based on Onset Only====<br />
<br />
<csv>2008/multif0/task2_res_onsetonly_chroma.csv</csv><br />
<br />
====Piano Subset Results Based on Onset Only====<br />
<br />
<csv>2008/multif0/task2_res_onsetonly_piano.csv</csv><br />
<br />
====Individual Results Files for Task 2====<br />
'''EBD1''' = [https://www.music-ir.org/mirex/results/2008/multif0/EBD1task2.tar.gz V. Emiya, R. Badeau, B. David 1]<br /><br />
'''EBD2''' = [https://www.music-ir.org/mirex/results/2008/multif0/EBD2task2.tar.gz V. Emiya, R. Badeau, B. David 2]<br /><br />
'''EOS''' = [https://www.music-ir.org/mirex/results/2008/multif0/EOStask2.tar.gz K. Egashira, N. Ono, S. Sagayama]<br /><br />
'''PI1''' = [https://www.music-ir.org/mirex/results/2008/multif0/PI1task2.tar.gz A. Pertusa, J. M. Iñesta 1]<br /><br />
'''PI2''' = [https://www.music-ir.org/mirex/results/2008/multif0/PI2task2.tar.gz A. Pertusa, J. M. Iñesta 2]<br /><br />
'''RFF1''' = [https://www.music-ir.org/mirex/results/2008/multif0/RFF1task2.tar.gz G. Reis, F. Fernandez, A. Ferreira 1]<br /><br />
'''RFF2''' = [https://www.music-ir.org/mirex/results/2008/multif0/RFF2task2.tar.gz G. Reis, F. Fernandez, A. Ferreira 2]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/results/2008/multif0/RKtask2.tar.gz M. Ryynänen, A. Klapuri]<br /><br />
'''VBB''' = [https://www.music-ir.org/mirex/results/2008/multif0/VBBtask2.tar.gz E. Vincent, N. Bertin, R. Badeau]<br /><br />
'''YRC1''' = [https://www.music-ir.org/mirex/results/2008/multif0/YRC1task2.tar.gz C. Yeh, A. Roebel, W-C. Chang 1]<br /><br />
'''ZR1''' = [https://www.music-ir.org/mirex/results/2008/multif0/ZR1task2.tar.gz R. Zhou, J. D. Reiss 1]<br /><br />
'''ZR2''' = [https://www.music-ir.org/mirex/results/2008/multif0/ZR2task2.tar.gz R. Zhou, J. D. Reiss 2]<br /><br />
'''ZR3''' = [https://www.music-ir.org/mirex/results/2008/multif0/ZR3task2.tar.gz R. Zhou, J. D. Reiss 3]<br /><br />
<br />
======Info About Filenames======<br />
The filenames starting with part* come from acoustic woodwind recordings; the ones starting with RWC are synthesized. The piano files are: RA_C030_align.wav, bach_847TESTp.wav, beet_pathetique_3TESTp.wav, mz_333_1TESTp.wav, scn_4TESTp.wav.note, ty_januarTESTp.wav.note<br />
<br />
====Run Times====<br />
<csv>2008/multif0/task2_runtimes.csv</csv><br />
<br />
ZR1,ZR2,ZR3 ran on BLACK. All other systems ran on ALE Nodes.<br />
<br />
===Friedman's Test for Significant Differences===<br />
<br />
====Task 1====<br />
The Friedman test was run in MATLAB to test significant differences amongst systems with regard to the performance (accuracy) on individual files.<br />
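For reference, a minimal sketch of the same omnibus test in Python (the official analysis used MATLAB's multcompare for the post-hoc comparisons, which scipy does not provide; the numbers below are made up purely for illustration):<br />
<pre>
import numpy as np
from scipy.stats import friedmanchisquare

# Illustrative accuracy matrix: rows = test files (blocks), columns = systems.
acc = np.array([[0.61, 0.55, 0.70],
                [0.58, 0.52, 0.66],
                [0.64, 0.57, 0.71],
                [0.60, 0.50, 0.69]])
stat, p_value = friedmanchisquare(*[acc[:, k] for k in range(acc.shape[1])])
</pre>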
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/multif0/task1.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
<br />
<csv>2008/multif0/task1.friedman.detailed.csv</csv><br />
<br />
[[Image:2008_multif0.task1.friedman.png]]<br />
<br />
====Task 2====<br />
The Friedman test was run in MATLAB to test significant differences amongst systems with regard to the performance (accuracy) on individual files.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/multif0/task2.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
<br />
<csv>2008/multif0/task2.friedman.detailed.csv</csv><br />
<br />
[[Image:2008_multif0.task2.friedman.png]]<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:MIREX2008_Results&diff=71542008:MIREX2008 Results2010-06-07T18:54:39Z<p>IMIRSELBot: Robot: Automated text replacement (-/mirex/2008/results/ +/mirex/results/2008/)</p>
<hr />
<div>===''NEW'' MIREX 2008 Plenary Topics Page===<br />
We have created a [[2008:MIREX2008_Plenary_Topics]] page which is intended to act as an informal notepad for community members who would like to suggest possible discussion topics for the MIREX 2008 Plenary Session. We will be monitoring the "Topics" page constantly prior to the plenary session. Also, please feel free to post your comments and ideas during and after the Plenary Session as we will use this information to help shape future iterations of MIREX.<br />
<br />
===FOR PARTICIPANTS===<br />
Please go ASAP to [[2008:MIREX2008_Poster_List| MIREX2008 Poster List Page]] and add your poster information to the list. The ISMIR organizers need to know how many posters to expect.<br />
<br />
The MIREX 2008 Poster Session will be held Wednesday, 17 September: 16:00-18:00. We will be holding the MIREX plenary meeting as a working lunch meeting at 13:30-15:00 on the same day. <br />
<br />
=OVERALL RESULTS POSTERS=<br />
[https://www.music-ir.org/mirex/results/2008/MIREX2008_overview_A0.pdf MIREX 2008 Overall Results Poster (PDF)] is now available!<br />
<br />
[https://www.music-ir.org/mirex/results/2008/tagPoster.pdf MIREX 2008 Classification tasks overall Results Poster (PDF)] is now available!<br />
<br />
== Results by Task ==<br />
<br />
* [[2008:Audio_Artist_Identification_Results | Audio Artist Identification Results]] (Done)<br />
* [[2008:Audio_Chord_Detection_Results | Audio Chord Detection]] (Done: Needs Abstracts)<br />
* [[2008:Audio_Classical_Composer_Identification_Results | Audio Classical Composer Identification Results ]] (Done)<br />
* [[2008:Audio_Cover_Song_Identification_Results | Audio Cover Song Identification Results]] (Done: Needs Abstracts)<br />
* [[2008:Audio_Genre_Classification_Results | Audio Genre Classification Results]] (Done)<br />
* [[2008:Audio_Melody_Extraction_Results | Audio Melody Extraction Results]] (Done)<br />
* [[2008:Audio_Music_Mood_Classification_Results | Audio Music Mood Classification]] (Done: Needs Abstracts)<br />
* [[2008:Audio_Tag_Classification_Results | Audio Tag Classification Results]] (Done)<br />
* [[2008:Multiple_Fundamental_Frequency_Estimation_&_Tracking_Results | Multiple Fundamental Frequency Estimation & Tracking Results]] (Done)<br />
* [[2008:Query-by-Singing/Humming_Results | Query-by-Singing/Humming Results]] (Done)<br />
* [[2008:Query-by-Tapping_Results | Query-by-Tapping Results]] (Done)<br />
* [[2008:Real-time_Audio_to_Score_Alignment_(a.k.a._Score_Following)_Results | Real-time Audio to Score Alignment (a.k.a. Score Following) Results]] (Done: Needs Abstracts)<br />
<br />
----<br />
<br />
== Machine Specifications ==<br />
<csv>2008/mirex08_machine_specs.csv</csv> <br />
<br />
[[Category:Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Tag_Classification_Results&diff=71532008:Audio Tag Classification Results2010-06-07T18:54:30Z<p>IMIRSELBot: Robot: Automated text replacement (-/mirex/2008/results/ +/mirex/results/2008/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Tag Classification task. For background information about this task set please refer to the [[2008:Audio Tag Classification]] page. <br />
<br />
===General Legend===<br />
====Team ID====<br />
'''LB''' = [https://www.music-ir.org/mirex/abstracts/2008/AT_barrington.pdf L. Barrington, D. Turnbull, G. Lanckriet]<br /><br />
'''BBE 1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex_knn.pdf T. Bertin-Mahieux, Y. Bengio, D. Eck (KNN)]<br /><br />
'''BBE 2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex_nnet.pdf T. Bertin-Mahieux, Y. Bengio, D. Eck (NNet)]<br /><br />
'''BBE 3''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex_boosters.pdf T. Bertin-Mahieux, D. Eck, P. Lamere, Y. Bengio (Thierry/Lamere Boosting)]<br /><br />
'''TB''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex_smurfs.pdf T. Bertin-Mahieux (dumb/smurf)]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 3]<br /><br />
'''GP1''' = [https://www.music-ir.org/mirex/abstracts/2008/Peeters_2008_ISMIR_MIREX.pdf G. Peeters 1]<br /><br />
'''GP2''' = [https://www.music-ir.org/mirex/abstracts/2008/Peeters_2008_ISMIR_MIREX.pdf G. Peeters 2]<br /><br />
'''TTKV''' = [https://www.music-ir.org/mirex/abstracts/2008/auth.pdf K. Trohidis, G. Tsoumakas, G. Kalliris, I. Vlahavas]<br /><br />
<br />
==Overall Summary Results==<br />
<br />
<csv>2008/tag/tag.grand.summary.show.csv</csv><br />
<br />
<br />
===Summary Positive Example Accuracy (Average Across All Folds)===<br />
<br />
<csv>2008/tag/rounded/tag.binary_avg_positive_example_Accuracy.csv</csv><br />
<br />
===Summary Negative Example Accuracy (Average Across All Folds)===<br />
<br />
<csv>2008/tag/rounded/tag.binary_avg_negative_example_Accuracy.csv</csv><br />
<br />
===Summary Binary relevance F-Measure (Average Across All Folds)===<br />
<br />
<csv>2008/tag/rounded/tag.binary_avg_Fmeasure.csv</csv><br />
<br />
===Summary Binary Accuracy (Average Across All Folds)===<br />
<br />
<csv>2008/tag/rounded/tag.binary_avg_Accuracy.csv</csv><br />
<br />
===Summary AUC-ROC Tag (Average Across All Folds)===<br />
<br />
<csv>2008/tag/rounded/tag.affinity_tag_AUC_ROC.csv</csv><br />
<br />
==Friedman test results==<br />
<br />
===AUC-ROC Tag Friedman test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the Area Under the ROC curve (AUC-ROC) for each '''tag''' in the test, averaged over all folds.<br />
<br />
<csv>2008/tag/friedmansTables/tag.affinity.AUC_ROC_TAG.friedman.tukeyKramerHSD.csv</csv><br />
<br />
[[Image:2008_affinity.auc_roc_tag.friedman.tukeykramerhsd.png]]<br />
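A minimal sketch of how a per-tag AUC-ROC of the kind reported above can be computed from affinity scores, using scikit-learn for illustration (the actual evaluation code may differ):<br />
<pre>
import numpy as np
from sklearn.metrics import roc_auc_score

def per_tag_auc(affinities, labels):
    """affinities, labels: arrays of shape (n_clips, n_tags); labels hold 0/1 ground truth.
    Returns the AUC-ROC for each tag, skipping tags with only one class present."""
    aucs = {}
    for t in range(labels.shape[1]):
        y = labels[:, t]
        if len(np.unique(y)) < 2:
            continue  # AUC is undefined when a tag has no positive or no negative examples
        aucs[t] = roc_auc_score(y, affinities[:, t])
    return aucs
</pre>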
<br />
===AUC-ROC Track Friedman test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the Area Under the ROC curve (AUC-ROC) for each '''track''' in the test. Each track appears exactly once across all three folds of the test. However, we are uncertain whether these measurements are truly independent, as multiple tracks from each artist are used.<br />
<br />
<csv>2008/tag/friedmansTables/tag.affinity.AUC_ROC_TRACK.friedman.tukeyKramerHSD.csv</csv><br />
<br />
[[Image:2008_affinity.auc_roc_track.friedman.tukeykramerhsd.png]]<br />
<br />
===Tag Classification Accuracy Friedman test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the classification accuracy for each '''tag''' in the test, averaged over all folds.<br />
<br />
<csv>2008/tag/friedmansTables/tag.binary_Accuracy.friedman.tukeyKramerHSD.csv</csv><br />
<br />
[[Image:2008_binary_accuracy.friedman.tukeykramerhsd.png]]<br />
<br />
===Tag F-measure Friedman test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the F-measure for each '''tag''' in the test, averaged over all folds.<br />
<br />
<csv>2008/tag/friedmansTables/tag.binary_FMeasure.friedman.tukeyKramerHSD.csv</csv><br />
<br />
[[Image:2008_binary_fmeasure.friedman.tukeykramerhsd.png]]<br />
<br />
<br />
==Beta-Binomial test results==<br />
<br />
===Accuracy on positive examples Beta-Binomial results===<br />
The following table and plot show the results of simulations from the Beta-Binomial model using the accuracy of each algorithm's classification only on the positive examples. It only shows the relative proportion of true positives and false negatives, and should be considered with the classification accuracy on the negative examples. The image shows the estimate of the overall performance with 95% confidence intervals.<br />
<br />
<br />
<csv>2008/tag/tag.binary.per.fold.positive.example.accuracy.betabinomial.csv</csv><br />
<br />
<br />
[[Image:binary_per_fold_positive_example_Accuracy.png]]<br />
<br />
<br />
The plots for each tag are more interesting and the 95% confidence intervals are much tighter. Since there are so many of them, it is difficult to post them to the wiki. You can download a tar.gz file containing all of them [https://www.music-ir.org/mirex/results/2008/tag/binary_positive_example_Accuracy.betaBinomial.images.tar.gz here].<br />
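As a rough sketch of the kind of posterior simulation involved (the uniform Beta(1,1) prior below is an assumption; the model actually fitted may differ):<br />
<pre>
import numpy as np

def beta_binomial_interval(n_correct, n_total, n_samples=10000, seed=0):
    """Posterior mean and 95% interval for an accuracy, assuming a Beta(1,1) prior
    and binomially distributed correct counts."""
    rng = np.random.default_rng(seed)
    draws = rng.beta(1 + n_correct, 1 + (n_total - n_correct), size=n_samples)
    return draws.mean(), np.percentile(draws, [2.5, 97.5])

# e.g. 180 of 240 positive examples classified correctly (made-up numbers)
mean_acc, (lo, hi) = beta_binomial_interval(180, 240)
</pre>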
<br />
===Accuracy on negative examples Beta-Binomial results===<br />
The following table and plot show the results of simulations from the Beta-Binomial model using the accuracy of each algorithm's classification only on the negative examples. It only shows the relative proportion of true negatives and false positives, and should be considered with the classification accuracy on the positive examples. The image shows the estimate of the overall performance with 95% confidence intervals.<br />
<br />
<br />
<csv>2008/tag/tag.binary.per.fold.negative.example.accuracy.betabinomial.csv</csv><br />
<br />
<br />
[[Image:binary_per_fold_negative_example_Accuracy.png]]<br />
<br />
<br />
The plots for each tag are more interesting and the 95% confidence intervals are much tighter. Since there are so many of them, it is difficult to post them to the wiki. You can download a tar.gz file containing all of them [https://www.music-ir.org/mirex/results/2008/tag/binary_negative_example_Accuracy.betaBinomial.images.tar.gz here].<br />
<br />
==Assorted Results Files for Download==<br />
===AUC-ROC Clip Data===<br />
(Too large for easy Wiki viewing)<br /><br />
[https://www.music-ir.org/mirex/results/2008/tag/rounded/tag.affinity_clip_AUC_ROC.csv tag.affinity_clip_AUC_ROC.csv]<br /><br />
<br />
===CSV Files Without Rounding (Averaged across folds)===<br />
[https://www.music-ir.org/mirex/results/2008/tag/csv_raw/tag.affinity.tag.auc.roc.csv tag.affinity.tag.auc.roc.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/tag/csv_raw/tag.affinity.clip.auc.roc.csv tag.affinity.clip.auc.roc.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/tag/csv_raw/tag.binary.avg.accuracy.csv tag.binary.accuracy.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/tag/csv_raw/tag.binary.avg.fmeasure.csv tag.binary.fmeasure.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/tag/csv_raw/tag.binary.avg.negative.example.accuracy.csv tag.binary.negative.example.accuracy.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/tag/csv_raw/tag.binary.avg.positive.example.accuracy.csv tag.binary.positive.example.accuracy.csv]<br /><br />
<br />
===CSV Files Without Rounding (Fold information)===<br />
[https://www.music-ir.org/mirex/results/2008/tag/csv_raw/tag.binary.per.fold.positive.example.accuracy.csv tag.binary.per.fold.positive.example.accuracy.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/tag/csv_raw/tag.binary.per.fold.negative.example.accuracy.csv tag.binary.per.fold.negative.example.accuracy.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/tag/csv_raw/tag.binary.per.fold.fmeasure.csv tag.binary.per.fold.fmeasure.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/tag/csv_raw/tag.binary.per.fold.accuracy.csv tag.binary.per.fold.accuracy.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/tag/csv_raw/tag.affinity.tag.per.fold.auc.roc.csv tag.affinity.tag.per.fold.auc.roc.csv]<br /><br />
<br />
===Results By Algorithm===<br />
(.tar.gz) <br /><br />
'''LB''' = [https://www.music-ir.org/mirex/results/2008/tag/detailedReports/LB.tar.gz L. Barrington, D. Turnbull, G. Lanckriet]<br /><br />
'''BBE 1''' = [https://www.music-ir.org/mirex/results/2008/tag/detailedReports/BBE1.tar.gz T. Bertin-Mahieux, Y. Bengio, D. Eck (KNN)]<br /><br />
'''BBE 2''' = [https://www.music-ir.org/mirex/results/2008/tag/detailedReports/BBE2.tar.gz T. Bertin-Mahieux, Y. Bengio, D. Eck (NNet)]<br /><br />
'''BBE 3''' = [https://www.music-ir.org/mirex/results/2008/tag/detailedReports/BBE3.tar.gz T. Bertin-Mahieux, D. Eck, P. Lamere, Y. Bengio (Thierry/Lamere Boosting)]<br /><br />
'''TB''' = [https://www.music-ir.org/mirex/results/2008/tag/detailedReports/TB.tar.gz Bertin-Mahieux (dumb/smurf)]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/results/2008/tag/detailedReports/ME1.tar.gz M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/results/2008/tag/detailedReports/ME2.tar.gz M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/results/2008/tag/detailedReports/ME3.tar.gz M. I. Mandel, D. P. W. Ellis 3]<br /><br />
'''GP1''' = [https://www.music-ir.org/mirex/results/2008/tag/detailedReports/GP1.tar.gz G. Peeters 1]<br /><br />
'''GP2''' = [https://www.music-ir.org/mirex/results/2008/tag/detailedReports/GP2.tar.gz G. Peeters 2]<br /><br />
'''TTKV''' = [https://www.music-ir.org/mirex/results/2008/tag/detailedReports/TTKV.tar.gz K. Trohidis, G. Tsoumakas, G. Kalliris, I. Vlahavas]<br /><br />
<br />
<br />
<br />
<br />
[[Category: Results]]<br />
</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2007:Symbolic_Melodic_Similarity_Results&diff=67302007:Symbolic Melodic Similarity Results2010-05-14T03:52:22Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction== <br />
These are the results for the 2007 running of the Symbolic Melodic Similarity task set. For background information about this task set please refer to the [[2007:Symbolic Melodic Similarity]] page.<br />
<br />
Each system was given a query and returned the 10 most melodically similar songs from the Essen Collection (5274 pieces in MIDI format; see [http://www.esac-data.org/ ESAC Data Homepage] for more information). For each query, we made four classes of error-mutations, so the query set comprises the following query classes (a sketch of these mutations follows the list):<br />
<br />
* 0. No errors<br />
* 1. One note deleted<br />
* 2. One note inserted<br />
* 3. One interval enlarged<br />
* 4. One interval compressed<br />
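A sketch of how such mutations could be generated from a query melody (which note or interval is edited, and by how much, are illustrative assumptions only):<br />
<pre>
import random

def mutate_query(pitches, kind, semitones=1, seed=0):
    """Return an error-mutated copy of a query (a list of MIDI pitch numbers).
    kind: 'delete', 'insert', 'enlarge' or 'compress'."""
    rng = random.Random(seed)
    out = list(pitches)
    if kind == 'delete':
        del out[rng.randrange(len(out))]
    elif kind == 'insert':
        i = rng.randrange(len(out))
        out.insert(i, out[i])            # duplicate one note as the inserted note
    else:                                # 'enlarge' or 'compress'
        i = rng.randrange(1, len(out))   # pick the interval between notes i-1 and i
        step = semitones if kind == 'enlarge' else -semitones
        direction = 1 if out[i] >= out[i - 1] else -1
        out[i:] = [p + direction * step for p in out[i:]]  # widen or narrow that interval
    return out
</pre>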
<br />
For each query (and its 4 mutations), the returned results (candidates) from all systems were then grouped together (query set) for evaluation by the human graders. The graders heard only the perfect version of each query, against which they evaluated the candidates, and did not know whether a candidate came from a perfect or mutated query. Each query/candidate set was evaluated by one individual grader. Using the Evalutron 6000 system, the graders gave each query/candidate pair two types of scores: one categorical score with 3 categories (NS, SS, VS, as explained below) and one fine score (in the range from 0 to 10).<br />
<br />
====Evalutron 6000 Summary Data====<br />
'''Number of evaluators''' = 6 <br /><br />
'''Number of evaluations per query/candidate pair''' = 1 <br /><br />
'''Number of queries per grader''' = 1 <br /><br />
'''Total number of candidates returned''' = 2400 <br /><br />
'''Total number of unique query/candidate pairs graded''' = 799<br /><br />
'''Average number of query/candidate pairs evaluated per grader''' = 133 <br /><br />
'''Number of queries''' = 6 perfect queries, each error-mutated 4 different ways, for a total of 30<br /><br />
<br />
===General Legend===<br />
====Team ID====<br />
'''FHAR''' = [https://www.music-ir.org/mirex/abstracts/2007/SMS_ferraro.pdf Pascal Ferraro, Pierre Hanna, Julien Allali, Matthias Robine]<br /><br />
'''GAR1''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_SMS_gomez.pdf Carlos Gómez, Soraya Abad-Mota, Edna Ruckhaus 1]<br /><br />
'''GAR2''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_SMS_gomez.pdf Carlos Gómez, Soraya Abad-Mota, Edna Ruckhaus 2]<br /> <br />
'''AP1''' = [https://www.music-ir.org/mirex/abstracts/2007/SMS_pinto.pdf Alberto Pinto 1]<br /><br />
'''AP2''' = [https://www.music-ir.org/mirex/abstracts/2007/SMS_pinto.pdf Alberto Pinto 2]<br /><br />
'''AU1''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_SMS_uitdenbogerd.pdf Alexandra L. Uitdenbogerd 1]<br /><br />
'''AU2''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_SMS_uitdenbogerd.pdf Alexandra L. Uitdenbogerd 2]<br /><br />
'''AU3''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_SMS_uitdenbogerd.pdf Alexandra L. Uitdenbogerd 3]<br /><br />
<br />
====Broad Categories====<br />
'''NS''' = Not Similar<br /><br />
'''SS''' = Somewhat Similar<br /> <br />
'''VS''' = Very Similar<br /><br />
<br />
====Table Headings (Other metrics to be added soon to results by Xiao Hu)====<br />
'''ADR''' = Average Dynamic Recall <br /><br />
'''NRGB''' = Normalized Recall at Group Boundaries <br /><br />
'''AP''' = Average Precision (non-interpolated) <br /><br />
'''PND''' = Precision at N Documents <br /><br />
<br />
===Calculating Summary Measures===<br />
'''Fine'''<sup>(1)</sup> = Sum of fine-grained human similarity decisions (0-10). <br /><br />
'''PSum'''<sup>(1)</sup> = Sum of human broad similarity decisions: NS=0, SS=1, VS=2. <br /><br />
'''WCsum'''<sup>(1)</sup> = 'World Cup' scoring: NS=0, SS=1, VS=3 (rewards Very Similar). <br /><br />
'''SDsum'''<sup>(1)</sup> = 'Stephen Downie' scoring: NS=0, SS=1, VS=4 (strongly rewards Very Similar). <br /><br />
'''Greater0'''<sup>(1)</sup> = NS=0, SS=1, VS=1 (binary relevance judgement).<br /><br />
'''Greater1'''<sup>(1)</sup> = NS=0, SS=0, VS=1 (binary relevance judgement using only Very Similar).<br /><br />
<br />
<sup>(1)</sup>Normalized to the range 0 to 1.<br />
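A minimal sketch of these summary measures, assuming each sum is normalized by the maximum score attainable for the list (illustrative only, not the official evaluation code):<br />
<pre>
def summary_scores(broad, fine):
    """broad: list of 'NS'/'SS'/'VS' grades; fine: list of 0-10 scores for the same
    query/candidate pairs. Returns the normalized (0-1) summary measures."""
    n = len(broad)
    maps = {
        'PSum':     {'NS': 0, 'SS': 1, 'VS': 2},
        'WCsum':    {'NS': 0, 'SS': 1, 'VS': 3},
        'SDsum':    {'NS': 0, 'SS': 1, 'VS': 4},
        'Greater0': {'NS': 0, 'SS': 1, 'VS': 1},
        'Greater1': {'NS': 0, 'SS': 0, 'VS': 1},
    }
    scores = {'Fine': sum(fine) / (10.0 * n)}  # fine scores normalized by their 0-10 range
    for name, weights in maps.items():
        scores[name] = sum(weights[b] for b in broad) / (max(weights.values()) * n)
    return scores
</pre>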
<br />
==Summary Results==<br />
===Run Times===<br />
<csv>2007/sms_runtimes.csv</csv><br />
===Overall Scores (Includes Perfect and Error Candidates)===<br />
<csv>2007/SMS07_overall_norm.csv</csv><br />
<br />
===Overall Summaries (Presented by Error Types)===<br />
<br />
<csv>2007/SMS07_errors_norm.csv</csv><br />
<br />
===Friedman Test with Multiple Comparisons Results (p=0.05)===<br />
The Friedman test was run in MATLAB against the Fine summary data over the 100 queries.<br /><br />
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);<br />
<csv>2007/sms07_sum_friedman_fine.csv</csv><br />
<csv>2007/sms07_detail_friedman_fine.csv</csv><br />
<br />
[[Image:2007 sms fine scores friedmans.png]]<br />
<br />
==Raw Scores==<br />
The raw data derived from the Evalutron 6000 human evaluations are located on the [[2007:Symbolic Melodic Similarity Raw Data]] page.<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2007:Query-by-Singing/Humming_Results&diff=67292007:Query-by-Singing/Humming Results2010-05-14T03:52:13Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2007 running of the Query-by-Singing/Humming task. For background information about this task set please refer to the [[2007:Query by Singing/Humming]] page. <br />
<br />
===Task Descriptions===<br />
<br />
'''Task 1 [[#Task 1 Results|Goto Task 1 Results]]''': The first subtask is the same as last year's. In this subtask, submitted systems take a sung query as input and return a list of songs from the test database. Mean reciprocal rank (MRR) of the ground truth is calculated over the top 20 returns. The test database consists of 48 ground-truth MIDIs + 2000 Essen Collection MIDI noise files. See [http://www.esac-data.org/ ESAC Data Homepage] for more information about the Essen Collection. The query database consists of 2797 sung queries. <br />
<br />
'''Task 2 [[#Task 2 Results|Goto Task 2 Results]]''': In the second subtask, the same setup as the first subtask is used, with combinations of different transcribers and matchers. The test database consists of 106 ground-truth MIDIs + 2000 Essen Collection MIDI noise files. The query database consists of 355 sung queries.<br />
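Both subtasks score systems with mean reciprocal rank over the top 20 returns; a minimal sketch of that calculation (illustrative only, with 1-based ranks and a score of 0 when the ground truth is missed):<br />
<pre>
def mean_reciprocal_rank(returned_lists, ground_truth, top_n=20):
    """returned_lists: one ranked list of song IDs per query;
    ground_truth: the correct song ID for each query."""
    total = 0.0
    for ranked, truth in zip(returned_lists, ground_truth):
        top = ranked[:top_n]
        if truth in top:
            total += 1.0 / (top.index(truth) + 1)  # reciprocal of the 1-based rank
    return total / len(ground_truth)
</pre>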
<br />
===General Legend===<br />
====Team ID====<br />
'''FH''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_ferraro.pdf Pascal Ferraro, Pierre Hanna, Julien Allali, Matthias Robine]<br /><br />
'''CG''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_SMS_gomez.pdf Carlos Gómez, Soraya Abad-Mota, Edna Ruckhaus]<br /><br />
'''RJ1''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_jang.pdf J.-S. Roger Jang, Nien-Jung Lee, Chao-Ling Hsu 1]<br /><br />
'''RJ2''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_jang.pdf J.-S. Roger Jang, Nien-Jung Lee, Chao-Ling Hsu 2]<br /><br />
'''NM''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_lemstrom.pdf Kjell Lemström, Niko Mikkilä]<br /><br />
'''XW1''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_wu.pdf Xiao Wu, Ming Li 1]<br /><br />
'''XW2''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_wu.pdf Xiao Wu, Ming Li 2]<br /><br />
'''AU1''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_SMS_uitdenbogerd.pdf Alexandra L. Uitdenbogerd 1]<br /><br />
'''AU2''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_SMS_uitdenbogerd.pdf Alexandra L. Uitdenbogerd 2]<br /><br />
'''AU3''' = [https://www.music-ir.org/mirex/abstracts/2007/QBSH_SMS_uitdenbogerd.pdf Alexandra L. Uitdenbogerd 3]<br /><br />
<br />
===Task 1 Results===<br />
The first subtask is the same as last year's. In this subtask, submitted systems take a sung query as input and return a list of songs from the test database. Mean reciprocal rank (MRR) of the ground truth is calculated over the top 20 returns. The test database consists of 48 ground-truth MIDIs + 2000 Essen Collection MIDI noise files. The query database consists of 2797 sung queries. <br />
<br />
=====Task 1 Overall Results=====<br />
<csv>2007/qbsh07_task1_overall.csv</csv><br />
<br />
====Task 1 Friedman's Test for Significant Differences====<br />
The Friedman test was run in MATLAB against the QBSH Task 1 MRR data over the 48 ground truth song groups.<br />
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);<br />
<csv>2007/qbsh07_task1_sum_friedmans.csv</csv><br />
<csv>2007/qbsh07_task1_detail_friedmans.csv</csv><br />
[[Image:2007_qbsh07_task1_friedmans.png]]<br />
<br />
====Task 1 Summary Results by Query Group====<br />
<csv>2007/qbsh07_task1_avg_per_group.csv</csv><br />
<br />
===Task 2 Results===<br />
In this subtask, the same setup as the first subtask is used, with combinations of different transcribers and matchers. The test database consists of 106 ground-truth MIDIs + 2000 Essen Collection MIDI noise files. The query database consists of 355 sung queries.<br />
<br />
====Task 2 Legend====<br />
=====Team ID=====<br />
'''FH_XW''' = [https://www.music-ir.org/mirex2007/abs/QBSH_ferraro.pdf Pascal Ferraro, Pierre Hanna, Julien Allali, Matthias Robine based on XW note transcriber]<br /><br />
'''CG_XW''' = [https://www.music-ir.org/mirex2007/abs/QBSH_SMS_gomez.pdf Carlos Gómez, Soraya Abad-Mota, Edna Ruckhaus based on XW note transcriber]<br /><br />
'''RJ1_RJ''' = [https://www.music-ir.org/mirex2007/abs/QBSH_jang.pdf J.-S. Roger Jang, Nien-Jung Lee, Chao-Ling Hsu 1 based on RJ pitch transcriber]<br /><br />
'''RJ1_XW''' = [https://www.music-ir.org/mirex2007/abs/QBSH_jang.pdf J.-S. Roger Jang, Nien-Jung Lee, Chao-Ling Hsu 1 based on XW pitch transcriber]<br /><br />
'''RJ2_RJ''' = [https://www.music-ir.org/mirex2007/abs/QBSH_jang.pdf J.-S. Roger Jang, Nien-Jung Lee, Chao-Ling Hsu 2 based on RJ pitch transcriber]<br /><br />
'''RJ2_XW''' = [https://www.music-ir.org/mirex2007/abs/QBSH_jang.pdf J.-S. Roger Jang, Nien-Jung Lee, Chao-Ling Hsu 2 based on XW pitch transcriber]<br /><br />
'''NM_XW''' = [https://www.music-ir.org/mirex2007/abs/QBSH_lemstrom.pdf Kjell Lemström, Niko Mikkilä based on XW note transcriber]<br /><br />
'''XW1_XW''' = [https://www.music-ir.org/mirex2007/abs/QBSH_wu.pdf Xiao Wu, Ming Li 1 based on XW note transcriber]<br /><br />
'''XW2_XW''' = [https://www.music-ir.org/mirex2007/abs/QBSH_wu.pdf Xiao Wu, Ming Li 2 based on XW pitch transcriber]<br /><br />
<br />
=====Task 2 Overall Results=====<br />
<csv>2007/qbsh07_task2_overall.csv</csv><br />
<br />
====Task 2 Friedman's Test for Significant Differences====<br />
The Friedman test was run in MATLAB against the QBSH Task 2 MRR data over the ground truth song groups.<br />
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);<br />
<csv>2007/qbsh07_task2_sum_friedmans.csv</csv><br />
<csv>2007/qbsh07_task2_detail_friedmans.csv</csv><br />
[[Image:2007_qbsh07_task2_friedmans.png]]<br />
<br />
====Task 2 Summary Results by Query Group====<br />
<csv>2007/qbsh07_task2_avg_per_group.csv</csv><br />
<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2007:Multiple_Fundamental_Frequency_Estimation_%26_Tracking_Results&diff=67282007:Multiple Fundamental Frequency Estimation & Tracking Results2010-05-14T03:52:02Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2007 running of the Multiple Fundamental Frequency Estimation and Tracking task. For background information about this task set please refer to the [[2007:Multiple Fundamental Frequency Estimation & Tracking]] page.<br />
<br />
<br />
<br />
===General Legend===<br />
====Team ID====<br />
'''CC1''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_cao.pdf Chuan Cao, Ming Li, Jian Liu, Yonghong Yan 1]<br /><br />
'''CC2''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_cao.pdf Chuan Cao, Ming Li, Jian Liu, Yonghong Yan 2]<br /><br />
'''AC1''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_cont.pdf Arshia Cont 1]<br /><br />
'''AC2''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_cont.pdf Arshia Cont 2]<br /><br />
'''AC3''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_cont.pdf Arshia Cont 3]<br /><br />
'''AC4''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_cont.pdf Arshia Cont 4]<br /><br />
'''KE1''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_egashira.pdf Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama 1]<br /><br />
'''KE2''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_egashira.pdf Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama 2]<br /><br />
'''KE3''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_egashira.pdf Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama 3]<br /><br />
'''KE4''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_egashira.pdf Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama 4]<br /><br />
'''VE1''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_emiya.pdf Valentin Emiya, Roland Badeau, Bertrand David 1]<br /><br />
'''VE2''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_emiya.pdf Valentin Emiya, Roland Badeau, Bertrand David 2]<br /><br />
'''PL''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_leveau.pdf Pierre Leveau]<br /><br />
'''PI1''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_pertusa.pdf Antonio Pertusa, José Manuel Iñesta 1]<br /><br />
'''PI2''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_pertusa.pdf Antonio Pertusa, José Manuel Iñesta 2]<br /><br />
'''PI3''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_pertusa.pdf Antonio Pertusa, José Manuel Iñesta 3]<br /><br />
'''PE1''' = Graham Poliner, Daniel P. W. Ellis 1<br /><br />
'''PE2''' = Graham Poliner, Daniel P. W. Ellis 2<br /><br />
'''SR''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_raczynski.pdf Stanisław A. Raczyński, Nobutaka Ono, Shigeki Sagayama]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_ryynanen.pdf Matt Ryynänen, Anssi Klapuri]<br /><br />
'''EV1''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_vincent.pdf Emmanuel Vincent, Nancy Bertin, Roland Badeau 1]<br /><br />
'''EV2''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_vincent.pdf Emmanuel Vincent, Nancy Bertin, Roland Badeau 2]<br /><br />
'''EV3''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_vincent.pdf Emmanuel Vincent, Nancy Bertin, Roland Badeau 3]<br /><br />
'''EV4''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_vincent.pdf Emmanuel Vincent, Nancy Bertin, Roland Badeau 4]<br /><br />
'''CY''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_yeh.pdf Chunghsin Yeh]<br /><br />
'''ZR''' = [https://www.music-ir.org/mirex/abstracts/2007/F0_zhou.pdf Ruohua Zhou, Joshua D. Reiss]<br /><br />
<br />
[[Category: Results]]<br />
<br />
==Overall Summary Results Task1==<br />
Below are the average scores across 28 test files. These files consisted of 7 groups, each group having 4 files ranging from polyphony 2 to polyphony 5: 20 real recordings and 8 synthesized from RWC samples.<br />
<br />
<csv>2007/multiF0Task1.results.csv</csv> <br />
<br />
[[Category:Results]]<br />
<br />
Where<br />
<br />
[[Image:2007_ev_formulas.png]]<br />
<br />
*'''Nref''' is the number of non-zero elements in the ground truth data. <br />
*'''Nsys''' is the number of active elements returned by the system. <br />
*'''Ncorr''' is the number of correctly identified elements.<br />
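The authoritative formulas are those shown in the image above; as a sketch, a commonly used frame-level formulation built from these counts (assumed here, not taken from the image) is:<br />
<pre>
def frame_level_scores(n_ref, n_sys, n_corr):
    """Typical frame-level multi-F0 measures built from the counts defined above
    (standard definitions assumed; see the linked formula image for the exact ones)."""
    precision = n_corr / n_sys if n_sys else 0.0
    recall = n_corr / n_ref if n_ref else 0.0
    # an overall accuracy that penalizes both missed and spurious F0s
    accuracy = n_corr / (n_ref + n_sys - n_corr) if (n_ref + n_sys - n_corr) else 0.0
    return precision, recall, accuracy
</pre>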
<br />
==Individual Results Files for Task 1==<br />
===Individual results: Scores per Query===<br />
'''AC1''' = [https://www.music-ir.org/mirex/2007/results/AC1.results.csv Arshia Cont]<br /><br />
'''AC2''' = [https://www.music-ir.org/mirex/2007/results/AC2.results.csv Arshia Cont]<br/><br />
'''CC1''' = [https://www.music-ir.org/mirex/2007/results/CC1.results.csv Chuan Cao, Ming Li, Jian Liu, Yonghong Yan 1] <br/><br />
'''CC2''' = [https://www.music-ir.org/mirex/2007/results/CC2.results.csv Chuan Cao, Ming Li, Jian Liu, Yonghong Yan 2]<br /><br />
'''CY''' = [https://www.music-ir.org/mirex/2007/results/CY.results.csv Chunghsin Yeh]<br /><br />
'''EV1''' = [https://www.music-ir.org/mirex/2007/results/EV1.results.csv Emmanuel Vincent, Nancy Bertin, Roland Badeau]<br /><br />
'''EV2''' = [https://www.music-ir.org/mirex/2007/results/EV2.results.csv Emmanuel Vincent, Nancy Bertin, Roland Badeau]<br /><br />
'''KE1''' = [https://www.music-ir.org/mirex/2007/results/KE1.results.csv Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama]<br /><br />
'''KE2''' = [https://www.music-ir.org/mirex/2007/results/KE2.results.csv Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama]<br /><br />
'''PE''' = [https://www.music-ir.org/mirex/2007/results/PE.results.csv Graham Poliner, Daniel P. W. Ellis]<br /><br />
'''PI1''' = [https://www.music-ir.org/mirex/2007/results/PI1.results.csv Antonio Pertusa, José Manuel Iñesta 1]<br /><br />
'''PL''' = [https://www.music-ir.org/mirex/2007/results/PL.results.csv Pierre Leveau]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/2007/results/RK.results.csv Matt Ryynänen, Anssi Klapuri]<br /><br />
'''SR''' = [https://www.music-ir.org/mirex/2007/results/SR.results.csv Stanislaw A. Raczynski, Nobutaka Ono, Shigeki Sagayama]<br /><br />
'''VE1''' = [https://www.music-ir.org/mirex/2007/results/VE1.results.csv Valentin Emiya, Roland Badeau, Bertrand David ]<br /><br />
'''ZR''' = [https://www.music-ir.org/mirex/2007/results/ZR.results.csv Ruohua Zhou, Joshua D. Reiss]<br /><br />
<br />
[[Category: Results]]<br />
====Info about the filenames====<br />
The filenames starting with part* come from acoustic woodwind recordings; the ones starting with RWC are synthesized. The legend for the instruments is:<br />
<br />
'''bs''' = bassoon<br />
<br />
'''cl''' = clarinet<br />
<br />
'''fl''' = flute<br />
<br />
'''hn''' = horn<br />
<br />
'''ob''' = oboe<br />
<br />
'''vl''' = violin<br />
<br />
'''cel''' = cello<br />
<br />
'''gtr''' = guitar<br />
<br />
'''sax''' = saxophone<br />
<br />
'''bass''' = electric bass guitar<br />
<br />
===Run Times===<br />
<csv>2007/multiF0_task1_runtimes.csv</csv> <br />
<br />
[[Category:Results]]<br />
<br />
==Overall Summary Results Task II==<br />
This subtask is evaluated in two different ways. In the first setup, a returned note is assumed correct if its onset is within ±50 ms of a reference note and its F0 is within ± a quarter tone of the corresponding reference note, ignoring the returned offset values. In the second setup, on top of the above requirements, a correct returned note is also required to have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger.<br />
The Overlap ratio is calculated for an individual correctly identified note as <br />
[[Image:2007_overlap.png]]<br />
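A sketch of the usual overlap-ratio definition (assumed here; the authoritative formula is the image above) for a correctly matched reference/returned note pair:<br />
<pre>
def overlap_ratio(ref_onset, ref_offset, ret_onset, ret_offset):
    """Shared time span of the two notes divided by the span of their union
    (standard definition assumed; see the formula image above)."""
    shared = min(ref_offset, ret_offset) - max(ref_onset, ret_onset)
    union = max(ref_offset, ret_offset) - min(ref_onset, ret_onset)
    return max(shared, 0.0) / union if union > 0 else 0.0
</pre>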
<br />
A total of 30 files were used in this task: 16 real recordings, 8 synthesized from RWC samples, and 6 piano. The results below are the average of these 30 files.<br />
<br />
===Results Based on Onset Only===<br />
<csv>2007/multiF0.note.onset.only.eval.csv</csv><br />
<br />
===Results Based on Onset-Offset===<br />
<csv>2007/multiF0.note.eval.csv</csv><br />
<br />
===Piano Results based on Onset Only===<br />
<csv>2007/multiF0.note.onset.only.eval.for.piano.csv</csv><br />
<br />
==Individual Results Files for Task 2==<br />
===For Onset only Evaluation===<br />
'''AC3''' = [https://www.music-ir.org/mirex/2007/results/AC3.note.onset.only.results.csv Arshia Cont]<br /><br />
'''AC4''' = [https://www.music-ir.org/mirex/2007/results/AC4.note.onset.only.results2.csv Arshia Cont]<br/><br />
'''EV3''' = [https://www.music-ir.org/mirex/2007/results/EV3.note.onset.only.results.csv Emmanuel Vincent, Nancy Bertin, Roland Badeau]<br /><br />
'''EV4''' = [https://www.music-ir.org/mirex/2007/results/EV4.note.onset.only.results.csv Emmanuel Vincent, Nancy Bertin, Roland Badeau]<br /><br />
'''KE3''' = [https://www.music-ir.org/mirex/2007/results/KE3.note.onset.only.results.csv Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama]<br /><br />
'''KE4''' = [https://www.music-ir.org/mirex/2007/results/KE4.note.onset.only.results.csv Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama]<br /><br />
'''PE2''' = [https://www.music-ir.org/mirex/2007/results/PE2.note.onset.only.results.csv Graham Poliner, Daniel P. W. Ellis]<br /><br />
'''PI2''' = [https://www.music-ir.org/mirex/2007/results/PI2.note.onset.only.results.csv Antonio Pertusa, José Manuel Iñesta 2]<br /><br />
'''PI3''' = [https://www.music-ir.org/mirex/2007/results/PI3.note.onset.only.results.csv Antonio Pertusa, José Manuel Iñesta 3]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/2007/results/RK.note.onset.only.results.csv Matt Ryynänen, Anssi Klapuri]<br /><br />
'''VE2''' = [https://www.music-ir.org/mirex/2007/results/VE2.note.onset.only.results.csv Valentin Emiya, Roland Badeau, Bertrand David]<br /><br />
<br />
===For Onset/Offset Evaluation===<br />
'''AC3''' = [https://www.music-ir.org/mirex/2007/results/AC3.note.results2.csv Arshia Cont]<br /><br />
'''AC4''' = [https://www.music-ir.org/mirex/2007/results/AC4.note.results2.csv Arshia Cont]<br/><br />
'''EV3''' = [https://www.music-ir.org/mirex/2007/results/EV3.note.results.csv Emmanuel Vincent, Nancy Bertin, Roland Badeau]<br /><br />
'''EV4''' = [https://www.music-ir.org/mirex/2007/results/EV4.note.results.csv Emmanuel Vincent, Nancy Bertin, Roland Badeau]<br /><br />
'''KE3''' = [https://www.music-ir.org/mirex/2007/results/KE3.note.results.csv Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama]<br /><br />
'''KE4''' = [https://www.music-ir.org/mirex/2007/results/KE4.note.results.csv Koji Egashira, Hirokazu Kameoka, Shigeki Sagayama]<br /><br />
'''PE2''' = [https://www.music-ir.org/mirex/2007/results/PE2.note.results.csv Graham Poliner, Daniel P. W. Ellis]<br /><br />
'''PI2''' = [https://www.music-ir.org/mirex/2007/results/PI2.note.results.csv Antonio Pertusa, José Manuel Iñesta 2]<br /><br />
'''PI3''' = [https://www.music-ir.org/mirex/2007/results/PI3.note.results.csv Antonio Pertusa, José Manuel Iñesta 3]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/2007/results/RK.note.results.csv Matt Ryynänen, Anssi Klapuri]<br /><br />
'''VE2''' = [https://www.music-ir.org/mirex/2007/results/VE2.note.results.csv Valentin Emiya, Roland Badeau, Bertrand David ]<br /><br />
<br />
===Info About Filenames===<br />
The filenames starting with part* come from acoustic woodwind recordings; the ones starting with RWC are synthesized. The piano files are: RA_C030_align.wav, bach_847TESTp.wav, beet_pathetique_3TESTp.wav, mz_333_1TESTp.wav, scn_4TESTp.wav.note, ty_januarTESTp.wav.note<br />
<br />
===Run Times===<br />
<csv>2007/multiF0task2.runtimes.csv</csv></div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2007:MIREX2007_Results&diff=67272007:MIREX2007 Results2010-05-14T03:51:52Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>19 September 2007<br />
<br />
All result data sets are now posted. A few tasks are still missing run-time data which, most likely, will have to wait until after ISMIR, as IMIRSEL folks are starting their journeys to Vienna.<br />
<br />
The MIREX 2007 Plenary Session will be held Wednesday, 26 September 2007 (1400-1530h) during the main ISMIR 2007 conference. This will be followed by the MIREX Poster Session (1530-1630h). <br />
<br />
===''NEW'' MIREX 2007 Plenary Topics Page===<br />
We have created a [[2007:MIREX2007_Plenary_Topics]] page which is intended to act as an informal notepad for community members who would like to suggest possible discussion topics for the MIREX 2007 Plenary Session. We will be monitoring the "Topics" page constantly prior to the plenary session. Also, please feel free to post your comments and ideas during and after the Plenary Session as we will use this information to help shape future iterations of MIREX.<br />
<br />
=OVERALL RESULTS POSTER=<br />
[https://www.music-ir.org/mirex/abstracts/2007/MIREX2007_overall_results.pdf MIREX 2007 Overall Results Poster (PDF)] is now available.<br />
<br />
== Results by Task ==<br />
<br />
* [[2007:Audio_Artist_Identification_Results | Audio Artist Identification Results]] (READY)<br />
* [[2007:Audio_Classical_Composer_Identification_Results | Audio Classical Composer Identification Results ]] (READY)<br />
* [[2007:Audio_Cover_Song_Identification_Results | Audio Cover Song Identification Results]] (READY)<br />
* [[2007:Audio_Genre_Classification_Results | Audio Genre Classification Results]] (READY)<br />
* [[2007:Audio_Music_Mood_Classification_Results | Audio Music Mood Classification]] (READY)<br />
* [[2007:Audio_Music_Similarity_and_Retrieval_Results | Audio Music Similarity and Retrieval Results]] (READY)<br />
* [[2007:Audio_Onset_Detection_Results | Audio Onset Detection Results]] (READY)<br />
* [[2007:Multiple_Fundamental_Frequency_Estimation_&_Tracking_Results | Multiple Fundamental Frequency Estimation & Tracking Results]] (READY)<br />
* [[2007:Query-by-Singing/Humming_Results | Query-by-Singing/Humming Results]] (READY)<br />
* [[2007:Symbolic_Melodic_Similarity_Results | Symbolic Melodic Similarity Results]] (READY)<br />
<br />
----<br />
<br />
== Machine Specifications ==<br />
<csv>2007/mirex07_machine_specs.csv</csv> <br />
<br />
[[Category:Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2007:Audio_Onset_Detection_Results&diff=67262007:Audio Onset Detection Results2010-05-14T03:51:42Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2007 running of the Audio Onset Detection task set. For background information about this task set please refer to the [[2007:Audio Onset Detection]] page.<br />
<br />
The aim of the Audio Onset Detection task is to find the time locations at which all musical events in a recording begin. The dataset consists of 85 recordings across 9 different "classes" (e.g. solo drums, polyphonic pitched, etc.). For each sound file, ground truth annotations produced by 3-5 listeners were used for the evaluation. Each algorithm was tested across 10-20 different parameterizations (e.g. thresholds) in order to produce Precision vs. Recall Operating Characteristic (P-ROC) curves. The primary evaluation metric used was the F1-Measure (the equally weighted harmonic mean of precision and recall). <br />
<br />
*Note: There were a few faulty ground truth annotations in the 2005 and 2006 runs of this task. These have been removed for this year's evaluation. Thanks to Dan Stowell for finding these.<br />
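A minimal sketch of tolerance-window onset matching and the resulting F-measure (the ±50 ms window and the greedy one-to-one matching are assumptions; the exact procedure is not restated here):<br />
<pre>
def onset_f_measure(detected, reference, tol=0.05):
    """detected, reference: sorted onset times in seconds. Each reference onset may be
    matched at most once; returns (precision, recall, F1)."""
    matched = 0
    used = set()
    for d in detected:
        for i, r in enumerate(reference):
            if i not in used and abs(d - r) <= tol:
                used.add(i)
                matched += 1
                break
    precision = matched / len(detected) if detected else 0.0
    recall = matched / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1
</pre>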
<br />
===General Legend===<br />
====Team ID====<br />
<br />
'''lacoste''' = [https://www.music-ir.org/mirex/abstracts/2007/OD_lacoste.pdf Alexandre Lacoste]<br /><br />
'''lee''' = [https://www.music-ir.org/mirex/abstracts/2007/OD_lee.pdf Wan-Chi Lee, Yu Shiu, C.-C. Jay Kuo]<br /><br />
'''roebel''' = [https://www.music-ir.org/mirex/abstracts/2007/OD_roebel.pdf A. Röbel]<br /><br />
'''stowell''' = [https://www.music-ir.org/mirex/abstracts/2007/OD_stowell.pdf Dan Stowell, Mark Plumbley]<br /><br />
'''zhou''' = [https://www.music-ir.org/mirex/abstracts/2007/OD_zhou.pdf Ruohua Zhou, Joshua D. Reiss]<br /><br />
<br />
==Overall Summary Results==<br />
===MIREX 2007 Audio Onset Detection Summary Results - Peak F-measure performance across all parameterizations===<br />
<br />
<csv>2007/onset.Total_peak.csv</csv><br />
<br />
===MIREX 2007 Audio Onset Detection Summary Plot===<br />
<br />
[[image:Total.png]]<br />
<br />
===MIREX 2007 Audio Onset Detection Runtime Data===<br />
<br />
<csv>2007/onset.runtime.csv</csv><br />
<br />
==Results by Class==<br />
*[[Audio_Onset_Detection_Results:_Complex]]<br />
*[[Audio_Onset_Detection_Results:_Poly_Pitched]]<br />
*[[Audio_Onset_Detection_Results:_Solo_Bars_and_Bells]]<br />
*[[Audio_Onset_Detection_Results:_Solo_Brass]]<br />
*[[Audio_Onset_Detection_Results:_Solo_Drum]]<br />
*[[Audio_Onset_Detection_Results:_Solo_Plucked_Strings]]<br />
*[[Audio_Onset_Detection_Results:_Solo_Singing_Voice]]<br />
*[[Audio_Onset_Detection_Results:_Solo_Sustained_Strings]]<br />
*[[Audio_Onset_Detection_Results:_Solo_Winds]]<br />
<br />
==Individual Results ==<br />
* [[Audio_Onset_Detection_Results:_Lacoste]]<br />
*[[Audio_Onset_Detection_Results:_Lee_-_Joint_-_0.2]]<br />
*[[Audio_Onset_Detection_Results:_Lee_-_Joint_-_0.3]]<br />
*[[Audio_Onset_Detection_Results:_Lee_-_Joint_-_0.4]]<br />
*[[Audio_Onset_Detection_Results:_Lee_-_LP]]<br />
*[[Audio_Onset_Detection_Results:_Roebel_1]]<br />
*[[Audio_Onset_Detection_Results:_Roebel_2]]<br />
*[[Audio_Onset_Detection_Results:_Roebel_3]]<br />
*[[Audio_Onset_Detection_Results:_Roebel_4]]<br />
*[[Audio_Onset_Detection_Results:_Stowell_-_cd]]<br />
*[[Audio_Onset_Detection_Results:_Stowell_-_mkl]]<br />
*[[Audio_Onset_Detection_Results:_Stowell_-_pd]]<br />
*[[Audio_Onset_Detection_Results:_Stowell_-_pow]]<br />
*[[Audio_Onset_Detection_Results:_Stowell_-_rcd]]<br />
*[[Audio_Onset_Detection_Results:_Stowell_-_som]]<br />
*[[Audio_Onset_Detection_Results:_Stowell_-_wpd]]<br />
*[[Audio_Onset_Detection_Results:_Zhou]]<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2007:Audio_Music_Similarity_and_Retrieval_Results&diff=67252007:Audio Music Similarity and Retrieval Results2010-05-14T03:51:33Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>[[Category: Results]]<br />
==Introduction==<br />
These are the results for the 2007 running of the Audio Music Similarity and Retrieval task set. For background information about this task set please refer to the [[2007:Audio Music Similarity and Retrieval]] page.<br />
<br />
Each system was given 7000 songs chosen from IMIRSEL's "uspop", "uscrap", "american", "classical" and "sundry" collections. Each system then returned a 7000x7000 distance matrix. 100 songs were randomly selected from the 10 genre groups (10 per genre) as queries, and the first 5 most highly ranked songs out of the 7000 were extracted for each query (after filtering out the query itself; returned results from the same artist were also omitted). Then, for each query, the returned results (candidates) from all participants were grouped and evaluated by human graders using the Evalutron 6000 grading system. Each individual query/candidate set was evaluated by a single grader. For each query/candidate pair, graders provided two scores: one categorical score with 3 categories (NS, SS, VS, as explained below) and one fine score (in the range from 0 to 10). A description and analysis is provided below.<br />
<br />
The systems read in 30 second audio clips as their raw data. The same 30 second clips were used in the grading stage.<br />
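A minimal sketch of the candidate-extraction step described above (variable names and the artist lookup are assumptions; illustrative only):<br />
<pre>
import numpy as np

def top_candidates(dist, query_idx, artists, n=5):
    """Return the indices of the n nearest songs to song `query_idx`, skipping the
    query itself and any song by the same artist.
    dist: full NxN distance matrix; artists: artist label per song."""
    picks = []
    for j in np.argsort(dist[query_idx]):
        if j == query_idx or artists[j] == artists[query_idx]:
            continue
        picks.append(int(j))
        if len(picks) == n:
            break
    return picks
</pre>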
<br />
===Summary Data on Human Evaluations (Evalutron 6000)===<br />
'''Number of evaluators''' = 20<br /> <br />
'''Number of evaluations per query/candidate pair''' = 1<br /><br />
'''Number of queries per grader''' = 5 <br /><br />
'''Average size of the candidate lists''' = 48.32<br /> <br />
'''Number of randomly selected queries''' = 100 <br /><br />
'''Number of query/candidate pairs graded''' = 4832<br />
<br />
====General Legend====<br />
=====Team ID=====<br />
'''BK1''' = [https://www.music-ir.org/mirex/abstracts/2007/AS_bosteels.pdf Klaas Bosteels, Etienne E. Kerre 1]<br /><br />
'''BK2''' = [https://www.music-ir.org/mirex/abstracts/2007/AS_bosteels.pdf Klaas Bosteels, Etienne E. Kerre 2]<br /><br />
'''CB1''' = [https://www.music-ir.org/mirex/abstracts/2007/AS_bastuck.pdf Christoph Bastuck 1]<br /><br />
'''CB2''' = [https://www.music-ir.org/mirex/abstracts/2007/AS_bastuck.pdf Christoph Bastuck 2]<br /><br />
'''CB3''' = [https://www.music-ir.org/mirex/abstracts/2007/AS_bastuck.pdf Christoph Bastuck 3]<br /><br />
'''GT''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_tzanetakis.pdf George Tzanetakis] <br /><br />
'''LB''' = [https://www.music-ir.org/mirex/abstracts/2007/AS_barrington.pdf Luke Barrington, Douglas Turnbull, David Torres, Gert Lanckriet]<br /><br />
'''ME''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_mandel.pdf Michael I. Mandel, Daniel P. W. Ellis]<br /><br />
'''PC''' = [https://www.music-ir.org/mirex/abstracts/2007/AS_paradzinets.pdf Aliaksandr Paradzinets, Liming Chen]<br /><br />
'''PS''' = [https://www.music-ir.org/mirex/abstracts/2007/AS_pohle.pdf Tim Pohle, Dominik Schnitzer]<br /><br />
'''TL1''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_lidy.pdf Thomas Lidy, Andreas Rauber, Antonio Pertusa, José Manuel Iñesta 1]<br /><br />
'''TL2''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_lidy.pdf Thomas Lidy, Andreas Rauber, Antonio Pertusa, José Manuel Iñesta 2]<br /><br />
<br />
====Broad Categories====<br />
'''NS''' = Not Similar<br /><br />
'''SS''' = Somewhat Similar<br /><br />
'''VS''' = Very Similar<br /><br />
<br />
=====Calculating Summary Measures=====<br />
'''Fine'''<sup>(1)</sup> = Sum of fine-grained human similarity decisions (0-10). <br /><br />
'''PSum'''<sup>(1)</sup> = Sum of human broad similarity decisions: NS=0, SS=1, VS=2. <br /><br />
'''WCsum'''<sup>(1)</sup> = 'World Cup' scoring: NS=0, SS=1, VS=3 (rewards Very Similar). <br /><br />
'''SDsum'''<sup>(1)</sup> = 'Stephen Downie' scoring: NS=0, SS=1, VS=4 (strongly rewards Very Similar). <br /><br />
'''Greater0'''<sup>(1)</sup> = NS=0, SS=1, VS=1 (binary relevance judgement).<br /><br />
'''Greater1'''<sup>(1)</sup> = NS=0, SS=0, VS=1 (binary relevance judgement using only Very Similar).<br /><br />
<br />
<sup>(1)</sup>Normalized to the range 0 to 1.<br />
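A minimal sketch of how these summary measures can be computed for one system is shown below; the variable names and the normalization by the maximum attainable score are assumptions, not the official evaluation code.<br />
<pre>
% Illustrative sketch (assumed variable names, not the official evaluator):
%   fine  - vector of fine scores in [0, 10], one per query/candidate pair
%   broad - cell array of 'NS' / 'SS' / 'VS' labels for the same pairs
% Normalization divides by the maximum attainable score (an assumption).
n    = numel(fine);
isSS = strcmp(broad, 'SS');
isVS = strcmp(broad, 'VS');
Fine     = sum(fine)            / (10 * n);
PSum     = sum(isSS + 2*isVS)   / (2 * n);
WCsum    = sum(isSS + 3*isVS)   / (3 * n);
SDsum    = sum(isSS + 4*isVS)   / (4 * n);
Greater0 = sum(isSS | isVS)     / n;
Greater1 = sum(isVS)            / n;
</pre>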
<br />
===Overall Summary Results===<br />
'''NB''': The results for BK2 were interpolated from partial data due to a runtime error.<br />
<br />
<csv>2007/ams07_overall_summary2.csv</csv><br />
<br />
<br />
<br />
===Friedman Test with Multiple Comparisons Results (p=0.05)===<br />
The Friedman test was run in MATLAB against the Fine summary data over the 100 queries.<br /><br />
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);<br />
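For completeness, a sketch of the full MATLAB call sequence is given below; only the multcompare call above is taken from the run, and the matrix name fineScores (100 queries by number of systems) is an assumption.<br />
<pre>
% Sketch only: obtaining 'stats' before the multcompare call quoted above.
% fineScores is assumed to be a 100-by-K matrix of per-query Fine scores
% (rows = queries, columns = systems).
[p, anovaTable, stats] = friedman(fineScores, 1, 'off');
[c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', ...
                                'estimate', 'friedman', 'alpha', 0.05);
</pre>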
<csv>2007/ams07_sum_friedman_fine.csv</csv><br />
<csv>2007/ams07_detail_friedman_fine.csv</csv><br />
<br />
[[Image:2007 ams broad scores friedmans.png]]<br />
<br />
===Summary Results by Query===<br />
These are the mean FINE scores per query assigned by Evalutron graders. The FINE scores for the 5 candidates returned per algorithm, per query, have been averaged. Values are bounded between 0.0 and 10.0. A perfect score would be 10. Genre labels have been included for reference. <br />
<br />
<csv>2007/ams07_fine_by_query_with_genre.csv</csv><br />
<br />
These are the mean BROAD scores per query assigned by Evalutron graders. The BROAD scores for the 5 candidates returned per algorithm, per query, have been averaged. Values are bounded between 0 (not similar) and 2 (very similar). A perfect score would be 2. Genre labels have been included for reference. <br />
<csv>2007/ams07_broad_by_query_with_genre.csv</csv><br />
<br />
===Anonymized Metadata===<br />
[https://www.music-ir.org/mirex/results/2007/anonymizedAudioSim07metaData.csv Anonymized Metadata]<br /><br />
<br />
===Raw Scores===<br />
The raw data derived from the Evalutron 6000 human evaluations are located on the [[2007:Audio Music Similarity and Retrieval Raw Data]] page.</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2007:Audio_Music_Mood_Classification_Results&diff=67242007:Audio Music Mood Classification Results2010-05-14T03:51:22Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2007 running of the Audio Music Mood Classification task. For background information about this task set please refer to the [[2007:Audio Music Mood Classification]] page. The data set consisted of 600 thirty-second audio clips evenly divided among 5 general mood categories. Moods were assigned to files by human evaluators; a file was accepted as representative of a given mood when at least 2 graders concurred on the assignment.<br />
<br />
===General Legend===<br />
====Team ID====<br />
'''ME''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_mandel.pdf Michael I. Mandel, Daniel P. W. Ellis]<br /><br />
'''TL''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_lidy.pdf Thomas Lidy, Andreas Rauber, Antonio Pertusa, José Manuel Iñesta]<br /><br />
'''GT''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_tzanetakis.pdf George Tzanetakis]<br /><br />
'''CL''' = [https://www.music-ir.org/mirex/abstracts/2007/MC_laurier.pdf Cyril Laurier, Perfecto Herrera]<br /><br />
'''IM''' = IMIRSEL M2K<br /><br />
'''KL''' = Kyogu Lee<br /><br />
<br />
==Overall Summary Results==<br />
===MIREX 2007 Audio Mood Classification Summary Results - Raw Classification Accuracy Averaged Over Three Train/Test Folds===<br />
<br />
<csv>2007/mood.results.csv</csv><br />
<br />
===MIREX 2007 Audio Mood Classification Evaluation Logs and Confusion Matrices===<br />
[https://www.music-ir.org/mirex/results/2007/am_imirsel_knn.eval.txt IM_knn]<br /><br />
[https://www.music-ir.org/mirex/results/2007/am_imirsel_svm.eval.txt IM_svm]<br /><br />
[https://www.music-ir.org/mirex/results/2007/am_laurier.eval.txt CL]<br /><br />
[https://www.music-ir.org/mirex/results/2007/am_lee_1.eval.txt KL_1]<br /><br />
[https://www.music-ir.org/mirex/results/2007/am_lee_2.eval.txt KL_2]<br /><br />
[https://www.music-ir.org/mirex/results/2007/am_lidy.eval.txt TL]<br /><br />
[https://www.music-ir.org/mirex/results/2007/am_mandel_1.eval.txt ME]<br /><br />
[https://www.music-ir.org/mirex/results/2007/am_mandel_2_spec.eval.txt ME_spec]<br /><br />
[https://www.music-ir.org/mirex/results/2007/am_tzan.eval.txt GT]<br /><br />
<br />
===MIREX 2007 Audio Mood Classification Run Times===<br />
<br />
<csv>2007/mood.runtime.csv</csv><br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2007:Audio_Genre_Classification_Results&diff=67232007:Audio Genre Classification Results2010-05-14T03:51:12Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2007 running of the Audio Genre Classification task. For background information about this task set please refer to the [[2007:Audio Genre Classification]] page. The data set consisted of 7000 thirty-second clips covering the following genres (700 tracks per genre):<br />
<br />
A : BAROQUE<br /><br />
B : BLUES<br /><br />
C : CLASSICAL<br /><br />
D : COUNTRY<br /><br />
E : EDANCE<br /><br />
F : JAZZ<br /><br />
G : METAL<br /><br />
H : RAPHIPHOP<br /><br />
I : ROCKROLL<br /><br />
J : ROMANTIC<br /><br />
<br />
Partial scores were earned for some confusions (e.g., ROCKROLL-METAL, JAZZ-BLUES, etc.)<br />
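A purely hypothetical sketch of such partial-credit scoring is given below; the 0.5 discount and the list of related genre pairs are placeholders for illustration, not the weights actually used in this evaluation.<br />
<pre>
function s = genreScore(truth, predicted)
% Hypothetical partial-credit scoring sketch (NOT the MIREX weights):
% full credit for an exact match, a placeholder 0.5 for a "related"
% confusion, and 0 otherwise.
related = {'ROCKROLL', 'METAL'; 'JAZZ', 'BLUES'};   % placeholder pairs only
if strcmp(truth, predicted)
    s = 1.0;                                        % exact match
elseif any(all(ismember(related, {truth, predicted}), 2))
    s = 0.5;                                        % related-genre confusion
else
    s = 0.0;                                        % unrelated genres
end
</pre>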
<br />
===General Legend===<br />
====Team ID====<br />
'''ME''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_mandel.pdf Michael I. Mandel, Daniel P. W. Ellis]<br /><br />
'''TL''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_lidy.pdf Thomas Lidy, Andreas Rauber, Antonio Pertusa, José Manuel Iñesta]<br /><br />
'''GT''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_tzanetakis.pdf George Tzanetakis]<br /><br />
'''GH''' = [https://www.music-ir.org/mirex/abstracts/2007/GC_guaus.pdf Enric Guaus, Perfecto Herrera]<br /><br />
'''IM''' = IMIRSEL M2K<br /><br />
<br />
==Overall Summary Results==<br />
===MIREX 2007 Audio Genre Classification Summary Results - Raw and Hierarchical Classification Accuracy Averaged Over Three Train/Test Folds===<br />
<br />
<csv>2007/genre.results.csv</csv><br />
<br />
===MIREX 2007 Audio Genre Classification Evaluation Logs and Confusion Matrices===<br />
[https://www.music-ir.org/mirex/results/2007/ag_guaus.eval.txt GH]<br /><br />
[https://www.music-ir.org/mirex/results/2007/ag_imirsel_knn.eval.txt IM_knn]<br /><br />
[https://www.music-ir.org/mirex/results/2007/ag_imirsel_svm.eval.txt IM_svm]<br /><br />
[https://www.music-ir.org/mirex/results/2007/ag_lidy.eval.txt TL]<br /><br />
[https://www.music-ir.org/mirex/results/2007/ag_mandel_1.eval.txt ME]<br /><br />
[https://www.music-ir.org/mirex/results/2007/ag_mandel_2_spec.eval.txt ME_spec]<br /><br />
[https://www.music-ir.org/mirex/results/2007/ag_tzan.eval.txt GT]<br /><br />
<br />
===MIREX 2007 Audio Genre Classification Run Times===<br />
<br />
<csv>2007/genre.runtime.csv</csv><br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2007:Audio_Cover_Song_Identification_Results&diff=67222007:Audio Cover Song Identification Results2010-05-14T03:51:02Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2007 running of the Audio Cover Song Identification task. For background information about this task set please refer to the [[2007:Audio Cover Song Identification]] page.<br />
<br />
Each system was given a collection of 1000 songs which included 30 different classes (sets) of cover songs, where each class/set was represented by 11 different versions of a particular song. Each of the 330 cover songs was used as a query and the systems were required to return 10 results for each query. Systems were evaluated on the number of songs from the same class/set as the query that were returned in the list of 10 results for each query. Average precision, which looks at the entire per-query rank-ordered list of all songs in the collection, is a new metric explored this year.<br />
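A minimal sketch of the average precision computation for a single query is shown below; the variable names are assumptions, not the actual evaluation script.<br />
<pre>
% Illustrative sketch: average precision for one cover-song query over the
% full rank-ordered list (assumed variable names, not the evaluator's code).
%   isCover - logical vector over the ranked collection (true where the item
%             at that rank is a cover of the query), query itself excluded
hits    = find(isCover(:)');                  % ranks at which covers appear
avgPrec = mean((1:numel(hits)) ./ hits);      % precision at each hit, averaged
</pre>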
<br />
<br />
===General Legend===<br />
====Team ID====<br />
'''EC''' = [https://www.music-ir.org/mirex/abstracts/2007/CS_ellis.pdf Daniel P. W. Ellis, Courtenay V. Cotton]<br /><br />
'''IM''' = IMIRSEL M2K<br/><br />
'''JB''' = [https://www.music-ir.org/mirex/abstracts/2007/CS_bello.pdf Juan Bello] <br/><br />
'''JEC''' = [https://www.music-ir.org/mirex/abstracts/2007/CS_jensen.pdf Jesper Højvang Jensen, Daniel P. W. Ellis, Mads G. Christensen, Søren Holdt]<br /><br />
'''KL1''' = Kyogu Lee 1<br /><br />
'''KL2''' = Kyogu Lee 2<br /><br />
'''KP''' = [https://www.music-ir.org/mirex/abstracts/2007/CS_kim.pdf Youngmoo E. Kim, Daniel Perelstein] <br /><br />
'''SG''' = [https://www.music-ir.org/mirex/abstracts/2007/CS_serra.pdf Joan Serrà, Emilia Gómez]<br /><br />
<br />
==Overall Summary Results==<br />
<csv>2007/cover.overall.stats.csv</csv> <br />
<br />
===Runtimes===<br />
Where algorithms have been multi-threaded, the longest runtime is reported.<br />
<br />
Where runtimes were not properly reported, file timestamps have been used to approximate a runtime.<br />
<br />
<csv>2007/cover.runtimes.csv</csv><br />
<br />
===Number of Correct Covers at Rank X Returned in Top Ten=== <br />
<csv>2007/cover.rank.distributions.csv</csv> <br />
<br />
===Friedman's Test for Significant Differences===<br />
The Friedman test was run in MATLAB against the Average Precision summary data over the 30 song groups.<br /> Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2007/cover.friedmans.summary.csv</csv><br />
<br />
<csv>2007/cover.friedmans.details.csv</csv><br />
<br />
[[Image:2007 cover friedmans 1.png]]<br />
<br />
===Average Performance per Query Group===<br />
These are the arithmetic means of the average precisions within each of the 30 query groups.<br />
<br />
<csv>2007/cover.avg.prec.by.groups.csv</csv><br />
<br />
==Individual Results Files==<br />
===Average Precision Scores for Each Query===<br />
'''EC''' = [https://www.music-ir.org/mirex/results/2007/cover.labrosa.eval.csv Daniel P. W. Ellis, Courtenay V. Cotton]<br /><br />
'''IM''' = [https://www.music-ir.org/mirex/results/2007/cover.imirsel.eval.csv IMIRSEL M2K]<br/><br />
'''JB''' = [https://www.music-ir.org/mirex/results/2007/cover.bello.eval.csv Juan Bello] <br/><br />
'''JEC''' = [https://www.music-ir.org/mirex/results/2007/cover.jensen.eval.csv Jesper Højvang Jensen, Daniel P. W. Ellis, Mads G. Christensen, Søren Holdt]<br /><br />
'''KL1''' = [https://www.music-ir.org/mirex/results/2007/cover.klee1.eval.csv Kyogu Lee 1]<br /><br />
'''KL2''' = [https://www.music-ir.org/mirex/results/2007/cover.klee2.eval.csv Kyogu Lee 2]<br /><br />
'''KP''' = [https://www.music-ir.org/mirex/results/2007/cover.metlab.eval.csv Youngmoo E. Kim, Daniel Perelstein] <br /><br />
'''SG''' = [https://www.music-ir.org/mirex/results/2007/cover.serra.eval.csv Joan Serrà, Emilia Gómez]<br /><br />
<br />
===Ranks of the Ten Cover Songs Returned for Each Query===<br />
'''EC''' = [https://www.music-ir.org/mirex/results/2007/cover.labrosa.eval.debug.csv Daniel P. W. Ellis, Courtenay V. Cotton]<br /><br />
'''IM''' = [https://www.music-ir.org/mirex/results/2007/cover.imirsel.eval.debug.csv IMIRSEL M2K]<br/><br />
'''JB''' = [https://www.music-ir.org/mirex/results/2007/cover.bello.eval.debug.csv Juan Bello] <br/><br />
'''JEC''' = [https://www.music-ir.org/mirex/results/2007/cover.jensen.eval.debug.csv Jesper Højvang Jensen, Daniel P. W. Ellis, Mads G. Christensen, Søren Holdt]<br /><br />
'''KL1''' = [https://www.music-ir.org/mirex/results/2007/cover.klee1.eval.debug.csv Kyogu Lee 1]<br /><br />
'''KL2''' = [https://www.music-ir.org/mirex/results/2007/cover.klee2.eval.debug.csv Kyogu Lee 2]<br /><br />
'''KP''' = [https://www.music-ir.org/mirex/results/2007/cover.metlab.eval.debug.csv Youngmoo E. Kim, Daniel Perelstein] <br /><br />
'''SG''' = [https://www.music-ir.org/mirex/results/2007/cover.serra.eval.debug.csv Joan Serrà, Emilia Gómez]<br /><br />
<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2007:Audio_Classical_Composer_Identification_Results&diff=67212007:Audio Classical Composer Identification Results2010-05-14T03:50:52Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the inaugural 2007 running of the Audio Classical Composer Identification task. For background information about this task set please refer to the [[2007:Audio Classical Composer Identification]] page. <br />
<br />
The data set consisted of 2772 30 second audio clips. The composers represented were:<br />
<br />
#Bach<br />
#Beethoven<br />
#Brahms<br />
#Chopin<br />
#Dvorak<br />
#Handel<br />
#Haydn<br />
#Mendelssohn<br />
#Mozart<br />
#Schubert<br />
#Vivaldi<br />
<br />
The goal was to correctly identify the composer who wrote each of the pieces represented.<br />
<br />
===General Legend===<br />
====Team ID====<br />
'''ME''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_mandel.pdf Michael I. Mandel, Daniel P. W. Ellis]<br /><br />
'''TL''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_lidy.pdf Thomas Lidy, Andreas Rauber, Antonio Pertusa, José Manuel Iñesta]<br /><br />
'''GT''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_tzanetakis.pdf George Tzanetakis]<br /><br />
'''KL''' = Kyogu Lee<br /><br />
'''IM''' = IMIRSEL M2K<br /><br />
<br />
==Overall Summary Results==<br />
===MIREX 2007 Audio Classical Composer Classification Summary Results - Raw Classification Accuracy Averaged Over Three Train/Test Folds===<br />
<br />
<csv>2007/composer.results.csv</csv><br />
<br />
===MIREX 2007 Audio Classical Composer Classification Evaluation Logs and Confusion Matrices===<br />
[https://www.music-ir.org/mirex/results/2007/ac_imirsel_knn.eval.txt IM_knn]<br /><br />
[https://www.music-ir.org/mirex/results/2007/ac_imirsel_svm.eval.txt IM_svm]<br /><br />
[https://www.music-ir.org/mirex/results/2007/ac_lee.eval.txt KL]<br /><br />
[https://www.music-ir.org/mirex/results/2007/ac_lidy.eval.txt TL]<br /><br />
[https://www.music-ir.org/mirex/results/2007/ac_mandel_1.eval.txt ME]<br /><br />
[https://www.music-ir.org/mirex/results/2007/ac_mandel_2_spec.eval.txt ME_spec]<br /><br />
[https://www.music-ir.org/mirex/results/2007/ac_tzan.eval.txt GT]<br /><br />
<br />
===MIREX 2007 Audio Classical Composer Classification Run Times===<br />
<br />
<csv>2007/composer.runtime.csv</csv><br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2007:Audio_Artist_Identification_Results&diff=67202007:Audio Artist Identification Results2010-05-14T03:50:42Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2007 running of the Audio Artist Identification task. For background information about this task set please refer to the [[2007:Audio Artist Identification]] page. The data set was 3060 30 second clips representing 102 different recording artists.<br />
<br />
===General Legend===<br />
====Team ID====<br />
'''ME''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_mandel.pdf Michael I. Mandel, Daniel P. W. Ellis]<br /><br />
'''TL''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_lidy.pdf Thomas Lidy, Andreas Rauber, Antonio Pertusa, José Manuel Iñesta]<br /><br />
'''GT''' = [https://www.music-ir.org/mirex/abstracts/2007/AI_CC_GC_MC_AS_tzanetakis.pdf George Tzanetakis]<br /><br />
'''KL''' = Kyogu Lee<br /><br />
'''IM''' = IMIRSEL M2K<br /><br />
<br />
==Overall Summary Results==<br />
===MIREX 2007 Audio Artist Classification Summary Results - Raw Classification Accuracy Averaged Over Three Train/Test Folds===<br />
<br />
<csv>2007/artist.results.csv</csv><br />
<br />
<br />
===MIREX 2007 Audio Artist Classification Evaluation Logs and Confusion Matrices===<br />
[https://www.music-ir.org/mirex/results/2007/aa_imirsel_knn.eval.txt IM_knn]<br /><br />
[https://www.music-ir.org/mirex/results/2007/aa_imirsel_svm.eval.txt IM_svm]<br /><br />
[https://www.music-ir.org/mirex/results/2007/aa_lee.eval.txt KL]<br /><br />
[https://www.music-ir.org/mirex/results/2007/aa_lidy.eval.txt TL]<br /><br />
[https://www.music-ir.org/mirex/results/2007/aa_mandel_1.eval.txt ME]<br /><br />
[https://www.music-ir.org/mirex/results/2007/aa_mandel_2_spec.eval.txt ME_spec]<br /><br />
[https://www.music-ir.org/mirex/results/2007/aa_tzan.eval.txt GT]<br /><br />
<br />
===MIREX 2007 Audio Artist Classification Run Times===<br />
<br />
<csv>2007/artist.runtime.csv</csv><br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Real-time_Audio_to_Score_Alignment_(a.k.a._Score_Following)_Results&diff=67192008:Real-time Audio to Score Alignment (a.k.a. Score Following) Results2010-05-14T03:49:09Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Real-time Audio to Score Alignment (a.k.a. Score Following) task. For background information about this task set please refer to the [[2008:Real-time Audio to Score Alignment (a.k.a Score Following)]] page. <br />
<br />
===General Legend===<br />
====Team ID====<br />
<br />
<br />
'''MO1''' = [https://www.music-ir.org/mirex/abstracts/2008/XXX.pdf N. Montecchio & Orio 1]<br /><br />
'''MO2''' = [https://www.music-ir.org/mirex/abstracts/2008/XXX.pdf N. Montecchio & Orio 2]<br /><br />
'''RM1''' = [https://www.music-ir.org/mirex/abstracts/2008/Scofo.pdf R. Macrae]<br /><br />
'''RM2''' = [https://www.music-ir.org/mirex/abstracts/2008/Scofo.pdf R. Macrae]<br /><br />
<br />
[[Category: Results]]<br />
<br />
===Summary Results===<br />
<csv>2008/scofo/scofo_summary_results.csv</csv><br />
<br />
===Individual Results===<br />
'''MO''' = [https://www.music-ir.org/mirex/2008/results/scofo/MOResults.zip N. Montecchio & Orio]<br /><br />
'''RM''' = [https://www.music-ir.org/mirex/2008/results/scofo/RMResults.zip R. Macrae ]<br /><br />
<br />
===Summary Results w.r.t. R. Macrae's Evaluation Script===<br />
<csv>2008/scofo/scofo_summary_results_withRobsEvalScript.csv</csv><br />
<br />
===Individual Results w.r.t. R. Macrae's Evaluation Script===<br />
'''MO''' = [https://www.music-ir.org/mirex/2008/results/scofo/MOresults_withRobsEvalScript.zip N. Montecchio & Orio]<br /><br />
'''RM''' = [https://www.music-ir.org/mirex/2008/results/scofo/RMresults_withRobsEvalScript.zip R. Macrae ]<br /><br />
<br />
<br />
The systems are evaluated against the ground truth that was prepared by parsing the score files with each system's own MIDI parser (MO GT, RM GT).<br />
<br />
<br />
=== Issues with ground-truth ===</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Query-by-Tapping_Results&diff=67182008:Query-by-Tapping Results2010-05-14T03:49:00Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Query-by-Tapping task. For background information about this task set please refer to the [[2008:Query by Tapping]] page. 481 queries were used in this evaluation.<br />
<br />
===General Legend===<br />
====Team ID====<br />
<br />
'''SH1''' = [https://www.music-ir.org/mirex/abstracts/2008/QT_show.pdf S-J. Hsiao (show)]<br /><br />
'''SH2''' = [https://www.music-ir.org/mirex/abstracts/2008/QT_show.pdf S-J. Hsiao (onset_show)]<br /><br />
'''HL1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex_QBT_1.pdf H-R. Lee 1]<br /><br />
'''HL2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex_QBT_2.pdf H-R. Lee 2]<br /><br />
'''RT''' = [https://www.music-ir.org/mirex/abstracts/2008/QT_typke.pdf R. Typke]<br /><br />
<br />
==Summary Results==<br />
<br />
===Overall MRR Results===<br />
<br />
<csv>2008/qbt/QBTSummary.csv</csv><br />
<br />
===Friedman's Test for Significant Differences===<br />
The Friedman test was run in MATLAB against the MRR summary data over the 103 query groups.<br /> Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/qbt/qbt.friedman.detailed.csv</csv><br />
<br />
<br />
[[Image: qbt.friedman.small.png]]<br />
<br />
===MRR Results by Query Group===<br />
<br />
<csv>2008/qbt/qbt_res_byQueryGroup.csv</csv><br />
<br />
===MRR Results by Query===<br />
<br />
<csv>2008/qbt/qbt.res.csv</csv><br />
<br />
===Runtime Results===<br />
<br />
<csv>2008/qbt.runtime.csv</csv><br />
<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Query-by-Singing/Humming_Results&diff=67172008:Query-by-Singing/Humming Results2010-05-14T03:48:49Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Query-by-Singing/Humming task. For background information about this task set please refer to the [[2008:Query by Singing/Humming]] page. <br />
<br />
===Task Descriptions===<br />
<br />
'''Task 1 [[#Task 1 Results|Go to Task 1 Results]]''': The first subtask is the same as last year. In this subtask, submitted systems take a sung query as input and return a list of songs from the test database. Mean reciprocal rank (MRR) of the ground truth is calculated over the top 20 returns. The test database consists of 48 ground-truth MIDIs + 2000 Essen Collection MIDI noise files. See [http://www.esac-data.org/ ESAC Data Homepage] for more information about the Essen Collection. The query database consists of 2797 sung queries. <br />
<br />
'''Task 2 [[#Task 2 Results|Go to Task 2 Results]]''': The second subtask is also the same as last year. In this subtask, the same setup as the first subtask is used with combinations of different transcribers and matchers. The test database consists of 106 ground-truth MIDIs + 2000 Essen Collection MIDI noise files. The query database consists of 355 sung queries.<br />
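A minimal sketch of the MRR computation over the top 20 returns is shown below; the variable name ranks is an assumption for illustration, not the evaluator's code.<br />
<pre>
% Illustrative sketch of mean reciprocal rank (MRR) over the top 20 returns.
%   ranks - for each query, the rank of the ground-truth song in the
%           returned list (use Inf if it was not returned at all)
rr  = 1 ./ ranks;
rr(ranks > 20) = 0;          % ground truth outside the top 20 scores 0
MRR = mean(rr);
</pre>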
<br />
===General Legend===<br />
====Team ID====<br />
<br />
'''JL1''' = [https://www.music-ir.org/mirex/abstracts/2008/MIREX2008_QBSH_Davidson_abstract.pdf J-S. R. Jang 1]<br /><br />
'''JL2''' = [https://www.music-ir.org/mirex/abstracts/2008/MIREX2008_QBSH_Davidson_abstract.pdf J-S. R. Jang 2]<br /><br />
'''JL3''' = [https://www.music-ir.org/mirex/abstracts/2008/MIREX2008_QBSH_Davidson_abstract.pdf J-S. R. Jang 3]<br /><br />
'''JL4''' = [https://www.music-ir.org/mirex/abstracts/2008/MIREX2008_QBSH_Davidson_abstract.pdf J-S. R. Jang 4]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/abstracts/2008/QBSH_ryynanen.pdf M. Ryynänen, A. Klapuri]<br /><br />
'''LW1''' = [https://www.music-ir.org/mirex/abstracts/2008/QBSH_leiwang.pdf L. Wang 1]<br /><br />
'''LW2''' = [https://www.music-ir.org/mirex/abstracts/2008/QBSH_leiwang.pdf L. Wang 2]<br /><br />
'''LW3''' = [https://www.music-ir.org/mirex/abstracts/2008/QBSH_leiwang.pdf L. Wang 3]<br /><br />
'''WL1''' = [https://www.music-ir.org/mirex/abstracts/2008/QBSH_wu.pdf X. Wu, M. Li 1]<br /><br />
'''WL2''' = [https://www.music-ir.org/mirex/abstracts/2008/QBSH_wu.pdf X. Wu, M. Li 2]<br /><br />
<br />
===Task 1 Results===<br />
<br />
=====Task 1 Overall Results=====<br />
<csv>2008/qbsh/qbsh_task1_summary.csv</csv><br />
<br />
====Task 1 Friedman's Test for Significant Differences====<br />
The Friedman test was run in MATLAB against the QBSH Task 1 MRR data over the 48 ground truth song groups.<br />
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);<br />
<csv>2008/qbsh/qbsh.task1.friedman_detailed.csv</csv><br />
<br />
[[Image:2008_qbsh.task1.friedman.small.png]]<br />
<br />
====Task 1 Summary Results by Query Group====<br />
<csv>2008/qbsh/qbsh.task1.res.byQueryGroup.csv</csv><br />
<br />
===Task 2 Results===<br />
In this subtask, the same setup as the first subtask is used with combinations of different transcribers and matchers. The test database consists of 106 ground-truth MIDIs + 2000 Essen Collection MIDI noise files. The query database consists of 355 sung queries.<br />
<br />
=====Task 2 Overall Results=====<br />
<csv>2008/qbsh/qbsh.task2.summary.csv</csv><br />
<br />
====Task 2 Friedman's Test for Significant Differences====<br />
The Friedman test was run in MATLAB against the QBSH Task 2 MRR data over the 106 ground truth song groups.<br />
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);<br />
<csv>2008/qbsh/qbsh.task2.friedman_detailed.csv</csv><br />
<br />
[[Image:2008_qbsh.task2.friedman.s.png]]<br />
<br />
====Task 2 Summary Results by Query Group====<br />
<csv>2008/qbsh/qbsh_task2_res_byQueryGroup.csv</csv><br />
<br />
===Runtime Results===<br />
<br />
<csv>2008/qbsh.runtime.csv</csv><br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Multiple_Fundamental_Frequency_Estimation_%26_Tracking_Results&diff=67162008:Multiple Fundamental Frequency Estimation & Tracking Results2010-05-14T03:48:39Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Multiple Fundamental Frequency Estimation and Tracking task. For background information about this task set please refer to the [[2008:Multiple Fundamental Frequency Estimation & Tracking]] page.<br />
<br />
===General Legend===<br />
====Team ID====<br />
<br />
'''CL1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_multiF0_Cao.pdf C. Cao, M. Li 1]<br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_multiF0_Cao.pdf C. Cao, M. Li 2]<br /><br />
'''DRD''' = [https://www.music-ir.org/mirex/abstracts/2008/durrieu_multi.pdf J-L. Durrieu, G. Richard, B. David]<br /><br />
'''EOS''' = [https://www.music-ir.org/mirex/abstracts/2008/Egashira2008MIREX09_ver1.pdf K. Egashira, N. Ono, S. Sagayama]<br /><br />
'''EBD1''' = [https://www.music-ir.org/mirex/abstracts/2008/080914_MIREX08_emiya.pdf V. Emiya, R. Badeau, B. David 1]<br /><br />
'''EBD2''' = [https://www.music-ir.org/mirex/abstracts/2008/080914_MIREX08_emiya.pdf V. Emiya, R. Badeau, B. David 2]<br /><br />
'''MG''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_groble.pdf M. Groble]<br /><br />
'''PI1''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_pertusa.pdf A. Pertusa, J. M. Iñesta 1]<br /><br />
'''PI2''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_pertusa.pdf A. Pertusa, J. M. Iñesta 2]<br /><br />
'''RFF1''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_reis.pdf G. Reis, F. Fernandez, A. Ferreira 1]<br /><br />
'''RFF2''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_reis.pdf G. Reis, F. Fernandez, A. Ferreira 2]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_ryynanen.pdf M. Ryynänen, A. Klapuri]<br /><br />
'''VBB''' = [https://www.music-ir.org/mirex/abstracts/2008/articleMIREX07.pdf E. Vincent, N. Bertin, R. Badeau]<br /><br />
'''YRC1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_yeh.pdf C. Yeh, A. Roebel, W-C. Chang 1]<br /><br />
'''YRC2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_yeh.pdf C. Yeh, A. Roebel, W-C. Chang 2]<br /><br />
'''ZR1''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_zhou.pdf R. Zhou, J. D. Reiss 1]<br /><br />
'''ZR2''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_zhou.pdf R. Zhou, J. D. Reiss 2]<br /><br />
'''ZR3''' = [https://www.music-ir.org/mirex/abstracts/2008/F0_zhou.pdf R. Zhou, J. D. Reiss 3]<br /><br />
<br />
===Overall Summary Results Task 1===<br />
Below are the average scores across 36 test files. These files consisted of 9 groups, each group having 4 files ranging from 2-voice to 5-voice polyphony; 28 were real recordings and 8 were synthesized from RWC samples.<br />
<br />
<csv>2008/multif0/task1_summary.csv</csv> <br />
<br />
====Detailed Results====<br />
<br />
<csv>2008/multif0/task1_res.csv</csv><br />
<br />
====Detailed Chroma Results====<br />
Here, accuracy is assessed on chroma results (i.e. all F0's are mapped to a single octave before evaluating)<br />
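A sketch of this octave folding is shown below; the 440 Hz / MIDI 69 reference and the exact rounding used by the evaluator are assumptions for illustration.<br />
<pre>
% Illustrative sketch of mapping F0 estimates to chroma (single octave);
% the 440 Hz / MIDI 69 reference is an assumption about the evaluator.
midiPitch = 69 + 12 * log2(f0 / 440);   % Hz -> fractional MIDI pitch
chroma    = mod(midiPitch, 12);         % fold all octaves onto one
</pre>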
<br />
<csv>2008/multif0/task1_res_chroma.csv</csv><br />
<br />
====Individual Results Files for Task 1: Scores per Query====<br />
'''CL1''' = [https://www.music-ir.org/mirex/2008/results/multif0/CL1task1.tar.gz C. Cao, M. Li 1]<br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/2008/results/multif0/CL2task1.tar.gz C. Cao, M. Li 2]<br/><br />
'''DRD''' = [https://www.music-ir.org/mirex/2008/results/multif0/DRDtask1.tar.gz J-L. Durrieu, G. Richard, B. David] <br/><br />
'''EBD1''' = [https://www.music-ir.org/mirex/2008/results/multif0/EBD1task1.tar.gz V. Emiya, R. Badeau, B. David 1]<br /><br />
'''EBD2''' = [https://www.music-ir.org/mirex/2008/results/multif0/EBD2task1.tar.gz V. Emiya, R. Badeau, B. David 2]<br /><br />
'''EOS''' = [https://www.music-ir.org/mirex/2008/results/multif0/EOStask1.tar.gz K. Egashira, N. Ono, S. Sagayama]<br /><br />
'''MG''' = [https://www.music-ir.org/mirex/2008/results/multif0/MGtask1.tar.gz M. Groble]<br /><br />
'''PI1''' = [https://www.music-ir.org/mirex/2008/results/multif0/PI1task1.tar.gz A. Pertusa, J. M. Iñesta 1]<br /><br />
'''PI2''' = [https://www.music-ir.org/mirex/2008/results/multif0/PI2task1.tar.gz A. Pertusa, J. M. Iñesta 2]<br /><br />
'''RFF1''' = [https://www.music-ir.org/mirex/2008/results/multif0/RFF1task1.tar.gz G. Reis, F. Fernandez, A. Ferreira 1]<br /><br />
'''RFF2''' = [https://www.music-ir.org/mirex/2008/results/multif0/RFF2task1.tar.gz G. Reis, F. Fernandez, A. Ferreira 2]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/2008/results/multif0/RKtask1.tar.gz M. Ryynänen, A. Klapuri]<br /><br />
'''VBB''' = [https://www.music-ir.org/mirex/2008/results/multif0/VBBtask1.tar.gz E. Vincent, N. Bertin, R. Badeau]<br /><br />
'''YRC1''' = [https://www.music-ir.org/mirex/2008/results/multif0/YRC1task1.tar.gz C. Yeh, A. Roebel, W-C. Chang 1]<br /><br />
'''YRC2''' = [https://www.music-ir.org/mirex/2008/results/multif0/YRC2task1.tar.gz C. Yeh, A. Roebel, W-C. Chang 2]<br /><br />
<br />
=====Info about the filenames=====<br />
The filenames starting with part* come from acoustic woodwind recordings; the ones starting with RWC are synthesized. The legend for the instrument abbreviations is:<br />
<br />
'''bs''' = bassoon,<br />
'''cl''' = clarinet,<br />
'''fl''' = flute,<br />
'''hn''' = horn,<br />
'''ob''' = oboe,<br />
'''vl''' = violin,<br />
'''cel''' = cello,<br />
'''gtr''' = guitar,<br />
'''sax''' = saxophone,<br />
'''bass''' = electric bass guitar<br />
<br />
====Run Times====<br />
<csv>2008/multif0/task1_runtimes.csv</csv><br />
<br />
MG ran on MAC, all other systems ran on ALE Nodes.<br />
<br />
===Overall Summary Results Task 2===<br />
This subtask is evaluated in two different ways. In the first setup, a returned note is counted as correct if its onset is within ±50 ms of a reference note and its F0 is within ± a quarter tone of that reference note's F0, ignoring the returned offset values. In the second setup, in addition to the above requirements, a correct returned note must have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger.<br />
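A sketch of these matching rules for a single reference/returned note pair is given below; the struct and field names are assumptions for illustration, not the evaluator's code.<br />
<pre>
% Illustrative sketch of the two note-matching setups described above
% (assumed field names; not the evaluator's code).
%   ref, est - structs with fields onset, offset (seconds) and f0 (Hz)
onsetOK  = abs(est.onset - ref.onset) <= 0.050;            % within +-50 ms
pitchOK  = abs(12 * log2(est.f0 / ref.f0)) <= 0.5;         % within a quarter tone
offTol   = max(0.2 * (ref.offset - ref.onset), 0.050);     % 20% of duration or 50 ms
offsetOK = abs(est.offset - ref.offset) <= offTol;
matchOnsetOnly = onsetOK && pitchOK;                       % first setup
matchFull      = onsetOK && pitchOK && offsetOK;           % second setup
</pre>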
<br />
A total of 30 files were used in this task: 16 real recordings, 8 synthesized from RWC samples, and 6 piano. The results below are the average of these 30 files.<br />
<br />
<csv>2008/multif0/task2_summary.csv</csv><br />
<br />
====Detailed Results====<br />
<br />
<csv>2008/multif0/task2_res.csv</csv><br />
<br />
====Detailed Chroma Results====<br />
Here, accuracy is assessed on chroma results (i.e. all F0's are mapped to a single octave before evaluating)<br />
<br />
<csv>2008/multif0/task2_res_chroma.csv</csv><br />
<br />
====Results Based on Onset Only====<br />
<br />
<csv>2008/multif0/task2_res_onsetonly.csv</csv><br />
<br />
====Chroma Results Based on Onset Only====<br />
<br />
<csv>2008/multif0/task2_res_onsetonly_chroma.csv</csv><br />
<br />
====Piano Subset Results Based on Onset Only====<br />
<br />
<csv>2008/multif0/task2_res_onsetonly_piano.csv</csv><br />
<br />
====Individual Results Files for Task 2====<br />
'''EBD1''' = [https://www.music-ir.org/mirex/2008/results/multif0/EBD1task2.tar.gz V. Emiya, R. Badeau, B. David 1]<br /><br />
'''EBD2''' = [https://www.music-ir.org/mirex/2008/results/multif0/EBD2task2.tar.gz V. Emiya, R. Badeau, B. David 2]<br /><br />
'''EOS''' = [https://www.music-ir.org/mirex/2008/results/multif0/EOStask2.tar.gz K. Egashira, N. Ono, S. Sagayama]<br /><br />
'''PI1''' = [https://www.music-ir.org/mirex/2008/results/multif0/PI1task2.tar.gz A. Pertusa, J. M. Iñesta 1]<br /><br />
'''PI2''' = [https://www.music-ir.org/mirex/2008/results/multif0/PI2task2.tar.gz A. Pertusa, J. M. Iñesta 2]<br /><br />
'''RFF1''' = [https://www.music-ir.org/mirex/2008/results/multif0/RFF1task2.tar.gz G. Reis, F. Fernandez, A. Ferreira 1]<br /><br />
'''RFF2''' = [https://www.music-ir.org/mirex/2008/results/multif0/RFF2task2.tar.gz G. Reis, F. Fernandez, A. Ferreira 2]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/2008/results/multif0/RKtask2.tar.gz M. Ryynänen, A. Klapuri]<br /><br />
'''VBB''' = [https://www.music-ir.org/mirex/2008/results/multif0/VBBtask2.tar.gz E. Vincent, N. Bertin, R. Badeau]<br /><br />
'''YRC1''' = [https://www.music-ir.org/mirex/2008/results/multif0/YRC1task2.tar.gz C. Yeh, A. Roebel, W-C. Chang 1]<br /><br />
'''ZR1''' = [https://www.music-ir.org/mirex/2008/results/multif0/ZR1task2.tar.gz R. Zhou, J. D. Reiss 1]<br /><br />
'''ZR2''' = [https://www.music-ir.org/mirex/2008/results/multif0/ZR2task2.tar.gz R. Zhou, J. D. Reiss 2]<br /><br />
'''ZR3''' = [https://www.music-ir.org/mirex/2008/results/multif0/ZR3task2.tar.gz R. Zhou, J. D. Reiss 3]<br /><br />
<br />
======Info About Filenames======<br />
The filenames starting with part* come from acoustic woodwind recordings; the ones starting with RWC are synthesized. The piano files are: RA_C030_align.wav, bach_847TESTp.wav, beet_pathetique_3TESTp.wav, mz_333_1TESTp.wav, scn_4TESTp.wav.note, ty_januarTESTp.wav.note<br />
<br />
====Run Times====<br />
<csv>2008/multif0/task2_runtimes.csv</csv><br />
<br />
ZR1,ZR2,ZR3 ran on BLACK. All other systems ran on ALE Nodes.<br />
<br />
===Friedman's Test for Significant Differences===<br />
<br />
====Task 1====<br />
The Friedman test was run in MATLAB to test significant differences amongst systems with regard to the performance (accuracy) on individual files.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/multif0/task1.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
<br />
<csv>2008/multif0/task1.friedman.detailed.csv</csv><br />
<br />
[[Image:2008_multif0.task1.friedman.png]]<br />
<br />
====Task 2====<br />
The Friedman test was run in MATLAB to test significant differences amongst systems with regard to the performance (accuracy) on individual files.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/multif0/task2.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
<br />
<csv>2008/multif0/task2.friedman.detailed.csv</csv><br />
<br />
[[Image:2008_multif0.task2.friedman.png]]<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Tag_Classification_Results&diff=67152008:Audio Tag Classification Results2010-05-14T03:48:30Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Tag Classification task. For background information about this task set please refer to the [[2008:Audio Tag Classification]] page. <br />
<br />
===General Legend===<br />
====Team ID====<br />
'''LB''' = [https://www.music-ir.org/mirex/abstracts/2008/AT_barrington.pdf L. Barrington, D. Turnbull, G. Lanckriet]<br /><br />
'''BBE 1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex_knn.pdf T. Bertin-Mahieux, Y. Bengio, D. Eck (KNN)]<br /><br />
'''BBE 2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex_nnet.pdf T. Bertin-Mahieux, Y. Bengio, D. Eck (NNet)]<br /><br />
'''BBE 3''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex_boosters.pdf T. Bertin-Mahieux, D. Eck, P. Lamere, Y. Bengio (Thierry/Lamere Boosting)]<br /><br />
'''TB''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex_smurfs.pdf T. Bertin-Mahieux (dumb/smurf)]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 3]<br /><br />
'''GP1''' = [https://www.music-ir.org/mirex/abstracts/2008/Peeters_2008_ISMIR_MIREX.pdf G. Peeters 1]<br /><br />
'''GP2''' = [https://www.music-ir.org/mirex/abstracts/2008/Peeters_2008_ISMIR_MIREX.pdf G. Peeters 2]<br /><br />
'''TTKV''' = [https://www.music-ir.org/mirex/abstracts/2008/auth.pdf K. Trohidis, G. Tsoumakas, G. Kalliris, I. Vlahavas]<br /><br />
<br />
==Overall Summary Results==<br />
<br />
<csv>2008/tag/tag.grand.summary.show.csv</csv><br />
<br />
<br />
===Summary Positive Example Accuracy (Average Across All Folds)===<br />
<br />
<csv>2008/tag/rounded/tag.binary_avg_positive_example_Accuracy.csv</csv><br />
<br />
===Summary Negative Example Accuracy (Average Across All Folds)===<br />
<br />
<csv>2008/tag/rounded/tag.binary_avg_negative_example_Accuracy.csv</csv><br />
<br />
===Summary Binary relevance F-Measure (Average Across All Folds)===<br />
<br />
<csv>2008/tag/rounded/tag.binary_avg_Fmeasure.csv</csv><br />
<br />
===Summary Binary Accuracy (Average Across All Folds)===<br />
<br />
<csv>2008/tag/rounded/tag.binary_avg_Accuracy.csv</csv><br />
<br />
===Summary AUC-ROC Tag (Average Across All Folds)===<br />
<br />
<csv>2008/tag/rounded/tag.affinity_tag_AUC_ROC.csv</csv><br />
<br />
==Friedman test results==<br />
<br />
===AUC-ROC Tag Friedman test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the Area Under the ROC curve (AUC-ROC) for each '''tag''' in the test, averaged over all folds.<br />
<br />
<csv>2008/tag/friedmansTables/tag.affinity.AUC_ROC_TAG.friedman.tukeyKramerHSD.csv</csv><br />
<br />
[[Image:2008_affinity.auc_roc_tag.friedman.tukeykramerhsd.png]]<br />
<br />
===AUC-ROC Track Friedman test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the Area Under the ROC curve (AUC-ROC) for each '''track''' in the test. Each track appears exactly once over all three folds of the test. However, we are uncertain whether these measurements are truly independent, as multiple tracks from each artist are used.<br />
<br />
<csv>2008/tag/friedmansTables/tag.affinity.AUC_ROC_TRACK.friedman.tukeyKramerHSD.csv</csv><br />
<br />
[[Image:2008_affinity.auc_roc_track.friedman.tukeykramerhsd.png]]<br />
<br />
===Tag Classification Accuracy Friedman test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the classification accuracy for each '''tag''' in the test, averaged over all folds.<br />
<br />
<csv>2008/tag/friedmansTables/tag.binary_Accuracy.friedman.tukeyKramerHSD.csv</csv><br />
<br />
[[Image:2008_binary_accuracy.friedman.tukeykramerhsd.png]]<br />
<br />
===Tag F-measure Friedman test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the F-measure for each '''tag''' in the test, averaged over all folds.<br />
<br />
<csv>2008/tag/friedmansTables/tag.binary_FMeasure.friedman.tukeyKramerHSD.csv</csv><br />
<br />
[[Image:2008_binary_fmeasure.friedman.tukeykramerhsd.png]]<br />
<br />
<br />
==Beta-Binomial test results==<br />
<br />
===Accuracy on positive examples Beta-Binomial results===<br />
The following table and plot show the results of simulations from the Beta-Binomial model using the accuracy of each algorithm's classification only on the positive examples. This only reflects the relative proportion of true positives and false negatives, and should be considered together with the classification accuracy on the negative examples. The image shows the estimate of the overall performance with 95% confidence intervals.<br />
<br />
<br />
<csv>2008/tag/tag.binary.per.fold.positive.example.accuracy.betabinomial.csv</csv><br />
<br />
<br />
[[Image:binary_per_fold_positive_example_Accuracy.png]]<br />
<br />
<br />
The plots for each tag are more interesting and the 95% confidence intervals are much tighter. Since there are so many of them, it is difficult to post them to the wiki. You can download a tar.gz file containing all of them [https://www.music-ir.org/mirex/2008/results/tag/binary_positive_example_Accuracy.betaBinomial.images.tar.gz here].<br />
<br />
===Accuracy on negative examples Beta-Binomial results===<br />
The following table and plot show the results of simulations from the Beta-Binomial model using the accuracy of each algorithm's classification only on the negative examples. This only reflects the relative proportion of true negatives and false positives, and should be considered together with the classification accuracy on the positive examples. The image shows the estimate of the overall performance with 95% confidence intervals.<br />
<br />
<br />
<csv>2008/tag/tag.binary.per.fold.negative.example.accuracy.betabinomial.csv</csv><br />
<br />
<br />
[[Image:binary_per_fold_negative_example_Accuracy.png]]<br />
<br />
<br />
The plots for each tag are more interesting and the 95% confidence intervals are much tighter. Since there are so many of them, it is difficult to post them to the wiki. You can download a tar.gz file containing all of them [https://www.music-ir.org/mirex/2008/results/tag/binary_negative_example_Accuracy.betaBinomial.images.tar.gz here].<br />
<br />
==Assorted Results Files for Download==<br />
===AUC-ROC Clip Data===<br />
(Too large for easy Wiki viewing)<br /><br />
[https://www.music-ir.org/mirex/2008/results/tag/rounded/tag.affinity_clip_AUC_ROC.csv tag.affinity_clip_AUC_ROC.csv]<br /><br />
<br />
===CSV Files Without Rounding (Averaged across folds)===<br />
[https://www.music-ir.org/mirex/2008/results/tag/csv_raw/tag.affinity.tag.auc.roc.csv tag.affinity.tag.auc.roc.csv]<br /><br />
[https://www.music-ir.org/mirex/2008/results/tag/csv_raw/tag.affinity.clip.auc.roc.csv tag.affinity.clip.auc.roc.csv]<br /><br />
[https://www.music-ir.org/mirex/2008/results/tag/csv_raw/tag.binary.avg.accuracy.csv tag.binary.accuracy.csv]<br /><br />
[https://www.music-ir.org/mirex/2008/results/tag/csv_raw/tag.binary.avg.fmeasure.csv tag.binary.fmeasure.csv]<br /><br />
[https://www.music-ir.org/mirex/2008/results/tag/csv_raw/tag.binary.avg.negative.example.accuracy.csv tag.binary.negative.example.accuracy.csv]<br /><br />
[https://www.music-ir.org/mirex/2008/results/tag/csv_raw/tag.binary.avg.positive.example.accuracy.csv tag.binary.positive.example.accuracy.csv]<br /><br />
<br />
===CSV Files Without Rounding (Fold information)===<br />
[https://www.music-ir.org/mirex/2008/results/tag/csv_raw/tag.binary.per.fold.positive.example.accuracy.csv tag.binary.per.fold.positive.example.accuracy.csv]<br /><br />
[https://www.music-ir.org/mirex/2008/results/tag/csv_raw/tag.binary.per.fold.negative.example.accuracy.csv tag.binary.per.fold.negative.example.accuracy.csv]<br /><br />
[https://www.music-ir.org/mirex/2008/results/tag/csv_raw/tag.binary.per.fold.fmeasure.csv tag.binary.per.fold.fmeasure.csv]<br /><br />
[https://www.music-ir.org/mirex/2008/results/tag/csv_raw/tag.binary.per.fold.accuracy.csv tag.binary.per.fold.accuracy.csv]<br /><br />
[https://www.music-ir.org/mirex/2008/results/tag/csv_raw/tag.affinity.tag.per.fold.auc.roc.csv tag.affinity.tag.per.fold.auc.roc.csv]<br /><br />
<br />
===Results By Algorithm===<br />
(.tar.gz) <br /><br />
'''LB''' = [https://www.music-ir.org/mirex/2008/results/tag/detailedReports/LB.tar.gz L. Barrington, D. Turnbull, G. Lanckriet]<br /><br />
'''BBE 1''' = [https://www.music-ir.org/mirex/2008/results/tag/detailedReports/BBE1.tar.gz T. Bertin-Mahieux, Y. Bengio, D. Eck (KNN)]<br /><br />
'''BBE 2''' = [https://www.music-ir.org/mirex/2008/results/tag/detailedReports/BBE2.tar.gz T. Bertin-Mahieux, Y. Bengio, D. Eck (NNet)]<br /><br />
'''BBE 3''' = [https://www.music-ir.org/mirex/2008/results/tag/detailedReports/BBE3.tar.gz T. Bertin-Mahieux, D. Eck, P. Lamere, Y. Bengio (Thierry/Lamere Boosting)]<br /><br />
'''TB''' = [https://www.music-ir.org/mirex/2008/results/tag/detailedReports/TB.tar.gz T. Bertin-Mahieux (dumb/smurf)]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/2008/results/tag/detailedReports/ME1.tar.gz M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/2008/results/tag/detailedReports/ME2.tar.gz M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/2008/results/tag/detailedReports/ME3.tar.gz M. I. Mandel, D. P. W. Ellis 3]<br /><br />
'''GP1''' = [https://www.music-ir.org/mirex/2008/results/tag/detailedReports/GP1.tar.gz G. Peeters 1]<br /><br />
'''GP2''' = [https://www.music-ir.org/mirex/2008/results/tag/detailedReports/GP2.tar.gz G. Peeters 2]<br /><br />
'''TTKV''' = [https://www.music-ir.org/mirex/2008/results/tag/detailedReports/TTKV.tar.gz K. Trohidis, G. Tsoumakas, G. Kalliris, I. Vlahavas]<br /><br />
<br />
<br />
<br />
<br />
[[Category: Results]]<br />
.</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Music_Mood_Classification_Results&diff=67142008:Audio Music Mood Classification Results2010-05-14T03:48:19Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Music Mood Classification task. For background information about this task set please refer to the [[2008:Audio Music Mood Classification]] page. <br />
<br />
===General Legend===<br />
====Team ID====<br />
'''GP1''' = [https://www.music-ir.org/mirex/abstracts/2008/Peeters_2008_ISMIR_MIREX.pdf G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''HW''' = [https://www.music-ir.org/mirex/abstracts/2008/.pdf H. Wang]<br /><br />
'''KL''' = [https://www.music-ir.org/mirex/abstracts/2008/.pdf K. Lee]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/abstracts/2008/.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de León, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/abstracts/2008/.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de León, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/abstracts/2008/.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de León, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/abstracts/2008/.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de León, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
==Overall Summary Results==<br />
===MIREX 2008 Audio Mood Classification Summary Results - Raw Classification Accuracy Averaged Over Three Train/Test Folds===<br />
<br />
<csv>2008/mood/audiomood.avg.results.csv</csv><br />
<br />
=====Accuracy Across Folds=====<br />
<br />
<csv>2008/mood/audiomood.results.fold.csv</csv><br />
<br />
=====Accuracy Across Categories=====<br />
<br />
<csv>2008/mood/audiomood.results.class.csv</csv><br />
<br />
===MIREX 2008 Audio Mood Classification Evaluation Logs and Confusion Matrices===<br />
<br />
====MIREX 2008 Audio Mood Classification Run Times====<br />
<br />
<csv>2008/mood.runtime.csv</csv><br />
<br />
====CSV Files Without Rounding====<br />
[https://www.music-ir.org/mirex/2008/results/mood/audiomood_results_fold.csv audiomood_results_fold.csv]<br /><br />
[https://www.music-ir.org/mirex/2008/results/mood/audiomood_results_class.csv audiomood_results_class.csv]<br /><br />
<br />
====Results By Algorithm====<br />
(.tar.gz) <br /><br />
'''GP1''' = [https://www.music-ir.org/mirex/2008/results/mood/GP1.tar.gz G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/2008/results/mood/GT1.tar.gz G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/2008/results/mood/GT2.tar.gz G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/2008/results/mood/GT3.tar.gz G. Tzanetakis]<br /><br />
'''HW''' = [https://www.music-ir.org/mirex/2008/results/mood/HW.tar.gz H. Wang]<br /><br />
'''KL''' = [https://www.music-ir.org/mirex/2008/results/mood/KL.tar.gz K. Lee]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/2008/results/mood/LRPPI1.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de León, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/2008/results/mood/LRPPI2.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de León, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/2008/results/mood/LRPPI3.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de León, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/2008/results/mood/LRPPI4.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de León, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/2008/results/mood/ME1.tar.gz M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/2008/results/mood/ME2.tar.gz M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/2008/results/mood/ME3.tar.gz M. I. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
===Friedman's Test for Significant Differences===<br />
====Classes vs. Systems====<br />
The Friedman test was run in MATLAB against the average accuracy for each class.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/mood/perClassAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/mood/perClassAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_mood.perclassaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
====Folds vs. Systems====<br />
The Friedman test was run in MATLAB against the accuracy for each fold.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/mood/perFoldAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/mood/perFoldAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_mood.perfoldaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Melody_Extraction_Results&diff=67132008:Audio Melody Extraction Results2010-05-14T03:48:09Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Melody Extraction task set. For background information about this task set please refer to the [[2008:Audio Melody Extraction]] page. Special thanks to Jean-Louis Durrieu for doing the vocal/non-vocal split summaries.<br />
<br />
===General Legend===<br />
====Team ID==== <br />
<br />
'''PC''' = [https://www.music-ir.org/mirex/abstracts/2008/AudioMelodyExt_pcancela.pdf P. Cancela]<br /><br />
'''CLLY1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_multiF0_CC.pdf C. Cao, M. Li, J. Liu, Y. Yan 1]<br /><br />
'''CLLY2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_multiF0_CC.pdf C. Cao, M. Li, J. Liu, Y. Yan 2]<br /><br />
'''DRD1''' = [https://www.music-ir.org/mirex/abstracts/2008/durrieu_imm_gmm.pdf J-L. Durrieu, G. Richard, B. David 1]<br /><br />
'''DRD2''' = [https://www.music-ir.org/mirex/abstracts/2008/durrieu_imm_gmm.pdf J-L. Durrieu, G. Richard, B. David 2]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/abstracts/2008/ME_ryynanen.pdf M. Ryynänen, A. Klapuri]<br /><br />
'''VR''' = [https://www.music-ir.org/mirex/abstracts/2008/ME_rao.pdf V. Rao, P. Rao]<br /><br />
<br />
====Table Headings====<br />
'''Vx Recall''' = Voicing Detection<br /><br />
'''Vx False Alm''' = Voicing False Alarm<br /><br />
'''Vx d'''' = Voicing d-prime<br /><br />
'''Raw pitch''' = Raw Pitch Accuracy<br /><br />
'''Raw Chroma''' = Raw Chroma Accuracy<br /><br />
'''Overall Acc''' = Overall Accuracy<br /><br />
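Assuming the standard signal-detection definition, the voicing d-prime can be derived from the two voicing rates above as sketched below; this is an assumption about the metric, not code from the evaluator.<br />
<pre>
% Sketch: voicing d-prime from voicing recall and voicing false-alarm rate,
% assuming the standard signal-detection definition d' = z(hit) - z(FA).
dprime = norminv(vxRecall) - norminv(vxFalseAlarm);   % norminv = inverse normal CDF
</pre>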
<br />
==Overall Summary Results==<br />
<br />
===MIREX 2008 Audio Melody Extraction Overall Summary results - Weighted (by Number of Files) Avg. of all Datasets - All===<br />
<csv>2008/am08_overall.csv</csv><br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - MIREX 2008 Dataset - All===<br />
<csv>2008/am08_m08_all.csv</csv><br />
<br />
[[Image:2008_am08_m08_all.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/2008/results/am08_persong_m08_all.xls Excel workbook] for MIREX 2008 Dataset - All (NB: it seems that all the songs in this dataset are vocal ones; there are therefore no separate vocal/non-vocal results).<br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - MIREX 2005 Dataset - vocal===<br />
<csv>2008/am08_m05_vocal.csv</csv><br />
<br />
[[Image:2008_am08_m05_vocal.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/2008/results/am08_persong_m05_vocal.xls Excel Workbook] for MIREX 2005 Dataset - vocal.<br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - MIREX 2005 Dataset - nonvocal===<br />
<csv>2008/am08_m05_nonvocal.csv</csv><br />
<br />
[[Image:2008_am08_m05_nonvocal.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/2008/results/am08_persong_m05_nonvocal.xls Excel Workbook] for MIREX 2005 Dataset - nonvocal.<br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - MIREX 2005 Dataset - All===<br />
<csv>2008/am08_m05_all.csv</csv><br />
<br />
[[Image:2008_am08_m05_all2.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/2008/results/am08_persong_m05_all.xls Excel Workbook] for MIREX 2005 Dataset - All.<br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - ADC 2004 Dataset - vocal===<br />
<csv>2008/am08_adc04_vocal.csv</csv><br />
<br />
[[Image:2008_am08_adc04_vocal.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/2008/results/am08_persong_adc04_vocal.xls Excel Workbook] for ADC 2004 Dataset - vocal.<br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - ADC 2004 Dataset - nonvocal===<br />
<csv>2008/am08_adc04_nonvocal.csv</csv><br />
<br />
[[Image:2008_am08_adc04_nonvocal.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/2008/results/am08_persong_adc04_nonvocal.xls Excel Workbook] for ADC 2004 Dataset - nonvocal.<br />
<br />
===MIREX 2008 Audio Melody Extraction Summary results - ADC 2004 Dataset - All===<br />
<csv>2008/am08_adc04_all.csv</csv><br />
<br />
[[Image:2008_am08_adc04_all.png]]<br />
<br />
Download the [https://www.music-ir.org/mirex/2008/results/am08_persong_adc04_all.xls Excel workbook] for ADC 2004 Dataset - All.<br />
<br />
===MIREX 2008 Audio Melody Extraction Runtime Data===<br />
<csv>2008/am08_runtime.csv</csv><br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Genre_Classification_Results&diff=67122008:Audio Genre Classification Results2010-05-14T03:47:59Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Genre Classification task. For background information about this task set please refer to the [[2008:Audio Genre Classification]] page. <br />
<br />
===General Legend===<br />
====Team ID====<br />
'''CL1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_genre_CC.pdf C. Cao, M. Li 1]<br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_genre_CC.pdf C. Cao, M. Li 2]<br /><br />
'''GP1''' = [https://www.music-ir.org/mirex/abstracts/2008/Peeters_2008_ISMIR_MIREX.pdf G. Peeters]<br /><br />
'''GT1 (mono)''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT2 (stereo)''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT3 (multicore)''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
==Overall Summary Results==<br />
===Task 1 (MIXED) Results===<br />
<br />
====MIREX 2008 Audio Genre Classification Summary Results - Raw Classification Accuracy Averaged Over Three Train/Test Folds====<br />
<br />
<csv>2008/genremixed/audiogenre.avg.results.csv</csv><br />
<br />
=====Accuracy Across Folds=====<br />
<br />
<csv>2008/genremixed/audiogenre.results.fold.csv</csv><br />
<br />
=====Accuracy Across Categories=====<br />
<br />
<csv>2008/genremixed/audiogenre.results.class.csv</csv><br />
<br />
====MIREX 2008 Audio Genre Classification Evaluation Logs and Confusion Matrices====<br />
<br />
====MIREX 2008 Audio Genre Classification Run Times====<br />
<br />
<csv>2008/genre.runtime.csv</csv><br />
<br />
====CSV Files Without Rounding====<br />
[https://www.music-ir.org/mirex/results/2008/genremixed/audiogenre_results_fold.csv audiogenre_results_fold.csv]<br /><br />
[https://www.music-ir.org/mirex/results/2008/genremixed/audiogenre_results_class.csv audiogenre_results_class.csv]<br /><br />
<br />
====Results By Algorithm====<br />
(.tar.gz) <br /><br />
'''CL1''' = [https://www.music-ir.org/mirex/2008/results/genremixed/CL1.tar.gz C. Cao, M. Li 1]<br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/2008/results/genremixed/CL2.tar.gz C. Cao, M. Li 2]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/2008/results/genremixed/LRPPI1.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/2008/results/genremixed/LRPPI2.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/2008/results/genremixed/LRPPI3.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/2008/results/genremixed/LRPPI4.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/2008/results/genremixed/ME1.tar.gz M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/2008/results/genremixed/ME2.tar.gz M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/2008/results/genremixed/ME3.tar.gz M. I. Mandel, D. P. W. Ellis 3]<br /><br />
'''GP''' = [https://www.music-ir.org/mirex/2008/results/genremixed/GP1.tar.gz G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/2008/results/genremixed/GT1.tar.gz G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/2008/results/genremixed/GT2.tar.gz G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/2008/results/genremixed/GT3.tar.gz G. Tzanetakis]<br /><br />
<br />
===Task 2 (LATIN) Results===<br />
<br />
====MIREX 2008 Audio Genre Classification Summary Results - Raw Classification Accuracy Averaged Over Three Train/Test Folds====<br />
<br />
<csv>2008/genrelatin/audiolatin.avg.results.csv</csv><br />
<br />
=====Accuracy Across Folds=====<br />
<br />
<csv>2008/genrelatin/audiolatin.results.fold.csv</csv><br />
<br />
=====Accuracy Across Categories=====<br />
<br />
<csv>2008/genrelatin/audiolatin.results.class.csv</csv><br />
<br />
====MIREX 2008 Audio Genre Classification Evaluation Logs and Confusion Matrices====<br />
<br />
====MIREX 2008 Audio Genre Classification Run Times====<br />
<csv>2008/latin.runtime.csv</csv><br />
<br />
====CSV Files Without Rounding====<br />
[https://www.music-ir.org/mirex/2008/results/genrelatin/audiolatin_results_fold.csv audiolatin_results_fold.csv]<br /><br />
[https://www.music-ir.org/mirex/2008/results/genrelatin/audiolatin_results_class.csv audiolatin_results_class.csv]<br /><br />
<br />
====Results By Algorithm====<br />
(.tar.gz) <br /><br />
'''CL1''' = [https://www.music-ir.org/mirex/2008/results/genrelatin/CL1.tar.gz C. Cao, M. Li 1]<br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/2008/results/genrelatin/CL2.tar.gz C. Cao, M. Li 2]<br /><br />
'''GP1''' = [https://www.music-ir.org/mirex/2008/results/genrelatin/GP1.tar.gz G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/2008/results/genrelatin/GT1.tar.gz G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/2008/results/genrelatin/GT2.tar.gz G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/2008/results/genrelatin/GT3.tar.gz G. Tzanetakis]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/2008/results/genrelatin/LRPPI1.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/2008/results/genrelatin/LRPPI2.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/2008/results/genrelatin/LRPPI3.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/2008/results/genrelatin/LRPPI4.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/2008/results/genrelatin/ME1.tar.gz M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/2008/results/genrelatin/ME2.tar.gz M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/2008/results/genrelatin/ME3.tar.gz M. I. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
===Friedman's Test for Significant Differences===<br />
====Task 1 (Mixed) Classes vs. Systems====<br />
The Friedman test was run in MATLAB against the average accuracy for each class.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/genremixed/perClassAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/genremixed/perClassAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_genremixed.perclassaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
====Task 1 (Mixed) Folds vs. Systems====<br />
The Friedman test was run in MATLAB against the accuracy for each fold.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/genremixed/perFoldAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/genremixed/perFoldAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_genremixed.perfoldaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
<br />
====Task 2 (Latin) Classes vs. Systems====<br />
The Friedman test was run in MATLAB against the average accuracy for each class.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/genrelatin/perClassAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/genrelatin/perClassAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_genrelatin.perclassaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
====Task 2 (Latin) Folds vs. Systems====<br />
The Friedman test was run in MATLAB against the accuracy for each fold.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/genrelatin/perFoldAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/genrelatin/perFoldAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_genrelatin.perfoldaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Cover_Song_Identification_Results&diff=67112008:Audio Cover Song Identification Results2010-05-14T03:47:49Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>Still missing runtimes JSD Sept. 11 2008.<br />
==Introduction==<br />
These are the results for the 2008 running of the Audio Cover Song Identification task. For background information about this task set please refer to the [[2008:Audio Cover Song Identification]] page.<br />
<br />
Each system was given a collection of 1000 songs which included 30 different classes (sets) of cover songs, where each class/set was represented by 11 different versions of a particular song. Each of the 330 cover songs was used as a query and the systems were required to return 10 results for each query. Systems were evaluated on the number of songs from the same class/set as the query that were returned in the list of 10 results for each query. Average precision, which looks at the entire per-query rank-ordered list of all songs in the collection, was the new metric introduced last year.<br />
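For reference, a minimal sketch of the average precision computation for a single query is given below, assuming <code>is_cover</code> is a 0/1 relevance vector over that query's full rank-ordered list; the variable name is illustrative and this is not the exact evaluator code.<br />
<pre>
is_cover  = is_cover(:)';                           % force a row vector
hits      = cumsum(is_cover);                       % relevant items found up to each rank
precision = hits ./ (1:numel(is_cover));            % precision at each rank
avg_prec  = sum(precision .* is_cover) / sum(is_cover);   % mean precision at relevant ranks
</pre>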
<br />
<br />
===General Legend===<br />
====Team ID====<br />
'''CL1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_covsng.pdf C. Cao, M. Li] <br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_covsng.pdf C. Cao, M. Li] <br /><br />
'''EL1''' = [https://www.music-ir.org/mirex/abstracts/2008/cbms_cover_song_id.pdf A. Egorov, G. Linetsky] <br /><br />
'''EL2''' = [https://www.music-ir.org/mirex/abstracts/2008/cbms_cover_song_id.pdf A. Egorov, G. Linetsky] <br /><br />
'''EL3''' = [https://www.music-ir.org/mirex/abstracts/2008/cbms_cover_song_id.pdf A. Egorov, G. Linetsky] <br /><br />
'''JCJ''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract.pdf J. H. Jensen, M. G. Christensen, S. H. Jensen] <br /><br />
'''SGH1''' = [https://www.music-ir.org/mirex/abstracts/2008/CS_Serra.pdf J. Serrà, E. Gómez, P. Herrera] <br /><br />
'''SGH2''' = [https://www.music-ir.org/mirex/abstracts/2008/CS_Serra.pdf J. Serrà, E. Gómez, P. Herrera] <br /><br />
<br />
==Overall Summary Results==<br />
<csv>2008//cover/grand.summary.v2.csv</csv><br />
<br />
<br />
<br />
===Number of Correct Covers at Rank X Returned in Top Ten=== <br />
<csv>2008/cover/cover.toptendist.transposed.csv</csv><br />
<br />
===Run Times=== <br />
<csv>2008/cover/coversong_runtimes.csv</csv><br />
<br />
CL1 and CL2 ran on FAST2 and FAST3. All others ran on ALE Nodes.<br />
<br />
===Friedman's Test for Significant Differences===<br />
The Friedman test was run in MATLAB against the Average Precision summary data over the 30 song groups.<br /> Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/cover/coversong.friedman.anova.csv</csv><br />
<br />
<csv>2008/cover/coversong.friedman.csv</csv><br />
<br />
[[Image:coversong.friedman.png]]<br />
<br />
===Average Performance per Query Group===<br />
These are the arithmetic means of the average precisions within each of the 30 query groups.<br />
<br />
<csv>2008/cover/cover.mapquerygroup.v2.csv</csv><br />
<br />
==Individual Results Files==<br />
===Average Precision Scores for Each Query===<br />
'''CL1''' = [https://www.music-ir.org/mirex/2008/results/cover.cl1.eval.csv C. Cao, M. Li] <br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/2008/results/cover.cl2.eval.csv C. Cao, M. Li] <br /><br />
'''EL1''' = [https://www.music-ir.org/mirex/2008/results/cover.el1.eval.csv A. Egorov, G. Linetsky] <br /><br />
'''EL2''' = [https://www.music-ir.org/mirex/2008/results/cover.el2.eval.csv A. Egorov, G. Linetsky] <br /><br />
'''EL3''' = [https://www.music-ir.org/mirex/2008/results/cover.el3.eval.csv A. Egorov, G. Linetsky] <br /><br />
'''JCJ''' = [https://www.music-ir.org/mirex/2008/results/cover.jcj.eval.csv J. H. Jensen, M. G. Christensen, S. H. Jensen] <br /><br />
'''SGH1''' = [https://www.music-ir.org/mirex/2008/results/cover.sgh1.eval.csv J. Serrà, E. Gómez, P. Herrera] <br /><br />
'''SGH2''' = [https://www.music-ir.org/mirex/2008/results/cover.sgh2.eval.csv J. Serrà, E. Gómez, P. Herrera] <br /><br />
<br />
===Ranks of the Ten Cover Songs Returned for Each Query===<br />
'''CL1''' = [https://www.music-ir.org/mirex/2008/results/cover.cl1.eval.debug.csv C. Cao, M. Li] <br /><br />
'''CL2''' = [https://www.music-ir.org/mirex/2008/results/cover.cl2.eval.debug.csv C. Cao, M. Li] <br /><br />
'''EL1''' = [https://www.music-ir.org/mirex/2008/results/cover.el1.eval.debug.csv A. Egorov, G. Linetsky] <br /><br />
'''EL2''' = [https://www.music-ir.org/mirex/2008/results/cover.el2.eval.debug.csv A. Egorov, G. Linetsky] <br /><br />
'''EL3''' = [https://www.music-ir.org/mirex/2008/results/cover.el3.eval.debug.csv A. Egorov, G. Linetsky] <br /><br />
'''JCJ''' = [https://www.music-ir.org/mirex/2008/results/cover.jcj.eval.debug.csv J. H. Jensen, M. G. Christensen, S. H. Jensen] <br /><br />
'''SGH1''' = [https://www.music-ir.org/mirex/2008/results/cover.sgh1.eval.debug.csv J. Serrà, E. Gómez, P. Herrera] <br /><br />
'''SGH2''' = [https://www.music-ir.org/mirex/2008/results/cover.sgh2.eval.csv J. Serrà, E. Gómez, P. Herrera] <br /><br />
<br />
===Runtimes===<br />
Where algorithms have been multi-threaded, the longest runtime is reported.<br />
<br />
Where runtimes were not properly reported, file timestamps have been used to approximate a runtime.<br />
<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Classical_Composer_Identification_Results&diff=67102008:Audio Classical Composer Identification Results2010-05-14T03:47:39Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Classical Composer Identification task. For background information about this task set please refer to the [[2007:Audio Classical Composer Identification]] page. <br />
<br />
The data set consisted of 2772 30-second audio clips. The composers represented were:<br />
<br />
#Bach<br />
#Beethoven<br />
#Brahms<br />
#Chopin<br />
#Dvorak<br />
#Handel<br />
#Haydn<br />
#Mendelssohn<br />
#Mozart<br />
#Schubert<br />
#Vivaldi<br />
<br />
The goal was to correctly identify the composer who wrote each of the pieces represented.<br />
<br />
<br />
===General Legend===<br />
====Team ID====<br />
'''GP1''' = [https://www.music-ir.org/mirex/abstracts/2008/Peeters_2008_ISMIR_MIREX.pdf G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
==Overall Summary Results==<br />
===MIREX 2008 Audio Classical Composer Classification Summary Results - Raw Classification Accuracy Averaged Over Three Train/Test Folds===<br />
<br />
<csv>2008/composer/audiocomposer.avg.results.csv</csv><br />
<br />
=====Accuracy Across Folds=====<br />
<br />
<csv>2008/composer/audiocomposer.results.fold.csv</csv><br />
<br />
=====Accuracy Across Categories=====<br />
<br />
<csv>2008/composer/audiocomposer.results.class.csv</csv><br />
<br />
===MIREX 2008 Audio Classical Composer Classification Evaluation Logs and Confusion Matrices===<br />
<br />
====MIREX 2008 Audio Classical Composer Classification Run Times====<br />
<br />
<csv>2008/composer.runtime.csv</csv><br />
<br />
====CSV Files Without Rounding====<br />
[https://www.music-ir.org/mirex/2008/results/composer/audiocomposer_results_fold.csv audiocomposer_results_fold.csv]<br /><br />
[https://www.music-ir.org/mirex/2008/results/composer/audiocomposer_results_class.csv audiocomposer_results_class.csv]<br /><br />
<br />
====Results By Algorithm====<br />
(.tar.gz) <br /><br />
'''GP1''' = [https://www.music-ir.org/mirex/2008/results/composer/GP1.tar.gz G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/2008/results/composer/GT1.tar.gz G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/2008/results/composer/GT2.tar.gz G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/2008/results/composer/GT3.tar.gz G. Tzanetakis]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/2008/results/composer/LRPPI1.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/2008/results/composer/LRPPI2.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/2008/results/composer/LRPPI3.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/2008/results/composer/LRPPI4.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/2008/results/composer/ME1.tar.gz M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/2008/results/composer/ME2.tar.gz M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/2008/results/composer/ME3.tar.gz M. I. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
===Friedman's Test for Significant Differences===<br />
====Classes vs. Systems====<br />
The Friedman test was run in MATLAB against the average accuracy for each class.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/composer/perClassAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/composer/perClassAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_composer.perclassaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
====Folds vs. Systems====<br />
The Friedman test was run in MATLAB against the accuracy for each fold.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/composer/perFoldAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/composer/perFoldAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_composer.perfoldaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Chord_Detection_Results&diff=67092008:Audio Chord Detection Results2010-05-14T03:47:29Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Chord Detection task set. For background information about this task set please refer to the [[2008:Audio Chord Detection]] page.<br />
<br />
===Task Descriptions===<br />
<br />
'''Task 1 (Pretrained Systems) [[#Task 1 Results|Go to Task 1 Results]]''':<br />
Systems were pretrained and they were tested against 176 Beatles songs. <br />
<br />
'''Task 2 (Train-Test Systems) [[#Task 2 Results|Go to Task 2 Results]]''': <br />
Systems were trained on ~2/3 of the Beatles dataset and tested on ~1/3. Album filtering was applied to each train-test fold so that songs from the same album cannot appear in both the train and test sets simultaneously. <br />
<br />
The overlap score was calculated as the ratio between the duration over which the ground-truth and detected chords overlap and the ground-truth duration. A secondary overlap score was also calculated by ignoring the major/minor variation of the detected chord (e.g., C major == C minor).<br />
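For illustration only, a frame-level sketch of the two overlap scores is given below, assuming <code>gt</code> and <code>det</code> are cell arrays of chord labels sampled on a common time grid spanning the ground-truth duration; the names and label syntax are assumptions, and the actual evaluator works on chord segments as described above.<br />
<pre>
overlap_score = sum(strcmp(gt, det)) / numel(gt);   % exact-label agreement vs. ground-truth duration

% Secondary score: ignore the major/minor variation by comparing chord roots only,
% e.g. 'C:maj' and 'C:min' both map to 'C' (label syntax assumed for illustration).
root = @(labels) regexprep(labels, ':.*$', '');
merged_score = sum(strcmp(root(gt), root(det))) / numel(gt);
</pre>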
<br />
Note that 4 songs were excluded from the original Beatles dataset because of problems aligning the ground truth to the audio.<br />
The ground-truth-to-audio alignment was done automatically. The script to perform the alignment is going to be released soon by Chris Harte.<br />
<br />
===General Legend===<br />
====Team ID for ChordPreTrained (Task 1)==== <br />
<br />
'''BP''' = [https://www.music-ir.org/mirex/abstracts/2008/CD_bello.pdf J. P. Bello, J. Pickens]<br /><br />
'''KO''' = [https://www.music-ir.org/mirex/abstracts/2008/khadkevich_omologo_final.pdf M. Khadkevich, M. Omologo]<br /><br />
'''KL1''' = [https://www.music-ir.org/mirex/abstracts/2008/XXX.pdf K. Lee 1]<br /><br />
'''KL2''' = [https://www.music-ir.org/mirex/abstracts/2008/XXX.pdf K. Lee 2]<br /><br />
'''MM''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08__mehnert_et_al__cps_based_chord_analysis.pdf M. Mehnert]<br /><br />
'''PP''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex08_chord_papadopoulos.pdf H.Papadopoulos, G. Peeters]<br /><br />
'''PVM''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2008-audio_chord_detection-ghent_university-johan_pauwels.pdf J. Pauwels, M. Varewyck, J-P. Martens]<br /><br />
'''RK''' = [https://www.music-ir.org/mirex/abstracts/2008/CD_ryynanen.pdf M. Ryynänen, A. Klapuri]<br /><br />
<br />
====Team ID for ChordTrainTest (Task 2)==== <br />
<br />
'''DE''' = [https://www.music-ir.org/mirex/abstracts/2008/Ellis08-chordid.pdf D. Ellis]<br /><br />
'''ZL''' = [https://www.music-ir.org/mirex/abstracts/2008/Abstract_xinglin.pdf X. Zhang, C. Lash]<br /><br />
'''KO''' = [https://www.music-ir.org/mirex/abstracts/2008/khadkevich_omologo_final.pdf M. Khadkevich, M. Omologo]<br /><br />
'''KL''' = [https://www.music-ir.org/mirex/abstracts/2008/XXX.pdf K. Lee (withtrain)]<br /><br />
'''UMS''' = [https://www.music-ir.org/mirex/abstracts/2008/uchiyamamirex2008.pdf Y. Uchiyama, K. Miyamoto, S. Sagayama]<br /><br />
'''WD1''' = [https://www.music-ir.org/mirex/abstracts/2008/Mirex08_AudioChordDetection_Weil_Durrieu.pdf J. Weil]<br /><br />
'''WD2''' = [https://www.music-ir.org/mirex/abstracts/2008/Mirex08_AudioChordDetection_Weil_Durrieu.pdf J. Weil, J-L. Durrieu]<br /><br />
<br />
==Overall Summary Results==<br />
===Task 1 Results===<br />
<br />
=====Task 1 Overall Results=====<br />
<br />
<csv>2008/chord/task1_results/pretrained_summary.csv</csv><br />
<br />
<csv>2008/chord/task1_results/pretrained_runtimes.csv</csv><br />
<br />
====Task 1 Summary Data for Download====<br />
[https://www.music-ir.org/mirex/2008/results/chord/task1_results/pretraineed_filenames.csv File Name Set (Pretrained runs)] <br /><br />
[https://www.music-ir.org/mirex/2008/results/chord/task1_results/ACD.task1.results.overlapScores.csv Summary Overlap Data (Pretrained runs)] <br /><br />
[https://www.music-ir.org/mirex/2008/results/chord/task1_results/ACD.task1.results.overlapScores.major_minor.csv Summary Overlap Data (Pretrained runs (Merged maj/min))] <br /><br />
====Task 1 Friedman's Test for Significant Differences====<br />
The Friedman test was run in MATLAB against the Task 1 Overlap Score data over the 176 ground truth songs.<br />
<br />
<br />
<csv>2008/chord/task1_results/task1_friedman.csv</csv><br />
<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
<br />
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<csv>2008/chord/task1_results/friedman_detailed.csv</csv><br />
<br />
[[Image:2008_task1.friedman.png]]<br />
<br />
===Task 2 Results===<br />
<br />
=====Task 2 Overall Results=====<br />
<csv>2008/chord/task2_results/summary.csv</csv><br />
<br />
<csv>2008/chord/task2_results/task2_runtimes.csv</csv><br />
<br />
====Task 2 Summary Data for Download====<br />
[https://www.music-ir.org/mirex/2008/results/chord/task2_results/all_filenames.csv File Name Set (Train-test runs)] <br /><br />
[https://www.music-ir.org/mirex/2008/results/chord/task2_results/all3folds_overlap_scores.csv Summary Overlap Data (Train-test runs)] <br /><br />
[https://www.music-ir.org/mirex/2008/results/chord/task2_results/all3folds_overlap_scores_majorMinor.csv Summary Overlap Data (Train-Test runs (Merged maj/min))] <br /><br />
[https://www.music-ir.org/mirex/2008/results/chord/task2_results/individualFriedmansForEachFold.zip Per Fold Summary Data (Train-Test runs (Zip archive))] <br /><br />
<br />
====Task 2 Friedman's Test for Significant Differences====<br />
The Friedman test was run in MATLAB against the Task 2 Overlap Score data over the 176 ground truth songs.<br />
<br />
<br />
<csv>2008/chord/task2_results/all3folds_friedman.txt</csv><br />
<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
<br />
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<csv>2008/chord/task2_results/task2_allFolds_friedman_detailed.csv</csv><br />
<br />
[[Image:2008_task2.allfolds_friedman.png]]<br />
<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2008:Audio_Artist_Identification_Results&diff=67082008:Audio Artist Identification Results2010-05-14T03:47:19Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2008 running of the Audio Artist Identification task. For background information about this task set please refer to the [[2008:Audio Artist Identification]] page. <br />
<br />
===General Legend===<br />
====Team ID====<br />
'''GP2''' = [https://www.music-ir.org/mirex/abstracts/2008/Peeters_2008_ISMIR_MIREX.pdf G. Peeters]<br /><br />
'''GT1 (mono)''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT2 (stereo)''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''GT3 (multicore)''' = [https://www.music-ir.org/mirex/abstracts/2008/mirex2007.pdf G. Tzanetakis]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/abstracts/2008/abstract_mirex08_class.pdf T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/abstracts/2008/AA_AG_AT_MM_CC_mandel.pdf M. I. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
==Overall Summary Results==<br />
===MIREX 2008 Audio Artist Classification Summary Results - Raw Classification Accuracy Averaged Over Three Train/Test Folds===<br />
<br />
<csv>2008/artist/audioartist.avg.results.csv</csv><br />
<br />
=====Accuracy Across Folds=====<br />
<br />
<csv>2008/artist/audioartist.results.fold.csv</csv><br />
<br />
=====Accuracy Across Categories=====<br />
<br />
<csv>2008/artist/audioartist.results.class.csv</csv><br />
<br />
===MIREX 2008 Audio Artist Classification Evaluation Logs and Confusion Matrices===<br />
<br />
====MIREX 2008 Audio Artist Classification Run Times====<br />
<br />
<csv>2008/artist.runtime.csv</csv><br />
<br />
====CSV Files Without Rounding====<br />
[https://www.music-ir.org/mirex/2008/results/artist/audioartist_results_fold.csv audioartist_results_fold.csv]<br /><br />
[https://www.music-ir.org/mirex/2008/results/artist/audioartist_results_class.csv audioartist_results_class.csv]<br /><br />
<br />
====Results By Algorithm====<br />
(.tar.gz) <br /><br />
'''GP2''' = [https://www.music-ir.org/mirex/2008/results/artist/GP2.tar.gz G. Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/2008/results/artist/GT1.tar.gz G. Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/2008/results/artist/GT2.tar.gz G. Tzanetakis]<br /><br />
'''GT3''' = [https://www.music-ir.org/mirex/2008/results/artist/GT3.tar.gz G. Tzanetakis]<br /><br />
'''LRPPI1''' = [https://www.music-ir.org/mirex/2008/results/artist/LRPPI1.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 1]<br /><br />
'''LRPPI2''' = [https://www.music-ir.org/mirex/2008/results/artist/LRPPI2.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 2]<br /><br />
'''LRPPI3''' = [https://www.music-ir.org/mirex/2008/results/artist/LRPPI3.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 3]<br /><br />
'''LRPPI4''' = [https://www.music-ir.org/mirex/2008/results/artist/LRPPI4.tar.gz T. Lidy, A. Rauber, A. Pertusa, P. Ponce de Leon, J. M. Iñesta 4]<br /><br />
'''ME1''' = [https://www.music-ir.org/mirex/2008/results/artist/ME1.tar.gz M. I. Mandel, D. P. W. Ellis 1]<br /><br />
'''ME2''' = [https://www.music-ir.org/mirex/2008/results/artist/ME2.tar.gz M. I. Mandel, D. P. W. Ellis 2]<br /><br />
'''ME3''' = [https://www.music-ir.org/mirex/2008/results/artist/ME3.tar.gz M. I. Mandel, D. P. W. Ellis 3]<br /><br />
<br />
===Friedman's Test for Significant Differences===<br />
====Classes vs. Systems====<br />
The Friedman test was run in MATLAB against the average accuracy for each class.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/artist/perClassAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/artist/perClassAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_artist.perclassaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
====Folds vs. Systems====<br />
The Friedman test was run in MATLAB against the accuracy for each fold.<br />
<br />
=====Friedman's Anova Table=====<br />
<br />
<csv>2008/artist/perFoldAccuracy.friedman.csv</csv><br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
The Tukey-Kramer HSD multi-comparison data below was generated using the following MATLAB instruction.<br />
Command: [c, m, h, gnames] = multcompare(stats, 'ctype', 'tukey-kramer', 'estimate', 'friedman', 'alpha', 0.05);<br />
<br />
<csv>2008/artist/perFoldAccuracy.friedman.detail.csv</csv><br />
<br />
[[Image:2008_artist.perfoldaccuracy.friedman.tukeykramerhsd.png]]<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2009:Music_Structure_Segmentation_Results&diff=67072009:Music Structure Segmentation Results2010-05-14T03:46:35Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
This task concerns itself with analyzing the structure of music audio files, and labeling the corresponding segments, e.g. {verse, chorus, bridge, etc}, {A, B, C, etc.}. A more detailed description can be found at the task page [[2009:Structural_Segmentation]]. The dataset consists of 297 popular music songs.<br />
<br />
===General Legend===<br />
====Team ID====<br />
<br />
'''ANO1''' = [https://www.music-ir.org/mirex/abstracts/2009/Ano.pdf Anonymous] <br /><br />
'''ANO2''' = [https://www.music-ir.org/mirex/abstracts/2009/Ano.pdf Anonymous] <br /><br />
'''MND''' = [https://www.music-ir.org/mirex/abstracts/2009/ACD_SS_mauch.pdf Matthias Mauch, Katy Noland, Simon Dixon] <br /><br />
'''PK''' = [https://www.music-ir.org/mirex/abstracts/2009/PK.pdf Jouni Paulus, Anssi Klapuri]<br /><br />
'''GP''' = [https://www.music-ir.org/mirex/abstracts/2009/Peeters_2009_MIREX_structure.pdf Geoffroy Peeters ] <br /><br />
<br />
====Evaluation Measures====<br />
'''overSegScore''' - normalised conditional entropy based over-segmentation score, S_o from [http://ismir2008.ismir.net/papers/ISMIR2008_219.pdf Lukashevich ISMIR2008]<br><br />
'''underSegScore''' - normalised conditional entropy based under-segmentation score, S_u from [http://ismir2008.ismir.net/papers/ISMIR2008_219.pdf Lukashevich ISMIR2008]<br><br />
'''pwF''' - frame pair clustering F-measure from [http://dx.doi.org/10.1109/TASL.2007.910781 Levy & Sandler TASLP2008]<br><br />
'''pwPrecision''' - frame pair clustering precision rate from [http://dx.doi.org/10.1109/TASL.2007.910781 Levy & Sandler TASLP2008]<br><br />
'''pwRecall''' - frame pair clustering recall rate from [http://dx.doi.org/10.1109/TASL.2007.910781 Levy & Sandler TASLP2008]<br><br />
'''R''' - Rand clustering index from [http://www.springerlink.com/content/x64124718341j1j0/fulltext.pdf Hubert & Arabie, "Comparing partitions", Journal of Classification, 1985]<br><br />
'''Fmeasure@[0.5, 3]s''' - overall F-measure for segment boundary recovery; a claimed boundary is accepted if it is within the specified window length of a true boundary<br><br />
'''precRate@[0.5, 3]s''' - segment boundary recovery precision rate<br><br />
'''recRate@[0.5, 3]s''' - segment boundary recovery recall rate<br><br />
'''medianTrue2claim''' - median distance from an annotated segment boundary to the closest found boundary, seconds<br><br />
'''medianClaim2true''' - median distance from a found segment boundary to the closest annotated one, seconds<br><br />
<br />
The calculation of the measures is described in [[2009:Structural_Segmentation#Evaluation_Measures]].<br />
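For orientation, a minimal sketch of the frame-pair clustering measures (pwPrecision, pwRecall, pwF) is given below, assuming <code>gt</code> and <code>est</code> are frame-level segment-label vectors on a common time grid; this is an illustrative sketch, not the official evaluation code.<br />
<pre>
N        = numel(gt);
same_gt  = bsxfun(@eq, gt(:), gt(:)');    % frame pairs sharing an annotated label
same_est = bsxfun(@eq, est(:), est(:)');  % frame pairs sharing an estimated label
pairs    = triu(ones(N), 1) > 0;          % count each unordered frame pair once

hits        = sum(same_gt(pairs) & same_est(pairs));
pwPrecision = hits / sum(same_est(pairs));
pwRecall    = hits / sum(same_gt(pairs));
pwF         = 2 * pwPrecision * pwRecall / (pwPrecision + pwRecall);
</pre>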
<br />
===MIREX 2009 Music Structure Summary Results - Mean of all Measures===<br />
<br />
<csv>2009/structure/structure.summary.csv</csv><br />
<br />
===MIREX 2009 Music Structure Summary Runtime Data===<br />
<csv>2009/structure/structure.runtime.csv</csv><br />
<br />
===Individual Participant Results===<br />
*[[Music_Structure_Segmentation_Results:_AN01]]<br />
*[[Music_Structure_Segmentation_Results:_AN02]]<br />
*[[Music_Structure_Segmentation_Results:_GP]]<br />
*[[Music_Structure_Segmentation_Results:_MND]]<br />
*[[Music_Structure_Segmentation_Results:_PK]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2009:Multiple_Fundamental_Frequency_Estimation_%26_Tracking_Results&diff=67062009:Multiple Fundamental Frequency Estimation & Tracking Results2010-05-14T03:46:26Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2009 running of the Multiple Fundamental Frequency Estimation and Tracking task. For background information about this task set please refer to the [[2009:Multiple Fundamental Frequency Estimation & Tracking]] page.<br />
<br />
<br />
===General Legend===<br />
<br />
<br />
====Team ID====<br />
<br />
'''BVB''' = [https://www.music-ir.org/mirex/abstracts/2009/BVB.pdf Nancy Bertin, Emmanuel Vincent, Roland Badeau]<br /><br />
'''DHP1''' = [https://www.music-ir.org/mirex/abstracts/2009/DHP.pdf Zhiyao Duan, Jinyu Han, Bryan Pardo]<br /><br />
'''DHP2''' = [https://www.music-ir.org/mirex/abstracts/2009/DHP.pdf Zhiyao Duan, Jinyu Han, Bryan Pardo]<br /><br />
'''NEOS1''' = [https://www.music-ir.org/mirex/abstracts/2009/NEOS1.pdf Masahiro Nakano, Koji Egashira, Nobutaka Ono, Shigeki Sagayama]<br /><br />
'''NEOS2''' = [https://www.music-ir.org/mirex/abstracts/2009/NEOS2.pdf Masahiro Nakano, Koji Egashira, Nobutaka Ono, Shigeki Sagayama]<br /><br />
'''NPA1''' = [https://www.music-ir.org/mirex/abstracts/2009/NPA.pdf Paolo Nesi, Gianni Pantaleo, Fabrizio Argenti]<br /><br />
'''NPA2''' = [https://www.music-ir.org/mirex/abstracts/2009/NPA.pdf Paolo Nesi, Gianni Pantaleo, Fabrizio Argenti]<br /><br />
'''RS1''' = [https://www.music-ir.org/mirex/abstracts/2009/RS.pdf S. A. Raczyński, S. Sagayama]<br /><br />
'''RS2''' = [https://www.music-ir.org/mirex/abstracts/2009/RS.pdf S. A. Raczyński, S. Sagayama]<br /><br />
'''RS3''' = [https://www.music-ir.org/mirex/abstracts/2009/RS.pdf S. A. Raczyński, S. Sagayama]<br /><br />
'''RS4''' = [https://www.music-ir.org/mirex/abstracts/2009/RS.pdf S. A. Raczyński, S. Sagayama]<br /><br />
'''RS5''' = [https://www.music-ir.org/mirex/abstracts/2009/ S. A. Raczyński, S. Sagayama]<br /><br />
'''RS6''' = [https://www.music-ir.org/mirex/abstracts/2009/RS.pdf S. A. Raczyński, S. Sagayama]<br /><br />
'''YR1''' = [https://www.music-ir.org/mirex/abstracts/2009/YR.pdf Chunghsin Yeh, Axel Roebel]<br /><br />
'''YR2''' = [https://www.music-ir.org/mirex/abstracts/2009/YR.pdf Chunghsin Yeh, Axel Roebel]<br /><br />
'''ZL''' = [https://www.music-ir.org/mirex/abstracts/2009/zhang_MFF.pdf Xueliang Zhang, Wenju Liu]<br /><br />
<br />
==Task 1: Multiple Fundamental Frequency Estimation (MF0E)==<br />
<br />
===MF0E Overall Summary Results===<br />
Below are the average scores across 40 test files. These files come from 3 different sources: a woodwind quintet recording of bassoon, clarinet, horn, flute and oboe (UIUC); rendered MIDI from the RWC database donated by IRCAM; and a quartet recording of bassoon, clarinet, violin and sax donated by Dr. Bryan Pardo's Interactive Audio Lab (IAL). 20 files come from 5 sections of the woodwind recording, where each section has 4 files ranging from 2-polyphony to 5-polyphony; 12 files come from IAL, drawn from 4 different songs ranging from 2-polyphony to 4-polyphony; and 8 files come from RWC synthesized MIDI, drawn from 2 different songs ranging from 2-polyphony to 5-polyphony. <br />
<br />
<br />
<br />
<csv>2009/mf0/est/summary/task1.overall.csv</csv> <br />
<br />
====Detailed Results====<br />
<br />
<csv>2009/mf0/est/summary/task1.results.csv</csv><br />
<br />
====Detailed Chroma Results====<br />
Here, accuracy is assessed on chroma results (i.e. all F0's are mapped to a single octave before evaluating)<br />
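A minimal sketch of that chroma mapping is given below, assuming <code>f0_hz</code> is a vector of F0 estimates in Hz; the variable name is illustrative.<br />
<pre>
midi   = 69 + 12 * log2(f0_hz / 440);   % semitone scale, A4 = 440 Hz = MIDI note 69
chroma = mod(round(midi), 12);          % wrap to a single octave: pitch classes 0..11
</pre>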
<br />
<csv>2009/mf0/est/summary/task1.chroma.results.csv</csv><br />
<br />
===Individual Results Files for Task 1===<br />
'''BVB''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/BVB.tar.gz Nancy Bertin, Emmanuel Vincent, Roland Badeau]<br /><br />
'''DHP1''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/DHP1.tar.gz Zhiyao Duan, Jinyu Han, Bryan Pardo]<br /><br />
'''DHP2''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/DHP2.tar.gz Zhiyao Duan, Jinyu Han, Bryan Pardo]<br /><br />
'''NEOS1''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/NEOS1.tar.gz Masahiro Nakano, Koji Egashira, Nobutaka Ono, Shigeki Sagayama]<br /><br />
'''NEOS2''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/NEOS2.tar.gz Masahiro Nakano, Koji Egashira, Nobutaka Ono, Shigeki Sagayama]<br /><br />
'''NPA1''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/NPA1.tar.gz Paolo Nesi, Gianni Pantaleo, Fabrizio Argenti]<br /><br />
'''NPA2''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/NPA2.tar.gz Paolo Nesi, Gianni Pantaleo, Fabrizio Argenti]<br /><br />
'''RS1''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/RS1.tar.gz S. A. Raczyński, S. Sagayama]<br /><br />
'''RS2''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/RS2.tar.gz S. A. Raczyński, S. Sagayama]<br /><br />
'''RS3''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/RS3.tar.gz S. A. Raczyński, S. Sagayama]<br /><br />
'''RS4''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/RS4.tar.gz S. A. Raczyński, S. Sagayama]<br /><br />
'''RS5''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/RS5.tar.gz S. A. Raczyński, S. Sagayama]<br /><br />
'''RS6''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/RS6.tar.gz S. A. Raczyński, S. Sagayama]<br /><br />
'''YR1''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/YR1.tar.gz Chunghsin Yeh, Axel Roebel]<br /><br />
'''YR2''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/YR2.tar.gz Chunghsin Yeh, Axel Roebel]<br /><br />
'''ZL''' = [https://music-ir.org/mirex/results/2009/mf0/est/tars/ZL.tar.gz Xueliang Zhang, Wenju Liu]<br />
<br />
=====Info about the filenames=====<br />
The filenames starting with part* come from the acoustic woodwind recording; the ones starting with RWC are synthesized. The legend for the instruments is:<br />
<br />
'''bs''' = bassoon,<br />
'''cl''' = clarinet,<br />
'''fl''' = flute,<br />
'''hn''' = horn,<br />
'''ob''' = oboe,<br />
'''vl''' = violin,<br />
'''cel''' = cello,<br />
'''gtr''' = guitar,<br />
'''sax''' = saxophone,<br />
'''bass''' = electric bass guitar<br />
<br />
===Run Times===<br />
<csv>2009/multif0/task1_runtimes.csv</csv><br />
<br />
TBA<br />
<br />
==Task 2A:Mixed Set Note Tracking (NT)==<br />
===NT Mixed Set Overall Summary Results===<br />
This subtask is evaluated in two different ways. In the first setup, a returned note is assumed correct if its onset is within +-50ms of a reference note and its F0 is within +- a quarter tone of that reference note's F0, ignoring the returned offset values. In the second setup, on top of the above requirements, a correct returned note is required to have an offset value within 20% of the reference note's duration around the reference note's offset, or within 50ms, whichever is larger.<br />
<br />
A total of 34 files were used in this subtask: 16 from the woodwind recording, 8 from the IAL quartet recording and 6 piano. <br />
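For reference, a minimal sketch of the matching rule described above for one reference/returned note pair is given below; all variable names are illustrative and this is not the exact evaluator code.<br />
<pre>
% ref_on, ref_off, ref_f0: reference note onset/offset (s) and F0 (Hz); est_* likewise.
onset_ok = abs(est_on - ref_on) <= 0.05;               % onset within +-50 ms
pitch_ok = abs(12 * log2(est_f0 / ref_f0)) <= 0.5;     % F0 within +- a quarter tone
correct_setup1 = onset_ok && pitch_ok;                 % first setup: offsets ignored

% Second setup: the offset must also lie within 20% of the reference duration
% around the reference offset, or within 50 ms, whichever is larger.
offset_tol = max(0.2 * (ref_off - ref_on), 0.05);
correct_setup2 = correct_setup1 && abs(est_off - ref_off) <= offset_tol;
</pre>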
<br />
<csv>2009/mf0/nt/summary/task2.overall.results.csv</csv><br />
<br />
====Detailed Results====<br />
<br />
<csv>2009/mf0/nt/summary/task2.results.csv</csv><br />
<br />
====Detailed Chroma Results====<br />
Here, accuracy is assessed on chroma results (i.e. all F0's are mapped to a single octave before evaluating)<br />
<br />
<csv>2009/mf0/nt/summary/task2.chroma.results.csv</csv><br />
<br />
====Results Based on Onset Only====<br />
<br />
<csv>2009/mf0/nt/summary/task2.onsetOnly.results.csv</csv><br />
<br />
====Chroma Results Based on Onset Only====<br />
<br />
<csv>2009/mf0/nt/summary/task2.onsetOnly.chroma.results.csv</csv><br />
<br />
==Task 2B:Piano-Only Note Tracking (NT)==<br />
===NT Piano-Only Overall Summary Results===<br />
This subtask is evaluated in two different ways. In the first setup, a returned note is assumed correct if its onset is within +-50ms of a reference note and its F0 is within +- a quarter tone of that reference note's F0, ignoring the returned offset values. In the second setup, on top of the above requirements, a correct returned note is required to have an offset value within 20% of the reference note's duration around the reference note's offset, or within 50ms, whichever is larger.<br />
Six piano recordings are evaluated separately for this subtask.<br />
<br />
<csv>2009/mf0/nt/summary_piano_subtask/task2_piano.overall.results.csv</csv><br />
<br />
====Detailed Results====<br />
<br />
<csv>2009/mf0/nt/summary_piano_subtask/task2_piano.results.csv</csv><br />
<br />
====Detailed Chroma Results====<br />
Here, accuracy is assessed on chroma results (i.e. all F0's are mapped to a single octave before evaluating)<br />
<br />
<csv>2009/mf0/nt/summary_piano_subtask/task2_piano.chroma.results.csv</csv><br />
<br />
====Results Based on Onset Only====<br />
<br />
<csv>2009/mf0/nt/summary_piano_subtask/task2_piano.onsetOnly.results.csv</csv><br />
<br />
====Chroma Results Based on Onset Only====<br />
<br />
<csv>2009/mf0/nt/summary_piano_subtask/task2_piano.onsetOnly.chroma.results.csv</csv><br />
<br />
<br />
===Individual Results Files for Task 2===<br />
'''BVB''' = [https://music-ir.org/mirex/results/2009/mf0/nt/BVB.tgz Nancy Bertin, Emmanuel Vincent, Roland Badeau]<br /><br />
'''DHP1''' = [https://music-ir.org/mirex/results/2009/mf0/nt/DHP1.tgz Zhiyao Duan, Jinyu Han, Bryan Pardo]<br /><br />
'''DHP2''' = [https://music-ir.org/mirex/results/2009/mf0/nt/DHP2.tgz Zhiyao Duan, Jinyu Han, Bryan Pardo]<br /><br />
'''NEOS1''' = [https://music-ir.org/mirex/results/2009/mf0/nt/NEOS1.tgz Masahiro Nakano, Koji Egashira, Nobutaka Ono, Shigeki Sagayama]<br /><br />
'''NEOS2''' = [https://music-ir.org/mirex/results/2009/mf0/nt/NEOS2.tgz Masahiro Nakano, Koji Egashira, Nobutaka Ono, Shigeki Sagayama]<br /><br />
'''NPA1''' = [https://music-ir.org/mirex/results/2009/mf0/nt/NPA1.tgz Paolo Nesi, Gianni Pantaleo, Fabrizio Argenti]<br /><br />
'''NPA2''' = [https://music-ir.org/mirex/results/2009/mf0/nt/NPA2.tgz Paolo Nesi, Gianni Pantaleo, Fabrizio Argenti]<br /><br />
'''RS1''' = [https://music-ir.org/mirex/results/2009/mf0/nt/RS1.tgz S. A. Raczyński, S. Sagayama]<br /><br />
'''RS2''' = [https://music-ir.org/mirex/results/2009/mf0/nt/RS2.tgz S. A. Raczyński, S. Sagayama]<br /><br />
'''RS3''' = [https://music-ir.org/mirex/results/2009/mf0/nt/RS3.tgz S. A. Raczyński, S. Sagayama]<br /><br />
'''RS4''' = [https://music-ir.org/mirex/results/2009/mf0/nt/RS4.tgz S. A. Raczyński, S. Sagayama]<br /><br />
'''RS5''' = [https://music-ir.org/mirex/results/2009/mf0/nt/RS5.tgz S. A. Raczyński, S. Sagayama]<br /><br />
'''RS6''' = [https://music-ir.org/mirex/results/2009/mf0/nt/RS6.tgz S. A. Raczyński, S. Sagayama]<br /><br />
'''YR''' = [https://music-ir.org/mirex/results/2009/mf0/nt/YR.tgz Chunghsin Yeh, Axel Roebel]<br /><br />
<br />
======Info About Filenames======<br />
The filenames starting with part* come from the acoustic woodwind recording; the ones starting with RWC are synthesized. The piano files are: RA_C030_align.wav, bach_847TESTp.wav, beet_pathetique_3TESTp.wav, mz_333_1TESTp.wav, scn_4TESTp.wav.note, ty_januarTESTp.wav.note. The filenames starting with 01*, 03*, 07*, 09* come from the quartet recording.<br />
<br />
===Run Times===<br />
<csv>2009/multif0/task2_runtimes.csv</csv> TBA<br />
<br />
==Task 3 Instrument Tracking==<br />
The same dataset was used as in Task 1. The evaluations were performed by first one-to-one matching the detected contours to the ground-truth contours. This is done by selecting the best-scoring pairs of detected and ground-truth contours. If there are extra detected contours that are not matched to any ground-truth contour, all the returned F0s in those contours are added to the false positives. If there are extra ground-truth contours that are not matched to any detected contour, all the F0s in those ground-truth contours are added to the false negatives. <br />
<br />
<br />
====MF0It Detailed Results====<br />
<br />
<csv>2009/mf0/it/summary/task3.results.csv</csv><br />
<br />
====Detailed Chroma Results====<br />
Here, accuracy is assessed on chroma results (i.e. all F0's are mapped to a single octave before evaluating)<br />
<br />
<csv>2009/mf0/it/summary/task3.chroma.results.csv</csv><br />
<br />
===Individual Results Files for Task 3===<br />
'''DHP1''' = [https://music-ir.org/mirex/results/2009/mf0/it/DHP1.tar.gz Zhiyao Duan, Jinyu Han, Bryan Pardo]<br /><br />
<br />
======Info About Filenames======<br />
The filenames starting with part* come from the acoustic woodwind recording; the ones starting with RWC are synthesized. The piano files are: RA_C030_align.wav, bach_847TESTp.wav, beet_pathetique_3TESTp.wav, mz_333_1TESTp.wav, scn_4TESTp.wav.note, ty_januarTESTp.wav.note. The filenames starting with 01*, 03*, 07*, 09* come from the quartet recording.<br />
<br />
==Friedman's Test for Significant Differences==<br />
<br />
====Task 1: Multiple Fundamental Frequency Estimation (MF0E)====<br />
The Friedman test was run in MATLAB to test significant differences amongst systems with regard to the performance (accuracy) on individual files.<br />
<br />
=====Tukey-Kramer HSD Multi-Comparison=====<br />
<br />
<csv>2009/mf0/est/summary/Accuracy_Per_Song_Friedman_Mean_Rankstask1.friedman.Friedman_TukeyKramerHSD.csv</csv><br />
<br />
https://music-ir.org/mirex/results/2009/mf0/est/summary/small.Accuracy_Per_Song_Friedman_Mean_Rankstask1.friedman.Friedman_Mean_Ranks.png<br />
<br />
====Task 2: Note Tracking====<br />
The Friedman test was run in MATLAB to test significant differences amongst systems with regard to the F-measure on individual files.<br />
<br />
=====Tukey-Kramer HSD Multi-Comparison for Task2A=====<br />
<br />
<csv>2009/mf0/nt/summary/Accuracy_Per_Song_Friedman_Mean_Rankstask2.friedman.Friedman_TukeyKramerHSD.csv</csv><br />
<br />
https://music-ir.org/mirex/results/2009/mf0/nt/summary/small.Accuracy_Per_Song_Friedman_Mean_Rankstask2.friedman.Friedman_Mean_Ranks.png<br />
=====Tukey-Kramer HSD Multi-Comparison for Task2B (Piano)=====<br />
<br />
<csv>2009/mf0/nt/summary_piano_subtask/Accuracy_Per_Song_Friedman_Mean_Rankstask2_piano.friedman.Friedman_TukeyKramerHSD.csv</csv><br />
<br />
https://www.music-ir.org/mirex/results/2009/mf0/nt/summary_piano_subtask/small.Accuracy_Per_Song_Friedman_Mean_Rankstask2_piano.friedman.Friedman_Mean_Ranks.png<br />
<br />
<br />
<br />
[[Category: Results]]</div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2009:Audio_Tag_Classification_(Mood_Set)_Results&diff=67052009:Audio Tag Classification (Mood Set) Results2010-05-14T03:46:16Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
These are the results for the 2009 running of the Audio Tag Classification (Mood Set) task. For background information about this task set please refer to the [[2009:Audio_Tag_Classification]] page. The data was created by Xiao Hu and consists of 3,469 unique songs and 135 mood tags organized into 18 mood tag groups.<br />
<br />
=== Mood tags ===<br />
The tags were collected from [http://last.fm last.fm]. All tags in this set are mood related as identified and grouped by [http://wndomains.fbk.eu/wnaffect.html WordNet-Affect] and human experts. <br />
<br />
Each mood tag group contains the following tags: <br />
* G12: calm, comfort, quiet, serene, mellow, chill out, calm down, calming, chillout, comforting, content, cool down, mellow music, mellow rock, peace of mind, quietness, relaxation, serenity, solace, soothe, soothing, still, tranquil, tranquility, tranquility<br />
* G15: sad, sadness, unhappy, melancholic, melancholy, feeling sad, mood: sad - slightly, sad song<br />
* G5: happy, happiness, happy songs, happy music, glad, mood: happy<br />
* G32: romantic, romantic music<br />
* G2: upbeat, gleeful, high spirits, zest, enthusiastic, buoyancy, elation, mood: upbeat<br />
* G16: depressed, blue, dark, depressive, dreary, gloom, darkness, depress, depression, depressing, gloomy<br />
* G28: anger, angry, choleric, fury, outraged, rage, angry music<br />
* G17: grief, heartbreak, mournful, sorrow, sorry, doleful, heartache, heartbreaking, heartsick, lachrymose, mourning, plaintive, regret, sorrowful<br />
* G14: dreamy<br />
* G6: cheerful, cheer up, festive, jolly, jovial, merry, cheer, cheering, cheery, get happy, rejoice, songs that are cheerful, sunny<br />
* G8: brooding, contemplative, meditative, reflective, broody, pensive, pondering, wistful<br />
* G29: aggression, aggressive<br />
* G25: angst, anxiety, anxious, jumpy, nervous, angsty<br />
* G9: confident, encouraging, encouragement, optimism, optimistic<br />
* G7: desire, hope, hopeful, mood: hopeful<br />
* G11: earnest, heartfelt<br />
* G31: pessimism, cynical, pessimistic, weltschmerz, cynical/sarcastic<br />
* G1: excitement, exciting, exhilarating, thrill, ardor, stimulating, thrilling, titillating<br />
<br />
For details on the mood tag groups, please see<br />
<br />
[https://www.music-ir.org/archive/papers/ISMIR2009_MoodClassification.pdf X. Hu, J. S. Downie, A. Ehmann (2009)]. '''Lyric Text Mining in Music Mood Classification''', in the Proceedings of the 10th International Society for Music Information Retrieval Conference (ISMIR 2009), Oct. 2009, Kobe, Japan.<br />
<br />
=== Data ===<br />
The songs are Western pop songs, mostly from the USPOP collection. Each song may belong to multiple mood tag groups. The main song-selection rule is: if more than one tag in a group was applied to a song, or if one tag in a group was applied more than once to that song, the song is marked as belonging to that group.<br />
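<br />
As a small illustration of that selection rule, the sketch below checks group membership for a single song; the two (truncated) groups and the tag counts are hypothetical, only the rule itself comes from the description above.<br />
<pre>
# A song belongs to a group if (a) more than one tag from the group was applied to it,
# or (b) a single tag from the group was applied more than once. Data below are made up.
GROUPS = {
    "G5": {"happy", "happiness", "happy songs", "happy music", "glad", "mood: happy"},
    "G15": {"sad", "sadness", "unhappy", "melancholic", "melancholy"},
}

def groups_for_song(tag_counts):
    """tag_counts maps a tag string to the number of times it was applied to the song."""
    member_of = []
    for group, tags in GROUPS.items():
        applied = {t: n for t, n in tag_counts.items() if t in tags and n > 0}
        if len(applied) > 1 or any(n > 1 for n in applied.values()):
            member_of.append(group)
    return member_of

# Hypothetical song: 'happy' applied twice, 'sad' applied once.
print(groups_for_song({"happy": 2, "sad": 1}))   # ['G5'] -- a single use of 'sad' is not enough
</pre>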
<br />
For details on how the songs were selected, please see the [https://www.music-ir.org/archive/papers/Mood_Multi_Tag_Data_Description.pdf Mood multi-tag data description].<br />
<br />
Audio format: 30-second clips, 44.1 kHz, stereo, 16 bit, WAV files. The data were split into 3 folds with artist filtering. <br />
<br />
<br />
--------------------<br />
<br />
<br />
===General Legend===<br />
====Team ID====<br />
<br />
'''BP1''' = [https://www.music-ir.org/mirex/abstracts/2009/BP.pdf Juan José Burred, Geoffroy Peeters]<br /><br />
'''BP2''' = [https://www.music-ir.org/mirex/abstracts/2009/BP.pdf Juan José Burred, Geoffroy Peeters]<br /><br />
'''CC1''' = [https://www.music-ir.org/mirex/abstracts/2009/CC.pdf Chuan Cao, Ming Li]<br /><br />
'''CC2''' = [https://www.music-ir.org/mirex/abstracts/2009/CC.pdf Chuan Cao, Ming Li]<br /><br />
'''CC3''' = [https://www.music-ir.org/mirex/abstracts/2009/CC.pdf Chuan Cao, Ming Li]<br /><br />
'''CC4''' = [https://www.music-ir.org/mirex/abstracts/2009/CC.pdf Chuan Cao, Ming Li]<br /><br />
'''GP''' = [https://www.music-ir.org/mirex/abstracts/2009/Peeters_2009_MIREX_classification.pdf Geoffroy Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/abstracts/2009/GTfinal.pdf George Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/abstracts/2009/GTfinal.pdf George Tzanetakis]<br /><br />
'''HCB''' = [https://www.music-ir.org/mirex/abstracts/2009/HBC.pdf Matthew D. Hoffman, David M. Blei, Perry R. Cook]<br /><br />
'''LWW1''' = [https://www.music-ir.org/mirex/abstracts/2009/LWW.pdf Hung-Yi Lo, Ju-Chiang Wang, Hsin-Min Wang]<br /><br />
'''LWW2''' = [https://www.music-ir.org/mirex/abstracts/2009/LWW.pdf Hung-Yi Lo, Ju-Chiang Wang, Hsin-Min Wang]<br /><br />
<br />
==Overall Summary Results (Binary)==<br />
<br />
<csv p=3>2009/tag/Mood/summary_binary.csv</csv><br />
<br />
<br />
<br />
===Summary Binary Relevance F-Measure (Average Across All Folds)===<br />
<br />
<csv p=3>2009/tag/Mood/binary_avg_Fmeasure.csv</csv><br />
<br />
===Summary Binary Accuracy (Average Across All Folds)===<br />
<br />
<csv p=3>2009/tag/Mood/binary_avg_Accuracy.csv</csv><br />
<br />
===Summary Positive Example Accuracy (Average Across All Folds)===<br />
<br />
<csv p=3>2009/tag/Mood/binary_avg_positive_example_Accuracy.csv</csv><br />
<br />
===Summary Negative Example Accuracy (Average Across All Folds)===<br />
<br />
<csv p=3>2009/tag/Mood/binary_avg_negative_example_Accuracy.csv</csv><br />
<br />
==Overall Summary Results (Affinity)==<br />
<br />
<csv p=3>2009/tag/Mood/summary_affinity.csv</csv><br />
<br />
===Summary AUC-ROC Tag (Average Across All Folds)===<br />
<br />
<csv p=3>2009/tag/Mood/affinity_tag_AUC_ROC.csv</csv><br />
<br />
==Select Friedman's Test Results==<br />
===Tag F-measure (Binary) Friedman Test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the F-measure for each '''tag''' in the test, averaged over all folds.<br />
<br />
<csv p=3>2009/tag/Mood/binary_FMeasure.friedman.tukeyKramerHSD.csv</csv><br />
<br />
<br />
https://music-ir.org/mirex/results/2009/tag/Mood/small.binary_FMeasure.friedman.tukeyKramerHSD.png<br />
<br />
===Per Track F-measure (Binary) Friedman Test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the F-measure for each '''track''' in the test, averaged over all folds.<br />
<csv p=3>2009/tag/Mood/binary_FMeasure_per_track.friedman.tukeyKramerHSD.csv</csv><br />
<br />
<br />
https://music-ir.org/mirex/results/2009/tag/Mood/small.binary_FMeasure_per_track.friedman.tukeyKramerHSD.png<br />
<br />
===Tag AUC-ROC (Affinity) Friedman Test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the Area Under the ROC curve (AUC-ROC) for each '''tag''' in the test, averaged over all folds.<br />
<br />
<csv p=3>2009/tag/Mood/affinity.AUC_ROC_TAG.friedman.tukeyKramerHSD.csv</csv><br />
<br />
https://music-ir.org/mirex/results/2009/tag/Mood/small.affinity.AUC_ROC_TAG.friedman.tukeyKramerHSD.png<br />
<br />
<br />
===Per Track AUC-ROC (Affinity) Friedman Test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the Area Under the ROC curve (AUC-ROC) for each '''track/clip''' in the test, averaged over all folds.<br />
<br />
<csv p=3>2009/tag/Mood/affinity.AUC_ROC_TRACK.friedman.tukeyKramerHSD.csv</csv><br />
<br />
https://music-ir.org/mirex/results/2009/tag/Mood/small.affinity.AUC_ROC_TRACK.friedman.tukeyKramerHSD.png<br />
<br />
==Assorted Results Files for Download==<br />
===General Results===<br />
[https://music-ir.org/mirex/results/2009/tag/Mood/affinity_tag_fold_AUC_ROC.csv affinity_tag_fold_AUC_ROC.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/affinity_clip_AUC_ROC.csv affinity_clip_AUC_ROC.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/binary_per_fold_Accuracy.csv binary_per_fold_Accuracy.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/binary_per_fold_Fmeasure.csv binary_per_fold_Fmeasure.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/binary_per_fold_negative_example_Accuracy.csv binary_per_fold_negative_example_Accuracy.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/binary_per_fold_per_track_Accuracy.csv binary_per_fold_per_track_Accuracy.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/binary_per_fold_per_track_Fmeasure.csv binary_per_fold_per_track_Fmeasure.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/binary_per_fold_per_track_negative_example_Accuracy.csv binary_per_fold_per_track_negative_example_Accuracy.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/binary_per_fold_per_track_positive_example_Accuracy.csv binary_per_fold_per_track_positive_example_Accuracy.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/binary_per_fold_positive_example_Accuracy.csv binary_per_fold_positive_example_Accuracy.csv]<br /><br />
<br />
===Friedman's Tests Results===<br />
[https://music-ir.org/mirex/results/2009/tag/Mood/affinity.PrecisionAt3.friedman.tukeyKramerHSD.csv affinity.PrecisionAt3.friedman.tukeyKramerHSD.csv ]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/affinity.PrecisionAt3.friedman.tukeyKramerHSD.png affinity.PrecisionAt3.friedman.tukeyKramerHSD.png]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/affinity.PrecisionAt6.friedman.tukeyKramerHSD.csv affinity.PrecisionAt6.friedman.tukeyKramerHSD.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/affinity.PrecisionAt6.friedman.tukeyKramerHSD.png affinity.PrecisionAt6.friedman.tukeyKramerHSD.png]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/affinity.PrecisionAt9.friedman.tukeyKramerHSD.csv affinity.PrecisionAt9.friedman.tukeyKramerHSD.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/affinity.PrecisionAt9.friedman.tukeyKramerHSD.png affinity.PrecisionAt9.friedman.tukeyKramerHSD.png ]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/affinity.PrecisionAt12.friedman.tukeyKramerHSD.csv affinity.PrecisionAt12.friedman.tukeyKramerHSD.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/affinity.PrecisionAt12.friedman.tukeyKramerHSD.png affinity.PrecisionAt12.friedman.tukeyKramerHSD.png]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/affinity.PrecisionAt15.friedman.tukeyKramerHSD.csv affinity.PrecisionAt15.friedman.tukeyKramerHSD.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/affinity.PrecisionAt15.friedman.tukeyKramerHSD.png affinity.PrecisionAt15.friedman.tukeyKramerHSD.png]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/binary_Accuracy.friedman.tukeyKramerHSD.csv binary_Accuracy.friedman.tukeyKramerHSD.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/Mood/binary_Accuracy.friedman.tukeyKramerHSD.png binary_Accuracy.friedman.tukeyKramerHSD.png]<br /><br />
<br />
===Results By Algorithm===<br />
(.tgz format) <br /><br />
<br />
'''BP1''' = [https://www.music-ir.org/mirex/results/2009/tag/Mood/BP1.tgz Juan José Burred, Geoffroy Peeters]<br /><br />
'''BP2''' = [https://www.music-ir.org/mirex/results/2009/tag/Mood/BP2.tgz Juan José Burred, Geoffroy Peeters]<br /><br />
'''CC1''' = [https://www.music-ir.org/mirex/results/2009/tag/Mood/CC1.tgz Chuan Cao, Ming Li]<br /><br />
'''CC2''' = [https://www.music-ir.org/mirex/results/2009/tag/Mood/CC2.tgz Chuan Cao, Ming Li]<br /><br />
'''CC3''' = [https://www.music-ir.org/mirex/results/2009/tag/Mood/CC3.tgz Chuan Cao, Ming Li]<br /><br />
'''CC4''' = [https://www.music-ir.org/mirex/results/2009/tag/Mood/CC4.tgz Chuan Cao, Ming Li]<br /><br />
'''GP''' = [https://www.music-ir.org/mirex/results/2009/tag/Mood/GP.tgz Geoffroy Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/results/2009/tag/Mood/GT1.tgz George Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/results/2009/tag/Mood/GT2.tgz George Tzanetakis]<br /><br />
'''LWW1''' = [https://www.music-ir.org/mirex/results/2009/tag/Mood/LWW1.tgz Hung-Yi Lo, Ju-Chiang Wang, Hsin-Min Wang]<br /><br />
'''LWW2''' = [https://www.music-ir.org/mirex/results/2009/tag/Mood/LWW2.tgz Hung-Yi Lo, Ju-Chiang Wang, Hsin-Min Wang]<br /><br />
'''HCB''' = [https://www.music-ir.org/mirex/results/2009/tag/Mood/HCB.tgz Matthew D. Hoffman, David M. Blei, Perry R. Cook]<br /></div>IMIRSELBothttps://music-ir.org/mirex/w/index.php?title=2009:Audio_Tag_Classification_(MajorMiner)_Set_Results&diff=67042009:Audio Tag Classification (MajorMiner) Set Results2010-05-14T03:46:07Z<p>IMIRSELBot: Robot: Automated text replacement (-mirex/abs/ +mirex/abstracts/)</p>
<hr />
<div>==Introduction==<br />
This task compares various algorithms' abilities to associate tags with 10-second audio clips of songs. The tags come from the MajorMiner game. This task is closely related to the other audio classification tasks; however, instead of a single N-way classification per clip, it requires N binary classifications per clip.<br />
<br />
Two outputs are produced by each algorithm: <br />
* a set of binary classifications indicating which tags are relevant to each example, <br />
* a set of 'affinity' scores which indicate the degree to which each tag applies to each track. <br />
These different outputs allow the algorithms to be evaluated both on tag 'classification' and tag 'ranking' (where the tags may be ranked for each track and tracks ranked for each tag).<br />
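<br />
The sketch below shows what these two outputs might look like for a single clip, with an invented tag vocabulary, affinity scores and threshold; the actual submission formats are defined on the task page.<br />
<pre>
# Hypothetical per-clip output of a tagging algorithm: a binary decision and an
# affinity score for every tag in the vocabulary. Tags and values are made up.
TAGS = ["rock", "guitar", "male", "synth", "jazz"]

affinities = {"rock": 0.91, "guitar": 0.74, "male": 0.62, "synth": 0.18, "jazz": 0.05}
binary = {tag: affinities[tag] >= 0.5 for tag in TAGS}   # e.g. thresholded affinities

# Ranking view used by the affinity evaluation: tags sorted by decreasing affinity.
ranked = sorted(TAGS, key=lambda t: affinities[t], reverse=True)
print(binary)
print(ranked)   # ['rock', 'guitar', 'male', 'synth', 'jazz']
</pre>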
<br />
=== Data ===<br />
<br />
All of the data is browsable via the [http://majorminer.org/search MajorMiner search] page.<br />
<br />
The music consists of 2300 clips selected at random from 3900 tracks. Each clip is 10 seconds long. The 2300 clips represent a total of 1400 different tracks on 800 different albums by 500 different artists. To give a sense for the music collection, the following genre tags have been applied to these artists, albums, and tracks on Last.fm: electronica, rock, indie, alternative, pop, britpop, idm, new wave, hip-hop, singer-songwriter, trip-hop, post-punk, ambient, jazz.<br />
<br />
==== Tags ====<br />
<br />
The MajorMiner game has collected a total of about 73000 taggings, 12000 of which have been verified by at least two users. In these verified taggings, there are 43 tags that have been verified at least 35 times, for a total of about 9000 verified uses. These are the tags we will be using in this task.<br />
<br />
Note that these data do not include strict negative labels. While many clips are tagged ''rock'', none are tagged ''not rock''. Frequently, however, a clip will be tagged many times without being tagged ''rock''. We take this as an indication that ''rock'' does not apply to that clip. More specifically, a negative example of a particular tag is a clip on which another tag has been verified, but the tag in question has not.<br />
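<br />
A minimal sketch of how negative examples can be derived under that rule is given below; the clip identifiers and verified-tag sets are invented.<br />
<pre>
# Rule from above: a clip is a negative example for tag T if some tag has been
# verified on it but T has not. Clips with no verified tags contribute nothing.
verified = {
    "clip_001": {"rock", "guitar", "male"},
    "clip_002": {"techno", "synth"},
    "clip_003": set(),   # no verified tags
}

def positives_and_negatives(tag):
    pos = [c for c, tags in verified.items() if tag in tags]
    neg = [c for c, tags in verified.items() if tags and tag not in tags]
    return pos, neg

print(positives_and_negatives("rock"))   # (['clip_001'], ['clip_002'])
</pre>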
<br />
Here is a list of the top 50 tags along with an approximate number of times each has been verified, how many times it's been used in total, and how many different users have ever used it:<br />
<br />
{| class="wikitable" style="margin: 1em auto 1em auto"<br />
! Tag || Verified || Total || Users<br />
|-<br />
| drums || 962 || 3223 || 127 <br />
|-<br />
| guitar || 845 || 3204 || 181 <br />
|-<br />
| male || 724 || 2452 || 95 <br />
|-<br />
| rock || 658 || 2619 || 198 <br />
|-<br />
| synth || 498 || 1889 || 105 <br />
|-<br />
| electronic || 490 || 1878 || 131 <br />
|-<br />
| pop || 479 || 1761 || 151 <br />
|-<br />
| bass || 417 || 1632 || 99 <br />
|-<br />
| vocal || 355 || 1378 || 99 <br />
|-<br />
| female || 342 || 1387 || 100 <br />
|-<br />
| dance || 322 || 1244 || 115 <br />
|-<br />
| techno || 246 || 943 || 104 <br />
|-<br />
| piano || 179 || 826 || 120 <br />
|-<br />
| electronica || 168 || 686 || 67 <br />
|-<br />
| hip hop || 166 || 701 || 126 <br />
|-<br />
| voice || 160 || 790 || 55 <br />
|-<br />
| slow || 157 || 727 || 90 <br />
|-<br />
| beat || 154 || 708 || 90 <br />
|-<br />
| rap || 151 || 723 || 129 <br />
|-<br />
| jazz || 136 || 735 || 154 <br />
|-<br />
| 80s || 130 || 601 || 94 <br />
|-<br />
| fast || 109 || 494 || 70 <br />
|-<br />
| instrumental || 103 || 539 || 62 <br />
|-<br />
| drum machine || 89 || 427 || 35 <br />
|-<br />
| british || 81 || 383 || 60 <br />
|-<br />
| country || 74 || 360 || 105 <br />
|-<br />
| distortion || 73 || 366 || 55 <br />
|-<br />
| saxophone || 70 || 316 || 86 <br />
|-<br />
| house || 65 || 298 || 66 <br />
|-<br />
| ambient || 61 || 335 || 78 <br />
|-<br />
| soft || 61 || 351 || 58 <br />
|-<br />
| silence || 57 || 200 || 35 <br />
|-<br />
| r&b || 57 || 242 || 59 <br />
|-<br />
| strings || 55 || 252 || 62 <br />
|-<br />
| quiet || 54 || 261 || 57 <br />
|-<br />
| solo || 53 || 268 || 56 <br />
|-<br />
| keyboard || 53 || 424 || 41 <br />
|-<br />
| punk || 51 || 242 || 76 <br />
|-<br />
| horns || 48 || 204 || 38 <br />
|-<br />
| drum and bass || 48 || 191 || 50 <br />
|-<br />
| noise || 46 || 249 || 61 <br />
|-<br />
| funk || 46 || 266 || 90 <br />
|-<br />
| acoustic || 40 || 193 || 58 <br />
|-<br />
| trumpet || 39 || 174 || 68 <br />
|-<br />
| end || 38 || 178 || 36 <br />
|-<br />
| loud || 37 || 218 || 62 <br />
|-<br />
| organ || 35 || 169 || 46 <br />
|-<br />
| metal || 35 || 178 || 64 <br />
|-<br />
| folk || 33 || 195 || 58 <br />
|-<br />
| trance || 33 || 226 || 49 <br />
|}<br />
<br />
=== Evaluation ===<br />
Participating algorithms were evaluated with 3-fold cross-validation. Artist filtering was used in the production of the training and test splits, i.e. training and test sets contained different artists. <br />
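<br />
An artist-filtered split of this kind can be produced with a grouped cross-validator; the sketch below uses scikit-learn's GroupKFold with invented clip and artist identifiers (the official folds were generated for the evaluation itself, not with this code).<br />
<pre>
# Sketch: 3-fold split in which no artist appears in both the training and test folds.
from sklearn.model_selection import GroupKFold

clips   = [f"clip_{i:03d}" for i in range(12)]                          # hypothetical clips
artists = ["a", "a", "b", "b", "c", "c", "d", "d", "e", "e", "f", "f"]  # artist of each clip

gkf = GroupKFold(n_splits=3)
for fold, (train_idx, test_idx) in enumerate(gkf.split(clips, groups=artists)):
    train_artists = {artists[i] for i in train_idx}
    test_artists  = {artists[i] for i in test_idx}
    assert train_artists.isdisjoint(test_artists)   # the artist filter
    print(f"fold {fold}: test clips = {[clips[i] for i in test_idx]}")
</pre>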
<br />
==== Binary Evaluation ====<br />
Algorithms are evaluated on their performance at tag classification using F-measure. Results are also reported for simple accuracy; however, as this statistic is dominated by the negative-example accuracy, it is not a reliable indicator of performance (a system that returns no tags for any example will still achieve a high accuracy). The accuracies are also reported separately for positive and negative examples, as these can help elucidate the behaviour of an algorithm (for example, showing whether a system is under- or over-predicting).<br />
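<br />
The sketch below computes these binary measures for a single tag from hypothetical ground truth and predictions, and illustrates why overall accuracy can look high even when most positive examples are missed.<br />
<pre>
# Binary measures for one tag over a set of clips (labels and predictions are made up).
import numpy as np
from sklearn.metrics import f1_score, accuracy_score

y_true = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])   # few positives, many negatives
y_pred = np.array([1, 0, 0, 0, 0, 0, 0, 0, 0, 0])   # a system that rarely applies the tag

print("F-measure:", f1_score(y_true, y_pred))        # 0.5  -- punished for the missed positives
print("Accuracy :", accuracy_score(y_true, y_pred))  # 0.8  -- dominated by the easy negatives
print("Pos. acc.:", accuracy_score(y_true[y_true == 1], y_pred[y_true == 1]))  # 0.33
print("Neg. acc.:", accuracy_score(y_true[y_true == 0], y_pred[y_true == 0]))  # 1.0
</pre>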
<br />
==== Affinity (ranking) Evaluation ====<br />
Algorithms are evaluated on their performance at tag ranking using the Area Under the Receiver Operating Characteristic Curve (AUC-ROC). The affinity scores of all tags for a given track are sorted prior to the computation of the AUC-ROC statistic, which gives higher scores to ranked tag sets in which the correct tags appear towards the top of the set. <br />
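<br />
The sketch below computes the per-clip form of this statistic from an invented tag vocabulary, ground truth and affinity vector; the per-tag form works the same way with the roles of tags and clips swapped.<br />
<pre>
# Per-clip AUC-ROC over the tag vocabulary: the ranking induced by the affinity
# scores is rewarded when the verified tags sit near the top. Data are made up.
from sklearn.metrics import roc_auc_score

tags     = ["rock", "guitar", "male", "synth", "jazz", "piano"]
relevant = {"rock", "guitar"}                        # tags actually verified for this clip
y_true   = [1 if t in relevant else 0 for t in tags]
affinity = [0.92, 0.71, 0.40, 0.35, 0.10, 0.05]      # system's affinity score per tag

print("clip AUC-ROC:", roc_auc_score(y_true, affinity))   # 1.0 -- both true tags ranked first
</pre>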
<br />
<br />
===General Legend===<br />
====Team ID====<br />
<br />
'''BP1''' = [https://www.music-ir.org/mirex/abstracts/2009/BP_train_tag.pdf Juan José Burred, Geoffroy Peeters]<br /><br />
'''BP2''' = [https://www.music-ir.org/mirex/abstracts/2009/BP_train_tag.pdf Juan José Burred, Geoffroy Peeters]<br /><br />
'''CC1''' = [https://www.music-ir.org/mirex/abstracts/2009/CC.pdf Chuan Cao, Ming Li]<br /><br />
'''CC2''' = [https://www.music-ir.org/mirex/abstracts/2009/CC.pdf Chuan Cao, Ming Li]<br /><br />
'''CC3''' = [https://www.music-ir.org/mirex/abstracts/2009/CC.pdf Chuan Cao, Ming Li]<br /><br />
'''CC4''' = [https://www.music-ir.org/mirex/abstracts/2009/CC.pdf Chuan Cao, Ming Li]<br /><br />
'''GP''' = [https://www.music-ir.org/mirex/abstracts/2009/Peeters_2009_MIREX_classification.pdf Geoffroy Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/abstracts/2009/GTfinal.pdf George Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/abstracts/2009/GTfinal.pdf George Tzanetakis]<br /><br />
'''HBC''' = [https://www.music-ir.org/mirex/abstracts/2009/HBC.pdf Matthew D. Hoffman, David M. Blei, Perry R. Cook]<br /><br />
'''LWW1''' = [https://www.music-ir.org/mirex/abstracts/2009/LWW.pdf Hung-Yi Lo, Ju-Chiang Wang, Hsin-Min Wang]<br /><br />
'''LWW2''' = [https://www.music-ir.org/mirex/abstracts/2009/LWW.pdf Hung-Yi Lo, Ju-Chiang Wang, Hsin-Min Wang]<br /><br />
<br />
==Overall Summary Results (Binary)==<br />
<br />
<csv p=3>2009/tag/MajorMiner/summary_binary.csv</csv><br />
<br />
<br />
<br />
===Summary Binary Relevance F-Measure (Average Across All Folds)===<br />
<br />
<csv p=3>2009/tag/MajorMiner/binary_avg_Fmeasure.csv</csv><br />
<br />
===Summary Binary Accuracy (Average Across All Folds)===<br />
<br />
<csv p=3>2009/tag/MajorMiner/binary_avg_Accuracy.csv</csv><br />
<br />
===Summary Positive Example Accuracy (Average Across All Folds)===<br />
<br />
<csv p=3>2009/tag/MajorMiner/binary_avg_positive_example_Accuracy.csv</csv><br />
<br />
===Summary Negative Example Accuracy (Average Across All Folds)===<br />
<br />
<csv p=3>2009/tag/MajorMiner/binary_avg_negative_example_Accuracy.csv</csv><br />
<br />
==Overall Summary Results (Affinity)==<br />
<br />
<csv p=3>2009/tag/MajorMiner/summary_affinity.csv</csv><br />
<br />
===Summary AUC-ROC Tag (Average Across All Folds)===<br />
<br />
<csv p=3>2009/tag/MajorMiner/affinity_tag_AUC_ROC.csv</csv><br />
<br />
==Select Friedman's Test Results==<br />
===Tag F-measure (Binary) Friedman Test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the F-measure for each '''tag''' in the test, averaged over all folds.<br />
<br />
<csv p=3>2009/tag/MajorMiner/binary_FMeasure.friedman.tukeyKramerHSD.csv</csv><br />
<br />
<br />
https://music-ir.org/mirex/results/2009/tag/MajorMiner/small.binary_FMeasure.friedman.tukeyKramerHSD.png<br />
<br />
===Per Track F-measure (Binary) Friedman Test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the F-measure for each '''track''' in the test, averaged over all folds.<br />
<csv p=3>2009/tag/MajorMiner/binary_FMeasure_per_track.friedman.tukeyKramerHSD.csv</csv><br />
<br />
<br />
https://music-ir.org/mirex/results/2009/tag/MajorMiner/small.binary_FMeasure_per_track.friedman.tukeyKramerHSD.png<br />
<br />
===Tag AUC-ROC (Affinity) Friedman Test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the Area Under the ROC curve (AUC-ROC) for each '''tag''' in the test, averaged over all folds.<br />
<br />
<csv p=3>2009/tag/MajorMiner/affinity.AUC_ROC_TAG.friedman.tukeyKramerHSD.csv</csv><br />
<br />
https://music-ir.org/mirex/results/2009/tag/MajorMiner/small.affinity.AUC_ROC_TAG.friedman.tukeyKramerHSD.png<br />
<br />
===Per Track AUC-ROC (Affinity) Friedman Test===<br />
The following table and plot show the results of Friedman's ANOVA with Tukey-Kramer multiple comparisons computed over the Area Under the ROC curve (AUC-ROC) for each '''track/clip''' in the test, averaged over all folds.<br />
<br />
<csv p=3>2009/tag/MajorMiner/affinity.AUC_ROC_TRACK.friedman.tukeyKramerHSD.csv</csv><br />
<br />
https://music-ir.org/mirex/results/2009/tag/MajorMiner/small.affinity.AUC_ROC_TRACK.friedman.tukeyKramerHSD.png<br />
<br />
==Assorted Results Files for Download==<br />
===General Results===<br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/affinity_tag_fold_AUC_ROC.csv affinity_tag_fold_AUC_ROC.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/affinity_clip_AUC_ROC.csv affinity_clip_AUC_ROC.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/binary_per_fold_Accuracy.csv binary_per_fold_Accuracy.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/binary_per_fold_Fmeasure.csv binary_per_fold_Fmeasure.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/binary_per_fold_negative_example_Accuracy.csv binary_per_fold_negative_example_Accuracy.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/binary_per_fold_per_track_Accuracy.csv binary_per_fold_per_track_Accuracy.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/binary_per_fold_per_track_Fmeasure.csv binary_per_fold_per_track_Fmeasure.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/binary_per_fold_per_track_negative_example_Accuracy.csv binary_per_fold_per_track_negative_example_Accuracy.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/binary_per_fold_per_track_positive_example_Accuracy.csv binary_per_fold_per_track_positive_example_Accuracy.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/binary_per_fold_positive_example_Accuracy.csv binary_per_fold_positive_example_Accuracy.csv]<br /><br />
<br />
===Friedman's Tests Results===<br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/affinity.PrecisionAt3.friedman.tukeyKramerHSD.csv affinity.PrecisionAt3.friedman.tukeyKramerHSD.csv ]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/affinity.PrecisionAt3.friedman.tukeyKramerHSD.png affinity.PrecisionAt3.friedman.tukeyKramerHSD.png]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/affinity.PrecisionAt6.friedman.tukeyKramerHSD.csv affinity.PrecisionAt6.friedman.tukeyKramerHSD.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/affinity.PrecisionAt6.friedman.tukeyKramerHSD.png affinity.PrecisionAt6.friedman.tukeyKramerHSD.png]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/affinity.PrecisionAt9.friedman.tukeyKramerHSD.csv affinity.PrecisionAt9.friedman.tukeyKramerHSD.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/affinity.PrecisionAt9.friedman.tukeyKramerHSD.png affinity.PrecisionAt9.friedman.tukeyKramerHSD.png ]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/affinity.PrecisionAt12.friedman.tukeyKramerHSD.csv affinity.PrecisionAt12.friedman.tukeyKramerHSD.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/affinity.PrecisionAt12.friedman.tukeyKramerHSD.png affinity.PrecisionAt12.friedman.tukeyKramerHSD.png]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/affinity.PrecisionAt15.friedman.tukeyKramerHSD.csv affinity.PrecisionAt15.friedman.tukeyKramerHSD.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/affinity.PrecisionAt15.friedman.tukeyKramerHSD.png affinity.PrecisionAt15.friedman.tukeyKramerHSD.png]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/binary_Accuracy.friedman.tukeyKramerHSD.csv binary_Accuracy.friedman.tukeyKramerHSD.csv]<br /> <br />
[https://music-ir.org/mirex/results/2009/tag/MajorMiner/binary_Accuracy.friedman.tukeyKramerHSD.png binary_Accuracy.friedman.tukeyKramerHSD.png]<br /><br />
<br />
===Results By Algorithm===<br />
(.tgz format) <br /><br />
<br />
'''BP1''' = [https://www.music-ir.org/mirex/results/2009/tag/MajorMiner/BP1.tgz Juan José Burred, Geoffroy Peeters]<br /><br />
'''BP2''' = [https://www.music-ir.org/mirex/results/2009/tag/MajorMiner/BP2.tgz Juan José Burred, Geoffroy Peeters]<br /><br />
'''CC1''' = [https://www.music-ir.org/mirex/results/2009/tag/MajorMiner/CC1.tgz Chuan Cao, Ming Li]<br /><br />
'''CC2''' = [https://www.music-ir.org/mirex/results/2009/tag/MajorMiner/CC2.tgz Chuan Cao, Ming Li]<br /><br />
'''CC3''' = [https://www.music-ir.org/mirex/results/2009/tag/MajorMiner/CC3.tgz Chuan Cao, Ming Li]<br /><br />
'''CC4''' = [https://www.music-ir.org/mirex/results/2009/tag/MajorMiner/CC4.tgz Chuan Cao, Ming Li]<br /><br />
'''GP''' = [https://www.music-ir.org/mirex/results/2009/tag/MajorMiner/GP.tgz Geoffroy Peeters]<br /><br />
'''GT1''' = [https://www.music-ir.org/mirex/results/2009/tag/MajorMiner/GT1.tgz George Tzanetakis]<br /><br />
'''GT2''' = [https://www.music-ir.org/mirex/results/2009/tag/MajorMiner/GT2.tgz George Tzanetakis]<br /><br />
'''LWW1''' = [https://www.music-ir.org/mirex/results/2009/tag/MajorMiner/LWW1.tgz Hung-Yi Lo, Ju-Chiang Wang, Hsin-Min Wang]<br /><br />
'''LWW2''' = [https://www.music-ir.org/mirex/results/2009/tag/MajorMiner/LWW2.tgz Hung-Yi Lo, Ju-Chiang Wang, Hsin-Min Wang]<br /><br />
'''HBC''' = [https://www.music-ir.org/mirex/results/2009/tag/MajorMiner/HBC.tgz Matthew D. Hoffman, David M. Blei, Perry R. Cook]<br /></div>IMIRSELBot