<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://music-ir.org/mirex/w/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=J.+Ashley+Burgoyne</id>
	<title>MIREX Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://music-ir.org/mirex/w/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=J.+Ashley+Burgoyne"/>
	<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/wiki/Special:Contributions/J._Ashley_Burgoyne"/>
	<updated>2026-04-29T17:05:53Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.31.1</generator>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9890</id>
		<title>2013:MIREX2013 Results</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9890"/>
		<updated>2013-11-30T18:54:58Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Other Tasks */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==OVERALL RESULTS POSTERS &amp;lt;!--(First Version: Will need updating as last runs are completed)--&amp;gt;==&lt;br /&gt;
&lt;br /&gt;
This page is under construction. &lt;br /&gt;
&lt;br /&gt;
[https://www.music-ir.org/mirex/results/2013/mirex_2013_poster.pdf MIREX 2013 Overall Results Posters (PDF)]&lt;br /&gt;
&lt;br /&gt;
==Results by Task ==&lt;br /&gt;
&lt;br /&gt;
===Train-Test Task Set===&lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/composer_report/ Audio Classical Composer Identification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/latin_report/ Audio Latin Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mood_report/index.html Audio Music Mood Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mixed_report/ Audio Mixed Popular Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
===Other Tasks===&lt;br /&gt;
&lt;br /&gt;
* Audio Beat Tracking Results &lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/dav/ DAV Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/maz/ MAZ Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/mck/ MCK Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Audio Chord Detection Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/mrx09/index.html MIREX &amp;amp;rsquo;09 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/bill/index.html Billboard &amp;amp;rsquo;12 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_MIREX_2009 | MIREX &amp;amp;rsquo;09 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_Billboard_2012 | Billboard &amp;amp;rsquo;12 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_Billboard_2013 | Billboard &amp;amp;rsquo;13 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/akd/ Audio Key Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Melody Extraction Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/adc04/  ADC04 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx05/ MIREX05 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/ind08/ INDIAN08 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_0db/ MIREX09 0dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_m5db/ MIREX09 -5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_p5db/ MIREX09 +5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Audio_Music_Similarity_and_Retrieval_Results | Audio Music Similarity and Retrieval Results]] &lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/aod/ Audio Onset Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Tag Classification Results&lt;br /&gt;
** Major Miner Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
** Mood Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ate/ Audio Tempo Estimation Results] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Multiple_Fundamental_Frequency_Estimation_&amp;amp;_Tracking_Results | Multiple Fundamental Frequency Estimation &amp;amp; Tracking Results]]&lt;br /&gt;
* Music Structure Segmentation Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx09/ MIREX09 dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_1/ RWC dataset - Quaero (MIREX10) Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_2/ RWC dataset - Original RWC Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/sal/ SALAMI dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Singing/Humming Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1_hidden/  Hidden Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1a_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1b_thinkit/ ThinkIt Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1c_ioacas/ IOACAS Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Tapping Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_hsiao/ HSIAO Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Real-time_Audio_to_Score_Alignment_(a.k.a._Score_Following)_Results | Real-time Audio to Score Alignment (a.k.a. Score Following) Results ]]&lt;br /&gt;
* [[2013:Symbolic_Melodic_Similarity_Results | Symbolic Melodic Similarity Results]]&lt;br /&gt;
* [[2013:Discovery of Repeated Themes &amp;amp; Sections Results | Discovery of Repeated Themes &amp;amp; Sections Results]]&lt;br /&gt;
* [[2013:Audio Cover Song Identification Results]]&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_Billboard_2013&amp;diff=9889</id>
		<title>2013:Audio Chord Estimation Results Billboard 2013</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_Billboard_2013&amp;diff=9889"/>
		<updated>2013-11-30T18:37:46Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: Created page with &amp;quot;==Introduction==  This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for a special subset of ...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for a special subset of the ''Billboard'' dataset from McGill University that has never been made available to the public. Further subsets have been withheld to support the ACE task through MIREX 2015.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, because this is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
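The CSR and WCSR computations described above can be sketched in a few lines of Python. This is an illustrative sketch, not the actual MIREX evaluation code: it assumes each annotation is a list of non-overlapping (start, end, label) segments in seconds, and all function names are hypothetical.

```python
def matched_duration(ground_truth, estimate):
    """Total duration over which the two annotations carry the same label."""
    overlap = 0.0
    for g_start, g_end, g_label in ground_truth:
        for e_start, e_end, e_label in estimate:
            if e_label == g_label:
                overlap += max(0.0, min(g_end, e_end) - max(g_start, e_start))
    return overlap

def csr(ground_truth, estimate):
    """Chord symbol recall: matched duration over total annotated duration."""
    total = sum(end - start for start, end, _ in ground_truth)
    return matched_duration(ground_truth, estimate) / total

def wcsr(corpus):
    """Weighted CSR over a corpus of (ground_truth, estimate) pairs:
    pooling matched durations weights each song by its length."""
    matched = sum(matched_duration(gt, est) for gt, est in corpus)
    total = sum(end - start for gt, _ in corpus for start, end, _ in gt)
    return matched / total
```

For example, if the estimate labels the first of four seconds correctly as G:maj but shifts the boundary to C:maj one second early, three of four seconds match and the CSR is 0.75.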
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1,3,5}, is a subset of the interval set of G:7(#9), {1,3,5,b7,#9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead because the interval set of G:7, {1,3,5,b7}, is also a subset of that of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- &lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
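The mapping rule above, choosing the vocabulary quality whose interval set is the largest subset of the input label's intervals, can be sketched as follows. The interval sets are hard-coded toy data covering only a few qualities, and the names are illustrative; this is not the full MIREX chord-syntax parser.

```python
# Interval sets for two hypothetical vocabularies (subset of the five used).
VOCABULARIES = {
    "majmin": {"maj": {"1", "3", "5"}, "min": {"1", "b3", "5"}},
    "sevenths": {"maj": {"1", "3", "5"}, "min": {"1", "b3", "5"},
                 "7": {"1", "3", "5", "b7"}, "maj7": {"1", "3", "5", "7"},
                 "min7": {"1", "b3", "5", "b7"}},
}

def map_quality(intervals, vocabulary):
    """Return the vocabulary quality whose interval set is the largest
    subset of the given intervals, or None if no quality matches."""
    candidates = [(len(ivs), quality)
                  for quality, ivs in VOCABULARIES[vocabulary].items()
                  if ivs <= intervals]
    return max(candidates)[1] if candidates else None
```

With the G:7(#9) intervals {1,3,5,b7,#9}, the major-minor vocabulary yields "maj" and the seventh-chord vocabulary yields "7", matching the worked example above.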
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding for each annotated segment the maximally overlapping segment in the other annotation, and then summing the differences (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
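The directional Hamming distance and the derived segmentation scores can be sketched as below. Segmentations are assumed to be lists of (start, end) pairs covering the same duration; the assignment of the two directions to over- and under-segmentation follows one common convention and may be stated the other way round in parts of the literature, so treat it as an assumption.

```python
def directional_hamming(seg_a, seg_b):
    """For each segment in seg_a, find the maximally overlapping segment in
    seg_b and sum the parts of the seg_a segment that it does not cover."""
    distance = 0.0
    for a_start, a_end in seg_a:
        best = max(max(0.0, min(a_end, b_end) - max(a_start, b_start))
                   for b_start, b_end in seg_b)
        distance += (a_end - a_start) - best
    return distance

def segmentation_scores(truth, estimate):
    """Return (1 - over-segmentation, 1 - under-segmentation, harmonic mean),
    each scaled so that 1.0 is best and 0.0 is worst."""
    duration = truth[-1][1] - truth[0][0]
    # Splitting true segments inflates the truth-to-estimate distance
    # (over-segmentation); merging them inflates the reverse direction.
    over = directional_hamming(truth, estimate) / duration
    under = directional_hamming(estimate, truth) / duration
    a, b = 1.0 - over, 1.0 - under
    harmonic = 2 * a * b / (a + b) if a + b else 0.0
    return a, b, harmonic
```

For instance, an estimate that needlessly splits one true segment in half scores below 1.0 on the over-segmentation measure while keeping a perfect under-segmentation score.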
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
! Abstract&lt;br /&gt;
! Contributors&lt;br /&gt;
|-&lt;br /&gt;
| CB3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB3.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CB4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB4.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CF2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CF2.pdf PDF]&lt;br /&gt;
| Chris Cannam, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Mark Levy, Massimiliano Zanoni, Dan Stowell &amp;amp; Luís A. Figueira&lt;br /&gt;
|-&lt;br /&gt;
| KO1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO1.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| KO2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO2.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| NG1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG1.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NG2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG2.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NMSD1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD1.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt Mcvicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| NMSD2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD2.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt Mcvicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| PP3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP3.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| PP4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP4.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| SB8&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/SB8.pdf PDF] &lt;br /&gt;
| Nikolaas Steenbergen &amp;amp; John Ashley Burgoyne&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
All figures can be interpreted as percentages and range from 0 (worst) to 100 (best). The table is sorted on WCSR for the major-minor vocabulary. Algorithms that performed training during the evaluation are marked with an asterisk; all others were submitted pre-trained.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;2013/ace/billboard13.csv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
More detailed information about the performance of the algorithms, including per-song performance and the breakdown of the WCSR calculations, is available from this archive:&lt;br /&gt;
&lt;br /&gt;
* [https://music-ir.org/mirex/results/2013/ace/BillboardTest2013.zip BillboardTest2013.zip]&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
The recognition output and the ground-truth files are available from this archive:&lt;br /&gt;
&lt;br /&gt;
* [https://music-ir.org/mirex/results/2013/ace/BillboardTest2013Output.zip BillboardTest2013Output.zip]&lt;br /&gt;
&lt;br /&gt;
We hope to generate a graphical comparison of all algorithms against the ground truth early in 2014.&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_Billboard_2012&amp;diff=9888</id>
		<title>2013:Audio Chord Estimation Results Billboard 2012</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_Billboard_2012&amp;diff=9888"/>
		<updated>2013-11-30T18:33:45Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: Created page with &amp;quot;==Introduction==  This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for an abridged version ...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for an abridged version of the ''Billboard'' dataset from McGill University, including a representative sample of American popular music from the 1950s through the 1990s, as used for MIREX 2012.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, because this is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1,3,5}, is a subset of the interval set of G:7(#9), {1,3,5,b7,#9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead because the interval set of G:7, {1,3,5,b7}, is also a subset of that of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- &lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding for each annotated segment the maximally overlapping segment in the other annotation, and then summing the differences (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
! Abstract&lt;br /&gt;
! Contributors&lt;br /&gt;
|-&lt;br /&gt;
| CB3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB3.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CB4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB4.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CF2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CF2.pdf PDF]&lt;br /&gt;
| Chris Cannam, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Mark Levy, Massimiliano Zanoni, Dan Stowell &amp;amp; Luís A. Figueira&lt;br /&gt;
|-&lt;br /&gt;
| KO1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO1.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| KO2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO2.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| NG1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG1.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NG2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG2.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NMSD1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD1.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt Mcvicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| NMSD2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD2.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt Mcvicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| PP3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP3.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| PP4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP4.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| SB8&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/SB8.pdf PDF] &lt;br /&gt;
| Nikolaas Steenbergen &amp;amp; John Ashley Burgoyne&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
All figures can be interpreted as percentages and range from 0 (worst) to 100 (best). The table is sorted on WCSR for the major-minor vocabulary. Algorithms that performed training during the evaluation are marked with an asterisk; all others were submitted pre-trained.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;2013/ace/billboard12.csv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
More detailed information about the performance of the algorithms, including per-song performance and the breakdown of the WCSR calculations, is available from this archive:&lt;br /&gt;
&lt;br /&gt;
* [https://music-ir.org/mirex/results/2013/ace/BillboardTest2012.zip BillboardTest2012.zip]&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
The recognition output and the ground-truth files are available from this archive:&lt;br /&gt;
&lt;br /&gt;
* [https://music-ir.org/mirex/results/2013/ace/BillboardTest2012Output.zip BillboardTest2012Output.zip]&lt;br /&gt;
&lt;br /&gt;
We hope to generate a graphical comparison of all algorithms against the ground truth early in 2014.&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9887</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9887"/>
		<updated>2013-11-30T18:29:38Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Algorithmic Output */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, because this is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1,3,5}, is a subset of the interval set of G:7(#9), {1,3,5,b7,#9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead because the interval set of G:7, {1,3,5,b7}, is also a subset of that of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- &lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding, for each annotated segment, the maximally overlapping segment in the other annotation, and then summing the durations that remain unmatched (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
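The directional computation can be sketched as follows, a minimal illustration in which annotations are bare `(start, end)` boundary pairs and the names are our own; which direction counts as over- versus under-segmentation follows the convention of the cited literature.

```python
def directional_hamming(segs_a, segs_b):
    """For each segment in segs_a, find the single maximally overlapping
    segment in segs_b and sum the duration left unmatched, normalised by
    the total duration (0.0 means the segmentations align perfectly).
    Segments are (start, end) pairs spanning the same time range."""
    unmatched = 0.0
    for a_start, a_end in segs_a:
        best = max(min(a_end, b_end) - max(a_start, b_start)
                   for b_start, b_end in segs_b)
        unmatched += (a_end - a_start) - max(best, 0.0)
    return unmatched / (segs_a[-1][1] - segs_a[0][0])

reference = [(0, 4), (4, 8)]
estimate = [(0, 2), (2, 4), (4, 8)]   # splits the first reference segment
# One direction penalises the extra boundary, the other does not; both are
# reported as 1 minus the distance so that 1.0 is best.
print(1 - directional_hamming(reference, estimate))   # 0.75
print(1 - directional_hamming(estimate, reference))   # 1.0
```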
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
! Abstract&lt;br /&gt;
! Contributors&lt;br /&gt;
|-&lt;br /&gt;
| CB3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB3.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CB4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB4.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CF2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CF2.pdf PDF]&lt;br /&gt;
| Chris Cannam, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Mark Levy, Massimiliano Zanoni, Dan Stowell &amp;amp; Luís A. Figueira&lt;br /&gt;
|-&lt;br /&gt;
| KO1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO1.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| KO2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO2.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| NG1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG1.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NG2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG2.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NMSD1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD1.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt McVicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| NMSD2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD2.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt McVicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| PP3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP3.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| PP4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP4.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| SB8&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/SB8.pdf PDF] &lt;br /&gt;
| Nikolaas Steenbergen &amp;amp; John Ashley Burgoyne&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
All figures can be interpreted as percentages and range from 0 (worst) to 100 (best). The table is sorted on WCSR for the major-minor vocabulary. Algorithms marked with an asterisk were trained as part of the evaluation; all others were submitted pre-trained.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;2013/ace/mirex09.csv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
More detailed information about the performance of the algorithms, including per-song performance and the breakdown of the WCSR calculations, is available from this archive:&lt;br /&gt;
&lt;br /&gt;
* [https://music-ir.org/mirex/results/2013/ace/MirexChord2009.zip MirexChord2009.zip]&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
The recognition output and the ground-truth files are available from this archive:&lt;br /&gt;
&lt;br /&gt;
* [https://music-ir.org/mirex/results/2013/ace/MirexChord2009Output.zip MirexChord2009Output.zip]&lt;br /&gt;
&lt;br /&gt;
We hope to generate a graphical comparison of all algorithms against the ground truth early in 2014.&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9886</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9886"/>
		<updated>2013-11-30T18:25:40Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Complete Results */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead treat the ground-truth and estimated annotations as continuous segmentations of the audio, which is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1, 3, 5}, is a subset of the interval set of G:7(#9), {1, 3, 5, b7, #9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead, because the interval set of G:7, {1, 3, 5, b7}, is also a subset of that of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- &lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding, for each annotated segment, the maximally overlapping segment in the other annotation, and then summing the durations that remain unmatched (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
! Abstract&lt;br /&gt;
! Contributors&lt;br /&gt;
|-&lt;br /&gt;
| CB3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB3.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CB4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB4.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CF2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CF2.pdf PDF]&lt;br /&gt;
| Chris Cannam, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Mark Levy, Massimiliano Zanoni, Dan Stowell &amp;amp; Luís A. Figueira&lt;br /&gt;
|-&lt;br /&gt;
| KO1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO1.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| KO2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO2.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| NG1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG1.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NG2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG2.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NMSD1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD1.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt McVicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| NMSD2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD2.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt McVicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| PP3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP3.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| PP4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP4.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| SB8&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/SB8.pdf PDF] &lt;br /&gt;
| Nikolaas Steenbergen &amp;amp; John Ashley Burgoyne&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
All figures can be interpreted as percentages and range from 0 (worst) to 100 (best). The table is sorted on WCSR for the major-minor vocabulary. Algorithms marked with an asterisk were trained as part of the evaluation; all others were submitted pre-trained.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;2013/ace/mirex09.csv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
More detailed information about the performance of the algorithms, including per-song performance and the breakdown of the WCSR calculations, is available from this archive:&lt;br /&gt;
&lt;br /&gt;
* [https://music-ir.org/mirex/results/2013/ace/MirexChord2009.zip MirexChord2009.zip]&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9885</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9885"/>
		<updated>2013-11-30T18:24:21Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Complete Results */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead treat the ground-truth and estimated annotations as continuous segmentations of the audio, which is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1, 3, 5}, is a subset of the interval set of G:7(#9), {1, 3, 5, b7, #9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead, because the interval set of G:7, {1, 3, 5, b7}, is also a subset of that of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- &lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding, for each annotated segment, the maximally overlapping segment in the other annotation, and then summing the durations that remain unmatched (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
! Abstract&lt;br /&gt;
! Contributors&lt;br /&gt;
|-&lt;br /&gt;
| CB3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB3.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CB4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB4.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CF2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CF2.pdf PDF]&lt;br /&gt;
| Chris Cannam, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Mark Levy, Massimiliano Zanoni, Dan Stowell &amp;amp; Luís A. Figueira&lt;br /&gt;
|-&lt;br /&gt;
| KO1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO1.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| KO2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO2.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| NG1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG1.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NG2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG2.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NMSD1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD1.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt McVicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| NMSD2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD2.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt McVicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| PP3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP3.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| PP4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP4.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| SB8&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/SB8.pdf PDF] &lt;br /&gt;
| Nikolaas Steenbergen &amp;amp; John Ashley Burgoyne&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
All figures can be interpreted as percentages and range from 0 (worst) to 100 (best). The table is sorted on WCSR for the major-minor vocabulary. Algorithms marked with an asterisk were trained as part of the evaluation; all others were submitted pre-trained.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;2013/ace/mirex09.csv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
More detailed information about the performance of the algorithms, including per-song performance and the breakdown of the WCSR calculations, is available from this archive:&lt;br /&gt;
&lt;br /&gt;
* [[http://&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9884</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9884"/>
		<updated>2013-11-30T18:08:02Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Advanced chord vocabularies */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead treat the ground-truth and estimated annotations as continuous segmentations of the audio, which is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1, 3, 5}, is a subset of the interval set of G:7(#9), {1, 3, 5, b7, #9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead, because the interval set of G:7, {1, 3, 5, b7}, is also a subset of that of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- &lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding, for each annotated segment, the maximally overlapping segment in the other annotation, and then summing the durations that remain unmatched (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
! Abstract&lt;br /&gt;
! Contributors&lt;br /&gt;
|-&lt;br /&gt;
| CB3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB3.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CB4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB4.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CF2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CF2.pdf PDF]&lt;br /&gt;
| Chris Cannam, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Mark Levy, Massimiliano Zanoni, Dan Stowell &amp;amp; Luís A. Figueira&lt;br /&gt;
|-&lt;br /&gt;
| KO1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO1.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| KO2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO2.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| NG1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG1.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NG2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG2.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NMSD1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD1.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt McVicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| NMSD2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD2.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt McVicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| PP3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP3.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| PP4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP4.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| SB8&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/SB8.pdf PDF] &lt;br /&gt;
| Nikolaas Steenbergen &amp;amp; John Ashley Burgoyne&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
All figures can be interpreted as percentages and range from 0 (worst) to 100 (best). The table is sorted on WCSR for the major-minor vocabulary. Algorithms that were trained as part of the evaluation are marked with an asterisk; all others were submitted pre-trained.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;2013/ace/mirex09.csv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9883</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9883"/>
		<updated>2013-11-30T18:07:35Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Submissions */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, which is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
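The segment-based CSR and its duration-weighted average can be sketched as follows. This is an illustrative Python sketch, not the MIREX evaluation code; the function names and the representation of annotations as lists of (start, end, label) tuples are assumptions.&lt;br /&gt;

```python
def csr(reference, estimate):
    """Chord symbol recall: duration where the labels agree, divided by
    the total duration of the reference annotation."""
    total = sum(end - start for start, end, _ in reference)
    matched = 0.0
    for r_start, r_end, r_label in reference:
        for e_start, e_end, e_label in estimate:
            overlap = min(r_end, e_end) - max(r_start, e_start)
            if overlap > 0 and r_label == e_label:
                matched += overlap
    return matched / total

def wcsr(songs):
    """Weight each song's CSR by its duration; songs is a list of
    (reference, estimate) annotation pairs."""
    durations = [sum(e - s for s, e, _ in ref) for ref, _ in songs]
    weighted = sum(csr(ref, est) * dur
                   for (ref, est), dur in zip(songs, durations))
    return weighted / sum(durations)
```

Because both annotations are segmentations, the double loop only ever finds one matching segment pair per region of the timeline, so summing overlaps gives the total matching duration directly.&lt;br /&gt;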
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1, 3, 5}, is a subset of the interval set of G:7(#9), {1, 3, 5, b7, #9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead, because the interval set of G:7, {1, 3, 5, b7}, is also a subset of that of G:7(#9) and is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- style=&amp;quot;background: yellow&amp;quot;&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
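The largest-subset mapping rule described above can be sketched in a few lines of Python. This is only an illustration: the vocabulary dictionaries, the string encoding of intervals, and the function name are all hypothetical, and only a handful of qualities are shown (root and bass are assumed to have matched already).&lt;br /&gt;

```python
# Interval sets for a few chord qualities, keyed by quality name.
MAJMIN = {"maj": {"1", "3", "5"}, "min": {"1", "b3", "5"}}
SEVENTHS = dict(MAJMIN, **{
    "7":    {"1", "3", "5", "b7"},
    "maj7": {"1", "3", "5", "7"},
    "min7": {"1", "b3", "5", "b7"},
})

def map_quality(intervals, vocabulary):
    """Return the vocabulary quality whose interval set is the largest
    subset of the chord's intervals, or None if no quality applies."""
    candidates = [(len(ivs), q) for q, ivs in vocabulary.items()
                  if ivs <= intervals]
    return max(candidates)[1] if candidates else None

# G:7(#9) has the interval set {1, 3, 5, b7, #9}:
g7sharp9 = {"1", "3", "5", "b7", "#9"}
```

With the major-minor vocabulary, `map_quality(g7sharp9, MAJMIN)` selects maj; with the seventh-chord vocabulary it selects 7, matching the G:7(#9) example above.&lt;br /&gt;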
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding for each annotated segment the maximally overlapping segment in the other annotation, and then summing the differences (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
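A minimal sketch of the directional Hamming distance and the derived scores, assuming each segmentation is a list of (start, end) pairs covering the song; the function names are illustrative and the assignment of the two directions to over- and under-segmentation follows the description above.&lt;br /&gt;

```python
def directional_hamming(seg_a, seg_b):
    """For each segment of seg_a, find the maximally overlapping segment
    of seg_b and sum the part of seg_a's segment left uncovered."""
    dist = 0.0
    for a_start, a_end in seg_a:
        max_overlap = max(min(a_end, b_end) - max(a_start, b_start)
                          for b_start, b_end in seg_b)
        dist += (a_end - a_start) - max(0.0, max_overlap)
    return dist

def segmentation_scores(reference, estimate):
    """Return (1 - over-segmentation, 1 - under-segmentation, harmonic mean),
    each scaled so that 1.0 is best and 0.0 is worst."""
    total = reference[-1][1] - reference[0][0]
    # Reference segments fragmented by the estimate -> over-segmentation.
    over = directional_hamming(reference, estimate) / total
    # Estimated segments merging reference segments -> under-segmentation.
    under = directional_hamming(estimate, reference) / total
    a, b = 1.0 - over, 1.0 - under
    return a, b, (2 * a * b / (a + b) if a + b > 0 else 0.0)
```

For example, an estimate that splits a single reference segment in half scores 0.5 on the over-segmentation measure and 1.0 on the under-segmentation measure.&lt;br /&gt;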
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
! Abstract&lt;br /&gt;
! Contributors&lt;br /&gt;
|-&lt;br /&gt;
| CB3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB3.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CB4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CB4.pdf PDF]&lt;br /&gt;
| Taemin Cho &amp;amp; Juan P. Bello&lt;br /&gt;
|-&lt;br /&gt;
| CF2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/CF2.pdf PDF]&lt;br /&gt;
| Chris Cannam, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Mark Levy, Massimiliano Zanoni, Dan Stowell &amp;amp; Luís A. Figueira&lt;br /&gt;
|-&lt;br /&gt;
| KO1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO1.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| KO2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/KO2.pdf PDF]&lt;br /&gt;
| Maksim Khadkevich &amp;amp; Maurizio Omologo&lt;br /&gt;
|-&lt;br /&gt;
| NG1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG1.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NG2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NG2.pdf PDF]&lt;br /&gt;
| Nikolay Glazyrin&lt;br /&gt;
|-&lt;br /&gt;
| NMSD1&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD1.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt McVicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| NMSD2&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/NMSD2.pdf PDF]&lt;br /&gt;
| Yizhao Ni, Matt McVicar, Raul Santos-Rodriguez &amp;amp; Tijl De Bie&lt;br /&gt;
|-&lt;br /&gt;
| PP3&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP3.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| PP4&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/PP4.pdf PDF] &lt;br /&gt;
| Johan Pauwels &amp;amp; Geoffroy Peeters&lt;br /&gt;
|-&lt;br /&gt;
| SB8&lt;br /&gt;
| style=&amp;quot;text-align: center;&amp;quot; | [https://www.music-ir.org/mirex/abstracts/2013/SB8.pdf PDF] &lt;br /&gt;
| Nikolaas Steenbergen &amp;amp; John Ashley Burgoyne&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
All figures can be interpreted as percentages and range from 0 (worst) to 100 (best). The table is sorted on WCSR for the major-minor vocabulary. Algorithms that were trained as part of the evaluation are marked with an asterisk; all others were submitted pre-trained.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;2013/ace/mirex09.csv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9882</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9882"/>
		<updated>2013-11-30T17:42:53Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Summary */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, which is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1, 3, 5}, is a subset of the interval set of G:7(#9), {1, 3, 5, b7, #9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead, because the interval set of G:7, {1, 3, 5, b7}, is also a subset of that of G:7(#9) and is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- style=&amp;quot;background: yellow&amp;quot;&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding for each annotated segment the maximally overlapping segment in the other annotation, and then summing the differences (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
All figures can be interpreted as percentages and range from 0 (worst) to 100 (best). The table is sorted on WCSR for the major-minor vocabulary. Algorithms that were trained as part of the evaluation are marked with an asterisk; all others were submitted pre-trained.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;2013/ace/mirex09.csv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9881</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9881"/>
		<updated>2013-11-30T17:42:12Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Summary */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, which is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1, 3, 5}, is a subset of the interval set of G:7(#9), {1, 3, 5, b7, #9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead, because the interval set of G:7, {1, 3, 5, b7}, is also a subset of that of G:7(#9) and is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- style=&amp;quot;background: yellow&amp;quot;&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding for each annotated segment the maximally overlapping segment in the other annotation, and then summing the differences (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
All figures can be interpreted as percentages and range from 0 (worst) to 100 (best). The table is sorted on WCSR for the major-minor vocabulary.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;2013/ace/mirex09.csv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9880</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9880"/>
		<updated>2013-11-30T16:28:04Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Summary */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, which is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1, 3, 5}, is a subset of the interval set of G:7(#9), {1, 3, 5, b7, #9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead, because the interval set of G:7, {1, 3, 5, b7}, is also a subset of that of G:7(#9) and is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- style=&amp;quot;background: yellow&amp;quot;&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding for each annotated segment the maximally overlapping segment in the other annotation, and then summing the differences (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;2013/ace/mirex09.csv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9879</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9879"/>
		<updated>2013-11-30T16:27:53Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Summary */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX approximated the CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, because this is (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
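As an informal sketch of the segment-based recall described above, the following reimplements CSR and its duration weighting over lists of (start, end, label) tuples. The tuple representation and function names are our own illustration, not the actual NEMA evaluation code:

```python
def segment_csr(ref, est):
    """Chord symbol recall for one song, computed on continuous segments.

    ref, est: lists of (start, end, label) tuples covering the song.
    Returns the duration where labels agree divided by total duration.
    """
    matched = 0.0
    for rs, re, rlab in ref:
        for es, ee, elab in est:
            if rlab == elab:
                # Overlap of the two intervals, zero when disjoint.
                matched += max(0.0, min(re, ee) - max(rs, es))
    total = ref[-1][1] - ref[0][0]
    return matched / total

def wcsr(songs):
    """Duration-weighted CSR over (ref, est) pairs for several songs,
    so that longer songs contribute proportionally more."""
    num = 0.0
    den = 0.0
    for ref, est in songs:
        dur = ref[-1][1] - ref[0][0]
        num += segment_csr(ref, est) * dur
        den += dur
    return num / den
```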
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the interval structure of the output label is the largest subset of the input label's interval structure that the vocabulary allows. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1,3,5}, is a subset of the interval set of G:7(#9), {1,3,5,b7,#9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead, because the interval set of G:7, {1,3,5,b7}, is also a subset of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
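The largest-subset rule in the G:7(#9) example can be sketched as follows; the interval spellings, dictionaries, and function name are illustrative assumptions, not MIREX code:

```python
# Two hypothetical vocabularies, each quality given as its interval set.
MAJMIN = {'maj': {'1', '3', '5'}, 'min': {'1', 'b3', '5'}}
SEVENTHS = {'maj': {'1', '3', '5'}, 'min': {'1', 'b3', '5'},
            '7': {'1', '3', '5', 'b7'}, 'min7': {'1', 'b3', '5', 'b7'},
            'maj7': {'1', '3', '5', '7'}}

def map_quality(intervals, vocab):
    """Return the vocabulary quality whose interval set is the largest
    subset of the input label's intervals, or None if nothing matches."""
    best = None
    for name, ivs in vocab.items():
        if ivs.issubset(intervals) and (best is None or len(ivs) > len(vocab[best])):
            best = name
    return best

# G:7(#9) has intervals {1, 3, 5, b7, #9}: against MAJMIN it maps to
# 'maj', while against SEVENTHS the larger subset '7' wins.
G7_SHARP9 = {'1', '3', '5', 'b7', '#9'}
```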
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- style=&amp;quot;background: yellow&amp;quot;&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq. (%)&lt;br /&gt;
! Cum. Freq. (%)&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding, for each annotated segment, the maximally overlapping segment in the other annotation, and then summing the portions of each segment that fall outside its best match (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
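A rough sketch of the directional Hamming distance and the derived scores follows. The (start, end) list representation and, in particular, which direction is labelled over- versus under-segmentation are our assumptions here, not the official implementation:

```python
def directional_hamming(a, b):
    """Directional Hamming distance between two segmentations.

    a, b: lists of (start, end) pairs covering the same time span.
    Each segment of a keeps only its single maximally overlapping
    segment in b; everything left over counts toward the distance.
    """
    dist = 0.0
    for s, e in a:
        best = 0.0
        for s2, e2 in b:
            best = max(best, min(e, e2) - max(s, s2))
        dist += (e - s) - best
    return dist

def segmentation_scores(ref, est):
    """Return (1 - over-segmentation, 1 - under-segmentation, harmonic mean).

    Assumed orientation: an over-segmented estimate splinters each
    reference segment, which inflates the distance taken from the
    reference side.
    """
    total = ref[-1][1] - ref[0][0]
    over = 1.0 - directional_hamming(ref, est) / total
    under = 1.0 - directional_hamming(est, ref) / total
    return over, under, 2.0 * over * under / (over + under)
```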
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;2013/ace/mirex09.cv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9878</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9878"/>
		<updated>2013-11-30T16:25:26Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Summary */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX approximated the CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, because this is (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the interval structure of the output label is the largest subset of the input label's interval structure that the vocabulary allows. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1,3,5}, is a subset of the interval set of G:7(#9), {1,3,5,b7,#9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead, because the interval set of G:7, {1,3,5,b7}, is also a subset of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- style=&amp;quot;background: yellow&amp;quot;&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq. (%)&lt;br /&gt;
! Cum. Freq. (%)&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding, for each annotated segment, the maximally overlapping segment in the other annotation, and then summing the portions of each segment that fall outside its best match (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;2013/ace/mirex09.csv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9877</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9877"/>
		<updated>2013-11-30T16:25:03Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Summary */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX approximated the CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, because this is (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the interval structure of the output label is the largest subset of the input label's interval structure that the vocabulary allows. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1,3,5}, is a subset of the interval set of G:7(#9), {1,3,5,b7,#9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead, because the interval set of G:7, {1,3,5,b7}, is also a subset of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- style=&amp;quot;background: yellow&amp;quot;&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq. (%)&lt;br /&gt;
! Cum. Freq. (%)&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding, for each annotated segment, the maximally overlapping segment in the other annotation, and then summing the portions of each segment that fall outside its best match (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;results/2013/ace/mirex09.csv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9876</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9876"/>
		<updated>2013-11-30T11:30:42Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Summary */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX approximated the CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, because this is (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the interval structure of the output label is the largest subset of the input label's interval structure that the vocabulary allows. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1,3,5}, is a subset of the interval set of G:7(#9), {1,3,5,b7,#9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead, because the interval set of G:7, {1,3,5,b7}, is also a subset of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- style=&amp;quot;background: yellow&amp;quot;&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq. (%)&lt;br /&gt;
! Cum. Freq. (%)&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding, for each annotated segment, the maximally overlapping segment in the other annotation, and then summing the portions of each segment that fall outside its best match (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;https://www.sugarsync.com/pf/D7802793_74307221_203340&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9875</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9875"/>
		<updated>2013-11-29T23:48:50Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Summary */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX approximated the CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, because this is (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight the CSR by the length of the song. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the interval structure of the output label is the largest subset of the input label's interval structure that the vocabulary allows. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1,3,5}, is a subset of the interval set of G:7(#9), {1,3,5,b7,#9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead, because the interval set of G:7, {1,3,5,b7}, is also a subset of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- style=&amp;quot;background: yellow&amp;quot;&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq. (%)&lt;br /&gt;
! Cum. Freq. (%)&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding, for each annotated segment, the maximally overlapping segment in the other annotation, and then summing the portions of each segment that fall outside its best match (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;csv&amp;gt;2013/ace/mirex09.csv&amp;lt;/csv&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9874</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9874"/>
		<updated>2013-11-29T23:28:25Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Advanced chord vocabularies */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, which is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight each song&amp;rsquo;s CSR by the length of the song when averaging over the test set. The resulting figure is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
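As a minimal sketch of the two measures above (assuming annotations are plain lists of (start, end, label) tuples; all names here are illustrative, not part of any MIREX tool):

```python
# Hypothetical sketch of segment-based CSR and length-weighted WCSR.
# Annotations are lists of (start, end, label) tuples in seconds.

def csr(reference, estimate):
    """Chord symbol recall: duration where labels match, divided by
    the total duration of the reference annotation."""
    total = sum(end - start for start, end, _ in reference)
    matched = 0.0
    for r_start, r_end, r_label in reference:
        for e_start, e_end, e_label in estimate:
            overlap = min(r_end, e_end) - max(r_start, e_start)
            if overlap > 0 and r_label == e_label:
                matched += overlap
    return matched / total

def wcsr(pieces):
    """Average CSR over a corpus of (reference, estimate) pairs,
    weighting each piece by its length."""
    lengths = [sum(e - s for s, e, _ in ref) for ref, _ in pieces]
    weighted = sum(csr(ref, est) * n for (ref, est), n in zip(pieces, lengths))
    return weighted / sum(lengths)
```

Intersecting the two segmentations directly, as here, avoids the 10 ms sampling grid entirely.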
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor;&lt;br /&gt;
# Seventh chords;&lt;br /&gt;
# Major and minor with inversions; and&lt;br /&gt;
# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1,3,5}, is a subset of the interval set of G:7(#9), {1,3,5,b7,#9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead because the interval set of G:7, {1,3,5,b7}, is also a subset of that of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
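The subset-based mapping just described can be sketched as follows (a hypothetical illustration: chord qualities are modelled as sets of interval names relative to the root, and the two vocabularies shown are deliberately small, not exhaustive):

```python
# Hypothetical sketch of the largest-subset vocabulary mapping.
MAJMIN = {"maj": {"1", "3", "5"}, "min": {"1", "b3", "5"}}
SEVENTHS = dict(MAJMIN,
                **{"7": {"1", "3", "5", "b7"},
                   "min7": {"1", "b3", "5", "b7"},
                   "maj7": {"1", "3", "5", "7"}})

def map_quality(intervals, vocabulary):
    """Return the vocabulary quality whose interval set is the largest
    subset of the input intervals, or None if no quality fits."""
    candidates = [(len(ivs), name) for name, ivs in vocabulary.items()
                  if ivs.issubset(intervals)]
    if not candidates:
        return None
    return max(candidates)[1]
```

With the G:7(#9) intervals {1,3,5,b7,#9}, this maps the quality to maj under the major/minor vocabulary and to 7 under the seventh-chord vocabulary, matching the example above.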
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- style=&amp;quot;background: yellow&amp;quot;&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding for each annotated segment the maximally overlapping segment in the other annotation, and then summing the differences (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
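A minimal sketch of these segmentation measures (names are illustrative; segmentations are lists of (start, end) pairs covering the piece, and our reading of which direction corresponds to over- vs. under-segmentation is an assumption based on the cited papers):

```python
# Hypothetical sketch of the directional Hamming distance and the
# derived segmentation scores, scaled so that 1.0 is best.

def directional_hamming(seg_a, seg_b):
    """For each segment of seg_a, find the maximally overlapping
    segment of seg_b and sum the duration it fails to cover."""
    distance = 0.0
    for a_start, a_end in seg_a:
        best = max(min(a_end, b_end) - max(a_start, b_start)
                   for b_start, b_end in seg_b)
        distance += (a_end - a_start) - max(best, 0.0)
    return distance

def segmentation_scores(reference, estimate):
    """Return (1 - over-segmentation, 1 - under-segmentation) and
    their harmonic mean."""
    total = sum(end - start for start, end in reference)
    d_over = directional_hamming(reference, estimate) / total
    d_under = directional_hamming(estimate, reference) / total
    s_over, s_under = 1.0 - d_over, 1.0 - d_under
    harmonic = 2.0 * s_over * s_under / (s_over + s_under)
    return s_over, s_under, harmonic
```

For example, if the estimate splits a single reference segment in two, the reference-to-estimate direction registers the fragmentation (over-segmentation) while the other direction stays at 1.0.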
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9873</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9873"/>
		<updated>2013-11-29T23:27:33Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Advanced chord vocabularies */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, which is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight each song&amp;rsquo;s CSR by the length of the song when averaging over the test set. The resulting figure is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
&lt;br /&gt;
## Chord root note only;&lt;br /&gt;
## Major and minor;&lt;br /&gt;
## Seventh chords;&lt;br /&gt;
## Major and minor with inversions; and&lt;br /&gt;
## Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1,3,5}, is a subset of the interval set of G:7(#9), {1,3,5,b7,#9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead because the interval set of G:7, {1,3,5,b7}, is also a subset of that of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- style=&amp;quot;background: yellow&amp;quot;&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding for each annotated segment the maximally overlapping segment in the other annotation, and then summing the differences (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9872</id>
		<title>2013:MIREX2013 Results</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9872"/>
		<updated>2013-11-29T23:26:58Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Other Tasks */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==OVERALL RESULTS POSTERS &amp;lt;!--(First Version: Will need updating as last runs are completed)--&amp;gt;==&lt;br /&gt;
&lt;br /&gt;
This page is under construction. &lt;br /&gt;
&lt;br /&gt;
[https://www.music-ir.org/mirex/results/2013/mirex_2013_poster.pdf MIREX 2013 Overall Results Posters (PDF)]&lt;br /&gt;
&lt;br /&gt;
==Results by Task ==&lt;br /&gt;
&lt;br /&gt;
===Train-Test Task Set===&lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/composer_report/ Audio Classical Composer Identification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/latin_report/ Audio Latin Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mood_report/index.html Audio Music Mood Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mixed_report/ Audio Mixed Popular Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
===Other Tasks===&lt;br /&gt;
&lt;br /&gt;
* Audio Beat Tracking Results &lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/dav/ DAV Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/maz/ MAZ Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/mck/ MCK Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Audio Chord Detection Results (will add more results soon)&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/mrx09/index.html MIREX &amp;amp;rsquo;09 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/bill/index.html Billboard &amp;amp;rsquo;12 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_MIREX_2009 | MIREX &amp;amp;rsquo;09 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_Billboard_2012 | Billboard &amp;amp;rsquo;12 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_Billboard_2013 | Billboard &amp;amp;rsquo;13 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/akd/ Audio Key Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Melody Extraction Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/adc04/  ADC04 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx05/ MIREX05 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/ind08/ INDIAN08 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_0db/ MIREX09 0dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_m5db/ MIREX09 -5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_p5db/ MIREX09 +5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Audio_Music_Similarity_and_Retrieval_Results | Audio Music Similarity and Retrieval Results]] &lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/aod/ Audio Onset Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Tag Classification Results&lt;br /&gt;
** Major Miner Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
** Mood Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ate/ Audio Tempo Estimation Results] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Multiple_Fundamental_Frequency_Estimation_&amp;amp;_Tracking_Results | Multiple Fundamental Frequency Estimation &amp;amp; Tracking Results]]&lt;br /&gt;
* Music Structure Segmentation Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx09/ MIREX09 dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_1/ RWC dataset - Quaero (MIREX10) Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_2/ RWC dataset - Original RWC Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/sal/ SALAMI dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Singing/Humming Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1_hidden/  Hidden Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1a_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1b_thinkit/ ThinkIt Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1c_ioacas/ IOACAS Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Tapping Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_hsiao/ HSIAO Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
*[[2013:Real-time_Audio_to_Score_Alignment_(a.k.a._Score_Following)_Results | Real-time Audio to Score Alignment (a.k.a. Score Following) Results ]]&lt;br /&gt;
* [[2013:Symbolic_Melodic_Similarity_Results | Symbolic Melodic Similarity Results]]&lt;br /&gt;
* [[2013:Discovery of Repeated Themes &amp;amp; Sections Results | Discovery of Repeated Themes &amp;amp; Sections Results]]&lt;br /&gt;
* [[2013:Audio Cover Song Identification Results]]&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9870</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9870"/>
		<updated>2013-11-29T23:26:21Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: moved 2013:Audio Chord Estimation Results MIREX2009 to 2013:Audio Chord Estimation Results MIREX 2009: Missing space&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, which is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight each song&amp;rsquo;s CSR by the length of the song when averaging over the test set. The resulting figure is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
&lt;br /&gt;
*# Chord root note only;&lt;br /&gt;
*# Major and minor;&lt;br /&gt;
*# Seventh chords;&lt;br /&gt;
*# Major and minor with inversions; and&lt;br /&gt;
*# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1,3,5}, is a subset of the interval set of G:7(#9), {1,3,5,b7,#9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead because the interval set of G:7, {1,3,5,b7}, is also a subset of that of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- style=&amp;quot;background: yellow&amp;quot;&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding for each annotated segment the maximally overlapping segment in the other annotation, and then summing the differences (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX2009&amp;diff=9871</id>
		<title>2013:Audio Chord Estimation Results MIREX2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX2009&amp;diff=9871"/>
		<updated>2013-11-29T23:26:21Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: moved 2013:Audio Chord Estimation Results MIREX2009 to 2013:Audio Chord Estimation Results MIREX 2009: Missing space&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[2013:Audio Chord Estimation Results MIREX 2009]]&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9869</id>
		<title>2013:Audio Chord Estimation Results MIREX 2009</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation_Results_MIREX_2009&amp;diff=9869"/>
		<updated>2013-11-29T23:25:22Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: Created page with &amp;quot;==Introduction==  This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics datas...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Introduction==&lt;br /&gt;
&lt;br /&gt;
This year, we have started a new evaluation battery for audio chord estimation. This page contains the results of these new evaluations for the Isophonics dataset, a.k.a. the MIREX 2009 dataset. It comprises the collected Beatles, Queen, and Zweieck datasets from Queen Mary, University of London, and has been used for audio chord estimation in MIREX for many years.&lt;br /&gt;
&lt;br /&gt;
==Why evaluate differently?==&lt;br /&gt;
&lt;br /&gt;
* Researchers interested in automatic chord estimation have been dissatisfied with the traditional evaluation techniques used for this task at MIREX.&lt;br /&gt;
&lt;br /&gt;
* Numerous alternatives have been proposed in the literature (Harte, 2010; Mauch, 2010; Pauwels &amp;amp; Peeters, 2013). &lt;br /&gt;
&lt;br /&gt;
* At ISMIR 2010 in Utrecht, a group discussed alternatives and developed the [[The_Utrecht_Agreement_on_Chord_Evaluation | Utrecht Agreement]] for updating the task, but until this year, nobody had implemented any of the suggestions.&lt;br /&gt;
&lt;br /&gt;
==What’s new?==&lt;br /&gt;
&lt;br /&gt;
===More precise recall estimation===&lt;br /&gt;
&lt;br /&gt;
* MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth: the total duration of segments where the predictions match the ground truth divided by the total duration of the song. &lt;br /&gt;
&lt;br /&gt;
* In previous years, MIREX has used an approximate CSR by sampling both the ground-truth and the automatic annotations every 10 ms.&lt;br /&gt;
&lt;br /&gt;
* Following Harte (2010), we instead view the ground-truth and estimated annotations as continuous segmentations of the audio, which is both (1) more precise and (2) more computationally efficient. &lt;br /&gt;
&lt;br /&gt;
* Moreover, because pieces of music come in a wide variety of lengths, we believe it is better to weight each song&amp;rsquo;s CSR by the length of the song when averaging over the test set. The resulting figure is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
===Advanced chord vocabularies===&lt;br /&gt;
&lt;br /&gt;
* We computed WCSR with five different chord vocabulary mappings: &lt;br /&gt;
&lt;br /&gt;
*# Chord root note only;&lt;br /&gt;
*# Major and minor;&lt;br /&gt;
*# Seventh chords;&lt;br /&gt;
*# Major and minor with inversions; and&lt;br /&gt;
*# Seventh chords with inversions. &lt;br /&gt;
&lt;br /&gt;
* With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. &lt;br /&gt;
&lt;br /&gt;
* A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. &lt;br /&gt;
&lt;br /&gt;
* For instance, in the major and minor case, G:7(#9) is mapped to G:maj because the interval set of G:maj, {1,3,5}, is a subset of the interval set of G:7(#9), {1,3,5,b7,#9}. In the seventh-chord case, G:7(#9) is mapped to G:7 instead because the interval set of G:7, {1,3,5,b7}, is also a subset of that of G:7(#9) but is larger than that of G:maj.&lt;br /&gt;
&lt;br /&gt;
* Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus of American popular music (Burgoyne et al., 2011).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Most Frequent Chord Qualities in the ''Billboard'' Corpus&lt;br /&gt;
|- style=&amp;quot;background: yellow&amp;quot;&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq.&lt;br /&gt;
! Cum. Freq.&lt;br /&gt;
|-&lt;br /&gt;
| maj&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 52&lt;br /&gt;
|-&lt;br /&gt;
| min&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 13&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 65&lt;br /&gt;
|-&lt;br /&gt;
| 7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 10&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 75&lt;br /&gt;
|-&lt;br /&gt;
| min7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 8&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 83&lt;br /&gt;
|-&lt;br /&gt;
| maj7&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 3&lt;br /&gt;
| align=&amp;quot;right&amp;quot;| 86&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Evaluation of segmentation===&lt;br /&gt;
&lt;br /&gt;
* The chord transcription literature includes several other evaluation metrics, which mainly focus on the segmentation of the transcription.&lt;br /&gt;
&lt;br /&gt;
* We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding for each annotated segment the maximally overlapping segment in the other annotation, and then summing the differences (Abdallah et al., 2005; Mauch, 2010). &lt;br /&gt;
&lt;br /&gt;
* Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. To keep the scaling consistent with WCSR values (1.0 is best and 0.0 is worst), we report 1 – over-segmentation and 1 – under-segmentation, as well as the harmonic mean of these values (cf. Harte, 2010).&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Submissions==&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
==Results==&lt;br /&gt;
&lt;br /&gt;
===Summary===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Comparative Statistics===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Complete Results===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;br /&gt;
&lt;br /&gt;
===Algorithmic Output===&lt;br /&gt;
&lt;br /&gt;
* ''coming soon...''&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9868</id>
		<title>2013:MIREX2013 Results</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9868"/>
		<updated>2013-11-29T20:41:15Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Other Tasks */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==OVERALL RESULTS POSTERS &amp;lt;!--(First Version: Will need updating as last runs are completed)--&amp;gt;==&lt;br /&gt;
&lt;br /&gt;
This page is under construction. &lt;br /&gt;
&lt;br /&gt;
[https://www.music-ir.org/mirex/results/2013/mirex_2013_poster.pdf MIREX 2013 Overall Results Posters (PDF)]&lt;br /&gt;
&lt;br /&gt;
==Results by Task ==&lt;br /&gt;
&lt;br /&gt;
===Train-Test Task Set===&lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/composer_report/ Audio Classical Composer Identification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/latin_report/ Audio Latin Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mood_report/index.html Audio Music Mood Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mixed_report/ Audio Mixed Popular Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
===Other Tasks===&lt;br /&gt;
&lt;br /&gt;
* Audio Beat Tracking Results &lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/dav/ DAV Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/maz/ MAZ Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/mck/ MCK Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Audio Chord Detection Results (will add more results soon)&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/mrx09/index.html MIREX &amp;amp;rsquo;09 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/bill/index.html Billboard &amp;amp;rsquo;12 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_MIREX2009 | MIREX &amp;amp;rsquo;09 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_Billboard2012 | Billboard &amp;amp;rsquo;12 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_Billboard2013 | Billboard &amp;amp;rsquo;13 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/akd/ Audio Key Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Melody Extraction Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/adc04/  ADC04 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx05/ MIREX05 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/ind08/ INDIAN08 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_0db/ MIREX09 0dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_m5db/ MIREX09 -5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_p5db/ MIREX09 +5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Audio_Music_Similarity_and_Retrieval_Results | Audio Music Similarity and Retrieval Results]] &lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/aod/ Audio Onset Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Tag Classification Results&lt;br /&gt;
** Major Miner Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
** Mood Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ate/ Audio Tempo Estimation Results] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Multiple_Fundamental_Frequency_Estimation_&amp;amp;_Tracking_Results | Multiple Fundamental Frequency Estimation &amp;amp; Tracking Results]]&lt;br /&gt;
* Music Structure Segmentation Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx09/ MIREX09 dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_1/ RWC dataset - Quaero (MIREX10) Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_2/ RWC dataset - Original RWC Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/sal/ SALAMI dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Singing/Humming Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1_hidden/  Hidden Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1a_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1b_thinkit/ ThinkIt Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1c_ioacas/ IOACAS Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Tapping Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_hsiao/ HSIAO Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Real-time_Audio_to_Score_Alignment_(a.k.a._Score_Following)_Results | Real-time Audio to Score Alignment (a.k.a. Score Following) Results ]]&lt;br /&gt;
* [[2013:Symbolic_Melodic_Similarity_Results | Symbolic Melodic Similarity Results]]&lt;br /&gt;
* [[2013:Discovery of Repeated Themes &amp;amp; Sections Results | Discovery of Repeated Themes &amp;amp; Sections Results]]&lt;br /&gt;
* [[2013:Audio Cover Song Identification Results]]&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9867</id>
		<title>2013:MIREX2013 Results</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9867"/>
		<updated>2013-11-29T20:41:00Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Other Tasks */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==OVERALL RESULTS POSTERS &amp;lt;!--(First Version: Will need updating as last runs are completed)--&amp;gt;==&lt;br /&gt;
&lt;br /&gt;
This page is under construction. &lt;br /&gt;
&lt;br /&gt;
[https://www.music-ir.org/mirex/results/2013/mirex_2013_poster.pdf MIREX 2013 Overall Results Posters (PDF)]&lt;br /&gt;
&lt;br /&gt;
==Results by Task ==&lt;br /&gt;
&lt;br /&gt;
===Train-Test Task Set===&lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/composer_report/ Audio Classical Composer Identification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/latin_report/ Audio Latin Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mood_report/index.html Audio Music Mood Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mixed_report/ Audio Mixed Popular Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
===Other Tasks===&lt;br /&gt;
&lt;br /&gt;
* Audio Beat Tracking Results &lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/dav/ DAV Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/maz/ MAZ Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/mck/ MCK Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Audio Chord Detection Results (will add more results soon)&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/mrx09/index.html MIREX &amp;amp;rsquo;09 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/bill/index.html Billboard &amp;amp;rsquo;12 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_MIREX2009 | MIREX &amp;amp;rsquo;09 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_Billboard2012 | Billboard &amp;amp;rsquo;12 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_Billboard2013 | Billboard &amp;amp;rsquo;13 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/akd/ Audio Key Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Melody Extraction Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/adc04/  ADC04 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx05/ MIREX05 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/ind08/ INDIAN08 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_0db/ MIREX09 0dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_m5db/ MIREX09 -5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_p5db/ MIREX09 +5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Audio_Music_Similarity_and_Retrieval_Results | Audio Music Similarity and Retrieval Results]] &lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/aod/ Audio Onset Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Tag Classification Results&lt;br /&gt;
** Major Miner Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
** Mood Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ate/ Audio Tempo Estimation Results] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Multiple_Fundamental_Frequency_Estimation_&amp;amp;_Tracking_Results | Multiple Fundamental Frequency Estimation &amp;amp; Tracking Results]]&lt;br /&gt;
* Music Structure Segmentation Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx09/ MIREX09 dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_1/ RWC dataset - Quaero (MIREX10) Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_2/ RWC dataset - Original RWC Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/sal/ SALAMI dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Singing/Humming Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1_hidden/  Hidden Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1a_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1b_thinkit/ ThinkIt Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1c_ioacas/ IOACAS Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Tapping Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_hsiao/ HSIAO Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Real-time_Audio_to_Score_Alignment_(a.k.a._Score_Following)_Results | Real-time Audio to Score Alignment (a.k.a. Score Following) Results ]]&lt;br /&gt;
* [[2013:Symbolic_Melodic_Similarity_Results | Symbolic Melodic Similarity Results]]&lt;br /&gt;
* [[2013:Discovery of Repeated Themes &amp;amp; Sections Results | Discovery of Repeated Themes &amp;amp; Sections Results]]&lt;br /&gt;
* [[2013:Audio Cover Song Identification Results]]&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9866</id>
		<title>2013:MIREX2013 Results</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9866"/>
		<updated>2013-11-29T20:40:39Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Other Tasks */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==OVERALL RESULTS POSTERS &amp;lt;!--(First Version: Will need updating as last runs are completed)--&amp;gt;==&lt;br /&gt;
&lt;br /&gt;
This page is under construction. &lt;br /&gt;
&lt;br /&gt;
[https://www.music-ir.org/mirex/results/2013/mirex_2013_poster.pdf MIREX 2013 Overall Results Posters (PDF)]&lt;br /&gt;
&lt;br /&gt;
==Results by Task ==&lt;br /&gt;
&lt;br /&gt;
===Train-Test Task Set===&lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/composer_report/ Audio Classical Composer Identification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/latin_report/ Audio Latin Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mood_report/index.html Audio Music Mood Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mixed_report/ Audio Mixed Popular Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
===Other Tasks===&lt;br /&gt;
&lt;br /&gt;
* Audio Beat Tracking Results &lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/dav/ DAV Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/maz/ MAZ Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/mck/ MCK Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Audio Chord Detection Results (will add more results soon)&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/mrx09/index.html MIREX &amp;amp;rsquo;09 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/bill/index.html Billboard &amp;amp;rsquo;12 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_MIREX2009 | MIREX &amp;amp;rsquo;09 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_Billboard2012 | Billboard &amp;amp;rsquo;12 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
** [[2013:Audio_Chord_Estimation_Results_Billboard2013 | Billboard &amp;amp;rsquo;13 Dataset]] &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/akd/ Audio Key Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Melody Extraction Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/adc04/  ADC04 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx05/ MIREX05 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/ind08/ INDIAN08 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_0db/ MIREX09 0dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_m5db/ MIREX09 -5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_p5db/ MIREX09 +5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Audio_Music_Similarity_and_Retrieval_Results | Audio Music Similarity and Retrieval Results]] &lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/aod/ Audio Onset Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Tag Classification Results&lt;br /&gt;
** Major Miner Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
** Mood Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ate/ Audio Tempo Estimation Results] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Multiple_Fundamental_Frequency_Estimation_&amp;amp;_Tracking_Results | Multiple Fundamental Frequency Estimation &amp;amp; Tracking Results]]&lt;br /&gt;
* Music Structure Segmentation Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx09/ MIREX09 dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_1/ RWC dataset - Quaero (MIREX10) Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_2/ RWC dataset - Original RWC Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/sal/ SALAMI dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Singing/Humming Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1_hidden/  Hidden Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1a_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1b_thinkit/ ThinkIt Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1c_ioacas/ IOACAS Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Tapping Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_hsiao/ HSIAO Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Real-time_Audio_to_Score_Alignment_(a.k.a._Score_Following)_Results | Real-time Audio to Score Alignment (a.k.a. Score Following) Results ]]&lt;br /&gt;
* [[2013:Symbolic_Melodic_Similarity_Results | Symbolic Melodic Similarity Results]]&lt;br /&gt;
* [[2013:Discovery of Repeated Themes &amp;amp; Sections Results | Discovery of Repeated Themes &amp;amp; Sections Results]]&lt;br /&gt;
* [[2013:Audio Cover Song Identification Results]]&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9865</id>
		<title>2013:MIREX2013 Results</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9865"/>
		<updated>2013-11-06T23:46:19Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Other Tasks */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==OVERALL RESULTS POSTERS &amp;lt;!--(First Version: Will need updating as last runs are completed)--&amp;gt;==&lt;br /&gt;
&lt;br /&gt;
This page is under construction. &lt;br /&gt;
&lt;br /&gt;
[https://www.music-ir.org/mirex/results/2013/mirex_2013_poster.pdf MIREX 2013 Overall Results Posters (PDF)]&lt;br /&gt;
&lt;br /&gt;
==Results by Task ==&lt;br /&gt;
&lt;br /&gt;
===Train-Test Task Set===&lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/composer_report/ Audio Classical Composer Identification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/latin_report/ Audio Latin Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mood_report/index.html Audio Music Mood Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mixed_report/ Audio Mixed Popular Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
===Other Tasks===&lt;br /&gt;
&lt;br /&gt;
* Audio Beat Tracking Results &lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/dav/ DAV Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/maz/ MAZ Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/mck/ MCK Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Audio Chord Detection Results (will add more results soon)&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/mrx09/index.html MIREX &amp;amp;rsquo;09 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/bill/index.html Billboard &amp;amp;rsquo;12 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/akd/ Audio Key Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Melody Extraction Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/adc04/  ADC04 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx05/ MIREX05 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/ind08/ INDIAN08 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_0db/ MIREX09 0dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_m5db/ MIREX09 -5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_p5db/ MIREX09 +5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Audio_Music_Similarity_and_Retrieval_Results | Audio Music Similarity and Retrieval Results]] &lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/aod/ Audio Onset Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Tag Classification Results&lt;br /&gt;
** Major Miner Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
** Mood Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ate/ Audio Tempo Estimation Results] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Multiple_Fundamental_Frequency_Estimation_&amp;amp;_Tracking_Results | Multiple Fundamental Frequency Estimation &amp;amp; Tracking Results]]&lt;br /&gt;
* Music Structure Segmentation Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx09/ MIREX09 dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_1/ RWC dataset - Quaero (MIREX10) Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_2/ RWC dataset - Original RWC Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/sal/ SALAMI dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Singing/Humming Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1_hidden/  Hidden Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1a_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1b_thinkit/ ThinkIt Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1c_ioacas/ IOACAS Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Tapping Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_hsiao/ HSIAO Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Real-time_Audio_to_Score_Alignment_(a.k.a._Score_Following)_Results | Real-time Audio to Score Alignment (a.k.a. Score Following) Results ]]&lt;br /&gt;
* [[2013:Symbolic_Melodic_Similarity_Results | Symbolic Melodic Similarity Results]]&lt;br /&gt;
* [[2013:Discovery of Repeated Themes &amp;amp; Sections Results | Discovery of Repeated Themes &amp;amp; Sections Results]]&lt;br /&gt;
* [[2013:Audio Cover Song Identification Results]]&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9864</id>
		<title>2013:MIREX2013 Results</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:MIREX2013_Results&amp;diff=9864"/>
		<updated>2013-11-06T23:45:21Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Other Tasks */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==OVERALL RESULTS POSTERS &amp;lt;!--(First Version: Will need updating as last runs are completed)--&amp;gt;==&lt;br /&gt;
&lt;br /&gt;
This page is under construction. &lt;br /&gt;
&lt;br /&gt;
[https://www.music-ir.org/mirex/results/2013/mirex_2013_poster.pdf MIREX 2013 Overall Results Posters (PDF)]&lt;br /&gt;
&lt;br /&gt;
==Results by Task ==&lt;br /&gt;
&lt;br /&gt;
===Train-Test Task Set===&lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/composer_report/ Audio Classical Composer Identification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/latin_report/ Audio Latin Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mood_report/index.html Audio Music Mood Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
* [https://www.music-ir.org/nema_out/mirex2013/results/act/mixed_report/ Audio Mixed Popular Genre Classification Results ]&amp;amp;nbsp;&amp;amp;nbsp; &lt;br /&gt;
===Other Tasks===&lt;br /&gt;
&lt;br /&gt;
* Audio Beat Tracking Results &lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/dav/ DAV Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/maz/ MAZ Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/abt/mck/ MCK Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Audio Chord Detection Results (will add more results soon)&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/mrx09/index.html MIREX '09 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ace/bill/index.html Billboard '12 Dataset (old style)]  &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/akd/ Audio Key Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Melody Extraction Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/adc04/  ADC04 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx05/ MIREX05 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/ind08/ INDIAN08 Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_0db/ MIREX09 0dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_m5db/ MIREX09 -5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ame/mrx09_p5db/ MIREX09 +5dB Dataset] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Audio_Music_Similarity_and_Retrieval_Results | Audio Music Similarity and Retrieval Results]] &lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/aod/ Audio Onset Detection Results] &amp;amp;nbsp;&lt;br /&gt;
* Audio Tag Classification Results&lt;br /&gt;
** Major Miner Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask1_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
** Mood Tag dataset&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/bin/ Binary relevance (classification evaluation)] &amp;amp;nbsp;&lt;br /&gt;
*** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/atg/subtask2_report/aff/ Affinity estimation evaluation] &amp;amp;nbsp;&lt;br /&gt;
* [https://nema.lis.illinois.edu/nema_out/mirex2013/results/ate/ Audio Tempo Estimation Results] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Multiple_Fundamental_Frequency_Estimation_&amp;amp;_Tracking_Results | Multiple Fundamental Frequency Estimation &amp;amp; Tracking Results]]&lt;br /&gt;
* Music Structure Segmentation Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx09/ MIREX09 dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_1/ RWC dataset - Quaero (MIREX10) Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/mrx10_2/ RWC dataset - Original RWC Ground-truth] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/struct/sal/ SALAMI dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Singing/Humming Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1_hidden/  Hidden Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1a_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1b_thinkit/ ThinkIt Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task1c_ioacas/ IOACAS Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbsh/qbsh_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
* Query-by-Tapping Results&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_jang/  Jang Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task1_hsiao/ HSIAO Dataset] &amp;amp;nbsp;&lt;br /&gt;
** [https://nema.lis.illinois.edu/nema_out/mirex2013/results/qbt/qbt_task2_jang/ Subtask2 Dataset] &amp;amp;nbsp;&lt;br /&gt;
* [[2013:Real-time_Audio_to_Score_Alignment_(a.k.a._Score_Following)_Results | Real-time Audio to Score Alignment (a.k.a. Score Following) Results]]&lt;br /&gt;
* [[2013:Symbolic_Melodic_Similarity_Results | Symbolic Melodic Similarity Results]]&lt;br /&gt;
* [[2013:Discovery of Repeated Themes &amp;amp; Sections Results | Discovery of Repeated Themes &amp;amp; Sections Results]]&lt;br /&gt;
* [[2013:Audio Cover Song Identification Results]]&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation&amp;diff=9557</id>
		<title>2013:Audio Chord Estimation</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation&amp;diff=9557"/>
		<updated>2013-09-09T13:29:07Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: /* Bibliography */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Description =&lt;br /&gt;
&lt;br /&gt;
This task requires participants to extract or transcribe a sequence of chords from an audio music recording. For many applications in music information retrieval, extracting the harmonic structure of an audio track is very desirable, for example for segmenting pieces into characteristic segments, for finding similar pieces, or for semantic analysis of music. The extraction of the harmonic structure requires the estimation of a sequence of chords that is as precise as possible. This includes the full characterisation of chords – root, quality, and bass note – as well as their chronological order, including specific onset times and durations. Audio chord estimation has a long history in MIREX, and readers interested in this history, especially with respect to evaluation methodology, should review the work of Christopher Harte (2010), Pauwels and Peeters (2013), and the [https://www.music-ir.org/mirex/wiki/The_Utrecht_Agreement_on_Chord_Evaluation “Utrecht Agreement”] on evaluation metrics.&lt;br /&gt;
&lt;br /&gt;
= Data =&lt;br /&gt;
&lt;br /&gt;
Two datasets are used to evaluate chord transcription accuracy.&lt;br /&gt;
&lt;br /&gt;
; Isophonics&lt;br /&gt;
: The collected Beatles, Queen, and Zweieck datasets from the Centre for Digital Music at Queen Mary, University of London (http://www.isophonics.net/), as used for Audio Chord Estimation in MIREX for many years. Available from http://www.isophonics.net/. See also Matthias Mauch’s dissertation (2010) and Harte et al.’s introductory paper (2005).&lt;br /&gt;
; Billboard&lt;br /&gt;
: An abridged version of the ''Billboard'' dataset from McGill University, including a representative sample of American popular music from the 1950s through the 1990s. Available from http://billboard.music.mcgill.ca. See also Ashley Burgoyne’s dissertation (2012) and Burgoyne et al.’s introductory paper (2011). Parsing tools for the data are available from http://hackage.haskell.org/package/billboard-parser/ and documented by De Haas and Burgoyne (2012).&lt;br /&gt;
&lt;br /&gt;
== Training and Testing ==&lt;br /&gt;
&lt;br /&gt;
The training and testing divisions differ for the two data sets. The Isophonics data have been available publicly for so long that it no longer makes sense to offer a separate training phase; as such, the entire data set will be used for testing, as in previous years. In contrast, in order to support MIREX, a portion of the ''Billboard'' ground truth has been withheld from the public. Submissions may train on all of the songs that have been publicly released so far: the MIREX servers have access to the ground-truth annotations and the original audio. Whether trained or not, all submissions will be tested against a fresh set of 200 songs that have never been released publicly.&lt;br /&gt;
&lt;br /&gt;
The ground-truth files contain one line per unique chord, in the form &amp;lt;code&amp;gt;{start_time end_time chord}&amp;lt;/code&amp;gt;, e.g.,&lt;br /&gt;
&amp;lt;pre&amp;gt;...&lt;br /&gt;
41.2631021 44.2456460 B&lt;br /&gt;
44.2456460 45.7201230 E&lt;br /&gt;
45.7201230 47.2061900 E:7/3&lt;br /&gt;
47.2061900 48.6922670 A&lt;br /&gt;
48.6922670 50.1551240 A:min/b3&lt;br /&gt;
...&amp;lt;/pre&amp;gt;&lt;br /&gt;
Start and end times are in seconds from the start of the file. Chord labels follow the syntax proposed by C. Harte et al. (2005). Please note that the syntax has changed slightly since it was originally described; in particular, the root is no longer implied as a voiced element of a chord, so a C major chord (notes C, E, and G) should be written C:(1,3,5) instead of just C:(3,5) if using the interval-list representation. As before, the labels C and C:maj are equivalent to C:(1,3,5).&lt;br /&gt;
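As a minimal sketch, this three-column format can be read by splitting each line on whitespace. The function name parse_lab and the inline example string are our own illustrations, not part of any official MIREX tool:

```python
# Parse MIREX-style chord annotations: "start_time end_time chord_label",
# one segment per line, times in seconds from the start of the file.
# (Illustrative sketch; parse_lab is our own name, not a MIREX utility.)

def parse_lab(text):
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:  # skip blank or malformed lines
            start, end, label = parts
            segments.append((float(start), float(end), label))
    return segments

example = """41.2631021 44.2456460 B
44.2456460 45.7201230 E
45.7201230 47.2061900 E:7/3"""

for segment in parse_lab(example):
    print(segment)
```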
&lt;br /&gt;
= Evaluation =&lt;br /&gt;
&lt;br /&gt;
To evaluate the quality of an automatic transcription, a transcription is compared to ground truth created by one or more human annotators. MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\textrm{CSR} =   \frac{\textrm{total duration of segments where annotation equals estimation}}  {\textrm{total duration of annotated segments}}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In previous years, MIREX has used an approximate CSR calculated by sampling both the ground-truth and the automatic annotations every 10 ms and dividing the number of correctly annotated samples by the total number of samples. Following Christopher Harte (2010, §8.1.2), however, we can view the ground-truth and estimated annotations as continuous segmentations of the audio and calculate the CSR by considering the cumulative length of the correctly overlapping segments. This way of calculating the CSR is more precise, as the precision of the frame-based method is limited by the frame length, and computationally more efficient, as it reduces the number of segment comparisons. Because pieces of music come in a wide variety of lengths, we will weight the CSR by the length of the song when computing an average for a given corpus. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
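The segment-based calculation can be sketched as follows, assuming each annotation is a list of (start, end, label) tuples as in the ground-truth format above. This is an illustrative implementation, not the official MIREX evaluation code:

```python
# Segment-based chord symbol recall (CSR): the total duration over which
# the estimated label equals the ground-truth label, divided by the total
# annotated duration. Illustrative sketch only.

def csr(reference, estimate):
    """Each argument is a list of (start, end, label) tuples."""
    correct = 0.0
    for r_start, r_end, r_label in reference:
        for e_start, e_end, e_label in estimate:
            if e_label == r_label:
                # duration of the overlap between the two segments
                overlap = min(r_end, e_end) - max(r_start, e_start)
                correct += max(0.0, overlap)
    total = sum(end - start for start, end, _ in reference)
    return correct / total if total > 0 else 0.0
```

Weighting each song's CSR by its annotated duration before averaging over a corpus then gives the WCSR.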
&lt;br /&gt;
== Chord Vocabularies ==&lt;br /&gt;
&lt;br /&gt;
[chord-eval]&lt;br /&gt;
&lt;br /&gt;
We propose a set of single-chord evaluation measures for MIREX that extends previous iterations of MIREX and combines them with evaluation measures proposed in the literature, providing a more complete assessment of transcription quality. Following Pauwels and Peeters (2013), we suggest using the CSR with five different chord vocabulary mappings.&lt;br /&gt;
&lt;br /&gt;
In each of these calculations, the full chord descriptions of either the estimated or the ground-truth transcriptions, which might contain complex chord annotations, would be mapped to the following classes:&lt;br /&gt;
&lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor: {&amp;lt;code&amp;gt;N, maj, min&amp;lt;/code&amp;gt;};&lt;br /&gt;
# Seventh chords: {&amp;lt;code&amp;gt;N, maj, min, maj7, min7, 7&amp;lt;/code&amp;gt;};&lt;br /&gt;
# Major and minor with inversions: {&amp;lt;code&amp;gt;N, maj, min, maj/3, min/b3, maj/5, min/5&amp;lt;/code&amp;gt;}; or&lt;br /&gt;
# Seventh chords with inversions: {&amp;lt;code&amp;gt;N, maj, min, maj7, min7, 7, maj/3, min/b3, maj7/3, min7/b3, 7/3, maj/5, min/5, maj7/5, min7/5, 7/5, maj7/7, min7/b7, 7/b7&amp;lt;/code&amp;gt;}.&lt;br /&gt;
&lt;br /&gt;
With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. For instance, in the major and minor case, &amp;lt;code&amp;gt;G:7(#9)&amp;lt;/code&amp;gt; is mapped to &amp;lt;code&amp;gt;G:maj&amp;lt;/code&amp;gt; because the interval set of &amp;lt;code&amp;gt;G:maj&amp;lt;/code&amp;gt;, {&amp;lt;code&amp;gt;1,3,5&amp;lt;/code&amp;gt;}, is a subset of the interval set of &amp;lt;code&amp;gt;G:7(#9)&amp;lt;/code&amp;gt;, {&amp;lt;code&amp;gt;1,3,5,b7,#9&amp;lt;/code&amp;gt;}. In the seventh-chord case, &amp;lt;code&amp;gt;G:7(#9)&amp;lt;/code&amp;gt; is mapped to &amp;lt;code&amp;gt;G:7&amp;lt;/code&amp;gt; instead, because the interval set of &amp;lt;code&amp;gt;G:7&amp;lt;/code&amp;gt;, {&amp;lt;code&amp;gt;1, 3, 5, b7&amp;lt;/code&amp;gt;}, is also a subset of &amp;lt;code&amp;gt;G:7(#9)&amp;lt;/code&amp;gt; but is larger than that of &amp;lt;code&amp;gt;G:maj&amp;lt;/code&amp;gt;. If a chord cannot be represented by a certain class, e.g., mapping a &amp;lt;code&amp;gt;D:aug&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F:sus4(9)&amp;lt;/code&amp;gt; to {&amp;lt;code&amp;gt;maj, min&amp;lt;/code&amp;gt;}, the chord is excluded from the evaluation if it occurs in the ground truth, and it is considered a mismatch if it occurs in an estimated annotation.&lt;br /&gt;
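The largest-subset rule can be sketched as follows. The interval-set dictionaries are an illustrative fragment of the major/minor and seventh-chord vocabularies, and the function name map_quality is our own, not part of the MIREX evaluator:

```python
# Map a chord's interval set to the vocabulary entry whose interval set
# is the largest subset of the chord's, per the rule described above.
# Illustrative fragment; not the full MIREX vocabulary tables.

MAJMIN = {
    "maj": {"1", "3", "5"},
    "min": {"1", "b3", "5"},
}
SEVENTHS = dict(MAJMIN)
SEVENTHS.update({
    "7": {"1", "3", "5", "b7"},
    "maj7": {"1", "3", "5", "7"},
    "min7": {"1", "b3", "5", "b7"},
})

def map_quality(intervals, vocabulary):
    """Return the vocabulary quality whose interval set is the largest
    subset of the chord's intervals, or None if no subset exists."""
    best = None
    best_size = 0
    for quality, subset in vocabulary.items():
        if subset.issubset(intervals) and len(subset) > best_size:
            best, best_size = quality, len(subset)
    return best

g7sharp9 = {"1", "3", "5", "b7", "#9"}
print(map_quality(g7sharp9, MAJMIN))    # maj
print(map_quality(g7sharp9, SEVENTHS))  # 7
```

A chord such as D:aug, whose intervals {1, 3, #5} contain no vocabulary entry as a subset, yields None and is handled as described above.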
&lt;br /&gt;
{|&lt;br /&gt;
|+ Most frequent chord qualities in the McGill ''Billboard'' corpus.&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq. (%)&lt;br /&gt;
! Cum. Freq. (%)&lt;br /&gt;
|- &lt;br /&gt;
|maj &lt;br /&gt;
|52&lt;br /&gt;
|52&lt;br /&gt;
|-&lt;br /&gt;
|min&lt;br /&gt;
|13&lt;br /&gt;
|65&lt;br /&gt;
|-&lt;br /&gt;
|7&lt;br /&gt;
|10&lt;br /&gt;
|75&lt;br /&gt;
|-&lt;br /&gt;
|min7&lt;br /&gt;
|8&lt;br /&gt;
|83&lt;br /&gt;
|-&lt;br /&gt;
|maj7&lt;br /&gt;
|3&lt;br /&gt;
|86&lt;br /&gt;
|-&lt;br /&gt;
|5&lt;br /&gt;
|2&lt;br /&gt;
|88&lt;br /&gt;
|-&lt;br /&gt;
|1&lt;br /&gt;
|2&lt;br /&gt;
|90&lt;br /&gt;
|-&lt;br /&gt;
|maj(9)&lt;br /&gt;
|1&lt;br /&gt;
|91&lt;br /&gt;
|-&lt;br /&gt;
|maj6&lt;br /&gt;
|1&lt;br /&gt;
|92&lt;br /&gt;
|-&lt;br /&gt;
|sus4&lt;br /&gt;
|1&lt;br /&gt;
|93&lt;br /&gt;
|-&lt;br /&gt;
|sus7&lt;br /&gt;
|1&lt;br /&gt;
|94&lt;br /&gt;
|-&lt;br /&gt;
|sus9&lt;br /&gt;
|1&lt;br /&gt;
|94&lt;br /&gt;
|-&lt;br /&gt;
|7(#9)&lt;br /&gt;
|1&lt;br /&gt;
|95&lt;br /&gt;
|-&lt;br /&gt;
|min9&lt;br /&gt;
|1&lt;br /&gt;
|96&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus (see table above), which is a balanced sample of American popular music from the 1950s through the 1990s (J.A. Burgoyne, Wild, and Fujinaga 2011). Pure major and minor chords alone account for 65 percent of all chords encountered, whereas augmented and diminished triads each account for 0.2 percent or less of the corpus. Our argument for our particular seventh-chord vocabulary, as opposed to the set of all tetrads, follows similar reasoning: our proposed vocabulary accounts for 86 percent of all chords, whereas no other standard type of seventh chord accounts for more than 0.2 percent of the corpus. The table suggests that in future years, we might consider introducing vocabularies that include power chords, and possibly suspended chords or added sixths and ninths as well.&lt;br /&gt;
&lt;br /&gt;
== Chord Segmentation ==&lt;br /&gt;
&lt;br /&gt;
Besides CSR, the chord transcription literature includes several other metrics for evaluating chord transcriptions, which mainly focus on the segmentation of the automatic transcription. We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding, for each annotated segment, the maximally overlapping segment in the other annotation, and then summing the differences (Abdallah et al. 2005; Mauch 2010, §2.3.3). Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. Both directions can be combined to yield an overall quality metric (Christopher Harte 2010, §8.3.2):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;Q = 1 - \frac{\textrm{maximum of directional Hamming distances in     either direction}}      {\textrm{total duration of song}}&amp;lt;/math&amp;gt;&lt;br /&gt;
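A minimal sketch of this combined metric, assuming annotations are given as (start, end) tuples in seconds; again this is illustrative rather than the official evaluation code:

```python
# Directional Hamming distance: for each segment of one annotation, find
# its maximal overlap with any single segment of the other annotation and
# sum what remains uncovered; Q combines both directions as above.
# Illustrative sketch only.

def overlap(a, b):
    """Overlap in seconds between two (start, end) segments."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def directional_hamming(from_segs, to_segs):
    """Sum over from_segs of (segment duration minus its maximal
    overlap with any single segment of to_segs)."""
    total = 0.0
    for seg in from_segs:
        best = max((overlap(seg, other) for other in to_segs), default=0.0)
        total += (seg[1] - seg[0]) - best
    return total

def quality(reference, estimate, song_duration):
    """Q = 1 minus the worse directional Hamming distance, normalised
    by the total duration of the song."""
    worst = max(directional_hamming(reference, estimate),
                directional_hamming(estimate, reference))
    return 1.0 - worst / song_duration
```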
&lt;br /&gt;
= Submission Format =&lt;br /&gt;
&lt;br /&gt;
== Audio Format ==&lt;br /&gt;
&lt;br /&gt;
Audio tracks in the training directory will be encoded as 44.1 kHz, 16-bit mono WAV files.&lt;br /&gt;
&lt;br /&gt;
== I/O Format ==&lt;br /&gt;
&lt;br /&gt;
The algorithms should output text files with a similar format to that used in the ground truth transcriptions. That is to say, they should be flat text files with chord segment labels and times arranged thus:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;start_time end_time chord_label&amp;lt;/pre&amp;gt;&lt;br /&gt;
with elements separated by whitespace, times given in seconds, chord labels corresponding to the syntax described by C. Harte et al. (2005), and one chord segment per line. As in all benchmarks after 2008, end times are a mandatory component of the output. For the evaluation process, we will assume enharmonic equivalence for chord roots. We will no longer accept submissions that wish to be evaluated only on major/minor chords using the number format.&lt;br /&gt;
&lt;br /&gt;
== Command line calling format ==&lt;br /&gt;
&lt;br /&gt;
Submissions have to conform to the specified format below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;extractFeaturesAndTrain  &amp;amp;quot;/path/to/trainFileList.txt&amp;amp;quot;  &amp;amp;quot;/path/to/scratch/dir&amp;amp;quot;  &amp;lt;/pre&amp;gt;&lt;br /&gt;
where &amp;lt;code&amp;gt;trainFileList.txt&amp;lt;/code&amp;gt; contains the path to each WAV file. The features extracted at this stage can be stored under &amp;lt;code&amp;gt;/path/to/scratch/dir&amp;lt;/code&amp;gt;. The ground-truth files for supervised learning will be at the same paths with a &amp;lt;code&amp;gt;.txt&amp;lt;/code&amp;gt; extension appended: for example, for &amp;lt;code&amp;gt;/path/to/trainFile1.wav&amp;lt;/code&amp;gt;, there will be a corresponding ground-truth file called &amp;lt;code&amp;gt;/path/to/trainFile1.wav.txt&amp;lt;/code&amp;gt;. For testing:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;doChordID.sh &amp;amp;quot;/path/to/testFileList.txt&amp;amp;quot; &amp;amp;quot;/path/to/scratch/dir&amp;amp;quot; &amp;amp;quot;/path/to/results/dir&amp;amp;quot; &amp;lt;/pre&amp;gt;&lt;br /&gt;
If there is no training stage, you can ignore the second argument here. In the results directory, there should be one file for each test file, with the same name as the test file plus &amp;lt;code&amp;gt;.txt&amp;lt;/code&amp;gt;. Programs can use their working directory if they need to keep temporary cache files or internal debugging information. Standard output and standard error will be logged.&lt;br /&gt;
&lt;br /&gt;
== Packaging submissions ==&lt;br /&gt;
&lt;br /&gt;
All submissions should be statically linked to all libraries (the presence of dynamically linked libraries cannot be guaranteed). All submissions should include a &amp;lt;code&amp;gt;README&amp;lt;/code&amp;gt; file including the following information:&lt;br /&gt;
&lt;br /&gt;
* Command line calling format for all executables and an example formatted set of commands&lt;br /&gt;
* Number of threads/cores used or whether this should be specified on the command line&lt;br /&gt;
* Expected memory footprint&lt;br /&gt;
* Expected runtime&lt;br /&gt;
* Any required environments (and versions), e.g. Python, Java, bash, MATLAB.&lt;br /&gt;
&lt;br /&gt;
= Time and Hardware limits =&lt;br /&gt;
&lt;br /&gt;
A hard limit of 24 hours will be imposed on runs (total feature extraction and querying times). Submissions that exceed this runtime may not receive a result.&lt;br /&gt;
&lt;br /&gt;
= Discussion =&lt;br /&gt;
&lt;br /&gt;
= Bibliography =&lt;br /&gt;
&lt;br /&gt;
Abdallah, Samer A., Katy Noland, Mark B. Sandler, Michael Casey, and Christophe Rhodes. 2005. “Theory and Evaluation of a Bayesian Music Structure Extractor.” In ''Proceedings of the International Society for Music Information Retrieval Conference'', 420–425.&lt;br /&gt;
&lt;br /&gt;
Burgoyne, J. A., J. Wild, and I. Fujinaga. 2011. “An expert ground truth set for audio chord recognition and music analysis.” In ''Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR)'', 633–638.&lt;br /&gt;
&lt;br /&gt;
Burgoyne, John Ashley. 2012. “Stochastic Processes and Database-Driven Musicology.” Ph.D. diss. Montréal, Québec, Canada: McGill University.&lt;br /&gt;
&lt;br /&gt;
Haas, W. B. de, and John Ashley Burgoyne. 2012. ''Parsing the Billboard Chord Transcriptions''. Technical report UU-CS-2012-018, Department of Information and Computing Sciences, Utrecht University.&lt;br /&gt;
&lt;br /&gt;
Harte, C., M. Sandler, S. Abdallah, and E. Gómez. 2005. “Symbolic representation of musical chords: A proposed syntax for text annotations.” In ''Proceedings of the 6th International Society for Music Information Retrieval Conference (ISMIR)'', 66–71.&lt;br /&gt;
&lt;br /&gt;
Harte, Christopher. 2010. “Towards automatic extraction of harmony information from music signals.” Ph.D. diss. Queen Mary, University of London.&lt;br /&gt;
&lt;br /&gt;
Mauch, Matthias. 2010. “Automatic Chord Transcription from Audio Using Computational Models of Musical Context.” Ph.D. diss. Queen Mary University of London.&lt;br /&gt;
&lt;br /&gt;
Pauwels, Johan, and Geoffroy Peeters. 2013. “Evaluating automatically estimated chord sequences.” In ''Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)''. Vancouver, British Columbia, Canada.&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation&amp;diff=9556</id>
		<title>2013:Audio Chord Estimation</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2013:Audio_Chord_Estimation&amp;diff=9556"/>
		<updated>2013-09-09T13:28:20Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Description =&lt;br /&gt;
&lt;br /&gt;
This task requires participants to extract or transcribe a sequence of chords from an audio music recording. For many applications in music information retrieval, extracting the harmonic structure of an audio track is very desirable, for example for segmenting pieces into characteristic segments, for finding similar pieces, or for semantic analysis of music. The extraction of the harmonic structure requires the estimation of a sequence of chords that is as precise as possible. This includes the full characterisation of chords – root, quality, and bass note – as well as their chronological order, including specific onset times and durations. Audio chord estimation has a long history in MIREX, and readers interested in this history, especially with respect to evaluation methodology, should review the work of Christopher Harte (2010), Pauwels and Peeters (2013), and the [https://www.music-ir.org/mirex/wiki/The_Utrecht_Agreement_on_Chord_Evaluation “Utrecht Agreement”] on evaluation metrics.&lt;br /&gt;
&lt;br /&gt;
= Data =&lt;br /&gt;
&lt;br /&gt;
Two datasets are used to evaluate chord transcription accuracy.&lt;br /&gt;
&lt;br /&gt;
; Isophonics&lt;br /&gt;
: The collected Beatles, Queen, and Zweieck datasets from the Centre for Digital Music at Queen Mary, University of London (http://www.isophonics.net/), as used for Audio Chord Estimation in MIREX for many years. Available from http://www.isophonics.net/. See also Matthias Mauch’s dissertation (2010) and Harte et al.’s introductory paper (2005).&lt;br /&gt;
; Billboard&lt;br /&gt;
: An abridged version of the ''Billboard'' dataset from McGill University, including a representative sample of American popular music from the 1950s through the 1990s. Available from http://billboard.music.mcgill.ca. See also Ashley Burgoyne’s dissertation (2012) and Burgoyne et al.’s introductory paper (2011). Parsing tools for the data are available from http://hackage.haskell.org/package/billboard-parser/ and documented by De Haas and Burgoyne (2012).&lt;br /&gt;
&lt;br /&gt;
== Training and Testing ==&lt;br /&gt;
&lt;br /&gt;
The training and testing divisions differ for the two data sets. The Isophonics data have been available publicly for so long that it no longer makes sense to offer a separate training phase; as such, the entire data set will be used for testing, as in previous years. In contrast, in order to support MIREX, a portion of the ''Billboard'' ground truth has been withheld from the public. Submissions may train on all of the songs that have been publicly released so far: the MIREX servers have access to the ground-truth annotations and the original audio. Whether trained or not, all submissions will be tested against a fresh set of 200 songs that have never been released publicly.&lt;br /&gt;
&lt;br /&gt;
The ground-truth files contain one line per unique chord, in the form &amp;lt;code&amp;gt;{start_time end_time chord}&amp;lt;/code&amp;gt;, e.g.,&lt;br /&gt;
&amp;lt;pre&amp;gt;...&lt;br /&gt;
41.2631021 44.2456460 B&lt;br /&gt;
44.2456460 45.7201230 E&lt;br /&gt;
45.7201230 47.2061900 E:7/3&lt;br /&gt;
47.2061900 48.6922670 A&lt;br /&gt;
48.6922670 50.1551240 A:min/b3&lt;br /&gt;
...&amp;lt;/pre&amp;gt;&lt;br /&gt;
Start and end times are in seconds from the start of the file. Chord labels follow the syntax proposed by C. Harte et al. (2005). Please note that the syntax has changed slightly since it was originally described; in particular, the root is no longer implied as a voiced element of a chord, so a C major chord (notes C, E, and G) should be written C:(1,3,5) instead of just C:(3,5) if using the interval-list representation. As before, the labels C and C:maj are equivalent to C:(1,3,5).&lt;br /&gt;
&lt;br /&gt;
= Evaluation =&lt;br /&gt;
&lt;br /&gt;
To evaluate the quality of an automatic transcription, a transcription is compared to ground truth created by one or more human annotators. MIREX typically uses ''chord symbol recall'' (CSR) to estimate how well the predicted chords match the ground truth:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\textrm{CSR} =   \frac{\textrm{total duration of segments where annotation equals estimation}}  {\textrm{total duration of annotated segments}}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In previous years, MIREX has used an approximate CSR calculated by sampling both the ground-truth and the automatic annotations every 10 ms and dividing the number of correctly annotated samples by the total number of samples. Following Christopher Harte (2010, §8.1.2), however, we can view the ground-truth and estimated annotations as continuous segmentations of the audio and calculate the CSR by considering the cumulative length of the correctly overlapping segments. This way of calculating the CSR is more precise, as the precision of the frame-based method is limited by the frame length, and computationally more efficient, as it reduces the number of segment comparisons. Because pieces of music come in a wide variety of lengths, we will weight the CSR by the length of the song when computing an average for a given corpus. This final number is referred to as the ''weighted chord symbol recall'' (WCSR).&lt;br /&gt;
&lt;br /&gt;
== Chord Vocabularies ==&lt;br /&gt;
&lt;br /&gt;
[chord-eval]&lt;br /&gt;
&lt;br /&gt;
We propose a set of single-chord evaluation measures for MIREX that extends previous iterations of MIREX and combines them with evaluation measures proposed in the literature, providing a more complete assessment of transcription quality. Following Pauwels and Peeters (2013), we suggest using the CSR with five different chord vocabulary mappings.&lt;br /&gt;
&lt;br /&gt;
In each of these calculations, the full chord descriptions of either the estimated or the ground-truth transcriptions, which might contain complex chord annotations, would be mapped to the following classes:&lt;br /&gt;
&lt;br /&gt;
# Chord root note only;&lt;br /&gt;
# Major and minor: {&amp;lt;code&amp;gt;N, maj, min&amp;lt;/code&amp;gt;};&lt;br /&gt;
# Seventh chords: {&amp;lt;code&amp;gt;N, maj, min, maj7, min7, 7&amp;lt;/code&amp;gt;};&lt;br /&gt;
# Major and minor with inversions: {&amp;lt;code&amp;gt;N, maj, min, maj/3, min/b3, maj/5, min/5&amp;lt;/code&amp;gt;}; or&lt;br /&gt;
# Seventh chords with inversions: {&amp;lt;code&amp;gt;N, maj, min, maj7, min7, 7, maj/3, min/b3, maj7/3, min7/b3, 7/3, maj/5, min/5, maj7/5, min7/5, 7/5, maj7/7, min7/b7, 7/b7&amp;lt;/code&amp;gt;}.&lt;br /&gt;
&lt;br /&gt;
With the exception of no-chords, calculating the vocabulary mapping involves examining the root note, the bass note, and the relative interval structure of the chord labels. A mapping exists if both the root notes and bass notes match, and the structure of the output label is the largest possible subset of the input label given the vocabulary. For instance, in the major and minor case, &amp;lt;code&amp;gt;G:7(#9)&amp;lt;/code&amp;gt; is mapped to &amp;lt;code&amp;gt;G:maj&amp;lt;/code&amp;gt; because the interval set of &amp;lt;code&amp;gt;G:maj&amp;lt;/code&amp;gt;, {&amp;lt;code&amp;gt;1,3,5&amp;lt;/code&amp;gt;}, is a subset of the interval set of &amp;lt;code&amp;gt;G:7(#9)&amp;lt;/code&amp;gt;, {&amp;lt;code&amp;gt;1,3,5,b7,#9&amp;lt;/code&amp;gt;}. In the seventh-chord case, &amp;lt;code&amp;gt;G:7(#9)&amp;lt;/code&amp;gt; is mapped to &amp;lt;code&amp;gt;G:7&amp;lt;/code&amp;gt; instead, because the interval set of &amp;lt;code&amp;gt;G:7&amp;lt;/code&amp;gt;, {&amp;lt;code&amp;gt;1, 3, 5, b7&amp;lt;/code&amp;gt;}, is also a subset of &amp;lt;code&amp;gt;G:7(#9)&amp;lt;/code&amp;gt; but is larger than that of &amp;lt;code&amp;gt;G:maj&amp;lt;/code&amp;gt;. If a chord cannot be represented by a certain class, e.g., mapping a &amp;lt;code&amp;gt;D:aug&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F:sus4(9)&amp;lt;/code&amp;gt; to {&amp;lt;code&amp;gt;maj, min&amp;lt;/code&amp;gt;}, the chord is excluded from the evaluation if it occurs in the ground truth, and it is considered a mismatch if it occurs in an estimated annotation.&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
|+ Most frequent chord qualities in the McGill ''Billboard'' corpus.&lt;br /&gt;
! Quality&lt;br /&gt;
! Freq. (%)&lt;br /&gt;
! Cum. Freq. (%)&lt;br /&gt;
|- &lt;br /&gt;
|maj &lt;br /&gt;
|52&lt;br /&gt;
|52&lt;br /&gt;
|-&lt;br /&gt;
|min&lt;br /&gt;
|13&lt;br /&gt;
|65&lt;br /&gt;
|-&lt;br /&gt;
|7&lt;br /&gt;
|10&lt;br /&gt;
|75&lt;br /&gt;
|-&lt;br /&gt;
|min7&lt;br /&gt;
|8&lt;br /&gt;
|83&lt;br /&gt;
|-&lt;br /&gt;
|maj7&lt;br /&gt;
|3&lt;br /&gt;
|86&lt;br /&gt;
|-&lt;br /&gt;
|5&lt;br /&gt;
|2&lt;br /&gt;
|88&lt;br /&gt;
|-&lt;br /&gt;
|1&lt;br /&gt;
|2&lt;br /&gt;
|90&lt;br /&gt;
|-&lt;br /&gt;
|maj(9)&lt;br /&gt;
|1&lt;br /&gt;
|91&lt;br /&gt;
|-&lt;br /&gt;
|maj6&lt;br /&gt;
|1&lt;br /&gt;
|92&lt;br /&gt;
|-&lt;br /&gt;
|sus4&lt;br /&gt;
|1&lt;br /&gt;
|93&lt;br /&gt;
|-&lt;br /&gt;
|sus7&lt;br /&gt;
|1&lt;br /&gt;
|94&lt;br /&gt;
|-&lt;br /&gt;
|sus9&lt;br /&gt;
|1&lt;br /&gt;
|94&lt;br /&gt;
|-&lt;br /&gt;
|7(#9)&lt;br /&gt;
|1&lt;br /&gt;
|95&lt;br /&gt;
|-&lt;br /&gt;
|min9&lt;br /&gt;
|1&lt;br /&gt;
|96&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Our recommendations are motivated by the frequencies of chord qualities in the ''Billboard'' corpus (see table above), which is a balanced sample of American popular music from the 1950s through the 1990s (J.A. Burgoyne, Wild, and Fujinaga 2011). Pure major and minor chords alone account for 65 percent of all chords encountered, whereas augmented and diminished triads each account for 0.2 percent or less of the corpus. Our argument for our particular seventh-chord vocabulary, as opposed to the set of all tetrads, follows similar reasoning: our proposed vocabulary accounts for 86 percent of all chords, whereas no other standard type of seventh chord accounts for more than 0.2 percent of the corpus. The table suggests that in future years, we might consider introducing vocabularies that include power chords, and possibly suspended chords or added sixths and ninths as well.&lt;br /&gt;
&lt;br /&gt;
== Chord Segmentation ==&lt;br /&gt;
&lt;br /&gt;
Besides CSR, the chord transcription literature includes several other metrics for evaluating chord transcriptions, which mainly focus on the segmentation of the automatic transcription. We propose to include the directional Hamming distance in the evaluation. The directional Hamming distance is calculated by finding, for each annotated segment, the maximally overlapping segment in the other annotation, and then summing the differences (Abdallah et al. 2005; Mauch 2010, §2.3.3). Depending on the order of application, the directional Hamming distance yields a measure of over- or under-segmentation. Both directions can be combined to yield an overall quality metric (Christopher Harte 2010, §8.3.2):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;Q = 1 - \frac{\textrm{maximum of directional Hamming distances in     either direction}}      {\textrm{total duration of song}}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Submission Format =&lt;br /&gt;
&lt;br /&gt;
== Audio Format ==&lt;br /&gt;
&lt;br /&gt;
Audio tracks in the training directory will be encoded as 44.1 kHz, 16-bit mono WAV files.&lt;br /&gt;
&lt;br /&gt;
== I/O Format ==&lt;br /&gt;
&lt;br /&gt;
The algorithms should output text files with a similar format to that used in the ground truth transcriptions. That is to say, they should be flat text files with chord segment labels and times arranged thus:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;start_time end_time chord_label&amp;lt;/pre&amp;gt;&lt;br /&gt;
with elements separated by whitespace, times given in seconds, chord labels corresponding to the syntax described by C. Harte et al. (2005), and one chord segment per line. As in all benchmarks after 2008, end times are a mandatory component of the output. For the evaluation process, we will assume enharmonic equivalence for chord roots. We will no longer accept submissions that wish to be evaluated only on major/minor chords using the number format.&lt;br /&gt;
&lt;br /&gt;
== Command line calling format ==&lt;br /&gt;
&lt;br /&gt;
Submissions have to conform to the specified format below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;extractFeaturesAndTrain  &amp;amp;quot;/path/to/trainFileList.txt&amp;amp;quot;  &amp;amp;quot;/path/to/scratch/dir&amp;amp;quot;  &amp;lt;/pre&amp;gt;&lt;br /&gt;
where &amp;lt;code&amp;gt;trainFileList.txt&amp;lt;/code&amp;gt; contains the path to each WAV file. The features extracted at this stage can be stored under &amp;lt;code&amp;gt;/path/to/scratch/dir&amp;lt;/code&amp;gt;. The ground truth files for the supervised learning will be in the same path with a &amp;lt;code&amp;gt;.txt&amp;lt;/code&amp;gt; extension at the end. For example, for &amp;lt;code&amp;gt;/path/to/trainFile1.wav&amp;lt;/code&amp;gt;, there will be a corresponding ground truth file called &amp;lt;code&amp;gt;/path/to/trainFile1.wav.txt&amp;lt;/code&amp;gt;. For testing:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;doChordID.sh &amp;amp;quot;/path/to/testFileList.txt&amp;amp;quot; &amp;amp;quot;/path/to/scratch/dir&amp;amp;quot; &amp;amp;quot;/path/to/results/dir&amp;amp;quot; &amp;lt;/pre&amp;gt;&lt;br /&gt;
If there is no training, you can ignore the second argument here. In the results directory, there should be one file for each test file, with the same name as the test file plus &amp;lt;code&amp;gt;.txt&amp;lt;/code&amp;gt;. Programs can use their working directory if they need to keep temporary cache files or internal debugging info. Standard output and standard error will be logged.&lt;br /&gt;
&lt;br /&gt;
== Packaging submissions ==&lt;br /&gt;
&lt;br /&gt;
All submissions should be statically linked to all libraries (the presence of dynamically linked libraries cannot be guaranteed). All submissions should include a &amp;lt;code&amp;gt;README&amp;lt;/code&amp;gt; file including the following information:&lt;br /&gt;
&lt;br /&gt;
* Command line calling format for all executables and an example formatted set of commands&lt;br /&gt;
* Number of threads/cores used or whether this should be specified on the command line&lt;br /&gt;
* Expected memory footprint&lt;br /&gt;
* Expected runtime&lt;br /&gt;
* Any required environments (and versions), e.g. Python, Java, bash, MATLAB.&lt;br /&gt;
&lt;br /&gt;
= Time and Hardware limits =&lt;br /&gt;
&lt;br /&gt;
A hard limit of 24 hours will be imposed on runs (total feature extraction and querying times). Submissions that exceed this runtime may not receive a result.&lt;br /&gt;
&lt;br /&gt;
= Discussion =&lt;br /&gt;
&lt;br /&gt;
= Bibliography =&lt;br /&gt;
&lt;br /&gt;
Abdallah, Samer A., Katy Noland, Mark B. Sandler, Michael Casey, and Christophe Rhodes. 2005. “Theory and Evaluation of a Bayesian Music Structure Extractor.” In ''Proceedings of the International Society for Music Information Retrieval Conference'', 420–425.&lt;br /&gt;
&lt;br /&gt;
Burgoyne, J. A., J. Wild, and I. Fujinaga. 2011. “An expert ground truth set for audio chord recognition and music analysis.” In ''Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR)'', 633–638.&lt;br /&gt;
&lt;br /&gt;
Burgoyne, John Ashley. 2012. “Stochastic Processes and Database-Driven Musicology.” Ph.D. diss. Montréal, Québec, Canada: McGill University. http://digitool.Library.McGill.CA:80/R/-?func=dbin-jump-full&amp;amp;object_id=107704&amp;amp;silo_library=GEN01.&lt;br /&gt;
&lt;br /&gt;
Haas, W. B. de, and John Ashley Burgoyne. 2012. ''Parsing the Billboard Chord Transcriptions''. Technical report UU-CS-2012-018, Department of Information and Computing Sciences, Utrecht University.&lt;br /&gt;
&lt;br /&gt;
Harte, C., M. Sandler, S. Abdallah, and E. Gómez. 2005. “Symbolic representation of musical chords: A proposed syntax for text annotations.” In ''Proceedings of the 6th International Society for Music Information Retrieval Conference (ISMIR)'', 66–71.&lt;br /&gt;
&lt;br /&gt;
Harte, Christopher. 2010. “Towards automatic extraction of harmony information from music signals.” Ph.D. diss. Queen Mary, University of London.&lt;br /&gt;
&lt;br /&gt;
Mauch, Matthias. 2010. “Automatic Chord Transcription from Audio Using Computational Models of Musical Context.” Ph.D. diss. Queen Mary University of London.&lt;br /&gt;
&lt;br /&gt;
Pauwels, Johan, and Geoffroy Peeters. 2013. “Evaluating automatically estimated chord sequences.” In ''Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)''. Vancouver, British Columbia, Canada.&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
	<entry>
		<id>https://music-ir.org/mirex/w/index.php?title=2012:Audio_Chord_Estimation&amp;diff=8868</id>
		<title>2012:Audio Chord Estimation</title>
		<link rel="alternate" type="text/html" href="https://music-ir.org/mirex/w/index.php?title=2012:Audio_Chord_Estimation&amp;diff=8868"/>
		<updated>2012-08-27T16:48:35Z</updated>

		<summary type="html">&lt;p&gt;J. Ashley Burgoyne: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[The Utrecht Agreement on Chord Evaluation]]&lt;br /&gt;
&lt;br /&gt;
===Evaluation of Chord Transcriptions===&lt;br /&gt;
&lt;br /&gt;
Before the final description of the chord evaluation goes live here, please see the discussion based on the [[The Utrecht Agreement on Chord Evaluation]].&lt;br /&gt;
&lt;br /&gt;
== Description ==&lt;br /&gt;
This task requires participants to transcribe a sequence of chords from an audio music recording. For many applications in music information retrieval, extracting the harmonic structure of an audio track is very desirable, for example for dividing pieces into characteristic sections, for finding similar pieces, or for semantic analysis of music.&lt;br /&gt;
&lt;br /&gt;
The extraction of the harmonic structure requires the detection of as many chords as possible in a piece. This includes characterising each chord by its root and type, as well as placing the chords in chronological order with their onsets and durations.&lt;br /&gt;
&lt;br /&gt;
Although some publications are available on this topic [1,2,3,4,5], comparing their results is difficult because different measures are used to assess performance. To overcome this problem, an accurately defined methodology is needed, including a repertoire of detectable chords, a defined test set with ground truth, and unambiguous calculation rules for measuring performance.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Data ==&lt;br /&gt;
Three datasets are used to evaluate chord transcription accuracy:&lt;br /&gt;
&lt;br /&gt;
=== Beatles dataset ===&lt;br /&gt;
Christopher Harte's Beatles dataset, consisting of annotations of 12 Beatles albums.&lt;br /&gt;
&lt;br /&gt;
The text annotation procedure of musical chords that was used to produce this dataset is presented in [6]. &lt;br /&gt;
&lt;br /&gt;
=== Queen and Zweieck dataset ===&lt;br /&gt;
Matthias Mauch's Queen and Zweieck dataset consisting of 38 songs from Queen and Zweieck.&lt;br /&gt;
&lt;br /&gt;
=== Billboard dataset (abridged) ===&lt;br /&gt;
An abridged version of Ashley Burgoyne's Billboard dataset [9], consisting of about 200 songs for training (previously published) and 200 songs for testing (to be published for the first time at ISMIR).&lt;br /&gt;
&lt;br /&gt;
===Example ground-truth file ===&lt;br /&gt;
The ground-truth files take the form:&lt;br /&gt;
&lt;br /&gt;
 ...&lt;br /&gt;
 41.2631021 44.2456460 B&lt;br /&gt;
 44.2456460 45.7201230 E&lt;br /&gt;
 45.7201230 47.2061900 E:7/3&lt;br /&gt;
 47.2061900 48.6922670 A&lt;br /&gt;
 48.6922670 50.1551240 A:min/b3&lt;br /&gt;
 ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Evaluation ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Segmentation Score ===&lt;br /&gt;
&lt;br /&gt;
The segmentation score will be calculated using the directional Hamming distance as described in [8]. An over-segmentation value (m) and an under-segmentation value (f) will be calculated, and the final segmentation score will be taken from the worse of the two, i.e.:&lt;br /&gt;
&lt;br /&gt;
segmentation score = 1 - max(m,f)&lt;br /&gt;
&lt;br /&gt;
m and f are not independent of each other, so combining them this way ensures that a good score in one does not hide a bad score in the other. The combined segmentation score can take values between 0 and 1, with 0 being the worst and 1 being the best result. --Chrish 17:05, 9 September 2009 (UTC)&lt;br /&gt;
&lt;br /&gt;
=== Frame-based recall ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
For recall evaluation, we may define a different chord dictionary for each level of evaluation (dyads, triads, tetrads, etc.). Each dictionary is a text file containing the chord shorthands / interval lists of the chords that will be considered in that evaluation. The following dictionaries are proposed:&lt;br /&gt;
&lt;br /&gt;
For dyad comparison of major/minor chords only:&lt;br /&gt;
&lt;br /&gt;
N&amp;lt;br&amp;gt;&lt;br /&gt;
X:maj&amp;lt;br&amp;gt;&lt;br /&gt;
X:min&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For comparison of standard triad chords:&lt;br /&gt;
&lt;br /&gt;
N&amp;lt;br&amp;gt;&lt;br /&gt;
X:maj&amp;lt;br&amp;gt;&lt;br /&gt;
X:min&amp;lt;br&amp;gt;&lt;br /&gt;
X:aug&amp;lt;br&amp;gt;&lt;br /&gt;
X:dim&amp;lt;br&amp;gt;&lt;br /&gt;
X:sus2&amp;lt;br&amp;gt;&lt;br /&gt;
X:sus4&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For comparison of tetrad (quad) chords (currently only for the Beatles and Queen and Zweieck datasets):&lt;br /&gt;
&lt;br /&gt;
N &amp;lt;br&amp;gt;&lt;br /&gt;
X:maj &amp;lt;br&amp;gt;&lt;br /&gt;
X:min&amp;lt;br&amp;gt;&lt;br /&gt;
X:aug&amp;lt;br&amp;gt;&lt;br /&gt;
X:dim&amp;lt;br&amp;gt;&lt;br /&gt;
X:sus2&amp;lt;br&amp;gt;&lt;br /&gt;
X:sus4&amp;lt;br&amp;gt;&lt;br /&gt;
X:maj7&amp;lt;br&amp;gt;&lt;br /&gt;
X:7&amp;lt;br&amp;gt;&lt;br /&gt;
X:maj(9)&amp;lt;br&amp;gt;&lt;br /&gt;
X:aug(7)	&amp;lt;br&amp;gt;&lt;br /&gt;
X:min(7)&amp;lt;br&amp;gt;&lt;br /&gt;
X:min7&amp;lt;br&amp;gt;&lt;br /&gt;
X:min(9)&amp;lt;br&amp;gt;&lt;br /&gt;
X:dim(7)&amp;lt;br&amp;gt;&lt;br /&gt;
X:hdim7	&amp;lt;br&amp;gt;&lt;br /&gt;
X:sus4(7)&amp;lt;br&amp;gt;&lt;br /&gt;
X:sus4(b7)&amp;lt;br&amp;gt;&lt;br /&gt;
X:dim7&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
For each evaluation level, the ground truth annotation is compared against the dictionary. Any chord label not belonging to the current dictionary will be replaced with an &amp;quot;X&amp;quot; in a local copy of the annotation and will not be included in the recall calculation.&lt;br /&gt;
&lt;br /&gt;
Note that the level of comparison in terms of intervals can be varied. For example, in a triad evaluation we can consider the first three component intervals in the chord so that a major (1,3,5) and a major7 (1,3,5,7) will be considered the same chord. For a tetrad (quad) evaluation, we would consider the first 4 intervals so major and major7 would then be considered to be different chords.&lt;br /&gt;
&lt;br /&gt;
For the maj/min evaluation (using the first example dictionary), using an interval comparison of 2 (dyad) will compare only the first two intervals of each chord label. This would map augmented and diminished chords to major and minor respectively (and any other symbols that had a major 3rd or minor 3rd as their first interval). Using an interval comparison of 3 with the same dictionary would keep only those chords that have major and minor triads as their first 3 intervals so augmented and diminished chords would be removed from the evaluation.&lt;br /&gt;
&lt;br /&gt;
After the annotation has been &amp;quot;filtered&amp;quot; using a given dictionary, it can be compared against the machine-generated estimates output by the algorithm under test. The chord sequences described in the annotation and estimate text files are sampled at a given frame rate (in this case 10ms per frame) to give two sequences of chord frames which may be compared directly with each other.&lt;br /&gt;
&lt;br /&gt;
To calculate a hit or a miss, the chord labels from the current frame in each sequence are compared. Chord comparison is done by converting each chord label into an ordered list of pitch classes and then comparing the two lists element by element. If the lists match to the required number of intervals, a hit is recorded; otherwise the estimate is considered a miss. Note that, by converting to pitch classes in the comparison, this evaluation ignores enharmonic pitch and interval spellings, so the following chords (a slightly silly example, just for illustration) all evaluate as identical:&lt;br /&gt;
&lt;br /&gt;
C:maj = Dbb:maj = C#:(b1,b3,#4)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Basic recall calculation algorithm:&lt;br /&gt;
&lt;br /&gt;
1) filter annotated transcription using chord dictionary for a defined number of intervals&lt;br /&gt;
&lt;br /&gt;
2) sample annotated transcription and machine estimated transcription at 10ms intervals to create a sequence of annotation frames and estimate frames&lt;br /&gt;
&lt;br /&gt;
3) start at the first frame&lt;br /&gt;
&lt;br /&gt;
4) get chord label for current annotation frame and estimate frame&lt;br /&gt;
&lt;br /&gt;
5) check annotation label:&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
IF symbol is 'X' (i.e. non-dictionary) &amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
THEN ignore frame (record number of ignored frames)&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
ELSE compare annotated/estimated chords for the predefined number of intervals &amp;lt;br&amp;gt;&lt;br /&gt;
increment hit count if chords match&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
ENDIF&lt;br /&gt;
&lt;br /&gt;
6) increment frame count &lt;br /&gt;
&lt;br /&gt;
7) go back to 4 until final chord frame&lt;br /&gt;
--[[User:Chrish|Chrish]] 17:05, 9 September 2009 (UTC)&lt;br /&gt;
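The recall loop above can be sketched in Python. The shorthand table and label parsing below are illustrative stand-ins covering only maj and min; the real evaluation uses the full syntax of [6]:

```python
NATURALS = {'C': 0, 'D': 2, 'E': 4, 'F': 5, 'G': 7, 'A': 9, 'B': 11}
SHORTHANDS = {'maj': (0, 4, 7), 'min': (0, 3, 7)}   # semitones above the root

def pitch_classes(label):
    """'Dbb:maj' -> (0, 4, 7); returns None for 'N' (no chord) or 'X'."""
    if label in ('N', 'X'):
        return None
    root, _, quality = label.partition(':')
    pc = (NATURALS[root[0]] + root.count('#') - root.count('b')) % 12
    return tuple(sorted((pc + iv) % 12 for iv in SHORTHANDS[quality or 'maj']))

def frame_recall(annotation, estimate, hop=0.01):
    """Sample both transcriptions every `hop` seconds; skip frames whose
    annotation label is 'X'; count a hit when the pitch classes agree."""
    def label_at(segments, t):
        for start, end, label in segments:
            if start <= t < end:
                return label
        return 'N'
    n_frames = int(round(annotation[-1][1] / hop))
    hits = counted = 0
    for k in range(n_frames):
        t = k * hop
        ann_label = label_at(annotation, t)
        if ann_label == 'X':            # non-dictionary frame: ignore it
            continue
        counted += 1
        if pitch_classes(ann_label) == pitch_classes(label_at(estimate, t)):
            hits += 1
    return hits / counted if counted else 0.0
```

Note that Dbb:maj and C:maj map to the same pitch classes here, matching the enharmonic-equivalence example in the text.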
&lt;br /&gt;
&lt;br /&gt;
== Submission Format ==&lt;br /&gt;
&lt;br /&gt;
=== Audio Format ===&lt;br /&gt;
Audio tracks will be encoded as 44.1 kHz, 16-bit mono WAV files.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== I/O Format ===&lt;br /&gt;
The expected output format for chord transcription files is the one proposed by Christopher Harte [6].&lt;br /&gt;
&lt;br /&gt;
Hence, algorithms should output text files with a similar format to that used in the ground truth transcriptions. That is to say, they should be flat text files with chord segment labels and times arranged thus:&lt;br /&gt;
&lt;br /&gt;
 start_time end_time chord_label&lt;br /&gt;
&lt;br /&gt;
with elements separated by white spaces, times given in seconds, chord labels corresponding to the syntax described in [6] and one chord segment per line. &lt;br /&gt;
&lt;br /&gt;
The chord root is given as a natural (A|B|C|D|E|F|G) followed by optional sharp or flat modifiers (#|b). For the evaluation process we may assume enharmonic equivalence for chord roots. For a given chord type on root X, the chord label can be given as a list of intervals or as a shorthand notation, as shown in the following table:&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;0&amp;quot; align=&amp;quot;center&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
!NAME&lt;br /&gt;
!INTERVALS&lt;br /&gt;
!SHORTHAND&lt;br /&gt;
|-&lt;br /&gt;
|'''Triads:'''&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|major&lt;br /&gt;
|X:(1,3,5)&lt;br /&gt;
|X or X:maj&lt;br /&gt;
|-&lt;br /&gt;
|minor&lt;br /&gt;
|X:(1,b3,5)&lt;br /&gt;
|X:min&lt;br /&gt;
|-&lt;br /&gt;
|diminished&lt;br /&gt;
|X:(1,b3,b5)&lt;br /&gt;
|X:dim&lt;br /&gt;
|-&lt;br /&gt;
|augmented&lt;br /&gt;
|X:(1,3,#5)&lt;br /&gt;
|X:aug&lt;br /&gt;
|-&lt;br /&gt;
|suspended4&lt;br /&gt;
|X:(1,4,5)&lt;br /&gt;
|X:sus4&lt;br /&gt;
|-&lt;br /&gt;
|suspended2 (possible 6th triad)&lt;br /&gt;
|X:(1,2,5)&lt;br /&gt;
|X:sus2&lt;br /&gt;
|-&lt;br /&gt;
|'''Quads:'''&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|major-major7&lt;br /&gt;
|X:(1,3,5,7)&lt;br /&gt;
|X:maj7&lt;br /&gt;
|-&lt;br /&gt;
|major-minor7&lt;br /&gt;
|X:(1,3,5,b7)&lt;br /&gt;
|X:7&lt;br /&gt;
|-&lt;br /&gt;
|major-add9&lt;br /&gt;
|X:(1,3,5,9)&lt;br /&gt;
|X:maj(9)&lt;br /&gt;
|-&lt;br /&gt;
|major-major7-#5&lt;br /&gt;
|X:(1,3,#5,7)&lt;br /&gt;
|X:aug(7)&lt;br /&gt;
|-&lt;br /&gt;
|minor-major7&lt;br /&gt;
|X:(1,b3,5,7)&lt;br /&gt;
|X:min(7)&lt;br /&gt;
|-&lt;br /&gt;
|minor-minor7&lt;br /&gt;
|X:(1,b3,5,b7)&lt;br /&gt;
|X:min7&lt;br /&gt;
|-&lt;br /&gt;
|minor-add9&lt;br /&gt;
|X:(1,b3,5,9)&lt;br /&gt;
|X:min(9)&lt;br /&gt;
|-&lt;br /&gt;
|''minor 7/b5 (ambiguous - could be either of the following):''&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|minor-major7-b5&lt;br /&gt;
|X:(1,b3,b5,7)&lt;br /&gt;
|X:dim(7)&lt;br /&gt;
|-&lt;br /&gt;
|minor-minor7-b5 (a half-diminished 7th)&lt;br /&gt;
|X:(1,b3,b5,b7)&lt;br /&gt;
|X:hdim7&lt;br /&gt;
|-&lt;br /&gt;
|sus4-major7&lt;br /&gt;
|X:(1,4,5,7)&lt;br /&gt;
|X:sus4(7)&lt;br /&gt;
|-&lt;br /&gt;
|sus4-minor7&lt;br /&gt;
|X:(1,4,5,b7)&lt;br /&gt;
|X:sus4(b7)&lt;br /&gt;
|-&lt;br /&gt;
|diminished7 (omitted from the original list)&lt;br /&gt;
|X:(1,b3,b5,bb7)&lt;br /&gt;
|X:dim7&lt;br /&gt;
|-&lt;br /&gt;
|No Chord&lt;br /&gt;
|N&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
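To make the equivalence between shorthand and interval-list spellings concrete, here is a hypothetical Python sketch that expands an interval list to semitones above the root, using major-scale degree offsets with each flat or sharp shifting by one semitone:

```python
# Major-scale semitone offsets for the scale degrees used in the table.
DEGREE_SEMITONES = {1: 0, 2: 2, 3: 4, 4: 5, 5: 7, 6: 9, 7: 11, 9: 14}

def interval_to_semitones(interval):
    """'b3' -> 3, '#5' -> 8, 'bb7' -> 9 (modulo 12 for extensions)."""
    degree = int(interval.lstrip('#b'))
    return (DEGREE_SEMITONES[degree]
            + interval.count('#') - interval.count('b')) % 12

def expand(interval_list):
    """'(1,b3,b5,bb7)' -> sorted tuple of semitones above the root."""
    return tuple(sorted(interval_to_semitones(iv)
                        for iv in interval_list.strip('()').split(',')))
```

Under this expansion, X:(1,3,5,7) (X:maj7) comes out as (0, 4, 7, 11) and X:(1,b3,b5,bb7) (X:dim7) as (0, 3, 6, 9).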
&lt;br /&gt;
&lt;br /&gt;
Please note that two things have changed in the syntax since it was originally described in [6]. The first change is that the root is no longer implied as a voiced element of a chord so a C major chord (notes C, E and G) should be written C:(1,3,5) instead of just C:(3,5) if using the interval list representation. As before, the labels C and C:maj are equivalent to C:(1,3,5). The second change is that the shorthand label &amp;quot;sus2&amp;quot; (intervals 1,2,5) has been added to the available shorthand list.--[[User:Chrish|Chrish]] 17:05, 9 September 2009 (UTC)&lt;br /&gt;
&lt;br /&gt;
We still accept participants who would like to be evaluated only on major/minor chords and want to use the number format, which is an integer chord id in the range 0-24: values 0-11 denote C major, C# major, ..., B major; values 12-23 denote C minor, C# minor, ..., B minor; and 24 denotes silence or no-chord segments. '''Please note that the format is still the same:'''&lt;br /&gt;
&lt;br /&gt;
 start_time end_time chord_number&lt;br /&gt;
&lt;br /&gt;
Systems should print both onset and offset times, in contrast to the MIREX 2008 chord output format, where only onsets were used.&lt;br /&gt;
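The mapping between major/minor labels and the integer ids described above can be sketched as follows (a hypothetical illustration, not required code):

```python
# Pitch classes in ascending chromatic order starting from C.
ROOTS = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def chord_number(label):
    """Major/minor label to integer id: 0-11 major, 12-23 minor, 24 = N."""
    if label == 'N':
        return 24
    root, _, quality = label.partition(':')
    return ROOTS.index(root) + (12 if quality == 'min' else 0)
```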
&lt;br /&gt;
=== Command line calling format ===&lt;br /&gt;
&lt;br /&gt;
Submissions have to conform to the specified format below:&lt;br /&gt;
&lt;br /&gt;
 ''extractFeaturesAndTrain  &amp;quot;/path/to/trainFileList.txt&amp;quot;  &amp;quot;/path/to/scratch/dir&amp;quot; '' &lt;br /&gt;
&lt;br /&gt;
where &amp;quot;trainFileList.txt&amp;quot; contains the path to each WAV file. The features extracted at this stage can be stored under &amp;quot;/path/to/scratch/dir&amp;quot;.&lt;br /&gt;
The ground truth files for the supervised learning will be in the same path with a &amp;quot;.txt&amp;quot; extension at the end. For example, for &amp;quot;/path/to/trainFile1.wav&amp;quot;, there will be a corresponding ground truth file called &amp;quot;/path/to/trainFile1.wav.txt&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
For testing:&lt;br /&gt;
&lt;br /&gt;
 ''doChordID.sh &amp;quot;/path/to/testFileList.txt&amp;quot;  &amp;quot;/path/to/scratch/dir&amp;quot; &amp;quot;/path/to/results/dir&amp;quot; '' &lt;br /&gt;
&lt;br /&gt;
If there is no training, you can ignore the second argument here. In the results directory, there should be one file for each test file, with the same name as the test file plus &amp;quot;.txt&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Programs can use their working directory if they need to keep temporary cache files or internal debugging info. Standard output and standard error will be logged.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Packaging submissions ===&lt;br /&gt;
All submissions should be statically linked to all libraries (the presence of dynamically linked libraries cannot be guaranteed).&lt;br /&gt;
&lt;br /&gt;
All submissions should include a README file including the following information:&lt;br /&gt;
&lt;br /&gt;
* Command line calling format for all executables and an example formatted set of commands&lt;br /&gt;
* Number of threads/cores used or whether this should be specified on the command line&lt;br /&gt;
* Expected memory footprint&lt;br /&gt;
* Expected runtime&lt;br /&gt;
* Any required environments (and versions), e.g. Python, Java, bash, MATLAB.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Time and hardware limits ==&lt;br /&gt;
Due to the potentially high number of participants in this and other audio tasks, hard limits on the runtime of submissions are specified.&lt;br /&gt;
 &lt;br /&gt;
A hard limit of 24 hours will be imposed on runs (total feature extraction and querying times). Submissions that exceed this runtime may not receive a result.&lt;br /&gt;
&lt;br /&gt;
== Discussion ==&lt;br /&gt;
Please write your comments below with your name and date.&lt;br /&gt;
&lt;br /&gt;
Somewhere in the email discussion on the MIREX list, there was a mention that the recent systems run on the Beatles/Queen/Zweieck dataset might have over-learnt the properties of this dataset. I just wondered whether, during or post-MIREX, there was any way to formally/experimentally demonstrate this? I mean, beyond making the observation that there is a &amp;quot;drop&amp;quot; in performance from an open dataset to a closed one. The issue would seem particularly pertinent with regard to this dataset since it's been public for some time.&lt;br /&gt;
(Matthew Davies, 9th August)&lt;br /&gt;
&lt;br /&gt;
== Potential Participants ==&lt;br /&gt;
name / email&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Bibliography ==&lt;br /&gt;
&lt;br /&gt;
1. Harte, C.A. and Sandler, M.B. (2005). '''Automatic chord identification using a quantised chromagram.''' Proceedings of 118th Audio Engineering Society's Convention.&lt;br /&gt;
&lt;br /&gt;
2. Sailer, C. and Rosenbauer K. (2006). '''A bottom-up approach to chord detection.''' Proceedings of International Computer Music Conference 2006.&lt;br /&gt;
&lt;br /&gt;
3. Shenoy, A. and Wang, Y. (2005). '''Key, chord, and rhythm tracking of popular music recordings.''' Computer Music Journal 29(3), 75-86.&lt;br /&gt;
&lt;br /&gt;
4. Sheh, A. and Ellis, D.P.W. (2003). '''Chord segmentation and recognition using EM-trained hidden Markov models.''' Proceedings of 4th International Conference on Music Information Retrieval.&lt;br /&gt;
&lt;br /&gt;
5. Yoshioka, T. et al. (2004). '''Automatic Chord Transcription with concurrent recognition of chord symbols and boundaries.''' Proceedings of 5th International Conference on Music Information Retrieval.&lt;br /&gt;
&lt;br /&gt;
6. Harte, C. et al. (2005). '''Symbolic representation of musical chords: a proposed syntax for text annotations.''' Proceedings of 6th International Conference on Music Information Retrieval.&lt;br /&gt;
&lt;br /&gt;
7. Papadopoulos, H. and Peeters, G. (2007). '''Large-scale study of chord estimation algorithms based on chroma representation and HMM.''' Proceedings of 5th International Conference on Content-Based Multimedia Indexing.&lt;br /&gt;
&lt;br /&gt;
8. Abdallah, S. et al. (2005). '''Theory and Evaluation of a Bayesian Music Structure Extractor''' (pp. 420-425) Proc. 6th International Conference on Music Information Retrieval, ISMIR 2005.&lt;br /&gt;
&lt;br /&gt;
9. Burgoyne, J. A. et al. (2011). '''An expert ground-truth set for audio chord recognition and music analysis''' (pp. 633–638) Proc. 12th International Society for Music Information Retrieval Conference, ISMIR 2011. [http://ismir2011.ismir.net/papers/OS8-1.pdf (PDF)]&lt;/div&gt;</summary>
		<author><name>J. Ashley Burgoyne</name></author>
		
	</entry>
</feed>