2010:Audio Chord Estimation
Description
The text of this section is copied from the 2009 page. This task was first run in 2008. Please add your comments and discussions for 2010.
For many applications in music information retrieval, extracting the harmonic structure is very desirable, for example for segmenting pieces into characteristic segments, for finding similar pieces, or for semantic analysis of music.
The extraction of the harmonic structure requires the detection of as many chords as possible in a piece. That includes the characterisation of chords with a key and type as well as a chronological order with onset and duration of the chords.
Although some publications are available on this topic [1,2,3,4,5], comparing their results is difficult because different measures are used to assess performance. To overcome this problem, an accurately defined methodology is needed. This includes a defined vocabulary of detectable chords, a defined test set along with ground truth, and unambiguous calculation rules to measure performance.
We therefore suggest introducing the new evaluation task Audio Chord Detection.
Data
Christopher Harte's Beatles dataset was used for the evaluations last year. This dataset consists of 12 Beatles albums [6]. An approach for text annotation of musical chords is presented in [6]. This year an extra dataset was donated by Matthias Mauch, which consists of 38 songs from Queen and Zweieck. The data will be provided as 44.1 kHz, 16-bit mono WAV files. The ground truth looks like this:
41.2631021 44.2456460 B
44.2456460 45.7201130 E
45.7201130 47.2061900 E:7/3
47.2061900 48.6922670 A
48.6922670 50.1551240 A:min/b3
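A minimal sketch of how such a ground-truth file could be read, assuming each non-empty line holds exactly three whitespace-separated fields (onset in seconds, offset in seconds, chord label). The names `ChordSegment` and `parse_ground_truth` are hypothetical and not part of any MIREX tooling.

```python
from typing import List, NamedTuple


class ChordSegment(NamedTuple):
    """One annotated segment: onset and offset in seconds, plus the chord label."""
    onset: float
    offset: float
    label: str


def parse_ground_truth(text: str) -> List[ChordSegment]:
    """Parse lines of the form '<onset> <offset> <label>' (assumed layout)."""
    segments = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"line {line_number}: expected 3 fields, got {len(fields)}")
        onset, offset, label = float(fields[0]), float(fields[1]), fields[2]
        if offset < onset:
            raise ValueError(f"line {line_number}: offset precedes onset")
        segments.append(ChordSegment(onset, offset, label))
    return segments


# The excerpt shown above, used as sample input.
example = """41.2631021 44.2456460 B
44.2456460 45.7201130 E
45.7201130 47.2061900 E:7/3
47.2061900 48.6922670 A
48.6922670 50.1551240 A:min/b3
"""
segments = parse_ground_truth(example)
```

In this excerpt each segment starts where the previous one ends, so the chord sequence and the durations follow directly from the onset and offset columns.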
I/O Format
This year the I/O format needs to be changed so that systems can be evaluated on all triads and quads. We are planning to use the format suggested by Christopher Harte [6]. The chord root is given as a natural (A|B|C|D|E|F|G) followed by optional sharp or flat modifiers (#|b). For the evaluation process we may assume enharmonic equivalence for chord roots. For a given chord type on root X, the chord labels can be given as a list of intervals or as a shorthand notation, as shown in the following table:
NAME | INTERVALS | SHORTHAND |
---|---|---|
major | X:(1,3,5) | X or X:maj |
minor | X:(1,b3,5) | X:min |
diminished | X:(1,b3,b5) | X:dim |
augmented | X:(1,3,#5) | X:aug |
suspended4 | X:(1,4,5) | X:sus4 |
possible 6th triad: | | |
suspended2 | X:(1,2,5) | X:sus2 |
Quads: | | |
major-major7 | X:(1,3,5,7) | X:maj7 |
major-minor7 | X:(1,3,5,b7) | X:7 |
major-add9 | X:(1,3,5,9) | X:maj(9) |
major-major7-#5 | X:(1,3,#5,7) | X:aug(7) |
minor-major7 | X:(1,b3,5,7) | X:min(7) |
minor-minor7 | X:(1,b3,5,b7) | X:min7 |
minor-add9 | X:(1,b3,5,9) | X:min(9) |
minor 7/b5 (ambiguous - could be either of the following) | | |
minor-major7-b5 | X:(1,b3,b5,7) | X:dim(7) |
minor-minor7-b5 (a half diminished-7th) | X:(1,b3,b5,b7) | X:hdim7 |
sus4-major7 | X:(1,4,5,7) | X:sus4(7) |
sus4-minor7 | X:(1,4,5,b7) | X:sus4(b7) |
omitted from list on wiki: | | |
diminished7 | X:(1,b3,b5,bb7) | X:dim7 |
No Chord | | N |
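The table above can be turned into a small lookup, sketched here under a few assumptions: labels have the shape `root[:quality][/bass]` as in the ground-truth excerpt, a bare root means major, and enharmonic equivalence is handled by reducing the root to a pitch class (C = 0). The function and constant names are hypothetical, and the parser covers only the shorthands listed in the table, not the full syntax of [6].

```python
import re

NATURAL_PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Shorthand -> interval list, copied row by row from the table above.
SHORTHAND_INTERVALS = {
    "maj": ("1", "3", "5"),
    "min": ("1", "b3", "5"),
    "dim": ("1", "b3", "b5"),
    "aug": ("1", "3", "#5"),
    "sus4": ("1", "4", "5"),
    "sus2": ("1", "2", "5"),
    "maj7": ("1", "3", "5", "7"),
    "7": ("1", "3", "5", "b7"),
    "maj(9)": ("1", "3", "5", "9"),
    "aug(7)": ("1", "3", "#5", "7"),
    "min(7)": ("1", "b3", "5", "7"),
    "min7": ("1", "b3", "5", "b7"),
    "min(9)": ("1", "b3", "5", "9"),
    "dim(7)": ("1", "b3", "b5", "7"),
    "hdim7": ("1", "b3", "b5", "b7"),
    "sus4(7)": ("1", "4", "5", "7"),
    "sus4(b7)": ("1", "4", "5", "b7"),
    "dim7": ("1", "b3", "b5", "bb7"),
}

# root natural, optional #/b modifiers, optional ":quality", optional "/bass"
LABEL_RE = re.compile(r"^([A-G])([#b]*)(?::([^/]+))?(?:/(.+))?$")


def root_pitch_class(natural, modifiers):
    """Map a root such as 'C#' or 'Db' to a pitch class 0-11 (enharmonic equivalence)."""
    shift = modifiers.count("#") - modifiers.count("b")
    return (NATURAL_PITCH_CLASS[natural] + shift) % 12


def parse_chord_label(label):
    """Return (root pitch class, intervals, bass) or None for the no-chord label 'N'."""
    if label == "N":
        return None
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognised chord label: {label!r}")
    natural, modifiers, quality, bass = match.groups()
    quality = quality or "maj"  # a bare root denotes a major triad
    if quality.startswith("(") and quality.endswith(")"):
        intervals = tuple(quality[1:-1].split(","))  # explicit interval list
    else:
        try:
            intervals = SHORTHAND_INTERVALS[quality]
        except KeyError:
            raise ValueError(f"unknown shorthand: {quality!r}") from None
    return root_pitch_class(natural, modifiers), intervals, bass
```

For example, `parse_chord_label("E:7/3")` yields pitch class 4 with intervals `("1", "3", "5", "b7")` and bass `"3"`, and `C#:min` and `Db:min` map to the same root pitch class.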
However, we still accept participants who would only like to be evaluated on major/minor chords and want to use last year's format, which is an integer chord ID in the range 0-24: values 0-11 denote C major, C# major, ..., B major; values 12-23 denote C minor, C# minor, ..., B minor; and 24 denotes silence or no-chord segments.
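As an illustration, last year's integer IDs can be mapped onto the label syntax as sketched below. The sharp spellings for the roots between C# and B are an assumption (the page only lists the endpoints), and `chord_id_to_label` is a hypothetical helper name.

```python
# Chromatic root names starting at C; the sharp spelling is an assumption.
PITCH_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chord_id_to_label(chord_id: int) -> str:
    """Convert an integer chord ID (0-24) to a label such as 'C', 'C:min', or 'N'."""
    if chord_id == 24:
        return "N"  # silence or no-chord segment
    if not 0 <= chord_id <= 23:
        raise ValueError(f"chord ID out of range 0-24: {chord_id}")
    root = PITCH_NAMES[chord_id % 12]
    # IDs 0-11 are major triads, IDs 12-23 are minor triads on the same roots.
    return root if chord_id < 12 else root + ":min"
```

With this mapping, IDs 0, 12, and 24 become `C`, `C:min`, and `N` respectively.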
Submission Format
Submissions have to conform to the format specified below:
extractFeaturesAndTrain "/path/to/trainFileList.txt" "/path/to/scratch/dir"
Here trainFileList.txt contains the path to each WAV file. The features extracted at this stage can be stored under "/path/to/scratch/dir". The ground truth files for supervised learning will be in the same path with a ".txt" extension appended. For example, for "/path/to/trainFile1.wav" there will be a corresponding ground truth file called "/path/to/trainFile1.wav.txt".
For testing:
doChordID.sh "/path/to/testFileList.txt" "/path/to/scratch/dir" "/path/to/results/dir"
If there is no training, you can ignore the second argument here. The results directory should contain one file for each test file, with the same name as the test file plus ".txt".
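The file naming conventions above can be sketched as follows. This assumes the file list holds one path per line and that ".txt" is appended to the full file name including ".wav", as in the ground-truth example; the page does not state this explicitly for result files. All function names are hypothetical, and `posixpath` is used because the example paths on this page are POSIX-style.

```python
import posixpath


def read_file_list(list_text):
    """Return the non-empty lines of a file list, one audio path per line (assumed)."""
    return [line.strip() for line in list_text.splitlines() if line.strip()]


def ground_truth_path(train_file):
    """Ground truth sits next to the audio file with '.txt' appended."""
    return train_file + ".txt"


def result_path(test_file, results_dir):
    """Result file: same name as the test file plus '.txt', inside the results directory."""
    return posixpath.join(results_dir, posixpath.basename(test_file) + ".txt")


# Hypothetical file list contents for illustration.
test_files = read_file_list("/path/to/testFile1.wav\n/path/to/testFile2.wav\n")
outputs = [result_path(path, "/path/to/results/dir") for path in test_files]
```

Here `outputs[0]` is `/path/to/results/dir/testFile1.wav.txt`, mirroring the way the ground truth for `/path/to/trainFile1.wav` is named `/path/to/trainFile1.wav.txt`.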
Programs can use their working directory if they need to keep temporary cache files or internal debugging info. Stdout and stderr will be logged.
Discussions for 2010
Discussions from 2009
https://www.music-ir.org/mirex/2009/index.php/Audio_Chord_Detection#Discussions
Potential Participants
Your name here
Bibliography
1. Harte, C.A. and Sandler, M.B. (2005). Automatic chord identification using a quantised chromagram. Proceedings of 118th Audio Engineering Society's Convention.
2. Sailer, C. and Rosenbauer, K. (2006). A bottom-up approach to chord detection. Proceedings of International Computer Music Conference 2006.
3. Shenoy, A. and Wang, Y. (2005). Key, chord, and rhythm tracking of popular music recordings. Computer Music Journal 29(3), 75-86.
4. Sheh, A. and Ellis, D.P.W. (2003). Chord segmentation and recognition using EM-trained hidden Markov models. Proceedings of 4th International Conference on Music Information Retrieval.
5. Yoshioka, T. et al. (2004). Automatic chord transcription with concurrent recognition of chord symbols and boundaries. Proceedings of 5th International Conference on Music Information Retrieval.
6. Harte, C., Sandler, M., Abdallah, S. and Gómez, E. (2005). Symbolic representation of musical chords: a proposed syntax for text annotations. Proceedings of 6th International Conference on Music Information Retrieval.
7. Papadopoulos, H. and Peeters, G. (2007). Large-scale study of chord estimation algorithms based on chroma representation and HMM. Proceedings of 5th International Conference on Content-Based Multimedia Indexing.
8. Abdallah, S., Noland, K., Sandler, M., Casey, M. and Rhodes, C. (2005). Theory and evaluation of a Bayesian music structure extractor. Proceedings of 6th International Conference on Music Information Retrieval, pp. 420-425.