2005:Symbolic Genre Class
Proposer
Cory McKay (McGill University) cory.mckay@mail.mcgill.ca
Title
Genre Classification of MIDI Files
Description
Submitted software will automatically classify MIDI recordings into genre categories.
1) Genre Categories The genre categories will be organized hierarchically, in order to enable evaluation of how well entries can perform both coarse and fine classifications. The particular categories to be used will be determined by the evaluation committee. Individual recordings could belong to more than one category, as this is more realistic than requiring that each recording be classified as belonging to exactly one category. A total of three to five coarse categories and ten to fifteen fine categories will be used. Model classifications will be made by the evaluation committee or a sub-committee of the evaluation committee. Entrants will be provided with the selection and organization of categories so that they can configure their software to reflect them before submission.
2) Training and Testing Recordings Training and testing recordings will be chosen by the evaluation committee and kept confidential until after evaluations are complete. The test recordings will then be released, copyrights permitting.
3) Input Data Training will be performed by providing the software (through a command-line argument) with a text file listing training MIDI file paths and model genre(s). Testing will be performed by providing the software (through a command-line argument) with a text file that contains a list of file paths of test MIDI recordings.
4) Output Data The software will produce a text file listing test recording file paths and the genre(s) that each has been classified as.
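The proposal fixes only that the training and output files list one recording per line, pairing a file path with genre label(s); the field separator is not specified. As a minimal sketch of both sides of this convention, assuming tab-separated fields:

```python
# Sketch of the proposed file I/O convention. The exact field separator
# is not specified in the proposal; tab-separated values are an assumption.

def read_training_list(path):
    """Parse a training file: each line holds a MIDI path and its genre(s)."""
    examples = []
    with open(path) as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if not parts or not parts[0]:
                continue  # skip blank lines
            midi_path, genres = parts[0], parts[1:]
            examples.append((midi_path, genres))
    return examples

def write_classifications(out_path, results):
    """Write one line per test recording: its path plus predicted genre(s)."""
    with open(out_path, "w") as f:
        for midi_path, genres in results:
            f.write("\t".join([midi_path] + list(genres)) + "\n")
```

Because recordings may belong to more than one category, each line simply carries as many genre fields as apply.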
Potential Participants
- George Tzanetakis (University of Victoria), gtzan@cs.uvic.ca, high likelihood
- Cory McKay & Ichiro Fujinaga (McGill University), cory.mckay@mail.mcgill.ca, high likelihood
- Pedro J. Ponce de Leon & Jose M. Inesta (Universidad de Alicante), pierre@dlsi.ua.es, medium likelihood
- Roberto Basili, Alfredo Serafini & Armando Stellato (University of Rome Tor Vergata), basili@info.uniroma2.it, medium likelihood
- Man-Kwan Shan & Fang-Fei Kuo (National Cheng Chi University), mkshan@cs.nccu.edu.tw, medium likelihood
- Rudi Cilibrasi, cilibrar@gmail.com, medium likelihood
Evaluation Procedures
Entries will be evaluated based on their success rates with respect to both fine and coarse classifications. Entrants will have the option of enabling their software to output classifications of "unknown," which will be penalized less severely during evaluation than misclassifications, since classifications flagged as uncertain are much more useful than false classifications in a practical context. Evaluation will be performed using 5-fold cross-validation.
Submissions in C/C++, Java, MATLAB and Python (and other languages?) will be accepted.
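One way to realize the milder penalty for "unknown" is to award it partial credit per recording. A sketch, in which the 0.5 credit for an "unknown" answer is purely an illustrative assumption (the actual weighting would be set by the evaluation committee):

```python
# Per-recording scoring in which "unknown" is penalized less than a
# misclassification. The 0.5 credit for "unknown" is an assumed,
# illustrative weight, not a value fixed by the proposal.

def score(predicted, truth, unknown_credit=0.5):
    """truth is the set of model genre(s) for the recording."""
    if predicted == "unknown":
        return unknown_credit
    return 1.0 if predicted in truth else 0.0

def accuracy(predictions, truths, unknown_credit=0.5):
    """Mean per-recording score over a test fold."""
    total = sum(score(p, t, unknown_credit) for p, t in zip(predictions, truths))
    return total / len(predictions)
```

Under 5-fold cross-validation, this accuracy would be computed on each held-out fold and averaged.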
Relevant Test Collections
- On-line repositories of MIDI files (sample links available at http://www.music.mcgill.ca/~cmckay/midi.html )
- Research databases.
Review 1
The problem is very interesting for MIR, but too vaguely described. The role of the committee is not to propose anything, but to review the proposed evaluation sessions. Thus the author should propose a detailed list of genres and corresponding data.
I'm not against organizing the genres hierarchically and associating several genres with each file, but this raises many issues that are not discussed at all here. If a track belongs to several genres, are these genres equally weighted or not? Are they determined by asking several people to classify each track into one genre, or by asking each one to classify each track into several genres? If there are coarse categories for classical and folk music, where does the fine category of classical music adapted from folk songs belong? I suggest that the contest concentrate on the single-genre problem.
The choice of the genre classes is a crucial issue if the contest is to be held several times: existing databases can be reused only if the defined categories are identical each year. Obviously the list of categories should reflect the MIDI music available on the internet. It would help if some data were already labeled according to this list.
The list of relevant data should be developed. How many files are needed for learning and testing? Have the participants already collected some labeled data that they could give to the organizers? How much?
Regarding the release of the data, I think it would be better not to release anything. The training and test data should always be accessible through the D2K interface, so no copyright problem would appear. Is it possible to ensure that the test data are used only for testing and not for learning? Is it possible to implement learning easily in M2K? (Each algorithm may use different structures to store learnt data.)
Finally, the evaluation procedure seems nice, but I don't have any clue whether the proposed participants are really interested.
Review 2
This is an interesting topic, one that I haven't seen much work on. I do not believe that it's difficult to get a large collection of MIDI files. Many are in the public domain, were never intended to be copyrighted, or have copyleft / Creative Commons licences. However, it's still difficult to assemble a reasonable collection of MIDI files of appropriate length which accurately represent a sufficient number of genres. This must be addressed.
A key point is that it requires the Contest Committee to hand-label a large number of MIDI files. We also need to determine what our genres are. Is the Committee capable of and willing to do this? I personally would find it very difficult to determine the genre of a MIDI recording which I don't recognize. MIDI all sounds like Muzak to me, unless I know the original audio recording. Has anyone tried MIDI-based genre classification before?
I have no problems with the suggested evaluation and testing procedures.
I think we need some more feedback on whether people are really interested in this. Most researchers who use MIDI, to my knowledge, aren't concerned with genre issues. George typically works with audio, so the proposer is the only person I know of who is interested. I could be wrong, so let's ask around. We also need to explore the hand-labelling task, and to see if we can assemble a decent collection (which we should do regardless of this proposal).
If there is significant interest, and the labeling can be done, then we should accept it.
Downie's Comments
1. Happy to see another symbolic proposal!
2. See my comments w/r/t the Audio Genre proposal. We need to make these two tasks as similar as possible!
Rudi's Comments
Looks good to me. I agree with the first reviewer's comments wholeheartedly. I think it may be too complicated to do hierarchical genre classification. What if we restrict ourselves to just 2-5 genres and pretend they are disjoint, and then only pick songs that clearly fit one or the other? I'm not against a hierarchical system necessarily, but it does seem like it may involve much more work and arbitrariness in labelling, scoring, etc. If you just want something a bit more interesting than simple Jazz / Rock / Classical, then how about happy / sad music? We could train on two different dimensions in two parts (or perhaps on the same set of songs?) to add a little variety without much additional complexity on the part of the participants or organizers. Or how about "hit song" (greater than 1 million copies sold, or something) versus "not hit," like that hit-song predictor that got some press lately.
I agree that we will need to get some more parameters about the number of MIDI files involved in the experiment. Let's put a finer point on the data model. Each training sample will have
- a MIDI file, provided as an absolute pathname string
- a string song title
- a string artist/group name
- a numerical genre classification code as an integer
- any other codes (e.g. happy/sad or hit/not-hit) also as integers
On each run of the system, the training set will be partitioned into five parts and set up for five-fold cross-validation testing. It will be given each song for training along with one of the N different integer label codes for whatever test is in progress. The program will read from standard input the following information, one record per line:
- first line, the number of MIDI songs for training as ASCII decimal, then a space, then the number of testing songs, then a newline
- next the training songs, one per line, with an ASCII decimal label code then a space then the absolute filename of the MIDI file, then a newline
- next the testing songs, one per line, as an absolute filename
The program is to output one integer prediction per line, for each test song, in order.
The program may assume the current directory is readable, searchable, and writable.
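The stdin protocol above can be sketched as a small program skeleton. The majority-label "classifier" here is only a placeholder to keep the example self-contained; a real entry would extract features from each MIDI file before predicting:

```python
# Skeleton reading the proposed stdin protocol: a header with the
# training and test counts, then labeled training lines, then test
# filenames. The majority-label prediction is a placeholder, not a
# real classifier.
import sys
from collections import Counter

def run(stream=sys.stdin, out=sys.stdout):
    header = stream.readline().split()
    n_train, n_test = int(header[0]), int(header[1])
    labels = []
    for _ in range(n_train):
        # Each training line: ASCII decimal label, a space, then the path.
        label, _path = stream.readline().split(" ", 1)
        labels.append(int(label))
    majority = Counter(labels).most_common(1)[0][0]
    for _ in range(n_test):
        _path = stream.readline().rstrip("\n")
        out.write("%d\n" % majority)  # one integer prediction per line
```

Since the output is one integer per line, in test order, the harness can score a run by pairing each line with the held-out label for the corresponding test song.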