Talk:Audio Melody Extr
Matija's Comments
Some comments:
There should be an option to use different hop/frame sizes. Maybe a preferred size could be given (i.e. the one used for ground truth), while for others, ground truth data could be interpolated to fit any hop size (loss of accuracy is at the risk of submitter)
Last year's data should be augmented with some new data; next to mentioned sources, RWC is a useful source, as MIDI transcriptions are also available (although not aligned) and may provide a starting point for annotation. UPF's tool would certainly be useful. Are there any score-to-audio alignment tools available?
I agree that we could have several evaluations:
- f0 without taking into consideration unvoiced/accompaniment parts, thereby ignoring algorithm's capability of separating melody from other parts (considering and ignoring octave errors) and emphasizing f0 detection
- f0 as last year (considering and ignoring octave errors)
- melody segmentation, as proposed by reviewer 2, but this would also mean that ground truth should include accompaniment, which is probably not realistic
- edit distance ?
If ground truth f0 is not estimated accurately enough, then some discretization scheme similar to Emmanuel's suggestions would be appropriate, but I disagree with just MIDI pitches, as they are too coarse, especially with vocal parts.