2005:Audio Melody Extr
Problem is reasonably well defined and would be considered interesting in terms of current research.
No mention of audio format or sampling rate; will assume:
- CD-quality (PCM, 16-bit, 44100 Hz)
- 30-second excerpts
- files are named "001.wav" to "999.wav"
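The assumptions above could be checked mechanically before a run. The following is only a sketch using Python's standard `wave` module; the function names `matches_assumed_format` and `expected_names` and the half-second duration tolerance are illustrative choices, not part of the proposal. The channel count is not checked because the proposal does not state it.

```python
import wave


def matches_assumed_format(path, rate=44100, sample_bytes=2,
                           seconds=30.0, tol=0.5):
    """Return True if a WAV file matches the assumed excerpt format.

    Checks sampling rate, sample width (2 bytes = 16-bit) and duration.
    The tolerance `tol` (in seconds) is an assumption made for this sketch.
    """
    with wave.open(path, "rb") as wf:
        duration = wf.getnframes() / wf.getframerate()
        return (wf.getframerate() == rate
                and wf.getsampwidth() == sample_bytes
                and abs(duration - seconds) <= tol)


def expected_names(count):
    """List the assumed file names "001.wav" .. for `count` excerpts."""
    return [f"{i:03d}.wav" for i in range(1, count + 1)]
```

The `wave` module reads uncompressed PCM only, so a file in another encoding raises `wave.Error` instead of returning False.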
No mention of frame size or hop size. Will these be the same as in the 2004 competition (frame size 2048, hop size 256)? Is this optimal? Would some participants prefer to use different sizes? Could the proposed evaluation metrics be modified to use absolute time indexes and a tolerance, and therefore be independent of framing?
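To make the framing-independent idea concrete, here is a minimal sketch of one possible time-based comparison. It is not the proposal's metric: the 10 ms time tolerance, the 50-cent pitch tolerance, the convention that a frequency of 0 marks an unvoiced instant, and the function name `time_based_accuracy` are all assumptions made for illustration.

```python
import bisect
import math


def time_based_accuracy(reference, estimate,
                        time_tol=0.01, pitch_tol_cents=50.0):
    """Fraction of reference points matched by the estimate.

    `reference` and `estimate` are lists of (time_sec, f0_hz) pairs,
    sorted by time, with f0 <= 0 meaning "no melody" (assumed convention).
    Each reference point is compared with the nearest estimate in time;
    a missing estimate within `time_tol` counts as an error.
    """
    if not reference:
        return 0.0
    est_times = [t for t, _ in estimate]
    correct = 0
    for t_ref, f_ref in reference:
        i = bisect.bisect_left(est_times, t_ref)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(estimate)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(est_times[k] - t_ref))
        if abs(est_times[j] - t_ref) > time_tol:
            continue
        f_est = estimate[j][1]
        if f_ref <= 0 or f_est <= 0:
            # Correct only if both agree that no melody is present.
            correct += int(f_ref <= 0 and f_est <= 0)
        elif abs(1200.0 * math.log2(f_est / f_ref)) <= pitch_tol_cents:
            correct += 1
    return correct / len(reference)
```

Because the lookup is by timestamp, a submission may use any frame and hop size as long as it reports a time for each pitch value.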
In the proposed evaluation metrics there is no mention of whether options 1 and 2 will be averaged as they were last year, or of how option 3 will be combined with them. The statistical significance of differences between submissions should be estimated.
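One simple way to estimate significance is a paired test on per-file scores. The sketch below uses an exact two-sided sign test, chosen here only because it needs no distributional assumptions; the proposal does not name a test, and the function name `sign_test_p_value` is hypothetical.

```python
from math import comb


def sign_test_p_value(scores_a, scores_b):
    """Exact two-sided sign test on paired per-file scores.

    Counts the files on which each submission scores higher, drops ties,
    and returns the probability of a split at least this uneven if both
    submissions were equally likely to win on any file.
    """
    wins_a = sum(a > b for a, b in zip(scores_a, scores_b))
    wins_b = sum(a < b for a, b in zip(scores_a, scores_b))
    n = wins_a + wins_b
    if n == 0:
        return 1.0
    k = min(wins_a, wins_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

With only a few dozen test files, small differences between submissions are unlikely to reach significance under this test, which bears on how large the new database should be.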
Re-use and augmentation of last year's database is fine; however, there is no mention of where new data will come from. The Magnatune database would be an obvious source, as it can also be distributed; however, it may be best to distribute last year's database and hold back the new examples. How big should the new database be? 50 files? I assume there are likely to be no trained submissions, or that any will be pre-trained, so a single pass over the data should be fine. There is also no mention of how many non-participating transcribers will produce the ground truth, or of how differences between transcriptions will be resolved. Given the IP status of the Magnatune database, distribution to transcribers should not be a problem.
Given the high number of potential participants, I think we can be confident of sufficient participation to run the evaluation.
Recommendation: Significant refinements to proposal and accept.