Difference between revisions of "2007:Multiple Fundamental Frequency Estimation & Tracking"

From MIREX Wiki

Revision as of 16:19, 5 February 2007

Description

A complex music signal can be represented by the F0 contours of its constituent sources, a representation that is useful in most music information retrieval systems. There have been many attempts at multi-F0 estimation and at the related problem of melody extraction. The goal of multiple F0 tracking is to extract the F0 contour of each source from a complex music signal. In this task we would like to evaluate state-of-the-art multi-F0 tracking algorithms. Since F0 tracking of all sources in a complex audio mixture can be very hard, we have to restrict the problem space. The possible cases are:

1. Multiple instruments active at the same time but each playing monophonically (one note at a time) and each instrument having a different timbre in a single channel input.

2. Multiple sources playing polyphonically (e.g. chords…) in a single channel input.

3. Multiple sources playing polyphonically in a stereo panned mixture.

We are most interested in the first case, which is more general yet still feasible. The third case, which is a subset of the first case, should be considered as a subtask: in most professional recordings, sources are recorded individually and panned across two stereo channels, and researchers should take advantage of that.

Data

Since extracting the F0 contours of all sources is a challenging task, the number of sources should be limited to 4-5 pitched instruments (no percussion). Annotating the ground truth data is an important issue. One option is to start with MIDI files and use a realistic synthesizer to create the audio, so that the ground truth is completely accurate. A real-world data set could be the RWC database, but this database is already public. Please make your recommendations on creating a database for this task.
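As an illustration of the MIDI-based option, the sketch below turns the note list of one monophonic source into a frame-level F0 contour. It is only a sketch under stated assumptions: the (onset, offset, note number) tuple layout, the 10 ms hop, the use of 0.0 for unvoiced frames, and the function names are all hypothetical choices, not part of this proposal. The conversion itself is the standard equal-tempered mapping with A4 (note 69) at 440 Hz, so it would only match the audio if the synthesizer uses that tuning and adds no vibrato or pitch bend.

```python
def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note number (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def notes_to_f0_contour(notes, hop=0.01, duration=None):
    """Build a frame-level F0 contour for ONE monophonic source.

    notes    : list of (onset_sec, offset_sec, midi_note) tuples (assumed layout)
    hop      : frame spacing in seconds (assumed value, 10 ms)
    duration : total length in seconds; defaults to the last note offset
    Returns a list with one F0 value in Hz per frame, 0.0 where unvoiced.
    """
    if duration is None:
        duration = max(offset for _, offset, _ in notes)
    n_frames = int(round(duration / hop))
    contour = [0.0] * n_frames
    for onset, offset, note in notes:
        start = int(round(onset / hop))
        end = min(int(round(offset / hop)), n_frames)
        for i in range(start, end):
            contour[i] = midi_to_hz(note)
    return contour


# Hypothetical two-note example: A4 followed, after a short gap, by A3.
example_notes = [(0.00, 0.02, 69), (0.03, 0.05, 57)]
example_contour = notes_to_f0_contour(example_notes)
```

One contour per source, built this way from each MIDI track, would give a set of ground truth contours for a synthesized mixture.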


Evaluation

The evaluation will be similar to the previous Audio Melody Extraction tasks, based on voicing and F0 detection for each source. Each F0 contour that the proposed system extracts from a song will be scored against the ground truth contour for that song that yields the highest score. A second score, based only on the raw per-frame frequency estimates without tracking, will also be reported.
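The two scores described above could be computed along the following lines. This is a rough sketch, not the task's official metric: the quarter-tone (50 cent) tolerance, the convention that 0.0 marks an unvoiced frame, the choice of frame accuracy as the per-contour score, the greedy per-frame matching, and all function names are assumptions made for illustration.

```python
import math


def cents_apart(f_est, f_ref):
    """Absolute pitch distance between two positive frequencies, in cents."""
    return abs(1200.0 * math.log2(f_est / f_ref))


def contour_score(est, ref, tol_cents=50.0):
    """Fraction of frames where an estimated contour agrees with a reference.

    A frame counts as correct if both are unvoiced (0.0), or both are voiced
    and within tol_cents of each other (assumed tolerance).
    """
    n = min(len(est), len(ref))
    if n == 0:
        return 0.0
    correct = 0
    for e, r in zip(est[:n], ref[:n]):
        if e <= 0 and r <= 0:
            correct += 1
        elif e > 0 and r > 0 and cents_apart(e, r) <= tol_cents:
            correct += 1
    return correct / n


def best_match_scores(est_contours, ref_contours, tol_cents=50.0):
    """Score each estimated contour against its best-matching ground truth contour."""
    return [max(contour_score(e, r, tol_cents) for r in ref_contours)
            for e in est_contours]


def raw_frame_precision_recall(est_contours, ref_contours, tol_cents=50.0):
    """Frame-level precision and recall of F0 estimates, ignoring tracking."""
    n = min(len(c) for c in est_contours + ref_contours)
    tp = n_est = n_ref = 0
    for t in range(n):
        est = [c[t] for c in est_contours if c[t] > 0]
        ref = [c[t] for c in ref_contours if c[t] > 0]
        n_est += len(est)
        n_ref += len(ref)
        unmatched = list(ref)
        for f in est:
            for r in unmatched:
                if cents_apart(f, r) <= tol_cents:
                    unmatched.remove(r)  # each reference F0 is matched at most once
                    tp += 1
                    break
    precision = tp / n_est if n_est else 0.0
    recall = tp / n_ref if n_ref else 0.0
    return precision, recall


# Hypothetical four-frame example with two sources.
ref_contours = [[440.0, 440.0, 0.0, 0.0], [0.0, 220.0, 220.0, 220.0]]
est_contours = [[0.0, 221.0, 220.0, 220.0], [440.0, 0.0, 0.0, 0.0]]
tracking_scores = best_match_scores(est_contours, ref_contours)
precision, recall = raw_frame_precision_recall(est_contours, ref_contours)
```

Note that taking the maximum independently for each estimated contour allows two estimates to be scored against the same ground truth contour; whether that should be permitted is a design question the proposal leaves open.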