Ground Truth

Depending on how the mood taxonomy is set up, there are two ways to obtain ground truth for evaluation purposes.

1. Human judgment: we can elicit subjective judgments from human evaluators using an online application comparable to IMIRSEL's Evalutron 6000. Details need to be discussed further. To start, we propose that human evaluators choose one mood label from a set for each music piece. Each piece would be judged by at least 3 evaluators, and a label with at least 2 votes will be assigned to the piece as ground truth. Of course there will be disagreement and, depending on the number of available categories, the votes for some pieces may be too scattered, which would invalidate the judgments on those pieces.

2. Collect labels from popular music websites. A problem is that AMG only provides labels for albums. Even if labels for tracks are available, they might not be available for the pieces in our contest dataset.

3. Yet another (good) way of obtaining ground truth: obtain datasets used in existing research. Those datasets have been labeled by individual researchers.
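
The voting rule proposed for human judgment (at least 3 evaluators per piece, and a label needs at least 2 votes) could be sketched roughly as follows. This is only an illustration: the function name, the parameter names, and the example mood labels are hypothetical, and treating a tie for the top label as "too scattered" is an assumption, since the proposal does not say how ties should be handled.

```python
from collections import Counter


def assign_ground_truth(votes, min_judges=3, min_votes=2):
    """Return the agreed mood label for one piece, or None if no label qualifies.

    votes: list of mood labels, one per human evaluator (hypothetical format).
    """
    # A piece needs enough independent judgments before it can be labeled.
    if len(votes) < min_judges:
        return None

    # Labels sorted from most to least voted.
    ranked = Counter(votes).most_common()
    top_label, top_count = ranked[0]

    # Votes are too scattered: no label reached the required count.
    if top_count < min_votes:
        return None

    # Assumption: a tie for first place is treated as no agreement.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None

    return top_label


if __name__ == "__main__":
    # Hypothetical judgments for three pieces.
    print(assign_ground_truth(["happy", "happy", "sad"]))
    print(assign_ground_truth(["happy", "sad", "calm"]))
    print(assign_ground_truth(["happy", "happy", "sad", "sad"]))
```

With more mood categories, more pieces are likely to end up without a label under such a rule, so the thresholds would probably need to be revisited once the taxonomy is fixed.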

