Introduction
This page presents the results of the 2018 running of the Music and/or Speech Detection tasks. For background information about this task set, please refer to the 2018:Music and/or Speech Detection page.
General Legend
Sub code | Abstract | Contributors
DD1 | PDF | David Doukhan
JHKK1 | PDF | Byeong-Yong Jang, Woon-Haeng Heo, Jung-Hyun Kim, Oh-Wook Kwon
JHKK2 | PDF | Byeong-Yong Jang, Woon-Haeng Heo, Jung-Hyun Kim, Oh-Wook Kwon
JHKK3 | PDF | Byeong-Yong Jang, Woon-Haeng Heo, Jung-Hyun Kim, Oh-Wook Kwon
LN1 | PDF | Minsuk Choi, Jongpil Lee, Juhan Nam
MM1 | PDF | Matija Marolt
MM2 | PDF | Matija Marolt
MM3 | PDF | Matija Marolt
MMG1 | PDF | Blai Meléndez-Catalán, Emilio Molina, Emilia Gómez
MMG2 | PDF | Blai Meléndez-Catalán, Emilio Molina, Emilia Gómez
Statistics notation
<class>_F = segment-level F-measure for the class <class>
<class>_F_500_on = onset-only event-level F-measure (500 ms tolerance) for the class <class>
<class>_F_500_onoff = onset-offset event-level F-measure (500 ms tolerance) for the class <class>
<class>_F_1000_on = onset-only event-level F-measure (1000 ms tolerance) for the class <class>
<class>_F_1000_onoff = onset-offset event-level F-measure (1000 ms tolerance) for the class <class>
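The notation above can be illustrated with a minimal Python sketch. It is not the evaluation code used to produce these results; it only shows one plausible reading of the definitions. The function names (`f_measure`, `segment_f`, `event_f`), the greedy one-to-one matching, and the use of the same tolerance for onsets and offsets are all assumptions made for illustration.

```python
from typing import List, Tuple

Event = Tuple[float, float]  # (onset, offset) in seconds


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when nothing matched."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def segment_f(reference: List[bool], estimate: List[bool]) -> float:
    """<class>_F: compare per-segment activity labels for one class."""
    pairs = list(zip(reference, estimate))
    tp = sum(1 for r, e in pairs if r and e)
    fp = sum(1 for r, e in pairs if not r and e)
    fn = sum(1 for r, e in pairs if r and not e)
    return f_measure(tp, fp, fn)


def event_f(reference: List[Event], estimate: List[Event],
            tolerance: float, match_offset: bool = False) -> float:
    """<class>_F_<tol>_on (match_offset=False) or _onoff (match_offset=True).

    Assumption: each estimated event is greedily matched to the first
    unmatched reference event whose onset (and, optionally, offset)
    lies within `tolerance` seconds.
    """
    matched = set()
    tp = 0
    for est_on, est_off in estimate:
        for i, (ref_on, ref_off) in enumerate(reference):
            if i in matched:
                continue
            onset_ok = abs(est_on - ref_on) <= tolerance
            offset_ok = (not match_offset) or abs(est_off - ref_off) <= tolerance
            if onset_ok and offset_ok:
                matched.add(i)
                tp += 1
                break
    fp = len(estimate) - tp   # estimated events with no reference match
    fn = len(reference) - tp  # reference events that were missed
    return f_measure(tp, fp, fn)


# Hypothetical toy annotations, not taken from any evaluation dataset.
ref_events = [(0.0, 5.0), (10.0, 20.0)]
est_events = [(0.3, 5.2), (10.4, 22.0), (30.0, 31.0)]
f_500_on = event_f(ref_events, est_events, tolerance=0.5)
f_500_onoff = event_f(ref_events, est_events, tolerance=0.5, match_offset=True)
```

In this toy case the second estimated event starts on time but ends two seconds late, so it counts as a hit for the onset-only measure and as a miss for the onset-offset measure. The same effect may explain why the `_onoff` columns in the tables below are consistently lower than the `_on` columns.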
Task 1: Music Detection
Dataset 1
Segment-level Evaluation
Sub code | Accuracy | Music_F | No-Music_F
DD1 | 0.6860 | 0.5424 | 0.7611
JHKK1 | 0.7798 | 0.7123 | 0.8215
JHKK2 | 0.8005 | 0.7415 | 0.8375
LN1 | 0.6251 | 0.5022 | 0.6987
MM1 | 0.6135 | 0.3899 | 0.7172
MM2 | 0.6807 | 0.5478 | 0.7531
MM3 | 0.6075 | 0.3124 | 0.7254
MMG1 | 0.9049 | 0.8996 | 0.9097
Event-level Evaluation
Sub code | Music_F_500_on | Music_F_500_onoff | Music_F_1000_on | Music_F_1000_onoff
DD1 | 0.2877 | 0.093 | 0.312 | 0.1142
JHKK1 | 0.2303 | 0.0765 | 0.294 | 0.1173
JHKK2 | 0.2522 | 0.0931 | 0.3245 | 0.1389
LN1 | 0.1348 | 0.0139 | 0.1704 | 0.0231
MM1 | 0.2044 | 0.0662 | 0.2137 | 0.0831
MM2 | 0.2464 | 0.0817 | 0.2736 | 0.1049
MM3 | 0.1379 | 0.0525 | 0.1619 | 0.0676
MMG1 | 0.5177 | 0.2693 | 0.5813 | 0.3502
Task 2: Speech Detection
Dataset 1
Segment-level Evaluation
Sub code | Accuracy | Speech_F | No-Speech_F
DD1 | 0.877 | 0.9186 | 0.7493
JHKK3 | 0.8307 | 0.8795 | 0.7143
LN1 | 0.6908 | 0.7472 | 0.6007
MM1 | 0.8626 | 0.9115 | 0.6948
MM2 | 0.8619 | 0.909 | 0.713
MM3 | 0.8508 | 0.9086 | 0.5966
Event-level Evaluation
Sub code | Speech_F_500_on | Speech_F_500_onoff | Speech_F_1000_on | Speech_F_1000_onoff
DD1 | 0.415 | 0.1603 | 0.4477 | 0.2122
JHKK3 | 0.2882 | 0.0777 | 0.3289 | 0.0962
LN1 | 0.2686 | 0.0529 | 0.3484 | 0.0883
MM1 | 0.4607 | 0.2068 | 0.4898 | 0.2336
MM2 | 0.4422 | 0.1999 | 0.5093 | 0.266
MM3 | 0.4439 | 0.1775 | 0.4879 | 0.2122
Task 3: Music and Speech Detection
Dataset 1
Segment-level Evaluation
Sub code | Music_F | Speech_F
LN1 | 0.4936 | 0.7718
MM1 | 0.3899 | 0.9115
MM2 | 0.5478 | 0.909
MM3 | 0.3124 | 0.9086
Event-level Evaluation
Sub code | Music_F_500_on | Music_F_500_onoff | Music_F_1000_on | Music_F_1000_onoff | Speech_F_500_on | Speech_F_500_onoff | Speech_F_1000_on | Speech_F_1000_onoff
LN1 | 0.1116 | 0.0088 | 0.1459 | 0.0186 | 0.2645 | 0.0462 | 0.348 | 0.0786
MM1 | 0.2044 | 0.0662 | 0.2137 | 0.0831 | 0.4607 | 0.2068 | 0.4898 | 0.2336
MM2 | 0.2464 | 0.0817 | 0.2736 | 0.1049 | 0.4422 | 0.1999 | 0.5093 | 0.266
MM3 | 0.1379 | 0.0525 | 0.1619 | 0.0676 | 0.4439 | 0.1775 | 0.4879 | 0.2122
Task 4: Music Relative Loudness Estimation
Dataset 1
Segment-level Evaluation
Sub code | Accuracy | Fg-Music_F | Bg-Music_F | No-Music_F
MMG2 | 0.8615 | 0.788 | 0.821 | 0.9064
Event-level Evaluation
Sub code | Fg-Music_F_500_on | Fg-Music_F_500_onoff | Fg-Music_F_1000_on | Fg-Music_F_1000_onoff | Bg-Music_F_500_on | Bg-Music_F_500_onoff | Bg-Music_F_1000_on | Bg-Music_F_1000_onoff | Speech_F_500_on | Speech_F_500_onoff | Speech_F_1000_on | Speech_F_1000_onoff
MMG2 | 0.3298 | 0.1775 | 0.4106 | 0.2742 | 0.3853 | 0.1388 | 0.4463 | 0.2024 | 0.5254 | 0.3123 | 0.5927 | 0.3925