2018:Music and or Speech Detection Results
Revision as of 09:58, 17 September 2018
Introduction
These are the results for the 2018 running of the Music and/or Speech Detection tasks. For background information about this task set, please refer to the 2018:Music and/or Speech Detection page.
General Legend
Sub code | Abstract | Contributors |
---|---|---|
DD1 | | David Doukhan |
JHKK1 | | Byeong-Yong Jang, Woon-Haeng Heo, Jung-Hyun Kim, Oh-Wook Kwon |
JHKK2 | | Byeong-Yong Jang, Woon-Haeng Heo, Jung-Hyun Kim, Oh-Wook Kwon |
JHKK3 | https://www.music-ir.org/mirex/abstracts/2018/JHKK3.pdf | Byeong-Yong Jang, Woon-Haeng Heo, Jung-Hyun Kim, Oh-Wook Kwon |
LN1 | https://www.music-ir.org/mirex/abstracts/2018/LN1.pdf | Minsuk Choi, Jongpil Lee, Juhan Nam |
MM1 | | Matija Marolt |
MM2 | | Matija Marolt |
MM3 | https://www.music-ir.org/mirex/abstracts/2018/MM3.pdf | Matija Marolt |
MMG1 | | Blai Meléndez-Catalán, Emilio Molina, Emilia Gómez |
MMG2 | | Blai Meléndez-Catalán, Emilio Molina, Emilia Gómez |
Task 1: Music Detection
Segment-level Evaluation
Sub code | Accuracy | Music_F | No-Music_F |
---|---|---|---|
DD1 | 0.6860 | 0.5424 | 0.7611 |
JHKK1 | 0.7798 | 0.7123 | 0.8215 |
JHKK2 | 0.8005 | 0.7415 | 0.8375 |
LN1 | 0.6251 | 0.5022 | 0.6987 |
MM1 | 0.6135 | 0.3899 | 0.7172 |
MM2 | 0.6807 | 0.5478 | 0.7531 |
MM3 | 0.6075 | 0.3124 | 0.7254 |
MMG1 | 0.9049 | 0.8996 | 0.9097 |
Event-level Evaluation
Sub code | Music_F_500_on | Music_F_500_onoff | Music_F_1000_on | Music_F_1000_onoff |
---|---|---|---|---|
DD1 | 0.2877 | 0.093 | 0.312 | 0.1142 |
JHKK1 | 0.2303 | 0.0765 | 0.294 | 0.1173 |
JHKK2 | 0.2522 | 0.0931 | 0.3245 | 0.1389 |
LN1 | 0.1348 | 0.0139 | 0.1704 | 0.0231 |
MM1 | 0.2044 | 0.0662 | 0.2137 | 0.0831 |
MM2 | 0.2464 | 0.0817 | 0.2736 | 0.1049 |
MM3 | 0.1379 | 0.0525 | 0.1619 | 0.0676 |
MMG1 | 0.5177 | 0.2693 | 0.5813 | 0.3502 |
Notes on metrics:
Music_F = segment-level F-measure for the music class
No-Music_F = segment-level F-measure for the no-music class
Music_F_500_on = onset-only event-level F-measure (500 ms tolerance) for the music class
Music_F_500_onoff = onset-offset event-level F-measure (500 ms tolerance) for the music class
Music_F_1000_on = onset-only event-level F-measure (1000 ms tolerance) for the music class
Music_F_1000_onoff = onset-offset event-level F-measure (1000 ms tolerance) for the music class
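The segment-level metrics listed above can be illustrated with a minimal Python sketch. It assumes the audio has already been cut into fixed-length segments and that each segment carries one boolean label (True = music). The function names `f_measure` and `segment_metrics` are hypothetical, and the official evaluation may differ in segment length and in how edge cases are handled.

```python
def f_measure(tp, fp, fn):
    """F-measure from raw counts; returns 0.0 when it is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def segment_metrics(reference, estimate):
    """Compare per-segment boolean labels (True = music).

    Assumption: both sequences cover the same fixed-length segments.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")

    pairs = list(zip(reference, estimate))
    tp = sum(1 for r, e in pairs if r and e)          # music detected as music
    tn = sum(1 for r, e in pairs if not r and not e)  # no-music detected as no-music
    fp = sum(1 for r, e in pairs if not r and e)      # no-music detected as music
    fn = sum(1 for r, e in pairs if r and not e)      # music detected as no-music

    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "music_f": f_measure(tp, fp, fn),
        # For the no-music class the roles of the two classes are swapped.
        "no_music_f": f_measure(tn, fn, fp),
    }


reference = [True, True, False, False, True]
estimate = [True, False, False, True, True]
print(segment_metrics(reference, estimate))
```

Because the two F-measures are computed per class, a system can score well on No-Music_F while scoring poorly on Music_F, as several rows of the segment-level table show.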
Task 2: Speech Detection
Segment-level Evaluation
Sub code | Accuracy | Speech_F | No-Speech_F |
---|---|---|---|
DD1 | 0.877 | 0.9186 | 0.7493 |
JHKK3 | 0.8307 | 0.8795 | 0.7143 |
LN1 | 0.6908 | 0.7472 | 0.6007 |
MM1 | 0.8626 | 0.9115 | 0.6948 |
MM2 | 0.8619 | 0.909 | 0.713 |
MM3 | 0.8508 | 0.9086 | 0.5966 |
Event-level Evaluation
Sub code | Speech_F_500_on | Speech_F_500_onoff | Speech_F_1000_on | Speech_F_1000_onoff |
---|---|---|---|---|
DD1 | 0.2877 | 0.093 | 0.312 | 0.1142 |
JHKK3 | 0.2303 | 0.0765 | 0.294 | 0.1173 |
LN1 | 0.1348 | 0.0139 | 0.1704 | 0.0231 |
MM1 | 0.2044 | 0.0662 | 0.2137 | 0.0831 |
MM2 | 0.2464 | 0.0817 | 0.2736 | 0.1049 |
MM3 | 0.1379 | 0.0525 | 0.1619 | 0.0676 |
Notes on metrics:
Speech_F = segment-level F-measure for the speech class
No-Speech_F = segment-level F-measure for the no-speech class
Speech_F_500_on = onset-only event-level F-measure (500 ms tolerance) for the speech class
Speech_F_500_onoff = onset-offset event-level F-measure (500 ms tolerance) for the speech class
Speech_F_1000_on = onset-only event-level F-measure (1000 ms tolerance) for the speech class
Speech_F_1000_onoff = onset-offset event-level F-measure (1000 ms tolerance) for the speech class
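The event-level metrics listed above can be illustrated with a minimal Python sketch. It assumes events are (onset, offset) pairs in seconds, that an estimated event counts as correct when its onset (and, for the onset-offset variant, also its offset) lies within the tolerance of a reference event, and that each reference event can be matched at most once using a simple greedy pass. The function name `event_f_measure` and the greedy matching are assumptions made for illustration; the official evaluation may use a different matching strategy or define the offset tolerance differently.

```python
def event_f_measure(reference, estimate, tolerance=0.5, match_offset=False):
    """Event-level F-measure for lists of (onset, offset) pairs in seconds.

    tolerance:    allowed deviation in seconds (0.5 for the 500 ms variants,
                  1.0 for the 1000 ms variants).
    match_offset: False for onset-only, True for onset-offset matching.
    """
    unmatched = list(reference)  # reference events still available for matching
    matched = 0

    for est_on, est_off in sorted(estimate):
        for ref in unmatched:
            ref_on, ref_off = ref
            onset_ok = abs(est_on - ref_on) <= tolerance
            offset_ok = (not match_offset) or abs(est_off - ref_off) <= tolerance
            if onset_ok and offset_ok:
                unmatched.remove(ref)  # each reference event is used only once
                matched += 1
                break

    total = len(reference) + len(estimate)
    # F = 2 * matches / (number of reference events + number of estimated events)
    return 2 * matched / total if total else 0.0


reference = [(0.0, 5.0), (10.0, 20.0)]
estimate = [(0.3, 5.2), (10.4, 22.0), (30.0, 31.0)]

print(event_f_measure(reference, estimate, tolerance=0.5))                     # onset-only
print(event_f_measure(reference, estimate, tolerance=0.5, match_offset=True))  # onset-offset
```

In this toy example the second estimated event has a correct onset but an offset that is 2 s late, so it counts under onset-only matching and is rejected under onset-offset matching. This is why the onoff columns in the event-level table are consistently lower than the on columns.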