2016:Singing Voice Separation Results

From MIREX Wiki

Revision as of 03:41, 3 August 2016

Introduction

Description

These are the results for the 2016 running of the Singing Voice Separation task set. The evaluation set was kindly provided by iKala (http://mac.citi.sinica.edu.tw/ikala/). If you cite this page, please also cite T.-S. Chan, T.-C. Yeh, Z.-C. Fan, H.-W. Chen, L. Su, Y.-H. Yang, and R. Jang, "Vocal activity informed singing voice separation with the iKala dataset," in Proc. IEEE Int. Conf. Acoust., Speech and Signal Process., 2015, pp. 718-722. For more information about this task set, please refer to the 2016:Singing Voice Separation page.

Legend

Submission code | Submission name | Abstract PDF | Contributors
GD1 | Harmonic Modeling of Singing Voice for Source Separation | PDF | Georgi Dzhambazov
HC1 | MIREX 2016 Submission for Singing Voice Separation | PDF | Yi-Chun Huang, Tai-Shih Chi
LCP1 | Deep Clustering for Singing Voice Separation | PDF | Yi Luo, Zhuo Chen, John R. Hershey, Jonathan Le Roux, Daniel P. W. Ellis
LCP2 | Deep Clustering for Singing Voice Separation | PDF | Yi Luo, Zhuo Chen, John R. Hershey, Jonathan Le Roux, Daniel P. W. Ellis
MC2 | MIREX 2016 Submission for Singing Voice Separation | - | Marius Miron, Pritish Chandna
MC3 | MIREX 2016 Submission for Singing Voice Separation | - | Marius Miron, Pritish Chandna
RSGP1 | Singing Voice Separation Using Deep Neural Networks and F0 Estimation | PDF | Gerard Roma, Emad M. Grais, Andrew J. R. Simpson, Mark D. Plumbley

Evaluation Criteria

GNSDR = Global Normalized Signal-to-Distortion Ratio
NSDR = Normalized Signal-to-Distortion Ratio
SIR = Signal-to-Interference Ratio
SAR = Signal-to-Artifacts Ratio
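
As a rough illustration of how the two normalized measures relate, the sketch below assumes the definitions commonly used in the singing voice separation literature: NSDR is the SDR of the estimate minus the SDR of the unprocessed mixture, both measured against the same reference, and GNSDR is the mean of the per-clip NSDR values weighted by clip length. The function names and the example numbers are hypothetical, and the SDR values are taken as given rather than computed with BSS Eval.

```python
from typing import Sequence


def nsdr(sdr_estimate: float, sdr_mixture: float) -> float:
    """NSDR in dB: SDR of the estimate minus SDR of the raw mixture (assumed definition)."""
    return sdr_estimate - sdr_mixture


def gnsdr(nsdr_values: Sequence[float], clip_lengths: Sequence[float]) -> float:
    """GNSDR in dB: mean of per-clip NSDR values weighted by clip length (assumed definition)."""
    if len(nsdr_values) != len(clip_lengths) or len(nsdr_values) == 0:
        raise ValueError("need exactly one clip length per NSDR value")
    total_length = sum(clip_lengths)
    if total_length <= 0:
        raise ValueError("clip lengths must sum to a positive value")
    weighted = sum(w * n for w, n in zip(clip_lengths, nsdr_values))
    return weighted / total_length


# Hypothetical clips: (SDR of estimate in dB, SDR of mixture in dB, length in seconds).
# These numbers are made up for illustration and are not taken from the results on this page.
clips = [(8.0, 2.0, 30.0), (5.0, 1.0, 30.0), (3.0, 4.0, 30.0)]

per_clip = [nsdr(est, mix) for est, mix, _ in clips]
overall = gnsdr(per_clip, [seconds for _, _, seconds in clips])
print(per_clip, overall)
```

Under these assumed definitions, GNSDR reduces to the plain mean of the NSDR values whenever all clips have the same length.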

Summary

Summary Results

Algorithm | Voice GNSDR (dB) | Music GNSDR (dB) | Runtime (m)
GD1 | -2.2810 | 0.3954 | 26.4413
HC1 | 4.6309 | 7.8180 | 28.9727
LCP1 | 6.0726 | 10.9256 | 37.8235
LCP2 | 6.3414 | 11.1878 | 32.4800
MC2 | 5.2891 | 9.6678 | 34.8084
MC3 | 5.4920 | 9.8049 | 36.7194
RSGP1 | 3.2589 | 8.7664 | 32.3578
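
For readers who want to sort the summary results, the following minimal sketch copies the values from the table above into a string and ranks the submissions by voice and by music GNSDR. The variable names are illustrative only, and the parsing assumes whitespace-separated columns.

```python
# Summary results copied from the table above:
# submission code, voice GNSDR (dB), music GNSDR (dB), runtime (m).
SUMMARY = """\
GD1 -2.2810 0.3954 26.4413
HC1 4.6309 7.8180 28.9727
LCP1 6.0726 10.9256 37.8235
LCP2 6.3414 11.1878 32.4800
MC2 5.2891 9.6678 34.8084
MC3 5.4920 9.8049 36.7194
RSGP1 3.2589 8.7664 32.3578
"""

rows = []
for line in SUMMARY.splitlines():
    code, voice, music, runtime = line.split()
    rows.append((code, float(voice), float(music), float(runtime)))

# Higher GNSDR is better, so sort in descending order.
by_voice = sorted(rows, key=lambda r: r[1], reverse=True)
by_music = sorted(rows, key=lambda r: r[2], reverse=True)

print("Ranked by voice GNSDR:", [r[0] for r in by_voice])
print("Ranked by music GNSDR:", [r[0] for r in by_music])
```

In both rankings LCP2 comes first and GD1 last; the two orderings differ only in the relative positions of HC1 and RSGP1.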

NSDR

For the Singing Voice (dB)

Algorithm Mean SD Min Max Median
GD1 -2.281 3.534 -11.740 6.623 -1.935
HC1 4.631 2.903 -1.127 14.260 3.883
LCP1 6.073 3.462 -1.658 17.170 5.649
LCP2 6.341 3.370 -1.958 17.240 5.997
MC2 5.289 2.914 -1.302 12.571 4.945
MC3 5.492 2.881 -0.453 12.448 5.195
RSGP1 3.259 3.617 -14.027 12.045 3.304

For the Music Accompaniment (dB)

Algorithm Mean SD Min Max Median
GD1 0.395 1.470 -2.260 5.825 0.211
HC1 7.818 2.647 -1.258 15.861 8.046
LCP1 10.926 3.835 0.742 19.960 10.883
LCP2 11.188 3.626 2.508 19.875 11.087
MC2 9.668 3.676 -7.875 22.734 9.926
MC3 9.805 3.944 -7.679 22.453 10.099
RSGP1 8.766 3.828 -6.905 18.065 8.966
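
The GNSDR figures in the summary results appear to agree with the NSDR means in the two tables above once rounded to three decimals. The short check below, with values copied from this page and a hypothetical tolerance of half a unit in the third decimal place, makes that comparison explicit. It is only a consistency check on the published numbers, not a recomputation of the metrics.

```python
# (voice GNSDR, music GNSDR) from the summary results, in dB.
gnsdr_summary = {
    "GD1": (-2.2810, 0.3954), "HC1": (4.6309, 7.8180),
    "LCP1": (6.0726, 10.9256), "LCP2": (6.3414, 11.1878),
    "MC2": (5.2891, 9.6678), "MC3": (5.4920, 9.8049),
    "RSGP1": (3.2589, 8.7664),
}

# (voice NSDR mean, music NSDR mean) from the NSDR tables, in dB.
nsdr_mean = {
    "GD1": (-2.281, 0.395), "HC1": (4.631, 7.818),
    "LCP1": (6.073, 10.926), "LCP2": (6.341, 11.188),
    "MC2": (5.289, 9.668), "MC3": (5.492, 9.805),
    "RSGP1": (3.259, 8.766),
}

TOLERANCE = 0.0005 + 1e-9  # half a unit in the third decimal, plus float slack

mismatches = [
    code
    for code, (voice, music) in gnsdr_summary.items()
    if abs(voice - nsdr_mean[code][0]) > TOLERANCE
    or abs(music - nsdr_mean[code][1]) > TOLERANCE
]
print("Submissions whose GNSDR differs from the NSDR mean:", mismatches)
```

An empty list here is what one would expect if every evaluation clip carried the same weight in the GNSDR average.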

Boxplots

2016-svs-nsdr.png

SIR

For the Singing Voice (dB)

Algorithm Mean SD Min Max Median
GD1 6.562 9.778 -30.324 24.205 9.043
HC1 9.762 10.006 -28.081 23.070 12.713
LCP1 13.822 10.320 -24.604 26.302 16.844
LCP2 14.518 10.163 -23.923 26.606 17.467
MC2 10.471 10.189 -29.559 28.854 12.621
MC3 10.844 10.314 -28.896 29.313 13.346
RSGP1 16.240 10.743 -26.850 33.092 18.471

For the Music Accompaniment (dB)

Algorithm Mean SD Min Max Median
GD1 1.984 9.805 -11.864 37.146 0.840
HC1 8.978 9.299 -2.271 42.987 7.576
LCP1 24.015 7.915 5.929 47.039 23.596
LCP2 25.170 7.220 11.146 48.129 24.578
MC2 19.811 6.984 2.837 42.629 19.143
MC3 19.609 7.091 2.366 43.881 18.602
RSGP1 17.737 6.484 4.017 42.258 16.593

Boxplots

2016-svs-sir.png

SAR

For the Singing Voice (dB)

Algorithm Mean SD Min Max Median
GD1 2.394 4.562 -12.304 9.967 3.465
HC1 10.768 7.287 -19.592 20.876 12.282
LCP1 10.183 8.150 -23.437 20.348 12.538
LCP2 10.137 8.305 -22.992 20.063 12.235
MC2 11.243 7.441 -18.240 20.293 12.813
MC3 11.248 7.498 -15.216 20.749 12.811
RSGP1 6.627 7.140 -24.741 14.289 8.313

For the Music Accompaniment (dB)

Algorithm Mean SD Min Max Median
GD1 2.708 2.661 -6.024 10.608 2.948
HC1 8.019 3.476 -3.232 16.401 7.483
LCP1 7.168 3.732 -2.080 17.939 7.087
LCP2 7.346 3.483 -0.713 18.004 7.451
MC2 6.106 3.537 -2.182 13.294 6.145
MC3 6.288 3.507 -3.431 13.943 6.559
RSGP1 5.286 2.539 -1.849 13.933 5.373

Boxplots

2016-svs-sar.png

Runtime Data

Submission code | Runtime (m)
GD1 | 26.4413
HC1 | 28.9727
LCP1 | 37.8235
LCP2 | 32.4800
MC2 | 34.8084
MC3 | 36.7194
RSGP1 | 32.3578
