2024:Polyphonic Transcription Results
From MIREX Wiki
Submissions
Group | Methods Used | Summary Onset F1 | Holistic Note F1 |
---|---|---|---|
wlazbzfll | CRNN + regression onset & offset | 0.7066 | 0.1607 |
teamWLY | FiLM with CNN + LSTM + regression onset & offset | 0.9592 | 0.8465 |
Transkun V2 (Baseline 1) | ViT + Neural SemiCRF | 0.9490 | 0.8764 |
Transkun V2 Aug (Baseline 2) | ViT + Neural SemiCRF + Data Augmentation | 0.9648 | 0.9081 |
hFT-Transformer | Two stacks of Transformers for Frequency and Time + regression onset & offset | 0.9416 | 0.8359 |
- Summary Onset F1 = average of the Note Onset F1 over all three datasets (Maestro, MAPS, and SMD)
- Holistic Note F1 = average of the Note Onset+Offset+vel. F1 over Maestro and SMD
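The two summary columns can be recomputed from the per-dataset tables. The following is a minimal sketch, assuming the definitions in the notes above; the function names and dictionary layout are illustrative and not part of any MIREX tooling. The values are the Transkun V2 (Baseline 1) F1 scores copied from the detailed results on this page.

```python
from statistics import mean

# Per-dataset F1 scores for Transkun V2 (Baseline 1), copied from the
# detailed result tables on this page.
NOTE_ONSET_F1 = {"Maestro": 0.9832, "MAPS": 0.8849, "SMD": 0.9790}
NOTE_ONSET_OFFSET_VEL_F1 = {"Maestro": 0.9296, "MAPS": 0.4446, "SMD": 0.8232}


def summary_onset_f1(onset_f1):
    """Average the note onset F1 over all three datasets."""
    return mean(onset_f1[d] for d in ("Maestro", "MAPS", "SMD"))


def holistic_note_f1(full_f1):
    """Average the onset+offset+velocity F1 over Maestro and SMD only."""
    return mean(full_f1[d] for d in ("Maestro", "SMD"))


print(f"{summary_onset_f1(NOTE_ONSET_F1):.4f}")             # 0.9490
print(f"{holistic_note_f1(NOTE_ONSET_OFFSET_VEL_F1):.4f}")  # 0.8764
```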
Detailed Results on Maestro
Team | Activation prec | Activation recall | Activation F1 | Note Onset prec | Note Onset recall | Note Onset F1 | Note Onset+Offset prec | Note Onset+Offset recall | Note Onset+Offset F1 | Note Onset+Offset+vel. prec | Note Onset+Offset+vel. recall | Note Onset+Offset+vel. F1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
wlazbzfll | 0.7324 | 0.7673 | 0.7445 | 0.7411 | 0.6511 | 0.6845 | 0.3537 | 0.3158 | 0.3294 | 0.1527 | 0.1353 | 0.1415 |
teamWLY | 0.953 | 0.9261 | 0.939 | 0.9944 | 0.9719 | 0.9829 | 0.9277 | 0.907 | 0.9171 | 0.9061 | 0.886 | 0.8958 |
Transkun V2 | 0.9576 | 0.9489 | 0.953 | 0.9956 | 0.9714 | 0.9832 | 0.9465 | 0.9238 | 0.9349 | 0.9411 | 0.9186 | 0.9296 |
Transkun V2 Aug | 0.9495 | 0.9522 | 0.9505 | 0.9971 | 0.9715 | 0.984 | 0.9437 | 0.9197 | 0.9314 | 0.9386 | 0.9149 | 0.9264 |
hFT-Transformer | 0.9537 | 0.9108 | 0.9307 | 0.9964 | 0.9545 | 0.9745 | 0.9244 | 0.8861 | 0.9045 | 0.9144 | 0.8768 | 0.8948 |

Team | Pedal activation prec | Pedal activation recall | Pedal activation F1 | Pedal onset prec | Pedal onset recall | Pedal onset F1 | Pedal onset+offset prec | Pedal onset+offset recall | Pedal onset+offset F1 |
---|---|---|---|---|---|---|---|---|---|
wlazbzfll | 0.9429 | 0.9432 | 0.9419 | 0.7775 | 0.7812 | 0.7785 | 0.7377 | 0.7405 | 0.7383 |
teamWLY | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Transkun V2 | 0.9671 | 0.9453 | 0.9541 | 0.8909 | 0.8421 | 0.8642 | 0.8632 | 0.8165 | 0.8377 |
Transkun V2 Aug | 0.9546 | 0.9416 | 0.9454 | 0.8883 | 0.8116 | 0.8453 | 0.8497 | 0.7790 | 0.8102 |
hFT-Transformer | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
- Offset deviations on MAPS deviate strongly from a normal distribution, suggesting potential annotation issues
- N/A in the pedal columns means that the transcribed results contain no pedal events
- Aug means the model was trained with data augmentation
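When these tables are processed programmatically, the N/A cells need to stay distinct from a score of zero. A minimal sketch, assuming rows are pipe-delimited as on this page; `parse_row` is a hypothetical helper, not part of any MIREX tooling.

```python
from typing import Optional


def parse_row(line: str) -> tuple[str, list[Optional[float]]]:
    """Split a pipe-delimited result row into (team, scores).

    N/A cells become None so that a missing pedal transcription is not
    mistaken for a score of 0.0.
    """
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    team, values = cells[0], cells[1:]
    scores = [None if v == "N/A" else float(v) for v in values]
    return team, scores


team, scores = parse_row(
    "teamWLY | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |"
)
print(team, scores.count(None))  # teamWLY 9
```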
Detailed Results on MAPS
Team | Activation prec | Activation recall | Activation F1 | Note Onset prec | Note Onset recall | Note Onset F1 | Note Onset+Offset prec | Note Onset+Offset recall | Note Onset+Offset F1 | Note Onset+Offset+vel. prec | Note Onset+Offset+vel. recall | Note Onset+Offset+vel. F1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
wlazbzfll | 0.8021 | 0.7338 | 0.7626 | 0.741 | 0.6822 | 0.7024 | 0.4421 | 0.4166 | 0.4243 | 0.1793 | 0.1698 | 0.1724 |
teamWLY | 0.9049 | 0.8394 | 0.8691 | 0.918 | 0.9008 | 0.9089 | 0.6896 | 0.6772 | 0.683 | 0.4653 | 0.4571 | 0.461 |
Transkun V2 | 0.887 | 0.8252 | 0.8535 | 0.8671 | 0.9048 | 0.8849 | 0.6325 | 0.6613 | 0.6461 | 0.4351 | 0.4551 | 0.4446 |
Transkun V2 Aug | 0.9446 | 0.8334 | 0.8843 | 0.9396 | 0.9056 | 0.9219 | 0.7105 | 0.6854 | 0.6975 | 0.5596 | 0.5401 | 0.5495 |
hFT-Transformer | 0.916 | 0.716 | 0.8019 | 0.8747 | 0.8841 | 0.8789 | 0.5689 | 0.5756 | 0.5719 | 0.3592 | 0.3636 | 0.3612 |

Team | Pedal activation prec | Pedal activation recall | Pedal activation F1 | Pedal onset prec | Pedal onset recall | Pedal onset F1 | Pedal onset+offset prec | Pedal onset+offset recall | Pedal onset+offset F1 |
---|---|---|---|---|---|---|---|---|---|
wlazbzfll | 0.8565 | 0.8217 | 0.833 | 0.5164 | 0.582 | 0.539 | 0.3301 | 0.378 | 0.3477 |
teamWLY | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Transkun V2 | 0.8498 | 0.8547 | 0.8449 | 0.6521 | 0.7182 | 0.6732 | 0.4903 | 0.5427 | 0.5088 |
Transkun V2 Aug | 0.8893 | 0.8389 | 0.8583 | 0.7313 | 0.7529 | 0.7343 | 0.5499 | 0.5650 | 0.5532 |
hFT-Transformer | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
- Offset deviations on MAPS deviate strongly from a normal distribution, suggesting potential annotation issues
- N/A in the pedal columns means that the transcribed results contain no pedal events
- Aug means the model was trained with data augmentation
Detailed Results on SMD
Team | Activation prec | Activation recall | Activation F1 | Note Onset prec | Note Onset recall | Note Onset F1 | Note Onset+Offset prec | Note Onset+Offset recall | Note Onset+Offset F1 | Note Onset+Offset+vel. prec | Note Onset+Offset+vel. recall | Note Onset+Offset+vel. F1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
wlazbzfll | 0.7766 | 0.7862 | 0.7775 | 0.751 | 0.7323 | 0.7329 | 0.3915 | 0.3958 | 0.3895 | 0.1704 | 0.1696 | 0.1682 |
teamWLY | 0.9237 | 0.9357 | 0.9289 | 0.9945 | 0.9771 | 0.9857 | 0.9005 | 0.8849 | 0.8926 | 0.8042 | 0.7902 | 0.7971 |
Transkun V2 | 0.9203 | 0.9491 | 0.934 | 0.9816 | 0.9766 | 0.979 | 0.9013 | 0.8968 | 0.899 | 0.8255 | 0.8211 | 0.8232 |
Transkun V2 Aug | 0.9389 | 0.9518 | 0.9448 | 0.997 | 0.9801 | 0.9884 | 0.9284 | 0.9128 | 0.9205 | 0.8974 | 0.8823 | 0.8897 |
hFT-Transformer | 0.9291 | 0.8968 | 0.9115 | 0.9875 | 0.9563 | 0.9713 | 0.8834 | 0.856 | 0.8692 | 0.7897 | 0.7651 | 0.7769 |

Team | Pedal activation prec | Pedal activation recall | Pedal activation F1 | Pedal onset prec | Pedal onset recall | Pedal onset F1 | Pedal onset+offset prec | Pedal onset+offset recall | Pedal onset+offset F1 |
---|---|---|---|---|---|---|---|---|---|
wlazbzfll | 0.9402 | 0.9459 | 0.9415 | 0.7921 | 0.8018 | 0.7956 | 0.727 | 0.735 | 0.7299 |
teamWLY | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Transkun V2 | 0.9364 | 0.9507 | 0.942 | 0.8722 | 0.8101 | 0.8388 | 0.803 | 0.7471 | 0.773 |
Transkun V2 Aug | 0.9491 | 0.9428 | 0.9447 | 0.8788 | 0.8043 | 0.8383 | 0.8208 | 0.7526 | 0.7837 |
hFT-Transformer | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
- Offset deviations on MAPS deviate strongly from a normal distribution, suggesting potential annotation issues
- N/A in the pedal columns means that the transcribed results contain no pedal events
- Aug means the model was trained with data augmentation