2024:Lyrics-to-Audio Alignment Results
Latest revision as of 08:48, 14 November 2024
Submissions
Sub Code | Extended Abstract | Contributors | Methods |
---|---|---|---|
FZZ1 | | Wanpeng Fan, Jiaye Zhu, Peng Zhong | WavLM + Conformer |
NUS (baseline) | Link | Xiaoxue Gao, Chitralekha Gupta, Haizhou Li | Genre-informed Silence + Phone Model |
Results
Jamendo V1
Jamendo V1 refers to the 20 English songs in the Jamendo dataset (https://github.com/f90/jamendolyrics) with the old annotations. This dataset was used in previous MIREX editions and is kept here to allow a fair comparison with previous submissions.
Group | Average absolute error | Median absolute error | Percentage of correct segments | Percentage of correct onsets with tolerance |
---|---|---|---|---|
FZZ1 | 0.547 | 0.047 | 0.686 | 0.912 |
NUS | 0.217 | 0.046 | 0.751 | 0.945 |
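As a reading aid for the column headers, the three onset-based metrics can be sketched as below. This is a minimal illustration, not the evaluation script used for the task: the function name `onset_metrics`, the example timestamps, and the 0.3 s tolerance are all assumptions, and the percentage of correct segments is omitted because it also needs word end times.

```python
from statistics import mean, median


def onset_metrics(reference, predicted, tolerance=0.3):
    """Compare predicted word onsets to reference onsets (both in seconds).

    Hypothetical helper: the tolerance default of 0.3 s is an assumption,
    not a value stated on this page.
    """
    if not reference or len(reference) != len(predicted):
        raise ValueError("need two non-empty onset lists of equal length")

    # Absolute deviation of each predicted onset from its reference onset.
    errors = [abs(p - r) for r, p in zip(reference, predicted)]

    return {
        "average_absolute_error": mean(errors),
        "median_absolute_error": median(errors),
        # Fraction of onsets whose deviation is within the tolerance.
        "correct_onsets_with_tolerance": sum(e <= tolerance for e in errors) / len(errors),
    }


# Made-up onsets for four words, purely for illustration.
reference = [0.0, 1.0, 2.0, 3.0]
predicted = [0.25, 1.0, 2.5, 3.125]
metrics = onset_metrics(reference, predicted)
print(metrics)
```

Under this reading, lower values are better for the two error columns and higher values are better for the two percentage columns. A single badly misaligned word raises the average far more than the median, which is one way a system can show a large average error next to a small median error.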
Jamendo V2 MultiLang
Jamendo V2 contains all 79 songs in the Jamendo dataset (https://github.com/f90/jamendolyrics) with new annotations.
Group | Average absolute error | Median absolute error | Percentage of correct segments | Percentage of correct onsets with tolerance |
---|---|---|---|---|
FZZ1 | 0.584 | 0.252 | 0.683 | 0.887 |
NUS | 0.651 | 0.136 | 0.502 | 0.729 |
Language-Specific Results
Jamendo V2 En
Group | Average absolute error | Median absolute error | Percentage of correct segments | Percentage of correct onsets with tolerance |
---|---|---|---|---|
FZZ1 | 0.619 | 0.143 | 0.698 | 0.896 |
NUS | 0.216 | 0.046 | 0.784 | 0.947 |
Jamendo V2 Fr
Group | Average absolute error | Median absolute error | Percentage of correct segments | Percentage of correct onsets with tolerance |
---|---|---|---|---|
FZZ1 | 0.371 | 0.045 | 0.661 | 0.897 |
NUS | 0.809 | 0.157 | 0.400 | 0.665 |
Jamendo V2 Gr
Group | Average absolute error | Median absolute error | Percentage of correct segments | Percentage of correct onsets with tolerance |
---|---|---|---|---|
FZZ1 | 0.371 | 0.045 | 0.661 | 0.897 |
NUS | 0.809 | 0.157 | 0.400 | 0.665 |
Jamendo V2 Es
Group | Average absolute error | Median absolute error | Percentage of correct segments | Percentage of correct onsets with tolerance |
---|---|---|---|---|
FZZ1 | 0.452 | 0.039 | 0.703 | 0.905 |
NUS | 0.969 | 0.230 | 0.392 | 0.613 |
Hansen's dataset a cappella
Note that Hansen's dataset and Mauch's dataset overlap with commonly used training sets (e.g., DALI). The results are shown for reference only.
Group | Average absolute error | Median absolute error | Percentage of correct segments | Percentage of correct onsets with tolerance |
---|---|---|---|---|
FZZ1 | 0.101 | 0.044 | 0.783 | 0.971 |
NUS | 0.132 | 0.031 | 0.791 | 0.965 |
Hansen's dataset
Note that Hansen's dataset and Mauch's dataset overlap with commonly used training sets (e.g., DALI). The results are shown for reference only.
Group | Average absolute error | Median absolute error | Percentage of correct segments | Percentage of correct onsets with tolerance |
---|---|---|---|---|
FZZ1 | 3.264 | 3.604 | 0.648 | 0.870 |
NUS | 0.107 | 0.052 | 0.764 | 0.972 |
Mauch's dataset
Note that Hansen's dataset and Mauch's dataset overlap with commonly used training sets (e.g., DALI). The results are shown for reference only.
Group | Average absolute error | Median absolute error | Percentage of correct segments | Percentage of correct onsets with tolerance |
---|---|---|---|---|
FZZ1 | 0.900 | 0.122 | 0.489 | 0.844 |
NUS | 0.192 | 0.098 | 0.478 | 0.910 |
Per-song result
Please see the CSV file (https://futuremirex.com/portal/wp-content/uploads/2024/11/lyrics_alignment_per_song_result.csv).
Remarks
We want to thank the previous task captain, Georgi Dzhambazov, for his support with the evaluation scripts and datasets, and Emir Demirel for his suggestions for the task.