2024:Symbolic Music Generation Results

From MIREX Wiki

Revision as of 11:07, 11 November 2024

Submissions

Team | Extended Abstract | Methods | Methodology
Chart-Accompaniment | PDF (https://futuremirex.com/portal/wp-content/uploads/2024/11/chart_accomp_2024_ISMIR_LBD.pdf) | BART | A BART model generating piano accompaniments using beat-based tokenization.
AccoMontage (BL-1) | PDF | Style Transfer | A hybrid algorithm generating piano accompaniments by rule-based search and music representation learning.
Whole-Song-Gen (BL-2) | PDF | DDPM | A denoising diffusion probabilistic model (DDPM) generating piano accompaniments as piano-roll images.
Compose-&-Embellish (BL-3) | PDF | Transformer | A Transformer-based architecture generating piano performances in beat-based event sequences.

Results

Team | Coherency ↑ (subjective) | Naturalness ↑ (subjective) | Creativity ↑ (subjective) | Musicality ↑ (subjective) | NLL ↓ (objective)
Chart-Accompaniment | 1.92 ± 0.11ᵈ | 1.87 ± 0.10ᶜ | 2.62 ± 0.13ᶜ | 2.01 ± 0.11ᶜ | 4.12 ± 0.12ᶜ
AccoMontage (BL-1) | 3.77 ± 0.11ᵃ | 3.59 ± 0.11ᵃ | 3.65 ± 0.11ᵃ | 3.63 ± 0.12ᵃ | 2.48 ± 0.07ᵃ
Whole-Song-Gen (BL-2) | 3.59 ± 0.11ᵇ | 3.24 ± 0.11ᵇ | 3.66 ± 0.10ᵃ | 3.47 ± 0.13ᵇ | 2.87 ± 0.08ᵇ
Compose-&-Embellish (BL-3) | 3.39 ± 0.10ᶜ | 3.38 ± 0.12ᵇ | 3.13 ± 0.10ᵇ | 3.36 ± 0.11ᵇ | 7.41 ± 0.07ᵈ

Note: Results are reported as mean ± SEM (standard error of the mean), with each value followed by a letter. Within a column, different letters indicate significant differences (p < 0.05) based on a Wilcoxon signed-rank test.
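As an illustration of the mean ± SEM format used in the table, the following is a minimal sketch, assuming the SEM is the sample standard deviation (n - 1 in the denominator) divided by the square root of the number of ratings. The function name `mean_and_sem` and the ratings are hypothetical and are not the MIREX evaluation data. The significance letters would additionally require paired Wilcoxon signed-rank tests (available, for example, as `scipy.stats.wilcoxon`), which this sketch leaves out.

```python
import math
import statistics


def mean_and_sem(ratings):
    """Return (mean, standard error of the mean) for a list of ratings."""
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings to estimate the SEM")
    mean = statistics.fmean(ratings)
    # Assumption: SEM = sample standard deviation / sqrt(n).
    sem = statistics.stdev(ratings) / math.sqrt(n)
    return mean, sem


# Hypothetical ratings for illustration only (not the MIREX responses).
example_ratings = [4, 3, 5, 4, 2, 4, 3, 5]
mean, sem = mean_and_sem(example_ratings)
print(f"{mean:.2f} ± {sem:.2f}")
```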

Objective Evaluation Details: Each model generates 16 samples for each of 6 test pieces. Negative Log Likelihood (NLL) is computed by inputting the melody and accompaniment into the MuseCoco 1B model.
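The page does not spell out how the NLL is aggregated, so the following is only a sketch of one common convention: the average negative natural-log probability that a language model assigns to each token of a sequence. The function name `average_nll` and the toy probabilities are hypothetical; obtaining real per-token probabilities from MuseCoco is outside the scope of this sketch.

```python
import math


def average_nll(token_probs):
    """Average negative log-likelihood (natural log) over a token sequence.

    token_probs: the probability the model assigned to each observed token.
    Assumption: per-token averaging; the actual evaluation may aggregate differently.
    """
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    for p in token_probs:
        if not 0.0 < p <= 1.0:
            raise ValueError("each probability must lie in (0, 1]")
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


# Toy probabilities for illustration only (not model output).
toy_probs = [0.5, 0.25, 0.125]
nll = average_nll(toy_probs)
print(f"average NLL: {nll:.3f}")
```

Under this convention a lower value means the scoring model finds the sequence more predictable, which matches the ↓ direction in the results table.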

Subjective Evaluation Details: One sample is cherry-picked from the 16 samples of each test piece, resulting in 6 pages of questions. We collected responses from 22 participants (18 complete submissions and 4 partial submissions). For complete submissions, the average completion time was 16 min 59 s.