Difference between revisions of "2025:Symbolic Music Generation Results"
Zhaojw1998 (talk | contribs)  (→Results)  | 
Revision as of 23:21, 12 September 2025
Submissions
| Team | Extended Abstract | Methods | 
|---|---|---|
| RWKV (Zhou-Zheng et al.) | [1] | RWKV | 
| PixelGen | [2] | Hierarchical Transformer | 
| MuseCoco (BL-1) | [3] | Transformer | 
| Anticipatory Music Transformer (BL-2) | [4] | Transformer | 
Results
Subjective Evaluation

| Team | Coherence ↑ | Structure ↑ | Creativity ↑ | Musicality ↑ |
|---|---|---|---|---|
| RWKV (Zhou-Zheng et al.) | 3.57 ± 0.10<sup>a</sup> | 3.58 ± 0.10<sup>a</sup> | 3.26 ± 0.10<sup>a</sup> | 3.50 ± 0.10<sup>a</sup> |
| PixelGen | 2.39 ± 0.10<sup>c</sup> | 2.37 ± 0.09<sup>c</sup> | 2.85 ± 0.09<sup>b</sup> | 2.48 ± 0.09<sup>c</sup> |
| MuseCoco (BL-1) | 3.11 ± 0.10<sup>b</sup> | 3.07 ± 0.09<sup>b</sup> | 3.08 ± 0.09<sup>ab</sup> | 2.95 ± 0.09<sup>b</sup> |
| Anticipatory Music Transformer (BL-2) | 3.70 ± 0.10<sup>a</sup> | 3.69 ± 0.09<sup>a</sup> | 3.30 ± 0.10<sup>a</sup> | 3.45 ± 0.10<sup>a</sup> |
Notes on Evaluation Results: Results are reported as mean ± sem<sup>s</sup>, where sem is the standard error of the mean and s is the letter (or letters) following each value. Within a column, entries that share no letter differ significantly (p < 0.05) according to a Wilcoxon signed-rank test with Holm-Bonferroni correction; entries that share a letter (for example, "ab" shares a letter with both "a" and "b") do not.
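The reporting convention above can be illustrated with a minimal Python sketch. It is not the organizers' analysis script. It assumes the sem is computed from the sample standard deviation, and it takes the raw p-values as given (they could come from a paired test such as `scipy.stats.wilcoxon`, which is not re-implemented here). The function names `mean_sem` and `holm_bonferroni`, the example ratings, the example p-values, and the choice of 6 pairwise comparisons among 4 systems are all hypothetical.

```python
import math
import statistics


def mean_sem(ratings):
    """Return (mean, standard error of the mean) for a list of ratings."""
    mean = statistics.fmean(ratings)
    # Assumption: sem = sample standard deviation (n - 1 denominator) / sqrt(n).
    sem = statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, sem


def holm_bonferroni(p_values, alpha=0.05):
    """Return Holm-adjusted p-values and reject decisions, in input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Step-down multiplier: m for the smallest p, then m - 1, ..., 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted p-values must not decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    # Strict inequality, matching the "p < 0.05" wording in the notes.
    reject = [p < alpha for p in adjusted]
    return adjusted, reject


# Hypothetical inputs for illustration only; these are not the survey data.
example_ratings = [4, 3, 5, 4, 2, 4, 3, 5]
example_p_values = [0.001, 0.04, 0.03, 0.2, 0.012, 0.5]

mean, sem = mean_sem(example_ratings)
print(f"{mean:.2f} ± {sem:.2f}")

adjusted, reject = holm_bonferroni(example_p_values)
for p, p_adj, r in zip(example_p_values, adjusted, reject):
    print(f"raw p = {p:.3f}, adjusted p = {p_adj:.3f}, significant = {r}")
```

Under this scheme, a pair of systems would receive different letters only when the adjusted p-value for that pair falls below 0.05.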
Notes on Baseline Models: For MuseCoco, we use the *xlarge* model variant with 1.2 billion learnable parameters. For Anticipatory Music Transformer, we use the *Large* model variant with 780 million learnable parameters.
Subjective Evaluation Details: Each test sample was cherry-picked from 8 samples generated from the corresponding prompt. A total of 6 prompts of varied styles (Pop, Classical, and Jazz) were tested, resulting in a 6-page survey. Responses were collected from 20 participants with diverse music backgrounds.