2025:Symbolic Music Generation Results

Latest revision as of 05:12, 1 July 2026

Submissions

Team	Extended Abstract	Methods
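RWKV (Zhou-Zheng et al.)	PDF (http://futuremirex.com/portal/wp-content/uploads/2025/symbolic-music-generation/RWKV.pdf)	RWKV
PixelGen	PDF (http://futuremirex.com/portal/wp-content/uploads/2025/symbolic-music-generation/PixelGen.pdf)	Hierarchical Transformer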
MuseCoco (BL-1)	[1]	Transformer
Anticipatory Music Transformer (BL-2)	[2]	Transformer

Results

Subjective Evaluation
Team	Coherence ↑	Structure ↑	Creativity ↑	Musicality ↑
RWKV (Zhou-Zheng et al.)	3.57 ± 0.10^a	3.58 ± 0.10^a	3.26 ± 0.10^a	3.50 ± 0.10^a
PixelGen	2.39 ± 0.10^c	2.37 ± 0.09^c	2.85 ± 0.09^b	2.48 ± 0.09^c
MuseCoco (BL-1)	3.11 ± 0.10^b	3.07 ± 0.09^b	3.08 ± 0.09^ab	2.95 ± 0.09^b
Anticipatory Music Transformer (BL-2)	3.70 ± 0.10^a	3.69 ± 0.09^a	3.30 ± 0.10^a	3.45 ± 0.10^a

Evaluation Results

Results are reported as mean ± sem^s, where sem is the standard error of the mean and s is a letter. Different letters within a column indicate significant differences (p < 0.05) based on a Wilcoxon signed-rank test with Holm-Bonferroni correction.
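The reporting convention above can be illustrated with a minimal Python sketch. It is not the organisers' analysis script, which is not given on this page. The function names `mean_sem` and `holm_bonferroni`, the ratings, and the raw p-values are all hypothetical. The sketch covers only the mean ± sem summary and the Holm-Bonferroni step-down adjustment; the pairwise Wilcoxon signed-rank p-values are assumed to come from elsewhere (for example `scipy.stats.wilcoxon`).

```python
import math
import statistics


def mean_sem(ratings):
    """Return (mean, standard error of the mean) for a list of ratings."""
    mean = statistics.fmean(ratings)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, sem


def holm_bonferroni(p_values):
    """Return Holm step-down adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is scaled by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted p-values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical ratings for one system on one criterion (not survey data).
ratings = [4, 3, 5, 4, 2, 4, 3, 5]
mean, sem = mean_sem(ratings)
print(f"{mean:.2f} ± {sem:.2f}")

# Hypothetical raw p-values from pairwise comparisons within one column.
raw_p = [0.010, 0.040, 0.030, 0.005]
adj_p = holm_bonferroni(raw_p)
significant = [p < 0.05 for p in adj_p]
print(adj_p, significant)
```

Two systems would share a letter when their adjusted pairwise p-value is not below 0.05. Deriving the letters themselves from the full set of pairwise decisions is a separate step that this sketch does not cover.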

Baseline Models

For MuseCoco, we use the xlarge model with 1.2 billion learnable parameters. For Anticipatory Music Transformer, we use the Large model with 780M learnable parameters.

Subjective Evaluation Details

A double-blind online survey was conducted to test music quality. Each model was anonymised, and for each test prompt, a sample was cherry-picked from 8 generated candidates. A total of 8 prompts of varied styles (pop, classical, and jazzy) were tested, resulting in an 8-page survey. The page order and the sample order within each page were both randomised.
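The randomisation described above could be sketched in Python as follows. The survey software actually used is not stated, so this is only an assumption about one way to do it. The function `build_survey`, the prompt identifiers, and the "Sample N" labels are hypothetical; the system names are taken from the results table.

```python
import random


def build_survey(prompts, systems, seed=None):
    """Build one participant's survey: shuffled pages, shuffled samples per page."""
    rng = random.Random(seed)
    pages = []
    for prompt in prompts:
        shuffled = list(systems)
        rng.shuffle(shuffled)  # randomise sample order within the page
        samples = [
            # Participants would see only the anonymous label; the system
            # name stays in a hidden key used later to score the ratings.
            {"label": f"Sample {i + 1}", "system": name}
            for i, name in enumerate(shuffled)
        ]
        pages.append({"prompt": prompt, "samples": samples})
    rng.shuffle(pages)  # randomise page order
    return pages


systems = [
    "RWKV",
    "PixelGen",
    "MuseCoco (BL-1)",
    "Anticipatory Music Transformer (BL-2)",
]
# Hypothetical prompt identifiers; the real prompts are not listed on this page.
prompts = [f"prompt-{i}" for i in range(1, 9)]

pages = build_survey(prompts, systems, seed=2025)
for page in pages:
    print(page["prompt"], [s["label"] for s in page["samples"]])
```

Passing a per-participant seed keeps each layout reproducible, which helps when mapping anonymous labels back to systems during analysis.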

Responses were collected from 20 participants with diverse music backgrounds. Of these, 14 completed all 8 pages, with an average completion time of 32 minutes.

Listening Samples

Play-along piano-roll continuations (visual + audio) for every prompt and system are available on the demo page:

▶ Open the interactive demo: https://futuremirex.com/demos/symbolic-music-generation/
