2025:Music Reasoning QA Results
From MIREX Wiki
Nicolaus526 (talk | contribs) (→MMAR Results) |
Revision as of 16:23, 12 September 2025
MMAR Results
System | Methods Used | ACC | music ACC | mix-sound-music | mix-music-speech | mix-sound-music-speech |
---|---|---|---|---|---|---|
Baseline 1 | SAR-LM (w/ Qwen3) | 40.00% | 33.98% | 27.27% | 48.78% | 37.50% |
Baseline 2 | Qwen2.5-Omni | 56.70% | 40.78% | 54.55% | 67.07% | 58.33% |
Baseline 3 | SAR-LM (w/ Gemini) | TBA | TBA | TBA | TBA | TBA |
OmniBench Results
System | Methods Used | ACC | music ACC |
---|---|---|---|
Baseline 1 | SAR-LM | 31.26% | 41.50% |
Baseline 2 | Qwen2-Audio-7B-Instruct | 40.72% | 38.68% |