2026:Rhythm Game Chart Generation
Task Description
Evaluation Criteria
While existing research relies on researcher-defined quantitative evaluation methods, we adopt a more user-centered three-stage approach. All three stages will be performed on songs that are not included in the dataset provided to the participants. We will recruit about five osu!taiko mappers as the jury, and the jury will also be involved in the test song selection process.
Algorithmic Evaluation
In this stage, we will manually pick 3 songs for evaluation. First, we will use the osu! star rating calculator (https://pypi.org/project/osu-sr-calculator/) to calculate the star ratings of the generated beatmaps. This allows us to assess whether the model's difficulty conditioning is functioning correctly. Then, we will apply checks based on AiMod (https://osu.ppy.sh/wiki/en/Client/Beatmap_editor/AiMod) and MapsetVerifier (https://github.com/Naxesss/MapsetVerifier) to algorithmically detect unplayable object placements. Models that pass this evaluation phase will move on to expert evaluation.
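The page does not specify how the star ratings will be compared. One plausible sanity check, sketched below, is that the star rating strictly increases with the requested difficulty tier. The function name `difficulty_conditioning_ok` and the input layout (a mapping from tier number to an already computed star rating) are assumptions made for illustration, and the numbers in the usage note are placeholders, not measured values.

```python
def difficulty_conditioning_ok(star_by_tier):
    """Return True if star ratings strictly increase with the tier number.

    star_by_tier: dict mapping a tier (1 = Kantan, 2 = Futsuu, ...) to the
    star rating of the chart generated for that tier. The ratings are
    assumed to have been computed beforehand by an external calculator.
    """
    tiers = sorted(star_by_tier)
    ratings = [star_by_tier[t] for t in tiers]
    # Compare each rating with the rating of the next higher tier.
    return all(low < high for low, high in zip(ratings, ratings[1:]))
```

For example, `difficulty_conditioning_ok({1: 1.4, 2: 2.3, 3: 3.1})` holds, while a model whose Futsuu chart rates below its Kantan chart would fail the check.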
Expert Evaluation
In this stage, the jury will rate the generated charts on musical representation, creativity, and gameplay, similar to the criteria of the Monthly Beatmapping Contest (https://osu.ppy.sh/wiki/en/Contests/Monthly_Beatmapping_Contest). We will also introduce a humanity score that evaluates how human-like the generated charts are. We will then use the jury's ratings to determine which finalists move on to community evaluation.
Community Evaluation
In this stage, a single showcase song will be selected for community evaluation. Players will choose a difficulty level that matches their skill level and playtest the charts generated by each model alongside a human-authored chart. Players will vote on musical representation, creativity, and gameplay, and they will be asked to identify which chart was authored by a human.
Resources
We recommend that participants read this document (https://docs.google.com/document/d/1WyoOXPpwVaLPXJeP-4SrGO6ywxEbbkKwVbg8jDfgtnI/edit?tab=t.0) to see how osu!taiko mapping works from a mapper's perspective.
Training Dataset
To ensure data quality and fairness, we require all participants to use beatmaps ranked from May 5, 2019 to December 31, 2025 for training and validation. We recommend using https://osudl.org/ and filtering for "ranked>=2019-05-05 and ranked<2026-01-01 and mode=t", which yields 727 beatmapsets. To minimize legal risks and save resources, we also encourage participants to filter instead for "ranked>=2019-05-05 and ranked<2026-01-01 and mode=t and featured artist", which yields 174 beatmapsets. For more information on featured artists, see https://osu.ppy.sh/beatmaps/artists.
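For participants who already hold beatmapset metadata locally, the date window and mode filter above can be sketched as a small predicate. The record layout (a dict with the keys `ranked_date`, `mode`, and `featured_artist`) and the function name `is_eligible` are hypothetical and do not correspond to any particular API or to the osudl.org query engine.

```python
from datetime import date

RANKED_FROM = date(2019, 5, 5)    # inclusive lower bound
RANKED_BEFORE = date(2026, 1, 1)  # exclusive upper bound

def is_eligible(beatmapset, featured_only=False):
    """Check one metadata record against the training-data rules.

    beatmapset: dict with hypothetical keys 'ranked_date' (datetime.date),
    'mode' (str), and optionally 'featured_artist' (bool).
    """
    in_window = RANKED_FROM <= beatmapset["ranked_date"] < RANKED_BEFORE
    is_taiko = beatmapset["mode"] == "taiko"
    is_featured = beatmapset.get("featured_artist", False)
    # The featured-artist condition only applies to the smaller subset.
    return in_window and is_taiko and (is_featured or not featured_only)
```

Setting `featured_only=True` corresponds to the smaller featured-artist subset.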
In osu!taiko, a beatmapset usually contains Kantan, Futsuu, Muzukashii, Oni, and Inner Oni difficulty levels. However, mappers sometimes use different names for them, and levels higher than Inner Oni also exist. To ensure consistency, we recommend using the following tier classification logic:
import re

def classify_tier(version_name):
    # version_name is the difficulty name stored in the .osu file
    raw_name = version_name.lower()
    # Keep only a-z and '
    clean_name = re.sub(r"[^a-z']", "", raw_name)
    if "kantan" in clean_name:
        return 1
    elif "futsuu" in clean_name:
        return 2
    elif "muzukashii" in clean_name:
        return 3
    elif "oni" in clean_name:
        if "inner" in clean_name or "ura" in clean_name:
            return 5
        elif "hell" in clean_name:
            return 6
        # Logic for standard Oni vs Mapper's Oni
        elif clean_name == "oni" or re.fullmatch(r".+'soni", clean_name):
            return 4
    return None  # Discard non-standard variants
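The tier logic above starts from the difficulty name of a beatmap. In a .osu file this name is the `Version` key of the `[Metadata]` section, and the game mode is the `Mode` key of the `[General]` section (1 means osu!taiko). A minimal sketch for reading both fields follows; the function name `read_osu_fields` is hypothetical, and the parser deliberately ignores every other key.

```python
def read_osu_fields(text):
    """Return (mode, version) parsed from the text of a .osu file."""
    section = None
    mode = 0        # a missing Mode key defaults to 0 (osu! standard)
    version = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            # Section header such as [General] or [Metadata]
            section = line[1:-1]
        elif ":" in line:
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            if section == "General" and key == "Mode":
                mode = int(value)
            elif section == "Metadata" and key == "Version":
                version = value
    return mode, version
```

A file would then be kept only if the returned mode is 1, and the returned version string is what the tier classification receives.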
Submission Requirements
Timeline
- July 1, 2026: Submission system opens
- September 11, 2026: Submission deadline for the generation system
- October 15, 2026: Results published
Contact
- Ziyun Liu — ziyun.liu@univ-lille.fr
- Carolina Carusi — carolina.carusi@kaist.ac.kr
Long-term Plan
We are committed to maintaining and expanding this task in the coming years. Our long-term roadmap includes the introduction of additional osu! modes and other open-access rhythm games, as well as other research objectives such as cross-mode/cross-game transfer learning ability and fine-grained generation controllability.