mSCAN - a Multilingual Dataset for Compositional Generalization Evaluation

Author: Reymond, Amélie Thu Tâm
Contributor: Steinert-Threlkeld, Shane
Publisher: My University
Date: 2025-08-01
Language: en-US
Type: Thesis
Keywords: Compositional generalization; Cross Linguistic Evaluation; Large Language Models Evaluation; Linguistics; Computer science; Artificial intelligence
File: Reymond_washington_0250O_28510.pdf (application/pdf)
URI: https://hdl.handle.net/1773/53674
Note: Thesis (Master's)--University of Washington, 2025

Abstract

Language models achieve remarkable results on a variety of tasks, yet still struggle on compositional generalization benchmarks. Most of these benchmarks evaluate performance in English only, leaving open the question of whether the results generalize to other languages. As an initial step toward answering this question, we introduce mSCAN, a multilingual adaptation of the SCAN dataset covering Mandarin Chinese, French, Hindi, and Russian. The dataset was produced through rule-based translation, developed in cooperation with native speakers. We then showcase the dataset in in-context learning experiments with multiple open-source multilingual models.