Steinert-Threlkeld, Shane
Reymond, Amélie Thu Tâm
2025-08-01
2025-08-01
2025
Reymond_washington_0250O_28510.pdf
https://hdl.handle.net/1773/53674
Thesis (Master's)--University of Washington, 2025
Language models achieve remarkable results on a variety of tasks, yet they still struggle on compositional generalization benchmarks. Most of these benchmarks evaluate performance in English only, leaving open the question of whether the results generalize to other languages. As an initial step toward answering this question, we introduce mSCAN, a multilingual adaptation of the SCAN dataset covering Mandarin Chinese, French, Hindi, and Russian. It was produced through rule-based translation, developed in cooperation with native speakers. We then showcase the dataset in in-context learning experiments with multiple open-source multilingual models.
application/pdf
en-US
none
Compositional generalization
Cross Linguistic Evaluation
Large Language Models Evaluation
Linguistics
Computer science
Artificial intelligence
Linguistics
mSCAN - a Multilingual Dataset for Compositional Generalization Evaluation
Thesis