Inferring Grammars from Interlinear Glossed Text: Extracting Typological and Lexical Properties for the Automatic Generation of HPSG Grammars
This dissertation presents a grammar inference system that leverages linguistic knowledge recorded in the form of annotations in interlinear glossed text (IGT) and in a meta-grammar engineering system (the LinGO Grammar Matrix customization system) to automatically produce machine-readable HPSG grammars. Building on prior work on the inference of lexical classes, stems, affixes, and position classes, and on preliminary work on inferring case systems and word order, I present an integrated grammar inference system called BASIL that covers a wide range of fundamental linguistic phenomena. System development was guided by 27 genealogically and geographically diverse languages, and I test the system's cross-linguistic generalizability on an additional 5 held-out languages, using datasets provided by field linguists. My system outperforms three baseline systems, increasing coverage while limiting ambiguity, and produces richer semantic representations than either the baselines or previous work in grammar inference.
- Linguistics