A Grammar Library for Information Structure
This dissertation makes substantial contributions to both the theoretical and computational treatment of information structure, with an eye toward creating natural language processing applications such as multilingual machine translation systems. Its aim is to create a grammar library of information structure for the LinGO Grammar Matrix system (Bender et al., 2010b). Information structure consists of focus, topic, contrast, and background, and refers to how speakers package the semantic content they wish to convey to listeners. The information structure of individual sentences is crucial to understanding the cohesiveness of larger segments of text. Despite the crucial role information structure plays in conveying meaning, there is insufficient research on how computational language models might successfully incorporate information structure marking, particularly from a multilingual perspective. Part I introduces the current study and gives some background information. Part II provides cross-linguistic findings about information structure meanings and markings. Part III exploits a naturally occurring text in four languages (English, Spanish, Russian, and Korean) to formulate a cross-linguistic generalization about distributional properties of information structure. Drawing from these cross-linguistic findings, Part IV shows how information structure can be represented within the HPSG/MRS framework (Pollard and Sag, 1994; Copestake et al., 2005). Part V explores the construction of a grammar library for creating customized grammars that incorporate information structure and shows how the information structure-based model improves the performance of transfer-based machine translation.
- Linguistics