Investigating Natural Language Interactions in Communities

Authors

Luu, Kelvin

Abstract

In this work, we investigate language use in communities. First, we study how authors of scientific texts explain how their paper relates to another. We propose a new task, relationship explanation generation for scientific texts, using in-line citation text as a source of evidence. We introduce a new model, SCIGEN, empowered by pretrained language models. We perform extensive human evaluation and discuss potential shortcomings of our system’s generations. Second, we describe work investigating how NLP models degrade over time due to the dynamic nature of most communities. We find evidence that static models, like SCIGEN, do not generalize temporally. Based on this finding, we investigate how the deterioration of model performance over time differs across several tasks and domains. We discover that models with temporally misaligned training and test sets can suffer substantial performance degradation. Finally, we show how elevating temporal characteristics in communities allows us to study particular social phenomena. Namely, we explore the problem of quantifying persuasive skill over time. Using data from an online debate forum, we construct a model of debater skill based on the Elo ranking model and incorporate historical linguistic data. To estimate skill, we frame our prediction task as forecasting the outcome of a debate. Though this work considers a wide range of NLP applications, it is unified by the idea that language emerges from ever-changing communities of people engaging with each other. Our findings demonstrate how the temporal dynamics and social underpinnings of language can inform NLP research and practice.
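
For readers unfamiliar with the Elo ranking model mentioned above, the following is a minimal sketch of the standard Elo update applied to two-sided debates. It is not the thesis's model, which also incorporates historical linguistic data; the function names, the K-factor of 32, the 400-point scale, and the starting rating of 1500 are conventional defaults assumed here for illustration only.

```python
from collections import defaultdict


def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Standard Elo win probability for A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update_ratings(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one debate."""
    expected_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_a)
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


def forecast_and_update(debates, initial: float = 1500.0):
    """Process debates in chronological order.

    `debates` is an iterable of (debater_a, debater_b, a_won) tuples
    (a hypothetical format). Each outcome is forecast from the ratings
    held before the debate, then the ratings are updated.
    """
    ratings = defaultdict(lambda: initial)
    forecasts = []
    for a, b, a_won in debates:
        forecasts.append(expected_score(ratings[a], ratings[b]))
        ratings[a], ratings[b] = update_ratings(ratings[a], ratings[b], a_won)
    return dict(ratings), forecasts
```

Because each forecast uses only ratings computed from earlier debates, the sketch respects the temporal ordering that the abstract emphasizes.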

Description

Thesis (Ph.D.)--University of Washington, 2022
