Investigating Natural Language Interactions in Communities
Authors
Luu, Kelvin
Abstract
In this work, we investigate language use in communities. First, we study how authors of scientific texts explain how their paper relates to another. We propose a new task, relationship explanation generation for
scientific texts, by using in-line citation text as a source of evidence. We introduce a new model, SCIGEN,
empowered by pretrained language models. We perform extensive human evaluation and discuss potential
shortcomings of our system’s generations.
Second, we describe work on investigating how NLP models degrade over time due to the dynamic nature
of most communities. We find evidence that static models, like SCIGEN, do not generalize temporally. Based
on this finding, we investigate how model performance deterioration over time differs across several tasks
and domains. We discover that models with temporally misaligned training and test sets can suffer from
substantial performance degradation.
Finally, we show how elevating temporal characteristics in communities allows us to study particular
social phenomena. Namely, we explore the problem of quantifying persuasive skill over time. Using data
from an online debate forum, we construct a model of debater skill based on the Elo ranking model and
incorporate historical linguistic data. To estimate skill, we frame our prediction task as forecasting the
outcome of a debate.
Though this work considers a wide range of NLP applications, it is unified by the idea that language
emerges from ever-changing communities of people engaging with each other. Our findings demonstrate how
the temporal dynamics and social underpinnings of language can inform NLP research and practice.
Description
Thesis (Ph.D.)--University of Washington, 2022
