Evaluating Transformer's Ability to Learn Mildly Context-Sensitive Languages

dc.contributor.advisor: Steinert-Threlkeld, Shane
dc.contributor.author: Wang, Shunjie
dc.date.accessioned: 2021-10-29T16:22:07Z
dc.date.available: 2021-10-29T16:22:07Z
dc.date.issued: 2021-10-29
dc.date.submitted: 2021
dc.description: Thesis (Master's)--University of Washington, 2021
dc.description.abstract: Transformer models perform well on NLP tasks, but recent theoretical studies suggest their ability to model certain regular and context-free languages is limited. This creates a disparity given their success in modeling natural language strings, which are hypothesized to be mildly context-sensitive. We complement previous work on transformers and formal languages by relating them to mildly context-sensitive grammar formalisms with varying degrees of weak generative capacity. We test simple vanilla transformer models' ability to learn copying, crossing, and multiple agreement languages, and find that they generalize well to unseen in-domain data, perform comparably to LSTMs, and learn highly interpretable self-attention patterns. However, such transformers cannot consistently recognize strings from these languages that are longer than the ones seen during training, and they are often outperformed by LSTMs in this setting. We present initial evidence suggesting this is due to a limitation of the vanilla sinusoidal positional encoding.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Wang_washington_0250O_23376.pdf
dc.identifier.uri: http://hdl.handle.net/1773/48053
dc.language.iso: en_US
dc.rights: none
dc.subject: mildly context-sensitive
dc.subject: transformer
dc.subject: Linguistics
dc.subject.other: Linguistics
dc.title: Evaluating Transformer's Ability to Learn Mildly Context-Sensitive Languages
dc.type: Thesis
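For reference, the sinusoidal positional encoding that the abstract identifies as a likely source of the length-generalization failure is the standard scheme from "Attention Is All You Need" (Vaswani et al., 2017): each position is encoded with sine and cosine waves of geometrically spaced frequencies, so positions beyond the training length receive encodings the model has never observed. A minimal dependency-free sketch (not code from the thesis):

```python
import math

def sinusoidal_positional_encoding(max_len: int, d_model: int) -> list[list[float]]:
    """Return a (max_len x d_model) table of sinusoidal positional encodings.

    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    """
    pe = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)      # even dimensions use sine
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)  # odd dimensions use cosine
    return pe
```

Because the table is a fixed function of position, a model trained only on positions 0..N-1 must extrapolate to unseen encoding vectors at test positions >= N, which is consistent with the abstract's observation that recognition degrades on strings longer than those seen in training.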

Files

Original bundle

Name: Wang_washington_0250O_23376.pdf
Size: 2.54 MB
Format: Adobe Portable Document Format