Low-Rank RNN Adaptation for Context-Aware Language Modeling

dc.contributor.advisor: Ostendorf, Mari
dc.contributor.author: Jaech, Aaron
dc.date.accessioned: 2018-07-31T21:11:38Z
dc.date.available: 2018-07-31T21:11:38Z
dc.date.issued: 2018-07-31
dc.date.submitted: 2018
dc.description: Thesis (Ph.D.)--University of Washington, 2018
dc.description.abstract: A long-standing weakness of statistical language models is that their performance degrades drastically when they are used on data that differs even slightly from the data on which they were trained. In practice, applications require adaptation methods that adjust the model's predictions to match the local context. For instance, in a speech recognition application, a single static language model cannot handle all the different ways people speak to their voice assistants, such as selecting music or sending a message to a friend. An adapted model instead conditions its predictions on knowledge of who is speaking and what task they are trying to accomplish. The current standard approach to recurrent neural network language model adaptation is to apply a simple linear shift to the recurrent and/or output layer bias vector. Although this helps, it does not go far enough. This thesis introduces a new approach to adaptation, which we call the FactorCell, that generates a custom recurrent network for each context by applying a low-rank transformation. The FactorCell allows a more substantial change to the recurrent layer weights. Unlike previous approaches, it introduces a rank hyperparameter that controls how similar or different the adapted models should be. In our experiments on several datasets and multiple types of context, the increased adaptation of the recurrent layer is always helpful, as measured by perplexity, the standard metric for evaluating language models. We also demonstrate impact on two applications, personalized query completion and context-specific text generation, finding that the enhanced adaptation benefits both. Finally, we show that the FactorCell provides a more effective text classification model; more importantly, the classification results reveal important differences between the models that are not captured by perplexity. The classification metric is particularly relevant for the text generation application.
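The core idea described in the abstract, generating per-context recurrent weights via a rank-constrained update, can be sketched in a few lines of numpy. This is a minimal illustration, not the thesis's implementation: the tensor names `Z_L`/`Z_R`, the dimensions, and the plain matrix form of the update are assumptions chosen to show how a rank hyperparameter `r` bounds how far the adapted weights can move from the shared weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the thesis)
d = 8   # recurrent hidden size
k = 4   # context embedding size
r = 2   # rank hyperparameter: caps the size of the adaptation

W = rng.normal(size=(d, d)) * 0.1        # shared recurrent weight matrix
Z_L = rng.normal(size=(k, d, r)) * 0.1   # hypothetical left adaptation tensor
Z_R = rng.normal(size=(k, r, d)) * 0.1   # hypothetical right adaptation tensor

def adapted_weights(c):
    """Generate context-specific recurrent weights via a rank-r update.

    The context embedding c selects a (d, r) left factor and an (r, d)
    right factor; their product is an adaptation matrix of rank at most r,
    added to the shared weights W.
    """
    left = np.einsum('k,kdr->dr', c, Z_L)    # (d, r)
    right = np.einsum('k,krd->rd', c, Z_R)   # (r, d)
    return W + left @ right

c = rng.normal(size=k)          # some context embedding
W_c = adapted_weights(c)
# The adapted model differs from the shared one by a matrix of rank <= r
assert np.linalg.matrix_rank(W_c - W) <= r
```

Setting `r = 0` would recover the unadapted shared model, while a larger `r` lets each context reshape the recurrent dynamics more substantially, which is the control knob the abstract highlights.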
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Jaech_washington_0250E_18951.pdf
dc.identifier.uri: http://hdl.handle.net/1773/42292
dc.language.iso: en_US
dc.rights: none
dc.subject: language modeling
dc.subject: natural language processing
dc.subject: Computer science
dc.subject: Statistics
dc.subject.other: Electrical engineering
dc.title: Low-Rank RNN Adaptation for Context-Aware Language Modeling
dc.type: Thesis

Files

Jaech_washington_0250E_18951.pdf (1009.27 KB, Adobe Portable Document Format)