Low-Rank RNN Adaptation for Context-Aware Language Modeling

dc.contributor.advisor: Ostendorf, Mari
dc.contributor.author: Jaech, Aaron
dc.date.accessioned: 2018-07-31T21:11:38Z
dc.date.available: 2018-07-31T21:11:38Z
dc.date.issued: 2018-07-31
dc.date.submitted: 2018
dc.description: Thesis (Ph.D.)--University of Washington, 2018
dc.description.abstract: A long-standing weakness of statistical language models is that their performance degrades drastically when they are used on data that differs even slightly from the data on which they were trained. In practice, applications require adaptation methods that adjust the model's predictions to match the local context. For instance, in a speech recognition application, a single static language model cannot handle all the different ways people speak to their voice assistants, such as selecting music or sending a message to a friend. An adapted model instead conditions its predictions on knowledge of who is speaking and what task they are trying to accomplish. The current standard approach to recurrent neural network language model adaptation is to apply a simple linear shift to the recurrent and/or output layer bias vector. Although this helps, it does not go far enough. This thesis introduces a new approach to adaptation, which we call the FactorCell, that generates a custom recurrent network for each context by applying a low-rank transformation. The FactorCell allows a more substantial change to the recurrent layer weights. Unlike previous approaches, it introduces a rank hyperparameter that controls how similar or different the adapted models should be. In our experiments on several datasets and multiple types of context, the increased adaptation of the recurrent layer is always helpful, as measured by perplexity, the standard metric for evaluating language models. We also demonstrate impact on two applications, personalized query completion and context-specific text generation, finding that the enhanced adaptation benefits both. Finally, we show that the FactorCell provides a more effective text classification model; more importantly, the classification results reveal important differences between the models that are not captured by perplexity. The classification metric is particularly relevant for the text generation application.
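The core idea described in the abstract, generating per-context recurrent weights via a rank-constrained update, can be sketched in a few lines of numpy. This is a minimal illustration, not the thesis's implementation: the tensor names `Z_L`/`Z_R`, the dimensions, and the plain matrix form of the update are assumptions chosen to show how a rank hyperparameter `r` bounds how far the adapted weights can move from the shared weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the thesis)
d = 8   # recurrent hidden size
k = 4   # context embedding size
r = 2   # rank hyperparameter: caps the size of the adaptation

W = rng.normal(size=(d, d)) * 0.1        # shared recurrent weight matrix
Z_L = rng.normal(size=(k, d, r)) * 0.1   # hypothetical left adaptation tensor
Z_R = rng.normal(size=(k, r, d)) * 0.1   # hypothetical right adaptation tensor

def adapted_weights(c):
    """Generate context-specific recurrent weights via a rank-r update.

    The context embedding c selects a (d, r) left factor and an (r, d)
    right factor; their product is an adaptation matrix of rank at most r,
    added to the shared weights W.
    """
    left = np.einsum('k,kdr->dr', c, Z_L)    # (d, r)
    right = np.einsum('k,krd->rd', c, Z_R)   # (r, d)
    return W + left @ right

c = rng.normal(size=k)          # some context embedding
W_c = adapted_weights(c)
# The adapted model differs from the shared one by a matrix of rank <= r
assert np.linalg.matrix_rank(W_c - W) <= r
```

Setting `r = 0` would recover the unadapted shared model, while a larger `r` lets each context reshape the recurrent dynamics more substantially, which is the control knob the abstract highlights.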
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Jaech_washington_0250E_18951.pdf
dc.identifier.uri: http://hdl.handle.net/1773/42292
dc.language.iso: en_US
dc.rights: none
dc.subject: language modeling
dc.subject: natural language processing
dc.subject: Computer science
dc.subject: Statistics
dc.subject.other: Electrical engineering
dc.title: Low-Rank RNN Adaptation for Context-Aware Language Modeling
dc.type: Thesis

Files

Jaech_washington_0250E_18951.pdf (1009.27 KB, Adobe Portable Document Format)