Unsupervised Text Representation Learning with Interactive Language

dc.contributor.advisor: Ostendorf, Mari
dc.contributor.author: Cheng, Hao
dc.date.accessioned: 2020-02-04T19:23:02Z
dc.date.available: 2020-02-04T19:23:02Z
dc.date.issued: 2020-02-04
dc.date.submitted: 2019
dc.description: Thesis (Ph.D.)--University of Washington, 2019
dc.description.abstract: Distributed text representations learned through unsupervised learning have recently shown great success in various language processing tasks. However, most existing work focuses solely on learning text representations from written documents, which carry little or no structured context. Unlike written documents, dialogues and multi-party discussions contain structured context: they take the form of a sequence of turn-taking responses reflecting the attributes of the participants and their communication goals. This thesis aims to represent interactive language with two types of context, i.e., local text context and global mode context. An unsupervised text representation learning framework is developed to capture the structured context in interactive language. Experiments show that capturing such context in a text representation can be useful for various language understanding tasks. An important focus of the proposed framework is discovering latent discrete factors in language. We formulate latent factor learning as a conditional generation process that dynamically queries a memory of latent mode vectors for template information shared across data samples. A potential advantage of using latent mode vectors is that they can make the resulting model more interpretable. Based on qualitative analysis, we find that the learned latent factors correspond to speaking style, intent, sentiment, and even speaker-related attributes such as gender and personality.
The text representation approach is assessed in four interaction scenarios using four different tasks: 1) community endorsement prediction in multi-party text-based online discussions, 2) topic decision prediction in human-socialbot spoken dialogues, 3) dialogue act prediction in human-human open-domain spoken dialogues, and 4) dialogue state tracking in human-wizard text-based task-oriented dialogues. The resulting text representation is shown to be effective in all four scenarios, demonstrating the benefit of incorporating interaction context into unsupervised text representation learning.
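The conditional generation process described above, which dynamically queries a memory of latent mode vectors, can be sketched in simplified form as soft attention over a bank of mode embeddings. The function and variable names, dimensions, and the plain softmax-attention parameterization below are illustrative assumptions, not the thesis's exact model:

```python
import numpy as np

def query_mode_memory(h, memory, tau=1.0):
    """Soft attention of an utterance encoding over a memory of latent
    mode vectors: returns a mixture of mode vectors and the attention
    weights. A simplified sketch, not the thesis's exact formulation.
    """
    scores = memory @ h / tau           # (K,) similarity to each mode
    w = np.exp(scores - scores.max())   # numerically stable softmax
    w /= w.sum()                        # attention weights over K modes
    z = w @ memory                      # convex combination of modes
    return z, w

rng = np.random.default_rng(0)
K, d = 8, 16                            # K latent modes, dim-d encodings
memory = rng.normal(size=(K, d))        # mode vectors (random stand-ins)
h = rng.normal(size=d)                  # encoder output for one utterance
z, w = query_mode_memory(h, memory)     # z conditions generation; w is
                                        # an interpretable mode posterior
```

The attention weights `w` play the role of a distribution over discrete latent factors, which is what allows the learned modes to be inspected qualitatively (e.g., associating individual modes with speaking style or sentiment).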
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Cheng_washington_0250E_21061.pdf
dc.identifier.uri: http://hdl.handle.net/1773/45087
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: Electrical engineering
dc.subject.other: Electrical engineering
dc.title: Unsupervised Text Representation Learning with Interactive Language
dc.type: Thesis

Files

Original bundle

Name: Cheng_washington_0250E_21061.pdf
Size: 2.1 MB
Format: Adobe Portable Document Format