Unsupervised Text Representation Learning with Interactive Language
Authors
Cheng, Hao
Abstract
Distributed text representations learned through unsupervised learning have recently shown great success in various language processing tasks. However, most existing work focuses on learning text representations from written documents with little or no structured context. Unlike written documents, dialogues and multi-party discussions contain structured context: they take the form of a sequence of turn-taking responses reflecting the attributes of the participants and their communication goals. This thesis aims to represent interactive language with two types of context, i.e., local text context and global mode context. An unsupervised text representation learning framework is developed to capture this structured context in interactive language. Experiments show that capturing such context in a text representation is useful for a variety of language understanding tasks.

An important focus of the proposed framework is discovering latent discrete factors in language. We formulate latent factor learning as a conditional generation process that dynamically queries a memory of latent mode vectors for template information shared across data samples. A potential advantage of the latent mode vectors is that they can make the resulting model more interpretable. Qualitative analysis shows that the learned latent factors correspond to speaking style, intent, sentiment, and even speaker-related attributes such as gender and personality.

The text representation approach is assessed in four interaction scenarios using four different tasks: 1) community endorsement prediction in multi-party text-based online discussions, 2) topic decision prediction in human-socialbot spoken dialogues, 3) dialogue act prediction in human-human open-domain spoken dialogues, and 4) dialogue state tracking in human-wizard text-based task-oriented dialogues. The resulting text representation is shown to be effective in all four scenarios, demonstrating the benefit of incorporating interaction context into unsupervised text representation learning.
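The idea of dynamically querying a memory of latent mode vectors can be illustrated with a minimal sketch: an utterance encoding attends over a small set of mode vectors via softmax-normalized similarity scores, yielding a distribution over modes and a mode-conditioned context vector. All names, dimensions, and the dot-product scoring function here are illustrative assumptions, not the thesis's actual model.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def query_mode_memory(query, memory):
    """Attend over a memory of latent mode vectors.

    query:  (d,)   utterance encoding
    memory: (K, d) K latent mode vectors (shared across data samples)
    Returns a (K,) distribution over modes and a (d,) mode-conditioned
    context vector (the weighted mixture of mode vectors).
    """
    scores = memory @ query        # (K,) similarity of query to each mode
    weights = softmax(scores)      # (K,) distribution over latent modes
    context = weights @ memory     # (d,) mixture of mode vectors
    return weights, context

# Toy example with K=4 hypothetical modes of dimension d=8.
rng = np.random.default_rng(0)
memory = rng.standard_normal((4, 8))
query = rng.standard_normal(8)
weights, context = query_mode_memory(query, memory)
```

In such a setup, the `weights` distribution is what lends interpretability: inspecting which mode a given utterance attends to is how one might associate learned modes with style, intent, or speaker attributes.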
Description
Thesis (Ph.D.)--University of Washington, 2019
