Unsupervised Learning : Model-guided and Model-agnostic Approaches

Mukherjee, Sudipto

dc.contributor.advisor	Kannan, Sreeram
dc.contributor.author	Mukherjee, Sudipto
dc.date.accessioned	2020-10-26T20:38:11Z
dc.date.available	2020-10-26T20:38:11Z
dc.date.issued	2020-10-26
dc.date.submitted	2020
dc.description	Thesis (Ph.D.)--University of Washington, 2020
dc.description.abstract	Unsupervised learning is the branch of machine learning that aims to learn patterns from data without labels. Supervised learning with millions of labels for image classification has driven the modern deep learning revolution of the past few years, and deep neural networks have exceeded human performance at this specific task. But the requirement of such large amounts of labeled data makes one skeptical about how well such intelligence generalizes to myriad tasks. While training a neural network to classify images of a cat, one might wonder: do humans really need hundreds of images to differentiate a cat from an elephant? Or is there some underlying principle that a machine can exploit in its race to match human intelligence? Unsupervised learning unveils the potential of machine learning algorithms beyond empirical risk minimization and extends them to learning non-trivial representations of the data. At the core of such learning are two distinct principles: model-agnostic representation learning and model-guided inference. The goal of this thesis is to extend the present literature on unsupervised learning through the design of novel unsupervised algorithms for clustering, information estimation and model-guided inference. Our journey starts with one of the simplest, yet most fundamental, unsupervised learning problems, namely clustering. We explore how modern generative principles such as Generative Adversarial Networks (GANs) can be used to cluster diverse types of data. Even though auto-encoders had been used for clustering in the past, clustering using GANs was unexplored prior to this work. ClusterGAN modifies the vanilla GAN architecture to enable embedding of data in the latent space where cluster structure is revealed. It also improves the generation ability of vanilla GANs by segregating a complex multi-modal distribution into simpler components.
Recently, information-theoretic quantities such as mutual information and cross-entropy have been used to regularize unsupervised representation learning and improve clustering. Estimation of such quantities is another fundamental problem in unsupervised learning, which is related to the broader statistical problem of estimating functionals of probability densities. We design an estimator, CCMI, that estimates mutual information from a classifier likelihood ratio in an unsupervised manner, and demonstrate its suitability for information estimation on high-dimensional real-valued data. The conditional variant of this quantity, conditional mutual information (CMI), is also estimated and applied to conditional independence testing. The above approaches to unsupervised learning do not assume any model for data generation; they learn it implicitly from data. However, in many real-world problems, one has domain knowledge about the data-generation process. Utilizing such domain knowledge can further reduce data complexity and remove the need for deep learning models. It also imparts interpretability to the learning process. We apply such learning techniques to a specific phenomenon in genomics, known as segmental duplication. The problem can be formulated as either (a) low-rank matrix completion or (b) robust signed community detection, under suitable assumptions on the data. We design algorithms for resolving segmental duplication in genomes under these two formulations. Finally, we explore another application in natural language understanding where unsupervised and supervised approaches blend gracefully. This also illustrates a situation where labeled data could be difficult to obtain and an unsupervised solution may be used.
dc.embargo.terms	Open Access
dc.format.mimetype	application/pdf
dc.identifier.other	Mukherjee_washington_0250E_22029.pdf
dc.identifier.uri	http://hdl.handle.net/1773/46346
dc.language.iso	en_US
dc.rights	none
dc.subject	Clustering
dc.subject	Generative Adversarial Network
dc.subject	Latent factor models
dc.subject	Matrix completion
dc.subject	Mutual information
dc.subject	Unsupervised learning
dc.subject	Computer science
dc.subject	Artificial intelligence
dc.subject.other	Computer science and engineering
dc.title	Unsupervised Learning : Model-guided and Model-agnostic Approaches
dc.type	Thesis

Files

Original bundle

Name: Mukherjee_washington_0250E_22029.pdf
Size: 8.87 MB
Format: Adobe Portable Document Format

Collections

Computer science and engineering