Unsupervised Learning : Model-guided and Model-agnostic Approaches

dc.contributor.advisor: Kannan, Sreeram
dc.contributor.author: Mukherjee, Sudipto
dc.date.accessioned: 2020-10-26T20:38:11Z
dc.date.available: 2020-10-26T20:38:11Z
dc.date.issued: 2020-10-26
dc.date.submitted: 2020
dc.description: Thesis (Ph.D.)--University of Washington, 2020
dc.description.abstract: Unsupervised learning is the branch of machine learning that aims to learn patterns from data without labels. Supervised learning with millions of labels for image classification has driven the modern deep learning revolution over the past few years, and deep neural networks have exceeded human performance at this specific task. But the requirement of such large amounts of labeled data makes one skeptical about how well such intelligence generalizes to myriad tasks. While training a neural network to classify images of a cat, one might wonder: do humans really need hundreds of images to differentiate a cat from an elephant? Or is there some underlying principle that a machine can exploit in its race to match human intelligence? Unsupervised learning unveils the potential of machine learning algorithms beyond empirical risk minimization and extends them to learning non-trivial representations of the data. At the core of such learning are two distinct principles: model-agnostic representation learning and model-guided inference. The goal of this thesis is to extend the present literature on unsupervised learning through the design of novel unsupervised algorithms for clustering, information estimation and model-guided inference. Our journey starts with one of the simplest, yet most fundamental, unsupervised learning problems, namely clustering. We explore how modern generative principles such as Generative Adversarial Networks (GANs) can be used to cluster diverse types of data. Even though auto-encoders had been used for clustering in the past, clustering using GANs was unexplored prior to this work. ClusterGAN modifies the vanilla GAN architecture to enable embedding of data in a latent space where cluster structure is revealed. It also improves the generation ability of vanilla GANs by segregating a complex multi-modal distribution into simpler components.
Recently, information-theoretic quantities such as mutual information and cross-entropy have been used to regularize unsupervised representation learning and improve clustering. Estimation of such quantities is another fundamental problem in unsupervised learning, which is related to the broader statistical problem of estimating functionals of probability densities. We design an estimator, CCMI, that estimates mutual information from a classifier-based likelihood ratio in an unsupervised manner, and demonstrate its suitability for information estimation on high-dimensional, real-valued data. The conditional variant of this quantity, conditional mutual information (CMI), is also estimated and applied to conditional independence testing. The above approaches to unsupervised learning do not assume any model for the data-generation process and instead learn it implicitly from data. However, in many real-world problems, one has domain knowledge about the data-generation process. Utilizing such domain knowledge can help to further reduce data complexity and remove the need for deep learning models. It also imparts interpretability to the learning process. We apply such learning techniques to a specific phenomenon in genomics, known as segmental duplication. The problem can be formulated as either (a) low-rank matrix completion or (b) robust signed community detection, depending on the assumptions made about the data. We design algorithms for resolving segmental duplication in genomes under these two formulations. Finally, we explore another application in natural language understanding where unsupervised and supervised approaches blend gracefully. This also illustrates a situation where labeled data could be difficult to obtain and an unsupervised solution may be used.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Mukherjee_washington_0250E_22029.pdf
dc.identifier.uri: http://hdl.handle.net/1773/46346
dc.language.iso: en_US
dc.rights: none
dc.subject: Clustering
dc.subject: Generative Adversarial Network
dc.subject: Latent factor models
dc.subject: Matrix completion
dc.subject: Mutual information
dc.subject: Unsupervised learning
dc.subject: Computer science
dc.subject: Artificial intelligence
dc.subject.other: Computer science and engineering
dc.title: Unsupervised Learning : Model-guided and Model-agnostic Approaches
dc.type: Thesis

Files

Original bundle

Name: Mukherjee_washington_0250E_22029.pdf
Size: 8.87 MB
Format: Adobe Portable Document Format