Kannan, Sreeram
Mukherjee, Sudipto
2020-10-26
2020
Mukherjee_washington_0250E_22029.pdf
http://hdl.handle.net/1773/46346
Thesis (Ph.D.)--University of Washington, 2020

Unsupervised learning is the branch of machine learning that aims to learn patterns from data without labels. Supervised learning with millions of labels for image classification has driven the modern deep learning revolution over the past few years, and deep neural networks have exceeded human performance at this specific task. However, the large amounts of labeled data these models require make one skeptical that such intelligence generalizes to myriad tasks. While training a neural network to classify images of a cat, one might wonder: do humans really need hundreds of images to differentiate a cat from an elephant? Or is there some underlying principle that a machine can exploit in its race to match human intelligence? Unsupervised learning unveils the potential of machine learning algorithms beyond empirical risk minimization and extends them to learning non-trivial representations of the data. At the core of such learning are two distinct principles: model-agnostic representation learning and model-guided inference. The goal of this thesis is to extend the present literature on unsupervised learning through the design of novel unsupervised algorithms for clustering, information estimation and model-guided inference. Our journey starts with one of the simplest, yet most fundamental, unsupervised learning problems, namely clustering. We explore how modern generative principles such as Generative Adversarial Networks (GANs) can be used to cluster diverse types of data. Even though auto-encoders had been used for clustering in the past, clustering using GANs was unexplored prior to this work. ClusterGAN modifies the vanilla GAN architecture to enable embedding of data in a latent space where cluster structure is revealed.
It also improves the generation ability of vanilla GANs by segregating a complex multi-modal distribution into simpler components. Recently, information-theoretic quantities such as mutual information and cross-entropy have been used to regularize unsupervised representation learning and improve clustering. Estimation of such quantities is another fundamental problem in unsupervised learning, and is related to the broader statistical problem of estimating functionals of probability densities. We design an estimator, CCMI, that estimates mutual information from a classifier-based likelihood ratio in an unsupervised manner, and demonstrate its suitability for high-dimensional, real-valued data. The conditional variant of this quantity, conditional mutual information (CMI), is also estimated and applied to conditional independence testing. The above approaches to unsupervised learning do not assume any model for the data generation and instead learn it implicitly from data. However, in many real-world problems, one has domain knowledge about the data-generation process. Utilizing such domain knowledge can help to further reduce data complexity and remove the need for deep learning models. It also imparts interpretability to the learning process. We apply such learning techniques to a specific phenomenon in genomics, known as segmental duplication. Depending on the assumptions made about the data, the problem can be formulated as either (a) low-rank matrix completion or (b) robust signed community detection. We design algorithms for resolving segmental duplication in genomes under these two formulations. Finally, we explore another application in natural language understanding where unsupervised and supervised approaches blend gracefully.
This also illustrates a situation where labeled data may be difficult to obtain and an unsupervised solution can be used instead.

application/pdf
en-US
none
Clustering
Generative Adversarial Network
Latent factor models
Matrix completion
Mutual information
Unsupervised learning
Computer science
Artificial intelligence
Computer science and engineering
Unsupervised Learning: Model-guided and Model-agnostic Approaches
Thesis