Learning and Manifolds: Leveraging the Intrinsic Geometry
Perrault-Joncas, Dominique Chipman
In this work, we explore and exploit the use of differential operators on manifolds, the Laplace-Beltrami operator in particular, in learning tasks. We are interested in uncovering the geometric structure of data (unsupervised learning) and in exploiting the information contained in unlabelled data for regression and classification tasks (semi-supervised learning).

First, building on the Laplacian Eigenmaps and Diffusion Maps frameworks, we propose a new paradigm that guarantees, under reasonable assumptions, that any manifold learning algorithm will preserve the geometry of a data set. Our approach augments the output of embedding algorithms with the geometric information embodied in the Riemannian metric of the manifold. We provide an algorithm for estimating the Riemannian metric from data, study its consistency, and demonstrate possible applications of our approach in a variety of examples.

Second, we extend the idea of learning the geometry of the data to improve the performance of prediction tasks. From a statistical point of view, this means dealing with data that are locally collinear but whose covariates are globally related in a non-linear way. We do this by combining the Matérn Gaussian process, a flexible and easily interpretable Bayesian non-parametric regression model, with the Laplace-Beltrami operator, which embodies the intrinsic geometry of the manifold. This yields a principled geometrical approach to learning tasks that respects the intrinsic geometry of the data.

Finally, we turn to the problem of setting the hyperparameters used to construct the graph Laplacian, the widely used non-parametric estimator of the Laplace-Beltrami operator. Specifically, we study how to set the kernel bandwidth used to construct the graph Laplacian for Euclidean data, a parameter that, according to our results, has a material impact in both the unsupervised and semi-supervised learning contexts. We exploit the connection between manifold geometry and the Laplace-Beltrami operator to obtain the hyperparameters for which the graph Laplacian best captures the geometry of the data.
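The metric-estimation idea in the first contribution can be illustrated numerically. Below is a minimal NumPy sketch, assuming a Gaussian-kernel, density-renormalized graph Laplacian as the estimator of the Laplace-Beltrami operator; the function names, the bandwidth `eps`, and the normalization constant are illustrative assumptions, not the thesis's exact construction.

```python
import numpy as np

def graph_laplacian(X, eps):
    """Random-walk graph Laplacian with Gaussian kernel exp(-|x - y|^2 / eps),
    using the density renormalization from Diffusion Maps so that it
    approximates the Laplace-Beltrami operator (up to a kernel-dependent
    constant, here taken as 4 / eps)."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-D2 / eps)
    q = K.sum(axis=1)
    K1 = K / np.outer(q, q)            # remove the sampling-density bias
    P = K1 / K1.sum(axis=1)[:, None]   # row-stochastic transition matrix
    return (P - np.eye(len(X))) * (4.0 / eps)

def riemannian_metric(L, Y):
    """Estimate the Riemannian metric in embedding coordinates Y (n x s):
    the dual metric at each point is
        h^{ij} = 0.5 * [L(y_i y_j) - y_i L y_j - y_j L y_i],
    i.e. the pairwise gradient inner products, and the metric is its
    pointwise (pseudo-)inverse."""
    n, s = Y.shape
    LY = L @ Y
    H = np.empty((n, s, s))
    for i in range(s):
        for j in range(s):
            H[:, i, j] = 0.5 * (L @ (Y[:, i] * Y[:, j])
                                - Y[:, i] * LY[:, j]
                                - Y[:, j] * LY[:, i])
    return np.linalg.pinv(H)
```

Applied to the coordinates produced by any embedding algorithm, such an estimator attaches to each embedded point the quadratic form needed to compute geodesic distances, areas, and angles, which is the sense in which the embedding's geometry is preserved.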
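One way to realize the second contribution's pairing of a Matérn Gaussian process with the Laplace-Beltrami operator is through the SPDE characterization of Matérn fields, in which the covariance is a negative power of a shifted Laplacian. The sketch below builds such a covariance from a symmetric graph Laplacian; the discretization and the parameter names `kappa` and `alpha` are assumptions for illustration, not the thesis's exact model.

```python
import numpy as np

def matern_covariance_from_laplacian(L_sym, kappa=1.0, alpha=2.0):
    """Discrete Matern-type covariance K = (kappa^2 I + L)^(-alpha) from the
    eigendecomposition of a symmetric PSD graph Laplacian L_sym, mirroring
    the SPDE link (kappa^2 - Delta)^(alpha/2) f = white noise."""
    evals, evecs = np.linalg.eigh(L_sym)
    spec = (kappa ** 2 + np.clip(evals, 0.0, None)) ** (-alpha)
    return (evecs * spec) @ evecs.T

def gp_posterior_mean(K, idx_obs, y_obs, noise=1e-2):
    """Standard GP regression posterior mean at all points, given noisy
    observations y_obs at the indices idx_obs."""
    K_oo = K[np.ix_(idx_obs, idx_obs)] + noise * np.eye(len(idx_obs))
    K_ao = K[:, idx_obs]
    return K_ao @ np.linalg.solve(K_oo, y_obs)
```

Because the covariance is built from the graph Laplacian, the prior correlates points that are close along the manifold rather than close in the ambient Euclidean space, which is what lets unlabelled points inform predictions in the semi-supervised setting.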
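For the bandwidth question, a common diagnostic from the diffusion maps literature (a baseline, not necessarily the criterion developed in this work, which the abstract does not spell out) scans candidate bandwidths and examines the log-log slope of the total kernel mass, which plateaus near d/2 on a d-dimensional manifold. A minimal sketch, with illustrative names:

```python
import numpy as np

def kernel_sum_curve(X, eps_grid):
    """log of the total Gaussian kernel mass S(eps) = sum_ij exp(-|xi-xj|^2/eps)
    over a grid of candidate bandwidths."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.array([np.log(np.exp(-D2 / eps).sum()) for eps in eps_grid])

def pick_bandwidth(X, eps_grid):
    """Choose eps where the slope d log S / d log eps is largest: in that
    regime the kernel resolves the manifold itself, rather than individual
    points (eps too small) or the whole cloud (eps too large)."""
    log_S = kernel_sum_curve(X, eps_grid)
    slopes = np.gradient(log_S, np.log(eps_grid))
    return float(eps_grid[np.argmax(slopes)])
```

In practice one inspects the whole slope curve rather than a single argmax; the plateau also yields a rough estimate of the intrinsic dimension d.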
Subject: Statistics