Principal Curves in High Dimensions

Authors

Milchgrub, Alon

Abstract

Principal curves are curves that pass through the "center" of a dataset. They generalize principal components to non-linear curves and thus provide more meaningful insight into the structure of the data. Subspace-constrained mean shift (SCMS) is an elegant and robust method for recovering the principal curves of a dataset. However, because it must evaluate the Hessian of the density function at every step and needs a large number of probes, it becomes prohibitively expensive on large, high-dimensional data. In this paper we present L-SCMS, an approximation algorithm for SCMS that reduces the computational complexity from n^2 to n, where n is the dimensionality of the data. We also present MorsePCS, an extension of SCMS that provides additional information about the structure of the data while reducing the number of steps required. Finally, we provide an empirical evaluation of MorsePCS, indicating that the runtime guarantees are realized in practice while the quality of the curves produced is maintained.
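
To make the cost mentioned in the abstract concrete, the following is a minimal sketch of one textbook SCMS iteration with a Gaussian kernel density estimate. It is not the L-SCMS or MorsePCS algorithm from the thesis, whose details are not given on this page. The function names `scms_step` and `scms`, the bandwidth `h`, and the parameter `curve_dim` are assumptions made for illustration. Note that each step builds and eigendecomposes a D x D matrix, where D is the data dimensionality (the n of the abstract).

```python
import numpy as np


def scms_step(x, data, h, curve_dim=1):
    """One SCMS update of probe x (shape (D,)) against data (shape (N, D)).

    Illustrative sketch: Gaussian kernel with bandwidth h, and the normal
    space taken from the Hessian of the log-density.
    """
    dim = x.shape[0]
    diff = x - data                                     # (N, D) offsets to every sample
    w = np.exp(-0.5 * np.sum(diff**2, axis=1) / h**2)   # unnormalized kernel weights
    p = w.sum()                                         # density at x, up to a constant

    # Gradient and Hessian of the (unnormalized) kernel density estimate.
    grad = -(w[:, None] * diff).sum(axis=0) / h**2
    hess = (diff.T * w) @ diff / h**4 - p * np.eye(dim) / h**2

    # Hessian of log p; the common constant factor cancels here.
    hess_log = hess / p - np.outer(grad, grad) / p**2

    # eigh returns eigenvalues in ascending order, so the first
    # dim - curve_dim eigenvectors span the directions of strongest
    # density decay, i.e. the space normal to the curve.
    _, eigvecs = np.linalg.eigh(hess_log)
    normal = eigvecs[:, : dim - curve_dim]

    # Ordinary mean-shift vector, then constrained to the normal space.
    mean_shift = (w[:, None] * data).sum(axis=0) / p - x
    return x + normal @ (normal.T @ mean_shift)


def scms(x, data, h, curve_dim=1, tol=1e-8, max_iter=500):
    """Iterate scms_step until the probe moves less than tol."""
    x = np.asarray(x, dtype=float)
    for _ in range(max_iter):
        x_next = scms_step(x, data, h, curve_dim)
        if np.linalg.norm(x_next - x) < tol:
            return x_next
        x = x_next
    return x
```

In this sketch every probe touches all N samples and forms a full D x D Hessian on every iteration, which is the quadratic dependence on dimensionality that the abstract identifies as the bottleneck.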

Description

Thesis (Ph.D.)--University of Washington, 2021
