Principal Curves in High Dimensions
Authors
Milchgrub, Alon
Abstract
Principal curves are curves that pass through the "center" of a dataset. They generalize the notion of principal components to non-linear curves and thus provide more meaningful insight into the structure of the data. SCMS is an elegant and robust method for recovering the principal curves of a dataset. However, its dependence on evaluating the Hessian of the density function at every step, together with its need for an abundance of probes, makes it prohibitively computationally intensive for large, high-dimensional data. In this paper we present L-SCMS, an approximation algorithm for SCMS that reduces the computational complexity from n^2 to n, where n is the dimensionality of the data. We also present MorsePCS, an extension to SCMS that provides additional information on the structure of the data while reducing the number of steps required. Finally, we provide an empirical evaluation of MorsePCS, indicating that the runtime guarantees are realized in practice while the quality of the curves produced is maintained.
Description
Thesis (Ph.D.)--University of Washington, 2021
