Sparse Partial Least Squares Methods and Extensions for Modeling Heart-Healthy Diets

Authors

Gasca, Natalie

Abstract

When investigating the link between diet and cardiovascular disease (CVD), nutritional epidemiologists often use unsupervised methods to construct dietary patterns using hundreds of foods. We posit that diet summaries can be better tailored to CVD by incorporating outcome data and sparsity. Partial least squares (PLS) is an appealing supervised method because its patterns are correlated with a continuous response while also capturing covariate variability. However, its statistical and modeling assumptions are not well characterized for non-continuous outcomes. In this dissertation, we clarify the implications of incorporating PLS into linear and Cox models, with the aim of constructing parsimonious patterns to facilitate hypothesis generation. First, we identify an advantageous sparse PLS procedure (SPLS) that targets variable selection for continuous data. We propose using SPLS after fitting the Cox or approximate Cox model to analyze a right-censored survival outcome. To enable proper adjustment for covariates that do not require dimension reduction, we demonstrate that various scientific premises and goals require different types of adjustment when using PLS. These contributions are verified by simulation studies, analytic results, and applications to CVD-related endpoints. Our findings allow for more informed method selection and deeper insight connecting least squares and PLS regression coefficients.

Description

Thesis (Ph.D.)--University of Washington, 2021
