Dimension Reduction for Spatially-Misaligned and Multi-Pollutant Data with Missing Observations

dc.contributor.advisor: Szpiro, Adam A
dc.contributor.author: Vu, Phuong Thu
dc.date.accessioned: 2020-02-04T19:24:33Z
dc.date.available: 2020-02-04T19:24:33Z
dc.date.issued: 2020-02-04
dc.date.submitted: 2019
dc.description: Thesis (Ph.D.)--University of Washington, 2019
dc.description.abstract: Accurate predictions of pollutant concentrations at new locations are often of interest in air pollution studies, in which data are usually not measured at all study locations. Ambient air is also a mixture of many chemical components, which can modify the associations between its total mass and various health outcomes. Principal component analysis (PCA) can be incorporated to obtain lower-dimensional representative scores of the multi-pollutant data. Spatial prediction models can then be used to estimate these scores at new locations. The recently developed predictive PCA (PredPCA) modifies the traditional algorithm to improve the overall predictive performance. However, these approaches require complete data, whereas multi-pollutant data tend to have complex missing patterns. In the first part of this dissertation, we propose a probabilistic version of PredPCA that can directly handle incomplete data with flexible model-based imputation accounting for geographic and spatial information. In the second part, we reformulate the PredPCA algorithm into a convex optimization problem by incorporating spatial information into the low-rank matrix completion framework. The advantages of our proposed method include simultaneous estimation of all components, orthogonality, and a mechanism to handle missing data. Finally, we leverage these core ideas to modify an existing technique in low-rank tensor approximation to handle misaligned spatiotemporal data.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Vu_washington_0250E_20886.pdf
dc.identifier.uri: http://hdl.handle.net/1773/45120
dc.language.iso: en_US
dc.rights: CC BY-NC-ND
dc.subject: air pollution
dc.subject: dimension reduction
dc.subject: matrix completion
dc.subject: missing data
dc.subject: multivariate analysis
dc.subject: tensor completion
dc.subject: Biostatistics
dc.subject: Statistics
dc.subject.other: Biostatistics
dc.title: Dimension Reduction for Spatially-Misaligned and Multi-Pollutant Data with Missing Observations
dc.type: Thesis

Files

Original bundle

Name: Vu_washington_0250E_20886.pdf
Size: 8.35 MB
Format: Adobe Portable Document Format
