Methods for Correlated Data: Large-scale Linear Mixed Models and Brain Connectivity Networks

Authors

Yue, Kun

Abstract

This dissertation addresses challenges posed by correlated data, with an emphasis on both statistical methodology and applications in genetics and neuroimaging. The overarching theme is the development and refinement of statistical tools for large-scale linear mixed models and brain connectivity networks.

In the second chapter, we address the computational inefficiency and instability of standard methods for estimating variance components in linear mixed models, which are commonly used in genetic studies. Using regularized estimation strategies, we propose the restricted Haseman-Elston (REHE) regression estimator and its resampling variant (reREHE), along with an inference framework for REHE, as fast and robust alternatives that provide non-negative estimates with accuracy comparable to restricted maximum likelihood (REML). The merits of REHE are illustrated using real data and benchmark simulation studies.

The third chapter is motivated by the problem of inferring the graph structure of functional connectivity networks from multi-level functional magnetic resonance imaging (fMRI) data. We develop a valid inference framework for high-dimensional graphical models that accounts for group-level heterogeneity. We introduce a neighborhood-based method to learn the graph structure and reframe the problem as one of inferring the fixed effect parameters in a doubly high-dimensional linear mixed model. Specifically, we propose a LASSO-based estimator and a de-biased LASSO-based inference framework for the fixed effect parameters of the linear mixed model. We also introduce consistent estimators for the variance components in order to identify subject-specific edges in the inferred graph, and we adapt our method to account for serial correlation by learning heterogeneous graphs in the setting of a vector autoregressive model. We demonstrate the performance of the proposed framework using real data and benchmark simulation studies.

The fourth chapter examines the temporally dynamic brain connectivity of the default mode network as a potential biomarker for Alzheimer's Disease (AD). Existing amyloid beta ($A\beta$) biomarkers, though effective, face practical limitations. Alterations in brain functional connectivity (FC) linked to AD pathology offer a non-invasive avenue for $A\beta$ detection. However, current FC measurements lack sufficient sensitivity to be used on their own. We investigate temporally dynamic FC using resting-state functional MRI and model it with the Dynamic Conditional Correlation Generalized Autoregressive Conditional Heteroscedasticity (DCC-GARCH) model. To satisfy the model assumptions, we employ whitening procedures to remove serial correlations. Recognizing the limitations of traditional whitening methods, we introduce an iterative data-adaptive autoregressive model (IDAR) capable of modeling complex serial correlation structures for both long- and short-TR datasets. We comprehensively evaluate IDAR's performance by assessing residual serial correlations after whitening and type I error rates in task testing. After applying the IDAR approach to pre-process fMRI signals, we estimate dynamic functional connectivity profiles with DCC-GARCH models. Our results demonstrate superior sensitivity to cerebrospinal fluid (CSF) $A\beta$ status and provide crucial insights into dynamic functional connectivity analysis in AD.
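
As a rough illustration of the idea summarized for the second chapter, the sketch below fits a two-component variance model by Haseman-Elston-style moment regression with a non-negativity constraint. It is only a minimal sketch under stated assumptions, not the dissertation's implementation: the function name `rehe_sketch`, the simulated data, the single relatedness matrix `K`, and the use of `scipy.optimize.nnls` as the constrained solver are all choices made here for illustration.

```python
import numpy as np
from scipy.optimize import nnls


def rehe_sketch(y, X, K):
    """Hypothetical helper: non-negative moment estimates of (sigma_g^2, sigma_e^2).

    Assumes Cov(y) = sigma_g^2 * K + sigma_e^2 * I after removing fixed effects X.
    """
    n = len(y)
    # Remove fixed effects by ordinary least squares.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    # Regress products of residuals r_i * r_j (i <= j) on K_ij and the identity.
    iu = np.triu_indices(n)
    response = np.outer(r, r)[iu]
    design = np.column_stack([K[iu], np.eye(n)[iu]])
    # Non-negative least squares keeps both variance components >= 0.
    sigma2, _ = nnls(design, response)
    return sigma2


# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, p = 200, 500
Z = rng.standard_normal((n, p))
K = Z @ Z.T / p                       # relatedness-like matrix
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
g = Z @ rng.standard_normal(p) * np.sqrt(0.5 / p)   # random effect, variance about 0.5
e = rng.standard_normal(n) * np.sqrt(1.0)           # noise, variance 1.0
y = X @ np.array([1.0, 2.0]) + g + e

est = rehe_sketch(y, X, K)
print("sigma_g^2, sigma_e^2 estimates:", est)
```

Because the constraint is imposed inside the least-squares fit, the estimates cannot be negative, which is the property the abstract highlights. The inference framework and the resampling variant are not represented in this sketch.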

Description

Thesis (Ph.D.)--University of Washington, 2024
