Bayesian methods for variable selection

dc.contributor.advisorRodriguez, Abel
dc.contributor.advisorRaftery, Adrian
dc.contributor.authorPorwal, Anupreet
dc.date.accessioned2023-09-27T17:22:16Z
dc.date.available2023-09-27T17:22:16Z
dc.date.issued2023-09-27
dc.date.submitted2023
dc.descriptionThesis (Ph.D.)--University of Washington, 2023
dc.description.abstractChoosing a statistical model and accounting for uncertainty about this choice are important parts of the scientific process and are required for common statistical tasks such as parameter estimation, interval estimation, statistical inference, point prediction and interval prediction. A canonical example is the variable selection problem in linear regression. Many ways of addressing this problem have been proposed, including Bayesian and penalized regression methods. Each of these proposals has advantages and disadvantages, and the trade-offs are not always well understood. In this dissertation, we first compare 21 popular existing methods via an extensive simulation study based on a wide range of real datasets. We find that three adaptive Bayesian model averaging (BMA) methods perform best across all the statistical tasks. Subsequently, we investigate the effect of model space priors on model inference under the BMA framework. For this study, we consider eight reference model space priors used in the literature and the three adaptive parameter priors recommended by the previous study. Additionally, we propose a novel objective prior for generalized linear models, based on power-expected-posterior priors, that relies on a Laplace expansion of the likelihood of the imaginary training sample. We investigate both asymptotic and finite-sample properties of the resulting procedures, showing that they are both asymptotically and intrinsically consistent, and that their performance is superior to that of alternative approaches in the literature, especially for heavy-tailed versions of the priors. Finally, we propose a framework that unifies the two Bayesian paradigms for inducing sparsity, namely (mixtures of) $g$-priors and continuous shrinkage priors.
Mixtures of $g$-priors use a single shrinkage parameter across all predictors included in the model, incorporate the correlation structure of the covariates into the prior, and allow for model selection, but they suffer from the Conditional Lindley Paradox (CLP). Continuous shrinkage priors such as the horseshoe prior, on the other hand, allow a different shrinkage parameter for each coefficient but do not perform model selection. We propose global local-$g$ priors that borrow strength from the two paradigms and allow differential shrinkage across predictors while performing model selection. Additionally, we propose Dirichlet process (DP) block-$g$ priors that combine $g$-priors with Bayesian nonparametric tools to incorporate correlation structure into the prior and to adaptively identify and cluster predictors with varying degrees of relevance, using a different shrinkage parameter for each cluster. We show empirically and theoretically that our proposed priors avoid the CLP while performing competitively with, or better than, existing methods in terms of model selection, parameter estimation and prediction.
dc.embargo.termsOpen Access
dc.format.mimetypeapplication/pdf
dc.identifier.otherPorwal_washington_0250E_26131.pdf
dc.identifier.urihttp://hdl.handle.net/1773/50930
dc.language.isoen_US
dc.rightsCC BY
dc.subjectBayesian model averaging
dc.subjectdefault priors
dc.subjectgeneralised linear models
dc.subjectlinear regression
dc.subjectmodel selection
dc.subjectZellner's g priors
dc.subjectStatistics
dc.subject.otherStatistics
dc.titleBayesian methods for variable selection
dc.typeThesis

Files

Original bundle

Name:
Porwal_washington_0250E_26131.pdf
Size:
1.91 MB
Format:
Adobe Portable Document Format

Collections