Bayesian methods for variable selection

dc.contributor.advisorRodriguez, Abel
dc.contributor.advisorRaftery, Adrian
dc.contributor.authorPorwal, Anupreet
dc.date.accessioned2023-09-27T17:22:16Z
dc.date.available2023-09-27T17:22:16Z
dc.date.issued2023-09-27
dc.date.submitted2023
dc.descriptionThesis (Ph.D.)--University of Washington, 2023
dc.description.abstractChoosing a statistical model and accounting for uncertainty about this choice are important parts of the scientific process and are required for common statistical tasks such as parameter estimation, interval estimation, statistical inference, point prediction and interval prediction. A canonical example is the variable selection problem in linear regression. Many ways of addressing this problem have been proposed, including Bayesian and penalized regression methods. Each of these proposals has advantages and disadvantages, and the trade-offs are not always well understood. In this dissertation, we first compare 21 popular existing methods via an extensive simulation study based on a wide range of real datasets. We find that three adaptive Bayesian model averaging (BMA) methods perform best across all the statistical tasks. Subsequently, we investigate the effect of model space priors on model inference under the BMA framework. For this study, we consider eight reference model space priors used in the literature and the three adaptive parameter priors recommended by the previous study. Additionally, we propose a novel objective prior for generalized linear models, based on power-expected-posterior priors, that relies on a Laplace expansion of the likelihood of the imaginary training sample. We investigate both asymptotic and finite-sample properties of the resulting procedures, showing that they are both asymptotically and intrinsically consistent, and that their performance is superior to that of alternative approaches in the literature, especially for heavy-tailed versions of the priors. Finally, we propose a framework that unifies the two Bayesian paradigms for inducing sparsity, namely (mixtures of) $g$-priors and continuous shrinkage priors.
Mixtures of $g$-priors use a single shrinkage parameter across all predictors included in the model, incorporate the correlation structure of the covariates into the prior, and allow for model selection, but they suffer from the Conditional Lindley Paradox (CLP). Continuous shrinkage priors such as the horseshoe prior, on the other hand, allow a different shrinkage parameter for each coefficient but do not perform model selection. We propose global local-$g$ priors that borrow strength from the two paradigms and allow differential shrinkage across predictors while performing model selection. Additionally, we propose Dirichlet process (DP) block-$g$ priors that combine $g$-priors with Bayesian nonparametric tools to incorporate correlation structure into the prior and to adaptively identify and cluster predictors with varying degrees of relevance, using a different shrinkage parameter for each cluster. We show empirically and theoretically that our proposed priors avoid the CLP while performing competitively with, or better than, existing methods in terms of model selection, parameter estimation and prediction.
dc.embargo.termsOpen Access
dc.format.mimetypeapplication/pdf
dc.identifier.otherPorwal_washington_0250E_26131.pdf
dc.identifier.urihttp://hdl.handle.net/1773/50930
dc.language.isoen_US
dc.rightsCC BY
dc.subjectBayesian model averaging
dc.subjectdefault priors
dc.subjectgeneralised linear models
dc.subjectlinear regression
dc.subjectmodel selection
dc.subjectZellner's g priors
dc.subjectStatistics
dc.subject.otherStatistics
dc.titleBayesian methods for variable selection
dc.typeThesis

Files

Original bundle

Name:
Porwal_washington_0250E_26131.pdf
Size:
1.91 MB
Format:
Adobe Portable Document Format

Collections