A unified approach to model-agnostic variable importance
| dc.contributor.advisor | Carone, Marco |
| dc.contributor.advisor | Simon, Noah R |
| dc.contributor.author | Williamson, Brian |
| dc.date.accessioned | 2020-02-04T19:24:34Z |
| dc.date.issued | 2020-02-04 |
| dc.date.submitted | 2019 |
| dc.description | Thesis (Ph.D.)--University of Washington, 2019 |
| dc.description.abstract | Assessing the relative contribution of subsets of features towards predicting the response is often of interest in predictive modeling applications; this contribution is typically referred to as variable importance. Often, simple population models are used because the associated variable importance measure is easy to interpret; however, estimates may be misleading if the model used is overly simplistic. In an effort to improve prediction performance, complex prediction algorithms are often used instead; however, in these cases variable importance is often defined as a function of the algorithm rather than a summary of the population, rendering formal statistical inference on population importance difficult. In this dissertation, we propose a unified model-agnostic framework for statistical inference on population-level variable importance. Specifically, we define variable importance as a contrast between the predictiveness of the best possible prediction function based on all available features versus all features but those under consideration. We discuss general conditions under which a simple estimator of this importance is nonparametric efficient and allows the construction of valid confidence intervals. We also propose a valid strategy for hypothesis testing. Through simulations, we show that our proposal has good operating characteristics, and we illustrate its use with data from a study of an antibody against HIV-1 infection. |
| dc.embargo.lift | 2022-01-24T19:24:34Z | |
| dc.embargo.terms | Restrict to UW for 2 years -- then make Open Access | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.other | Williamson_washington_0250E_20880.pdf | |
| dc.identifier.uri | http://hdl.handle.net/1773/45122 | |
| dc.language.iso | en_US | |
| dc.rights | CC BY-NC-ND | |
| dc.subject | machine learning | |
| dc.subject | nonparametric statistics | |
| dc.subject | variable importance | |
| dc.subject | Biostatistics | |
| dc.subject | Statistics | |
| dc.subject.other | Biostatistics | |
| dc.title | A unified approach to model-agnostic variable importance | |
| dc.type | Thesis |
Files
Original bundle
- Name: Williamson_washington_0250E_20880.pdf
- Size: 10.36 MB
- Format: Adobe Portable Document Format
