A unified approach to model-agnostic variable importance
Authors
Williamson, Brian
Abstract
Assessing the relative contribution of subsets of features towards predicting the response is often of interest in predictive modeling applications; this contribution is typically referred to as variable importance. Often, simple population models are used because the associated variable importance measure is easy to interpret; however, estimates may be misleading if the model used is overly simplistic. In an effort to improve prediction performance, complex prediction algorithms are often used instead; however, in these cases variable importance is often defined as a function of the algorithm rather than a summary of the population, rendering formal statistical inference on population importance difficult. In this dissertation, we propose a unified model-agnostic framework for statistical inference on population-level variable importance. Specifically, we define variable importance as a contrast between the predictiveness of the best possible prediction function based on all available features versus all features but those under consideration. We discuss general conditions under which a simple estimator of this importance is nonparametric efficient and allows the construction of valid confidence intervals. We also propose a valid strategy for hypothesis testing. Through simulations, we show that our proposal has good operating characteristics, and we illustrate its use with data from a study of an antibody against HIV-1 infection.
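The contrast described in the abstract can be illustrated with a small simulation. The sketch below is not the dissertation's estimator. It assumes R-squared as the predictiveness measure, uses an ordinary least-squares fit as a stand-in for the "best possible prediction function", and evaluates both fits on a held-out half of the data. All function names, the simulated data, and the sample-splitting scheme are hypothetical choices made for illustration, and no confidence intervals or tests are computed.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_linear(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve(A, b)

def predict(coef, X):
    """Apply fitted coefficients to each row of X."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], row)) for row in X]

def r_squared(y, preds):
    """Predictiveness measure assumed here: 1 - MSE / Var(y)."""
    mean_y = sum(y) / len(y)
    mse = sum((a - b) ** 2 for a, b in zip(y, preds)) / len(y)
    var = sum((a - mean_y) ** 2 for a in y) / len(y)
    return 1.0 - mse / var

def importance(X_train, y_train, X_test, y_test, drop):
    """Held-out R^2 using all features minus held-out R^2 without the columns in `drop`."""
    keep = [j for j in range(len(X_train[0])) if j not in drop]
    def reduce(X):
        return [[row[j] for j in keep] for row in X]
    full = fit_linear(X_train, y_train)
    reduced = fit_linear(reduce(X_train), y_train)
    return (r_squared(y_test, predict(full, X_test))
            - r_squared(y_test, predict(reduced, reduce(X_test))))

# Simulated data (an assumption for illustration): y depends on x1 only.
rng = random.Random(0)
n = 2000
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
y = [2.0 * x1 + rng.gauss(0, 1) for x1, _ in X]

half = n // 2
vim_x1 = importance(X[:half], y[:half], X[half:], y[half:], drop={0})
vim_x2 = importance(X[:half], y[:half], X[half:], y[half:], drop={1})
print(f"estimated importance of x1: {vim_x1:.3f}")
print(f"estimated importance of x2: {vim_x2:.3f}")
```

In this simulated setting the population R-squared is 0.8 with both features and 0 without x1, so the estimated importance of x1 should fall near 0.8 and that of x2 near 0. A flexible learner could replace the linear fit without changing how the contrast is defined, which is the sense in which the measure is model-agnostic.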
Description
Thesis (Ph.D.)--University of Washington, 2019
