An Algorithmic Framework for High Dimensional Regression with Dependent Variables
MetadataShow full item record
We present an exploration of the rich theoretical connections between several classes of regularized models, network flows, and recent results in submodular function theory. This work unifies key aspects of these problems under a common theory, leading to novel methods for working with several important models of interest in statistics, machine learning and computer vision. Most notably, we describe the full regularization path of a class of penalized regression problems with dependent variables that includes variants of the fused LASSO and total variation constrained models. We begin by reviewing the concepts of network flows and submodular function optimization theory foundational to our results. We then examine the connections between network flows and the minimum-norm algorithm from submodular optimization, extending and improving several current results. This theory leads to a new representation of the structure of a large class of pairwise regularized models important in machine learning, statistics and computer vision. Finally, by applying an arbitrarily accurate approximation, our approach allows us to efficiently optimize total variation penalized models on continuous functions. Ultimately, our new algorithms scale up easily to high-dimensional problems with millions of variables.
- Statistics