Multi-Task Averaging: Theory and Practice
This dissertation addresses the problem of estimating the means of multiple distributions. I begin with a brief history of the mean, leading to a discussion and literature review of Stein estimation and multi-task learning. Using a multi-task regularized empirical risk formulation, an algorithm called multi-task averaging (MTA) is derived and analyzed. Two main results are discussed. First, I prove that the MTA solution matrix is right-stochastic, that is, the multi-task mean estimates are always convex combinations of single-task mean estimates. Second, in the two-task case, analysis shows that the MTA estimates have smaller risk than single-task estimates for a range of task similarity values. I use this analysis to derive a theoretically optimal similarity, which has an intuitive form. I then proceed to derive two practical and efficient MTA estimators for real data with any number of tasks: constant MTA and minimax MTA. Extensive simulations and four applications demonstrate that MTA often outperforms the battle-tested James-Stein estimator, as well as single-task estimation.
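The right-stochastic property can be checked numerically with a minimal sketch. The abstract does not state the estimator's closed form, so the code assumes the form given in the published MTA work, W = (I + (gamma/T) Sigma L)^(-1), where L is the graph Laplacian of a task-similarity matrix and Sigma is the diagonal matrix of per-task variance over sample count. The function name `mta_weights`, the parameter names, and the example numbers are all hypothetical and chosen only for illustration.

```python
import numpy as np

def mta_weights(similarity, variances, counts, gamma=1.0):
    """Sketch of an MTA solution matrix (assumed closed form, see lead-in)."""
    A = np.asarray(similarity, dtype=float)
    A = (A + A.T) / 2.0                      # symmetrize the similarity matrix
    T = A.shape[0]
    L = np.diag(A.sum(axis=1)) - A           # graph Laplacian; each row sums to 0
    Sigma = np.diag(np.asarray(variances, dtype=float)
                    / np.asarray(counts, dtype=float))
    return np.linalg.inv(np.eye(T) + (gamma / T) * Sigma @ L)

# Illustrative inputs (made up): three tasks with constant pairwise similarity.
sample_means = np.array([1.0, 2.0, 4.0])     # single-task mean estimates
variances = np.array([1.0, 2.0, 0.5])        # per-task sample variances
counts = np.array([10, 5, 20])               # per-task sample sizes
similarity = np.ones((3, 3))

W = mta_weights(similarity, variances, counts)
estimates = W @ sample_means                 # multi-task mean estimates

# Right-stochastic: non-negative entries, every row sums to one.
print(W.sum(axis=1))
print(estimates)
```

Because each row of `W` is non-negative and sums to one, every multi-task estimate lies between the smallest and largest single-task mean. Setting every entry of `similarity` to the same constant is only a stand-in here; it is not claimed to reproduce the dissertation's constant MTA or minimax MTA estimators.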
- Electrical engineering