Scaling Machine Learning via Prioritized Optimization

Authors

Johnson, Tyler Bridge

Abstract

To learn from large datasets, modern machine learning applications rely on scalable training algorithms. Typically such algorithms employ stochastic updates, parallelism, or both. This work develops scalable algorithms via a third approach: prioritized optimization. We first propose a method for prioritizing challenging tasks when training deep models. Our robust approximate importance sampling procedure (RAIS) speeds up stochastic gradient descent by sampling minibatches non-uniformly. By approximating the ideal sampling distribution using robust optimization, RAIS provides much of the benefit of exact importance sampling with little overhead and minimal hyperparameters. In the second part of this work, we develop strategies for prioritizing optimization when solving convex problems with piecewise linear structure. Our BlitzWS working set algorithm offers unique theoretical guarantees and solves several classic machine learning problems very efficiently in practice. We also propose a closely related safe screening test, BlitzScreen, which is state-of-the-art for safe screening in multiple ways. Our final contribution is a “stingy update” rule for coordinate descent. Our StingyCD algorithm prioritizes optimization variables by eliminating provably useless computation. StingyCD requires only simple changes to CD and results in significant speed-ups in practice.
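
To make the importance-sampling idea concrete, the sketch below shows generic non-uniform minibatch SGD with the 1/(n·p_i) reweighting that keeps gradient estimates unbiased. It is an illustration only: the `scores` argument and the simple normalization are assumptions for this sketch, not the thesis's RAIS procedure, which constructs its sampling distribution via robust optimization.

```python
# Illustrative sketch of importance-sampled minibatch SGD (not the RAIS algorithm itself).
import numpy as np

def importance_sampled_sgd(w, X, y, loss_grad, scores, lr=0.1, batch_size=32, steps=100):
    """SGD with non-uniform minibatch sampling.

    scores[i] is a hypothetical priority for example i (e.g. an estimate of its
    gradient norm). Sampling proportionally to scores and reweighting each
    gradient by 1 / (n * p_i) keeps the minibatch gradient estimate unbiased.
    """
    n = X.shape[0]
    rng = np.random.default_rng(0)
    for _ in range(steps):
        p = scores / scores.sum()                   # sampling distribution over examples
        idx = rng.choice(n, size=batch_size, p=p)   # prioritized minibatch
        grad = np.zeros_like(w)
        for i in idx:
            grad += loss_grad(w, X[i], y[i]) / (n * p[i])   # importance weight
        w = w - lr * grad / batch_size
    return w
```

For least squares, for instance, `loss_grad` could be `lambda w, x, y: (x @ w - y) * x`, and `scores` could be per-example loss values refreshed periodically; both choices are illustrative assumptions, not prescriptions from the thesis.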
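Similarly, the following sketch shows plain coordinate descent for the Lasso, the setting StingyCD targets. The comment marks where a stingy skip test would go; the actual StingyCD rule from the thesis (a cheap bound certifying that an update would leave a coordinate at zero) is not reproduced here.

```python
# Illustrative sketch of coordinate descent for the Lasso; StingyCD's skip test is only indicated.
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(A, b, lam, epochs=50):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 by cyclic coordinate descent."""
    n_features = A.shape[1]
    x = np.zeros(n_features)
    r = b.copy()                                    # residual b - A x
    col_sq = (A ** 2).sum(axis=0)                   # ||A_j||^2 for each column
    for _ in range(epochs):
        for j in range(n_features):
            # StingyCD would insert its cheap "useless update" test here and skip
            # the dot product below whenever x[j] is provably staying at zero.
            grad_j = A[:, j] @ r                    # the expensive per-coordinate work
            new_xj = soft_threshold(x[j] + grad_j / col_sq[j], lam / col_sq[j])
            if new_xj != x[j]:
                r -= A[:, j] * (new_xj - x[j])      # keep the residual in sync
                x[j] = new_xj
    return x
```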

Description

Thesis (Ph.D.)--University of Washington, 2018
