Structure and complexity in non-convex and non-smooth optimization
Complexity theory drives much of modern optimization, allowing a fair comparison between competing numerical methods. The subject broadly seeks both to develop efficient algorithms and to establish limits on the efficiency of any algorithm for a given problem class. Classical complexity theory based on oracle models targets problems that are both smooth and convex. Without smoothness, methods rely on exploiting the structure of the target function to improve on the worst-case complexity of non-smooth convex optimization.
This thesis explores the complexity of first-order methods for structured non-smooth and non-convex problems. A central example is the minimization of a composition of a convex function with a smooth map, the so-called convex-composite problem class. Nonlinear least squares formulations in engineering and nonlinear model fitting in statistics fall within this framework. The thesis develops new algorithms for the composite problem class, along with inertial variants that are adaptive to convexity.
Acceleration, a widely used term in contemporary optimization, often describes methods whose efficiency guarantees match the best possible complexity estimates for a given problem class. This thesis develops methods that interpolate between convex and non-convex settings. In particular, we focus on minimizing large finite-sum problems, popular for modeling empirical risk in statistical applications, when the user does not know whether the objective function is convex. The scheme we describe has convergence guarantees that adapt to the underlying convexity of the objective function.
First-order algorithms for non-smooth problems depend on having access to generalized derivatives of the objective function. We conclude the thesis with a fresh look at variational properties of spectral functions. These are the functions on the space of symmetric matrices that depend on the matrix only through its eigenvalues. In particular, our analysis dramatically simplifies currently available derivations of differential formulas for such functions.
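For orientation, the problem classes named in the abstract can be written in a standard form. The block below is only a sketch in common notation from the literature. The symbols h, c, f_i, m, f, λ and U are assumptions introduced here and are not taken from the thesis. The final gradient identity is the classical formula for a differentiable symmetric function f, and it is not the thesis's own derivation.

```latex
% Convex-composite problem: h convex (possibly non-smooth), c a smooth map.
\[
  \min_{x \in \mathbb{R}^d} \; h\bigl(c(x)\bigr)
\]

% Nonlinear least squares is the special case h = \tfrac{1}{2}\|\cdot\|_2^2.
\[
  \min_{x \in \mathbb{R}^d} \; \tfrac{1}{2}\,\bigl\| c(x) \bigr\|_2^2
\]

% Finite-sum (empirical risk) problem with m terms.
\[
  \min_{x \in \mathbb{R}^d} \; \frac{1}{m} \sum_{i=1}^{m} f_i(x)
\]

% Spectral function on the symmetric n-by-n matrices S^n:
% f is invariant under permutations of its arguments and
% \lambda(X) is the vector of eigenvalues of X in nonincreasing order.
\[
  F(X) = f\bigl(\lambda(X)\bigr), \qquad X \in \mathbb{S}^n
\]

% Classical gradient formula when f is differentiable at \lambda(X),
% valid for any orthogonal U that diagonalizes X.
\[
  \nabla F(X) = U \,\operatorname{Diag}\bigl(\nabla f(\lambda(X))\bigr)\, U^{\top},
  \qquad X = U \,\operatorname{Diag}\bigl(\lambda(X)\bigr)\, U^{\top}
\]
```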
- Mathematics