Geometry of Feedback Control and Learning

dc.contributor.advisorMesbahi, Mehran
dc.contributor.advisorFazel, Maryam
dc.contributor.authorBu, Jingjing
dc.date.accessioned2021-03-19T22:54:11Z
dc.date.available2021-03-19T22:54:11Z
dc.date.issued2021-03-19
dc.date.submitted2020
dc.descriptionThesis (Ph.D.)--University of Washington, 2020
dc.description.abstractIn this thesis, we study optimal control problems, e.g., the linear-quadratic regulator (LQR), least squares stationary optimal control, and linear quadratic (LQ) dynamic games, through the lens of first-order algorithms. The developed theories on these topics are largely derived from model-based dynamic programming. Recently, there has been a surge of interest in constructing optimal control strategies directly, viewing control synthesis through policy-gradient-based algorithms. Adopting such a point of view has been partially inspired by the success of learning algorithms such as Reinforcement Learning (RL), where, using principles of Dynamic Programming (DP), one can devise real-time model-free methods for both continuous-time and discrete-time LQR. The direct policy update approach offers advantages in terms of scalability, model-free implementation, and richer parameterizations (e.g., structured controller design). We first study the topological and metrical properties of the set of stabilizing feedback controls. This set is of interest as it is the natural domain of the cost functions of optimal control problems. We present a complete account of its set-theoretic properties for both single-input-single-output (SISO) and multiple-input-multiple-output (MIMO) systems. In particular, we prove an upper bound on the number of path-connected components for SISO systems, and propose an algorithm for identifying the connected components. We next move on to LQR optimal control. We characterize several analytical properties (smoothness, coerciveness, quadratic growth) that are crucial in the analysis of gradient-based algorithms. We then examine three types of well-posed flows for LQR: the gradient flow, the natural gradient flow, and the quasi-Newton flow.
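As a concrete illustration of the policy-gradient viewpoint described in the abstract, the sketch below runs vanilla gradient descent on a small discrete-time LQR instance. The gradient expression and the discrete Lyapunov solves follow standard LQR policy-gradient analysis; the system matrices, stepsize, and iteration count are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def dlyap(M, W):
    # Solve the discrete Lyapunov equation P = W + M^T P M by
    # vectorization; adequate for the small systems used here.
    n = M.shape[0]
    vecP = np.linalg.solve(np.eye(n * n) - np.kron(M.T, M.T), W.flatten())
    return vecP.reshape(n, n)

def lqr_cost_and_grad(K, A, B, Q, R, Sigma0):
    A_cl = A - B @ K
    # Policy evaluation: value matrix P_K and state correlation Sigma_K.
    P = dlyap(A_cl, Q + K.T @ R @ K)
    Sigma = dlyap(A_cl.T, Sigma0)
    # Standard LQR policy gradient: 2((R + B^T P B)K - B^T P A) Sigma_K.
    grad = 2 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ Sigma
    return np.trace(P @ Sigma0), grad

# Illustrative 2-state system; rho(A) < 1, so K = 0 is a stabilizing start.
A = np.array([[0.8, 0.2], [0.1, 0.7]])
B = np.eye(2); Q = np.eye(2); R = np.eye(2); Sigma0 = np.eye(2)

K = np.zeros((2, 2))
eta = 0.01                      # stepsize (assumed, not a tuned bound)
costs = []
for _ in range(3000):
    c, g = lqr_cost_and_grad(K, A, B, Q, R, Sigma0)
    costs.append(c)
    K = K - eta * g             # forward Euler step on the gradient flow

grad_norm = np.linalg.norm(lqr_cost_and_grad(K, A, B, Q, R, Sigma0)[1])
```

With a sufficiently small stepsize the iterates remain in the stabilizing set and the cost decreases toward the global optimum, consistent with the linear-convergence behavior the abstract describes.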
The coercive property guarantees that these flows admit unique solutions, while the gradient dominance property indicates that the corresponding Lyapunov functionals decay at an exponential rate; quadratic growth, on the other hand, guarantees that the trajectories of these flows are exponentially stable in the sense of Lyapunov. We then discuss the forward Euler discretization of these flows, realized as gradient descent, natural gradient descent, and the quasi-Newton iteration. We present stepsize criteria for gradient descent and natural gradient descent guaranteeing that both algorithms converge linearly to the global optimum. An optimal stepsize for the quasi-Newton iteration is also proposed, guaranteeing a Q-quadratic convergence rate while recovering Hewer's algorithm. We then consider least squares stationary optimal control, i.e., LQR with indefinite state and input cost matrices. Such a setup has important applications in control design with conflicting objectives, such as linear quadratic dynamic games. We show the global convergence of gradient, natural gradient, and quasi-Newton policies for this class of indefinite least squares problems. Lastly, we study LQ dynamic games, which are closely related to H∞ optimal control. We propose projection-free sequential algorithms for linear-quadratic dynamic games. These policy-gradient-based algorithms are akin to the Stackelberg leadership model and can be extended to model-free settings. We show that if the “leader” performs natural gradient descent/ascent, the proposed algorithm converges globally at a sublinear rate to the Nash equilibrium. Moreover, if the leader adopts a quasi-Newton policy, the algorithm enjoys Q-quadratic convergence. Along the way, we examine and clarify the intricacies of adopting sequential policy updates for LQ games, namely issues pertaining to stabilization, indefinite cost structure, and circumventing projection steps.
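The quasi-Newton iteration mentioned above, at its optimal stepsize, coincides with Hewer's algorithm (policy iteration for LQR). A minimal sketch follows, on an assumed toy system; the matrices and iteration count are illustrative, not from the thesis.

```python
import numpy as np

def dlyap(M, W):
    # Solve P = W + M^T P M by vectorization (small systems only).
    n = M.shape[0]
    vecP = np.linalg.solve(np.eye(n * n) - np.kron(M.T, M.T), W.flatten())
    return vecP.reshape(n, n)

# Illustrative stable 2-state system, so K = 0 is a stabilizing start.
A = np.array([[0.8, 0.2], [0.1, 0.7]])
B = np.eye(2); Q = np.eye(2); R = np.eye(2)

K = np.zeros((2, 2))
for _ in range(10):
    # Policy evaluation under the current gain K.
    P = dlyap(A - B @ K, Q + K.T @ R @ K)
    # Hewer / policy-improvement update, i.e. the quasi-Newton step at
    # its optimal stepsize: K <- (R + B^T P B)^{-1} B^T P A.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Q-quadratic convergence: after a few iterations P solves the DARE.
P = dlyap(A - B @ K, Q + K.T @ R @ K)
dare_residual = np.linalg.norm(
    Q + A.T @ P @ A
    - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    - P
)
```

Each iterate stays stabilizing and the value matrices decrease monotonically, which is what makes this discretization converge so much faster than plain gradient descent.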
dc.embargo.termsOpen Access
dc.format.mimetypeapplication/pdf
dc.identifier.otherBu_washington_0250E_22375.pdf
dc.identifier.urihttp://hdl.handle.net/1773/46786
dc.language.isoen_US
dc.rightsnone
dc.subjectEngineering
dc.subject.otherElectrical engineering
dc.titleGeometry of Feedback Control and Learning
dc.typeThesis

Files

Original bundle

Name: Bu_washington_0250E_22375.pdf
Size: 4.18 MB
Format: Adobe Portable Document Format