Geometry of Feedback Control and Learning

dc.contributor.advisorMesbahi, Mehran
dc.contributor.advisorFazel, Maryam
dc.contributor.authorBu, Jingjing
dc.date.accessioned2021-03-19T22:54:11Z
dc.date.available2021-03-19T22:54:11Z
dc.date.issued2021-03-19
dc.date.submitted2020
dc.descriptionThesis (Ph.D.)--University of Washington, 2020
dc.description.abstractIn this thesis, we study optimal control problems, e.g., the linear-quadratic regulator (LQR), least squares stationary optimal control, and linear quadratic (LQ) dynamic games, through the lens of first-order algorithms. The developed theories on these topics are largely derived from model-based dynamic programming. Recently, there has been a surge of interest in constructing optimal control strategies directly, viewing control synthesis through policy-gradient-based algorithms. Adopting such a point of view has been partially inspired by the success of learning algorithms such as Reinforcement Learning (RL), where, using principles of Dynamic Programming (DP), one can devise real-time model-free methods for both continuous-time and discrete-time LQR. The direct policy update approach offers advantages in terms of scalability, model-free implementation, and richer parameterizations (e.g., structured controller design). We first study the topological and metrical properties of the set of stabilizing feedback controls. This set is of interest as it is the natural domain of the cost functions of optimal control problems. We present a complete account of its set-theoretic properties for both single-input-single-output (SISO) and multiple-input-multiple-output (MIMO) systems. In particular, we prove an upper bound on the number of path-connected components for SISO systems, and propose an algorithm for identifying the connected components. We next move on to LQR optimal control. We characterize several analytical properties (smoothness, coerciveness, quadratic growth) that are crucial in the analysis of gradient-based algorithms. We then examine three types of well-posed flows for LQR: the gradient flow, the natural gradient flow, and the quasi-Newton flow.
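As a concrete illustration of the policy-gradient viewpoint described in the abstract, the sketch below runs vanilla gradient descent on a small discrete-time LQR instance. The gradient expression and the discrete Lyapunov solves follow standard LQR policy-gradient analysis; the system matrices, stepsize, and iteration count are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def dlyap(M, W):
    # Solve the discrete Lyapunov equation P = W + M^T P M by
    # vectorization; adequate for the small systems used here.
    n = M.shape[0]
    vecP = np.linalg.solve(np.eye(n * n) - np.kron(M.T, M.T), W.flatten())
    return vecP.reshape(n, n)

def lqr_cost_and_grad(K, A, B, Q, R, Sigma0):
    A_cl = A - B @ K
    # Policy evaluation: value matrix P_K and state correlation Sigma_K.
    P = dlyap(A_cl, Q + K.T @ R @ K)
    Sigma = dlyap(A_cl.T, Sigma0)
    # Standard LQR policy gradient: 2((R + B^T P B)K - B^T P A) Sigma_K.
    grad = 2 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ Sigma
    return np.trace(P @ Sigma0), grad

# Illustrative 2-state system; rho(A) < 1, so K = 0 is a stabilizing start.
A = np.array([[0.8, 0.2], [0.1, 0.7]])
B = np.eye(2); Q = np.eye(2); R = np.eye(2); Sigma0 = np.eye(2)

K = np.zeros((2, 2))
eta = 0.01                      # stepsize (assumed, not a tuned bound)
costs = []
for _ in range(3000):
    c, g = lqr_cost_and_grad(K, A, B, Q, R, Sigma0)
    costs.append(c)
    K = K - eta * g             # forward Euler step on the gradient flow

grad_norm = np.linalg.norm(lqr_cost_and_grad(K, A, B, Q, R, Sigma0)[1])
```

With a sufficiently small stepsize the iterates remain in the stabilizing set and the cost decreases toward the global optimum, consistent with the linear-convergence behavior the abstract describes.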
The coercive property guarantees that these flows admit unique solutions, while the gradient dominance property indicates that the corresponding Lyapunov functionals decay at an exponential rate; quadratic growth, on the other hand, guarantees that the trajectories of these flows are exponentially stable in the sense of Lyapunov. We then discuss the forward Euler discretization of these flows, realized as gradient descent, natural gradient descent, and the quasi-Newton iteration. We present stepsize criteria for gradient descent and natural gradient descent guaranteeing that both algorithms converge linearly to the global optimum. An optimal stepsize for the quasi-Newton iteration is also proposed, guaranteeing a Q-quadratic convergence rate while recovering Hewer's algorithm. We then consider least squares stationary optimal control, i.e., LQR with indefinite state and input cost matrices. Such a setup has important applications in control design with conflicting objectives, such as linear quadratic dynamic games. We show the global convergence of gradient, natural gradient, and quasi-Newton policies for this class of indefinite least squares problems. Lastly, we study LQ dynamic games, which are closely related to H∞ optimal control. We propose projection-free sequential algorithms for linear-quadratic dynamic games. These policy-gradient-based algorithms are akin to the Stackelberg leadership model and can be extended to model-free settings. We show that if the “leader” performs natural gradient descent/ascent, the proposed algorithm converges globally at a sublinear rate to the Nash equilibrium. Moreover, if the leader adopts a quasi-Newton policy, the algorithm enjoys Q-quadratic convergence. Along the way, we examine and clarify the intricacies of adopting sequential policy updates for LQ games, namely issues pertaining to stabilization, indefinite cost structure, and circumventing projection steps.
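The quasi-Newton iteration mentioned above, at its optimal stepsize, coincides with Hewer's algorithm (policy iteration for LQR). A minimal sketch follows, on an assumed toy system; the matrices and iteration count are illustrative, not from the thesis.

```python
import numpy as np

def dlyap(M, W):
    # Solve P = W + M^T P M by vectorization (small systems only).
    n = M.shape[0]
    vecP = np.linalg.solve(np.eye(n * n) - np.kron(M.T, M.T), W.flatten())
    return vecP.reshape(n, n)

# Illustrative stable 2-state system, so K = 0 is a stabilizing start.
A = np.array([[0.8, 0.2], [0.1, 0.7]])
B = np.eye(2); Q = np.eye(2); R = np.eye(2)

K = np.zeros((2, 2))
for _ in range(10):
    # Policy evaluation under the current gain K.
    P = dlyap(A - B @ K, Q + K.T @ R @ K)
    # Hewer / policy-improvement update, i.e. the quasi-Newton step at
    # its optimal stepsize: K <- (R + B^T P B)^{-1} B^T P A.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Q-quadratic convergence: after a few iterations P solves the DARE.
P = dlyap(A - B @ K, Q + K.T @ R @ K)
dare_residual = np.linalg.norm(
    Q + A.T @ P @ A
    - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    - P
)
```

Each iterate stays stabilizing and the value matrices decrease monotonically, which is what makes this discretization converge so much faster than plain gradient descent.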
dc.embargo.termsOpen Access
dc.format.mimetypeapplication/pdf
dc.identifier.otherBu_washington_0250E_22375.pdf
dc.identifier.urihttp://hdl.handle.net/1773/46786
dc.language.isoen_US
dc.rightsnone
dc.subjectEngineering
dc.subject.otherElectrical engineering
dc.titleGeometry of Feedback Control and Learning
dc.typeThesis

Files

Original bundle

Name: Bu_washington_0250E_22375.pdf
Size: 4.18 MB
Format: Adobe Portable Document Format