Linear Quadratic Regulator (LQR)
================================

.. math::

    \newcommand{\vect}[1]{\boldsymbol{#1}}
    \newcommand{\dvect}[1]{\dot{\boldsymbol{#1}}}
    \newcommand{\ddvect}[1]{\ddot{\boldsymbol{#1}}}
    \newcommand{\mat}[1]{\boldsymbol{#1}}

The linear quadratic regulator (LQR) is a well-established and widely used
optimal controller. It acts on a linear system
:math:`\dvect{x} = \mat{A} \vect{x} + \mat{B} \vect{u}` with an objective
specified by a quadratic, instantaneous cost function
:math:`J = \vect{x}^T \mat{Q} \vect{x} + \vect{u}^T \mat{R} \vect{u}`,
where :math:`\mat{Q} = \mat{Q}^T \succeq 0` is symmetric and positive
semidefinite and :math:`\mat{R} = \mat{R}^T \succ 0` is symmetric and positive
definite. The solution of the Hamilton-Jacobi-Bellman (HJB) equation is the
optimal cost-to-go, from which the optimal policy can be inferred. For this
linear-quadratic setting, the HJB equation reduces to the algebraic Riccati
equation

.. math::

    \mat{S}\mat{A} + \mat{A}^T\mat{S} - \mat{S}\mat{B}\mat{R}^{-1}\mat{B}^T\mat{S} + \mat{Q} = 0

for which good numerical solvers exist to find the optimal cost-to-go matrix
:math:`\mat{S}`. The resulting optimal policy is

.. math::

    \vect{u}(\vect{x}) = -\mat{R}^{-1}\mat{B}^{T}\mat{S}\vect{x} = -\mat{K}\vect{x}.

To use an LQR controller for stabilizing the double pendulum at the upright
position, the dynamics have to be linearised around the top position
:math:`\vect{x}^{d} = [\pi, 0, 0, 0]` and :math:`\vect{u}^{d} = [0, 0]`:

.. math::

    \mat A = \left. \frac{\partial \vect{f}(\vect{x}, \vect{u})}{\partial \vect{x}}\right|_{\vect{x}=\vect{x}^{d}, \vect{u}=\vect{u}^{d}},
    \quad
    \mat B = \left. \frac{\partial \vect{f}(\vect{x}, \vect{u})}{\partial \vect{u}}\right|_{\vect{x}=\vect{x}^{d}, \vect{u}=\vect{u}^{d}}

and the state and actuation have to be expressed in relative coordinates
:math:`\tilde{\vect{x}} = \vect{x} - \vect{x}^{d}`,
:math:`\tilde{\vect{u}} = \vect{u} - \vect{u}^{d}`.

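The steps above (linearise, solve the Riccati equation, form the gain, apply
the policy in relative coordinates) can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: it uses a toy *single*
pendulum rather than the double pendulum, and the parameters ``m``, ``l``,
``b``, ``g``, the weights ``Q`` and ``R``, and the helper names ``f``,
``linearize`` and ``policy`` are all hypothetical and not taken from this
project. The Riccati equation is solved with
``scipy.linalg.solve_continuous_are``, and the Jacobians are approximated by
central finite differences.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical single-pendulum parameters (assumed, not from the text).
m, l, b, g = 1.0, 0.5, 0.1, 9.81


def f(x, u):
    """Toy dynamics x_dot = f(x, u); theta = 0 hangs down, theta = pi is upright."""
    theta, theta_dot = x
    theta_ddot = (u[0] - b * theta_dot - m * g * l * np.sin(theta)) / (m * l**2)
    return np.array([theta_dot, theta_ddot])


def linearize(f, x_d, u_d, eps=1e-6):
    """Central-difference Jacobians A = df/dx and B = df/du at (x_d, u_d)."""
    n, k = len(x_d), len(u_d)
    A = np.zeros((n, n))
    B = np.zeros((n, k))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        A[:, i] = (f(x_d + dx, u_d) - f(x_d - dx, u_d)) / (2 * eps)
    for j in range(k):
        du = np.zeros(k)
        du[j] = eps
        B[:, j] = (f(x_d, u_d + du) - f(x_d, u_d - du)) / (2 * eps)
    return A, B


# Linearise around the upright fixed point.
x_d = np.array([np.pi, 0.0])
u_d = np.array([0.0])
A, B = linearize(f, x_d, u_d)

# Assumed cost weights: Q symmetric positive semidefinite, R symmetric positive definite.
Q = np.diag([10.0, 1.0])
R = np.array([[1.0]])

# Solve S A + A^T S - S B R^{-1} B^T S + Q = 0 for the cost-to-go matrix S.
S = solve_continuous_are(A, B, Q, R)

# Gain K = R^{-1} B^T S, computed without forming R^{-1} explicitly.
K = np.linalg.solve(R, B.T @ S)


def policy(x):
    """LQR policy in relative coordinates: u = u_d - K (x - x_d)."""
    return u_d - K @ (x - x_d)


# Sanity checks: Riccati residual and closed-loop eigenvalues of A - B K.
residual = S @ A + A.T @ S - S @ B @ np.linalg.solve(R, B.T @ S) + Q
closed_loop_eigs = np.linalg.eigvals(A - B @ K)
```

If the solve succeeded, ``residual`` should be close to the zero matrix and all
entries of ``closed_loop_eigs`` should have negative real parts, which
indicates that the linearised closed loop is stable. For the double pendulum
the same pattern would apply with a four-dimensional state and two inputs.
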
Region of Attraction (RoA)
--------------------------

For dynamical systems, the Region of Attraction (RoA) :math:`\mathcal{B}`
around a fixed point :math:`\vect{x}^{\star}` describes the set of initial
states for which :math:`\vect{x} \rightarrow \vect{x}^{\star}` as
:math:`t\rightarrow \infty`. Direct computation of this set is often not
possible. However, it can be estimated by a sublevel set of a Lyapunov
function :math:`V(\vect{x})` [1]. When LQR is used to stabilize the system
around :math:`\vect{x}^{\star}`, the cost-to-go can serve as a quadratic
Lyapunov function [2]. In this case, the estimated RoA can be written as

.. math::

    \mathcal{B}_{\text{est}} = \left \{ \vect{x} \vert \vect{x}^{T} \mat{S} \vect{x} \leq \rho \right \}

where :math:`\rho` is a scalar that can be estimated using either
probabilistic [3] or optimization-based methods [4]. For further reading, we
refer to the lecture notes [2].

References
----------

- [1] H. K. Khalil, Nonlinear Systems, 3rd ed. Upper Saddle River, N.J: Prentice Hall, 2002.
- [2] R. Tedrake, Underactuated Robotics, 2022. (Online) url: ``__
- [3] E. Najafi, R. Babuška, and G. A. D. Lopes, “A fast sampling method for estimating the domain of attraction,” Nonlinear Dynamics, vol. 86, no. 2, pp. 823–834, Oct. 2016. url: ``__
- [4] P. Parrilo, “Structured semidefinite programs and semialgebraic geometry methods in robustness and optimization,” Ph.D. dissertation, California Institute of Technology, Pasadena, California, 2000. url: ``__