In recent years, a large number of theoretical papers have focused on reinforcement learning approaches to the linear quadratic regulator (LQR) problem. However, nearly all of these papers assume that an initial stabilizing controller is given. This paper presents a model-free, off-policy reinforcement learning algorithm for computing a stabilizing controller for deterministic LQR problems with unknown dynamics and cost matrices. When the system is stabilizable, a controller that is guaranteed to stabilize the system is computed after finitely many steps. Furthermore, the solution converges to the optimal LQR gain.
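As background, the deterministic discrete-time LQR setting the abstract refers to can be sketched with a standard model-based Riccati iteration. This is a hypothetical numerical example (the matrices `A`, `B`, `Q`, `R` below are illustrative, and the computation assumes the model is known); the paper's contribution is precisely that it achieves a stabilizing and ultimately optimal gain *without* knowledge of these matrices.

```python
import numpy as np

# Hypothetical open-loop unstable system (eigenvalue 1.1 > 1)
A = np.array([[1.1, 0.5],
              [0.0, 0.9]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)          # state cost
R = np.array([[1.0]])  # input cost

# Value iteration on the Riccati difference equation:
#   P <- Q + A'PA - A'PB (R + B'PB)^{-1} B'PA
P = np.eye(2)
for _ in range(500):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ A - A.T @ P @ B @ K

# Optimal feedback gain u = -K x, and closed-loop spectral radius
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
rho = max(abs(np.linalg.eigvals(A - B @ K)))
print("spectral radius of A - BK:", rho)  # < 1, i.e. stabilizing
```

Since the pair (A, B) above is controllable and Q is positive definite, the resulting gain K is guaranteed to place all closed-loop eigenvalues strictly inside the unit circle; a model-free method must reach the same gain from input-output data alone.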
|Original language||English (US)|
|Title of host publication||2020 59th IEEE Conference on Decision and Control, CDC 2020|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Number of pages||6|
|State||Published - Dec 14 2020|
|Event||59th IEEE Conference on Decision and Control, CDC 2020 - Virtual, Jeju Island, Korea, Republic of|
Duration: Dec 14 2020 → Dec 18 2020
|Name||Proceedings of the IEEE Conference on Decision and Control|
|Conference||59th IEEE Conference on Decision and Control, CDC 2020|
|Country||Korea, Republic of|
|City||Virtual, Jeju Island|
|Period||12/14/20 → 12/18/20|
Bibliographical note: Funding Information:
This work was supported in part by NSF CMMI-1727096. The author is with the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455, USA.
© 2020 IEEE.