Natural gradients for state and output feedback control

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Policy gradient methods for approximate optimal control and reinforcement learning fix a parameterized form of the controller and then perform gradient descent on the cost-to-go function. In reinforcement learning for stochastic state-feedback problems, it has been shown that the natural gradient of the cost-to-go function can be approximated from samples of the state and step-cost, using no information about the plant model. There, the natural gradient is the gradient with respect to the Riemannian metric defined by the Fisher information matrix of the controller parameters. We give a general method for approximating the natural gradient for nonlinear output-feedback stochastic control problems with dynamic controllers. For linear systems, we give explicit formulas for computing the natural gradient when the plant matrices are known, in both the state- and output-feedback cases.
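As a rough illustration of the sample-based approximation the abstract refers to (our addition, not the paper's construction), the following sketch estimates a natural gradient step for a hypothetical Gaussian state-feedback controller u = Kx + noise: a likelihood-ratio gradient estimate is preconditioned by the empirical Fisher information of the controller parameters. All names and the specific parameterization are illustrative assumptions.

```python
import numpy as np

def natural_gradient_step(K, sigma, states, actions, costs, lr=1e-2):
    """One natural-gradient update of the gain K from rollout samples.

    Assumed setup (illustrative, not from the paper):
      states:  (N, n) array of sampled states x_i
      actions: (N, m) array of sampled inputs u_i ~ N(K x_i, sigma^2 I)
      costs:   (N,) array of sampled costs-to-go for each sample
    """
    N = states.shape[0]
    m, n = K.shape
    p = m * n
    # Score function of the Gaussian policy w.r.t. vec(K):
    # grad_theta log pi(u|x) = ((u - K x) / sigma^2) x^T, flattened.
    scores = np.einsum('ni,nj->nij',
                       (actions - states @ K.T) / sigma**2,
                       states).reshape(N, p)
    # Vanilla policy-gradient estimate (likelihood-ratio / REINFORCE form).
    grad = (scores * costs[:, None]).mean(axis=0)
    # Empirical Fisher information matrix of the controller parameters,
    # which defines the Riemannian metric mentioned in the abstract.
    F = scores.T @ scores / N
    # Natural gradient: precondition by the (regularized) inverse Fisher metric.
    nat_grad = np.linalg.solve(F + 1e-6 * np.eye(p), grad)
    return K - lr * nat_grad.reshape(m, n)
```

Note that only sampled states, actions, and costs enter the update; no plant model appears, which is the property the abstract highlights for the state-feedback reinforcement learning setting.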

Original language: English (US)
Title of host publication: 2016 IEEE 55th Conference on Decision and Control, CDC 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1984-1989
Number of pages: 6
ISBN (Electronic): 9781509018376
State: Published - Dec 27 2016
Event: 55th IEEE Conference on Decision and Control, CDC 2016 - Las Vegas, United States
Duration: Dec 12 2016 - Dec 14 2016

Publication series

Name: 2016 IEEE 55th Conference on Decision and Control, CDC 2016

Other

Other: 55th IEEE Conference on Decision and Control, CDC 2016
Country/Territory: United States
City: Las Vegas
Period: 12/12/16 - 12/14/16

