Natural gradients for state and output feedback control

Andrew Lamperski

doi:10.1109/CDC.2016.7798555

Electrical and Computer Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Policy gradient methods for approximate optimal control and reinforcement learning fix a parameterized form of the controller and then perform gradient descent on the cost-to-go function. In reinforcement learning for stochastic state-feedback problems, it has been shown that the natural gradient of the cost-to-go function can be approximated via samples of the state and step-cost, using no information about the plant model. There, the natural gradient is the gradient with respect to the Riemannian metric defined by the Fisher information matrix of the controller parameters. We give a general method for approximating the natural gradient for nonlinear output-feedback stochastic control problems with dynamic controllers. For linear systems, we give explicit formulas to compute the natural gradient when plant matrices are known, in both state and output feedback cases.
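
As a rough illustration of the definition in the abstract (not the method of the paper), the natural gradient rescales an ordinary gradient by the inverse Fisher information of the controller parameters. The sketch below assumes a hypothetical scalar Gaussian state-feedback policy u = -k*x + sigma*w and, purely for illustration, draws the state x from a standard normal distribution rather than from closed-loop plant dynamics. The names score, sample_scores, and natural_gradient, and the damping term, are assumptions introduced here.

```python
import random


def score(k, sigma, x, u):
    """Derivative of log N(u; -k*x, sigma^2) with respect to the gain k."""
    return -(u + k * x) * x / sigma**2


def sample_scores(k, sigma, n, seed=0):
    """Draw n score samples under the assumed policy and state distribution."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # assumption: state ~ N(0, 1), not a real plant
        u = -k * x + sigma * rng.gauss(0.0, 1.0)  # stochastic policy sample
        scores.append(score(k, sigma, x, u))
    return scores


def natural_gradient(grad, scores, damping=1e-8):
    """Divide a scalar gradient by the empirical Fisher information.

    The Fisher information is estimated as the mean squared score; the small
    damping term is an assumption added to avoid division by zero.
    """
    fisher = sum(s * s for s in scores) / len(scores)
    return grad / (fisher + damping)
```

For example, natural_gradient(g, sample_scores(k=1.0, sigma=0.5, n=20000)) rescales a gradient estimate g. Under the stated assumptions the Fisher information is E[x^2]/sigma^2, so a less noisy policy (smaller sigma) yields a smaller natural-gradient step for the same g. With a vector of parameters, the division becomes a linear solve against the Fisher information matrix.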

Original language	English (US)
Title of host publication	2016 IEEE 55th Conference on Decision and Control, CDC 2016
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	1984-1989
Number of pages	6
ISBN (Electronic)	9781509018376
DOIs	https://doi.org/10.1109/CDC.2016.7798555
State	Published - Dec 27 2016
Event	55th IEEE Conference on Decision and Control, CDC 2016 - Las Vegas, United States Duration: Dec 12 2016 → Dec 14 2016

Publication series

Name	2016 IEEE 55th Conference on Decision and Control, CDC 2016

Other

Other	55th IEEE Conference on Decision and Control, CDC 2016
Country/Territory	United States
City	Las Vegas
Period	12/12/16 → 12/14/16

Cite this

Natural gradients for state and output feedback control. / Lamperski, Andrew.
2016 IEEE 55th Conference on Decision and Control, CDC 2016. Institute of Electrical and Electronics Engineers Inc., 2016. p. 1984-1989 7798555 (2016 IEEE 55th Conference on Decision and Control, CDC 2016).

Lamperski, A 2016, Natural gradients for state and output feedback control. in 2016 IEEE 55th Conference on Decision and Control, CDC 2016., 7798555, 2016 IEEE 55th Conference on Decision and Control, CDC 2016, Institute of Electrical and Electronics Engineers Inc., pp. 1984-1989, 55th IEEE Conference on Decision and Control, CDC 2016, Las Vegas, United States, 12/12/16. https://doi.org/10.1109/CDC.2016.7798555

@inproceedings{1358cc3618d4481ba164fc683c338278,
title = "Natural gradients for state and output feedback control",
abstract = "Policy gradient methods for approximate optimal control and reinforcement learning fix a parameterized form of the controller and then perform gradient descent on the cost-to-go function. In reinforcement learning for stochastic state-feedback problems, it has been shown that the natural gradient of the cost-to-go function can be approximated via samples of the state and step-cost, using no information about the plant model. There, the natural gradient is the gradient with respect to the Riemannian metric defined by the Fisher information matrix of the controller parameters. We give a general method for approximating the natural gradient for nonlinear output-feedback stochastic control problems with dynamic controllers. For linear systems, we give explicit formulas to compute the natural gradient when plant matrices are known, in both state and output feedback cases.",
author = "Andrew Lamperski",
year = "2016",
month = dec,
day = "27",
doi = "10.1109/CDC.2016.7798555",
language = "English (US)",
series = "2016 IEEE 55th Conference on Decision and Control, CDC 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "1984--1989",
booktitle = "2016 IEEE 55th Conference on Decision and Control, CDC 2016",
note = "55th IEEE Conference on Decision and Control, CDC 2016 ; Conference date: 12-12-2016 Through 14-12-2016",
}

TY - GEN

T1 - Natural gradients for state and output feedback control

AU - Lamperski, Andrew

PY - 2016/12/27

Y1 - 2016/12/27

N2 - Policy gradient methods for approximate optimal control and reinforcement learning fix a parameterized form of the controller and then perform gradient descent on the cost-to-go function. In reinforcement learning for stochastic state-feedback problems, it has been shown that the natural gradient of the cost-to-go function can be approximated via samples of the state and step-cost, using no information about the plant model. There, the natural gradient is the gradient with respect to the Riemannian metric defined by the Fisher information matrix of the controller parameters. We give a general method for approximating the natural gradient for nonlinear output-feedback stochastic control problems with dynamic controllers. For linear systems, we give explicit formulas to compute the natural gradient when plant matrices are known, in both state and output feedback cases.

AB - Policy gradient methods for approximate optimal control and reinforcement learning fix a parameterized form of the controller and then perform gradient descent on the cost-to-go function. In reinforcement learning for stochastic state-feedback problems, it has been shown that the natural gradient of the cost-to-go function can be approximated via samples of the state and step-cost, using no information about the plant model. There, the natural gradient is the gradient with respect to the Riemannian metric defined by the Fisher information matrix of the controller parameters. We give a general method for approximating the natural gradient for nonlinear output-feedback stochastic control problems with dynamic controllers. For linear systems, we give explicit formulas to compute the natural gradient when plant matrices are known, in both state and output feedback cases.

UR - http://www.scopus.com/inward/record.url?scp=85010815568&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85010815568&partnerID=8YFLogxK

U2 - 10.1109/CDC.2016.7798555

DO - 10.1109/CDC.2016.7798555

M3 - Conference contribution

AN - SCOPUS:85010815568

T3 - 2016 IEEE 55th Conference on Decision and Control, CDC 2016

SP - 1984

EP - 1989

BT - 2016 IEEE 55th Conference on Decision and Control, CDC 2016

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 55th IEEE Conference on Decision and Control, CDC 2016

Y2 - 12 December 2016 through 14 December 2016

ER -
