Hippocampal replay contributes to within session learning in a temporal difference reinforcement learning model

Adam Johnson; A. David Redish

doi:10.1016/j.neunet.2005.08.009

Neuroscience

Research output: Contribution to journal › Article › peer-review

64 Scopus citations

Abstract

Temporal difference reinforcement learning (TDRL) algorithms, hypothesized to partially explain basal ganglia functionality, learn more slowly than real animals. Modified TDRL algorithms (e.g. the Dyna-Q family) learn faster than standard TDRL by practicing experienced sequences offline. We suggest that the replay phenomenon, in which ensembles of hippocampal neurons replay previously experienced firing sequences during subsequent rest and sleep, may provide practice sequences to improve the speed of TDRL learning, even within a single session. We test the plausibility of this hypothesis in a computational model of a multiple-T choice-task. Rats show two learning rates on this task: a fast decrease in errors and a slow development of a stereotyped path. Adding developing replay to the model accelerates learning the correct path, but slows down the stereotyping of that path. These models provide testable predictions relating the effects of hippocampal inactivation as well as hippocampal replay on this task.
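The offline-practice idea behind the Dyna-Q family can be illustrated with a minimal tabular sketch. This is not the authors' model: the linear track, the reward scheme, the function name `run_dyna_q`, and every parameter value below are assumptions chosen only for illustration. After each real step, the agent stores the experienced transition and then re-applies the same temporal difference update to `n_replay` transitions sampled from memory, which loosely mirrors the role the abstract proposes for replay.

```python
import random


def run_dyna_q(n_replay, n_episodes=30, n_states=8, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Dyna-Q on a hypothetical linear track (illustrative only).

    Returns the number of real steps taken in each episode.
    """
    rng = random.Random(seed)
    actions = (-1, +1)          # step left or step right
    goal = n_states - 1         # reward is delivered at the right end
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    memory = {}                 # (state, action) -> (reward, next_state)

    def td_update(s, a, r, s2):
        # One-step Q-learning (temporal difference) update.
        best_next = 0.0 if s2 == goal else max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

    def choose(s):
        # Epsilon-greedy action selection with random tie-breaking.
        if rng.random() < epsilon:
            return rng.choice(actions)
        best = max(q[(s, b)] for b in actions)
        return rng.choice([b for b in actions if q[(s, b)] == best])

    steps_per_episode = []
    for _ in range(n_episodes):
        s, steps = 0, 0
        while s != goal:
            a = choose(s)
            s2 = min(max(s + a, 0), goal)   # walls at both ends of the track
            r = 1.0 if s2 == goal else 0.0
            td_update(s, a, r, s2)          # learn from the real transition
            memory[(s, a)] = (r, s2)        # remember only what was experienced
            for _ in range(n_replay):       # offline practice on remembered transitions
                (ps, pa), (pr, ps2) = rng.choice(list(memory.items()))
                td_update(ps, pa, pr, ps2)
            s, steps = s2, steps + 1
        steps_per_episode.append(steps)
    return steps_per_episode
```

Calling `run_dyna_q(0)` gives plain one-step TD learning, while `run_dyna_q(10)` adds ten replayed updates per real step; comparing the two step counts across episodes is one way to explore how replay might change learning speed in this toy setting. Sampling uniformly from memory is a simplification, and the paper's "developing replay" mechanism is not reproduced here.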

Original language	English (US)
Pages (from-to)	1163-1171
Number of pages	9
Journal	Neural Networks
Volume	18
Issue number	9
DOIs	https://doi.org/10.1016/j.neunet.2005.08.009
State	Published - Nov 2005

Bibliographical note

Funding Information:
We thank Jadin Jackson, Zeb Kurth-Nelson, Beth Masimore, Neil Schmitzer-Torbert, and Giuseppe Cortese for helpful discussions and comments on the manuscript. This work was supported by NIH (MH68029) and by fellowships from 3M and from the Center for Cognitive Sciences (grant number T32HD007151).

Cite this

@article{f04b2dc4f0d04d108fad1d824ff782b0,

title = "Hippocampal replay contributes to within session learning in a temporal difference reinforcement learning model",

abstract = "Temporal difference reinforcement learning (TDRL) algorithms, hypothesized to partially explain basal ganglia functionality, learn more slowly than real animals. Modified TDRL algorithms (e.g. the Dyna-Q family) learn faster than standard TDRL by practicing experienced sequences offline. We suggest that the replay phenomenon, in which ensembles of hippocampal neurons replay previously experienced firing sequences during subsequent rest and sleep, may provide practice sequences to improve the speed of TDRL learning, even within a single session. We test the plausibility of this hypothesis in a computational model of a multiple-T choice-task. Rats show two learning rates on this task: a fast decrease in errors and a slow development of a stereotyped path. Adding developing replay to the model accelerates learning the correct path, but slows down the stereotyping of that path. These models provide testable predictions relating the effects of hippocampal inactivation as well as hippocampal replay on this task.",

author = "Johnson, Adam and Redish, {A. David}",

note = "Funding Information: We thank Jadin Jackson, Zeb Kurth-Nelson, Beth Masimore, Neil Schmitzer-Torbert, and Giuseppe Cortese for helpful discussions and comments on the manuscript. This work was supported by NIH (MH68029) and by fellowships from 3M and from the Center for Cognitive Sciences (grant number T32HD007151).",

year = "2005",

month = nov,

doi = "10.1016/j.neunet.2005.08.009",

language = "English (US)",

volume = "18",

pages = "1163--1171",

journal = "Neural Networks",

issn = "0893-6080",

publisher = "Elsevier Limited",

number = "9",

}

TY - JOUR

T1 - Hippocampal replay contributes to within session learning in a temporal difference reinforcement learning model

AU - Johnson, Adam

AU - Redish, A. David

N1 - Funding Information: We thank Jadin Jackson, Zeb Kurth-Nelson, Beth Masimore, Neil Schmitzer-Torbert, and Giuseppe Cortese for helpful discussions and comments on the manuscript. This work was supported by NIH (MH68029) and by fellowships from 3M and from the Center for Cognitive Sciences (grant number T32HD007151).

PY - 2005/11

Y1 - 2005/11

N2 - Temporal difference reinforcement learning (TDRL) algorithms, hypothesized to partially explain basal ganglia functionality, learn more slowly than real animals. Modified TDRL algorithms (e.g. the Dyna-Q family) learn faster than standard TDRL by practicing experienced sequences offline. We suggest that the replay phenomenon, in which ensembles of hippocampal neurons replay previously experienced firing sequences during subsequent rest and sleep, may provide practice sequences to improve the speed of TDRL learning, even within a single session. We test the plausibility of this hypothesis in a computational model of a multiple-T choice-task. Rats show two learning rates on this task: a fast decrease in errors and a slow development of a stereotyped path. Adding developing replay to the model accelerates learning the correct path, but slows down the stereotyping of that path. These models provide testable predictions relating the effects of hippocampal inactivation as well as hippocampal replay on this task.

AB - Temporal difference reinforcement learning (TDRL) algorithms, hypothesized to partially explain basal ganglia functionality, learn more slowly than real animals. Modified TDRL algorithms (e.g. the Dyna-Q family) learn faster than standard TDRL by practicing experienced sequences offline. We suggest that the replay phenomenon, in which ensembles of hippocampal neurons replay previously experienced firing sequences during subsequent rest and sleep, may provide practice sequences to improve the speed of TDRL learning, even within a single session. We test the plausibility of this hypothesis in a computational model of a multiple-T choice-task. Rats show two learning rates on this task: a fast decrease in errors and a slow development of a stereotyped path. Adding developing replay to the model accelerates learning the correct path, but slows down the stereotyping of that path. These models provide testable predictions relating the effects of hippocampal inactivation as well as hippocampal replay on this task.

UR - http://www.scopus.com/inward/record.url?scp=27844567151&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=27844567151&partnerID=8YFLogxK

U2 - 10.1016/j.neunet.2005.08.009

DO - 10.1016/j.neunet.2005.08.009

M3 - Article

C2 - 16198539

AN - SCOPUS:27844567151

SN - 0893-6080

VL - 18

SP - 1163

EP - 1171

JO - Neural Networks

JF - Neural Networks

IS - 9

ER -
