Reconciling Reinforcement Learning Models With Behavioral Extinction and Renewal: Implications for Addiction, Relapse, and Problem Gambling

David Redish; Steve Jensen; Adam Johnson; Zeb Kurth-Nelson

doi:10.1037/0033-295X.114.3.784

Computer Science and Engineering

Research output: Contribution to journal › Article › peer-review

242 Scopus citations

Abstract

Because learned associations are quickly renewed following extinction, the extinction process must include processes other than unlearning. However, reinforcement learning models, such as the temporal difference reinforcement learning (TDRL) model, treat extinction as an unlearning of associated value and are thus unable to capture renewal. TDRL models are based on the hypothesis that dopamine carries a reward prediction error signal; these models predict reward by driving that reward error to zero. The authors construct a TDRL model that can accommodate extinction and renewal through two simple processes: (a) a TDRL process that learns the value of situation-action pairs and (b) a situation recognition process that categorizes the observed cues into situations. This model has implications for dysfunctional states, including relapse after addiction and problem gambling.
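The contrast drawn in the abstract can be illustrated with a minimal sketch. This is not the authors' model. It assumes a one-step task with a terminal outcome, a tabular value estimate, and a deliberately crude stand-in for situation recognition that simply treats each (cue, context) pair as its own situation. The names `td_update`, `run_phase`, `simulate`, `cue_only`, and `cue_and_context`, the learning rate, and the trial counts are all hypothetical choices made for illustration.

```python
def td_update(value, reward, alpha=0.1):
    """One TD step for a terminal outcome: prediction error delta = r - V."""
    delta = reward - value
    return value + alpha * delta


def run_phase(values, situation, reward, trials, alpha=0.1):
    """Repeatedly present one situation with a fixed reward and update its value."""
    for _ in range(trials):
        values[situation] = td_update(values.get(situation, 0.0), reward, alpha)


def cue_only(cue, context):
    """Plain TDRL stand-in: the situation is just the cue, context is ignored."""
    return cue


def cue_and_context(cue, context):
    """Assumed situation recognition: each (cue, context) pair is a separate situation."""
    return (cue, context)


def simulate(recognize):
    """Acquisition in context A, extinction in context B, then read out value in A."""
    values = {}
    run_phase(values, recognize("tone", "A"), reward=1.0, trials=100)  # acquisition
    run_phase(values, recognize("tone", "B"), reward=0.0, trials=100)  # extinction
    return values.get(recognize("tone", "A"), 0.0)                     # renewal test


v_plain = simulate(cue_only)         # extinction overwrites the learned value
v_split = simulate(cue_and_context)  # extinction is stored in a separate situation

print(f"value on return to A, cue only:        {v_plain:.4f}")
print(f"value on return to A, cue and context: {v_split:.4f}")
```

Under these assumptions, the cue-only learner unlearns the association during extinction, so nothing is left to renew, whereas the learner that assigns extinction trials to a different situation retains the original value when the acquisition context returns.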

Original language	English (US)
Pages (from-to)	784-805
Number of pages	22
Journal	Psychological Review
Volume	114
Issue number	3
DOIs	https://doi.org/10.1037/0033-295X.114.3.784
State	Published - Jul 2007

Keywords

dopamine
problem gambling
reinstantiation
temporal difference reinforcement learning (TDRL)

Cite this

@article{8c4bfd1941c04d97a9ef0634079483ac,
  title = "Reconciling Reinforcement Learning Models With Behavioral Extinction and Renewal: Implications for Addiction, Relapse, and Problem Gambling",
  abstract = "Because learned associations are quickly renewed following extinction, the extinction process must include processes other than unlearning. However, reinforcement learning models, such as the temporal difference reinforcement learning (TDRL) model, treat extinction as an unlearning of associated value and are thus unable to capture renewal. TDRL models are based on the hypothesis that dopamine carries a reward prediction error signal; these models predict reward by driving that reward error to zero. The authors construct a TDRL model that can accommodate extinction and renewal through two simple processes: (a) a TDRL process that learns the value of situation-action pairs and (b) a situation recognition process that categorizes the observed cues into situations. This model has implications for dysfunctional states, including relapse after addiction and problem gambling.",
  keywords = "dopamine, problem gambling, reinstantiation, temporal difference reinforcement learning (TDRL)",
  author = "David Redish and Steve Jensen and Adam Johnson and Zeb Kurth-Nelson",
  year = "2007",
  month = jul,
  doi = "10.1037/0033-295X.114.3.784",
  language = "English (US)",
  volume = "114",
  pages = "784--805",
  journal = "Psychological Review",
  issn = "0033-295X",
  publisher = "American Psychological Association",
  number = "3",
}

TY  - JOUR
T1  - Reconciling Reinforcement Learning Models With Behavioral Extinction and Renewal
T2  - Implications for Addiction, Relapse, and Problem Gambling
AU  - Redish, David
AU  - Jensen, Steve
AU  - Johnson, Adam
AU  - Kurth-Nelson, Zeb
PY  - 2007/7
Y1  - 2007/7
N2  - Because learned associations are quickly renewed following extinction, the extinction process must include processes other than unlearning. However, reinforcement learning models, such as the temporal difference reinforcement learning (TDRL) model, treat extinction as an unlearning of associated value and are thus unable to capture renewal. TDRL models are based on the hypothesis that dopamine carries a reward prediction error signal; these models predict reward by driving that reward error to zero. The authors construct a TDRL model that can accommodate extinction and renewal through two simple processes: (a) a TDRL process that learns the value of situation-action pairs and (b) a situation recognition process that categorizes the observed cues into situations. This model has implications for dysfunctional states, including relapse after addiction and problem gambling.
AB  - Because learned associations are quickly renewed following extinction, the extinction process must include processes other than unlearning. However, reinforcement learning models, such as the temporal difference reinforcement learning (TDRL) model, treat extinction as an unlearning of associated value and are thus unable to capture renewal. TDRL models are based on the hypothesis that dopamine carries a reward prediction error signal; these models predict reward by driving that reward error to zero. The authors construct a TDRL model that can accommodate extinction and renewal through two simple processes: (a) a TDRL process that learns the value of situation-action pairs and (b) a situation recognition process that categorizes the observed cues into situations. This model has implications for dysfunctional states, including relapse after addiction and problem gambling.
KW  - dopamine
KW  - problem gambling
KW  - reinstantiation
KW  - temporal difference reinforcement learning (TDRL)
UR  - http://www.scopus.com/inward/record.url?scp=34548837994&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=34548837994&partnerID=8YFLogxK
U2  - 10.1037/0033-295X.114.3.784
DO  - 10.1037/0033-295X.114.3.784
M3  - Article
C2  - 17638506
AN  - SCOPUS:34548837994
SN  - 0033-295X
VL  - 114
SP  - 784
EP  - 805
JO  - Psychological Review
JF  - Psychological Review
IS  - 3
ER  -
