Structure learning in human sequential decision-making

Daniel E. Acuña; Paul Schrater

doi:10.1371/journal.pcbi.1001003

Psychology (Twin Cities)

Research output: Contribution to journal › Article › peer-review

37 Scopus citations

Abstract

Studies of sequential decision-making in humans frequently find suboptimal performance relative to an ideal actor that has perfect knowledge of the model of how rewards and events are generated in the environment. Rather than being suboptimal, we argue that the learning problem humans face is more complex, in that it also involves learning the structure of reward generation in the environment. We formulate the problem of structure learning in sequential decision tasks using Bayesian reinforcement learning, and show that learning the generative model for rewards qualitatively changes the behavior of an optimal learning agent. To test whether people exhibit structure learning, we performed experiments involving a mixture of one-armed and two-armed bandit reward models, where structure learning produces many of the qualitative behaviors deemed suboptimal in previous studies. Our results demonstrate humans can perform structure learning in a near-optimal manner.

Original language	English (US)
Article number	e1001003
Journal	PLoS Computational Biology
Volume	6
Issue number	12
DOIs	https://doi.org/10.1371/journal.pcbi.1001003
State	Published - Dec 2010

Cite this

@article{7f2762f18e504b9ca66fc31dbb848001,

title = "Structure learning in human sequential decision-making",

abstract = "Studies of sequential decision-making in humans frequently find suboptimal performance relative to an ideal actor that has perfect knowledge of the model of how rewards and events are generated in the environment. Rather than being suboptimal, we argue that the learning problem humans face is more complex, in that it also involves learning the structure of reward generation in the environment. We formulate the problem of structure learning in sequential decision tasks using Bayesian reinforcement learning, and show that learning the generative model for rewards qualitatively changes the behavior of an optimal learning agent. To test whether people exhibit structure learning, we performed experiments involving a mixture of one-armed and two-armed bandit reward models, where structure learning produces many of the qualitative behaviors deemed suboptimal in previous studies. Our results demonstrate humans can perform structure learning in a near-optimal manner.",

author = "Acu{\~n}a, Daniel E. and Schrater, Paul",

year = "2010",

month = dec,

doi = "10.1371/journal.pcbi.1001003",

language = "English (US)",

volume = "6",

journal = "PLoS Computational Biology",

issn = "1553-734X",

publisher = "Public Library of Science",

number = "12",

}

TY - JOUR

T1 - Structure learning in human sequential decision-making

AU - Acuña, Daniel E.

AU - Schrater, Paul

PY - 2010/12

Y1 - 2010/12

AB - Studies of sequential decision-making in humans frequently find suboptimal performance relative to an ideal actor that has perfect knowledge of the model of how rewards and events are generated in the environment. Rather than being suboptimal, we argue that the learning problem humans face is more complex, in that it also involves learning the structure of reward generation in the environment. We formulate the problem of structure learning in sequential decision tasks using Bayesian reinforcement learning, and show that learning the generative model for rewards qualitatively changes the behavior of an optimal learning agent. To test whether people exhibit structure learning, we performed experiments involving a mixture of one-armed and two-armed bandit reward models, where structure learning produces many of the qualitative behaviors deemed suboptimal in previous studies. Our results demonstrate humans can perform structure learning in a near-optimal manner.

UR - http://www.scopus.com/inward/record.url?scp=78651226963&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78651226963&partnerID=8YFLogxK

U2 - 10.1371/journal.pcbi.1001003

DO - 10.1371/journal.pcbi.1001003

M3 - Article

C2 - 21151963

AN - SCOPUS:78651226963

SN - 1553-734X

VL - 6

JO - PLoS Computational Biology

JF - PLoS Computational Biology

IS - 12

M1 - e1001003

ER -
