In dynamic environments, adaptive behavior requires striking a balance between harvesting currently available rewards (exploitation) and gathering information about alternative options (exploration) [1-4]. Such strategic decisions should incorporate not only recent reward history, but also opportunity costs and environmental statistics. Previous neuroimaging [5-8] and neurophysiological [9-13] studies have implicated orbitofrontal cortex, anterior cingulate cortex, and ventral striatum in distinguishing between bouts of exploration and exploitation. Nonetheless, the neuronal mechanisms that underlie strategy selection remain poorly understood. We hypothesized that posterior cingulate cortex (CGp), an area linking reward processing, attention, memory [15, 16], and motor control systems, mediates the integration of variables such as reward, uncertainty, and target location that underlie this dynamic balance. Here we show that CGp neurons distinguish between exploratory and exploitative decisions made by monkeys in a dynamic foraging task. Moreover, firing rates of these neurons predict in graded fashion the strategy most likely to be selected on upcoming trials. This encoding is distinct from switching between targets and is independent of the absolute magnitudes of rewards. These observations implicate CGp in the integration of individual outcomes across decision making and the modification of strategy in dynamic environments.
Bibliographical note
Funding Information:
This work was supported by National Institute on Drug Abuse postdoctoral fellowship 023338-01 (B.Y.H.), National Institutes of Health grant R01EY013496 (M.L.P.), and the Duke Institute for Brain Studies (M.L.P.). We thank K. Watson for assistance in training the animals and A. Long for comments on the manuscript.