Temporal difference models and reward-related learning in the human brain

John P O'Doherty; Peter Dayan; Karl Friston; Hugo Critchley; Raymond J Dolan

doi:10.1016/s0896-6273(03)00169-7

Neuron. 2003 Apr 24;38(2):329-37. doi: 10.1016/s0896-6273(03)00169-7.

Authors

John P O'Doherty¹, Peter Dayan, Karl Friston, Hugo Critchley, Raymond J Dolan

Affiliation

¹ Wellcome Department of Imaging Neuroscience, Institute of Neurology, University College London, WC1N 3BG, London, United Kingdom. j.odoherty@fil.ion.ucl.ac.uk

PMID: 12718865

Abstract

Temporal difference learning has been proposed as a model for Pavlovian conditioning, in which an animal learns to predict delivery of reward following presentation of a conditioned stimulus (CS). A key component of this model is a prediction error signal, which, before learning, responds at the time of presentation of reward but, after learning, shifts its response to the time of onset of the CS. In order to test for regions manifesting this signal profile, subjects were scanned using event-related fMRI while undergoing appetitive conditioning with a pleasant taste reward. Regression analyses revealed that responses in ventral striatum and orbitofrontal cortex were significantly correlated with this error signal, suggesting that, during appetitive conditioning, computations described by temporal difference learning are expressed in the human brain.
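
The shift of the prediction error from the time of reward to the time of CS onset can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: it assumes a TD(0) learner with one value weight per time step after CS onset (a tapped-delay-line stimulus representation), an unpredictable CS, and a single reward of magnitude 1. The function name `simulate_td`, the trial layout, and all parameter values (`alpha`, `gamma`, `cs_time`, `reward_time`) are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_td(n_trials=100, n_steps=10, cs_time=2, reward_time=7,
                alpha=0.3, gamma=1.0):
    """Return TD prediction errors, shape (n_trials, n_steps).

    Assumptions (illustrative only): one learned value weight per time
    step from CS onset onward; value is fixed at zero before the CS
    because the CS itself cannot be predicted.
    """
    w = np.zeros(n_steps)                  # learned value of each time step
    deltas = np.zeros((n_trials, n_steps))

    for trial in range(n_trials):
        for t in range(n_steps - 1):
            # Value estimates for the current and the next time step.
            v_t = w[t] if t >= cs_time else 0.0
            v_next = w[t + 1] if t + 1 >= cs_time else 0.0
            # Reward arrives on the transition into reward_time.
            r = 1.0 if t + 1 == reward_time else 0.0

            # TD prediction error: delta = r + gamma * V(t+1) - V(t)
            delta = r + gamma * v_next - v_t

            # Only time steps with a stimulus trace can be updated.
            if t >= cs_time:
                w[t] += alpha * delta

            # Record the error at the moment the new information arrives.
            deltas[trial, t + 1] = delta

    return deltas


deltas = simulate_td()
print("first trial, peak error at step:", int(deltas[0].argmax()))
print("last trial,  peak error at step:", int(deltas[-1].argmax()))
```

Under these assumptions, the error on the first trial peaks at the reward step, and on the final trial it peaks at CS onset while the error at the reward step falls to approximately zero. A time course of this kind is the sort of signal that could serve as a regressor against event-related fMRI responses, although the model and parameters used in the study may differ from this sketch.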

Publication types

Clinical Trial
Research Support, Non-U.S. Gov't

MeSH terms

Adolescent
Adult
Brain / anatomy & histology
Brain / physiology*
Brain Mapping
Conditioning, Classical / physiology
Corpus Striatum / anatomy & histology
Corpus Striatum / physiology
Female
Frontal Lobe / anatomy & histology
Frontal Lobe / physiology
Humans
Learning / physiology*
Magnetic Resonance Imaging
Male
Reference Values
Reflex, Pupillary / physiology
Reward*
Taste / physiology
Time Perception / physiology*