Journal of Neuroscience, Vol 16, 1936-1947, Copyright © 1996 by Society for Neuroscience
A framework for mesencephalic dopamine systems based on predictive Hebbian learning
PR Montague, P Dayan and TJ Sejnowski
Division of Neuroscience, Baylor College of Medicine, Houston, Texas 77030, USA.
We develop a theoretical framework that shows how mesencephalic dopamine
systems could distribute to their targets a signal that represents
information about future expectations. In particular, we show how activity
in the cerebral cortex can make predictions about future receipt of reward
and how fluctuations in the activity levels of neurons in diffuse dopamine
systems above and below baseline levels would represent errors in these
predictions that are delivered to cortical and subcortical targets. We
present a model for how such errors could be constructed in a real brain
that is consistent with physiological results for a subset of dopaminergic
neurons located in the ventral tegmental area and surrounding dopaminergic
neurons. The theory also makes testable predictions about human choice
behavior on a simple decision-making task. Furthermore, we show that,
through a simple influence on synaptic plasticity, fluctuations in dopamine
release can act to change the predictions in an appropriate manner.
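
The abstract gives no equations, so the following is only a minimal sketch of the kind of prediction-error signal it describes, not the authors' implementation. It assumes a temporal-difference form of the error, delta(t) = r(t) + V(t) - V(t-1), a prediction V(t) read out from one weight per time step after cue onset, and a weight change proportional to the error. The function name `run_trial`, the trial length, the cue and reward times, and the learning rate are all illustrative assumptions.

```python
import numpy as np

def run_trial(w, cue_t, reward_t, lr, rewarded=True):
    """Run one cue-reward trial; update w in place and return the error trace.

    Assumed error form: delta(t) = r(t) + V(t) - V(t-1).
    """
    n_steps = len(w)

    def v(t):
        # The prediction is zero before the cue appears (assumption).
        return w[t] if t >= cue_t else 0.0

    delta = np.zeros(n_steps)
    for t in range(1, n_steps):
        r = 1.0 if (rewarded and t == reward_t) else 0.0
        delta[t] = r + v(t) - v(t - 1)
        # Plasticity rule: the error adjusts the weight that was active
        # one step earlier, and only once the cue has been seen.
        if t - 1 >= cue_t:
            w[t - 1] += lr * delta[t]
    return delta

# Hypothetical task: cue at step 5, reward at step 15, 20 steps per trial.
w = np.zeros(20)
first = run_trial(w, cue_t=5, reward_t=15, lr=0.3)
for _ in range(300):
    late = run_trial(w, cue_t=5, reward_t=15, lr=0.3)
# Probe trial on a copy of the weights: the expected reward is withheld.
omitted = run_trial(w.copy(), cue_t=5, reward_t=15, lr=0.3, rewarded=False)

print("first trial, error at reward:", first[15])
print("after training, error at cue:", round(late[5], 3))
print("after training, error at reward:", round(late[15], 3))
print("reward omitted, error at reward time:", round(omitted[15], 3))
```

In this sketch the error is positive at an unpredicted reward, moves to cue onset once the reward is predicted, and drops below zero when a predicted reward is withheld. That is the pattern of fluctuation above and below baseline that the abstract attributes to dopamine neuron activity.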