Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops

Abstract

Evaluation of both immediate and future outcomes of one's actions is a critical requirement for intelligent behavior. Using functional magnetic resonance imaging (fMRI), we investigated brain mechanisms for reward prediction at different time scales in a Markov decision task. When human subjects learned actions on the basis of immediate rewards, significant activity was seen in the lateral orbitofrontal cortex and the striatum. When subjects learned to act in order to obtain large future rewards while incurring small immediate losses, the dorsolateral prefrontal cortex, inferior parietal cortex, dorsal raphe nucleus and cerebellum were also activated. Computational model–based regression analysis using the predicted future rewards and prediction errors estimated from subjects' performance data revealed graded maps of time scale within the insula and the striatum: ventroanterior regions were involved in predicting immediate rewards and dorsoposterior regions were involved in predicting future rewards. These results suggest differential involvement of the cortico-basal ganglia loops in reward prediction at different time scales.
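The trade-off described above, accepting small immediate losses to reach a large future reward, can be illustrated with a discounted sum of rewards. The following is a minimal sketch, not the authors' model: the reward sequences, the function names `discounted_return` and `preferred`, and the chosen values of the discount factor are all hypothetical and serve only to show how the time scale of prediction changes which course of action looks better.

```python
def discounted_return(rewards, gamma):
    """Return the sum of gamma**k * r_k over a finite reward sequence."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


# Hypothetical reward sequences (illustrative only, not the task's payoffs).
GREEDY = [1.0, 1.0, 1.0]     # small gains on every step
PATIENT = [-1.0, -1.0, 10.0]  # small losses first, then one large gain


def preferred(gamma):
    """Name the sequence with the larger discounted return for this gamma."""
    greedy_value = discounted_return(GREEDY, gamma)
    patient_value = discounted_return(PATIENT, gamma)
    return "patient" if patient_value > greedy_value else "greedy"


if __name__ == "__main__":
    for gamma in (0.0, 0.3, 0.9):
        print(gamma, preferred(gamma))
```

With a small discount factor only the first few rewards matter, so the sequence of immediate gains is preferred; with a discount factor near 1 the delayed large reward dominates.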

Figure 1: Experimental design.
Figure 2: Task schedule and behavioral results.
Figure 3: Brain areas activated in the SHORT versus NO contrast (P < 0.001, uncorrected; extent threshold of four voxels).
Figure 4: Brain areas activated in the LONG versus SHORT contrast (P < 0.0001, uncorrected; extent threshold of four voxels for illustration purposes).
Figure 5: Comparison of brain areas activated in the SHORT versus NO contrast (red) and the LONG versus SHORT contrast (blue).
Figure 6: Voxels with a significant correlation (height threshold P < 0.001, uncorrected; extent threshold of four voxels) with reward prediction V(t) and prediction error δ(t) are shown in different colors for different settings of the discount factor γ.
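
The symbols in the Figure 6 caption can be read with the usual temporal-difference conventions from reinforcement learning. The forms below are the standard textbook definitions, with r(t) denoting the reward received at time t; the exact time indexing is an assumption here and may differ from the one used in the full text.

```latex
V(t) = E\left[\, r(t+1) + \gamma\, r(t+2) + \gamma^{2}\, r(t+3) + \cdots \,\right]

\delta(t) = r(t) + \gamma\, V(t) - V(t-1)
```

Under these definitions V(t-1) = E[r(t) + γV(t)], so δ(t) has zero mean once the predictions are accurate. A discount factor γ near 0 restricts V(t) to the next reward, whereas γ near 1 weights distant rewards almost as heavily as immediate ones.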

Acknowledgements

We thank K. Samejima, N. Schweighofer, M. Haruno, H. Imamizu, S. Higuchi, T. Yoshioka, T. Chaminade and M. Kawato for helpful discussions and technical advice. This research was funded by 'Creating the Brain,' Core Research for Evolutional Science and Technology (CREST), Japan Science and Technology Agency.

Author information

Corresponding author

Correspondence to Kenji Doya.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Figure 1

An example of the time series of the explanatory variables for one subject. (PDF 34 kb)

Supplementary Figure 2

A schematic diagram of the brain areas involved in reward prediction at different time scales. The dotted lines indicate cortico-cortical connections, and the green arrows indicate the serotonergic pathways from the dorsal raphe. The 'limbic loop' (including lateral OFC and ventral striatum) is involved in short-term reward prediction. The 'cognitive and motor loops' (including DLPFC, PMd and dorsal striatum) are involved in long-term reward prediction. Ventroanterior-to-dorsoposterior topographical projections from the insula to the striatum are involved in short-to-long-term reward prediction (rainbow-colored arrow). The mPFC and dorsal raphe, which are reciprocally connected, may regulate these loops by cortico-cortical and cortico-striatal projections from mPFC and serotonergic projections from dorsal raphe. SNr, substantia nigra pars reticulata. (PDF 415 kb)

Supplementary Table 1

Areas significantly activated in the block-design analysis. (PDF 11 kb)

Supplementary Table 2

Areas with significant correlation with reward prediction V(t) estimated with different discount factors γ. (PDF 15 kb)

Supplementary Table 3

Voxels with significant correlation with reward prediction error δ(t) estimated with different discount factors γ. (PDF 15 kb)

About this article

Cite this article

Tanaka, S., Doya, K., Okada, G. et al. Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nat Neurosci 7, 887–893 (2004). https://doi.org/10.1038/nn1279

  • DOI: https://doi.org/10.1038/nn1279
