The Journal of Neuroscience, August 5, 2009, 29(31):9861-9874; doi:10.1523/JNEUROSCI.6157-08.2009
Behavioral/Systems/Cognitive
Validation of Decision-Making Models and Analysis of Decision Variables in the Rat Basal Ganglia
Makoto Ito1 and
Kenji Doya1,2
1Neural Computation Unit, Okinawa Institute of Science and Technology, Okinawa 904-2234, Japan, and 2Computational Neuroscience Laboratories, Advanced Telecommunications Research Institute International, Kyoto 619-0288, Japan
Correspondence should be addressed to either Makoto Ito or Kenji Doya, Neural Computation Unit, Okinawa Institute of Science and Technology Promotion Corporation, Initial Research Project, 12-22 Suzaki, Uruma Okinawa 904-2234, Japan. Email: ito{at}oist.jp or Email: doya{at}oist.jp
Reinforcement learning theory plays a key role in understanding the behavioral and neural mechanisms of choice behavior in animals and humans. In particular, intermediate variables of learning models estimated from behavioral data, such as the expectation of reward for each candidate choice (action value), have been used in searches for the neural correlates of computational elements in learning and decision making. The aims of the present study are as follows: (1) to test which computational model best captures the choice learning process in animals and (2) to elucidate how action values are represented in different parts of the cortico-basal ganglia circuit. We compared different behavioral learning algorithms to predict the choice sequences generated by rats during a free-choice task and analyzed associated neural activity in the nucleus accumbens (NAc) and ventral pallidum (VP). The major findings of this study were as follows: (1) modified versions of an action–value learning model captured a variety of choice strategies of rats, including win-stay–lose-switch and persevering behavior, and predicted rats' choice sequences better than the best multistep Markov model; and (2) information about action values and future actions was coded in both the NAc and VP, but was less dominant than information about trial types, selected actions, and reward outcome. The results of our model-based analysis suggest that the primary role of the NAc and VP is to monitor information important for updating choice behaviors. Information represented in the NAc and VP might contribute to a choice mechanism that is situated elsewhere.
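As a rough illustration of what it means for an action-value learning model to "predict" a choice sequence, the sketch below scores a two-choice sequence under a textbook Q-learning rule with a softmax choice function. It is not the modified model used in the study, which is not specified here. The function names, the learning rate `alpha`, and the inverse temperature `beta` are assumptions chosen for illustration.

```python
import math


def q_learning_log_likelihood(choices, rewards, alpha=0.3, beta=3.0):
    """Log-likelihood of a two-choice sequence under standard Q-learning.

    choices: sequence of 0/1 actions; rewards: sequence of 0/1 outcomes.
    alpha (learning rate) and beta (inverse temperature) are illustrative
    assumptions, not values taken from the article.
    """
    q = [0.0, 0.0]  # action values, one per candidate choice
    log_lik = 0.0
    for a, r in zip(choices, rewards):
        # Softmax over two actions reduces to a logistic of the value difference.
        p_a = 1.0 / (1.0 + math.exp(-beta * (q[a] - q[1 - a])))
        log_lik += math.log(p_a)
        # Move the chosen action's value toward the observed reward.
        q[a] += alpha * (r - q[a])
    return log_lik


def normalized_likelihood(log_lik, n_trials):
    """Geometric-mean prediction probability per trial (0.5 is chance)."""
    return math.exp(log_lik / n_trials)
```

A competing model, such as a Markov model over recent choices and outcomes, could be scored on the same held-out trials with the same per-trial measure, so that the two are compared on equal terms.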
Received Dec. 25, 2008; revised May 13, 2009; accepted June 15, 2009.