Expected reward impacts behavior and neuronal activity in brain areas involved in sensorimotor processes. However, where and how reward signals affect sensorimotor signals is unclear. Here, we show evidence that reward-dependent modulation of behavior depends on normal dopamine transmission in the striatum. Monkeys performed a visually guided saccade task in which expected reward gain was different depending on the position of the target. Saccadic reaction times were reliably shorter on large-reward trials than on small-reward trials. When position–reward contingency was switched, the reaction time difference changed rapidly. Injecting dopamine D1 antagonist into the caudate significantly attenuated the reward-dependent saccadic reaction time changes. Conversely, injecting D2 antagonist into the same region enhanced the reward-dependent changes. These results suggest that reward-dependent changes in saccadic eye movements depend partly on dopaminergic modulation of neuronal activity in the caudate nucleus.
We thank Drs. Matt Roesch, Masaki Isoda, Long Ding, Hiroyuki Nakahara, Robert H. Wurtz, and Barry J. Richmond for helpful comments. We are also grateful to GC America, Inc. for providing the dental acrylic, which was indispensable for our surgery.
Primate (Inoue et al., 1985; Watanabe, 1996; Leon and Shadlen, 1999; Platt and Glimcher, 1999; Amador et al., 2000; Liu et al., 2000; Stuphorn et al., 2000; Glimcher, 2001; Kobayashi et al., 2002; Shidara and Richmond, 2002; Schultz et al., 2003; Roesch and Olson, 2004; Sugrue et al., 2004) and human (for review, see McClure et al., 2004) studies have shown that neuronal activity in many brain areas is modulated by expected reward. These changes impact motor behavior, as measured by changes in choice or movement speed. However, it is unclear which neural pathways are crucially involved in incorporating reward information with sensorimotor or cognitive signals.
The dopamine system has been implicated in reward-related functions. Dopamine neurons respond to the delivery of unexpected reward and to sensory cues that predict reward (Ljungberg et al., 1991; Schultz, 1998; Satoh et al., 2003; Kawagoe et al., 2004). Reward-seeking behaviors and reward-related incentive learning can be impaired by dopamine antagonists (Beninger and Miller, 1998; Wise, 2004). At the cellular level, dopamine exerts powerful modulatory influences on intrinsic membrane properties and synaptic efficacy of the striatal projection neurons via D1- and D2-like receptor activation (Calabresi et al., 1987; Nicola et al., 2000; Reynolds and Wickens, 2002; Mahon et al., 2004).
It has been suggested that reward-dependent modulation of neuronal activity might occur in the striatum, an input channel of the basal ganglia, and that dopamine may be responsible for this process. Anatomically, the striatum receives convergent inputs from many cortical areas that carry sensorimotor/cognitive signals (Selemon and Goldman-Rakic, 1985; Parthasarathy et al., 1992) as well as strong dopaminergic projection from the substantia nigra pars compacta (Lynd-Balta and Haber, 1994). The striatum is involved in making associations between particular stimuli and behavioral responses, the process of which initially depends on reward signals but eventually may become more habitual (Packard and Knowlton, 2002; White and McDonald, 2002; Gerdeman et al., 2003; Everitt and Robbins, 2005). However, it is still unclear how dopamine alters neuronal activity in the striatum, and in turn, how these changes impact behavior. Here, we address this issue by using the primate saccade system, taking advantage of its measurable motor output and its relatively well understood neuronal circuitry (Hikosaka et al., 2000).
When different reward sizes are associated with particular locations of saccade targets, visual, saccadic, and cognitive signals carried by single neurons in the oculomotor region in the caudate nucleus are strongly modulated by expected reward (Kawagoe et al., 1998; Lauwereyns et al., 2002a,b; Watanabe et al., 2003). The neuronal activity in the caudate nucleus impacts behavior via pathways through the substantia nigra pars reticulata (Sato and Hikosaka, 2002) and the superior colliculus (Ikeda and Hikosaka, 2003) and is strongly correlated with changes in the saccadic reaction time (Takikawa et al., 2002b; Itoh et al., 2003).
To test the hypothesis that the reward-dependent modulation of saccades depends on dopamine-induced changes in caudate neuronal activity, we blocked dopamine receptors by locally injecting a dopamine D1 or D2 antagonist into the oculomotor regions of the caudate. We found that local blockade of dopamine receptors in the caudate significantly changed the reward-dependent modulation of reaction times.
Materials and Methods
We used four hemispheres of two adult female rhesus monkeys (Macaca mulatta; laboratory designations, S and L). Both animals had been implanted with scleral search coils for measuring eye position and a post for holding the head. The recording chambers were placed over the fronto-parietal cortices. All procedures were approved by the Institute Animal Care and Use Committee and complied with Public Health Service Policy on the humane care and use of laboratory animals.
All aspects of the behavioral experiment, including presentation of stimuli, monitoring of eye movements, monitoring of neuronal activity, and delivery of reward, were under the control of a QNX-based real-time experimentation data acquisition system (REX). Eye position was monitored by means of a scleral search coil system with 1 ms resolution. Stimuli generated by an active matrix liquid crystal display projector (PJ550; ViewSonic, Walnut, CA) were rear-projected on a frontoparallel screen 25 cm from the monkey’s eyes. Drops of water or juice were delivered as reward through a spigot under control of a solenoid valve after successful completion of each trial.
The main task was a visually guided saccade task with position-biased rewards. The essential procedure of the task is summarized in Figure 1, A and B. Each trial began with presentation of a central fixation point (0.5°). After the animal maintained fixation on the spot for 1200 ms, the fixation point was turned off, and a target, a small white dot (1.2°), appeared either to the right or to the left, 20° from the fixation point, to which the animal was required to make a saccade. There was no time gap between the offset of the fixation point and target onset, except for five experiments (four midcaudate and one postcaudate injections of D1 antagonist) in which there was a 100 ms gap. The effects were similar except that reaction times were shorter in the gap-present condition than in the gap-absent condition (supplemental Tables 1-1, 1-2, available at www.jneurosci.org as supplemental material). After 100 ms of fixation at the target, a liquid reward was given. The intertrial interval was 3 s. The sequence of target positions was pseudorandom: every set of four trials contained two trials for each of the two locations, and the order within the set was determined randomly. In one block of the experiment, consisting of 20–28 trials (10–14 trials for each direction), reward was always larger (0.4 ml) for one target direction and smaller (0.05 or 0 ml) for the other target direction (Fig. 1A,B, left-large and right-small). Even for small-reward trials, the animal had to make a correct saccade; if the saccade was incorrect, the same trial was repeated until the saccade was made correctly. In the second block, the direction of reward bias was reversed (Fig. 1A,B, right-large and left-small). For completion of one “session,” these two kinds of blocks with opposite reward schedules were alternated two or three times so that a total of four to six blocks were performed (Fig. 1B).
Other than the reward itself, there was no external cue as to which direction was more rewarded or when the block changed. To examine whether saccadic eye movements changed after drug injection regardless of the reward condition, we used the same visually guided saccade task, but with equal reward for both targets. The amount of reward was typically the same as that in the large-reward trials in the biased-reward task. One session of the equal-reward saccade task consisted of 20 trials (10 trials for each direction).
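The pseudorandom constraint on target position described above can be illustrated with a minimal Python sketch. The function name `target_sequence`, the labels "L" and "R", and the requirement that the trial count be a multiple of four are assumptions made for illustration; the original task was run under REX, and this is not that implementation.

```python
import random

def target_sequence(n_trials, rng=None):
    """Return a list of 'L'/'R' target sides.

    Every consecutive group of four trials contains two trials per side,
    in random order (hypothetical helper, not the original task code).
    """
    if n_trials % 4 != 0:
        # Assumption: only whole groups of four are generated here.
        raise ValueError("n_trials must be a multiple of 4")
    rng = rng or random.Random()
    sequence = []
    for _ in range(n_trials // 4):
        group = ["L", "L", "R", "R"]  # two trials for each location
        rng.shuffle(group)            # random order within the group
        sequence.extend(group)
    return sequence

# Example: one 20-trial block (10 trials per direction)
seq = target_sequence(20, random.Random(0))
```

Because each group of four is balanced, neither side can appear more than four times in a row, which limits how predictable the next target position is.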
After implanting the recording chamber, we obtained magnetic resonance images of the brain. This allowed us to determine the position of the electrode or the injection cannula relative to the recording chamber. To determine injection sites, we first performed single-unit recording using the saccade task with position-biased rewards (Fig. 1A) in a wide area in the caudate to determine the region related to the task. The anteroposterior extent of the recording sites was from 8 mm anterior to 8 mm posterior to the anterior commissure, which corresponded to anterior 10–26 mm (monkey S) and 7–23 mm (monkey L) in Horsley-Clarke coordinates. Task-related activity in the caudate included visual, presaccadic, and pretarget anticipatory activity (Hikosaka et al., 2000). Neurons were determined task-related if activity of any of the following task periods were significantly modulated (Kruskal–Wallis test, p < 0.01): early, mid, and late fixation period (0–400, 400–800, 800–1200 ms after fixation onset), posttarget period (100 ms after target onset), presaccadic period (100 ms before saccade onset), postsaccadic period (200 ms after saccade onset), and reward period (300 ms after reward onset).
The positions of recordings and injections were determined using a grid system (a plastic cylinder that has a two-dimensional array of holes spaced 1 mm apart and is fitted to the recording chamber). First, a stainless-steel guide tube [outer diameter (o.d.), 0.6 mm; inner diameter (i.d.), 0.35 mm] was inserted through a grid hole and, after penetrating the dura, it was lowered until its tip reached 2–3 mm above the upper edge of the caudate. As an injection tube, we used a stainless-steel tube (o.d., 0.2 mm; i.d., 0.1 mm), which was connected to polyethylene tubing (o.d., 3 mm), which was in turn connected to a Hamilton syringe. The drug solution was pressure-injected in 0.2 μl steps every 30 s, 10 times, for a total of 2 μl. We used ((R)-(+)-7-chloro-8-hydroxy-3-methyl-1-phenyl-2,3,4,5-tetrahydro-1H-3-benzazepine) (SCH23390; 8 μg/μl) as a dopamine D1 antagonist and Eticlopride hydrochloride (6 μg/μl) as a dopamine D2 antagonist (Sigma, St. Louis, MO), both dissolved directly in saline. These doses were chosen based on previous reports (Watanabe and Kimura, 1998; Bari and Pierce, 2005). We also injected saline in separate experiments to ensure that the effects were not caused by the mechanical effect of liquid injection (data are in supplemental material, available at www.jneurosci.org).
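As a quick check of the injection arithmetic, the following Python sketch multiplies out the stated pulse volume, pulse count, and concentrations. The total drug mass per injection is not stated in the text; it is derived here from those numbers, and the variable names are illustrative.

```python
# Values taken from the text
PULSE_VOLUME_UL = 0.2   # volume per pressure pulse (microliters)
N_PULSES = 10           # number of pulses per injection
CONCENTRATION_UG_PER_UL = {"SCH23390": 8.0, "eticlopride": 6.0}

# Total injected volume: 0.2 ul x 10 pulses
total_volume_ul = PULSE_VOLUME_UL * N_PULSES

# Derived quantity (not stated in the text): drug mass per injection
total_mass_ug = {
    drug: conc * total_volume_ul
    for drug, conc in CONCENTRATION_UG_PER_UL.items()
}
```

Under these numbers, each injection delivers 2 μl, corresponding to 16 μg of SCH23390 or 12 μg of eticlopride hydrochloride.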
After inserting an injection tube, the animal performed one session of the biased-reward saccade task (at least four blocks or 80–112 trials) and one session of the equal-reward saccade task (at least 20 trials), and the data were used as preinjection control. Soon after the injection was completed (within 5 min), the animal started performing the same biased and equal-reward saccade tasks. One session for each task took 10–15 min to complete. After injection, these two kinds of saccade tasks were repeated for 45–60 min until the effect was no longer observed. We used the data that were obtained within 30 min after injection as the postinjection data.
We analyzed saccade parameters (reaction time, peak velocity, and amplitude) in correct trials. The reaction time was measured from the target onset to the saccade onset. Because there was no external cue that indicates switch between blocks, the very first trial of the block was likely to carry the behavioral context of the previous block. On the other hand, after extensive training, the monkey might predict the switch of the block around the end of the block. Therefore, we excluded the first and last trials in each block for statistical comparison between conditions. Incorrect trials were classified into no fixation (eye position was not on the fixation point for 1500 ms after its onset), fixation break (eye position was not maintained within 2° around the fixation point until target onset), wrong saccade (saccade did not reach the point within 5° around the target), and slow response (saccade did not start within 500 ms after target onset), and their frequencies were computed for each session.
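The trial classification and the exclusion of block-edge trials can be sketched in Python as follows. The function names, the argument layout, and the order in which the criteria are checked are assumptions for illustration; only the numerical thresholds come from the text.

```python
def classify_trial(fixated_in_time, held_fixation,
                   saccade_error_deg, saccade_latency_ms):
    """Label one trial using the thresholds stated in the text.

    fixated_in_time: eye reached the fixation point within 1500 ms of its onset
    held_fixation: eye stayed within 2 deg of fixation until target onset
    saccade_error_deg: distance between saccade end point and target
    saccade_latency_ms: time from target onset to saccade onset (None if no saccade)
    """
    if not fixated_in_time:
        return "no fixation"
    if not held_fixation:
        return "fixation break"
    if saccade_latency_ms is None or saccade_latency_ms > 500:
        return "slow response"
    if saccade_error_deg > 5:
        return "wrong saccade"
    return "correct"

def trim_block(block_trials):
    """Drop the first and last trial of a block before statistical comparison."""
    return block_trials[1:-1]
```

Whether the first and last trials were dropped per block or per direction within a block is not specified; this sketch trims the block as a whole.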
Statistical analyses of saccade parameters in correct trials were done at two levels: for within a session (see Fig. 2A,B) and for the population of sessions (see Figs. 2C,D, 3). For within-session analyses, we first grouped saccades into four kinds: two directions (contralateral and ipsilateral to the injection side) and two reward conditions (large and small rewards). For each group of saccades, we compared the reaction times in the preinjection session and those in the postinjection sessions using Mann–Whitney U test. To quantify the separation of the reaction times in large- and small-reward trials for each direction of saccades for each session, we calculated the reaction time bias as follows: the mean reaction time in small-reward trials − the mean reaction time in large-reward trials. To quantify the variability of reaction time, we calculated coefficient of variation (SD/mean) of reaction times for each condition for each session. For population analyses, we compared each of the above variables between the preinjection and postinjection sessions using Wilcoxon signed rank test, separately for the drugs (D1 and D2 antagonists) and saccade directions (contralateral and ipsilateral to the injection site).
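The two per-session summary measures defined above, the reaction time bias and the coefficient of variation, can be written as a short Python sketch. The function names and the example reaction times are hypothetical, and the use of the sample SD (`statistics.stdev`) rather than the population SD is an assumption, because the text does not say which was used.

```python
from statistics import mean, stdev

def reaction_time_bias(rt_small, rt_large):
    """Mean RT on small-reward trials minus mean RT on large-reward trials (ms)."""
    return mean(rt_small) - mean(rt_large)

def coefficient_of_variation(rts):
    """SD / mean of reaction times (sample SD is an assumption)."""
    return stdev(rts) / mean(rts)

# Hypothetical reaction times (ms) for one saccade direction in one session
rt_large = [100, 110, 105]
rt_small = [160, 170, 180]

bias = reaction_time_bias(rt_small, rt_large)
cv_large = coefficient_of_variation(rt_large)
```

A positive bias means faster saccades on large-reward trials. With the means reported later in the text for the example session in Figure 2A (170.8 ms and 106.4 ms), the same subtraction gives 64.4 ms, in line with the stated bias of 64 ms.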
In addition to the population analyses for the entire set of experiments for each drug, we analyzed the data obtained from anterior and posterior injection sites separately, because the effects of both drugs were different between the anterior and posterior regions (see Figs. 2C, 5C). To determine the boundary between these regions, we divided the injection sites into the anterior and posterior parts while moving the dividing line in 1 mm steps from the anterior end until a significant difference was found in the reaction time biases obtained from the anterior and posterior parts (p < 0.05, Mann–Whitney U test). We then termed the anterior and posterior parts the midcaudate and the postcaudate, respectively. The anterior part was termed the midcaudate because it was still a middle part of the caudate nucleus anatomically. Additional analyses were performed separately for each region to better characterize the effects (see Figs. 2D, 3, 5D, 6; for midcaudate, see supplemental Figs. 3, 4).
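The boundary search described above can be sketched in Python as a loop over candidate dividing lines. This is a sketch under stated assumptions: the function name `find_boundary`, the placement of the line halfway between grid positions, the toy data, and the stand-in p-value function are all hypothetical. In practice the p-value function would wrap a Mann–Whitney U test, for example `scipy.stats.mannwhitneyu`.

```python
def find_boundary(sites, p_value_fn, alpha=0.05, step_mm=1.0):
    """Move a dividing line posteriorly in fixed steps from the anterior end.

    sites: list of (distance_posterior_to_AC_mm, bias_measure) pairs
    p_value_fn: callable(anterior_values, posterior_values) -> p value
    Returns the first line position with p < alpha, or None if none is found.
    """
    positions = sorted({d for d, _ in sites})
    # Assumption: the line sits halfway between neighboring grid positions.
    line = positions[0] + step_mm / 2
    while line < positions[-1]:
        anterior = [v for d, v in sites if d < line]
        posterior = [v for d, v in sites if d > line]
        if anterior and posterior and p_value_fn(anterior, posterior) < alpha:
            return line
        line += step_mm
    return None

def toy_p_value(a, b):
    """Toy stand-in for a real test: 'significant' only if groups do not overlap."""
    return 0.01 if min(a) > max(b) else 0.5

# Hypothetical data: (mm posterior to the anterior commissure, bias change in ms)
toy_sites = [(1, 40), (2, 35), (3, 42), (4, 38), (5, 2), (6, -1)]
boundary = find_boundary(toy_sites, toy_p_value)
```

Because the search stops at the first significant split, the result depends on the scan direction, and the repeated tests are not corrected for multiple comparisons; the sketch mirrors that behavior rather than addressing it.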
To study the time course of the effect within a session by drug injections, we first plotted the normalized reaction time against the trial number (see Fig. 4A). Normalization of the reaction times was necessary because both monkeys had idiosyncratically longer reaction times for one direction of saccades, and overall reaction times were different between monkeys. To normalize reaction times for each experiment, we first pooled reaction times for all saccades in both preinjection and postinjection sessions for each direction of saccades and obtained the mean and SD. We then computed z-value for a given trial as follows: (given reaction time − mean reaction time)/SD. The data were then averaged across experiments. As shown in the Figures 4A and 7A, the value at each bin indicates the grand average of the normalized reaction times in the nth trial before and after the transitions. To quantify how quickly saccade reaction times adapted to the change in the reward condition, we first computed the reaction time bias separately for the early period (the first 10 trials except for the first trial) and the late period (the last 10 trials except for the last trial) and made a pair-wise comparison between the values before and after drug injections (see Fig. 4B). To further examine in which of the large or small-reward condition the adaptation process was affected by the drug injection, we computed the changes in reaction times between the early and late periods in each of the large and small-reward conditions and made a pair-wise comparison between the values before and after drug injections (see Fig. 4C).
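The normalization and the early/late split can be illustrated with a minimal Python sketch. The function names and the example values are hypothetical, and the use of the sample SD is an assumption. The slicing follows the definitions in the text: the early period is the first 10 trials minus the first trial, and the late period is the last 10 trials minus the last trial.

```python
from statistics import mean, stdev

def z_normalize(pooled_rts):
    """z-score RTs pooled over pre- and postinjection sessions for one direction."""
    m, sd = mean(pooled_rts), stdev(pooled_rts)  # sample SD is an assumption
    return [(rt - m) / sd for rt in pooled_rts]

def early_late(block_rts, n=10):
    """Split one block into early and late periods.

    early: first n trials, excluding the first trial
    late: last n trials, excluding the last trial
    """
    early = block_rts[1:n]
    late = block_rts[-n:-1]
    return early, late

# Hypothetical example
z = z_normalize([100, 200, 300])
early, late = early_late(list(range(1, 15)))  # trial numbers 1..14
```

For blocks close to the minimum length the two periods overlap substantially, so the early versus late comparison contrasts the start and end of a block rather than disjoint sets of trials.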
Before injection of dopamine antagonists, monkeys’ saccadic reaction times changed reliably and flexibly depending on the position-reward mapping, as reported previously (Lauwereyns et al., 2002b). In a typical example shown in Figure 2A, the reaction times were consistently shorter in the large-reward trials (median, 104.5 ms; mean, 106.4 ms; SE, 8.5 ms) than in small-reward trials (median, 172 ms; mean, 170.8 ms; SE, 16.0 ms; p < 0.0001, Mann–Whitney U test). When the direction of the reward bias was switched (i.e., block was changed), the reaction times changed within two or three trials. [Because there was no explicit cue to indicate when the reward schedule (i.e., block) would be changed, there was no significant change in reaction time on the very first trial of each block. The seemingly abrupt changes in reaction time across blocks, which are observed in Figures 2, A and B, and 5, A and B, actually occurred on the second or later trials in a block after some trials that induced saccades in the other direction (data not shown) (Watanabe and Hikosaka, 2005).] After the change, the reaction times remained stable during the block. As a measure of the ability of the animal to adjust the reaction times based on expected reward, we computed the reaction time bias, the mean reaction time in small-reward trials minus the mean reaction time in large-reward trials. In the example shown in Figure 2A, the reaction time bias was 64 ms.
To test the effect of local blockade of dopamine receptors in the caudate, we injected dopamine D1 antagonist (SCH23390) (n = 21), dopamine D2 antagonist (Eticlopride) (n = 23), or saline (n = 13) unilaterally. Before the drug injection experiments, we performed single-unit recording using the saccade task with position-biased rewards (Fig. 1A) in a wide area in the caudate. As reported previously, many caudate neurons were related to the saccade task (see Materials and Methods), and their activities were strongly modulated by the reward condition (i.e., large or small reward) (Kawagoe et al., 1998; Watanabe et al., 2003) or by the rewarded position (Lauwereyns et al., 2002b; Takikawa et al., 2002a). Because task-related neurons were found mainly in the caudate posterior to the anterior commissure, in agreement with a previous report (Hikosaka et al., 1989), we performed injections mainly in the part of caudate posterior to the anterior commissure.
We found that the adaptive modulation of saccades by reward was affected differently depending on the kinds of dopamine antagonists. In the following text, we will concentrate on the changes in the saccadic reaction time because it showed consistent and significant changes across experiments (see supplemental Tables 1–6, available at www.jneurosci.org as supplemental material, for other saccadic parameters). Incorrect trials were rare (typically <5%), and the frequency of any type of errors (no fixation, fixation break, wrong saccade, and slow response) showed no significant changes by any of the drug injections. These incorrect trials were thus excluded from further analyses.
Effects of dopamine D1 antagonist injection
The reward effect on the reaction time that had been seen during preinjection control (Fig. 2A) became smaller after injection of the D1 antagonist SCH23390 (Fig. 2B). In this example, the D1 antagonist was injected in the right caudate, 2 mm posterior to the anterior commissure (Fig. 1C, arrow). The saccade performance was recorded starting ∼3 min after the injection was completed. The reaction times of leftward saccades (contralateral to the injection site) were no longer reliably modulated by the reward condition (median, 102.0 ms; mean, 111.0 ms; SE, 32.0 ms in the large-reward trials and median, 125.0 ms; mean, 133.1 ms; SE, 43.1 ms in small-reward trials; p = 0.06, Mann–Whitney U test). The reaction time bias (22 ms) was smaller than the value in the preinjection control (Fig. 2A).
We made 21 (17 in monkey S, four in monkey L) injections of the D1 antagonist and obtained similar results. In Figure 2C are shown, for individual experiments, the reaction time biases before (abscissa) and after (ordinate) the drug injections. Similar to the example shown in Figure 2A, the reaction time bias decreased after the injections in many experiments (data points below the equity line), but a statistical analysis did not indicate a significant change (Wilcoxon signed rank test, p = 0.08). We found, however, that the drug effect was stronger in more anterior injections. Correlation between the distance from the anterior commissure and the degree of attenuation in the reaction time bias was significant (Spearman rank correlation, p = 0.004; ρ = 0.65; n = 21). To determine the regional boundary, we divided the injection sites into the anterior and posterior parts while moving the dividing line from the anterior end. A significant difference in the reaction time bias between anterior and posterior injections was found between 4 and 5 mm from the anterior commissure (p < 0.03, Mann–Whitney U test; see Materials and Methods). We therefore designated the area 1–4 mm from the anterior commissure as midcaudate and the area 5–6 mm from the anterior commissure as postcaudate. The reaction time bias became significantly smaller after D1 antagonist injections in the midcaudate (total, 12 injections; nine in monkey S; three in monkey L; p = 0.003) (Fig. 2C, filled symbols, D) but not in the postcaudate (total, nine injections; seven in monkey S; two in monkey L; p = 0.77) (Fig. 2C, open symbols). In the following, we focus on the effects of D1 antagonist in the midcaudate.
The attenuation of the reaction time bias was mainly caused by increases in reaction time on large-reward trials (Fig. 3A,B). In Figure 3A, we compare the distributions of reaction times on all trials in the gap-absent condition (no time gap between the fixation offset and target onset) before (top) and after (bottom) D1 antagonist injections in the midcaudate. The reaction times became significantly longer on large-reward trials (p = 0.0001) (Fig. 3A, left) but showed no significant change on small-reward trials (p = 0.06) (Fig. 3A, right). In the gap-present condition, the reaction times increased on large-reward trials (p = 0.0001; n = 66 for preinjection control; n = 61 for injections) and decreased on small-reward trials (p = 0.0005 for small-reward trials; n = 64 for preinjection control; n = 71 for injections). Figure 3B shows the results of the analysis of single experiments (i.e., comparison between preinjection and postinjection sessions). The reaction times became significantly longer in 6 of 12 experiments on the large-reward trials (Fig. 3B, left, filled circles); in contrast, the reaction times became shorter in only one experiment under small-reward conditions (Fig. 3B, right, filled circles). A pair-wise population comparison for individual experiments indicates that the mean reaction times on large-reward trials were significantly prolonged (p = 0.004, pair-wise comparison for individual experiments), whereas the mean reaction times on small-reward trials did not change significantly (p = 0.93); similar results were obtained for the median reaction times (p = 0.016 for large reward; p = 0.13 for small-reward trials) (supplementary Fig. 7A, available at www.jneurosci.org as supplemental material). We also found that the variability of reaction times increased, as shown by an increase in coefficient of variation (Fig. 3C). Similar effects were observed for saccades ipsilateral to the injection site (supplementary Fig. 1, available at www.jneurosci.org as supplemental material).
Injection of D1 antagonist into the midcaudate also affected the time course of the reward effect on the reaction time within a block. Figure 4A shows the changes in the normalized reaction time (see Materials and Methods) averaged across all sessions, aligned at the times when the reward condition was switched from the large to small reward and from the small to large reward. During the preinjection control, the reaction time changed quickly in both types of switch (as exemplified in Fig. 2A). This pattern changed after the D1 antagonist injection; the reaction time appeared to change more slowly after the reward condition was switched. To quantify this effect, we divided each block of trials into the early period (the first 10 trials, except for the first trial) and the late period (the last 10 trials, except for the last trial) and examined, for each period, the changes in the reaction time bias by the D1 antagonist. As shown in Figure 4B, these values were attenuated in the early period, rather than in the late period. In other words, although the monkey was still able to differentiate between the large and small-reward conditions, the adaptation to the new reward condition was slowed by the D1 antagonist. To examine whether the adaptation was slowed in the large or small-reward condition, or both, we analyzed the changes in the reaction times from the early period to the late period in each of the large and small-reward conditions, as shown in Figure 4C. The changes became significantly larger after D1 antagonist injections in the small-reward condition (small/early to small/late) (p = 0.03) but not in the large-reward condition (large/early to large/late) (p = 0.75). The results indicate that the adaptation was slowed under the small-reward condition.
Effects of dopamine D2 antagonist injection
Injection of a D2 antagonist, Eticlopride, also affected saccades, but in a different manner. Figure 5, A and B, exemplifies the changes in the reaction time of rightward saccades before (Fig. 5A) and after (Fig. 5B) the injection of the D2 antagonist in the left midcaudate, 4 mm posterior to the anterior commissure (Fig. 1C). The reaction time bias increased from 90 to 127 ms.
We made 23 (18 in monkey S, five in monkey L) injections of the D2 antagonist, and the reaction time biases before and after the injections are shown for individual experiments (Fig. 5C). Similar to the example shown in Figure 5A, the reaction time bias increased after the injections in many experiments (data points above the equity line), but a statistical analysis did not indicate a significant change (Wilcoxon signed rank test, p = 0.06). Unlike D1 antagonist injections, the relationship between the distance from the anterior commissure and the change in the reaction time bias was not monotonic (Spearman rank correlation, p = 0.1; ρ = 0.34; n = 23). Nonetheless, similar to the D1 antagonist injections, we found a regional difference between anterior and posterior injections at the level between 4 and 5 mm from the anterior commissure (p < 0.05, Mann–Whitney U test). The reaction time bias became significantly larger after D2 antagonist injections in the midcaudate (total, 13 injections; 10 in monkey S; three in monkey L; p = 0.006) (Fig. 5C, filled symbols, D) but not in the postcaudate (total, 10 injections; eight in monkey S; two in monkey L; p = 0.88) (Fig. 5C, open symbols). In the following, we focus on the effects of D2 antagonist in the midcaudate.
The enhancement of the reaction time bias was mainly caused by increases in reaction times on small-reward trials (Fig. 6A,B). As shown in Figure 6A, the reaction times became significantly longer on small-reward trials (p = 0.03) (Fig. 6A, right) but showed no significant change on large-reward trials (p = 0.73) (Fig. 6A, left). The results based on individual experiments (Fig. 6B) indicate that the reaction times became significantly longer in 5 of 13 experiments on the small-reward trials (Fig. 6B, right, filled circles); in contrast, the reaction time became longer in only one experiment on the large-reward trials (Fig. 6B, left, filled circles). A pair-wise population comparison for individual experiments indicates that the mean reaction times on small-reward trials were significantly prolonged (p = 0.007), whereas the mean reaction times on large-reward trials did not change significantly (p = 0.24); similar results were obtained for the median reaction times (supplementary Fig. 7B, available at www.jneurosci.org as supplemental material). The variability of reaction times was not significantly changed (Fig. 6C). Unlike the D1 antagonist, the significant effects of the D2 antagonist were observed for contralateral saccades but not for ipsilateral saccades (supplementary Fig. 2, available at www.jneurosci.org as supplemental material).
The D2 antagonist changed the time course of the reward effect on the reaction times differently from the D1 antagonist. As shown in Figure 7A, after the switches of the reward condition, the reaction time changed as quickly as in the preinjection control. The prolongation of reaction times on small-reward trials became clear in the late block period. As a result, the increase in the reaction time bias was statistically significant in the late period but not in the early period (Fig. 7B). Unlike the D1 antagonist, D2 antagonist injections did not significantly affect the speed of adaptation of the reaction times, as shown in Figure 7C (p = 0.15, small/early to small/late; p = 0.70, large/early to large/late).
In sum, the D2 antagonist caused gradual prolongation of reaction times of contralateral saccades on small-reward trials, which led to an enhancement of the reaction time bias.
Effects of dopamine D1 and D2 antagonist injections into the posterior caudate
We found the effects of both the D1 and D2 antagonists were weaker when they were injected in the posterior part of the caudate, which was >5 mm posterior to the anterior commissure (Fig. 1C). We made nine injections of the D1 antagonist in the posterior caudate (seven in monkey S, two in monkey L). As a population, there were no significant changes in the reaction time bias (supplementary Fig. 3A,D, available at www.jneurosci.org as supplemental material). The only significant effects were an increase in reaction times on large-reward trials (supplementary Fig. 3B, available at www.jneurosci.org as supplemental material) and an increase in the variability of reaction times on small-reward trials (supplementary Fig. 3C, available at www.jneurosci.org as supplemental material), both for contralateral saccades.
We also made 10 injections of the D2 antagonist in the postcaudate (eight in monkey S, two in monkey L). The reaction time bias showed no change (supplementary Fig. 4A,D, available at www.jneurosci.org as supplemental material). The significant increases in reaction times on large-reward trials for both contralateral and ipsilateral saccades (supplementary Fig. 4B,E, available at www.jneurosci.org as supplemental material) had no major impact on the reaction time bias.
Control experiments: equal-reward task and saline injections
Previous studies have shown that caudate activity is more prevalent in the biased-reward task than in an equal-reward task during which an equal amount of reward is given for both directions (Takikawa et al., 2002a). This suggests that the effect of dopamine blockade on behavior should be dependent on reward context. To determine whether the effects observed above are specific to the biased-reward task, we examined behavior during performance of the equal-reward saccade task.
Significant changes in the reaction times were observed only after D1 antagonist injections in the midcaudate, for contralateral saccades (mean, 206.4 to 216.4 ms; p = 0.02). Changes in ipsilateral saccades did not reach significance (240.4 to 250.7 ms; p = 0.13). D2 antagonist injections in the midcaudate produced no significant change (223.5 to 229.1 ms, p = 0.09 for contralateral saccades; 204.8 to 197.7 ms, p = 0.05 for ipsilateral saccades). Injections of either drug into the postcaudate produced no significant changes. Increases in the coefficient of variation for both directions of saccades were observed only for D2 antagonist injections in the midcaudate (contralateral, p = 0.04; ipsilateral, p = 0.02).
To exclude possible volume effects of the drug injections, we also performed saline injections as separate experiments (n = 9 for midcaudate; n = 4 for postcaudate) (supplementary Figs. 5 and 6, available at www.jneurosci.org as supplemental material). There were no consistent changes.
Our findings that local blockade of dopamine receptors in the caudate induced changes in reward-dependent reaction time support two important conclusions. First, the caudate is a source of reward-dependent modulation of a particular motor behavior, namely saccadic eye movement. Second, dopamine D1 and D2 receptors are involved in the reward-dependent modulation of saccades, but in different manners.
The effects of dopamine antagonist injections on the performance of the biased-reward saccade task, summarized in Figure 8, cannot be explained simply by a general change in motor preparation, arousal, or motivational level. First, the frequency of fixation breaks, a measure of general motivation (Roesch and Olson, 2004), was not changed significantly. Second, and more importantly, the effects were task dependent. With D1 antagonist injections, the prolongation of reaction times on large-reward trials in the biased-reward task is consistent with the effect observed in the equal-reward task, but the reaction times on small-reward trials were not increased. Although the effect was bilateral in the biased-reward task, the effect in the equal-reward task was significant only for contralateral saccades. Increases in reaction time variability were evident in the biased-reward task, but not in the equal-reward task. D2 antagonist injections did not cause significant prolongation of the reaction times in the equal-reward task, but they did so for contralateral saccades on the small-reward trials in the biased-reward task. These task-dependent effects of dopamine antagonists are consistent with the task-dependent activity of caudate neurons that is sensitive to the difference in expected reward value between targets (Takikawa et al., 2002a; Cromwell et al., 2005).
Dopamine D1-mediated effects on reward-modulation of saccades
The effects of D1 antagonist injection in the biased-reward task were characterized by attenuation of the reward-dependent reaction time bias, loss of the stability of reaction times, and slower adaptation to the reversal of position–reward contingency. The attenuation of the reaction time bias was caused by the prolongation of reaction times on large-reward trials and unchanged reaction times on small-reward trials. At the neural network level, such reward–condition-specific effects of D1 antagonist may be explained by the following observations. First, in the biased-reward saccade task, dopamine neurons fire phasically when the saccade target indicates a large reward (Kawagoe et al., 2004), which should cause a phasic increase in the dopamine level in the caudate (Cragg et al., 2002). Second, in the striatum, D1 receptors are preferentially expressed by neurons that belong to the direct pathway (i.e., projecting to the substantia nigra directly) (Gerfen et al., 1990; Surmeier et al., 1996). Third, in anesthetized animals, dopamine increases the excitability of caudate neurons, and this effect is reduced by D1 antagonist (Gonon, 1997; West and Grace, 2002). These observations suggest that D1 antagonist injection into the caudate would attenuate the responses of nigra-projecting caudate neurons to a target indicating a large reward, leading to a weaker disinhibition of neurons in the superior colliculus (Hikosaka et al., 2000) and consequently to the prolongation of saccade reaction times on large-reward trials.
The above interpretation assumes that dopamine neurons act quickly on caudate neurons to change saccade reaction time on a single trial. However, this may not be realistic, considering the metabotropic nature of dopamine actions. An alternative mechanism may be dopamine-dependent plasticity in cortico-striatal synapses (Calabresi et al., 1996; Reynolds and Wickens, 2002; Lovinger et al., 2003; Mahon et al., 2004). A conjunction of presynaptic activity in cortico-striatal inputs and postsynaptic activity in caudate neurons leads to long-term potentiation only if a large phasic increase in D1 receptor activation occurs simultaneously (Reynolds and Wickens, 2002). In our paradigm, if a particular target is repeatedly associated with a large reward, which would cause dopamine neuron activation (Kawagoe et al., 2004), the cortico-striatal synapses carrying the target signal should undergo long-term potentiation, so that caudate neurons respond to the target progressively more strongly, leading to shorter saccade reaction times. D1 antagonist should suppress such changes, consistent with the longer saccade reaction times that we observed on large-reward trials. The result is also consistent with previous studies showing that D1 receptor blockade or knock-out disrupts the acquisition of conditioned approach (Eyny and Horvitz, 2003; Tran et al., 2005).
We also found that D1 blockade impeded the adaptation of saccade reaction times to the reversal of position–reward contingency. This can be interpreted as slower changes in the efficacy of the cortico-striatal synapses. This phenomenon may be related to the deficits in switching or set-shifting reported in parkinsonian patients (Hayes et al., 1998; Gauntlett-Gilbert et al., 1999; Cools et al., 2001) and in animals with striatal dopamine depletions (Oades, 1985; van den Bos and Cools, 2003; Goto and Grace, 2005).
Another interesting effect of the D1 blockade was the increase in the variability of reaction times. The effect is consistent with the finding that manipulation of the dopamine system alters the variability and sequential pattern of generic behavior in rodents (Paulus et al., 1993).
Dopamine D2-mediated effects on reward-modulation of saccades
The effects of the D2 blockade were different from the effects of the D1 blockade. The reaction time bias was even enhanced as a result of the prolongation of reaction times on small-reward trials. D2 receptors are preferentially expressed by caudate neurons that belong to the indirect pathway (Gerfen et al., 1990; Surmeier et al., 1996), the action of which would lead to an enhanced inhibition of the superior colliculus (Hikosaka et al., 2000). Because the D2-mediated effect on caudate neurons is inhibitory (West and Grace, 2002), dopamine would exert facilitatory effects on superior colliculus neurons via the indirect pathway, similar to its effect on the direct pathway. This reasoning predicts similar effects of D1 and D2 blockade, yet their effects on the reaction time bias were opposite. The apparent discrepancy may be resolved by a common effect of D1 and D2 antagonists: prolongation of saccade reaction times. The prolongation of saccade reaction times occurred on large-reward trials for D1 blockade and on small-reward trials for D2 blockade. It is possible that the D2-mediated inhibitory effect on caudate neurons is necessary to maintain minimal facilitatory effects on the superior colliculus so that saccades can be generated even on small-reward trials. D2-receptor blockade would reduce the facilitatory effects on saccades on small-reward trials when the dopamine level decreases, leading to prolonged reaction times on small-reward trials.
Note that the schemes described above may be oversimplified, given the recent evidence indicating colocalization of D1 and D2 receptors in single striatal projection neurons (Surmeier et al., 1996; Aizman et al., 2000). It is also possible that D2 antagonist increased activation of the D1-mediated effect through blockade of D2-mediated autoreceptor inhibition of dopamine release (Carter and Muller, 1991). Additional studies should address the coordinated D1 and D2 functions in the control of voluntary behavior.
The laterality of the D1- and D2-mediated effects is difficult to interpret: D1 blockade effects were bilateral, whereas D2 blockade effects were contralateral. Contralateral effects are easy to understand because most of the connections from the caudate to the superior colliculus through the substantia nigra pars reticulata are ipsilateral (Tulloch et al., 1978) and the superior colliculus controls saccades to the contralateral hemifield (Robinson, 1972). A known exception is a crossed connection from the substantia nigra pars reticulata to the superior colliculus (Beckstead et al., 1981; Jiang et al., 2003). However, there is no evidence, to our knowledge, that the crossed connection preferentially carries the information of the D1 receptor (i.e., direct) pathway. On the input side, there is a hint of a difference in laterality: caudate neurons in the direct and indirect pathways tend to receive bilateral and ipsilateral cortical projections, respectively (Lei et al., 2004). It is possible that the outputs of these caudate neurons are organized in the same manner.
Role of dopamine in reward-related learning and motor control
Behavioral-pharmacological studies have shown an essential role of dopamine in reward- or drug-seeking behavior and related neuronal activity, providing detailed mechanisms dependent on brain location and receptor type (Ikemoto and Panksepp, 1999; van den Bos and Cools, 2003; Wise, 2004; Yun et al., 2004; Goto and Grace, 2005; Tran et al., 2005). Contrasting D1- and D2-mediated effects on reward-related learning and learned behaviors have also been reported (Beninger and Miller, 1998; Eyny and Horvitz, 2003). A majority of these studies have been done either with systemic injection of drugs or in relation to the function of the ventral striatum. Recently, however, the dorsal striatum has become another focus of research on reward- or drug-seeking behavior, which is also heavily dependent on dopamine. Consistent with our findings, it is now considered that dopaminergic input to the dorsal striatum provides a reinforcing signal that effectively stamps in stimulus–response associations (Packard and Knowlton, 2002; Wise, 2004). The dopamine level in the dorsal striatum increases markedly in response to a drug-associated cue (Ito et al., 2002), and a dopamine antagonist attenuates such cue-controlled reward- or drug-seeking behavior (Vanderschuren et al., 2005).
Despite the rich literature partially listed above, the precise circuits and functions affected by dopamine receptor activation that alter the behavioral outcome remain unclear. Many studies have indicated that patients with Parkinson’s disease and other dopamine deficiencies show impairments in saccade initiation (for review, see Hikosaka et al., 2000). That dopamine release in the caudate is essential for the control of saccades has been shown by unilateral infusion of MPTP (1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine) in the monkey caudate (Kato et al., 1995; Kori et al., 1995; Miyashita et al., 1995).
These experiments, however, did not indicate how dopamine is used to control saccadic eye movements. The results presented in this article provide evidence that dopaminergic modulation of caudate neuronal activity contributes, at least partly, to changes in saccadic eye movement by expected reward.
This work was supported by the intramural research program of the National Eye Institute.
Correspondence should be addressed to Kae Nakamura, Laboratory of Sensorimotor Research, National Eye Institute, National Institutes of Health, Building 49, Room 2A50, 49 Convent Drive, Bethesda, MD 20892-4435.