The midbrain dopamine system has been ascribed roles in reward expectancy, error detection, prediction, and memory. However, these theories typically do not differentiate between dopamine response and action in different forebrain terminal fields. We measured dopamine release in the prefrontal cortex (PFC), nucleus accumbens (NAc), and dorsal striatum (DS) of rats exposed to the same maze apparatus under three behavioral conditions: a set-shift task in which reward depended on discrimination learning and extradimensional set-shifting, a yoked condition in which reward was intermittent and not under the control of the subject, and a “reward-retrieval” variant in which reward was certain on every trial. We found dissociable patterns of dopamine release associated with learning, uncertainty, and reward. Dopamine increased in all three regions when reward was contingent on rule learning and shifting or was uncertain. These increases were sustained after behavior. There was a significant correlation between the magnitude of increase in PFC dopamine and the rapidity with which rats shifted between discrimination rules. In the yoke condition, in which the receipt of reward was always uncertain, the opposite relationship between dopamine levels and likelihood of reward was observed. Predictable, noncontingent reward was associated with increased dopamine levels in the NAc and DS. In contrast, PFC dopamine did not increase significantly above baseline levels. Thus, the dopaminergic projections to the PFC and nucleus accumbens were selectively, yet differentially, activated in situations of uncertainty and cognitive demand, whereas the dopaminergic projection to the DS responded independently of task differences in learning and reward.
Current influential theories regarding the behavioral function(s) of dopamine posit that dopamine signals universally encode the value and probability of reward (Montague and Berns, 2002; Schultz, 2002; Tobler et al., 2005). Other roles ascribed dopamine include the support or modulation of reward salience (Berridge and Robinson, 1998), goal-directed activity (Salamone and Correa, 2002), working memory (Williams and Goldman-Rakic, 1995), and memory consolidation (Wise, 2004). Further complicating our understanding of the behavioral roles of dopamine is the likelihood that it modulates different aspects of goal-directed behavior in different forebrain terminal fields (Salamone and Correa, 2002; Seamans and Yang, 2004; Wise, 2004). Thus, although the modulatory actions of dopamine within the mammalian forebrain are of undisputed importance in generating and maintaining effective behaviors, its precise role(s) remains obscure.
To better understand region-specific relationships between dopamine levels and cognition, we exposed rats to one of three different reward conditions in the same maze apparatus while simultaneously measuring dopamine levels in three forebrain regions. The set-shift condition required cognitive set-shifting (Stefani et al., 2003; Stefani and Moghaddam, 2005a,b), wherein reward depended on the acquisition of a discrimination rule, followed by an extradimensional shift from use of this rule to another. The yoke condition rewarded rats according to a schedule determined by the performance of set-shift rats. The third condition, reward retrieval, was cognitively less demanding, requiring only that a rat go to the end of either available arm each trial to receive reward. Thus, the three conditions differed in the predictability of reward and whether such predictability was under the control of the subject.
Dialysis probes were implanted in the medial prefrontal cortex (mPFC), dorsal striatum (DS), and nucleus accumbens (NAc). These regions have been implicated in aspects of learning and motivated behavior and receive midbrain dopaminergic projections (Lindvall and Bjorklund, 1978; Beckstead et al., 1979; Domesick, 1988). The rat mPFC is required for set-shifting behavior (Ragozzino et al., 1999; Birrell and Brown, 2000; Stefani and Moghaddam, 2005b). The NAc is involved in appetitively motivated instrumental learning and responding (Mogenson et al., 1980; Smith-Roe and Kelley, 2000). Dorsal striatal function is associated with stimulus–response learning, response selection, and aspects of set-shifting ability (Brown and Robbins, 1989; Aosaki et al., 1995; Packard and Knowlton, 2002; Ragozzino et al., 2002).
We found dissociable patterns of dopamine release associated with discrimination learning, uncertainty, and reward. When reward availability was contingent on rule learning and shifting or was uncertain, dopamine increased in all regions. When rule learning was possible, the increase in PFC dopamine correlated with the rapidity of rule acquisition and shifting. Uncertain and uncontrollable reward was associated with the inverse relationship between dopamine level and likelihood of reward. Predictable, noncontingent reward was associated with increased dopamine levels in the nucleus accumbens and dorsal striatum only. Thus, dopaminergic projections to the PFC and NAc were differentially activated in situations of uncertainty and cognitive demand, whereas the dopaminergic projection to the dorsal striatum responded independently of task-dependent learning.
Materials and Methods
Male Sprague Dawley rats (280–310 g on arrival; Harlan, Frederick, MD) were housed two or three to a cage and maintained on a 12 h light/dark cycle (lights on at 7:00 A.M.). The rats had ad libitum access to food for 2 weeks after arrival, after which they were placed on a restricted diet of 15 g of rat chow per day per rat. The rats had ad libitum access to water for the duration of the experiment. Animal care and experimental procedures were in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals.
The plus maze was constructed of Plexiglas (0.63 cm thick) and consisted of a square central platform, 14 cm on a side to which were joined four arms. The arms joined the central platform in such a way that no space existed between adjacent arms. The arms were 14.0 cm wide, 40.6 cm long, and 20.3 cm high. A food well 1.9 cm in diameter and 0.63 cm deep was located ∼2.5 cm from the distal end of each arm. The food well was sufficiently deep to hide the food pellet from the view of the rat from the arm entrance. Maze arms varied along two stimulus dimensions, brightness and texture. Arms were painted to be either dark (gray; catalog #335A-4; Valspar, Minneapolis, MN) or light (off-white; catalog #47583; Valspar) and were textured smooth (paint alone) or rough (paint mixed with sand). The center square and door of the maze were painted with gray primer and were distinct in contrast from the maze arms. The maze was placed on a locking rotary platform of our design that permitted the maze to be rotated between trials. The holding cage used for intertrial intervals (ITIs) was constructed of gray-painted Plexiglas and measured 35.6 × 35.6 × 35.6 cm.
Adjacent to the maze was a table on which were placed the holding cage, an infusion pump (Harvard Apparatus, South Natick, MA) and a laboratory stand with an adjustable height crossbar to which was attached a two-channel liquid swivel (model 357/D/QE; Instech, Plymouth Meeting, PA).
One week after arrival, a handling and habituation period ∼2 weeks in duration was begun (a week comprised 5 d). During the first week, each rat was handled by the experimenter for 3 min/d. Food reward pellets (Dustless Precision Pellets, purified formula, 45 mg; Bio-Serv, Frenchtown, NJ) were given to the rats in their home cages each day after handling to familiarize them with the taste and odor. Food restriction was begun on the second day of handling.
After the handling phase, the rats began a period of maze habituation lasting ∼1 week. During the first phase of maze habituation, all four arms of the maze were open for exploration. On the first day of habituation, cage mates were placed in the maze together and allowed to roam freely for 5 min. Four food reward pellets were placed in each food well to condition the rats to receiving the reward pellets in the maze. On the second and third days of habituation, the rats were placed in the maze individually, and the maze was baited with a single reward pellet per arm. Rats were allowed to explore until all of the food was eaten or for a maximum of 5 min. Immediately after the habituation period each day, the rats were placed in the holding cage for 1 min before being returned to their home cages. Rats that did not quickly consume the food pellets were habituated for additional sessions.
The rats next received 1–3 d of habituation to the maze in its test, “T” configuration (Stefani and Moghaddam, 2005b). For each trial, the arm directly across from the start arm was blocked with a removable opaque Plexiglas door. The rat was placed at the end of the start arm (the stem of the T) and allowed to explore one of the two “choice” arms and to consume any food reward there. Each rat received eight trials per day, two starts from each arm. The order of start arms was pseudorandom, and a food reward was not always present at the end of a choice arm. Between trials, the rats were held in the holding cage. The ITI was ∼15 s.
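The balanced start-arm schedule described above (eight trials, two starts from each of the four arms, in pseudorandom order) can be sketched as follows. This is a minimal illustration only: the arm labels, the seed, and the use of a simple shuffle are assumptions, because the text does not state how the pseudorandom order was generated or whether additional ordering constraints were applied.

```python
import random

# Hypothetical labels for the four maze arms (not named in the text).
ARMS = ("north", "east", "south", "west")


def start_arm_block(rng, starts_per_arm=2):
    """Return one shuffled block in which every arm serves as start arm equally often."""
    block = [arm for arm in ARMS for _ in range(starts_per_arm)]
    rng.shuffle(block)
    return block


def start_arm_sequence(n_blocks, seed=0):
    """Concatenate n_blocks balanced blocks of eight trials each."""
    rng = random.Random(seed)  # fixed seed so the schedule is reproducible
    sequence = []
    for _ in range(n_blocks):
        sequence.extend(start_arm_block(rng))
    return sequence
```

Calling `start_arm_sequence(1)` gives a single habituation day of eight trials; larger values of `n_blocks` give the longer sequences used during testing.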
Through the above process, the rats were habituated to the maze, to receiving a food reward at the end of a choice arm, and to being repeatedly transferred between the holding cage and the maze. The rats were pseudorandomly rewarded during the habituation period in an attempt to forestall the development of associations between the presence of reward and arm stimulus parameters. The handling and habituation process was identical for rats in all three behavior groups described below.
Microdialysis probes were implanted the day after the last session of blocked arm habituation. This was approximately day 12 of the handling and habituation process.
Rats were anesthetized with halothane and placed in a stereotaxic apparatus using blunt ear bars. A thermostatically controlled electric heating pad was used to maintain the body temperature at 37°C. A 10 mm incision was made in the skin over the skull, and the wound margin was infiltrated with lidocaine solution (2%). Holes were drilled for two skull screws and two concentric microdialysis probes. For each rat, microdialysis probes were implanted into two of the three target brain regions, the mPFC, the NAc, and the DS. Stereotaxic coordinates for the target regions were as follows (in mm): mPFC, +3.2 anterior to bregma, 0.7 lateral to midline, 4.5 ventral from bregma; NAc, +1.5 anterior, 1.0 lateral, 8.4 ventral; and DS, +1.2 anterior, 1.8 lateral, 6.0 ventral. Coordinates were derived from the atlas of Paxinos and Watson (1986). The probes were constructed using Hospal AN69 polyacrylonitrile dialysis tubing (Renal Care, Lakewood, CO). Probes had an outer diameter of 330 μm and an exposed tip of 3.0 mm for the mPFC and 2.0 mm for the NAc and DS, respectively. Dialysis probes were affixed to the skull using jeweler’s screws and dental acrylic (Plastics One, Roanoke, VA).
Immediately after surgery, the dialysis probes were connected to a Harvard Apparatus (Holliston, MA) syringe pump via fused-silica tubing (Polymicro, Phoenix, AZ) attached to a liquid swivel/balance arm assembly (Instech). A thin wire leader affixed at one end to a bracket on the liquid swivel and at the other end to an eye screw embedded in the head cap served to eliminate tension on the dialysis lines themselves. Rats were then placed in a clear polycarbonate “recovery” cage (44 × 22 × 42 cm) containing fresh bedding. This cage was placed in the testing room, which was on a 12 h light/dark cycle identical to that of the animal colony room. During the postsurgical recovery period, dialysis probes were perfused at a rate of 1.0 μl/min with a Ringer’s solution containing the following (in mM): 145 NaCl, 2.7 KCl, 1.0 MgCl2, and 1.2 CaCl2. The rats were allowed to recover for 24 h before behavior testing. During the recovery period, the rats were fed their daily ration of rodent chow, moistened with tap water.
Set-shift testing consisted of two sessions, or sets, separated by a 2 h interset interval. For set 1, the rats were trained to a criterion performance level on either a brightness (dark vs light maze arms) or texture (rough vs smooth arms) discrimination task. For set 2, the rats were trained on the alternative discrimination strategy for 80 trials, regardless of performance level. Rats were assigned to treatment and training groups pseudorandomly.
The maze was in a T configuration such that, from a start arm, a rat had a choice of making a 90° turn into either a dark or light arm, and a smooth or rough arm. Rats were trained to a criterion performance level of eight consecutive entries into a rewarded arm, at which point training was stopped. Rats were trained for a maximum of 120 trials. Rats that did not reach criterion performance were excluded from the experiment. Within each sequential block of eight trials, there were two starts from each of the four arms. The maze was rotated 90° with respect to the testing room every trial to discourage the use of extramaze cues. The ITI was ∼15 s. Between trials, the rats were placed in the holding cage, which contained fresh bedding material.
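The criterion rule above (eight consecutive entries into the rewarded arm, with a ceiling of 120 trials) can be expressed as a short scoring function. This is a sketch under one stated assumption: the trial count returned includes the eight trials of the criterion run itself, a convention the text does not specify.

```python
def trials_to_criterion(outcomes, run_length=8, max_trials=120):
    """Return the 1-based trial on which the criterion run was completed.

    outcomes: sequence of truthy/falsy values, one per trial (True = rewarded arm entered).
    Returns None if the criterion is not met within max_trials.
    Assumption: the count includes the trials that make up the criterion run.
    """
    streak = 0
    for trial, correct in enumerate(outcomes[:max_trials], start=1):
        streak = streak + 1 if correct else 0  # any error resets the run
        if streak == run_length:
            return trial
    return None
```

A rat for which the function returns `None` corresponds to one that did not reach criterion and would have been excluded.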
As during set 1, from each of the four start arms, a rat had a choice of making a 90° turn into an arm that was either rough or smooth, and light or dark. The rats received 80 trials, regardless of performance level during those trials. The number of trials required to reach the criterion of eight consecutive correct arm entries used for set 1 was recorded, as was the time required to complete all 80 trials. The sequence of arm starts was identical to that used for set 1.
A separate group of rats was tested under conditions in which the receipt of reward was yoked to the performance of rats from the set-shift task group described above. Yoking was accomplished through the use of historical data from rats that had completed the set-shift task. First, a set-shift group rat was randomly selected for yoking. A data sheet was composed for the yoked rat for sets 1 and 2 that indicated whether the corresponding set-shift rat had obtained a reward on a given trial. The yoked rat was then run in the maze for both sets, using the same sequence of arm starts, and for the same number of trials as had been the set-shift rat. When a trial was scheduled to be unrewarded, there was no reward present in the maze. Conversely, when a trial was to be rewarded, both choice arms were baited to ensure that the rat received a reward on that trial. Forced alternation was never used at any point.
Reward was available according to the reward schedule of the set-shift rat, slightly modified. Because the start arms of the maze are distinct from one another and could possibly be used to establish a predictive relationship, the receipt of reward was randomized within each eight-trial block, while preserving the probability of reward on any given trial and the percentage of rewarded trials per block. Thus, during this task, rats navigated the maze and consumed food rewards in the absence of an explicit, experimenter-determined learning requirement. Furthermore, the receipt of reward was not a predictable outcome of the choices made, or strategies adopted, by a yoke group rat.
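The within-block randomization of the yoked schedule can be illustrated as below. The sketch shuffles the rewarded/unrewarded outcomes of the donor set-shift rat inside each block of eight trials, which keeps the number of rewards per block unchanged. The function name, the seed, and the handling of a final partial block are assumptions made for illustration; the text does not describe the randomization procedure in more detail.

```python
import random


def yoke_schedule(rewarded, block_size=8, seed=0):
    """Shuffle reward outcomes within consecutive blocks of trials.

    rewarded: per-trial outcomes of the donor set-shift rat (1 = rewarded, 0 = not).
    The number of rewarded trials in every block is preserved; only their
    positions within the block change.
    """
    rng = random.Random(seed)
    schedule = []
    for start in range(0, len(rewarded), block_size):
        block = list(rewarded[start:start + block_size])
        rng.shuffle(block)  # a trailing partial block is shuffled on its own
        schedule.extend(block)
    return schedule
```

Because only positions within a block change, the overall reward density and the block-by-block rise in reward probability of the donor rat carry over to the yoked rat.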
Rats in the reward-retrieval group were run in the maze for a statistically identical number of trials on set 1 and for an identical number of trials on set 2, as their set-shifting counterparts. However, during the dialysis experiment, these rats were not trained to discriminate maze arms to receive food reward, nor were they trained to shift between discrimination rules. Instead, they were rewarded for all arm entries. Like rats in the yoke group, during this task, rats navigated the maze and consumed food rewards in the absence of an explicit, experimenter-determined learning requirement. However, under the reward-retrieval conditions, there was no uncertainty as to the receipt of reward after an arm choice; the probability of reward equaled 1.0 on every trial. The sequence of start arms was identical to that for the set-shift and yoke group rats.
Dopamine microdialysis procedure
On the day of the experiment, rats were transferred from their recovery cages to the holding box ∼3 h before the collection of baseline samples (4 h before training on set 1). At this point, the flow rate was increased to 2 μl/min. Dialysate samples were collected every 20 min for the duration of each experiment. The samples were injected immediately onto an HPLC system with electrochemical detection for analysis of dopamine. The HPLC systems used a narrow-bore column [2.1 mm inner diameter; 3 μm C-18 particles (Thermo Electron Corporation, Bellefonte, PA) and LC-4C potentiostat (Bioanalytical Systems, West Lafayette, IN)]. The Eapp was +0.55 V versus Ag/AgCl reference electrode. The mobile phase for the dopamine system consisted of 0.1 M NaH2PO4, 640 mg/L octylsulfonic acid, 7% (v/v) acetonitrile, 0.25% EDTA, and 350 μl/L triethylamine, pH 5.0. The threshold for detection of dopamine was 5 fmol. Sample collection and behavior sets were coordinated such that the starts of sets 1 and 2 precisely coincided with the onset of collection of samples 4 and 12, respectively.
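The alignment between 20 min dialysate samples and the two behavior sets can be made explicit with a small helper. This is an illustrative sketch; it assumes that time zero is the onset of sample 1 and that samples are collected back to back without gaps, neither of which is stated explicitly in the text.

```python
SAMPLE_MINUTES = 20                 # collection interval given in the text
SET_ONSET_SAMPLE = {1: 4, 2: 12}    # set number -> sample whose onset starts the set


def sample_window(sample_number):
    """Return (start, end) in minutes, measured from the onset of sample 1."""
    start = (sample_number - 1) * SAMPLE_MINUTES
    return start, start + SAMPLE_MINUTES


def set_onset_minutes(set_number):
    """Return the onset time of a behavior set, in minutes from the onset of sample 1."""
    return sample_window(SET_ONSET_SAMPLE[set_number])[0]
```

Under these assumptions, set 1 begins 60 min after the onset of sample 1 (after three baseline samples) and set 2 begins at 220 min, so the two set onsets are 160 min apart.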
After completion of the experiment, rats were then anesthetized with chloral hydrate and perfused with 0.9% saline, followed by 10% Formalin. Their brains were removed and stored in a 10% Formalin solution. Serial sections (200 μm) through the regions of interest were mounted on glass slides and stained with cresyl violet. Stained sections were evaluated for accuracy of probe placement. Animals with probe placements outside of target regions were excluded from subsequent analysis. Figure 1 illustrates the location of acceptable, functional probe placements.
Between-group comparisons of behavior measures were made by one-way ANOVA and Tukey’s adjusted post hoc testing, with the exception of the “trials to criterion” measure for set 2. In the latter case, because of the imposed ceiling of 80 trials on set 2, the between-group comparison was made using the Kruskal–Wallis test for nonparametric data, with Mann–Whitney post hoc testing. To assess performance across trial blocks for set 2, percentage correct scores were calculated for 10 consecutive blocks of eight trials each. Performance across trial blocks was analyzed by comparing percentage correct scores using two-way, repeated-measures ANOVAs, with probe placement as the between-subject dimension and trial block as the within-subject dimension.
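The block-wise performance measure (percentage correct in consecutive blocks of eight trials) can be computed as in the sketch below. The function name is hypothetical, and dropping an incomplete final block is an assumption; for the 80 trials of set 2 the question does not arise, because the trials divide evenly into 10 blocks.

```python
def percent_correct_by_block(outcomes, block_size=8):
    """Return percentage-correct scores for consecutive, complete blocks of trials.

    outcomes: per-trial results (1/True = correct, 0/False = incorrect).
    Assumption: an incomplete final block is ignored.
    """
    scores = []
    for start in range(0, len(outcomes) - block_size + 1, block_size):
        block = outcomes[start:start + block_size]
        scores.append(100.0 * sum(block) / block_size)
    return scores
```

The resulting list of scores per rat is the within-subject dimension (trial block) of the repeated-measures analysis described above.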
Perseverative responding on set 2 was evaluated by comparing the percentage correct scores from each of two start arm classes within each block of eight trials. For a given rat, the perseveration arms (PAs) were the start arms from which, on set 2, responding according to the correct stimulus–reward contingency from set 1 produced an incorrect, nonrewarded response. The reinforcement arms (RAs) were the start arms from which, on set 2, responding according to the correct stimulus–reward contingency from set 1 produced a correct, rewarded response (Stefani et al., 2003; Stefani and Moghaddam, 2005b). As with the overall performance across trial blocks, PA and RA performance was analyzed by comparing percentage correct scores using two-way, repeated-measures ANOVAs, with start arm class as the between-subject dimension and trial block as the within-subject dimension. All post hoc tests were adjusted according to Tukey’s method.
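The PA/RA distinction follows directly from comparing which choice arm each rule selects. In the sketch below, a start arm is classed as RA when the set 1 rule and the set 2 rule point to the same choice arm, and as PA when they point to different arms. The representation of arms as (brightness, texture) tuples and the specific pairings in the usage example are hypothetical; the text does not list which stimulus combinations occupied which arms.

```python
def rule_choice(choice_arms, rule):
    """Return the index of the single choice arm carrying the rewarded stimulus value."""
    matches = [i for i, arm in enumerate(choice_arms) if rule in arm]
    if len(matches) != 1:
        raise ValueError("rule must identify exactly one of the two choice arms")
    return matches[0]


def start_arm_class(choice_arms, set1_rule, set2_rule):
    """Classify a start arm on set 2 as 'RA' or 'PA'.

    RA: following the old (set 1) rule still leads to the rewarded arm.
    PA: following the old (set 1) rule leads to the unrewarded arm.
    """
    same_arm = rule_choice(choice_arms, set1_rule) == rule_choice(choice_arms, set2_rule)
    return "RA" if same_arm else "PA"
```

For example, with a set 1 rule of "dark" and a set 2 rule of "smooth", a choice between `("dark", "smooth")` and `("light", "rough")` arms is an RA start, whereas a choice between `("dark", "rough")` and `("light", "smooth")` arms is a PA start.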
Microdialysis values are expressed as percentage ± SEM of baseline. Baseline was defined as the average of the three dialysis samples immediately before the start of behavior testing (set 1). Dopamine levels were analyzed by two-way ANOVA, with behavior group (set-shift, yoke, or reward-retrieval) as the between-subjects factor and sample as the within-subjects measure. Between-group post hoc tests were made by one-way ANOVA with Tukey’s adjusted post hoc testing. Within-group post hoc tests were made using paired-sample t tests.
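The normalization described above can be sketched as follows. Each animal's dopamine values are divided by the mean of its three pre-behavior samples (samples 1 to 3, given that set 1 begins with sample 4) and expressed as a percentage; group values are then summarized as mean and SEM. Function names are hypothetical, and the SEM is assumed to be the sample standard deviation divided by the square root of n.

```python
from math import sqrt
from statistics import mean, stdev


def percent_of_baseline(samples, n_baseline=3):
    """Express one animal's samples as a percentage of its pre-behavior baseline."""
    baseline = mean(samples[:n_baseline])  # average of the first three samples
    return [100.0 * value / baseline for value in samples]


def mean_and_sem(values):
    """Return (mean, SEM) across animals for a single sample number."""
    return mean(values), stdev(values) / sqrt(len(values))
```

Applying `percent_of_baseline` per animal before averaging means that animals with different absolute dopamine concentrations contribute equally to the group curve.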
Correlations between dopamine output and behavioral measures were calculated using the percentage change in dopamine levels from baseline for the first 20 min of each test phase (samples 4 and 12 for sets 1 and 2, respectively) because all rats ran for the majority of the first 20 min during set 1 and for at least 20 min during set 2. Moreover, the first 20 min of each set represented the greatest difference between behavior groups in probability of reward receipt and cognitive load. Lines were fit for each dataset on the basis of an initial examination of the scatter plots, followed by use of the SigmaPlot regression tool (SigmaPlot version 8.0; SPSS; Chicago, IL) to determine the best conservative fit when nonlinearity was evident. The α level for all statistical comparisons was ≤0.05.
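The two quantities involved here, a correlation coefficient and a nonlinear fit, can be illustrated without specialized software. The sketch below computes Pearson's r and fits a two-parameter hyperbola of the form y = a + b/x by ordinary least squares on 1/x. The functional form is an assumption: the text states only that hyperbolic functions gave the best fit in SigmaPlot, not which hyperbolic equation was used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_hyperbola(xs, ys):
    """Least-squares fit of y = a + b / x (assumed form), via a line fit on u = 1 / x."""
    us = [1.0 / x for x in xs]
    n = len(us)
    mu, my = sum(us) / n, sum(ys) / n
    b = sum((u - mu) * (y - my) for u, y in zip(us, ys)) / sum((u - mu) ** 2 for u in us)
    a = my - b * mu
    return a, b
```

Here `xs` would hold a behavioral measure such as trials to criterion and `ys` the percentage change in dopamine from baseline for sample 4 or 12.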
Thirty-eight rats were included in data analysis. One rat had a single probe, and the other 37 had two probes. Of the latter, 19 rats had two working dialysis probes; the balance had one functional probe. The distribution of dual-probe placements was as follows: PFC plus NAc, n = 11; PFC plus DS, n = 12; NAc plus DS, n = 14. Figure 1 shows the location of dialysis probes. There were no significant differences for any behavior performance measure when probe placement was used as the independent variable.
The implantation of dialysis cannulas 24 h before behavior testing did not adversely affect maze performance. Behavior measures did not differ between the rats included in this study and those of surgically naive rats or rats implanted with microinjection guide cannulas 1 week before testing [for comparison, see supplemental Fig. S1 (available at www.jneurosci.org as supplemental material) and Stefani et al., 2003].
Behavior task performance
Rats trained on the set-shift task (n = 14) required 51 ± 6 trials to reach the performance criterion of eight consecutive entries into the rewarded arm for set 1 and 56 ± 4 trials to reach criterion during set 2. The mean times required to reach criterion on set 1 and to complete the 80 trials of set 2, respectively, were 27 ± 3 and 39 ± 1 min. The mean time per trial values were 0.55 ± 0.05 and 0.49 ± 0.02 min for sets 1 and 2, respectively. Between sets 1 and 2, there were no significant differences in either the trials to criterion (t(13) = 0.90; p = 0.38) or the time per trial (t(13) = 1.6; p = 0.14). The reward density, as measured by the number of rewards received per trial, was 0.65 ± 0.02 for set 1 and 0.73 ± 0.02 for set 2. This difference was significant (t(13) = 2.9; p = 0.01) and is accounted for by the fact that, unlike during set 1, during set 2, rats were trained beyond reaching criterion, after which point they received a reward on nearly every trial.
Overall performance on set 2 began near chance level (50% correct) and improved significantly across trial blocks to nearly perfect discrimination, 94.6 ± 2.9% correct (Fig. 2) (F(9,117) = 27.0; p < 0.001). Analysis by start arm class shows that rats began set 2 performing significantly worse from PA starts than RA starts (26.0 ± 4.6 and 79.2 ± 4.2% correct, respectively), indicating that they retained the cognitive set associated with the set 1 discrimination rule (Fig. 2, trial block 1). Analysis of set 2 performance from each of the two start arm classes showed a main effect of start arm class (F(1,26) = 57.0; p < 0.001) and trial block (F(9,234) = 23.0; p < 0.001) and an interaction between start arm class and trial block performance (F(9,234) = 8.3; p < 0.001). Post hoc analyses showed that rats made significantly more PA than RA errors at the outset of set 2. There were no significant differences in performance level at trial block 10.
Rats in the yoke task group (n = 10) required 48 ± 5 trials during set 1 to reach the point at which they had received eight consecutive food rewards and 60 ± 5 trials during set 2 to reach the point of continuous reward. These points correspond to the attainment of criterion performance by the subset of set-shift group rats on whose performance the yoke group reward schedule was based. The mean times required to reach criterion on set 1 and to complete the 80 trials of set 2, respectively, were 26 ± 3 and 39 ± 0.4 min. The mean time per trial values were 0.53 ± 0.01 and 0.49 ± 0.00 min for sets 1 and 2, respectively. The difference in trials to criterion between sets 1 and 2 was not significant (t(9) = 1.8; p = 0.11). Rats were significantly faster on a per trial basis during set 2 (t(9) = 3.4; p = 0.01) than on set 1 by an average of 4 s per trial. The reward density, as measured by the number of rewards received per trial, was 0.69 ± 0.02 for set 1 and 0.71 ± 0.04 for set 2. This difference was not significant (t(9) = 0.60; p = 0.56).
Rats (n = 14) were trained on the reward-retrieval condition for 48 ± 0.7 trials on set 1 and 80 trials for set 2. The times required to complete sets 1 and 2, respectively, were 25 ± 1 and 38 ± 0.6 min. The mean time per trial measures for sets 1 and 2 were, respectively, 0.52 ± 0.03 and 0.47 ± 0.01 min. The time per trial measures from sets 1 and 2 did not differ significantly. The reward density was 1.0, exactly, for both sets 1 and 2.
There were no significant differences between the three behavior groups in the number of trials run on sets 1 or 2 or the time per trial measures for each set (respective values of F(2,35) < 1.0; p values >0.05). There were main effects for the absolute number of rewards received during sets 1 and 2 and the reward density. These effects were accounted for by the designed differences between the reward-retrieval and both the set-shift and yoke groups. There were no significant differences in these measures between the set-shift and yoke groups. Furthermore, there were no significant differences in any behavioral measure between the yoke group and the subset of the set-shift group used for yoking (all values of F(1,15) < 1.0; p values >0.5).
Baseline dopamine levels
There were no significant between-task differences in absolute baseline extracellular dopamine levels within the three brain regions assayed. Respective baseline values were as follows (mean ± SEM of three pre-behavior baseline samples): mPFC, 0.33 ± 0.01 fmol/μl of sample for the set-shift trained rats, 0.17 ± 0.05 for the yoked rats, and 0.33 ± 0.01 for the reward-retrieval group (F(2,15) = 1.2; p = 0.32); NAc, 0.82 ± 0.12 fmol for the set-shift trained rats, 0.49 ± 0.10 for the yoked rats, and 0.82 ± 0.10 for the reward-retrieval controls (F(2,16) = 3.4; p = 0.06); DS, 2.1 ± 0.64 fmol for the set-shift trained rats, 1.1 ± 0.26 for the yoked rats, and 1.1 ± 0.17 for the reward-retrieval controls (F(2,17) = 2.4; p = 0.13).
Inspection of the dialysis data suggested that dopamine levels in the yoke groups did not return to pre-behavior baseline levels after set 1 but rather stabilized at a higher level. To assess this, we compared the pre-set 1 baseline value with a new, pre-set 2 baseline value calculated as the average of samples 9–11, using paired t tests. The second baseline was significantly higher than the first in the NAc and DS of the yoke group (t(5) = 4.1 and t(6) = 3.2 for NAc and DS yoke groups, respectively; p values <0.05). There were no significant between-baseline differences within the set-shift or reward-retrieval behavior groups, nor were there significant between-behavior group differences in the second baseline in the three regions.
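The baseline comparison above is a paired-sample t test on each animal's first and second baseline. A minimal sketch of the test statistic follows; the function name is hypothetical, and the p value lookup is omitted because it would require a t distribution table or a statistics library.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, degrees of freedom) for a paired-sample t test.

    first, second: per-animal values in the same animal order, for example
    the pre-set 1 baseline and the pre-set 2 baseline (mean of samples 9-11).
    """
    diffs = [b - a for a, b in zip(first, second)]
    t = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return t, len(diffs) - 1
```

Because the degrees of freedom equal the number of animals minus one, the reported t(5) and t(6) correspond to six and seven paired observations, respectively.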
Medial prefrontal cortex
Maze exposure was associated with significant changes in mPFC dopamine levels in all three behavior groups (Fig. 3A). Comparison of the behavior groups found significant between-group (F(2,15) = 4.1; p = 0.04) and within-group (F(16,240) = 11.1; p < 0.001) differences. There was also a significant interaction effect between behavior group and dopamine sample (F(32,240) = 2.2; p < 0.001). Post hoc repeated-measures ANOVAs for each behavioral condition found significant effects of sample on dopamine release (set-shift, F(16,112) = 5.9; yoke, F(16,64) = 10.0; reward-retrieval, F(16,64) = 3.8; p values <0.001). In set-shift trained rats, dopamine levels were increased significantly above baseline during both behavioral sets (Fig. 3A, samples 4, 12, 13). Dopamine levels remained significantly elevated after testing (Fig. 3A, samples 5, 6, 14, 15). This sustained posttraining increase was especially persistent after set 2. Yoke group rats also had significantly increased dopamine levels associated with the two behavior sets (samples 4, 12, 13) and the samples immediately after each set (samples 5, 6, 14). Between sets, dopamine levels returned to baseline levels in both the set-shift and yoke groups. In the reward-retrieval rats, there was a slight but nonsignificant increase above baseline during set 1 and no change from baseline during performance of set 2. Rather, there was a trend for dopamine levels to remain at or below pre-maze baseline levels. Comparisons of the magnitude of increases in dopamine levels between sets 1 and 2 found a significant difference for the set-shift group (sample 4 vs sample 12, paired sample t(7) = 3.1; p = 0.02) and near-significant differences for the yoke (t(4) = 2.7; p = 0.06) and reward-retrieval (t(4) = 2.3; p = 0.08) groups.
There were significant between-group differences in dopamine levels during set 1, during which dopamine levels in set-shift and yoke rats exceeded those in the reward-retrieval rats (p values <0.05). The same pattern was obtained during set 2, although the main effects were not significant (p values = 0.08 and 0.07 for samples 12 and 13, respectively). The observed interaction effect is accounted for by the differential patterns of behavior-associated dopamine release, in which dopamine levels did not significantly increase when reward was certain and did not require explicit rule learning but did increase when reward was uncertain, whether reward maximization was possible through acquisition and shifting between two discrimination rules or not.
There was a striking dissociation in the relationship between the behavior-associated increases in mPFC dopamine and the trials to criterion performance between rats performing the set-shift task and those in the yoke group. In set-shift rats, there was a strong significant negative correlation between the percentage increase in dopamine above baseline and the trials to criterion measure for set 2 (Fig. 3B) (r = −0.85; p = 0.01). The correlation between dopamine and the acquisition rate for set 1 was also negative (r = −0.70) and neared statistical significance (p = 0.06). The correlations for both behavior sets were best fit by nonlinear, hyperbolic functions. In contrast, in the yoke groups, correlations between dopamine levels and the number of trials to criterion were positive and moderate in magnitude, although nonsignificant, for both set 1 (r = 0.64; p = 0.25) and set 2 (r = 0.55; p = 0.34). As for the set-shift rats, the correlation was best described by a hyperbolic function, although the difference between hyperbolic and linear fits was small.
Nucleus accumbens
Maze exposure was associated with significant changes in dopamine levels in all three behavior groups (Fig. 4A). There was no main effect of experimental group (F(2,14) = 3.1; p = 0.08), but there was a significant within-group effect (F(16,224) = 23.1; p < 0.001) and a significant interaction effect (F(32,224) = 3.2; p < 0.001). Post hoc one-way repeated-measures ANOVAs for each behavioral condition found significant effects of sample on dopamine release (set-shift, F(16,64) = 4.0; yoke, F(16,64) = 22.4; reward-retrieval, F(16,96) = 11.0; p values ≤0.001). In set-shift trained rats, dopamine levels were increased significantly above baseline only during set 2 (Fig. 4A, samples 12, 13) (t(5) > 2.8; p values <0.05). There was a nonsignificant behavior-associated increase during set 1, driven by a large increase in dopamine output in one rat that had an unusually high time per trial measure for set 1. Dopamine levels quickly returned to baseline after each behavior set. In contrast, dopamine levels in the yoke and reward-retrieval rats significantly increased during both sets (Fig. 4A, samples 4, 12, 13) and remained significantly increased after set 1 (samples 5–7) in both groups and after set 2 as well in the yoke group (samples 14–17). There were no statistically significant differences in dopamine levels between sets 1 and 2 within behavior groups.
Between-group differences in dopamine levels were significant during set 1 (sample 4, main effect, F(2,16) = 7.5; p < 0.001); post hoc testing showed dopamine levels significantly higher in the reward-retrieval group compared with the set-shift group. After set 1, there were significant differences during samples 5–7 (F values >3.9). During samples 5 and 6, dopamine levels for both reward-retrieval and yoke groups significantly exceeded those of the set-shift group; during sample 7, the significant difference was between the set-shift and yoke groups. There were no significant between-group differences associated with set 2. The interaction effect is explained by the lack of significant increase in the set-shift rats during set 1.
Correlations between accumbal dopamine output and the trials to criterion measure for set-shift group rats were large but nonsignificant for both set 1 (r = 0.65; p = 0.44) and set 2 (r = 0.82; p = 0.19). These same relationships for the yoke group were strong for set 1 (r = 0.81; p = 0.20) and weak for set 2 (r = 0.15; p = 0.97). In neither case were the correlations for the yoke group significant. Data for both the set-shift and yoke groups were best fit by nonlinear, hyperbolic functions.
Maze exposure was associated with significant changes in dopamine levels in all three behavior groups (Fig. 5A). There was a significant main effect of treatment (F(2,16) = 4.6; p = 0.03), a significant within-group effect (F(16,256) = 11.0; p < 0.001), and a significant interaction effect (F(32,256) = 3.4; p < 0.001). Post hoc one-way repeated-measures ANOVAs found significant effects of sample on dopamine release for each behavioral condition (set-shift, F(16,80) = 3.92; yoke, F(16,80) = 9.0; reward-retrieval, F(16,96) = 4.8; p values <0.001). Dopamine levels were significantly elevated above baseline during set 1 in the yoke and reward-retrieval groups (Fig. 5A, sample 4) and during set 2 only in the yoke group.
Comparisons of the magnitude of increases in dopamine levels between sets 1 and 2 found a significant difference for the yoke group (sample 4 vs sample 12, paired sample, t(6) = 3.0; p = 0.02). There were no statistically significant differences in dopamine levels between sets 1 and 2 for the set-shift and reward-retrieval groups.
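The paired-sample comparison between sets can be sketched as below. This is an illustrative computation of the paired t statistic only, assuming one value per rat for each set; the function name and the numbers are hypothetical and are not data from the study.

```python
from math import sqrt


def paired_t(first, second):
    """Paired-sample t statistic and degrees of freedom for matched lists."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Unbiased sample variance of the within-subject differences.
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    t_stat = mean_d / sqrt(var_d / n)
    return t_stat, n - 1


# Hypothetical dopamine levels (% of baseline) for seven rats; illustration only.
set1_sample = [180, 160, 170, 150, 165, 175, 155]
set2_sample = [150, 140, 160, 135, 150, 150, 140]

t_stat, dof = paired_t(set1_sample, set2_sample)
print(f"t({dof}) = {t_stat:.2f}")
```

Seven paired observations give 6 degrees of freedom, matching the form of the t(6) statistic reported above. Converting the statistic to a p value requires the t distribution; a statistics library routine such as SciPy's `scipy.stats.ttest_rel` would be one option if available.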
There were no significant between-group differences in dopamine levels during set 1. During and after set 2 (Fig. 5A, samples 12, 14–20), dopamine levels were significantly higher in the yoke group compared with both of the other two behavior groups (main effect, F values >3.5; p values <0.05).
There were low, nonsignificant negative correlations between dopamine output and trials to criterion for each behavior set for both set-shift (set 1, r = −0.25, p = 0.64; set 2, r = −0.21, p = 0.68) and yoke groups (set 1, r = −0.26, p = 0.57; set 2, r = −0.02, p = 0.97). Data for both the set-shift and yoke groups were best fit by linear functions.
By simultaneously measuring extracellular dopamine and behavior, we identified dissociable patterns of dopamine release in three forebrain regions, the mPFC, NAc and DS, related to task-dependent cognitive requirements and the probability and controllability of reward. Within the mPFC, both the set-shift and yoke conditions were associated with significant, sustained increases in dopamine levels of similar magnitude. However, the relationship between dopamine levels and the likelihood of reward differed markedly depending on the controllability of reward maximization. Within the NAc, behavior-associated increases were observed for all groups but were more pronounced in the yoke and reward-retrieval groups, particularly during the first behavior period. In the DS, dopamine levels increased modestly, but significantly, during both sets in all three groups.
The most salient findings were the pronounced dissociations in behavior-associated increases in mPFC dopamine release among the three behavior conditions. Uncertainty of reward, whether controllable, as for the set-shift rats, or uncontrollable, as for the yoke rats, was associated with large increases in dopamine levels during both sets. Dopamine levels remained elevated after set 2 until the termination of the experiment, 1 h after the end of behavior. Reward retrieval was associated with a slight elevation of dopamine levels during set 1 and no elevation above baseline level during set 2. Rather, there was a tendency for dopamine levels to be below baseline.
Although superficially similar, the nature of the increases in mPFC dopamine in the set-shift and yoke groups differed markedly depending on whether reward maximization was within the subject’s control. Set-shift rats, by acquiring the respective discrimination rules, could, through their own cognitive efforts, eliminate action-outcome uncertainty and maximize reward. For these animals, there was strong, significant correspondence between the increase in dopamine within the mPFC and the rapidity with which a set-shift rat shifted from the set 1 rule to the set 2 rule. Larger increases in dopamine were associated with faster shifting to the second discrimination rule. This relationship was also observed during set 1, although to a lesser, nonsignificant degree. Set 2 is arguably the more cognitively demanding behavior phase, with maximal performance requiring not only the acquisition of a new discrimination rule but the inhibition of responding according to the set 1 discrimination rule. In contrast, the opposite relationship was obtained for the yoke rats, which performed under conditions of continuing uncertainty, even in the face of increasing probability of reward. Rapid “acquisition” of each set (and thus a high probability of overall reward) was associated with lower dopamine levels. Yoke rats for which reward receipt remained intermittent throughout set 2 had the highest dopamine levels. This may reflect continuing efforts by these rats to discern some pattern between arm choice and reward, that is, to learn a rule that would maximize reward.
The relationship between dopamine levels and set-shifting ability was nonlinear. Such nonlinearity has been reported for the subcortical dopaminergic system and is hypothesized to be attributable to a shift from tonic to burst firing modes (Gonon, 1988; Grace, 1991; Chergui et al., 1994; Goto and Grace, 2005). The greater activation of the meso-PFC dopaminergic projection observed during set-shifting may result from a shift in the firing pattern of dopaminergic neurons to a bursting state, with the result of greater dopamine release in mPFC terminal fields and greater cognitive ability. Interestingly, under our task conditions, such a mechanism would have to be selective for meso-PFC dopaminergic neurons, because projections to other terminal regions did not display the same pattern of activation under identical behavioral conditions. Because there were identical stimuli and requirements for locomotion and the act of reward consumption within the maze for all three tasks, the mPFC dopaminergic projection does not appear to be activated by maze navigation or the retrieval of reward; rather, as indicated by the significant correlation between the rapidity of rule acquisition and the increase in mPFC dopamine release, it selectively responds to more cognitive task aspects.
The association between rat mPFC dopamine neurotransmission and cognition has been reported by others using a variety of tasks (Phillips et al., 2004; Tunbridge et al., 2004; Rossetti and Carboni, 2005). For example, Phillips et al. (2004) reported a significant negative correlation between dopamine levels and working memory errors. Paradoxically, although the increase in mPFC dopamine levels was similar in set-shift rats during sets 1 and 2, our previous work (Stefani and Moghaddam, 2005b) and that of others using dopamine antagonists or lesions (Ragozzino et al., 1999; Birrell and Brown, 2000; Ragozzino, 2002; Floresco et al., 2006b) suggest that neither dopamine receptors in particular nor the mPFC in general is required for the acquisition of the set 1 rule. Albeit superficially nonparsimonious, it is possible that the mPFC, although not necessary for acquisition of the first discrimination by rule-naive rats, influences subsequent efforts to shift rules or guide behavior through its own function or the modulation of other brain areas (Floresco et al., 2006a).
Accumbal dopamine levels were also responsive to behavioral contingencies. Set-shift task performance was associated with small dopamine increases above baseline during set 1, followed by large increases during set 2. The association between NAc dopamine levels and the rate of rule shifting, although not statistically significant, was much more robust during set 2 of the set-shift task than during set 1. Together, these observations suggest a more prominent role for accumbal dopamine in shifting as opposed to simple rule acquisition. Such a role for the NAc core during the rule shift has been reported recently (Floresco et al., 2006a). The yoke and reward-retrieval conditions showed large increases during both behavior phases. Accumbal dopamine release has been proposed to be generally required for associative learning and/or motivated or effortful behavior (Robbins et al., 1990; Garris et al., 1999; Smith-Roe and Kelley, 2000; Salamone and Correa, 2002; Goto and Grace, 2005). It is unlikely that motivational factors per se account for different dopamine profiles between the three behavior groups because there were no significant between-group differences in the time required to complete a trial, and all rats consumed all available food pellets. Dopamine neurons increase their firing rates in response to larger than expected rewards (Tobler et al., 2005). Although this might explain the large increase seen in the reward-retrieval group during set 1, it does not fully account for the increase observed in the yoke group or the similar patterns of release during set 2.
Dopamine levels in the dorsal striatum were more related to motoric aspects of behavior than to cognitive- or reward-related aspects. By design, all three tasks had approximately equivalent motor requirements and exposure to maze stimuli, and all rats were handled identically during testing. There was a significant effect of maze exposure on dopamine levels but no significant differences between behavior groups during set 1 or within groups between sets. The higher dopamine levels in the yoke group during set 2 might be explained by higher “resting” levels between the two sets. Yoke group rats were observed to be much more active in the holding chamber between trials and between sets than rats in the set-shift and reward-retrieval groups. The DS is generally hypothesized to be involved in stimulus–response learning and response selection (Brown and Robbins, 1989; Aosaki et al., 1995; Packard and McGaugh, 1996). Rats in the yoke and reward-retrieval groups often rapidly adopted response strategies during set 1 (data not shown), which may have accounted for the marginally higher levels of dopamine during that set.
Electrophysiological studies recording behavior-associated activity of dopamine cell bodies consistently report short-latency, phasic firing of dopaminergic neurons in response to reward receipt and to stimuli predictive of reward (Schultz, 2002; Dommett et al., 2005; Tobler et al., 2005). In contrast, the present findings and other studies measuring dopamine levels in the terminal fields of the dopaminergic projection report elevations in extracellular dopamine levels persisting above resting levels well after the end of stimulation (Phillips et al., 2004; Rossetti and Carboni, 2005). This suggests that brief, phasic responses of dopaminergic neurons to a stimulus can influence dopamine-dependent neuromodulation in terminal regions well beyond the presence of the stimulus. This mechanism may be critical for working memory, planning, memory consolidation, and other functions requiring the organization of behavior in the absence of recently presented stimuli. This hypothesis is, in part, supported by observations that posttraining administration of dopamine receptor agonists enhances learning (Packard and White, 1991; White et al., 1993) in a task- and region-selective manner.
In conclusion, by varying the reward contingencies associated with arm choices, while maintaining a relatively constant requirement for locomotion and reward consumption, we find that the patterns of behaviorally activated dopamine release in corticostriatal regions are not monolithic and differ depending on the nature of the task and elements within a task. Additional studies will determine whether this regulation occurs at the point of origin of the dopaminergic projections in the midbrain or in the terminal fields.
This work was supported by National Institute of Mental Health Grants R21-MH65026 and R01-MH48404. We thank Kelli Jones and Alicia DeFrancesco for technical assistance.
Correspondence should be addressed to Dr. Mark R. Stefani, Psychology Department, Middlebury College, McCardell Bicentennial Hall 276, Middlebury, VT 05753.