Abstract
Studies of visual discrimination reversal learning have revealed striking neurochemical dissociations at the level of the orbitofrontal cortex (OFC), with serotoninergic, but not dopaminergic, integrity being important for successful reversal learning. These findings have considerable implications for disorders such as obsessive-compulsive disorder and schizophrenia, in which reversal learning is impaired, and which are primarily treated with drugs targeting the dopaminergic and serotoninergic systems. Dysfunction in such disorders, however, is not limited to the OFC and extends subcortically to other structures implicated in reversal learning, such as the medial caudate nucleus. Therefore, because the roles of serotonin and dopamine within the caudate nucleus are poorly understood, this study compared the effects of selective serotoninergic or selective dopaminergic depletions of the marmoset medial caudate nucleus on serial discrimination reversal learning.
All monkeys were able to learn novel stimulus–reward associations but, unlike control monkeys and monkeys with selective serotoninergic medial caudate depletions, dopamine-depleted monkeys were markedly impaired in their ability to reverse this association. This impairment was not perseverative in nature. These findings are the opposite of those seen in the OFC and provide evidence for a neurochemical double dissociation between the OFC and medial caudate in the regulation of reversal learning. Although the specific contributions of these monoamines within the OFC–striatal circuit remain to be elucidated, these findings have profound implications for the development of drugs designed to remediate some of the cognitive processes underlying impaired reversal learning.
Introduction
Visual discrimination reversal learning is commonly used to investigate the neurobiological mechanisms underlying the ability of animals to adapt their behavior to changes in the motivational significance of environmental stimuli. Studies using this paradigm have focused on the frontostriatal loops and the relative contributions of different nodes of these circuits to adaptive behavior. Damage to both the orbitofrontal cortex (OFC) and the medial caudate nucleus (the region that receives OFC input) (Haber et al., 1995; Roberts et al., 2007; Schilman et al., 2008) impairs reversal learning performance in primates and rodents (Divac et al., 1967; Butter, 1969; McEnaney and Butter, 1969; Jones and Mishkin, 1972; Dias et al., 1996; McAlonan and Brown, 2003; Hornak et al., 2004; Clarke et al., 2008).
During reversal learning, subjects associate cues with the presence or absence of reward or punishment, adapting their behavior when those contingencies change. Computational models postulate that reward and error prediction signals are generated from the mismatch between observed and expected outcomes, and used to match behavior to the current contingency (Sutton and Barto, 1998; Schultz and Dickinson, 2000; Daw et al., 2002; Frank and Claus, 2006). Serotoninergic dorsal raphe neurons may encode the expectation and delivery of reward (Nakamura et al., 2008; Bromberg-Martin et al., 2010), whereas error signals are generated by phasic mesolimbic dopaminergic neuron firing (Montague et al., 1996; Schultz et al., 1997, 1998). Both the dorsal raphe serotonin and mesocortical/mesolimbic dopamine neurons project to the OFC and striatum (van der Kooy and Hattori, 1980; Oades and Halliday, 1987; Corvaja et al., 1993), suggesting that their integrity may be important for reversal learning; however, their specific contributions remain uncertain.
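The prediction-error idea described above can be illustrated with a minimal delta-rule sketch. This is not the model of any cited study: the function name, the learning rate `alpha`, the trial counts, and the simplification that both stimuli are updated on every trial are all assumptions made for illustration.

```python
def simulate_reversal(n_trials=60, reversal_at=30, alpha=0.3):
    """Track the learned value of two stimuli across one contingency reversal.

    Illustrative simplification: both stimuli are sampled on every trial,
    so the value trajectories depend only on the learning rate (alpha).
    """
    values = {"A": 0.0, "B": 0.0}  # expected reward for each stimulus
    history = []
    for trial in range(n_trials):
        rewarded = "A" if trial < reversal_at else "B"
        for stim in values:
            outcome = 1.0 if stim == rewarded else 0.0
            prediction_error = outcome - values[stim]  # observed minus expected
            values[stim] += alpha * prediction_error
        history.append(dict(values))
    return history


history = simulate_reversal()
before = history[29]  # last trial before the reversal
after = history[-1]   # last trial after the reversal
print(before, after)
```

In this sketch the value of stimulus A approaches 1 before the reversal and decays toward 0 afterwards, while the value of stimulus B does the opposite; a smaller `alpha` slows both changes.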
Genetic and systemic manipulations targeting both serotonin and dopamine modulate reversal learning (Mehta et al., 2001; Kruzich and Grandy, 2004; Izquierdo et al., 2006, 2007; Lee et al., 2007; Lapiz-Bluhm et al., 2009; Brigman et al., 2010), but the localization of these effects is unknown. Within the OFC of marmoset monkeys, anatomically and neurochemically selective depletions of serotonin, but not dopamine, severely impair reversal learning performance (Clarke et al., 2007). In contrast, some studies suggest a critical role for striatal dopamine in reversal learning. For example, methylphenidate-induced dopamine release in the human medial striatum correlates with reversal learning performance (Clatworthy et al., 2009), and those humans carrying a polymorphism that reduces striatal dopamine D2 receptor expression show reduced ventral striatal activation during probabilistic reversals (Jocham et al., 2009). In addition, dopamine depletion of the rat dorsomedial striatum impaired reversal learning in one study (O'Neill and Brown, 2007), although selective catecholaminergic lesions of the marmoset caudate nucleus were without effect (Collins et al., 2000; Crofts et al., 2001). However, whether serotonin at the level of the striatum contributes to reversal learning is unknown.
We therefore sought to compare directly the roles of striatal serotonin and dopamine in serial visual discrimination reversal learning by investigating the behavioral effects of selective dopaminergic and serotonergic depletions of the marmoset medial caudate nucleus.
Materials and Methods
Subjects and housing.
Ten common marmosets (Callithrix jacchus; five females, five males), bred on site at the University of Cambridge Marmoset Breeding Colony, were housed in pairs. All monkeys were fed 20 g of MP.E1 primate diet (Special Diet Services/SDS) and two pieces of carrot 5 d a week after the daily behavioral testing session, with simultaneous access to water for 2 h. On weekends, their diet was supplemented with fruit, rusk, malt loaf, eggs, treats, and marmoset jelly (SDS) and they had ad libitum access to water. Their cages contained a variety of environmental enrichment aids that were regularly varied and all procedures were performed in accordance with the UK Animals (Scientific Procedures) Act 1986.
Apparatus.
Behavioral testing took place within a sound-attenuated box in a dark room. As described previously (Clarke et al., 2008), the animal sat in a clear, plastic transport box, one side of which was removed to reveal a color computer monitor (Samsung). The marmoset reached through an array of vertical metal bars to touch stimuli presented on the monitor and these responses were detected by an array of infrared beams (Interact 415; Intasolve) attached to the screen. A reward of cooled banana milkshake (Nestlé) was delivered to a centrally placed spout for 5 s. Presentation of reward was signaled by a 4 kHz tone played through loudspeakers located to the left and right of the monitor and was dependent upon the marmoset licking the spout to trigger a peristaltic pump that delivered the milkshake. The test chamber was lit with a 3 W bulb. The stimuli presented on the monitor were abstract, multicolored visual patterns (32 mm wide × 50 mm high), which were displayed to the left and right of the central spout. The stimuli were presented using the Whisker control system (Cardinal and Aitken, 2001) running MonkeyCantab (designed by Roberts and Robbins, version 3.6) (Cardinal, 2007), which also controlled the apparatus and recorded responding.
Behavioral training and testing.
All monkeys were trained initially to enter a clear plastic transport box for marshmallow reward and familiarized with the testing apparatus. Monkeys then received the following sequence of training: familiarization with a milkshake reward, learning a tone-reward contingency, and responding on the touchscreen until they were reliably and accurately making 30 responses or more to a square stimulus presented to the left and right of the licker in a 20 min period (for full experimental details, see Roberts et al., 1988). After behavioral training, the marmosets proceeded onto serial visual discrimination reversal learning.
As described previously (Clarke et al., 2004, 2008), serial reversal learning consisted of two-choice discriminations between abstract, colored patterns (Fig. 1). For all discriminations, two stimuli were presented to the left and right of the center of the screen. A response to the correct stimulus resulted in the incorrect stimulus disappearing from the screen, and the correct stimulus remaining present for the duration of a 5 s tone that signaled the availability of 5 s of reinforcement. Failure to collect the reward was scored as a missed reinforcement. Following a response to the incorrect stimulus, both stimuli disappeared from the screen and a 5 s timeout period ensued during which the house light was extinguished. The intertrial interval was 3 s and, within a session, the stimuli were presented equally to the left and right sides of the screen. Each monkey was presented with 30 trials per day, 5 d a week, and progressed to the next discrimination (described in detail below) after attaining a criterion of 90% correct in the immediately preceding session. If a monkey showed a significant side bias (10 consecutive responses to one side), a rolling correction procedure was implemented whereby the correct stimulus was presented on the nonpreferred side until the monkey had made a total of three correct responses.
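The session-level rules described above (30 trials per session, a 90% criterion, and a side bias defined as 10 consecutive responses to one side) can be sketched as follows. This is a hypothetical illustration, not the MonkeyCantab implementation; the function names and data layout are assumptions.

```python
TRIALS_PER_SESSION = 30  # trials per daily session, from the text
CRITERION = 0.90         # proportion correct required in one session
SIDE_BIAS_RUN = 10       # consecutive responses to one side


def reached_criterion(outcomes):
    """outcomes: list of booleans (True = correct) for one session."""
    return len(outcomes) > 0 and sum(outcomes) / len(outcomes) >= CRITERION


def has_side_bias(sides):
    """sides: list of 'L'/'R' responses; True if any run reaches SIDE_BIAS_RUN."""
    run = 0
    previous = None
    for side in sides:
        run = run + 1 if side == previous else 1
        previous = side
        if run >= SIDE_BIAS_RUN:
            return True
    return False


# 27 correct responses out of 30 is exactly 90% and therefore meets criterion.
session = [True] * 27 + [False] * 3
print(reached_criterion(session), has_side_bias(["L"] * 10))
```

The correction procedure itself (presenting the correct stimulus on the nonpreferred side until three correct responses are made) is left out of the sketch because its trial-by-trial bookkeeping is not specified in detail here.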
Figure 1. A schematic diagram illustrating the sequence of visual discriminations presented during the study and their occurrence relative to surgery. For each discrimination, correct and incorrect exemplars are indicated by the + and −, respectively. Actual stimuli were multicolored.
All animals received acquisition of a novel discrimination (D1), then acquisition of a second novel discrimination (D2). After attainment of criterion on D2, animals were separated into groups. They then received infusions of a dopaminergic (n = 4) or serotoninergic (n = 3) neurotoxin into the medial head of the caudate nucleus or a sham-operated control procedure (n = 3). After 2 weeks' recovery, they received retention of D2. Next, they received acquisition of a third novel discrimination (D3). From this stage onwards, the stimulus contingencies were counterbalanced to prevent differences in performance being an artifact of any innate biases in stimulus preference. Finally, the animals received a series of seven discrimination reversals (reversals 1–7); on each reversal, the previously correct stimulus became incorrect and the previously incorrect stimulus became correct.
Behavioral measures.
The main measure of the monkeys' performance on the visual discriminations was the total number of errors made before achieving the criterion of ≥90% correct in one session (excluding the criterion session) on each discrimination. Additional measures recorded for each trial were the latency to respond to the stimuli presented on the monitor (response latency), the latency to collect the reward from the spout (lick latency), and the left/right location of the response. In addition, the type of errors that were made during the reversal were classified as perseverative (where responding to the previously correct stimulus was significantly above chance), chance, or learning (where responding to the newly correct stimulus was at or above chance). Signal detection theory (Macmillan and Creelman, 1991) was used to establish subjects' ability to discriminate correct from incorrect stimuli independently of any side bias that might have been present. The discrimination measure d′ and the bias measure c were calculated and the normal cumulative distribution function (CDF) was compared with the criterion values of a two-tailed Z test (each tail, p = 0.05) to determine the classification of each 15 trial half-session (perseveration, chance, or learning). Sessions where CDF(d′) < 0.05 were classified as perseverative; sessions where CDF(d′) > 0.95 were classified as learning, and sessions where 0.05 ≤ CDF(d′) ≤ 0.95 were classified as chance. Sessions where CDF(c) < 0.025 or CDF(c) > 0.975 were considered biased, but were not excluded as d′ was still a valid measure of discrimination (Clarke et al., 2004). Days on which subjects attained criterion were excluded, as were the errors from half-sessions in which the monkey was on the same correction procedure trial for the entire block of trials and d′ could not be calculated. 
However, the number of errors excluded across the seven reversals was equivalent for all three groups (square root transformed data, group, F(2,7) = 2.013, p = 0.204; reversal × group, F(12,42) = 1.578, p = 0.136).
Statistics.
The behavioral results were subjected to ANOVA using SPSS v16 (SPSS). ANOVA models are in the form A3 × (C3 × S), where A is a between-subject factor with three levels (lesion group), C is a within-subjects factor of error type with three levels (perseveration/chance/learning), and S represents subjects (Keppel, 1991). Where raw data did not display homogeneity of variance, they were transformed appropriately (Howell, 1997). A Huynh–Feldt correction was used to adjust the degrees of freedom if sphericity could not be assumed and post hoc comparisons were made using Fisher's protected least significant differences test (LSD; performing three uncorrected pairwise tests following a significant one-way ANOVA with three groups), the most powerful test in this context (Howell, 1997). Behavioral data were analyzed in the following blocks: (1) presurgical discriminations (D1 and D2), (2) postsurgical discriminations (D2 retained and D3 acquired), and (3) the serial reversals (reversals 1–7). For the lesion data, depletion levels were analyzed using two-tailed one sample t tests (comparing the percentage depletion to zero/no depletion).
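The Huynh–Feldt adjustment multiplies both the effect and the error degrees of freedom by the sphericity estimate ε̃. A minimal sketch, using the ε̃ = 0.634 reported in the Results for the total-errors analysis and the unadjusted degrees of freedom of 6 (reversal), 12 (reversal × group), and 42 (error); the function name is hypothetical.

```python
def huynh_feldt_df(epsilon, df_effect, df_error):
    """Scale both degrees of freedom by the sphericity estimate epsilon."""
    return epsilon * df_effect, epsilon * df_error


# Seven reversals give 6 df for reversal and 12 df for reversal x group;
# with 7 between-subject error df, the within-subject error term has 6 * 7 = 42 df.
df_reversal, df_error = huynh_feldt_df(0.634, 6, 42)
df_interaction, _ = huynh_feldt_df(0.634, 12, 42)
print(round(df_reversal, 1), round(df_interaction, 1))  # prints 3.8 7.6
```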
Surgical procedure.
Subjects were premedicated with ketamine hydrochloride (0.05 ml of a 100 mg/ml solution, i.m.; Pharmacia and Upjohn), given a 24-h prophylactic analgesic (Rimadyl, 0.03 ml of 50 mg/ml carprofen, s.c.; Pfizer), and then intubated and maintained on isoflurane gas anesthetic (flow rate: 2.5% isoflurane in 0.2 l/min O2; Novartis), before being placed in a stereotaxic frame specially modified for the marmoset (David Kopf). Anesthesia was closely monitored clinically and by pulse oximetry.
Anatomically defined lesions of the caudate nucleus were targeted toward the medial head of the caudate nucleus [the area that preferentially receives input from the orbitofrontal cortex in the marmoset (Roberts et al., 2007)] and achieved using stereotaxic injections of dopaminergic or serotoninergic neurotoxins (Sigma) at carefully defined coordinates (Table 1). These coordinates were individually adjusted where necessary in situ to take into account the individual differences in brain size, as described previously (Dias et al., 1996). All injections were made in one stage of surgery using a pulled glass cannula attached to a 2 μl Hamilton syringe (Hamilton) at the rate of 0.04 μl/20 s. Sham surgery (n = 3) was identical except for the omission of any toxin from the infusion.
Table 1. Lesion parameters including the stereotaxic coordinates of each injection (based on the interaural plane), the injection volume, and toxin details
Lesions of the serotoninergic innervation of the medial head of the caudate nucleus were made using 5,7-dihydroxytryptamine (5,7-DHT; 4 μg/μl; Sigma) in saline/0.1% L-ascorbic acid. To protect the noradrenaline (NA) and dopamine (DA) innervation, the NA uptake blocker nisoxetine (50 mM; Sigma) and the DA uptake blocker GBR-12909 (2.0 mM; Sigma), respectively, were administered concomitantly in the infusate. Lesions of the dopaminergic innervation of the medial caudate were made using 6-hydroxydopamine (6-OHDA; 6 μg/μl; Sigma) in saline/0.1% L-ascorbic acid. To protect the serotoninergic innervation of the medial caudate from the 6-OHDA, the selective serotonin reuptake inhibitor citalopram (5 mg/kg s.c.; Lundbeck) was administered concomitantly in the infusate. Pilot lesion data (data not shown) suggested that NA protection was not necessary, partly because of the very sparse noradrenergic innervation of the caudate (Arakawa et al., 2008). Postoperatively, all monkeys received the analgesic Metacam (meloxicam, 0.1 ml of a 1.5 mg/ml oral suspension; Boehringer Ingelheim), and complete recovery was assured before they were returned to their home cage for 10 d of weekend diet and water ad libitum, after which they resumed experimental testing.
Postmortem neurochemical assessment.
The specificity and extent of the selective serotonin (5-HT) and DA depletions of the caudate nucleus were assessed by postmortem tissue analysis of monoamine levels in cortical and subcortical regions 218–335 d after administration of the neurotoxin, as described previously (Clarke et al., 2004). As we have previously shown that 6-OHDA lesions of the striatum show considerable recovery over time, three additional animals received unilateral DA lesions using the same surgical procedures described earlier, and were assessed for postmortem cortical and subcortical monoamine levels at 10, 95, and 141 d postoperatively. Tissue samples were homogenized in 200 μl of 0.2 M perchloric acid for 1.5 min and centrifuged at 6000 rpm for 20 min at 4°C. The supernatant (75 μl) was subsequently analyzed using reversed phase high-performance liquid chromatography (HPLC) and electrochemical detection, as described previously (Clarke et al., 2005).
Results
Neurochemical analysis of postmortem tissue from monkeys with DA or 5-HT medial striatal depletions
Striatal DA depletion
Our previous work has shown that 6-OHDA-induced dopaminergic depletions in the striatum do show recovery across time. Therefore, for long-term studies such as this, it was important to obtain the time course of DA depletion, including the time point that corresponds to when the major behavioral deficits were observed. Consequently, the dopamine depletion resulting from injections of 6-OHDA into the medial head of the caudate nucleus of marmosets was assessed at 10, 95, and 141 d postsurgery. This revealed substantial, selective, dopaminergic depletions of 79% (10 d), 98.94% (95 d), and 44.54% (141 d) in the medial head of the caudate. These findings confirm that the levels of DA were indeed starting to recover at 141 d, but, more importantly, clearly indicate that the period when the main behavioral observations of this study were made (i.e., reversals 1–3) coincides with very high, sustained levels of medial caudate dopamine depletion (Fig. 2). As predicted by the partial recovery of the dopamine depletion seen at 141 d, injections of 6-OHDA into the medial head of the caudate nucleus did not result in a significant reduction of medial caudate dopamine when measured an average of 271 d after surgery (14.67 ± 11.34%) (Table 2). Despite this, significant dopaminergic decreases were seen in the anterior cingulate (44.87 ± 12.5%; t(3) = 3.578, p = 0.037), midcingulate (35.86 ± 7.29%; t(2) = 4.921, p = 0.039), and anterior parietal cortices (41.63 ± 0.51%; t(2) = 81.409, p < 0.001) at this time point. This pattern is probably due to regional variation in recovery from the effects of the neurotoxin and is considered further in the Discussion below.
Figure 2. Postmortem depletions of DA in the medial head of the caudate as a function of time since surgery in DA-depleted monkeys. The gray region indicates the time period in which reversals (rev) 1, 2, and 3 were completed by the dopamine-depleted monkeys. The horizontal lines represent the maximum duration of each reversal (extending from the earliest starting point to the latest endpoint and thus reflecting the quickest and slowest learning monkeys, respectively), and the vertical marks on the lines represent the mean time points for the beginning and end of the reversal. Thus, the periods where the maximal behavioral impairment is seen correspond to high levels of medial caudate dopamine depletion. a, b, and c, Pilot lesions 10, 95, and 141 d postsurgery, respectively; d, termination of current behavioral study. Inset, Dopamine depletions in all striatal regions for monkeys a–c.
Table 2. Mean percentage depletions of 5-HT, DA, and NA (± SEM) in the striatum and anterior cortices of marmosets with 5,7-DHT or 6-OHDA infusions into the caudate nucleus
No significant depletions were observed in serotonin, indicating that citalopram was successful at protecting 5-HT.
Striatal 5-HT depletions
Infusions of 5,7-DHT into the marmoset medial caudate nucleus resulted in a significant decrease in the levels of 5-HT in both the medial head (66.68 ± 12.1%; t(2) = 5.529, p = 0.031) and body of the caudate nucleus (34.98 ± 3.43%; t(2) = 10.185, p = 0.010) an average of 252 d after surgery (Table 2). The DA and NA reuptake blockers GBR-12909 and nisoxetine successfully prevented any alterations in DA and NA within these regions. No other regions showed alterations in 5-HT levels apart from the medial cingulate (58.07 ± 11.7%; t(2) = 0.121, p = 0.037). Whereas the adjacent lateral head of the caudate nucleus showed no depletions of 5-HT, there was a significant increase in DA levels (41.6 ± 6.95%; t(2) = 5.985, p = 0.027) and a significant decrease in NA (60.36 ± 11.9%; t(2) = 5.071, p = 0.037).
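The one-sample t tests used for the depletion data compare a mean percentage depletion with zero, so the statistic is the mean divided by its SEM. The sketch below recomputes the medial head value from the rounded figures above and obtains a two-tailed p value by numerical integration of the Student t density. It is an illustration only, not the SPSS procedure used in the study, and the function names are hypothetical.

```python
import math


def t_from_mean_and_sem(mean, sem):
    """One-sample t statistic against zero: t = mean / SEM."""
    return mean / sem


def two_tailed_p(t, df, steps=20000):
    """Two-tailed p value for Student's t, by Simpson integration of the pdf."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central_area = 2 * total * h / 3  # probability mass in [-|t|, |t|]
    return max(0.0, 1.0 - central_area)


t_medial_head = t_from_mean_and_sem(66.68, 12.1)  # rounded mean and SEM from the text
print(round(t_medial_head, 2))                    # prints 5.51
print(round(two_tailed_p(5.529, 2), 3))           # reported t(2) for the medial head
```

The recomputed statistic (5.51) differs slightly from the reported t(2) = 5.529 because the mean and SEM in the text are rounded.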
Behavioral results: effects of striatal DA and 5-HT depletion on serial reversal learning
Preoperative discrimination behavior
Preoperatively, the three groups of monkeys did not differ in their ability to learn two novel visual discriminations (D1 and D2; group and group × discrimination, F values <1) (Table 3).
Table 3. Prereversal discrimination performance
Postoperative discrimination behavior
Postoperatively, there were no significant differences in the ability of the three groups of monkeys to remember a previously learnt discrimination, or to learn a third novel discrimination (D2 retention and D3 acquisition; group and group × discrimination, F values <1). However, there was a main effect of discrimination (F(1,7) = 20.379, p = 0.003) representing the ease with which all monkeys performed D2 retention compared with D3 acquisition.
Serial reversals
It can be seen in Figure 3A that DA medial caudate-depleted monkeys made many more errors across the series of seven reversals than both 5-HT medial caudate-depleted monkeys and control monkeys. Repeated-measures ANOVA on the overall (total) errors to criterion across reversals (1–7) revealed significant main effects of group (F(2,7) = 7.005, p = 0.021) and reversal (F(3.8,26.7) = 7.091, ε̃ = 0.634, p = 0.001) but no reversal × group interaction (F(7.6,26.7) = 1.736, ε̃ = 0.634, p = 0.139). Post hoc analysis using Fisher's LSD test showed that control monkeys did not differ from 5-HT-depleted monkeys (p = 0.893), whereas dopamine-depleted monkeys showed significantly worse performance than both controls (p = 0.014) and 5-HT-depleted monkeys (p = 0.018) (Fig. 3B). Analysis of reversal 1 independently revealed a main effect of group (F(2,7) = 4.974, p = 0.045) that was due to monkeys with DA medial caudate depletions making significantly more errors than control monkeys (p = 0.017) but not 5-HT-depleted monkeys (p = 0.118), the latter not differing from controls (p = 0.252).
Figure 3. Serial discrimination reversal performance. A, Total errors to criterion for each reversal (rev). Repeated-measures ANOVA revealed a main effect of group (p = 0.021) attributable to an increased number of errors in the dopamine-depleted group. B, Total errors, collapsed across reversals. *p < 0.05. C, Mean perseveration, chance, and learning errors, showing that the increase in errors shown by the dopamine-depleted monkeys was not due to a specific error type (error type × group interaction, F < 1).
As described previously, such a gross analysis can be insensitive to changes within reversals, and can fail to detect more subtle changes, such as the perseverative responding seen after excitotoxic lesions of the medial caudate/nucleus accumbens in marmosets (Clarke et al., 2008). We therefore used signal detection theory to classify errors as perseverative, chance, or learning (see Materials and Methods, above), and performed additional analyses to investigate whether the different lesions preferentially caused distinct error types.
Despite their overall reversal impairment, monkeys with dopamine depletions in the medial caudate did not show a preponderance of any particular error type. Their impairment was due to an overall increase in all three error types (Fig. 3C). Repeated-measures ANOVA revealed main effects of reversal (F(6,42) = 6.945, p < 0.001), error type (F(2,14) = 12.604, p = 0.001), and group (F(2,7) = 6.638, p = 0.024) but no reversal × group (F(12,42) = 1.780, p = 0.084) or error type × group (F < 1) interactions. Post hoc analysis of this group effect confirms the findings of the total error analysis (controls vs 5-HT lesions, p = 0.899; controls vs dopamine lesions, p = 0.016; 5-HT lesions vs dopamine lesions, p = 0.020). Independent analysis of reversal 1 revealed no group interaction with error type (F < 1).
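The between-group F tests reported here have two numerator degrees of freedom (three groups), in which case the upper-tail p value has the closed form p = (1 + 2F/df_error)^(−df_error/2). A minimal sketch that recomputes two of the reported group effects; the function name is hypothetical, and the formula is exact only when the numerator degrees of freedom equal 2.

```python
def p_value_f_two_numerator_df(f_stat, df_error):
    """Upper-tail p value for F(2, df_error); exact only when the numerator df is 2."""
    return (1.0 + 2.0 * f_stat / df_error) ** (-df_error / 2.0)


# Group effect on total errors to criterion, F(2,7) = 7.005
print(round(p_value_f_two_numerator_df(7.005, 7), 3))  # prints 0.021
# Group effect in the error-type analysis, F(2,7) = 6.638
print(round(p_value_f_two_numerator_df(6.638, 7), 3))  # prints 0.024
```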
Latencies
At no point during the serial reversal paradigm were there any group differences in the latencies to make correct or incorrect responses (correct/incorrect × group, reversal × group, correct/incorrect × reversal × group; all Fs < 1). Although a trend toward a main effect of group was seen (F(2,7) = 3.768, p = 0.077), post hoc LSD analysis revealed that this was due to a significant difference between the 5-HT and DA-depleted animals (p = 0.031), but not to any significant differences between depleted animals and controls (5-HT vs controls, p = 0.105; DA vs controls, p = 0.502). Moreover, this effect is due to just one 5-HT-depleted animal who uniformly responded very slowly, and, given that the group only contained three animals, had a disproportionate effect on the group mean.
Discussion
This study provides insights into the neurochemical modulation of circuits subserving the behavioral flexibility inherent in discrimination reversal tasks. Selective dopamine depletion from the medial head of the marmoset caudate nucleus resulted in significantly more errors while performing a series of reversals. In contrast, selective serotoninergic depletion from this region had no effect on performance.
This neurochemical dissociation between the roles of serotonin and dopamine within the medial head of the caudate nucleus is the opposite of that seen in the OFC. The marmoset OFC has extensive connections with this striatal region (Roberts et al., 2007) and, like the caudate nucleus, contributes to successful discrimination reversal performance. However, within the OFC, selective dopaminergic depletions had no effect on reversal learning, whereas selective serotonergic depletions caused marked impairment (Clarke et al., 2004, 2007).
The current finding of a profound nonperseverative reversal learning deficit after caudate dopamine depletion is consistent with previous findings in rodents and humans that implicate striatal dopamine in reversal learning (Lee et al., 2007; O'Neill and Brown, 2007; Dodds et al., 2008; Clatworthy et al., 2009). In contrast to the current findings, previous marmoset striatal dopaminergic depletions had no effect on individual reversals embedded in an attentional set-shifting task (Collins et al., 2000). However, although nonsignificant when compared against controls and subjects with prefrontal dopaminergic lesions (Crofts et al., 2001), such subjects nevertheless displayed the most errors when reversing a simple discrimination (square-root transformed data ± SEM: controls, 11.96 ± 1.58; caudate DA depletions, 17.54 ± 2.3; prefrontal DA depletions, 14.3 ± 1.47).
Intriguingly, the impairments seen after dopamine striatal depletion in marmosets (current study) and rats (O'Neill and Brown, 2007) were not due to perseveration. This contrasts strongly with the perseveration seen after OFC 5-HT manipulations (Clarke et al., 2004, 2005, 2007), excitotoxic striatal lesions (Clarke et al., 2008), and systemic administration of amphetamine (Ridley et al., 1981a,b) in marmosets performing reversal learning. We speculate that striatal dopaminergic inactivation (pharmacologically or via selective dopaminergic lesions) causes nonperseverative impairments, whereas excessive dopaminergic activation may lead to perseveration (but see Ersche et al., 2008). Indeed, activation of the caudate nucleus, and specifically caudate dopamine D2/3 receptors, is associated with perseverative responding after contingency change (Clatworthy et al., 2009; Jocham et al., 2009; K. Ersche, J. Roiser, S. Abbott, K. Craig, U. Muller, J. Suckling, C. Ooi, S. Shabbir, L. Clark, B. Sahakian, N. Fineberg, E. Merlo-Pich, T. Robbins, and E. Bullmore, unpublished data), and variations in baseline striatal dopamine synthesis capacity can modulate the outcome of dopaminergic manipulations on probabilistic reversal learning. Thus, Cools et al. (2009) have shown that the D2 agonist bromocriptine has beneficial effects in subjects with low compared with high baseline striatal dopamine synthesis, raising the possibility that activation of striatal D2 receptors may have ameliorated the current reversal deficits. 
However, blockade, deletion, and activation of D2-like receptors have all been shown to impair reversal learning (Ridley et al., 1981a; Smith et al., 1999; Mehta et al., 2001; Izquierdo et al., 2006; Lee et al., 2007; Cools et al., 2009), and complex interactions between the heterogeneous striatal distribution of dopamine receptors, the tonic and phasic release of dopamine, baseline dopamine levels at the striatal synapse, and serotonergic modulation of dopamine release via 5-HT-1c, -2a/c, -3, and -4 receptor subtypes, may all determine the contribution of striatal dopamine to behavioral flexibility (Porras et al., 2002; Alex et al., 2005; Goto and Grace, 2007; Goto et al., 2007; Lee et al., 2007; Cools et al., 2009; Navailles and De Deurwaerdère, 2010).
There is also extensive evidence linking serotonin to behavioral flexibility processes. A variety of chemical agents that reduce neural 5-HT, including 5,7-DHT (Clarke et al., 2004), parachloroamphetamine (Masaki et al., 2006), and parachlorophenylalanine (Lapiz-Bluhm et al., 2009; but see Brigman et al., 2010) disrupt reversal learning, as does chronic cold stress (Lapiz-Bluhm et al., 2009) and subchronic PCP (Abdul-Monim et al., 2003). Furthermore, the reversal learning impairments induced by chronic cold stress and subchronic PCP can be ameliorated by systemic drugs that increase serotonergic function (Abdul-Monim et al., 2003; McLean et al., 2009; Danet et al., 2010). In addition, pharmacological or genetic inactivation of the 5-HT transporter, as well as polymorphisms in the promoter (5-HTTLPR) and the 3′ untranslated regions, all modulate reversal learning (Izquierdo et al., 2007; Vallender et al., 2009; Brigman et al., 2010). The particular importance of the OFC in 5-HT's modulatory effects on reversal learning is illustrated by the perseverative reversal impairments that follow local OFC 5-HT depletion in marmosets (Clarke et al., 2004, 2007), the correlations between 5-HT levels in the OFC and reversal performance (Masaki et al., 2006), and the rise of extracellular 5-HT during reversal performance in rats (Lapiz-Bluhm et al., 2009). Neuroimaging studies of acutely tryptophan-depleted humans during reversal learning also implicate regions of the medial and orbitofrontal prefrontal cortex (PFC) (Cools et al., 2005, 2008; Evers et al., 2005, 2010; van der Veen et al., 2007). Together with the present findings, these results suggest that the ventromedial PFC, but not the caudate nucleus, is implicated in the serotonergic modulation of reversal learning.
The role of frontostriatal 5-HT is clinically relevant. Functional imaging studies typically show increased OFC metabolism in obsessive-compulsive disorder (OCD) patients compared with healthy controls, which normalizes after successful selective serotonin reuptake inhibitor (SSRI) treatment (Saxena et al., 1998; Brody et al., 1999; Saxena and Rauch, 2000). In addition, patients and their first-degree relatives both show reversal learning-related OFC hypofunction, suggesting that OFC activity may represent an endophenotype for individuals at increased genetic risk of OCD (Chamberlain et al., 2008). Despite this, there is limited evidence for reversal learning deficits per se in OCD patients (Remijnse et al., 2006, 2009), although evidence that the OFC and 5-HT are implicated in some forms of compulsive behavior can be found in the signal attenuation paradigm, a proposed model of OCD (Joel and Avisar, 2001; Joel et al., 2005; Flaisher-Grinberg et al., 2008). In this model, compulsive responding induced by excitotoxic OFC lesions is accompanied by a decreased density of striatal 5-HT and the presynaptic striatal 5-HT transporter (Joel et al., 2005; Schilman et al., 2010). Altered striatal, specifically caudate, activity is well documented in OCD (Saxena et al., 1998, 1999; Remijnse et al., 2006) and it has been speculated that decreased striatal 5-HT may mediate the increased compulsive-like behavior seen after OFC lesions (Schilman et al., 2010), perhaps due to striatal serotonergic [or dopaminergic (Joel and Doljansky, 2003; Denys et al., 2004)] receptor upregulation, e.g., 5-HT2a (Adams et al., 2005). The present finding that caudate 5-HT depletion had no effect on reversal learning does not support this hypothesis.
However, it remains possible that striatal 5-HT assumes a greater role when the OFC is compromised, a premise supported by evidence that striatal SSRI infusion abolishes the increased compulsivity seen in OFC-lesioned rats, but has no effect in controls (Joel et al., 2005).
Although we conclude that dopamine rather than serotonin is important for mediating reversal learning in the caudate nucleus, an important caveat in the present study is the depletion observed in the cingulate cortex and the anterior parietal cortex in monkeys receiving intracaudate 6-OHDA. However, evidence for a role of the anterior cingulate cortex in reversal learning is inconsistent (Bussey et al., 1997; Meunier et al., 1997; Schweimer and Hauber, 2005; Ragozzino and Rozman, 2007), and evidence concerning the parietal cortex is scant (Fox et al., 2003; Chamberlain et al., 2008). It should also be noted that the unilaterally lesioned monkeys used for the timeline analysis (Fig. 2) showed nucleus accumbens dopamine depletions ranging from 25 to 60%. However, there is little evidence for nucleus accumbens involvement in serial visual reversal learning (Clarke et al., 2008), and we think it unlikely that this contributed to the current behavioral deficits. Nevertheless, the impact of selective dopaminergic depletions of these areas on serial reversal learning performance needs to be evaluated before the possibility of their involvement is excluded.
To conclude, these data suggest that reversal learning, at the level of the caudate nucleus and OFC, is differentially regulated by dopamine and serotonin, respectively. These data are not easily explained by existing reinforcement models of dopamine and 5-HT (Boureau and Dayan, 2011; Cools et al., 2011), which do not account for the actions of these neuromodulators at different levels of the neural hierarchy. However, we and others have shown the differential contribution of dopamine to cognition within the PFC and caudate nucleus (Collins et al., 2000; Crofts et al., 2001; Cools, 2008; Dodds et al., 2008), which is consistent with proposals that prefrontal dopamine stabilizes goal-relevant representations (Cohen and Servan-Schreiber, 1993; Durstewitz et al., 2000; Robbins and Roberts, 2007), whereas caudate dopamine promotes cognitive switching (for review, see van Schouwenburg et al., 2010). We would argue that simple discrimination reversal learning only taxes the latter. In contrast, although we have shown that orbitofrontal 5-HT contributes to attentional saliency (Walker et al., 2008), its role at the striatal level is poorly understood, and its overall contribution to the OFC–striatal circuit requires further investigation.
Footnotes
This work was supported by a Wellcome Trust programme grant (to T.W.R. and A.C.R.) and conducted within the University of Cambridge Behavioural and Clinical Neuroscience Institute, supported by a joint award from the Medical Research Council and the Wellcome Trust. H.F.C. is supported by a Network Grant from the J. McDonnell Foundation and a Junior Research Fellowship from Newnham College, Cambridge. Funding to pay the open access publication charges for this article was provided by the Wellcome Trust. We thank Jing Xia for the HPLC analysis.
- Correspondence should be addressed to Hannah Clarke, Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom. hfc23@cam.ac.uk