The ability to switch responding between two visual stimuli based on their changing relationship with reward is dependent on the orbitofrontal cortex (OFC). OFC lesions in humans, monkeys, and rats disrupt performance on a common test of this ability, the visual serial discrimination reversal task. This finding is of particular significance to our understanding of psychiatric disorders such as obsessive–compulsive disorder (OCD) and schizophrenia, in which behavioral inflexibility is a prominent symptom. Although OFC dysfunction can occur in these disorders, there is considerable evidence for more widespread dysfunction within frontostriatal and frontoamygdalar circuitry. Because the contribution of these subcortical structures to behavioral flexibility is poorly understood, the present study compared the effects of excitotoxic lesions of the medial striatum (MS), amygdala, and OFC in the marmoset monkey on performance of the serial reversal task.
All monkeys were able to learn a novel stimulus–reward association but, compared with both control and amygdala-lesioned monkeys, those with MS or OFC lesions showed a perseverative impairment in their ability to reverse this association. However, whereas both MS and OFC groups showed insensitivity to negative feedback, only OFC-lesioned monkeys showed insensitivity to positive feedback. These findings suggest that, for different reasons, both the MS and OFC support behavioral flexibility after changes in reward contingencies, and are consistent with the hypothesis that striatal and OFC dysfunction can contribute to pathological perseveration.
Cognitive flexibility allows animals, including humans, to adapt rapidly to environmental change, and includes mechanisms that detect changes in the relationship between environmental stimuli (or our actions) and rewarding/punishing outcomes, and modify behavior accordingly. Deficits in these mechanisms are a prominent symptom of numerous psychiatric disorders including obsessive–compulsive disorder (OCD) and schizophrenia, disorders associated with dysfunction in frontostriatal and frontoamygdalar circuitry. One test commonly used to measure cognitive flexibility is the serial visual discrimination reversal task, in which subjects are required to switch their responding repeatedly between one of two visual stimuli based on the stimulus's changing relationship with reward. Neuropsychological studies in humans, monkeys, and rodents have demonstrated that successful reversal learning requires the integrity of the orbitofrontal cortex (OFC) (Butter, 1969; Dias et al., 1996; McAlonan and Brown, 2003; Hornak et al., 2004) but the contribution of associated structures, including the striatum and amygdala, is less well understood.
An early study of electrolytic lesions of the ventrolateral head of the caudate nucleus in macaques implicated this region in object reversal learning (Divac et al., 1967), although the use of electrolytic lesions raised the possibility that damage to fibers of passage, rather than the caudate itself, contributed to the deficit. Humans also show increases in regional cerebral blood flow in the left ventral caudate during visual reversal (Rogers et al., 2000). However, recent animal studies using cell-body-specific excitotoxins have produced variable results depending on the precise lesion location, e.g., nucleus accumbens (Schoenbaum and Setlow, 2003) versus ventral or medial striatum (Ferry et al., 2000), and the nature of the task, e.g., discrimination go/no-go (Ferry et al., 2000) or rule switching (Reading et al., 1991; Block et al., 2007). The contribution of the striatum to visual reversal learning has not been tested using fiber-sparing lesions. Consequently, an investigation into the role of the primate striatum in visual discrimination reversal is timely, particularly one that directly compares the performance of striatal-lesioned monkeys with that of OFC-lesioned monkeys.
There are also inconsistent reports of the amygdala's role in discrimination reversal (Schoenbaum et al., 2003; Izquierdo and Murray, 2007), which may in part be attributable to differences in the task procedures adopted, leading to differences in the types of association used by the animal to guide responding. Therefore, for comparison purposes, the effects of excitotoxic lesions of the medial striatum and amygdala were compared with those of the OFC on serial visual discrimination reversal performance in a New World primate, the common marmoset. To investigate the possibility that any deficits seen in reversal learning were attributable to a failure to extinguish responding to the previously rewarded stimulus, performance was also assessed on a related test of instrumental extinction.
Materials and Methods
Subjects and housing
Thirteen common marmosets (Callithrix jacchus; six females, seven males) bred on site at the University of Cambridge Marmoset Breeding Colony were housed in pairs. All monkeys were fed 20 g of MP.E1 primate diet (Special Diet Services/SDS) and two pieces of carrot 5 d per week after the daily behavioral testing session, with simultaneous access to water for 2 h. At weekends, their diet was supplemented with fruit, rusk, malt loaf, eggs, treats, and marmoset jelly (SDS), and they had ad libitum access to water. Their cages contained a variety of environmental enrichment aids that were regularly varied, and all procedures were performed in accordance with the UK Animals (Scientific Procedures) Act 1986.
Behavioral testing took place within a sound-attenuated box in a dark room. The animal sat in a clear, plastic transport box, one side of which was removed to reveal a color computer monitor (Samsung). The marmoset reached through an array of vertical metal bars to touch stimuli presented on the monitor, and these responses were detected by an array of infrared beams (Intasolve, Interact 415) attached to the screen. A reward of cooled banana milkshake (Nestlé) was delivered to a centrally placed spout for 5 s. Presentation of reward was signaled by a 2 kHz tone played through loudspeakers located behind the monitor and was dependent on the marmoset licking the spout to trigger a peristaltic pump that delivered the milkshake. The test chamber was lit with a 3 W bulb. The stimuli presented on the monitor were abstract, multicolored visual patterns (32 mm wide × 50 mm high; 12 cm apart from the center of the stimuli) that were displayed to the left and right of the central spout. The stimuli were presented using the Whisker control system (Cardinal and Aitken, 2001) running MonkeyCantab [designed by Roberts and Robbins; version 3.6 (Cardinal, 2007)], which also controlled the apparatus and recorded responding.
Behavioral training and testing
All monkeys were trained initially to enter a clear plastic transport box for marshmallow reward and familiarized with the testing apparatus. Monkeys then received the following sequence of training: familiarization with a milkshake reward, learning a tone–reward contingency, and responding on the touchscreen until they were reliably and accurately making 30 responses or more to a square stimulus presented to the left and right of the licker in 20 min. [For full experimental details, see Roberts et al. (1988).] After behavioral training, the marmosets proceeded onto the experimental paradigms.
Serial reversal learning.
As described previously (Clarke et al., 2004), this consisted of two-choice discriminations composed of abstract, colored patterns (see Fig. 1). For all discriminations, a pair of stimuli were presented to the left and right of the center of the screen. A response to the correct stimulus resulted in the incorrect stimulus disappearing from the screen, and the correct stimulus remaining present for the duration of a 5 s tone that signaled the availability of 5 s of reinforcement. Failure to collect the reward was scored as a missed reinforcement. After a response to the incorrect stimulus both stimuli disappeared from the screen, and a 5 s timeout period ensued during which the houselight was extinguished. The intertrial interval was 3 s and within a session the stimuli were presented equally to the left and right sides of the screen. Each monkey was presented with 30 trials per day, 5 d per week and progressed to the next discrimination (described in detail below) after attaining a criterion of 90% correct in the immediately preceding session. If a monkey showed a significant side bias (10 consecutive responses to one side), a rolling correction procedure was implemented whereby the correct stimulus was presented on the nonpreferred side until the monkey had made a total of three correct responses.
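The session criterion and the side-bias check described above can be sketched in a few lines. This is an illustrative sketch only, not the MonkeyCantab implementation: the function names and the encoding of responses (booleans for correct/incorrect, "L"/"R" for side) are assumptions introduced here.

```python
from typing import List

CRITERION_PERCENT = 90   # criterion stated in the text: 90% correct in one session
SIDE_BIAS_RUN = 10       # side bias stated in the text: 10 consecutive responses to one side


def reached_criterion(outcomes: List[bool]) -> bool:
    """Return True if a session (True = correct response) meets the 90% criterion."""
    total = len(outcomes)
    # Integer arithmetic avoids floating-point comparison at exactly 90%.
    return total > 0 and sum(outcomes) * 100 >= CRITERION_PERCENT * total


def has_side_bias(sides: List[str], run_length: int = SIDE_BIAS_RUN) -> bool:
    """Return True if any run of `run_length` consecutive responses went to one side."""
    run = 0
    previous = None
    for side in sides:
        run = run + 1 if side == previous else 1
        previous = side
        if run >= run_length:
            return True
    return False
```

With 30 trials per session, the criterion corresponds to at least 27 correct responses; a detected side bias would trigger the rolling correction procedure.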
All animals received the following series of discriminations: (1) acquisition of a novel discrimination (D1); and (2) acquisition of a second novel discrimination (D2).
After attainment of criterion on D2, animals were separated into groups. They then underwent an amygdala lesion (n = 3), a medial striatal lesion (n = 3), an OFC lesion (n = 3) or a sham operation (n = 4). After 2 weeks' recovery, they received the following series of discriminations: (1) Retention of D2. (2) Acquisition of a third novel discrimination (D3). From this stage on, the stimulus contingencies were counterbalanced to prevent differences in performance being an artifact of any innate biases in stimulus preference. (3) A series of four discrimination reversals, whereby after each reversal, the previously correct stimulus became incorrect and the previously incorrect stimulus became correct (reversals 1–4).
After completion of the reversal paradigm, all monkeys progressed onto an extinction paradigm. This consisted of the following stages (see Fig. 1B): (1) acquisition of a novel, single stimulus–reward contingency; and (2) extinction of this stimulus–reward contingency.
During acquisition, the task parameters were identical to those of the discriminations in the reversal paradigm except that on any one trial, a single stimulus was presented pseudorandomly to the left or right of the center of the screen. A response to this stimulus was rewarded. Once monkeys had shown accurate, fast performance of this task for 2 consecutive days, extinction commenced the next day. From then on, responses to the stimulus led to disappearance of the stimulus and no reward. Each session of extinction terminated when the monkey had made no response for 5 min. Extinction of responding was considered to have occurred after 2 consecutive days of ≤10 responses.
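The extinction criterion (2 consecutive days of ≤10 responses) can be expressed as a short check over daily response counts. The following is a minimal sketch; the function name and the list-of-counts input are hypothetical and are not taken from the testing software.

```python
from typing import List, Optional


def extinction_day(responses_per_day: List[int],
                   max_responses: int = 10,
                   run_days: int = 2) -> Optional[int]:
    """Return the 1-based day on which the extinction criterion is met, else None.

    The criterion follows the text: `run_days` consecutive days with at most
    `max_responses` responses each.
    """
    run = 0
    for day, count in enumerate(responses_per_day, start=1):
        run = run + 1 if count <= max_responses else 0
        if run >= run_days:
            return day
    return None
```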
The main measure of the monkeys' performance on the visual discriminations was the total number of errors (including any errors made on correction trials as these did not differ between groups; F > 1, NS) made before achieving the criterion of ≥90% correct in one session (excluding the criterion session) on each discrimination. Additional measures recorded for each trial were the latency to respond to the stimuli presented on the monitor (response latency), the latency to collect the reward from the spout (lick latency) and the left/right location of the response. In addition, signal detection theory (see Macmillan and Creelman, 1991) was used to establish subjects' ability to discriminate correct from incorrect stimuli independently of any side bias that might have been present. The discrimination measure d′ and the bias measure c were calculated and the normal cumulative distribution function (CDF) compared with the criterion values of a two-tailed Z test (each tail p = 0.05) to determine the classification of each 15 trial half-session as perseveration, chance or learning (including correction trials). Half-sessions in which CDF(d′) < 0.05 were classified as perseverative; half-sessions in which CDF(d′) > 0.95 were classified as learning, and half-sessions in which 0.05 ≤ CDF(d′) ≤ 0.95 were classified as chance (Clarke et al., 2004). Errors during perseverative half-sessions were considered perseverative errors, and so on. Days on which subjects attained the criterion were excluded.
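To make the half-session classification concrete, the sketch below computes d′ and c for one half-session and applies the CDF thresholds given above. Several details are assumptions made for illustration and are not stated in the text: a "hit" is taken to be a left response when the correct stimulus was on the left, a "false alarm" a left response when the correct stimulus was on the right, and a log-linear correction (adding 0.5 to counts) is used to keep rates away from 0 and 1. The function names are hypothetical.

```python
from statistics import NormalDist
from typing import Tuple

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1


def _rate(count: int, total: int) -> float:
    """Proportion with a log-linear correction (an assumption, see lead-in)."""
    return (count + 0.5) / (total + 1.0)


def d_prime_and_c(left_when_left_correct: int, n_left_correct: int,
                  left_when_right_correct: int, n_right_correct: int) -> Tuple[float, float]:
    """Return (d', c) for one half-session from left-response counts."""
    z_hit = _STD_NORMAL.inv_cdf(_rate(left_when_left_correct, n_left_correct))
    z_fa = _STD_NORMAL.inv_cdf(_rate(left_when_right_correct, n_right_correct))
    d_prime = z_hit - z_fa          # discrimination, independent of side bias
    c = -(z_hit + z_fa) / 2.0       # bias toward one side
    return d_prime, c


def classify_half_session(d_prime: float, alpha: float = 0.05) -> str:
    """Classify by CDF(d') using the thresholds stated in the text."""
    p = _STD_NORMAL.cdf(d_prime)
    if p < alpha:
        return "perseveration"
    if p > 1.0 - alpha:
        return "learning"
    return "chance"
```

Under these assumptions, a monkey that responds only to the left produces d′ near zero and is classified as chance rather than perseveration, which is the sense in which the measure is independent of side bias.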
The behavioral results were subjected to ANOVA using SPSS version 12.0.1. ANOVA models are in the form A4 × (C3 × S), where A is a between-subject factor with four levels (lesion group) and C is a within-subjects factor of error type with three levels (perseveration/chance/learning); S represents subjects (Keppel, 1991). Where raw data did not display homogeneity of variance, it was transformed appropriately (see Howell, 1997). A Huynh–Feldt correction was used to adjust the degrees of freedom if sphericity could not be assumed and post hoc comparisons were made using simple main effects and homogeneous subset classification using the REGWQ homogeneous subset test as recommended by Howell (1997) and Cardinal and Aitken (2006). This test takes multiple groups and splits them to create homogeneous subsets that are significantly different from each other at the α = 0.05 level, controlling for the familywise error rate. Thus, if subset A is different from subset B, then any group in subset A is significantly different from any group in subset B. Based on these results, a detailed exploration of pairwise differences was made using the Šidák correction for multiple comparisons (Šidák, 1967).
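For reference, the standard form of the Šidák adjustment for m comparisons is p_adj = 1 − (1 − p)^m. The sketch below applies it to a single unadjusted p value; the function name is hypothetical, and the number of comparisons actually entered into each analysis is not stated in the text (four groups would yield six pairwise comparisons).

```python
def sidak_adjust(p: float, m: int) -> float:
    """Return the Šidák-adjusted p value for one of m comparisons."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if m < 1:
        raise ValueError("m must be at least 1")
    return 1.0 - (1.0 - p) ** m
```

For example, an unadjusted p of 0.01 across six comparisons adjusts to approximately 0.0585.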
Subjects were premedicated with ketamine hydrochloride (Pharmacia and Upjohn, 0.05 ml of a 100 mg/ml solution, i.m.), anesthetized with Saffan (alphaxalone 0.9% w/v, alphadolone acetate 0.3% w/v; Schering Plough; 0.4 ml, i.m.) and given a 24 h prophylactic analgesic (Rimadyl; 0.03 ml of 50 mg/ml carprofen, s.c.; Pfizer), before being placed in a stereotaxic frame especially modified for the marmoset (David Kopf). Anesthesia was closely monitored clinically and by pulse oximetry, and maintained with additional doses of Saffan when necessary.
Anatomically defined lesions were achieved using stereotaxic injections of quinolinic acid (Sigma) in 0.01 M phosphate buffer at carefully defined coordinates (see Table 1), which were individually adjusted where necessary in situ to take into account individual differences in brain size as described previously (Dias et al., 1996). All injections were made in one stage of surgery using a 28-gauge cannula attached to a 2 μl Hamilton syringe at the rate of 0.04 μl/20 s. Sham surgery (n = 4) was identical to the relevant excitotoxic lesion except for the omission of the toxin from the infusion.
Postoperatively, all monkeys received the analgesic Metacam (meloxicam, 0.1 ml of a 1.5 mg/ml oral suspension; Boehringer Ingelheim) and, once complete recovery was assured, were returned to their home cage for 10 d of “weekend diet” and water ad libitum before returning to experimental testing.
Postmortem lesion assessment
All monkeys were humanely killed with Euthatal (1 ml of a 200 mg/ml solution, pentobarbital sodium; Merial Animal Health; i.p.) before being perfused transcardially with 500 ml of 0.1 M PBS, followed by 500 ml of 4% paraformaldehyde fixative over ∼10 min. The entire brain was then removed and placed in further paraformaldehyde overnight before being transferred to a 30% sucrose solution for at least 48 h. For verification of lesions, coronal sections (60 μm) of the brain were cut using a freezing microtome and cell bodies stained using Cresyl Fast Violet. The sections were viewed under a Leitz DMRD microscope and lesioned areas were defined by the presence of major neuronal loss, often with marked gliosis. For each animal, areas with cell loss were schematized onto drawings of standard marmoset brain coronal sections, and composite diagrams were then made to illustrate the extent of overlap between lesions.
The schematic representations of the extent of the lesions seen in all monkeys in the OFC, medial striatum (MS), and amygdala groups are shown in Figures 2, 3, 4, and 5. These figures illustrate those regions of the brain that were consistently lesioned in three, two or one of the marmosets within each lesion group. In all cases, the intention was to create discrete lesions of the target structures that did not incur damage to either fibers of passage or extra-target tissue.
The intended medial striatal lesion included all regions of the anteromedial head of the caudate nucleus and nucleus accumbens that we have previously shown to receive inputs from the OFC (Roberts et al., 2007). The resulting lesion encompassed the medial head of the caudate nucleus in all animals, and only extended into the lateral head in one animal. The posterior extent of the lesions was at the level of the anterior amygdala. The lesion did not extend into the body of the caudate. The nucleus accumbens was damaged in its entirety (including both shell and core regions) in two animals but damage in the third animal was restricted to the anterior sector. The putamen was largely spared in all monkeys. In one monkey there was additional damage to the subgenual cortex extending into the anterior septum, as well as slight damage to the dorsal anterior cingulate, the latter probably caused by the excitotoxin spreading up the cannula tract.
OFC (Fig. 4)
The intended orbitofrontal lesion included the agranular and dysgranular regions lying on the orbitofrontal surface, anterior to the genu of the corpus callosum, while sparing the highly granular cortex in the lateral convexity and the frontal pole. The resulting lesion extended from the posterior edge of the frontal pole to the genu of the corpus callosum. The lesion included the majority of the dysgranular and agranular regions but spared the granular anterior frontal pole and the lateral convexity. There was variable cell loss to the ventromedial convexity on the left which was more pronounced anteriorly, and one monkey incurred some unilateral cell damage to the anterior dorsal granular PFC (OFC, AP 17.0), probably caused by the excitotoxin spreading up the cannula tract.
Amygdala (Fig. 5)
The lesion encompassed the lateral (L), accessory (AA), and central (C) nuclei, as well as the more lateral basal nucleus components (B, Bmg) throughout the anterior–posterior extent of the amygdala in at least two of the three monkeys. The more medial basal components (Bpc, Bi), the medial (M) and cortical nuclei (Cr), and the medial portions of the accessory basal nucleus (AB) were lesioned in one monkey or were spared along the medial edge. There was no extra-amygdala damage except for slight damage to the underlying entorhinal cortex in one monkey.
Serial reversal learning
Preoperative discrimination learning
Preoperatively, the four groups of monkeys did not differ in their ability to learn two novel visual discriminations (D1 and D2; group and group × discrimination, F values < 1) (Table 2).
Postoperatively, there was no significant difference in the ability of all four groups to remember a previously learned discrimination or to learn a third novel discrimination (group and group × discrimination, F values < 1) (Table 2).
Repeated-measures analysis of the square-root transformed total errors to criterion revealed no significant main effects of group (F(3,9) = 1.921, p = 0.197) or reversal × group interaction (F(9,27) = 1.598, p = 0.166) (Fig. 6A). However, such a gross analysis across multiple reversals is not sensitive enough to detect the perseverative behavior that has previously been shown to be so characteristic of OFC-lesioned marmosets (for example Clarke et al., 2004; Man et al., 2008) and omission of a predictive factor such as error type (perseverative, chance, and learning errors) violates the assumptions of ANOVA (Howell, 1997; Cardinal and Aitken, 2006). Therefore our primary mode of analysis was to investigate how the different error types were affected by the lesions, rather than the total (combined) number of errors.
Both MS- and OFC-lesioned monkeys made many more perseverative errors across the series of four reversals than either the amygdala-lesioned monkeys or controls (Fig. 6B). Although both amygdala-lesioned monkeys and controls showed a steady improvement in perseverative performance across the four reversals, the OFC-lesioned monkeys and, in particular, the MS-lesioned monkeys did not. In contrast, there were no significant differences between the groups in errors during the chance or learning phases (Fig. 6C).
Repeated-measures ANOVA of the square-root transformed error types across group and reversal revealed significant main effects of error type (F(1.9,17.4) = 40.943, ε̄ = 0.967, p < 0.001) and of reversal (F(3,27) = 14.206, p < 0.001) and a trend toward an effect of group (F(3,9) = 3.00, p = 0.088). There was also an error type × group interaction (F(5.804,17.412) = 6.219, ε̄ = 0.967, p = 0.001), but no reversal × group interaction (F > 1, NS), error type × reversal interaction (F(6,18) = 1.749, NS), or three-way interaction (F < 1, NS). Simple main effects of group for each error type, collapsed across all four reversals, revealed a significant difference between groups at the perseverative stage (F(3,12) = 15.656, p < 0.001), but not at the chance or learning stages (chance, F < 1; learning, F > 1, NS). Post hoc analysis of the perseverative responding using the REGWQ homogeneous subset test (described in the Materials and Methods) revealed that both MS- and OFC-lesioned monkeys were classified as one subset, and made significantly more perseverative errors than the amygdala-lesioned monkeys and controls, which were classified as another subset. By definition, groups within a particular subset do not differ from each other. Detailed exploration of this pattern using Šidák-corrected multiple comparisons confirmed that both MS- and OFC-lesioned monkeys made more perseverative errors than amygdala-lesioned monkeys and controls (MS lesions vs controls, p = 0.004; MS lesions vs amygdala lesions, p = 0.034; OFC lesions vs controls, p = 0.002; OFC lesions vs amygdala lesions, p = 0.014), and that there was no difference in perseveration scores between amygdala-lesioned monkeys and controls (p = 0.814) or MS- and OFC-lesioned monkeys (p = 0.993).
At no point during the serial reversal paradigm were there any differences between groups in either the latency to respond to a stimulus or the latency to take the reward.
Sensitivity to rewarding and nonrewarding feedback
The finding that the impairment in reversal learning in both the MS- and OFC-lesioned monkeys was perseverative in nature highlights the comparability of their behavioral impairments. However, functional imaging studies in humans and electrophysiological studies in monkeys do highlight differences in the activity of these two regions during task performance (Pasupathy and Miller, 2005; Seger and Cincotta, 2006). Additional analyses were therefore undertaken to determine whether any differences in the impaired reversal performance of OFC- and MS-lesioned monkeys could be revealed. Initially, the responsivity of all four groups of monkeys to positive and negative feedback was investigated on all trials across all reversals. Probabilities of shifting responding to the other stimulus on trial X were calculated according to whether the response on trial X − 1 was rewarded (P[shift|win]) or not rewarded (P[shift|loss], where P[shift|win] + P[stay|win] = 1 and P[shift|loss] + P[stay|loss] = 1). In addition, a direct comparison between MS- and OFC-lesioned monkeys was of particular interest because these two groups showed a perseverative impairment of comparable magnitude.
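The conditional shift probabilities defined above can be computed from a trial-by-trial record as follows. This is a minimal sketch under stated assumptions: the function name, the "A"/"B" stimulus labels, and the treatment of the record as one continuous sequence (ignoring session boundaries and correction trials) are choices made here for illustration.

```python
from typing import List, Optional, Tuple


def shift_probabilities(choices: List[str],
                        rewarded: List[bool]) -> Tuple[Optional[float], Optional[float]]:
    """Return (P[shift|win], P[shift|loss]) over consecutive trial pairs.

    choices[i] is the stimulus chosen on trial i (e.g., "A" or "B");
    rewarded[i] is True if that response was rewarded.
    A probability is None when no trial of that feedback type occurred.
    """
    if len(choices) != len(rewarded):
        raise ValueError("choices and rewarded must have the same length")

    win_trials = win_shifts = loss_trials = loss_shifts = 0
    for prev in range(len(choices) - 1):
        shifted = choices[prev + 1] != choices[prev]
        if rewarded[prev]:
            win_trials += 1
            win_shifts += shifted
        else:
            loss_trials += 1
            loss_shifts += shifted

    p_shift_win = win_shifts / win_trials if win_trials else None
    p_shift_loss = loss_shifts / loss_trials if loss_trials else None
    return p_shift_win, p_shift_loss
```

Because P[stay|win] = 1 − P[shift|win] and P[stay|loss] = 1 − P[shift|loss], the stay probabilities follow directly from the two returned values.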
Figure 7 displays the probability of shifting after an error or a correct response in OFC-lesioned, MS-lesioned, amygdala-lesioned, and control monkeys. MS- and OFC-lesioned monkeys were significantly less likely to shift after an error compared with the other two groups, consistent with their marked perseverative responding to the previously rewarded, but now incorrect, stimulus (Fig. 7A). In contrast, only OFC-lesioned monkeys showed a significantly increased likelihood of shifting after a correct response compared with the other three groups (Fig. 7B).
Repeated-measures ANOVA of the arcsine-transformed probability data from all four groups of monkeys revealed a significant main effect of group (F(3,9) = 4.127, p = 0.043), and also significant reversal × group (F(9,27) = 3.306, p = 0.008), feedback × group (F(3,9) = 25.755, p < 0.001), and reversal × feedback interactions (F(3,9) = 3.607, p = 0.026). Analysis of the simple main effects of the feedback × group interaction, collapsed across the reversals, showed group differences in the probabilities of shifting after both nonrewarded (error) and rewarded (correct) trials (shift given no reward, F(3,12) = 10.589, p = 0.003; shift given correct response, F(3,12) = 5.109, p = 0.025).
Subsequent post hoc analysis of the “shift given no reward” data was performed using the REGWQ homogeneous subset test. This classified MS- and OFC-lesioned monkeys in a subset together (i.e., not significantly different from each other), with these groups significantly different from the controls and amygdala-lesioned monkeys (which formed a second subset). Detailed exploration of the pairwise differences revealed that the MS- and OFC-lesioned monkeys were significantly less likely to shift after a nonrewarded trial than controls (control vs MS, p = 0.007; control vs OFC, p = 0.010), and did not differ from each other (p > 0.999). Amygdala-lesioned monkeys did not differ from controls (p = 0.8), and showed a nonsignificant trend toward a difference from both MS-lesioned monkeys (p = 0.065) and OFC-lesioned monkeys (p = 0.097; all p values Šidák-corrected for multiple comparisons). Similar analysis of the “shift given reward” data using the REGWQ test showed that the OFC-lesioned monkeys were significantly more likely to shift after a reward than a subset containing all other groups. Detailed exploration of the pairwise differences confirmed the lack of significant differences between MS-lesioned, amygdala-lesioned, and control monkeys (smallest p = 0.925), whereas OFC-lesioned monkeys were significantly more likely to shift after a reward than amygdala-lesioned monkeys (p = 0.033), with a trend toward a similar difference from both controls (p = 0.102) and MS-lesioned monkeys (p = 0.097; all p values Šidák-corrected for multiple comparisons).
The difference between OFC- and MS-lesioned monkeys in terms of their shifting probability after a correct response was an issue of key a priori interest. Although the homogeneous subset test provided evidence in support of such a distinction, direct evidence for the specific contrast was not found in a detailed post hoc exploration of this pattern, possibly because of the lower power after correction for multiple comparisons. To confirm the key conclusion that OFC-lesioned monkeys are more likely to shift after a win than MS-lesioned monkeys, therefore, an ANOVA was conducted contrasting only the MS- and OFC-lesioned groups. It revealed a significant main effect of feedback (correct response or error; F(1,4) = 134.537, p < 0.001) and significant feedback × group (F(1,4) = 12.295, p = 0.025) and reversal × group (F(3,12) = 8.666, p = 0.002) interactions. Simple main effects of group for each feedback type, collapsed across all four reversals, revealed a significant difference between the probability of the groups shifting after a correct response (F(1,5) = 9.160, p = 0.039), with the OFC-lesioned monkeys being more likely to shift than the MS-lesioned monkeys. As before, there was no difference in the probabilities of OFC- and MS-lesioned monkeys shifting after an error (F(1,5) = 0.216, p = 0.666).
There was no significant difference in the ability of the four groups to acquire the new one-stimulus task (F(3,12) = 0.450, p = 0.723).
As can be seen in Figure 8, the total number of responses made before responding had extinguished did not differ between the controls and MS- or OFC-lesioned monkeys. In contrast, the amygdala-lesioned group made far fewer responses overall. However, ANOVA on square-root transformed data from all four groups revealed no significant differences in the number of responses made (group: F(3,12) = 1.144, p = 0.383). Because the large variation in extinction performance displayed by individuals in the MS and OFC groups (see symbols in Fig. 8A) may have caused a loss of power in the ANOVA, an additional analysis was performed comparing the control and amygdala groups alone, as these two groups showed similar levels of variation. As expected, this one-way ANOVA indicated that amygdala-lesioned monkeys were significantly faster to extinguish than controls (F(1,5) = 15.123, p = 0.012).
To determine whether the length of time taken to extinguish responding was correlated with the degree of perseveration seen in the reversal paradigm (given that both indices are measures of inflexibility within their specific paradigms), a correlational analysis was performed. However, analysis revealed no significant “between” (r = −0.02, p = 0.948) or “within” group correlations (p ≥ 0.189), suggesting that for these two paradigms at least, perseveration and resistance to extinction are governed by independent processes (see Fig. 8B).
At no point during the one-stimulus extinction test were there any differences between groups in the latency to respond to the stimulus.
This study provides new insight into the neural circuitry underlying visual discrimination reversal learning in primates. First, selective excitotoxic damage to the MS induced inflexible, perseverative responding in reversal learning, similar to that seen after excitotoxic damage to the OFC. Second, OFC- and MS-lesioned monkeys differed in their sensitivity to positive, but not negative, outcomes. Whereas both OFC- and MS-lesioned monkeys were equally insensitive to negative outcomes (consistent with their perseveration), only OFC-lesioned monkeys were significantly less sensitive to rewarding outcomes. Third, the inflexibility induced by MS and OFC lesions did not extend to a one-stimulus extinction test, suggesting that at least partially dissociable processes support reversal learning and response extinction. Finally, amygdala lesions markedly facilitated extinction, despite failing to impair reversal learning, consistent with the findings of Izquierdo and Murray (2005).
The striatum, amygdala, and behavioral switching
Neurophysiological (Block et al., 2007), functional neuroimaging (Rogers et al., 2000; Hampton and O'Doherty, 2007), and computational modeling (Frank and Claus, 2006) studies have implicated the striatum in discrimination reversal learning. However, behavioral evidence for striatal involvement in such learning is limited. Moreover, care must be taken when comparing striatal studies in rats and monkeys given the observed differences in their overall OFC projection patterns (Schilman et al., 2008). There is little evidence for a selective role of the nucleus accumbens in reversal learning in either rats or monkeys (Annett et al., 1989; Stern and Passingham, 1995; Schoenbaum and Setlow, 2003), although a lesion of the nucleus accumbens that extended into the neighboring pallidum selectively disrupted reversal of an odor discrimination in rats (Ferry et al., 2000). Although this evidence suggests accumbens lesions do not always disrupt reversal learning, such lesions do impair another form of behavioral switching, strategy set-shifting (Reading et al., 1991; Block et al., 2007). Conversely, the dorsomedial striatum has been implicated in both place and egocentric response reversal learning (Ragozzino et al., 2002; Ragozzino, 2007), although the deficits were attributable to a failure to engage or maintain the new response rather than failure to inhibit the previously rewarded response (perseveration). It should be noted that there is a marked medial–lateral topographical gradient of projections from the OFC into the rat striatum (Schilman et al., 2008), with dorsomedial striatum receiving projections from more medial OFC regions. However, those OFC lesion studies in rats that have reported perseverative deficits in reversal learning have included the entire OFC (medial, ventral, and lateral regions) (Chudasama and Robbins, 2003) or primarily lateral regions (Schoenbaum et al., 2003). Thus, it remains unclear whether the dorsomedial striatum receives information from that region of OFC whose loss causes perseverative responding on reversal learning.
In the present study, a marmoset striatal lesion encompassing both the medial caudate nucleus and the nucleus accumbens [the striatal territory to which the OFC projects in marmosets (Roberts et al., 2007)] selectively impaired the reversal, but not the retention or acquisition, of a visual pattern discrimination. The deficit was perseverative, and similar to that seen after lesions of the marmoset OFC. It cannot be ascertained from this study whether the deficit was caused by medial caudate or nucleus accumbens damage, although it should be noted that the marmoset with only a partial nucleus accumbens lesion still made >40% more perseverative errors than the worst control. However, given the lack of effect of nucleus accumbens lesions on odor and object reversal learning (described earlier), the present results are most likely attributable to medial caudate damage. This would also be consistent with the enhanced blood flow seen in this region after reversal of a visual pattern discrimination in humans (Rogers et al., 2000).
The finding that both OFC- and MS-lesioned monkeys were impaired on reversal learning is consistent with the hypothesis that these regions form a functional PFC–striatal circuit that is critical for processing reward and punishment and guiding decision making (a hypothesis that could be tested by disconnecting the OFC from the MS during reversal learning). Both OFC and ventral striatum activations are consistently demonstrated in imaging studies of reversal learning (Cools et al., 2002; Kringelbach and Rolls, 2003; Hampton and O'Doherty, 2007), and reduced OFC and ventral striatum activity is associated with poor reversal performance in OCD patients (Remijnse et al., 2006). Computational models of frontostriatal dynamics postulate that the striatum acts as an integrative gate to the flow of environmental information from other regions of the brain to the frontal lobes (Houk and Wise, 1995; Frank et al., 2001, 2004). This premise is supported by data indicating that striatal (particularly caudate) activity precedes frontal activity during rule learning (Delgado et al., 2005; Pasupathy and Miller, 2005; Seger and Cincotta, 2006). Lesions of either the OFC or MS may cause reversal learning deficits, but we suggest that they do so for reasons that differ according to these structures' distinct roles in the information-processing hierarchy. Although both OFC- and MS-lesioned groups displayed perseverative responding, consistent with their insensitivity to negative outcomes, only the OFC-lesioned monkeys were more likely to shift after positive outcomes. Our finding that MS-lesioned monkeys are insensitive to negative outcomes, i.e., reward loss, is consistent with a role for this region in reward prediction error signaling (Pagnoni et al., 2002; Seymour et al., 2007). 
In contrast, the finding that OFC-lesioned monkeys were less sensitive to all feedback is consistent with data showing that the OFC represents positive and negative outcome expectancies (Schoenbaum et al., 1998; O'Doherty et al., 2001).
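As an illustration of how sensitivity to positive and negative feedback can be quantified, the following is a minimal sketch and not the analysis used in this study. It assumes a hypothetical trial record in which each trial is a (stimulus chosen, rewarded) pair, and it estimates the probability of shifting to the other stimulus after rewarded and after unrewarded trials. The function name `shift_probabilities` and the data format are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical trial format: (stimulus chosen, whether that choice was rewarded).
Trial = Tuple[str, bool]


def shift_probabilities(trials: List[Trial]) -> Dict[str, Optional[float]]:
    """Estimate P(shift | reward) and P(shift | no reward) from consecutive trials."""
    # outcome on trial n -> [number of shifts on trial n+1, number of opportunities]
    counts = {True: [0, 0], False: [0, 0]}
    for (choice, rewarded), (next_choice, _) in zip(trials, trials[1:]):
        counts[rewarded][1] += 1
        if next_choice != choice:
            counts[rewarded][0] += 1

    def ratio(shifts: int, total: int) -> Optional[float]:
        # None signals that no trial of this outcome type was followed by another trial.
        return shifts / total if total else None

    return {
        "shift_after_positive": ratio(*counts[True]),
        "shift_after_negative": ratio(*counts[False]),
    }


# Invented example sequence, for illustration only.
example = [("A", True), ("A", False), ("A", False),
           ("B", True), ("B", True), ("A", False)]
result = shift_probabilities(example)
print(result)
```

Under these assumptions, a low value of `shift_after_negative` would correspond to insensitivity to negative feedback (continuing to choose a stimulus that is no longer rewarded), whereas a high value of `shift_after_positive` would correspond to reduced sensitivity to reward.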
The failure of marmoset amygdala lesions to disrupt reversal learning is consistent with a recent study in which amygdala-lesioned rhesus monkeys were unimpaired in an object reversal task presented in the Wisconsin General Testing Apparatus (WGTA) (Izquierdo and Murray, 2007). In the WGTA, rewards were visible, allowing visual (stimulus)–visual (sight of food) associations to guide responding, whereas in our study they were not. Our findings therefore extend the types of reversal learning that survive amygdala lesions to tasks that cannot depend on visual–visual associations. However, Stalnaker et al. (2007) showed that simultaneous lesions to the rat amygdala and OFC abolished the reversal learning impairments normally seen after OFC lesions alone. The present results identify the MS as a likely candidate for supporting reversal learning in the absence of the OFC or amygdala.
Extinction versus discrimination reversal
Despite their unaltered performance in reversal learning, amygdala-lesioned marmosets showed a marked facilitation in extinction of responding to a single visual stimulus, similar to that seen after amygdala lesions in rhesus monkeys (Izquierdo and Murray, 2005). This facilitation is probably attributable to the insensitivity of amygdala-lesioned animals to the conditioned reinforcing properties of a reward-associated visual stimulus, properties that prolong responding in controls (Cador et al., 1989). However, increased sensitivity to nonreward has also been proposed as an alternative explanation of how amygdala lesions facilitate extinction (see Izquierdo and Murray, 2005). Although some studies have instead shown prolonged responding in extinction (e.g., Burns et al., 1999), their procedures differed markedly from those of the current study, making comparison difficult.
In contrast to amygdala-lesioned marmosets, the extinction performance of MS- and OFC-lesioned marmosets remained indistinguishable from that of controls despite their severely impaired reversal performance. The intact extinction performance of OFC-lesioned marmosets deserves special mention because impaired extinction has been reported to occur after OFC lesions in rhesus monkeys (Butter, 1969; Izquierdo and Murray, 2005). Differences in procedures, including (1) self-determined versus time-dependent trial length, (2) self-determined versus prescribed trial numbers, and (3) past training history of the animals, could all affect the psychological mechanisms underlying extinction performance and hence any involvement of the OFC. Lesion type (excitotoxic versus ablation) may also affect the results.
Differences in psychological mechanisms may also account for the failure to find any correlation between the extent of perseverative responding during reversal and overall responding in extinction (Fig. 8B). Indeed, these findings eliminate the possibility that the perseverative responding seen in reversal learning after marmoset OFC and MS lesions is attributable to a deficit in response extinction. Instead, they highlight the distinct psychological and neural mechanisms that underlie behavioral switching and behavioral extinction. It is possible that the simple procedural difference between reversal learning and extinction (two stimuli vs one stimulus) alters the type of associations that may govern learning (Roberts and Parkinson, 2006). This may be critical for the involvement of the OFC, which is specifically implicated in stimulus–reward but not action–reward associative learning (Ostlund and Balleine, 2007). However, recent findings from our laboratory (Walker et al., 2008) suggest that even when extinction of a two-stimulus visual discrimination task is used, neurochemical PFC lesions that cause perseveration on a reversal task do not prolong responding in extinction.
In summary, the present experiments clearly demonstrate that MS lesions cause a perseverative impairment in reversal learning similar to that seen after OFC lesions, as predicted by the frontostriatal loop hypothesis of Alexander et al. (1990). However, the differing feedback insensitivities apparent after each lesion provide preliminary evidence for regional specialization of outcome-related information processing. Such outcome insensitivities may contribute to abnormal striatal prediction errors that are suggested to aid the development of erroneous associations in schizophrenia and other psychoses, a process that could contribute to the cognitive inflexibility apparent in such disorders (Corlett et al., 2007). However, whereas the success of anti-dopaminergic drugs in the treatment of schizophrenia, together with electrophysiological data and related models (Schultz, 2002; Frank and Claus, 2006), would implicate striatal dopamine in such processes, previous work from our laboratory has suggested that serotonin, rather than dopamine, is the more important neurotransmitter within the OFC (Clarke et al., 2007). Furthermore, global but transient 5-HT depletion in humans modulates prediction errors associated with punishment (Cools et al., 2008). Thus, an understanding of the frontostriatal circuitry mediating flexible behavior will not be complete without identifying the roles played by dopamine and serotonin.
This work was supported by Wellcome Trust program Grant 076274/z/04/z (T.W.R., B. J. Everitt, A.C.R., and B. J. Sahakian) and conducted within the University of Cambridge Behavioural and Clinical Neuroscience Institute, supported by a joint award from the Medical Research Council and the Wellcome Trust. H.F.C. is supported by the Newton Trust, Cambridge, a Network Grant from the J. McDonnell Foundation, and a Junior Research Fellowship from Newnham College, Cambridge. Funding to pay the open access publication charges for this article was provided by the Wellcome Trust. We thank Rudolf Cardinal for computer programming, Rudolf Cardinal and Mike Aitken for statistical advice, Mercedes Arroyo for histology, and John Bashford for photography.
Correspondence should be addressed to Hannah F. Clarke, Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, UK.
This article is freely available online through the J Neurosci Open Choice option.