Event-related functional magnetic resonance imaging was used to measure blood oxygenation level-dependent responses in 13 young healthy human volunteers during performance of a probabilistic reversal-learning task. The task allowed the separate investigation of the relearning of stimulus–reward associations and the reception of negative feedback. Significant signal change in the right ventrolateral prefrontal cortex was demonstrated on trials when subjects stopped responding to the previously relevant stimulus and shifted responding to the newly relevant stimulus. Significant signal change in the region of the ventral striatum was also observed on such reversal errors, from a region of interest analysis. The ventrolateral prefrontal cortex and ventral striatum were not significantly activated by the other, preceding reversal errors, or when subjects received negative feedback for correct responses. Moreover, the response on the final reversal error, before shifting, was not modulated by the number of preceding reversal errors, indicating that error-related activity does not simply accumulate in this network. The signal change in this ventral frontostriatal circuit is therefore associated with reversal learning and is uncontaminated by negative feedback. Overall, these data concur with findings in rodents and nonhuman primates of reversal-learning deficits after damage to ventral frontostriatal circuitry, and also support recent clinical findings using this task.
Reversal learning involves the adaptation of behavior according to changes in stimulus–reward contingencies, a capacity relevant to socio-emotional behavior (Rolls, 1999). It is exemplified by visual discrimination tasks where subjects must learn to respond according to the opposite, previously irrelevant, stimulus–reward pairing. Reversal learning is disrupted after lesions of the ventral prefrontal cortex (PFC) and ventral striatum (VS) in nonhuman primates (Divac et al., 1967; Iversen and Mishkin, 1970; Dias et al., 1996). However, evidence of the same system being involved in reversal performance in humans is limited to two studies in patients with nonselective ventral PFC damage (Rolls et al., 1994; Rahman et al., 1999). In the current study, event-related functional magnetic resonance imaging (fMRI) was used, enabling the identification of neural mechanisms associated with reversal learning in the intact human brain, with the further aim of dissecting the components and temporal dynamics of this process.
Previous neuroimaging studies have associated the ventral PFC and VS with a variety of functions related indirectly to reversal learning, including unconditioned (Zald and Pardo, 1997) and conditioned (Delgado et al., 2000; Knutson et al., 2001; O'Doherty et al., 2001) reward processing and low-level inhibitory control (Garavan et al., 1999; Konishi et al., 1999). A recent positron emission tomography investigation using a blocked design failed to identify blood-flow changes in the PFC during reversal learning, although changes were observed in the ventral caudate nucleus (Rogers et al., 2000). However, in blocked designs a signal may be attenuated through the averaging of activity over an extended period. Methodological developments in fMRI have enabled the identification of neural responses to single events, and event-related fMRI is well suited to investigating reversal learning, where critical errors occur against a background of correct responses.
The aim of the current study was to explore the involvement of ventral frontostriatal regions in reversal learning. The event-related approach offers the additional advantage of being able to examine, for the first time, the temporal dynamics of neural activity during the reversal phase in humans. “Final” reversal errors followed directly by shifting were modeled separately from other preceding reversal errors, not directly leading to changes of behavior (Fig. 1). The use of a difficult, probabilistic task, where negative feedback was given to correct responses on a minority of trials, encouraged perseverative behavior after contingency reversals. Separate investigation of probabilistic errors, final reversal errors, and preceding reversal errors enabled independent assessment of the neural correlates of reversal learning and negative feedback. We predicted specific signal changes in a ventral frontostriatal network during final reversal errors (directly leading to changes in behavior) but not during probabilistic errors or preceding reversal errors.
An important rationale for the current study was based on recent findings that performance on a probabilistic reversal-learning task was impaired by dopaminergic medication in patients with mild Parkinson's disease (Swainson et al., 2000; Cools et al., 2001). This detrimental effect of dopamine was hypothesized to be a result of “overdosing” a ventral frontostriatal circuit (Swainson et al., 2000; Cools et al., 2001), given neuroanatomical evidence for a relative preservation of the VS in the early stages of the disease (Kish et al., 1988). Confirmation of ventral frontostriatal involvement in reversal learning would considerably strengthen this “overdose” hypothesis.
MATERIALS AND METHODS
Subjects. Fourteen right-handed, young, healthy volunteers participated in this study. One subject was unable to perform satisfactorily on the task and was therefore excluded from the analysis. All remaining 13 subjects (5 males, 8 females; mean age, 25.9; SD, 3.82; range, 22–37) gave informed consent, which was approved by the local Research Ethics Committee.
Experimental design. Each subject was scanned performing the behavioral task in three successive 9 min sessions. Before entering the scanner, subjects performed a 30 trial training session. This was a simple probabilistic discrimination task (i.e., without reversal stages) designed to introduce the subject to the concept of a probabilistic error without the need to reverse responding. On each go, the same two patterns were presented. One of the patterns was correct and the other pattern was wrong, and subjects had to choose the correct pattern on each go. During the task, the rule changed intermittently so that the other pattern was usually correct. Subjects were instructed to only start choosing the other pattern when they were sure that the rule had changed.
The task was programmed in Microsoft (Seattle, WA) Visual Basic 6.0 and stimuli were presented on a computer display projected onto a mirror in the MRI scanner. Different stimuli were used in each of the three task blocks (and training stage), and the order of presentation of the blocks was counter-balanced across subjects. Each block consisted of 10 discrimination stages, and therefore, 9 reversal stages. Reversal of the stimulus–reward contingency occurred after between 10 and 15 correct responses (including probabilistic errors). The number of probabilistic errors between each reversal varied from 0 to 4. To prevent subjects from adopting a strategy such as always reversing after two consecutive errors, probabilistic negative feedback was given on two consecutive trials once during each task block. Each block lasted ∼8.5 min depending on level of performance. The two stimuli in each block were abstract colored patterns presented simultaneously in the left and right visual fields (location randomized) (Fig. 1). Responses were made using the left or right button on a button box positioned on the stomach of the subject. On each individual trial, the stimuli were presented for 2000 msec within which the response had to be made (or else a “too late” message was presented). Feedback, consisting of a green smiley face for correct responses or a red sad face for incorrect responses, was presented immediately after the response (Fig. 1). The feedback faces were presented centrally, between the 2 stimuli, for 500 msec during which the stimuli also remained on the screen. After feedback, the stimuli were removed and the face was replaced by a fixation cross for a variable interval so that the overall interstimulus interval was 3253 msec, enabling precise desynchronization from the repetition time (TR) (of 3000 msec) and sufficient sampling across the hemodynamic response function.
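The desynchronization of trial onsets from the TR can be illustrated with a short calculation. The sketch below assumes, for simplicity, that every trial onset falls exactly 3253 msec after the previous one and that the first onset coincides with a scan; the function name is illustrative and is not part of the original task software.

```python
from math import gcd

ISI_MS = 3253  # interstimulus interval reported in the text
TR_MS = 3000   # repetition time reported in the text


def onset_phases(n_trials, isi_ms=ISI_MS, tr_ms=TR_MS):
    """Return each trial onset's offset (msec) from the preceding scan onset.

    Assumes a fixed ISI and that trial 0 starts together with a scan.
    """
    return [(k * isi_ms) % tr_ms for k in range(n_trials)]


phases = onset_phases(12)

# Successive onsets drift by ISI - TR = 253 msec relative to the scan grid.
drift_ms = ISI_MS - TR_MS

# Number of trials before an offset repeats exactly (assumption: fixed ISI).
repeat_after = TR_MS // gcd(ISI_MS, TR_MS)
```

Under these assumptions, a dozen consecutive trials already sample the response at offsets spread across most of the TR, which is the property the design relies on.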
Imaging acquisition. Imaging data were collected using a Bruker Medspec scanner (S300; Bruker, Ettlingen, Germany) operating at 3 tesla. A total of 180 T2*-weighted echo-planar images (EPIs), depicting blood oxygenation level-dependent contrast, were acquired in each session (TR, 3 sec; echo time, 27 msec). A total of 21 slices (each of 4 mm thickness; interslice gap, 5 mm; matrix size, 64 × 64; bandwidth, 100 kHz; axial oblique acquisition orientation) per image were acquired. The first seven EPIs in each session were discarded to avoid T1 equilibrium effects. We were unable to collect data from the orbitofrontal and ventromedial PFC because of susceptibility artifacts in nasal sinuses leading to signal dropout.
Imaging analysis. Data analysis was performed using SPM 99 (Statistical Parametric Mapping; Wellcome Department of Cognitive Neurology, London, UK). Preprocessing procedures included slice acquisition time correction, reorientation, within-subject realignment, geometric undistortion using fieldmaps (Cusack et al., 2001), spatial normalization using EPI masking (to exclude areas susceptible to signal dropout from nonlinear warping) to the standard Montreal Neurological Institute EPI template, and spatial smoothing using a Gaussian kernel (8 mm full-width at half-maximum). Time series were high-pass filtered.
A canonical hemodynamic response function was used as a covariate in a general linear model and a parameter estimate was generated for each voxel for each event type. The parameter estimate, derived from the mean least squares fit of the model to the data, reflects the strength of covariance between the data and the canonical response function for a given condition. Individuals' contrast images, derived from pair-wise contrasts between parameter estimates for different events, were taken to a second-level group analysis in which t values were calculated for each voxel treating intersubject variability as a random effect. The t values were transformed to the unit normal Z distribution to create a statistical parametric map for each of the planned contrasts (described below).
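The first-level logic described above can be sketched for a single voxel and a single event type. This is a minimal illustration, not the SPM 99 implementation: the double-gamma shape parameters, the function names, and the simulated signal are all assumptions introduced here.

```python
import math


def double_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, ratio=6.0):
    """Double-gamma response t seconds after an event (assumed parameters)."""
    if t <= 0:
        return 0.0

    def gamma_pdf(shape):
        # Gamma density with unit scale.
        return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)

    return gamma_pdf(peak_shape) - gamma_pdf(undershoot_shape) / ratio


def build_regressor(onsets_s, n_scans, tr_s=3.0):
    """Predicted response at each scan: sum of HRFs shifted to the onsets."""
    return [
        sum(double_gamma_hrf(scan * tr_s - onset) for onset in onsets_s)
        for scan in range(n_scans)
    ]


def fit_parameter_estimate(regressor, signal):
    """Least-squares slope of signal on regressor, with an intercept term."""
    n = len(signal)
    mean_x = sum(regressor) / n
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in regressor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(regressor, signal))
    return sxy / sxx


# Hypothetical event onsets (seconds) and a noise-free simulated voxel signal.
onsets = [k * 3.253 for k in range(0, 40, 4)]
regressor = build_regressor(onsets, n_scans=60)
signal = [10.0 + 2.5 * x for x in regressor]
beta = fit_parameter_estimate(regressor, signal)
```

In the full model, one such regressor would be built per event type and the parameter estimates fitted jointly; the single-regressor case is shown only to make the meaning of a parameter estimate concrete.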
The hemodynamic response function was modeled to the onset of the responses, which co-occurred with the presentation of the feedback. The following events were modeled (Fig. 1): (1) correct responses, co-occurring with positive feedback, as a baseline; (2) probabilistic errors, on which negative feedback was given to correct responses (trials on which subjects reversed after a probabilistic error were not included in the model); (3) final reversal errors, resulting in the subject shifting their responding; and (4) the other preceding reversal errors, following a contingency reversal but preceding the final reversal errors. The final reversal errors (co-occurring with the last negative feedback) were chosen as critical events of interest (i.e., reflecting reversal learning) because activation of a reversal network was assumed to follow this last negative feedback. Error trials that could not be classified as probabilistic errors or reversal errors (so-called “spontaneous” errors) were not included in the model.
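The event definitions above can be written as a classification rule over a trial sequence. The sketch below is one possible reading of those definitions, under assumptions that are not stated in the text: each trial is represented as a hypothetical tuple (chosen stimulus, currently correct stimulus, negative feedback given), a perseverative run must begin on the first trial after a contingency reversal, and a run that does not end in a shift is left in the spontaneous category.

```python
def classify_trials(trials):
    """Label each trial; trials are (choice, correct_stimulus, negative_feedback)."""
    n = len(trials)
    labels = [None] * n

    # Pass 1: find runs of incorrect choices starting right after a reversal.
    for r in range(1, n):
        if trials[r][1] == trials[r - 1][1]:
            continue  # no contingency reversal between trial r-1 and trial r
        j = r
        while j < n and trials[j][1] == trials[r][1] and trials[j][0] != trials[j][1]:
            j += 1
        run = list(range(r, j))
        # The run counts only if it ends with a shift to the new stimulus.
        if run and j < n and trials[j][1] == trials[r][1]:
            for k in run[:-1]:
                labels[k] = "preceding_reversal_error"
            labels[run[-1]] = "final_reversal_error"

    # Pass 2: label everything else.
    for i, (choice, correct, negative) in enumerate(trials):
        if labels[i] is not None:
            continue
        if choice != correct:
            labels[i] = "spontaneous_error"  # not entered into the model
        elif negative:
            # Probabilistic errors followed by a switch are excluded.
            switched = i + 1 < n and trials[i + 1][0] != choice
            labels[i] = "excluded" if switched else "probabilistic_error"
        else:
            labels[i] = "correct"
    return labels


# Hypothetical sequence: the contingency reverses from A to B at trial index 3.
example = [
    ("A", "A", False), ("A", "A", True), ("A", "A", False),
    ("A", "B", True), ("A", "B", True), ("B", "B", False),
    ("A", "B", True), ("B", "B", False),
]
example_labels = classify_trials(example)
```

The two-pass structure keeps the reversal-related labels independent of the simpler feedback-based labels, mirroring the separation of event types in the model.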
The following contrasts were assessed: (1) final reversal errors minus correct responses, (2) other preceding reversal errors minus correct responses, (3) probabilistic errors minus correct responses, (4) final reversal errors minus other preceding reversal errors, and (5) final reversal errors minus probabilistic errors. In addition, we assessed whether the final reversal errors were parametrically modulated by the number of reversal errors directly preceding them. We predicted significant signal change in the ventral frontostriatal brain regions in contrasts 1, 4, and 5 but not in contrasts 2 or 3. All contrasts were initially thresholded at p < 0.05, corrected for multiple comparisons. Strong predictions about the involvement of the VS in reversal learning (see the introductory remarks) justified application of small volume corrections using a sphere around the VS. This a priori defined region of interest (ROI) was a sphere centered on x, y, z = +/−10, 8, −4 with a radius of 8 mm (i.e., the smoothing kernel). These coordinates represent the center of the nucleus accumbens as defined using the Talairach atlas, and are very close to ventral striatal foci specified in previous fMRI studies (Delgado et al., 2000; Breiter et al., 2001; Knutson et al., 2001). These coordinates were unaltered by conversion from Talairach to Montreal Neurological Institute (MNI) space using an algorithm by M. Brett (Medical Research Council Cognition and Brain Sciences Unit, Cambridge, UK) (available at www.mrc-cbu.cam.ac.uk/imaging).
Finally, ROI analyses were performed on the less powerful contrasts 4 and 5, which did not include baseline correct responses but were predicted to generate ventrolateral PFC signal change. A region in the ventrolateral PFC was defined on the basis of previous functional neuroimaging studies. Specifically, the ventral PFC has been activated repeatedly in studies of working memory (Owen, 1997), and on the basis of these studies coordinates have been reported that define the approximate functional extent of this neuroanatomical region (stereotaxic coordinates x = +/−26 to x = +/−50, y = +16 to y = +24, and z = −9 to z = 8) (Owen et al., 1999). The only fMRI study specifically looking at reversal learning did not scan below the anterior commissure–posterior commissure (AC–PC) axis (Nagahama et al., 2001). The above-described statistical model was then reapplied to the average signal within the ROI, using the MarsBar tool (M. Brett, personal communication; see www.mrc-cbu.cam.ac.uk/imaging/marsbar.html).
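The five planned contrasts listed above can be expressed as weight vectors over the four modeled event types. The following sketch is illustrative only: the ordering of event types, the names, and the example parameter estimates are assumptions and do not come from the analysis itself.

```python
# Assumed ordering of the four modeled event types.
EVENTS = (
    "correct",
    "probabilistic_error",
    "final_reversal_error",
    "preceding_reversal_error",
)

# Contrast number -> nonzero weights, following the list in the text.
CONTRASTS = {
    1: {"final_reversal_error": 1, "correct": -1},
    2: {"preceding_reversal_error": 1, "correct": -1},
    3: {"probabilistic_error": 1, "correct": -1},
    4: {"final_reversal_error": 1, "preceding_reversal_error": -1},
    5: {"final_reversal_error": 1, "probabilistic_error": -1},
}


def contrast_vector(number):
    """Return the weight vector for a contrast, in EVENTS order."""
    weights = CONTRASTS[number]
    return [weights.get(event, 0) for event in EVENTS]


def contrast_value(number, betas):
    """Weighted sum of parameter estimates for one voxel or one ROI."""
    return sum(w * betas[event] for w, event in zip(contrast_vector(number), EVENTS))


# Hypothetical parameter estimates for a single voxel.
example_betas = {
    "correct": 1.0,
    "probabilistic_error": 1.25,
    "final_reversal_error": 3.0,
    "preceding_reversal_error": 1.5,
}
contrast_4 = contrast_value(4, example_betas)
```

Each vector sums to zero, so a contrast value reflects a difference between event types rather than overall signal level; one such value per subject is what enters the second-level analysis.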
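The two a priori search volumes, the VS sphere and the ventrolateral PFC region, amount to simple geometric membership tests. The sketch below treats the ventrolateral PFC region as a box bounded by the coordinate ranges given above and ignores any difference between coordinate spaces; both simplifications, and all names, are assumptions. The example coordinates are peaks reported in the Results.

```python
import math

VS_CENTERS = [(-10, 8, -4), (10, 8, -4)]  # x = +/-10, y = 8, z = -4
VS_RADIUS_MM = 8.0

# Ventrolateral PFC ranges from the text: |x| 26 to 50, y 16 to 24, z -9 to 8.
VLPFC_ABS_X = (26, 50)
VLPFC_Y = (16, 24)
VLPFC_Z = (-9, 8)


def in_vs_sphere(coord):
    """True if coord lies within 8 mm of either VS sphere center."""
    return any(math.dist(coord, center) <= VS_RADIUS_MM for center in VS_CENTERS)


def in_vlpfc_box(coord):
    """True if coord lies inside the (assumed box-shaped) ventrolateral PFC ROI."""
    x, y, z = coord
    return (
        VLPFC_ABS_X[0] <= abs(x) <= VLPFC_ABS_X[1]
        and VLPFC_Y[0] <= y <= VLPFC_Y[1]
        and VLPFC_Z[0] <= z <= VLPFC_Z[1]
    )


# Peaks reported in the Results.
left_vs_peak = (-10, 2, -2)
right_vs_peak = (14, 2, -6)
left_vlpfc_peak = (-32, 24, -4)
```

Under these assumptions, both reported VS peaks fall inside the spherical search volume, and the left ventrolateral PFC coordinate falls inside the box, on its anterior boundary.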
RESULTS
All 13 subjects included in the analysis performed well on the task. On average, subjects made 2.6 (SD, 0.51) perseverative reversal errors after reversal of stimulus–reward contingencies. Over the task as a whole, subjects made on average (SD) 320.7 (4.4) correct responses, 48.4 (2.7) probabilistic errors, 43.4 (13.9) preceding reversal errors, and 27 (0) final reversal errors.
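The behavioral figures above are mutually consistent, as a short calculation shows. The sketch uses only numbers reported in the text; the variable names are illustrative.

```python
BLOCKS = 3                  # task blocks per subject
REVERSALS_PER_BLOCK = 9     # reversal stages per block
MEAN_PRECEDING = 43.4       # mean preceding reversal errors per subject
MEAN_FINAL = 27             # mean final reversal errors per subject

# One final reversal error per reversal, hence 27 with no variability.
expected_final = BLOCKS * REVERSALS_PER_BLOCK

# Perseverative errors per reversal: preceding errors plus the final one.
errors_per_reversal = (MEAN_PRECEDING + MEAN_FINAL) / expected_final
```

The result rounds to the 2.6 perseverative reversal errors per reversal reported above.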
Significant effects observed in whole-brain analyses are displayed in Table 1.
Comparison of the final reversal errors with the baseline correct responses (contrast 1; see Materials and Methods) revealed significant signal change in the right ventrolateral PFC (Table 1 and Fig. 2). The effect in the left ventrolateral PFC was present but did not reach significance (coordinates x, y, z = −32, 24, −4; T = 6.8). Other significant effects in this contrast were observed in medial frontal cortex (Brodmann area 8) and right parietal cortex (Table 1). Small volume corrections, restricting the search volume to a sphere around the ventral striatum (see Materials and Methods), revealed significant signal changes in that region bilaterally (coordinates x, y, z = −10, 2, −2; T = 6.3; p < 0.003; and coordinates x, y, z = 14, 2, −6; T = 4.3; p = 0.03).
An ROI analysis, restricting the search volume to an independently defined area in the ventrolateral PFC (see Materials and Methods), revealed that signal change in the right ROI during the final reversal error was also significantly greater than that observed during the preceding reversal errors (contrast 4; T = 2.79; p = 0.016) and during the probabilistic errors (contrast 5; T = 2.84; p = 0.014).
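The relation between the ROI T values and their p values can be sketched with a one-sample t test across subjects. The sketch assumes two-tailed tests with 12 degrees of freedom (13 subjects), which the text does not state explicitly, and uses numerical integration of the t density rather than any particular statistics package; the function names are illustrative.

```python
import math


def one_sample_t(values):
    """t statistic for the mean of per-subject contrast values against zero."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(variance / n)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return 1 - 2 * central_area


# Assumed degrees of freedom: 13 subjects minus 1.
p_contrast_4 = two_tailed_p(2.79, df=12)
```

Under these assumptions, the computed value for contrast 4 falls between 0.01 and 0.02, in line with the reported p = 0.016.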
Significant effects in the ventrolateral PFC and VS were absent when the other preceding reversal errors were contrasted with the baseline correct responses (contrast 2) and when the probabilistic errors were contrasted with the baseline correct responses (contrast 3). Moreover, there was no significant parametric effect at the final reversal errors as a function of the number of preceding reversal errors, even when the search volume was restricted to the ventrolateral PFC using an ROI analysis.
DISCUSSION
The present results demonstrate recruitment of a ventral frontostriatal system in a task of probabilistic reversal learning. Detailed analyses showed that this significant signal change, observed in the right ventrolateral PFC and in the region of the VS, occurred specifically during the final reversal error, at which point subjects stopped responding to a previously relevant pattern and reversed responding to a newly relevant pattern.
These data are consistent with our predictions and concur with primate and rodent lesion studies showing that damage to the ventral PFC (Iversen and Mishkin, 1970; Jones and Mishkin, 1972; Dias et al., 1996) and VS (Divac et al., 1967; Taghzouti et al., 1985; Annett et al., 1989; Stern and Passingham, 1995) disrupts reversal learning. For example, Stern and Passingham (1995) have shown that lesions of the nucleus accumbens in monkeys lead to deficits on tasks of spatial (but not object or motor) reversal learning, while leaving acquisition performance intact. In rats, dopamine (6-OHDA) and ibotenic acid lesions of the nucleus accumbens have led to both acquisition and reversal-learning impairments in a spatial T-maze and a Morris water maze (Taghzouti et al., 1985; Annett et al., 1989), suggesting a role for the nucleus accumbens in the relearning of new location–reward associations, rather than in stopping old responses (Annett et al., 1989). The current study indicates that the VS, at least in humans, is also implicated in the reversal or the relearning of object–reward associations. In contrast, human brain-imaging studies have emphasized a role for the right lateral ventral PFC in behavioral inhibition (or stopping) using, for example, go–nogo tasks (Garavan et al., 1999; Konishi et al., 1999). It is possible that the observed signal change in the ventrolateral PFC, also lateralized to the right hemisphere, reflects behavioral inhibition, whereas the signal change in the VS reflects the learning of new associations. However, our study was not designed to functionally dissociate learning from stopping; this could be addressed in future event-related brain-imaging studies.
The use of an event-related technique enabled the separate investigation of distinct error trial types that loaded differentially on reversal shifting and simple negative feedback. The signal change in the ventrolateral PFC and VS was significantly greater at the final reversal error compared with baseline correct responses. A region of interest analysis revealed that signal change in the right ventrolateral PFC was also significantly greater during the final reversal error than during all other error trial types. Moreover, the absence of a parametric effect suggests that the effects at the final reversal error were not modulated by the number of preceding reversal errors. This indicates that the focus was not the result of gradual accumulation of activity caused by the preceding errors, although given power considerations such conclusions cannot be considered to be definitive. Finally, these areas were not significantly activated during the preceding reversal errors or probabilistic errors when compared with baseline correct responses. Although it is not possible conceptually to doubly dissociate reversal learning from negative feedback, these results suggest that the effect during the final reversal error in the ventrolateral PFC is primarily attributable to reversal learning and cannot be explained by an effect of negative feedback.
These findings are broadly consistent with those observed in a recent event-related fMRI study of high-level attentional set shifting (Monchi et al., 2001). Monchi et al. (2001) demonstrated signal change in an area in the ventrolateral PFC, at coordinates very similar to the current focus, in response to negative feedback, signaling a shift of set. However, in that study it was not possible to isolate the shifting component from the negative-feedback component. Thus, the current study extends their results by showing that the effect in the ventrolateral PFC is uncontaminated by negative feedback. The present data also indicate that shifting of lower-level stimulus–reward associations, as opposed to shifting of a higher-level attentional set, is sufficient to activate the ventrolateral PFC.
Two additional neuroimaging studies have used reversal-learning tasks with event-related fMRI. Nagahama et al. (2001) did not scan brain areas below the horizontal plane through the AC–PC axis (Talairach coordinate z = 0). This precluded conclusions about the role of the VS and ventral PFC in reversal learning. Instead, they emphasized a role for a (more dorsal) posteroventral PFC area in shifting, which was not replicated in the current study. However, the focus in this posteroventral area was later clarified by Monchi et al. (2001) to be nonspecifically activated [i.e., to respond during both negative feedback (triggering set shifting) and positive feedback (triggering set maintenance)]. The relevant contrasts (subtracting positive-feedback trials from negative-feedback trials) in the current study were not designed to address this question of nonspecific feedback. A second event-related fMRI study, using a probabilistic reversal-learning paradigm to assess orbitofrontal neural responses to reward and punishment (O'Doherty et al., 2001), revealed signal change in the right ventrolateral PFC during reception of negative feedback. This signal change was interpreted to reflect punishment. However, it was not possible in that study to exclude the contribution of reversal learning. In fact, our results show that this ventrolateral PFC effect more likely reflects reversal learning. In addition, O'Doherty et al. (2001) observed signal change in the medial orbitofrontal cortex that correlated with the magnitude of reward, and observed signal change in the lateral orbitofrontal cortex that correlated with the magnitude of punishment. Because of susceptibility artifacts, we were unable to image these latter brain regions and are therefore unable to draw conclusions on orbitofrontal effects during trials on which negative feedback was received without consequences for behavior (as on probabilistic errors and preceding reversal errors).
The orbitofrontal cortex is connected to the nucleus accumbens of the ventral striatum in a segregated frontostriatal “loop” (Alexander et al., 1986; Groenewegen et al., 1997). Functional evidence implicates the orbitofrontal cortex, in interaction with the amygdala, in unconditioned (and conditioned) reward processing (Delgado et al., 2000; Breiter et al., 2001; Knutson et al., 2001; O'Doherty et al., 2001). In contrast, the ventrolateral PFC is connected to the ventral putamen, a structure implicated in motor function. The integrative role of the ventral striatum has been suggested to be the funnelling of motivational information from the “limbic” system to the motor system (Mogenson, 1987), thereby mediating the effects of stimulus–reward mechanisms on goal-directed behavior (Robbins et al., 1989; Schultz et al., 1992). Thus, whereas the orbitofrontal cortex may be important for the “low-level” representation of reward or punishment values (O'Doherty et al., 2001), the more lateral PFC may play a role in the adaptation of behavior in response to changes in such reward or punishment values. Our finding that the ventrolateral PFC is critically involved in reversal learning, uncontaminated by the reception of negative feedback per se, is consistent with this proposed hierarchy within corticostriatal systems.
Finally, our results provide a clear interpretation of recent data demonstrating that administration of dopaminergic medication to patients with mild Parkinson's disease has a detrimental effect on probabilistic reversal learning (Swainson et al., 2000; Cools et al., 2001). Recent studies have shown that, in early Parkinson's disease, dopamine depletion is restricted to the putamen and the dorsal caudate nucleus, only later progressing to more ventral parts of the striatum and the mesocorticolimbic system (Kish et al., 1988; Agid et al., 1993). It was hypothesized that administration of L-3,4-dihydroxyphenylalanine (L-DOPA) doses, necessary to remediate the dopamine depletion in the dorsal striatum and its connections to the dorsolateral PFC, may detrimentally overdose relatively intact brain regions, such as the VS and its connections to the ventral PFC (Gotham et al., 1988; Cools et al., 2001). The current data, showing involvement of the ventral PFC and VS in probabilistic reversal learning, considerably strengthen the possibility that dopaminergic agents can indeed overdose a relatively intact ventral frontostriatal system. This work highlights the potential of combining pharmacological and functional imaging approaches to understand the underlying neural substrates of cognitive functions in both basic and clinical settings.
This work was supported by a Wellcome Trust Programme grant (T.W.R.) and completed within a Medical Research Council Co-operative Group in Brain, Behavior, and Neuropsychiatry. R.C. holds the C. D. Marsden Parkinson's Disease Society Studentship. We thank Paul Fletcher and Matthew Brett for helpful discussion and Victoria Liversidge, Ruth Bisbrown-Chippendale, and Tim Donovan from the Wolfson Brain Imaging Centre (Cambridge, UK) for radiographic assistance with this study.
Correspondence should be addressed to Trevor W. Robbins, Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB.