Fear learning is a rapid and persistent process that promotes defense against threats and reduces the need to relearn about danger. However, it is also important to flexibly readjust fear behavior when circumstances change. Indeed, a failure to adjust to changing conditions may contribute to anxiety disorders. A central, yet neglected aspect of fear modulation is the ability to flexibly shift fear responses from one stimulus to another if a once-threatening stimulus becomes safe or a once-safe stimulus becomes threatening. In these situations, the inhibition of fear and the development of fear reactions co-occur but are directed at different targets, requiring accurate responding under continuous stress. To date, research on fear modulation has focused mainly on the shift from fear to safety by using paradigms such as extinction, resulting in a reduction of fear. The aim of the present study was to track the dynamic shifts from fear to safety and from safety to fear when these transitions occur simultaneously. We used functional neuroimaging in conjunction with a fear-conditioning reversal paradigm. Our results reveal a unique dissociation within the ventromedial prefrontal cortex between a safe stimulus that previously predicted danger and a “naive” safe stimulus. We show that amygdala and striatal responses tracked the fear-predictive stimuli, flexibly flipping their responses from one predictive stimulus to another. Moreover, prediction errors associated with reversal learning correlated with striatal activation. These results elucidate how fear is readjusted to appropriately track environmental changes, and the brain mechanisms underlying the flexible control of fear.
Fear learning is typically rapid and resistant to modification (LeDoux, 2000). This tendency to persist prevents the need for relearning about danger and can be adaptive in promoting escape and avoidance in the face of threats. However, the ability to flexibly readjust behavior is also advantageous, particularly in an ever-changing environment. This ability may be impaired in anxiety disorders, and patients with such disorders often show fear responses that are inappropriate for current circumstances (Orr et al., 2000; Peri et al., 2000; Shalev et al., 2000; Rauch et al., 2006).
A leading model for studying fear and anxiety in the brain is Pavlovian fear conditioning, a behavioral procedure in which an emotionally neutral conditioned stimulus (CS), such as a tone, is paired with an aversive unconditioned stimulus (US), such as electric shock. Studies over the past several decades have revealed much about the cellular and molecular mechanisms involved in the acquisition and storage of information about fear conditioning (Fendt and Fanselow, 1999; Davis, 2000; LeDoux, 2000; Phelps and LeDoux, 2005). As a result of this work, the mechanisms of fear extinction, whereby fear responses are weakened by presentation of the CS without the US, have also begun to be understood (Paré et al., 2004; Myers and Davis, 2007; Sotres-Bayon et al., 2007; Quirk and Mueller, 2008). However, elucidating how fear responses evolve and weaken through learning provides only partial understanding of how fear is modulated in the brain. To understand emotional control, it is crucial to clarify how fear responses are flexibly maneuvered and readjusted.
One way to study flexibility in fear is through reversal of aversive reinforcement contingencies in a fear conditioning paradigm. In this case, after acquisition of fear to one CS, the fear response is not eliminated as with extinction, but rather is switched to another CS. This is a unique situation in which two processes, the development of a fear reaction and its inhibition, occur in parallel, targeting different stimuli. Fear reversal, therefore, represents a more sophisticated and perhaps more demanding case of fear modulation.
The aim of the present study was to perform a fine-grain analysis of the gradual change in physiological and neural responses to cues that alternate in predicting danger. Specifically, using whole brain functional magnetic resonance imaging (fMRI), we sought to identify the neural mechanisms that underlie the inhibitory control of the fear response while fear is still present but is directed elsewhere. Our second aim was to identify the neural mechanisms tracking the predictive values of the stimuli as they are reversed from fear-inducing to safety-inducing and vice versa. To this end, we also examined the encoding of prediction errors related to such reversals by using a prediction error response pattern generated by the temporal difference reinforcement-learning algorithm (Sutton and Barto, 1990) as a regressor for brain activation.
The experimental procedure (see Fig. 1) consisted of an acquisition stage followed immediately by an unsignaled transition to a reversal stage. During acquisition, subjects were presented with two visual stimuli (faces). One stimulus coterminated with an aversive outcome (US) on one-third of the trials (CS+, face A). The other stimulus was never paired with the US (CS−, face B). The reversal stage was similar to acquisition except that the reinforcement contingency was reversed so that the previously nonreinforced stimulus now sometimes coterminated with the US (new CS+, face B), and the previously reinforced stimulus was now unpaired with the US (new CS−, face A).
Materials and Methods
Twenty-two healthy right-handed volunteers were recruited for the fMRI reversal task. One subject had excessive head motions during the fMRI scan and was therefore excluded from further analysis. Four subjects had nonmeasurable levels of skin conductance (nonresponders), which did not allow an assessment of fear conditioning. We therefore did not analyze their fMRI data, and they were excluded from the experiment. Thus, the final sample included 17 healthy right-handed volunteers (9 males) between 18 and 31 years of age. The experiment was approved by the University Committee on Activities Involving Human Subjects. All subjects gave informed consent and were paid for their participation.
Conditioning paradigm and physiological assessment.
A fear discrimination and reversal paradigm was used, with delay conditioning and partial reinforcement (Fig. 1). We used partial reinforcement to make learning nontrivial and to slow acquisition and reversal. This allowed us to examine early and late phases in each stage and the gradual development of fear learning and its reversal (for a comparison between a full and partial reinforcement, see Dunsmoor et al., 2007). Subjects were told they would see visual images on a computer screen while receiving shocks. The level of the shocks was set before the experiment, and therefore subjects could experience it beforehand. The instructions were to pay attention to the computer screen and try to figure out the relationship between the stimuli and the shocks. No mention was made of two stages or of a reversal of contingencies.
The CSs were two mildly angry male faces from the Ekman series (Ekman and Friesen, 1976). These stimuli were chosen because they were successful in producing conditioning and amygdala activation in previous studies (Morris et al., 1998; Critchley et al., 2002; Kalisch et al., 2006). Regardless of any a priori emotional saliency of these stimuli, the use of a discrimination procedure allowed us to detect differences in the learned predictive properties of these stimuli. The US was a mild electric shock to the wrist (200 ms duration, 50 pulses/s). The CSs were presented for 4 s, with a 12 s intertrial interval (ITI) in which a fixation point was presented (Fig. 1A).
In the acquisition phase, one face (face A) was paired with the US on one-third of the trials (CS+), and the other (face B) was never paired with the US (CS−). In the reversal stage, these contingencies were reversed such that face B was now paired with the US on approximately one-third of the trials (new CS+) and face A was not paired with the US (new CS−). The order of the different trial types was pseudorandomized (no consecutive reinforced trials and no more than two consecutive trials of each kind), and the designation of faces into CS+ and CS− was counterbalanced across subjects. During acquisition, there were 12 presentations of each of the CSs, intermixed with an additional 6 presentations of the CS+ that coterminated with the US. Reversal immediately followed acquisition, and the transition between the stages was unsignaled. This stage consisted of 16 presentations of each of the CSs, intermixed with 7 additional presentations of the CS+ that coterminated with the US. We considered the first trial in which the previous CS− coterminated with the US as the beginning of the reversal stage (Fig. 1B).
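The ordering constraints described above can be illustrated with a short rejection-sampling sketch. This is not the authors' procedure: the function names, the trial labels, and the reading of "each kind" as "each face" are assumptions made for illustration only.

```python
import random

def face(trial):
    """Face shown on a trial; 'CS+' and 'CS+US' share one face (assumed labels)."""
    return "B" if trial == "CS-" else "A"

def is_valid(order, max_run=2):
    """Check both constraints: no back-to-back shocks, runs of one face <= max_run."""
    run = 1
    for prev, cur in zip(order, order[1:]):
        if prev == "CS+US" and cur == "CS+US":
            return False  # two reinforced trials in a row
        run = run + 1 if face(prev) == face(cur) else 1
        if run > max_run:
            return False  # too many consecutive trials showing the same face
    return True

def make_order(n_cs_plus, n_cs_minus, n_reinforced, seed=None, max_tries=1_000_000):
    """Shuffle until a valid order is found (simple, but not the only option)."""
    rng = random.Random(seed)
    trials = ["CS+"] * n_cs_plus + ["CS-"] * n_cs_minus + ["CS+US"] * n_reinforced
    for _ in range(max_tries):
        rng.shuffle(trials)
        if is_valid(trials):
            return list(trials)
    raise RuntimeError("no valid order found")

# Acquisition counts from the text: 12 CS+, 12 CS-, 6 reinforced CS+ trials.
acquisition = make_order(12, 12, 6, seed=0)
```

Rejection sampling keeps every valid order equally likely, at the cost of discarding most shuffles; a constructive generator would be faster but would need more care to avoid biasing the sequence.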
Mild shocks were delivered through a stimulating bar electrode attached with a Velcro strap to the subject's right wrist. A Grass Medical Instruments stimulator charged by a stabilized current was used, with cable leads that were magnetically shielded and grounded through an RF filter. The subjects were asked to set the level of the shock themselves using a work-up procedure before scanning. In this procedure, a subject was first given a very mild shock (10 V, 200 ms, 50 pulses/s), which was gradually increased to a level the subject indicated as “uncomfortable, but not painful” (with a maximum level of 60 V). Skin conductance was assessed with shielded Ag-AgCl electrodes, filled with standard NaCl electrolyte gel, and attached to the middle phalanges of the second and third fingers of the left hand. The electrode cables were grounded through an RF filter panel. The skin conductance signal was amplified and recorded with a BIOPAC Systems skin conductance module connected to a Macintosh computer (Apple Computers). Data were continuously recorded at a rate of 200 samples per second. An off-line analysis of the analog skin conductance waveforms was conducted with AcqKnowledge software (BIOPAC Systems).
The level of skin conductance response was assessed for each trial as the peak-to-peak amplitude difference in skin conductance of the largest deflection (in microsiemens) in the 0.5–4.5 s latency window after stimulus onset. The minimal response criterion was 0.02 μS. Responses below this criterion were encoded as zero. The raw skin conductance scores were square root transformed to normalize the distributions, and scaled according to each subject's mean square-root-transformed US response.
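The scoring steps above can be sketched as follows. This is an illustrative reading, not the AcqKnowledge analysis itself: the function names are hypothetical, and "largest deflection" is taken here to mean the largest rise from a trough to a later peak within the window.

```python
import math

def largest_deflection(samples, onset_index, rate_hz=200, window=(0.5, 4.5)):
    """Largest trough-to-later-peak rise (in microsiemens) inside the latency window."""
    start = onset_index + int(round(window[0] * rate_hz))
    stop = onset_index + int(round(window[1] * rate_hz))
    segment = samples[start:stop + 1]
    best = 0.0
    trough = segment[0]
    for value in segment:
        trough = min(trough, value)        # lowest point seen so far
        best = max(best, value - trough)   # largest rise above that point
    return best

def transform(amplitude, min_response=0.02):
    """Zero out sub-criterion responses, then apply the square-root transform."""
    return math.sqrt(amplitude) if amplitude >= min_response else 0.0

def normalize(cs_amplitudes, us_amplitudes, min_response=0.02):
    """Scale transformed CS responses by the mean transformed US response."""
    us_mean = sum(transform(a, min_response) for a in us_amplitudes) / len(us_amplitudes)
    return [transform(a, min_response) / us_mean for a in cs_amplitudes]
```

For example, `normalize([0.01, 0.16], [0.64, 0.64])` treats the first response as zero because it falls below the 0.02 μS criterion, and expresses the second as a fraction of the subject's mean transformed US response.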
Neuroimaging acquisition and analysis.
A 3T Siemens Allegra head-only scanner and Siemens standard head coil (Siemens) were used for data acquisition. Anatomical images were acquired using a T1-weighted protocol (256 × 256 matrix, 176 1-mm sagittal slices). Functional images were acquired using a single-shot gradient echo EPI sequence (TR = 2000 ms, TE = 25 ms, FOV = 192 mm, flip angle = 75°, bandwidth = 4340 Hz/px, echo spacing = 0.29 ms). Thirty-nine contiguous oblique-axial slices (3 × 3 × 3 mm voxels) parallel to the AC-PC line were obtained. Analysis of the imaging data was conducted using the BrainVoyager QX software package (Brain Innovation). Functional imaging data preprocessing included motion correction, slice scan time correction (using sinc interpolation), spatial smoothing using a three-dimensional Gaussian filter (4 mm FWHM), and voxelwise linear detrending and high-pass filtering of frequencies above three cycles per time course. One subject with motion >2 mm was not included in the analysis.
A random-effects general linear model analysis was conducted on the fMRI signal during the reversal task with separate predictors for each trial type (face A, face B) at each of four phases: early and late acquisition and early and late reversal (see below). We used separate predictors for trials terminating with a shock. This resulted in 10 box-car predictors corresponding to the length of each trial (4 s), which were convolved with a standard canonical hemodynamic response function. Structural and functional data of each participant were transformed to standard Talairach stereotaxic space (Talairach and Tournoux, 1988). For each region of interest (ROI), we compared the differential mean blood-oxygenation level-dependent (BOLD) responses to the predictive versus nonpredictive stimuli at each phase. These analyses were conducted on the mean percent BOLD signal change at the observed peak activation (4 ± 2 s after stimulus offset) compared with baseline (the mean BOLD response during the last 4 s of the ITI).
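To make the construction of a box-car predictor concrete, the sketch below builds one for a single trial type and convolves it with a double-gamma response. It is only an illustration: the text specifies a "standard canonical" hemodynamic response function without giving its parameters, so the shape values used here (6, 16, and a 1/6 undershoot ratio) are assumed defaults, and the onsets are made up.

```python
import math

TR = 2.0  # seconds per volume, as reported in the acquisition parameters

def hrf(t, peak_shape=6.0, under_shape=16.0, under_ratio=1.0 / 6.0):
    """Double-gamma response at time t (s); the shape values are assumptions."""
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    under = t ** (under_shape - 1) * math.exp(-t) / math.gamma(under_shape)
    return peak - under_ratio * under

def boxcar(onsets, duration, n_volumes):
    """1.0 for volumes acquired while the stimulus is on screen, else 0.0."""
    box = [0.0] * n_volumes
    for onset in onsets:
        for v in range(n_volumes):
            if onset <= v * TR < onset + duration:
                box[v] = 1.0
    return box

def convolve(box, kernel_length=32.0):
    """Discrete convolution of the box-car with the response sampled every TR."""
    kernel = [hrf(k * TR) for k in range(int(kernel_length / TR))]
    return [
        sum(box[v - k] * kernel[k] for k in range(len(kernel)) if v - k >= 0)
        for v in range(len(box))
    ]

# Hypothetical onsets: 4 s stimuli separated by a 12 s ITI, i.e., one every 16 s.
onsets = [0.0, 16.0, 32.0]
predictor = convolve(boxcar(onsets, 4.0, 30))
```

In a full design matrix, one such column would be built per predictor (trial type by phase, plus the shock-terminated trials) and entered into the general linear model.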
In a complementary analysis, a different general linear model design was used to investigate BOLD activation related to errors in fear predictions, in a whole-brain analysis. A temporal difference learning model was used to generate a fear prediction error regressor. For each trial we defined two time points ($t$), one at the time of the cue (CS+ or CS−) onset and another at the time of its offset. This resulted in four states $s_t$ (two time points for two cues), each with a corresponding predictive value $V(s_t)$. At each time point, the prediction error ($\delta_t$) was defined as the difference between two consecutive value predictions: $\delta_t = r_t + V(s_t) - V(s_{t-1})$, where $r_t$ represents the outcome at every time point, i.e., shock delivery ($r_t = 1$ for shock and $r_t = 0$ for no shock). On the basis of this prediction error, the previous state value predictions were updated according to: $V(s_{t-1}) = V(s_{t-1}) + \eta\delta_t$, where $\eta$ is the learning rate. The learning rate itself was decreased after every trial according to $\eta_{\text{new}} = \alpha\eta_{\text{old}}$. The parameters of this temporal difference learning model were an initial value $V_{\text{init}}$ for the two CSs, a learning rate $\eta_{\text{acq}}$ for the acquisition phase, a learning rate decay term $\alpha$ (which allowed learning to decrease over time), and a learning rate $\eta_{\text{rev}}$ for the reversal phase (which allowed the detection of change to boost the decayed learning rate again). To fit these four parameters, we assumed that the skin conductance response at the time of the CS is linearly related to the prediction error at that time (i.e., that it is linearly related to the predictive value of the CS).
We thus used linear regression to estimate the scaling of the prediction error for each subject (including in this regression terms for the baseline skin conductance response and a linear drift), and used the residual sum of squared errors from nonreinforced trials only (as in the reinforced trials the skin conductance response was overwhelmed by the response to the shock) as a measure of goodness of fit. Pooling data over subjects, we fit one set of parameters by minimizing the total sum of squared errors. These were: $V_{\text{init}} = 0.69$, $\eta_{\text{acq}} = 0.23$, $\eta_{\text{rev}} = 0.16$, and $\alpha = 0.91$. The final design matrix for this analysis included, in addition to the prediction error regressor, four additional regressors accounting for the occurrence of CS+ onsets, CS− onsets, trial terminations with US, and trial terminations with no US.
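The prediction error computation described above can be sketched in a few lines. This is a simplified illustration rather than the authors' implementation: it assumes that the state before CS onset and the state at CS offset both carry a value of zero, so that only one value per cue is learned, and it assumes the learning rate is simply set to the reversal rate on the first reversal trial. The function name and trial encoding are hypothetical; the default parameters are the fitted values reported in the text.

```python
def td_prediction_errors(trials, reversal_start, v_init=0.69, eta_acq=0.23,
                         eta_rev=0.16, alpha=0.91):
    """Return per-trial (onset, offset) prediction errors and the final cue values.

    trials: list of (cue, shock) pairs, cue in {"A", "B"}, shock 0 or 1.
    reversal_start: index of the first reversal trial.
    """
    value = {"A": v_init, "B": v_init}
    eta = eta_acq
    errors = []
    for i, (cue, shock) in enumerate(trials):
        if i == reversal_start:
            eta = eta_rev  # assumed: rate is reset once when reversal begins
        # Onset: r = 0 and the preceding state is assumed to have value 0,
        # so the error equals the current predictive value of the cue.
        delta_onset = value[cue]
        # Offset: the following state is assumed to have value 0,
        # so the error is the outcome minus the predicted value.
        delta_offset = shock - value[cue]
        value[cue] += eta * delta_offset  # update the previous state's value
        eta *= alpha                      # learning rate decays after every trial
        errors.append((delta_onset, delta_offset))
    return errors, value
```

Under these assumptions a single reinforced trial, `td_prediction_errors([("A", 1)], reversal_start=1)`, produces an onset error of 0.69 and an offset error of 0.31. The per-subject regression of skin conductance on the prediction error and the parameter search are not shown here.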
Physiological assessment of fear discrimination and reversal
The results of the skin conductance analysis are presented in Figure 2A. To assess expectations for the aversive outcome separated from unconditioned responses to the shocks themselves, we included only nonreinforced trials of CS+ in this analysis. To assess the development of learning over trials, we defined the first half of the acquisition trials as early acquisition and the second half as late acquisition (six nonreinforced trials each). Similarly, we defined the first half of the reversal trials as early reversal and the last half as late reversal (eight nonreinforced trials each).
As expected, there was a significantly greater skin conductance response to the CS+ compared with the CS− during both early and late acquisition (paired two-way t tests; t(16) = 3.99, p < 0.001; t(16) = 6.06, p < 0.0001, respectively). When reinforcement contingencies were initially reversed (early reversal), there was a nonsignificant (NS) difference in skin conductance responses to the two stimuli (t(16) = −0.85, NS). However, by late reversal, a significantly greater differential skin conductance response to the new CS+ versus the new CS− was observed (t(16) = −7.23, p < 0.0001). A three-way ANOVA with main factors of stimulus (face A, face B), stage (acquisition, reversal), and phase (early, late) revealed a significant stimulus × stage × phase interaction (F(1,9) = 14.53, p < 0.01). Bonferroni corrected post hoc t tests comparing the difference in skin conductance response between CS+ and CS− at each stage showed a significant difference in all stages (p < 0.001) except for early reversal. These results confirm that fear learning occurred (responses to face A were stronger than to face B during acquisition) and that it was successfully reversed (responses to face B were stronger than to face A during reversal).
Analysis of neuroimaging data
Reversal of fear
Our main objective was to examine neural responses during the reversal stage. Previous fear learning studies have shown that responses in the ventromedial prefrontal cortex (vmPFC) are stronger to safe stimuli than to the fear-predictive stimulus (Phelps et al., 2004; Kalisch et al., 2006; Milad et al., 2007). We expected the same pattern to emerge during reversal and therefore used a contrast of new CS− > new CS+ in late reversal. We examined regions on the statistical map showing a significant response (false discovery rate <0.05). As in the physiological analyses, we included only nonreinforced trials of CS+ to assess expectations for the aversive outcome separated from the unconditioned responses. This contrast revealed robust activation in an extensive region of the vmPFC only (Fig. 2B).
To fully characterize the pattern of responding in the vmPFC and perform statistical comparisons on the BOLD signal to the different stimuli, we extracted the mean BOLD responses in all vmPFC voxels (2532 mm3) that emerged on the statistical map. Figure 2C presents the mean differential percent BOLD signal change in response to the CS+ versus CS− in the different phases. Separate examination of early and late phases within acquisition and reversal allowed us to detect gradual changes in BOLD responses. Interestingly, during acquisition, the nonpredictive cue (CS−, face B) elicited stronger responses in the vmPFC compared with the predictive cue (CS+, face A), in both early and late acquisition (t(16) = −2.63, p < 0.05; t(16) = −2.99, p < 0.01, respectively). When reinforcement contingencies were initially reversed (early reversal), there was no significant difference in responding to the two stimuli (t(16) = 0.57, NS). As expected, by late reversal, there was a significantly greater differential BOLD response (t(16) = 5.69, p < 0.001) to the new CS− (face A) versus the new CS+ (face B), which was the criterion for selecting this ROI. A three-way ANOVA with main factors of stimulus (CS+, CS−), stage (acquisition, reversal) and phase (early, late), revealed a significant stimulus × stage × phase interaction (p < 0.01). Bonferroni post hoc t tests comparing the difference in BOLD response between CS+ and CS− at each stage showed a significant difference in early (p < 0.05) and late acquisition (p < 0.01), and late reversal (p < 0.001).
Next, we sought to assess the differences in vmPFC responding during acquisition and reversal. To this aim, we used a conjunction analysis in which the resulting statistical activation map is conditioned on significant responding to two contrasts: CS− > CS+ in late acquisition and new CS− > new CS+ in late reversal. As expected, this analysis revealed activation only in the vmPFC. We extracted the mean BOLD response at peak activation (false discovery rate <0.05; x, y, z = 3, 32, −7) and compared the differential responding between the CS+ and CS− in acquisition to the differential responding between these stimuli in reversal (Fig. 3). This analysis revealed a significantly larger difference in reversal compared with acquisition (t(16) = 1.76, p < 0.05). In addition, responses to the new CS− in reversal were higher, compared not only with the new CS+ at this stage, but also with the old CS− in acquisition (t(16) = 1.97, p < 0.05). In contrast, responses to the old CS+ and the new CS+ were similarly decreased in the two stages (t(16) = 0.67, NS). In other words, the vmPFC dissociated the nonpredictive stimulus in reversal from the nonpredictive stimulus in acquisition, whereas the predictive stimuli in these stages were encoded in a similar manner. These results show that the selective activation of the vmPFC in reversal was driven by responses to the no longer predictive stimulus, that is, the face stimulus that switched from being threatening to being safe.
Finally, to examine overlap in the neural mechanisms of extinction and reversal, we performed an ROI analysis by applying a vmPFC ROI previously identified in an extinction data set (Phelps et al., 2004), and extracting the BOLD signal during the reversal task. Although these voxels were selected on the basis of a separate extinction data set, the pattern of BOLD response seen in these voxels (supplemental Fig. 1, available at www.jneurosci.org as supplemental material) is consistent with the results reported above (Fig. 3), with stronger responses to the new CS− compared with the old CS−.
Aversive value and prediction error
Our second objective was to explore brain regions tracking the predictive value of the stimuli throughout the task. To this end, we first used a contrast of CS+ > CS− in early acquisition to extract regions of interest, and examined their differential responding to the stimuli in subsequent stages. Again, we excluded CS+ trials coterminating with a US from this analysis. Regions on the statistical map showing a significant response (false discovery rate <0.05) and their differential BOLD response in each stage are summarized in Table 1.
Given the prominent role of the striatum and the amygdala in the processing of motivationally significant stimuli (Cardinal et al., 2002; Phelps and LeDoux, 2005; Balleine et al., 2007; Delgado, 2007), we focused on these regions in our subsequent analysis, although additional regions implicated in emotion and arousal might also contribute (Table 1). Figure 4A presents the mean differential percent BOLD signal change to the CS+ versus CS− in the different phases. Striatal responses (left and right caudate; Fig. 4B) were stronger to the CS+ versus CS− in early acquisition (t(16) = 2.69, p < 0.01), which was the criterion for selecting this ROI. This difference was also seen in late acquisition (t(16) = 5.50, p < 0.001), and responses were stronger to the new CS+ versus new CS− in late reversal (t(16) = −3.36, p < 0.01). A three-factor ANOVA (stimulus, phase, stage) revealed a significant stimulus × stage × phase interaction (F(1,9) = 10.35, p < 0.01). Bonferroni post hoc t tests comparing the difference in BOLD response between CS+ and CS− at each stage showed a significant difference in early (p < 0.05) and late acquisition (p < 0.001), and late reversal (p < 0.01).
To reveal amygdala BOLD responses (Fig. 4C), we used the contrast of CS+ > CS− in early acquisition with a slightly more liberal threshold (p < 0.005, uncorrected), consistent with previous fear conditioning studies (Büchel et al., 1998; LaBar et al., 1998). Figure 4A presents the mean differential percent BOLD signal change to the CS+ versus CS− in the different phases. In addition to the differential responding to the CS+ versus CS− in early acquisition, amygdala responses were reversed in late reversal, showing stronger responses to the new CS+ versus new CS− (t(16) = −1.85, p < 0.05). A three-factor ANOVA (stimulus, phase, stage) revealed a significant main effect of stimulus (F(1,9) = 5.06, p < 0.05) and a significant stimulus × stage interaction (F(1,9) = 8.04, p < 0.05).
Thus, both the striatum and the amygdala showed stronger responses to the CS+ versus CS− in acquisition and flipped those responses in reversal. These results suggest that both regions track the predictive aversive value of the stimuli throughout the task. Reinforcement learning theories suggest that learning occurs when outcomes deviate from our expectations. The value of predictive stimuli is continuously updated based on these prediction errors (Rescorla and Wagner, 1972). This was the basis for the temporal difference learning model (Sutton and Barto, 1990) that has been successful in accounting for electrophysiological and imaging data from Pavlovian and instrumental conditioning (McClure et al., 2003; Montague et al., 1996; O'Doherty et al., 2003b; Schultz et al., 1997). Accordingly, in a second analysis targeting regions that track predictive value, we examined BOLD activation related to the errors in fear predictions. For this we used the temporal difference learning model to generate a prediction error regressor. The statistical activation map corresponding to this regressor (false discovery rate <0.05), after accounting for all other events as effects of no interest, revealed the caudate (L: x, y, z = −7, 3, 9, 217 mm3; R: x, y, z = 9, 5, 8, 322 mm3), the dorsal anterior cingulate (x, y, z = −1, 1, 46, BA 32, 3277 mm3), the anterior insula (L: x, y, z = −34, 14, 9, 1801 mm3; R: x, y, z = 33, 20, 9, 569 mm3) and the thalamus (x, y, z = 12, 4, 9, 3874 mm3). Lowering the threshold (p < 0.005 uncorrected; minimal cluster size >100 mm3) did not reveal additional areas. These areas are similar to those that were found in the contrasts examining the differential aversive value of the CS+ and CS− above. However, whereas BOLD responses in both striatum and amygdala corresponded with aversive value in those contrasts (Fig. 4), temporal difference prediction errors were correlated only with striatal BOLD, in accord with previous studies (McClure et al., 2003; O'Doherty et al., 2003b, 2006; Knutson and Wimmer, 2007; Schönberg et al., 2007; Hare et al., 2008). We note that with this type of model-based analysis, we cannot reliably distinguish between prediction error signals and predicted value signals. Indeed, at the time of the CS the prediction error signal and the predicted value signal are equal and the only difference between them is that the error signal is presumed to be punctate (phasic) whereas the value signal is more sustained for the whole duration of the CS. A recent study (Hare et al., 2008) did try to separate the value signal and the prediction error signal using fMRI, but this was done by using a special experimental design aimed directly at teasing these signals apart. As this is not possible in a standard conditioning design such as ours, here we performed the prediction error analysis in addition to the more conventional “model free” CS+ versus CS− analysis, mainly to verify consistency with previous reports.
Finally, we examined whether, similar to the vmPFC, the striatum and the amygdala dissociated a naive CS− from a CS− that carries conflicting information. We found no difference between these stimuli in the striatum (t(16) = −0.82, NS) or in the amygdala (t(16) = −0.70, NS). However, the amygdala, striatum and vmPFC ROIs were defined on the basis of different contrasts, which might bias a comparison between them. That is, the voxels in the vmPFC were defined as those showing stronger responses to the new CS− in late reversal, whereas the voxels in the amygdala and striatum were defined as those showing weak responses to the CS− in early acquisition. To compare the BOLD responses of these regions under the same conditions, we defined new ROIs in these areas on the basis of their responses to a subset of the trials (all reinforced trials > fixation, false discovery rate <0.05), and then extracted the percent BOLD signal change from each region and examined the differential responding to the nonreinforced trials. Specifically, we subtracted the BOLD response to the safe stimulus in acquisition (CS−) from the responses to the safe stimulus in reversal (new CS−). The differential scores for each region are presented in Figure 5. This analysis confirmed that there was differential responding in the vmPFC, but the striatum and the amygdala did not dissociate these stimuli. Thus, it appears that the selective responding to a stimulus that was once threatening but no longer predicts an aversive outcome is unique to the vmPFC.
The present study provides a detailed analysis of the core processes underlying the reversal of predictive fear and safety reactions. We focused on the gradual development of the reversal, with particular emphasis on safety stimuli (CS−). We found a unique dissociation between a safety stimulus previously predictive of danger and a “naive” safety stimulus, with the former more strongly engaging the vmPFC. The initial fear response and its transference to a new stimulus were mediated through a widespread network, including the amygdala, the striatum, and the vmPFC, that flexibly readjusted fear responses after reversal.
Fear reversal versus fear extinction
Reversal and extinction are two linked interference paradigms of Pavlovian learning. Interference with initial fear learning is introduced by the conflicting information given in the subsequent reversal or extinction phase. In fact, extinction is a component of reversal learning, such that responses to one stimulus are extinguished whereas another stimulus acquires the predictive value (Bouton, 1993; Brooks and Bouton, 1993). Reversal, therefore, is a more demanding process because the extinction association is acquired and retrieved while fear is still present but targeted elsewhere. As such, reversal learning might be based on different causal inference than extinction, and necessitates selective and accurate responding under stressful conditions. Understanding reversal learning is potentially relevant to the treatment of clinical fear disorders such as post-traumatic stress disorder (PTSD), because it may serve as a tool to study the inappropriate control of fear in anxiety disorders. The added value of this paradigm is in allowing an examination not only of how fear responses are diminished, but also of how they are appropriately maneuvered from one predictive stimulus to another without developing either a generalized fear response or perseveration of fear.
Although both reversal and extinction consist of a shift from fear to safety, only the reversal paradigm enables a direct comparison to the opposite shift, from safety to fear. This comparison is of interest because it allows examination of how specific fear responses are decreased while others are acquired, as opposed to an overall reduction in fear. Our results clearly show that under such conditions, neural responses to a safety stimulus learned in acquisition (CS−) are different from responses to a safety stimulus learned in reversal (new CS−). Importantly, such dissociation could not be revealed by extinction because the two stimuli would have been compared under different conditions of fear (present versus not present). We found that these stimuli are uniquely dissociated in the vmPFC, which showed stronger responses to a safety stimulus that previously predicted danger compared with the naive CS−.
Interestingly, vmPFC responses to the fear-predictive stimuli were similar in the two stages, and we could not differentiate a naive CS+ from a CS+ that carried conflicting information (previously safe but now predictive of danger). In both cases, the vmPFC showed decreased responding compared with the nonpredictive stimuli. Such decreases to a CS+ versus CS− are typically seen during fear conditioning, and are followed by increased CS+ responses during extinction (Phelps et al., 2004; Kalisch et al., 2006; Milad et al., 2007). The present results provide evidence that these increases are selective to an extinguished CS+, rather than the result of a general reduction in fear arousal. This specificity is indicated by the fact that increased responses to the new CS− (which is equivalent to an extinguished CS+) were accompanied by decreased responses to the new CS+, mirroring acquisition of fear.
We propose two possible roles, which are not mutually exclusive, for the vmPFC in fear reversal. One role might be to provide a selective safety signal while fear responses are still being elicited. By inhibiting the fear response to one stimulus, the vmPFC may facilitate the transference of this response to the currently predictive stimulus. In essence, the vmPFC is not generally signaling that it is “safe to let your guard down,” but rather is signaling which particular stimulus in the environment is safe to ignore. Impairments in such selective fear inhibition might lead to a generalized fear response on the one hand, or to perseverative fear responses on the other hand (Morgan and LeDoux, 1993).
Another role might be to provide a reward signal associated with the omission of the aversive outcome to the new CS− in reversal. It could be argued that a naive CS− is encoded as irrelevant, thus not eliciting reward-related activation, whereas the omission of an aversive US from the new CS− confers rewarding properties. Consistent with this idea, the vmPFC has been shown to increase activation in response to reward outcomes and reduce activation in response to punishment or reward omission (O'Doherty et al., 2001, 2003a; Gottfried et al., 2002; Hampton et al., 2007). An alternative possibility is that any safe stimulus, regardless of its past, might engage inhibitory mechanisms or even be considered rewarding after reversal has occurred. Examining the neural response to a second CS− that does not change roles during the experiment might be informative in this respect: according to this hypothesis, the vmPFC should become more active in response to this stimulus after reversal.
Aversive predictive value and prediction errors
Similar to the vmPFC, the amygdala and the striatum also discriminated the CS+ from CS− throughout the task, albeit in the opposite direction. During acquisition, these areas showed increased responses to the CS+ compared with the CS−. In reversal, these regions increased responding to the new CS+ and reduced their responding to the new CS−. Thus, a complete reversal of neural activation mirrored the reversal in skin conductance responses, our behavioral index of fear. Unlike the vmPFC, these regions did not dissociate a naive CS− from a CS− that carried conflicting information (Fig. 5).
Striatal activation was also correlated with prediction errors in the reversal task. There is accumulated evidence linking striatal BOLD responses with temporal difference prediction errors for rewards (McClure et al., 2003; O'Doherty et al., 2003b, 2006; Knutson and Wimmer, 2007; Schönberg et al., 2007; Hare et al., 2008). The present finding adds to the growing body of evidence supporting the role of this structure in temporal difference prediction error for aversive outcomes as well (Ploghaus et al., 2000; Seymour et al., 2004; Jensen et al., 2007; Menon et al., 2007). Although striatal activation has been observed in aversive learning paradigms in humans (LaBar et al., 1998; Ploghaus et al., 2000; Jensen et al., 2003, 2007; Phelps et al., 2004; Seymour et al., 2004; Menon et al., 2007) and animals (Horvitz, 2000; Schoenbaum and Setlow, 2003; Pezze and Feldon, 2004), the role of this region in aversive learning is only beginning to be understood (McNally and Westbrook, 2006). The present study provides robust evidence for the role of the striatum in fear predictions and their associated errors, as well as in the flexible reversal of predictive fear learning.
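To make the notion of a prediction error in a reversal task concrete, the following is a minimal illustrative sketch and not the model fitted in this study. It assumes a trial-level simplification in which each trial is a single step, so the temporal difference error reduces to the outcome minus the current expected value of the presented cue. The function name, the cue labels "A" and "B", the learning rate, the reinforcement probability, and the trial counts are all hypothetical choices made for illustration.

```python
import random


def simulate_reversal(n_acq=30, n_rev=30, alpha=0.3, p_shock=1 / 3, seed=0):
    """Trial-level prediction-error learning for two cues across a reversal.

    All parameter values are illustrative assumptions, not values from the study.
    Cue "A" predicts the aversive outcome in acquisition; cue "B" in reversal.
    """
    rng = random.Random(seed)
    value = {"A": 0.0, "B": 0.0}  # expected aversive outcome for each cue
    history = []
    for trial in range(n_acq + n_rev):
        phase = "acquisition" if trial < n_acq else "reversal"
        danger_cue = "A" if phase == "acquisition" else "B"
        cue = "A" if trial % 2 == 0 else "B"  # simple alternation of the cues
        # The outcome is delivered only on a fraction of danger-cue trials.
        shock = 1.0 if (cue == danger_cue and rng.random() < p_shock) else 0.0
        delta = shock - value[cue]  # prediction error for this trial
        value[cue] += alpha * delta  # update the expectation for this cue
        history.append((trial, phase, cue, shock, delta, value[cue]))
    return value, history


if __name__ == "__main__":
    final_values, trials = simulate_reversal()
    for trial, phase, cue, shock, delta, v in trials[28:34]:
        print(f"{trial:2d} {phase:11s} {cue} shock={shock:.0f} "
              f"delta={delta:+.2f} value={v:.2f}")
```

Under these assumptions, the first reversal trials produce a negative error for the formerly dangerous cue (an expected outcome is omitted) and a positive error for the formerly safe cue (an unexpected outcome occurs), after which the two expected values gradually swap.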
In addition to the striatum, responses in other regions, including the dorsal anterior cingulate and anterior insula, also correlated with prediction errors. These findings are consistent with previous reports using aversive learning (Seymour et al., 2004; Menon et al., 2007) and may point to interesting differences between aversive and appetitive prediction errors. However, amygdala BOLD responses were not significantly correlated with prediction errors in our task. Two recent studies found that the amygdala has a role in signaling appetitive (Seymour et al., 2005) and aversive (money loss) prediction errors (Yacubian et al., 2006). However, a recent study of electrophysiological responses in the primate amygdala could not disentangle prediction error–related signals from a number of other signals such as CS value, stimulus valence, and US-selective responses (Belova et al., 2007). Thus, the exact computation performed by amygdala neurons while learning about aversive consequences is currently unclear.
Nevertheless, the amygdala appears to have an important role in initial acquisition of fear, as seen by the more robust activation in early compared with late acquisition. In the later phase, the differential responding to the CS+ versus CS− was reduced. This finding is consistent with previous reports that CS+ evoked amygdala activation decreases over time (Quirk et al., 1997; Büchel et al., 1998; LaBar et al., 1998; Büchel and Dolan, 2000). It might also be related to the lack of correlation with the prediction error signal, because the temporal difference model predicts increased differentiation between the stimuli over time. Here we show that despite this decrease, the amygdala also flexibly readjusts its responding after reversal, allowing for the opposite differential responding to emerge.
Different types of reversal
Although very little is known about reversal of Pavlovian fear conditioning, the neural mechanisms underlying the reversal of instrumental responses driven by aversive outcomes have been more thoroughly investigated, implicating the lateral region of the ventral PFC (Cools et al., 2002; O'Doherty et al., 2003a; Schoenbaum and Setlow, 2003; Morris and Dolan, 2004; Rolls, 2004; Evers et al., 2005). Increased activation in this area has also been associated with punishment, reward omission, and response switch (Schoenbaum et al., 1998, 1999, 2000; O'Doherty et al., 2001, 2003a). It is possible that aversive instrumental and Pavlovian reversals might be dissociated in the lateral and medial regions of the ventral PFC, respectively. The former may mediate inhibition of instrumental responses, whereas the latter may mediate inhibition of physiological fear reactions. However, there are other fundamental differences between these studies. For example, here, the reversal was between aversive and neutral associations, whereas previous studies shifted between appetitive and aversive associations. Those studies also used serial reversals, which might engage higher-order rule learning. Thus, additional studies are required to elucidate the differential contribution of these two regions to reversal learning.
In sum, the present study provides a first detailed analysis of the components of reversal learning in humans, with a particular focus on safety stimuli. We found evidence for the unique contribution of the vmPFC to inhibition of fear under adverse conditions, in which fear is not diminished but rather needs to be properly assigned and controlled. These findings are important for understanding the neural dysfunctions leading to the inappropriate control of fear associated with anxiety disorders.
This work was supported by a Seaver Foundation grant to the Center for Brain Imaging (CBI), National Institutes of Health (NIH) Grants R01 K05 MH067048 and P50 MH58911 (J.E.L.), a James S. McDonnell Foundation grant and NIH Grant R21 MH072279 (E.A.P.), a Human Frontiers Science Program fellowship (Y.N.), and a Fulbright award (D.S.). We thank Mauricio Delgado for fruitful discussions and comments, and Keith Sanzenbach and the CBI at New York University for technical assistance. We also thank Kenji Doya and the members of Okinawa Computational Neuroscience Course 2005.
Correspondence should be addressed to Dr. Elizabeth A. Phelps, Department of Psychology, 6 Washington Place, Room 863, New York, NY 10003.