Functional imaging studies of cued fear conditioning in humans have mostly confirmed findings in animals, but it is unclear whether the brain mechanisms that underlie contextual fear conditioning in animals are also preserved in humans. We investigated this issue using functional magnetic resonance imaging and virtual reality contexts. Subjects underwent differential context conditioning in which they were repeatedly exposed to two contexts (CXT+ and CXT−) in semirandom order, with contexts counterbalanced across participants. An unsignaled footshock was consistently paired with the CXT+, and no shock was ever delivered in the CXT−. Evidence for context conditioning was established using skin conductance and anxiety ratings. Consistent with animal models centrally implicating the hippocampus and amygdala in a network supporting context conditioning, CXT+ compared with CXT− significantly activated right anterior hippocampus and bilateral amygdala. In addition, context conditioning was associated with activation in posterior orbitofrontal cortex, medial dorsal thalamus, anterior insula, subgenual anterior cingulate, and parahippocampal, inferior frontal, and parietal cortices. Structural equation modeling was used to assess interactions among the core brain regions mediating context conditioning. The derived model indicated that medial amygdala was the source of key efferent and afferent connections including input from orbitofrontal cortex. These results provide evidence that similar brain mechanisms may underlie contextual fear conditioning across species.
When animals or humans experience repeated pairings of a neutral conditional stimulus (CS) (e.g., tone) and an unconditional stimulus (US) (e.g., footshock), they subsequently display fear responses to the CS and the context in which the US occurred (Kim and Fanselow, 1992; Phillips and LeDoux, 1992; Grillon and Davis, 1997). This latter form of learning is known as contextual fear conditioning, and it occurs whether the US is paired with a CS, unpaired with a CS in the context, or administered in the absence of any CS (Phillips and LeDoux, 1994). It is generally agreed that the hippocampus and amygdala are central to the acquisition and short-term expression of contextual conditioned fear responses in rodents (Rudy et al., 2004). It is unknown, however, whether similar brain structures support context conditioning in humans. The current study addresses this issue using functional magnetic resonance imaging (fMRI) and virtual spatial contexts.
Context conditioning is enhanced by manipulations that increase the temporal unpredictability of the US. Shocks that are presented alone (Fanselow, 1980) or unpaired with a CS (Calandreau et al., 2005, 2006; Grillon et al., 2006) produce greater context conditioning than shocks that are signaled through CS–US pairing. Moreover, animals and humans are more likely to avoid contexts in which shocks are unpredictable than predictable (Odling-Smee, 1975; Grillon et al., 2006). These findings indicate that unpredictable aversive events produce greater context conditioning than comparable predictable events. The use of predictable CSs may be a reason why previous studies in humans have not reported hippocampal involvement in context conditioning (but see LaBar and Phelps, 2005). Therefore, to maximize context conditioning and our ability to detect hippocampal neural activity, this study presents the US in the absence of any CS.
Previous neuroimaging studies of classical conditioning have focused on fear conditioning to discrete CSs (Buchel et al., 1998; LaBar et al., 1998) or extinction (Phelps et al., 2004; Kalisch et al., 2006; Milad et al., 2007a). Imaging studies have implicated the amygdala, ventromedial prefrontal cortex (Phelps et al., 2004), and hippocampus (Milad et al., 2007a) in extinction recall, providing evidence in line with animal research (Bouton et al., 2006) that this network contributes to contextual modulation of conditioned responses. However, previous imaging studies have not examined context conditioning per se. The standard view of context conditioning is that context information is encoded by the hippocampus and converges with information about the US in the amygdala complex (Maren, 2001). Neural plasticity within this complex supports the learning of a context–US association and allows fear to be triggered through various pathways (Davis, 1998). Beyond the hippocampus, anatomical studies in nonhuman primates implicate the orbitofrontal cortex as a potential source of context information to the amygdala (Stefanacci and Amaral, 2002). This work leads to the prediction that context conditioning in humans engages a network of regions encompassing the hippocampus, amygdala, and orbitofrontal cortex. The current study tests this hypothesis using standard analyses as well as structural equation modeling to model effective connectivity among brain regions.
Materials and Methods
Thirteen healthy volunteers (six females; mean age, 24.2 years; SD, 2.7 years) participated in the study and gave written informed consent approved by the National Institute of Mental Health Human Investigation Review Board. Inclusion criteria were as follows: (1) no past or current psychiatric disorders as per Structured Clinical Interview for DSM-IV (First et al., 1995); (2) no medical condition that interfered with the objectives of the study as established by a physician (e.g., tachycardia); and (3) no use of illicit drugs or psychoactive medications as per urine screen. In addition to the 13 subjects included in the study, two subjects were excluded because equipment failure resulted in no psychophysiological recording, and one subject was excluded because of failure to show acquisition of the conditioned response as assessed by skin conductance and anxiety ratings. As a result of shock-related movement, eight additional subjects were excluded because of excessive head motion during scanning (i.e., >2 mm displacement in any direction).
The software application (VR Worlds; Psychology Software Tools) consisted of several complex virtual reality (VR) environments. Two of these environments, a house and an airport, served as the contexts in the present study (Fig. 1B). In addition, a static VR image of the outdoors (a sky and tree scene) was presented during each intertrial interval (ITI) to signal a rest period between context presentations. According to two-process models (Rudy et al., 2004), contextual conditioning occurs as a result of context encoding in the hippocampus and interactions with the amygdala, or via direct encoding of specific environmental features by the amygdala. Representations of the context are thought to be hippocampal-dependent, but specific feature representations are not. It is commonly held that context representations are formed during contextual conditioning when the hippocampus integrates disparate features of the environment into a new representation, a process thought not to occur in the amygdala (Moses and Ryan, 2006). Several steps were taken to maximize the probability that subjects would bind the independent features of the VR environments into hippocampal-dependent context representations. First, master recordings of the house and airport were created by having the same individual navigate each environment. Spatial navigation was performed from a first-person perspective, and conducted so as to avoid duplication of any path taken, ensuring that when subjects were shown any segment of the recordings, they would be exposed to a wide array of background features in each context. Studies in rodents suggest that such environmental complexity is important in inducing hippocampal-dependent processing of contextual stimuli (Moses et al., 2007). Studies in humans using virtual reality navigation confirm this suggestion (Maguire et al., 1998; Pine et al., 2002). Each master recording was then divided into distinct scenarios of each context. 
Each scenario lasted 28 s and was arranged into one of three runs (see below, Contextual conditioning paradigm). During each scenario, subjects entered the house or the airport at one of several locations, and continuously toured the context until transitioning to rest. Because the scenarios were prerecorded, and passively viewed on a screen during scanning, subjects had no control of where they went in a context, or when a shock would occur. Because entry into each context occurred at several locations within the environment, it is unlikely that the onset of a context would be associated with perception of one or another specific contextual feature. Context onset occurred over the course of 1 s during which the context emerged from a black background that separated the rest period from the following context presentation. This gradual transition at context onset prevented unwanted orienting responses. Previous research in our laboratory, using this same methodology, has shown that VR contexts overshadow the surrounding experimental setting and are effective in studying contextual conditioning (Grillon et al., 2006; Alvarez et al., 2007).
Contextual conditioning paradigm
A differential conditioning paradigm was used (Fig. 1A). For 7 of the 13 subjects, the house and the airport were the CXT+ and the CXT−, respectively. Context assignment was reversed for the remaining subjects (n = 6) so as to counterbalance across subjects which environment was associated with a shock. An unsignaled shock was consistently paired with the CXT+ and no shock was delivered during the CXT−. On average, the shock was delivered 16.5 s after the onset of the CXT+ (range, 4–24 s). Because shocks were administered at unpredictable times during the CXT+, and were never repeatedly paired with specific landmarks within the CXT+, this paradigm was expected to produce minimal conditioning to specific background features and substantial conditioning to the CXT+. Each context was presented for 28 s with an 18 s ITI. During each of three runs, each context was presented four times for a total of 12 CXT+ and 12 CXT−. The order of context presentation was semirandom with the limitation that no more than two CXT+ or two CXT− could be presented consecutively. The only instructions given to participants were that they would occasionally receive a shock while in a virtual indoor environment, and that they would never receive a shock while viewing the outdoor scene (i.e., during ITI), which they were to fixate until an indoor environment appeared. Participants were never told which of the two indoor environments would be associated with shock; this had to be learned through conditioning.
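The semirandom ordering constraint described above can be illustrated with a short Python sketch. This is not the procedure actually used to build the runs, which the text does not specify; it simply assumes rejection sampling within each run, and the function names `semirandom_order` and `longest_run` are hypothetical. Whether the constraint also applied across run boundaries is not stated, so the sketch treats each run independently.

```python
import random

def longest_run(labels):
    """Length of the longest stretch of identical consecutive labels."""
    best = current = 1
    for prev, cur in zip(labels, labels[1:]):
        current = current + 1 if cur == prev else 1
        best = max(best, current)
    return best

def semirandom_order(n_per_type=4, max_run=2, seed=None):
    """Shuffle CXT+/CXT- labels until no label repeats more than max_run times.

    Assumption: simple rejection sampling; the original ordering method
    is not described in the text.
    """
    rng = random.Random(seed)
    trials = ["CXT+"] * n_per_type + ["CXT-"] * n_per_type
    while True:
        rng.shuffle(trials)
        if longest_run(trials) <= max_run:
            return list(trials)

# Three runs, each with four CXT+ and four CXT- presentations.
runs = [semirandom_order(seed=run_index) for run_index in range(3)]
```

Each element of `runs` is one candidate presentation order; fixing the seed only makes the illustration reproducible.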
Throughout each run, left volar skin conductance was recorded on the sole of the left foot according to published recommendations (Prokasy and Ebel, 1967). Stimulation and recording were controlled by a commercial system (Contact Precision Instruments). Electric shocks up to 5 mA and 200 ms duration served as unconditional stimuli. The shocks were produced by a constant current stimulator and were administered on the right foot. Shock level was determined individually during a prescanning procedure involving the administration of one to three sample shocks. Subjects rated the shock level that was subsequently used in the study as moderately painful (mean, 3.38; SD, 0.23) based on a 1–5 scale (1, not at all; 3, moderately; 5, extremely). In addition, subjective anxiety ratings were obtained immediately after the last run. Subjects were asked to retrospectively rate their overall levels of anxiety in the house and airport on a 0–10 scale (0, not at all anxious; 5, moderately; 10, extremely).
Skin conductance responses were scored as the largest response initiated 1–5 s after context onset. The skin conductance response (SCR) was determined by subtracting the skin conductance level at the onset of the SCR from the peak skin conductance level. A log transformation (log[1 + SCR]) was performed to normalize the distribution, and the magnitudes were range corrected (SCR/SCRmax) for each subject to properly correct for interindividual variance. The skin conductance level (SCL) for the CXT+ was scored as the mean SCL during the 4 s window preceding the shock. Because 2 of the 12 shocks occurred within the first 8 s of the context, and SCL during the postshock period was not analyzed to avoid the confounding influence of the shock, SCL means were based on 10 context presentations. To avoid contamination from the SCR to context onset, all SCL measurement windows occurred 10 s postonset or later. Because no shock was ever delivered during the CXT−, the same SCL measurement windows that were used for the CXT+ were used for the CXT−. As with SCR, SCL scores were log-transformed and range-corrected to attain statistical normality and to reduce error variance (Lykken and Venables, 1971).
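As a rough illustration of the scoring steps above, the following Python sketch computes an SCR amplitude, applies the log(1 + SCR) transformation, and range-corrects within a subject. Several details are assumptions not given in the text: the natural logarithm is used, the range correction is applied after the log transformation, and the function names are hypothetical.

```python
import math

def scr_amplitude(onset_level, peak_level):
    """SCR = peak skin conductance level minus the level at response onset."""
    return peak_level - onset_level

def normalize_scr(raw_scrs):
    """Log-transform, then range-correct, one subject's SCR amplitudes.

    Assumptions: natural log, and division by that subject's largest
    log-transformed response (the text gives only log[1 + SCR] and SCR/SCRmax).
    """
    logged = [math.log(1.0 + scr) for scr in raw_scrs]
    largest = max(logged)
    if largest == 0:
        # No nonzero responses: nothing to scale by (a nonresponder).
        return [0.0 for _ in logged]
    return [value / largest for value in logged]

# Hypothetical amplitudes for one subject, in arbitrary conductance units.
scores = normalize_scr([0.0, 0.5, 1.0])
```

After this step every subject's scores lie between 0 and 1, which is what removes differences in overall response size between individuals.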
MRI data acquisition
A 3 tesla GE Signa MR scanner with an eight-channel receive-only brain array was used to acquire functional T2*-weighted echo-planar images (EPIs) with blood oxygen level-dependent (BOLD) contrast (repetition time, 2.0 s; echo time, 23 ms; flip angle, 90°; matrix, 128 × 128; field of view, 220 mm). Each volume consisted of 32 axial slices of 3.5 mm thickness and 1.7 × 1.7 mm² in-plane resolution. This slice prescription achieved near whole-brain coverage for each subject, extending uniformly from the base of the temporal lobes to a point well superior to the cingulate sulcus. Because coverage did not encompass the uppermost portions of the cortex in all subjects, no inferences were made regarding activations in the most dorsal cortical regions. However, in all subjects slice coverage included dorsal anterior cingulate and parietal regions, brain areas that have been previously implicated in cued fear conditioning (Milad et al., 2007b) and spatial and attentional processing (Colby and Goldberg, 1999), respectively. A total of 181 volumes were acquired during each run for a total of 543 volumes throughout the task. In addition, a whole-brain magnetization-prepared rapid-acquisition gradient echo (MPRAGE) anatomical scan (repetition time, 450 ms; echo time, minimum full; flip angle, 10°; matrix, 256 × 256; field of view, 240 mm; axial plane; slice thickness, 1 mm; 126 slices) was acquired for each subject for the purposes of aligning a reference EPI with each subject's anatomy. Foam pads were used to help prevent subject head movement during data acquisition. Presentation and timing of VR was achieved using the VR Worlds video editor (Psychology Software Tools) and PSYLAB (Contact Precision Instruments) on Dell laptops running Windows XP (Dell Computer).
Image preprocessing and realignment
Functional data were analyzed using the AFNI software package (Cox, 1996). The first six volumes were discarded to allow for T1 equilibrium effects. The remaining images were then corrected for slice timing offset and realigned to the EPI volume acquired most closely in time to the MPRAGE anatomical scan using a six-parameter rigid body transformation. Data were spatially smoothed using a 6 mm full width at half-maximum Gaussian kernel and then scaled by the mean of the time series at each voxel. The anatomical data were then transformed into Talairach–Tournoux standard space (Talairach and Tournoux, 1988) by manually aligning anterior commissure–posterior commissure and inferior–superior axes, and scaling to Talairach–Tournoux atlas brain size. Landmarks were placed on the high resolution MPRAGE anatomical dataset of each subject, and the same transformation was applied to each subject's EPI data after individual subject analysis. The quality of the alignment between the reference EPI and the anatomical dataset was verified for each subject. The data from the three runs were concatenated before individual subject analysis.
The a priori prediction was that after learning which context was associated with a shock, subjects would respond differentially to the shock context (CXT+) and no-shock context (CXT−). Therefore, planned comparisons were used to examine whether there was greater electrodermal activity to the CXT+ than to the CXT− during each of the three runs of the experiment. For each run (runs 1–3), pairwise comparisons between each stimulus type (CXT+, CXT−) were performed for both SCR and SCL data. To assess how SCRs and SCLs varied across time, stimulus type (CXT+, CXT−) by run (runs 1–3) repeated-measures ANOVAs were also performed. The sample size for SCR and SCL analyses was 12 because one subject was considered a nonresponder because of insufficient nonzero responses; however, the pattern of psychophysiological results did not change when the nonresponder was included. The subjective rating data were examined with a paired-samples t test. The value of α was set at 0.05 for all statistical tests, and all tests were two-tailed. The Greenhouse–Geisser correction was used for any statistical effect involving more than two levels.
Data were analyzed within the general linear model framework as implemented in AFNI. To give the shape of the hemodynamic response maximum flexibility, each stimulus type (CXT+ and CXT−) was modeled as the sum of piecewise linear B-spline basis functions, also called “tent” functions [Saad et al. (2006), their Fig. 1]. Six tent functions were used to cover each context from 4 to 24 s. To account for the delay in the hemodynamic response, the initial time point (0 s) was assumed to have a magnitude of zero. To minimize noise associated with context offset and the recovery of the BOLD response, two additional tent functions were used to span the 24–32 s postonset period. Thus, for each voxel the analysis resulted in eight amplitudes for each stimulus type. These amplitudes corresponded to eight time points (4, 8, 12, 16, 20, 24, 28, and 32 s) after context onset, and were each associated with a β coefficient and a t statistic. Because the time points at 28 and 32 s were treated as regressors of no interest, they are discussed no further. Additional regressors of no interest were used to model head movement, baseline drift, and the response to shocks. The latter was modeled with three basis functions: an SPM γ variate function, as well as derivatives for time and dispersion. The drifting effects (low frequency confounds) were modeled with a separate cubic polynomial for each run.
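The tent-function model can be made concrete with a minimal Python sketch. It is not the AFNI implementation; it only assumes, from the eight time points listed above, that the knots are spaced 4 s apart, and the names `tent`, `design_row`, and `modeled_response` are hypothetical.

```python
# Knot locations in seconds after context onset, taken from the text.
KNOTS = [4, 8, 12, 16, 20, 24, 28, 32]
SPACING = 4.0  # assumed knot spacing, inferred from the listed time points

def tent(t, center, width):
    """Piecewise linear tent: 1 at center, falling linearly to 0 at center +/- width."""
    return max(0.0, 1.0 - abs(t - center) / width)

def design_row(t):
    """Values of the eight tent regressors at time t (s after context onset)."""
    return [tent(t, knot, SPACING) for knot in KNOTS]

def modeled_response(t, betas):
    """Response at time t implied by one amplitude (beta) per knot.

    Summing tents weighted by their betas is equivalent to linear
    interpolation between neighboring knots.
    """
    return sum(beta * x for beta, x in zip(betas, design_row(t)))

# Hypothetical betas: the response 6 s after onset lies halfway between
# the amplitudes estimated at 4 s and 8 s.
example = modeled_response(6, [1.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```

Because the first tent is centered at 4 s with a 4 s half-width, every regressor equals zero at 0 s, which corresponds to the assumption that the response has zero magnitude at context onset.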
After individual analyses, a whole-brain group analysis was performed. First, the differences between subject-specific β coefficients for CXT+ and CXT− were entered into a regression model with no intercept, and an AR(1) model was adopted to account for the serial correlation of the within-subject residuals. The F statistic tested whether the amplitude difference between CXT+ and CXT− at any of the time points was significantly different from zero. The resulting group images were thresholded using a voxelwise threshold of p < 0.001 and a cluster probability of p < 0.05, corrected for gray matter multiple comparisons. The segmentation tool FAST in FSL was used to create the gray matter mask for correction. Multiple comparison correction was performed using AlphaSim, a program that estimates the probability of obtaining clusters of a particular size and generates a mapwise corrected p < 0.05 threshold. Second, to examine the direction of amplitude differences between CXT+ and CXT−, paired-sample t tests were performed for time points 4–24 s. The resulting group images were thresholded with a voxelwise threshold of p < 0.001 and gray matter corrected (p < 0.05). Conjunction analyses were performed between the overall F test and each of the individual t tests, ensuring that all activations in the final group images were significant and corrected for both tests. All activations were thresholded as stated above unless noted otherwise. We adopted this two-stage procedure rather than averaging or summing across the different β coefficients because averaging and summing cannot detect hemodynamic response shape differences, and are vulnerable to undershoots in the hemodynamic response (i.e., negative β values). The Duvernoy (1999) atlas was used to verify anatomical localization of all activations, and labels for Brodmann's areas in orbital and medial prefrontal cortex were drawn from Ongur et al. (2003).
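The logic of the conjunction step can be sketched in a few lines of Python. This is only an illustration of a voxelwise logical AND between two thresholded maps; it leaves out cluster-size correction entirely, the p values and thresholds in the example are invented, and the function name `conjunction_mask` is hypothetical.

```python
def conjunction_mask(f_pvals, t_pvals, f_alpha, t_alpha):
    """True where a voxel passes BOTH the omnibus F test and the t test."""
    if len(f_pvals) != len(t_pvals):
        raise ValueError("maps must cover the same voxels")
    return [pf < f_alpha and pt < t_alpha for pf, pt in zip(f_pvals, t_pvals)]

# Hypothetical voxelwise p values for four voxels (illustration only).
f_p = [0.0004, 0.0200, 0.0002, 0.3000]
t_p = [0.0003, 0.0001, 0.0400, 0.5000]

# Illustrative thresholds; only the first voxel passes both tests.
mask = conjunction_mask(f_p, t_p, f_alpha=0.001, t_alpha=0.001)
```

A voxel that is significant on only one of the two tests is dropped, which is why every activation that survives is significant for both.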
To examine a priori hypotheses that the hippocampus and amygdala are critical to contextual conditioning, exploratory region-of-interest (ROI) analyses were performed (Poldrack, 2007). The purpose of an exploratory ROI analysis is to illustrate more clearly the pattern of signal in a whole-brain voxelwise analysis. To depict the pattern of neural activity for CXT+ and CXT− in hippocampus and amygdala specifically, atlas-based ROIs of bilateral hippocampus and amygdala were extracted from the Talairach Daemon database as implemented in AFNI (Lancaster et al., 2000). For each ROI, a conjunction analysis was performed between the anatomical ROI mask, the overall F test, and the t test for time point 4 s. This specific t test was used in conjunction analyses because it was the only time point in the whole-brain analysis to show a significant difference between contexts in the hippocampus and amygdala; thus, this ensured that clusters that survived the conjunction analysis contained only voxels that were activated. The F test and t test in the ROI conjunction analyses were thresholded with a relatively lenient voxelwise threshold of p < 0.05, but were corrected for gray matter multiple comparisons at a cluster probability of p < 0.05. To depict the time course of activation in each ROI, subject-specific β coefficients for CXT+ and CXT− were extracted from the peaks of activation clusters in each ROI, and plotted across time.
fMRI path analysis.
Structural equation modeling, as implemented by the AFNI program 1dSEM, was used to examine the interactions among brain regions in a small network that was found to support context conditioning in the current study. Based on previous work on fMRI path analysis (Bullmore et al., 2000; Stein et al., 2007), 1dSEM takes interregional covariances or correlations as input and estimates the connection strengths among the regions in the network. Although 1dSEM can be used to confirm a specific network model, this study used 1dSEM in its hierarchical model-search mode called “tree growth.” Tree growth uses an automated elaborative path analysis procedure similar to the approach that has been used recently to explore effective amygdala connectivity during face processing (Stein et al., 2007). Tree growth searches for an optimal model of the correlations within a defined network. This occurs by growing a model for one additional coefficient from the previous model for n − 1 coefficients. In other words, an extra path grows as a new “branch” on the previous model (or “tree”), and each new branch represents the best fit among all possible paths, until the best model that fits the data is found.
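The general idea of growing a model one path at a time can be sketched as a greedy forward search in Python. This is not the 1dSEM algorithm itself: the cost function below is a toy stand-in for a real model-fit criterion, the stopping rule (stop when no added path improves the cost) is an assumption, and all names are hypothetical.

```python
def tree_growth(candidate_paths, fit_cost):
    """Greedy forward search over directed paths.

    At each step, try every remaining candidate path, keep the one that
    gives the lowest cost, and stop when no addition improves the cost.
    `fit_cost` maps a list of paths to a number (lower is better).
    """
    model = []
    remaining = list(candidate_paths)
    best_cost = fit_cost(model)
    while remaining:
        cost, path = min(
            ((fit_cost(model + [p]), p) for p in remaining),
            key=lambda pair: pair[0],
        )
        if cost >= best_cost:
            break  # no new branch improves the fit
        model.append(path)
        remaining.remove(path)
        best_cost = cost
    return model, best_cost

# Toy cost (illustration only): penalize missing "true" paths heavily
# and extra paths lightly. Real SEM fitting would compare covariances.
TRUE_PATHS = {("OFC", "AMG"), ("HIP", "AMG")}

def toy_cost(paths):
    chosen = set(paths)
    return 1.0 * len(TRUE_PATHS - chosen) + 0.1 * len(chosen - TRUE_PATHS)

candidates = [("OFC", "AMG"), ("INS", "AMG"), ("HIP", "AMG")]
model, cost = tree_growth(candidates, toy_cost)
```

In this toy case the search adds the two informative paths and then stops, because adding the remaining path would make the cost worse.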
Based on previous studies of context conditioning (Phillips and LeDoux, 1992; Bucci et al., 2000; Bannerman et al., 2003) and hippocampal and amygdala connectivity in animals (Van Hoesen et al., 1993; Stefanacci and Amaral, 2002), six brain regions were selected as part of a potential network underlying context conditioning in humans: parahippocampal cortex (PHC), anterior hippocampus (HIP), amygdala (AMG), orbitofrontal cortex (OFC), subgenual anterior cingulate (SAC), and anterior insula (INS). Although a complete network would include all brain regions that have been implicated in context conditioning, region selection was kept parsimonious to minimize model complexity. Moreover, such a parsimonious approach allowed maximal power when testing specific hypotheses focused on delineating the manner in which context information may influence the amygdala, specifically. On the basis of neuroanatomical findings, Stefanacci and Amaral (2002) have recently proposed that the amygdala is well positioned to evaluate context-dependent threats as a result of context information it may receive from orbitofrontal cortex and medial temporal regions such as the hippocampus and parahippocampal cortex. Thus, the path analysis modeled functional connections among these regions specifically. Input from anterior insula to the amygdala may also provide a means for cortical input to influence visceral and autonomic responses to threats. We sought to test these specific hypotheses using regions that were associated with context conditioning in the whole-brain and ROI analyses of the present study. Using the coordinates for the peak and center-of-mass of each activation, 5 mm spheres were placed at the chosen coordinates of the selected regions (Table 1).
Time series extraction and model specification.
To obtain time series that reflected contextual conditioning for each subject, the average time series for all voxels, in each of the 5 mm spheres, was extracted from a functional dataset that represented subject-specific contextual conditioning (i.e., greater neural activity for CXT+ than CXT−). To prevent the model search procedure from deriving spurious connections that have no anatomical basis, model search was constrained by rejecting paths with weak or no known evidence of direct anatomical connection (labeled 0 in Table 2). Evidence of anatomical connection was based on converging results from rat, nonhuman primate, and human studies. Representative references for the anatomical connectivity of each path are shown in Table 3. Because known anatomical connectivity is insufficient for identifying a model that is functionally specific to contextual conditioning, the paths labeled 2 in Table 2 were deemed searchable during model search. Thus, the automated model search procedure was allowed to iteratively add paths from among the paths labeled 2 until an optimal model was derived. Additional details regarding the use of 1dSEM can be found at http://afni.nimh.nih.gov/sscc/gangc/PathAna.html.
The SCR and the SCL results are shown in Figure 2. Greater electrodermal activity for CXT+ than CXT− indicates context conditioning. Planned comparisons showed that subjects conditioned successfully after only one run, or four trials of each context. SCRs were larger to the CXT+ than to the CXT− during run 2 (t(11) = 4.2; p < 0.005) and run 3 (t(11) = 3.2; p < 0.01), but not during run 1 (p > 0.8). Likewise, SCL was larger for CXT+ than CXT− in run 2 (t(11) = 2.8; p < 0.05) and run 3 (t(11) = 2.1; p = 0.055), but not in run 1 (p > 0.4). Large differential SCRs occurred during runs 2 and 3, but because of habituation, SCRs to each context diminished in magnitude over time as supported by main effects of stimulus type (F(1,11) = 10.4; p < 0.01) and run (F(1,11) = 8.5; p < 0.005), and a stimulus type by run interaction (F(1,11) = 4.7; p < 0.05). As expected on the basis of previous studies involving threat (Bohlin, 1976), SCL in CXT+ and CXT− increased over time, reflecting increased tonic arousal, and SCL habituation was reduced as suggested by a weak quadratic trend of run (F(1,11) = 4.1; p = 0.068). Overall, contextual differences in SCL were small as reflected by the nonsignificant trend for a stimulus type main effect (F(1,11) = 3.1; p = 0.10). Subjective reports of anxiety were obtained retrospectively after all imaging runs were completed. Consistent with the above psychophysiological results, anxiety ratings showed that subjects found the CXT+ significantly more anxiogenic than the CXT− [mean (SEM), 7.3 (0.5) vs 2.3 (0.5), respectively; t(11) = 8.8, p < 0.0001]. Together, these results clearly demonstrate that contextual conditioning developed over time.
To identify the brain regions that were associated with contextual fear conditioning, CXT− was subtracted from CXT+. This contrast revealed brain regions that showed greater brain activity in the shock context than in the no-shock context as a function of time after context onset (Table 4, Fig. 3). Early in the context, conditioning was associated with activation in left posterior orbitofrontal, left medial dorsal thalamus, left anterior insula, left subgenual anterior cingulate, and bilateral parahippocampal cortex. Activation in prefrontal regions and the thalamus attenuated over time, but brain activity in anterior insula remained sustained in the CXT+ relative to the CXT−. Comparisons of brain activity across contexts in parahippocampal cortices were not statistically significant after correction. However, when correction was performed after using a more lenient voxelwise threshold (p < 0.005), a large cluster (volume, 520 μl) in right parahippocampal cortex showed significant contextual conditioning. Although no significant context differences were found at 16 and 24 s, conditioning during the latter one-half of the context was primarily associated with brain activity in bilateral inferior frontal and parietal cortex.
To assess the role of the hippocampus and amygdala, specifically, in context conditioning, exploratory ROI analyses were performed in both hemispheres. The results are shown in Figure 4. Two clusters within the left amygdala, one more lateral (peak at −29, −3, −18) and the other more medial (peak at −20, −5, −11), showed greater activation to the CXT+ than the CXT− early in the context followed by rapid attenuation across time. Although not shown in Figure 4, a similar pattern was found in a small cluster (volume, 81 μl) in the right amygdala (peak at 24, −5, −11). No significant difference between contexts was found in left hippocampus, but right anterior hippocampus (peak at 24, −15, −15) showed a context difference and a time course of activation that was similar to the amygdala. Consistent with preclinical data, the results of the ROI and whole-brain analyses implicate the hippocampus and amygdala as central components of a network underlying context conditioning.
To examine the interactions among key regions involved in context conditioning, automated path analysis was performed in model search mode. The result of the search procedure was an optimal model of effective connectivity within the defined network (Fig. 5). The optimal model included a total of 19 paths, and the model χ² statistic [χ²(2) = 1.70; p = 0.43] indicated good statistical fit. Consistent with this result, the population-based index RMSEA (root mean square error of approximation) = 0, the incremental fit index CFI (comparative fit index) = 1, and the parsimonious fit index = 0.97, all indicate that the model fits the observed data well.
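Two of the reported fit values can be checked with a short Python sketch. It relies on two standard facts rather than on anything specific to 1dSEM: the upper-tail probability of a chi-square statistic with 2 degrees of freedom is exp(−χ²/2), and one common form of the RMSEA is sqrt(max(χ² − df, 0) / (df (N − 1))). The sample size passed to `rmsea` is a placeholder, since the number of observations entering the model is not stated here; the result is 0 for any N whenever χ² does not exceed df. Some software uses N rather than N − 1 in the denominator.

```python
import math

def chi2_pvalue_df2(chi2):
    """Upper-tail p value of a chi-square statistic with exactly 2 df."""
    return math.exp(-chi2 / 2.0)

def rmsea(chi2, df, n):
    """Root mean square error of approximation (one common formulation).

    n is the number of observations; the value used below is a
    placeholder, not a figure taken from the study.
    """
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

p_value = chi2_pvalue_df2(1.70)        # rounds to 0.43
fit_index = rmsea(1.70, df=2, n=100)   # 0.0 because chi2 < df
```

Both values agree with those reported above: a nonsignificant χ² and an RMSEA of 0.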
The path coefficients in the model, which indicate a directional influence from one brain region to another during the context (4–24 s), suggest that the amygdala was a key source of both afferent and efferent connections during context conditioning. Reciprocal connections were found between the amygdala and all other regions in the model except anterior insula. The path to subgenual anterior cingulate was particularly strong. In turn, the subgenual region was reciprocally connected with both prefrontal and medial temporal regions. The path from parahippocampal cortex to subgenual anterior cingulate was especially strong. Overall, the results of the path analysis show that context conditioning involves interaction between the hippocampus and amygdala as well as interactions with key cortical regions.
This study investigated contextual fear conditioning in humans using virtual spatial contexts. Similar to research on cued conditioning, the results were mostly consistent with preclinical findings in rodents. In support of animal research implicating the hippocampus and amygdala in context conditioning, neural activity was greater for CXT+ than CXT− in right anterior hippocampus and bilateral amygdala. Context conditioning also involved medial dorsal thalamus, anterior insula, as well as orbitofrontal, subgenual anterior cingulate, parahippocampal, inferior frontal, and inferior parietal cortices, all regions previously shown to have either direct connection to the amygdala or robust connections with amygdaloid afferent or efferent regions. Rodent studies have shown that lesions in each of these brain structures, except for parietal cortex (Keene and Bucci, 2008), result in impaired acquisition and expression of contextual fear (Morgan and LeDoux, 1999; Bucci et al., 2000; Li et al., 2004; Resstel et al., 2006). Thus, these results implicate a network of brain regions that are known to contribute to context conditioning. Some researchers have suggested that the amygdala may receive context information concerning threats from both orbitofrontal and medial temporal cortices (Stefanacci and Amaral, 2002). Consistent with this view, fMRI path analysis revealed modulation of left medial amygdala activity by left posterior orbitofrontal cortex, right anterior hippocampus, and right parahippocampal cortex.
Although the current findings are consistent with rodent-based research, methodological differences between studies in rodents and humans limit the extent to which cross-species comparisons can be made. Unlike lesion or inactivation studies in rodents, brain activations cannot demonstrate that a specific brain region is sufficient or necessary for a particular cognitive function (Poldrack, 2000), nor does the spatial resolution of fMRI match the precision of these techniques. Yet by determining that a brain region exhibits differential neural activity associated with a cognitive or emotional process (e.g., a contextual fear response), neuroimaging provides an important tool for relating findings across species. The topography of the two nonoverlapping activations found in lateral and medial left amygdala during contextual conditioning may be relevant to recent findings in rodent studies. This work shows that two regions of the amygdala contribute to context conditioning when no discrete CS is paired with a US: (1) the lateral nucleus and (2) the basal (or basolateral) nucleus (Calandreau et al., 2005).
Calandreau et al. (2005) found that inactivation of the basal amygdala selectively interferes with contextual conditioning regardless of whether a CS is paired or unpaired with a US, but not with cue conditioning; inactivation of the lateral nucleus, in contrast, impairs both cued conditioning and unpaired contextual conditioning (CS–US unpaired), a procedure that is similar to the unsignaled paradigm used in the present study. Based on the Duvernoy (1999) atlas, anatomical identification of the peak group-activated voxel in left lateral and medial amygdala (Fig. 4) would suggest that these clusters of activation correspond well to the lateral and basal nuclei, respectively. Nevertheless, the spatial resolution of fMRI precludes localization of lateral and basal amygdala nuclei, and any correspondence with the lateral and medial amygdala activations in the present study must remain speculative, particularly when considering that spatial smoothing was performed with a Gaussian kernel of 6 mm.
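The effect of smoothing on spatial specificity can be illustrated with a short, idealized calculation. The sketch below assumes that the 6 mm figure refers to the full width at half maximum (FWHM) of the kernel, which is the usual convention but is not stated explicitly here. It treats two sources as equal-amplitude points on a one-dimensional line. The separations tested (4 and 8 mm) are hypothetical and are not measured distances between amygdala nuclei.

```python
import math

import numpy as np


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian FWHM to its standard deviation."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def is_resolved(separation_mm, sigma_mm, step_mm=0.01):
    """True if two equal point sources still form two peaks after smoothing.

    Idealized 1-D case: each smoothed source is a Gaussian, and the pair is
    'resolved' only if the summed profile dips at the midpoint between them.
    """
    x = np.arange(-20.0, 20.0 + step_mm, step_mm)
    half = separation_mm / 2.0
    profile = (np.exp(-(x - half) ** 2 / (2.0 * sigma_mm ** 2))
               + np.exp(-(x + half) ** 2 / (2.0 * sigma_mm ** 2)))
    midpoint_value = profile[np.argmin(np.abs(x))]
    return bool(profile.max() > midpoint_value * (1.0 + 1e-9))


sigma = fwhm_to_sigma(6.0)  # assumption: 6 mm is the kernel FWHM
print(f"sigma = {sigma:.3f} mm")
for separation in (4.0, 8.0):  # hypothetical separations in mm
    print(separation, "mm resolved:", is_resolved(separation, sigma))
```

Under these assumptions the kernel has a standard deviation of roughly 2.55 mm, and two equal sources merge into a single peak once their separation falls to about twice that value or less. The calculation ignores voxel size, intersubject anatomical variability, and the intrinsic spread of the BOLD signal, all of which would further limit separability.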
Despite the current inability of fMRI to localize lateral and basal amygdala nuclei, the distinct patterns of connectivity between these nuclei and the hippocampus and cortex provide a framework for understanding the functional significance of the amygdala in context conditioning, and a basis for predicting the involvement of regions outside the medial temporal lobe in context conditioning. For example, tracing studies in the macaque monkey indicate that connections from orbitofrontal and subgenual anterior cingulate cortex project primarily to the basal but not the lateral nucleus (Stefanacci and Amaral, 2002). Consistent with this pattern, the results of the path analysis suggest that, as brain activity in orbitofrontal cortex increases, neural activity in the medial amygdala would be expected to increase, possibly as a result of receiving context information about potential threat such as an unpredictable shock. As a central component of the so-called orbital network (Kondo et al., 2005), orbitofrontal cortex receives abundant sensory input, including visual information from association cortex, and is anatomically well positioned to relay information about the environment to the amygdala (Hoistad and Barbas, 2008).
Anatomical studies in rodents have also demonstrated that the basal and basomedial nuclei of the amygdala receive substantial input from areas within ventral hippocampus (CA1 and subiculum) via various tracts including the angular bundle (Canteras and Swanson, 1992). Because damage to ventral hippocampus impairs contextual conditioning (Maren and Fanselow, 1995), it has been assumed that the ventral hippocampus–amygdala pathway is involved in context conditioning (LeDoux, 2000). The present finding that context conditioning in humans was accompanied by activation in anterior hippocampus, the presumed homolog of the rodent ventral hippocampus, provides additional support for this view. In addition, the path analysis indicates that, as brain activity in anterior hippocampus increases, neural activity in medial amygdala would be expected to decrease. As Stein et al. (2007) point out, however, positive and negative coefficients in fMRI path analysis should be interpreted cautiously and cannot simply be equated with excitatory and inhibitory input because it is currently not possible to separate out the contribution of excitation and inhibition in the generation of the BOLD response (Logothetis and Wandell, 2004). It is likely nonetheless that hippocampal modulation of the amygdala in the current study reflects the influence of hippocampal cholinergic neurotransmission. Calandreau et al. (2006) have recently shown that levels of hippocampal acetylcholine release are greater in unpaired conditioning paradigms in which the context best predicts an aversive event, relative to paired paradigms in which a discrete cue is a better predictor. Moreover, the level of acetylcholine released regulates distinct patterns of molecular activation found within lateral and basal nuclei of the amygdala, and dictates whether subjects select a discrete cue or the context as most predictive of an aversive event. Because the current study used no discrete CS, we propose that the hippocampus–amygdala interaction found in the path analysis reflects the process of selecting the context as a predictor of shock.
The coefficient for the path from subgenual anterior cingulate to amygdala was negative, suggesting that, as activity in the subgenual region increases, neural activity in medial amygdala decreases. Neuroimaging studies have repeatedly reported inverse correlations between medial prefrontal regions (e.g., subgenual anterior cingulate) and the amygdala (Milad et al., 2006), suggesting a potential mechanism by which prefrontal cortex may inhibit the amygdala. However, given the paucity of data on context conditioning in humans, the precise role played by subgenual anterior cingulate cortex during contextual conditioning remains unclear. Emerging evidence suggests that subregions of anterior cingulate cortex may be differentially sensitive to discrete CS and contextual threat stimuli. Consistent with this idea, a recent study found that dorsal anterior cingulate activation was positively correlated with differential SCRs to CSs but not to the context (Milad et al., 2007b). This CS-specific result suggests that the role of dorsal anterior cingulate cortex in modulating fear expression may be limited to discrete threat cues. In contrast, the results of the present study indicate that contextual fear expression may be modulated by subgenual anterior cingulate. Recent studies in rodents involving contextual conditioning, as opposed to cued conditioning, are consistent with the notion of functional segregation in anterior cingulate cortex concerning conditioned fear expression (Resstel et al., 2006).
One surprising result in the path analysis was the failure to find evidence of interaction between anterior insula and the amygdala, two regions with extensive interconnections (Mesulam and Mufson, 1982). Beyond the relatively limited statistical power in studies with small samples such as ours, another potential explanation might involve the fact that the left medial amygdala ROI used in the path analysis primarily encompassed the basal and basomedial nuclei of the amygdala, and not the central nucleus of the amygdala, which is the primary target of anterior insular input (Mesulam and Mufson, 1982). Considering the sustained activation found in the anterior insula during context conditioning and the strong anatomical connections between this region and the central nucleus of the amygdala, future research on the expression of conditioned contextual fear should target interaction between these regions. However, differentiating neural activity in central amygdala from other amygdala subregions may require high-resolution fMRI and nonconventional analytic methods such as emerging multivariate analysis techniques (Kriegeskorte and Bandettini, 2007).
In summary, the present results provide clear evidence of contextual fear conditioning in SCR, SCL, and anxiety ratings, and show that contextual fear conditioning in humans elicits neural activity in the hippocampus, amygdala, and several other subcortical and cortical brain regions, including orbitofrontal cortex. These findings, together with the results of the path analysis indicating substantial interaction among brain regions, support animal models suggesting that both the hippocampus and amygdala are important for contextual fear learning, and that orbitofrontal cortex may provide the amygdala with important context information regarding potential threat. The results of this study provide evidence that similar mechanisms may support contextual conditioning across species.
This work was supported by the Intramural Research Program of the National Institute of Mental Health.
Correspondence should be addressed to Dr. Ruben P. Alvarez, Mood and Anxiety Disorders Program, National Institute of Mental Health, 15K North Drive, MSC 2670, Bethesda, MD 20892.