The neural circuitry that increases attention to goal-relevant stimuli when we are in danger of becoming distracted is a matter of active debate. To address several long-standing controversies, we asked participants to identify a letter presented either visually or auditorily while we varied the amount of cross-modal distraction from an irrelevant letter in the opposite modality. Functional magnetic resonance imaging revealed three novel results. First, activity in sensory cortices that processed the relevant letter increased as the irrelevant letter became more distracting, consistent with a selective increase of attention to the relevant letter. In line with this view, an across-subjects correlation indicated that the larger the increase of activity in sensory cortices that processed the relevant letter, the less behavioral interference there was from the irrelevant letter. Second, regions of the dorsolateral prefrontal cortex (DLPFC) involved in orienting attention to the relevant letter also participated in increasing attention to the relevant letter when conflicting stimuli were present. Third, we observed a novel pattern of regional specialization within the cognitive division of the anterior cingulate cortex (ACC) for focusing attention on the relevant letter (dorsal ACC) versus detecting conflict from the irrelevant letter (rostral ACC). These findings indicate novel roles for sensory cortices, the DLPFC, and the ACC in increasing attention to goal-relevant stimulus representations when distracting stimuli conflict with behavioral objectives. Furthermore, they potentially resolve a long-standing controversy regarding the key contribution of the ACC to cognitive control.
- sensory cortex
- dorsolateral prefrontal cortex
- anterior cingulate cortex
The ability to minimize distraction is crucial for enabling goal-directed behavior. Current models posit that dorsolateral prefrontal cortex (DLPFC) activity represents and maintains task-relevant information in working memory (e.g., rules, goals, instructions), which aids selection by biasing processing in other pathways to favor goal-relevant stimuli, associations, and responses (MacDonald et al., 2000; Kerns et al., 2004a). When distracting stimuli activate conflicting representations, the anterior cingulate cortex (ACC) signals the DLPFC to further increase the attentional bias toward task-relevant processing.
It is not known, however, whether current models generalize to situations in which distracting stimuli (e.g., loud songs on a radio) come from a different sensory modality than goal-relevant stimuli (e.g., words on a page). Multisensory interactions lead to heightened neuronal responses (Stein, 1998), perceptual illusions (Soto-Franco et al., 2002), and increased levels of distraction (Soto-Franco et al., 2004). Thus, it is important to determine how the brain minimizes cross-modal distraction from stimuli in a task-irrelevant sensory modality.
First, we investigated whether cross-modal distraction leads to increased activity in sensory cortices that process goal-relevant stimuli. Such a result would suggest that distraction is minimized by increasing attention to goal-relevant sensory representations, because orienting attention to a particular stimulus feature increases activity within cortical areas that process that feature (Shulman et al., 1999; Hopfinger et al., 2000; Giesbrecht et al., 2003; Woldorff et al., 2004). Because stimuli presented in different sensory modalities are processed in distinct brain regions (Bushara et al., 2003), the present study provided a particularly clear test of this hypothesis.
Second, we investigated whether regions of the DLPFC involved in focusing attention on goal-relevant stimuli help to minimize cross-modal distraction by further increasing attention to relevant stimuli (Banich et al., 2000b). Consistent with this view, both focusing attention (MacDonald et al., 2000) and distraction from irrelevant stimuli (Banich et al., 2000a) activate the DLPFC, but these activations have never been measured in the same study. Therefore, previous studies may have identified distinct regions of the DLPFC that participate in focusing attention and detecting conflict, respectively. We therefore determined whether the exact regions of the DLPFC that participate in focusing attention on goal-relevant items are reactivated during distraction from irrelevant stimuli.
Third, we investigated a long-standing controversy about whether the ACC focuses attention on goal-relevant stimuli (Posner and DiGirolamo, 1998; Dreher and Berman, 2002) or detects irrelevant stimuli that conflict with task goals (MacDonald et al., 2000; Kerns et al., 2004b). Most researchers have marshaled data to support either one view or the other, but some findings suggest that each of these functions is performed by a different subregion within the cognitive division of the ACC. Specifically, dorsal subregions may focus attention on goal-relevant stimuli (Dreher and Berman, 2002; Luks et al., 2002; Woldorff et al., 2004), whereas rostral subregions may detect conflicting items (Colcombe et al., 2004). Given previous reports of functional heterogeneity in the cingulate cortex (Banich et al., 2000a; Bush et al., 2000; Stephan et al., 2003), regional specialization in the ACC for focusing attention versus detecting conflict would provide a plausible resolution to this long-standing controversy.
Materials and Methods
Nineteen healthy participants (10 males and 9 females; age range, 19-36 years) took part in the study. All had normal or corrected vision and had no history of serious neurological trauma or disorders. All except one were right handed. Participants gave informed consent before the experiment in accordance with the local human subjects committee. Before the magnetic resonance (MR) session, each participant practiced one or two blocks of the experimental task. Participants were paid $20 per hour for their participation, which lasted ∼2 hr.
A personal computer was used to present stimuli and to record the participants' responses. Visual stimuli were projected onto a screen at the back of the bore of the magnet that participants viewed through a mirror. Auditory stimuli were delivered binaurally through MR-compatible headphones. Headphone volume was adjusted for each participant so that auditory stimuli could be heard clearly over the background MR scanner noise. Responses were made using the index and middle fingers of the right hand and recorded with an MR-compatible response box.
Structural images for each participant were collected using a T1-weighted spin echo sequence on a 1.5-T GE whole-body scanner [repetition time (TR), 500 msec; echo time (TE), 14 msec; flip angle, 90°; 17 contiguous 7-mm-thick slices; in-plane resolution, 0.94 × 0.94 mm]. The blood oxygenation level-dependent (BOLD) signal was measured with a spiral imaging sequence (TR, 1.25 sec; TE, 40 msec; flip angle, 90°; 17 contiguous 7-mm-thick slices; in-plane resolution, 3.75 × 3.75 mm) during the subsequent collection of functional images. Each participant completed eight runs of the experimental task. During each run, 313 brain volumes were collected. The first six functional images of each run contained no trials and were discarded.
In each 2.5 sec trial, a cue instructed participants to attend to and identify either the visual letter (X, 1.66 × 1.81°; or O, 1.75 × 1.84°) or the auditory letter (X or O) of an upcoming multisensory stimulus (e.g., a visual X at fixation presented simultaneously with an auditory O) or to wait until the next trial because no target would be presented (Fig. 1). Cue stimuli appeared equally often at fixation in the visual modality (“Look,” 4.59 × 1.16°; “Hear,” 4.53 × 1.13°; or “Wait,” 4.06 × 1.13°; visual-cued trials) (Fig. 1, left) or binaurally in the auditory modality (auditory-cued trials) (Fig. 1, right). Participants were instructed to press one button if the cued target letter was an X and a different button if it was an O, as quickly as possible without making mistakes, using the index and middle fingers of their right hand. In cue-plus-target trials, a multisensory target-distracter pair followed the cue in which the distracter was equally likely to be mapped to the same response as the target (i.e., congruent target-distracter pairs) (Fig. 1, bottom left) or to a competing response (i.e., incongruent target-distracter pairs) (Fig. 1, bottom right), in which case the distracter conflicted with task goals. In cue-only trials, a target-distracter pair did not follow the attention-directing cue. Cue stimuli and multisensory target-distracter pairs were each presented for 350 msec, separated by a 1250 msec stimulus onset asynchrony.
In all trials, the fixation dot (0.22 × 0.22°) changed color from white to red 1.25 sec after cue onset (coincident with target presentation in cue-plus-target trials and to signal that no target would occur in cue-only trials). Participants were instructed to cease attending when the fixation dot turned red if a target failed to appear (cue-only trials). This procedure equated the duration of cue-triggered, attention-focusing processes in cue-plus-target and cue-only trials, such that a contrast between these trial types (which shared the same cue stimulus) would accurately reveal brain activity specific to incongruent and congruent target-distracter pairs (Corbetta et al., 2000). Activity for incongruent and congruent target-distracter pairs was then directly compared to reveal conflict-related activity (Weissman et al., 2002).
Each run included 14 event-related trial types presented in a first-order counterbalanced sequence in which each trial type was preceded equally often by every trial type in the design. In eight cue-plus-target trial types, a Look or a Hear cue was followed by a multisensory target-distracter pair, for which participants needed to discriminate the target letter in the cued sensory modality. In two of these eight trial types, a Look cue was presented in the visual modality and followed equally often by either a congruent or an incongruent target-distracter pair. In two other trial types, a Look cue was presented in the auditory modality and followed equally often by either a congruent or an incongruent target-distracter pair. Analogously, in four other trial types, a Hear cue was presented equally often in the visual or in the auditory modality and followed equally often by a congruent or an incongruent target-distracter pair.
To distinguish between cue- and target-related BOLD responses, we also included six cue-only trial types. In two types of Look cue-only trials, either a visual or an auditory Look cue was not followed by a multisensory target-distracter pair. Similarly, in two types of Hear cue-only trials, either a visual or an auditory Hear cue was not followed by a multisensory target-distracter pair. Finally, we included two additional Wait cue-only trial types. In these trial types, either a visual or an auditory cue instructed participants to wait until the next trial because a target-distracter pair would not be presented in the current trial. Because they were not central to the present hypotheses, Wait cue-only trials will not be discussed further. To optimize regression estimates of the BOLD responses produced by each of our 14 trial types, the intertrial interval (ITI) was varied between zero and five TRs using a nearly exponential distribution that favored short ITIs (Ollinger et al., 2001a,b).
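As an illustration of the jittering scheme described above, the following minimal sketch draws ITIs of zero to five TRs from a truncated geometric distribution, which is one way to approximate an exponential distribution that favors short intervals. The decay rate (`DECAY`), the function names, and the use of a geometric form are assumptions made for illustration; the text does not report the exact distribution that was used.

```python
import random

TR_SEC = 1.25      # repetition time reported in the text
MAX_ITI_TRS = 5    # ITIs ranged from zero to five TRs
DECAY = 0.5        # assumed decay rate; not reported in the text


def iti_weights(max_iti=MAX_ITI_TRS, decay=DECAY):
    """Return normalized probabilities for ITIs of 0..max_iti TRs."""
    raw = [decay ** k for k in range(max_iti + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def sample_itis(n_trials, seed=0):
    """Draw one ITI (in TRs) per trial, favoring short intervals."""
    rng = random.Random(seed)
    return rng.choices(range(MAX_ITI_TRS + 1), weights=iti_weights(), k=n_trials)


if __name__ == "__main__":
    itis = sample_itis(20)
    print("ITIs in TRs:", itis)
    print("ITIs in sec:", [k * TR_SEC for k in itis])
```

With these assumed weights, each successive ITI length is half as likely as the one before it, so most trials follow one another closely while occasional longer gaps help separate overlapping BOLD responses.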
SPM'99 (Friston et al., 1995) was used to correct functional images for head motion, normalize functional images to standard space, and spatially smooth the functional data with a three-dimensional Gaussian filter (full-width at half-maximum, 8 mm). Data from three participants were discarded because they moved their heads excessively while in the scanner, leaving nine males and seven females (15 right handed and 1 left handed; age range, 19-26 years). For these remaining 16 participants, the time series for each functional run was analyzed using a version of the general linear model that estimates each time point of the BOLD response to a particular trial type separately without making an assumption about the shape of the response. This method has been used in several previous studies and was validated in greater detail previously (Shulman et al., 1999; Ollinger et al., 2001a,b). For each of our 14 trial types, we modeled 12 TRs (15 sec) of the BOLD response, which meant including 14 × 12 = 168 regressors in the design matrix. Within the design matrix, we also included six motion regressors (i.e., SPM'99 motion estimates) and two separate regressors for the linear trend and y-intercept terms. Parameter estimates from each run were converted to units of percentage of change from baseline (i.e., the y-intercept term for that run) and then averaged across runs for each participant separately. Conversion from Montreal Neurological Institute to Talairach (Talairach and Tournoux, 1988) coordinates was implemented with two nonlinear transformations (http://www.mrc-cbu.cam.ac.uk/Imaging/Common/mnispace.shtml).
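The structure of the shape-free model described above can be sketched as a finite impulse response design matrix, with one indicator column per trial type and post-onset time point. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function name, the column ordering, the scaling of the trend term, and the toy onsets are all hypothetical.

```python
N_TRIAL_TYPES = 14   # trial types in the design
N_LAGS = 12          # modeled time points (TRs) per trial type


def fir_design(onsets_by_type, n_scans, n_lags=N_LAGS, motion=None):
    """Build a shape-free (FIR) design matrix as a list of rows.

    onsets_by_type: one list of onset scan indices per trial type.
    motion: optional list of per-scan motion estimates (e.g., 6 values each).
    Column order (assumed): FIR columns, motion, linear trend, intercept.
    """
    n_fir = len(onsets_by_type) * n_lags
    rows = []
    for t in range(n_scans):
        row = [0.0] * n_fir
        for k, onsets in enumerate(onsets_by_type):
            for onset in onsets:
                lag = t - onset
                if 0 <= lag < n_lags:
                    # Column for trial type k at this post-onset time point.
                    row[k * n_lags + lag] += 1.0
        if motion is not None:
            row.extend(motion[t])
        row.append(t / (n_scans - 1))  # linear trend, scaled 0..1
        row.append(1.0)                # y-intercept
        rows.append(row)
    return rows


if __name__ == "__main__":
    n_scans = 313 - 6  # volumes per run after discarding the first six
    toy_onsets = [[10 + 20 * k] for k in range(N_TRIAL_TYPES)]  # hypothetical
    toy_motion = [[0.0] * 6 for _ in range(n_scans)]
    X = fir_design(toy_onsets, n_scans, motion=toy_motion)
    print(len(X), "rows x", len(X[0]), "columns")
```

Because every time point receives its own regressor, the fitted parameters trace out the response time course directly, at the cost of many more columns than a model that assumes a canonical response shape.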
Region of interest analyses
Frontal regions. We functionally defined regions of interest (ROIs) in the midline frontal cortex by performing a random-effects one-way repeated-measures ANOVA on the average BOLD response to all multisensory target-distracter pairs, separately at each voxel. The resulting F-map was height thresholded at a value (F(11,165) = 3.6; p < 0.0002) that allowed us to identify relatively large numbers of activated voxels in the cognitive division of the ACC, including voxels in rostral, ventral, and caudal subdivisions and in the pre-supplementary motor area (SMA), SMA, and cingulate motor area (CMA). To create these ROIs, we used the Talairach boundaries for these areas provided by previous studies (Picard and Strick, 1996; Petit et al., 1998). Given numerous previous findings of conflict-related activity extending across both caudal ACC and pre-SMA (Banich et al., 2000a; Luks et al., 2002; Kerns et al., 2004b), we also created an ROI for the dorsal ACC that included all voxels activated in the caudal ACC and pre-SMA.
Within each of these ROIs, we averaged the time courses for all types of cue-only trials instructing participants to orient attention to the auditory modality, all types of cue-only trials instructing participants to orient attention to the visual modality, all types of incongruent target-distracter pairs, and all types of congruent target-distracter pairs. Then, within each ROI, we determined whether peak activity (between 2.5 and 6.25 sec poststimulus onset, depending on the specific shape of the BOLD response) was greater for (1) the average response to all cue-only trials instructing participants to orient attention to the auditory versus the visual modality and (2) the average response to all types of incongruent versus congruent target-distracter pairs. We used random-effects, one-tailed t tests to investigate these directional hypotheses. The results of these ROI contrasts were unbiased because they were orthogonal to the contrast used to create the ROI. Therefore, p values <0.05 were considered to be significant. Confirmatory ROI analyses of peak activity in bilateral regions of the DLPFC identified using voxel-wise analyses (see Results) were conducted using the same procedures described above.
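The peak-activity contrast described above can be sketched as follows. This is a hedged illustration only: taking the maximum within the 2.5-6.25 sec window is one plausible reading of "peak activity," the function names are hypothetical, and the critical value is the standard tabulated one-tailed t for 15 degrees of freedom (16 participants) at an alpha of 0.05.

```python
import math
from statistics import mean, stdev

TR_SEC = 1.25
T_CRIT_DF15_ONE_TAILED_05 = 1.753  # tabulated critical t, df = 15, alpha = 0.05


def peak_in_window(timecourse, start_sec=2.5, end_sec=6.25, tr=TR_SEC):
    """Return the largest value between start_sec and end_sec (inclusive).

    Assumes timecourse[0] is the estimate at stimulus onset (0 sec).
    """
    lo = int(round(start_sec / tr))
    hi = int(round(end_sec / tr))
    return max(timecourse[lo:hi + 1])


def paired_t(a, b):
    """Paired-samples t statistic for a minus b (one value per participant)."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def is_significant_one_tailed(a, b, t_crit=T_CRIT_DF15_ONE_TAILED_05):
    """True if a exceeds b in the predicted direction at the given cutoff."""
    return paired_t(a, b) > t_crit
```

In use, `peak_in_window` would be applied to each participant's averaged ROI time course for each condition, and the resulting per-participant peaks for the two conditions would be passed to `paired_t`.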
Sensory regions. Functionally defined ROIs were also created in bilateral regions of the occipital cortex using the same map of target-related activity described above. However, because we were less concerned with defining precise subregions and more concerned with maximizing statistical power, each occipital ROI consisted of a 27-voxel cube centered on a local maximum in the F-map. To limit the number of occipital regions considered, in each hemisphere we selected only the two ROIs centered on the highest local maxima for additional analyses of conflict-related activity. As discussed previously, time courses for individual trial types were averaged across voxels within each ROI. We then determined which, if any, of these four ROIs exhibited a significant interaction between target modality (auditory, visual) and distracter type (congruent, incongruent) using only those target-distracter pairs in which the cue and target occurred in the same sensory modality (for reasons discussed in Results). Because four occipital ROIs were analyzed, only p values <0.0125 were accepted as indicating a significant interaction. In the one occipital ROI exhibiting a significant interaction, we determined the simple effect of conflict on occipital activity separately for visual versus auditory targets. Analogous ROI analyses of simple effects were also performed in each of the two auditory regions that were activated in our voxel-wise analyses of conflict, which are described in Results.
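Two details of the occipital ROI procedure lend themselves to a short sketch: building a 27-voxel cube around a local maximum and deriving the corrected significance threshold for four ROIs. The sketch assumes the cube is 3 × 3 × 3 voxels centered on the peak voxel and that the 0.0125 criterion is a Bonferroni correction of 0.05 across four tests; the function names and the example coordinates are hypothetical.

```python
from itertools import product


def cube_roi(center, half_width=1):
    """Return voxel indices of a cube centered on `center`.

    half_width=1 yields a 3 x 3 x 3 cube, i.e., 27 voxels.
    """
    cx, cy, cz = center
    offsets = range(-half_width, half_width + 1)
    return [(cx + dx, cy + dy, cz + dz)
            for dx, dy, dz in product(offsets, repeat=3)]


def bonferroni_alpha(alpha, n_tests):
    """Per-test threshold that keeps the familywise rate at `alpha`."""
    return alpha / n_tests


if __name__ == "__main__":
    roi = cube_roi((30, 12, 8))          # hypothetical peak voxel indices
    print(len(roi), "voxels")
    print(bonferroni_alpha(0.05, 4))     # threshold for four occipital ROIs
```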
Mean reaction times and mean percentage error rates were analyzed in separate random-effects, repeated-measures ANOVAs with target modality (auditory, visual) and distracter type (congruent, incongruent) as within-participants factors. Participants were significantly slower (F(1,15) = 29.746; p < 0.001) to identify auditory versus visual targets (758 vs 708 msec). Furthermore, error rates were significantly higher (F(1,15) = 5.431; p < 0.035) when participants identified auditory versus visual targets (4.7 vs 2.92%). Also, as expected (MacLeod, 1991), participants were significantly slower (F(1,15) = 73.439; p < 0.001) to identify targets accompanied by incongruent versus congruent distracters (781 vs 684 msec). Similarly, error rates were significantly higher (F(1,15) = 39.887; p < 0.001) for targets accompanied by incongruent versus congruent distracters.
Given previous evidence indicating that cue stimuli “pull” attention to the sensory modality in which they are presented (Spence et al., 2001; Turrato et al., 2002), we conducted planned comparisons to test whether mean response times to a target stimulus were faster when it was preceded by a cue presented in the same versus a different sensory modality. Response times to visual targets were indeed significantly faster when they were preceded by a cue presented in the visual (688 msec) versus the auditory (727 msec) modality (t(15) = 3.81; p < 0.001; one-tailed). Analogously, response times to auditory targets were faster when preceded by a cue presented in the auditory (748 msec) versus the visual (767 msec) modality, but this effect did not achieve conventional levels of significance (t(15) = 1.44; p > 0.09; one-tailed). No significant effects were found in analogous planned comparisons of mean error rates (p > 0.09 in each case).
The overall model we wanted to test was as follows. First, attention is directed by a cue to either the visual or the auditory modality. If the cue directs participants to attend to the visual modality, there should be a relatively small increase of activity in brain areas involved in biasing attention (i.e., DLPFC and dorsal ACC) because attention is already focused at fixation because of the requirement to fixate (Hamed et al., 2002). But, if the cue directs participants to attend to the auditory modality, there should be a larger response in brain areas that direct processing resources toward task-relevant stimuli because attention must be shifted to the auditory modality. When the target and distracter letters appear, they are at least partially identified. If the letters are incongruent (vs congruent), then the rostral ACC signals the DLPFC and dorsal ACC to increase processing resources toward task-relevant representations. Greater DLPFC and dorsal ACC activity in incongruent (vs congruent) trials therefore reflects a second deployment of processing resources toward task-relevant information to minimize distraction from an irrelevant, conflicting distracter. This second deployment of attention results in a relatively late, attentional enhancement of activity in target-specific sensory cortices, which, via additional interactions with frontal regions, allows task-relevant sensory representations to play an especially important role in guiding the selection of an appropriate response under conditions of distraction (West and Alain, 1999).
We predicted that conflict would lead to selective increases of neural activity in sensory cortices that processed the target stimulus. Specifically, we hypothesized that there would be a selective increase of brain activity in (1) the visual cortex when a visual target letter was accompanied by an incongruent versus a congruent auditory distracter and (2) the auditory cortex when an auditory target was accompanied by an incongruent versus a congruent visual distracter. In conducting our analyses of conflict-related activity in sensory cortices (but not in the DLPFC and ACC), we limited ourselves to those trials in which the cue and target stimuli were presented in the same sensory modality. We did so because both previous and present findings indicate that cue stimuli can reflexively pull attention toward the sensory modality in which they are presented (Spence et al., 2001; Turrato et al., 2002). Such effects might make it relatively difficult to increase attention toward a target stimulus presented in a different modality and therefore relatively hard to observe an increase in target-specific sensory activity.
To evaluate our hypotheses, we localized sensory regions in which there was a significant interaction between target modality (auditory, visual) and distracter type (congruent, incongruent) using peak activity as the dependent measure in a voxel-wise analysis (t > 3.71; p < 0.001; one-tailed; four contiguous voxels). As hypothesized, this analysis revealed significant interactions in the auditory cortex (Fig. 2, Table 1). ROI analyses of simple effects in these regions of the middle temporal gyrus [Brodmann area (BA) 21] revealed that peak activity for an auditory target paired with an incongruent visual distracter was significantly greater (p < 0.05) than peak activity for an auditory target paired with a congruent visual distracter. In contrast, a visual target accompanied by an incongruent auditory distracter did not produce significantly greater peak activity than a visual target accompanied by a congruent auditory distracter.
These effects support our hypothesis. One deviation from this pattern, however, was that auditory targets did not produce significantly greater peak activity than visual targets in the left auditory cortex (t(15) < 1) (Fig. 2). The lack of a significant overall attention effect was unexpected and could be interpreted as evidence against our claim that conflict-related increases of auditory cortical activity reflect greater attention to auditory targets. We therefore determined whether attention effects were present during cue processing, in the form of a “baseline shift” (Kastner et al., 1999; Hopfinger et al., 2000; Woldorff et al., 2004), when participants oriented attention to the auditory versus the visual sensory modality. In line with such an attentional effect, an ROI analysis in the left auditory cortex indicated significantly greater peak activity for auditory-modality cues instructing participants to attend to the auditory versus the visual modality (t(15) = 1.84; p < 0.05). This finding supports our view that the conflict-related enhancements of target-specific activity we observed in the auditory cortex did indeed reflect increased attention to the auditory target.
The voxel-wise analysis did not reveal any significant interactions between target modality (auditory, visual) and distracter type (congruent, incongruent) in occipital visual cortices. The lack of an interaction at the voxel level may stem from our having instructed participants to maintain fixation throughout the experiment, resulting in relatively large amounts of attention to the visual modality in all conditions and therefore a reduced ability to detect subtle differences between the different conditions comprising the expected interaction. Consistent with this possibility, more statistically powerful ROI analyses of peak activity (based on functionally defined ROIs) revealed a significant interaction between target modality and distracter type in a region of the right visual cortex (t(15) = 2.57; p < 0.011; one-tailed). As predicted, in this region of the right middle occipital gyrus (BA 18), the pattern of simple effects observed in auditory ROIs was reversed (Fig. 2, Table 1). In particular, ROI analyses revealed that when a visual target was discriminated, peak activity was significantly greater (p = 0.05) when the accompanying auditory distracter was incongruent versus congruent. However, an auditory target accompanied by an incongruent visual distracter did not produce significantly greater peak activity than an auditory target accompanied by a congruent visual distracter.
To gain insight into the functional significance of these selective increases of target-specific sensory activity, we determined whether and how these increases might be related to behavior. Across participants, greater conflict-related functional MR imaging (fMRI) activity in the right middle occipital gyrus during the identification of visual targets predicted reduced behavioral interference from auditory distracters (r = -0.49; p < 0.05). In other words, the larger the increase in brain activity an individual displayed in the right middle occipital gyrus (BA 18) while trying to identify visual targets accompanied by incongruent versus congruent auditory distracters, the less that individual exhibited an increase in reaction time for visual targets accompanied by incongruent versus congruent auditory distracters (Fig. 3). Analogous correlations for auditory targets in the left and right auditory ROIs considered previously failed to achieve significance (p > 0.80 in both cases). However, the significant correlation in the visual cortex suggests that the functional role of increasing target-specific sensory activity during conflict is to minimize distraction from irrelevant stimuli.
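The across-participants brain-behavior relationship described above amounts to a Pearson correlation between two difference scores per participant: the conflict-related increase in peak activity and the conflict-related increase in reaction time. A minimal sketch follows; the function names are hypothetical, and no participant-level data are reproduced here.

```python
import math


def difference_scores(incongruent, congruent):
    """Per-participant conflict effect: incongruent minus congruent."""
    return [i - c for i, c in zip(incongruent, congruent)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Here `x` would hold each participant's conflict-related activity increase in the ROI and `y` the corresponding reaction-time interference; a negative coefficient indicates that larger activity increases accompany smaller behavioral interference.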
Finally, we note that the selective effects of attention we observed in sensory cortices are unlikely to be attributable to generalized effects of task difficulty. As we reported previously, mean response times and mean error rates were significantly higher for incongruent than for congruent trials. However, greater activity for incongruent versus congruent trials occurred only in the visual cortex when a visual target was discriminated and only in the auditory cortex when an auditory target was discriminated. We therefore conclude that the effects in sensory cortices above reflect selective enhancements of attention to a target stimulus as a means for minimizing distraction, rather than a generalized effect of task difficulty in which neural activity simply increases as a task becomes more difficult.
We predicted that regions of the DLPFC involved in focusing attention on upcoming, goal-relevant stimuli during cue processing would help to increase attention to those stimuli during target processing when an incongruent distracter conflicted with task goals. To test this prediction, we performed a conjunction analysis to determine whether the same exact regions of the DLPFC exhibited both differential cue activity and conflict-related activity. This analysis was based on the results of two random-effects, repeated-measures ANOVAs, each of which computed the interaction between two trial types across 12 time points of the hemodynamic response. One ANOVA identified voxels in which the time courses differed for cue-only trials instructing participants to orient attention to the auditory versus the visual modality. A second ANOVA identified voxels in which the time courses differed for incongruent versus congruent target-distracter pairs. Each of these orthogonal statistical maps was thresholded at F(11,165) = 2.0 (p < 0.0313) to limit the false-positive rate of the conjunction analysis to 0.001.
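The relationship between the per-map threshold and the false-positive rate of the conjunction can be checked with a short calculation. The sketch assumes that, because the two maps are orthogonal, the probability that a voxel exceeds threshold in both by chance is the product of the two per-map probabilities; the function names are hypothetical.

```python
def conjunction_fpr(p_map, n_maps=2):
    """Joint false-positive rate when each independent map uses p_map."""
    return p_map ** n_maps


def per_map_threshold(target_fpr, n_maps=2):
    """Per-map p threshold that yields target_fpr for the conjunction."""
    return target_fpr ** (1.0 / n_maps)


if __name__ == "__main__":
    print(conjunction_fpr(0.0313))    # joint rate for the reported threshold
    print(per_map_threshold(0.001))   # per-map threshold for a 0.001 target
```

Under this assumption, a per-map threshold of 0.0313 gives a joint rate of about 0.00098, just under the 0.001 target; the exact per-map value for 0.001 is the square root of 0.001, or roughly 0.0316.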
In line with our hypothesis, the conjunction analysis isolated extensive regions of the left DLPFC (BA 9) and right DLPFC (BA 46) (Table 2, Fig. 4). Confirmatory ROI analyses (Table 2) verified that each of these regions of DLPFC exhibited significantly greater (p < 0.05) peak activity for (1) cue-only trials instructing participants to attend to the auditory versus the visual modality and (2) incongruent versus congruent target-distracter pairs. We note that the overall greater response to targets than to cues in the DLPFC likely stems from the fact that although both cues and targets require attention at multiple stages of processing, only targets require attention at response stages, because no responses are made to the cue stimuli.
As discussed previously, the behavioral data indicated a cross-modal exogenous cueing effect for visual targets but not for auditory targets. Specifically, response times to visual targets were significantly longer when the preceding cue stimulus appeared in the auditory versus the visual modality. This result suggests that demands on DLPFC processes that bias attention to the visual modality might have been greater after a cue presented in the auditory versus the visual modality because, after an auditory-modality cue, it was necessary to overcome an exogenous pull of attention. In line with this prediction, we observed significantly greater left DLPFC activity (3.75-8.75 sec after stimulus onset) when participants were instructed to attend to the visual modality by a cue presented in the auditory versus the visual modality (t(15) = 2.56; p = 0.01; one-tailed). Also mirroring the behavioral data, this cross-modal cueing effect did not achieve significance for visual-modality versus auditory-modality cues instructing participants to attend to the auditory modality (p > 0.49). No cross-modal exogenous cueing effects achieved significance in the right DLPFC (p > 0.11 in both cases). Nonetheless, the results from the left DLPFC provide additional support for our view that the DLPFC is involved in biasing attention toward task-relevant stimuli.
It is important to rule out the possibility that the differential cue activity and conflict-related activity above occurred simply because neural activity increases as task difficulty becomes greater. To do so, we performed additional ROI analyses to determine whether auditory targets produced greater peak activity than visual targets, because behavioral measures indicated that auditory targets were always more difficult to discriminate. No such effect occurred in the left DLPFC (t(15) = -1.31; p > 0.104; one-tailed), and the opposite effect (i.e., greater peak activity for visual vs auditory targets) occurred in the right DLPFC (t(15) = 2.26; p < 0.039; two-tailed). Thus, there is no evidence to support the view that the differential cue activity and conflict-related activity we observed in the DLPFC occurred because neural activity simply became greater as overall task difficulty increased.
Within the cognitive division of the ACC, we predicted a role for dorsal regions in focusing attention on goal-relevant stimuli and a role for rostral regions in detecting conflict from irrelevant stimuli. In line with a role for dorsal regions in focusing attention on task-relevant stimuli, ROI analyses revealed that peak activity was significantly greater (p < 0.05) for cue-only trials instructing participants to orient attention to the auditory versus the visual modality in the dorsal ACC (BA 32/6) (Fig. 5; see Table 2 for all relevant statistics). We observed an analogous effect in ventral regions within the cognitive division of the ACC (BA 24). The difference in peak activity within the dorsal ACC was also significant (p < 0.05) for both of the subregions that comprised the dorsal ACC (i.e., caudal ACC and pre-SMA) when these regions were analyzed separately. Furthermore, as predicted for regions involved in focusing attention, incongruent target-distracter pairs produced significantly greater activity than congruent target-distracter pairs in these same exact regions of the dorsal and ventral ACC (Table 2, Fig. 5). Finally, our data also supported a specific role for the rostral ACC (BA 32) in detecting conflict from irrelevant stimuli. Here, we observed significant conflict-related activity (p < 0.05) in the absence of differential cue activity (Table 2, Fig. 5).
It is important to verify that the distinct patterns of activation that we observed in different ACC regions were not driven by the fact that these regions were composed of varying numbers of voxels. We therefore performed the ROI analysis in midline frontal regions a second time using only 20 peak voxels from each ROI. These peak voxels were those with the 20 largest F values in the repeated-measures ANOVA that was initially used to create the ROI (i.e., the ANOVA on the average target response vs baseline). Critically, the findings from this second ROI analysis replicated the main findings from the original analysis; namely, we observed the same patterns of effects in the dorsal ACC, ventral ACC, and rostral ACC. The only minor exception was that significantly greater peak activity for incongruent versus congruent target-distracter pairs was now also observed in the CMA (t(15) = 1.81; p < 0.05). Therefore, the distinct activation patterns that we observed in the dorsal ACC, ventral ACC, and rostral ACC cannot be explained by the fact that these ROIs were composed of different numbers of voxels in the original analysis.
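The voxel-count control described above can be sketched as follows: rank the voxels in an ROI by their F value from the localizing ANOVA, keep a fixed number of the highest-ranked voxels, and average the measure of interest over only those voxels, so that every ROI contributes the same number of voxels. This is a simplified illustration and not the authors' pipeline. The function name `top_voxel_mean` and the small example arrays are hypothetical, and the example uses k = 3 for brevity where the study used 20.

```python
def top_voxel_mean(f_values, signal, k=20):
    """Average `signal` over the k voxels with the largest F values.

    f_values[i] and signal[i] must describe the same voxel.
    """
    if len(f_values) != len(signal):
        raise ValueError("f_values and signal must have the same length")
    if not 0 < k <= len(f_values):
        raise ValueError("k must be between 1 and the number of voxels")
    # Voxel indices sorted from largest to smallest F value.
    ranked = sorted(range(len(f_values)), key=lambda i: f_values[i], reverse=True)
    return sum(signal[i] for i in ranked[:k]) / k


# Hypothetical ROI with six voxels (values invented for illustration).
roi_f = [3.1, 9.4, 7.2, 12.8, 5.0, 10.3]    # F value per voxel
roi_peak = [0.2, 0.6, 0.4, 0.9, 0.3, 0.7]   # peak response per voxel

equalized_peak = top_voxel_mean(roi_f, roi_peak, k=3)
```

Applying the same `k` to every ROI removes differences in voxel count as an explanation for differences in the pattern of effects.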
Finally, as for the DLPFC discussed previously, we tested whether the effects we observed in midline frontal regions might have resulted from overall increases in task difficulty. Of importance, none of our midline frontal ROIs exhibited significantly greater peak activity for auditory versus visual targets (p > 0.05 in every case; one-tailed), although it was always more difficult to discriminate auditory versus visual targets. Thus, there is no evidence that the activations we observed in midline frontal cortices can be accounted for by a model in which neural activity simply becomes greater with increasing task difficulty.
The goal of the present study was to investigate the neural circuitry that minimizes cross-modal distraction. Several novel results inform current neurological theories of attention, including those pertaining to clinical populations with schizophrenia (Carter et al., 1997) and attention deficit hyperactivity disorder (Bush et al., 1999).
The neural correlates of minimizing distraction in sensory cortices
First, our findings indicate that minimizing distraction from irrelevant stimuli involves selectively increasing activity in target-specific sensory cortices. Conflict between a visual target and an auditory distracter led to increased activity in the visual cortex without modulating activity in the auditory cortex. Conversely, conflict between an auditory target and a visual distracter led to increased activity in the auditory cortex without modulating activity in the visual cortex. This double dissociation concurs with findings that orienting attention to a particular stimulus feature selectively increases activity within sensory regions that process that feature (Kastner et al., 1999; Shulman et al., 1999; Hopfinger et al., 2000; Giesbrecht et al., 2003; Woldorff et al., 2004). Our findings extend these results, however, by demonstrating that target-specific increases in sensory activity also play an important role in minimizing distraction from irrelevant stimuli. Specifically, the larger the increase in visual cortical activity an individual displayed while identifying a visual letter under conditions of distraction, the less that individual exhibited behavioral interference from auditory distracters. This result indicates that amplifying the sensory representation of a goal-relevant stimulus helps to minimize distraction.
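The brain-behavior relationship described above is an across-subjects correlation: each participant contributes one value for the conflict-related increase in sensory cortical activity and one value for behavioral interference, and a negative coefficient indicates that larger increases accompany less interference. The following is a minimal sketch of a Pearson correlation under those assumptions. The function name, variable names, units, and all values are hypothetical and invented for illustration; they are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# One entry per hypothetical participant (values invented for illustration).
# Conflict-related increase in visual cortical activity (incongruent - congruent).
activity_increase = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
# Behavioral interference from auditory distracters (incongruent - congruent RT, ms).
interference_ms = [62, 55, 58, 41, 37, 30]

r = pearson_r(activity_increase, interference_ms)
# A negative r corresponds to the pattern described in the text.
```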
One might wonder how the modulations of sensory cortical activity function to minimize distraction in our task. These modulations likely occurred after conflicting stimuli were detected by the ACC. Thus, by the time conflict was detected, it may already have produced some adverse effects on behavior. Moreover, because stimulus presentation lasted only 350 msec, target presentation may have ended before conflict-related activity occurred in target-specific sensory areas.
Current models posit that when distracting stimuli conflict with task goals, the ACC signals DLPFC to further bias activity in other pathways to favor the processing of task-relevant stimuli, associations, and responses (Cohen et al., 1990; Botvinick et al., 2001; Kerns et al., 2004a). In line with this view, conflict-related increases of activity in target-specific perceptual cortices occur relatively late in processing, after conflict is detected and after stimulus presentation ends (West and Alain, 1999). Such increases likely reflect interactions between DLPFC and sensory cortices that maintain and/or reactivate task-relevant sensory representations after stimulus presentation ends, thereby allowing these representations to guide the selection of task-appropriate responses under conditions of distraction (Kerns et al., 2004a; West and Alain, 1999). Although conflicting stimuli may inevitably cause some behavioral interference when perceptual load is low (Lavie, 1995), the final magnitude of interference depends on the efficacy of DLPFC and ACC cognitive control processes. Supporting this view, participants who experience greater conflict-related activity in the dorsal ACC exhibit reduced behavioral interference effects (Weissman et al., 2004). Thus, late modulations of target-specific sensory activity are likely just one aspect of a larger circuit that minimizes distraction after conflict is detected.
The selective, conflict-related effects that we observed in sensory cortices also speak to the issue of modality-specific attentional resources. Increasing attention to goal-relevant stimuli usually reduces brain activity associated with irrelevant stimuli from the same sensory modality. For example, increasing attention to a word at fixation reduces the response of area MT to irrelevant motion in the peripheral visual field (Rees et al., 1997). These results suggest that within-modality attentional resources are limited, such that allocating greater resources to goal-relevant stimuli necessarily reduces those that are left over to process irrelevant stimuli (Lavie, 1995). In the present study, however, increasing attention to goal-relevant stimuli did not significantly modulate the amount of activity observed in sensory cortices that processed the distracter. Our findings therefore concur with recent data indicating that increasing attention to stimuli in one sensory modality does not reduce the processing of irrelevant stimuli from a different modality, consistent with evidence that each sensory modality has access to its own independent pool of attentional resources (Rees et al., 2001). Future studies will be necessary to determine the generality of this cross-modal effect.
The role of the DLPFC in minimizing distraction
Our second novel finding is that conflict-related activity in the DLPFC reflects processes that increase attention toward goal-relevant stimuli (Banich et al., 2000b). Specifically, we observed both differential cue activity and conflict-related activity in identical regions of the DLPFC. Although previous researchers have reported both differential cue activity (MacDonald et al., 2000) and conflict-related activity (Banich et al., 2000a) in the DLPFC, these effects were never measured in the same study, raising the possibility that they occurred in different regions and thus reflected distinct cognitive control processes. Because we localized attention-related and conflict-related activity to the same regions of the DLPFC, our results suggest that at least some of the regions of the DLPFC involved in biasing attention toward goal-relevant stimuli during cue processing help to increase that bias during target processing when irrelevant stimuli conflict with behavioral goals (Banich et al., 2000b; Botvinick et al., 2001). This interpretation fits with the effects we observed in sensory cortices and with models in which the DLPFC biases activity in sensory cortices to favor the sensory representations of goal-relevant stimuli (Desimone, 1998; Kastner et al., 1999; Hopfinger et al., 2000). It also supports the view that the DLPFC helps to keep task goals active in working memory (Banich et al., 2000b; MacDonald et al., 2000; Miller, 2000; Botvinick et al., 2001), which leads to increased attention to goal-relevant stimuli when irrelevant stimuli conflict with behavioral objectives (Kerns et al., 2004a).
Regional specialization in the ACC
The third novel contribution of our findings is that they potentially resolve a long-standing controversy concerning whether the ACC focuses attention on task-relevant stimuli (Posner and DiGirolamo, 1998; Dreher and Berman, 2002) or detects conflict from distracting events (MacDonald et al., 2000; Botvinick et al., 2001) by indicating that each of these functions is implemented in a different region within the cognitive division of the ACC. In dorsal and ventral regions, we observed both differential cue activity and conflict-related activity. Cues instructing participants to attend to the auditory versus the visual modality imposed similar demands on basic sensory, semantic, and motor preparation processes. Therefore, the differential cue activity we observed likely indexed differential demands on processes that focus attention on goal-relevant stimuli (MacDonald et al., 2000). Importantly, this differential cue activity could not have reflected processes that detect conflicting irrelevant stimuli, which were absent during cue processing. Thus, the most parsimonious explanation for our findings in the dorsal and ventral ACC is that these regions play a role in focusing attention on goal-relevant stimuli during cue processing and, if necessary, during target processing to minimize distraction from conflicting, irrelevant stimuli (Posner and DiGirolamo, 1998).
Our findings in the dorsal ACC contrast with data from a visual-modality fMRI study of the Stroop interference effect (MacDonald et al., 2000), which indicated a role for the dorsal ACC in detecting conflict rather than in focusing attention. The critical difference between the two studies was likely the duration of the cue-target interval. Specifically, our cue-target interval (1.25 sec) was much shorter than that used in the previous study (12.5 sec). Multiple findings indicate that using a long cue-target interval encourages participants to wait before orienting their attention (Nobre, 2001; Ghose and Maunsell, 2002). The relatively long interval used in the Stroop fMRI study therefore probably diminished the likelihood that participants oriented their attention at the time of cue presentation and, in turn, the probability of observing activity in the dorsal ACC that was related to focusing attention. Thus, the present findings probably paint a more accurate picture of the role of the dorsal ACC in focusing attention than do findings from either the previous Stroop fMRI study above or studies of distraction in which cue-related activity was not measured (Botvinick et al., 1999; Carter et al., 2000; Kerns et al., 2004b). Additional studies will be necessary to determine whether the dorsal ACC participates in increasing attention to sensory and/or motor representations of goal-relevant stimuli, because its anatomical connections with superior parietal and frontal regions would allow it to influence attention to either or both of these types of representations (Devinsky et al., 1995; Bush et al., 2000).
Contrasting with our findings in dorsal regions, in rostral regions of the cognitive division of the ACC, we observed conflict-related activity in the absence of differential cue activity, supporting a specific role for these regions in detecting conflict from irrelevant stimuli. This result suggests that rostral regions signal the presence of conflicting stimuli to the DLPFC and dorsal ACC regions, which in turn minimize distraction by increasing attention to goal-relevant stimuli. Our data thus suggest a resolution to the long-standing debate over whether the ACC focuses attention or detects conflict by demonstrating that both of these processes occur in distinct ACC regions.
The present study has revealed novel roles for sensory cortices, the DLPFC, and the ACC in minimizing distraction. Future studies investigating how cognitive control mechanisms operate on multisensory stimuli may continue to inform neurological models of attention while providing a more ecologically valid framework for understanding how attention operates in multisensory environments.
This work was supported by a postdoctoral National Research Service Award (1 F32 NS41867-01) to D.H.W., by a Dean's Summer Research Fellowship to L.M.W. from the Duke University Undergraduate Research Support Office, and by grants from the National Institutes of Health (MH60415 and P01 NS41328, Project 2) to M.G.W. We thank Rebecca Compton, Scott Huettel, and two anonymous reviewers for many helpful comments and suggestions. We also thank Chris Karp and Naomi Wiesenthal for their careful proofreading of this manuscript.
Correspondence should be addressed to Dr. D. H. Weissman, Center for Cognitive Neuroscience, Box 90999, Duke University, Durham, NC 27708.
Copyright © 2004 Society for Neuroscience 0270-6474/04/2410941-09$15.00/0