Attention can be voluntarily directed to a location or automatically summoned to a location by a salient stimulus. We compared the effects of voluntary and stimulus-driven shifts of spatial attention on the blood oxygenation level-dependent signal in humans, using a method that separated preparatory activity related to the initial shift of attention from the subsequent activity caused by target presentation. Voluntary shifts produced greater preparatory activity than stimulus-driven shifts in the frontal eye field (FEF) and intraparietal sulcus, core regions of the dorsal frontoparietal attention network, demonstrating their special role in the voluntary control of attention. Stimulus-driven attentional shifts to salient color singletons recruited occipitotemporal regions sensitive to color information, as well as part of the dorsal network, including the FEF, suggesting a partly overlapping circuit for endogenous and exogenous orienting.
The right temporoparietal junction (TPJ), a core region of the ventral frontoparietal attention network, was strongly modulated by stimulus-driven attentional shifts to behaviorally relevant stimuli, such as targets at unattended locations. However, the TPJ did not respond to salient, task-irrelevant color singletons, indicating that behavioral relevance is critical for TPJ modulation during stimulus-driven orienting. Finally, both ventral and dorsal regions were modulated during reorienting, but the modulation was significant only for reorienting after voluntary shifts, suggesting the importance of a mismatch between expectation and sensory input.
Mechanisms of selective attention ensure that we perceive and respond to a limited set of the large number of objects present in the environment. Previous research has identified two major ways in which attention can be directed to an object. Attention can be voluntarily directed to an object based on our current behavioral goals [endogenous (endo) orienting] or can be reflexively drawn to an object because of its sensory salience [exogenous (exo) orienting] (Jonides, 1981; Yantis and Jonides, 1990).
Psychological studies of the effects of spatial cues on performance on target stimuli that subsequently appear at the cued or uncued locations suggest that separable processes are involved in endogenous and exogenous shifts of attention (Muller and Rabbitt, 1989; Nakayama and Mackeben, 1989). Centrally located cues only facilitate performance at the cued location when the cue provides information about the subsequent target location (endogenous orienting). However, peripheral cues facilitate performance even when the cues provide no information about target location (exogenous orienting). Moreover, central and peripheral cues have different time courses. Peripheral cues produce facilitation at the cued location at slightly shorter cue-target stimulus-onset asynchronies (SOAs) than central cues but show inhibitory effects (i.e., slower response times at the cued location) at longer SOAs, an effect known as inhibition of return (IOR) (Posner and Cohen, 1984; Posner et al., 1985; Abrams and Dobkin, 1994; Klein, 2000).
Previous neuroimaging studies have not found distinct neural mechanisms for endogenous and exogenous orienting (Kim et al., 1999; Rosen et al., 1999; Mayer et al., 2004; Peelen et al., 2004). However, these studies have not separated the activations produced by the cue from the activations produced by the target. Ideally, one would like to examine the orienting mechanisms engendered by the cue independently of how those orienting mechanisms interact with the subsequent target. Moreover, in previous cueing studies, the exogenous condition involved very different luminance transients than the endogenous condition, complicating comparisons of the two conditions (Kim et al., 1999; Lepsien and Pollmann, 2002; Mayer et al., 2004).
An event-related functional magnetic resonance imaging (fMRI) study that separated endogenous orienting from target detection (Corbetta et al., 2000) suggested that exogenous and endogenous mechanisms may activate different brain regions. When spatial attention was deployed through endogenous mechanisms, a bilateral dorsal frontoparietal network was activated. Conversely, during target detection, a right (R)-lateralized ventral frontoparietal network was strongly activated, particularly when a salient target occurred in an unexpected location and initiated reorienting in a stimulus-driven manner. These results suggested that the ventral frontoparietal network [specifically the temporoparietal junction (TPJ), including the supramarginal gyrus (SMG) and superior temporal gyrus (STG)] may play a special role in stimulus-driven shifts of attention (Corbetta and Shulman, 2002).
Stimulus-driven shifts of attention may occur to highly salient sensory stimuli that are irrelevant to the current task, e.g., a flash in the periphery of the visual field (exogenous orienting) (Posner and Cohen, 1984), but such shifts also occur to stimuli that share some property with the target for which subjects are looking (e.g., a red letter when searching for a red digit). This latter stimulus-driven effect, reflecting the behavioral relevance of the stimulus features rather than their sensory salience, is called “contingent” orienting (Folk et al., 1992, 2002). A recent report (Serences et al., 2005) indicates that the TPJ is activated by distracter stimuli that match the color of the target, supporting a role of this region in contingent orienting. However, it is unknown whether this region is also involved in orienting to salient but irrelevant stimuli that do not share target features.
Macaque area 7a, located on the gyral surface of the inferior parietal lobule, might be a homolog of the human SMG, one of the regions forming the TPJ. Recent neurophysiological evidence indicates that neurons in area 7a show enhanced spatially specific responses for an unattended but perceptually salient item in a display, e.g., a red object in a field of green objects (color singleton) (Constantinidis and Steinmetz, 2001). This result may implicate a role of the TPJ in exogenous orienting.
The main goal of this study was to determine whether there are distinct neural networks for endogenous and exogenous attentional orienting. We designed a paradigm that separated the neural signals associated with the cue from those associated with the target. In the exogenous cueing condition, color singletons were used to draw attention to a location, ensuring that the spatial distribution of luminance in the exogenous and endogenous cue conditions was comparable (Theeuwes and Burger, 1998).
Materials and Methods
Twenty (n = 20) right-handed, healthy adults ranging in age from 19 to 30 (n = 9 females) were scanned. Participants were screened in advance of their participation and were excluded from the study if they had pre-existing neurological or psychiatric conditions based on their responses on a brief questionnaire. Participants had normal or corrected-to-normal vision and were not taking any psychoactive medications. The participants were recruited from the Washington University community and provided informed consent according to the approved guidelines of the Institutional Review Board of the Washington University School of Medicine. Participants were paid for their participation in the study.
All participants completed a behavioral training session before the imaging session. Only participants who demonstrated the expected cueing effects (in both the endogenous and exogenous conditions) in the behavioral session were invited to participate in the imaging session. Of the 28 participants who completed the behavioral session, 23 (82%) demonstrated significant spatial cueing effects in both conditions. Three of these 23 participants did not return for their scheduled imaging session, leaving the 20 participants who were scanned.
The behavioral tasks were presented using an Apple G4 Macintosh computer (Apple Computer, Cupertino, CA), programmed with PsyScope software (Cohen et al., 1993). Within the scanner, the stimulus display was projected by an Epson PowerLite 703c liquid crystal display projector (Epson America, Long Beach, CA) onto a Plexiglas screen positioned at the end of the bore. Participants viewed the display through a mirror attached to the head coil.
Method: behavioral session
The experimental design involved three conditions (Fig. 1a). In all conditions, the display was presented against a black background. A diamond that remained on the screen at all times surrounded a central fixation cross. Participants were instructed to maintain fixation on the central diamond throughout the block of trials. During the intertrial interval (ITI), the fixation cross and surrounding diamond were dimmed. During the presentation of a trial, the fixation cross and diamond were brightened at the beginning of the cue period and remained white throughout the trial. Trials began with the presentation of a cue display consisting of eight colored boxes, which remained on the screen for 100 ms. The eight colored boxes were equally spaced along a virtual square centered on the fixation diamond. The visual angle of the entire grid of boxes was 12.2° [i.e., 6.1° left (L) and right], and the visual angle of each box was 1.9°. The central fixation cross and diamond also subtended 1.9° of the visual angle. The luminance of the colored boxes was 2.9 cd/m².
After one of two possible cue-target SOAs (150 and 700 ms) that were randomly selected from trial to trial, a randomly rotated target letter appeared on the screen for 150 ms. The target was either the letter T or the letter L and was presented in either the left or right visual field in a location that had previously contained a colored square during the cue period. The letter subtended 1.4° of the visual angle and was presented in one of four possible randomly selected orientations [i.e., vertical (veridical), vertically inverted, tilted 90° to the left, and tilted 90° to the right].
Participants indicated the identity of the target letter by pressing one of two adjacent keys on a keyboard with one of two fingers (i.e., index or middle finger). All participants responded with their right (dominant) hand. The keyboard was rotated so that the two response buttons were aligned along a vertical axis, perpendicular to the horizontal axis joining the left and right target letter positions in the visual display. The two keys used to indicate the identity of the letters were counterbalanced across subjects. Participants were told to make their response as quickly as possible without sacrificing accuracy. After the target presentation, an ITI of 1000 ms occurred, during which only the dimmed central fixation cross and diamond were present on the screen. Test blocks also included a proportion of cue-only trials (20%). Cue-only trials were identical to the cue plus target trials described previously, except that the trial ended after the presentation of the cue. In these trials, after the cue display was presented and a randomly selected SOA of either 150 or 700 ms had elapsed, no target letter appeared; instead, the fixation cross and diamond were dimmed to gray, indicating to the subject that the trial was over.
The three cueing conditions were distinguished by the colors in the display and the informativeness of the display.
Exogenous orienting task. In the exogenous orienting cue condition, seven of the eight boxes were the same color, and one was a unique color (i.e., a color singleton). The colors that defined the singleton and nonsingleton boxes varied from trial to trial. Four colors (red, green, blue, and purple) were used to define the singleton and nonsingleton boxes. All possible color combinations produced 12 possible exogenous displays for each position of the singleton, and these were randomly presented from trial to trial. The singleton always occurred in either the left or right box that was aligned with the fixation box, but the left-right location of the subsequent target was completely random with respect to the left-right location of the singleton.
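The count of 12 exogenous displays per singleton position follows from choosing one of the four colors for the singleton and a different one for the remaining seven boxes. A minimal sketch of that enumeration is given below; the variable names are illustrative and are not taken from the original stimulus software.

```python
from itertools import permutations

COLORS = ("red", "green", "blue", "purple")  # the four colors named in the text

# Each display pairs one singleton color with a different color
# shared by the other seven boxes (ordered pairs of distinct colors).
exo_displays = [
    {"singleton": s, "others": o} for s, o in permutations(COLORS, 2)
]

print(len(exo_displays))  # 4 singleton colors x 3 remaining colors = 12
```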
Neutral (nonspatial) task. In the neutral condition, there were always two boxes of each color (i.e., blue, purple, green, and red) present in the display. The position of the different colored boxes was pseudorandomized with two constraints. First, two boxes of the same color could not occur directly adjacent to one another. Second, two boxes of the same color could not occur at the corners of the grid in either vertical or horizontal directions. These constraints were included to prevent the formation of perceptual groups that might induce a spatial bias. These two constraints, along with the four possible colors, resulted in 48 possible color combinations, which were randomly selected on each trial.
Endogenous orienting task. The combinations of colored boxes in the endogenous cue display were identical to those in the neutral cueing condition. However, in the endogenous condition, only one-half of the fixation diamond was brightened (rather than both sides, as in the exogenous and neutral conditions) to form an arrow cue that pointed to either the left or right side of the cue display. After the presentation of an arrow cue, the target appeared at the cued location on 75% of the trials (valid trials) and at the opposite location on 25% of the trials (invalid trials). Participants were informed of these probabilities and encouraged to covertly attend to the location cued by the arrow. The direction of the arrow cue varied randomly across trials.
Exogenous and neutral cueing trials were randomly intermixed in the same blocks of trials. Endogenous cueing trials were presented in separate blocks. There were three blocks of 80 trials (240 trials total) in the combined exogenous/neutral cueing conditions and two blocks of 80 trials (160 trials total) in the endogenous cueing condition.
Method: fMRI session
The procedure was the same as that in the behavioral session, with the following exceptions. First, the cue-target SOA in the imaging session was fixed and increased to one MR frame (2.16 s). Second, the ITI was randomly jittered between trials and lasted for one, two, or three MR frames (2160, 4320, or 6480 ms, respectively). Third, all display dimensions were reduced proportionately, based on a 7.6° visual angle for the cue display grid.
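The jittered ITI can be illustrated with a short sketch that draws one, two, or three MR frames per trial and converts them to milliseconds. The text states only that the ITI was randomly jittered, so the equal-probability draw, the seed, and the function name below are assumptions made for illustration.

```python
import random

TR_MS = 2160  # duration of one MR frame in ms, from the text

def jittered_itis(n_trials, seed=0):
    """Draw an ITI of 1, 2, or 3 MR frames per trial and return it in ms.

    Assumption: the three durations are equally likely; the text does not
    state the sampling distribution.
    """
    rng = random.Random(seed)
    return [rng.choice((1, 2, 3)) * TR_MS for _ in range(n_trials)]

itis = jittered_itis(30)
print(sorted(set(itis)))  # a subset of [2160, 4320, 6480]
```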
Behavioral responses were recorded by a fiber-optic key-press response made with either the left or right index finger. All subjects pressed the top button with the left index finger and the bottom button with the right index finger. The assignment of keys to target letters was counter-balanced across subjects. Participants were instructed to maintain fixation on the central fixation cross throughout a blood oxygenation level-dependent (BOLD) run. Eye movements were monitored and recorded during the imaging session with a specially modified MR-compatible Applied Science Laboratories (Bedford, MA) model 504 eye-tracking system with a resolution of 0.5° of the visual angle.
Participants completed a total of 540 trials in the imaging session, with 180 trials in each of the three cueing conditions. Each participant completed nine exogenous/neutral BOLD scans, in which each scan consisted of 40 trials (20 exogenous and 20 neutral), and six endogenous BOLD scans, in which each scan consisted of 30 trials. For all three cueing conditions, 20% of the trials were cue-only, and 80% were cue plus target.
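The trial counts above can be checked with simple arithmetic. The sketch below reproduces the totals from the stated numbers of scans and trials per scan; the constant names are illustrative only.

```python
EXO_NEUTRAL_SCANS = 9
TRIALS_PER_EXO_NEUTRAL_SCAN = 40  # 20 exogenous + 20 neutral
ENDO_SCANS = 6
TRIALS_PER_ENDO_SCAN = 30
CUE_ONLY_PERCENT = 20

exo_trials = EXO_NEUTRAL_SCANS * (TRIALS_PER_EXO_NEUTRAL_SCAN // 2)
neutral_trials = EXO_NEUTRAL_SCANS * (TRIALS_PER_EXO_NEUTRAL_SCAN // 2)
endo_trials = ENDO_SCANS * TRIALS_PER_ENDO_SCAN

total_trials = exo_trials + neutral_trials + endo_trials
cue_only_trials = total_trials * CUE_ONLY_PERCENT // 100
cue_plus_target_trials = total_trials - cue_only_trials

print(exo_trials, neutral_trials, endo_trials)  # 180 180 180
print(total_trials, cue_only_trials, cue_plus_target_trials)  # 540 108 432
```

The 432 cue plus target trials correspond to the number of target presentations per participant in the imaging session.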
fMRI scan acquisition and data analysis. Imaging was performed on a 1.5 tesla Siemens (Munich, Germany) Vision system. An asymmetric spin-echo echoplanar imaging sequence was used to measure BOLD contrast [repetition time (TR), 2.16 s; echo time (TE), 37 ms; flip angle, 90°]. Each scan consisted of either 122 or 165 MR frames (endogenous and exogenous/neutral runs, respectively), during which 16 contiguous 8 mm axial slices were acquired (3.75 × 3.75 mm in-plane resolution). Anatomical images were acquired using a sagittal magnetization-prepared rapid-acquisition gradient echo (MPRAGE) sequence (TR, 97 ms; TE, 4 ms; flip angle, 12°; inversion time, 300 ms). Functional data were realigned within and across runs to correct for head movement and coregistered with the anatomical data. Whole-brain normalization was applied to equate signal intensity across subjects. The data from each individual were warped to a standard stereotactic atlas space by aligning each participant's MPRAGE to an atlas target brain (Talairach and Tournoux, 1988).
For each individual, the time courses of the hemodynamic (BOLD) responses in the different conditions were analyzed at the voxel level using a linear regression model that yielded separate time courses (eight time points in each time course) for the cue phase (e.g., the cue period after a left endogenous cue) and target phase (e.g., the target period after a left endogenous cue) of a condition without assuming a shape for the hemodynamic response.
The regression model was coded as follows. Each time a cue for a particular condition occurred, a set of eight δ functions, one function for each time point or TR, was coded into the design matrix, starting at the TR for cue onset. Similarly, each time a target for a condition occurred, another set of eight δ functions was encoded into the model, starting at the TR for the target onset. On cue-only trials, only the cue set of eight δ functions was coded into the design matrix. On cue plus target trials, both the cue set and the target set of δ functions were coded into the design matrix, with the first target time point regressor starting one TR after the first cue time point regressor (i.e., the duration of the cue period). The use of cue-only trials decorrelated the cue and target onsets, whereas the use of random ITI durations decorrelated the cue or trial onset. We have previously presented a quantitative validation showing that this model correctly estimates the cue and target time courses (Ollinger et al., 2001) and have used this model in a number of previous publications (Shulman et al., 1999; Corbetta et al., 2000). The model also included terms on each scan for an intercept, linear trend, and temporal high-pass filter.
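The δ-function coding can be sketched for a single condition as follows. This is an illustration under simplifying assumptions rather than the analysis code used in the study: the trial onsets are hypothetical, only one condition is modeled, and the intercept, trend, and high-pass terms are omitted. The function and variable names are likewise illustrative.

```python
N_POINTS = 8  # time points (MR frames) estimated per event type, as in the text

def build_design(trials, n_frames, n_points=N_POINTS):
    """Return an n_frames x (2 * n_points) matrix of delta regressors.

    trials: list of (cue_frame, has_target) tuples. When a target is
    present it starts one frame after the cue, as described in the text.
    Columns 0..n_points-1 code the cue; the remaining columns code the target.
    """
    X = [[0] * (2 * n_points) for _ in range(n_frames)]
    for cue_frame, has_target in trials:
        for k in range(n_points):
            if cue_frame + k < n_frames:
                X[cue_frame + k][k] += 1
            if has_target and cue_frame + 1 + k < n_frames:
                X[cue_frame + 1 + k][n_points + k] += 1
    return X

def column(X, j):
    """Extract column j of the design matrix as a list."""
    return [row[j] for row in X]

# Hypothetical mini-run: cue onset frame and whether a target followed.
trials = [(0, True), (3, False), (6, True), (10, True), (13, False)]
X = build_design(trials, n_frames=30)

# With cue-only trials, the second cue regressor and the first target
# regressor differ, so their time courses can be estimated separately.
print(column(X, 1) != column(X, N_POINTS))  # True

# Without cue-only trials, the same two regressors are identical (collinear).
X_all = build_design([(f, True) for f, _ in trials], n_frames=30)
print(column(X_all, 1) == column(X_all, N_POINTS))  # True
```

The second comparison shows why a fixed one-frame cue period makes cue-only trials necessary: if every cue were followed by a target, each target regressor would duplicate the cue regressor one time point later.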
Statistical analyses of the time courses involved ANOVAs in which experimental factors were crossed with the factor MR frame (with time points 1-8 as the levels on this time factor). Separate ANOVAs were conducted on the time courses for the cue and target periods. During the cue period, the experimental factors included cue type (exogenous, endogenous, and neutral) and cue direction (left or right, for exogenous and endogenous conditions only). During the target period, experimental factors included cue type (exogenous, endogenous, and neutral), target direction (left or right) and validity (valid or invalid, for exogenous and endogenous conditions only). All analyses treated subjects as a random effect. Statistical images were corrected for multiple comparisons over the entire brain (p < 0.05), using a magnitude threshold derived from Monte Carlo simulations that takes into account the number of contiguous activated voxels (Forman et al., 1995). Coordinates of each cluster of activation were identified by an automated algorithm that searched for local maxima and minima (Mintun et al., 1989).
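For reference, the degrees of freedom of the within-subject effects in these ANOVAs can be derived from the number of levels of each factor and the number of subjects. The sketch below assumes a fully within-subject design with uncorrected degrees of freedom (no sphericity adjustment); the function name is illustrative.

```python
def rm_anova_df(levels, n_subjects):
    """Return (effect df, error df) for a within-subject effect.

    levels: number of levels of each factor entering the effect.
    Assumes uncorrected degrees of freedom in a fully within-subject design.
    """
    effect_df = 1
    for k in levels:
        effect_df *= k - 1
    return effect_df, effect_df * (n_subjects - 1)

# 20 scanned subjects; time has 8 levels (MR frames 1-8).
print(rm_anova_df((3, 8), 20))  # cue type (3) x time: (14, 266)
print(rm_anova_df((2, 8), 20))  # two cue types x time: (7, 133)
print(rm_anova_df((3,), 20))    # a three-level factor alone: (2, 38)
```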
We also constructed several regions of interest (ROIs) to test our a priori hypotheses concerning the activations of dorsal and ventral frontoparietal areas during endogenous and exogenous orienting. One set of ROIs was constructed using group-averaged data from a previously published experiment (Corbetta et al., 2000), ensuring that the ROIs were not biased by the noise in the current data set. In this experiment, subjects were shown a foveal arrow cue indicating that a target would appear 3.3° to the left or right of fixation. After a 2.36 s cue period, the target was presented, and subjects responded with a button press. The spatial cue correctly indicated the target location on most trials (valid cue), but incorrectly indicated the location on the other trials (invalid cue). Because dorsal frontoparietal areas involved in endogenous orienting should show preparatory activity after a central cue to shift attention, dorsal ROIs were identified from the voxels in this previous experiment that were significantly activated during the cue period. Six dorsal ROIs (three in the left hemisphere and three in the right hemisphere) were defined: two in the anterior (ant) intraparietal sulcus (IPS) (left, x, y, z = -31, -54, 48; right, 28, -49, 44), two more posteriorly in the IPS (left, -23, -66, 51; right, 24, -63, 49), and two in the frontal eye field (FEF) (left, -23, -11, 49; right, 28, -12, 49). Because ventral frontoparietal areas involved in stimulus-driven reorienting should show greater activity for invalid targets, which initiate reorienting, than for valid targets, ventral ROIs were defined based on the voxels in the previous experiment that were significantly more activated during the target period by the detection of invalidly cued targets than by that of validly cued targets. Three ventral ROIs (all in the right hemisphere) were defined: R SMG (52, -49, 29), R STG (57, -48, 11) and R inferior frontal gyrus (IFG) (36, 43, -3).
We also constructed a set of dorsal and ventral ROIs from the current experiment. Dorsal ROIs were defined from the voxels showing an effect of cue type by time during the cue period (see Results). Two ventral ROIs, one in the SMG and one in the IFG, were defined from voxels that showed larger responses to invalid targets than to valid targets during the target period of the endogenous condition.
The statistical significance of the activations in these ROIs was evaluated in regional ANOVAs, in which the values of all voxels within the ROI were summed before the computation of the ANOVA (i.e., the region was treated as a single voxel). Regional ANOVAs were conducted using the same factors as those discussed above for the voxel-level analyses.
Eye-movement data acquisition and analysis. Usable eye-movement data were obtained from 16 of the 20 subjects scanned in this study. Of the four subjects for whom eye-movement data were unusable, the eye-movement monitor was not used for two subjects because of technical difficulties, one subject's data were very noisy because of excessive blinking, and one subject's data were very noisy because of the subject's contact lenses.
The eye-movement data from these 16 subjects were analyzed for broken fixations during both the cue and target periods.
Behavioral training session
The results in this section reflect all 28 participants that completed the behavioral session. Data from the endogenous cueing condition for one participant were lost as a result of technical difficulties. Therefore, that participant was excluded from any repeated-measures analyses involving that condition.
Participants performed the task with a high degree of accuracy (mean accuracy rates: endogenous trials, 96.5%; exogenous trials, 97.1%; neutral trials, 97.6%). There were no significant differences in accuracy between the three conditions (F(2,52) = 1.31) and no main effects or interactions involving SOA or target validity.
In the endogenous cueing condition, reaction time (RT) was significantly longer on invalid than valid trials (F(1,27) = 75.5; p < 0.0001), indicating that subjects used the spatial cue. The interaction of validity and SOA was also significant (F(1,27) = 25.4; p < 0.0001), indicating that the cue was used more effectively at the long SOA than at the short SOA. The RT difference between valid and invalid targets at the short SOA was only 13 ms (549 ms, valid; 562 ms, invalid), whereas the RT difference at the long SOA was 43 ms (531 ms, valid; 574 ms, invalid). These results are consistent with previous studies of endogenous cueing, which indicate that participants receive a greater benefit from the cue when they have sufficient time to process it (Muller and Rabbitt, 1989).
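The validity effects quoted above are simple differences of the reported mean RTs. A short sketch of the computation, using only the means given in the text (the dictionary layout and function name are illustrative):

```python
# Mean RTs (ms) reported for the endogenous condition of the behavioral session.
mean_rt_ms = {
    ("short", "valid"): 549,
    ("short", "invalid"): 562,
    ("long", "valid"): 531,
    ("long", "invalid"): 574,
}

def validity_effect(soa):
    """Invalid minus valid mean RT (ms) at the given SOA."""
    return mean_rt_ms[(soa, "invalid")] - mean_rt_ms[(soa, "valid")]

print(validity_effect("short"))  # 13
print(validity_effect("long"))   # 43
```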
The RT data from the exogenous cueing condition also indicated an effect of the cue on performance, with slower responses to invalid than valid targets (F(1,27) = 11.0; p < 0.005), although the cue in this condition was completely noninformative. Although the validity effect appeared to be larger at the short SOA (22 ms invalid-valid difference) than at the long SOA (9 ms invalid-valid difference), the interaction between SOA and validity was not statistically significant (F(1,27) = 2.33; p = 0.14). IOR was not observed.
A joint analysis of the RT data from the exogenous and neutral conditions indicated that RTs to invalid trials were slower than RTs to both valid and neutral trials (F(1,27) = 11.2; p = 0.002; 557 ms, valid; 559 ms, neutral; 572 ms, invalid).
In the fMRI session, participants performed the task accurately, with no significant differences in accuracy among the three cueing conditions (F(2,38) = 1.01; endogenous trials, 97.19%; exogenous trials, 97.05%; neutral trials, 97.64%). There were also no significant effects of validity on accuracy (F(1,19) = 0.14) and no interaction of cue type and validity (F(1,19) = 1.96).
An ANOVA on the RT data indicated that participants used the cue in the endogenous condition, with a highly significant main effect of validity (558 ms, valid; 598 ms, invalid; F(1,19) = 34.4; p < 0.0001). Therefore, participants were able to maintain attention to the cued location for the duration of the SOA (2.16 s).
An ANOVA on the RT data in the exogenous condition also revealed a significant main effect of validity (F(1,19) = 4.37; p = 0.02), reflecting slightly slower responses after invalid exogenous cues (569 ms, invalid; 560 ms, valid). Neutral RTs (563 ms) were between those for valid and invalid trials, indicating that the exogenous and neutral conditions were well matched.
Analyses of the eye-movement data from 16 participants revealed a very low occurrence of broken fixations. Of a total of 540 cue periods, the average percentage of broken fixations across all of the participants was 0.9%, with a range from 0 to 2.7%. Similarly, the average percentage of broken fixations after the presentation of all 432 targets across all participants was 1.8%, with a range from 0.23 to 3.7%. Therefore, activations observed during the imaging study were not attributable to eye movements made by participants. Trials with eye movements were included in the analyses of the fMRI data. A previous study done in our laboratory in which activation data were analyzed and compared when broken fixations were included or removed revealed no significant changes in patterns of activation (Astafiev et al., 2003).
Neuroimaging results: cue period
Dorsal frontoparietal network
Cue-related activity indexes brain activity related to the deployment of attention under endogenous versus exogenous conditions. A basic question is whether endogenous and exogenous cues have different effects on dorsal and ventral frontoparietal regions. Figure 2 shows a statistical map of regions in which the time course of the BOLD response during the cue period was significantly different for endogenous, neutral, and exogenous cues, as measured by the significant interaction of cue type (endogenous, exogenous, and neutral) and time (time points 1-8) in a voxelwise ANOVA (Table 1, left, coordinates and z-scores).
Activations were observed bilaterally in both the IPS and FEF components of the dorsal network. IPS activations were observed in anterior and posterior segments, with the latter extending into the superior parietal lobule, as well as the ventral extension of the IPS into the occipital lobe. The precuneus was also activated. The mean time courses in the IPS and FEF, shown in Figure 2, indicated that, in both regions, activations were largest for the endogenous condition, with no difference between the exogenous and neutral conditions. The only exception was in the left FEF, in which the response to exogenous cues appeared greater than that for neutral cues.
A voxelwise comparison of the endogenous and exogenous conditions, with cue type (endo and exo) and time as factors, was conducted to determine regions that were differentially activated by endogenous and exogenous orienting. Significant activations were observed in most of the same dorsal frontoparietal regions identified in the above analysis, with the exception of the right FEF (Table 1, right, coordinates and z-scores). A voxelwise comparison of the exogenous and neutral cue conditions, with cue type (exo and neutral) and time as factors, was also conducted to index stimulus-driven orienting to the color singleton. No activations passed the multiple-comparison corrected threshold, but activity very near the threshold was observed in the L FEF (-26, -13, 49; z = 3.2) and R FEF (35, -13, 40; z = 3.05).
The results of these voxel-level analyses were supported by regional analyses of the six dorsal frontoparietal ROIs defined independently from our previous study of endogenous orienting (see Materials and Methods). Significant interactions of cue type (exo, endo, and neutral) by time were observed in the ant IPS (L ant IPS, F(14,266) = 2.81, p < 0.01; R ant IPS, F(14,266) = 4.01, p < 0.001) and FEF (L FEF, F(14,266) = 3.06, p < 0.01; R FEF, F(14,266) = 2.35, p < 0.05) but not in the posterior IPS. Subanalyses indicated that both the ant IPS and FEF also showed significantly more activity for endogenous cues than either exogenous or neutral cues (data not shown). However, a comparison of exogenous and neutral cues only yielded a significant activation in the L FEF (F(7,133) = 2.38; p < 0.05).
These results indicate that the dorsal frontoparietal network was predominantly recruited by endogenous cues, consistent with the proposed role of this network in voluntary orienting (Corbetta and Shulman, 2002). A subset of these regions, particularly the left FEF cortex, also showed an enhancement of the BOLD response for cue displays containing color singletons compared with neutral cue displays. Because sensory stimulation was primarily matched for exogenous and neutral cues (if anything, the neutral display contained more color contrast), the dorsal modulations may have represented signals that coded the location or saliency of the color singleton. In addition, the dorsal modulations could have reflected the recruitment of preparatory processes automatically engendered by the presentation of the color singleton, because the behavioral data from the behavioral and fMRI sessions indicated that exogenous cues maintained attention at the cued location at long SOAs, albeit weakly.
Ventral frontoparietal network
We next consider the responses during exogenous and endogenous cueing in the putative stimulus-driven ventral network, consisting of regions in the TPJ and ventral frontal cortex (VFC). These regions were not observed in the above voxel-level analyses of differences between the three cueing conditions and were analyzed using independently defined ROIs (see Materials and Methods). ROIs for the ventral network were determined from both a previous experiment and the current experiment, using the same criteria of enhanced responses to invalid targets relative to valid targets. As shown below, the results from both sets of ROIs were quite similar.
We first show the analyses from the ventral ROIs defined from the target period of the present experiment (Table 2, left, coordinates and z-scores for all regions showing larger activations for invalid than valid targets in the endogenous condition), because these ROIs provide the best estimates of the location of the ventral network in this experiment. The time course of the BOLD signal in the R SMG and R IFG during the cue period is shown in Figure 2. A regional ANOVA, with cue type (exo, endo, and neutral) and time as factors found a significant main effect of time for both the SMG (F(7,133) = 8.27; p < 0.0001) and IFG (F(7,133) = 2.79; p < 0.01), indicating that both regions were activated. In the SMG, the interaction of cue type (exo, endo, and neutral) by time was significant (F(14,266) = 2.73; p < 0.001). Pairwise comparisons indicated a significant difference between the endogenous and exogenous conditions (F(7,133) = 2.07; p = 0.05) and the endogenous and neutral conditions (F(7,133) = 3.90; p < 0.001), with no difference between the exogenous and neutral conditions. The time course shown in Figure 2 indicates that the BOLD response was actually greater for endogenous than exogenous or neutral cues, the reverse of the initial hypothesis. In the IFG, the interaction of cue type (exo, endo, and neutral) by time was only marginal (F(14,266) = 1.62; p < 0.1), whereas pairwise comparisons indicated only a marginal difference between the endogenous and exogenous conditions (F(7,133) = 1.88; p < 0.1), with no significant differences between the other condition pairs.
The time courses of the SMG and IFG responses suggest that the responses in the endogenous condition did not reflect the same spatial orienting mechanisms observed in IPS and FEF. The increase in the SMG-IFG response above baseline was delayed by almost 2 s relative to the IPS-FEF response. The BOLD signal in the R SMG and R IFG ROIs of Figure 2 showed no significant variation during the endogenous condition over time points 1-3 (SMG, F(2,38) = 1.29; IFG, F(2,38) = 0.13). We constructed ROIs for the dorsal network from the voxels showing an interaction of cue type (endogenous, exogenous, and neutral) by time. In the endogenous condition, both IPS (F(2,38) = 10.8; p < 0.0005) and FEF (F(2,38) = 15.7; p < 0.0001) showed a significant effect of time, indicating that the BOLD signal varied in magnitude over time points 1-3. We then tested for an interaction between time and region over time points 1-3, where the region factor consisted of two levels, a right hemisphere dorsal region and a right hemisphere ventral region. The time by region interaction was significant for IPS versus SMG (F(2,38) = 4.30; p = 0.026), IPS versus IFG (F(2,38) = 6.04; p = 0.005), FEF versus SMG (F(2,38) = 5.11; p = 0.011), and FEF versus IFG (F(2,38) = 8.14; p = 0.001). These interactions demonstrate that, over time points 1-3, each dorsal region showed a different time course than each ventral region.
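When the region factor has only two levels, a time by region interaction in a within-subject design is equivalent to a one-way repeated-measures ANOVA on the per-subject difference between the two regions' time courses. The sketch below illustrates that equivalence. It is not the analysis pipeline used in the study; the function names and the small data set (4 subjects, 3 time points) are hypothetical and purely illustrative.

```python
def rm_anova_time_f(scores):
    """One-way repeated-measures ANOVA F for the time factor.

    scores[s][t] is the value for subject s at time point t.
    Returns (F, df_effect, df_error).
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    time_means = [sum(row[t] for row in scores) / n for t in range(k)]
    subj_means = [sum(row) / k for row in scores]
    ss_time = n * sum((m - grand) ** 2 for m in time_means)
    # Residual after removing subject and time effects (time x subject term).
    ss_error = sum(
        (scores[s][t] - subj_means[s] - time_means[t] + grand) ** 2
        for s in range(n)
        for t in range(k)
    )
    df_effect = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_time / df_effect) / (ss_error / df_error)
    return f_value, df_effect, df_error


def time_by_region_f(region_a, region_b):
    """Time by region interaction for two regions, via difference scores."""
    diffs = [
        [a - b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(region_a, region_b)
    ]
    return rm_anova_time_f(diffs)


# Hypothetical BOLD values (% signal change) at time points 1-3.
dorsal = [[0.0, 0.2, 0.5], [0.1, 0.3, 0.4], [0.0, 0.1, 0.6], [0.1, 0.2, 0.5]]
ventral = [[0.0, 0.0, 0.1], [0.1, 0.0, 0.0], [0.0, 0.1, 0.1], [0.0, 0.0, 0.1]]

f_value, df_effect, df_error = time_by_region_f(dorsal, ventral)
print(f"F({df_effect},{df_error}) = {f_value:.2f}")
```

With 20 subjects and three time points, the same computation would yield the (2,38) degrees of freedom reported for the dorsal versus ventral comparisons.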
Because the dorsal regions were defined from the cue period data, they might be biased toward showing more effects over time points 1-3 than the ventral regions. Therefore, we confirmed this ventral-dorsal difference at time points 1-3 by conducting analyses on the ROIs defined from our previous experiment (Corbetta et al., 2000). This allowed a direct comparison of comparably defined dorsal and ventral ROIs that were independent of the present data set. Highly significant changes in the BOLD signal over time points 1-3 were observed during the endogenous condition in both the left and right ant IPS and FEF (p < 0.0001 in all four cases), but the changes in the three ventral ROIs (R SMG, R STG, and R IFG) were not significant. Direct comparisons of the dorsal regions and R SMG, the ventral region most similar to that identified in the present experiment, were also conducted. Over time points 1-3, the time by region interaction was significant for the R ant IPS versus the R SMG (F(2,38) = 5.77; p < 0.01) and the R FEF versus the R SMG (F(2,38) = 5.32; p < 0.01) but not for the R ant IPS versus the R FEF. These results demonstrate that, over time points 1-3, both the R ant IPS and the R FEF showed a different time course than the R SMG but did not differ from each other.
Therefore, regional analyses using ROIs defined from either the present data or an independent data set indicated that dorsal FEF-IPS regions involved in endogenous orienting showed a very different time course after endogenous cues than ventral TPJ-VFC regions. In previous work (Shulman et al., 2002), we observed a late response in the TPJ and frontal cortex on cue-only trials, in which the trial ends before a target is presented; this response is related to the detection of stimuli for terminating a sustained, endogenous state of preparation. The current SMG-IFG responses, which are similar to the delayed responses in the previous work, may have been related to the termination of the preparatory state on a cue trial (indicated by the dimming of the fixation point at the end of the 2.16 s cue phase), rather than the establishment and maintenance of that state, as in the IPS and FEF.
In conclusion, ventral regions in the SMG and IFG were not differentially activated by exogenous cues, whereas the activations related to endogenous cues probably reflected different processes than those engaged by dorsal regions.
Extrastriate visual cortex
In contrast to other experiments in which different cue stimuli directed attention under endogenous versus exogenous conditions (e.g., foveal arrows for endogenous vs peripheral transients for exogenous), the present study primarily equated sensory factors across cue conditions. This allowed us to separate the effects of exogenous versus endogenous orienting from the sensory response to the cue.
The neuroimaging data indicated that sensory factors were well matched, because most regions in the visual cortex were not differentially active across cue conditions. However, there were two separate functional clusters in the extrastriate visual cortex in which the temporal profile and magnitude of the BOLD signal differentiated among cue conditions [a third, more complex pattern was observed in the cuneus (data not shown)].
In the lateral occipital cortex, regions active during the presentation of foveal arrow cues (middle temporal and lateral occipital complexes) (Corbetta et al., 2000, 2002) showed a significant interaction of cue type (endo, exo, and neutral) by time. The time course indicates that, although the region was activated by all three cue displays, the activation was significantly stronger for endogenous compared with exogenous or neutral cues (Table 1, left; Fig. 3, top). These responses may reflect processes related to the encoding and interpretation of the cue stimulus (Corbetta et al., 2000).
A second pattern occurred in the ventral occipitotemporal cortex, just anterior to area V8 (Van Essen, 2002), a region involved in color processing (Hadjikhani et al., 1998), and at or near other functional regions also specialized in color processing (Beauchamp et al., 1999). A two-way ANOVA with cue type (exo and endo) and time yielded an interaction in two regions (Table 1, right): one in the right anterior fusiform (46, -43, -19) and the other in the left inferior temporal gyrus (ITG) (-62, -63, -11). The time course (Fig. 3, bottom) for the fusiform focus shows that, at early time points, exogenous cues evoked a response, whereas endogenous and neutral cues elicited either no response or a negative response. This pattern contrasts strongly with the first pattern, in which the three types of cues produced a positive BOLD response that was largest in the endogenous condition (Fig. 3, top). The time course also indicates that the significant interaction in the fusiform may have partly reflected the relative increase in the response to endogenous cues at later time points, well after the presentation of the color singleton.
Finally, a regional ANOVA on the significant voxels from the ITG and fusiform regions indicated no differences in the response to singletons in the left and right visual fields (R fusiform, F(7,133) = 1.16; left ITG, F(7,133) = 0.30). This is not surprising, because neurons in macaque inferotemporal cortex, the putative monkey homolog of fusiform cortex, have large receptive fields that occupy most of the visual field (Desimone et al., 1984).
One hypothesis for this second pattern is that the differential activation of putative color-processing regions reflected the coding of the location or saliency of the color singleton. This signal could “mark” the stimulus for other processes that either facilitate or inhibit processing at that location.
Neuroimaging results: target period
Brain activity for targets presented at unattended (invalidly cued) versus attended (validly cued) locations indexes processes related to spatial reorienting to behaviorally relevant targets. In the endogenous condition, reorienting is triggered in conjunction with a breach of expectation for the location of the target, whereas in the exogenous condition, attentional reorienting occurs without a change in expectation because the color singleton is not predictive of the target location. Additionally, in the exogenous condition, target activity may be modulated by the IOR given the long SOAs between cue and target presentation in the fMRI experiment (∼2 s).
Previous studies (Arrington et al., 2000; Corbetta et al., 2000; Macaluso et al., 2002) have shown that a set of regions in the ventral and dorsal frontoparietal cortex, strongly lateralized to the right hemisphere, responds more strongly when targets appear at unattended (invalidly cued) than attended (validly cued) locations after an endogenous shift of spatial attention. We proposed that these regions form a neural circuit to reorient attention, with the ventral network “circuit breaking” ongoing task processes in the dorsal frontoparietal network and redirecting them toward novel relevant sensory events (Corbetta and Shulman, 2002).
We were primarily interested in whether enhanced responses to invalid targets depended on whether attention was initially oriented endogenously or exogenously, because this would indicate whether the engagement of the circuit breaker during stimulus-driven reorienting depended on a breach of expectation. We conducted a voxelwise ANOVA with cue type (exo and endo), validity (valid and invalid), and time (time points 1-8) as factors. A significant interaction of cue type by validity by time, indicating that differential responses to valid and invalid targets depended on the type of cueing, was discovered in several regions (Fig. 4), including the bilateral SMG, FEF, anterior insula, and supplementary motor area (Table 2, right, coordinates and z-scores). Several of these regions had also been identified in the voxelwise analysis of the endogenous condition (Table 2, left, Fig. 4, black outlines). Inspection of the time courses (Fig. 4) indicated that, in the endogenous condition, there was a stronger response to invalid targets than valid targets, as expected. Conversely, in the exogenous condition, responses appeared slightly stronger for valid than invalid targets, an effect that was consistent across the regions. However, a separate voxelwise analysis of the validity by time interaction in the exogenous condition yielded no significant effects.
We also tested for enhanced responses of ventral and dorsal regions to invalid targets after exogenous and endogenous cues using the ROIs defined from our previous experiment (Corbetta et al., 2000). None of the four IPS ROIs showed enhanced responses, either when exogenous and endogenous conditions were tested within a combined analysis or when they were tested separately. However, both the left and right FEF showed a significant cue type by validity by time interaction in the combined analysis (L FEF, F(7,133) = 4.36, p < 0.0005; R FEF, F(7,133) = 2.93, p < 0.01), a significant validity by time interaction in a separate test of the endogenous condition (L FEF, F(7,133) = 4.23, p < 0.0005; R FEF, F(7,133) = 2.68, p < 0.05), and no interaction in a separate test of the exogenous condition. Of the ventral ROIs, the R SMG showed a significant cue type by validity by time interaction (F(7,133) = 3.21; p < 0.005) in the combined analysis, a significant validity by time interaction (F(7,133) = 3.32; p < 0.005) in a separate test of the endogenous condition, and no interaction in a separate test of the exogenous condition.
These results on ventral and dorsal network ROIs therefore mirror the voxel-level analyses. Dorsal regions in the FEF that were activated by an endogenous cue to shift attention also showed enhanced activity when attention was subsequently reoriented to an invalid target. Ventral regions in the SMG showed enhanced responses to invalid targets after an endogenous cue. However, none of these regions showed validity effects after exogenous cues. These results suggest that a breach of expectation may be necessary for engaging ventral regions during stimulus-driven reorienting. Alternatively, the absence of activations related to reorienting in the exogenous condition may have reflected the small behavioral validity effect observed in the exogenous condition at the long SOA used in the fMRI session.
The goal of the present study was to test the hypothesis that dorsal frontoparietal regions, including the IPS and FEF, and ventral frontoparietal regions, including the TPJ and IFG, play different roles in endogenous and stimulus-driven shifts of attention. Stimulus-driven shifts of attention were studied under cueing conditions in which there were no top-down or task-relevance signals modulating the spatial effect of the cue (exogenous orienting).
Dorsal frontoparietal regions in the control of endogenous orienting
The present results support the hypothesis that dorsal and ventral frontoparietal regions play distinct roles during endogenous orienting. Endogenous cues produced significant activations in bilateral areas of the putative human FEF (at the junction of the superior frontal sulcus and the precentral sulcus) and in the bilateral IPS. Moreover, these dorsal activations were significantly larger in magnitude than when the cue was neutral or exogenous.
Endogenous cues also produced slightly larger activations than exogenous or neutral cues in ventral frontoparietal regions. However, the latency of onset was delayed, indicating that these regions played a different functional role than dorsal regions. Previous work has identified delayed responses in ventral frontoparietal regions when a sustained preparatory state was unexpectedly terminated (Shulman et al., 2002). Therefore, one possibility is that, although dorsal regions were involved in the establishment and maintenance of a preparatory state, ventral regions signaled a change in behavioral state when the dimming of the fixation point indicated the end of the cue period.
Ventral frontoparietal regions in the control of exogenous orienting
The results provided no support for the hypothesis that ventral frontoparietal regions, including the TPJ and IFG, play a special role in directing exogenous shifts of attention. Exogenous cues did not activate these regions more than neutral cues. These results converge with other results suggesting that the ventral network plays a role in stimulus-driven orienting, but only to behaviorally relevant stimuli. In the psychological literature, stimulus-driven orienting that is contingent on the behavioral relevance of the features of the stimulus is known as contingent orienting (Folk et al., 1992, 2002). In the study by Corbetta et al. (2000), as well as in the present study, the invalid target initiating a shift of attention to the uncued location after an endogenous cue was a behaviorally relevant stimulus that required a response rather than a behaviorally irrelevant stimulus such as the color singleton in the exogenous condition. Similarly, studies of oddball detection often report that low-frequency targets, which require a behavioral response, produce greater TPJ activity than low-frequency novel stimuli, which do not require a response and are behaviorally irrelevant (McCarthy et al., 1997; Linden et al., 1999; Downar et al., 2000; Kiehl et al., 2001).
These results imply that top-down signals modulate the TPJ so that it is only activated by task-relevant stimuli. Shulman et al. (2003) have shown recently that the TPJ is deactivated by distracter displays during rapid serial visual presentation search, in which subjects search a rapid successive series of displays for a target object. They proposed that, in the absence of a task set, such as during passive viewing or during the intertrial interval of an active task, the TPJ can be activated by a wide range of salient stimuli that might be behaviorally relevant and can attract attention. However, in the presence of a task set, the input to the TPJ is filtered so that it is only activated by the small set of stimuli that are consistent with the current task set. In the current experiment, the color singleton was completely task irrelevant and therefore did not activate the TPJ. If the target task had involved a singleton color judgment, however, rather than a form judgment, the cue display, which would now be contingent, might well have activated the TPJ. Accordingly, Serences et al. (2005) report that, when subjects had to detect a target of a particular color embedded within a rapid stream of foveal stimuli, the TPJ was modulated by an irrelevant peripheral distracter if it matched the target color.
Exogenous orienting to color singletons
This study suggests that orienting to color singletons involves signals both in feature-specific extrastriate visual areas and in the dorsal frontoparietal network, especially the FEF. Regions in the anterior fusiform and inferior temporal gyrus may mark the location of the color singleton for attentional processes in the dorsal network. The initial response in these regions was stronger for exogenous than endogenous cues, in strong contrast to other occipital regions, although the interpretation of the later rebound of the response in the endogenous condition was unclear. The location of these regions appears to overlap with a color-processing system in the ventral occipitotemporal cortex that includes areas V4 and V8 as well as more anterior occipitotemporal areas (Hadjikhani et al., 1998; Beauchamp et al., 1999; Bartels and Zeki, 2000).
One caution is that, although the response in occipitotemporal regions to exogenous cues was significantly greater than to endogenous cues, the difference in response between exogenous cues and neutral cues was not significant. Although the sensory stimulation in the endogenous, exogenous, and neutral conditions was approximately equated, the spatial distribution of attention in the three conditions was different. In the endogenous condition, attention was initially directed to the center of the display to interpret the arrow cue, whereas in the exogenous and neutral conditions, attention was more diffusely distributed. A wider distribution of attention might have enhanced activations from the brief presentation (100 ms) of the peripheral colored boxes. This hypothesis predicts similar time courses for the exogenous and neutral conditions.
Color singletons also produced greater activity than neutral cues in the left FEF, which is part of the dorsal frontoparietal network and strongly modulated by endogenous cues. Therefore, exogenous orienting may rely in part on the same dorsal attention system that is involved in endogenous orienting. This conclusion is consistent with previous studies (Kim et al., 1999; Lepsien and Pollmann, 2002; Mayer et al., 2004) but is more strongly supported because this is the first study to compare exogenous and endogenous cues while controlling for sensory activity and separating the responses to the cue and subsequent target.
The hypothesis that exogenous orienting involves signals in feature maps that mark a location of interest and signals in the frontoparietal cortex that direct attention to that location is consistent with theories of guided search (Wolfe et al., 1989; Wolfe, 1994). Feature maps provide bottom-up signals that mark objects of interest in the visual field, whereas the priority of a target for an attentional shift is coded in a “saliency map” that combines bottom-up and top-down (expectation, behavioral relevance, and working memory) signals. The dorsal frontoparietal system, which is sensitive to both exogenous and endogenous signals, is an ideal candidate to code saliency. Saliency-related signals have been reported in macaque dorsal areas such as the lateral intraparietal area (Gottlieb et al., 1998) and the FEF (Thompson et al., 2001), thought to be homologs of the human IPS and FEF. In the FEF, especially, single-unit studies have shown neural activity during visual search tasks involving color singletons that is consistent with the selection of target information (Thompson et al., 1997).
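The idea of a saliency map that combines bottom-up and top-down signals can be illustrated with a toy computation. The sketch below is not an implementation of the guided search model or of any neural mechanism. It simply assumes that priority at each location is a weighted sum of a bottom-up feature-contrast signal and a top-down cue signal, and that attention goes to the location with the highest priority. All names, weights, and values are hypothetical.

```python
def priority_map(bottom_up, top_down, w_bottom_up=1.0, w_top_down=1.0):
    """Combine bottom-up and top-down signals into one priority value per location."""
    return [w_bottom_up * b + w_top_down * t for b, t in zip(bottom_up, top_down)]


def select_location(priority):
    """Index of the location with the highest priority."""
    return max(range(len(priority)), key=priority.__getitem__)


# Four hypothetical display locations; a color singleton sits at location 2.
bottom_up = [0.1, 0.1, 0.9, 0.1]

# Exogenous case: no spatial expectation, so the singleton wins.
no_expectation = [0.0, 0.0, 0.0, 0.0]
print(select_location(priority_map(bottom_up, no_expectation)))  # -> 2

# Endogenous case: an arrow cue points to location 0 and outweighs the singleton.
cue_to_location_0 = [1.0, 0.0, 0.0, 0.0]
print(select_location(priority_map(bottom_up, cue_to_location_0)))  # -> 0
```

In this toy scheme, a single map receives both kinds of input, which is the property that makes a region sensitive to both exogenous and endogenous signals a candidate for coding saliency.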
Another related interpretation for the selective FEF recruitment during exogenous cueing is that it participates in the formation of the IOR. IOR is an inhibitory mechanism that prevents the reexploration of a previously visually cued location. A recent transcranial magnetic stimulation (TMS) study showed that stimulation over the human FEF, but not IPS, disrupts the formation of IOR in the ipsilateral visual field (Ro et al., 2003).
Ventral and dorsal frontoparietal regions in stimulus-driven reorienting
Finally, we replicated the modulation of dorsal and ventral frontoparietal attention networks identified in previous work (Corbetta et al., 2000, 2002) (Fig. 4) when we compared BOLD responses for detecting targets infrequently presented at unattended locations with those for targets frequently presented at attended locations (invalid more than valid). This modulation, we argued, underlies processes related to stimulus-driven reorienting (Corbetta and Shulman, 2002). Here, we show that this modulation is much stronger when attention is initially directed endogenously rather than exogenously.
How do we account for this difference, and does this difference inform us about underlying processes? Reorienting to stimuli involves multiple processes. Some processes are related to the localization of the new target and the reprogramming of a novel stimulus-response association. According to a framework proposed by Posner et al. (1985), the detection of a stimulus away from the locus of attention initially involves the detection of the novel event, followed by a “disengagement” of attention from the current location, and, finally, a movement of attention to the new location. Regardless of whether reorienting is accomplished through these exact or similar processes, it must clearly involve spatial processes that should be active regardless of whether attention was initially deployed endogenously or exogenously.
One interpretation for the stronger BOLD modulation during endogenous stimulus-driven reorienting is that it simply reflects a longer time-on-task of spatial processes involved in reorienting. In fact, the strength of the validity effect (i.e., the RT difference for responding to invalid vs valid targets), an index of the time necessary to reorient from the currently attended location to the new location, was larger for the endogenous than exogenous cueing condition (40 vs 8 ms). This explanation is somewhat weakened by two observations. First, the RT to invalid targets, those requiring a reorienting of attention, was similar for the endogenous and exogenous tasks. Much of the difference in the validity effect between conditions was accounted for by effects on valid trials at long SOAs (2 s in the fMRI experiment), namely, a shortening of RTs in the endogenous task and the relative lengthening of RTs in the exogenous task (Fig. 1). Therefore, the duty cycle of a process specifically engaged by invalid targets would have been similar, on average, in the two tasks. Second, there was a small, nonsignificant, but consistent BOLD increase for valid over invalid targets in the exogenous condition (Fig. 4), even though RTs for invalid trials were slightly slower than for valid trials. In other words, although slower responses to invalidly cued targets in the endogenous condition correlated with higher BOLD signals, slower responses to invalidly cued targets in the exogenous condition actually correlated with lower BOLD signals. Hence, performance differences do not seem to account well for the observed BOLD modulation.
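The first observation can be made concrete with a small calculation of the validity effect. The RT values below are hypothetical and were chosen only so that the differences match the reported 40 and 8 ms; they are not data from the experiment. They illustrate how two conditions can share the same invalid-trial RT while differing in validity effect because of the valid trials alone.

```python
from statistics import mean


def validity_effect(rt_invalid_ms, rt_valid_ms):
    """Validity effect: mean RT on invalid trials minus mean RT on valid trials."""
    return mean(rt_invalid_ms) - mean(rt_valid_ms)


# Hypothetical RTs (ms); invalid-trial RTs are identical across conditions.
endo_invalid = [490, 500, 510]
endo_valid = [450, 460, 470]   # faster valid responses after endogenous cues
exo_invalid = [490, 500, 510]
exo_valid = [482, 492, 502]    # slower valid responses after exogenous cues

print(validity_effect(endo_invalid, endo_valid))  # -> 40
print(validity_effect(exo_invalid, exo_valid))    # -> 8
```

Because the invalid-trial means are equal in this example, any process engaged specifically by invalid targets would run for a similar duration in both conditions, despite the fivefold difference in validity effect.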
A second interpretation, which we prefer, is that the interaction of target validity by cue type reflects specific processes that are unique to the endogenous condition. We proposed previously that one function of the ventral network was to act as a circuit breaker, reorienting the dorsal system, specialized for shifting spatial attention and eye movements, whenever a salient or behaviorally relevant sensory event was detected. These findings suggest that activity in the stimulus-driven reorienting network may be a function of a mismatch between expectation and sensory input triggered whenever a behaviorally relevant yet unattended target is presented. A mismatch signal may reflect an adjustment of the expected value of the stimulus (i.e., the amount of reward associated with a particular sensory-motor decision), which undoubtedly varies from trial to trial when different spatial locations are cued, even if rewards are not explicitly manipulated. Neural correlates of expected values have been identified recently in the monkey parietal cortex (Platt and Glimcher, 1999; Sugrue et al., 2004). Another possibility is that the BOLD modulation for endogenously cued invalid targets reflects an error of prediction in sensory stimulation or reward, which is known to drive dopaminergic and noradrenergic neurons in the brainstem that project to frontal and temporoparietal areas similar to those active here (Schultz and Dickinson, 2000; Tremblay and Schultz, 2000).
Can we assign different roles to different nodes of the reorienting network? Although evidence from this experiment is ambiguous, previous work indicates that the dorsal IPS-FEF system may be a good candidate for mediating spatial reorienting processes. These regions are active during the allocation of attention after endogenous or exogenous cues (Kastner and Ungerleider, 2000; Corbetta and Shulman, 2002) and contain mechanisms for oculomotor planning/execution (Corbetta et al., 1998; Connolly et al., 2002; Astafiev et al., 2003) and visuotopic maps of contralateral space (Sereno et al., 2001; Jack et al., 2004), mechanisms well suited for reprogramming a sensory-motor link between a new target location and an effector. This reprogramming may be necessary at invalidly cued locations after an endogenous cue, because a link has been established with the cued location and at cued locations at long SOAs after an exogenous cue, because IOR may have prevented or weakened the formation of a link to the exogenously cued location. This idea is consistent with previously noted TMS work in the FEF (Ro et al., 2003). Conversely, regions in the TPJ and VFC contain multimodal sensory responses that are spatially nonselective (Corbetta et al., 2000; Downar et al., 2000; Macaluso et al., 2002) and are not modulated by response factors (Braver et al., 2001; Astafiev et al., 2004). Rather, they show strong sensitivity to the detection of infrequent events, even when they do not require a shift of spatial attention (McCarthy et al., 1997; Linden et al., 1999; Kiehl et al., 2001). Therefore, these higher-order, more cognitive areas may be preferentially involved in adjusting internal expectancies to incoming sensory inputs.
In summary, this study confirms the importance of a dorsal frontoparietal system in voluntary orienting. Exogenous orienting to color singletons appears to involve a partly overlapping circuit comprising regions in the extrastriate visual cortex that may mark a location, and dorsal frontoparietal regions, such as the FEF, that direct attention. Exogenous orienting, however, does not recruit the TPJ portion of the ventral system, indicating that the TPJ is only involved in stimulus-driven shifts if the stimuli share features that are behaviorally relevant (contingent orienting).
Finally, reorienting to unattended targets after an endogenous cue drives both dorsal and ventral frontoparietal attention networks. Although areas in the dorsal network such as the FEF may be preferentially involved in reprogramming a stimulus-response link to the novel target, more ventral areas such as the TPJ may detect a mismatch between expectancy and sensory input.
This work was supported by National Institute of Mental Health Grant MH71920-06 and by the J. S. McDonnell Foundation.
Correspondence should be addressed to Maurizio Corbetta, Department of Neurology and Radiology, Washington University School of Medicine, East Building, Room 2127A, 4525 Scott Avenue, St. Louis, MO 63110.
Copyright © 2005 Society for Neuroscience 0270-6474/05/254593-12$15.00/0