Abstract
Visual attention selects task-relevant information from scenes to help achieve behavioral goals. Attention can be deployed within multiple domains to select specific spatial locations, features, or objects. Recent evidence has shown that voluntary shifts of attention in multiple domains are consistently associated with transient increases in cortical activity in medial superior parietal lobule, suggesting that this may be the source of a domain-independent control signal that initiates the reconfiguration of attention. To investigate this hypothesis, we used fMRI to measure changes in cortical activation while human subjects shifted attention between spatial locations or between colors at a location. Univariate multiple regression analysis revealed a common, domain-independent transient signal [in posterior parietal cortex (PPC) and prefrontal cortex] time-locked to shifts of attention in both domains. However, multivariate pattern classification conducted on the cortical surface revealed that the spatiotemporal pattern of activity within PPC differed reliably for spatial and feature-based attention shifts. These results suggest that the posterior parietal cortex is a common hub for the control of attention shifts but contains subpopulations of neurons with domain-specific tuning for cognitive control.
Introduction
Natural scenes contain a multitude of objects competing for perceptual representation in cortex. Cortical neurons tuned for visual attributes (location, color, object category, etc.) often contain multiple objects within their receptive fields. The competition for representation among items is resolved by selecting objects that are behaviorally relevant and filtering out those that are irrelevant through deliberate acts of selective attention (Desimone and Duncan, 1995).
A network of prefrontal and parietal cortical regions is thought to be the source of control signals that resolve the visual competition through voluntary deployments of attention (Kastner and Ungerleider, 2000; Corbetta and Shulman, 2002). Sustained intervals of focused spatial attention evoke sustained cortical activity in topographically organized regions of the intraparietal sulcus (IPS) (Sereno et al., 2001; Bisley and Goldberg, 2003; Silver et al., 2005; Serences and Yantis, 2007; Saygin and Sereno, 2008) and prefrontal cortex (PFC) (Sereno et al., 2001; Serences and Yantis, 2007; Saygin and Sereno, 2008). These regions are thought to comprise attentional priority maps that specify the highest priority location(s) in the scene from moment to moment.
Shifts of attention evoke transient activation in medial superior parietal lobule (mSPL) that is time-locked to the initiation of shifts between spatial locations (Vandenberghe et al., 2001; Yantis et al., 2002; Kelley et al., 2008; Shulman et al., 2009), spatially superimposed objects (Serences et al., 2004), sensory modalities (Shomstein and Yantis, 2004), and visual features (Liu et al., 2003). The consistency of this pattern suggests that mSPL may be the source of a domain-independent control signal that initiates the reconfiguration of attention.
Other studies, however, have suggested that different domains of attention may rely on domain-specific cortical sources (Shulman et al., 2002; Giesbrecht et al., 2003; Slagter et al., 2007). Rushworth et al. (2001) reported that distinct subregions within posterior parietal cortex (PPC) were associated with shifts of visual attention versus shifts of response preparation. Based on a meta-analysis, Wager et al. (2004) concluded that, although certain gross regions of the brain are consistently associated with attention shifts (e.g., PFC, PPC), there remain subregions of cortex specific to shifting within each possible domain.
Each of the cited studies that provide evidence for a domain-independent attention shift signal was conducted with different tasks and different groups of subjects, which limits the generality of any conclusion concerning the degree to which acts of attentional control are mediated by common mechanisms. To determine whether shifts of visual attention in different perceptual domains are associated with a domain-independent reconfiguration signal, we devised a single task that permitted us to directly compare cortical activation [using blood oxygenation level-dependent (BOLD) fMRI] associated with attention shifts in two different domains: spatial location and color. We used conventional univariate statistical procedures to determine whether attention shifts in these two domains evoked activity in the same cortical regions. Furthermore, we used multivariate pattern classification (MVPC) (Norman et al., 2006; Esterman et al., 2009) within these activated regions to determine whether distinct spatiotemporal activation patterns were associated with acts of attentional control in the two domains.
Materials and Methods
Subjects.
Eight neurologically healthy adult volunteers (19–45 years of age; mean age, 26; five females) with normal or corrected-to-normal visual acuity were recruited from the Johns Hopkins University community to participate in the experiment. Each participant provided written informed consent approved by the Johns Hopkins Medicine Institutional Review Board and passed an MRI safety medical screening approved by the Kennedy Krieger Institute, F.M. Kirby Research Center for Functional Brain Imaging (where MRI scanning was performed). Subjects were trained on the task in a separate session before scanning and performed the task during two fMRI sessions of 2 h each.
Stimuli.
Stimuli were back-projected (Epson PowerLite 7600p, Epson America; with custom zoom lens, Navitar) onto a screen (Da-plex substrate with Video Vision optical coating; Da-Lite) mounted to the top of the magnet bore behind the subject's head. All visual stimuli described below were generated and displayed via MATLAB scripts (MathWorks) created with Psychophysics Toolbox extensions (Brainard, 1997; Pelli, 1997). Subjects viewed the screen (via a mirror mounted to the head coil) at an optical distance of 68 cm. The subjects responded via a pair of custom-built, MR-compatible, fiber-optic push button response boxes (MRA).
The visual display in the experimental task consisted of two circular apertures (2.6° in diameter) containing coherently moving dot fields centered 4.5° to the left or right of a white central fixation disk (0.36° diameter) superimposed on a black background (Fig. 1). Individual dots subtended 0.09° of visual angle. In each of the two apertures, half of the dots were rendered in red and the other half in green (30 dots/color), forming two sets of dot fields in each aperture. At any moment throughout the experiment, one set of dots in each aperture was moving at a rate of 3 deg/s and the other set of dots was stationary. Dots moved in one of eight directions (up, down, left, right, upper-left, upper-right, lower-left, and lower-right) at 100% coherence and each dot had a limited lifetime of random duration (mean, ∼1200 ms) to discourage subjects from attending to a single dot (or small set of dots) throughout the experiment. The moving dots changed direction (or stopped moving) exactly once per second and all changes to stimulus motion occurred simultaneously in both the left and right apertures. When the moving dot field in one of the apertures stopped moving, the previously stationary dots of the other color would begin to move.
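For readers reproducing the display, the angular sizes above can be converted to physical sizes on the screen at the stated 68 cm optical distance. The following is only a minimal sketch: the helper name `deg_to_cm` is hypothetical, and the conversion assumes a flat screen viewed head-on.

```python
import math

VIEW_DIST_CM = 68.0  # optical distance reported in the text


def deg_to_cm(angle_deg, dist_cm=VIEW_DIST_CM):
    """Extent on a flat screen that subtends angle_deg, centered on the line of sight."""
    return 2.0 * dist_cm * math.tan(math.radians(angle_deg) / 2.0)


aperture_cm = deg_to_cm(2.6)   # aperture diameter
dot_cm = deg_to_cm(0.09)       # single dot
# Eccentricity is measured from fixation, so the one-sided formula applies.
eccentricity_cm = VIEW_DIST_CM * math.tan(math.radians(4.5))
```

Dividing these values by the physical pixel pitch of the projected image (not reported here) would give sizes in pixels.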
Procedure.
Subjects were instructed to begin each run of the experiment by attending to the green dot field on the left side of the display. Subjects monitored the attended dot field for four specified directions of motion that provided cues to either switch or maintain attention. Upward motion indicated shift location (but hold color) and downward motion indicated shift color (but hold location). These instructions were reversed for half of the subjects. Leftward and rightward motion always indicated that subjects should hold attention on the current dot field (same color and location). For example, if the currently attended green dots on the left side of the display began to drift downward, subjects were to shift attention to the red dots on the left. However, if the attended green dots began to drift upward, subjects were to shift attention to the green dots on the opposite (right) side of the display. Subjects were instructed to press two buttons (one in each hand) each time one of the four cues appeared. The four oblique directions of motion served as “filler” motion between cues and could be ignored. In rare instances, subjects would fail to detect the presence of a cue. In this case, subjects were instructed to simply maintain attention on the current location/color dot field until a new cue appeared. The task was designed such that cues cycled back to the same dot field at intervals no longer than 30 s.
Cues occurred every 3, 6, or 9 s, jittered with respect to the MRI acquisition time (2000 ms) to allow for better detection of event-related impulse response functions (Birn et al., 2002). Cues appeared only in the currently attended dot field; all other motion was in the oblique directions. This encouraged subjects to adopt a highly concentrated focus of spatial and feature-based attention in which the three unattended dot fields were ignored. Subjects completed approximately four practice runs (8 min each) outside of the scanner before fMRI data collection. In the scanner, each run began with a fixation-only display (triggered by the onset of the first image volume acquisition). The dot fields appeared and began moving 8 s later (triggered by the onset of the fifth image volume acquisition). The four cue directions were apportioned approximately equally throughout each run (one-half of all trials for hold, one-quarter for shift color, and one-quarter for shift location). Each subject performed 10 runs of the experiment.
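The jittered cue timing described above can be illustrated with a short sketch. It assumes 480 s of stimulation per run (244 volumes minus the four discarded, at 2 s each) and draws intervals and cue types independently at random; the actual sequence constraints (for example, cycling back to the same dot field within 30 s) are not modeled, and all names are hypothetical.

```python
import random

# Hold cues make up half of all trials; each shift type makes up one quarter.
CUE_TYPES = ["hold", "hold", "shift_color", "shift_location"]


def make_cue_schedule(stim_dur_s=480, intervals_s=(3, 6, 9), seed=0):
    """Return a list of (onset_s, cue_type) with 3, 6, or 9 s between cues."""
    rng = random.Random(seed)
    t, schedule = 0, []
    while True:
        t += rng.choice(intervals_s)
        if t >= stim_dur_s:
            break
        schedule.append((t, rng.choice(CUE_TYPES)))
    return schedule


schedule = make_cue_schedule()
# Onsets at odd seconds fall between volume acquisitions (TR = 2 s),
# which is what produces the jitter relative to the acquisition grid.
off_grid = [t for t, _ in schedule if t % 2 == 1]
```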
MRI data acquisition.
Imaging data were acquired with a 3T Philips Gyroscan MRI scanner (Philips Medical Imaging Systems) using an eight-channel transmit/receive sensitivity-encoding (SENSE) head coil (Invivo) at the F.M. Kirby Research Center for Functional Brain Imaging in the Kennedy Krieger Institute, Baltimore, Maryland. Each subject was scanned in two 2 h sessions on different days. Whole-brain anatomical image volumes were acquired using a 1 mm isotropic T1-weighted magnetization-prepared rapid gradient echo pulse sequence [repetition time (TR) = 8.23 ms; echo time (TE) = 3.9 ms; flip angle = 8°; acquisition matrix = 256 × 256; 200 slices; SENSE factor = 2]. Whole-brain functional image volumes (3 mm isotropic resolution) were acquired using a T2*-weighted echoplanar imaging (EPI) pulse sequence (TR = 2000 ms; TE = 30 ms; flip angle = 70°; acquisition matrix = 64 × 64; 38 transverse slices; SENSE factor = 2). Main experimental runs were acquired with 244 volumes and lasted 8.13 min each. The initial four volumes (8 s) were discarded from each EPI run to allow magnetization to reach steady-state. One-half of the EPI data were acquired during each of the two scanning sessions. The anatomical data were collected in the first scanning session and an additional fast (higher SENSE factor) T1 volume was acquired on the second day of scanning for intersession alignment. During all scans, each subject wore foam earplugs to reduce effects of scanner noise and used a custom dental bite-bar (Stevenson Industries) that was attached to the head coil to reduce head motion.
fMRI data analysis: general linear model analysis.
All fMRI data analyses were conducted with the AFNI/SUMA software package (Cox, 1996; Saad et al., 2004), except where noted, along with custom MATLAB scripts. Before any statistical testing, all EPI runs from individual subjects were slice-time corrected, realigned to the average EPI image volume from the run acquired nearest in time to the anatomical dataset, and converted to percentage signal change values normalized to the mean of each run. Individual anatomical data from the two scanning sessions were then coregistered with each session's functional volumes, aligned with each other, and, finally, transformed into Talairach coordinates (Talairach and Tournoux, 1988). This alignment and transformation was then applied to all functional volumes across both sessions.
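The percentage signal change conversion can be sketched as follows. This is an illustration rather than the AFNI processing actually used; it assumes the values are expressed as deviations from each run's mean (an alternative convention scales the mean to 100), and the function name is hypothetical.

```python
import numpy as np


def to_percent_signal_change(run_ts):
    """run_ts: array (n_timepoints, n_voxels) from a single run.

    Returns deviations from each voxel's run mean, in percent of that mean.
    """
    run_ts = np.asarray(run_ts, dtype=float)
    baseline = run_ts.mean(axis=0, keepdims=True)
    return 100.0 * (run_ts - baseline) / baseline


example = to_percent_signal_change([[100, 200], [110, 180], [90, 220]])
```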
The EPI volumes from each run were then concatenated in time and mapped to a cortical surface representation using a method similar to that outlined in Jo et al. (2007). Specifically, EPI time series data were mapped from voxels to nodes on the smoothed gray/white matter boundary surface of the Talairached N27 anatomical dataset obtained from the Montreal Neurological Institute. For each node on the gray/white boundary surface, a related node was identified on the pial (gray matter/CSF boundary) surface. For each node pair, a connecting line segment between the nodes was computed. The voxels through which this line segment passed were used to compute an average time series that was then assigned to the given node on the gray/white boundary surface. Once completed for each node on the two surfaces (both left and right hemispheres), the data were then spatially smoothed with a heat kernel (Chung et al., 2005), as implemented in AFNI program SurfSmooth (Saad, 2005b). Smoothing was performed iteratively such that the final full-width at half-maximum of the smoothed data was 4 mm. All further data analyses were performed entirely on these surface-mapped data.
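The node-pair averaging step can be approximated with the sketch below. It assumes node coordinates have already been converted to voxel-index units and approximates the set of voxels crossed by the segment by sampling points along it; the AFNI/SUMA implementation may determine the intersected voxels differently, and the function name is hypothetical.

```python
import numpy as np


def map_segment_to_node(epi, wm_xyz, pial_xyz, n_samples=10):
    """Average the time series of voxels lying along a white-matter-to-pial segment.

    epi: array (X, Y, Z, T); wm_xyz, pial_xyz: coordinates in voxel-index units.
    """
    wm_xyz = np.asarray(wm_xyz, dtype=float)
    pial_xyz = np.asarray(pial_xyz, dtype=float)
    steps = np.linspace(0.0, 1.0, n_samples)[:, None]
    points = wm_xyz + steps * (pial_xyz - wm_xyz)
    # Round each sample to its voxel and count each voxel only once.
    idx = np.unique(np.rint(points).astype(int), axis=0)
    return epi[idx[:, 0], idx[:, 1], idx[:, 2], :].mean(axis=0)
```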
A canonical hemodynamic response function (Cohen, 1997) was convolved with a model of the timing of the stimulation epochs during the experiment, yielding six regressors corresponding to the three types of cues (shift location, shift color, hold) for the left and right apertures. Additional regressors accounting for variance due to linear, quadratic, and cubic drift as well as constant offset (separately for each run) were included in the general linear model (GLM) analysis as nuisance regressors. Together, these were submitted to a multiple regression GLM (Friston et al., 1995). The following contrasts between the resulting beta-weights yielded statistical maps that are described in Results, below: shift left to right > hold left; shift color on the left side > hold left; shift right to left > hold right; shift color on the right side > hold right. The intersection of nodes reaching threshold (after individual subject multiple-comparisons correction, as detailed below) for these four contrasts formed a conjunction map that identified regions activated by both kinds of shift cues. Nodes were classified into three categories: (1) those that were significantly active for both types of spatial shifts (colored red), (2) those that were significantly active for both types of color shifts (colored green), and (3) those that were significantly active for at least one color shift and at least one spatial shift (colored yellow). In a second GLM, we modeled blocks of time when subjects were attending to the right aperture and left aperture separately. Using the same method described above, functional maps yielding regions responsive to the contralateral maintenance of attention were generated by contrasting attend-right and attend-left regressors. Finally, in a third GLM, we modeled blocks of time when subjects were attending to green dot fields and red dot fields separately.
Again, using the same method, functional maps were generated by contrasting these two regressors. In each case, we also performed a mixed effects group analysis across all subjects by submitting individual subject beta-weight estimates for each condition (fixed effects) to an ANOVA and generating contrasts across the individuals (random effect).
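The structure of the event-related GLM can be sketched in a few lines. This is a simplified single-run illustration, not the AFNI analysis itself: the gamma-variate parameters (b = 8.6, c = 0.547) are assumed to match the Cohen (1997) response shape, onsets are snapped to the TR grid for brevity, and all function names are hypothetical.

```python
import numpy as np

TR = 2.0  # s


def gamma_hrf(tr=TR, duration_s=20.0, b=8.6, c=0.547):
    """Gamma-variate response t**b * exp(-t/c), scaled to unit peak (assumed parameters)."""
    t = np.arange(0.0, duration_s, tr)
    h = t**b * np.exp(-t / c)
    return h / h.max()


def build_design(onsets_by_cond, n_vols, tr=TR, poly_order=3):
    """Columns: one HRF-convolved regressor per condition (sorted by name),
    then constant, linear, quadratic, and cubic drift terms."""
    hrf = gamma_hrf(tr)
    cols = []
    for name in sorted(onsets_by_cond):
        stick = np.zeros(n_vols)
        for onset in onsets_by_cond[name]:
            stick[int(round(onset / tr))] = 1.0
        cols.append(np.convolve(stick, hrf)[:n_vols])
    x = np.linspace(-1.0, 1.0, n_vols)
    cols.extend(x**p for p in range(poly_order + 1))
    return np.column_stack(cols)


def fit_glm(design, y):
    """Ordinary least-squares beta-weights."""
    return np.linalg.lstsq(design, y, rcond=None)[0]
```

A contrast such as shift > hold is then the difference between the corresponding beta-weights, for example `betas[1] - betas[0]` when the conditions are named `"hold"` and `"shift"`.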
All statistical maps were corrected for multiple comparisons on the surface by using a novel permutation method we developed to arrive at appropriate cluster correction values in the application of Gaussian random field theory (Nichols and Hayasaka, 2003). Although more complex and computationally demanding than standard methods, this procedure was necessary because no tools currently exist for conducting Monte Carlo simulations on surfaces. For eight (of 16) hemispheres (four left and four right), 500 alternative versions of the two color shift regressors were generated by randomly shuffling the labels that marked the onset of the events. Each shuffled pair was then submitted to a GLM (identical to the one described above, save these two regressors) and a contrast between the beta-weights associated with the two shuffled regressors was generated. For each of these 500 contrasts, we measured the surface area of the largest cluster obtained (rounded to the nearest integer value) at several nominal p values. The 500 cluster size values were then rank-ordered from 0 to the largest area measured to construct a table of alpha probabilities from which a cluster correction value could be obtained for a desired corrected p value. These cluster thresholds were then averaged across the eight hemispheres to provide a set of cluster correction values for the group maps. Clustering was then performed for each functional map on the surface via AFNI program SurfClust (Saad, 2005a).
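The final step of this procedure, reading a cluster-size threshold from the rank-ordered null distribution, might look like the sketch below. It assumes the largest-cluster areas from the shuffled GLMs are already available; the function name and the exact tie-handling rule are illustrative assumptions.

```python
import numpy as np


def cluster_threshold(max_cluster_areas, alpha=0.05):
    """Cluster area exceeded by at most a fraction alpha of the null maps."""
    areas = np.sort(np.asarray(max_cluster_areas))
    n = len(areas)
    n_allowed = int(np.floor(alpha * n + 1e-9))  # null maps allowed above threshold
    return areas[n - n_allowed - 1]


# Toy null distribution standing in for 500 shuffled-regressor contrasts.
rng = np.random.default_rng(0)
null_areas = rng.permutation(np.arange(1, 501))
threshold = cluster_threshold(null_areas, alpha=0.05)
```

Averaging such thresholds over hemispheres, as described above, is then a single call to `np.mean` on the per-hemisphere values.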
Each of the functional maps described above served as the basis for a set of regions of interest (ROIs) from which time courses were extracted and displayed. For each statistical map, all contiguous values within a given brain region that were above threshold were used to examine event-related averages. Select time courses are presented as illustrations of the underlying neural changes corresponding to the regions surviving threshold in the various statistical maps. Because these time series data were normalized in preprocessing (see above), the event-related time courses are in units of percentage signal change.
fMRI data analysis: multivariate pattern classification.
To examine the patterns of activity within certain ROIs that resulted from the GLM analyses described above, we conducted a multivariate pattern classification analysis on individual subject surfaces (exactly analogous to multivoxel pattern classification, but performed on data mapped to surface nodes) (Norman et al., 2006; Esterman et al., 2009). For each ROI to be tested, we selected features (surface nodes to be included in the classification) by using a “leave-one-run-out” (LORO) approach to maximize the use of our data while ensuring that the measures were derived independently. Specifically, for each subject we ran 10 GLM analyses with the six regressors of interest (as described above). For each GLM, one of the 10 runs was left out of the analysis such that, in each case, functional maps of the conjunction described above were generated from nine runs of data. Each functional map was then used to create a set of frontoparietal ROIs that defined the features (nodes) from which time courses were extracted. For each of the 20 maps (10 per hemisphere), time courses were extracted from all 10 runs. However, data from the nine runs used to generate each map were labeled as training data whereas data from the run left out were sequestered as testing data. This ensured that the ROI selection was independent of the test data. Following this feature selection process, we extracted 15 time points (from −2 to +12 s, at a resolution of 1 s) around the time that each cue to shift attention was presented. This procedure was conducted for both hemispheres for each of our eight subjects. When completed, we had 10 sets (one for each LORO GLM) of nine training runs and an associated testing run for each hemisphere per subject. The training data were submitted to a support vector machine (SVM) learning algorithm (Vapnik, 1995) that was implemented in MATLAB (LIBSVM) (Chang and Lin, 2001). It is important to note that we preserved the temporal information from each trial. 
The SVM was trained on the matrix of nodes, time point-by-time point; thus, our testing results provide information about how effective the classification of shift type (color or location) was over time.
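The extraction of 15 time points around each cue can be sketched as follows. Because the data were acquired every 2 s but sampled at 1 s resolution, some form of resampling is implied; linear interpolation is assumed here purely for illustration, and the function name is hypothetical.

```python
import numpy as np


def extract_epoch(ts, cue_time_s, tr=2.0, t_start=-2, t_stop=12):
    """ts: array (n_vols, n_nodes). Returns (15, n_nodes) sampled every 1 s
    from t_start to t_stop (inclusive) relative to the cue."""
    ts = np.asarray(ts, dtype=float)
    vol_times = np.arange(ts.shape[0]) * tr
    sample_times = cue_time_s + np.arange(t_start, t_stop + 1)
    return np.column_stack(
        [np.interp(sample_times, vol_times, ts[:, n]) for n in range(ts.shape[1])]
    )
```

Each row of the returned array is the node pattern at one peri-cue time point, so a separate classifier can be trained and tested for each row.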
We used a linear SVM and performed a grid search to find the optimal values of cost (C). The results generalized to a wide range of C values. The classifier evaluated the data for each hemisphere a total of 10 times, once for each set of data independently generated by the LORO GLM procedure described above. In each SVM run, the classifier was trained on the nine training runs and tested on the single run-left-out for that GLM. In each instance, once the classifier training was completed, the SVM was tested on our testing data and we recorded the classifier's accuracy in determining whether each time point was from a shift color trial or a shift location trial. Classifier accuracy across all 10 testing runs was pooled to create a classification rate across the entire experiment while maintaining independence between training and test data. We also ran a permutation test in which the testing labels were shuffled 1000 times to determine the 95% confidence interval around classifier chance performance. This entire process was repeated for each region (in each of 16 hemispheres) that was significantly active (after cluster correction) in our conjunction analysis of shift color and shift location cues (greater than hold cues) on at least seven of the 10 LORO GLMs per hemisphere. Finally, the event-related classification time courses were averaged within lobe (frontal or parietal) and hemisphere and t tests were used to assess significance of our group classification results for each time point. Because two of our subjects (four hemispheres) did not yield at least one frontal and one parietal ROI in the LORO analysis, these two individuals were excluded from the group data (leaving 12 hemispheres tested).
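The leave-one-run-out bookkeeping and the label-shuffling test can be sketched without reference to a particular classifier. The code below is an illustration under stated assumptions: the classifier itself (a linear SVM in this study) is omitted, and the function names are hypothetical.

```python
import numpy as np


def loro_splits(run_ids):
    """Yield (train_idx, test_idx) pairs, holding out one run at a time."""
    run_ids = np.asarray(run_ids)
    for run in np.unique(run_ids):
        yield np.flatnonzero(run_ids != run), np.flatnonzero(run_ids == run)


def chance_interval(predicted, true_labels, n_perm=1000, seed=0):
    """95% interval of accuracy obtained when the test labels are shuffled."""
    rng = np.random.default_rng(seed)
    predicted = np.asarray(predicted)
    true_labels = np.asarray(true_labels)
    null = np.array(
        [np.mean(predicted == rng.permutation(true_labels)) for _ in range(n_perm)]
    )
    return np.percentile(null, [2.5, 97.5])
```

Observed accuracy pooled over the held-out runs can then be compared with this interval; with unequal numbers of trials per condition, the center of the interval need not be exactly 50%.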
To our knowledge, this is the first report of event-related multivariate pattern classification conducted entirely on the surface. We believe this approach provides marked advantages over classification in the volume with respect to spatial smoothing (superior preservation of signal-to-noise ratio and localization of activation) and statistical power (Jo et al., 2007; Anticevic et al., 2008). It has become commonplace in fMRI data analysis to apply a spatial Gaussian smoothing kernel to acquired data to increase the inherently low signal-to-noise ratio of BOLD fMRI data, increase activation detectability via the matched filter theorem, and satisfy the requirement of Gaussian random field theory (for multiple-comparisons corrections) that the smoothness of the data is large compared with the sampling resolution. However, when performed in a standard three-dimensional (3D) volume, spatial smoothing can cause voxels that are nearby in Euclidean distance to contribute to voxels that are actually quite far away in geodesic distance; this comes about because of the highly folded structure of cortex.
This issue can be ameliorated if smoothing is applied to a surface representation of cortex to which the volume data are mapped. Mapping data from volume to surface (as implemented in this study) is a form of smoothing, though if done carefully, one is simply smoothing in the direction normal to the surface (not across gyri/sulci). Furthermore, once data are mapped to the surface and analyzed to produce areas of activation, defining ROIs is much simpler and more properly restricted to gray matter compared with defining 3D ROIs in the volume. Finally, the requirements of spatial smoothing noted above may seem to be at odds with the approach taken in this study; that is, the use of multivariate pattern classification to exploit underlying biases in the neuronal populations underlying specific data samples. However, it has recently been demonstrated (Op de Beeck, 2010) and further noted (Kamitani and Sawahata, 2010; Kriegeskorte et al., 2010) that spatial smoothing does not necessarily degrade the information leveraged during multivariate pattern classification procedures.
We performed two additional analyses to address two methodological issues. First, the surface-based SVM method we used could yield a different outcome than conventional volume-based SVM because the mapping of volume data to surface nodes might have artificially increased statistical power for univariate analyses by pooling information across neighboring voxels while not necessarily preserving the spatial patterns for pattern classification. To quantify the difference between pattern classification of volume-based versus surface-based data, we performed an SVM analysis in the volume for the surface ROIs discussed above. We first projected the independently defined (via LORO procedure) surface-based ROIs into the volume using the reverse of the mapping procedure explained above, and then performed the SVM on the volume data. This provides a direct comparison between ROIs on the surface and in the volume. The results are shown in supplemental Figure S1 (available at www.jneurosci.org as supplemental material), created to parallel Figure 5. Comparing these figures shows that the volume-based pattern classification performed similarly to the surface-based classification. Specifically, the pattern shows group mean classification above chance in parietal ROIs at 6–10 s and 7–12 s postcue for left and right, respectively. The frontal ROIs exhibited chance performance throughout the time course. As expected, the quantitative values differ between the surface-based and volume-based analyses; however, the qualitative results are the same. These results provide evidence that the surface-based classification performance was not artificially inflated by the mapping method we used and provide additional confidence that the activation pattern differences detected by SVM on the surface reflect a true underlying multivariate signal present in the data.
Second, it is possible that mean differences between conditions within an ROI (that may not be differentiable via GLM contrast) might provide a basis for above-chance classification accuracy. We therefore performed a mean-centering procedure, modeled after the work of Kamitani and Tong (2005) and Esterman et al. (2009), on the same set of volume-based ROIs used in generating supplemental Figure S1 (available at www.jneurosci.org as supplemental material). For each of these ROIs, we performed the SVM classification after subtracting condition-mean activity within the ROI from each feature (voxel) for that condition. The results for each individual subject and the group mean are shown in supplemental Figure S2 (available at www.jneurosci.org as supplemental material) and provide an important check of our data and analyses: comparing the classification rate time courses in supplemental Figure S2 (classification rate after mean-centering) with the plots in supplemental Figure S1 (raw classification rate) yields only minor quantitative differences in classification in each of the four sets of averaged ROIs. Specifically, as in supplemental Figure S1, classification was reliably above chance in the parietal ROIs during a similar (or slightly expanded, for left parietal) range of time points to that of supplemental Figure S1 when data were not mean-centered. Furthermore, both left and right frontal ROIs yielded chance classification performance throughout the time course. These findings suggest that mean differences were not responsible for classification results and provide additional evidence that specifically tuned subpopulations of neurons may be differentially active between the two attention shifts.
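One literal reading of the mean-centering step is sketched below: for each condition, the mean activity over all trials and voxels of that condition within the ROI is subtracted from every feature of those trials. Whether centering was applied per condition or per individual pattern is an assumption here, and the function name is hypothetical.

```python
import numpy as np


def mean_center_by_condition(patterns, labels):
    """patterns: array (n_trials, n_voxels); labels: array (n_trials,).

    Removes each condition's ROI-wide mean, leaving the voxel-to-voxel
    differences within each pattern unchanged.
    """
    patterns = np.asarray(patterns, dtype=float)
    labels = np.asarray(labels)
    centered = patterns.copy()
    for cond in np.unique(labels):
        rows = labels == cond
        centered[rows] -= patterns[rows].mean()
    return centered
```

After this step the two conditions have identical ROI means (zero), so any remaining classification must draw on the spatial pattern rather than on overall amplitude.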
Because group data can sometimes conceal irregular patterns in individual subjects, in supplemental Figure S3 (available at www.jneurosci.org as supplemental material), we plot comparisons both with and without mean-centering in two ROIs [frontal eye fields (FEF) and mSPL] from each of the two hemispheres of one representative subject using ROIs in both surface nodes and volumetric voxels. These detailed results show a remarkable consistency before and after mean-centering, again suggesting (as in supplemental Fig. S2, available at www.jneurosci.org as supplemental material) that mean differences within each ROI for each attention shift condition played little role in driving classification performance in either frontal ROIs (where classification was near chance) or parietal ROIs (where classification was consistently above chance several seconds postshift cue). In addition, comparing across coordinate systems (volume- vs surface-based ROIs) also yielded a high degree of similarity in classification rate time course, further validating our approach to pattern classification on the surface.
To directly test whether mean differences within each ROI provided any information to the SVM about attention shift condition, we also performed a direct SVM classification on the mean time course per condition in each surface-based ROI. As in previous analyses, we used the LORO procedure to sequester training trials from test trials. For each subject, the mean of all voxels in each ROI used in the surface-based analyses discussed above was calculated separately for shift location and shift color trials and submitted to the linear classifier. The results for each individual subject and the group mean are shown in supplemental Figure S4 (available at www.jneurosci.org as supplemental material). The classification rate did not exceed chance at any time from just before to long after cue onset. This indicates that the mean time course in each ROI provided no information to the classifier about the type of attention shift performed. Note that the classification accuracy is near chance (50%) at all time points for the group mean; however, individual subject classification is centered at values slightly away from theoretical chance. This is due to the slight imbalance between trial types for individuals leading to chance performance that is not exactly 50%. Nevertheless, each individual has near flat classification performance in each ROI throughout all time points. Independent, one-sample t tests for the group confirmed that no time points in any of the ROIs were significantly different from chance performance (all t values ≪ 1). Together, the results of these control analyses (mean-centering and ROI mean classification) show that MVPC does not rely on differences in mean activity across conditions for successful classification.
Finally, we also performed a full-brain searchlight pattern classification analysis (Kriegeskorte et al., 2007) in the Talairach-transformed volume for each of our subjects included in previous SVM analyses. We used a cubic searchlight of 27 voxels centered on each voxel in the brain, in succession. To simplify the computation, we performed the classification on a single value: the mean of 6, 7, and 8 s after each attention shift cue. As in previous analyses throughout the paper, the classifier was asked to discriminate between shift color and shift location trials, using the LORO procedure to sequester training versus test trials. This yielded a value for classification accuracy at each voxel in each subject of our group. We then compiled the group data by performing an independent, one-sample t test versus chance performance, which yielded a group functional map of significant classifier performance (corrected for multiple comparisons using Gaussian random field theory cluster correction to yield a corrected p < 0.05). Representative slices from this map are shown in supplemental Figure S5 (available at www.jneurosci.org as supplemental material), and supplemental Table S1 (available at www.jneurosci.org as supplemental material) lists the locations of voxel clusters in which the classifier performed significantly above chance. We note the similarity between this result and the result of our full-brain univariate conjunction analysis. Specifically, the only regions in the searchlight map that do not appear in our group univariate map are right insula and right extrastriate cortex, both of which appear in a subset of our individual subject conjunction maps. With respect to our primary hypothesis, the results of our searchlight analysis complement our finding that portions of posterior parietal cortex (particularly mSPL) are domain-specific in nature.
In addition, the left FEF, left middle frontal gyrus (MFG)/inferior frontal gyrus (IFG), and bilateral supplementary eye fields (SEF) also showed smaller clusters of voxels with significant above-chance classification accuracy. This suggests that some portions of the frontal ROIs we submitted to ROI-based MVPC may contain reliable spatiotemporal information about the domain of attention shift performed.
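The geometry of the searchlight can be illustrated with a short sketch. It assumes that the 27-voxel cube is a 3 × 3 × 3 neighborhood, that voxels falling outside the volume are dropped, and that time courses are indexed in whole seconds after the cue; the function names are hypothetical and the classifier itself is omitted.

```python
from itertools import product


def searchlight_indices(center, shape, radius=1):
    """Return in-bounds voxel coordinates in a cube around `center`.

    radius=1 gives a 3 x 3 x 3 cube (27 voxels) away from the volume edges.
    """
    offsets = range(-radius, radius + 1)
    neighbors = []
    for dx, dy, dz in product(offsets, repeat=3):
        x, y, z = center[0] + dx, center[1] + dy, center[2] + dz
        if 0 <= x < shape[0] and 0 <= y < shape[1] and 0 <= z < shape[2]:
            neighbors.append((x, y, z))
    return neighbors


def mean_over_window(time_course, times=(6, 7, 8)):
    """Average a time course (assumed indexed in seconds post-cue) over `times`."""
    return sum(time_course[t] for t in times) / len(times)
```

Each voxel would then contribute one feature vector per trial: the windowed mean of every voxel in its neighborhood.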
Eye position monitoring.
Subjects were instructed to attend covertly to the left and right dot apertures while continuously maintaining fixation at the central fixation cross. To ensure continuous fixation, we monitored the right eye position of a subset (n = 3) of the participants while they performed the main experimental task in the scanner using a custom-made MR-compatible video camera system (MRA). Camera output was recorded by a computer in the scanner control room using ViewPoint eye-tracking software (Arrington Research) and analyzed later with custom MATLAB scripts. Our analysis showed that subjects held fixation throughout the experiment on ∼95% of cue trials (defined by <1° of horizontal eye movement in the 500 ms following each cue). These data, together with the highly consistent contralateral attentional modulations observed in extrastriate cortex, show that subjects successfully used covert attention to complete this task, as instructed.
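The fixation criterion can be expressed as a simple check on the horizontal eye trace. The sketch below is not the custom MATLAB analysis; it is a Python illustration that assumes "<1° of horizontal eye movement" means the range (maximum minus minimum) of horizontal position within the 500 ms window, and all names and data are hypothetical.

```python
def fixation_held(times_ms, x_deg, cue_ms, window_ms=500, limit_deg=1.0):
    """True if horizontal gaze varies by less than `limit_deg` after the cue."""
    window = [x for t, x in zip(times_ms, x_deg)
              if cue_ms <= t < cue_ms + window_ms]
    if not window:
        return False  # no samples in the window: fixation cannot be confirmed
    return max(window) - min(window) < limit_deg


def fraction_fixated(times_ms, x_deg, cue_times_ms):
    """Proportion of cues on which the fixation criterion was met."""
    held = [fixation_held(times_ms, x_deg, cue) for cue in cue_times_ms]
    return sum(held) / len(held)


# Synthetic trace sampled every 10 ms with one excursion between 1000 and 1200 ms.
times_ms = list(range(0, 2000, 10))
x_deg = [1.5 if 1000 <= t < 1200 else 0.1 for t in times_ms]
fraction = fraction_fixated(times_ms, x_deg, [0, 500, 1000, 1500])
```

In this synthetic trace only the cue at 1000 ms fails the criterion, so three of the four cues count as fixated.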
Results
Behavioral results
Subjects were asked to press buttons held in both hands each time they saw one of the four cues (shifts and holds); speed was not stressed. In the imaging analyses, we excluded data from trials in which cues were not detected. Subjects' behavioral data are shown in Table 1; mean (±SEM) overall detection accuracy was 0.84 ± 0.06. There was no significant difference in detection accuracy for shift cues versus hold cues (paired t(7) = 0.78; p = 0.46). A two-way repeated-measures ANOVA was conducted with cue location (left and right) and shift domain (color and location) included as within-subject factors. This yielded no significant main effect of either cue location (F(1,7) = 2.14; p = 0.19) or domain (F(1,7) = 0.62; p = 0.46). The interaction was also not significant (F(1,7) = 0.07; p = 0.80).
Effects of sustained attention
The expected contralateral visual field effects of sustained spatial attention are shown in Figure 2 for our group of eight subjects. This map is the result of a contrast between epochs of time during which the subject was attending to the left aperture versus the right aperture. As detailed in Table 2, cortical activity was greater contralateral to the attended visual field in the occipital and posterior temporal lobes, including extrastriate visual cortex and the ventral visual processing stream. This pattern was evident in all subjects. The event-related BOLD time courses in Figure 2, B–E, demonstrate the effects of attention in extrastriate cortex. These regions were selected on the criteria that attended left periods evoked greater activity than attended right, and vice versa; therefore, the time courses for hold-left and hold-right must necessarily be different (tonically low and high levels of activity for ipsilateral and contralateral cues, respectively, persist throughout the entire period shown, beginning before cue onset and continuing through the cued period and beyond). This reflects the sustained state of spatial attention to a single location before, during, and after the hold cues. Furthermore, signals evoked by shift location cues exhibited the previously documented crossover pattern in which extrastriate cortical activity is modulated via shifts to and from the preferred visual hemifield (Yantis et al., 2002; Kelley et al., 2008). Shift-to-contralateral cues in Figure 2, B and D, evoked signals that began at the low level of the hold-ipsilateral activity and rose to the high level of the hold-contralateral activity. In contrast, shift-to-ipsilateral signals began at the high level of the hold-contralateral activity and fell to the low hold-ipsilateral level.
Unlike the effects of attending to spatial locations, we are aware of no reports that attention to one of two spatially superimposed colors modulates cortical activity measured with fMRI. Accordingly, contrasts between epochs of attention to red and attention to green yielded no statistically significant activations. Shift-color cues evoked a small, transient increase in extrastriate cortical activity relative to the more sustained activity evoked by hold cues when attention was directed to either the contralateral or ipsilateral visual fields (Fig. 2C,E), possibly due to arousal from target detection.
A subset of attentional control regions (listed in Table 2) in the parietal lobe also exhibited sustained increases in activation during epochs of spatial attention maintenance. Figure 2A shows significant sustained activity in the superior and inferior parietal lobules (SPL/IPL). Individual subjects also exhibited activity in precentral gyrus (probably the FEF) (Paus, 1996) and paracentral gyrus on the medial wall (probably the SEF) (Grosbras et al., 1999). As in the occipitotemporal regions, these loci were active during epochs of sustained attention to the contralateral visual hemifield (data not shown).
Attention-shift related activity
To test for a domain-independent control network, we generated a statistical parametric conjunction map of location shift and color shift activity. As outlined in Materials and Methods, the following four contrasts were examined: shift left-to-right > hold left, shift right-to-left > hold right, shift color on left > hold left, and shift color on right > hold right. Figure 3 shows the resulting group functional map. Nodes that were significantly active for both types of shift-location cues versus hold cues are colored red, and nodes that were significantly active for both shift-color cues versus hold cues are colored green. The yellow nodes represent the conjunction: the nodes significantly activated by both shift-color and shift-location cues versus hold cues. This common shift activation (Table 3) was present in bilateral medial SPL/precuneus, left ventral PPC, left middle/inferior frontal gyri, and portions of bilateral medial/superior frontal gyri.
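The color coding of the conjunction map follows a simple logical rule, sketched here under the assumption that each contrast has already been thresholded into one boolean per node. The function and argument names are hypothetical.

```python
def conjunction_labels(loc_lr, loc_rl, col_left, col_right):
    """Label each node from four thresholded contrasts (one boolean per node).

    red    = both shift-location contrasts significant
    green  = both shift-color contrasts significant
    yellow = all four significant (the conjunction)
    """
    labels = []
    for a, b, c, d in zip(loc_lr, loc_rl, col_left, col_right):
        location = a and b
        color = c and d
        if location and color:
            labels.append("yellow")
        elif location:
            labels.append("red")
        elif color:
            labels.append("green")
        else:
            labels.append("none")
    return labels


# Four hypothetical nodes, one per possible outcome.
labels = conjunction_labels(
    [True, True, False, True],
    [True, True, False, False],
    [True, False, True, True],
    [True, False, True, True],
)
```

A node significant for only one of the two contrasts within a domain does not count toward that domain, which is why the fourth node above is labeled green rather than yellow.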
To better illustrate the domain-independent role of these regions, we generated group-averaged event-related time courses from ROIs defined by the LORO statistical maps produced as part of the multivariate pattern classification analysis described below. Figure 4A shows example ROIs in left and right medial SPL/precuneus, left and right medial/superior frontal gyri, and left middle/inferior frontal gyri. These ROIs delineated the general regions over which time courses were averaged in Figure 4, B–F (the specific nodes for each ROI varied slightly for each LORO GLM). In each plot, the response to hold cues remained low throughout the trial, whereas the responses to both the shift-color and shift-location cues were elevated and reached nearly identical levels of activation. Because these time courses were extracted from an analysis (our LORO procedure) that was independent of the GLM used to generate the group functional maps in Figure 3, the time courses need not exhibit such a pattern.
These results, which corroborate previous findings, suggest that mSPL, FEF/SEF, and MFG/IFG comprise a network of regions that exhibit transient increases in activity that are time-locked to shifts of attention between locations and between colors. One interpretation of this outcome is that these regions perform a truly domain-independent function, perhaps providing a generic go-signal to initiate a shift of attention that is specified by domain-specific cortical activity elsewhere in the brain.
An alternative possibility is that there exist distinct but spatially intermixed populations of neurons that are recruited selectively for shifts of attention between locations and colors. A conventional univariate GLM analysis (as outlined above) cannot discriminate between these possibilities. We therefore used a multivariate approach to distinguish between them.
Multivariate pattern classification
We trained a linear SVM, using a subset of our data (see Materials and Methods for details), to classify patterns of activity associated with shifts of attention between locations and colors in frontoparietal cortical ROIs. We then tested this classifier on a limited portion of independent data (using a leave-one-run-out cross-validation approach) to determine whether classification was reliable. We executed the training and testing on each individual time point during each trial (yielding an event-related classification rate) so that we could observe the temporal profile of the classifier performance.
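The leave-one-run-out scheme applied separately at each time point can be sketched as follows. To keep the example self-contained, a nearest-centroid classifier stands in for the linear SVM, so this illustrates the cross-validation structure only, not the classifier used in the study. The data layout, function names, and values are all hypothetical.

```python
def centroid(patterns):
    """Mean pattern across trials (one value per node)."""
    n = len(patterns)
    return [sum(col) / n for col in zip(*patterns)]


def sq_dist(a, b):
    """Squared Euclidean distance between two patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loro_accuracy_by_time(trials, n_times):
    """trials: list of (run_id, label, pattern), with pattern[t] = node values."""
    runs = sorted({run for run, _, _ in trials})
    accuracy = []
    for t in range(n_times):
        correct = total = 0
        for held_out in runs:  # each run serves once as the test set
            train = [(lab, pat[t]) for run, lab, pat in trials if run != held_out]
            test = [(lab, pat[t]) for run, lab, pat in trials if run == held_out]
            labels = sorted({lab for lab, _ in train})
            centroids = {lab: centroid([p for l, p in train if l == lab])
                         for lab in labels}
            for true_label, pattern in test:
                guess = min(labels,
                            key=lambda lab: sq_dist(pattern, centroids[lab]))
                correct += guess == true_label
                total += 1
        accuracy.append(correct / total)
    return accuracy


# Synthetic data: 3 runs, one trial per condition per run, 2 nodes, 2 time points.
trials = []
for run in range(3):
    for label, informative in (("color", [1.0, 0.0]), ("location", [0.0, 1.0])):
        pattern = [[0.5, 0.5], informative]  # time 0: identical; time 1: distinct
        trials.append((run, label, pattern))
accuracy = loro_accuracy_by_time(trials, n_times=2)
```

In this toy data set the two conditions have the same mean across nodes at both time points, yet the patterns separate at the second time point, so accuracy goes from 0.5 to 1.0.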
Figure 5 shows the event-related classification time courses. Panels show the mean classification performance at each time point relative to the onset of a switch cue (thick black line) and the individual subject classification time courses (thin colored lines) for frontal and parietal ROIs. The classifier did not exceed chance for the frontal ROIs but classified the test data significantly better than chance in the parietal regions (indicated by the gray shaded portion of the plots; p < 0.01). This pattern was consistent across subjects, as can be seen in the individual classification time courses [this was also true for a parallel analysis performed in the volume and was partially confirmed in a whole-brain searchlight MVPC analysis (supplemental Figs. S1 and S5, available at www.jneurosci.org as supplemental material)]. Classification accuracy in the parietal ROIs was significantly greater than chance at time points 4 through 8 s (left parietal) and 3 through 8 s (right parietal) after shift cue onset. Notably, the classifier did not perform above chance in the parietal ROIs early or late in the time course; the peak classifier performance occurred at approximately the same time as the peak event-related BOLD activity evoked by shift cues (Fig. 4, cf. event-related average plots). However, the shape of the event-related classification rate is markedly different from that of the event-related averages; whereas the event-related averages are fairly symmetrical, the event-related classification rates have a quick rise-time and gradually fall back to chance performance by ∼9 s postcue. We also note that the profile of event-related classification time courses, as performed in the volume (supplemental Fig. S1, available at www.jneurosci.org as supplemental material), is markedly different from those resulting from surface-based classification.
Specifically, the volume-based classification rates are less transient than those of the surface-based classification (compare Fig. 5 and supplemental Fig. S1, available at www.jneurosci.org as supplemental material).
Discussion
We used fMRI to measure the time course of cortical activity following shifts of visual attention within two domains: location and color. Both the sensory stimulus and the motor response demands were constant throughout the experiment. This allowed for a direct comparison of shift-related cortical activity in the two domains. Visual sensory regions in extrastriate cortex reflected attentional modulations consistent with sustained deployments of spatial attention. For spatial attention shifts, the observed crossover pattern of neural activity replicates previous reports (Motter, 1994; Yantis et al., 2002; Kelley et al., 2008). For attentional shifts between colors, we observed low-amplitude, transient modulations of contralateral extrastriate cortex (relative to hold cue-evoked activity) similar to those seen in frontoparietal attentional control regions (compare Fig. 2C,E with Fig. 4B–F). This echoes a finding reported by Serences et al. (2004) in which a similar modulation of extrastriate cortex followed shifts of object-based attention. Those authors suggested that task-specific segmentation processes (in their case, segmenting two objects; in our case, segmenting two colors) may briefly accompany the attention shift signal and cause heightened activity in sensory areas. We postulate that this modulation is not the same reflection of top-down control seen in frontoparietal regions, but instead the result of low-level perceptual processing.
As noted in the Introduction, several previous studies have reported that a region in the mSPL exhibits transient activation following a cue to shift attention. Here, we examined the neural correlates of attentional shifts between values within two different perceptual domains in the same experiment to test the hypothesis that dorsal regions of PPC and prefrontal cortex produce a domain-independent signal that serves to reconfigure the current state of attentional selection. A conjunction analysis of spatial and color shift activity yielded significant activation of bilateral mSPL as well as bilateral FEF/SEF and left MFG. In each of these regions, mean event-related BOLD time courses for both color and spatial shifts of attention exhibited a transient increase in activity with a peak at ∼4–5 s after cue onset. Importantly, the activity for both types of shifts was nearly identical and did not vary for the two values tested in each domain.
A group analysis of sustained spatial attention revealed a region of activation from ventrotemporal through extrastriate visual cortex along with an area of PPC during periods in which subjects maintained attention to the contralateral hemifield. These parietal regions of activation partially overlapped with the domain-independent shift-related activity shown in the conjunction analysis. However, the significant sustained attention activity spread more laterally into the IPL, whereas common shift-related activity was localized more medially in precuneus/SPL. This result is consistent with the recent finding of differential contributions of SPL and IPS following changes of attentional priority (Molenberghs et al., 2007). Both that study and the current one report evidence of mSPL involvement in shifts of spatial attention while IPS/IPL subserves endogenous maintenance of attention. However, Molenberghs et al. (2007) argue that SPL is more closely involved in spatial, not feature-based, shifts of attention. They suggest that previous studies reporting nonspatial attention shift signatures in SPL (Liu et al., 2003; Serences et al., 2004; Shomstein and Yantis, 2006) used tasks that may have allowed subjects to use a spatial strategy. In the current study, however, such a spatial strategy (e.g., tracking an individual dot as it moved, and shifting spatial attention from the currently attended dot to a dot in the to-be-attended color) would give rise to transient signals throughout the entire experiment (not solely at the moment of a color shift). Because individual dot lifetime was limited, such a strategy would require frequent shifts from a dot in the attended color to another dot in the attended color when the currently tracked dot disappeared. This would have evoked multiple instances of spatial attention shifts (and corresponding transient control signals) during sustained periods of attention. Our data do not support this possibility. 
Thus, our results suggest that mSPL plays a role in both spatial and nonspatial shifts of attention.
Multivariate pattern classification analysis of our data revealed that in parietal ROIs, which exhibited nearly identical mean BOLD time courses for color and location shifts, domain-specific information is expressed in the spatiotemporal patterns of nodewise activation within mSPL. Furthermore, mean-centering procedures had little effect on classification accuracy and mean time course classification yielded chance classifier performance (supplemental Figs., available at www.jneurosci.org as supplemental material), lending further support to separability of function at the sub-ROI level. Our findings suggest that PPC contains distinct subpopulations of neurons distributed throughout this region, each with domain-specific tuning functions for cognitive control. This finding also provides evidence against the hypothesis suggested by Molenberghs et al. (2007) that shifts of attention between features are instances of spatial attention shifts.
Evidence for interleaved subpopulations of neurons tuned for different domains of control was recently reported by Schenkluhn and colleagues (2008). They applied transcranial magnetic stimulation (TMS) to PPC while subjects performed a visual search task. In this paradigm, subjects were cued to either the color or location of a target item in a subsequent visual search array. They applied TMS after the cue and observed impairment in both color and spatial trials when TMS was applied to anterior IPS (stimulation of a more lateral site in PPC disrupted only spatial cue trials). The disruption of preparatory shifts of attention to a specific color and a specific location is consistent with our finding that spatially distributed and intermixed populations of neurons exhibit differential tuning to these two domains of attentional control.
Recent studies have implicated the PPC in not only shifts of attention within perceptual domains, but also nonperceptual shifting of categorization rules held in memory (Chiu and Yantis, 2009). Multivoxel pattern classification revealed distinct spatiotemporal patterns of activity within PPC for spatial shifts of attention versus rule shifts (Esterman et al., 2009). Esterman and colleagues (2009) suggested that the patterns of activity in mSPL associated with spatial attention shifts and rule switches could be accounted for by postulating dimensions of perceptual versus nonperceptual or spatial versus nonspatial acts of cognitive control. The current study demonstrates that spatial shifts of attention elicit distinct patterns of activity from (nonspatial) color shifts, both shifts of perceptual selection. Additional studies that explore two different nonspatial domains of cognitive control will further clarify the organizational structure of mSPL.
The current study did not examine shifts between distinct perceptual domains. Shifts of spatial attention were always between dot fields of the same color, and color shifts were always between dot fields at the same spatial location. Two previous studies, however, have shown that cross-domain shifts of attention between color and motion (Liu et al., 2003) and between vision and audition (Shomstein and Yantis, 2004) also evoke transient increases in mSPL. Based on the current findings, we would predict that pattern classification of those data would reveal distinct spatiotemporal patterns of activity for shifts in one direction versus the other.
The notion that there exists a generalized frontoparietal attention control network that works together with specialized subregions of this network to execute a shift of attention has been raised previously (Shulman et al., 2002; Slagter et al., 2007). The present data support both this global hypothesis and specifications that propose separate roles for the frontal and parietal portions of this network. Shulman and colleagues (2002), for example, suggest that preparatory signals for specifying task-relevant information emanate from frontal portions of this network; the signals are then propagated to posterior parietal attention regions that reflect both preparatory and task-execution processes. They observed that preparatory signals (frontal and parietal) were domain-independent whereas task execution signals (parietal only) contained both domain-independent and task-specific components. The current data are in accord with these findings in that pattern classification was successful in distinguishing between the two shift conditions in PPC but not in the frontal ROIs, suggesting domain-generality in the frontal regions and domain-specificity in parietal regions.
Furthermore, Slagter et al. (2007) suggest that subregions of this frontoparietal network are specialized for either content information (e.g., a specific location in space) or type of process (e.g., a shift of attention). The present data echo this idea, suggesting that domain-independent frontal regions reflect process type whereas parietal regions may reflect content information. However, we include the caveat that portions of several frontal regions we submitted to ROI-based MVPC may contain reliable spatiotemporal information about the domain of attention shift performed, as evidenced by our searchlight pattern classification results (see Materials and Methods for discussion and supplemental figures, available at www.jneurosci.org as supplemental material).
The present results contribute to a growing body of evidence that medial SPL serves as a cortical hub for the initiation of attention and task shifts in multiple domains. Distinct spatiotemporal patterns of neural activity are associated with acts of cognitive control in different domains. These patterns of activity may participate in the specification of the new state of attention, the disengagement from the current state of attention, or both. A remaining challenge is to discover the mechanism by which these attentional specifications target domain-specific activity in sensory cortex.
Footnotes
- This work was supported by National Institutes of Health Grants R01-DA13165 (to S.Y.) and F31-NS055664 (to A.G.). We thank T. Brawner and K. Kahl for help with scanning and J. Gillen for technical assistance. We also thank Yuefeng Han for help with the searchlight analysis.
- Correspondence should be addressed to Adam S. Greenberg, Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213. agreenb{at}cmu.edu