Abstract
Visual search is aided by previous knowledge regarding distinguishing features and probable locations of a sought-after target. However, how the human brain represents and integrates concurrent feature-based and spatial expectancies to guide visual search is currently not well understood. Specifically, it is not clear whether spatial and feature-based search information is initially represented in anatomically segregated regions, nor at which level of processing expectancies regarding target features and locations may be integrated. To address these questions, we independently and parametrically varied the degree of spatial and feature-based (color) cue information concerning the identity of an upcoming visual search target while recording blood oxygenation level-dependent (BOLD) responses in human subjects. Search performance improved with the amount of spatial and feature-based cue information, and cue-related BOLD responses showed that, during preparation for visual search, spatial and feature cue information were represented additively in shared frontal, parietal, and cingulate regions. These data show that representations of spatial and feature-based search information are integrated in source regions of top-down biasing and oculomotor planning before search onset. The purpose of this anticipatory integration could lie with the generation of a “top-down salience map,” a search template of primed target locations and features. Our results show that this role may be served by the intraparietal sulcus, which additively integrated a spatially specific activation gain in relation to spatial cue information with a spatially global activation gain in relation to feature cue information.
- visual search
- attention
- spatial attention
- feature-based attention
- top-down salience map
- oculomotor planning
Introduction
Visual search is aided by previous knowledge regarding probable features (Egeth et al., 1984; Kaptein et al., 1995) and locations (Posner et al., 1980; Yantis and Jonides, 1990) of a sought-after target. Spatial information can guide eye movements and engage top-down spatial attention mechanisms that enhance responses in visual neurons whose receptive fields overlap with the expected target location (Moran and Desimone, 1985; Treue and Maunsell, 1996; Luck et al., 1997; Tootell et al., 1998; Brefczynski and DeYoe, 1999; Bichot et al., 2005). Feature-based information can be harnessed to engage top-down feature-based attention mechanisms, facilitating responses in neurons preferentially coding for the targeted feature across the visual field (Treue and Martinez-Trujillo, 1999; McAdams and Maunsell, 2000; Saenz et al., 2002; Bichot et al., 2005; Liu et al., 2007; Serences and Boynton, 2007), and this “highlighting” of potentially relevant locations can in turn guide eye movements and the focus of spatial attention (Wolfe, 1994; Cave, 1999; Rao et al., 2002). Importantly, in real-life visual search, we often have available both feature-based and spatial foreknowledge, and these sources of information are presumably used in concert to optimize the allocation of attention and eye movements during the search process (Kingstone, 1992; Turano et al., 2003). However, the way in which the human brain represents concurrent spatial and feature-based expectancies is currently not well understood.
Combined influences of spatial and feature-based expectancies have been reported in visual target regions of top-down biasing, namely, in stimulus-evoked responses of extrastriate neurons in the monkey (Treue and Martinez-Trujillo, 1999; McAdams and Maunsell, 2000). However, it is not clear whether visual cortex represents the initial site of integration for spatial and feature-based expectancies, combining inputs from anatomically segregated source regions of spatial and feature-based top-down biasing or, alternatively, whether an integrated representation of relevant features and locations already exists in source regions of top-down biasing and oculomotor planning, such as frontal and parietal cortices. An extensive literature has delineated a fronto-cingulate-parietal network involved in spatial orienting of attention and eye movements (Mesulam, 1999; Corbetta and Shulman, 2002), but studies that sought to directly compare neural signatures of spatial versus feature-based attention have found little evidence for a dissociable “feature attention network” (Wojciulik and Kanwisher, 1999; Vandenberghe et al., 2001; Giesbrecht et al., 2003). However, an important shortcoming of these studies is that they did not manipulate spatial and feature information simultaneously and independently of each other, such that the interactive effects of these factors could not be assessed, which precludes the unambiguous identification of shared or distinct neural representations (Sternberg, 2001).
Here, we assessed how the human brain represents and integrates concurrent spatial and feature-based expectancies, by acquiring behavioral and functional magnetic resonance imaging (fMRI) data during a cued visual search task that independently and parametrically varied the degree of spatial and feature-based cue information regarding the likely identity of an upcoming search target.
Materials and Methods
Subjects.
Fourteen healthy, right-handed volunteers (10 females; mean age, 27.1 years; range, 22–35 years) gave written informed consent to participate in this study, in accordance with institutional guidelines of Northwestern University. All participants had normal or corrected-to-normal vision and were screened by self-report to exclude any subjects reporting previous or current neurological or psychiatric conditions, and current psychotropic medication use. Subjects were paid $60 for participating in two 1 h fMRI sessions.
Experimental protocol.
To characterize how the brain integrates a priori information regarding probable features and locations of a visual search target, we acquired fMRI data in human subjects performing a cued visual search task that independently varied the degree of spatial and feature-based information available about the likely identity of an upcoming search target (Fig. 1) (cf. Kingstone, 1992). Each trial consisted of a cue period and a visual search (target) period. Search arrays consisted of a small central fixation circle (subtending 0.5° visual angle) and four peripherally placed diamond shapes (each subtending 1.5° visual angle) of blue [red–green–blue (RGB), 0, 0, 255; hue, 160; saturation, 240; luminance, 120] and red color (RGB, 255, 0, 0; hue, 0; saturation, 240; luminance, 120). Two diamonds were placed to the left (one blue and one red diamond) and two to the right (one blue and one red diamond) of fixation, at 7.5° eccentricity, on a uniform gray background (RGB, 127, 127, 127; hue, 160; saturation, 0; luminance, 120) (Fig. 1A). Thus, on any given trial, the identity of each diamond could be uniquely specified by a combination of spatial (left vs right) and feature-based (blue vs red) information. On each trial, one of the four diamond stimuli (the “target”) had a small fragment of either its top or bottom corner removed (Fig. 1A), and it was the subjects' task to locate this diamond and indicate whether the corner was missing at the top or bottom of the diamond, via a button push with either the right index or middle finger. The color of each diamond, the identity of the target diamond, and the location of the missing fragment on the target diamond all varied randomly from trial to trial.
The search for the target diamond was systematically biased via a preceding cue stimulus (subtending 1.5° visual angle) that consisted of a central square and two flanking triangles (Fig. 1A,B). Cues contained two sources of potential information: feature-based information, in the form of a letter in the central square (“B” for blue, “R” for red), and spatial information, which was conveyed by “filling in” one of the flanking triangles. We varied independently and parametrically the level of spatial and feature cue information regarding the likely identity of the subsequent target diamond, from 50% (no information) to 70–90% accurate target location and/or color prediction, resulting in a 3 × 3 factorial design (Fig. 1B), crossing spatial cue information (50, 70, and 90%) with feature cue information (50, 70, and 90%). For example, the cue could inform the subject that the target diamond has a 70% probability of appearing in the left hemifield and a 90% probability of being red (Fig. 1A). This design allowed us to assess main and interaction effects of previous spatial and feature-based information on visual search performance and blood oxygenation level-dependent (BOLD) responses. Equiprobable target location (the 50% cue) was conveyed by leaving the triangles blank (white), 70% probability of the target being presented to a given side of the cue stimulus was indicated by filling in the triangle pointing to that side with gray color, and 90% probability was associated with black fill-in color. The same rules applied to the feature cue information, in which a blank square indicated equal probability for the target being red or blue, the 70% probability condition was associated with the cue letter being displayed in gray, and the 90% probability condition was associated with black letters (Fig. 1B).
Note that cue information aided the search for the target diamond but was not predictive of the correct response associated with the target and thus did not prime the correct response. Furthermore, targets required a “top” or “bottom” response, orthogonal to the left/right dimension that was cued by the spatial cue information.
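The 3 × 3 cue design described above can be illustrated with a minimal sketch. It assumes that the target's side and color are drawn independently on each trial, each matching the cue with the stated probability; the sampling scheme actually used is not specified in the text, and all function and variable names below are hypothetical.

```python
import random
from itertools import product

CUE_LEVELS = (50, 70, 90)  # percent cue information per dimension

# All nine cue types as (spatial %, feature %) pairs.
DESIGN = list(product(CUE_LEVELS, CUE_LEVELS))

def draw_value(cued, uncued, pct, rng):
    """Return the cued value with probability pct/100, else the uncued one."""
    return cued if rng.random() < pct / 100.0 else uncued

def draw_target(spatial_pct, feature_pct, cued_side, cued_color, rng):
    """Draw target side and color independently (an assumption).

    For a 50% cue the 'cued' value is only a placeholder, because both
    alternatives are then equally likely.
    """
    other_side = "right" if cued_side == "left" else "left"
    other_color = "red" if cued_color == "blue" else "blue"
    side = draw_value(cued_side, other_side, spatial_pct, rng)
    color = draw_value(cued_color, other_color, feature_pct, rng)
    return side, color

# Example from the text: 70% left, 90% red.
rng = random.Random(1)
targets = [draw_target(70, 90, "left", "red", rng) for _ in range(10000)]
prop_left = sum(side == "left" for side, _ in targets) / len(targets)
prop_red = sum(color == "red" for _, color in targets) / len(targets)
```

Over many simulated trials, `prop_left` and `prop_red` should approach 0.7 and 0.9, respectively.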
To dissociate BOLD responses associated with cue processing from those related to subsequent target processing, the duration of the cue period was jittered, varying from 4 to 8 s in 1 s steps, along a pseudoexponential distribution (Ollinger et al., 2001; Wager and Nichols, 2003), in which 50% of cue periods lasted 4 s, 25% lasted 5 s, 13% lasted 6 s, 6% lasted 7 s, and 6% lasted 8 s, and cue and target periods were modeled with separate regressors in the fMRI analysis (see below). The target period lasted until the subject responded. To dissociate cue processing from target processing on the previous trial, a jittered intertrial interval was used, varying from 4 to 8 s with an identical distribution to the cue intervals, during which a fixation cross (subtending 1° visual angle) was displayed.
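The jitter scheme can be sketched as follows, assuming that each duration is drawn independently at random with the stated proportions. The original experiment may instead have used fixed counts per run, and the function names here are hypothetical.

```python
import random

# Cue-period durations (s) and their stated proportions (50/25/13/6/6%).
DURATIONS_S = [4, 5, 6, 7, 8]
WEIGHTS = [0.50, 0.25, 0.13, 0.06, 0.06]

def sample_durations(n_trials, seed=None):
    """Draw n_trials jittered durations from the stated distribution."""
    rng = random.Random(seed)
    return rng.choices(DURATIONS_S, weights=WEIGHTS, k=n_trials)

def expected_duration():
    """Mean duration (s) implied by the stated proportions."""
    return sum(d * w for d, w in zip(DURATIONS_S, WEIGHTS))

durations = sample_durations(36, seed=0)  # 36 trials per run, as in the task
```

Under these proportions the mean cue period (and, given the identical distribution, the mean intertrial interval) is 4.93 s.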
To closely simulate the use of spatial and feature information in real-life visual search scenarios, subjects were allowed to move their eyes during the target period of each trial but not during the cue period (which we verified by tracking eye movements; see below, Eye-movement data). Thus, cue-related BOLD responses in this task were not contaminated by actual eye movements but represent an amalgam of the representation of spatial and feature information and the translation of that information into top-down attentional biasing and oculomotor planning processes. The task programming, stimulus delivery, and recording of behavioral responses were performed with Presentation software (Neurobehavioral Systems). Stimuli were displayed via a back-projection screen placed at the head of the scanner bore, which was viewed by the subjects via a mirror attached to the head coil. Manual responses were recorded via MRI-compatible button pads. Subjects performed eight scan runs of 36 trials each, broken up into two separate scanning sessions.
Behavioral data analysis.
In each subject, response time (RT) data from correct trials were trimmed by removing cases that lay >2 SDs away from the subject's grand mean. Then, mean RT was calculated for each cue condition, separately for validly cued and invalidly cued trials. These individual subject mean values were then entered into group analyses, which consisted of repeated-measures ANOVAs and paired-sample t tests, described in Results. Accuracy data did not represent a measure of interest but were analyzed to verify that effects in the RT data were not related to speed–accuracy tradeoffs.
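The trimming and averaging steps can be written as a short sketch. It assumes the sample standard deviation (n − 1 denominator) and a single grand mean computed over all of a subject's correct trials; the text does not state which SD estimator was used, and the names below are hypothetical.

```python
from statistics import mean, stdev

def trim_rts(rts, n_sd=2.0):
    """Keep RTs within n_sd sample SDs of the subject's grand mean."""
    m, sd = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= n_sd * sd]

def condition_means(trials, n_sd=2.0):
    """Mean RT per (cue condition, validity) cell after trimming.

    trials: list of (condition, validity, rt_ms) tuples for the correct
    trials of one subject.
    """
    rts = [rt for _, _, rt in trials]
    m, sd = mean(rts), stdev(rts)
    cells = {}
    for cond, validity, rt in trials:
        if abs(rt - m) <= n_sd * sd:  # trim against the grand mean
            cells.setdefault((cond, validity), []).append(rt)
    return {cell: mean(values) for cell, values in cells.items()}

# Made-up illustration: nine ordinary RTs and one slow outlier.
example = trim_rts([500] * 9 + [2000])
```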
Eye-movement data.
We recorded eye movements during scanning to ensure that the BOLD responses during cue processing were not confounded by eye motion. Eye movements were gauged with a long-range optics Applied Science Laboratory (ASL) eye-tracking system, which monitored and recorded the relative positions of the pupil and corneal reflection, in reference to the visual stimulus display. We obtained valid eye-movement records in 8 of 14 subjects. In the remaining subjects, eye movements were monitored on-line by the experimenter, and none of our subjects reported any difficulty in maintaining fixation on the cue stimulus. Eye-movement data were analyzed using ASL Eyenal software. For each cue period in each subject, we calculated the position and duration of fixations (defined as gaze maintained within 1° visual angle for >100 ms) but discarded eye-blink artifacts and time periods during which no corneal reflection was reliably recorded. This pruning process resulted in a sample of 966 artifact-free cue periods for which eye gaze was analyzed. To verify that subjects maintained their gaze on the cue stimulus (subtending 1.5° visual angle), we created an area of interest (AOI) of 3° diameter, centered on the cue, and calculated the percentage of time during which fixations were registered within the AOI. Subjects were found to have reliably kept their gaze on the cue stimulus: for an average ± SD of 98.02 ± 4.97% of the time, fixations lay within the AOI, and success at fixating did not vary with cue type (range, 97.2–98.7%).
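The AOI measure can be illustrated with a minimal sketch. It assumes that fixations are available as positions in degrees of visual angle relative to the cue center, and that the percentage is weighted by fixation duration. This is not the ASL Eyenal implementation, and the names are hypothetical.

```python
import math

AOI_RADIUS_DEG = 1.5  # AOI of 3 deg diameter, centered on the cue

def percent_time_in_aoi(fixations, center=(0.0, 0.0), radius=AOI_RADIUS_DEG):
    """Percentage of total fixation time spent inside a circular AOI.

    fixations: list of (x_deg, y_deg, duration_ms) tuples for one cue period.
    """
    total = sum(duration for _, _, duration in fixations)
    if total == 0:
        return float("nan")  # no usable fixations in this period
    inside = sum(
        duration
        for x, y, duration in fixations
        if math.hypot(x - center[0], y - center[1]) <= radius
    )
    return 100.0 * inside / total

# Made-up illustration: 900 ms on the cue, 100 ms at 5 deg eccentricity.
pct = percent_time_in_aoi([(0.0, 0.0, 900), (5.0, 0.0, 100)])
```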
Image acquisition.
Images were recorded with a Siemens Trio 3 tesla scanner, using a 12-channel birdcage head coil. Functional images were acquired parallel to the anterior commissure–posterior commissure line with a T2*-weighted echo planar imaging sequence of 38 contiguous axial slices [repetition time (TR), 2000 ms; echo time (TE), 20 ms; flip angle, 80°; field of view (FOV), 220 × 220 mm; array size, 64 × 64] of 3.0 mm thickness and 3.4 × 3.4 mm in-plane resolution. Structural images were acquired with a T1-weighted magnetization-prepared rapid-acquisition gradient echo sequence (TR, 19 ms; TE, 5 ms; flip angle, 20°; FOV, 220 × 220 mm), recording 124 slices at a slice thickness of 1.5 mm and in-plane resolution of 0.86 × 0.86 mm.
Image analysis.
All preprocessing and statistical analyses were performed using SPM5 (http://www.fil.ion.ucl.ac.uk/spm/software/spm5/). For each subject, functional data were slice-time corrected and spatially aligned to the first volume of the first run. Each subject's structural scan was coregistered to a mean image of their realigned functional scans and then used to calculate transformation parameters for normalizing the functional images to the Montreal Neurological Institute (MNI) template brain. The normalized functional images (resampled to 3 × 3 × 3 mm voxels) were spatially smoothed with a Gaussian kernel of 9 mm full-width at half-maximum. The first five volumes of each run were discarded before building and estimating the statistical models of the task. A 256 s temporal high-pass filter was applied to remove low-frequency artifacts. Temporal autocorrelation in the time series data was estimated using restricted maximum likelihood estimates of variance components using a first-order autoregressive model, and the resulting nonsphericity was used to form maximum likelihood estimates of the activations.
In the initial set of analyses, the task model entailed one regressor that coded for onsets of cue periods, one regressor that coded for onsets of target periods, and a nuisance regressor that coded for error trials. Importantly, the cue period regressor was complemented by two orthogonal linear parametric modulator regressors, one each to code for the levels of spatial and feature information contained by each cue (50% = 0, 70% = 1, and 90% = 2), respectively. Activity associated with each of these modulators was first assessed at the single-subject level and then entered into second-level random-effects analyses, identifying regions whose activity was significantly modulated by increasing spatial cue information or by increasing feature cue information and regions where either of these effects was present to a stronger degree than the other (i.e., an interaction effect). Statistical significance of activations for these analyses was assessed at a whole-brain corrected false discovery rate (FDR) (Genovese et al., 2002) of p < 0.05, and a minimum cluster extent of 20 voxels. Next, a conjunction analysis (Nichols et al., 2005) was performed to reveal areas that were subject to both spatial and feature cue modulation. The regions thus identified served as regions of interest (ROIs) for additional analysis. To estimate these regions' responses to each of the nine cue conditions and depending on which side of the visual field was cued by the spatial cue (left vs right), new nonparametric task models were constructed, in which each possible cue condition was modeled by a separate regressor (target periods and error trials were also modeled separately, as before). In each subject, mean estimates of the activity associated with each cue type (beta parameters) were extracted from each of the ROIs (using Marsbar, http://marsbar.sourceforge.net/) and entered into group analyses described in Results. 
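The coding of the two parametric modulators can be sketched as follows. The level codes (50% = 0, 70% = 1, 90% = 2) come from the text; the mean-centering and the single Gram-Schmidt step are assumptions about how orthogonality could be obtained, not a reproduction of the SPM5 code, and convolution with a hemodynamic response function is omitted. All names are hypothetical.

```python
LEVEL_CODE = {50: 0, 70: 1, 90: 2}  # coding stated in the text

def center(values):
    """Subtract the mean so the modulator is orthogonal to the cue onset term."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthogonalize(v, ref):
    """Remove from v its projection onto ref (one Gram-Schmidt step)."""
    denom = dot(ref, ref)
    if denom == 0:
        return list(v)
    k = dot(v, ref) / denom
    return [x - k * r for x, r in zip(v, ref)]

def cue_modulators(trials):
    """trials: list of (spatial_pct, feature_pct), one entry per cue."""
    spatial = center([LEVEL_CODE[s] for s, _ in trials])
    feature = center([LEVEL_CODE[f] for _, f in trials])
    feature = orthogonalize(feature, spatial)  # assumed ordering: spatial first
    return spatial, feature

# One trial per cell of the 3 x 3 design.
design = [(s, f) for s in (50, 70, 90) for f in (50, 70, 90)]
spatial_mod, feature_mod = cue_modulators(design)
```

In a balanced 3 × 3 design the two centered modulators are already uncorrelated, so the orthogonalization step leaves the feature modulator unchanged; it matters only when error trials or unequal cell counts unbalance the design.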
For display purposes, we also generated graphs of the time courses of percentage change in BOLD signal after cue onset (see Figs. 2, 3). Data for these graphs were generated by modeling cue-related BOLD data with a finite impulse response function (using Marsbar, http://marsbar.sourceforge.net/) that does not assume any particular shape of the hemodynamic response. In each subject, the mean percentage signal change in the BOLD signal was estimated for each of eight scan acquisitions after each cue (0–16 s after cue), and time courses were subsequently averaged across subjects and cue types.
Results
Behavioral data
The analyses of behavioral performance data acquired during the fMRI scan focused on mean RT of correct trials only. We first assessed whether the probabilistic manipulation of cue information was successful in inducing graded and comparable search benefits for spatial and feature-based foreknowledge. If the manipulation were successful, RT should systematically vary as a function of cue information and cue validity (Posner et al., 1980; Yantis and Jonides, 1990; Hahn et al., 2006), in which more informative cues would be associated with faster RT on validly cued trials (cue benefits) but with slower RT on invalidly cued trials (cue costs). As shown in Figure 1C, exactly this pattern of results was observed. We formally tested these effects in a cue information (70 vs 90%) × cue validity (valid vs invalid) × cue dimension (spatial vs feature) repeated-measures ANOVA. As expected, the effects of cue information and cue validity interacted (F(1,13) = 15.2, p < 0.005), because 90% informative cues were associated with faster RT than 70% informative cues in the valid cue condition (t(13) = 2.35, p < 0.05) but with slower RT in the invalid cue condition (t(13) = 4.0, p < 0.005) (Fig. 1C). There were no effects involving cue dimension, indicating that spatial and feature-based attention had comparable effects on search performance. We further corroborated these results in a traditional “cue validity effect” analysis (Fig. 1D): valid trial RT for 90 and 70% informative cues was subtracted from invalid trial RT for 90 and 70% informative cues, respectively, and the resulting cue validity effect scores were submitted to a cue information (70 vs 90%) × cue dimension (spatial vs feature) ANOVA. We observed a main effect of cue information (F(1,13) = 15.2, p < 0.005), because the cue validity effect was larger for the 90% than the 70% informative condition, with no difference between spatial and feature-based cues and no interaction between these factors (Fig. 1D). 
These results establish that our task successfully induced comparable search benefits of spatial and feature-based information that reliably varied with level of cue information.
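The cue validity effect used above is the invalid-trial RT minus the valid-trial RT at each level of cue information. A minimal sketch follows; the RT values are made up purely for illustration and are not data from this study, and the function name is hypothetical.

```python
def cue_validity_effect(mean_rt, levels=(70, 90)):
    """Invalid minus valid mean RT (ms) for each cue information level.

    mean_rt: dict mapping (info_pct, "valid" | "invalid") to a mean RT in ms.
    """
    return {
        info: mean_rt[(info, "invalid")] - mean_rt[(info, "valid")]
        for info in levels
    }

# Hypothetical subject means (ms), not values from the study.
example_rt = {
    (70, "valid"): 900.0, (70, "invalid"): 1000.0,
    (90, "valid"): 850.0, (90, "invalid"): 1100.0,
}
effects = cue_validity_effect(example_rt)
```

A larger score at 90% than at 70% corresponds to the main effect of cue information reported above.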
Relationship between spatial and feature-based cuing
Having established the principal validity of the experimental manipulation, we focused on the analysis of validly cued trials only to determine how concurrent spatial and feature-based foreknowledge affects visual search performance. To this end, RT data were analyzed in a spatial cue information (50 vs 70 vs 90%) × feature cue information (50 vs 70 vs 90%) ANOVA. As depicted in Figure 1E (for descriptive statistics, see Table 1), RT decreased with increasing cue information, reflected in main effects of spatial (F(2,26) = 32.3, p < 0.001; linear trend, F(1,13) = 39.7, p < 0.001) and feature cue information (F(2,26) = 19.4, p < 0.001; linear trend, F(1,13) = 20.4, p < 0.001). These linear effects of cue information were accompanied by smaller quadratic trends (spatial cue, F(1,13) = 13.1, p < 0.005; feature cue, F(1,13) = 15.4, p < 0.005), because RT benefits derived from additional cue information tended to “level off” in the transition from 70 to 90% predictive information compared with the transition from 50% (no information) to 70% predictive cue information (Fig. 1E).
The analyses also revealed an interaction effect between spatial and feature-based cue information (F(4,52) = 4.5, p < 0.005), because the effect of spatial cue information was greatest when the feature cue was uninformative (50% condition), and, vice versa, the effect of feature cue information was greatest when the spatial cue was uninformative. This interaction appeared to be driven by the very slow RT in the “neutral” cue condition (50% spatial and 50% feature cue), in which neither spatial nor feature-based attention could be engaged a priori, rather than by a progressive decline in the effects of one factor with increasing level of the other factor (Fig. 1E). In line with this observation, additional spatial cue information benefited RT when the feature cue was 70% informative (F(2,26) = 19.4, p < 0.001; linear trend, F(1,13) = 30.6, p < 0.001) and even when the feature cue was 90% informative (F(2,26) = 19.5, p < 0.001; linear trend, F(1,13) = 49.5, p < 0.001), with no difference in strength between these effects (F(2,26) = 1.5, p > 0.1) (Fig. 1E). Likewise, additional feature cue information facilitated RT when the spatial cue was 70% informative (F(2,26) = 3.9, p < 0.05; linear trend, F(1,13) = 5.0, p < 0.05) and even when the spatial cue was 90% informative (F(2,26) = 9.4, p < 0.001; linear trend, F(1,13) = 13.0, p < 0.005), with no difference in strength between these effects (F(2,26) = 1.9, p > 0.1) (Fig. 1E). These results suggest that spatial and feature-based foreknowledge produced primarily independent, quasi-additive beneficial effects on visual search performance.
Accuracy on this task was consistently very high (mean ± SD, 98.0 ± 2.8%) and did not represent a primary measure of interest. However, to ascertain that the cueing effects observed in the RT data did not derive from speed–accuracy tradeoffs, accuracy scores were analyzed in the same way as the RT data (i.e., in a 3 × 3 ANOVA). No differences in accuracy between the cueing conditions were detected (for descriptive statistics, see Table 1).
fMRI data
We set out to identify brain regions whose activity displayed main effects of spatial cue information, main effects of feature cue information, as well as potential interaction effects and shared effects of spatial and feature cue information. To ensure that these effects were not confounded with bottom-up stimulus factors, target detection, response selection, or other processes involved in the actual visual search, all fMRI analyses reported below refer to the preparatory cue interval only, that is, they reflect BOLD responses in anticipation of the target stimulus array (Kastner et al., 1999; Corbetta et al., 2000; Hopfinger et al., 2000). Furthermore, spatial and feature cue information were modeled as orthogonal, parametric modulators of the effects of cue processing per se, thus also controlling for perceptual and other generic effects of cue encoding in the activations reported below. To ensure that these data were not contaminated by oculomotor activity, we recorded eye movements in the scanner and confirmed that subjects maintained fixation on the cue during the cue period (see above, Materials and Methods, Eye-movement data). Note, however, that subjects were free to saccade during the search period itself, such that the cue-related BOLD responses analyzed here represent preparation for a “real” (overt) visual search, comprising the encoding and representation of spatial and feature cue information and the translation of this information into top-down attentional biasing and oculomotor planning processes.
Shared neural substrates of spatial and feature cue information
Random-effects analyses of cue-related BOLD responses revealed that increasing spatial cue information was associated with enhanced bilateral activation in the intraparietal sulcus (IPS), dorsolateral frontal cortex, the presupplementary motor area/anterior cingulate cortex (preSMA/ACC), lateral prefrontal cortex, inferior frontal cortex/anterior insula (IFC/AI), and lateral occipital regions (Fig. 2A, Table 2). The dorsolateral frontal activation focus was centered on the junction of the superior frontal and precentral sulci, which corresponds to the locus of the human frontal eye field (FEF), as defined in functional studies (Lobel et al., 2001), and we will from here on refer to this activation cluster as the FEF, for the sake of brevity. No brain regions displayed decreases in activity with increasing spatial cue information. These activations closely resemble those found in previous studies of spatial orienting (Gitelman et al., 1999; Mesulam, 1999; Corbetta and Shulman, 2002). Interestingly, it was found that increasing feature cue information, although entirely orthogonal to the spatial cueing factor, was accompanied exclusively by increments in activity in a subset of these regions, namely, bilateral foci in the IPS, FEF, preSMA/ACC, and IFC/AI (Fig. 2B, Table 2). No brain regions displayed decreases in activity with increasing feature cue information. These data suggest a close overlap between regions that represent spatial and feature-based search information. In line with this observation, a conjunction analysis (Nichols et al., 2005) revealed that the bilateral IPS, FEF, and IFC/AI, as well as a unilateral focus in the left preSMA/ACC, all displayed significant increments in activation with both increasing spatial and feature cue information (Fig. 2C). 
The temporal profiles of these cue-related effects are shown in Figure 2D–G, which displays time courses of percentage change in the BOLD signal for each of the shared regions (averaged across bilateral activation foci for the IPS, FEF, and IFC/AI) as a function of the main effects of spatial cue information (averaged across levels of feature cue information) and feature cue information (averaged across levels of spatial cue information).
In additional support of the impression that spatial and feature-based cue information are represented in shared brain regions, when assessing interaction effects between spatial and feature-based cue information, no evidence for dissociable regions supporting spatial and feature-based foreknowledge was found. In other words, no brain region displayed significantly stronger modulation by spatial than by feature cue information or vice versa. Nevertheless, we entertained the possible existence of regions specific to spatial or feature cue processing further, by conducting exploratory interaction effects analyses at lower statistical thresholds. Only at a very lenient threshold setting of uncorrected p < 0.005 were such activations observed. At this level, small clusters of voxels in bilateral dorsal anterior cingulate cortex [MNI coordinates (x, y, z): −9, 18, 39; 15, 24, 36] and the cuneus/precuneus (x, y, z: −12, −81, 42; 18, −84, 39), as well as in left anterior middle frontal gyrus (x, y, z: −33, 42, 30), were more activated in relation to spatial cue information than feature cue information. Conversely, small clusters of voxels in the bilateral posterior cingulate cortex (x, y, z: 12, −51, 24; −9, −48, 27), as well as in the left posterior middle frontal gyrus (x, y, z: −39, 12, 48) and the left inferior parietal lobule (IPL) (x, y, z: −42, −66, 39), were relatively more activated with increasing feature cue information than spatial cue information. In summary, these data do not present reliable evidence for segregated representations of spatial and feature-based information but rather suggest that spatial and feature-based expectancies are represented within overlapping frontal, parietal, and cingulate brain regions.
To further probe the degree to which representations of spatial and feature-based cue information are sustained by shared brain regions, we assessed the effects of feature cue information in the region that was most responsive to spatial expectancies (the peak active region for the main effect of spatial cue information) and the effects of spatial cue information in the area that was most responsive to feature-based expectancies (the peak active region for the main effect of feature cue information). As listed in Table 2, the peak effect of spatial cue information was observed in the right precuneus (Prec), whereas the peak effect of feature cue information was observed in the left IPL. We extracted mean activation estimates (beta parameters) and time courses of percentage BOLD signal change for the different levels of spatial and feature cueing from 10 mm spheres centered on each of these ROIs. As is shown in Figure 3, both ROIs displayed significant increases in activation with both types of cue information (Prec: spatial cue, F(2,26) = 8.9, p = 0.001; feature cue, F(2,26) = 5.1, p < 0.05; IPL: spatial cue, F(2,26) = 8.3, p < 0.005; feature cue, F(2,26) = 6.0, p < 0.01). Thus, even in an independently defined region of peak activation for spatial cue information, significant effects of feature cue information were observed and vice versa, further bolstering the conclusion that representations of spatial and feature-based a priori search information have shared neural substrates.
Finally, it should be noted that, at a whole-brain corrected threshold, we did not observe cueing effects in ventral visual regions, such as the fusiform gyrus, which is involved in color processing (McKeefry and Zeki, 1997; Hadjikhani et al., 1998), and in which possible “baseline shifts” of neural activity that have been reported previously in target regions of top-down biasing could have been expected (Chawla et al., 1999; Kastner et al., 1999). These null findings are not entirely surprising, however, because it has been shown that effects of top-down biasing in the absence of bottom-up stimulation generally tend to be less pronounced in the visual target regions of the biasing signals than in the more anterior source regions of these signals (Kastner et al., 1999). At a more lenient threshold of uncorrected p < 0.005, effects of spatial cue information were evident in bilateral fusiform gyri (x, y, z: −48, −42, −18; 42, −66, −21), and effects of feature-based cue information were observed more medially, in bilateral lingual gyri (x, y, z: −9, −72, 24; 6, −78, −6). However, no claims can be made regarding the specificity of these effects, because no spatial × feature cue interaction effects were evident in any of these regions. Furthermore, even at this lenient threshold setting, we did not detect any visual areas that displayed shared effects of spatial and feature biasing. Thus, although common effects of spatial and feature-based cue information in anticipation of a visual search were observed in source regions of top-down attentional biasing and oculomotor planning, that is, in frontal, cingulate, and parietal regions, no evidence for anticipatory effects of this nature was obtained in visual target regions of top-down biasing.
Additive integration of spatial and feature-based search information
To characterize more closely the way in which concurrent spatial and feature cue information is represented in the frontal, parietal, and cingulate regions identified by the conjunction analysis (Fig. 2C), we extracted mean activation estimates (beta parameters) for each cue condition from each of these ROIs and then analyzed these estimates in spatial cue information (50 vs 70 vs 90%) × feature cue information (50 vs 70 vs 90%) ANOVAs, akin to the analysis of the behavioral data. It was found that cue information-dependent BOLD responses in the IPS, FEF, IFC/AI, and preSMA/ACC bore a close, inverse resemblance to the behavioral effects of cue information (Fig. 4A–D, activation estimates are averaged over the two hemispheres for the IPS, FEF and IFC/AI). The IPS (Fig. 4A) displayed linear effects of spatial (F(2,26) = 47.2, p < 0.001) and feature-based (F(2,26) = 16.9, p < 0.001) cue information, as activity increased in concert with increasing cue information. There was no interaction between spatial and feature-based cue information (F(4,52) = 0.9, p > 0.1), indicating that their effects were independent. The same pattern of results was observed for activity in the FEF (Fig. 4B) (spatial linear effect: F(2,26) = 40.5, p < 0.001; feature linear effect: F(2,26) = 5.1, p < 0.05; interaction effect: F(4,52) = 1.0, p > 0.1), in the IFC/AI (Fig. 4C) (spatial linear effect: F(2,26) = 12.2, p < 0.005; feature linear effect: F(2,26) = 4.5, p = 0.054; interaction effect: F(4,52) = 1.1, p > 0.1), and in the preSMA/ACC (Fig. 4D) (spatial linear effect: F(2,26) = 9.2, p = 0.01; feature linear effect: F(2,26) = 4.3, p = 0.058; interaction effect: F(4,52) = 0.9, p > 0.1). It seems noteworthy that the mean beta parameters in Figure 4B–D suggest that the effects of feature cue information on activation in the FEF, IFC/AI, and preSMA/ACC became evident only in the presence of an informative spatial cue (the 70 and 90% spatial cue conditions).
However, as reported above, no significant feature × spatial cue information interaction effects were observed in these regions. Thus, resembling the behavioral visual search data, BOLD responses in the IPS, FEF, IFC/AI, and preSMA/ACC were found to display additive effects of spatial and feature-based expectancies, indicating that these sites harbor integrated but independent representations of spatial and feature-based search information.
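The logic of distinguishing additive from interactive effects in a 3 × 3 design can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the ANOVA procedure used in the study: it decomposes a table of cell means into a grand mean, two sets of main effects, and interaction residuals. All numbers are hypothetical values in arbitrary units, and the function name `decompose` is introduced here for illustration only.

```python
def decompose(cells):
    """Split a table of cell means into grand mean, row effects, column effects,
    and interaction residuals (what remains after removing both main effects)."""
    n_rows, n_cols = len(cells), len(cells[0])
    grand = sum(sum(row) for row in cells) / (n_rows * n_cols)
    row_eff = [sum(row) / n_cols - grand for row in cells]
    col_eff = [sum(cells[i][j] for i in range(n_rows)) / n_rows - grand
               for j in range(n_cols)]
    resid = [[cells[i][j] - (grand + row_eff[i] + col_eff[j])
              for j in range(n_cols)] for i in range(n_rows)]
    return grand, row_eff, col_eff, resid


def max_abs(table):
    """Largest absolute entry of a two-dimensional table."""
    return max(abs(x) for row in table for x in row)


# Hypothetical activation gains for the 50, 70, and 90% cue levels (not study data).
spatial_gain = [0.0, 0.4, 0.8]
feature_gain = [0.0, 0.2, 0.4]

# Purely additive pattern: each cell is baseline + spatial gain + feature gain.
additive = [[1.0 + s + f for f in feature_gain] for s in spatial_gain]

# Interactive pattern: the feature effect grows with the spatial cue level.
interactive = [[1.0 + s + f + 2.0 * s * f for f in feature_gain] for s in spatial_gain]

additive_resid = max_abs(decompose(additive)[3])
interactive_resid = max_abs(decompose(interactive)[3])
```

For the additive table the interaction residuals vanish (up to floating-point error), whereas the interactive table leaves clearly nonzero residuals. In a repeated-measures ANOVA, the interaction F test asks whether such residuals are reliably different from zero across subjects.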
Topography of spatial and feature-based representations
As touched on in Introduction, a notable distinction between the effects of top-down spatial versus feature-based attention is that spatial attention biases visual processing in a spatially selective manner, enhancing responses in visual neurons whose receptive fields overlap with the attended location (Moran and Desimone, 1985; Treue and Maunsell, 1996; Luck et al., 1997; Tootell et al., 1998; Brefczynski and DeYoe, 1999; Bichot et al., 2005), whereas feature-based attention enhances responses in neurons that are selective for the attended feature in a spatially global manner, across the visual field (Treue and Martinez-Trujillo, 1999; McAdams and Maunsell, 2000; Saenz et al., 2002; Bichot et al., 2005; Liu et al., 2007; Serences and Boynton, 2007). Accordingly, Treue and Martinez-Trujillo (1999) have found that the additive influences of these two top-down mechanisms on responses in extrastriate visual neurons resulted in approximately twice the response gain in neurons that were both selective for the attended feature and whose receptive fields overlapped with the spatial locus of attention compared with neurons that were either selective for the attended feature but did not represent the attended location or neurons that represented the attended location but were not selective for the attended feature. We were interested in exploring whether the representations of spatial and feature-based expectancies in frontal and parietal cortices would display a similar topography of effects, namely, an additive combination of a spatially lateralized effect of spatial cue information and a spatially global effect of feature cue information.
To explore the topography of spatial and feature cue information representations in our ROIs, we analyzed cue-related activity depending on the relationship of the locus of the activation (left vs right cerebral hemisphere) to the locus of spatial attention (left vs right visual hemifield), as determined by spatial cue information. Because this type of analysis cannot be adequately conducted in a unilateral activation focus, the preSMA/ACC ROI was omitted from this treatment. For the bilateral ROIs, activation estimates were averaged across foci that lay ipsilateral to the spatially cued hemifield (“spatially unattended”) and those that lay contralateral to the spatially cued hemifield (“spatially attended”), collapsed across the 70 and 90% informative spatial cue conditions. We then entered these data into spatial cue direction (spatially unattended vs attended) × feature cue (50 vs 70 vs 90% informative) ANOVAs. These ANOVAs were aimed at testing two effects of interest. First, the hypothesized lateralized representation of spatial cue information was tested by assessing the main effect of spatial cue direction (spatially attended > unattended). Second, the hypothesized global effect of feature-based cue information was tested by assessing whether there was a spatial cue direction × feature cue interaction effect; specifically, if feature cue information were represented globally, the degree of increase in activity with feature cueing should not vary depending on whether this activity stems from the hemisphere ipsilateral or contralateral to the spatially cued hemifield.
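The relabeling step in this analysis can be sketched as follows. This is a minimal illustration, assuming a hypothetical record format of (hemisphere, cued hemifield, feature cue level, beta estimate) and synthetic values. The names `attention_label` and `lateralized_means` are not from the study.

```python
from statistics import mean

OPPOSITE = {"left": "right", "right": "left"}


def attention_label(hemisphere, cued_hemifield):
    """Label an activation focus relative to the spatially cued hemifield.

    Contralateral foci count as 'attended', ipsilateral foci as 'unattended'.
    """
    if hemisphere == OPPOSITE[cued_hemifield]:
        return "attended"
    return "unattended"


def lateralized_means(records):
    """Average beta estimates per (attention label, feature cue level) cell."""
    cells = {}
    for hemisphere, cued_hemifield, feature_level, beta in records:
        key = (attention_label(hemisphere, cued_hemifield), feature_level)
        cells.setdefault(key, []).append(beta)
    return {key: mean(values) for key, values in cells.items()}


# Synthetic records (hemisphere, cued hemifield, feature cue %, beta); not study data.
records = [
    ("left", "right", 50, 1.2),   # left hemisphere, right hemifield cued: attended
    ("right", "right", 50, 0.8),  # right hemisphere, right hemifield cued: unattended
    ("right", "left", 50, 1.0),   # attended
    ("left", "left", 50, 0.6),    # unattended
]
means = lateralized_means(records)
```

The resulting cell means correspond to the entries of the spatial cue direction × feature cue design, with one such table per subject and ROI.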
In the IPS (Fig. 5A), we observed a main effect of spatial cue direction (F(1,13) = 13.4, p < 0.005), because activity in the hemisphere contralateral to the spatially cued hemifield (spatially attended) was greater than that in the ipsilateral hemisphere (spatially unattended), reflecting a lateralized effect of spatial cue information. The effect of feature cue information (F(2,26) = 13.0, p < 0.001) did not interact with spatial cue direction (F(2,26) = 0.1, p > 0.1), indicating that feature cueing had a global effect on IPS activity. In the FEF (Fig. 5B), the effect of spatial cue direction was nonsignificant (F(1,13) = 0.4, p > 0.1), suggesting a nonlateralized representation of spatial cue information. Moreover, the effect of feature cue information (F(2,26) = 3.9, p < 0.05) did not interact with the spatial cue factor (F(2,26) = 1.2, p > 0.1), indicating that feature cueing also had a global effect on FEF activity. In the IFC/AI (Fig. 5C), the effect of spatial cue direction was also nonsignificant (F(1,13) = 0.1, p > 0.1), suggesting a global effect of spatial cueing. In addition, the effect of feature cueing (F(2,26) = 2.7, p = 0.08) did not interact with spatial cue direction (F(2,26) = 0.1, p > 0.1), indicating that feature cueing also had a global effect on IFC/AI activity. Finally, it may be instructive to compare the degree of spatial lateralization (or lack thereof) observed in these ROIs with the level of lateralization found in extrastriate visual cortex. We therefore conducted the equivalent analyses in the lateral occipital cortex (Occ) regions that we had found to be responsive to spatial cueing in the initial whole-brain analysis (Fig. 2A, Table 2). As would be expected, spatial cue-related activity in lateral occipital cortex (Fig. 5D) was strongly lateralized (F(1,13) = 30.2, p < 0.001). The feature cue × spatial cue direction interaction effect was not significant (F(2,26) = 1.4, p > 0.1).
In summary, of the regions that show concurrent effects of spatial and feature-based cueing, activity in the IPS displayed a spatially lateralized effect of spatial cueing in combination with a spatially global effect of feature cueing, akin to the pattern of effects reported in stimulus-evoked responses in extrastriate visual neurons (Treue and Martinez-Trujillo, 1999), whereas the FEF and IFC/AI displayed spatially nonspecific effects for either type of cue information. Given that a preponderance of neurons in the monkey (Bruce and Goldberg, 1985; Bruce et al., 1985) and human (Kastner et al., 2007) FEF have been shown to represent locations in the contralateral visual field, it appears surprising that we did not observe lateralized responses in this region. One possible reason for this null effect is that the search stimuli in our experiment were not placed sufficiently peripherally for lateralized expectancy effects to emerge in the FEF BOLD responses. In accordance with this speculation, a previous study that found lateralized effects of spatial cueing in the FEF used more peripherally placed stimuli (Serences and Yantis, 2007). Thus, it is possible that a qualitatively identical pattern to the one we observed in the IPS could be observed in the FEF, given a more peripheral placement of the search stimuli. A noteworthy aspect of the IPS activation pattern in the current study is that it discounts a possible alternative interpretation of the fMRI results, namely, that the observed BOLD responses may not be reflective of specific representations of spatial and feature cue information but are rather related to more generic processes that may accompany an increase in cue information, such as global increases in arousal or vigilance. This account would not predict any lateralization of BOLD responses.
By the same token, however, one ultimately cannot rule out the possibility that global activation increases, such as those observed for feature cue information and for spatial cue information in the FEF and IFC/AI, may in part be related to some generic processes accompanying increased levels of cue information.
Discussion
We investigated how the human brain represents and integrates concurrent spatial and feature-based cue information regarding the likely identity of a visual search target. By independently varying the degree of spatial and feature cue information and analyzing cue-related BOLD responses, we obtained three main findings. First, our data suggest that spatial and feature-based search information is represented in shared brain regions rather than in specialized, anatomically segregated areas, but that their representations are nevertheless independent of each other. Second, location and feature-based information about search targets is combined in anticipation of a visual search in frontal, cingulate, and parietal source regions of top-down attentional biasing and oculomotor processes. Finally, cue-related activity in the IPS displayed a qualitatively similar pattern to that observed previously in visual target regions of attentional biasing (Treue and Martinez-Trujillo, 1999), namely, an additive combination of spatially specific (lateralized) effects of spatial cue information and spatially global effects of feature-based cue information.
Previous studies have provided evidence for additive influences of spatial and feature-based information on human performance (Kingstone, 1992), as well as in single-neuron responses in monkey visual cortex (Treue and Martinez-Trujillo, 1999; McAdams and Maunsell, 2000). It is tempting to assume that these effects originate with two anatomically distinct, specialized spatial attention and feature-based attention systems, particularly because their seemingly additive neural and behavioral effects provide evidence for their independence. However, our data argue against this scenario, because we did not detect reliable evidence for the existence of brain regions that show activity specific to either spatial or feature cue information. In support of this argument, previous fMRI studies that compared neural signatures of endogenous spatial and feature cueing have reported primarily overlapping activity in frontoparietal brain regions and have found little evidence for a dissociable “feature attention network” (Wojciulik and Kanwisher, 1999; Vandenberghe et al., 2001; Giesbrecht et al., 2003). The current results add more weight to these previous observations because they were obtained in a paradigm that manipulated spatial and feature cue information simultaneously and independently of each other, permitting a direct test of shared and dissociable neural substrates by means of analyzing interaction effects (Sternberg, 2001).
Surprisingly, we found not only that spatial and feature-based foreknowledge share the same neurocircuitry but also that they nevertheless give rise to independent effects in these regions, because no interaction effects between spatial and feature-based representations were observed. If spatial and feature-based information were indeed additively represented in the same neurons (cf. Treue and Martinez-Trujillo, 1999), our data suggest that these neurons have a sufficiently large dynamic range to avoid subadditive “ceiling effects” when combining spatial and feature-based responses. An alternative explanation for the counterintuitive finding of shared sources but independent effects is that there exist spatially interspersed populations of neurons that preferentially represent either spatial or feature-based expectations but whose responses are spatially summated in the BOLD signal detected with fMRI. These alternative hypotheses could be examined via single-neuron recordings in an experimental protocol similar to the current one.
The current results give the impression that subjects allocated their attention in a graded manner, corresponding to the parametric gradation of cue information (Figs. 1A, 4A–D), but it is important to note that, in principle, attention may instead have been allocated in an “all-or-none” manner, with a higher proportion of trials of full attentional commitment occurring in conditions of higher cue validity (Jonides, 1980). However, previous work has shown that the assumption of a graded distribution of attention can account better for (spatial) cue validity effects than an all-or-none allocation (Johnson and Yantis, 1995). How could a graded distribution of attentional resources be instantiated in the current experiment? A graded allocation of spatial attention could be mediated by a combined adjustment in the locus and size of a single “zoom lens” of spatial attention (Eriksen and Yeh, 1985) in anticipation of the search array. For example, subjects may use a small attentional (high resolution) focus that is centered directly on the cued peripheral location in response to a 90% informative cue but a wider (low resolution) focus with a less peripherally located center in response to a 70% informative cue, so as to include the uncued side of the search array in the fringe of the attentional “beam.” At the neurophysiological level, these adjustments would correspond to varying sizes of target regions (and correspondingly, varying intensities) of top-down attentional modulation in extrastriate visual cortex (Muller et al., 2003b). Alternatively, a view that allows for multiple simultaneous spotlights of spatial attention (Shaw and Shaw, 1977; Awh and Pashler, 2000; Muller et al., 2003a; McMains and Somers, 2004), paired with the assumption of a judicious (cue information-driven) distribution of limited attentional resources between attended loci, would also offer a parsimonious account for a graded allocation of spatial attention. 
Graded allocation of feature-based attention, conversely, could be based on a global sensitization of neurons that are responsive to the target feature, weighted by the expected behavioral relevance of this feature (Treue and Martinez-Trujillo, 1999). Importantly, graded neuronal responses that scale with probabilistic information are certainly feasible at the neurophysiological level. For instance, single neurons in monkey parietal cortex have been shown to display finely graded responses that scale with probabilistic cue information concerning varying levels of reward magnitude [and thus, presumably, attention (Maunsell, 2004)] associated with a given stimulus feature or location (Yang and Shadlen, 2007).
The current data show that spatial and feature-based information about a search target is integrated within frontal, cingulate, and parietal regions in anticipation of a visual search. In the IPS, this integration took the form of an additive combination of spatially specific effects of spatial cue information and spatially global effects of feature cue information. We here speculate that the purpose of this anticipatory integration could lie with the generation of a top-down salience map in the IPS, a search template that primes the processing of targeted locations and features in line with endogenous information (cf. the proposal of a “task relevance map” by Navalpakkam and Itti, 2005). Specifically, spatial expectancies could prime (enhance the responsiveness of) IPS neurons whose receptive fields overlap with the anticipated target location (e.g., the left side of the search array), whereas feature-based expectancies could prime IPS neurons that are responsive to the target feature (e.g., red). These biasing processes would result in an additive priming effect in neurons that are responsive to the target feature and also have receptive fields overlapping the attended target location (cf. Treue and Martinez-Trujillo, 1999). During presentation of the search array, locations that are represented by these neurons (e.g., locations on the left that contain the color red) would enjoy a competitive processing advantage over other locations, thus conferring a high “salience” unto these locations and attracting eye movements during the search process (Itti and Koch, 2001).
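The additive priming idea behind this proposal can be illustrated with a toy sketch. This is a deliberately simplified illustration and not a model fitted or tested in this study: the use of cue probabilities as priming weights, the item layout, and the function name `priority` are all assumptions made for the example.

```python
def priority(items, spatial_weight, feature_weight):
    """Additive priming: an item's priority is the weight of its side
    plus the weight of its color."""
    return {name: spatial_weight[side] + feature_weight[color]
            for name, (side, color) in items.items()}


# Toy search array: item name -> (side of the array, color).
items = {
    "A": ("left", "red"),
    "B": ("left", "green"),
    "C": ("right", "red"),
    "D": ("right", "green"),
}

# Hypothetical weights: a 90% informative "left" cue and a 70% informative "red" cue.
spatial_weight = {"left": 0.9, "right": 0.1}
feature_weight = {"red": 0.7, "green": 0.3}

scores = priority(items, spatial_weight, feature_weight)
most_salient = max(scores, key=scores.get)
```

In this toy example, the red item on the left receives the highest priority, and the advantage for red is the same on both sides of the array, mirroring a spatially specific effect of the spatial cue combined with a spatially global effect of the feature cue.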
The IPS satisfies two crucial premises of this speculative account: first, neurons in this region are known to display topographic receptive field organization (Blatt et al., 1990; Sereno et al., 2001; Silver et al., 2005; Swisher et al., 2007), and, second, despite being considered part of the dorsal visual stream, posterior parietal neurons have also been shown to represent feature-based stimulus information (Sereno and Maunsell, 1998; Toth and Assad, 2002; Todd and Marois, 2004; Xu, 2007; Konen and Kastner, 2008). For instance, in human fMRI studies, it has been shown that the IPS supports visual short-term memory representations of shape and color features, independently of the need to maintain location information (Todd and Marois, 2004; Xu, 2007). Similarly, electrophysiological studies in the monkey have demonstrated that parietal neurons display shape and color selectivity (Sereno and Maunsell, 1998; Toth and Assad, 2002). Of particular relevance to the current experiment, Toth and Assad (2002) showed that IPS neurons display selectivity to colors when color information is of relevance for guiding subsequent saccades. In summary, these data document that it is neurophysiologically feasible for the IPS to harbor a de facto top-down search template that entails a combination of spatial and feature-based information about a search target. Finally, the proposed role for the IPS fits well with a long line of research indicating that the posterior parietal cortex integrates multimodal information to construct representations of the locations of motivationally relevant stimuli in extrapersonal space (Mountcastle et al., 1975; Andersen, 1997; Colby and Goldberg, 1999).
In conclusion, we showed that knowledge concerning probable locations and features of a search target produces primarily independent search benefits and that additive representations of spatial and feature-based information are generated in shared regions of the frontal, cingulate, and parietal cortices. We speculate that a priori spatial and feature-based information may be used to generate a parietal top-down salience map, in which additive biasing effects of spatial and feature-based expectancies in neurons that are selective for the target feature and whose receptive fields overlap with the expected target location confer a competitive processing advantage unto locations that are both spatially attended and contain the sought-after target feature.
Footnotes
- We thank Christopher Summerfield, Wen Li, Kia Nobre, and Darren Gitelman for insightful comments on this work.
- Correspondence should be addressed to Tobias Egner, Cognitive Neurology and Alzheimer's Disease Center, Feinberg School of Medicine, Northwestern University, 320 East Superior, Searle 11-569, Chicago, IL 60611. t-egner{at}northwestern.edu