Abstract
What is the neural locus of visual attention? Here we show that the locus is not fixed but instead changes rapidly to match the spatial scale of task-relevant information in the current scene. To accomplish this, we obtained electrical, magnetic, and hemodynamic measures of attention from human subjects while they detected large-scale or small-scale targets within multiscale stimulus patterns. Subjects did not know the scale of the target before stimulus onset, and yet the neural locus of attention-related activity between 250 and 300 ms varied according to the scale of the target. Specifically, maximal attention-related activity spread from a high-level, relatively anterior visual area (the lateral occipital complex) for large-scale targets to include a lower-level, more posterior area (visual area V4) for small-scale targets. This rapid change indicates that the neural locus of attention in visual cortex is not static but is instead determined rapidly and dynamically by means of an interaction between top-down task information and local information about the current visual input.
Introduction
The perceptual problem addressed by this study is illustrated in Figure 1, a and b. A natural scene is shown from two viewing distances, and the superimposed circles represent the receptive fields of two neurons, one in a relatively anterior, high-level processing area in which receptive fields are large (e.g., temporal cortex area TE), and the other in a more posterior, intermediate processing area in which receptive fields are smaller (e.g., visual area V4). When an observer attempts to perceive the details of the individual flowers from the closer viewpoint (see Fig. 1a), the large receptive field (blue) receives competing information from multiple flowers. In contrast, the small receptive field (yellow) contains only one flower and therefore does not suffer from competition. However, the number of objects within a given receptive field (and hence the degree of competition) varies considerably across scenes and viewing distances (Kastner et al., 2001). When the scene is viewed from the more distant viewpoint (see Fig. 1b), multiple flowers are present in both the smaller and larger receptive fields.
The presence of multiple competing objects within a receptive field may lead to perceptual interference (Treisman, 1996; Luck et al., 1997a), and mechanisms of attention are therefore used to bias the competition in favor of task-relevant information (Desimone and Duncan, 1995), with the strongest attention effects occurring when multiple stimuli compete for access to a given receptive field (Moran and Desimone, 1985; Treue and Maunsell, 1996; Luck et al., 1997b; Reynolds et al., 1999). Consequently, the areas of visual cortex in which attention operates will depend on the receptive field sizes in each area and the scale of the information being attended. Moreover, because the scale of attended information may vary rapidly and unpredictably across scenes and viewing distances, the cortical areas in which attention operates will need to be adjusted rapidly in accordance with moment-by-moment changes in the visual input. We therefore hypothesized that the neural locus of attentional selection changes rapidly to match the current scale of the visual input. Specifically, attention effects will almost always be observed in anterior areas of visual cortex with very large receptive fields, and these effects will extend more and more posteriorly as the visual input and/or the observer’s goals require a finer scale of attentional selection. Moreover, these differences in the neural distribution of attention should occur rapidly in response to changes in the scale of task-relevant information in the visual input.
To test this hypothesis, we used stimuli that contained task-relevant information simultaneously at two different spatial scales (see Fig. 1c–e). Targets could occur unpredictably at either the small scale or the large scale, which parallels the unpredictability of the scale of task-relevant information in the natural environment. We recorded event-related potentials (ERPs) and event-related magnetic fields (ERMFs) to determine whether the cortical distribution of attention-related brain activity would extend more posteriorly for small-scale targets than for large-scale targets. We also conducted a control ERMF experiment in which the scale of the target was predictable to determine whether foreknowledge of target scale makes it possible to use different neural systems for detecting the target. The distribution of attention-related activity was finally verified with event-related functional magnetic resonance imaging (fMRI), which provides a higher degree of spatial certainty and spatial resolution at the cost of reduced temporal resolution.
Materials and Methods
Subjects.
Ten subjects (mean age of 29.5 years; six females) participated in the main ERP/ERMF experiment, and seven subjects (mean age of 20.6 years; four females) participated in the control ERMF experiment. Five subjects (mean age of 29.0 years; one female) took part in the fMRI experiment. All subjects provided written consent, had normal color vision and normal or corrected-to-normal visual acuity, and were paid for participation.
ERP/ERMF experimental design (main experiment).
Stimuli were back-projected onto a gray screen (∼8.0 cd/m²), viewed at a distance of 120 cm. Subjects were instructed to fixate a continuously visible central point, and fixation was monitored via electrooculography (described in detail later). Stimulus arrays consisted of four hierarchical dot patterns (Fig. 1c), with one in each quadrant. Each pattern consisted of 16 dots arranged in four sets of four dots. A small-scale target was created by displacing one dot upward or downward by 0.7° (Fig. 1d), and a large-scale target was created by displacing a set of four dots upward or downward by 2.0° (Fig. 1e). The dots were blue in two of the quadrants, red in one quadrant, and green in one quadrant.
The assignment of colors to quadrants varied unpredictably from trial to trial, with the constraint that red and green were always in opposite hemifields. At the beginning of each trial block, the observers were told to attend to red or to green for that block. They were further instructed to press one of two buttons (index and middle finger of the right hand) for each stimulus array, indicating whether the attended-color target consisted of an upward or downward displacement (regardless of target scale). A target was always present in both the red and green quadrants, but only the attended-color target was task relevant. The attended color was alternated from block to block so that the same physical stimulus could be used to direct attention to the left or right hemifield.
It should be noted that the presence of a displaced dot or set of dots in these highly regular displays was quite salient at both the large and small scales, and we assumed that the displacements would be detected very rapidly. However, discriminating whether this displacement was upward or downward seemed likely to require focused attention.
Small- and large-scale targets were randomly intermixed within each of the six to eight trial blocks. Each stimulus was presented for 700 ms, followed by an interstimulus interval of 1700–1900 ms (rectangular distribution). Each block consisted of 250 trials, subdivided by short pauses every 2 min. A total of 375–500 trials was collected from each observer for each cell of the design in a single session. Trials with incorrect or missing responses were excluded from all analyses (for a summary of the behavioral data, see supplemental material, available at www.jneurosci.org).
ERMF experimental design (control experiment).
The stimuli and task were identical to those in the main ERP/ERMF experiment, except that the scale of the attended-color target was kept constant within a given trial block and subjects were informed of the target scale at the beginning of each block. Thus, in contrast to the main experiment, subjects could stay in a fixed preparatory state for one scale without the need for scale adjustments from trial to trial. This modification addresses the impact of performance strategies that subjects may have adopted in the main experiment to cope with the unpredictable scale. For example, subjects may have tried to identify a large-scale displacement first and may have switched to search for a small-scale displacement only after the large-scale search failed. Such a strategy would be highly unlikely in the follow-up experiment. Hence, the presence of similar effects of scale selection in the main and the control experiment would rule out strategy-based accounts of the observed distribution of brain activity.
Subjects performed 12 trial blocks (six large-scale and six small-scale; three attend-red and three attend-green at each scale), with each block containing 85 trials. In one experimental session, a total of 508 trials was collected from each observer for each cell of the design. Finally, stimulus patterns were presented only in the left and right lower quadrants.
EEG/MEG recordings.
The magnetoencephalogram (MEG) was recorded using a whole-head magnetometer (BTI Magnes 2500; Biomagnetic Technologies, San Diego, CA) with 148 SQUID sensors. In the main experiment, the electroencephalogram (EEG) was simultaneously recorded from 32 scalp electrodes (with reference to the right mastoid) using a Synamps amplifier (NeuroScan, El Paso, TX). Electrode locations were chosen according to the standard electrode montage of the American Electroencephalographic Society (1994): Fpz, Fz, Cz, Pz, Oz, Iz, Fp1, Fp2, F7, F8, F3, F4, FC1, FC2, T7, T8, C3, C4, CP1, CP2, P7, P8, P3, P4, PO3, PO4, PO7, PO8, IN3, IN4. For the control experiment, only MEG data were obtained. The MEG and EEG signals were filtered from direct current to 50 Hz and digitized with a sampling rate of 254 Hz. Artifact rejection was performed off-line by removing epochs with peak-to-peak amplitudes exceeding a threshold of 3.0 × 10⁻¹² T for the MEG or 100 μV for the EEG. Eye movements were monitored by recording the horizontal and vertical electrooculogram (HEOG and VEOG) using bipolar electrode placements at the outer canthi of both eyes (HEOG), as well as above and below the left eye (VEOG). In addition, central fixation was continuously monitored using a zoom-lens infrared camera mounted inside the MEG boot. Trials with eye movements exceeding 100 μV in the EOG recordings were excluded from all analyses. A detailed description of fixation performance is found in the supplemental data (available at www.jneurosci.org as supplemental material).
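The peak-to-peak rejection criterion described above can be illustrated with a minimal Python sketch. The two thresholds are taken from the text; the function names, the data layout (an epoch as a list of channels), and the rule that a single offending channel rejects the whole epoch are assumptions made for illustration and are not specified in the source.

```python
# Thresholds from the text; all names and the data layout are hypothetical.
MEG_THRESHOLD_T = 3.0e-12   # tesla, peak-to-peak
EEG_THRESHOLD_UV = 100.0    # microvolts, peak-to-peak


def peak_to_peak(samples):
    """Return the peak-to-peak amplitude of one channel's samples."""
    return max(samples) - min(samples)


def keep_epoch(epoch, threshold):
    """Keep an epoch only if no channel exceeds the threshold.

    `epoch` is a list of channels, each a list of samples. Rejecting the
    whole epoch when any one channel exceeds the threshold is an assumption.
    """
    return all(peak_to_peak(channel) <= threshold for channel in epoch)


def reject_artifacts(epochs, threshold):
    """Return the epochs that survive the peak-to-peak criterion."""
    return [epoch for epoch in epochs if keep_epoch(epoch, threshold)]
```

For example, an EEG epoch containing a channel that swings from −80 to +80 μV (160 μV peak to peak) would be dropped, whereas an epoch whose channels stay within ±10 μV would be retained.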
To determine the attention-related brain activity in the EEG/MEG data, we focused on the N2pc component, a well established correlate of the focusing of visuospatial attention (Luck and Hillyard, 1994; Luck et al., 1997a; Luck and Ford, 1998; Woodman and Luck, 1999, 2003). The N2pc is generated primarily in inferior occipitotemporal cortex (Hopf et al., 2000, 2002) and appears to reflect a filtering operation that serves to attenuate input from distractor items. In particular, the N2pc appears to represent the same attentional modulation of visual processing that has been observed in single-unit recordings from macaque extrastriate cortex (Chelazzi et al., 1993, 1998, 2001; Luck et al., 1997a).
To derive the N2pc, separate averages for both MEG and EEG were computed for targets occurring in the left and the right visual fields (LVF and RVF), and the data were collapsed over the two target colors (red and green). The N2pc was then isolated by subtracting the right visual field target waveforms from the left visual field target waveforms. These difference waves eliminated activity attributable to purely sensory responses, because all arrays contained a red item in one visual field and a green item in the other field, with red being attended in some trial blocks and green being attended in others. Furthermore, any higher-level cognitive activity that was equal for ipsilateral and contralateral targets was also eliminated in these difference waves, leaving only lateralized cognitive responses (primarily the N2pc wave). The N2pc component was quantified as the difference in voltage or magnetic field strength between contralateral and ipsilateral targets from 250–300 ms, relative to a 100 ms prestimulus baseline. It should be noted that subtracting the response to right visual field targets from the response to left visual field targets causes the N2pc effect to appear as relative negativity over the right hemisphere and as relative positivity over the left hemisphere in the ERP response. Correspondingly, in the ERMF response, the N2pc appears as an influx–efflux difference field over the left posterior cortex and as an efflux–influx difference field over the right posterior cortex, with the efflux field components of both sides conflowing (for a detailed discussion, see Hopf et al., 2000). To compensate for the opposite polarity of the N2pc difference in the left and right hemispheres, left hemisphere measures were switched in polarity before submitting them to within-subjects ANOVAs.
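The quantification steps described above (LVF-minus-RVF difference wave, 100 ms prestimulus baseline, mean over 250–300 ms, polarity switch for the left hemisphere) can be sketched in Python as follows. The sampling rate, baseline length, and measurement window come from the text; the function names, the assumption that each averaged waveform is a plain list of samples beginning 100 ms before stimulus onset, and the rounding of latencies to the nearest sample are illustrative choices, not the authors' implementation.

```python
SAMPLING_RATE_HZ = 254.0      # from the recording description
WINDOW_MS = (250.0, 300.0)    # N2pc measurement window from the text


def ms_to_index(t_ms, epoch_start_ms):
    """Convert a latency in ms to the nearest sample index in the epoch."""
    return round((t_ms - epoch_start_ms) * SAMPLING_RATE_HZ / 1000.0)


def mean(values):
    return sum(values) / len(values)


def n2pc_amplitude(lvf_wave, rvf_wave, epoch_start_ms=-100.0,
                   left_hemisphere=False):
    """Mean LVF-minus-RVF difference in the N2pc window, baseline corrected.

    Assumes both averaged waveforms start at `epoch_start_ms` (hypothetical
    default: 100 ms before stimulus onset) and cover at least 300 ms after it.
    """
    diff = [l - r for l, r in zip(lvf_wave, rvf_wave)]
    # Baseline: all samples before stimulus onset (0 ms).
    baseline = mean(diff[:ms_to_index(0.0, epoch_start_ms)])
    start = ms_to_index(WINDOW_MS[0], epoch_start_ms)
    stop = ms_to_index(WINDOW_MS[1], epoch_start_ms) + 1
    amplitude = mean(diff[start:stop]) - baseline
    # Left-hemisphere measures are switched in polarity before the ANOVA.
    return -amplitude if left_hemisphere else amplitude
```

With this convention, a right-hemisphere sensor showing a relative negativity for LVF targets yields a negative value, and the same deflection of opposite sign at a left-hemisphere sensor yields the same negative value after the polarity switch, so the two hemispheres can enter one ANOVA.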
ERMF/ERP source localization.
ERMF and ERP data from the main experiment were combined to estimate the neural sources of the N2pc effect. For the control experiment, the source analysis was based on ERMF data only. For source localization, source density estimates (SDEs) were computed using a minimum norm least-squares algorithm (Fuchs et al., 1999) as implemented in the multimodal neuroimaging software Curry 4.0 (Neurosoft, El Paso, TX). SDEs underlying the N2pc were computed from ERMF/ERP difference waves (left visual field targets minus right visual field targets), which subtracts away all ERP/ERMF activity except the lateralized activity related to processing the target (Hopf et al., 2000). Importantly, although the N2pc difference waves (ERP and ERMF) display opposite polarity for left and right visual field targets (see above), the relative polarity does not influence the distribution of the minimum norm-based source density estimates. That is, identical current source distributions would be obtained from either direction of subtraction. The SDE measures were obtained from each hemisphere separately and then collapsed across hemispheres for additional statistical analysis. To keep the results from ERMF/ERP source analysis comparable with the data obtained in the control and fMRI experiments, SDEs were estimated for lower visual field targets only.
To allow optimal spatial precision, the SDEs were estimated using realistic models of the volume conductor (CSF space) and the source compartment (gray matter), which were obtained from three-dimensional (3D) segmentations of individual high-resolution MR data from each observer [T1-weighted three-dimensional spoiled gradient echo sequence; 256 × 256 matrix; field of view, 25 × 25 cm; 124 slices; slice thickness, 1.5 mm; echo time (TE), 6 ms; repetition time (TR), 20 ms; flip angle, 30°]. The algorithms provided in the Curry package were used for 3D segmentation (Fuchs et al., 1998). Grand average activity across observers was estimated using a cortical surface obtained from a segmentation of an average reference brain [Montreal Neurological Institute (MNI) brain, average of 152 T1-weighted stereotaxic volumes; see http://www.bic.mni.mcgill.ca/cgi/icbm_view/].
SDE quantification and statistical validation.
To perform statistical analyses on the SDEs, the magnitudes of the SDEs for large- and small-scale targets were determined from 164 adjacent non-overlapping cortex patches (∼1 cm radius) that covered the whole cortical surface of each observer. These patches were determined on the 3D surface by placing a 2 × 2 cm surface grid on the cortical surface reconstruction with one sphere (2 cm diameter) centered on each grid point. The part of cortical surface cut by the sphere was taken to define the patch. The SDE in one patch represented the source density average across all dipoles contained in the patch. Because we had no a priori means of selecting regions of interest, we found the patch with the maximum small-scale SDE and the patch with the maximum large-scale SDE for each observer and used these two patches as regions of interest for additional analyses. We then measured the magnitude of the SDE in these two patches in each observer. Note that, because the patches were identified separately for each observer, this approach takes individual differences into account.
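The patch-based selection of regions of interest can be illustrated with a short Python sketch. It assumes that the source density values of the dipoles have already been grouped by patch; the dictionary layout and the function names are hypothetical and are not part of the Curry software.

```python
def patch_means(dipoles_by_patch):
    """Average source density over the dipoles in each patch.

    `dipoles_by_patch` maps a patch id to the list of source density
    values of the dipoles that fall inside that patch.
    """
    return {
        patch: sum(values) / len(values)
        for patch, values in dipoles_by_patch.items()
    }


def max_patch(dipoles_by_patch):
    """Return the id of the patch with the largest mean source density."""
    means = patch_means(dipoles_by_patch)
    return max(means, key=means.get)
```

Running `max_patch` once on the small-scale estimates and once on the large-scale estimates of an observer would give that observer's two regions of interest.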
The locations of the patches were used to demonstrate that the location of the maximal SDE for small-scale targets was different from the location of the maximal SDE for large-scale targets. Specifically, the MNI (x, y, z) coordinates of the small- and large-scale maxima were computed for each subject, and the Euclidean distance between them in the axial x–y plane was measured and compared against zero with a t test.
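As an illustration of this location analysis, the following Python sketch computes the distance between two maxima in the axial x–y plane and the one-sample t statistic for a set of such distances. The function names are hypothetical, and the sketch returns only the t statistic and its degrees of freedom; obtaining a p value additionally requires the t distribution, which is not part of the Python standard library.

```python
import math
import statistics


def xy_distance(coord_a, coord_b):
    """Euclidean distance between two (x, y, z) points, ignoring z."""
    return math.hypot(coord_a[0] - coord_b[0], coord_a[1] - coord_b[1])


def one_sample_t(values, popmean=0.0):
    """Return (t, degrees of freedom) for a one-sample t test."""
    n = len(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - popmean) / sem, n - 1
```

In use, `xy_distance` would be applied to each observer's pair of maxima, and the resulting list of distances passed to `one_sample_t`.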
The magnitude in each patch was measured for both small- and large-scale targets to demonstrate that the distribution of source density differed for small- and large-scale targets. For statistical validation, the SDEs for both small-scale and large-scale targets were measured from the patches corresponding to the small- and large-scale maxima in each hemisphere of each observer. That is, the response to small-scale targets was measured at both the small-scale maximum and the large-scale maximum, as was the response to large-scale targets. We then computed the difference between the SDE estimates for small- and large-scale targets at the two locations (after collapsing measures from both hemispheres). Because the two patches were selected by virtue of their large responses to either small- or large-scale targets, the finding of a significantly larger response for small-scale targets at the small-scale maximum or for large-scale targets at the large-scale maximum would not alone be meaningful. However, our goal was to show that a significant difference between small- and large-scale targets was present only at the small-scale maximum and that this difference was larger at the small-scale maximum than at the large-scale maximum. This pattern of results could not be an artifactual consequence of our procedure for identifying these two regions of cortex.
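The logic of this comparison can be sketched in Python under the assumption that each observer contributes four hemisphere-collapsed SDE values (small- and large-scale responses at each of the two maxima). The function names and argument layout are hypothetical. The paired-samples test is written as a one-sample test on the per-observer difference of difference scores, which is arithmetically equivalent.

```python
import math
import statistics


def t_statistic(values):
    """One-sample t statistic against zero."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))


def scale_effects(small_at_small, large_at_small,
                  small_at_large, large_at_large):
    """Small-minus-large difference scores at each maximum, per observer."""
    at_small_max = [s - l for s, l in zip(small_at_small, large_at_small)]
    at_large_max = [s - l for s, l in zip(small_at_large, large_at_large)]
    return at_small_max, at_large_max


def paired_t(at_small_max, at_large_max):
    """Paired t statistic comparing difference scores across locations."""
    return t_statistic([a - b for a, b in zip(at_small_max, at_large_max)])
```

The two lists returned by `scale_effects` would each be passed to `t_statistic` for the one-sample tests, and both together to `paired_t` for the comparison between locations.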
fMRI experimental design.
The fMRI experiment was identical to the main ERP/ERMF experiment except for two modifications. First, the timing of the trial sequence (and consequently the number of trials per condition) was changed to optimize the analysis of the event-related blood oxygenation level-dependent (BOLD) data (see below). Second, to maximize the number of trials per condition, the red and green dot patterns always occurred in the lower visual field (left/right position was still randomized), with the blue dot patterns constantly appearing in the two upper quadrants.
fMRI data acquisition.
Images were acquired on a neuro-optimized GE Medical Systems (Milwaukee, WI) Signa LX 1.5 T system using a 5-inch surface ring coil beneath the subject’s occipital pole. Functional images extended anteriorly from the occipital pole approximately orthogonal to the calcarine fissure (23 contiguous coronal slices) and included posterior parts of temporal and parietal lobe (echo-planar imaging sequence; slice thickness, 3 mm; in-plane resolution, 2.8 × 2.8 mm; TR, 2000 ms; TE, 40 ms; flip angle, 60°). A total of 309 images were acquired during each trial block. Observers received six trial blocks, with each block containing different randomizations of 40 trials per experimental condition, yielding a total of 240 trials per condition. The attended color alternated from block to block. The stimulus onset asynchrony in each trial block was pseudorandomized to be one, two, or three times the TR (i.e., 2, 4, or 6 s), with the requirement that the variance of BOLD estimates for each of the four experimental conditions (LVF/RVF × large/small scale) be minimized for each individual run.
fMRI data processing and statistical analysis.
Preprocessing of the images was performed in SPM99 (Wellcome Department of Cognitive Neurology, London, UK). First, functional volumes were phase shifted in time with reference to the first slice to minimize purely acquisition-dependent signal variations across slices. Second, head-movement artifacts were corrected on the basis of an affine rigid body transformation with reference to the first image of the first run. Third, volumes were spatially smoothed with a Gaussian kernel of 6 mm (full-width half-maximum). For statistical analysis, the functional data were high- and low-pass filtered in the temporal domain (high pass, 98 s; low pass, 4 s) and rescaled to the global mean. Statistical analysis was performed separately for each observer using a modeled hemodynamic response function for each experimental condition (Friston et al., 1998).
Significant differences in hemodynamic responses were assessed using the general linear model approach as implemented in SPM99 (Frackowiak et al., 2004). The BOLD response reflecting the N2pc effect was derived by computing mirror-image SPM(t)-contrasts (LVF targets > RVF targets for the right hemisphere and RVF targets > LVF targets for the left hemisphere). Only signal enhancements were considered. Please note, however, that any signal decrease in the left hemisphere of the LVF targets > RVF targets contrast will appear as signal enhancement in the RVF targets > LVF targets contrast in this hemisphere, and vice versa. Thus, using mirror-image contrasts provides a complete description of the data. Furthermore, it closely parallels the ERP/ERMF analysis using LVF–RVF difference waves. For individual quantification, the areas of significant activation for large- and small-scale targets in each observer and hemisphere were used as regions of interest for determining the magnitude of the BOLD response, reflecting the N2pc effect. To summarize activations across observers, a random-effects model was computed using the “summary statistics approach” (Frackowiak et al., 2004), in which first-level t-contrasts for large- and small-scale trials were subjected to a second-level one-sample t test. Given the small number of subjects, this random-effects approach provides a conservative estimate of statistical significance, but it also provides greater generalizability than a fixed-effects approach. The level of significance reported in the SPM(t) maps is always corrected for multiple comparisons.
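The mirror-image contrasts can be made concrete with a small Python sketch of the contrast weights. The ordering of the four conditions, the condition labels, and the function names are assumptions made for illustration; they do not reproduce the SPM99 design matrix used in the study.

```python
# Hypothetical condition order for the four experimental conditions.
CONDITIONS = ("LVF_large", "LVF_small", "RVF_large", "RVF_small")


def contrast_weights(scale, hemisphere):
    """Weights for the contralateral > ipsilateral contrast at one scale.

    For the right hemisphere this is LVF > RVF; for the left hemisphere
    it is the mirror image, RVF > LVF.
    """
    contra = "LVF" if hemisphere == "right" else "RVF"
    ipsi = "RVF" if hemisphere == "right" else "LVF"
    weights = []
    for condition in CONDITIONS:
        if condition == f"{contra}_{scale}":
            weights.append(1)
        elif condition == f"{ipsi}_{scale}":
            weights.append(-1)
        else:
            weights.append(0)
    return weights


def contrast_value(betas, weights):
    """Weighted sum of condition estimates for one voxel."""
    return sum(b * w for b, w in zip(betas, weights))
```

Because the left-hemisphere weights are the negation of the right-hemisphere weights, a signal decrease under one contrast appears as an enhancement under its mirror image, which is why considering only enhancements in both contrasts still describes the data completely.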
Retinotopic maps.
Retinotopic maps were acquired from two of the five subjects in the fMRI experiment using a protocol similar to that reported by Wade et al. (2002), with the modification that the angle of the rotating wedge was 45°. Functional data [SPM(t) maps] were then coregistered with high-resolution anatomical scans (T1-weighted three-dimensional spoiled gradient echo sequence; 256 × 256 matrix; field of view, 25 × 25 cm; 124 slices; slice thickness, 1.5 mm; TE, 6 ms; TR, 20 ms; flip angle, 30°) and flattened using VISTA software (available at http://white.stanford.edu/software/) (Teo et al., 1997; Wandell et al., 2000). The lateral occipital complex (LOC) was functionally localized based on voxels showing significantly larger BOLD responses for gray-scale objects versus scrambled objects (t test with p < 0.001). Stimulation and scanning were analogous to the protocol reported by Kourtzi and Kanwisher (2000).
Results
Main ERMF/ERP results
Figure 2 shows grand average ERMF waveforms from a relatively posterior sensor and a relatively anterior sensor over left visual cortex in response to small- and large-scale targets. Targets elicited a clear N2pc component, which is visible as an amplitude difference starting at ∼200 ms after stimulus. The N2pc component is lateralized with respect to the location of the target, and the differential response to contralateral versus ipsilateral targets can be used to isolate the N2pc from other components, which are bilaterally distributed for stimuli such as these. This contralateral-versus-ipsilateral difference was clearly present for both small- and large-scale targets at the more anterior sensor site, but it was present primarily for the small-scale targets at the more posterior sensor site (arrow). For statistical validation, three-way repeated-measures ANOVAs with factors of N2pc (left vs right visual field targets), sensor hemisphere (left vs right hemisphere), and target scale (large vs small scale) were computed based on mean ERMF amplitude measurements between 250 and 300 ms. Separate ANOVAs were performed for anterior and posterior sensor sites, which revealed significant main effects for N2pc (anterior, p < 0.005; posterior, p < 0.05) as well as significant N2pc × sensor hemisphere interactions (anterior, p < 0.05; posterior, p < 0.005), reflecting the fact that the N2pc effect was slightly larger over the left than the right hemisphere. Furthermore, a significant N2pc × target scale interaction was observed at posterior sensor sites (p < 0.05) but not at anterior sites. Finally, there was no significant interaction between target scale and sensor hemisphere, indicating that, although the N2pc was slightly larger in the left hemisphere, the effect of target scale on the N2pc component was not significantly lateralized.
These effects are broadly consistent with the hypothesis that both small- and large-scale targets are processed in relatively anterior areas of visual cortex, whereas only small-scale targets are processed in relatively posterior areas.
The cortical distribution of the N2pc was assessed by estimating the underlying current source density at each point along the cortical surface in the time range of the maximum effect (250–300 ms). Figure 3a–d shows the current source density distributions for a representative observer and for the average of all 10 observers, along with the location of the maximum source density estimate for each observer. From these data, it is clear that the maximum activity was located at relatively anterior and lateral portions of the ventral visual pathway for large-scale targets [MNI coordinates (55, −61, −9)] and at relatively posterior and medial portions of this pathway for small-scale targets [MNI coordinates (25, −92, −9)]. The mean distance between maxima in the MNI x–y plane (3.4 cm) was significantly greater than zero (t(9) = 3.83; p < 0.001).
Although the location of maximal attention-related activity clearly differed for small- versus large-scale targets, small-scale targets elicited substantial attention-related activity over a broad region of ventral occipitotemporal cortex, including the location of the maximal large-scale effect. In contrast, the attention-related activity elicited by large-scale targets was confined to anterior regions of visual cortex, with very little activity at more posterior locations. To confirm this pattern statistically, the mean magnitude of the source density estimates was obtained for both large- and small-scale targets at the location of each observer’s large- and small-scale maximum (Fig. 3d) (see Materials and Methods). We then computed the difference in activity between small- and large-scale targets at each of these two locations. One-sample t tests showed that the difference between small- and large-scale targets was significantly greater than zero at the small-scale maximum (p < 0.05) but not at the large-scale maximum (p > 0.25). In addition, a paired-samples t test comparing the difference scores at the two locations showed that the difference between small- and large-scale targets was greater at the small-scale location than at the large-scale location (p < 0.05). This pattern accords with the prediction that small-scale targets will lead to competition-induced attentional focusing in both lower-level areas with small receptive fields and higher-level areas with large receptive fields, whereas large-scale targets will lead to competition and attentional focusing primarily in higher-level areas. Moreover, given that the scale of the target varied randomly from trial to trial, these results demonstrate that the locus of attention within visual cortex can change rapidly in response to momentary stimulus conditions.
Control experiment results
The control experiment addresses the possibility that, because large-scale targets were slightly easier to discriminate, subjects may have attempted to detect the large-scale target before attempting to detect the small-scale target on each trial. This could have led to activity in a relatively anterior region of visual cortex for both small- and large-scale targets while subjects attempted to detect the large-scale target, followed by activity in a relatively posterior region when the large-scale target was not detected. Unfortunately, analyzing the data from the main experiment with a sufficiently fine time grain to assess this pattern would lead to an unacceptably poor signal-to-noise ratio. Consequently, we conducted a control experiment in which target scale remained constant throughout a trial block, making it possible for subjects to focus in advance on the spatial scale of both small- and large-scale targets and eliminating the usefulness of checking all stimuli first for large-scale targets.
If the pattern observed in the main experiment were caused by this strategy, then this pattern should not be observed in the control experiment. However, the same pattern of results was observed again in the control experiment. Figure 4a shows the current source density distribution (right hemisphere) for the average of all seven observers, along with a bar graph illustrating the average magnitude of the source activity for both small- and large-scale targets at each observer’s small- and large-scale maximum (Fig. 4b). Again, the maximum activity was located at relatively anterior and lateral portions of the ventral visual pathway for large-scale targets and at relatively posterior and medial portions of this pathway for small-scale targets. Importantly, as in the main experiment, small-scale trials produced substantial attention-related activity over a region of the ventral occipitotemporal cortex that coincided with the location of the maximal large-scale effect. Source activity for large-scale trials, conversely, was confined to anterolateral regions of visual cortex, with substantially smaller activity at more posterior locations.
To confirm these observations statistically, one-sample t tests of the difference between small- and large-scale targets were computed that revealed no significant difference (p > 0.15) at the large-scale maximum but a significant difference at the small-scale maximum (p < 0.005). In addition, analogous to the analysis of the main experiment, a paired-samples t test comparing the difference scores at the two locations showed that the difference between small- and large-scale targets was greater at the small-scale location than at the large-scale location (p < 0.05).
fMRI results
To provide converging evidence about the neural origins of these effects, we repeated the main experiment as an event-related fMRI study (n = 5). The BOLD response corresponding to the N2pc effect is shown in Figure 5. Significantly active voxels were observed in the occipitotemporal region (parietal activations were also observed but did not show systematic topographic variations across target types). Just as was observed in the ERP/ERMF experiment, small-scale targets produced significant activations in both a relatively anterolateral region and a relatively posteromedial region of the ventral pathway, whereas large-scale targets produced significant activations primarily in the relatively anterolateral region.
The fMRI data were analyzed in the same manner as the ERP/ERMF data. First, we found that the mean distance in the x–y plane between the activation maxima for small- and large-scale targets (1.9 cm) was significantly different from zero (t(4) = 23.5; p < 0.001). Second, an analysis of the magnitude of the BOLD response (shown in Fig. 2f) revealed that small-scale targets elicited a substantial BOLD response in both the anterolateral and posteromedial regions, whereas large-scale targets primarily produced a response in the relatively anterolateral region. This pattern was supported statistically by one-sample t tests, which showed that the difference in signal strength between small- and large-scale targets was significantly greater than zero at the small-scale maximum (p < 0.05) but not at the large-scale maximum (p > 0.40). In addition, a paired-samples t test comparing the difference scores at the two locations showed that the difference was greater at the small-scale location than at the large-scale location (p < 0.005). This pattern matches the results from the ERP/ERMF experiment extremely well (compare bar graphs in Figs. 3d, 5).
To determine the functionally defined brain areas in which these effects occurred, retinotopic mapping was performed in two subjects (for details, see Materials and Methods). Figure 6 shows the results on a flat-map representation, in which the retinotopically defined area V4 is highlighted in yellow and the area defined as LOC (see Materials and Methods) is highlighted in green. The area encircled on the flat maps corresponds with the areas on the folded cortex (left side). In both subjects, significant BOLD changes were elicited by small-scale targets (blue outlines) in the region identified as area V4 and in the region identified as the LOC, which is probably homologous with macaque temporal–occipital area TEO (Tootell et al., 2003). Large-scale targets also produced substantial BOLD effects in the LOC but produced little or no significant activity in V4. Figure 7 illustrates the localization of these effects on the folded cortex of both subjects shown in Figure 6. Shown are BOLD changes for large-scale (red) and small-scale (blue) targets together with the localizers of areas V4 (yellow) and the LOC (green) using the same cutoff threshold as in Figure 6. Again, and consistent with the source density distributions of the main ERP/ERMF experiment (Fig. 3b), BOLD maxima for both large-scale and small-scale targets are visible in posterior parts of the LOC, which appears more lateral and anterior to area V4. In contrast, only small-scale targets produce a substantial activation in V4, whereas large-scale targets show little or no effect.
Discussion
The combined pattern of ERP/ERMF and fMRI data indicates that attention operates at different levels of the visual hierarchy depending on the scale of the attended object. The maximum effect occurs in the LOC for large-scale targets and in the vicinity of area V4 for small-scale targets, which matches well with typical receptive field sizes in these areas. That is, a typical V4 receptive field would contain only one complete set of the small squares (Desimone and Schein, 1987), whereas a typical TE receptive field would contain an entire quadrant or more (Gross and Mishkin, 1977). Previous single-unit recording studies with monkeys have found that attention operates most strongly when multiple objects are present within the receptive field of the neuron (Moran and Desimone, 1985; Treue and Maunsell, 1996; Luck et al., 1997b; Reynolds et al., 1999), and closer inter-object spacings have presumably been necessary to observe these effects in more posterior areas of the ventral pathway attributable to the smaller receptive field sizes in these areas. However, the single-unit studies did not systematically explore the relationship between stimulus scale and the neural locus of attention. Recent fMRI data demonstrate that BOLD suppression attributable to stimulus competition can occur with wider stimulus spacings at later stages of the ventral stream, apparently scaling with receptive field size in humans (Kastner et al., 2001). More specifically, this study demonstrated that TEO shows competitive BOLD suppression at larger inter-item spacings than area V4, suggesting larger receptive field sizes in the former area. Furthermore, area TEO has been proposed recently to correspond with parts of the human LOC (Tootell et al., 2003), which would be consistent with the present pattern of large-scale attention-related activity involving parts of the LOC and small-scale attention-related activity involving area V4 as well as the LOC.
In accord with the results of single-unit recordings in monkeys (Moran and Desimone, 1985; Treue and Maunsell, 1996; Luck et al., 1997b; Reynolds et al., 1999), fMRI data in humans indicate that attention operates by counteracting suppressive stimulus interactions (Kastner et al., 1998, 1999). Thus, changing the spatial scale of competition should change the stages of the visual hierarchy in which attention operates. However, this prediction has not been rigorously tested by previous research. The present study is the first to directly demonstrate that the neural locus of attention does indeed vary with the scale of the attended information, even when the overall stimulus arrays remain constant.
Even more importantly, the present results demonstrate that the neural locus of attention is adjusted rapidly in response to moment-by-moment changes in the scale of the attended information. Although the observers did not know whether a small-scale or large-scale target would appear within a given array, the neural locus of attention was adjusted to reflect the spatial scale of the target within 250–300 ms of array onset.
One might suppose that the large-scale targets were easier to detect than the small-scale targets, leading observers to search at the small scale only if no target was detected at the large scale. This possibility was addressed in a control ERMF experiment in which the scale of the target item was consistent across trials. The cortical activity pattern reflecting scale selection, however, was not different from that of the main experiment. In particular, whether predictable or not, both small- and large-scale targets elicited substantial attention-related activity in a relatively anterior and lateral area of visual cortex (presumably the LOC), whereas only small-scale targets elicited activity in a more posterior and medial area (presumably V4). Thus, the scale-dependent pattern cannot be explained by a strategy of searching first at the large scale and then at the small scale.
It should also be noted that target detection was only slightly and nonsignificantly faster for the large-scale targets than for the small-scale targets, making it implausible that subjects first searched at the large scale and then searched at the small scale. Thus, the observers presumably searched at both scales in parallel, using preattentive information about irregularities in the otherwise regular dot patterns to detect the location and scale of the target, followed by the allocation of attention in a scale-dependent manner to discriminate the direction of the dot displacement in the target.
It should be noted that several distinct attentional mechanisms were presumably involved in this task, and our study isolated only a subset of these mechanisms (i.e., those that were lateralized with respect to the target location and produced a measurable ERP/ERMF response). In particular, there may be general hemispheric differences in the allocation of attention to small- versus large-scale information (Ivry and Robertson, 1998). A typical observation of previous work on the processing of global and local aspects of hierarchical stimuli (Fink et al., 1996; Heinze et al., 1998; Ivry and Robertson, 1998; Mevorach et al., 2005; Weissman and Woldorff, 2005) is that global versus local processing causes a hemispheric lateralization of the visual response, although the direction of lateralization has not always been observed to be consistent (Weissman et al., 2002). Such lateralization was not identified in our analyses. Thus, although the present data reveal an important attentional mechanism, they do not provide an exhaustive account of attentional processing for small- and large-scale targets.
Finally, it is necessary to consider whether the slight differences between the stimuli containing small- and large-scale targets may have caused the observed differential pattern of activity. That is, the large- and small-scale targets were slightly different physical stimuli (Fig. 1, compare d, e), and the large-scale targets caused a change in the overall configuration of the display. However, the small physical differences between large- and small-scale targets could not have produced the observed results. First, changes in overall configuration should be identifiable primarily in the LOC, in which the receptive fields are large enough to encode a large configuration. However, we observed the same pattern of activity in LOC for both large- and small-scale targets. Second, subjects were instructed to attend to the red items in some trial blocks and to the green items in other trial blocks, and the same physical stimulus was therefore used when subjects attended to the LVF and RVF. All of our analyses examined differences between attending to the LVF and RVF, thus subtracting out any effects attributable to stimulus properties per se. Thus, although the small- and large-scale targets might have elicited slightly different responses, the design of the experiment ruled out any contributions of these differences to our main results.
The main finding of the present study, rapid changes in the neural distribution of attentional modulation, has important implications for the neural circuitry underlying visual attention. Specifically, this finding suggests that attention operates in a self-organized manner, modulating neural responsiveness rapidly depending on local and momentary conditions of competition within a given cortical area. This further suggests that the detailed operation of attention at each moment depends on local computations, with attentional control areas in parietal and prefrontal cortex providing general bias signals (Corbetta, 1998; Miller and Cohen, 2001) rather than providing moment-by-moment control of the implementation of selective processing. That is, the prefrontal cortex provides general signals about the nature of the discrimination to be performed (Logan and Gordon, 2001), and local circuitry within each area of visual cortex determines whether the current stimuli require the spatial focusing of attention within that area.
Footnotes
- This work was supported by a grant from the Deutsche Forschungsgemeinschaft (H.-J.H., principal investigator) and a grant from the National Institute of Mental Health (S.J.L., principal investigator).
- Correspondence should be addressed to Dr. Jens-Max Hopf, Department of Neurology II, Otto-von-Guericke University of Magdeburg, Leipziger Strasse 44, 39120 Magdeburg, Germany. Email: jens-max.hopf{at}medizin.uni-magdeburg.de