Abstract
Selective attention may be flexibly directed toward particular locations in the visual field (spatial attention) or to entire object configurations (object-based attention). A key question is whether spatial attention plays a direct role in the selection of objects, perhaps by spreading its facilitatory influence throughout the boundaries of an object. We studied the relationship between spatial and object-based attention in a design in which subjects attended to brief offsets of one corner of a real or illusory square form. Object-selective attention was indexed by differences in event-related potential (ERP) amplitudes and blood oxygen level-dependent (BOLD) activations to unattended corner offsets in conditions in which the objects were intact versus fragmented or absent. This design ensured that object-based attention effects were not an artifact of attention being guided by simple directional cues such as parallel lines, which may have occurred in previous studies. Both space-based and object-based attention were associated with enhanced negative ERPs (N1 component at 140–180 ms) that were colocalized with BOLD activations in lateral occipital cortex (LOC). These results provide physiological evidence that directing spatial attention to one part of an object (whether real or illusory) facilitates the processing of the entire object at the level of the LOC and thus contributes directly to object-based selective attention.
Introduction
When attention is directed to a specific location in the visual field, the detection and discrimination of stimuli at that location becomes faster and more accurate (Wright, 1998). This space-based stimulus selection has been likened to a moveable “spotlight” that can facilitate processing of stimuli at relevant locations. There is also abundant evidence that attention can select objects in their entirety for preferential processing (Scholl, 2001). Critical evidence for object-based attention comes from studies based on a design by Egly et al. (1994) in which two adjacent rectangles are presented and attention is cued to one end of one of the rectangles. The general finding is that stimuli presented at uncued locations within the cued rectangle are selected more rapidly than are physically equivalent stimuli presented within the uncued rectangle (Marino and Scholl, 2005).
Several investigators have proposed that space-based attention plays an important role in object-selective attention (Vecera and Farah, 1994; Weber et al., 1997; Davis et al., 2000). According to this view, spatial attention directed to one part of an object spreads throughout its boundaries and strengthens the sensory representation of the entire object. In support of this “object-guided spatial-selection mechanism,” physiological studies using variants of the dual-rectangle paradigm of Egly et al. (1994) have shown that neural activity patterns associated with spatial and object-based attention have much in common, as evidenced by functional magnetic resonance imaging (fMRI) (Müller and Kleinschmidt, 2003) and event-related potential (ERP) recordings (He et al., 2004; Martinez et al., 2006, 2007).
Although the paradigm of Egly et al. (1994) has been used in many studies of object-based attention, there is reason to believe that the “same object advantage” observed in these studies may not, in fact, reflect selection of the rectangular object itself. Instead, Avrahami (1999) suggested that attention may be guided from cued locations along the direction of the lengthwise parallel lines of the rectangle rather than by its integral object properties. In support of this proposal, a cueing advantage was found for distant stimuli situated within arrays of evenly spaced parallel lines that did not constitute enclosed objects (Avrahami, 1999; Marino and Scholl, 2005). These results call into question whether the paradigm of Egly et al. (1994) provides a good metric of object-selective attention.
In previous studies (Martinez et al., 2006, 2007), we used fMRI and ERP recordings in an Egly-style paradigm to show that space-based and object-based attention produced a similar pattern of enhanced neural activity in the lateral occipital cortex (LOC) at 150–180 ms after stimulus onset. The present study aimed to determine whether this neural activity pattern in LOC, a region previously implicated in object-recognition processes (Grill-Spector et al., 2001), truly reflects object-selective attention. Accordingly, the present stimulus display consisted of symmetrical objects (real and illusory squares) so that object-based attention could not be based on simple stimulus features such as parallel lines that are an integral part of the Egly-type display. The critical index of object-based attention here was an increased ERP amplitude to unattended regions of intact objects.
Materials and Methods
Experiment 1
Subjects
Fourteen healthy volunteers (mean age, 21 years; nine females) participated in the ERP portion of the study. An additional seven subjects took part in the fMRI experiment. All subjects were right handed, had normal or corrected-to-normal vision, and gave informed consent to participate in the experiment.
Stimuli and task
Stimuli were either a single white square (6.5 × 6.5° visual angle) [intact object (IO) condition] or four separated rectangular white shapes [fragmented object (FO) condition] that together covered the same area as the single square (Fig. 1). All stimuli were white on a gray background and contained a small fixation cross in the center that was present at all times. In both the IO and FO configurations, task stimuli consisted of brief (100 ms) offsets of the corners in the upper left (UL), upper right (UR), lower left (LL), or lower right (LR) quadrants. Corner offsets left either a concave edge (standards, p = 0.8) or a convex edge (targets, p = 0.2) and occurred in random order, one at a time, in the different quadrants at intervals of 400–600 ms. Stimuli were delivered in 20 s blocks with either the IO or FO configuration. In both cases, subjects were instructed to maintain fixation on the central cross while directing attention to the corner (quadrant) indicated by a pair of arrows presented just above or below fixation. Subjects were instructed to respond as quickly and accurately as possible to target stimuli appearing in the attended quadrant and to ignore all stimuli in the other quadrants. All subjects responded with their right hand. A single run consisted of five attend-left blocks and five attend-right blocks. During half of the runs, the cues alternated between the UL and UR quadrants and in the remaining half alternated between the LL and LR quadrants. Each subject took part in a total of 20 runs resulting in ∼500 stimuli per condition, and the order of the runs (attend upper field or lower field and IO or FO) was counterbalanced across subjects.
Electrophysiological recordings
Subjects sat in a dimly lit recording chamber and viewed the stimuli on a video monitor at a distance of 100 cm. Recordings were acquired from 62 scalp electrodes using a modified 10–20 system montage (Di Russo et al., 2003) referenced to the right mastoid. Eye movements and blinks were monitored via recordings of the horizontal (right vs left outer canthi) and vertical (below the eye vs right mastoid) electrooculogram (EOG). The EEG was digitized at 250 Hz with an amplifier bandpass of 0.1–80 Hz. Computerized artifact rejection was performed before signal averaging to discard epochs in which deviations in eye position and/or blinks (defined as any peak-to-peak amplitude change of ≥60 μV in the EOG channels) or amplifier blocking (defined as a flat voltage line for ≥40 ms) occurred. Additionally, epochs that were preceded by a target stimulus within 1000 ms or followed by a target within 200 ms were eliminated to avoid contamination by ERPs related to target detection and motor responses. Approximately 10% of all trials were rejected based on one or more of the preceding criteria.
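The rejection criteria described above can be sketched in a few lines. This is a minimal illustration rather than the software actually used: the function names, the representation of an epoch as lists of microvolt samples, and the reading of a "flat voltage line for ≥40 ms" as a run of identical samples spanning at least 40 ms at 250 Hz are all assumptions made for the example.

```python
SAMPLE_RATE_HZ = 250        # digitization rate stated in the text
EOG_THRESHOLD_UV = 60.0     # peak-to-peak criterion stated in the text
FLAT_DURATION_MS = 40.0     # amplifier-blocking criterion stated in the text


def peak_to_peak(samples):
    """Largest minus smallest voltage in the epoch (microvolts)."""
    return max(samples) - min(samples)


def longest_flat_run_ms(samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration (ms) spanned by the longest run of identical consecutive samples."""
    longest = run = 1
    for previous, current in zip(samples, samples[1:]):
        run = run + 1 if current == previous else 1
        longest = max(longest, run)
    # A run of n samples spans n - 1 sampling intervals.
    return (longest - 1) * 1000.0 / sample_rate_hz


def reject_epoch(eog_channels, eeg_channels):
    """True if any EOG channel reaches the peak-to-peak threshold or any
    channel shows a flat line lasting at least FLAT_DURATION_MS."""
    if any(peak_to_peak(ch) >= EOG_THRESHOLD_UV for ch in eog_channels):
        return True
    all_channels = list(eog_channels) + list(eeg_channels)
    return any(longest_flat_run_ms(ch) >= FLAT_DURATION_MS for ch in all_channels)
```

Under these assumptions, an epoch is discarded when `reject_epoch` returns `True`; the additional exclusion of epochs close in time to a target would be applied separately from the stimulus log.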
Time-locked ERPs to the standard corner-offset stimuli were averaged separately according to quadrant of presentation (UL, UR, LL, and LR), whether the stimuli were attended or unattended, and whether the stimulus configuration was IO or FO. For the data analysis, ERPs were re-referenced algebraically to the average of the left and right mastoids. ERPs to target stimuli were not analyzed.
To assess effects of spatial attention, ERPs to attended-location stimuli were averaged across both IO and FO configurations (for which the attended ERPs did not differ; see Results) and were compared with the ERPs elicited by the same stimuli when unattended. However, to avoid confounding the spatial attention effect with object attention effects, only unattended stimuli in the FO configuration were included in this comparison. For example, the spatial attention effect for an UR stimulus was obtained by subtracting the average of the ERPs elicited by an UR corner stimulus when attention was directed toward the UL, LL, and LR quadrants during the FO condition from the ERP elicited by the same (UR) stimulus when attention was focused on the UR quadrant averaged over both the IO and FO configurations.
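The subtraction described above can be written out as a short sketch. It assumes that each ERP is held as a list of voltages on a common time base and that conditions are averaged with equal weight (the text does not say whether averages were weighted by trial count); the function names are hypothetical.

```python
def mean_waveform(waveforms):
    """Point-by-point average of equally long ERP waveforms."""
    n = len(waveforms)
    return [sum(values) / n for values in zip(*waveforms)]


def spatial_attention_effect(attended_io, attended_fo, unattended_fo):
    """Attended ERP (averaged over IO and FO) minus the mean unattended FO ERP.

    unattended_fo holds the three ERPs to the same stimulus recorded while
    attention was directed to each of the other quadrants in the FO condition.
    """
    attended = mean_waveform([attended_io, attended_fo])
    unattended = mean_waveform(unattended_fo)
    return [a - u for a, u in zip(attended, unattended)]
```

For an UR stimulus, `unattended_fo` would contain the FO waveforms from the attend-UL, attend-LL, and attend-LR conditions.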
In this study, we consider “object-based” attention effects to be indexed by contrasting ERPs to unattended stimuli in the IO versus FO configurations. Specifically, object-based attention effects were calculated by comparing ERP amplitudes for unattended corner offsets only, as a function of whether these formed part of the attended object (IO configuration) versus part of a different object (FO configuration). For example, ERPs elicited by an UR stimulus were averaged over the conditions of attending to the UL, LL, and LR quadrants when the configuration was an IO and were compared with the ERPs elicited by the same stimulus in the same three attention conditions, but in the FO configuration. Additionally, to determine whether these object-based attention effects were affected by the position of the unattended stimulus relative to the location of the attended stimulus, separate one-way ANOVAs were calculated for each relative position. These ANOVAs compared ERP amplitudes in the IO versus FO conditions when attention was directed to the horizontal, vertical, or diagonal quadrant with respect to the eliciting unattended quadrant. For example, ERPs to unattended UR stimuli were compared (IO vs FO configurations) when attention was focused on the UL quadrant (horizontal), LR quadrant (vertical), or LL quadrant (diagonal).
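The classification of attended quadrants as horizontal, vertical, or diagonal relative to the eliciting quadrant follows directly from the quadrant labels. The sketch below is illustrative only; the function names and the use of a single mean amplitude per condition are assumptions.

```python
QUADRANTS = ("UL", "UR", "LL", "LR")


def relative_position(eliciting, attended):
    """Classify the attended quadrant relative to the eliciting quadrant."""
    if eliciting not in QUADRANTS or attended not in QUADRANTS:
        raise ValueError("quadrants must be one of UL, UR, LL, LR")
    if eliciting == attended:
        raise ValueError("object-based contrasts use unattended stimuli only")
    same_row = eliciting[0] == attended[0]       # both upper or both lower
    same_column = eliciting[1] == attended[1]    # both left or both right
    if same_row:
        return "horizontal"
    if same_column:
        return "vertical"
    return "diagonal"


def object_based_effect(io_amplitude, fo_amplitude):
    """IO minus FO mean amplitude for the same unattended stimulus."""
    return io_amplitude - fo_amplitude
```

With these definitions, `relative_position("UR", "UL")` is horizontal, `relative_position("UR", "LR")` is vertical, and `relative_position("UR", "LL")` is diagonal, matching the example given in the text.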
In all cases, attention effects were quantified in terms of mean amplitudes within specified latency windows with respect to a 100 ms prestimulus baseline. For each quadrant, the mean amplitudes of the P1 (88–120 ms) and N1 (140–180 ms) components were subjected to a repeated-measures ANOVA with factors of attention (attended vs unattended for spatial attention; IO vs FO for object-based attention) and hemisphere (ipsilateral vs contralateral to the eliciting stimulus). In all analyses, P1 and N1 were measured as mean amplitudes averaged over a cluster of 10 posterior electrode sites in each hemisphere (O1/O2, PO3/PO4, PO7/PO8, P1/P2, P3/P4, P5/P6, P7/P8, CP1/CP2, CP3/CP4, CP5/CP6) where these components were largest. The specific time windows used for measuring each component were chosen because they encompassed the attention-related amplitude modulations that were stable in scalp topography within their respective time windows. Finally, the scalp distributions of the N1 amplitude modulations produced by spatial- versus object-selective attention were compared across the entire array of electrode sites after normalizing their amplitudes using the method of McCarthy and Wood (1985).
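The mean-amplitude measure and the amplitude normalization can be illustrated as follows. The sketch assumes an epoch that starts 100 ms before stimulus onset and is sampled at 250 Hz. For the normalization it shows vector-length scaling, which is one commonly used normalization associated with McCarthy and Wood (1985); the text does not state which variant was applied, so this choice is an assumption, as are the function names.

```python
import math

SAMPLE_RATE_HZ = 250     # stated in the recording section
BASELINE_MS = 100        # prestimulus baseline stated in the text


def ms_to_index(latency_ms, epoch_start_ms=-BASELINE_MS, rate_hz=SAMPLE_RATE_HZ):
    """Sample index of a latency for an epoch beginning at epoch_start_ms."""
    return round((latency_ms - epoch_start_ms) * rate_hz / 1000)


def mean_amplitude(waveform, start_ms, end_ms):
    """Mean voltage in [start_ms, end_ms] relative to the prestimulus mean."""
    zero = ms_to_index(0)
    baseline = sum(waveform[:zero]) / zero
    first, last = ms_to_index(start_ms), ms_to_index(end_ms)
    window = waveform[first:last + 1]
    return sum(window) / len(window) - baseline


def vector_scale(topography):
    """Divide each electrode's amplitude by the vector length across electrodes.

    Assumes at least one nonzero amplitude in the topography.
    """
    length = math.sqrt(sum(v * v for v in topography))
    return [v / length for v in topography]
```

For example, `mean_amplitude(waveform, 140, 180)` gives the N1 measure for one electrode, and `vector_scale` would be applied to each condition's difference topography before the topographical ANOVA so that overall amplitude differences do not masquerade as distribution differences.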
Source localization
To estimate the cortical generators of the spatial and object-based attention effects on N1 amplitude, two different types of source localization analyses were performed on the grand-averaged difference waves (attended minus unattended ERPs for spatial attention; IO minus FO for object-based attention) within the same intervals used for statistical testing. First, dipole modeling was performed using the Brain Electrical Source Analysis program (BESA; version 5). BESA iteratively adjusts the location and orientation of dipolar sources to minimize the residual variance between the calculated model and the observed ERP voltage topography (Scherg, 1990). The general approach was to fit symmetrical pairs of dipoles that were mirror constrained in location but not in orientation over restricted time intervals.
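The quantity minimized in dipole fitting can be illustrated with a small sketch. It assumes that residual variance is computed as the sum of squared differences between observed and modeled voltages divided by the sum of squared observed voltages; this is a common definition, but whether it matches the BESA implementation exactly is not stated in the text. The function names are hypothetical.

```python
def residual_variance(observed, modeled):
    """Fraction of the observed topography left unexplained by the model."""
    residual = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    total = sum(o ** 2 for o in observed)
    return residual / total


def explained_variance_percent(observed, modeled):
    """Percentage of the observed topography accounted for by the model."""
    return 100.0 * (1.0 - residual_variance(observed, modeled))
```

Under this definition, a model that accounts for more than 90% of the variance in scalp topography has a residual variance below 0.10.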
Second, current density distributions accounting for the N1 difference waves were estimated using a local autoregressive average (LAURA) algorithm (Grave de Peralta Menendez et al., 2001). LAURA uses a realistic head model with a solution space of 4024 nodes evenly distributed within the gray matter of the Montreal Neurological Institute (MNI) average template brain. It makes no a priori assumptions regarding the number of sources or their locations and can deal with multiple simultaneously active sources (Michel et al., 2001). LAURA analyses were implemented using the Cartool software (http://brainmapping.unige.ch/Cartool.php).
The current source distributions estimated by LAURA and the dipole locations calculated by BESA were transformed into the standardized coordinate system of Talairach and Tournoux (1988) and projected onto a structural brain image supplied by MRIcro (Rorden and Brett, 2000) using the Analysis of Functional NeuroImaging (AFNI) software package (Cox, 1996) and the MNI2TAL formula provided by http://www.mrc-cbu.cam.ac.uk/Imaging/Common/mnispace.shtml.
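For orientation, the MNI-to-Talairach step can be sketched as a piecewise-linear transform. The coefficients below are those commonly reproduced for the mni2tal approach, without its optional rounding; they are given here as an assumption and should be checked against the cited page before use. The function name is hypothetical.

```python
def mni_to_talairach(x, y, z):
    """Piecewise-linear MNI-to-Talairach approximation (mni2tal style).

    Coefficients are assumed from commonly reproduced versions of the
    formula; separate scalings apply above and below the AC plane.
    """
    tal_x = 0.9900 * x
    if z >= 0:    # at or above the anterior commissure plane
        tal_y = 0.9688 * y + 0.0460 * z
        tal_z = -0.0485 * y + 0.9189 * z
    else:         # below the anterior commissure plane
        tal_y = 0.9688 * y + 0.0420 * z
        tal_z = -0.0485 * y + 0.8390 * z
    return tal_x, tal_y, tal_z
```

The transform leaves the origin unchanged and shrinks coordinates slightly, with a stronger vertical compression below the AC plane than above it.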
fMRI methods
Seven healthy volunteers (mean age, 26 years; four females) were paid for their participation in this study. The Institutional Review Board of the Nathan S. Kline Institute for Psychiatric Research approved all procedures. The stimuli and task were identical to those used in the ERP study with the addition of a passive stimulation condition using the same stimuli. This passive condition was used to define functional regions of interest (ROIs) associated with sensory processing of the corner-offset stimuli in each quadrant. Each participant took part in a total of eight scans, the first two of which were passive (no task); in the remaining six scans, subjects performed the same attention task as in the ERP experiment.
Image acquisition.
T2*-weighted echo-planar images (EPIs) (repetition time, 2 s; echo time, 38 ms; flip angle, 90°; voxel size, 3 × 3 × 3 mm; matrix size, 64 × 64) were acquired on a 3 tesla SMIS MRI system equipped with a head volume coil. During each scan, 164 volumes were acquired on each of 20 contiguous slices in the coronal plane beginning at the occipital pole. The first four volumes were discarded before all analyses to allow for stabilization of the blood oxygen level-dependent (BOLD) signal. Visual stimulation was delivered through MR-compatible liquid crystal display goggles (Resonance Technology, Northridge, CA). For anatomical localization of functional data, high-resolution (1 mm³) images of the entire brain were acquired from each subject, using a standard MPRAGE (magnetization prepared rapid gradient echo) sequence.
Data processing.
All fMRI data analyses were conducted with the AFNI software package (Cox, 1996). Before all statistical tests, the EPIs from individual subjects were realigned to the first included volume (motion never exceeded 1.3 mm along any axis), linearly detrended, and slice time corrected. Functional images were then coregistered with each individual's high-resolution anatomical images and projected into Talairach coordinate space before being spatially smoothed with a Gaussian kernel of 6 mm full-width at half-maximum. The statistical significance levels and minimum cluster size of all activation maps in the study were calculated using a Monte Carlo simulation (AlphaSim, included in AFNI package).
Passive stimulation: defining ROI masks.
Corner-offset stimuli identical to those used in the ERP experiment were used for passive stimulation to define ROIs for the attention study. The stimuli belonged to either the IO or FO configuration and were delivered with the same timing parameters described above for the ERP experiment. Functional ROIs were defined for each quadrant by convolving a canonical hemodynamic response function with a model of the timing of the stimulation epochs and cross-correlating the resulting design matrix on a voxel-by-voxel basis with the data obtained from each passive scan. Percentage signal change maps corresponding to regions with enhanced contralateral activation during passive sensory stimulation at each quadrant (averaged over IO and FO blocks) were generated for each individual subject. Group ROI masks, used in all subsequent analyses, were generated by entering individual signal change maps into a groupwise t test and testing against the null hypothesis. Only voxels with t values >6.78 (p < 0.01) and belonging to clusters of five or more neighboring voxels (≥135 mm³) survived the final threshold.
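The modeling step described above (a block-timing model convolved with a hemodynamic response and then correlated with each voxel's time series) can be sketched as follows. The gamma-variate shape and its parameters, the 30 s kernel length, the use of a single regressor with a Pearson correlation, and all function names are assumptions chosen for illustration; the text specifies only that a canonical response function was used.

```python
import math

TR_S = 2.0        # repetition time stated in the text
BLOCK_S = 20.0    # block duration stated in the text


def gamma_hrf(t, b=8.6, c=0.547):
    """Illustrative gamma-variate response, scaled to peak at 1.0 when t = b*c."""
    if t <= 0:
        return 0.0
    return (t / (b * c)) ** b * math.exp(b - t / c)


def boxcar(n_volumes, onsets_s, duration_s=BLOCK_S, tr_s=TR_S):
    """1.0 for volumes acquired during a stimulation epoch, else 0.0."""
    return [
        1.0 if any(on <= i * tr_s < on + duration_s for on in onsets_s) else 0.0
        for i in range(n_volumes)
    ]


def convolve(signal, kernel):
    """Causal discrete convolution truncated to the length of the signal."""
    return [
        sum(signal[i - k] * kernel[k] for k in range(min(i + 1, len(kernel))))
        for i in range(len(signal))
    ]


def pearson_r(x, y):
    """Pearson correlation between a regressor and a voxel time series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def predicted_response(n_volumes, onsets_s):
    """Block model convolved with the illustrative response (0 to 30 s)."""
    kernel = [gamma_hrf(k * TR_S) for k in range(16)]
    return convolve(boxcar(n_volumes, onsets_s), kernel)
```

A voxel whose time series correlates strongly with `predicted_response` for the blocks in which a given quadrant was stimulated would be a candidate for that quadrant's ROI, subject to the group-level threshold described in the text.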
Attention scans.
BOLD responses elicited by attended and unattended stimuli were calculated using a cross-correlation method as described above. The group ROI masks were used to generate individual signal change maps for each quadrant as a function of whether it was attended or unattended and, if unattended, whether it formed part of the IO or FO. Spatial attention effects in the group were calculated by comparing the percentage signal change maps for each quadrant when attended (averaged over IO and FO configurations) versus the average signal change when attention was focused on the remaining three quadrants in the FO condition. Object-based attention effects were calculated by comparing activation within each unattended ROI as a function of whether the attended offset stimulus belonged to the same versus a different object (IO vs FO conditions). In both cases, statistical significance of these effects was assessed by paired t tests, calculated separately for each quadrant. To identify common regions of activation between object-based and spatially directed attention, a conjunction map was constructed by taking the intersection of the statistical maps for object and spatial attention at each quadrant. Group data are reported only for voxels with t values >3.91 (p < 0.05) that belonged to clusters of eight or more neighboring voxels (≥216 mm³).
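A conjunction of two thresholded maps can be sketched as the set of voxels that pass threshold in both. In the illustration below, each statistical map is represented as a dictionary from voxel coordinates to t values; this representation and the function names are assumptions, and the cluster-size criterion (which requires a neighborhood definition) is not implemented. The cluster volumes quoted in the text follow from a 27 mm³ voxel.

```python
VOXEL_VOLUME_MM3 = 3 * 3 * 3    # 3 mm voxels, consistent with the quoted cluster volumes


def suprathreshold(t_map, t_threshold):
    """Set of voxel coordinates whose t value exceeds the threshold."""
    return {voxel for voxel, t in t_map.items() if t > t_threshold}


def conjunction(spatial_t_map, object_t_map, t_threshold):
    """Voxels that exceed the threshold in both statistical maps."""
    return suprathreshold(spatial_t_map, t_threshold) & suprathreshold(
        object_t_map, t_threshold
    )


def cluster_volume_mm3(n_voxels):
    """Volume of a cluster of n_voxels voxels."""
    return n_voxels * VOXEL_VOLUME_MM3
```

With this voxel size, the minimum clusters of five and eight voxels correspond to 135 and 216 mm³, respectively.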
Experiment 2
Subjects
Eleven new subjects (mean age, 21 years; five females) were recruited to participate in this study. Subjects gave informed consent before participating in a single ERP recording session. All subjects were right handed and had normal or corrected-to-normal vision.
Stimuli and task
In the “object-present” configuration, stimuli consisted of four Kanizsa inducers (circles with 90° sectors missing), each subtending 2.8° of visual angle (Fig. 2). The inducers were oriented and positioned such that they generated the perception of an illusory square (subtending 6.5° visual angle) centered about a continuously visible fixation point. In the “object-absent” configuration, another 90° sector was removed from each inducer, leaving a half-circle oriented either vertically or horizontally. With these stimuli, no illusory square was seen. All stimuli were white on a gray background.
In both configurations, the Kanizsa or half-circle figures were present at all times. The ERP-eliciting stimuli consisted of brief (100 ms) presentations of triangular wedges that filled in the missing sector of the Kanizsa inducers. These stimuli were presented one at a time (stimulus onset asynchrony, 400–600 ms) in random order to the different visual quadrants. The inner edge of each wedge stimulus was either straight (standards, p = 0.8) or slightly rounded (targets, p = 0.2). A pair of arrows just above or below the fixation point cued the subject to sustain attention to one of the four quadrants during blocks lasting 20 s each. As in experiment 1, subjects were instructed to respond with a right-hand button press, as quickly and accurately as possible, to targets in the attended quadrant only. Ten blocks, alternating between attend-UL and attend-UR conditions, were delivered during each of five runs. In a separate set of five runs, the cues alternated between the LL and LR quadrants, resulting in ∼500 stimuli delivered per subject per condition. The order of the runs (attend upper or lower field) was counterbalanced across the subjects.
Electrophysiological recordings
The ERP recording and analysis procedures were identical to those described for experiment 1. The same criteria were also used to identify trials with excessive eye movements or amplifier blocking. Approximately 13% of all trials were rejected and excluded from the average based on these criteria. Time-locked ERPs were averaged separately in response to wedge-onset stimuli (standards only) in each visual quadrant according to whether they were attended or unattended and whether the configuration was object-present (illusory square) or object-absent. Spatial and object attention effects were quantified in terms of mean amplitude within specified latency windows with respect to a 100 ms prestimulus baseline. For each quadrant, the mean amplitudes of the P1 (90–120 ms for all quadrants) and N1 (132–172 ms) components were entered into separate repeated-measures ANOVAs with factors of attention (attended vs unattended for spatial attention; object-present vs object-absent for object-based attention) and hemisphere (ipsilateral vs contralateral to the eliciting stimulus). In all analyses, P1 and N1 were measured as mean amplitudes averaged over the same cluster of 10 posterior electrode sites used in experiment 1.
Source localization
As in experiment 1, the scalp topographies of the group-averaged object and spatial attention difference waves for each quadrant were used to estimate the underlying brain sources of the attention-related N1 modulations using both BESA and LAURA. Experiment 2 did not include fMRI.
Results
Experiment 1
Behavioral results
On average, subjects correctly detected 92.1% of the targets in the IO configuration and 92.8% of the targets during the FO condition with mean reaction times (RTs) of 487 and 503 ms, respectively. Neither discrimination accuracy nor RTs differed significantly between targets in the UL, LL, UR, and LR quadrants (p > 0.35) nor between IO and FO configurations (p > 0.85).
ERP results
As in many previous studies (for review, see Hopfinger et al., 2004), the effects of spatial attention were evident as amplitude modulations of the early, sensory-evoked P1 (88–120 ms) and N1 (140–180 ms) components. In all quadrants, attended stimuli elicited significantly larger P1 and N1 amplitudes than the same stimuli when unattended (Fig. 3, Table 1). These attention effects were significantly larger over recording sites contralateral to the eliciting stimulus (Table 1, hemisphere × attention). ERPs elicited by attended stimuli did not differ as a function of the stimulus configuration for any quadrant. This was true for both the P1 component (IO vs FO: UL, F(1,13) = 2.03, p < 0.19; UR, F(1,13) = 1.87, p < 0.20; LL, F(1,13) = 0.93, p < 0.45; LR, F(1,13) = 1.11, p < 0.28) and the N1 component (IO vs FO: UL, F(1,13) = 2.17, p < 0.22; UR, F(1,13) = 2.90, p < 0.11; LL, F(1,13) = 1.07, p < 0.38; LR, F(1,13) = 1.01, p < 0.30). Accordingly, the attended ERP waveforms were collapsed over the two configurations in the analysis of spatial attention effects.
Object-based attention effects were evidenced by comparing N1 amplitudes elicited by unattended stimuli that formed part of the same object (IO condition) to the case when the stimulus belonged to a different object (FO condition) (Table 2). As in the case of spatial attention, these object-based N1 modulations were significant for each quadrant and were largest over the contralateral scalp. This finding of larger N1 amplitudes to unattended stimuli in the IO versus FO conditions but equivalent N1 amplitudes to attended stimuli in the two conditions was also reflected in a significant attention × configuration interaction in an overall ANOVA (F(1,13) = 8.62; p < 0.01). Unlike spatial attention, however, object-based selection did not significantly affect the amplitude of the P1 component as reflected in a nonsignificant attention × configuration interaction (F(1,13) = 2.88; p < 0.89) and in nonsignificant object effects for each quadrant tested separately (IO vs FO: UL, F(1,13) = 0.15, p < 0.71; UR, F(1,13) = 3.72, p < 0.08; LL, F(1,13) = 0.11, p < 0.75; LR, F(1,13) = 3.48, p < 0.09).
The distribution of attention within the attended object was tested by contrasting the amplitude of the object-based N1 modulations at contralateral electrode sites when the attended quadrant was situated horizontally, vertically, or diagonally with respect to the eliciting unattended quadrant. For all quadrants, these object-based attention effects (comparing IO and FO conditions) were not affected by the relative position of the attended and unattended quadrants (p > 0.11) (Table 3).
As shown in Figure 4, the scalp topographies of the N1 modulations produced by object-based and spatial attention were very similar, both having maximal amplitude over the posterior contralateral scalp. Although the degree of similarity varied somewhat according to the visual quadrant of the eliciting stimulus, a comparison of these N1 difference topographies using the method of McCarthy and Wood (1985) revealed no statistical difference between the object-based and spatial attention effects in any quadrant (p > 0.18, for all quadrants).
Source localization
Pairs of mirror-symmetrical dipoles were fit using BESA to the difference topographies of the spatial and object-based attention effects on N1 shown in Figure 4. In all quadrants, the spatial attention effect on N1 was well fit by a pair of ventrolateral dipoles in occipital cortex. The object-based N1 modulations were accounted for by similarly situated dipole pairs (see Table 4 for Talairach coordinates). Both sets of dipole models (spatial and object-based attention) accounted for >90% of the variance in scalp topography in each quadrant over the fitted time range (140–180 ms).
The linear distributed inverse solution (LAURA) was also used to estimate the neural sources underlying the spatial and object-based attention effects on N1. In all quadrants, a principal source associated with the N1 object attention effect was identified in the LOC of the hemisphere contralateral to the eliciting stimulus, overlapping with the dipolar sources modeled with BESA (Fig. 5).
fMRI results and coregistration with source analysis models
Spatial attention effects were calculated by comparing the BOLD signal elicited during blocks of attention to one visual quadrant to the signal elicited when the same quadrant was unattended. Attention-related enhancements of the BOLD signal were observed within several striate and extrastriate cortical regions including the inferior and middle occipital gyri, the fusiform gyrus, and portions of the inferior and superior parietal lobes. These activations, summarized in Table 5, were largest in the contralateral hemisphere.
Object-based attention effects were calculated by comparing the activations elicited within ROIs by unattended stimuli in the IO versus FO conditions. In all quadrants, this comparison yielded significant BOLD enhancements within discrete extrastriate cortical areas that included the fusiform gyrus, middle occipital gyrus, and the inferior parietal lobes of the contralateral hemisphere. In general, these object-based attention effects were less extensive than those observed for spatial attention and varied somewhat depending on the quadrant of stimulation (Table 5).
A conjunction analysis between the spatial attention activation maps and those of object attention was conducted to identify common areas of activation. Although the overlapping regions revealed in this analysis varied slightly across the four quadrants, a prominent activation in the region of the contralateral middle occipital gyrus (Brodmann's area 19) was found for all visual quadrants during both attention conditions. The Talairach coordinates of this region are within ±6 mm of the range of coordinates reported for area LOC in several previous fMRI studies (Malach et al., 1995; Grill-Spector et al., 1998; Grill-Spector, 2003).
To compare the anatomical locations of these activations with the sources of the N1 attention effects, the LAURA models and BESA dipole coordinates were transformed into Talairach space and coregistered with the fMRI maps. Figure 5 shows the close correspondence between the cortical regions identified by BESA and LAURA as the sources of the spatial and object attention effects on the N1 and those identified by fMRI as having common activation during both attention conditions. These methods converge to indicate that a region of the contralateral lateral occipital cortex (area LOC) is involved in mediating selection based on location as well as selection based on object integrity.
Experiment 2
Behavioral results
Target discrimination accuracy averaged 90.7% for the object-present condition and 91.9% for the object-absent configuration, a nonsignificant difference (p > 0.42). Mean RTs (502 and 517 ms for object-present and object-absent, respectively) did not differ significantly between stimulus configurations nor among the quadrants (p > 0.65).
ERP results
As in experiment 1, spatial attention effects were calculated by subtracting the ERPs elicited by unattended wedge-onset stimuli (object-absent configuration only) from the ERPs elicited by the same stimuli when attended (both configurations). Attended ERPs (P1 and N1 components) were not affected by the stimulus configuration (object-present vs object-absent) for any quadrant (P1: UL, F(1,10) = 3.01, p < 0.09; UR, F(1,10) = 2.07, p < 0.18; LL, F(1,10) = 1.03, p < 0.91; LR, F(1,10) = 0.83, p < 1.01; N1: UL, F(1,10) = 2.75, p < 0.15; UR, F(1,10) = 1.96, p < 0.20; LL, F(1,10) = 1.22, p < 0.78; LR, F(1,10) = 2.00, p < 0.18). As in experiment 1, the effects of spatial attention were evident as amplitude enhancements of the P1 (90–120 ms) and N1 (132–172 ms) components that were largest over the contralateral scalp (Fig. 6, Table 6).
Object-based attention effects were assessed by comparing unattended ERPs in the object-present versus object-absent configurations, as in experiment 1. This comparison did not yield any significant modulation of the P1 component for any quadrant (UL, F(1,10) = 1.05, p < 0.63; UR, F(1,10) = 1.27, p < 0.48; LL, F(1,10) = 0.62, p < 0.51; LR, F(1,10) = 2.12, p < 0.12), but the N1 amplitudes were significantly larger in the illusory object-present configuration than in the object-absent configuration. As in experiment 1, these effects were reflected in a significant attention × configuration interaction for the N1 (F(1,10) = 7.29; p < 0.02) but not for the P1 (F(1,10) = 2.05; p < 0.31) component. As with spatial attention, the object-based attention effects on N1 were significantly larger over the contralateral scalp (Fig. 6, Table 7). No significant difference in the amplitude of the N1 modulations produced by object-based attention was obtained at any location as a function of whether attention was directed to the quadrant adjacent (horizontal or vertical) or diagonal with respect to the eliciting unattended stimulus (p > 0.23, for all quadrants) (Table 8).
In all quadrants, the scalp topographies of the object-based and spatial-attention N1 difference waves were very similar to one another, both having maximal amplitudes over the contralateral occipital scalp (Fig. 7). The procedure of McCarthy and Wood (1985) was used to statistically compare these topographical distributions, which did not differ (p > 0.17) for any of the quadrants.
Source localization
Inverse dipole modeling was first performed using the BESA algorithm. Separate dipole models were calculated for each of the four quadrants over time windows corresponding with the peak N1 component (see Materials and Methods). In all quadrants, a pair of symmetrically constrained dipoles in ventrolateral occipital cortex accounted for >90% of the voltage topography of both the spatial- and object-based N1 modulations (Table 9). Second, a linear distributed inverse solution approach was performed using LAURA (Grave de Peralta Menendez et al., 2001). LAURA revealed a prominent source in LOC for both the spatial and object-based attention N1 modulations in all quadrants. This source corresponded very closely to the dipole locations modeled with BESA and was largest in the contralateral hemisphere (Fig. 8).
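The statement that a dipole pair accounted for more than 90% of the voltage topography can be read as a goodness-of-fit measure. A minimal sketch follows, assuming the fit is expressed as one minus the residual variance (squared residuals summed over electrodes, divided by the summed squared measured voltages), as is common in dipole modeling; the function name and the example values are illustrative only.

```python
def explained_variance(measured_uv, modeled_uv):
    """Proportion of the measured scalp topography explained by a source model.

    measured_uv : recorded voltages, one per electrode (hypothetical data)
    modeled_uv  : voltages predicted by the dipole model at the same electrodes
    """
    if len(measured_uv) != len(modeled_uv):
        raise ValueError("topographies must cover the same electrodes")
    residual = sum((m - p) ** 2 for m, p in zip(measured_uv, modeled_uv))
    total = sum(m * m for m in measured_uv)
    if total == 0:
        raise ValueError("measured topography has no signal")
    # Residual variance is residual / total; goodness of fit is its complement.
    return 1.0 - residual / total
```

For example, `explained_variance([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 3.0])` gives a value above 0.90, which under this assumption would meet the criterion quoted above.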
Discussion
The data reported here provide physiological evidence from fMRI and ERP recordings that directing spatial attention to one part of an object facilitates the processing of the entire object. This was observed both for a real object having the shape of a uniform square (experiment 1) and for an illusory square defined by Kanizsa inducers at its corners (experiment 2). For both types of stimuli, object-selective attention was manifested in amplitude modulations of the N1 component (latency, 140–180 ms) of the visual ERP elicited by brief offsets of the corners of the squares. When attention was directed to offsets at one corner, the N1 amplitudes elicited by offsets at the other, unattended corners were larger when the square was an intact, perceptual object than when it was fragmented (experiment 1) or made to disappear by modifying the Kanizsa inducers (experiment 2). This object-selective N1 enhancement had the same timing and source localization as the spatially selective increase in N1 amplitude elicited by stimuli at the attended corner. Both the object-based and space-based N1 amplitude increases were found to arise from a common cortical source in the LOC, a finding supported by converging evidence obtained from a parallel fMRI experiment while subjects engaged in the same task. These results point to an important role for spatial attention in strengthening the sensory representations of entire objects and thus contributing to object-based attention.
Previous physiological studies comparing the neural bases of spatial and object-selective attention by means of fMRI (Müller and Kleinschmidt, 2003) and ERP recordings (He et al., 2004; Martinez et al., 2006, 2007) used variations of the classic paradigm of Egly et al. (1994). In this design, two rectangles are presented, and attention is cued to one end of one of the rectangles. Spatial attention is evidenced by findings of faster reaction times to a subsequent target at the cued location than at any of the other locations in the display. Object-based attention is inferred by findings of faster RTs to a target at the uncued end of the cued rectangle than to an equidistant target belonging to the uncued rectangle. Studies by Avrahami (1999) and Marino and Scholl (2005), however, found that similar attention effects could be obtained when the short ends of the rectangles were removed, so that the display consisted of an array of parallel lines rather than full-fledged enclosed objects. It was proposed that the flow of attention was guided from the cued location by a directional line tracing operation along the axis of the parallel lines rather than by true object-selective attention. These findings call into question whether the “same object effects” observed in the aforementioned physiological studies using the Egly-type design were truly a consequence of object-based selection of the entire bounded form.
In the current study, we compared the neural bases of spatial and object-based attention using symmetrical (square) objects so that attention would not be guided from the attended location by simple directional cues such as parallel lines. Importantly, for both real and illusory object forms, we found that the enhancement of the N1 amplitude to unattended corner stimuli in the object versus no-object conditions did not differ as a function of whether the unattended corner was situated horizontally, vertically, or diagonally with respect to the attended corner. This indicates that focusing attention on one part of the square resulted in selection of the entire square, even its most distant corner. Whether this object-based selection is achieved through a systematic tracing of the entire boundary or by a filling in of the area within the object's boundary is a matter for further investigation. In any case, the present results extend our previous findings (Martinez et al., 2006, 2007) by showing that a common pattern of ERP modulation (enhancement of the N1 component) and fMRI activation (in area LOC) is shared by spatial attention and by bona fide object-based attention.
The object-based modulations of the N1 component were virtually identical for the illusory square produced by Kanizsa inducers and for the real square produced by a luminance increment. A similar equivalence was observed previously for ERP effects in an Egly-style paradigm using illusory and real rectangles (Martinez et al., 2007). The present findings with the illusory square strengthen the case that the observed N1 modulations are associated with true object-based selection that occurs at a level of processing in which perceptual object forms are represented. Previous studies have found that the perception of both illusory and real object forms is associated with enlarged N1 components in the 140–200 ms range (Pegna et al., 2002; Proverbio and Zani, 2002), the neural generators of which have been localized to area LOC (Murray et al., 2002, 2004, 2006; Halgren et al., 2003). These findings are consistent with a broad range of evidence from fMRI (for review, see Grill-Spector, 2002; Malach et al., 2002) and ERP (Doniger et al., 2001; Sehatpour et al., 2006) studies, which have implicated the LOC area as being critically involved in the initial encoding and recognition of objects.
Although the present results and those of previous studies (Müller and Kleinschmidt, 2003; He et al., 2004; Martinez et al., 2006, 2007) have shown substantial overlap between the neural systems underlying spatial and object-based attention, these systems were by no means identical. In the first place, spatially mediated selection began earlier than the object-selective effects and was first evident as amplitude modulations of the P1 component (90 ms after stimulus onset), whereas unattended stimuli that formed part of the attended object only elicited amplitude enhancements of the subsequent N1 component. This finding is in line with previous studies that have ascribed separate roles to the P1 and N1 components in spatial attention, with the P1 reflecting an early stage that suppresses unattended inputs and the N1 reflecting a subsequent stage at which relevant inputs are discriminated in detail (Luck et al., 1994). Second, the spread of spatial attention throughout the attended objects' boundaries did not occur in a uniform manner. Attended stimuli elicited significantly larger N1 amplitudes than did unattended stimuli, regardless of whether the unattended stimulus formed part of the same or a different object. This is consistent with numerous studies reporting a gradient of attention-related selectivity as a function of distance from the attended location (Müller and Kleinschmidt, 2003). An object-based attentional gradient, as measured by N1 amplitudes elicited by stimuli at unattended locations, was not obtained in either study reported here when adjacent versus diagonal quadrants were compared, possibly because the distances separating these quadrants were not large enough. Finally, although fMRI and source localization converged to indicate a source in cortical area LOC for both the spatial and object-guided N1 modulations, the fMRI activations during spatial attention were significantly larger and more widespread than those associated with object-based selection (see also Müller and Kleinschmidt, 2003; Martinez et al., 2006).
Two major hypotheses have been put forward to account for the role of spatial attention in the selection of objects, which may be characterized as “object-guided spatial attention” (Weber et al., 1997; Davis et al., 2000) and “priority setting” (Shomstein and Yantis, 2002), respectively. According to the priority setting view, object-specific benefits occur in trial-by-trial cueing tasks because attention is switched from the cued (attended) location to uncued locations within the same object with a higher priority than to uncued locations within a different object. Because the present task required focal attention to only one location, however, and no other locations were relevant, it seems unlikely that a prioritized switching of attention to the other locations would take place. Instead, the object-selective modulations of ERPs and BOLD signals that were observed here are highly compatible with an object-guided spatial attention mechanism in which the deployment of spatial attention to one part of an object produces a graded facilitation of the sensory processing of the entire object, whether its boundaries are defined by real or illusory contours. This sensory enhancement was manifested by an increased amplitude of the N1 component, which was localized to the LOC where object forms are initially encoded and represented. We hypothesize that the spotlight of spatial attention gates sensory input into area LOC via an earlier filtering mechanism indexed by the P1 component. At this earlier level, sensory input is facilitated solely according to its location, but in LOC, the facilitation is guided by object representations as well as by location. The facilitation of attended object representations in LOC may reinforce their perceptual integrity and underlie the performance benefits manifested in the same object advantage.
Footnotes
- This work was supported by National Eye Institute Grant EY016984. The Cartool software was programmed by Denis Brunet (Functional Brain Mapping Laboratory, Geneva, Switzerland) and is supported by the Center for Biomedical Imaging of Geneva and Lausanne, Switzerland. We thank Matt Marlow for technical assistance.
- Correspondence should be addressed to Dr. Antigona Martinez, Department of Neurosciences–0608, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0608. antigona{at}ucsd.edu