Abstract
Traditional explanations of our limited attentional capacity focus on our ability to direct attention to multiple items. We ask whether this difficulty in simultaneously attending to multiple items stems from an inability to effectively represent multiple attended items. Although attending to one of a set of neighboring stimuli can isolate it from competitive interactions in visual cortex, no such isolation should occur if multiple competing items are attended. Indeed, we find that attention is ineffective at enhancing blood oxygen level-dependent signal in visual cortical area V4 when it is directed to three stimuli simultaneously, but only when those three stimuli compete in visual cortex. This suggests that competition may prevent attention from acting as effectively on representations of multiple items as it does on representations of a single item. In contrast to traditional explanations that posit limits in the sources of attentional control, we show that mechanisms at the sites of stimulus representation may also impose limits on our ability to attend to multiple items simultaneously.
Introduction
Attending to multiple items simultaneously is more difficult than attending to a single item (Eriksen and St. James, 1986; Alvarez and Franconeri, 2007). Traditional cognitive models of attentional limitations have ascribed these limits to inadequate attentional resolution (Intriligator and Cavanagh, 2001) or resources (Alvarez and Franconeri, 2007), putatively subserved by mechanisms in frontoparietal cortex (Intriligator and Cavanagh, 2001; Lavie and Robertson, 2001; Mitchell and Cusack, 2008; Xu and Chun, 2009). Many frontoparietal attentional mechanisms act by altering activity in visual cortex, however, raising the possibility that the functional architecture of the visual cortex may limit the effectiveness with which sources of attentional control can operate.
It is well established that local spatial interactions among cells in visual cortex can alter the fidelity of the neural representation of a stimulus in the presence of nearby items (Blakemore and Tobin, 1972; DeAngelis et al., 1992; Desimone and Duncan, 1995; Pelli, 2008). For example, two stimuli presented simultaneously within the receptive field (RF) of a cell evoke less activity than the summed activity of each stimulus presented alone (Chelazzi et al., 1998; Reynolds et al., 1999). Similarly, four neighboring stimuli evoke lower blood oxygen level-dependent (BOLD) responses in visual cortical area V4 when presented simultaneously than when presented sequentially (Kastner et al., 1998). The magnitude of this difference varies with the distance among the stimuli (Kastner et al., 2001) and scales with RF size across visual cortex (Kastner et al., 2001; Beck and Kastner, 2007), suggesting that inhibitory interactions among multiple stimuli are strongest when they are likely to fall within the same RFs. The result of these interstimulus interactions is that representations of stimuli presented simultaneously are weaker (i.e., signaled less clearly) than those of stimuli presented alone.
Attention can alter competitive interactions in visual cortex, resolving them in favor of the attended stimulus (Kastner et al., 1998; Reynolds et al., 1999). Specifically, attention is thought to “bias” these competitive interactions by increasing the inhibitory effects of the attended stimulus on unattended stimuli and decreasing the inhibitory effects of unattended stimuli on the attended stimulus. The end result is a cell that strongly represents the attended stimulus.
Now consider the case in which attention must be directed to multiple competing stimuli. We would predict that attention would enhance signal to all competing items. No single stimulus would receive a boost that would enable it to dominate the competitive process; instead, signals from cells whose RFs contained more than one attended item would continue to reflect the contribution of all of the simultaneously attended items. Because attention would be unable to reduce the inhibitory interactions among multiple attended items, the representations of attended items would be weaker than would be the case if a single item received attention. In other words, even if top-down attentional “resources” were adequately directed toward multiple items, competitive interactions in visual cortex would limit the effectiveness with which the attentional signals could enhance their representation. Here we ask whether limits in our ability to attend to multiple items are determined, at least in part, by the degree to which the multiple attended items competitively interact in visual cortex.
Materials and Methods
Subjects.
We tested 10 volunteers (eight males; ages 24–37 years) in experiment 1 and seven volunteers (five males; ages 26–34 years) in experiment 2, all with normal or corrected-to-normal visual acuity. Participants gave written informed consent to participate in this study, which was approved by the Institutional Review Board of the University of Illinois at Urbana-Champaign, and were paid for their participation.
Stimuli.
In experiment 1, five Gabor patches (σ of Gaussian envelope, 0.52° of visual angle; each ∼1.5° × 1.5° in size) presented in white, blue, cyan, red, green, yellow, and purple served as visual stimuli. The Gabor patches consisted of 0.5–7.5 visible grating cycles with the following possible orientation and wavelength combinations: 0° (vertical) and 3.0°, 0° and 0.6°, 90° and 0.6°, 0° and 0.2°, and 90° and 0.2°. We centered each Gabor patch in one of five squares arranged in a grid (gray on a black background) that was present throughout the experiment in the upper right visual field (Fig. 1). Five 1.8°-sided squares comprised this grid. The two uppermost squares were centered 4.6° from the horizontal meridian and 0.99° and 2.8° from the vertical meridian, respectively. The two central-most squares were centered 2.8° from the horizontal meridian and 2.8° and 4.6° from the vertical meridian, respectively. The lowermost square was centered 0.99° above the horizontal meridian and 4.6° from the vertical meridian.
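The Gabor parameters above can be illustrated with a minimal sketch of a single patch: a cosine carrier multiplied by a Gaussian envelope. The function name `gabor_patch`, the pixel density of 40 pixels per degree, and the 61-pixel patch width (about 1.5°) are assumptions made for illustration and are not taken from the stimulus code used in the study. The sketch also assumes that 0° (vertical) means vertical bars, so that luminance varies along the horizontal axis, and it ignores color. At a patch width of about 1.5°, wavelengths of 3.0°, 0.6°, and 0.2° correspond to roughly 0.5, 2.5, and 7.5 visible cycles, matching the range given above.

```python
import math

def gabor_patch(size_px, px_per_deg, sigma_deg, wavelength_deg, orientation_deg):
    """Return a size_px x size_px grid of luminance modulations in [-1, 1].

    Hypothetical helper: all spatial quantities are in degrees of visual angle.
    """
    theta = math.radians(orientation_deg)
    half = (size_px - 1) / 2.0
    patch = []
    for row in range(size_px):
        y = (row - half) / px_per_deg
        line = []
        for col in range(size_px):
            x = (col - half) / px_per_deg
            # Axis along which luminance varies (assumed: 0 deg gives vertical bars).
            u = x * math.cos(theta) + y * math.sin(theta)
            carrier = math.cos(2.0 * math.pi * u / wavelength_deg)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma_deg ** 2))
            line.append(carrier * envelope)
        patch.append(line)
    return patch

# One of the listed combinations: 0 deg orientation, 0.6 deg wavelength, sigma 0.52 deg.
patch = gabor_patch(size_px=61, px_per_deg=40, sigma_deg=0.52,
                    wavelength_deg=0.6, orientation_deg=0)
```

The returned values are contrast modulations around a mean level; mapping them to a display color would be a separate step.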
The five Gabor patches appeared either sequentially (noncompeting) or simultaneously (potentially competing). During sequential presentation, each of the five patches appeared in isolation for 250 ms. The three patches closest to fixation always appeared (in a random order) before the two patches farthest from fixation (the uppermost of these patches was always presented before the lowermost). When presented simultaneously, the five patches appeared together for 250 ms. Onset times for the simultaneous items jittered such that the average stimulus onset asynchrony (SOA) was 1.25 s (range, 750 ms to 1.75 s). In both conditions, each stimulus appeared for 250 ms, and each square in the grid was filled once and only once every 1.25 s on average. Total visual stimulation at each location was therefore equated in the two conditions.
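As an illustration of the jittered timing, the following sketch draws stimulus onset asynchronies from the stated range (750 ms to 1.75 s), whose midpoint is the stated average of 1.25 s. The uniform distribution, the function name `jittered_onsets`, and the seed are assumptions; the text does not specify how the jitter was sampled, and the sketch does not constrain the total duration of a block.

```python
import random

def jittered_onsets(n_trials, soa_min=0.75, soa_max=1.75, seed=0):
    """Return onset times (s) for n_trials displays with uniformly jittered SOAs.

    Hypothetical helper; the uniform distribution is an assumption.
    """
    rng = random.Random(seed)
    onsets = []
    t = 0.0
    for _ in range(n_trials):
        onsets.append(t)
        t += rng.uniform(soa_min, soa_max)
    return onsets

onsets = jittered_onsets(16)
soas = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
```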
In experiment 2, four shapes (hearts, squares, circles, or triangles) were crossed with four colors (red, blue, yellow, or green) and four textures (solid, horizontal stripes, vertical stripes, and diagonal stripes) to create 64 different stimuli. These were again centered within individual squares of a gray grid centered in the upper right visual field. The sides of these squares extended 2° of visual angle and were positioned such that one of the squares crossed the vertical meridian (centered 1° to the left of the vertical meridian and 4.89° above the horizontal meridian). Two squares were centered 1° to the right of the vertical meridian and 5.89° and 3.89° above the horizontal meridian, respectively. The lowermost right corner of this third square and the center of the display defined a line through which these three squares were reflected. This reflection produced three new squares that matched the eccentricity of the upper three squares (see Fig. 3). All six shape stimuli presented on each trial appeared simultaneously for 250 ms. We jittered their onsets between 1.25 and 1.75 s, to produce an average SOA of 1000 ms.
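The 64-item stimulus set is the full crossing of the three feature dimensions (4 × 4 × 4). A minimal sketch of that crossing follows; the dictionary layout and variable names are illustrative assumptions.

```python
from itertools import product

shapes = ["heart", "square", "circle", "triangle"]
colors = ["red", "blue", "yellow", "green"]
textures = ["solid", "horizontal stripes", "vertical stripes", "diagonal stripes"]

# Every shape paired with every color and every texture.
stimuli = [
    {"shape": s, "color": c, "texture": t}
    for s, c, t in product(shapes, colors, textures)
]
```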
Stimulus presentation was controlled using Vision Egg stimulus presentation software (Straw, 2008) running under Windows XP Professional (Microsoft) on a Pentium 4 Dell personal computer. In experiment 1, stimuli were presented through goggles designed by Magnetic Resonance Technologies. In experiment 2, two participants viewed these stimuli through goggles designed by Magnetic Resonance Technologies. The remaining four participants viewed the stimuli on a back-projection system run through a Proxima C410 digital projector (InFocus).
Task and trial structure.
In experiment 1, participants searched the Gabor stimuli for one of two target conjunctions, each pairing a color with an orientation–spatial frequency combination, randomly selected from the sets of colors and orientation–frequency combinations described above. The targets changed every four runs. They appeared at the beginning of each run, alternating at a rate of 1 Hz, to help participants remember them. Critically, we asked participants to search for the conjunctions in either one or three of the five grid squares. In the attend-three condition, they searched all three of the centermost grid squares. In the attend-one condition, they searched only one of the three centermost grid squares. The attend-one location was randomly selected for each participant and remained constant for that participant throughout the experiment. The attention condition (attend-one or attend-three) remained constant through each run and occurred in one of two four-run sequences (A1-A3-A3-A1 or A3-A1-A1-A3), randomly chosen every four runs. At the beginning of each run, we instructed participants to attend to either one or three locations. The alternating target samples appeared only in the attend-one location at the beginning of each attend-one run but appeared in all of the attend-three locations at the beginning of each attend-three run. As a final aid to our participants' memories, we placed a red digit (40 point font), present throughout the run, 3° to the left of fixation to indicate the attention condition: “1” during the attend-one condition and “3” during the attend-three condition.
For the purpose of experiment 1 data analysis, we considered each 1.25 s period in which all five grid squares were stimulated a “trial.” Each 163 s run consisted of four 20 s blocks (16 trials) of either sequential or simultaneous presentations interleaved with 14 s blank periods. In addition, each run began and ended with a 16.5 s blank period. The order of sequential (SEQ) and simultaneous (SIM) blocks (SEQ-SIM-SIM-SEQ or SIM-SEQ-SEQ-SIM) randomly varied on each run. Each participant completed 12 runs.
In experiment 2, participants searched one or three of these grid locations for one of two shape, color, and texture conjunction targets. The targets changed every four runs. Participants searched for the conjunction targets in either one or three grid squares, under conditions in which competition was either likely or unlikely to occur among stimuli. Competition among the stimuli was less likely when the three squares were distributed across the vertical meridian but likely when those three squares were reflected such that they all fell within the right visual field. In the attend-one conditions, participants searched either the leftmost square [in the left visual field (LVF)] or lowermost square [in the right visual field (RVF)] in the display; these squares were equidistant from fixation. In the attend-three conditions, participants searched one of these two squares as well as the two adjacent squares. The attend-one and attend-three conditions, attention within visual field (competition likely) and attention distributed across visual fields (competition unlikely), alternated across groups of four runs such that each run contained four blocks of a single combination. Attend-three runs and attend-one runs always alternated, and visual field conditions were always nested to ensure that each attentional condition followed the other an equal number of times. We cued participants as to the identity of the target items, the locations to be attended in each run, and the current attention condition in the same manner described for experiment 1.
For the purposes of experiment 2 data analysis, we considered each display of shapes to be a trial. In each 134 s run, the six shape stimuli appeared in four 16 s blocks (16 trials), interleaved with 14 s blank intervals. In addition, each run began and ended with a 16.5 s blank period. Each participant completed 12 runs.
To maintain alertness, participants in both experiments performed a rapid serial visual presentation (RSVP) task during the “blank” presentation periods. Specifically, they searched for an “a” in a 4 Hz stream of digits (1–9) and ASCII symbols (%, &, *, #) presented in the center of the screen, at fixation, in a white 30 point font. Although participants only performed the RSVP task when the Gabors were not on the screen, the RSVP stream was present throughout the run, with the exception of a 2 s period before and after each visual stimulation block used to cue participants to move their attention to or from the peripheral presentation grid. During the attend-letter condition, the red digit near fixation was “0”.
In both experiments, targets occurred in 20% of the trials. To hold both the number of targets presented and the probability of targets occurring in the attended locations equal across task conditions, we restricted targets to the respective attended locations during attend-one and attend-three task conditions. This arrangement, however, did not require participants to change search strategies for attend-one and attend-three trials. To encourage participants to restrict their attention to the attended location during attend-one trials, we forced the nontarget stimuli that occurred in each of the attend-three locations to contain one of the four target-defining features on all trials in both the attend-one and attend-three conditions. The presence of these lures at nontarget locations both made it advantageous for participants to restrict their attention to the attend-one location during the attend-one condition and held the frequency of target-defining features in potential target locations at 100% across both attentional conditions. We note that, in both experiments, the number of target items that occurred in each “attend” location was not equal across conditions: in attend-one conditions, a target appeared at the attend-one location on 20% of trials, whereas in attend-three conditions, a target appeared in that location on only 6.67% of trials. Because our analysis focuses on BOLD response in V4 regions to stimulation at the attend-one locations (see below), any generalized reduction we observe in V4 response during attend-three trials could be attributed to the lower incidence of targets in the attend-one location under attend-three conditions. This differential incidence of targets is constant across levels of potential competition in both experiments, however, and so cannot be the cause of any competition-specific reductions we observe in BOLD under attend-three conditions.
We note also that any other influences of the targets on the BOLD activity of interest should have been minimized by the inclusion in our general linear model of a regressor to account for activity associated with target hits and false alarms. This step is further described below (see Data acquisition and analysis).
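The per-location target rates noted above follow from dividing the 20% target rate among the attended locations. The sketch below reproduces that arithmetic under the assumption that targets were equally likely at each of the three attended locations, which is what the 6.67% figure implies; the variable names are illustrative.

```python
from fractions import Fraction

p_target_trial = Fraction(20, 100)  # targets occur on 20% of trials
n_locations = {"attend-one": 1, "attend-three": 3}

# Probability that a target falls at one particular attended location,
# assuming targets are spread evenly over the attended locations.
per_location = {cond: p_target_trial / n for cond, n in n_locations.items()}
per_location_pct = {cond: round(float(p) * 100, 2) for cond, p in per_location.items()}
```

Exact fractions are used so that the one-third split is not subject to rounding before conversion to a percentage.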
In all conditions, participants maintained fixation on the RSVP stream and indicated the presence of a task-relevant target by pressing a button with their right index finger. Responses were collected using USB optically isolated 10-button response boxes (Rowland Institute at Harvard, Cambridge, MA). Data from one participant in experiment 2, whose target false-alarm rate (7%) was >2 SDs above the mean false-alarm rate (2.5%), were excluded from further analysis.
Eye-movement monitoring.
During functional magnetic resonance imaging (fMRI) scanning, eye movements from all participants in experiment 1 and two participants in experiment 2 were monitored using the View Point eye tracker built into our goggle system. We excluded data from one participant in experiment 1 because eye-tracking measures indicated that his eyes deviated from fixation on >40% of trials; this number of eye movements was >2 SDs above the mean across participants (15%). Statistical analysis indicated that eye movements occurred equally across all trial conditions for the remaining nine participants in experiment 1 and the two participants monitored during experiment 2.
Training.
Participants in both experiments also completed 12 (experiment 1) or eight (experiment 2) runs of training trials the day before undergoing fMRI scanning. We trained participants to maintain fixation on the RSVP stream throughout the blank and visual presentation blocks by monitoring their eye movements and providing feedback whenever their gaze deviated from fixation by >1° of visual angle for >150 ms. Feedback was a 512 Hz, 80 dB tone that persisted until participants returned their gaze to fixation. To help participants remain engaged during training, we increased the Gabor target rate to 40% of trials and decreased the duration of the blank periods to 5 s. Stimuli were presented using Vision Egg (Straw, 2008) software under Windows 2000 (Microsoft) on a Pentium 4 Dell personal computer. Eye movements were monitored using a head-mounted Eye-Link II tracker. In all other ways, we held training conditions identical to those described for fMRI testing.
Localizer stimulations.
We also acquired data that allowed us to localize the area of visual cortex sensitive to stimulation in the attend-one location(s) for both experiments. Before performing the experimental task, participants passively viewed Gabor stimuli presented in the attend-one location(s) for blocks of 13.75 s, preceded and followed by blank intervals of the same length. An RSVP stream (SOA, 250 ms) of digits, letters, and symbols appeared at fixation throughout the run. In experiment 1, participants 1–5 viewed five blocks of Gabor stimuli during a single echo planar image (EPI) run. All remaining participants in experiments 1 and 2 viewed 10 blocks of Gabor/shape stimuli during two EPI runs. The resulting regions of interest (ROIs) from this analysis were then aligned with the maps from each participant's retinotopy session to identify the V4 ROI (see retinotopic mapping and region of interest analysis procedures, available at www.jneurosci.org as supplemental material).
Data acquisition and analysis.
Imaging data were acquired in a 3 T head-only scanner (Allegra; Siemens) using a standard head coil. For experiment 1, participants 1–4, we collected EPIs from the entire brain using a gradient echo sequence [repetition time (TR), 3 s; echo time (TE), 25 ms; flip angle, 90°; field of view, 192 × 192 mm; voxel size, 3 × 3 × 3 mm; 1 mm gap] in 48 ascending coronal slices. We collected 65 repetitions for the localizer run and 55 repetitions for the 12 experimental runs. We discarded one run from participant three because of a technical error during scanning. To assist in registering images to anatomical space, we collected T2-weighted anatomical images (TR, 9100 ms; TE, 96 ms; flip angle, 150°; 128 × 128 matrix) in the same coronal planes used for EPI acquisition. For the six remaining participants in experiment 1 and all participants in experiment 2, we acquired higher-resolution EPIs (TR, 2 s; TE, 20 ms; flip angle, 90°; field of view, 160 × 160 mm; voxel size, 2.5 × 2.5 × 3 mm; 0 mm gap) in 20 ascending coronal slices starting at the posterior pole. We collected 12 experimental runs of 83 repetitions and 180 repetitions of the localizer run for the remaining six participants in experiment 1, with the exception of participant 5, from whom we collected 90 repetitions of the localizer run. We collected 12 experimental runs of 75 repetitions and two localizer runs of 90 repetitions for all participants in experiment 2. To assist in registering EPIs to anatomical space, we collected T2-weighted anatomical images (TR, 9100 ms; TE, 96 ms; flip angle, 150°; 128 × 128 matrix) with 49 coronal slices aligned at the posterior pole to EPI slices from the same session. For all participants tested, we collected multiple high-resolution T1 anatomical images during retinotopy scanning (for retinotopic mapping procedures, see supplemental data, available at www.jneurosci.org as supplemental material), to which we registered our EPIs.
We used tools from the FMRIB (Oxford University Centre for Functional MRI of the Brain) Software Library (FSL) to analyze our functional data. Data were motion corrected using McFLIRT [FSL 3.3 (Jenkinson et al., 2002; Smith et al., 2004)]. We used FEAT (FMRI Expert Analysis Tool) version 5.2 [FSL 3.3 (Woolrich et al., 2001; Smith et al., 2004)] to submit functional data from individual runs to multiple regression analysis.
To analyze our data from experiment 1, we included two regressors of interest in the analysis of each run. Models for attend-one runs included square-wave functions matching the time course of attend-one sequential and attend-one simultaneous blocks. Models for attend-three runs included square-wave functions matching the time course of attend-three sequential and attend-three simultaneous blocks. These square-waves were convolved with a Gaussian model of the hemodynamic response function (HRF) (phase, 0; SD, 3 s; mean lag, 6 s). Because we were interested in comparing conditions in which participants attended all three locations versus a single location, we also included an event-related regressor of Gabor target hits and false alarms, convolved with a double-gamma model of the HRF (phase, 0 s) to model out those trials in which attention collapsed around a perceived or actual target. The resulting statistical maps were registered into the participant's individual anatomical space and into standard space using FLIRT (Jenkinson et al., 2002).
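The construction of a block regressor described above (a square wave convolved with a Gaussian HRF with an SD of 3 s and a mean lag of 6 s) can be sketched as follows. This is an illustrative approximation and not FSL's implementation: the 24 s kernel length, the unit-sum normalization, the helper names, and the block onsets in the example are all assumptions. The TR of 2 s and the 83 volumes match the higher-resolution experiment 1 runs.

```python
import math

def gaussian_hrf(tr, sd=3.0, lag=6.0, duration=24.0):
    """Gaussian HRF sampled every tr seconds, normalized to sum to 1 (assumed)."""
    n = int(round(duration / tr)) + 1
    kernel = [math.exp(-((i * tr - lag) ** 2) / (2.0 * sd ** 2)) for i in range(n)]
    total = sum(kernel)
    return [k / total for k in kernel]

def boxcar(n_vols, tr, blocks):
    """Square wave: 1.0 for volumes acquired during any (onset_s, duration_s) block."""
    return [
        1.0 if any(on <= v * tr < on + dur for on, dur in blocks) else 0.0
        for v in range(n_vols)
    ]

def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(signal):
                out[i + j] += s * k
    return out

tr = 2.0
hrf = gaussian_hrf(tr)
# Hypothetical block onsets (s); the actual run timing is described in the text.
square_wave = boxcar(n_vols=83, tr=tr, blocks=[(16.0, 20.0), (50.0, 20.0)])
regressor = convolve(square_wave, hrf)
```

The same construction would apply to the experiment 2 and localizer regressors, which use the same Gaussian parameters.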
To analyze our data from experiment 2, we defined a single square-wave regressor of interest in the analysis of each run. These square waves were convolved with a Gaussian model of the HRF (phase, 0; SD, 3 s; mean lag, 6 s) to generate idealized response functions. This regressor corresponded to the periods in which participants attended prespecified locations. To eliminate any impact on our data of attention collapsing around a perceived or actual target, we included an event-related regressor of shape-target hits and false alarms. This regressor was convolved with a double-gamma model of the HRF (phase, 0 s). The resulting statistical maps were registered into the participant's individual anatomical space and into standard space using FLIRT (Jenkinson et al., 2002).
To extract the region of visual cortex most responsive to stimulation of the attend-one location(s) in each experiment, we fit a single regressor of interest, a square-wave function matching the onset and offset of Gabor patch stimulation, to the data collected in our localizer run. This square wave was convolved with a Gaussian model of the hemodynamic response (phase, 0; SD, 3 s; mean lag, 6 s) to generate the idealized response function. Six motion correction parameters (three translations and three rotations) estimated with McFLIRT were included as regressors of no interest. The resulting statistical maps were registered into the participant's individual anatomical space and into standard space using the FMRIB Image Registration Tool FLIRT (Jenkinson et al., 2002). We used Freesurfer (Dale et al., 1999; Fischl et al., 1999, 2001; Ségonne et al., 2004) to register each participant's FSL analysis results into their individual retinotopic space to identify V4 for each participant in each experiment (for more detail, see retinotopic mapping and region of interest analysis procedures, available at www.jneurosci.org as supplemental material).
To identify ROIs in posterior parietal cortex (PPC) for each subject, we subjected contrast parameter estimates (mapped to standard space during first-level analysis) for each experimentally relevant condition to a fixed-effects higher-level analysis using FLAME (FMRIB Local Analysis of Mixed Effects) (Beckman, 2003; Woolrich et al., 2004), part of FEAT version 5.2 [FSL 3.3 (Woolrich et al., 2001; Smith et al., 2004)]. Specifically, for each experiment, we were interested in voxels in the parietal cortex that were significantly activated by all four experimental conditions of the experiment (contrast, [1 1 1 1]). We identified the voxels with the peak contrast value in the right and left PPC and included in each ROI all contiguous voxels whose activation was >66% of the peak activation.
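The peak-based ROI rule (the peak voxel plus all contiguous voxels exceeding 66% of the peak value) can be sketched as a flood fill from the peak. The function name `roi_from_peak`, the dictionary representation of the volume, and the use of face-adjacent (6-neighbor) contiguity are assumptions; the text does not state which connectivity criterion was applied. The toy values exist only to show the behavior of the rule.

```python
from collections import deque

def roi_from_peak(volume, frac=0.66):
    """Return the set of voxels connected to the peak whose value exceeds frac * peak.

    volume: dict mapping (x, y, z) voxel coordinates to contrast values.
    Contiguity is assumed to mean sharing a face (6-connectivity).
    """
    peak = max(volume, key=volume.get)
    cutoff = frac * volume[peak]
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    roi = {peak}
    queue = deque([peak])
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in steps:
            nb = (x + dx, y + dy, z + dz)
            if nb not in roi and volume.get(nb, float("-inf")) > cutoff:
                roi.add(nb)
                queue.append(nb)
    return roi

# Toy volume: (3, 0, 0) exceeds the cutoff but is cut off from the peak by (2, 0, 0).
toy = {(0, 0, 0): 10.0, (1, 0, 0): 7.0, (2, 0, 0): 6.0, (3, 0, 0): 9.0, (0, 1, 0): 6.7}
roi = roi_from_peak(toy)
```

In the toy example, the voxel at (3, 0, 0) is excluded despite its high value because it is not contiguous with the peak, which is the property that distinguishes this rule from a simple threshold.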
We used Featquery (Smith et al., 2004) to extract the parameter estimates in the V4 ROIs identified in the localizer scans and the PPC ROIs identified as active in all four conditions for each individual in each experimentally relevant condition from each run. For each participant, we computed the average parameter estimate for each experimentally relevant condition for each ROI in visual cortex. For V4 ROIs, Featquery applied the magnetization-prepared rapid-acquisition gradient echo (MPRAGE) to EPI transformation matrix calculated during image registration to determine which EPI voxels fall within the ROI (selected in MPRAGE space). For PPC ROIs, Featquery applied the standard space to EPI transformation matrix calculated during image registration to determine which EPI voxels fall within the ROI (selected in standard space). Because the edges of the differently resolved standard space, MPRAGE, and EPI voxels are not in direct correspondence, the number of EPI voxels sampled by Featquery will vary as a function of the size of the region activated by the ROI identification process, the degree of correspondence between EPI and MPRAGE or standard edges in that ROI in that participant, and the interpolation threshold set by the experimenter. For V4 ROI selection, we allowed Featquery interpolation thresholds to vary from 0.2 to 0.5, such that information from no more than 12 and no fewer than 5 voxels in functional space (3 × 3 × 3 mm) was included in the averaged parameter estimate from each run. The vast majority of ROIs produced an acceptable number of voxels using interpolation thresholds of 0.3 or 0.2. For PPC ROI selection, we allowed Featquery interpolation thresholds to vary from 0 to 0.1, such that information from no more than 40 and no fewer than 30 voxels in functional space (3 × 3 × 3 mm) was included in the averaged parameter estimate from each run.
Results
Experiment 1: sequential versus simultaneous displays
To test the hypothesis that competition for representation is one source of our limited attentional capacity, we asked whether attending to multiple items, as opposed to a single item, is less effective at enhancing BOLD signal in visual cortex when those items competitively interact than when they do not. For stimulus items to competitively interact in visual cortex, they must activate a common group of cells at a common time. In our first experiment, we varied the timing of stimulus displays to manipulate the likelihood that stimulus items would compete for representation at a common time. Participants viewed five colored Gabor patches presented sequentially or simultaneously in neighboring locations in the upper right quadrant of the visual field. During sequential presentation, each of the five items appeared in isolation. During simultaneous presentation, all five items appeared simultaneously. Integrated over time, physical stimulation at each location was identical in the two conditions. Only when the items were presented simultaneously, however, could they competitively interact in visual cortex (Kastner et al., 1998, 2001; Beck and Kastner, 2005, 2007) (Fig. 1A,B). Participants selectively attended to one or three of the five locations and detected the occurrence of either of two color/spatial frequency/orientation conjunction targets, which appeared on 20% of trials (Fig. 1C,D).
To compare activity for the same physical stimulus under different attentional conditions, we used data from a passive localizer task and a separate retinotopic mapping session to identify the V4 ROI that represented the attend-one location. We concentrated on V4 because previous studies using stimulus parameters similar to ours showed the largest competitive effects and the largest effects of attention in V4 (Kastner et al., 1998, 2001).
In keeping with our hypothesis that competitive interactions limit the ability of attention to enhance the representations of multiple attended items, we predicted that the costs of attending to three items compared with a single item would be greater when the items competed in visual cortex than when they did not. Analysis of the BOLD signal extracted from each participant's V4 ROI revealed a main effect of presentation method (F(1,8) = 11.7; p = 0.009); consistent with previous data (Kastner et al., 1998, 2001; Beck and Kastner, 2005, 2007), sequential presentations evoked significantly greater activity than simultaneous presentations (Fig. 2). Importantly, we found a significant interaction between presentation method (sequential or simultaneous) and attention (attend one item or attend three items) (F(1,8) = 11.76; p = 0.009); activation was lower for attend-three than attend-one conditions under simultaneous, but not sequential, presentation conditions. In other words, attending to three locations resulted in a reduced signal in visual cortex relative to attending to a single location, but only under conditions in which the three items could compete for representation.
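For a 2 × 2 within-subject design such as this one, the interaction F(1, n − 1) is equivalent to the square of a one-sample t statistic computed on each participant's difference of differences. The sketch below illustrates that equivalence. The per-participant values are synthetic numbers invented solely to make the example run; they are not data from this study, and the function name `interaction_f` is hypothetical.

```python
import math
from statistics import mean, stdev

def interaction_f(seq_one, seq_three, sim_one, sim_three):
    """Interaction F and degrees of freedom for a 2 x 2 within-subject design.

    Each argument is a list with one value per participant. The per-participant
    contrast is the attend-three cost under simultaneous presentation minus the
    attend-three cost under sequential presentation.
    """
    dod = [
        (mo - mt) - (so - st)
        for so, st, mo, mt in zip(seq_one, seq_three, sim_one, sim_three)
    ]
    n = len(dod)
    t = mean(dod) / (stdev(dod) / math.sqrt(n))
    return t * t, (1, n - 1)

# SYNTHETIC values for nine hypothetical participants (illustration only).
seq_one = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0, 1.2, 1.1]
seq_three = [1.0, 1.1, 1.0, 1.1, 1.2, 0.9, 1.0, 1.1, 1.2]
sim_one = [0.9, 1.0, 0.8, 1.0, 1.1, 0.7, 0.9, 1.0, 1.0]
sim_three = [0.6, 0.8, 0.6, 0.7, 0.9, 0.5, 0.7, 0.7, 0.8]

f_value, df = interaction_f(seq_one, seq_three, sim_one, sim_three)
```

With nine participants the degrees of freedom are (1, 8), matching those reported for the interaction.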
The behavioral data showed a pattern of results similar to that of the fMRI data; that is, the costs of attending to multiple stimuli were greater when the items could potentially compete than when they could not (F(1,8) = 17.85; p = 0.01). Specifically, there was a greater drop in target sensitivity (d′) for attend-three relative to attend-one when the stimuli were presented simultaneously (attend-three, 1.56; attend-one, 2.59) than when they were presented sequentially (attend-three, 1.63; attend-one, 2.14).
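The size of the attend-three cost in each presentation condition can be read directly from the reported d′ means, as the sketch below shows. It also includes the conventional definition of sensitivity, d′ = z(hit rate) − z(false-alarm rate), which is an assumption here because the text does not state the formula or any correction for extreme rates. The function name `d_prime` is illustrative.

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    """Conventional sensitivity index (assumed): z(hits) - z(false alarms).

    Rates of exactly 0 or 1 are not handled and would need a correction.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Mean d' values as reported in the text.
reported = {
    "simultaneous": {"attend-one": 2.59, "attend-three": 1.56},
    "sequential": {"attend-one": 2.14, "attend-three": 1.63},
}

# Drop in sensitivity when attending three locations rather than one.
cost = {
    presentation: round(v["attend-one"] - v["attend-three"], 2)
    for presentation, v in reported.items()
}
```

The resulting cost is 1.03 under simultaneous presentation and 0.51 under sequential presentation, which is the difference that the reported interaction tests.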
Does the number of attended items change the attentional demands of the sequential presentation conditions?
It is possible that the number of attended items did not significantly alter the task requirements under sequential presentation conditions. Specifically, participants may have simply shifted attention among all five stimulus locations with the onset of each stimulus, regardless of the number of locations they were instructed to monitor. We note that the d′ data do not support this conjecture, because participants performed significantly worse during attend-three than attend-one sequential conditions. We can investigate the sensitivity of posterior parietal cortex to the number of attended items (see Fig. 6A), however, to further confirm that participants were indeed treating attend-one and attend-three conditions differently under conditions of sequential presentation. Both left [F(1,8) = 20.06; p = 0.002 (see Fig. 6B)] and right [F(1,8) = 22.47; p = 0.001 (see Fig. 6C)] PPC were sensitive to the number of attended items across conditions; attend-three conditions produced more activity than did attend-one conditions across both presentation conditions. Indeed, in the left PPC, although sequential presentation produced more activation than did simultaneous presentation (F(1,8) = 10.45; p = 0.012), this factor did not interact with the number of attended items (F(1,8) = 0.032; p = 0.863). Planned comparisons indicated that attend-three conditions produced more activation than attend-one conditions under both sequential (t(8) = 3.04; p = 0.02) and simultaneous (t(8) = 5.1; p = 0.001) presentations. In the right PPC, although the presentation factor showed a trend toward interacting with the number of attended items (F(1,8) = 4.816; p = 0.06), planned comparisons again revealed that attend-three conditions produced more activation than attend-one conditions under both sequential (t(8) = 2.79; p = 0.02) and simultaneous (t(8) = 8.34; p = 0.002) presentations, indicating that the sensitivity of the PPCs to the number of attended items was present across presentation conditions.
Together, the d′ data and the PPC data indicate that subjects followed the instructions to approach attend-three and attend-one sequential conditions differently.
Experiment 2: within-field versus across-field visual displays
In a second experiment, we again asked whether attention to multiple items was hindered by competitive interactions in visual cortex, but this time we used a spatial manipulation to vary the likelihood of competitive interactions. Specifically, we exploited the lateralized projection of left and right visual field information to the right and left occipital lobes to manipulate the potential of the stimuli to activate a common group of cells. Six colored and textured shapes appeared in neighboring locations in the upper visual field. Five of these items fell in the RVF, and one of these items fell in the LVF. Participants detected either of two color/shape/texture conjunctions, which occurred on 20% of trials. Again, participants simultaneously attended to one or three of the six stimuli. Critically, the three attended items could either be distributed across the visual fields or entirely contained within the RVF (Fig. 3). Because cells in early to intermediate levels of visual cortex represent only the contralateral visual field, only attended items that fall within the same visual field should be represented by a common cell population and thus interact competitively (Kastner et al., 2001; Beck and Kastner, 2005, 2007).
Again, we examined regions in visual cortex that corresponded to the attend-one locations. In this experiment, however, we were interested in both the left hemisphere (LH) ROI that corresponds to the attend-one stimulus presented in the same visual field as its closest neighbors and the right hemisphere (RH) ROI that corresponds to the attend-one item presented alone in the LVF (but that was one of the three neighboring stimuli that spanned the midline). We used a localizer task to identify right and left hemisphere V4 ROIs.
Effects of competing and noncompeting stimuli
As in our first experiment, we predicted that the cost associated with attending to three items compared with one item would be exacerbated by competition in visual cortex. Consistent with this prediction, when participants directed attention toward the potentially competing items in the RVF, BOLD signal in the corresponding LH ROI was lower under attend-three than attend-one conditions (p = 0.0003) (Fig. 4). We did not find a similar difference between attend-one and attend-three conditions in the RH ROI when attention (attend-one or attend-three) was directed toward the noncompeting stimulus isolated in the LVF (Fig. 4). In other words, once again, we saw a decrement in the signal evoked by a single item in the attend-three condition only when the three items had the potential to compete with one another in visual cortex.
Behavioral sensitivity to targets (d′) showed an interaction similar to the results we observed in the fMRI data (F(1,5) = 7.6; p = 0.04). Our finding that attending to three items was always more difficult than attending to single items is consistent with other reported data (Ericksen and St. James, 1986). The cost associated with attending to three items (1.72) relative to one item (3.1), however, was greater when the targets appeared in the right visual field (in the presence of competing stimuli) than when the targets appeared in the left visual field (isolated from competing stimuli; attend-three, 2.64; attend-one, 3.08).
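The size of the behavioral interaction can be read directly from the four d′ means reported above. The following is a minimal sketch of that arithmetic; the dictionary keys and the helper name `attend_three_cost` are illustrative labels, not terms from the study, and the calculation uses only the reported means (it does not reproduce the ANOVA).

```python
# d' means as reported in the text for Experiment 2.
# Keys and function names are illustrative, not taken from the study.
reported_dprime = {
    "RVF (competing)": {"attend_one": 3.10, "attend_three": 1.72},
    "LVF (isolated)": {"attend_one": 3.08, "attend_three": 2.64},
}


def attend_three_cost(cell):
    """Drop in d' when attending to three items rather than one."""
    return round(cell["attend_one"] - cell["attend_three"], 2)


# Cost of dividing attention, computed separately for each visual field.
costs = {field: attend_three_cost(cell) for field, cell in reported_dprime.items()}

# Difference of differences: how much larger the cost is under competition.
interaction_contrast = round(costs["RVF (competing)"] - costs["LVF (isolated)"], 2)

for field, cost in costs.items():
    print(f"{field}: cost = {cost}")
print(f"difference of costs = {interaction_contrast}")
```

On these means, the attend-three cost is 1.38 d′ units for the competing RVF items and 0.44 for the isolated LVF item, a difference of 0.94.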
Does attention directed to multiple competing stimuli enhance extrastriate signal at all?
In the current experiment, we can investigate whether attention has any effect at all when directed to multiple competing items by comparing signals in an ROI when attention was directed toward rather than away from the items represented by it. We found a significant interaction between the number of attended items (one or three) and location of attention (either contained within the RVF or spanning the midline) in the signal measured from the LH ROI (F(1,5) = 20.372; p = 0.006) (Fig. 5A). As expected, attending to a single item in the RVF significantly increased activation in the LH ROI (t(5) = 2.82; p = 0.037) compared with when attention was directed away from the RVF and toward the item presented alone in the LVF. Interestingly, this attention-related enhancement was completely absent when attention was directed to multiple items; in fact, attending to three items in the RVF produced slightly less activation in the LH ROI than when attention was directed away from this location and toward the three items spanning the midline (t(5) = 2.06; p = 0.094). Similarly, attention indices, which quantify the difference between attending to the RVF and LVF/midline items, were significantly higher for the attend-one (0.16) than the attend-three (−0.08) condition in the LH ROI (t(5) = 5.9; p = 0.002). When the RVF stimulus of interest was in competition with its neighbors, then attention enhanced its signal only when it was the sole recipient of attention.
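This section does not give the formula for the attention index. A common choice in the attention literature is a normalized contrast, (attended − unattended) / (attended + unattended), and the sketch below assumes that form purely for illustration. The function name and the input values are hypothetical; they are not the study's data or necessarily its definition.

```python
def attention_index(attended, unattended):
    """Normalized contrast between attended and unattended signal.

    ASSUMPTION: this (A - U) / (A + U) form is a common convention,
    not a definition stated in this section of the text.
    """
    total = attended + unattended
    if total == 0:
        raise ValueError("attended + unattended must be nonzero")
    return (attended - unattended) / total


# Hypothetical signal values, chosen only to show the sign convention.
enhanced = attention_index(attended=1.5, unattended=0.5)    # attention raises signal
suppressed = attention_index(attended=0.5, unattended=1.5)  # attention lowers signal
print(enhanced, suppressed)
```

Under this convention, a positive index means that signal was higher when attention was directed toward the items an ROI represents, and a negative index means that it was lower, which is the pattern described for the attend-three condition in the LH ROI.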
A different pattern emerged in the RH ROI, however. Because the item in the LVF was not in competition with its neighbors in the RVF, we expected that attention would enhance signal in the RH ROI, regardless of the number of attended items. Indeed, we found a main effect of the location of attention (F(1,5) = 9.92; p = 0.025) but no main effect of nor interaction with the number of items attended (Fig. 5B). Attention indices did not differ between attend-one (0.29) and attend-three (0.34) conditions (t(5) = 0.73; p = 0.51). Critically, attention effectively enhanced BOLD signal to the LVF item regardless of whether or not that item was the sole recipient of attention. These data indicate that directing attention to multiple items can enhance BOLD signal as effectively as directing attention to a single item can but only when the attended items are less likely to compete for representation in visual cortex. When we consider the data from both ROIs, we conclude that competition among attended items prevents attention from enhancing BOLD signal effectively to those items.
Can separate parietal resources account for the divided visual field effects?
The two hemispheres appear to have somewhat separable attentional resources that may, under certain specific conditions, functionally expand attentional capacity by processing information in parallel (Alvarez and Cavanagh, 2005; Delvenne, 2005; Scalf et al., 2007). This possibility could account for the insensitivity to the need to divide attention across the visual fields that we observed when the item was presented alone in the LVF. If only the target item isolated in the left visual field can access the attentional resources of the RH, then the access to attentional resources of that item should not vary with the number of attended items. The signal evoked by that item in right visual cortex, in turn, would not vary with attentional condition.
We examined our data for evidence that the within-field and across-field distribution of attention might be differentially activating left and right PPC (Fig. 6D), whose role in distributing attention across different items in visual space is well established. We replicated the finding that PPC in both the LH [F(1,5) = 14.36; p = 0.013 (Fig. 6E)] and RH [F(1,5) = 12.55; p = 0.017 (Fig. 6F)] is sensitive to the number of attended items; PPC in each hemisphere was more active when attention was directed to three items (LH, 176; RH, 161) than when attention was directed to one item (LH, 114; RH, 104). Neither the left PPC (F(1,5) = 0.49; p = 0.517) nor the right PPC (F(1,5) = 0.14; p = 0.727) was sensitive to the visual field status of the attended items, however, indicating that the left and right PPC were equally activated by conditions that required attention to the left and right visual fields. Furthermore, visual field status failed to interact with the number of attended items in either the left (F(1,5) = 0.01; p = 0.91) or the right PPC (F(1,5) = 0.14; p = 0.72). We found no evidence, then, that the attentional systems of the two hemispheres were functioning independently during performance of this difficult triple conjunction detection task. Although this may seem surprising given the recent excitement over the ability of the attentional systems of the hemispheres to operate in parallel (Alvarez and Cavanagh, 2005; Delvenne, 2005; Scalf et al., 2007), we note that conditions of high attentional demand usually prevent the hemispheres from acting in parallel (Banich, 1998; Mikels and Reuter-Lorenz, 2004; Scalf et al., 2007).
Indeed, nearly two decades of data suggest that high attentional demands usually cause the connected hemispheres to increase their interactions and function as a coordinated unit that is responsive to information in both visual fields (Luck et al., 1989; Banich and Belger, 1990; Belger and Banich, 1992; Banich, 1998; Weissman and Banich, 2000; Mikels and Reuter-Lorenz, 2004; Scalf et al., 2007). Our three-way conjunction search task was highly attentionally demanding and thus likely to require this coordinated response from the parietal cortices.
Discussion
We report evidence that competitive interactions reduce the ability of attention to operate on multiple items. In two experiments, we found that the cost of attending to multiple items, relative to attending to a single item, was greater when those items interacted competitively in visual cortex. Furthermore, in our second experiment, we found that attention-related BOLD enhancement was only seen when the attended items were not in competition with one another; no attention-related enhancement was observed when an attended item could competitively interact with other attended items. Such an interaction cannot be explained by a model in which a limited resource is the sole determinant of our limited attentional capacity, which would predict that distributing attention among multiple items should still enhance extrastriate activity relative to conditions of inattention, although to a lesser extent than would directing attention to a single item. Indeed, McMains and Somers (2005) demonstrated that attention distributed to three items (whose spatial separation and hemifield placement made them unlikely to compete) continued to enhance extrastriate representations of these items relative to conditions of inattention, although this effect was weaker than conditions in which attention was directed to a single item.
A straightforward “limited resource” model, in which a fixed attentional resource is simply spread over more locations, also would not explain why the cost associated with attending to three items should be modulated by such stimulus factors as the timing and placement of the items. Instead, it would predict an inverse relationship between the number of attended items and attention-related enhancement of their extrastriate representation, regardless of their timing or placement in the visual field. Such results, in fact, have been observed when attention is divided among widely separated stimuli (McMains and Somers, 2005). In our paradigm, however, the negative effects on extrastriate signal of attending to multiple stimuli in close proximity to each other were apparent only when those stimuli were likely to compete for representation. Although other factors may also limit our ability to attend to multiple items, the interactions with stimulus factors in both experiments are consistent with our proposal that competitive interactions in visual cortex partially determine the influence attention exerts on stimulus representations.
Effort and resources versus competition among representations
Regions in PPC were sensitive to the number of items that required attention, regardless of the potential of those items to undergo competition for representation. Consistent with the findings of other authors (Mitchell and Cusack, 2008; Xu and Chun, 2009), we reported that increases in the number of attended items uniformly produced increases in the activation of right and left PPC, indicating that the attentional system recognized and responded to our experimental manipulations of attentional demand. Importantly, however, this increased effort by PPC regions was insufficient to bring either the extrastriate representation or the subjects' performance in the attend-three condition to the level of the attend-one condition. The fact that the PPC increases activity during the very same conditions in which visual cortex activity and subjects' performance are reduced suggests that at least part of the limitation in attending to multiple items lies outside of the PPC. The data presented here suggest that one such limiting factor may be competition in visual cortex.
Our findings also inform more cognitive explanations of our limited capacity to attend to multiple items, which have focused on our limited ability to select (Intriligator and Cavanagh, 2001; Alvarez and Franconeri, 2007) or individuate and identify (Xu and Chun, 2009) multiple objects. These models proceed from a “resource-limited” (or non-data limited) view of attentional capacity; that is, because we can select, individuate, and identify any single member of a group of items, our failure to successfully perform these operations simultaneously on all members must derive from the limited resources we have to apply to them rather than from the limited information (data) we have about the items. Such a view, however, is grounded in an assumption that “data limitations” only occur before a “resource-limitation” stage. The data presented here, in contrast, suggest an alternative explanation: because top-down resources are thought to feed back on visual cortex, they themselves are also subject to the functional architecture that supports the so-called data representation.
Attention to multiple items and the biased competition theory of selective attention
To explain our data, we draw on two essential tenets of biased competition theory: that items interact in visual cortex and that biasing signals can affect those interactions. The ideas presented here, however, differ from the original presentation of the theory in which selection was only considered for a single item (Desimone and Duncan, 1995). We asked subjects to attend to three of five items. Such a request does not make sense in those formulations of biased competition theory in which the term “attention” refers to an emergent property of the interaction between biasing processes and competitive interactions (Desimone, 1996; Duncan et al., 1997); only if one can successfully bias the competition in favor of a single item would those models conclude that the item was attended. This formulation, however, is an uncomfortable fit with both our phenomenology and the current data showing a direct relationship between the number of task-relevant items participants attempt to monitor and the amount of PPC activity. We and many other neuroscientists and psychologists, starting with William James (1950), would call this attempt to monitor “attention.” Here, the term attention is not dependent on the success of the selection but on the effort applied. We could explain our data and theory without reference to the word attention: subjects may send biasing signals to multiple items represented in visual cortex, but if those items competitively interact with each other in visual cortex, then the bias will ultimately be less successful. Such a formulation, however, would disconnect our data and theory from the everyday phenomenon that we are trying to explain; that is, why do we find it difficult to “attend” to multiple items?
If one accepts that directing top-down attention toward multiple stimuli falls within the scope of biased competition theory, then our data serve to highlight a different aspect of that theory than is usually discussed. Current descriptions of this model emphasize the effects of attentional biasing on the competitive process; our data, however, demonstrate that competition also affects the efficacy of the attentional biasing processes. According to the biased competition model, attentional biasing acts as the countermeasure to competitive interactions in visual cortex. If a bias is directed to a visual stimulus, it influences the competitive process in favor of the attended item such that signals from that item ultimately guide behavior. We propose that this causal relationship can also operate in the other direction, however; competition modulates the effectiveness of the attentional processes by reducing their ability to operate across multiple items.
Multiple causes of limited attentional capacity
Finally, we note that we are not suggesting that visual cortex is the only cause of our limited attentional capacity. Indeed, a large literature confirms limits in the sources of attentional control. For example, individual differences in interference from task-irrelevant stimuli are inversely predicted by individual differences in right middle frontal gyrus recruitment (Mecklinger et al., 2003), participants' peak capacity to represent bound visual features plateaus with activation in posterior parietal cortex/superior occipital cortex (Todd and Marois, 2004), and response selection delays correspond to delays in activation in posterior lateral prefrontal cortex (Dux et al., 2006). We believe that limits on the processing of multiple items may exist at multiple neural levels. We point out, however, that the limits in visual cortex suggested by our data represent a more fundamental limit in attentional capacity because they occur at the level of the visual representation.
Indeed, our findings that local extrastriate mechanisms limit attentional capacity change the extent to which we might view attentional limitations as mutable by factors such as practice. Certainly, practice can reduce the degree to which task-irrelevant information interferes with task performance (Milham et al., 2003), increase the interval over which we can maintain representations of visual information (Olesen et al., 2004), and decrease the time within which two separate responses can be selected (Erickson et al., 2007). All of the neural changes concurrent with these improvements, however, occur in prefrontal and parietal sources of top-down attentional control. Our data, in contrast, suggest that even if top-down attentional guidance were unlimited, the functional architecture of visual cortex may still prevent attention from acting as effectively on multiple items as it does on a single item.
Footnotes
- Received August 26, 2009.
- Revision received September 29, 2009.
- Accepted October 16, 2009.
- This work was funded by National Institute of Mental Health Grant R03 MH082012 (D.M.B.). We thank Caterina Gratton for assistance in collecting the data, Walter Boot, Eamon Caddigan, and Mathew Hall for assistance in eye tracking, as well as Bettina Frances and Neal Cohen for feedback on previous versions of this manuscript.
- Correspondence should be addressed to Paige Scalf, Beckman Institute, University of Illinois at Urbana-Champaign, 405 North Mathews, Urbana, IL 61801. pscalf{at}uiuc.edu
- Copyright © 2010 the authors 0270-6474/10/300161-09$15.00/0