Abstract
Over the last several decades, spatial attention has been shown to influence the activity of neurons in visual cortex in various ways. These conflicting observations have inspired competing models to account for the influence of attention on perception and behavior. Here, we used electroencephalography (EEG) to assess steady-state visual evoked potentials (SSVEP) in human subjects and showed that highly focused spatial attention primarily enhanced neural responses to high-contrast stimuli (response gain), whereas distributed attention primarily enhanced responses to medium-contrast stimuli (contrast gain). Together, these data suggest that different patterns of neural modulation do not reflect fundamentally different neural mechanisms, but instead reflect changes in the spatial extent of attention.
Introduction
Selective attention is the mechanism by which behaviorally relevant sensory inputs are preferentially processed at the expense of distracters. This selective information processing is thought to partially depend on changes in the gain of neurons within striate and extrastriate visual cortices. However, developing a parsimonious model to account for gain modulation has been challenging because different studies have found disparate attention effects on stimulus evoked neural responses (Reynolds et al., 2000; Martínez-Trujillo and Treue, 2002; Williford and Maunsell, 2006; Buracas and Boynton, 2007; Kim et al., 2007; Lee and Maunsell, 2009, 2010a,b).
In a canonical paradigm (Reynolds et al., 2000; Martínez-Trujillo and Treue, 2002), attention is covertly deployed to one of two stimuli, whereas stimulus contrast is systematically varied to generate contrast-response functions (CRFs) based on the activity level of visually responsive neurons. Whereas some studies report that attention primarily enhances already strong responses (response gain; see Fig. 1A; Di Russo et al., 2001; Kim et al., 2007; Lee and Maunsell, 2010a), others report that attention enhances only responses to midcontrast stimuli (contrast gain; see Fig. 1B; Reynolds et al., 2000; Martínez-Trujillo and Treue, 2002). Finally, other studies report patterns that resemble a combination of different gain modulations (Williford and Maunsell, 2006; Buracas and Boynton, 2007; Murray, 2008; Pestilli et al., 2011).
The normalization model of attention (NMA; Reynolds et al., 1999; Lee and Maunsell, 2009, 2010b; Reynolds and Heeger, 2009) suggests that these inconsistent modulatory patterns might arise via changes in the size of the stimulus and the attention field. The NMA is based on the premise that, in the absence of attention, two factors determine the firing rate of a visually responsive neuron. First, a facilitatory component (stimulus drive) is determined by the contrast of the stimulus placed in the receptive field (RF) of a neuron. Second, a suppressive drive is determined by the summed activity of other neighboring neurons, serving to normalize the overall spike rate of the cell in question via mutual inhibition (Heeger 1992). Attention modulates the pattern of neural activity by altering the balance between these facilitatory and suppressive drives (Reynolds and Heeger, 2009). Thus, a highly focused attention field leads to response gain (see Fig. 1A) because attentional gain is applied primarily to the stimulus drive. Conversely, a larger attention field leads to contrast gain because attention increases both the stimulus and the suppressive drives, and this normalizes responses at high contrasts (see Fig. 1B).
A previous psychophysical study has reported a pattern of behavioral performance that is consistent with the NMA (Herrmann et al., 2010). However, existing studies have not measured neural CRFs to determine whether manipulating the size of the attention field with respect to a constant stimulus size selectively alters the gain pattern of neural CRFs. Here, we evaluated this relationship in human subjects by measuring steady-state visual evoked potentials (SSVEPs) elicited by attended and ignored flickering visual stimuli.
Materials and Methods
Electroencephalography subjects.
We initially recruited eight neurologically healthy human subjects (21–27 years old, four females) with normal or corrected-to-normal vision. The recruitment of eight subjects is within the typical range for studies using similar multisession approaches (Di Russo et al., 2001; Morrone et al., 2002, 2004; Carrasco et al., 2004; Pestilli and Carrasco 2005; Ling and Carrasco 2006; Pestilli et al., 2007, 2009, 2011; Herrmann et al. 2010). All subjects signed an informed consent form approved by the Institutional Review Board at the University of California, San Diego (UCSD) and participated in the study for monetary compensation of $15 per hour. They were behaviorally trained for 1.5 h on the task 1 d before undergoing multiple electroencephalography (EEG) sessions (10–15 sessions, 3840–5760 trials in total). Data from one subject were discarded due to a failure to complete the experimental protocol (the subject withdrew after the first EEG session). Of the other seven subjects, four completed 10 sessions, one completed 11 sessions, one completed 13 sessions, and one completed 15 sessions of EEG recording.
EEG task design.
EEG data were recorded while subjects performed an attention task that required the covert allocation of either focused or distributed spatial attention (see Fig. 2A,B). We monitored SSVEPs elicited by flickering a series of small disks (1.22° radius) concurrently in the lower left and right visual field quadrants at a rate of either 21.25 Hz (25% on–off duty cycle) or 28.33 Hz (33.33% on–off duty cycle), respectively (and vice versa in half of the trials). By flickering the left and right stimulus arrays at different frequencies, we were able to isolate the neural response to stimuli presented in each location via a frequency-domain analysis of the stimulus-locked SSVEP response. These high flicker-frequencies were chosen based on previously established methods (Müller et al., 1998a; Breakspear et al., 2010; Bridwell and Srinivasan, 2012; Garcia et al., 2013; Itthipuripat et al., 2013) in an effort to restrict our measurements to entrained activity in visual cortex. The small disk appeared randomly within a circular area with a radius of 4.90° (marked by an imaginary black ring in Fig. 2A, centered 8.58° from a central fixation point, which was located 3.50° above the center of the display). The contrast of the flickering disks was systematically varied across trials over a range extending from 2.5 to 90.0% (Michelson contrast: 100 × (Is − Ib)/(Is + Ib), where Is is the luminance of the gray disk, and Ib is the luminance of the dark gray background fixed at 2.7 cd/m²).
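As a worked illustration of the Michelson contrast formula above, the following sketch computes the disk luminance needed to reach a requested contrast against the 2.7 cd/m² background. The function names are hypothetical, and the six contrast levels are taken from the Figure 2 caption; this is not the authors' stimulus code.

```python
def michelson_contrast(stim_lum, bg_lum):
    """Michelson contrast in percent: 100 * (Is - Ib) / (Is + Ib)."""
    return 100.0 * (stim_lum - bg_lum) / (stim_lum + bg_lum)


def stimulus_luminance(contrast_pct, bg_lum):
    """Invert the formula: disk luminance that yields the requested contrast."""
    c = contrast_pct / 100.0
    return bg_lum * (1.0 + c) / (1.0 - c)


BACKGROUND = 2.7  # cd/m^2, background luminance stated in the text
CONTRASTS = [2.5, 5.3, 9.6, 20.6, 46.5, 90.0]  # levels listed in the Figure 2 caption

# Disk luminance (cd/m^2) for each contrast level
disk_luminances = {c: stimulus_luminance(c, BACKGROUND) for c in CONTRASTS}
```

Under this formula, a 90% contrast disk on the 2.7 cd/m² background would have a luminance of about 51.3 cd/m².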
We manipulated the spatial extent of attention by varying the potential location of an occasional target stimulus that was presented on 25% of the trials. The target was a circular oriented grating with the same mean contrast and the same size as the nontarget disk, and subjects pressed one of two buttons to indicate whether the orientation of the grating was 10° clockwise or 10° counterclockwise relative to vertical. In both the focused and the distributed conditions, nontarget disks were presented in a circle (radius, 4.90°; marked by the black ring in Fig. 2). Within this 4.90° radius circle, the location of each nontarget disk was randomly drawn from a nonuniform distribution, approximating a Gaussian, such that there was a higher probability of the disk appearing near the center of the stimulus window than near the edges. In the focused-attention condition (see Fig. 2A, left), targets always appeared within a small circle (marked by the blue dotted ring in the figure; radius, 2.04°). Thus, the region of space in which a target disk could appear (the attention field) was smaller than the size of the stimulus drive in the focused-attention condition. In the distributed-attention condition (see Fig. 2A, right), targets could be presented inside a larger circle (marked by the cyan dotted ring in the figure; radius, 6.54°). Thus, the size of the attention field (6.54°) was larger than the size of the stimulus drive (4.90°) in the distributed-attention condition. Critically, the spatial extent of the stimulus drive is fixed across the focused and distributed conditions. Thus we predicted more response gain in the focused-attention condition because the attention field is relatively small compared to the stimulus drive, and more contrast gain in the distributed-attention condition because the attention field is relatively large compared to the stimulus drive (Fig. 1A,B). 
Note also that these predictions hold as long as the attention field is larger, relative to the stimulus drive, in the distributed-attention condition than in the focused-attention condition (see Fig. 1C–E; for model simulations, see below, Additional model simulations).
Figure 1. A, The NMA predicts that focused attention leads to response gain, or an upward shift of the CRF. B, On the other hand, distributed attention should lead to contrast gain, or a leftward shift of the CRF. The spatial distribution of the stimulus response and the size of the attention field used here are based on the parameters used in the main EEG experiment. C, D, We also specified the stimulus size and the attention field size in relation to expected RF sizes across striate and extrastriate visual areas (V1, V2, and V4) and found that the NMA generates consistent results across regions. E, The NMA simulation (using the RF parameters from V4) in which the stimulus size is fixed at 4.9° but the size of the attention field varies from 2.0 to 6.5° consecutively in 0.5° steps. The model predicts that as the size of the attention field increases, response gain will decrease and contrast gain will increase in a continuous manner. See model parameters for all figures in Table 1 and simulation procedures in Materials and Methods.
The schematic of the trial structure is shown in Figure 2B. Each trial started with a fixation point and a 1 s arrow cue instructing subjects to covertly attend to the left or to the right stimulus, followed by a 50 ms auditory cue (400 or 1500 Hz) instructing the subjects to implement either a focused- or distributed-attention strategy. Two seconds after the arrow cue onset, the small flickering nontarget disks appeared in the lower left and lower right quadrants for 7 s. A second 50 ms auditory cue was then presented after a pseudorandomly selected interval of 3–4 s from the stimulus onset and instructed subjects to either maintain or switch their attentional strategy. The intertrial interval (ITI) was 2 s. All trial types were pseudorandomized so that the occurrence of the target-present trials was unpredictable. To ensure that subjects were actively engaged in the task for the duration of each trial, the grating targets were briefly presented for 11.7 ms in the first half of the trial, in the second half of the trial, or in both intervals with equal probability. In addition, the time at which the target appeared was pseudorandomly selected from a uniform distribution to decrease the predictability of the target onset time. Target stimuli could appear only on the attended side (100% cue validity). This was done to ensure that subjects focused only on the attended location and did not divide their attention across hemifields.
EEG experimental procedures.
Stimuli were presented on a PC running Windows XP using MATLAB (MathWorks) and the Psychophysics Toolbox (version 3.0.8) (Brainard, 1997; Pelli, 1997). Precise stimulus onset times were recorded from the CRT monitor (HP P1203, 85 Hz refresh rate) via small fiber-optic cables attached to the screen. The gain and sensitivity of the fiber-optic system (custom built by the Electronic Development and Repair Facility, Department of Physics, University of California, San Diego; schematics available upon request) were adjusted to give accurate and fast sampling to record the precise timing of the flickering stimuli. Participants were seated 70 cm from the monitor in a dark room and used two buttons on a keyboard to make their responses.
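The two flicker rates and their duty cycles are consistent with the 85 Hz refresh rate if each flicker cycle spans a whole number of video frames with the disk shown for one frame per cycle. That frame-level reading is an assumption rather than something the text states; the sketch below only checks the arithmetic, and the function name is hypothetical.

```python
REFRESH_HZ = 85.0  # CRT refresh rate stated in the text


def flicker_from_frames(frames_per_cycle, frames_on=1):
    """Flicker frequency (Hz) and duty cycle (%) for a frame-locked flicker.

    Assumes the stimulus is on for `frames_on` frames out of every
    `frames_per_cycle` frames (an assumption, not taken from the paper).
    """
    freq_hz = REFRESH_HZ / frames_per_cycle
    duty_pct = 100.0 * frames_on / frames_per_cycle
    return freq_hz, duty_pct


slow = flicker_from_frames(4)   # 21.25 Hz, 25% duty cycle
fast = flicker_from_frames(3)   # about 28.33 Hz, about 33.33% duty cycle
frame_ms = 1000.0 / REFRESH_HZ  # duration of one video frame in ms
```

On the same assumption, one frame lasts about 11.8 ms, which is close to the 11.7 ms target duration.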
Analysis of behavioral data.
Only correct button presses occurring between 200 and 1500 ms after target onset were counted as correct responses. Repeated-measures ANOVAs were used for all behavioral analyses. Note that subjects made no correct responses on trials in which the stimulus contrast was set to 2.5%, as the stimuli were nearly invisible at this level. Thus, the analysis of response time data included only trials from the five higher contrast levels (i.e., 5–90%). This is reflected in the smaller degrees of freedom [4, 24] in the ANOVA that assessed the effects of contrast on RT.
EEG data acquisition and analysis.
EEG data were recorded using a 128-channel Geodesic Sensor Net coupled with a NetAmps 300 amplifier (Electrical Geodesics), sampled at 1000 Hz and referenced to the central channel. Electrode impedances were kept below 50 kΩ, which is standard for this system. Blinks and eye movements were monitored by six built-in electrodes placed above, below, and beside the left and right eyes.
The EEG data were filtered by applying a high-pass Butterworth filter with 2 dB attenuation at 2 Hz and a band-stop Butterworth filter with 30 dB attenuation between 58 and 62 Hz. The EEG data were then segmented into epochs extending from 3000 ms before stimulus onset to 3000 ms after stimulus offset. Trials that exhibited prominent blink, electrooculogram, or electromyogram artifacts were discarded using threshold rejection (more than ±75 μV deviation from the mean) and visual inspection on a trial-by-trial basis, which resulted in the removal of <20% of trials across subjects. Principal components were then computed and selected for removal based on visual analysis to attenuate any residual artifacts.
Fourier coefficients were calculated at frequencies of 21.25 and 28.33 Hz (the two stimulus frequencies) and 40 surrounding frequency bins separately for the first half (0–3 s after the flicker onset) and the second half of the trial (0–3 s after the second auditory cue), and for each spatial cue type (attend left or right), attention condition (focused or distributed), stimulus location (left or right), and stimulus contrast level (six levels). To avoid spectral leakage, we truncated the FFT calculation window to have a length equal to the number of integer stimulus cycles nearest to the 3 s interval. Specifically, for a stimulus flicker of 21.25 Hz, the 3 s interval was truncated to 2.9172 s, giving a central frequency of 21.2547 Hz and a 0.3428 Hz spectral resolution. For the stimulus flicker at 28.33 Hz, the 3 s interval was truncated to 2.9647 s, giving a central frequency of 28.3305 Hz and a 0.3373 Hz spectral resolution. All figures and analyses are based only on the data from nontarget trials on which no false alarm was made. Analyzing only nontarget trials was critical because it prevented confounds related to target detection- and response-related activity, and it also ensured that all sensory aspects of the displays were identical across changes in the size of the attentional field. The signal-to-noise ratio (SNR) of the SSVEP response was calculated on a trial-by-trial basis by dividing the power at the frequency bin centered on the stimulus frequency by the mean power in the two frequency bins 0.69 Hz above and below the center frequency of 21.25 Hz (corresponding to two bins on either side of the center frequency) and 0.68 Hz above and below the center frequency of 28.33 Hz.
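The SNR computation described above can be sketched on a synthetic trial as follows. This is a minimal illustration under stated assumptions: the window is truncated by rounding down to whole stimulus cycles (a simple rule that does not reproduce the exact window lengths reported above), the DFT is evaluated directly at three bins rather than with an FFT routine, and the signal, noise level, and function names are hypothetical.

```python
import cmath
import math
import random


def bin_power(signal, k):
    """Power of the k-th DFT bin, computed directly (no FFT library needed)."""
    n = len(signal)
    coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
    return abs(coeff) ** 2


def integer_cycle_samples(fs, stim_hz, max_seconds=3.0):
    """Samples in the longest window <= max_seconds holding whole stimulus cycles."""
    cycles = math.floor(max_seconds * stim_hz)
    return round(cycles / stim_hz * fs)


def ssvep_snr(signal, fs, stim_hz, offset_bins=2):
    """Power at the stimulus bin divided by the mean power two bins above and below."""
    n = len(signal)
    k = round(stim_hz * n / fs)  # bin index closest to the stimulus frequency
    noise = (bin_power(signal, k - offset_bins) + bin_power(signal, k + offset_bins)) / 2.0
    return bin_power(signal, k) / noise


FS = 1000.0      # sampling rate stated in the text (Hz)
STIM_HZ = 21.25  # one of the two flicker frequencies

rng = random.Random(0)
n_samples = integer_cycle_samples(FS, STIM_HZ)
# Synthetic trial: unit-amplitude sinusoid at the flicker rate plus Gaussian noise
trial = [math.sin(2 * math.pi * STIM_HZ * t / FS) + rng.gauss(0.0, 1.0)
         for t in range(n_samples)]
snr = ssvep_snr(trial, FS, STIM_HZ)
```

Because the window holds a near-integer number of cycles, almost all of the sinusoid's power lands in a single bin, which is the reason for truncating the window before the transform.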
This SNR metric has been used in previous SSVEP studies (Bridwell and Srinivasan, 2012; Kim and Verghese, 2012; Bridwell et al., 2013; Garcia et al., 2013), and we focused on analyzing the SNR rather than the raw power/amplitude of the SSVEP to ensure that the modulations of the SSVEP were not confounded by any changes in broadband power at β frequencies.
The five electrodes in each subject with the highest median SNR computed across all stimulus contrast levels and attentional conditions at each stimulus frequency and each stimulus location were then defined as electrodes of interest [EOIs; see Fig. 3A, bottom; see similar methods in the studies by Itthipuripat et al. (2013) and Müller et al. (2003)]. The EOI approach increases the power of our analysis because SSVEP topography varies slightly across subjects. Also, the selection of EOIs based on data collapsed across focused- and distributed-attention conditions and across all contrast levels avoids biasing our analysis to favor either one of the alternative patterns of gain (i.e., response or contrast gain). The median SNR in each set of EOIs was then computed for each stimulus contrast level to construct contrast response functions separately for attended and ignored stimuli in the focused and distributed conditions (see Fig. 4).
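The EOI selection step can be sketched as ranking electrodes by their median SNR over pooled trials and keeping the top five. The electrode labels and SNR values below are hypothetical toy data, and the function name is an assumption.

```python
from statistics import median


def select_eois(snr_by_electrode, n_eoi=5):
    """Return the labels of the n_eoi electrodes with the highest median SNR.

    snr_by_electrode maps an electrode label to SNR values pooled across
    contrast levels and attention conditions, so the selection itself
    cannot favor one gain pattern over another.
    """
    medians = {label: median(values) for label, values in snr_by_electrode.items()}
    return sorted(medians, key=medians.get, reverse=True)[:n_eoi]


# Hypothetical toy data: eight electrodes with three pooled SNR values each
toy_snr = {f"E{i}": [float(i), i + 1.0, i + 2.0] for i in range(1, 9)}
eois = select_eois(toy_snr)
```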
To quantitatively examine the pattern of gain (either response or contrast gain) separately for the focused and distributed conditions, the grand-averaged SSVEP contrast response functions obtained across all subjects were fit with a Naka–Rushton equation (Geisler and Albrecht, 1997):
\[ R(c) = a \, \frac{c^{n}}{c^{n} + C_{50}^{n}} + b \]
where R is the magnitude of the SSVEP response as a function of contrast (c), a is a response amplitude parameter (multiplicative gain factor), C50 is the semisaturation constant (the contrast value at half the maximum response), n is an exponent that determines the slope of the contrast response function, and b is the baseline response level. In this analysis, b was defined as the lowest SNR of each data set, and n was fixed at 2 (see Herrmann et al., 2010; Carandini and Heeger, 2011). Then, a and C50 were estimated using MATLAB's fminsearch function. The fitting procedure was constrained under the assumption that the CRF reaches asymptote before or at the maximal contrast. This constraint was implemented to ensure that the amplitude parameter did not exceed the maximum SSVEP response and the semisaturation constant did not vary outside the range of contrast values that were used in the present experiment (1–90% contrast).
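To make the fitting step concrete, the sketch below evaluates the Naka–Rushton function with the parameters defined above and recovers a and C50 from synthetic data. A coarse grid search stands in for MATLAB's fminsearch here, purely for illustration; the grid bounds, step sizes, and function names are assumptions, and the baseline b is passed in directly rather than taken as the lowest SNR of a data set.

```python
def naka_rushton(c, a, c50, n=2.0, b=0.0):
    """R(c) = a * c^n / (c^n + C50^n) + b, with contrast c in percent."""
    return a * c ** n / (c ** n + c50 ** n) + b


def fit_crf(contrasts, responses, b, n=2.0):
    """Grid-search a and C50 by least squares with n and b held fixed.

    Grid (assumed for this sketch): a from 0.1 to 10.0 in steps of 0.1,
    C50 from 1 to 90% contrast in steps of 1.
    """
    best_params, best_sse = None, float("inf")
    for i in range(1, 101):
        a = i / 10.0
        for c50 in range(1, 91):
            sse = sum((naka_rushton(c, a, c50, n, b) - r) ** 2
                      for c, r in zip(contrasts, responses))
            if sse < best_sse:
                best_params, best_sse = (a, float(c50)), sse
    return best_params


CONTRASTS = [2.5, 5.3, 9.6, 20.6, 46.5, 90.0]  # percent contrast
# Synthetic, noiseless CRF with known parameters (hypothetical values)
synthetic = [naka_rushton(c, a=3.0, c50=20.0, n=2.0, b=1.0) for c in CONTRASTS]
a_hat, c50_hat = fit_crf(CONTRASTS, synthetic, b=1.0)
```

In this parameterization, a change in a corresponds to response gain and a change in C50 corresponds to contrast gain, which is why the two parameters are estimated separately for each condition.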
A bootstrapping procedure was then performed to assess significant differences between conditions and to establish 95% confidence intervals on the best fitting model parameters (see Fig. 4C,E,G). First, we resampled EEG trials with replacement for each individual subject. Next, we averaged the resampled trials across subjects to generate a CRF for each experimental condition, and then we fit the averaged CRFs to generate an estimate of a and C50. This resampling and fitting procedure was then repeated 10,000 times to create bootstrapped distributions from which confidence intervals associated with each parameter (a and C50) were computed. To evaluate the interaction between attention field size (focused/distributed) and the locus of attention (attended/ignored) on the model parameters, we compiled bootstrapped distributions of the differences between the estimated fit parameters in the focused-attention condition and the distributed-attention condition, i.e., (focused: attended − ignored) − (distributed: attended − ignored), and computed the percentage of values in the tail of this distribution that were greater or less than zero. Then, post hoc comparisons were performed to test for additional differences between pairs of conditions by evaluating bootstrapped distributions of differences and then computing the percentage of the values in the tails of these distributions that were greater or less than zero. We used two-tailed statistical tests to be conservative and because the NMA does not make specific predictions about the influence of attention field size on CRFs associated with ignored stimuli (Reynolds and Heeger, 2009). All p values associated with post hoc comparisons were Bonferroni corrected, resulting in a corrected threshold for eight comparisons of α < 0.0063 (two tailed).
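The resampling logic can be sketched in simplified form as follows. The sketch resamples a flat list of values and bootstraps their mean, whereas the procedure above resamples trials within each subject and refits the CRF on every iteration. The doubling of the smaller tail proportion to obtain a two-tailed p value, the toy data, and the function names are assumptions.

```python
import random


def bootstrap_means(values, n_boot=2000, seed=0):
    """Sorted bootstrap distribution of the mean (resampling with replacement)."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_boot):
        sample = rng.choices(values, k=len(values))
        means.append(sum(sample) / len(sample))
    return sorted(means)


def percentile_ci(sorted_dist, level=0.95):
    """Percentile confidence interval from a sorted bootstrap distribution."""
    lo_idx = int((1.0 - level) / 2.0 * len(sorted_dist))
    hi_idx = int((1.0 + level) / 2.0 * len(sorted_dist)) - 1
    return sorted_dist[lo_idx], sorted_dist[hi_idx]


def two_tailed_p(sorted_dist):
    """Twice the smaller proportion of bootstrap values on either side of zero."""
    n = len(sorted_dist)
    above = sum(v > 0 for v in sorted_dist) / n
    below = sum(v < 0 for v in sorted_dist) / n
    return min(1.0, 2.0 * min(above, below))


ALPHA_CORRECTED = 0.05 / 8  # Bonferroni threshold for eight comparisons

# Hypothetical parameter differences (e.g., attended minus ignored)
toy_differences = [0.4, 0.9, 1.3, 0.2, 0.7, 1.1, 0.5]
dist = bootstrap_means(toy_differences)
ci_low, ci_high = percentile_ci(dist)
p_value = two_tailed_p(dist)
```

Dividing 0.05 by eight gives 0.00625, which rounds to the corrected threshold of 0.0063 reported above.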
fMRI subjects.
fMRI data were obtained from five neurologically healthy human subjects (21–31 years old, three females), three of whom participated in the main EEG experiments. All subjects signed an informed consent form approved by the Institutional Review Board at UCSD and participated in the study for monetary compensation of $20 per hour. Each subject participated in a 2 h scanning session to acquire fMRI for the experimental task, high-resolution anatomical images, and functional localizer scans. Retinotopic mapping for all subjects was carried out in a separate session using standard procedures (Engel et al. 1994; Sereno et al., 1995).
fMRI task design.
We conducted this control fMRI experiment to independently verify that the manipulation of attentional field size successfully modulated the spatial extent of activation in early visual areas. The protocol and stimulus parameters of the fMRI version of the experiment were similar to the main EEG experiment (see above, EEG task design), except that the auditory cue was replaced by a colored arrow cue (red or blue), there was only one attention strategy used on each trial, and the stimulus flicker frequency was either 20 or 30 Hz (which was limited by the 60 Hz refresh rate of the projector and varied only slightly from the 21.25 and 28.33 Hz flicker rates used in the EEG study). Stimuli were front-projected on a screen (90 cm width), located 380 cm from the subject's eyes. The stimulus contrast was fixed at 20.3%, the contrast level at which SSVEP responses reached about half of their maximal response. The gray nontarget stimulus disk (radius, 0.60°) was flickered in an area with a radius of 2.00°. The target stimulus could appear within a radius of 0.93° in the focused-attention condition and 2.65° in the distributed-attention condition. Note that the stimulus dimensions used in the fMRI study are smaller than those in the EEG study because the narrow scanner bore limited the visible screen area. However, the ratios of the stimulus parameters across conditions are closely matched to those used in the EEG study.
Each trial started with an arrow cue pointing to the left or the right side of the visual field (duration, 2 s), instructing subjects to covertly shift their attention to the left or the right (with equal probability). Subjects applied the focused- and distributed-attention strategies when the arrow cue was red and blue, respectively. The flickering nontarget disks then appeared in the left and right lower visual fields at 20 and 30 Hz, respectively (and vice versa on one-half of the trials) for 3 s, followed by a passive-fixation ITI of 3 s. Each run contained 12 focused-attention nontarget trials, 12 distributed-attention nontarget trials, 4 focused-attention target trials, 4 distributed-attention target trials, and 5 null trials (null trial duration, 8 s of passive fixation) in pseudorandomized order. Each subject completed 9–10 runs of the main fMRI experiment.
fMRI functional localizer task.
Subjects also performed two runs of a functional localizer task to identify voxels that were visually responsive to the portion of the visual field subtended by the maximum area over which a target stimulus could be presented during the distributed-attention condition in the fMRI main task (radius, 2.65°). Note that the size of the localizer stimulus is larger than the region over which the flickering nontarget disks were presented (i.e., the stimulus drive had a radius, 2.00°; see respective ratio in Fig. 2A). Subjects maintained fixation while covertly attending to a single flickering circular checkerboard with 100% contrast that was alternately presented in the left and right stimulus locations for 12 s/trial. Subjects responded with a button press when they perceived a brief and small contrast change in the checkerboard; contrast detection targets could appear between one and three times per 10 s trial.
fMRI retinotopic mapping procedure.
Striate and extrastriate visual areas (V1, V2v, V3v, V2d, V3d) were defined by standard retinotopic mapping procedures (using a rotating counterphase flickering checkerboard), and the data were projected onto a computationally inflated gray/white matter boundary surface reconstruction for visualization (Engel et al. 1994; Sereno et al., 1995).
fMRI data acquisition, preprocessing, and analysis.
All subjects were scanned on a 3T GE MR750 scanner at the Center for Functional Magnetic Resonance Imaging at UCSD. Functional images were collected using a gradient EPI pulse sequence and a 32-channel head coil (Nova Medical), except for one subject, for whom an 8-channel head coil was used. Functional acquisition parameters were otherwise identical (19.2 × 19.2 cm FOV, 64 × 64 matrix size, 35 3-mm-thick slices with 0 mm gap, TR = 2000 ms, TE = 30 ms, 90° flip angle), yielding a voxel size of 3 × 3 × 3 mm. We acquired axial slices that covered the entire occipital cortex. In addition, we obtained a high-resolution anatomical scan (fast spoiled gradient-recalled-echo T1-weighted sequence, TR = 11 ms, TE = 3.3 ms, TI = 1100 ms, 172 slices, 18° flip angle, 1 mm³ resolution). EPI images were first unwarped using the FMRIB Software Library (Oxford, UK). Then, BrainVoyager 2.3 (Brain Innovations) was used to perform slice-time correction, 3D motion correction, temporal high-pass filtering (three cycles per run), and transformation into Talairach space.
In the main analysis, we first used a general linear model (GLM) to identify voxels that showed a significant response to contralateral versus ipsilateral epochs of visual stimulation during the independent functional localizer task [single-voxel false discovery rate (FDR)-corrected threshold, p < 0.05]; voxels showing a significant response in each area were then retained for further analysis in the main experimental task. Next, we ran a GLM with eight regressors (focused attention left, focused attention right, distributed attention left, distributed attention right, focused attention left target trial, focused attention right target trial, distributed attention left target trial, and distributed attention right target trial) on each retained voxel in each visual area. Note that in the main fMRI task, we only analyzed the nontarget trials to keep the stimulus drive fixed across the focused and distributed conditions and to prevent possible confounds from target-evoked sensory responses, decision processes, and/or motor-related processes. Each regressor was constructed by convolving a boxcar model of the stimulus sequence with the standard difference-of-two-gamma function hemodynamic response function model implemented in BrainVoyager. We then performed t tests on the resulting beta weights to assess the proportion of voxels in each visual area that showed a significant response on trials where attention was directed to the contralateral visual field. A sign test was then performed to determine whether the number of visual areas in which more voxels were significantly active in the distributed condition (compared to the focused condition) was different from the number of areas expected by chance. Since these areas are retinotopically mapped, a higher proportion of significant voxels in one condition compared to another should translate into a larger spatial extent of activation, allowing us to infer changes in the relative size of the attention field.
To ensure that the results were not biased by the exact choice of a single-voxel statistical threshold, we used three p values, p < 0.10, p < 0.05, and p < 0.01, all FDR corrected. We also repeated this analysis across a large range of statistical thresholds (a range of t values from −1 to 10; see Fig. 5B). At each point in these cumulative plots, we computed the percentage of voxels within each unilateral ROI (30 ROIs total) with t values exceeding each t threshold. At each t threshold, we then compared whether more ROIs had more voxels active during the distributed-attention condition compared to the focused-attention condition than would be expected by chance using a sign test, correcting for multiple comparisons using FDR (p < 0.05).
Finally, we examined the response in all voxels within each localizer-defined ROI to determine whether the least active voxels in the focused-attention condition became more active in the distributed-attention condition. We sorted voxels from each ROI into 20 evenly spaced bins based on the β coefficient corresponding to the focused-attention condition. Next, we computed the average response across all voxels within each of the 20 bins on both focused- and distributed-attention trials. We then evaluated, for each bin, whether average betas increased in more ROIs than would be expected by chance using a sign test (corrected for multiple comparisons at an FDR threshold of p < 0.05).
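The sign tests used in these ROI-level comparisons can be sketched as a two-sided binomial test against a chance probability of 0.5. The sketch assumes that ties are excluded before counting, and the function name and the example count are hypothetical.

```python
from math import comb


def sign_test_p(n_positive, n_total):
    """Two-sided sign test p value under a fair-coin null (ties excluded beforehand).

    n_positive: number of ROIs where the distributed condition exceeds the
    focused condition; n_total: number of ROIs compared.
    """
    extreme = max(n_positive, n_total - n_positive)
    tail = sum(comb(n_total, i) for i in range(extreme, n_total + 1)) / 2 ** n_total
    return min(1.0, 2.0 * tail)


# Hypothetical example: 24 of 30 unilateral ROIs show more active voxels
# in the distributed-attention condition than in the focused-attention condition.
p_example = sign_test_p(24, 30)
```

When the test is repeated at many t thresholds or bins, the resulting p values would still need a multiple-comparison correction such as the FDR procedure described above.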
Additional model simulations.
Since EEG measurements are presumably influenced by distributed activity across visual areas with different RF sizes, we modified the NMA as written by Reynolds and Heeger (2009) to use the exact stimulus/task parameters used in the EEG study (for model parameters, see Table 1) and constrained the simulation of RF sizes in each modeled visual area based on estimates from monkey neurophysiology (Gattass et al., 1981, 1988; Freeman and Simoncelli, 2011). The stimulus is assumed to have a Gaussian shape. This is consistent with the fact that the potential location of the small nontarget disk is randomly drawn from a nonuniform distribution, with higher probability associated with the disk appearing near the center of the stimulus window than near the edges, similar to a Gaussian. The attention field, excitatory field, and suppressive field are also assumed to have Gaussian shape similar to the original NMA (Reynolds and Heeger, 2009). Given the eccentricity of our stimuli in the EEG experiment (8.58°), the RF sizes were fixed at V1 = 1.46°, V2 = 4.33°, and V4 = 6.86° (Gattass et al., 1981, 1988; Freeman and Simoncelli, 2011). In the RF-size-constrained model, the size of the excitatory field was set equal to the estimated RF size of neurons in each region. The bandwidth of the suppressive field is set to be two times larger than the stimulation field (Cavanaugh et al., 2002); thus the size of the suppressive field is linearly scaled with the RF size. We convolved the excitatory field with the stimulus (E) and applied the attentional field (A) via multiplication to estimate the stimulus drive as enhanced by attention (AE). Then, we convolved the suppressive field with AE (the stimulus drive that is already enhanced by attention) to generate the suppressive drive (S). Finally, we divided AE by S to generate a predicted population response. The model predicts similar patterns of gain modulation across all simulated visual areas (Fig. 1C). 
Specifically, in the focused-attention condition, there is a clear response gain pattern (i.e., attention enhances already strong responses). These enhanced responses are normalized when the attention field becomes larger, resulting in a pattern resembling contrast gain in the distributed-attention condition. The observation of similar effects in all areas is particularly relevant as our scalp-recorded SSVEP signals likely reflect the combined activity across several early visual areas in occipital cortex.
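The simulation steps described above (convolve the stimulus with the excitatory field, multiply by the attention field, convolve with the suppressive field, then divide) can be sketched in one spatial dimension as follows. This is a reduced illustration, not the simulation code used for Figure 1. It uses the stated sizes directly as Gaussian standard deviations, sets the peak attentional gain (att_gain) to a hypothetical value, and adds a constant sigma to the denominator, as is common in normalization models, so that the response depends on contrast; none of these choices is specified in the text.

```python
import math

STEP = 0.5                               # spatial sampling in degrees (assumed)
XS = [i * STEP for i in range(-80, 81)]  # 1D visual space from -40 to +40 deg
CENTER = len(XS) // 2                    # index of x = 0


def gaussian(x, sd):
    return math.exp(-0.5 * (x / sd) ** 2)


def smooth(profile, sd):
    """Convolve a spatial profile with a unit-sum Gaussian kernel (zero padded)."""
    kernel = [gaussian(x, sd) for x in XS]
    total = sum(kernel)
    out = []
    for i in range(len(profile)):
        acc = 0.0
        for j, value in enumerate(profile):
            k = i - j + CENTER
            if 0 <= k < len(kernel):
                acc += value * kernel[k]
        out.append(acc / total)
    return out


def peak_response(contrast, att_sd, att_gain, stim_sd=4.9, rf_sd=6.86, sigma=0.1):
    """Population response at the stimulus center for one contrast level."""
    stimulus = [contrast * gaussian(x, stim_sd) for x in XS]
    excitatory = smooth(stimulus, rf_sd)                       # E
    attention = [1.0 + (att_gain - 1.0) * gaussian(x, att_sd)  # A (1 = no attention)
                 for x in XS]
    drive = [a * e for a, e in zip(attention, excitatory)]     # AE
    suppressive = smooth(drive, 2.0 * rf_sd)                   # S, field twice the RF size
    return drive[CENTER] / (suppressive[CENTER] + sigma)


CONTRASTS = [0.025, 0.053, 0.096, 0.206, 0.465, 0.90]  # as proportions
ignored = [peak_response(c, 2.04, 1.0) for c in CONTRASTS]      # gain of 1: unattended
focused = [peak_response(c, 2.04, 2.0) for c in CONTRASTS]      # small attention field
distributed = [peak_response(c, 6.54, 2.0) for c in CONTRASTS]  # large attention field
```

In this reduced form, widening the attention field raises the suppressive drive while leaving the attended stimulus drive at the center unchanged, so the attentional benefit at high contrast is smaller in the distributed condition than in the focused condition.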
Table 1. Model parameters used in Figure 1A–E
Although the SSVEP signals that we measure reflect oscillatory activity evoked by stimuli spanning the entire stimulus drive (4.9°), we also considered a model in which the stimulus drive consisted of a single presentation of a 1.2° nontarget disk (Fig. 1D). Under these conditions, the NMA still predicts the same general shift from relatively more response gain to relatively more contrast gain as the size of the attention field increases. Note that the simulation parameters in Figure 1D are the same as in Figure 1C, except that the stimulus is 1.2° in radius and has a square-wave shape (consistent with the shape of the small disk used in the EEG experiment). We also ran the simulation (using the parameters from V4) in which the stimulus size is fixed at 4.9° but the size of the attention field varies from 2.0° to 6.5° consecutively in 0.5° steps (Fig. 1E). The model simulation predicts that as the size of the attention field increases, response gain will decrease and contrast gain will increase in a continuous manner. This highlights the importance of the relative size of the attention field and the stimulus drive: a shift from response to contrast gain should be observed as long as the size of the attention field increases and the stimulus drive remains constant.
Results
Behavioral results
Subjects' accuracy during the EEG experiment significantly improved (Fig. 2C, left; F(5,30) = 128.547, p < 0.001), and reaction times (RTs) on correct trials significantly decreased as a function of stimulus contrast (Fig. 2C, right; F(4,24) = 18.151, p < 0.001). Although accuracy was unaffected by changes in the spatial scope of attention, subjects responded more slowly in the distributed condition than in the focused condition (F(1,6) = 7.159, p = 0.037), consistent with the increased uncertainty of the target location. This selective influence of attention field size on RT, as opposed to accuracy, is consistent with our instructions that the subjects emphasize responding accurately.
A, The spatial attention task required either a focused- or distributed-attention strategy. In the focused-attention condition (left), the location of a small oriented-grating target was constrained to an area (a blue dotted ring, not shown in the actual display) that was smaller than the spatial extent of the nontarget disks (a black ring, not shown in the actual display). In the distributed-attention condition (right), the target could appear across a larger region of space (a cyan dotted ring, not shown in the actual display). B, Trial structure. Each trial began with a 2 s arrow cue that instructed subjects to attend to either the left or the right stimulus, followed by a 50 ms auditory cue whose pitch (high/low) instructed subjects to adopt either a focused- or distributed-attention strategy. A second auditory cue was then presented after a pseudorandomly selected interval of 3–4 s, and instructed the subjects to either maintain or switch their attention strategy. The small flickering nontarget disks in the lower left and lower right quadrants were updated at 21.25 and 28.33 Hz, respectively (or vice versa on one-half of the trials). The stimulus contrast was set to 2.5, 5.3, 9.6, 20.6, 46.5, or 90.0% on each trial. The ITI was 2 s. No target was presented on 75% of the trials, and these nontarget trials formed the basis for all subsequent analyses of the EEG signal. On the remaining 25% of the trials, a single square-wave grating target stimulus could appear briefly for 11.7 ms (green cross, bottom) in the first half of the trial, in the second half of the trial, or in both. C, Behavioral accuracy (left) and reaction times for correct trials (right) as a function of stimulus contrast. Note that no correct responses were made at the 2.5% contrast level, as the stimuli were nearly invisible. All error bars are ±1 SEM across subjects.
SSVEP results
SSVEPs are well suited to evaluate the impact of changes in the spatial scope of attention on neural CRFs because they reflect synchronized activity pooled across neurons in visual cortex (Regan 1989; Rager and Singer, 1998; Srinivasan et al., 1999), and models such as the NMA make predictions regarding changes in the gain pattern of neural activity at the population level under the assumption that individual neurons in the population share a similar dependence on stimulus contrast (Reynolds and Heeger, 2009). In addition, previous research has established that SSVEP signals are influenced by attention and thus provide a sensitive measure to study patterns of sensory gain modulation (Müller et al., 1998a,b, 2003; Kim et al., 2007). Figure 3A shows the scalp topography of the grand-averaged SSVEPs collapsed across the focused- and distributed-attention conditions and all stimulus contrast levels. We obtained reliable SSVEP responses for both 21.25 and 28.33 Hz stimuli that peaked over posterior–occipital regions contralateral to the stimulus. Note that the attention effects (third row) were slightly less lateralized than the responses in each condition considered in isolation; this occurred because responses to attended stimuli not only had a higher SNR, but were also slightly broader in their spatial distribution. Since the peak distribution of SSVEP responses for each stimulus frequency assignment varied slightly across individuals, we selected five focal EOIs for each frequency from each observer for further analysis (collapsing across contrast levels; Fig. 3A, bottom; see Materials and Methods).
A, Topographies of the grand-averaged SNR of the SSVEPs, collapsed across focused- and distributed-attention conditions, the first half and the second half of the trial, and all stimulus contrast levels. SSVEPs for attended (first row) and ignored (second row) stimuli peaked at posterior occipital sites contralateral to the stimulus that evoked the response. Attention boosted the SNR of the SSVEPs (third row). Five focal electrodes were selected separately for each frequency assignment and each stimulus location as EOIs. The fourth row shows the probability of each electrode being included as an EOI across all subjects. B, Topographical maps for each attention condition and each stimulus contrast level. The plot is collapsed across stimulus frequencies and stimulus locations. The left half and the right half of the head model correspond to electrodes ipsilateral and contralateral to the stimulus location, respectively. Consistent with the main CRF results (Fig. 4), the SNR of the SSVEP increases as a function of stimulus contrast. In the focused-attention condition, the response at the highest contrast level is most enhanced by attention. In contrast, responses to high-contrast stimuli are relatively attenuated in the distributed-attention condition, and response enhancement is most evident at midlevel contrasts instead. C, Posterior–occipital view of the grand-averaged SNR of the SSVEPs for each stimulus frequency assignment during the first and second halves of the trial. Data were collapsed across focused- and distributed-attention conditions and across all stimulus contrast levels. Blue and red lines show the response to attended and ignored stimuli, respectively. For each frequency assignment, there are clear SSVEPs that are sharply tuned to the flicker frequency of each stimulus and that peak in posterior–occipital electrodes contralateral to the stimulus location. 
Green boxes highlight sets of electrodes that responded robustly to stimuli in each location and stimulus frequency assignment.
The grand-averaged SSVEPs are plotted as a function of stimulus contrast and attention in Figure 4A: the focused-attention condition yielded a pattern that qualitatively resembles response gain (left), and the distributed-attention condition yielded a pattern that qualitatively resembles contrast gain (right). As an initial evaluation of attention-related differences between the CRFs shown in Figure 4A, we performed a three-way repeated-measures ANOVA. There was a significant main effect of the locus of attention (response to attended versus ignored stimulus, F(1,6) = 15.287, p = 0.008), a main effect of stimulus contrast (F(5,30) = 12.573, p < 0.001), and an interaction between the locus of attention and stimulus contrast (F(5,30) = 6.584, p < 0.001). In addition, there was a significant three-way interaction between the size of the attention field, the locus of attention, and stimulus contrast (F(5,30) = 8.972, p < 0.001), demonstrating that changes in the size of the attention field lead to different patterns of CRF modulation. This pattern can also be qualitatively seen in the scalp distribution of the SSVEP SNR (Fig. 3B). Importantly, this pattern of results was reproduced when data from the first half (Fig. 4D) and the second half (F) of each trial were analyzed separately. Specifically, there was a significant main effect of the locus of attention (first half, F(1,6) = 9.690, p = 0.021; second half, F(1,6) = 22.184, p = 0.003), a main effect of stimulus contrast (first half, F(5,30) = 10.345, p < 0.001; second half, F(5,30) = 13.087, p < 0.001), an interaction between the locus of attention and stimulus contrast (first half, F(5,30) = 3.598, p = 0.011; second half, F(5,30) = 3.881, p = 0.008), and an interaction between the size of the attention field, the locus of attention, and stimulus contrast (first half, F(5,30) = 4.213, p = 0.005; second half, F(5,30) = 3.869, p = 0.008).
Note that consistent results were observed when raw SSVEP amplitudes were analyzed as opposed to SNR (data not shown).
A, CRFs based on the grand-averaged SNR (N = 7) associated with attended and ignored stimuli separately for the focused-attention (dark blue for the attended stimulus, red for the ignored stimulus) and distributed-attention conditions (cyan for the attended stimulus, orange for the ignored stimulus). The continuous curves represent fits based on a Naka–Rushton equation (see Materials and Methods). In the focused-attention condition, R² was 0.958 and 0.989 for attended and ignored responses, respectively. In the distributed-attention condition, R² was 0.835 and 0.987 for the attended and ignored responses, respectively. Focused attention led to a pattern of response gain (left), and distributed attention led to a pattern of contrast gain (right). B, The attentional modulation for focused and distributed conditions (the difference between the data shown in left and right panels in A). Asterisks indicate significant differences as assessed by post hoc t tests with Bonferroni correction for multiple comparisons (both t(6) values > 3.901; p values < 0.00833, two-tailed). C, Results from the bootstrapping analysis demonstrating changes in the response amplitude parameter, a (left), and the semisaturation contrast, C50 (right), as a function of the size of the attention field and the locus of attention. The interaction between the size of the attention field and the locus of attention is significant for both a and C50, with more pronounced response gain in the focused-attention condition and more pronounced contrast gain in the distributed-attention condition. D, F, Similar results were observed when data from the first half (D) and the second half (F) of each trial were analyzed separately. E, G, The same pattern of bootstrapping results was also observed when data from the first half (E) and the second half (G) of each trial were analyzed separately. FA, Focused attended; FI, focused ignored; DA, distributed attended; DI, distributed ignored.
For C, E, and G, asterisks indicate significant differences as assessed with post hoc comparisons (Bonferroni corrected for 8 comparisons, all p values < 0.0063, two-tailed). Hash marks indicate differences without Bonferroni correction (all p values < 0.05). Error bars in A, B, D, and F are ±1 SEM across subjects. All error bars in C, E, and G represent 95% confidence intervals of the bootstrap distributions.
Although the ANOVA presented above confirms that there is an effect of attention field size on the CRFs, it does not directly address the selectivity of changes in terms of response or contrast gain. Therefore, we next fit a Naka–Rushton equation (Eq. 1) to the CRFs associated with each condition to evaluate the influence of attention field size on response gain (as indexed by the response amplitude model parameter, a; Fig. 4C, left) and contrast gain (as indexed by the semisaturation constant model parameter, C50; Fig. 4C, right).
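As an illustration of how the two parameters map onto the two gain patterns, the sketch below evaluates a commonly used form of the Naka–Rushton function. The exact form of Eq. 1 (including whether the exponent is fixed and how the baseline is handled) is given in Materials and Methods and may differ; the function name, the default exponent of 2, and the parameter values used here are assumptions made only for illustration.

```python
import numpy as np

def naka_rushton(c, a, c50, b=0.0, n=2.0):
    """Contrast-response function: R(c) = a * c^n / (c^n + c50^n) + b.

    a   : response amplitude (indexes response gain)
    c50 : semisaturation contrast (indexes contrast gain)
    b   : baseline offset (assumed 0 here)
    n   : exponent (assumed 2 here)
    """
    c = np.asarray(c, dtype=float)
    return a * c**n / (c**n + c50**n) + b

# Contrast levels used in the experiment, expressed as proportions.
contrasts = np.array([0.025, 0.053, 0.096, 0.206, 0.465, 0.90])

baseline = naka_rushton(contrasts, a=1.0, c50=0.2)       # hypothetical reference curve
response_gain = naka_rushton(contrasts, a=1.5, c50=0.2)  # larger a: multiplicative scaling
contrast_gain = naka_rushton(contrasts, a=1.0, c50=0.1)  # smaller c50: leftward shift
```

With a zero baseline, increasing a multiplies the response by the same factor at every contrast, so the absolute enhancement is largest at high contrast. Decreasing C50 instead shifts the curve leftward, so the enhancement is concentrated at low and middle contrasts and vanishes as the response saturates.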
Using a bootstrapping procedure (see Materials and Methods), we found a significant two-way interaction between the size of the attention field and the locus of attention on the response amplitude parameter, a (p < 0.0001; Fig. 4C, left). This two-way interaction was primarily driven by higher response amplitude for attended compared to ignored stimuli in the focused-attention condition (p < 0.0001; this and all other p values associated with post hoc pairwise comparisons are Bonferroni corrected with α < 0.0063), but not in the distributed-attention condition [not significant (n.s.), p = 0.3986]. We also observed higher response amplitude for attended stimuli in the focused-attention condition compared to the distributed-attention condition (p = 0.0026; Fig. 4C, left, compare dark blue, cyan bars). Finally, there was no significant difference in the response amplitude for ignored stimuli across the focused- and distributed-attention conditions (n.s., p = 0.0448; Fig. 4C, left, compare red, orange bars).
As shown in the right panel of Figure 4C, we also found a significant two-way interaction between the size of the attention field and the locus of attention on the semisaturation contrast parameter C50 (p = 0.0021). The interaction was primarily driven by a significant decrease in C50 for attended compared to ignored stimuli in the distributed-attention condition (p = 0.0010) and no difference in C50 between attended and ignored CRFs in the focused-attention condition (n.s., p = 0.5302). No significant differences were observed for attended stimuli in the focused-attention condition compared to the distributed-attention condition or for ignored stimuli in the focused-attention condition compared to the distributed-attention condition (n.s., p = 0.0146 and p = 0.0642, respectively).
Overall, these results demonstrate that the CRFs associated with attended stimuli undergo response and contrast gain in the focused-attention and distributed-attention conditions, respectively. Importantly, a similar pattern of amplitude and semisaturation parameters was observed when data from the first half (Fig. 4E) and the second half (G) of each trial were analyzed separately. Specifically, the interaction between the size of the attention field and the locus of attention is significant for both a (first half, p = 0.0018; second half, p = 0.0225) and C50 (first half, p = 0.0147; second half, p = 0.032), with more pronounced response gain in the focused-attention condition and more pronounced contrast gain in the distributed-attention condition.
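The logic of a bootstrap test for a two-way interaction can be sketched as follows. This is a simplified stand-in, not the procedure described in Materials and Methods: it assumes that one parameter estimate per subject is available for each of the four conditions (FA, FI, DA, DI) and resamples subjects with replacement, whereas the actual analysis may instead refit the Naka–Rushton function on each resampled data set. The function name, the two-sided p value definition, and the default number of iterations are all assumptions.

```python
import numpy as np

def bootstrap_interaction_p(fa, fi, da, di, n_boot=10000, seed=0):
    """Two-sided bootstrap p value for the interaction (FA - FI) - (DA - DI).

    fa, fi, da, di : per-subject parameter estimates (same subject order).
    Subjects are resampled with replacement; the p value is twice the smaller
    proportion of resampled mean contrasts falling on either side of zero.
    """
    rng = np.random.default_rng(seed)
    contrast = (np.asarray(fa, float) - np.asarray(fi, float)) \
             - (np.asarray(da, float) - np.asarray(di, float))
    n = contrast.size
    idx = rng.integers(0, n, size=(n_boot, n))   # resampled subject indices
    boot_means = contrast[idx].mean(axis=1)      # bootstrap distribution of the contrast
    p_low = np.mean(boot_means <= 0)
    p_high = np.mean(boot_means >= 0)
    return float(min(1.0, 2.0 * min(p_low, p_high)))
```

With this definition, the smallest nonzero p value that can be resolved is on the order of 1/n_boot, so the number of iterations bounds how small a reported p value can be.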
fMRI results
To independently determine whether the spatial scope of attention was larger in the distributed compared to the focused-attention condition, we performed a control study using fMRI to examine the extent of activation within retinotopically mapped regions of early visual cortex. We tested two complementary predictions. First, we predicted that more voxels overall would be significantly active in the distributed compared to the focused-attention condition. Given that these areas are retinotopically organized, a higher proportion of significantly active voxels should correspond to a larger area of the visual field. Second, we reasoned that voxels showing the least activation in the focused-attention condition should undergo a relatively large increase in activation in the distributed-attention condition (compared to voxels that already showed a high activation level in the focused-attention condition). Again, this prediction is based on the retinotopic organization of these areas: voxels with spatial receptive fields near the center of the stimulus should respond strongly in both the focused and the distributed conditions, whereas voxels with a spatial receptive field that is farther from the center of the stimulus should respond more in the distributed compared to the focused-attention condition. Note that the first prediction concerns only voxels that show a significant positive response (i.e., does the total spatial extent of voxels passing a fixed threshold change with attentional demands?). In contrast, the second prediction concerns systematic changes in the responses of all voxels within a visual area, regardless of whether their responses are significantly higher than baseline in any given condition.
With respect to the first prediction, we observed a higher proportion of significant voxels in the distributed-attention condition compared to the focused-attention condition across most localizer-defined regions of V1, V2, and V3 (at an individual-voxel FDR-corrected threshold of p < 0.05; Fig. 5A, middle; 23 of 30 areas had a higher proportion of active voxels in the distributed compared to the focused-attention condition, where left and right V1, V2, and V3 were considered separately for each subject; p < 0.005 by sign test). Similar results were also obtained at FDR-corrected individual-voxel thresholds of p = 0.10 (Fig. 5A, left) and p = 0.01 (Fig. 5A, right), indicating that the result does not just reflect a thresholding artifact (all p values < 0.01 by sign test; Fig. 5B, results showing the higher proportion of voxels in the distributed- compared to focused-attention conditions across a wider range of t thresholds). Similar results were also observed when only responses in area V1 were considered. Note that only ∼40–50% of voxels were significant across the focused- and distributed-attention conditions. This is consistent with the fact that the contrast of the stimulus used in the main fMRI experiment (20.03%) was lower than the contrast of the localizer stimulus (100%), and the area of stimulation was also smaller than the size of the localizer (see Materials and Methods). Finally, note that for all analyses we first removed the mean of the BOLD response across the entire brain on a volume-by-volume basis, so the differences between the focused and distributed conditions are unlikely to be related to changes in global activation levels associated with general arousal. In addition, Figure 5C shows representative activation maps from two subjects who participated in both the EEG and fMRI experiments.
In the distributed-attention condition, there is a visibly broader patch of activation on the cortical sheet in left/right V1, V2d, and V3d compared to the focused-attention condition (Figure 5C, orange–yellow color represents significance above baseline; p < 0.05, FDR corrected).
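For reference, the arithmetic behind a sign test on the 23-of-30 count reported above can be sketched as an exact binomial tail probability. The sketch assumes that all 30 areas enter the test with no ties and a null probability of 0.5; whether the reported p value is one- or two-sided is not stated in this section, so both are computed. The function name is hypothetical.

```python
from math import comb

def sign_test_p(n_positive, n_total):
    """Exact sign test under H0: P(positive difference) = 0.5, assuming no ties.

    Returns (one_sided, two_sided) p values for the more extreme direction.
    """
    k = max(n_positive, n_total - n_positive)
    tail = sum(comb(n_total, i) for i in range(k, n_total + 1)) / 2 ** n_total
    return tail, min(1.0, 2 * tail)

# 23 of 30 unilateral areas showed a higher proportion of active voxels
# in the distributed-attention condition.
one_sided, two_sided = sign_test_p(23, 30)
# one_sided is approximately 0.0026 and two_sided approximately 0.0052
```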
fMRI results independently address how the distributed-/focused-attention manipulation changed the spatial extent of activation in retinotopically organized regions of early visual cortex. A, At an individual voxel threshold of p = 0.05 (FDR corrected) for determining a significant response (middle), we observed a higher proportion of significant voxels in the distributed-attention condition compared to the focused-attention condition across most V1, V2, and V3 regions of interest (p < 0.005 by sign test). Similar results were also obtained at FDR-corrected individual voxel thresholds of p = 0.10 (left) and p = 0.01 (right), indicating that the result does not just reflect a thresholding artifact (all p values < 0.01, sign test). Given that these areas are retinotopically organized, a higher proportion of voxels should translate into a broader spatial representation across the cortical surface in the distributed-attention condition. B, The higher proportion of voxels in the distributed-attention condition compared to the focused-attention condition across V1, V2, and V3 was also observed across a wide range of t thresholds. C, Representative activation maps from two subjects. In the distributed-attention condition, there is a visibly broader patch of activation on the cortical sheet in left/right V1, V2d, and V3d compared to the focused-attention condition (orange–yellow color represents significant voxels above the baseline; p < 0.05, FDR corrected). These data illustrate the effects reported in A and demonstrate that the experimental manipulation led to a larger attention field in the distributed-attention compared to the focused-attention condition. Critically, since the stimulus drive in our experiment was always fixed and the size of the attention field is larger in the distributed condition compared to the focused condition, the NMA predicts higher response gain and lower contrast gain in the focused condition compared to the distributed condition (see Fig. 
1A–E, model simulations). D, Voxels in left and right V1, V2, and V3 were sorted by the β values for the focused-attention condition (low to high) and evenly divided into 20 bins. The averaged β values for the focused- and distributed-attention conditions were compared across these 20 bins (left). Voxels with low activation in the focused-attention condition underwent a significant increase in activation in the distributed condition, but those with higher activation did not. The difference between the β values for distributed and focused conditions was significant (above zero) for bins with low β values, but not in bins with higher β values (bottom), suggesting that the spatial pattern of activity in the distributed condition is more diffuse compared to the focused condition. E, The sorted β results (same as D) in individual subjects. Red asterisks in B and D indicate significance, as determined by sign tests (p < 0.05, FDR corrected). Gray asterisks indicate uncorrected difference. Error bars in all figures are ±1 SEM across all unilateral regions of interest from each subject (left/right V1, V2, and V3; 30 total ROIs).
We next tested our second prediction that the least active voxels in the focused-attention condition should undergo the largest positive change in the distributed-attention condition. To assess this possibility, we examined the response in all voxels within each localizer-defined ROI to determine whether the least active voxels in the focused-attention condition became more active in the distributed-attention condition (as we would predict if the spatial scope of attention increased). We first sorted voxels in left and right V1, V2, and V3 from low to high based on β values (i.e., the GLM-estimated response magnitude) in the focused-attention condition. Because each participant had a different number of voxels in each ROI, we evenly sorted the data from each ROI into 20 bins. Then, we compared the averaged β values for voxels in each bin across the focused- and distributed-attention conditions (Fig. 5D). Across all subjects, voxels with relatively low activation (low β values) in the focused-attention condition showed significantly higher activation in the distributed-attention condition (Fig. 5D, far left bins marked by red asterisks; p < 0.05, sign test across participants and visual areas, FDR-corrected). In contrast, voxels with higher activation in the focused-attention condition underwent a smaller change with attentional strategy (the far right bins; Fig. 5E, individual subject panels). Note that the modulation is primarily in voxels that have either negative or low positive β weights; however, the sign of the weight is only relevant with respect to the passive-fixation null trials (Stark and Squire, 2001), and so any increase (in this case, β values becoming less negative) is consistent with a more diffuse spatial response profile in the distributed-attention condition.
In addition, note that the large response increase in voxels with a low β weight is not inconsistent with the data presented in Figure 5A, as the data in Figure 5A simply indicate that more total voxels fall above an FDR-corrected threshold in each condition (i.e., a small percentage of voxels shift from nonsignificant to significant when attention is diffuse). Thus, these two analyses provide consistent and complementary evidence that the spatial scope of attention changed as a function of our task instructions.
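The sorting-and-binning step described above can be sketched for a single ROI as follows. This is a minimal illustration under stated assumptions: the function and variable names are hypothetical, voxels are split into bins of near-equal size (bin sizes differ by at most one voxel when the voxel count is not a multiple of 20), and the ROI is assumed to contain at least as many voxels as bins.

```python
import numpy as np

def binned_betas(beta_focused, beta_distributed, n_bins=20):
    """Sort voxels by focused-condition beta and average both conditions per bin.

    Returns two arrays of length n_bins: mean focused beta and mean distributed
    beta for the same voxels, ordered from the least to the most active bin.
    """
    beta_focused = np.asarray(beta_focused, dtype=float)
    beta_distributed = np.asarray(beta_distributed, dtype=float)
    order = np.argsort(beta_focused)          # voxel indices, low to high focused beta
    bins = np.array_split(order, n_bins)      # near-equal-sized groups of voxel indices
    mean_focused = np.array([beta_focused[b].mean() for b in bins])
    mean_distributed = np.array([beta_distributed[b].mean() for b in bins])
    return mean_focused, mean_distributed
```

The per-bin difference `mean_distributed - mean_focused` from each ROI could then be entered into a test across ROIs, as in the sign tests reported for Figure 5D.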
Discussion
Over the last several decades, spatial attention has been shown to modulate the contrast response functions of neurons in visual cortex in many different ways, with some studies suggesting response gain, others suggesting contrast gain, and still others suggesting a combination of both (McAdams and Maunsell, 1999; Reynolds et al., 1999, 2000; Di Russo et al., 2001; Martínez-Trujillo and Treue, 2002; Williford and Maunsell, 2006; Buracas and Boynton, 2007; Kim et al., 2007; Murray, 2008; Lee and Maunsell, 2009, 2010a,b; Carrasco 2011; Pestilli et al., 2011; Serences, 2011; Andersen et al., 2012; Fig. 1). In turn, the heterogeneous pattern of neural modulation is consistent with the wide variety of gain patterns inferred using psychophysical methods (Morrone et al., 2002, 2004; Carrasco et al., 2004; Ling and Carrasco 2006; Pestilli et al., 2007, 2009).
Previously, the NMA (Lee and Maunsell, 2009, 2010b; Reynolds and Heeger, 2009) offered a potential solution as to how these apparently inconsistent attentional gain patterns might arise via changes in the relative size of the stimulus and the attention field. The NMA generates the output of sensory neurons by multiplying the stimulus drive (E) from the classical excitatory RF with the attention field (A). This combined influence of the stimulus drive and attention (AE) is then divided by the suppressive drive (S), obtained from the convolution between AE and the nonclassical inhibitory receptive field. Given a fixed size of E and S, the size of the attention field (A) relative to the stimulus drive (E) will influence the pattern of gain of the modeled population responses. For example, an attention field smaller than the stimulus drive (focused attention) will lead to response gain because attentional gain enhances the entire stimulus drive, but only enhances the center of the suppressive field. In contrast, an attention field larger than the stimulus drive (distributed attention) will lead to contrast gain because attentional gain is applied equally to the stimulus drive and suppressive field. Consistent with this prediction, a post hoc survey of previous studies suggests that different patterns of gain modulation may be due to changes in the size of the attention field (Table 2). For example, Reynolds et al. (2000) used a large rectangular cue to direct animals' attention to a location that contained a smaller target stimulus. This display may have induced a relatively large attention field with respect to the stimulus, which is consistent with the observation of pure contrast gain. In contrast, other studies instructed animals to attend to a peripheral stimulus that was located on either the left or the right side of a video display (Williford and Maunsell, 2006; Lee and Maunsell, 2010a).
The same location was attended for a long sequence of trials, and no physical cue or placeholder was used to mark the target position. In this case, the size of the attention field may have closely matched (or been smaller than) the size of the target, which is consistent with the observation of response gain in many cells.
A post hoc survey of studies examining the influence of attention on the gain pattern of neural CRFs
Here, we used EEG to monitor SSVEPs in human subjects performing an attentionally demanding task in which the spatial scope of attention was systematically manipulated (focused vs distributed attention) while keeping the spatial extent of visual stimulation constant across trials. We found that highly focused spatial attention primarily enhanced neural responses to high-contrast stimuli (response gain), whereas distributed attention primarily enhanced responses to medium-contrast stimuli (contrast gain). Together, these data suggest that different patterns of neural modulation do not reflect fundamentally different neural mechanisms, but instead reflect changes in the spatial scope of attention.
It is possible that, in the distributed-attention condition, subjects' attention might be captured by high-contrast ignored stimuli on the opposite side of the display, and the attention field might actually extend across the vertical meridian to the ignored stimulus. In turn, this spread of attention to high-contrast ignored stimuli might attenuate any amplitude modulation (i.e., the a parameter from the Naka–Rushton equation) in the distributed-attention condition. However, we view this scenario as unlikely for two reasons. First, the stimuli were presented in different hemifields, and the spatial separation was large even with respect to the size of the attention field in the distributed-attention condition. Second, this account predicts that behavioral response times should be inflated when the ignored stimulus was high contrast and thus more salient, but no such differential inflation of response times was observed (n.s., F(1,6) = 0.034, p = 0.86; Fig. 2C, right).
Interestingly, although the two types of CRF modulation that we report here (contrast gain and response gain) are generally consistent with previous observations from electrophysiological (Reynolds et al., 2000; Di Russo et al., 2001; Martínez-Trujillo and Treue, 2002; Kim et al., 2007; Lee and Maunsell, 2009, 2010a,b; Andersen et al., 2012) and psychophysical studies (Morrone et al., 2002, 2004; Carrasco et al., 2004; Ling and Carrasco 2006; Pestilli et al., 2007, 2009), several studies have now reported that the BOLD response measured using fMRI shows a purely additive shift such that the effects of attention on the BOLD response are constant across contrast levels (Buracas and Boynton, 2007; Murray, 2008; Pestilli et al., 2011; Table 2). Some authors have suggested that this additive effect may be due to the BOLD response pooling activity across neurons that show different attention effects, which would result in an aggregate response that more closely resembles an additive shift as opposed to either contrast or response gain (Williford and Maunsell, 2006; Pestilli et al., 2011). However, the SSVEP signal that we measured in the present study also pools signals across large neural populations, but we observed contrast and response gain effects that more closely resemble patterns typically observed in single neurons (Reynolds et al., 2000; Martínez-Trujillo and Treue, 2002; Lee and Maunsell, 2009, 2010a,b). Thus, the present data suggest that the SSVEP and fMRI measurements may tap into at least partially different signals associated with attentional modulation in visual cortex. One possibility is that BOLD and SSVEP responses are differentially influenced by factors other than stimulus-evoked neural activity. For example, BOLD signals are highly sensitive to both stimulus-evoked and non-stimulus-evoked activity (Kastner et al., 1999; McMains et al., 2007; Murray, 2008; Sirotin and Das, 2009). 
Thus, relatively high BOLD responses to the attended low-contrast (and also 0% contrast) stimuli in past fMRI studies (Buracas and Boynton, 2007; Murray, 2008; Pestilli et al., 2011) may be largely attributable to anticipatory activity that is not directly evoked by the stimulus. In contrast, SSVEP signals, by definition, reflect stimulus-driven neural responses that are selectively entrained at the stimulus frequency (Regan 1989; Srinivasan et al., 1999; Rager and Singer, 1998; Müller et al., 1998a,b, 2003; Kim et al., 2007). Future studies could use multimodal imaging techniques (e.g., combining scalp-EEG SSVEPs and fMRI) to further examine the relationship between results obtained using these two imaging modalities.
Previous single-unit recording studies measuring neural CRFs have reported both response and contrast gain, fueling a long-running debate about the mechanisms of selective attention (McAdams and Maunsell, 1999; Reynolds et al., 1999, 2000; Martínez-Trujillo and Treue, 2002; Williford and Maunsell, 2006; Lee and Maunsell, 2009, 2010a,b). Here, we show that these discrepant modulatory effects can each be observed, depending on changes in the spatial extent of attention. This interaction between the size of the attention field and the gain pattern of the SSVEP-derived neural CRFs is consistent with a prediction of the NMA (Lee and Maunsell, 2009, 2010b; Reynolds and Heeger, 2009): a small attention field will result in relatively more response gain, and a larger attention field will result in relatively more contrast gain, given a constant stimulus drive (Fig. 1). However, it is important to note that the NMA is intentionally agnostic about other aspects of attentional modulation, such as the relationship between the size of the attention field and responses evoked by ignored stimuli, which the model assumes to be very far away from the focus of attention. For example, we observed modest, albeit nonsignificant, modulations of CRFs associated with ignored stimuli (Fig. 4C), and a similar observation was also reported in a previous psychophysical study (Herrmann et al., 2010). Since the NMA does not explicitly specify the exact spatial function that governs the attention field (Reynolds and Heeger, 2009), future studies will be required to determine how changes in the size and/or shape of the attention field mediate interactions between attended and ignored stimuli across the extent of the visual scene. That said, the present demonstration that changes in the size of the attention field can lead to a shift from response to contrast gain supports a key prediction of the NMA, and the general theoretical framework can be expanded as justified by new data.
Footnotes
This work was supported by NIH Grant R01-MH092345 (J.T.S.) and a James S. McDonnell Foundation grant (J.T.S.). J.O.G. was supported in part by NIH Grant R01-MH068004 to Ramesh Srinivasan. We thank Kimberly Kaye and Edward F. Ester for help with data collection, Anna Byers and Mary E. Smith for assistance with retinotopic mapping procedures, and John Reynolds, Timothy Q. Gentner, Edward Awh, Jeremy Freeman, and Scott Freeman for useful discussions.
The authors declare no competing financial interests.
Correspondence should be addressed to either Sirawaj Itthipuripat or John T. Serences, Neurosciences Graduate Program, Department of Psychology, University of California, San Diego, La Jolla, California 92093-0109. itthipuripat.sirawaj@gmail.com or jserences@ucsd.edu