Abstract
Previous electrophysiology data suggest that the modulation of neuronal firing by spatial attention depends on stimulus contrast, which has been described using either a multiplicative gain or a contrast-gain model. Here we measured the effect of spatial attention on contrast responses in humans using functional MRI. To our surprise, we found that the modulation of blood oxygenation level-dependent (BOLD) responses by spatial attention does not greatly depend on stimulus contrast in the visual cortical areas tested [V1, V2, V3, and MT+ (middle temporal area)]. An additive model, rather than a multiplicative or contrast-gain model, best describes the attentional modulations in V1. This inconsistency with previous single-unit electrophysiological data has implications for the population-based neuronal source of the BOLD signal.
Introduction
Recent neuroimaging studies show that visual spatial attention influences neuronal responses as early as the human primary visual cortex (Gandhi et al., 1999; Martinez et al., 1999; Somers et al., 1999) and even the lateral geniculate nucleus (O'Connor et al., 2002). The quantitative effect of spatial attention on neuronal responses has been studied electrophysiologically in a number of studies. One study of primary visual cortex (V1) and V4 of the macaque found that spatial attention acts on the orientation tuning functions of individual neurons by multiplying the firing rate by a constant factor (McAdams and Maunsell, 1999). This result is consistent with a multiplicative, or response gain model of attention. By measuring attentional effects for stimuli across a range of contrasts, studies in macaque V4 (Reynolds et al., 2000) and middle temporal area (MT) (Martinez-Trujillo and Treue, 2002) show that attention acts to enhance the effective contrast of the attended stimulus, a phenomenon known as contrast gain. Interestingly, a recent single-unit study of contrast responses in macaque area V4 found that attention modulation in many neurons was consistent with response gain, contrast gain, and “activity gain,” i.e., overall multiplicative scaling of neuronal activity including spontaneous activity (Williford and Maunsell, 2006). Moreover, Murray and He (2006) report that contrast invariance in object-selective human visual area LOC (lateral occipital complex) is achieved only for attended stimuli, whereas a contrast-dependent response is present when attention is withdrawn.
Functional magnetic resonance imaging (fMRI) blood oxygen level-dependent (BOLD) responses to stimuli of varying contrast [contrast response functions (CRFs)] measured in human visual cortex are closely predicted by CRFs averaged across a population of single neurons of macaque visual cortex (Heeger et al., 2000). Therefore, we might expect that attentional modulation of CRFs measured in human visual cortex using fMRI can also be predicted from macaque electrophysiology. Moreover, CRFs measured using visual evoked potentials (VEPs) in human visual cortex are modulated by attention multiplicatively (Di Russo et al., 2001). Thus, we might expect that attention has a multiplicative effect on the fMRI response.
Both multiplicative and contrast-gain models predict that effects of attention should be weakest for stimuli that produce smaller responses (such as at low contrasts). However, this prediction is not consistent with three fMRI studies that show substantial increases in the BOLD signal with spatial attention in the absence of a stimulus, which is effectively a stimulus of zero contrast (Kastner et al., 1999; Ress et al., 2000; Silver et al., 2006). If the effects of attention are already large at zero contrast, it seems unlikely that the effects of attention on the BOLD signal could be predicted by the contrast-gain or multiplicative gain model.
The purpose of our study was to investigate this discrepancy quantitatively by measuring the effects of spatial attention on the BOLD signal in early retinotopically organized visual areas for a range of contrasts using both a speed and a contrast discrimination task.
Materials and Methods
Subjects.
Two male and two female subjects (all right-handed; mean age, 30) participated in the study. All subjects had normal visual acuity. All subjects gave written informed consent in accordance with the Salk Institute Human Subjects Institutional Review Board.
Stimuli and psychophysical task.
Subjects performed the following psychophysical tasks both in the scanner and in the laboratory using nearly identical viewing conditions [for details, see Buracas et al. (2005)]. fMRI measurements were made while subjects performed either a speed or contrast discrimination task on a moving grating presented in the periphery. Computer-generated images were projected onto a semicircular back projection screen near the subject's chest (60 cm from the eyes, viewed through a mirror).
The stimulus for both the contrast and speed discrimination tasks was a moving sinusoidal grating with mean luminance matching the background (800 cd/m²). The 0.5 cycle/degree sinusoidal grating was windowed by a circular aperture of diameter 6° and centered 8° down from the horizontal meridian and 8° laterally (left or right) from the vertical meridian. The entire projected background subtended a semicircular region ∼38° of visual angle in diameter, so the grating stimuli were separated from the edges of the screen by at least 5°. The grating moved at a baseline speed of 10°/s at 45° toward the upper left when presented within the lower left visual quadrant and toward the upper right when presented in the lower right quadrant. Five baseline contrast levels were used (6.25, 12.5, 25, 50, and 75%).
For both discrimination tasks, we used a two-interval forced-choice paradigm. Each trial lasted 3000 ms and consisted of two 1000 ms stimulus presentation intervals separated by a 200 ms blank interval, a 300 ms response period, a 300 ms feedback period, and a 200 ms intertrial interval. On every trial, both the contrast and speed were independently varied between the first and second stimulus interval by a small increment. The higher contrast and faster speed appeared randomly in either the first or second interval. For the speed discrimination task, subjects indicated which of the two presentation intervals contained the faster-moving grating by pressing one of two buttons during the response interval that followed stimulus presentation. For the contrast discrimination task, subjects indicated which of the two intervals contained the stimulus with higher contrast. For the feedback interval, the outline of the fixation spot turned red for incorrect responses, green for correct responses, and yellow if no response was entered before the end of the response interval.
Before scanning, speed and contrast discrimination thresholds were measured in the laboratory for each subject for every condition using a standard one-up three-down double-interleaved staircase procedure (70 trials for each staircase run). Weibull functions were fit to the psychometric data using a maximum likelihood procedure to estimate the speed or contrast increment that would produce 80% correct performance. Data collected during the first two 1 h sessions (∼24 staircases) were excluded from analysis to minimize learning effects. Staircase runs were counterbalanced for baseline contrast level and stimulus presentation side (using an m-sequence). Threshold measurements (staircases) were repeated six times for each contrast level and discrimination task and were averaged across both the lower left and lower right visual quadrants.
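The threshold estimation described above can be illustrated with a minimal sketch. It assumes a two-interval Weibull of the form P(x) = 1 − 0.5·exp(−(x/α)^β), a coarse grid search in place of a proper optimizer, and made-up trial counts; the parameterization, the fitting routine, and the data are illustrative assumptions, not those used in the study.

```python
import math

def weibull_2ifc(x, alpha, beta):
    """Probability correct in a two-interval task (chance level 0.5)."""
    return 1.0 - 0.5 * math.exp(-((x / alpha) ** beta))

def neg_log_likelihood(alpha, beta, data):
    """Binomial negative log-likelihood; data holds (increment, n_correct, n_trials)."""
    nll = 0.0
    for x, k, n in data:
        # Clamp to avoid log(0) when the predicted probability reaches 1.
        p = min(max(weibull_2ifc(x, alpha, beta), 1e-9), 1.0 - 1e-9)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll

def fit_weibull(data, alphas, betas):
    """Coarse grid-search maximum likelihood (a stand-in for a real optimizer)."""
    return min(((a, b) for a in alphas for b in betas),
               key=lambda ab: neg_log_likelihood(ab[0], ab[1], data))

def threshold(alpha, beta, p_correct=0.8):
    """Increment at which the fitted function reaches p_correct."""
    return alpha * (-math.log(2.0 * (1.0 - p_correct))) ** (1.0 / beta)

# Hypothetical staircase data: (increment, number correct, number of trials).
data = [(0.5, 11, 20), (1.0, 13, 20), (2.0, 16, 20), (4.0, 19, 20), (8.0, 20, 20)]
alphas = [0.1 * i for i in range(5, 101)]
betas = [0.1 * i for i in range(5, 51)]
alpha_hat, beta_hat = fit_weibull(data, alphas, betas)
x80 = threshold(alpha_hat, beta_hat)
print(alpha_hat, beta_hat, x80)
```

The increment returned by `threshold` is the value that would then be used to hold performance near 80% correct.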
Speed and contrast increments obtained in the laboratory were used for the tasks in the MRI scanner. These threshold increments resulted in constant task difficulty at ∼80% correct for both tasks across all stimulus contrasts. Speed and contrast discrimination thresholds were found not to depend on whether or not a stimulus was presented in the contralateral hemifield; thus, during scanning, the same speed and contrast increments were used for both the attend versus blank condition and the attended versus unattended condition. For constant performance, contrast increment thresholds increased with baseline contrast, whereas speed increment thresholds were roughly constant with baseline contrast. These psychophysical results have been published previously [Buracas et al. (2005), their Fig. 2].
fMRI experimental design.
Our goal was to measure the fMRI response to an attended and an unattended stimulus and compare these responses to a baseline with no stimulus while maintaining the level of vigilance or arousal in our subjects at a constant level. We accomplished this by carrying out two scanning conditions using a periodic blocked design. The first condition, which we call the attended versus unattended condition, had stimuli presented on both sides of the visual field simultaneously while a cue at fixation directed the subject to perform the task on either the left or right stimulus. In this condition, speed or contrast increments occurred independently on both sides of the visual field, and subjects were instructed to indicate the higher contrast or faster speed only on the attended side. Each block consisted of eight trials (3 s per trial) with attention directed to each side, so that subjects alternated attention from left to right every 24 s. Each scan contained 11 blocks, lasting a total of 264 s.
In the second condition, called the attended versus blank condition, spatial attention and the stimulus alternated from left to right by presenting only a single stimulus in the attended visual field and cycling it between the left and right hemifield every block. The response to this condition in each hemisphere mimics an on/off blocked design while keeping the subjects' vigilance level constant throughout the scan. fMRI results from this attended versus blank condition were presented in a previous publication (Buracas et al., 2005). For both conditions, the order of the two tasks was counterbalanced across scanning sessions for each subject.
fMRI data acquisition.
Functional MRI data were acquired using a Varian (Palo Alto, CA) Unity-Inova 3T scanner with a custom-made volume coil (diameter, 23 cm) and an echo-planar imaging sequence (125 kHz). During each scan, 132 temporal frames were acquired over 264 s (repetition time, 2 s; flip angle, 90°; 24 interleaved slices of 3 mm thickness and 3 × 3 mm resolution; field of view, 192 mm). fMRI data from the first block (24 s) were discarded to avoid the effects of magnetic saturation and visual adaptation.
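As a consistency check, the timing figures given for the trials, blocks, and scans can be tied together with a few lines of arithmetic. The variable names are illustrative; all numbers are taken from the text.

```python
# Trial structure in milliseconds, as stated in the task description.
stimulus_ms = 2 * 1000   # two 1000 ms stimulus intervals
blank_ms = 200
response_ms = 300
feedback_ms = 300
intertrial_ms = 200
trial_ms = stimulus_ms + blank_ms + response_ms + feedback_ms + intertrial_ms

trials_per_block = 8
blocks_per_scan = 11
tr_s = 2                 # repetition time in seconds

block_s = trials_per_block * trial_ms // 1000
scan_s = blocks_per_scan * block_s
frames_total = scan_s // tr_s
# The first block is discarded; one left/right cycle spans two blocks.
frames_kept = (scan_s - block_s) // tr_s
cycles_kept = (scan_s - block_s) // (2 * block_s)

print(trial_ms, block_s, scan_s, frames_total, frames_kept, cycles_kept)
# prints: 3000 24 264 132 120 5
```

The 120 retained frames therefore span exactly five 48 s attention cycles.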
Twelve scans were acquired from each subject during each scanning session: a retinotopic reference scan, an MT+ reference scan, and 10 scans consisting of 5 contrast levels × 2 discrimination tasks. The same task was performed for five consecutive scans with increasing stimulus contrast. The task order (i.e., speed discrimination vs contrast discrimination) was counterbalanced across scanning sessions and subjects. The scans were separated by 1.5–2 min resting intervals. The same task was performed continuously throughout each scan, alternating between the two visual hemifields.
Each scanning session ended with a T1-weighted structural scan (magnetization-prepared rapid-acquisition gradient echo, 1 × 1 × 3 mm resolution) used to align functional data across multiple scanning sessions to a subject's reference volume. A minimum of six sessions was performed for each subject, resulting in at least 72 scans per subject.
Occipital visual cortical areas V1, V2, V3, and MT+ were localized using standard retinotopic mapping and cortical-flattening techniques described previously (Engel et al., 1994; see also Gandhi et al., 1999). Areas V3A and V4V were not consistently localized for all subjects; therefore, results in these areas are not shown. Regions of interest (ROIs) within these predefined areas were selected by means of on/off blocked design localizer scans that were run at the beginning of each session [for details, see Buracas et al. (2005)].
fMRI data analysis.
The fMRI response amplitudes were estimated by fitting a sinusoid (5 cycles/scan, 48 s period) to the time series of voxel responses averaged across a given ROI and then compensating for the hemodynamic response latency (Boynton et al., 1996). Because stimuli were presented in the lower visual field, estimated amplitudes were measured in occipital visual areas V1, dorsal V2, dorsal V3, and MT+.
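One common way to obtain such an amplitude is to take the Fourier component of the ROI time series at the stimulus frequency and project it onto the phase expected from the hemodynamic delay. The sketch below assumes 120 retained frames, a synthetic time series, and a hypothetical 6 s delay; the function names and the projection step are illustrative and are not taken from the authors' analysis code.

```python
import math

def fourier_component(ts, n_cycles):
    """Return (amplitude, phase) of the sinusoid with n_cycles per scan."""
    n = len(ts)
    mean = sum(ts) / n
    re = sum((y - mean) * math.cos(2 * math.pi * n_cycles * i / n)
             for i, y in enumerate(ts))
    im = sum((y - mean) * math.sin(2 * math.pi * n_cycles * i / n)
             for i, y in enumerate(ts))
    amplitude = 2.0 * math.hypot(re, im) / n
    phase = math.atan2(im, re)
    return amplitude, phase

def projected_amplitude(ts, n_cycles, expected_phase):
    """Signed amplitude along the phase expected from the hemodynamic delay."""
    amplitude, phase = fourier_component(ts, n_cycles)
    return amplitude * math.cos(phase - expected_phase)

# Synthetic example: 120 frames, 5 cycles, 1.5 units of modulation,
# and a hypothetical 6 s delay within the 48 s period.
n_frames, n_cycles = 120, 5
delay_phase = 2 * math.pi * 6.0 / 48.0
ts = [100.0 + 1.5 * math.cos(2 * math.pi * n_cycles * i / n_frames - delay_phase)
      for i in range(n_frames)]
amp, phase = fourier_component(ts, n_cycles)
print(amp, phase, projected_amplitude(ts, n_cycles, delay_phase))
```

Because the projection is signed, a hemisphere responding in antiphase (attention directed to the other side) yields a negative value rather than a spurious positive amplitude.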
Results
Behavioral results during fMRI data acquisition
As expected, all subjects performed at ∼80% correct for each task, contrast, and condition. Mean performance across task, contrast, and condition for each of the four subjects was 77.5, 80.0, 83.1, and 83.2%. An ANOVA of behavioral performance, with subject as a random effects variable, showed no main effect of contrast (F(4,70) = 1.83; p = 0.1327), task (F(1,70) = 0.1637; p = 0.6870), or condition (F(1,70) = 2.8706; p = 0.0947). The independence of performance from condition indicates that performance did not depend on whether there was a distracting stimulus and therefore suggests that the unattended stimulus was truly unattended.
No difference between speed and contrast discrimination tasks
An ANOVA, with subject as a random variable, found no main effect for the effect of task (speed vs contrast discrimination) on the fMRI response (F(1,304) = 2.15; p = 0.1435). Because fMRI responses did not depend on whether the subjects were performing a speed or a contrast discrimination task, we averaged data across the two tasks. This resulted in 64 measurements per contrast level for the attended vs unattended condition (4 subjects × 4 scans/contrast × 2 task conditions × 2 hemispheres) and 32 measurements per contrast level for the attended versus blank condition (4 subjects × 2 scans/contrast × 2 task conditions × 2 hemispheres).
We also found no significant interaction between task and ROI (F(3,304)= 0.1328; p = 0.9405). This is consistent with our previous result (Buracas et al., 2005) showing that, for example, area MT+ did not show a stronger response during the speed discrimination task than during the contrast discrimination task (but see Huk and Heeger, 2000).
fMRI contrast response
The filled symbols in the top row of Figure 1 show fMRI responses to the attended versus blank condition for each of the four ROIs, averaged across the four subjects and the two tasks. The error bars represent the SEs of the mean across subjects. Consistent with previous results (Boynton et al., 1996, 1999; Tootell et al., 1998) and also shown in Buracas et al. (2005), fMRI contrast response functions increase monotonically in area V1 but saturate at lower contrasts in higher visual areas. With the contrast values used for these experiments, fMRI contrast response functions do not exhibit the sigmoidal-like inflection found in single neurons. In area MT+, responses remain roughly constant along the measured range of contrasts, consistent with previous studies finding that responses within MT+ saturate at low contrasts (Tootell et al., 1995).
The solid lines through the filled symbols are the best fits of a power function to the data, described by the following:

$$R_A = b\,C^{p} \qquad (1)$$

where $R_A$ is the predicted fMRI response to the attended versus blank condition, C is stimulus contrast, b is a multiplicative constant, and p is an exponent. These parameters were chosen so that they minimize the sums of squared errors between the predicted curve and each individual subject's measurement.
Best-fitting b parameters for the results averaged across subjects are as follows: V1 = 1.66, V2 = 1.90, V3 = 1.76, and MT+ = 0.42. Best-fitting exponent values, p, for the results averaged across subjects are V1 = 0.25, V2 = 0.15, V3 = 0.11, and MT+ = −0.06. This list of decreasing exponent parameters across visual areas indicates that the responses in higher visual areas saturate at lower contrast levels than earlier visual areas.
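To make the link between exponent and saturation concrete, the sketch below evaluates a power function R = b·C^p with the reported parameter values and computes the ratio of the response at the highest contrast to that at the lowest. This ratio depends only on p, so it is unaffected by the contrast units; the units assumed here (percent) are an assumption, as the text does not state them.

```python
def crf(contrast, b, p):
    """Power-function contrast response, R = b * C**p."""
    return b * contrast ** p

# (b, p) per area, as reported in the text.
params = {
    "V1": (1.66, 0.25),
    "V2": (1.90, 0.15),
    "V3": (1.76, 0.11),
    "MT+": (0.42, -0.06),
}
contrasts = [6.25, 12.5, 25.0, 50.0, 75.0]  # percent contrast (assumed units)

# Ratio of predicted response at 75% to that at 6.25% contrast.
ratios = {area: crf(contrasts[-1], b, p) / crf(contrasts[0], b, p)
          for area, (b, p) in params.items()}
for area, ratio in ratios.items():
    print(area, round(ratio, 2))
```

A ratio close to 1 indicates a nearly flat function over the tested range, and a ratio below 1, as for the negative MT+ exponent, indicates a slight decrease with contrast.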
Modeling the effects of spatial attention
The filled symbols in the bottom row of Figure 1 show fMRI responses to the attended versus unattended condition as a function of stimulus contrast for each ROI, averaged across the four subjects and two tasks. Error bars represent the SEs of the mean across subjects.
The additive model assumes that the effects of attention simply add a constant to the fMRI response, regardless of stimulus contrast. This means that the response to the attended versus unattended condition should be fit with a horizontal line. The solid lines through the data in the bottom row of Figure 1 are least-squares fits of a horizontal line through the fMRI responses to the attended versus unattended condition. This is the same as the mean value of the response across all subjects and contrasts. These means across ROIs are V1 = 0.22, V2 = 0.40, V3 = 0.46, and MT+ = 0.21% signal change.
The multiplicative model assumes that the effects of attention multiply fMRI responses by a constant factor. In our case, we assume that the response to the unattended condition is equal to the response to the attended condition multiplied by a constant, k, which is less than one. Assuming the contrast-response function is a power function (Eq. 1), the multiplicative model predicts that the response to the attended versus unattended condition should be the following:

$$R_{A-U} = b\,C^{p} - k\,b\,C^{p} = (1-k)\,b\,C^{p} \qquad (2)$$

which is itself a power function.
Least-squares fits of the multiplicative model, assuming the parameter values of b and p from the fits to the attended versus blank condition, are shown as dashed lines in the bottom row of Figure 1. Best-fitting values of the parameter k are the following: V1 = 0.82, V2 = 0.64, V3 = 0.70, and MT+ = 0.53.
The contrast-gain model assumes that the effects of attention can be predicted by multiplying the contrast of a stimulus by a constant factor. Assuming the power function for the contrast response, multiplying the contrast by a constant factor, g, is equivalent to multiplying the contrast response function by a constant, $g^{p}$. Therefore, in this case the contrast-gain model and the multiplicative model make equal predictions for the attended versus unattended condition.
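The equivalence follows in one step from the power-function form of the contrast response. The derivation below is a sketch; identifying the resulting constant with the multiplicative factor k assumes that the contrast scaling g is applied to the unattended stimulus, a convention the text does not state explicitly.

```latex
\begin{align*}
R(gC) &= b\,(gC)^{p} \\
      &= g^{p}\,b\,C^{p} \\
      &= g^{p}\,R(C).
\end{align*}
% Under the assumed convention (g acts on the unattended stimulus),
% the multiplicative constant is k = g^{p}, so the two models cannot be
% distinguished when the contrast response is a pure power function.
```

This degeneracy is specific to the power function. With a saturating form such as a Naka-Rushton function, a contrast scaling shifts the curve along the log-contrast axis and is no longer equivalent to a vertical scaling.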
Comparing the models
Visual inspection of the best-fitting additive model (solid lines) versus the multiplicative/contrast-gain model (dashed lines) in the bottom row of Figure 1 shows that the additive model clearly fits better in V1, but not necessarily in V2, V3, or MT+. To compare the fits of the models quantitatively, a Monte Carlo simulation was conducted by generating 1000 sample data sets by sampling with replacement from the original individual scan-by-scan measurements. Sums-of-squared-error values were compared for both models when fit to each sample data set. In V1, the additive model fit better than the multiplicative/contrast-gain model in 98% of the sample data sets. However, in areas V2 and V3, the additive model fit better in only 71 and 70% of the sample data sets, respectively. In area MT+, the multiplicative/contrast-gain model fit better for 69% of the sample data sets.
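A resampling comparison of this kind can be sketched as follows. The sketch assumes that measurements are pooled across contrasts before resampling, that b and p are held fixed at the V1 values reported above with contrast in percent, and that the data are synthetic; none of these details are confirmed by the text.

```python
import random

def sse_additive(points):
    """SSE of the best horizontal line (the mean) through (contrast, response) pairs."""
    mean = sum(y for _, y in points) / len(points)
    return sum((y - mean) ** 2 for _, y in points)

def sse_multiplicative(points, b, p):
    """SSE of a * b * C**p with b and p fixed and the scale a = 1 - k fit by least squares."""
    f = [b * c ** p for c, _ in points]
    a = sum(fi * y for fi, (_, y) in zip(f, points)) / sum(fi * fi for fi in f)
    return sum((y - a * fi) ** 2 for fi, (_, y) in zip(f, points))

def bootstrap_preference(points, b, p, n_boot=1000, seed=0):
    """Fraction of resampled data sets in which the additive model has lower SSE."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_boot):
        sample = [rng.choice(points) for _ in points]
        if sse_additive(sample) < sse_multiplicative(sample, b, p):
            wins += 1
    return wins / n_boot

# Synthetic measurements: a flat 0.22% modulation plus noise (hypothetical values).
rng = random.Random(1)
contrasts = [6.25, 12.5, 25.0, 50.0, 75.0]
points = [(c, 0.22 + rng.gauss(0.0, 0.1)) for c in contrasts for _ in range(8)]
fraction_additive = bootstrap_preference(points, b=1.66, p=0.25)
print(fraction_additive)
```

Both models have a single free parameter here (the mean, or the scale 1 − k), so comparing raw sums of squared errors does not favor either model through extra flexibility.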
We also considered a hybrid model in which both an additive and a multiplicative gain change were allowed. Adding multiplicative gain to our model reduced the overall fit error across ROIs by <0.3%, despite the extra parameter value, which is not a statistically significant improvement in the quality of the fit (F(1,36) = 2.21 for nested models; p > 0.05).
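For reference, the F statistic for comparing nested least-squares models takes the standard form sketched below. The sums of squared errors in the example are hypothetical and are not the values from this study; only the degrees of freedom (1 and 36) follow the text.

```python
def nested_f(sse_reduced, sse_full, extra_params, df_full):
    """F statistic for a full model that adds extra_params to a reduced model.

    df_full is the residual degrees of freedom of the full model.
    """
    return ((sse_reduced - sse_full) / extra_params) / (sse_full / df_full)

# Hypothetical residual sums of squares for the additive (reduced)
# and hybrid (full) models.
f_value = nested_f(sse_reduced=10.0, sse_full=9.5, extra_params=1, df_full=36)
print(f_value)
```

The statistic is then compared with the F distribution with (1, 36) degrees of freedom, whose 5% critical value is approximately 4.11.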
Discussion
Our results show that, in V1, spatial attention has an additive effect on the fMRI response across stimulus contrasts. Attentional effects in V2, V3, and MT+ showed a trend in favor of the additive model over the multiplicative/contrast-gain model, but the effect did not reach statistical significance. These effects are similar for each of the four individual subjects and are shown in supplemental Figures 1–4 (available at www.jneurosci.org as supplemental material).
Relationship between psychophysics and neuronal responses
Psychophysical experiments have shown results consistent with both the electrophysiological data and human brain imaging. For example, Carrasco et al. measured the apparent contrast of attended and unattended stimuli using an exogenous cue and a contrast-matching task (Carrasco et al., 2004). They found that the perceived contrast of the cued stimulus was greater than that of the uncued stimulus and that the shift in apparent contrast was consistent with the contrast-gain model, as predicted by electrophysiology of macaque extrastriate cortex (Reynolds et al., 2000; Martinez-Trujillo and Treue, 2002). A study of the duration of the motion aftereffect (MAE) as a function of stimulus contrast and attention (Rezec et al., 2004) found that MAE duration as a function of attention and contrast could be described by a contrast-gain model of attention. Moreover, we have recently reported (Buracas et al., 2005) that BOLD CRFs in early visual cortical areas (V1, V2, V3) can explain human contrast discrimination performance.
Possible substrates of the additive shift with attention
Electrophysiological experiments typically measure responses in cells that are strongly responsive to the stimuli that are being used. Similarly, psychophysical performance is thought to depend on the responses of select subsets of neurons suitable for the task in question. Instead, the BOLD response is thought to pool over the entire population of neurons, whether responsive or not. Our BOLD responses might therefore be dominated by changes in baseline firing rates, which have been found to increase with attention (Luck and Vogel, 1997). These small changes in baseline firing rates might appear negligible compared with the changes induced by attention on responsive neurons. However, the moving grating stimuli that were used in the present experiments have restricted spatiotemporal energy, thus it is possible that only a relatively small proportion of neurons were responsive to our stimuli. A small baseline shift in a large, unresponsive subpopulation of neurons might obscure a contrast-gain effect by attention in a much smaller number of responsive neurons.
On the other hand, an additive effect of attention may be present within the responsive neurons as well. The recent report by Williford and Maunsell (2006) concludes that attentional modulation of neuronal responses in macaque area V4 is no more consistent with the contrast-gain model than with the response-gain and activity-gain models. This leaves open the possibility that an additive effect of attention can describe their population-averaged data [Williford and Maunsell (2006), their Fig. 6B,C]. An additive effect is especially notable for responses to stimuli of nonoptimal orientation.
Alternatively, it has been proposed that the BOLD response may be related more closely to the local field potentials and hence synaptic activity than to spiking neuronal activity (Logothetis et al., 2001). So perhaps attentional modulation of our fMRI response reflects a widespread subthreshold modulation of synaptic activity that is not detectable using extracellular recordings (Devor et al., 2003). Finally, it is possible that spatial attention itself may change the coupling between neuronal activity and regional hemodynamic factors, such as regional cerebral blood flow, oxygen metabolism, and blood volume. Indeed, there are neuromodulators involved in the control of attention that are also vasoactive agents (Bentley et al., 2004). Any such change in coupling would presumably be independent of the contrast of the attended stimulus.
Conclusions
The additive effect reported here is not consistent with electrophysiological single-unit recordings, which generally show an increase in effective contrast or response gain with attention. However, one recent study of attentional modulations in area V4 (Williford and Maunsell, 2006) opens the possibility that a population-averaged response in higher cortical areas is consistent with an additive model. Because fMRI signals reflect the entire population response, they may be dominated by an additive change in baseline activity and in responses to nonoptimal stimuli. A second possibility is that attention has an additive effect on the subthreshold synaptic activity that is thought to mediate the BOLD signal (Logothetis et al., 2001). Finally, it is possible that the modulation of fMRI signals by attention may reflect not only underlying neuronal activity, but also a direct modulation of vasculature by vasoactive agents. Additional investigation is required to distinguish between these three explanations.
Footnotes
- This work was supported by National Institutes of Health Grant NEI 12925. We thank Ione Fine and Minna Ng for their helpful comments.
- Correspondence should be addressed to Giedrius T. Buracas, University of California, San Diego Center for Functional MRI, 9500 Gilman Drive, La Jolla, CA 92093.