Although many studies have demonstrated that neuronal responses are modulated by attention, the significance of this modulation for behavior is poorly understood. We recorded from neurons in the middle temporal (MT) and ventral intraparietal (VIP) areas in the visual cortex of two macaque monkeys while they performed a motion detection task under two conditions of spatial attention. The ability of the animals to detect the motion was reduced when they withdrew attention from the stimulus. Withdrawing attention also reduced neuronal responses to the motion in both the MT and VIP areas. To compare the neuronal and behavioral effects of attention, the amount of attentional modulation was expressed in units of stimulus strength. On average, attention modulated neuronal responses in MT less than needed to account for the attentional effect on behavior. The opposite was observed in VIP, where the average effect of attention on neuronal responses was greater than that needed to account for behavior. Similar results were obtained when the effects of attention on neuronal response and behavioral performance were compared using a parametric function of stimulus strength. Across neurons in both areas, attentional modulation of neuronal responses was more variable than, and uncorrelated with, attentional modulation of behavioral performance. These findings suggest that attention can alter the average relationship between neuronal activity in visual cortex and behavioral performance. Where this relationship is preserved may indicate which cortical regions are most closely associated with the behavior in a given task.
Directing attention to a specific region in space improves stimulus detection at that region relative to others (Eriksen and Hoffman, 1972; Posner, 1980; Downing, 1988). Spatial attention also affects the responses of neurons in visual cortex (Bushnell et al., 1981; Motter, 1993; Desimone and Duncan, 1995; Luck et al., 1997). How the behavioral and neuronal effects of attention are related is poorly understood and was the focus of our experiments.
One possibility is that the effect of attention on the responses of visual cortical neurons can fully account for its effect on behavioral performance. This hypothesis arises from several observations. First, the neuronal modulation that occurs when attention is directed to a stimulus (Spitzer et al., 1988) or when effort is increased (Spitzer and Richmond, 1991) is typically an enhancement, which is consistent with behavioral improvements. Second, a rough correspondence exists between behavioral performance and the ability of individual neurons to discriminate among or detect stimuli (Parker and Hawken, 1985; Barlow et al., 1987; Britten et al., 1992; Geisler and Albrecht, 1997; Prince et al., 2000). Because these studies are likely to have spanned a range of attentional states, it is possible that the relationship between neuronal activity and behavioral performance persists across different attentional conditions. Third, the correspondence between neuronal activity and behavioral performance persists during improvements in performance that occur with practice (Zohary et al., 1994).
If attention alters behavior without affecting the link between neuronal activity and behavioral performance, then attention must act in a manner similar to varying stimulus strength. That is, behavioral performance should follow changes in neuronal responses, whether those changes arise from stimulus differences or changes in behavioral state. This idea is supported by recent studies which show that attention alters neuronal responses in a multiplicative manner without changing stimulus selectivity (McAdams and Maunsell, 1999; Treue and Martinez Trujillo, 1999). Multiplicative scaling is similar to the way changes in stimulus strength affect neuronal responses (Tolhurst, 1973; Sclar and Freeman, 1982). Directing attention to a stimulus may therefore have the effect of multiplicatively enhancing specific representations in sensory cortex, thereby improving detection.
We wanted to know whether the attentional enhancement of neuronal activity and behavioral performance is equivalent to the enhancement expected from increasing stimulus strength. Were this the case, it would support the idea that the effect of attention on neuronal responses in sensory cortex accounts for the attentional modulation of behavioral performance. We designed an experiment in which we could simultaneously measure behavioral performance and responses of individual neurons while spatial attention and stimulus strength were varied. We used a motion detection task and recorded in the middle temporal (MT) and ventral intraparietal (VIP) areas, two regions of the visual cortex involved in motion processing. We found that the average effect of attention on neuronal responses in MT was usually less than needed to account for changes in behavioral performance. In contrast, the average effect of attention on the VIP area responses was much greater than that needed to account for changes in behavioral performance.
MATERIALS AND METHODS
Behavioral tasks. Data were collected from two male rhesus monkeys (Macaca mulatta) while they performed a spatially cued motion detection task (see Fig. 1A). Each monkey sat in a primate chair during training and recording sessions, which lasted 2–5 hr. While the animal pressed a lever and fixated on a central point, two patches of dynamic random dots were presented on opposite sides of the fixation point. Both patches started with no net motion (0% coherent), and the animal's task was to release the lever within 750 msec after either patch began moving in a coherent manner. Coherent motion started at a random time between 500 and 8000 msec. Having at least 500 msec of 0% random motion before the coherent motion occurred minimized any effect the static cue may have had on neuronal responses. The coherent motion onset was exponentially distributed with a mean of 1300 msec. However, for the first 21 MT cells recorded in monkey 1, the coherent motion times were uniformly distributed, which resulted in a slight increase (4%) in the number of correct responses for coherent motion occurring toward the end of trials.
The diameter and location of one patch of dots was adjusted to fill the receptive field (RF) of the neuron under study, and the coherent motion in either patch was in the preferred direction and speed of the neuron. The coherent motion was present until the monkey released the lever or the end of the 750 msec reaction time window was reached. The strength of the coherent motion signal varied randomly from trial to trial among preset values to produce a range of behavioral performances.
Spatial attention was controlled by presenting a cue of static dots in one patch at the beginning of each trial. This cue indicated which patch would contain the coherent motion signal. A key element of this task was that the cue was valid 80% of the time (valid trials). In 20% of the trials, the coherent motion signal occurred in the uncued patch (invalid trials). The idea was that the animal would devote most of its attention to the cued patch of dots, because this patch was most likely to contain the coherent motion signal. This paradigm of valid/invalid cueing has been used successfully to measure the behavioral effects of spatial attention in humans (Posner, 1980).
The monkey received a reward for releasing the lever between 200 and 750 msec after the start of the coherent motion signal (correct trial). Failures to release the lever and late releases were not rewarded (missed trial). Both correct and missed trials were scored as completed trials. Trials in which the monkey released the lever early, either during the 0% coherent motion or <200 msec into the coherent motion (false alarm), or did not maintain fixation within 1° of the fixation point (fixation break) were not counted as completed and were not analyzed.
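The scoring rules above can be sketched in a few lines. This is only an illustration, not the authors' software: the function names are hypothetical, treating a release at exactly 750 msec as correct is an assumption, and so is giving fixation breaks precedence over early releases.

```python
from typing import Optional

MIN_RT_MS = 200.0  # releases earlier than this (relative to motion onset) are false alarms
MAX_RT_MS = 750.0  # end of the reaction-time window


def classify_trial(release_ms: Optional[float], fixation_held: bool) -> str:
    """Label one trial.

    release_ms: lever release time relative to coherent-motion onset
                (negative = released during 0% coherence, None = never released).
    fixation_held: True if gaze stayed within 1 deg of the fixation point.
    """
    if not fixation_held:
        return "fixation_break"  # assumption: checked before the release time
    if release_ms is not None and release_ms < MIN_RT_MS:
        return "false_alarm"
    if release_ms is not None and release_ms <= MAX_RT_MS:
        return "correct"
    return "missed"  # late release or no release at all


def is_completed(outcome: str) -> bool:
    """Only correct and missed trials count as completed."""
    return outcome in ("correct", "missed")


def proportion_correct(outcomes) -> float:
    """Behavioral performance: correct trials / completed trials."""
    completed = [o for o in outcomes if is_completed(o)]
    if not completed:
        raise ValueError("no completed trials")
    return sum(o == "correct" for o in completed) / len(completed)
```

False alarms and fixation breaks drop out of the denominator, so performance is computed over completed trials only.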
Experiments were run in a block mode in which the cue was presented at the same location for 15 completed trials (either correct or missed). Valid and invalid cues were balanced between the two patch locations. Thus, for each block the monkey had 12 valid trials in which the coherent motion occurred in the cued patch and 3 invalid trials in which the coherent motion occurred in the uncued patch. Trials in which the cue and the coherent motion were both in the RF of the neuron will be referred to as “attend in” because the monkey was directing its attention toward the RF. Trials in which the cue was outside the RF and the coherent motion occurred inside the RF will be referred to as “attend out” because the monkey's attention was directed away from the RF. Only trials in which the coherent motion occurred in the patch centered on the RF were used in this analysis.
Behavioral performance was measured as the proportion of correct trials. Four levels of motion coherence were usually tested: three non-zero levels (low, medium, and high) and 0% (catch trials). The non-zero coherence values were adjusted for each stimulus configuration to produce a range of behavioral performances for the animal. The average behavioral performance was 50, 92, and 99% correct for low, medium (validly cued), and high coherence trials, respectively. No reward was given during the 0% catch trials.
The monkeys were also trained to perform a standard memory delayed saccade task (White and Sparks, 1986). In this task, the monkey fixated on a central point while a peripheral target (0.25° diameter) appeared for 500 msec. To get a reward, the monkey had to remember the target location for 500–2500 msec and then, after the central fixation point was extinguished, saccade to within 2.5° of its location within 300 msec. Neuronal responses were analyzed only from correctly completed trials.
Random dot stimulus. The animals sat 62 cm from a computer monitor (±17 × ±13° of visual angle, 1600 × 1200 pixels, 75 Hz refresh). The stimuli consisted of two patches of white dots (each 0.25° diameter, 78 cd/m2) on a dark gray background (12 cd/m2) with a dot density of 2.1 dots/degree2. Each patch of dots was updated every other frame (37.5 Hz) using the following procedure. The dots were evenly divided into two groups. On each update, one group was replaced with new, randomly positioned dots, while dots in the other group were displaced by a fixed distance. The dots in this latter group determined the motion coherence. For 0% coherence, all of the dots in this group moved a fixed distance in a random direction. For coherent motion greater than zero, a proportion of the dots moved with a fixed distance in the same direction. This proportion determined the strength of the coherent motion. On the next update, the groups were switched. This arrangement ensured that all the dots had a lifetime of two updates (26.6 msec) before they were replaced and that there would be no changes in the apparent dot density associated with the onset of coherent motion. Because half of the dots are always randomly replotted regardless of the proportion of dots moving coherently, our motion had a maximum strength of 50% coherence. For example, at 25% coherent motion, half of the dots are randomly replotted, one-fourth are moving with the same fixed distance and direction, and one-fourth are moving with the same fixed distance in a random direction.
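The update procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the stimulus code used in the experiments: the circular patch shape, the function names, and the choice to leave dots that drift past the patch edge unwrapped are all hypothetical.

```python
import math
import random


def random_dot(radius, rng):
    """Uniform random position inside a circular patch (patch shape is an assumption)."""
    r = radius * math.sqrt(rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return (r * math.cos(theta), r * math.sin(theta))


def update_patch(groups, moving, coherence, step, direction, radius, rng):
    """Advance the patch by one update (every other video frame).

    groups    : [group0, group1], each a list of (x, y) dot positions
    moving    : index (0 or 1) of the group that is displaced on this update
    coherence : fraction of ALL dots moving coherently, 0.0 to 0.5
    step      : displacement per update; direction: coherent direction in radians
    Returns the index of the group to displace on the next update.
    """
    if not 0.0 <= coherence <= 0.5:
        raise ValueError("half of the dots are always replotted, so coherence <= 0.5")
    replaced = 1 - moving
    # One group is replaced with new, randomly positioned dots.
    groups[replaced] = [random_dot(radius, rng) for _ in groups[replaced]]
    # In the displaced group, coherence / 0.5 of the dots share one direction;
    # the rest move the same distance in random directions.
    n_coherent = round(len(groups[moving]) * coherence / 0.5)
    moved = []
    for i, (x, y) in enumerate(groups[moving]):
        angle = direction if i < n_coherent else rng.uniform(0.0, 2.0 * math.pi)
        moved.append((x + step * math.cos(angle), y + step * math.sin(angle)))
    groups[moving] = moved
    # The groups switch roles: freshly plotted dots are displaced next time.
    return replaced
```

Because the two groups alternate roles, every dot is plotted once, displaced once, and then replaced, and the dot count never changes when coherent motion begins.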
Neuronal recording and data collection. Using standard extracellular recording techniques (Gibson and Maunsell, 1997), we recorded from single neurons in MT and VIP areas in both animals. When a neuron was isolated, the receptive field was mapped using a manually controlled bar while the animal fixated on a central spot. The diameter of the receptive fields ranged from 3.9 to 10.7° (median 7.4°) for the MT area and 5 to 10.6° (median 8.2°) for the VIP area. Receptive field center eccentricities ranged from 3.9 to 11.1° (median 7.9°) for the MT area and 3.9 to 11.0° (median 8.1°) for the VIP area. The preferred speed was also judged using a bar moved by hand. The directional tuning of the neuron was determined using the motion detection task described above with 50% coherent motion presented in eight directions. For most cells, once the receptive field location, size, preferred direction, and speed were determined, the memory saccade task was run with the targets at the centers of where the random dot patches would be located. Five to 30 (median 12) correctly completed trials were collected for this task. The motion detection task was then run, and we recorded from the neuron as long as possible. The number of completed trials per coherence level for the motion detection task ranged from 15 to 175 (median 35). The monkey's performance varied with patch location, size, and motion speed, which were determined by the response properties of the neuron under study. Consequently, different neurons were tested with different coherence levels. The animal's eye position was measured every 5 msec using a scleral search coil (Robinson, 1963; Judge et al., 1980), and the occurrence of action potentials was recorded to the nearest millisecond.
Analysis. Standard statistical methods were used for most analyses. The exception was the analysis in Figure 5, in which a bootstrap procedure (Efron and Tibshirani, 1993) was used to determine whether the neuronal and behavioral effects of attention observed in a single neuron were significantly different. The bootstrap procedure has the advantage of requiring no assumptions about the distribution of the test statistic under the null hypothesis. For this analysis, trials from each neuron were randomly resampled with replacement to form new bootstrap samples. This was repeated to produce 1000 total bootstrap samples in which each bootstrap sample had the same number of trials as the original data set. For each bootstrap sample, sigmoidal and linear fits were performed as a function of motion strength on the behavioral and neuronal data, respectively. The difference between the neuronal and behavioral effects of attention was computed from these fits, forming a distribution of differences. If the 95% confidence interval of this distribution of attentional differences contained zero, then it was concluded that there was no difference between the neuronal and behavioral effects of attention (see Fig. 5, open symbols). Otherwise, it was concluded that attention had a significantly different effect on the behavior and neuronal response (p < 0.05) (see Fig. 5, filled symbols). For the marginal distributions in Figure 5, the effects of attention on the neuronal response and behavioral performance were assessed separately using the same bootstrap procedure. In this case, however, distributions were computed separately for the neuronal and behavioral data, and we tested whether the 95% confidence interval for the mean contained zero.
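The resampling logic can be sketched generically. In this sketch the `statistic` argument is a hypothetical placeholder for the step that refits the sigmoidal and linear functions and returns the neuronal minus behavioral effect of attention. The percentile method for the 95% interval is an assumption, because the text does not say how the interval was derived.

```python
import random


def bootstrap_ci(trials, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(trials).

    trials    : list of per-trial records (any type the statistic understands)
    statistic : callable mapping a resampled trial list to a single number
    """
    rng = random.Random(seed)
    n = len(trials)
    values = []
    for _ in range(n_boot):
        # Resample with replacement, keeping the original number of trials.
        sample = [trials[rng.randrange(n)] for _ in range(n)]
        values.append(statistic(sample))
    values.sort()
    lower = values[round((alpha / 2.0) * n_boot)]
    upper = values[round((1.0 - alpha / 2.0) * n_boot) - 1]
    return lower, upper


def differs_from_zero(ci):
    """True if the confidence interval excludes zero (significant at the alpha level)."""
    lower, upper = ci
    return not (lower <= 0.0 <= upper)
```

A neuron would be drawn as a filled symbol when `differs_from_zero` returns True for the difference statistic, and as an open symbol otherwise.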
Sigmoidal curves were fit to behavioral performance using a nonlinear fitting function in MATLAB (The Mathworks). Because we fit four data points with sigmoids containing two free parameters, the fits were predictably very good. For all experiments, the minimum correlation coefficient between the measured behavior and fitted sigmoid was 0.97 (median 0.99). The use of a sigmoid to describe behavioral performance as a function of motion coherence was based on days when we measured the animal's behavioral performance with more than four levels of coherent motion (data not shown). Neuronal responses as a function of motion coherence were described using a linear function. We chose this because it has been shown that MT responses increase linearly with motion coherence (Britten et al., 1993). Linear fits of neuronal responses were evaluated using standard regression analysis (F test for slope = 0; p < 0.05).
Histology. A vertical approach was used with recording chambers implanted dorsal to the superior temporal and intraparietal sulci. A histological reconstruction of recording sites was made only for monkey 1 (monkey 2 is to be used in further experiments). For monkey 1, electrolytic lesions (10 μA for 10 sec) were made at a few recording sites in the MT and VIP areas a few days before the end of recording. The extent of the MT area was mapped using myelin-stained sections (Van Essen et al., 1981). Of 56 neurons recorded in the superior temporal sulcus in monkey 1, 9 were not unequivocally within the MT area and were excluded from analysis. Sections of the intraparietal cortex revealed that we recorded from neurons located in the ventral portion of the lateral bank. Recordings from the lateral bank of the intraparietal sulcus were within 3 mm of the fundus, which has been identified as VIP (Colby et al., 1993). The Horsley-Clarke coordinates of the MT area recordings ranged from 4 to 7 mm posterior and 15 to 18 mm lateral. The VIP coordinates ranged from 1 mm posterior to 2 mm anterior and from 10 to 14 mm lateral.
We recorded from 93 MT cells and 104 VIP cells in two monkeys performing a motion detection task. Of these, 11 MT and 15 VIP neurons were excluded from the analysis on the basis of their lack of responsiveness to the coherent motion (see below).
Directional selectivity in MT and VIP
The MT area projects to several parts of the parietal cortex including the VIP area (Maunsell and Van Essen, 1983), which is a later stage of processing in the parietal stream (Ungerleider and Mishkin, 1982). Both areas contain many neurons that are strongly directionally selective (Van Essen et al., 1981; Colby et al., 1993). The directional selectivity of the neurons that we recorded is shown in Figure 2, A and B. Data from the two monkeys were combined because directional selectivity was similar for both animals. The average directional tuning curve in the left panels was based on responses to 50% motion coherence presented using the motion detection task. Directional selectivity was measured only while the animals were attending to the stimulus in the RF of the neuron.
Individual tuning curves were rotated to bring the preferred direction of each neuron to the top, and responses to different directions were then averaged. The mean firing rates for the 0% and 50% coherent motion were computed using two 300 msec periods just before and after the onset of the coherent motion (Fig. 1 B). We chose these intervals on the basis of the monkeys' reaction times, which were >300 msec 99% of the time. The spontaneous firing rate was computed using the 250 msec period just before the 0% coherent motion began.
MT neurons were more directionally selective than VIP neurons, and most were inhibited relative to the 0% coherent response by motion in the null direction. The right panels in Figure 2 show the distributions of directional selectivity expressed as a directionality index (DI). This index is the normalized vector sum of the firing rates for different directions. DI was calculated by first normalizing the average firing rate for each motion direction by the sum of all the average firing rates, giving $N_d$, where $d$ represents one of eight directions. If all $N_d$ are summed as vectors, with each vector pointing in the direction of motion, the result is a vector with a magnitude that is DI. For a cell that has no directional tuning, in which all directions of motion produced the same response, DI = 0. For a cell that only responded to motion in the preferred direction, DI = 1. The mean DI (Fig. 2, dashed lines) for MT neurons was slightly greater than that for VIP neurons, but both areas contained a large proportion of cells that could contribute to motion detection.
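The directionality index defined above amounts to a short vector sum. The sketch below follows that definition; the function name is hypothetical, and firing rates are assumed to be non-negative averages, one per tested direction.

```python
import math


def directionality_index(rates, directions_deg):
    """Magnitude of the vector sum of normalized firing rates.

    rates          : average firing rate for each motion direction (non-negative)
    directions_deg : motion direction, in degrees, for each rate
    """
    total = sum(rates)
    if total <= 0.0:
        raise ValueError("rates must sum to a positive value")
    # Normalize each rate by the summed rate, then add as vectors
    # pointing in the direction of motion.
    x = sum((r / total) * math.cos(math.radians(d)) for r, d in zip(rates, directions_deg))
    y = sum((r / total) * math.sin(math.radians(d)) for r, d in zip(rates, directions_deg))
    return math.hypot(x, y)
```

With eight equally spaced directions, identical responses in every direction cancel to DI = 0, and a response to a single direction gives DI = 1, matching the two limiting cases in the text.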
Attentional modulation in MT and VIP
Figure 3 shows the responses of a typical MT neuron recorded while the monkey performed the motion detection task. The proportion of trials in which the monkey released the lever in response to the coherent motion signal is shown in Figure 3 A for the two attentional states. The filled points in Figure 3 A correspond to trials in which the monkey directed his attention to the patch of dots in the RF (Attend in). When the motion signal was strong (30%), the monkey correctly detected the coherent motion on almost every trial. As the strength of the coherent motion was reduced, the monkey's ability to detect the motion signal decreased. At 0% coherent motion there was no signal, and a behavioral performance greater than zero indicates a false alarm or guessing rate by the animal. For both animals the false alarm rate averaged <5%, indicating that they used a conservative criterion for detecting coherent motion.
To assess the behavioral effect of attention, we measured the behavioral performance on trials when the animal was attending to the patch of dots outside the RF (Fig. 3 A, Attend out). In these trials, the cue was presented outside the RF of the neuron, whereas the coherent motion occurred in the RF. Thus, these trials had invalid cues. Only the medium strength coherent motion was presented during invalidly cued trials (which for this cell was 22.5%). The effect of withdrawing attention on behavioral performance is shown in Figure 3 A (open square) by the poorer performance when the monkey directed its attention away from the patch of dots containing the coherent motion.
We did not sample invalidly cued trials at other motion coherence levels because of the limited time available for electrophysiological recordings of single neurons. Because the invalid cues occurred on only 20% of the trials, adding another invalid cueing level would double the number of trials required for a complete data set. We measured behavioral performance using invalid cues with several motion coherence levels on days when we did not record neuronal data. The resulting psychophysical curves for the detection of the motion when the animal was not attending were always consistent with a rightward shifted version of the performance curve for when the animal was attending (data not shown).
To compare the behavioral and neuronal effects of attention, we expressed both in units of stimulus strength (percentage motion coherence). A sigmoid was fit to the behavioral performance in Figure 3 A (see Materials and Methods). Using the sigmoidal fit, the effect of withdrawing attention in units of motion coherence is shown by the dotted lines. For the invalid cueing condition at 22.5% motion coherence, withdrawing attention reduced the monkey's behavioral performance from 0.6 to 0.3. This was equivalent to the behavioral performance at the 13.5% motion coherence level in the attended condition. Thus the behavioral effect of withdrawing attention in units of motion coherence was 13.5% − 22.5% = −9%, which represented a significant change in behavior (bootstrap; p = 0.0; see Materials and Methods).
The time course of the response of the MT neuron is shown in Figure 3 B aligned to the onset of the coherent motion stimulus (vertical line). The response to the coherent motion was greater for stronger motion signals. The effect of attention on the neuronal response can be seen by comparing the histograms for the two attentional conditions at the 22.5% coherent motion stimulus. The response of the neuron was reduced on trials when the monkey was attending to the patch of dots outside the RF (Attend out). We used the increment in driven rate (Fig. 1 B, c) to quantify the strength of the neuronal response. This was computed as the mean firing rate produced by the coherent motion signal minus the mean firing rate produced by the 0% coherent motion using the two 300 msec periods just before and after the onset of the coherent motion. The mean driven response above the 0% background for this MT neuron is plotted in Figure 3 C. The driven rates of firing were typically small, but this is not surprising because the motion signals were set to be close to detection threshold.
We described the response of each neuron as a function of motion coherence using a linear relationship (see Materials and Methods). Of 93 MT cells, 11 were not well described by a linear relationship and were excluded from further analysis. Using the linear fits, we computed the effect of withdrawing attention in units of motion coherence, which is shown by the dotted line in Figure 3 C. For this neuron, withdrawing attention was equivalent to changing motion coherence in the valid cueing condition by 12.5% − 22.5% = −10%. This change, however, was not significant (bootstrap; p = 0.09).
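Expressing an attentional effect in units of motion coherence amounts to inverting the fitted attended-condition function at the unattended performance or response. The sketch below illustrates this under assumptions: the two-parameter logistic form and all parameter names are hypothetical, because the exact sigmoid is not specified, and the numbers in the usage note are illustrative rather than fitted values.

```python
import math


def behavioral_equivalent_coherence(p_unattended, c50, width):
    """Coherence at which the attended sigmoid equals the unattended performance.

    Assumes p(c) = 1 / (1 + exp(-(c - c50) / width)), a hypothetical
    two-parameter logistic; inverting it gives c = c50 + width * logit(p).
    """
    if not 0.0 < p_unattended < 1.0:
        raise ValueError("performance must lie strictly between 0 and 1")
    return c50 + width * math.log(p_unattended / (1.0 - p_unattended))


def neuronal_equivalent_coherence(r_unattended, intercept, slope):
    """Coherence at which the attended linear fit r(c) = intercept + slope * c
    equals the unattended driven response."""
    if slope == 0.0:
        raise ValueError("a flat fit cannot be inverted")
    return (r_unattended - intercept) / slope


def attention_effect(equivalent_coherence, tested_coherence):
    """Effect of withdrawing attention in units of percent motion coherence."""
    return equivalent_coherence - tested_coherence
```

For example, with a hypothetical attended fit of 2 spikes/sec per percent coherence and zero intercept, an unattended response of 25 spikes/sec at 22.5% coherence maps to an equivalent coherence of 12.5%, an effect of −10%.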
In this example, withdrawing attention reduced both behavior and neuronal activity by similar amounts. This was not true for all cells in either MT or VIP. A different effect of attention is illustrated for an example VIP neuron in Figure 4. The monkey's behavioral performance during the motion detection task is illustrated in Figure 4 A. In this case, the reduction in behavioral performance that occurred while the monkey directed attention away from the stimulus in the RF was equivalent to −9% in units of motion coherence (bootstrap; p = 0.0). The neuronal responses to the low, medium, and high coherent motion levels are shown in Figure 4, B and C. This cell, like many VIP neurons, was strongly suppressed while the monkey directed attention away from the stimulus in the RF. In Figure 4 C, the mean driven neuronal response above the 0% coherent background is shown. The responses of VIP neurons were also reasonably linear as a function of motion strength. Of the 104 VIP neurons, 15 were excluded from analysis because of poor linear fits as a function of coherent motion (see Materials and Methods). For this cell, withdrawing attention from the RF during the invalid cueing trials was equivalent to a −16.5% change in units of motion coherence (bootstrap; p = 0.0). Thus withdrawing attention had a larger impact on the neuronal response than on the behavioral performance.
Attention had different overall effects on the population of MT and VIP neurons. Figure 5, A and B, illustrates the behavioral versus neuronal effects of attention in MT expressed in units of motion coherence for each animal. Each symbol corresponds to a single neuron, and the filled symbols represent neurons for which the behavioral and neuronal effects of attention were statistically different (p < 0.05), as determined using a bootstrap procedure (see Materials and Methods). Withdrawing attention produced statistically different changes in the behavioral and neuronal activity in 38% of MT neurons (38% in monkey 1 and 37% in monkey 2). For both animals, the average attentional modulation of the neuronal response in MT neurons was smaller than the average attentional modulation of behavior (p < 0.001, monkey 1; p = 0.05, monkey 2; paired t test). The gray stars in Figure 5 show the mean attentional modulations.
On average, VIP neurons had large attentional modulation. Figure 5, C and D, shows the behavioral and neuronal effects of attention in units of motion coherence for VIP. The effects of attention on neuronal responses and behavioral performance were statistically different in 55% of VIP neurons (52% in monkey 1 and 60% in monkey 2; p < 0.05; bootstrap). Unlike MT neurons, however, the modulation in VIP was usually greater than the modulation of the behavior. The average attentional modulation (gray stars) is significantly above the unity-slope lines for both monkeys (paired t test; p < 0.001). Thus, on average, neurons in the VIP area were more affected by attention than would be predicted on the basis of behavioral modulation. In MT, neurons with attentional modulation that was significantly different from the behavioral modulation (Fig. 5, filled symbols) were almost exclusively to the right of the unity-slope lines. In the VIP area, the opposite is true, with most significant points falling to the left of the unity-slope lines.
Figure 5 also includes the marginal distributions of the neuronal and behavioral effects of attention expressed in units of equivalent motion coherence. The hatched bars indicate statistically significant effects of attention (bootstrap; p < 0.05). The mean for every distribution (indicated by the gray stars) is significantly <0% (t test; p < 0.05), except for the neuronal distribution for monkey 1 (p = 0.09).
Figure 5 shows that the neuronal modulation by attention was highly variable within both MT and VIP neurons. The behavioral effects of attention, however, are much less variable, indicating that the monkeys may have been in two relatively constant attentional states depending on the location of the cue. Large variability of neuronal modulation is commonly observed in studies of attention. The reason for this variability is not known. Figure 5 suggests, however, that the variability in neuronal modulation is probably not caused by variability in the monkeys' attentional state. These plots also show that there is little correlation between the effect of attention on the behavior and neuronal responses across recording sessions (in Fig. 5, correlation coefficients are 0.19 and 0.04 for MT neurons and 0.04 and 0.16 for VIP neurons for monkeys 1 and 2, respectively).
The open and filled symbols of Figure 5 represent two groups of neurons: those that approximated the behavioral effects of attention and those that did not. We examined whether these two groups differed in other ways. We found no difference in the directional selectivity (DI) of the two groups for either MT (two-sample t test; p = 0.16) or VIP (p = 0.52) neurons. However, neurons whose attentional modulation differed from the behavioral modulation (filled symbols) had slightly greater average responses to the coherent motion than other neurons (open symbols), although this difference was not statistically significant. For MT neurons, the average driven rates to the high coherent motion were 16.0 and 20.2 spikes/sec for neurons with the same and different effects of attention with respect to the behavior (two-sample t test; p = 0.08). In VIP neurons, the average driven rates were 13.9 and 18.8 spikes/sec for the same and different effects of attention with respect to the behavior (two-sample t test; p = 0.07). This difference in average firing rate for both MT and VIP neurons is most likely caused by the increased noise at lower firing rates. With more variability at lower response rates, a significant difference between the neuronal and behavioral effects of attention is less likely to be observed.
A limitation in the analysis of Figure 5 arises if attention affects the slope of either behavioral performance (Figs. 3 A, 4 A) or neuronal response (Figs. 3 C, 4 C). If this were the case, then the amount of attentional modulation computed would depend on the strength of motion coherence used for the unattended condition. Figure 5 also depends on accurate sigmoidal and linear fits to the behavioral and neuronal responses for each experiment. An alternative way of comparing the average effects of attention on behavior and neuronal activity that avoids these limitations is shown in Figure 6. These plots show the mean behavioral performance and mean neuronal response plotted as a parametric function of stimulus strength for MT and VIP neurons in both animals. For this analysis, performance and neuronal data were averaged across cells at each level of motion coherence (0%, low, medium, and high) while the monkeys were attending to the stimulus in the RF. A sigmoidal curve was assumed to describe how the average behavioral performance and neuronal response covaried as a function of stimulus strength and was fit to the data points. If attention exerted its influence in a manner that was similar to varying stimulus strength, then the data point corresponding to the unattended condition would fall on this line.
Figure 6 confirms the primary observations in Figure 5. For MT neurons in both monkeys, the unattended point lies to the right of the sigmoidal line, indicating that the amount of attentional modulation was less than expected on the basis of the modulation of behavioral performance (Fig. 6 A, B). The effect of attention was different, however, between the two monkeys. For monkey 2, the invalid point lies much closer to the sigmoidal line. Figure 6, C and D, summarizes the effects for VIP neurons. As with MT neurons, the unattended point for VIP neurons did not fall on the curve that describes the relationship between neuronal activity and behavioral performance in the attended condition. On average, withdrawing attention was not similar to reducing stimulus strength in VIP. Figure 6 highlights the main difference between these two brain regions. In VIP, the average attentional modulation of the neuronal signal was too large to account for the behavioral modulations, whereas in MT it was too small.
So far we have assumed that the driven rate in the neuronal response above the 0% coherent motion (Fig. 1 B, c) corresponds to the animals' detection of the motion stimulus. Although we believe that this is the most likely way the animals used the neuronal responses, it is possible that the absolute response (Fig. 1 B, b) could have been used instead. We repeated the analysis for Figure 6 using absolute firing rates, and the results are shown in Figure 7. Use of absolute rates increased the attentional modulation of the neuronal response. This increase occurred because attention also affected the neuronal response to the 0% coherent motion. The results using absolute rate in Figure 7 are similar to those using driven rate in Figure 6. The exception is for MT neurons in monkey 2, in which the unattended point now falls on the attended curve, suggesting that attention affected both absolute neuronal response and behavioral performance as a change in stimulus strength.
Figure 8 compares the effect of attention on the neuronal responses. We computed the distribution of attentional modulation using the index $(R_{\mathrm{in}} - R_{\mathrm{out}})/(R_{\mathrm{in}} + R_{\mathrm{out}})$, where $R_{\mathrm{in}}$ is the response while the animals attended to the stimulus and $R_{\mathrm{out}}$ is the response while the animals attended away from the stimulus. The corresponding ratio ($R_{\mathrm{in}}/R_{\mathrm{out}}$) is labeled on the top x-axis. The median attentional modulation corresponded to a 12 and 24% enhancement in MT neurons and a 195 and 389% enhancement in VIP neurons for monkeys 1 and 2, respectively. Thus monkey 2 had greater attentional modulation of neuronal responses in both areas compared with monkey 1.
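The relationship between the attentional index, the response ratio, and the percentage enhancement can be made explicit with a short sketch. The index formula is the one given above. The function names are hypothetical, and the conversion from index to ratio follows algebraically from the definition: ratio = (1 + index)/(1 − index).

```python
def attention_index(r_in, r_out):
    """Index (R_in - R_out) / (R_in + R_out); ranges from -1 to 1."""
    total = r_in + r_out
    if total == 0:
        raise ValueError("index is undefined when both responses are zero")
    return (r_in - r_out) / total

def index_to_ratio(index):
    """Convert the index to the ratio R_in / R_out (top axis of the plot)."""
    if index >= 1.0:
        return float("inf")  # R_out is zero, so the ratio is unbounded
    return (1.0 + index) / (1.0 - index)

def ratio_to_percent(ratio):
    """Express a ratio as percentage enhancement (1.12 corresponds to 12%)."""
    return (ratio - 1.0) * 100.0

# Hypothetical firing rates (spikes/sec), for illustration only.
idx = attention_index(11.2, 10.0)
enhancement = ratio_to_percent(index_to_ratio(idx))
```

Because the index saturates at 1 when the unattended response is zero, large enhancements are compressed on the index axis, which is one reason to show the ratio alongside it.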
In Figure 8 we calculated the amount of attentional modulation using the driven neuronal responses relative to the 0% coherent motion. For some VIP neurons, withdrawing attention produced responses that dropped below the 0% coherent motion firing rate. For these neurons, the response to the invalid cue was considered to be zero, producing an attentional index equal to 1. When absolute firing rates were used to compute the amount of attentional modulation (Fig. 1 B, b), monkeys 1 and 2 had median attentional enhancements of 12 and 28% for MT and 68 and 94% for VIP. This places these two monkeys at the low and high end of what has been observed for MT spatially directed attentional modulation (Seidemann and Newsome, 1999; Treue and Maunsell, 1999). In these studies, the animals were trained to entirely ignore the unattended stimuli. In our task, however, the monkeys directed some attention to the uncued stimuli, as indicated by their occasional responses to the invalidly cued patch of dots.
It is possible that on unattended trials the animals shifted their attention to the coherent motion immediately after the motion began. This would confound measurements of attentional modulation. To address this, we also calculated the attentional modulation for the absolute firing rate of the 300 msec of 0% coherent motion that preceded the onset of the coherent motion signal (Fig. 1 B, a). In this case, the median attentional modulation in monkeys 1 and 2 was 10 and 24% for MT and 40 and 35% for VIP neurons.
Thus for VIP neurons, the attentional enhancement of neuronal responses went up appreciably after the coherent motion signal began. This is surprising because shifting attention toward the coherent motion in the uncued patch would reduce the attentional enhancement relative to that immediately before the motion began. To see how attentional modulation evolved during the trials, we plotted the average neuronal response for the medium coherent motion, combining neurons from both monkeys. Figure 9 A shows the mean firing rates for both attentional states, with all trials aligned to the onset of the coherent motion. Only the first 500 msec of the response to the coherent motion is shown because as the animals responded, fewer trials contributed to the average. The ratio of the attentional modulation is shown in Figure 9 B. The effect of attention is relatively constant for MT neurons. For VIP neurons, however, attentional modulation increases when the coherent motion begins. The reason for this increase is unknown.
Even when the animals were not attending to the patch containing the coherent motion, they still were able to respond to the coherent motion signal on ∼50% of trials. One possibility is that the amount of attention directed at the stimulus may have been different between correct and missed trials. We examined this by plotting the average time course of the neuronal response in MT and VIP neurons (aligned to the onset of the coherent motion) for correct and missed trials separately (Fig. 10) (note expanded scales). During the 0% coherent motion, the neuronal responses corresponding to correct and missed trials in the unattended condition were almost identical (thin and thick dashed lines). After the coherent motion began, the unattended responses were greater for correct compared with missed trials in both MT and VIP neurons. For the attended condition, only the responses for correct trials are shown (solid lines) because there were not enough missed trials to estimate the average response. These results suggest that the animals maintained a relatively constant level of attention and effort before the coherent motion began. Once the coherent motion started, the animals may have quickly reoriented their attention during detection of the coherent motion in the unattended patch. If so, this reorientation may have been too weak or slow on missed trials to allow a correct behavioral response, producing the corresponding weaker neuronal response (thin vs thick dashed lines).
Memory delayed saccade activity in MT and VIP neurons
The lateral intraparietal (LIP) area has been implicated in coding intended eye movements [Andersen et al. (1997); but see Colby and Goldberg (1999)]. Although neither MT nor VIP (Colby et al., 1993) neurons are thought to be involved in this computation, if such a signal were present in our motion detection task, it could appear as pronounced attentional modulation. Because the VIP area is immediately ventral to the LIP area in the intraparietal sulcus, we were also concerned that we might have inadvertently sampled neurons that encoded intended eye movements. Although the existence of robust directional selectivity (Fig. 2) argued that our recordings were from the VIP area, for most cells we included a memory delayed saccade task to measure the effects of intended saccades. This task was done for cells in both the VIP and MT areas.
In this task the monkey had to perform a saccade to a remembered target location. There were two possible target locations corresponding to the center of the two patches of dots. Figure 11 shows the average response for all MT and VIP neurons tested using the memory saccade task. Responses were aligned to when the target was extinguished (at 500 msec). Although neurons in both areas gave a transient response to the appearance and disappearance of the target, there was no appreciable memory delay activity observed for either MT or VIP neurons. Only 5% of MT neurons and 19% of VIP neurons showed a significant difference in firing during the memory delayed period (two-sample t test; p < 0.05). Thus it is unlikely that either population of cells was coding for intended eye movements or that we were recording from the LIP area.
Our results suggest that attentional state can alter the relationship between neuronal response and behavioral performance. Although the psychophysical and neurophysiological effects of attention have been extensively studied individually, to our knowledge their interactions have not been directly examined previously. Understanding how attention alters the processing of visual information is important because attentional modulations have been seen in every visual cortical area examined, and in some cases they are a major component of the activity of a neuron.
We were particularly interested in understanding whether varying spatial attention was equivalent to varying stimulus strength for both the neuronal response and the behavioral performance. Withdrawing attention reduced behavioral performance in our motion detection task and also reduced neuronal responses in MT and VIP neurons. However, in MT neurons the average amount of attentional modulation of the neuronal signals was usually less than the average attentional modulation of behavioral performance. In contrast, the average attentional modulation of neuronal signals in the VIP area was greater than the average attentional modulation of behavioral performance.
Using a reaction time task allowed us to focus on the part of the neuronal response that most likely contributed to the behavioral performance. An assumption for most of our analysis was that the neuronal signal indicating coherent motion was the driven response above the 0% baseline during the first 300 msec of the coherent motion. One issue is whether a different measure of neuronal performance would yield qualitatively different results. Many studies that try to link neuronal response with behavioral performance use a receiver–operator characteristic (ROC) model of signal detection, which uses a statistical description of the neuronal response to produce a performance metric (Green and Swets, 1966). To determine whether such a description of the neuronal signal would affect our results, we constructed an ROC model of the neuronal responses using the two 300 msec periods before and after the start of the coherent motion as our noise and signal distributions. The result of the ROC model was qualitatively identical to the analysis reported here. Similarly, our results were not sensitive to the absolute firing rate during the coherent motion, although in this case the average MT modulation was now much closer to the behavioral modulation (Fig. 7). Other time windows (i.e., 200 and 400 msec) used to compute neuronal responses did not affect our results. Also, our results were not appreciably different when time windows were aligned to the monkey's response to account for different reaction times on individual trials. Thus, we believe that the relevant feature of the neuronal response that contributed to the behavioral performance was captured in our analysis.
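To make the ROC description concrete, the following is a minimal sketch of one common way to compute an ROC-based metric from two spike-count distributions. It is not necessarily the exact model we used. It computes the area under the ROC curve as the probability that a count drawn from the signal window exceeds a count drawn from the noise window, with ties counted as one-half. The function name and the example counts are hypothetical.

```python
def roc_area(noise_counts, signal_counts):
    """Area under the ROC curve for two samples of spike counts.

    Equals P(signal > noise) + 0.5 * P(signal == noise), computed over
    all signal/noise pairs. 0.5 means the two windows cannot be
    distinguished; 1.0 means perfect separation.
    """
    if not noise_counts or not signal_counts:
        raise ValueError("both samples must be non-empty")
    wins = 0.0
    for s in signal_counts:
        for n in noise_counts:
            if s > n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(noise_counts) * len(signal_counts))

# Hypothetical spike counts per trial in the two 300 msec windows:
# before (noise) and after (signal) the onset of coherent motion.
noise = [3, 4, 2, 5, 3]
signal = [6, 5, 7, 4, 8]
area = roc_area(noise, signal)
```

Computing this area separately for the attended and unattended conditions gives a neuronal performance measure that can be compared with behavior in the same way as the driven rate.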
The amount of extraretinal enhancement observed in MT neurons ranges from small (Ferrera et al., 1994; Seidemann and Newsome, 1999) to very large (Treue and Maunsell, 1999). This wide range of MT area attentional modulation has been hypothesized to be task dependent, although the results from our two monkeys suggest that the amount of modulation also depends on differences between animals. The differences in the amount of attentional modulation between our two animals (especially in MT) could be attributable to the fact that the animals used different strategies to allocate attention. However, the similar behavioral effects of attention suggest that this was not the case and emphasize the value of simultaneous neuronal and behavioral measurements of attention. That VIP neurons exhibited larger modulation than MT neurons in both animals suggests there is a significant difference between these areas. The large effect of attention in VIP neurons seen here is consistent with human imaging experiments that have demonstrated strong attentional modulation in regions of parietal cortex (Corbetta, 1998; Kastner and Ungerleider, 2000).
One limitation of our study comes from the constraint that we could only sample the unattended condition at a single stimulus strength. It is unknown, for example, whether the MT neuron in Figure 4 would also exhibit nearly the same attentional modulation as the behavior at several different levels of motion coherence. A single unattended data point fails to distinguish whether attention produces a shift or change in slope for neuronal and behavioral performance as a function of stimulus strength. However, if attention behaved in a manner equivalent to varying stimulus strength, then the stimulus level used for the invalid cueing trials would not affect the results. Because we found that this was not the case for the single invalid point used here, the addition of other invalid cueing points would not have changed our conclusions. Thus, even if attention was observed to be equivalent to varying stimulus strength at other motion coherences, it would not change the results reported for the motion coherences used here.
Task difficulty or effort has been shown to modulate neuronal response (Spitzer and Richmond, 1991) and affect attentional modulation (Boudreau and Maunsell, 2001). It is possible that task difficulty may have affected our results. It is unlikely, however, that the level of motion coherence chosen to examine the effects of attention had much influence on task difficulty. This is because the invalidly cued trials were only 20% of the total trials. Task difficulty in our experimental design would likely depend on the range of coherences used for the validly cued trials (80% of total trials). Because the animals never knew in advance the strength of the coherent motion on any given trial, they probably maintained a constant level of effort. A constant level of effort by the animals is supported by the observation that attentional modulation of the 0% coherent response was equivalent for correct and missed trials (Fig. 10). How varying task difficulty would affect our results, however, is unknown and remains to be tested in future experiments.
The relationship between behavioral and neuronal performance did not persist across changes in behavioral state for average responses in either MT or VIP neurons. One interpretation is that the correspondence between neuronal activity and behavioral performance, observed in other studies (Britten et al., 1992), exists only for conditions of high attention. However, we think it is more likely that a correspondence survives changes in attentional state, but only for those specific cortical regions with response properties best suited to task demands. This interpretation is based on the observation that the average neuronal enhancement by attention increases as a function of cortical hierarchy. Figure 12 shows this using data from several previous reports that measured spatial attentional modulation in more than one cortical area using identical behavioral conditions. The important observation from Figure 12 is that for each study that measured spatial attention in more than one area, the amount of modulation was greater in higher cortical areas.
Although the reason for greater attentional modulations in later stages is unknown, it has important implications for the relationship between neuronal activity and behavioral performance. If stimulus–response functions are similar for neurons in different cortical areas (e.g., as they are for MT and VIP neurons), then only certain levels of cortical processing will have a mean amount of modulation that is consistent with that needed to account for the attentional modulation of behavior. Although one might expect that this should occur at the latest stages of visual cortex, the current results suggest that this is not always the case. Neurons in the latest stages of cortex often have elaborate and specific response properties and may not be best suited for performance in tasks such as the motion detection used here. In our task, the average attentional modulation in MT and VIP neurons fell to either side of the average behavioral modulation. This suggests that an intervening area (perhaps the middle superior temporal area) would have exhibited the same amount of attentional modulation as seen in the behavioral response. This raises the intriguing possibility that the site where attentional modulation of neuronal and behavioral responses match could indicate which region of visual cortex is most directly involved in a given perceptual task.
If the behavioral effects of attention closely follow the modulation of neuronal activity in visual cortex, then the increased attentional modulation in later stages of the cortical hierarchy would have specific consequences. Later stages of visual cortex contain neurons that respond to increasingly complex stimulus attributes. The MT area, for example, is thought to represent basic features such as translation and depth (DeAngelis et al., 1998). In contrast, the VIP area contains neurons that respond to several types of visual and extraretinal signals, including tactile stimulation of the face, vestibular stimulation, optic flow, and targets moving in either retinocentric or head-centered coordinates (Schaafsma and Duysens, 1996; Colby and Goldberg, 1999). If the nature of the stimulus analysis required by a perceptual task determines the particular level of cortical representation used, then a simple perceptual task that depended primarily on early representations in visual cortex may demonstrate little behavioral effect of spatial attention, whereas more complex perceptual tasks may produce much larger behavioral effects of attention.
This work was supported by National Institutes of Health Grant R01 EY05911. J.H.R.M. is an Investigator with the Howard Hughes Medical Institute. We thank J. Assad, C. Boudreau, J. DiCarlo, G. Ghose, and T. Yang for helpful discussion on all aspects of this project. We also thank D. Murray and T. Williford for technical assistance.
Correspondence should be addressed to Erik P. Cook, Division of Neuroscience, S-603, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030.