Abstract
To guide behavior, perceptual and motor systems must estimate properties of the sensory environment from the responses of populations of cortical neurons. In the domain of visual motion, estimates of target speed are derived from the responses of motion-sensitive neurons in the middle temporal (MT) area of the extrastriate visual cortex and are used to drive smooth pursuit eye movements and perceptual judgments of speed. We have asked how these behavioral systems estimate target speed from the population response in area MT. We found that increasing the spatial frequency of a sine wave grating caused decreases in the target speed estimated by both pursuit and perception and commensurate changes in the identity of the active neurons in area MT. Decreasing the contrast of a sine wave grating caused decreases in the target speed estimated by both pursuit and perception, while altering only the response amplitude of MT neurons and not the identity of the active neurons. Applying a modified vector-averaging computation to the population response measured in area MT allowed us to predict the effects of both spatial frequency and contrast on speed estimation for both perception and pursuit. The modification biased the speed estimation toward low target speeds when responses across the population of neurons were small.
Introduction
Sensory information about a given stimulus is distributed among many neurons in the cerebral cortex. Yet, both perceptual and motor systems are able to extract information from the population codes to estimate the original sensory stimulus and form veridical judgments or program accurate actions. Visual motion has provided an excellent sensory modality for analysis of population coding and decoding, because so much is known about the population code for motion in the middle temporal (MT) area of the extrastriate visual cortex. Here, we have asked how target speed is reconstructed for a motor action and a perceptual judgment. As a motor behavior, we have used smooth pursuit eye movements. As a perceptual behavior, we have used judgments of the relative speed of two stimuli. Both rely on accurate estimates of target motion from the population response for visual motion in area MT (Newsome et al., 1985; Logothetis and Schall, 1989; Britten et al., 1992; Britten et al., 1996; Groh et al., 1997; Churchland and Lisberger, 2001).
Neurons in area MT are tuned for the direction and speed of a moving visual stimulus (Dubner and Zeki, 1971; Maunsell and Van Essen, 1983). They fire maximally for stimuli of the optimal direction and speed and less well, or not at all, for other stimuli. For any given moving stimulus, many MT neurons will be active, and the most active MT neurons will be those with preferred speed and direction matching the stimulus. Therefore, target speed could be reconstructed by any computation that reports the preferred speed of the most active neurons. Possible computations include “winner-take-all,” “vector-summation,” or “vector-averaging” (Robinson, 1972; Salinas and Abbott, 1994; Pouget et al., 1998; Groh, 2001). A recent theoretical paper has shown that a vector-averaging computation may account for a broad range of perceptual phenomena if the computation has a strong bias to estimate low speeds when responses across the active population are small or noisy (Weiss et al., 2002).
A more direct method to identify decoding computations is to search for computations that will convert population responses recorded from actual sensory neurons into behavioral responses measured for the same set of stimuli (Churchland and Lisberger, 2001). Here, we exploit our recent recordings of the responses of a large population of MT neurons to the motion of sine wave gratings of different contrast, spatial frequency, and temporal frequency (Priebe et al., 2003). When converted into population codes, our MT recordings make two predictions about how pursuit and perception should be affected if the decoding computation reports the preferred speeds of the most active neurons. Because changes in the spatial frequency of the stimulus altered the distribution of active neurons in area MT, variation of spatial frequency should affect the estimates of speed. Because reductions in the contrast of the stimulus reduce the response amplitude throughout the MT population without changing the distribution of the active neurons, stimulus contrast should not affect the estimated target speed. Our data confirmed the former, but not the latter prediction. Computational analysis resolved this discrepancy: target speed can be estimated by implementing a form of vector averaging with a bias toward low speeds when the responses of neurons in the active population are small or noisy.
Materials and Methods
Our study is based on three separate data sets collected on three different groups of subjects using the same set of stimuli. The stimuli consisted of moving sine wave gratings in which we varied the contrast, spatial frequency, and temporal frequency. The gratings were windowed in different ways for each set of experiments, in a manner chosen to be ideal for the behavior under examination. The first data set was obtained from monkey subjects in experiments on pursuit eye movements, to determine the effect of varying the parameters of the stimulus grating on the smooth eye acceleration in the first 100 msec of pursuit. The second data set was derived from human observers in a two-alternative forced choice design, to determine the perceived speed of different moving test gratings relative to a single, invariant standard grating. The third data set was taken from our previous publication of the responses of a large sample of MT neurons to moving sine wave gratings of different contrast, spatial frequency, and temporal frequency (Priebe et al., 2003). We do not report any new features of the MT recordings, but we transpose the data in a way that is suitable to document the population response in area MT.
Recordings of smooth pursuit eye movements. Smooth pursuit eye movements were recorded from three male rhesus monkeys (Macaca mulatta) that had been trained to pursue spot targets using techniques described previously (Lisberger and Westbrook, 1985). All methods for experiments on behaving monkeys followed a protocol that had been approved in advance by the Institutional Animal Care and Use Committee at the University of California at San Francisco (UCSF).
Briefly, eye movements were measured with the scleral search coil method (Judge et al., 1980), using eye coils that had been implanted with a sterile procedure while the animal was anesthetized with isoflurane. In a separate surgery, orthopedic stainless steel strips were secured to the skull with 8 mm screws and attached with dental acrylic to a cylindrical receptacle that could be used for head restraint. During experiments, the monkey's head was restrained by attaching a post to both the receptacle implanted on his skull and the ceiling of a specially designed primate chair in which he sat. Eye velocity was obtained by an analog circuit that differentiated the eye position outputs from the search coil electronics for frequencies up to 25 Hz and filtered higher frequencies with a roll-off of 20 dB/decade.
Experiments were run daily and lasted ∼2 hr. Targets were presented in individual trials that began with the appearance of a fixation spot. The monkey was required to fixate the target within 600 msec after it appeared and to maintain fixation within a 2° window around target position for an additional 200-800 msec. The fixation spot then was extinguished and replaced with a tracking target that could be either a spot or a Gabor function that consisted of a vertically oriented sine wave grating windowed by a circular Gaussian function. The Gaussian had an SD of 2.5°. The tracking target appeared 4° eccentric and immediately began to move toward the position of fixation (Rashbass, 1961). The duration of target motion varied from 500 to 1200 msec, depending on the speed of the target. Faster targets neared the edge of the monitor sooner and were extinguished earlier. Monkeys received a juice reward if they maintained eye position within a window around target position for the duration of the trial.
Each pursuit experiment consisted of multiple repetitions of a list of up to 36 types of trials, in which each trial type presented motion of a grating with a specific set of parameters. The trials were sequenced by shuffling the list and requiring the monkey to complete each trial successfully once. If the monkey failed a trial, it was placed at the end of the list and presented again after all the other trials had been completed. After all trials had been completed once, the list was shuffled and presented again.
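The sequencing rule described above can be sketched in a few lines of Python. This is only an illustration of the stated procedure (shuffle the list, present each trial, move failed trials to the end), not the laboratory's actual control software; the function name `run_block`, the `attempt` callback, and the toy trial labels are all hypothetical.

```python
import random
from collections import deque


def run_block(trial_types, attempt, rng):
    """Present every trial type until each has been completed successfully once.

    trial_types: list of trial descriptions (hypothetical labels here).
    attempt: callable(trial) -> bool, True when the trial was completed.
    rng: random.Random instance used to shuffle the list.
    Returns the full presentation order, including repeated failed trials.
    """
    order = list(trial_types)
    rng.shuffle(order)              # fresh shuffle for each pass through the list
    queue = deque(order)
    presented = []
    while queue:
        trial = queue.popleft()
        presented.append(trial)
        if not attempt(trial):
            queue.append(trial)     # a failed trial moves to the end of the list
    return presented


# Toy example: trial "b" fails on its first presentation only.
already_failed = set()


def toy_attempt(trial):
    if trial == "b" and trial not in already_failed:
        already_failed.add(trial)
        return False
    return True


presented = run_block(["a", "b", "c"], toy_attempt, random.Random(0))
```

Calling `run_block` repeatedly with the same list reproduces the interleaved design: every stimulus is completed once per pass, and a failed stimulus is retried only after the remaining stimuli in that pass.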
Pursuit responses were analyzed by aligning eye velocity responses to multiple repetitions of the same stimulus on the onset of target motion and computing the average eye velocity as a function of time, in 1 msec bins. We then estimated the time of the initiation of pursuit from the averages and defined an analysis interval that started at the initiation of pursuit and had a duration equal to one “open-loop interval.” The duration of the open-loop interval was estimated as the latency of the eye velocity response to a change in target velocity during sustained pursuit of a spot target. For each stimulus target, we measured the change in average eye velocity during the analysis interval and divided by the duration of the open-loop interval to compute average eye acceleration. We chose the onset of the analysis interval by marking the onset of pursuit separately in the averages for each target form and motion, but we used the same duration analysis interval for all stimuli in each monkey. Thus, our data took account of variation in the latency of pursuit as a function of the parameters of target motion and of differences in the duration of the open-loop interval between monkeys. Trials with saccades during the analysis interval after pursuit initiation were excluded from all analyses.
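The measurement of average eye acceleration can be illustrated with a minimal Python sketch. It assumes eye velocity traces sampled at 1 msec intervals and aligned on the onset of target motion; the function names, the synthetic ramp traces, and the 100 msec values used for pursuit onset and the open-loop interval are hypothetical and serve only to show the arithmetic.

```python
def mean_trace(trials):
    """Average eye velocity across repetitions, sample by sample (1 msec bins)."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]


def initial_eye_acceleration(avg_velocity, onset_ms, open_loop_ms):
    """Change in average eye velocity over the analysis interval / its duration.

    avg_velocity: deg/sec, one sample per msec, aligned on target motion onset.
    onset_ms: estimated time of pursuit initiation (start of the interval).
    open_loop_ms: duration of the open-loop interval.
    Returns average eye acceleration in deg/sec^2.
    """
    dv = avg_velocity[onset_ms + open_loop_ms] - avg_velocity[onset_ms]
    return dv / (open_loop_ms / 1000.0)


def synthetic_ramp(slope_deg_per_s2, latency_ms=100, duration_ms=200):
    """Hypothetical trace: zero velocity until latency, then a linear ramp."""
    ramp = [slope_deg_per_s2 * k / 1000.0 for k in range(duration_ms)]
    return [0.0] * latency_ms + ramp


# Two synthetic repetitions with accelerations of 60 and 100 deg/sec^2.
trials = [synthetic_ramp(60.0), synthetic_ramp(100.0)]
average = mean_trace(trials)
acceleration = initial_eye_acceleration(average, onset_ms=100, open_loop_ms=100)
```

Because the onset is marked separately for each stimulus while the interval duration is fixed per monkey, only `onset_ms` would change across stimuli in this sketch.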
Psychophysical judgments of stimulus speed. Six scientists from the Keck Center for Integrative Neuroscience at UCSF, one female and five males, were subjects in the experiments on human speed perception. Subject 6, an author, was not naive to the experiment but did express some skepticism about the project. Two other subjects (2 and 3) had previous experience as subjects in psychophysical studies but were naive to the study being performed. All subjects were healthy and had normal vision. The experiments generally lasted between 1 and 2 hr, and frequent breaks were taken during each experiment. Subjects gave informed consent at the beginning of the experiment, and experimental procedures were approved in advance by the Committee on Human Research at UCSF.
The experiment was conducted as a series of discrete trials, in which each trial presented a pair of gratings. Subjects were asked to fixate a spot at straight-ahead gaze throughout the trial. Fixation was not measured, but subjects reported that they had no difficulty maintaining fixation. After 800-1200 msec of fixation, two gratings appeared in separate windows: the windows were 5° squares and were centered 4° above and below the fixation point. The gratings moved within the windows for 1000 msec, after which all stimuli disappeared and subjects were given an additional 1500 msec to indicate which of the gratings was moving faster by pressing a button. If there was no response or both buttons were pressed, then the trial was discarded. The two gratings were always of a vertical orientation and moved to the right behind stationary windows.
One grating was always a standard that had a spatial frequency of 0.5 cycle/degree, a contrast of 32%, and a speed of 8°/sec. The spatial frequency of the other test grating was varied systematically among 0.25, 0.5, and 1 cycle/degree. Its speed was varied to be within a one- or two-octave bandwidth above or below the base speed of 8°/sec. The contrast of the test grating was the same as the standard, except in one block of trials for each subject, when the contrast of the test grating was lowered to 8% in 40% of the trials. The location of the test and standard gratings in windows above and below the fixation target was alternated randomly from trial to trial. The horizontal position and spatial phase of the two gratings were randomized in steps of 0.25° and one-quarter of a cycle, respectively, to prevent judgment of speed based on the phase of the sine wave gratings. We again used an interleaved design: repeatedly, the full list of grating pairs was shuffled, and then each trial was presented once. Subjects were given an initial 5 min training session to familiarize them with the sequence of stimuli, the appearance of the stimuli, and the buttons used to report whether the grating in the upper or lower window appeared to move faster. Subjects did not receive feedback about their speed discriminations.
After each experiment, we sorted the data according to the spatial frequency, contrast, and speed of the test grating and plotted the percentage of times the test grating was reported to be faster than the standard grating as a function of the actual speed of the test grating. Each psychometric function was then fitted with a standard sigmoid function:
$$P = \frac{1}{1 + e^{-a(s - b)}} \tag{1}$$
where P is the probability of a “faster” response, a is a metric of the steepness of the curve, s is the speed of the test stimulus, and b is the speed of the test stimulus that would make the choice of the test and the standard grating equally likely. The parameter b was used to estimate the test speed that was perceived to be equivalent to the standard speed of 8°/sec.
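To make the role of the parameters a and b concrete, the following Python sketch fits a sigmoid to a psychometric function and reads off b as the point of subjective equality. It assumes that the "standard sigmoid" of Equation 1 is a logistic function of test speed and uses a simple least-squares grid search; the fitting procedure actually used is not specified in the text, so both choices are assumptions. The data are synthetic, and all names are hypothetical.

```python
import math


def sigmoid(s, a, b):
    """Assumed logistic form of Eq. 1: probability of a 'faster' report at speed s."""
    return 1.0 / (1.0 + math.exp(-a * (s - b)))


def fit_sigmoid(speeds, p_faster, a_grid, b_grid):
    """Least-squares grid search for the steepness a and the midpoint b."""
    best = None
    for a in a_grid:
        for b in b_grid:
            err = sum((sigmoid(s, a, b) - p) ** 2
                      for s, p in zip(speeds, p_faster))
            if best is None or err < best[0]:
                best = (err, a, b)
    return best[1], best[2]


# Synthetic psychometric data generated from a known curve (a = 0.8, b = 9.2).
speeds = [4.0, 5.7, 8.0, 11.3, 16.0]          # test speeds in deg/sec
p_faster = [sigmoid(s, 0.8, 9.2) for s in speeds]

a_grid = [0.1 * k for k in range(1, 21)]       # candidate steepness values
b_grid = [4.0 + 0.1 * k for k in range(121)]   # candidate midpoints, 4 to 16 deg/sec
a_hat, b_hat = fit_sigmoid(speeds, p_faster, a_grid, b_grid)
```

In this toy case `b_hat` recovers the midpoint used to generate the data: the test speed at which "faster" and "slower" reports are equally likely, and therefore the physical test speed judged equal to the 8°/sec standard.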
Recordings from area MT. Single-unit recordings were made in area MT of nine anesthetized, paralyzed macaque monkeys (Macaca fascicularis) using methods that were reported in a previous study (Priebe et al., 2003). All methods had received prior approval by, and followed the regulations of, the Institutional Animal Care and Use Committee at UCSF. Because the data have been published previously and are used simply for computational analysis here, we refer the reader to our prior publication (Priebe et al., 2003).
Presentation of visual stimuli. All visual stimuli were generated by a video frame buffer [VSG (visual stimulus generators) 2/3; Cambridge Research, Kent, UK] and were presented on Barco (Poperinge, Belgium) video displays that had non-interlaced refresh rates of 100 Hz. The display used in the pursuit experiments was a reverse projection video screen (Retrographics 808S). The reverse projection monitor was placed 135 cm from the monkey's eyes and measured 128.3 cm horizontally and 94.8 cm vertically. The video board had a spatial resolution of 1024 × 768 pixels so that there were at least 20 pixels per visual degree. A conventional Barco video monitor (Reference Calibrator CCID 121) was used for the assessment of human perceptual judgments and for the recordings from area MT. The spatial resolution of the conventional monitor was also 1024 × 768 pixels, and the screen measured 33.6 cm horizontally and 25.2 cm vertically. For presentation of gratings for human perceptual judgments, the monitor was 50 cm from the subject so that there were at least 18 pixels per visual degree. In all experiments, the video monitor had a mean luminance of 68 candelas/m².
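The pixel densities quoted above can be checked with a short calculation. The sketch below assumes a flat screen viewed head-on from its center and assumes that the 1024 horizontal pixels span the full stated screen width; the function name is hypothetical.

```python
import math


def pixels_per_degree(n_pixels, width_cm, distance_cm):
    """Average pixels per degree of visual angle across the full screen width."""
    half_angle = math.atan((width_cm / 2.0) / distance_cm)
    full_angle_deg = 2.0 * math.degrees(half_angle)
    return n_pixels / full_angle_deg


# Reverse projection screen used for pursuit (values from the text).
projection_ppd = pixels_per_degree(1024, 128.3, 135.0)

# Conventional monitor used for perception, viewed from 50 cm.
monitor_ppd = pixels_per_degree(1024, 33.6, 50.0)
```

Under these assumptions the projection screen comes out at slightly more than 20 pixels per degree, and the conventional monitor comfortably exceeds the stated lower bound of 18 pixels per degree.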
Results
The effect of spatial frequency on smooth pursuit
Three monkeys pursued moving Gabor functions that consisted of vertical sine wave gratings windowed by a Gaussian function. We chose a Gabor function as a pursuit stimulus because it creates a discrete target that we could train monkeys to track, while allowing us to change the spatial frequency and speed of the underlying sine wave without modifying the spatial extent of the target. In contrast to later experiments on human perception, the Gaussian window moved across the screen rigidly with the underlying grating, creating a stimulus that physically moved rather than a sine wave grating that moved behind a stationary window. The configuration of the trials is diagrammed in Figure 1, A and B, and described in Materials and Methods.
All of the monkeys were able to pursue moving Gabor functions of a wide range of spatial frequencies, and the speed and spatial frequency of the Gabor affected the amplitude of the initial eye acceleration. For example, Figure 1, A and B, shows representative examples of the target and eye position traces for tracking targets with spatial frequencies of 0.25 and 1 cycle/degree and the same speed of 8°/sec. During sustained tracking, eye velocity reached and then fluctuated around target velocity in similar ways for both targets. During the initiation of pursuit, however, the rate of change of eye velocity was clearly higher when the spatial frequency was 0.25 cycle/degree compared with 1 cycle/degree. To quantify the effect of spatial frequency on the initiation of pursuit, we averaged the eye velocity across multiple repetitions of the same stimulus and measured the eye acceleration during the open-loop interval of pursuit (Fig. 1C, vertical dashed lines) (see Materials and Methods for definition). By restricting our attention to this early component of the response, we were able to probe the pursuit system's estimate of target speed uncontaminated by effects of feedback, which arrives at the end of the open-loop interval, one visual latency after the onset of the pursuit response.
The strength of the initial pursuit response was positively related to target velocity for each spatial frequency but inversely related to spatial frequency for each target velocity. The bar graph in Figure 1D summarizes the initial eye accelerations for the experiment that produced the traces in Figure 1C. At each spatial frequency, the response to target motion at 16°/sec (gray bars) was greater than that for target motion at 8°/sec (black bars). For each of the two target velocities, the initial pursuit response declined as spatial frequency increased from 0.25 to 0.5 to 1 cycle/degree. Approximately the same initial pursuit responses were obtained for target motion at 8°/sec with a spatial frequency of 0.25 cycle/degree and for target motion of 16°/sec with a spatial frequency of 1 cycle/degree, indicating that the pursuit system estimated similar target speeds for these two stimuli.
The effect of spatial frequency was consistent across experimental day and monkey (Fig. 1E,F). To show all of the monkeys and days of experiments together, we have normalized the initial eye acceleration to that elicited by motion of the target with a spatial frequency of 0.5 cycle/degree at each speed. For each monkey and for target motion at 8 and 16°/sec, the graphs indicate that the magnitude of the initial pursuit response decreased as a function of spatial frequency: increasing spatial frequency by a factor of 4 is equivalent to reducing target speed by approximately a factor of 2.
The effect of stimulus contrast on smooth pursuit
Reduction of the contrast of the sine wave grating in the Gabor target caused decreases in the eye acceleration of pursuit but preserved the effect of spatial frequency on the initiation of pursuit. As illustrated in Figure 2A, the trajectory of eye velocity was lower when the contrast of the stimulus was low (gray traces) than when it was high (black traces), for stimuli of otherwise identical target speed (8°/sec) and spatial frequency. As shown in Figure 2, B and C, quantitative analysis revealed that the initial eye acceleration of pursuit was smaller for the low-contrast targets than for the high-contrast targets for each monkey.
To summarize the effects of contrast and spatial frequency, we computed the ratio of the initial eye acceleration for targets of 0.25 cycle/degree divided by that for targets of 1 cycle/degree, separately for each monkey, target speed, and contrast (Table 1). On average, the effect of spatial frequency on the initial eye acceleration was somewhat larger for the high-contrast than the low-contrast targets, averaging almost a twofold increase in pursuit response for a fourfold decrease in spatial frequency for the high-contrast targets versus a 1.5-fold increase in pursuit response for the low-contrast targets. This was true for both target speeds in monkey O but for only one target speed in the other two monkeys.
The effects of spatial frequency and contrast on perception
The perception of speed was altered by the spatial frequency and contrast of the test grating in qualitative agreement with the effects of the same parameters on the initiation of pursuit. Perception was tested in a two-alternative forced choice design that is diagrammed in Figure 3A and described in detail in Materials and Methods. Human subjects reported which of two gratings moved faster. One of the gratings (the “standard” grating) always had a spatial frequency of 0.5 cycle/degree, a contrast of 32%, and a speed of 8°/sec, whereas the other grating (the “test” grating) took one of several spatial frequencies (0.25, 0.5, and 1 cycle/degree) and contrasts (8 and 32%) and a wide range of speeds.
To analyze the effect of spatial frequency on perceived speed, data were grouped according to the spatial frequency and speed of the test stimulus. For each group, the percentage of times the test stimulus was reported to be faster than the standard was plotted as a function of the speed of the test stimulus, as illustrated in Figure 3B. Psychometric curves were fitted to the data separately for each spatial frequency. When the test stimulus had the same spatial frequency as the standard (Fig. 3B, open triangles), the psychometric function conformed with the physical speeds of the stimuli. The test was perceived as faster than the standard ∼50% of the time when the test speed was 8°/sec, as expected given that the test and standard were physically the same. As the speed of the test stimulus was increased or decreased, the percentage of reports that the test was faster increased or decreased, as expected given the physical speeds of the two stimuli.
Changing the spatial frequency of the test stimulus caused the psychometric functions for perceived speed to shift. When its spatial frequency was 1 cycle/degree (Fig. 3B, filled circles), the test stimulus of 8°/sec was seen as faster than the standard stimulus only 20% of the time in subject 6, although the two stimuli were physically the same speed. The psychometric function was shifted to the right, and the physical speed of the 1 cycle/degree stimulus had to exceed 9.2°/sec before it was seen as equal to that of the 0.5 cycle/degree stimulus moving at 8°/sec. When the spatial frequency was 0.25 cycle/degree (Fig. 3B, filled squares), the psychometric function shifted in the opposite direction. The test stimulus was seen as faster than the standard stimulus almost 85% of the time when the two speeds were physically identical, and the physical speed of the 0.25 cycle/degree stimulus had to be <6.8°/sec before it was seen as equal to that of the 0.5 cycle/degree stimulus moving at 8°/sec.
For each spatial frequency, we fitted data like those in Figure 3B with sigmoid functions (Eq. 1) and used the value of speed that gave a value of 50% on each fitted sigmoid to estimate the test grating speed that was perceived as equal to that of the 0.5 cycle/degree, 8°/sec standard. A graph of the speed that was judged equal to the standard speed as a function of spatial frequency (Fig. 3C) revealed that five of our six subjects showed the same effect. The physical speed deemed to be equal to the 8°/sec standard was a consistent function of spatial frequency: the perceived speed of the test stimulus decreased as its spatial frequency was increased.
Reduction of the contrast of the test grating caused decreases in the perceived speed of a stimulus. For the subject illustrated in Figure 4A, a low contrast test stimulus had to move faster than the 8°/sec, high-contrast standard stimulus to be judged of equal speed. For all three spatial frequencies, the psychometric functions for stimuli of 8% contrast (symbols and gray curves) plotted to the right of those for stimuli of 32% contrast (black curves). Two of the three graphs in Figure 4A, and 9 of the other 15 graphs (three spatial frequencies by five subjects), also showed a decrease in the slope of the psychometric functions for low-contrast stimuli, possibly indicating that reduced contrast caused a decrease in the precision of speed determination. We have not explored the latter effect further.
Figure 4B summarizes the effect of spatial frequency on the speed of the 8% contrast test stimulus that was judged equal to that of the 32% contrast, 8°/sec standard stimulus. For all six subjects, increases in the spatial frequency of the low-contrast stimulus increased the physical speed required for it to be seen as equal to the speed of the standard. Even subject 1, who did not show any effect of spatial frequency on the perception of speed for high-contrast stimuli, now displayed a consistent effect. Figure 4B also shows that the speed of the test stimulus judged equal to the 32% contrast standard was higher for low-contrast test stimuli (symbols connected by lines) than for the original 32% contrast stimuli (solid black line). Thus, decreasing contrast or increasing spatial frequency causes the speed of a grating to be judged as lower, by both pursuit and perception.
Our results are in general agreement with previous studies of the effect of both spatial frequency (Smith and Edgar, 1990) and contrast (Stone, 1992; Thompson et al., 1996; Thompson and Stone, 1997; Blakemore and Snowden, 1999; Brooks, 2001; Hurlimann et al., 2002) on the perceived speed of grating motion. However, our results were obtained with stimulus speeds of 8 and 16°/sec, faster than the speeds used in previous studies: it would have been plausible to assume that the neural mechanisms of speed perception could be quite different for these widely different ranges of speeds. Furthermore, it was essential that we use the same experimental setup with the same parameters of target motion for physiology, perception, and pursuit for our goal of analyzing the relationship between speed estimation for behavior and the population of responses recorded in area MT.
Effect of spatial frequency and target speed on responses of MT neurons
Our goal was to link (1) our prior findings of the responses of MT neurons to gratings of different spatial frequencies, speeds, and contrasts (Priebe et al., 2003), and (2) our measurements of the estimates of speed derived by motor and perceptual behaviors for the same stimuli. To move toward this goal, we next introduce the scheme used to understand the responses of MT neurons and briefly review the basic finding of our previous recordings.
When the stimulus is a sine wave grating, it is customary to represent it in a space that plots stimuli and responses in terms of their spatial frequency in cycles/degree (sf) and temporal frequency in cycles/second (tf) (Tolhurst and Movshon, 1975), where:
$$\text{speed} = \frac{tf}{sf} \tag{2}$$
Figure 5A summarizes, in sf/tf space, the responses of one of the MT neurons from our previous report. The diameter of each filled circle indicates the magnitude of neural response to a moving grating that was composed of a given combination of spatial and temporal frequency. Because each response is plotted at a location defined by the logarithms of its spatial and temporal frequencies, gratings moving at the same speed fall along the same diagonal line. To see most clearly that the preferred speed depends on spatial frequency, we have replotted the data in Figure 5B so that there is a separate speed-tuning curve for each spatial frequency that evoked a clear response. As the spatial frequency was varied from 0.25 to 4 cycles/degree, the preferred speed of the neuron changed from 32 to 2°/sec. When the same neuron was tested with moving dot textures, it had a preferred speed of 8°/sec. Across the sample of MT neurons, the dependence of preferred speed on spatial frequency ranged from strong, as in Figure 5, to none.
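A short Python sketch illustrates why gratings of equal speed fall on one diagonal in logarithmic sf/tf space, using the relationship that speed equals temporal frequency divided by spatial frequency. It also works through the example neuron: the two endpoint values quoted above (32°/sec at 0.25 cycle/degree and 2°/sec at 4 cycles/degree) imply the same preferred temporal frequency. The function names are hypothetical.

```python
import math


def speed(tf, sf):
    """Speed (deg/sec) = temporal frequency (cycles/sec) / spatial frequency (cycles/deg)."""
    return tf / sf


def temporal_frequency(speed_deg_per_s, sf):
    """Temporal frequency (cycles/sec) of a grating of given speed and spatial frequency."""
    return speed_deg_per_s * sf


# Gratings moving at 8 deg/sec: log2(tf) - log2(sf) is constant (= log2 of speed),
# so all three points lie on one diagonal of slope 1 in log-log coordinates.
offsets = []
for sf in (0.25, 0.5, 1.0):
    tf = temporal_frequency(8.0, sf)
    offsets.append(math.log2(tf) - math.log2(sf))

# Example neuron: preferred speed of 32 deg/sec at 0.25 cycle/deg and
# 2 deg/sec at 4 cycles/deg both imply a preferred tf of 8 cycles/sec.
implied_tf = [temporal_frequency(32.0, 0.25), temporal_frequency(2.0, 4.0)]
```

A neuron whose preferred temporal frequency stays fixed while spatial frequency changes is tuned to temporal frequency rather than to speed, which is the strong form of the dependence shown in Figure 5.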
Reconstructing speed from the population response in area MT
Our recordings documented the responses of individual MT neurons to sine wave gratings of a wide range of spatial and temporal frequencies. But motion-related behavior is driven by estimates of target speed derived from the population of MT responses, not from the responses of any individual neuron. Therefore, it is necessary to transform the responses of individual neurons into a population code that shows the responses of many neurons to each individual stimulus. To see how we have represented the population code in area MT for moving sine wave gratings, consider the graph in Figure 6A, which shows the population response for a grating of 32% contrast and 0.25 cycle/degree moving at 8°/sec. Here, each symbol shows the response of one neuron and is plotted at a location that represents the preferred temporal and spatial frequencies of the neuron. In this format, the stimulus is plotted at the site indicated by the large ×, at a spatial frequency of 0.25 cycle/degree and a temporal frequency of 2 cycles/sec. Scrutiny of the population responses for gratings of 32% contrast and 0.25 cycle/degree (Fig. 6A) or 1.0 cycle/degree (Fig. 6B) reveals that the population of active MT neurons shifts impressively as a function of spatial frequency. There is greater activity on the left side of the graph in Figure 6A and on the right side of the graph in Figure 6B. For these spatial frequencies, the most active neurons do not cluster around the location of the stimulus in spatial-temporal space, although they did when the spatial frequency was 0.5 cycle/degree (data not shown).
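For each spatial frequency, we determined the center-of-mass of the population response for spatial and temporal frequency (SF′ and TF′, respectively) using vector-averaging computations:
$$SF' = \frac{\sum_i PSF_i \, R_i}{\sum_i R_i} \tag{3}$$
$$TF' = \frac{\sum_i PTF_i \, R_i}{\sum_i R_i} \tag{4}$$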
where PSFi and PTFi represent the preference of each neuron for spatial and temporal frequency, and Ri is the response of neuron i. The center-of-mass is the result of computing the population vector average. In each graph, the center-of-mass resulting from the population response (+) fell very close to the stimulus in spatial-temporal space. However, there is a consistent shift in the location of the center-of-mass along the spatial frequency axis as a function of the spatial frequency of the stimulus. The left-right shift results from a change in which neurons are active according to the match between the spatial frequency of the stimulus and the preferred spatial frequency of the neurons. The absence of a similar up-down shift for changes in temporal frequency reflects the broad temporal frequency tuning found in MT neurons. Because of the relationship between speed and spatial frequency (Eq. 2), an increase in the spatial frequency of the center-of-mass of the population response, without a change in temporal frequency of the center-of-mass, yields a center-of-mass at a lower speed. Therefore, the center-of-mass of the population response (+) shifts in the same direction as the effects of spatial frequency on the speed reconstructed by pursuit and perception: increases in spatial frequency cause a decrease in the estimated target speed. The shifts in the location of the center-of-mass are small, but as we will see later, they are of approximately the same magnitude as those for the perception of speed.
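The center-of-mass computation can be sketched in Python as a response-weighted average of each neuron's preferred spatial and temporal frequency. This assumes the plain (non-logarithmic) weighted mean implied by the definitions of PSFi, PTFi, and Ri above. The four-neuron population and its responses are invented purely for illustration and are not taken from the recordings.

```python
def center_of_mass(pref_sf, pref_tf, responses):
    """Response-weighted mean of preferred spatial and temporal frequency."""
    total = sum(responses)
    sf_center = sum(p * r for p, r in zip(pref_sf, responses)) / total
    tf_center = sum(p * r for p, r in zip(pref_tf, responses)) / total
    return sf_center, tf_center


# Hypothetical population: four neurons differing only in preferred sf.
pref_sf = [0.25, 0.5, 1.0, 2.0]      # cycles/deg
pref_tf = [4.0, 4.0, 4.0, 4.0]       # cycles/sec

# Invented responses to a low- and a high-spatial-frequency grating.
resp_low_sf = [1.0, 0.8, 0.4, 0.1]
resp_high_sf = [0.4, 0.8, 1.0, 0.3]

sf_low, tf_low = center_of_mass(pref_sf, pref_tf, resp_low_sf)
sf_high, tf_high = center_of_mass(pref_sf, pref_tf, resp_high_sf)

# Speed at the center-of-mass follows from speed = tf / sf.
speed_low = tf_low / sf_low
speed_high = tf_high / sf_high
```

In this toy population the center-of-mass moves rightward along the spatial frequency axis for the higher-frequency grating while its temporal frequency is unchanged, so the speed at the center-of-mass is lower, mirroring the direction of the effect described in the text.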
For stimuli of 8% contrast (Fig. 6C,D), the population code shows the same effect of spatial frequency as it did for stimuli of 32% contrast, with the difference that most MT neurons showed smaller responses, indicated by smaller diameter symbols. As before, the active population of neurons moves from left to right as spatial frequency is increased, and the center-of-mass of the population shows the same relationship to spatial frequency as it did for high-contrast gratings. Comparison of the neural responses for high- and low-contrast gratings of the same spatial frequency does not reveal an effect of contrast on the center-of-mass of the population response.
The same population responses are replotted in the tilted insets of Figure 6, A and B, now in more conventional coordinates that show the average activation of neurons as a function of their preferred speed. The smooth curves in each graph were obtained by fitting the entire collection of points with a Gaussian function. Estimated target speed (S′) was computed from the population responses by using a vector-averaging computation to determine the center-of-mass of the population response:

$$S' = \frac{\sum_i \mathrm{PS}_i \, R_i}{\sum_i R_i} \qquad (5)$$

where PSi and Ri are the preferred speed and responses of neuron i. We estimated PSi as the preferred temporal frequency divided by the preferred spatial frequency for the neuron, and we normalized Ri relative to the largest response of the neuron to any moving grating of 32% contrast.
As indicated by the arrows in the insets to Figure 6, changes in spatial frequency caused the same direction of shift in the estimates of speed extracted from the population of responses as in our behavioral observations: the higher spatial frequency (thin arrow) produced a lower estimate of speed. However, because stimuli of 8% contrast reduced the amplitude of the population response without changing which neurons responded, reconstructions of speed using Equation 5 did not depend on contrast. In terms of Equation 5, estimated speed is independent of contrast because the weighted sum of the responses of all the neurons, in the numerator, is normalized by the total response from the population, in the denominator. This prediction of Equation 5 disagrees with our observation that reduction of contrast caused lower estimates of target speed by both pursuit and perception.
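The contrast invariance of Equation 5 follows directly from its form as described above (a response-weighted sum of preferred speeds divided by the total response): a common scale factor on every response cancels between numerator and denominator. The sketch below illustrates this with a hypothetical five-neuron population; the preferred speeds, response values, and the 0.4 scale factor are assumptions made for the example, not recorded data.

```python
import numpy as np

def vector_average_speed(preferred_speeds, responses):
    """Center-of-mass estimate: response-weighted mean of preferred speeds."""
    ps = np.asarray(preferred_speeds, dtype=float)
    r = np.asarray(responses, dtype=float)
    return float(np.sum(ps * r) / np.sum(r))

# Hypothetical population (illustrative values only).
preferred_speeds = np.array([2.0, 4.0, 8.0, 16.0, 32.0])  # deg/s
high_contrast = np.array([0.1, 0.5, 1.0, 0.5, 0.1])       # normalized responses

# Model lower contrast as a uniform reduction in response amplitude,
# with no change in which neurons are active.
low_contrast = 0.4 * high_contrast

s_high = vector_average_speed(preferred_speeds, high_contrast)
s_low = vector_average_speed(preferred_speeds, low_contrast)
print(f"high contrast: {s_high:.3f} deg/s, low contrast: {s_low:.3f} deg/s")
```

The two estimates are identical, so a pure amplitude change cannot by itself lower the speed reconstructed by an unmodified vector average.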
To allow the reconstruction computation to account for our data showing that both spatial frequency and contrast affect the speed estimated by pursuit and perception, we elaborated slightly the vector-averaging computation used to estimate target speed (S′):

$$S' = g(\epsilon)\,\frac{\sum_i \mathrm{PS}_i \, R_i}{\epsilon + \sum_i R_i} \qquad (6)$$

where ϵ is a constant term and g(ϵ) is a gain factor that was adjusted so that the reconstructed speed was always 8°/sec when the spatial frequency was 0.5 cycle/degree and contrast was 32%. As the amplitude of the response of the population of MT neurons decreases, ϵ becomes dominant and biases S′ to low values of speed; when the response of the population of MT neurons is large, ϵ has little effect on the estimate of speed because ∑iRi dominates the denominator. We express ϵ as a percentage of the maximal activation of the total sample population.
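The behavior of the modified computation can be sketched numerically. The code below assumes the form described in the text: a gain factor multiplying the response-weighted sum of preferred speeds, divided by ϵ plus the total response. The population values are hypothetical, the function names are introduced only for this example, and ϵ is interpreted here as a fraction of the total response to the reference stimulus, which is one reading of "a percentage of the maximal activation of the total sample population."

```python
import numpy as np

def modified_vector_average(preferred_speeds, responses, epsilon, gain=1.0):
    """Vector average with a constant epsilon added to the denominator."""
    ps = np.asarray(preferred_speeds, dtype=float)
    r = np.asarray(responses, dtype=float)
    return float(gain * np.sum(ps * r) / (epsilon + np.sum(r)))

def calibrate_gain(preferred_speeds, reference_responses, epsilon,
                   reference_speed=8.0):
    """Choose the gain so the reference stimulus reconstructs to 8 deg/s."""
    raw = modified_vector_average(preferred_speeds, reference_responses,
                                  epsilon, gain=1.0)
    return reference_speed / raw

# Hypothetical population (illustrative values only).
preferred_speeds = np.array([2.0, 4.0, 8.0, 16.0, 32.0])
reference = np.array([0.1, 0.5, 1.0, 0.5, 0.1])  # stand-in for 32% contrast
low_contrast = 0.4 * reference                   # same neurons, smaller responses

# Assumption: epsilon is 7% of the total response to the reference stimulus.
epsilon = 0.07 * reference.sum()
gain = calibrate_gain(preferred_speeds, reference, epsilon)

s_ref = modified_vector_average(preferred_speeds, reference, epsilon, gain)
s_low = modified_vector_average(preferred_speeds, low_contrast, epsilon, gain)
print(f"reference: {s_ref:.2f} deg/s, low contrast: {s_low:.2f} deg/s")
```

By construction the reference stimulus reconstructs to 8°/sec, whereas the scaled-down population yields a lower estimate because ϵ makes up a larger share of the denominator.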
Because ϵ is a free parameter, we evaluated the effect of the value of ϵ on the estimates of target speed (Fig. 7A-D). When ϵ is zero, Equation 6 estimates similar values of speed for low- and high-contrast stimuli. As ϵ is increased, there are two effects on the estimates of speed. First, the speed estimated for low-contrast stimuli declines relative to that for high-contrast stimuli. Second, the effect of spatial frequency on reconstructed speed is gradually reduced, leading to curves with shallower negative slopes. These trends are summarized in Figure 7, E and F. In Figure 7E, the two curves summarize predictions for gratings of high contrast (bold curve) and low contrast (fine curve) and plot the ratio: estimated speed for spatial frequencies of 0.25 cycle/degree divided by that for 1 cycle/degree, as a function of ϵ. In Figure 7F, the three curves summarize predictions for targets of low (0.25 cycle/degree; boldest curve), medium (0.5 cycle/degree; middle curve), and high (1 cycle/degree; finest curve) spatial frequency, and plot the ratio: estimated speed for high-contrast stimuli divided by that for low-contrast stimuli, as a function of ϵ.
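The first of these trends, the growth of the high-contrast to low-contrast ratio with ϵ, can be sketched with a toy population. This is an illustration under stated assumptions, not a reproduction of Figure 7F: the population values are hypothetical, low contrast is modeled as a uniform 0.4 scaling of the responses, and ϵ is taken as a percentage of the total high-contrast response. The gain factor is omitted because it cancels in the ratio. The sketch does not address the second trend, which depends on how the total response varies with spatial frequency in the recorded sample.

```python
import numpy as np

def modified_estimate(preferred_speeds, responses, epsilon):
    """Equation 6 form without the gain factor, which cancels in ratios."""
    ps = np.asarray(preferred_speeds, dtype=float)
    r = np.asarray(responses, dtype=float)
    return float(np.sum(ps * r) / (epsilon + np.sum(r)))

# Hypothetical population (illustrative values only).
preferred_speeds = np.array([2.0, 4.0, 8.0, 16.0, 32.0])
high_contrast = np.array([0.1, 0.5, 1.0, 0.5, 0.1])
low_contrast = 0.4 * high_contrast
total_high = high_contrast.sum()

epsilon_percents = [0, 7, 25, 50]
ratios = []
for pct in epsilon_percents:
    eps = pct / 100.0 * total_high
    ratio = (modified_estimate(preferred_speeds, high_contrast, eps)
             / modified_estimate(preferred_speeds, low_contrast, eps))
    ratios.append(ratio)
    print(f"epsilon = {pct:2d}% -> high/low contrast speed ratio = {ratio:.3f}")
```

At ϵ = 0 the ratio is exactly 1, and it rises monotonically as ϵ increases, matching the qualitative shape described for the curves in Figure 7F.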
Our data on the effects of spatial frequency and contrast on the perception of speed fit well on Figure 7, E and F. In each graph, we have summarized the results of the perceptual experiments by drawing bold arrows that start on the y-axis at values of the ratios used to summarize our results in Table 2. In those data, the ratio of speed perceived at low versus high spatial frequency was similar for the two contrasts and was ∼1.3 (Table 2, rightmost column). An arrow drawn from 1.33 on the y-axis of Figure 7E comes into the range between the two curves derived from Equation 6 when ϵ is 7%. In Table 2, the ratio of speed perceived at high versus low contrast averaged ∼1.2 for all three spatial frequencies (bottom row). Again, an arrow drawn from 1.2 on the y-axis of Figure 7F comes into the range among the three curves derived from Equation 6 when ϵ is ∼7%. Thus, our observations on the effects of spatial frequency and contrast on the perception of speed can be accounted for by the model implied by Equation 6, with ϵ of ∼7%.
In contrast, there is no single value of ϵ that would allow Equation 6 to predict our observations on pursuit eye movements. The effects of spatial frequency and contrast were both larger for pursuit than for perception. Indeed, the ratios of 1.75 for low versus high spatial frequency at a target speed of 8°/sec (Table 1, top two entries in right column) are not predicted by Figure 7E even when ϵ = 0%. The ratios of 1.6 for high versus low contrast (Table 1, row labeled “Ratio 32% divided by 8%” in the 8°/sec section) are predicted by Figure 7F only if ϵ is ∼25%. In Figure 7, E and F, the locations of the pursuit data are indicated by the symbols labeled with “P.”
There are two minor differences in the methods used to study pursuit and neural responses that might have, but did not, account for the discrepancy between the size of the effect of spatial frequency on the amplitude of smooth pursuit and that predicted from the population response. First, for our pursuit experiments, we analyzed only the response to the first 100 msec of stimulus motion, whereas we estimated the response of MT neurons from 1 sec of motion. However, using only the first 100 msec of MT neuron response did not change the speed estimated from our sample population of neurons for any of the three spatial frequencies (paired t test; p > 0.3). Second, the Gabor stimulus used in our pursuit experiments decomposes to a Gaussian of power in spatial frequency space, whereas the sine wave gratings used for our MT recordings were composed of a single spatial frequency. To test whether the spread of frequencies in the Gabor function had an impact on our results, we used a linear model to estimate the responses each neuron in our sample population would have given to the spatial and temporal energy in the Gabor functions (Priebe et al., 2003). The estimates of target speed obtained by applying Equation 6 to the corrected population responses were affected less by the spatial frequency within the Gabor functions. This correction did not change the predicted direction of the effect of spatial frequency on the initiation of pursuit. Instead, it increased the discrepancy between the size of the effect on pursuit and the prediction from our MT recordings.
Discussion
We have shown that both speed perception and smooth pursuit eye movements use estimates of target speed that decrease as a function of increases in spatial frequency or decreases in contrast. The same direction of effect was predicted when we applied the modified vector-averaging computation represented by Equation 6 to the population codes recorded in area MT for stimuli of the same range of spatial frequency and contrast. For perception, there was excellent quantitative agreement with the speeds estimated from the MT population responses. For pursuit, the effects of spatial frequency on initial eye acceleration were in the direction predicted by the population response but were larger than predicted by the vector-averaging computation.
Estimating target speed from the population response
To estimate speed from the population response in a way that agreed with our behavioral data, we had to deal with the fact that changes in spatial frequency and contrast of the stimulus affect the population response in area MT in completely different ways. Changes in spatial frequency altered which MT neurons were active in a manner that caused different groups of neurons, with different preferred spatial frequencies and speeds, to be the most active. Changes in contrast reduced the response amplitude of all neurons in the population but did not change the identity of active neurons. The computation used to estimate speed needed to have two different elements to take account of both of these features of the data.
The effect of changes in the spatial frequency of the stimulus could be accounted for by a standard vector-averaging computation that summed the activity of every MT neuron weighted by its preferred speed and divided by the un-weighted sum of the activity across the population. In the domain of speed, the vector-averaging computation finds the center-of-mass of a neural population response, which provides a good estimate of the preferred speed of the most active neurons if the population comprises neurons with Gaussian tuning curves and independent sources of noise (Georgopoulos et al., 1986; Seung and Sompolinsky, 1993; Salinas and Abbott, 1994; Lewis and Kristan, 1998; Deneve et al., 1999). To us, the most remarkable aspect of the relationship between the population response and the target speed estimated by perception is that a major change in which MT neurons are responding caused only a small change in perceived speed, and that this change in behavior is predicted quantitatively by the center-of-mass of the population response.
The effect of reducing stimulus contrast on pursuit and speed perception could not be accounted for by the vector-averaging computation alone. Because the identity of the active neurons in area MT is not affected by the contrast of the stimulus, the vector average is not affected either. To account for the effect of contrast on the behaviors, we added one parameter to the vector-averaging computation. The use of ϵ in the denominator of the reconstruction computation has relatively little impact on the estimate of speed when the population response is large but starts to push the estimated speed toward zero as the response amplitude declines. A similar class of solution was suggested by Weiss et al. (2002) to account for a number of perceptual illusions. In real life, it makes sense for the nervous system's estimate of speed to tend toward zero when the population response is small or noisy. Otherwise, an organism might erroneously chase objects that are perceived to be moving when they are actually stationary.
The computation we have adopted to account for our data is analogous to that used by Churchland and Lisberger (2001) to account for a similar set of data obtained using apparent motion targets. When the spatial and temporal separation between target flashes is increased without changing the apparent speed of the target, the initial pursuit response (and the perception of speed) first increases and then decreases. Churchland and Lisberger (2001) used Equation 6 to estimate target speed from the population response in area MT for the same set of apparent motion stimuli, but with two twists. First, their computation was based on an opponent motion signal rather than the MT responses used here. The opponent motion signal was needed to reproduce their behavioral data because most MT neurons responded well to apparent (but not real) motion in the direction opposite the preferred direction of the neuron: the estimation worked only if they took this into account. For our stimuli, MT neurons are silent during motion in their nonpreferred direction (our unpublished observations), so a vector average based on an opponent motion signal would have worked equally well. Second, Churchland and Lisberger (2001) also included ϵ in the denominator of their reconstruction equation, but to achieve noise immunity in the speed estimation rather than to account for a fundamental feature of their data. A related model has been used to describe the effect of microstimulation in the superior colliculus on saccadic eye movements (Sparks et al., 1976; Stanford et al., 1996), and a neurally plausible way of implementing Equation 6 has been tested by Groh (2001).
Why is behavior sensitive to the spatial frequency of the moving stimulus?
Although we had observed that the speed tuning of MT neurons depended on spatial frequency (Priebe et al., 2003), there are two reasons why behavior might have achieved veridical estimates of speed. First, ∼25% of MT neurons show speed tuning that is independent of spatial frequency, and it was possible that the estimation of target speed for behaviors could have been based entirely on these neurons. Then, the initial pursuit response and the perception of speed would have been independent of spatial frequency or contrast. The fact that they are not, and that the illusions reported here are in the right direction to be attributed to the effect of spatial frequency on the population response, argues that the estimation of target speed is based on the responses of the full population of neurons instead of on the select few that are truly speed tuned. Second, veridical estimates of speed could have resulted from estimations of target speed based on measuring the rate of traversal of neural activation across the cortical topographic representation of visual space. Measurements of the rate of traversal would have been insensitive to the preferred speed of the most active neurons. However, at least for pursuit, estimation of target speed by traversal computations has been tested and excluded (Priebe et al., 2001).
Possible differences in estimation of speed for perception and pursuit?
Our data revealed qualitative agreement but quantitative differences in the effects of spatial frequency and contrast on the initiation of pursuit eye movements versus perceptual judgments of speed. One possible reason for the differences is the different species and states used in the three different sets of experiments. MT population responses were obtained in anesthetized, paralyzed fascicularis monkeys (Priebe et al., 2003), pursuit was recorded in awake rhesus monkeys, and perception was evaluated in humans. We cannot dismiss the possibility that the discrepancies in the data originate from the differences in the preparations. However, it would be easier to accept this possibility if the best agreement, between the estimates of target speed from the MT population response and the human perception of speed, did not arise from the most different preparations, anesthetized macaques and awake humans.
The largest discrepancy in the data lies in the effects of spatial frequency and contrast on the initiation of pursuit, which were considerably larger than predicted by the effects of the same stimulus manipulations on the population response in area MT. Several possible explanations exist. First, we used a space-limited stimulus consisting of a Gabor function for pursuit and much larger grating stimuli for recording MT responses and for testing speed perception. Second, estimations of target speed could be based on the entire population of MT neurons for perception, as we have assumed in our analysis, but on a subpopulation of MT neurons for pursuit, selecting neurons with the largest effect of spatial frequency on preferred speed. We cannot exclude either of these possibilities, although they seem unlikely to us.
Finally, target speed estimation may be genuinely different for pursuit and perception. Indeed, there are a number of precedents for assuming different processing streams for the perceptual and motor uses of sensory inputs (Goodale and Milner, 1992; Churchland et al., 2003). For example, pursuit eye velocity seems to result from the action of two separate systems, one of which provides visual-motor drive and one of which has a multiplicative effect that determines the gain of transmission of visual signals to pursuit (Schwartz and Lisberger, 1994; Tanaka and Lisberger, 2001). If the estimated speed were passed through both pathways, then multiplying the signals from the two pathways should cause the effect of spatial frequency on pursuit to be the square of the effect on perception. This might explain why taking the square root of the ratios that define the size of the effects in Table 1 brings the data for pursuit into good agreement with those for perception (Table 2). Clearly, additional experiments would be needed to test such a speculative interpretation.
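The arithmetic behind the square-root comparison can be checked with the summary ratios quoted in the Results: pursuit ratios of 1.75 (low versus high spatial frequency) and 1.6 (high versus low contrast) at a target speed of 8°/sec, and perceptual ratios of ∼1.3 and ∼1.2. The short sketch below uses only those rounded summary values, not the full entries of Tables 1 and 2; the dictionary labels are introduced here for readability.

```python
import math

# Summary ratios quoted in the text (rounded values).
pursuit_ratios = {
    "spatial frequency (low/high)": 1.75,
    "contrast (32%/8%)": 1.6,
}
perception_ratios = {
    "spatial frequency (low/high)": 1.3,
    "contrast (32%/8%)": 1.2,
}

# Square root of each pursuit ratio, for comparison with perception.
roots = {name: math.sqrt(value) for name, value in pursuit_ratios.items()}

for name, root in roots.items():
    print(f"{name}: sqrt(pursuit) = {root:.2f}, "
          f"perception ~ {perception_ratios[name]}")
```

The square roots come to roughly 1.32 and 1.26, close to the perceptual ratios, which is the agreement referred to above.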
Footnotes
This work was supported by the Howard Hughes Medical Institute and by National Institutes of Health Grants R01-EY03878 and T32-EY07120. We are grateful to Scott Ruffner for creating the target presentation software, to Karen MacLeod and Elizabeth Montgomery for assistance with animal preparation and maintenance, and to Mark Churchland and Carlos Cassanello for insightful discussions. We also thank Jessica Hanover for helpful discussions and comments.
Correspondence should be addressed to Dr. Nicholas Priebe, Department of Neurobiology and Physiology, Northwestern University, 2145 North Sheridan Drive, Evanston, IL 60208. E-mail: nico{at}northwestern.edu.
Copyright © 2004 Society for Neuroscience 0270-6474/04/241907-10$15.00/0