Abstract
The visual input for pursuit eye movements is represented in the cerebral cortex as the distributed activity of neurons that are tuned for both the direction and speed of target motion. To probe how the motor system uses this distributed code to compute a command for smooth eye movements, we have recorded the initiation of pursuit for 150 msec presentations of two spots moving at different speeds and/or in different directions. With equal probability, one of the two spots continued to move at the same speed and in the same direction and became the tracking target, whereas the other disappeared and served as a distractor. We measured eye acceleration in the interval from 110 to 206 msec after the onset of spot motion, within both the open-loop interval for pursuit and the interval during which eye motion was affected by the two spots. Our results demonstrate that weighted vector averaging is used to combine the responses to two moving spots. We found only a minute number of responses that were consistent with either vector summation or winner-take-all computations. In addition, our data show that it is difficult for the monkey to defeat vector averaging without extended training on the use of an explicit cue about which spot will become the target. We argue that our experiment reveals the computations done by the pursuit system in the absence of attentional bias and that vector averaging is normally used to read the distributed code of image motion when there is only one target.
- visual motion processing
- eye movements
- smooth pursuit
- sensorimotor transformation
- vector averaging
- winner-take-all
One fundamental principle of organization of the cerebral cortex is that sensory information is represented in distributed maps. In general, each neuron in the map is broadly tuned, so that any one stimulus causes responses in many neurons. How does the brain use such a distributed representation? Because the neural representations of its sensory and motor components are well known, visually guided smooth pursuit eye movements provide an excellent opportunity to analyze how a distributed sensory representation is converted into a command for voluntary movement (Lisberger et al., 1987).
In primates, pursuit is driven by motion of a small target with respect to the eye. The resulting visual input, called “image motion,” is represented as activity distributed across neurons in the middle temporal visual area (MT) (Maunsell and Van Essen, 1983a). Lisberger et al. (1995) have recorded the responses of MT cells during target speeds and accelerations like those experienced during pursuit, so that much is currently known about this distributed representation of image motion. MT receives its visual inputs mainly from the primary visual cortex (V1) (Maunsell and Van Essen, 1983b) and provides visual inputs for pursuit via projections to a series of higher cortical areas (Ungerleider and Desimone, 1986; Tusa and Ungerleider, 1988; Boussaoud et al., 1992b; Tian and Lynch, 1996) that include the medial superior temporal area (MST) in the parietal cortex (Dürsteler and Wurtz, 1988) and the frontal pursuit area (FPA) in the depths of the arcuate sulcus (Lynch, 1987; MacAvoy et al., 1991). MT, MST, and FPA all project to the pontine nuclei (Glickstein et al., 1980; Ungerleider et al., 1984; Leichnitz, 1989; Boussaoud et al., 1992a), which in turn project to the parts of the cerebellum that are involved in pursuit (Brodal, 1979, 1982; Gerrits and Voogd, 1989). The responses of cerebellar neurons during pursuit are known, and at least those recorded in the floccular complex, after minor filtering, provide adequate commands for smooth eye velocity (Shidara et al., 1993; Krauzlis and Lisberger, 1994). Thus, the transformations from visual input to area MT and from the cerebellum to eye movements are understood. The missing link is the sensorimotor transformation between area MT and the cerebellum.
Several computations have been considered as ways in which the brain might transform a distributed neural code like that found in MT into commands for movement. Each computation assumes that every unit in the distributed code is a “labeled” line, so that downstream areas know something about what information each neuron conveys when it is active. The computations include “winner-take-all,” in which the label on the neuron with the largest response determines the output of the map; “vector summation,” in which the activities of all active neurons are summed with weights that are determined by their individual labels; and “vector averaging,” in which the vector sum is normalized according to the number of neurons that are active. Recently, Groh et al. (1997) showed that vector averaging can account for the majority of the effects of microstimulation in area MT on the smooth and saccadic eye movements evoked by moving visual targets.
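The three candidate computations can be made concrete with a minimal sketch. It assumes a hypothetical population of units, each labeled with a preferred direction, and it normalizes the vector average by total activity; that normalization is an assumption made for illustration, and the function name `read_out` is not from the original work.

```python
import math

def read_out(preferred_deg, rates):
    """Decode a population of direction-labeled units three ways.

    preferred_deg: preferred direction (the "label") of each unit, in degrees.
    rates: response of each unit (same order), assumed non-negative.
    Returns (winner_take_all, vector_sum, vector_average) as (x, y) tuples.
    """
    labels = [(math.cos(math.radians(d)), math.sin(math.radians(d)))
              for d in preferred_deg]

    # Winner-take-all: the label of the single most active unit.
    best = max(range(len(rates)), key=lambda k: rates[k])
    winner = labels[best]

    # Vector summation: every label weighted by that unit's activity.
    sum_x = sum(r * lx for r, (lx, _) in zip(rates, labels))
    sum_y = sum(r * ly for r, (_, ly) in zip(rates, labels))

    # Vector averaging: the vector sum normalized by total activity
    # (an assumed normalization chosen for this sketch).
    total = sum(rates)
    average = (sum_x / total, sum_y / total)
    return winner, (sum_x, sum_y), average
```

With two equally active units labeled 0° and 90°, the vector sum points along the diagonal with the full combined magnitude, whereas the vector average points in the same direction with half that magnitude.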
Our goal was to use natural visual stimuli to determine the computation used in the sensorimotor transformation that converts the distributed representation of image motion in MT into commands for smooth pursuit eye motion. Our strategy was to measure the pursuit evoked by the brief presentation of two targets moving in different directions and/or at different speeds. We found that the sensorimotor transformation for pursuit takes a vector average of the visual inputs, at least under conditions in which the monkey does not know which of the two spots will become the target.
MATERIALS AND METHODS
Experiments were run on three rhesus monkeys (Macaca mulatta) that had been overtrained on pursuit of single moving targets. Our basic experimental methods have been presented previously (e.g., Lisberger and Westbrook, 1985). Briefly, monkeys were trained to perform a visual-tracking task in exchange for liquid reinforcement. Eye movements were monitored with the scleral search coil method using eye coils that had been implanted with sterile procedure while the monkey was anesthetized with Isoflurane. During experiments, monkeys sat in a primate chair, and their heads were immobilized. Experiments were conducted daily and lasted 2–3 hr. Methods had been approved in advance by the Institutional Animal Care and Use Committee at the University of California, San Francisco.
Visual stimuli and presentation of targets. Stimuli were presented on a 12 inch diagonal oscilloscope (Hewlett Packard 1304A) that was driven by the digital-to-analog converters from a digital signal-processing board in a Pentium computer. This system allowed us to present multiple targets that were identical in shape and brightness. It provided spatial resolution of 32,000 × 32,000 pixels and a refresh interval of 4 msec. Each visual stimulus was a 0.4 or 1.2° square spot that consisted of 36 individual points plotted at spatial intervals of 80 or 240 pixels and temporal intervals of 2 μsec. The larger spot was used only in a few experiments (see those summarized in Figure 7). The screen was 40 cm from the monkey and subtended 32 × 26° of visual angle. The luminance of the spot was 3.5 cd/m². The background of the screen was uniform and gray, and the room was dimly lit.
Spots were presented in individual trials that consisted of an initial period of 1220–1740 msec during which the animal was required to keep its eyes within 2° of a target at straight-ahead gaze, a 150 msec interval during which two spots moved in different directions and/or at different speeds, a 400–600 msec interval in which the monkey was required to track the continuation of one of the original two-spot motions, and a 600 msec interval in which the monkey was required to fixate the target at its final position. Each spot motion consisted of a step-ramp (Rashbass, 1961) in which the target appeared at an eccentric position and moved toward the position of fixation. In almost all of our experiments (see Fig. 7 for exceptions), either of the two spots could become the final tracking target with equal probabilities. Although the monkey did not receive any information about which spot would become the tracking target, we will use the term “target” to refer to the spot destined to become the tracking target and the term “distractor” to refer to the spot that would disappear after 150 msec of motion. During the 150 msec in which two spots were present, the fixation requirements were suspended. The monkey then was allowed 300 msec to bring its eye position within 3° of the target and was required to maintain tracking with that accuracy for the duration of the trial.
Each day, the experiment consisted of multiple repeats of a list of 16 trials (see Fig. 7), 64 trials (see Figs. 1, 2, 3, 4, 5, 6), or 256 trials (see Figs. 8, 9, 10). For each repeat, the trials were sequenced by shuffling the list and requiring the monkey to complete each trial successfully once. If the monkey failed at one of the trials, it was placed at the end of the list and presented again after the animal had completed the other trials in the list. For the experiments that included 256 trials, we obtained enough trials to provide good estimates of sample mean and variance by combining data collected on several consecutive days.
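The sequencing rule described above (shuffle the list, then re-queue any failed trial at the end until every trial has been completed once) can be sketched as follows. The function name `run_block` and the `attempt` callback are hypothetical stand-ins for the experiment-control software, which is not described in code form in the original.

```python
import random
from collections import deque

def run_block(trials, attempt, rng=random):
    """Present one repeat of the trial list.

    trials: list of trial identifiers.
    attempt: callable returning True if the monkey completed the trial.
    Returns the order in which trials were actually presented.
    """
    queue = deque(rng.sample(trials, len(trials)))  # shuffled copy of the list
    presented = []
    while queue:
        trial = queue.popleft()
        presented.append(trial)
        if not attempt(trial):
            # A failed trial goes to the end of the list and is presented
            # again after the remaining trials have been completed.
            queue.append(trial)
    return presented
```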
Data acquisition and analysis. Experiments were controlled and data were acquired by computer programs running on two computers. A UNIX workstation provided the graphical user interface for the design and control of the experiment. A Pentium computer controlled the experiment, acquired the data, and streamed it over the local area network for storage on the UNIX file system. We obtained eye velocity signals by analog differentiation of the eye position outputs from the search coil electronics (DC, 25 Hz; −20 dB/decade), and we sampled horizontal and vertical eye position and eye velocity at rates of 1000 samples per sec per channel. In each file, we also recorded a series of codes to indicate the exact spot motions we commanded, and we used these codes in the data analysis program to reconstruct horizontal and vertical target position and velocity.
Data were analyzed in two phases. In the first phase, we reviewed the horizontal and vertical eye position and velocity for each trial on a screen. We began by flagging trials for exclusion from subsequent analyses if the monkey made an early saccade (<220 msec after the onset of target motion) or clearly was not attempting to track smoothly. Approximately 2% of trials were excluded based on these criteria, leaving only trials in which the first saccade occurred after the interval we would use for measuring the responses. In the remaining 98% of trials, we replaced each saccadic deflection of eye velocity that occurred in the interval from 220 to 400 msec after the onset of target motion with a straight-line segment that connected the eye velocity at the start and end of the saccade. Because the first phase of analysis excluded trials with early saccades, the remaining saccades were edited only outside the times used for quantitative analysis of the data. Thus, replacing the saccades with straight-line segments did not alter our measurements of the initiation of pursuit and served only to allow clean averages of eye velocity for verifying that the monkey was responding to the tracking target after the distractor had been extinguished.
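The saccade-replacement step can be illustrated with a minimal sketch. It assumes that the start and end samples of each saccade have already been marked (by hand or by a detector that is not shown), and the function name `replace_saccade` is hypothetical.

```python
def replace_saccade(velocity, start, end):
    """Return a copy of an eye velocity trace in which the samples strictly
    between indices start and end are replaced by a straight line that
    connects velocity[start] to velocity[end]."""
    edited = list(velocity)
    span = end - start
    for k in range(start + 1, end):
        fraction = (k - start) / span
        edited[k] = velocity[start] + fraction * (velocity[end] - velocity[start])
    return edited
```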
In the second phase of data analysis, we aligned the eye velocity responses to identical stimuli on the onset of stimulus motion and measured the eye acceleration in intervals from 110 to 158 msec and from 158 to 206 msec after the onset of stimulus motion. We selected these intervals because they primarily precede the moment of the first visual feedback about the initial eye movement of pursuit and thus represent the “open-loop” response of the pursuit system to the visual stimulus that was present before the onset of pursuit. For most of our analyses, we made averages of eye velocity as a function of time from 100 msec before to 400 msec after the onset of target motion, and we computed eye acceleration from the averages. For the analysis of the variability of the responses, however, we measured eye acceleration from individual trials. We began the analysis interval 110 msec after the onset of target motion to ensure that our measurements did not include the very earliest part of pursuit, which might have been affected by minor trial-to-trial variations in the latency of pursuit.
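A minimal sketch of the measurement follows. It assumes one sample per millisecond, traces that begin 100 msec before the onset of target motion, and eye acceleration taken as the change in eye velocity across the interval divided by its duration; the exact estimator is an assumption, and the function names are hypothetical.

```python
def mean_trace(trials):
    """Average eye velocity across trials, sample by sample.
    trials: list of equal-length velocity traces aligned on motion onset."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]

def interval_acceleration(velocity, t_start_ms, t_end_ms, pre_ms=100):
    """Mean eye acceleration (deg/sec^2) between two times (msec after the
    onset of target motion). Assumes 1 sample/msec and a trace that starts
    pre_ms before motion onset."""
    v_start = velocity[pre_ms + t_start_ms]
    v_end = velocity[pre_ms + t_end_ms]
    duration_sec = (t_end_ms - t_start_ms) / 1000.0
    return (v_end - v_start) / duration_sec
```

For example, `interval_acceleration(avg, 110, 158)` and `interval_acceleration(avg, 158, 206)` would give the two measurements described above for an averaged trace `avg`.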
RESULTS
Design of the two-spot experiments
Figure 1, A and B, illustrates the stimuli for the two trials in which the initial spot motions (vectors labeled T and D) were rightward and downward. Each plot shows spot position in the visual field; the position of fixation was at the point where the axes cross, and the different arrows indicate the different phases of spot motion. The short arrows indicate the first 150 msec of motion for the target (solid arrow labeled T) and the distractor (dashed arrow labeled D). The long solid arrow indicates the subsequent trajectory of the spot that the monkey was required to track. Each spot underwent step-ramp target motion (Rashbass, 1961): in Figure 1, the step was 3°, and the ramp took the spot toward the position of fixation at 20°/sec. In Figure 1A, rightward target motion continued for the duration of the trial, and the downward distractor was extinguished after 150 msec. In Figure 1B, downward target motion continued for the duration of the trial, and the rightward distractor was extinguished after 150 msec of motion. Figure 1, C and D, shows individual trials of eye position and velocity for each of these two conditions, for the interval from 300 msec before to 450 msec after the onset of motion of the two spots. In both examples, the simultaneous downward and rightward spot motion evoked both downward and rightward smooth eye velocity. Saccades were withheld until after the downward arrowheads, which show the time when the distractor (dashed position traces labeled D) was extinguished. The saccades then brought the eye (bold position traces labeled E) accurately onto the position of the target (solid position traces labeled T).
For every pair of two-spot motions, similar to the pair illustrated in Figure 1, each spot had a probability of 0.5 of becoming the final tracking target. This experimental design guarantees that two different intervals should be revealed by comparison of the eye velocities evoked by the same initial but different final target motions. There must be an early interval in which the eye velocity depends on the simultaneous motion of the two spots and a later interval in which the eye velocity is driven by the tracking target. These two intervals can be seen in Figure 2A, which superimposes the averages of eye velocity for four trials in which (1) a single target moved downward (light dashed traces labeled Dn), (2) a single target moved rightward (light solid traces labeled Rt), (3) two spots moved downward and rightward but the downward moving spot became the tracking target (bold dashed traces labeled Rt&Dn), and (4) two spots moved downward and rightward, but the rightward moving spot became the tracking target (bold solid traces labeled Rt&Dn).
Comparison of the two pairs of bold traces in Figure 2A shows that the initial response to the motion of two spots did not depend on which spot would ultimately become the target. In Figure 2, these two traces separated ∼70 msec after the distractor was extinguished, or ∼220 msec after the onset of spot motion. For 196 such comparisons made on seven experiments in three monkeys (28 comparisons per experiment), we measured the time of divergence as the moment when the difference between the two traces exceeded the sum of the SEMs. The time of divergence averaged 236 msec after the onset of spot motion (86 msec after the distractor disappeared) and was the same for measurements made from the traces for horizontal and vertical eye velocity. Only 7% of the times of divergence were <206 msec after the onset of spot motion. This validates the use of an interval from 110 to 206 msec after the onset of spot motion to analyze the responses to the combined motion of two spots. Figure 2A also shows that the average responses to the motion of two spots were intermediate between the eye velocities induced by each spot individually.
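The divergence criterion can be sketched as below. The sketch assumes one sample per millisecond and per-sample SEMs computed across trials for each of the two averaged traces; the function name `time_of_divergence` and its arguments are hypothetical.

```python
def time_of_divergence(mean_a, mean_b, sem_a, sem_b, first_ms=0):
    """Return the time (msec) of the first sample at which the difference
    between two averaged traces exceeds the sum of their SEMs, or None if
    the traces never diverge. first_ms is the time of the first sample."""
    samples = zip(mean_a, mean_b, sem_a, sem_b)
    for k, (a, b, sa, sb) in enumerate(samples):
        if abs(a - b) > sa + sb:
            return first_ms + k
    return None
```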
Predictions of different rules for reading a distributed representation of image motion
Figure 2B illustrates how possible outcomes of our experiments would appear in averages of eye velocity. The traces labeled Rt and Dn are the same averages of eye velocity used in Figure 2A that show the average eye velocities evoked by motion of single spots to the right or down. Vector summation of the responses to the motion of the two spots (bold solid traces labeled Sum) predicts that the horizontal component of the response to the simultaneous motion of the two spots should be nearly equal to the eye velocity evoked by rightward motion of a single spot; the vertical component should be nearly equal to the eye velocity evoked by downward motion of a single spot. In contrast, vector averaging (bold dashed traces labeled Avg) predicts that the horizontal component of the response to two spots should be intermediate between the horizontal eye velocities evoked by the rightward or downward motion of one spot; the vertical component should be intermediate between the vertical eye velocities evoked by the rightward or downward motion of one spot. Comparison of A and B in Figure 2 shows that the actual responses to the motion of two spots (Fig. 2A, bold traces) conform more closely to the predictions of vector averaging than to those of vector summation.
The same example is plotted as vectors in Figure 3A. Each arrow shows one possible response for a trial that consisted of the motion of two spots, one rightward and one downward. At one extreme, the pursuit response could reflect a winner-take-all computation with either rightward or downward spot motion winning (WTA right or down). The resulting eye movement would then be identical to that produced by single targets moving rightward or downward, respectively. Indeed, it is plausible to think that the response might reflect a winner-take-all computation on individual trials with the winner varying from trial to trial. After introducing the basic experimental paradigm, we will evaluate this possibility from the data. At the other extreme, the pursuit response could reflect a vector-averaging (Fig. 3A, Average) or vector summation (Fig. 3A, Sum) computation. For a given pair of spot motions, these two computations predict eye acceleration in the same direction but with different magnitudes. When there are two spots, vector averaging predicts half as large an eye acceleration as does vector summation.
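The relation between the two predictions can be written out for this example. The notation below is introduced here for illustration only: the single-spot responses are idealized as purely rightward and purely downward vectors of equal magnitude a, which the measured responses only approximate.

$$\mathbf{E}_{\mathrm{Sum}} = \mathbf{E}_{\mathrm{Rt}} + \mathbf{E}_{\mathrm{Dn}}, \qquad \mathbf{E}_{\mathrm{Avg}} = \tfrac{1}{2}\left(\mathbf{E}_{\mathrm{Rt}} + \mathbf{E}_{\mathrm{Dn}}\right) = \tfrac{1}{2}\,\mathbf{E}_{\mathrm{Sum}}$$

$$\mathbf{E}_{\mathrm{Rt}} = (a, 0),\quad \mathbf{E}_{\mathrm{Dn}} = (0, -a) \;\Rightarrow\; \mathbf{E}_{\mathrm{Sum}} = (a, -a),\ \lVert\mathbf{E}_{\mathrm{Sum}}\rVert = \sqrt{2}\,a; \qquad \mathbf{E}_{\mathrm{Avg}} = \left(\tfrac{a}{2}, -\tfrac{a}{2}\right),\ \lVert\mathbf{E}_{\mathrm{Avg}}\rVert = \tfrac{a}{\sqrt{2}}$$

Both predicted vectors point 45° down and to the right; only their magnitudes differ, by a factor of two.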
We now extend this vector representation to the full experiment that is diagrammed in Figure 3B. With the monkey fixating at straight-ahead gaze (+), two spots appeared and moved along two of the eight trajectories shown by the arrows. Thus, the experiment had an eight × eight design and consisted of 64 trials presented in random order. Each spot started 3° eccentric and moved at 20°/sec along an axis toward the position of fixation, providing step-ramp motion (Rashbass, 1961). When the two spots moved along the same trajectory, the result was a single target that was twice as bright as each spot individually. In separate experiments, we have verified that doubling the intensity of a bright target has no effect on the latency or the eye acceleration at the initiation of pursuit (S. G. Lisberger, unpublished observations). The eye acceleration for single spots moving in eight different directions is summarized in Figure 3D (filled triangles) by plotting average vertical eye acceleration on the y-axis and average horizontal eye acceleration on the x-axis. As we have shown previously (Lisberger and Pavelko, 1989), the direction of eye acceleration was nearly equal to the direction of target motion, and the magnitude of the responses did not depend strongly on the direction of target motion. For this plot, eye acceleration was measured in the interval from 158 to 206 msec after the onset of target motion, but we obtained similar results for the interval from 110 to 158 msec after the onset of target motion.
To analyze our results, we sorted the trials to consider together all cases in which one direction of target motion was presented separately or had been paired with the other seven directions of distractor motion. This divided our experiment into eight groups of eight trials, one group for each of the eight different directions of target motion. Figure 3D shows how one of these groups might appear on a plot of vertical versus horizontal eye acceleration for the eight trials in which target motion was to the right. It demonstrates that very different curves are predicted by vector averaging (solid curve labeled Average) versus vector summation (dashed curve labeled Sum) of the responses to the motion of single targets in each direction. The key difference is that vector summation predicts faster eye accelerations for the motion of two spots than for the motion of a single target.
Winner-take-all and equally weighted vector averaging represent two points along a continuum of possible outcomes represented in the computation:
$$\mathbf{E}_{i,j} = w_i\,\mathbf{E}_i + (1 - w_i)\,\mathbf{E}_j \qquad (1)$$
where Ei,j represents the eye acceleration vector for the motion of two stimuli in directions i and j, Ei represents the eye acceleration vector for the motion of one target in direction i, and wi represents a weighting with a value between 0 and 1 that defines the strength of target motion in direction i when competed with a distractor moving in any other direction (Ej). If wi has a value of 1.0, then this equation reduces to winner-take-all for direction i. If wi has a value of 0.5, then this equation describes equally weighted vector averaging. Figure 3E shows the predicted outcomes of the weighted vector-averaging computation when the target moves to the right and wi ranges from 0.1 to 0.9. Depending on the value of wi, the computation can vary from equally weighted vector averaging (wi = 0.5) to pure winner-take-all for either the target (wi = 1) or the distractor (wi = 0). A small ellipse represents a large weight for the target, and a large ellipse represents heavy weighting of the distractors and a small weight for the target.
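The family of predictions can be sketched in a few lines, assuming that Equation 1 takes the form Ei,j = wi·Ei + (1 − wi)·Ej, which is the form implied by the limiting cases described above. The function names and the idea of passing single-spot responses as a list of (horizontal, vertical) pairs are illustrative assumptions.

```python
def weighted_average(e_i, e_j, w_i):
    """Assumed form of Equation 1 for 2-D eye acceleration vectors:
    E_ij = w_i * E_i + (1 - w_i) * E_j."""
    return (w_i * e_i[0] + (1.0 - w_i) * e_j[0],
            w_i * e_i[1] + (1.0 - w_i) * e_j[1])

def prediction_curve(singles, i, w_i):
    """Predicted two-spot responses when the target moves in direction i and
    the distractor moves in each of the other directions.
    singles: list of single-spot response vectors, one per direction."""
    return [weighted_average(singles[i], singles[j], w_i)
            for j in range(len(singles)) if j != i]
```

Setting `w_i` to 1.0 collapses the curve onto the single-target response (winner-take-all for the target), 0.0 reproduces the distractor responses, and 0.5 gives equally weighted vector averaging.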
Weighted vector averaging for spots moving in different directions
The graphs in Figure 3, C and F, show the results of selected experiments in which target motion in one direction was paired with distractor motion in each of the other seven directions we used. Each set of connected points shows the average eye accelerations for a set of trials that had the same direction of target motion. As before, the graphs were created by plotting averages of horizontal and vertical eye acceleration on the x-axis and y-axis, respectively. In this graph and all subsequent analyses in this paper, we present measurements made in the interval from 158 to 206 msec after the onset of target motion. In each experiment, we obtained very similar results for eye acceleration in the interval from 110 to 158 msec after the onset of target motion.
In Figure 3C, the filled triangles plot the responses for trials in which the target moved to the right and the distractors moved in each of the other seven directions. The open triangles plot the responses for trials in which the target moved to the left and the distractors in each of the other seven directions. In this example, the direction and magnitude of eye acceleration were clearly affected by the direction of motion of the distractor. The weighting of the distractors was greater when the tracking target moved to the right than when it moved to the left. Thus, when the target and distractor moved to the right and the left, in exact opposition (circled points), the net eye acceleration was to the left. In Figure 3, C and F, each dashed curve shows the best fit of Equation 1 to the data, and the numbers in the relevant quadrants give the values of wi. Clearly, there was some diversity in the computations that combined the responses to two targets. Some cases gave nearly perfect vector averaging (e.g., Fig. 3F), whereas others provided examples that were weighted toward winner-take-all for either the target (e.g., open triangles in Fig. 3C) or the distractor (e.g., filled triangles in Fig. 3C).
Each of our experiments included all 64 possible combinations of the eight directions of stimulus motion we used, and each, thus, provided eight sets of connected points similar to those shown in the graphs of Figure 3, C and F. For each of these eight sets of points, we fitted Equation 1 to find the eight values of wi that provided the best fit to the data from two-spot trials. The fitting procedure minimized the mean error for the seven trials that used two spots moving in different directions. The error for each point was computed as the square root of the sum of the squares of the errors in horizontal and vertical eye acceleration. The histogram in Figure 4 plots the distribution of 56 values of wi obtained from seven experiments on three monkeys. The data demonstrate that the pursuit system used computations that can deviate from equally weighted vector averaging but never all the way to winner-take-all for either the target or distractor. There were many cases of equally weighted vector averaging, with values of wi near 0.5, but there were also values of wi as low as 0.3 and as high as 0.7. In most cases, an individual experiment yielded evidence of equally weighted vector averaging for some directions with unequally weighted vector averaging leaning toward winner-take-all for the target or the distractors in other directions.
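One way to carry out such a fit is sketched below. The error measure follows the description above (mean Euclidean error over the seven two-spot trials), and Equation 1 is assumed to have the form Ei,j = wi·Ei + (1 − wi)·Ej. The search method is not specified in the text, so the sketch substitutes a simple grid search over wi; the function names and the grid resolution are assumptions.

```python
import math

def mean_error(w, e_target, e_distractors, observed):
    """Mean Euclidean distance between the observed two-spot responses and the
    predictions w * E_target + (1 - w) * E_distractor."""
    total = 0.0
    for e_j, obs in zip(e_distractors, observed):
        pred_x = w * e_target[0] + (1.0 - w) * e_j[0]
        pred_y = w * e_target[1] + (1.0 - w) * e_j[1]
        total += math.hypot(obs[0] - pred_x, obs[1] - pred_y)
    return total / len(observed)

def fit_weight(e_target, e_distractors, observed, steps=1000):
    """Grid search (an assumed method) for the w in [0, 1] that minimizes the
    mean error for one direction of target motion."""
    candidates = [k / steps for k in range(steps + 1)]
    return min(candidates,
               key=lambda w: mean_error(w, e_target, e_distractors, observed))
```

Calling `fit_weight` once for each of the eight directions of target motion would yield the eight values of wi for one experiment.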
We also fitted the data with the model defined by:
$$\mathbf{E}_{i,j} = \frac{w_i\,\mathbf{E}_i + w_j\,\mathbf{E}_j}{w_i + w_j} \qquad (2)$$
where Ei,j represents the eye acceleration vector for the motion of two targets in directions i and j, Ei represents the eye acceleration vector for the motion of one target in direction i, and wi is a value between 0 and 1 that defines the weight of stimulus motion in direction i when competed with a second stimulus in any other direction. To guarantee a unique solution, we added the additional constraint that Σwi = 4 (mean, 0.5). For the case of equally weighted vector averaging, all values of wi should be equal to 0.5. We used a gradient descent optimization algorithm to fit Equation 2 to the data and to obtain a single set of eight values of wi for the 56 combinations for each experiment of two spots moving in different directions. In every case, the fit obtained by Equation 2 yielded slightly lower values of error (average improvement, 11%; range, 5–23%) than did the fits to Equation 1. The functions relating wi to the direction of spot motion were similar, however.
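A minimal sketch of such a fit follows. It assumes that Equation 2 has the normalized form Ei,j = (wi·Ei + wj·Ej)/(wi + wj), which is consistent with the need for the constraint Σwi = 4. The squared-error loss, the finite-difference gradient, the step size, and the rescaling step used to enforce the constraint are all illustrative choices and not details taken from the text; the default step size presumes responses scaled to roughly unit magnitude.

```python
def predict(w_i, w_j, e_i, e_j):
    """Assumed form of Equation 2: normalized weighted average of two
    single-spot responses."""
    s = w_i + w_j
    return ((w_i * e_i[0] + w_j * e_j[0]) / s,
            (w_i * e_i[1] + w_j * e_j[1]) / s)

def loss(w, singles, pairs):
    """Mean squared error over all two-spot trials.
    singles[i]: single-spot response for direction i.
    pairs: dict mapping (i, j) to the observed two-spot response."""
    total = 0.0
    for (i, j), (obs_x, obs_y) in pairs.items():
        pred_x, pred_y = predict(w[i], w[j], singles[i], singles[j])
        total += (obs_x - pred_x) ** 2 + (obs_y - pred_y) ** 2
    return total / len(pairs)

def fit_weights(singles, pairs, rate=0.5, iters=500, h=1e-5):
    """Gradient descent using a central-difference gradient. After each step
    the weights are kept positive and rescaled so that they sum to n/2
    (4 for eight directions); rescaling does not change the predictions."""
    n = len(singles)
    w = [0.5] * n
    for _ in range(iters):
        grad = []
        for k in range(n):
            up = w[:k] + [w[k] + h] + w[k + 1:]
            down = w[:k] + [w[k] - h] + w[k + 1:]
            grad.append((loss(up, singles, pairs) - loss(down, singles, pairs)) / (2 * h))
        w = [max(1e-6, wk - rate * g) for wk, g in zip(w, grad)]
        scale = (n / 2) / sum(w)
        w = [wk * scale for wk in w]
    return w
```

Because the model depends only on the ratios of the weights, the sum constraint fixes the overall scale and can be imposed by rescaling without changing the quality of the fit.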
It is not surprising that the fits to Equations 1 and 2 were similar because the models are very similar. However, the two models are not identical when the experiment includes more than two directions of spot motion. We regard Equation 1 as a descriptive model. It allowed us to derive a single number that reports where each combination of one direction of target motion and seven directions of distractor motion fell on the continuum from winner-take-all to equally weighted vector averaging. However, the descriptive model has the shortfall that a given direction of distractor motion can have different weights, depending on the direction of motion of the target in a given trial. In contrast, Equation 2 provides a mechanistic model in which each direction of spot motion has a single weight. Each weight indicates how strongly that direction of motion affected pursuit, without regard for the direction of motion of the other spot. The mechanistic model maps well onto a pursuit system in which each direction of spot motion has a unique weight, so that the response to a given pair of spot motions can be computed simply as the weighted average of the responses to the two spot motions separately.
Vector averaging occurred consistently in individual responses to the motion of two spots
The analysis in the preceding section shows that the average eye movement was consistent with the predictions of vector averaging. However, the same averages could arise if the pursuit system used a winner-take-all computation that selected the target on some trials and the distractor on others. We therefore also determined how pursuit responded to the motion of two spots on individual trials. In the analysis of individual trials, an alternating winner-take-all strategy would have produced a bimodal distribution with separate peaks near winner-take-all for the target and the distractor. Yet, the average of the alternating winner-take-all strategy could have yielded averaged results that fit the expectations of a vector-averaging computation (e.g., Figs. 3, 4).
We analyzed eye acceleration for each trial according to the scheme diagrammed in Figure 5B. For this contrived example, the target moved to the right and, when provided as a single target, evoked an average eye acceleration vector that was rightward with a small downward component (arrow labeled T). The distractor moved upward and, when provided as a single target, evoked an average eye acceleration vector that was upward with a small leftward component (arrow labeled D). The eye acceleration from one trial when the upward and rightward spots were presented at the same time is shown as the vector labeled R. For this combination of motion of two spots, the predictions of the different possible computations are T, winner-take-all for the target; D, winner-take-all for the distractor; VS, vector summation; and VA, vector averaging. To analyze each trial, we computed T′ and D′, which are the projections of R onto the axes defined by the vectors for the average responses to the target (T) and distractor (D) as single spots. This yielded weights for the target (wT) and distractor (wD) in each individual trial, where T′ = wT·T and D′ = wD·D. Among the possible outcomes of this analysis are winner-take-all for target, wT = 1 and wD = 0; winner-take-all for distractor, wT = 0 and wD = 1; vector summation, wT = wD = 1; and vector averaging, wT = wD = 0.5.
Figure 5, A, C, and D, plots the weight of the distractor versus the weight of the target for a subset of the data from one daily experiment on each of the three monkeys. In these graphs, each point shows data from an individual trial, the large filled circles show the predictions of winner-take-all for the target (T) and distractor (D) and of vector summation (VS), the two dashed lines cross at the prediction of equally weighted vector averaging, and the solid line from T to D shows the continuum of possible unequally weighted vector-averaging computations. We have made it possible to discriminate the individual points by plotting data from every nth trial, where n was selected so that we would have ∼300 points on each graph. We have excluded trials in which the target and distractor moved in opposite directions because the values of wT and wD are not unique under this condition. The data are clustered near the prediction of vector averaging with very few examples that would be consistent with vector summation or winner-take-all for either of the spots. In addition, there is no evidence of the bimodal distribution that would have emerged if the alternating winner-take-all strategy had been used.
We summarized the analysis of individual trials by analyzing the values of wT and wD along two dimensions. The first dimension asked whether the data were more consistent with an averaging or a summation computation by plotting distributions of the sum of wT and wD. Vector averaging predicts that this sum should equal 1, whereas vector summation predicts that the sum should equal 2. The seven histograms on the left of Figure 6 show that the distributions of wT + wD were consistent with vector averaging (Fig. 6A, arrow labeled VA) for all seven experiments we ran. The distributions all peaked at values <1, and only a minute fraction of the trials yielded weights consistent with vector summation (arrow labeled VS; wT + wD = 2). In the seven histograms, from top to bottom, the mean values of wT + wD were 0.82, 0.82, 0.82, 1.17, 0.91, 0.85, and 0.67. The second dimension asked whether the responses were more consistent with equal weighting of the target and distractor or with winner-take-all for one or the other by plotting distributions of the difference between wT and wD. Equal weighting predicts that wT − wD should equal 0, whereas winner-take-all predicts that wT − wD should be either −1 or +1. The seven histograms on the right of Figure 6 show that the distributions of wT − wD are unimodal and centered near zero. Only very few trials showed values of the weights consistent with winner-take-all for either the target (Fig. 6B, arrow labeled T) or the distractor (arrow labeled D). We conclude that the pursuit system is performing vector averaging with approximately equal weightings of the target and distractor with two caveats. First, there is a considerable distribution in the relative weightings of the target and distractor. Second, the total weight of the target and distractor was usually <1, indicating that the responses to the motion of two spots were somewhat smaller than predicted even by vector averaging.
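A minimal sketch of the single-trial decomposition follows. It assumes that the weights are the coordinates of R in the (generally oblique) basis formed by T and D, so that R = wT·T + wD·D; this reading reproduces the listed outcomes (for example, wT = wD = 1 for vector summation) but is an interpretation of the projection described above. The function name `decompose` is hypothetical.

```python
def decompose(r, t, d):
    """Solve r = w_t * t + w_d * d for (w_t, w_d) with Cramer's rule.
    r: eye acceleration vector on one two-spot trial.
    t, d: average single-spot responses to the target and the distractor."""
    det = t[0] * d[1] - t[1] * d[0]
    if abs(det) < 1e-12:
        # T and D lie along the same axis (e.g., opposite directions),
        # so the weights are not unique.
        raise ValueError("T and D are collinear; weights are not unique")
    w_t = (r[0] * d[1] - r[1] * d[0]) / det
    w_d = (t[0] * r[1] - t[1] * r[0]) / det
    return w_t, w_d
```

The error raised for collinear T and D corresponds to the case of target and distractor moving in opposite directions, for which a unique pair of weights does not exist.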
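The two summary dimensions can be computed directly from the per-trial weights, as in the sketch below. The helper `nearest_prediction`, which labels a trial by the closest idealized outcome, is a hypothetical convenience added for illustration and is not part of the analysis described above.

```python
import math

def summarize_weights(trials):
    """trials: list of (w_target, w_distractor) pairs, one per trial.
    Returns (mean of wT + wD, mean of wT - wD)."""
    n = len(trials)
    mean_sum = sum(wt + wd for wt, wd in trials) / n
    mean_diff = sum(wt - wd for wt, wd in trials) / n
    return mean_sum, mean_diff

def nearest_prediction(w_t, w_d):
    """Label one trial by the closest idealized outcome in (wT, wD) space."""
    predictions = {
        "vector averaging": (0.5, 0.5),             # wT + wD = 1, wT - wD = 0
        "vector summation": (1.0, 1.0),             # wT + wD = 2
        "winner-take-all (target)": (1.0, 0.0),     # wT - wD = +1
        "winner-take-all (distractor)": (0.0, 1.0)  # wT - wD = -1
    }
    return min(predictions,
               key=lambda name: math.hypot(w_t - predictions[name][0],
                                           w_d - predictions[name][1]))
```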
In the experiments reported here, we have examined only the special case in which two targets move in different directions toward the position of fixation. Previous experiments on pursuit have suggested that centripetal target motion may be privileged in the sense that it causes more vigorous initiation of pursuit than does target motion in other directions (Lisberger and Westbrook, 1985). However, other experiments to be reported elsewhere show that vector averaging is the computation used to guide presaccadic pursuit, even if one of the spots is not moving toward the position of fixation. In contrast, the pursuit system can come much closer to winner-take-all behavior in the immediate wake of a saccade to either the target or the distractor (Lisberger, unpublished observations).
Vector averaging was difficult to defeat
In the experiments described above, the target and distractor had identical appearances, and no previous information was given to allow the monkey to decide which spot would become the target. In a separate set of experiments, we relaxed both of these facets of the experimental design. The target was always a big spot (1.2°) that moved horizontally, and the distractor was always a smaller spot (0.4°) and could move in any of the eight directions used previously. Because this experiment had 16 different trials (2 × 8) rather than the 64 trials (8 × 8) used in preceding sections, the monkey repeated the sequence 100–150 times in a daily session and had ample opportunity to become familiar with the structure of the task.
Figure 7 shows that the additional information afforded by this design did not allow the monkeys to defeat vector averaging. The two graphs show data for two monkeys, and each set of eight connected points shows the responses when target motion to the right (filled triangles) or left (open triangles) was paired with distractor motion in each of the eight directions. For each direction of target motion, the average eye acceleration depended strongly on the direction of distractor motion. To evaluate these graphs, it is worthwhile to consider the direction and magnitude of the eye accelerations separately. Each of the connected sets of points in Figure 7 suggests that the monkey was able to use previous knowledge about the axis of target motion to acquire some control over the direction of eye acceleration. Thus, each set of points is elongated along the axis of target motion. On each monkey, we ran a separate experiment that occupied a complete day and analyzed target motion along each of four axes of target motion (horizontal, vertical, 45° oblique left, and 45° oblique right). In each experiment, we observed a small but incomplete elongation of the points along the axis of motion of the target. Unfortunately, it is not possible to compute the predictions of vector averaging for this experiment because there were no trials that presented single targets moving along axes other than the axis of target motion.
Analysis of the magnitude of eye acceleration in Figure 7 failed to provide any evidence that previous knowledge about the form of the target or the axis of motion allowed the monkey to overcome vector averaging. Thus, each point reveals eye acceleration with an amplitude that depends on the relative directions of target and distractor motion. In addition, the points for target and distractor motion in opposite directions (circled) plot close together, showing that the pursuit system was not able to distinguish the target from the distractor based on previous information about the size of the target.
Weighted vector averaging for stimuli moving at different speeds
We now describe the results of experiments in which spots moved in eight different directions at speeds of either 20°/sec or 5°/sec. As shown in Figure 8B, targets started 3 and 0.75° eccentric for motion at 20°/sec and 5°/sec, respectively, and moved toward the position of fixation. In these experiments, the target and distractor were again the same size, and the two spots were equally likely to become the target. Thus, the monkey could not know which spot would be the target until 150 msec after the onset of motion, when the distractor disappeared.
Figure 8, A, C, and D, illustrates the predictions of the vector-averaging and vector summation algorithms for conditions in which one spot moved at 5°/sec and one at 20°/sec. The connected triangles in Figure 8, A and C, plot the eye accelerations in the interval from 158 to 206 msec after the onset of motion for single targets that moved in eight different directions at speeds of 5°/sec (open triangles) or 20°/sec (filled triangles). Figure 8A also compares the predictions of vector summation and vector averaging when the tracking target moved to the right at 20°/sec and the distractor moved in eight different directions at 5°/sec. Vector summation (dashed curve) predicts that the responses to two spots should be centered on the response for a single target moving to the right at 20°/sec and that the connected points should form a curve with the same shape and size as that for the motion of a single target at 5°/sec. Vector averaging (continuous curve without points) predicts that the area inside the connected points should be smaller than that for the motion of a single target at 5°/sec and that the responses should be centered at half of the response amplitude for the single target moving to the right at 20°/sec. Figure 8C shows a similar set of predictions when the tracking target moves to the right at 5°/sec and the distractor moves in one of eight directions at 20°/sec. Again, the predictions of the vector-averaging and vector summation hypotheses are very different.
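The geometry of the two predictions can be checked with a small sketch. It idealizes the single-target responses as equal in magnitude for all eight directions, which the measured responses need not be, and uses the mean single-target accelerations reported later in the Results (101.5 and 38.3°/sec²) only as illustrative magnitudes. All function names are hypothetical.

```python
import math


def single_target_responses(magnitude, n_directions=8):
    """Idealized single-target eye accelerations (x, y), one per direction."""
    return [
        (magnitude * math.cos(2 * math.pi * k / n_directions),
         magnitude * math.sin(2 * math.pi * k / n_directions))
        for k in range(n_directions)
    ]


def vector_summation(e_target, e_distractor):
    """Predicted two-spot response if the single-spot responses add."""
    return (e_target[0] + e_distractor[0], e_target[1] + e_distractor[1])


def vector_averaging(e_target, e_distractor):
    """Predicted two-spot response if the single-spot responses are averaged."""
    return ((e_target[0] + e_distractor[0]) / 2.0,
            (e_target[1] + e_distractor[1]) / 2.0)


def centroid(points):
    """Mean position of a set of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


fast = single_target_responses(101.5)  # idealized responses to 20 deg/sec
slow = single_target_responses(38.3)   # idealized responses to 5 deg/sec
target = fast[0]                       # rightward target at 20 deg/sec

summed = [vector_summation(target, d) for d in slow]
averaged = [vector_averaging(target, d) for d in slow]

print("summation centroid:", centroid(summed))
print("averaging centroid:", centroid(averaged))
```

With these idealized inputs, the summation curve is the 5°/sec curve shifted onto the rightward 20°/sec response, whereas the averaging curve is centered at half that response and is half as large in each dimension.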
To illustrate the possible outcomes predicted by weighted vector averaging, we used an elaborated version of Equation 1:

\[ E_{t,i;d,j} = w_{t,i;d}\,E_{t,i} + \left(1 - w_{t,i;d}\right)E_{d,j} \qquad (3) \]

where Et,i;d,j represents the eye acceleration for the motion of a tracking target at speed t in direction i and a distractor at speed d in direction j, Et,i represents the eye acceleration for the motion of one target at speed t in direction i (and Ed,j the corresponding response for one target at speed d in direction j), and wt,i;d represents a weighting with a value between 0 and 1 that defines the strength of target motion in direction i at speed t when competed with a distractor moving in any other direction at speed d. If wt,i;d has a value of 0.0 or 1.0, then this equation reduces to winner-take-all for either the distractor or the target, respectively. If wt,i;d has a value of 0.5, then Equation 3 reduces to equally weighted vector averaging. In Figure 8D, we solved Equation 3 for rightward target motion at 20°/sec and distractor motion at 5°/sec, using the values of Et,i obtained from single-target experiments. We then computed the predicted outcome when the value of wt,i;d was 0 (winner-take-all for the distractor, open triangles), 0.25 (leftmost dashed curve), 0.5 (middle solid curve), 0.75 (rightmost dashed curve), and 1.0 (winner-take-all for the target, filled triangle).
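The continuum described by Equation 3 can be sketched as follows, assuming the weighted-average form implied by the text, in which the target response is weighted by w and the distractor response by 1 − w. The function name and the two single-target vectors are hypothetical placeholders, not measured values.

```python
def weighted_vector_average(e_target, e_distractor, w):
    """Weighted average of two (x, y) eye-acceleration vectors.

    w is the weight given to the target and must lie between 0 and 1;
    the distractor receives the complementary weight 1 - w.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    return (w * e_target[0] + (1.0 - w) * e_distractor[0],
            w * e_target[1] + (1.0 - w) * e_distractor[1])


# Hypothetical single-target responses (deg/sec^2).
e_target = (100.0, 0.0)     # rightward target at the faster speed
e_distractor = (0.0, 40.0)  # upward distractor at the slower speed

# The five weights used for the predictions described for Figure 8D.
for w in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(w, weighted_vector_average(e_target, e_distractor, w))
```

The endpoints reproduce the two winner-take-all outcomes, and intermediate weights trace the straight line between them for each distractor direction.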
Figure 9 illustrates two examples to show the results when the target moved to the right at 20°/sec and the distractor moved in one of eight directions at 5°/sec (Fig. 9A) and the target moved to the right at 5°/sec and the distractor moved at 20°/sec in one of eight directions (Fig. 9B). In both graphs, the eye accelerations in the interval from 158 to 206 msec after the onset of spot motion conformed more closely to predictions of pure vector averaging (solid curve without data points) than to predictions of vector summation (curve with long dashes). We next asked where the responses fell on the continuum from winner-take-all for the distractor to winner-take-all for the tracking target by fitting the data with Equation 3. We used a least squares procedure to fit 32 values of wt,i;d to the 32 groups of eight points obtained from this experiment (one group for target motion at each of the two speeds in each of the eight directions and distractor motion at two speeds: 2 × 8 × 2 = 32). The values of wt,i;d were 0.33 (Fig. 9A) and 0.57 (Fig. 9B), and the fits are shown as the curves with short dashes.
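One way such a fit could be carried out for a single group of eight points is sketched below. It assumes that Equation 3 has the weighted-average form E = w·Etarget + (1 − w)·Edistractor, in which case the least squares estimate of the single weight has a closed form. The text does not specify the fitting procedure beyond "least squares", so this is an illustrative assumption, and the data are synthetic.

```python
import math


def fit_weight(observed, e_target, e_distractors):
    """Least squares estimate of w in: observed_j = w * T + (1 - w) * D_j.

    observed and e_distractors are equal-length lists of (x, y) vectors,
    one per distractor direction; e_target is a single (x, y) vector.
    """
    numerator = 0.0
    denominator = 0.0
    for obs, dist in zip(observed, e_distractors):
        # Rewrite the model as (observed_j - D_j) = w * (T - D_j).
        dx, dy = e_target[0] - dist[0], e_target[1] - dist[1]
        numerator += (obs[0] - dist[0]) * dx + (obs[1] - dist[1]) * dy
        denominator += dx * dx + dy * dy
    if denominator == 0.0:
        raise ValueError("target and distractor responses are identical")
    return numerator / denominator


# Synthetic group: rightward target, distractor in eight directions.
e_target = (100.0, 0.0)
e_distractors = [(40.0 * math.cos(k * math.pi / 4), 40.0 * math.sin(k * math.pi / 4))
                 for k in range(8)]
true_w = 0.33
observed = [(true_w * e_target[0] + (1 - true_w) * d[0],
             true_w * e_target[1] + (1 - true_w) * d[1])
            for d in e_distractors]

print(fit_weight(observed, e_target, e_distractors))
```

Repeating this for each combination of target speed, target direction, and distractor speed would give the 32 weights, one per group.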
Figure 10 shows that the computation used to combine the motion of two spots moving at 5°/sec and 20°/sec corresponded to weighted vector averaging, just as it did for combining two spots moving in different directions at 20°/sec. In Figure 10, A and B, each value on the x-axis shows one of the four combinations of target and distractor speed. The eight points plotted at each of the four values on the x-axis show the wt,i;d for each of the eight directions of target motion. In monkey A (Fig. 10A), the value of the weights generally was between 0.35 and 0.65, reflecting only minor deviations from equally weighted vector averaging. The largest variation occurred when both the target and the distractor speed were 5°/sec (t5/d5); the values of the weights ranged from 0.25 to 0.75. In monkey I (Fig. 10B), the weights were grouped around 0.5 when the target and distractor moved at the same speed (t20/d20, t5/d5). However, the weights were clearly <0.5 when the target moved at 20°/sec and the distractor at 5°/sec (t20/d5) and clearly larger than 0.5 in the opposite situation when the target moved at 5°/sec and the distractor at 20°/sec (t5/d20). This combination indicates that stimulus motion at 5°/sec had a stronger effect on pursuit than did stimulus motion at 20°/sec, when the two speeds were competed against each other. This result is slightly paradoxical, because the motion of a single target at 20°/sec consistently evoked much larger eye accelerations than did the motion of a single target at 5°/sec (mean, 101.5 vs 38.3°/sec²). It is possible that spot motion at 5°/sec was weighted more heavily because that spot appeared closer to the position of fixation.
DISCUSSION
Our experiments reveal that vector averaging is used to combine the visual inputs that arise from the motion of two spots. Although the weights afforded the target and distractor were often unequal, analysis of the individual trials failed to reveal more than a few instances that were compatible with the alternate computations of winner-take-all or vector summation.
Why vector averaging?
Vector averaging provides an excellent way to read a distributed code of direction or speed. For example, Salinas and Abbott (1994) discuss a number of computations that are close to optimal for reading a distributed code, and most of the computations are specific implementations of the general computation of vector averaging. In the present experiments, our goal was to determine whether the pursuit system uses this nearly optimal approach to compute a motor command from the distributed representation of motion of a single target. Because it was not clear how to ask this question using only natural stimuli and single targets, we elected to use two targets moving across different parts of the visual field to probe the computation used to read the distributed code for a single target. Our approach depends on the assumption that the pursuit system uses the same computation to combine information from the two spatial locations we used as it does for a single location. We think this is a valid assumption partly for the practical reason that we used nearby locations, within the central 4° of the visual field, and partly because the pursuit system is attempting to match eye and target speed and therefore has no obvious reason to care about the exact spatial position of the targets. Thus, although our conclusions about how the pursuit system reads the distributed code of image motion are derived from the eye movements evoked by the simultaneous motion of two spots, we think these conclusions apply equally well to the determination of initial eye acceleration for a single spot.
There are now a number of examples in which vector averaging is or may be used to read the distributed representation of a movement command. In motor systems other than pursuit eye movements, simultaneous electrical stimulation of the frontal eye fields caused saccadic eye movements that could be described as the vector average of the saccades stimulated by each site separately (Robinson and Fuchs, 1969). Reversible lesions of the superior colliculus caused changes in the direction and amplitude of saccades that were consistent with the use of vector averaging and inconsistent with the use of vector summation to convert collicular activity into a command for saccadic eye movements (Lee et al., 1988). Recordings from the sensory and motor cortex have demonstrated distributed codes for the direction of arm movement that could be read by either vector averaging or vector summation (Georgopoulos et al., 1986; Kalaska, 1988).
In the pursuit system, microstimulation of visual area MT at the onset of motion of a visual target had effects on both pursuit eye movements and saccades that were most consistent with the use of vector averaging to convert the distributed representation of image motion in MT into commands for these movements (Groh et al., 1997). Although they used dynamic random dot patterns and humans rather than single spots and monkeys, Watamaniuk and Heinen (1994) showed that the initial smooth eye movements evoked by this stimulus reflect a vector combination of the motion of all the dots with precision equivalent to precision of perceptual decisions based on the same stimulus. By directly demonstrating the use of vector averaging to compute motor responses to natural stimuli, our data provide a critical link in the rapidly mounting evidence that vector averaging is a general computation used by the brain to read a distributed representation of either sensory input or motor commands.
Possible neural sites of vector averaging
We selected the initial positions of the targets in our experiments to ensure that we were investigating interactions that occurred downstream from the representation of visual motion in area MT. Although some of our pairs of spots almost certainly fell in the receptive fields of some individual cells, many of the pairs fell in opposite hemifields and would have activated cells in opposite cerebral hemispheres. Therefore, it seems unlikely that the computation revealed in our experiments results from the interactions between multiple targets that have been revealed in a number of previous recordings within MT. Thus, although the transparency effects of Qian and Andersen (1994), the pattern direction selective cells of Movshon et al. (1985), and the two-spot experiments of Recanzone and Wurtz (1994) are potentially interesting effects that could be used to implement some vector averaging in area MT, none of these are likely to be the substrate of the data reported here.
Instead, it seems likely that the neural representation of the vector-averaging computations revealed here will be found downstream in area MST, in the frontal pursuit area, in the dorsolateral pontine nucleus, or even in the cerebellum. From a computational standpoint, there are several physiological requirements for the anatomical structures that participate in vector averaging. There should be a distributed representation of the direction and speed of target motion at least in the inputs to the site of vector averaging, if not at the site itself. Receptive fields should be large and bilateral. One way to implement vector averaging rather than vector summation in the brain is to rely on a process called “response normalization” or “divisive gain control.” Thus, there should be a mechanism for implementing this function. Given what is known about the pursuit system, only area MT is excluded as a possible site for vector averaging, based on the size of its receptive fields and their restriction to the contralateral hemifield.
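A toy population model illustrates why divisive normalization turns a summing readout into an averaging one. The sketch assumes eight direction-tuned units with evenly spaced preferred directions, a raised-cosine tuning curve, and linear superposition of the responses to two stimuli. None of these choices comes from the experiments reported here, and all names are hypothetical.

```python
import math

N_UNITS = 8
PREFERRED = [2 * math.pi * k / N_UNITS for k in range(N_UNITS)]


def population_response(direction):
    """Non-negative raised-cosine tuning for each unit (assumed form)."""
    return [1.0 + math.cos(p - direction) for p in PREFERRED]


def readout(responses, normalize):
    """Population vector; optionally divided by the total activity."""
    x = sum(r * math.cos(p) for r, p in zip(responses, PREFERRED))
    y = sum(r * math.sin(p) for r, p in zip(responses, PREFERRED))
    if normalize:
        total = sum(responses)  # divisive normalization term
        return (x / total, y / total)
    return (x, y)


right = population_response(0.0)
up = population_response(math.pi / 2)
# Assumption: the response to two stimuli is the sum of the single responses.
both = [a + b for a, b in zip(right, up)]

print("unnormalized:", readout(right, False), readout(up, False), readout(both, False))
print("normalized:  ", readout(right, True), readout(up, True), readout(both, True))
```

In this sketch the unnormalized readout for the pair equals the vector sum of the two single-stimulus readouts, whereas the normalized readout equals their average, because the two stimuli evoke equal total activity.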
Role of vector averaging in the initiation of pursuit
Our experiments were designed to reveal the behavior of the pursuit system in the absence of an attentional bias. By providing two spots with identical appearance, depriving the animal of any previous information about which spot would be the target, providing a balanced distribution of the directions of target and distractor motion, and analyzing only presaccadic pursuit for spots moving toward the position of fixation, we have attempted to force the pursuit system to emit a response without intervention from expectations or attention. The difficulty of defeating vector averaging even when the tracking target is identified by its size and direction of motion suggests that vector averaging is the computation that the pursuit system does naturally as a first response, in the absence of compelling information to do otherwise. In a world with many moving objects, or even many stationary objects, however, vector averaging could be doomed to immobilize pursuit. A number of other mechanisms may be used in conjunction with pursuit to overcome these problems. For example, vector averaging may occur over a limited spatial extent, for a limited time after the onset of pursuit, or only for nonzero velocity vectors. Our experiments have not yet tested these possibilities explicitly.
Other data from our laboratory suggest that vector averaging is the earliest response the pursuit system can emit but that it can be supplanted by other, more selective mechanisms once adequate information is available. Recent data (Lisberger, unpublished observations) show that the period of vector averaging ends when the monkey makes a saccade to one of the two moving spots in a two-spot trial. Even in the first few tens of milliseconds after the saccade, the smooth eye movement is most consistent with the predictions of a winner-take-all computation based on the motion of the saccade target. Thus, target selection can bias the vector-averaging computation toward the signals that arise from the selected target. In addition, our earlier experiments on target selection (Ferrera and Lisberger, 1995, 1997) provide an example of how previous information about the structure of the pursuit task can cause winner-take-all behavior for trials that present two moving spots, even in presaccadic pursuit. If a monkey is given a color cue to tell him which of two differently colored stimuli will be the tracking target, and if the animal knows that the direction of target motion will always be horizontal, then distractors that move in other directions do not have a consistent effect on the direction of eye acceleration. Because the behavioral conditions were so different, our new data do not contradict these earlier results or the conclusions that were based on them.
The use of attention to obtain winner-take-all behavior from the pursuit system is not without cost. When the distractor and target move in opposite or nearly opposite directions, the selection of the target causes an added latency of 30–50 msec in the initiation of pursuit. Thus, although vector averaging seems to be the first computation done at the initiation of pursuit, other computations can control the direction and/or speed of pursuit after enough time has elapsed so that a target can be selected. Under natural conditions, this organization would allow the pursuit system to react quickly on the basis of the information available at the initiation of pursuit and to make cognitive or attentional decisions later.
Footnotes
This research was supported by Grant EY03878 from the National Institutes of Health (S.G.L.) and by McDonnell-Pew Program in Cognitive Science Fellowship JSMF 92-38 (V.P.F.). We are grateful to Stefanie Tokiyama for assistance with data analysis and to Drs. Tony Movshon, Michael Stryker, Terry Sejnowski, and Jennifer Groh for comments on an earlier version of this manuscript. We also thank the members of the Lisberger laboratory for many helpful discussions and for comments on this paper.
Correspondence should be addressed to Dr. Stephen G. Lisberger, Department of Physiology, University of California, San Francisco, Box 0444, San Francisco, CA 94143.