Abstract
Decisions about the visual world can take time to form, especially when information is unreliable. We studied the neural correlate of gradual decision formation by recording activity from the lateral intraparietal cortex (area LIP) of rhesus monkeys during a combined motion-discrimination reaction-time task. Monkeys reported the direction of random-dot motion by making an eye movement to one of two peripheral choice targets, one of which was within the response field of the neuron. We varied the difficulty of the task and measured both the accuracy of direction discrimination and the time required to reach a decision. Both the accuracy and speed of decisions increased as a function of motion strength. During the period of decision formation, the epoch between onset of visual motion and the initiation of the eye movement response, LIP neurons underwent ramp-like changes in their discharge rate that predicted the monkey's decision. A steeper rise in spike rate was associated with stronger stimulus motion and shorter reaction times. The observations suggest that neurons in LIP integrate time-varying signals that originate in the extrastriate visual cortex, accumulating evidence for or against a specific behavioral response. A threshold level of LIP activity appears to mark the completion of the decision process and to govern the tradeoff between accuracy and speed of perception.
- lateral intraparietal area (LIP)
- decision
- reaction time
- random-dot motion
- electrophysiology
- vision
- psychophysics
A hallmark of higher brain function is the ability to form decisions from sensory data to guide appropriate behavioral responses. Decisions tend to be more accurate if subjects are given longer exposure to a stimulus (Green and Luce, 1973; Wickelgren, 1977; Gold and Shadlen, 2000; Mateeff et al., 2000), and given the freedom to respond when ready, subjects will improve accuracy by taking more time (Luce, 1986). The relationship between accuracy and processing time suggests that subjects accumulate information to improve performance. A long-term objective of cognitive neuroscience is to elucidate the neural mechanisms that underlie decision formation. To this end, we have looked for a neural correlate of decision making in a task that requires the accumulation of sensory information as a function of time.
A simple form of decision making occurs when an observer discriminates between two possible interpretations of a visual stimulus. Newsome and colleagues (1989) introduced a random-dot motion discrimination task suitable for the simultaneous study of threshold psychophysics and neurophysiology in monkey. A combination of recording, lesion, and stimulation studies has established that neurons in extrastriate cortex (areas MT and MST) carry critical sensory information about motion direction (Newsome and Pare, 1988; Salzman et al., 1990, 1992; Britten et al., 1992; Celebrini and Newsome, 1994, 1995). These signals inform a binary decision about direction, which determines the monkey's behavioral response, a saccadic eye movement.
Many neurons in the lateral intraparietal area (LIP) respond to visual stimuli that are the target of a planned saccadic eye movement (Gnadt and Andersen, 1988; Colby et al., 1996; Platt and Glimcher, 1997). When the direction of random-dot motion instructs the choice of a target for a saccade, LIP activity modulates in a way that predicts the monkey's eye movement response (Shadlen and Newsome, 1996, 2001). The gradual evolution of activity during motion viewing and its dependence on the difficulty of the discrimination suggest that neurons in LIP might represent the accumulation of visual information about motion leading to the formation of the monkey's decision.
To characterize neural activity as decision related, the period of decision formation must be clearly defined. In previous experiments, monkeys were given a fixed amount of time to view the motion stimulus and decide its direction. The actual moment when the monkey committed to a behavioral response was not known, so one could not distinguish neural activity associated with decision formation from activity that represented the planned eye movement. We therefore trained monkeys to perform the same task, but to respond as soon as a decision was made. The combination of threshold psychophysics with a reaction time measurement allowed us to examine both the accuracy of the monkeys' perception and the time required to reach a decision. Near psychophysical threshold, we were able to record neural activity during an extended epoch of decision formation, which was determined on each experimental trial. We found that the activity of LIP neurons modulated as the monkey viewed the motion stimulus, predicting both the perceptual decision and the time required to reach it.
Parts of this work have been published previously in abstract form (Roitman and Shadlen, 1998).
MATERIALS AND METHODS
We recorded from 54 neurons in the LIP of two rhesus monkeys (both female, 4.5–5 kg) trained to perform a reaction-time direction-discrimination task. Each monkey was implanted with a head-holding device, a recording cylinder for transdural introduction of electrodes (Crist Instruments, Damascus, MD), and a scleral eye coil for monitoring eye position (Fuchs and Robinson, 1966; Judge et al., 1980). For each recording session, a plastic grid that held a sterile guide tube through which a tungsten microelectrode was passed was secured in the recording chamber. Signals from the electrode were amplified, filtered, and passed to a dual voltage-time window discriminator (Bak Electronics, Germantown, MD) to discriminate action potentials from a single neuron. A record of the time of each action potential, marked to the nearest millisecond, as well as events occurring in the trial were stored on a PC for off-line analysis (Hays et al., 1982). Horizontal and vertical eye positions were also measured (C-N-C Engineering) and stored to disk for analysis (250 Hz). All surgical and animal care methods conformed to the National Institutes of Health Guide for the Care and Use of Laboratory Animals and were approved by the University of Washington Animal Care Committee.
Neurons were selected according to anatomical and physiological criteria. Postoperative MRI was used to identify LIP and to direct the placement of recording electrodes within the recording chamber. The three coronal images in Figure 1 show the recording locations from one of our monkeys. The arrowheads border the sites where neurons were recorded. A minority of neurons (5–10%) studied in this monkey may have been located within the histological boundaries of the lateral occipitoparietal zone (in “A”), whereas most of the neurons were located in LIPv (“B” and “C”) (Lewis and Van Essen, 2000). Locations from the second monkey were nearly identical to the images in Figure 1, B and C.
We used a memory-guided saccade task to select LIP neurons that were active during saccade planning (Hikosaka and Wurtz, 1983; Gnadt and Andersen, 1988). The monkey fixated a central point while a target appeared briefly (200 msec) in the periphery. After a delay of 1–2 sec, the fixation point was extinguished, and the monkey made a saccade to the remembered location of the target. By varying the location of the target from trial to trial, we identified the location of the visual field that caused a sustained response in the LIP neuron during the delay period, termed the response field (RF). This location determined the position of one of the choice targets in the direction discrimination task (range, 4.5–15° eccentric).
Direction discrimination task. Monkeys performed a single-interval, two-alternative forced-choice direction-discrimination task (Fig. 2). They were tested in two conditions: reaction time (RT) and fixed duration (FD). The conditions shared the following features. A trial began when the monkey fixated a central fixation point. Two choice targets then appeared, with one located in the RF of the neuron under study (T1) and the other located in the opposite hemifield (T2). The random-dot motion stimulus appeared in a 5°-diameter aperture centered at the fixation point. The stimulus was presented on a computer monitor with a frame rate of 75 Hz. A set of dots was shown for one video frame and then replotted three video frames later. When replotted, a subset of dots was offset from their original location to create apparent motion while the remaining dots were relocated randomly. Therefore, the first pairing of coherent dots occurred after three video frames (40 msec), although random motion was present by the second video frame (13 msec). The net direction of motion was toward one or the other choice target. Both the direction and strength of motion (the percentage of coherently moving dots) were chosen randomly on each trial. The monkey's task was to decide the direction of motion and to indicate its decision with a saccadic eye movement to the appropriate choice target. The monkey received a liquid reward for correct choices and was rewarded on half the trials that used 0% coherent motion. Errors were followed by a short time out (extra 750 msec added to the intertrial interval).
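The interleaved replotting scheme described above can be sketched in a few lines. This is a minimal illustration and not the authors' stimulus code: the number of dots per set, the displacement `dx`, `dy`, the per-dot coherence draw (rather than a fixed-size subset), and the handling of dots that leave the aperture are all assumptions. Only the 75 Hz frame rate, the three-frame replot interval, and the 5° aperture come from the text.

```python
import random

FRAME_RATE_HZ = 75.0                # from the text
FRAME_MS = 1000.0 / FRAME_RATE_HZ   # about 13.3 msec per video frame
APERTURE_RADIUS_DEG = 2.5           # 5 deg diameter aperture, from the text


def random_point_in_aperture(rng):
    """Draw a uniformly distributed point inside the circular aperture."""
    while True:
        x = rng.uniform(-APERTURE_RADIUS_DEG, APERTURE_RADIUS_DEG)
        y = rng.uniform(-APERTURE_RADIUS_DEG, APERTURE_RADIUS_DEG)
        if x * x + y * y <= APERTURE_RADIUS_DEG ** 2:
            return (x, y)


def replot(dots, coherence, dx, dy, rng):
    """Replot one dot set three video frames after it was last shown."""
    new_dots = []
    for (x, y) in dots:
        if rng.random() < coherence:
            # Coherent dot: offset from its previous location (assumed step size).
            nx, ny = x + dx, y + dy
            if nx * nx + ny * ny > APERTURE_RADIUS_DEG ** 2:
                # Assumption: dots that leave the aperture are re-placed randomly.
                nx, ny = random_point_in_aperture(rng)
        else:
            # Noise dot: relocated randomly.
            nx, ny = random_point_in_aperture(rng)
        new_dots.append((nx, ny))
    return new_dots


def simulate_frames(n_frames, dots_per_set, coherence, dx, dy, seed=0):
    """Return a list of frames; frame f shows dot set f % 3."""
    rng = random.Random(seed)
    sets = [[random_point_in_aperture(rng) for _ in range(dots_per_set)]
            for _ in range(3)]
    frames = []
    for f in range(n_frames):
        k = f % 3
        if f >= 3:
            # This set was last shown three frames (40 msec) ago.
            sets[k] = replot(sets[k], coherence, dx, dy, rng)
        frames.append(list(sets[k]))
    return frames
```

Because each set is paired only with its own earlier appearance, the first coherent displacement appears on the fourth frame shown, 40 msec after the first, while successive frames from different sets already produce uncorrelated (random) motion one frame after onset.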
In the RT task (see Fig. 2A), the choice targets were displayed for a variable interval before the onset of the motion stimulus. The duration of this prestimulus interval was randomly selected from an exponential distribution (mean = 700 msec). This randomization served to discourage anticipation of the onset of the motion stimulus. We found that randomization of the prestimulus interval in this fashion was essential for training on the RT task. When the onset of the stimulus was predictable, RT was faster than in these experiments and varied less across the range of coherences [see also Green et al. (1983)]. Once the motion stimulus began, the monkey was free to indicate its choice at any time. When the computer detected a break in fixation, the random dots were extinguished. If the monkey made a saccade to either choice target, the trial was scored as correct or incorrect. Breaks in fixation that were not associated with an immediate saccade to a choice target were rare and are not included in our data set.
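The choice of an exponential distribution for the prestimulus interval can be illustrated with a short sketch. Only the 700 msec mean is taken from the text; the function name is hypothetical, and any truncation of very short or very long intervals that may have been applied in the experiments is not modeled here.

```python
import random

MEAN_PRESTIMULUS_MS = 700.0  # from the text


def draw_prestimulus_interval(rng):
    """Draw one prestimulus interval (msec) from an exponential distribution."""
    # random.expovariate takes the rate parameter, i.e. 1 / mean.
    return rng.expovariate(1.0 / MEAN_PRESTIMULUS_MS)
```

An exponential distribution is memoryless: having already waited some time tells the subject nothing about how much longer the wait will be, which is why it discourages anticipation of stimulus onset.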
In the FD task (Fig. 2B), the choice targets were displayed for 700 msec followed by the appearance of the random-dot motion stimulus. The monkey then viewed the motion stimulus for 1 sec. This was followed by a random delay of 500–1500 msec, followed by the disappearance of the fixation point, which cued the monkey to report its choice of direction. Until then, the monkey was required to maintain fixation within a window of ±0.5° (monkey N) or ±1.5° (monkey B).
The RT and FD tasks were conducted in alternating blocks consisting of 10–40 trials at each of the six levels of motion strength. The monkeys were informed of the task condition by the color of the fixation point, which was red for FD and blue for RT. Note that in the RT trials, the monkey could take as little or as much time as it needed to make its decision. However, the reward was withheld for a minimum of 800 msec (monkey B) or 1200 msec (monkey N) after onset of random-dot motion, no matter how quickly the monkey indicated its choice. This strategy created an incentive to respond within ∼1 sec of motion onset, but no incentive to go any faster. On each trial we obtained a measurement of the monkey's direction judgment and, on the RT version of the task, the amount of time taken to achieve it. We refer to the period from onset of random-dot motion to saccade initiation as the reaction time.
Analysis of behavioral data. For each experiment, the monkey's sensitivity to motion was estimated by plotting the probability (p) of a correct choice as a function of motion coherence (C). The accuracy data were fit by a cumulative Weibull function (Quick, 1974):

$$p = 1 - 0.5\,\exp\left[-(C/\alpha)^{\beta}\right] \tag{1}$$

using a maximum likelihood fitting procedure. The discrimination threshold, α, is the coherence level at which the monkey would make 82% correct choices. The second parameter, β, describes the slope of the psychometric function.
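As an illustration of the fitting step, the sketch below assumes the standard two-alternative form of the Quick (1974) function, p = 1 − 0.5·exp[−(C/α)^β], which equals 1 − 0.5/e ≈ 0.82 at C = α and therefore matches the 82% threshold definition. The coarse grid search stands in for whatever optimizer the authors used, and the function names and data layout are hypothetical.

```python
import math


def weibull_p_correct(c, alpha, beta):
    """Probability correct at coherence c (0 to 1) for threshold alpha, slope beta."""
    return 1.0 - 0.5 * math.exp(-((c / alpha) ** beta))


def neg_log_likelihood(alpha, beta, data):
    """Binomial negative log likelihood; data holds (coherence, n_correct, n_trials)."""
    nll = 0.0
    for c, n_correct, n_trials in data:
        p = weibull_p_correct(c, alpha, beta)
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # guard against log(0)
        nll -= n_correct * math.log(p) + (n_trials - n_correct) * math.log(1.0 - p)
    return nll


def fit_weibull(data, alphas, betas):
    """Return the (alpha, beta) pair on the grid that maximizes the likelihood."""
    best = min((neg_log_likelihood(a, b, data), a, b) for a in alphas for b in betas)
    return best[1], best[2]
```

A gradient-based or simplex optimizer would normally replace the grid, but the likelihood being maximized is the same.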
Analysis of neural data. All physiological data reported in this paper were acquired from trials in which the monkeys completed the direction-discrimination task by choosing one of the two choice targets. Spike times were recorded to 1 msec precision and aligned to events in the trial. In the RT task, there are two relevant time scales: one that relates the time of spikes to the onset of random-dot motion, the other to initiation of the saccadic eye movement response. Unless indicated otherwise, when reporting data aligned to the onset of motion, we exclude all spikes that occurred from 100 msec before the saccadic eye movement. Therefore, the averages shown on the left portion of Figures 7A, 11, and 12 exclude any perisaccadic enhancement that is commonly observed in LIP neurons (Barash et al., 1991a). Similarly, when analyzing data with respect to saccade initiation, we exclude all spikes occurring within 200 msec after onset of random-dot motion. This precludes any activity associated with stimulus onset from the averages. These manipulations are important when analyzing the RT data because events with fixed latency to stimulus onset occur with variable time with respect to the saccade, and vice versa.
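The two alignment rules can be written as a pair of filters. This is a sketch with hypothetical function names, with all times in msec on a common trial clock. Dropping spikes that precede motion onset from the saccade-aligned view is an assumption; the text states only that spikes within 200 msec after motion onset are excluded.

```python
def align_to_motion(spike_times, saccade_on, motion_on, pre_saccade_cut=100.0):
    """Spike times relative to motion onset, excluding the perisaccadic period.

    Spikes from 100 msec before saccade initiation onward are dropped.
    """
    cutoff = saccade_on - pre_saccade_cut
    return [t - motion_on for t in spike_times if t < cutoff]


def align_to_saccade(spike_times, saccade_on, motion_on, post_motion_cut=200.0):
    """Spike times relative to saccade initiation, excluding the onset response.

    Assumption: spikes earlier than 200 msec after motion onset are all dropped,
    including any that precede motion onset.
    """
    start = motion_on + post_motion_cut
    return [t - saccade_on for t in spike_times if t >= start]
```

When averaging across trials, each time bin should be normalized by the number of trials that still contribute to it, because trials with short RT drop out of the motion-aligned average early.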
We use various regression tests to evaluate the role of motion strength and other factors on the neural response. In general, these regression models provide a convenient test of the null hypothesis that the factor in question does not affect the measured response. They also offer a sensible way to combine data from several neurons: unless indicated otherwise, the regression models make use of an identifier variable, Iu (also known as a dummy variable), to indicate neuron identity. This maneuver adjusts for differences in overall response level between neurons and allows the measurement from each neuron to affect the regression fit with leverage commensurate to its reliability. Unless indicated otherwise, all fits were obtained using weighted least squares (Draper and Smith, 1966), which furnishes maximum likelihood estimates of the fitted coefficients and their confidence intervals under the assumption that variability (ɛ) is distributed as multivariate Normal. Here we list the specific regression models with minimal explanation so as to complement the development in Results.
Effect of motion strength, RT group, and task during specific epochs. To determine whether the neural response was affected by the strength of motion at a particular time (see Figs. 6, 7, and 9), we fit:

$$Y_t = \beta_{1,u} I_u + \beta_2 C + \varepsilon \tag{2A}$$

where Yt is the spike rate in the epoch (as specified in the text), C is the coherence level from that trial (0–0.512), Iu is the dummy variable that identifies the neuron (Iu = 1 when u = n and 0 otherwise), and ɛ is random error (see above). βi represents the fitted coefficients (β1,u is a vector of fitted constants, one per neuron), estimated using weighted least squares. Responses for trials in which the monkeys selected T1 or T2 correctly were analyzed separately. In Equation 2A, the fitted value for β2 and its confidence interval (CI) furnish an estimate of the change in response per 100% coherence. The null hypothesis is that motion strength does not affect the level of LIP activity (H0: β2 = 0), which we evaluate using an F test for nested models (Draper and Smith, 1966).
We used the same strategy to determine whether the response was affected by RT during an epoch before saccade initiation (see Fig. 8). Instead of motion strength, we tested the effect of RT on spike rate, measured at designated times:

$$Y_t = \beta_{1,u} I_u + \beta_2 T_{group} + \varepsilon \tag{2B}$$

where Tgroup is the median RT for the group of trials used to generate the response functions in Figure 8A.
We obtained data from both the FD and RT versions of the task in a subset of neurons (see Fig. 10). For these neurons we were interested in whether the responses differed according to the behavioral paradigm. For specific epochs during the trials, we expanded Equation 2A to incorporate the effect of task type:

$$Y_t = \beta_{1,u} I_u + \beta_2 C + \beta_3 I_{task} + \varepsilon \tag{2C}$$

using the same conventions as above. Itask equals 1 or 0 for RT and FD trials, respectively. Any difference in activity between the two paradigms is estimated by β3 and its confidence interval. The null hypothesis that the choice of paradigm does not influence the level of the neural response is tested by setting β3 = 0.
To examine whether the buildup and decline of activity during motion viewing differs during correct and error trials (see Fig. 11), we modified Equation 2C to compare correct and error trials:

$$Y_t = \beta_{1,u} I_u + \beta_2 C + \beta_3 I_{correct} + \varepsilon \tag{2D}$$

where Icorrect equals 1 or 0 for correct and error trials, respectively. The analysis excludes all 0% coherent motion trials and is performed separately for T1 and T2 choices. The null hypothesis (H0: β3 = 0) is that the response is not affected by whether the target was chosen correctly (or equivalently, that the response is not affected by the direction of motion).
We were also interested in whether the response recorded in particular epochs was related to the amount of time it took the monkey to make its choice (see Figs. 12, 13). For each neuron, all correct trials for each coherence level in which the monkeys had chosen T1 were sorted into a “short” or “long” group on the basis of the RT on that trial. To test the effect of RT group, Equation 2A was expanded to:

$$Y_t = \beta_{1,u} I_u + \beta_2 C + \beta_3 I_{RT} + \varepsilon \tag{2E}$$

using the same conventions as above. In this equation, IRT equals 1 for trials in the short RT group and 0 otherwise. β3 estimates the difference in spike rate that can be attributed to RT group. The null hypothesis is that RT group does not affect the response (H0: β3 = 0). This analysis was also performed on data from experiments in which identical random-dot patterns were presented on half of the trials. Equation 2E was expanded to:

$$Y_t = \beta_{0,p} I_p + \beta_{1,u} I_u + \beta_2 C + \beta_3 I_{RT} + \varepsilon \tag{2F}$$

where Ip identifies each unique pattern of random dots and β0,p is a list of constants associated with each pattern. The same null hypothesis (H0: β3 = 0) allows us to examine the possibility that any difference in response associated with short versus long RT is explained by motion strength and differences in the particular sequence of random dots.
Time course of response. To examine the effect of motion strength on the time course of the neural response, we modeled spike rate as a linear function of time and examined the effect of motion strength on the slope of these lines. For these analyses, spike rate (Y) was measured in 20 msec bins, and time (T) denotes the center of the bin. The effect of random-dot coherence (C) on time course of the response can be estimated by fitting:

$$Y = \beta_{1,u} I_u + \beta_2 T + \beta_3 C + \beta_4 (T \times C) + \varepsilon \tag{3A}$$

where T × C is the term describing the interaction between time and motion strength. Responses for trials in which the monkeys selected T1 or T2 correctly were analyzed separately. The interaction term instantiates the possibility that the time-dependent change in spike rate is affected by motion strength. Therefore, the null hypothesis is that β4 = 0.
We also were interested in whether the neural response changed as a function of time early in the trials even when RT was long. Trials were selected according to choice (T1 or T2 correct) and RT (for example, 700–799 msec; see Fig. 8), permitting a reduction of Equation 3A to:

$$Y = \beta_{1,u} I_u + \beta_2 T + \varepsilon \tag{3B}$$

Here β2 is the average slope of the response during this interval. The null hypothesis is that there is no time-dependent change in response (H0: β2 = 0).
For correct T1 choices, we tested whether the time course of the neural response was related to the amount of time it takes the monkey to make its choice (see Figs. 12 and 13 and Eq. 2E). For each coherence level, the response was estimated for each RT group with a modification of Equation 3A:

$$Y = \beta_{1,u} I_u + \beta_2 T + \beta_3 I_{RT} + \beta_4 (T \times I_{RT}) + \varepsilon \tag{3C}$$

using the same conventions as above. The null hypothesis is that RT group does not affect the slope of the response (H0: β4 = 0). The slope of each response function was estimated separately for short- and long-RT groups (using Eq. 3B).
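A regression with one constant per neuron and a shared coherence slope can be sketched as follows. The example assumes a model of the form Y = β1,u·Iu + β2·C + ε and uses ordinary (unweighted) least squares, which is a simplification of the weighted fits described in the text. It exploits the fact that fitting neuron dummy variables is equivalent to centering both C and Y within each neuron before estimating the common slope. The function name and data layout are hypothetical.

```python
def pooled_slope(trials):
    """Fit Y = b1[u] + b2 * C by least squares.

    trials: list of (neuron_id, coherence, spike_rate).
    Returns (b2, b1) where b1 maps neuron_id to its fitted constant.
    """
    by_unit = {}
    for unit, c, y in trials:
        by_unit.setdefault(unit, []).append((c, y))

    sxy = 0.0
    sxx = 0.0
    means = {}
    for unit, rows in by_unit.items():
        mean_c = sum(c for c, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        means[unit] = (mean_c, mean_y)
        for c, y in rows:
            # Centering within neuron removes differences in overall response level.
            sxy += (c - mean_c) * (y - mean_y)
            sxx += (c - mean_c) ** 2

    b2 = sxy / sxx
    b1 = {unit: mean_y - b2 * mean_c for unit, (mean_c, mean_y) in means.items()}
    return b2, b1
```

With coherence expressed on a 0 to 1 scale, `b2` is in spikes per second per 100% coherence, the unit used for the reported estimates.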
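To make the role of the interaction term concrete, the sketch below fits a single-neuron version of such a model, Y = b1 + b2·T + b3·C + b4·(T × C), by ordinary least squares through the normal equations. It is a simplification under stated assumptions: one intercept instead of one constant per neuron, no weighting, and hypothetical function names. The fitted slope of the response at coherence C is b2 + b4·C, so b4 = 0 corresponds to a buildup rate that does not depend on motion strength.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        residual = m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = residual / m[i][i]
    return x


def fit_time_by_coherence(samples):
    """Fit Y = b1 + b2*T + b3*C + b4*T*C.

    samples: list of (bin_center_sec, coherence, spike_rate).
    Returns [b1, b2, b3, b4].
    """
    rows = [[1.0, t, c, t * c] for t, c, _ in samples]
    y = [rate for _, _, rate in samples]
    p = 4
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)
```

With time in seconds and coherence on a 0 to 1 scale, `b4` is the change in slope (spikes per second squared) per 100% coherence.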
Saccade metrics. We were interested in whether the variability in the neural response during motion viewing could be accounted for by variability in the eye movements themselves. To test this, we first determined whether saccade metrics were affected by the strength of the motion stimulus. For each saccade, we measured its maximum velocity (Vmax), average velocity (V̄), duration (Tdur), amplitude (A), reaction time (RT), and accuracy (ACC, 1/distance of saccade endpoint from target, divided by target eccentricity). We tested each parameter to determine whether it varied with coherence for each direction of motion, using the model:

$$S = \beta_{1,u} I_u + \beta_2 C + \varepsilon \tag{4A}$$

where S is the value for Vmax, V̄, Tdur, A, RT, or ACC. The null hypothesis is that motion strength does not affect the saccade parameter (H0: β2 = 0).
We then tested whether these saccade parameters could explain the apparent relationship between spike rate and motion stimulus strength. Equation 2A was expanded to incorporate the potential confounding variables:

$$Y_t = \beta_{1,u} I_u + \sum_{j=2}^{5} \beta_j S_j + \beta_6 C + \varepsilon \tag{4B}$$

where Sj denotes the saccade metrics entered as covariates, C is motion strength, and Yt is the response on each trial during the 40 msec epoch ending at the median RT for 51.2% coherence trials. Only saccade metrics that varied as a function of coherence level are included. The null hypothesis is that motion strength does not affect the level of LIP activity once saccade metrics are known (H0: β6 = 0).
Response on single trials. We used a maximum likelihood procedure to estimate the rate of increase in the spike rate from single trials, using spikes from 200 msec after dots onset until 100 msec before saccade initiation. Only correct T1 choices were included in this analysis. The spike rate, λ(t), was approximated as a line:

$$\lambda(t) = \lambda_0 + k\,t \tag{5}$$

We solved for values of λ0 and k that maximize the likelihood of obtaining the observed spike train assuming that spikes are emitted in accordance with a nonstationary Poisson point process parameterized by λ(t). The procedure furnishes for each trial an estimate of the change in spike rate per unit time (k, which we term ramp-slope) along with its standard error. The latter is obtained by inverting the 2 × 2 Hessian matrix of second partial derivatives of the error function (minus log likelihood) with respect to λ0 and k.
To evaluate the relationship between the ramp-slope and RT, we used weighted multiple regression to factor out the influence of cell identity and coherence on ramp-slope:

$$k = \beta_{1,u} I_u + \beta_2 C + \beta_3 RT + \varepsilon \tag{6}$$

where k is the ramp-slope for each trial (from Eq. 5), C is motion coherence, and RT is the reaction time. As above, β1,u represents a list of constants, one per neuron. When analyzing data from just one neuron, this term is replaced by a single constant that represents the average ramp-slope independent of motion strength and RT. The change in ramp-slope that is related to RT is estimated by β3. The units are spikes per second squared per second (i.e., spikes per second cubed). The null hypothesis, that there is no relationship between the neural activity and RT, is tested by setting β3 = 0.
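The single-trial estimate can be sketched as follows, assuming a linear rate λ(t) = λ0 + k·t with t measured in seconds from the start of the analysis window. For a nonstationary Poisson process observed over a window of length D, the log likelihood is Σ log λ(t_i) − (λ0·D + k·D²/2). The damped Newton iteration, the starting values, and the function names are choices made for this illustration, not a description of the authors' optimizer.

```python
import math


def log_likelihood(lam0, k, spikes, duration):
    """Poisson log likelihood of spike times (sec, within [0, duration])."""
    return (sum(math.log(lam0 + k * t) for t in spikes)
            - (lam0 * duration + 0.5 * k * duration ** 2))


def information(lam0, k, spikes):
    """Entries of the 2 x 2 Hessian of minus log likelihood."""
    a00 = sum(1.0 / (lam0 + k * t) ** 2 for t in spikes)
    a01 = sum(t / (lam0 + k * t) ** 2 for t in spikes)
    a11 = sum(t * t / (lam0 + k * t) ** 2 for t in spikes)
    return a00, a01, a11


def fit_ramp(spikes, duration, max_iter=100, tol=1e-9):
    """Return (lam0, k, se_k); needs at least two distinct spike times."""
    lam0, k = len(spikes) / duration, 0.0  # start from a flat rate
    for _ in range(max_iter):
        g0 = sum(1.0 / (lam0 + k * t) for t in spikes) - duration
        g1 = sum(t / (lam0 + k * t) for t in spikes) - 0.5 * duration ** 2
        a00, a01, a11 = information(lam0, k, spikes)
        det = a00 * a11 - a01 * a01
        d0 = (a11 * g0 - a01 * g1) / det   # Newton direction for lam0
        d1 = (a00 * g1 - a01 * g0) / det   # Newton direction for k
        current = log_likelihood(lam0, k, spikes, duration)
        step = 1.0
        while step > 1e-8:
            new0, new1 = lam0 + step * d0, k + step * d1
            # Keep the rate positive across the window and never go downhill.
            if (new0 > 0 and new0 + new1 * duration > 0 and
                    log_likelihood(new0, new1, spikes, duration) >= current):
                break
            step *= 0.5
        else:
            break  # no acceptable step was found
        lam0, k = new0, new1
        if abs(step * d0) + abs(step * d1) < tol:
            break
    a00, a01, a11 = information(lam0, k, spikes)
    # Variance of k is the matching diagonal entry of the inverse Hessian.
    se_k = math.sqrt(a00 / (a00 * a11 - a01 * a01))
    return lam0, k, se_k
```

Because a single trial contributes only a few dozen spikes, the standard error returned alongside `k` is typically large, which is one reason a weighted regression across trials is a natural next step.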
Because behavioral and physiological results were similar in the two monkeys (see Fig. 6, Table 1), all population analyses were performed on combined data from both monkeys.
RESULTS
We recorded from 54 LIP neurons in two rhesus monkeys while they performed a direction discrimination task (Fig. 2). We will first show how the speed and accuracy of the monkeys' direction decisions depended on motion strength. Then we will examine the activity of LIP neurons during the period of decision formation. Finally, we will expose a weak relationship between RT and the activity of LIP neurons on single trials.
Speed and accuracy of direction judgments
Performance accuracy on the RT and FD tasks depended on the motion strength of the stimulus. Accuracy data from a single recording session are shown in Figure 3A. The monkey's performance varied from chance (50% correct, data not shown) to perfect discrimination as the motion strength increased from 0 to 51.2% coherence. The fitted psychometric function revealed a threshold (αRT) of 6.3% coherence motion and a slope (βRT) of 1.7 (see Materials and Methods, Eq. 1). This level of performance on the RT task was comparable with the performance on alternating blocks of trials in which the random dots were viewed for a full second (Fig. 3A, dashed curve) (αFD = 7.5% coherence; βFD = 1.5). The psychometric functions obtained from FD and RT trials in this experiment were not statistically different (p = 0.49; likelihood ratio test).
Overall, the accuracy of the direction discrimination was not compromised when tested in the RT condition (Table 1). For 38 experiments in which we obtained data on both the FD and RT tasks, the ratio of thresholds (αRT/αFD) was <1 (geometric mean = 0.74, 95% CI: 0.63–0.87; p = 0.002, paired t test after log transform). Thus when given the freedom to respond when ready, the monkeys took the time needed to perform the task slightly better than the level that they achieved with a full second of stimulus viewing.
The amount of viewing time required to achieve such performance varied inversely as a function of motion strength. Figure 3B shows the mean RT for correct choices in the same experiment depicted in Figure 3A. RT varied from 350 ± 9 msec (mean ± SEM) for the strongest motion (51.2% coherence) to 876 ± 35 msec for the weakest motion strength. Summary statistics for all 54 experiments are provided in Table 2. RT tended to be slightly faster for T1 choices than T2 choices, presumably because of the extensive practice given the monkey during RF mapping. The distributions of RT associated with any one condition tend to exhibit positive skew. Various statistics have been advanced to describe the moments and shape of the RT distribution (Luce, 1986; Carpenter and Williams, 1995; Ratcliff and Rouder, 1998). For the present purposes, we ask the reader to consider the values in Table 2 as indicators of central tendencies. A full account of the distribution of RT will be reported separately (Ditterich et al., 2001).
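The comparison of thresholds on a log scale can be illustrated with a short sketch. The function name is hypothetical and the example returns only the geometric mean ratio and the paired t statistic; converting the statistic to a p value or a confidence interval requires the t distribution with n − 1 degrees of freedom, which is not implemented here.

```python
import math


def threshold_ratio_summary(alpha_rt, alpha_fd):
    """Geometric mean of alpha_RT / alpha_FD and the paired t statistic on logs."""
    diffs = [math.log(rt / fd) for rt, fd in zip(alpha_rt, alpha_fd)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t_stat = mean_diff / math.sqrt(var_diff / n)
    # Exponentiating the mean log ratio gives the geometric mean ratio.
    return math.exp(mean_diff), t_stat
```

Working with log ratios treats a halving and a doubling of threshold as equal and opposite effects, which is why the geometric rather than the arithmetic mean is the natural summary.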
The rapid reaction times accompanying strong motion indicate that the monkeys did not procrastinate in reporting their decisions once attained. This is remarkable considering that there was no incentive to respond any faster than 800 or 1200 msec (depending on monkey). In contrast, on the weaker motion strengths, the monkeys often took more than 1 sec to reach a decision (e.g., in 20% of the trials at 3.2% coherence) (Table 2). Presumably, the ability to view a difficult stimulus for a longer time accounts for the small difference in performance on the RT and FD tasks. Overall, the pattern of results indicates that the monkeys took about as long as required to achieve the level of performance to which they had grown accustomed on the 1 sec FD task. Although we did not attempt to manipulate the tradeoff between speed and accuracy in these experiments, one of the monkeys tended to respond with short latencies when first exposed to the RT task. We observed that this monkey's performance improved over weeks as it took more time to respond to weaker stimuli. This observation suggests that the additional viewing time on difficult trials was devoted to solving the task, consistent with previous studies (Gold and Shadlen, 2000). The important point is that the RT task allows us to deduce the window of time corresponding to decision formation on each trial, before the monkey is committed to a particular behavioral response.
Neural response during direction discrimination task
The direction of motion and the location of targets in the discrimination task were arranged so that the activity of the neuron under study would indicate the monkeys' decisions. One of the choice targets (T1) was in the response field of the neuron; the other choice target (T2) was placed in the opposite hemifield. The random-dot motion was in a 5° aperture centered at the fovea and was directed toward either T1 or T2. The monkey was trained to interpret such motion as an instruction to make an eye movement to the corresponding target. On the basis of our selection criterion (Materials and Methods), we expected neurons to respond more when the monkey planned an eye movement to the target in its RF (Shadlen and Newsome, 1996, 2001). The RT version of the task allowed us to examine the neural activity in the time frame of the monkey's decision formation.
Neural activity increased during the period of decision formation when the monkey's judgment resulted in an eye movement to the choice target in the RF (T1) but not before an eye movement to the other target (T2). Figure 4 illustrates the responses obtained in the RT condition of the same experiment as Figure 3. When motion was strong, the monkey made its decision rapidly, and the response modulation was apparent for only ∼150 msec before saccade initiation (Fig. 4, top, 51.2% coherence).
When the motion was weaker, the time required to reach a decision was prolonged. As shown in Figure 4 (bottom, 6.4%coherence), the response of the neuron began to increase several hundred milliseconds before the execution of the saccade to the target in the RF. The example exposes a dividend of the combined RT threshold-discrimination task. When the task is easy, it is difficult to interpret the neural data because the sequence of events from stimulus onset to saccade execution occurs within a short time frame. However, when the task is more difficult, the decision is formed over a prolonged period, which can be differentiated from the immediate preparation of an eye movement.
A similar pattern of activation was evident during blocks of trials in which the monkey viewed motion for a fixed duration. Figure5 illustrates the responses obtained in the FD condition of the same experiment as Figure 3. The activity increased during the period of motion viewing on trials in which the monkey judged the direction as toward the RF, and this change in activity persisted through the delay period until the go signal. When the monkey judged the direction of motion as away from the RF, the activity decreased and remained suppressed through the delay period. This pattern of persistent activity during the delay period helps to distinguish LIP neurons from visual sensory neurons like those in area MT (Seidemann et al., 1998). Our hypothesis is that the buildup and attenuation of activity during motion viewing represents formation of the monkey's decision about direction (Shadlen and Newsome, 2001). To test this, it is necessary to discern whether the activity modulates in the time period that the monkey uses to reach a decision. This is not possible using the FD version of the task because there is no way to tell when the monkey has reached its decision.
The chief advantage of the RT task is that we can examine the neural activity during the period of decision formation, before the monkey is committed to an eye movement response. In particular, we may ask whether the stimulus has an effect on the neural activity that cannot be accounted for by the initiation of the saccade. We examined the spike rate as a function of motion strength during a short epoch after the onset of random-dot motion (Fig. 6). We chose for this analysis a 100 msec epoch ending at the median RT for the strongest motion strength, thus permitting us to estimate spike rate from at least half of the trials at every coherence level. Figure6A shows the spike rates associated with T1 and T2 choices for the neuron depicted in Figure 4, plotted as a function of motion strength and fit by a line. For this neuron, the average response associated with T1 choices increased by 42.2 spikes per second per 100% coherence (CI: 14.6–69.9; p < 0.001) (Eq. 2A, H0: β2 = 0 ), indicating a profound effect of stimulus strength on the buildup of activity during motion viewing. When the monkey chose T2, there was a decrease of −16.5 spikes per second per 100% coherence (CI: −29.0 to −4.0; p < 0.001) (Eq. 2A, H0: β2 = 0 ).
We performed this analysis on each of the neurons in our sample. For each neuron, we used this regression procedure to estimate the change in spike rate per 100% coherence on trials that would ultimately culminate in T1 or T2 choices. These values are displayed for each monkey in the histograms in Figure 6B. For the population of neurons studied, the modulation of the response during motion viewing was related to motion strength. Arrows indicate the average change in firing rate for each monkey. The dependence of the response on coherence during motion viewing was found across the population of neurons studied and was similar in the two monkeys.
The evolution of activity accompanying motion viewing furnishes insight into the neural basis of decision formation. Figure 7A shows the averaged responses of 54 neurons recorded while monkeys performed the RT task. On the left half of the graph, the responses are aligned to the onset of random-dot motion. They show activity accompanying motion viewing and exclude any perisaccadic response. On the right half of the graph, the responses are aligned to the onset of the eye movement response. They show activity leading to the monkeys' behavioral response and exclude any response modulation that accompanies onset of random-dot motion. The next three paragraphs focus primarily on the left portion of the graph, during the period in which the monkey is viewing random-dot motion but has not committed to a choice.
Before the onset of motion, there was a modest level of activity attributable to the presence of one of the choice targets in the RF of the neuron. Approximately 90 msec after the onset of random-dot motion, there was a transient dip and recovery in the activity lasting ∼100 msec that did not depend on the strength of motion (indicated by color) or the monkey's choice (solid or dashed lines). The activity then increased or decreased in a manner that reflected the strength of motion and the monkey's ultimate choice. On trials that culminated in T1 choices (direction judged as toward the RF), the activity increased in a ramp-like fashion. The rate of growth in the response was largest for the strongest motion and smaller as the coherence decreased. The slopes of the functions ranged from 21.5 spikes per second squared to 88.8 spikes per second squared as motion strength was varied from 0 to 51.2% coherence (CI: 16.5–26.6 and 49.2–128.3, respectively; p < 0.0001) (Eq. 3A, H0: β4 = 0). The effect of motion strength on the rate of change was statistically significant for both monkeys (increase of 59.3 and 40.7 spikes per second squared from 0 to 51.2% coherence for monkeys N and B; CI: 51.3–67.3 and 31.5–50.0, both p < 0.0001) (Eq. 3A, H0: β4 = 0). A similar pattern was apparent in the declining responses that accompanied T2 choices. For T2 choices the responses were less ramp-like but tended to drift toward lower rates in a manner that also depended on the strength of motion. This inverse relationship between the rate of change in neural response and motion strength seen in Figure 7A was also reliable (p < 0.0001) (Eq. 3A, H0: β4 = 0).
Importantly, the ramp-like modulation in discharge accompanied motion viewing and was not an immediate antecedent to the saccadic eye movement. The response averages shown in the left half of Figure 7A are drawn to the median reaction time and do not include any activity in the 100 msec preceding saccade initiation. These curves therefore exclude any enhancement (or attenuation) that occurred just before the eye movement. They reveal clear differences in the neural processing that accompanied the formation of difficult versus easy decisions. It will prove useful for comparison to tabulate the mean response in the 40 msec epoch denoted by arrows a and b in Figure 7A. The mean responses (±SEM) for each coherence level for both directions are shown in Figure 7B. These means, drawn from at least half of the trials at all motion strengths, demonstrate a systematic dependency on motion strength: an increase in activity of 38.4 spikes per second per 100% coherence when motion was toward T1 (a, CI: 23.9–52.8 spikes per second; p < 0.001) (Eq. 2A, H0: β2 = 0) and a change of −29.9 spikes per second per 100% coherence when motion was toward T2 (b, CI: −50.2 to −9.6 spikes per second; p < 0.01) (Eq. 2A, H0: β2 = 0).
Because many LIP neurons modulate their activity in relation to eye movements (Barash et al., 1991a; Colby and Goldberg, 1999), it is natural to ask whether the effect of motion strength on neural activity could be explained by differences in the monkeys' eye movement responses. Except for RT, however, the saccade parameters that we measured did not vary substantially as a function of motion strength. Saccadic duration and accuracy did not show any dependency on motion strength (p > 0.05) (Eq. 4A, H0: β2 = 0). The higher coherence stimuli were associated with saccades that were slightly slower and shorter, but these effects were very small (∼3%); they did not constitute violations of the main sequence (Fuchs, 1967). Of course, RT was 53.4% faster across the range of motion strengths (p < 0.0001) (Eq. 4A, H0: β2 = 0). To test whether such variation could explain the effect of motion strength on LIP activity, we incorporated the saccade parameters into the regression analysis used to model the responses at arrows a and b in Figure 7 as a function of motion strength. We found that even when the saccade parameters were incorporated, the fit was improved significantly by including the coherence level of the trial (p < 0.001) (Eq. 4B, H0: β6 = 0). We are therefore confident that LIP neurons reflect the strength and direction of the motion in a way that is not explained by differences in saccade metrics.
In fact, by the time the monkey was committed to a particular eye movement response, differences in neural activity attributable to motion strength were substantially attenuated or absent. This is shown by the solid curves on the right half of Figure 7A, in which the activity was aligned to the monkeys' eye movement responses. On the trials in which the monkeys chose the T1 target (in the RF), the response achieved a common value of 68.6 ± 0.9 spikes per second in the epoch 30 to 70 msec before the saccades, indicated by arrow c. The mean responses for each coherence level during this epoch are shown in Figure 7C. In this interval (c), the neural response was not affected by motion strength (1.1 spikes per second per 100% coherence; CI: −3.2 to 5.4; p = 0.46) (Eq. 2A, H0: β2 = 0).
The responses preceding T2 choices (outside the RF) also exhibited a precipitous change before saccade initiation, but unlike the T1 choices, the responses remained associated with motion strength. For example, in the 40 msec epoch indicated by d in Figure 7, A and C, there was a change of −17.5 spikes per second per 100% coherence (CI: −36.7 to −1.6; p < 0.05) (Eq. 2A, H0: β2 = 0). The inverse relationship between spike rate and motion strength was evident throughout the time course. The decline in spike rate is consistent with mounting evidence against a T1 choice, but in contrast to the conditions favoring a T1 choice, there is no common spike rate value that precedes the onset of the saccade.
The gradual change in LIP activity leading up to the monkey's choice is better appreciated by examining trials sharing the same duration. In Figure 8A, we have grouped the trials by the monkey's RT, using T1 choices from all motion strengths. Each curve shows the average spike rate from sets of trials that end within 25 msec of each other, for example 400–425 msec, plotted as a function of time from the saccade. Unlike Figure 7A, the averages shown do not exclude any spikes between stimulus onset and saccade. Trials ending in short RT follow a steeper trajectory than those ending in long RT. For example, in the epoch from 200 msec after motion onset to 100 msec before saccade, the firing rate increased 64.1 spikes per second squared for the 500 msec RT group and 25.7 spikes per second squared for the 900 msec RT group (CI: 34.4–93.9 and 14.1–37.2; both p < 0.01) (Eq. 3A, H0: β4 = 0). At the time just before the saccade (same as c in Figure 7), there is no dependence of the response on RT group (change of 5.5 spikes per second across RT groups; CI: −26.7 to 37.8; p = 0.62) (Eq. 2B, H0: β2 = 0). Just 40 msec earlier, the same epoch revealed a dependency on reaction time. As shown in Figure 8A, shorter RT was associated with lower spike rates, consistent with the steeper rise of response (38.8 spikes per second per second RT; CI: 2.1 to 75.0; p < 0.05). The stereotyped activity in the last 50 msec of the trial could mark a completion of decision formation and a commitment to an eye movement.
Importantly, grouping the data by RT allows us to appreciate a gradual change in spike rate for hundreds of milliseconds before the monkeys initiated their eye movement responses. Figure 8B shows the responses in the epoch from 200 to 500 msec after motion onset (that is, subsequent to the transient dip) at the very beginning of the coherence-dependent portion of the response. The responses are shown for the three longest RT groups (700–799, 800–899, 900–999 msec). Points represent the mean spike rate from 40 msec epochs and were fit by lines using weighted regression. The slopes of these regressions and their confidence intervals are shown in Figure 8C (p < 0.05 for all six fits) (Eq. 3B, H0: β2 = 0). The analysis documents a time-dependent change in the spike rate accompanying motion viewing during an epoch that ends 200–500 msec before initiation of the eye movement response. This observation helps to exclude the possibility that the gradual change in activity seen in previous figures could have resulted from averaging across trials containing more abrupt changes in activity as the monkey chose the T1 or T2 choice target. Instead, we can be confident that the spike rate changes in a gradual fashion many hundreds of milliseconds before the monkey is committed to its choice.
The buildup and decline in spike rate also occurred in the FD version of the task (Fig. 9A), where a decision does not immediately lead to an eye movement. Early in the motion-viewing period (left panel), the response modulated in a ramp-like fashion. The mean responses at the time point marked by the arrow a (corresponding to arrow a in Fig. 7A) are shown in Figure 9B (a). The response associated with T1 choices increased 25.8 spikes per second per 100% coherence (CI: 20.1 to 31.6 spikes per second; p < 0.0001) (Eq. 2A, H0: β2 = 0). Likewise, in the same period, the response level associated with T2 choices decreased 10.2 spikes per second per 100% coherence (b in Fig. 9A,B) (CI: −19.8 to −0.6; p < 0.05) (Eq. 2A, H0: β2 = 0). The level of response attained during the motion-viewing period reflected the direction and strength of the motion stimulus, consistent with previous reports (Shadlen and Newsome, 1996, 2001; Horwitz and Newsome, 1999, 2001; Kim and Shadlen, 1999).
The effect of motion strength is apparent early in the delay period, but by the end of the delay, the activity reflected only the monkey's choice of saccade target. Responses preceding saccades to T1 bear striking similarity to the enhancement seen in the RT task. Like the RT data, there was a stereotyped increase in activity that did not vary systematically with motion strength in the epoch 30–70 msec before the saccade (c in Fig. 9A, C) (change of 1.9 spikes per second per 100% coherence; CI: −4.1 to 8.0; p = 0.37) (Eq. 2A, H0: β2 = 0). Unlike the RT data, neural activity was not affected by motion strength during the epoch 30–70 msec preceding the monkey's correct saccades to T2 (d in Fig. 9A, C) (−9.4 spikes per second per 100% coherence; CI: −28.1 to 9.3; p = 0.18) (Eq. 2A, H0: β2 = 0). For both FD and RT tasks, the activity depends on motion strength while the monkey gathers evidence, and the activity becomes stereotyped once the monkey is committed to a particular eye movement response. In the RT task, this occurs ∼50 msec before saccade initiation. In the FD task, this occurs earlier and is presumably variable.
These qualitative similarities belie important quantitative differences in the responses on the FD and RT tasks. Figure 10 compares the activity from 38 neurons from which we were able to obtain data using both FD and RT tasks. In the epoch before onset of random-dot motion (Fig. 10A), activity was on average 4.4 spikes per second higher in the RT task (CI: 3.2 to 5.5; p < 0.0001) (Eq. 2C, H0: β3 = 0). In Figure 10B, the average responses during motion viewing for T1 and T2 are shown for one coherence level (12.8%). Across all motion strengths, the responses were 13.8 spikes per second higher in the RT task for T1 choices (CI: 11.0 to 16.6; p < 0.0001) (Eq. 2C, H0: β3 = 0). For T2 choices, the activity was 6.9 spikes per second higher in the RT task (CI: 4.2 to 9.6; p < 0.0001), which is not substantially larger than the difference before fixation. In the epoch 30–70 msec preceding saccade initiation, responses were 9.6 spikes per second higher in the RT task when the monkey chose T1 (Fig. 10C) (CI: 6.8 to 12.4; p < 0.00001) (Eq. 2C, H0: β3 = 0). In contrast, for saccades to T2, the responses were not significantly different in the RT task, despite the higher firing rate during fixation (CI: −8.1 to 0.6; p = 0.12) (Eq. 2C, H0: β3 = 0). These observations suggest that both the offset and gain of the LIP response are greater in the RT version of the task.
The observations in Figures 7-9 are consistent with the idea that LIP reflects the mounting evidence for or against a T1 choice. The evidence is represented in the ramp-like activity during the period of decision-making—before the monkey is committed to a choice. This period ends ∼50 msec before the saccade in the RT task and some time during motion viewing (or just after) in the FD task. The pattern of activity in the RT task in particular suggests that the neurons are accumulating some quantity—determined by the strength and direction of motion—toward a threshold, at which point the monkey is committed to one action or another. Two additional observations support this hypothesis. The first derives from an analysis of the trials in which the monkeys chose the wrong direction of motion. The second involves a closer look at the relationship between the neural response and the monkeys' reaction times.
Errors
Near psychophysical threshold, monkeys often judged the direction of motion incorrectly. Figure 11 compares the averaged responses on correct and error trials at two of the weaker motion strengths. Regardless of the direction of random-dot motion, LIP activity increased when the monkeys chose the target in the RF and decreased when the target outside of the RF was selected. However, the activity followed different trajectories on error trials. When monkeys selected T1, the activity that occurred during motion viewing increased less when the direction of motion was away from the RF (dashed gray curves). The response was 5.0 spikes per second higher on correct T1 choices than on errors during stimulus viewing (CI: 1.9 to 8.1; p < 0.05) (Eq. 2D, H0: β3 = 0; response compared in 40 msec epoch ending at median RT for 51.2% coherence trials). When monkeys chose T2, the spike rate was 5.1 spikes per second lower on correct choices than on errors (CI: −6.1 to −4.1; p < 0.01) (Eq. 2D, H0: β3 = 0).
The more gradual evolution of the response on error trials was accompanied by prolonged reaction times. Table 2 presents mean RT for each motion strength on error and correct choices. The best evidence for longer RT comes from the weaker motion strengths where errors are more common. This can also be appreciated from the length of the curves in Figure 11, which are drawn to the median RT for each condition. The association of longer RT with the muted ramps of activity seen on error trials lends additional support to the idea that the buildup of activity represents an accumulation of evidence toward a threshold. We will explore this point further in Discussion.
Relationship between neural response and RT
If LIP represents the accumulation of evidence favoring one or the other alternative in the discrimination task, then the amount of time required to reach a decision should be related to the rate of growth and decline in LIP activity. This interpretation would explain the tendency for stronger motion to produce faster RT. Alternatively, both the timing of the saccade and the rate at which the LIP response builds up could be related to motion strength but not to each other, raising the possibility that the relationship between RT and LIP activity is merely coincidental. For example, separate processes, related to coherence level, could operate to determine when to initiate the saccade and where to direct it. If LIP activity represents where, but not when, to move the eyes, then there should not be a relationship between LIP response and RT for trials of the same motion strength. We therefore performed two analyses to determine whether the rate of rise in neural response preceding correct T1 choices was related to the monkey's RT.
First, at each coherence level, we sorted trials from each neuron into short or long RT groups based on the median [as in Hanes and Schall (1996)]. Examples from the 6.4% coherent motion strength are shown in Figure 12. These response functions represent activity during motion viewing up to the median RT for the group, again excluding any spike activity within 100 msec of saccade initiation. To estimate the rate of change in response as a function of time, we fit lines to the average responses for the two RT groups (Eq. 3C). For the responses at 6.4% coherent motion, the neural activity grew at a rate of 53.4 spikes per second squared for trials in the short RT group and 33.9 spikes per second squared for trials in the long RT group (CI: 47.5–59.2 and 30.1–37.6, respectively). The rate of rise of the response for the two groups at each coherence level is shown in Figure 13. The steeper rise in spike rate associated with shorter RT was statistically significant for all motion strengths except 25.6% (p < 0.05) (Eq. 3C, H0: β4 = 0). A comparison of the neural response in the same epoch indicated by arrow a in Figure 7 shows that, across motion strengths, the firing rate in LIP was 4.3 spikes per second higher for trials in the short group (CI: 3.1 to 5.5; p < 0.01) (Eq. 2E, H0: β3 = 0). In contrast, just before saccade initiation (same epoch as c in Fig. 7), there was no difference in firing rate between the two groups (difference = 0.5 spikes per second; CI: −1.0 to 2.0; p = 0.71) (Eq. 2E, H0: β3 = 0).
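The median-split analysis can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Eq. 3C: it uses 40 msec bins starting 200 msec after motion onset, an unweighted line fit, and toy spike trains whose rate ramps to a fixed level at the RT. The function names, the bin width, and the toy generator `synthetic_train` are hypothetical.

```python
import random
import statistics

def ls_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx

def group_ramp_slope(trains, rts, t_start=0.2, bin_w=0.04, presac=0.1):
    """Slope (spikes/s^2) of the trial-averaged rate for one RT group.

    Bins run from t_start to the group's median RT minus presac; a trial
    contributes to a bin only if the bin ends at least presac before its
    saccade. All times are in seconds from motion onset.
    """
    t_end = statistics.median(rts) - presac
    n_bins = int((t_end - t_start) / bin_w + 1e-9)
    centers, rates = [], []
    for b in range(n_bins):
        lo = t_start + b * bin_w
        hi = lo + bin_w
        live = [s for s, rt in zip(trains, rts) if rt - presac >= hi - 1e-9]
        if not live:
            break
        count = sum(1 for s in live for t in s if lo <= t < hi)
        centers.append(lo + bin_w / 2)
        rates.append(count / (len(live) * bin_w))
    return ls_slope(centers, rates)

def synthetic_train(rt, rng, r0=30.0, r_thr=65.0, t0=0.2, dt=0.001):
    """Toy spike train (NOT recorded data): the rate sits at r0 until t0,
    then ramps linearly so that it reaches r_thr at the RT."""
    k = (r_thr - r0) / (rt - t0)
    spikes, acc, t = [], rng.random(), 0.0
    while t < rt:
        acc += (r0 if t < t0 else r0 + k * (t - t0)) * dt
        if acc >= 1.0:  # emit a spike each time one expected spike accrues
            spikes.append(t)
            acc -= 1.0
        t += dt
    return spikes

rng = random.Random(0)
rts = [0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95] * 50
trains = [synthetic_train(rt, rng) for rt in rts]

# Split at the median RT, then fit a line to each group's average response.
median_rt = statistics.median(rts)
short = [i for i, rt in enumerate(rts) if rt < median_rt]
long_ = [i for i, rt in enumerate(rts) if rt >= median_rt]

slope_short = group_ramp_slope([trains[i] for i in short], [rts[i] for i in short])
slope_long = group_ramp_slope([trains[i] for i in long_], [rts[i] for i in long_])
print(f"short-RT slope {slope_short:.1f}, long-RT slope {slope_long:.1f} spikes/s^2")
```

The toy trains build in a common end level by construction, so the steeper slope for the short-RT group is a property of the simulation and only illustrates the logic of the comparison.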
One possible concern with this analysis stems from the fact that random-dot stimuli of nominally identical motion strength are in fact different from trial to trial because the position and timing of random dots are determined by different sequences of random numbers. We therefore performed further experiments using identical patterns of random dots on half of the trials. For three neurons studied in this fashion, we observed a 10.6 spikes per second difference in spike rate favoring short over long RT trials (epoch a in Fig. 7; CI: 4.7–16.4 spikes per second; p < 0.001) (Eq. 2F, H0: β3 = 0). Removing the trial-to-trial variation in motion strength caused a small reduction of the trial-to-trial variability in RT (7% change in coefficient of variation, σ/μ), but the remaining variability in RT retained its correlation with the activity in LIP.
Up to now, we have considered only averages of responses and averages of RT. Yet, it is clear from Figures 4 and 12 (inset) that RT can be quite variable from trial to trial, even when the same stimulus is shown to the monkey. We wondered whether this variability might be reflected in the neural activity, which is also highly variable from trial to trial (Softky and Koch, 1993; Shadlen and Newsome, 1994, 1998). We therefore devised a method to estimate the rate of change in the spike rate on individual trials. The irregularity of spike trains in cortex generally precludes an estimate of instantaneous spike rate from single trials (spike rate is usually inferred from averages of many trials), but this impediment can be overcome with previous knowledge of the shape of the spike rate function.
On the basis of the preceding analyses (Fig. 8), we assumed that for T1 choices, the spike rate, λ(t), approximates a linear ramp during the interval from 200 msec after motion onset to 100 msec before saccade initiation. We used a maximum likelihood procedure to estimate the slopes of the ramps and associated standard error (see Materials and Methods, Eq. 5). An example of such a fit is shown in Figure 14A for a single trial. The quality of the fit should not be viewed as evidence that the spike rate changes linearly as a function of time. In principle, various functions could provide a better depiction of the spike rate on any individual trial. However, the slope of the fitted ramp does estimate the overall magnitude of change in spike rate per unit of time.
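A single-trial ramp fit of this kind can be sketched under explicit assumptions. The sketch treats the spike train as an inhomogeneous Poisson process with rate λ(t) = a + k(t − c), where c is the midpoint of the analysis window, and maximizes the log-likelihood Σ log λ(t_i) − ∫λ(t)dt. This parameterization, the function name `fit_ramp_ml`, and the bisection search are assumptions made for illustration; they are not taken from Eq. 5.

```python
import math

def fit_ramp_ml(spike_times, t_start, t_end, iters=60):
    """Maximum-likelihood ramp for one trial under a Poisson assumption.

    Model: rate(t) = a + k * (t - c), with c the window midpoint.
    Returns (a, k, se_k). se_k is an approximate standard error from the
    curvature of the log-likelihood in k, with a held fixed.
    """
    T = t_end - t_start
    c = 0.5 * (t_start + t_end)
    x = [t - c for t in spike_times if t_start <= t <= t_end]
    n = len(x)
    if n < 2:
        return n / T, 0.0, math.inf
    a = n / T  # at an interior optimum the ML mean rate is n / T

    def score(k):
        # Derivative of the log-likelihood with respect to k.
        return sum(xi / (a + k * xi) for xi in x)

    k_max = 0.999 * 2.0 * a / T  # keeps the rate positive over the window
    lo, hi = -k_max, k_max
    if score(lo) <= 0.0:
        k = lo
    elif score(hi) >= 0.0:
        k = hi
    else:
        for _ in range(iters):  # score is decreasing in k, so bisect
            mid = 0.5 * (lo + hi)
            if score(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        k = 0.5 * (lo + hi)
    info = sum((xi / (a + k * xi)) ** 2 for xi in x)
    se_k = 1.0 / math.sqrt(info) if info > 0.0 else math.inf
    return a, k, se_k

# Toy spike train (NOT recorded data): 20 spikes placed at the expected
# quantiles of a ramp with a = 40 spikes/s and k = 60 spikes/s^2 on a
# 0.5 s window.
spikes = [(-25.0 + math.sqrt(625.0 + 120.0 * (i - 0.5))) / 60.0 for i in range(1, 21)]
a_hat, k_hat, se_k = fit_ramp_ml(spikes, 0.0, 0.5)
print(f"a = {a_hat:.1f} spikes/s, k = {k_hat:.1f} +/- {se_k:.1f} spikes/s^2")
```

With only a few tens of spikes per trial, the standard error of a single-trial slope is large relative to the slope itself, which is why such estimates are only informative when pooled over many trials.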
These estimates of ramp-slope allowed us to detect a weak relationship between RT and LIP response on single trials. For each trial, we obtained a set of paired observations: RT and ramp-slope (k ± SE). An example of the trials from one neuron at one coherence level (6.4%) is shown in Figure 14B. Each point represents the estimated ramp-slope (± SE) from a trial, plotted as a function of RT. For this example, there was a change in the slope of firing rate of −48.3 spikes per second squared per second of RT (CI: −75.3 to −21.3; p < 0.001) (Eq. 6, setting β2 = 0, H0: β3 = 0). For each neuron, we obtained a single estimate of the relationship between slope change and RT using all motion strengths by incorporating motion coherence in Equation 6 (β2 ≠ 0). In effect, this is the average change in ramp-slope per second change in RT with respect to the mean RT for each coherence level. The values for each neuron are shown in Figure 14C. Across trials from all neurons, the change of ramp slope per second of RT was −58.5 spikes per second cubed (CI: −79.9 to −37.1; p < 0.0001) (Eq. 6, H0: β3 = 0).
The inverse relationship between change in spike rate and RT supports the idea that LIP neurons accumulate signals from visual cortex to a threshold value, which marks completion of the decision process. Variability in the direction signals (Britten et al., 1992, 1993, 1996; Shadlen and Newsome, 1998) and possibly in the accumulation process itself (Carpenter and Williams, 1995; Ratcliff and Rouder, 1998; Ditterich et al., 2001) leads to variability in the amount of time it takes to reach a decision.
DISCUSSION
Neurons in area LIP are known to respond to salient visual stimuli, especially when they are the target of a planned eye movement (Gnadt and Andersen, 1988; Colby et al., 1996). We exploited this property of LIP to study the neural basis of a perceptual decision. We targeted neurons in area LIP that discharge during a delay period after an instruction to make an eye movement into a particular region of the visual field (Barash et al., 1991a, 1991b; Mazzoni et al., 1996). Shadlen and Newsome (1996, 2001) showed that when the instruction for the eye movement is random-dot motion, the activity of these neurons predicts the monkey's decision about direction.
When the monkey is given a fixed duration to view the random-dot motion, activity in LIP modulates during motion viewing and persists through a memory period (Fig. 9) (Shadlen and Newsome, 1996, 2001). Because the time course and level of activity depend on the strength of random-dot motion, it is likely that LIP neurons not only represent the planned eye movement response but also the visual information on which the developing decision is based—in other words, a decision variable. This interpretation rests on an assumption that the modulation occurs during the epoch of decision formation, before the monkey is committed to a choice.
The incorporation of a reaction time measurement in the discrimination task further clarifies two aspects of the LIP response. First, it allows the monkey to demarcate the period of decision formation on each trial. We can now be certain that the modulation of LIP activity that accompanies motion viewing is not a consequence of the decision, fait accompli. The gradual increase or decrease in spike rate occurs before the monkey is committed to a particular action, while the monkey is evaluating the evidence but has not yet decided the direction of motion. Second, the completion of the decision process seems to be marked by a stereotyped level of excitation among neurons that signal the plan to make an eye movement response to a choice target in their RF. This provides novel insight into a step between decision formation and action that we term commitment.
Certain features of the activity in LIP differentiate it from sensory signals that represent visual motion and motor signals that command eye movements. The persistence of activity through the delay period of the FD version of the discrimination task occurs in the absence of visual stimulation and, in both RT and FD versions of the task, the response is ultimately dominated by the monkey's choice rather than the direction of random-dot motion [e.g., 0% coherent motion and error trials; see also Shadlen and Newsome (1996, 2001)]. This is in stark contrast with direction-selective neurons in area MT, which do not respond during a delay period on this task (Seidemann et al., 1998) and the responses of which are determined largely by the direction and strength of motion (Britten et al., 1993, 1996). The gradual buildup of activity in the RT paradigm allows us to differentiate the response from motor planning. As seen in Figure 7, the discharge is affected by the strength and direction of motion beginning ∼200 msec after onset of random-dot motion, and this influence persists until ∼50 msec before initiation of the eye movement response.
The pattern of activity observed in the RT experiments lends insight into the computations that underlie the formation of a perceptual decision. After onset of random-dot motion, there is a stereotyped change in activity that appears as a dip (and recovery), which is followed by a stimulus-dependent rise or fall in spike rate resembling a linear drift or ramp, on average. The dip seems to mark the beginning of this accumulation process. When the activity reaches a threshold value, the decision process is complete and the monkey initiates a saccade ∼50 msec later. The process suggests an accumulation of information toward a threshold, consistent with models of reaction time proposed by other investigators (Luce, 1986; Carpenter and Williams, 1995; Hanes and Schall, 1996; Ratcliff and Rouder, 1998; Reddi and Carpenter, 2000).
This idea lends itself to a physiological interpretation. Neurons in extrastriate visual cortex (areas MT and MST) furnish the “information” about the direction of motion that is critical to performance on this task (Salzman et al., 1990; Britten et al., 1992, 1996; Celebrini and Newsome, 1995). During motion viewing, they emit a constant level of spike discharge (after an initial transient) at a rate that depends on the strength and direction of motion (Britten et al., 1993). The ramp-like change in spike rate in LIP could reflect the integral of a difference in direction signals representing motion toward T1 and T2 (Mazurek et al., 2000). From moment to moment, this difference in activity between neurons selective for the two directions of motion provides evidence for (or against) a T1 choice.
Gold and Shadlen (2001) showed that the difference in signals from opposing motion sensors is proportional to the log of the likelihood ratio favoring one direction over the alternative. Therefore, a useful decision variable is the accumulated difference between opposing motion sensors. On average, the difference will favor the true direction of motion (for nonzero motion coherence), but on any one trial, variability in the neural responses can lead to an accumulation of evidence favoring the wrong direction. For example, error trials resulting in a rightward choice would be based on a net accrual of motion information favoring rightward, despite the fact that motion was actually leftward. The evidence leading to these errors appears to be weaker, on average, than the evidence for a rightward choice based on actual rightward motion (Kim and Shadlen, 1999, their Fig. 8). In short, the discharge on any one trial resembles a random walk to a threshold representing the level of evidence required to reach a decision, at which point the decision process terminates (Luce, 1986; Ratcliff and Rouder, 1998). The stereotyped activity that occurs 50 msec before saccade initiation (Fig. 7) could represent such a threshold level: the amount of evidence required for commitment to one of the alternatives.
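The random walk to a threshold described above can be illustrated with a toy simulation. It is a sketch under stated assumptions and not a model fitted to these data: two opposing "sensors" emit Gaussian samples whose means differ in proportion to coherence, their difference is summed, and the trial ends when the sum reaches a fixed bound. The parameter values (`gain`, `sensor_sd`, `bound`) and the identification of one step with roughly 1 msec are arbitrary choices.

```python
import random

def simulate_trial(coh, rng, gain=0.5, sensor_sd=0.7071, bound=30.0, max_steps=5000):
    """Accumulate the difference between two opposing 'motion sensors'
    until it reaches +bound (T1 choice) or -bound (T2 choice).

    coh is fractional coherence with true motion toward T1. Returns
    (chose_t1, n_steps); one step stands for roughly 1 ms (an assumption).
    """
    total = 0.0
    for step in range(1, max_steps + 1):
        toward = rng.gauss(+0.5 * gain * coh, sensor_sd)
        away = rng.gauss(-0.5 * gain * coh, sensor_sd)
        total += toward - away  # momentary evidence for T1 over T2
        if abs(total) >= bound:
            return total > 0.0, step
    # No bound crossing within max_steps: report the sign of the total.
    return total > 0.0, max_steps

def summarize(coh, n_trials=200, seed=1):
    """Return (proportion of T1 choices, mean decision time in steps)."""
    rng = random.Random(seed)
    results = [simulate_trial(coh, rng) for _ in range(n_trials)]
    p_t1 = sum(1 for chose_t1, _ in results if chose_t1) / n_trials
    mean_steps = sum(steps for _, steps in results) / n_trials
    return p_t1, mean_steps

for coh in [0.0, 0.064, 0.128, 0.256, 0.512]:
    p_t1, mean_steps = summarize(coh)
    print(f"coherence {coh:5.3f}: P(T1) = {p_t1:.2f}, mean time = {mean_steps:.0f} steps")
```

In this toy model, a single bound produces both effects discussed in the text: stronger motion shortens the time to reach the bound and reduces the chance that noise carries the sum to the wrong bound.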
The relationship between RT and the rate of accumulation to threshold suggests that LIP is read out downstream to determine the moment of commitment to a particular eye-movement response. We found a weak association between RT and single-unit activity on a trial-by-trial basis, suggesting that the signal that is compared to the threshold is likely to be represented by an ensemble of neurons in LIP. The relationship between single neuron activity in LIP and RT might therefore be likened to a similar weak relationship between the activity of single neurons in MT and the monkey's choice, so-called choice probability (Celebrini and Newsome, 1994; Britten et al., 1996; Shadlen et al., 1996; Thiele and Hoffmann, 1996; Croner and Albright, 1999).
On average, the level of activity preceding eye movements to the RF of the neuron (T1 choices) ceases to depend on motion strength ∼50 msec before initiation of the saccade, suggesting that a threshold for commitment was set at an ensemble average of ∼68 spikes per second. The variability inherent in this estimate is consistent with the observation that neurons in other brain structures exhibit a more consistent relationship with saccade initiation (Hanes and Schall, 1996). In light of decision-related activity in these and other brain structures (Salinas and Romo, 1998; Horwitz and Newsome, 1999; Kim and Shadlen, 1999), it is likely that the transition from accumulation of evidence to commitment involves interactions between several brain areas. We are unable to discern whether the apparent threshold is computed de novo in LIP or is imposed from other structures. Nonetheless, it is rewarding that such a cognitive step should possess a neural correlate. It has been suggested that when shorter RT is promoted by changing the urgency of a behavioral response, this threshold value is lowered (Reddi and Carpenter, 2000). This hypothesis can now be tested at the physiological level.
The main alternative to the accumulation-to-threshold model is that a coherence-dependent change in activity in LIP during decision formation is only coincidentally related to RT because the monkey tends to respond earlier when motion is stronger. According to this idea, the stereotyped activity preceding the eye movement would be explained by a burst preceding a saccade, regardless of antecedent. This alternative explanation would predict no relationship between RT and the ramp of activity before this burst, but we find precisely this relationship between ramp slope and RT even excluding all spike discharge in the last 100 msec before saccade initiation (Fig. 8). Thus the rate of rise in the evidence favoring a T1 choice appears to predict whether the monkey will complete the decision process sooner or later.
The inference of an accumulation of sensory “evidence” toward a threshold would suggest that LIP neurons compute time integrals on appropriate sensory inputs. This is an appealing idea because it could explain many features of the LIP response. For example, it would provide a qualitative explanation for the persistence of activity in the delay period seen in the fixed duration version of the task (Shadlen and Newsome, 2001). In general, it would cast the “memory” response in simpler memory-guided eye movements (Funahashi et al., 1991; Bracewell et al., 1996) as the integral of a discrete impulse (e.g., a transient representation of a suprathreshold target). According to this idea, the early dip and recovery in activity after onset of motion could represent a reset of a neural integrator (Seung et al., 2000). Although such interpretation is only speculative, our results indicate that temporal integration of sensory evidence underlies both the speed and accuracy of a perceptual decision and that this integral is represented by the spike discharge of neurons in LIP. It is likely that similar “accumulations” will be evident in other brain structures that participate in eye movement planning, allocation of attention, and working memory (Kim and Shadlen, 1999; Horwitz and Newsome, 2001). It remains to be determined how this computation is achieved and whether it applies more generally to other tasks and to other types of decisions.
Footnotes
This research was supported by Grants EY07031, EY11378, and RR00166 and the McKnight Foundation. M.N.S. is an Investigator of the Howard Hughes Medical Institute. We thank Melissa Mihali for expert technical assistance. We are also grateful to John Palmer, Josh Gold, Jochen Ditterich, and Mark Mazurek for helpful suggestions on this manuscript.
Correspondence should be addressed to Dr. Michael N. Shadlen, Department of Physiology, University of Washington Medical School, Box 357290, Seattle, WA 98195-7290. E-mail: shadlen@u.washington.edu.