Abstract
Recent work suggests that the middle temporal (MT) area contributes to depth perception in addition to its well established roles in motion perception. To determine whether single MT neurons carry disparity signals with sufficient fidelity to account for depth perception, we have compared neuronal and psychophysical sensitivity to disparity while monkeys discriminated between two coarse disparities (near vs far) in the presence of noise. The strength of the visual stimulus was titrated around psychophysical threshold by varying the percentage of binocularly correlated dots in a random dot stereogram. We find that the average MT neuron has sensitivity equal to that of the monkey, as was reported previously for direction discrimination in MT. We further address some important factors that could bias the neuronal/psychophysical sensitivity comparison, including the possibility that monkeys reach a decision before the end of the stimulus presentation. Unlike the predictions of a simple model that uses Poisson spiking statistics, the sensitivity of many MT neurons has little dependence on the time interval over which spikes are counted to compute a neuronal threshold. Thus the response properties of many MT neurons appear to be adapted for rapid discrimination of depth, and we describe how temporal variations in both signal and noise contribute to this effect. We therefore predicted that psychophysical thresholds should exhibit little dependence on viewing duration in our task, and this was confirmed by additional behavioral experiments. Overall, our findings show that MT is well suited to provide sensory signals that form the basis for perceptual judgments of depth.
Introduction
Horizontal binocular disparities are used by the visual system to reconstruct three-dimensional (3-D) scene structure from two-dimensional retinal images. Many areas in primate visual cortex contain disparity-selective neurons, including V1, V2, VP, V3/V3A, V4, MT, MST, CIP, and IT (Hubel and Wiesel, 1970; Poggio and Fischer, 1977; Maunsell and Van Essen, 1983; Burkhalter and Van Essen, 1986; Felleman and Van Essen, 1987; Poggio et al., 1988; Roy et al., 1992; Eifuku and Wurtz, 1999; Taira et al., 2000; Uka et al., 2000; Hinkle and Connor, 2001; Prince et al., 2002b; Watanabe et al., 2002) (for review, see Cumming and DeAngelis, 2001). Although the existence of disparity-selective neurons is well documented, the respective roles of these different cortical areas in binocular vision remain unclear. Moreover, the presence of disparity-selective neurons does not prove that an area contributes to depth perception. For instance, some of these areas might be engaged in the control of vergence posture (Masson et al., 1997; Takemura et al., 2001), and others might use disparity signals for scene segmentation (von der Heydt et al., 2000). These different possible functions cannot be untangled simply by measuring disparity tuning curves.
Several techniques have been used to establish firm links between neuronal activity in the middle temporal (MT) area and perception of visual motion (for review, see Parker and Newsome, 1998). These techniques include comparison of neuronal and behavioral sensitivity (Britten et al., 1992), analysis of correlations between neuronal responses and behavioral choices (Britten et al., 1996), electrical microstimulation (Salzman et al., 1992), and lesions (Newsome and Pare, 1988). Recent studies have started to provide similar links between neuronal activity and depth perception. DeAngelis and colleagues (1998) showed that microstimulation in MT can bias monkeys' judgments of depth, and other recent studies have shown that responses of MT neurons are correlated with monkeys' judgments of 3-D structure from motion (Bradley et al., 1998; Dodd et al., 2001).
If MT plays an important role in depth perception, then MT neurons should be sufficiently sensitive to account for the ability of monkeys to discriminate depth. We tested this hypothesis by recording from single MT neurons while monkeys performed a depth discrimination task identical to the one used by DeAngelis et al. (1998). Neuronal thresholds were computed by using receiver operating characteristic (ROC) analysis and were compared with the monkeys' psychophysical thresholds. The average neuronal and psychophysical thresholds matched almost exactly, indicating that MT neurons could account for the monkeys' performance in this task.
Comparison of neuronal and behavioral sensitivity is fraught with assumptions and practical difficulties. A second major goal of our study was to determine how some of these factors could affect our estimates of neuronal/psychophysical threshold ratios. Specifically, we examined the effects of trial-to-trial stimulus variation, variability in psychophysical performance, and the monkeys' decision (integration) time. Our analyses place firm bounds on how much each of these factors affects the overall results, providing new insights into how the brain uses information coded by single neurons to form a perceptual decision. Preliminary results have been reported previously (Uka and DeAngelis, 2001).
Materials and Methods
Subjects and surgery
Physiological experiments were performed with two male rhesus monkeys (Macaca mulatta) weighing 5–6 kg. The animals were prepared for daily training and recording sessions by using standard surgical procedures described in detail previously (Britten et al., 1992; DeAngelis and Newsome, 1999). After training the monkey to sit calmly in a primate chair, we attached a CILUX head post receptacle (Crist Instrument, Hagerstown, MD) to the monkey's skull for head restraint, and we implanted a coil of wire under the conjunctiva of one eye for monitoring eye position (Judge et al., 1980). To reduce coil slippage in the eye, we sutured the coil to the sclera by using either a permanent or long-lasting dissolvable suture (7–0 Dexon or 8–0 nylon). All surgical procedures were done under gas anesthesia (isoflurane, 1–2%) with sterile techniques. The monkeys were treated with antibiotics (cefazolin, 25 mg/kg, i.m.) and an analgesic (Buprenex, 0.02 mg/kg, i.m.) after surgery. They were allowed to recover for at least 4 weeks before the first behavioral training session.
After the monkey had 3–6 months of training on the discrimination task, a beveled CILUX recording chamber (Crist Instruments, Hagerstown, MD) was attached to the monkey's skull at an angle of 25° above the horizontal, and was located over the occipital cortex approximately 17 mm lateral and 14 mm dorsal to the occipital ridge. A second eye coil was implanted into the other eye at this time to allow measurements of vergence posture. After 1–2 weeks of recovery time, the animal underwent an additional training period in which vergence angle was monitored and enforced to be accurate to within ±0.25°; subsequently, we started electrophysiological recordings in MT. All animal care and experimental procedures were approved by the Institutional Animal Care and Use Committee at Washington University and were in accordance with NIH guidelines.
Visual stimuli
The monkeys sat in a primate chair and faced a flat-screen 22 inch color monitor (Sony GDM-F500) placed at a viewing distance of 57 cm. The display subtended a visual angle of 40 × 30°, had a resolution of 1152 × 864 pixels, and was refreshed at 100 Hz. Visual stimuli were generated by a Dual CPU workstation running Windows 2000. Random dot stimuli were programmed in Microsoft Visual C++ by using the OpenGL libraries and were displayed by an OpenGL accelerator board with quad-buffered stereo support (Oxygen GVX1, Creative Labs, Milpitas, CA). Each random dot stereogram (RDS) was presented within a circular stimulus aperture. Dot density was 64 dots per square degree/sec, with each dot subtending ∼0.1°. The starting position of each dot within the aperture was newly randomized for each trial (VAR condition) except for some trials, specifically noted in the text, in which the dot patterns were identical across trials (NOVAR condition). Precise disparities and smooth motion were achieved by plotting dots with subpixel resolution, using the hardware anti-aliasing capabilities of the OpenGL accelerator board.
Stereoscopic images were displayed by presenting the left and right half-images alternately at a refresh rate of 100 Hz. The monkey viewed the display through a pair of ferroelectric liquid crystal shutters (DisplayTech, Longmont, CO) that were synchronized to the video refresh such that one shutter was closed while the other was open. Ghosting effects were minimized (stereo crosstalk was <3%) by presenting red dots on a black background, because the decay of the red phosphor is much faster than that of the green or blue phosphors. The position of each dot in a moving stimulus was updated every video frame (rather than every pair of frames) to avoid unwanted changes in the binocular disparity of the stimulus with variations in the direction or speed of motion.
All dots within the RDS moved coherently (100% motion coherence) at a velocity tailored to each MT neuron. Thus dots did not disappear until they reached the boundary of the circular aperture, after which point they resumed motion from the opposite side of the aperture. In the discrimination task described below (Fig. 1), the disparity signal was titrated by manipulating the percentage of binocularly correlated dots in the RDS. Correlated (i.e., signal) dots were assigned one of two fixed disparities (crossed vs uncrossed) during each trial, and the remaining (noise) dots were assigned random disparities within the range from −2 to 2° (Fig. 1C). Dots retained their identities (signal or noise) throughout a trial; hence the distribution of noise disparities was fixed within a given trial. For each binocular correlation level the exact distribution of noise disparities varied across trials from one repetition to the next, except where explicitly noted in the text (NOVAR condition).
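The dot-assignment rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the original Visual C++/OpenGL stimulus code: the function name, the rounding of the signal-dot count, and the uniform sampling of noise disparities are assumptions (the text specifies only the range of −2 to 2°).

```python
import random

def assign_dot_disparities(n_dots, correlation_pct, signal_disparity,
                           noise_range=(-2.0, 2.0), rng=None):
    """Return one disparity (deg) per dot for a single trial (hypothetical helper)."""
    rng = rng or random.Random()
    # Number of binocularly correlated (signal) dots; rounding is an assumption.
    n_signal = round(n_dots * correlation_pct / 100.0)
    # Signal dots all share the single fixed disparity chosen for this trial.
    disparities = [signal_disparity] * n_signal
    # Each noise dot draws its disparity once and keeps it for the whole trial.
    # Uniform sampling within the range is an assumption.
    disparities += [rng.uniform(*noise_range) for _ in range(n_dots - n_signal)]
    rng.shuffle(disparities)
    return disparities

# Example: 200 dots at 24% binocular correlation, signal at a near disparity.
dots = assign_dot_disparities(200, 24, -0.8, rng=random.Random(0))
```

Reusing the same seed on every trial would approximate the NOVAR condition, whereas a fresh seed per trial corresponds to the VAR condition.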
For a given neuron the location and size of the circular RDS aperture did not vary with the disparity or binocular correlation of the dots, thus eliminating monocular cues to depth. For one monkey and one human we verified that the task could not be performed at all under monocular viewing conditions. As a result of the fixed RDS aperture there was a fringe of binocularly uncorrelated dots along the edges of the stimulus. The aperture size generally was chosen to be slightly larger than the classical receptive field such that this uncorrelated fringe lay outside the receptive field. Stationary background dots (in fixation trials) or flickering background dots (in discrimination trials) were presented at zero disparity to help anchor the monkey's vergence posture (gray dots in Fig. 1A).
Tasks and behavioral training
Behavioral tasks and data acquisition were controlled by a commercially available software package (Reflective Computing, St. Louis, MO), and on-line data analyses were done with MATLAB (MathWorks, Natick, MA). The positions of both eyes were sampled at 1 kHz and stored at 250 Hz. Monkeys were trained first on a fixation task in which they were required to fixate on a yellow spot (0.15 × 0.15°) within a 1.6 × 1.6° electronic window. Monkeys received a water or juice reward for maintaining fixation throughout a 1.5 sec trial. When the monkey's conjugate eye position left the fixation window prematurely, the trial was aborted immediately, reward was withheld, and a brief time-out period ensued before the next trial.
After fixation training the monkeys subsequently were trained on the depth discrimination task (Fig. 1). An RDS containing signal dots at one of two fixed disparities was presented, and the monkeys were required to report whether the signal dots were near (crossed) or far (uncrossed) by making a saccade to one of two targets (located 5° below and above the fixation point, respectively) that appeared 200 msec after offset of the RDS. The saccade had to be made to one of the two targets within 1 sec after their appearance, and the saccade endpoint had to remain within 2.5° of the target for at least 150 msec to be considered a valid choice. Correct responses were rewarded with a drop of water or juice.
Discrimination training began with 100% binocular correlation trials, and lower correlations were introduced gradually after monkeys reached at least 75% correct. The range of correlation levels then was pushed downward gradually over many weeks of training until the monkeys' performance reached a plateau and would not improve further. In early stages of training the monkeys often exhibited strong choice biases, choosing one target on most of the discrimination trials. To discourage these biases, we used a staircase procedure in which the stimulus probabilities could be altered on the basis of the recent history of the monkey's choices. A block of staircase trials began with the highest binocular correlation value. After a correct choice the binocular correlation was lowered (usually by one-half) with a probability of a, and the disparity of the signal dots changed sign with a probability of b. After an incorrect choice the binocular correlation increased (usually by a factor of two) with a probability of c, and the disparity changed signs with a probability of d. A typical set of training parameters was {a, b, c, d} = {0.33, 0.6, 0.66, 0.1}. Note that, after an error, there was a large probability (1 − d) that the next trial had the same disparity as the previous trial. Thus a neglected choice target often would be presented repeatedly until the monkey made a choice in that direction. We found this strategy to be extremely effective in forcing the monkeys to distribute their choices evenly between the two targets, typically resulting in marked improvements in performance as choice bias diminished.
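One way to read the staircase rule is as a per-trial update of the binocular correlation and the sign of the signal disparity. The Python sketch below uses the typical parameters quoted above; the function name, the independence of the two random draws, and the cap at 100% correlation are assumptions rather than details given in the text.

```python
import random

def staircase_step(correlation, disparity_sign, correct,
                   a=0.33, b=0.6, c=0.66, d=0.1,
                   max_corr=100.0, rng=random):
    """Return (correlation, disparity_sign) for the next trial (hypothetical helper)."""
    if correct:
        # After a correct choice: make the task harder with probability a ...
        if rng.random() < a:
            correlation = correlation / 2.0
        # ... and flip near/far with probability b.
        if rng.random() < b:
            disparity_sign = -disparity_sign
    else:
        # After an error: make the task easier with probability c
        # (capping at max_corr is an assumption) ...
        if rng.random() < c:
            correlation = min(correlation * 2.0, max_corr)
        # ... and flip near/far only rarely (probability d), so the
        # neglected target tends to be repeated.
        if rng.random() < d:
            disparity_sign = -disparity_sign
    return correlation, disparity_sign
```

With d = 0.1, an error is followed by the same disparity on about 90% of trials, which is what pushes the animal toward the neglected target.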
After the monkeys had a few weeks of training with this staircase procedure, the choice biases improved dramatically, and we then transitioned each animal to the “method of constant stimuli,” in which a fixed set of disparities and correlation levels was presented in blocks of randomly interleaved trials. Occasionally, it was necessary to return briefly to the staircase procedure in the days and weeks after this transition. Subsequently, all recording experiments were performed with the method of constant stimuli. Before recording commenced, monkeys were trained extensively by using stimuli with various directions, speeds, disparities, and locations in the visual field. This allowed us to tailor the stimulus to the preferences of each neuron under study.
Electrophysiological recordings
We recorded extracellular activity of single neurons from two monkeys. A tungsten microelectrode (Frederick Haer, Bowdoinham, ME; tip diameter 7–15 μm, impedance 0.2–1 MΩ at 1 kHz) was advanced into cortex through a transdural guide tube, using a micromanipulator (MO-951C, Narishige, East Meadow, NY) mounted on the recording chamber. Single neurons were isolated by using a conventional amplifier, bandpass filter (500–5000 Hz), and window discriminator (Bak Electronics, Mount Airy, MD). Times of occurrence of action potentials and trial events were stored to disk with 1 msec resolution.
Area MT was recognized on the basis of several criteria. First, the patterns of gray and white matter transitions along electrode penetrations, especially the gap between extrastriate visual areas in the anterior bank of the lunate sulcus and MT, were verified. Next the direction, speed, and disparity tuning properties of single units and multiunit clusters, along with the relationship between receptive field size and eccentricity, were measured and identified to be typical of MT responses (see DeAngelis and Newsome, 1999). Changes in receptive field location along the electrode penetrations were as expected from the topography of MT (Zeki, 1974; Gattass and Gross, 1981; Van Essen et al., 1981; Albright and Desimone, 1987; Maunsell and Van Essen, 1987). In many cases the subsequent entry into gray matter after a short gap, with response properties typical of area MST, confirmed the localization of MT. All data included in this study were derived from recordings that were assigned confidently to area MT.
Experimental protocol
After isolating an MT neuron, we used a custom software interface to map carefully the receptive field and to estimate the preferred direction, speed, and horizontal disparity of the neuron. Next we measured quantitatively the direction, speed, size, and horizontal disparity tuning of each neuron. First, a direction tuning curve was obtained by presenting eight directions of motion, 45° apart. In cases in which the tuning width was unusually narrow, the sampling range was reduced accordingly. Then we measured a speed tuning curve for each neuron after adjusting the stimulus to the preferred direction of the neuron. Typically, we presented speeds of 0, 0.5, 1, 2, 4, 8, 16, and 32°/sec. Next we measured a size tuning (i.e., area summation) curve at the preferred direction and speed of the neuron. In most cases we presented aperture sizes of 1, 2, 4, 8, 16, and 32° diameter. This test was used to determine the smallest stimulus patch that yielded the maximal response and to assay for the presence of surround inhibition. Subsequently, we measured a disparity tuning curve with all of the other parameters optimized. In most cases disparities were tested from −1.6 to 1.6° in steps of 0.4°; however, these parameters were adjusted as necessary on the basis of our initial qualitative assessment of the breadth of disparity tuning. All tuning measurements were done in blocks of randomly interleaved trials, and responses were averaged across three to five repetitions of each distinct stimulus. Preferred values were determined on-line by visual inspection of the tuning curves. For the disparity tuning curves the trough of the curve (null disparity) was determined also.
After these preliminary tests we recorded while the monkey performed the depth discrimination task. Both the binocular correlation and the stimulus disparity (preferred and null) were varied in blocks of randomly interleaved trials. The binocular correlation was typically 0, 1.5, 3, 6, 12, 24, and 48% for monkey B and 0, 2, 4, 8, 16, 32, and 64% for monkey J, these ranges being determined from the latter stages of training once performance had stabilized. The vast majority of data sets was collected by using this fixed set of parameters for each monkey. In some cases, however, it was necessary to increase the range of correlations because of difficult stimulus parameters for the psychophysics or because of poor disparity selectivity of the neuron. Whenever possible, data were collected for 40 or more repetitions of each unique stimulus condition, and data sets were discarded if isolation was not maintained for at least 10 repetitions. Across the range of accepted data sets the average number of repetitions was 33 ± 10 SD, and the average number of total trials was 461 ± 139 SD.
Data analysis
Quantitative tuning measurements. For off-line analysis the responses were calculated from the firing rate during the 1.5 sec stimulus presentation period. Spontaneous activity was calculated by using the response to a blank screen.
Direction tuning data were fit with a Gaussian of the form: $R(\theta) = R_0 + A \exp\left(-0.5\,\frac{(\theta - \theta_0)^2}{\sigma^2}\right)$ (Equation 1), where R0 is the baseline level of the curve, A is the amplitude, θ0 is the location of the center of the Gaussian (i.e., the preferred direction of the neuron), and σ is the SD. We fit this curve to the individual trial responses of the neuron, using the constrained minimization tool, fmincon, in MATLAB. To homogenize the variance of the neural responses across different directions, we minimized the difference between the square root of the neural responses and the square root of the Gaussian (see Prince et al., 2002a,b). This approach was used for all of the curve fits in this study. The Gaussian function generally provided excellent fits to the data, accounting for 96% (median across neurons) of the variance in the mean response across directions.
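The square-root transform changes only the error term that the optimizer minimizes. The Python sketch below shows one possible objective function for a Gaussian with baseline R0, amplitude A, center θ0, and SD σ. It is not the original MATLAB code; the function names and the wrapping of angular differences into ±180° are assumptions. Any constrained minimizer could then be applied to `sqrt_space_error`.

```python
import math

def gaussian_tuning(theta, r0, amp, theta0, sigma):
    """Gaussian direction tuning curve (response in spikes/sec)."""
    # Wrap the angular difference into [-180, 180); this wrapping is an assumption.
    delta = (theta - theta0 + 180.0) % 360.0 - 180.0
    return r0 + amp * math.exp(-0.5 * (delta / sigma) ** 2)

def sqrt_space_error(params, directions, responses):
    """Sum of squared differences between sqrt(response) and sqrt(prediction)."""
    r0, amp, theta0, sigma = params
    total = 0.0
    for theta, resp in zip(directions, responses):
        # Clip the prediction at zero so the square root is always defined.
        pred = max(gaussian_tuning(theta, r0, amp, theta0, sigma), 0.0)
        total += (math.sqrt(resp) - math.sqrt(pred)) ** 2
    return total
```

For spike counts with variance roughly proportional to the mean, the square root makes the trial-to-trial scatter more nearly equal across directions, so strongly driven directions do not dominate the fit.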
Analogously, each speed tuning curve was fit with a gamma distribution of the form: $R(s) = R_0 + A\,\frac{[\alpha(s-\tau)]^{n}\,\exp[-\alpha(s-\tau)]}{n^{n}\exp(-n)}$ (Equation 2), where s is the stimulus speed and R0, A, α, τ, and n are free parameters. The gamma distribution varies in shape from an exponential to a Gaussian depending on the value of the exponent, n. The denominator term normalizes the curve to have amplitude specified by A. This formulation provided excellent fits to speed tuning curves, accounting for 98% (median) of the response variance across the population.
Each area summation curve was fit with the integral of a Gaussian (or error function, erf) having the form: $R(w) = R_0 + A\,\mathrm{erf}(w/\alpha)$ (Equation 3), where w is the stimulus size, R0 is the baseline response level, A is the amplitude, and α gives the SD of the underlying Gaussian. This function provides good fits for neurons that lack surround inhibition. To incorporate surround inhibition, we also fit each area summation curve with a difference-of-error (DoE) function: $R(w) = R_0 + A_e\,\mathrm{erf}(w/\alpha) - A_i\,\mathrm{erf}\left(w/(\alpha+\beta)\right)$ (Equation 4), where Ae and Ai are the amplitudes of the excitatory and inhibitory components, α is the size of the central excitatory region, and (α+β) is the size of the inhibitory surround. Note that this formulation constrains the inhibitory region to have a size greater than that of the excitatory center. A sequential F test was used to determine whether the DoE function provided a better fit than the single error function, indicating that the neuron exhibited significant surround inhibition (p < 0.05). If so, the optimal size and percentage of surround inhibition were derived from the DoE fit. Otherwise, the receptive field size was taken as 1.163*α, which defines the point at which the single error function reaches 90% of its maximal value. These fits accounted for 97% (median) of the variance across the population.
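The factor 1.163 can be checked numerically, because erf(1.163) is approximately 0.90. The Python sketch below assumes the single error function has the form R0 + A·erf(w/α), which is the form implied by the 90% criterion, and writes the DoE function with separate center and surround amplitudes. The amplitude names `amp_e` and `amp_i` are assumptions.

```python
import math

def area_summation(w, r0, amp, alpha):
    """Single error-function model of response vs. stimulus size w (deg)."""
    return r0 + amp * math.erf(w / alpha)

def difference_of_erf(w, r0, amp_e, alpha, amp_i, beta):
    """Center erf minus a broader surround erf; beta >= 0 keeps the surround larger."""
    return (r0 + amp_e * math.erf(w / alpha)
            - amp_i * math.erf(w / (alpha + beta)))

# With alpha = 4 deg, the estimated receptive field size is 1.163 * 4 = 4.652 deg,
# where the curve has reached about 90% of its maximal modulation.
fraction = area_summation(1.163 * 4.0, 0.0, 1.0, 4.0)
print(round(fraction, 3))  # 0.9
```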
Each disparity tuning curve was fit with a Gabor function having the following form: $R(d) = R_0 + A \exp\left(-0.5\,\frac{(d - d_0)^2}{\sigma^2}\right)\cos\left(2\pi f (d - d_0) + \Phi\right)$ (Equation 5), where d is the stimulus disparity, R0 is the baseline response level, A is the amplitude, d0 is the center of the Gaussian envelope, σ is the SD of the Gaussian, f is the frequency of the sinusoid, and Φ is the phase of the sinusoid (relative to the center of the Gaussian). Because the disparity frequency, f, often is poorly constrained by the data at the low-frequency end, this parameter was allowed to vary only within ±10% of the peak of the Fourier transform of the raw tuning curve. We found that this constraint considerably improved the convergence of the optimization (see also Prince et al., 2002a,b) with minimal increase in the overall error of the fits. Gabor fits were excellent descriptors of disparity tuning in MT, accounting for 96% (median) of the variance in the data (for additional details, see DeAngelis and Uka, 2003).
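As an illustration of how the Gabor parameters shape a tuning curve, the Python sketch below evaluates a Gabor with the parameters named above and locates its peak and trough on a fine grid. The function names and the grid search are assumptions made for illustration; in the experiments the preferred and null disparities were chosen on-line by visual inspection, not from the fit.

```python
import math

def gabor(d, r0, amp, d0, sigma, f, phi):
    """Gabor disparity tuning curve: Gaussian envelope times a cosine carrier."""
    envelope = math.exp(-0.5 * ((d - d0) / sigma) ** 2)
    carrier = math.cos(2.0 * math.pi * f * (d - d0) + phi)
    return r0 + amp * envelope * carrier

def peak_and_trough(params, lo=-1.6, hi=1.6, step=0.01):
    """Return (disparity at maximum, disparity at minimum) on a regular grid."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    values = [gabor(d, *params) for d in grid]
    return grid[values.index(max(values))], grid[values.index(min(values))]

# An odd-symmetric example (phase of -90 deg): peak at a far disparity,
# trough at the mirror-image near disparity.
preferred, null = peak_and_trough((20.0, 15.0, 0.0, 0.8, 0.5, -math.pi / 2))
```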
From each disparity tuning curve we extracted two measures of disparity selectivity: a disparity tuning index (DTI) and a disparity discrimination index (DDI). The amount of response modulation because of disparity was assessed with the DTI (Equation 6), where Rmax and Rmin are the maximum and minimum responses, respectively. To keep this index restricted to the range from zero to one, we did not subtract spontaneous activity from Rmax or Rmin. Finally, to characterize the ability of the neurons to discriminate between the preferred and null disparities, we used the DDI (see Cumming and DeAngelis, 2001; Prince et al., 2002b): $\mathrm{DDI} = \frac{R_{\max} - R_{\min}}{R_{\max} - R_{\min} + 2\sqrt{SSE/(N - M)}}$ (Equation 7), where SSE is the sum squared error averaged across disparities, N is the number of observations (trials), and M is the number of disparities tested. This index differs importantly from the DTI in that it takes into account the variability of the neural responses. Both DTI and DDI were computed from the square root of firing rate on each trial. Analogous discrimination indices were computed for direction and speed tuning curves also.
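A minimal Python sketch of the DDI follows, assuming the form of the index given in the cited Prince et al. (2002b): the response range divided by the range plus twice the root-mean-square residual, with SSE taken as the summed squared deviation of single-trial responses from the mean at each disparity. The function name and the input layout are assumptions.

```python
import math

def ddi(trials_by_disparity):
    """Disparity discrimination index from a dict {disparity: [firing rates]}."""
    # Work on the square root of firing rate, as described for both indices.
    sqrt_resp = {d: [math.sqrt(r) for r in rates]
                 for d, rates in trials_by_disparity.items()}
    means = {d: sum(v) / len(v) for d, v in sqrt_resp.items()}
    # Residual sum of squares around the mean response at each disparity.
    sse = sum((x - means[d]) ** 2 for d, v in sqrt_resp.items() for x in v)
    n = sum(len(v) for v in sqrt_resp.values())   # total number of trials
    m = len(sqrt_resp)                            # number of disparities tested
    rms = math.sqrt(sse / (n - m))
    r_max, r_min = max(means.values()), min(means.values())
    return (r_max - r_min) / (r_max - r_min + 2.0 * rms)

# Two disparities, four trials each (rates in spikes/sec).
example = ddi({-0.8: [64, 100, 64, 100], 0.5: [4, 16, 4, 16]})
```

Because the residual term sits in the denominator, two neurons with the same peak-to-trough modulation receive different DDI values if one is more variable from trial to trial.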
Calculation of neuronal thresholds. To characterize the sensitivity of MT neurons in our depth discrimination task, we used ROC analysis to calculate neuronal thresholds on the basis of the anti-neuron formulation used by Britten et al. (1992). An ROC curve (see Fig. 2C) was calculated from the distributions of responses to the preferred and null disparities at each correlation level (see Fig. 2B). The area under the ROC curve is taken as the ability of an ideal observer to discriminate between the two disparities based solely on the responses of the recorded neuron (and an assumed anti-neuron with opposite tuning). A plot of the ROC area as a function of binocular correlation defines the neurometric function (see Fig. 2D, filled symbols), which is fit with a cumulative Weibull function given by: $p = 1 - 0.5\exp\left[-(c/\alpha)^{\beta}\right]$ (Equation 8), where c is the binocular correlation of the stimulus, p is the proportion of correct responses, α defines the threshold at 82% correct, and β gives the slope of the curve.
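The 82% criterion follows directly from the shape of the cumulative Weibull function. Assuming the form used by Britten et al. (1992), p = 1 − 0.5·exp[−(c/α)^β], the proportion correct at c = α is 1 − 0.5/e ≈ 0.816 regardless of the slope β. A short Python check (the function name is an assumption):

```python
import math

def weibull(c, alpha, beta):
    """Proportion correct at binocular correlation c (same units as alpha)."""
    return 1.0 - 0.5 * math.exp(-((c / alpha) ** beta))

# At threshold (c == alpha) the curve passes through about 82% correct.
print(round(weibull(10.7, 10.7, 1.15), 3))  # 0.816
# At 0% correlation the curve sits at chance.
print(weibull(0.0, 10.7, 1.15))             # 0.5
```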
To test whether neurometric and psychometric functions differed in terms of threshold or slope, we used a bootstrap technique. For each correlation level the spike counts and choices were resampled randomly with replacement from the measured distributions. The number of random draws was equal to the number of trials done at each correlation level. One such set of random draws of spike counts for all correlation levels defined a single bootstrap neurometric function, and an analogous set of draws of choices defined a single bootstrap psychometric function. These bootstrap functions then were fit with Weibull curves to extract thresholds and slopes, exactly as described above. This whole process was repeated to compute 1000 pairs of bootstrap neurometric/psychometric functions, and we computed the 95% confidence interval of the difference in thresholds (or slopes) between neurometric and psychometric functions from these distributions. Differences between neuronal and psychophysical thresholds (or slopes) were considered significant if the 95% confidence interval did not overlap with zero.
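The resampling logic can be sketched as follows in Python. The two threshold-estimating callables stand in for the ROC analysis and Weibull fits and are hypothetical, as are the function names, the nearest-rank percentile, and the choice to resample preferred and null spike counts separately.

```python
import random

def percentile(sorted_vals, q):
    """Nearest-rank percentile (q in 0..100) of an already sorted list."""
    idx = int(round(q / 100.0 * (len(sorted_vals) - 1)))
    return sorted_vals[min(len(sorted_vals) - 1, max(0, idx))]

def resample(values, rng):
    """Draw len(values) items from values with replacement."""
    return [rng.choice(values) for _ in values]

def bootstrap_difference_ci(neuro_by_level, psycho_by_level,
                            neuro_threshold, psycho_threshold,
                            n_boot=1000, seed=0):
    """95% CI of (neuronal - psychophysical) threshold and a significance flag.

    neuro_by_level:  {correlation: (preferred_counts, null_counts)}
    psycho_by_level: {correlation: [1 if correct else 0, ...]}
    neuro_threshold, psycho_threshold: hypothetical callables that fit one
    resampled data set and return a threshold.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Same number of draws as trials at each correlation level.
        neuro = {c: (resample(pref, rng), resample(null, rng))
                 for c, (pref, null) in neuro_by_level.items()}
        psycho = {c: resample(choices, rng)
                  for c, choices in psycho_by_level.items()}
        diffs.append(neuro_threshold(neuro) - psycho_threshold(psycho))
    diffs.sort()
    lo, hi = percentile(diffs, 2.5), percentile(diffs, 97.5)
    # Significant when the 95% confidence interval excludes zero.
    return lo, hi, not (lo <= 0.0 <= hi)
```

The same routine applies to slopes if the callables return the fitted slope instead of the threshold.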
Statistics. All statistical analyses were done with STATISTICA (StatSoft, Tulsa, OK) software. To account for differences between the two monkeys in our study, we did all correlation analyses as within-cell regressions in the context of an analysis of covariance (ANCOVA), with monkey identity as an independent factor. All correlation coefficients reported here are partial correlations that account for differences between monkeys. Multiple regression analyses also took into account differences between monkeys, using appropriate dummy variables. For all parametric statistics we log-transformed variables whenever this made the distributions closer to normal. We also verified that none of our conclusions would change when nonparametric statistics were used (e.g., Spearman rank correlations).
Results
Neuronal database
Data were drawn from a sample of 170 MT neurons (94 from monkey B and 76 from monkey J) for which preliminary tests of direction, speed, size, and disparity tuning were completed and for which the preferred speed of motion was <32°/sec. Monkeys were trained for speeds up to this value only because of technical limitations on stimulus generation. We estimate that at most 5–10% of cells were excluded because of speed preference. Of the original 170 neurons, 104 (52 from each monkey) were included into the final database for this report on the basis of the following criteria. (1) Good isolation of the action potential of the neuron had to be maintained through at least 10 repetitions of the discrimination task. Forty-one of 170 neurons were excluded because isolation was lost prematurely or because the monkey ceased working. (2) The neuron had to exhibit some disparity tuning such that we could reasonably define a preferred and null disparity for the discrimination task. This judgment was always made from on-line visual inspection of the disparity tuning curve (plotted with error bars), and only 10 of 170 neurons were excluded because of lack of tuning. Cells were never excluded on the basis of quantitative assessments of tuning. (3) The preferred and null disparities had to have opposite signs (one near, one far), because monkeys were not trained to discriminate between two disparities of the same sign (i.e., the task was always to judge near vs far relative to the plane of fixation). Because most MT neurons have disparity tuning curves with odd symmetry around zero disparity (Cumming and DeAngelis, 2001; DeAngelis and Uka, 2003), this criterion was not commonly invoked. Only 8 of 170 neurons were excluded because their tuning curve was precisely symmetric about zero disparity. For most neurons with preferred disparities near zero we usually could place one disparity close to zero and the other disparity on the opposite side of zero (see, for example, Fig. 3D). Two additional neurons were excluded from the sample because the peak and trough of the tuning curve were both on the same side of zero disparity. (4) The monkey's behavior had to be within the range of normal performance exhibited during the latter stages of training. Five of 170 neurons were excluded from the sample because the monkey's behavior was clearly outside the normal range of performance.
Aside from these criteria we recorded from all of the MT neurons that we could isolate, including several neurons with very weak disparity tuning. In fact, post hoc testing showed that 2 of 104 neurons did not have statistically significant disparity selectivity (ANOVA, p > 0.05). Consequently, our selection criteria did not strictly match those of Britten and colleagues, who chose neurons for which the “distribution of response amplitudes evoked by preferred direction motion (100% correlated stimuli) did not overlap with the distribution evoked by null direction motion” (Britten et al., 1992). By their criterion two of our neurons would have been excluded from the sample. Otherwise, the selection criteria used in the two studies appear to be quite similar.
Receptive field eccentricities ranged from 1.9 to 15.7° (median, 6.3°). Preferred speeds ranged from 0.0 to 32.0°/sec (median, 5.4°/sec), and receptive field sizes ranged from 3.5 to 20.0° (median, 7.5°). Direction preferences were distributed uniformly.
Comparison of neuronal and psychophysical sensitivity
Figure 2 shows data from an individual experiment. This neuron was tuned strongly for near disparities, as shown in Figure 2A. Note that the response to a binocularly uncorrelated stimulus (labeled U) lies approximately midway between the maximum and minimum responses to disparities presented at 100% correlation. By visually inspecting this tuning curve on-line, we chose −0.8° to be the preferred disparity and +0.5° to be the null disparity. The monkey then discriminated between these two disparities across a range of binocular correlations from 1.5 to 48%. A 0% binocular correlation stimulus was included also, and all conditions were interleaved randomly. Figure 2B shows distributions of the responses of the neuron to each nonzero binocular correlation. Filled and open bars show responses to dots presented at the preferred and null disparities, respectively. At 48% correlation the two distributions are nonoverlapping, indicating that one could discriminate reliably between the two disparities from the responses of this unit. As binocular correlation decreases, the two distributions of responses become progressively more overlapping, and the neuron ceases to carry information about the disparity of the signal dots at very low correlations.
To quantify neuronal sensitivity, we calculated an ROC curve for each binocular correlation, as shown in Figure 2C (Green and Swets, 1966; Britten et al., 1992). The area under the ROC curve defines the proportion correct of an ideal observer whose task is to determine whether a given stimulus presentation contained signal dots at the preferred or null disparity, using only the responses of this single neuron (and an assumed anti-neuron with opposite preferred and null disparities). This is analogous to the proportion of times that a value drawn randomly from the preferred distribution (filled bars) exceeds a value drawn randomly from the null distribution (open bars) (Britten et al., 1992). We refer to these ROC values as the proportion correct of the neuron.
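The equivalence noted above gives a direct way to compute the ROC area without tracing the curve: count, over all pairs of trials, how often a preferred-disparity response exceeds a null-disparity response. In the Python sketch below the function name is an assumption, and ties are scored as one-half, which matches the trapezoidal area under the empirical ROC curve.

```python
def roc_area(pref_counts, null_counts):
    """Probability that a random preferred response exceeds a random null response."""
    wins = 0.0
    for p in pref_counts:
        for n in null_counts:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half (an assumption)
    return wins / (len(pref_counts) * len(null_counts))

print(roc_area([5, 6, 7], [1, 2, 3]))  # 1.0, non-overlapping distributions
print(roc_area([3, 4], [2, 3]))        # 0.875
print(roc_area([1, 2], [1, 2]))        # 0.5, identical distributions (chance)
```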
ROC values are plotted as a function of binocular correlation in Figure 2D to create a neurometric function (filled circles). These data were fit with a Weibull function (solid curve) to extract an 82% correct threshold and a slope. For this neuron the threshold was 10.7% binocularly correlated dots, and the slope was 1.15. These values could then be compared with the performance of the monkey, which was derived from the psychometric function shown in Figure 2D (open circles, dashed curve). By fitting the behavioral data using identical methods, we obtained a psychophysical threshold of 15.2% and a slope of 1.32. Thus this particular neuron exhibited slightly greater sensitivity (lower threshold) than the monkey, although this difference was not statistically significant (p > 0.05) on the basis of a bootstrap analysis (see Materials and Methods). The difference in slope was also not significant.
Figure 3 shows data for four additional MT neurons that illustrate the range of effects that we observed. Like many MT neurons, those illustrated in Figure 3, A and B, had neurometric functions that closely matched the monkey's psychometric function in both threshold and slope. Other MT neurons had thresholds that were significantly higher (Fig. 3C) or lower (Fig. 3D) than the psychophysical threshold.
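The threshold extraction can be sketched as follows, assuming the cumulative Weibull form commonly used for two-alternative tasks, p(c) = 1 − 0.5·exp(−(c/α)^β). Under this form the proportion correct at c = α is 1 − 0.5/e ≈ 0.82, which is why α is an 82% correct threshold. The exact function and fitting procedure are specified in Materials and Methods and are not reproduced here; the grid-search least-squares fit, the correlation levels, and the noise-free synthetic data are illustrative assumptions.

```python
import math


def weibull(c, alpha, beta):
    """Proportion correct at binocular correlation c (in percent).

    alpha is the threshold (proportion correct = 1 - 0.5/e, about 0.82)
    and beta is the slope parameter.
    """
    return 1.0 - 0.5 * math.exp(-((c / alpha) ** beta))


def fit_weibull(corrs, p_correct):
    """Coarse least-squares grid search over (alpha, beta).

    A stand-in for a proper maximum-likelihood fit; illustrative only.
    """
    best_sse, best_alpha, best_beta = float("inf"), None, None
    for i in range(10, 601):          # alpha = 1.0 ... 60.0 in steps of 0.1
        alpha = i / 10.0
        for j in range(10, 61):       # beta = 0.50 ... 3.00 in steps of 0.05
            beta = j / 20.0
            sse = sum((weibull(c, alpha, beta) - p) ** 2
                      for c, p in zip(corrs, p_correct))
            if sse < best_sse:
                best_sse, best_alpha, best_beta = sse, alpha, beta
    return best_alpha, best_beta


# Assumed correlation levels: logarithmic steps spanning 1.5% to 48%.
corrs = [1.5, 3.0, 6.0, 12.0, 24.0, 48.0]
# Synthetic, noise-free "neurometric" points built from the example values.
p_correct = [weibull(c, 10.7, 1.15) for c in corrs]
alpha_hat, beta_hat = fit_weibull(corrs, p_correct)
print(alpha_hat, beta_hat)
```

With real data the points are noisy, so a likelihood-based fit with bootstrap confidence intervals would be the more appropriate choice.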
Figure 4 summarizes our results for a population of 104 MT neurons. Figure 4A shows the comparison between neuronal and psychophysical thresholds, whereas Figure 4B shows the comparison between slopes. For more than one-half of the data sets (61 of 104) there was no significant difference between neuronal and psychophysical thresholds (bootstrap, p > 0.05). Among the remainder, 23 of 104 had neuronal thresholds significantly smaller than the corresponding psychophysical thresholds (p < 0.05), and 20 of 104 exhibited neuronal thresholds that significantly exceeded behavioral thresholds. Overall, the neuronal/psychophysical (N/P) threshold ratio was distributed around unity, with a geometric mean of 0.979 (1.03 for monkey B and 0.925 for monkey J). Thus the average MT neuron matched the performance of the animal. Moreover, 5 of 104 neurons had thresholds lower than the best psychophysical threshold exhibited by either monkey (11.7% correlation). There is a weak, but significant, correlation between neuronal and psychophysical thresholds (r = 0.35; p < 0.001), which we will discuss later. As for the slopes of the neurometric and psychometric functions, Figure 4B shows that these did not differ significantly for most (89 of 104) neurons (p > 0.05). The geometric mean of the N/P slope ratio was 1.16 (0.989 for monkey B and 1.37 for monkey J). These results are similar to those of Britten and colleagues (1992), who studied MT neurons in a direction discrimination task (see Discussion).
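The use of a geometric rather than arithmetic mean for threshold ratios can be illustrated with a short sketch. The ratios below are hypothetical, chosen so that a neuron twice as sensitive as the animal is balanced by one half as sensitive.

```python
import math


def geometric_mean(ratios):
    """Exponential of the mean log ratio."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical N/P threshold ratios (not data from the study).
ratios = [0.5, 2.0, 1.0, 0.8, 1.25]
print(geometric_mean(ratios))          # close to 1: halving and doubling cancel
print(sum(ratios) / len(ratios))       # arithmetic mean is pulled above 1
```

Because ratios are symmetric on a logarithmic axis, the geometric mean treats neurons that are more sensitive and less sensitive than the monkey evenhandedly.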
In a comparison of neuronal and psychophysical thresholds there are a number of factors that are not well constrained. Some of these, including the algorithm used by the ideal observer, the size of the neuronal population contributing to decision making, and the contribution of neurons with nonoptimal stimulus preferences, have been considered previously (Shadlen et al., 1996; Prince et al., 2000). There are a number of other factors that also can affect the magnitude of N/P ratios. In the following sections we address three of these factors in detail: trial-to-trial stimulus variations, session-to-session variability in psychophysical performance, and the length of time during which the monkey reads out activity from MT during a trial (integration time). Our analyses put firm limits on the degree to which each of these factors may alter the N/P ratios given in Figure 4.
Effect of trial-to-trial stimulus variations on N/P threshold ratios
In general, both the starting location of dots within the circular aperture and the binocular disparity of noise dots were randomized for each trial (see Materials and Methods). If MT neurons are sensitive to this randomization, it would increase the variance of response distributions for the preferred and null disparities (Fig. 2B) and subsequently increase neuronal thresholds computed by using ROC analysis. To assess the magnitude of these effects, in 61 of 104 experiments we divided the 0% binocular correlation trials into two groups; one group had identical random dot patterns for every trial (NOVAR condition), and the other group had the normal randomization of dot patterns across trials (VAR condition). For each neuron we calculated the mean and variance of the spike count across trials for each of these two conditions.
Figure 5A plots the trial-to-trial variance against mean spike count for each neuron that was tested under the VAR and NOVAR conditions. There was no significant difference in mean spike counts between the two conditions (paired t test, p = 0.91), but the variance was significantly smaller for the NOVAR condition (paired t test, p < 0.0001). Thus MT neurons were sensitive to trial-to-trial variations in the random dot stimuli, most likely because of variations in the mean disparity of noise dots in the VAR condition. We fit separate lines with a slope of 1 to the data for the VAR and NOVAR conditions after confirming that separate slopes for the two lines would not improve the overall fit (sequential F test, p > 0.05). From these fits on log–log scales we calculated the variance-to-mean ratio (VMR; the y-intercept at x = 1) for each condition. The VMR was 1.40 for the NOVAR condition (similar to what other studies have found: Tolhurst et al., 1983; Vogels et al., 1989; Snowden et al., 1992; Britten et al., 1993; Softky and Koch, 1993; Geisler and Albrecht, 1997; Shadlen and Newsome, 1998), whereas it was 2.11 for the VAR condition.
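A unity-slope line on log–log axes has a single free parameter, its intercept, so a least-squares fit reduces to averaging log(variance) − log(mean) across neurons. The sketch below illustrates this under that assumption; the function name `vmr_unity_slope` and the example values are hypothetical and are constructed only to reproduce the two reported VMRs.

```python
import math


def vmr_unity_slope(means, variances):
    """Least-squares intercept b of log(var) = log(mean) + b, returned as exp(b).

    exp(b) is the fitted variance at a mean count of 1, i.e. the VMR.
    """
    b = sum(math.log(v) - math.log(m)
            for m, v in zip(means, variances)) / len(means)
    return math.exp(b)


# Hypothetical mean spike counts for four neurons (not data from the study).
means = [5.0, 10.0, 20.0, 40.0]
var_novar = [1.40 * m for m in means]   # constructed to give VMR = 1.40
var_var = [2.11 * m for m in means]     # constructed to give VMR = 2.11

ratio = vmr_unity_slope(means, var_var) / vmr_unity_slope(means, var_novar)
print(round(ratio, 2))
```

The fitted VMR is therefore the geometric mean of the per-neuron variance-to-mean ratios.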
To estimate the effect of stimulus variations on neuronal sensitivity, we recalculated the neuronal threshold for each MT unit after scaling down the variance of the response distribution (at each binocular correlation and disparity) by the factor of 1.51 (2.11/1.40) derived from the above analysis, while keeping the mean response constant:

$$R_{scaled} = \frac{R_{orig} - R_{mean}}{\sqrt{1.51}} + R_{mean} \qquad (9)$$

where $R_{orig}$ and $R_{scaled}$ are the responses before and after variance scaling, and $R_{mean}$ is the mean response for each disparity at each correlation level. Figure 5B shows the recalculated neuronal thresholds plotted against the original values. Variance scaling reduced the threshold for all MT neurons, with the average effect being a reduction of 17.4%. This estimate assumes that trial-to-trial stimulus variation produces the same increase in VMR for all binocular correlations, an assumption that we did not test because of limitations of recording time.
If MT provides critical sensory signals for performance of our task, then the above analysis predicts that psychophysical performance should improve by ∼20% under the NOVAR condition. We tested this prediction (after all neurophysiological experiments were completed) by remeasuring the psychophysical thresholds of monkey B under both VAR and NOVAR conditions (randomly interleaved). For the NOVAR condition a fixed random dot pattern was used for each disparity at each binocular correlation such that all repeats of each stimulus condition were identical. Because the simulated neuronal thresholds (Fig. 5B) were calculated under the assumption of variance scaling but no change in mean responses, we forced the noise dots to have zero mean disparity for each NOVAR stimulus. This matches the mean disparity of the NOVAR stimuli to the mean disparity (across repetitions) of the VAR stimuli and thus minimizes changes in the mean response of MT neurons between the two conditions.
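The variance-scaling step can be sketched as follows, assuming that each response's deviation from the mean is divided by the square root of 1.51 so that the variance falls by a factor of 1.51 while the mean is unchanged. The function name `scale_variance` and the spike counts are hypothetical.

```python
import math
import statistics


def scale_variance(responses, factor=1.51):
    """Shrink deviations from the mean so the variance drops by `factor`."""
    mean = statistics.fmean(responses)
    shrink = math.sqrt(factor)
    return [(r - mean) / shrink + mean for r in responses]


# Hypothetical spike counts for one disparity at one correlation level.
original = [12, 18, 15, 22, 9, 14, 20, 10]
scaled = scale_variance(original)
print(statistics.fmean(original), statistics.fmean(scaled))
print(statistics.variance(original) / statistics.variance(scaled))
```

The scaled responses are no longer integer spike counts, which is not a problem for an ROC analysis because it depends only on the rank order of the responses.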
We obtained psychophysical thresholds under VAR and NOVAR conditions for all sets of stimulus parameters used in the 52 recording experiments done with monkey B. Figure 5C shows that NOVAR and VAR thresholds are correlated significantly (r = 0.61; p < 0.0001), but the average NOVAR threshold (19.2%) is significantly lower than the average VAR threshold (24.9%) (paired t test, p ≪ 0.0001). The average reduction in psychophysical threshold under the NOVAR condition was 22.8%, not far from the 17.4% reduction observed for the simulated neuronal thresholds (Fig. 5B). Thus stimulus variation has similar effects on neuronal and psychophysical thresholds in our task.
Retesting of psychophysical thresholds
If the conditions of the recording experiments interfere with peak performance of the task, this will produce N/P threshold ratios that are artificially low. In a study of V1 neurons during performance of a stereoacuity task, Prince and colleagues (2000) found that psychophysical thresholds decreased by an average of 61% when monkeys were retested outside the context of the combined behavioral/physiological experiments. This large change occurred mainly because the range of stimulus disparities often had to be increased during recording sessions to allow for the measurement of neuronal thresholds for insensitive neurons. When the range of disparities was tightened in behavioral retesting, psychophysical thresholds improved markedly, with the average N/P threshold ratio increasing from 1.67 to 4.11.
To address this potential concern, we remeasured psychometric functions by using stimulus parameters that were identical to those of each recording experiment. For the handful of neurons that required a range of binocular correlations larger than the standard range used for each monkey (see Materials and Methods), we retested psychophysical performance by using the standard range. These repeat behavioral measurements were taken after all recording experiments were completed. For monkey B these data were obtained in blocks of trials with interleaved NOVAR conditions, as discussed in the previous section.
A comparison of the original psychophysical thresholds and retested thresholds is shown in Figure 6. For monkey J there was a modest 14% reduction in the average psychophysical threshold after retesting, and this difference was significant (paired t test, p < 0.001). In contrast, the average psychophysical threshold for monkey B increased by 11% during retesting, although this difference was only marginally significant (paired t test, p = 0.02). Combined across the two animals, there is a fairly strong correlation between original and retested psychophysical thresholds (r = 0.62; p ≪ 0.0001), with a slope near unity. This indicates that a good portion of the variance in the original psychophysical thresholds was not simply attributable to random fluctuations in the monkeys' performance; we shall address how stimulus conditions affected thresholds in a later section.
The data of Figure 6 suggest that the psychophysical performance of both monkeys had reached a stable plateau when recording sessions commenced, and this is confirmed by the fact that there was no significant trend for psychophysical thresholds to decline across recording sessions for either monkey (monkey B: r = 0.09, p = 0.52; monkey J: r = 0.14, p = 0.32). There are two likely reasons why the improvement in psychophysical performance after retesting was much smaller than the 61% reduction seen by Prince et al. (2000). First, in our study the average neuronal threshold was comparable to the average psychophysical threshold; hence in most cases we did not need to increase the range of binocular correlations to measure neuronal thresholds. Second, our correlation levels were spaced in logarithmic steps, whereas the stimulus levels used by Prince et al. (2000) were spaced linearly. Logarithmic steps allow one to cover a large range of stimulus strengths while retaining a concentration of values around psychophysical threshold.
When we use retested psychophysical data, the average N/P threshold ratio for monkey J increases from 0.93 to 1.07, and the average N/P threshold ratio for monkey B decreases from 1.03 to 0.93. Combined across monkeys, the average N/P threshold ratio is 1.001 when retested psychophysical thresholds are used. Thus variations in behavioral performance had little influence on our estimates of N/P threshold ratios.
Effect of integration time on neuronal thresholds
Another important issue to consider when comparing neuronal and psychophysical thresholds is the point in time at which the animal reaches his decision during a trial. In all of our physiology experiments the visual stimuli were presented for 1.5 sec, and firing rates were computed over this entire interval. Although the monkeys were not allowed to indicate their decision until the end of the trial (ours was not a reaction time task), monkeys may have reached a decision much sooner and subsequently ignored the visual stimulus. If so, our estimates of N/P threshold ratios would be too low because the monkey would be integrating neuronal activity over less time than the ideal observer (ROC). Because we do not know the monkey's decision time in the simultaneous behavioral/physiological experiments, we first addressed this issue by computing neuronal thresholds over a range of integration times.
For each MT neuron we calculated neuronal thresholds by counting spikes over a variable epoch beginning at stimulus onset and ending at times ranging from 200 to 1500 msec after stimulus onset (in 100 msec steps). We could not calculate thresholds reliably at short integration times for insensitive MT neurons. Thus we restricted this analysis to neurons having a threshold (using the full 1500 msec integration window) less than one-half of the largest binocular correlation used in the measurements. Figure 7A shows neuronal thresholds as a function of integration time for the 75 neurons that met this criterion. Not surprisingly, neuronal thresholds generally decreased with integration time, but the trend was more gradual than we had expected. Strikingly, many MT neurons, including some of the most sensitive units, showed thresholds that were approximately independent of integration time. Data for nine such neurons are shown in Figure 7B; several others are not shown to avoid overcrowding. Summary data for the population of 75 neurons are shown in Figure 8. For each neuron we normalized all thresholds to the value obtained by using a 1500 msec integration window. The thick solid line in Figure 8 shows the median normalized threshold versus integration time, and the thinner solid lines indicate the 25th and 75th percentiles. For a 200 msec integration time the median threshold rises by 83% relative to that at 1500 msec; at 500 msec the median threshold is elevated by only 28%.
The shallow decline in neuronal threshold with integration time for many of our MT neurons appears at odds with similar data reported by Britten et al. (1992). Their data show a much steeper dependence on integration time, but the generality of this result is uncertain because data were shown for only eight neurons. Britten and colleagues state that their data were “… expected as a simple consequence of the fact that noise resulting from irregularity in a neuron's firing pattern becomes less pronounced with longer measurement time” (Britten et al., 1992). Indeed, one should expect a steep decline in threshold with integration time if the spiking of MT neurons approximately obeys Poisson statistics.
To assess whether the integration time behavior of MT neurons differs from the Poisson expectation, we performed the simulations of Figure 7C. For each trial of each data set we generated spike trains by using an inhomogeneous Poisson process. We first generated a poststimulus time histogram (PSTH) with 1 msec resolution for each binocular correlation at each disparity and then smoothed it by using a boxcar filter with a width of 20 msec. Spikes were generated randomly for each 1 msec bin, with a probability determined from the smoothed PSTHs. These simulated spike trains have firing rate variations that match the real data, have a fixed VMR equal to 1, and have random local structure. We then performed ROC analysis on the synthesized Poisson spike trains in an identical manner to that used for the real spike trains. Figure 7C shows the results of this simulation for each of the 75 neurons that were analyzed in Figure 7A. Clearly, the simulated neuronal thresholds decline more steeply with integration time than those of the actual neurons, and none of the simulated data sets showed an integration time curve that was flat. The median normalized threshold versus integration time for the simulations is shown by the thick dashed curve in Figure 8, along with the 25th and 75th percentiles for the simulations (thin dashed lines). For a 200 msec integration time the median threshold rises by 227% relative to 1500 msec integration time; at 500 msec the median threshold is elevated by 92%.
What factors might allow MT neurons to have such a diminished dependence on integration time relative to the Poisson expectation? One possibility is that the variability of MT responses is not fixed throughout the trial epoch. If responses are more reliable in the early part of the trial, this could flatten the relationship between neuronal threshold and integration time. To address this possibility, we computed the VMR of each neuron within consecutive 200 msec time windows spanning the trial epoch. Each VMR was obtained by computing the mean and variance across trials for each different correlation level and disparity; then these data were plotted on log–log scales and fit with a unity slope line (as in Fig. 5A). Figure 9A (filled circles) shows the average VMR as a function of time for the same 75 neurons that were analyzed in Figures 7 and 8. There is a significant increase in VMR over the first 500 msec of the trial (ANOVA, p = 0.028), with the average VMR increasing by 26%.
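The spike-generation step of such a simulation can be sketched as below, assuming one independent random draw per 1 msec bin with spike probability equal to the smoothed rate multiplied by the bin width. The PSTH used here is a made-up step profile, and the helper names `boxcar_smooth` and `poisson_spike_train` are hypothetical; this is an illustration of the technique rather than the simulation code used in the study.

```python
import random
import statistics


def boxcar_smooth(psth, width=20):
    """Moving average over a window of `width` bins that starts
    `width // 2` bins before each bin (truncated at the edges)."""
    n = len(psth)
    half = width // 2
    smoothed = []
    for i in range(n):
        window = psth[max(0, i - half):min(n, i + half)]
        smoothed.append(sum(window) / len(window))
    return smoothed


def poisson_spike_train(rate_hz, rng, dt=0.001):
    """One independent draw per bin; P(spike) = rate * dt."""
    return [1 if rng.random() < r * dt else 0 for r in rate_hz]


rng = random.Random(0)
# Hypothetical PSTH in spikes/sec at 1 msec resolution over a 1.5 sec trial:
# 100 msec of silence, a 200 msec transient, then a sustained response.
raw_psth = [0.0] * 100 + [80.0] * 200 + [40.0] * 1200
rate = boxcar_smooth(raw_psth, width=20)

counts = [sum(poisson_spike_train(rate, rng)) for _ in range(500)]
mean_count = statistics.fmean(counts)
vmr = statistics.variance(counts) / mean_count
print(round(mean_count, 1), round(vmr, 2))
```

At these firing rates the per-bin spike probability is small, so the spike-count VMR of the simulated trains stays close to 1 regardless of how the rate varies within the trial.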
If changes in VMR contribute to flattening of integration time curves, neurons with the flattest integration time curves should show the largest increases in VMR and vice versa. Figure 9B shows that this expectation was confirmed. To construct this scatterplot, we divided the neuronal threshold for a 300 msec integration time by the threshold for a 1500 msec integration time (as in Fig. 8). This normalized threshold is plotted against the ratio of VMR between the early (100–300 msec) and late (1300–1500 msec) portions of the response. There is a significant positive correlation (r = 0.31; p = 0.0079) such that neurons with flat integration time curves (values near unity on the ordinate) tend to have the largest increases in VMR during the trial epoch. This confirms that changes in VMR over time do contribute to flattening the integration time curves of MT neurons.
Another factor that could flatten the integration time curves of MT neurons is a change in the differential disparity signal (preferred–null) during the time course of the response. If the mean response difference between preferred and null disparities declines over time, neurons would be more sensitive in the early portion of the trial. Because the firing rate variations of MT neurons were modeled accurately in our Poisson simulations (Fig. 7C), this factor cannot account for the difference seen in Figure 8. To shed further light on this, Figure 9A (solid curve) shows the time course of the preferred–null response difference averaged across the population of 75 neurons. At the population level there is indeed no significant change in the disparity signal after 100 msec following stimulus onset (ANOVA, p = 0.99). Thus changes in the disparity signal over time do not contribute to flattening of integration time curves, on average. Nevertheless, we find that changes in the disparity signal over time do explain variability in the slope of the integration time curve from neuron to neuron. Figure 9C shows the normalized neuronal thresholds plotted against the ratio of preferred–null response differences in the early (100–300 msec) versus late (1300–1500 msec) segments of the trial. There is a significant negative correlation between these variables (r = −0.40; p < 0.001) such that neurons with flat integration time curves tend to have larger differential responses in the early part of the trial, whereas neurons with steep integration time curves tend to have more differential response in the late part of the trial.
A third factor that may contribute to flat integration time curves is the statistical dependence of firing rates between one brief time period and the next. For a Poisson process the number of spikes that occur within one short epoch of a trial is not correlated with the number of spikes that occur within the next epoch. As a result, counting spikes over a longer period of time yields a better estimate of the true mean firing rate and thus a lower neuronal threshold (Fig. 7C). If trial-to-trial variations in spike counts are correlated between neighboring epochs, however, then the expected improvement in neuronal threshold with integration time will be reduced substantially. To examine this possibility, we again divided the trial into seven nonoverlapping 200 msec time bins. Within each bin the responses were z-scored and combined across binocular correlations and disparities. For each pair of neighboring bins we then computed the correlation coefficient (across trials) between the normalized responses. Figure 9A (open circles) shows the average noise correlation as a function of time for 75 MT neurons. The correlation is constant throughout the trial (ANOVA, p = 0.93), consistent with the fact that integration time curves of MT neurons are flatter than Poisson simulations at every point within the trial epoch (Fig. 8).
To obtain a single noise correlation value for each neuron, we averaged the correlation values across all pairs of neighboring time bins. Figure 9D shows that there is a significant inverse correlation between normalized neuronal thresholds and overall noise correlation (r = −0.24; p = 0.019) such that neurons with flat integration time curves tend to have larger noise correlations. Note also that all of the noise correlation values in Figure 9D are positive, indicating that a larger than average firing rate in one time bin is associated with a larger than average response in the neighboring bins. This result helps to explain why nearly all MT neurons have integration time curves that are flatter than the Poisson expectation.
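The neighboring-bin noise correlation can be sketched as follows: spike counts in each time bin are z-scored within each stimulus condition, pooled across conditions, correlated between adjacent bins, and averaged across bin pairs. The toy data generator (Gaussian "counts" with a shared per-trial offset) and all function names are assumptions made for illustration and do not model real MT responses.

```python
import math
import random
import statistics


def zscore(values):
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def pearson_r(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def neighbor_noise_correlation(conditions):
    """Mean correlation between z-scored counts in neighboring time bins.

    conditions: list of stimulus conditions; each is a list of trials;
    each trial is a list of spike counts, one per time bin.
    """
    n_bins = len(conditions[0][0])
    pooled = [[] for _ in range(n_bins)]
    for trials in conditions:
        for b in range(n_bins):
            pooled[b].extend(zscore([trial[b] for trial in trials]))
    pair_r = [pearson_r(pooled[b], pooled[b + 1]) for b in range(n_bins - 1)]
    return statistics.fmean(pair_r)


def simulate(n_trials, n_bins, base, shared_sd, private_sd, rng):
    """Toy counts: a per-trial shared offset plus independent noise per bin."""
    trials = []
    for _ in range(n_trials):
        offset = rng.gauss(0.0, shared_sd)
        trials.append([base + offset + rng.gauss(0.0, private_sd)
                       for _ in range(n_bins)])
    return trials


rng = random.Random(1)
correlated = [simulate(200, 7, 10.0, 2.0, 2.0, rng),
              simulate(200, 7, 20.0, 2.0, 2.0, rng)]
independent = [simulate(200, 7, 10.0, 0.0, 2.0, rng),
               simulate(200, 7, 20.0, 0.0, 2.0, rng)]
r_correlated = neighbor_noise_correlation(correlated)
r_independent = neighbor_noise_correlation(independent)
print(round(r_correlated, 2), round(r_independent, 2))
```

Z-scoring within each condition removes differences in mean response between stimuli, so the pooled correlation reflects only trial-to-trial covariation between bins.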
In principle, positive noise correlations could be either stimulus-driven or intrinsic to neuronal connectivity in MT. To assess this, we compared data from the VAR and NOVAR conditions at 0% correlation, because the NOVAR condition removes stimulus variability. The average noise correlation is significantly larger for VAR (0.29) than NOVAR (0.16) trials (paired t test, p ≪ 0.0001), indicating that the noise correlations in Figure 9 are driven by both stimulus and neuronal factors.
Effect of integration time on psychophysical thresholds
If MT neurons underlie performance of our depth discrimination task, then the results of Figures 7 and 8 suggest that psychophysical thresholds should exhibit little dependence on how long the monkey scrutinizes the visual stimulus. Because we cannot know when the monkey reached his decision during the 1.5 sec trials used in the recording sessions, we performed additional psychophysical experiments to measure how behavioral thresholds depend on stimulus duration. These tests were performed on monkey B as well as another animal (monkey R) that was not used in the recording experiments (monkey J was engaged in other studies and could not be used for these additional tests). We used a fixed set of stimulus conditions (eccentricity, 5.5°; direction, 0°; speed, 7°/sec; size, 8°; disparity, ±0.5°) that were selected by averaging the stimulus parameters across the 52 recording experiments performed with monkey B. Psychophysical thresholds were determined by using a staircase procedure, as done by Britten et al. (1992). Three blocks of trials were performed in each daily session, with stimulus durations of 200, 500, and 1500 msec, respectively. The order of the three blocks was randomized each day, and the intertrial interval was adjusted so that total trial length (and reward interval) was identical among blocks.
Figure 10 shows average psychophysical thresholds as a function of viewing duration for monkeys B and R. Each datum is the average (± SE) across numerous identical blocks of 420 trials/block (20 blocks for monkey B and 14 blocks for monkey R). Surprisingly, there is no significant dependence of psychophysical threshold on viewing duration for either monkey (ANOVA; monkey B, p = 0.85; monkey R, p = 0.62), indicating that sufficient information was available to the monkeys in the first 200 msec of the trial for adequate task performance. In contrast, Britten and colleagues (1992) found large effects of viewing duration on psychophysical thresholds for monkeys performing a direction discrimination task. Possible reasons for this difference are discussed below.
Together, Figures 8 and 10 allow us to place firm bounds on the effects of integration time on our estimates of N/P threshold ratios. If we assume that the monkeys made their decisions within the first 200 msec of the trial during recording experiments, then the mean N/P threshold ratio would rise from 0.98 to 1.79. This almost certainly represents an upper limit, because the animals probably integrated over >200 msec to reach a decision. Moreover, it is quite possible that the dynamics and/or variability of neuronal responses would change when the animal was forced to perform the task with 200 msec stimulus presentations. Thus neuronal thresholds also might improve during blocks of short viewing duration, a possibility that has not been tested. In any case it is important to note that the most sensitive MT neurons have thresholds comparable to, or better than, behavior at even the shortest integration times (Fig. 7A). Thus if monkeys can base their judgments on the more sensitive neurons (the lower envelope hypothesis; see Parker and Newsome, 1998), the activity of our MT population certainly could support performance at even the shortest viewing durations.
Dependence of psychophysical and neuronal thresholds on stimulus parameters
As seen in Figure 4A, psychophysical thresholds varied over a sixfold range across recording sessions. This large variance is not unusual for this type of study. For example, Britten and colleagues (1992) report psychophysical thresholds varying over a 20-fold range, and Prince and colleagues (2000) report thresholds having a >10-fold range. What accounts for the large range of behavioral thresholds? It cannot be explained completely by random variation in the animal's performance, as shown in Figure 6. Here we examine how behavioral thresholds depend on the parameters of the visual stimulus. We also ask whether neuronal thresholds show a similar dependence on stimulus parameters, and in the next section we consider how neuronal thresholds depend on the functional properties of the recorded neurons.
We first examined how psychophysical thresholds depended on the following parameters of the random dot stimulus: eccentricity, direction of motion, speed of motion, aperture size, the sum of the absolute values of the preferred and null disparities (disparity magnitude), and the asymmetry of the two disparities relative to zero (disparity asymmetry). Disparity asymmetry was calculated according to Equation 10. Psychophysical thresholds were correlated positively with stimulus speed (r = 0.30; p = 0.0022) and were correlated negatively with stimulus diameter (r = −0.34; p < 0.001). Other stimulus parameters had no significant correlation with psychophysical thresholds (p > 0.1). This pattern of results was confirmed in the context of a multiple regression analysis that included all of the above stimulus parameters. Overall, stimulus parameters accounted for 23% of the variance in psychophysical thresholds across experiments.
At first glance it may seem odd that only approximately one-fourth of the variance in psychophysical thresholds was explained by stimulus parameters. However, this is not really unexpected. The correlation between original and retested psychophysical thresholds in Figure 6 (R² = 0.38) indicates that only 38% of the variance in thresholds was not attributable to random behavioral fluctuations. If stimulus variations account for 23% of the variance, this leaves only 15% of variance unexplained. Given that we assessed the effects of stimulus variations by using a simple linear model, some of this discrepancy also could be attributable to nonlinear relationships between stimulus parameters and psychophysical performance. Thus there is little need to invoke other factors to explain the sixfold range of psychophysical thresholds observed in our experiments.
A similar analysis of the dependence on stimulus parameters also was performed with neuronal threshold as the dependent variable. Again, stimulus (i.e., receptive field) size was correlated negatively (r = −0.27; p = 0.0053) with thresholds, but there was no significant effect for stimulus speed or any other variable (p > 0.1). These results were confirmed again by using a multiple regression analysis. Overall, stimulus parameters accounted for 13% of the variance in neuronal thresholds. The common dependence of neuronal and psychophysical thresholds on stimulus size is consistent with the idea that task performance is linked to the activity of MT neurons (see also Celebrini and Newsome, 1994).
The correlation between neuronal and psychophysical thresholds observed in Figure 4A may have been attributable to a common dependence of neuronal and psychophysical thresholds on stimulus parameters. To test this hypothesis, we determined whether the correlation between neuronal and psychophysical thresholds persists after taking into account the variations in stimulus parameters. Recall that variations in stimulus parameters accounted for 23% of the variance in psychophysical thresholds. Adding neuronal threshold into this model explained an additional 7% of the variance (30% total), and the correlation between neuronal and psychophysical thresholds persisted (r = 0.27; p = 0.004). Therefore, day-to-day variations in some factor other than the stimulus (perhaps attention) must have contributed to the weak linkage between neuronal and psychophysical thresholds seen in Figure 4A.
Predicting neuronal thresholds from MT tuning properties
Given that stimulus parameters account for only a modest portion (13%) of the variance in neuronal sensitivity, what explains the ∼10-fold range of neuronal thresholds observed in Figure 4A? Clearly, neuronal thresholds must depend on selectivity for disparity, as we shall examine shortly. But it also is worth asking whether neuronal thresholds are correlated with other tuning properties of MT neurons. This allows us to ask whether there is a specific subtype of MT neuron that is best suited to depth discrimination.
We first probed for correlations between neuronal thresholds and the following tuning properties of MT neurons: preferred direction of motion, direction discrimination index (computed analogously to DDI; see Materials and Methods), preferred speed of motion, speed discrimination index (computed like DDI), optimal stimulus size, and percentage of surround inhibition. Only the direction discrimination index was correlated significantly with neuronal threshold such that poorly directional neurons tended to have higher thresholds (r = −0.27; p = 0.0046). However, this effect became nonsignificant (p > 0.25) when DDI was added to the model, indicating that it was attributable to an intervening effect of DDI. We conclude that neuronal thresholds cannot be predicted from any tuning properties other than disparity selectivity.
It is a priori unclear how well one could expect to predict neuronal sensitivity to low binocular correlations from a disparity tuning curve measured at 100% correlation, nor is it clear what metric of tuning strength ought to be most predictive. We also wanted to know whether neuronal thresholds were related to other aspects of disparity tuning such as disparity preference, tuning width, or tuning shape. We therefore analyzed the relationships between neuronal thresholds and the following metrics of the disparity tuning curve: preferred disparity (both signed and absolute value), disparity tuning index (DTI), disparity discrimination index (DDI), disparity frequency (f in Eq. 5), and phase of the tuning curve (absolute value of Φ in Eq. 5).
Among these parameters, neuronal threshold was weakly correlated with phase (r = 0.25; p = 0.011), weakly anti-correlated with DTI (r = −0.28; p = 0.003), and strongly anti-correlated with DDI (r = −0.64; p ≪ 0.0001); no other correlations were significant (p > 0.2). Moreover, when all of these tuning parameters were inserted into a multiple regression model, DDI was the only variable with a significant partial correlation (r = −0.56; p ≪ 0.0001) to neuronal threshold, and a backward stepwise regression removed all variables from the model except DDI.
The comparison between DTI and DDI is worth examining more closely. Figure 11A shows neuronal thresholds plotted as a function of DTI, which measures the peak-to-trough modulation of the disparity tuning curve relative to the peak response (see Eq. 6). It is striking that there is only a modest correlation here (r = −0.28), considering that this type of modulation index is used widely by sensory physiologists to characterize tuning strength. In contrast, Figure 11B shows that incorporating response variability into the DDI dramatically increases its predictive power (r = −0.64). The strength of this relationship is impressive, given that the disparity tuning curve was measured at a binocular correlation (100%) that was never used in the discrimination task during recording sessions.
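The distinction between a modulation index and a discrimination index can be made concrete with a short Python sketch. The exact definitions used here are given in Eq. 6 and in Materials and Methods, which are not reproduced in this section, so the forms below are assumptions: DTI is taken as the peak-to-trough modulation divided by the peak response above spontaneous activity, and DDI as the same modulation divided by itself plus twice the RMS residual around the mean responses. The function name and the toy responses are hypothetical.

```python
import math

def tuning_indices(responses_by_disparity, spontaneous=0.0):
    """Return (DTI, DDI) for a tuning curve.

    responses_by_disparity: one list of single-trial firing rates per
    tested disparity. The index definitions are assumed forms, not
    necessarily those of Eq. 6 or Materials and Methods.
    """
    means = [sum(r) / len(r) for r in responses_by_disparity]
    r_max, r_min = max(means), min(means)

    # Assumed DTI: modulation relative to the peak response.
    dti = (r_max - r_min) / (r_max - spontaneous)

    # Residual variability around the mean response at each disparity.
    sse = sum((v - m) ** 2
              for r, m in zip(responses_by_disparity, means)
              for v in r)
    n_trials = sum(len(r) for r in responses_by_disparity)
    n_levels = len(responses_by_disparity)
    rms_error = math.sqrt(sse / (n_trials - n_levels))

    # Assumed DDI: modulation relative to modulation plus variability.
    ddi = (r_max - r_min) / (r_max - r_min + 2 * rms_error)
    return dti, ddi

# Two hypothetical neurons with identical mean tuning but different
# trial-to-trial variability.
reliable = [[8, 12], [38, 42]]
variable = [[0, 20], [30, 50]]
dti_a, ddi_a = tuning_indices(reliable)
dti_b, ddi_b = tuning_indices(variable)
```

Under these assumed definitions, both toy neurons have the same DTI, whereas the more variable neuron has a markedly lower DDI. This is the sense in which an index that incorporates response variability can predict discrimination thresholds better than a pure modulation index.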
Discussion
We measured the sensitivity of MT neurons to coarse disparities embedded in noise while monkeys performed a depth discrimination task. Average neuronal and behavioral thresholds were nearly identical, indicating that MT neurons possess sufficient sensitivity to mediate task performance. Together with previous microstimulation results for the same task (DeAngelis et al., 1998), our findings indicate that area MT contributes importantly to depth perception, at least in the case in which there is substantial noise to overcome in establishing binocular correspondence (such as when a predator tracks prey through dense foliage). Thus two of the key links between area MT and motion perception (single unit sensitivity and effects of microstimulation) now have been established for depth perception as well.
Comparison of neuronal and psychophysical thresholds
To compare neuronal and behavioral sensitivity, one must make a handful of choices and assumptions, some of which have been discussed previously (see Britten et al., 1992; Parker and Newsome, 1998; Prince et al., 2000). Like others, we have chosen to tailor our visual stimuli to the preferences of each neuron, and we have used a nonparametric ideal observer model (ROC analysis) to compute neurometric functions. Our neuronal thresholds represent the best that one could perform the task on the basis of the mean firing rates of a single MT neuron (and an assumed anti-neuron with opposite tuning; Britten et al., 1992). Of course, the monkey may not be able to extract signals from MT neurons with the same precision as our ideal observer. On the other hand, the monkey potentially has access to many neurons with similar tuning within MT (Britten et al., 1992; Shadlen et al., 1996). Although it is striking that our average N/P threshold ratio is almost exactly unity, these factors make literal interpretation of the absolute N/P threshold ratio somewhat perilous. Comparisons of N/P threshold ratios across different tasks and different brain areas should be extremely informative, however.
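As an illustration of the nonparametric ideal observer, the area under the ROC curve for a neuron/anti-neuron pair can be computed directly from two sets of spike counts. The Python sketch below is a minimal version under stated assumptions: the function name and the spike counts are hypothetical, and the anti-neuron's responses are taken to be the same neuron's responses to the null disparity. It relies on the standard identity that the ROC area equals the probability that a randomly drawn preferred-stimulus response exceeds a randomly drawn null-stimulus response, with ties counted as one-half.

```python
def roc_area(pref_counts, null_counts):
    """Area under the ROC curve for two response distributions.

    Equals P(pref > null) + 0.5 * P(pref == null), i.e., the expected
    proportion of correct choices by an ideal observer that compares
    the neuron's response with that of its assumed anti-neuron.
    """
    score = 0.0
    for p in pref_counts:
        for q in null_counts:
            if p > q:
                score += 1.0
            elif p == q:
                score += 0.5
    return score / (len(pref_counts) * len(null_counts))

# Hypothetical spike counts at one binocular correlation level.
pref = [14, 17, 12, 20, 16]   # preferred disparity
null = [11, 13, 12, 9, 15]    # null disparity
proportion_correct = roc_area(pref, null)
```

Repeating this calculation at each binocular correlation yields one point per stimulus strength on the neurometric function, from which a neuronal threshold can then be estimated in the same manner as for the psychometric function.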
Along with these issues of population coding, there are other important factors that may affect N/P threshold comparisons. We have addressed three of these factors. First, we showed that stimulus variation (trial-to-trial randomization of noise disparities and dot locations) has similar effects on neuronal and psychophysical thresholds, thus eliminating any substantial impact on N/P threshold ratios. Second, we showed that retesting our monkeys' behavior outside of the recording experiments produced little change in psychophysical thresholds, indicating that N/P ratios were not reduced artificially by poor behavioral performance during recording sessions. Finally, we showed that integration time generally had modest effects on neuronal thresholds and that viewing duration had little effect on psychophysical thresholds. Assuming that monkeys reached a decision within the first 200 msec of the 1500 msec stimulus presentations used in recording sessions (which seems unlikely), our average N/P threshold ratio rises to 1.79.
We therefore have a high degree of confidence that our mean N/P threshold ratio lies within the range from 0.98 to 1.79 and that many MT neurons have sensitivity equal to, or better than, the monkey.
Comparison to previous studies
Five previous studies have made direct simultaneous comparisons of behavioral and neuronal sensitivity (Britten et al., 1992; Celebrini and Newsome, 1994; Croner and Albright, 1999; Prince et al., 2000; Cook and Maunsell, 2002). Given that our average N/P threshold ratio is the lowest among these studies, we now consider some of the differences among them.
Three previous studies characterized the sensitivity of MT/MST neurons during performance of a direction discrimination task, reporting average N/P threshold ratios of 1.19 (MT; Britten et al., 1992), 1.06 (MST; Celebrini and Newsome, 1994), and 1.50 (MT; Croner and Albright, 1999). These figures were based on the same neuron/anti-neuron ideal observer model that we have used. All three studies used the same basic inclusion criterion for admitting neurons to the study: neurons had to exhibit no overlap between response distributions measured at the preferred and null directions of motion, using 100% coherent stimuli. It is somewhat unclear how many neurons were excluded by this criterion. Britten and colleagues (1992) state that “very few neurons were eliminated because of it,” whereas Croner and Albright (1999) report that 40% of MT neurons were excluded. In any case we excluded only 6% of neurons by a tuning strength criterion, so these differences cannot explain our lower N/P threshold ratio. Apart from the different behavioral tasks used, the most likely explanation is that our stimuli were better tailored to neuronal preferences. All three of the direction discrimination studies used stimuli at zero disparity; thus most MT and MST neurons were not tested at their optimal disparities. In addition, speed preferences were assessed qualitatively (Britten et al., 1992; Celebrini and Newsome, 1994) or very coarsely (Croner and Albright, 1999). In contrast, we used quantitative direction, speed, size, and disparity tuning curves (on-line) to tailor stimulus properties to the preferences of each neuron.
A recent study by Cook and Maunsell (2002) reports that MT neurons are fourfold less sensitive than monkeys in a motion detection task. For at least two reasons this result is not as different from ours as it might seem. First, Cook and Maunsell used a single neuron model to calculate neuronal sensitivity. With an anti-neuron model the neurons would be more sensitive by a factor of up to 2. Second, Cook and Maunsell used a detection task, and in determining neuronal sensitivity, they used a 100 msec time window and a correct–rejection criterion. This ideal observer model is quite different from ours so the thresholds are not directly comparable, but the detection thresholds are likely to be higher. Accounting for these two differences, it seems likely that Cook and Maunsell's N/P threshold ratio would be closer to 2, and this is what we would expect by extrapolating our data (Fig. 8) down to an integration time of 100 msec.
In the only other published study of disparity discrimination, Prince and colleagues (2000) reported an average N/P threshold ratio of 4.11 for a population of 41 V1 neurons (after correction for psychophysical retesting). This was calculated by using an ortho-neuron ideal observer. With an anti-neuron model their N/P threshold ratio drops to 1.97, closer to that of the present study. It is important to note, however, that Prince and colleagues (2000) excluded 140 of 232 neurons (60%) from their study because of weak disparity selectivity. If neuronal thresholds could have been measured for all neurons, the average N/P threshold ratio in V1 would be much larger. Thus it seems clear that MT neurons are more sensitive (relative to the monkey) in our task than V1 neurons were in the task used by Prince et al. (2000). This does not imply necessarily that MT neurons are better suited to disparity processing than V1 neurons because of important differences between tasks in the two studies. Our task involves discriminating between two coarse disparities in the presence of noise, whereas the stereoacuity task of Prince et al. (2000) involves discriminating fine differences in relative disparity between two stimuli in the absence of disparity noise. The results of the present experiments certainly do not guarantee that MT neurons would be as sensitive as the monkey in other types of stereo tasks, and we currently are testing MT neurons in a stereoacuity task to allow for a more direct comparison between V1 and MT.
Effects of integration time
A striking finding is the existence of MT neurons with neuronal thresholds that have little dependence on integration time (Fig. 7B). This clear departure from Poisson predictions (Fig. 7C) results mainly from reduced variability (VMR) in the early portion of the MT response and from positive correlations between the number of spikes that occur in neighboring time bins (noise correlation). In addition, the slope of individual integration time curves also depends on temporal variations in the differential (preferred–null) disparity signal. These properties allow MT neurons to achieve near-maximal sensitivity within a brief period after the appearance of a stimulus and may serve to facilitate information processing during natural vision in which saccadic eye movements occur every few hundred milliseconds.
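The way in which correlations between time bins flatten an integration time curve can be illustrated with a deliberately simplified calculation. The Python sketch below is not the model used in this study. It assumes a constant differential signal per bin (delta), a constant SD per bin (sigma), and a single correlation coefficient (rho) shared by all pairs of bins rather than only neighboring ones. The function name and all parameter values are hypothetical.

```python
import math

def dprime(n_bins, delta=1.0, sigma=1.0, rho=0.0):
    """Discriminability of spike counts summed over n_bins time bins.

    Signal grows linearly with the number of bins, whereas the variance
    of the summed count is n * sigma^2 * (1 + (n - 1) * rho) when every
    pair of bins shares the same correlation rho (a simplification).
    """
    signal = n_bins * delta
    variance = n_bins * sigma ** 2 * (1 + (n_bins - 1) * rho)
    return signal / math.sqrt(variance)

# Independent bins: sensitivity grows as the square root of the
# integration time, as expected for independent (e.g., Poisson) counts.
gain_independent = dprime(16) / dprime(1)                  # 4.0

# Correlated bins: sensitivity saturates near 1 / sqrt(rho) times the
# single-bin value, so longer integration yields little improvement.
gain_correlated = dprime(16, rho=0.5) / dprime(1, rho=0.5)
```

With independent bins, a 16-fold increase in integration time improves discriminability fourfold, whereas with rho = 0.5 the improvement is less than 1.4-fold. If thresholds are taken to vary inversely with discriminability (a further assumption), the first case corresponds to a steep integration time curve and the second to a nearly flat one.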
Müller and colleagues (2001) recently have reported similar phenomena in V1 of anesthetized monkeys. Responses of V1 neurons to sinusoidal gratings were found to have high response gain and low variability in the first 100–200 msec of the response, allowing for improved detectability and discriminability by the neurons. The similarity between their findings and ours suggests that this may be a general coding strategy used in visual cortex.
We found little effect of viewing duration on psychophysical thresholds for depth discrimination (Fig. 10), whereas Britten and colleagues (1992) and Gold and Shadlen (2000) have reported strong effects of viewing duration on direction discrimination thresholds. This discrepancy may be attributable to important differences between the visual stimuli used in the two tasks. In the variable-coherence motion stimulus, the identity of a dot as signal or noise typically changes every few video frames. As a result, the integrated net motion of the noise dots varies over time and averages toward zero with longer stimulus durations. In contrast, the distribution of noise disparities was fixed throughout a given trial in our depth discrimination task, and any net disparity of the noise dots (by chance) persisted throughout the trial. Combined with the fact that signal and noise dots were updated every 45 or 50 msec in the motion tasks and every 20 msec in our task, this means that the disparity signal evolves rapidly and remains constant in our stimulus, whereas the direction signal evolves more gradually and fluctuates considerably in a low-coherence motion stimulus. This should allow greater improvement in psychophysical thresholds with increasing viewing duration for the direction discrimination task than for the depth discrimination task.
We cannot be certain how the persistence of noise dots in our stimuli affects the relative contributions of signal, VMR, and noise correlation to flat integration time curves (Fig. 9). However, the effect of noise correlation may be most stimulus-dependent. We found larger noise correlations in the VAR condition than in the NOVAR condition, resulting from trial-to-trial variations in the mean disparity of noise dots in the VAR condition. Because this stimulus-related component of noise correlation would not affect the monkey's read out of population activity in an individual trial, we might underestimate the slope of the neuronal integration time curves. This point highlights the importance of distinguishing between sources of noise in the stimulus and sources of noise intrinsic to the neurons, when neuronal and psychophysical sensitivity are evaluated (see also Barlow and Tripathy, 1997).
Footnotes
This work was supported by the National Eye Institute (EY-013644) and by a Career Award in the Biomedical Sciences (to G.C.D.) from the Burroughs Wellcome Fund. T.U. was supported by the Japan Society for the Promotion of Science Research Fellowship for Young Scientists and a long-term fellowship from the Human Frontier Science Program. We thank Amy Wickholm and Heidi Loschen for excellent technical support and monkey training. We are grateful to Ken Britten, Erik Cook, Bruce Cumming, Kristine Krug, Jing Liu, John Maunsell, and Andrew Parker for valuable comments on this manuscript.
Correspondence should be addressed to Gregory C. DeAngelis, Department of Anatomy and Neurobiology, Washington University School of Medicine, Box 8108, 660 South Euclid Avenue, St. Louis, MO 63110. E-mail: gregd@cabernet.wustl.edu.