Abstract
Primates are capable of discriminating depth with remarkable precision using binocular disparity. Neurons in area V4 are selective for relative disparity, which is the crucial visual cue for discrimination of fine disparity. Here, we investigated the contribution of V4 neurons to fine disparity discrimination. Monkeys discriminated whether the center disk of a dynamic random-dot stereogram was in front of or behind its surrounding annulus. We first behaviorally tested the reference frame of the disparity representation used for performing this task. After learning the task with a set of surround disparities, the monkey generalized its responses to untrained surround disparities, indicating that the perceptual decisions were generated from a disparity representation in a relative frame of reference. We then recorded single-unit responses from V4 while the monkeys performed the task. On average, neuronal thresholds were higher than the behavioral thresholds. The most sensitive neurons reached thresholds as low as the psychophysical thresholds. For subthreshold disparities, the monkeys made frequent errors. The variable decisions were predictable from the fluctuation in the neuronal responses. The predictions were based on a decision model in which each V4 neuron transmits the evidence for the disparity it prefers. We finally altered the disparity representation artificially by means of microstimulation to V4. The decisions were systematically biased when microstimulation boosted the V4 responses. The bias was toward the direction predicted from the decision model. We suggest that disparity signals carried by V4 neurons underlie precise discrimination of fine stereoscopic depth.
Introduction
The precise detection and discrimination of binocular disparity is a hallmark of stereovision. Humans, as well as monkeys, discriminate disparity as small as a few seconds of arc (∼0.001°) when a reference is present (Sarmiento, 1975; Westheimer, 1979; McKee, 1983; Kumar and Glaser, 1992). Without the reference, psychometric thresholds increase by an order of magnitude (Westheimer, 1979; Kumar and Glaser, 1992; Prince et al., 2000). Thus, fine stereoscopic depth perception primarily relies on the relative disparity between adjacent visual features, rather than the absolute disparity of a single feature. This property of stereoscopic perception suggests that the neural representation underlying fine stereoscopic depth is also in a relative frame of reference.
The primary visual cortex (V1) and the middle temporal area (MT) are unlikely to play a direct role in fine stereoscopic depth perception because neurons in these areas do not signal the relative disparity between adjacent features (Cumming and Parker, 1999; Uka and DeAngelis, 2006). Microstimulation and inactivation have not provided a causal link between neural activity in MT and discrimination of fine disparity (Uka and DeAngelis, 2006; Chowdhury and DeAngelis, 2008). In contrast, the inferior temporal cortex (IT) is related to fine disparity discrimination because trial-by-trial fluctuations of neuronal responses are correlated with perceptual decisions about subthreshold disparity (Uka et al., 2005). But the decision-related activity emerged substantially late (360 ms) after the stimulus onset, suggesting that the decision-related activity reflects modulation in the top-down direction, e.g., decision signals modulating sensory signals, rather than sensory signals driving decisions in the bottom-up direction. Other areas may feed the bottom-up signals to the decision-making processes.
We conducted a series of experiments to examine the contribution of area V4 to fine disparity discrimination. Because many V4 neurons encode relative disparity (Umeda et al., 2007), area V4 is a possible site for the representation of fine stereoscopic depth perception. We trained monkeys to discriminate the disparity of the center with respect to the surround of a random-dot stereogram. We first examined whether a monkey could generalize disparity discrimination to surround disparities that were never used in the prior training. Discrimination performance for untrained surround disparities was as good as that for trained surround disparities, confirming the use of the disparity representation in a relative frame of reference. We then recorded single-unit activity in V4 during the discrimination task. An ideal observer (Geisler, 2003) of typical V4 neurons was less sensitive than the subjects, but an ideal observer of the most sensitive V4 neurons was as sensitive as the subjects. Responses of V4 neurons were predictive of forthcoming perceptual decisions. The decision-related activity appeared soon after the stimulus onset. In another set of experiments, we applied electrical microstimulation to V4. Perceptual decisions were biased by microstimulation toward the direction of the disparity preference of the stimulated neurons. These results suggest that V4 neurons provide disparity signals to decision-making mechanisms when subjects judge a small difference in binocular disparity between two regions of a visual stimulus.
Materials and Methods
Subjects and surgery
We performed experiments on two adult male Japanese monkeys (Macaca fuscata) weighing 7.8 and 8.5 kg. Each monkey was implanted with a head-holding device, a recording chamber, and scleral search coils in both eyes under sterile conditions. The center of the recording chamber was placed over the left hemisphere 25 mm dorsal and 5 mm posterior to the ear canal. General anesthesia was maintained by intravenous infusion of sodium pentobarbital (Nembutal, 17 mg/kg, i.v., Dainippon Sumitomo Pharma) or by inhalation of a mixture of isoflurane (Forane, 0.3–2.0%, Abbott), nitrous oxide (66%), and oxygen (33%). Additional details of surgical procedures can be found in Kumano et al. (2008). All animal care, surgical, and experimental procedures conformed to the National Institutes of Health Guide for the Care and Use of Laboratory Animals and were approved by the Animal Experiment Committee of Osaka University.
Visual stimuli
The monkey viewed visual stimuli presented on one of the three CRT displays: MacNaughton NuVision 21MX, Sony GDM-FW900, or Iiyama HM204DA. The display subtended a visual angle of 40° × 30° (NuVision 21MX; HM204DA) or 48° × 30° (GDM-FW900) at a viewing distance of 57 cm. Images for the left and right eyes were presented alternately at a refresh rate of 120 Hz (NuVision 21MX) or 100 Hz (GDM-FW900; HM204DA). Dichoptic viewing was achieved by using either a liquid crystal polarizing filter (for NuVision 21MX) or ferroelectric liquid crystal shutters (for GDM-FW900 and HM204DA).
All visual stimuli used in this study were dynamic random-dot stereograms (RDSs). Each dot pattern consisted of a center circular patch and a surrounding annulus. The binocular disparity of the center patch, or the center disparity, and the disparity of the surrounding annulus, or the surround disparity, were assigned independently (see below). The size of the center patch was matched to the classical receptive field of the neuron under study. The outer diameter of the surrounding annulus was 2° larger than that of the center patch. RDSs were composed of the same number of bright (3.6 cd/m², NuVision 21MX; 4.6 cd/m², GDM-FW900; 0.82 cd/m², HM204DA) and dark (0.5 cd/m², NuVision 21MX; 0.3 cd/m², GDM-FW900; 0.03 cd/m², HM204DA) dots presented against a midlevel gray background (2.0 cd/m², NuVision 21MX; 2.5 cd/m², GDM-FW900; 0.42 cd/m², HM204DA). In all the luminance measurements, we placed either the polarizing filters (for NuVision 21MX) or the liquid crystal shutter (for GDM-FW900 and HM204DA) between the display screen and the photometer. Dot density was 26% at the dot size of 0.17° × 0.35° (NuVision 21MX), 0.15° × 0.15° (GDM-FW900), or 0.17° × 0.17° (HM204DA). A new dot pattern was generated at a rate of 10 Hz.
Images were anti-aliased automatically by the OpenGL graphic board (Oxygen GVX1 or Wildcat VP870, 3Dlabs). Only the red phosphor was used for visual stimulation. This phosphor has the shortest decay time among the three phosphors in CRT displays, so turning off the other phosphors minimized interocular cross talk. For dichoptic presentation with a liquid crystal polarizing filter, interocular cross talk from the left eye to the right eye was 10%, whereas there was no measurable cross talk from the right eye to the left eye (Tanabe et al., 2004). We did not detect any interocular cross talk in dichoptic presentation with ferroelectric liquid crystal shutters.
Fine disparity discrimination task
The monkeys were trained to perform a fine disparity discrimination task (Fig. 1A). They were required to report whether the center patch of an RDS was nearer or farther than the surrounding annulus. At the beginning of a trial, a fixation point appeared at the center of the display. An RDS was then presented for 1.5 s over the classical receptive field of the neuron under study. Half a second after the offset of the RDS, the fixation point disappeared and two targets appeared 5° above and below the fixation point. The monkeys indicated their judgment by making an eye movement toward one of the two targets (the upper and lower targets for “far” and “near” choices, respectively). The monkeys were required to maintain fixation within an invisible window of 1.4° × 1.4° (Monkey R) or 2.0° × 2.0° (Monkey K) from the onset of the RDS to the offset of the fixation point. During the fixation, the vergence angle was not allowed to exceed ±0.4° (Monkey R) or ±0.5° (Monkey K) of the display plane. A trial was aborted if the monkey broke these fixation criteria. Task difficulty was manipulated by varying the magnitude of the center disparity across trials. In single-unit recording sessions, the center disparity was one of 13 values (0, ±0.005, ±0.01, ±0.02, ±0.05, ±0.1, or ±0.2°) or 15 values (0, ±0.002, ±0.005, ±0.01, ±0.02, ±0.05, ±0.1, or ±0.2°). In microstimulation sessions, the center disparity was one of 11 values (0, ±0.005, ±0.01, ±0.02, ±0.05, or ±0.1°). The surround disparity was maintained at 0° except in training sessions and generalization test sessions (see below). A liquid reward (water or juice) was delivered for correctly choosing the bottom target for a near center and the top target for a far center. When the center disparity was equal to the surround disparity, the monkey was randomly rewarded in 50% of the completed trials.
The monkeys were initially trained to discriminate center disparities of ±0.2°. After they learned to discriminate these disparities, smaller disparities were applied to the center patch while keeping the surround disparity at 0°. We used the method of constant stimuli for Monkey R and a staircase procedure (Uka and DeAngelis, 2003) for Monkey K for discrimination training. The training with these methods continued for several months until psychophysical thresholds reached asymptotic values (∼0.01–0.03° at eccentricities of 5–10°).
Monkey R was subsequently trained with nonzero surround disparities (first ±0.4° and then ±0.2°). We rewarded the monkey for making perceptual decisions based on the relative disparity between the center patch and the surrounding annulus. During the course of discrimination training, we tested whether Monkey R could discriminate relative disparities with surround disparities that had never been used in the prior training (generalization test sessions). Monkey R was engaged in the generalization test session twice: once after having been trained with surround disparities of ±0.4° and once after having been trained with surround disparities of ±0.2°. In the generalization test session, Monkey R was presented with a surrounding annulus at one of the five disparity values (the first session, 0, ±0.2, or ±0.4°; the second session, 0, ±0.1, or ±0.2°), two of which (the first session, ±0.2°; the second session, ±0.1°) had not been used in the prior training. The center patches of RDSs were 5° in diameter and centered at an eccentricity of 7.8°. The center disparity was chosen so as to achieve one of 13 levels of relative disparities that ranged between −0.2 and 0.2°. Trials with untrained surround disparities were randomly interleaved with those with trained surround disparities. Rewards were delivered on a random half of the trials of untrained surround disparities, whereas the monkey was rewarded for a correct choice in trials of trained surround disparities. In the first generalization test session, the monkey performed 30 trials for each untrained surround disparity and 180 trials for each trained surround disparity. In the second generalization test session, the monkey performed 50 and 300 trials for each of the untrained and trained surround disparities, respectively.
Electrophysiological recording
Custom-made tungsten-in-glass microelectrodes (0.2–1.8 MΩ at 1 kHz) were advanced transdurally into the cortex using a micromanipulator (MO-95S or MO-951, Narishige) attached to the recording chamber. Extracellular voltage signals were amplified (MDA-4I, Bak Electronics; a custom-made amplifier) and filtered (Multifunction Filter 3611, NF Corporation; a custom-made filter) for measurements of single-unit or multiunit activity. Action potentials from a single neuron were isolated on-line using either a template-matching system (Multi Spike Detector, Alpha-Omega Engineering) or a window discriminator (DDIS-1, Bak Electronics; a custom-made window discriminator). Multiunit activity was detected using a window discriminator. The position of both eyes was monitored with a magnetic search coil system (MEL-25; Enzanshi Kogyo) in all but three of the single-unit recording sessions (85 of 88) and a subset of the microstimulation sessions (9 of 32). In the remaining sessions, the position of one eye was monitored. Behavioral tasks and data collection were controlled by commercial software (TEMPO, Reflective Computing). Times of action potential occurrence, eye positions, and task events were stored on a hard disc drive with a time resolution of 1 ms. We identified area V4 based on the estimated locations of the lunate and superior temporal sulci as well as the relationship between receptive field properties (size and center position) and the location of electrode penetration, as in our previous studies (for details, see Watanabe et al., 2002).
Experimental protocol
Single-unit recording sessions.
After isolating activity of a single V4 neuron, we manually determined its classical receptive field by using, in most cases, a small RDS patch (∼2.0° in diameter). In the cases in which RDSs did not elicit reliable visual responses, we used gratings, bars, or spots. We next measured disparity tuning of the neuron by presenting RDSs for 1 s while the monkey performed a fixation task. For this measurement, the center disparity was varied from −1.6 to 1.6° at steps of 0.4° across trials, whereas the surround disparity was always 0°. Each stimulus condition was repeated 10 times. If the neuron was significantly selective for center disparity (Kruskal–Wallis test, p < 0.05), we recorded its activity during performance of the fine disparity discrimination task as long as single-unit isolation was maintained. For analysis, we only included sessions in which each stimulus condition was repeated at least 10 times in the fine disparity discrimination task. On average, we collected 35 trials per stimulus condition for each neuron.
Microstimulation sessions.
We performed microstimulation experiments on Monkey R. Microstimulation sites were selected so as to activate neurons with similar disparity selectivity around zero disparity. The receptive field of multiunit activity was determined as in the case of single-unit recording sessions. The size of the center of RDSs was matched to the receptive field of the multiunit activity. We estimated the disparity-tuning curve of multiunit activity around zero disparity (±0.1° and ±0.4° in the center patch; 0° in the surrounding annulus; 2–10 repetitions per stimulus condition) at steps of 100 μm along electrode penetrations. The electrode was placed in the middle of a penetration length of 200–300 μm, in which the disparity-tuning curves around zero disparity were qualitatively similar. We then measured the overall shape of the disparity-tuning curve at the site (0, ±0.1, ±0.4, ±0.8, ±1.2, or ±1.6° in the center patch; 0° in the surrounding annulus; 10 repetitions per stimulus condition). The monkey then performed the fine disparity discrimination task. We applied electrical microstimulation during the presentation of visual stimuli on a random half of the trials. Each condition was repeated 40 times, resulting in 880 trials in total for each stimulation site (11 disparity levels × 40 trials, with and without microstimulation). Correct disparity discrimination was rewarded in trials both with and without microstimulation. A pulse generator (BPG-1, Bak Electronics) and a stimulus isolator (BSI-1, Bak Electronics) were used to generate biphasic current pulses at 200 Hz. Each biphasic pulse was composed of a 0.2 ms cathodal current followed by a 0.2 ms anodal current with an interpulse interval of 0.1 ms (Salzman et al., 1992). We used currents with an amplitude of 40 μA because, in preliminary experiments, currents of this amplitude elicited perceptual biases most reliably among the four levels of amplitudes (10, 20, 40, and 60 μA). 
At two of 33 microstimulation sites, we could not complete the experiments because microstimulation of 40 μA at these sites often elicited large eye movements, resulting in a fixation break.
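The pulse parameters above can be turned into a timing schedule with a few lines of arithmetic. The following is a minimal sketch, not the configuration of the actual pulse generator: the function name `pulse_train` and its arguments are hypothetical, and it assumes that stimulation spanned the full 1.5 s of RDS presentation, which the text does not state explicitly.

```python
def pulse_train(duration_s=1.5, rate_hz=200.0, phase_ms=0.2, gap_ms=0.1, amp_ua=40.0):
    """Return onset times of each phase and the charge per phase.

    Assumption: pulses are delivered for the whole stimulus duration.
    """
    n_pulses = int(round(duration_s * rate_hz))
    period_ms = 1000.0 / rate_hz
    pulses = []
    for i in range(n_pulses):
        t0 = i * period_ms
        pulses.append({
            "cathodal_on_ms": t0,                      # leading cathodal phase
            "anodal_on_ms": t0 + phase_ms + gap_ms,    # anodal phase after the gap
        })
    charge_nc = amp_ua * phase_ms  # microampere x millisecond = nanocoulomb
    return pulses, charge_nc


pulses, charge_nc = pulse_train()
print(len(pulses), "pulses,", charge_nc, "nC per phase")
```

Under these assumptions, each biphasic pulse occupies 0.5 ms of the 5 ms period between pulse onsets.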
Data analysis
For both single-unit and multiunit activities, we defined the magnitude of responses to visual stimuli as the mean firing rates during the period from 80 ms after the onset to 80 ms after the offset of RDS presentation.
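As an illustration of this response measure, the sketch below counts spikes in the window that starts 80 ms after stimulus onset and ends 80 ms after stimulus offset. The function name `mean_rate` is hypothetical, and treating the window as half-open (start included, end excluded) is an assumption.

```python
def mean_rate(spike_times_ms, onset_ms, offset_ms, latency_ms=80.0):
    """Mean firing rate (spikes/s) in the latency-shifted stimulus window."""
    start = onset_ms + latency_ms
    stop = offset_ms + latency_ms
    n_spikes = sum(1 for t in spike_times_ms if start <= t < stop)
    return n_spikes / ((stop - start) / 1000.0)


# A 1.5 s stimulus starting at t = 0 gives a window of 80 to 1580 ms.
print(mean_rate([50, 100, 500, 1570, 1590], onset_ms=0, offset_ms=1500))
```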
Neuronal and psychophysical thresholds.
To quantify the sensitivity of V4 neurons to fine disparity, we calculated neurometric functions using receiver operating characteristics (ROC) analysis (Britten et al., 1992). First, we determined the neuron's disparity preference from the linear regression of the response recorded during the discrimination task as a function of signed disparity. Negative and positive slopes were classified into “near-preferring” and “far-preferring,” respectively. Second, we used the disparity preference to split the response as a function of signed disparity into a pair of neuron–antineuron responses as a function of unsigned disparity (Britten et al., 1992). An antineuron is a hypothetical neuron whose disparity-tuning curve is the inversion of the disparity-tuning curve of the recorded neuron along the axis of zero disparity. Other response properties are the same between the recorded neuron and its antineuron. We computed the ROC value (the area under the ROC curve) (Green and Swets, 1966) for each unsigned disparity of the neuron–antineuron responses. The ROC value corresponds to the proportion of correct choices of an ideal observer that makes a decision based on responses of the recorded neuron and the antineuron. Finally, we plotted the ROC values as a function of signed disparity (a neurometric function) and fitted them with a cumulative Gaussian function using the maximum likelihood method. The neuronal threshold was defined as the standard deviation of the fitted cumulative Gaussian. The threshold corresponds to the performance of 84% correct by an ideal observer. For neurometric functions, the mean of the fitted cumulative Gaussian was inevitably zero because the ROC values were symmetric about zero disparity. The neuronal sensitivity was directly compared with the behavioral sensitivity of the subject on the same set of trials. 
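The neuron–antineuron computation can be sketched as follows. This is an illustrative implementation under stated assumptions, not the analysis code used in the study: the names `roc_area` and `neurometric` and the dictionary layout are hypothetical, and the area under the ROC curve is computed through its equivalence with the probability that a randomly drawn preferred-disparity response exceeds a randomly drawn null-disparity response (ties counted as one-half).

```python
def roc_area(pref, null):
    """Area under the ROC curve for two lists of firing rates."""
    wins = 0.0
    for p in pref:
        for n in null:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pref) * len(null))


def neurometric(responses, prefers_far):
    """responses maps signed disparity (deg) to a list of firing rates.

    The antineuron's response to a disparity is taken to be the recorded
    neuron's response to the disparity of opposite sign.
    """
    roc = {}
    for d in sorted({abs(k) for k in responses if k != 0}):
        far, near = responses[d], responses[-d]
        pref, null = (far, near) if prefers_far else (near, far)
        roc[d] = roc_area(pref, null)
    return roc


# Hypothetical near-preferring neuron tested at ±0.02°.
example = {-0.02: [30, 35, 40], 0.02: [20, 25, 30]}
print(neurometric(example, prefers_far=False))
```

The resulting values, one per unsigned disparity, are the points to which the cumulative Gaussian would then be fitted.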
We quantified the behavioral sensitivity using a psychometric function in which the proportion of far choices of the subject was plotted as a function of signed disparity. This function was fitted with a cumulative Gaussian function. The psychophysical threshold was defined as the fitted standard deviation. The mean of the fitted cumulative Gaussian gives the point of subjective equality, that is, the center disparity at which the subject would choose near or far with equal probability.
Subjects typically make errors at a small rate even when the sensory signal is well above threshold. These errors complicate the interpretation of the psychophysical threshold, because it is difficult to identify the cause of those errors. The errors may be a reflection of the noise in the sensory system, in which case the standard fit should suffice. Another possible source of noise is outside of the sensory system. A lapse in the subject's attentive state is one example of such noise (Wichmann and Hill, 2001). In the latter case, the noise needs to be modeled at two different stages; one in conjunction with the sensory signal, and another in conjunction with the decision variable. We addressed the latter possibility by factoring out the second source of noise from the thresholds of the generalization test. We did this by explicitly fitting an additional model in which the lapse rate was a free parameter.
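The two psychometric models can be written compactly. The sketch below is an assumption-laden illustration: the parameterization with a single symmetric lapse rate, P(far) = λ + (1 − 2λ)Φ((d − μ)/σ), is one common choice and is not confirmed by the text, and the coarse grid search stands in for whatever maximum likelihood optimizer was actually used. The function names are hypothetical.

```python
import math


def p_far(d, mu, sigma, lapse=0.0):
    """Cumulative Gaussian with an optional symmetric lapse rate."""
    phi = 0.5 * (1.0 + math.erf((d - mu) / (sigma * math.sqrt(2.0))))
    return lapse + (1.0 - 2.0 * lapse) * phi


def neg_log_likelihood(params, data):
    """data is a list of (disparity, n_far_choices, n_trials)."""
    mu, sigma, lapse = params
    nll = 0.0
    for d, n_far, n_total in data:
        # Clip to avoid log(0) at extreme disparities.
        p = min(max(p_far(d, mu, sigma, lapse), 1e-9), 1.0 - 1e-9)
        nll -= n_far * math.log(p) + (n_total - n_far) * math.log(1.0 - p)
    return nll


def fit_grid(data, mus, sigmas, lapses):
    """Maximum likelihood over a grid; returns (mu, sigma, lapse)."""
    best = min((neg_log_likelihood((m, s, l), data), m, s, l)
               for m in mus for s in sigmas for l in lapses)
    return best[1:]
```

Setting `lapses=[0.0]` recovers the standard fit, in which σ is the psychophysical threshold and μ is the point of subjective equality.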
Choice probability.
ROC analysis was also used to quantify how predictive neuronal responses were of the perceptual decision. Responses as a function of disparity were split by the succeeding decision, followed by calculation of an ROC value for each disparity. The ROC values calculated in this way are called choice probabilities and represent the probability that an ideal observer correctly predicts the subject's choice by monitoring the neuronal response (Britten et al., 1996). Choice probabilities were calculated only for stimulus conditions in which the monkey chose one target in no more than 75% of the trials. These stimulus conditions were also used to calculate a single grand choice probability for each neuron (Britten et al., 1996). Neuronal responses for each disparity were transformed into z-scores and then combined, followed by calculation of an ROC value.
We performed a permutation test to assess the statistical significance of the grand choice probability. For each disparity, we randomly reassigned neuronal responses to choices without replacement. A grand choice probability was then calculated from the permuted data. The null distribution of grand choice probabilities was obtained with 2000 permutations.
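The grand choice probability and its permutation test can be sketched as below. This is a minimal illustration rather than the original analysis code: the function names and trial layout are hypothetical, the population standard deviation is used for z-scoring, and the two-sided comparison against 0.5 is an assumption. The sketch also assumes that at least one trial of each choice survives the 75% criterion.

```python
import random
import statistics


def roc_area(pref, null):
    """Probability that a preferred-choice response exceeds a null-choice one."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pref for n in null)
    return wins / (len(pref) * len(null))


def grand_cp(trials):
    """trials is a list of (disparity, firing_rate, chose_preferred)."""
    by_disp = {}
    for d, rate, chose_pref in trials:
        by_disp.setdefault(d, []).append((rate, chose_pref))
    pref_z, null_z = [], []
    for items in by_disp.values():
        rates = [r for r, _ in items]
        frac_pref = sum(1 for _, c in items if c) / len(items)
        sd = statistics.pstdev(rates)
        # Keep only conditions in which neither choice exceeded 75% of trials.
        if not 0.25 <= frac_pref <= 0.75 or sd == 0:
            continue
        mean = statistics.mean(rates)
        for r, c in items:
            (pref_z if c else null_z).append((r - mean) / sd)
    return roc_area(pref_z, null_z)


def permutation_p(trials, n_perm=2000, seed=0):
    """Shuffle choices within each disparity and recompute the grand CP."""
    rng = random.Random(seed)
    observed = grand_cp(trials)
    extreme = 0
    for _ in range(n_perm):
        shuffled = []
        for d in {t[0] for t in trials}:
            rates = [r for dd, r, _ in trials if dd == d]
            choices = [c for dd, _, c in trials if dd == d]
            rng.shuffle(choices)
            shuffled.extend((d, r, c) for r, c in zip(rates, choices))
        if abs(grand_cp(shuffled) - 0.5) >= abs(observed - 0.5):
            extreme += 1
    return observed, extreme / n_perm


# Hypothetical data: two disparities, four trials each.
choices = [False, False, True, True]
trials = ([(0.0, r, c) for r, c in zip([10, 12, 14, 16], choices)]
          + [(0.01, r, c) for r, c in zip([20, 22, 24, 26], choices)])
```

Shuffling within each disparity preserves the proportion of choices per condition, so the same conditions pass the inclusion criterion in every permutation.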
Microstimulation effects.
We used logistic regression to measure the effect of microstimulation (Salzman et al., 1992). For each microstimulation site, we fitted the proportion of far choices, P_far, with a logistic function, as follows:

$$P_{\mathrm{far}} = \frac{1}{1 + \exp\left[-\left(\beta_0 + \beta_1 d + \beta_2 s + \beta_3 d s\right)\right]}$$

where d is the center disparity, and s represents the presence (1) or absence (0) of microstimulation. The parameters β0, β1, β2, and β3 were fitted using the maximum likelihood method. We considered microstimulation to have created a choice bias if β2 was significantly different from zero (Wald test, p < 0.05) (Hosmer and Lemeshow, 1989). The psychophysical threshold was defined as the disparity that is needed to increase the proportion of far choices from 0.5 to 0.84 in trials without microstimulation.
We computed an equivalent center disparity, defined as the horizontal shift of the center (50% far choices) of the psychometric function between trials with and without microstimulation. The equivalent center disparity is a signed value because the expected direction of choice bias depends on the site's disparity preference. For example, microstimulation of a site with near-preferring neurons should bias the subject's decisions toward near choices rather than far choices. To determine the site's disparity preference, we used the multiunit activity recorded during the disparity-tuning curve measurement and focused on the trials in which the center disparity was within the range of the microstimulation test (0 and ±0.1°). A linear regression summarized the multiunit response as a function of disparity. The sign of the slope was taken as the disparity preference of the multiunit activity.
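The horizontal shift can be computed directly from fitted logistic coefficients. The sketch below rests on two assumptions that are not confirmed by the text: that the logistic model has the form P(far) = 1 / (1 + exp(−(β0 + β1·d + β2·s + β3·d·s))), and that the equivalent center disparity is signed so that positive values indicate a bias toward the site's preferred disparity. The function names are hypothetical.

```python
import math


def p_far(d, s, b0, b1, b2, b3):
    """Assumed logistic model: d is center disparity, s is 1 with stimulation."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * d + b2 * s + b3 * d * s)))


def equivalent_disparity(b0, b1, b2, b3, prefers_far):
    """Shift of the 50% point, signed relative to the site's preference."""
    pse_off = -b0 / b1                 # 50% point without microstimulation
    pse_on = -(b0 + b2) / (b1 + b3)    # 50% point with microstimulation
    # A bias toward far choices moves the 50% point toward near (negative)
    # disparities, so the disparity effectively added is pse_off - pse_on.
    added = pse_off - pse_on
    return added if prefers_far else -added


# Hypothetical coefficients: stimulation adds the equivalent of 0.01° far.
print(equivalent_disparity(0.0, 100.0, 1.0, 0.0, prefers_far=True))
```

With these hypothetical coefficients, the stimulated psychometric function crosses 50% at −0.01°, which the sketch reports as an equivalent disparity of 0.01° for a far-preferring site and −0.01° for a near-preferring site.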
Results
Psychophysical performance
We trained two monkeys on a fine disparity discrimination task (Fig. 1A). The monkeys reported whether they perceived the center patch of an RDS as nearer or farther than its surrounding annulus by making a saccade to one of two target points appearing below and above the fixation point (Fig. 1A). The disparity of the center patch (or the center disparity) was randomly varied in its sign and magnitude from trial to trial. The smaller the magnitude of the center disparity, the harder the task was for the subjects. The disparity of the surrounding annulus (or the surround disparity) was fixed at 0° unless otherwise stated.
Figure 1B shows the average psychometric function for the two monkeys in single-unit recording sessions (n = 53 for Monkey R; n = 35 for Monkey K). The sigmoidal shape indicates that the monkeys made more correct choices as the center disparity deviated from the surround disparity. For each session, the proportion of far choices as a function of center disparity was fitted with a cumulative Gaussian function to estimate the psychophysical threshold. The geometric mean of psychophysical thresholds across the sessions was 0.023° (0.017° for Monkey R; 0.036° for Monkey K). The mean eccentricity of stimulus presentation was 7.3° (6.7° for Monkey R; 8.3° for Monkey K). These thresholds were comparable to those measured in previous studies using similar visual stimuli (Prince et al., 2000; Uka and DeAngelis, 2006). Our subjects performed near the limit of stereoacuity.
Generalization of relative disparity discrimination
The decision signal can be generated either directly from the disparity representation in an absolute frame of reference or indirectly through a representation in a relative frame of reference. An implementation of a direct generation is the neuron/orthoneuron model, in which decisions are generated from responses of a pair of neurons that detect the center and the surround disparities, respectively (Prince et al., 2000). An implementation of an indirect generation is the neuron/antineuron model, in which decisions are generated from responses of a pair of neurons that prefer near and far relative disparities, respectively. For the indirect generation, detectors of relative disparity take the place of the motion detectors in the original neuron/antineuron implementation (Britten et al., 1992). The two decision models can be distinguished by examining the transfer of the discrimination; if the subjects exploit responses of neurons that are selective for relative disparity, discrimination performance should generalize to RDSs that have untrained surround disparities. We trained Monkey R with surround disparities of 0° and ±0.4° and tested the monkey with surround disparities that had never been used in the training (±0.2°). The monkey was randomly rewarded for the untrained stimuli regardless of the choice, and was rewarded for correctly discriminating the relative disparity for the trained stimuli. The psychometric functions with the untrained stimuli had sigmoidal shapes similar to those of the trained stimuli (Fig. 2A, filled squares and filled triangles). The monkey was as sensitive to fine disparities of untrained stimuli as to those of trained stimuli (Fig. 2B, solid and dashed lines for the models with zero and free lapse rates, respectively; the estimated lapse rates were 0.027 in the first session and 0.016 in the second). Psychometric functions as a function of center disparity were shifted depending on the surround disparities. 
The direction and the magnitude of the shift matched the surround disparities with a slight undercompensation (Fig. 2C). We obtained similar results from another experiment in which surround disparities of ±0.1° served as untrained stimuli (Fig. 2D–F). These results indicate that perceptual decisions about fine disparity were generated from a representation of binocular disparity in a relative frame of reference.
Part of the surround in our stimulus was always closer to the fovea than the center. The surround could potentially induce a vergence eye movement toward the depth of the surround. A perfect tracking vergence eye movement would make the relative disparity discrimination a complete artifact. The mean vergence eye movements during stimulus presentation were at most only 8% of the changes in surround disparity (data not shown). The almost complete shift of psychometric functions (Fig. 2C,F) was not an artifact of tracking vergence eye movements.
Comparison of neuronal and psychophysical thresholds
If the monkeys based their perceptual decisions on a neuron in V4, their performance should be equal to the performance of an ideal observer of the V4 neuron. Neurometric analysis describes the performance of an ideal observer who makes a perceptual decision by comparing responses of a neuron and its antineuron (see Materials and Methods, above) (Britten et al., 1992). An example cell in V4 had a strong selectivity for near center disparities (Fig. 3A). The preference for near center disparities was clear even within the small range of center disparities used in the fine disparity discrimination task (Fig. 3B). Figure 3C shows the neurometric performance of an ideal observer (filled symbols). We estimated the neuronal threshold of this example neuron as 0.036° with the fitted function. This was almost comparable to, or slightly higher than, the psychophysical threshold measured during the recording from this neuron (0.026°). This neuron was one of the most sensitive neurons in our dataset. The data from a neuron with higher neuronal threshold are shown in Figure 3D–F. This neuron was also strongly tuned for near center disparities (Fig. 3D,E), but its neuronal threshold (0.079°) was twice that of the monkey (0.036°).
Across the sampled population of 88 cells (n = 53 for Monkey R; n = 35 for Monkey K), neuronal thresholds were higher than the psychophysical thresholds (Fig. 4). Only four neurons were more sensitive than the monkey. The median of neuronal-to-psychophysical threshold (N/P) ratios was 10.8 (17.2 for Monkey R; 5.7 for Monkey K; Fig. 4, upper right). Most of the intersubject variation of the N/P ratio derives from the variation in the denominator, the psychophysical threshold. Monkey K had a higher psychophysical threshold than Monkey R (Wilcoxon's rank sum test, p < 0.0001). In summary, V4 neurons were less sensitive to fine disparity than the monkeys were. Only a small fraction of them were sensitive enough to account for the psychophysical threshold.
The ideal observer integrated the entire period of visual responses, which is not generally the case with real subjects. We examined how N/P ratios depended on the integration window by truncating the period of integrating the visual response to half (terminating at 830 ms after stimulus onset). The median N/P ratio was 13.4 with the shortened window (23% increase from the full window). Two neurons preserved an N/P ratio smaller than 1. Thus, an observer of V4 responses could discriminate fine disparity as reliably as subjects over a range of integration windows.
Choice-predictive activity in fine disparity discrimination
Psychophysical and neuronal thresholds characterize decisions by combining behavioral and neuronal responses to various disparities. If the subjects used the disparity representation in V4 to make their decisions, there should be a relationship between neuronal responses and perceptual decisions within each disparity; neuronal responses should be predictive of the upcoming decision from trial to trial in repeated presentations of the same disparity. We plotted the distribution of the responses of a near-preferring neuron (the same neuron as in Fig. 3A–C) separately for the monkey's choices (near vs. far) at each center disparity (Fig. 5). For most center disparities, neuronal responses were higher in preferred choice trials, in which the monkey made choices corresponding to the disparity preference of the neuron (near choices), than in null choice trials, in which the monkey reported the opposite of the disparity preference of the neuron (far choices). For example, when we presented RDSs of zero center disparity, the mean neuronal response was higher in preferred choice trials (63.5 spikes/s) than in null choice trials (59.4 spikes/s; Fig. 5, bottom). To quantify how strongly neuronal responses were related to choices of the monkey, we calculated choice probability (Britten et al., 1996) for each center disparity in which the monkey chose either target in no less than one-fourth of the trials. For the example neuron illustrated in Figure 5, choice probabilities (0.67–0.74) were greater than chance for five of the six center disparities (permutation test, p < 0.05). Thus, an observer of this neuron would be able to predict the monkey's choices more frequently than chance during fine disparity discrimination.
The choice probability was averaged across the population for each center disparity (in this analysis, preferred and null disparities were assigned positive and negative signs, respectively). Coarse disparities (>0.02° or <−0.02°) were easy for the monkeys, so they made few errors. An insufficient number of error trials leads to a poor estimate of the mean choice probability (large standard errors); only 1, 2, and 3 choice probabilities were obtained for the center disparities of −0.1, −0.05, and 0.05°, respectively. We obtained reliable estimates of the mean choice probability for fine disparities (≤0.02° and ≥−0.02°; >20 choice probabilities were obtained for each of the fine disparities). The average of the reliably estimated choice probabilities was greater than chance (Fig. 6) and did not vary significantly over the range of center disparities (one-way ANOVA, p = 0.16). Thus, the relationship between neuronal and behavioral responses did not depend detectably on center disparity.
The independence of the choice probability from the center disparity justifies combining all the trials with various disparities into a single set of trials. The responses were z-scored to take into account the changes in the mean and variance with center disparity and were combined across the trials with different disparities (see Materials and Methods, above). From the combined set of responses, we calculated a single grand choice probability for each neuron (Britten et al., 1996). The grand choice probability is a more reliable estimate than the individual ones because the number of sampled trials is larger. The example neuron in Figure 5 had a grand choice probability of 0.66, which was greater than expected by chance (permutation test, p < 0.001). The population mean of grand choice probability, 0.55 (0.54 for Monkey R; 0.55 for Monkey K), was greater than 0.5 (t test, p < 0.0001; Fig. 7). In a third of the neurons (29/88), the grand choice probability deviated significantly from chance level. The deviation was in the positive direction for most of them (27/29). An ideal observer of V4 responses would be able to predict the upcoming choice of the monkey better than chance. The prediction is based on a decision model that uses the responses of each neuron as evidence for the disparity of its preference.
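The pooling step can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the procedure in Materials and Methods: responses are z-scored within each center disparity using the mean and population standard deviation of all trials at that disparity, then pooled by choice, and the ROC area is computed on the pooled z-scores. The function names, the trial tuple layout, and the firing rates are hypothetical.

```python
from statistics import mean, pstdev


def roc_area(a, b):
    """Probability that a value from a exceeds a value from b (ties = 0.5)."""
    wins = 0.0
    for x in a:
        for y in b:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(a) * len(b))


def grand_choice_probability(trials):
    """trials: iterable of (center_disparity, firing_rate, choice),
    where choice is "pref" or "null"."""
    by_disparity = {}
    for disparity, rate, choice in trials:
        by_disparity.setdefault(disparity, []).append((rate, choice))

    pref_z, null_z = [], []
    for items in by_disparity.values():
        rates = [rate for rate, _ in items]
        mu, sd = mean(rates), pstdev(rates)
        if sd == 0:
            continue  # no response variability at this disparity
        for rate, choice in items:
            z = (rate - mu) / sd  # removes the disparity-driven mean and scale
            (pref_z if choice == "pref" else null_z).append(z)
    return roc_area(pref_z, null_z)


# Hypothetical trials: the mean rate differs strongly between disparities.
trials = [
    (0.0, 12, "pref"), (0.0, 14, "pref"), (0.0, 10, "null"), (0.0, 8, "null"),
    (0.01, 52, "pref"), (0.01, 54, "pref"), (0.01, 50, "null"), (0.01, 48, "null"),
]
grand_cp = grand_choice_probability(trials)
```

In this toy data set, pooling raw firing rates would mix the stimulus-driven difference between the two disparities with the choice-related difference. Z-scoring within each disparity leaves only the choice-related part.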
Time course of choice-predictive activity
Although the V4 responses were predictive of the monkey's decision, this does not necessarily mean that the V4 responses were the cause of the decision. As a first step to explore the causal relationship, we examined the latency of choice-related activity. We selected the 27 neurons with a grand choice probability significantly greater than chance. For each neuron, spike trains of responses to the same disparity were pooled and converted to spike density functions with a 20 ms sliding window. Each spike density function was then normalized so that the maximum was one. The normalized functions were then averaged across different disparities, keeping the preferred and null choices separate. The functions were then averaged across the 27 neurons (Fig. 8A). The choice-related responses appeared at 130 ms after stimulus onset and remained constant during the course of the visual response. We also calculated the time course of the grand choice probability using a 100 ms sliding window (Fig. 8B). The mean grand choice probability deviated from chance at a very early period of the visual response and remained high up to the end of the visual response. These results are consistent with a causal involvement of the activity of V4 neurons in fine disparity discrimination.
Effects of eye movements on grand choice probabilities
One concern with the results of grand choice probabilities is that a high value may be an artifact of choice-related eye movements. For example, if the monkey shifts gaze toward the stimulus more frequently in near choice trials and changes in retinal images caused by the eye movements result in stronger neural responses, a high choice probability could be obtained even though the neural response was not causally related to the decision. The same argument can be applied to vergence angles. To explore these possibilities, we recalculated the grand choice probabilities after removing the linear relationships of neuronal responses with vergence angle, vertical eye position, and horizontal eye position in 85 recording sessions in which we recorded eye movements for both eyes (Uka and DeAngelis, 2004). We first calculated the mean value of each of the three aspects of eye movements in each trial using the same time window for neuronal responses (see Materials and Methods, above). We then performed a linear regression between each of the three eye-movement variables and neuronal responses. We finally recalculated the grand choice probability based on the regression residuals.
The mean grand choice probability did not change significantly after correction for vergence eye movement (paired t test, p = 0.26; Fig. 9A), for vertical eye movement (paired t test, p = 0.28; Fig. 9B), or for horizontal eye movement (paired t test, p = 0.44; Fig. 9C). Thus, the distribution of grand choice probability was not an artifact of any of the three patterns of eye movement.
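The regression step of this control can be sketched in Python as follows. The sketch assumes ordinary least squares with a single predictor, fitted separately for each eye-movement variable. The residuals would then replace the raw firing rates in the grand choice probability calculation. The function name and the example values are hypothetical.

```python
from statistics import mean


def regression_residuals(eye_variable, firing_rates):
    """Remove the linear dependence of firing rate on one eye-movement
    variable (e.g., mean vergence angle per trial) and return the residuals."""
    mx, my = mean(eye_variable), mean(firing_rates)
    sxx = sum((x - mx) ** 2 for x in eye_variable)
    if sxx == 0:
        # The eye variable did not vary, so there is nothing to regress out.
        return [y - my for y in firing_rates]
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(eye_variable, firing_rates))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x)
            for x, y in zip(eye_variable, firing_rates)]


# Hypothetical per-trial values: vergence angle (deg) and firing rate (spikes/s).
vergence = [0.10, -0.20, 0.30, 0.00, 0.25]
rates = [60.0, 55.0, 66.0, 58.0, 61.0]
residuals = regression_residuals(vergence, rates)
```

By construction the residuals have zero mean and no linear relationship with the eye-movement variable, so any choice probability that survives the correction cannot be attributed to that linear relationship.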
Effects of dot pattern variation on grand choice probabilities
Any variable of the task other than eye movements can also generate spurious correlation. For example, the dot pattern of the RDS varied across trials, so we tested whether this variation produced spurious correlation. In a subset of recording sessions with Monkey R (31/53), half of the trials for zero center disparity were a repetition of the same sequence of dot patterns. The choice probability for the repeating pattern did not differ from the choice probability for the varying pattern (0.54 and 0.55, respectively; Fig. 9D; paired t test, p = 0.77), and the two were correlated across sessions (r = 0.45, p = 0.012). Thus, the variable dot pattern did not cause the biased distribution of choice probabilities.
Relationship between neuronal thresholds and grand choice probabilities
The relationships between choice probability and other neuronal properties give an insight into the underlying decision-making mechanism. It is conceivable that the decision-making circuit is selectively wired to a subpopulation of neurons so as to use the available information efficiently. Specifically, cells with high sensitivity to disparity might have a strong impact on the animal's decision. We tested this idea by examining the relationship between choice probability and neuronal threshold. Grand choice probability was negatively correlated with the neuronal threshold, albeit weakly (Spearman's rank correlation, rs = −0.23, p = 0.030; Fig. 10). This correlation suggests two possibilities for the decision-making circuit. One possibility is that all cells in the highly sensitive subpopulation have a strong impact on the animal's decision. The other possibility is that only a part of the subpopulation has an impact on the decision, and their responses are correlated with those of the remaining cells in the subpopulation (Cohen and Newsome, 2009).
Microstimulation of V4 neurons during the fine disparity discrimination task
There are still other latent variables that might have caused the choice-predictive activity, e.g., attention, unrecorded motor behavior, or responses of unrecorded cells (Nienborg and Cumming, 2010). Instead of exhaustively testing every possible variable of the task, we artificially boosted the activity of V4 neurons with electrical microstimulation in a separate set of sessions. We were able to selectively activate near-preferring or far-preferring neurons because V4 neurons are clustered according to their preferred disparity (see Materials and Methods, above) (Watanabe et al., 2002; Tanabe et al., 2005). If the monkeys based their decisions on the representation of disparity in V4, we expected more frequent decisions toward the preference of the stimulated site.
The multiunit activity illustrated in Figure 11A preferred near. Microstimulation elicited an increase in the probability of near choices, resulting in a rightward shift of the psychometric function (Fig. 11B; Table 1). The equivalent center disparity (see Materials and Methods, above) was 0.0060°, which is 51.1% of the psychophysical threshold. At another site, the monkey's choices were biased toward far by microstimulation, where the multiunit activity preferred far (Fig. 11C,D; Table 1). The equivalent center disparity, 0.0079°, was comparable to the psychophysical threshold (89.4%). There were no changes in the slopes of the psychometric functions at these two sites (Table 1). Microstimulation to the near and far sites had the same effect on the perceptual decision as adding near (Fig. 11B) and far (Fig. 11D) disparities, respectively.
We tested microstimulation effects at 31 sites. The shifts were assigned a positive sign when they were in the expected direction and a negative sign otherwise. The mean shift, 0.0021°, was in the positive direction (t test, p = 0.0043; Fig. 12). It was equal to 18% of the average psychophysical threshold. The psychometric function was shifted (Wald test, p < 0.05) at 12 sites, of which 10 (83%) were in the expected direction. The slope of the psychometric function changed in only two of the 31 sites (6.5%). V4 neurons conveyed the signal that was used to form perceptual judgments.
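The notion of a horizontal shift expressed as an equivalent center disparity can be illustrated with a small Python sketch. The fitting procedure itself is described in Materials and Methods and is not reproduced here. The sketch assumes a logistic psychometric function in which microstimulation adds a constant term to the linear predictor, which is one common formulation. Under that assumption, the equivalent disparity is the ratio of the stimulation coefficient to the disparity coefficient. All coefficient values, names, and the sign convention (positive disparity = preferred direction of the stimulated site) are hypothetical.

```python
import math


def p_preferred_choice(disparity, stimulated, b0, b1, b2):
    """Probability of choosing the stimulated site's preferred disparity.

    b0: bias term, b1: slope per degree of center disparity,
    b2: additive effect of microstimulation (assumed model)."""
    z = b0 + b1 * disparity + (b2 if stimulated else 0.0)
    return 1.0 / (1.0 + math.exp(-z))


def equivalent_disparity(b1, b2):
    """Center disparity (deg) whose effect on choices matches that of
    microstimulation: b1 * shift = b2."""
    return b2 / b1


# Hypothetical coefficients for one stimulation site.
b0, b1, b2 = 0.0, 250.0, 0.5
shift = equivalent_disparity(b1, b2)

# With stimulation, the curve equals the unstimulated curve displaced by `shift`.
with_stim = p_preferred_choice(0.004, True, b0, b1, b2)
shifted_no_stim = p_preferred_choice(0.004 + shift, False, b0, b1, b2)
```

In this formulation a change in b2 displaces the curve horizontally without changing its steepness, whereas a change in slope would require an interaction between stimulation and disparity. This is why shifts and slope changes can be tested separately.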
Discussion
We investigated the contribution of V4 neurons to fine disparity discrimination. Discrimination of relative disparity was generalized to untrained surround disparities, suggesting that neurons underlying the discrimination represent disparity in a relative frame of reference. Most V4 neurons had a neuronal threshold higher than the psychophysical threshold, but a fraction of them had a neuronal threshold as low as the psychophysical threshold. On trials with subthreshold disparity, the firing rates of V4 neurons were predictive of the trial-by-trial perceptual decisions. Microstimulation of V4 neurons induced a bias of the perceptual decisions about disparity toward the preferred disparity of the stimulated neurons. This is the first study, to our knowledge, to demonstrate that V4 microstimulation can alter visual perception. We conclude that disparity signals in area V4 are used for fine disparity discrimination.
Relative frame of reference for fine disparity discrimination
The subjects generalized the relative disparity discrimination to untrained stimuli. The generalization suggests that the decision signal was generated from a neural representation of disparity in a relative frame of reference. The psychometric functions, plotted in an absolute reference frame, were shifted by 90% of the shifts in surround disparities (Fig. 2C,F). The shifts were incomplete. The representation of stereoscopic depth is in an imperfect relative frame of reference. The incompleteness matches the properties of a fraction of V4 neurons whose tuning curve for center disparity shifts with 90% of surround disparities (Umeda et al., 2007; for V2, see Thomas et al., 2002; for V3/V3A, see Anzai et al., 2011). The reference frame of fine stereoscopic depth perception might reflect the properties of the disparity representations in V4.
Sensitivity of V4 neurons to small changes in binocular disparity
We found that an ideal observer of V4 neurons was less sensitive than the monkey on average, but an observer of the most sensitive V4 neurons was as sensitive as the monkey (N/P ratios of 1). The ideal observer compared responses of the recorded neuron and its antineuron (Britten et al., 1992). The neuron's response was taken as evidence toward its preferred disparity, while the antineuron's response was taken as evidence for the opposite disparity. Although the presence of an antineuron is hypothetical in nature, it may be possible to implement the ideal observer model with actual neurons in V4, because the disparity preference of V4 neurons is distributed across both near and far disparities (Hinkle and Connor, 2001, 2005; Watanabe et al., 2002; Hegdé and Van Essen, 2005b). It is unlikely that the use of dim red stimuli in this study substantially influenced our estimates of N/P ratios because the strength of disparity selectivity of V4 neurons as well as psychophysical thresholds of monkeys are similar over a range of colors and luminances of visual stimuli (Prince et al., 2000; Watanabe et al., 2002; Tanabe et al., 2005). Thus, we suggest that signals in V4 are sufficiently reliable to support fine disparity discrimination.
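The neuron/antineuron comparison can be sketched in Python as follows. The sketch assumes, as is common for this class of model, that the antineuron's response to a given disparity is taken from the recorded neuron's response to the disparity of opposite sign. The observer's proportion correct at each disparity magnitude is then the ROC area between the two response distributions. The function names and firing rates are hypothetical, and the threshold fit that would follow is omitted.

```python
def roc_area(a, b):
    """Probability that a value from a exceeds a value from b (ties = 0.5)."""
    wins = 0.0
    for x in a:
        for y in b:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(a) * len(b))


def neurometric_function(rates_by_disparity):
    """rates_by_disparity maps signed center disparity (deg; positive =
    the neuron's preferred sign) to a list of firing rates.

    Returns {disparity magnitude: proportion correct} for an observer who
    picks whichever of the neuron and its antineuron fires more."""
    performance = {}
    for disparity, rates in rates_by_disparity.items():
        if disparity > 0 and -disparity in rates_by_disparity:
            antineuron_rates = rates_by_disparity[-disparity]
            performance[disparity] = roc_area(rates, antineuron_rates)
    return performance


# Hypothetical responses (spikes/s) of one neuron.
rates_by_disparity = {
    0.01: [12, 14, 13], -0.01: [10, 11, 12],
    0.02: [20, 22], -0.02: [5, 6],
}
neurometric = neurometric_function(rates_by_disparity)
```

A neuronal threshold obtained from such a function, divided by the psychophysical threshold from the same session, gives the N/P ratio discussed in the text.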
N/P ratios in fine disparity discrimination for V4 neurons are higher than those reported for V1 and MT neurons (Prince et al., 2000; Uka and DeAngelis, 2006). This does not necessarily indicate that V4 neurons are less sensitive than V1 and MT neurons because different N/P ratios might have resulted from the difference in methodology. In this study, the surround disparity was fixed regardless of the neuron's disparity preference, whereas in the previous V1 and MT studies, the surround disparity was adjusted to the maximum slope of the disparity-tuning curve. This adjustment maximizes the information of individual cells for the ideal observer, as long as the cells are selective to absolute disparity. For V4 neurons, the same procedure cannot maximize the information because the disparity tuning shifts with the surround disparity (Umeda et al., 2007). Therefore, we used the same surround disparity (0°) for all neurons. Recording from V1 and MT with a fixed disparity surround will be needed for a fair comparison of sensitivity to fine disparity between V4 and these areas.
Apart from the perception of stereoscopic depth, thresholds of visual cortical neurons have been measured in two other classes of fine discrimination using a fixed reference. In one study, subjects discriminated a small difference in the direction of visual motion (Purushothaman and Bradley, 2005), while it was the heading direction that subjects discriminated in other studies (Gu et al., 2007, 2008). Purushothaman and Bradley (2005) quantified neuronal sensitivities under a neuron/orthoneuron assumption. The ideal observer in the neuron/orthoneuron model compares responses to the test and the response to the reference (Prince et al., 2000). The performance of the ideal observer under the neuron/orthoneuron assumption is theoretically half that under the neuron/antineuron assumption. Thus, we corrected the neuronal threshold of Purushothaman and Bradley (2005) by dividing the value by two to allow a direct comparison between the three studies. The N/P ratios were comparable despite the differences in cortical areas examined and sensory signals to be discriminated [N/P ratio of 11, 13, and 10 for this study, Purushothaman and Bradley (2005), and Gu et al. (2007), respectively]. There might be a common relationship between neuronal and psychophysical sensitivities across different visual areas during the discrimination of small signals with a fixed reference.
Latency of choice-predictive activity during fine disparity discrimination
The activity of V4 neurons was predictive of the final choice from soon after response onset (130 ms after stimulus onset; Fig. 8). The short latency of choice probability suggests that sensory signals in V4 drive decision processes in the bottom-up direction. Our microstimulation results add support for the bottom-up direction of the relationship of V4 neurons with perceptual decisions.
Uka et al. (2005) reported that neurons in IT show choice-predictive activity 360 ms after stimulus onset. This is three times the latency of the visual response, or the latency of disparity selectivity (Uka et al., 2005). The increase in the delay of the choice-predictive activity from V4 to IT (230 ms) is disproportionate to the increase in the delay of the visual response or the disparity selectivity (∼20 ms) (Watanabe et al., 2002; Tanabe et al., 2004; Uka et al., 2005). Attention modulates neuronal activity in V4 and IT with a similar time course (Mehta et al., 2000). It is therefore unlikely that the choice-predictive activity in V4 is transmitted directly to IT. Possible mechanisms of choice-predictive activity in IT are feedback signals of the decision or attentional modulation of long latency (Nienborg and Cumming, 2009). In either case, these results suggest that V4 and IT play different roles in fine disparity discrimination.
Roles of area V4 in stereopsis
The alteration of the percept by microstimulation of V4 suggests that V4 is part of a causal chain of sensory-motor transformation for fine disparity discrimination. This is in contrast to area MT, in which microstimulation and inactivation had no effects on decisions about fine disparity (Uka and DeAngelis, 2006; Chowdhury and DeAngelis, 2008). Another contrast between V4 and MT is the degree to which neurons are selective for relative disparity between adjacent visual features; a substantial proportion of V4 neurons are selective for relative disparity between the center and surround of concentric stimuli whereas virtually all MT neurons are not (Neri et al., 2004; Uka and DeAngelis, 2006; Umeda et al., 2007; for the selectivity of MT neurons to another type of relative disparity, see Krug and Parker, 2011). The two seemingly independent contrasts may not just be a coincidence, but the result of learning. The animals were under pressure to optimize their performance for fine disparity discrimination. In such a situation, the neural representation of disparity might be more informative in a relative frame of reference than in an absolute frame of reference because of the robustness of the relative reference frame to vergence errors (Westheimer, 1979; McKee et al., 1990). The disparity representation of MT neurons contributes to coarse disparity discrimination in which the absolute reference frame might be more informative (DeAngelis et al., 1998; Uka and DeAngelis, 2003, 2004). Thus, the visual system seems to be capable of selecting an appropriate disparity representation from the distributed representations across the visual cortex depending on task demands, as suggested previously (Uka and DeAngelis, 2006; Chowdhury and DeAngelis, 2008). Our results provide critical evidence for this unified framework, in which the ventral and dorsal pathways have distinctive roles in stereopsis (Neri, 2005; Parker, 2007; Read et al., 2010).
Although our task required only representations of uniform disparity, V4 neurons are also selective for orientation disparity or disparity gradients (Hinkle and Connor, 2002; Hegdé and Van Essen, 2005a). These properties of V4 neurons suggest that V4 may be an intermediate stage of generating representations of more complex features of three-dimensional objects in IT (Janssen et al., 1999, 2000; Yamane et al., 2008; Anzai and DeAngelis, 2010). IT responses have a direct role in discriminating disparity-defined curved surfaces (Verhoef et al., 2010). V4 and IT contribute to different aspects of stereopsis.
In conclusion, we have demonstrated the contribution of V4 to fine discrimination of relative disparity between center and surround of RDSs. We suggest that relative disparity signals of V4 neurons underlie fine stereoacuity. Thus, V4 plays a distinctive role in stereoscopic depth perception.
Footnotes
This work was supported by grants to I.F. from the Ministry of Education, Culture, Sports, Science and Technology of Japan (23240047, 23135522), Core Research for Evolutional Science and Technology of Japan Science and Technology Agency, Center for Information and Neural Networks, and the Takeda Science Foundation. We thank H. Kumano and K. Umeda for help with data collection; Y. Okazaki, S. Aoki, and Y. Shinomoto for technical assistance; and Y. Shinomoto for excellent animal care. A monkey was obtained from the National Bio-resource Project.
The authors declare no financial conflicts of interest.
Correspondence should be addressed to Dr. Ichiro Fujita, Laboratory for Cognitive Neuroscience, Graduate School of Frontier Biosciences, Osaka University, Toyonaka, Osaka 560-8531, Japan. fujita{at}fbs.osaka-u.ac.jp
References
- Anzai and DeAngelis, 2010.
- Anzai et al., 2011.
- Britten et al., 1992.
- Britten et al., 1996.
- Chowdhury and DeAngelis, 2008.
- Cohen and Newsome, 2009.
- Cumming and Parker, 1999.
- DeAngelis et al., 1998.
- Geisler, 2003.
- Green and Swets, 1966.
- Gu et al., 2007.
- Gu et al., 2008.
- Hegdé and Van Essen, 2005a.
- Hegdé and Van Essen, 2005b.
- Hinkle and Connor, 2001.
- Hinkle and Connor, 2002.
- Hinkle and Connor, 2005.
- Hosmer and Lemeshow, 1989.
- Janssen et al., 1999.
- Janssen et al., 2000.
- Krug and Parker, 2011.
- Kumano et al., 2008.
- Kumar and Glaser, 1992.
- McKee, 1983.
- McKee et al., 1990.
- Mehta et al., 2000.
- Neri, 2005.
- Neri et al., 2004.
- Nienborg and Cumming, 2009.
- Nienborg and Cumming, 2010.
- Parker, 2007.
- Prince et al., 2000.
- Purushothaman and Bradley, 2005.
- Read et al., 2010.
- Salzman et al., 1992.
- Sarmiento, 1975.
- Tanabe et al., 2004.
- Tanabe et al., 2005.
- Thomas et al., 2002.
- Uka and DeAngelis, 2003.
- Uka and DeAngelis, 2004.
- Uka and DeAngelis, 2006.
- Uka et al., 2005.
- Umeda et al., 2007.
- Verhoef et al., 2010.
- Watanabe et al., 2002.
- Westheimer, 1979.
- Wichmann and Hill, 2001.
- Yamane et al., 2008.