Human stereopsis, the perception of depth from differences in the two eyes' images, is very precise: image differences smaller than a single photoreceptor can be converted into a perceived difference in depth. To better understand what determines this precision, we examined how the eyes' optics affects stereo resolution. We did this by comparing performance with normal, well-focused optics and with optics improved by eliminating chromatic aberration and correcting higher-order aberrations. We first measured luminance contrast sensitivity in both eyes and showed that we had indeed improved optical quality significantly. We then measured stereo resolution in two ways: by finding the finest corrugation in depth that one can perceive, and by finding the smallest disparity one can perceive as different from zero. Our optical manipulation had no effect on stereo performance. We checked this by redoing the experiments at low contrast and again found no effect of improving optical quality. Thus, the resolution of human stereopsis is not limited by the optics of the well-focused eye. We discuss the implications of this remarkable finding.
A fundamental question in visual neuroscience is how the eyes' optics, photoreceptors, and subsequent neural mechanisms combine to determine visual performance. Studying visual resolution has proven particularly illuminating. For example, the contribution of optics to letter acuity is now reasonably well understood: defocus causes a predictable worsening of acuity (Cheng et al., 2004), while correcting the high-order aberrations of the well-focused eye yields a predictable improvement (Yoon et al., 2002). Similar changes are observed with contrast sensitivity: defocusing the eye yields poorer sensitivity, particularly at high spatial frequencies (Campbell and Green, 1965), and correcting the high-order aberrations yields better-than-normal sensitivity (Williams et al., 2000; Yoon and Williams, 2002). Thus, the contribution of optics to visual acuity and contrast sensitivity is reasonably well understood. By inference, the contributions of postoptical receptoral and neural mechanisms have been quantified (Williams, 1985; Banks et al., 1987; MacLeod et al., 1992; Chen et al., 1993).
We know much less about the optical and neural determinants of stereopsis. Humans can discriminate changes in binocular disparity as small as 5 arcseconds (Westheimer and McKee, 1980), so stereopsis is clearly a very precise visual function. Using an approach similar to that used in the analysis of the limits of visual acuity and contrast sensitivity, we examined how the eyes' optics affects the resolution of stereopsis, thereby revealing more about the influence of postoptical mechanisms.
Materials and Methods
Three observers (1 female, 2 male) with normal visual acuity and stereopsis participated. They were emmetropic (spherical and cylindrical refractive errors both smaller than 0.5 diopter). The mean ± SD of the root–mean–square wavefront error for the higher-order aberrations (HORMS) was 0.46 ± 0.23 μm for a 6 mm pupil, not differing significantly from the normal population. Two observers were authors (B.N.S.V., G.Y.).
Apparatus and optical manipulation.
Stimuli were projected directly into the eyes with two DLP projectors (Sharp PG-M20X; 1024 × 768 pixels) that had been optically modified. Pixels subtended 24 arcseconds, except in the two-line stereo experiment in which they subtended 12 arcseconds. Grayscale resolution was 8 bits, and the displays were luminance calibrated.
Monocular and binocular stimuli were brought to sharp focus on the retinas with two image-relay optical systems, one for each eye. We improved the optical quality of the retinal images beyond normal, well-focused optics in three ways: (1) eliminating chromatic aberration by monochromatic filtering of the stimuli at 550 nm (10 nm bandwidth); (2) reducing the effective pupil diameter to 2.5 mm with an artificial pupil, which improves overall quality by reducing blur due to higher-order aberrations (Campbell and Gubisch, 1966; Liang and Williams, 1997); and (3) further reducing blur due to higher-order aberrations by manipulating the wavefront of incident light to compensate for residual ocular aberration. For the third manipulation, we first measured the wavefront aberrations of both eyes using a Shack–Hartmann sensor (Liang et al., 1994) and then used those measurements to fabricate phase plates (Yoon et al., 2004) that neutralize the aberrations. This technique improves retinal-image sharpness beyond what is achievable with conventional ophthalmic correction (Navarro et al., 2000; Yoon et al., 2004).
We combined these three manipulations to produce higher image quality than that of normal, well-focused eyes. The phase plates and artificial pupils were placed in pupil-conjugate planes. Accurate alignment of the observer's visual axes with the light paths was maintained by video monitoring of the positions of the pupils relative to the phase plates, by the observer maintaining fixation on a small binocular marker, and by the observer checking alignment by repeated judgments of the sharpness of a small letter E presented between trials. Potential changes in focus were eliminated by inducing cycloplegia in both eyes (i.e., paralyzing the ciliary muscles that control accommodation) with cyclopentolate. Ophthalmic lenses (spherical and cylindrical) in each eye's light path ensured best focus for all conditions. The appropriate lens was chosen by having the observer judge the sharpness of the small E.
Contrast sensitivity measurements.
We assessed the optical improvement provided by our procedure by comparing monocular contrast sensitivity in three optical conditions: (1) with white light, normal well-focused optics (i.e., no phase plates), and 4 mm pupil (a typical diameter for the experimental light level) (Spring and Stiles, 1948); (2) with white light, normal well-focused optics and 6 mm pupil (larger than typical for the experimental light level); and (3) when the optics were improved by the procedure described above. The stimuli were gratings with sinusoidal luminance variation and space-average retinal illumination of 561 troland (Td). Contrast was constant over the central 3° and decreased with a half-Gaussian profile (SD = 0.67°) to merge smoothly with the uniform background. Observers initiated stimulus presentations with a keypress. The gratings were oriented +10° or −10° relative to horizontal and were presented for 16.7 ms (one frame). We used short durations so that contrast sensitivity would be relatively low, and in so doing, we ensured that the grayscale resolution of our projectors was sufficient to obtain reliable thresholds. The grating's phase was randomized from trial to trial. We presented spatial frequencies of 10, 20, 28, and 40 cycles per degree (cpd). After each presentation, observers indicated the grating's orientation. No feedback was given. The grating's contrast was varied according to an adaptive staircase procedure (Watson and Pelli, 1983). Afterward, the psychometric data from six such staircases of 25 trials each were combined and fitted with a cumulative Gaussian using a maximum-likelihood criterion (Wichmann and Hill, 2001). Thresholds and confidence intervals were calculated from those fits by bootstrapping.
Stereo resolution measurements with corrugations.
We measured stereo resolution under the same three optical conditions. In this experiment, the stimulus was a random-dot stereogram specifying sinusoidal corrugations in depth (Fig. 1a). To create the stereograms, we first generated a hexagonal lattice with an interdot distance of s. Then each dot was displaced in a random direction (distributed uniformly from 0 to 2π) for a random distance (distributed uniformly from 0 to s/2). We copied the randomized lattice into the images for the left and right eyes and then horizontally displaced the dots in each image in opposite directions by half the horizontal disparity. Horizontal disparity was a sinusoidal function of the dot coordinates x and y, with A, f, ϕ, and α denoting, respectively, the corrugation's peak-to-trough disparity amplitude, spatial frequency, phase, and orientation. Dot density varied from 25 to 336 dots/deg². Dot size varied from 0.8 to 1.6 arcminutes. Anti-aliasing was used so we could present small disparities. A fixation target with dichoptic and binocular elements was presented between stimulus presentations so observers could maintain accurate fixation and assess optical quality. Observers initiated stimulus presentations with keypresses. The corrugation's orientation was either +10° or −10° from horizontal, and observers indicated after each 600 ms presentation which orientation they had seen. By making the corrugations nearly horizontal, we greatly reduced the visibility of monocular artifacts in the stereograms. By using the orientation-discrimination task, we ensured that observers had to perceive some spatial structure to perform significantly above chance. No trial-by-trial feedback was provided.
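The construction steps above can be sketched in code. This is a minimal illustration, not the authors' implementation: the function name and parameters are hypothetical, and the disparity profile δ(x, y) = (A/2) sin[2πf(x sin α + y cos α) + ϕ] is an assumed form chosen only to be consistent with the stated definitions (peak-to-trough amplitude A, orientation α measured from horizontal). The sign convention for α in particular is a guess.

```python
import math
import random

def make_stereogram(s, width, height, A, f, phi, alpha_deg, seed=0):
    """Return (left, right) lists of (x, y) dot positions, all in degrees.

    Hypothetical helper: s is the interdot distance of the hexagonal lattice,
    A the peak-to-trough disparity, f the corrugation frequency (cpd),
    phi the phase (radians), alpha_deg the orientation from horizontal.
    """
    rng = random.Random(seed)
    alpha = math.radians(alpha_deg)
    row_step = s * math.sqrt(3) / 2  # row spacing of a hexagonal lattice
    n_rows = int(height / row_step) + 1
    n_cols = int(width / s) + 1
    left, right = [], []
    for r in range(n_rows):
        for c in range(n_cols):
            # Lattice point; odd rows are offset by half the interdot distance.
            x = c * s + (s / 2 if r % 2 else 0.0)
            y = r * row_step
            # Jitter: uniform random direction, uniform random distance in [0, s/2].
            theta = rng.uniform(0.0, 2 * math.pi)
            dist = rng.uniform(0.0, s / 2)
            x += dist * math.cos(theta)
            y += dist * math.sin(theta)
            # Assumed disparity profile: amplitude A/2 gives peak-to-trough A.
            carrier = x * math.sin(alpha) + y * math.cos(alpha)
            d = (A / 2) * math.sin(2 * math.pi * f * carrier + phi)
            # Shift the two eyes' dots in opposite directions by half the disparity.
            left.append((x + d / 2, y))
            right.append((x - d / 2, y))
    return left, right

# Example: 2.4 arcmin (0.04 deg) peak-to-trough, 2 cpd, +10 deg orientation.
left, right = make_stereogram(s=0.1, width=2.0, height=2.0,
                              A=0.04, f=2.0, phi=0.0, alpha_deg=10.0)
```

Only the horizontal coordinate differs between the two eyes' images, so the dot pattern carries no vertical disparity.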
Peak-to-trough disparity amplitude was 2.4 arcminutes. We chose such a small value to assure that the disparity-gradient limit (Burt and Julesz, 1980; Banks et al., 2004; Filippini and Banks, 2009) was never exceeded. For sinusoidal corrugations, the gradient is ∼2fA, so the disparity-gradient limit of 1 is reached when f becomes >1/(2A), which for our experiment would be a corrugation frequency of 12.5 cpd. We never presented frequencies >10 cpd and thereby avoided the disparity-gradient limit.
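The arithmetic behind the 12.5 cpd figure can be checked with a short sketch. It uses the approximation stated above (gradient ≈ 2fA) and converts the amplitude from arcminutes to degrees; the function name is hypothetical.

```python
def gradient_limited_frequency(amplitude_arcmin, gradient_limit=1.0):
    """Corrugation frequency (cpd) at which the gradient 2*f*A reaches the limit."""
    amplitude_deg = amplitude_arcmin / 60.0  # arcminutes to degrees
    return gradient_limit / (2.0 * amplitude_deg)

# With A = 2.4 arcmin (0.04 deg), the limit is reached at about 12.5 cpd.
f_max = gradient_limited_frequency(2.4)
```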
The spatial frequency of the corrugation was varied from trial to trial according to an adaptive staircase procedure to determine the highest frequency at which reliable performance could be obtained. The psychometric data from six staircases of 25 trials each were combined and fit with a cumulative Gaussian using a maximum-likelihood criterion. Threshold was the 75% point on the fitted curve.
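A minimal sketch of such a fit follows. Several details are assumptions rather than the authors' procedure: the psychometric function is scaled for a two-alternative task so that it falls from 1.0 to 0.5 as frequency rises, the maximum is found by a coarse grid search, and the data are invented for illustration.

```python
import math

def p_correct(freq, mu, sigma):
    """Assumed 2AFC psychometric function: falls from 1.0 to 0.5 as freq rises."""
    phi = 0.5 * (1.0 + math.erf((freq - mu) / (sigma * math.sqrt(2.0))))
    return 0.5 + 0.5 * (1.0 - phi)

def log_likelihood(data, mu, sigma):
    """Binomial log likelihood; data is a list of (frequency, n_correct, n_trials)."""
    total = 0.0
    for freq, k, n in data:
        # Clamp to avoid log(0) at the extremes.
        p = min(max(p_correct(freq, mu, sigma), 1e-9), 1.0 - 1e-9)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total

def fit_threshold(data, mus, sigmas):
    """Coarse grid search for the maximum-likelihood (mu, sigma)."""
    _, mu, sigma = max((log_likelihood(data, m, s), m, s)
                       for m in mus for s in sigmas)
    return mu, sigma  # mu is the 75%-correct point of the assumed function

# Hypothetical pooled staircase data: (corrugation frequency in cpd, correct, trials).
data = [(2, 25, 25), (3, 24, 25), (4, 19, 25), (5, 14, 25), (6, 12, 25)]
mus = [i / 10 for i in range(10, 81)]
sigmas = [i / 10 for i in range(2, 31)]
mu, sigma = fit_threshold(data, mus, sigmas)
```

Under this parameterization the fitted mean is itself the 75% point, so no separate inversion of the curve is needed.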
We conducted a second corrugation experiment with stimuli of different contrasts. In this case, the retinal illuminance of the background was 561 Td.
Stereo resolution measurements with two lines.
We also measured stereo resolution using a depth-discrimination task (Blakemore, 1970). Two of the three observers participated. A thin vertical test line (0.037° × 2.25°) was presented 0.27° below a reference line of the same dimensions (Fig. 1b). The disparity of the reference line was 0°, and the disparity of the test line was varied (both increasing and decreasing from 0). The retinal illuminance of the lines was 891 Td. Pixels subtended 12 arcseconds, and anti-aliasing was used to allow the presentation of small disparities. A fixation target with dichoptic and binocular elements was presented between stimulus presentations, allowing observers to maintain accurate fixation and to assess optical quality. Observers initiated stimulus presentations with keypresses. The test and reference lines were presented for 500 ms, and observers indicated whether the test was in front of or behind the reference. The disparity of the test line was varied using the method of constant stimuli. The psychometric data were combined, and the proportion of “behind” responses was calculated for each disparity. A cumulative Gaussian was fit to those data, and its SD was taken as the threshold.
We also ran the two-line experiment with different contrasts between the lines and background. In that case, the background had a retinal illuminance of 446 Td.
Results
Figure 2a plots contrast sensitivity for both eyes of the three observers under the three optical conditions. The red and blue symbols represent sensitivity with normal, well-focused optics for pupil diameters of 6 and 4 mm, respectively. The green symbols represent sensitivity with improved optics. As you can see, contrast sensitivity was higher with improved optics than with normal, well-focused optics in both eyes of all three observers. We subjected the data to a three-way, repeated-measures ANOVA with factors optical condition, spatial frequency, and eye. There was a statistically significant effect of optical condition: highest sensitivity was observed with improved optics and lowest with normal optics and 6 mm pupil (F(2,4) = 24.711, p = 0.006; missing data at 40 and 28 cpd for observer G.Y. were assigned sensitivities of 1). The improvements were in some cases quite large. The contrast sensitivity of observer B.N.S.V. increased nearly sevenfold at 28 cpd in his left eye from the normal, 6 mm condition to the improved condition. The increase was greater than fourfold for H.R.F. at 28 cpd, right eye and nearly fourfold for G.Y. at 20 cpd, left eye. The improvement in sensitivity from the normal, 4 mm condition to the improved optics condition was also statistically significant (F(2,2) = 22.863, p = 0.041). These results show that our procedure for producing sharper-than-normal retinal images was quite effective.
Stereo resolution with corrugations
Figure 2b shows the stereo resolution thresholds, the highest discriminable corrugation frequency as a function of dot density, for the three observers under the three optical conditions. The left column shows the whole functions; the right column shows exploded views of the data at high dot densities.
The Nyquist sampling frequency is the highest corrugation frequency that can be conveyed by the random-dot stimulus: $f_N = \frac{1}{2}\sqrt{D}$, where D is dot density. The diagonal dashed lines in the figure represent this frequency. When dot density was low, the highest discriminable frequency for all three optical conditions was near the sampling limit. (Some thresholds slightly exceeded the Nyquist frequency because the random dot arrangement yielded regions in which local density was higher than overall density.) We conclude that stereo resolution is determined under those conditions strictly by the number of samples in the stimulus. However, when dot density was higher, resolution leveled off at a particular frequency, so something other than sample number is limiting performance there. Our primary interest is in understanding the determinants of that asymptotic frequency.
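To give a sense of where the sampling limit falls across the tested dot densities, here is a small sketch. It assumes the Nyquist limit takes the form f_N = ½√D, the expression commonly used for random-dot stereograms; the function name is hypothetical.

```python
import math

def nyquist_frequency(dot_density):
    """Assumed sampling limit (cpd) for a dot density given in dots/deg^2."""
    return 0.5 * math.sqrt(dot_density)

# Lowest and highest densities used in the corrugation experiment.
low = nyquist_frequency(25)    # 2.5 cpd
high = nyquist_frequency(336)  # roughly 9.2 cpd
```

Under this assumption, the sampling limit at the highest density sits well above the asymptotic frequencies reported for the observers, which is why sample number cannot explain the plateau.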
The different symbols in Figure 2b represent the data from the three optical conditions: red for the normal, well-focused condition with 6 mm pupil, blue for the normal, well-focused condition with 4 mm pupil, and green for the improved optical condition. As you can see, performance leveled off at the same spatial frequency for all three optical conditions. We subjected the data at the two highest dot densities to a repeated-measures ANOVA, and there was no reliable effect of optical condition (F(2,2) = 2.74, p = 0.178). Examining the individual observer data reveals no systematic differences with the possible exception of observer G.Y. who had a slightly lower asymptotic frequency in the 6 mm condition than in the other two (10.5% lower with 6 mm pupil than with improved optics). Furthermore, there was no systematic relationship between the quality of individual observers' optics and their stereo performance. With normal, well-focused optics, H.R.F. had the best image quality (quantified by HORMS), and B.N.S.V. had the poorest (H.R.F. = 0.19 μm; B.N.S.V. = 0.7 μm). Yet B.N.S.V. had the best stereo resolution (his asymptotic frequency was 5.3 cpd in the 6 mm condition), and H.R.F. had the poorest (in the same condition, her asymptote was 2.8 cpd). Collectively, these results suggest rather remarkably that improving the optics has no effect on stereo resolution. We know that degrading the optics from normal reduces stereo resolution (Westheimer and McKee, 1980; Banks et al., 2004), but it seems that the resolution of stereopsis is not limited by the blur in normal, well-focused eyes.
Two-line stereo resolution
We tested the generality of our observations by assessing stereo resolution another way. The data points on the right side of Figure 3 represent the results from the two-line, depth-discrimination experiment. The red and blue symbols again represent the data with normal, well-focused optics and 6 and 4 mm pupils, respectively. Green symbols represent data with improved optics. There was no systematic relation between performance and optical quality. For the 6 mm, 4 mm, and improved-optics conditions, stereo resolution for observer B.N.S.V. was, respectively, 18.7 arcseconds (95% CI: −4.2, +7.6), 11.3 (−4.6, +6.9), and 14.6 (−4.1, +5.0) arcseconds; for H.R.F., it was 32.8 arcseconds (−9.2, +11.4), 27.3 (−5.6, +6.8), and 59.9 (−15.7, +34.1). Thus, the differences across optical conditions were not statistically reliable. There was also no consistent relationship between the optical quality of individual observers and their performance in the task. For example, H.R.F.'s 6 mm optical quality was significantly better than B.N.S.V.'s, but her disparity threshold was poorer than his: 32.8 versus 18.7 arcseconds. Again, we conclude that stereo resolution is not limited by the blur associated with normal, well-focused optics.
Stereo resolution with low-contrast stimuli
Improving optical quality increases retinal-image contrast. Perhaps the failure to observe an improvement in stereo resolution was due to a saturating nonlinearity early in visual processing (MacLeod et al., 1992; Chen et al., 1993) such that the high-contrast dots and lines were effectively clipped, and therefore the contrast increase was not retained for processing at later neural stages. If this were the case, improving optical quality should yield better resolution with lower-contrast stimuli. We examined this possibility in two ways.
First, we retested observer B.N.S.V. in the corrugation task with low-contrast stimuli. The background was gray with a retinal illuminance of 561 Td. Contrast, defined as $(L_{dot} - L_{bkgrnd})/L_{bkgrnd}$, ranged from 0.125 to 1. Dot density was fixed at 232 dots/deg². The results are plotted in Figure 4. Reducing contrast had no reliable effect on the highest discriminable corrugation frequency at contrasts from 0.25 to 1. At a contrast of 0.125, there was a clear reduction in stereo resolution in the 6 mm condition, but this is because the dots became generally invisible. Thus, this particular finding is due to dot visibility rather than the precision of stereo processing.
Second, we redid the two-line experiment with lower-contrast stimuli. Observers B.N.S.V. and H.R.F. participated. The background was gray with a retinal illuminance of 446 Td, and contrast was varied from 0.125 to 1. The results are shown on the left side of Figure 3. There was again no systematic effect of optical condition at any contrast provided that the lines were visible. B.N.S.V.'s disparity threshold was not measurable at 0.125 in the 6 mm condition because he could not see the lines.
We conclude that the failure to observe an improvement in stereo resolution with improved optics is not a byproduct of clipping due to a saturating nonlinearity. Human stereopsis simply does not seem to benefit from better retinal-image quality than that associated with natural viewing.
Discussion
Humans can discriminate binocular disparities much smaller than a single foveal photoreceptor (Westheimer, 1979), so stereopsis is generally considered an extremely precise process. It is, therefore, surprising that providing sharper-than-normal retinal images, while improving contrast sensitivity, has no measurable effect on stereopsis.
Perhaps the limiting process is in visual cortex where the two eyes' images are combined. The standard model of disparity estimation involves disparity-energy calculation in cortex (Ohzawa et al., 1990). Disparity estimation by a population of disparity-energy units is well modeled by local cross-correlation (Anzai et al., 1999; Banks et al., 2004). The receptive fields associated with this computation must be large enough to allow meaningful computation of interocular correlation. But large receptive fields have limited stereo resolution because they cannot signal disparity variation finer than their own size. Consequently, the smallest available receptive fields determine the resolution of stereopsis (Banks et al., 2004; Nienborg et al., 2004; Filippini and Banks, 2009). Presumably, the size of the smallest fields has been determined by the visual diet they have been provided, and that diet is limited in everyday vision by the ongoing optical quality of the eye, which is adversely affected by high-order aberrations, chromatic aberration, tear film changes, and accommodative fluctuations (Charman and Heron, 1988; Montés-Micó et al., 2004). Thus, a smaller receptive-field size may not be available because it would have no use in daily life. It would be interesting to know if long experience with improved optics, such as with aberration-correcting contact lenses (Sabesan et al., 2007), would yield smaller receptive fields that would then allow an increase in stereo resolution. But if we apply the same logic to luminance contrast sensitivity and visual acuity, the explanation falls short: underlying neural mechanisms for luminance processing should also have developed receptive-field sizes that are appropriate for the visual diet they receive. Why then do contrast sensitivity and visual acuity improve with super-normal optics while stereo resolution does not?
Perhaps the difference is due to how eye movements affect performance in visual resolution and stereo resolution tasks. During fixation, the eyes continually jitter, drift, and make micro-saccades (St. Cyr and Fender, 1969; Rucci et al., 2007). The visual system integrates over time, so such movements cause spatial blur in the stored retinal image. There is, however, little to no influence of these small movements on monocular contrast sensitivity at high spatial frequencies (Packer and Williams, 1992), presumably because there are epochs in which the eye is stationary, thereby providing a sharp stored image and/or because some movements are parallel to the grating and do not blur the stored image. These small eye movements may have a greater effect on stereopsis. To estimate disparity, the brain cross-correlates the two eyes' images (Ohzawa et al., 1990; Banks et al., 2004). The cross-correlation operation presumably has its own integration time, which might be rather long given the inability to respond to fast alternations in disparity (Norcia and Tyler, 1984; Nienborg et al., 2005). Therefore, any difference in the movements of the two eyes would degrade the output of the cross-correlator. The movements of the two eyes during fixation are partially uncorrelated (St. Cyr and Fender, 1969), so there will be very few epochs in which both eyes are stationary or in which both are moving parallel to the depth structure. From this observation, we hypothesize that small eye movements during fixation are more detrimental to stereo resolution than to visual resolution (i.e., visual acuity, contrast sensitivity). Specifically, these movements cause changes in disparity estimation that are similar to spatial blur. As a consequence, the blur due to eye movements may be the limit to stereo performance rather than the blur inherent to the optics of normal, well-focused eyes.
This work was supported by National Institutes of Health Research Grants R01-EY08266 (M.S.B.) and R01-EY01499 (G.Y.), and The Netherlands Organization for Scientific Research Rubicon fellowship 446-06-021 (B.N.S.V.). We thank Austin Roorda for assistance in the wavefront measurements.
Correspondence should be addressed to Martin S. Banks, Vision Science Program, School of Optometry, 360 Minor Hall, University of California, Berkeley, Berkeley, CA 94720-2020.