Abstract
Cortical visual neurons in the cat and monkey are inhibited by stimuli surrounding their receptive fields (surround suppression) or presented within their receptive fields (cross-orientation or overlay suppression). We show that human contrast sensitivity is similarly affected by two distinct suppression mechanisms. In agreement with the animal studies, human surround suppression is tightly tuned to the orientation and spatial frequency of the test, unlike overlay suppression. Using a double-masking paradigm, we also show that in humans, overlay suppression precedes surround suppression in the processing sequence. Surprisingly, we find that, unlike overlay suppression, surround suppression is only strong in the periphery (>1° eccentricity). This result argues for a new functional distinction between foveal and peripheral operations.
Introduction
In the last decades, neurophysiologists have devoted a substantial effort to the study of suppressive phenomena in the visual cortex (Fitzpatrick, 2000; Carandini, 2004). These phenomena are seen when a stimulus that does not affect the responses of a neuron by itself markedly suppresses the responses to an optimal test stimulus (i.e., masks the test). Two forms of suppression have been isolated in the primary visual cortex of cats and monkeys. One is “surround suppression,” in which the mask has the orientation preferred by the neuron but is presented outside the receptive field (DeAngelis et al., 1994; Cavanaugh et al., 2002). The other is “overlay suppression,” in which the mask is presented within the receptive field (superimposed on the test) and can have any orientation (Morrone et al., 1982; Bonds, 1989; DeAngelis et al., 1992; Carandini et al., 1997).
In a pair of elegant neurophysiological studies, DeAngelis et al. (1992, 1994) showed that these two types of suppression had profoundly different effects on the response of V1 neurons in the cat cortex. First, they demonstrated that the suppression produced by targets presented within the excitatory receptive field was not tuned for orientation, crossed or otherwise, because stimuli at the preferred orientation of the neuron (but nonpreferred spatial frequency) would suppress the firing rate to the same degree as when they were orthogonal to the preferred orientation. They also showed that this suppression, which we call overlay suppression, originated from a region that was comparable with or smaller than the excitatory receptive field. Next, DeAngelis et al. (1992, 1994) showed that the excitatory receptive field was surrounded on all sides by an inhibitory zone, tuned to the same orientation and spatial frequency as the excitatory response, although the tuning for suppression was somewhat broader than the tuning for excitation. It has now been shown that surround suppression is a common property of neurons in early visual areas of both cats and primates (Fitzpatrick, 2000; Cavanaugh et al., 2002; Carandini, 2004).
Despite these findings in cats and primates, there is scant evidence in human vision for a clear distinction between overlay and surround suppression. The addition of a high-contrast overlay mask, oriented orthogonal to the test target, raises psychophysical thresholds, although generally not to the same degree as the mask (pedestal) of the same orientation as the test (Ross and Speed, 1991; Ross et al., 1993; Meier and Carandini, 2002). Foley (1994) proposed a model of contrast masking that brought psychophysical data for overlay masking into agreement with physiological observations (Chen and Foley, 2004).
The psychophysical evidence for overlay suppression is so pervasive that studies often fail to differentiate it from surround suppression. Thus, most of the early psychophysical studies used large grating targets with coextensive masks and therefore tended to confound the effects of overlay and surround suppression. Most recent contrast threshold measurements in the fovea have found that adjacent targets enhance contrast sensitivity (Polat and Sagi, 1993; Yu et al., 2003), whereas studies in the periphery have noted that adjacent targets suppress sensitivity (Chubb et al., 1989; Solomon et al., 1993; Wilkinson et al., 1997; Snowden and Hammett, 1998; Xing and Heeger, 2001; Zenger-Landolt and Koch, 2001). We asked whether there are two distinct suppressive mechanisms in human vision and whether they are consistent with the neurophysiological suppression mechanisms.
Materials and Methods
The test target was a standard Gabor (σ = λ/√2) in which ∼1.5 periods (0.3° in fovea) of the sinusoidal pattern were visible. The Gabor spatial frequency was 5 cycles per degree (cpd) for foveal measurements. The frequency and all dimensions were scaled proportionally for peripheral presentation according to the cortical magnification factor given by the formula: target frequency (cpd) = 5/[1 + eccentricity (degree)/3] (Rovamo and Virsu, 1979). Thin (1 pixel) low-contrast circles surrounded the target region to reduce the observer's uncertainty about the target locations, particularly for test targets presented without a mask. Target duration was 150 ms; viewing was binocular. The mask was either a 30% contrast Gabor patch added to the target (overlay masking) or an annulus of a 10% contrast sinusoidal grating surrounding the target (surround masking), as shown in Figure 1. For the measurements on surround suppression in the fovea, the circular annulus had an inner radius of 0.4° and an outer radius of 1.6°. It contained a sinusoidal grating of variable contrast, orientation, and spatial frequency; a blank region (at the background luminance) ∼1 period wide (0.2° in fovea) separated the target from the annular mask. For the measurements on overlay suppression, the superimposed mask was a Gabor of variable contrast, orientation, and spatial frequency. When the spatial frequency was varied, the size of the overlay mask was scaled proportionally with its spatial frequency according to σ = λ/√2. For the peripheral measurements, we used a two-alternative forced-choice (2AFC) procedure in which the test target appeared at one of two locations, above and below the fixation point at equal eccentricities; the masks and faint dark circles were presented at both locations on all trials. For the foveal measurements, we used a two-interval forced-choice (2IFC) procedure.
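The scaling rule and Gabor definition given above can be illustrated with a minimal sketch. The function names, the orientation and phase conventions, and the exact form of the Gaussian envelope are assumptions made for illustration; they are not taken from the stimulus-generation code used in the study.

```python
import math


def scaled_frequency(eccentricity_deg, foveal_cpd=5.0, e2_deg=3.0):
    """Target spatial frequency (cpd) from the formula in the text:
    frequency = 5 / [1 + eccentricity / 3]."""
    return foveal_cpd / (1.0 + eccentricity_deg / e2_deg)


def gabor_sigma(frequency_cpd):
    """Envelope sigma (deg) for a standard Gabor, sigma = lambda / sqrt(2)."""
    wavelength_deg = 1.0 / frequency_cpd
    return wavelength_deg / math.sqrt(2.0)


def gabor_contrast(x_deg, y_deg, frequency_cpd, contrast,
                   orientation_deg=0.0, phase_rad=0.0):
    """Signed contrast of a Gabor at position (x, y) relative to its center.

    Assumption: the envelope is exp(-r^2 / (2 sigma^2)) and the carrier is a
    cosine; the text specifies only sigma = lambda / sqrt(2).
    """
    sigma = gabor_sigma(frequency_cpd)
    theta = math.radians(orientation_deg)
    # Coordinate along the direction of luminance modulation.
    u = x_deg * math.cos(theta) + y_deg * math.sin(theta)
    envelope = math.exp(-(x_deg ** 2 + y_deg ** 2) / (2.0 * sigma ** 2))
    carrier = math.cos(2.0 * math.pi * frequency_cpd * u + phase_rad)
    return contrast * envelope * carrier


if __name__ == "__main__":
    for ecc in (0.0, 3.0):
        f = scaled_frequency(ecc)
        print(f"eccentricity {ecc} deg: {f:.2f} cpd, sigma {gabor_sigma(f):.3f} deg")
```

Because every stimulus dimension is expressed in units of the wavelength, changing the eccentricity rescales the target, the gap, and the annulus together.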
The fixation mark, comprising two low-contrast concentric circles and a pair of nonius lines, was displayed at the beginning of each 2AFC trial and also in the interstimulus interval of each 2IFC trial but disappeared 150 ms before the stimulus interval onset.
Stimuli were displayed on a gray background (42 cd/m²) and viewed through a Wheatstone stereoscope on a pair of linearized Sony (Tokyo, Japan) Trinitron G220 monitors. Four subjects with normal or corrected visual acuity were tested. Two of the subjects were naive to the purpose of the study. The task was to indicate (with a button press) in which location (2AFC) or interval (2IFC) the target was shown. We used the adaptive staircase algorithm of Kontsevich and Tyler (1999) to estimate detection thresholds of 76% correct, corresponding to a d′ of 1. Typically, ≥300 trials were accumulated in blocks of 100 or 150 trials for each threshold measurement. Because experimental results varied little between subjects, we present data averaged over the four subjects (individual data are available on the Petrov website).
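The correspondence between 76% correct and a d′ of 1 can be checked numerically. The sketch below assumes the standard equal-variance Gaussian signal detection model for two-alternative tasks, in which proportion correct equals Φ(d′/√2); the text states only the correspondence itself, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def proportion_correct_2afc(d_prime):
    """Proportion correct in a two-alternative task for a given d'.

    Assumption: equal-variance Gaussian model, Pc = Phi(d' / sqrt(2)).
    """
    return _STANDARD_NORMAL.cdf(d_prime / math.sqrt(2.0))


def d_prime_from_proportion_correct(pc):
    """Inverse of the relation above; pc must lie strictly between 0 and 1."""
    return math.sqrt(2.0) * _STANDARD_NORMAL.inv_cdf(pc)


if __name__ == "__main__":
    print(round(proportion_correct_2afc(1.0), 3))
```

Under this assumption, a d′ of 1 yields a proportion correct of approximately 0.76, which matches the threshold criterion quoted above.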
It is important to note how the particular stimuli and parameter values were chosen for this study. First, we wanted to maximize the effect of the surround mask (e.g., by choosing the contrast of the surround at 10%) (see Fig. 4), and second, we wanted to make the comparison with neurophysiological results relatively straightforward. Here, we have focused on the orientation, spatial frequency, contrast, and eccentricity aspects of suppression, but other dimensions of the stimulus were probed as well. These included target pedestal contrast, mask contrast, mask phase, and the spatial layout of the surround (i.e., separation from the target, location relative to the target, disparity, etc.), as well as timing between the mask and the target onset. These data could not be presented in one brief article and will be published elsewhere. In a nutshell, the results show that suppression is a ubiquitous phenomenon not restricted to the particular type of stimulus used here. In particular, we found that surround suppression is fairly independent of the surround phase, eye-of-origin, or its spatial layout around the target. Thus, when sectors (quadrants) were used instead of the full annulus, the position of the sectors (collinear at Gabor ends vs parallel at Gabor sides) did not have much effect on suppression strength.
Results
Guided by neurophysiological results for cats and primates, we expected that the major difference between the two forms of suppression would be their dependence on mask orientation and spatial frequency. For these experiments, we compared performance for no-mask and masked 1.3 cpd Gabor targets at 6° eccentricity. This eccentricity is representative of the range of eccentricities at which single-cell recordings are usually performed. Typically, detection thresholds in the no-mask condition were between 1 and 3% contrast. Figure 2a plots the ratio (suppression factor) of the masked to unmasked thresholds as a function of mask orientation relative to the target. For both surround and overlay masking, the strongest suppression was found for masks of the same orientation as the target. Yet, compared with overlay suppression, surround suppression was much more tightly tuned: overlay suppression was still strong for the cross-oriented (orthogonal) mask, whereas surround suppression disappeared once the relative orientation between the target and the annulus exceeded 45°. These results agree with physiological measurements of surround and overlay suppression in cats (DeAngelis et al., 1992, 1994) and primates (Cavanaugh et al., 2002).
Varying the spatial frequency of the masks produced similar results (Fig. 2b). For both overlay and surround masking, the largest threshold elevation was observed for a mask of the same spatial frequency as the target. However, for the overlay mask, suppression remained almost constant until mask and target frequencies differed by more than a factor of 4. In contrast, the tuning curve for surround suppression was sharply peaked, with a bandwidth of ∼1.5 octaves.
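The two tuning widths above are quoted in different units (a frequency ratio for overlay suppression, octaves for surround suppression). A small sketch of the conversion, with hypothetical function names, puts them on a common scale; one octave is a factor of 2 in spatial frequency.

```python
import math


def ratio_to_octaves(frequency_ratio):
    """Convert a spatial frequency ratio to octaves (log base 2)."""
    return math.log2(frequency_ratio)


def octaves_to_ratio(octaves):
    """Convert a distance in octaves back to a frequency ratio."""
    return 2.0 ** octaves


if __name__ == "__main__":
    # A factor of 4 between mask and target frequency, as for the overlay mask.
    print(ratio_to_octaves(4.0))
    # A full bandwidth of 1.5 octaves expressed as a frequency ratio.
    print(round(octaves_to_ratio(1.5), 2))
```

A factor of 4 on either side of the target frequency corresponds to 2 octaves in each direction, whereas a 1.5 octave bandwidth spans a total frequency ratio of roughly 2.8.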
So far, we have shown that psychophysical measurements are in substantial agreement with the known physiology when stimuli are presented in the periphery. However, almost all previous psychophysical studies were done with stimuli in the fovea. In the next experiment, we varied stimulus eccentricity from 0 to 7° to see whether this could explain the lack of psychophysical evidence for surround suppression in these previous studies. The surround mask was of the same orientation as the target, but the overlay mask was orthogonal to the target. Spatial frequencies of the masks were equal to the spatial frequency of the target, and all of the dimensions of the stimulus (target and masks) were varied with eccentricity according to the cortical magnification factor (see Materials and Methods).
For all subjects, we found strong overlay suppression and surround suppression in the periphery but only overlay suppression in the fovea. Figure 2c plots the ratio (suppression factor) of the masked to unmasked thresholds as a function of eccentricity. The annular mask produced, at most, a 30% increase in threshold in the fovea, but suppression rose rapidly with increasing eccentricity, reaching a plateau at ∼4° eccentricity where thresholds were three times their unmasked value. However, masking by an orthogonal grating overlaid on the target produced no consistent difference between foveal and peripheral loci in our four subjects. Note that we repeated the foveal measurements with different spatial frequency Gabor targets (square, 1.3 cpd; diamond, 14 cpd; triangle, 18 cpd); there was no observed suppression for any of these frequencies and different-sized targets.
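The suppression factor plotted in Figure 2 is the ratio of the masked to the unmasked threshold, so a 30% threshold increase corresponds to a factor of 1.3 and a tripled threshold to a factor of 3. The sketch below makes this arithmetic explicit; the function names and the threshold values in the usage example are hypothetical and are not data from the study.

```python
def suppression_factor(masked_threshold, unmasked_threshold):
    """Ratio of masked to unmasked contrast detection thresholds."""
    if unmasked_threshold <= 0:
        raise ValueError("unmasked threshold must be positive")
    return masked_threshold / unmasked_threshold


def percent_increase(factor):
    """Threshold elevation expressed as a percentage of the unmasked threshold."""
    return (factor - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical thresholds in percent contrast, chosen only for illustration.
    unmasked = 2.0
    masked = 6.0
    factor = suppression_factor(masked, unmasked)
    print(factor, percent_increase(factor))
```

A factor of 1 indicates no effect of the mask, and values below 1 would indicate facilitation rather than suppression.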
The striking difference between the two types of suppression suggests that they are implemented by different mechanisms. What is the order of these mechanisms: do they operate in parallel or does one precede the other? To answer this question, we added a second mask to our stimuli arranged to suppress the first mask but not the target. Stimuli were shown at 6° eccentricity; the spatial frequency was 1.3 cpd. For the stimulus in Figure 3c, the mask was superimposed orthogonally on the annular surround to form a plaid. It follows from the orientation properties of overlay suppression that the new (orthogonal) mask will suppress the collinear mask. Yet, because of the sharp orientation tuning of surround suppression, the orthogonal mask should not, by itself, have any effect on the target. Indeed, as the contrast of the second orthogonal mask increased, suppression fell significantly. Thus, the overlaid mask reduced the suppression produced by the collinear mask. Therefore, the collinear surround was suppressed before it had a chance to suppress the target. This result is completely consistent with the disinhibition of surround suppression found in cats (Walker et al., 2002).
Is the opposite also true? Would the surround reduce the effectiveness of a mask overlaid on the target? For this stimulus, we superimposed an orthogonal mask on the target and surrounded it with an annulus of the same orientation as the superimposed mask (Fig. 3b). Again, the surround, being orthogonal to the target, should have a minimal direct effect on the target detection, but it could, in principle, suppress the overlay mask, improving sensitivity. In fact, the surround increased overlay suppression. This indicates that surround suppression occurred later in the sequence of processing and therefore could not prevent the overlay mask from suppressing the target. We conclude that the two suppressive mechanisms are arranged in series and that overlay suppression operates before surround suppression.
Discussion
To summarize, our results reveal that the same two mechanisms of suppression seen in neurons of cats and monkeys are present in human vision. These mechanisms are distinct and operate in series, overlay suppression coming first. Our most surprising result is that foveal contrast detection thresholds show very little surround suppression. This would explain the lack of previous evidence for strong surround suppression in humans: the majority of psychophysical studies positioned stimuli in the center of the visual field, whereas neurophysiologists purposely avoid the small receptive fields and the confluence of visual areas of the primate fovea.
Is it possible that this result just represents a failure of scaling? We used a cortical magnification factor derived by Rovamo and Virsu (1979). They demonstrated that this spatial frequency scaling produced constant contrast detection thresholds at all eccentricities. As we were measuring contrast detection, this scaling factor seemed most appropriate for our measurements. We did confirm the findings of Rovamo and Virsu (1979). In the absence of a mask, Gabor targets scaled according to this magnification factor (starting from 5 cpd in the fovea) produced constant detection thresholds over the range of tested eccentricities. We also tested surround suppression in the fovea for three control spatial frequencies: 1.3 cpd (all four subjects), 14 cpd (SPM), and 18 cpd (YP). The dimensions of the stimulus were scaled proportionately. We were unable to probe higher frequencies, because detection thresholds for a Gabor target become too large to accommodate any significant surround suppression. The results are shown by open symbols in Figure 2c. No significant suppression was found in the fovea for either low or high spatial frequencies, which demonstrates the generality of the result.
A possible explanation for peripheral surround suppression could be that the excitatory summation zone in the periphery is disproportionately large compared with the fovea. If this were true, the surround might produce standard overlay (pedestal) masking through indirect stimulation of the neural mechanisms responding to the test target (Snowden and Hammett, 1998). We think this explanation is unlikely because, as shown above, orientation and spatial frequency tuning differ greatly between the two masking types. In addition, we showed that the threshold versus contrast (TvC) functions are different for pedestal and surround masking. In a control experiment, we measured contrast discrimination thresholds (i.e., thresholds for contrast increment detection) as a function of pedestal contrast for a single Gabor target at 6° eccentricity. The bottom curve in Figure 4 shows the resulting TvC curve averaged over four observers. Next we measured contrast detection thresholds for the same target surrounded by a mask of the same orientation and spatial frequency, now as a function of the surround contrast. The top curve shows the resulting averaged thresholds. If surround suppression were a simple pedestal effect, the two curves would have looked similar (up to some monotonic transformation of the x-axis). Yet, they have very different shapes. In particular, the surround suppression does not show the dip at ∼3% contrast characteristic of TvC curves. Instead it peaks at ∼10% contrast. Zenger et al. (2000) observed a similar saturation in contrast thresholds measured as a function of the contrast of two adjacent Gabors at 4° eccentricity.
Here we chose contrast sensitivity as a measure of the suppression effects, primarily because of a straightforward relationship between the TvC curve and the response function of the underlying neuronal pool. Some studies have used perceived contrast instead (Cannon and Fullenkamp, 1991). Because there is no simple relationship between contrast sensitivity and perceived contrast, a direct comparison is not possible. Nevertheless, both measures revealed similar surround suppression properties, including orientation (Solomon et al., 1993; Xing and Heeger, 2001) and spatial frequency (Chubb et al., 1989) tuning, as well as stronger suppression in the periphery (Xing and Heeger, 2000).
Our results provide strong evidence that surround suppression is distinct from overlay suppression both in function and in the neuronal locus. Although the Foley (1994) model assigns the role of a contrast normalization mechanism to overlay suppression, the role of surround suppression is not well understood. The sharp tuning to orientation and spatial frequency suggests that it is not simple contrast normalization. Because of the orientation tuning of neural surround suppression, Schwartz and Simoncelli (2001) suggested that it was a special normalization mechanism that used redundancy in natural images to enhance cortical response specificity. Our peripheral results are consistent with this suggestion, but it is difficult to understand why this special normalization would not also occur in the fovea. An alternative application for such a narrowly tuned long-range suppression is texture segmentation, as in the models by Malik and Perona (1990) and by Li (2000).
Intriguingly, our results show that surround suppression is present only in the periphery. This indicates that the periphery is not just a “poor cousin” of the fovea but rather suggests a deeper, functional distinction. We speculate that the fovea and periphery perform different levels of analysis in the texture segmentation task. By masking homogeneous peripheral regions, surround suppression performs a rough presegmentation analysis that assists in the selection of salient sites (e.g., object boundaries) for subsequent saccades. A saccade to the chosen region of interest is followed by more-refined processing in the fovea unhindered by the distortion of visual information that would result from surround suppression.
Footnotes
This work was supported by National Institutes of Health (NIH)–National Eye Institute Grant 06644 (S.M.) and by an NIH–National Eye Institute Institutional National Service Award (Y.P.). We thank Dr. Bart Farell for suggesting the experiment diagrammed in Figure 3b.
Correspondence should be addressed to Dr. Yury Petrov, The Smith-Kettlewell Eye Research Institute, 2318 Fillmore Street, San Francisco, CA 94115. E-mail: yury{at}ski.org.
DOI:10.1523/JNEUROSCI.2871-05.2005
Copyright © 2005 Society for Neuroscience 0270-6474/05/258704-04$15.00/0