A long-standing puzzle in vision is the assignment of illusory brightness values to visual territories based on the characteristics of their edges (the Craik–O’Brien–Cornsweet effect). Here we show that the perception of the equiluminant territories flanking the Cornsweet edge varies according to whether these regions are more likely to be similarly illuminated surfaces having the same material properties or unequally illuminated surfaces with different properties. Thus, if the likelihood is increased that these territories are surfaces with similar reflectance properties under the same illuminant, the Craik–O’Brien–Cornsweet effect is diminished; conversely, if the likelihood is increased that the adjoining territories are differently reflective surfaces receiving different amounts of illumination, the effect is enhanced. These findings indicate that the Craik–O’Brien–Cornsweet effect is determined by the relative probabilities of the possible sources of the luminance profiles in the stimulus.
The distorted perception of territorial qualities as a result of adjacent regions and their boundaries was first reported by Chevreul after an early 19th century investigation of the wool-dyeing industry in France (Chevreul, 1824). While validating the legitimacy of dye techniques in response to complaints that the fabric patterns from certain shops did not look as bright as expected, Chevreul initiated an ongoing interest in the influence of edges on the brightness and color of the adjacent surfaces. Since Mach’s work in the latter part of the 19th century (Mach, 1865, 1886), such effects have generally been explained in terms of lateral interactions among retinal (or other lower order) sensory elements. The discovery of antagonistic lateral interactions in the eye of the horseshoe crab (Hartline, 1938) and later the cat (Kuffler, 1953) supported Mach’s theoretical explanation of contrast effects, leading to much further work and eventually to the incorporation of this theory into both the electrophysiological and psychological canon (for review, see Ratliff, 1965; Cornsweet, 1970).
Despite this history, it has long been recognized that various additional “cues” in visual scenes (e.g., the shape and shadowing of objects) influence the perception of relative brightness [see von Helmholtz (1924), Evans (1948), and Beck (1972) for reviews of the earlier literature]. There has been little agreement, however, about how such cues are used in visual processing. For von Helmholtz (1924), shape and shading (among other factors) provided a basis for making “unconscious inferences” about the nature of the scene, which allowed the observer to “discount the illuminant” and thus to perceive the underlying “constant” qualities of surfaces. More recently, some aspects of brightness and color perception have been successfully modeled by computational algorithms based on luminance or spectral ratios across the entire scene (Land and McCann, 1971). Such ratiometric computations, however, cannot be readily applied to other sorts of cues that influence brightness [e.g., three-dimensional (3-D) shape], and other investigators have proposed more limited algorithmic rules to explain the effects of a variety of specific cues on brightness (Knill and Kersten, 1991; Adelson, 1993; Pessoa et al., 1996; Wishart et al., 1997).
The success of some of these models notwithstanding, there has been no obvious way to include these various observations and explanations of particular brightness phenomena within a single theoretical framework. Moreover, other observations have made plain that the retinal explanation of brightness illusions favored by Mach (1865, 1886), Ratliff (1965), Cornsweet (1970), and other earlier investigators (which is still found in most textbooks) cannot account for many aspects of these perceptual phenomena (Gilchrist, 1977; Gilchrist et al., 1983; Adelson, 1993, 1999; Williams et al., 1998a,b). In considering these problems and a possible solution to them in the context of simultaneous brightness contrast (Williams et al., 1998a,b; Lotto and Purves, 1999) and subsequently Mach bands (Lotto et al., 1999a,b), we suggested that all perceptions of luminance are empirically based associations instantiated in the nervous system according to the relative frequency of occurrence of the possible sources of the stimulus in question.
To test the merits of this hypothesis in relation to edge effects, we here examine a probabilistic explanation of a class of such phenomena referred to as the Craik–O’Brien–Cornsweet effect, in which the territories adjacent to boundaries defined by specifically constructed luminance gradients are perceived to have relative brightnesses that differ from their measured (photometric) qualities (for review, see Kingdom and Moulden, 1988).
MATERIALS AND METHODS
Construction of graphics. The graphics used to test perception of the Cornsweet stimulus were created with a Power Macintosh G3 computer (Apple Computer, Cupertino, CA), Adobe Illustrator 8.0 and Photoshop 5.0 (Adobe Systems, San Jose, CA), and StudioPro 2.0 (Strata, George, UT). The territories on either side of the Cornsweet edge (see Fig. 2) were set at a gray scale value of 156 (33 cd/m² on the monitor used for these studies), the gradients spanning 99 U (16 cd/m²) across the light gradient and 102 U (17 cd/m²) across the dark gradient; the gradients increased or decreased parabolically between their initiation and termination. These gray scale values and the ratio of the area of the gradient relative to the size of the adjoining territories were kept constant in all test images. A checkerboard background with an average luminance of 65 ± 2 cd/m² was used in all the images tested, except the one illustrated in Figure 8 (which was also tested against an appropriate checkerboard control).
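The luminance profile described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used to build the test images (which were drawn in commercial graphics software): the function name `cornsweet_profile`, the pixel widths, the placement of the light gradient to the left of the edge, and the choice of a simple squared ramp as the "parabolic" form are all hypothetical. Only the plateau value (156) and the gradient spans (99 and 102, read here as gray scale units) come from the text.

```python
def cornsweet_profile(width=200, gradient_width=40, base=156,
                      light_span=99, dark_span=102):
    """Return one row of gray scale values (0-255) across a Cornsweet edge.

    Assumptions (not from the paper): the light gradient lies left of the
    edge, the dark gradient right of it, and each gradient is a squared
    (parabolic) ramp over `gradient_width` pixels.
    """
    edge = width // 2
    row = []
    for x in range(width):
        if x < edge:
            # Fraction of the way from the start of the light gradient to the edge.
            t = max(0.0, (x - (edge - gradient_width) + 1) / gradient_width)
            row.append(round(base + light_span * t ** 2))
        else:
            # Fraction of the way from the far end of the dark gradient back to the edge.
            t = max(0.0, (edge + gradient_width - x) / gradient_width)
            row.append(round(base - dark_span * t ** 2))
    return row


profile = cornsweet_profile()
edge = len(profile) // 2
# Far-left plateau, light peak at the edge, dark trough at the edge, far-right plateau.
print(profile[0], profile[edge - 1], profile[edge], profile[-1])  # 156 255 54 156
```

The two plateaus carry the same value, so any brightness difference reported between them cannot be attributed to a difference in their luminances.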
Selection and testing of subjects. The six test images and an additional control image were presented to 20 subjects with normal acuity and color vision (the 3 authors and 17 naive subjects who were paid for their participation) on a calibrated 48 cm (diagonal) color monitor (Sony Multiscan 300sf; monitor resolution = 1024 × 768; color depth set to millions of colors; scan rate 75 Hz, noninterlaced). The monitor was viewed at a distance of 60 cm in an otherwise darkened room to which the subjects were adapted before testing. The sequence of presentation is shown in Table 1.
For each of the scenes tested, subjects were asked to adjust two small squares remote from the stimulus until they matched the apparent difference in brightness between the two territories adjoining the edge in the Cornsweet stimulus (Fig. 1). An interface created in Director 6.0 (Macromedia, San Francisco, CA) provided “buttons” under each square that allowed the subject to darken or lighten the remote squares, and a “match” indicator that recorded the gray scale values of the fully adjusted squares. When a subject clicked the “lighten button,” the value originally assigned to the square was increased ∼0.5 cd/m²; conversely, the “darken button” reduced the value by this amount. To ensure that all subjects matched approximately the same regions of the Cornsweet stimulus, two small reference dots were placed in the territories adjoining the Cornsweet edge, as indicated in Figure 1, to remind the subjects in each presentation of the areas that they were to compare. Once the two remote squares had been adjusted to appear as nearly similar to the corresponding territories in the Cornsweet stimulus as was deemed possible, the subject designated a match, resetting both remote squares to their initial values and launching the next image. Selecting the match button also recorded the chosen gray scale values and exported them to a spreadsheet for subsequent analysis. The total testing time was ∼30 min. Each such test was taken on two occasions, separated by an interval of at least 4 weeks to minimize any priming effects.
Under a given set of conditions, perceived brightness is linearly related to CRT gray scale values (Wandell, 1995). For the sake of simplicity, we have presented the results in terms of the median percentage difference in the gray scale adjustments made in the two remote test areas to match the apparent brightness of the two flanking territories in the Cornsweet stimulus.
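As a rough illustration of the summary measure just described, the following Python sketch computes a percentage difference for each subject and then takes the median across subjects. The matches listed are invented for the example and are not data from this study, and the choice of the darker match as the denominator is an assumption, because the text does not specify how the percentage was normalized.

```python
from statistics import median


def percent_difference(lighter, darker):
    """Percentage by which the lighter match exceeds the darker match.

    Normalizing by the darker value is an assumption made for this sketch.
    """
    return 100.0 * (lighter - darker) / darker


# Hypothetical (lighter, darker) gray scale matches, one pair per subject.
matches = [(165, 150), (160, 152), (171, 150), (158, 151), (166, 148)]

per_subject = [percent_difference(light, dark) for light, dark in matches]
print(round(median(per_subject), 1))  # 10.0
```

Reporting the median rather than the mean keeps a single unusually large or small match from dominating the summary.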
Statistical comparisons. The perceived differences in the relative brightness of the territories on either side of the Cornsweet edge were taken as the average of the two trials for each subject, expressed as medians and ranges (see Table 1). The Friedman repeated measures on ranks test, followed by a pairwise multiple comparison procedure (Student–Newman–Keuls), was used to determine the levels of significance shown in Table 1. We specifically chose the Friedman test, which is the nonparametric equivalent of the repeated measures ANOVA test, because the variance of performance among the images tested was different, and because each subject viewed multiple images.
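For readers unfamiliar with the Friedman test, the sketch below shows how its statistic is formed: each subject's scores are ranked across the images, the ranks are summed per image, and the sums are compared with what equal treatment effects would predict. It is a minimal pure-Python illustration with invented scores. It omits the correction for ties, the conversion to a p value (for larger samples the statistic is conventionally referred to a chi-square distribution with k − 1 degrees of freedom), and the Student–Newman–Keuls post hoc step. The function names are hypothetical; in practice a library routine such as `scipy.stats.friedmanchisquare` would normally be used.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(table):
    """Friedman statistic for a table with one row per subject, one column per image."""
    n = len(table)
    k = len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    return 12.0 * sum(r * r for r in rank_sums) / (n * k * (k + 1)) - 3.0 * n * (k + 1)


# Hypothetical percentage differences: 4 subjects (rows) by 3 images (columns).
scores = [
    [10.0, 3.0, 18.0],
    [9.0, 2.5, 20.0],
    [11.0, 4.0, 17.5],
    [8.0, 3.5, 19.0],
]
print(friedman_statistic(scores))  # 8.0
```

Because only the within-subject ranks enter the statistic, unequal variances among the images do not affect it, which is the property that motivates the choice of a rank-based test.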
The Cornsweet illusion
The most thoroughly studied of the several Craik–O’Brien–Cornsweet edge effects is the so-called Cornsweet illusion (Cornsweet, 1970) (see also O’Brien, 1959; Craik, 1966; Kingdom and Moulden, 1988). In this illusion (Fig. 2), equiluminant territories adjoining opposing light and dark luminance gradients along a step boundary are filled in with brightness values that are different from one another, thus making it obvious that the perception of the stimulus does not accord with its actual luminances. In a standard presentation such as that in Figure 2 B, the territory adjacent to the light gradient appears brighter (lighter) than the territory adjoining the dark gradient. On average, subjects viewing this stimulus perceived a difference of 10% between the lighter and darker territories (Table 1).
Sources of luminance gradients
To understand how the Cornsweet stimulus might elicit these effects in a manner akin to the empirical generation of simultaneous brightness contrast illusions and Mach bands (Williams et al., 1998a,b; Lotto et al., 1999a,b), we first considered the possible sources of the luminance gradients that give rise to the standard illusion.
Luminance gradients are generated in one of two general ways: (1) from changes in the reflectance of surfaces or (2) from changes in the illumination of surfaces. Examples of luminance gradients arising from graded differences in the material properties (reflectances) of surfaces are illustrated in Figure 3 A. The sources of luminance gradients arising from graded differences of surface illumination are more varied and can be generated by (1) partial occlusion of an extended light source, which results in penumbras at the edges of shadows; (2) surface curvature, which alters the intensity of light reaching the surface as a function of the angle of incidence; (3) graded transmittance of objects, which also alters the amount of light reaching the eye from the surface in question; and (4) progressive diminishment of the light that reaches a surface as a function of distance from the origin of the light (Fig. 3 B). Whatever the source of a specific stimulus, a luminance gradient that arises from illumination generally signifies a variation in the amount of light reaching the eye from the object in question. As a result, the territory flanking the lighter edge of a luminance gradient based on illumination is typically more intensely lit than the territory flanking the darker edge. A luminance gradient arising from the reflectance properties of an object, on the other hand, does not imply this association, because the territories adjoining such gradients are usually illuminated to the same degree, as indicated in the examples in Figure 3 A. In short, the luminances of the territories adjoining a gradient based on illumination usually have a different significance than the territories adjoining a luminance gradient based on reflectances.
Possible sources of the luminance gradients in the Cornsweet stimulus
Although many specific sources could give rise to the standard Cornsweet stimulus (or something like it) (e.g., an evenly illuminated surface on which the gradients are painted; a “valley” in the plane of the territories adjoining the gradients; a “ridge” in the plane of the two adjoining territories and so on), such instances will typically have represented one of the two major categories of luminance gradients described in Figure 3: an opposing pair of gradients arising from reflectance properties or opposing gradients based on differences of illumination (Fig. 4).
If perceptions of brightness are governed empirically by what visual stimuli have turned out to be, then the perception of the Cornsweet stimulus should change in accordance with the relative probabilities of the underlying source of the stimulus. For instance, if the elements in the scene accord with the possibility that the luminances of the Cornsweet edge are reflectance features, and thus that the overall stimulus is uniformly illuminated (Fig. 4 A), then the perceived difference in brightness of the two adjoining surfaces should be decreased (because, based on past experience, the equiluminance of the adjoining territories will generally have arisen from two surfaces with the same material properties under the same light). Conversely, if the Cornsweet edge and other elements in a scene more closely accord with the possibility that the two equiluminant flanking surfaces are differently illuminated (Fig. 4 B), then the perceived difference in brightness should increase (because in past experience the equiluminant adjoining territories will usually have arisen from surfaces with different material properties under different light, and brightness, or more properly lightness, is how the visual system represents the reflectivity of objects). We tested these predictions in the following series of experiments.
Effect of increasing the probability that the source of the Cornsweet edge is graded differences in reflectance
The standard Cornsweet stimulus was embedded in a uniform surround, identical in luminance to the surfaces flanking the gradients (Fig. 5). By removing the background contrast without changing the elements of the Cornsweet stimulus per se, the probability of uniform illumination across the scene is increased (because the absence of a boundary around the flanking territories of the standard Cornsweet stimulus, and the uniformity of the background, increases the likelihood that both territories are composed of the same material seen in the same light). As a result, the salience of the illusion should be diminished.
When this change in the usual stimulus presentation was assessed quantitatively, subjects adjusted the luminances of the two remote test regions to very nearly the same value, in contrast to the adjusted luminances of the test regions required to match the apparent brightness difference of the territories in the standard presentation (a 3% difference vs a 10% difference, or a 69% reduction in the salience of the illusion; see Table 1). Thus, the illusion for all subjects was greatly reduced (or in some cases abolished) by this change, although the luminance relationships in the Cornsweet stimulus itself remained the same [see also Knill and Kersten (1991) and Buckley et al. (1994)].
Effects of increasing the probability that the source of the Cornsweet edge is graded differences in illumination
If the difference in brightness values assigned to the two adjoining territories is diminished by information that increases the probability that the Cornsweet edge is in effect painted (and thus that the adjoining territories are more likely to be similar surfaces under the same illuminant) (Fig. 4 A), then the difference in assigned brightness should be enhanced by information that increases the probability that the gradients arise from differences in illumination (and thus that the equiluminant adjoining territories are more likely to be objects that are differently illuminated and differently reflective) (Fig. 4 B). We tested this prediction in the following experiments.
The effect of perspective
The salience of the Cornsweet illusion should be increased by implying depth through the addition of perspective (i.e., by accurately depicting the diminution of apparent size with distance from the observer, as occurs in any 3-D to 2-D projection) (Fig. 6 A). The rationale for this prediction is that perspective increases the probability that the source of the opposing gradients is a doubly curved surface illuminated from the right (as indicated in Fig. 6 B). Accordingly, the equiluminant returns reaching the eye from the flanking regions are more likely to signify a less reflective surface in light and a more reflective surface in shadow.
When the Cornsweet stimulus was presented with perspective added, the perceived difference in the brightness of the two sides was 30% greater than when the stimulus was presented in the absence of perspective (Table 1).
The effect of stimulus orientation
A further prediction is that changing the overall orientation of the Cornsweet stimulus should also change its salience. Because humans evolved in an environment in which the primary source of illumination is usually from above (i.e., from the sun), the spatial arrangement of the same objects can look quite different when they are turned upside down (Fig. 7 A provides an example of this well-known effect). Thus, if the Cornsweet stimulus is rotated from its usual horizontal presentation (Fig. 2 B) such that the dark gradient is above and the light gradient below (Fig. 7 B), the stimulus is more likely to have been generated by light from above (because the direction of the gradients is consistent with a doubly curved surface arranged in this way). If, on the other hand, the same stimulus is rotated 180°, as in Figure 7 C, this likelihood is diminished.
Accordingly, when the equiluminant territory adjoining the dark gradient is uppermost (Fig. 7 B), its surface is likely to be less reflective than that of the lower territory. The reason is that the possible sources of the stimulus include at some higher level of probability an object whose uppermost surface is better lit than the lower surface (as indicated in the cutaway view to the right of the stimulus). When two surfaces return the same amount of light to the eye and one is better lit than the other, the better lit surface will always have been the less reflective. Because the visual system, according to our theory, constructs percepts based on the relative probabilities of the possible sources of the stimulus, the statistical influence of this increased probability causes the uppermost of the two equiluminant adjoining surfaces to appear darker than the lower one. When the stimulus was oriented in this way, subjects indeed perceived a brightness difference between the two surfaces that was 85% greater than in the standard presentation (Table 1).
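The physical relationship invoked in this argument, that of two surfaces returning the same light the better lit one must be the less reflective, can be checked with a short calculation. The Python sketch below assumes ideal matte (Lambertian) surfaces, for which luminance L (cd/m²), illuminance E (lux), and reflectance ρ are related by L = ρE/π. The function name and the illuminance values are hypothetical and chosen only for illustration; the luminance of 33 cd/m² is the value given for the flanking territories in Materials and Methods.

```python
import math


def implied_reflectance(luminance, illuminance):
    """Reflectance of an ideal matte surface, from L = rho * E / pi."""
    return math.pi * luminance / illuminance


luminance = 33.0  # cd/m^2, identical for the two flanking territories

# Hypothetical illuminances (lux): the upper surface is assumed to be better lit.
upper = implied_reflectance(luminance, 400.0)
lower = implied_reflectance(luminance, 200.0)

print(round(upper, 2), round(lower, 2))  # 0.26 0.52
```

With equal luminances, halving the illumination doubles the reflectance needed to produce the same return, which is why a territory taken to lie in shadow is consistent with a lighter surface.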
By the same reasoning, the perceived difference in the relative brightness of the two surfaces in the opposite orientation (Fig. 7 C) should be less than when the stimulus is oriented with the dark gradient uppermost. The reason is that, under these circumstances, it is less likely (although still quite possible) that the source of the stimulus is an object with differently reflective surfaces receiving different amounts of illumination. Consequently, the probability that the surfaces adjoining the Cornsweet edge in Figure 7 C have the same reflectance (and that the opposing gradients are painted features) is increased relative to the presentation in Figure 7 B (as indicated in the diagram on the right). When subjects were presented with the Cornsweet stimulus in this orientation, they perceived about the same difference in the relative brightness of the surfaces as in the standard stimulus, instead of the 85% increase seen in the opposite arrangement (Table 1).
The effect of combining probabilistic cues pertinent to the possible sources of the Cornsweet stimulus
A further prediction of a probabilistic theory of perceived brightness is that if two or more changes in the depiction of the Cornsweet stimulus occur together, they should combine in affecting relative brightness according to the direction of their separate influences on the relative probabilities of the possible sources of the stimulus. The scene in Figure 8 combines perspective, orientation, texture, additional gradients and objects, and a distinctive background, all of which accord in indicating that the two equiluminant territories in the Cornsweet stimulus (the object in the foreground) have a high probability of being differently reflective surfaces in light and shadow, respectively. Compared with the standard presentation of the Cornsweet stimulus in Figure 2 B (the luminances of this basic stimulus are still the same in the foreground object in Fig. 8), the perceived brightness difference of the territories adjoining the Cornsweet edge was increased by 168% (Table 1).
That the perceived intensity of a visual stimulus (its brightness) is dependent on context was described by both Hering (1964) and von Helmholtz (1924) and confirmed by the classic demonstrations of Gelb (1929) and later Wallach (1963). Despite this evidence, the interpretation of these phenomena has been much disputed by these and other investigators (see introductory remarks). Thus Hering argued that the assignment of brightness was primarily the result of the intrinsic properties of neural processing, von Helmholtz that there was a substantial contribution of “unconscious inference” to such perceptions, and Wallach that the perceived brightnesses could be understood quantitatively in terms of luminance ratios. Until recently, the consensus among both visual physiologists (Ratliff, 1965) and psychophysicists (Cornsweet, 1970) has been that territorial assignments of perceived light intensity that do not accord with photometric relationships are best explained as the result of lateral interactions among neurons in the retina (or at least the “lower order” input stages of the visual system).
This consensus notwithstanding, a number of investigators have concluded that perceptions of relative brightness cannot be explained in any simple way by the receptive field properties of lower order visual neurons as they are presently understood (Gilchrist, 1977; Gilchrist et al., 1983, 1999; Knill and Kersten, 1991; Adelson, 1993, 1999; Williams et al., 1998a,b; Lotto et al., 1999a,b). How then can the Cornsweet illusion and related misperceptions of luminance be accounted for?
An empirical explanation of the Cornsweet effect
The behavior of the Cornsweet effect that we describe indicates that this illusion, and perhaps the filling in of territorial brightnesses based on the nature of adjoining edges generally, is determined empirically by the relative probabilities of the possible sources of the stimulus. This explanation of the discrepancies between the measured luminances of the stimulus and what we actually see derives from recent studies of the familiar illusions of simultaneous brightness contrast (Williams et al., 1998a,b) and of Mach bands (Lotto et al., 1999a,b). Using a series of graphic tests, we showed that these phenomena can be explained satisfactorily as the result of an empirical process in which percepts are elicited as statistically generated associations determined by the relative probabilities of the possible sources of the stimulus in question. Thus, a gray patch on a dark background looks lighter than the same patch on a light background because the underlying source of the luminance profile on the printed page or the computer screen (or any other circumstance) will often have been a more reflective object in shadow and a less reflective one in light, a statistical fact that determines the related percept. We went on to show that the theory could also account for the appearance of Mach bands, the illusory light and dark zones that adorn luminance gradients (Lotto et al., 1999a,b). In this case, the common occurrence of highlights and lowlights at the initiation and termination of luminance gradients evidently leads to their probabilistic incorporation in the perception of similar gradients that lack these adornments.
The observations reported here extend this theory to the territorial assignment of “illusory” brightnesses as a consequence of adjoining edges (O’Brien, 1959; Craik, 1966; Cornsweet, 1970; Kingdom and Moulden, 1988). As we have shown, these phenomena can also be explained in terms of the relative probabilities of the possible sources of the stimulus. Like the standard illusions of simultaneous brightness and Mach bands, such “misperceptions” are the signature of an extraordinarily powerful strategy of vision: by eliciting percepts that represent the sources of inevitably ambiguous visual stimuli in this probabilistic manner, the observer will always have the best chance of responding to the stimulus with successful visually guided behavior.
Relation to filling in
Various other phenomena have been described in which territorial qualities are misperceived; therefore, it is of interest to consider whether any or all of these manifestations of “filling in” might be explained in the same probabilistic manner as the Cornsweet effect. Thus an object can disappear from perception and be replaced with the quality of the background despite its continued presence (Troxler, 1804), whereas actual discontinuities or anomalies in a pattern are often invisible [see, for example, Heckenmueller (1965) and Ramachandran (1992a)]. The most thoroughly studied of these phenomena is the physiological blind spot arising from the absence of photoreceptors overlying the optic nerve head (von Helmholtz, 1924; Lettvin, 1976; Andrews and Campbell, 1991; Ramachandran, 1992a,b). Other physiological elisions of retinal information are the foveal blind spot in dim light (because of the absence of foveal rods), the invisibility of small blue stimuli in central vision (because of the paucity of short wavelength-sensitive cones in the foveola), and the invisibility of the shadows cast by retinal blood vessels.
In each of these cases, the qualities of the surrounding region of visual space are assigned to the missing, unobserved, or anomalous area of the field. Although a discussion of such a wide array of phenomena is beyond the scope of this article, these effects may all be explainable in terms similar to those that we have used here to account for territorial filling in based on the characteristics of particular edges (as indeed the scope of the theory that we are proposing requires; see above). This suggestion runs counter to the widely held view that filling in missing visual information relies on surrogate activity in the relevant regions of the visual cortex, stimulated by the responses of adjacent neurons and conveyed to the deprived region by lateral cortical connections (Fiorani et al., 1992; Gilbert and Wiesel, 1992; Pettet and Gilbert, 1992; Ramachandran et al., 1993; Murakami, 1995) (an idea similar in principle to the influence of lateral retinal interactions long used to explain simultaneous brightness contrast and Mach bands). The results we describe make this interpretation suspect, at least as a general explanation of filling in. In none of our examples could the brightness values assigned to the territories that are filled in derive in any simple way from the luminances of the topographically adjacent regions of visual space.
Relation to other theories
Because a number of other investigators have recently explored how the perception of light intensity is influenced by the wealth of information in visual scenes, it is important to distinguish our theory from related ideas about the perception of surface qualities and the way that the visual system might compute them (Knill and Kersten, 1991; Adelson, 1993; Buckley et al., 1994; Freeman, 1994; Pessoa et al., 1996; Wishart et al., 1997).
Taken together, these studies have indicated that (1) a wide range of information is taken into account in determining the perception of luminances (2-D contours, 3-D shape, binocular disparity, object orientation, object color, the presence of penumbras, and presumably much else that remains to be studied), and (2) no simple “input stage” mechanism such as lateral interactions among retinal ganglion cells can explain these effects. Although there has been no consensus about how these facts should be interpreted, some investigators have concluded that the visual system relies on algorithms that allow the “higher order” perception of the scene to determine other more basic perceptual qualities [e.g., that the perception of the shape of an object allows the appropriate perception of its surface reflectance; see, for example, Knill and Kersten (1991) and Buckley et al. (1994)]. The problem with this conclusion is the implication that “a perception occurs in addition to the perception itself” (Evans, 1974, p. 7). This hierarchical conception of visual processing is flawed in much the same way that the Cartesian concept of an internal observer is flawed by the specter of an infinite regress. Even perceptual theories that include probability in such computations, such as the statistical influence of more or less probable viewpoints on what is ultimately perceived (Freeman, 1994), do not avoid this dilemma.
The theory we propose is that perception is a series of associations generated on an empirical basis by the stimulus confronting the observer at any given moment. By virtue of the relative probabilities of the possible sources of the stimulus (that is, what the sources of the same or similar stimuli have turned out to be), all of the factors in the scene that have in the past been germane to the accurate perception of luminance are included in the generation of the percept. This conception satisfactorily accounts for all of the observations presented here, as well as those described in most other studies of brightness that we are aware of. It also rationalizes some otherwise conflicting results. For example, Knill and Kersten (1991) showed that the apparent brightness difference between two adjacent territories can be decreased by the depiction of curvature, whereas we have provided an example of how curvature can increase the difference in brightness between such territories (Fig. 6); these seemingly paradoxical results are readily explained in terms of the source probabilities of the respective stimuli but are difficult to account for in other terms.
Finally, the theory we outline here provides a plausible neuronal mechanism for this empirical strategy of vision: the enormous amount of empirical information required for appropriate associations to be triggered by visual stimuli can be accumulated and stored in synaptic connections and weightings that have arisen by natural selection during the evolution of the species and during ontogeny by activity-dependent feedback on synapse formation (for review, see Purves, 1994).
This work was supported by National Institutes of Health Grant NS29187. We are grateful to Tim Andrews, David Coppola, Don Katz, Tom Polger, Len White, and Mark Williams for helpful criticism, and to Rochelle Schwartz-Bloom for advice with statistical issues.
Correspondence should be addressed to Dale Purves, Box 3209, Duke University Medical Center, Durham, NC 27710.