Abstract
The brightness and color of a surface depend on its contrast with nearby surfaces. For example, a gray surface can appear very light when surrounded by a black surface or dark when surrounded by a white surface. Some theories suggest that perceived surface brightness and color are represented explicitly by neural signals in cortical visual field maps; these neural signals are not initiated by the stimulus itself but rather by the contrast signals at the borders. Here, we use functional magnetic resonance imaging (fMRI) to search for such neural “filling-in” signals. Although we find the usual strong relationship between local contrast and fMRI response, when perceived brightness or color changes are induced by modulating a surrounding field, rather than the surface itself, we find there is no corresponding local modulation in primary visual cortex or other nearby retinotopic maps. Moreover, when we model the obtained fMRI responses, we find strong evidence for contributions of both local and long-range edge responses. We argue that such extended edge responses may be caused by neurons previously identified in neurophysiological studies as being brightness responsive, a characterization that may therefore need to be revised. We conclude that the visual field maps of human V1 and V2 do not contain filled-in, topographical representations of surface brightness and color.
Introduction
Many compelling visual illusions indicate that perception of color and brightness is guided by contrast at surface borders, rather than the local luminance or spectral content of the light. Basing vision on contrast, rather than on absolute light levels, reduces perceptual dependence on changes in ambient illumination, permitting us to recognize surfaces based on reflectance. This is important in a world in which the ambient light level can vary by eight orders of magnitude over the course of a day (starlight, 10⁻³ cd/m²; sunlight, 10⁵ cd/m²; see http://white.stanford.edu/∼brian/numbers/).
One question that follows is how the brain represents perceived properties, such as brightness and color, that are derived from retinal signals. Classical psychophysical studies of brightness and color constancy imply that surface representations are formed through the long-range spatial integration of visual information (Land, 1959, 1977, 1983, 1986). One well known computational theory, for example, advocates the spreading and “filling-in” of contrast information within regions defined by object boundaries (Gerrits and Vendrik, 1970; Cohen and Grossberg, 1984; Grossberg and Todorovic, 1988). Some psychophysical observations on brightness perception appear to be consistent with this concept of filling-in of brightness information (Paradiso and Nakayama, 1991; Arrington, 1994; Paradiso and Hahn, 1996; Davey et al., 1998).
Nevertheless, the existence of such a filling-in process as a spreading of neural activity between edges is disputed. Certain multiscale filtering models explain surface brightness on the basis of edge responses and contrast normalization without the need for a filling-in process (Blakeslee and McCourt, 1999; Dakin and Bex, 2003; Blakeslee et al., 2005).
The filling-in process and representation have been hypothesized to exist in the firing rates of neurons in early, retinotopically organized visual areas, perhaps V1 and V2 (Grossberg and Mingolla, 1985). Neurophysiological studies of monkey and cat visual cortex provide some support for the filling-in hypothesis. A small proportion of neurons, responding to the interiors of achromatic surfaces, exhibit properties consistent with aspects of human brightness constancy (MacEvoy and Paradiso, 2001), brightness induction (Rossi et al., 1996; Rossi and Paradiso, 1999; Kinoshita and Komatsu, 2001; Peng and Van Essen, 2005), the Craik–Cornsweet–O'Brian brightness illusion (Hung et al., 2001; Roe et al., 2005), and surface completion of the retinal blind spot (Komatsu et al., 2000, 2002). No neurophysiological study, however, has yet revealed evidence of a topographic cortical representation corresponding to uniform surface regions (Friedman et al., 2003; von der Heydt et al., 2003).
Here, we search for brightness and color filling-in using functional magnetic resonance imaging (fMRI) (Fig. 1). While lying supine in an fMRI scanner, subjects viewed a disk whose brightness or color was modulated, either by changing the mean luminance or color of the central disk itself or by changing that of its immediate surround. If the filling-in hypothesis is correct, fMRI responses in the cortical representation of the disk should correlate positively with the modulation, whether it is applied to the disk or to the surround.
Materials and Methods
Subjects.
fMRI signals were measured in six (A.R.W., B.A.W., D.N., F.W.C., J.L., R.F.D.) right-handed males between the ages of 23 and 49 years. All experiments were undertaken with the understanding and consent of each subject. All subjects had extensive previous experience with performing fMRI experiments. Subjects D.N. and J.L. were unaware of the purpose of the experiments.
fMRI set-up.
Data were acquired on a GE 3T Signa LX scanner (GE Medical Systems, Milwaukee, WI) using a custom-built high-gain head coil. Subjects' heads were fixed throughout the measurement period by means of snug-fitting pads.
Anatomical data preparation.
Anatomical images were acquired on a GE 1.5T Signa LX scanner using a three-dimensional (3D) spoiled gradient-recalled acquisition in a steady state (SPGR) pulse sequence [one echo, minimum echo time (TE), 15° flip angle, two excitations]. Sagittal slices were acquired with an in-plane voxel size of 240/256 × 240/256 mm with 1.2 mm slice thickness. The anatomical images were segmented into gray and white matter using custom software (Teo et al., 1997). To facilitate analysis and visualization of the data, the occipital-lobe area of interest was computationally flattened using methods described previously (Wandell et al., 2000) and available on the internet (http://white.stanford.edu/).
Stimulus display.
Subjects viewed stimuli on a liquid crystal display, placed in a shielded box at the foot of the scanner bed, through binoculars and adjustable mirrors. This display system subtended 24° of visual angle vertically and 32° horizontally. The display, which was calibrated using a PhotoResearch (Chatsworth, CA) spectroradiometer, had a mean luminance of 30 cd/m². A fixation point was present throughout the experiment. Stimuli were displayed using custom software developed in Matlab (MathWorks, Natick, MA) using routines from the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997).
Visual field mapping.
Visual field maps were measured using rotating wedge and expanding ring stimuli that create traveling waves of neural activity in visual cortex (Engel et al., 1994, 1997; Sereno et al., 1995; DeYoe et al., 1996). The wedge and ring consisted of drifting, achromatic (mean luminance, ∼50 cd/m²), dartboard contrast patterns (∼90% contrast), which alternately moved radially toward and away from fixation at a velocity of 1°/s. The wedge spanned 90° of angle and extended to 12° from fixation. The wedge completed a full rotation every 24 s, changing positions in synchrony with the data acquisition frame rate of 3 s [repetition time (TR)]. The ring stimuli occupied one-half of the visual field (50% duty cycle), completed a full expansion every 24 s, and changed eccentricity position in synchrony with the 3 s TR.
Luminance stimuli.
The main stimulus used in this study consisted of a central disk (diameter, 14°) surrounded by a large annulus [7° until outer border of screen (i.e., 12/16° radius)]. In the experiment, either the central disk or the surround was modulated in luminance at 1 Hz. Stimuli were modulated around the background luminance level at three levels of luminance contrast (12.5, 25.0, and 50.0%). A fixation point was present at all times. Onset and offset of stimuli were modulated by a temporal Gaussian window. In each experimental run, we alternated 12 s of one experimental condition (so either central disk or surround modulation, at one particular contrast level) with 12 s of fixation (static background with only a fixation point present), thus resulting in a stimulus alternation frequency of 1/24 Hz. Six cycles of activation and fixation were presented during one run for a total run duration of 144 s. Experimental conditions were presented in a pseudo-randomized way.
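The timing of one run can be sketched in a few lines. This is only an illustration, not the Matlab/Psychophysics Toolbox code used in the study: the function name, the 60 Hz sample rate, the definition of contrast as a fractional excursion around the mean, and the shape and duration of the Gaussian onset/offset ramp (`ramp_s`) are all assumptions, because the text does not specify them. The 30 cd/m² default is the display mean luminance reported above.

```python
import math

def luminance_timecourse(contrast, mean_lum=30.0, mod_hz=1.0, block_s=12.0,
                         n_cycles=6, sample_hz=60.0, ramp_s=1.0):
    """Luminance (cd/m^2) of the modulated region for one run.

    Each cycle is block_s of 1 Hz modulation followed by block_s of fixation.
    contrast is a fraction (0.5 for 50%). ramp_s is a hypothetical duration
    for the Gaussian-shaped onset and offset flanks.
    """
    cycle_s = 2.0 * block_s
    sigma = ramp_s / 3.0  # assumed width of the Gaussian flanks
    n_samples = int(round(n_cycles * cycle_s * sample_hz))
    samples = []
    for i in range(n_samples):
        t_in_cycle = (i / sample_hz) % cycle_s
        if t_in_cycle < block_s:
            # Time to the nearest block boundary (onset or offset)
            d = min(t_in_cycle, block_s - t_in_cycle)
            envelope = 1.0 if d >= ramp_s else math.exp(-0.5 * ((ramp_s - d) / sigma) ** 2)
            mod = contrast * envelope * math.sin(2.0 * math.pi * mod_hz * t_in_cycle)
        else:
            mod = 0.0  # fixation period: static background
        samples.append(mean_lum * (1.0 + mod))
    return samples

run = luminance_timecourse(0.5)
print(len(run) / 60.0)  # run duration in seconds
```

With six 24 s cycles the run lasts 144 s, which corresponds to a stimulus alternation frequency of 1/24 Hz.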
Color stimuli.
Color stimuli were identical to the luminance stimuli, except that modulations were done in isoluminant color space [colors were varied along the long-medium wavelength sensitive cone-opponent (LM) axis in Macleod-Boynton cone-color space (MacLeod and Boynton, 1979)]. Stimuli were presented at two levels of LM contrast (2.5 and 5.0%). Four subjects (A.R.W., F.W.C., J.L., R.F.D.) participated in the color experiment.
Localizer stimuli.
During the first run of each experiment, high-contrast (∼90%) checkerboard stimuli with the same dimensions as the central disk and surround were alternated to localize regions of interest (ROIs) in visual cortex that represented the central disk, the surround, and the transition between the latter two (edge representation). Delineating the different visual areas in the foveal part of visual cortex is much harder than in the periphery (Dougherty et al., 2003). We therefore also used an additional localizer stimulus to delineate the central 1.5° representation of the visual field. This region is analyzed without further attempting to segment it into different visual areas.
Contrast stimulus.
As a control condition, a uniform central disk, surrounded by a dartboard checkerboard annulus with the same dimensions as the surround luminance stimulus, was presented. This dartboard was presented at a contrast level of 25%. Five subjects (all except B.A.W.) were presented with this condition.
Functional MR.
Functional MR data were acquired with a spiral pulse sequence (Glover and Lai, 1998; Glover, 1999) with 21 obliquely oriented slices acquired every 3 s (TE, 30 ms; TR, 1.5 s; two interleaves; 70° flip angle; effective voxel size, 2 × 2 × 3 mm). Each individual functional scan lasted ∼2.5 min, and subjects were given a brief break between scans. A set of two-dimensional (2D) fast SPGR anatomy images was acquired before the series of functional scans. These T1-weighted slices were physically in register with the functional slices and were used to align the functional data with the high-resolution anatomy data via a semiautomated 3D-coregistration algorithm. The functional data were inspected for unwanted head movements. The time series for multiple measurements of the same stimulus were averaged.
Visualization and preparation of the measurements can be simplified by working on two-dimensional flattened representations of the cortical manifold. These flat maps allow for easy specification of regions of interest and simplify the process for finding visual areas. Because the flattening process inevitably distorts distance and area measurements, all distance measurements were made in the 3D cortical manifold by mapping the 2D coordinates back to the 3D manifold. We measured distance along the boundary between gray and white matter.
fMRI data analysis.
To improve sensitivity, only data from gray-matter voxels were analyzed for activity. Linear trends were removed from the fMRI time-series signal, and activity was measured by correlating the time series with a harmonic at the stimulus alternation frequency (1/24 Hz). Data for runs of the same conditions were averaged. Each subject performed each condition twice (with the exception of the localizer conditions, which were run only once). The data were then displayed on a 3D representation of the boundary between white and gray matter or on a flattened representation of this same boundary.
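The detrend-and-correlate step can be illustrated with a short sketch. It is not the analysis software used in the study; the function names are hypothetical, and the phase convention (a response is written as amplitude × cos(ωt − phase)) is an assumption. With a 3 s frame time and a 144 s run there are 48 frames and six stimulus cycles per run.

```python
import math

def detrend_linear(y):
    """Remove the least-squares straight line from a time series."""
    n = len(y)
    x_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    slope = sxy / sxx
    return [v - (y_mean + slope * (i - x_mean)) for i, v in enumerate(y)]

def harmonic_response(y, n_cycles):
    """Amplitude, phase (radians) and correlation at n_cycles per run."""
    y = detrend_linear(y)
    n = len(y)
    w = 2.0 * math.pi * n_cycles / n  # radians per frame
    a = 2.0 / n * sum(v * math.cos(w * i) for i, v in enumerate(y))
    b = 2.0 / n * sum(v * math.sin(w * i) for i, v in enumerate(y))
    fit = [a * math.cos(w * i) + b * math.sin(w * i) for i in range(n)]
    # Both series have zero mean for a whole number of cycles, so this
    # normalized inner product equals the Pearson correlation.
    syy = sum(v * v for v in y)
    sff = sum(f * f for f in fit)
    syf = sum(v * f for v, f in zip(y, fit))
    r = syf / math.sqrt(syy * sff) if syy > 0 and sff > 0 else 0.0
    return math.hypot(a, b), math.atan2(b, a), r

# Synthetic voxel: baseline, slow drift, and a response at 6 cycles per run
voxel = [100 + 0.05 * i + 2.0 * math.cos(2 * math.pi * 6 * i / 48 - 1.0)
         for i in range(48)]
amplitude, phase, r = harmonic_response(voxel, 6)
```

Because a sinusoid is not exactly orthogonal to a straight line over a finite run, linear detrending biases the recovered amplitude slightly (on the order of 2% in this synthetic example).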
Based on the retinotopic maps in each subject, the location of V1 and V2 (V2v and V2d data were averaged) was determined. Based on the localizer stimuli, ROIs representing the central disk and the annulus (and beyond) were delineated within each of these retinotopically defined areas. The transition between the central disk and annulus representations was traced manually on the flattened representations of visual cortex and defined as the edge ROI. Next, for each voxel in the center and annulus representation, we determined the shortest distance along the cortical surface to this edge ROI. We could then determine fMRI activations as a function of the distance to the edge representation (positioned at 0 mm).
In a parallel part of the analysis, we determined the average activations in the foveal confluence (FC; determined using the 1.5° localizers) as a function of contrast and type of modulation. Regression analysis was used to determine the influence of contrast on fMRI activation.
In all cases, fMRI activation was expressed as the amplitude of the blood oxygenation level-dependent (BOLD) modulation, projected on the anti-phase of the center response (5–25 mm away from the edge) to the center localizer stimulus. The phase of the localizer stimulus was determined individually for each subject. This procedure assures that activations with a fixed temporal relationship to the stimulus presentation are used under all circumstances. In addition, this procedure takes small individual variations in BOLD delay into account.
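Projecting a response onto a reference phase amounts to taking the signed component of the response vector along that phase. A minimal sketch, assuming amplitudes and phases are expressed as amplitude × cos(ωt − phase); the function name is hypothetical, and the reference phase is simply an input here (in the study it was derived per subject from the center localizer response):

```python
import math

def project_on_reference(amplitude, phase, reference_phase):
    """Signed amplitude of a response along a reference phase (radians)."""
    return amplitude * math.cos(phase - reference_phase)

in_phase = project_on_reference(2.0, 0.3, 0.3)               # full positive amplitude
opposite = project_on_reference(2.0, 0.3 + math.pi, 0.3)     # full negative amplitude
quadrature = project_on_reference(1.0, math.pi / 2.0, 0.0)   # no contribution
```

Under this convention, a response that is in counter-phase with the reference yields a negative projected amplitude, and a response with an unrelated phase contributes little.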
Results
The brightness experiment included two conditions, both run in a block design. In one condition, subjects viewed a central disk, the luminance of which varied sinusoidally in time in the presence of a constant surrounding field (12 s). This modulation alternated with a fixation period (12 s) consisting of a uniform field. In the second condition, the disk luminance was constant and the surround luminance varied in time, again in alternation with a uniform field. In both conditions, the brightness of the disk appeared to change, and the apparent brightness changes in the two conditions were similar. The disk diameter (14°) was chosen so that the cortical region representing the disk in each hemisphere would provide a large surface area. Luminance was modulated at 12.5, 25, and 50% contrast relative to the background. Throughout the session, subjects fixated a small black dot in the center of the screen. In separate conditions, subjects viewed high-contrast checkerboard stimuli that were used to locate the regions in visual cortex representing the fovea (central 1.5°) and the transition between central disk and surround (Fig. 2). Individual subjects saw two presentations of each level of luminance modulation in each condition. It is important to note that the induced brightness and color changes we examine are compelling and cannot be influenced at will. An observer who is well aware of looking at a constant luminance disk amid a modulating surround still cannot perceive the disk as constant (see also supplemental Fig. 1, available at www.jneurosci.org as supplemental material). fMRI data were collected and analyzed for foveal and parafoveal regions in visual cortex. For each individual subject, responses were sorted according to the level of contrast modulation and the part of the stimulus that was modulated (center or surround).
Figure 2 shows the amplitude of the fMRI (or BOLD) signals in primary visual cortex (V1) as a function of distance along the cortical surface for two subjects. The signal peaks at the retinotopic region corresponding to the edge of the central disk and decreases with distance on both sides of the edge representation. In Figure 3, average V1 and V2 fMRI signals for modulation of the disk and the surround are plotted. In both V1 and V2, the largest activations are found around the edge representation: the transition between the central and peripheral ROIs. Relatively little modulation is found at retinotopic locations corresponding to the spatially uniform portions of the image. Luminance modulation in the central disk generates a weak but notable activation in the central disk representation (positive distances). During surround modulation, activation is slightly higher in the peripheral section (negative distances). This indicates that V1 and V2 are not only activated by the edge but also by luminance modulations. Negative responses are found in the most foveal and peripheral parts of the response curves. This could indicate the presence of mechanisms generating a negative BOLD response (Tootell et al., 1998; Shmuel et al., 2002; Smith et al., 2004). In principle, this effect could mask small signals associated with the putative filling-in process.
To control for this possibility, we include a condition in which a checkerboard stimulus of medium contrast (25%) was shown in the surround, alternating with a uniform background field. Importantly, observers perceive no changes in brightness in the central disk in this condition. The checkerboard stimulus causes a strong activation in the surround and near the edge (Fig. 3, gray curve). Despite the much stronger positive BOLD signal associated with the checkerboard stimulus, the signals in the parafoveal region (positive distances) are essentially comparable with those for the surround-modulation condition (Fig. 3, filled circles). Hence, there is no indication that negative BOLD effects are masking responses in more distant regions. Only during luminance modulation of the central disk do we observe a slight increase in the signal in the foveal region (Fig. 3, open circles).
To test whether the BOLD signals measured in the central ROI correlate with apparent brightness, we varied the amplitude of the surround modulation. Increasing the amplitude of the surround modulation increases the amplitude of the perceived brightness changes in the central region. If the BOLD response in the central disk ROI increased monotonically with increasing surround modulation amplitude, it would support the theory that neurons in the fovea carry a signal that is correlated with perceived brightness as well as luminance (Fig. 1B). To perform this measurement, it was necessary to dissociate potential (weak) responses to the uniform regions of the disk from powerful edge responses. The central region furthest away from the edge representation, and therefore the region least likely to be contaminated by edge responses, is the cortical representation of the fovea. We define this region, called the FC, as the central 1.5° of the visual field. The representation of this ROI was determined independently using small checkerboard localizers. Within the FC, we make no attempt to distinguish between responses from different functional areas (V1–V4). Figure 4 shows the average signals in the FC as a function of type and amplitude of luminance modulation. For modulations of the central disk, we observe a positive increase in activation with increasing modulation amplitude. For surround modulations, we observe a response that is independent of modulation amplitude. Regression analysis shows that the increase in BOLD response with log contrast is significant for modulation of the central disk (t = 3.035; p = 0.005), but not for the surround (t = −0.17; p = 0.99). This finding indicates that although increasing the magnitude of the luminance modulation of the surround increases the magnitude of the perceived brightness modulation in the central disk (see also supplemental Fig. 1, available at www.jneurosci.org as supplemental material), this increase in brightness is not represented by neurons in the FC.
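A regression of response amplitude on log contrast of the kind reported above can be sketched as follows. This is an ordinary least-squares illustration, not the authors' analysis: the function name is hypothetical, the logarithm base (10) and the pooling of observations are assumptions, and the data in the usage lines are synthetic values chosen only to exercise the code. The sketch returns the t statistic and degrees of freedom; converting these to a p value requires a t distribution, which the Python standard library does not provide.

```python
import math

def regress_on_log_contrast(contrasts_pct, responses):
    """Least-squares fit of response on log10(contrast).

    Returns (slope, intercept, t statistic of the slope, degrees of freedom).
    """
    x = [math.log10(c) for c in contrasts_pct]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(responses) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, responses))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, responses))
    dof = n - 2
    se = math.sqrt(rss / dof / sxx)  # standard error of the slope
    t = slope / se if se > 0 else math.inf
    return slope, intercept, t, dof

# Synthetic example only: two made-up observations per contrast level
contrasts = [12.5, 25.0, 50.0, 12.5, 25.0, 50.0]
noise = [0.01, -0.01, 0.0, -0.01, 0.01, 0.0]
responses = [0.2 + 0.3 * math.log10(c) + e for c, e in zip(contrasts, noise)]
slope, intercept, t, dof = regress_on_log_contrast(contrasts, responses)
```

A slope that cannot be distinguished from zero, as found for surround modulation, corresponds to a t statistic near zero.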
The largest source of the BOLD signal in V1 and V2 was the contrast modulation at the edge of the central disk. Neurons in V1 and V2 tend to have relatively small receptive field sizes, and the line-spread function of the BOLD signal in V1 has been measured at ∼3.5 mm full-width at half maximum (FWHM) (Engel et al., 1997). For this reason, we expected the BOLD responses in these areas to be confined to the immediate vicinity of the edge. Yet, the activations are clearly much wider than this. To better understand the edge and surface responses we obtained, we fitted the V1 and V2 responses in terms of a linear combination of underlying edge and surface components (Fig. 5) (see supplemental material for mathematical details, available at www.jneurosci.org). Each component is itself modeled as a linear function, with unit slope (multiplied by log contrast) and intercept. The first component in our model represents the activity of neurons responding to the magnitude of local luminance modulations. Because local-luminance modulations correlate with local-brightness changes, the first component in our model represents the combined effect of local luminance and any putative brightness response. The second component in our model corresponds to the putative filling-in signal associated with surround modulation only. The third component corresponds to a local-edge response. This component is modeled as a Gaussian function, the FWHM of which is fixed at 1 mm. The final component of our model, a Gaussian function whose FWHM varies freely during the fitting procedure, allows for an extended edge-centered response. We convolve the linear combination of edge and surface responses with a Gaussian to simulate the low-pass spatial filtering characteristics of the BOLD signal. Evidence for key model components is examined by constraining the parameters associated with those components to zero. 
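The structure of such a model can be sketched as follows. The exact formulation is given in the supplemental material, so this is only one plausible reading of the description above: the function and parameter names are hypothetical, each component amplitude is taken as slope × log contrast + intercept, positive distances denote the central disk side of the edge, and the 3.5 mm width of the final blur is borrowed from the line-spread estimate quoted in the text rather than from the authors' fit.

```python
import math

SIGMA_PER_FWHM = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def gauss(d, fwhm):
    """Unit-height Gaussian with the given full width at half maximum."""
    sigma = fwhm * SIGMA_PER_FWHM
    return math.exp(-0.5 * (d / sigma) ** 2)

def amplitude(component, log_contrast):
    slope, intercept = component
    return slope * log_contrast + intercept

def smooth(values, step_mm, fwhm):
    """Blur with a normalized Gaussian kernel, renormalized at the ends."""
    sigma = fwhm * SIGMA_PER_FWHM
    half = int(math.ceil(3.0 * sigma / step_mm))
    offsets = range(-half, half + 1)
    kernel = [math.exp(-0.5 * (k * step_mm / sigma) ** 2) for k in offsets]
    out = []
    for i in range(len(values)):
        num = den = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < len(values):
                num += w * values[j]
                den += w
        out.append(num / den)
    return out

def model_profile(distances, log_contrast, modulated, params, bold_fwhm=3.5):
    """Predicted response versus distance (mm) from the edge representation.

    distances must be evenly spaced; modulated is "disk" or "surround".
    """
    surf = amplitude(params["surface"], log_contrast)
    induced = amplitude(params["induced"], log_contrast)
    narrow = amplitude(params["edge_narrow"], log_contrast)
    broad = amplitude(params["edge_broad"], log_contrast)
    raw = []
    for d in distances:
        r = narrow * gauss(d, 1.0) + broad * gauss(d, params["broad_fwhm"])
        on_modulated_side = (d > 0) if modulated == "disk" else (d <= 0)
        if on_modulated_side:
            r += surf      # local luminance (and any local brightness) response
        if modulated == "surround" and d > 0:
            r += induced   # putative filling-in signal in the disk region
        raw.append(r)
    return smooth(raw, distances[1] - distances[0], bold_fwhm)

# Illustrative parameter values only, not fitted values from the study
params = {"surface": (0.1, 0.0), "induced": (0.0, 0.05),
          "edge_narrow": (1.0, 0.0), "edge_broad": (1.0, 0.0),
          "broad_fwhm": 17.5}
distances = [-25.0 + 0.5 * i for i in range(101)]
disk_profile = model_profile(distances, 1.0, "disk", params)
```

Setting a component's slope and intercept to zero removes it from the model, which is how the contribution of individual components can be tested.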
To compare model performance, we calculate Akaike's weights (Burnham and Anderson, 2002), which trade off the goodness-of-fit of a particular model against the number of free parameters in the model. Akaike's weights represent the relative probability that each model is correct. We calculate the evidence ratio between successively ranked models in the performance hierarchy by dividing the respective relative probabilities associated with the models.
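For reference, Akaike weights and evidence ratios can be computed as in the following sketch. It uses the standard least-squares form of the Akaike information criterion, AIC = n ln(RSS/n) + 2k; whether the authors used this form or the small-sample correction (AICc) is not stated here, so that choice, like the function names and the numbers in the usage lines, is an assumption.

```python
import math

def aic_least_squares(rss, n_obs, n_params):
    """AIC for a least-squares fit; k counts the residual variance as well."""
    k = n_params + 1
    return n_obs * math.log(rss / n_obs) + 2 * k

def akaike_weights(aics):
    """Relative probability of each model, given the set of candidates."""
    best = min(aics)
    rel = [math.exp(-0.5 * (a - best)) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]

def evidence_ratio(weight_a, weight_b):
    """How many times more likely model a is than model b."""
    return weight_a / weight_b

# Illustrative values only: two models whose AIC values differ by 2
weights = akaike_weights([100.0, 102.0])
ratio = evidence_ratio(weights[0], weights[1])
```

An AIC difference of 2 between two models corresponds to an evidence ratio of e (about 2.7) in favor of the model with the lower AIC, so evidence ratios of several orders of magnitude reflect large differences in AIC.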
First, we examine the evidence in favor of the presence of an extended-edge response by calculating the relative probability of a model containing a single (wide) edge response relative to one containing two separate edge responses (one fixed at 1 mm, and the second free to vary). In both V1 and V2, our results indicate that the model with two edge responses is substantially more likely to be correct (evidence ratio, >10⁷). For this reason, we use the model with two edge responses as the basis for subsequent modeling. Two important points are worth noting here. First, the width of the variable edge response is much broader than that of the narrow response (V1: FWHM, 17.5 ± ∼2 mm; V2: FWHM, 19.0 ± 1.6 mm). Second, we find that the narrow and extended-edge responses can be fitted with the same intercepts and slopes without affecting performance. This latter result suggests that the local and extended-edge responses arise from highly correlated, if not identical, neural sources. Finally, allowing different widths for the foveal and peripheral sides of the variable edge response did not result in better model performance.
In our study, the main issue is whether adding a putative filling-in signal, restricted to the central region during surround modulation, improves performance. Consistent with our analysis of the data shown in Figure 6, in neither V1 nor V2 do we find evidence that the induced component varies with contrast (evidence ratios are 0.4 for both V1 and V2). Importantly, however, we find strong evidence in V1 that adding an induced component, consisting of a (constant) positive intercept, results in substantially improved performance (evidence ratio, 86,000). In V2, the evidence is only slightly in favor of such a constant induced response (evidence ratio, 19). The model that provides the most parsimonious account of our results is depicted in Figure 6, superimposed on the average data of our subjects.
Four subjects who participated in the brightness experiment also performed a chromatic variant of this experiment (Fig. 7), in which central disk and surround were modulated in isoluminant color space at 2.5 and 5% contrast relative to the background (all other methodological details were identical to the brightness experiment). Once again, we obtained fMRI responses that were spatially highly nonuniform, with strong activations in the region representing the transition between central disk and surround and much weaker responses to uniform surfaces. Only for surfaces that physically varied in color did we find small increases in response amplitude with increasing modulation. The model with the two edge responses was also fitted to the data of this experiment. Again, we asked whether adding a putative filling-in signal improves performance. Interestingly, for the color experiment, for both V1 and V2, we now find that a model without the induced component (no intercept) provides the most parsimonious account of the results (evidence ratios for the other models were <1) (see supplemental material for details, available at www.jneurosci.org).
Discussion
Our main finding is that, in early visual cortex, perceived brightness and color changes are not accompanied by commensurate changes in the fMRI signal. Although we found very strong activations in the region representing the disk edge, responses to uniform surfaces are much weaker. Only for surfaces that physically vary in luminance do we find small increases in response amplitude with increasing luminance modulation. Surface responses in early human visual cortex thus correlate with physical changes in stimulus luminance and spectral content but not with perceived brightness or color.
We found that our BOLD responses could not be accurately modeled using a single, local-edge response. These responses were at least a factor of four wider than expected on the basis of previous estimates of the low-pass characteristics of the BOLD signal (Engel et al., 1997). Our modeling indicates that a combination of narrow (FWHM, 1 mm) and broad (FWHM, ∼16–20 mm) edge-centered responses is the most likely explanation for the shape of the BOLD response. At the eccentricity at which the edge was presented in our study (7°), the FWHM point of the broad response would lie 3–3.5° in the direction of the fovea, or 4.5–6° into the periphery (Engel et al., 1997). The local-edge response is readily interpreted as arising from the activity of cortical simple and complex cells. What process could underlie the much broader edge responses? The width and symmetry of the response rule out smearing caused by fixation drift. A number of neurophysiological studies on brightness, texture, and color processing indicate that remote edges may have contextual influences that extend far beyond the classical receptive field (Zipser et al., 1996; Lamme et al., 1998, 2002; Kinoshita and Komatsu, 2001; Wachtler et al., 2001; Roe et al., 2005). One possibility is that the broad edge responses may arise through contextual influences of the edge on distant edge- or luminance-coding neurons.
Could the broad edge response, which appears to cover a substantial part of the central disk representation, be related to brightness perception? An important indication that it is not is that the extended response was found to be more or less symmetrical around the edge. It was thus also present in a region that represented a part of the stimulus that did not modulate in brightness (peripheral ROI during central disk modulation). This essentially excludes the possibility that the extended response has any direct relationship with brightness or color perception. Nevertheless, the relationship might be indirect; for example, the extended response could be mediating edge integration or contrast normalization.
The most parsimonious model of the fMRI response also includes a contrast-dependent surface response restricted to the part of the stimulus that was physically modulated and, for luminance modulations, an additional contrast-independent signal, restricted to the central region during surround modulation. Thus, although we find some evidence for an increased signal in the central region of V1, our modeling reinforces the conclusion that this signal does not vary with perceived brightness or color. One aspect in our paradigm that was constant regardless of whether center or surround was modulated is the appearance (during modulation) and disappearance (during fixation) of the central disk. Hence, although the constant signal does not appear to encode surface brightness, it could, for example, encode figure–background separation (Lamme, 1995; Lamme et al., 2002) or border ownership (Zhou et al., 2000), properties that need not necessarily vary in a contrast-dependent manner. The constant signal was not required to model the responses in the chromatic variant of our induction experiment. Consistent with an interpretation in terms of figure–background segregation-related activity, segregation is generally much harder in isoluminant color displays compared with those that contain luminance edges (Livingstone and Hubel, 1987). A recent fMRI study, however, failed to find figure–ground surface responses in V1 and V2 (Schira et al., 2004).
Comparison to previous neuroimaging studies of human surface perception
Haynes et al. (2004) studied surface responses in early human visual cortex and, like us, found small contrast-dependent activations during changes in local (disk) luminance. Moreover, they found that fMRI responses correlated with the subjective rating of brightness, suggesting that the measured responses represented surface brightness. It is important to note, however, that Haynes et al. did not explicitly dissociate luminance from brightness, as we did. We find no correlation between fMRI signal intensity and surround-luminance modulation, indicating that surface responses in early human visual cortex do not encode brightness.
Our results agree with those of Haynes et al. in terms of the magnitude of cortical responses to local-luminance modulation. Haynes et al. report a 0.25% signal change for a 60% change in contrast, whereas we obtain a 0.2% modulation difference for a 50% change in contrast. This similarity is important. We used 1 Hz temporal modulations, a frequency at which perceptual changes in brightness are particularly apparent. Such modulations, however, cannot be directly resolved by the relatively slow BOLD mechanism that is probed by fMRI. Potentially, one could argue that, because of our use of continuous modulations, increments and decrements in brightness-related activity could have cancelled each other out. In the event-related approach of Haynes et al., responses were measured separately for luminance increments and decrements, so such a cancellation could not play a role in their study. The fact that we obtained similar activations to those of Haynes et al. during disk-luminance changes suggests that cancellation did not play a major role in our study.
Using a paradigm similar to ours, Boucard et al. (2005) found increased fMRI signals in early visual cortex for both disk- and surround-luminance modulations. The latter responses, however, showed a large temporal delay with respect to stimulus onset, and so did not appear to be directly related to brightness perception. Moreover, the use of a relatively small stimulus may have resulted in the capture of edge responses, which we here show are much wider than expected on the basis of BOLD resolution. Sasaki and Watanabe (2004) found evidence for color filling-in signals in human V1. Importantly, in this study, the magnitude of the perceived color and brightness changes was not varied, making it hard to judge whether responses were correlated with the perceptual experience of color or brightness or some other aspect of the stimulus such as figure–ground segregation. Here, we find some evidence for a contrast-independent increase in BOLD signal when illusory brightness is induced within the central disk. The data of Sasaki and Watanabe (2004) may reflect this contrast-independent BOLD response (of yet unknown origin), rather than a specific contrast-dependent filling-in response. Consistent with our present results, Perna et al. (2005) found no increased activity in V1 for a Craik–Cornsweet–O'Brian brightness illusion. This study did report increased activations related to brightness in two dorsal areas (caudal region of the intraparietal sulcus and in the lateral occipital sulcus). Because the magnitude of perceived brightness was not changed, it is again hard to judge whether responses were correlated with the perceptual experience of brightness or another aspect of the stimulus, such as shape.
fMRI signals in early human visual cortex have been shown previously to correlate with perception, rather than with the physically presented stimuli (Ress and Heeger, 2003; Zenger-Landolt and Heeger, 2003). At first glance, our results suggest that this finding cannot be extended to the domain of brightness. Yet, we find powerful edge responses that are almost identical for disk and surround modulations. Given the psychophysical evidence that edges play a key role in color and brightness perception (Reid and Shapley, 1988; Brenner and Cornelissen, 1991; Rudd and Arrington, 2001; Bindman and Chubb, 2004a,b; Hong and Shevell, 2004a,b; Rudd and Zemach, 2004; Shapiro et al., 2004), our results could be interpreted as evidence that surface filling-in is unnecessary and that edge responses alone, or in combination with local luminance, determine surface brightness (Blakeslee and McCourt, 1999; Dakin and Bex, 2003; Blakeslee and McCourt, 2004; Shapiro et al., 2004).
Comparison to neurophysiological results
Monkeys and humans appear to perceive surface brightness in similar ways (Huang et al., 2002). Neurophysiological studies in monkey suggest that a subpopulation of V1 and V2 neurons respond to luminance modulations well outside their classical receptive field (Kinoshita and Komatsu, 2001; Roe et al., 2005), in a manner qualitatively consistent with brightness perception. Similar results have been reported in cats (Rossi et al., 1996; Rossi and Paradiso, 1999; Hung et al., 2001). Friedman et al. (2003), however, found no evidence for color filling-in signals in V1 and V2 of awake behaving monkeys. Recent modeling work suggests that the majority of V1 responses reported by Kinoshita and Komatsu (2001) can be understood on the basis of local and mean luminance processing and that only a small minority of responses are consistent with edge-driven surface activity, such as brightness filling-in (Vladusich et al., 2006). We speculate that the properties of these previously determined surface-responsive neurons (Rossi et al., 1996; MacEvoy et al., 1998; Rossi and Paradiso, 1999; Hung et al., 2001; Kinoshita and Komatsu, 2001; Roe et al., 2005) may in fact arise from the mechanisms underlying the extended edge responses we observed in our study, and so are presumably not directly related to our perception of brightness, color, or filling-in.
Footnotes
- F.W.C. and T.V. were supported by Grant 051.02.080 from the Cognition Program of the Netherlands Organization for Scientific Research and BCN. F.W.C. was additionally supported by Prof. Mulderfonds. A.R.W., R.F.D., and B.A.W. were supported by Grant RO1 EY30164 from the National Eye Institute. We thank Alyssa Brewer for help with retinotopic mapping and Just van Es and Christine C. Boucard for assisting with the psychophysical experiments.
- Correspondence should be addressed to Frans W. Cornelissen, Laboratory of Experimental Ophthalmology and BCN NeuroImaging Centre, School of Behavioral and Cognitive Neurosciences, University Medical Centre Groningen, University of Groningen, P.O. Box 30.001, Groningen 9700 RB, The Netherlands. Email: f.w.cornelissen{at}rug.nl