Illusory contours (perceived edges that exist in the absence of local stimulus borders) demonstrate that perception is an active process, creating features not present in the light patterns striking the retina. Illusory contours are thought to be processed using mechanisms that partially overlap with those of “real” contours, but questions about the neural substrate of these percepts remain. Here, we employed functional magnetic resonance imaging to obtain physiological signals from human visual cortex while subjects viewed different types of contours, both real and illusory. We sampled these signals independently from nine visual areas, each defined by retinotopic or other independent criteria. Using both within- and across-subject analysis, we found evidence for overlapping sites of processing; most areas responded to most types of contours. However, there were distinctive differences in the strength of activity across areas and contour types. Two types of illusory contours differed in the strength of activation of the retinotopic areas, but both types activated crudely retinotopic visual areas, including V3A, V4v, V7, and V8, bilaterally. The extent of activation was largely invariant across a range of stimulus sizes that produce illusory contours perceptually, but it was related to the spatial frequency of displaced-grating stimuli. Finally, there was a striking similarity in the pattern of results for the illusory contour-defined shape and a similar shape defined by stereoscopic depth. These and other results suggest a role in surface perception for this lateral occipital region that includes V3A, V4v, V7, and V8.
Illusory contours are perceived edges that typically bridge gaps between precisely aligned luminance edges, but do not physically exist in the image. Shapes defined by illusory contours are of special interest because they reveal mechanisms that segment figures from their background, but are not confounded with luminance-defined cues (Kanizsa, 1979; Petry and Meyer, 1987). In contrast, luminance contours can arise because of a wide variety of factors in addition to object boundaries, such as shadows, highlights, or internal texture. Thus, direct comparison of the physiological response to luminance and illusory contours may reveal brain mechanisms that contribute critically to object perception.
The mechanisms involved in illusory contour perception are thought to overlap with those responsible for the perception of real contours, at least partially (von der Heydt and Peterhans, 1984; Vogels and Orban, 1987; Paradiso et al., 1989; Dresp and Bonnet, 1994). Experiments in cats and monkeys suggest that neurons in at least two visual areas, V1 and V2, carry signals related to illusory contours, and that signals in V2 are more robust than in V1 (Redies et al., 1986; von der Heydt and Peterhans, 1989; Grosof et al., 1993; Sheth et al., 1996). However, such electrophysiological studies have not focused on the representation of illusory contours in the many visual areas beyond V2. In addition, the extent to which results depend on the exact choice of stimuli is unclear. There may be an important distinction between stimuli in which the illusory contour lies parallel to the inducing edges and those in which the illusory contour lies perpendicular to the inducing lines (Lesher and Mingolla, 1993).
Recently, functional magnetic resonance imaging (fMRI) has furnished evidence on the neural substrates of illusory contour perception in humans (Hirsch et al., 1995a; ffytche and Zeki, 1996), but exactly which visual areas were activated remains unknown. Few functional landmarks were available in these studies to serve as reference points. Also, none of these studies tested more than one type of illusory contour, which makes it difficult to generalize the findings across a range of stimuli.
It is also of interest to compare the cortical circuits activated by shapes defined by illusory contours and by stereoscopic depth. Illusory shapes possess implied depth ordering caused by the perceived occlusion of inducing shapes, i.e., amodal completion. Comparing the cortical response to implied depth with the response to actual stereoscopic depth might indicate common regions associated with the grouping of retinal features to reconstruct the relations between three-dimensional surfaces in the world.
For these reasons, we collected functional magnetic resonance images of human visual cortex during the perception of multiple types of illusory and real contours. We designed the current experiments to address specific questions regarding contour representation in human visual cortex. (1) Do visual areas activated by illusory contours largely overlap with those activated by real contours? (2) Do contours defined by different types of illusory contours activate different cortical regions? (3) Is there evidence for common processing of shapes defined by illusory contours and shapes defined by stereoscopic depth?
MATERIALS AND METHODS
Magnetic resonance imaging
Methods were similar to those reported previously (Tootell et al., 1997). Subjects were scanned in a General Electric 1.5 Tesla scanner with echoplanar imaging (Advanced NMR, Wilmington, MA). Subjects’ heads rested in a semicylindrical bilateral quadrature receive-only surface coil. After a sagittal localizing scan was obtained, one or more scans were collected to optimize (15–5 Hz; full width at half max) the settings of four shim coils (linear x, y, z, and quadratic spherical harmonic z) (Reese et al., 1995). Then, a T1-weighted inversion recovery sequence [repetition time (TR), 21 sec; inversion time (TI), 1100 msec] was used to acquire 16 contiguous 4 mm slices with 1.5 × 1.5 mm in-plane resolution, oriented perpendicular to the calcarine sulcus, extending posteriorly to the occipital pole. These scans were used for anatomical registration (described below).
Next, multiple functional scans were acquired using the same slice prescription selected in the anatomical scans with 3 × 3 mm in-plane resolution. For each scan, 128 functional images were collected from each of the 16 slices (2048 images), covering all of the occipital lobe and the posterior parietal and temporal lobes. Functional signals reflecting neural activity via local oxygen consumption and blood flow were acquired (Kwong et al., 1992; Ogawa et al., 1992) using an asymmetric spin echo (ASE) pulse sequence [TR, 2 sec; echo time (TE), 70 msec; 180° refocusing pulse offset by −25 msec; matrix, 64 × 64]. For most stimulus comparisons, three functional scans of 4 min, 16 sec duration were repeated in one scanning session and averaged together. For the functional scans used to determine the retinotopy of visual areas (see Visual Stimuli), we used scans of 8 min, 32 sec duration (TR, 4 sec), with all other parameters as described above. The entire scanning procedure typically lasted 2–3 hr, including 8–15 functional scans, except in the rare event of equipment failure or subject discomfort. In the latter cases, the scans were terminated prematurely.
Head movement (within and between scans) was minimized by the use of a bite bar, in which subjects stabilized their jaw in a rigid, deep individual dental impression, mounted in an adjustable frame. As in previous studies (Tootell et al., 1997), the use of a bite bar typically reduced head motion to <1 mm. Motion correction algorithms were available (Woods et al., 1992; Jiang et al., 1995; Friston et al., 1996) but were not necessary for the data we report here. Informed consent was obtained from all subjects, and procedures were approved by Massachusetts General Hospital Human Studies Protocol #90–7227.
Overall, 16 subjects participated in this study. Because of the investment of time needed to obtain surface reconstructions of individual brains, our subjects came from a limited pool of experienced subjects, comprised of local colleagues and Massachusetts General Hospital personnel. These subjects were relatively sophisticated psychophysical observers, and had a high motivation level. Although we did not monitor eye movements, the MR data indicate adequate fixation during each functional scan. If subjects had not maintained fixation, we would not have obtained the retinotopically specific data we show (see Results). Furthermore, the stimuli were simple, predictable, and symmetric around the fixation point, so they did not produce a tendency for eye drift (e.g., optokinetic nystagmus).
During the MR imaging experiments, the visual stimuli were generated by a Silicon Graphics Onyx computer or a Macintosh IIvx computer with a resolution of 640 × 480 pixels. In either case, the video output was converted to a 60 Hz interlaced composite S-VHS signal, which served as input to a Sharp 2000 color LCD projector. The projector’s image passed through a focusing lens into the bore of the magnet, and appeared (∼17.5 × 13 cm; ∼40 × 30°) on a plastic rear-projection screen (Day-tex) placed in front of the subject’s chin. The subjects viewed the screen, which was oriented perpendicular to the long axis of their supine body, by looking straight up at a mirror placed at an ∼45° angle to both the screen and the subject’s line of sight. In this manner subjects could comfortably view the stimulus.
All the stimuli created for this study were similar in that they contained a single achromatic contour, arranged as a circle or square, centered on the fixation point (Fig. 1). Throughout each experiment, subjects fixated the center of these figures so that contours were always approximately isoeccentric (ranging from 1–9°). Within a scan session, the size of all comparable stimuli remained constant. All control and experimental comparisons were matched with respect to luminance levels, unless that variable was being assessed directly.
During most functional scans, subjects viewed alternating experimental and control epochs in a two-condition, blocked design. The experimental and control alternation always occurred at 16 sec intervals during the 4 min, 16 sec scan (eight cycles per scan). Within each epoch, the stimuli typically alternated between two versions of the experimental stimulus (called E1 and E2) or two versions of the control stimulus (called C1 and C2) every 2 sec (eight times per epoch). This within-epoch alternation, usually a reversal of stimulus contrast, was used to prevent retinotopic visual aftereffects and to make the stimulus more dynamic and interesting. At least in the case of illusory contours, opposing contrasts do not reduce or eliminate contour perception (Prazdny, 1983; Shapley and Gordon, 1983).
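The timing of this blocked design can be laid out explicitly. The following is a minimal sketch, not the presentation software used in the study: the function name `block_design` is hypothetical, and the choice of an experimental epoch as the first epoch is an assumption, because the text does not state which condition came first.

```python
import numpy as np

TR_SEC = 2.0      # repetition time of the functional scans (from the text)
N_IMAGES = 128    # images per slice per scan (from the text)
EPOCH_SEC = 16.0  # duration of each experimental or control epoch
FLIP_SEC = 2.0    # interval between stimulus versions within an epoch


def block_design(first_is_experimental=True):
    """Return image onset times, condition labels, and stimulus-version indices.

    The starting condition is an assumption; the text does not specify it.
    """
    t = np.arange(N_IMAGES) * TR_SEC                  # image onsets in seconds
    epoch_index = (t // EPOCH_SEC).astype(int)        # 0, 1, 2, ... one per epoch
    is_exp = (epoch_index % 2 == 0) == first_is_experimental
    version = ((t // FLIP_SEC) % 2).astype(int)       # 0 -> E1 or C1, 1 -> E2 or C2
    return t, is_exp, version


t, is_exp, version = block_design()
scan_sec = N_IMAGES * TR_SEC
cycles = scan_sec / (2 * EPOCH_SEC)
print(f"scan length: {int(scan_sec // 60)} min {int(scan_sec % 60)} sec")
print(f"cycles per scan: {cycles:g}, images per epoch: {int(EPOCH_SEC / TR_SEC)}")
```

With a TR of 2 sec, each 16 sec epoch spans eight images and each stimulus version coincides with one image, which reproduces the eight cycles per 4 min, 16 sec scan stated above.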
Illusory contours: Kanizsa-type. Our first experiment compared the effects of an illusory contour-defined shape with the absence of that shape. In the experimental stimulus, four inducers (“pacmen”) were aligned to create the percept of an illusory diamond shape (Fig. 1 A). In the control stimulus, the same pacmen were rotated to disrupt the percept of that diamond shape (Fig. 1 B) (Kanizsa, 1979; Hirsch et al., 1995a). In an additional control experiment, in one subject, a blank screen with a fixation point was interposed between the experimental and control conditions so that the time course of the fMRI signal could be plotted and related to the fixation baseline. We used a diamond configuration of inducers so that any possible fMRI signal caused by the small change in the location of inducer edges between the two conditions could be localized relative to the vertical or horizontal meridian representations in visual cortex. This stimulus subtended 15.8° in maximal extent, along the vertical and horizontal meridians. Each inducer was 3.6° in diameter, and the inducers were separated by 8.6° (center to center) for a support ratio of 0.4 (i.e., the ratio of the portion of the illusory shape perimeter that was defined by the luminance edges of the inducers to the total perimeter of the illusory shape). The sign of contrast (black on gray vs white on gray) reversed every 2 sec. All subjects reported the sensation of an illusory diamond shape when the inducers were aligned, but not in the alternating epochs when the inducers were not aligned.
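The stated support ratio and maximal extent follow from the inducer dimensions. The sketch below checks that arithmetic under one assumption: the 8.6° separation is taken to be the distance between adjacent inducers, that is, the side length of the illusory diamond. The function names are hypothetical.

```python
import math

INDUCER_DIAMETER_DEG = 3.6  # from the text
SEPARATION_DEG = 8.6        # center to center, assumed to be between adjacent inducers


def support_ratio(diameter, separation):
    """Fraction of each illusory side that is backed by real inducer edges."""
    # Each side runs between two inducer centers, and one radius at each end
    # is a real luminance edge, so the supported length equals one diameter.
    return diameter / separation


def maximal_extent(diameter, separation):
    """Tip-to-tip extent of the diamond display along a meridian."""
    diagonal = separation * math.sqrt(2)  # distance between opposite inducer centers
    return diagonal + diameter            # add one radius beyond each center


print(round(support_ratio(INDUCER_DIAMETER_DEG, SEPARATION_DEG), 2))   # 0.42
print(round(maximal_extent(INDUCER_DIAMETER_DEG, SEPARATION_DEG), 1))  # 15.8
```

Under this reading the ratio is about 0.42, which rounds to the reported 0.4, and the extent matches the reported 15.8°.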
Two other experiments used Kanizsa-type inducers. For these experiments we arranged the inducers to form an illusory square rather than a diamond, to confirm that the results were not specific to the diamond shape. One experiment compared the response to illusory squares of varying size (each vs a rotated inducer control). In those experiments, we compared inducer separations of 1.9, 3.8, 5.5, and 7.5° (center to center), all with a support ratio of 0.5. The second experiment compared an illusory square with a stimulus that was identical except that the central square was created by actual luminance contrast (Fig.1 C,D). The contrast of the inducers and the luminance square reversed every 2 sec. In the latter experiment, the average Michelson contrast of the square against the background was 11%.
Illusory contours: displaced gratings. For this experiment, the experimental stimuli were gratings with a central region displaced to form a diamond shape (Fig. 1 E). The control stimuli were standard gratings that lacked this displacement (Fig.1 F). The sign of contrast reversed every 2 sec as described above. Three versions of the grating-based illusory contour stimuli were used in which the line spacing was 0.5, 1, and 2° (spatial frequencies of 2, 1, and 0.5 cycles/°, respectively). As a further control, a radial version of the grating-based contours was also used, with inducing lines perpendicular to the illusory circular shape (Fig.1 G,H).
Stereopsis contours. Static red-green random dot stereograms (RDS) (Julesz, 1971) with a dot size of 0.19° were used to create depth from binocular stereopsis (Fig. 1 I,J). In the experimental epochs, a depth-defined square (8.8 × 8.8°) was visible at a depth nearer than background because of a disparity of 0.56°. The control epoch was a homogeneous, achromatic, random dot field. During the stereopsis scans, subjects wore plastic glasses with a red filter over one eye and a green filter over the other. To ensure stable binocular fusion, we omitted the 2 sec alternation, except in a control version in two subjects. All subjects reported clear binocular depth boundaries.
Luminance contours. These stimuli were created using Vision Shell (MicroML) software on a Macintosh IIvx. A single circular shape (7.7° eccentric) alternated with a homogeneous background every 16 sec (Fig. 1 K,L). The sign of contrast reversed every 2 sec, as described above. The luminance-defined circle had a mean luminance of 132.2 Foot-Lamberts and a Michelson contrast of 15%.
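Michelson contrast is defined as (Lmax − Lmin) / (Lmax + Lmin). As an illustration only, the sketch below derives the pair of luminances implied by the stated values, assuming that "mean luminance" denotes the midpoint of Lmax and Lmin. That interpretation, and the function names, are assumptions and are not stated in the text.

```python
def michelson_contrast(l_max, l_min):
    """Michelson contrast of two luminance levels."""
    return (l_max - l_min) / (l_max + l_min)


def luminance_pair(mean_luminance, contrast):
    """Return (l_max, l_min) whose midpoint is mean_luminance.

    Assumes the mean is (l_max + l_min) / 2, which is a hypothetical reading.
    """
    delta = mean_luminance * contrast
    return mean_luminance + delta, mean_luminance - delta


l_max, l_min = luminance_pair(132.2, 0.15)
print(f"{l_max:.2f} and {l_min:.2f} Foot-Lamberts")
print(f"contrast check: {michelson_contrast(l_max, l_min):.2f}")
```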
Ipsilateral field mapping. Additional experiments studied the activation produced in the ipsilateral hemisphere by visual stimuli contained in a retinotopically fixed sector (displaced by 20° of polar angle from the vertical meridian, also avoiding a circle of 0.5° radius centered around the fixation point). This wedge-shaped aperture contained colored images of recognizable scenes and objects (Tootell et al., 1998a).
Retinotopic mapping. This study took advantage of previously reported methods developed for mapping retinotopic areas with slowly moving phase-encoded stimuli composed of counterphasing luminance checks (DeYoe et al., 1994, 1996; Engel et al., 1994, 1997; Sereno et al., 1995; Tootell et al., 1997; Hadjikhani et al., 1998). Very briefly, we used stimuli that systematically map either visual field polar angle or eccentricity during paired but separate scans. The data from these paired scans were combined to yield field sign maps in which visual area borders were made visible. Visual area naming conventions are as described in Tootell et al. (1998) and are consistent with previous retinotopic studies. The superior portions of V1, V2, and V3 contain representations of the contralateral lower visual field, whereas the inferior portions of V1, V2, VP, and V4v contain representations of the contralateral upper visual field. V3A represents both the lower and upper contralateral field. Areas V1, V2, VP, V3, V3A, and V4v are “classical” retinotopic areas that have been described previously. Anterior to these areas there is a “fringe” region including V7 and V8, whose cruder retinotopy has been demonstrated only with high-field scanning (Hadjikhani et al., 1998). This fringe region has also been shown to be activated by both left and right visual fields (Tootell et al., 1998a). Thus, the evidence suggests that areas V7 and V8 lie near the end of a continuum of decreasing retinotopy and increasing receptive field sizes.
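The idea behind a field sign map can be illustrated with a toy computation: the sign of the cross product of the eccentricity and polar angle gradients flips between mirror-image and non-mirror-image representations, so area borders appear where the sign changes. This is a simplified sketch and not the procedure of the cited studies. It assumes maps sampled on a regular flat grid, whereas the study used a reconstructed cortical surface, and it ignores the circular wrap-around of phase-encoded polar angle. The function name `field_sign` is hypothetical.

```python
import numpy as np


def field_sign(eccentricity, polar_angle):
    """Sign of grad(eccentricity) x grad(polar_angle) at each grid point."""
    # np.gradient returns the derivative along axis 0 (rows, y) first,
    # then along axis 1 (columns, x).
    de_dy, de_dx = np.gradient(eccentricity)
    da_dy, da_dx = np.gradient(polar_angle)
    cross_z = de_dx * da_dy - de_dy * da_dx
    return np.sign(cross_z)


# Toy flat map: eccentricity increases along x, and polar angle reverses
# direction at row 10, mimicking the border between two adjacent areas.
y, x = np.mgrid[0:20, 0:20].astype(float)
ecc = x
angle = np.abs(y - 10.0)

signs = field_sign(ecc, angle)
print(signs[:, 0])  # -1 on one side of the reversal, 0 at the border, +1 beyond it
```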
Intracortical connections between human visual areas are not yet known. Here we presume these connections and the resultant cortical hierarchy are similar to those shown in macaque (Felleman and Van Essen, 1991). Conveniently, the hierarchical levels of cortical areas V1, V2, V3/VP, V3A/V4v, V7/V8, and MT are approximately consistent with their cortical location, running from posterior to anterior, respectively. Thus, we use the terms “lower-tier” and “higher-tier” to refer to general positions in the presumptive human hierarchy.
Retinotopic maps sufficient to discriminate the borders of these areas were obtained from all 16 subjects. For individual subject analysis, the borders from each subject’s field sign map were extracted and overlaid on the activation patterns produced by other stimuli (Fig. 3 B,C). We also used the field sign maps to define regions of interest for the across-subject analysis described later in this section.
Cortical surface reconstruction
Details of the cortical surface analysis have been described elsewhere (Dale and Sereno, 1993; Dale et al., 1999; Fischl et al., 1999). Briefly, brain reconstruction was begun by collecting whole-head Siemens magnetization-prepared rapid gradient-echo (MP-RAGE) scans (1 × 1 × 1 mm), optimized for contrast between gray and white matter, for each subject. Voxels containing white matter in an intensity-normalized volume were labeled using an anisotropic planar filter. A region-growing algorithm was then used to ensure that each cortical hemisphere was represented by a single connected component with no interior holes. The surfaces of these components were tessellated (∼150,000 vertices), refined against the MRI data using a deformable template technique, and manually inspected for topological defects, i.e., departures from spherical topology. In a separate step, the cortical surface was computed by expanding the gray–white surface by 3 mm and refining it against the MR data. The sampled functional signal included most of cortical gray matter, but it was centered just above the gray–white boundary to avoid the pial surface where macrovascular fMRI artifacts are greatest, and to ensure that functional signals were assigned to the correct sulcal bank.
The surface reconstructions of the subjects’ brains were “inflated” by an iterative algorithm that reduced local curvature while approximately preserving local areas and angles. Flattened patches of cortex were obtained by “cutting off” the posterior third of cortex from the inflated hemispheres and making an additional cut (i.e., disconnection of vertices) along the fundus of the calcarine fissure (Fig. 2 D,E). These cortical patches were flattened with a relaxation algorithm that minimized linear and angular distortion. Residual linear and angular distortion varies across the flattened surface (Sereno et al., 1995; Tootell et al., 1997), but recent analyses indicate that residual distortion averages only ∼10% (Fischl et al., 1998).
Functional MR data analysis
Individual subjects analysis. The MR data acquired for three-dimensional surface reconstruction was used to register anatomically the T1-weighted echoplanar imaging inversion recovery scans (1.5 × 1.5 × 4 mm resolution) that were obtained for the functional scans. The two data sets were manually aligned by direct iterative comparisons of the coronal, horizontal, and sagittal planes. Once the optimal registration was achieved, the same registration matrix was applied to the functional data to align them with the reconstructed cortical surface. For cortical inflating and flattening, the lower resolution functional data (3 × 3 × 4 mm) was smoothly interpolated onto the high-resolution surface reconstruction.
For each functional scan, a Fourier analysis was done on the time series of each voxel. For two-condition comparisons, significance values were computed for each voxel by performing an F test on the ratio of the signal at the stimulus cycle frequency (eight cycles per scan) to that at all other nonharmonic frequencies between 3 and 64 cycles per scan, excluding ±1 cycle around the stimulus frequency. Excluding cycle frequencies <3 helps to remove baseline drift and head motion artifacts. Harmonic frequencies were excluded because any periodic signal that is not perfectly sinusoidal will be expressed by the sum of sine waves at its fundamental frequency and all of its harmonics. The phase of the signal at the stimulus frequency was used to distinguish between signal increases and decreases in the MR signal for two-condition comparisons and to encode visual field location in phase-mapped retinotopic experiments.
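The frequency-domain test described above can be sketched for a single voxel as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code. It assumes that the test compares spectral power, that each noise frequency bin contributes two degrees of freedom, and that the p value can be taken from the closed-form tail of the F distribution with two numerator degrees of freedom. The names `noise_bins` and `fourier_f_test` and the synthetic data are hypothetical.

```python
import numpy as np

N_IMAGES = 128   # time points per scan (from the text)
STIM_CYCLES = 8  # stimulus cycles per scan (from the text)


def noise_bins(n_images=N_IMAGES, stim=STIM_CYCLES, low=3):
    """Frequencies (cycles per scan) that serve as the noise estimate."""
    nyquist = n_images // 2
    k = np.arange(low, nyquist + 1)
    # Drop the stimulus frequency and its harmonics, plus +/-1 cycle around it.
    keep = (k % stim != 0) & (np.abs(k - stim) > 1)
    return k[keep]


def fourier_f_test(ts, stim=STIM_CYCLES):
    """Return (F ratio, p value, phase) for one voxel time series."""
    ts = np.asarray(ts, dtype=float)
    spectrum = np.fft.rfft(ts - ts.mean())
    power = np.abs(spectrum) ** 2
    bins = noise_bins(ts.size, stim)
    f_ratio = power[stim] / power[bins].mean()
    dof2 = 2 * bins.size  # assumed: two degrees of freedom per noise bin
    # Upper tail of F(2, dof2) in closed form: (1 + 2F/dof2) ** (-dof2 / 2).
    p_value = (1.0 + 2.0 * f_ratio / dof2) ** (-dof2 / 2.0)
    phase = np.angle(spectrum[stim])  # separates signal increases from decreases
    return f_ratio, p_value, phase


# Synthetic voxels: one modulated in phase with the stimulus, one in antiphase.
rng = np.random.default_rng(0)
t = np.arange(N_IMAGES)
carrier = np.sin(2 * np.pi * STIM_CYCLES * t / N_IMAGES)
f_up, p_up, phase_up = fourier_f_test(carrier + rng.normal(size=N_IMAGES))
f_down, p_down, phase_down = fourier_f_test(-carrier + rng.normal(size=N_IMAGES))
print(len(noise_bins()), p_up < 0.01, phase_up < 0 < phase_down)
```

With 128 time points the highest resolvable frequency is 64 cycles per scan, which is itself a harmonic of the stimulus frequency and is therefore excluded; 52 noise bins remain in this sketch.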
Across-subjects analysis. To generate regions of interest (ROIs) specific to a given visual area, or part of such areas, patches of flattened cortex that corresponded to each retinotopic area were defined based on the retinotopic field sign map for each subject. These objectively defined borders were available for visual areas V1 (superior and inferior), V2 (superior and inferior), V3, VP, V3A, and V4v. Given that several of our experiments produced activation immediately adjacent to V3A and V4v, we created two additional ROIs adjacent to these areas to encompass the newly defined crudely retinotopic areas V7 (adjacent to V3A) and V8 (adjacent to V4v). The eccentricity range of these ROIs was ∼1–15°. For the classical retinotopic areas (V1, V2, VP, V3, V3A, V4v) an additional analysis was done using restricted ROIs within each visual area that included only the eccentricities from 3 to 9°, as assessed by retinotopic mapping of eccentricity in each subject. This eccentricity range included the location of the isoeccentric contours in the illusory and real contour stimuli.
We also created an ROI for area MT+. This ROI corresponds to presumptive human area MT, but the term MT+ is used to acknowledge the possibility that other small, adjacent motion areas are included (DeYoe et al., 1996). This ROI was defined by taking all the cortical surface voxels within the area MT+ defined by our standardized stimulus comparison (low contrast motion vs stationary) (Tootell et al., 1995) that exceeded a functional statistical threshold of p ≤ 10⁻². For each subject, we also created an additional functional ROI based on the aligned (Kanizsa) inducers versus rotated inducers experiment (see Results). Again, this ROI consisted of all the cortical surface voxels that exceeded a statistical threshold of p ≤ 10⁻².
For each ROI, the time course of the fMRI signal was averaged for all voxels. Then, the average magnitudes for the experimental and control epochs were calculated separately, and their difference was computed, factoring in a 4 sec hemodynamic delay. These difference scores were then averaged across subjects and analyzed statistically using t tests, with correction for multiple comparisons. These data were also analyzed with pairwise multivariate ANOVAs to determine if the relative pattern of activation across visual areas varied for the different stimulus comparisons.
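The per-ROI difference score can be sketched as follows. This is an illustration under assumptions rather than the analysis actually run: the hemodynamic delay is implemented by shifting the epoch labels 4 sec later, images acquired before the first delayed epoch are dropped, and the scan is assumed to begin with an experimental epoch. The function name `epoch_difference` and the synthetic ROI are hypothetical.

```python
import numpy as np

TR_SEC = 2.0      # repetition time (from the text)
EPOCH_SEC = 16.0  # epoch duration (from the text)
DELAY_SEC = 4.0   # hemodynamic delay (from the text)


def epoch_difference(roi_ts, first_is_experimental=True):
    """Mean experimental minus mean control signal for one ROI.

    roi_ts has shape (n_voxels, n_images). The starting condition is an
    assumption; the text does not state which epoch came first.
    """
    mean_ts = np.asarray(roi_ts, dtype=float).mean(axis=0)  # average over voxels
    t = np.arange(mean_ts.size) * TR_SEC
    shifted = t - DELAY_SEC          # stimulus time assumed to drive each image
    valid = shifted >= 0             # discard images before any delayed response
    epoch = (shifted // EPOCH_SEC).astype(int)
    is_exp = (epoch % 2 == 0) == first_is_experimental
    exp_mean = mean_ts[valid & is_exp].mean()
    ctl_mean = mean_ts[valid & ~is_exp].mean()
    return exp_mean - ctl_mean


# Synthetic ROI of five voxels: baseline of 100 with a delayed response of +1
# during experimental epochs.
t = np.arange(128) * TR_SEC
responding = (t >= DELAY_SEC) & (((t - DELAY_SEC) // EPOCH_SEC) % 2 == 0)
roi = np.tile(100.0 + responding.astype(float), (5, 1))
print(epoch_difference(roi))  # 1.0
```

The resulting scores, one per subject and ROI, are the quantities that would then be averaged across subjects and tested.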
Representation of illusory figures on single-subject flat maps
Illusory contour-defined figures: aligned (Kanizsa) inducers versus rotated inducers
In the first experiment, we presented stimuli that either did or did not give rise to illusory contours, but were otherwise very similar to each other (Kanizsa, 1979; Hirsch et al., 1995a). In the experimental stimulus, four inducers (pacmen) were aligned to create the percept of an illusory diamond shape (Fig. 2 A). In the control stimulus, the pacmen were rotated to destroy the perception of the diamond shape.
For 12 subjects, the regions of cortex that responded more to the experimental condition than to the control condition were analyzed (44 scans; 90,112 images). Such results are shown for four representative subjects (Figs. 2 E, 3B; see 5B,D). In all but one subject (who showed no significant signal specific to illusory contours), the differential activation was located bilaterally, centered on the lateral surface of the occipital lobe. The pattern of activation was an elongated stripe centered on the lateral occipital sulcus, that tended to become patchy toward the parietal and temporal lobes. In each of the 11 subjects, such signals were obtained from both the right and left hemispheres.
To demonstrate more explicitly the relative signal strength across visual areas in the above comparison, we performed an additional experiment in which we repeated the comparison between aligned and rotated inducers, with interposed epochs consisting of a fixation point alone. This made it possible to plot a time course for those cortical surface voxels preferentially activated by the Kanizsa stimulus (Fig.2 B). Furthermore, we can compare the signals from this statistically defined region to the locations of the known visual areas, defined by retinotopic mapping in the same subjects. It is evident that the region of interest, which was obtained in a separate scan of aligned versus rotated inducers in the same subject (Fig.2 E), is distinguished by a stronger response to aligned than to rotated inducers. In contrast, lower-tier visual areas such as V1, V2, V3, and VP show a response to both aligned and rotated inducers that is not reliably different for individual subjects (although small but significant differences were seen in the across-subjects analysis described later).
We directly compared the map of retinotopic areas with the illusory contour-related activity in each of the 12 subjects (38 scans; 77,824 images). The illusory contour signals were concentrated in the lateral occipital region, including V7 and V8, but often extended into V3A and V4v. The relative lack of signal in V1, V2, V3, and VP was consistent across individual subjects, and representative cases are shown (Figs.3 B; see 5B,D).
Finally, we performed an additional control experiment to exclude the possibility that the brain activation produced by the original Kanizsa comparison represents a simple sensitivity to the small displacement of inducer edges that accompanies their rotation. In this case, we compared a stimulus like that in Figure 1 B (except that all inducers were facing left) with a similar stimulus in which each inducer was rotated by 180° (all facing right). Here, neither configuration was consistent with an illusory shape. Correspondingly, this comparison yielded no differential activation.
The next step was to test the extent of overlap between the cortical regions that responded more to illusory contours and the regions that were activated by a comparable “real” contour. When we examined the brain regions that responded more to an isoeccentric luminance-defined contour than to a homogeneous field, we found an irregular but continuous line of activation along an isoeccentric contour that runs perpendicular to the long axis of the retinotopic areas, in both hemispheres of 11 subjects (Fig. 3 C). In the subjects with the greatest extent of activation, no clear difference was visible in the strength of activation across retinotopic visual areas, although there was variability in the extent of activation anterior to V3A and V4v. Thus, the luminance-defined shape provided a clear contrast with the illusory contour shape by activating all the visual areas approximately equally (see across-subjects analysis below).
Size invariance of illusory contour response
It could be argued that lower-tier retinotopic areas were not strongly activated by the illusory shapes because of a large mismatch between receptive field size and stimulus size. Perhaps the lateral occipital region was selectively activated simply because it contains neurons with large receptive fields capable of bridging the gaps (8.6°) between inducing elements. We tested this hypothesis by comparing the extent of activation produced by edge-type (Kanizsa) stimuli of different sizes (gap sizes of 1.9, 3.8, 5.5, and 7.5°) in six subjects. In comparison with the original results with gaps of 8.6°, we obtained no evidence of greater activation in the lower-tier retinotopic areas (V1, V2, V3, and VP) (Fig. 4). The focus of maximal activation produced by the four smaller sizes was similar to that obtained originally. The consistency of responses over a range of stimulus sizes fits with other data suggesting that receptive fields are large and bilateral in this region (Tootell et al., 1998a). Similar size-invariant responses have been documented in single neuron responses in the inferotemporal region of monkey cortex (Lueschow, 1994). This property is thought to underlie the ability of monkey and human observers to recognize objects over a wide range of stimulus sizes.
Aligned inducers (Kanizsa) versus aligned inducers with luminance occluder
Based purely on the above data, it could be argued that the results of the original Kanizsa comparison could still be caused by factors other than the presence versus absence of an illusory shape. Perceptually, the aligned inducer condition created an illusory closed figure that appeared to occlude the inducers. To investigate this effect of occlusion, we compared the original stimulus with a stimulus in which the area of the illusory shape was filled in with an actual luminance change (Fig. 1 C,D).
The results for this test (seven subjects; 24 scans; 49,152 images) were similar to those obtained for the original comparison, in that greater activation was obtained for the illusory Kanizsa stimulus in V3A, V4v, V7, and V8. However, we found two further differences. The overall signal strength was weaker in these areas when the luminance-occluding figure served as a control. Also, in visual areas V1 and V2, there was greater activation during the luminance occluder epoch than during the illusory-occluder epoch. This effect is consistent with recordings in monkey V2 showing more vigorous single unit responses to a luminance edge than to an illusory edge (Peterhans and von der Heydt, 1989). This type of comparison does not allow us to distinguish between fMRI responses to illusory (or real) contours as opposed to surfaces, but it does suggest that the lower- versus higher-tier areas respond with opposite “preferences” to the luminance and illusory shapes. These conclusions are confirmed by the across-subjects analysis described later.
Next we localized the regions that responded more to an isoeccentric contour in depth than to a zero depth random dot display. The pattern of results for the stereo-defined shape was similar to the illusory shape in that the activation peak was centered in the anterior visual areas (Fig. 5 A,C). Comparison between the regions activated by the illusory contour-defined shape and the stereopsis-defined shape indicated a significant overlap, particularly in V3A and V7 (Fig. 5). The degree of overlap decreased inferiorly (e.g., anterior to V4v), where the illusory contour stimuli produced more activity than the stereo stimuli.
Displaced versus nondisplaced gratings
Here we compared the results obtained from the Kanizsa-type stimuli to those produced by grating-based illusory contours. These two stimulus types have known psychophysical differences (Petry et al., 1983; Lesher and Mingolla, 1993). Also, displaced-grating illusory contours have been used often in physiological experiments in animals (von der Heydt and Peterhans, 1989; Grosof et al., 1993; Sheth et al., 1996), and these studies suggest that displaced gratings may evoke a stronger response in lower-tier areas than the Kanizsa-type.
For this experiment, the experimental stimulus was a grating with a central region displaced to form a diamond shape (Fig. 2 E), whereas the control grating lacked this displacement (Fig. 2 F). We initially used stimuli with a line spacing of 0.5° (2 cycles/°). As was observed for the other illusory contour comparisons, this stimulus comparison selectively activated the higher-tier visual areas. In addition, this stimulus produced an isoeccentric “contour” representation in the retinotopic areas V1, V2, V3, and VP (Fig. 6 A).
It could be argued that the activation in retinotopic areas is an artifact caused by the Fourier energy at the orientation and location (phase) of the illusory contours in this stimulus (Ginsburg, 1975; Skottun, 1994). To reduce this artifact, we created two additional versions of this stimulus, in which the line spacing was increased to 1 and 2° (1 and 0.5 cycles/°, respectively) (Fig. 6 B,C). Interestingly, the resultant signal in retinotopic areas did not decrease; instead, the differential activation in the lateral occipital region actually increased. This general pattern was seen in all nine subjects tested.
Another control experiment attempted to generalize the results with grating-based contours to a case in which the illusory contour was produced at a different angle relative to the inducing lines. In two subjects, we repeated the experiment using radial lines that ran perpendicular to an illusory circle (Fig. 2 I,J). The results were very similar to those obtained with the standard gratings.
The fact that the differential signal grew stronger as the number of line terminations was reduced (lower spatial frequency) also helps to support the conclusion that the presence of line terminations themselves was not the primary source of activation. Furthermore, we performed an additional control experiment to equate the presence of line terminations in three subjects. The new control stimuli consisted of the original displaced-grating stimuli with the line terminations misaligned, i.e., interleaved with each other, so as not to form an illusory contour (von der Heydt and Peterhans, 1989). The displaced-grating experimental stimuli were unchanged. The results were very similar to the original comparison, suggesting again that these areas show a response to illusory contours that goes beyond the response to line terminations per se. This trend was seen, despite the fact that the interleaved version does not entirely eliminate the global figure–ground segmentation.
Across-subjects analysis for isoeccentric figures
In these experiments, the stimuli consisted of single figures with edges that remained approximately isoeccentric at 7–9° eccentricity. Such isoeccentric contours produced very orderly maps on the flattened cortical surface: essentially straight lines crossing the retinotopic isopolar lines, including the isopolar area borders. This was consistent with earlier retinotopic evidence for an approximately polar coordinate system, similar to that found in other species (Schwartz, 1977). The representation of a square/diamond (rather than a circle) produced a predictable deviation from the isoeccentric lines, but this deviation was small because of the moderating influence of the cortical magnification factor. It is experimentally convenient that a single, approximately isoeccentric contour produced a single stripe of activation that runs across the visual areas, because this allowed for direct comparison of the activity patterns across visual areas (Fig. 3 C). Also, using just a single contour allowed us to predict with accuracy the resulting site of activation in retinotopic cortex.
The individual flat maps imply that certain areas lack responsiveness to certain stimuli (e.g., the lack of response to aligned vs rotated inducers in lower-tier areas like V1 and V2). To test such negative results more rigorously, we devised a strategy that allowed for data to be averaged across subjects quantitatively. First, we created ROIs based on nine separate visual areas (see Materials and Methods). For each of these ROIs we calculated the average percentage of fMRI signal change that was produced by the stimulus comparisons discussed above. The percent signal change score for each area could then be averaged across subjects. In areas V1, V2, VP, V3, V4v, and V3A, we also performed a similar analysis on restricted ROIs that included only the eccentricities from 3–9°, the eccentricity range within which the isoeccentric contours were represented. It should be noted that the choice between larger ROIs or the restricted (by eccentricity) ROIs involves certain tradeoffs. Because of differences in receptive field size and retinotopic point spread across areas, using larger ROIs may put the lower-tier retinotopic areas at a disadvantage. Using restricted ROIs can mitigate this problem, but this analysis was not applied to less retinotopic areas such as V7, V8, and MT+, effectively putting them at a disadvantage.
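The ROI averaging strategy can be illustrated with a minimal sketch. It assumes that percent signal change is the difference between the mean experimental-epoch and mean control-epoch signal, expressed relative to the control mean; the exact formula is not given here, so this definition, the function names, and the example values are all hypothetical.

```python
from statistics import mean

def percent_signal_change(experimental, control):
    """Percent change of the mean experimental signal relative to the control mean.

    Assumed definition; the paper does not spell out the formula.
    """
    baseline = mean(control)
    if baseline == 0:
        raise ValueError("control baseline must be nonzero")
    return 100.0 * (mean(experimental) - baseline) / baseline

def average_across_subjects(per_subject):
    """Average the percent signal change per visual area over subjects.

    per_subject maps subject -> {area: (experimental_samples, control_samples)}.
    Subjects lacking a given ROI are simply skipped for that area.
    """
    areas = sorted({area for rois in per_subject.values() for area in rois})
    return {
        area: mean([
            percent_signal_change(*rois[area])
            for rois in per_subject.values()
            if area in rois
        ])
        for area in areas
    }

# Hypothetical values for illustration only (not data from this study).
example = {
    "s1": {"V1": ([101, 101], [100, 100]), "V3A": ([1010, 1030], [1000, 1000])},
    "s2": {"V1": ([1030, 1030], [1000, 1000])},
}
roi_means = average_across_subjects(example)
```

Normalizing each ROI to its own control baseline before averaging keeps subjects with different raw signal scales from dominating the group mean.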
To test for differences between the two hemispheres, we compared the average percent signal change for all visual areas in the left hemisphere with those in the right hemisphere, using a t test. In all cases, the difference between left and right hemispheres was not significant (luminance, p = 0.19; stereopsis, p = 0.72; aligned vs rotated inducers, p = 0.32; displaced vs nondisplaced gratings, p = 0.99).
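A hemispheric comparison of this kind can be sketched as follows. The sketch assumes a paired design, in which each visual area contributes one right-hemisphere and one left-hemisphere value; whether the test used here was paired is not stated, and the function name and input values are hypothetical. Only the t statistic and degrees of freedom are computed; obtaining a p value would require a t distribution from a statistics library.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(right, left):
    """Return (t statistic, degrees of freedom) for paired samples.

    Assumes one right/left pair per visual area (a hypothetical pairing).
    """
    if len(right) != len(left) or len(right) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [r - l for r, l in zip(right, left)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    if standard_error == 0:
        raise ValueError("differences have zero variance; t is undefined")
    return mean(diffs) / standard_error, len(diffs) - 1

# Hypothetical percent signal changes, for illustration only.
right_hemisphere = [1.0, 2.0, 3.0, 4.0]
left_hemisphere = [0.0, 1.5, 2.0, 3.5]
t_stat, dof = paired_t(right_hemisphere, left_hemisphere)
```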
These tests of hemispheric lateralization were particularly interesting, because a previous study reported stronger signals in the right hemisphere for the aligned versus rotated comparison (Hirsch et al., 1995a). In our study, the average right hemisphere modulation was 0.078%, whereas that for the left hemisphere was 0.056%, but this difference was not significant. To test for hemispheric asymmetry more extensively, we measured the extent of activation in individual subjects. For each of 11 subjects, we determined the number of voxels that exceeded the significance threshold of p = 10⁻² (colored red and white) separately in the right (R) hemisphere and the left (L) hemisphere. Then we calculated the mean laterality index [(R − L)/(R + L)] to be 0.13. If a higher threshold is chosen that includes only the voxels exceeding p = 10⁻⁵ (colored white), the mean index increases to 0.34. The regions included at those two significance levels can be estimated from the pseudocolor activation in Figure 5. Thus, in individual subjects, highly thresholded data can indicate a laterality effect that does not survive across-subject analysis. Therefore, in the following analyses, we averaged together the percent signal change obtained for corresponding ROIs in the left and the right hemispheres.
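The laterality index defined above, (R − L)/(R + L), can be written out as a short sketch. It assumes that the mean index is the average of per-subject indices, which is one reading of the procedure; the function names and voxel counts are hypothetical.

```python
from statistics import mean

def laterality_index(right_voxels, left_voxels):
    """(R - L) / (R + L): +1 is fully right-lateralized, -1 fully left."""
    total = right_voxels + left_voxels
    if total == 0:
        raise ValueError("no suprathreshold voxels in either hemisphere")
    return (right_voxels - left_voxels) / total

def mean_laterality(counts):
    """Average the index over subjects; counts is a list of (R, L) pairs."""
    return mean([laterality_index(r, l) for r, l in counts])

# Hypothetical suprathreshold voxel counts, for illustration only.
subject_counts = [(60, 40), (50, 50)]
group_index = mean_laterality(subject_counts)
```

Because the index is a ratio, raising the threshold shrinks both R and L, so a small absolute difference in voxel counts can yield a larger index, which fits the increase from 0.13 to 0.34 reported above.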
The across-subjects results confirmed the conclusions from individual subject analysis (Fig. 7). Specifically, signal changes were relatively constant across retinotopic areas for luminance contours, but shifted anteriorly for the contours defined by stereopsis and illusory contours. F tests confirm that signals were greater in anterior retinotopic areas compared to the lower-tier retinotopic areas for the stereopsis-defined figure (F (5,50) = 4.38; p = 0.01), the aligned (Kanizsa) inducers versus rotated inducers (F (5,55) = 7.65; p < 0.0001), and the displaced versus nondisplaced grating (F (5,40) = 7.2; p < 0.0001). The two types of illusory contours differed in that larger signals were produced by the grating-type illusory contours in the lower-tier retinotopic areas. Finally, there was also a significant change across visual areas in the case of illusory versus luminance (Kanizsa) squares (F (5,30) = 6.1; p < 0.0005).
Figure 7 also shows the results for the restricted ROIs within each retinotopic area, including only the eccentricities from 3 to 9° (see bullets with heavy error bars). As expected, the smaller regions of interest resulted in greater apparent signal changes. This is particularly interesting when comparing results in the aligned versus rotated inducers comparison (Fig. 7C). After all of our efforts to increase the statistical power of the data, we see that signal changes in areas V1 and V2 increase to nonzero values. This indicates not only that there was a small but detectable response to the Kanizsa-type illusory shape in lower-tier visual areas, but that the signals were retinotopically specific.
To formally test for different levels of modulation across retinotopic visual areas, we performed several multivariate ANOVAs with six subjects. A grand 4 × 8 ANOVA with factors of cue (shape defined by: luminance, stereopsis, Kanizsa-type illusory contour, and lowest spatial frequency displaced-grating illusory contour) and visual area (V1, V2, V3, VP, V3A, V4v, V7, and V8) was performed. The cue-by-area interaction was significant (F (21,126) = 3.25; p = 0.0001). The equivalent analysis for restricted ROIs had a borderline significant cue-by-area interaction in a 4 × 6 ANOVA (F (15,120) = 1.59; p = 0.08).
We followed up the significant grand ANOVA with pairwise comparisons between all of the cues (Table 1). The pairwise comparisons were performed for the full area retinotopic ROIs, and the eccentricity restricted retinotopic ROIs. Because of the large number of tests here, we also considered the effects of multiple comparisons. We have indicated with asterisks the p values that would survive a Bonferroni correction of 6 (the number of pairwise comparisons in each case). We report all of the p values because they provide a concise indication of signal strength and variance.
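As a minimal sketch of the correction described above, a p value survives a Bonferroni correction of 6 when it falls below the nominal criterion divided by the number of tests. The nominal criterion of 0.05 is an assumption, and the function name and p values below are hypothetical.

```python
def bonferroni_survivors(p_values, n_tests, alpha=0.05):
    """Flag p values that stay below alpha / n_tests.

    alpha=0.05 is an assumed nominal criterion, not one stated in the text.
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    threshold = alpha / n_tests
    return {name: p < threshold for name, p in p_values.items()}

# Hypothetical p values for two pairwise comparisons, for illustration only.
example_p = {"cue pair A": 0.005, "cue pair B": 0.01}
flags = bonferroni_survivors(example_p, n_tests=6)
```

With six comparisons the corrected criterion is 0.05/6, or roughly 0.0083, so a p value of 0.01 would be reported without an asterisk.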
The cue-by-visual area ANOVAs test for a main effect of cue, a main effect of visual area, and their interaction. Significant main effects of cue indicate that (averaging over all visual areas) there is a difference in signal magnitude, which could possibly be caused by differences in stimulus visibility. Table 1 shows that we did obtain a few marginally significant main effects of cue, but they do not dominate, or survive multiple-test correction, except in the case of Kanizsa-type versus displaced-grating type illusory shape. More importantly, we obtained two cue-by-area interactions that were clearly significant. Such significant interactions indicate that the pattern of response across visual areas differed across cues, even when constant overall differences in signal strength were removed. The interactions confirm that the signals from higher-tier areas are larger than those in lower-tier areas for shapes defined by illusory contours. It is also interesting that these interactions were markedly reduced in the case of the restricted ROIs, because of the boost in signal that this manipulation gives to the lower-tier areas. The interactions between other pairs of cues (e.g., stereopsis vs luminance and Kanizsa-type vs displaced-grating illusory shape) were marginally significant. The current technique (using a 1.5 T scanner) may lack the power to detect these interactions; future high-field scanning at 3 T should resolve the issue.
Activation maps from individual subjects indicated that MR signals varied with the spatial frequency of the displaced-grating illusory shape stimuli (Fig. 6). We followed up on this observation with an ANOVA across nine subjects (Fig. 8). A 3 × 8 ANOVA showed a significant effect of spatial frequency (F (2,14) = 0.047; p = 0.05), and a significant effect of visual area (F (7,49) = 6.85; p = 0.0001), but no interaction (F (14,98) = 0.45; p = 0.95).
Finally, we compared the four sizes of Kanizsa squares used in the aligned versus rotated inducer comparisons, across five subjects. A 4 × 8 ANOVA showed no significant effect of size (F (3,12) = 1.93; p = 0.20). It is with caution that we accept this null hypothesis, but there is 75% power to exclude a correlation between stimulus size and MR signal ≥0.5 (assuming independent samples; p = 0.05, one-tailed). It would be worthwhile to address this issue again with high-field scanning. As expected, there was a significant effect of visual area (F (7,28) = 10.65; p = 0.0001) and no interaction (F (21,84) = 0.62; p = 0.89).
To describe the location of our visual area ROIs more precisely within the cortical volume, we computed the mean Talairach coordinate for each visual area ROI using the automated stereotaxic procedure provided by the Montreal Neurological Institute (Collins et al., 1994). For all ROIs, we calculated the mean Talairach coordinates of all cortical surface vertices, then averaged the coordinates across subjects (Table 2). According to Collins et al. (1994), the average (uncorrected) variability in location of cortical anatomical landmarks across subjects is 7.74 ± 1.74 mm. Not surprisingly, we see slightly higher variability for our occipital ROIs created by purely functional specification, because coordinates likely reflect some variation of functional location with respect to anatomical landmarks. Also, these functional areas extend over a relatively large cortical territory, particularly along their long axis, so more variability is expected.
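The two-stage averaging of coordinates can be sketched as follows: a centroid is computed over the surface vertices of each subject's ROI, and the centroids are then averaged across subjects, with the across-subject standard deviation as the variability estimate. The sketch assumes the vertices are already in Talairach space (the stereotaxic transformation itself is not reproduced), and the function names and coordinates are hypothetical.

```python
from statistics import mean, stdev

def roi_centroid(vertices):
    """Mean (x, y, z) over the surface vertices of one subject's ROI."""
    xs, ys, zs = zip(*vertices)
    return (mean(xs), mean(ys), mean(zs))

def across_subject_centroid(per_subject_vertices):
    """Return (mean, sd) of the per-subject ROI centroids along each axis.

    Requires at least two subjects so that the standard deviation is defined.
    """
    centroids = [roi_centroid(v) for v in per_subject_vertices]
    axes = list(zip(*centroids))
    means = tuple(mean(axis) for axis in axes)
    sds = tuple(stdev(axis) for axis in axes)
    return means, sds

# Hypothetical vertex coordinates in mm, for illustration only.
subject_a = [(-10.0, -90.0, 0.0), (-14.0, -94.0, 4.0)]
subject_b = [(-8.0, -88.0, 6.0)]
group_mean, group_sd = across_subject_centroid([subject_a, subject_b])
```

Averaging within subject first gives each subject equal weight regardless of how many vertices their ROI contains.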
Visual field representation in the lateral occipital region
Very strong signals were produced by illusory contour stimuli in the cortex immediately adjacent to V3A and V4v. That region of cortex is located on the lateral occipital surface of the cortex (Figs. 2, 3), and it is likely to contain multiple visual areas. We calculated the mean Talairach coordinate of all statistically significant voxels for the 11 subjects who produced activation maps for the aligned Kanizsa inducers versus rotated inducers comparison. The coordinates were −33.2 ± 9.4, −83.7 ± 7.2, and 2.9 ± 9.5 in the left hemisphere, and 27.4 ± 7.0, −84.7 ± 8.0, and 10.0 ± 9.1 for the right hemisphere. The exact relation between the regions of cortex activated in this study, and the complex called “LO” in a previous report (Malach et al., 1995) is not yet known, although some overlap is likely. The Talairach coordinates published for LO by Malach et al. (1995) are 42.8 ± 2.7, −72.7 ± 8, and −18.2 ± 9.8. The coordinates for LO in the Malach study are similar, but not identical to the ones we obtained for the Kanizsa comparison. In particular, Malach et al. (1995) obtained signals more ventrally with their paradigm. One likely source of this difference is that Malach et al. (1995) included recognizable objects (as well as abstract sculpture) in their experimental epoch; the control epoch consisted of visual textures. Several previous studies comparing recognizable objects with various controls have localized responses in the ventral occipital region around the fusiform gyrus (Stern et al., 1996; Kanwisher et al., 1997; Halgren et al., 1999).
Subsequent to the completion of this study, our research group has mapped additional retinotopic areas adjacent to V3A and V4v (V7 and V8, respectively) fueled primarily by the availability of a new 3 Tesla scanner (Hadjikhani et al., 1998). Although these new areas show some degree of retinotopy, it is cruder than in the six classically retinotopic areas (Tootell et al., 1998b). These and other results suggest that the receptive field sizes in these regions are relatively large (Tootell et al., 1997).
We have also demonstrated that this lateral occipital region can be strongly driven by the ipsilateral field (Tootell et al., 1998a). For the present study, we specifically compared (four subjects; eight scans; 16,384 images) the area that responded to the illusory contour comparisons with the region activated by the ipsilateral presentation of complex natural scenes. The two activation patterns overlapped extensively (Fig. 3 A,B). Because this ipsilaterally driven area was also activated by contralateral stimulation, we know that it is activated bilaterally and presume that the underlying receptive fields are bilateral.
Additional comparisons (11 subjects; 22 scans; 45,066 images) were made to determine the relationship between the illusory contour activation and the motion-sensitive area MT+ described previously (Watson et al., 1993; Tootell et al., 1995). In all of the subjects, MT+ was located anterior to the cortical regions activated by illusory contours. Thus, the region activated by the illusory contour comparisons lies between the most anterior classical retinotopic areas (V3A and V4v) and MT+, and it is composed largely of bilaterally responsive cortex. Here we use the term LO region to refer to this lateral occipital region.
By monitoring brain activity in many predefined visual areas simultaneously, we have explored the representation of several types of contours. Our results suggest a great deal of overlap in the visual areas that respond to luminance, stereopsis, and illusory contours. The visual areas we examined responded to all of the visual cues we tested, to some degree. However, the contours defined by different cues produced some differences as well. Significantly, illusory contours and stereopsis-defined contours were marked by relatively high signal changes in higher-tier cortical regions. In the following section we discuss the neural response to illusory and stereopsis-defined contours and propose that a surface-based level of visual processing in the lateral occipital region may be a shared feature.
The neural response to illusory contours
Our results suggest that illusory contours are processed throughout the visual pathway, but signals are strongest in higher-tier areas, V3A, V7, V4v, and V8. The literature describing single-unit physiology in animals has shown neural responses to illusory contours in area V2, and to lesser extent, V1 (Peterhans and von der Heydt, 1989; von der Heydt and Peterhans, 1989; Grosof et al., 1993; Sheth et al., 1996). Although our individual subject analysis did not show that V1 or V2 neurons are activated by Kanizsa-type illusory stimuli, small signals were seen in V1 and V2 in the most sensitive across-subjects analysis. Furthermore, several additional factors mitigate any apparent discrepancy with respect to previous animal experiments. (1) Most obviously, previous single-unit studies did not test for responses to illusory contours in areas beyond V2. A testable prediction from our findings is that responses to illusory contours should be very strong in macaque areas V3A and dorsal V4; (2) We may have isolated responses specific to closed illusory contours or surfaces, as opposed to single illusory contours; (3) Our subjects were humans, rather than macaque monkeys; and (4) We recorded population signals, rather than specific single units.
We demonstrated significant activation for Kanizsa-type illusory shapes in the lower-tier retinotopic areas when we averaged across subjects, despite the lack of response shown in the individual, thresholded activity maps. This apparent difference is caused by the much better signal-to-noise ratio obtained by averaging many retinotopically restricted ROIs, compared to examining each individual activity map. Almost all the cortical regions that were activated in single subjects were contained within our quantitative ROIs; all such areas have at least some degree of retinotopy (which defined the borders). Thus, in this study the illusory contour comparisons activated primarily retinotopic areas. One possible exception is a region in the intraparietal sulcus that was seen as a distinct focus in several subjects (Figs. 3 B, 5 B). Overall, our results indicate a graded increase in responsiveness to illusory contour-defined shapes as one proceeds through the presumed cortical hierarchy. Luminance-defined shapes, for example, produced a different pattern, with stronger signals in lower-tier areas.
Our results indicate a larger signal in retinotopic areas in response to displaced-grating illusory contours compared to the Kanizsa-type. The results are consistent with the published evidence that displaced-grating contours are more likely to drive single neurons in V1 than the Kanizsa-type (Grosof et al., 1993; Sheth et al., 1996). There are multiple interpretations of the difference between the two types of illusory contours. One possibility is that the displaced gratings produced a response to the edges of each grating per se. However, the fact that the signals in the retinotopic areas did not decrease when we reduced the number of inducing lines argues that the signals reflected a response to the illusory contour itself. The population response to displaced-grating stimuli has been studied in V1 and V2 in experimental animals (Sheth et al., 1996), and both areas responded in an orientation-specific manner to the illusory contour, the inducing lines, and a combination of the two, with a greater proportional response to the illusory contour in V2 than in V1. Because of the local discontinuities present in the displaced gratings, our Kanizsa-type comparisons may be a purer test for illusory contour representation.
Our results are consistent with the results of previous human neuroimaging work using illusory contours that reported extrastriate activation loci for (Kanizsa) stimulus comparisons like that in Figure 2, A and B (Hirsch et al., 1995a; ffytche and Zeki, 1996). However, we report more widespread signals than Hirsch et al. (1995a). This difference likely reflects our efforts to achieve greater sensitivity using increased signal averaging, different hardware (e.g., surface coil), and analysis (e.g., across-subject averaging). Our results support the idea that both the right and left hemispheres have access to the bilateral neural representation of illusory shapes, as suggested by Mattingly et al. (1997). We also provide the first evidence that signals related to illusory contours are retinotopically specific in retinotopic areas, and that visual areas beyond V1 and V2 are the sites of most active processing. This information should be useful for models of illusory contour perception (Grossberg and Mingolla, 1985; Peterhans and von der Heydt, 1989; Takemoto and Yoshimichi, 1997). For instance, the role of feedback connections in V1 and V2 could be considered with greater emphasis, in addition to that of lateral connections between areas.
The neural response to stereopsis-defined contours
The stereopsis-defined contour produced activation that was strong in V3A and the lateral occipital region. In the case of stereo, we do not think that the higher activation in the relatively anterior regions was caused simply by a stronger “bottom-up” driving force, because signal amplitudes in V1 and V2 were roughly equal when produced by luminance-defined versus stereopsis-defined figures.
One PET study has reported areas that were activated by binocular disparity discrimination (Gulyas and Roland, 1994). However, in that study subjects performed a task, and there was no fixation. The most relevant comparison in that study was a luminance-based task subtracted from a depth discrimination task. In that case, a strongly activated locus was found in the “occipital superior gyrus” bilaterally with Talairach coordinates (−17, −79, 17; 28, −78, 14), which are close to those obtained for our ROI in V3A (Table 2). Additional brief reports have indicated the importance of V3A and the inferior parietal region in depth perception (Savoy et al., 1995; Nagahama et al., 1996), although other brief reports have emphasized earlier areas including V1 (Ptito et al., 1993; Hirsch, 1995b; Kahn et al., 1997). Methodological and stimulus differences may help explain the difference in results. Unlike our results, several studies, particularly PET studies, have reported an asymmetry favoring the right hemisphere in tests of binocular disparity (Ptito et al., 1993; Hirsch, 1995b; Nagahama et al., 1996), but this was not universally reported (Savoy et al., 1995).
A surface-based level of visual processing
We found a dissociation between stimuli containing stereoscopic depth cues or implied occlusion, compared to stimuli that did not create strong segmentation in depth. The illusory contour stimuli that produced strong signals in higher-tier areas include Kanizsa-type stimuli as well as our most artifact-free displaced-grating stimulus. Both these stimuli also give a clear impression of a solid shape occluding the background, as does the shape defined by stereopsis. Thus, the activation in the LO region might be related to segmentation of figures from background. Such a task is thought to occur at an intermediate level of processing (after edge detection, but before object recognition), and it may be associated with partial reconstruction of the three-dimensional depth relations between surfaces (Kanizsa, 1979; Marr, 1980; Nakayama et al., 1995). Theoretical and psychophysical support exists for a surface-based representation of the visual image (Petry and Meyer, 1987; Nakayama and Shimojo, 1992), but physiological evidence for such representations is limited.
It is likely that certain stages of surface processing require large bilateral receptive fields, e.g., the ability to integrate over distant retinal cues. Therefore, the fact that the LO region contains cortex that is bilaterally responsive is an important finding. One hypothesis regarding the function of the lateral occipital region is that it contains neurons that subserve long-range grouping, which is important for surface perception. Thus, activation including the LO region has been reported for stimuli that contain surfaces defined by kinetic contours (Van Oostende et al., 1997), for abstract three-dimensional shapes (Malach et al., 1995), and for symmetric stimuli (Tyler and Baseler, 1998). Future experiments will address the relationship between the stimuli with implied depth used in this study, and shapes defined by other means, to clarify the segmentation processes that are used constantly in normal vision.
This research was supported by grants from McDonnell-Pew to J.M., and Human Frontiers Program and National Institutes of Health Grant EY07980 to R.B.H.T. We are indebted to Ken Kwong, Bruce Rosen, Robert Weisskoff, Thomas Brady, Terry Campbell, Mary Foley, and Patrick Ledden for their critical contributions. We are grateful to Jody Culham and Patrick Cavanagh for generously providing stimulus presentation software. Discussions with Nava Rubin and Hany Farid were particularly helpful. We also thank Robert Savoy and The Rowland Institute for Science for technical and equipment support, and the Brain Imaging Center at the Montreal Neurological Institute for stereotaxic software.
Correspondence should be addressed to Janine Mendola, Massachusetts General Hospital Nuclear Magnetic Resonance Center, 149 13th Street (2301), Charlestown, MA 02129.