We compared neural substrates of two-dimensional shape processing in human and nonhuman primates using functional magnetic resonance (MR) imaging in awake subjects. The comparison of MR activity evoked by viewing intact and scrambled images of objects revealed shape-sensitive regions in occipital, temporal, and parietal cortex of both humans and macaques. Intraparietal cortex in monkeys was relatively more two-dimensional shape sensitive than that of humans. In both species, there was an interaction between scrambling and type of stimuli (grayscale images and drawings), but the effect of stimulus type was much stronger in monkeys than in humans. Shape- and motion-sensitive regions overlapped to some degree. However, this overlap was much more marked in humans than in monkeys. The shape-sensitive regions can be used to constrain the warping of monkey to human cortex and suggest a large expansion of lateral parietal and superior temporal cortex in humans compared with monkeys.
There is considerable evidence that the ventral stream in monkey processes shape information, including objects or parts of objects. Inferotemporal (IT) neurons are selective for two-dimensional shape (Gross et al., 1972; Desimone et al., 1984; Baylis et al., 1987; Tanaka et al., 1991; Kovács et al., 1995; Logothetis et al., 1995). V4, a major source of inputs to IT, is selective for complex patterns (Gallant et al., 1993) and parts of objects (Pasupathy and Connor, 2001). Human imaging studies have shown that a large cortical region referred to as the lateral occipital complex (LOC) is activated preferentially by images of objects compared with a variety of scrambled or noise stimuli (Malach et al., 1995; Grill-Spector et al., 1998b). This object-related region is located primarily ventral and anterior to human MT/V5 and lateral to retinotopic regions V4/V8. Kourtzi and Kanwisher (2000) showed that LOC in ventral occipitotemporal cortex of humans extracts and represents two-dimensional shape. It is often assumed that human LOC is the equivalent of IT cortex in monkeys, because the shape sensitivity of LOC and the shape selectivity of IT neurons are both cue invariant (Sáry et al., 1993; Grill-Spector et al., 1998a), and LOC and IT cortex are implicated in visual recognition (Ungerleider and Mishkin, 1982; Horel et al., 1987; Rosier et al., 1997; Grill-Spector et al., 2000). However, the homology between LOC and IT cortex has not been tested explicitly.
In monkey parietal cortex, more than half of the LIP neurons have been reported to be shape selective (Sereno and Maunsell, 1998). In contrast, several human imaging studies have reported parietal activation when comparing images of objects to their scrambled counterparts (Kraut et al., 1997; Avidan et al., 2002; James et al., 2002), but Kourtzi and Kanwisher (2000) attributed this parietal activation to attention effects. The central question addressed in the present study is the comparison of the visual cortical regions involved in processing two-dimensional shape in humans and monkeys. A second related question concerns the degree to which the differences in activation pattern between the two species, if any, can be explained by the differences in relative size of various subregions of human versus macaque cortex.
Kourtzi and Kanwisher (2000) observed no difference between the activation pattern in human LOC when comparing the intact and scrambled images of grayscale photographs and line drawings. They concluded that LOC represents information about object structure independent of the different image cues. Because object perception in monkeys and humans is not identical (Fagot and Deruelle, 1997), the third question addressed in this study is the extent to which shape-related areas process grayscale photographs and line drawings similarly in the two species. To address these questions, we used functional magnetic resonance imaging (fMRI) in humans (Belliveau et al., 1991) and awake monkeys (Vanduffel et al., 2001), combined with warping techniques of surface representations (Astafiev et al., 2003; Van Essen et al., 2004), to compare the activation patterns elicited by the viewing of either intact or scrambled grayscale photographs or line drawings.
Materials and Methods
Subjects. Four male (M1, M3, M4, and M5) rhesus monkeys (3-6 kg, 4-7 years of age) and 17 young (20-30 years of age) right-handed human subjects participated in the experiments. For surgical procedures, training of monkeys, details of image acquisition, and statistical analysis of monkey scans, see studies by Vanduffel et al. (2001) and Fize et al. (2003).
The monkeys were rewarded for fixating within a 2 × 2° window while stimuli were projected in the background. Human subjects were instructed to maintain fixation and received a small monetary incentive after completion of all scan sessions. Eye position was monitored during scanning, using Iscan (Burlington, MA) for monkeys and Ober 2 for humans.
Monkey subjects sat in a sphinx position in a plastic chair and faced the screen directly. Humans lay in a supine position and viewed the screen through a 45° tilted mirror. When responding in a task, both humans and monkeys interrupted an infrared beam with one hand.
Before each monkey scanning session, a contrast agent, monocrystalline iron oxide nanoparticle (MION), was injected into the femoral vein (5-11 mg/kg). The use of the contrast agent improved the contrast-to-noise ratio (by approximately fivefold) and the spatial specificity of the signal changes compared with blood oxygenation level-dependent (BOLD) measurements (Vanduffel et al., 2001; Leite et al., 2002). Although BOLD measurements used in humans depend on blood volume, blood flow, and oxygen extraction, MION measurements depend only on blood volume (Mandeville and Marota, 1999). For sake of clarity, the polarity of the MION MR signal changes, which are negative for increased blood volumes, was inverted.
Visual stimuli. Visual stimuli were projected from a Barco (Kortrijk, Belgium) 6300 liquid crystal display projector (1280 × 1024 pixels; 60 Hz) onto a screen 54 cm in front of the monkeys' eyes (28 cm for humans). Stimuli, the same as those used by Kourtzi and Kanwisher (2000), were projected at a size of 15 × 15° for monkeys and two humans (12 × 12° for the remaining humans). Two aspects of the stimuli were manipulated, scrambling (into 20 × 20 blocks) and type of stimuli, yielding four classes of stimuli: grayscale images, scrambled grayscale images, line drawings, and scrambled line drawings (Fig. 1). These manipulations were performed on images of novel objects designed by Kourtzi and Kanwisher (2000) and on images of objects that looked familiar to humans (Fig. 1). None of the human or monkey subjects had seen the novel stimuli before scanning. For each of these eight different conditions, 20 different stimuli were shown each for 600 msec. A fixation point, 0.3° in size, was provided to both humans and monkeys.
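The block scrambling described above can be illustrated with a minimal sketch. It assumes that "20 × 20 blocks" means a 20 × 20 grid of tiles, that the image side is divisible by the grid size, and that tiles are permuted uniformly at random. The function name `scramble_blocks` and the seed argument are hypothetical; the original stimulus-generation procedure of Kourtzi and Kanwisher (2000) may differ in detail.

```python
import random

def scramble_blocks(image, grid=20, seed=0):
    """Shuffle the tiles of an image laid out on a grid x grid lattice.

    `image` is a list of equal-length rows (lists of pixel values).
    Assumption: height and width are both divisible by `grid`.
    """
    height, width = len(image), len(image[0])
    if height % grid or width % grid:
        raise ValueError("image size must be divisible by grid")
    bh, bw = height // grid, width // grid

    # Cut the image into tiles, read in row-major order.
    tiles = []
    for by in range(grid):
        for bx in range(grid):
            tile = [row[bx * bw:(bx + 1) * bw]
                    for row in image[by * bh:(by + 1) * bh]]
            tiles.append(tile)

    # Permute the tiles; pixel content within each tile is untouched.
    rng = random.Random(seed)
    rng.shuffle(tiles)

    # Reassemble the shuffled tiles into a new image.
    out = [[None] * width for _ in range(height)]
    for index, tile in enumerate(tiles):
        by, bx = divmod(index, grid)
        for dy, tile_row in enumerate(tile):
            out[by * bh + dy][bx * bw:(bx + 1) * bw] = tile_row
    return out
```

Because only tile positions change, the scrambled image keeps the pixel values of the intact image (and hence its luminance histogram) while the global shape is destroyed.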
Tasks. In the main experiments, subjects remained passive with respect to the stimuli. In a number of control experiments, subjects performed a task with different aspects of the stimuli or the fixation target.
Four humans and two monkeys (M1 and M5) performed a high acuity task (Vanduffel et al., 2002) in which subjects had to detect the change in orientation of a very small bar from horizontal to vertical (Fig. 1E). Intact and scrambled stimuli were shown with the same timing as under passive viewing conditions. In total, 8960 functional volumes were acquired in humans performing this task, compared with 6200 in monkeys. Performance during scanning was similar in the two species. The average percentage correct was 86% (range, 83-89%) in humans and 79% (range, 70-88%) in monkeys.
Two humans performed a one-back task on the intact and scrambled stimuli. Stimuli were presented for 200 msec, followed by a 600 msec period in which only the fixation point was presented. Subjects had to respond with the right hand within 600 msec after onset of a stimulus identical to the preceding one. Consecutive stimuli were identical in 20% of the presentations. To compensate for the increased difficulty of the task with scrambled stimuli, we included a distorted grayscale condition in which the grayscale stimuli differed only slightly from each other (Fig. 1F). The distortion was obtained by moving the inner nodes of the scrambling grid (Fig. 1A-D) in random directions with a random amplitude ranging between zero and X degrees and deforming the image in the same way as the overlying grid. X typically equaled 0.4° and was used to adjust the difficulty of the task. At the beginning of the two training sessions, performance was much poorer for the scrambled stimuli (73% correct) than for intact stimuli (92% correct). The performance level for the distorted grayscale images was intermediate (84% correct). Reaction times were shorter for intact (450 msec) than for scrambled stimuli (497 msec), with those for the distorted grayscale stimuli lying in between (470 msec). During scanning, performance levels were similar, reaching 95, 92, and 94% for intact, scrambled, and distorted grayscale stimuli, respectively; 5152 functional volumes were acquired.
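The node displacement used for the distorted grayscale condition can be sketched as follows. This is an illustration only: it assumes a 20 × 20 cell grid spanning a 15° stimulus, uniformly distributed directions and amplitudes, and fixed border nodes. The function name `jitter_grid_nodes` and all parameter names are hypothetical, and the subsequent warping of the image to follow the displaced grid is not shown.

```python
import math
import random

def jitter_grid_nodes(n_cells=20, size_deg=15.0, max_shift_deg=0.4, seed=0):
    """Return (n_cells + 1) x (n_cells + 1) node positions in degrees.

    Border nodes stay fixed; each inner node is moved in a random
    direction by a random amplitude drawn from [0, max_shift_deg].
    """
    rng = random.Random(seed)
    step = size_deg / n_cells
    nodes = []
    for iy in range(n_cells + 1):
        row = []
        for ix in range(n_cells + 1):
            x, y = ix * step, iy * step
            inner = 0 < ix < n_cells and 0 < iy < n_cells
            if inner:
                # Assumed: direction and amplitude are uniformly distributed.
                angle = rng.uniform(0.0, 2.0 * math.pi)
                amplitude = rng.uniform(0.0, max_shift_deg)
                x += amplitude * math.cos(angle)
                y += amplitude * math.sin(angle)
            row.append((x, y))
        nodes.append(row)
    return nodes
```

In this sketch, raising `max_shift_deg` (X in the text) increases the difference between successive distorted images and so makes the one-back task easier.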
Three human subjects and one monkey (M3) performed two different dimming tasks. In the first task, the fixation point dimmed, whereas in the second, a small part (on average, 2 × 2.5°) of the stimulus dimmed. Intact and scrambled grayscale stimuli were presented as in the passive conditions, and dimming occurred at random times for 200 msec. Timings of the dimming epochs were identical to those of the orientation changes in the high acuity task. In both dimming detection tasks, the amplitude of dimming was adjusted to control performance levels of the subjects. Dimming of a stimulus part could occur in any of 24 positions, within an eccentricity range of 1 to 5° (Fig. 1G). Subjects responded within 600 msec after dimming with one hand. During scanning, the time series corresponding to the two dimming detection tasks and to passive viewing were interleaved. In monkey M3, 16,560 volumes were acquired, and in humans, 6480 volumes were acquired. In the two species, performance during scanning was similar for intact and scrambled stimuli. M3 reached 89% for both types of stimuli during the stimulus-dimming task and 86% during the fixation-dimming task. Human average performance was 89 and 84% for intact and scrambled stimuli during the stimulus-dimming task, and 88 and 86% for intact and scrambled stimuli during the fixation-dimming task.
Scanning. Each functional time series consisted of gradient-echo echoplanar whole-brain images [1.5 T; Siemens Sonata scanner (Siemens AG, Erlangen, Germany); repetition time (TR), 2.4 sec for macaque, 3.01 sec for human; echo time (TE), 27 msec for macaque, 50 msec for human; 64 × 64 matrix; 2 × 2 × 2 mm voxels for macaque, 3 × 3 × 4.5 mm for human; 32 sagittal slices]. Five stimulus conditions were tested: images of grayscale objects and their scrambled counterpart, drawings of objects and their scrambled counterpart, and a fixation baseline. In a typical block design (24 sec blocks), the presentation order of the five conditions was randomized, with different orders in different time series. Within a time series, order remained constant, and conditions were repeated three times. In alternate runs, images of familiar and novel objects were presented, and data of both types of runs were averaged in the main analysis. In humans, four functional time series obtained in a single session were averaged; in monkeys, between 11 and 21 time series obtained over one to two sessions were averaged. Under passive conditions, 13,376 functional volumes were acquired in humans compared with 11,800 volumes in the monkeys.
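The block design can be made concrete with a small sketch that generates the block onsets of one time series. It assumes that a series consists of exactly five conditions × three repetitions of 24 sec blocks with no additional padding, which is one reading of the description above. The condition labels, the function name `block_schedule`, and the seed argument are hypothetical.

```python
import random

# Hypothetical labels for the five conditions described in the text.
CONDITIONS = ["grayscale", "scrambled grayscale", "drawing",
              "scrambled drawing", "fixation"]

def block_schedule(block_sec=24.0, repeats=3, seed=0):
    """Return (onset_sec, condition) pairs for one time series."""
    rng = random.Random(seed)
    order = CONDITIONS[:]
    rng.shuffle(order)  # a different seed gives a different series order
    schedule = []
    for rep in range(repeats):  # the same order is used on every repetition
        for pos, cond in enumerate(order):
            onset = (rep * len(order) + pos) * block_sec
            schedule.append((onset, cond))
    return schedule
```

Under these assumptions one series lasts 15 × 24 = 360 sec, which at the macaque TR of 2.4 sec would correspond to 150 volumes (10 per block).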
In both humans and monkeys, motion localizer scans were run comparing a moving random textured pattern to a static one (7° diameter in humans, Sunaert et al., 1999; 14° in monkeys, Vanduffel et al., 2001). In four of the human subjects and in all four monkeys, early retinotopic regions were mapped (Vanduffel et al., 2002; Fize et al., 2003). The retinotopic stimuli included wedges along the vertical and horizontal meridian, a central stimulus (1.5° radius), and stimuli confined to the upper and lower peripheral visual field (1.5-14° radius). These stimuli were filled with flickering colored random checkerboards (two humans) and flickering colored or achromatic random checkerboards or moving dots and lines (all monkeys and two humans).
Two human subjects were scanned with high spatial resolution in a Philips (Eindhoven, The Netherlands) 3T scanner (TR, 3.3 sec; TE, 30 msec; 64 × 64 matrix; 2.2 × 2.2 × 2.5 mm resolution; 46 horizontal slices). Four time series including epochs with scrambled and intact images were acquired along with motion localizer scans. For each subject, an anatomical (three-dimensional magnetization prepared rapid acquisition gradient echo) volume (1 × 1 × 1 mm voxels) was acquired once (during one of the scan sessions for humans; in a separate session under anesthesia for the monkeys).
Volume-based analyses. Data were analyzed using Statistical Parametric Mapping (SPM99) and Match software. Only scans in which the monkeys kept their fixation within the window for >80% of the time were analyzed. In these analyses, realignment parameters, as well as eye movement traces, were included as covariates of no interest to remove eye movement and brain motion artifacts. In humans, eye movements were rare (0.9 eye movements per 24 sec block on average); therefore, all scans were analyzed, and no covariate of no interest was used. The monkey functional volumes were realigned and nonrigidly coregistered with their anatomical volumes using a customized volume-based registration algorithm, Match (Chefd'hotel et al., 2002). The algorithm computes a dense deformation field by composition of small displacements minimizing a local correlation criterion. Regularization of the deformation field is obtained by low-pass filtering. The monkey functional volumes were then subsampled to 1 mm³ and smoothed with an isotropic Gaussian kernel [full width at half height (FWHH), 1.5 mm]. The human functional volumes were realigned, rigidly matched to their anatomical volumes, normalized, and subsampled to 27 mm³ (for the group analysis, and to 8 mm³ for single subjects) and smoothed with an isotropic Gaussian kernel (FWHH, 8 mm for group analysis and 6 mm for single subjects).
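For readers less familiar with the FWHH convention, the following sketch converts a FWHH to the standard deviation of the Gaussian and builds a normalized one-dimensional kernel. It relies only on the standard relation FWHH = 2·sqrt(2·ln 2)·σ. The function names and the three-σ truncation radius are assumptions made for illustration and are not taken from SPM99.

```python
import math

def fwhh_to_sigma(fwhh_mm):
    """Standard deviation (mm) of a Gaussian with the given FWHH (mm)."""
    return fwhh_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def gaussian_kernel_1d(fwhh_mm, voxel_mm, radius_sigmas=3.0):
    """Normalized 1-D Gaussian kernel sampled on the voxel grid.

    Assumption: the kernel is truncated at `radius_sigmas` standard
    deviations; an isotropic 3-D Gaussian is separable, so the same
    kernel can be applied successively along x, y, and z.
    """
    sigma_vox = fwhh_to_sigma(fwhh_mm) / voxel_mm
    radius = max(1, int(math.ceil(radius_sigmas * sigma_vox)))
    weights = [math.exp(-0.5 * (i / sigma_vox) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```

With these definitions, the 8 mm kernel of the human group analysis corresponds to σ ≈ 3.4 mm, and the 1.5 mm kernel used for the monkeys to σ ≈ 0.64 mm.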
The human fMRI data were registered to the standard human Montreal Neurological Institute template (SPM99; Wellcome Department of Cognitive Neurology, London, UK). The fMRI data from the four monkeys were registered to one of the individuals (M3) using the Match software (Chefd'hotel et al., 2002).
The four experimental conditions followed a factorial design with scrambling and type of image (or image cue) as factors. Main effects and interaction were assessed. In humans and monkeys, the group data were analyzed with fixed effects using the BOLD and MION (Vanduffel et al., 2001) hemodynamic response functions, respectively. In addition, the effect of scrambling was assessed in humans with a random effect analysis to ensure the generality of the results. Significance in the random effect analyses applies to humans in general, whereas that of a fixed effect analysis applies only to the experimental subjects (Friston et al., 1999). The number of monkey subjects was too small to use a random effect analysis. In the main analysis, all data were averaged. In an additional analysis, time series obtained with familiar and novel stimuli were analyzed separately. In yet another analysis, the first and second half of the data of each subject were analyzed separately.
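The 2 × 2 factorial structure can be written out as contrast weights over the four stimulus conditions. The sketch below is illustrative: the condition order, the sign conventions (intact minus scrambled, grayscale minus drawing), and the example response values are assumptions, not values from the study.

```python
# Assumed condition order:
# intact grayscale, scrambled grayscale, intact drawing, scrambled drawing
CONTRASTS = {
    "scrambling": [1, -1, 1, -1],    # intact minus scrambled
    "type": [1, 1, -1, -1],          # grayscale minus drawing
    "interaction": [1, -1, -1, 1],   # scrambling effect: grayscale minus drawing
}

def contrast_value(condition_means, weights):
    """Weighted sum of the four condition means for one contrast."""
    if len(condition_means) != len(weights):
        raise ValueError("need one mean per contrast weight")
    return sum(m * w for m, w in zip(condition_means, weights))

# Hypothetical responses (percent signal change relative to fixation).
example_means = [1.0, 0.4, 0.8, 0.5]
effects = {name: contrast_value(example_means, w)
           for name, w in CONTRASTS.items()}
```

The three weight vectors are mutually orthogonal, so in a balanced design the two main effects and the interaction are estimated independently. A positive interaction in this convention means that scrambling reduces the response more for grayscale images than for drawings.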
For each stimulus comparison, significant MR signal changes were assessed using a map of T-scores (SPM) (Friston et al., 1995) and using p < 0.05 corrected for multiple comparisons as the threshold in the fixed effect analysis and p < 0.0001 uncorrected in the random effect analysis. Given the smaller number of monkey subjects and the larger number of volumes sampled in individual monkeys compared with humans, p < 0.05 corrected was required for individual monkeys, whereas for single human subjects, voxels reaching p < 0.001 uncorrected were considered significant.
Activity profiles plotting MR signal changes relative to fixation were obtained from the group analysis. The profiles of functional regions were generally calculated for a local maximum in the SPM corresponding to the main effect of scrambling in the group by averaging the most significant voxel and six of its neighbors in both hemispheres. To obtain profiles of anatomical regions such as V1 or V3v, unaffected by the scrambling, this group SPM was probed at the stereotaxic coordinates of these regions. To obtain the activity profiles of motion-sensitive regions, the SPM corresponding to the scrambling main effect of each subject was probed at the local maximum of the motion localizer, and these individual profiles were averaged. Finally, activity profiles were obtained along lines placed in the lateral bank of the intraparietal sulcus (IPS) of individual monkeys and humans; MR activity was sampled every 1.4 mm, again averaging over the voxel on the line and its six neighbors. Although the original MR signals were sampled with a linear resolution of 2 mm, the subsequent subsampling and averaging of activity over six neighboring voxels justified this finer sampling.
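The seven-voxel averaging used for the activity profiles can be sketched as below. The sketch assumes that the "six neighbors" are the six face-adjacent voxels and that signal change is expressed as a percentage of the fixation signal; both are plausible readings rather than details stated in the text, and the function names are hypothetical.

```python
def seven_voxel_mean(volume, x, y, z):
    """Average a voxel with its six face-adjacent neighbours.

    `volume` is a nested list indexed as volume[x][y][z].
    """
    offsets = [(0, 0, 0), (1, 0, 0), (-1, 0, 0),
               (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    dims = (len(volume), len(volume[0]), len(volume[0][0]))
    values = []
    for dx, dy, dz in offsets:
        i, j, k = x + dx, y + dy, z + dz
        # Reject border voxels rather than wrapping around silently.
        if not (0 <= i < dims[0] and 0 <= j < dims[1] and 0 <= k < dims[2]):
            raise IndexError("neighbourhood extends outside the volume")
        values.append(volume[i][j][k])
    return sum(values) / len(values)

def percent_signal_change(condition, fixation):
    """Signal change relative to the fixation baseline, in percent."""
    return 100.0 * (condition - fixation) / fixation
```

A profile for one site would then consist of `percent_signal_change` evaluated for each of the four stimulus conditions, using the seven-voxel means of the condition and fixation signals.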
Surface-based analyses. In one set of analyses, the data were mapped to surface reconstructions generated using FreeSurfer, for which visual topography was mapped in the same monkeys (Fize et al., 2003). The FreeSurfer reconstructions run close to the boundary of gray and white matter. In another set of analyses, the data were mapped to surface reconstructions of the M3 right and left hemispheres generated using SureFit (Van Essen et al., 2001), which yields surfaces that run close to the midcortical thickness (layer 4) throughout each hemisphere. Because the fMRI data were acquired at a resolution equal to or exceeding the cortical thickness, strong activations on one bank of a sulcus can result in apparent activations on the opposite bank of a sulcus or in an adjacent sulcus separated by a thin white matter gap. These relationships were determined using a customized blur-compensation algorithm, in which the fMRI activation signal for voxels lying within cortical gray matter was compared with that across a sulcal or gyral gap along an axis orthogonal to the cortical surface. fMRI activations that are likely to reflect such an artifactual spread were encoded as separate volume representations. The fMRI activations (with and without blur-compensation) were mapped onto the SureFit-generated macaque cortical surfaces using a volume-to-surface mapping tool in Caret. These surfaces and the associated fMRI data were registered to the macaque F99UA1 atlas (Van Essen, 2002, 2004; Van Essen et al., 2004) using surface-based registration of spherical maps as constrained by sulcal landmarks on the individual and atlas hemispheres. The human fMRI data were mapped to the human Colin atlas (Van Essen, 2002, 2004; Van Essen et al., 2004) surface in SPM-Talairach space.
The monkey and human atlas surfaces were registered to one another using surface-based registration and a set of landmarks for cortical areas that are highly likely to be homologous across species (Astafiev et al., 2003; Van Essen et al., 2004). Data sets for on-line surface visualization (WebCaret) or downloading and off-line visualization (Caret) are accessible in SumsDB (Van Essen et al., 2004) by hyperlinks in Figures 4 and 11 for the illustrated right hemispheres and for associated left hemisphere data sets via http://pulvinar.wustl.edu:8081/sums/archivelist.do?archive_id=665485 (human atlas), http://pulvinar.wustl.edu:8081/sums/archivelist.do?archive_id=665805 and http://pulvinar.wustl.edu:8081/sums/archivelist.do?archive_id=665809 (macaque atlas), and http://pulvinar.wustl.edu:8081/sums/archivelist.do?archive_id=665813 (macaque case M3).
Shape-processing regions in humans
Figure 2 shows the shape-related activation pattern on flat maps of posterior (occipito-temporo-parietal) cortex in humans. Figure 2A shows group-averaged results for 17 subjects, mapped onto the left and right hemispheres of subject 1. Figure 2, B and C, shows results for the right hemisphere of individual subject 2, without (B) and with (C) an attention-demanding high acuity task. The comparison of viewing intact versus scrambled images of objects, whether grayscale or drawing, yielded significant activation (p < 0.0001 uncorrected for multiple comparisons) of ventral cortical regions. The main activation, both in the group analysis (Fig. 2A) and in single subjects (Fig. 2B,C), was located ventral relative to hMT/V5+ (i.e., below in the flat maps). In agreement with previous studies (Grill-Spector et al., 1998a, 1999), this activation occupied a large portion of the posterior inferior temporal gyrus (post-ITG) and extended dorsally into the cortex around the lateral occipital sulcus (LOS) and ventrally into the middle fusiform gyrus (mid-FG) (Fig. 2A,B). The post-ITG and mid-FG sites correspond to LO and LOa or posterior fusiform gyrus (pF), the two traditional parts of the LOC (Malach et al., 2002), but the present terminology better reflects their geographic localization. The results from single subjects in which early retinotopic regions were mapped (Fig. 2B) indicate that these regions lie lateral to V3v and V4v (Grill-Spector et al., 1999) (see below).
Although there is general agreement about the post-ITG and mid-FG components of the LOC, there has been more debate about the more posterior part, here referred to as LOS. It has been considered part of LOC (Malach et al., 1995; Grill-Spector et al., 1999) but also a separate entity identified as V3B by Smith et al. (1998), LOC/lateral occipital peripheral (LOP) by Tootell and Hadjikhani (2001), and V4-topo (-topologue of macaque V4) by Tsao et al. (2003). Furthermore, this part overlaps with the kinetic occipital region (Van Oostende et al., 1997) and is sensitive to motion (Sunaert et al., 1999; Rees et al., 2000) and three-dimensional structure from motion (Orban et al., 1999; Vanduffel et al., 2002). The population data also revealed considerable overlap between shape and motion sensitivity (Fig. 2A, white outlines) in this region. Single subject analysis (Fig. 3) showed that there is a dorsoventral organization in LOS, in which the dorsal and ventral parts are shape sensitive (red), whereas the middle part of LOS is sensitive to motion (green) or both motion and shape sensitive (yellow). The activity profiles of the three parts confirm the overlap between motion and shape sensitivity in the middle part (Fig. 3B). They also show that shape sensitivity is strongest ventrally in LOS (compare Figs. 2B and 3A, same subject). Sampling the motion and shape-sensitive regions at higher resolution and higher magnetic field strength confirmed the observations made at 1.5 T (Fig. 3, compare E and F). This dorsoventral organization is similar to that reported by Murray et al. (2003), except that the latter study described the middle part as exclusively motion sensitive.
Dorsal to the LOS activation, another shape-sensitive site was located close to the transverse occipital sulcus (TOS) overlapping with the motion-sensitive human V3A (hV3A) activation (Fig. 2A). This site was confirmed as hV3A in four single subjects by a combination of retinotopic and functional criteria (Tootell et al., 1997) as follows: the presence of a short horizontal meridian representation in front of V3d (7/8 hemispheres, S2R being the only exception), the presence of an extension of central vision toward the TOS (3/4 hemispheres tested), and an activation by the 7° motion localizer (Fig. 2B, white lines) (4/8 hemispheres). The overlap between shape and motion sensitivities could be evaluated in 13 of the 17 subjects, because both sensitivities reached p < 0.001 uncorrected in this region. In 9 of 13 subjects, there was extensive overlap (proportion of voxels reaching p < 0.001 uncorrected for both sensitivities), the overlap being only partial in the four remaining subjects, including S2 in Figure 2B. Raising the threshold to p < 0.05 corrected as in Figure 2B decreased the overlap, with the motion activation lying ventral to the shape activation. This spatial arrangement may reflect the known retinotopy of hV3A (Tootell et al., 1997), given the difference in size between the motion (7° diameter) and shape (12° square) stimuli. The overlap between motion and shape sensitivity in hV3A was confirmed by comparing the activity profile obtained from the local maximum in the shape subtraction and that obtained in the local maximum of the motion-sensitive region in individual subjects (see below).
Beyond hV3A, four regions along the IPS showed differential activation for intact versus scrambled shapes. Figure 2A shows that these activation sites either overlapped with motion-responsive regions [ventral intraparietal sulcus area (VIPS)] or were located just lateral to them [parieto-occipital intraparietal sulcus area (POIPS), medial dorsal intraparietal sulcus area (DIPSM), anterior dorsal intraparietal sulcus area (DIPSA)] (Orban et al., 1999; Sunaert et al., 1999; Vanduffel et al., 2002). In most single subjects, motion and shape activation sites overlapped, to a large extent, in VIPS (supplemental Fig. 1A; available at www.jneurosci.org). The pattern shown in supplemental Figure 1A was observed in 11 of the 14 subjects in which the VIPS region reached significance (p < 0.001 uncorrected) in both the motion and shape subtractions; in the three remaining subjects, the overlap was only partial. Again, mapping the motion- and shape-sensitive regions at higher resolution yielded results similar to those obtained at 1.5 T (compare the overlap for subject 13 in supplemental Fig. 1A; available at www.jneurosci.org). In the three more dorsal regions (POIPS, DIPSM, DIPSA), the shape-sensitive voxels were generally located lateral to the motion-sensitive ones, as illustrated in supplemental Figure 1B (available at www.jneurosci.org) for DIPSM. In the POIPS region, shape-sensitive voxels were located lateral to motion-sensitive voxels in five of nine subjects with significant motion and shape activation in this region, whereas they overlapped in three of nine subjects. For the DIPSM region, these fractions were 9 of 13 and 3 of 13; for DIPSA, these fractions were 6 of 13 and 6 of 13.
The overlap between shape- and motion-sensitive sites was supported by the similarity of the activity profile of the local maximum in the shape subtraction (group analysis) with that obtained in the local maximum of the motion defined regions of individual subjects and averaged over the subjects (see below). Hence, we describe these shape-sensitive regions as VIPS, POIPSs, DIPSMs, and DIPSAs to minimize proliferation of labels.
Overall, the following eight functional regions were found to be sensitive to shape in human cortex: two relatively posterior occipital regions (hV3A and LOS), two occipitotemporal regions (post-ITG and mid-FG), corresponding to the main components of LOC, and four occipitoparietal regions (VIPS, POIPSs, DIPSMs, and DIPSAs). These regions were observed in the majority of subjects (Table 1). Figure 4A shows the shape-sensitive activation pattern mapped onto the whole flattened human hemisphere (human Colin atlas) (Van Essen, 2002). Surface nodes that lie within voxels reaching significance (p < 0.0001 uncorrected) are shaded red and yellow according to significance level. Nodes within 3 mm of significantly active voxels are indicated in green, thereby providing an explicit indication of the spatial uncertainty of localizing shape-sensitive regions.
Shape-processing regions in monkeys
Figure 4B shows the average (n = 4) shape-related activation pattern on a flat map of the entire macaque M3 right hemisphere. The color coding is similar to that in A, with red and yellow indicating regions of significant activation and green indicating an additional 0.5 mm spatial uncertainty. In addition, blue patches indicate foci directly opposite a stronger activation and, hence, are likely to reflect artifactual spread (see Materials and Methods). There is strong, extensive activation of the lateral bank of the intraparietal sulcus; the modest apparent activation of the medial bank (compare slices below) is likely to result entirely from blurring of the fMRI signal and the limited resolution of the fMRI measurements. The shape-related activations occupy a large portion of temporal cortex in the macaque, including much of the lower bank of the superior temporal sulcus (STS), and the inferior temporal gyrus, whereas the human activations are restricted to posterior temporal cortex. There are also activation sites in macaque prefrontal cortex that are discussed elsewhere (K. Denys, W. Vanduffel, D. Fize, K. Nelissen, H. Sawamura, R. Vogels, D. Van Essen, G. A. Orban, unpublished observations). The relationship of the shape-related activation to topographically organized visual areas is shown in Figure 5 and supplemental Figure 2 (available at www.jneurosci.org). Both the average maps obtained in the group analysis (Fig. 5A) and the single-subject analyses (Fig. 5B; supplemental Fig. 2, available at www.jneurosci.org) reveal a consistent significant (p < 0.05 corrected) main effect of scrambling in the IT cortex, as well as in V3 dorsal, V4, and intraparietal sulcus.
The activation of IT was nonuniform, and the five subparts yielded by the group analysis (Fig. 5) did not correspond closely to any of the published architectonic subdivisions of IT as charted on the macaque atlas and registered to the M3 individual surface (data not shown). Hence, we labeled these five subregions by geographic designations for the local maxima, which are not meant to be construed as distinct cortical areas. The IT activation was most significant (in level and extent) in the posterior parts, in a site restricted to TEO (Boussaoud et al., 1990; Fize et al., 2003) and, even more, in a site located in the lower bank near the middle of the STS (mSTS) (Figs. 5A, 6). This latter site includes part of fundus superior temporal area (FST), located next to the vertical meridian representation that separates it from MT/V5 (Fize et al., 2003) but also a region in the lower bank of STS contiguous to FST (Nelissen et al., 2003) and the most dorsal portion of TEO. In most hemispheres, there were also more anterior activation sites (Fig. 6; supplemental Fig. 2, available at www.jneurosci.org), and according to the group results (Fig. 5A), we describe an anterior and posterior dorsal TE site (TEda and TEdp, respectively), as well as a posterior ventral TE (TEvp) site.
The shape-sensitive regions included two retinotopic areas, V3 and V4, which were retinotopically mapped in each monkey (Fize et al., 2003). The V4 activation involved central V4 (Fig. 5A) and, to a lesser degree, a more dorsal, peripheral part extending forward from the V3d activation. The V3 activation was present in most animals, as shown for the left hemisphere of the four monkeys in supplemental Figure 2 (available at www.jneurosci.org), and was located in the anterior bank of the lunate sulcus near the fundus (Fig. 6A). Finally, shape sensitivity was observed consistently across animals in an anterior and posterior region of the intraparietal sulcus (Fig. 5A,B; supplemental Fig. 2, available at www.jneurosci.org). The anterior region is located dorsally in the lateral bank of the IPS (Fig. 6), in a position that corresponds well with the presumed position of the dorsal part of LIP but might extend into anterior intraparietal sulcus area (AIP) (Lewis and Van Essen, 2000) (Fig. 5A). The posterior activation site is located in the fundus of the IPS near the junction with parietooccipital and lunate sulci (Fig. 5A); hence, we refer to this region as pIPS, which lies at the juncture of LOP, VIP, and PIP in the Lewis and Van Essen (2000) scheme.
Overall, the shape-related activation is more balanced between ventral and dorsal cortex in monkeys than in humans. Shape-sensitive regions in the monkey include two early regions (V3d and V4), five IT subregions (mSTS, TEO, TEdp, TEvp, and TEda), and two parietal regions (LIP and pIPS). In the monkey, the most significant shape-sensitive voxels were located in IPS and not in IT, whereas the reverse was true for humans (Fig. 4, compare A and B). This species difference in intraparietal activation was not just reflected in the T-scores. On average, the MR signal in macaque LIP was elevated by ∼1% for intact compared with scrambled images, more than half of the strongest effects in IT (supplemental Table 2; available at www.jneurosci.org). For VIPS, the most shape-sensitive IPS region in humans, the MR signal was only ∼0.25% elevated for intact compared with scrambled images, a third of the effect in post-ITG (LO) (supplemental Table 1; available at www.jneurosci.org).
It could be argued that this species difference in shape-responsive regions reflects the difference in familiarity of the two species with the stimuli. Although, by design, half of the stimuli were novel for humans, it could be argued that most stimuli were novel for the monkey (but see Fig. 1). Separate analysis of the activation pattern obtained with these two types of stimuli (Tables 1, 2) revealed a pattern extremely similar in both species. The main effects of scrambling and stimulus type as well as their interaction are similar for familiar and novel stimuli, although the scrambling effect is slightly stronger for familiar than novel stimuli, whereas the opposite is true for the interaction. In particular, these trends are the same in the occipitotemporal and occipitoparietal regions.
Functional properties of early visual regions
Activity profiles plot the MR signal for the four stimulus conditions with respect to the fixation baseline. They indicate the magnitude of the scrambling effect, the significance of which was reported in the SPMs of Figures 2, 4, and 5 and supplemental Figure 2 (available at www.jneurosci.org). Furthermore, they indicate these magnitudes separately for grayscale images and line drawings. Finally, they indicate the strength of the MR signals in the four experimental conditions compared with fixation. In the standard analysis, the profiles were obtained from the full data set, which was also used to assess significance. However, results were very similar when half of the data were used to localize the significant activation sites and the other half to calculate the activity profiles of these sites (compare all data and half of the data in supplemental Tables 1 and 2; available at www.jneurosci.org). Therefore, activity profiles remained essentially the same when they were assessed in sites determined from an independent data set.
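The independent-data check described above can be illustrated with a minimal sketch on synthetic numbers. It is not the SPM analysis used in the study: the even/odd split of runs, the data layout, the function names, and the simulated signal values are all assumptions made for illustration. One half of the runs selects the voxel with the largest intact-minus-scrambled difference, and the other half supplies the activity profile as percent signal change relative to fixation.

```python
import random
from statistics import mean

CONDITIONS = ("gray_intact", "gray_scrambled", "draw_intact", "draw_scrambled")

def percent_signal_change(condition_mean, fixation_mean):
    """Percent MR signal change of a condition relative to the fixation baseline."""
    return 100.0 * (condition_mean - fixation_mean) / fixation_mean

def condition_means(runs, voxel):
    """Average each condition (and fixation) over a list of runs for one voxel."""
    keys = CONDITIONS + ("fixation",)
    return {k: mean(run[k][voxel] for run in runs) for k in keys}

def scrambling_difference(runs, voxel):
    """Intact minus scrambled signal, averaged over both stimulus types."""
    m = condition_means(runs, voxel)
    intact = (m["gray_intact"] + m["draw_intact"]) / 2
    scrambled = (m["gray_scrambled"] + m["draw_scrambled"]) / 2
    return intact - scrambled

def split_half_profile(runs, n_voxels):
    """Localize the peak voxel in even runs, then profile it in odd runs."""
    localizer, held_out = runs[0::2], runs[1::2]
    peak = max(range(n_voxels), key=lambda v: scrambling_difference(localizer, v))
    m = condition_means(held_out, peak)
    profile = {c: percent_signal_change(m[c], m["fixation"]) for c in CONDITIONS}
    return peak, profile

# Synthetic data (hypothetical): 8 runs, 20 voxels, shape-sensitive voxel at index 7.
rng = random.Random(0)
N_RUNS, N_VOXELS, TRUE_PEAK = 8, 20, 7
runs = []
for _ in range(N_RUNS):
    run = {"fixation": [1000.0 + rng.gauss(0, 0.5) for _ in range(N_VOXELS)]}
    for cond in CONDITIONS:
        values = [1005.0 + rng.gauss(0, 0.5) for _ in range(N_VOXELS)]
        if cond.endswith("intact"):
            values[TRUE_PEAK] += 10.0  # simulated extra response to intact images
        run[cond] = values
    runs.append(run)

peak, profile = split_half_profile(runs, N_VOXELS)
print(peak, {c: round(p, 2) for c, p in profile.items()})
```

Because the profile is computed only from runs that played no part in selecting the voxel, it cannot be inflated by the selection step, which is the point of the half-data comparison.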
The activity profiles of V3d and V4 in the monkey (Fig. 7) show that, superimposed on relatively strong responses to all four types of stimuli, images of intact objects drive these two regions more than scrambled images. These profiles also show that the type of stimulus had an effect in the monkey; gray-level stimuli, whether intact or scrambled, activate V3d and V4 more than line drawings. These effects were significant (Table 2), more so in V4 than in V3d. Finally, the profiles show that in these monkey regions, the two factors, scrambling and type of stimulus, interact; the effect of scrambling is significantly larger for gray-level figures than line drawings (Table 2). The effect of scrambling in V3 was restricted to its dorsal subdivision, both in the group and single subjects. To ensure that this was not a simple threshold effect, we used the retinotopic maps (Fize et al., 2003) of the group to probe V3v at the same eccentricity (1.5°) as the local maximum in V3d in the group data. The resulting profile is shown in Figure 7B. Although there is a strong effect of type of stimulus (grayscale vs drawings), there is none of scrambling.
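The two main effects and the interaction discussed above can be read directly off a four-condition activity profile. The sketch below shows one conventional way to compute them for a 2 x 2 design; the function name and the percent-signal-change values are hypothetical and are not taken from Table 2 or the supplemental tables.

```python
def factorial_effects(profile):
    """Main effects and interaction for a 2 x 2 (scrambling x stimulus type) profile.

    profile maps each condition to its percent signal change versus fixation.
    """
    gi, gs = profile["gray_intact"], profile["gray_scrambled"]
    di, ds = profile["draw_intact"], profile["draw_scrambled"]
    return {
        # intact minus scrambled, averaged over stimulus type
        "scrambling": (gi + di) / 2 - (gs + ds) / 2,
        # grayscale minus drawings, averaged over intact and scrambled
        "stimulus_type": (gi + gs) / 2 - (di + ds) / 2,
        # scrambling effect for grayscale minus scrambling effect for drawings
        "interaction": (gi - gs) - (di - ds),
    }

# Hypothetical profile with a larger scrambling effect for grayscale images.
example = {
    "gray_intact": 2.0,
    "gray_scrambled": 1.0,
    "draw_intact": 1.25,
    "draw_scrambled": 0.75,
}
effects = factorial_effects(example)
print(effects)
```

A positive interaction term in this convention means that scrambling reduces the response more for grayscale images than for line drawings, which is the direction reported for monkey V3d and V4.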
The activity profile of human V3A, although arising from an area different from neighboring V3, is relatively similar to the profile of monkey V3d. Both the effect of scrambling and the interaction between scrambling and type of stimulus are significant (Fig. 7D, Table 1). It is one of the few human regions in which the type of stimulus had an effect, reaching significance in the left but not the right hemisphere (Table 1). However, most human areas reacted like the LOS region and displayed no effect of stimulus type, in agreement with Kourtzi and Kanwisher (2000). The hV3A profile shown in Figure 7D was obtained in the local maximum of the main effect of scrambling in the group analysis. Human V3A is motion responsive (Tootell et al., 1997; Sunaert et al., 1999). To confirm that the shape-processing region, revealed by the scrambling effect, corresponds to hV3A, defined by its motion sensitivity, we probed for the effect of scrambling in the voxel corresponding to the local maximum of hV3A as defined by the motion localizer in each subject. This yielded an average profile (Fig. 7E) very similar to that obtained directly in the group analysis of scrambling effect (Fig. 7D). In fact, the average MR signal change is larger in the profile obtained from the single subject motion maxima than that obtained in the local maximum of the scrambling effect in the group, presumably because the effect of individual variability in anatomical localization has been removed. Hence, we conclude that the motion-sensitive region hV3A is also shape sensitive, in agreement with Grill-Spector et al. (1998b).
In these early visual regions, the type of stimulus had much more effect on the MR activity of monkeys than humans. Interestingly, the profiles of V1 were rather similar in both species (supplemental Tables 1, 2; available at www.jneurosci.org). The MR activity was sampled in a small region of interest (14 voxels) in the two hemispheres at the same eccentricity (1.5°) in the two species. MR activity evoked by scrambled images was significantly larger than that evoked by intact images, perhaps reflecting the presence of end-stopped neurons in V1 (Hubel and Wiesel, 1965; Kato et al., 1978). Furthermore, in both species, grayscale stimuli drove V1 more effectively than line drawings. Thus, the invariance for stimulus type observed in extrastriate human visual regions reflects intrinsic properties of these regions.
The activity profiles of two of the five monkey inferotemporal activation sites are shown in Figure 8, A and B. To illustrate the similarity of activation patterns for familiar and novel stimuli, the activity profiles in Figure 8 are shown for familiar (F) and novel (N) stimuli separately, in addition to the average (A) shown in other figures and tables. The significance of these effects is listed in Tables 1 and 2. The activity profiles of the other regions are listed in supplemental Tables 1 and 2 (available at www.jneurosci.org). These regions display three significant effects: main effects of scrambling, type of stimulus, and the interaction between scrambling and type of stimulus. Notice that although the more posterior regions (mSTS and TEO; data not shown) respond well to scrambled stimuli, responses were lower in TEda (and TEdp; data not shown).
The profiles of the two main components of human LOC, post-ITG (LO) and mid-FG (pF), are shown in Figure 8, C and D. The profile of post-ITG displays an effect of scrambling and a small, but significant, effect of the interaction. Except for this reduced interaction, the profile is rather similar to that of LOS (Fig. 7F). In mid-FG, there was no interaction between scrambling and stimulus type, but the two main effects, scrambling and stimulus type, were significant (Table 1).
The MT/V5 complex
The human group analyses suggested considerable overlap between shape sensitivity and motion sensitivity, as reported previously by Kourtzi et al. (2002). In humans, however, it is difficult to know from which component of the hMT/V5+ complex these shape responses arise, particularly because the contribution of the homolog of FST to the hMT/V5+ complex has only been reported recently (Vanduffel et al., 2001, 2002). In the monkey, three components of the MT/V5 complex are currently identifiable using fMRI: MT/V5, MSTv, and FST (Vanduffel et al., 2001). The activity profiles of these three regions are shown in supplemental Table 2 (available at www.jneurosci.org). Scrambling has an effect only in FST, not in MT/V5 or MSTv. FST displays a very strong interaction between scrambling and type of stimuli, such that the effect of scrambling is restricted to grayscale images. Our previous study in which monkey and human activation patterns were compared for identical visual inputs (three-dimensional rotation vs two-dimensional translation) (Vanduffel et al., 2002) suggests that the ventral part of the hMT/V5+ complex might correspond to the homolog of FST. In agreement with this conjecture, the effect of scrambling in hMT/V5+ is stronger in ventral than in dorsal parts of the complex, as defined by the motion localizer in each subject (supplemental Table 1; available at www.jneurosci.org). However, as in occipitotemporal regions, the scrambling effect in ventral parts of the human complex is similar for grayscale stimuli and drawings, whereas in monkey FST, the scrambling effect is much stronger for grayscale stimuli.
The profiles of the two monkey IPS regions (supplemental Table 2; available at www.jneurosci.org) were quite different in the group analysis. The profile for the local maximum of the scrambling effect localized in LIP displays significant effects for stimulus type and scrambling (Table 2). The pIPS site of the monkey is primarily sensitive to scrambling, less to stimulus type (Table 2). Probing VIP, as identified in each monkey by its motion response (Vanduffel et al., 2001), yielded a similar profile to that of LIP but with smaller signal changes (supplemental Table 2; available at www.jneurosci.org). This is consistent with the view that LIP is the predominant intraparietal shape-sensitive region, but that perhaps because of partial volume effects, shape sensitivity extends somewhat into VIP. Detailed mapping of the motion sensitivity and shape sensitivity along the lateral bank of IPS at the level of local maxima in the motion and shape activation confirmed this separation. Figure 9, A-C, shows plots of motion and shape sensitivity along the lateral bank of IPS in the right hemisphere of monkey M5. Points were sampled from the fundus to the lip of IPS and slightly beyond (Fig. 9D) in three coronal sections, which included both the shape (y = 2) and motion (y = 0) local maxima. Clearly, motion sensitivity is maximal in the ventral part of the bank, corresponding to VIP, in the section at y = 0. In contrast, shape sensitivity consistently (Fig. 9A-C, arrows) reached its maximum in the more dorsal part of the bank, corresponding to LIPd, a subdivision of LIP that is likely to be a distinct cortical area (Lewis and Van Essen, 2000). Notice that only the points with the largest scrambling effect reached significance and are shown in the flat maps of Figure 5, supplemental Figure 2 (available at www.jneurosci.org), or sections of Figure 6.
Although Figure 9 only shows the posterior part of LIP activation, it demonstrates not only that motion and shape sensitivity were largely distinct but also that shape-related activation in LIP was much more extensive than motion-related activation in VIP. Similar results were obtained in the other three hemispheres of M5 and M1.
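The logic of sampling sensitivity along the sulcal bank can be sketched as follows. The sample spacing, the sensitivity values, and the function name are hypothetical and do not reproduce the data of Figure 9; the sketch only shows how the positions of the motion and shape maxima along one section could be compared.

```python
def peak_position(positions_mm, sensitivity):
    """Return the sampled position at which sensitivity is largest."""
    best = max(range(len(sensitivity)), key=lambda i: sensitivity[i])
    return positions_mm[best]

# Hypothetical samples from the fundus (0 mm) toward the lip of the sulcus.
positions_mm = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
motion = [0.2, 0.9, 0.5, 0.2, 0.1, 0.0]  # differential signal, moving minus static
shape = [0.1, 0.2, 0.4, 0.8, 1.0, 0.3]   # differential signal, intact minus scrambled

motion_peak = peak_position(positions_mm, motion)
shape_peak = peak_position(positions_mm, shape)
separation = shape_peak - motion_peak
print(motion_peak, shape_peak, separation)
```

In this convention, a positive separation means that the shape maximum lies closer to the lip of the sulcus (more dorsal) than the motion maximum.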
The profiles of the four regions along the human IPS were relatively similar, although the signals in the most anterior ones are weaker than in the more posterior ones (supplemental Table 1; available at www.jneurosci.org). They show an effect of scrambling and little interaction between scrambling and stimulus type. In all of these regions, activity profiles revealed scrambling effects that were similar (DIPSMs, DIPSAs) or slightly stronger (VIPS) when they were obtained from the individual local maxima of the motion responses (supplemental Table 1; available at www.jneurosci.org). This is consistent with the view that shape-sensitive regions along the IPS overlap considerably with motion-sensitive regions.
Mapping the motion and shape sensitivity along the lateral wall of the occipital part of the IPS in the two human subjects scanned at higher resolution confirmed that, for VIPS, the overlap between shape and motion sensitivity is complete. This is illustrated for subject S2 in Figure 9, F and G. The two anteroposterior levels shown bracket the local maximum of VIPS located in the left hemisphere of this subject at (-26, -86, 40).
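One simple way to put a number on the degree of overlap between shape- and motion-sensitive voxels is the Dice coefficient. This measure was not used in the present study; it is introduced here only as an illustration, and the voxel index sets below are hypothetical.

```python
def dice_overlap(shape_voxels, motion_voxels):
    """Dice coefficient between two sets of suprathreshold voxel indices.

    Returns 0.0 for no shared voxels and 1.0 for identical sets.
    """
    a, b = set(shape_voxels), set(motion_voxels)
    if not a and not b:
        return 0.0  # convention chosen here for two empty maps
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical voxel sets: little overlap versus substantial overlap.
low_overlap = dice_overlap({1, 2, 3, 4}, {4, 5, 6, 7})
high_overlap = dice_overlap({1, 2, 3, 4}, {2, 3, 4, 5})
print(low_overlap, high_overlap)
```

Any such index depends on the statistical threshold used to define each map, so values are only comparable when both maps are thresholded in the same way.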
Effects of tasks
The effect of scrambling was measured in several monkey (n = 2) and human (n = 4) subjects while subjects were performing a demanding high acuity task in central vision (Vanduffel et al., 2002). The purpose was to reduce possible effects of attentional differences within species and between species, because it could be argued that each subject pays more attention to the intact than the scrambled stimuli, and that monkeys pay less attention to the background stimuli than the fixation point because of the reward for steady fixation. In both monkeys and humans, the high acuity task had little effect on shape sensitivity (Figs. 2C, 5C), and the shape-sensitive regions showed a significant effect of scrambling under the task conditions (supplemental Tables 3 and 4; available at www.jneurosci.org). In the monkey, TEO was most affected by the task, whereas TEda (Fig. 10), TEdp, and early regions were only mildly influenced. Note that the effect of scrambling remained equally significant in LIP (Fig. 10), of which the overall MR activity compared with the fixation baseline increased with the task. In humans, the significance of scrambling in the three most dorsal IPS regions (POIPSs, DIPSMs, and DIPSAs) and, to a lesser degree, in LOS and post-ITG was reduced in the task condition (supplemental Table 4; available at www.jneurosci.org). In contrast, the task increased overall MR activity, relative to the fixation baseline, slightly in the four parietal regions (VIPS, POIPSs, DIPSMs, and DIPSAs) (supplemental Fig. 3; available at www.jneurosci.org).
The high acuity task entails a negative manipulation of spatial attention, drawing attention away from the intact or scrambled images of objects. The standard procedure to ensure equal attention of the subjects in the different conditions has been to use a one-back task (Kourtzi and Kanwisher, 2000) as in our previous study comparing human and monkey MR activation patterns (Vanduffel et al., 2002). Unlike in this previous study, the introduction of the one-back task altered the activation profiles dramatically in human V3A and in four IPS regions (supplemental Fig. 3; available at www.jneurosci.org). The two parts of LOC and LOS (data not shown) still showed a main effect of scrambling during the one-back task, although reduced. Interestingly, the similarity in MR activity in the distorted grayscale and standard grayscale conditions suggests that the MR activity during scrambled one-back does not simply reflect increased general attention or effort. It suggests that during one-back, the scrambled stimuli are no longer stimuli without shapes but are random spatial patterns that alter the effect of scrambling either by their nature or by their interaction with the task. The performance levels of naive subjects as well as the reaction times (see Materials and Methods) suggest that the one-back tasks with intact and scrambled stimuli are computationally different.
The stimulus-dimming task is used here as an alternative to draw attention to the stimulus. It has the advantage of applying equally well to intact and scrambled stimuli. It also allows a positive and negative manipulation of spatial attention in interleaved runs. In fact, during scanning subjects alternated between neutral, positive, and negative states of spatial attention because time series with passive viewing, stimulus-dimming and fixation-dimming tasks were interleaved. The results in the monkey (Fig. 10; supplemental Table 3, available at www.jneurosci.org) and in humans (supplemental Fig. 3, supplemental Table 4; available at www.jneurosci.org) indicate that the scrambling effect is to a large extent unaffected by the state of spatial attention, with a possible restriction for the three most dorsal IPS regions, as noted in the high acuity task. Perhaps there is a weak tendency, especially in humans, for ventral regions to exhibit a larger scrambling effect during attention to the stimulus than during attention away, whereas the opposite is true in hV3A and the intraparietal regions (supplemental Fig. 3; available at www.jneurosci.org).
Comparison of monkey and human shape-sensitive regions
Figure 11, A and B, shows flat maps of monkey and human cortex with a standard set of landmarks representing regions that are likely to be homologous across species (Astafiev et al., 2003; Van Essen et al., 2004). These landmarks include the boundaries of V1 and V2, MT, boundary between areas 3 and 4, central sulcus, frontal eye field (FEF), gustatory and olfactory cortex, orbital sulcus, hippocampal complex (HC), and boundary of cortex with the corpus callosum. After projection from flat maps (which have artificial cuts) to spherical maps (which do not have cuts), the macaque landmarks were deformed to match the human landmarks using a multicycle spherical registration algorithm (Van Essen et al., 2004). When this “standard warping” is applied to the monkey shape-sensitive activation sites (Fig. 11C), the deformed macaque activation pattern on the human map (Fig. 11E) lies substantially farther anterior in temporal cortex (E, arrows 2, 3) and farther lateral and anterior in parietal cortex (E, arrow 1) compared with the actual human activation pattern (Fig. 11D). Cortical areas that overlap with the shape-related activations are indicated on the macaque map (Fig. 11C) (scheme in Lewis and Van Essen, 2000) and the human map (Fig. 11D-F) (scheme in Hadjikhani et al., 1998; others as indicated in Van Essen, 2004).
If the parietal and temporal shape-sensitive activation sites in human and macaque represent homologous cortical regions, an improved registration should be obtainable using landmarks on the basis of the boundaries of activated regions. To explore the implications of this hypothesis, we included additional landmarks derived from the shape activation patterns themselves (Fig. 11A,B). Four landmarks sufficed to yield a satisfactory match between warped monkey and human shape-sensitive activation patterns: the lateral and medial borders of the IPS activation (Fig. 11A,B, IPSm/l) and the posterior and anterior limits of the temporal activation (Fig. 11A,B, IT-post and IT-ant). Using these landmarks yields an improved overall match between the human pattern of activation and the warped monkey pattern in parietal and temporal posterior cortex (Fig. 11D,F). If the hypothesized correspondences are valid, it suggests, in addition, the following homologies. First, the post-ITG part of human LOC corresponds primarily to the lower bank of STS and, more anteriorly, to the lateral convexity of macaque TE. Second, the mid-FG part corresponds more closely to the macaque TEda activation site. Finally, LOS corresponds to area V4d, as suggested by Tootell and Hadjikhani (2001) and Nelissen et al. (2000), although the LOS activation in human was more extensive than the V4 activation in monkey. The warping does not explain, of course, that the human early activation was in V3A rather than dorsal V3 as in the monkey, nor does it account for the relatively more significant parietal activation in monkeys than in humans (Fig. 4).
Shape-sensitive regions in humans
Our results are in agreement with many previous studies indicating that the two main components of the LOC are posterior ITG, which corresponds to LO, and mid-FG, which corresponds to pF (for review, see Malach et al., 2002) or vTO (James et al., 2002). The shape sensitivity of hV3A is in agreement with the study by Grill-Spector et al. (1998b). The shape-sensitive region that we refer to as LOS is part of LOC as defined originally (Malach et al., 1995), but its relationship to more recent descriptions of LOC (James et al., 2002; Malach et al., 2002) is less clear. It corresponds mostly to what Tootell and Hadjikhani (2001) refer to as the topologue of macaque V4d, which is supported by our results from surface-based registration between species. There are also functional similarities between LOS and V4d: both are sensitive to three-dimensional structure from motion (Vanduffel et al., 2002), and both include portions sensitive to kinetic patterns (Van Oostende et al., 1997; Nelissen et al., 2000). In contrast, V4d and LOS differ in retinotopical organization (Gattass et al., 1988; Tootell and Hadjikhani, 2001; Fize et al., 2003) and in motion sensitivity (Sunaert et al., 1999; Vanduffel et al., 2001).
Our study indicates that there is also a parietal component to the human shape-sensitive activation, in agreement with James et al. (2002). The coordinates provided by James et al. (2002) in experiments 2 and 3 correspond to POIPSs and VIPS, respectively. The latter was the most responsive parietal site in our experiments. The activation of these IPS regions, especially the three most dorsal ones, was reduced by the central discrimination task removing attention from the stimuli, but this was also the case to some degree in some ventral areas such as LO. Importantly, the scrambling effect in these regions remained significant under the task condition. The results of the dimming tasks indicate that the parietal effect of scrambling is primarily independent of spatial attention, although, in the one-back task, the effect of scrambling was reversed. Altogether, we consider it unlikely that these parietal activations by viewing intact compared with scrambled stimuli are simply an attention effect.
Shape-sensitive regions in monkeys
The shape sensitivity observed here in IT and V4 agrees with many studies (Gross et al., 1972; Desimone et al., 1984; Baylis et al., 1987; Tanaka et al., 1991; Kovács et al., 1995; Logothetis et al., 1995), indicating that IT neurons are selective for objects or object parts and V4 neurons for object parts (Gallant et al., 1993; Pasupathy and Connor, 2001). Comparing the profiles of V4, TEO, and TE clearly shows that the MR responses to scrambled stimuli decrease as one moves forward in the ventral stream, whereas MR responses to intact images decrease much less. This fits well with the results of Kobatake and Tanaka (1994), who reported that as one moves from V2 to TE over V4 and TEO, responses to simple stimuli such as lines or bars decrease systematically, and the neurons require more complex stimuli to be driven.
Compared with human, two-dimensional shape sensitivity in the monkey was relatively stronger in intraparietal than temporal regions (Fig. 4). This difference is unlikely to reflect the differences in MR measurements between monkey and human. The surface coil used in monkey induces a dorsoventral difference in signal of 30% (Vanduffel et al., 2001). In contrast, the MION method is more sensitive than BOLD (Vanduffel et al., 2001; Leite et al., 2002). Indeed, the differences in parietal activation were observed both in T-scores and magnitude of MR signals, and the signals over monkey V3v were approximately twice as large as those in V3d. Furthermore, in two other studies (Vanduffel et al., 2002; Orban et al., 2003), the opposite trend was observed. Parietal activation by three-dimensional structure from motion and by simple translation was weaker than more ventral activation. These opposite trends also argue against the differences being attributable to postprocessing of human and monkey data (e.g., different number of scans or statistical models).
The more anterior of the two shape-sensitive parietal regions corresponds to LIP, primarily its dorsal part LIPd (Lewis and Van Essen, 2000). Two-dimensional shape selectivity of LIP neurons has been reported previously (Sereno and Maunsell, 1998). The control experiments (Fig. 10; supplemental Table 3; available at www.jneurosci.org) show that the shape selectivity is relatively unaffected by changes in spatial attention. However, there is an interaction between stimulus and task, because MR activity increases when the monkey performs a task that is relevant or irrelevant to the stimulus. This extra activity in the dimming tasks may reflect decision processing (Shadlen and Newsome, 1996; Janssen and Shadlen, 2003). These findings need not imply that shape processing subserves the same functional role in parietal and temporal cortex. This study represents an initial neuroimaging analysis of shape processing in parietal and temporal cortex of primates.
The posterior shape-sensitive region (pIPS) overlaps with several subdivisions described by Lewis and Van Essen (2000), including LOP, PIP, and posterior VIP. It is close to the caudal intraparietal sulcus area (CIPS) region where neurons sensitive to three-dimensional orientation defined from stereo (Taira et al., 2000) and texture (Tsutsui et al., 2002) have been recorded. Comparing supplemental Figure 2, A and B (available at www.jneurosci.org), with Figure 3, A and B, of Tsao et al. (2003) suggests that CIPS mapped with disparity in fMRI corresponds to the posterior part of the shape-sensitive region pIPS.
Comparison between primate species
The activation pattern of shape-sensitive regions in both species required additional constraints to bring the monkey and human parietal and temporal shape-related activation into approximate correspondence. A modified warping, using the fMRI activation pattern as additional landmarks, deforms the IPS region more extensively than the standard deformation (Van Essen et al., 2004). Moreover, it expands even more the part of monkey cortex between the dorsal bank of the STS and auditory cortex, corresponding to the parietotemporal and lateral temporal cortex in humans. Under our proposed scheme, the homolog of LOC includes the lower bank of the middle STS (TEa/m), gradually shifting to the lateral convexity of TE as one moves toward mid-FG. This analysis suggests that human LOC corresponds primarily to the posterior and dorsal part of the monkey IT complex.
Although the warping manages to bring two parietal activation patterns into register, the parietal activation remains proportionally more significant in monkeys than in humans. The warping also does not account for the shift in activation from V3d in monkey to hV3A. It is noteworthy that in addition to shape sensitivity, hV3A shares two other functional properties, motion and three-dimensional structure from motion sensitivities (Orban et al., 1999; Sunaert et al., 1999), with four IPS regions (VIPS, POIPS, DIPSM, and DIPSA). Although these properties differ from those of monkey V3A, the retinotopic organization (Gattass et al., 1988; Fize et al., 2003) and disparity sensitivity are similar in the two species (Tsao et al., 2003).
Finally, many monkey extrastriate regions, even nonshape-sensitive regions, were less responsive to line drawings, whether intact or not, than to gray level images. This might be an effect of the monkey being less familiar with such drawings. Although familiarity had relatively little effect on the effect of scrambling, the overall activation level was somewhat reduced for novel shapes, suggesting that the reduced activation by line drawings might be an ontogenetic effect.
Motion and shape sensitivity
In the monkey, motion- and shape-sensitive regions overlapped little. The main exception is FST and the neighboring part of the lower bank in middle STS. This mixed sensitivity has been predicted in computational studies (Giese and Poggio, 2003) as typical for regions involved in the analysis of biological motion and actions in general.
Surprisingly, the overlap between motion and shape sensitivity was much larger in humans than in monkeys. This species difference survived when similar resolution and procedures were used (Fig. 9). Theoretically, the higher spatial specificity of MION measurements in the monkey compared with BOLD measurements in the human might make the segregation between motion and shape sensitivity easier to detect in monkeys than in humans. However, the plots of MR activity along the banks of cortical sulci (Fig. 9), which clearly show segregated peaks of differential MR signals in monkey but not in human, argue against a purely technical explanation of this result. We observed that hV3A and the four motion-sensitive regions along the IPS are also to a large extent shape sensitive. Furthermore, part of the LOS region involves both motion and shape sensitivity. Also, the hMT/V5+ complex has some shape sensitivity, especially the ventral part, in agreement with Kourtzi et al. (2002), as can be expected from the inclusion of the homolog of FST in this complex.
Most of the human regions in which overlap was observed have been implicated in the extraction of three-dimensional structure from motion (Orban et al., 1999; Vanduffel et al., 2002). We argued that these three-dimensional structure-from-motion regions, in particular those typical for humans (Vanduffel et al., 2002), might be related to the more extensive use of tools by humans. This might also explain the extensive overlap between shape and motion sensitivity in humans, because the manipulation of tools requires detailed three-dimensional and two-dimensional shape information.
Many extrastriate regions in occipital, parietal, and temporal cortex of human and monkey are shape sensitive. LOC corresponds, as expected, to a substantial part of inferotemporal cortex. Three species differences were observed. Parietal shape sensitivity is relatively stronger in monkey than in human IPS. Shape processing is more cue invariant, and the overlap between motion and shape-sensitive regions is larger in human than in monkey cortex.
This work was supported by grants from Queen Elisabeth Foundation (GSKE), National Research Council of Belgium [Fonds voor Wetenschappelijk Onderzoek (FWO) G0112.00], Flemish Regional Ministry of Education (GOA 2000/11), InterUniversity Attraction Pole (P4/22 and P5/04), Mapawamo (European Union Life Sciences), Human Frontier Science Program (RGY 14/2002), Mental Illness and Neuroscience Discovery Institute, and the Human Brain Project (R01 MH60974; jointly funded by National Institute of Mental Health, National Science Foundation, National Cancer Institute, National Library of Medicine, and NASA). We thank M. De Paep, W. Depuydt, A. Coeman, C. Fransen, P. Kayenbergh, G. Meulemans, Y. Celis, G. Vanparrys, D. Hanlon, and J. Harwell for technical support. W.V. is a fellow and H.P. is a junior fellow of FWO-Flanders. Furthermore, we thank Z. Kourtzi for making the stimuli available.
Correspondence should be addressed to Dr. Guy A. Orban, Laboratorium voor Neuro- en Psychofysiologie, Campus Gasthuisberg, Katholieke Universiteit Leuven, Herestraat 49, B-3000 Leuven, Belgium.
Copyright © 2004 Society for Neuroscience 0270-6474/04/242551-15$15.00/0