Twelve normal subjects viewed alternating sequences of unfamiliar faces, unpronounceable nonword letterstrings, and textures while echoplanar functional magnetic resonance images were acquired in seven slices extending from the posterior margin of the splenium to near the occipital pole. These stimuli were chosen to elicit initial category-specific processing in extrastriate cortex while minimizing semantic processing. Overall, faces evoked more activation than did letterstrings. Comparing hemispheres, faces evoked greater activation in the right than the left hemisphere, whereas letterstrings evoked greater activation in the left than the right hemisphere. Faces primarily activated the fusiform gyrus bilaterally, and also activated the right occipitotemporal and inferior occipital sulci and a region of lateral cortex centered in the middle temporal gyrus. Letterstrings primarily activated the left occipitotemporal and inferior occipital sulci. Textures primarily activated portions of the collateral sulcus. In the left hemisphere, 9 of the 12 subjects showed a characteristic pattern in which faces activated a discrete region of the lateral fusiform gyrus, whereas letterstrings activated a nearby region of cortex within the occipitotemporal and inferior occipital sulci. These results suggest that different regions of ventral extrastriate cortex are specialized for processing the perceptual features of faces and letterstrings, and that these regions are intermediate between earlier processing in striate and peristriate cortex, and later lexical, semantic, and associative processing in downstream cortical regions.
Physiological studies in Old World monkeys reveal that particular attributes and categories of visual stimuli are processed within specialized regions of extrastriate cortex. Regions engaged in processing of color and form (for review, see Maunsell and Newsome, 1987), faces (Gross, 1992; Perrett et al., 1992), facial expression (for review, see Rolls, 1992), motion (for review, see Albright, 1993), and “biological motion” (Oram and Perrett, 1994) have been described. This processing occurs within two separate but interconnected pathways, a ventral pathway dealing with object recognition, and a dorsal pathway dealing with motion, spatial relationships, and visually guided movement (Ungerleider and Mishkin, 1982; Goodale and Milner, 1992; Ungerleider, 1995) (however, see Merigan and Maunsell, 1993; Zeki, 1993).
Knowledge of human visual processes is limited and has come primarily from psychophysical studies in normal subjects and from study of patients with naturally occurring lesions. Additional information has been obtained from field potentials recorded from chronically implanted electrodes used to localize epileptogenic foci in human visual cortex (Allison et al., 1993, 1994a,b; Nobre et al., 1994). These studies suggest a considerable degree of modularity in the initial processing of some categories of visual stimuli. For example, Allison et al. (1994a) found that discrete portions of the fusiform and inferior temporal gyri were activated by faces but not by equiluminant scrambled faces or by objects such as cars or butterflies. At these sites, faces evoked a large field potential with a mean peak latency of 190 msec (N200). Letterstrings evoked a similar N200 (Nobre et al., 1994), but letterstring sites were less common than face sites. In some patients, Arabic numbers also evoked an N200 from the same region (Allison et al., 1994b). Visual cortex located posterio-medial to the face, letterstring, and numberstring areas was preferentially activated by colored checkerboards (Allison et al., 1993). These results suggest that the human ventral object-recognition system contains localized subsystems for the perception of colors, faces, and words.
Although many response properties of face and letterstring N200s remain to be determined, evidence indicates that this activity reflects an early, automatic stage of category-specific processing. For example, letterstring N200s were insensitive to word type (e.g., concrete nouns or unpronounceable nonwords), whereas longer-latency potentials recorded from anterior regions of the inferior temporal lobe were highly sensitive to word type (Nobre et al., 1994). Face N200s did not habituate on repeated presentation of the same face, suggesting that this activity reflected mandatory processing of face information (Allison et al., 1995).
The spatial sampling afforded by intracranial electrodes is limited. Puce et al. (1995a) used functional magnetic resonance imaging (fMRI) to examine the extent of extrastriate cortex activated by faces. Although the activated portions of inferior extrastriate cortex in normal volunteers corresponded well with the regions that generated face N200s in patients, the extent to which these regions were activated specifically by faces compared with other objects or letterstrings was not determined. In the present study, we examined this issue by comparing fMRI activation evoked by faces to that evoked by letterstrings. To minimize postperceptual semantic processing, stimuli consisted of unfamiliar faces and unpronounceable nonword letterstrings. To maximize the difference in activations evoked by these stimulus categories, we used an alternation paradigm (McCarthy et al., 1995; Puce et al., 1995a) in which subjects were exposed to a repeating pattern of faces and letterstrings. To identify cortical regions activated by both faces and letterstrings, a nonobject stimulus category, textures, was alternated in separate runs with faces and with letterstrings.
MATERIALS AND METHODS
Subjects. Twelve normal right-handed subjects (Edinburgh Handedness Inventory range 70–100) (Oldfield, 1970) without a previous neurological or psychiatric history participated in this study. Age ranged from 19 to 34 years (mean, 26 years), and there were seven males. The experimental protocol was approved by the Human Investigation Committee of Yale University School of Medicine, and informed consent was obtained from each subject.
Activation tasks. Visual stimuli were delivered under computer control to an active matrix projection panel (Sharp Instruments, Mahwah, NJ), the images for which were projected onto a translucent screen mounted at the end of the patient gurney. The subject viewed stimuli on the screen through a mirror mounted on the head coil. All stimuli were presented on a dark background and subtended horizontal and vertical visual angles of 4.3°.
Three stimulus types were viewed as gray-scale images (Fig. 1): (1) Digitally scanned faces from college yearbook photographs of individuals without eyeglasses or facial hair were presented. The photographs were full face and consisted of equal numbers of males and females. (2) Letterstrings consisted of five black consonants presented on a gray square that was the same size as faces. (3) Textures were abstract designs. Luminance and contrast of the stimuli were standardized using image processing software. The average luminance of all three stimulus categories was the same, as was the average contrast for the face and texture stimuli.
Experiment 1. The activation task consisted of the alternating presentation of sets of faces and letterstrings (F and L, respectively), each with 10 different stimuli per set as described previously (Puce et al., 1995a). A single cycle of the activation task (F–L) had a duration of 12 sec, and each imaging run consisted of 14 cycles. Each run was preceded by a 12 sec prestimulus period and followed by a 13 sec poststimulus period, during which a white central fixation cross was presented on a dark screen. The order of presentation of faces and letterstrings was counterbalanced to yield three imaging runs starting with faces (FLFL … ) and three runs starting with letterstrings (LFLF … ). Subjects were instructed to lie as still as possible and to concentrate on viewing the stimuli.
Experiment 2. The same subjects were run in a second imaging session in which a nonobject stimulus category, textures (T), was alternated in separate runs with faces (F–T) and letterstrings (L–T). Six runs were acquired for the F–T comparison, with three runs starting with faces (FTFT … ) and three runs starting with textures (TFTF … ). Similarly, six runs were acquired for the L–T comparison, with three runs starting with letterstrings (LTLT … ) and three runs starting with textures (TLTL … ).
MRI acquisition. A 1.5T MRI scanner (General Electric Signa, Milwaukee, WI) with a standard quadrature head coil and echoplanar capability (Instascan, ANMR Systems, Wilmington, MA) was used. The subject’s head was positioned along the canthomeatal line and then immobilized using a vacuum cushion (Olympic, Seattle, WA) and a forehead strap. Anatomical sagittal localizer scans were acquired [T1 weighted: TR (repetition time) = 500, TE (echo time) = 11, NEX (excitations) = 1, FOV (field of view) = 24 cm, slice thickness = 5 mm, skip factor = 2.5 mm; imaging matrix = 256 × 192]. Seven coronal slices beginning at the posterior edge of the splenium (Fig. 2) were selected, which encompassed all but the most anterior portion of the region studied electrophysiologically (Allison et al., 1994b). Anatomical scans for these seven slices were acquired using a T1-weighted conventional sequence (TR = 500, TE = 11, NEX = 2, FOV = 24 cm, slice thickness = 7 mm, skip factor = 0, imaging matrix = 256 × 256) and echoplanar sequence (TR = 3000, TE = 80, NEX = 4, FOV = 40 × 20 cm, slice thickness = 7 mm, skip factor = 0, imaging matrix = 128 × 64). At the conclusion of experiment 1, axial images were acquired for all subjects using an SPGR sequence (TR = 25, TE = 5, α = 45°, NEX = 2, FOV = 24 cm, slice thickness = 2 mm, skip factor = 0, imaging matrix = 256 × 192). This whole brain image set was reformatted to match the slice angles used for functional imaging. In this way, activated voxels were represented in three dimensions and transformed into Talairach coordinates (Talairach and Tournoux, 1988). In addition, coronal magnetic resonance (MR) angiography images were acquired using a sequence selected to emphasize venous flow (TR = 45, TE = 7.7, α = 40°, NEX = 2, FOV = 24 cm, flow compensation, slice thickness = 2 mm, imaging matrix = 256 × 128).
In both experiments, functional images were acquired using a gradient echo echoplanar sequence (TR = 1500, TE = 45, α = 60°, NEX = 1, FOV = 40 × 20 cm, slice thickness = 7 mm, skip factor = 0, imaging matrix = 128 × 64, voxel size = 3.2 × 3.2 × 7 mm). Each imaging run consisted of 128 images per slice with a duration of 3 min 13 sec. Four additional excitations were performed before each run to achieve steady-state transverse magnetization.
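The timing and geometry figures quoted for a functional run can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are ours, each stimulus type is taken to occupy half of the 12 sec cycle, and the in-plane voxel size is computed on the assumption that the 40 × 20 cm field of view is divided evenly over the 128 × 64 matrix.

```python
# Timing and geometry of one functional run, using values stated in the text.
TR_S = 1.5          # repetition time (sec)
N_IMAGES = 128      # images per slice per run
PRESTIM_S = 12.0    # fixation period before the first cycle (sec)
CYCLE_S = 12.0      # one full alternation cycle, e.g. F-L (sec)
N_CYCLES = 14       # cycles per run
POSTSTIM_S = 13.0   # fixation period after the last cycle (sec)

# Total presentation time: prestimulus + 14 cycles + poststimulus.
stimulus_duration_s = PRESTIM_S + N_CYCLES * CYCLE_S + POSTSTIM_S

# Time spanned by the 128 images at one image per TR.
acquisition_duration_s = N_IMAGES * TR_S

# Images per full cycle and per stimulus type (assuming half a cycle each).
images_per_cycle = CYCLE_S / TR_S
images_per_stimulus = images_per_cycle / 2

# Nominal in-plane voxel size (mm), assuming an evenly divided field of view.
voxel_x_mm = 400.0 / 128
voxel_y_mm = 200.0 / 64

print(stimulus_duration_s, acquisition_duration_s)
print(images_per_cycle, images_per_stimulus)
print(voxel_x_mm, voxel_y_mm)
```

The 193 sec presentation time corresponds to the stated run duration of 3 min 13 sec, and the nominal in-plane size of about 3.1 mm is close to the reported 3.2 mm.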
The functional imaging runs were screened for movement and other artifacts using center of mass calculations and by visual inspection of the image series in an animated loop. Data from one subject in experiment 2 were excluded because of head movement artifact. Analysis of functional images was performed using t test mapping and frequency analysis.
t test analysis. The three runs for each stimulus alternation order were averaged resulting in two average runs of 128 images per slice for each experiment. For example, for each of the seven slices of experiment 1, there was one average run for the F–L alternation order and one average run for the L–F alternation order. Three consecutive images for each stimulus type (F and L) were selected from each of the 14 cycles within each average run, resulting in 42 samples for each stimulus type. An unpaired t test then was performed on a voxel-by-voxel basis for these images. Because of the lag associated with the functional activation signal, the selected images were offset from the start and end of each stimulus type within each cycle. For example, in experiment 1, the three images selected for faces were those three consecutive images starting at 4.5 sec after the onset of each face subcycle. Similarly, the three images selected for letterstrings were those three consecutive images starting at 4.5 sec after the onset of each letterstring subcycle. The image acquired 3.0 sec after the onset of either stimulus type was not included in the analysis, because at that point in the cycle, the signal associated with the current stimulus type was rising while the one associated with the past stimulus type was declining. (Note that the example above applies strictly for only the first of the seven slices, because the acquisition of each was offset from each other within the 1500 msec TR according to a 1-3-5-7-2-4-6 slice order. A compensation was introduced in the images chosen for the t test to account for this acquisition offset.)
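The image selection and voxel-by-voxel test described above can be sketched as follows for the first slice of an F–L run. This is a minimal illustration, not the original analysis code. It assumes that image 0 is acquired at the start of the 12 sec prestimulus period, that each stimulus type occupies half of the 12 sec cycle, and that the unpaired t test is the pooled-variance (Student) form; the function names and the synthetic data are hypothetical.

```python
import numpy as np

TR_S, PRESTIM_S, CYCLE_S, N_CYCLES, LAG_S = 1.5, 12.0, 12.0, 14, 4.5

def sample_indices(subcycle_offset_s):
    """Indices of the three consecutive images that start LAG_S after the
    onset of one stimulus type in every cycle (3 x 14 = 42 images)."""
    indices = []
    for cycle in range(N_CYCLES):
        onset_s = PRESTIM_S + cycle * CYCLE_S + subcycle_offset_s
        first = int(round((onset_s + LAG_S) / TR_S))
        indices.extend([first, first + 1, first + 2])
    return np.array(indices)

def unpaired_t(a, b):
    """Pooled-variance unpaired t statistic along axis 0 (one value per voxel)."""
    na, nb = a.shape[0], b.shape[0]
    var_a = a.var(axis=0, ddof=1)
    var_b = b.var(axis=0, ddof=1)
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    return (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(pooled * (1 / na + 1 / nb))

# F-L order: faces start each cycle, letterstrings start half a cycle later.
face_idx = sample_indices(0.0)
letter_idx = sample_indices(CYCLE_S / 2)

# Synthetic averaged run: 128 images of a 4 x 4 slice, unit-variance noise.
rng = np.random.default_rng(0)
run = rng.normal(100.0, 1.0, size=(128, 4, 4))
run[face_idx, 0, 0] += 3.0   # make one voxel respond to faces

# Positive t means faces > letterstrings at that voxel.
t_map = unpaired_t(run[face_idx], run[letter_idx])
print(face_idx[:3], letter_idx[:3], t_map[0, 0])
```

Under these assumptions, the images acquired 3.0 sec after each onset (indices 10, 14, 18, ...) fall in neither sample, which matches the exclusion rule in the text. The slice-order compensation mentioned above is not modeled here.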
The t-maps computed for each stimulus alternation order were further analyzed using two methods: (1) A “split-t” map (Puce et al., 1995b) was generated for each subject in which only those voxels whose associated t values exceeded ± 1.5 for both stimulus alternation orders were retained. These maps were used as a comparison for the frequency analysis described below. (2) An across-subjects t-map was computed, which combined the average t-maps for all 12 subjects. Before averaging, the t-maps for each subject were translated, stretched, and rotated (independently in two dimensions) to align gyri and sulci to a reference image set based on a representative subject. The alignment factors were calculated using high-resolution anatomical images without regard to the functional activations. Alignments were performed separately for each hemisphere of each anatomical slice. To achieve a good anatomical fit, it sometimes was necessary to offset the slice order by one image in individual subjects to match the reference images (e.g., slices 2–6 from one subject might be aligned with reference slices 1–5). To accommodate these shifts, across-subjects average t-maps were only created for reference slices 1–5.
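The split-t criterion in method (1) can be illustrated with a short sketch. It assumes that the t value is computed with the same sign convention in both alternation orders (for example, faces minus letterstrings), so that a voxel is retained only when it exceeds the threshold with the same sign in both maps. The function name and the example values are hypothetical.

```python
import numpy as np

def split_t_mask(t_order_a, t_order_b, threshold=1.5):
    """Return +1 where both t-maps exceed +threshold, -1 where both fall
    below -threshold, and 0 elsewhere (voxel not retained)."""
    both_positive = (t_order_a > threshold) & (t_order_b > threshold)
    both_negative = (t_order_a < -threshold) & (t_order_b < -threshold)
    return both_positive.astype(int) - both_negative.astype(int)

# Hypothetical t values for five voxels in the F-L and L-F runs.
t_fl = np.array([2.0, 2.0, -1.8, 0.3, -2.5])
t_lf = np.array([1.9, 1.0, -1.6, 2.2, 2.5])

mask = split_t_mask(t_fl, t_lf)
print(mask)
```

Only the first and third voxels survive: the second and fourth exceed the threshold in one order only, and the fifth changes sign between orders.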
Frequency analysis. The frequency analysis has been described in detail (McCarthy et al., 1995; Puce et al., 1995a). First, the 128 functional images obtained in each run for each slice were transformed voxel by voxel into frequency and phase spectra using the fast Fourier transform. These spectra were averaged across the three runs, which made up a stimulus alternation order. This resulted in two average frequency and phase spectra for each voxel in each slice for each experiment; for example, one for the F–L alternation order, and one for the L–F alternation order of experiment 1. Voxels were retained for additional processing if they had a significant spectral peak at 0.083 Hz, the frequency corresponding to the 12 sec stimulus alternation period, in both alternation orders. The spectral peak was considered significant if its power was 1.5 SD higher than the mean power of the 20 adjacent power estimates.
A decision rule then was imposed in which the phase spectra were examined for the voxels retained above. Voxels were defined as activated only if the relative phases of these peaks changed by 180 ± 15° between stimulus alternation orders (e.g., F–L and L–F). Voxels that survived this two-step procedure then were superimposed on anatomical images. No other thresholds were used.
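The two-step frequency criterion can be sketched for a single voxel as follows. This is a simplified illustration under stated assumptions, not the published implementation: it uses one synthetic time course per alternation order rather than spectra averaged over three runs, it takes the "20 adjacent power estimates" to be the 10 frequency bins on either side of the stimulus frequency, and all function names are ours. With 128 images at a TR of 1.5 sec, the 12 sec period falls exactly on bin 16 of the spectrum.

```python
import numpy as np

TR_S, N_IMAGES = 1.5, 128
STIM_FREQ_HZ = 1.0 / 12.0                                # 0.083 Hz
PEAK_BIN = int(round(STIM_FREQ_HZ * N_IMAGES * TR_S))    # bin 16

def spectrum(time_course):
    """Complex spectrum of a mean-removed voxel time course."""
    return np.fft.rfft(time_course - np.mean(time_course))

def has_peak(spec, k=PEAK_BIN, half_width=10, n_sd=1.5):
    """True if power at bin k exceeds the mean of the 20 neighbouring
    power estimates by more than n_sd standard deviations."""
    power = np.abs(spec) ** 2
    neighbours = np.concatenate(
        [power[k - half_width:k], power[k + 1:k + 1 + half_width]]
    )
    return bool(power[k] > neighbours.mean() + n_sd * neighbours.std())

def phase_shift_deg(spec_a, spec_b, k=PEAK_BIN):
    """Phase of order b relative to order a at bin k, in (-180, 180] degrees."""
    return float(np.degrees(np.angle(spec_b[k] * np.conj(spec_a[k]))))

def is_activated(ts_a, ts_b, tol_deg=15.0):
    """Peak present in both orders and phase reversed by 180 +/- tol_deg."""
    spec_a, spec_b = spectrum(ts_a), spectrum(ts_b)
    if not (has_peak(spec_a) and has_peak(spec_b)):
        return False
    return abs(abs(phase_shift_deg(spec_a, spec_b)) - 180.0) <= tol_deg

# Synthetic voxels: a 12 sec oscillation plus a small amount of noise.
rng = np.random.default_rng(1)
t = np.arange(N_IMAGES) * TR_S
carrier = 2 * np.pi * STIM_FREQ_HZ * t

def noise():
    return rng.normal(0.0, 0.2, N_IMAGES)

locked_fl = np.sin(carrier) + noise()            # response in the F-L order
locked_lf = np.sin(carrier + np.pi) + noise()    # reversed in the L-F order
unlocked_lf = np.sin(carrier) + noise()          # same phase in both orders

stimulus_locked = is_activated(locked_fl, locked_lf)
not_locked = is_activated(locked_fl, unlocked_lf)
print(stimulus_locked, not_locked)
```

The second comparison shows why the phase rule matters: a voxel that oscillates at 0.083 Hz but does not reverse phase when the stimulus order is reversed passes the spectral test yet is rejected by the decision rule.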
Figure 3 illustrates the frequency analysis method for an individual subject in experiment 1. The activated voxels from both the split t test (Fig. 3 A) and frequency analysis (Fig. 3 B) are shown as overlays on the subject’s anatomical image. The frequency analysis (Fig. 3 B) identified voxels that were typically a subset of those found to be activated by the t test analysis (Fig. 3 A). The time courses for activation in the left hemisphere (white circle in Fig. 3 B) by faces (white) and letterstrings (black) are shown in Figure 3, C and D, respectively. Fourteen peaks can be observed in each time course corresponding to the 14 cycles of stimulus alternation. This oscillatory time course is reflected in Figure 3, E and F, which show the frequency spectra for these two regions. A 180° phase shift (Fig. 3 C,D) occurred between the F–L and L–F alternation order. This is shown quantitatively in Figure 3, G and H, where the power at 0.083 Hz (corresponding to the 12 sec stimulus period) is plotted as a function of phase. Two major peaks separated by 180° are evident in each plot. For this subject, the activation evoked by faces (∼2%) was greater than that evoked by letterstrings (∼1%). The time courses show a clear temporal differentiation between the activation by faces of the fusiform gyrus and the activation by letterstrings of the occipitotemporal sulcus. For example, the solid line denoting the F–L stimulus alternation order reached a peak midway in the 12 sec cycle (denoted by vertical lines) in the fusiform gyrus, corresponding to the offset of faces and the onset of letterstrings. Conversely for the occipitotemporal sulcus, the solid line peaked at the end of the 12 sec cycle at the offset of letterstrings. This relationship is seen more easily in Figure 3, G and H, where the phase of activation for the fusiform gyrus precedes that of the occipitotemporal sulcus by 160° for the F–L alternation order (compare solid lines). The reverse relationship holds for the L–F stimulus alternation order in which the occipitotemporal sulcus activation precedes that of the fusiform gyrus by 200° (compare dotted lines).
Anatomical localization. The Talairach coordinates for each activated voxel were measured, and the anatomical locations were determined by consensus of three of the authors. The identification of anatomical landmarks was aided by imaging software that permitted interactive reformatting of the thin slice, high-resolution SPGR image series. This was particularly useful in visualizing the full anterior–posterior courses of the collateral and occipitotemporal sulci. Parcellation of occipitotemporal cortex was straightforward with three exceptions: (1) The fusiform gyrus often is divided by a longitudinal sulcus that can be confused with the collateral sulcus in isolated coronal slices; (2) The transition between temporal and occipital cortex is difficult to determine and is somewhat arbitrary; (3) The location and extent of the transverse collateral sulcus are difficult to assess. Ventral cortex lateral to the collateral sulcus, posterior to the transverse collateral sulcus, and medial to the inferior occipital sulcus will be referred to as ventral occipital cortex. This region comprises portions of the third and fourth occipital gyri of Duvernoy (1991) and ventral cortex of the occipital pole.
To assess the consistency of activation, spatially normalized, across-subjects t-maps were generated for the face–letterstring condition (Fig. 4). Faces (yellow–red) strongly activated the right ventral brain, primarily in the fusiform gyrus (A–E). A large patch of activation also was obtained in the right lateral cortex (A–C). In the left ventral brain, faces activated a more restricted portion of the fusiform gyrus (C,D). Letterstrings (pink–purple) produced less overall activation, which was largely restricted to the left hemisphere. Activation by letterstrings also was observed in the left intraparietal sulcus (C). Most prominent was the activation of the occipitotemporal sulcus and posteriorly contiguous inferior occipital sulcus (B–D) just lateral and superior to the activation by faces of the fusiform gyrus. This pattern of activation also was evident in the individual images of 9 of the 12 subjects. Figure 5 shows data from this region (white circles) in six individuals. Activation evoked by letterstrings occurred primarily in the occipitotemporal and inferior occipital sulci. Letterstring activation of these sulci was restricted in anterior–posterior extent, and was seen in only one slice in six subjects and two adjacent slices in three subjects. Activation by faces in the left hemisphere occurred in the fusiform gyrus and posteriorly contiguous ventral occipital cortex.
The time courses of the MR signal change from the encircled voxels activated by each task are shown to the right of each image (Fig. 5). To illustrate the timing of activation, each time course has been averaged across all 14 cycles. The activation evoked by faces was evident within 1.5–3.0 sec after their onset, and (in five of the six subjects shown) peak activation occurred within 1.5–3.0 sec after their offset (i.e., peak activation by faces occurred after the onset of letterstrings). A complementary time course occurred for letterstring activation. In these cortical regions, the magnitude of the activation (signal change divided by mean signal) evoked by letterstrings was 0.8%, whereas the magnitude of the activation evoked by faces was 1.3%.
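The cycle averaging and the activation magnitude described above can be sketched as follows. The sketch rests on several assumptions that the text does not spell out: the 14 cycles are taken to occupy images 8 through 119 of a run (8 images per cycle after the 12 sec prestimulus period), and "signal change" is taken to be the peak-to-trough difference of the cycle-averaged time course. The function names and the synthetic time course are hypothetical.

```python
import numpy as np

def cycle_average(time_course, first_image=8, n_cycles=14, images_per_cycle=8):
    """Average a voxel time course across all cycles, giving one value
    per image position within the 12 sec cycle."""
    stop = first_image + n_cycles * images_per_cycle
    segment = np.asarray(time_course[first_image:stop], dtype=float)
    return segment.reshape(n_cycles, images_per_cycle).mean(axis=0)

def percent_signal_change(cycle_avg):
    """Peak-to-trough signal change divided by mean signal, in percent."""
    return 100.0 * (cycle_avg.max() - cycle_avg.min()) / cycle_avg.mean()

# Synthetic voxel: baseline of 100 with a 12 sec oscillation of 1.3 units
# peak to trough, sampled at 8 images per cycle.
images = np.arange(128)
time_course = 100.0 + 0.65 * np.sin(2 * np.pi * images / 8)

avg = cycle_average(time_course)
magnitude = percent_signal_change(avg)
print(avg.shape, round(magnitude, 2))
```

For this synthetic voxel the magnitude comes out at 1.3%, chosen here only to be of the same order as the values reported for faces.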
To evaluate the complete activation pattern, each activated voxel identified by the frequency analysis was localized anatomically and counted for all slices and subjects. Two percent of these voxels were located in cerebellum, white matter, ventricles, or sinuses. Comparison of activated voxels with the angiographic images revealed that few if any voxels could be attributed to large-vessel activation. Remaining voxels were located in cerebral cortex and are listed in Table 1. To simplify Table 1, structures for which little or no activation was found (defined as three or fewer total voxels for any stimulus category for both hemispheres) are not listed. These structures are the cingulate gyrus, cuneus, inferior temporal sulcus, inferior temporal gyrus, parieto-occipital fissure, precuneus, superior occipital gyrus, superior parietal gyrus, superior temporal gyrus, and supramarginal gyrus.
Active structures will be described in medial-to-lateral order. The calcarine sulcus showed little activation to any stimulus category (Table 1), suggesting that it was more or less equally activated by all the stimuli and that a steady-state activation was achieved resulting in no residual periodic signal. The lingual gyrus showed strong activation by faces when alternated with letterstrings, particularly in the left hemisphere. However, this activation was reduced to near zero when faces alternated with textures. The logic of the experimental design requires that brain regions specifically activated by a stimulus category must be activated by that category regardless of which control category was used. [For example, in the left fusiform gyrus, 22 voxels were activated by faces alternated with letterstrings, whereas 17 voxels were activated by faces alternated with textures (Table 1), indicating that the region responded specifically to faces. By contrast, in the left lingual gyrus, 21 voxels were activated by faces when alternated with letterstrings, whereas none was activated by faces when alternated with textures, indicating that this structure was equally responsive to textures and faces and, hence, that the activation by faces was nonspecific.] These results indicate, therefore, that the lingual gyrus was activated in common by both faces and textures. Indeed, the fewest activated voxels were obtained when these two categories alternated, again suggesting that a steady-state activation was achieved. This conclusion is supported further by the fact that textures strongly activated this region when alternated with letterstrings, particularly in the right hemisphere.
The collateral sulcus was strongly and bilaterally activated by textures irrespective of the control condition (Table 1), as illustrated in Figure 6 A, indicating that it was activated specifically by textures. Less but specific activation by textures of the left transverse collateral sulcus also was seen. Some nonspecific activation by faces was seen in the collateral sulcus of the left hemisphere.
The fusiform gyrus, just lateral to the collateral sulcus, showed a different pattern of activation (Fig. 6 B). It was activated strongly by faces when alternated with letterstrings, consistent with both the across-subjects and individual t-maps shown in Figures 4 and 5. Activation by faces also was obtained when alternated with textures, indicating that at least part of the activation by faces was specific. On the other hand, the reduction in the magnitude of the activation by faces when alternated with textures, and the activation by textures when alternated with letterstrings, indicates that part of the fusiform activation was nonspecific and common to faces and textures. Face activation was bilateral, but greater in the right hemisphere, consistent with the results of the across-subjects t-maps (Fig. 4).
Little activation of the fusiform gyrus and more medial structures was evoked by letterstrings, particularly when alternated with faces. However, letterstrings strongly activated the left occipitotemporal and inferior occipital sulci when alternated with either faces or textures, indicating that this activation was letterstring specific. In contrast, faces strongly activated the right occipitotemporal and inferior occipital sulci when alternated with either letterstrings or textures, again indicating a specific activation. This striking interaction of stimulus category by hemisphere is illustrated in Figure 6 C, in which the activation of these sulci has been combined and summarized.
Activation of the lateral cortex, a continuous region comprising the lateral occipital sulcus, middle occipital gyrus, middle temporal gyrus, and superior temporal sulcus, is summarized in Figure 6 D. Strong activation by faces was obtained when alternated with letterstrings and when alternated with textures, indicating a degree of face specificity. Like the activation of the occipitotemporal and inferior occipital sulci, this activation was greater in the right hemisphere. This result is consistent with the pattern of lateral activation seen in the across-subjects t-map for the right hemisphere in Figure 4. Little activation was obtained for either textures or letterstrings in lateral cortex, but letterstrings showed a small left dominant asymmetry (Fig. 6 D).
Letterstrings also produced strong activation of the fundus of the intraparietal sulcus and adjacent angular gyrus, primarily in the left hemisphere (Table 1, Figs. 4, 5), but only when compared with faces and not with textures, whereas textures also produced little activation when compared with either faces or letterstrings (Figs. 3, 4, 5, Table 1). This anomalous pattern of activation is difficult to interpret within the current experimental design.
There was little overlap (5 of 132 voxels) in the activation of ventral cortex by both faces and letterstrings when alternated with textures. Although this suggests specificity in activation, it is possible that even these commonly activated voxels would show a strong preference for either faces or letterstrings when these categories were alternated directly. Another way to assess the relative specificity of activation of a cortical region is to express it as a percentage of the total activation across regions. Such an analysis for ventral cortex is shown in Figure 7, and demonstrates that the collateral sulcus is most strongly activated by textures, the fusiform gyrus by faces, and the occipitotemporal and inferior occipital sulci by letterstrings.
The mean Talairach coordinates for the major regions of activation described in Figures 4, 5, 6, 7 are given in Table 2. The centers of activation did not change substantially for faces or letterstrings across control conditions. However, the center of activation for textures compared with faces was anterior to the center of activation when compared with letterstrings, implying (together with the results of Fig. 6 A) that a posterior region of the collateral sulcus is activated in common by textures and faces.
Activation by faces
When alternated with letterstrings, faces activated more cortex than did letterstrings in both the right (115 vs 26 voxels) and left (100 vs 67 voxels) hemispheres. A similar result was found for objects compared with words in the positron emission tomography (PET) study of Buckner et al. (1995). We have seen a similar difference in recordings from extrastriate cortex in which face N200 sites are more numerous than letterstring N200 sites. The reason for this difference is unclear. It has been argued that faces are an important category of objects for which rapid recognition requires specialized processing (Young and Bruce, 1991; Newcombe et al., 1994). However, the same argument could be asserted for reading. Perhaps in the absence of competing stimuli, faces activate the general object-recognition system in addition to face-specific regions. This subsystem may occupy more of the ventral pathway than does a subsystem specialized for the perception and grouping of a small set of characters.
Consistent with our previous fMRI study (Puce et al., 1995a), face activation was concentrated within the fusiform gyrus (Figs. 4, 7). Previous PET studies have consistently shown face activation of this region (Sergent et al., 1992; Haxby et al., 1994, 1995). Activation was more extensive in the right hemisphere (Fig. 4), particularly when faces alternated with letterstrings (Fig. 6 B). These results are consistent with the neuropsychology literature (for review, see Rhodes, 1993) demonstrating a right hemisphere advantage for face recognition and with previous PET (Horwitz et al., 1992; Haxby et al., 1995) and scalp-recorded evoked potential (Bentin et al., 1996) studies.
Malach et al. (1995) compared the fMRI activation produced by objects (including faces) with the activation produced by textures. They found that a region of lateral occipital cortex centered in the fusiform gyrus was preferentially activated by objects even when the spatial frequencies and contrast of the object stimuli matched those of the texture stimuli. We also found that faces activated the fusiform gyrus compared with textures (Fig. 6 B), although the two categories of stimuli were equated for luminance and contrast. These results argue that the selective activation of regions of ventral cortex by faces and other complex objects is not attributable to between-category differences in luminance, contrast, spatial frequency, or other elementary stimulus features.
Faces also specifically activated the right occipitotemporal and inferior occipital sulci (Fig. 6 C). This is consistent with the evoked potential study of Bentin et al. (1996) in which faces and eyes evoked a right hemisphere dominant N170 hypothesized to be generated in the occipitotemporal sulcus. Finally, a continuous region of lateral cortex comprising the middle temporal and occipital gyri, and superior temporal and lateral occipital sulci, also was activated specifically by faces (Figs. 4, 5, 6 D), again predominantly in the right hemisphere. The center of activation of this region was similar to a right lateral region activated by faces (Puce et al., 1995a), and was anterior and superior to the lateral occipital region activated by objects (Malach et al., 1995).
Activation by letterstrings
Letterstrings activated more cortex in the left than in the right hemisphere (67 vs 26 voxels). Activation was largely restricted to a region of the occipitotemporal and inferior occipital sulci (Figs. 4, 5, 6 C), the same region selectively activated by faces in the right hemisphere (Fig. 6 B). Intracranial recordings also have revealed a left dominant asymmetry (our unpublished observations), and evoked potential studies have shown that a letterstring N180 is larger over the left than the right temporal scalp (Nobre and McCarthy, 1994), complementary to the distribution obtained by Bentin et al. (1996) for faces. These results imply a left hemisphere advantage for prelexical letterstring processing. However, activation by both faces and letterstrings was bilateral; the interhemispheric differences were of degree rather than kind.
This study was designed to engage only the early stages of word processing; the results suggest that this objective was achieved. Little activation and no letterstring-specific activation of left superior and middle temporal cortex was seen, regions previously activated in PET studies involving phonological processing or word reading (Petersen et al., 1989; Démonet et al., 1992; Howard et al., 1992; Price et al., 1994). Semantic processing of auditorily presented words also activated regions of the temporal lobe distant from ventral cortex (Petersen et al., 1988; Wise et al., 1991; Démonet et al., 1992). The region activated by unpronounceable nonwords here partially overlapped, but was inferior and lateral to a medial extrastriate region activated by words and pronounceable pseudowords (but not by false fonts or unpronounceable nonwords) in the study of Petersen et al. (1990). These considerations indicate that the unpronounceable nonword stimuli used here activated a prelexical, presemantic stage of word-form processing in the reading system, and that the lexical and semantic processes attendant on reading words activate regions of the temporal lobe anterior and superior to ventral occipitotemporal cortex. Nevertheless, systematic investigation of the activation produced by decomposed letterstrings, unpronounceable nonwords, and different word types will be required to better differentiate the anatomical substrate of word-processing stages.
We predicted from electrophysiological studies (Allison et al., 1994a; Nobre et al., 1994) that activation of ventral surface cortex by letterstrings would be medial to that for faces. This prediction was not confirmed, and indeed we found little activation of the fusiform gyrus by letterstrings. There also was little activation of the inferior temporal gyrus by letterstrings (or, indeed, by faces or textures). In the study of Puce et al. (1995a), only 14% of the total volume of activation by faces was in the inferior temporal gyrus. Only 15% of letterstring N200s, and 24% of face N200s, were recorded from the inferior temporal gyrus (our unpublished observations). These results indicate that the human inferior temporal gyrus is not a major site of letterstring or face processing, unlike the inferior temporal cortex in Old World monkeys, which is a major site of face-specific neurons (Gross and Sergent, 1992; Perrett et al., 1992). Just as striate cortex in monkeys is located mainly on the lateral surface but “migrates” to ventromedial calcarine cortex in humans, there may be a parallel migration of lateral face-sensitive regions in monkeys to ventral cortex in humans.
Activation by textures
Although textures were used primarily as a nonobject control condition, some texture-specific activation was seen in the collateral and transverse collateral sulci (Fig. 6 A, Table 1). The collateral sulcus contains portions of areas V2, ventral V3 (VP), and V4 (Clarke and Miklossy, 1990; Clarke et al., 1995; Shipp et al., 1995). It is activated by colors, shapes, and scrambled faces (Corbetta et al., 1991; Clark et al., 1995; Puce et al., 1995a), but not specifically by faces (Table 1) (see also Clark et al., 1995; Puce et al., 1995a). Cortex in the transverse collateral sulcus also is likely to be a part of the V4 color–form region. These two regions may perform a stage of pattern recognition intermediate between initial processing in V1 and V2 and category-specific processing in the fusiform gyrus and more lateral extrastriate regions.
Specificity of activation by faces and letterstrings
The impetus for this study was our finding that faces and letterstrings evoke field potentials in different sites on the ventral brain surface (Allison et al., 1994a,b; Nobre et al., 1994). From a broader perspective, it is known that different regions of visual cortex process different visual stimulus attributes. The human ventral pathway is involved in object perception, but whether there are category-specific subsystems and if so, how many, is unclear. There are advocates of a unitary system (Damasio et al., 1982; Critchley, 1986; Biederman, 1987), two subsystems dealing with faces and words (Farah, 1990), two subsystems dealing with living and nonliving objects (Sheridan and Humphreys, 1993; Newcombe et al., 1994), three subsystems dealing with faces, words, and numbers (Allison et al., 1994b), and nine subsystems dealing with living, nonliving, and symbolic categories of objects (Konorski, 1967). These classifications are based largely on case reports of patients with lesions producing specific visual agnosias. Consistent with the present study, the lesion evidence suggests that prosopagnosia (for review, see De Renzi et al., 1994) and pure alexia (for review, see Damasio and Damasio, 1983) are forms of visual agnosia preferentially involving lesions of the right and left occipitotemporal regions, respectively. However, it is difficult to compare lesion and imaging data. First, the deficit experienced by the patient may be the result of damage in different parts of the pathway along a perceptual–semantic continuum (Sergent and Signoret, 1992; Sheridan and Humphreys, 1993). Second, most cases were described before the advent of high-resolution imaging, making it difficult to identify the specific location and extent of the lesion.
The PET and fMRI studies cited above, and many others, establish that regions of extrastriate cortex are activated by objects, faces, and letterstrings. Because the centers of activation differ, it could be argued that such differences are evidence of anatomical segregation of processing of different object categories. But such across-study analysis is made difficult by differences in task requirements, control stimuli, and statistical criteria for significant activation. We attempted to deal with this problem by looking for differences in the anatomical pattern of activation produced by two important categories of stimuli, faces and letterstrings. This strategy allows detection of cortical regions preferentially activated by one or the other stimulus category, but does not completely answer the question of specificity. It is possible that different categories of control stimuli would activate different regions, or that the same region would be activated equally by untested stimulus categories. Within these limitations, however, we conclude that portions of extrastriate cortex are activated specifically by faces and letterstrings. Two strategies might be useful to further test the anatomical specificity of category-specific processing: (1) Electrophysiological recordings could provide an independent data set. In patients in whom face N200s have been recorded, activation at the same sites should be obtained by fMRI. (2) Portions of extrastriate cortex may participate to some degree in the perception of any isolated stimulus. If the system were to be challenged by concurrent object arrays, then faces or letterstrings might activate only category-specific processing sites. Studies to investigate both of these strategies now are underway.
This work was supported by the Department of Veterans Affairs and by National Institute of Mental Health Grant MH-05286. We thank A. Anderson, H. Sarofin, and S. Thomsen for assistance.
Correspondence should be addressed to Aina Puce, Neuropsychology Laboratory 116B1, VA Medical Center, West Haven, CT 06516.