Abstract
fMRI studies have revealed three scene-selective regions in human visual cortex [the parahippocampal place area (PPA), transverse occipital sulcus (TOS), and retrosplenial cortex (RSC)], which have been linked to higher-order functions such as navigation, scene perception/recognition, and contextual association. Here, we document corresponding (presumptively homologous) scene-selective regions in the awake macaque monkey, based on direct comparison to human maps, using identical stimuli and largely overlapping fMRI procedures. In humans, our results showed that the three scene-selective regions are centered near—but distinct from—the gyri/sulci for which they were originally named. In addition, all these regions are located within or adjacent to known retinotopic areas. Human RSC and PPA are located adjacent to the peripheral representation of primary and secondary visual cortex, respectively. Human TOS is located immediately anterior/ventral to retinotopic area V3A, within retinotopic regions LO-1, V3B, and/or V7. Mirroring the arrangement of human regions fusiform face area (FFA) and PPA (which are adjacent to each other in cortex), the presumptive monkey homolog of human PPA is located adjacent to the monkey homolog of human FFA, near the posterior superior temporal sulcus. Monkey TOS includes the region predicted from the human maps (macaque V4d), extending into retinotopically defined V3A. A possible monkey homolog of human RSC lies in the medial bank, near peripheral V1. Overall, our findings suggest a homologous neural architecture for scene-selective regions in visual cortex of humans and nonhuman primates, analogous to the face-selective regions demonstrated earlier in these two species.
Introduction
A sense of “place,” and the ability to recognize the environment and localize oneself within it, is crucial for survival in most animals. Although place-related cues take myriad forms across the animal kingdom, visual cues predominate in humans and other primates. In humans, functional MRI studies (Aguirre et al., 1996, 1998; Epstein and Kanwisher, 1998; Ishai et al., 1999; Maguire, 2001; Bar and Aminoff, 2003; Grill-Spector, 2003; Hasson et al., 2003) have described three visual cortical regions that are more active during the presentation of “places” (typically, scenes or isolated houses) compared with the presentation of other visual stimuli such as faces, objects, body parts, or scrambled scenes. Typically, these human brain regions are named for nearby anatomical landmarks as follows: (1) parahippocampal place area (“PPA”), (2) transverse occipital sulcus (“TOS”), and (3) retrosplenial cortex (“RSC”).
Although all three regions respond well to scenes, recent fMRI studies have revealed intriguing functional differences between them. For instance, PPA reportedly processes the visual-spatial structure of scenes (Epstein and Kanwisher, 1998), responding to changes in viewpoint and to scene novelty, but not during navigation tasks, whereas RSC responds in the opposite way (Epstein et al., 1999, 2003; Park and Chun, 2009).
Such evidence suggests that these regions form a network for scene processing, analogous to the well-known network for face processing. Based on human fMRI, this face-processing network includes several regions, including occipital face area (OFA), fusiform face area (FFA), and the anterior face region (Kanwisher et al., 1997; Grill-Spector et al., 2004; Rajimehr et al., 2009). Recent studies have revealed neurobiological mechanisms underlying this network by studying homologous regions in macaque monkeys (Tsao et al., 2003, 2008a; Rajimehr et al., 2009). Primate studies have shown that (1) at least some of these face-processing regions are anatomically interconnected, as shown by microstimulation combined with fMRI (Moeller et al., 2008); (2) these regions are organized hierarchically, based on physiological recordings (Freiwald and Tsao, 2010); and (3) this face-processing network extends to prefrontal cortex, as demonstrated by fMRI activation (Tsao et al., 2008b). Thus, studies of the face-processing network in monkeys have greatly expanded our understanding of the neurobiological substrates of face perception and recognition.
Analogously, our main goal here was to test for macaque homologs of human PPA, TOS, and RSC, to enable subsequent studies of scene-processing mechanisms in macaque cortex. To generate an optimal reference map, we first defined the precise locations of these regions in human cortex. These maps indicated that all three scene-selective regions are centered near but not on the sulci/gyri for which they were named. Moreover, these “scene-selective” regions are located in or adjacent to known retinotopic areas, including the lowest-tier areas V1 (adjacent to RSC) and V2 (adjacent to PPA). In macaques, the homolog of PPA is located adjacent to the FFA homolog, mirroring the topography of adjacent human regions FFA and PPA. The macaque fMRI also revealed a homolog of human TOS, which included V3A. Preliminary versions of this work have been presented previously (Devaney et al., 2008).
Materials and Methods
Human subjects
Seventeen normal human subjects (seven females; 22–33 years of age), with normal or corrected-to-normal vision, were tested in one to three experimental sessions each (Table 1). Written informed consent was obtained from each subject before the experiments. All experimental procedures were approved by Massachusetts General Hospital protocols.
Primate subjects
Seven juvenile male macaque monkeys (Macaca mulatta) were used in these studies (Table 2). Three of the monkeys (4–6 kg) were studied at the Massachusetts General Hospital (MGH), and four (5.0–8.5 kg) were studied at the National Institute of Mental Health (NIMH). Surgical details and the training procedures for the monkeys were similar across the two sites and described in detail previously (Vanduffel et al., 2001; Tsao et al., 2003; Bell et al., 2009). All experimental procedures conformed to NIH guidelines and were approved by experimental protocols at MGH and NIMH, respectively.
Monkeys
Human imaging
Human subjects were scanned in a horizontal 3 T Siemens Tim Trio MR imager at MGH. Gradient echo EPI sequences were used for functional imaging (TR, 2000 ms; TE, 30 ms; flip angle, 90°; 3.0 mm isotropic voxels; 33 axial slices). A 3D MP-RAGE sequence (1.0 mm isotropic) was used for high-resolution anatomical imaging from the same subjects.
Throughout the functional scans, all subjects continuously fixated a small fixation spot at the center of the visual display. To control attention level during the functional scanning, subjects reported an unpredictably timed color change for the fixation target, except as noted. Each session consisted of 10–15 functional runs, and each run contained 14 blocks (block duration, 16 or 24 s).
Primate imaging
All primates were implanted with an MR-compatible headpost and trained to work in the sphinx position in an MR-compatible horizontal restraint device. As in the human task, all monkey subjects were required to fixate a small spot at the center of the display screen, near continuously. Eye position was monitored using an infrared pupil tracking system (ISCAN). Monkeys were rewarded with water or juice for maintaining fixation within a square-shaped central fixation window (typically, 2 × 2° in size) surrounding the fixation spot.
MGH.
Primate scanning at MGH used the 3 T scanner described above. A gradient echo EPI sequence was used for functional imaging (TR, 2000 ms; TE, 19 ms; flip angle, 90°; 1.0 mm isotropic voxels; 50 axial slices). Each monkey session consisted of 20–25 functional runs, with each run containing 14 blocks (block duration, 30 or 40 s). Each monkey was scanned for two to five sessions, and data from all sessions were averaged together. To increase functional sensitivity in the monkey scans (in part, to compensate for smaller voxels in the smaller primate brains), we used a gradient insert coil (Siemens AC88), parallel imaging with a four-channel phased array coil, and an exogenous contrast agent [monocrystalline iron oxide nanoparticle (MION); 8–10 mg/kg, i.v.]. Previous studies (Vanduffel et al., 2001; Leite et al., 2002; Tsao et al., 2003) within the same animals have confirmed that MION and BOLD label corresponding cortical areas (Vanduffel et al., 2001; Leite et al., 2002), although within-area activity details may differ slightly (Smirnakis et al., 2007). For each monkey, structural scans were also acquired using a 3D MP-RAGE sequence (0.35 mm isotropic voxels), during anesthesia.
NIMH.
Imaging data were collected using a 3 T GE scanner. A gradient echo (EPI) sequence was used for functional imaging (TR, 2000 ms; TE, 17.9 ms; flip angle, 90°; 1.5 mm isotropic voxels; 27 coronal slices) with an eight-channel surface coil array, based on MION (7–11 mg/kg, i.v.). Each session consisted of 10–30 functional runs containing three blocks (block duration, 40 s). Each monkey was scanned for two sessions, and data from all sessions were averaged together. High-resolution T1-weighted whole-brain anatomical scans (voxel size, 0.5 mm³) were also acquired on a 4.7 T Bruker scanner with a modified driven equilibrium Fourier transform sequence.
Data analysis
For all human and monkey subjects, functional and anatomical data were preprocessed and analyzed using FreeSurfer (http://surfer.nmr.mgh.harvard.edu/). For each subject, the cortical surface was extracted and reconstructed, allowing analysis on both the “inflated” and “flattened” views.
All functional images were motion corrected, spatially smoothed (unless otherwise noted) using a 3D Gaussian kernel [2.5 mm half width at half-maximum (HWHM) in humans and 1 mm HWHM in monkeys] and normalized across scans. The estimated hemodynamic response was defined by a gamma function, and then the average signal intensity maps were calculated for each condition. Voxelwise statistical tests were based on a univariate general linear model. The significance levels were projected onto the inflated/flattened cortex after a rigid coregistration of functional and anatomical volumes. For monkey data, additional manual corrections were also applied to avoid possible misalignment between functional and structural scans. Using FreeSurfer, functional maps were spatially normalized across sessions (in monkeys) and across subjects (in humans and monkeys). Then, activity within individual human and monkey brains was transformed spatially onto the “averaged human” and “averaged monkey” brains, respectively (for details, see Fischl et al., 1999), and averaged using a fixed-effects model.
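As an illustration of the smoothing and GLM steps described above, the following is a minimal sketch rather than the FreeSurfer implementation used in the study. It converts the stated HWHM to a Gaussian standard deviation, builds gamma-convolved block regressors, and fits a univariate least-squares model to one synthetic voxel. The gamma parameters (`delay`, `tau`), the alternating scene/face block order, the 16 s block length chosen for the demo, and the synthetic voxel values are all assumptions made for the example.

```python
import numpy as np

def hwhm_to_sigma(hwhm_mm):
    """Convert a Gaussian half width at half-maximum to its standard deviation."""
    return hwhm_mm / np.sqrt(2.0 * np.log(2.0))

def gamma_hrf(t, delay=2.25, tau=1.25):
    """Gamma-shaped response; delay and tau are illustrative, not from the paper."""
    s = np.clip((t - delay) / tau, 0.0, None)
    return s**2 * np.exp(-s)

def block_regressor(onsets_s, block_s, n_vols, tr_s):
    """Boxcar for one condition, convolved with the gamma response at each TR."""
    t = np.arange(n_vols) * tr_s
    box = np.zeros(n_vols)
    for onset in onsets_s:
        box[(t >= onset) & (t < onset + block_s)] = 1.0
    h = gamma_hrf(np.arange(0.0, 30.0, tr_s))
    reg = np.convolve(box, h)[:n_vols]
    return reg / reg.max()

def fit_glm(y, X):
    """Ordinary least squares for one voxel; returns betas and residual variance."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - np.linalg.matrix_rank(X)
    return beta, resid @ resid / dof

def contrast_t(beta, sigma2, X, c):
    """t statistic for the linear contrast c applied to the fitted betas."""
    c = np.asarray(c, dtype=float)
    var = sigma2 * (c @ np.linalg.pinv(X.T @ X) @ c)
    return (c @ beta) / np.sqrt(var)

# One run: 14 blocks of 16 s at TR = 2 s; first and last blocks are null epochs.
tr_s, block_s, n_blocks = 2.0, 16.0, 14
n_vols = int(n_blocks * block_s / tr_s)
scene_onsets = [block_s * b for b in range(1, 13, 2)]  # assumed alternating order
face_onsets = [block_s * b for b in range(2, 13, 2)]
X = np.column_stack([
    block_regressor(scene_onsets, block_s, n_vols, tr_s),
    block_regressor(face_onsets, block_s, n_vols, tr_s),
    np.ones(n_vols),  # constant (baseline) term
])

# Synthetic voxel that prefers scenes; the amplitudes and noise are made up.
rng = np.random.default_rng(0)
y = 100.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.2, n_vols)
beta, sigma2 = fit_glm(y, X)
t_scene_vs_face = contrast_t(beta, sigma2, X, [1, -1, 0])
```

Under this conversion, the 2.5 mm HWHM used in humans corresponds to a standard deviation of about 2.12 mm (roughly 0.71 of a 3.0 mm voxel), and the 1 mm HWHM used in monkeys to about 0.85 mm.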
As noted in different analyses, the averaged human cortical surface was based on either the 10 subjects participating in our main study or 40 independent human subjects (FreeSurfer). For all monkeys, we generated an averaged anatomical surface based on the four NIMH monkeys and projected the averaged activity onto those anatomical maps.
In human subjects, flattened maps were generated using largely automated routines in FreeSurfer. These procedures automatically created a number of cuts around the medial aspect of the inflated surface: one in a region around the corpus callosum to remove all midbrain structures, one down the fundus of the calcarine sulcus, a set of equally spaced radial cuts, and a sagittally oriented cut around the temporal pole. The resulting cut surface was projected onto a plane that was oriented perpendicular to the average surface normal at each cortical site. Further details of these procedures have been described previously (Fischl et al., 1999).
Visual stimuli
For all experiments (human and macaque) at MGH, stimuli were presented via an LCD projector (Sharp; 1024 × 768 pixel resolution, 60 Hz refresh rate) onto a rear-projection screen using a PC. MATLAB 7.0 and Psychophysics Toolbox (Brainard, 1997; Pelli, 1997) were used to program the experiments. The stimuli were presented in a blocked design. Within a given functional scan, the first and last blocks were always null epochs (i.e., a fixation point on a black background), to allow the hemodynamic response to reach a steady state. The remaining stimulus blocks were ordered pseudorandomly, without a rest period between them. Within each block, stimuli (see below) were presented for 1 s.
Corresponding stimulus presentation details were similar for the monkeys tested at NIMH. There, stimuli were presented via a Sharp Notevision3 projector (resolution, 1024 × 768), via Presentation software (12.2). Each block lasted 40 s, during which 20 images were presented for 2 s each, alternating with 20 s fixation blocks (neutral gray background). Individual scanning runs began and ended with a block of baseline fixation.
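The MGH block ordering described above (null epochs first and last, stimulus blocks pseudorandomly ordered in between) can be sketched as follows. This is only an illustration, not the MATLAB/Psychophysics Toolbox code used in the experiments; the function name `make_run`, the condition labels, and the assumption that stimulus blocks are split evenly among conditions are hypothetical.

```python
import random

def make_run(conditions, n_blocks=14, seed=None):
    """Return one run's block sequence: null first and last, shuffled stimulus blocks between."""
    n_stim = n_blocks - 2  # the first and last blocks are reserved for null epochs
    if n_stim % len(conditions) != 0:
        raise ValueError("stimulus blocks must divide evenly among conditions")
    # Assumption: each condition gets the same number of blocks per run.
    blocks = list(conditions) * (n_stim // len(conditions))
    random.Random(seed).shuffle(blocks)  # seeded, hence reproducible pseudorandom order
    return ["null"] + blocks + ["null"]

run = make_run(["scene", "face"], seed=1)
```

Passing a different `seed` per run gives a different but reproducible order, which is one simple way to vary block order across the 10–15 runs of a session.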
Specific stimuli, human subjects
Scenes.
We used four different sets of scenes. Image set 1 included achromatic (gray-scaled) scenes, including 23 images of furnished or empty rooms, and 23 outdoor scenes (cities or natural landscapes). Set 2 included eight naturally colored images of the scanning rooms that were all familiar to the subjects (Rajimehr et al., 2009). Set 3 was eight achromatic scenes of familiar locations outside the scanning rooms, including both indoor and outdoor images. Set 4 included eight achromatic scenes of unfamiliar places, including both indoor and outdoor images.
Faces.
Three different sets of face images were used in this experiment. Image set 1 was 23 images of individual faces (contrasted with scene set 1). Set 2 included eight colored face mosaics that included multiple equal-sized faces adjacent to each other (contrasted with scene set 2) (Rajimehr et al., 2009), of equal retinotopic extent to the scene set. Set 3 included computer-generated (FaceGen) faces, similar to those used by Yue et al. (2011).
Additional category-related images.
Set 1 included eight unfamiliar computer-generated objects (“blobs”) (Yue et al., 2011). Set 2 was eight images of tools (Bell et al., 2009). Set 3 included eight scrambled versions of the scene stimuli. The scrambled images were based on perturbing a random noise field at different scales, to match the original image statistics (Portilla and Simoncelli, 2000).
Retinotopic mapping.
To map the retinotopic organization within the central approximately one-half of the cortical representations (10° radius in the visual field), we used two complementary sets of retinotopic stimuli. Set 1 was scenes and face mosaics (face set 2), which were presented within retinotopically limited apertures, on a black background. The retinotopic apertures included (1) a foveal disk (1.5° radius), (2) a peripheral annulus (5° inner radius and 10° outer radius), (3) an upper vertical meridian wedge (10° radius and 60° angle), (4) a lower vertical meridian wedge (10° radius and 60° angle), (5) a left horizontal meridian wedge (10° radius and 30° angle), and (6) a right horizontal meridian wedge (10° radius and 30° angle). Set 2 was phase-encoded, contrast-reversing (1 Hz) checkerboards within continuously rotating rays or continuously expanding/contracting ring stimuli, as described previously (Sereno et al., 1995; Tootell et al., 1997).
In one subject, we also mapped the representation of the far peripheral visual field, using radially scaled, contrast-reversing checkerboards presented at a range of eccentricities from 70° to (and beyond) the visible limits of the visual field, centered on the vertical and horizontal meridians (retinotopic set 3).
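The six retinotopic apertures of set 1 can be expressed as boolean masks over visual-field coordinates. The sketch below is illustrative only: the grid size, the function names, and the assumption that each wedge is centered on its meridian (so that a 60° wedge spans ±30° around it) are not stated in the text.

```python
import numpy as np

def polar_grid(half_extent_deg=12.0, n=241):
    """Eccentricity (deg) and polar angle (deg; 0 = right horizontal meridian, 90 = up)."""
    ax = np.linspace(-half_extent_deg, half_extent_deg, n)
    x, y = np.meshgrid(ax, ax)
    return np.hypot(x, y), np.degrees(np.arctan2(y, x))

def disk(ecc, radius):
    """Central disk out to the given radius."""
    return ecc <= radius

def annulus(ecc, inner, outer):
    """Ring between the inner and outer radii."""
    return (ecc >= inner) & (ecc <= outer)

def wedge(ecc, ang, center_deg, width_deg, radius):
    """Wedge of total angular width width_deg, assumed centered on center_deg."""
    # Signed angular distance to the wedge axis, wrapped into [-180, 180).
    d = (ang - center_deg + 180.0) % 360.0 - 180.0
    return (np.abs(d) <= width_deg / 2.0) & (ecc <= radius)

ecc, ang = polar_grid()
apertures = {
    "fovea": disk(ecc, 1.5),
    "periphery": annulus(ecc, 5.0, 10.0),
    "upper_vm": wedge(ecc, ang, 90.0, 60.0, 10.0),
    "lower_vm": wedge(ecc, ang, -90.0, 60.0, 10.0),
    "left_hm": wedge(ecc, ang, 180.0, 30.0, 10.0),
    "right_hm": wedge(ecc, ang, 0.0, 30.0, 10.0),
}
```

Multiplying a scene or face image by one of these masks (with a black background elsewhere) would give a retinotopically limited stimulus of the kind described, up to the assumptions noted.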
Specific stimuli, monkey subjects, MGH
Stimuli were identical to the human scene set 2, face set 2, and retinotopic set 1, described above.
Specific stimuli, monkey subjects, NIMH
The stimuli used at NIMH were achromatic photographs from three image categories, all relatively familiar to the monkeys. Set 1 was individually presented monkey faces, from the local colony. Set 2 was scenes of the NIMH scanning, training, and housing rooms. Set 3 was objects from those environments. Retinotopic stimuli were not used in the monkeys at NIMH.
Results
Overall
Figure 1A–E illustrates the group-averaged scene-selective activity from the main group of human subjects (n = 10; Table 1), using faces as control images, in the folded (Fig. 1A,B), inflated (Fig. 1C,D), and flattened (Fig. 1E) cortical surfaces. Consistent with previous studies (Epstein et al., 2007; Park and Chun, 2009), we found significantly higher responses to scenes in three main regions, bilaterally, in the vicinity of (1) PPA, (2) TOS, and (3) RSC.
Overall view of scene-selective regions in human and monkey visual cortex. Both species fixated the center of a screen during block-designed presentation of identical scene versus face localizing stimuli. In the human data (A–E), relatively higher activity to scenes versus faces is shown in red/yellow versus blue/cyan, respectively (minimum, p < 10⁻¹⁰; maximum, p < 10⁻³⁰). The human data are a group average of both the functional and the anatomical data (n = 10), in cortical surface format. The right hemisphere is illustrated; data from the left hemisphere were similar. A and B show the medial and lateral-posterior views of folded cortex, respectively. C and D show corresponding views of inflated cortical surfaces, and E shows the flattened view. For comparison, F shows the comparable flattened activity map from a macaque monkey, viewing the same stimuli (minimum, p < 10⁻⁵; maximum, p < 10⁻¹⁰). In both species, presumptively corresponding scene-selective regions are named in white (preceded by “m” in the macaque map). As a reference, the black asterisks indicate FFA in humans (E) and its counterpart in macaque (F); these are already known to correspond to each other (Tsao et al., 2003; Rajimehr et al., 2009).
For comparison, Figure 1F shows an fMRI map from an awake fixating macaque monkey, in response to the same stimuli, displayed in the same cortical surface format. As in humans, multiple scene-biased regions were evident in the macaque. Regions that appear to correspond in the two species (presumptive homologs) are named accordingly in white (Fig. 1, compare E, F). Below, this putative map correspondence was tested in detail.
For simplicity and historical continuity, we used the original names for the human scene-selective regions PPA (Epstein et al., 1999), TOS (Grill-Spector, 2003), and RSC (Maguire, 2001). We also extended the original naming scheme to indicate presumptive monkey homologs of these areas, by adding “m” (i.e., mPPA, mTOS, mRSC). However, because the present evidence revealed inaccuracies in all these names, a new set of names is proposed in Discussion, which remains correct across both human and macaque cortex.
Human fusiform anatomy
To clarify the functional maps of PPA, it is helpful to first document a detail in the anatomical maps. Generally, the fusiform gyrus is described as a single uninterrupted gyrus (Polyak, 1957; Duvernoy, 1999). However, one group (Chao et al., 1999; Haxby et al., 1999) distinguished fMRI activity on the “medial fusiform” gyrus from that on the “lateral fusiform” gyrus. Here, we found that this functional subdivision has a rough anatomical correlate: the central portion of the fusiform gyrus is usually split along its length by a shallow sulcus. We named this the “middle fusiform sulcus,” separating the “medial fusiform” gyrus from the “lateral fusiform” gyrus.
Figure 2 shows this anatomical feature in the averaged MRI-based cortical surfaces from two independent subject pools: (1) the current group average (n = 10; Fig. 2A,B) and (2) the averaged surfaces from the standard FreeSurfer average brain (n = 40; Fig. 2C,D). This cortical surface analysis averages the cortical folding pattern (i.e., the gyri and sulci) without conventional volumetric (3D) blurring. However, note that the cortical folds in each individual surface are best fit to the group-averaged folding pattern, so the individual maps are subject to minor 2D misalignment relative to the group map (Fischl et al., 1999).
Evidence for a middle fusiform sulcus in the averaged human cortical surface maps. A shows the averaged surface from the present data (n = 10), and B shows the magnified inset from the same data. C shows the averaged map from an independent subject pool (standardized FreeSurfer surface; n = 40), and D is the magnified inset. Gyri are abbreviated in green (PH, parahippocampal; L, lingual; TO, temporal occipital; IT, inferior temporal; MF, medial fusiform; LF, lateral fusiform). Sulci are abbreviated in red (C, collateral; TO, temporal occipital; IT, inferior temporal). The right hemisphere is illustrated; data were similar in the left hemisphere. In both hemispheres, and in both cortical surfaces, the fusiform gyrus is subdivided into two parallel branches by the middle fusiform sulcus (MFS) (white arrowhead).
The middle fusiform sulcus (white arrowhead) is apparent in both group-averaged cortical surfaces (Fig. 2B,D). In the n = 40 surface, the middle fusiform sulcus is only 2.5 mm deep, thus ∼5 mm across the cortical surface. By contrast, the two sulci defining the external border of the fusiform gyrus (i.e., collateral and temporal occipital sulci) are much deeper, with maximum depth of ∼10 and 6 mm, respectively. In our n = 17 subject pool, the group values were similar: the depth and length of the middle fusiform sulcus in those individual surfaces ranged from 2 to 5 mm and 8 to 54 mm, respectively.
To confirm the presence of this sulcus in actual brains, we examined ex vivo brains from human autopsy. A middle fusiform sulcus was present in 20 of 24 hemispheres examined (83%). Examples are shown in Figure 3.
Confirmation of a middle fusiform sulcus in individual ex vivo human brains. A–C show ventral views of the temporal lobe in three right hemispheres (anterior, rightmost; lateral, uppermost). Sulci and gyri are indicated as in Figure 2. In all three examples, a clear middle fusiform sulcus (MFS) (labeled in white) divides the fusiform gyrus into two branches.
Human PPA
Figure 4 shows the location of scene-selective activity in this region (PPA), from the main human dataset (n = 10), based on group-averaged maps of the anatomy and function from a common set of subjects. Also shown is the center of fMRI activity (the voxel showing the highest statistical bias for scenes) in the group average (Fig. 4C) and in the individual data comprising our group map (Fig. 4D). Counter to expectations, we found that this ventral scene-selective region (the “parahippocampal place area”) was not centered on the parahippocampal gyrus. Instead, it was consistently centered near the lateral lip of the collateral sulcus, where it meets the medial fusiform gyrus, in both our group-averaged and the individual maps, in both hemispheres. Of course, a lower-amplitude activity bias could extend onto the parahippocampal gyrus, depending on the statistical threshold chosen, the levels of signal averaging and spatial filtering, and variations between individuals.
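The "center" used here, the location with the highest statistical bias for scenes, amounts to an arg-max over a significance map. A minimal sketch follows, assuming a per-vertex map in which positive values indicate scene > face and negative values indicate face > scene (for example, a signed −log10(p)); the function name, the optional region mask, and the toy values are hypothetical and not data from the study.

```python
import numpy as np

def peak_vertex(sig_map, roi_mask=None):
    """Index of the vertex with the strongest scene bias (largest signed significance)."""
    sig = np.asarray(sig_map, dtype=float)
    if roi_mask is not None:
        # Exclude vertices outside the region of interest from the search.
        sig = np.where(np.asarray(roi_mask, dtype=bool), sig, -np.inf)
    return int(np.argmax(sig))

# Toy values only: vertex 2 is strongly face-biased and must not be selected.
toy_sig = [1.0, 12.5, -30.0, 8.0]
center = peak_vertex(toy_sig)
center_in_roi = peak_vertex(toy_sig, roi_mask=[True, False, True, True])
```

Because the map is signed, a strongly face-selective vertex is never mistaken for the scene-selective peak, even when its absolute significance is larger.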
Scene-selective activity from the present experiment was consistently centered on the lip of the medial fusiform gyrus and collateral sulcus. The right hemisphere is illustrated; data were similar in the left hemisphere. A, Group-averaged activity in response to the scene-versus-face (i.e., PPA vs FFA) localizing stimuli (scenes, red/yellow; faces, blue/cyan), rendered on the group-averaged cortical surface (minimum, p < 10⁻¹⁰; maximum, p < 10⁻⁵⁰). B shows the location of the inset on the medial view of the entire hemisphere. The white circle in C shows the centroid (the vertex showing the highest scene bias) in the group-averaged map of PPA. Sulci (C, collateral; MF, middle fusiform) and gyri (PH, parahippocampal; MF, medial fusiform; LF, lateral fusiform) are abbreviated. D shows the location of the centroid in each individual case; all are located on the lip of the collateral sulcus, on the fusiform side.
Role of stimulus variations
There is no single, quantifiable stimulus comparison for localizing PPA. Instead, different studies have localized this region based on correspondingly different scenes or houses, contrasted with various sets of faces, objects, body parts, and/or scrambled scenes. Thus, it could be argued that the location of PPA varies with the stimuli used to localize it. This could occur if the optimal stimuli vary continuously (instead of area-wise) across the cortical sheet (Wang et al., 1996) (but see Tootell et al., 2008). Alternatively, it could occur in some models of a distributed representation (Ishai et al., 2000a,b). Conceivably, either of these hypotheses could explain the presence here of a scene-selective patch of activity located lateral to, rather than on, the parahippocampal gyrus.
To address this, we directly tested whether the location and topography of PPA varies due to corresponding stimulus variations. Figure 5 shows the results produced by four different sets of scenes versus natural and computer-generated faces, objects, or scrambled scenes (see Materials and Methods). Despite these wide stimulus variations, the topography of PPA remained remarkably constant in comparisons within a common subject pool.
The topographical shape and center of PPA remain essentially constant, even when produced by different stimuli. All panels show group-averaged scene-selective activity (red/yellow) in medial views of the right hemispheres, equivalent to the views in Figure 4. A–C show activity in one set of subjects (n = 4 from group 1; Table 1), and D–F show activity in a different set of subjects (n = 7 from group 2; Table 1); relevant comparisons are between panels from a common row (A–C or D–F). The color scaling was adjusted to topographically comparable levels, to offset differences in statistical power in each comparison, as a result of differences in subject number and variations in stimulus effectiveness (A, B, minimum, p < 10⁻¹⁰; maximum, p < 10⁻⁵⁰; C, minimum, p < 10⁻³; maximum, p < 10⁻¹⁶; D, E, minimum, p < 10⁻⁵; maximum, p < 10⁻³⁰; F, minimum, p < 10⁻²; maximum, p < 10⁻¹⁰). A shows activation produced by localizing stimuli analogous to those used by Epstein and Kanwisher (1998): a wide range of scenes (scene image set 1: indoor rooms, empty rooms, outdoor city scenes, and landscapes) was contrasted with individual faces (image face set 1). B shows activity in response to a more limited set of scenes (scene set 2, indoors, from the scanning laboratory), compared with images of groups of faces (face set 2). The scene and face stimuli in C were equivalent to those used in B, except they were summed from three retinotopically limited ring apertures, at complementary ring diameters (retinotopic set 1). D shows the activity produced by a different set of scenes (set 3; indoor and outdoor), compared with computer-generated faces (face set 3). E shows the activity produced by a different set of scenes (scene set 4, indoor scenes from an unfamiliar site), compared with arbitrary computer-generated objects (blobs) (Yue et al., 2011). F shows activity produced by scene set 3, compared with scrambled versions of the same scene stimuli, based on perturbing a random noise field at different scales to match the original image statistics (Portilla and Simoncelli, 2000).
Thus, the unexpected localization of the scene-selective region here (away from the crown of the parahippocampal gyrus) cannot be attributed to stimulus differences between the current versus past studies. Instead, these results in PPA are fully consistent with results in classic lower-level visual areas, such as V1, V2, MT: none of these areas changes shape or moves across the cortical map, dependent on object stimulus variations.
Meta-analysis
How does the unexpected PPA localization here compare with analogous localizations in the literature? To clarify this, the following meta-analysis was conducted. The centers of previously published scene-biased activity in this region were translated onto a common, standardized cortical surface (using FreeSurfer and its averaged human brain) based on Talairach coordinates (Talairach and Tournoux, 1988) reported in previous studies (Table 3). Coordinates were found in 12 neuroimaging comparisons of scenes or buildings, relative to faces, objects, or scrambled scenes. Each study was assigned a character, and that distribution is shown in Figure 6. Eleven studies were based on fMRI; one was based on PET.
References for meta-analysis
A meta-analysis of the previous literature localizes PPA to the lip of the collateral sulcus/medial fusiform gyrus, consistent with the current data (Figs. 4, 5). Data are rendered on a standardized cortical surface (FreeSurfer). B and D show the whole normal and inflated cortical surfaces, respectively. A and C show the magnified insets. The centers of PPA activity in previous publications have been translated to this common surface based on Talairach coordinates (indicated with white letters in A and C). When distinguished in the individual reports, coordinates from left and right hemispheres were averaged together, and displayed here on the right hemisphere. The references corresponding to those letters are given in Table 3. For comparison, an asterisk marks the center of PPA in the current dataset (from Fig. 4C). None of these centers was located on the crown of the parahippocampal gyrus, but five were located on the crown of the medial fusiform gyrus. Remaining centers were located on the lips (but not the fundus) of the collateral sulcus.
The averaged center of PPA in the current data (asterisk, from Fig. 4C) lies squarely in the middle of these previously published sites; thus, our data were representative. Among the previously published sites, five were located on the crown of the medial fusiform gyrus, but none was on the crown of the parahippocampal gyrus. This and prior descriptions (Haxby et al., 1999; Levy et al., 2004) suggest that the parahippocampal place area is not centered on the parahippocampal gyrus (see Discussion); instead, it is located lateral to that gyrus. However, as noted above, submaximal activity beyond the center can extend onto adjacent regions of the cortical surface (including the parahippocampal gyrus), depending on thresholding and related factors.
All but one of the remaining sites were located along the lip of the collateral sulcus, which divides the medial fusiform gyrus from the parahippocampal gyrus. The confluence of published centers along the lip (but not within the depth) of the collateral sulcus may reflect signal contributions from the large vein that overlies the collateral sulcus (Menon et al., 1993; Kim et al., 1994), in addition to signal contributions arising from the gray matter itself.
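One step of the meta-analysis, averaging left- and right-hemisphere Talairach peaks from a single report and displaying the result on the right hemisphere, can be sketched as below. The mapping of volume coordinates onto the FreeSurfer surface is not reproduced here. The function names are hypothetical, and the coordinates are placeholders, not values from Table 3 or from any cited study.

```python
import numpy as np

def mirror_to_right(xyz):
    """Reflect a Talairach coordinate across the midline so that x is positive (right side)."""
    x, y, z = xyz
    return np.array([abs(x), y, z], dtype=float)

def study_center(left=None, right=None):
    """Average one study's left and right peaks after mirroring; either may be missing."""
    peaks = [mirror_to_right(p) for p in (left, right) if p is not None]
    if not peaks:
        raise ValueError("need a peak from at least one hemisphere")
    return np.mean(peaks, axis=0)

# Placeholder coordinates in mm, NOT taken from any publication.
example_center = study_center(left=(-26.0, -44.0, -8.0), right=(28.0, -42.0, -10.0))
```

A study that reports only one hemisphere is simply mirrored (if needed) and used as-is, which matches the handling implied by "when distinguished in the individual reports".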
The macaque homolog of human FFA
Human PPA is located immediately adjacent to FFA in the cortical sheet, on the medial side, on the side closest to the splenium of the corpus callosum. Thus, any candidate homolog of PPA in the monkey (“mPPA”) should also lie immediately adjacent to monkey FFA (mFFA), on the side closest to the splenium.
To test that prediction, it was necessary to first localize mFFA as a reference landmark. Previously (Tsao et al., 2003; Rajimehr et al., 2009), the location of mFFA was defined based on quantitative transformation of cortical areas in the human and macaque maps, using fMRI and equivalent stimuli, based on maps from individual monkeys. In both reports, mFFA is the large, high-amplitude, face-selective patch located approximately midposteriorly along the length of the superior temporal sulcus, extending from the ventral bank onto the lip of the middle temporal gyrus (Fig. 1F, black asterisk).
However, additional face-responsive patches have also been reported in this cortical region, which might confuse the accurate localization of mFFA. In the simplest account, both monkeys and humans have two main face patches in corresponding cortical regions of each hemisphere, with the more posterior patch comprising (m)FFA (Pinsk et al., 2005, 2009; Hadj-Bouziane et al., 2008; Bell et al., 2009; Rajimehr et al., 2009). Another account is more complex: the monkey has either three (Tsao et al., 2003) or six (Tsao et al., 2008a) face patches in each hemisphere, whereas humans have three (Tsao et al., 2008a) in this occipito-temporal region.
It is possible that this discrepancy arises in part from variation in the individual maps chosen for illustration. To date, group-averaged maps have not been calculated for the monkey face patches, which would reduce or eliminate such individual variations. To remedy this, group-averaged maps were first calculated from the fMRI data from three monkeys (Table 2) used throughout this study, based on the same localizing stimuli used in human subjects (faces vs scenes). In the monkey experiments, we used an exogenous contrast agent (MION) (see Materials and Methods), which increased the spatial specificity of the MRI signal compared with the more conventional BOLD signal used in human studies (Mandeville and Marota, 1999; Vanduffel et al., 2001; Leite et al., 2002). These averaged data showed two main face patches in each hemisphere (Fig. 7), consistent with those described earlier (Pinsk et al., 2005, 2009; Hadj-Bouziane et al., 2008; Bell et al., 2009; Rajimehr et al., 2009) (Fig. 1E).
Group-averaged cortical surface maps showing a macaque homolog of human PPA. The data show the averaged fMRI activity (n = 3) from fixating monkeys presented with localizing stimuli identical to those used in the human maps (scene set 2 vs face set 2). A and B show activity in the folded cortical surface, in the right and left hemispheres, respectively. The left hemisphere has been mirror-reversed for ease of comparison. C and D show the same data in flattened cortical format. Scene-biased activity (red/yellow) is shown across the entire cortex (minimum, p < 10^−12; maximum, p < 10^−24). The peaks of face-biased activity in the posterior (“mFFA”) and anterior (“mATFP”) face patches are indicated with a black and a white asterisk, respectively. Two patches of scene-biased activity are evident in these group-averaged data: one dorsal and another ventral. A consistent ventral scene patch (mPPA) is located adjacent and ventral to the posterior face patch, analogous to the relationship between FFA and PPA in humans. The dorsal scene-biased patch (mTOS) is the presumptive macaque homolog of human TOS.
To confirm this finding, we calculated a second group-averaged map based on an additional and independent set (n = 4) of monkeys. This second set of activity maps was generated in a different laboratory (NIMH), using a different scanner, based on stimuli that were familiar to the monkeys (i.e., faces of conspecifics, scenes and objects from the laboratory)—as opposed to stimuli that were matched to the human localization studies, as tested first. Despite these technical differences, again the group averages showed two main face patches (Fig. 8, black asterisks and arrowheads), as expected from previous reports (ibid).
Group-averaged map of mPPA and mFFA from an independent set of experiments. This group average (n = 4) was based on fMRI acquired at NIMH, using an independent set of stimuli (monkey faces, and scenes and objects from the housing and training environment of the monkeys at NIMH). A and C show activity in the right hemisphere; B and D show the left hemisphere, mirror-reversed for ease of comparison. A and B show the contrast of scenes (red/yellow) versus faces (blue/cyan) (minimum, p < 10^−6; maximum, p < 10^−12). As a control, C and D show the contrast of scenes versus [faces + objects]/2. In these NIMH data, the slice prescription did not include regions posterior to the superior temporal sulcus, to the left of the dashed line (e.g., V1, V3A). Despite these experimental differences, the overall results shown in Figure 7 were confirmed as follows: (1) two face patches were found in the expected locations [asterisk above the posterior patch (mFFA); anterior patch indicated with a black arrowhead], and (2) a scene-biased patch is evident immediately ventral to the posterior face patch (white arrowheads), consistent with that in Figure 7, and with the position of FFA and PPA in the human map. Anterior to that, additional scene-biased activity was also found. When objects were included as a stimulus contrast (C, D), an additional (“third”) face patch could be seen, depending on the level of thresholding. However, additional evidence suggests that this patch reflects retinotopic differences, rather than differences related to semantic category.
At lower thresholds, additional, smaller face-biased patches were sometimes found within a given monkey, as described previously (Tsao et al., 2008a; Ku et al., 2011). However, the presence and location of such additional patches varied across animals, dependent on threshold level and other factors. Accordingly, those patches did not survive group averaging. Note also that face-selective activity in mFFA sometimes extended farther posteriorly (in or near V4d) as in the human maps (Fig. 1F). However, in both species, retinotopic maps from the same subjects suggest that this variable posterior activity reflects a difference in stimulus size/position, not necessarily face selectivity per se.
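The effect of group averaging on inconsistent patches can be illustrated with a toy sketch. The code below is not the analysis pipeline used in this study. It assumes a hypothetical per-vertex selectivity score for each animal, a simple mean across animals, and an arbitrary threshold (all values are invented), purely to show why a patch that occupies the same location in every animal survives averaging, whereas patches whose location varies across animals do not.

```python
# Toy illustration only: hypothetical per-vertex face-selectivity scores
# for three animals on a shared 8-vertex surface (all values invented).
from statistics import mean

THRESHOLD = 3.0  # arbitrary threshold chosen for this sketch

subject_maps = [
    [0.2, 5.1, 0.1, 4.0, 0.3, 0.2, 0.1, 0.0],  # extra patch at vertex 3
    [0.1, 4.8, 0.3, 0.2, 0.1, 4.2, 0.2, 0.1],  # extra patch at vertex 5
    [0.3, 5.3, 0.2, 0.1, 0.2, 0.1, 3.9, 0.2],  # extra patch at vertex 6
]


def suprathreshold(scores, threshold=THRESHOLD):
    """Return the indices of vertices whose score exceeds the threshold."""
    return [i for i, score in enumerate(scores) if score > threshold]


def group_average(maps):
    """Average the scores vertex by vertex across subjects."""
    return [mean(vertex_scores) for vertex_scores in zip(*maps)]


individual_patches = [suprathreshold(m) for m in subject_maps]
group_patches = suprathreshold(group_average(subject_maps))

print(individual_patches)  # one consistent vertex plus one variable vertex per animal
print(group_patches)       # only the consistent vertex remains
```

In this sketch, each animal shows a second suprathreshold vertex, but only the vertex shared by all three animals exceeds the threshold in the averaged map.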
Macaque PPA
Based on the cortical maps, a candidate mPPA should lie adjacent to this main face patch (mFFA) in the monkey cortical map, analogous to the relationship of FFA to PPA in the human map. Thus, in macaques, mPPA should lie on the crown of the middle temporal gyrus, slightly anterior and ventral to the posterior middle temporal sulcus.
Such a result has been shown in individual maps from two monkeys (Rajimehr et al., 2011). Here, that initial finding was confirmed in both sets of group-averaged data (Figs. 7, 8). In one hemisphere, scattered regions of scene-biased activity also extended into the region of the occipito-temporal sulcus (Fig. 7D). However, the latter activity was inconsistent in location, relative to the consistent peak of scene-selective activity in mPPA, in all four averaged hemispheres (Figs. 7, 8).
Human TOS
In humans, an additional focus of scene-selective activity is found in dorsal occipital cortex (Nakamura et al., 2000; Grill-Spector, 2003; Hasson et al., 2003; Epstein et al., 2005; Park and Chun, 2009) (Figs. 1, 9). Depending on experimental details, that dorsal patch can be as prominent as the one in PPA, in both amplitude and topographical extent. However, this dorsal occipital patch has received relatively little attention.
In human cortex, the scene-selective region TOS is centered on the lateral occipital gyrus. All panels show the right hemisphere from a posterior-lateral viewpoint, as in Figure 1, B and D. A–C are cortical surface maps in the folded format, and D–F show the same region in inflated surface format. A, B, D, and E show the right hemisphere, and C and F show the left hemisphere after mirror reversal. In A and D, gyri are abbreviated in green (LO, lateral occipital; IPP, inferior posterior parietal; A, annectant; MT, middle temporal), and sulci are abbreviated in red (TO, transverse occipital; IP, inferior parietal; L, lateral; LO, lateral occipital; IO, inferior occipital; PTO, posterior temporal occipital; ST, superior temporal; MT, middle temporal). The cyan lines in the remaining panels indicate the threshold (p < 10^−10) of group-averaged (n = 10) scene-biased activity in this region (TOS). The white circles indicate the vertex showing the highest activity in the group map; on average (combining results from both hemispheres), this point lies along the crown of the lateral occipital gyrus.
In the original report, the dorsal patch of scene-selective activity was localized on the transverse occipital sulcus; thus, it was named “TOS.” However, before that time, a classic retinotopically defined area (“V3A”) was also localized on the transverse occipital sulcus (Tootell et al., 1997). Thus, either (1) the transverse occipital sulcus spans both activity-defined areas (i.e., V3A plus TOS), (2) the TOS region coincides with (or includes) V3A, or (3) the original localization of TOS is incorrect.
Our evidence supports the third hypothesis. When averaged across subjects and hemispheres, this scene-selective patch (TOS) was centered on the crown of the lateral occipital gyrus (Fig. 9), anterior and ventral to the transverse occipital sulcus. As in PPA, the centers of highest activity occurred on the edges of this gyrus, consistent with a contribution from the large veins overlying the adjacent sulci.
Human area V3A is easily defined based on retinotopic mapping stimuli, because it has a distinctive map of the complete contralateral visual field (Tootell et al., 1997). In Figure 10, we localized the scene-selective TOS region relative to retinotopically defined area V3A, within all hemispheres in which V3A was unambiguously defined, based on two retinotopic criteria: (1) upper versus lower field subdivisions and (2) horizontal versus vertical meridians (see Materials and Methods).
In humans, scene-selective TOS is consistently located immediately anterior and ventral to retinotopically defined V3A. A–D show data from the flattened right hemisphere of a single subject. A is a summary map of retinotopic areas (solid black lines, vertical meridian; dotted black lines, horizontal meridian; dashed black lines, foveal representation), scene-selective regions (labeled in yellow) and face-responsive regions (labeled in cyan), in one hemisphere. B–D show the original maps on which A is based. B, The upper versus lower field retinotopy. C, The vertical versus horizontal meridian retinotopy. D, Scenes versus faces, with overlaid retinotopy. E–I show the combined maps (as in D) from five additional hemispheres. In all six hemispheres, the peak of scene-selective TOS is located immediately anterior and ventral to retinotopically defined area V3A. For all panels, minimum, p < 10^−12, and maximum, p < 10^−24.
These data confirmed that TOS is consistently located immediately anterior and ventral to V3A, and dorsal to the confluent foveal representations in V1 through V3 (Fig. 10). Thus, TOS lies within explicitly retinotopic cortex—extending from V7 (Tootell et al., 1998) through V3B (Press et al., 2001) and LO-1 (Larsson and Heeger, 2006).
Macaque TOS
Next, we tested whether a TOS homolog (“mTOS”) exists in macaque visual cortex. When translated from the human maps to the macaque maps, a homolog for human TOS should lie immediately anterior to macaque V3A (Gattass et al., 1988), in macaque “V4d,” and/or the newly described retinotopic representations CIP-1, CIP-2 (Arcaro et al., 2011), and perhaps also the DP (dorsal prelunate) gyrus (Andersen et al., 1990; Heider et al., 2005).
However, this specific human-to-monkey prediction is complicated by the existing maps of macaque V3A, which are not perfectly clear. The original single-unit maps of V3A frequently showed a representation of the contralateral 180° on the anterior bank of the lunate sulcus, posterior to the prelunate gyrus (Van Essen and Zeki, 1978; Gattass et al., 1988). However, in some animals, the anterior (upper field) representation in V3A was less certain (Gattass et al., 1988). A similar uncertainty can be seen in fMRI maps of V3A in some macaques (Fig. 11, upper field representation). When defined by variations in polar angle, the fMRI maps of V3A in macaque consistently extend over the prelunate gyrus (Arcaro et al., 2011) (Fig. 11).
Within-hemisphere comparisons of retinotopy and scene-selective activity in macaque TOS. The format is similar to that in Figure 10. A–C are taken from a single hemisphere, analogous to Figure 10B–D. A is a retinotopic map produced by vertical versus horizontal meridians (blue/cyan vs red/yellow, respectively). B, Upper versus lower visual field retinotopy (red/yellow vs blue/cyan, respectively). In A and B, minimum, p < 10^−5, and maximum, p < 10^−10. C, Activity due to scenes versus faces (red/yellow vs blue/cyan). D–F show the combined maps as in C, from three additional hemispheres. In C–F, minimum, p < 10^−5, and maximum, p < 10^−20. All maps are shown in right hemisphere format. All scene versus face stimuli were identical with those used in Figure 10.
In all three animals in which the MR slice prescription included this region (MGH), we found patches of scene-selective activity in this general location, extending variably across both sides of the prelunate gyrus (Figs. 1F, 7, black arrows). In two monkeys, we were also able to map the retinotopy (Fig. 11). Direct comparison between the scene-biased and retinotopic maps showed that mTOS included area V4d, which is roughly the topographic equivalent of human areas V7, V3B, and LO-1 (Fig. 11C,D,F). However, in macaques, this scene-selective activity also extended into area V3A, with some variability. In one hemisphere, mTOS was mainly in area V3A without any clear activity in area V4d (Fig. 11E). Thus, mTOS activity included V3A (as defined by the polar angle), plus areas more anterior to V3A (as in human TOS). Given the uncertainty in the definition of macaque V3A, it seems likely that the macaque TOS is homologous with human TOS.
Human RSC
A third patch of scene-selective fMRI activity was noted in human studies (Maguire et al., 1998; O'Craven and Kanwisher, 2000) and eventually attributed to RSC (Maguire, 2001), referring to architectonically defined retrosplenial cortex (Brodmann, 1909). However, the fMRI-defined scene-selective RSC has not been localized in detail.
In our human maps, scene-selective RSC was consistently located in the fundus of the parieto-occipital sulcus, bilaterally (Fig. 12A,B). Extrapolating from many early architectonic studies, the scene-selective RSC region thus lies near the peripheral retinotopic representations of primary and secondary visual cortex, V1 and V2. To localize these regions in more detail, we first compared functional and anatomical maps based on group-averaged data (Fig. 12). Scene-selective RSC was localized using our main group-averaged data based on faces versus scenes, as described above. V1 was localized anatomically, based on increased myelination in the stria of Gennari (Hinds et al., 2008), as translated to the current brain surface using spherical coordinates (Fischl et al., 1999). The topography of V2 was based on the following two kinds of data: (1) previous fMRI studies of the retinotopy in human V2 (Sereno et al., 1995; DeYoe et al., 1996; Engel et al., 1997; Pitzalis et al., 2006, 2010) up to 60° eccentricity, and (2) flattened human cortical tissue stained for cytochrome oxidase (Tootell and Taylor, 1995; Horton and Hocking, 1998) including the far peripheral representation, which reveals thin stripes that are known to span the width of V2 (Tootell et al., 1983; Horton, 1984).
The human scene-selective region RSC is located adjacent to peripheral V1. All panels show a medial view of the inflated cortical surface. A and B show the right and left hemispheres (respectively), illustrating the group-averaged scene-biased activity, including the sulcal/gyral labels (minimum, p < 10^−10; maximum, p < 10^−30); both RSC (top) and PPA (bottom) are visible. Subsequent panels show magnified views of the inset region (yellow borders, in A). C shows the borders of group-averaged RSC (white lines), plus the full extent of V1 based on the high myelination in layer 4B, in group-averaged data (Hinds et al., 2008). The location of the horizontal meridian representation in V1 is indicated with a dotted black line. D shows RSC and PPA, as revealed in one individual with the scene versus face contrast (minimum, p < 10^−20; maximum, p < 10^−30). E shows borders of the group-averaged myelination map, plus the more central retinotopy (up to 10° eccentricity, based on vertical versus horizontal meridians, in blue/cyan and red/yellow, respectively) (minimum, p < 10^−3; maximum, p < 10^−5) from that same subject. F shows the activity produced by flickering checkerboards positioned outside versus inside the visible limit (minimum, p < 10^−10; maximum, p < 10^−30). The dotted line indicates the limits of retinotopic activity, from the data shown in E.
According to this group data, RSC is located immediately adjacent to V1. The close proximity of RSC to V1 and V2 is somewhat surprising, given the higher-order properties reported for RSC (Epstein et al., 2007; Park and Chun, 2009; Vann et al., 2009) (see Discussion).
These maps also revealed a partially mirror-symmetrical topography in scene-selective regions PPA and RSC (Fig. 12). Although PPA lies farther away from the border with V1, both RSC and PPA lie adjacent to the peripheral representation of V2: PPA is located adjacent to the representation of the upper visual field, while RSC lies adjacent to the representation of the lower visual field.
Given these unexpected results in the group-averaged data, we conducted more detailed tests to confirm these conclusions within an individual subject. Figure 12D–F shows those results, based on patterns of fMRI activity produced by (1) scenes versus faces (set 2; to label RSC and PPA); (2) vertical versus horizontal meridians in the central 20° (retinotopic set 1); and (3) monocular activation of the visible limit of the ipsilateral far periphery (the monocular crescent) of the visual field, versus the (invisible) farther periphery (see Materials and Methods). As a reference, we also included the group-averaged border of V1 based on the stria of Gennari.
Overall, we found a good match between the group-averaged data and the individual data. The retinotopically defined border of V1/V2 (the vertical meridian representation) in the individual subject corresponded well with myelination boundaries in the group-averaged map (Fig. 12E), within the central approximately one-half of V1, where both measures were available. In addition, the peripheral extent of checkerboard-driven activation in the individual map coincided with the peripheral border of V1 in the myelination map (Fig. 12F). The peripheral extent of the checkerboard-driven activity spread slightly into adjacent areas, including presumptive V2 and the posterior portion of PPA. This spread of the checkerboard-driven activation was expected; previous studies have demonstrated that both V2 (Sereno et al., 1995; DeYoe et al., 1996; Engel et al., 1997) and PPA (Rajimehr et al., 2011) are strongly activated by flickering checkerboards.
As in the group map, RSC in this individual map was located immediately adjacent to the dorsal border of peripheral V1, thus occupying what would otherwise be the peripheral representation of V2. Also consistent with the group comparison, PPA was located adjacent to peripheral V2, at an eccentricity similar to (or even more peripheral than) that of RSC.
Macaque RSC
Based on the translation of cortical maps across species, a presumptive macaque homolog of RSC should be located on the medial bank, in or adjacent to the parietal occipital (medial) sulcus (POm) (Pitzalis et al., 2006). In at least one of the monkeys, we confirmed the presence of that scene-biased patch, bilaterally (Fig. 13). As in human RSC, this presumptive macaque homolog of RSC (“mRSC”) was small in size and low in amplitude, in response to the localizer used here. This small size and amplitude of RSC may explain why mRSC did not reach threshold in the n = 3 group map (Fig. 7C,D).
Evidence for RSC in one macaque monkey. A patch of scene-biased activity was present bilaterally, in a location consistent with the location of RSC in humans (i.e., in POm) (minimum, p < 10^−5; maximum, p < 10^−10). A and B show medial views of this activity, in the right and left hemispheres, respectively. Relevant sulci are labeled in red (C, calcarine; PO, parietal occipital).
Discussion
The correspondence between scene-selective regions in human and macaque cortex is diagrammed in Figure 14.
Diagram of visual cortical areas, relative to regions of scene-biased fMRI activity, in flattened visual cortex in humans (top) and macaque monkeys (bottom). Scene-biased regions are indicated in gray. Both maps are based on fMRI maps of retinotopy, motion selectivity, and face/scene selectivity in a representative single subject. Less understood regions, regions not mapped directly in the present study, and regions that do not have accepted homologs in both species are indicated by dotted lines, or labeled without borders. The interspecies correspondence is relatively good. Retinotopic areas V1, V2, V3, V3A, and V4v, and motion-selective MT/V5 are similar in both species. The correspondence of FFA with mFFA, and of the anterior temporal face patch (ATFP) in the two species, is also excellent, based on quantitative cortical transformations (Tsao et al., 2003; Rajimehr et al., 2009). The adjacent arrangement of mFFA with mPPA is essentially identical with the arrangement of human FFA and PPA. Macaque cortex also shows a dorsal patch of scene-responsive activity, likely homologous with human TOS. Scene-responsive RSC is well established in humans, but less certain in monkeys. Despite this correspondence in local neighborhood relationships, the maps suggest overall map differences relative to sites located farther from PPA. For instance, the center-to-center distance between mPPA and ATFP is ∼6 mm in macaques, but the corresponding distance is much larger (∼35 mm) in humans. Part (but not all) of this difference can be accounted for by expansion of the temporal lobe in humans relative to macaques, since ATFP remains ∼6 mm from the anterior tip of the temporal lobe in both species. A similar situation is evident for (m)PPA relative to other distinguishable areas in the anterior temporal lobe, including the subiculum. Comparison of the maps also raises the possibility of the converse expansion in macaques relative to humans, after equating relative surface areas.
In humans, PPA abuts ventral V2 (Fig. 12). However, in macaques, there is a large region between mPPA and this retinotopic area, in which visual areas are poorly defined. It is possible that scene-responsive activity extends farther ventrally from the consistent mPPA region than shown here (Fig. 7B,D; B). Moreover, V4v and TEO may extend far enough ventrally to help fill this gap in the map.
Human PPA
We found that scene-selective fMRI activity in PPA was typically centered on the lips of the collateral sulcus and adjacent medial fusiform gyrus, rather than on the parahippocampal gyrus per se. This was borne out in our MRI data (Figs. 4, 5) and in a meta-analysis of the literature (Fig. 6). This finding is also consistent with a few reports describing functionally equivalent regions on the collateral sulcus (Levy et al., 2004) or medial fusiform gyrus (Chao et al., 1999; Haxby et al., 1999).
The discrepancy in localizing PPA cannot be easily attributed to differences in experimental design or stimuli, relative to previous localizers. Although the size of PPA varied according to the stimuli we tested, the peak location and the topography of this area remained remarkably constant, within a given set of subjects (Fig. 5).
Medial fusiform gyrus
In two independent group-averaged cortical surfaces (n = 17 and n = 40; Fig. 2), and in 20 of 24 human brains from autopsy (Fig. 3), we documented that a shallow sulcus (the middle fusiform sulcus) subdivides the fusiform gyrus into two parallel branches: the lateral and medial fusiform gyri. This middle fusiform sulcus roughly divides the scene-responsive fMRI activity (on the medial fusiform gyrus) from face-responsive activity (on the lateral fusiform gyrus). Since that middle fusiform sulcus was not considered in the original report (Epstein and Kanwisher, 1998), it remains true that PPA is located on the gyrus immediately medial to “FFA,” in both the present and the original accounts.
Macaque PPA
We compared maps across species in the cortical sheet, using functional landmarks, without considering the cortical folding patterns. This approach has become standard (Van Essen et al., 2001; Tootell et al., 2003; Orban et al., 2004; Sereno and Tootell, 2005), partly because gyri and sulci vary enormously across species. For instance, macaques do not have a fusiform gyrus. Even when similar cortical folds exist, homologous areas vary in location relative to the cortical folds across species. For example, the well established direction-selective area MT/V5 is located in the superior temporal sulcus in macaque, but in the inferior temporal sulcus in humans.
Previously (Rajimehr et al., 2011), we presented evidence for mPPA in two individual monkeys. Here, we confirmed that finding in seven animals, in two independent group averages. In all cases, mPPA was defined as a patch of scene-responsive activity (Figs. 7, 8) centered exactly where a macaque homolog of human PPA should lie, adjacent to the most prominent face patch (mFFA). In the folded brain, this location is ventral and slightly anterior to the posterior middle temporal sulcus (PMTS). Area TEO is centered roughly on the PMTS (Boussaoud et al., 1991); thus, mPPA apparently lies immediately anterior to TEO. Like human PPA, mPPA is elongated along the posterior-to-anterior axis (Figs. 1, 7, 8). Thus, by the local-neighborhood criterion, the human-to-macaque match is good. The more global comparison including areas much farther from PPA (e.g., anterior temporal lobe, the subiculum) may not match quite as well, consistent with the disproportionate expansion in some cortical regions in humans, relative to macaques (Fig. 14).
Human TOS
Our data (Figs. 9, 10) indicate that the human scene-selective region TOS is actually centered on the nearby lateral occipital gyrus, rather than within its namesake, the transverse occipital sulcus. As shown previously (Tootell et al., 1997), the transverse occipital sulcus spans a different, retinotopically defined area, V3A. Thus, scene-selective TOS should lie immediately anterior and lateral to retinotopically defined V3A, in/near retinotopic human areas V7 (Tootell et al., 1998), V3B (Press et al., 2001), and/or LO-1 (Larsson and Heeger, 2006). That conclusion was confirmed here in six hemispheres (Fig. 10), consistent with earlier illustrations in two hemispheres (Levy et al., 2004), and one of two hemispheres in the study by Spiridon et al. (2006).
Macaque TOS
Macaque cortical maps showed a corresponding cluster of scene-selective patches in dorsal occipital cortex (mTOS) (Figs. 1, 11). As in human TOS, mTOS includes the area anterior to macaque V3A (i.e., area V4d). In macaques, mTOS also extends posteriorly into V3A (Fig. 11), depending on how V3A is defined.
This possible posterior extension of mTOS in macaques (relative to humans) does not rule out the assumption of homology, because incremental changes occur naturally as cortical maps evolve across species. Moreover, if there is an interspecies shift in (m)TOS relative to V3A, this has a precedent in the existing literature. In humans, V3A shows high motion selectivity (Tootell et al., 1997). However, in macaques, higher motion selectivity is instead found in area V3 (Van Essen et al., 1990). To the extent that mTOS includes V3A, the region of high scene selectivity would thus be located adjacent and anterior to the region of higher motion selectivity (Fig. 11), in both humans and macaques. That is, both functional properties (sensitivity to motion and sensitivity to scenes) would have shifted by a single area.
RSC
A third scene-responsive area was named RSC, with reference to the architectonically defined retrosplenial cortex [areas 26, 29, and 30 of Brodmann (1909)]. However, Brodmann's report of small cytoarchitectonically defined areas located posterior to the splenium (i.e., BA 26, 29, and 30) was not confirmed by subsequent anatomists (Economo, 1929; Bailey and von Bonin, 1951), nor was an analogous area reported in macaque (Brodmann, 1909). More importantly, the location of Brodmann areas 26, 29, and 30 does not overlap with the location of scene-selective RSC. Recently, the original definition was blurred by widely broadening its borders (Fenske et al., 2006; Epstein et al., 2007) and/or the name itself (retrosplenial “complex”) (Bar, 2007). In all of our data, scene-selective RSC is a discrete region consistently located in the fundus of the parieto-occipital sulcus, ∼1 cm from the original Brodmann areas.
Surprisingly, we also found that RSC is located immediately adjacent to V1, in what would otherwise be the peripheral representation of dorsal V2. Except for RSC, V1 is surrounded mainly by the second-tier cortical area V2. Thus, RSC is unusual: it is an apparently higher-tier area (Park and Chun, 2009) that nevertheless borders the two lowest-level areas in the cortical visual hierarchy (Van Essen et al., 1990). Functionally similar areas are often located near each other (e.g., area MT/V5 and surrounding direction-selective areas), presumably because such adjacency can shorten the more numerous cortical connections between functionally related areas. However, counterexamples can also be cited, in which adjacent areas are not functionally similar. The proximity of RSC to V1/V2 may be an example of the latter.
The topography of these three areas supports certain observations in the literature. First, Gattass et al. (1988) reported that V2 does not include a representation of the far peripheral visual field, unlike that found in V1. Such a retinotopic difference would “make room” for RSC along the V1 border, as reflected in our data. Second, our data are consistent with evidence for an asymmetry in dorsal versus ventral V2 in macaques (Van Essen et al., 1984; Felleman and Van Essen, 1991).
An even more restricted representation of eccentricity has also been reported in area V3 (Van Essen et al., 1984; Gattass et al., 1988). As described above, such an arrangement would make room for PPA, adjacent to V2 (Fig. 12).
Nomenclature
The present data reveal numerous complications in the current names for scene-selective cortical regions. The human regions are not centered on the gyri/sulci for which they are named, and the human names cannot be accurately generalized to homologous areas in macaques. The latter discrepancies arise commonly in cross-species comparisons, because different species develop different sulci and gyri.
Above, we used the original names for the scene-selective regions, for historical continuity. However, in Figure 14, we propose a simple alternative naming scheme that would remain accurate across both humans and macaques. In the new scheme, regions PPA, TOS, and RSC are renamed VS, DS, and MS, respectively (for ventral, dorsal, and medial regions of scene responsivity). Corresponding regions in humans and macaques would be distinguished using the prefix “h” or “m,” yielding hVS, hDS, and hMS in humans, and mVS, mDS, and mMS in monkeys.
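The naming scheme described above amounts to a simple lookup, which can be sketched as follows. The function and variable names here are illustrative assumptions and are not part of the study; only the region names and the species prefixes are taken from the text.

```python
# Sketch of the proposed nomenclature. The mapping itself follows the text;
# the identifiers (PROPOSED_NAMES, SPECIES_PREFIX, proposed_name) are
# hypothetical names introduced for this illustration.
PROPOSED_NAMES = {
    "PPA": "VS",  # ventral scene-responsive region
    "TOS": "DS",  # dorsal scene-responsive region
    "RSC": "MS",  # medial scene-responsive region
}

SPECIES_PREFIX = {"human": "h", "macaque": "m"}


def proposed_name(original_name, species):
    """Translate an original region name into the proposed cross-species name."""
    return SPECIES_PREFIX[species] + PROPOSED_NAMES[original_name]


for region in PROPOSED_NAMES:
    print(region, proposed_name(region, "human"), proposed_name(region, "macaque"))
```

For example, under this scheme human PPA becomes hVS and its presumptive macaque homolog becomes mVS.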
Future directions
The demonstration of scene-selective regions in macaques enables future experiments using classical neurobiological techniques, to reveal common neural mechanisms underlying scene processing. For instance, what are the functional properties of single units in each of these scene-selective patches? Do the different scene-selective regions share specific neural connections with each other, and/or with higher-level brain regions implicated in place processing (e.g., hippocampus via entorhinal cortex), and/or spatial navigation (the dorsal stream)? An analogous proliferation of knowledge about neural mechanisms followed the demonstration of “face-selective” patches in macaques based on fMRI (Tsao et al., 2003)—which were prompted in turn by fMRI studies on face-selective patches in humans (Kanwisher et al., 1997; Haxby et al., 2000; Rajimehr et al., 2009). Hopefully, the current study will serve a similar purpose.
Footnotes
This work was supported by NIH Grants R01 MH67529 and R01 EY017081 (R.B.H.T.), the Martinos Center for Biomedical Imaging, the NCRR, the MIND Institute, Shared Instrumentation Grants 1S10RR023401, 1S10RR019, and 1S10RR023043, and the NIMH Intramural Research Program.
Correspondence should be addressed to Shahin Nasr, Martinos Center for Biomedical Imaging, Massachusetts General Hospital, 149 13th Street, Charlestown, MA 02129. shahin@nmr.mgh.harvard.edu
This article is freely available online through the J Neurosci Open Choice option.