The existence of color-processing regions in the human ventral visual pathway (VVP) has long been known from patient and imaging studies, but their location in the cortex relative to other regions, their selectivity for color compared with other properties (shape and object category), and their relationship to color-processing regions found in nonhuman primates remain unclear. We addressed these questions by scanning 13 subjects with fMRI while they viewed two versions of movie clips (colored, achromatic) of five different object classes (faces, scenes, bodies, objects, scrambled objects). We identified regions in each subject that were selective for color, faces, places, and object shape, and measured responses within these regions to the 10 conditions in independently acquired data. We report two key findings. First, the three previously reported color-biased regions (located within a band running posterior–anterior along the VVP, present in most of our subjects) were sandwiched between face-selective cortex and place-selective cortex, forming parallel bands of face, color, and place selectivity that tracked the fusiform gyrus/collateral sulcus. Second, the posterior color-biased regions showed little or no selectivity for object shape or for particular stimulus categories and showed no interaction of color preference with stimulus category, suggesting that they code color independently of shape or stimulus category; moreover, the shape-biased lateral occipital region showed no significant color bias. These observations mirror results in macaque inferior temporal cortex (Lafer-Sousa and Conway, 2013), and taken together, these results suggest a homology in which the entire tripartite face/color/place system of primates migrated onto the ventral surface in humans over the course of evolution.
SIGNIFICANCE STATEMENT Here we report that color-biased cortex is sandwiched between face-selective and place-selective cortex on the bottom surface of the brain in humans. This face/color/place organization mirrors that seen on the lateral surface of the temporal lobe in macaques, suggesting that the entire tripartite system is homologous between species. This result validates the use of macaques as a model for human vision, making possible more powerful investigations into the connectivity, precise neural codes, and development of this part of the brain. In addition, we find substantial segregation of color from shape selectivity in posterior regions, as observed in macaques, indicating a considerable dissociation of the processing of shape and color in both species.
Color is a fundamental dimension of visual experience, informing us about myriad facets of our environment. Befitting its importance for human vision, considerable evidence from patient studies (Meadows, 1974; Newcombe and Ratcliff, 1974; Heywood et al., 1987, 1991; Zeki, 1990; Bouvier and Engel, 2006; Nijboer et al., 2007) and human neuroimaging (McKeefry and Zeki, 1997; Hadjikhani et al., 1998; Beauchamp et al., 1999; Wade et al., 2002, 2008; Brewer et al., 2005; Simmons et al., 2007; Cavina-Pratesi et al., 2010) implicates specific cortical regions of the ventral visual pathway (VVP) in the analysis of color. But little is known about the causal role of these regions in perception (Murphey et al., 2008; Rangarajan et al., 2014), their connectivity, the precise computations conducted within them, or the underlying neural circuits. The most powerful methods for answering these questions require a primate model. Such a model would be most informative if the processing mechanisms were homologous to those found in humans. How similar, then, are the cortical mechanisms for color vision in nonhuman primates and humans?
Macaques and humans have very similar color behavior (Stoughton et al., 2012; Gagin et al., 2014), raising the possibility that the two species have similar cortical machinery for color vision. Here, we test this hypothesis by investigating whether humans show the same organization for color in the VVP as found in macaques, in which color-biased regions are sandwiched between face- and place-selective regions (Lafer-Sousa and Conway, 2013; Verhoef et al., 2015). No prior study has assessed the topographic relationship of these regions directly in humans. A strong test of the idea requires a systematic and quantitative evaluation of category, shape, and color selectivity in the same subjects. To do this, we scanned 13 human subjects using fMRI while they viewed either full-color or achromatic movie clips of faces, objects, scenes, bodies, or scrambled objects. We first determined the spatial layout of color, face, and place selectivity by projecting color, face, and place preferences onto each subject's inflated cortical surface. We then defined color-, face-, and place-preferring regions within each subject and measured the response magnitude of each region to each of the 10 conditions in independently acquired data. These analyses enabled us to map the relative locations of shape-selective, category-selective, and color-biased regions in each subject individually and to quantify the selectivity of each region to each dimension.
Our experimental design enabled us to address a second open question: Is the processing of color and shape inextricably linked throughout the VVP? On the one hand, one might think of color as just another object property, likely processed in the same cortical regions (and neurons) as those implicated in computing object shape. On the other hand, the computations entailed in processing color, and the role of color in behavior, often diverge from those for shape (Conway, 2009, 2014), suggesting color and shape are processed by somewhat separate circuits. Some micro-electrode studies in monkeys have been interpreted as supporting the idea that color and shape are processed together (Schein and Desimone, 1990; Gallant et al., 2000; Gegenfurtner and Kiper, 2003; Shapley and Hawken, 2010; Yasuda et al., 2010), but other results in macaques (Zeki, 1980, 1983; Bartels and Zeki, 2000; Conway, 2001; Lafer-Sousa and Conway, 2013; Conway, 2014) and humans (Newcombe and Ratcliff, 1974; Milner and Heywood, 1989; Heywood et al., 1991; Bouvier and Engel, 2006; Cavina-Pratesi et al., 2010) suggest some dissociation of color and shape processing. We addressed this question by determining the extent to which regions that respond preferentially to intact shape (compared with scrambled counterparts), such as the lateral occipital region (LO), were modulated by the presence of color, and the extent to which regions responding preferentially to color were modulated by the presence of intact shape. These experiments enabled us to test the degree to which color, category, and shape preferences are segregated in the human brain and to investigate possible homologies of cortical regions in humans and macaques.
Materials and Methods
Thirteen subjects (age 20–39 years, 7 female) participated in the study. Participants had no history of neurological or psychiatric impairment, had normal or corrected vision, normal color vision (tested with Ishihara plates), and were native English speakers. All participants provided written, informed consent.
To localize category, shape, and color-biased activity we used fMRI to scan subjects while they viewed natural video clips corresponding to one of five stimulus classes (Faces, Bodies, Scenes, Objects, and Scrambled Objects) presented in either full color (chromatic; the same stimuli used in Pitcher et al., 2011a, 2011b) or gray scale (achromatic; see Fig. 1A). Sample stimuli are provided in a short movie file (Movie 1) and a high-resolution version can be downloaded from the web (http://web.mit.edu/bcs/nklab/rosaStims.shtml). Each run consisted of 25 18-s-long blocks (20 stimulus blocks and 5 full-field gray fixation blocks to allow the signal to return to baseline). Each stimulus block contained 6 3-s-long video clips randomly drawn from 60 clips of a specific stimulus condition (e.g., chromatic faces). In a single scanning session, subjects viewed 16–20 blocks of each of the 10 stimulus conditions (5 classes × chromatic/achromatic) or a total of 32–40 blocks for each class (e.g., 32–40 blocks of faces, comprising 16–20 chromatic and 16–20 achromatic); 80–100 chromatic blocks (comprising all classes of chromatic stimuli); and 80–100 achromatic blocks. The stimuli subtended a maximum visual angle 20° wide and 15° tall. The order of conditions was palindromic (e.g., A-B-C-C-B-A) within a run and was counterbalanced across runs such that each condition happened equally often in each serial position in the run (10 run orders). The MATLAB function rgb2gray was used to render the clips in gray scale while retaining luminance structure (confirmed with photometric measurements). Scrambled objects were constructed by dividing each object movie clip into a 15 by 15 box grid and spatially rearranging the location of each of the resulting movie frames.
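The block ordering described above can be illustrated with a minimal sketch. It assumes that the 10 run orders are cyclic rotations of a single base order and that the 5 fixation blocks fall at the start of the run, after every fifth stimulus block, and at the end. Neither detail is stated in the text, and the names `run_order`, `CONDITIONS`, and `RUN_SECONDS` are hypothetical.

```python
# Sketch of a palindromic, counterbalanced block order (assumptions noted inline).
CLASSES = ["faces", "bodies", "scenes", "objects", "scrambled"]
CONDITIONS = [f"{c}_{v}" for c in CLASSES for v in ("chromatic", "achromatic")]

BLOCK_SECONDS = 18
RUN_SECONDS = 25 * BLOCK_SECONDS  # 25 blocks of 18 s each


def run_order(run_index, conditions=CONDITIONS, fixation_every=5):
    """Return the 25-block sequence for one run.

    ASSUMPTION: run orders are cyclic rotations of one base order, which is
    one simple way to place every condition in every serial position once.
    ASSUMPTION: fixation blocks sit at the start, after every fifth stimulus
    block, and at the end of the run.
    """
    shift = run_index % len(conditions)
    first_half = conditions[shift:] + conditions[:shift]
    stimulus_blocks = first_half + first_half[::-1]  # palindrome, e.g. A-B-C-C-B-A
    blocks = []
    for i, cond in enumerate(stimulus_blocks):
        if i % fixation_every == 0:
            blocks.append("fixation")
        blocks.append(cond)
    blocks.append("fixation")
    return blocks


orders = [run_order(r) for r in range(10)]
```

Under these assumptions each run contains 20 stimulus blocks and 5 fixation blocks, and across the 10 runs every condition occupies every stimulus position exactly once.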
In control experiments, we localized color responses using drifting grating stimuli comparable to those used in prior work (Lafer-Sousa and Conway, 2013); these stimuli allow more stringent control of the physical parameters of the stimulus but contain no recognizable objects. The chromatic gratings were designed to be equiluminant; the achromatic gratings were designed to comprise some luminance contrast. The stimulus paradigm was similar to that used to identify color-biased regions in macaque (Lafer-Sousa and Conway, 2013). Three achromatic grating conditions (luminance contrasts 50%, 25%, and 7%) and eight color directions defined by the cardinal and intermediate directions of the equiluminant plane in cone-opponent color space (MacLeod and Boynton, 1979; Derrington et al., 1984) were used. Block lengths were shortened compared with those used in the macaque experiments (18 s instead of 32 s) to match the hemodynamic response function of the BOLD signal in humans (block lengths needed to be longer in the macaque experiments to account for the longer hemodynamic delay that accompanies the use of an intravenous contrast agent). Stimuli were calibrated using spectral readings taken with a PR-655 spectroradiometer (Photo Research). The spectra were multiplied with the Judd-revised Commission Internationale de l'Eclairage (CIE) 1931 color matching functions to derive CIE xyY coordinates of the monitor primaries (Hansen et al., 2008) and cone excitation was calculated using the Smith and Pokorny cone fundamentals (Smith and Pokorny, 1975). Stimuli were presented as vertical trapezoid-wave gratings (Conway and Tsao, 2006). Each chromatic and achromatic grating was drifted back and forth for 18 s, switching directions every 2 s (2.9 cycles/°, drifting 0.75 cycles/s).
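The step from a measured spectrum to xyY coordinates can be sketched as follows. This is only an illustration of the standard definition (tristimulus values as sums of spectrum times color matching function, then x = X/(X+Y+Z) and y = Y/(X+Y+Z)). The function name `spectrum_to_xyY` is hypothetical, and the three-sample arrays are placeholders, not real color matching functions or monitor measurements.

```python
# Sketch: spectrum -> tristimulus XYZ -> chromaticity (x, y) plus luminance Y.
def spectrum_to_xyY(spectrum, xbar, ybar, zbar, step_nm=1.0, scale=1.0):
    """Convert a sampled spectrum to CIE xyY.

    spectrum, xbar, ybar, zbar: equal-length lists sampled at the same
    wavelengths. `scale` is a hypothetical unit-conversion constant.
    """
    X = scale * step_nm * sum(s * c for s, c in zip(spectrum, xbar))
    Y = scale * step_nm * sum(s * c for s, c in zip(spectrum, ybar))
    Z = scale * step_nm * sum(s * c for s, c in zip(spectrum, zbar))
    total = X + Y + Z
    return X / total, Y / total, Y


# PLACEHOLDER values for illustration only (not real CMFs or spectra).
toy_spectrum = [1.0, 2.0, 1.0]
toy_xbar = [0.5, 0.2, 0.1]
toy_ybar = [0.1, 0.6, 0.2]
toy_zbar = [0.4, 0.1, 0.0]

x, y, Y = spectrum_to_xyY(toy_spectrum, toy_xbar, toy_ybar, toy_zbar)
```

In practice the Judd-revised functions would be supplied at the sampling interval of the spectroradiometer, and cone excitations would be obtained analogously by substituting the cone fundamentals for the color matching functions.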
Within a run, the grating blocks were interleaved with equiluminant neutral full-field gray (100 cd/m²) (12 s) to allow the signal to return to baseline [e.g., Gray, Achromatic-25%, Gray, Color-1, Gray, Achromatic-50%, Gray, Color-8, Gray, etc.]. There were 9 run orders, counterbalanced across runs such that each condition happened equally often in each serial position in the run. Each achromatic grating appeared three times per run, and was shown between chromatic blocks (with neutral gray blocks interleaved). Subjects were required to fixate a central fixation cross and performed a difficult motion-detection task to maintain attention: the gratings drifted left to right, switching direction every 2 s except once per block, when the grating would drift 3 s in one direction. Subjects had to report when this occurred (via a button press in the scanner).
In nine subjects, we identified the borders of visual areas V1 through V4 by mapping the responses to horizontal and vertical meridians following standard procedures (Sereno et al., 1995). The stimuli were flickering (counterphase at 7.5 Hz) checkerboards restricted to wedges (subtending 55°) along either the horizontal (spanning 20° of visual angle) or vertical (spanning 15° of visual angle) meridian. Each run comprised 36 10-second blocks. Subjects were required to fixate throughout. Responses to vertical meridians were contrasted with responses to horizontal meridians and projected on each subject's cortical surface, producing a striped map used to demarcate the boundaries of V1, V2, V3 and V4. Because the anterior border of V4 (between V4/hV4 and ventral occipital cortex, VO-1) is difficult to define functionally (McKeefry and Zeki, 1997; Bartels and Zeki, 2000; Tootell and Hadjikhani, 2001; Brewer et al., 2005; Winawer et al., 2010), its location was determined using an anatomical landmark, the posterior transverse collateral sulcus (ptCoS) (Winawer et al., 2010).
Data were acquired using a Siemens 3T MAGNETOM Tim Trio scanner (Siemens AG, Healthcare, Erlangen, Germany) with a 32-channel head coil (at the Athinoula A. Martinos Imaging Center of the McGovern Institute for Brain Research at MIT). Functional data were collected using a T2*-weighted echo planar imaging (EPI) pulse sequence sensitive to blood-oxygen-level-dependent (BOLD) contrast. Given the known susceptibility artifact of the anterior regions of the ventral temporal lobe (caused by the ear canals), we performed extensive piloting to determine the parameters that produced the highest voxel SNR. Pilot data were collected with and without iPAT (Griswold et al., 2002) and field correction; we varied the voxel-resolution, slice angle, TE, and fraction of k-space sampled. We obtained improved tSNR and BOLD sensitivity by increasing voxel resolution by a factor of 4 (from the standard 3 mm iso to 2 mm iso), forgoing image acceleration (iPAT), and collecting field maps before each run. Reducing TE (to 14 or 23 ms) enhanced the tSNR, but compromised BOLD sensitivity. TE was set to 30 ms.
The functional volumes were acquired for a restricted portion of the brain spanning the ventral surface of the temporal lobe. Each functional volume comprised 25 slices (2 mm isotropic; field of view [FOV] = 192 mm, matrix = 96 × 96) oriented approximately parallel to the temporal lobe, covering the occipital cortex (V1–V4) and the full length of the temporal lobe ventral to and including some of the superior temporal sulcus (2.0 s TR, 30 ms TE, 90° flip angle, 6/8 echo fraction). The first 5 volumes of each run were discarded to allow for T1 equilibration. Field maps (2 mm isotropic, 25 slices) were collected before each functional run to measure magnetic inhomogeneities and were used to estimate spatial distortions in the functional volumes that were then removed during analysis. High-resolution T1-weighted anatomical images were collected for each subject using a multiecho MPRAGE pulse sequence (1 mm isotropic voxels; FOV = 256 mm, matrix = 256 × 256).
Data preprocessing and modeling
Data were processed using Freesurfer (http://surfer.nmr.mgh.harvard.edu/) and custom MATLAB scripts. Freesurfer was used to segment white- and gray-matter structures from the anatomical volumes (Dale et al., 1999; Fischl et al., 1999, 2001). Each individual subject's functional data were field corrected, motion corrected using rigid-body transformations to the middle image of each run, intensity normalized (nonbrain tissue was masked), spatially smoothed using an isotropic Gaussian kernel (3 mm FWHM) to improve SNR, and aligned to the individual's anatomical volume using a rigid-body transformation determined by Freesurfer's bbregister (Greve and Fischl, 2009). Digitally inflated (see Figs. 1, 2, 5, 6, 9) and flattened (see Figs. 2, 6, 7, 9) cortical surface reconstructions were generated using Freesurfer.
Whole-volume general linear model-based analyses were performed for each subject's even and odd runs separately. Regressors were defined as boxcar functions convolved with a gamma hemodynamic response function (Friston et al., 1994). The boxcar function for each condition included each block from that condition. Nuisance regressors for motion (three translations, three rotations) and a linear trend to capture slow drifts were included.
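A condition regressor of the kind described above can be sketched as a boxcar convolved with a gamma function. The gamma shape and scale, the 32 s kernel length, the example onsets, and the run length of 225 volumes (450 s at a 2 s TR) are assumptions chosen for illustration. The text does not report these values, and the nuisance regressors are omitted here.

```python
import math

TR_S = 2.0      # repetition time reported in the text
BLOCK_S = 18.0  # block length reported in the text


def gamma_hrf(t, shape=6.0, scale=1.0):
    """Gamma-density response at time t (s). ASSUMPTION: shape and scale are illustrative."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t / scale) / (math.gamma(shape) * scale ** shape)


def sampled_hrf(duration_s=32.0):
    """Sample the HRF once per TR and normalize it to unit sum."""
    kernel = [gamma_hrf(i * TR_S) for i in range(int(duration_s / TR_S))]
    total = sum(kernel)
    return [k / total for k in kernel]


def boxcar(n_volumes, onsets_s):
    """1 during every block of the condition, 0 elsewhere."""
    box = [0.0] * n_volumes
    for onset in onsets_s:
        start = int(round(onset / TR_S))
        stop = int(round((onset + BLOCK_S) / TR_S))
        for i in range(start, min(stop, n_volumes)):
            box[i] = 1.0
    return box


def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    return [
        sum(signal[i - j] * kernel[j] for j in range(min(i + 1, len(kernel))))
        for i in range(len(signal))
    ]


N_VOLUMES = 225         # ASSUMPTION: 450 s run sampled every 2 s
onsets = [18.0, 198.0]  # HYPOTHETICAL block onsets for one condition
regressor = convolve(boxcar(N_VOLUMES, onsets), sampled_hrf())
```

Because the kernel is normalized to unit sum, the regressor rises toward 1 during each block and returns to 0 between blocks, so a fitted β can be read on the scale of the raw signal.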
Analysis of color, face, place, and shape preferences across cortex (dynamic localizer data)
To show the large-scale spatial organization of the ventral visual pathway within each individual, statistical contrast maps were projected on the digitally inflated surface of each subject's own cortical anatomy. Following standard practices (Julian et al., 2012), functional regions were defined as follows. Face-selective voxels were defined as those showing higher responses to video clips of faces than objects (p < 0.0001; see purple regions in Figs. 1B, 2B). Place-selective voxels were defined as those showing a higher response to places than objects (p < 0.0001; see green regions in Figs. 1B, 2B). Color-biased voxels were defined as those showing a higher response to chromatic video clips than achromatic versions of the same clips matched in luminance, averaging responses obtained in all stimulus classes (p < 0.001; activation maps shown in Fig. 1B; blue regions in Fig. 2B). The conclusions were not affected if color-biased regions were only defined using movie clips of a single category (data not shown). Shape-selective voxels were defined as those showing a higher response to intact objects than scrambled objects (p < 0.0001; see Fig. 5).
The major conclusions presented here are derived from an analysis of individual subject data. To illustrate overall trends in the spatial layout of color-biased regions relative to face-, place-, and shape-selective regions, we generated group overlap maps. To do this, statistical contrast maps from each of the 13 individual subjects were registered to a published group-average template (Freesurfer, CVS_avg35, an average of 35 people). The group-registered contrast maps from each individual were then thresholded and each voxel was assigned a binary value (1 or 0 for above threshold or not). For a given contrast, all 13 subject maps were tallied and the voxels in the resulting volume were assigned a value according to the number of subjects who had above-threshold activity at that location. Figure 2A shows the group summary map for face, color, and place contrasts, thresholded at 5 subjects (individual contrast maps thresholded at p < 0.01), on the reconstructed cortical surface of the CVS_avg35 volume. Figure 5A shows the group map for the shape contrast (intact objects > scrambled objects; p < 0.0001) as a surface histogram where voxel color (red to yellow) reflects the agreement of 2 to 5+ subjects.
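The tallying step behind the group overlap maps can be sketched as follows. The sketch assumes that each subject's registered contrast map is available as a flat list of p values for the contrast direction of interest. The function name `overlap_map` and the toy values are hypothetical.

```python
# Sketch: binarize each subject's map, tally across subjects, threshold the tally.
def overlap_map(p_maps, p_threshold=0.01, min_subjects=5):
    """Return (per-voxel subject counts, mask of voxels meeting min_subjects).

    ASSUMPTION: p_maps[s][v] is the p value of subject s at voxel v for the
    contrast in the direction of interest, already registered to the template.
    """
    counts = [0] * len(p_maps[0])
    for p_map in p_maps:
        for v, p in enumerate(p_map):
            if p < p_threshold:  # binary value: above threshold or not
                counts[v] += 1
    mask = [c >= min_subjects for c in counts]
    return counts, mask


# HYPOTHETICAL p values for 6 subjects and 4 voxels.
toy_maps = [
    [0.001, 0.005, 0.005, 0.5],
    [0.001, 0.005, 0.005, 0.5],
    [0.001, 0.005, 0.005, 0.5],
    [0.001, 0.005, 0.005, 0.5],
    [0.001, 0.005, 0.5, 0.5],
    [0.001, 0.2, 0.5, 0.5],
]
counts, mask = overlap_map(toy_maps)
```

With these toy values only the first two voxels reach the five-subject criterion used for Figure 2A.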
The surface maps enabled us to examine the spatial segregation of the activation patterns for color, faces, and places. However, region of interest (ROI)-based analyses make possible stronger tests of our hypotheses. Specifically, they enabled us to cross-validate the contrast by which each region was defined and to quantify the effect size and selectivity of that contrast. Effect size matters: a region may have highly significant selectivity but of small magnitude, an important distinction that is not apparent in standard activation maps showing only significance levels. By quantifying activity in ROIs, we were able to determine the profile of response of each region to each of the 10 conditions tested, providing a richer profile of the selectivity of each region. From these results, we investigated whether the selectivity for color is present to a similar degree for each stimulus category and the extent to which it depends on stimulus category (which would be manifested by an interaction of color/achromatic by category). Significant interactions between two factors (e.g., color and shape) may be evidence that the two factors depend upon common neural populations rather than being analyzed independently in two distinct neural populations within that region (Sternberg, 1969). The ROI analyses also enabled us to test whether two regions differ significantly from each other with respect to some contrast. Standard activation maps do not support such conclusions because they only show regions that reach significance in a given contrast and regions that do not—that is, they show differences in significance, but not significant differences across regions. For example, in one ROI analysis, we explicitly tested whether color-biased regions are significantly more color biased than category-selective regions and shape-selective regions, contrasting the responses in these regions to each other in an ANOVA with ROI as a factor (described below).
Functional ROIs were defined in individual subjects, in the volume, using data from even runs. The odd runs were used to quantify the responses within each ROI to each of the 10 conditions. ROIs were defined using a p-threshold criterion. For a given contrast, all contiguous voxels above the p-threshold constituted an ROI; p-threshold was p < 0.0001 for occipital face area (OFA), fusiform face area (FFA), anterior temporal lobe (ATL), parahippocampal place area (PPA), LO cortex, posterior fusiform (pFS) and posterior superior temporal sulcus (pSTS); and p < 0.001 for color-biased regions. A less stringent p-threshold was used to define color-biased voxels because color selectivity was weaker overall than shape and category selectivity. Note that the evidence for color selectivity comes not from the statistics used to define the region, but from the statistics of the response magnitudes conducted using independently acquired data (the odd runs). In some subjects, face activations were so robust that they formed a contiguous swath extending from the OFA to the FFA; similarly, shape-selective regions sometimes formed a contiguous swath from LO to pFS and place activations sometimes extended contiguously into early visual cortex. In these cases, a published parcel atlas derived from 30 human subjects was used to constrain ROI definition for the separate components within each swath of activation (Julian et al., 2012). The main conclusions regarding the segregation of face, color, and place activations do not change if ROIs were defined solely using functional criteria (e.g., in this case, by lumping the OFA and FFA together).
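The split-half logic (define an ROI from contiguous suprathreshold voxels in even runs, then measure it in odd runs) can be sketched on a toy grid. A 2D grid with 4-connectivity is assumed purely for brevity. The actual analysis was performed in the 3D volume, the connectivity rule is not stated in the text, and all names and values below are hypothetical.

```python
from collections import deque
from statistics import mean


def clusters_above_threshold(p_grid, p_threshold=1e-4):
    """Group contiguous voxels with p < p_threshold (ASSUMPTION: 2D, 4-connectivity)."""
    rows, cols = len(p_grid), len(p_grid[0])
    seen = [[False] * cols for _ in range(rows)]
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or p_grid[r][c] >= p_threshold:
                continue
            seen[r][c] = True
            queue, cluster = deque([(r, c)]), []
            while queue:  # breadth-first flood fill from the seed voxel
                y, x = queue.popleft()
                cluster.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and p_grid[ny][nx] < p_threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            clusters.append(cluster)
    return clusters


# HYPOTHETICAL even-run p values (ROI definition) and odd-run responses (measurement).
even_p = [
    [1e-5, 1e-5, 0.5, 0.5],
    [0.5, 1e-6, 0.5, 1e-5],
    [0.5, 0.5, 0.5, 1e-5],
]
odd_response = [
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 3.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.5],
]

rois = clusters_above_threshold(even_p)
roi_means = [mean(odd_response[y][x] for y, x in roi) for roi in rois]
```

Keeping the defining data (even runs) separate from the measured data (odd runs) is what makes the reported response magnitudes independent of the selection criterion.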
For each subject, the percent signal change (PSC) within each ROI (see Fig. 3) was extracted by averaging β values across the ROI and dividing by the mean BOLD signal in the ROI. Each ROI was detected in most subjects and, when a region appeared bilaterally (typical for most subjects and regions; see Results), the results across hemispheres were averaged. Participants who lacked a certain ROI were not included in the statistical analyses (ANOVAs) involving that ROI; if the selectivity of an ROI present in only 10 subjects was being compared with an ROI present in 13 subjects, only the 10 subjects that had both ROIs were included in the analysis. Statistical tests (ANOVAs) for main effects and interactions are described in the Results.
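The PSC computation and the subject-matching rule can be sketched as follows. Multiplying by 100 to express the ratio as a percentage is an assumption implied by the term "percent signal change" but not spelled out in the text. The function names and toy values are hypothetical.

```python
from statistics import mean


def percent_signal_change(betas, bold_signal):
    """Mean beta across ROI voxels divided by the mean BOLD signal in the ROI.

    ASSUMPTION: the ratio is multiplied by 100 to express it in percent.
    """
    return 100.0 * mean(betas) / mean(bold_signal)


def roi_psc(hemispheres):
    """Average PSC across whichever hemispheres contain the ROI.

    hemispheres: dict mapping 'lh'/'rh' to a (betas, bold_signal) pair.
    """
    return mean(percent_signal_change(b, s) for b, s in hemispheres.values())


def shared_subjects(psc_roi_a, psc_roi_b):
    """Subjects who have both ROIs; only these enter a cross-ROI comparison."""
    return sorted(set(psc_roi_a) & set(psc_roi_b))


# HYPOTHETICAL values for one bilateral and one unilateral ROI.
bilateral = roi_psc({"lh": ([4.0, 6.0], [500.0, 500.0]), "rh": ([10.0], [500.0])})
unilateral = roi_psc({"lh": ([4.0, 6.0], [500.0, 500.0])})
common = shared_subjects({"S1": 1.2, "S2": 0.8, "S3": 1.0}, {"S2": 0.4, "S3": 0.5})
```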
As described in the Results, the color activation along the posterior–anterior extent of the VVP was not homogeneous; three main peaks in the activation were defined: posterior color-biased region (Pc), central color-biased region (Cc), and anterior color-biased region (Ac). To quantify this inhomogeneity, in each subject, we defined two ROIs (Fig. 4B, schematic, black circles) between the peak color-biased activations (white circles): iPCc (intermediate to Pc and Cc) and iCAc (intermediate to Cc and Ac). These ROIs were defined in each subject individually in the volume using even runs, from voxels that were nonsignificant in the contrast used to define the color-biased regions. Figure 4B bar plots show the PSCs for colored and achromatic stimuli, averaging across faces, bodies, scenes, objects, and scrambled objects.
To plot color bias as a function of shape bias (see Fig. 5C), we used an index of color selectivity, (response to color − response to achromatic)/(response to color + response to achromatic), and a comparable index of shape selectivity, (response to intact objects − response to scrambled objects)/(response to intact objects + response to scrambled objects). The response to color was the average PSC to chromatic intact objects and chromatic scrambled conditions, and the response to achromatic was the average PSC to the achromatic-intact and achromatic-scrambled conditions. The response to intact objects was the average PSC to chromatic and achromatic intact object conditions; the response to scrambled objects was the average PSC to chromatic and achromatic scrambled conditions. These indices were computed at the group level to avoid complications posed by dividing by noisy negative individual responses. Error bars were calculated using bootstrapping across subjects and represent 95% confidence intervals (10,000 bootstrapped samples) (Efron, 1982).
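The group-level index and its bootstrapped confidence interval can be sketched as follows. The percentile method for the interval is an assumption, since the text cites Efron (1982) without naming a specific variant. The function names and per-subject values are hypothetical.

```python
import random
from statistics import mean


def selectivity_index(preferred, nonpreferred):
    """(preferred - nonpreferred) / (preferred + nonpreferred)."""
    return (preferred - nonpreferred) / (preferred + nonpreferred)


def group_index(preferred, nonpreferred):
    """Index computed on group-mean responses rather than per subject."""
    return selectivity_index(mean(preferred), mean(nonpreferred))


def bootstrap_ci(preferred, nonpreferred, n_boot=10000, seed=0):
    """95% CI from resampling subjects with replacement (ASSUMPTION: percentile method)."""
    rng = random.Random(seed)
    n = len(preferred)
    samples = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # same subjects for both conditions
        samples.append(group_index([preferred[i] for i in idx],
                                   [nonpreferred[i] for i in idx]))
    samples.sort()
    return samples[int(0.025 * n_boot)], samples[int(0.975 * n_boot) - 1]


# HYPOTHETICAL per-subject PSC values for one ROI.
psc_chromatic = [1.2, 0.9, 1.5, 1.1, 1.3]
psc_achromatic = [0.8, 0.7, 1.0, 0.9, 0.6]

color_index = group_index(psc_chromatic, psc_achromatic)
ci_low, ci_high = bootstrap_ci(psc_chromatic, psc_achromatic)
```

Computing the index on group means keeps the denominator away from zero even when an individual subject's response is small or negative, which is the complication noted above.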
Group analysis (random effects) of color-biased activations (dynamic localizer)
To test the extent to which color activations were found at consistent locations in the brain across subjects (see Fig. 4), we performed a group random-effects analysis on the chromatic versus achromatic contrast, in which data from each subject were registered and normalized to a common template (Freesurfer, CVS_avg35).
Analysis of inhomogeneity in tSNR across the cortical surface
The ability to detect significant differences in the BOLD response in each region of the cortex depends on the signal-to-noise ratio (SNR), which is known to vary across the brain due to inhomogeneities in the magnetic field. To show this spatial variation in SNR, we computed temporal SNR (tSNR) for each subject as follows: the motion-corrected functional volumes were high-pass filtered with a cutoff of 0.004 Hz to remove slow drifts and, for each voxel, we computed the mean of the time course divided by its SD. To generate a mean tSNR map from the group of subjects, the tSNR map for each subject was registered to a common template (Freesurfer, CVS_avg35) and the volumes from all subjects were averaged.
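The voxelwise tSNR and the group average can be sketched as follows. The sketch assumes the time courses have already been motion corrected, high-pass filtered, and registered to the template, and it uses the sample SD, which the text does not specify. The function names and values are hypothetical.

```python
from statistics import mean, stdev


def tsnr(timecourse):
    """Temporal SNR of one voxel: mean of the time course divided by its SD.

    ASSUMPTION: sample SD (n - 1 denominator); filtering is done upstream.
    """
    sd = stdev(timecourse)
    return mean(timecourse) / sd if sd > 0 else float("inf")


def mean_tsnr_map(subject_maps):
    """Average tSNR voxel by voxel across subjects registered to a common template."""
    return [mean(values) for values in zip(*subject_maps)]


# HYPOTHETICAL filtered time course for one voxel, and two-voxel maps for two subjects.
voxel_tsnr = tsnr([100.0, 102.0, 98.0, 100.0])
group_map = mean_tsnr_map([[60.0, 20.0], [80.0, 30.0]])
```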
Color localization with colored gratings
In addition to localizing color-biased cortex using a contrast of chromatic > achromatic movie clips, color-biased regions were also localized in a subset of subjects using a more controlled stimulus paradigm containing no recognizable objects (full-field drifting gratings), comparable to the one used in an earlier study of macaque monkeys (Lafer-Sousa and Conway, 2013) and similar to those used in most prior work on color localization in humans (McKeefry and Zeki, 1997; Hadjikhani et al., 1998; Wade et al., 2008). This enabled us to directly compare the human data with the previously published macaque data (obtained using gratings), and establish the extent to which the natural videos recover the same regions localized with standard low-level stimuli.
In the three subjects who participated in this control experiment, color-biased activations were defined by contrasting responses to the chromatic grating conditions that elicited the weakest response in MT+ (these colors would suffer the least luminance contamination) with the achromatic grating condition that yielded the same magnitude of response in V1 (50% luminance contrast). We quantified the PSC to gratings in each of the ROIs defined using the dynamic natural stimulus, as well as regions V1, V2, and MT.
If the homology between humans and monkeys hypothesized in the present report is true, then color-biased regions in monkeys (Lafer-Sousa and Conway, 2013) and color-biased regions in humans (described presently) should show a similar pattern of color and shape selectivity when scanned using the same dynamic natural videos. To test this hypothesis, we scanned the two macaque subjects from Lafer-Sousa and Conway (2013) while the animals were shown the dynamic natural video stimuli using the same acquisition and preprocessing methods described in the earlier report. MION contrast agent was used in the monkey experiments. The stimulus block lengths for the dynamic localizer were extended (32 s) to account for the slower hemodynamic response function associated with MION (Vanduffel et al., 2001). The task was free viewing, as in the human experiments; animals were rewarded with juice for looking anywhere on the screen. We extracted PSC responses to the dynamic natural videos using the ROIs defined in the earlier report (independent data): the middle face patch (ML) and the set of color-biased ROIs (PLc, posterior lateral color, located in PIT; CLc, central lateral color, located in CIT; and ALc, anterior lateral color, in AIT). Bar charts (see Fig. 8A) show the mean PSC of the last 7 TRs in each stimulus block. A scatter plot (see Fig. 8B) shows the relative color and shape bias of each ROI (as computed for Fig. 5B in the humans).
Topography of face, color, and place responses
We first asked to what extent the functional organization of the human VVP resembles that of macaque monkey, where parallel and adjacent processing systems have been reported for faces, places, and color, with color-biased cortex sandwiched between face-selective regions (superiorly) and place-selective regions (inferiorly) (Lafer-Sousa and Conway, 2013; Verhoef et al., 2015). To do so, 13 subjects were scanned with fMRI while viewing short video clips containing multiple stimulus categories, each presented (in different blocks) in either full color or luminance-matched gray scale (Fig. 1A). Face-selective and place-selective regions were defined in each subject individually (contrasts were faces > objects and places > objects, respectively) and are shown in Figure 1B (outlines) on the inflated cortical surface for four example subjects (voxel threshold of p < 0.0001). Face-selective regions (purple outlines, Fig. 1B) and place-selective regions (green outlines, Fig. 1B) were found in most subjects, typically bilaterally. The PPA (Epstein and Kanwisher, 1998) and FFA (Kanwisher et al., 1997) were detected in all 13 subjects and appeared bilaterally; the OFA (Pitcher et al., 2011a, 2011b) was detected in 12/13 subjects (11/12 bilaterally); and the ATL face area (Kriegeskorte et al., 2007; Collins and Olson, 2014) was detected in 9/13 subjects (3/9 bilaterally). The FFA frequently consisted of a posterior and anterior component (FFA-1/pFus-faces and FFA-2/mFus-faces; Weiner and Grill-Spector, 2010). In addition to the face-selective regions of the VVP, the dorsal face-selective region of the pSTS (Puce et al., 1998) was detected in 11/13 subjects (8/11 bilaterally). The scene-selective retrosplenial cortex (Epstein et al., 2007) and occipital place area (Julian et al., 2012) were outside of the region scanned in many subjects and were not analyzed.
Activation maps in Figure 1B show those voxels with greater activation to colored versus achromatic video clips. In each subject, the color-biased activation was sandwiched between the face-selective regions and the place-selective regions. The volume-wide contrast (smoothed 3 mm Gaussian FWHM) of the color-activation pattern showed a mottled band of color-biased activation running posterior-to-anterior along the VVP, within which we typically observed multiple peaks. To quantify the activity, we defined three color-biased regions of interest in most subjects (the interdigitating tissue is further examined in the “Quantification of the magnitude of the color bias across the VVP” section below). In all subjects (11/13 subjects bilaterally), we observed a posterior color-biased region that we refer to as Pc (for “posterior color”; Talairach coordinates: −32, −76, −7), which corresponds to an area originally referred to by Zeki as V4 (Zeki et al., 1991) and hV4 by others (Wade et al., 2002; Brewer et al., 2005). In most subjects, this region extended beyond the V4 border (anteriorly) into VO-1 (Brewer et al., 2005). We observed a second color-biased region in all subjects (12/13 subjects bilaterally), about 1 cm anterior to Pc, medial to the midfusiform sulcus, and often extending into the CoS. This region, referred to here as Cc (for “central color”; Talairach coordinates: −25, −54, −10), corresponds to V4α (Bartels and Zeki, 2000), V8 (Hadjikhani et al., 1998) and part of the VO complex (Wade et al., 2002). In at least one hemisphere of 10/13 subjects (6/10 bilaterally), we identified a third even more anterior region that we refer to as Ac (for “anterior color”; Talairach coordinates: −32, −37, −8). Ac was located anterior to the VO complex. 
This region falls in the neighborhood of activations that have previously only been observed in studies involving a demanding color behavior task (Martin et al., 1995; Simmons et al., 2007), and most closely resembles the region described by Simmons et al. (2007) (located in the left fusiform gyrus), that is hypothesized to be a “color-knowledge” region. Color localization studies that use low-level stimuli and passive viewing (e.g., standard Mondrians) do not report this region (McKeefry and Zeki, 1997; Hadjikhani et al., 1998; Wade et al., 2002, 2008; see “The anterior color-biased region” section below).
Figure 2 shows thresholded activation maps of face, color, and place preferences for the group overlap (Fig. 2A) and for each individual subject (Fig. 2B). The group overlap map shows locations where at least five subjects had voxelwise activation overlap for each contrast (see Materials and Methods). The pial surface view of the group overlap map shows three parallel bands in each hemisphere running along the posterior–anterior axis: color-biased activations (blue) were sandwiched between face-selective (purple) and place-selective activations (green), with face-selective cortex lateral to color-biased cortex, and place-selective cortex medial. This pattern was even more obvious in the inflated and flattened views (Fig. 2A, bottom) and was evident in data from most individual subjects (Fig. 2B). Three subjects also showed some color-biased activation lateral to the FFA (S5, S7, S10; Fig. 2B). In some subjects, Cc overlapped partially with the PPA (overlap indicated in cyan). Although viewing activations on the inflated surface allows us to observe their relative position on the cortical sheet, smaller activations sometimes fail to project despite being visible in the volume (Fischl et al., 1999; Operto et al., 2008). Activations that fall in regions of high signal inhomogeneity are particularly vulnerable to imperfect registration to the anatomy, which can lead to surface projection failure (Tucholka et al., 2012). Region Ac is acutely susceptible given its small size and proximity to the ear canals and is in a region of known signal inhomogeneity (discussed in “The anterior color-biased region” section below). Despite being visible in the native volume (where ROIs were defined), it failed to visibly project to the surface in several subjects (S2, S4, S8, S9, S11, and S13). For example, Ac was not clear in the surface for subject 2 (Fig. 2B), but was obvious in the native volume for this subject (slices; see Fig. 4).
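The group overlap rule described above (a voxel is shown where at least five subjects had suprathreshold activation for a contrast) can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function name `group_overlap`, the array layout, and the toy data are assumptions, and it presumes that each subject's thresholded map has already been resampled into a common space.

```python
import numpy as np

def group_overlap(subject_maps, min_subjects=5):
    """Count, per voxel, how many subjects pass threshold for one contrast.

    subject_maps: boolean array of shape (n_subjects, n_voxels); True where
    that subject's contrast map exceeds the voxelwise threshold (assumed to
    be in a common space already).
    Returns the per-voxel subject count and the mask of voxels where at
    least `min_subjects` subjects overlap.
    """
    maps = np.asarray(subject_maps, dtype=bool)
    counts = maps.sum(axis=0)
    return counts, counts >= min_subjects

# Toy data (made up for illustration): 13 subjects, 3 voxels.
toy_maps = np.zeros((13, 3), dtype=bool)
toy_maps[:6, 0] = True   # voxel 0 is suprathreshold in 6 subjects
toy_maps[:4, 1] = True   # voxel 1 in 4 subjects
toy_maps[:5, 2] = True   # voxel 2 in 5 subjects
toy_counts, toy_overlap = group_overlap(toy_maps)
```

In this toy case only voxels 0 and 2 would appear in the overlap map, because voxel 1 falls short of the five-subject criterion.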
In sum, we find a parallel and adjacent structure of face, color, and place preferences on the ventral surface of the brain in humans. This pattern mirrors the systematic organization found on the lateral surface of macaque inferior temporal (IT) cortex, suggesting a broad homology between the two regions.
Segregation of color and category in the VVP
To what extent are color and category information processed separately in the VVP? The brain activation maps described in the previous section show some spatial segregation of preferences for color, faces, and places. As described in the Materials and Methods, ROI-based analyses make possible stronger tests of our hypotheses about the functional relationship between color and other stimulus dimensions. Therefore, category-selective (OFA, FFA, ATL, PPA) and color-biased (Pc, Cc, and Ac) ROIs were defined in each subject individually using data from even runs and the response magnitude of each region to each of the 10 conditions was quantified using the independent data from odd runs (Fig. 3A–F,H). For comparison, we also defined a region that responded preferentially to intact objects (defined by shape selectivity not category selectivity) that is adjacent to Pc and OFA (region LO), and a face-selective region (pSTS) on the lateral surface (as opposed to the ventral surface) (Fig. 3G,I). We discuss responses in these two regions in the “Anatomical segregation of color and shape preferences” and “Relationship to monkey organization” sections.
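The split-half logic described above (ROIs defined from even runs, responses measured in odd runs) can be sketched as follows. This is a hedged illustration under stated assumptions rather than the pipeline used here: the function names, the array layout, the optional `search_mask` argument, and the default threshold of p < 0.0001 are illustrative choices.

```python
import numpy as np

def define_roi(contrast_p_even, p_thresh=1e-4, search_mask=None):
    """Select ROI voxels from the even-run localizer contrast.

    contrast_p_even: per-voxel p values for the defining contrast
    (e.g., color > achromatic), computed from even runs only.
    search_mask: optional boolean mask restricting the anatomical search
    space (a hypothetical convenience, not described in the text).
    """
    mask = np.asarray(contrast_p_even) < p_thresh
    if search_mask is not None:
        mask &= np.asarray(search_mask, dtype=bool)
    return mask

def roi_responses(psc_odd, roi_mask):
    """Average odd-run percent signal change over the ROI voxels.

    psc_odd: array of shape (n_conditions, n_voxels) from odd runs only,
    so the measurement is independent of the data that defined the ROI.
    """
    return np.asarray(psc_odd)[:, roi_mask].mean(axis=1)

# Toy example (made-up numbers): 4 voxels, 2 conditions.
toy_p_even = [1e-6, 0.5, 1e-5, 0.2]
toy_psc_odd = [[1.0, 0.0, 3.0, 0.0],
               [0.5, 0.0, 1.5, 0.0]]
toy_roi = define_roi(toy_p_even)
toy_resp = roi_responses(toy_psc_odd, toy_roi)
```

Keeping the defining and measuring data separate is what prevents the selection step from inflating the measured selectivity.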
An omnibus three-factor (ROI, category, and color/achromatic) ANOVA across the category-selective and color-biased ROIs of the VVP (OFA, FFA, ATL, PPA, Pc, Cc, and Ac) confirmed that these regions differ significantly from each other in their category selectivity (ROI × category interaction, p < 0.0001) and color selectivity (ROI × color interaction, p < 0.0001). These significant differences, as well as our prior hypotheses, license subsequent analyses of each ROI individually. Two-factor ANOVAs on stimulus category (faces, scenes, objects, and bodies) × color (color/achromatic) were run on each of the three color-biased regions (Pc, Cc, and Ac) and each of the four category-selective regions (OFA, FFA, ATL, and PPA) individually. The results of those analyses are shown in Table 1.
As expected, color-biased regions responded more strongly to colored than to achromatic stimuli; the size of this effect in each ROI can be seen in Figure 3 (significance levels are reported in Table 1). Category-selective regions showed the expected strong (Fig. 3) and significant (Table 1) selectivity for their preferred category. However, most of the ROIs were selective for both color and category (Table 1). This result reflects in part the great statistical power of the ROI method (Saxe et al., 2006), which is able to detect statistically significant but very small effects, such as that of color in the FFA (the blue bar is slightly higher than the black bar in the first pair of bars in Fig. 3E). Are the color ROIs (Fig. 3A–C) more color selective than the category-selective ROIs (Fig. 3D–F,H), and are the category-selective ROIs more category selective than the color ROIs? Because there is no uncontroversial pairing of a particular color-biased ROI with a particular category-selective ROI, we answered these questions using a stringent exhaustive analysis of all 12 pairwise ANOVAs, contrasting each color ROI with each category ROI. This analysis confirmed that each color-biased ROI is more color selective than each category-selective ROI (i.e., p < 0.01 for the interaction of ROI × color in 12/12 pairwise ANOVAs), and each category-selective ROI is more category selective than each color ROI (i.e., p < 0.0001 for the interaction of ROI × category in 12/12 pairwise ANOVAs). Note that the large number of statistical comparisons used here does not require correction for multiple comparisons because our hypothesis required that each of the tests (rather than any of them) reach significance.
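The exhaustive pairing can be made concrete with a short sketch: three color-biased ROIs crossed with four category-selective ROIs gives the 12 pairwise comparisons, and the hypothesis is supported only if every one of them reaches significance. The function names and the placeholder p values below are assumptions for illustration; they are not values from Table 1.

```python
from itertools import product

COLOR_ROIS = ("Pc", "Cc", "Ac")
CATEGORY_ROIS = ("OFA", "FFA", "ATL", "PPA")

def roi_pairs():
    """All (color ROI, category ROI) pairs contrasted in the pairwise ANOVAs."""
    return list(product(COLOR_ROIS, CATEGORY_ROIS))

def conjunction_holds(p_values, alpha):
    """True only if every pairwise interaction test is significant.

    p_values: dict mapping each ROI pair to the p value of its
    ROI x color (or ROI x category) interaction.
    """
    return all(p < alpha for p in p_values.values())

# Placeholder p values (made up), one per pair.
toy_p = {pair: 0.001 for pair in roi_pairs()}
all_significant = conjunction_holds(toy_p, alpha=0.01)

# A single nonsignificant pair is enough to reject the conjunction.
toy_p_one_miss = dict(toy_p)
toy_p_one_miss[("Ac", "ATL")] = 0.2
one_miss_significant = conjunction_holds(toy_p_one_miss, alpha=0.01)
```

Because the criterion is "all tests significant" rather than "any test significant", adding more tests makes the criterion harder, not easier, to meet by chance.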
The statistical tests presented thus far provide evidence of some anatomical segregation of category and color responses: color-biased regions were more color selective than category-selective regions, and category-selective regions were more category selective than color-biased regions. If color and category processing were strongly independent, then each kind of information would be carried by different populations of neurons with no interaction. This strong hypothesis predicts that responses to color and category would not interact within an ROI (instead, the effect of the two factors on the response of a voxel containing the two distinct neural populations would be additive). This prediction follows from an extension to fMRI (Sternberg, 2011) of the classic additive factors logic from Sternberg (1969). All seven ROIs in the ventral cortex were tested (Pc, Cc, Ac, OFA, FFA, ATL, and PPA) and only the PPA showed a significant interaction of color by category (see Table 1).
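The additive prediction can be written out explicitly. The notation below is ours and is offered only as a sketch of the logic: let $R_{ck}$ be the mean response of an ROI to color condition $c$ (chromatic or achromatic) and stimulus category $k$.

```latex
R_{ck} = \mu + \alpha_c + \beta_k + \gamma_{ck},
\qquad
\text{independence} \;\Rightarrow\; \gamma_{ck} = 0 \ \text{for all } c, k,
\qquad\text{so that}\qquad
R_{\mathrm{chrom},k} - R_{\mathrm{achrom},k}
  = \alpha_{\mathrm{chrom}} - \alpha_{\mathrm{achrom}}
  \ \text{for every } k .
```

Under this reading, the absence of a significant color × category interaction corresponds to a color effect that has the same size for every category; a nonzero $\gamma_{ck}$ would mean that the color effect depends on category.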
These analyses provide evidence for a double dissociation between color and category preference across the VVP, as well as statistical independence of color and category preferences within six of the seven ROIs.
Quantification of the magnitude of the color bias across the VVP
As described above, the color-biased activations comprised a mottled band running posterior to anterior along the VVP (Figs. 1B, 2, 4A). Within this band, the three main peaks (Pc, Cc, and Ac) were identifiable in most subjects (10/13). In each subject, each peak was typically separated by tissue that did not show a significant color bias. Moreover, peaks were sufficiently stereotyped in their location within the brain such that they are apparent in a random-effects group analysis (Fig. 4B). In the “Segregation of color and category in the VVP” section, we showed that these regions were significantly more color biased than category-selective regions, but how strong is their color selectivity relative to the VVP more broadly? In particular, are the color-biased regions more color selective than the cortical tissue that immediately surrounds them?
To answer this question, we used an ROI-based analysis that enabled us to leverage the power obtained by measuring responses in many subjects (see Materials and Methods). This method is powerful because it enables one to uncover activations that may not be significant in a given ROI within a single subject, but may be significant within that ROI when pooling results from many subjects. Using this method, we asked, are the color-biased regions more color biased than the immediately intervening patches of cortex?
In each individual subject, we used data obtained in even runs to define two ROIs located between the color-biased activations with respect to the posterior–anterior axis (Fig. 4C, schematic, black circles). These included region iPCc (between Pc and Cc) and iCAc (between Cc and Ac). We used results obtained during odd runs to quantify responses. Figure 4C shows each region's mean percent signal change to all chromatic stimuli (cyan bars) and all achromatic stimuli (gray bars) (“all” = faces, bodies, scenes, objects, and scrambled objects). Although neither intermediate region was color biased in the individual subject contrast maps, both showed main effects of higher responses to chromatic than achromatic stimuli in the ROI analysis [both p < 0.01; 2-factor ANOVAs on stimulus class (faces, bodies, scenes, objects, and scrambled objects) × stimulus color]. Nonetheless, the color-biased regions were more color biased than iPCc and iCAc: Pc and Cc were each more color biased than iPCc (p < 0.001) and Cc and Ac were each more color biased than iCAc (p < 0.003; pairwise 3-way ANOVAs on ROI × color × stimulus class, each contrasting an intermediate region with one of its neighboring color regions). These results show that, although a weak color preference extends into surrounding cortex, the color-biased regions represent reproducible peaks in that spatial pattern of color bias.
Anatomical segregation of color and shape preferences
Prior work has documented regions of cortex that respond more strongly to intact than scrambled objects, including a posterior region called LO and an anterior region often referred to as posterior fusiform (pFS) (Grill-Spector et al., 1999). How segregated is shape processing from the analysis of color? As with the case of color and category described above, this question can be tested both in terms of anatomical segregation of preferences for shape and color, and functional independence within a region. Do responses within ROIs show an interaction of sensitivity to color and shape?
Following standard practices, we defined shape selectivity as a higher response to intact objects than scrambled objects (p < 0.0001) (Malach et al., 1995). The location of cortical regions showing a shape preference was anatomically segregated from the location of color-biased regions for most of the VVP. Figure 5A shows a surface histogram of the voxelwise group overlap map for shape selectivity. The large swath of shape-selective cortex (consistent with published group-derived parcels for LO and pFS, dark blue outlines; Julian et al., 2012) showed little overlap with color-biased cortex (outlined in light blue, Fig. 2A), except for the most anterior region where some overlap of shape and color preferences was evident. This pattern suggests that color and shape processing is segregated in posterior VVP but converges at more anterior stages of processing. To test this hypothesis, we analyzed data from individual subjects. We quantified the response profiles of shape-selective ROIs (LO and pFS) and the color ROIs in each subject. Figure 5B shows the average PSC across subjects to intact (O) and scrambled objects (SO), presented with color and without color. Two-factor ANOVAs on stimulus shape (Intact/Scrambled) × color (Chromatic/Achromatic) were run separately on Pc, Cc, and LO. Because Ac overlapped slightly with pFS in 5 of 10 subjects (i.e., the ROIs comprised common voxels), quantification of the responses of these regions was not independent of each other and will be discussed in the next section. As expected, Pc and Cc showed main effects of higher responses to chromatic than achromatic stimuli (p < 0.0001). Neither region showed a preference for intact over scrambled objects; to the contrary, Pc had a significant effect in the opposite direction. 
Conversely, LO showed a significant preference for intact shape (p < 0.0001), but only a small preference for chromatic over achromatic stimuli (p = 0.003), which was driven mostly by the scrambled object condition (see Fig. 5B, bar plot). To test whether the color ROIs were more color biased than LO, and whether LO was more shape selective than the color ROIs, we ran two three-factor ANOVAs, each contrasting one of the two posterior color ROIs with LO. This analysis confirmed that both the posterior and central color-biased regions were more color selective than LO (i.e., p < 0.001 for the interaction of ROI × color in each pairwise ANOVA) and LO was more shape selective than Pc and Cc (i.e., p < 0.001 for the interaction of ROI × shape in each pairwise ANOVA). These analyses confirm the dissociation of color and shape preferences among the posterior color-biased regions (Pc, Cc) and LO. This dissociation can also be appreciated in Figure 5C, which plots the color bias as a function of the shape bias for all the ROIs. In this plot, the three color-biased regions are located in the upper half of the plot and are clearly separated by a horizontal line from the other regions; the shape-biased regions (notably LO) are located on the right of the plot and could be separated by a vertical line from the color-biased regions.
We also tested whether color preferences depended on the presence of shape information or vice versa. This hypothesis predicts a superadditive interaction in which the presence of both color and shape information in a stimulus has a greater effect on the BOLD response than the sum of the two main effects. Contrary to this prediction, although all three regions (Pc, Cc, and LO) showed significant interactions of color and shape (p < 0.03 for the interaction of color × shape), these interactions went in the opposite direction, reflecting a greater sensitivity to color in scrambled rather than intact shapes (presumably because of the larger number of color edges present in the scrambled images).
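The direction of a color × shape interaction in this 2 × 2 design can be read from a single contrast: the color effect for intact objects minus the color effect for scrambled objects. The sketch below illustrates that arithmetic only; the function name and the response values are made up and are not taken from Figure 5B.

```python
def interaction_contrast(chrom_intact, achrom_intact,
                         chrom_scrambled, achrom_scrambled):
    """Color-by-shape interaction contrast for a 2 x 2 design.

    Positive values indicate a superadditive pattern (larger color effect
    for intact objects); negative values indicate a larger color effect
    for scrambled objects.
    """
    color_effect_intact = chrom_intact - achrom_intact
    color_effect_scrambled = chrom_scrambled - achrom_scrambled
    return color_effect_intact - color_effect_scrambled

# Made-up percent-signal-change values for illustration.
# Color effect is 0.5 for intact and 1.0 for scrambled objects.
toy_negative = interaction_contrast(2.0, 1.5, 2.0, 1.0)
# Color effect is 2.0 for intact and 0.5 for scrambled objects.
toy_superadditive = interaction_contrast(3.0, 1.0, 1.5, 1.0)
```

A negative contrast, as in the first toy case, corresponds to the direction described above, in which color sensitivity is greater for scrambled than for intact shapes.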
Together, these results show that color processing and shape processing occupy largely segregated portions of posterior VVP. Next we turn to the more anterior portions of the VVP, including color region Ac, which showed evidence for modulation by both color and shape.
Anterior color-biased region
We defined Ac in each of the 10 subjects in whom it could be found in at least one hemisphere, and quantified the magnitude of response to each of the 10 conditions in left-out data from the same subjects (Fig. 3). As described above (see “Segregation of color and category in the VVP”), we found in a two-factor ANOVA of color × stimulus category that this region showed a significant main effect of color, but no main effect of category and no interaction of the two. Next, we investigated whether this region was sensitive to object shape (Fig. 5) by conducting a two-factor ANOVA (intact/scrambled objects × chromatic/achromatic). In addition to the expected main effect of color (p = 0.01), this analysis found a significant main effect of shape (p = 0.01), suggesting a functional difference between this color-biased region and the two posterior color-biased regions (Pc and Cc), where we did not see significant sensitivity to intact shape. A three-factor ANOVA of ROI (Pc, Cc, and Ac) × Shape (intact/scrambled objects) × Color (chromatic/achromatic) confirmed that the anterior region was significantly more shape selective than the posterior color regions (interaction of ROI × shape, p < 0.0001; driven by the preference for intact objects in Ac but not Pc, pairwise ANOVA, p < 0.0001). We found no interaction between color and shape information in the anterior color region (p = 0.6), suggesting that, although both kinds of information are present, each dimension is processed independently within the region (see Materials and Methods).
In Figure 5C, the color ROIs (cyan triangles) fall along the shape bias dimension in a pattern that mirrors their arrangement on the ventral surface: Pc falls to the left (prefers scrambled objects to intact objects), Ac to the right (some preference for shape), and Cc in between (no shape bias). By contrast, the shape-biased regions (LO and pFS, dark blue symbols), face-selective regions (OFA and FFA; gray), and iCAc (gray) all cluster on the right (high shape bias) and low on the y-axis (weak color bias).
Despite being reported less often in the literature than the posterior and central color regions, the anterior color region was evident both in the random-effects group analysis (Fig. 4B) and in individual subject data (in 10/13 subjects). Why do we see the anterior region in most subjects whereas many previous studies have not? First, we went to lengths to optimize tSNR in this region (see Materials and Methods), which lies in a part of the brain notorious for signal dropout (due to the presence of the ear canals; Fig. 6). Although tSNR was still low in this part of the brain, we were also able to reliably detect (in 9/13 subjects) the anterior face region ATL, an area that has only been reliably detected (in >50% of subjects) through innovative signal enhancing efforts (Axelrod and Yovel, 2013). Also note that the inhomogeneity observed within the mottled band of color-biased activity cannot be attributed to variations in tSNR because the tSNR is strong between the color-biased regions (iPCc and iCAc). Second, more complex or engaging stimuli may be necessary to elicit activity in Ac. The few studies that have localized anterior color responses required demanding color behavior tasks (Martin et al., 1995; Beauchamp et al., 1999; Simmons et al., 2007). Simpler stimuli, such as those used in many studies of color, may elicit weaker responses than dynamic natural movies, especially from regions implicated in high-level object vision. We tested this hypothesis by measuring responses in three subjects to both video clips and gratings. Figure 7A shows the results for an example subject. The left panel shows the color activations observed using the movie clips (Pc, Cc, and Ac, outlined in white). The right panel shows the results of the gratings experiment (for the contrast of chromatic gratings > achromatic gratings). Only the two posterior regions, Pc and Cc, were detected in the gratings experiment (right panel; black arrow indicates absence of Ac activation). 
Responses to the movies (both colored and achromatic) were much higher across the visual system in both the early visual cortex (V1, V2) and the VVP (Fig. 7B; average PSCs across the three subjects). This higher activation is perhaps unsurprising: the movies are more engaging than the gratings and they comprise a rich mixture of low-level visual features, including both luminance and colored components.
Is it possible that color activations extend beyond Ac but are obscured by low tSNR? The flattened surface view (Fig. 6B, bottom) shows the extent of signal coverage obtained presently for an example subject; regions anterior and medial to Ac are particularly impoverished (white asterisks), making it difficult to discern whether color activations span the full length of the temporal lobe (as they appear to in monkeys).
Relationship to monkey organization
The topographic relationship of the face, color, and place activation patterns, and the relative functional dissociation of color-biased and category-biased cortex in humans, mirror results we reported in macaque IT cortex (Lafer-Sousa and Conway, 2013; Verhoef et al., 2015). To further assess the extent to which color and shape information are dissociated in the monkey, we scanned the same animals from the earlier report using the dynamic natural video stimuli used here in humans. We quantified the responses to video stimuli in the set of color-biased ROIs defined in the original report: PLc, CLc, and ALc. The macaque color-biased regions responded similarly to the human color-biased regions: posterior regions (PLc and CLc) showed a color bias but no shape bias, whereas the anterior region ALc showed both a color bias and a shape bias (Fig. 8). For comparison, we quantified responses in the middle face patch ML. ML lacked a striking color bias but showed a shape bias, consistent with the results obtained in human FFA.
In addition to the face-selective regions in the human VVP (OFA, FFA, and ATL), a more dorsal face-selective region has been reported in the pSTS (Collins and Olson, 2014). We detected this region in 11/13 subjects and quantified its responses in independent data (see Fig. 3). A two-factor ANOVA on these responses showed a main effect of category, a (weak) interaction of color × category, but no main effect of color (Table 1). The putative homology of human ventral regions and macaque IT raises the question of whether macaque monkeys have a region comparable to the human pSTS region. Freiwald and colleagues have proposed a division of the monkey face system into two streams (Yovel and Freiwald, 2013; Fisher and Freiwald, 2015). On this scheme, the ventral face stream in macaques, which follows the ventral lip of the STS and includes patches PL, ML, and AL, corresponds to regions OFA, FFA, and ATL in humans, whereas the dorsal face stream in macaques, residing along the dorsal lip of STS and comprising MD and AF (Fisher and Freiwald, 2015), corresponds to human pSTS (and perhaps aSTS). This scheme would account for the functional dissociation found in both species (Haxby et al., 2000; Pitcher et al., 2011a, 2011b, 2014; Fisher and Freiwald, 2015) between face regions identifiable by responses to static images (PL/ML/MF/AL in macaques; OFA and FFA in humans) versus those that require dynamic stimuli (the dorsal face stream, MD/AF in macaques; pSTS in humans). Figure 9 provides a direct comparison of functional data obtained in a representative monkey and human subject.
Together, these results suggest that the entire face/color/place tripartite cortical system is broadly homologous in macaque monkey and human and, moreover, that this system migrated onto the ventral surface in humans as the cortex expanded over the course of evolution (see Discussion). Establishing this homology greatly empowers studies of high-level vision in humans by making directly relevant to humans the results from invasive studies in macaques.
Here, we address the functional organization of color processing in the human brain and its relationship to regions implicated in processing faces, objects, and places, using dynamic naturalistic stimuli that robustly drive the VVP. We confirm the existence of a mottled band of color-biased cortex extending along the length of the VVP, within which we identified three main peaks of activation that correspond to previously described color-preferring regions. We found that the series of color-biased regions was sandwiched between bands of face-selective and place-selective regions, mirroring prior results from macaques. In addition, we found substantial anatomical segregation of color-biased regions from those preferring shape or category. Below, we discuss two implications of this work: (1) the homology that it suggests between lateral regions of monkey cortex and ventral regions of human cortex; and (2) the segregation that it reveals of the cortical processing of color, shape, and category.
Color regions are sandwiched between face- and place-selective regions, as in monkeys
We identified three color-biased regions: two regions within posterior VVP, referred to here as Pc and Cc (McKeefry and Zeki, 1997; Hadjikhani et al., 1998; Wade et al., 2002; Brewer et al., 2005; Simmons et al., 2007), and an anterior region, Ac (likely corresponding to an unnamed region reported by Simmons et al., 2007). These regions were neatly sandwiched in each subject between face-selective regions on the lateral side and place-selective regions on the medial side. This systematic arrangement of three parallel sets of regions running along the posterior-to-anterior axis of ventral occipitotemporal cortex mirrors the organization of lateral IT in macaque monkey (Fig. 9; Lafer-Sousa and Conway, 2013) and suggests that the entire tripartite cortical structure responsible for face/color/place processing constitutes a “regional homology” (Orban et al., 2004) between the two species.
If this face/color/place system is homologous between humans and macaques, then why is it in such a different location within the brain in the two species? We speculate that this system was pushed onto the ventral surface of the brain over the course of human evolution by the expansion of regions engaged in language and social cognition (white arrow, Fig. 9). The human cortex expanded 10-fold since humans and macaques diverged from a common ancestor, but this expansion was not uniform. Regions involved in communicative and social behaviors, including the temporal parietal junction and the ventrolateral prefrontal cortex, expanded as much as 30-fold, whereas visual areas expanded relatively little (2- to 6-fold; Orban et al., 2004; Hill et al., 2010; Chaplin et al., 2013). Therefore, the identification of neighborhood relationships, like the parallel organization of face/color/place selectivity, may be a better cue to homology than gross anatomical correspondences. Our hypothesized homology of the face/color/place tripartite system between humans and macaques is consistent with prior hypotheses about the face system (Tsao et al., 2008; Yovel and Freiwald, 2013) and could be investigated further using tests of connectivity, cytoarchitecture, and gene expression.
Dissociation of color from shape and category
How segregated is the processing of color and shape in the brain? The spatial configuration of colors within a scene has a profound influence on color perception, leading some to argue that color and form are linked inextricably in visual cortical processing, especially in V1, early in the visual-processing hierarchy (Gegenfurtner and Kiper, 2003; Shapley and Hawken, 2010). Conversely, color presents computational challenges somewhat distinct from those associated with shape and scene processing. The information provided by color and shape is often independent, for example, in signaling a person's emotional state independently of face features or the edibility of a piece of fruit (e.g., bananas ripen green to yellow without changing shape); moreover, color and luminance edges are often independent in natural scenes (Hansen and Gegenfurtner, 2009). These observations suggest that color might be processed separately from shape.
We found substantial segregation of color and shape processing in the VVP, consistent with prior evidence for separate pathways in the VVP for processing surface properties (texture and color) versus geometric/shape properties of objects in humans (Cant et al., 2008, 2009; Cant and Goodale, 2009; Cavina-Pratesi et al., 2010). Specifically, the two posterior color-biased regions did not show a preference for intact objects (responding equally, if not more strongly, to their scrambled counterparts; Fig. 5B). Conversely, shape-biased cortex (LO) showed only a weak preference for color (and this was found only for scrambled, not intact, objects). This dissociation of color and form processing accords with prior fMRI studies and with double dissociations reported in neurological patients, in which color perception can be impaired while form perception is spared (e.g., patient MS, Bouvier and Engel, 2006) or vice versa (e.g., patient DF, Humphrey et al., 1994).
Objects can be defined not only by their shape but also by the more abstract category (e.g., faces or places) to which they belong. We found that the VVP showed segregation, not only for shape and color, but also for category and color: each color-biased region showed a higher response for colored than achromatic stimuli for all stimulus categories, and this effect did not differ in magnitude across categories. Further, each color-biased region was more color biased than each category-selective region. This anatomical segregation of regions preferring color versus those preferring object category suggests that category-selective regions rely primarily on luminance information; the lack of color selectivity within shape- or category-selective regions is consistent with the observation that one can recognize most objects from achromatic images (Gregory, 1977). The hypothesized dissociation of color and category processing supported by the present results could be further evaluated using adaptation (Cavina-Pratesi et al., 2010) and multivariate pattern analysis (Brouwer and Heeger, 2013).
Our findings agree with the available data from causal methods. First, the majority (72%) of cortical lesion patients who acquire achromatopsia also acquire prosopagnosia, but rarely object agnosia (8%) (Bouvier and Engel, 2006). This pattern is expected given that cortical lesions are typically larger than the scale of any single functional domain, and color and face regions tend to be adjacent. Second, the few cases of electrical stimulation of the fusiform gyrus in humans produce perceptual effects (Rangarajan et al., 2014) consistent with our findings. For example, stimulation of an electrode overlying FFA of one subject elicited the response “You turned into someone else. Your face metamorphosed. Your nose got saggy and went to the left.” Stimulation of an electrode overlying non-face-selective tissue (posterior to the FFA, likely near Pc) prompted the response, “Light moved across my eyes. It was green, purple, and yellow light together” (Rangarajan et al., 2014).
Overlap of shape and color at more anterior locations of the VVP
The perception of complex scene structure modulates multiple aspects of color perception: color constancy depends on knowledge of 3D scene structure (Bloj et al., 1999; Lafer-Sousa et al., 2015), and pan-field color filling-in fails when scene structure is scrambled (Balas and Sinha, 2007). Moreover, the memory of object colors can modulate object appearance (Hansen et al., 2006; Olkkonen et al., 2010; Witzel and Gegenfurtner, 2013) and facilitate object and scene recognition (Tanaka and Presnell, 1999; Gegenfurtner and Rieger, 2000; Oliva and Schyns, 2000; Nijboer et al., 2007) when color is specifically associated with a given object or scene. These observations suggest that color and shape interact at a high level of visual processing (Tanaka et al., 2001). In support of an interaction of color and shape processing, we found anatomical convergence of color and shape bias in the most anterior color-biased region, a region implicated in color knowledge (Simmons et al., 2007). We observed a similar result in macaques, in which the anterior color-biased region (ALc) showed a shape bias whereas the more posterior color-biased regions did not (Fig. 9). This result is consistent with physiological findings of intermingled cell types in the neighborhood of ALc (cells selective for color, shape, or both) (Komatsu and Ideura, 1993; Edwards et al., 2003). We speculate that the color-biased regions of the VVP form a multistage system for the analysis of color, analogous to that observed in the adjacent face system (Lafer-Sousa and Conway, 2013; Tsao, 2014). This hypothesized architecture is consistent with functional differences between posterior and anterior color regions reported here and elsewhere (Simmons et al., 2007) and with observed differences in the color-processing deficits that can result from lesions in different anatomical locations (Miceli et al., 2001). 
The specific computations that these regions support are largely unknown, but the functional architecture uncovered here provides a roadmap for their investigation.
In sum, our data support two major conclusions. First, humans and macaques show striking similarity in the topographic relationship of color-biased regions, face-selective regions, and place-selective regions, suggestive of a broad interspecies regional homology. Second, color, shape, and category selectivity appear to be substantially segregated throughout much of the VVP. These results invite future work into the computations, connectivity, and developmental origins of the tripartite face/color/place system, a research program made more tractable by the system's apparent homology to its macaque counterpart.
Supplemental material for this article is available at http://web.mit.edu/bcs/nklab/rosaStims.shtml. Provided is a high-resolution download of sample stimuli used to localize face, color, place, and shape biased cortex (Movie 1). This material has not been peer reviewed.
This work was supported by the National Institutes of Health (NIH Grant EY13455 to N.G.K., Grant EY023322 to B.R.C., and Grant 5T32GM007484-38 to R.L.S.) and the National Science Foundation (STC Award CCF-1231216 to N.G.K., Grant 1353571 to B.R.C., and a Graduate Research Fellowship to R.L.S.). The human work was performed at the Athinoula A. Martinos Center for Biomedical Imaging at Massachusetts Institute of Technology (MIT). The animal work was performed at the Athinoula A. Martinos Center for Biomedical Imaging at the Massachusetts General Hospital using resources provided by the Center for Functional Neuroimaging Technologies (Grant P41EB015896) and a P41 Biotechnology Resource grant supported by the National Institute of Biomedical Imaging and Bioengineering. This work also involved the use of instrumentation supported by the NIH Shared Instrumentation Grant Program and/or High-End Instrumentation Grant Program (Grant S10RR021110). We thank Steve Shannon, Sheeba Arnold, and Atsushi Takahashi of the MR team at MIT for developing custom acquisition protocols and other resources; L. Wald, A. Mareyam, and the rest of the MR team at Massachusetts General Hospital for providing the four-channel MR coil and other resources; Joseph Mandeville for the jip analysis software; Jenelle Feather and Walid Bendris for stimuli development; Alexander Kell for providing custom analysis code; and Sam Norman-Haignere for valuable comments on the manuscript.
The authors declare no competing financial interests.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.
Correspondence should be addressed to Rosa Lafer-Sousa, Department of Brain and Cognitive Sciences, MIT, 77 Massachusetts Avenue, 46-4141, Cambridge, MA 02139.
This article is freely available online through the J Neurosci Author Open Choice option.