Abstract
Colors distinguishable with trichromatic vision can be defined by a 3D color space, such as red-green-blue or hue-saturation-lightness (HSL) space, but it remains unclear how the cortex represents colors along these dimensions. Using intrinsic optical imaging and electrophysiology, and systematically choosing color stimuli from HSL coordinates, we examined how perceptual colors are mapped in visual area V4 in behaving macaques. We show that any color activates 1–4 separate cortical patches within “globs,” millimeter-sized color-preferring modules. Most patches belong to different hue or lightness clusters, in which sequential representations follow the color order in HSL space. Some patches overlap greatly with those of related colors, forming stacks, possibly representing invariable features, whereas few seem positioned irregularly. However, for any color, saturation increases the activity of all its patches. These results reveal how the color map in V4 is organized along the framework of the perceptual HSL space, whereupon different multipatch activity patterns represent different colors. We propose that such distributed and combinatorial representations may expand the encodable color space of small cortical maps and facilitate binding color information to other image features.
Introduction
The ability to perceive the rich color palette of the natural environment helps us to distinguish similarities and differences between objects, improving our visual behaviors and evolutionary fitness (Mollon, 1982; Osorio and Vorobyev, 1996; Lotto and Purves, 2002). Humans and macaques see colors based on simultaneous inputs from three different retinal photoreceptor types (red-green-blue cones [RGB]) of different but overlapping spectral sensitivities and neural interactions between these channels. Although the colors for trichromatic vision can be defined by a 3D color space, such as RGB space or hue-saturation-lightness (HSL) space (Munsell, 1912; Lotto and Purves, 2002), how perceptual color information is processed by the brain is still largely unknown. Previous investigations have identified color-selective neuronal assemblies in various cortical areas with variable characteristics (Zeki, 1973; 1983; Livingstone and Hubel, 1984; Komatsu et al., 1992; Rossi et al., 1996; Hanazawa et al., 2000; Johnson et al., 2001; Wachtler et al., 2003; Xiao et al., 2003; Conway and Tsao, 2006; Conway et al., 2007; Wang et al., 2007; Lu and Roe, 2008; Conway and Tsao, 2009; Conway, 2009; Kaskan et al., 2009). For example, hue sensitivity of neurons or populations can show relative luminance or saturation invariance (Xiao et al., 2003; Conway et al., 2007; Conway and Tsao, 2009). Other studies have reported luminance-modulated hue sensitivity (Hanazawa et al., 2000; Wachtler et al., 2003; Conway et al., 2007) and color-biased neurons, which respond selectively to a narrow range of hue and saturation or luminance (Komatsu et al., 1992; Hanazawa et al., 2000; Johnson et al., 2001; Conway, 2009). Furthermore, some cortical areas contain luminance-tuning regions (Rossi et al., 1996; Wang et al., 2007), adjacent to color-biased regions, that could potentially form independent modules for encoding luminance. 
Markedly, however, because of different experimental constraints or specific hypotheses, many past studies tested cortical representation of relatively narrow sets of colors, such as equiluminant (or isoluminant) colors, which cover only a 1D fraction of the 3D color space of natural images. But to obtain full color perception (Lotto and Purves, 2002; Wachtler et al., 2003), the visual system needs to encode the entire color space. Where and how the visual cortex solves this puzzle have remained a mystery.
In this study, we test the hypothesis that the organization of cortical color maps in the visual area V4, a part of extrastriate visual cortex tuned to object features of intermediate complexity (Roe et al., 2012), already incorporates the color order of the full perceptual HSL space. By using intrinsic optical imaging and electrophysiological multiunit recordings in V4 of behaving macaques and systematically presenting color stimuli to test a broad range of the 3D HSL space, we discovered the general rules of how the representation of perceptual colors is organized by their hue, lightness, and saturation information (or values) into 2D color maps in millimeter-sized color-preferring modules, called the globs (Conway et al., 2007; Conway and Tsao, 2009; Harada et al., 2009; Tanigawa et al., 2010).
Materials and Methods
Physical characteristics and calibration of color stimuli.
Color stimuli were generated by a ViSaGe system (Cambridge Research Systems) and displayed on a 21 inch CRT monitor (Sony G520; 800 × 600 pixels, 160 frames/s; using the standard RGB color space, sRGB, mode). The three guns of the monitor generated its red (R), green (G), and blue (B) color components. Its RGB luminance outputs were measured by a light meter (optiCAL, Cambridge Research Systems) and fed back to the ViSaGe system, in which a γ calibration function set the monitor's input/output function (voltage/luminance) to a linear scale through an adjustable look-up table. These RGB values of our monitor were converted to CIE-xyY (Commission internationale de l'éclairage, or International Commission on Illumination; the international authority on light, illumination and color spaces, based in Vienna, Austria) color, following the equations:
The display properties in CIE-xy coordinates are shown in Figure 1A, without luminance, Y. CIE-xyY (or CIE XYZ) color space can be converted to other color spaces, using MATLAB (MathWorks) functions: makecform and applycform. Thus, the calibrated color stimuli used in this study were monitor-independent.
Figure 1. Selecting test colors from the HSL space. A, The spectral output of the monitor, shown as CIE-xy coordinates. B, The colors we systematically tested are shown within the HSL space (double cone). In this color system, hue, H, rotates around the cone, from 0 to 360°. Saturation, S, changes along the radial direction, from the center (S = 0) to the side (S = 1). Lightness, L, changes from the bottom (L = 0) to the top (L = 1). We sampled colors on a logarithmic scale in the lightness and saturation dimensions. For example, Rk2 has half the lightness of R, and Rk3 has half the lightness of Rk2. C, We used a total of 65 colors from the HSL cone, with the corresponding luminances (in cd/m2) given, shown as square patches on the background (BG) with 1/5 the lightness of white. Luminance values with * are readouts of the light meter, whereas the other values are summations of their RGB components.
The monitor with its color gamut defined by CIE coordinates can display a broad natural color range of the full perceptual color space (e.g., see Ebner, 2007). In natural scenes, different surfaces reflect white (sun) light differently (with different wavelength distributions), as perceived by their different colors. Animals with trichromatic vision determine the color of an object by the ratio in which it reflects the primary colors (RGB) (Mollon, 1982), whereupon its maximum RGB luminance is bound by white light (a green surface may reflect all green in the white light but can never exceed it). We set the maximum white luminance of the CRT monitor to 115 cd/m2. With this value, the software code ff0000 (R 255, G 0, B 0) defined a bright red with a lightness of 1/2 in HSL space, generating a red luminance of 22.7 cd/m2 (see also the paragraph below). Similarly, 00ff00 defined a bright green with a lightness of 1/2 (HSL value) and a green luminance of 78 cd/m2. Figure 1C and Table 1 give the software codes, HSL values, and the luminance output of the RGB colors used in this study, firmly linking HSL space to the physical sRGB color space. In the HSL space, the same hue may have different luminances and different hues may have the same luminance.
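As a minimal sketch of how HSL coordinates relate to conventional 8-bit RRGGBB software codes, the following Python snippet uses the standard-library colorsys module (whose argument order is hue, lightness, saturation). The helper name hsl_to_code and the four-step halving series are illustrative assumptions, not the stimulus-generation code used in the study, and the conversion says nothing about the calibrated luminances listed in Table 1.

```python
import colorsys

def hsl_to_code(hue_deg, saturation, lightness):
    """Convert HSL coordinates to a 6-digit hex RGB code (hypothetical helper)."""
    # colorsys expects hue in [0, 1] and the argument order (h, l, s).
    r, g, b = colorsys.hls_to_rgb(hue_deg / 360.0, lightness, saturation)
    return "{:02x}{:02x}{:02x}".format(*(int(round(c * 255)) for c in (r, g, b)))

# Fully saturated red and green at lightness 1/2.
red = hsl_to_code(0, 1.0, 0.5)
green = hsl_to_code(120, 1.0, 0.5)

# Lightness-halving series for red (R, Rk2, Rk3, ...), i.e., logarithmic sampling.
red_series = [hsl_to_code(0, 1.0, 0.5 / 2 ** k) for k in range(4)]
```

Each halving of lightness halves the red channel value in this model, which is why the series is evenly spaced on a logarithmic scale.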
Important HSL values, software code, luminance and CIE-xy values of the colors used in this study. Code = software code. Luminance units: cd/m2. The boxes with checkered frames highlight the near equiluminant hues used for Figure 13
In the best approximation of natural color space, a white point (D65 CIE standard daylight illuminant) corresponds to RGB luminance ratios of 1:3.36:0.34 (defined by the ITU-R BT.709 standard for CRT phosphors), with the brightest green having a higher luminance than the brightest red and blue. Obviously, daylight luminance outdoors can be much higher than that of the maximal monitor output, but this luminance ratio (as also used in this study) mimics our perception of white in natural scenes. In contrast, many past studies have balanced RGB equiluminant colors by various neuronal outputs, such as S-cone, cortical, or behavioral responses, resulting in more limited color sets. These include fixing RGB luminance ratios close to 1:1:1 by minimum motion technique (Cavanagh et al., 1987); RB ratio close to 5.9:1 by visually evoked potential measurements (Tootell et al., 2004), and RGB ratio to 2.3:1 by fMRI measurement (Conway and Tsao, 2006). Other studies have used physical equiluminant colors (Xiao et al., 2003; Conway et al., 2007), but these color sets rather artificially cut off bright red or green from natural color space.
Reasons for selecting stimuli from HSL color space.
Our objective was to determine a full cortical map for full colors (with the range of tested/perceivable colors being limited only by the monitor's gamut). Crucially, HSL color space is a full color space (Fig. 1B; Table 1). Equiluminant colors, in contrast, occupy a much smaller (one-dimensional) fraction of the full color space (see above). Hence, after obtaining a full color map of HSL space, its equiluminant parts can be approximated by selecting the response patches to hues of similar luminance. Theoretically, it would also be possible to convert an orderly cortical map, representing colors along HSL coordinates, to another map for RGB coordinates; HSL space is transformable to RGB space: RGB cube ↔ HSL double-cone. However, because of the limited sampling density of colors (constrained by long experiments) and nonlinearities in the cortical maps, no such transformations were attempted here.
Animals and procedures.
Experiments were performed on four adult male monkeys. All procedures were in compliance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals, approved by the Institutional Animal Care and Use Committee of Beijing Normal University. The monkeys were surgically prepared for the optical imaging experiments by implanting a titanium post onto the skull, which was used to restrain the head. After fixation training, a transparent cranial window was chronically implanted above the area V4d (Fig. 2A), containing lunate sulcus, superior temporal sulcus, and inferior occipital sulcus (Arieli et al., 2002). V4d, either in the left or right hemisphere, was imaged, as indicated for each monkey (marked as Mk. A-D in the figures): A and C, left; B and D, right hemisphere. After the surgery, the monkeys were given at least 1 week to recover before the first optical imaging experiments were performed. No obvious differences between the hemispheres were identified in the key results.
Behavioral task.
The monkey sat in a primate chair with its head restrained, looking at the monitor screen, positioned 114 cm from its eyes. Its eye positions were monitored with an infrared eye-tracking device (ISCAN). In the experiments, to receive a juice reward, the monkey was required to maintain fixation on a small white spot (0.1° × 0.1°) within a 2° × 2° window for 5 s, during which the color stimuli were presented.
Optical imaging.
Intrinsic optical signals were recorded using the Imager 3001/M system (Optical Imaging). The imaging area covered 13 mm × 13 mm, including V4 and parts of V1 and V2 (Fig. 2A,D). The cortex was illuminated with 605 nm light. The image focus was adjusted to 300–400 μm below the cortical surface by a tandem lens system. Intrinsic optical imaging responses can be compromised by surface vessels. Therefore, we did not include image data from the regions near large vessels. For example, there is a relatively large vessel in the bottom-right part of the maps in Figure 3A–C. The impact of small vessels is considerably less, often negligible. Importantly, the deep image focus used here greatly reduced these effects. In the data presentation, following the standard intrinsic optical imaging experimental protocol (Bonhoeffer and Grinvald, 1996), the images with clear vessels were used as the background (reference images); these were taken under 546 nm light illumination and focused on the cortical surface to localize and identify the landmarks of the recording site. The actual regions of color-induced cortical activity (compare Fig. 2C), superimposed on these background images (shown as accordingly colored contours), were recorded and quantified in deeper images (using 605 nm wavelength cortical illumination) and were thereby much less influenced by vessel effects.
Further, the intrinsic optical signals in this study were reasonably strong (reaching 0.15% for grating experiments and 0.1% for color squares, see below) and reproducible over several months. In contrast, the artificial signals from the vessels, themselves, were weak, having narrow stripe shapes. The shadows of the vessels had no, or only very limited, effect on the spatial locations or the optical signal amplitude of the adjacent active patches (compare red, orange, yellow, and purple preferring patches in Figs. 4 or 6).
Two different stimulus modes.
To identify color-preferring regions in V4 (see Fig. 2), we used square-wave red/green gratings and white/black gratings with 100% and 10% contrast, respectively, covering 20° × 15° screen area, as seen by the monkey. The gratings had a spatial frequency of 2 cycles/°, drifting at 3 cycles/s. For mapping of hue, lightness, and saturation representations (see Figs. 3–15), colored static squares of 1° × 1°, including black and white, were presented at the predetermined screen point, which marked the local visual field center of the (studied) color-preferring cortical area (module) in V4. All color stimuli were presented on a gray background (1/5 lightness of white; CIE-xyY: 0.31, 0.33, 22.8).
Data collection during stimulation.
A trial started when the monkey's gaze fell within the 2° × 2° window around the central fixation point. After a 200 ms delay, a stimulus was displayed for 4 s at the predetermined screen point (which had evoked the maximal cortical color response in the earlier retinotopy tests; see below). If the monkey's gaze wandered outside the 2° × 2° fixation area, the data were discarded and the trial was repeated. The camera captured data from the stimulus onset for 5 s at 2 Hz (2 frames/s). In a trial, a set of colors was shown sequentially (in a random order) to the monkey. For example, we used 5 stimuli for the data in Figure 2, 10 stimuli for Figure 4, and 30 stimuli for Figure 6, including a blank screen stimulus. In contrast, we call an experiment the period of completing data collection for a specific stimulation protocol, typically lasting for several hours and including many repeated trials: 50 for Figure 2C–E, 45 for Figure 4C–H, and 29 for Figure 6C–G. For each monkey, different experiments were usually performed on different days.
Data analysis of color-induced cortical activity included eight steps (1–8)
(1) Averaging frames to one stimulus.
The capturing of intrinsic optical signals started ∼0.5 s after the stimulus onset. We collected 10 frames for each stimulus. Frames 3–10 (n = 8) were averaged to form a single frame, representing the response to the corresponding stimulus.
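The frame-averaging step can be sketched in a few lines of Python. This is an illustration only: frames are represented as nested lists of pixel values, and the function name average_frames and the toy data are assumptions made for the example rather than part of the original analysis software.

```python
def average_frames(frames, first=3, last=10):
    """Average frames first..last (1-indexed, inclusive) pixel by pixel."""
    selected = frames[first - 1:last]
    n = len(selected)
    rows, cols = len(selected[0]), len(selected[0][0])
    return [[sum(f[r][c] for f in selected) / n for c in range(cols)]
            for r in range(rows)]

# Toy data: 10 frames of 2 x 2 pixels; every pixel of frame k equals k.
frames = [[[float(k)] * 2 for _ in range(2)] for k in range(1, 11)]

# Frames 3-10 (n = 8) are combined into one response frame.
mean_frame = average_frames(frames)
```

With the toy data, each pixel of mean_frame is the mean of the values 3 to 10.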
(2) Computing the difference maps for a trial.
The averaged frames (one for each of the given stimuli) were used to compute the difference map (or difference image). For square-wave grating stimuli, this was done as follows:
ΔM = (IT,horizontal + IT,vertical)/(IC,horizontal + IC,vertical)
where IT and IC are the average test (red-green) and control (black-white) images, respectively, and horizontal and vertical denote the orientation of the stimuli.
The dark regions in the difference map indicated the color-preferring modules. Although luminance-sensitive regions may respond more strongly to the black-white grating than to the red-green grating, and a red-green grating map could possibly also contain regions that are only sensitive to luminance, dividing the test response by the black-white grating response largely canceled out these effects (Xiao et al., 2003). Furthermore, to exclude the possibility that dark regions in the difference map represented additional contrast information (as the red-green grating had lower contrast than black-white gratings), we performed control experiments using low-contrast gray gratings (data not shown) but found the maps to be similar to the high-contrast grating ones.
For color square stimuli, the difference map, ΔMH, for each test condition (static hue, lightness, or saturation) was calculated as follows:
ΔMH = (Σf Cf)/(Σf Bf)
where Cf and Bf are snapshots of cortical activity (sampled image frames, f) to color stimulus (on a blank gray background) and to the gray background alone, respectively: ΔMH corresponds to the average image for the used test stimulus divided by the average of frames to the blank gray background (see Fig. 3A). To minimize artifacts (spurious uncorrelated activity), the prestimulus image pixel values were subtracted from the average images before their division, and the stimuli were presented in random order (see above).
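The ratio-based difference map described here can be sketched as follows. The sketch assumes equal numbers of stimulus and blank frames and omits the prestimulus subtraction; the function names mean_image and difference_map and the toy pixel values are hypothetical.

```python
def mean_image(frames):
    """Pixel-wise mean over a list of equally sized frames."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]

def difference_map(stim_frames, blank_frames):
    """Mean stimulus image divided by mean blank image (1 = no change)."""
    stim, blank = mean_image(stim_frames), mean_image(blank_frames)
    return [[s / b for s, b in zip(s_row, b_row)]
            for s_row, b_row in zip(stim, blank)]

# Toy data: one pixel darkens by 0.1% during stimulation, the rest do not change.
blank_frames = [[[100.0, 100.0], [100.0, 100.0]] for _ in range(8)]
stim_frames = [[[99.9, 100.0], [100.0, 100.0]] for _ in range(8)]
dm = difference_map(stim_frames, blank_frames)
```

Values below 1 mark pixels that darkened relative to the blank condition, which is the sign of activation in this type of imaging.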
As further controls, we also calculated the difference map in two other ways (see Fig. 3B,C). Figure 3B shows a difference map calculated as follows: ΔMH = 2IH/(IB + IW). Because the images to black and white stimuli, IB and IW, respectively, showed complementary signals, their sum was a clean background image and the shape of the difference map was practically unaffected. So that a value of 1 indicates no difference, the hue response image was counted twice (the factor 2 in the numerator) before it was divided by the sum of the black and white response images. Figure 3C, in turn, shows the difference maps for the “cocktail blank” control (all stimulus conditions combined). All these choices gave nearly identical geometric centers for the hue patches with similar peak regions in the difference maps. Although the blank controls with Equation 3 could be slightly noisier (Fig. 3A), attributable to varied background activity (Churchland et al., 2010), this method is free of luminance contrasts that could impact the resultant map features of the other controls (Fig. 3B,C). Therefore, the difference maps were calculated by Equation 3 throughout this study.
(3) Identifying and omitting trials with artifact.
By using the data collected in a single trial and the methods described above, we estimated the difference map for each stimulus. The SD represented the distribution of all pixel values in one map. If a data frame contained motion artifacts, the pixel values of its corresponding difference map always scattered more (i.e., had a larger SD). Thus, we set a criterion to identify motion artifacts by the SD of a difference map, as checked by an automated routine. If a difference map had SD > 0.003, the corresponding trial was excluded from further analyses (deleted). Overall, only a very small fraction of trials was deleted, but no details were recorded to establish this frequency. The motion artifacts occurred rarely because the monkeys were well trained and remained mostly steady during the experiments and because our CCD chip was attached to the monkey's head, so that small head movements had little effect on the raw data frames. The number of trials (n) used in each experiment is given in the figure legends.
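A minimal sketch of such an SD-based rejection rule is given below. The threshold of 0.003 is taken from the text; the use of the population SD, the function names, and the toy maps are assumptions made for illustration.

```python
import statistics

SD_THRESHOLD = 0.003  # criterion stated in the text

def map_sd(diff_map):
    """SD over all pixel values of one difference map (population SD assumed)."""
    pixels = [v for row in diff_map for v in row]
    return statistics.pstdev(pixels)

def keep_trial(diff_map, threshold=SD_THRESHOLD):
    """Return True if the trial passes the motion-artifact criterion."""
    return map_sd(diff_map) <= threshold

# Hypothetical difference maps: a steady trial and one with large pixel scatter.
steady = [[1.000, 0.999], [1.001, 1.000]]
moving = [[1.00, 0.99], [1.01, 1.00]]

accepted = [m for m in (steady, moving) if keep_trial(m)]
```

Only the steady map survives the criterion in this toy case.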
(4) Statistical analysis.
After each accepted trial (n trials in total) was processed (see step 2), we had n independent difference maps for one stimulus. Across these n difference maps, a one-tailed t test (significance level α = 0.05) was applied pixel by pixel to identify the active areas (i.e., the “dark” regions, which reflect a decrease in blood oxygenation level, signaling increased neural activity) (Bonhoeffer and Grinvald, 1993; Grinvald et al., 1999). Because we focused only on the “dark” regions, the use of a one-tailed t test was appropriate.
(5) Averaging the difference maps for each stimulus.
To reduce noise, n difference maps to one stimulus (across n trials) were averaged, giving its general difference map. For example, in Figure 2C, 50 trials were conducted, having 4 stimuli within each trial, as described above. Therefore, 50 difference maps were obtained from 50 trials, using Equation 2. These 50 difference maps were averaged to obtain the general difference map.
(6) Filtering.
The general difference map for each stimulus was then filtered by a pair of Gaussian spatial filters to reduce noise. Low-pass filtering (σ = 52 μm; i.e., 1/2.35 times the full-width at half-maximum) was used because the spatial resolution of the intrinsic optical imaging was > 50 μm. Thus, the signals < 50 μm were regarded as high-frequency noise, and low-pass filtered. High-pass filter (σ = 322 μm; scaling as above) was used to reduce uneven luminance and global intrinsic signal effects that did not correlate with the stimulus (Frostig et al., 1990). Detailed comparisons between the unfiltered and filtered data (e.g., see Fig. 3A–C) verified that this spatial filtering biased neither the locations nor the overall relationship of the color-activated cortical patches.
(7) Converting the filtered general difference maps to [0–1] grayscale maps.
The mean value and SD were calculated across all pixel values of a general difference map. Then its pixel values were clipped at mean ± 3 SD; the pixel values > mean + 3 SD were truncated to mean + 3 SD and the pixel values < mean − 3 SD were truncated to mean − 3 SD. Finally, each clipped pixel value, V, was projected linearly to a grayscale value, Vgray, in the range [0–1], according to the following:
Vgray = (V − Vmin)/(Vmax − Vmin)
where Vmax = mean + 3 SD and Vmin = mean − 3 SD.
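The clipping and rescaling step can be sketched as below, assuming the linear projection (V − Vmin)/(Vmax − Vmin) and a population SD. The function name to_grayscale and the 4 × 4 toy map are hypothetical, and the sketch does not handle a map with zero SD.

```python
import statistics

def to_grayscale(diff_map, n_sd=3.0):
    """Clip a map at mean +/- n_sd * SD and rescale linearly to [0, 1]."""
    pixels = [v for row in diff_map for v in row]
    mean = statistics.fmean(pixels)
    sd = statistics.pstdev(pixels)  # population SD is an assumption
    v_min, v_max = mean - n_sd * sd, mean + n_sd * sd

    def scale(v):
        clipped = min(max(v, v_min), v_max)
        return (clipped - v_min) / (v_max - v_min)

    return [[scale(v) for v in row] for row in diff_map]

# Toy 4 x 4 map with one strongly darkened pixel at the top-left corner.
toy = [[1.0] * 4 for _ in range(4)]
toy[0][0] = 0.9
gray = to_grayscale(toy)
```

In this toy map the outlier lies more than 3 SD below the mean, so it is clipped and maps to 0, while the remaining pixels fall near the middle of the grayscale range.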
(8) Contouring the “peak activity areas” within each grayscale map.
“Peak activity areas” were the most active parts of the active areas (the regions in the difference images that showed significant darkening, as identified by t test over all trials; see step 4). “Peak activity areas” were contoured separately (see Fig. 3D–G), with a relative threshold (0.75 times the maximum value of each response region), identifying and emphasizing the locations and organization of the central parts of the active areas. Thus, because an active area was directly identified by t test and distributed in 2D Gaussian-like spatial shape, we chose the corresponding “peak activity area” to represent it. All figures in this study show “peak activity areas” (contoured by colorful lines).
Although the absolute signal intensities evoked by different colors vary, their sequential shifts are typically clear (see Fig. 3D–G). Contouring the active regions with a high fixed level could miss some dominant active regions, whereas some arbitrary low threshold levels (dotted black lines) can contour response regions and present well their spatial shifts, too. Nonetheless, if the baseline activity for some colors is higher than the average (here blue for example), a too low threshold can artificially broaden the contoured activity area. Therefore, using a relative level of 75% (colored dotted lines) presented a safe method to characterize the “peak” response regions, which potentially indicate dominant cortical columns.
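The relative-threshold idea can be sketched as follows. The sketch assumes that one response region has already been isolated and that its values are expressed as response strength (larger means more active); both points, as well as the function name peak_mask and the toy values, are assumptions made for illustration.

```python
def peak_mask(response, fraction=0.75):
    """Mark pixels whose response is at least `fraction` of the region's maximum."""
    peak = max(v for row in response for v in row)
    threshold = fraction * peak
    return [[v >= threshold for v in row] for row in response]

# Toy response region with a roughly Gaussian-like profile.
region = [
    [0.2, 0.5, 0.2],
    [0.5, 1.0, 0.8],
    [0.2, 0.5, 0.2],
]
mask = peak_mask(region)
```

Because the threshold scales with the maximum of each region, a weakly responding region is contoured in the same relative way as a strongly responding one.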
Testing retinotopy of the cortical maps.
Ten stimulus positions were used (Fig. 5A) to survey the region of the visual field to which each hue cluster was mostly responsive. First, a one-tailed t test was used to detect patches of increased activity (only dark regions) in the difference maps. We then focused on specific locations in a cluster; consider CH1, for example. In this cluster, the red-activated patches appeared when the red stimulus was presented in x-coordinates between −0.5 and 2.5 (y = −1.5), and in y-coordinates between −0.5 and −4.5 (x = 0.5). This information provided us with four corner points of the effective red-stimulation area: p1 (−0.5, −1.5), p2 (0.5, −0.5), p3 (2.5, −1.5), p4 (0.5, −4.5). These points defined a rectangular effective stimulus area. To minimize bias, each edge of the rectangle was smoothed using a Bezier curve in the same way (Fig. 5C), which together approximated the effective stimulus area. The long recording times required for data collection made it unfeasible to increase the sampling resolution of these experiments.
Defining clusters.
In the functional hue map (see Fig. 4D), spatially overlapped patches that display systematic hue order form a hue cluster. Because the HSL colors can be depicted in a closed and continuous sequence, any hue patch can be at the end of a hue cluster.
In the functional lightness map, spatially overlapped patches that display systematic lightness order form a lightness cluster. Colors in HSL space (at the edge of the hue-cone: bright and fully saturated R (red), O (orange), Y (yellow), G (green), A (aqua), B (blue), P (purple); see Figure 1B) travel up or down when their lightness increases or decreases, respectively. When such dynamics were reflected by the cortical patch activation, we treated the lightest and the dimmest color patches as the endpoints of the lightness clusters, although a light cluster and a dark cluster could meet at the center point (corresponding to bright fully saturated hues), as happens with the corresponding colors in the HSL cone. For example, see clusters CLR1 and CLR2 in Figure 6G.
Groups of ≤3 spatially overlapped patches were not included in further analysis (see below) because of their inconspicuous order. Thus, the single, duplet, and triplet patches were excluded not to fit the data to a preconceived notion of the organization of color responses in V4, but because such patches provided no extra information. Three patches placed on the vertices of a triangle cannot be shuffled so that a patch would have different neighbors, making their order index meaningless. As for two connected patches, they are always in order, whereas the single patches obviously have no order. The excluded single patches constituted 5% and duplets and triplets 6% of all the patches in this study.
Testing correlation between the hue order in HSL space and its projected cortical representation.
We used a correlation coefficient to evaluate the goodness of hue projection from HSL color space to the cortical representation. Hue representation on the cortex forms a vector, in contrast to the hue circle of HSL space. Correspondingly, the HSL hue circle was converted, from the starting point of the cortical hue cluster, to a hue vector (angles for red, orange, yellow, green, aqua, blue, and purple in the hue circle are 0, 30, 60, 120, 180, 240, and 300 degrees, respectively). The correlation coefficient was calculated between the distances for all possible pairs of selected hues in the hue vector and the homologous distances between the hue patches in a cluster, using patch centers. The hues that did not activate any significant area were not included in the analysis. As the distances between the other hue responses in a cluster were not affected by these missing responses, the analysis was not biased by them. For example, in the CH2 cluster of Monkey A (see Fig. 4D), we have np = 5 hue patches: purple, red, orange, yellow, and green. Then, the hue vector from HSL color space is [0, 1, 1.5, 2, 3] (in units of 60°, starting from purple). From this vector, we can obtain a distance between any two hues; for example, the distance between purple and yellow is 2 − 0 = 2. As for the cortical map, we obtain 5 center points of its hue patches. Thus, we can calculate a distance between any two points in 2D; for example, the distance between purple and yellow is 0.564 mm. In total, we obtain n = np × (np − 1)/2 = 10 pairs of corresponding distances (plotted in Figure 4I, middle). The correlation coefficient r of these 10 pairs of distances was calculated using the corr function in MATLAB (MathWorks).
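The pairwise-distance correlation can be sketched in Python as follows. The hue vector is the one given in the text; the patch-center coordinates are made up for illustration and are not measured data, and a plain Pearson coefficient is assumed here.

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hue vector from the text: purple, red, orange, yellow, green.
hue_vector = [0, 1, 1.5, 2, 3]
# Hypothetical patch centers in mm (illustrative values only).
centers_mm = [(0.0, 0.0), (0.2, 0.05), (0.3, 0.1), (0.4, 0.1), (0.6, 0.2)]

pairs = list(combinations(range(len(hue_vector)), 2))
hue_dist = [abs(hue_vector[i] - hue_vector[j]) for i, j in pairs]
cortex_dist = [math.dist(centers_mm[i], centers_mm[j]) for i, j in pairs]

r = pearson_r(hue_dist, cortex_dist)
```

Five patches yield 10 distance pairs, matching n = np × (np − 1)/2; because the hypothetical centers lie almost on a line in hue order, r comes out close to 1.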
Testing cortical representation of hue against random position.
To test whether the hue representations within a cluster formed a line of sequential order, or whether the patches were just randomly positioned, the correlation coefficient of a cluster, rc, was calculated as above. Next, we sampled 10,000 random points within a 1 mm × 1 mm cortical area and organized these into “resamples,” each of which contained the same number of points as the cluster. The correlation coefficient, rr, was then calculated for each “resample.” The significance of rc was determined by its position in the distribution of rr: p = (number of (rr ≥ rc))/10,000; the significance level was set at 0.05.
Testing sequential order of cortical hue representations within a cluster.
A permutation test was used to determine whether the correlation coefficient of a cluster was significant. Similar to the analysis above, rc of a hue cluster was calculated first. Then its patches were rearranged randomly, and the correlation coefficient, now named ro, was recalculated. This process was iterated 10,000 times. The significance of rc was determined by its position in the distribution of ro: p = (number of (ro ≥ rc))/10,000; the significance level was set at 0.05.
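A minimal sketch of such a permutation test is shown below. The hue vector comes from the CH2 example in the text, whereas the patch centers, the fixed random seed, and all function names are assumptions made for illustration.

```python
import math
import random
from itertools import combinations

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def order_correlation(hue_vector, centers):
    """Correlate pairwise hue distances with pairwise cortical distances."""
    pairs = list(combinations(range(len(hue_vector)), 2))
    hue_d = [abs(hue_vector[i] - hue_vector[j]) for i, j in pairs]
    cortex_d = [math.dist(centers[i], centers[j]) for i, j in pairs]
    return pearson_r(hue_d, cortex_d)

def permutation_p(hue_vector, centers, n_iter=10_000, seed=0):
    """p = fraction of shuffled orders whose correlation is >= the observed one."""
    rng = random.Random(seed)
    r_c = order_correlation(hue_vector, centers)
    shuffled = list(centers)
    count = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)  # reassign patch centers to hues at random
        if order_correlation(hue_vector, shuffled) >= r_c:
            count += 1
    return r_c, count / n_iter

hue_vector = [0, 1, 1.5, 2, 3]
centers_mm = [(0.0, 0.0), (0.2, 0.05), (0.3, 0.1), (0.4, 0.1), (0.6, 0.2)]  # hypothetical
r_c, p = permutation_p(hue_vector, centers_mm)
```

With only five patches there are 120 possible orders, so the smallest attainable p is about 1/120; this is one reason why very small groups of patches carry little information about order.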
Testing cortical representation of lightness.
Three analogous analyses, as above, were used for quantifying lightness projections and statistics of the lightness patches in red, green, and blue lightness clusters. Each experiment included 7 stimuli, selected from the logarithmic lightness values in the HSL color space for the given colors, as shown in Figure 1B, C. The lightness values, Llin, used from R to K were as follows: 1, 1/2, 1/4, 1/8, 1/16, 1/32, 0. When black was set as 1/64, the Llin values were as follows: 1, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64. Correspondingly, the logarithmic values, Llog, given by −log2(Llin), were as follows: 0, 1, 2, 3, 4, 5, 6. Llog values were used to calculate all correlation coefficients for this case.
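The conversion from linear to logarithmic lightness values can be checked with a short Python sketch; the variable names are illustrative, and black is set to 1/64 as described in the text.

```python
import math

# Linear lightness values from R to K, with black set to 1/64.
l_lin = [1 / 2 ** k for k in range(7)]

# L_log = -log2(L_lin), written as log2(1 / L_lin) to avoid a negative zero.
l_log = [math.log2(1 / v) for v in l_lin]
```

The resulting values are equally spaced, so each halving of lightness corresponds to one step along the lightness vector.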
Identifying cortical stacks by patch spatial density analysis.
Figure 12 illustrates how the patch spatial density in cortical color map was analyzed step by step for two monkeys. First, the centers of all patches were plotted on a black background (see Figure 12B,G). Then, 3 × 3 pixels (77 × 77 μm) binning was used across each pixel in the map. This procedure provided the rough patch spatial density (see Figure 12C,H), which was smoothed by a Gaussian low-pass filter (σ = 26 μm; see Figure 12D,I), highlighting many peaks (stacks) on this map. Finally, we calculated the mean values and SD of all the nonzero pixels in the smoothed patch spatial density map, using mean + 5 SD, as the threshold for identifying the peaks (see Figure 12E,J).
Cortical color map for near-equiluminant hues.
In our experiments, the most closely equiluminant hues were red (22.7), brown (17.0), yellow-green (27.4), green (20.9), cyan (25.4), blue (15.2), and purple (11.0 cd/m2), as sampled from the HSL space (Table 1). The peak activity areas, evoked by these hues, were used for the near-equiluminant cortical color map (see Fig. 14A). Because we used 1/5 lightness gray (22.8 cd/m2) as the background, the luminance contrast, c, values of these hues are −0.4%, −25.4%, 20.2%, −8.3%, 11.4%, −33.3%, and −51.8%, respectively, calculated as follows:
c = (Lhue − LBG)/LBG, where Lhue and LBG are the luminances of the hue and of the background, respectively.
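The contrast values can be checked with a short Python sketch, assuming contrast is defined relative to the background as (L − LBG)/LBG. This definition reproduces the magnitudes listed above; red comes out slightly negative because 22.7 cd/m2 lies just below the background luminance of 22.8 cd/m2. The variable names are illustrative.

```python
BACKGROUND = 22.8  # cd/m2, 1/5 lightness gray

luminances = {
    "red": 22.7, "brown": 17.0, "yellow-green": 27.4, "green": 20.9,
    "cyan": 25.4, "blue": 15.2, "purple": 11.0,
}

# Luminance contrast relative to the background, in percent.
contrasts = {
    hue: round(100 * (lum - BACKGROUND) / BACKGROUND, 1)
    for hue, lum in luminances.items()
}
```

The largest deviation from the background among these hues is that of purple, at roughly half the background luminance.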
Predicting the patch centers for equiluminant hues.
Cortical locations of patch centers representing equiluminant hues can be predicted from the recorded lightness clusters. As an example, we explain how to approximate the cortical position for an arbitrarily chosen dark red luminance. The luminances for different lightness of reds, from fully saturated to dark reds, that we used in this study were as follows: 22.7, 11.8, 6.5, 3.8, and 2.4 (Table 1). In the optical imaging experiments, we obtained accordingly five center points for patches evoked by these colors on the cortex, forming a dark red lightness cluster. Next, we fitted a curve using these center points and rendered the curve with corresponding colors (e.g., see Fig. 11E–L). The distance along this curve represents the projection of logarithmic color luminance to linear cortical distance. Suppose that the length of the curve is 1, and we have four measured cortical distances from the luminance point 22.7. Then, the cortical distance, d, for any luminance, lum, of dark reds can be approximated by the following formula:
For example, luminance 11.8 cd/m2 maps to position 0.236, marked on a full cortical color map in Figure 14B. In this map, the same method was used for approximating the cortical positions for representing equiluminant red, green, and blue (20.0 cd/m2), as well as some other lower luminances of the same hues (darker colors).
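One simple way to realize such a mapping, which is not necessarily the formula used in this study, is piecewise-linear interpolation in log luminance between the measured patch centers. In the sketch below, only the luminance series and the position 0.236 for 11.8 cd/m2 come from the text; the remaining positions, the placement of 2.4 cd/m2 at the end of the curve, and the function name `cortical_distance` are hypothetical placeholders for illustration.

```python
import math

# Luminances (cd/m^2) of the red lightness series, brightest first (from the text).
LUMINANCES = [22.7, 11.8, 6.5, 3.8, 2.4]

# Normalized positions of the patch centers along the fitted curve (length 1).
# Only 0.0 (start) and 0.236 (for 11.8 cd/m^2) are taken from the text;
# 0.49, 0.75, and the end point 1.0 are HYPOTHETICAL placeholder values.
POSITIONS = [0.0, 0.236, 0.49, 0.75, 1.0]

def cortical_distance(lum):
    """Interpolate normalized cortical distance linearly in log luminance."""
    logs = [math.log(v) for v in LUMINANCES]
    x = math.log(lum)
    if not (logs[-1] <= x <= logs[0]):
        raise ValueError("luminance outside the measured range")
    for i in range(len(logs) - 1):
        upper, lower = logs[i], logs[i + 1]
        if lower <= x <= upper:
            frac = (upper - x) / (upper - lower)
            return POSITIONS[i] + frac * (POSITIONS[i + 1] - POSITIONS[i])

print(cortical_distance(20.0))  # position estimated for an equiluminant 20 cd/m^2 red
```

Because 20 cd/m2 is close to 22.7 cd/m2 on a logarithmic scale, the interpolated position falls near the start of the curve, which is the intuition behind the small offsets between the equiluminant and near-equiluminant patch centers.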
Electrophysiology.
Platinum-iridium alloy microelectrodes (impedance 1–2 MΩ) were inserted through the silicone artificial dura to perform single-unit or multiunit recordings. The position of each recording site was guided by the pattern of the surface blood vessels. In total, eight microelectrodes were used simultaneously to record neural activity at different places of a color map in V4 in Monkey A (Fig. 15A). The recording depths varied from 200 to 1000 μm. Action potentials were acquired and sorted by a Cerebus system (Cerebus 128). The same stimulus as in the optical imaging was presented for 500 ms, after a 500 ms delay from the start of the fixation. The neural responses were calculated by subtracting the spontaneous firing activity during the 500 ms delay period from that during the stimulus presentation. All stimuli were interleaved pseudorandomly and repeated >20 times to obtain the average neural responses.
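The baseline subtraction described above can be sketched as follows. The sketch assumes that spike times are given in milliseconds relative to fixation onset, so that the delay period spans 0–500 ms and the stimulus period 500–1000 ms; the function names `evoked_rate` and `mean_response` are illustrative.

```python
def evoked_rate(spike_times_ms, stim_on_ms=500.0, window_ms=500.0):
    """Stimulus-period firing rate minus delay-period (spontaneous) rate, in spikes/s."""
    baseline = sum(1 for t in spike_times_ms
                   if stim_on_ms - window_ms <= t < stim_on_ms)
    stimulus = sum(1 for t in spike_times_ms
                   if stim_on_ms <= t < stim_on_ms + window_ms)
    return (stimulus - baseline) / (window_ms / 1000.0)

def mean_response(trials):
    """Average the baseline-subtracted rate over repeated trials of one stimulus."""
    rates = [evoked_rate(spikes) for spikes in trials]
    return sum(rates) / len(rates)

# Two made-up trials, for illustration only.
example_trials = [[100, 600, 700, 800], [50, 450, 550, 650, 900]]
print(mean_response(example_trials))
```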
Results
Localizing color-sensitive areas in central V4 by optical imaging
We performed long-term optical imaging studies (Arieli et al., 2002) in area V4 on four behaving macaque monkeys. The imaged 13 mm × 13 mm cortical area (Fig. 2A; highlighted rectangle) included a central portion of V4d, representing the central visual field (right and down with <7° eccentricity) and part of V1 and V2. In all experiments, the monkeys performed a simple fixation task (Fig. 2B). From the differential maps, obtained by dividing the images taken during red/green gratings (20° × 15°) stimulation by those taken during black/white gratings, the color-biased modules could be localized (e.g., Tanigawa et al., 2010) (Fig. 2C–E). Based on their dimensions, dynamics, and spatial distribution (Fig. 2F), these functional modules (Conway et al., 2007; Conway and Tsao, 2009; Tanigawa et al., 2010) constitute “globs,” so named in the previous studies.
Cortical location and dynamics of color-preferring modules in V4 by optical imaging. A, Cranial window covers the cross region of lunate sulcus (LS), superior temporal sulcus (STS), and inferior occipital sulcus (IOS). White frame indicates imaged area, containing the center of V4d and parts of V1, V2. A, Anterior; M, medial. B, Monkey fixated to screen center, as verified by eye tracker, triggering 4 s color (red-green) or orientation grating stimuli (20° × 15°), after which it received juice reward. CCD chip was mechanically coupled to the lens system and the monkey head, with flexible wires connecting it to the camera; thus, it could move slightly with the head without shifting images, giving high image stability. C, Images of the differential maps for color (top) and orientation (below), respectively, from the stimuli onset onward. Darkening regions (contoured) indicate increasing activity (n = 50 trials). D, Average image from color grating; color processing areas (red-green dashed contour) are separated from the orientation zones (black-white dashed contour), consistent with earlier findings (Tanigawa et al., 2010). E, Enlarged view of the boxed region in D: peak responses (75% maxima) inside significantly activated regions (p < 0.01) contoured. Images Gaussian high-pass (σ = 322 μm) and low-pass filtered (σ = 52 μm), frames clipped to ±0.15%. F, Time course of the reflectance change (mean ± SEM) in the differential maps in the peak regions, as in C, during color stimulation (black trace represents red/green grating; gray trace represents blank condition) in two monkeys (Monkeys A and C). Frames from 1.5–5 s (light gray area) were used for the average image in E. Scale bars represent 1 mm.
Cortical hue maps
To investigate how hues are represented in the globs, we used bright (saturated) colors, sampled from the hue circle of the HSL space (Fig. 4A). At the center of a glob's local visual field, small (1° × 1°) squares of the chosen seven hues, and black and white, were presented on a gray background in a pseudorandom order (Fig. 4B). The differential images for the hue map were then obtained by dividing the images taken during the color squares (test conditions) by those taken during the blank gray background condition, as explained in Materials and Methods.
Characterization of color-induced cortical activity (color maps) in optical imaging data. A–C, Comparison of cortical maps to red and green stimuli (1° × 1° squares on gray background) obtained by different preprocessing methods. A, Blank control. B, Control by white + black. C, Cocktail blank control by all stimulus conditions. Top rows, Unfiltered. Bottom row, Gaussian filtered maps and their effects on the contoured maps. A, There are common activity patches in red and green maps at the bottom left corner (white arrowheads), possibly evoked by the common shape components of the stimuli. This area belongs to the orientation preference area (see Fig. 2C). B, Contours show 75% relative thresholds of the activity areas. D, A hue cluster with a white line crossing the spatial distribution of its peak activity areas. E, Optical signals along the selected cross section (white line in D) in the hue cluster. F, Red lightness cluster with a white line crossing its spatial distribution of the contoured cortical activity patches (peak activity areas). G, Optical signals along the selected cross section (white line in F) in the lightness cluster. E, G, Dotted colored lines indicate the 75% relative threshold for the same-color-induced activity areas; the black dashed lines indicate examples of low arbitrary absolute thresholds.
Hue maps in area V4. A, Bright (saturated) hues: R, red; O, orange; Y, yellow; G, green; A, aqua; B, blue; P, purple, sampled from the hue circle (i.e., maximum cross section of HSL space; Fig. 1B). Controls: K, black; W, white; BG, gray background is 1/5 lightness of white: CIE xyY of (0.31, 0.33, 22.8). B, Fixation on the gray screen (BG) center (white dot) triggered a 4 s presentation of a 1° × 1° hue square at a preset screen position, which marks the local visual field center of the glob. After a rest (>10 s), a new fixation triggered a new hue at the same position. C, Cortical activity patches to seven hues, respectively, as indicated by optical imaging in Monkey A. Each hue evoked multiple patches of activity. Peak responses (75% maxima) inside significantly activated regions (p < 0.05) outlined by corresponding hue contours. Images of Gaussian high-pass (σ = 322 μm) and low-pass filtered (σ = 52 μm) frames, clipped individually to ±3 SD (n = 45 trials); see details in Materials and Methods. D, Combined map for seven hues. Patches for adjacent hues form clusters: “rainbows of patches.” E–G, Larger views of hue clusters: CH1, CH2, and CH3, respectively. H, Interpolated locations of hue-preferring patches in hue clusters of Monkey A. I–K, Correlation analysis between hue distances and cortical distances. The hue distance and the cortical distance correlate significantly (r, correlation index; n, number of distance measurements between patches). L, The hue map in Monkey B. M, Interpolated locations of hue-preferring patches in hue clusters of Monkey B. N, O, Correlation analysis of its hue clusters. Scale bars, 1 mm.
The hue-preferring regions for the seven tested colors are shown in Figure 4C. The colored outlines in its combined panels indicate the peak activation regions evoked by each corresponding color (Fig. 4D). Normally, a single small hue stimulus, presented at the center of a glob's local visual field, evoked 1–4 patches of increased cortical activity. With hue changing from purple to red, orange, yellow, green, aqua to blue, the activated patches shifted systematically over the adjoining cortical area, with neighboring patches overlapping considerably (see also Conway and Tsao, 2009). Therefore, a localized hue map in V4 comprised 3 or 4 separate hue clusters (CH) (Fig. 4E–G), many of which formed unbroken “rainbows of patches.” The sequential hue order of these clusters becomes yet clearer when we connect their patch centers with their preferred colors (Fig. 4H).
To quantify how faithfully the activated patches in hue clusters map hue coordinates from HSL space, we calculated the linear correlation coefficient, r, between the hue distances on the hue circle and the cortical distances between the activated patch centers. Here, 60° defined the unit distance on the hue circle, whereas the distances between the patch centers were measured in micrometers. A high coefficient indicates faithful projection, in which neighboring hues activate neighboring patches and more distant hues activate more distant patches, whereas a low coefficient signifies disordered or distorted mapping. In the four tested monkeys, r of the hue clusters varied from 0.59 to 0.92 (in Monkey A, CH3: r = 0.59, p = 0.019, Fig. 4I; in other hue clusters, p < 0.01, Fig. 4J–L). Most of the coefficients were significantly higher than those for disordered (9 of 14; permutation test for sequence order) or distorted (14 of 14; randomization test for spatial organization) mapping. Thus, when further considering the finite sensitivity of the optical imaging system, which cannot resolve weak activity from noise, the spatial sequence of hue-preferring patches in most clusters follows the hue order in HSL space remarkably consistently.
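The correlation measure and a sequence-order permutation test of this kind can be sketched as below. The details are assumptions made for illustration: hue distance is taken as the shorter arc on the hue circle in units of 60°, cortical distance as the Euclidean distance between patch centers, and the permutation test shuffles the hue labels among the fixed patch positions. The function names and the example hue angles and coordinates are hypothetical.

```python
import math
import random
from itertools import combinations

def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def hue_distance(h1_deg, h2_deg, unit_deg=60.0):
    """Shorter arc between two hues on the hue circle, in units of 60 degrees."""
    d = abs(h1_deg - h2_deg) % 360.0
    return min(d, 360.0 - d) / unit_deg

def cluster_correlation(hues_deg, centers_um):
    """Correlate all pairwise hue distances with pairwise cortical distances."""
    pairs = list(combinations(range(len(hues_deg)), 2))
    hue_d = [hue_distance(hues_deg[i], hues_deg[j]) for i, j in pairs]
    cort_d = [math.dist(centers_um[i], centers_um[j]) for i, j in pairs]
    return pearson_r(hue_d, cort_d)

def permutation_p(hues_deg, centers_um, n_perm=2000, seed=0):
    """One-sided p value: how often shuffled hue orders reach the observed r."""
    rng = random.Random(seed)
    observed = cluster_correlation(hues_deg, centers_um)
    shuffled = list(hues_deg)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if cluster_correlation(shuffled, centers_um) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# Hypothetical cluster: five hues laid out in order along a line (micrometers).
hues = [0.0, 30.0, 60.0, 120.0, 180.0]
centers = [(0.0, 0.0), (100.0, 0.0), (200.0, 0.0), (400.0, 0.0), (600.0, 0.0)]
print(permutation_p(hues, centers))
```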
Retinotopic survey of the hue map
We then asked how much of the fractioning of hue patches between different clusters is caused by retinotopy; namely, does each patch process explicit hue information about a specific spatial position in the visual field? This question is important because, if retinotopy contributed very little to the structure of the hue map, then different combinations of activated patches would mostly represent different hues. To resolve this question, we surveyed the regions of the visual field to which each hue cluster was responsive (local visual field) in finer detail using 0.5° × 0.5° red or green squares (Fig. 5A); generally, red and green evoked the strongest activity in the hue map (Fig. 5B), making them the most reliable hues for this survey. In the analysis, we used the statistical significance of the resulting activity patches against the stimulus locations to outline their local visual fields (see Materials and Methods).
Surveying the retinotopy of the hue map in V4. A, Local visual field (i.e., the region of the visual field to which each hue cluster was responsive) mapping was performed by presenting a 0.5° × 0.5° red or green square (stimulus) at six horizontal and five vertical positions; fovea is 0°. RF centers of color-preferring modules are ∼1.5° below the fixation point (FP). The subfigures show nonfiltered images. B, Three clusters of Monkey A's hue map: CH1, CH2, and CH3, tested in this survey (the same ones as in Fig. 4D). C, Red- and green-preferring patches in cluster CH1 have virtually identical local visual fields. D, Red-preferring patches in clusters CH1 and CH3 have different but largely overlapping local visual fields. E, Green-preferring patches in clusters CH1 and CH2 have different but largely overlapping local visual fields. F, Red or green squares of different sizes (0.5°, 1.5°, 4°) presented at the estimated geometric center of the hue map's local visual field evoked comparable optical signals at their corresponding locations. Despite variations in patch sizes in each location (experiments done on different days), their activity did not differ significantly (p > 0.05, two-tailed t test). Scale bars represent 1 mm.
We found that red and green patch locations in the same cluster shared the same large local visual field (Fig. 5C). Furthermore, the separate patches for the same hue in different clusters had different but greatly overlapping large local visual fields (Fig. 5D,E). There were other signs of relative positional invariance for encoding hues: red or green squares of different sizes (tested for 0.5°, 1.0°, and 4.0° on different days) activated their corresponding cortical locations. Although their peak activity expanded slightly differently from one experiment to another (Fig. 5F; compare patch sizes and shapes), these variations were not statistically significant (p > 0.05). Therefore, a red or green stimulus appearing anywhere within the 3° × 4° foveal-perifoveal area is consistently encoded by parallel neural activity in multiple patches (i.e., multipatch representations are not the result of retinotopy). Similar observations were obtained from three other monkeys when adjusting the color stimuli to the center of their hue maps' local visual fields, but with the detailed survey lasting >10 h, no further hues were tested in these experiments. Nonetheless, because red and green are far apart in the hue circle (Fig. 4A), it is reasonable to expect that other neurons within the same cluster would also share this retinotopy. Thus, we conclude that (1) any single hue point in the visual field is represented by at least one, but typically more, patches of activity; and (2) different hues are represented by different multipatch patterns in the hue map. These results (of local analysis) do not challenge the global retinotopic organization of V4, as previously evidenced by single-unit and fMRI recordings (compare Roe et al., 2012).
Cortical color lightness maps
We next examined how color lightness is represented in the globs. For these experiments, we selected colors along the conical surface of HSL space (Fig. 6A). We sampled three primary colors of red, green, and blue with 11 different levels of lightness, and used white and black as the control stimuli. Lightness values were sampled using a logarithmic scale (1, 1/2, 1/4, 1/8…). For example, R indicates saturated red, W and K white and black poles, respectively; Rk1 is the midpoint of R and K, and Rk2 is the midpoint of Rk1 and K, and so forth. Initially, we used a linear scale to sample color lightness but found that the cortical patches, which represent one hue (e.g., red), changed little near 0.5 lightness values, yet shifted considerably more near 0.1. However, logarithmically scaled color lightness samples evoked well-distributed patterns of cortical activity.
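The logarithmic lightness sampling can be illustrated with Python's standard `colorsys` module. The sketch assumes that the 11 levels consist of the saturated hue plus five progressively darker and five progressively lighter variants, each halving the remaining HSL lightness distance to black or white; this split, the function name `lightness_series`, and the key names are assumptions. The returned values are nominal RGB fractions and do not model monitor calibration or measured luminance.

```python
import colorsys

def lightness_series(hue_deg, steps=5):
    """Nominal RGB triples for a saturated hue and its darker/lighter variants.

    Keys: "base" (HSL lightness 0.5), "k1".."kN" toward black, "w1".."wN" toward white.
    """
    h = hue_deg / 360.0
    series = {"base": colorsys.hls_to_rgb(h, 0.5, 1.0)}
    for n in range(1, steps + 1):
        dark_l = 0.5 * 0.5 ** n          # halve the distance to black at each step
        light_l = 1.0 - 0.5 * 0.5 ** n   # halve the distance to white at each step
        series[f"k{n}"] = colorsys.hls_to_rgb(h, dark_l, 1.0)
        series[f"w{n}"] = colorsys.hls_to_rgb(h, light_l, 1.0)
    return series

reds = lightness_series(0.0)
for name, rgb in reds.items():
    print(name, tuple(round(c, 3) for c in rgb))
```

With this construction, `k1` is the midpoint between the saturated hue and black, and `k2` is the midpoint between `k1` and black, matching the midpoint rule described above.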
Lightness maps in area V4. A, Colors sampled with different lightness on the HSL cone in logarithmic scale. Center points: Rw2 (pink) between red (R) and white (W); Rw3 (light pink) between Rw2 and W; Rk2 (darkened red) between R and black (K); Rk3 between Rk2 and K. The same criteria were used for green and blue stimuli. B, Activity patches, evoked by reds of varying lightness, shift with decreasing lightness. The 1° × 1° lightness stimuli were presented at the local visual field of the color-biased region in pseudorandom order, as in Figure 4B. Images Gaussian high-pass (σ = 322 μm) and low-pass filtered (σ = 52 μm), frames clipped to ±3 SD. Peak responses (75% maxima) in significantly activated regions (p < 0.05) contoured (n = 29 trials). C, Activity patches to red lightness stimuli form three lightness clusters. D, E, Similar lightness clusters for green and blue stimuli, respectively. F, Combined lightness map for red, green, and blue. G, The corresponding interpolated locations of lightness-preferring patches; there is a common branch point for the (RGB) lightness values: the central black patch. H, Correlation analysis between lightness distances and cortical distances for red, green, and blue lightness clusters: CLR1, CLG1, and CLB1, respectively. The logarithmic lightness distance and the cortical distance correlate significantly (r, correlation index; n, number of distance measurements between patches). I, The color lightness map in Monkey B. J, Interpolated locations of its lightness-preferring patches. K, Correlation analysis for its lightness clusters. Scale bars, 1 mm.
Figure 6B shows the activated patches evoked by red stimuli of different lightness. By combining these maps, we found that the locations of most activated patches shifted systematically with lightness changes, forming separate lightness clusters (Fig. 6C). Similar clustering of activity patches occurred for green and blue stimuli (Fig. 6D and Fig. 6E, respectively), and their spatial relationship becomes obvious when these are presented together in a combined lightness map (Fig. 6F). Although there are clear individual variations in cortical activity between monkeys (compare Fig. 6I), most lightness clusters demonstrated the same basic organization: when lightness of red, green, or blue was decreased, the activated patches shifted toward the black-preferring patches and, in some cases, converged to them (Fig. 6C–E). Conversely, with increasing lightness, they gravitated toward the white-preferring patches. Thus, the lightness clusters are exclusive for each hue (RGB) and branch out to allocate separate cortical locations (Fig. 6F).
As with the hue cluster analysis, we also calculated the linear correlation coefficient, r, to quantify how accurately the distances between the patch centers in the lightness clusters (Fig. 6G,J) map the corresponding logarithmic lightness distances in HSL space (see Materials and Methods). In the four monkeys, we identified 46 lightness clusters containing at least three adjacent lightness patches. Most of them (∼67%) had correlation coefficients that were significantly higher than those of the controls (31 of 46 clusters in random-order permutation test). Therefore, their patch activation sequences normally follow the lightness order in HSL space (Fig. 6H,K). But because the shift distances of most adjacent patches, evoked by logarithmic lightness increments, were nearly equal (Fig. 7A), this projection is logarithmically distorted. With logarithmic lightness coordinates, the correlation is significantly stronger (p = 0.0021, two-tailed t test) than with linear coordinates (four monkeys, n = 17 clusters). Thus, the lightness maps may provide some neural substrate for the classic observations that color lightness is perceived as a nonlinear function of a scene's reflectance (Land, 1977).
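The effect of the coordinate choice can be illustrated with a toy example. Here the patch positions are hypothetical: they are placed at equal cortical steps for each halving of lightness, which is the idealized case of the nearly equal shifts described above. The sketch then compares the correlation of pairwise cortical distances with pairwise lightness distances in linear and in logarithmic coordinates.

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def pairwise_abs_diff(values):
    """Absolute differences between all pairs of values."""
    return [abs(a - b) for a, b in combinations(values, 2)]

lightness = [1, 1 / 2, 1 / 4, 1 / 8, 1 / 16]   # logarithmically sampled lightness
positions_um = [0, 100, 200, 300, 400]          # HYPOTHETICAL equal cortical shifts

cortical = pairwise_abs_diff(positions_um)
r_linear = pearson_r(pairwise_abs_diff(lightness), cortical)
r_log = pearson_r(pairwise_abs_diff([math.log2(v) for v in lightness]), cortical)
print(r_linear, r_log)
```

In this idealized case the logarithmic coordinates give a perfect correlation, whereas the linear coordinates give a weaker one, because most of the linear lightness range is compressed into the first one or two steps.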
Comparison of correlation coefficients when using logarithmic and linear coordinates, in HSL color space. A, B, Examples correlate the lightness distances with the distances between activated cortical patches. The lightness distance between all pairs of tested colors is measured separately in logarithmic and linear coordinates.
We also examined the lightness representation of four secondary colors: orange, yellow, aqua, and purple (Fig. 8). In the lightness maps, these colors activated patches around the main frame of the primary colors. Similar to the primary colors, their patches in lightness clusters followed the lightness order in HSL space. Furthermore, these colors, especially aqua, elicited weaker responses than red or green, in agreement with previous studies (Stoughton and Conway, 2008).
Cortical color lightness maps for different colors. A–D, Color lightness maps for orange, yellow, aqua, and purple, respectively. With lightness decreasing, the locations of activated patches shift toward the black-preferring patches. Scale bar, 1 mm.
Cortical color saturation representation
Perceptual colors also change in the saturation dimension. To investigate how the color maps in globs represent color saturation, we selected 15 other colors of red, green, and blue with different saturations within the HSL cone as stimuli (Fig. 9A). From R to Ru4, the saturation of red decreases from 1 to 1/16, yet these colors have lightness equal to that of middle gray. We observed that, with decreasing saturation, the peak activity in their corresponding patches decreased gradually (Fig. 9B). Their activity showed an approximately linear relationship with logarithmic color saturation (Fig. 9C), much like the logarithmic distortion in the lightness projection (Fig. 7A).
Representation of color saturation in area V4. A, Unsaturated colors chosen inside HSL cone. Central points: Ru12 is between R and middle gray; Ru13 between Ru12 and middle gray. The same criteria were used for selecting unsaturated green and blue stimuli. B, Maps for red with 1, 1/2, 1/4, 1/8, and 1/16 saturation (S1–S5), respectively, in Monkey A. Images of Gaussian high-pass (σ = 322 μm) and low-pass filtered (σ = 52 μm); same clipping range (± 0.1%) used for every map (n = 45, 35, and 22 trials for R, G, and B stimuli, respectively). C, Peak responses (mean ± SEM) of the activated patches in B. Correlation between peak response intensities and color log saturation approximates linearity (r = −0.938, p < 10⁻⁴). Scale bar, 1 mm.
A practically infinite variety of colors can be chosen by combining hue, lightness, and saturation, and naturally we cannot check them individually. However, we can sample reasonable arrays of colors to test the basic representational rules. To test how well the rules observed so far generalize, we surveyed the lightness representation of unsaturated colors and the saturation representation of darkened colors in V4. We sampled nine red variations using combinations of 1, 1/2, 1/4 lightness and 1, 1/2, 1/4 saturation (Fig. 10A). We found that the rule of lightness representation for unsaturated colors is similar to that for saturated ones (Fig. 10B). The activated patches shifted toward and converged to black-preferring patches as the lightness of the tested color decreased. Likewise, the rule of saturation representation for darkened colors appeared to be the same as that for bright (saturated) colors; the responsiveness of the activated patches decreased with decreasing saturation.
Cortical representation of unsaturated and darkened red. A, Differential maps for unsaturated and darkened red. B, Cortical activation decreases when saturation decreases (one-tailed t test). C, Locations of the peak responses shift with decreasing lightness. Data are from Monkey A. Scale bar, 1 mm.
Combined cortical maps of perceptual colors
To assess how the perceptual color representations are interconnected in the globs, we combined the hue and lightness maps for each tested monkey (Fig. 11A–H). The combined maps showed considerable individual variations, likely reflecting differences in their formation during development and maturation, as has been shown for other cortical architectures (Wiesel and Hubel, 1965; Buonomano and Merzenich, 1998; Feldman and Brecht, 2005). Nonetheless, in every map, cortical lightness and hue representations revealed interconnections: crosses or branch-points, which have their counterparts in the conical surface of HSL space (Fig. 6A). Typically, lightness clusters (CL) had orthogonal crossings with hue clusters (CH), and many lightness clusters branched out from the same black or white patches (Fig. 11E–H), analogous to the branching from the two opposing poles of HSL space. For example, in Monkey A, CH1 crossed CLG2 at a green-preferring patch (Fig. 11E); and in Monkey B, CH1 crossed CLR1 and CLR2 at a red-preferring patch (Fig. 11F). In Monkey A, CLR1, CLG1, and CLB1 branched out from the same black-preferring patch.
Clusters and stacks in combined maps in globs of area V4. A–D, The combined maps of Monkeys A, B, C, and D, respectively, show individual layouts but similar representational rules for hue and lightness information with each map containing hue and lightness clusters and stacks. E–H, The corresponding interpolated locations of hue-preferring patches in hue clusters, CH, and lightness-preferring patches in lightness clusters, CL. Small black or white disks: black- or white-preferring patches, forming end- or branch-points. Light gray rectangles represent orthogonal crossings between hue and lightness clusters. I–L, Stacked patches, S, in the combined maps. Red stacks represent SR, red disks; blue stacks represent SB, blue disks; unique stacks represent SU, light gray disks. Scale bar, 1 mm. Although in Mk.B some stacks appear next to or over vessels, this did not lower the image quality much (see Materials and Methods).
The spatial distribution of hue and lightness representations in the color map was quite uneven, with some local areas showing greatly overlapping color-preferring patches. By calculating density maps for the patch distribution, we could identify the regions with significantly higher patch densities, which we call “stacks” (Figs. 11I–L and 12). Some of the stacks combined hue selectivity with a broad lightness invariance. For example, in Monkey A, stack SG1 represented greens from bright to very dark green (Fig. 11E); in Monkey B, SR1 was activated by reds from bright to white red (pink), whereas SG1 represented greens from bright to light green (Fig. 11F). Thus, these areas represented specific hue preferences; for the preferred hue, their activity remained much the same regardless of changes in its lightness dimension.
Examples of spatial density analysis for patches in Monkey A (A–E) and Monkey B (F–J). A, F, Contoured color maps. Contours for all test colors were superimposed on the blood vessel background. B, G, The centers of all contours (in A,F) were plotted as white points. C, H, Spatial density map, showing the center points of all patches. These maps use 3 × 3 pixel (77 × 77 μm) binning for each pixel (in B, G) given in arbitrary units. D, I, Smooth spatial density maps derived from C and H, using Gaussian low-pass filter (σ = 26 μm). E, J, Red points in these maps represent peaks in the maps D and I, respectively; 5 SDs are used as threshold. These peaks indicate “stacks,” where many patches concentrate. Scale bar, 1 mm.
In some map regions, color representations were not simple HSL projections. For example, in Monkey A, hue cluster CH3 and lightness cluster CLR1 overlapped almost completely, and in Monkey D, conflicting color-preferring patches mixed with hue cluster CH2. In addition, we found three other regions, representing complex hue and lightness variations, marked as SU and highlighted by gray disks (Fig. 11I–L). In Monkey A, SU1 contained patches of pink, green, yellow, and aqua (Fig. 11I), perhaps preferring hues with high subjective lightness, and in Monkey C, SU1 represented cool colors: green, dark green, and aqua (Fig. 11K). These clusters and stacks may represent non-HSL perceptual color dimensions, such as subjective lightness (CH3, CLR1, SU1 in Monkey A) or warmness (CH2 in Monkey D).
In total, we identified 237 color-preferring patches in the color maps of V4 of four monkeys (the lightness variations of orange, yellow, aqua, and purple not included). Most of them were arranged in clusters, reflecting the basic HSL framework, or in stacks, reflecting some general features of similar colors along the perceptual dimensions (66 of 68, 38 of 43, 57 of 57, and 69 of 69 in Monkeys A, B, C, and D, respectively; 230 of 237, or 97%, in total). Only a few patches were distributed irregularly (3%).
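The totals quoted above follow directly from the per-monkey counts, as the short check below shows; the variable names are illustrative.

```python
# (patches in clusters or stacks, all patches) per monkey, as given in the text.
counts = {"A": (66, 68), "B": (38, 43), "C": (57, 57), "D": (69, 69)}

organized = sum(k for k, _ in counts.values())   # patches in clusters or stacks
total = sum(n for _, n in counts.values())       # all identified patches
irregular = total - organized                    # irregularly positioned patches

print(organized, total, irregular)
print(f"organized: {100 * organized / total:.0f}%, irregular: {100 * irregular / total:.0f}%")
```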
To examine how stable the intrinsic optical responses in the color maps are, typical recordings were repeated over a period of weeks; in Monkey A, nearly 3 months. Figure 13 shows the red- and green-preferring patches (significantly activated regions) in V4 of Monkeys A and D, measured in two experiments on different days. We found that the locations of the color-activated patches and their activity levels remained essentially stationary over the whole test period.
Reproducibility of the color maps in V4. A, B, Difference maps to red stimulus in Monkey A, on day 91 and day 97. These maps were calculated by using Equation 3. Peak responses (75% maxima) inside significantly activated regions (one-tailed t test, p < 0.05, n = 35 trials) outlined by red contours. C, Difference map, calculated by subtracting response to red on day 97 from response to red on day 91, revealed no significant regions (two-tailed t test, p < 0.05, n = 35 trials). Peak response contours from day 91 and day 97 are superimposed on this map for easy comparison. D–F, Another example of the reproducibility test in Monkey D, using green stimulus (n = 34 trials), produced similar results. A–F, The maps were unfiltered. Clipping range for all maps was −0.1% to 0.1%. Scale bar, 1 mm.
Because HSL color space is a full color space, defining all colors within a monitor's luminance range by their hue, saturation, and lightness values, it also controls for luminance. In contrast, equiluminant colors, as used in previous studies (e.g., Xiao et al., 2003; Conway et al., 2007), occupy a smaller 2D fraction of the natural images' 3D color space (see Materials and Methods). To help compare our results with those of previous studies, we first estimated the cortical map for nearly equiluminant hues (Fig. 14A). This map was obtained by selecting the cortical activity patches to red, orange, yellow, green, aqua, blue, and purple of very similar luminance (Table 1) from the color map of HSL space (Fig. 11A). To further verify that this map was an accurate approximation of an equiluminant color map in globs, we estimated cortical locations of patch centers representing equiluminant hues (20 cd/m2) from the recorded lightness clusters (Fig. 14B). Unsurprisingly, we found that all the approximated patch centers for equiluminant hues were very close (within 10 μm) to the measured patch centers of the corresponding near-equiluminant colors (arrows); these results were predictable from the linear cortical mapping of logarithmic lightness values (compare Fig. 7A). The observed distances between the patch centers are minute compared with the diameter of the full color map, which is ∼1500 μm. The chromatic order is evident in the globs' equiluminant hue map (Fig. 14A), as first reported by Conway et al. (2007), but it seems less striking than that for bright (saturated) hues (Fig. 4D). Naturally, such a map provides no cues for its underlying lightness order. Therefore, it would have been impossible to extrapolate the layout of the full color map in V4 from the results of the previous studies (from the cortical activity to equiluminant hues or to other narrow color sets) alone.
Cortical map for (near-) equiluminant colors makes up only a small section of Monkey A's full color map (compare Fig. 11A). A, The map displays cortical activity patches (peak activity areas), evoked by seven hues (red, orange, yellow, green, aqua, blue, and purple) of relatively similar luminance values (22.7, 17.0, 27.4, 20.9, 25.4, 15.2, and 11.0 cd/m2, respectively; Table 1). The map shows chromatic hue order but not as clearly as the “rainbows of patches,” which form hue clusters to bright saturated hues (compare Fig. 4D). B, Cortical locations of patch centers representing equiluminant red, green, and blue (20 cd/m2) were approximated from the recorded lightness clusters (see Materials and Methods). Because of the linear cortical mapping of logarithmic lightness values (compare Fig. 7A), the approximated patch centers for equiluminant red, green, and blue are very close to the measured patch centers of the corresponding near-equiluminant colors (22.7, 20.9, and 15.2 cd/m2, respectively; marked by crosses and pointed by arrows). Therefore, the map of peak activity areas of near-equiluminant colors in A is likely to provide an accurate approximation of the corresponding equiluminant color map in globs. To highlight the color map's linear representation of logarithmic luminance values, a few other estimated patch center positions for darker (lower luminance) reds, greens, and blues are also indicated by arrows. Scale bar, 1 mm.
Microelectrode recordings validate optical imaging results
Last, through microelectrode recordings, we investigated how well the measured optical signals reflected the underlying neuronal responses. We inserted eight Pt/Ir-alloy microelectrodes (pt1–8) in the surface layer of a cortical color map (Fig. 15A) by using its surface vessel patterns, as seen in the optical images, as landmarks. These electrodes picked up multiunit action potentials (Fig. 15B,C), which were used to quantify local neuronal responses to the tested hue and red lightness stimuli. In each case, the color selectivity of the recorded neural groups resembled that seen during optical imaging (Fig. 15D). Thus, the neural groups showed broad spectral sensitivities, similar to their overlapping hue and lightness patches (pt1–8), having the same dominant peaks for the preferred colors. These results demonstrated that the neural responses and optical signals correlate closely across the color map (Fig. 15E), validating the optical imaging results, and suggested that the hue and lightness patches for different colors overlap because neurons in one patch respond to these colors in different ratios or strengths. Overall, these results also agree with electrophysiological findings of Conway et al. (2007), who showed luminance invariant hue tuning of individual glob neurons. Although the spectral responsiveness of individual neurons can vary even within the same hue-patch (compare Fig. 15A,D, pt2 and pt3), our data indicate that, if a neuron prefers red over green, regardless of luminance (or lightness) changes, it would typically still respond more to (lighter/darker) reds than to (lighter/darker) greens.
Neural responses, measured through microelectrode recordings, correlate with optical imaging signals. A, Locations of the microelectrode recordings, marked with + with respect to the corresponding optically measured color maps in Monkey A. B, Action potentials at electrode pt2, located in a red area, are mostly evoked by dark red and reds with greater lightness. After a 60–150 ms delay from the stimulus onset, the firing peaks transiently before settling at a steady-state level. Average firing rates (±SD) for each color, obtained from recordings of 15–20 repetitions, show color-specific variations. C, Action potentials at electrode pt6 are mostly evoked by pink, purple, and reds with greater lightness. Interestingly, this recording site showed enhanced firing (opponency) after green and yellow stimulation. D, The normalized color tuning measured at each recording site compared with the corresponding optical signals. Color-induced intensity of the optical signal was taken from its grayscale difference maps (e.g., Figure 3C) as the average of 3 × 3 image pixels at the point where each electrode resided. E, Correlation between the neuronal responses and the optical signals (p < 10⁻⁴); includes all data from D. Scale bar, 1 mm.
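The comparison in D and E can be sketched as follows. This is a minimal illustration and not the analysis code used in the study: the function names, the peak normalization, the synthetic difference maps, and the firing rates are all assumptions introduced for the example. Only the 3 × 3 pixel averaging at the electrode site and the correlation of the two measures come from the caption.

```python
import numpy as np

def optical_signal_at(diff_map, row, col):
    """Mean of the 3 x 3 pixel window centered on the electrode site."""
    window = diff_map[row - 1:row + 2, col - 1:col + 2]
    if window.shape != (3, 3):
        raise ValueError("electrode site is too close to the image border")
    return float(window.mean())

def normalize_tuning(responses):
    """Scale a tuning curve by its largest absolute value (assumed normalization)."""
    responses = np.asarray(responses, dtype=float)
    peak = np.max(np.abs(responses))
    return responses / peak if peak > 0 else responses

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(x, y)[0, 1])

# Synthetic data for one electrode site: one difference map per test color.
rng = np.random.default_rng(0)
diff_maps = [rng.normal(loc=m, scale=0.1, size=(32, 32))
             for m in (0.9, 0.5, 0.2, 0.1)]
firing_rates = [45.0, 24.0, 11.0, 6.0]  # spikes/s, made-up values

optical = normalize_tuning([optical_signal_at(m, 16, 16) for m in diff_maps])
neural = normalize_tuning(firing_rates)
print(pearson_r(optical, neural))
```

Pooling the normalized pairs from all eight sites before computing the coefficient would correspond to the single correlation reported in E.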
Discussion
By systematically testing color stimuli from the HSL coordinates, we discovered the macroscopic organization of full color maps in the globs, which largely agrees with our common color perception. In these maps, each color is represented by multiple patches, in which activity increases with saturation. Most patches are arranged in hue or lightness clusters, which establish, together with their activity level (saturation dimension), a basic framework for cortical HSL coordinates. In other areas, however, some patches overlap nearly completely, creating stacks, which possibly represent color invariances, and few are positioned irregularly. The plausible outcome of this spatial organization is that different colors are represented by different multipatch activity patterns. Furthermore, because each color map appears to have its own unique multipatch color representations for the same test colors, the findings suggest that their detailed topology is also shaped by development and maturation.
HSL color space and perceptual color map
In trichromatic vision, all observable colors can be represented by a 3D color space, including RGB, YUV, CIE, and HSL spaces (Ebner, 2007). RGB space (Mollon, 1982) is confined by broadly tuned sensitivity of retinal cones to short, medium, and long wavelengths, whereas HSL space is a nonlinear transformation of the Cartesian (cube) representation of RGB space. HSL color space, first defined by Munsell based on his perceptual color measurements, uses hue, value (lightness), and chroma (color purity) dimensions (Munsell, 1912). From the viewpoint of color perception, it seems more intuitive and perceptually relevant than RGB space, and accordingly, HSL space is widely used in technical fields and daily life, such as in computer graphics and in chromatic adjusters of color televisions. Thus, the HSL-based framework of the color map, which we report here, is unlikely to be coincidental but rather evolved with color perception. At the same time, we cannot assert that HSL would be the only, or the optimal, color space for mapping color representation in V4. After all, neural systems are especially complex and their outputs highly adaptive, and even with HSL space, there remain ambiguities between its projections and the color map. Nonetheless, its logarithmically projected lightness dimension is noteworthy, as this suggests that perhaps the most natural, or suitable, color space might be an inflated, spindle-shaped HSL space rather than a linear, double-cone-shaped one. Munsell's original color space has this shape, containing a more perceptually uniform color scale near the black point, in contrast to a linear scale.
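As a concrete illustration of HSL being a nonlinear transformation of the RGB cube, the following minimal sketch uses the standard computer-graphics conversion from Python's `colorsys` module. This is the generic linear (double-cone) HSL definition and is assumed here only for illustration; it is not the stimulus calibration used in this study, and the RGB triplets chosen for dark red, red, and pink are hypothetical examples.

```python
import colorsys

def rgb_to_hsl(r, g, b):
    """Convert RGB in [0, 1] to (hue in degrees, saturation, lightness)."""
    # colorsys returns the components in H, L, S order.
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return h * 360.0, s, l

# Illustrative RGB triplets (assumed, not the stimuli of this study).
examples = {
    "dark red": (0.5, 0.0, 0.0),
    "red": (1.0, 0.0, 0.0),
    "pink": (1.0, 0.5, 0.5),
    "green": (0.0, 1.0, 0.0),
}

for name, rgb in examples.items():
    hue, saturation, lightness = rgb_to_hsl(*rgb)
    print(f"{name}: hue={hue:.0f} deg, S={saturation:.2f}, L={lightness:.2f}")
```

In this conversion, dark red, red, and pink share the same hue angle and differ only in lightness, whereas green differs in hue, which is the sense in which hue and lightness form separate axes of the space.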
In HSL space, red changes to orange, orange to yellow, and pink to red. Similarly, in the globs, red-preferring patches neighbor orange-preferring patches; orange is represented next to yellow-preferring patches and pink next to red. Importantly, each color in the HSL space has its own cortical representation rather than having a composite representation of basic colors. This is because in the color map there is no common lightness module for different hues, but each hue has its own lightness clusters, branching over distinctive areas. Dark red is represented by dark red-preferring patches, instead of joint activation of red- and black-preferring patches, which was further confirmed by microelectrode recordings from the dark red-preferring patches and concurs with previous electrophysiological studies (Conway et al., 2007; Conway and Tsao, 2009). These results suggest that the brain has integrated brightness and hue representations at V4 and may explain why we perceive pink, brown, and red as three different colors, rather than one color with different lightnesses. Overall, the encoding of color lightness seems quite different from that of V1/V2, where the color brightness and hue may be represented independently (Livingstone and Hubel, 1984; Roe and Ts'o, 1995). Nonetheless, there are further areas in the color map that are less clear to interpret. These include stacked patches, presumably representing folds or singularities along hue or lightness dimensions, and irregularly positioned patches.
One attractive idea is that the neurons constituting a stack could encode complementary color information (features or dimensions) that is needed to distinguish color similarities and differences simultaneously. In HSL space, although pink and red are far apart in the lightness dimension, they share the same hue angle, belonging to reds. Thus, their stacked representations could categorize similar colors. Together, information from stacks and clusters could then be used to group, according to their spectral qualities, the similar and different elements of a scene into object surfaces (Lotto and Purves, 2002). However, because each of the four color maps we studied contained more clusters than stacks and not all colors had stacked representations, most such operations would be incomplete. So this explanation can only be a partial answer. This is further supported by the variable map topology in different monkeys, which suggests that the color maps self-organized to also accommodate other mapping requirements that likely competed for cortical representation during development and maturation, as is known to happen in other cortical architectures (Wiesel and Hubel, 1965; Buonomano and Merzenich, 1998; Feldman and Brecht, 2005).
Requirements for distributed combinatorial representations
In HSL space, the number of possible colors is vast. Naturally, this must be reflected in the structure and operations of the visual system (Barlow, 1981). Thus, from a functional point of view, cortical circuits in V4 presumably optimized connectivity and computations so that their output can represent effectively the probable eventualities in the perceptual color space.
The large degree of overlap between different hue and lightness patches (Fig. 15A), as further confirmed by local multiunit recordings of broad spectral sensitivity (Fig. 15D), makes it implausible that neural activity in a single small cortical area would represent a single color unambiguously. Instead, the combinatorial and distributed color representation is the likely solution to the problem of the physical color space being vastly larger than the number of neurons available in a small cortical map to encode it (for similar arguments about olfactory space coding, see Laurent, 2002). Because multipatch representations for different test colors appear different in the same map, the limited representational color space of the globs is likely massively expanded and may benefit from the patch overlaps, which could further increase perceptual discriminability (Knill and Pouget, 2004; Moreno-Bote et al., 2011). Here, fractioning color representation into distributed patch activity may also optimize local continuity (Saarinen and Kohonen, 1985; Graziano and Aflalo, 2007), by minimizing wiring length and maximizing coding efficiency (Graziano and Aflalo, 2007; Wen and Chklovskii, 2008), to provide information-rich representations of complex color distributions in natural scenes (Land et al., 1983). Conversely, the maps' linear representations of logarithmic color lightness and saturation changes may reflect adaptations to nonlinear response dynamics of retinal cells (Barlow, 1957; Marr, 1974; van Hateren et al., 2002), their signal mixing (Field et al., 2010), and the Weber-Fechner law of color perception (McCann et al., 1976; Land, 1977; Land et al., 1983).
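The capacity argument can be made concrete with a simple count, offered here as a rough sketch under strong simplifying assumptions: patches are treated as either active or inactive, any subset of 1 to 4 patches (the range observed per color) is treated as a valid pattern, and graded activity, patch overlap, and spatial constraints are ignored. The patch counts used below are hypothetical.

```python
from math import comb

def count_multipatch_patterns(n_patches, max_active=4):
    """Number of distinct patterns with 1..max_active active patches
    out of n_patches, treating each patch as simply on or off."""
    return sum(comb(n_patches, k) for k in range(1, max_active + 1))

# Hypothetical numbers of patches in one small color map.
for n in (10, 20, 40):
    single = n  # one-patch-per-color coding
    combined = count_multipatch_patterns(n)
    print(f"{n} patches: {single} single-patch codes, "
          f"{combined} multipatch patterns")
```

Even under these crude assumptions, the number of available patterns grows far faster than the number of patches, and graded (saturation-dependent) activity would enlarge it further.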
During their likely self-organization (Saarinen and Kohonen, 1985; Graziano and Aflalo, 2007), the color maps may also have been shaped by the need to bind color information to other image features, such as shapes, textures, and orientations of objects that are represented within and around V4 (Essen and Zeki, 1978; Desimone and Schein, 1987; Tanigawa et al., 2010; Kourtzi and Connor, 2011), and by the related space allocation of adjacent top-down connections (Zhou and Desimone, 2011). Furthermore, the color map may incorporate knowledge or qualities that are intrinsic constructs, such as color constancy (Schein and Desimone, 1990), the percept of which cannot be simply predicted from the physical properties of the light (McCann et al., 1976; Land, 1977; Lotto and Purves, 2002; Kusunoki et al., 2006). Thus, some aspects of its recorded activity may not only represent colors in the external world but also reflect correlations to internal factors, arising from the complex dynamics of the underlying cortical circuits (Harris, 2005). After all, V4 lies at the interface between major bottom-up and top-down pathways, which have been suggested to jointly drive the network activity of its specialized neuronal assemblies in a task-related manner (Roe et al., 2012).
Possible relationship to V2 color maps
In most of our experiments, the small color square stimuli did not activate V2 within the optically imaged area. However, the hue-lightness map in V2 could be reconstructed from limited data for larger stimuli from one monkey (Fig. 16). It has been proposed that V2 contains spatially separate modules for hue and luminance, with the luminance of different colors represented by a common module (Wang et al., 2007). Our limited data suggest that V2 could also have continuous 2D luminance clusters (here, only the lightness of dark reds was tested) along with hue clusters.
Hue and lightness maps in area V2. Boxes frame a hue cluster (CH1) and a red-lightness cluster (CLR1) that largely follow the hue/lightness order of HSL color space. These limited data from V2 were obtained as a byproduct of optical imaging experiments that targeted V4 in Monkey A. The mapping was performed by presenting a 4° × 4° color square (stimulus) at a preset screen position, which marks the local visual field center of the glob (in V4). Only the lightness of dark reds was tested. Scale bar, 1 mm.
In conclusion, our results give new insight into how the brain perceives color. In area V4, our findings indicate that its macroscopic color maps are organized along the three intuitive dimensions of hue, lightness, and saturation so that colors from the 3D chromatic space are broadly allotted to the 2D cortical surface according to their perceptual relationship. This perception-based functional architecture uses distributed and combinatorial hue and lightness representations, possibly to enable sophisticated multifeature extraction and to expand the representational color space of the globs.
Footnotes
This work was supported by the National Science Foundation of China Outstanding Young Researcher Award 30525016, the National Science Foundation of China project 90408030, the Knowledge Innovation Project of the Chinese Academy of Sciences to S.T., the National Science Foundation of China project 30810103906 to S.T. and M.J., Biotechnology and Biological Sciences Research Council Grant BB/F012071 to M.J., and Jane and Aatos Erkko Fellowship to M.J. We thank Amiram Grinvald for advice on how to perform long-term optical imaging in alert monkeys, Horace Barlow for discussions, and Roger Hardie, Hugh Robinson, Gonzalo de Polavieja, Wu Li, Robert Desimone, and two anonymous referees for comments.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Shiming Tang, Peking University-Tsinghua University Joint Center for Life Sciences and Peking University-International Data Group-McGovern Institute for Brain Research, Beijing 100871, China. tangshm@sun5.ibp.ac.cn