Spatial navigation is believed to be guided in part by reference to an internal map of the environment. We used functional magnetic resonance imaging (fMRI) to test for a key aspect of a cognitive map: preservation of real-world distance relationships. University students were scanned while viewing photographs of familiar campus landmarks. fMRI response levels in the left hippocampus corresponded to real-world distances between landmarks shown on successive trials, indicating that this region considered closer landmarks to be more representationally similar and more distant landmarks to be more representationally distinct. In contrast, posterior visually responsive regions such as retrosplenial complex and the parahippocampal place area were sensitive to landmark repetition and encoded landmark identity in their multivoxel activity patterns but did not show a distance-related response. These data suggest the existence of a map-like representation in the human medial temporal lobe that encodes the coordinates of familiar locations in large-scale, real-world environments.
A cognitive map is a representational structure that encodes spatial locations within large-scale, navigable environments. O'Keefe and Nadel (1978) proposed that the hippocampus is the brain structure that supports the cognitive map in mammals. Supporting this hypothesis are data from neurophysiological studies indicating that hippocampal neurons exhibit increased firing for particular spatial locations (O'Keefe and Dostrovsky, 1971; Matsumura et al., 1999) and lesion data indicating that damage to the hippocampus impairs navigation using map-based but not route-based strategies (Morris et al., 1982). The theory has been further enhanced by the recent discovery of a grid-like spatial representation in entorhinal cortex, the primary source of hippocampal input (Hafting et al., 2005). The spatial regularity of the entorhinal grid suggests that it may facilitate precise coding of location within the environment and a metric for calculating distances between locations (Jeffery and Burgess, 2006).
In humans, the evidence for hippocampal involvement in cognitive map coding is less clear. Although place cells have been discovered in the human hippocampus (Ekstrom et al., 2003), damage to this structure does not lead to a purely spatial impairment. Rather, these amnesic patients suffer from a more general declarative memory problem (Squire, 1992), which can leave the ability to navigate through familiar environments essentially intact (Teng and Squire, 1999). Furthermore, neuroimaging studies of spatial navigation obtain hippocampal activation in some cases (Ghaem et al., 1997; Maguire et al., 1998) but not others (Aguirre et al., 1996; Aguirre and D'Esposito, 1997; Rosenbaum et al., 2004). In summary, the claim that human medial temporal lobe structures such as hippocampus encode spatial information per se, as opposed to other kinds of navigationally relevant information, remains controversial (Shrager et al., 2008).
Here we present evidence for a signal in the human hippocampus that exhibits a key feature of a cognitive map: preservation of real-world distance relationships. That is, the hippocampus considers locations that are physically closer in space to be more representationally similar and locations that are further apart in space to be more representationally distinct. Such a distance-related response has not been identified previously in the hippocampus: the existence of place cells indicates that different locations are distinguished but does not necessarily imply that these locations are organized according to a map-like code. To test for such a code, we scanned university students with functional magnetic resonance imaging (fMRI) while they viewed images of landmarks from a familiar college campus. We examined multivoxel activity patterns evoked by landmarks as well as adaptation effects related to the distance between landmarks. We reasoned that a brain region involved in encoding locations within an allocentric map should demonstrate adaptation effects that are proportional to the real-world distance between successively viewed landmarks. In contrast, regions representing visual or semantic information about landmarks should exhibit adaptation during landmark repetition and multivoxel patterns that distinguish between landmarks but should not exhibit distance-related adaptation.
Materials and Methods
Fifteen right-handed volunteers (10 female; mean age, 22.6 ± 0.3 years) with normal or corrected-to-normal vision were recruited from the University of Pennsylvania. All subjects had at least 1 year of experience with the campus (average length of experience, 3.7 ± 0.2 years) and gave written informed consent according to procedures approved by the University of Pennsylvania institutional review board.
Scans were performed at the Hospital of the University of Pennsylvania on a 3 T Siemens Trio scanner equipped with a Siemens body coil and an eight-channel head coil. High-resolution T1-weighted anatomical images were acquired using a three-dimensional magnetization-prepared rapid-acquisition gradient echo pulse sequence [repetition time (TR), 1620 ms; echo time (TE), 3 ms; inversion time (TI), 950 ms; voxel size, 0.9766 × 0.9766 × 1 mm; matrix size, 192 × 256 × 160]. T2*-weighted images sensitive to blood oxygenation level-dependent contrasts were acquired using a gradient-echo echo-planar pulse sequence (TR, 3000 ms; TE, 30 ms; voxel size, 3 × 3 × 3 mm; matrix size, 64 × 64 × 45). Images were rear-projected onto a Mylar screen at 1024 × 768 pixel resolution with an Epson 8100 3-LCD projector equipped with a Buhl long-throw lens. Subjects viewed the images through a mirror attached to the head coil. Images subtended a visual angle of 22.9° × 17.4°.
Stimuli and procedure.
Visual stimuli were color photographs of 10 prominent landmarks (i.e., buildings and statues) from the University of Pennsylvania campus. Twenty-two distinct photographs were taken of each landmark for a total of 220 images. To ensure that all subjects were familiar with the landmarks, they underwent behavioral testing 1 day before scanning in which they were asked to indicate (yes/no) whether they were familiar with each landmark. In the same session, “subjective” distances between landmarks were determined by asking subjects to estimate the number of minutes required to walk between each pair of locations.
The main experiment consisted of two fMRI scan runs that lasted 6 min 51 s each, during which subjects viewed all 220 images without repetition. Images were presented every 3 s in a continuous-carryover sequence that included 6 s null trials interspersed with the stimulus trials (Aguirre, 2007). This stimulus sequence counterbalances main effects and first-order carryover effects, thus allowing us to use the same fMRI dataset to examine both the multivoxel response pattern for each landmark and adaptation between landmarks presented on successive trials. A unique continuous-carryover sequence was defined for each subject. On each stimulus trial, an image of a landmark was presented for 1 s, followed by 2 s of a gray screen with a black fixation cross. Subjects were asked to covertly identify each campus landmark and make a button press once they had done so. During null trials, a gray screen with a black fixation cross was presented for 6 s during which subjects made no response. Each run included a 15 s fixation period at the beginning of the scan to allow tissue to reach steady-state magnetization and ended with an additional 15 s fixation period.
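The counterbalancing property of such a sequence can be checked with a short sketch. The code below is illustrative only and is not the sequence generator used in the study; the function name `is_first_order_counterbalanced`, the `"null"` label, and the toy three-condition sequence are assumptions. It counts every ordered pair of successive conditions and tests whether all pairs (including same-condition and null transitions) occur equally often.

```python
from collections import Counter


def is_first_order_counterbalanced(sequence, circular=True):
    """Return True if every ordered pair of conditions follows one another
    equally often in `sequence`.

    sequence: list of hashable condition labels (e.g. landmark names plus
              a hypothetical "null" label for null trials).
    circular: if True, the last trial is treated as preceding the first,
              which is an assumption made here for convenience.
    """
    labels = set(sequence)
    pairs = list(zip(sequence, sequence[1:]))
    if circular and sequence:
        pairs.append((sequence[-1], sequence[0]))
    counts = Counter(pairs)
    # Every ordered pair, including repeats such as (A, A), must be present.
    expected = {(a, b) for a in labels for b in labels}
    return set(counts) == expected and len(set(counts.values())) == 1


# Toy example with two landmarks and a null condition (not study data).
toy = ["null", "null", "A", "null", "B", "A", "A", "B", "B"]
print(is_first_order_counterbalanced(toy))
```

In a sequence that passes this check, each landmark is preceded equally often by every other condition, which is what allows adaptation (carryover) effects and per-landmark responses to be estimated from the same data.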
After the experimental runs, subjects were scanned twice more for the functional localizer. Each functional localizer scan lasted 7 min 48 s and consisted of 18 s blocks of images of places (e.g., cityscapes, landscapes), single objects without backgrounds, scrambled objects, and other stimuli, presented for 490 ms with a 490 ms interstimulus interval.
Functional images were corrected for differences in slice timing by resampling slices in time to match the first slice of each volume, realigned to the first image of the scan, and spatially normalized to the Montreal Neurological Institute (MNI) template. Data for all univariate analyses, including the functional localizer scans, were spatially smoothed with a 6 mm full-width half-maximum (FWHM) Gaussian filter; data for multivoxel pattern analyses (MVPAs) were left unsmoothed.
Regions of interest.
Data from the functional localizer scans were used to define functional regions of interest (ROIs) for scene-responsive cortex in parahippocampal place area (PPA) and retrosplenial complex (RSC) (places > objects), object-responsive cortex in the lateral occipital complex (LOC) (objects > scrambled objects), and early visual areas (scrambled objects > objects). Thresholds were determined on a subject-by-subject basis to be consistent with those identified in previous studies and ranged from T > 2.0 to T > 3.5 (mean T = 2.7 ± 0.1). Bilateral PPA and LOC were located in all 15 subjects. Right RSC was identified in all subjects and left RSC in 13 of 15 subjects. We also defined anatomical ROIs for the hippocampus using sagittal T1-weighted images. The hippocampal ROI included all CA fields and the subiculum but did not include entorhinal cortex. The hippocampus was separately defined for the left and right hemispheres and further subdivided into its anterior/inferior and posterior/superior subregions by an axial division at z = −9.
Data were analyzed using the general linear model as implemented in VoxBo (www.voxbo.org), including an empirically derived 1/f noise model, filters that removed high and low temporal frequencies, and nuisance regressors to account for global signal variations and between-scan differences. Between-landmark adaptation effects were modeled with a regressor corresponding to the distance between each landmark and the immediately preceding landmark, calculated in one of two ways: the Euclidean distance in meters between landmarks (i.e., objective distance “as the crow flies”) or an individual subject's perceived distance in minutes of travel time between landmarks (subjective distance), each mean centered. Also included in the model was a regressor modeling the response to any landmark versus baseline and two regressors to account for situations in which the distance to the previous landmark was undefined: (1) when a landmark stimulus followed a null trial and (2) when a stimulus consisted of a landmark after another view of the same landmark (i.e., repeated-landmark trials). A separate, supplementary analysis examined distance effects in a less constrained manner by assigning each trial to one of four bins based on the distance from the currently presented to the previously presented landmark, plus a fifth regressor for repeated-landmark trials. Finally, a modified version of the first model was run, in which distance was only defined for non-covisible landmarks, and the covisible versus non-covisible distinction was modeled with an additional regressor.
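As a minimal sketch of how the objective-distance covariate described above could be constructed (this is not the VoxBo implementation; the function name `distance_regressor`, the use of `None` to mark null trials, and the toy coordinates are assumptions), the code below computes the Euclidean distance between each landmark and its predecessor, mean centers the defined values, and flags the two cases in which distance is undefined.

```python
import math


def distance_regressor(sequence, coords):
    """Build a mean-centered distance covariate for one trial sequence.

    sequence: list of landmark names; None marks a null trial (assumption).
    coords:   dict mapping landmark name -> (x, y) position in meters.
    Returns (covariate, after_null, repeat), each with one entry per trial.
    """
    raw = [None] * len(sequence)
    after_null = [0] * len(sequence)  # landmark following a null trial
    repeat = [0] * len(sequence)      # same landmark as the previous trial
    prev = None
    for i, cur in enumerate(sequence):
        if cur is not None:
            if prev is None:
                after_null[i] = 1
            elif prev == cur:
                repeat[i] = 1
            else:
                raw[i] = math.dist(coords[prev], coords[cur])
        prev = cur
    defined = [d for d in raw if d is not None]
    mean = sum(defined) / len(defined) if defined else 0.0
    # Trials with undefined distance contribute 0 to the covariate.
    covariate = [0.0 if d is None else d - mean for d in raw]
    return covariate, after_null, repeat


# Toy coordinates in meters (hypothetical, not the campus layout).
coords = {"A": (0.0, 0.0), "B": (300.0, 400.0), "C": (0.0, 100.0)}
sequence = ["A", "B", "B", None, "C", "A"]
covariate, after_null, repeat = distance_regressor(sequence, coords)
print(covariate)
print(after_null)
print(repeat)
```

The same function could be applied to subjective distances by replacing the `math.dist` call with a lookup into a per-subject table of walking-time estimates.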
For all models, β values were calculated for each ROI, which were then compared with zero using a one-tailed t test. In addition, whole-brain analyses were performed by calculating subject-specific t maps for contrasts of interest, which were then entered into a second-level random-effects analysis. Monte Carlo simulations involving sign permutations of the whole-brain data from individual subjects (1000 relabelings, 12 mm FWHM pseudo-t smoothing) were performed to find the true type I error rate for each contrast (Nichols and Holmes, 2002). All reported voxels are significant at p < 0.05, corrected for multiple comparisons across the entire brain.
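The logic of a sign-permutation test with correction across voxels can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used for the reported results: it uses an ordinary t statistic rather than a variance-smoothed pseudo-t, operates on a small list-of-lists rather than a brain volume, and the names `t_stat` and `max_t_threshold` are hypothetical. On each relabeling, the sign of every subject's contrast map is flipped at random, and the maximum statistic across voxels is recorded; the (1 − α) quantile of these maxima gives a threshold that controls the familywise type I error rate.

```python
import math
import random


def t_stat(values):
    """One-sample t statistic of `values` against zero."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    if var == 0.0:
        return 0.0 if mean == 0.0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)


def max_t_threshold(data, n_relabel=1000, alpha=0.05, seed=0):
    """Corrected one-tailed threshold from the max-statistic distribution.

    data: list of subjects, each a list of per-voxel contrast values.
    """
    rng = random.Random(seed)
    n_vox = len(data[0])
    maxima = []
    for _ in range(n_relabel):
        # One random sign per subject, applied to that subject's whole map.
        signs = [rng.choice((-1, 1)) for _ in data]
        maxima.append(max(
            t_stat([s * subj[v] for s, subj in zip(signs, data)])
            for v in range(n_vox)
        ))
    maxima.sort()
    idx = max(0, min(n_relabel - 1, math.ceil((1 - alpha) * n_relabel) - 1))
    return maxima[idx]


# Simulated data (not study data): 15 subjects, 10 voxels, effect in voxel 0.
rng = random.Random(1)
data = [[5.0 + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(9)]
        for _ in range(15)]
threshold = max_t_threshold(data, n_relabel=1000, alpha=0.05, seed=2)
observed = [t_stat([subj[v] for subj in data]) for v in range(10)]
print(threshold, [v for v, t in enumerate(observed) if t > threshold])
```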
To ensure accurate localization of distance-related adaptation effects to the hippocampus in the whole-brain analyses, we performed an additional step to anatomically coregister the structures of the medial and lateral temporal lobes for this contrast. The hippocampus, entorhinal cortex, perirhinal cortex, parahippocampal cortex, insula, superior temporal gyrus, and middle temporal gyrus were anatomically defined according to parcellation protocols (Kim et al., 2000; Matsumoto et al., 2001; Pruessner et al., 2002; Kasai et al., 2003). These structures were then coregistered across subjects using the ROI alignment method and the same transformations applied to the functional data before random-effects analysis (Yassa and Stark, 2009). The results were similar when this additional coregistration step was not performed.
Twenty regressors were created to model each of the 10 landmarks separately within the two experimental runs. These regressors were then used to extract β values for each condition at each voxel. Multivoxel pattern classification was performed on these values using custom MATLAB code based on the method described by Haxby et al. (2001). In short, a cocktail mean pattern was calculated for each of the two runs and subtracted from each of the individual patterns before classification. Pattern classification was performed by pairwise comparisons across all 10 landmarks. Patterns were considered correctly classified if the average pattern correlation between landmark A in opposite halves of the data was higher than between landmark A and landmark B in opposite halves of the data. Classification accuracy was then averaged across all possible pairwise comparisons for a given ROI and tested against random chance (i.e., 0.5) using a one-tailed t test. We also examined classification using a one-versus-all procedure in which landmark A was only considered correctly classified if the same-landmark correlation between opposite halves of the data (i.e., landmark A–landmark A) was higher than all nine cross-landmark correlations (i.e., landmark A–landmark B, landmark A–landmark C, etc.). Chance in this analysis is 10%.
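The split-half correlation classifier described above can be sketched in a few lines. This is not the custom MATLAB code used in the study; the function names, the toy four-voxel patterns, and the specific scoring rule (same-landmark correlation compared with the mean of the two cross-run A–B correlations) are one reading of the description and should be treated as assumptions.

```python
import math
from itertools import permutations


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def subtract_cocktail_mean(patterns):
    """Remove the across-landmark mean pattern from each landmark's pattern."""
    n_vox = len(next(iter(patterns.values())))
    mean = [sum(p[v] for p in patterns.values()) / len(patterns)
            for v in range(n_vox)]
    return {k: [p[v] - mean[v] for v in range(n_vox)]
            for k, p in patterns.items()}


def pairwise_accuracy(run1, run2):
    """Fraction of ordered landmark pairs (A, B) for which the A-A
    correlation across runs exceeds the mean A-B correlation across runs."""
    a, b = subtract_cocktail_mean(run1), subtract_cocktail_mean(run2)
    outcomes = []
    for x, y in permutations(a, 2):
        same = pearson(a[x], b[x])
        cross = (pearson(a[x], b[y]) + pearson(a[y], b[x])) / 2
        outcomes.append(same > cross)
    return sum(outcomes) / len(outcomes)


def one_vs_all_accuracy(run1, run2):
    """Fraction of landmarks whose A-A correlation beats every A-other one."""
    a, b = subtract_cocktail_mean(run1), subtract_cocktail_mean(run2)
    hits = sum(
        all(pearson(a[x], b[x]) > pearson(a[x], b[y]) for y in a if y != x)
        for x in a
    )
    return hits / len(a)


# Toy beta patterns for three landmarks over four voxels (not study data).
run1 = {"A": [1.0, 0.0, 0.0, 2.0], "B": [0.0, 1.0, 0.0, -1.0], "C": [0.0, 0.0, 1.0, 0.5]}
run2 = {"A": [1.1, 0.0, 0.1, 1.9], "B": [0.0, 0.9, 0.1, -1.0], "C": [0.1, 0.0, 1.0, 0.4]}
print(pairwise_accuracy(run1, run2), one_vs_all_accuracy(run1, run2))
```

With 10 landmarks, chance is 0.5 for the pairwise score and 0.1 for the one-versus-all score, matching the chance levels stated in the text.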
A searchlight analysis based on Kriegeskorte et al. (2006) was implemented using custom MATLAB code to look for areas of high classification accuracy outside of the predefined ROIs. A small spherical ROI (radius, 5 mm) was created and centered on each voxel of the brain in turn. Overall classification accuracy was calculated for this region using the pairwise comparison procedure, and the value was assigned to the center voxel of the cluster. These values were then used to create subject-specific accuracy maps, which were smoothed with a 9 mm FWHM Gaussian kernel before entry into a random-effects analysis. As before, a Monte Carlo sign permutation test was performed to calculate the true false-positive rate for classification accuracy against chance (50%). All reported voxels are significant at p < 0.05, corrected for multiple comparisons across the entire brain.
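To make the searchlight geometry concrete, the following sketch enumerates the voxels that fall within a 5 mm sphere on the 3 mm isotropic functional grid. It is illustrative rather than a reproduction of the study's code: the function names are hypothetical, and the rule that a voxel is included when its center lies within the radius is an assumption.

```python
import math


def sphere_offsets(radius_mm=5.0, voxel_mm=3.0):
    """Integer voxel offsets whose centers lie within radius_mm of the origin."""
    reach = int(math.floor(radius_mm / voxel_mm))
    offsets = []
    for dx in range(-reach, reach + 1):
        for dy in range(-reach, reach + 1):
            for dz in range(-reach, reach + 1):
                if (dx * dx + dy * dy + dz * dz) * voxel_mm ** 2 <= radius_mm ** 2:
                    offsets.append((dx, dy, dz))
    return offsets


def searchlight_voxels(center, shape, radius_mm=5.0, voxel_mm=3.0):
    """In-bounds voxel coordinates of the searchlight centered on `center`."""
    cx, cy, cz = center
    voxels = []
    for dx, dy, dz in sphere_offsets(radius_mm, voxel_mm):
        x, y, z = cx + dx, cy + dy, cz + dz
        if 0 <= x < shape[0] and 0 <= y < shape[1] and 0 <= z < shape[2]:
            voxels.append((x, y, z))
    return voxels


# Matrix size taken from the acquisition parameters (64 x 64 x 45).
print(len(sphere_offsets()))
print(len(searchlight_voxels((32, 32, 22), (64, 64, 45))))
```

Under these assumptions, each searchlight away from the volume edge contains the center voxel, its 6 face neighbors, and its 12 edge neighbors; the classification accuracy computed from those voxels would then be written to the center voxel.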
To test whether landmarks that are nearer in space have more similar multivoxel patterns, we computed correlations between neural distance and physical distance. Neural distance between two landmarks A and B was quantified as 1 − r_AB, where r_AB is the correlation between the pattern elicited by landmark A and the pattern elicited by landmark B after subtraction of the cocktail mean from both. Because this analysis does not require reserving part of the fMRI data as a separate test set, the fMRI response patterns used in this calculation included data from both scan runs. Neural distances were obtained for all pairs of landmarks and were correlated with the actual physical distances between those pairs. Pearson's R values were then converted to Fisher's Z values, averaged across subjects, and compared against zero using a one-tailed t test. This analysis was performed both within predefined ROIs and also within a set of 5 mm searchlights whose center positions covered the entire brain.
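The neural-versus-physical distance analysis can be sketched as follows. The code is a minimal illustration, not the study's implementation; the function names, the toy patterns and coordinates, and the clipping of r before the Fisher transform are assumptions. For one subject it computes 1 − r for every landmark pair after cocktail-mean subtraction and correlates those values with Euclidean distance; the group step applies the Fisher transform and a one-sample t statistic.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def neural_physical_correlation(patterns, coords):
    """Correlate neural distance (1 - r) with physical distance over pairs.

    patterns: dict landmark -> list of voxel responses (both runs combined).
    coords:   dict landmark -> (x, y) position in meters.
    """
    n_vox = len(next(iter(patterns.values())))
    mean = [sum(p[v] for p in patterns.values()) / len(patterns)
            for v in range(n_vox)]
    centered = {k: [p[v] - mean[v] for v in range(n_vox)]
                for k, p in patterns.items()}
    neural, physical = [], []
    for a, b in combinations(sorted(centered), 2):
        neural.append(1.0 - pearson(centered[a], centered[b]))
        physical.append(math.dist(coords[a], coords[b]))
    return pearson(neural, physical)


def fisher_z(r):
    """Fisher transform, with r clipped to avoid infinities at |r| = 1."""
    return math.atanh(max(-0.999999, min(0.999999, r)))


def one_sample_t(values):
    """t statistic for the mean of `values` against zero (df = n - 1)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(var / n)


# Toy data for one hypothetical subject: A and B are close and similar.
patterns = {"A": [1.0, 1.0, 0.0, 0.0], "B": [1.0, 0.9, 0.1, 0.0], "C": [0.0, 0.0, 1.0, 1.0]}
coords = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (10.0, 0.0)}
r = neural_physical_correlation(patterns, coords)
print(r, fisher_z(r))
```

In the group analysis, `fisher_z` would be applied to each subject's r and the resulting values passed to `one_sample_t`.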
During the main experiment, University of Pennsylvania students viewed photographs of prominent landmarks (buildings and statues) from the Penn campus (Fig. 1), which were presented one at a time without any image repetitions. Subjects made a button press once they identified the landmark shown on each trial. Note that this task did not explicitly require subjects to retrieve information about the location of the landmark or its relationship to other landmarks. Reaction times on this task revealed a behavioral priming effect for landmark identity: subjects responded more quickly on trials in which the landmark was a repeat of the landmark shown on the previous trial than on nonrepeat trials (repeat, 522 ± 29 ms vs nonrepeat, 547 ± 30 ms; t(14) = −2.0, p = 0.03). We also measured reaction time as a function of the real-world distance between the currently viewed landmark and the landmark shown on the previous trial; however, here we observed no significant effect (r = 0.002, p = 0.48).
fMRI adaptation analyses
fMRI adaptation is a reduction in response observed when an item is repeated, or when elements of an item are repeated (Grill-Spector et al., 2006). This reduction is interpreted as indicating representational overlap between the first and second item, with the amount of adaptation proportional to the degree of overlap (Kourtzi and Kanwisher, 2001). We examined two forms of fMRI adaptation effects within our functionally and anatomically defined ROIs. First, we looked for adaptation effects caused by presentation of the same landmark on successive trials. When the landmark on the current trial was identical to the landmark shown on the preceding trial, fMRI responses in PPA and RSC were significantly attenuated, as indicated by a significant negative loading on a regressor modeling response differences between repeat and nonrepeat trials (PPA, t(14) = −3.25, p = 0.003; RSC, t(14) = −3.47, p = 0.002). Whole-brain random-effects analysis revealed additional landmark-related adaptation in the left superior lingual gyrus abutting the anterior calcarine sulcus (−18, −53, 1) and the left medial retrosplenial region (−6, −47, 15) medial to the functionally defined RSC (Fig. 2). At lower thresholds, these activations extended into the functionally defined RSC and the PPA/fusiform region.
Next, we looked for adaptation between pairs of landmarks as a function of the real-world distance (i.e., objective distance) between them. We predicted that regions supporting a map-like representation would exhibit greater adaptation (i.e., less fMRI response) when proximal landmarks were shown on successive trials and less adaptation (i.e., greater fMRI response) when distal landmarks were shown on successive trials. We tested for a linear relationship between neural response and the distance between the currently viewed landmark and the landmark shown on the immediately preceding trial by measuring the loading on a continuous covariate modeling real-world distances between successive trials. This effect was positive and significant in the left anterior hippocampus (t(14) = 4.35, p = 0.0003), indicating that activity in this region correlated with real-world distances between sequentially presented landmarks. This effect was confined to the left anterior hippocampus: no similar relationship was observed in the left posterior (t(14) = 0.20, p = 0.42), right anterior (t(14) = 0.21, p = 0.42), or right posterior (t(14) = 0.49, p = 0.32) hippocampal subregions. An analysis of second-order distance (i.e., distance between the current landmark and the landmark occurring two trials back) found no significant effects in any hippocampal subregion (all p values > 0.3).
Because a cognitive map of the environment may not be entirely faithful to the real world, we also assessed the relationship between adaptation effects and subjects' perceived “subjective” distance between landmarks. Subjective distances were estimates of the number of minutes required to walk between each pair of locations, obtained the day before the fMRI scan in a separate testing session. Subjective distance judgments were highly correlated with objective physical distances (mean r = 0.90, p = 1.71 × 10⁻¹³), as one would expect given the high degree of familiarity with the campus and the grid-like organization of campus paths that facilitate direct or near-direct travel between locations. We found that activation was dependent on subjective distance in the left anterior hippocampus (t(14) = 3.22, p = 0.003) but no other hippocampal subregions (left posterior, p = 0.47; right anterior, p = 0.47; right posterior, p = 0.17).
Whole-brain analyses revealed significant dependence of activation on objective distance in the left anterior hippocampus (−29, −9, −18), consistent with the ROI analyses reported above (Fig. 3A). Distance-related activation was also observed in the left inferior insula (−45, −1, −6 and −42, −15, −6), left anterior superior temporal sulcus (aSTS) (−48, −6, −18), and right posterior inferior temporal sulcus (pITS) (46, −62, −2) near the location usually occupied by middle temporal/medial superior temporal visual areas (MT/MST) (Kourtzi et al., 2002) (Fig. 3A). Whole-brain analyses using subjective distances were similar.
To further explore the distance-related adaptation effect in the hippocampus, we performed two additional analyses. First, we passed functional data to a model in which distances between landmarks on successive trials were discretized into four covariates. This allowed us to graphically examine activation as a function of distance without assuming a linear relationship. The results confirm our previous findings (Fig. 3B,C) indicating that activity in the left anterior hippocampus scales with distance between campus locations. Second, we performed an analysis in which successively presented landmarks that are covisible (i.e., one landmark can be seen from the other landmark) were modeled separately from landmarks that are not covisible. Distance-related adaptation was then examined for the non-covisible landmarks (because there was little variability in distance for the covisible landmarks). We observed greater activity in the left anterior hippocampus for non-covisible landmarks compared with covisible landmarks (t(14) = 2.49, p = 0.01), as well as distance-related adaptation among the non-covisible landmarks (t(14) = 2.97, p = 0.005). This last effect is of particular importance because it indicates that the adaptation effect we have observed cannot be solely attributed to adaptation for landmarks that sometimes occur within the same scene but rather reflects a true distance effect.
Finally, we tested whether distance-related adaptation was found in the regions showing landmark-specific adaptation in the whole-brain analysis and whether landmark-specific adaptation could be found in the regions showing a distance-related effect. We observed a complete dissociation: there was no effect of landmark repetition in the regions showing distance-related adaptation [left anterior hippocampus (t(14) = −0.20, p = 0.42), left inferior insula (t(14) = −0.76, p = 0.23), left aSTS (t(14) = −0.68, p = 0.25), and right pITS (t(14) = 1.38, p = 0.09)], and there was no effect of distance in the regions sensitive to landmark repetition [superior lingual (t(14) = −0.86, p = 0.20) and retrosplenial (t(14) = 0.47, p = 0.32)]. To confirm the apparent dissociation between brain regions, we performed a two-way ANOVA with factors analysis type (distance, landmark repetition) and ROI for each of three ROI pairings: hippocampus–PPA, hippocampus–lingual gyrus, and hippocampus–retrosplenial cortex. The interaction term was significant for all three pairings [hippocampus–PPA (F(1,14) = 7.78, p = 0.01), hippocampus–lingual (F(1,14) = 17.58, p = 0.001), and hippocampus–retrosplenial (F(1,14) = 13.64, p = 0.002)]. The fact that we did not observe landmark-specific adaptation in the hippocampus even though we observed distance-related adaptation may at first seem surprising, but it is in fact similar to findings from other studies indicating that same-identity repetitions engage additional processes not engaged by different-identity repetitions (Sternberg, 1998; Drucker and Aguirre, 2009). Landmark repetition trials were relatively rare in our experiment, and this fact may have led to the engagement of novelty or oddball processing mechanisms on these trials that would have masked or attenuated any adaptation effect (Strange and Dolan, 2001) (see also Summerfield et al., 2008).
Multivoxel pattern analyses
A second method for determining the representational distinctions made by a brain region is to examine multivoxel patterns elicited by different stimuli. MVPA can provide information that is complementary to that obtained through adaptation, insofar as MVPA is likely to be more sensitive to information coded on a coarser spatial scale (Drucker and Aguirre, 2009). We performed two such analyses: the first examining the distinguishability of patterns elicited by the 10 campus landmarks, the second examining whether the similarities between these patterns reflected real-world distances.
We first used MVPA to decode the identities of campus landmarks viewed in one scan from patterns evoked during the other scan. This analysis involved comparison of same-landmark and different-landmark patterns across all landmark pairs. Decoding accuracy was significantly above chance in a variety of visually responsive regions (Fig. 4), including the PPA (t(14) = 6.12, p = 0.00001), RSC (t(14) = 4.47, p = 0.0003), object-selective LOC (t(14) = 7.28, p = 0.000002), and early visual cortex (t(14) = 5.18, p = 0.00009). Performance was not significantly different from chance in any of the hippocampal subregions (left anterior, t(14) = 0.07, p = 0.47; left posterior, t(14) = 0.77, p = 0.23; right anterior, t(14) = −0.04, p = 0.49; right posterior, t(14) = −0.88, p = 0.20). Similar levels of significance were observed when classification performance was scored using a one-versus-all rather than a pairwise comparison procedure. Classification using this method was significantly above chance (10%) in PPA (19.2%, p = 0.001), RSC (14.2%, p = 0.03), LOC (21.3%, p = 0.00002), and early visual cortex (23.6%, p = 0.0003) but at chance in the left anterior hippocampus (11.3%, p = 0.23). A separate analysis of pairwise decoding performance for individual landmarks indicated that classification performance was approximately equivalent for all landmarks in PPA, RSC, LOC, and early visual cortex and equivalently at chance in the hippocampus (supplemental Fig. 2, available at www.jneurosci.org as supplemental material). This suggests that above-chance classification accuracy is not driven by high performance on only a few landmarks.
A searchlight analysis of pairwise decoding performance across the entire brain revealed areas throughout the occipital and parietal cortices in which landmark identity could be decoded at rates that were significantly above chance (Fig. 5). Interestingly, these regions were only partially overlapping with regions showing landmark-related adaptation effects in the previous analysis. Similar disjunctions between regions exhibiting adaptation for a stimulus dimension and regions exhibiting multivoxel patterns that distinguish between items along this dimension have been reported previously in the literature (Drucker and Aguirre, 2009).
A second set of analyses tested whether similarities and differences between the multivoxel patterns evoked by the various landmarks related to the real-world distances between the landmarks. To examine this possibility, we calculated a “neural distance” between landmarks for all landmark pairs and then compared this neural distance with the physical distance between landmarks (see Materials and Methods). There was no significant correlation between neural and physical distance in the left anterior hippocampus (mean r = 0.02, p = 0.23) or in any of the other three hippocampal subregions (left posterior, mean r = 0.01, p = 0.40; right anterior, mean r = −0.02, p = 0.28; right posterior, mean r = 0.04, p = 0.07). We also examined the correlation between neural and physical distance in the three extrahippocampal regions that exhibited distance-related adaptation. This relationship was not significant in the left aSTS (mean r = −0.02, p = 0.32), but there was a nonsignificant trend in the right pITS region (mean r = 0.09, p = 0.06) and a small reversed effect in the left inferior insula (mean r = −0.06, p = 0.02). A searchlight analysis examining the neural versus physical distance relationship across the entire brain found no significant voxels at either a corrected (p < 0.05) or uncorrected (p < 0.001) significance level. Levels of performance within the predefined ROIs were not significantly improved by a two-step procedure in which data from one scan run were used for feature selection through a searchlight procedure and testing was performed within the best-performing searchlight on the data from the other scan run (Chadwick et al., 2010).
To gain insight into the cognitive processes that might be driving our observed neural effects, we examined an additional 10 subjects in a purely behavioral version of the experiment, after which they were queried about the thoughts and mental processes they experienced while viewing the campus photographs. This version of the experiment was identical to the fMRI version, except that stimuli were presented on a desktop computer screen within a quiet room. Most subjects (9 of 10) reported they visualized themselves standing at the location the photograph was taken (e.g., “I see Huntsman [Hall] all the time because I'm always in class there, so I was just picturing myself looking at it from this point of view”). Some subjects (6 of 10) noted that the photographs elicited specific memories tied to the viewed locations. For example, one subject reported that a picture taken underneath a campus bridge reminded them of a time when they had walked under it to avoid seeing someone, whereas another subject reported that photographs of the athletic field reminded him of attending a music festival at that location. Only a minority of subjects (3 of 10) reported that they imagined traveling between the locations. These results suggest that subjects experienced vivid retrieval of the corresponding campus location when viewing the landmark photographs but did not typically have explicit retrieval of the spatial relationships between these landmarks.
Our results demonstrate that fMRI activity in the human hippocampus is modulated by distances between locations in a spatially extended environment. When subjects viewed images of landmarks drawn from a familiar university campus, hippocampal response to each landmark was dependent on the distance between that landmark and the landmark shown on the preceding trial. We observed this distance-related effect although subjects were not given any explicit navigational task but were simply asked to think about the identity of each landmark, suggesting that the mechanism operates essentially automatically. These data are broadly consistent with the idea that the hippocampus either supports a spatial map of the environment or receives direct input from such a map.
These findings advance our understanding of the role of the human medial temporal lobe in spatial navigation. Although previous neuroimaging studies have obtained activation in the hippocampus during virtual navigation and spatial learning (Ghaem et al., 1997; Maguire et al., 1998; Shelton and Gabrieli, 2002; Wolbers and Büchel, 2005; Spiers and Maguire, 2006; Suthana et al., 2009; Brown et al., 2010), this finding is by no means universal (Aguirre et al., 1996; Aguirre and D'Esposito, 1997; Rosenbaum et al., 2004). More importantly, although these studies generally implicated the hippocampus in navigation-related processing, they did not demonstrate hippocampal coding of spatial information per se. A true spatial code does not merely distinguish between different locations (e.g., place A is different from place B) but also encodes the coordinates of those locations such that distance relationships can be ascertained (e.g., A is closer to B than to C). It is such a distance-preserving code that we demonstrate for the first time here.
Distance-related adaptation effects were also observed in the insula, aSTS, and pITS. Because these effects were unexpected, we interpret them with some caution. Nevertheless, it is intriguing that the pITS region is near the coordinates typically reported for visual areas MT/MST and also exhibited a relationship between interlandmark distance and neural distance for multivoxel patterns. MT/MST has been implicated in the coding of location during virtual navigation tasks such as triangle completion (Wolbers et al., 2007), and neurons with place-selective responses have been observed in this region in monkeys (Froehler and Duffy, 2002). These results suggest that the role of MT/MST in coding location-based information deserves more attention. The insula has also been activated in previous studies of navigation and has been associated with imagined body movements, although its exact role in navigational processing is unknown (Ghaem et al., 1997; Hartley et al., 2003).
In contrast to the adaptation results, similarities between multivoxel patterns in the left anterior hippocampus did not relate to real-world distances between locations. Previous work suggests that multivoxel patterns may be more sensitive to information coded by narrowly tuned neurons clustered by their response properties, whereas adaptation is more sensitive to information coded by broadly tuned neurons with no clustering principle (Drucker and Aguirre, 2009). Thus, finding adaptation effects in the hippocampus but no correlation between distributed patterns and real-world distances suggests a population of neurons with broadly tuned place fields and little spatiotopic organization (Redish et al., 2001). Alternatively, it is possible that the spatial resolution of our study was insufficient for revealing multivoxel patterns in the hippocampus. Using smaller voxels than those used here, a recent study was able to decode the locations of subjects within a virtual-reality room based on hippocampal multivoxel patterns (Hassabis et al., 2009). Although some of the discrepancy between those results and our own may reflect task and analysis differences, it is also possible that location information would have been evident in the current experiment had the fMRI data been acquired at a finer resolution.
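The comparison between multivoxel pattern similarity and real-world distance discussed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: neural distance is assumed here to be one minus the Pearson correlation between two landmarks' voxel patterns, the patterns and coordinates are toy values, and the function names `pearson` and `distance_pattern_correlation` are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def distance_pattern_correlation(patterns, coords):
    """Correlate neural distance (1 - pattern correlation, an assumed metric)
    with physical distance across all landmark pairs."""
    neural, physical = [], []
    for a, b in combinations(sorted(patterns), 2):
        neural.append(1.0 - pearson(patterns[a], patterns[b]))
        physical.append(math.dist(coords[a], coords[b]))
    return pearson(neural, physical)

# Toy voxel patterns and coordinates (hypothetical values for illustration).
patterns = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [1.0, 2.0, 4.0, 3.0],
    "C": [4.0, 3.0, 2.0, 1.0],
}
coords = {"A": (0.0, 0.0), "B": (100.0, 0.0), "C": (1000.0, 0.0)}

r = distance_pattern_correlation(patterns, coords)
```

A reliably positive value of `r` across subjects would indicate that nearby landmarks evoke more similar patterns than distant ones. A value near zero, as reported here for the left anterior hippocampus, would indicate no such relationship at the sampled resolution.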
Complementary to the distance-related adaptation effects observed in the hippocampus, landmark-specific adaptation effects were observed in neocortical regions, including the superior lingual gyrus, medial retrosplenial cortex, and (at lower thresholds) RSC and PPA. Our findings are broadly consistent with previous work that indicated these regions code individual scenes and landmarks, but there are two important differences. First, we observed repetition effects in the PPA and RSC, although exact landmark views were never repeated. Thus, the adaptation effect exhibited some degree of viewpoint tolerance. We previously observed cross-viewpoint adaptation in the PPA and RSC when campus scenes were repeated across intervals of several minutes but viewpoint-specific adaptation for shorter repetitions of 100–700 ms (Epstein et al., 2008). The present results suggest that intermediate repetition intervals of 2 s elicit viewpoint-tolerant responses more consistent with the longer-interval repetition regimen, a surprising finding that may have important implications for our understanding of the mechanisms that drive fMRI adaptation. Second, previous studies revealed repetition effects primarily in the PPA and RSC, whereas the strongest effects in the current study were found in the medial retrosplenial region abutting, but distinct from, the functionally defined RSC. This region, corresponding to anatomically defined retrosplenial cortex (i.e., Brodmann's areas 29 and 30), has been shown previously to contain spatial and episodic memory-related signals (Rosenbaum et al., 2004; Vann et al., 2009). Thus, the current results emphasize the importance of this region in the retrieval of information about familiar places.
We also examined the multivoxel patterns associated with different campus landmarks. Landmark identity could be decoded in several cortical regions, including some involved in scene perception (PPA, RSC), some involved in object recognition (LOC), and early visual cortex. These results extend previous findings indicating that multivoxel patterns in these regions contain information about scene category (Walther et al., 2009) by showing that they also contain information about specific landmarks. Because all of the stimuli in the current experiment were outdoor images of a college campus, it is unlikely that landmark decoding reflects categorical differences. Rather, these regions may encode visual or geometric properties that are useful for discriminating scenes in terms of general scene categories or as specific scene exemplars. Although these properties may be more holistic in regions such as PPA and RSC, it is likely that simpler visual features such as texture or color give rise to successful decoding in early visual cortex. In any case, the MVPA and adaptation results converge to implicate neocortical regions such as the PPA and RSC in landmark identification, a role that contrasts with medial temporal lobe involvement in calculating distances between landmarks.
Mechanisms and implications
What are the mechanisms underlying the distance-related signal? The simplest account is that it reflects adaptation among neurons with large and partially overlapping place fields. However, simple adaptation effects in the hippocampus are rarely reported (Brown et al., 1987); thus, we favor an account in which these effects are interpreted in terms of the operation of an active mechanism.
One possibility is that hippocampal activity reflects replay of the route from the immediately preceding landmark to the currently viewed landmark, an operation that would involve more extensive processing for longer routes (Foster and Wilson, 2006). However, we think such an account is unlikely because the subjects did not actually navigate between locations, nor did they report mentally doing so.
Another possibility is that the hippocampal signal reflects the operation of a “mismatch” mechanism that occurs subsequent to an initial pattern completion phase (Gray and McNaughton, 1982; Vinogradova, 2001; Kumaran and Maguire, 2007). Previous studies have demonstrated that the left hippocampus (but not the right) activates when the expectations of a previously established “context” are violated: for example, when the first few items of a sequence are presented in a familiar order but the last few items are rearranged (Kumaran and Maguire, 2006). In the current experiment, viewing a familiar landmark may have established a “context” on each trial; the hippocampal response on the immediately subsequent trial might then reflect the degree to which the new landmark violated this context. If the activated context on each trial included information about the spatial location of the landmark (in addition, possibly, to nonspatial information not tested here), then the degree of “mismatch” would scale with the distance between landmarks. Alternatively, the degree of context violation might reflect overlap in routes emanating from the two locations, a possibility we cannot exclude given that route overlap is likely to be highly correlated with Euclidean distance on the Penn campus.
Under this account, the hippocampus may work in concert with other brain regions to form a cognitive map. Indeed, based on the rodent data (Hafting et al., 2005) and recent neuroimaging results (Doeller et al., 2010), we suggest that the entorhinal cortex encodes metric information about the spatial relationships between landmarks, whereas the hippocampus calculates the extent to which the current stimulus is consistent or inconsistent with these spatial relationships. This hippocampal–entorhinal representation of the enduring spatial structure of the environment might project to goal representations in the subiculum or other areas, allowing the system to construct routes to different goal locations during navigation (Burgess et al., 2000). Consistent with this hypothesis, Spiers and Maguire (2007) observed activity in the subiculum and entorhinal cortex corresponding to distance to a navigational goal; here we show that a different medial temporal lobe region (the anterior hippocampus) encodes distances between landmarks even in the absence of a navigational goal.
The current results may help to illuminate some of the apparent discrepancies between rodent and human data on hippocampal function. Neurophysiological data (mostly from rodents) indicate that the hippocampus primarily [but not exclusively (Leutgeb et al., 2005; Manns and Eichenbaum, 2009)] encodes spatial information, whereas neuropsychological data (mostly from humans) suggest that hippocampal damage leads primarily to impairments in episodic memory. The idea of context has been used to bridge the gap; indeed, behavioral data indicate that spatial context may play a privileged role in shaping episodic memory (Nadel and Willner, 1980; Hupbach et al., 2008). In the current study, subjects did not physically or mentally navigate between landmarks, but the hippocampal response indicated sensitivity to the spatial relationships between landmarks. We believe that this response may reflect the operation of a spatial context processing mechanism that automatically shapes episodic memory encoding and retrieval.
This work was supported by National Eye Institute Grant EY016464 (R.A.E.) and National Science Foundation Spatial Intelligence and Learning Center Grant SBE-0541957.
Correspondence should be addressed to Russell A. Epstein, Department of Psychology, 3720 Walnut Street, Philadelphia, PA 19104.