Environmental psychology models propose that knowledge of large-scale space is stored as distinct landmark (place appearance) and survey (place position) information. Studies of brain-damaged patients suffering from “topographical disorientation” tentatively support this proposal. In order to determine if the components of psychologically derived models of environmental representation are realized as distinct functional, neuroanatomical regions, a functional magnetic resonance imaging (fMRI) study of environmental knowledge was performed. During scanning, subjects made judgments regarding the appearance and position of familiar locations within a virtual reality environment. The fMRI data were analyzed in a manner that has been empirically demonstrated to rigorously control type I error and provide optimum sensitivity, allowing meaningful results in the single subject. A direct comparison of the survey position and landmark appearance conditions revealed a dorsal/ventral dissociation in three of four subjects. These results are discussed in the context of the observed forms of topographical disorientation and are found to be in good agreement with the human lesion studies. This experiment confirms that environmental knowledge is not represented by a unitary system but is instead functionally distributed across the neocortex.
- topographical disorientation
- topographical memory
- spatial representation
- dorsal stream
- ventral stream
What form does our knowledge of large-scale, environmental space assume? Several lines of evidence within environmental psychology suggest that knowledge of topographic space is not stored as a unitary representation. Instead, it appears that humans develop distinct types of environmental knowledge, including landmark knowledge (information regarding the appearance of places) and survey knowledge (the relative spatial location of places) (Siegel and White, 1975; Thorndyke, 1981; Thorndyke and Hayes, 1982).
Studies of brain-damaged patients suffering from selective topographical disorientation provide some additional, although not conclusive, evidence for this psychological proposition. After restricted, posterior brain lesions, patients occasionally become selectively impaired in their ability to find their way from place to place within the large-scale environment. The cognitive nature of the impairment appears to be related to the lesion site: ventral lesions cause an inability to recognize salient landmarks, whereas dorsal lesions impair manipulations of the spatial (survey) relationships between places (Levine et al., 1985). It should be noted, however, that these dissociations are rarely, if ever, clear cut. Patients often demonstrate some degree of impairment in both landmark and spatial spheres of environmental competency (Bottini et al., 1990). In addition, there seems to be reasonable variation among patients in the extent to which these functions are segregated and/or in their anatomical location, because lesions in these sites do not invariably produce these deficits (Tohgi et al., 1994).
Given the suggestive yet inconclusive findings of lesion case reports, we wished to test the hypothesis that the divisions of environmental knowledge proposed by psychology studies are realized as functionally distinct, neuroanatomical regions. Over a period of several days, four subjects explored a detailed, virtual reality town. During this time, they became familiar with both the appearance of different places and the spatial position of any one place in relation to another. Although the experiment would have been theoretically unchanged by the use of a “real” environment, the virtual presentation allowed precise control over the level of environmental knowledge across subjects. Functional magnetic resonance imaging (fMRI) was used to observe regional brain activity while subjects recalled information regarding the environment. During two alternating conditions, subjects made judgments regarding the visual appearance of familiar places or judgments of relative spatial (configural) location. The two tasks used identical stimuli, differing only in the aspect of environmental knowledge that the subject was required to recall. A direct comparison of these two conditions was hypothesized to reveal a posterior dorsal/ventral dissociation, as tentatively suggested by patient lesion studies. In addition, activity observed during judgments of appearance and position was compared with a control task to reveal areas of common activity. Given the observation that the topographical deficits observed in lesion studies are not completely isolated to either appearance or position knowledge, we expected to identify areas of posterior cortex that would be activated by both appearance and position tasks when compared with the baseline control.
MATERIALS AND METHODS
The environment and training
The virtual “town” in which subjects were trained was developed as a modification of a commercially available video game (Marathon2, Bungie Software, Chicago, IL). Maps and environmental details were designed using a freeware map editor (Pfhorte 2.0 by Steve Israelson). The town itself was composed of 16 distinctive, named “places,” interconnected by a variety of roads and paths and arranged in a 4 × 4 grid. A river divided the town approximately in half. Figure 1 provides an aerial view of the environment. Subjectively, the town was ∼140 meters in width. Each “place” was designated as such by the presence of a small round marker in the ground. No place marker was visible from any other place.
Training occurred 2–3 d before functional scanning. During initial training, subjects were permitted to freely explore the environment, with their goal being the discovery of all 16 places, typically requiring between 15 and 30 min. Next, while the subject traveled from location to location, the instructor provided the subject with the name of each place. An ∼30 min period of training followed, in which the subject would attempt to travel to locations named by the instructor. Errors were corrected and reminders provided as necessary. When the subject was able to travel without hesitation from one location to another within the town, test versions of the tasks described below were administered. The stimulus set used during training was different than that used later during fMRI scanning. Performance on each component (appearance and position) of the test was scored. If accuracy was <90% on either component, the subject was given an additional 15 min of unguided exploration and then retested. This process was repeated until the subject reached criterion on both components. On the morning of scanning, subjects were provided with an additional 10 min of free exploration to allow them to “refresh” their knowledge of the town. All subjects received between 120 and 200 min of training within the virtual environment before scanning.
Behavioral tasks and analysis
The behavioral tasks used during scanning were developed on the Psyscope platform (Cohen et al., 1993). The subject’s view of stimuli from within the magnet bore subtended a 24° horizontal and 8° vertical visual angle. Stimuli were projected on a screen that the subject viewed through a mirror. The subject made responses with the use of a four-button, fiber optic control pad operated with both thumbs.
During scanning, subjects were presented with pictures of locations within the town. These pictures were taken in all four cardinal directions from all places. During the APPEARANCE task (Fig. 2 A), the picture was accompanied by the name of a place, located at the bottom of the screen. The subject was required to indicate by a button press whether the pictured place matched the name given. This task putatively required subjects to recall the visual appearance of places associated with place names to correctly identify the location. One in six presented scene/name pairs matched (best chance performance, obtained by always responding “no match”: 83%). During the POSITION task (Fig. 2 B), the picture was accompanied by the correct name of the place, followed by an arrow and the name of another, target location. The subject was to indicate, again by a button press, the cardinal direction in which the target location lay. This task required the subject to recall the survey location of the target relative to her current location (chance performance: 25%). A third, CONTROL (Fig. 2 C) task was also included in which subjects alternated left and right button presses to the presentation of scrambled visual scenes and words. Each task block was 60 sec long, and all tasks were self-paced. Left and right button press responses, as well as the order of tasks, were varied across subjects.
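The chance levels quoted above follow from simple arithmetic, sketched here for illustration. The function names are hypothetical, and the sketch assumes that a guessing subject either always gives the more frequent answer (APPEARANCE) or picks uniformly among the four cardinal directions (POSITION).

```python
from fractions import Fraction


def best_constant_guess_accuracy(p_match: Fraction) -> Fraction:
    """Accuracy obtained by always giving the more frequent yes/no answer."""
    return max(p_match, 1 - p_match)


def uniform_guess_accuracy(n_choices: int) -> Fraction:
    """Accuracy obtained by guessing uniformly among n_choices alternatives."""
    return Fraction(1, n_choices)


# APPEARANCE: one in six scene/name pairs matched.
appearance_chance = best_constant_guess_accuracy(Fraction(1, 6))  # 5/6, about 83%

# POSITION: four possible cardinal directions.
position_chance = uniform_guess_accuracy(4)  # 1/4, i.e. 25%
```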
fMRI data acquisition and analysis
Scanning parameters. After the acquisition of sagittal [repetition time (TR) = 500, echo time (TE) = 11, 128 × 256, 1 excitation (NEX)] and axial (TR = 600, TE = 15, 192 × 256, 2NEX) T1-weighted localizer images, gradient-echo, echoplanar fMRI was performed in 18 contiguous 5 mm axial slices (TR = 3000 msec, TE = 50 msec, 64 × 64 pixels in a 24 cm field of view) using a 1.5-Tesla GE Signa system equipped with a prototype fast-gradient system and the standard quadrature head coil. Twenty seconds of “dummy” gradient and radiofrequency pulses preceded the actual data acquisition to approach tissue steady-state magnetization. Head motion was minimized using foam padding. A total of three 6 min scans were conducted for each subject, resulting in 360 observations per voxel per subject. Off-line data processing was performed on SUN Sparc workstations using programs written in Interactive Data Language (Research Systems, Boulder, CO).
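As a consistency check, the observation count and voxel geometry can be derived from the echoplanar parameters given above. This is only an arithmetic sketch; the variable names are illustrative and not taken from the original analysis code.

```python
# Echoplanar acquisition values quoted in the text.
TR_S = 3.0          # repetition time, in seconds (3000 msec)
SCAN_MINUTES = 6    # duration of one functional scan
N_SCANS = 3         # functional scans per subject
FOV_MM = 240.0      # 24 cm field of view
MATRIX = 64         # 64 x 64 in-plane matrix
SLICE_MM = 5.0      # slice thickness
N_SLICES = 18       # contiguous axial slices

observations_per_scan = round(SCAN_MINUTES * 60 / TR_S)     # 120
observations_per_subject = observations_per_scan * N_SCANS  # 360
in_plane_mm = FOV_MM / MATRIX                               # 3.75 mm
axial_coverage_mm = N_SLICES * SLICE_MM                     # 90 mm
```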
Motion compensation. A slice-wise motion compensation method was used that removed spatially coherent signal changes via the application of a partial correlation method to each slice in time. For each axial slice at each time, a difference image between that slice at time t and that slice at time 0 (a Motion image) was correlated with an image composed of the difference between the slice at time 0 shifted to the right one voxel and that same slice shifted to the left one voxel (an x-shift image). The same operation was performed for y shifts (using y-shift images). The x-shift and y-shift images, weighted by the strength of their respective correlations with the Motion image, were subtracted from the image of the slice at time t. A conceptually similar method for motion in the Z dimension was then applied to each axial image. The rationale for this method was to subtract out signal changes that correlated with small (on the order of a voxel) translations. This was justified, because all subject motion (as judged by SPM95 image realignment) was <2 mm for all directions for all scans. We have observed informally that this motion compensation technique consistently results in voxel variances that are less than those found in datasets that are only motion-corrected.
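The in-plane step described above can be sketched roughly as follows. This is not the original IDL implementation: it assumes that the weights are ordinary least-squares coefficients from regressing the Motion image on the x-shift and y-shift images (a stand-in for the partial correlation weighting), and that shifted voxels wrap around the image edge. The function names are hypothetical, and the Z-dimension step is omitted.

```python
import numpy as np


def shift_difference(reference: np.ndarray, axis: int) -> np.ndarray:
    """Reference shifted +1 voxel minus reference shifted -1 voxel along axis.

    np.roll wraps at the image border, which is a simplification.
    """
    return np.roll(reference, 1, axis=axis) - np.roll(reference, -1, axis=axis)


def compensate_slice(slice_t: np.ndarray, slice_0: np.ndarray):
    """Remove signal change in slice_t explained by small x/y translations."""
    motion = (slice_t - slice_0).ravel()      # the "Motion image"
    design = np.column_stack([
        shift_difference(slice_0, axis=1).ravel(),  # x-shift image
        shift_difference(slice_0, axis=0).ravel(),  # y-shift image
    ])
    # Assumption: least-squares weights stand in for correlation strengths.
    weights, *_ = np.linalg.lstsq(design, motion, rcond=None)
    corrected = slice_t - (design @ weights).reshape(slice_t.shape)
    return corrected, weights
```

Applied to a synthetic slice formed by adding a scaled x-shift image to a reference slice, this sketch recovers the reference and returns the scale factor as the x weight.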
Statistical analysis. To permit the analysis of individual differences, these studies were designed to provide significant results in the single subject. To make single-subject analyses meaningful, sufficient specificity is necessary to hold type I error to the desired (α = 0.05) level, and sufficient sensitivity is required to reject the null hypothesis. To address the former requirement, we used an analysis method that accounts for intrinsic temporal autocorrelation present in fMRI data (Worsley and Friston, 1995), corrects for the multiple comparisons conducted over the brain volume (Worsley, 1994), and has been demonstrated empirically to hold the map-wise false-positive rate to a 0.05 level for null-hypothesis data (Aguirre et al., 1997). This analytical method also optimizes sensitivity by using an empirically derived estimate (Zarahn et al., 1997) of the hemodynamic response of the fMRI system to model the expected signal response. Finally, each subject was studied repetitively, providing a high level of power. As a result, each subject studied can be considered an individual experiment, with additional subjects constituting replications.
The raw data for each subject were spatially smoothed by convolution with a small (3 voxel full-width half-maximum) Gaussian kernel. Spatial smoothing can be expected to increase sensitivity when the underlying activation is equal to or greater than the width of the smoothing kernel. Additionally, spatial smoothing will not change the position of local maxima, except in cases in which two local maxima are separated by a distance less than the width of the kernel (Worsley et al., 1996). Voxel-wise analysis was performed using the general linear model for autocorrelated observations (Worsley and Friston, 1995). Included within the model was an estimate of intrinsic temporal autocorrelation (Zarahn et al., 1997), a global signal covariate, and sine and cosine regressors for frequencies below that of the task. These components have been demonstrated empirically to hold the map-wise false-positive rate at or below tabular values (Aguirre et al., 1997). Temporal data were smoothed with an empirically derived (Zarahn et al., 1997) estimate of the hemodynamic response of the fMRI system.
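Two pieces of the model above lend themselves to a short sketch: converting the smoothing kernel's full-width half-maximum to a Gaussian standard deviation, and building sine and cosine nuisance regressors at frequencies below that of the task. The function names are hypothetical. The choice of harmonics of the scan length is an assumption, as is the 180 sec task period in the usage lines, because the text does not state the length of one task cycle.

```python
import numpy as np


def fwhm_to_sigma(fwhm: float) -> float:
    """Standard deviation of a Gaussian with the given full-width half-maximum."""
    return fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))


def low_frequency_regressors(n_obs: int, tr_s: float, task_period_s: float) -> np.ndarray:
    """Sine/cosine pairs at harmonics of the scan length below the task frequency."""
    t = np.arange(n_obs) * tr_s
    scan_length_s = n_obs * tr_s
    columns = []
    k = 1
    # k / scan_length < 1 / task_period, written without division.
    while k * task_period_s < scan_length_s:
        phase = 2.0 * np.pi * k * t / scan_length_s
        columns.append(np.sin(phase))
        columns.append(np.cos(phase))
        k += 1
    return np.column_stack(columns) if columns else np.empty((n_obs, 0))


sigma_voxels = fwhm_to_sigma(3.0)  # kernel of 3 voxels FWHM
# Assumed task period of 180 sec (three 60 sec blocks per cycle).
nuisance = low_frequency_regressors(n_obs=120, tr_s=3.0, task_period_s=180.0)
```

In a full analysis these columns would sit in the design matrix alongside the task regressors and the global signal covariate.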
If global signal changes within a dataset are correlated with the task variables, the use of a global signal covariate can be expected to spuriously increase the incidence of voxels negatively correlated with the task (Aguirre et al., 1997). The global signals from all datasets were concatenated and analyzed with the modified general linear model as described above, except that the model lacked a global signal covariate. This preliminary analysis revealed that the global signal was correlated with the APPEARANCE–CONTROL and POSITION–CONTROL comparisons [t(486 eff df) = 3.6 and 3.7, p < 0.0005], but not with the POSITION–APPEARANCE comparison [t(486 eff df) = −0.3, p = 0.38]. Inclusion of the global signal covariate thus would be expected to increase the incidence of negatively correlated voxels for the comparisons versus CONTROL, but not for the direct comparison of the two experimental tasks. Because we believe that the inclusion of a global signal covariate improves inferential power without adversely affecting sensitivity, global signal covariates were included in the models used for analysis, and only one-tailed (positively correlated) tests were considered for the comparisons versus CONTROL. A two-tailed test was used for the POSITION–APPEARANCE comparison.
Given values for effective degrees of freedom, smoothness, search volume, desired minimum cluster volume (14 voxels or 1 cm³), and desired α value (0.025 for the two-tailed hypotheses tested), we calculated a critical t value (Worsley, 1994) of ∼3.6 for each map. The use of a cluster result (Friston et al., 1994) requires the assumption of a Gaussian field and seemed justified because of the large number of effective degrees of freedom (170) of each study. Each subject’s statistical map was thresholded at the critical t and cluster levels.
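The correspondence between 14 voxels and roughly 1 cm³, and the per-tail α of 0.025, can be verified from the acquisition geometry. The sketch assumes that voxels keep their acquired dimensions (24 cm field of view, 64 × 64 matrix, 5 mm slices); the variable names are illustrative.

```python
FOV_MM = 240.0            # 24 cm field of view
MATRIX = 64               # 64 x 64 in-plane matrix
SLICE_MM = 5.0            # slice thickness
MIN_CLUSTER_VOXELS = 14   # minimum cluster size from the text
ALPHA_MAPWISE = 0.05      # desired map-wise alpha

voxel_mm3 = (FOV_MM / MATRIX) ** 2 * SLICE_MM               # 70.3125 mm^3
min_cluster_cm3 = MIN_CLUSTER_VOXELS * voxel_mm3 / 1000.0   # about 0.98 cm^3
alpha_per_tail = ALPHA_MAPWISE / 2                          # 0.025
```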
These thresholded maps were then transformed to standardized Talairach space (Talairach and Tournoux, 1988) by 12 parameter affine transformation (part of the SPM95 package, Friston et al., 1995) guided by the T1 localizers. Local maxima within the excursion set clusters were identified and their coordinates noted. These thresholded, transformed maps were also combined to reveal sites of replication of significant activity across individual subject studies. The transformed T1 localizers from each subject were averaged and used to present the group analysis.
To identify locations of clustered local maxima across subjects, the coordinates of local maxima for all subjects were entered into a k-means analysis (Hartigan, 1975) provided as a component of the Interactive Data Language package. Clustering was examined for increasing numbers of target clusters until all clusters were composed of maxima that were no farther than 15 mm from the cluster center. Only clusters composed of local maxima originating from two or more subjects were retained.
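The clustering procedure above can be sketched as follows. This is a plain Lloyd-style k-means standing in for the Interactive Data Language routine, whose initialization and stopping rules are not described in the text; the farthest-point initialization, the function names, and the coordinates in the usage example are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_labels == labels and new_centers == centers:
            break
        labels, centers = new_labels, new_centers
    return labels, centers


def cluster_maxima(maxima, max_radius_mm=15.0, min_subjects=2):
    """Increase k until every maximum lies within max_radius_mm of its center,
    then keep only the clusters that contain maxima from min_subjects or more."""
    points = [xyz for _, xyz in maxima]
    for k in range(1, len(points) + 1):
        labels, centers = kmeans(points, k)
        if all(math.dist(p, centers[lab]) <= max_radius_mm for p, lab in zip(points, labels)):
            break
    subjects = defaultdict(set)
    for (subject, _), lab in zip(maxima, labels):
        subjects[lab].add(subject)
    return [centers[lab] for lab in sorted(subjects) if len(subjects[lab]) >= min_subjects]


# Hypothetical (subject, Talairach xyz in mm) maxima, for illustration only.
example_maxima = [
    ("s1", (-30, -60, 50)), ("s2", (-28, -62, 52)),
    ("s1", (30, -50, -10)), ("s3", (32, -48, -12)),
    ("s4", (0, 40, 20)),
]
replicated_centers = cluster_maxima(example_maxima)
```

In this example the isolated maximum from the fourth subject forms its own cluster and is dropped, leaving two centers that are each supported by two subjects.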
RESULTS
Four right-handed subjects (ages 18–27), three male, were studied. All subjects received between 120 and 200 min of training within the virtual environment before scanning. All subjects performed the behavioral tasks administered during scanning well above chance (APPEARANCE: 93% correct ± 2% SD, d′ = 2.63; POSITION: 74% correct ± 23% SD, d′ = 1.90). The high SD of POSITION task performance can be attributed to a single subject (G.C.) who scored 48% correct (d′ = 0.71, still above chance level). Interestingly, many of this subject’s responses were “rotated in frame,” in that the relative directions indicated from different places were consistent but were rotated 90° with respect to the true orientation. The subject’s performance would be 77% (d′ = 1.72) if the rotated responses were considered correct.
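The rescoring of “rotated in frame” responses can be illustrated with a short sketch. The response data below are hypothetical, the function names are illustrative, and the direction of the 90° rotation is left as a parameter because the text does not specify it.

```python
DIRECTIONS = ("N", "E", "S", "W")  # clockwise order


def rotate(direction: str, quarter_turns: int) -> str:
    """Rotate a cardinal direction clockwise by quarter_turns * 90 degrees."""
    return DIRECTIONS[(DIRECTIONS.index(direction) + quarter_turns) % 4]


def accuracy(responses, truths, quarter_turns=0) -> float:
    """Proportion correct after rotating every response by quarter_turns."""
    hits = sum(rotate(r, quarter_turns) == t for r, t in zip(responses, truths))
    return hits / len(truths)


# Hypothetical responses, each 90 degrees clockwise of the true direction.
truths = ["N", "E", "S", "W"]
responses = ["E", "S", "W", "N"]

raw_score = accuracy(responses, truths)                       # 0.0
rescored = accuracy(responses, truths, quarter_turns=-1)      # 1.0
```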
Direct POSITION versus APPEARANCE comparison
All four subjects demonstrated significantly greater activity in the premotor cortex (Brodmann area 6) and superior parietal lobule (area 7) during the POSITION task, compared directly with the APPEARANCE task. The parietal area of activity was on the left for two of the subjects, bilateral for two, and extended to the inferior parietal lobule for two. Three subjects had additional activity in the superior precuneus (area 7). Figure 3 (bottom rows) presents superior slices from two subjects for this comparison.
Three of the four subjects demonstrated significantly greater activity in the lingual gyrus (area 19) and/or inferior fusiform gyrus (area 37) during the APPEARANCE task, as compared with the POSITION task. Two of the four subjects had significantly greater parahippocampal signal for this comparison. The parahippocampal activity extended relatively superiorly (z ≈ +4 mm in Talairach space) in both cases. These areas of activity were bilateral. Two subjects also had right-side activity in the inferior aspect (z ≈ −4) of the middle occipital gyrus (area 19). Figure 3 (top rows) presents the inferior slices from two subjects for this comparison. One subject (J.D.) did not have activity for this comparison in inferior cortical areas.
The anatomical location of these dissociated areas of activity was consistent across subjects. Figure 4 A illustrates the points where significant activity was observed in multiple single subjects for the direct POSITION–APPEARANCE comparison. A clear dorsal/ventral dissociation is evident. The parahippocampus and fusiform/lingual gyrus bilaterally and the middle occipital gyrus on the right were consistent sites of greater activity during the APPEARANCE task. The inferior parietal lobule bilaterally, the precuneus, and the superior parietal lobule and premotor cortex on the left were consistently activated to a greater extent during the POSITION task. Table 1 presents the coordinates of local maxima that were similar across subjects for this comparison.
POSITION and APPEARANCE versus CONTROL
The APPEARANCE and POSITION tasks were also compared with a visuomotor CONTROL to determine whether the tasks shared any areas of cortical activity when compared with a common baseline. Figure 4 B illustrates the replicated locations across subjects where signal was greater during the POSITION and APPEARANCE tasks as compared with CONTROL. As can be seen, an extensive strip of posterior cortex was activated bilaterally during both of the topographical recall tasks. These areas include the medial occipital lobe, posterior cingulate cortex, middle occipital gyrus, inferior parietal lobule, precuneus, and superior parietal lobule. The premotor cortex (bilaterally) and supplementary motor area were also observed. Finally, significant bilateral parahippocampal activity was observed in either recall task versus the CONTROL for all subjects.
As Figure 4 A demonstrates, several cortical areas had significantly different activity between the two task conditions. For a number of these areas, both of the task conditions themselves were significantly different from CONTROL. These are the areas seen in Figure 4 B. For the cortical regions present in Figure 4, both A and B, we can state that although they have dissociated levels of activity for the two task conditions, they do not appear to respond selectively to one task condition or another, because both tasks increase signal in the region over CONTROL. It is interesting to note, however, that several areas present for the direct task comparison are absent for the comparison of both tasks versus CONTROL (i.e., present in Fig. 4 A but absent in Fig. 4 B). Not only was the activity within these areas significantly different between the POSITION and APPEARANCE tasks, but one of the tasks (either POSITION or APPEARANCE) failed to recruit the region as compared with the CONTROL. These areas, including the inferior fusiform gyrus and inferior parietal lobule bilaterally, thus may be considered sites of complete double dissociation. For these areas, we may state that not only is there a dissociation of activity between the two tasks, but a selectivity of response.
DISCUSSION
We hypothesized that a posterior dorsal/ventral dissociation of neural activity would be observed in response to decisions regarding the appearance of a place (landmark knowledge) versus decisions regarding the relative location of a place (survey knowledge). Our hypothesis was confirmed for three of the four subjects studied. Posterior parietal (and premotor) areas possessed greater activity during the POSITION condition, whereas the lingual and fusiform gyri demonstrated greater activity during the APPEARANCE condition. For the fourth subject (J.D.) differential activity for the direct comparison was observed only in dorsal areas.
The division of survey and landmark information into dorsal and ventral areas was hypothesized based on two observations from the literature. First, there is evidence that visual processing (Mishkin et al., 1983; Sergent et al., 1992; Ungerleider and Haxby, 1994) and long-term memory (Moscovitch et al., 1995) are divided into ventral and dorsal pathways, corresponding approximately to “what” and “where” information, respectively. The termination of these two pathways, in inferior extrastriate cortex and in superior posterior parietal cortex, corresponds to the two dissociated sites observed in this study. Second, dorsal and ventral lesions appear to produce different varieties of way-finding impairments in human patients. Topographical disorientation has typically been reported after lesions confined to one of three cortical locations: the parahippocampus (Habib and Sirigu, 1987; Maguire et al., 1996a), the medial occipital lobe (Hecaen et al., 1980; Landis et al., 1986), or the posterior parietal cortex (DeRenzi, 1982; Hublet and Demeurisse, 1992). Lesions of ventral cortical areas, typically described as medial occipital and fusiform gyrus, produce a form of disorientation described as topographical agnosia (Pallis, 1955; Whiteley and Warrington, 1978; Hecaen et al., 1980); patients so afflicted are unable to recognize salient environmental landmarks. Interestingly, some degree of prosopagnosia frequently (Landis et al., 1986), but not invariably (Tohgi et al., 1994; McCarthy et al., 1996), co-occurs with topographical agnosia. In this study, activity within the fusiform/lingual gyrus was found in three of the four subjects during recall of place appearance. The specific site of ventral activity we observed is similar to that reported for face-processing tasks, as detected by depth electrode recording and neuroimaging studies (Allison et al., 1994; Haxby et al., 1994). It is possible that face and landmark processing are subserved by adjacent, yet distinct, areas of extrastriate cortex.
Alternatively, lesions of dorsal neocortex, typically described as parietal, or parietotemporal areas, have been reported to produce a form of topographical disorientation with a more “spatial” character (DeRenzi et al., 1977; DeRenzi, 1982; Hublet and Demeurisse, 1992; Obi et al., 1992). These patients appear to have intact object (landmark) recognition but are unable to represent the spatial relationships between places, as evidenced by impaired sketch-map and direction production. Although these patients do not suffer from neglect or right–left confusion, their spatial impairments are, however, typically fairly severe and are rarely exclusive to topographical orientation. All four subjects studied here had greater activity within posterior parietal regions during judgments of environmental position.
It is certainly not the case, however, that all of the reported cases of topographical disorientation fall neatly into one or another of the categories described above. Some patients with neocortical damage are impaired within both landmark and spatial domains (Bottini et al., 1990). Although there may be specialization of dorsal and ventral areas for representation of different features of environmental information, these divisions are perhaps not absolutely specified and/or there are additional regions that subserve both functions. In the current study, comparison of the APPEARANCE and POSITION tasks to the CONTROL task revealed a large confluent area of posterior cortical activity, reaching from the parietal lobe to the medial occipital lobe and parahippocampus (Fig. 4 B). We observed a similar stretch of cortical activation in a previous study (Aguirre et al., 1996) in which subjects explored and navigated a spatially extended, virtual reality maze during fMRI scanning. Because of the complexity of the task used in our previous study, it was not possible to assign particular functions to the many areas of cortical activity observed. The present study advances our understanding of the role of the most dorsal (posterior parietal) and ventral (fusiform, lingual) of those areas. The specific function of those cortical areas that were activated to a similar degree by both the APPEARANCE and the POSITION tasks is less clear. It is not possible to determine whether this similarity of activation is the result of anatomical heterogeneity (populations of neurons responsive to only object or spatial features mixed within a given area) or functional homogeneity (neurons responsive to cognitive components common to both tasks).
The mixed nature of topographical disorientation deficits seems particularly pronounced after lesions restricted to the parahippocampus. Habib and Sirigu (1987) noted both object and spatial impairments in their four patients with well-defined parahippocampal lesions. Additionally, our reading of the literature suggests that in those cases in which the lesion is demonstrably limited to the parahippocampus, patients are primarily unable to acquire new topographic knowledge, whereas neocortical lesions impair acquisition as well as way-finding in previously familiar environments. For example, Maguire and colleagues (1996a) reported that after unilateral medial temporal lobe resections that included the parahippocampus, patients developed a demonstrable deficit in the acquisition of novel topographic information. These patients, however, denied any way-finding difficulties and were not disoriented within familiar environments. The lesion evidence thus would suggest that the parahippocampus is necessary for the acquisition of novel landmark and survey features that define an environment. Finally, it is worth noting that lesions restricted to the hippocampus proper have not been reported to produce clinical topographical disorientation (Milner et al., 1968; DeRenzi, 1982).
In our study, activity was noted bilaterally in the parahippocampal gyrus when either the APPEARANCE or the POSITION task was compared with the CONTROL task. Our previous study (Aguirre et al., 1996) revealed bilateral parahippocampal activity during both topographical learning and immediate recall, as did a recent positron emission tomography study of topographical learning that presented subjects with videotaped walks through a town (Maguire et al., 1996b). It is now possible to state that recall of recently learned landmark or survey environmental information is sufficient to activate this structure. Future studies, conducted with higher spatial resolution, may demonstrate functional divisions of the parahippocampus that have been posited based on the different cortical connections of these areas (Suzuki and Amaral, 1994). In addition, given the observation that parahippocampal lesions produce primarily a topographical learning deficit, we might predict that if the present experiment were repeated under conditions in which several weeks elapsed between training and recall, parahippocampal activity might be attenuated. For two of the subjects studied, parahippocampal activity was greater during the APPEARANCE task as compared with the POSITION task. We are unable to offer any specific explanation for this finding at this time.
The great majority of cases of topographical disorientation result from right-sided lesions. Although examination of the deficits that follow brain lesions can inform as to the identity of cortical areas necessary for a given task, a neuroimaging study can only identify tasks that are sufficient to involve a given cortical area (Sarter et al., 1996). Thus, the observation of bilateral activity in the fusiform gyrus, parahippocampus, and parietal lobe is entirely consistent with the possibility that only the right hemisphere structure is necessary for topographic orientation. In the case of the two subjects with unilateral parietal activity on the left for the direct APPEARANCE versus POSITION comparison, activity was present on the right in these areas for both the POSITION and APPEARANCE tasks relative to CONTROL. As a result, the absence of activity in the right superior parietal lobule in the direct comparison may be attributed to the similarity of the magnitude of activation across the two tasks.
In conclusion, it seems that for at least some proportion of the population, knowledge regarding a familiar environment does not exist in a unitary format. This finding is contrary to the “cognitive map” model of hippocampal function (O’Keefe and Nadel, 1978), which suggests that all flexible representations of environmental space (specifically survey representations) are confined to the medial temporal lobes. Instead, separable areas support environmental information in a divided manner consonant with that suggested by psychology studies. This model of environmental representation proposes that the appearance of landmarks, the routes between them, and their absolute (survey) position in space are all separable components (Thorndyke, 1981; Thorndyke and Hayes, 1982). It has been proposed that children develop spatial competency in a progressive manner along these three steps, as do adults placed in a novel environment (Siegel and White, 1975). Whereas landmark and survey knowledge was examined in this study, the representation of route information was not. We hypothesize that because successful route-following requires linking spatial information (directions) to recognized objects (landmarks), recall of route information will involve both dorsal and ventral areas. The general technique of pretraining in a virtual reality environment followed by topographical recall during neuroimaging can be applied to this question as well.
This work was supported by grants from the National Institutes of Health (NS01762 and AG13483) and the McDonnell-Pew Program in Cognitive Neuroscience. We thank Eric Zarahn for his incisive comments regarding this manuscript.
Correspondence should be addressed to Dr. Mark D’Esposito, Department of Neurology, Hospital of the University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19104-4283.