Abstract
Current evidence strongly supports the central involvement of the human medial temporal lobes (MTL) in storing and retrieving memories for recently experienced events. However, a critical remaining question concerns exactly how the hippocampus and surrounding cortex represent the spatiotemporal context defining an event in memory. Competing accounts suggest that this process may be accomplished by (1) an overall increase in neural similarity of the representations underlying spatial and temporal context, (2) a differentiation of competing spatiotemporal representations, or (3) a combination of the two processes, with different MTL subregions performing these two functions. To adjudicate among these proposals, we used high-resolution functional magnetic resonance imaging targeting the MTL together with a multivariate pattern similarity approach in 19 participants. While undergoing imaging, participants performed a task in which they retrieved spatial and temporal contextual representations from a recently learned experience. Results showed that successfully retrieving the spatiotemporal context defining an episode involved a decrease in pattern similarity between putative spatial and temporal contextual representations in hippocampal subfields CA2/CA3/DG, whereas the parahippocampal cortex (PHC) showed the opposite pattern. These findings could not be accounted for by differences in univariate activations for complete versus partial retrieval, nor by differences in correlations for correct versus incorrect retrieval. Together, these data suggest that the CA2/CA3/DG serves to differentiate competing contextual representations, whereas the PHC stores a comparatively integrated trace of scene-specific context, both of which likely play important roles in successful episodic memory retrieval.
- episodic memory
- high-res fMRI
- hippocampus
- multivariate pattern analysis
- parahippocampal cortex
- spatiotemporal
Introduction
Central to its role in episodic memory, the hippocampus is thought to bind elements of events together to facilitate their later recall (Diana et al., 2007; Davachi and Dobbins, 2008). Consistent with this role, several studies have suggested that the hippocampus is important for coding both spatial and temporal components of episodic details in memory (Spiers et al., 2001; Konkel et al., 2008; Konkel and Cohen, 2009; Staresina and Davachi, 2009; Ekstrom et al., 2011). Together, these data suggest that the hippocampus acts as a convergence zone for the representation and binding of spatiotemporal context as part of a larger role in episodic memory (Battaglia et al., 2011; Ekstrom and Watrous, 2014). Despite these proposals, how the hippocampus processes spatiotemporal context during episodic memory retrieval remains unresolved.
Past work suggests that the representation of spatiotemporal context in the hippocampus and surrounding cortex may be accomplished by any of three putative mechanisms. One possibility is that retrieval of this spatiotemporal representation involves an overall increase in the neural similarity of representations underlying the two different contexts (Levy, 1996; Lisman, 1999; Lee et al., 2004). A second possibility is that representing events within an episode may, in part, involve differentiating competing representations bound to a stimulus (Yassa and Stark, 2011). A third possibility is that different structures within the medial temporal lobe (MTL) perform these two functions; for example, subregions within the hippocampus may differentiate inputs, whereas the parahippocampal cortex (PHC) tends to store and represent them as more similar (LaRocque et al., 2013).
To address these competing ideas, we used a multivariate approach, multivariate pattern similarity analysis (MPSA). MPSA allows us to test hypotheses about the similarity of neural representations supporting retrieval of spatiotemporal context that would otherwise be difficult to assay with univariate methods (Kriegeskorte et al., 2006). Thus, MPSA provides insight into how putative spatial and temporal representations might differ for complete retrieval of spatiotemporal context (i.e., retrieval of both spatial and temporal context) versus partial retrieval (i.e., retrieval of either spatial or temporal context only). We also used high-resolution imaging specifically targeting the hippocampus to better image its subregions. With this method, we find that multivariate pattern similarity (MPS) is lower for complete than for partial spatiotemporal contextual retrieval, an effect restricted to the CA2/CA3/DG. In contrast, we find that MPS is higher for complete versus partial retrieval in the PHC. Together, these data are consistent with complementary roles played by the CA2/CA3/DG and the PHC in differentiating and integrating spatiotemporal context.
Materials and Methods
Participants.
We tested a total of 20 participants (10 females) who were recruited from the University of California, Davis and received monetary compensation for their time. Inclusion criteria required that participants be right-handed, fluent in English, and naive to the test stimuli. One participant was excluded for below-chance memory performance, resulting in a final group of 19 participants (nine females). All participants were screened for neurological disorders and tested and consented according to Institutional Review Board regulations.
Behavioral design: encoding.
Participants navigated a virtual environment on a laptop computer, playing the YellowCab delivery game created using the PyEPL programming library and OpenGL. In this encoding task, a participant traveled through a virtual environment that contained seven stores irregularly located around the perimeter of the virtual city (Fig. 1a). Stores were located approximately equidistant from the center of the city, where a passenger was located; participants were instructed to pick up this passenger between visits to each store. The participant's task was to search for the centrally located passenger and then deliver this passenger to each store in a predetermined sequential order. Participants repeated this same order of deliveries four times to encode the spatial locations of the stores within the city and the temporal order of stores within the delivery sequence. There was also a light fog in the city, allowing at most two stores to be visible at any given time during the encoding phase. The presence of fog, together with the uneven store spacing, encouraged participants to use a spatial strategy to encode the locations of the stores rather than the order of evenly spaced stores around the perimeter (Huttenlocher et al., 2004).
We used two different cities during encoding, counterbalanced equally across participants. During retrieval, stores in the unstudied second city served as lures for the first city and vice versa. Participants were instructed to learn the spatial layout and the order of deliveries during four different sets of deliveries; they were told that they would be tested on this information later. The order of deliveries was the same on every repetition, and, after delivering once to each store, participants were instructed that they would start the deliveries again. The temporal order of deliveries was designed to be uncorrelated with the spatial arrangement of the stores (a minimal sketch of such a check is shown below). Thus, if a participant delivered to stores A, B, C, D, E, F, G (in that order), no temporally proximal stores (e.g., A, B, C, D) would be spatially nearby in the layout. We address this issue further in Results.
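To make the decorrelation constraint concrete, the following minimal Python sketch (not the authors' code; the store coordinates and delivery sequence are hypothetical placeholders) checks that pairwise distance in the spatial layout is uncorrelated with pairwise separation in the delivery order.

```python
import numpy as np
from itertools import combinations
from scipy.stats import pearsonr

# Hypothetical store coordinates on the city perimeter and a hypothetical
# delivery sequence (store indices in visiting order).
store_xy = np.array([[0, 10], [7, 7], [10, 0], [7, -7],
                     [0, -10], [-7, -7], [-10, 0]], dtype=float)
delivery_sequence = [0, 4, 2, 6, 1, 5, 3]

spatial_dist, order_dist = [], []
for i, j in combinations(range(len(store_xy)), 2):
    spatial_dist.append(np.linalg.norm(store_xy[i] - store_xy[j]))  # distance in the layout
    order_dist.append(abs(delivery_sequence.index(i) - delivery_sequence.index(j)))  # separation in delivery order

r, p = pearsonr(spatial_dist, order_dist)
print(f"layout vs. delivery-order distance: r = {r:.2f} (want r near 0), p = {p:.3f}")
```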
Behavioral design: retrieval.
We used a mixed event-related/blocked functional magnetic resonance imaging (fMRI) design for the retrieval task. The methods used here mirror those used by Ekstrom et al. (2011); for clarity, we describe them again in detail. Participants were tested in 10 total blocks, five with spatial questions and five with temporal questions. Each block consisted of 20 trials; thus, participants completed 200 trials during the retrieval task, split evenly across spatial and temporal blocks. Within each block, a trial consisted of two questions. The trial began with a 2 s presentation of one storefront and a question asking whether the participant had seen the store before. Participants indicated whether the store was “old” (meaning they had seen it before) or “new” (meaning they had not seen the store in the encoded city) by pressing the appropriate button on a button box (1 for old and 2 for new).
After the recognition question, a screen with two storefronts was presented for 4 s. Participants were instructed to indicate which of the two stores was more proximal to the store in the previous recognition question in either spatial distance or delivery order, depending on the block condition (Fig. 1b). Participants then responded with a 1 or a 2 to indicate whether the store displayed on the left or right was more proximate. If any one of the three stores presented, in either the item or context part of the trial, was not a store from the encoded city, then participants were instructed to press a third button that indicated that this context question was not applicable. In each block, there were six of these non-applicable “lure” trials, totaling 30 lure trials over all five blocks for each context. Participants could respond only during the presentation of the stimuli. The only stimuli presented were the storefronts with no accompanying text on the screen; before scanning, participants were trained with the question they needed to answer when presented with either one or two stores.
All trials within a block belonged to one condition (spatial or temporal), and spatial and temporal blocks alternated. Whether the series of blocks started with a spatial or temporal block was counterbalanced across participants. Trial presentation within blocks was jittered using OptSeq2 (Dale, 1999) to optimize the timing of a baseline task between trials, which allows for better modeling of the hemodynamic response function (HRF). Stimuli were presented with the MATLAB Psychophysics Toolbox (Brainard, 1997). Participants performed an active baseline task as opposed to a passive one (Stark and Squire, 2001). For the baseline task, participants were presented with a series of randomly presented X's or O's, with each letter being presented for 1 s; baseline trials lasted for a total of 0–6 s, depending on the jittering dictated by OptSeq2. Participants were instructed to press one button when the X was on screen and another button when the O was on screen.
The mixture of target and lure trials consisted of 70 target trials, 14 target/lure pairings, and 16 lure/lure pairings distributed over the five blocks. These same “triads” were used for both spatial and temporal blocks, with the only difference between sets of blocks being the context question (i.e., whether participants answered based on spatial or temporal distance). Participants performed extremely well in the item recognition portion of the retrieval task, easily distinguishing target from lure trials. For all analyses, we considered only the target trials and excluded the infrequent trials in which the participant mistakenly identified a target trial as a lure.
Imaging methods.
All participants were tested in the University of California, Davis Imaging Research Center 3 tesla TIM TRIO 32-channel scanner (Siemens). During retrieval, we imaged participant brain activity using a high-resolution protocol designed to enhance signal on echo planar imaging (EPI) sequences in the hippocampus (Zeineh et al., 2003; Ekstrom et al., 2009). Sagittal T1-weighted images were acquired using an MPRAGE sequence (matrix size, 256 × 256; 208 slices; voxel size, 1 × 1 × 1 mm) to localize the hippocampal formation for alignment of high-resolution images. High-resolution T2-weighted structural images were acquired in an oblique coronal plane perpendicular to the longitudinal axis of the hippocampus using a spin-echo sequence (matrix size, 200 × 161; TR, 4200 ms; TE, 106 ms; FOV, 20 cm; 28 slices, interleaved; voxel size, 0.4 × 0.4 × 1.9 mm). Images sensitive to blood oxygen level-dependent (BOLD) contrast were acquired using a high-resolution gradient-echo EPI sequence (200 time points; matrix size, 192 × 192; TR, 3000 ms; TE, 33 ms; FOV, 20 cm; 40 slices, interleaved; voxel size, 1.5 × 1.5 × 1.9 mm) perpendicular to the longitudinal axis of the hippocampus. Coplanar matched-bandwidth high-resolution gradient-echo EPI sequences (matrix size, 192 × 192; TR, 3000 ms; TE, 33 ms; FOV, 20 cm; 40 slices, interleaved; voxel size, 1.5 × 1.5 × 1.9 mm) were also acquired for registration of functional scans to structural images (Ekstrom et al., 2009).
fMRI analysis.
Before general linear modeling, functional images were motion corrected and high-pass filtered to remove scanner drift in SPM. Pattern similarity analyses were performed on data sampled in functional space (1.5 × 1.5 × 1.9 mm). To map functional results onto higher-resolution structural images, functional volumes were linearly registered to individual-subject high-resolution T2-weighted structural images in a two-step process via a coplanar matched-bandwidth volume.
Subregions of the MTL were demarcated manually on each participant's high-resolution structural brain volume (Duvernoy, 1998; Zeineh et al., 2001; Ekstrom et al., 2009) using FSLview in individual participant space (Fig. 1e). The specific boundaries demarcated were between perirhinal cortex (PRC), entorhinal cortex (ERC), PHC, subiculum, CA1, CA2/CA3/DG, and the anterior CA fields. Because of potential signal loss in anterior parts of the MTL, we do not consider the PRC, ERC, or anterior CA fields further, as is the practice in some 3 tesla subregion segmentation protocols (Mueller et al., 2007). After subregion demarcation, each participant's high-resolution anatomical scan was registered to a template participant (p117) using a 12 degrees of freedom affine transformation to create a common coordinate space among participants. We then used diffeomorphic registration in ANTS (Avants et al., 2010) to warp each individual participant's demarcated subregions and create a group MTL template.
Univariate analyses.
Individual participants' functional images were minimally spatially smoothed to remove noise using a 3 mm smoothing kernel and entered into a general linear model (GLM) in SPM (Friston et al., 1995). Retrieval trials were modeled separately for each condition (complete retrieval, partial retrieval, correct, incorrect, spatial, temporal, lure), and trial time courses were convolved with a double-gamma HRF. This allowed us to extract the parameter estimates for each trial across conditions for each participant. Participant parameter estimates were then entered into a second-level groupwise paired t test to allow direct comparison of activations between conditions. These last two steps (binning trials across conditions and converting parameter estimates to group-level t statistics) were specific to our GLM analysis and were not involved in the MPSA. To determine the false-positive rate for the GLM-based analysis, we performed Monte Carlo simulations using ClustSim on MTL masks. We used a voxelwise p value of 0.01, which resulted in a cluster threshold of k > 16 and a corrected threshold of p < 0.05 (Forman et al., 1995).
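As a schematic illustration of the single-trial modeling step (a sketch under assumed parameters, not the published SPM pipeline), each trial can be represented as a boxcar convolved with a double-gamma HRF and fit by ordinary least squares to obtain trial-wise parameter estimates:

```python
import numpy as np
from scipy.stats import gamma

TR, n_scans = 3.0, 200
t = np.arange(0, 32, TR)
hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0        # canonical double-gamma shape (assumed parameters)
hrf /= hrf.sum()

def trial_regressor(onset_s, duration_s=6.0):
    """Boxcar for one trial convolved with the HRF, sampled at the TR."""
    box = np.zeros(n_scans)
    start, stop = int(onset_s / TR), int((onset_s + duration_s) / TR)
    box[start:stop] = 1.0
    return np.convolve(box, hrf)[:n_scans]

trial_onsets = [12.0, 45.0, 78.0]                      # hypothetical onsets (s)
X = np.column_stack([trial_regressor(o) for o in trial_onsets] + [np.ones(n_scans)])

voxel_ts = np.random.randn(n_scans)                    # stand-in for a preprocessed voxel time series
betas, *_ = np.linalg.lstsq(X, voxel_ts, rcond=None)   # one parameter estimate per trial (+ intercept)
```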
MPSA.
To assess the overall pattern of activation in the MTL rather than simply the magnitude of activation provided by the GLM, we used an MPSA approach. First, we correlated the single-trial parameter estimates derived from the GLM (modeling each trial as the 6 s event within the contextual retrieval question window) between spatial and temporal blocks, averaging over each pattern similarity (PS) value for pairs of spatiotemporal triads (PS1, PS2, etc., as shown in Fig. 1c). A single trial modeled separately by the GLM is considered “complete” or “partial” depending on whether the participant successfully recalled both spatial and temporal contexts (complete) or retrieved only one contextual feature (partial) for that stimulus triad. These individual trial parameter estimates were correlated across blocks within the complete or partial conditions. In the between-stimuli analysis (Fig. 1c), we correlated spatial and temporal trials that did not match the stimulus for the complete or partial pair to assay between-stimuli pattern similarity, thus providing a neural measure of general spatiotemporal context similarity. To avoid BOLD adaptation effects (Grill-Spector et al., 2006), which would be expected to weaken the response to the second presentation of a stimulus, we correlated every possible combination of the first presentation of stimuli within the complete or partial pair (be it spatial or temporal) with the context that was presented in the corresponding block and not vice versa. We used Pearson's r as our measure of pattern similarity (Kriegeskorte et al., 2006; Kriegeskorte and Bandettini, 2007). We correlated parameter estimates across all the voxels in a predetermined searchlight sphere that we iteratively moved throughout the MTLs. This searchlight, with a 2 mm radius, contained 31 voxels total and was iteratively passed through the MTL by moving the center voxel of the searchlight through each voxel of the entire MTL. In addition to the searchlight approach, we also compared parameter estimates across all the voxels in a particular region of interest (ROI) and computed an average pattern similarity for the entire region.
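The core computation can be summarized with the following simplified sketch (hypothetical data shapes and labels; this is not the analysis code itself): within each searchlight sphere, trial-wise betas from spatial and temporal blocks for different stimuli are correlated and then averaged separately for the complete and partial conditions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_vox = 70, 31                                  # 31 voxels per searchlight sphere, as in the text
spatial_betas = rng.standard_normal((n_trials, n_vox))    # stand-in: trial-wise betas, spatial blocks
temporal_betas = rng.standard_normal((n_trials, n_vox))   # stand-in: trial-wise betas, temporal blocks
label = rng.choice(["complete", "partial"], size=n_trials)  # stand-in retrieval outcome per triad

def pattern_similarity(a, b):
    """Pearson r between two voxel patterns."""
    return np.corrcoef(a, b)[0, 1]

ps = {"complete": [], "partial": []}
for i in range(n_trials):
    for j in range(n_trials):
        if i == j:
            continue                          # between-stimuli comparisons only
        if label[i] == label[j]:              # correlate within the complete or partial condition
            ps[label[i]].append(pattern_similarity(spatial_betas[i], temporal_betas[j]))

mps_complete, mps_partial = np.mean(ps["complete"]), np.mean(ps["partial"])
print(f"complete MPS = {mps_complete:.3f}, partial MPS = {mps_partial:.3f}")
```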
We computed MPS between different retrieval blocks for the complete–partial retrieval comparisons and removed correlations computed within blocks to avoid any potential confound caused by inflated correlations attributable to temporal autocorrelation. The results did not significantly differ from those obtained when within-block correlations were included; thus, our reported results include within-block correlations (an issue only for the correct–incorrect comparisons, because all comparisons for complete and partial retrieval were across blocks). We then used paired t tests to compare pattern similarity values between conditions of interest (e.g., correctly retrieving both spatial and temporal information vs either one alone). False-positive rates were determined by permutation testing (Kriegeskorte et al., 2006) in which the MPSA values entered into the paired tests were iteratively shuffled (2500 permutations). A cluster threshold was set at the 95th percentile (pcorrected < 0.05) of the maximum cluster size across permutations, based on a voxelwise threshold of p < 0.05, which corresponded to a cluster size ≥8 voxels.
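A hedged sketch of such a permutation scheme follows (a sign-flipping variant on paired differences; the map shape, MTL mask, and one-sided thresholding are assumptions): shuffle the condition assignment of the paired MPS values, recompute the paired t map, record the largest suprathreshold cluster on each iteration, and take the 95th percentile of that null distribution as the cluster threshold.

```python
import numpy as np
from scipy.ndimage import label
from scipy.stats import t as t_dist

rng = np.random.default_rng(1)
n_subj, shape = 19, (20, 20, 10)                       # stand-in grid covering the MTL
diff = rng.standard_normal((n_subj,) + shape)          # per-subject complete - partial MPS maps (stand-in)
mask = np.ones(shape, dtype=bool)                      # stand-in MTL mask
t_crit = t_dist.ppf(0.95, df=n_subj - 1)               # voxelwise p < 0.05 (one-sided, assumed)

def max_cluster(d):
    """Largest suprathreshold cluster size for a paired t-map over d."""
    tmap = d.mean(0) / (d.std(0, ddof=1) / np.sqrt(n_subj))
    clusters, n = label((tmap > t_crit) & mask)
    return max((np.sum(clusters == k) for k in range(1, n + 1)), default=0)

# Null distribution: randomly sign-flip each subject's paired difference map.
null = [max_cluster(diff * rng.choice([-1, 1], size=(n_subj, 1, 1, 1)))
        for _ in range(2500)]
cluster_thresh = np.percentile(null, 95)               # clusters >= this size are p_corrected < 0.05
```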
Results
Behavioral independence of spatial layout and temporal order retrieval
To address how different areas of the MTL represent spatiotemporal context during retrieval, we used a paradigm that involved the encoding and retrieval of both spatial layout and temporal order details. Participants acquired this information by freely navigating a virtual environment and encoding both details of the spatial layout and the order in which they encountered the stores. After navigation of the environment, participants retrieved information about either the spatial layout or temporal order on alternating blocks. Temporal judgments involved deciding which of two stores was closer to a third store in temporal order; spatial judgments involved deciding which of two stores was closer to a third store in the spatial layout (Fig. 1a,b). Participants performed significantly above chance for both categories of context retrieval (spatial context correct, 76.4 ± 2.5%; temporal context correct, 74.1 ± 3.3%). Importantly, spatial and temporal contextual retrieval performance did not differ significantly from each other (paired t test, t(18) = 0.9189, p = 0.3703).
An additional question important for establishing whether the two contexts were indeed retrieved independently was whether spatial retrieval facilitated temporal retrieval and vice versa. One means of determining the relative influence of one condition on another is to test whether the probabilities of two processes are statistically independent (Kreyszig, 1993; Uncapher et al., 2006), where independence requires that the conditional probabilities of the two processes equal the marginal probabilities of either process (i.e., Pspatial|temporal = Pspatial and Ptemporal|spatial = Ptemporal). Consistent with spatial and temporal order retrieval having no significant influence on each other, we found no significant difference between Pspatial|temporal and Pspatial (t(18) = 0.4017, p = 0.693) or between Ptemporal|spatial and Ptemporal (t(18) = −0.7560, p = 0.459; Table 1). These findings are consistent with other behavioral findings suggesting that spatial and temporal processing often occur independently, provided they are not strongly interdependent during encoding (Parmentier et al., 2006; van Asselen et al., 2006; Ekstrom et al., 2011; Noack et al., 2013).
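For concreteness, the independence test can be expressed as in the following sketch (the participant-level correctness vectors are simulated placeholders): compute each participant's conditional and marginal retrieval probabilities over the shared triads and compare them with a paired t test.

```python
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(2)
# Per participant, boolean arrays over the shared triads: was the spatial /
# temporal context question answered correctly? (simulated stand-ins)
spatial_correct = [rng.random(70) < 0.76 for _ in range(19)]
temporal_correct = [rng.random(70) < 0.74 for _ in range(19)]

p_spatial = [s.mean() for s in spatial_correct]                                  # P(spatial)
p_spatial_given_temporal = [s[t].mean() for s, t in zip(spatial_correct, temporal_correct)]  # P(spatial | temporal)

print(ttest_rel(p_spatial_given_temporal, p_spatial))   # independence predicts no significant difference
```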
Increased activation in the hippocampus and PHC during successful context retrieval
To address the role of the hippocampus in representing spatiotemporal context, we used high-resolution imaging targeted to the MTLs. Using a cluster-based GLM approach, we found significant activation primarily centered in the CA2/CA3/DG and PHC for correctly retrieved spatial and temporal trials versus baseline (Fig. 2; p < 0.05, corrected). An ROI analysis using all of the voxels in these areas revealed similar results (CA2/CA3/DG, t(18) = 2.97, p = 0.008; PHC, t(18) = 3.08, p = 0.006). These results showed that correctly retrieving spatial or temporal context (regardless of complete or partial retrieval) resulted in increased activation within these areas, establishing the involvement of the MTL, particularly the CA2/CA3/DG and PHC, as assessed with high-resolution imaging, in our task. These results demonstrate that our task indeed recruits the hippocampus and replicate our previous findings using whole-brain imaging (Ekstrom et al., 2011), indicating that our high-resolution sequence can detect task-related changes within the hippocampal subregions. However, when we assessed complete and partial retrieval with a GLM approach, we did not find any significant clusters for complete > partial or partial > complete in the MTL. The lack of a difference between complete and partial trials in our GLM analysis suggests that any differences for complete versus partial spatiotemporal retrieval found using MPSA were unlikely to be attributable to regional univariate changes alone.
Greater neural pattern differentiation for complete compared with partial spatiotemporal retrieval in the CA2/CA3/DG
Functionally, if the hippocampus plays a role in merging spatial and temporal context for episodic memory processing, we would expect increased correlations between spatial and temporal retrieval trials (higher pattern similarity) during complete compared with partial retrieval. However, if the hippocampus differentiates spatial and temporal context as part of a more general role in storing episodic details, we would expect lower similarity for complete compared with partial spatiotemporal contextual retrieval. To address these competing proposals, we correlated the pattern of parameter estimates between correct spatial and temporal context trials (complete retrieval trials; mean, 40.05 ± 3 trials per participant) across all voxels within our searchlight and compared these correlations with those between parameter estimates for trials on which only spatial or temporal information was correctly retrieved (partial retrieval trials; 22.9 ± 2.25 trials per participant). To ensure that differences in numbers of trials did not affect our overall findings, we compared lower-performing participants (who were nonetheless above chance overall; 29.3 average complete trials vs 30.8 partial trials) with higher-performing participants (52.1 vs 14.1 trials). Pattern similarity did not significantly differ (MPS on complete trials: high vs low performers, t(17) = −0.3551, p = 0.7269; MPS on partial trials: high vs low performers, t(17) = 0.8869, p = 0.3875). We performed pattern similarity correlations across all spatiotemporal retrieval cue combinations, excluding within-stimulus comparisons, thus evaluating the complete and partial retrieval of all triads across varying stimuli (Fig. 3a). Because of the limited number of trials on which participants retrieved neither spatial nor temporal context correctly (mean of 3.85 ± 0.71 trials per participant), we could not accurately consider cases in which no spatiotemporal context was retrieved. Nonetheless, when we performed this analysis, the no-retrieval versus complete retrieval contrast mirrored our overall results for the partial versus complete spatiotemporal retrieval contrast.
Our searchlight analysis revealed one cluster within the hippocampus proper showing higher pattern similarity for partial > complete retrieval trials, which was an 11 voxel cluster within the boundaries of the CA2/CA3/DG (Fig. 3b; t(18) = 3.07, p < 0.004). No other clusters reached significance for this contrast, and no clusters were found in the CA2/CA3/DG for the reverse contrast (complete > partial). The difference in MPS for the partial > complete cluster in the CA2/CA3/DG was unlikely to be driven by univariate activation differences alone, because post hoc analyses revealed no significant differences in overall activation magnitude in these voxel coordinates.
As an additional test of the differentiation effect we found in the CA2/CA3/DG, we repeated the same analysis, this time using voxels from the entire CA2/CA3/DG ROI. Whereas the searchlight analysis described above identifies subgroups of voxels that show large between-condition pattern similarity differences, this analysis is more conservative in that it uses all of the voxels in the ROI (Etzel et al., 2013). Again, we found that pattern similarity was higher for partial retrieval than complete retrieval trials (two-tailed t test, t(18) = −2.2, p = 0.034). We did not find significant differences between partial > complete retrieval or complete > partial retrieval in any other subregion, mirroring our searchlight approach. Together, these findings suggested that successfully representing and retrieving spatial and temporal context was supported by differentiation of the putative neural representations underlying spatial and temporal contexts, specifically within the CA2/CA3/DG.
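The ROI-level follow-up reduces to a paired comparison of participant-level mean similarity values, as in this brief sketch (the MPS values below are stand-ins, not the measured data):

```python
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(3)
# Per-participant mean MPS over all CA2/CA3/DG voxels, by condition (stand-in values).
mps_complete = rng.normal(0.00, 0.05, 19)
mps_partial = rng.normal(0.05, 0.05, 19)

t, p = ttest_rel(mps_partial, mps_complete)   # two-tailed paired t test, df = 18
print(f"t(18) = {t:.2f}, p = {p:.3f}")
```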
Greater pattern similarity in the PHC for complete retrieval of spatiotemporal context compared with partial retrieval
Unlike the hippocampus, the posterior PHC appears to show specificity for scenes compared with other stimuli (Epstein and Kanwisher, 1998; Epstein et al., 2003), which has also been demonstrated using multivariate pattern analyses (Diana et al., 2008; Park et al., 2011; LaRocque et al., 2013; Liang et al., 2013). If the PHC is more sensitive to scene-specific stimuli than the hippocampus, then we would expect scene-specific pattern similarity comparisons across spatiotemporal context to reveal PHC involvement. To address this scene-specific prediction, we restricted our previous analysis of complete and partial spatiotemporal retrieval to only trials involving the same retrieval cues. Thus, in this complete versus partial retrieval contrast, we included only comparisons between trials in which participants viewed the exact same stimulus triads during retrieval (Fig. 4a), which we expected would elicit scene specificity. Although the partial > complete retrieval contrast for these stimuli did not yield any significant clusters, the complete > partial retrieval contrast revealed a 9 voxel cluster that was completely encompassed within the PHC (Fig. 4b; t(18) = 2.82, p < 0.008). The corresponding parameter estimates for the cluster voxels were not significantly different between complete and partial retrieval trials. However, we note that we did not find a significant difference when we averaged pattern similarity across the entire volume of the posterior PHC using an ROI approach, applying the same analysis that yielded the significant searchlight result. It is likely that many voxels did not respond significantly within this area because the PHC occupies a much larger volume of gray matter than the CA2/CA3/DG (PHC, 1195 voxels; CA2/CA3/DG, 182 voxels) and perhaps involves more functional heterogeneity (Libby et al., 2012). Thus, the cluster identified by our searchlight analysis suggests that a smaller subregion within the posterior PHC responded selectively rather than the whole PHC. Overall, these findings indicate that the PHC played a role in representing scene-specific information from previous navigation such that accurately retrieved spatiotemporal context showed greater pattern similarity than partially retrieved spatiotemporal context.
Control comparison: no complete and partial retrieval effects for spatial or temporal context generally
An important possibility running counter to our interpretations is that the spatiotemporal MPS effects we observed above emerged more generally during correct and incorrect spatial or temporal retrieval. In other words, it could be the case that correct trials generally showed higher correlations with incorrect trials in the CA2/CA3/DG and that correct trials generally showed higher correlations with other correct trials in the PHC. This would in turn suggest that the effects above were not specific to spatiotemporal context but a general feature of correct and incorrect contextual retrieval. To test this possibility, we correlated correct and incorrect trials and contrasted these values with the correlations between correct and other correct trials within spatial blocks. We did not find any significant clusters using a searchlight approach, and there were no significant effects in any subregion ROIs (CA2/CA3/DG, t(18) = −0.5653, p = 0.2894; PHC, t(18) = −0.0925, p = 0.5363). Similarly, we found no significant clusters or effects within our ROIs when we performed the same comparisons for temporal blocks (CA2/CA3/DG, t(18) = −1.291, p = 0.1065; PHC, t(18) = −0.8217, p = 0.789). Finally, the reverse contrast (correct–correct contrasted with correct–incorrect trials) did not reveal any significant clusters or ROI effects for either spatial or temporal blocks (spatial: CA2/CA3/DG, t(18) = −0.5653, p = 0.7106; PHC, t(18) = 0.0925, p = 0.4637; temporal: CA2/CA3/DG, t(18) = −1.291, p = 0.8935; PHC, t(18) = −0.8217, p = 0.789). These findings suggest that our partial versus complete retrieval effects were specific to correlations between spatial and temporal retrieval blocks and were not an effect of contextual retrieval more generally.
Distinct roles of the CA2/CA3/DG and PHC in spatiotemporal representation during retrieval
Our results so far suggested that the CA2/CA3/DG supported differentiation of spatial and temporal context during successful retrieval, whereas the PHC was involved in integrating scene-specific spatiotemporal context. To address whether these two subregions indeed played distinct roles in spatiotemporal representation, we compared pattern similarity for the cluster in the CA2/CA3/DG with that for the cluster in the PHC. This yielded a significant condition × subregion interaction effect (F(1,18) = 20, p < 0.001), showing that, for complete relative to partial retrieval, pattern similarity was lower in the CA2/CA3/DG and higher in the PHC (for identical stimuli). These findings suggested that the CA2/CA3/DG and PHC played distinct roles in spatiotemporal retrieval, as measured by MPSA: the CA2/CA3/DG played a role in general contextual differentiation, whereas the PHC represented scene-specific information in a manner in which completely retrieved spatiotemporal representations were more integrated than partially retrieved ones.
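Because both factors are within subject, the condition × subregion interaction reduces to a paired t test on the difference of differences, where F(1,18) = t(18)²; a sketch with stand-in values:

```python
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(4)
# Per-participant cluster MPS values by condition and subregion (stand-ins).
ca_complete, ca_partial = rng.normal(0.00, 0.05, 19), rng.normal(0.05, 0.05, 19)
phc_complete, phc_partial = rng.normal(0.06, 0.05, 19), rng.normal(0.01, 0.05, 19)

ca_effect = ca_complete - ca_partial      # complete - partial MPS in CA2/CA3/DG
phc_effect = phc_complete - phc_partial   # complete - partial MPS in PHC
t, p = ttest_rel(phc_effect, ca_effect)   # interaction = difference of the two condition effects
print(f"interaction: F(1,18) = {t**2:.2f}, p = {p:.4f}")
```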
Discussion
Here, we used a novel approach to understanding how spatiotemporal context is represented within the hippocampus, using a paradigm that requires participants to form different representations for the spatial layout and temporal order of events during virtual navigation of a city. We aimed to assess the neural representations underlying complete contextual retrieval and how these compare with representations during partial spatiotemporal retrieval. We estimated these representations by correlating neural responses during correct trials between spatial and temporal retrieval blocks, in which participants made behavioral judgments about spatial or temporal proximity. We found no difference in representations with regard to spatial versus temporal context, which supports the domain-agnostic nature of the hippocampus found in previous human fMRI studies (Azab et al., 2014). We also found lower overall pattern similarity for complete retrieval of spatiotemporal context compared with partial retrieval in the CA2/CA3/DG. In the PHC, by contrast, overall pattern similarity was higher for complete than for partial retrieval. These findings suggest that successful retrieval of spatiotemporal context involves differentiation of putative representations underlying spatial and temporal context in the CA2/CA3/DG and a possible integration of these representations in the PHC. These findings could not be accounted for by differences in univariate activations for complete versus partial retrieval nor by differences in correlations for correct or incorrect retrieval. However, it could be suggested that our MPSA effects, which were measured during retrieval, instead reflect activity related to the encoding of a new retrieval trial, with the hippocampus differentiating information to rapidly represent this new information. If this were the case, we would expect the hippocampal cluster we found to be present in both contrasts (complete > partial and partial > complete), because this encoding activity would be ubiquitous across all trials. Although our paradigm does not allow us to rule out the presence of some encoding-related activity during retrieval, our pattern of MPSA results appears more consistent with roles in differentiation and integration of information based on how this information was originally encoded during navigation and then subsequently retrieved, rather than being attributable solely to encoding that occurred during retrieval.
In one high-resolution study of the human hippocampus, stimuli that closely resembled a previously seen target stimulus did not induce adaptation in the BOLD signal compared with seeing the target itself. This effect was restricted to the CA2/CA3/DG and suggested that this subregion plays a role in differentiating similar stimuli from other previously seen images (Bakker et al., 2008), signifying a role in pattern separation. That study did not involve memory decisions, nor did it investigate spatial and temporal retrieval specifically, and thus pattern separation could not be tied specifically to successful contextual memory retrieval. Although previous studies implicated the CA2/CA3/DG in differentiation of competing representations, studies have also suggested a role in pattern completion, and no studies to date have tied subregion-specific activity in humans to spatiotemporal representation.
Additionally, past research in rodents has investigated the differential functions of the hippocampal subfields with respect to learning and memory (Lee et al., 2004; Vazdarjanova and Guzowski, 2004; Leutgeb et al., 2007; Alvernhe et al., 2008). One conclusion from this literature is the involvement of the DG in pattern separation and of CA3 in pattern completion/separation depending on a sigmoidal input–output transfer function (Guzowski et al., 2004; Yassa and Stark, 2011). The limitations of human 3 tesla fMRI did not allow us to distinguish CA3 from DG reliably, and our results suggested the involvement of this combined region in the differentiation of contextual representations. Nonetheless, these results are consistent with animal studies that implicate the CA3/DG in a pattern separation-like function, by indicating the differentiation of putative contextual representations when these representations are accurately retrieved (Leutgeb et al., 2007; Alvernhe et al., 2008). Although the exact mechanism for why pattern similarity would be higher between a correct and an incorrect trial than between two correct trials in the CA2/CA3/DG is as yet unclear, computational models suggest that incorrect trials often involve retrieval of “noisy” traces that may be mixtures of multiple encoded stimuli (Howard et al., 2005; Sederberg et al., 2008). As such, we might expect the correlation to be lower between two uniquely and correctly differentiated contexts. Because spatiotemporal context forms the core of what many believe is a fundamental component of episodic memory (Tulving, 2002; Kraus et al., 2013), our data suggest that part of the binding process for contextual details from the same episode involves differentiation of their underlying details in the CA2/CA3/DG.
In contrast to the CA2/CA3/DG, we found somewhat of the opposite pattern for spatiotemporal retrieval in the PHC. Specifically, we found higher MPS for complete retrieval of spatiotemporal context compared with partial retrieval, but this emerged when we matched specific store triads such that they contained identical visual information (Fig. 4a). Thus, in contrast to the CA2/CA3/DG, the PHC effects showed a higher degree of scene and landmark specificity; MPS was also higher for complete than partial retrieval. Similar to our findings, LaRocque et al. (2013) found greater pattern similarity in the PHC for scenes and higher similarity in general for items from the same category; the degree of similarity was predictive of memory performance. In contrast, the hippocampus showed lower MPS for items from the same category, and furthermore, lower MPS in the hippocampus correlated with better memory performance. These findings support our proposal that the PHC and hippocampus may serve complementary roles in how scene and object information are coded in the first place. However, our results also provide an important extension to the study by LaRocque et al. because our findings relate specifically to spatiotemporal memory retrieval rather than memory for objects. Our results suggest a specific role for the CA2/CA3/DG in spatiotemporal differentiation, whereas the study by LaRocque et al. did not differentiate specific subregion function within the hippocampus.
Our results additionally provide support for a recent conceptualization of episodic memory, the memory transformation theory. Specifically, Winocur and Moscovitch (2011) propose that episodic memories formed in the hippocampus maintain their detail-rich context, whereas those in the surrounding cortex (i.e., PHC) lack this context-rich detail. Our finding of greater pattern similarity for partial versus complete retrieval between different scene stimuli in the CA2/CA3/DG is primarily consistent with this proposal. Specifically, the differentiation of contextual representations that we see in the CA2/CA3/DG can be attributed to the retrieval of detailed contextual representations. In contrast, greater pattern similarity for scene-specific complete trials in the PHC likely speaks to cortical transformation of memories. Specifically, although the PHC contains information about scenes, it lacks the original, more holistic spatiotemporal contextual details contained in the CA2/CA3/DG representation. Together, these ideas and results support a dynamic memory storage and retrieval process that involves the complementary function of multiple brain regions.
Regarding the scene specificity that we observed in the PHC, we note that numerous studies in humans suggest the involvement of the posterior parahippocampal gyrus in the representation of spatial scenes and spatial context compared with representation of objects (Epstein and Kanwisher, 1998; Epstein and Higgins, 2007; Diana et al., 2008; Park et al., 2011; LaRocque et al., 2013; Liang et al., 2013). PHC scene-related responses also code information about temporal context (Turk-Browne et al., 2012). Thus, our results advance previous findings on the PHC by suggesting that this subregion may play a role in storing scene-specific information that may be used in both spatial and temporal context memory. Consistent with the idea that the PHC may be important for coding scene-specific details of context, patients with lesions to the hippocampus and an intact parahippocampal gyrus show some preserved ability to navigate, including using their knowledge of specific routes to find goal locations (Bohbot and Corkin, 2007; Rosenbaum et al., 2007). In contrast, patients with parahippocampal lesions show broader deficits in navigation and are unable to recover even simple routes they learned recently (Bohbot et al., 1998). Our results support the idea that the PHC may store scene-specific spatiotemporal contextual information that, even in the absence of a hippocampus, could serve as a potential index for some forms of memory.
Past studies also implicated regions outside of the MTL in spatial and temporal order retrieval. Specifically, in a previous study (Ekstrom et al. 2011), we found that spatial retrieval resulted in higher univariate activation compared with temporal retrieval within the PHC, whereas temporal retrieval resulted in higher activation relative to spatial retrieval in the prefrontal cortex. Also, consistent with the importance of cortical regions to spatial and temporal representation, lesion studies implicate the prefrontal cortex and posterior parietal regions in episodic memory functions (Berryhill et al., 2007; Blumenfeld and Ranganath, 2007), with prefrontal cortex lesions more specifically linked to temporal order processing (Duarte et al., 2010) and parietal/retrosplenial lesions linked to spatial processing deficits (Takahashi et al., 1997). Recent findings from our laboratory demonstrate significant levels of functional interactions between these cortical regions and the MTL during both spatial and temporal processing, as measured by coherent oscillations using intracranial EEG recordings in humans (Watrous et al., 2013). Given the hypothesized importance of the hippocampus as a hub in spatiotemporal episodic memory (Battaglia et al., 2011; Watrous et al., 2013), one interpretation of our results reported here showing differentiation of spatiotemporal context in the CA2/CA3/DG is that they reflect additional processing of cortical input involved in storing detail-rich information needed to represent and retrieve episodic memories (Winocur and Moscovitch, 2011).
Footnotes
This work was supported by National Institute of Neurological Disorders and Stroke Grant RO1NS076856, the Sloan Foundation, and the Hellman Young Investigator Award. We thank the University of California, Davis memory group for comments on this manuscript.
- Correspondence should be addressed to Dr. Arne Ekstrom, Center For Neuroscience and Department of Psychology, University of California, Davis, Davis, CA 95616. adekstrom@ucdavis.edu