Abstract
The medial temporal lobe (MTL) is essential for episodic memory encoding, as evidenced by memory deficits in patients with MTL damage. However, previous functional neuroimaging studies have either failed to show MTL activation during encoding or did not differentiate between two MTL-related processes: novelty assessment and episodic memory encoding. Furthermore, there is evidence that the MTL can be subdivided into subcomponents serving different memory processes, but the extent of this functional subdivision remains unknown. The aim of the present functional magnetic resonance imaging (fMRI) study was to investigate the role of the MTL in episodic encoding and to determine whether this function might be restricted to anatomical subdivisions of the MTL. Thirteen healthy volunteers performed a word list learning paradigm with free recall after distraction. Functional images acquired during encoding were analyzed separately for each participant by a voxel-wise correlation (Kendall’s tau) between the time series of the T2*-signal intensity and the number of subsequently recalled words encoded during each particular scan. Of the 13 participants, 11 showed voxel clusters with statistically significant, positive correlations in the posterior part of the hippocampus. Across participants, an ANOVA on the number of voxels with significant, positive correlations within individually defined volumes of interest confirmed a statistically significant difference in activation for anterior versus posterior regions of the hippocampus. However, no differences between left and right hippocampal activation were revealed. Thus, these findings demonstrate that successful encoding into episodic memory engages neural circuits in the posterior part of the hippocampus.
- episodic memory
- declarative memory
- encoding
- memory formation
- hippocampus
- medial temporal lobe
- functional neuroimaging
- fMRI
- parametrical analysis
- Alzheimer’s disease
Episodic memory, which is accessible to conscious recollection and concerned with unique personal experiences, is dependent on the integrity of the medial temporal lobe (MTL) (Tulving, 1972; Squire and Zola-Morgan, 1991). However, it is not clear to what extent the MTL can be divided into subcomponents subserving different memory processes (Eichenbaum et al., 1994; Zola-Morgan et al., 1994). At the very least, the hippocampus is required for episodic memory, because lesions confined to the hippocampus are sufficient to induce anterograde and temporally limited retrograde amnesia (Zola-Morgan et al., 1986; Victor and Agamanolis, 1990; Rempel-Clower et al., 1996). Furthermore, as revealed by intrahippocampal electrical stimulation, the MTL is involved in both episodic memory encoding and retrieval (Halgren et al., 1985). Nevertheless, several functional brain imaging studies failed to reveal encoding-related MTL activations (Petersen et al., 1988; Frith et al., 1991; Démonet et al., 1992; Grasby et al., 1993a, 1994; Kapur et al., 1994, 1996; Raichle et al., 1994; Shallice et al., 1994; Tulving et al., 1994a; Fletcher et al., 1995; Nyberg et al., 1996a).
In contrast, studies comparing brain activity during processing of novel versus familiar items have demonstrated MTL activations, and these activations were interpreted as encoding-related, assuming that familiar items need less encoding than novel items (Tulving et al., 1994b; 1996; Grady et al., 1995; Haxby et al., 1996; Stern et al., 1996; Dolan and Fletcher, 1997; Gabrieli et al., 1997). However, from these studies it is difficult to disentangle memory encoding from novelty assessment, a process that computes novelty and familiarity for incoming information. Novelty assessment either influences encoding success or represents an early stage of memory encoding, but it does not represent the whole process of episodic memory encoding (Metcalfe, 1993; Tulving and Kroll, 1995; Tulving et al., 1996). Because the MTL is involved in episodic memory processing as well as novelty assessment (Squire and Zola-Morgan, 1991; Knight, 1996; Tulving et al., 1996), it is reasonable to assume that reported MTL activations could be based on either or both of these two processes.
Most imaging studies of memory have relied on subtraction methods, but a few studies have used parametric designs (Grasby et al., 1993b, 1994; Nyberg et al., 1996b). In parametric studies of memory, correlations between the number of items retrieved and the signal intensity of each voxel across a time series and/or across subjects were calculated to identify activated brain regions. Positron emission tomography (PET) scans showed positive correlations between MTL blood flow and the number of retrieved items (Grasby et al., 1993b; Nyberg et al., 1996b). The scans in these studies were acquired during a recognition task or across study and free recall tasks. Thus, no separate analysis of encoding-related activity was performed. However, parametric analyses require no control task and show the direct relationship between specific tasks and the brain regions that are involved. Therefore, they are also well suited to investigate encoding-related processes.
The aim of the present functional magnetic resonance imaging (fMRI) study was to identify MTL structures that are involved in episodic memory encoding, using a word list learning paradigm with free recall after distraction. The higher spatial and temporal resolution of fMRI allows the analysis of single subject data, the investigation of encoding separately from retrieval, and a topographical analysis within the MTL.
MATERIALS AND METHODS
Participants. Thirteen healthy volunteers (9 females, 4 males) participated in the study. Each gave written informed consent. The study was approved by the Ethics Committee of the Otto-von-Guericke University, Magdeburg. All were right-handed and had normal vision; German was their first language. The mean age was 25 years (range, 18–41 years).
Stimulus material. A total of 300 German nouns were selected from the CELEX Lexical Database (word frequency: mean ± SD, 47 ± 8/1 million) (Baayen et al., 1993). Words contained 3–10 letters (5.85 ± 1.53), and 50% of the words had an abstract meaning. The pool of words was partitioned pseudorandomly into 20 study lists of 15 words each, under the constraints that the word lengths and the ratio of abstract/concrete meanings were balanced between the lists and that within one list neither semantic nor phonological similarities occurred. The order of the lists and of the words within and across lists was counterbalanced across participants.
Procedure. Participants lay in a supine position in the MRI scanner with their head stabilized by an individually molded vacuum cushion. The stimuli were projected upside down onto a mirror located at the end of the scanner bore. The participants wore prism glasses so they saw the projection upright, in central vision, and without optical distortion. The experiment consisted of 20 blocks. Each block included three tasks: the encoding task, the distraction task, and the recall task (Fig. 1A). During encoding, words of each study list (15 words) were presented sequentially in upper case (white on a black background). The words subtended maximum horizontal and vertical visual angles of ∼3.0 and 0.6°, respectively. Words were presented for durations of 500 msec each, with an interstimulus interval of 2.5 sec, during which a fixation asterisk was displayed. To avoid associative processing, the participants were required to memorize each presented word separately using a rote mnemonic strategy. They were explicitly instructed to avoid elaborate strategies like making rows, sentences, stories, or pictures, or connecting the words in any other way. Furthermore, the participants were advised to avoid any speech movement during the encoding task. After the presentation of each study list, a distraction task was presented for 15 sec to prevent ongoing rehearsal. In this task, pairs of signs (! # or ! ! or # #) were displayed on the screen for 150 msec each with an interstimulus interval of 850 msec. Participants were required to give a same–different response, with equal emphasis on speed and accuracy. They responded with their right hand, pressing one button of a computer mouse when the two signs were different and the other button when the two signs were identical. The distraction task was followed by the presentation of three question marks for 45 sec, which indicated the recall task.
During recall, the participants were instructed to say aloud the previously presented study words in any order. Responses were recorded by a tape recorder for later analyses.
Image acquisition. All participants were scanned in a Bruker Biospec 30/60 system (field strength: 3.0 Tesla; Bruker Analytik GmbH, Rheinstetten, Germany) with a birdcage head-coil and an asymmetric gradient system (30 mT/m). Functional MR images were acquired continuously during the whole experiment. The functional images were collected using a FLASH gradient echo sequence with 40 phase encoding steps, repetition time (TR) = ∼375 msec, echo time (TE) = 37.65 msec, and a flip angle of 12°. The gradient rise time was 2500 μsec. The field of view of 16 cm and the in-plane matrix of 64 × 64 pixels led to a pixel size of 2.5 × 2.5 mm. Structural scans in a parasagittal plane covering the hippocampus were used to select seven 8-mm-thick contiguous coronal sections for the functional images that covered the hippocampus in a plane perpendicular to its long axis. The synchronous acquisition of this set of seven functional images lasted 15 sec. Additionally, in all experiments in-plane anatomical scans were obtained using a FLASH sequence with TE = 19 msec, TR = 300 msec, and a flip angle of 60° in the functional measurement planes.
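The timing and geometry figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis; the variable names are illustrative, and it assumes that one phase-encoding step is acquired per TR for the whole slice set and that the 2.5 sec interstimulus interval runs from word offset to the next word onset (neither is stated explicitly in the text).

```python
# Values quoted in the text; names are illustrative only.
FOV_MM = 160.0           # field of view: 16 cm
MATRIX = 64              # in-plane matrix: 64 x 64
PHASE_STEPS = 40         # phase encoding steps
TR_S = 0.375             # repetition time, approximately 375 msec
WORD_SOA_S = 0.5 + 2.5   # assumed onset-to-onset time: 500 msec word + 2.5 sec interval
WORDS_PER_LIST = 15

pixel_mm = FOV_MM / MATRIX                  # in-plane pixel size
scan_s = PHASE_STEPS * TR_S                 # duration of one set of seven images
encoding_s = WORDS_PER_LIST * WORD_SOA_S    # duration of one encoding task
scans_per_list = encoding_s / scan_s        # image sets per study list
words_per_scan = scan_s / WORD_SOA_S        # words presented during one image set

print(pixel_mm, scan_s, encoding_s, scans_per_list, words_per_scan)
# -> 2.5 15.0 45.0 3.0 5.0
```

Under these assumptions the numbers agree with the 2.5 mm pixel size and the 15 sec acquisition time reported here, and with the three images per list and five words per image used in the image analysis.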
Image analysis. The fMRI protocol yielded time series of data points at each voxel. From these time series, the data acquired during the encoding task were extracted for further analysis. This encoding-related data set consisted of 60 data points per voxel (three images per list for 20 study lists). Each list contained 15 words; hence, each data point represents the mean signal during the encoding of five words. We used the correlation between the time series of T2*-signal intensity and the number of subsequently recalled words (0 to 5 of 5) to characterize the response of each voxel. To consider the physiological delay in the hemodynamic response, the word lists assigned to each scan must be shifted by several seconds (Malonek and Grinvald, 1996). Because the exact hemodynamic response latency is unknown for the human MTL, it was impossible to select an a priori shift value. Therefore, it was necessary to make a data-derived optimization to find the maximal phase delay by analyzing all data sets without a shift as well as with shifts of one and two words. A shift by one or two words, a delay of 3 or 6 sec, respectively, would be well in line with the peak latency of the hemodynamic response as directly revealed by optical imaging in the visual cortex of monkeys (Malonek and Grinvald, 1996). It was also necessary to decide whether the shifts in each participant should be one or two words, and whether this should be different for each subject. This decision was made by counting the number of significant voxels in each entire imaging set, and the shift that led to the largest number of active voxels was chosen for further analysis. This procedure led to a shift by one word in five participants and by two words in eight participants. As shown in Figure 1B, these shifts led to an improvement of the signal-to-noise ratio but not to a general change in the pattern of results.
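One possible reading of the word-to-scan assignment and of the shift selection rule is sketched below. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the handling of words that are shifted past the last encoding scan (here they are simply dropped) is not specified in the text, and the voxel counts in the usage example are made up.

```python
WORDS_PER_SCAN = 5    # one 15-sec image set spans five 3-sec word slots
SCANS_PER_LIST = 3    # three image sets per 45-sec encoding task


def recalled_per_scan(recalled, shift_words):
    """Count subsequently recalled words per scan for one study list.

    recalled: 15 booleans in presentation order (True = later recalled).
    shift_words: assumed hemodynamic delay in units of words (0, 1, or 2).
    A word at position i is credited to the scan running shift_words slots
    after its onset. Words pushed beyond the last encoding scan are dropped,
    which is an assumption of this sketch.
    """
    counts = [0] * SCANS_PER_LIST
    for i, was_recalled in enumerate(recalled):
        scan = (i + shift_words) // WORDS_PER_SCAN
        if was_recalled and scan < SCANS_PER_LIST:
            counts[scan] += 1
    return counts


def choose_shift(n_significant_by_shift):
    """Return the shift (in words) that yields the most significant voxels."""
    return max(n_significant_by_shift, key=n_significant_by_shift.get)


# Made-up example: words 0, 4, 5, 9, and 14 of a list were later recalled.
recalled = [i in {0, 4, 5, 9, 14} for i in range(15)]
print(recalled_per_scan(recalled, 0))   # -> [2, 2, 1]
print(recalled_per_scan(recalled, 1))   # -> [1, 2, 1]

# Hypothetical whole-volume counts of significant voxels for each shift.
print(choose_shift({0: 120, 1: 150, 2: 140}))   # -> 1
```

Because the shift is chosen from the data by maximizing the number of significant voxels, it is selected per participant after all three variants have been analyzed, as described above.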
The imaging data were analyzed off-line using the software package KHOROS 2.1 (Khoral Research, Albuquerque, NM) with the extension KHORFU (Gaschler et al., 1996) and a motion correction program (Hinrichs et al., 1994). To analyze the data, the following steps were performed for each subject separately. (1) To remove trend effects of the signal intensity, the best fitting polynomial of first, second, or third order was subtracted from the time series of each voxel. (2) A motion correction algorithm was applied to all scans (Hinrichs et al., 1994). The mean distance for corrective shifts was 0.2 mm in the x-, y-, and z-axes. (3) Afterward, the functional images were carefully inspected visually for further artifacts using a sequential animation tool. This inspection led to the removal of three single images in the data of three different participants. (4) By using the Kendall’s tau procedure (Kendall, 1970), a correlation matrix between the time series of each voxel and the subsequent recall performance was calculated. (5) Each voxel in the functional map whose correlation exceeded the 95% level of significance (p < 0.05) was coded. (6) The resulting matrices were processed with a median filter with the spatial width of two voxels to emphasize spatially coherent patterns of activation and then overlaid on the corresponding anatomical scan. (7) The anatomical localization of significant voxels was determined using the brain atlases of Duvernoy (1991) and Jackson and Duncan (1996) as references.
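Steps (4) and (5) can be illustrated for a single voxel as follows. This is a minimal sketch and not the KHOROS/KHORFU implementation: it computes the tie-corrected tau-b variant by direct pair counting (recall counts of 0 to 5 contain many ties), and it substitutes a one-sided Monte Carlo permutation test for whatever significance procedure was actually used, which the text does not specify. The function names and the example data are hypothetical.

```python
import random


def kendall_tau_b(x, y):
    """Kendall's tau-b (tie-corrected) by direct pair counting, O(n^2)."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue                 # tied in both variables: ignored
            elif dx == 0:
                ties_x += 1              # tied in x only
            elif dy == 0:
                ties_y += 1              # tied in y only
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = ((untied + ties_x) * (untied + ties_y)) ** 0.5
    return (concordant - discordant) / denom if denom else 0.0


def permutation_p(signal, recall, n_perm=500, seed=0):
    """One-sided p-value for a positive tau, estimated by shuffling recall.

    The permutation test is a stand-in chosen for this sketch.
    """
    rng = random.Random(seed)
    observed = kendall_tau_b(signal, recall)
    shuffled = list(recall)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if kendall_tau_b(signal, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Made-up detrended intensities and recall counts for eight scans.
signal = [101.2, 100.8, 101.9, 102.4, 101.1, 102.8, 103.0, 102.2]
recall = [1, 0, 2, 3, 1, 4, 5, 2]
tau, p = permutation_p(signal, recall)
print(tau, p, p < 0.05)
```

In the study, each voxel's series would contain 60 such data points, and only voxels passing the threshold with a positive correlation would enter the subsequent median filtering and VOI counts.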
Volumes of interest (VOIs). To analyze the interhemispheric and anterior versus posterior differences of significant MTL activations across subjects, four individually adjusted VOIs were defined according to anatomical landmarks within the MTL (Amaral and Insausti, 1990; Duvernoy, 1991; Jackson and Duncan, 1996). These were the left anterior MTL, right anterior MTL, left posterior MTL, and right posterior MTL. The VOIs were rectangular in shape and had the same size within each participant and similar sizes between participants. The VOIs enclosed the hippocampus and the parahippocampal gyrus, including the entorhinal, perirhinal, and parahippocampal cortices (Amaral and Insausti, 1990). The slice that included the head of the hippocampus was defined as the anterior border, and the slice with the crus fornix defined the posterior border. The division between the anterior and posterior VOIs was located halfway between these two borders. The VOI definitions were performed on the anatomical scans without overlaid functional maps. Finally, the number of median-filtered voxels with a significant, positive correlation was automatically counted within each VOI and analyzed with a repeated measure two-way ANOVA with the factors left versus right and anterior versus posterior.
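For a 2 × 2 design in which every participant contributes all four VOI counts, each main effect of the repeated-measures ANOVA has one numerator degree of freedom, and its F value equals the squared paired t statistic computed on the per-participant marginal means. The sketch below uses this equivalence; the function name is hypothetical and the voxel counts are made-up toy values for four participants, not data from the study.

```python
from statistics import mean, stdev


def main_effect_f(level_a, level_b):
    """F(1, n-1) for a two-level within-subject factor.

    level_a, level_b: per-participant scores for the two levels, each
    already averaged over the levels of the other factor.
    """
    diffs = [b - a for a, b in zip(level_a, level_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / n ** 0.5)   # paired t statistic
    return t * t, (1, n - 1)


# Toy counts per participant: (left anterior, right anterior,
#                              left posterior, right posterior).
voi_counts = [
    (1, 1, 3, 3),
    (0, 2, 5, 5),
    (2, 0, 8, 6),
    (1, 3, 6, 6),
]

anterior = [(la + ra) / 2 for la, ra, lp, rp in voi_counts]
posterior = [(lp + rp) / 2 for la, ra, lp, rp in voi_counts]
left = [(la + lp) / 2 for la, ra, lp, rp in voi_counts]
right = [(ra + rp) / 2 for la, ra, lp, rp in voi_counts]

f_ap, df_ap = main_effect_f(anterior, posterior)
f_lr, df_lr = main_effect_f(left, right)
print(f_ap, df_ap)   # anterior versus posterior main effect
print(f_lr, df_lr)   # left versus right main effect
```

With 13 participants the same computation yields 1 and 12 degrees of freedom for each main effect. The interaction term, which this sketch omits, could be tested in the same way on the per-participant difference of differences.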
RESULTS
Behavioral data
The mean rate of correctly recalled words was 35.1% (range, 24.7 to 44.3%). On average, 1.75 words per scan were subsequently recalled (SD = 1.21). The mean number of scans with zero, one, two, three, four, or five subsequently recalled words showed a wide (kurtosis, −0.70) and slightly left-shifted distribution (skewness, 0.24). As intended, this yielded a fairly balanced and widely distributed basis for the correlation between the recall rate and the T2*-signal intensity. The distraction task was performed with high accuracy (correct responses: mean = 88.4% correct; range, 80.1–94.2%) and will not be discussed further.
Imaging data
In 11 of the 13 participants, clusters of voxels with significant, positive correlations (p < 0.05) between the number of subsequently recalled words and the T2*-signal intensity during the encoding task were detected within the posterior MTL and the hippocampus (Fig. 2). These hippocampal activations were more pronounced (larger number of active voxels) on the left side of six and the right side of five participants. In addition to these hippocampal activations, clusters of voxels with significant, positive correlations were found in the cerebellar hemispheres in eight participants, in the right precentral gyrus (Brodmann’s area 4) in five, in the left precentral gyrus (Brodmann’s area 4) in two, and in the posterior part of the left superior temporal gyrus and posterior transverse temporal gyrus (Brodmann’s area 22/42) in five of the 13 participants. These extratemporal activations frequently appeared in parallel within subjects: eight participants exhibited combinations of these activations, whereas five participants exhibited none of these extratemporal activations.
VOI statistics
The mean numbers of voxels with significant, positive correlations in each VOI are given in Table 1. The two-way ANOVA (left/right × anterior/posterior) revealed a significant main effect of anterior versus posterior (F(1,12) = 7.05; p = 0.024). Therefore, as also indicated by the mean values, the encoding-related enhancement of neural activity was significantly more pronounced in the posterior part of the hippocampus, and no inter-hemispheric differences were reliable across participants. The voxels with a significant positive correlation within the anterior VOIs were in the most posterior slice and thus were interpreted as being related to the larger activations within the posterior hippocampus. Additionally, some small clusters of voxels (approximately two to three connected voxels) without consistent localization across participants were counted in the anterior VOIs. These activations do not exceed the general background level of activity; thus, they most likely reflect noise or statistical type I errors.
DISCUSSION
To identify MTL structures engaged in verbal encoding into episodic memory we performed a parametrically analyzed fMRI study. The main result was that 11 of the 13 participants showed clusters of voxels with significant, positive correlations between the number of successfully encoded words and the T2*-signal intensity in the posterior part of the hippocampus. Across participants, there were no differences between left and right hippocampal activations.
The finding that the encoding-related activity occurred in the posterior hippocampus has some parallels with the findings of Gabrieli et al. (1997), who found posterior parahippocampal activation for encoding of novel pictures, and Stern et al. (1996), who found a posterior hippocampal and parahippocampal activation also during encoding of novel pictures. As already mentioned, from these studies it is difficult to differentiate activity related to episodic encoding from that related to novelty assessment. Because we presented common words that are frequently encountered during life and just once during the experiment, our findings cannot be interpreted as a correlate of novelty assessment. Furthermore, our activations were certainly in the posterior MTL; however, in contrast to the activations in novel versus familiar paradigms, the activations revealed here were almost confined to the hippocampus. This localization of encoding-related activity is in agreement with findings in amnesic patients with hippocampal lesions (Zola-Morgan et al., 1986; Victor and Agamanolis, 1990; Rempel-Clower et al., 1996). To our knowledge, there is no lesion study in humans that compares the impact of anterior versus posterior hippocampal lesions, although one study showed that anterior hippocampal volume reductions in chronic alcoholics were not correlated with episodic memory impairments, indicating that episodic memory is not critically dependent on the anterior hippocampus (Sullivan et al., 1995). Studies conducted in rodents have also shown that the posterior part of the hippocampus exerts more control on spatial memory than does the anterior part (Moser et al., 1993; Laurent-Demir and Jaffard, 1997). Our results are in line with these findings, because they demonstrate that the posterior hippocampus is activated during episodic memory encoding in healthy humans.
Because memory deficits of patients with left-sided hippocampal lesions mostly affect memory for verbal material and right-sided lesions affect memory for material that cannot be readily verbalized (Hermann et al., 1997), one might have expected a stronger activation within the left than the right hippocampus. Such neuropsychological findings are usually obtained by tests using auditorily presented words. Therefore, our visual presentation could have led to additional visuospatial encoding processes (Helmstaedter et al., 1995). However, the large differences in memory performance between patients with bilateral and unilateral hippocampal lesions seem to indicate that episodic memory processes generally involve both MTLs (Zola-Morgan et al., 1986; Victor and Agamanolis, 1990; Rempel-Clower et al., 1996; Baxendale, 1997; Hermann et al., 1997; Oxbury et al., 1997). Our results extend this notion by demonstrating that episodic memory encoding engages both hippocampi in healthy subjects. However, it is possible that neuroimaging methods based on hemodynamic measurements do not have enough sensitivity to assess small asymmetries that are evident with more direct measures such as intrahippocampal electrical recordings (Elger et al., 1997).
In the present study, the encoding success was indexed by the subsequent free recall. The comparison of brain activity during encoding of subsequently retrieved and unretrieved items is a well established method to assess encoding-related activity as measured by event-related potentials (ERPs) (for review, see Rugg, 1995). We did not separately acquire the T2*-signal resulting from single-word processing, and therefore we did not separate brain activity for each subsequently recalled and unrecalled word. However, the summation over five words enabled us to acquire the slower hemodynamic response, and it is a valid approximation of the ERP paradigm. In ERP studies it is difficult to localize the source of encoding-related activity from scalp recordings alone. Thus, it is not possible to relate scalp-recorded ERPs to activity in the hippocampus. In contrast, our fMRI findings show that activity correlated with successful encoding is localized to the hippocampus, an essential structure for episodic memory. This localization supports the interpretation that this enhanced activity is directly related to episodic memory encoding and not to other processes such as attention, elaboration, or emotional arousal. Such processes influence subsequent retrievability, but they are not dependent on the hippocampus.
In addition to the hippocampal activations, we identified clusters of voxels with significant, positive correlations in cerebellar hemispheres, the precentral gyrus, and the sylvian fissure, which were less consistent across participants. Activations of the cerebellar hemispheres are common in various different cognitive tasks (for review, see Cabeza and Nyberg, 1997; Shulman et al., 1997). In specific studies, cerebellar activations were correlated with attentional processes (Allen et al., 1997), working memory (Desmond et al., 1996), or subvocal rehearsal (Fiez et al., 1996). Our finding of cerebellar activations might also be explained by different demands of attention. In contrast, it seems less likely that they are correlated with working memory processes, because we tested recall after distraction; thus the predominant number of recalled words was retrieved from episodic memory and not from working memory. The concurrent activation of the cerebellum, the precentral gyrus (primary motor cortex), and the sylvian fissure (auditory cortex), however, may indicate different extents of subvocal rehearsal connected with different recall probabilities (Fiez et al., 1996). It is important to note, however, that the structures beyond the MTL were not completely imaged. Therefore, these findings should be evaluated further with a study optimized to investigate these non-MTL activations.
Although contrasting semantic and perceptual encoding or word generation and reading lead to reliable differences in episodic encoding success, PET studies comparing these tasks have not revealed MTL activations (Petersen et al., 1988; Frith et al., 1991; Démonet et al., 1992; Kapur et al., 1994; Raichle et al., 1994; Fletcher et al., 1995). What are the reasons for this apparent paradox in comparison to our findings? In addition to the general problem of the subtraction approach, which assumes only additive relations between cognitive processes (Friston et al., 1996), there are further possibilities. Control tasks might elicit encoding processes, and this may lead to hemodynamic changes that are not detectably different from those of the encoding condition. The findings by Martin and colleagues (1997) support this interpretation. They revealed an MTL activation only in comparisons between processing of specific stimuli (words, nonwords, objects, and nonsense objects) and a baseline without specific stimulus information (visual noise). Furthermore, the failure of MTL activations could also be explained by insufficient signal-to-noise ratios attributable to partial volume effects that occur when the imaging planes are not aligned with the hippocampus. Another possibility for failure of previous studies to find MTL activations related to encoding arises from the fact that in most PET studies the data are collected across subjects and averaged into a common stereotactic space. This procedure might lead to an interindividual mismatch for the MTL and its subregions. Furthermore, the subtraction approach does not consider each subject’s memory performance; hence, variability of performance may reduce the power to detect changes in hemodynamic responses.
The present study used methods that eliminated or mitigated the impact of the foregoing problems. Together with the parametric analysis, this may have permitted the activations of the posterior hippocampus to be detected.
The hippocampus receives its input mainly from cells located in layers II and III of the entorhinal cortex (EC), which give rise to the perforant path, which connects the EC with the dentate gyrus, the hippocampal “gateway” (for review, see Amaral and Insausti, 1990). The posterior hippocampus, the area that is activated in our study, may receive its major input from the lateral portion of the EC, because in primates the lateral EC is mainly connected with the posterior hippocampus and the medial EC with the anterior hippocampus (Witter et al., 1989). Patients with Alzheimer’s disease show profound neuronal loss in layer II of the EC (Gómez-Isla et al., 1996), and this neuronal loss precedes the hippocampal damage (Mizutani and Kasahara, 1997). These findings suggest that the EC plays a crucial role in episodic memory. However, we did not find an entorhinal activation. This result may support the hypothesis formulated by Hyman and collaborators (1984) and further developed by De Lacoste and White (1993) that neuronal loss within the EC impairs episodic memory primarily by disconnecting the hippocampus and not by damage of neuronal circuits directly engaged in episodic memory formation.
Computational models of MTL function hypothesize that the neocortical activity pattern that represents an episode and finds its way into memory is first processed by the parahippocampal cortex. The information then undergoes preliminary storage by pathways between the EC, dentate gyrus, and CA3 region of the hippocampus, including the recurrent collaterals, which enable autoassociative encoding, storage, or binding processes (Alvarez and Squire, 1994; McClelland and Goddard, 1997; Rolls, 1997). As already reported, processing of novel in comparison to familiar stimuli leads to enhanced neuronal activity in the posterior parahippocampal gyrus (Gabrieli et al., 1997) or posterior hippocampus and parahippocampal gyrus (Stern et al., 1996). These findings may represent a correlate of novelty assessment, whereas the present findings exhibit activations more purely related to episodic memory formation. This would provide the first indications that there are two distinct stages of encoding into episodic memory. This episodic memory encoding model would distinguish between an initial encoding process like novelty detection subserved by the parahippocampal cortex and a process of memory formation subserved by the hippocampus. This interpretation based on evidence derived across studies needs further confirmation, but it provides a testable model for future research.
Footnotes
H.J.H. is supported by Human Frontier Science Program Grant RG0136/1997, Deutsche Forschungsgemeinschaft/Sonderforschungsbereich (DFG/SFB) Grant 426,C5, and DFG Grant He1531/4–1; G.R.M. is supported by Human Frontier Science Program Grant RG0136/1997 and National Institutes of Mental Health Grant MH55714; H.H. is supported by DFG/SFB Grant 426,C5; and G.F. is supported by DFG Grant FE 479/1–1. We thank James B. Brewer for detailed comments on earlier versions of this article and instructive discussions about the data.
Correspondence should be addressed to Dr. Guillén Fernández, Klinik für Neurophysiologie, Otto-von-Guericke Universität, Magdeburg, Leipziger Strasse 44, 39120 Magdeburg, Germany.