Abstract
We continuously encounter and process novel events in the surrounding world, but only some episodes will leave detailed memory traces that can be recollected after weeks and months. Here, our aim was to monitor brain activity during encoding of events that eventually transform into long-term stable memories. Previous functional magnetic resonance imaging (fMRI) studies have shown that the degree of activation of different brain regions during encoding is predictive of later recollection success. However, most of these studies tested participants' memories the same day as encoding occurred, whereas several lines of research suggest that extended post-encoding processing is of crucial importance for long-term consolidation. Using fMRI, we tested whether the same encoding mechanisms are predictive of recollection success after hours as after a retention interval of several weeks. Seventy-eight participants were scanned during an associative encoding task and given a source memory test the same day or after ∼6 weeks. We found a strong link between regional activity levels during encoding and recollection success over short time intervals. However, results further showed that durable source memories, i.e., events recollected after several weeks, were not simply the events associated with the highest activity levels at encoding. Rather, strong levels of connectivity between the right hippocampus and perceptual areas, as well as with parts of the self-referential default-mode network, seemed instrumental in establishing durable source memories. Thus, we argue that an initial intensity-based encoding is necessary for short-term encoding of events, whereas additional processes involving hippocampal–cortical communication aid transformation into stable long-term memories.
Introduction
It is well established that the degree of neural activity at the moment of encoding is predictive of the formation of initial memory representations (Davachi, 2006; Diana et al., 2007). Research using “subsequent memory paradigms,” in which encoded events are analyzed based on what participants later remember, has found that strong encoding activity in perceptual, attentional, control, and memory networks and deactivation in the default-mode network (DMN) are related to recollection at test (Kim, 2011). Processing intensity during encoding thus seems to be the first crucial factor in establishing long-term episodic memories. Still, a fundamental feature of memory is that the amount of detail available about a specific episode decreases with time (Sadeh et al., 2014). Because most studies tested participants' memory only after delays of minutes or hours, we know little about how brain processes at encoding cause some episodes to be recallable weeks and months later.
Here, we compare two alternative accounts of which encoded events will become durable source memories. The first account is an extension of the intensity principle governing the initial differentiation: episodes associated with high activity levels during encoding have the highest likelihood of being recollected after extended intervals. Support for this account comes from a few studies that applied subsequent memory paradigms with prolonged delays between encoding and test (Uncapher and Rugg, 2005; Carr et al., 2010; Liu et al., 2013). An alternative account is based on the observation that post-encoding processes affect systems consolidation of memories and their integration into existing cognitive schemas (Diekelmann and Born, 2010; Stickgold and Walker, 2013). It is hypothesized that some encoding events may become tagged as salient because of their resonance with aspects relevant for the individual and therefore get enhanced during post-encoding periods and sleep (van Kesteren et al., 2010; van Dongen et al., 2012; Oudiette et al., 2013; Tambini and Davachi, 2013; Tononi and Cirelli, 2014) and that this occurs independently of encoding strength as long as it exceeds a critical threshold. Post-encoding reactivation involves hippocampal communication with encoding-related perceptual areas (Tambini et al., 2010) and the self-referential DMN (Peigneux et al., 2004; Gais et al., 2007; Rasch et al., 2007), and some findings suggest that relevance tagging during encoding also is hippocampally based (Rauchs et al., 2011). Thus, according to the second account, hippocampal–cortical connectivity is critical in establishing durable episodic memory representations.
To investigate support for the suggested encoding mechanisms, we scanned participants using functional magnetic resonance imaging (fMRI) while performing an associative encoding task and tested their episodic memories after short (hours) or long (weeks) intervals. Equal activity levels during successful encoding across groups would indicate that the subset of memories achieving a durable status are not simply those encoded with the highest intensity. In line with seminal models of memory formation (McClelland et al., 1995; Squire and Alvarez, 1995; Nadel and Moscovitch, 1997), the initiation of durable memories should, in addition, require communication between hippocampus and neocortical sites for long-term retention. We tested this by comparing hippocampal connectivity during successful source memory encoding in the two groups.
Materials and Methods
Participants.
Seventy-eight subjects were scanned using BOLD fMRI during two runs of an incidental memory encoding task. Forty of the subjects (short-delay group) were given a surprise memory test 1.5 h after seeing the last encoding task stimulus. The remaining 38 subjects (long-delay group) were given the surprise test when returning for neuropsychological testing after, on average, 46.1 d (6–129 d; median ± SD, 39 ± 25.1 d). As a result of excessive motion (>1.5 mm) during fMRI runs or insufficient source memory performance (<10% of encoded stimuli correctly recalled), four subjects were excluded from the analyses. The analyzed fMRI sample thus consisted of 39 subjects in the short-delay group (females, n = 27; age, 20.1–36.3 years; mean ± SD, 25.2 ± 3.9 years) and 35 subjects in the long-delay group (females, n = 27; age, 19.5–38.6 years; mean ± SD, 24.1 ± 4.2 years). All subjects gave written informed consent, and the study was approved by the Regional Ethical Committee of South Norway. Participants reported no history of neurological or psychiatric disorders, chronic illness, premature birth, learning disabilities, or use of medicines known to affect nervous system functioning. They were further required to be right-handed, speak Norwegian fluently, and have normal or corrected-to-normal hearing and vision. Participants were paid for their participation.
Experimental design.
The stimulus material consisted of 300 black and white line drawings depicting everyday objects and items. The experiment consisted of two encoding runs, which took place in the MRI scanner, and four test runs. Each run consisted of 50 trials. All runs started and ended with an 11 s baseline recording period in which a central fixation cross was present. The baseline period was also presented once in the middle of each run.
In the encoding runs, a trial started with a pre-recorded female voice asking (the Norwegian equivalent of) one of two questions into the subject's headphones: “Can you eat it?” or “Can you lift it?” (Fig. 1A). Each question was asked 25 times in a run, and the order in which the two questions appeared was mixed pseudorandomly. One second after question onset, a picture of an item appeared on the screen (∼10 visual degrees in diameter, depending on the item) together with a response indicator that instructed the subject which button to press to respond “Yes” (the object can be eaten/lifted) or “No” (the object cannot be eaten/lifted). Button-response mapping was counterbalanced across subjects. The subject had 2 s to produce a response before the object was replaced by a central fixation cross. The fixation cross remained on the screen throughout the following intertrial interval (ITI), which lasted for 1–7 s (exponential distribution over four discrete intervals; mean ± SD duration, 2.98 ± 2.49 s). The jittering of stimulus onsets facilitated the later disentangling of fMRI data reflecting different encoding conditions (Ollinger et al., 2001a,b; Serences, 2004). Because of the participant response-dependent nature of the subsequent memory design, condition order and frequency varied naturally from participant to participant. Nevertheless, design efficiency (i.e., ITI order) was tentatively optimized using optseq2 (http://surfer.nmr.mgh.harvard.edu/optseq/) to ensure sufficient complexity in the recorded BOLD time series for participants producing a high amount of responses in one condition. Importantly, subjects did not know during the encoding phase that they were part of a memory experiment and would be tested on the evaluated material, and they remained ignorant about this until just before the first test run.
The test trials started with the pre-recorded female voice asking the following (Question 1): “Have you seen this item before?” (Fig. 1B). Then, a picture of an item appeared on the screen, and the participant was instructed to indicate Yes (I saw the item during the encoding phase) or No (I did not see the item during the encoding phase) with a button press. In each run, 25 of the items were old (i.e., they had been presented during the encoding phase), and the remaining 25 items were new (and had not been presented during encoding). Old and new items were presented in a pseudorandom order. The object stayed on the screen for 2 s, and if the participant signaled that he or she had not seen the item before (pressed No) or did not respond, the trial ended. If the participant remembered seeing the object (pressed Yes), a new question followed (Question 2): “Can you remember what you were supposed to do with the item?”. Again, a No response ended the trial, whereas a Yes response, indicating that the participant also remembered the action associated with the item during encoding, was followed by a final control question (Question 3): “Were you supposed to eat it or lift it?”. Here, the participant got a two-alternative forced choice (2AFC) between the two encoding actions “Eat” (I imagined eating the item during the encoding phase) and “Lift” (I imagined lifting the item during the encoding phase).
For behavioral analyses, test trial responses were classified as follows: (1) source memory (Yes response to Questions 1 and 2 and correct response to Question 3); (2) item memory (correct Yes response to Question 1 and No response to Question 2 or incorrect response to Question 3); or (3) miss (incorrect No response to Question 1). However, although we included the second question to discourage guessing behavior on the recollection (source memory) test question, we cannot exclude the possibility that participants would incorrectly claim Yes at Question 2 and then take a guess at Question 3. In these cases, because of the 2AFC structure of Question 3, participants would, on average, produce a correct recollection response, just by guessing, 50% of the time. However, this also implies that, on average, participants would produce wrong recollection responses the same number of times. Thus, by calculating the number of wrong recollection responses (Yes response to Questions 1 and 2 and incorrect response to Question 3) for each participant and subtracting this number from the participant's correct recollection score, we could estimate source memory performance unbiased by guessing. Therefore, we tested whether this estimate was significantly above 0, because this would mean that participants were correct in their recollection judgments more often than they were incorrect and thus performed better than chance.
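The guessing correction described above can be illustrated with a minimal sketch. The function name and the example counts are hypothetical and not taken from the study; the only logic shown is the subtraction of wrong recollection responses from correct ones, expressed as a percentage of encoded items.

```python
def corrected_source_score(n_correct_source, n_wrong_source, n_encoded=100):
    """Guess-corrected source memory score, in percent of encoded items.

    n_correct_source: Yes to Questions 1 and 2, correct on Question 3.
    n_wrong_source:   Yes to Questions 1 and 2, incorrect on Question 3.
    Under pure guessing on the 2AFC question, both counts are expected to
    be equal, so their difference is expected to be 0.
    """
    if n_encoded <= 0:
        raise ValueError("n_encoded must be positive")
    return 100.0 * (n_correct_source - n_wrong_source) / n_encoded


# Hypothetical participant: 58 correct and 4 wrong recollection responses.
example_score = corrected_source_score(58, 4)
```

A score above 0 indicates more correct than incorrect recollection judgments, which is the criterion tested at the group level.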
MRI scanning details.
Imaging was performed with a Siemens Skyra 3T whole-body MRI unit equipped with a 24-channel Siemens head coil. The functional imaging parameters were equivalent across all fMRI runs: 43 transversally oriented slices (no gap) were measured using a BOLD-sensitive T2*-weighted EPI sequence [repetition time (TR), 2390 ms; echo time (TE), 30 ms; flip angle, 90°; voxel size, 3 × 3 × 3 mm; field of view (FOV), 224 × 224 mm; interleaved acquisition; generalized autocalibrating partially parallel acquisitions acceleration factor, 2]. At the start of each fMRI run, three dummy volumes were collected to avoid T1 saturation effects in the analyzed data. Each encoding run produced 131 volumes. Anatomical T1-weighted MPRAGE images consisting of 176 sagittally oriented slices were obtained using a turbo field echo pulse sequence (TR, 2300 ms; TE, 2.98 ms; flip angle, 8°; voxel size, 1 × 1 × 1 mm; FOV, 256 × 256 mm). Additionally, a standard double-echo gradient-echo field map sequence was acquired for distortion correction of the echo planar images. Visual stimuli were presented in the scanner environment with an NNL 32-inch LCD monitor at a resolution of 1920 × 1080 pixels (NordicNeuroLab), positioned 176 cm from the mirror attached to the coil. Participants responded using the ResponseGrip system (NordicNeuroLab). Auditory stimuli were presented to the participants' headphones through the scanner intercom.
Preprocessing of MRI data.
Cortical reconstruction and volumetric segmentation of the T1-weighted scans were performed with Freesurfer 5.3 (http://surfer.nmr.mgh.harvard.edu/fswiki). Briefly, this processing included motion correction (Reuter et al., 2010) of the T1-weighted images, removal of non-brain tissue using a hybrid watershed/surface deformation procedure (Ségonne et al., 2004), automated Talairach transformation, segmentation of the subcortical white matter and deep gray matter volumetric structures (including the hippocampus, amygdala, caudate, putamen, and ventricles; Fischl et al., 2002, 2004a), intensity normalization (Sled et al., 1998), tessellation of the gray matter/white matter boundary, automated topology correction (Fischl et al., 2001; Ségonne et al., 2007), and surface deformation after intensity gradients to optimally place the gray/white and gray/CSF borders at the location where the greatest shift in intensity defines the transition to the other tissue class (Dale et al., 1999; Fischl and Dale, 2000). Additional data processing and analysis included surface inflation (Fischl et al., 1999a), registration to a spherical atlas that used individual cortical folding patterns to match cortical geometry across subjects (Fischl et al., 1999b), and parcellation of the cerebral cortex into units based on gyral and sulcal structure (Fischl et al., 2004b; Desikan et al., 2006). Functional imaging data from the encoding task was preprocessed using the Freesurfer Functional Analysis Stream (FSFAST), version 5.1 (http://surfer.nmr.mgh.harvard.edu/fswiki/FsFast). First, all functional images were corrected for distortions caused by B0 inhomogeneities in EPI scans (FSL PRELUDE/FUGUE; http://fsl.fmrib.ox.ac.uk/fsl/fslwiki). Next, the images were motion corrected (AFNI 3dvolreg; http://afni.nimh.nih.gov), slice timing corrected to the middle of the TR of a volume, intensity normalized, and registered to the same-subject anatomical volume. 
Each 4D functional dataset was resampled to “common space” using the surface-based intersubject registration created during the cortical reconstruction steps of Freesurfer for the cortical mantle (bringing the left and right cortical hemispheres into the average space of Freesurfer; fsaverage). For the multivariate searchlight analyses, individual subjects' functional data were realigned into MNI305 2 mm volume space using linear transformations estimated with 12 degrees of freedom. Finally, 8 mm FWHM smoothing was applied to each surface (2D surface-based smoothing) and MNI305 volume (3D volume-based smoothing).
Univariate analyses.
A first-level general linear model (GLM) was set up for each encoding run, consisting of three main conditions/regressors modeled as events with onsets and durations corresponding to the study item encoding period and convolved with a two-gamma canonical hemodynamic response function (HRF). Each of the 100 encoding items was assigned to a condition based on the participant's response to the item at test. The source memory encoding condition consisted of items that were later correctly recognized with correct source memory (Yes response to test Questions 1 and 2 and correct response to Question 3). The item memory condition consisted of items that were correctly recognized but for which the participant had no source memory (Yes response to Question 1 and No response to test Question 2 or incorrect response to Question 3). The miss condition consisted of items that were not recognized during test (incorrect No response to test Question 1). In the few cases in which the participant did not produce any response within the valid response period during the test phase, the corresponding encoding item was assigned to a fourth regressor. This regressor was only included to soak up BOLD variance associated with stimulus presentation and was not included in any contrasts. In addition to the task regressors and their temporal derivatives, estimated motion correction parameters and a set of polynomials (up to second degree) were included in the GLM as nuisance regressors. The model and the data were high-pass filtered with a cutoff at 0.01 Hz, and temporal autocorrelations [AR(1)] in the residuals were corrected using a prewhitening approach.
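As an illustration of how an event regressor of this kind can be built, the following sketch convolves a 2 s boxcar with a two-gamma HRF and samples it once per volume. It is not the FSFAST implementation: the gamma shape parameters (6 and 16), the undershoot ratio (1/6), the 0.1 s time grid, and the sampling at the middle of each TR are assumptions chosen for illustration, and all function names are hypothetical. Only the TR (2.39 s), the number of volumes (131), and the 2 s event duration come from the text.

```python
import math


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def two_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, ratio=1.0 / 6.0):
    """Assumed two-gamma HRF: a positive peak minus a scaled late undershoot."""
    return gamma_pdf(t, peak_shape) - ratio * gamma_pdf(t, undershoot_shape)


def event_regressor(onsets, duration, n_vols, tr, dt=0.1):
    """Convolve event boxcars with the HRF and sample once per volume."""
    n_fine = int(round(n_vols * tr / dt))
    boxcar = [0.0] * n_fine
    for onset in onsets:
        start = max(int(round(onset / dt)), 0)
        stop = min(int(round((onset + duration) / dt)), n_fine)
        for i in range(start, stop):
            boxcar[i] = 1.0
    # 32 s of HRF on the fine grid, scaled by dt to approximate the integral.
    kernel = [two_gamma_hrf(k * dt) * dt for k in range(int(round(32.0 / dt)))]
    convolved = [0.0] * n_fine
    for i, value in enumerate(boxcar):
        if value == 0.0:
            continue
        for k, h in enumerate(kernel):
            if i + k >= n_fine:
                break
            convolved[i + k] += value * h
    # Sample at the middle of each TR (an assumption for this sketch).
    return [convolved[min(int(round((v + 0.5) * tr / dt)), n_fine - 1)]
            for v in range(n_vols)]


# One hypothetical source memory trial with stimulus onset at 10 s.
source_regressor = event_regressor([10.0], duration=2.0, n_vols=131, tr=2.39)
```

In a full model, one such regressor per condition (source memory, item memory, miss) would enter the design matrix together with temporal derivatives and nuisance terms.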
For each individual, the following contrasts of parameter estimates were calculated and brought to the group level (separately for the short- and long-delay groups): (1) source memory versus miss; (2) source memory versus item memory; and (3) item memory versus miss. Here, statistical significance was tested at each vertex on the cortical surface using GLMs and a weighted least-squares approach, treating subjects as random effects and weighting them by the inverse of their first-level noise variance (Thirion et al., 2007). The resulting statistical estimates were corrected for multiple comparisons using a vertex-wise false discovery rate (FDR) threshold of p = 0.05. Additionally, significance of the raw source memory parameter estimate (source memory vs implicit baseline) was established and corrected following the same approach. Note that the source memory condition is the only encoding condition that is directly comparable across groups because items recollected with source after long-delay intervals also are assumed recollected with source when tested after short delays (Carr et al., 2010; Liu et al., 2013). A between-group comparison (short-delay group vs long-delay group) of the source memory versus implicit baseline contrast was performed to test whether activity during encoding predicts source memory durability. Because no voxels survived FDR correction in this comparison, a cluster-based correction method was applied: statistical estimates were tested against an empirical null distribution of maximum cluster size across 10,000 iterations with a vertex-wise threshold of p < 0.05 and cluster-forming threshold of p < 0.05 (Hayasaka and Nichols, 2003; Hagler et al., 2006).
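The text does not state which FDR procedure was used for the vertex-wise correction. As a sketch under the assumption of the standard Benjamini-Hochberg step-up procedure, thresholding a list of vertex-wise p values could look as follows; the function name and the example p values are hypothetical.

```python
def bh_fdr_mask(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans marking which tests are declared significant
    at false discovery rate q. Assumed here for illustration only.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= q * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    mask = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            mask[idx] = True
    return mask


# Five hypothetical vertex-wise p values.
significant = bh_fdr_mask([0.001, 0.008, 0.039, 0.041, 0.6], q=0.05)
```

With these example values, only the two smallest p values fall below their rank-dependent thresholds (0.01 and 0.02), so only those two vertices survive.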
In addition to the surface-based cortical analyses, all tests were also performed on the average signal extracted from two regions of interest (ROIs) representing the left and right hippocampi (automatically segmented by Freesurfer at the individual level and manually corrected after visual inspection). Mixed ANOVAs [group (short/long delay) × condition (source/item/miss)] showed significant main effects of condition in both hippocampi (p < 0.025) but no main effect of group (p > 0.34) or any interactions between group and condition in hippocampal activity (p > 0.14). Thus, we focused on pairwise contrasts in the analysis of hippocampal data as on the surface.
Additional GLMs were used to test for correlations between individual parameter estimates in the source memory encoding condition and proportion of source memories at test in the two groups, and areas in which the regression slopes differed between the groups. Effects of age and sex were controlled for by covariates.
Psychophysiological interaction analyses.
The preprocessed functional data were analyzed for task-specific hippocampal connectivity using the generalized psychophysiological interaction (PPI) toolbox (McLaren et al., 2012). First, observed BOLD data (eigenvariate) for the anatomically defined left and right hippocampi was deconvolved into estimates of neural events (Gitelman et al., 2003). Next, each task time course from the first-level FSFAST design matrix, representing the three stimulus conditions of the experimental design (source memory, item memory, and miss), was multiplied separately by the deconvolved neural estimate and convolved with a canonical HRF, creating PPI terms. Finally, these three PPI regressors, together with the original convolved task regressors, the observed left/right hippocampal BOLD data, and the nuisance regressors from the first-level FSFAST model were regressed onto whole-brain (cortical surface) time series data. For each group separately, the estimated β values for the source memory PPI term were contrasted against the average of the item memory and miss PPI terms to isolate connectivity patterns during source memory encoding within groups. To compare source memory connectivity between the groups, the two source memory PPI terms were contrasted against each other (i.e., akin to the univariate between-subject analysis shown in Fig. 3). Significance after multiple comparisons was evaluated on a cluster level using similar Monte Carlo simulation routines as for the univariate analyses.
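The construction of the PPI terms can be sketched in simplified form. This is not the toolbox code: the deconvolution step (Gitelman et al., 2003) is not reproduced, so the seed's neural estimate is assumed to be given, and the function names, the toy time series, and the two-point kernel standing in for the HRF are all hypothetical.

```python
def convolve_causal(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        for k, h in enumerate(kernel):
            if i + k < len(signal):
                out[i + k] += s * h
    return out


def ppi_regressors(task_indicators, seed_neural, hrf_kernel):
    """Build one PPI term per condition.

    task_indicators: dict mapping condition name to a 0/1 time course.
    seed_neural:     assumed (already deconvolved) neural estimate of the seed.
    Each term is the product of the two at the neural level, convolved
    with the HRF kernel afterwards.
    """
    terms = {}
    for condition, indicator in task_indicators.items():
        if len(indicator) != len(seed_neural):
            raise ValueError("time courses must have equal length")
        interaction = [psy * phys for psy, phys in zip(indicator, seed_neural)]
        terms[condition] = convolve_causal(interaction, hrf_kernel)
    return terms


# Toy data: four time points, two conditions, a two-point stand-in kernel.
toy_terms = ppi_regressors(
    {"source": [1, 0, 1, 0], "miss": [0, 1, 0, 1]},
    seed_neural=[1.0, 2.0, 3.0, 4.0],
    hrf_kernel=[1.0, 0.5],
)
```

The within-group contrast described above would then weight the estimated source memory PPI β by 1 and the item memory and miss PPI βs by −0.5 each.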
Results
Subsequent memory performance
Of 100 encoded item–action associations, participants in the short-delay group remembered an average ± SD of 55.4 ± 15.1% with source memory, 18.7 ± 9.8% with item memory only, and missed 21.8 ± 12.0% of the items at test. Participants tested in the long-delay group showed significantly lower (p < 10−18) source memory performance at 19.8 ± 9.9%, retrieved 30.2 ± 11.0% with item memory, and had forgotten 47.7 ± 16.9% of the items.
d′ scores, calculated from the recognition responses to the first test question, were significantly above 0 in both groups (one-sample t tests, p < 10−18). Additional analyses revealed a significantly lower ability to separate previously seen (old) from unseen (new) items in the long-delay group (mean ± SD d′ = 1.11 ± 0.32) compared with the short-delay group (d′ = 2.7 ± 0.98; p < 10−12). However, participants' criterion C, a recommended measure of response bias (Stanislaw and Todorov, 1999), did not differ between the two groups (short delay, 0.47 ± 0.51; long delay, 0.43 ± 0.42; p = 0.7), indicating that both groups were similar in their tendency to classify an item as previously unseen when in doubt. To confirm that the two-step test procedure after recognition discouraged guessing behavior on the source memory task, we calculated individual estimates of source memory scores corrected for the number of times a participant produced a wrong response on the last 2AFC question in a test trial (under the rationale that guessing would be successful 50% of the time). Average ± SD source memory performance using the corrected scores was 52.6 ± 15.9% in the short-delay group and 15.3 ± 7.7% in the long-delay group, both strongly significantly above chance level (i.e., 0; p < 10−12). The corrected scores were highly correlated with the original scores (r = 0.98 and 0.92 in the short- and long-delay groups, respectively).
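For reference, the two signal detection measures reported above follow the standard definitions given by Stanislaw and Todorov (1999): d′ = z(H) − z(F) and c = −[z(H) + z(F)]/2, where H is the hit rate and F the false alarm rate. The sketch below implements these formulas; the function name and the example rates are hypothetical, and how rates of exactly 0 or 1 were handled in the study is not stated, so the sketch simply rejects them.

```python
from statistics import NormalDist


def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Return (d', c) from hit and false alarm rates.

    A positive c reflects a conservative bias, i.e., a tendency to
    respond "new" when in doubt.
    """
    for rate in (hit_rate, false_alarm_rate):
        if not 0.0 < rate < 1.0:
            raise ValueError("rates must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)


# Hypothetical participant: 90% hits on old items, 10% false alarms on new items.
d_prime, criterion = dprime_and_criterion(0.9, 0.1)
```

Symmetric rates such as these give an unbiased criterion (c of approximately 0), whereas the positive group means reported above indicate a mildly conservative bias in both groups.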
Univariate analyses of fMRI data
All univariate analyses were run on the complete sample and a reduced sample from which participants with unrepresentatively long or short retention periods were excluded from the long-delay group (new long-delay group size, 31; retention range, 17–87 d). The correlation between individual delay length and source memory performance was not significant in the original long-delay group sample (r = −0.21) or the new sample (r = 0.023), and all analyses produced similar results in both samples. Thus, we focus on the original sample in the remainder of the text.
First, BOLD activity was estimated in response to the 2 s encoding/stimulus presentation period separately across the three memory conditions. Parameter estimates for trials later recollected with source memory, item memory, and miss trials were contrasted against the implicit fixation baseline activity and pairwise with each other. In the short-delay group, although all three conditions produced similar general patterns of activation/deactivation, the activity was significantly stronger (p < 0.05, corrected) in several cortical areas (Fig. 2A) and in the left hippocampus (Fig. 2C) during encoding leading to source memory than during encoding leading to a miss response after 1.5 h. Furthermore, cortical areas associated with the DMN were less active during encoding leading to subsequent source memory than during miss trials. Similar patterns of effects were observed in the short-delay group in the contrast source versus item memory, albeit not significantly so in the hippocampi. Unsurprisingly, weaker patterns of effects were observed after analyses of the same contrasts in the long-delay group (Fig. 2B). Some of the items recognized without source or missed after long delays would have been classified as source memories or item memories if tested after short periods. Therefore, the item memory/miss conditions were expected to be associated with stronger average activity levels in the long-delay group.
Thus, as expected, strong levels of activation/deactivation during encoding seem to be necessary to produce an initial pool of memories that can elicit memory for source when probed after 1.5 h. Knowing that many of these initially successfully encoded events later will be forgotten and only some consolidated as more stable memories, we next tested whether the few durable memories resulting from this pool could be characterized following the same encoding intensity principle. We contrasted parameter estimates associated with source memory encoding in the short versus long retention interval groups (Fig. 3) under the rationale that mean BOLD activity should differ if the items recollected after ∼6 weeks result from particularly strong activation levels in memory networks at the time of encoding. Obviously, the source memory encoding parameter estimates from the short-delay group were based, in part, on items that would be characterized as “durable” were they to be tested after ∼6 weeks. However, of the total number of items making up the source memory regressor in the short-delay group, the durable memories only constituted a smaller proportion (on average, approximately one-third if calculated from the long-delay group; see behavioral results above). Thus, any systematic encoding intensity differences between durable and nondurable memories should be reflected as a lower source memory parameter estimate in the short-delay group relative to the long-delay group. However, no significant differences were found, indicating that durable source memory representations are not characterized by stronger brain activity during encoding beyond the initial threshold necessary for successful 1.5 h recall.
Although we included a high number of participants in our study to ensure sufficient power to detect potentially subtle effects, we took several additional steps to ensure that the observed null effect was not a false negative. First, we calculated the average effect size (difference in source memory encoding parameter estimates; long-delay group − short-delay group) across all vertices within the encoding networks observed in Figure 2A. The average effect was negative (−0.039) in the task-positive parts of the network and positive (0.035) in the task-negative parts, indicating the opposite average direction of effects of what would be predicted from the encoding intensity account. Next, we isolated vertices showing effects in line with the encoding intensity account (i.e., positive effects in task-positive parts of the network and negative effects in task-negative parts). For each vertex and the hippocampi, using the observed effect size and its SD, we calculated the sample size needed to achieve 80% detection power of a similar effect at an α level of 0.05. Median sample size required across vertices was 8827 subjects in each group (minimum, 383) for the task-positive parts of the network and 15669 (minimum, 446) in the task-negative parts. The sample sizes required in the left and right hippocampi were 866 and 4814, respectively. In other words, even in the minority of the vertices showing directions of effects in line with the encoding intensity account and in the hippocampi, the effect sizes were negligible.
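The exact power calculation used for each vertex is not specified in the text. A minimal sketch, assuming a two-sided, two-sample comparison with equal group sizes and the normal approximation n = 2[(z₁₋α/₂ + z_power)/d]², where d is the observed difference divided by its SD, is given below. The function name and example values are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(mean_diff, sd, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sample comparison.

    Uses the normal approximation; an exact t-based calculation would give
    slightly larger values for small samples.
    """
    if sd <= 0 or mean_diff == 0:
        raise ValueError("sd must be positive and mean_diff non-zero")
    effect_size = abs(mean_diff) / sd
    z = NormalDist().inv_cdf
    z_alpha = z(1.0 - alpha / 2.0)
    z_power = z(power)
    return math.ceil(2.0 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical standardized effects of 0.5 and 0.05.
n_medium = n_per_group(0.5, 1.0)
n_tiny = n_per_group(0.05, 1.0)
```

Because the required sample size scales with the inverse square of the effect size, a tenfold smaller effect requires roughly a hundredfold larger sample, which is why negligible effects translate into required samples in the thousands.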
If intensity encoding is the mechanism subserving successful retention over short intervals (hours) but not over long intervals (weeks), one would expect individual differences in encoding intensity to be predictive of memory performance in the short-delay group but not in the long-delay group. We tested this by running a GLM analysis with individual subjects' BOLD activity during source memory encoding as the dependent variable and proportion of source memory responses in the ensuing test, separated over groups, as independent variable, partialling out age and sex. A significant interaction in the BOLD–behavior relationship was observed between the groups (p < 0.05 corrected), encompassing the precuneus cortex (peak MNI coordinate, x = −6, y = −60, z = 14) and superior frontal gyrus (peak MNI coordinate, x = −8, y = 59, z = 22) in the left hemisphere (Fig. 4A). No significant interactions were observed in the hippocampal ROIs (p > 0.25, uncorrected). Descriptive plotting of the associations underlying the interactions (Fig. 4B) showed that there was a positive relationship between source memory encoding activity and later successful source memory in the short-delay group (partial correlations, controlling for age and sex: precuneus, r = 0.39, p = 0.016; superior frontal gyrus, r = 0.37, p = 0.026) but not in the long-delay group (precuneus, r = −0.19, NS; superior frontal gyrus, r = −0.25, NS). Thus, BOLD activity levels in these cortical areas during encoding relate differently to source memory performance in the two groups. Interestingly, in additional analyses investigating within-group relationships between BOLD encoding activity and source memory behavior on a whole-brain level, we found a significant positive relationship (p < 0.05, corrected) in an extensive set of cortical areas and in the hippocampi (p < 0.02) in the short-delay group but no such correspondence in the long-delay group (Fig. 4C,D; hippocampal ROIs, p > 0.13, uncorrected). 
This indicates that high levels of encoding activity are important for source memory performance after short intervals (hours) but that additional processes come into play when durable episodic representations are established.
Finally, we ran a simulation analysis on assumptions from the encoding intensity account to approximately estimate the BOLD effect sizes that, according to this account, should accompany the observed changes in source memory performance between short and long test delays. Three simple assumptions were implemented in this simulation: (1) memory traces with representational strengths above a certain threshold will be retrieved as source memories at test; (2) the representational strengths of memory traces decay with time; and (3) for a few hours after encoding, BOLD encoding intensity is a good proxy for the representational strength of memory traces.
Mimicking the experimental setup used in our study, we generated 100 trials of random data for 2 × 35 subjects (uniform sampling of values between 1 and 9; 35 subjects because this was the smallest group size in our experiment). For each of the subjects in the simulated short-delay group, we used true individual behavioral scores to separate plausible portions of the encoding data into source memory, item memory, and miss encoding. As an example, a participant in our experiment produced 18% miss, 24% item memory, and 58% source memory responses. Thus, for one simulated subject, the first (and lowest) 18 values in the sorted sample of generated intensities were classified as misses, the next 24 values were classified as item memories, and the remaining 58 (and highest) values were classified as source memory trials. This operation was repeated 10,000 times for each simulated subject, and the average intensities producing miss, item memory, and source memory responses across subjects were calculated, together with the average SDs around these values: miss, 1.84 ± 0.49; item, 3.53 ± 1.06; source, 6.71 ± 0.67.
Identical steps were followed for the subjects in the long-delay group, but the resulting mean strengths associated with each encoding condition were here additionally multiplied by a decay factor, and the resulting “decayed” data points were classified into miss, item memory, or source memory trials based on whether they exceeded the source/item memory thresholds established before decay. We iterated over several decay factor values to find one that produced results resembling those observed in the true behavioral data from the long-delay group. A decay factor of 0.58, when applied to the simulated data, gave the lowest mean deviation from the true behavioral scores in the long-delay group and produced average source memory scores of 20.0%, item memory scores of 35.2%, and miss scores of 44.8%. With this decay factor, mean encoding intensities producing miss, item memory, and source memory responses after a long delay were 2.79 ± 0.88, 5.95 ± 1.54, and 8.09 ± 0.81, respectively. From the simulated mean strengths in the two groups, we could approximate the BOLD effect size expected to be found between the two groups if durable source memories are selected based on intensity alone. The effect of main interest, source encoding strength compared across the two groups, produced a Cohen's d effect size of 1.86. Unsurprisingly, post hoc power analyses using G*Power 3 (Faul et al., 2007) demonstrated that we would reliably detect such a large effect, should it be present in the data, with the sample size used in the current experiment [achieved power > 99%, given α error probability of 0.05, sample size of 74 (39 + 35), and effect size of 1.86]. Because we do not find this effect, we take the simulation results as additional support for the argument that intensity encoding alone cannot explain how durable episodic representations are established.
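The decay step and the effect-size arithmetic can be sketched as follows. This is one possible reading of the procedure, not the authors' implementation: the function names are hypothetical, the source threshold is taken as the weakest pre-decay source trial of the example participant (18% miss, 24% item memory), the two SDs are pooled with equal weight, and power is obtained from a normal approximation rather than the noncentral t distribution used by G*Power.

```python
import math
import random
from statistics import NormalDist


def source_after_decay(p_miss, p_item, decay, n_trials=100, n_reps=2000, seed=0):
    """Return (share, mean encoding intensity) of trials that still exceed
    the pre-decay source threshold after multiplication by `decay`.

    Assumes the source share (1 - p_miss - p_item) is greater than zero.
    """
    rng = random.Random(seed)
    n_below = round(p_miss * n_trials) + round(p_item * n_trials)
    kept, kept_sum = 0, 0.0
    for _ in range(n_reps):
        trials = sorted(rng.uniform(1.0, 9.0) for _ in range(n_trials))
        threshold = trials[n_below]  # weakest pre-decay source trial
        for x in trials:
            if x * decay >= threshold:
                kept += 1
                kept_sum += x  # keep the original, encoding-time intensity
    share = kept / (n_reps * n_trials)
    return share, (kept_sum / kept if kept else float("nan"))


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d with equal-weight pooling of the two SDs (an assumption)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_b - mean_a) / pooled_sd


def approx_power(d, n_a, n_b, alpha=0.05):
    """Two-sided power for a two-group comparison, normal approximation."""
    z = NormalDist()
    ncp = d * math.sqrt(n_a * n_b / (n_a + n_b))
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


share, mean_intensity = source_after_decay(0.18, 0.24, decay=0.58)
d = cohens_d(6.71, 0.67, 8.09, 0.81)  # simulated source means ± SD from the text
power = approx_power(d, 39, 35)
print(f"source share {share:.3f}, mean intensity {mean_intensity:.2f}")
print(f"d = {d:.2f}, approximate power = {power:.4f}")
```

With the reported source-memory means and SDs, equal-weight pooling gives d ≈ 1.86, and the approximate power for groups of 39 and 35 exceeds 0.99, in line with the values stated above.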
PPIs
Having evaluated the encoding intensity account of how durable memories are established, we next investigated whether a complementary underlying mechanism could be found in levels of connectivity during source memory encoding. The generalized PPI analysis allowed us to search for task-specific changes in hippocampal–cortical functional connectivity during encoding. With the term “hippocampal–cortical,” we here mean connectivity between the Freesurfer-defined hippocampus (which is a part of the subcortical segmentation stream of Freesurfer) and the Freesurfer-defined cortical surface. We first analyzed data from the two groups separately and found similar significant (p < 0.05, corrected) patterns of stronger connectivity between the left hippocampus and the left rostral middle frontal lobe (peak MNI coordinate, x = −39, y = 32, z = 18) during source memory encoding compared with item memory and miss trials (Fig. 5). When directly comparing the connectivity of the left hippocampus during source memory encoding between the two groups, no differences were observed, indicating that the observed patterns are associated with source memory encoding in general and not with encoding of durable memories in particular. However, such a selective pattern was observed in the task-specific connectivity associated with the right hippocampus: only in the long-delay group did we find evidence for stronger communication with cortical areas during source memory encoding than during the other task conditions (Fig. 5). A direct comparison of the right hippocampal–cortical connectivity of the two groups during source memory encoding showed significantly (p < 0.05, corrected) stronger effects in the long-delay group, confirming that hippocampal communication with a set of occipital, parietal, and temporal regions is particularly high during encoding of durable source memories.
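For orientation, the general form of a generalized PPI model can be written as below. The notation is an assumption introduced here for illustration and is not taken from the analysis itself: y_v(t) is the time series at cortical vertex v, s(t) the hippocampal seed time series, x_c(t) the HRF-convolved task regressor for condition c, and p_c(t) the corresponding seed-by-condition interaction regressor. The equal weighting of item and miss trials in the contrast is likewise an assumption.

```latex
% Generalized PPI: one interaction regressor per task condition
y_v(t) = \beta_0 + \beta_{s}\, s(t)
       + \sum_{c} \beta_{c}\, x_c(t)
       + \sum_{c} \beta^{\mathrm{PPI}}_{c}\, p_c(t)
       + \varepsilon(t),
\qquad c \in \{\text{source},\ \text{item},\ \text{miss}\}

% Task-specific connectivity contrast (weights are an illustrative assumption)
\Delta_v = \beta^{\mathrm{PPI}}_{\text{source}}
         - \tfrac{1}{2}\left(\beta^{\mathrm{PPI}}_{\text{item}}
         + \beta^{\mathrm{PPI}}_{\text{miss}}\right)
```

Under this formulation, a positive contrast at a vertex indicates stronger coupling with the seed during source memory encoding than during the other conditions, over and above the main effects of the task and of the seed time series.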
Discussion
During the encoding of an event, high levels of BOLD activation/deactivation in memory encoding networks predict successful initial consolidation. Although this intensity principle seems to govern which episodes will survive into the immediate post-encoding interval and become candidates for system consolidation, the present results indicate that another, complementary, mechanism is responsible for establishing which of these memory traces will take a more permanent form and remain accessible during the following weeks and months. Here we show that high connectivity at encoding between the right hippocampus and a set of posterior perceptual neocortical regions is predictive of recollection success after a retention interval of ∼6 weeks. We suggest that this connectivity pattern reflects tagging of memory traces that will undergo additional post-encoding processing and therefore potentially reach a durable status.
The initial, intensity-governed, consolidation involves multiple brain regions associated with attentional selection, perceptual content processing, and memory operations, including the left hippocampus (Kim, 2011). The effect of high activity levels in these areas during encoding is most likely to initiate and facilitate rapid consolidation processes at the cellular level—processes that can be disrupted for a few hours after encoding by mechanisms targeting synaptic operations (e.g., protein synthesis inhibitors; Dudai, 2004). Events associated with activation levels below a certain intensity threshold during encoding tend to be forgotten or remembered without source information when tested during the early post-encoding interval. Importantly, recent studies have demonstrated that source memory for an event after a short delay is a prerequisite for that event to be recollected with source memory after delays of days and weeks (Carr et al., 2010; Liu et al., 2013). Thus, typically, only events that have undergone successful initial consolidation become candidates for post-encoding processing during which durable source memories are established.
The stabilization of source memories, or systems consolidation, is hypothesized to be the result of post-encoding memory processes such as reactivation of stored representations during periods of sleep and awake rest (Diekelmann and Born, 2010; Stickgold and Walker, 2013). In support of this theory, it has been shown that experimentally triggered replay of memory traces increases the likelihood that these traces stabilize and remain accessible over time (Rasch et al., 2007; Oudiette et al., 2013). Furthermore, recent investigations of the selectivity of consolidation in humans suggest that information tagged as relevant by the individual during or immediately after encoding, through reward (Oudiette et al., 2013), emotional salience (Hu et al., 2006; Payne et al., 2008; Nishida et al., 2009; Sterpenich et al., 2009), or instructions to remember (Saletin et al., 2011; Wilhelm et al., 2011; van Dongen et al., 2012), benefits the most from post-encoding sleep as measured by delayed memory testing. Seminal models of episodic memory formation agree that the first stages of systems consolidation are characterized by hippocampal integration of distributed cortical modules that represent the various features of an experience (McClelland et al., 1995; Squire and Alvarez, 1995; Nadel and Moscovitch, 1997), and empirical studies have found increased hippocampal connectivity with neocortical areas during encoding of subsequently remembered events (Ranganath et al., 2005), interestingly sometimes also in the absence of clear-cut univariate activation effects (Gagnepain et al., 2011).
The observed pattern of connectivity between the right hippocampus and cortex during encoding of durable source memories fits well with predictions from these models and our experimental task: increased hippocampal connectivity with visual processing streams, coarsely categorized as the occipito-temporal pathway for visual perception and the occipito-parietal action pathway (Goodale and Milner, 1992), suggests that information in task-relevant brain areas (“visualize an action with an object”) gets fused during successful long-term encoding. Our finding of hippocampal integration at encoding might well be a neurobiological consequence of the relevance tagging concept discussed above: according to this view, information in trials showing the characteristic hippocampal connectivity pattern at encoding should to a greater degree undergo post-encoding processing and replay. In line with this interpretation, it has been shown that synchronized activity between visual sensory areas and the hippocampus can persist from encoding to post-encoding rest periods and that the strength of this post-encoding connectivity predicts subsequent memory performance (Tambini et al., 2010). Furthermore, and also in support of the relevance tagging account of the observed hippocampal connectivity at encoding, we observed stronger interactions between the right hippocampus and posterior cingulate cortex during source memory encoding in the long-delay group compared with the short-delay group (Fig. 5). The posterior cingulate cortex plays a pivotal role in episodic memory processing (Vann et al., 2009) and shows, as a core of the DMN (Andrews-Hanna et al., 2010), a tight functional coupling to processes involved in evaluation of aspects relevant to oneself.
It is established that self-referencing during encoding benefits memory operations as it produces both organized and elaborate processing (Symons and Johnson, 1997), and stronger connectivity between the hippocampus and the posterior cingulate cortex could thus promote memory consolidation.
Interestingly, we observed different patterns of connectivity from the left and right hippocampi during source memory encoding. In both groups, the left hippocampus shows synchronized activity with a cluster in the left dorsolateral prefrontal cortex (DLPFC) that overlaps with the intensity-defined source memory encoding maps shown in Figure 2. Regions in the left DLPFC (inferior frontal gyrus) have been shown to play causal roles in the encoding of visual information (Sneve et al., 2013), and meta-analyses, together with our data, show that the left hippocampus is more strongly involved during source memory encoding than during other subsequent memory conditions in pictorial tasks with short retention intervals (Kim, 2011). Further in line with the meta-analysis, we did not observe robust subsequent memory effects in the activity levels of the right hippocampus. Strong activity in the right hippocampus has been commonly associated with future thinking and the successful encoding of such simulations (Addis and Schacter, 2011), constructive tasks requiring the formation of novel associations. Because right hippocampal connectivity during source memory encoding exceeds baseline levels only in the long-delay group (i.e., during encoding of episodes known to become long-lasting memories), we speculate that the encoding of durable source memories to a greater degree relies on the successful use of similar constructive strategies.
In contrast to the present results, previous investigations of the relationship between BOLD activity levels during encoding and memory performance after long retention periods [2 days (Uncapher and Rugg, 2005) and 1 week (Carr et al., 2010; Liu et al., 2013)] have found indications of durable memory encoding effects in the DLPFC and in the medial temporal lobes. These studies used so-called remember/know evaluations during memory testing, and it is debated whether such tasks measure the same aspects of episodic memories (Wixted and Squire, 2011). The long-term fate of memories formed using these different paradigms should thus be the topic of additional investigation. Nevertheless, the experimental design used in the current study was optimized to allow for the isolation of source memories by requiring participants to show familiarity with an item and confirm its encoding context through a two-step procedure after recognition. Thus, we are confident that encoding trials going into the source memory condition in our study are characterized by later episodic recollection. It would be interesting to compare how source memories are retrieved after short and long retention periods and see whether similar differences are found on a connectivity level between durable and short-lived memories. However, this would require an even more extensive scanning regimen and will be a challenge for future studies. Finally, one can debate whether a within-subject version of our design would be optimal because of higher sensitivity and the smaller risks of sampling error. We believe we have dealt with these concerns by testing a large number of participants: average sample size in the 2011 meta-analysis of 74 fMRI studies using the subsequent memory paradigm (Kim, 2011) was 16.4 participants (range, 9–30), whereas we included 74 participants in our sample.
Moreover, by letting subjects know that there will be a test associated with the encoded stimuli (a consequence of the first test in a within-subject design), one risks introducing rehearsal strategies affecting the subsequent memory measure in the long retention condition selectively.
In conclusion, our results suggest that the evolution of a transient experience into a durable episodic source memory depends on two complementary processes. An initial, intensity-based consolidation allows the memory trace to survive during the first period after encoding but will not in and of itself lead to a stable memory for longer intervals. To be consolidated on a system level and become accessible for longer time periods, the memory trace also has to be integrated through interactions between hippocampus and neocortical processing sites. In the present study, we find evidence for such integrative tagging in the shape of increased functional connectivity between the hippocampus and posterior perceptual neocortical regions during encoding of information that later becomes stable memories. Future research should pursue the effects of the observed tagging processes into the post-encoding interval and investigate the extent to which they correspond with prioritization for post-encoding processing and long-term stabilization.
Footnotes
This work was supported by the European Research Council Starting Grant Scheme (A.M.F. and K.B.W.), the Norwegian Research Council (A.M.F. and K.B.W.), and the Department of Psychology, University of Oslo.
Correspondence should be addressed to Markus Handal Sneve, Department of Psychology, P.O. 1094 Blindern, 0317 Oslo, Norway. m.h.sneve@psykologi.uio.no