Abstract
What brain regions underlie retrieval from episodic memory? The bulk of research addressing this question with fMRI has relied upon recognition memory for materials encoded within the laboratory. Another, less dominant tradition has used autobiographical methods, whereby people recall events from their lifetime, often after being cued with words or pictures. The current study addresses how the neural substrates of successful memory retrieval differ as a function of the targeted memory when the experimental parameters are held constant in the two conditions (except for instructions). Human participants studied a set of scenes and then took two types of memory test while undergoing fMRI scanning. In one condition (the picture memory test), participants reported for each scene (32 studied, 64 nonstudied) whether it was recollected from the prior study episode. In a second condition (the life memory test), participants reported for each scene (32 studied, 64 nonstudied) whether it reminded them of a specific event from their preexperimental lifetime. An examination of successful retrieval (yes responses) for recently studied scenes for the two test types revealed pronounced differences; that is, autobiographical retrieval instantiated with the life memory test preferentially activated the default mode network, whereas hits in the picture memory test preferentially engaged the parietal memory network as well as portions of the frontoparietal control network. When experimental cueing parameters are held constant, the neural underpinnings of successful memory retrieval differ when remembering life events and recently learned events.
SIGNIFICANCE STATEMENT Episodic memory is often discussed as a solitary construct. However, experimental traditions examining episodic memory use very different approaches, and these are rarely compared to one another. When the neural correlates associated with each approach have been directly contrasted, results have varied considerably and at times contradicted each other. The present experiment was designed to match the two primary approaches to studying episodic memory in an unparalleled manner. Results suggest a clear separation of systems supporting memory as it is typically tested in the laboratory and memory as assessed under autobiographical retrieval conditions. These data provide neurobiological evidence that episodic memory is not a single construct, challenging the degree to which different experimental traditions are studying the same construct.
- autobiographical memory
- default mode network
- episodic memory
- fMRI
- parietal memory network
- recognition memory
Introduction
The investigation of episodic memory, defined as memory for events or episodes in one's past (Tulving, 1972), has proceeded along two broad research paths. In the more common approach, which has been termed the laboratory tradition (McDermott et al., 2009; Roediger and McDermott, 2013), participants learn and retrieve materials within an experimental context. This approach affords considerable experimenter control, and it assumes that similar rules will govern the remembrance of any single experience, whether a word on a screen or a meal with a friend (Tulving, 1972). An alternative approach rejects this assumption, and is guided by the belief that to properly understand memory for lifetime events, one must forgo the control offered by list-learning tasks and instead ask participants to retrieve memories from their daily lives, as would be done in more “natural” settings.
Are the two methodologies tapping fundamentally similar memory systems, or is it necessary to use autobiographical methods if the goal is to understand retrieval of events from one's lifetime? Addressing this question is difficult with purely behavioral measures. A more direct way of approaching the question is to use functional neuroimaging methods, in which putative functions can be ascribed to specific regions of the brain. Relatively few studies have attempted to address the question of whether (and how) the different traditions of psychological memory research may be similarly or differentially supported in the brain, and they have reached disparate (and at times conflicting) conclusions. Some evidence suggests that regions within the medial temporal lobe and medial prefrontal cortex exhibit greater activity during autobiographical memory retrieval than during recognition memory (Conway et al., 1999; Nyberg et al., 2002; Cabeza et al., 2004; Hassabis et al., 2007; Summerfield et al., 2009; Elman et al., 2013). Evidence of regions showing the opposite pattern is less consistent, with only three of the previously cited studies locating such effects across frontal and parietal regions (Conway et al., 1999; Nyberg et al., 2002; Elman et al., 2013). It may therefore be the case that similar neural substrates support episodic memory broadly, with autobiographical memory retrieval driving such regions to greater activity levels (possibly due to greater complexity). This hypothesis invokes (often implicitly) the assumptions of the verbal learning tradition, in which word lists and similar materials are thought to serve as a proxy for life events (Tulving, 1983).
A recent meta-analysis suggests a different conclusion, however. Specifically, an activation likelihood estimation procedure was used to quantify regions of the brain that tend to be active during autobiographical memory tasks (relative to a variety of control conditions) and those that tend to be active during successful recognition memory (comparing activity for hits to correctly rejected lures); the resulting voxelwise maps were almost nonoverlapping (McDermott et al., 2009). This analytic approach suggests that recognition memory and autobiographical memory recruit distinct regions of cortex and therefore measure distinct types of memory.
Is this conclusion justified, or might differences in the procedures (e.g., different stimulus cues, different retention intervals, different response times) inherent in the two literatures lead to this outcome? The purpose of the present experiment was to directly compare the neural correlates supporting memory retrieval within these traditions, striking a balance between experimental control and ecological validity. Participants encoded scenes and later viewed a mixture of new and old scenes, and were asked about their memories in response to each stimulus. To represent the laboratory-based tradition, participants made yes/no responses indicating whether a scene had been recently encoded. To represent the autobiographical tradition, participants made yes/no responses to indicate whether or not a stimulus reminded them of an event from earlier in their lifetime. All variables, including type of stimuli presented, test item history, and trial duration, were matched closely. Direct contrasts of laboratory and autobiographical conditions demonstrate that even within this task setting, and within a single group of subjects, there are clear differences in the neural underpinnings of the two types of memory.
Materials and Methods
Experiment 1: fMRI study
Participants.
Thirty-one participants (16 female, ages 18–35) were recruited from Washington University and the St. Louis area. Participants were all right-handed, native speakers of English (acquired by the age of 5), had normal or corrected-to-normal vision, and were free of psychiatric or neurological disorders. Data from three participants were excluded from analysis due to experimenter error. In addition, one participant was excluded due to excessive motion, leaving a final sample of N = 27. Informed consent was obtained for all participants, and the study was conducted in accordance with Washington University human research practices. Participants were paid $25 per hour.
Materials.
Scenes were chosen as cues for two reasons. First, pilot work demonstrated that scenes were effective cues for autobiographical memory (as effective or more effective than words). Second, the average response time (RT) for scenes to elicit autobiographical memories fell within 1 s of the response time for recognition memory (a much smaller difference than when words are used as cues).
Collection of scene stimuli followed the procedure used by Konkle et al. (2010): 284 images of various categories (such as a cafeteria, a lecture hall, tennis courts, and an airport) were gathered via Google Images (images.google.com). Only images with resolution higher than 800 × 600 pixels were used. None of the scenes contained people. The scenes were then resized to 400 × 300 pixels (overall screen resolution: 1024 × 768 pixels). Scenes were rotated across conditions across participants.
Encoding and retrieval instructions and procedures.
The procedure of the experiment is portrayed in Figure 1. All of the tasks took place in the scanner, and participants remained in the scanner between encoding and retrieval. Participants began by studying 126 indoor or outdoor scenes; each scene was displayed for 2 s, followed by a blank screen for 500 ms. Participants used a button press to classify each scene as indoor or outdoor while learning the scenes (intentional encoding).
Figure 1. Experimental procedures. Participants viewed 126 scenes (intentional encoding) with a binary (indoor/outdoor) judgment for each. They then took two types of memory tests: recognition and autobiographical (counterbalanced order). Each test consisted of 96 scenes (48 scenes in each of two runs; totaling 64 new, 32 old), each shown for 4 s and followed by a 1 s blank screen. The only differences between test types were the instructions and the title given to subjects (picture memory test or life memory test). Images in this figure are publicly available on Wikimedia Commons (attributions from left to right: Kevin Payravi, Kapacytron, Foxparabola). ISI, Interstimulus interval.
During the memory test requiring retrieval from autobiographical memory, participants were told that they were taking a life memory test and were asked to report (via button press) whether they could use the picture “to help you remember a specific event or moment in your life” and to “please try hard to remember specific details about an event from your life that is distinct in time and place.”
In the memory test requiring retrieval of recently learned stimuli, participants were told that they were taking a picture memory test and were asked to report (via button press) whether they could “remember having seen” the picture in the study phase. Furthermore, they were asked for each picture to “try hard to remember specific details about having seen it before (e.g., some specific feature of the picture that you recall looking at, or what the picture made you think of when you studied it).” Hence, both instructions emphasized the importance of remembering, recollecting, or reliving to the extent possible (Tulving, 1985), and the instructions were kept as similar as possible (given that the instructions constituted the experimental manipulation). Participants went through a few training trials, and they did not enter the scanner until it was clear to the experimenter that they understood the instructions. Half of the participants began with the life memory test, whereas the other half began with the picture memory test. Those subjects receiving the picture memory test first began the test approximately 1 min after the end of the encoding run. The remaining participants began the test approximately 16 min after the end of encoding. At the end of the scanning session, all participants completed an additional autobiographical retrieval task (longer trials, no button press, modeled after Szpunar et al., 2007); data from that task are not reported here.
Both types of tests were divided into two blocks (or runs) of 48 trials (16 old or “studied” scenes, 32 new or “nonstudied” scenes). For each trial, a scene appeared on the screen for 4 s and was followed by a blank screen lasting 1 s. Participants were given 5 s from trial onset to respond with a button press to indicate whether they recognized the scene or whether the scene could be used to retrieve a specific life event (for the picture memory test and life memory test, respectively). In sum, item history, stimulus timing, and response output modality were identical for the two tests.
fMRI data acquisition.
Functional and structural scans were acquired on a Siemens 3.0T MAGNETOM Trio system using a Siemens 12-channel head coil. Stimuli were presented with PsyScope (Cohen, 1993) on an iMac computer, which received sync pulses from the scanner. Length of jitter and randomization of trial types were optimized using the program Optseq2 (http://surfer.nmr.mgh.harvard.edu/optseq/).
Structural images were acquired using a T1-weighted sagittal MPRAGE (TE, 3.08 ms; TR (partition), 2.4 s; TI, 1000 ms; flip angle, 8°; 176 slices with resolution 1 × 1 × 1 mm voxels), and were used along with a T2-weighted turbo spin echo structural image (TE, 84 ms; TR, 6.8 s; 32 slices with 2 × 1 × 4 mm voxels) in the same anatomical plane as the BOLD images to improve atlas alignment.
Gradient field maps allowed estimation of inhomogeneities in the magnetic field for each subject. An autoalign pulse sequence protocol provided in the Siemens software was used to align the acquisition slices of the functional scans parallel to the anterior commissure–posterior commissure plane. Slices collected were therefore parallel to the slices in the Talairach atlas (Talairach and Tournoux, 1988). Functional imaging used a BOLD contrast sensitive gradient-echo echoplanar sequence (TE, 27 ms; flip angle, 90°; in-plane resolution, 4 × 4 mm). Whole-brain EPI volumes (MR frames) of 32 contiguous, 4-mm-thick axial slices were obtained every 2.5 s. The first four functional images of each scan were discarded to allow for T1 equilibration effects.
fMRI data preprocessing.
Imaging data from each participant were preprocessed to remove noise and artifacts including (1) temporal realignment using sinc interpolation of all slices to the temporal midpoint of the first slice to account for differences in slice time acquisition, (2) correction for movement within and across scan runs using a rigid-body rotation and translation algorithm (Snyder, 1996), (3) gradient field map correction to correct for spatial distortion due to local field inhomogeneities using FMRIB Software Library's FUGUE (http://fsl.fmrib.ox.ac.uk), and (4) whole-brain normalization with a single constant factor to a common mode of 1000 in the fifth frame to allow for comparisons across participants (Ojemann et al., 1997). Functional data were then resampled using 3 mm isotropic voxels and transformed into stereotaxic atlas space (Talairach and Tournoux, 1988). Atlas registration involved aligning each participant's T1-weighted image to a custom atlas-transformed (Lancaster et al., 1995) target T1-weighted template (711-2C) using a series of affine transforms (Michelon et al., 2003). Individual subject data were averaged across participants to create a group average contrast map. The contrast images were smoothed using a Gaussian smoothing kernel with 6 mm FWHM.
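As an illustration of the smoothing step, the sketch below converts the 6 mm FWHM stated above into the Gaussian standard deviation using the standard relation σ = FWHM / (2√(2 ln 2)). The function names are hypothetical and are not taken from the analysis software; this is a worked example of the arithmetic, not the smoothing routine that was actually used.

```python
import math


def fwhm_to_sigma(fwhm_mm: float) -> float:
    """Convert a Gaussian kernel's full width at half maximum to its standard deviation."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_weight(distance_mm: float, sigma_mm: float) -> float:
    """Unnormalized Gaussian weight at a given distance from the kernel center."""
    return math.exp(-(distance_mm ** 2) / (2.0 * sigma_mm ** 2))


sigma = fwhm_to_sigma(6.0)                    # 6 mm FWHM, as stated in the text
print(round(sigma, 3))                        # 2.548 (mm)
print(round(gaussian_weight(3.0, sigma), 3))  # 0.5 at half the FWHM from the center
```

By definition of the FWHM, the weight falls to one half at 3 mm from the kernel center, which is one voxel width at the 3 mm isotropic resampling described above.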
GLM coding.
Each memory test run consisted of 157 frames (TRs), although the first four were dropped, leaving 153 frames analyzed per run. One run from one participant was dropped due to within-run movement. Participants' individual runs were concatenated into a single time series.
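The frame counts above can be related to the trial timing with simple arithmetic. The sketch below uses only values stated in the text (2.5 s per volume, 157 frames per run, 4 discarded frames, 48 trials of 4 s plus 1 s). How the remaining time was divided between jitter and other periods is not stated, so the final quantity is only a total.

```python
TR_S = 2.5               # seconds per EPI volume (from the acquisition description)
FRAMES_PER_RUN = 157     # frames collected in each memory test run
DISCARDED_FRAMES = 4     # initial frames dropped for T1 equilibration
TRIALS_PER_RUN = 48      # 16 old + 32 new scenes
TRIAL_S = 4.0 + 1.0      # scene presentation plus blank screen

analyzed_frames = FRAMES_PER_RUN - DISCARDED_FRAMES
run_duration_s = FRAMES_PER_RUN * TR_S
trial_time_s = TRIALS_PER_RUN * TRIAL_S
non_trial_time_s = run_duration_s - trial_time_s  # jitter and any lead-in/lead-out time

print(analyzed_frames)    # 153
print(run_duration_s)     # 392.5
print(trial_time_s)       # 240.0
print(non_trial_time_s)   # 152.5
```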
The data were modeled with a general linear model, which included eight regressors of interest: four corresponding to recognition memory trial types (hits, misses, correct rejections, and false alarms) and four corresponding to autobiographical memory trial types (successful retrieval for old scenes, unsuccessful retrieval for old scenes, successful retrieval for new scenes, and unsuccessful retrieval for new scenes). For each participant, RTs for each trial were z scored and included as a regressor. We analyzed the data with and without these RT regressors, and the contrast between successful autobiographical retrieval and successful picture memory did not vary appreciably; analyses reported here included RT as a regressor. Regressors of no interest included a trend term to account for linear changes and a constant term to model the baseline. A standard hemodynamic response function was chosen (Boynton et al., 1996) to estimate the hemodynamic response for each condition, with an onset delay of 2 s. Effects were analyzed in terms of percent signal change relative to baseline.
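A minimal sketch of the within-participant RT standardization follows. The text does not state whether the sample or population standard deviation was used, nor whether RTs were standardized within or across test types; the sample standard deviation over a single list of trials is assumed here, and the function name and RT values are hypothetical.

```python
from statistics import mean, stdev


def zscore_rts(rts_s):
    """Z score one participant's trial RTs (sample SD assumed; not stated in the text)."""
    center = mean(rts_s)
    spread = stdev(rts_s)
    if spread == 0:
        raise ValueError("RTs are constant; z scores are undefined")
    return [(rt - center) / spread for rt in rts_s]


# Hypothetical RTs (in seconds) for one participant
example_rts = [1.2, 1.5, 2.1, 1.0, 1.7]
z_rts = zscore_rts(example_rts)
print([round(z, 2) for z in z_rts])
```

The resulting values have a mean of zero and unit standard deviation for that participant, so the RT regressor is on a comparable scale across participants regardless of overall response speed.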
Analysis and visualization software.
Imaging analysis was done using Washington University's in-house software, FIDL (http://nil.wustl.edu/~fidl). All reported atlas coordinates were converted from 711-2C space to MNI 152 space. Figures displaying statistical maps were made by projecting and displaying the volumetric data onto a partially inflated representation of the human brain using the Connectome Workbench software (Marcus et al., 2011).
Retrieval tasks voxelwise t test analysis and ROI definition.
The obtained t test images were multiple-comparison-corrected to a whole-brain familywise error rate of p < 0.05 using a z > 3 threshold with at least 17 contiguous voxels (McAvoy et al., 2001). The cluster size threshold T values were chosen based on 10,000 Monte Carlo simulations performed by McAvoy et al. (2001). This cluster-level correction was designed to provide adequate control for false positives without a significant loss of statistical power. An automated algorithm (peak_4dfp) written by A. Snyder (Washington University School of Medicine) searched for the location of peaks in the resulting image and drew spheres (16 mm diameter) around each peak. Peaks under 16 mm apart were consolidated via coordinate averaging. Regions of interest (ROIs) were then obtained by masking the 16 mm spheres by the multiple comparison corrected image. Regions located in white matter, CSF, or ventricles were excluded from analysis.
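The consolidation of nearby peaks can be sketched as follows. The internal logic of peak_4dfp is not described in the text, so this is an assumed greedy scheme (repeatedly average the closest pair of peaks until no two are under 16 mm apart), not a reimplementation of that program. The function name and coordinates are hypothetical.

```python
import math
from itertools import combinations


def consolidate_peaks(peaks, min_separation_mm=16.0):
    """Merge peaks closer than min_separation_mm by averaging their coordinates.

    Assumption: the closest pair is merged first and the process repeats.
    Merged peaks are averaged without weighting, a simplification.
    """
    merged = [tuple(float(c) for c in p) for p in peaks]
    while len(merged) > 1:
        (i, j), distance = min(
            (((a, b), math.dist(merged[a], merged[b]))
             for a, b in combinations(range(len(merged)), 2)),
            key=lambda item: item[1],
        )
        if distance >= min_separation_mm:
            break  # every remaining pair is far enough apart
        average = tuple((x + y) / 2.0 for x, y in zip(merged[i], merged[j]))
        merged = [p for k, p in enumerate(merged) if k not in (i, j)] + [average]
    return merged


# Hypothetical peak coordinates in mm
peaks = [(0, 0, 0), (10, 0, 0), (40, 0, 0)]
print(sorted(consolidate_peaks(peaks)))  # [(5.0, 0.0, 0.0), (40.0, 0.0, 0.0)]
```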
The primary advantage of this experimental design was that we could contrast conditions with identical item history (e.g., old scenes) and identical overt response (“yes, I remember”) but differing underlying experiences (remembering having studied the scene or using the scene to remember a preexperimental life event); that is, the primary contrast of interest was a voxelwise t test (paired sample, two-tailed) contrasting activity estimates for successful retrieval cued by previously studied scenes within the life memory test and the picture memory test. The contrast compared hits on a recognition memory test to successful retrieval on an autobiographical memory test (where recently studied scenes were the cues).
Other contrasts of interest included a comparison similar to the one above, but where novel, nonstudied scenes served as the items contrasted to recognition hits. This contrast is more analogous to a direct comparison between the conditions typically used in the literature; that is, whereas the prior contrast is more elegant (with item history and response equated), the latter contrast is more like what one would achieve in contrasting the conditions typically used to study episodic retrieval. We primarily focus on the prior contrast due to the better match in item history. In addition, a contrast between successful retrieval in autobiographical memory for studied and nonstudied scenes was analyzed to assess possible “contamination” of recognition signal in autobiographical retrieval. Also shown (for completeness) are one-tailed t tests of both test types (all items) relative to the low-level control of the implicit baseline. Finally, we report contrasts between hits and correctly rejected lures in the picture memory test and between old scenes triggering successful memory retrieval and new scenes not triggering memory retrieval in the life memory test.
Retrieval tasks network-wide comparison.
Results from the whole-brain analyses described above suggested a network-wide dissociation for successful retrieval in the picture memory test and the life memory test. To further explore this possibility, we used the 264 ROIs reported by a prior whole-brain network parcellation study (Power et al., 2011) to examine individual networks' activity during successful memory for previously studied scenes (“yes” responses in the picture memory test and life memory test).
For reasons that will become clear, we focused on members of the default mode network (DMN), the parietal memory network (PMN), and a subnetwork of the frontoparietal network (FPN). We drew 10 mm spheres around the peaks, obtained the average magnitude estimates for the “yes” responses to old items for the picture memory test and life memory test (i.e., for recognition hits and successful autobiographical retrieval), and obtained the mean of magnitudes for each task condition for each network or subnetwork. We chose 10 mm diameter spheres because numerous studies (Power et al., 2012; Cao et al., 2014; Boly et al., 2015; Geerligs et al., 2015; Thompson and Fransson, 2015; Lerman-Sinkoff and Barch, 2016) used the same diameter for spheres based on the ROIs of Power et al. (2011), and the use of 10 mm spheres on our part will facilitate across-experiment comparisons in future research. We then performed paired t tests to determine whether the chosen networks showed differential activity for the two types of remembering. It is worth mentioning that the t tests did not involve any “double dipping” because these ROIs were independently defined using coordinates from Power et al. (2011).
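To make the spherical ROI step concrete, the sketch below lists the voxels whose centers fall within a 10 mm diameter sphere on a 3 mm isotropic grid and averages a magnitude map over them. It assumes the peak sits exactly on a voxel center and represents the volume as a dictionary keyed by voxel index; both are simplifications for illustration, and the function names are hypothetical.

```python
import math
from itertools import product


def sphere_offsets(diameter_mm, voxel_mm):
    """Voxel index offsets whose centers lie within the sphere (peak assumed on a voxel center)."""
    radius_mm = diameter_mm / 2.0
    reach = int(radius_mm // voxel_mm)
    return [
        (i, j, k)
        for i, j, k in product(range(-reach, reach + 1), repeat=3)
        if voxel_mm * math.sqrt(i * i + j * j + k * k) <= radius_mm
    ]


def roi_mean(volume, center, offsets):
    """Average the values of a {(i, j, k): value} volume over the ROI voxels."""
    ci, cj, ck = center
    values = [
        volume[(ci + di, cj + dj, ck + dk)]
        for di, dj, dk in offsets
        if (ci + di, cj + dj, ck + dk) in volume
    ]
    if not values:
        raise ValueError("ROI does not overlap the volume")
    return sum(values) / len(values)


offsets = sphere_offsets(diameter_mm=10.0, voxel_mm=3.0)
print(len(offsets))  # 19 voxels under these assumptions

# Hypothetical 5 x 5 x 5 volume with a constant magnitude of 1.0
volume = {index: 1.0 for index in product(range(5), repeat=3)}
print(roi_mean(volume, center=(2, 2, 2), offsets=offsets))  # 1.0
```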
Experiment 2: behavioral experiment
As will be seen, many of the brain regions more active for the life memory test have been implicated previously in recollective remembering (Rugg and Vilberg, 2013), in contrast to general feelings of familiarity. One possible explanation, therefore, is that the life memory test may elicit more recollective remembering than the picture memory test (and that it is this difference that drives the differential activity). We test that hypothesis with Experiment 2.
Participants.
Thirty-eight participants were recruited using Washington University's subject pool (21 female, ages 18–21). Informed consent was obtained from all participants, and the study was conducted in accordance with Washington University's human research practices. Participants received either course credit or $10 per hour. No participant took part in both experiments.
Materials.
The stimuli consisted of 246 indoor and outdoor scenes collected using Google Images, similar to the imaging experiment.
Encoding and retrieval instructions and procedures.
The procedure of the behavioral experiment was very similar to that of the neuroimaging study, except that participants encoded fewer stimuli (96 scenes), all procedures occurred within the psychology lab, and 96 picture memory test trials occurred within a single block, followed by 96 life memory test trials (or the reverse order, for half the participants). In both types of tests, half of the items were old, and half were new.
For both types of tests, participants made remember/know/new judgments. For autobiographical trials, they were instructed to respond “remember” when they could recollect specific aspects (time, place, feeling) of an event, “know” when they had a gut feeling of familiarity about the scene (such as having been to a similar place) but could not recollect specific details of an event, and “new” when they were not able to use the scene to recollect an event from their past.
For recognition trials, participants were instructed to answer “remember” when they could recollect something specific about having seen the scene (e.g., the thought they had or connection they made while seeing it), “know” if they had a gut feeling about having seen the scene but could not remember the specific details, and “new” if they did not remember having seen the picture in the study phase.
Analysis.
The primary question for Experiment 2 was whether the likelihood of recollective remembering might be greater in the life memory test than in the picture memory test, an outcome that would influence interpretation of the results in the fMRI study. As will be seen, this pattern did not occur.
Results
Experiment 1
Behavioral results
The behavioral performance during the picture memory test and the life memory test is shown in Figure 2. Performance in the picture memory test was quite accurate, with a hit rate of 0.76 and a false alarm rate of 0.08. Accuracy cannot be measured for autobiographical memory, but we can see that participants often claimed to be able to retrieve a life memory, whether the cue was an old scene (0.74) or a new scene (0.68). The two proportions do not differ significantly (t(26) = 0.754, p = 0.458).
Figure 2. Accuracy and response latency for the two types of tests. A, Mean proportions of “yes” and “no” responses to old (studied) and new (nonstudied) scenes for the life memory test (red) and the picture memory test (blue). B, Mean response latencies for “yes” and “no” responses to old (studied) and new (nonstudied) scenes for the life memory test (red) and the picture memory test (blue). Error bars display SEM.
The response times for the test types differed: For hits, the average RT was 1.39 s, whereas for the analogous condition in the life memory test, the average RT was 2.13 s (t(26) = 8.382, p = 7.294 × 10⁻⁹). For this reason, we entered trial-by-trial RT as a covariate in our general linear model of the fMRI data. Both the picture memory test and the life memory test had adequate numbers of trials for analysis. Specifically, the number of hit trials for each participant ranged from 9 to 30 (median, 25), and correct rejections ranged from 32 to 64 (median, 57). For the life memory test, yes responses occurred to old scenes with a frequency of 13 to 32 (median, 24), and to new scenes with a frequency of 19 to 60 (median, 44).
A final observation from the behavioral data is that responses to new scenes differed for the two test types, demonstrating that participants understood and followed the instructions (i.e., infrequently claiming to recognize new scenes as having been studied but frequently reporting being able to use new scenes to trigger memories for life events).
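The response proportions reported in this section amount to tabulating “yes” responses by test type and item history. A minimal sketch of that tabulation is given below; the trial records and the function name are hypothetical, and the toy data are not taken from the experiment.

```python
from collections import defaultdict


def yes_proportions(trials):
    """Proportion of "yes" responses for each (test, item history) cell.

    Each trial is a (test, history, response) tuple, for example
    ("picture", "old", "yes").
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [yes count, total count]
    for test, history, response in trials:
        cell = counts[(test, history)]
        cell[0] += response == "yes"
        cell[1] += 1
    return {cell: yes / total for cell, (yes, total) in counts.items()}


# Hypothetical trials for one participant
trials = [
    ("picture", "old", "yes"), ("picture", "old", "yes"),
    ("picture", "old", "no"), ("picture", "old", "yes"),
    ("picture", "new", "no"), ("picture", "new", "no"),
    ("life", "old", "yes"), ("life", "old", "no"),
    ("life", "new", "yes"), ("life", "new", "yes"),
]
proportions = yes_proportions(trials)
print(proportions[("picture", "old")])  # 0.75 (hit rate in this toy example)
print(proportions[("picture", "new")])  # 0.0 (false alarm rate in this toy example)
```

In the picture memory test, the old-scene cell corresponds to the hit rate and the new-scene cell to the false alarm rate; in the life memory test, the same cells give the proportion of scenes reported to cue a life memory.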
Neuroimaging results
A contrast of old (studied) scenes leading to successful retrieval in the two types of tests revealed differential activity in numerous regions.
The BOLD activity elicited by successful retrieval of previously studied scenes in the picture memory test (i.e., hits) was contrasted with successful retrieval elicited by previously studied scenes in the life memory test (old scenes, for which the subjects' responses were “yes, I remember”; Figure 3A, Table 1). Regions more active during autobiographical retrieval included bilateral hippocampus, left amygdala, bilateral superior frontal gyrus, angular gyrus, medial prefrontal cortex, and bilateral retrosplenial complex. Regions more activated during recognition hits included right middle–frontal gyrus, bilateral insula, right inferior parietal cortex, precuneus, and midcingulate cortex (Table 2).
Figure 3. Differential activation for successful recollection in the life memory test and picture memory test. A, A contrast of old scenes that elicited “yes” responses to indicate successful retrieval. B, A contrast of new scenes that elicited “yes” responses in the life memory test and old scenes that elicited “yes” responses in the picture memory test. For both panels, z > 3, k ≥ 17, whole-brain p < 0.05 (familywise error corrected).
Table 1. Regions exhibiting greater activity for the life memory test than the picture memory test (specifically, a contrast of “yes” responses to previously studied scenes)
Table 2. Regions exhibiting greater activity for the picture memory test than the life memory test (specifically, a contrast of “yes” responses to previously studied scenes)
Successful retrieval of autobiographical memory for new items relative to recognition hits revealed a similar pattern of results (Fig. 3B, Tables 3, 4). Because of the similarity of the two contrasts, further comparison between autobiographical retrieval and recognition refers to the successful autobiographical retrieval given old items versus hits (given the equivalent item history). The contrast involving new items is noteworthy, though, in that this condition is the one typically used in the autobiographical memory literature.
Table 3. Regions exhibiting greater activity for the life memory test than the picture memory test (specifically, a contrast of “yes” responses to novel scenes for the life memory test and “yes” responses to studied scenes in the picture memory test)
Table 4. Regions exhibiting greater activity for the picture memory test than the life memory test (specifically, a contrast of “yes” responses to novel scenes for the life memory test and “yes” responses to studied scenes in the picture memory test)
The contrast of successful retrieval of autobiographical memory for old and new items revealed a sparse map, consisting mainly of regions in the recently described parietal memory network (Gilmore et al., 2015; Fig. 4, Table 5). The contrast suggests that aside from regions sensitive to stimulus repetition, there is very little “contamination” of recognition signal in the life memory test.
Figure 4. A contrast of old scenes that elicited “yes” responses in the life memory test to new scenes that elicited “yes” responses in the life memory test. The contrast revealed a sparse map, consisting mostly of members of the parietal memory network.
Table 5. Regions exhibiting greater activity for studied than nonstudied scenes in the life memory test in successful retrieval trials (trials with “yes” responses)
Strong hemispheric asymmetry emerged such that most of the picture memory > life memory differences occurred in the right hemisphere or midline, whereas the opposite pattern (life memory > picture memory) tended to be bilateral but more pronounced in the left hemisphere.
Patterns seen in the voxelwise analysis accord with networks defined by resting state fMRI.
Regions that exhibited differential activity for the two test types were further examined. ROI generation processes revealed 29 regions differentially activated during the two tests (Tables 1, 2). ROIs were categorized based on their network membership using the parcellation of the Power et al. (2011) modified voxelwise map. Each of the 18 regions falling within the default mode network exhibited greater activity during the life memory test (Tables 1, 2; Fig. 5). Specifically, Figure 5A shows the strong correspondence in ROIs from the current data set (spheres) overlain on the default mode network (light red underlay) as identified by Power et al. (2011) based on their voxelwise network parcellation with 0.5% tie density.
Figure 5. ROIs obtained from the whole-brain analysis align with brain networks identified from an independent data set using graph theory to analyze resting state data (Power et al., 2011). Specifically, the underlays are from a Workbench (Marcus et al., 2011) border file obtained from J. Power (Weill Cornell/NY Presbyterian Hospital, Department of Psychiatry) that was based on the 0.5% tie density modified voxelwise subgraphs. A, The DMN (red underlay) and the ROIs within the DMN emerging from the whole-brain analysis. B, C, PMN and FP subnetwork (tan and blue underlays, respectively), along with the spherical ROIs within those networks. D, Mean activity (and associated SEs) for each of the DMN ROIs in Table 1. Activation for most of these regions (relative to implicit baseline) in the life memory test is evident, as is the tendency for little or no activation during the picture memory test. Conversely, E and F show the reverse tendency for the PMN and FP subnetwork. G–I, Mean activity in the present data set for all regions of the independently defined ROIs from Power et al. (2011).
Another regularity in the data is that two network communities exhibited the opposite pattern: They were more activated during the picture memory test (Fig. 5B,C). These consisted of the parietal memory network and an unnamed subnetwork of the frontoparietal network (Power et al., 2011; Fig. 4). Regions within both of these networks have been implicated previously in memory retrieval (Henson et al., 1999; Yonelinas et al., 2005; Nelson et al., 2010; Power et al., 2011).
The network-level dissociation remains even when using independently defined network ROIs.
To examine whether the dissociation between successful retrieval in the two memory tests is restricted to part of the three networks identified from our contrast or whether the networks as a whole would show the same trend, we obtained coordinates for members of the default mode network, the PMN, and the FPN subnetwork from the 264 ROIs in the Power et al. (2011) parcellation (Table 6). There are 58 ROIs in the default mode network defined by Power et al. (2011), 5 ROIs in the PMN, and 5 ROIs in the FPN subnetwork.
Regions from Power et al. (2011) used in network analyses
When averaging over regions within each of the three networks, the results converge with those presented above (Fig. 5G–I). Specifically, the default mode network exhibited activation in the life memory test and little to no activity for the picture memory test, with the difference between the two being significant with a two-tailed paired t test (t(57) = 6.511, p = 2.065 × 10⁻⁸). Conversely, the two memory retrieval networks exhibited activation in the picture memory test, but little to no activity in the life memory test, with the difference between the two being statistically significant for the PMN (t(4) = 3.608, p = 0.023), but not reaching significance for the FPN subnetwork (t(4) = 1.710, p = 0.163).
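As an illustration of the network-level comparison above, the following minimal Python sketch computes a paired t statistic across ROIs. It assumes, based on the reported degrees of freedom (57 for 58 ROIs, 4 for 5 ROIs), that ROIs served as the paired units. The function name `paired_t` and all activity values are hypothetical and are not taken from the study.

```python
import math
import statistics


def paired_t(a, b):
    """Return (t, df) for a paired t test on two matched samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Work on the per-pair differences (here, one difference per ROI).
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Made-up mean activity per ROI for a hypothetical five-ROI network.
life = [0.02, -0.01, 0.05, 0.00, 0.03]
picture = [0.20, 0.15, 0.30, 0.10, 0.25]

t_stat, df = paired_t(picture, life)
print(f"t({df}) = {t_stat:.3f}")
```

With five ROIs the test has 4 degrees of freedom, which matches the form of the PMN and FPN subnetwork tests. Obtaining a two-tailed p value would additionally require the t distribution, for example from a statistics package.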
Similarities in the task activation and manipulation checks.
Although the focus of the analyses has been on the differences in activation and deactivation between the two types of tasks, we show (Fig. 6A,B) that these differences should be understood in light of broad similarities when compared to a low-level baseline. These similarities are to be expected in that the tasks were equated as much as possible and both involved looking at scenes, directing attention toward the past, making a memory-related decision (yes/no), and executing a button press; the t tests show activation relative to the low-level control of the implicit baseline. Nonetheless, some of the differences seen in Figure 5 can be gleaned from examining the differences in Figure 6, A and B, the most prominent example being in ventromedial frontal cortex, which is active during the life memory test but strongly deactivated during the picture memory test.
A, B, Activity for old items given “yes” responses on the life memory test (A) and the picture memory test (B), relative to an implicit baseline. Despite many differences as a function of what is remembered, one can see many qualitative similarities in the two tests. C, Binarized “retrieval success” maps for life memory (“yes” responses to new scenes contrasted with “no” responses to new scenes; red) and for picture memory (“yes” responses to old scenes contrasted with “no” responses to new scenes; blue). These maps were created to be similar in construction to those underlying D, a previously reported meta-analysis showing different retrieval networks for autobiographical memory (red), retrieval success on recognition memory (blue), and the overlap of the two networks (green). Data are from McDermott et al. (2009).
In addition, as a manipulation check and as a way to connect to the prior meta-analysis suggesting pronounced differences (McDermott et al., 2009), we consider retrieval success maps for the two types of tests. Figure 6C (blue) shows regions more activated during hits than correctly rejected lures for picture memory. This contrast has been seen many times in individual studies in the literature (Konishi et al., 2000; McDermott et al., 2000) and in meta-analyses (Wagner et al., 2005; McDermott et al., 2009; Spaniol et al., 2009; Nelson et al., 2010; Power et al., 2011; Kim, 2013), and shows strong alignment with the similar contrast from McDermott et al. (2009) (Fig. 6D); regions include dorsal parietal cortex, middle frontal gyrus, insula, precuneus, and mid/posterior cingulate cortex.
The contrast in the present data set most analogous to that typically used in the autobiographical memory literature is the set of regions more active when novel cues give rise to an autobiographical memory relative to when they do not (Fig. 6C,D, red). These regions in the present data set (Fig. 6C) also align closely with the meta-analysis (Fig. 6D): ventral parietal cortex, posterior cingulate cortex, bilateral parahippocampal cortex, and medial frontal cortex.
Regions showing overlap in these contrasts are depicted in green in Figure 6. Here, we see more extensive overlap than in the meta-analysis, possibly reflecting greater power in a within-subject contrast or features of our autobiographical task that differ from those in the literature. Specifically, we see much greater overlap within left frontal cortex, and some along the middle/posterior inferior parietal lobule (pIPL), but no overlap in posterior cingulate/precuneus.
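The construction of the red, blue, and green maps can be sketched as follows. This is a minimal illustration only: the threshold, the helper names `binarize` and `label_overlap`, and the voxel values are assumptions made for the example and do not represent the authors' processing pipeline.

```python
def binarize(stat_map, threshold):
    """Mark each voxel 1 if its statistic exceeds the threshold, else 0."""
    return [1 if value > threshold else 0 for value in stat_map]


def label_overlap(life_mask, picture_mask):
    """Assign a display color per voxel from two binary masks."""
    labels = []
    for in_life, in_picture in zip(life_mask, picture_mask):
        if in_life and in_picture:
            labels.append("green")  # present in both retrieval success maps
        elif in_life:
            labels.append("red")    # life memory only
        elif in_picture:
            labels.append("blue")   # picture memory only
        else:
            labels.append(None)     # present in neither map
    return labels


# Made-up statistic values for four voxels; the threshold is hypothetical.
life_stats = [3.1, 0.4, 2.9, 0.2]
picture_stats = [0.5, 3.3, 2.7, 0.1]
threshold = 2.3

labels = label_overlap(binarize(life_stats, threshold),
                       binarize(picture_stats, threshold))
print(labels)
```

Because the maps are binarized before being combined, the overlap reflects only whether a voxel passed threshold in each contrast, not the relative size of the two effects.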
Experiment 2
The goal of this experiment was to test the hypothesis that the differences observed in Experiment 1 might be attributable to different reliance on remembering and knowing for the two memory tests. For scenes that had been studied, the picture memory test was more likely to be accompanied by “remember” experiences than was the life memory test (Mean = 48 and 30%, respectively; t(37) = 5.206, p = 7.452 × 10⁻⁶), as seen in Figure 7. In addition, for old items, participants made more “remember” than “know” judgments for the picture memory test (t(37) = 5.358, p = 4.646 × 10⁻⁶) but not the life memory test (t(37) = 0.123, p = 0.9028). The preponderance of “new” judgments for the nonstudied scenes in the picture memory test is high, as would be expected; the few false alarms that existed tended to be “know” responses.
Remember/know/new distributions for the various conditions of Experiment 2. Remember responses (blue) were most common in the picture memory test (for old scenes), not the life memory test.
Overall, this pattern of data suggests that the difference between the two types of tests in Experiment 1 cannot be readily explained by positing that the life memory test had more recollective processing. The remember/know data suggest that to the degree that any differences exist, there is more recollective processing reported in the picture memory test.
Discussion
When participants were asked to retrieve memories in response to recently encountered scenes, the neural substrates differed depending on whether the retrieved memory was of having viewed the scene recently (picture memory test) or whether it was a lifetime event (life memory test). Regions preferentially active during successful retrieval in the life memory test fell within the default mode network (Fig. 5A). Conversely, regions exhibiting greater retrieval-activity in the picture memory test tended to fall within the PMN and a subnetwork of the FPN (Fig. 5B,C). Similar results emerged when nonstudied scenes in the life memory test were contrasted with studied scenes from the picture memory test (Fig. 3B).
In a second experiment, we explored the hypothesis that the life memory test was more likely to invoke vivid recollection than the picture memory test, an idea suggested previously (Cabeza et al., 2004; Rissman et al., 2016). Specifically, using a remember/know procedure (Tulving, 1985; Gardiner, 1988), we found that contrary to this hypothesis, the picture memory test was more likely to lead to “remember” responses than was the life memory test. Results from Experiment 2 therefore suggest that a straightforward remember/know account (such that the life memory test was mostly remembering and the picture memory test mostly knowing) cannot readily explain results from Experiment 1. However, as no remember/know data were collected in Experiment 1, further work is necessary before definitive conclusions regarding recollective differences can be drawn.
What are the critical differences between the two types of tests?
The two memory tasks were designed with the goal of examining memory retrieval of laboratory-learned stimuli and events from one's lifetime with the memory cues, trial timing and structure, and other details held constant. The data are consistent with the hypothesis that episodic retrieval of recently encountered stimuli differs from episodic retrieval from one's life (Roediger and McDermott, 2013). What are the critical features differentiating the two? This question remains to be answered, although we consider some possibilities here.
One avenue that may prove fruitful to explore is the grain size of the search space. The picture memory test (and most laboratory tests) involves a highly constrained search space (i.e., a temporally specific set of stimuli). Conversely, the life memory test (like most autobiographical memory measures) allowed the search of one's entire lifetime up to the point of the beginning of the experiment. Besides the temporal specificity of the search, the absolute number of items within the search space was 32 in the picture memory test and uncountably large in the life memory test.
Temporal specificity of the search space and number of targets being searched also aligned with the “episode” being searched for. In the case of the picture memory test, the question was whether a specific visual stimulus was seen previously, with the picture being a mini episode (Tulving, 1972). The life memory test (and autobiographical memory tests in general) tended to invoke memories of more sustained activities (evolving over seconds or minutes, unlike the fleeting presentation of a single word).
In addition, the monitoring of the unfolding retrieval episode may differ between the two tests; that is, as the retrieval process develops in the life memory test (and in autobiographical memory), there may be less monitoring for accuracy and temporal specificity (Cabeza et al., 2004), a process sometimes referred to as postretrieval monitoring (Henson et al., 1999). A similar possibility is that the monitoring component in autobiographical memory comes later in time (Addis et al., 2007; Daselaar et al., 2008), and the short (5 s) trials used in the present experiment may not have allowed enough time for this component to occur.
The finding that the picture memory test involved greater activation in components of the FPN is consistent with the possibilities outlined above regarding grain size, search set size, specificity, and monitoring (Dobbins et al., 2002, 2003).
Another possibility is that the life memory test was more recall-like than the picture memory test; although participants made old/new judgments in both retrieval tests, the picture memory test was a typical recognition memory test, whereas the life memory test involved cued recall of a lifetime event and then a button press, similar to the recognition test. This difference may line up well with the aforementioned role of monitoring processes and their influence in recognition.
Furthermore, the differences in ventromedial frontal cortex (deactivated in the picture memory test but activated in the life memory test) may be attributable to more prominent self-referential processing in the life memory test (Kelley et al., 2002; Cabeza et al., 2004; St. Jacques et al., 2011). Given the nature of the different memory tasks, this should not be surprising, but it nevertheless likely contributes to some of the observed difference.
Differences between tests were also observed in the retrosplenial cortex, along with parahippocampal cortex (for a similar pattern, see Elman et al., 2013). These regions have been tightly linked with autobiographical memory as well as scene construction (Hassabis and Maguire, 2007, 2009) and the processing of contextual associations (Aminoff et al., 2007; Szpunar et al., 2009; McDermott and Gilmore, 2015). In the current experiment, all of the stimuli were scenes, so scene processing was inherent in both tasks. Nevertheless, the greater activity observed in these regions during the life memory test suggests scene construction may have been a more prominent component process for this test.
Given that the life memory test referred to more remote events than the picture memory test, one might wonder if a difference in retention interval contributed to the neural differences. The data, however, are difficult to interpret. The system consolidation account posits that as time passes, hippocampal involvement in memory retrieval decreases (Dudai, 2012; Squire et al., 2015). Although exceptions exist (Rekkas and Constable, 2005), the majority of memory studies either support or do not contradict this hypothesis (Niki and Luo, 2002; Takashima et al., 2009; Yamashita et al., 2009; Söderlund et al., 2012; Furman et al., 2012). In the present study, however, the more remote life memory test produced greater activity in bilateral hippocampus than did the more recent picture memory test. One might also wonder if temporal remoteness led to greater activity in ventromedial prefrontal cortex, a finding reported by many researchers (for review, see Nieuwenhuis and Takashima, 2011). However, given the ventromedial prefrontal cortex's sensitivity to self-relevance (St. Jacques et al., 2011) and the presumably higher self-relevance in the life memory condition, the relative contributions cannot be estimated within our experimental task.
Informing understanding of a recently described parietal memory network
Several of the regions identified as more active for the picture memory test (specifically, precuneus, midcingulate, and lateral parietal cortex) correspond to a sparse network recently introduced as the PMN (Gilmore et al., 2015). This set of regions was shown to be a functional network as identified by functional connectivity analyses of resting state data (Power et al., 2011; Yeo et al., 2011) and was hypothesized to reflect the perceived novelty or familiarity of a particular stimulus. The finding within the current data set that the network activates more to old stimuli whose oldness is salient or task relevant (relative to old stimuli whose oldness is irrelevant to the task) is consistent with the hypothesis forwarded by Gilmore et al. (2015) (Fig. 3) and suggests a role for attention to the familiarity in driving activity of the network. These data are also broadly consistent with prior findings that certain regions associated with the PMN—notably, the left pIPL—show differences in recognition activity depending on whether one is expecting a novel or familiar item at test (O'Connor et al., 2010; Jaeger et al., 2013) and whether the mnemonic information is task relevant at test (Elman and Shimamura, 2011; Rosen et al., 2016).
Generalization: are there multiple kinds of episodic memory?
We return now to the question posed in the title of our paper. Although no single fMRI experiment can definitively answer such a fundamental question, the present results clearly demonstrate that the type of “episode” influences the neural substrates of episodic memory. Specifically, distinct functional systems can be recruited, depending on the episodes being retrieved. A practical implication of this finding is that the two traditions of studying episodic memory (autobiographical and laboratory) invoke a common name (episodic memory) for different collections of processes. Future work that uses different forms of “laboratory” and “autobiographical memory” tests will be crucial to understanding the importance and generality of this conclusion [although a meta-analysis by McDermott et al. (2009) suggests that we will see similar effects in other situations as well].
Furthermore, it is important to acknowledge that the term “episodic memory” has morphed considerably over the years, becoming increasingly complex and theory bound (see Szpunar and McDermott, 2008). Here we adopt the earliest definition, regarding memory for events or episodes, where an event can be as small as a word in a list (Tulving, 1972). The present data cast doubt on the assumption that words on a list are “mini episodes” in a way that is usefully comparable to life episodes, but we must stress that the degree to which the differences could be minimized with increasingly similar tasks (e.g., equating search set size or self-referential processing) remains to be addressed.
Conclusion
In summary, this study demonstrated that laboratory-based and autobiographical retrieval, assessed using methods typically used in the literature, engaged many brain regions, especially the DMN, the PMN, and an FPN subnetwork, differently. This result supports a dissociation in the processes underlying autobiographical memory and laboratory-based memory, while offering a novel paradigm by which different aspects of episodic memory might be explored in future work.
Footnotes
This work was supported by grants from the McDonnell Center for Systems Neuroscience at Washington University and Dart NeuroScience, LLC, and by the NSF Graduate Research Fellowship Program (DGE-1143954 to A.W.G.). This work benefitted from discussions with Ian Dobbins and Jeff Zacks, from network parcellation underlays provided by Jonathan Power, from assistance with data collection by Fan Zou (Experiment 1) and Ruthie Shaffer and Hannah Becker (Experiment 2), and from response scoring by Jiayi Zhou (Experiment 1).
The authors declare no competing financial interests.
Correspondence should be addressed to Kathleen McDermott at the above address. kathleen.mcdermott@wustl.edu