Functional magnetic resonance imaging studies of recognition memory have often been interpreted to mean that the hippocampus supports recollection and that the adjacent perirhinal cortex supports familiarity. Other work points out that these studies have confounded recollection and familiarity with strong and weak memories. In a source memory study, we used two novel approaches to data analysis that allowed item memory strength and source memory strength to be assessed independently. First, we identified regions in both hippocampus and perirhinal cortex in which activity varied as a function of subsequent item memory strength while source memory strength was held constant at chance levels. Second, we identified regions in prefrontal cortex in which activity varied as a function of subsequent source memory strength while item memory strength was held constant. These findings suggest that activity in the medial temporal lobe is predictive of subsequent memory strength, whereas activity in prefrontal cortex is predictive of subsequent recollection.
Declarative memory depends on the medial temporal lobe (MTL), which includes the hippocampus, the dentate gyrus, and the subicular complex, together with the adjacent perirhinal, entorhinal, and parahippocampal cortices (Squire et al., 2004). One of the most widely studied examples of declarative memory is recognition, the ability to judge that an item has been encountered previously. Recognition is generally considered to consist of two components, recollection and familiarity (Mandler, 1980). Recollection involves remembering specific contextual details about a previous learning episode, and familiarity involves simply knowing that an item was presented before. One view is that this distinction has an anatomical basis within the MTL such that recollection depends on the hippocampus and familiarity depends on the perirhinal cortex (Brown and Aggleton, 2001; Eichenbaum et al., 2007). However, alternative views have also been proposed whereby the perirhinal cortex and hippocampus respond to differences in memory strength rather than to qualitative differences based on recollection and familiarity (Squire et al., 2007; Wixted, 2007).
Neuroimaging studies have frequently been taken to support a functional distinction between hippocampus and perirhinal cortex based on recollection and familiarity. One popular method for assessing recollection and familiarity has been to compare item memory and source memory. In a typical study, participants study a series of items presented in one of two different conditions (e.g., in red or green print). On a subsequent recognition memory test, participants first make an old/new judgment for all items (now presented in black), and then, for items endorsed as old, they also make a source memory decision (red or green). Correctly identifying both the item and its source is thought to reflect a recollection-based decision, and identifying the item but not its source is thought to reflect a decision based mainly on familiarity and much less on recollection (Eichenbaum et al., 2007).
In studies in which scanning occurs during the study phase, a typical finding is that activity in the perirhinal cortex is greater for subsequently remembered items than for subsequently forgotten items but that activity is equal for source-correct items in which the source is subsequently remembered and source-incorrect items in which the source is not remembered (Davachi et al., 2003; Gold et al., 2006). In contrast, activity in the hippocampus is often greater for source-correct trials than for source-incorrect trials (Davachi et al., 2003; Ranganath et al., 2004; Kensinger and Schacter, 2006; Uncapher et al., 2006) (but see Gold et al., 2006). This pattern of findings has seemed to support the idea that the perirhinal cortex is important for item judgments (familiarity) and that the hippocampus is important for source memory (recollection).
However, in these studies source-correct trials and source-incorrect trials are confounded with memory strength. Typically, confidence in the old/new judgment is higher for items that are subsequently associated with correct source judgments than for items that are subsequently associated with incorrect source judgments (Slotnick and Dodson, 2005; Gold et al., 2006). Thus, activity associated with strong, recollection-based decisions (source-correct) is being compared with activity associated with comparatively weak, familiarity-based decisions (source-incorrect). This confound is important because recollection and familiarity are not synonymous with strong and weak memory, respectively. One can experience a strong sense of familiarity in the absence of recollection (Mandler, 1980), and some degree of recollection can often be identified for weak memories (Slotnick et al., 2000; Koriat et al., 2003; Slotnick and Dodson, 2005). Thus, if the objective is to isolate recollection and familiarity, it is important to control for memory strength.
We used a subsequent memory paradigm and novel methods of data analysis that allowed us to examine independently the strength of item memory and the strength of source memory. Participants were scanned while they studied words under one of two different conditions. After scanning, participants took a surprise recognition memory test in which they first gave confidence ratings for an old/new decision (item memory), and then, for items called old, they gave confidence ratings for a source memory decision. For functional magnetic resonance imaging (fMRI) analysis, study trials were sorted according to the confidence ratings subsequently given for the old/new and source memory judgments. We then performed two novel analyses. First, we examined changes in fMRI activation during learning related to the subsequent strength of item memory, while source memory strength was held constant at chance levels. We then examined changes in fMRI activation during learning that were related to the subsequent strength of source memory while item memory strength was held constant. We also performed two conventional analyses so that we could compare our results with what has been reported previously in similar studies.
Materials and Methods
Fourteen right-handed volunteers (eight female; mean age, 24.3 years; range, 19–30 years) recruited from the university community gave written informed consent before participation.
The stimuli were 720 nouns with a mean frequency of 27 (range, 1–198) and concreteness ratings >500 (mean, 573) obtained from the Medical Research Council Psycholinguistic Database (Wilson, 1988). Half the words were assigned to six 60-word study lists, and half the words served as foils for the retrieval test. The assignment of words to the study and test conditions was randomized across participants.
Participants were scanned in six separate runs (∼2 min delay between runs), during which the 360 target words were presented at a rate of 2.5 s per word. Half of the study words were presented in green and the other half were presented in red (color order was randomized). When the word was presented in green, participants were instructed to decide whether the item named was animate or inanimate (animacy judgment). When the word was presented in red, they decided whether the object named would fit inside a shoebox (size judgment). Participants were encouraged to respond during the 2.5 s presentation time. Responses were collected via an MR-compatible button box. An odd/even digit task (Stark and Squire, 2001) was intermixed with word presentation and served as a baseline against which the hemodynamic response was estimated. For the digit task, participants saw a digit (1–8, presented in black) for 1.25 s and indicated by button press whether the digit was odd or even. Digit task trials (144 trials per scan run) were pseudorandomly intermixed with the word presentation trials with the following constraints: each scan run began and ended with at least 12 digit trials, and all digit trials occurred in groups of two, four, or six so as to fit within the 2.5 s repetition time (TR) (see below). The mean intertrial interval was 2.5 s (range, 0–7.5 s). Participants were given a short practice block before scanning to ensure that they understood the task and the button assignments.
After scanning (∼15 min delay), participants took a surprise postscan recognition memory test. They saw all 360 words from the scan session (targets) and 360 novel foils one at a time (3.5 s/word) in a random order and printed in black. The recognition memory test was divided into 12 blocks of 60 words each, with a short break between blocks. For each word, participants made an old/new recognition confidence judgment (the item question) on a six-point scale (1, sure new; 2, probably new; 3, guess new; 4, guess old; 5, probably old; 6, sure old). In cases in which the participant indicated that the word was old (old/new responses 4–6), they were further asked to indicate the decision that was made about the word (animacy judgment or size judgment, i.e., the source question) along with their confidence (1, sure animacy; 2, probably animacy; 3, guess animacy; 4, guess size; 5, probably size; 6, sure size).
The confidence ratings from the item and source questions were then used to back sort each of the study trials according to both item memory strength and source memory strength. For the item memory question, responses of 1–3 to study words represented misses of high, medium, and low confidence, respectively, and responses of 4–6 to study words represented hits of low, medium, and high confidence. For source memory, correct responses made with a rating of 1 or 6 were scored as high-confidence source memories, correct responses with a rating of 2 or 5 were scored as medium-strength source memories, and correct responses with a rating of 3 or 4 were scored as low-strength source memories (or source guesses). Incorrect source responses were collapsed into a single “source-miss” category regardless of the confidence rating because of the small number of such trials. Before testing, participants completed a short practice block to ensure that they understood the instructions and the confidence rating scale.
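The back-sorting rules just described can be sketched in code. This is a minimal illustration only, not the authors' implementation: the function names, the label strings, and the `studied_task` argument are hypothetical, and the sketch assumes that source ratings of 1–3 indicate the animacy task and 4–6 the size task, as on the rating scale given above.

```python
from typing import Optional

# Hypothetical mapping from a source rating to its confidence level:
# 3/4 = low (guess), 2/5 = medium, 1/6 = high.
SOURCE_CONFIDENCE = {1: "high", 2: "medium", 3: "low",
                     4: "low", 5: "medium", 6: "high"}


def source_label(source_rating: int, studied_task: str) -> str:
    """Classify a 1-6 source rating against the task performed at study."""
    chosen = "animacy" if source_rating <= 3 else "size"
    if chosen != studied_task:
        # All incorrect source responses are collapsed into one category.
        return "source_miss"
    return "source_" + SOURCE_CONFIDENCE[source_rating]


def back_sort(item_rating: int, source_rating: Optional[int],
              studied_task: str) -> str:
    """Assign a studied word to a subsequent-memory category."""
    if item_rating <= 3:
        # Item misses (rated "new"); no source question was asked.
        return f"item{item_rating}"
    if source_rating is None:
        raise ValueError("items rated old (4-6) need a source rating")
    return f"item{item_rating}_{source_label(source_rating, studied_task)}"
```

Under these assumptions, a word studied with the size task, rated 6 on the item question and 6 on the source question, is labeled `item6_source_high`. Enumerating all rating combinations gives three miss categories plus four categories for each of the three hit levels.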
Imaging was performed on a 3 T GE Healthcare scanner at the Center for Functional MRI (University of California San Diego). Functional images were acquired using a gradient-echo, echo-planar, T2*-weighted pulse sequence [TR, 2500 ms; 132 TRs/run; echo time (TE), 30 ms; flip angle, 90°; matrix size, 64 × 64; field of view, 22 cm]. The first five TRs acquired were discarded to allow for T1 equilibration. Forty-two oblique coronal slices (slice thickness, 5 mm) were acquired perpendicular to the long axis of the hippocampus and covering the whole brain. After the six functional runs, high-resolution structural images were acquired using a T1-weighted, inversion prepared spoiled gradient-echo pulse sequence (24 cm field of view; 10° flip angle; TE, 3.8 ms; 166 slices; 1.4 mm slice thickness; matrix size, 256 × 256).
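As a consistency check, the trial timings reported for the study phase can be compared with the 132 TRs acquired per run. The sketch below is simple arithmetic on the values stated above; the variable names are illustrative.

```python
# Values taken from the text; names are illustrative only.
WORD_DURATION_S = 2.5      # each study word
DIGIT_DURATION_S = 1.25    # each odd/even baseline digit
TR_S = 2.5                 # repetition time

words_per_run = 60
digits_per_run = 144

# Total run time is word time plus digit-task time.
run_seconds = words_per_run * WORD_DURATION_S + digits_per_run * DIGIT_DURATION_S
trs_per_run = run_seconds / TR_S

print(run_seconds, trs_per_run)
```

The word and digit trials together fill 330 s, which corresponds to 132 TRs of 2.5 s and matches the acquisition parameters.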
fMRI data analysis.
fMRI data were analyzed using the AFNI suite of programs (Cox, 1996). Functional data were coregistered in three dimensions to the whole-brain anatomical data, slice-time corrected, and coregistered through time to reduce effects of head motion. Large motion events, defined as TRs in which there was >0.3° of rotation or 0.6 mm of translation in any direction, were excluded from the deconvolution analysis by censoring those time points without altering the temporal structure of the data (mean of 0.4 events per participant). We also excluded the TRs immediately preceding and following each motion-contaminated TR. Fifteen behavioral vectors were created that coded each study trial according to the outcome of the subsequent item and source confidence ratings (i.e., item confidence ratings of 1–3 were each coded with a single behavioral vector, whereas item confidence ratings of 4–6 had four vectors each, one for source-miss responses and one each for source-correct responses with low, medium, and high confidence). Trials in which there was no response for either the encoding task or for the subsequent recognition memory test (mean, 5 per participant) were modeled but then excluded from additional analysis. The 15 behavioral vectors and six vectors that coded for motion (three for translation and three for rotation) were used in deconvolution analyses of the fMRI time series data. This method does not assume a shape of the hemodynamic response, and the fit of the data to the model was estimated for each time point separately. The resultant fit coefficients (β coefficients) represent activity versus baseline in each voxel for a given time point and each of the trial types. This activity was summed over the expected hemodynamic response (0–12.5 s after trial onset) and taken as the estimate of the response to each trial type (relative to the digit task baseline).
Note that the digit task was used only as a baseline for estimating the hemodynamic response in the deconvolution analysis (Stark and Squire, 2001). The following fMRI contrasts of interest were all made within active task conditions.
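The censoring rule for large motion events can be illustrated as follows. This is a sketch under stated assumptions, not the AFNI procedure itself: it assumes per-TR motion estimates are available as three rotation values (degrees) and three translation values (millimeters), that a value strictly greater than the limit on any axis marks a TR as contaminated, and that `censor_mask` and its parameters are hypothetical names.

```python
def censor_mask(rotation_deg, translation_mm, rot_limit=0.3, trans_limit=0.6):
    """Return one boolean per TR: True if the TR is kept, False if censored.

    rotation_deg and translation_mm are sequences with one (x, y, z)
    triple per TR. A TR exceeding either limit on any axis is censored,
    together with the TR immediately before and after it.
    """
    n_trs = len(rotation_deg)
    keep = [True] * n_trs
    for t in range(n_trs):
        contaminated = (
            max(abs(r) for r in rotation_deg[t]) > rot_limit
            or max(abs(d) for d in translation_mm[t]) > trans_limit
        )
        if contaminated:
            # Censor the flagged TR and its two neighbors, staying in range.
            for k in (t - 1, t, t + 1):
                if 0 <= k < n_trs:
                    keep[k] = False
    return keep
```

Returning a mask rather than deleting time points keeps the time series at its original length, which is in the spirit of censoring without affecting the temporal structure of the data.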
Initial spatial normalization was accomplished using each participant's structural MRI scan to transform the data to the atlas of Talairach and Tournoux (1988). Statistical maps were also transformed to Talairach space, resampled to 2 mm³, and smoothed using a Gaussian filter (4 mm full-width at half maximum) that respected the anatomical boundaries of the several MTL regions defined for each individual participant (see below). Specifically, the smoothing was performed within each of the anatomically defined MTL regions, but smoothing was not extended beyond the edges of these regions to prevent activity from one region (e.g., parahippocampal cortex) from being blurred into another, adjacent region (e.g., hippocampus). This was accomplished by creating a separate mask for each region, smoothing the data within that mask, and then recombining the smoothed data. The Talairach-transformed data were used in the whole-brain analyses.
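The idea of smoothing within anatomical boundaries can be illustrated with a simplified one-dimensional sketch. The actual analysis operated on three-dimensional volumes; here a single row of voxels stands in for the volume, and the function names, the label convention (0 for voxels outside every region), and the choice to renormalize the Gaussian weights near region edges are all assumptions made for illustration.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert full-width at half maximum to a Gaussian standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def smooth_within_regions(values, labels, fwhm_mm=4.0, voxel_mm=2.0):
    """Gaussian-smooth a 1D row of voxels without crossing region borders.

    labels[i] is an integer region code for voxel i; 0 means "no region",
    and such voxels are returned unchanged.
    """
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_mm
    radius = int(math.ceil(3 * sigma_vox))
    smoothed = list(values)
    for i, region in enumerate(labels):
        if region == 0:
            continue
        weighted_sum = 0.0
        weight_total = 0.0
        lo = max(0, i - radius)
        hi = min(len(values), i + radius + 1)
        for j in range(lo, hi):
            # Only voxels in the same region contribute to the average.
            if labels[j] == region:
                w = math.exp(-0.5 * ((j - i) / sigma_vox) ** 2)
                weighted_sum += w * values[j]
                weight_total += w
        smoothed[i] = weighted_sum / weight_total
    return smoothed
```

With this scheme, a sharp difference in activity between two adjacent regions is preserved at the border, whereas an unmasked filter would blur values across it.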
The region of interest large deformation diffeomorphic metric mapping (ROI-LDDMM) alignment technique (Miller et al., 2005) was used to improve alignment and increase statistical power for the analysis of medial temporal lobe activity (Kirwan et al., 2007). The first step in this approach is to define anatomical regions of interest for each subject. Anatomical regions of interest were manually segmented in three dimensions on the Talairach-transformed anatomical images for the hippocampus, temporal polar, entorhinal, perirhinal, and parahippocampal cortices. Temporal polar, entorhinal, and perirhinal cortices were defined according to the landmarks described by Insausti et al. (1998b). The caudal border of the perirhinal cortex was defined as 4 mm caudal to the posterior limit of the gyrus intralimbicus as identified on coronal sections (Insausti et al., 1998b). The parahippocampal cortex was defined bilaterally as the portion of the parahippocampal gyrus caudal to the perirhinal cortex and rostral to the splenium of the corpus callosum (Insausti et al., 1998a). Using ROI-LDDMM, the anatomically defined ROIs for each individual participant were then used to normalize each subject's set of ROIs to a previously defined template for each structure (Kirwan et al., 2007). ROI-LDDMM has the advantage over other flat-mapping techniques that the spatial transformation of medial temporal lobe structures takes place so as to maintain the relationships between voxels. This transformation was then applied to the statistical maps, and all MTL analyses were performed on the ROI-LDDMM-transformed data.
After individual deconvolution analysis, individual subject parameter estimate maps were entered into group-level analyses and thresholded at a voxelwise p value of <0.03. For the MTL analyses, group statistic maps were masked using the MTL template from the ROI-LDDMM alignment procedure to include only regions of the MTL. A cluster correction technique was used to correct for multiple comparisons, and Monte Carlo simulations were used to determine how large a cluster of voxels was needed to be statistically meaningful (p < 0.05) (Forman et al., 1995; Xiong et al., 1995) within the volume of the MTL (minimum cluster extent of 33 contiguous voxels) and for the entire brain (minimum cluster extent of 104 voxels).
Results
Figure 1A shows responses on the item recognition portion of the postscan memory test. Overall, participants scored 76.3 ± 1.6% (mean ± SEM) correct for the old/new item recognition test (hit rate, 70.7 ± 3.61%; false alarm rate, 18.0 ± 1.84%; d′ = 1.51 ± 0.08). Participants scored 74.2 ± 1.0% correct on the source memory question (76.1 ± 1.9% for the size question and 71.9 ± 2.5% for the living/nonliving question; d′ = 1.33 ± 0.06). Accuracy for the source memory judgment increased with source memory confidence (Fig. 1B): 58.8 ± 1.7, 75.3 ± 1.6, and 88.2 ± 2.0% for low source confidence (ratings of 3 or 4), medium source confidence (ratings of 2 or 5), and high source confidence (ratings of 1 or 6), respectively. The accuracy scores were reliably above chance for all source confidence levels (p values <0.001). In addition, source memory accuracy increased as a function of item memory confidence (Fig. 1C): 67.3 ± 9.0, 67.0 ± 11.0, and 82.0 ± 7.6% for item confidence ratings 4, 5, and 6, respectively. Note that source memory was reliably above chance (50%) even at low or medium levels of item memory confidence (t(13) = 7.2, p < 0.001 and t(13) = 5.8, p < 0.001 for item confidence ratings of 4 and 5, respectively). Thus, it was not the case, as has been proposed, that recollection (i.e., successful source memory) is associated mainly with high-confidence recognition responses (Yonelinas, 2001).
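For readers unfamiliar with d′, the standard equal-variance signal detection formula is d′ = z(hit rate) − z(false alarm rate). The sketch below applies it to the group-mean rates reported above. The text does not state how d′ was computed here, so the formula and the function name are assumptions; the reported value (1.51) was presumably averaged over per-participant estimates, so a calculation from group-mean rates need not reproduce it exactly.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Equal-variance signal detection d': z(H) - z(FA)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Group-mean rates from the text (70.7% hits, 18.0% false alarms).
group_level_estimate = d_prime(0.707, 0.180)
print(round(group_level_estimate, 2))
```

The group-level estimate comes out close to 1.46, slightly below the reported mean of individual d′ values, which is expected because d′ is a nonlinear function of the two rates.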
We first identified regions of the MTL in which activity varied during learning as a function of subsequent item memory strength. Accordingly, we conducted a linear trend analysis using item confidence ratings 1–6, respectively, and collapsing across all levels of source confidence. This procedure resulted in 18.4 ± 5.9, 34.8 ± 9.2, 52.0 ± 5.9, 48.6 ± 6.1, 62.8 ± 8.4, and 138 ± 14.7 trials per participant in memory strength conditions 1–6, respectively. Activity in the right perirhinal cortex and left and right hippocampus varied linearly as a function of the subsequent strength of item memory (Fig. 2). We also looked for regions showing a quadratic (U-shaped) or cubic (sigmoidal) function using the same multiple regression approach but did not observe any regions in the MTL. An additional analysis was performed using only item confidence ratings 1–5 to repeat the analysis used previously in an earlier, similar study (Ranganath et al., 2004). That study used confidence ratings of 1–6, as we did, but used only confidence ratings 1–5 to assess activity related to familiarity. This additional analysis identified the same region in right perirhinal cortex (Fig. 2, inset) as in the first analysis and in the previous study (for locations of the maxima for all the reported activations, see Table 1).
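A linear trend across ordered conditions is commonly expressed as a contrast with equally spaced, zero-sum weights. The following sketch shows such weights for six confidence levels, together with the usual quadratic weights for comparison. It illustrates the general technique rather than the specific regression model used here, and the function names and the example activity values are hypothetical.

```python
def linear_weights(n_levels):
    """Equally spaced weights that sum to zero, e.g. 6 -> [-5, -3, -1, 1, 3, 5]."""
    return [2 * k - (n_levels - 1) for k in range(n_levels)]


# Standard orthogonal polynomial weights for a quadratic (U-shaped) trend
# across six equally spaced levels.
QUADRATIC_WEIGHTS_6 = [5, -1, -4, -4, -1, 5]


def trend_score(level_means, weights):
    """Weighted sum of per-level activity estimates for one participant."""
    if len(level_means) != len(weights):
        raise ValueError("need one weight per level")
    return sum(w * m for w, m in zip(weights, level_means))


# Hypothetical per-level activity that rises steadily with confidence 1-6.
example_means = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
print(trend_score(example_means, linear_weights(6)))
print(trend_score(example_means, QUADRATIC_WEIGHTS_6))
```

For a perfectly linear increase, the linear score is positive and the quadratic score is essentially zero. In a group analysis, one such score per participant would typically be tested against zero at each voxel.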
The finding of activation in the hippocampus only when the high old/new memory confidence ratings were included in the analysis (i.e., item confidence rating of 6) is consistent with the view that the high memory strength trials were especially likely to be associated with recollection and that activity in the hippocampus is a predictor of recollective experience (Yonelinas, 2002; Eichenbaum et al., 2007). However, it is also possible that hippocampal activity predicts strong memories, regardless of whether the memories are based on recollection or familiarity (Squire et al., 2007). The next analysis addressed this possibility by evaluating the effect of item memory strength independently of the effect of recollection (i.e., source memory). To accomplish this, we conducted a linear trend analysis for item confidence ratings 1–6 while keeping source memory confidence constant and at chance levels. Specifically, for items endorsed as old (item confidence ratings of 4–6), we analyzed only those trials in which participants went on to rate their source memory response as a guess (either 3 or 4). Although source memory ratings of 3 or 4 were the lowest possible source confidence ratings, source accuracy was still measurably above chance levels when participants gave ratings of 3 or 4 [59.1 ± 2.1, 57.7 ± 4.3, and 59.8 ± 4.3% correct for item (old/new) confidence ratings 4–6, respectively; p = 0.001, p = 0.10, and p = 0.04, respectively]. Accordingly, to reduce maximally the contribution of recollection, we further constrained source memory performance by randomly removing a subset of low-confidence hit trials to bring source memory accuracy to chance levels (removal of 9.7 ± 1.9 trials per subject or 14% of the trials in which the source memory response was rated as a guess). 
The resulting linear trend analysis thus involved old items endorsed as new (item confidence ratings of 1–3), in which one assumes that source memory was poor or absent, and old items endorsed as old (item confidence ratings of 4–6), in which source memory accuracy was 54, 50, and 50%, respectively (50% indicates chance). One participant had too few low-confidence source responses and was eliminated from this analysis.
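The trial-removal step can be sketched as follows. The exact removal rule is not specified in the text, so this illustration makes an assumption: within a set of source-guess trials, randomly chosen source-correct trials are dropped until correct and incorrect trials are equal in number, which yields 50% accuracy. The function name and arguments are hypothetical.

```python
import random


def trim_to_chance(correct_flags, rng):
    """Return indices of trials to keep so that source accuracy is 50%.

    correct_flags holds one boolean per source-guess trial (True = source
    correct). Surplus correct trials are removed at random. If correct
    trials do not outnumber incorrect ones, nothing is removed.
    """
    correct = [i for i, flag in enumerate(correct_flags) if flag]
    incorrect = [i for i, flag in enumerate(correct_flags) if not flag]
    n_drop = max(0, len(correct) - len(incorrect))
    dropped = set(rng.sample(correct, n_drop))
    return [i for i in range(len(correct_flags)) if i not in dropped]


# Hypothetical example: 6 correct and 4 incorrect guesses (60% accuracy).
flags = [True] * 6 + [False] * 4
kept = trim_to_chance(flags, random.Random(0))
accuracy = sum(flags[i] for i in kept) / len(kept)
print(len(kept), accuracy)
```

In this example two correct trials are removed, leaving eight trials at 50% accuracy. Passing an explicit random generator makes the selection reproducible.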
The result was that activity in the right perirhinal cortex and in right and left hippocampus varied as a function of the subsequent strength of item memory (Fig. 3). The linear response was significant in each of these areas (right perirhinal cortex, F(1,12) = 8.17, p < 0.05; left hippocampus, F(1,12) = 8.78, p < 0.05; right hippocampus, F(1,12) = 7.96, p < 0.05), and there was no region × memory strength interaction (F(10,120) = 1.77, NS). The regions identified in this analysis overlapped with those identified in our previous analysis of memory strength (Fig. 2). We also performed this same analysis without removing trials and observed activation in a similar region of left hippocampus and also in left perirhinal cortex. Regions in right hippocampus and right perirhinal cortex, similar to the regions illustrated in Figure 3, were also observed, albeit at slightly relaxed thresholds (p = 0.14 and p = 0.08, respectively, corrected for multiple comparisons).
These results show that activity during learning in both the hippocampus and perirhinal cortex was related to the subsequent strength of item memory (for a similar finding, see Shrager et al., 2008). Because source memory was at chance across the different levels of memory strength, it seems reasonable to suppose that the increasing levels of activity were predictive of the increasing strength of familiarity-based memories. We also conducted this analysis for the whole-brain (non-ROI-LDDMM-transformed) data and report the results in Table 2.
We next examined activity during learning as a function of subsequent source memory strength. Following the method used in similar studies of item memory and source memory, we first compared trials in which subsequent source memory was correct (186.7 ± 10.4 trials) with trials in which subsequent source memory was incorrect (62.6 ± 3.6 trials) (Fig. 4). As was reported previously in a similar study (Ranganath et al., 2004), this analysis identified a region in right hippocampus in which activity during learning was greater when subsequent source memory was correct than when subsequent source memory was incorrect.
Although the analysis just described has been used frequently to assess source memory effects (and recollection), this approach confounds source memory success with memory strength for the old/new judgment. Specifically, when the source memory decision was correct, the confidence rating for the old/new judgment was higher (5.42) than when the source memory decision was incorrect (5.14; t(13) = 6.41, p < 0.0001). Thus, this analysis identified activity associated with subsequent recollection-based decisions that were also based on subsequent strong item memory. Accordingly, it is unclear whether this activity specifically signals subsequent recollective success or whether it signals the strength of subsequent item memory. We suggest that this analysis did not reveal perirhinal activity because the analysis involved only subsequently remembered trials. In Figure 3, perirhinal activity was identified when the analysis also included the lower memory strength categories (i.e., forgotten trials) (see Discussion).
To evaluate the importance of recollection itself, independently of item memory strength, we conducted a different analysis. Specifically, we conducted a linear trend analysis for source memory confidence when item memory confidence was held constant at a high level (i.e., at item confidence rating of 6). That is, we used only trials for which the subsequent old/new judgment was made with high confidence. Additionally, as described above, all source incorrect decisions, regardless of the source confidence rating, were collapsed into a single source-miss category in this analysis because of the low number of trials (mean of 26.1 ± 4.2). In this way, the linear trend analysis included source-miss trials and three levels of source-correct trials (low, medium, and high confidence: 11.1 ± 2.6, 21.9 ± 3.1, and 78.8 ± 8.8 trials, respectively), but only high-confidence correct trials (item confidence rating of 6) were used in each category to hold item memory strength constant.
This analysis did not identify significant regions of activation in the MTL. There was only a small region of left perirhinal cortex (x = −25, y = −9, z = −32) that fell short of the spatial extent threshold (volume, 88 mm³; p = 0.89). In contrast, the whole-brain analysis identified regions in the medial prefrontal cortex and right ventrolateral prefrontal cortex and insula in which activity during learning varied linearly with subsequent source memory strength (Fig. 5). Another way to evaluate source memory strength independently of item memory confidence is to conduct the same linear trend analysis but now by evaluating source confidence. In this case, we combined the source-incorrect trials and the source-correct trials across three levels of source confidence (low, medium, and high), regardless of accuracy. This analysis identified the same two areas of prefrontal cortex shown in Figure 5, likely because source confidence and source accuracy are strongly correlated (Fig. 1B).
Discussion
Participants studied 360 words in the scanner and then took a recognition memory test in which they first made an old/new judgment and then, for words declared old, made a source memory judgment about the condition under which the word had been presented. Both the old/new judgments and the source memory judgments were made using a six-point confidence rating scale. We then performed four analyses of the fMRI data, using the confidence ratings to sort the study trials according to both subsequent item memory strength and subsequent source memory strength.
Item memory strength effects
The first analysis identified activity in the right perirhinal cortex and in the hippocampus bilaterally that increased during learning as a function of subsequent item memory strength (collapsing across all levels of source confidence) (Fig. 2). When this same analysis was done only for item memory confidence ratings of 1–5 instead of 1–6, activity was observed only in the right perirhinal cortex (Fig. 2). This latter result replicates the finding of a previous study and has been taken to suggest that the perirhinal cortex supports a graded familiarity process (Ranganath et al., 2004). Item recognition judgments made with the highest confidence (rating of 6) were not included in the analysis done in the previous study, because it was assumed that these high-confidence judgments rely mostly on recollection (Yonelinas, 2001). According to this interpretation, our finding of hippocampal activation when the high-confidence item responses are added to the analysis could be attributable to the large proportion of successful source judgments (recollection-based responses) among the high-confidence item responses. However, a problem with this interpretation is that recollection is not confined to high-confidence old/new decisions. Instead, the degree of success at recollection varies continuously with the degree of old/new confidence (Fig. 1C). Thus, high-confidence item responses do not uniquely reflect recollection, but they do represent stronger memories than lower-confidence item responses.
To assess item memory strength independently of source memory strength, we looked for activity that varied as a function of subsequent item memory strength, but we now held source memory constant at the lowest possible level by limiting the analysis to item misses and source guesses. This second analysis identified activity in the hippocampus bilaterally and right perirhinal cortex (Fig. 3). Notably, successful recollection, as measured by source memory success, was not needed to elicit activity in the hippocampus. Thus, when we explicitly controlled for the confound between source memory success and item memory strength, we still observed activity in the hippocampus related to item memory strength (Fig. 3). Because recollective success (i.e., source memory) was weak and invariant in this analysis, the findings in both perirhinal cortex and hippocampus seem to be most related to the subsequent success of familiarity-based memories.
This finding for the hippocampus runs counter to the view that the hippocampus is related to recollection and not familiarity (Brown and Aggleton, 2001; Eichenbaum et al., 2007). However, one might argue that, although source memory was absent in this analysis for the feature of the study episode that we explicitly tested (i.e., whether an animacy or size judgment was made when the word was presented), participants might nonetheless have based their item memory judgments on some task-irrelevant source information (Yonelinas and Jacoby, 1996). By this view, as item memory strength increases, participants may be increasingly likely to recollect some task-irrelevant aspect of the study episode (such as the particular associations that came to mind when a word was presented). Accordingly, the finding that hippocampal activity appears to be related to the strength of item memory might still reflect the amount of available source memory.
This perspective leads to the untestable hypothesis that any hippocampal activity found in a source memory study is always attributable to task-irrelevant source recollection outside of experimental control. Note, too, that this hypothesis about hippocampal activity becomes problematic when one turns to interpreting the same pattern of activity in the adjacent perirhinal cortex. That is, activity in perirhinal cortex during learning that is related to subsequent item memory success has commonly been interpreted to reflect a familiarity signal (Ranganath et al., 2004; Eichenbaum et al., 2007). Yet there is no basis for interpreting the same finding in two adjacent structures in different ways. Indeed, the data suggest a more parsimonious explanation, namely, that activity in both perirhinal cortex and hippocampus predicts strong memories, including strong memories based on familiarity.
Source memory strength effects
In a third analysis, we examined activity during learning as a function of subsequent source memory strength. A common way to conduct this analysis (Davachi et al., 2003; Ranganath et al., 2004; Weis et al., 2004; Gold et al., 2006; Kensinger and Schacter, 2006) is to compare trials in which the subsequent source memory decision is correct with trials in which the subsequent source memory decision is incorrect. This comparison identified a region in right hippocampus just as was reported in a previous, similar study (Ranganath et al., 2004). However, this analysis also confounds the strength of item memory with the strength of source memory because the old/new judgments that lead to correct source decisions are stronger memories than the old/new judgments that lead to incorrect source memories (Fig. 1C). Accordingly, although this analysis did reveal a relationship between hippocampal activity and subsequent recollection-based decisions, it is unclear whether the important factor is that the decisions were based on recollection or that the decisions were based on strong memories.
To assess source memory strength independently of item memory strength, our fourth analysis examined activity during learning as a function of subsequent source memory strength but with item memory strength held constant by limiting the analysis to old/new judgments made with high confidence (rating of 6). This analysis identified no regions in the MTL but did identify two regions in left medial and right ventrolateral prefrontal cortex and insula. The absence of hippocampal activity in this analysis is consistent with the idea that activity in the hippocampus is indicative of strong memories rather than recollection specifically. In contrast, regions in the prefrontal cortex exhibited activity related specifically to recollective success or confidence in the recollection decision, independently of item memory strength. Note that recollective success and confidence in the recollection decision are strongly correlated (Fig. 1B).
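The trial selection behind this fourth analysis can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the dictionary keys `trial_id`, `item_conf`, and `source_strength` are hypothetical names, and `source_strength` stands for whatever ordinal measure of source memory is used (its scale is not specified in this section). Only the restriction to an old/new rating of 6 comes from the text.

```python
from collections import defaultdict

# Rating used in the text to hold item memory strength constant.
HIGH_ITEM_CONFIDENCE = 6

def bin_by_source_strength(trials, item_rating=HIGH_ITEM_CONFIDENCE):
    """Keep only trials with the given old/new rating, then group their
    ids by source memory strength (hypothetical ordinal value)."""
    bins = defaultdict(list)
    for trial in trials:
        if trial["item_conf"] == item_rating:
            bins[trial["source_strength"]].append(trial["trial_id"])
    return dict(bins)

# Hypothetical toy trials, invented for illustration only.
example_trials = [
    {"trial_id": 1, "item_conf": 6, "source_strength": 1},
    {"trial_id": 2, "item_conf": 6, "source_strength": 3},
    {"trial_id": 3, "item_conf": 5, "source_strength": 3},
    {"trial_id": 4, "item_conf": 6, "source_strength": 3},
]
print(bin_by_source_strength(example_trials))
```

Because every retained trial has the same old/new rating, any difference in encoding activity across the resulting bins cannot be attributed to differences in rated item memory strength.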
A previous study also found frontal lobe activity during learning that was related to subsequent source memory success. Cansino et al. (2002) compared source-correct with source-incorrect trials when item memory strength was high and found two regions in the frontal lobe (left superior and inferior frontal gyri) in which activity was greater for source-correct than for source-incorrect trials. The left insula has also been identified in an encoding study that contrasted subsequently remembered items given high-confidence ratings with subsequently forgotten items (Sperling et al., 2003). The finding of frontal lobe activity in relation to source memory success is also consistent with a substantial literature linking frontal lobe function to source memory performance (Janowsky et al., 1989; Glisky et al., 1995, 2001). Last, medial frontal activation has been implicated in “self-referential” processing (Northoff and Bermpohl, 2004), which might promote the organization of study material for subsequent remembering.
We investigated neural activity during learning using novel methods of data analysis that allowed us to assess independently the strength of subsequent item memory and the strength of subsequent source memory. In the hippocampus, activity during learning varied in relation to the strength of subsequent item memory when source memory was held constant at chance levels (Fig. 3). This finding suggests that the activity was related to item familiarity. Also in the hippocampus, activity during learning was related to subsequent source memory success (Fig. 4). This finding identified activity related to recollection. However, additional analysis suggested that hippocampal activity was related more to the strength of memory than to the presence of recollection. Specifically, hippocampal activity was not related to variations in source memory strength when item memory strength was held constant at a high level (Fig. 5). We conclude that activity in the hippocampus during learning is related to the subsequent strength of memory, regardless of whether memory is based on familiarity or recollection, whereas prefrontal cortex exhibits activity related specifically to the success of recollection.
Last, the perirhinal cortex exhibited activity related to the strength of subsequent item memory when source memory was held constant at chance levels (Fig. 3). We found no relationship between activity in perirhinal cortex and subsequent recollection (Fig. 4). These findings are consistent with the view that perirhinal cortex supports familiarity but not recollection (Brown and Aggleton, 2001; Eichenbaum et al., 2007). However, in view of the considerable evidence from other studies, especially studies of associative recognition, for a relationship between perirhinal activity and recollection (Jackson and Schacter, 2004; Kirwan and Stark, 2004; Law et al., 2005) (for review, see Squire et al., 2007), it is worth considering an alternative possibility: that a signal for recollection in perirhinal cortex was not evident in Figure 4 because the relationship between memory strength and activity in perirhinal cortex is nonlinear (that is, the fMRI signal is relatively insensitive to changes in memory strength at the high end of the scale) (Squire et al., 2007). In any case, our findings show that, with recollection held constant at chance levels, activity in both perirhinal cortex and hippocampus increased as familiarity increased. With memory strength held constant at a high level, activity in the frontal lobes (but not in the perirhinal cortex or the hippocampus) increased as recollection increased.
This work was supported by the Medical Research Service of the Department of Veterans Affairs, National Institute of Mental Health Grant 24600 and Training Grant T32 MH20002, and the Metropolitan Life Foundation. We thank Jennifer Frascino, Yael Shrager, Mark Starr, and Peter Wais for advice and assistance.
Correspondence should be addressed to Dr. Larry R. Squire, Veterans Affairs Medical Center 116A, 3550 La Jolla Village Drive, San Diego, CA 92161.