Abstract
We investigated how aging affects the neural specificity of mental replay, the act of conjuring up past experiences in one's mind. We used functional magnetic resonance imaging (fMRI) and multivariate pattern analysis to quantify the similarity between brain activity elicited by the perception and memory of complex multimodal stimuli. Young and older human adults viewed and mentally replayed short videos from long-term memory while undergoing fMRI. We identified a wide array of cortical regions involved in visual, auditory, and spatial processing that supported stimulus-specific representation at perception as well as during mental replay. Evidence of age-related dedifferentiation was subtle at perception but more salient during mental replay, and age differences at perception could not account for older adults' reduced neural reactivation specificity. Performance on a post-scan recognition task for video details correlated with neural reactivation in young but not in older adults, indicating that in-scan reactivation benefited post-scan recognition in young adults, but that some older adults may have benefited from alternative rehearsal strategies. Although young adults recalled more details about the video stimuli than older adults on a post-scan recall task, patterns of neural reactivation correlated with post-scan recall in both age groups. These results demonstrate that the mechanisms supporting recall and recollection are linked to accurate neural reactivation in both young and older adults, but that age affects how efficiently these mechanisms can support memory's representational specificity in a way that cannot simply be accounted for by degraded sensory processes.
Introduction
It has been a longstanding challenge for neuroscience to connect private mental experiences to observable brain phenomena. Here, we describe a way to assess the quality of memories for complex events in younger and older adults using multivariate pattern analysis (MVPA) of functional magnetic resonance imaging (fMRI) data. We operated on the principle that when an experience is recalled faithfully, the distributed pattern of neural activity resembles the pattern that was first instantiated during the perception of the original event (Xue et al., 2010; Buchsbaum et al., 2012). We used this principle to examine the ability to reactivate distributed patterns of neural activation specific to unique perceptual episodes in young and older adults, two groups known to differ in their ability to recollect past events (Levine et al., 2002; Piolino et al., 2009).
Aging reduces the specificity of neural representation for complex stimuli—a phenomenon known as dedifferentiation (Li et al., 2001; Grady, 2008). Dedifferentiation has been observed with tasks of visual perception (Park et al., 2004; Voss et al., 2008; Goh et al., 2010) and working memory (Payer et al., 2006; Carp et al., 2010), both within and outside the ventral visual stream (Park et al., 2010, 2012; Carp et al., 2011). Studies typically show that patterns of brain activation distinguishing stimulus categories such as faces, houses, chairs, and scenes are less differentiated in older than in young adults.
In the memory domain, most evidence of dedifferentiation is based on reduced distinctiveness among memory tasks, rather than between individual stimuli or stimulus categories. For example, prefrontal activity becomes less task specific with age (Grady, 2002), and aging reduces differences between neural activation patterns associated with implicit and explicit learning (Dennis and Cabeza, 2010); with episodic, semantic, and autobiographical memory retrieval (St-Laurent et al., 2011); and with episodic encoding and working memory (Sambataro et al., 2012). Importantly, the mechanisms that mediate dedifferentiation between memory tasks might differ fundamentally from those that reduce stimulus-specific neural representations during perception. For example, dedifferentiation across memory tasks might reflect decreased specificity in the engagement of cognitive processes across tasks due to a failure to inhibit competitive processes or to a compensatory pooling of cognitive resources that benefits task performance (Dennis and Cabeza, 2010; Spreng and Schacter, 2012). Although we know that aging affects recollection, we do not know how aging impacts the ability to reactivate stimulus-specific patterns of neural activity from long-term memory, which is the focus of the current study.
Using fMRI and a paradigm in which subjects observed and recalled (i.e., mentally replayed) short multimodal video clips, we used MVPA to quantify the similarity between patterns of brain activity elicited during perception and replay. We also evaluated the hypothesis that individual differences in memory abilities—which we assessed with post-scan recall and recognition tasks—covary with the ability to reactivate distributed patterns of neural activity. Finally, we assessed aging-related dedifferentiation separately at perception and at replay to compare the relative magnitude of both phenomena.
Materials and Methods
Participants
Fourteen young adults (ages 21–32) and 14 older adults (ages 64–78) of either sex (see Table 1) were recruited through the Baycrest subject pool, and tested according to a protocol approved by the Rotman Research Institute's Research Ethics Board. Older participants were screened over the phone for dementia with a modified version of the Telephone Interview for Cognitive Status (Welsh et al., 1993). We used a score of 30/50 as a cutoff point (Welsh et al., 1993), although the lowest score we observed in our sample was 32. All participants were in good health and had no history of neurological or psychiatric disorder, high blood pressure, or diabetes. All participants had normal hearing, normal color perception, and normal or corrected-to-normal vision.
Participants were tested on two separate occasions. They performed the fMRI task during the second session. During the first session (2.5 h), they completed a series of paper-and-pencil and computerized tests designed to assess a range of cognitive abilities. The test battery included the Vocabulary and the Matrix Reasoning scales from the Wechsler Abbreviated Scale of Intelligence (Wechsler, 1999). It also included memory tests such as the California Verbal Learning Test (second edition; Delis et al., 2000), the Logical Memory test from the Wechsler Memory Scale (fourth edition; Wechsler, 2009), the Rey-Osterrieth Complex Figure test (Duley et al., 1993), and the Operation Span Test (Turner and Engle, 1989), as well as mental imagery tests such as the Paper Folding Task (Ekstrom et al., 1976) and the Object-Spatial Imagery Questionnaire (Blajenkova et al., 2006). To characterize intelligence, memory, and mental imagery abilities in our age groups, we report their mean test scores along with demographic information in Table 1.
fMRI task
Video stimuli.
We used a set of short but complex audiovisual videos to elicit brain-wide patterns of activation (Furman et al., 2007; Nishimoto et al., 2011; Buchsbaum et al., 2012). Fifteen short video clips were gathered from on-line sites (Vimeo.com and YouTube.com). Four clips were reserved for practice, and 11 were used for the in-scan task (Fig. 1). Videos and their soundtracks were edited and exported to an .avi format using iMovie. Each video was associated with a short descriptive title shown in conjunction with the video. This title also served as a retrieval cue during recall trials.
Testing procedure.
In the second testing session, participants performed a cued recall task while undergoing fMRI scanning. They viewed and recalled, or mentally replayed, the same set of 11 short videos over seven functional runs of 8 min each. Perception and mental replay trials were intermixed throughout the runs according to a pseudorandom order that differed between runs and participants. For a particular video, perception and replay trials always alternated, with the restriction that they were separated by at least two (maximum of eight) intervening trials. Each of the videos was 3.4 s long (plus 0.2 s of buffer time to load), and each was shown and recalled 21 times (3 times per run). All trials were cued by a short title that matched each video (e.g., “Race Car,” “Skateboarding Dog”). During perception trials, a title was shown in blue-green letters (0.9 s) in the center of the screen, followed by the matching video (3.6 s) and an interstimulus interval (ISI) of 0.25–1 s. During mental replay trials, a title was shown in red letters (0.9 s), followed by a gray rectangle that covered the same portion of the screen as the videos shown on perception trials (4.85 s). While the gray rectangle was on the screen, participants were instructed to mentally replay the video from memory as vividly and with as many visual and auditory details as possible. Participants were then given 2.75 s to rate the vividness of their replay on a 1–4 scale (1 = not vivid at all, between 2 and 3 = average, and 4 = extremely vivid), followed by an ISI of 0–0.75 s. Mental replay trials were 1.25 s longer (4.85 s compared with 3.6 s) than the video presentation on perception trials, to give participants some time to retrieve the episode from long-term memory. Before scanning, participants were given instructions and practiced the task using a separate set of four videos. Another practice session with these same four practice videos was completed during an anatomical scan acquired immediately before the first functional run. The experimental stimulus set was shown to participants only when they performed the in-scan task.
After the scan, participants' memory for the content of the 11 videos was tested. First, we recorded them as they provided verbal descriptions of the visual and sound features of each video. Participants were instructed to report as many features as they could recall, and they were provided with an example of a very detailed description of one of the practice videos to minimize the impact of personal biases in what participants considered detailed recall. Free recall recordings were transcribed to text documents by research assistants. Then, transcripts were scored according to a system for which “Visual Feature” and “Auditory Feature” points were given for each visual or auditory detail described by the participant. For example, Visual Feature points were awarded when participants mentioned an object, its on-screen position, its color, its texture, or the way it moved on the screen. Auditory Feature points were given when participants described music, a voice, background noise, a sound's tonal properties, texture, or intensity. “Visual Error” and “Auditory Error” points were also tallied when participants described visual or auditory details that were not in the video; error points were counted independently from feature points. Points were summed per category for each video, and were averaged across videos for each participant. Transcripts from four participants (44 memories in total) were scored independently by two scorers (A.B. and M.S.-L.). Significant intraclass correlations (ICC; two-way random models assessing consistency; McGraw and Wong, 1996) were observed for the two scorers on the four detail categories (Visual Features coefficient: 0.933, p < 0.001; Auditory Features coefficient: 0.655, p < 0.001; Visual Errors coefficient: 0.570, p < 0.001; Auditory Errors coefficient: 0.255, p < 0.05). Note that low numbers of detail tallies in the Auditory Feature and the two Error categories may account for lower ICC coefficients for these measures (see Table 1). For consistency, scores from the same scorer (A.B.) were used for all participants in the current analysis.
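For illustration, a minimal Python sketch of the consistency ICC [ICC(C,1) in McGraw and Wong's (1996) notation] is shown below; the detail counts in the example are hypothetical, and the function assumes a complete targets-by-raters table.

```python
import numpy as np

def icc_consistency(ratings):
    """Single-rater consistency ICC from a two-way model, ICC(C,1) (McGraw and Wong, 1996).

    ratings : array of shape (n_targets, n_raters), e.g., one detail count per
              memory (row) and per scorer (column).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()    # between-targets
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()    # between-raters
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols  # residual
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

# Hypothetical example: Visual Feature counts for six memories scored by two raters.
print(round(icc_consistency([[12, 11], [7, 8], [15, 13], [4, 5], [9, 9], [10, 12]]), 3))
```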
After the verbal free recall, participants answered 88 true-false questions about the videos (eight questions per video, intended to be half true and half false; see Table 2 for examples). Due to a mistake, there were 45 false questions and 43 true questions; two false questions that were answered with 100% accuracy by both age groups were removed from the final analysis to balance the number of true and false questions. Participants provided their answers on a 1–6 scale for which 1 = sure false, 2 = think false, 3 = guess false, 4 = guess true, 5 = think true, and 6 = sure true. We defined Hits as trials for which participants answered guess true, think true, or sure true when the answer was true, and False Alarms (FAs) as trials for which participants answered guess true, think true, or sure true when the answer was false. We calculated d′ from participants' hits and FAs [d′ = Z(hit rate) − Z(FA rate)], which we report as a single performance score.
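As a minimal sketch of this scoring step (the handling of extreme hit or false-alarm rates is our assumption, since the text does not specify one), d′ can be computed from the 1–6 ratings as follows:

```python
import numpy as np
from scipy.stats import norm

def dprime(ratings, is_true):
    """d' = Z(hit rate) - Z(FA rate); ratings of 4-6 count as 'true' responses."""
    ratings = np.asarray(ratings)
    is_true = np.asarray(is_true, dtype=bool)
    said_true = ratings >= 4                    # guess true, think true, or sure true
    hits = said_true[is_true]                   # 'true' responses to true statements
    fas = said_true[~is_true]                   # 'true' responses to false statements
    # Clip rates away from 0 and 1 so the z-transform stays finite (an assumption).
    hr = np.clip(hits.mean(), 0.5 / hits.size, 1 - 0.5 / hits.size)
    far = np.clip(fas.mean(), 0.5 / fas.size, 1 - 0.5 / fas.size)
    return norm.ppf(hr) - norm.ppf(far)
```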
MRI setup and data acquisition
Visual stimuli were projected onto a screen behind the scanner and made visible to the participant through a mirror mounted on the head coil. Stimuli and responses were presented and recorded using E-Prime 2.0 (Psychology Software Tools). Audio stimuli were delivered from the PC running the experimental task through electrodynamic headphones using the MR Confon MRI-compatible audio system.
Participants were scanned with a 3.0 T Siemens MAGNETOM Trio MRI scanner using a 12-channel head coil system. High-resolution gradient-echo multislice T1-weighted scans (160 slices of 1 mm thickness, 19.2 × 25.6 cm field of view) coplanar with the echo-planar imaging scans (EPIs), as well as whole-brain magnetization prepared rapid gradient echo (MP-RAGE) 3D T1-weighted scans were first acquired for anatomical localization, followed by T2*-weighted EPIs sensitive to blood oxygenation level-dependent contrast. Images were acquired using a two-shot gradient-echo EPI sequence (22.5 × 22.5 cm field of view with a 96 × 96 matrix size, resulting in an in-plane resolution of 2.35 × 2.35 mm for each of 26 3.5 mm axial slices with a 0.5 mm interslice gap; repetition time = 1.5 s; echo time = 27 ms; flip angle = 62 degrees).
fMRI data analysis
All statistical analyses were first conducted on smoothed and realigned functional images in native EPI space. The MP-RAGE anatomical scan was normalized to the Montreal Neurological Institute (MNI) space using nonlinear symmetric normalization implemented in ANTS (Avants et al., 2008). An equivalent transformation was then applied to maps of statistical results derived from functional images using ANTS to normalize these maps to MNI space for group analyses.
Univariate fMRI analysis.
Functional images were converted to NIFTI-1 format, motion-corrected and realigned to the first image of the first run with AFNI's 3dvolreg program (Cox, 1996), and smoothed with a 4 mm full-width at half-maximum Gaussian kernel. Single-subject multiple regression modeling was performed using the AFNI program 3dDeconvolve. Each of the two conditions (video perception and mental replay) was modeled separately for each video by convolving a hemodynamic response function (SPM canonical function as implemented in AFNI) with the onset and the duration of the trial events. Replay trials for which no rating or a rating of 1 (= no memory details) was provided were excluded from all analyses (see Results for the number of successful trials per age group). One additional event was modeled but was not further analyzed: the memory rating response for the mental replay condition. An additional set of five nuisance regressors (a constant term plus linear, quadratic, and higher-order polynomial terms) was included for each scanning run to model low-frequency noise in the time series data.
Statistical contrasts at the single-subject level were computed as weighted sums of the estimated β-coefficients divided by an estimate of the SE. This procedure produced a t statistic for each voxel in the image volume. Estimates of activity for perceiving or mentally replaying a video were computed in the subject's native image space as contrasts between each video and the average activity in all the other videos within that condition (perception and mental replay, respectively). Thus, for both the perception and mental replay trials, 11 contrast images, one for each video, were computed at the single-subject level. In addition, for the purpose of assessing the reliability of within-subject activation patterns, the same set of 11 contrast images specific to the viewing and replay of each video was calculated separately for odd and even scanning runs. The different sets of contrast images obtained from this analysis were used in the computation of three measures of similarity: (1) the Jaccard, (2) the RV, and (3) the Average Pairwise Correlation (AP-Corr) coefficients, as described below. Although contrasting stimulus-specific signal against an average signal rather than against a general baseline can distort absolute reliability estimates for individual stimuli, subtracting the condition's mean signal from stimulus-specific signal does not distort relative neural specificity measures averaged across stimuli such as the ones we report in our analyses (Garrido et al., 2013).
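The logic of these one-versus-rest contrasts can be sketched in a few lines of Python (illustrative only; the actual estimation was performed with AFNI): the target video's regressor receives a weight of +1, each of the other 10 videos a weight of −1/10, and the weighted sum of β-coefficients is divided by its standard error.

```python
import numpy as np

def one_vs_rest_weights(video_index, n_videos=11):
    """Contrast weights for 'this video vs. the mean of the other videos' (sum to zero)."""
    w = np.full(n_videos, -1.0 / (n_videos - 1))
    w[video_index] = 1.0
    return w

def contrast_t(betas, cov_betas, w):
    """t statistic: the weighted sum of beta estimates divided by its standard error."""
    return (w @ betas) / np.sqrt(w @ cov_betas @ w)
```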
Whole-brain MVPA representational specificity metrics.
We computed several metrics of multivariate pattern similarity to compare activation specificity in older and younger adults. To estimate activation pattern specificity during stimulus perception alone, similarity metrics were computed between activation contrast images for perception trials from odd and even runs. Likewise, to estimate the specificity of activation patterns for memory trials alone, similarity metrics were computed between activation contrast images for mental replay trials from odd and even runs. To assess the extent to which activation patterns during mental replay trials were similar to activation patterns during perception trials (i.e., memory reactivation), video-specific contrast images based on all perception trials were compared with video-specific contrast images based on all mental replay trials. Thus, the odd and even run comparisons may be viewed as video-specific reliability measures between trials of the same type (either perception or mental replay), whereas the comparison between perception and mental replay contrasts assesses the extent to which different cognitive activities (perception and replay) performed on identical stimuli share a common distributed neural representation. For each analysis, pairs of data matrices were created for each participant in which each row corresponded to a video, and each column to a voxel whose value (t statistic) was determined by the contrast image for that video obtained from the univariate analysis. To estimate specificity at perception, pairs of “split-half” data matrices were created for perception trials from the odd and even runs, while similarity at recall was estimated from pairs of split-half data matrices for mental replay trials from odd and even runs. Similarity between perception and memory was estimated between pairs of matrices created from the full set of perception and mental replay trials.
For each pair of matrices, three similarity metrics were computed. These three metrics reflected, in different ways, the extent to which activation patterns were stimulus specific, and thus quantified representational specificity either at perception or during mental replay from memory. The first metric was the AP-Corr coefficient between the two matrices (Haxby et al., 2001; Linke et al., 2011): a Pearson correlation coefficient was computed between each row (one per video) of one matrix and each row of the other, producing an 11 × 11 correlation matrix representing all pairwise correlations between video contrast images. A global metric of similarity between activation images for the same video was then computed by averaging the values along the main diagonal of the correlation matrix and subtracting the mean of the off-diagonal elements. The second measure was a metric of multivariate similarity called the RV coefficient, which can be interpreted as the matrix equivalent of a squared coefficient of correlation (Abdi, 2010). The RV coefficient, unlike the AP-Corr measure, operates on the respective square covariance matrices of the two input datasets: the cross-product (a.k.a., scalar product) of the two covariance matrices is normalized by the square root of the product of their respective sums of squares (i.e., RV is a cosine between the “vectorized” forms of two covariance matrices). The RV coefficient is a more general measure of the similarity structure between two matrices than the AP-Corr measure because it captures second-order similarity relationships (Shepard and Chipman, 1970; Shepard, 1978) between the two input matrices. The third representational specificity metric was the Jaccard coefficient: t scores for all voxels in a pair of contrast images (e.g., activation elicited by video 3 at perception and at replay) were thresholded at p < 0.005 (uncorrected). Then, the number of voxels whose t score was above threshold in both image contrasts (intersection) was divided by the number of voxels whose t score was above threshold in either of the two image contrasts (union). This coefficient was computed for each of the 11 videos, and the 11 values were averaged to create a composite Jaccard measure for the entire stimulus set. Thus, the Jaccard coefficient is a measure of the degree of similarity between thresholded activation maps. To summarize, the AP-Corr metric measured the voxel-to-voxel correlation between two sets of activation maps and the RV coefficient measured the structural similarity between two covariance matrices, while the Jaccard coefficient, which is conceptually similar to a “conjunction analysis” (Price and Friston, 1997; Nichols et al., 2005), measured the agreement between the sets of suprathreshold voxels in two sets of images.
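To make the three metrics concrete, a minimal numpy/scipy sketch operating on a pair of videos-by-voxels matrices of t scores is given below. Details left open in the text are assumptions: the Jaccard threshold is applied one-tailed to positive t values, the covariance matrices entering the RV coefficient are computed after centering each voxel's values across videos, and empty unions are guarded against.

```python
import numpy as np
from scipy.stats import t as t_dist

def ap_corr(A, B):
    """Average Pairwise Correlation: mean same-video correlation minus mean
    different-video correlation between the rows (videos) of A and B."""
    n_vid, n_vox = A.shape
    Az = (A - A.mean(1, keepdims=True)) / A.std(1, keepdims=True)
    Bz = (B - B.mean(1, keepdims=True)) / B.std(1, keepdims=True)
    R = Az @ Bz.T / n_vox                        # 11 x 11 matrix of Pearson correlations
    same = np.trace(R) / n_vid
    different = (R.sum() - np.trace(R)) / (n_vid * (n_vid - 1))
    return same - different

def rv_coefficient(A, B):
    """RV coefficient: cosine between the vectorized cross-product (covariance) matrices."""
    Ac, Bc = A - A.mean(0), B - B.mean(0)        # center each voxel across videos (assumption)
    Sa, Sb = Ac @ Ac.T, Bc @ Bc.T                # videos x videos covariance matrices
    return np.sum(Sa * Sb) / np.sqrt(np.sum(Sa * Sa) * np.sum(Sb * Sb))

def jaccard(A, B, df, p_thresh=0.005):
    """Mean Jaccard overlap of suprathreshold voxels across videos (A and B hold t scores)."""
    t_cut = t_dist.ppf(1 - p_thresh, df)         # one-tailed cutoff (assumption)
    a_sup, b_sup = A > t_cut, B > t_cut
    intersection = (a_sup & b_sup).sum(axis=1)
    union = (a_sup | b_sup).sum(axis=1)
    return np.mean(intersection / np.maximum(union, 1))
```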
We elected to report results from all three related similarity metrics because, having computed them all, we did not want to bias the findings we reported. Currently, there is no gold-standard measure of pattern specificity in the field, and our study provides evidence of each metric's convergence and relative merit in detecting patterns of reactivation.
Searchlight analysis.
To identify local regions containing voxels informative about the identity of a stimulus either during perception or during mental replay, we performed a searchlight analysis throughout the entire brain (Kriegeskorte et al., 2006). The AP-Corr coefficient was computed on pairs of split-half matrices based on voxels restricted to a local 8 mm spherical neighborhood surrounding a central voxel, and whose values were the t scores from univariate contrasts based on odd and even runs from the same condition (either perception or mental replay, see above). The 8 mm sphere was moved around the entire brain, excluding voxels falling outside a functional brain mask, to create a whole-brain data map of AP-Corr coefficients attributed to the voxel in the center of each sphere.
To identify brain regions that showed a high degree of similarity between perception and memory retrieval, we also computed a searchlight analysis on pairs of matrices restricted to a local 8 mm spherical neighborhood surrounding a central voxel, and whose values were t scores from univariate contrasts based on all trials from the perception and the mental replay trials. For group analyses, whole-brain AP-Corr maps were spatially normalized to MNI space and analyzed with voxelwise t tests. We used one-sample t tests for analyses conducted within age groups and independent-sample t tests for between-group comparisons. To determine whether the magnitude of dedifferentiation observed for the reactivation signal could be accounted for by dedifferentiation at perception, we also conducted voxelwise ANCOVAs comparing searchlight AP-Corr reactivation coefficients between age groups using a mean head motion metric (see below, Head motion) and searchlight AP-Corr perception coefficients as covariates.
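A compact sketch of the searchlight logic is shown below, reusing the ap_corr helper from the sketch above. The same routine serves both the split-half (odd vs. even runs) and the perception-replay comparisons, depending on which pair of t-map matrices is supplied; the KD-tree neighborhood lookup is an implementation convenience, not necessarily what was used originally.

```python
import numpy as np
from scipy.spatial import cKDTree

def searchlight_ap_corr(maps_a, maps_b, coords_mm, radius_mm=8.0):
    """maps_a, maps_b : (n_videos, n_voxels) t-score matrices (e.g., odd vs. even runs,
    or perception vs. mental replay); coords_mm : (n_voxels, 3) voxel coordinates in mm."""
    tree = cKDTree(coords_mm)
    scores = np.full(coords_mm.shape[0], np.nan)
    for i, center in enumerate(coords_mm):
        sphere = tree.query_ball_point(center, r=radius_mm)
        if len(sphere) < 2:                      # a single voxel does not form a pattern
            continue
        scores[i] = ap_corr(maps_a[:, sphere], maps_b[:, sphere])
    return scores                                # one AP-Corr value per sphere center
```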
Although we could have conducted searchlight analyses with all three similarity metrics (AP-Corr, Jaccard, and RV), we decided to conduct analyses using only one of them to limit the number of contrasts to report. We selected AP-Corr because it is arguably the simplest of the three metrics, a direct measure of similarity between activation patterns elicited by the same video, and a variant of the reactivation metric most commonly reported in the literature.
Partial least-squares.
We used Partial Least-Squares Correlation (PLSC) to identify whole-brain patterns of reactivation that were modulated by individual differences in memory performance. PLSC is a multivariate technique that extracts commonalities between patterns of brain activation and either task conditions or external measures of performance (McIntosh et al., 1996; McIntosh and Lobaugh, 2004; Krishnan et al., 2011). Our analysis assessed brain regions whose level of reactivation was modulated by post-scan memory for the videos in each age group. Brain voxel values entered into PLSC corresponded to AP-Corr coefficients from the reactivation searchlight analysis comparing perception and mental replay trials. For behavioral variables, we used three measures of memory performance: (1) d′ from the true-false post-scan questions, and a participant's mean number of (2) Visual Features and (3) Auditory Features produced during post-scan recall.
Separate cross-block covariance matrices (one per age group) were obtained by multiplying a matrix of brain activity (rows = participants, columns = brain voxels) with a matrix of behavioral variables (rows = participants, columns = memory test scores). All data were centered and normalized within age group so that mean group differences did not influence the patterns of covariance between the brain and the behavior variables identified by PLSC. The two cross-product matrices were stacked into a combined matrix, which was then decomposed with the singular value decomposition to provide orthogonal sets of latent variables that extracted the largest amount of information shared by the brain and the behavior matrices. Each latent variable identified a spatial pattern of brain voxels whose activity best accounted for variation in the behavioral measures of performance (voxel saliences). Each latent variable also identified the extent to which each behavioral measure covaried with its pattern of brain activation (behavior saliences). Finally, each latent variable was associated with a singular value measuring the amount of the cross-product matrix covariance accounted for by that latent variable.
The significance of the latent variables was determined using a permutation test with 10,000 permutations, using sampling without replacement to reassign the order of rows (corresponding to participants) in the brain data matrix. A PLSC analysis was run on each new permuted sample and the latent variable's significance (p < 0.05) was computed as the proportion of its permuted singular values that exceeded its observed singular value. In addition, the voxels' contribution to the latent variable's pattern was assessed with a bootstrap estimation of the SEs (10,000 bootstrap samples). Participants were randomly resampled with replacement, and each voxel's SE was computed. A voxel was considered reliable if the ratio of its salience to the SE of its salience—a t-like statistic called a bootstrap ratio—had a magnitude larger than 3.00. These ratios are approximately equivalent to normally distributed Z-scores (McIntosh and Lobaugh, 2004) and so a bootstrap ratio with a magnitude larger than 3.00 approximates p < 0.005. We report clusters of at least 20 suprathreshold voxels in our results.
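The core PLSC computation and its resampling tests can be sketched as follows (assuming numpy; variable names are illustrative). For brevity the sketch omits the Procrustes alignment that established PLS toolboxes apply to bootstrapped saliences to handle sign and axis indeterminacy, so it illustrates the logic rather than replacing a standard implementation.

```python
import numpy as np

def zscore_cols(a):
    """Center and normalize each column within a group."""
    return (a - a.mean(0)) / a.std(0, ddof=1)

def plsc(brain_by_group, behav_by_group):
    """Stack the within-group cross-block covariance matrices and decompose them with SVD.

    brain_by_group : list of (participants x voxels) arrays, one per age group
    behav_by_group : list of (participants x memory scores) arrays, one per age group
    """
    R = np.vstack([zscore_cols(Y).T @ zscore_cols(X)
                   for X, Y in zip(brain_by_group, behav_by_group)])
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return U, s, Vt.T        # behavior saliences, singular values, voxel saliences

def permutation_p(brain_by_group, behav_by_group, s_obs, n_perm=10000, seed=0):
    """p value per latent variable: proportion of permuted singular values >= observed."""
    rng = np.random.default_rng(seed)
    exceed = np.zeros_like(s_obs, dtype=float)
    for _ in range(n_perm):
        shuffled = [X[rng.permutation(len(X))] for X in brain_by_group]
        _, s_perm, _ = plsc(shuffled, behav_by_group)
        exceed += s_perm >= s_obs
    return exceed / n_perm

def bootstrap_ratios(brain_by_group, behav_by_group, v_obs, n_boot=10000, seed=0):
    """Voxel saliences divided by their bootstrap SE (|ratio| > 3 approximates p < 0.005)."""
    rng = np.random.default_rng(seed)
    total = np.zeros_like(v_obs)
    total_sq = np.zeros_like(v_obs)
    for _ in range(n_boot):
        idx = [rng.integers(0, len(X), len(X)) for X in brain_by_group]
        _, _, v_b = plsc([X[i] for X, i in zip(brain_by_group, idx)],
                         [Y[i] for Y, i in zip(behav_by_group, idx)])
        total += v_b
        total_sq += v_b ** 2
    se = np.sqrt(total_sq / n_boot - (total / n_boot) ** 2)  # bootstrap SD of each salience
    return v_obs / se
```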
Head motion.
Overall head motion was quantified during motion correction for each participant. The mean displacement (in millimeters) between each frame and its reference frame (a run's first frame, to which all subsequent frames were realigned within that run) was calculated per run, and then averaged across all runs. We used this measure to assess whether age differences in representational specificity could be accounted for by differences in overall head motion between young and older adults.
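A minimal sketch of this summary measure, under the assumption that only the translation parameters from realignment enter the displacement (the text does not state whether rotations contribute):

```python
import numpy as np

def mean_displacement(translations_by_run):
    """Mean frame-to-reference displacement (mm), averaged over runs.

    translations_by_run : list of (n_frames, 3) arrays of x/y/z translations relative
    to each run's first frame, as produced by volume realignment (e.g., 3dvolreg).
    """
    per_run = [np.linalg.norm(run, axis=1).mean() for run in translations_by_run]
    return float(np.mean(per_run))
```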
Results
Behavioral results
Both age groups were highly successful at mentally replaying the correct video during the in-scan recall condition. The mean percentage of successful trials (vividness ratings > 1) was 97.47% (SD = 2.60%) for young adults, and 97.22% (SD = 3.75%) for older adults. As a group, older adults rated significantly more trials as “extremely vivid” (rating = 4) than young adults (65.6 and 38.1% of trials, respectively; t(26) = 2.231, p = 0.034), while young adults rated significantly more trials as “average” (rating = 2 or 3) than older adults (59.4 and 31.6%, respectively; t(26) = 2.381, p = 0.025).
Post-scanning free recall data were unavailable for one older participant due to a recording error. As shown in Table 1, young adults recalled marginally more Visual Feature details (t(25) = 1.984, p = 0.058) and significantly more Auditory Feature details (t(25) = 2.139, p = 0.042) than older adults. The two age groups did not differ significantly in the number of Visual and Auditory Errors they made (t(25) = 0.414, p = 0.683, and t(25) = 1.002, p = 0.326, respectively). On the post-scan true-false test, young adults had numerically higher d′ scores than older adults, but this group difference was not significant (t(26) = 1.355, p = 0.187).
Performance on true-false questions correlated positively with the mean number of Visual and Auditory Features remembered during post-scan free recall in young, but not in older adults (Young Visual: r = 0.593, p = 0.026; Older Visual: r = −0.112, p = 0.716; Young Auditory: r = 0.670, p = 0.009; Older Auditory: r = −0.082, p = 0.790). Z tests based on Fisher's Z-transform comparing correlation coefficients between the two age groups (Weaver and Wuensch, 2013) revealed trends for group differences that did not reach significance, possibly due to sample size (for Auditory Features: Z = 1.667, p = 0.095; for Visual Features: Z = 1.304, p = 0.192). Nevertheless, these results indicate that, while both groups performed similarly on the true-false questions, recall and recognition performance only correlated significantly with each other in young adults.
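For reference, one standard form of the independent-samples comparison of correlations uses Fisher's r-to-z transform, as sketched below; the exact procedure recommended by Weaver and Wuensch (2013) may differ in its details, so this sketch is illustrative only.

```python
import numpy as np
from scipy.stats import norm

def compare_independent_correlations(r1, n1, r2, n2):
    """Two-tailed Z test for the difference between two independent correlations."""
    z1, z2 = np.arctanh(r1), np.arctanh(r2)                 # Fisher r-to-z transform
    z = (z1 - z2) / np.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return z, 2 * norm.sf(abs(z))
```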
fMRI results
Representational specificity at perception
Figure 2 illustrates the AP-Corr coefficients from the split-half searchlight analysis conducted on odd and even runs for the video-specific perception contrasts, averaged over each age group. Activity from brain voxels with high coefficients contained information that distinguished between the different videos at perception. In both age groups, a similar set of posterior cortical regions that included primary and secondary visual and auditory regions supported stimulus representation. A direct contrast (t test) between the two age groups revealed that AP-Corr differed significantly in only a few voxels. The only voxel cluster that survived thresholding (>20 voxels, t > 3.700, p < 0.001 uncorrected) was in the left fusiform gyrus (see Table 3).
Among the three whole-brain measures of stimulus specificity at perception, the RV (t(26) = 2.399, p = 0.024) and Jaccard (t(26) = 2.310, p = 0.028) coefficients were significantly higher in young than in older adults, but the AP-Corr coefficient did not differ significantly between the two age groups (t(26) = 0.781, p = 0.442). Overall, our results indicate relatively modest aging-related dedifferentiation at perception.
Representational specificity during mental replay
Figure 3 illustrates the AP-Corr coefficients from the split-half searchlight analysis conducted on odd and even runs for the video-specific mental replay contrasts averaged for each age group. Activity from brain voxels with high coefficients contained information that distinguished among the videos as participants mentally replayed them. Reliable activation patterns for memory trials were found in many regions for both age groups, but unlike at perception, these regions seemed underrepresented in primary sensory areas.
To quantify the correspondence between regions with high signal specificity at perception and at replay, we computed the Spearman rank correlation between signal specificity averaged across age groups (i.e., the proportion of significant voxels within a region, p < 0.001) at perception [p(percept)] and at replay [p(replay)] within each of the Harvard-Oxford atlas' regions of interest (ROIs: 108 regions; FMRIB Software Library v5.0, the Analysis Group, FMRIB, Oxford, UK). The rank correlation between p(percept) and p(replay) was 0.76, confirming that reliable signal at perception is a good predictor of reliable signal at replay within an ROI. Then, we normalized the data with the arcsine transform, fit a robust linear model predicting p(replay) from p(percept) across all ROIs, and identified regions with the largest negative residuals (i.e., regions with high signal specificity at perception but not at replay). The two regions with the largest negative residuals were the right and left primary auditory cortex [Heschl's gyrus; p(percept) = 0.982, p(replay) = 0.040, and p(percept) = 1.00, p(replay) = 0.171, respectively]. Five left- and right-lateralized posterior occipital areas were also listed among the top 10 regions with the largest negative residuals, including the right occipital pole [p(percept) = 0.668, p(replay) = 0.130]. These results show that signal specificity was considerably lower at replay than at perception in primary and low-level sensory areas.
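This ROI-level analysis can be sketched as follows (assuming scipy and statsmodels); the particular arcsine variant (arcsine of the square root of each proportion) and the choice of robust estimator are assumptions, since the text specifies only an arcsine transform and a robust linear model.

```python
import numpy as np
from scipy.stats import spearmanr
import statsmodels.api as sm

def roi_specificity_residuals(p_percept, p_replay):
    """Rank-correlate per-ROI proportions of significant voxels at perception and replay,
    then fit a robust linear model predicting replay from perception and return the
    residuals (large negative residuals = specific at perception but not at replay)."""
    p_percept = np.asarray(p_percept, dtype=float)
    p_replay = np.asarray(p_replay, dtype=float)
    rho, _ = spearmanr(p_percept, p_replay)
    x = np.arcsin(np.sqrt(p_percept))                # angular transform of proportions
    y = np.arcsin(np.sqrt(p_replay))
    fit = sm.RLM(y, sm.add_constant(x)).fit()        # Huber M-estimator by default
    return rho, fit.resid, np.argsort(fit.resid)     # rho, residuals, ROIs sorted by residual
```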
A direct contrast (t test) between the two age groups' searchlight replay maps revealed that AP-Corr coefficients were significantly higher in young than in older adults in several posterior cortical regions (see Table 3), indicating clear age-related dedifferentiation during mental replay. No region showed greater levels of stimulus-specific activation in the older group. All three whole-brain measures of stimulus specificity at replay were significantly higher in young than in older adults (RV coefficient: t(26) = 3.802, p = 0.001; AP-Corr coefficient: t(26) = 2.917, p = 0.007; Jaccard coefficient: t(26) = 3.570, p = 0.001). Overall, our results indicate clear aging-related dedifferentiation during mental replay of complex audiovisual stimuli.
Stimulus-specific reactivation
Figure 4 illustrates the AP-Corr coefficients from the searchlight analysis that identified stimulus-specific activity elicited both at perception and during mental replay. Activity from brain voxels with high coefficients reflected stimulus-specific information reactivated from long-term memory. Many of the same posterior cortical regions identified by the split-half searchlight analysis conducted only on mental replay trials were also involved in reactivation. A direct contrast (t test) between the two age groups indicated that AP-Corr coefficients were significantly higher in young than in older adults in several posterior and lateral prefrontal cortical regions (see Table 3). No region showed greater levels of stimulus-specific reactivation in the older group. An ANCOVA comparing AP-Corr reactivation coefficients between the two age groups, in which mean head motion and AP-Corr coefficients at perception were entered as covariates, revealed patterns of group differences similar to those identified by the two-sample t test (Fig. 4). These results indicate that the aging-related dedifferentiation we observed for patterns of reactivation could not be accounted for by a loss of signal specificity at perception, or by the significantly greater amount of head motion we observed in the older group (t(26) = 2.312, p = 0.029).
All three whole-brain measures of stimulus-specific reactivation were significantly higher in young than in older adults (RV coefficient: t(26) = 4.354, p < 0.001; AP-Corr coefficient: t(26) = 3.843, p = 0.001; Jaccard coefficient: t(26) = 4.634, p < 0.001). An ANCOVA with AP-Corr reactivation coefficients entered as the dependent variable, age group as a fixed factor, and mean head motion and split-half AP-Corr perception coefficients entered as covariates, showed that the split-half AP-Corr perception coefficient was a significant predictor of reactivation (F(1,24) = 19.323, p < 0.001), but that mean head motion was not (F(1,24) = 0.967, p = 0.335). Importantly, age group was still a significant factor after head motion and representational specificity at perception were accounted for (F(1,24) = 18.549, p < 0.001). Similar ANCOVAs showed that age group was also a significant predictor of the Jaccard and RV reactivation coefficients, respectively, after head motion and representational specificity at encoding were taken into account (Jaccard: F(1,24) = 16.809, p < 0.001; RV: F(1,24) = 11.192, p = 0.003). Thus, neither head motion nor loss of signal specificity at perception could account for the observed aging-related reduction in stimulus-specific reactivation.
Neural reactivation and memory performance
We performed a PLSC analysis to identify voxels in which reactivation was associated with post-scan memory for the 11 videos. We used PLSC to examine covariance relationships between the AP-Corr searchlight reactivation maps and the set of post-scan memory test scores (mean Visual Features, mean Auditory Features, and d′ for the true-false questions). PLSC identified one significant latent variable (p < 0.001) that accounted for 50.33% of the cross-block covariance. Voxel saliences for this latent variable corresponded to brain regions whose reactivation correlated positively with the two measures of recall (Auditory and Visual Features) in both young and older adults (Fig. 5). Among these areas were lateral parietal regions that have previously been associated with recollective memory retrieval processes (Wagner et al., 2005; Cabeza, 2008; Vilberg and Rugg, 2008; Hutchinson et al., 2009; see Table 4). Thus, although the two age groups performed differently on free recall, we observed within each age group that individuals with good recall performance also had more specific neural reactivation, a pattern suggesting that these participants performed more detail-oriented in-scan recollection.
In addition, performance on true-false questions correlated positively with the pattern of voxel saliences identified by this latent variable in young adults, but not in older adults. No other latent variables identified by this analysis reached significance, providing no indication of a reactivation pattern that predicted performance on the true-false task in the older group. As mentioned above, performance on post-scan recall (number of Visual and Auditory Features) correlated positively with performance on true-false questions in young, but not in older adults. The current PLSC results indicate that young adults who performed well on true-false questions also had more specific patterns of neural reactivation, and it is possible that detailed in-scan recollection benefited their subsequent recognition performance. As for older adults, our results indicate that neither the specificity of neural reactivation nor performance on the post-scan recall task predicted their performance on the true-false recognition task. If anything, PLSC indicates that older adults with high levels of cortical reactivation performed more poorly on the true-false task. Thus, although older adults as a group performed similarly to young adults on the true-false questions, our results suggest that strategies other than detailed in-scan recollection, such as more semantic or language-based forms of rehearsal during mental replay, benefited post-scan recognition in older adults.
Additional support for this interpretation comes from patterns of correlations between the whole-brain AP-Corr coefficient at reactivation and performance on the true-false questions in each age group. As shown in Figure 6, d′ derived from the true-false task correlated positively with whole-brain AP-Corr in the young group, but not in the older group. A Z test based on Fisher's Z-transform (Weaver and Wuensch, 2013) revealed that the correlation coefficient was significantly greater for young than for older adults (Z = 2.988, p = 0.003). In fact, we observed a trend for a negative correlation between recognition performance and the AP-Corr coefficient in the older group, a pattern suggesting, again, that older adults who performed detail-oriented in-scan recollection did not answer the true-false questions best. Correlations between the AP-Corr whole-brain reactivation coefficient and the number of visual and auditory details recalled post scan were positive in both age groups (Fig. 6) and did not differ between groups (p = 0.943 and p = 0.609, respectively). Although these correlations with the whole-brain measure did not reach significance, the PLSC analysis indicates that the relationship between neural reactivation and recall performance was reliable in both age groups within several brain regions including the inferior parietal lobule, the posterior cingulate cortex, the fusiform gyrus, and the lingual gyrus.
Discussion
Our MVPA analyses identified posterior cortical regions involved in visual, auditory, and spatial processing that supported stimulus-specific representation during perception and during mental replay in young and older adults. We observed some evidence of aging-related dedifferentiation at perception, a phenomenon previously documented by other groups (Park et al., 2004, 2010, 2012; Payer et al., 2006; Voss et al., 2008; Goh et al., 2010; Carp et al., 2011), although not with the kind of complex multimodal stimuli used in the present study. More importantly, our results show that stimulus-specific dedifferentiation is larger when items must be consciously recalled from memory than when they are perceived directly. Furthermore, we showed that the magnitude of dedifferentiation at perception is insufficient to account for the magnitude of dedifferentiation observed for reactivation in our older group. Thus, although degraded sensory inputs can contribute to reduced representational specificity in older adults' nervous systems, the cognitive processes responsible for aging-related impairment in vivid mental replay from long-term memory extend beyond sensory processing, and most likely include higher-level processes that support associative memory and attention.
Our paradigm provided participants with multiple chances to build a detailed memory representation of each video by presenting each of them several times. It is likely that stimulus repetition provided older adults with the opportunity to compensate for reduced attentional and binding capacity at encoding (Craik and Rose, 2012), which may have accounted for the relatively subtle dedifferentiation we observed at perception. Nevertheless, even when provided with multiple encoding opportunities, older adults still could not reactivate video-specific patterns of neural activation as accurately as young adults. These findings indicate that aging might reduce the capacity to integrate memory details into rich and distinct representations (Chalfonte and Johnson, 1996; Ryan et al., 2007; Dickerson and Eichenbaum, 2010; Craik and Rose, 2012), or that it disrupts the capacity to reconstruct detailed representations from long-term memory (Levine et al., 2002; Addis et al., 2008, 2010; Piolino et al., 2009), or that it affects both encoding and retrieval processes.
Some evidence of aging-related dedifferentiation during memory tasks exists in the literature. For example, Carp et al. (2010) have shown that the specificity of the neural signal that distinguishes between visual and verbal material accessed from working memory is reduced in older adults. Most studies, however, have reported that aging reduces neural distinctiveness among different types of memory tasks, not between particular stimulus items. Rather than reflecting a decrease in representational specificity, these previous results may indicate a loss of distinction among the cognitive mechanisms engaged during different memory tasks (Grady, 2002; Dennis and Cabeza, 2011; St-Laurent et al., 2011; Sambataro et al., 2012). By examining dedifferentiation at both encoding and retrieval, our study was the first to demonstrate that stimulus-specific patterns of neural representation accessed from long-term memory lose specificity in old age. Our use of rich and complex multimodal stimuli also allowed us to assess representational specificity at the whole-brain level, rather than limiting our analysis to a restricted number of ROIs, as is often the case in MVPA studies that rely on a narrow set of predetermined stimulus categories.
Among the regions whose activity reflected reactivation and/or dedifferentiation, we identified left-lateralized temporal and prefrontal regions involved in language processing; regions involved in motor, somatosensory, auditory and visual processing; ventral temporal and medial parietal regions involved in spatial and scene processing (Epstein and Kanwisher, 1998; Epstein, 2008; Vann et al., 2009); and lateral parietal regions involved in recollection (Cabeza, 2008; Vilberg and Rugg, 2009; Cabeza et al., 2012; Johnson et al., 2013). The dedifferentiation we observed among such regions is consistent with evidence that aging is especially detrimental to recollection (Levine et al., 2002; Daselaar et al., 2006; Yonelinas et al., 2007; Piolino et al., 2009), the subjective phenomenon by which one travels mentally back in time to re-live past experiences in a vivid manner (Tulving, 2002). By reporting that representational specificity of memory is reduced in older adults' brains, we show that MVPA can provide an objective measure of the fidelity and richness of recollection.
We also demonstrated a clear link between behavioral memory performance and neural reactivation in both age groups. We replicated the well established finding that recall is typically reduced in old age, and that recognition is better preserved, especially when it is based on familiarity (Craik and McDowd, 1987; Jacoby, 1991; Jennings and Jacoby, 1997; Yonelinas et al., 2007; Luo and Craik, 2008). Interestingly, our behavioral measure of recall—which is known to rely on recollective processes (Ranganath, 2010)—was associated with stimulus-specific reactivation in both age groups, as evidenced by the PLSC analysis. Thus, even though recall performance and reactivation specificity were both reduced in older adults, the relationship between behavioral recall performance and neural reactivation was preserved in older adults. In other words, we observed that aging had a quantitative rather than a qualitative impact on the cognitive and neural processes that support recollection.
In addition, we observed that our behavioral measure of post-scan memory recognition correlated with performance on the behavioral recall task and with reactivation specificity in young but not in older adults. Thus, while older adults performed at the same level as young adults on the post-scan recognition task, brain imaging data suggest that detail-oriented in-scan recollection during mental replay benefited recognition performance in young adults, but that at least some older adults may have relied successfully on alternative strategies to perform the task. Older adults with high neural reactivation scores did not outperform older adults with low reactivation scores on the recognition task. The older participants who were successful at retrieving and binding multiple memory features into detailed representations at replay most likely benefited from these recollective skills during free recall. However, it is possible that some older adults with reduced capacity for recollection relied on in-scan encoding and rehearsal strategies that were rule-based or language-based, and these strategies may not have benefited their subsequent free recall performance, but successfully supported their performance on the recognition task.
This hypothesis is supported by a post hoc ROI analysis for which we assessed age differences in activation at replay in three brain regions involved in semantic and language processing: the left anterior temporal lobe (L-ATL), left posterior middle temporal gyrus (L-MTG), and left anterior ventrolateral prefrontal cortex (L-vlPFC). We used a mask derived from the Semantic label of the NeuroSynth database (http://neurosynth.org/features/semantic; Yarkoni et al., 2011) to identify peak coordinates within each region, and we averaged activity during mental replay within 8 mm spherical ROIs centered around these coordinates. Results showed that, on average, brain activation at replay was significantly greater in the older group than in the young group in the L-ATL (MNI coordinates: −57, −11, −5; t(25) = 3.210, p = 0.004) and the L-MTG (MNI coordinates: −57, −38, 0; t(25) = 2.190, p = 0.038), although it did not differ significantly in the L-vlPFC (MNI coordinates: −51, 27, −5; t(25) = 0.93, p = 0.360). Of note, activity within these three regions did not correlate with performance on the post-scan recognition task within the older group, which may reflect the relatively successful reliance on a variety of in-scan rehearsal strategies within our senior group.
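As an illustration only (the original extraction was not necessarily performed this way), the spherical-ROI averaging could be implemented with nilearn's sphere masker, using the peak coordinates reported above; nilearn and the variable names are assumptions.

```python
from nilearn.maskers import NiftiSpheresMasker

# Peak MNI coordinates (mm) taken from the text above.
SEEDS = {"L-ATL": (-57, -11, -5), "L-MTG": (-57, -38, 0), "L-vlPFC": (-51, 27, -5)}

def mean_replay_activity(replay_map_img):
    """Average replay-related activation within 8 mm spheres centered on each seed.

    replay_map_img : a normalized (MNI-space) activation map for one participant,
    e.g., a beta or contrast image for the mental replay condition (hypothetical input).
    """
    masker = NiftiSpheresMasker(seeds=list(SEEDS.values()), radius=8.0)
    values = masker.fit_transform(replay_map_img)    # shape: (1, n_seeds)
    return dict(zip(SEEDS.keys(), values.ravel()))
```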
Our analysis focused on brain regions showing stimulus-specific patterns of activation, but it is likely that other brain regions whose activity pattern did not distinguish among the different videos also contributed to recollection. For example, prefrontal regions that work in concert with posterior cortical regions during memory retrieval (Moscovitch, 1992; Svoboda et al., 2006), and medial temporal regions such as the hippocampus, which are known to support detailed recollection (Nadel and Moscovitch, 1997; Moscovitch et al., 2005; Moscovitch, 2008; Ranganath, 2010), may have mediated reactivation without showing similar activation patterns between perception and replay (Ritchey et al., 2013, but see Chadwick et al., 2010). It is therefore possible that activity in brain regions that show univariate age differences but lack stimulus-specific information correlates with whole-brain neural patterns of reactivation, a question that is worth further investigation.
Our results contribute to the aging and memory literature by showing that the distributed neural activation patterns associated with memories for complex stimuli are less specific in old age. We also demonstrate that MVPA can serve to quantify subjective phenomena such as recollection, and that MVPA measures can be correlated with measures of behavioral performance to shed light on the cognitive and neural processes that support memory performance.
Footnotes
This study was supported by a grant from the Canadian Institutes of Health Research (CIHR 106501) held by B.R.B., and by a fellowship from the Katz Foundation held by M.S.-L. We thank Oles Chepesiuk, John Paul Koning, and Sabrina Lemire-Rodger for their assistance, and Signy Sheldon, Daniela Palombo, Brian Levine, Jessica Arsenault, and Nathan Rose for helpful discussions.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Marie St-Laurent, 3560 Bathurst Street, Toronto, Ontario, Canada M6A 2E1. mstlaurent@research.baycrest.org