A common complaint of older adults is difficulty understanding speech, especially in challenging listening environments. In addition to well known declines in the peripheral auditory system that reduce audibility, age-related changes in central auditory and attention-related systems are hypothesized to have additive negative effects on speech recognition. We examined the extent to which functional and structural differences in speech- and attention-related cortex predicted differences in word recognition between 18 younger adults (19–39 years) and 18 older adults (61–79 years). Subjects performed a word recognition task in an MRI scanner where the intelligibility of words was parametrically varied. Older adults exhibited significantly poorer word recognition in a challenging listening condition compared with younger adults. An anteromedial Heschl's gyrus/superior temporal gyrus (HG/STG) region, engaged by the word recognition task, exhibited age group differences in gray matter volume and predicted word recognition in younger and older adults. Age group differences in anterior cingulate (ACC) activation were also observed. The association between HG gray matter volume, word recognition, and ACC activation was present after controlling for hearing loss. In younger and older adults, causal path modeling analyses demonstrated that individual variation in left HG/STG morphology affected word recognition performance, which was reflected by error monitoring activity in the dorsal ACC. These results have clinical implications for rehabilitation and suggest that some of the perceptual difficulties experienced by older adults are due to structural changes in HG/STG. More broadly, the results suggest the possibility that aging may exaggerate developmental limitations on the ability to recognize speech.
The percentage of people over the age of 65 in the United States will more than double in the next 20 years. This shift in demographics will have a profound impact on society as the number of people with age-related cognitive difficulties, and speech communication difficulties in particular, will substantially increase. Findings from several studies have shown age-related differences in speech recognition, even after accounting for differences in audibility (Dubno et al., 1984; Humes and Christopherson, 1991). These observations suggest that age-related changes in central auditory and/or higher-level cognitive functions are also responsible for speech recognition difficulties. However, there is limited evidence demonstrating which neural systems are most affected and which might be amenable to rehabilitation. Indeed, the poor success rate of hearing aids underscores the importance of identifying the reasons for age-related declines in speech recognition.
Recent electrophysiologic findings support the notion that some of the perceptual difficulties of older adults are related to factors affecting auditory processing within the central auditory system (Tremblay et al., 2002, 2003; Harkrider et al., 2005). Older adults had difficulty discriminating speech sounds that varied across temporal or spectral characteristics and these same speech sounds evoked abnormal P1–N1–P2 complex neural response patterns in older adults with and without hearing loss (Tremblay et al., 2002, 2003; Harkrider et al., 2005). These physiologic and behavioral measures suggest that some of the speech recognition difficulties experienced by older adults can be attributed to impaired processing in the aging central auditory system. Several studies have recently shown that older adults also require more effortful processing than younger adults to understand speech (Pichora-Fuller, 2003; Wingfield and Tun, 2007). Consistent with these observations, older adults tend to show increased engagement of frontal lobe systems, which support cognitive control, in demanding listening conditions (Sharp et al., 2006; Eckert et al., 2008).
Age-related changes in frontal lobe systems during word recognition suggest there is an attentional burden on older adults, but do not elucidate the reasons that older adults have speech recognition difficulties. The extant literature suggests that speech recognition difficulties could stem from declines in cognitive control systems associated with preserved word recognition (Eckert et al., 2008) and/or exaggerated declines in speech-related auditory cortex that would limit word recognition (Sowell et al., 2003; Salat et al., 2004). In the current study, we tested these predictions using a word recognition task in which word intelligibility was parametrically varied across four low pass filter conditions to localize speech- and attention-related cortex. We then examined the extent to which declining structural integrity of auditory and attention-related systems accounts for decreases in word recognition. Our broader goal was to identify individual variation in the structure and function of neural systems that predicts word recognition differences to provide the neurobiological foundation for studies designed to evaluate the efficacy of aural rehabilitation.
Materials and Methods
Two age groups were included in this study: younger [n = 18; mean age = 28.8 (6.7) years; 12 females] and older [n = 18; mean age = 70.5 (5.6) years; 13 females]. All subjects were right-handed native speakers of American English and none of the subjects had participated in previous listening experiments using the experimental stimuli. There were no significant differences between the groups for their years of education [younger = 19 (1.8) years; older = 17.5 (2.6) years], socioeconomic status [Hollingshead (1975): younger = 53.6 (8.9); older = 49.4 (11.1)], or degree of handedness [Edinburgh handedness questionnaire (Oldfield, 1971), range from −100 to 100 where 100 is strongly right-handed: younger = 90 (9.7); older = 94 (9.4)]. Each subject completed the Mini Mental State Examination (Folstein et al., 1983), a screening tool for assessing cognitive mental status, and had three or fewer errors, indicating little or no cognitive impairment [as reviewed by Tombaugh and McIntyre (1992)]. Subjects provided written informed consent before participating in this Medical University of South Carolina Institutional Review Board approved study.
Pure-tone thresholds at conventional frequencies were measured with a Madsen OB922 clinical audiometer calibrated to appropriate ANSI standards (American National Standards Institute, 2004) and equipped with TDH-39 headphones. Younger subjects had normal hearing (defined as thresholds ≤25 dB HL at 250 Hz to 8000 Hz). Pure-tone thresholds varied from normal hearing to moderate high-frequency hearing loss within the older group of subjects. All subjects had normal immittance measures and differences in thresholds between right and left ears did not exceed 15 dB at each frequency. Mean pure-tone audiometric thresholds (±1 SD) are shown in Figure 1A. Using independent samples t tests, older subjects had significantly higher thresholds than younger subjects at each frequency (t(34) ranging from −3.62 to −8.79, p ≤ 0.001). Although differences in audibility were reduced by the addition of a masking noise and low-pass filtering of the speech, as described below, age-related differences in word recognition may still result from impaired auditory processing. Thus, individual differences in pure-tone thresholds at each frequency were used to assess associations between age-related hearing loss, word recognition, and the imaging measures of brain structure and function.
Stimuli and word recognition task design.
To assess word recognition in the scanner, 120 words were presented in an event-related design. To systematically vary word intelligibility and increase task difficulty, words were presented in one of four low-pass frequency filter conditions (upper cutoff frequencies = 400 Hz, 1000 Hz, 1600 Hz, 3150 Hz; the lower cutoff frequency was fixed at 200 Hz). The words were nouns and verbs from a list of 400 monosyllabic consonant-vowel-consonant words used by Dirks et al. (2001). The words represented a normal distribution of lexical difficulty based on the combination of lexical features that influence word recognition difficulty [word frequency, the number of similar sounding words, and the mean word frequency of those similar sounding words (Luce and Pisoni, 1998)]. The words were presented at 75 dB SPL at precisely 2.5 s into the 8 s trial [scanning repetition time (TR)]. Ten silent trials were included in which no word was presented to reduce the predictability of stimulus onsets. Eprime software (Psychology Software Tools) and an IFIS-SA control system (Invivo) were used to present the words to the subjects in the scanner.
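The trial structure described above can be sketched as follows. This is a minimal illustration, not the presentation script used in the study: the even split of 30 words per filter condition, the random ordering, the placeholder word labels, and the function name `build_trial_list` are all assumptions, because the text states only the number of words, the four cutoff frequencies, and the ten silent trials.

```python
import random

FILTER_CUTOFFS_HZ = [400, 1000, 1600, 3150]  # upper cutoffs given in the text
LOWER_CUTOFF_HZ = 200                        # fixed lower cutoff given in the text
N_WORDS = 120
N_SILENT = 10


def build_trial_list(words, seed=0):
    """Assign words to filter conditions and interleave silent trials.

    Assumption: words are split evenly across the four conditions
    (30 per condition) and trial order is random; the text does not
    specify either detail.
    """
    if len(words) != N_WORDS:
        raise ValueError(f"expected {N_WORDS} words, got {len(words)}")
    rng = random.Random(seed)
    shuffled = list(words)
    rng.shuffle(shuffled)

    trials = []
    for i, word in enumerate(shuffled):
        cutoff = FILTER_CUTOFFS_HZ[i % len(FILTER_CUTOFFS_HZ)]
        trials.append({"word": word, "low_hz": LOWER_CUTOFF_HZ, "high_hz": cutoff})

    # Silent trials carry no word and no filter setting.
    trials += [{"word": None, "low_hz": None, "high_hz": None} for _ in range(N_SILENT)]
    rng.shuffle(trials)
    return trials


# Placeholder labels stand in for the monosyllabic test words.
trials = build_trial_list([f"word{i:03d}" for i in range(N_WORDS)])
```

Fixing the seed makes the sequence reproducible across sessions, which is one plausible design choice; a per-subject seed would work equally well in this sketch.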
Subjects were instructed to listen to the words while focusing on a white fixation cross. When a red cross was presented at 3 s into each 8 s trial, subjects responded with the word they heard or with “nope” if they could not recognize the word, or if no word was presented (silent trials), ensuring that a motor response was produced on each trial. Each response was recorded as correct, incorrect, or “nope” by two raters. An oral response was chosen so that the results were directly comparable to audiologic assessment of word recognition. Moreover, speech production tasks have been used successfully in other imaging studies of speech and language (Gracco et al., 2005; Shuster and Lemieux, 2005; Fridriksson et al., 2006; Eckert et al., 2008).
A challenge in aging research, particularly for auditory studies of the CNS, is separating the effects of age from the effects of hearing loss that typically accompanies aging. For example, as noted above, older subjects had significantly higher thresholds than younger subjects at each frequency tested, with the most robust effects occurring above 2000 Hz. One approach to reducing age-related differences in speech audibility is to present the words in the presence of a background noise, spectrally shaped to shift thresholds of all subjects. In the current study, a broadband masker was presented continuously at 62.5 dB SPL from a separate PC. The words were mixed with the broadband noise at 2.5 s into the 8 s trial using an audio mixer and were delivered to MR-compatible insert earphones (Sensimetrics). Signal levels were calibrated using a precision sound level meter (Larson Davis 800B). This masker was designed to produce thresholds of 20–25 dB HL from 250 to 2000 Hz, and 30 dB HL at 3000 Hz, and equated speech audibility for subjects whose audiometric thresholds were less than the masker level at a particular frequency. However, quiet thresholds of some older subjects were higher than the masker levels at some frequencies [number of subjects with thresholds above masker levels: 250–1000 Hz: n = 2; 2000 Hz: n = 4; 3000 Hz: n = 8 (Fig. 1B)]. Although most of the age-related differences in hearing levels occurred in the higher frequencies, low-pass filtering restricted the availability of high-frequency speech information to a similar extent for both younger and older subjects. In this way, the impact of age-related differences in audibility on word recognition was further reduced.
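The logic of the masking approach can be illustrated with a short sketch. The word and masker levels are taken from the text; the per-frequency masked thresholds and the example listener are hypothetical values chosen within the stated ranges, and the function names are introduced here only for illustration.

```python
WORD_LEVEL_DB_SPL = 75.0    # word presentation level from the text
MASKER_LEVEL_DB_SPL = 62.5  # broadband masker level from the text

# Hypothetical masked thresholds (dB HL). The text gives only a range of
# 20-25 dB HL from 250 to 2000 Hz and 30 dB HL at 3000 Hz.
MASKED_THRESHOLDS_DB_HL = {250: 20, 500: 20, 1000: 25, 2000: 25, 3000: 30}


def level_difference_db(signal_db_spl, noise_db_spl):
    """Overall level difference between the words and the masker, in dB."""
    return signal_db_spl - noise_db_spl


def frequencies_not_equated(quiet_thresholds_db_hl,
                            masked_thresholds_db_hl=MASKED_THRESHOLDS_DB_HL):
    """Frequencies (Hz) where the quiet threshold exceeds the masked threshold.

    At these frequencies the listener's own hearing loss, not the masker,
    limits audibility, so the masker does not equate audibility.
    """
    return sorted(
        freq for freq, masked in masked_thresholds_db_hl.items()
        if quiet_thresholds_db_hl[freq] > masked
    )


# Hypothetical older listener with a sloping high-frequency loss.
example_listener = {250: 10, 500: 15, 1000: 20, 2000: 35, 3000: 45}

snr = level_difference_db(WORD_LEVEL_DB_SPL, MASKER_LEVEL_DB_SPL)
not_equated = frequencies_not_equated(example_listener)
```

For this hypothetical listener the masker governs audibility through 1000 Hz but not at 2000 or 3000 Hz, which is the situation the counts in Figure 1B summarize for the older group.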
Structural and functional images were collected on a Philips 3T scanner using an eight-channel SENSE head coil. T1-weighted images were collected for brain structure analyses (160 slices with a 256 × 256 matrix, TR = 8.13 ms, TE = 3.7 ms, flip angle = 8°, slice thickness = 1 mm, and no slice gap). T2*-weighted functional images were acquired using a single shot echoplanar imaging (EPI) sequence that covers the whole brain (40 slices with a 64 × 64 matrix, TR = 8 s, TE = 30 ms, slice thickness = 3.25 mm, and a TA = 1647 ms). A sparse sampling design was used with a single multislice volume collected every 8 s. The sparse sampling design was selected to limit the effects of scanner noise on the stimuli and neural responses to the stimuli and to provide time for subjects to generate a verbal response and stabilize their heads before the next TR (Fridriksson et al., 2006; Eckert et al., 2008).
Structural image processing.
Individual images are typically normalized to a standardized template composed of young subjects (e.g., ICBM-152). To ensure that younger and older subject images were properly coregistered to a common coordinate space, a study-specific template was created using unified segmentation and diffeomorphic image registration (DARTEL) in SPM5 (Ashburner and Friston, 2005; Ashburner, 2007). Unified segmentation was performed to iteratively bias field correct and segment the images into their native space tissue components. This procedure also generated normalization parameters that were used during the DARTEL procedure to coregister the segmented gray matter images (Ashburner, 2007). The recursive DARTEL procrustes procedure involves diffeomorphic registration to preserve cortical topology using a membrane bending energy or Laplacian model. This procedure creates invertible and smooth deformations for each subject's native space gray matter image to a common coordinate space, thereby producing a template that is representative of the brain size and shape of all the participants. The flow fields that describe the spatial deformations were applied to each subject's gray matter image to normalize the images into a common coordinate space. The average DARTEL normalized gray matter image of all 36 subjects is presented in the figures below. A Gaussian smoothing kernel of 8 mm, as recommended because of the high degree of accurate registration across images (www.fil.ion.ucl.ac.uk/∼john/misc/dartel_guide.pdf), was used to ensure that the data were normally distributed and to limit false-positive results.
Functional image processing.
Image volumes, slices, and voxels with significant artifact were identified using the ArtRepair toolbox (http://cibsr.stanford.edu/tools/ArtRepair/ArtRepair.htm) based on scan-to-scan motion (1 SD change in head position) and outliers relative to the global mean signal (3 SD from the global mean). An average of 2.11 image volumes (SD = 1.51) were excluded for artifact from each subject's dataset. To integrate the functional and structural data, as well as to ensure normalization to the same coordinate space across younger and older subjects, the DARTEL flow fields for the T1 gray matter images were used to normalize the EPI data. The EPI dataset for each subject was coregistered to the T1-weighted image using the mutual information algorithm in SPM5 (Collignon et al., 1995). Before coregistration, the T1-weighted image was skull stripped using the Brain Extraction Tool in FSL [FMRIB, Oxford University (Smith, 2002)] to ensure that nonbrain tissue did not influence the alignment of the EPI data to the T1 image. The T1 and EPI images were visually inspected to ensure that they were properly coregistered. With the EPI datasets in the same space as the T1 image for each subject, the DARTEL flow fields were used to normalize the EPI data into the study-specific normalized space of the gray matter template. This approach enabled direct spatial comparisons between structural and functional datasets. The images were then smoothed using an 8 mm Gaussian kernel to ensure that the data were normally distributed and appropriate for parametric testing. On visual inspection, one-sample t test results obtained with the DARTEL normalized EPI data appeared to show superior specificity to gyral and sulcal cortical areas compared with standard ICBM normalization (results not presented). In addition to the two dummy scans omitted for each run, the first real scan from each run was omitted to limit longitudinal magnetization effects that occur at the beginning of each fMRI experiment.
The data were convolved with the SPM5 canonical hemodynamic response function and high-pass filtered at 128 s.
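The global-signal criterion used in the artifact screening above (volumes more than 3 SD from the global mean) can be sketched as below. This is not the ArtRepair implementation, which is a Matlab toolbox with its own procedures; the function name and the toy signal are hypothetical and serve only to show the thresholding rule.

```python
from statistics import mean, stdev


def flag_outlier_volumes(global_signal, n_sd=3.0):
    """Return indices of volumes whose global mean signal lies more than
    n_sd standard deviations from the mean across volumes."""
    center = mean(global_signal)
    spread = stdev(global_signal)
    if spread == 0:
        return []
    return [
        index for index, value in enumerate(global_signal)
        if abs(value - center) > n_sd * spread
    ]


# Toy series: 40 volumes with small fluctuations and one spike at index 17.
toy_signal = [100.0 + (i % 3 - 1) * 0.5 for i in range(40)]
toy_signal[17] = 130.0

flagged = flag_outlier_volumes(toy_signal)
```

Because the spike itself inflates the standard deviation, a rule of this kind is conservative in short series; with very few volumes a single outlier cannot exceed 3 SD at all.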
The following analyses were designed to determine the extent to which structural declines in speech-related and attention-related cortical areas predicted declines in word recognition. Word recognition during fMRI scanning was compared between the younger and older age groups using repeated-measures ANOVA. Significant group differences were further examined in Pearson correlation analyses to assess whether word recognition was significantly predicted by measures of brain structure and function. Partial correlation was performed to examine the specificity of these relations, as well as to control for effects of hearing loss.
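As an illustration of the partial correlation step, the first-order partial correlation between two measures x and y, controlling for a third measure z (for example, a pure-tone threshold), can be computed from the three pairwise Pearson correlations. The sketch below uses this standard formula; the variable names and toy data are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: z is uncorrelated with both x and y, so controlling for z
# leaves the x-y correlation unchanged.
toy_x = [1.0, 2.0, 3.0, 4.0]
toy_y = [2.0, 1.0, 4.0, 3.0]
toy_z = [1.0, -1.0, -1.0, 1.0]

r_plain = pearson_r(toy_x, toy_y)
r_partial = partial_r(toy_x, toy_y, toy_z)
```

When the control variable is related to both measures, the partial correlation departs from the simple correlation, which is the case of interest when testing whether hearing loss explains a structure-behavior association.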
The following steps were performed to obtain structural and functional predictors of word recognition. First level fixed-effects analyses were performed for each individual's functional images to identify brain regions exhibiting increasing activation with increasing word intelligibility. This analysis identified brain regions that were responsive to the perceptual attributes of the word stimuli. A separate first level fixed-effects analysis was performed to generate estimates that represented activity for correct versus incorrect task performance. One-sample t test analyses (random effects estimation) were performed within younger and older subjects to identify brain regions that were consistently activated with increasing word intelligibility and as a function of task performance. Two-sample t test analyses (random effects estimation) were performed to examine the age-related differences in brain regions engaged with increasing and decreasing word intelligibility, as well as age-related differences in brain regions engaged as a function of task performance. Based on the SPM results output, a joint statistical threshold of peak voxel p < 0.01 and cluster extent p < 0.01 was used for all of the second level analyses to be sensitive to sharp peak and broadly distributed effects (Poline et al., 1997; Eckert et al., 2008). A gray matter mask representing at least a 20% probability of gray matter across the sample was used to limit the analyses to gray matter regions and the number of statistical comparisons. Regions of interest showing age-related differences in activation were defined based on the peak and cluster extent thresholds. Contrast values were obtained from these regions of interest using MarsBar (Brett et al., 2002), and used in subsequent correlation analyses described above.
Voxel-based morphometry was performed using SPM5 to determine the extent to which age-related differences in word recognition and brain activation could be attributed to structural differences (FDR, p < 0.05). First, the analyses were limited to brain regions involved in word perception (increasing word intelligibility) and in task performance (incorrect vs correct) based on hypotheses that age-related atrophy within these regions would be associated with impaired word recognition. A binary mask of the combined increasing intelligibility (see Fig. 3) and incorrect > correct (see Fig. 4) functional results was created to identify speech responsive and task performance-related brain regions that exhibit age group differences in gray matter volume. In addition, whole-brain exploratory analyses were performed. The average voxelwise gray matter volume in regions showing age-related differences was extracted using MarsBar. An estimate of total gray matter volume was collected from the normalized gray matter images using custom Matlab (The MathWorks) code (http://www.cs.ucl.ac.uk/staff/G.Ridgway/vbm/get_totals.m). After controlling for differences in total gray matter volume, age-related differences in gray matter were examined to determine whether declines in specific speech responsive and task-related brain regions significantly predict word recognition or age-related differences in activation, and whether these regions were significantly related to hearing loss.
Finally, path analysis was performed using AMOS 16.0.1 (Arbuckle, 2005) to test causal models, which were developed to explain the significant associations between the structural, functional, and word recognition variables that exhibited age-related differences. Path analysis tests causal relationships and connection strengths that best predict the observed variance-covariance structure of the data. Subsequent multiple-group factor analyses were used to examine whether the identified model (path coefficients, error variance, and covariances between variables) was equal across younger and older subject groups. Statistical inferences about group differences are based on a nested models approach, which entailed increasingly restrictive constraints to test the equalities of path coefficients, error variances, and covariances across the age groups. This included the comparison of an independent model, in which all connections are allowed to vary between younger and older subject groups, to increasingly constrained models, in which given variables (structural weights, error variance, and covariance) were forced to be equal for the groups.
Initial path model analyses were conducted within younger and older age groups to identify a common model. We used the maximum likelihood estimate method to examine the fit of the models. Models were evaluated using 1000 bootstrap samples and fit indices that included the χ² estimate and the root mean square error of approximation [RMSEA (Browne and Cudeck, 1993)] with a 90% confidence limit. A model was rejected as having a poor fit of the data if the χ² probability value of the model was <0.05 and the RMSEA value was >0.05 (Browne and Cudeck, 1993). The subsamples were then combined to conduct full sample multiple-group factor analyses. Goodness of fit was evaluated using the RMSEA and comparative fit index (CFI). Acceptable model fit was defined by the following criteria: RMSEA <0.05 and CFI >0.90. Multiple indices of model fit were used because they provide conservative and reliable evaluations of causal models (Jaccard and Wan, 1996). As the fitted models were nested, comparative fit was evaluated by the χ² difference test.
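For readers unfamiliar with these fit indices, the sketch below shows commonly used textbook definitions of the RMSEA and CFI and applies the acceptance criteria stated above. It is an illustration under assumptions, not the AMOS computation: software packages differ in details (for example, N versus N − 1 in the RMSEA denominator), and the degrees of freedom and baseline-model values in the example are hypothetical because the text does not report them.

```python
from math import sqrt


def rmsea(chi2, df, n):
    """Root mean square error of approximation (point estimate).

    Uses sqrt(max(chi2 - df, 0) / (df * (n - 1))); some programs use n
    instead of n - 1.
    """
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative fit index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, d_model, 0.0)
    if d_baseline == 0.0:
        return 1.0
    return 1.0 - d_model / d_baseline


def acceptable_fit(rmsea_value, cfi_value):
    """Criteria stated in the text: RMSEA < 0.05 and CFI > 0.90."""
    return rmsea_value < 0.05 and cfi_value > 0.90


# Hypothetical inputs: df = 3 and the baseline chi-square and df are
# assumptions made only to exercise the functions.
example_rmsea = rmsea(chi2=2.26, df=3, n=18)
example_cfi = cfi(chi2_model=2.26, df_model=3, chi2_baseline=40.0, df_baseline=6)
example_ok = acceptable_fit(example_rmsea, example_cfi)
```

Whenever the model χ² is smaller than its degrees of freedom, both indices reach their best possible values (RMSEA = 0, CFI = 1), which is why small-sample models with a nonsignificant χ² often report exactly these values.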
Word recognition scores (percentage correct) were calculated for each of the low-pass filter frequency cutoff conditions, 400, 1000, 1600, and 3150 Hz. Word recognition varied linearly with filter cutoff frequency [(Fig. 2) (400 Hz: younger mean = 4.89, ±1 SD = 3.74; older mean = 2.98, ±1 SD = 3.60; 1000 Hz: younger mean = 20.93, ±1 SD = 6.03; older mean = 18.85, ±1 SD = 5.86; 1600 Hz: younger mean = 62.59, ±1 SD = 9.29; older mean = 51.62, ±1 SD = 8.13; 3150 Hz: younger mean = 91.21, ±1 SD = 8.05; older mean = 90.32, ±1 SD = 7.32)]. A repeated-measures ANOVA, with filter cutoff frequency as a repeated measure and age as a grouping factor, revealed a significant main effect of age [F(1,34) = 8.39, p = 0.007] and a filter condition by age interaction [F(1,34) = 5.09, p = 0.003]. Post hoc independent samples t tests indicated that significant age group differences were observed only for the 1600 Hz condition [t = 3.77, p < 0.001], in which the words were intelligible but the listening condition was demanding. This group difference did not appear to be related to hearing loss, as individual variation in pure-tone thresholds (250 Hz to 8000 Hz) did not predict word recognition in older adults (r = −0.07 to 0.32, ns).
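The post hoc comparison for the 1600 Hz condition can be checked approximately from the summary statistics above. The sketch below computes a pooled-variance independent samples t statistic from group means, standard deviations, and sample sizes (n = 18 per group); the function name is introduced here for illustration, and small discrepancies from the reported value are expected because the inputs are rounded.

```python
from math import sqrt


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance independent samples t statistic and its df."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


# 1600 Hz condition: younger versus older adults, values from the text.
t_1600, df_1600 = t_from_summary(62.59, 9.29, 18, 51.62, 8.13, 18)
```

With these inputs the statistic agrees with the reported t = 3.77 on 34 degrees of freedom to two decimal places.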
Functionally defined word recognition systems
Across the sample, increasing word intelligibility (400 Hz to 3150 Hz filter condition) was associated with increasing activity in temporal lobe regions previously shown to be responsive to speech (Fig. 3) (Binder et al., 2000; Fridriksson et al., 2006; Scott et al., 2006; Sharp et al., 2006; Eckert et al., 2008; Obleser, 2008). In particular, bilateral medial Heschl's gyrus (HG) and anterior superior temporal sulcus/superior temporal gyrus (STS/STG) and inferior frontal gyrus regions exhibited increasing activity with increasing word intelligibility. Differences were observed in the extent of brain activation for younger and older subjects (Fig. 3B), but no significant age group differences were observed in the responsiveness of temporal lobe cortex to increasing intelligibility (supplemental Fig. 1A, available at www.jneurosci.org as supplemental material). Importantly, there were no significant effects of hearing loss on brain responses to increasing word intelligibility.
Several brain regions exhibited increased activity as a function of response accuracy (incorrect versus correct responses). Frontal areas, including two peaks within the anterior cingulate cortex (ACC), one occurring in the dorsal ACC (dACC) and a second peak extending from the ACC to supplementary motor area (SMA), and bilateral anterior insula/frontal operculum (AI/FO) demonstrated increased activity for incorrect compared with correct word recognition (Fig. 4A). A similar, but weaker, pattern of results was observed for decreasing word intelligibility (results not shown), indicating that these frontal responses were principally driven by performance and response selection rather than stimulus attributes. Age group differences in functional activation were observed in the dACC and ACC/SMA (Fig. 4B). After controlling for age group differences in accuracy, the ACC/SMA peak remained while the dACC did not, suggesting separate peaks of activity related to age and accuracy (Fig. 4B,C). ACC/SMA activation was not related to word recognition in the 1600 Hz condition (r = −0.20, ns). dACC activity was negatively associated with word recognition (supplemental Fig. 2A,B, available at www.jneurosci.org as supplemental material). Younger adults exhibited robust activation of dACC and ACC/SMA when making an incorrect response, while older adults showed a similar degree of activity across correct and incorrect trials (Fig. 4B–D).
Structural differences within word recognition systems
Several brain regions exhibited significant age group differences in gray matter volume, with younger adults exhibiting greater gray matter volume than older adults (FDR p < 0.05) (Table 1, Fig. 5A; supplemental Figs. 3, 4, available at www.jneurosci.org as supplemental material). We examined the extent to which these age group differences in gray matter volume were associated with word recognition in the 1600 Hz condition, where older adults exhibited significantly poorer word recognition than younger adults. After controlling for total gray matter volume, individual variation in gray matter in the left HG/STG (and extending into the posterior insula) specifically predicted word recognition in the 1600 Hz condition (Table 1; supplemental Fig. 4, available at www.jneurosci.org as supplemental material). This association between HG/STG gray matter volume and word recognition was independent of individual variability in pure tone thresholds (250 Hz to 8000 Hz: partial r = 0.58–0.65, p < 0.001), and was present within older and younger groups (Fig. 5B). These results reflect an association between individual variability in the HG/STG and word recognition measures across both groups rather than an association that is dependent on clustering of the results for the two age groups.
Previously, we observed that age group differences in the structural integrity of speech-responsive cortex were associated with increased activation in cognitive control systems (Eckert et al., 2008). In the current study we observed a similar relationship between age group differences in left HG/STG and ACC activation (Fig. 5C,D). The y-axis in Figure 5, C and D, represents the average SPM correct–incorrect contrast value from the ACC/SMA and dACC clusters. Increased activation during incorrect trials is presented as negative values in Figure 5, C and D. Elevated left HG/STG gray matter volume was significantly related to greater ACC/SMA and dACC activation for incorrect compared with correct trials, which is consistent with the finding that older adults were more likely to engage these regions across correct and incorrect trials. Unlike the association between HG/STG morphology and word recognition, the relationship between individual variability in HG/STG gray matter volume and ACC activation was dependent on the inclusion of both younger and older subject groups.
Causal path models and confirmatory factor analyses
Causal path models were examined to determine the extent to which age-related differences in gray matter volume of left HG/STG and ACC activation could explain the word recognition declines we observed in the older adults. Latent variables such as gender and handedness were not included in the model because they were not associated with age-related differences in gray matter, functional activation, or word recognition. A theoretical model was developed based on previous research demonstrating the following: (1) an association between structural integrity of left HG and speech processing (Golestani et al., 2002, 2007; Golestani and Pallier, 2007; Wong et al., 2008b); (2) age-related changes in frontal lobe systems and speech recognition (Sharp et al., 2006; Eckert et al., 2008); and (3) the role of the dACC in error monitoring (Carter et al., 1998; Kiehl et al., 2000; Braver et al., 2001; Menon et al., 2001; Sharp et al., 2006). Specifically, we tested the hypothesis that individual variation in left HG/STG morphology influences word recognition performance, which is reflected by dACC error monitoring activity within older and younger adults (Fig. 6). The ACC/SMA region exhibiting age-related differences in activation was included in the model because of its strong association with dACC activation and because the degree to which the ACC/SMA is engaged has been related to reaction time (Viallet et al., 1995; Leuthold and Jentzsch, 2002; Hester et al., 2004; Perez et al., 2008) and increased verbal monitoring (Christoffels et al., 2007).
Separate causal path model analyses were conducted to ensure the model was a good fit in the younger and older subsamples, and to reduce sample bias of the model. The model outlined in Figure 6 exhibited a good fit and was retained for both younger adults (χ² = 2.26, ns; RMSEA = 0, 90th percentile confidence limit = 0.00–0.37, CFI = 1.0) and older adults (χ² = 2.87, ns; RMSEA = 0, 90th percentile confidence limit = 0.00–0.402, CFI = 1.0). Across the subjects, there were significant causal links from left HG/STG gray matter to word recognition and from word recognition to dACC activity (Table 2). Equality constraints to the structural weights did not significantly degrade the fit of the model (χ²diff(2) = 0.73, ns; RMSEA = 0.00, 90th percentile confidence limit = 0.00–0.16, CFI = 1.0). Similarly, the model that constrained the error variances did not change the fit of the solution (χ²diff(6) = 1.77, ns; RMSEA = 0.00, 90th percentile confidence limit = 0.00–0.09, CFI = 1.0). The final model, which held the covariances equal, produced a significant decrease in goodness of fit (χ²diff(7) = 14.29, p = 0.048; RMSEA = 0.12, 90th percentile confidence limit = 0.00–0.23, CFI = 0.77), which was due to dACC and ACC/SMA activity exhibiting more tightly coupled variation in younger than older adults. These causal path model results demonstrate the following: (1) a high degree of measurement invariance in younger and older adults; (2) variation in the morphology of left medial HG/STG is predictive of word recognition in both younger and older adults; and (3) declining word recognition leads to an upregulation in activity of the dACC.
In addition to well known declines in the peripheral auditory system that contribute to speech recognition difficulties, age-related changes in central auditory and attention-related systems are hypothesized to compound speech recognition difficulty. We hypothesized that structural declines in speech-related cortex account for some of the age-related changes in speech recognition. Consistent with this hypothesis, robust age-related differences in auditory cortex morphology were observed and were predictive of age-related differences in word recognition. Furthermore, the relationship between structural integrity of auditory cortex and word recognition was present in both younger and older adults. These results indicate that some of the perceptual difficulties experienced by older adults can be attributed to age-related structural changes in auditory cortex that shift older adults lower along a normal continuum of word recognition that is determined, at least in part, by auditory cortex architecture.
Structural changes within word recognition systems
It is well established that aging is associated with pronounced deterioration of brain morphology [for review, see Raz and Rodrigue (2006)]. Consistent with the aging literature, we observed significant age group gray matter volume differences in functionally defined speech- and attention-related cortex, including the left HG/STG, ACC/SMA, and AI/FO. Across these regions, only left medial HG/STG gray matter volume predicted word recognition after controlling for total gray matter volume. These results suggest a specific association between auditory cortex and speech recognition. In support of this premise, the left medial HG/STG has been associated with temporal processing and speech-in-noise detection (Patterson et al., 2002; Wong et al., 2008a, 2009). In addition, older adults with speech recognition difficulties exhibit degraded speech representations at the level of auditory cortex (Tremblay et al., 2002, 2003; Harkrider et al., 2005, 2006).
The association between word recognition and HG/STG morphology was observed in both younger and older adults. Several recent studies have demonstrated that HG morphology predicts auditory learning, pitch processing, and temporal processing, thereby demonstrating the importance of low-level auditory cortex in speech processing. For example, measures of gray matter (Wong et al., 2008b) and white matter (Golestani et al., 2002, 2007) volume in the left HG were predictive of the speed of learning and ability to learn foreign speech sounds that differed spectrally and temporally from the learner's native language. People who were less successful or slower learners exhibited smaller HG volume in the left hemisphere. In addition, increased gray matter volume in auditory cortical regions has been observed in expert listeners such as musicians compared with nonmusicians (Schneider et al., 2002, 2005a,b; Gaser and Schlaug, 2003). In support of the premise that individual variation in HG/STG morphology reflects individual variability in auditory processing, atypical morphology has also been observed in people with oral and written language disability, including specific language impairment (Billingsley et al., 2003), dyslexia (Leonard et al., 2001, 2002), and stuttering (Foundas et al., 2001, 2004; Beal et al., 2007). We observed the most robust voxelwise effects in medial low-level auditory cortex rather than lateral temporal regions (supplemental Fig. 4, available at www.jneurosci.org as supplemental material). In addition to the functional explanations for this association presented below, it is possible that increased sulcal/gyral variability in the morphology of Heschl's gyrus (Leonard et al., 1998) reduces the likelihood of observing voxelwise effects in lateral temporal lobe regions relative to more medial regions where the morphology is less variable.
Given associations observed between low-level auditory cortex morphology and language function, however, a plausible explanation for the current findings is that structural declines and atypical development of low-level auditory cortex lead to a degraded speech signal, thereby limiting the ability to understand speech in challenging listening environments. The results of this study also suggest the intriguing possibility that aging may exaggerate developmental limitations on speech recognition.
The strong association between low-level auditory cortex and word recognition does not preclude the possibility that medial HG is a site where top-down modulation may occur to aid in the representation of target stimuli and suppression of irrelevant information. In fact, individual variation in HG gray matter volume was associated with age-related differences in attention-related frontal lobe activation. Prior electrophysiologic studies of animals have shown that attending to a stimulus enhances the sensitivity to that stimulus by suppressing the neuronal responses to irrelevant stimuli and increasing responses to attended stimuli (Reynolds et al., 2000; Reynolds and Desimone, 2003; Fries et al., 2008). There is some evidence that top-down suppression, which is behaviorally and physiologically atypical in older adults, occurs at low-level sensory cortex (Fries et al., 2008) that is upstream from brain regions involved in object recognition (Gazzaley et al., 2005, 2008). In particular, Gazzaley et al. (2008) demonstrated that in older adults, atypical suppression of irrelevant information within low-level visual cortex resulted in impaired performance despite successful suppression at later levels of processing. Atypical modulation in low-level auditory cortex could explain why medial HG gray matter volume was related to word recognition rather than anterolateral STG/STS regions that typically show the greatest responsiveness to speech sounds (Binder et al., 2000; Scott et al., 2006; Sharp et al., 2006; Eckert et al., 2008; Wong et al., 2008).
The age-related differences in ACC activation do not provide direct evidence for top-down modulation of auditory cortex, however. Age-related differences in ACC/SMA and dACC regions did not reflect effective compensatory engagement of cognitive control systems. In particular, changes in dACC activation were associated with poor word recognition (supplemental Fig. 2B, available at www.jneurosci.org as supplemental material). A similar relation between dACC activity and age-related declines in speech comprehension has been observed previously (Sharp et al., 2006). Together with the results of this study, the literature on aging and evidence linking dACC activation to failing performance (Carter et al., 1998; Kiehl et al., 2000; Braver et al., 2001; Menon et al., 2001; Sharp et al., 2006) indicate that age-related changes in dACC activation reflect an upregulation of error monitoring systems. The age-related differences in ACC/SMA activity, which were not related to performance, probably reflect increased conflict at the level of response selection in older adults (Botvinick et al., 2004), or continuous monitoring of verbal responses and increased speech-motor planning (Christoffels et al., 2007). More broadly, the recruitment of ACC/SMA with age may represent an additional evaluative signal consistent with an increased reliance on executive systems in older adults or motor planning across task conditions.
Causal path models
In the current study we observed significant age-related differences in the structure of medial HG/STG, the activation of the dACC, and word recognition. However, it was not clear whether the structural and/or functional differences reflect underlying causes of declining word recognition in older adults or are indirectly related through age. We examined specific causal path models, which were based on previous findings showing an association between left HG morphology and speech processing (Golestani et al., 2002, 2007; Wong et al., 2008b) as well as age-related changes in frontal lobe systems and speech recognition (Sharp et al., 2006; Eckert et al., 2008). The hypothesis that HG/STG gray matter volume has a causal impact on word recognition was supported by the present data. The lack of significant differences in the causal path model between age groups provides support for the hypothesis that HG/STG morphology has a causal impact on word recognition in older and younger adults. The negative relationship between word recognition and dACC activation is again consistent with the role of the dACC in error monitoring. This error monitoring effect did not appear compensatory in either age group, although one could argue that age-related differences in word recognition would have been more pronounced without the engagement of the dACC across the experiment. Together, the causal path models and the age group differences in left HG/STG gray matter volume, word recognition, and dACC activation show that aging shifts older adults lower along a continuum of individual variation in brain morphology that has a causal influence on word recognition in challenging listening conditions. Longitudinal aging studies examining changes in word recognition and HG/STG morphology would provide a direct test of our causal model for word recognition.
The results of this study are consistent with the following conclusions: (1) individual variation in the morphology of speech-responsive auditory cortex is predictive of word recognition in challenging listening conditions; (2) aging affects HG/STG morphology and may exaggerate speech recognition difficulties of people already limited by low HG/STG gray matter volume; and (3) the structural differences in the HG/STG were observed even after controlling for hearing loss, suggesting independent effects of age on the peripheral and central auditory systems. These results may have important implications for developing effective interventions for age-related declines in speech understanding. The primary factor believed to contribute to deficits in speech understanding is hearing loss and the resulting decrease in audibility. Current methods of rehabilitation focus on the use of amplification to restore important speech information and improve speech recognition. However, only a small percentage of older adults who could benefit from amplification are successful hearing aid users. While hearing aids may improve audibility for older adults, structural declines in auditory cortex may still lead to degraded auditory representations and poorer word recognition in challenging listening conditions. The results of this study suggest that hearing aid efficacy is determined, at least in part, by the structural integrity of low-level auditory cortex.
This investigation was conducted in a facility constructed with support from Research Facilities Improvement Program Grant C06 RR14516 from the National Center for Research Resources–National Institutes of Health. This research was conducted while Mark A. Eckert was an American Foundation for Aging Research grant recipient. This work was supported by the National Institute on Deafness and Other Communication Disorders (P50 DC00422 and K23 DC008787). We thank the participants of this study and the Medical University of South Carolina Center for Advanced Imaging Research.
- Correspondence should be addressed to either Dr. Kelly C. Harris or Dr. Mark A. Eckert, 135 Rutledge Avenue, MSC 550, Charleston, SC 29425-5500, or