Abstract
Hemispheric asymmetries in the processing of elemental speech sounds appear to be critical for normal speech perception. This study investigated the effects of age on hemispheric asymmetry observed in the neurophysiological responses to speech stimuli in three groups of normal hearing, right-handed subjects: children (ages, 8–11 years), young adults (ages, 20–25 years), and older adults (ages > 55 years). Peak-to-peak response amplitudes of the auditory cortical P1–N1 complex obtained over right and left temporal lobes were examined to determine the degree of left/right asymmetry in the neurophysiological responses elicited by synthetic speech syllables in each of the three subject groups. In addition, mismatch negativity (MMN) responses, which are elicited by acoustic change, were obtained. Whereas children and young adults demonstrated larger P1–N1-evoked response amplitudes over the left temporal lobe than over the right, responses from elderly subjects were symmetrical. In contrast, MMN responses, which reflect an echoic memory process, were symmetrical in all subject groups. The differences observed in the neurophysiological responses were accompanied by a finding of significantly poorer ability to discriminate speech syllables involving rapid spectrotemporal changes in the older adult group. This study demonstrates a biological, age-related change in the neural representation of basic speech sounds and suggests one possible underlying mechanism for the speech perception difficulties exhibited by aging adults. Furthermore, results of this study support previous findings suggesting a dissociation between neural mechanisms underlying those processes that reflect the basic representation of sound structure and those that represent auditory echoic memory and stimulus change.
Hemispheric asymmetries have been linked to a variety of perceptual and language functions, including the processing of elemental acoustic features of speech in humans (Phillips and Farmer, 1990; Sharma et al., 1994) and in animal models (Fitch et al., 1993). The finding that individuals with left hemisphere cortical damage demonstrate deficits in the perception of speech signals supports the importance of the left hemisphere for normal speech perception (Auerbach et al., 1982; Phillips and Farmer, 1990). Moreover, abnormal patterns of asymmetry have been linked to language-learning problems in children (Obrzut et al., 1983; Dawson et al., 1989).
Studies of the neurophysiological representation of acoustic stimuli have demonstrated that stimuli with complex speech-like acoustic properties, including rapid spectrotemporal changes, yield greater activation in auditory cortex over the left hemisphere (Elmo, 1987; Zatorre et al., 1992; Belin et al., 1998). This asymmetry in the neurophysiological representation of basic speech signals has been shown to occur even at the thalamic level and regardless of whether binaural or monaural stimulation is used (King et al., 1999).
It has been suggested that age-related changes in the pattern of hemispheric asymmetry may underlie some of the auditory perceptual difficulties experienced by aging adults (Jerger and Jordan, 1992; Marvel et al., 1992; Jerger et al., 1994; Pekkonen et al., 1995). The purpose of this study was to investigate whether age affects the degree of left/right hemispheric asymmetry in the neural representation of monaurally presented speech stimuli. A further purpose was to determine whether the ability to make fine-grained acoustic discriminations of speech signals was affected by age.
MATERIALS AND METHODS
Subjects
Subjects for this study were right-handed females in three age groups: children (ages, 8–11 years), young adults (ages, 20–25 years), and older adults (ages > 55 years). None had a history of neurological or otological disease or trauma. All subjects evidenced normal peripheral hearing sensitivity, defined as pure-tone thresholds < 25 dB for octave frequencies of 500–8000 Hz. All subjects were paid for their participation.
Evoked potential stimulus and recording parameters
Evoked potential electrophysiological responses reflect processes that require synchronous activity across populations of neurons. Two types of auditory-evoked responses—the auditory cortical P1–N1 response and the mismatch negativity (MMN) response—were elicited by synthetic speech syllables from 15 children, 11 young adults, and 10 elderly adults. Stimuli consisted of two synthesized consonant and vowel (CV) syllables along a /da/-to-/ga/ continuum that differed in the onset frequency of the third formant (F3). The duration of both stimuli was 100 msec, with a 40 msec formant transition. The onset frequency of F3 was 2580 and 2300 Hz for /da/ and /ga/, respectively. The acoustic difference between the two stimuli was easily discriminated by all subjects psychophysically.
Stimulus files were downloaded from a Klatt synthesizer to a personal computer (PC)-based stimulus delivery system that controlled the time of delivery and stimulus intensity and triggered the PC-based evoked potential-averaging system. Stimuli were presented at 75 dB sound pressure level (SPL) to the right ear of each subject through insert earphones (Etymotic ER-2) at a rate of 1.9/sec. The use of monaural stimulus presentation was necessitated by the paradigm of the present study. First, the evoked response recording required multiple sessions of 2 hr each. Second, the MMN response can be affected by attention (Woldorff et al., 1991; Alho et al., 1992; Woods et al., 1992). For these reasons, subjects were seated comfortably in a reclining chair and allowed to view videotapes of their choice during testing; the left ear was unoccluded, and the videotape audio levels were kept below 40 dB SPL (A-weighted) to not interfere with the recording and to allow the subject to hear the video soundtrack. This paradigm helped ensure that the subjects were unlikely to attend to the test stimuli because the video soundtrack was inherently more interesting while, at the same time, minimizing changes in the level of arousal throughout the test session. This paradigm also was instrumental in encouraging the subjects to sit quietly for the lengthy test sessions, because they were able to view full-length movies. In addition, the use of monaural stimulation in the evaluation of hemispheric asymmetry and topography of auditory-evoked potentials in humans is not unprecedented (Pekkonen et al., 1995), and previous research has indicated that consistent patterns of hemispheric asymmetry in the neurophysiological representation of speech signals occur regardless of whether right ear, left ear, or binaural stimulation is used (King et al., 1999). 
Finally, because the mode of stimulation was constant across all subjects, any differences observed in the topography of responses between subject groups could not be attributed to stimulus delivery issues.
Electrophysiological responses were obtained using a recording window that included a prestimulus baseline of 100 msec and a poststimulus time window of 500 msec. Evoked responses were analog bandpass filtered on-line from 0.1 to 100 Hz (12 dB/octave roll-off). Responses were recorded over the right and left temporal lobes (TR and TL, respectively) with a noncephalic (nose tip) reference. TR was located halfway between electrode sites T4 and T6, and TL was located halfway between T3 and T5 according to the International 10–20 System (Jasper, 1958). A forehead electrode served as the ground. Eye movements were monitored with a supraorbital-to-lateral canthus bipolar electrode montage.
The MMN and P1–N1 responses were obtained using procedures that have been described previously (Kraus et al., 1996). The /ga/ and /da/ stimuli served as the standard and deviant stimuli, respectively, in an oddball paradigm. Stimuli were presented in a pseudorandom sequence with at least three standard stimuli separating presentations of deviant stimuli. The deviant probability of occurrence was 10%. Twenty standard stimuli preceded the occurrence of the first deviant stimulus, and responses to standard stimuli immediately after the occurrence of a deviant stimulus were excluded from the average.
Evoked responses elicited by standard and deviant stimuli were averaged separately. For each subject, responses to ∼250 deviant (/da/) stimuli were obtained along with responses to 1800–2500 standard (/ga/) stimuli. In addition, responses to 1800–2500 stimulus presentations of the deviant (/da/) stimulus presented alone were obtained.
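The ordering constraints stated above (a 10% deviant probability, at least three standards between deviants, 20 standards before the first deviant, and exclusion of the standard that immediately follows a deviant) can be sketched in Python. This is an illustrative sketch only and is not the stimulus delivery software used in the study; the function names, the seeded random generator, and the rule used to adjust the per-trial deviant probability are assumptions.

```python
import random

def make_oddball_sequence(n_trials, p_deviant=0.10, min_gap=3, n_lead=20, seed=0):
    """Return a list of 'S' (standard) and 'D' (deviant) labels.

    Assumption: a deviant may occur only after n_lead leading standards and
    after at least min_gap standards since the previous deviant. On eligible
    trials the deviant probability is raised to q so that the long-run
    deviant rate stays close to p_deviant.
    """
    rng = random.Random(seed)
    q = p_deviant / (1.0 - p_deviant * min_gap)  # 0.10 / 0.70 for the defaults
    seq = []
    since_deviant = 0  # standards presented since the last deviant
    for i in range(n_trials):
        eligible = i >= n_lead and since_deviant >= min_gap
        if eligible and rng.random() < q:
            seq.append("D")
            since_deviant = 0
        else:
            seq.append("S")
            since_deviant += 1
    return seq

def averaged_standard_indices(seq):
    """Indices of standards kept for averaging: those not directly after a deviant."""
    return [i for i, s in enumerate(seq)
            if s == "S" and (i == 0 or seq[i - 1] != "D")]
```

With the default arguments, a 2500-trial sequence contains roughly 250 deviants, which is in line with the trial counts reported above.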
Speech sound discrimination stimuli and response parameters
The speech sound discrimination procedure has been described elsewhere (Carrell et al., 1999; Kraus et al., 1999). A parameter estimation by sequential tracking (PEST) procedure was used to evaluate just noticeable differences (JNDs) for synthesized CV speech continua in 17 children, 12 young adults, and 12 older adults. Continua were created using a Klatt synthesizer and represented differences in the third formant onset frequency (/da/ to /ga/). Previous research has shown that individuals with auditory perceptual deficits and/or auditory cortex lesions exhibit deficits perceiving rapid transitions that characterize many consonants, whereas perception of slowly changing, steady-state sounds is not affected (Phillips and Farmer, 1990). An additional continuum that represented changes in the duration of the first and second formants (/ba/ to /wa/), shown to be less vulnerable to misperception (Kraus et al., 1996), also was created for use as a control to ensure that subjects understood and were capable of performing the task.
For both continua, the end points were defined by ideal examples of the syllables (Pisoni et al., 1983; Walley and Carrell, 1983). For the /da/-to-/ga/ continuum, the third formant onset frequency varied from 2580 Hz (/da/) to 2180 Hz (/ga/) in 40 steps of 10 Hz each. The formant transition duration was 40 msec. For the /ba/-to-/wa/ continuum, the duration of the first and second formant transition varied from 10 msec (/ba/) to 40 msec (/wa/) in 30 steps of 1 msec each. Thus, a JND of 7 for the /da/-to-/ga/ task would indicate that the subject could discriminate a difference of 70 Hz in onset frequency of the third formant. Likewise, a JND of 7 for the /ba/-to-/wa/ task would indicate that the subject could discriminate a difference of 7 msec in formant transition, or voice onset time. The total stimulus duration for all stimuli was 100 msec.
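The relation between a JND expressed in continuum steps and the corresponding physical difference can be made explicit with a short Python sketch. The end points and step counts are taken from the description above; the dictionary layout and function names are hypothetical and are used here only for illustration.

```python
# Continuum definitions taken from the text; names are illustrative only.
CONTINUA = {
    "da-ga": {"start": 2580, "end": 2180, "n_steps": 40, "unit": "Hz"},
    "ba-wa": {"start": 10, "end": 40, "n_steps": 30, "unit": "msec"},
}

def step_size(name):
    """Physical size of one continuum step (10 Hz or 1 msec)."""
    c = CONTINUA[name]
    return abs(c["end"] - c["start"]) / c["n_steps"]

def continuum_values(name):
    """All stimulus values along the continuum, end points included."""
    c = CONTINUA[name]
    delta = (c["end"] - c["start"]) / c["n_steps"]
    return [c["start"] + k * delta for k in range(c["n_steps"] + 1)]

def jnd_in_physical_units(jnd_steps, name):
    """Convert a JND in steps to (value, unit)."""
    return jnd_steps * step_size(name), CONTINUA[name]["unit"]
```

For example, `jnd_in_physical_units(7, "da-ga")` corresponds to the 70 Hz difference in F3 onset frequency described above.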
A four-interval, forced choice procedure was used to prevent response bias. In each trial, subjects were presented with two pairs of syllables in which one pair was the same and one pair was different. The subjects' task was to indicate via a button push in which interval pair the syllables were different. Consistent with the PEST algorithm, the acoustic difference between stimuli became smaller after correct answers and larger after incorrect answers. The order of same and different pairs within trials was randomized. The listener's JND was defined as the distance between stimuli in the “different” pair when the listener reliably reached an accuracy level of 69% correct. Three trial blocks were obtained for each stimulus condition. In our experience, individuals occasionally perform poorly on an isolated block during the test procedure because of fatigue, unfamiliarity with the task, and/or attention-related issues. Therefore, to reduce the impact of these occasional lapses in performance and to obtain a measure of the individual subject's best discrimination abilities, we computed the JND for each stimulus contrast as the mean of the two best blocks.
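The block-selection rule described above can be written as a brief Python sketch. It assumes that the "best" blocks are those with the smallest JNDs (finer discrimination); the function name is hypothetical.

```python
def best_two_block_jnd(block_jnds):
    """Mean of the two smallest (best) JNDs across trial blocks.

    Assumption: a smaller JND means better discrimination, so the two
    best blocks are the two lowest values.
    """
    if len(block_jnds) < 2:
        raise ValueError("at least two trial blocks are required")
    best_two = sorted(block_jnds)[:2]
    return sum(best_two) / 2.0
```

For three blocks with JNDs of 7, 12, and 9 steps, the rule discards the 12-step block and yields a JND of 8 steps.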
Analysis
P1–N1 responses. The P1 was identified as the largest positive deflection after stimulus onset in the latency region between 50 and 100 msec. N1 was identified as the negative deflection after the P1. Peak-to-peak amplitude measures of the P1–N1 complex were calculated off-line as the amplitude in microvolts from the peak of the P1 response to the negative-most point of the N1 response. Latency measures in milliseconds after stimulus onset also were obtained for the P1 and N1; however, because results indicated no differences in the latency of P1 or N1 between hemispheres for any subject group, these analyses are not included here. P1–N1 responses were evaluated using the averaged responses obtained in the standard (/ga/) and deviant-alone (/da/) conditions. The purpose of analyzing the P1–N1 responses to both stimuli was to determine whether the neurophysiological representation differed between stimuli and/or stimulus condition. It should be noted that the stimuli used in this study were quite similar acoustically, differing by only 280 Hz in F3 frequency compared with the classic exemplars of /da/ and /ga/, which differ by 400 Hz. After preliminary analyses that revealed no significant differences in responses as a function of stimulus type (/da/ vs /ga/), results elicited by both stimuli were collapsed for all further analyses.
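The peak-picking rule for the P1–N1 complex can be illustrated with the following Python sketch. The P1 window of 50–100 msec is taken from the text; the upper latency limit for the N1 search (200 msec here) is not stated in the text and is an assumption, as are the function and parameter names.

```python
def p1_n1_peak_to_peak(times_ms, wave_uv, p1_window=(50, 100), n1_max_ms=200):
    """Return (peak-to-peak amplitude in uV, P1 latency, N1 latency).

    P1: largest positive value within p1_window (from the text).
    N1: most negative value after P1 up to n1_max_ms (assumed limit).
    """
    p1_candidates = [i for i, t in enumerate(times_ms)
                     if p1_window[0] <= t <= p1_window[1]]
    if not p1_candidates:
        raise ValueError("no samples fall inside the P1 window")
    p1_i = max(p1_candidates, key=lambda i: wave_uv[i])

    n1_candidates = [i for i, t in enumerate(times_ms)
                     if times_ms[p1_i] < t <= n1_max_ms]
    if not n1_candidates:
        raise ValueError("no samples fall after P1 within the N1 limit")
    n1_i = min(n1_candidates, key=lambda i: wave_uv[i])

    amplitude = wave_uv[p1_i] - wave_uv[n1_i]
    return amplitude, times_ms[p1_i], times_ms[n1_i]
```

The same function would be applied separately to the TL and TR averages to obtain the two amplitudes compared in the analyses that follow.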
A two × three repeated measures ANOVA [within, side of response (right or left); between, group (children, young adult, or elderly)] was performed for peak-to-peak amplitude values to determine whether responses were asymmetric (as evidenced by a significant main effect of side of response) and whether the patterns of asymmetry differed among subject groups (as evidenced by the side × group interaction).
In addition, the degree of hemispheric asymmetry was computed by subtracting the right hemisphere peak-to-peak amplitude value from the left hemisphere peak-to-peak amplitude value and dividing by the sum of the two values: [(TL − TR)/(TL + TR)]. Using this equation, completely symmetrical responses would result in a value of zero, larger responses over the left hemisphere would result in positive values, and larger responses over the right hemisphere would result in negative values. Asymmetry values were subjected to a univariate ANOVA procedure to determine the effects of subject group on temporal lobe asymmetry.
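The asymmetry index defined above is simple to compute; a minimal Python sketch follows. The function name is hypothetical, and the guard against a zero denominator is an added assumption rather than part of the published procedure.

```python
def asymmetry_index(tl_amplitude, tr_amplitude):
    """Hemispheric asymmetry index (TL - TR) / (TL + TR).

    Inputs are peak-to-peak amplitudes (non-negative, in microvolts).
    Returns 0 for symmetrical responses, a positive value when the
    left-hemisphere response is larger, and a negative value when the
    right-hemisphere response is larger.
    """
    total = tl_amplitude + tr_amplitude
    if total == 0:
        raise ValueError("TL + TR must be nonzero")
    return (tl_amplitude - tr_amplitude) / total
```

Because the difference is normalized by the summed amplitude, the index is bounded between -1 and 1 for non-negative amplitudes, which allows comparison across groups with different overall response sizes.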
MMN responses. The MMN is elicited by a deviant stimulus only when it signals an acoustic change. Therefore, difference waves were computed for each subject by subtracting the response to the deviant stimuli presented alone from the response to the deviant stimuli presented within the oddball paradigm (Alho et al., 1989; Kraus et al., 1995). MMN responses were identified visually in the difference wave as a relative negativity after the N1 and occurring in the latency range of 100–500 msec. Onset, peak, and offset latencies were measured. MMN duration was computed by subtracting the onset latency from the offset latency. Onset-to-peak amplitude was measured, and the response area was computed by integrating the overall area between the onset and offset latencies.
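The MMN measures described above can be sketched in Python as follows. Onset and offset latencies are passed in as arguments because they were identified visually in the study. The use of trapezoidal integration relative to zero, the reporting of area as a magnitude, and all function names are assumptions made for illustration.

```python
def difference_wave(deviant_in_oddball, deviant_alone):
    """Deviant-in-oddball response minus deviant-alone response, sample by sample."""
    return [a - b for a, b in zip(deviant_in_oddball, deviant_alone)]

def mmn_measures(times_ms, diff_uv, onset_ms, offset_ms):
    """Return duration, peak latency, onset-to-peak amplitude, and area.

    Assumptions: onset_ms and offset_ms were chosen by the examiner; the
    peak is the most negative sample between them; area is the magnitude
    of the trapezoidal integral of the difference wave (uV x msec).
    """
    idx = [i for i, t in enumerate(times_ms) if onset_ms <= t <= offset_ms]
    if len(idx) < 2:
        raise ValueError("the onset-offset window must span at least two samples")
    peak_i = min(idx, key=lambda i: diff_uv[i])
    area = 0.0
    for i, j in zip(idx, idx[1:]):
        area += 0.5 * (diff_uv[i] + diff_uv[j]) * (times_ms[j] - times_ms[i])
    return {
        "duration_ms": offset_ms - onset_ms,
        "peak_latency_ms": times_ms[peak_i],
        "onset_to_peak_uv": diff_uv[idx[0]] - diff_uv[peak_i],
        "area_uv_ms": abs(area),
    }
```

For a triangular negativity that starts at 100 msec, reaches -2 µV at 200 msec, and returns to zero at 300 msec, the sketch gives a duration of 200 msec, an onset-to-peak amplitude of 2 µV, and an area of 200 µV·msec.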
As with the P1–N1 responses, a two × three repeated measures ANOVA was conducted for MMN amplitude and area values to determine whether the response magnitude was asymmetric and whether the degree of asymmetry differed among subject groups.
Fine-grained speech sound discrimination. Univariate ANOVAs were conducted for subjects' mean JND scores for each stimulus contrast to determine whether the ability to discriminate the stimuli differed among subject groups.
RESULTS
P1–N1 response asymmetry
ANOVA revealed a significant main effect of subject group on the degree of temporal lobe asymmetry (F = 5.517; p < 0.01). Post hoc Bonferroni comparisons revealed that children and young adults exhibited a significantly greater degree of temporal lobe asymmetry than did the elderly subjects, who exhibited essentially symmetrical responses (p < 0.01). These results are illustrated in Figure 1.
Figure 2 shows the grand averages of P1–N1 responses obtained over the right and left temporal lobes for the three subject groups. This figure clearly shows that responses recorded over the left hemisphere were substantially larger than those recorded over the right hemisphere for the children and young adult subjects. Although left hemisphere responses appear somewhat larger than right hemisphere responses in the grand averages for the elderly group as well, this difference did not approach statistical significance. This underscores the necessity of using individual data points and statistical analysis in any event-related potential study, because the pictorial representation of responses via grand averages is inherently limited: a large-magnitude response of even a single individual may be over-represented because of the relatively greater weighting afforded larger responses in the grand average.
Hemispheric asymmetry data obtained from individual subjects in each age group using the equation [(TL − TR)/(TL + TR)] are displayed in Figure 3. As can be seen from this figure, the majority of children and all of the young adult subjects exhibited asymmetry of response amplitude favoring the left side. Overall, the elderly subjects exhibited symmetrical responses.
P1–N1 peak-to-peak amplitude
Results of repeated measures ANOVA revealed significant main effects of side of response (F = 16.274; p < 0.01) and group (F = 9.646; p < 0.01) on P1–N1 peak-to-peak amplitude, as well as a significant side × group interaction (F = 5.557; p < 0.01). When collapsed across subject groups, responses were larger over the left temporal lobe than over the right. Post hoc Bonferroni comparisons revealed that the overall response amplitude was significantly larger in the children than in the young adult or elderly subjects (p < 0.01). There was no difference in response amplitude between the young adult and elderly subject groups. Finally, for children and young adults, response amplitudes over the left temporal lobe were significantly larger than those over the right temporal lobe (paired t, p < 0.01). In elderly subjects, right and left temporal lobe responses were symmetrical. These results are illustrated in Figure 4.
MMN response symmetry
There were no significant differences in the amplitude, duration, or area of the MMN responses obtained over right and left temporal lobes for any subject group, indicating that the MMN response is symmetrical over the temporal lobes and does not vary with age. Results of this analysis are illustrated in Figure 5.
Fine-grained speech sound discrimination
Results of univariate ANOVA procedures revealed a significant effect of subject group on the ability to discriminate the /da–ga/ stimulus contrast (F = 4.071; p < 0.05). Post hoc analysis revealed that the elderly subjects exhibited significantly poorer ability to discriminate the /da–ga/ stimulus contrast compared with both the children and the young adult subjects (p < 0.05). That this finding was not caused by an inability to understand or perform the task is evidenced by the finding of no effect of subject group on the ability to discriminate the /ba–wa/ stimulus contrast. These results are illustrated in Figure 6.
DISCUSSION
Results of this study show that aging affects the degree of left/right hemispheric asymmetry in the basic neural representation of speech sounds in normal hearing individuals. It has been demonstrated that left-sided specialization occurs for sounds that have complex speech-like acoustic properties (Belin et al., 1998). The dominance of the left hemisphere for processing acoustic stimuli that have rapid spectrotemporal changes, such as consonants, with a high degree of temporal precision has been demonstrated by a number of investigators (Efron, 1963; Lackner and Teuber, 1973; Schwartz and Tallal, 1980; Phillips and Farmer, 1990). Furthermore, neural and perceptual processing of the rapid acoustic transitions that characterize many consonants appears to be critical for normal speech perception and is particularly vulnerable to disruption (Godfrey et al., 1981; Elliot et al., 1989; Phillips and Farmer, 1990; Tallal, 1994; Merzenich et al., 1996). Therefore, it may be hypothesized that changes in hemispheric asymmetry as seen in the present study may have an adverse effect on the ability to process complex, rapidly changing acoustic stimuli, ultimately resulting in speech perceptual difficulties. In addition, because no hemispheric latency differences in neurophysiological responses were found in any group, it can be assumed that age-related changes in hemispheric asymmetry are manifested primarily in the number of neurons recruited, or the degree of hemispheric activation, rather than in the relative timing of neural transmission.
There is ample evidence to suggest that auditory temporal processing is poorer in aging listeners compared with younger adults (Humes and Christopherson, 1991; Fitzgibbons and Gordon-Salant, 1994; Divenyi and Haupt, 1997; Gordon-Salant and Fitzgibbons, 1999). Results of the present study also demonstrate that the discrimination of speech sounds involving rapid spectrotemporal acoustic change (i.e., /da–ga/) is poorer in elderly listeners, whereas the ability to discriminate speech sounds differing only in formant duration (i.e., /ba–wa/) is unaffected. These results suggest that age-related alterations in hemispheric asymmetry in the neural representation of elemental speech sounds may be one possible contributing factor to the temporal-processing difficulties exhibited by aging adults. However, although the behavioral and physiological responses discussed in this paper are related in that they reflect processing of acoustic events, it must be remembered that they are inherently different responses. Psychophysical tasks require a conscious, behavioral response and may be affected by many different factors, including attention, ability to perform the task, stimulus and response parameters, and other factors that affect the individual's conscious perceptual abilities. In contrast, the neurophysiological response is a preattentive neural representation of acoustic events, originating primarily within the auditory pathway and independent of attention or voluntary response. In addition, alteration in the hemispheric asymmetry of speech sound representation with aging likely is just one of many factors that contribute to speech perceptual difficulties in the elderly. As such, one would not expect a direct one-to-one correlation within individuals between perceptual ability and degree of hemispheric asymmetry as reflected in the P1–N1 neurophysiological response.
Nevertheless, behavioral and physiological measures reflect intersecting processes, and the findings of the present study demonstrate how these measures provide insight into specific neurophysiological processes that, at least in part, underlie psychophysical performance.
Belin et al. (1998) demonstrated that the hemispheric asymmetry of cortical activation in response to complex acoustic stimuli results from a relative decrease in right hemisphere activation during rapid acoustic change rather than from greater left hemisphere activation. Examination of the grand averaged responses in Figure 2 also suggests that the changes in hemispheric asymmetry with age in the present study may be the result of relatively greater right hemisphere activation in the elderly subjects compared with the younger two age groups. Alternatively, because the P1–N1 responses obtained in this study demonstrate a decline in amplitude as a function of aging, the relatively greater right hemisphere activation in the elderly subjects may be attributable to a cessation of this decline in the right hemisphere, perhaps as a result of decreased subcortical and cortical inhibition in the older population (Amenedo and Diaz, 1998). Although this topic warrants further investigation, it is possible that the apparently greater right hemisphere response to speech stimuli in aging individuals may contribute to hemispheric competition in the processing of speech, resulting in temporal blurring and concomitant speech perceptual difficulties (Hammond, 1982).
An alternative explanation for the present findings is the possibility of an age-related change in generator sites. Because the present study involved only electrode locations placed over the right and left temporal lobes, subtle age-related shifts in generator sites for the P1–N1 would not be identified. Furthermore, although it is possible that generator sites for scalp-recorded evoked potentials may shift as a function of maturation in children, primarily because of an increase in head size, a shift in generator sites is not a likely explanation for the age-related changes in the P1–N1 response asymmetry in the present study for two reasons. First, significant differences were found in the topography of responses between the two adult groups in which head growth is not a factor. Second, Pekkonen et al. (1995) have demonstrated that the generator sites for the P50m and N100m (the auditory-evoked magnetometric analogs to the P1 and N1 responses examined in the present study) do not change as a function of aging from young adulthood through the age of 86 years. Thus, age-related shifts in generator sites cannot account for topographical differences in the responses of the older subjects compared with the two younger groups.
Although no statistically significant difference was found in the hemispheric asymmetry of the neurophysiological responses of children compared with young adults, individual subject data presented in Figure 3 indicate that several of the children in this study (26%) exhibited greater activation over the right versus the left hemisphere. This pattern is entirely consistent with anatomical and neurophysiological data indicating a maturation of the auditory, language, and interhemispheric pathways from childhood to early adulthood (Salamy, 1978; Paus et al., 1999), followed by a regression to a more childlike degree of myelination and anatomical structure in later years (Allen et al., 1991; Hanyu et al., 1997).
Finally, the findings of King et al. (1999) indicate that left/right asymmetry in the neural representation of speech stimuli is apparent even at the thalamic level of the central auditory pathway of nonhuman mammals. Thus, such asymmetry appears to be a basic, prelinguistic element of normal speech perception, and alterations in the pattern and/or degree of asymmetry with aging may hold critical implications for the processing of elemental speech features in the aging adult.
In contrast, acoustic change appears to be bilaterally represented, as evidenced by hemispheric symmetry in the MMN response for all subject groups. Previous research has indicated that, whereas the P1–N1 response demonstrates morphological changes from infancy through the second decade of life (Courchesne, 1990; Ponton et al., 1996; Cunningham et al., 1997), the MMN remains stable and demonstrates no developmental changes throughout the school-age years (Kraus et al., 1999). This finding, combined with evidence demonstrating that the MMN is generated by sources in the nonprimary thalamus and auditory cortex (Scherg et al., 1989; Kraus et al., 1994) that are different from those that generate the cortical auditory-evoked N1 response (Naatanen and Picton, 1987; Sams et al., 1991), supports the theory that the MMN and P1–N1 responses represent different neurophysiological processes. Specifically, the P1–N1 response appears to reflect the basic neural encoding of repetitive, identical acoustic stimuli primarily in the primary auditory pathways (Eberling et al., 1982; Pantev et al., 1998), whereas the MMN likely reflects a preperceptual echoic or trace memory process related to the representation of acoustic change (Naatanen et al., 1989) that is bilaterally represented and mediated by nonprimary auditory pathways. Results of the present study indicate that age affects the basic neural representation of speech sounds but has no effect on the neural representation of acoustic change.
In conclusion, these results demonstrate that the pattern of left-sided dominance in the neural representation of speech sounds seen in children and young adults is not evident in older adults, despite normal hearing sensitivity. This absence of hemispheric asymmetry is accompanied by a poorer ability to discriminate speech sounds involving rapid spectrotemporal changes in older adult listeners. These findings provide evidence of a biological, age-related change in the basic sensory representation of elemental speech signals and suggest one possible underlying mechanism for the speech perceptual difficulties experienced by aging adults. Results of this study also provide a normal metric for comparison with other populations exhibiting communicative difficulties, including individuals with auditory-processing deficits, so that functional implications of atypical patterns of hemispheric asymmetry may be delineated. Finally, results of this study support previous evidence suggesting that auditory processes reflected by the P1–N1 response and the MMN are mediated by different neural generators and reflect separate neurophysiological and functional mechanisms.
Footnotes
This work was supported by the National Institutes of Health National Institute on Deafness and Other Communicative Disorders Grant DC01510 and the Foundation for Hearing and Speech Research. Special thanks to Dawn Burton Koch for her invaluable support and editorial assistance.
Correspondence should be addressed to Dr. Teri James Bellis, Department of Communication Disorders, University of South Dakota, 414 East Clark Street, Vermillion, SD 57069. E-mail: tbellis{at}usd.edu.