Abstract
Older adults often have difficulty understanding speech in a noisy environment or with multiple speakers. In such situations, binaural hearing improves the signal-to-noise ratio. How does this binaural advantage change with increasing age? Using magnetoencephalography, we recorded cortical activity evoked by changes in interaural phase differences of amplitude-modulated tones. These responses occurred for frequencies up to 1225 Hz in young subjects but only up to 940 Hz in middle-aged and 760 Hz in older adults. Behavioral thresholds also decreased with increasing age but were more variable, likely because some older adults make effective use of compensatory mechanisms. The reduced frequency range for binaural hearing became significant in middle age, before decline in hearing sensation and the morphology of cortical responses, which became apparent only in the older subjects. This study provides evidence from human physiological data for the early onset of biological aging in binaural hearing.
Introduction
Understanding speech in a noisy environment or with multiple simultaneous speakers becomes more difficult with increasing age. Changes in both the peripheral and central auditory system are main determinants of impaired communication in the elderly (Murphy et al., 2006). Furthermore, older adults need more time to process what they have heard because of slower cognitive functioning (Salthouse, 1996). Although the contribution of these different factors and their complex interactions are relatively unknown, age-related decline in central auditory function seems to play an important role (Martin and Jerger, 2005). With increasing age, neural synchrony in the central auditory system deteriorates and contributes to the difficulty older adults have when perceiving temporal cues in sound (Frisina and Frisina, 1997; Schneider and Pichora-Fuller, 2001). Binaural hearing is an excellent model for investigating central auditory function because bilateral inputs need to be processed and integrated in the CNS. Here we study how binaural hearing changes across the lifespan.
Binaural hearing enhances the signal-to-noise ratio in a multispeaker environment (Culling et al., 2004; Hawley et al., 2004) because the ability to localize sound in space helps to separate relevant speech from competing noise. Binaural hearing is based on physical properties of sounds arriving at both ears with interaural time differences (ITDs), resulting from a longer sound path to one ear, and interaural intensity differences (IIDs), resulting from the acoustic shadow of the head. For a steady tone, the ITD is equivalent to an interaural phase difference (IPD). These binaural acoustic cues are likely analyzed in separate neural networks in the brain (McAlpine, 2005). The detection of IPD begins at the level of the brainstem, and the output is transmitted along the auditory pathway to the cortex.
The advantage of binaural hearing for speech intelligibility in noise was initially reported by Licklider (1948), who demonstrated that listeners could tolerate more noise when speech sounds were presented with opposite polarity to the two ears (180° IPD) compared with the same sound in both ears. Previously, we designed a low-frequency tonal stimulus that contained such a polarity reversal (equivalent to a sudden change in the IPD) and have already reported the efficacy of this stimulus for behavioral and evoked response studies in young adults (Ross et al., 2007). With magnetoencephalography (MEG), we recorded cortical auditory-evoked responses to the sound onset, indicating that the subject could hear the stimulus, and responses to a change in IPD, indicating discrimination of binaural disparity.
With the current study, we compared evoked responses and behavioral performances between groups of young, middle-aged, and older adults. The aim of the study was to characterize aging-related changes in binaural hearing based on the cue of IPDs. Our hypothesis was that aging in central auditory function is reflected in changes in the frequency range in which IPD-related auditory responses are evoked as well as in changes in response amplitudes and latencies.
Materials and Methods
Twelve healthy young subjects (seven females; mean age, 26.8 years), 11 middle-aged adults (eight females; mean age, 50.8 years), and 10 older adults (five females; mean age, 71.4 years) participated in the study. Hearing thresholds were <20 dB hearing level (HL) between 250 and 8000 Hz for the young and middle-aged groups and <30 dB HL below 2000 Hz for the older group (Fig. 1). Interaural threshold differences were <10 dB for all frequencies below 2000 Hz. Subjects provided their informed consent before participating in the study, which was approved by the Research Ethics Board at Baycrest Centre.
a, Distribution of ages for young, middle-aged, and older participants. b, Group mean audiograms averaged across ears. Error bars denote the 95% confidence intervals of the group means. Hearing thresholds in the young and middle-aged groups were below 20 dB HL. Older adults showed characteristic age-related, high-frequency hearing loss.
A specific stimulus signal was developed for recording auditory-evoked responses to transitions in the IPD. Sinusoidal amplitude-modulated (AM) tones of 4.0 s duration were presented repeatedly with a stimulus-onset asynchrony uniformly randomized between 7.5 and 8.5 s. At 2.0 s after stimulus onset, sudden phase shifts in the carrier signal of +90° in the left ear and −90° in the right ear, equivalent to 180° IPD, were introduced (Fig. 2). Thus, during the first 2 s, both ears received identical tones (diotic sound), but during the last 2 s of stimulation, the tones were of opposite polarity in the two ears (dichotic sound). A previous study showed that a change in this direction elicited slightly larger responses than the transition from dichotic to diotic sound (Ross et al., 2004). Participants described the diotic part of the stimulus as sound arising from a single source in the center of the head and the dichotic part as sound from separate sources in space without specific localization. To prevent the subject from perceiving a discontinuity in the sound at the moment of the phase shift, the tones were amplitude modulated and the phase shift was set to occur at the minimum point of modulation. In the previous study, we also demonstrated that 90° phase shifts in the same direction in both ears did not elicit an evoked response (Ross et al., 2004). The 12.5 ms onset slope of the modulation envelope eliminates any differences in the onset time as a possible binaural cue. The modulation frequency of 40 Hz was chosen to elicit a strong auditory steady-state response. The effect of IPD changes on the steady-state responses will be reported separately.
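The stimulus construction described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the stimulus software used in the study: the function name, the sample rate, the raised-cosine envelope shape, and the 100% modulation depth are assumptions. Only the 4 s duration, the 40 Hz modulation, and the ±90° carrier shifts at 2 s are taken from the text.

```python
import math

def ipd_change_stimulus(carrier_hz=500.0, mod_hz=40.0, duration_s=4.0,
                        change_s=2.0, sample_rate=8000):
    """Return (left, right) sample lists for an AM tone whose carrier phase
    shifts by +90 deg (left) and -90 deg (right) at change_s.

    sample_rate and the raised-cosine envelope are illustrative assumptions.
    """
    n = int(round(duration_s * sample_rate))
    left, right = [], []
    for i in range(n):
        t = i / sample_rate
        # The envelope is zero at t = 0 and at every multiple of 1/mod_hz,
        # so a shift at change_s = 2.0 s falls on a modulation minimum.
        envelope = 0.5 * (1.0 - math.cos(2.0 * math.pi * mod_hz * t))
        shift = math.pi / 2.0 if t >= change_s else 0.0
        phase = 2.0 * math.pi * carrier_hz * t
        left.append(envelope * math.sin(phase + shift))
        right.append(envelope * math.sin(phase - shift))
    return left, right
```

With a 40 Hz raised-cosine envelope, the rise from minimum to maximum takes half a modulation period, that is, 12.5 ms, which matches the onset slope quoted above. Before 2 s the two channels are identical; afterwards they are sign-inverted copies of each other.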
a, Grand averaged MEG field maps of onset and change N1 and P2 responses in the young group. b, Grand averaged waveform of auditory-evoked responses obtained from the right hemisphere. c, Auditory stimuli. Stimuli are AM sounds of 4 s duration presented to the left and right ears at 60 dB sensation level with an interstimulus interval of 3.5–4.5 s. For the first 2 s, left and right ear sounds are identical (diotic stimulus; IPD, 0°). For the last 2 s, left and right ear sounds are of opposite polarity (dichotic stimulus; IPD, 180°). The phase shift is introduced at the minimum of amplitude modulation to avoid perception of change in a monaural signal. The response waveform shows a P1–N1–P2 complex after stimulus onset and a second P1–N1–P2 complex with similar morphology after the IPD change, a baseline shift sustained for the entire stimulus duration, and an N1 response after stimulus offset. The maps of magnetic field distribution across the head reveal separate dipolar patterns above the left and right temporal lobes. The X symbols indicate approximate dipole source locations. The field maps are almost identical for onset and change responses, with the P2 sources located more anteriorly than the N1 sources.
The experimental parameter of stimulus carrier frequency was set to 375, 500, 750, 1000, 1250, and 1500 Hz. Because the phase change was always half a period, the introduced ITD was 1.33, 1.0, 0.66, 0.5, 0.4, and 0.33 ms, respectively. Because of limited recording time, four frequencies out of the set were presented in each group, which were 500, 1000, 1250, and 1500 Hz for the young group; 500, 750, 1000, and 1250 Hz for the middle-aged group; and 375, 500, 750, and 1000 Hz for the older group. The frequencies were chosen according to the hypothesis of decreasing binaural performance with increasing age. One hundred stimuli with the same carrier frequency were presented in each experimental block of 13 min duration. Each block was repeated once to get 200 responses at each frequency, which was assumed to be sufficient for detecting individual responses. MEG responses to two stimulus frequencies could be recorded in a session of 1 h duration. In the first session, 500 and 1000 Hz were tested, which were the stimulus frequencies common to all age groups. The other frequencies were tested in the second session on the following day or later, with no more than 2 weeks between both sessions. Stimuli were presented through Etymotic ER3A transducers connected with 1.5 m of length-matched plastic tubing and foam earplugs to the subject's ears. The sound transducer had to be placed at a sufficient distance from the MEG sensor to avoid any interference between the stimulus signal and the recorded brain activity. Below 2000 Hz, the frequency characteristic of the sound transmission system was relatively flat (±6 dB), and the phase characteristic was linear. The phase relationship between acoustical signals at both earplugs was checked using a 2 cc coupler. The stimulus intensity was set to 60 dB above individual sensation thresholds, which were measured immediately before each MEG recording. 
Because insert-earphones typically provide interaural attenuation of >75 dB at the frequencies used in this study, no effects of interaural cross talk were expected.
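The ITD values listed for each carrier frequency follow directly from the fact that a 180° IPD corresponds to half a period of the carrier. A minimal sketch of that arithmetic (the function name `itd_ms` is introduced here for illustration only):

```python
def itd_ms(carrier_hz, ipd_deg=180.0):
    """ITD in milliseconds equivalent to a given IPD for a steady tone."""
    period_ms = 1000.0 / carrier_hz
    return period_ms * ipd_deg / 360.0

# Carrier frequencies used in the study
for f in (375, 500, 750, 1000, 1250, 1500):
    print(f"{f:5d} Hz -> {itd_ms(f):.3f} ms")
```

The results agree with the values given in the text to within rounding; for example, half a period at 750 Hz is 0.667 ms.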
MEG recording and data analysis.
MEG recordings were performed in a quiet, magnetically shielded room using a 151-channel whole-head neuromagnetometer (VSM-Medtech, Port Coquitlam, British Columbia, Canada). The detection coils of this MEG device are almost equally spaced on the helmet-shaped surface and are configured as axial first-order gradiometers (Vrba and Robinson, 2001). After low-pass filtering at 200 Hz, the magnetic field data were sampled at the rate of 625 Hz and stored continuously. MEG data were collected during passive listening, meaning that the subjects did not need to attend to the stimuli or execute a task. To control for confounding changes in vigilance, the subjects watched a closed-captioned movie of their choice while the auditory stimuli were being presented. Compliance was verified using video monitoring. The subjects were in the supine position with the head resting inside the helmet-shaped MEG sensor. The position of the MEG sensor was coregistered to the subject's head using three detection coils attached to the nasion and the preauricular points. Head movements were verified to be <8 mm during each recording block. No data had to be rejected because of excessive head movements.
Each block of continuously recorded MEG data was subdivided into 100 stimulus-related epochs of 6000 ms duration including 1000 ms prestimulus and poststimulus intervals. For artifact rejection, a principal component analysis was performed on each epoch of magnetic field data. This approach is effective for removing artifacts with amplitudes larger than the brain signals of interest. Principal components, which exceeded the threshold of 2 pT in at least one channel, were subtracted from the data. This procedure removed artifact primarily related to dental metal and eye blinks, which are substantially larger than the brain activity. After artifact removal, the magnetic field data were averaged, and magnetic source analysis was applied to the ±10 ms time interval around the maximum of the N1 wave, the most prominent auditory-evoked response 100 ms after stimulus onset. Source analysis was based on the model of spatiotemporal equivalent current dipoles (ECDs) in a spherical volume conductor. Single dipoles in both hemispheres were fit simultaneously to the 151-channel magnetic field distribution. First, the data were modeled with a mirror-symmetric pair of dipoles. The resulting source coordinates were then used as starting points to fit the dipole in one hemisphere while the coordinates in the other hemisphere remained fixed. We then switched between hemispheres and repeated the fitting until the source coordinates showed no further change. Dipole fits were accepted if their calculated fields explained at least 85% of the variance of the measured magnetic field. Eight estimates (four stimulus frequencies × two repetitions) of the N1 source location were obtained for each subject. The mean spatial coordinates and orientations were used as individual models to measure the source waveforms for the auditory-evoked responses. This procedure combined the 151-channel magnetic field data into two waveforms of cortical source strength. 
The advantage of analysis in the source domain is that the dipole moment is independent of the sensor position. The position of the MEG sensor relative to the subject's head may change between sessions and between subjects. This may cause spatial dispersion in group-averaged magnetic field waveforms. In contrast, the waveforms of cortical source activity can be combined across repeated sessions for a subject and across the group of subjects. Signal statistics for detecting individual responses were based on the phase distribution of a wavelet transform applied to single-trial source waveforms for each subject. An evoked response was accepted if a uniform phase distribution could be rejected with the Rayleigh test (for details, see Ross et al., 2007). Response peaks were identified at time points at which the first derivative was zero.
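The logic of the Rayleigh test can be illustrated with a short sketch. It assumes that one phase angle per trial has already been extracted (for example, from a wavelet coefficient at a given time and frequency), and it uses the common large-sample approximation p ≈ exp(−nR²). The exact wavelet parameters and significance criteria of the study are described in Ross et al. (2007) and are not reproduced here.

```python
import math

def rayleigh_test(phases):
    """Rayleigh test for non-uniformity of circular data.

    phases: list of angles in radians, one per trial.
    Returns (R, p), where R is the mean resultant length and p is the
    large-sample approximation exp(-n * R**2).
    """
    n = len(phases)
    c = sum(math.cos(ph) for ph in phases) / n
    s = sum(math.sin(ph) for ph in phases) / n
    r = math.hypot(c, s)   # 0 for uniformly spread phases, 1 for identical phases
    z = n * r * r          # Rayleigh statistic
    return r, math.exp(-z)
```

If the phases are consistently locked to the stimulus across trials, R approaches 1 and the hypothesis of a uniform phase distribution is rejected; phases spread evenly around the circle give R near 0 and a p value near 1.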
Behavioral stimuli and procedure.
An adaptive two-alternative forced choice (2AFC) procedure was used for psycho-acoustical testing. Stimuli were 1.0 s bursts of diotic and dichotic AM sounds, respectively. Pairs of both stimuli were presented in randomized order with a 900 ms interstimulus interval. The subjects were asked to identify whether the first or the second stimulus sounded as if it was separated between both ears. Immediately after the subject responded with a button press, the next pair of stimuli was presented. Initially, the carrier frequency was 250 Hz, which was the easiest condition according to previous findings that binaural performance for tones with a 180° IPD is highest at low frequencies and decreases monotonically with increasing frequency (Kohlrausch, 1986). The carrier frequency was increased by a quarter octave if two responses in a row were correct and decreased by a quarter octave if a single response was incorrect. The threshold found with such a two-down–one-up procedure corresponds to the 70% level at the psychometric function (Levitt, 1971). Because the lowest test frequency was limited to 250 Hz, the adaptive procedure did not converge to 0 in case of random responses. The chance level of ∼380 Hz was determined from computer simulation of the 2AFC procedure including the data analysis. Visual feedback was provided, in the form of a green (correct) or red (incorrect) square on the computer screen. One hundred trials were presented in each run, lasting ∼10 min. Each run was repeated once. All responses were analyzed, except for those on the initial slope from the easiest condition to the first reversal. The cumulative percentage of correct responses was determined as a function of frequency. The resulting response function was approximated by a cumulative normal distribution, and the threshold was defined as its 50% value.
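The adaptive rule described above can be simulated as follows. This is a sketch under stated assumptions and not the simulation used in the study: the listener model `p_correct` is a hypothetical function supplied by the caller, no upper frequency limit is imposed, and the subsequent fit of a cumulative normal distribution to the response function is not included.

```python
import random

def run_staircase(p_correct, n_trials=100, start_hz=250.0, floor_hz=250.0,
                  step_octaves=0.25, seed=0):
    """Simulate the adaptive track and return the carrier frequency per trial.

    p_correct(freq_hz) is a hypothetical listener model giving the
    probability of a correct response at a carrier frequency.
    """
    rng = random.Random(seed)
    freq = start_hz
    correct_in_row = 0
    track = []
    for _ in range(n_trials):
        track.append(freq)
        if rng.random() < p_correct(freq):
            correct_in_row += 1
            if correct_in_row == 2:
                # Two correct responses in a row: make the task harder
                freq *= 2.0 ** step_octaves
                correct_in_row = 0
        else:
            # A single error: make the task easier, but not below the floor
            correct_in_row = 0
            freq = max(floor_hz, freq / 2.0 ** step_octaves)
    return track
```

A purely guessing listener can be simulated with `run_staircase(lambda f: 0.5)`. Because the track cannot fall below 250 Hz, such a run hovers somewhat above the floor rather than converging to it, which is the reason a nonzero chance level has to be estimated.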
Results
The characteristics of cortical auditory-evoked response waveforms are illustrated in Figure 2 by the grand averaged response to 500 Hz stimuli in young adults. A P1–N1–P2 complex occurred immediately after stimulus onset, and a second one occurred after the change in IPD at 2 s. A negative sustained potential lasted for the complete duration of stimulus presentation, and a third N1–P2 complex occurred after offset of the 4 s stimulus. The results obtained in the young group have already been reported (Ross et al., 2007).
For all subjects and all stimulus frequencies, localizations of ECDs, which best explained the magnetic field distribution of the N1 onset responses, were estimated with a goodness of fit larger than 85%. No main effect of age or interaction of age, Cartesian coordinate (x, y, z), and hemisphere on the source localizations was significant, even though the 99% confidence limits for the group mean coordinates were less than ±4 mm in any direction. Talairach coordinates of the grand mean source locations in an averaged brain, as determined using the BESA software, were x = −46, y = −26, and z = 4 mm for the dipole in the left hemisphere and x = 48, y = −20, and z = 6 mm in the right hemisphere, which correspond to the left and right auditory cortices (Penhune et al., 1996). The hemispheric asymmetry with a more anterior right N1 source was significant (p < 0.001) in all age groups (young: Δy = 4.1 mm, middle aged: Δy = 8.0 mm, older: Δy = 12.6 mm). In addition, the asymmetry increased with increasing age (middle aged > young: t (129) = 2.75, p < 0.007; older > middle aged: t (146) = 3.56, p < 0.0001). Source locations for the N1 peak of the phase change response were not different from those of the onset response, suggesting generation of both response types in common or overlapping neural populations. The similarity between onset and change response was also expressed in the magnetic field topography, shown in Figure 2 for the N1 and P2 peaks 100 and 200 ms after stimulus onset and 120 and 225 ms after the change in IPD. Linear regression analysis applied to the grand averaged magnetic field data revealed that the field variance of the N1 change response could be explained with r² = 0.96 by the magnetic field at the maximum of the N1 onset response. Thus, additional data analysis was based on waveforms of source strength [dipole moment in nanoampere meter (nAm)] of dipoles located in left and right auditory cortices.
Group-averaged response waveforms for the three age groups and all stimulus frequencies are shown in Figure 3. Onset and offset responses, which were clearly pronounced for all subject groups, did not systematically change in morphology or size with increasing stimulus frequency. In contrast, the change response amplitudes decreased with increasing frequency. Visual inspection revealed that the IPD change responses vanished between 1250 and 1500 Hz in the young participants, between 1000 and 1250 Hz in the middle-aged participants, and between 750 and 1000 Hz in the older participants. This indicates an upper limit for processing IPDs in sound.
Waveforms of grand averaged responses for the three age groups and all test frequencies. Each group was tested at four frequencies. Clearly pronounced onset responses were observed for all groups at all test frequencies. At the lowest test frequencies, all groups showed P1–N1–P2 IPD change responses. However, the amplitudes of the change response diminished with increasing frequency. A threshold for eliciting an IPD change response can be observed as a function of the stimulus frequency. In the young group, the IPD change response is visible at 1250 Hz but is absent at 1500 Hz. The IPD change response vanishes between 1000 and 1250 Hz in the middle-aged group, and the threshold in the older group is between 750 and 1000 Hz.
Individual thresholds were obtained using statistical signal detection methods (Ross et al., 2007). Significant IPD change responses were found at 500 and 1000 Hz in all, at 1250 Hz in 9 of 12, and at 1500 Hz in none of the young subjects. All middle-aged subjects showed IPD change responses at 500 and 750 Hz, 7 of 11 at 1000 Hz, and 2 of 11 at 1250 Hz. In the older group, IPD change responses were significant at 375 and 500 Hz in all, at 750 Hz in 9 of 10, and at 1000 Hz in 1 of 10 subjects. Individual physiological thresholds relative to the subject's age are shown in Figure 4a. Linear regression on a logarithmic frequency scale (t (31) = 9.43; p < 0.0001) resulted in a threshold of 1287 Hz at the age of 20 years, a decrease in threshold by 10% (SD, 1%) per decade, and a correlation of r = 0.86 (r² = 0.74) between the model and the observed data. One-way ANOVA verified the effect of age on the physiological thresholds (F (2,33) = 37.8; p < 0.0001). Post hoc t tests showed a significant decrease in threshold between young and middle-aged subjects (t (11) = 2.7; p < 0.021) and between middle-aged and older adults (t (11) = 3.93; p < 0.0023). Mean thresholds for the three age groups were found on the regression line as 1225 Hz at 25 years, 940 Hz at 50 years, and 760 Hz at 70 years.
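The regression result can be written as an exponential decline, which makes it easy to check the group values quoted above. The sketch below assumes that a 10% decrease per decade on a logarithmic frequency scale means multiplication by 0.9 for every 10 years; the function and parameter names are introduced here for illustration.

```python
def threshold_hz(age_years, threshold_at_20=1287.0, decline_per_decade=0.10):
    """Predicted upper frequency limit for IPD change responses at a given age.

    Assumes a constant fractional decline per decade starting at age 20.
    """
    decades = (age_years - 20.0) / 10.0
    return threshold_at_20 * (1.0 - decline_per_decade) ** decades

for age in (25, 50, 70):
    print(f"{age} years: {threshold_hz(age):.0f} Hz")
```

Under this assumption, the model gives values within a few hertz of the reported 1225, 940, and 760 Hz, which is consistent with the rounding of the regression parameters.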
Summary of physiological and behavioral thresholds for IPD change detection. a, Individual physiological thresholds as a function of age. b, Individual behavioral thresholds as a function of age. Young subjects showed the smallest range of variation. Some middle-aged and older participants performed at chance level (gray shaded area), whereas some participants in these groups performed as well as the best performers in the young group (circled areas).
Individual thresholds obtained during behavioral testing are shown in relation to the subject's age in Figure 4b. All young adults could perform the behavioral test, and their thresholds were in the range between 770 and 1683 Hz (Fig. 4b) with a group mean of 1203 Hz. The behavioral thresholds in the middle-aged and older adults were much more variable; thresholds in both groups varied between 300 and 1400 Hz. Four of 11 middle-aged subjects and 5 of 10 older subjects performed around chance level (380 Hz) (Fig. 4b, gray shaded area). In contrast, two participants in each group performed as well as the best young subjects (Fig. 4b, circled areas). The group mean thresholds of 705 Hz in the middle-aged group and 638 Hz in the older group were strongly influenced by the non-normal distribution. Linear regression of the data on a logarithmic frequency scale (t (31) = 3.13; p < 0.004) resulted in a threshold of 1290 Hz at the age of 20 years and a decrease of 14% (SD, 4.5%) per decade. However, the model explained only 28% of the variance (r = 0.53; r² = 0.28), indicating that other factors contribute to the heterogeneity.
Group effects were studied by measuring the peak amplitudes and latencies. Because change responses were not observed at all stimulus frequencies, latencies of the P1, N1, and P2 waves were compared at 500 Hz, for which all age groups showed clearly pronounced IPD change responses (Fig. 5). ANOVA was performed to study effects of response type (onset, change), hemisphere (left, right), and age group (young, middle age, older) on amplitudes and latencies of the P1, N1, and P2 responses. All change response latencies were delayed compared with the corresponding onset response latencies (Fig. 5). In addition, age was a main effect on response latencies (F (2,135) = 8.0; p < 0.013). The mean onset P1 latency was 66 ms, and the change P1 latency was 15 ms longer (t (134) = 9.2; p < 0.0001). Age was a main effect (F (2,135) = 8.0; p < 0.0006), and an interaction between age and response type was significant (F (2,135) = 3.06; p = 0.05). P1 latency was not different between age groups for the onset (mean, 66 ms), but in the change response, older participants had 12.2 ms later P1 (t (42) = 5.2; p < 0.0001) than the younger participants (76 ms) and 8 ms later P1 (t (42) = 2.2; p < 0.04) than the middle-aged participants (80 ms). The mean latency of the onset N1 peak was 119 ms, whereas the mean IPD change N1 latencies were 27 ms longer. Age had a main effect on the N1 latency (F (2,135) = 7.2, p < 0.001). In the older group, the N1 was 10.1 ms later (t (86) = 2.7; p < 0.02) than in the young group and 9.7 ms later (t (86) = 2.2; p < 0.03) than in the middle-aged group. Mean P2 latencies were 207 ms for the onset and 244 ms for the change response (t (143) = 7.03; p < 0.0001). Age was a main effect (F (2,135) = 18.5; p < 0.0001) and interacted with the response type (F (2,135) = 4.3; p < 0.015). 
The mean P2 onset latency of 220 ms in the older group was delayed (t (42) = 3.1; p < 0.004) by 22 ms compared with the middle-aged group (mean, 198 ms) but was not significantly different from the young group (206 ms). These findings were corroborated by the mean P2 latencies across all onset responses regardless of stimulus frequency (young, 209 ms; middle aged, 208 ms; older adults, 220 ms). The difference between older and middle-aged adults was significant (t (174) = 3.37; p < 0.001), but the difference between middle-aged and young groups was not. A similar trajectory of changes was found for the N1 across all onset responses (young and middle aged, 118 ms; older, 122 ms; t (174) = 3.9; p < 0.002) and was even less pronounced for the P1 latencies (young and middle aged, 64 ms; older, 66 ms; t (174) = 2.4; p < 0.017). The most pronounced latency effect was that the change response P2 latency of 274 ms in the older group was longer (t (42) = 6.7; p < 0.0001) than in the young group (226 ms) and also delayed (t (42) = 4.5; p < 0.0001) compared with the middle-aged group (236 ms).
a, Grand averaged waveforms of sound onset and IPD change responses at 500 Hz stimulus frequency for the three age groups. b, Peak latencies of the P1–N1–P2m response. P1 and N1 onset responses increased with increasing age. The P1 change response followed the characteristic of amplitude increase with increasing age; however, N1 and P2 were smallest in the older group. The most pronounced effect of age is the P2 latency increase in the change response, and it is less expressed in the onset response. P1–N1–P2m peak latencies were averaged across all test frequencies (4 for the onset, 2 for the change response) for the three age groups and the left and right hemispheres. Error bars denote the 99% confidence limits of the mean. A single asterisk indicates significance at α = 5%, and two asterisks indicate significance at α = 1%. All change responses had longer latencies than the onset responses, with the delay increasing progressively from P1 to N1 and P2.
Analysis of the response amplitudes at 500 Hz showed that response type (F (1,135) = 25.7; p < 0.0001) and age (F (2,135) = 23.9; p < 0.0001) had main effects on the P1 amplitudes. The onset P1 amplitude (mean, 15.1 nAm) was 6.5 nAm larger (t (134) = 4.3; p < 0.0001) than the change P1. Older participants had larger P1 responses (18.7 nAm) (t (86) = 6.3; p < 0.0001) than younger (7.6 nAm) and middle-aged (10.4 nAm) participants (t (86) = 3.94; p < 0.0002). Age had a main effect on the N1 amplitude when combining the onset responses at all frequencies (F (2,266) = 9.88; p < 0.0001). N1 onset amplitudes increased between the middle-aged and older groups from 26.3 to 33.3 nAm (t (174) = 4.28; p < 0.0001). No effects on P2 amplitudes were significant even when combining the responses across all frequencies.
Discussion
Three major findings in this study were that (1) the auditory-evoked response to changes in IPD showed an upper frequency limit, which declined with advancing age; (2) the morphology of cortical responses was modified with age, with a more pronounced latency increase for the IPD change responses than for the sound onset responses; and (3) the time course of aging-related changes was different for these two observations: the IPD threshold decline was already obvious in the middle-aged group, whereas changes in cortical responses became clear only in later life.
The observed influence of aging on binaural hearing based on interaural temporal disparity is consistent with previous behavioral studies, which commonly showed that discrimination of ITD was impaired in older compared with younger subjects, whereas performance in IID was not affected. Herman et al. (1977) found that older subjects needed twice the ITD than younger subjects (28 vs 14 μs) to lateralize click stimuli, whereas both groups were equally sensitive in lateralizing the clicks based on IID. Also using click stimuli, Babkoff et al. (2002) confirmed that the sensitivity for ITD but not for IID decreases with increasing age. Those findings seem to be consistent with the general notion that sensory processing of temporal information declines with age (Strouse et al., 1998; Schneider and Hamstra, 1999). However, auditory temporal resolution, measured as performance in monaural gap detection, was not correlated with ITD discrimination in elderly subjects, whereas this was the case in young adults (Strouse et al., 1998). Such findings indicate that aging in binaural hearing may be distinct from an age-related general decline in temporal processing.
The upper limit for binaural hearing based on IPD in young subjects was ∼1300 Hz, consistent with results of several previous studies (Garner and Wertheimer, 1951; Zwislocki and Feldman, 1956; Schiano et al., 1986; Macpherson and Middlebrooks, 2002). At frequencies above 1500 Hz, the IPD in tones becomes ambiguous in humans because the ITD (determined by the head size) becomes larger than one period of the sound. Thus, the frequency limit could be thought of as a meaningful outcome of evolutionary specialization in humans. However, thresholds near 1500 Hz have also been found in much smaller mammals like gerbils, for which IPD in tones does not become ambiguous until 12 kHz (Heffner and Heffner, 1988). More likely, the threshold represents a frequency limit for phase synchrony in the auditory pathway. Processing of IPD requires neural computation of phase differences and hence undistorted representation of the signal phase at the place of computation (Joris et al., 1998), most likely the medial part of the superior olivary complex (Grothe, 2003; McAlpine, 2005). The age-related change in physiological thresholds found in our study shows that phase synchrony deteriorates with increasing age and, most importantly, that this process commences before midlife. A recent behavioral study showed that the binaural advantage for speech understanding in noise was already reduced in middle-aged adults, who, in contrast, performed like young listeners in quiet (Kim et al., 2006). Here we provide the first physiological evidence for an early onset of aging in the central auditory function of binaural hearing.
The P1, N1, and P2 components of the electromagnetic response to sounds generally relate very well to the psychophysical assessments of threshold for stimulus onset or change (Naatanen and Picton, 1987; Hyde, 1997). However, because these waves may also be affected by both the state of arousal and the direction of attention, the relationship is not exact. Although the physiological and behavioral data in the young group were found in close agreement at 1300 Hz, this finding should not be overinterpreted as a general one-to-one relationship between both thresholds. Another finding was that the age-related decline in behavioral threshold frequencies was expressed mainly as increased intersubject variability. The contrast between consistent physiological thresholds and large variations in behavioral tests in the middle-aged and older groups poses the question of to what degree the physiological measure can predict behavior. This question is closely connected to the relationship between the responses (physiological and behavioral) and the different levels of processing that are involved during each measure. At least we can conclude that the physiological responses indicate that bilateral inputs were successfully combined, that the IPD change in the stimulus was identified, and that it was represented at the cortical level. Intersubject variability in auditory processing up to this level was comparatively small, which means that biological aging affected subjects fairly evenly. However, subjects differed in their ability to access the IPD change information and make conscious decisions during the behavioral testing. This could explain lower behavioral performance than predicted by the age-related change in the physiological threshold. But why do some subjects perform as well as the young subjects? Some subjects might have good preservation of their auditory timing or may bring to bear other perceptual strategies.
For example, one additional cue could be the arrival time in the sound onset. Also differences in the physiological and behavioral test could potentially contribute to the contrast between groups. Comparison between sequentially presented sounds in the behavioral task could have been more difficult than detection of IPD change in the physiological test. Finally, the subjects who performed at chance level may not have completely understood the task, or may not have been able to perceive the stimulus differences. In either case, the physiological test provided a less variable assessment of binaural function.
Age-related high-frequency hearing loss, presbycusis, was obvious in the group of older adults and could be considered a confounding factor contributing to at least some of the observed effects, especially those with onset in late life. This is a common problem in studies of binaural hearing, which have often been performed with click stimuli containing spectral components spread over a wide range that includes high frequencies. Hearing loss at these frequencies may compromise perception of click stimuli. Thus, one aim of our study was to probe binaural hearing with low-frequency sound. We therefore designed a low-frequency tonal stimulus that contained a sudden change in the IPD and have already reported the efficacy of the stimulus for behavioral and evoked response studies in young adults (Ross et al., 2007). The stimulus simultaneously evoked cortical auditory responses to the sound onset, indicating that the subject could hear the sound, and responses to the change in IPD, indicating processing of binaural disparity. The spectral content of the stimuli was clearly below the frequencies that were affected by age-related hearing loss in some subjects. Thus, age-related hearing loss did not affect perception of the stimuli per se, which is consistent with the finding that onset responses did not decrease with age. This finding is also consistent with the work of others who have shown that hearing-impaired listeners had decreased performance in processing IPD in the carrier of AM sound (Lacher-Fougere and Demany, 2005). But loss in sensation alone does not fully explain the effect. Hearing loss is often accompanied by reduced dynamic range for sound intensity (recruitment). In our study, we presented the stimuli at an intensity of 60 dB above the individual sensation level. Thus, loudness was probably greatest for participants with recruitment, which could explain the larger onset N1 amplitudes in the older groups, because N1 amplitude is more likely related to loudness than to physical intensity (Morita et al., 2003).
Hearing loss in older age is likely a more complex phenomenon than is expressed simply in elevated thresholds. Aging-related hearing loss occurs gradually over decades of life. The brain, deprived of its sensory input, likely undergoes substantial reorganization. Conceivably, high-frequency hearing loss impairs sound localization based on short ITDs in sound transients (Cranford et al., 1993). This may generally impair the perception of spatial sound images and consequently reduce the ability to interpret other binaural cues that are not primarily affected by loss in high-frequency hearing. Personal experiences may widely affect those long-term plastic changes; thus, in some people, an initially small deficit in sensory function may worsen into a general impairment, whereas others may learn to compensate for deficits and even improve through overcompensation. Those different individual strategies might also explain the wide variation in behavioral performance. The larger variability in behavior than in physiological measures may be a result of external factors such as lifelong experiences and may indicate a potential for specific training of binaural cues. Results of laboratory training with young listeners showed that short-term training of changes in binaural hearing results in quick adaptation to a changed acoustical environment; however, short-term training of binaural cues such as ITD results only in small improvements (Wright and Zhang, 2006). Larger training effects have been reported in cochlear implant users (Rowan and Lutman, 2006). Future work is necessary to study whether a specific training regimen could reduce the effects of aging-related decline in binaural hearing.
Finally, the main effects of age on the morphology of auditory-evoked responses were expressed as prolonged latencies between the middle-aged and older groups but not between young and middle-aged participants. The latency effects on the onset responses were small, on the order of 5 ms for P1 and N1 and 11 ms for P2. In contrast, the P2 wave in the IPD change response was delayed by 50 ms compared with the P2 in the young group. This dissociation between aging effects on the onset and change responses indicates that not all brain processes slow down at the same rate. Perception of the sound itself, as reflected in the onset response, was not affected by aging, whereas binaural processing, as reflected in the change response, was significantly affected. Interestingly, age had a much stronger impact on P2 than on N1 latency. Although similar results of prolonged P2 latencies in advanced age have been reported for speech and tone stimuli (Tremblay et al., 2003), the functional importance of the P2 response is largely unknown. In general, auditory-evoked responses can be seen as a cascade of components reflecting the span from early stimulus evaluation to cognitive processing. In experimental situations without an explicit cognitive task, such as in the current study, the P2 is the last response and might be related to the termination of the stimulus evaluation process. Thus, the prolonged interval between N1 and P2 in the change response may indicate the greater time needed for binaural processing in older adults.
In summary, we used MEG-recorded auditory-evoked responses to demonstrate aging-related changes in central processing of binaural sound. Aging-related decline in binaural hearing, expressed by the limited frequency range for detection of IPD, became significant in middle age, before changes in cortical responses appeared in late life. The larger intersubject variability in behavior in older compared with younger participants demonstrated that some people can cope quite well with declining binaural resources. This observation may indicate a potential for future training programs in middle age, which could likely help to overcome biologically determined declines in central auditory function with learned compensatory strategies.
Footnotes
- This work was supported by grants from The Hearing Foundation of Canada, the Canadian Institutes of Health Research, and the Canadian Foundation for Innovation.
- Correspondence should be addressed to Dr. Bernhard Ross, Rotman Research Institute, Baycrest Centre, 3560 Bathurst Street, Toronto, Ontario, Canada M6A 2E1. bross{at}rotman-baycrest.on.ca