Abstract
Aging listeners, even in the absence of overt hearing loss measured as changes in hearing thresholds, often experience impairments processing temporally complex sounds such as speech in noise. Recent evidence has shown that normal aging is accompanied by a progressive loss of synapses between inner hair cells and auditory nerve fibers. The role of this cochlear synaptopathy in degraded temporal processing with age is not yet understood. Here, we used population envelope following responses, along with other hair cell- and neural-based measures from an age-graded series of male and female CBA/CaJ mice to study changes in encoding stimulus envelopes. By comparing responses obtained before and after the application of the neurotoxin ouabain to the inner ear, we demonstrate that we can study changes in temporal processing on either side of the cochlear synapse. Results show that deficits in neural coding with age emerge at the earliest neural stages of auditory processing and are correlated with the degree of cochlear synaptopathy. These changes are seen before losses in neural thresholds and particularly affect the suprathreshold processing of sound. Responses obtained from more central sources show smaller differences with age, suggesting compensatory gain. These results show that progressive cochlear synaptopathy is accompanied by deficits in temporal coding at the earliest neural generators and contribute to the suprathreshold sound processing deficits observed with age.
SIGNIFICANCE STATEMENT Aging listeners often experience difficulty hearing and understanding speech in noisy conditions. The results described here suggest that age-related loss of cochlear synapses may be a significant contributor to those performance declines. We observed aberrant neural coding of sounds in the early auditory pathway, which was accompanied by and correlated with an age-progressive loss of synapses between the inner hair cells and the auditory nerve. Deficits first appeared before changes in hearing thresholds and were largest at higher sound levels relevant to real world communication. The noninvasive tests described here may be adapted to detect cochlear synaptopathy in the clinical setting.
Introduction
Hearing thresholds change gradually with age and, by 70 years, many individuals (∼63%) will have audibility losses significant enough to interfere with communication and quality of life (Lin et al., 2011). Aging listeners also commonly experience difficulty understanding speech in adverse listening conditions and exhibit degraded temporal resolution (Dubno et al., 1984; Gordon-Salant and Fitzgibbons, 2001). These difficulties do not always scale with threshold loss; even when matched for good audiometric thresholds, older listeners show performance declines on such tasks relative to younger listeners (Frisina and Frisina, 1997; Pichora-Fuller and Souza, 2003). With the lack of threshold evidence for peripheral involvement, these deficits have suggested central underpinnings, from decreased inhibition affecting temporal coding in subcortical and cortical regions (Caspary et al., 2008; Walton, 2010) to changes in higher-order executive functions (Pichora-Fuller et al., 1995; Henry et al., 2017).
Our recent work in mouse models of aging reveals what may be a major contributor to these functional declines. We have described a cochlear synaptopathy that progressively interrupts inner hair cell (IHC) to afferent fiber communications with age, initially in ears without, and ultimately in ears with, hair cell loss and threshold elevation (Sergeyenko et al., 2013). This synaptic loss does not alter presynaptic, outer hair cell (OHC)-based responses such as distortion product otoacoustic emissions (DPOAE), but produces proportional decreases in suprathreshold neural responses such as wave 1 of the auditory brainstem response (ABR), reflecting afferent outflow from the cochlea. Several lines of evidence have suggested that auditory neurons with low spontaneous rates of firing are primary targets of both aging (Schmiedt et al., 1996) and noise (Furman et al., 2013; for review, see Liberman and Kujawa, 2017). These low-spontaneous-rate neurons are important for encoding the temporal properties of signals, especially in noise (Costalupes et al., 1984), but have high thresholds of response, so the resultant hearing loss can be “hidden” in the normal threshold audiogram (Schaette and McAlpine, 2011; Kujawa and Liberman, 2015).
Although this age-related cochlear synaptopathy has now been well documented (Sergeyenko et al., 2013; Fernandez et al., 2015), what is not yet fully understood is how this loss of synapses contributes to the degraded temporal resolution seen with age and how cochlear synaptopathy may interact with higher-order changes seen in the central auditory pathway. Envelope-following responses (EFRs) can be used to bridge this gap in understanding. EFRs are far-field steady-state potentials evoked by neural synchronization to the amplitude envelopes of spectrotemporally complex stimuli. Prior work has suggested that EFRs can be dominated by cortical (Herdman et al., 2002a,b; Kuwada et al., 2002) and subcortical (Picton et al., 2003; Parthasarathy and Bartlett, 2012) sources, including the auditory nerve (Shaheen et al., 2015), depending on the specific stimulus parameters used to elicit the response. EFRs can be recorded from surface electrodes and thus offer an approach to studying auditory deficits in human clinical populations (Clinard and Tremblay, 2013; Presacco et al., 2015; Bidelman et al., 2017) and in animal models, in which the underlying histopathology can be studied directly (Zhong et al., 2014; Shaheen et al., 2015).
In the current study, we recorded DPOAEs, ABRs, and EFRs from an age-graded series of CBA/CaJ mice to study the consequences of age-related synaptopathy for early neural coding. We demonstrate that we can elicit EFRs from hair cell and early neural generators of the auditory pathway to assess the integrity of temporal processing on both sides of the cochlear synapse and observe the emergence of age-related deficits at the earliest stages of auditory processing. We deduce the contributions of afferent synapse loss to these changes by comparing these responses with histological evidence for cochlear synaptopathy. We offer a method that compares the responses of hair cell versus neural components of the EFR to further describe the deficits in suprathreshold sound coding occurring with this age-related cochlear synaptopathy and provide evidence for its interaction with more central nuclei in the auditory pathway.
Materials and Methods
Experimental animals and acoustic environment.
CBA/CaJ mice (n = 58; 42 male, 16 female, 16–128 weeks of age) were used in this experiment. All animals were born and raised in our animal care facility from breeders obtained from The Jackson Laboratory. The typical lifespan of these mice under laboratory conditions is between 24 and 29 months or 104–129 weeks (Flurkey et al., 2007); >60% survive in our facility >132 weeks. Use of CBA/CaJ mice enables comparison with the existing literature in the characterization of age-related and noise-induced synaptopathy in the cochlea (Kujawa and Liberman, 2009; Sergeyenko et al., 2013; Fernandez et al., 2015; Valero et al., 2017; for review, see Kujawa and Liberman, 2015). Envelope-coding properties that are of interest in this study are less reliant on the hearing range of the animals than frequency or fine structure coding.
The acoustic environment in the animal care facility has been characterized by noise-level data logging and analysis over 6–24 h periods, as described previously (Sergeyenko et al., 2013; Liberman et al., 2014) and receives ongoing, periodic monitoring to ensure maintenance of a lower-noise environment. Briefly, signals detected by a small electret microphone placed in an empty mouse cage in an active rack are digitized and averaged for weekday and weekend facility conditions. On the basis of such measurements, we find that sound levels are greatest between 1 and 4 kHz, typically ranging between 45 and 50 dB SPL, with sound levels from 5.6 to 45 kHz ranging between 35 and 40 dB SPL. A-weighted noise levels between 10 Hz and 20 kHz are between 50 and 60 dB SPL >95% of the time and >80 dB SPL <1% of the time. All animal procedures were approved by the Institutional Animal Care and Use Committee of the Massachusetts Eye and Ear Infirmary and are consistent with National Institutes of Health guidelines.
Physiological testing.
All tests were conducted in an acoustically and electrically shielded and heated chamber, using a National Instruments PXI-based system and 24-bit input/output boards controlled with custom LabVIEW software. The custom acoustic system comprised two miniature dynamic earphones as sound sources (CDMG15008-03A; CUI) and an electret condenser microphone (FG-23329-PO7; Knowles) coupled to a probe tube to measure sound pressure near the eardrum.
Mice were anesthetized with ketamine (100 mg/kg, i.p.) and xylazine (10 mg/kg, i.p.). A small incision was made in the cartilaginous portion of the external ear canal to allow visual confirmation of the condition of the external and middle ears and optimal placement of the acoustic system. Animals with evidence of middle ear pathology or obstructive cerumen were rarely encountered; when detected, such animals were excluded from the study. Body temperature was monitored by rectal thermometer and maintained at 37°C during testing by heating the air in the experimental chamber. Additional boosters of anesthesia (30–50% of the initial dose) were given as needed to maintain a stable anesthetic state throughout the experiment. Visual observation of the animal's state was maintained throughout the experiment using an infrared camera in the booth.
DPOAEs, ABRs, and EFRs were recorded for all animals. DPOAEs were recorded in response to two primary tones, f1 and f2, with f2 frequencies log spaced from 5.6 to 45.2 kHz, f2/f1 = 1.2, and L2 = L1–10 dB, both incremented together in 5 dB steps. At each frequency–level combination, the DPOAE amplitudes at 2f1–f2 were captured from ear canal pressure measurements and, after spectral and waveform averaging, analyzed offline as response–growth functions. Iso-DPOAE contours were interpolated from the growth functions and used to determine the f2 level required to elicit a DPOAE of −5 dB SPL at each frequency, which was defined as threshold.
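The threshold definition above can be illustrated with a minimal sketch. It assumes simple linear interpolation between adjacent 5 dB level steps of a growth function; the interpolation actually used for the iso-DPOAE contours is not specified here, and the function names are hypothetical.

```python
def dpoae_primaries(f2_hz, ratio=1.2):
    """Return (f1, 2*f1 - f2) in Hz for a given f2, with f2/f1 = ratio."""
    f1 = f2_hz / ratio
    return f1, 2.0 * f1 - f2_hz


def iso_response_threshold(levels_db, dpoae_db, criterion_db=-5.0):
    """Lowest f2 level (dB SPL) at which the growth function reaches the
    criterion DPOAE amplitude, by linear interpolation (an assumption).
    Returns None if the criterion is never reached."""
    if dpoae_db[0] >= criterion_db:
        return levels_db[0]
    for i in range(1, len(levels_db)):
        lo, hi = dpoae_db[i - 1], dpoae_db[i]
        if lo < criterion_db <= hi:
            frac = (criterion_db - lo) / (hi - lo)
            step = levels_db[i] - levels_db[i - 1]
            return levels_db[i - 1] + frac * step
    return None


# Illustrative (made-up) growth function sampled in 5 dB steps.
levels = [20, 25, 30, 35]
amplitudes = [-15.0, -10.0, 0.0, 8.0]
threshold = iso_response_threshold(levels, amplitudes)
```

Returning None when the criterion is never reached keeps ears with no measurable response distinct from ears with a high but finite threshold.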
ABRs and EFRs were recorded using subdermal needle electrodes at the vertex and the ipsilateral mastoid, with the ground electrode at the base of the tail. ABRs were elicited to tone bursts (0.5 ms rise/fall, 5 ms duration, 33 repetitions/s, alternating polarity) at the same frequencies as f2 of the DPOAEs and varying sound levels between 10 and 90 dB SPL. Responses were amplified (×10,000; Grass Instruments P511 amplifier) and filtered (0.3–3 kHz and a 60 Hz line filter). Trials in which the overall ABR response amplitude exceeded 15 μV were rejected as noise; 512 artifact-free trials of each polarity were averaged to compute the ABR waveform. Averaged ABR waveforms were then imported into custom programs in which individual positive and negative peaks were identified. ABR thresholds were calculated as the minimum sound level that produced a noticeable wave 1 upon visual inspection. ABR peak-to-peak amplitudes were calculated as the difference between the positive peak and the following negative peak. EFRs were elicited to sinusoidally amplitude modulated (sAM) tones (5 ms rise/fall, 200 ms duration, 3.1 repetitions/s, alternating polarity) at two carrier frequencies (12.14 and 30.49 kHz, denoted hereafter as 12 and 30 kHz for simplicity) and varying sound levels between 10 and 90 dB SPL. Other properties of the sAM tones that were varied including amplitude modulation (AM) rate and AM depth are specified in the relevant sections of Results. Responses were amplified (×10,000; Grass Instruments P511 amplifier) and filtered (0.1–10 kHz and a 60 Hz line filter). Trials in which the response amplitude exceeded 200 μV were rejected; 250 artifact-free trials of each polarity were averaged to compute the EFR waveform. Fast Fourier transforms (FFTs) were performed on the averaged time–domain waveforms starting 10 ms after stimulus onset to exclude ABRs and ending at stimulus offset using custom-written programs in MATLAB (The MathWorks). 
The maximum FFT amplitude across the three frequency bins (3 Hz each) centered on the modulation frequency was taken as the peak FFT amplitude. This FFT amplitude at the modulation frequency is reported as the EFR amplitude. The noise floor was calculated as the average of 5 frequency bins (3 Hz each) above and below the central three bins. A response was deemed significantly above noise if the FFT amplitude was at least 6 dB above the noise floor.
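The peak and noise-floor computation described here can be sketched as follows. This is an illustration under stated assumptions, not the MATLAB analysis used in the study: the spectrum is evaluated directly at frequencies spaced 3 Hz apart rather than taken from an FFT of a particular length, and the function names and the sampling rate in the example are hypothetical.

```python
import cmath
import math


def dft_amplitude(x, fs, freq):
    """Amplitude of the single-frequency DFT of x (sampled at fs Hz) at freq."""
    n = len(x)
    acc = sum(v * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, v in enumerate(x))
    return 2.0 * abs(acc) / n


def efr_amplitude(x, fs, fm, bin_hz=3.0, n_noise=5, criterion_db=6.0):
    """Return (peak, noise_floor, snr_db, significant).

    peak: maximum over the three bins centered on the modulation frequency.
    noise_floor: mean of n_noise bins above and n_noise bins below them.
    """
    peak = max(dft_amplitude(x, fs, fm + i * bin_hz) for i in (-1, 0, 1))
    noise_bins = [dft_amplitude(x, fs, fm + sign * i * bin_hz)
                  for i in range(2, 2 + n_noise) for sign in (-1, 1)]
    noise = sum(noise_bins) / len(noise_bins)
    if peak == 0.0 or noise == 0.0:
        snr_db = math.inf if peak > 0.0 else -math.inf
    else:
        snr_db = 20.0 * math.log10(peak / noise)
    return peak, noise, snr_db, snr_db >= criterion_db


# Example: a unit-amplitude 1024 Hz component in a 190 ms analysis window
# (10 ms after onset to offset of a 200 ms stimulus); fs is an assumption.
fs = 20000
window = [math.sin(2 * math.pi * 1024 * k / fs) for k in range(int(0.19 * fs))]
peak, noise, snr_db, significant = efr_amplitude(window, fs, 1024.0)
```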
Histological preparation and assessment.
After testing, anesthetized mice were transcardially perfused with 4% paraformaldehyde in 0.1 M phosphate buffer, followed by an intralabyrinthine perfusion through the round and oval windows of both cochleas. The cochleas were additionally postfixed in 4% paraformaldehyde for 1 h, decalcified in 0.12 M EDTA for 2 d, microdissected into half turns, and immunostained with antibodies to C-terminal binding protein 2 (mouse anti-CtBP2; BD Biosciences, at 1:200), glutamate receptor 2 (mouse anti-GluA2; Millipore, at 1:2000), and myosin-VIIa (rabbit anti-myosin-VIIa, Proteus Biosciences, at 1:200) and secondary antibodies coupled to Alexa Fluor in the red, green, and gray channels.
Immunostained cochlear pieces were measured and a cochlear frequency map was computed (Müller et al., 2005) to associate structures to relevant frequency regions using a custom plug-in to ImageJ. Confocal z-stacks of the 11.3, 22, 32, 45, and 64 kHz areas were collected using a Leica TCS SP5 microscope. Two adjacent stacks were obtained (78 μm of cochlear length per stack) at each target frequency spanning the cuticular plate to the synaptic pole of ∼10 hair cells (in 0.25 μm z-steps). Images were collected in a 1024 × 512 raster using a high-resolution, oil-immersion objective (63×, numerical aperture 1.3), and digital zoom (3.17×). Images were loaded into an image-processing software platform (Amira; VISAGE Imaging), where IHCs were quantified based on their Myosin VIIa-stained cell bodies and CtBP2-stained nuclei. Presynaptic ribbons and postsynaptic glutamate receptor patches were counted using 3D representations of each confocal z-stack. Juxtaposed ribbons and receptor puncta constitute a synapse and these synaptic associations were determined using custom software that calculated and displayed the x–y projection of the voxel space within 1 μm of each ribbon's center (Liberman et al., 2011). OHCs were counted based on the myosin VIIa staining of their cell bodies within the same imaging region and the mean number of cells per row of OHCs was used as a measure of OHC counts.
Round window application of ouabain.
In one experimental series, ouabain, applied locally, was used to obtain a unilateral cochlear neuropathy (Yuan et al., 2014) in animals aged 16–22 weeks. After anesthetization, the pinna was reflected rostrally and a retroauricular incision was made. The underlying muscles were separated by blunt dissection to expose the middle compartment of the bulla and a small opening was made to expose the round window. Ouabain (10 mM in PBS, pH 7.4) or a control solution (vehicle alone) was applied to the round window membrane every 10 min and then wicked off and exchanged for a fresh solution for four total applications. DPOAEs, ABRs, and EFRs as described above were measured after the surgery but before application of the treatment for the “pre” condition and after treatment for the “post” condition.
Experimental design and statistical analyses.
The study was cross-sectional and N-way ANOVA was performed to analyze group differences using preset and custom-written scripts in MATLAB. Data were log transformed if necessary to produce a normal distribution. Main effects of each of the factors and their interactions were calculated. Post hoc comparisons were performed in MATLAB (multcompare) and corrected for multiple testing using the Bonferroni method. Correlations between measures are indicated by Pearson's linear correlation and p-values computed using a Student's t distribution for a transformation of the correlation. Significant differences were reported with a 95% confidence interval and error bars in figures indicate SEM.
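For readers unfamiliar with the transformation mentioned above, the following minimal sketch shows how a Pearson correlation is converted to a t statistic with n − 2 degrees of freedom, and how a Bonferroni correction scales p-values. It is not the MATLAB code used in the study, the function names are hypothetical, and the final lookup of the p-value from the t distribution is left to a statistics library.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

The two-sided p-value then follows from the Student's t distribution with n − 2 degrees of freedom.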
Results
Progressive age-related cochlear synaptopathy occurs before loss of hair cells and hearing thresholds
The synapses between the IHCs and the auditory nerve (Fig. 1A) are the most vulnerable elements of normal auditory aging (Sergeyenko et al., 2013). Immunostaining for presynaptic ribbons and postsynaptic glutamate receptor patches (Fig. 1B, red and green, respectively) shows a progressive loss of these synapses occurring throughout the lifespan (Fig. 1C, shown for two representative frequencies in green). This synaptopathy occurs before any losses in OHCs or IHCs (Fig. 1C, dashed and solid gray lines, respectively).
This synaptopathy also begins before changes in OHC function measured as changes in DPOAE thresholds and amplitudes, which are only minimally affected until later in the lifespan. DPOAE threshold losses are <25 dB until 108 weeks of age (Fig. 2A, shown for two representative frequencies). DPOAE amplitudes show minimal differences with age up to and including 64 weeks (p > 0.1 for all comparisons above 50 dB SPL at 12 kHz and 30 kHz), with significant decreases in amplitudes due to age apparent only at ≥108 weeks (Fig. 2B, Table 1). The threshold elevations at later ages are accompanied by a loss of OHCs (Fig. 1C, gray dashed lines) and the DPOAE thresholds are significantly correlated with the number of remaining OHCs (Fig. 2C, 12 kHz: r = −0.69, p = 5.1 × 10−7, 30 kHz: r = −0.56, p = 0.0001).
ABR thresholds, like DPOAE thresholds, show late elevations (Fig. 2D), whereas ABR amplitudes steadily decline with age (Fig. 2E). Significant main effects of age, sound level, and their interaction (Table 2) were present for ABR amplitudes and post hoc comparisons revealed a significant decrease in wave 1 amplitudes with age at all sound levels above 50 dB SPL (p < 0.005 for all comparisons). The ABR wave 1 amplitudes at a fixed suprathreshold sound level of 30 dB sensation level (SL) are significantly correlated with the number of cochlear synapses remaining (Fig. 2F, 12 kHz: r = 0.83, p = 8.4 × 10−12, 30 kHz: r = 0.71, p = 3.5 × 10−7). Therefore, suprathreshold ABR wave 1 amplitudes are a good physiological indicator for age-related cochlear synaptopathy.
Preneural and early neural temporal processing can be differentiated using EFRs
The modulation frequency used to evoke EFRs can be varied to yield responses dominated by different generators along the auditory pathway. To isolate responses originating from the earliest neural generators, we measured EFRs before and after the application of ouabain at the round window of the cochlea. Ouabain is a neurotoxin that selectively targets Na+/K+ ATPase in a dose-dependent manner. When applied to the round window of the cochlea, it can inactivate auditory nerve responses without significant effects on hair cell function (Lang et al., 2011; Yuan et al., 2014).
An acute 45 min application of ouabain resulted in increased ABR wave 1 thresholds, with significant main effects of treatment (F(3,60) = 23.75, p = 2.9 × 10−10), frequency (F(5,60) = 7.92, p = 8.8 × 10−6), and their interaction (F(15,60) = 2.68, p = 0.0036). Post hoc analysis corrected for multiple comparisons showed that ouabain application increased neural ABR wave 1 thresholds (Fig. 3A) and decreased ABR amplitudes (Fig. 3B) in the frequency regions of the cochlea near the round window site of drug application compared with saline-treated animals (p < 0.0001 for all tested frequencies >25 kHz and for all tested levels >60 dB SPL at 30 kHz). In contrast, hair cell-based DPOAE thresholds (Fig. 3C) and amplitudes (Fig. 3D) were largely unaffected by ouabain. Pairwise post hoc comparisons revealed no difference in DPOAE thresholds at all tested frequencies and amplitudes at 30 kHz due to ouabain treatment (p > 0.1 in all cases).
Next, we tested the effects of ouabain on EFRs obtained to modulation frequencies between 768 and 4096 Hz. Example frequency domain traces show clear responses to both 1024 and 4096 Hz AM frequencies that were well above the noise floor (Fig. 4A,B). Further, the application of neurotoxic ouabain decreased EFR response amplitudes at 1024 Hz AM, but not at 4096 Hz AM. The application of saline did not affect response amplitudes at either modulation frequency (Fig. 4B,C). Repeated-measures ANOVA on log-transformed data for EFRs at 1024 Hz AM revealed a significant effect of treatment for ouabain (F(1,2.44) = 77.41, p = 0.0009), but not saline (F(1,0.005) = 0.45, p = 0.57). For EFRs at 4096 Hz, there were no significant effects of treatment for either ouabain (F(1,0.075) = 2.01, p = 0.25) or saline (F(1,0.038) = 10.67, p = 0.08). We assessed the percentage change in EFR amplitudes with treatment for all modulation frequencies tested between 768 and 4096 Hz using a two-way ANOVA with treatment and modulation frequency as factors. For EFR amplitudes, ouabain application resulted in significant main effects of treatment alone (F(1,21) = 15.7, p = 0.0007). Pairwise post hoc tests corrected for multiple comparisons using the Bonferroni method revealed a significant decrease in EFR amplitudes for modulation frequencies in the 700–1000 Hz range (768 Hz: p = 0.004, 1024 Hz: p = 8.4 × 10–4), suggesting primarily neural sources for these responses (Fig. 4C). This is consistent with a previous study that used analyses of group delays to show that EFRs elicited by ∼1000 Hz AM are generated primarily from the auditory nerve and are sensitive to noise-induced cochlear neuropathy (Shaheen et al., 2015). However, the application of ouabain did not affect the amplitudes of EFRs to AM frequencies in the 3000–4000 Hz range (Fig. 4C, p > 0.6 in both cases). 
Further, comparison of EFRs elicited to 4096 Hz AM at the mastoid versus the round window in five animals showed a substantial increase in amplitude when recorded at the round window, as large as three orders of magnitude (Fig. 4D). This confirmed that these EFRs at ∼4000 Hz AM represent physiological responses, the amplitudes of which are largely driven by the spatial proximity to the generators.
Pure tone phase locking by the auditory nerve falls off steeply beyond 1000–2000 Hz and is insignificant at ∼4000 Hz (Johnson, 1980; Palmer and Russell, 1986; Weiss and Rose, 1988b; Temchin and Ruggero, 2010; Versteegh et al., 2011). The upper limit of auditory nerve phase locking to the AM envelope is even steeper, with a cutoff of ∼1000 Hz (Palmer, 1982; Joris and Yin, 1992; Dreyer and Delgutte, 2006). Therefore, EFRs observed in this study to stimuli >2000 Hz AM seem unlikely to originate from the auditory nerve. Information regarding phase locking to the AM envelope by hair cells is not available. However, the upper limit of pure tone phase locking by the hair cells is higher than that of the auditory nerve (reaching up to 10,000 Hz for OHCs and ∼4000 Hz for IHCs, depending on species) and is shaped strongly by the membrane time constant of the cell, which behaves as a low-pass filter (Weiss et al., 1974; Russell and Sellick, 1978; Holton and Weiss, 1983, for review, see Fettiplace, 2017). This can be seen when phase locking is measured directly from the hair cells (Russell and Sellick, 1978, 1983; Palmer and Russell, 1986; Weiss and Rose, 1988a) or at the level of the round window (Forgues et al., 2014; Batrel et al., 2017). These observations, together with the results seen here, suggest possible preneural sources such as hair cell receptor potentials for these EFRs at 4000 Hz AM and early neural sources such as the auditory nerve for EFRs at 1000 Hz AM. Therefore, by obtaining EFRs to AM frequencies in the 1000–4000 Hz range, the integrity of temporal processing can be assessed on both sides of the cochlear synapse and contributions from the earliest neural generators can be isolated and separated from other putative generators.
Temporal envelope processing deficits with age appear at the earliest neural generators
We hypothesized that the loss of cochlear synapses with age could lead to a decrease in the precision of neural coding, which would be evident at the earliest stages of the auditory pathway and scale with the degree of synaptopathy. To test this, we recorded EFRs to sAM tones with modulation frequencies in the 1000–4000 Hz range for two cochlear regions (12 and 30 kHz) in our age-graded series. EFRs from early neural sources (∼1000 Hz AM) show a progressive decline with age (Fig. 5A,B, p < 0.005 for all comparisons except between 16 weeks and 32 weeks at 12 kHz). Therefore, in addition to the reduced wave 1 amplitudes seen with age (Fig. 2E), the reduction in cochlear afferent outflow also degrades the representation of temporal envelopes and these deficits present themselves at the earliest neural generators, putatively the auditory nerve. However, EFRs at ∼4000 Hz, evoked putatively from hair cell generators as informed by the previous experiment (Fig. 4), showed minimal changes in amplitude with age. Responses at both ∼1000 Hz AM and ∼4000 Hz AM in most age groups are significantly above the noise floor, as seen from the grand-averaged FFTs (Fig. 5A,B, inset), and decreases in amplitudes for EFRs at ∼4000 Hz AM are evident only when hair cell losses occur (p > 0.1 for all post hoc comparisons except with groups 64 weeks and older at 12 kHz and with the 128 week group at 30 kHz, where p < 0.0001).
To study the relative changes in neural-based versus hair cell-based responses and to relate these to the degree of synapse loss, we calculated a ratio between the ∼1000 Hz AM EFR and the ∼4000 Hz AM EFR at 80 dB SPL for each animal. We defined the ratio as the absolute value of log10(EFR at 1024 Hz)/log10(EFR at 4096 Hz) and plotted the result at two cochlear frequencies, as shown in Figure 5, C and D. A high EFR ratio would indicate that the responses from the auditory nerve are large compared with those from the hair cells and a low EFR ratio would indicate the reverse. The EFR ratio decreased with age and was significantly correlated with the number of remaining cochlear synapses (12 kHz: r = 0.54, p = 0.0001; 30 kHz: r = 0.73, p = 1.6 × 10−8). Therefore, the temporal processing deficits seen at the early neural stages are directly correlated with the loss of cochlear synapses with age. This EFR ratio may serve as an alternative metric to time domain measurements such as the summating potential (SP) and wave 1 of the ABR. Measuring the SP and wave 1 in the time domain response requires peak picking by visual inspection, which can be challenging, especially in aged animals in which ABR waveform morphology can be degraded. In contrast, EFRs are comparatively easier to quantify in such ears because the response is analyzed in the frequency domain, where the energy at the modulation frequency can be compared with the surrounding noise floor and objectively calculated.
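As a concrete illustration of the ratio defined above, the sketch below computes it for a single animal. The unit of the EFR amplitudes entering the logarithm is not stated in this section; the example assumes amplitudes expressed in a unit for which both values exceed 1, and both the function name and the numbers are hypothetical.

```python
import math


def efr_ratio(efr_1024, efr_4096):
    """Absolute value of log10(EFR at 1024 Hz) / log10(EFR at 4096 Hz).

    Both amplitudes must be positive, and the 4096 Hz amplitude must not
    equal 1 (its logarithm would be zero).
    """
    if efr_1024 <= 0.0 or efr_4096 <= 0.0:
        raise ValueError("EFR amplitudes must be positive")
    denominator = math.log10(efr_4096)
    if denominator == 0.0:
        raise ValueError("log10 of the 4096 Hz amplitude is zero")
    return abs(math.log10(efr_1024) / denominator)


# Made-up amplitudes: a large neural response relative to the hair cell
# response gives a higher ratio than equal responses.
ratio_large_neural = efr_ratio(100.0, 10.0)
ratio_equal = efr_ratio(10.0, 10.0)
```

Because the ratio is built from logarithms, its value depends on the amplitude unit, so ratios are comparable only when every animal is processed in the same unit.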
Cochlear synaptopathy is associated with a decreased dynamic range of neural coding
Based on studies of human electrophysiology and computational modeling, cochlear synaptopathy is believed to affect suprathreshold temporal processing by degrading the representation of sounds in the early auditory pathway (Bharadwaj et al., 2014, 2015; Parthasarathy et al., 2016). To directly probe the consequences of age-related cochlear synaptopathy on early neural coding, we recorded amplitude versus level responses for EFRs elicited to ∼1000 Hz AM frequency at two cochlear regions.
EFR growth functions showed a progressive decline with age (Fig. 6A,B), suggesting that neural declines in temporal processing occur throughout the lifespan and are exacerbated with older age. Significant main effects of age, sound level, and their interaction were present for both frequencies (Table 3). Although the EFR growth function showed minimal changes in overall slope or shape, the decrease in amplitudes suggested a decreased dynamic range available for encoding stimulus level with age. EFR amplitudes measured at equal levels relative to individual thresholds continued to exhibit these age-related declines (Fig. 6C,D, Table 4). Therefore, this decrease in the fidelity of coding at suprathreshold sound levels persists even when adjustments are made to address individual differences in hearing thresholds due to sensory hearing loss. EFRs measured at equal SLs 30 dB above threshold are also significantly correlated with the number of cochlear synapses (Fig. 6E,F; 12 kHz: r = 0.66, p = 1 × 10−5; 30 kHz: r = 0.68, p = 9.2 × 10−7), suggesting that these neural declines in suprathreshold temporal processing occur with age-related cochlear synaptopathy.
Suprathreshold sound processing in real world listening conditions often requires the neural encoding of stimuli with degraded envelope cues (Rosen, 1992; Shannon et al., 1995). We recorded EFRs in response to AM tones with decreasing modulation depth in two cochlear regions at sound levels 30 dB above the threshold of each animal to probe suprathreshold temporal processing deficits due to degraded envelope cues at equal SLs. EFR amplitudes showed a progressive decline as a function of both age and modulation depth (Fig. 7, Table 5). As with level coding, the overall shape and slope of the depth function changed little with age. However, the overall decrease in amplitudes suggests a reduced dynamic range of coding for modulation depth with age, occurring with cochlear synaptopathy.
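To make the modulation depth manipulation concrete, the sketch below generates an sAM tone of the standard form [1 + m·sin(2πf_m t)]·sin(2πf_c t), where m is the modulation depth. The 200 ms duration and 5 ms rise/fall follow Materials and Methods; the raised-cosine ramp shape, the sampling rate, the peak normalization, and the function name are assumptions made for illustration only.

```python
import math


def sam_tone(fc, fm, depth, dur=0.2, ramp=0.005, fs=200_000):
    """Sinusoidally amplitude-modulated tone, peak-normalized to at most 1.

    fc: carrier frequency (Hz); fm: modulation frequency (Hz);
    depth: modulation depth m between 0 (unmodulated) and 1 (100%).
    """
    n = int(round(dur * fs))
    n_ramp = int(round(ramp * fs))
    samples = []
    for k in range(n):
        t = k / fs
        envelope = 1.0 + depth * math.sin(2 * math.pi * fm * t)
        gate = 1.0
        if k < n_ramp:  # onset ramp (raised cosine, assumed shape)
            gate = 0.5 * (1.0 - math.cos(math.pi * k / n_ramp))
        elif k >= n - n_ramp:  # offset ramp
            gate = 0.5 * (1.0 - math.cos(math.pi * (n - 1 - k) / n_ramp))
        carrier = math.sin(2 * math.pi * fc * t)
        samples.append(gate * envelope * carrier / (1.0 + depth))
    return samples


# Fully modulated and half-depth tones at the 12 kHz carrier, 1024 Hz AM.
full_depth = sam_tone(12140.0, 1024.0, 1.0)
half_depth = sam_tone(12140.0, 1024.0, 0.5)
```

Dividing by (1 + m) keeps the peak amplitude constant across depths; whether the stimuli in the study were equated in peak or in overall level is not stated here.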
Central auditory pathway may introduce compensatory gain adjustments due to degraded peripheral responses
Increasing evidence suggests that, when faced with various forms of injury to the auditory periphery, the higher regions of auditory processing beginning from the midbrain may exhibit compensatory plasticity and increase their relative activity in response to the degraded peripheral input (Chambers et al., 2016a,b; Herrmann et al., 2017; Parthasarathy et al., 2018). This increased central gain may improve simple detection abilities, but it comes at the cost of reduced neural precision in encoding temporally modulated stimuli (Rabang et al., 2012; Overton and Recanzone, 2016). To investigate age-related changes in temporal processing along the auditory pathway, EFRs were obtained to a range of modulation frequencies from 16 to 4096 Hz AM in octave steps (with the exception of 64 Hz AM due to the presence of the 60 Hz line filter) at two cochlear regions. Significant main effects of age, modulation frequency, and their interaction were present for both frequencies (Table 6). As seen previously in Figure 4, EFRs in the 2000–4000 Hz AM range dominated by hair cell generators do not show significant changes with age until 128 weeks when hair cell losses occur, whereas EFRs in the ∼1000 Hz AM range from early neural sources show a progressive decline throughout the lifespan (Fig. 8A,B). These declines persist for EFRs to sAM stimuli in the 200–500 Hz AM range, with generators thought to be primarily in the brainstem and midbrain (Kiren et al., 1994; Kuwada et al., 2002; Parthasarathy and Bartlett, 2012). However, EFRs to slower AM rates of <100 Hz, which are dominated by cortical sources (Herdman et al., 2002a,b; Picton et al., 2003), show minimal changes with age (Fig. 8A,B). The amplitude ratio of ABR wave 5 to wave 1 also shows a steady increase with age (Fig. 8A,B, inset). 
This ratio increase arises because the amplitude of wave 5 generated in the central auditory pathway (Lev and Sohmer, 1972; Buchwald and Huang, 1975; Hashimoto et al., 1981) remained relatively unchanged, whereas wave 1 amplitude decreased. These results suggest that aging may result in compensatory plasticity, leading to an increased gain in neural responses from the central auditory pathway in response to a degraded peripheral input due to cochlear synaptopathy.
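The modulation-frequency series and the wave 5:wave 1 ratio described above can be sketched as follows. This is an illustrative sketch, not the authors' analysis code. The helper names are hypothetical, and any amplitude values passed to the ratio function are placeholders, not measured data.

```python
def octave_series(low_hz, high_hz, skip=()):
    """Frequencies from low_hz to high_hz in octave steps, omitting any in skip."""
    freqs = []
    f = low_hz
    while f <= high_hz:
        if f not in skip:
            freqs.append(f)
        f *= 2
    return freqs


# 16 to 4096 Hz in octave steps, leaving out 64 Hz as stated in the text.
AM_RATES_HZ = octave_series(16, 4096, skip={64})
# -> [16, 32, 128, 256, 512, 1024, 2048, 4096]


def wave5_to_wave1_ratio(wave1_amplitude, wave5_amplitude):
    """Ratio of ABR wave 5 amplitude to wave 1 amplitude (same units for both)."""
    if wave1_amplitude <= 0:
        raise ValueError("wave 1 amplitude must be positive")
    return wave5_amplitude / wave1_amplitude
```

With wave 5 amplitude held constant, any reduction in wave 1 amplitude raises the ratio, which is the pattern the text attributes to aging.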
Discussion
Hair cells have long been considered to be among the most vulnerable cochlear elements to aging. However, recent work shows that loss of synapses between IHCs and auditory nerve fibers begins before loss of the hair cells themselves (Sergeyenko et al., 2013) and this age-related cochlear synaptopathy can be exacerbated by prior noise exposure (Fernandez et al., 2015). Age-related cochlear synaptopathy is thought to initially affect auditory nerve fibers with low spontaneous rates and high thresholds (Schmiedt et al., 1996). Because a diffuse loss of these high-threshold fibers will not adversely affect hearing detection thresholds, the audiogram is insensitive to its presence and the phenomenon has been termed “hidden” hearing loss (Schaette and McAlpine, 2011). In an independent set of aging ears, this study also demonstrates this progressive loss of cochlear synapses with age (Fig. 1C) that occurs before sensory hearing loss evidenced by changes in cochlear thresholds (Fig. 2A,D), loss of OHC function measured using DPOAEs (Fig. 2A,B), or the loss of the hair cells themselves (Fig. 1C). However, physiological evidence for this synaptopathy is found in the amplitude of wave 1 of the ABR (Fig. 2E), which reflects the summed activity of auditory nerve fibers, although the amplitude may not be limited to contributions from low-spontaneous-rate neurons alone (Bourien et al., 2014). The wave 1 amplitude, at an equal suprathreshold sound level that accounts for differences in hearing thresholds, is strongly correlated with the number of surviving cochlear synapses (Fig. 2F).
Changes in hearing function with age can reflect damage to peripheral sensory and neural elements, as well as changes occurring along the central auditory pathways. Changes in the central auditory pathway such as decreased inhibition are well documented (for review, see Caspary et al., 2008), whereas assessment of peripheral changes has largely been restricted to studying changes to the cochlear hair cells and the threshold elevations that typically accompany their damage or loss. However, partial deafferentation caused by cochlear synaptopathy may cause additional deficits in early neural coding due to stochastic undersampling of sounds by the auditory nerve (Lopez-Poveda and Barrios, 2013; Lopez-Poveda, 2014). Ears affected by aging and noise exposure may be especially susceptible to this stochastic undersampling (Marmel et al., 2015) because the low-spontaneous-rate neurons first targeted with aging (Schmiedt et al., 1996) show a greater preference for synchronized firing to AM sounds at moderate to high levels (Joris and Yin, 1992). EFRs are a noninvasive assay to measure these temporal processing deficits and provide complementary information to the ABRs, which are dominated by onset responses (Parthasarathy et al., 2014; Bidelman, 2015). A previous study found that EFRs were sensitive to noise-induced synaptopathy (Shaheen et al., 2015). However, the synaptopathy induced with noise exposure is immediate and dramatic, in contrast to the gradual decrease in synapses across six age-graded groups. Looking at age alone in this study, coefficients for the correlations between the synapses and either ABRs or EFRs are comparable at 30 kHz (0.71 for ABRs, 0.68 for EFRs) and ABRs have a higher correlation coefficient at 12 kHz (0.66 for EFRs vs 0.83 for ABRs; Figs. 2F, 6E,F).
A comparison of effect sizes (Hedges' g; Hentschke and Stüttgen, 2011) between 16-week-old animals and all other age groups revealed that both ABRs and EFRs typically had effect sizes >0.8, which is considered a “large effect,” with EFRs outperforming ABRs at 30 kHz and ABRs outperforming EFRs at 12 kHz (data not shown). In this study, we used the round window application of ouabain to preferentially affect neural responses (Yuan et al., 2014). By measuring EFRs before and after the application of ouabain, we identified modulation rates that differentiated early neural from hair cell sources (Fig. 4A–C). Early neural sources dominated AM ranges at ∼1000 Hz, which is close to the peak phase-locking capacity of the auditory nerve (Palmer and Evans, 1982; Joris and Yin, 1992). This is also consistent with a previous study using ouabain application and phase coherence analysis to determine that EFRs to AM frequencies at 800–1000 Hz were dominated by responses from the auditory nerve in mice (Shaheen et al., 2015). However, the spatial resolution offered by the EFRs cannot differentiate between responses from the auditory nerve and the primary cochlear nucleus neurons, which have similar temporal properties (Joris et al., 2004). Therefore, we extended these observations by obtaining EFRs to faster modulation rates of up to 4096 Hz. EFRs elicited to these fast AM rates were well above noise floor (Fig. 4A,B), were not affected by ouabain (Fig. 4B,C), increased in amplitude when recorded at the round window (Fig. 4D), and decreased in amplitude only in the oldest age groups when hair cell loss was present (Fig. 5A,B), strongly suggesting that these EFRs had hair cell generators with likely contributions from both OHCs and IHCs. Therefore, by manipulating stimulus parameters, we were able to study temporal envelope coding from hair cells and the auditory nerve.
Whereas differences with age were minimal for hair cell-based EFRs until the onset of hair cell loss, an age-progressive loss of EFR amplitudes was observed for the early neural sources (Fig. 5A,B). This change in EFR amplitudes, when quantified as the EFR ratio relative to hair cell derived responses, is strongly correlated with the degree of synaptopathy (Fig. 5C,D), suggesting that the loss of synapses is accompanied by a reduction in temporal processing at the level of the auditory nerve.
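For readers unfamiliar with the effect-size measure used above to compare ABRs and EFRs, the following is a minimal sketch of Hedges' g for two independent groups. It uses the common approximate small-sample correction, 1 - 3/(4(n1 + n2) - 9). Whether the authors' analysis used this approximation or an exact correction is not stated in the text, so this choice is an assumption, and the function name is hypothetical.

```python
import math
import statistics


def hedges_g(group_a, group_b):
    """Hedges' g: standardized mean difference with a small-sample correction.

    Assumes two independent groups with at least two values each. The
    correction factor is the common approximation, which is an assumption
    here rather than the authors' documented choice.
    """
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    cohens_d = (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd
    correction = 1.0 - 3.0 / (4.0 * (n_a + n_b) - 9.0)
    return cohens_d * correction
```

Under this definition, an absolute value above 0.8 corresponds to the "large effect" threshold quoted in the text. The sign indicates only which group has the larger mean.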
Precise temporal coding of sound at the level of the early auditory pathways is necessary to represent both the temporal fine structure (TFS) corresponding to the carrier frequency of the cochlear filter as well as the slower temporal envelope of that carrier. Envelope cues are predominantly used for speech comprehension (Shannon et al., 1995) and auditory scene analysis (Ding and Simon, 2012), whereas TFS is important to represent fine-timing information for lower frequencies and for binaural processing. TFS is also thought to be crucial in listening conditions in which the presence of multiple speakers adversely affects envelope cues (Lorenzi et al., 2006; Ding et al., 2014; Moore, 2016). Aged listeners with normal hearing thresholds often report significant difficulty understanding speech in degraded listening conditions such as the presence of noise or reverberation (He et al., 1998; Ruggles et al., 2011). In these cases, normal detection thresholds can be poor indicators of suprathreshold auditory performance because even a 10–20% survival of IHCs still yields near normal audiometric thresholds (Lobarinas et al., 2013) as long as OHCs are functionally intact. Rather, these declines in performance may be due to poorer fidelity in coding timing information in the auditory pathway. In this study, EFRs recorded at equal SLs still showed a significant decline in response amplitudes at suprathreshold sound levels (Fig. 6C,D). These suprathreshold declines were strongly correlated with the degree of synaptopathy with age (Fig. 6E,F). In addition, real world listening conditions with competing maskers often degrade the quality of the temporal envelope, reducing the depth of modulations. Decreasing the modulation depth of the AM while still presenting the stimuli at an equal suprathreshold SL resulted in a further decrease in EFR amplitudes (Fig. 7).
This suggests that age-related cochlear synaptopathy decreases the fidelity of temporal processing and reduces the overall dynamic range of coding at suprathreshold sound levels.
Neural processing of sounds in the central auditory pathway relies on a complex interplay of excitatory and inhibitory communications between various auditory nuclei. Peripheral deafferentation due to aging, though more gradual than acute acoustic trauma, may also result in compensatory plasticity to maintain homeostatic balance (Kotak et al., 2005; Resnik and Polley, 2017) due to a decrease in inhibitory neurotransmitters such as glycine and GABA (Caspary et al., 2008; Rabang et al., 2012). This decrease in inhibition in neurons of central auditory nuclei would make them less selective for AM frequencies (Rabang et al., 2012), thereby resulting in an increase in far-field responses. In this study, EFRs elicited by stimuli that emphasize predominantly cortical sources (Ross et al., 2003) show minimal changes with age even when EFRs from peripheral sources are reduced (Fig. 8). This is also accompanied by a relative increase in the activity of the midbrain compared with the auditory nerve, as measured by the ABR wave 5: wave 1 ratio (Fig. 8). These results suggest that the reduced peripheral drive seen with age may be accompanied by increases in central gain. An age-related increase in relative neural activity has also been observed in the inferior colliculus (Walton et al., 2002; Herrmann et al., 2017; Parthasarathy et al., 2018) and the auditory cortex (Overton and Recanzone, 2016). However, this increased activity typically comes at the cost of reduced temporal precision (Trujillo and Razak, 2013; Cai and Caspary, 2015; Overton and Recanzone, 2016) due to the altered balance between excitation and inhibition. Whether the cortical EFRs measured here show further degradations under more challenging listening conditions remains to be explored.
Footnotes
This work was supported by the Department of Defense (Grant W81XWH-15-1-0103 to S.G.K.). We thank Eve Smith for technical support.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Sharon G. Kujawa, Massachusetts Eye and Ear Infirmary, Harvard Medical School, 243 Charles St., Boston, MA 02114. Sharon_Kujawa@meei.harvard.edu