Abstract
Sound-level coding in the auditory nerve is achieved through the progressive recruitment of auditory nerve fibers (ANFs) that differ in threshold of activation and in the stimulus level at which the spike rate saturates. To investigate the functional state of the ANFs, the electrophysiological tests routinely used in clinics only capture the first action potentials firing in synchrony at the onset of the acoustic stimulation. Assessment of other properties (e.g., spontaneous rate and adaptation time constants) requires single-fiber recordings directly from the nerve, which for ethical reasons is not allowed in humans. By combining neuronal activity measurements at the round window and signal-processing algorithms, we constructed a peristimulus time response (PSTR), with a waveform similar to the peristimulus time histograms (PSTHs) derived from single-fiber recordings in young adult female gerbils. Simultaneous recordings of round-window PSTR and single-fiber PSTH provided models to predict the adaptation kinetics and spontaneous rate of the ANFs tuned at the PSTR probe frequency. The predictive model derived from gerbils was then validated in female mice and finally applied to humans by recording PSTRs from the auditory nerve in normal-hearing patients who underwent cerebellopontine angle surgeries. A rapid adaptation time constant of ∼3 ms and a mean spontaneous rate of ∼22 spikes/s in the 4 kHz frequency range were found. This study offers a promising diagnostic tool to map the human auditory nerve, thus opening new avenues to better understanding auditory neuropathies, tinnitus, and hyperacusis.
SIGNIFICANCE STATEMENT Neural adaptation in auditory nerve fibers corresponds to the reduction in the neuronal activity to prolonged or repeated sound stimulation. For obvious ethical reasons, single-fiber recordings from the auditory nerve are not feasible in humans, creating a critical gap in extending data obtained using animal models to humans. Using electrocochleography in rodents, we inferred adaptation kinetics and spontaneous discharge rates of the auditory nerve fibers in humans. Routinely used in basic and clinical laboratories, this tool will provide a better understanding of auditory disorders such as neuropathies, tinnitus, and hyperacusis, and will help to improve hearing-aid fittings.
- action potential
- cochlea
- electrocochleography
- neural adaptation
- peristimulus time histogram
- spontaneous discharge rate
Introduction
Hearing relies on auditory nerve fibers (ANFs), which convey the neural spike trains initiated by the sensory cells of the cochlea to the cochlear nuclei. ANFs differ in spontaneous spike rate, threshold of activation, and the stimulus level at which the spike rate saturates, and thereby the population of ANFs achieves intensity coding over a large dynamic range (Sachs and Abbas, 1974; Liberman, 1978; Winter et al., 1990). Besides their spontaneous spike rate and threshold for activation, the ANFs display a typical pattern of discharge in response to a constant sound. At the onset of sound stimulation, the spike rate first increases and reaches an onset peak. The spike rate then declines to a steady-state value (plateau). On cessation of the stimulus, the spike rate drops below the spontaneous rate before gradually recovering (Young and Sachs, 1973; Harris and Dallos, 1979; Chimento and Schreiner, 1991; Relkin and Doucet, 1991).
Given the role of ANFs in sound coding, characterizing ANF populations and their properties is an important approach to studying auditory deficits. However, electrophysiological tests routinely used in the clinic (i.e., measurements of auditory brainstem responses) only capture the first spikes of ANFs occurring in synchrony at the onset of the stimulus. Therefore, the assessment of other aspects, such as spontaneous discharge rate and adaptation time constants, requires a single-fiber recording technique, which is too invasive to be achieved in humans and too difficult and time-consuming to be routinely conducted in animals. Another approach relies on the far-field responses from an electrode placed near the cochlea, for example, on the round window membrane, or directly on the auditory nerve, which is much more invasive. In response to a low-frequency pure tone (typically below 2 kHz), the phase-locked responses of ANFs produce an alternating-current (AC) component in the gross potential recorded from the nerve, called the neurophonic (Snyder and Schreiner, 1984, 1985). Interestingly, the neurophonic displays a rapid onset and a response decrement similar in appearance to the adaptation of the firing rate of ANFs (Snyder and Schreiner, 1984). When recorded from an electrode placed on the round window membrane, the neurophonic is contaminated by the microphonic potential originating from hair cells, making its measurement difficult. In response to a high-frequency pure tone, the neurophonic is absent because the ANFs do not phase lock (Snyder and Schreiner, 1984) and only the compound action potential (CAP) of the auditory nerve that reflects the well-synchronized spikes at the tone-burst onset remains.
Here, we propose an alternative approach based on an electrical signal recorded at the round window in response to a bandpass-filtered noise rather than a pure tone. We previously showed that the round-window response evoked by a bandpass-filtered noise burst displays a synchronized response at the onset of stimulation followed by an AC component (Batrel et al., 2017) arising from the phase-locked activity of ANFs to stimulus envelope fluctuations (Joris et al., 2004; Louage et al., 2004). After full-wave rectification, the round-window response is characterized by a fast onset peak followed by an adaptation and a steady-state response until the end of the stimulation, mimicking the shape of a peristimulus time histogram (PSTH) from single ANFs (Cazals and Huang, 1996). Because of the similarity of shape with the PSTH of a single ANF, we hypothesized that the temporal pattern of the peristimulus time responses (PSTRs) recorded at the round window reflects the time constants of adaptation of ANFs during sound stimulation. To address this hypothesis, we performed simultaneous recordings of round-window PSTR and single-fiber PSTH in gerbils. Indeed, the PSTRs displayed comparable kinetics to those measured from PSTHs derived from single fibers. Additionally, PSTRs predict the rapid time constant and the PSTH peak-to-plateau value of ANFs in gerbils and mice. Finally, we provide data from gross auditory nerve recordings in humans, supporting the use of PSTRs as a promising tool to better understand auditory-nerve dysfunctions.
Materials and Methods
Gerbil and mouse experiments
Young adult Mongolian female gerbils and C57BL/6 strain female mice were obtained from Charles River Laboratories. Animals were housed in facilities accredited by the French Ministry of Agriculture and Forestry (Ministère de l'Agriculture et de la Forêt, Agreement C-34–172-36), and the experimental protocol was approved (Authorization CEEA-LR-12111) by the Animal Ethics Committee of Languedoc-Roussillon (France). Experiments were conducted in accordance with the animal welfare guidelines (2010/63/EC) of the European Communities Council Directive regarding the care and use of animals for experimental procedures. All efforts were made to minimize the number and suffering of the animals used.
Round-window recordings
Gerbils and mice were anesthetized by an intraperitoneal injection of a mixture of Rompun 2% (3 mg/kg) and Zoletil 50 (40 mg/kg). The left cochlea was exposed through a retroauricular surgical approach. The recording electrode was placed on the bony edge of the round-window membrane of the cochlea. The bulla (including the recording electrode) was then closed with dental cement. Electrophysiological recordings were performed in a Faraday-shielded, anechoic, soundproof cage. Animals were placed on a vibration-isolated table (TMC). The rectal temperature was measured with a thermistor probe and maintained at 38°C ± 1°C using a heated blanket beneath the animal. The acoustic stimuli were delivered in closed field under calibrated conditions using a custom acoustic assembly comprising a signal generator (200 kilo samples/s, 24 bit resolution, PXI-4461 controlled by LabVIEW, National Instruments), an audio amplifier (SA1, Tucker Davis Technologies), and a magnetic speaker (MF1, Tucker Davis Technologies) mounted with an adapter and PVC tubing placed in the external meatus.
Electrical signals were recorded in response to acoustical bursts (trapezoidal envelope, 2.5 ms rise and fall, 300 ms duration) of bandpass-filtered noise using different parameters such as frequency bandwidth, interstimulus time interval, level, and center frequency. Two consecutive bursts (called a pair) were presented in opposite polarity to reverse the waveform of the fine structure without changing the waveform of the temporal envelope (Hartmann, 1997). The pairs were made mutually independent by refreshing the seed of the pseudorandom noise generator (Fig. 1A,B). In total, 50 pairs of noise bursts were presented for each sound level and frequency investigated. The electrical signal was amplified (×20,000, 1–30,000 Hz filter bandwidth, Grass P511 amplifier) and saved for off-line analysis (50 kilo samples/s, 24-bit analog-to-digital conversion resolution, PXI-4461 controlled by LabVIEW). The two electrical signals within each pair were averaged to reduce the microphonic potential, which follows the fine structure of the stimulus, while preserving the neurophonic that stems from the phase-locked activity of ANFs to stimulus envelope fluctuations (Louage et al., 2004; Batrel et al., 2017). Because the background noise (the electrical signal recorded during a stimulus-free period) is uncorrelated between the two bursts, averaging within each pair also reduced its amplitude and improved the neurophonic-to-noise ratio. The averaged electrical signal for each pair was then bandpass filtered at 300–1200 Hz (Fig. 1C). This filter bandwidth was chosen according to the spectrum of the spontaneous neural noise, which is centered at ∼800–900 Hz in gerbils (Batrel et al., 2017), mice (A. Huet, C. Batrel, J-L. Puel, and J. Bourien, unpublished observations), and humans (Pardo-Jadue et al., 2017; Verschooten et al., 2018).
The filter output was then rectified (full-wave rectification) and smoothed (moving average of the elements of the vector with a fixed window length of 1 ms, function smooth in MATLAB) to obtain a rectified smoothed signal (Fig. 1D). Finally, the PSTR was obtained by averaging the rectified smoothed signals over all pairs (Fig. 1E).
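The processing chain described above (averaging within each pair, 300–1200 Hz filtering, full-wave rectification, 1 ms smoothing, and averaging across pairs) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the MATLAB code used in the study: the function names, the array layout (one row per pair), and the FFT-based brick-wall bandpass are hypothetical choices, because the filter type is not specified in the text.

```python
import numpy as np

FS = 50_000  # recording sampling rate in samples/s, as stated in the text


def bandpass_fft(x, fs, f_lo=300.0, f_hi=1200.0):
    """Zero-phase brick-wall bandpass via the FFT (filter type is an assumption)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    return np.fft.irfft(spectrum, n=x.size)


def moving_average(x, fs, span_s=1e-3):
    """Moving average over a fixed span (1 ms by default)."""
    n = max(1, int(round(span_s * fs)))
    return np.convolve(x, np.ones(n) / n, mode="same")


def compute_pstr(responses_pos, responses_neg, fs=FS):
    """Compute a PSTR from responses to the two polarities of each pair.

    responses_pos, responses_neg: arrays of shape (n_pairs, n_samples).
    """
    pos = np.asarray(responses_pos, dtype=float)
    neg = np.asarray(responses_neg, dtype=float)
    # Averaging the two polarities cancels components that invert with the
    # stimulus fine structure and keeps envelope-locked components.
    pair_mean = 0.5 * (pos + neg)
    filtered = np.array([bandpass_fft(p, fs) for p in pair_mean])
    rectified = np.abs(filtered)  # full-wave rectification
    smoothed = np.array([moving_average(r, fs) for r in rectified])
    return smoothed.mean(axis=0)  # average over all pairs
```

Calling `compute_pstr(pos, neg)` with two arrays of 50 rows each would mirror the 50 pairs presented per condition. A component that simply inverts with stimulus polarity averages to zero within a pair, which is the rationale given for the reduction of the microphonic potential.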
Simultaneous round-window and single auditory nerve fiber recordings
Simultaneous recordings from the round window and the auditory nerve were performed in gerbils only, not in mice. The round-window electrode was positioned as described above. Next, animals were prepared for single-fiber recordings from the auditory nerve, as described in Huet et al. (2018). Briefly, animals were placed in a custom head holder, and their body temperature was monitored and maintained at 38°C ± 1°C. The calibrated acoustic stimuli were delivered in closed field to the tympanic membrane through magnetic speakers (MF1, Tucker Davis Technologies) coupled to the ear bars. The left cochlear nerve was exposed using a posterior fossa approach. Extracellular action potentials from single auditory nerve fibers were recorded with glass microelectrodes (in vivo resistance between 80 and 110 MΩ) filled with 3 M NaCl and connected to an AxoClamp 2B (Molecular Devices). A silver-silver chloride reference wire was placed in the neck musculature of the animal. The spontaneous rate of the fiber (SR in spikes/s) was estimated by counting the spikes over 30 s. Characteristic frequencies (CFs) of the fibers were measured using a threshold-tracking program (criterion: 10 spikes/s above SR).
Recordings were obtained in response to the bandpass-filtered noise bursts described above. Simultaneous round-window and single auditory nerve fiber recordings were performed in 43 ANFs. The CF, threshold, and SR of the fibers ranged from 3 to 13.45 kHz, from 2 to 48 dB sound pressure level (SPL), and from 0.6 to 65 spikes/s, respectively. Statistical comparison of CF, threshold, and SR distributions from this set of 43 ANFs with those of a larger set of data published in Huet et al. (2016) did not show a significant difference (two-sample Kolmogorov–Smirnov test, p > 0.05).
The shape of the PSTR and PSTH depends on the temporal resolution of the acquisition system and on the smoothing filter. A smaller span time makes the response noisy and degrades the quality of the fitting, unless the experimenter increases the number of presentations. Conversely, a larger span time removes the detail needed to finely evaluate the time course of the response. To be consistent with previously reported data in gerbils (Westerman and Smith, 1984), we chose a sampling rate of 50,000 samples/s (i.e., 20 µs resolution) and a smoothing span time of 1 ms for PSTR. Experimentally, simultaneous recording of PSTR and PSTH is challenging because a fiber is difficult to hold over a long period of time. Consequently, the choice of a bin size of 0.5 ms to build the PSTHs was a reasonable compromise between the temporal resolution, the quality of the fitting, and the experimental duration of the acquisition.
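To make these numbers concrete: at 50,000 samples/s one sample lasts 20 µs, so a 1 ms smoothing span covers 50 samples and a 0.5 ms PSTH bin covers 25 sample periods. The sketch below shows one way a PSTH could be built from spike times with 0.5 ms bins. The function name, the input layout (one list of spike times per presentation, in seconds relative to stimulus onset), and the rate normalization are assumptions made for illustration and are not taken from the study.

```python
def build_psth(spike_trains, duration_s, bin_s=0.5e-3):
    """Return (bin_starts_s, rates_spikes_per_s) for a set of presentations.

    spike_trains: list of lists of spike times (s), one list per presentation.
    The rate in each bin is count / (n_presentations * bin_s).
    """
    n_bins = int(round(duration_s / bin_s))
    counts = [0] * n_bins
    for train in spike_trains:
        for t in train:
            k = int(t / bin_s)  # index of the bin containing this spike
            if t >= 0 and k < n_bins:
                counts[k] += 1
    norm = len(spike_trains) * bin_s
    rates = [c / norm for c in counts]
    starts = [i * bin_s for i in range(n_bins)]
    return starts, rates
```

With fewer presentations each bin holds fewer spikes, so the rate estimate becomes noisier. This is the trade-off between temporal resolution and fit quality discussed above.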
Recording from the auditory nerve in humans
The electrophysiological recordings from the auditory nerve in humans were performed in the Reims University Hospital. Patients underwent microvascular decompression to relieve trigeminal neuralgia (n = 7) and hemifacial spasm (n = 1) via the retrosigmoid approach (Møller and Jannetta, 1981; Møller and Jho, 1989). The recordings presented in this report were made as part of the routine intraoperative monitoring of auditory-evoked potentials. Such monitoring minimizes the risk of hearing loss resulting from manipulation of the eighth nerve. The Ethics Committee Sud Méditerranée approved this study (MelAudi-2). All the subjects (7 females, 1 male) gave their informed consent to participate in this clinical trial (ClinicalTrials.gov identifier: NCT03552224). The average age of the patients was 62.1 ± 9.3 years, and they had normal auditory thresholds (≤20 dB HL) between 500 and 4000 Hz. Monitoring was based on the CAP of the auditory nerve in response to clicks varying from 0 to 80 dB SPL in 10 dB steps. At the end of the decompression procedure, PSTRs were recorded in response to bursts of one-third octave bandpass-filtered noise (200 ms duration, 2.5 presentations/s, 100 presentations) centered on 4 kHz and presented 40 dB above the click-evoked CAP threshold. According to pioneering work, the electrical potential recorded on the surface of the cochlear nerve is weakly contaminated by the microphonic potential originating from cochlear sensory cells (Møller and Jho, 1989, 1991) and by the neural potential from the cochlear nuclei (Møller et al., 1982; Møller and Jho, 1989; Møller et al., 1994).
Data analysis
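Given the shape of the PSTR recorded at the cochlear round window, we used the same fitting model as that proposed by Westerman and Smith (1984) to characterize the adaptation of the firing rate in auditory nerve fibers. This fitting model consists of two exponentially decaying components (rapid and short-term) plus a constant term as follows:
$$A(t) = A_{R}\, e^{-t/\tau_{R}} + A_{ST}\, e^{-t/\tau_{ST}} + A_{SS}$$
where A(t) is the response at time t, AR and AST are the amplitudes of the rapid and short-term components, τR and τST are their respective time constants, and ASS is the amplitude of the steady-state component.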
Because of the ANF refractoriness, a silent period frequently appears in the 2 ms following the PSTH onset peak, which produces a very rapid adaptation-like response (Yates et al., 1985). To avoid the adaptation-like response distorting the PSTH fitting process, we removed this silent period in the PSTH by interpolation between the onset bin and the first bin following the silent interval, as previously proposed by Westerman and Smith (1984). To test the ability of the PSTR
Predictive modeling was based on regression analysis. When a significant relationship (linear or nonlinear) was found between a parameter derived from PSTR and its corresponding parameter derived from PSTH (e.g., PSTR
Statistics
Data are expressed as mean ± SEM. Normality of the variables was tested by the Shapiro–Wilk test. If conditions for a parametric test were met, the significance of the group differences was assessed with a one-way ANOVA; once the significance of the group differences (p < 0.05) was established, Tukey's post hoc tests were subsequently used for pairwise comparisons. If conditions were not met, Kruskal–Wallis tests were used to assess the significance of differences among several groups; if the group differences were significant (p < 0.05), Dunn's tests were then used for post hoc comparisons between pairs of groups.
Results
Using electrocochleography in gerbils, we recorded mass potentials at the round window evoked by one-third octave bandpass-filtered noise (Batrel et al., 2017). A trial consisted of a pair of bursts with opposite polarities to reduce the cochlear microphonic originating primarily from outer hair cells. The response was then filtered at 300–1200 Hz to isolate the neural component (Fig. 1A-C). The PSTR was built from the average across trials of a full-wave rectified and smoothed electrical signal (Fig. 1D). Consistent with its neuronal origin, acute 30 min round-window infusion of 10 µM tetrodotoxin (TTX) completely abolished the response. Note, however, the TTX resistance of the baseline (1.73 ± 0.09 µV before TTX and 1.02 ± 0.05 µV after TTX) and the drop into the noise floor after death (0.30 ± 0.01 µV), suggesting a contribution from some non-neuronal or noncochlear origin (Fig. 1E). Therefore, PSTR measurements were referred to the baseline mean value measured before the evoked response. The PSTR hallmarks consisted of a peak shortly after stimulus onset followed by an adaptation, plus a plateau until the end of the stimulus. Similarly to the PSTH from a single ANF (Westerman and Smith, 1984), the PSTR can be well fit as the sum of two exponentially decaying components and a constant, steady-state component (rapid,
PSTR of the auditory nerve. A, B, Electrical signal recorded at the round window (B) in response to 300 ms bursts of one-third octave bandpass-filtered noise centered on 4 kHz and presented at 50 dB SPL (A). C, Neurophonic isolation by (1) calculating the average of individual responses within each pair and (2) bandpass filtering to catch the neuronal component generated by the auditory nerve (Batrel et al., 2017). D, Full-wave rectification and smoothing (1 ms span time) of the neurophonic potential. E, A PSTR was obtained by averaging the individual rectified smoothed signals shown in D. Infusion of 10 µM TTX into the round window niche completely abolished the PSTR. Note the resistance of the baseline to TTX (purple trace) as compared with death (gray trace). F, G, The PSTR time course during sound stimulation was fitted with a model consisting of two exponentially decaying (rapid and short term) components and a constant, steady-state component.
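As an illustration of the fitting model named in the legend, the sketch below evaluates the sum of two decaying exponentials plus a constant and fits it by least squares. It is a hedged example rather than the procedure used in the study: the function names are hypothetical, and the optimizer is replaced here by a simple grid search over candidate time constants, with the three amplitudes solved by linear least squares at each grid point.

```python
import numpy as np


def adaptation_model(t, a_r, tau_r, a_st, tau_st, a_ss):
    """Rapid and short-term decaying exponentials plus a steady-state constant."""
    return a_r * np.exp(-t / tau_r) + a_st * np.exp(-t / tau_st) + a_ss


def fit_adaptation(t, y, tau_r_grid, tau_st_grid):
    """Grid search over time constants, linear least squares for amplitudes.

    Returns (a_r, tau_r, a_st, tau_st, a_ss, r2). The grid search is an
    illustrative stand-in, not the optimizer used in the study.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for tau_r in tau_r_grid:
        for tau_st in tau_st_grid:
            if tau_st <= tau_r:  # keep the rapid component the faster one
                continue
            design = np.column_stack(
                [np.exp(-t / tau_r), np.exp(-t / tau_st), np.ones_like(t)]
            )
            amps, *_ = np.linalg.lstsq(design, y, rcond=None)
            sse = float(np.sum((design @ amps - y) ** 2))
            if best is None or sse < best[0]:
                best = (sse, amps, tau_r, tau_st)
    sse, (a_r, a_st, a_ss), tau_r, tau_st = best
    sst = float(np.sum((y - y.mean()) ** 2))
    return a_r, tau_r, a_st, tau_st, a_ss, 1.0 - sse / sst  # r2 last
```

Time and time constants must share the same unit (seconds or milliseconds). The returned `r2` is the coefficient of determination of the best grid point, and a finer grid or a nonlinear optimizer would be needed for precise estimates on real data.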
Characterization of the peristimulus time responses
Dependence on the stimulus bandwidth
In addition to well-synchronized spikes at the onset of stimulation, bandpass-filtered noise drives synchronized spikes that stem from the phase-locked activity of ANFs to stimulus envelope fluctuations (Louage et al., 2004). Decreasing the bandwidth of the bandpass-filtered noise reduces envelope fluctuations (Dau et al., 1999). Because the plateau of the PSTR relies on envelope fluctuation, we expect a decrease in the amplitude of the PSTR plateau with bandwidth reduction. To test this hypothesis, we compared PSTRs evoked by noise bursts of different bandwidths to those evoked by pure tones (Fig. 2A,B, representative example). The center frequency of the noise band was fixed at 4 kHz (i.e., the best sensitivity region of the hearing range of the gerbils; Ryan, 1976; Huet et al., 2016), and the level set to 50 dB SPL. Although the PSTR onset peak remained constant across all conditions, reduction of the bandwidth led to an amplitude decrease of the plateau, which, for a pure tone, almost completely disappeared (Fig. 2A,B).
PSTR as a function of stimulus frequency bandwidth expressed as octave fractions. A, PSTRs (bottom traces) in response to bandpass-filtered noise (top traces) centered at 4 kHz and varying in bandwidth from one-half to one-ninth octave (data from 1 representative gerbil). Right, The PSTR obtained to a tone burst presented at 4 kHz (gray). The level of presentation was fixed at 50 dB SPL for all stimuli. B, Magnification of the PSTR onset peak shown in A. Red, The fitting model and its coefficient of determination (R², red digit). C–E, Amplitudes of the three components of the fitting model as a function of the bandwidth. F, G, Time constant of the rapid and short-term components as a function of the stimulus bandwidth. H, PSTR peak-to-plateau ratio as a function of the bandwidth. Data are mean ± SEM, n = 12 gerbils. Statistical comparisons between pairs of samples were performed using the one-half octave band as the reference. Data obtained in response to a tone burst are shown using a colored background. The
A least-squares curve-fitting method was used to quantify the amplitudes (Fig. 2C–E) and the time constants of the assumed components of the PSTR (Fig. 2F,G). The amplitude of the rapid component AR was resistant to the bandwidth reduction and showed a slight nonsignificant increase (from 9.1 ± 0.8 µV for a half-octave-wide noise burst to 10.6 ± 0.9 µV for a tone burst, Fig. 2C). In contrast, the amplitude of the short-term (AST) and the steady-state (ASS) components decreased for smaller bandwidths (AST, from 1.3 ± 0.12 µV for a half-octave-wide noise burst to 0.12 ± 0.05 µV for a tone burst; ASS, from 2.5 ± 0.27 µV for a half-octave-wide noise burst to 0.25 ± 0.03 µV for a tone burst; Fig. 2D,E). The kinetics of the rapid component became faster with bandwidth reduction (
Dependence on the interstimulus time interval
The ability of the ANFs to respond consistently to repeated acoustic cues depends strongly on the time between successive stimulations (Young and Sachs, 1973; Harris and Dallos, 1979; Chimento and Schreiner, 1991; Relkin and Doucet, 1991). To evaluate the behavior of the PSTRs in this masking phenomenon, we varied the interstimulus interval between two consecutive bursts of one-third octave bandpass-filtered noise centered on 4 kHz from 40 to 600 ms (Fig. 3A). Prolonging the interstimulus time interval from 40 to 300 ms increased the onset peak amplitude. Above 300 ms, the onset peak remained constant. We then expressed the amplitude parameters derived from the fits as functions of the interstimulus interval (Fig. 3B–D). Like the onset peak, the amplitude of the rapid- and short-term components increased up to 300 ms interstimulus intervals but did not change beyond 300 ms (AR, 4.9 ± 0.5 µV to 9.0 ± 0.8 µV and AST, 0.5 ± 0.1 µV to 1.5 ± 0.2 µV, from 40 to 300 ms time intervals, respectively, Fig. 3B,C). In contrast, the steady-state component (ASS), which reflects the plateau, was independent of the interstimulus interval (Fig. 3D). Only the kinetics of the rapid component changed below 300 ms interstimulus interval (
PSTR as a function of the interstimulus time interval. A, PSTRs (bottom traces) in response to one-third octave bandpass-filtered noise centered at 4 kHz (top traces) and varying in interstimulus interval (ISI) from 0.04 to 0.6 s (data from 1 representative gerbil). The level of the presentation was fixed at 50 dB SPL. Red, The fitted model and its coefficient of determination (R², red digit). B–D, Amplitude of the three components of the fitting model as a function of the ISI. E, F, Time constant of the rapid and short-term components as a function of the ISI. G, PSTR peak-to-plateau ratio as a function of the ISI. Data are mean ± SEM, n = 12 gerbils. Statistical comparisons between pairs of samples were performed using ISI = 0.6 s as the reference. The y-axis is displayed using a logarithmic scale (B–G).
Simultaneous recordings of peristimulus time response at the round window and of single fiber from the auditory nerve
To investigate the relationship between the PSTH of single ANFs and the PSTR recorded at the round window, we conducted 43 simultaneous measurements using a sharp glass pipette in the gerbil auditory nerve and a silver ball electrode in the round window niche, respectively. For each ANF, the SR and the CF of the fiber were determined (Fig. 4A). Both PSTR and PSTH were obtained in response to one-third octave bandpass-filtered noise centered on the CF of the fiber.
Simultaneous recordings in response to increasing sound level. A, Shown are the spontaneous rate and characteristic frequency of the fibers used in this analysis. For 17 of the 43 fibers (filled circles), simultaneous recordings were successfully performed from 20 to 60 dB SPL in 10 dB steps. B, The mean PSTR (top traces) and the mean PSTH (bottom traces) obtained in response to a burst of one-third octave bandpass-filtered noise centered on the CF of the fiber, varying in level from 20 to 60 dB SPL (n = 17 ANFs). Red, The fitting model with its coefficient of determination (R², red digit). C, Amplitude parameters derived from the mean PSTR (dark color, left axis) and the mean PSTH (light color, right axis), as a function of the sound level. D, Correlation between amplitude parameters derived from the mean PSTH (y-axis) and the mean PSTR (x-axis) for the different sound levels shown in C. Dashed line is a fit to the data using a power-law model for AR (y = 106 × x^0.54) and a power-law model plus a deviation term for AST and ASS (AST, y = −25 × x^−1.1 + 106; ASS, y = −29 × x^−1 + 148). E, Time-constant parameters derived from the mean PSTR (dark color, left axis) and the mean PSTH (light color, right axis) as a function of the sound level. F, Correlation between time-constant parameters derived from the mean PSTH (y-axis) and the mean PSTR (x-axis) for the different sound levels shown in E. The dashed line (F) is a fit to the data using a power-law model (y = 1.28 × x^0.76). G, Peak-to-plateau ratio derived from the mean PSTR (dark color, left axis) and the mean PSTH (light color, right axis) as a function of the sound level. H, Correlation between peak-to-plateau ratio derived from the mean PSTH (y-axis) and the mean PSTR (x-axis) for the different sound levels shown in G. Dashed line is a fit to the data using a power-law model (y = 0.51 × x^1.21). The error bars (C–H) indicate the 95% confidence intervals given by the fitting procedure. For very small confidence intervals, the error bars are hidden behind the symbol used to indicate the center of the confidence interval. The y-axis is displayed using a logarithmic scale (C, E, G). The x-axis and y-axis are displayed using a logarithmic scale (D, F, H).
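The power-law relationships in this legend map a PSTR-derived parameter (x) onto its PSTH-derived counterpart (y) in the form y = a × x^b, which is a straight line on the logarithmic axes used in panels D, F, and H. The sketch below shows one way such a relationship could be estimated and then used for prediction. Fitting by ordinary least squares on log-transformed values is an assumption made for illustration, the function names are hypothetical, and the source does not state which regression routine was used.

```python
import math


def fit_power_law(x, y):
    """Fit y = a * x**b by ordinary least squares on log10(x) and log10(y)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx                 # slope in log-log space = exponent
    a = 10 ** (my - b * mx)       # intercept in log-log space = log10(a)
    return a, b


def predict_power_law(x, a, b):
    """Predict the PSTH-derived parameter from a PSTR-derived value."""
    return a * x ** b
```

Both inputs must be strictly positive for the logarithm to be defined. The power-law models with an added deviation term, reported for AST and ASS, are not covered by this sketch.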
Dependence on sound-pressure level
We first correlated the activity of several ANFs to the corresponding PSTR by comparing the fits of the mean PSTR and the mean PSTH, as a function of sound level. To do so, we selected 17 fibers for which a complete experimental protocol was available, that is, successful recording of activity from 20 to 60 dB SPL (Fig. 4A, filled circles). Although both the mean PSTR and the mean PSTH behaved similarly (Fig. 4B), amplitude-fitting parameters derived from the mean PSTH tended to saturate for suprathreshold levels (especially for AST and ASS), whereas those of the mean PSTR increased more slowly and did not saturate (Fig. 4C), probably because of cochlear spread of excitation. Plotting the amplitude-fitting parameters of the mean PSTR (AR, AST, and ASS) against those of the mean PSTH displayed relationships that can be approximated by a power-law function for AR (R² = 0.99) and a power-law function plus a deviation term for AST and ASS (R² = 0.99; Fig. 4D). The rapid time constant
Dependence on center frequency
In the gerbil, inner hair cells (IHCs) from the apical half of the cochlea are mainly innervated by high-SR ANFs, whereas basal IHCs are innervated by ANFs with a greater SR diversity (Schmiedt, 1989; Ohlemiller and Echteler, 1990; Müller, 1996; Bourien et al., 2014; Huet et al., 2016, 2019). When measured at a constant level above the threshold (i.e., 30 or 40 dB sensation level, SL), high-SR fibers have been shown to exhibit a shorter τR (Westerman and Smith, 1984) and larger PSTH peak-to-plateau ratio than the low-SR fibers (Rhode and Smith, 1985; Müller and Robertson, 1991; Taberner and Liberman, 2005). To test the ability of PSTR
Simultaneous recordings in response to different probe frequencies. A, Average PSTRs (top row) and PSTHs (bottom row) obtained when data were pooled per one-third octave band, according to the CF of the fiber. Forty-three ANFs were considered in this analysis (Fig. 4A), with a minimum of three fibers per one-third octave band. For each fiber, the PSTR and the PSTH were measured at 50 dB SPL in response to a one-third octave bandpass-filtered noise centered on CF. B, Rapid time constant derived from PSTH (y-axis) as a function of that derived from PSTR (x-axis), considering the data in A (3.15 kHz in purple to 12.6 kHz in orange). C, Peak-to-plateau ratio derived from PSTH (y-axis) as a function of that derived from PSTR (x-axis). D, E, Spontaneous rate of the fibers per one-third octave band (y-axis; D, E) as a function of PSTH peak-to-plateau ratio (x-axis; D) and PSTR peak-to-plateau ratio (x-axis; E). For PSTR τr, PSTH τr, PSTH peak-to-plateau ratio, and PSTR peak-to-plateau ratio shown in B–E, the vertical and horizontal error bars indicate the confidence interval (confidence level 95%) of the fit derived from the mean PSTRs and the mean PSTHs shown in A. For very small confidence intervals, the error bar is hidden behind the symbol used to indicate the center of the confidence interval. Error bars associated with SR are mean ± SEM (D, E), calculated when fibers were pooled per one-third octave band according to CF (Fig. 4A). The dashed line represents a power-law fit of the data (B–E). The gray area gives the predicted range of the fit for a confidence level of 95%. The top and bottom bounds of the horizontal blue area are a predicted range of the parameter indicated on the y-axis according to the measured value on the x-axis (vertical blue line). The x-axis and y-axis are displayed using a logarithmic scale (B–E).
A power law was used to fit the relationship between mean PSTH
Predictive modeling
Next, we probed whether the power laws used to describe the relationships between PSTR
Validation of the predictive model in gerbils
We first measured PSTRs in the independent group of gerbils (Fig. 6A). The center frequency was varied from 3.15 to 16 kHz in one-third octave step increments, and the noise bursts were presented at 50 dB SPL (i.e., 40 dB above the PSTR thresholds as in Fig. 5). Note that the
Next, single-fiber recordings from the gerbil dataset of Huet et al. (2016) were reanalyzed to extract PSTH
Prediction of fiber PSTH τR and spontaneous discharge rate (SR) in gerbils. A, Shown is a scatter plot of PSTH τR as a function of the CF of the fiber for 222 ANFs (filled circles). The red open circles show the mean τR (± 1 SEM) calculated from individual τR values pooled per one-third octave band according to the CF of the fiber (vertical dashed lines delimit the one-third octave bands). The SR of the fiber is represented using a color scale from green (low SR) to red (high-SR fiber). B, Same data as in A without individual data and using a y-axis magnification. The gray area gives a PSTR-based prediction range of the mean PSTH τR using the predictive model shown in Figure 5B. C, Shown is a scatter plot of fiber SR as a function of CF for 222 ANFs (filled circles). The red open circles show the mean SR (± 1 SEM) when fibers were pooled per one-third octave band according to CF. D, Same data as in C without individual data and using a y-axis magnification. The gray area gives a PSTR-based prediction range of the mean SR using the predictive model shown in Figure 5E. Note the high degree of correlation between predicted and measured PSTH τR (R² = 0.98; B, inset) and SR (R² = 0.96; D, inset) values. The PSTHs and PSTRs used for the prediction were obtained at 40 dB above PSTR thresholds. The x-axis and y-axis are displayed using a logarithmic scale.
PSTR as a function of the center frequency in gerbils. A, PSTRs (bottom traces) to a one-third octave bandpass-filtered noise (top traces; data obtained in 1 representative gerbil). The stimulus center frequency was varied from 3.15 to 16 kHz in one-third octave increments, and the noise bursts were presented at 50 dB SPL. Red, The fitting model and its coefficient of determination (R2, red digit). B–D, Amplitudes of three components of the fitting model (inset) as a function of the probe frequency. E, F, Time constants of the rapid and short-term components as a function of the probe frequency. G, The PSTR peak-to-plateau ratio as a function of the probe frequency. Data are mean ± SEM, n = 14 gerbils. Statistical comparisons between pairs of samples were performed using 4 kHz as the reference. The x-axis and y-axis are displayed using a logarithmic scale (B–G).
Validation of the predictive model in mice
To validate the PSTR as a neural index to infer rapid adaptation time constant and mean SR of a subpopulation of ANFs in another animal model, we recorded PSTRs in C57BL/6 mice (Fig. 8A). The center frequency was varied from 5.7 to 32 kHz in one-half octave increments, and the noise bursts were presented at 40 dB above the C57BL/6 PSTR thresholds (data not shown). The amplitude of the rapid component was independent of the probe frequency (Fig. 8B), as were the short-term and the steady-state components (Fig. 8C,D). However, the rapid time constant
PSTR as a function of the probe frequency in mice. A, PSTRs (bottom traces) obtained in response to a one-third octave bandpass-filtered noise (top traces). The center frequency of the stimulus varied from 5.7 to 32 kHz in one-half octave increments, and the noise bursts were presented at 40 dB above mouse PSTR thresholds. Red, The fitting model and its coefficient of determination (R2, red digit). B–D, Amplitudes of the three components of the fitting model (inset) as a function of the probe frequency. E, F, Time constants of the rapid and short-term components as a function of the probe frequency. G, The PSTR peak-to-plateau ratio as a function of the probe frequency. Data are mean ± SEM, n = 9 mice. The x-axis and y-axis are displayed using a logarithmic scale (B–G).
Prediction of fiber PSTH τR and spontaneous discharge rate (SR) in mice. A, Shown is a scatter plot of PSTH τR as a function of the CF of the fiber for 144 ANFs (CBA/CaJ, filled circles, 111 ANFs; C57BL/6, open circles, 33 ANFs). The SR of the fiber is represented using a color scale from green (low SR) to red (high-SR fibers). The large and red open circles show the mean PSTH τR (± 1 SEM) when fibers were pooled per one-half octave frequency band according to CF (vertical dashed lines). B, Same data as in A without individual data and using a y-axis magnification. The gray area gives a PSTR-based prediction range of the mean PSTH τR using the predictive model shown in Figure 5B. C, Shown is a scatter plot of the SR of the fiber as a function of the CF of the fiber for the same ANFs shown in A. The red open circles show the mean SR (± 1 SEM) when fibers were pooled per one-half octave band according to CF. D, Same data as in C without individual data and using a y-axis magnification. The gray area gives a PSTR-based prediction range of the mean SR using the predictive model shown in Figure 5E. Note the correlation between the predicted and measured PSTH τR (R2 = 0.89; B, inset). Although the correlation index was lower (R2 = 0.1; D, inset), the SR values are distributed along the y = x line. The PSTHs and PSTRs used for the prediction were obtained at 40 dB above PSTR thresholds. The x-axis and y-axis are displayed using a logarithmic scale.
Prediction of fiber PSTH τR and SR at 4 kHz in humans. A, The CAP N1–P1 amplitude (mean ± SEM) in response to acoustic clicks as a function of the sound level. Top, Example of a CAP in one representative patient (inset; 200 presentations, same polarity, 40 dB above the click-evoked CAP threshold). Bottom, Auditory nerve responses (inset) were measured using a ball electrode directly on the surface of the eighth nerve. C, Cochlear nerve; V, vestibular nerve. B, PSTR to a 200 ms burst of one-third octave bandpass-filtered noise centered at 4 kHz and presented at 40 dB above the click-evoked CAP threshold (100 presentations, opposite polarity). The blue solid line indicates the mean PSTR across the eight patients (±1 SEM, light blue area). Red, The fitted model and its coefficient of determination (R2, red text). C, Prediction of the PSTH τR (horizontal blue area) using the measured PSTR τR at 40 dB above the click-evoked CAP threshold (vertical blue line) and the predictive model identified in gerbils (gray area). D, Prediction of the mean SR (horizontal blue area) using the measured PSTR peak-to-plateau ratio at 40 dB above the click-evoked CAP threshold (vertical blue line) and the predictive model identified in gerbils (gray area). The y-axis is displayed using a logarithmic scale (A). The x-axis and y-axis are displayed using a logarithmic scale (C, D).
Recordings from the auditory nerve in humans
For obvious ethical reasons, single-fiber recordings from the cochlear nerve have never been performed in humans, creating a critical gap in our understanding of the applicability of data from animal models. We took advantage of cerebellopontine angle surgeries to record PSTRs from an electrode placed on the intracranial portion of the cochlear nerve in eight consenting patients (62.1 ± 9.3 years old) with normal hearing (average thresholds between 0.5 and 4 kHz, 12.9 ± 0.9 dB HL). Those patients were undergoing neurosurgeries for cranial-nerve functional disorders (trigeminal neuralgia and hemifacial spasm; Møller and Jannetta, 1981; Møller and Jho, 1989). Click-evoked CAPs were recorded intraoperatively to ensure that no noticeable changes occurred in auditory nerve function as a result of the surgical dissection (Fig. 10A). Then PSTRs were recorded in response to noise bursts centered on 4 kHz, presented 40 dB above the click-evoked CAP threshold (i.e., 70 dB SPL). PSTRs were also obtained using stimulus pairs of opposite polarity. The waveform of the human PSTR was similar to those recorded from gerbils or mice (Fig. 10B). The rapid time constant
Discussion
Using far-field recordings in rodents, we probed the adaptation kinetics and spontaneous action-potential firing in limited populations of ANFs as a function of their tuning frequency. In addition, we examined whether the relationship between the PSTR and PSTH allows the prediction of these two electrophysiological features in normal-hearing human ANFs.
Adaptation of auditory nerve fibers
In response to the onset of an acoustic stimulus, the spike rate of an ANF rapidly increases to a maximum value and thereafter adapts over the course of minutes (Young and Sachs, 1973; Johnson, 1980; Zilany et al., 2009; Zilany and Carney, 2010). The postonset reduction in discharge rate has been mainly attributed to the reduction of vesicle release at the IHC ribbon synapse (Beutner et al., 2001; Goutman and Glowatzki, 2007). Other mechanisms, including the desensitization of postsynaptic glutamate receptors and refractoriness of action potential generation at the ANF level, may also contribute (Heil and Peterson, 2015). The most definitive experimental procedure to investigate ANF adaptation relies on single-fiber recording. However, this approach is very invasive and difficult to achieve because the auditory nerve lies deep beneath the cerebellum and is largely surrounded by the petrous portion of the temporal bone. Here, we investigated whether far-field PSTRs recorded at the round window mimic single-fiber PSTHs, especially with respect to the rapid and short-term time constants of the postonset adaptation. With increasing sound level, we saw a reduction of the rapid time constant. Consistent with this, Westerman and Smith reported a decrease of the PSTH rapid time constant from 10 to 1 ms as tone level increased (Westerman and Smith, 1984), which is in the range of our PSTR data. In addition, the short-term time constant they reported was independent of the stimulus level and was approximately 50 ms, which is also in agreement with our PSTR measurements.
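To make the rapid and short-term components concrete, the following is a minimal sketch, assuming the postonset response is described by two decaying exponentials plus a steady-state term, in the form used by Westerman and Smith (1984). The definition of the peak-to-plateau ratio as onset value divided by steady-state amplitude is an assumption made for illustration, as are the amplitude values; only the time constants (∼3 and ∼50 ms) echo values quoted in the text.

```python
import math

def adaptation_model(t_ms, a_r, tau_r_ms, a_st, tau_st_ms, a_ss):
    """Rapid + short-term exponential decay on top of a steady-state level."""
    rapid = a_r * math.exp(-t_ms / tau_r_ms)
    short_term = a_st * math.exp(-t_ms / tau_st_ms)
    return rapid + short_term + a_ss

def peak_to_plateau(a_r, a_st, a_ss):
    """Assumed definition: model value at onset (t = 0) divided by the steady-state amplitude."""
    return (a_r + a_st + a_ss) / a_ss

# Hypothetical amplitudes (arbitrary units); time constants chosen to match the text
params = dict(a_r=4.0, tau_r_ms=3.0, a_st=2.0, tau_st_ms=50.0, a_ss=1.0)

onset = adaptation_model(0.0, **params)    # sum of the three amplitudes
late = adaptation_model(200.0, **params)   # close to the steady-state level
ratio = peak_to_plateau(params["a_r"], params["a_st"], params["a_ss"])
```

In this form, a shorter rapid time constant at higher sound levels steepens the initial decay without changing the onset or plateau values, which are set by the amplitudes alone.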
To most directly evaluate how a population of ANFs contributes to the PSTR, we conducted simultaneous recordings of single ANFs and round-window PSTRs. Pair-by-pair analysis showed many similarities, such as the shape and the time constants. A few discrepancies are, however, inherent to the nature of the recording techniques. For example, PSTH amplitudes (especially AST and ASS) tend to saturate at suprathreshold levels, whereas those of PSTR continue to increase. Because PSTHs reflect the time course of the discharge rate in response to sound, their amplitudes are limited by the maximum discharge rate of the fiber and the relatively limited dynamic range of single ANFs. In contrast, the PSTR amplitude can continue to grow for sound levels above 50 dB SPL because higher levels of stimulation recruit ANFs with characteristic frequencies outside the nominal bandwidth of the noise. Thus, at least within a range of 40 dB above threshold, PSTRs may reflect the rapid and short-term adaptation of the single ANFs with CFs that lie within the noise bandwidth.
Spontaneous activity of auditory nerve fibers
Several studies have described the SR patterns of ANFs in the gerbil cochlea, with a majority of high-SR fibers in the apical part and a more balanced distribution of high-, medium-, and low-SR fibers in the basal half (Schmiedt, 1989; Ohlemiller and Echteler, 1990; Müller, 1996; Huet et al., 2016; Petitpré et al., 2020). As shown in Figure 5D, high-SR fibers exhibit greater peak-to-plateau ratios than low-SR fibers. This is also true in cats (Rhode and Smith, 1985), guinea pigs (Müller and Robertson, 1991), chinchillas (Relkin and Doucet, 1991), and mice (Taberner and Liberman, 2005). In contrast, Peterson and Heil (2021) found that the peak-to-plateau ratio of ANFs in cats depended clearly on the stimulus level but not on the spontaneous rate. However, this result was obtained at high stimulus levels where the onset and adapted rates had reached their maxima. Thus, it is conceivable that the dependence of this ratio on SR, reported in Figure 5D and in other studies, arises only because the ratio was measured at stimulus levels closer to (for high-SR fibers) or farther away from (for low-SR fibers) the levels at which the onset and adapted rates saturate.
Validation of our PSTR-based predictions comes from single-fiber recordings. In gerbils, a correlation was seen between the mean PSTR and the mean PSTH rapid adaptation
Like the PSTR, the mean PSTH in response to 50 dB SPL sound stimulation reflects the activation of an ensemble of single ANFs having different thresholds. The main difference, however, was that the PSTH was built in response to tone bursts, whereas the prediction was based on responses to bandpass-filtered noise. This is an interesting point because it allows us to test our predictive modeling in other species in which the PSTH has been recorded in response to tone bursts (which is generally the case). Indeed, when recorded on the cochlear round window of mice, the PSTR rapid adaptation
Toward a diagnostic tool
To date, the only functional data on human ANFs come from recordings of the CAP or wave I of the auditory brainstem responses, which reflect the synchronized activity at stimulus onset (for review, see Eggermont, 2017). For ethical reasons (mainly the need to penetrate the auditory nerve with microelectrodes), single-fiber recording from the auditory nerve is not feasible in humans, making the SR-based composition impossible to investigate. Here, we took advantage of cerebellopontine angle surgeries to record PSTRs from an electrode placed on the intracranial portion of the cochlear nerve in patients with normal or subnormal hearing who were undergoing microvascular decompression (Møller and Jannetta, 1983). PSTRs evoked by one-third octave bandpass-filtered noise centered on 4 kHz and presented at 40 dB above the click-evoked CAP threshold had a similar shape to those recorded in animals. The rapid-adaptation time constant of a few milliseconds (∼3 ms) is in the range of those observed in animals. Based on the PSTR peak-to-plateau ratio, we predict a fiber mean SR of 22 spikes/s in the 4 kHz region. Although we do not have access to single-fiber recordings to test our prediction, the PSTR might constitute an interesting neural index to investigate the adaptation and the mean spontaneous activity of ANFs in this area of the human cochlea.
Some technical limitations must, however, be considered before definitive validation of human data. First, the electrical and acoustic noise levels in the operating room were higher than they are in an experimental laboratory. Second, because of time limitations, we did not record as large a range of sound intensities and frequencies as in the animals. Third, we cannot exclude the possibility that anesthesia or the surgical manipulations needed to expose the eighth nerve changed the physiology of the ANFs. To collect more data, a noninvasive technique is thus needed, especially to record PSTRs in awake subjects. Given that the spontaneous neural noise (900 Hz peak) can be extracted in humans using ear canal recording techniques (Pardo-Jadue et al., 2017), we expect that the signal-to-noise ratio is sufficient to extract the PSTR using a tympanic electrode. If not, we can alternatively use a transtympanic electrode (Verschooten et al., 2018). To further investigate the weighted contribution of each pool of ANFs to the PSTR, and to predict their behavior under pathologic conditions such as ANF loss (Kujawa and Liberman, 2009; Wu et al., 2019; Jeffers et al., 2021) and/or hair-cell loss (Fernandez et al., 2020; Wu et al., 2020, 2021), we need to develop a mathematical model of the human cochlea. This should be based on the morphologic observations (Spoendlin and Schrott, 1989, 1990; Wu et al., 2020, 2021), as previously performed for the guinea-pig cochlea (Bourien et al., 2014). In the end, we expect the PSTRs to be a powerful diagnostic tool to capture information on auditory nerve survival and, importantly, SR-based function and dysfunction in humans, providing a better understanding of auditory neuropathies, tinnitus, and hyperacusis.
Footnotes
This work was supported by the Agence Nationale pour la Recherche (ANR-13-JSV1-0009-01), EraNet Neuron JTC 2020 (ANR Cosyspeech R21034FF), Institut National de la Santé et de la Recherche Médicale (Grant U1051-Dot 02±2014), Cochlear France Award (R11055FF/RVF11006FFA), Gueules Cassées (R20113FF), Fondation pour l'audition (FPA RD-2016-2, RD-2020-10), U.S. Navy Office of Naval Research (N00014-16-1-2867), and the National Institute on Deafness and Other Communication Disorders (R01 DC0188). A.H. was further supported by the German Research Foundation through the Cluster of Excellence (EXC2067). We thank Prof. André Chays and Dr. Arnaud Bazin, who initiated the clinical study; Arthur Lemolton for technical help; and Scientific and Technical English Language Services (www.stels-ol.de) for editing assistance.
The authors declare no competing conflicts of interest.
Correspondence should be addressed to Jérôme Bourien at jerome.bourien@umontpellier.fr