Abstract
The binaural interaction component (BIC) of the auditory brainstem response is a noninvasive electroencephalographic signature of neural processing of binaural sounds. Despite its potential as a clinical biomarker, the neural structures and mechanism that generate the BIC are not known. We explore here the hypothesis that the BIC emerges from excitatory–inhibitory interactions in auditory brainstem neurons. We measured the BIC in response to click stimuli while varying interaural time differences (ITDs) in subjects of either sex from five animal species. Species had head sizes spanning a 3.5-fold range and correspondingly large variations in the sizes of the auditory brainstem nuclei known to process binaural sounds [the medial superior olive (MSO) and the lateral superior olive (LSO)]. The BIC was reliably elicited in all species, including those that have small or inexistent MSOs. In addition, the range of ITDs where BIC was elicited was independent of animal species, suggesting that the BIC is not a reflection of the processing of ITDs per se. Finally, we provide a model of the amplitude and latency of the BIC peak, which is based on excitatory–inhibitory synaptic interactions, without assuming any specific arrangement of delay lines. Our results show that the BIC is preserved across species ranging from mice to humans. We argue that this is the result of generic excitatory–inhibitory synaptic interactions at the level of the LSO, and thus best seen as reflecting the integration of binaural inputs as opposed to their spatial properties.
SIGNIFICANCE STATEMENT Noninvasive electrophysiological measures of sensory system activity are critical for the objective clinical diagnosis of human sensory processing deficits. The binaural component of sound-evoked auditory brainstem responses is one such measure of binaural auditory coding fidelity in the early stages of the auditory system. Yet, the precise neurons that lead to this evoked potential are not fully understood. This paper provides a comparative study of this potential in different mammals and shows that it is preserved across species, from mice to men, despite large variations in morphology and neuroanatomy. Our results confirm its relevance to the assessment of binaural hearing integrity in humans and demonstrate how it can be used to bridge the gap between rodent models and humans.
Introduction
When a sound is emitted in the environment, the acoustical waves reach the ears at slightly different times and with different amplitudes. Those interaural time differences (ITDs) and interaural level differences (ILDs), usually termed “binaural cues,” are used by the auditory system to estimate the spatial origin of a sound. Neural pathways from either ear converge three synapses away from the cochleae, in the superior olivary complex (SOC) of the brainstem, where the lateral superior olive (LSO) and the medial superior olive (MSO; Fig. 1A) first process the ILDs and ITDs, respectively (Tollin, 2003; Grothe et al., 2010).
A noninvasive method to probe the activity of auditory brainstem neurons is to measure early auditory-evoked potentials from electrodes placed on the scalp, referred to as auditory brainstem responses (ABRs; Fig. 1B). A broadband “click” stimulus played at one ear triggers a series of electrical events (Fig. 1B) that reflect synchronous neural activity propagating through the monaural auditory pathway. Its first peaks are generated by spiral ganglion cells and their axons forming the auditory nerve (Fig. 1B, orange), followed by cochlear nucleus and SOC neurons (green), while later peaks (blue) reflect activity along the lateral lemniscus, a tract of axons from the SOC to the inferior colliculus. Consistently, the latencies of individual peaks of the ABR coincide with first-spike latencies observed with in vivo extracellular recordings, and lesions of individual nuclei directly affect individual ABR peaks (Wada and Starr, 1989; Fullerton and Kiang, 1990; Melcher and Kiang, 1996).
When presenting clicks to both ears, the amplitude of the ABRs is approximately double that when the stimulus is presented separately to one ear (Fig. 1C, red and blue traces), which reflects the activation of both monaural pathways (Fig. 1D, green trace). A time course of binaural-specific processing can be obtained by subtracting the sum of the monaurally evoked ABRs (the summed ABR; Fig. 1E) from the binaurally evoked ABR. This residual ABR trace is the fraction of the binaural ABR that cannot be explained by independent activation of the monaural pathways, and is termed the “binaural interaction component” (BIC; Fig. 1F). It represents ∼20% of the amplitude of the binaural ABR, occurs at latencies consistent with the activity of SOC neurons, and is modulated by binaural cues (e.g., ITDs; Fig. 1G, bottom). Characteristics of the largest BIC peak (Fig. 1G, DN1) correlate with performance on binaural hearing tasks in both normal and hearing-impaired subjects, which makes it a good candidate biomarker for binaural hearing integrity (Laumen et al., 2016b). However, the amplitude of the BIC in humans is small and can be challenging to measure (Polyakov and Pratt, 1996; Brantberg et al., 1999; Delb, 2003). Thus, knowledge of the neural origin of the BIC may facilitate improved measurement techniques (i.e., acoustic stimuli, recording electrode montages, signal processing and averaging, etc.) and ultimately may help make measurements in clinical settings more reliable.
The origin of the generators of the BIC is located within the SOC (based on the latencies of its peaks). Yet, the SOC comprises several binaural nuclei (e.g., MSO and LSO), and thus the exact origin of the BIC remains unknown (Hall, 2007; Laumen et al., 2016b). Here, we adopt a comparative approach called “natural ablation” (Masterton et al., 1975) to understand the origin of the BIC by measuring the BIC across ITDs in five rodent species. Across the species considered, both the range of naturally occurring ITDs and the relative size of SOC nuclei vary widely (e.g., the MSO is nearly absent in mice). Our results suggest an LSO origin of the BIC (as opposed to MSO), which is further supported by a model in which the BIC emerges from LSO-like excitatory–inhibitory interactions.
Materials and Methods
ABR measurements
Click-evoked ABRs and the BIC were measured using a setup replicating that described in more detail in previous publications (Beutelmann et al., 2015; Ferber et al., 2016), and only briefly outlined here.
Ethics statement.
All experimental procedures complied with guidelines set forth by the National Institutes of Health and were approved under a protocol submitted to the University of Colorado Health Sciences Center Animal Care and Use Committee.
Animal preparation.
Data were acquired for adult animals of either sex and included nine Mongolian gerbils (Meriones unguiculatus; four female), five Sprague Dawley rats (Rattus norvegicus; three female), four long-tailed chinchillas (Chinchilla lanigera; four female), five guinea pigs (Cavia porcellus; two female), and nine C57BL/6N mice (Mus musculus; three female). Mice used in these experiments were Mobp-EGFP transgenic mice [STOCK Tg(Mobp-EGFP) IN1Gsat/Mmucd; RRID:MMRRC_030483-UCD] that have been backcrossed 10 generations and are congenic on the C57BL/6N genetic background. Before recording sessions, animals underwent otoscopic examination. ABR recordings were then collected under ketamine/xylazine anesthesia. During recording sessions, physiological temperature was maintained using a heating pad, and vital signs were monitored. At the beginning of each individual recording session, click-evoked monaural ABR thresholds were assessed in each animal to confirm that interaural hearing threshold asymmetries were <10 dB (Laumen et al., 2016a).
ABR acquisition.
The acquisition system used for this study replicates that of Beutelmann et al. (2015) and Ferber et al. (2016). Presentation of stimuli and acquisition of evoked potentials were performed via a sound card at a sample rate of 44.1 kHz. Click stimuli were presented at an average interstimulus interval of 30 ms, with an SD of 10 ms, at 80 dB SPL (∼30–40 dB above the click threshold). Five hundred repetitions were presented for each monaural and binaural condition (before automated artifact rejections, see below). The BIC was recorded across an ITD range at least spanning ±2 ms (±2, 1, 0.75, 0.5, 0.375, 0.25, 0.125, and 0 ms), and in some conditions extended up to 10 ms. All conditions were randomized.
Stimuli were presented through custom insert earpieces; sound level and phase were calibrated via probe microphones using a 129-tap minimum phase filter (Beutelmann et al., 2015). Electroencephalographic recordings were obtained with platinum subdermal needle electrodes placed at the apex (active) and nape of the neck (reference) with a hind-leg ground. An automated artifact rejection threshold was set before each recording session and was typically ∼10 μV.
Analysis.
Recorded signals were averaged by condition and processed through a digital third-order Butterworth filter (cutoffs at 100 Hz and 1 kHz). The BIC waveform was then calculated for each ITD. Each monaural trace is shifted by ITD/2 in either direction (Fig. 1G, top) to obtain the “summed response,” termed S (Fig. 1E), so that the delay between monaural traces is equal to the ITD. Note that our convention may be different from that of other studies (Ungan et al., 1997; Riedel and Kollmeier, 2006). The BIC is then computed by subtracting the summed response from the binaurally evoked ABR: BIC = B − S (Fig. 1F,G).
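The subtraction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names `shift_trace` and `compute_bic` are hypothetical, the ITD is assumed to be an even number of samples so that each monaural trace moves by exactly ITD/2, the choice of which ear leads for a positive ITD is arbitrary, and filtering and averaging are omitted.

```python
def shift_trace(trace, n_samples):
    """Delay a trace by n_samples (advance it if negative), zero-padding the ends."""
    size = len(trace)
    shifted = [0.0] * size
    for i, value in enumerate(trace):
        j = i + n_samples
        if 0 <= j < size:
            shifted[j] = value
    return shifted


def compute_bic(binaural, left, right, itd_samples):
    """Return BIC = B - S, with S the sum of the two shifted monaural traces.

    Assumption: itd_samples is even, and the left trace is advanced while the
    right trace is delayed, so the delay between them equals the ITD.
    """
    half = itd_samples // 2
    left_shifted = shift_trace(left, -half)
    right_shifted = shift_trace(right, half)
    summed = [l + r for l, r in zip(left_shifted, right_shifted)]
    return [b - s for b, s in zip(binaural, summed)]


# Toy example: unit monaural responses and a binaural response that is
# slightly smaller than their sum, which yields a negative BIC.
n = 20
left = [0.0] * n
right = [0.0] * n
left[10] = 1.0
right[10] = 1.0
binaural = [0.0] * n
binaural[8] = 0.8
binaural[12] = 0.8
bic = compute_bic(binaural, left, right, itd_samples=4)
```

In this toy case the BIC is nonzero only where the binaural response falls short of the summed response, which is the sense in which the BIC is a residual.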
Gaussian fits.
A bell-shaped function was fitted to the peak amplitude versus ITD data obtained in the recordings. The fitted function was a Gaussian (plotted in Fig. 4-1A):

$$f(\mathrm{ITD}) = A\,\exp\!\left(-\frac{(\mathrm{ITD}-\mathrm{ITD}_0)^2}{2\sigma^2}\right) + B,$$

where σ is the width of the curve, ITD0 its position along the ITD axis, A its modulation amplitude, and B its baseline. Because DN1 amplitude data obtained from previous studies (Ungan et al., 1997; Riedel and Kollmeier, 2006) were normalized and reported for positive ITDs, only the σ parameter was allowed to vary in those fits while the others were fixed (ITD0 = 0, A = 1, B = 0). Because this removed covariations between the parameters of the fit, they were not included in the computation of the correlation coefficients reported in Figure 4-1.
Cross-correlation analysis and signature BIC
We used a cross-correlation technique to compute the BIC amplitude, lag, and signature BIC from individual BIC traces. Briefly, we compute the normalized cross-correlation of the two zero-mean signals BIC0 and BIC1 as follows (using an FFT-based technique):

$$C(\tau) = \frac{1}{N\,\mathrm{RMS}_0\,\mathrm{RMS}_1}\sum_{t}\mathrm{BIC}_0(t)\,\mathrm{BIC}_1(t+\tau),$$

where N is the number of samples, RMS0 is the root mean squared amplitude of BIC0, and RMS1 is the root mean squared amplitude of BIC1. The maximum of the cross-correlation is then located: its amplitude is the cross-correlation gain, and its lag is the relative delay between the traces (see main text; Fig. 2).
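As an illustration of how a gain and a lag can be read off a normalized cross-correlation, here is a minimal sketch. It uses direct summation rather than the FFT-based computation mentioned in the text, the function name `normalized_xcorr` and the `max_lag` parameter are hypothetical, and the sign convention (a positive lag means the second trace is delayed) is an assumption.

```python
import math


def rms(x):
    """Root mean squared amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def normalized_xcorr(bic0, bic1, max_lag):
    """Return (gain, lag) at the peak of the normalized cross-correlation.

    Both traces are made zero-mean first. A positive lag (in samples) means
    that bic1 is delayed relative to bic0 (assumed convention).
    """
    n = len(bic0)
    mean0 = sum(bic0) / n
    mean1 = sum(bic1) / n
    x0 = [v - mean0 for v in bic0]
    x1 = [v - mean1 for v in bic1]
    norm = n * rms(x0) * rms(x1)
    best_gain, best_lag = -math.inf, 0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for t in range(n):
            u = t + lag
            if 0 <= u < n:
                acc += x0[t] * x1[u]
        value = acc / norm
        if value > best_gain:
            best_gain, best_lag = value, lag
    return best_gain, best_lag


# Toy example: bic1 is bic0 delayed by 3 samples and scaled by 0.5.
bic0 = [0.0] * 40
bic0[15:20] = [1.0, 3.0, -4.0, -1.0, 1.0]
bic1 = [0.0] * 40
bic1[18:23] = [0.5, 1.5, -2.0, -0.5, 0.5]
gain, lag = normalized_xcorr(bic0, bic1, max_lag=10)
```

Because both RMS amplitudes enter the normalization, a pure rescaling of a noiseless trace leaves the peak value at 1; with this normalization the peak value is therefore a measure of shape similarity.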
Subject signature BIC.
Because the gain and delay obtained from the cross-correlation method are relative, it is necessary to use one of the BIC traces as a reference. To compute the subject signature BIC across ITDs, we compute the gain and lag of each BIC trace relative to the BIC trace at ITD = 0 μs. All traces are then time-shifted using the obtained lags. The subject signature BIC is then defined as the average of the time-aligned traces weighted by the gains (Fig. 2).
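The align-then-average step can be sketched as follows. This is an illustrative outline under simplifying assumptions: lags are whole numbers of samples, the gains and lags are taken as given inputs, and the names `shift_trace` and `signature_bic` are hypothetical.

```python
def shift_trace(trace, n_samples):
    """Delay a trace by n_samples (advance it if negative), zero-padding the ends."""
    size = len(trace)
    shifted = [0.0] * size
    for i, value in enumerate(trace):
        j = i + n_samples
        if 0 <= j < size:
            shifted[j] = value
    return shifted


def signature_bic(traces, lags, gains):
    """Gain-weighted average of traces after undoing each trace's lag.

    lags[k] is the delay (in samples) of traces[k] relative to the reference
    trace, and gains[k] is its cross-correlation gain.
    """
    aligned = [shift_trace(tr, -lag) for tr, lag in zip(traces, lags)]
    total = sum(gains)
    n = len(aligned[0])
    return [
        sum(g * tr[i] for g, tr in zip(gains, aligned)) / total
        for i in range(n)
    ]


# Toy example: the second trace is the reference delayed by 2 samples.
reference = [0.0, 0.0, 0.0, 1.0, -2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
delayed = shift_trace(reference, 2)
signature = signature_bic([reference, delayed], lags=[0, 2], gains=[1.0, 0.5])
```

Weighting by the gains means that traces resembling the reference less contribute less to the average.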
Species signature BIC.
We use a similar strategy to compute the species-specific signature BIC. First, we compute subject signature BICs across ITDs for all individual subjects as described above. We then cross-correlate all pairs of subject signature BICs to obtain gains and lags (Fig. 3A,C). We then compute the species signature BIC in a similar manner: all signature BICs are time-aligned using the cross-correlation lags relative to one of our subjects (a gerbil; Fig. 3C), and the average for each species is then obtained by weighted averaging using the gains.
Importantly, when temporally shifting the BIC waveforms to even out latency discrepancies, the resulting species signature BIC no longer is temporally referenced to the stimulus onset. That is, t = 0 on Figure 3B is not the stimulus onset, and all BICs appear to have the same latency, even though they do not (Fig. 3C).
BIC model
Modeling BIC amplitude changes with ITD.
We establish a model of the BIC DN1 peak as the result of inhibitory–excitatory coincidence detection occurring at the level of the LSO (Ashida et al., 2016, 2017). We therefore assume that the binaural difference potential is related to the probability that an excitatory spike is cancelled by inhibition. This occurs when excitation follows inhibition, within an inhibition time window of length w. We define a random variable U that describes the relative arrival time difference of the excitatory and inhibitory spikes, such that U ∼ N(τ, σ²), where τ = τe − τi is the average difference in arrival time of the excitatory and inhibitory spikes and σ² is a measure of the relative precision of the spikes reaching the LSO. In sum, the BIC amplitude and latency will be related to the probability that an excitatory spike is effectively cancelled by inhibition, P[0 < U < w].
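This probability can be obtained with the cumulative distribution function of a normal random variable:

$$P[0 < U < w] = \int_0^w f_U(u)\,du = \Phi\!\left(\frac{w-\tau}{\sigma}\right) - \Phi\!\left(-\frac{\tau}{\sigma}\right),$$

where $f_U(u)$ is the probability density function of U (a normal distribution) and $\Phi$ is the cumulative distribution function of the standard normal distribution.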
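The cancellation probability P[0 < U < w] for a normally distributed U can be evaluated numerically with the error function. The sketch below is only an illustration: the function names are hypothetical, and the numerical values used in the example are arbitrary rather than fitted parameters.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cancel_probability(tau, sigma, w):
    """P[0 < U < w] for U ~ N(tau, sigma**2).

    tau:   mean arrival-time difference (excitation minus inhibition)
    sigma: standard deviation of that difference
    w:     duration of the inhibition window
    """
    return normal_cdf((w - tau) / sigma) - normal_cdf(-tau / sigma)


# Arbitrary illustrative values (same time unit for all three arguments).
p_centered = cancel_probability(tau=0.5, sigma=1.0, w=1.0)
p_long_window = cancel_probability(tau=0.0, sigma=1.0, w=100.0)
```

With a very long window and τ = 0, half of the excitatory spikes arrive after inhibition, so the probability tends to 0.5.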
We then identify this probability value with the amplitude of the BIC across ITDs. We note that the left LSO receives excitation at τe = −ITD/2 and inhibition at τi = ITD/2, such that the random variable UL ∼ N(−ITD, σ²) describes the difference in arrival times at the left LSO. This situation is reversed at the right LSO, where UR ∼ N(ITD, σ²). It is important to note that we are not assuming any neural delay lines in doing so. The left and right LSOs differ only in which ear excites them and which ear inhibits them. The difference in response between the two LSOs merely comes from the causality of the inhibition as we have modeled it.
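We can now write $r_L(\mathrm{ITD})$ for the contribution of the left LSO to the BIC by replacing τ by the ITD in the above expression:

$$r_L(\mathrm{ITD}) = \Phi\!\left(\frac{w-\mathrm{ITD}}{\sigma}\right) - \Phi\!\left(-\frac{\mathrm{ITD}}{\sigma}\right),$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution. We can write the right side's contribution by replacing τ with −ITD (which swaps excitatory and inhibitory ears in the model): $r_R(\mathrm{ITD}) = r_L(-\mathrm{ITD})$.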
Finally, the total amplitude of the BIC is the sum of the amplitudes on each side, which we further normalize such that it is always equal to 1 at ITD = 0:

$$\mathrm{BIC}(\mathrm{ITD}) = \frac{r_L(\mathrm{ITD}) + r_R(\mathrm{ITD})}{r_L(0) + r_R(0)}.$$

This final expression gives a prediction of the amplitude of the BIC that depends on only two parameters: (1) σ, representing the relative precision of arrival times of the excitatory and inhibitory input spikes to LSO cells; and (2) w, representing the temporal duration of the inhibition.
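A minimal numerical sketch of this two-parameter amplitude prediction follows. It is not the authors' published code: the function names are hypothetical, and the values of σ and w in the example are arbitrary choices made for illustration.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cancel_probability(tau, sigma, w):
    """P[0 < U < w] for U ~ N(tau, sigma**2)."""
    return normal_cdf((w - tau) / sigma) - normal_cdf(-tau / sigma)


def bic_amplitude(itd, sigma, w):
    """Sum of both LSO contributions, normalized to 1 at ITD = 0."""
    r_left = cancel_probability(itd, sigma, w)
    r_right = cancel_probability(-itd, sigma, w)
    return (r_left + r_right) / (2.0 * cancel_probability(0.0, sigma, w))


# Arbitrary illustrative parameters, in milliseconds.
sigma_ms, w_ms = 0.6, 1.0
itds_ms = [0.0, 0.25, 0.5, 1.0, 2.0]
amplitudes = [bic_amplitude(itd, sigma_ms, w_ms) for itd in itds_ms]
```

Because the two contributions are mirror images of each other, the predicted amplitude is an even function of ITD and decreases as the absolute ITD grows.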
Modeling BIC latency changes with ITD.
Our model is also able to predict BIC latency changes with ITD by considering that the BIC occurs at the mean arrival time at which excitatory spikes occur when they are not inhibited. In other words, the BIC latency is the lag for which excitation is most likely to be cancelled by inhibition.
In the previous section, we let U ∼ N(τe − τi, σ²) be the distribution of the difference in arrival times at the LSO. Here we seek to evaluate the expectation 𝔼[U | 0 < U < w], which represents the average difference in arrival times of excitation and inhibition that leads to a spike being cancelled by the LSO cell. The lag obtained that way is relative to the arrival time of inhibition, such that the absolute BIC lag is obtained by δ = τi + 𝔼[U | 0 < U < w]. In doing this, we are assuming that the arrival time of the inhibitory spike and the difference in arrival times are independent (or, alternatively, that the inhibitory spike arrives at a constant time).
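It is possible to evaluate the expectation in the expression of δ analytically by recognizing that it is the expectation of a truncated normal random variable. Commonly, this is expressed using inverse Mills ratios:

$$\mathbb{E}[U \mid 0 < U < w] = \tau + \sigma\,\frac{\varphi\!\left(-\frac{\tau}{\sigma}\right) - \varphi\!\left(\frac{w-\tau}{\sigma}\right)}{\Phi\!\left(\frac{w-\tau}{\sigma}\right) - \Phi\!\left(-\frac{\tau}{\sigma}\right)},$$

where $\varphi$ and $\Phi$ are the probability density and cumulative distribution functions of the standard normal random variable, respectively. Spelling out the cumulative distribution function, we obtain:

$$\mathbb{E}[U \mid 0 < U < w] = \tau + 2\sigma\,\frac{\varphi\!\left(-\frac{\tau}{\sigma}\right) - \varphi\!\left(\frac{w-\tau}{\sigma}\right)}{\operatorname{erf}\!\left(\frac{w-\tau}{\sigma\sqrt{2}}\right) + \operatorname{erf}\!\left(\frac{\tau}{\sigma\sqrt{2}}\right)}.$$

We now look at the average time on the left and right LSOs by substituting τ with ITD and adding $\tau_i$:

$$\delta_L(\mathrm{ITD}) = \tau_i + \mathrm{ITD} + 2\sigma\,\frac{\varphi\!\left(-\frac{\mathrm{ITD}}{\sigma}\right) - \varphi\!\left(\frac{w-\mathrm{ITD}}{\sigma}\right)}{\operatorname{erf}\!\left(\frac{w-\mathrm{ITD}}{\sigma\sqrt{2}}\right) + \operatorname{erf}\!\left(\frac{\mathrm{ITD}}{\sigma\sqrt{2}}\right)}.$$

The situation is symmetric on the other side and we therefore obtain

$$\delta_R(\mathrm{ITD}) = \delta_L(-\mathrm{ITD}).$$

Finally, we model the latency of the total BIC (occurring as a result of the activity of the left and right LSOs) as the average of the delays on each side weighted by the probability of a spike being cancelled by that side:

$$\delta(\mathrm{ITD}) = \frac{r_L(\mathrm{ITD})\,\delta_L(\mathrm{ITD}) + r_R(\mathrm{ITD})\,\delta_R(\mathrm{ITD})}{r_L(\mathrm{ITD}) + r_R(\mathrm{ITD})}.$$

This is a somewhat complex expression, which we were not able to satisfactorily reduce any further. However, it should be noted that it can still be analytically treated, for example to evaluate its gradient relative to the parameters σ and w, which is useful for fitting procedures (see below).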
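The truncated-normal expectation and the probability-weighted average across the two sides can be evaluated numerically as sketched below. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the inhibitory arrival time τi is treated as a constant offset and left out (so the result is a latency relative to inhibition), and the parameter values are arbitrary.

```python
import math


def normal_pdf(x):
    """Probability density function of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cancel_probability(tau, sigma, w):
    """P[0 < U < w] for U ~ N(tau, sigma**2)."""
    return normal_cdf((w - tau) / sigma) - normal_cdf(-tau / sigma)


def truncated_mean(tau, sigma, w):
    """E[U | 0 < U < w] for U ~ N(tau, sigma**2) (truncated normal mean)."""
    a = -tau / sigma
    b = (w - tau) / sigma
    return tau + sigma * (normal_pdf(a) - normal_pdf(b)) / (
        normal_cdf(b) - normal_cdf(a)
    )


def relative_latency(itd, sigma, w):
    """Average of both sides' truncated means, weighted by cancel probability.

    The constant inhibitory arrival time is omitted (assumption), so the
    value is relative to the arrival of inhibition.
    """
    p_left = cancel_probability(itd, sigma, w)
    p_right = cancel_probability(-itd, sigma, w)
    m_left = truncated_mean(itd, sigma, w)
    m_right = truncated_mean(-itd, sigma, w)
    return (p_left * m_left + p_right * m_right) / (p_left + p_right)


# Arbitrary illustrative parameters, in milliseconds.
latency_at_zero = relative_latency(0.0, sigma=0.6, w=1.0)
latency_at_half = relative_latency(0.5, sigma=0.6, w=1.0)
```

For very large absolute ITDs the denominators become tiny, so a production implementation would need a numerically safer formulation; the sketch is meant for moderate values only.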
Fitting the model to the data.
To fit the model to our DN1 peak measurements, we seek the parameter values that minimize the sum of squared differences (SS) between the model predictions and the measured data, for amplitudes and latencies jointly. In addition to σ and w, the fit includes three scale and shift parameters, a, b, and τoffset, which allow us to deal with heterogeneities between lags and amplitudes across animals without the need for normalization. Those parameters are not directly related to our model, which provides the shape of both latencies and amplitudes as a function of ITD.
The fitting procedure was an implementation of the Broyden–Fletcher–Goldfarb–Shanno algorithm (a modified quasi-Newton approach; Nocedal and Wright, 2006) as provided by the SciPy library. The Jacobian matrix of SS needed by the optimization routine was evaluated in part by hand (for the amplitudes), and in part (the latencies) using the SymPy symbolic computation package (Meurer et al., 2017). Python code for the fitting is available on-line (https://github.com/victorbenichoux/bic_ei_model.git).
The amplitudes fitted here were the cross-correlation gains (therefore equal to 1 for ITD = 0). The latencies were the cross-correlation latencies. Measurements from all ITDs within ±750 μs were fitted, because this is the range over which the ITD axis was sampled uniformly in all species tested.
Results
Cross-correlation-based BIC analysis
The BIC, a waveform derived from monaural and binaural ABR recordings, is usually analyzed via the latencies and amplitudes of its peaks and troughs. This strategy is problematic when analyzing large amounts of recordings because the identification of different peaks depends on the species under investigation and the electrode positioning on the scalp. Identification may become difficult when the peaks are small (which is the case when the ITD is large). Our new strategy to study the dependence of the BIC on binaural cues circumvents those issues. It relies on the observation that although BIC amplitude and latencies are modulated by the binaural cues, the morphology of the BIC waveform is itself well conserved (Fig. 2A).
BIC waveforms obtained at different ITDs for one guinea pig subject are color-coded against time and ITD in Figure 2A. The BIC waveform at nonzero ITDs appears to be a scaled and delayed version of the waveform obtained at 0 ITD. To quantify this similarity, we compute the cross-correlation for all pairs of BIC recordings obtained at different ITDs. An example of such cross-correlation function is represented in Figure 2B (between the BIC obtained at 0 and 0.5 ms ITD); it exhibits a clear maximum very close to 1, indicating that the shapes of both BIC traces are very similar. The similarity between any two BIC traces was always very large (>0.8; Fig. 2C). In addition, the cross-correlation value to the waveform at 0 μs ITD decreases as the absolute ITD increases (Fig. 2C), which reflects the already documented decrease in BIC waveform amplitude for increasing ITD magnitude (Fig. 2F, black line). Thus, instead of individual peak amplitudes, the maximal cross-correlation value can be used as a measure of relative overall BIC magnitude.
A noise floor estimate of the cross-correlation value was obtained by cross-correlating the BIC waveforms with segments of ABR signals recorded before the stimulus was presented (i.e., containing no stimulus-triggered signal). The 99th percentile for this measure was equal to 0.56 for this subject (Fig. 2C, arrow), which reflects the cross-correlation value expected by chance. BIC waveforms for all ITDs tested for the subject in Figure 2 were more similar to the waveform at 0 μs ITD than is to be expected by chance, including up to ±2 ms ITD. This suggests that there is binaural interaction for ITDs well outside the range of ITDs that would be naturally occurring for a guinea pig (∼250–300 μs; Greene et al., 2014; Benichoux et al., 2016).
The lag between any two BIC waveforms can also be obtained from the latency of the peak of the cross-correlation function. We thus use the cross-correlation function to compute the latency change with the ITD of the BIC waveform, compared with the trace obtained at 0 ITD. Consistent with previous reports, the latency of the BIC increases with increasing ITD magnitude (Fig. 2G, black trace), while being consistently shorter than expected by assuming that the BIC occurs at a constant delay after the second click (i.e., varies with ITD/2 in our convention).
Using the cross-correlation latencies between BIC waveforms, we can temporally shift the BIC waveforms to best match the waveform at 0 ITD, thus obtaining aligned BIC waveforms in which the peaks and troughs happen approximately at the same latencies (Fig. 2D). The obtained aligned BIC traces are again strikingly similar, suggesting that it is possible to obtain a common BIC signature that captures the temporal dynamics of the BIC for all ITDs. To define the subject signature BIC trace, we compute the average of the aligned BIC values weighted by their relative gain (a measure of absolute amplitude; Fig. 2E).
The cross-correlation gains and lags themselves provide a robust and systematic measure of the relative amplitude and lag of the BIC waveforms across ITDs, but it may be desirable to identify and analyze individual peak amplitudes and latencies as has been done in prior research (Laumen et al., 2016b). To this end, our method can also help identify individual peak amplitudes and latencies in a way that is easier, more systematic, and free of experimenter bias. Indeed, because the subject signature trace is an average of all traces across ITDs, it is less noisy. Thus, we can compute the maxima and minima of the waveform in Figure 2E automatically (when they fall outside of the noise floor, see above), and label them according to the nomenclature in the literature. The first positive deflection is usually referred to as DP1, and is observed here before 4 ms (Fig. 2E), and the first prominent negative peak is usually referred to as DN1, here at a latency of ∼5 ms (Fig. 2E). Peaks and troughs at nonzero ITDs are then obtained by centering a temporal window on the peaks identified on the subject signature trace (Fig. 2D, shaded areas) and taking extrema within this temporal window. The ITD dependence of the amplitudes and latencies of DP1 and DN1 found here are consistent with the cross-correlation-derived measures (Fig. 2F,G).
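The window-based peak labeling can be illustrated with a short sketch. It assumes traces that are already time-aligned and sampled on a common grid; the function name `find_extremum`, the window half-width, and the toy waveforms are all hypothetical.

```python
def find_extremum(trace, center, half_width, kind="min"):
    """Return (index, value) of the extremum within a window around center.

    kind="min" looks for a trough (e.g., DN1), kind="max" for a peak (e.g., DP1).
    """
    start = max(0, center - half_width)
    stop = min(len(trace), center + half_width + 1)
    pick = min if kind == "min" else max
    index = pick(range(start, stop), key=lambda i: trace[i])
    return index, trace[index]


# Toy signature trace: the largest trough plays the role of DN1.
signature = [0.0, 0.2, 1.0, 0.3, -0.5, -2.0, -0.6, 0.1, 0.0, 0.0]
dn1_center = min(range(len(signature)), key=lambda i: signature[i])

# Toy single-ITD trace with a large artifact outside the DN1 window.
single_trace = [0.0, 0.1, 0.5, 0.2, -0.1, -0.4, -0.9, -0.2, -3.0, 0.0]
dn1_index, dn1_value = find_extremum(single_trace, dn1_center, half_width=2)
dp1_index, dp1_value = find_extremum(single_trace, 2, half_width=1, kind="max")
```

Restricting the search to a window centered on the signature peak is what keeps the out-of-window artifact from being mislabeled as DN1 in this toy case.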
BIC across species
Using the same apparatus for specimens of five mammalian species (4–9 individuals per species), we measured the BIC waveforms across ITD ranges, including at least −2 to 2 ms, and represented them in Figure 3-1. In all specimens tested, a large negative deflection was observed (consistently outside the noise floor) and was termed DN1, and in almost all specimens a significant positive deflection was observed before DN1 and termed “DP1.” Because of variations in experimental parameters for each individual subject (e.g., the position of the electrodes) or morphological differences (e.g., head size), considerable variability occurs across subjects of the same species. To mitigate that variability and extract common features across subjects and species, we apply to the BIC recordings from different subjects and species an analysis similar to the one we applied across ITDs.
We cross-correlate signature BIC waveforms obtained for each individual (Fig. 3-1, right columns of each panel) and report the measured similarity in Figure 3A. The correlations between pairs of subject signature BIC waveforms across species are large (and above the noise floor), suggesting that the BIC waveform is a common feature across animals (Fig. 3A, inset). For the rat and the mouse, although the interindividual correlation remains high, the cross-correlation with waveforms of other species is somewhat smaller. We then use the same method used across ITDs in the previous section to align and average the signature BIC of each individual into a species-wide signature BIC (see Materials and Methods). The species-specific signature BIC for each species (Fig. 3B) reveals that the BIC waveform is conserved, including the prominent negative peak (DN1) that is always present, and in most instances is preceded by a positive peak (DP1). In sum, and despite the difficulty of identifying peaks in single traces, the pattern of peaks DP1-DN1 is very consistent across animals and species (Fig. 3-1). This includes both species that have a large MSO (relative to LSO; Glendenning and Masterton, 1998) and use fine-structure ITDs for sound localization (gerbils, chinchilla, guinea pig) and species that have a small MSO and that do not use fine-structure ITDs (mice and rats). In contrast, each species tested here has a prominent LSO (Glendenning and Masterton, 1998).
Figure 3-1
It should be noted that because of the alignment and averaging procedure, the species signature BIC is no longer temporally referenced to the stimulus onset, such that t = 0 on Figure 3B is not the stimulus onset. The analysis of the cross-correlation lags across species (reflective of lags between each species' signature BIC traces; Fig. 3C) reveals that the latency of the BIC varies systematically with the head size of the animals, measured here as the range of naturally occurring ITDs (the magnitude of the ITD cue being determined by the diameter of the head). This procedure clearly separates the larger animals with larger ITDs (chinchillas and guinea pigs) from the smaller rodents with smaller ITDs (Pearson correlation r2 = 0.31, p ≤ 0.001). This observation is consistent with the notion that latencies of the BIC waveform likely result from the length of the axons leading to the superior olivary nucleus from the cochlear nucleus, which is affected by the overall size of the brain of the animal.
Dependence of BIC waveform on ITD physiological range
We tested whether the range of naturally occurring ITDs affected the dependence of the BIC on ITDs in any given species. The physiological range of ITDs varied >2-fold across the species used for this study (Fig. 4A, horizontal lines): guinea pigs: ±250 μs (Greene et al., 2014; Benichoux et al., 2016); rats: ±165 μs (Koka et al., 2008; Benichoux et al., 2016); chinchillas: ±290 μs (Jones et al., 2011; Benichoux et al., 2016); mice: ±100 μs (Chen et al., 1995); and gerbils: ±120 μs (Maki and Furukawa, 2005). We used several different complementary approaches to test the hypothesis that the ITD dependence of the BIC waveform across species is correlated to the ecological range of ITDs for that species.
Figure 4-1
First, we investigate the magnitude of the BIC across ITDs averaged across animals of the same species (as measured by the cross-correlation gain; Fig. 4A). For all species considered, the magnitude of the BIC remains high even for ITDs well beyond the physiological range of ITDs of that species (Fig. 4A). In addition, it is striking that the width of the curves in Figure 4A is not related to the head size (horizontal bars). A very similar result holds with the analysis of individually labeled DP1 (Fig. 4D) and DN1 peaks (Fig. 4E). In addition, we report the amplitudes of DN1 peaks as measured in previous studies for two additional species: cats (Fig. 4E, dark gray; Ungan et al., 1997), whose head size is twice that of guinea pigs; and humans (Fig. 4E, black; Riedel and Kollmeier, 2006), whose head size is twice that of cats (Fig. 4E). As a result, our dataset now includes animals with an eightfold difference in head sizes from mice to humans, and thus an eightfold difference in the ecological range of ITDs.
Second, we quantified the range of ITDs for which the BIC amplitudes (as measured by the cross-correlation) were >0.9; that is, the range of ITD values for which the elicited BIC waveform is almost indistinguishable from that at 0 ITD (it is ≥90% similar to the BIC at 0 ITD). We represent this measure of the BIC versus ITD plateau in Figure 4B. For all species and subjects considered, the range over which the BIC waveform is almost indistinguishable from that at 0 ITD was larger than the range of naturally occurring ITDs. No dependence on the range of ecological ITDs was found across species (Fig. 4B; Pearson's correlation value, 0.11, p = 0.54).
Finally, we performed a Gaussian fit analysis of the DP1 and DN1 peaks variation with ITD (Fig. 4-1A; see Materials and Methods; Ferber et al., 2016). We represent the values of the parameters of the fit as a function of the range of ecological ITDs for the specimen in Figure 4-1B (DP1 amplitude) and Figure 4-1C (DN1 amplitude). The parameter relating the width of the DP1 (or DN1) versus ITD relationship is σ (see Materials and Methods), and it is not related to the range of ecological ITDs (as per Pearson's correlation; Fig. 4-1B,C, first columns). Indeed, for all species, the median BIC versus ITD relationship width remains ∼600 μs (mouse: 575 μs; rat: 569 μs; gerbil: 751 μs; guinea pig: 558 μs; chinchilla: 765 μs). In addition, fits were performed on BIC measurements from the literature for the cat and humans (see Materials and Methods), which yielded very similar σ values (cat: 731 μs; human: 614 μs). Similarly, the parameters relating the position along the abscissa (of Fig. 4-1B,C, second column), and the modulation amplitude of the BIC versus the ITD curve (Fig. 4-1B,C, third column) are not related to the range of ecological ITDs. In fact, the only parameter found to correlate with the range of ecological ITDs (and thus the size of the head) is the baseline value (the level of the BIC vs ITD curve tails; Fig. 4-1B,C, fourth column), which merely represents the noise level of the recordings.
Naturally, it is impossible to bring a statistical argument to prove the null hypothesis (here, the absence of a correlation with range of ITDs). Yet, we note that none of the three approaches led to a conclusive argument about the dependence of the BIC magnitude on the range of naturally occurring ITDs. In addition, if the width of the BIC–ITD curve is related to the range of naturally occurring ITDs in a given species, this dependence is clearly marginal (or constrained to within-species dependence, which we did not investigate; Fig. 4). In support of this argument, it should again be noted that the human BIC–ITD relationship is indistinguishable from that of the rodents (e.g., mice and gerbils), although the range of naturally occurring ITDs in humans is eight times larger than those of the other species (∼800 μs; Benichoux et al., 2016). We therefore argue that our analyses provide strong evidence that the BIC is not related to the processing of the binaural spatial cue ITD per se, and in turn not attributable to the activity of MSO cells (see Discussion). In what follows, we provide an alternative explanation: the BIC occurs because of excitatory–inhibitory synaptic interactions at the level of LSO cells.
A model of BIC amplitude changes with ITD
We provide a model of excitatory–inhibitory interactions at the level of the LSO that predicts relative magnitudes and latencies of the BIC across ITDs. It describes an “anti”-coincidence mechanism in which spikes from excitatory monaural pathways are canceled by spikes in the monaural inhibitory pathways if they arrive in synchrony. A similar modeling approach for LSO neurons was recently published (Ashida et al., 2016, 2017). Our model uses two parameters: a measure of the temporal precision of the spikes along the monaural pathways leading to the LSO, together with a measure of the time window during which inhibition can cancel excitation. A fundamental difference between this model and previous ones (Ungan et al., 1997; Riedel and Kollmeier, 2006) is that it does not assume any specific neural delay arrangement in the neural pathways leading onto the LSOs, while retaining the property of a modulation of BIC amplitude and latency with ITDs.
We model the amplitude and latency of the BIC from the probability that a sound-elicited spike on the excitatory pathway is canceled by a spike on the inhibitory pathway for both LSOs (left and right LSOs, which differ only by which side provides excitation or inhibition; Fig. 5A). We assume that the inhibition is effective during a given time window: inhibition occurs only when the inhibitory spike precedes the excitatory spike and both are within a given time window (Fig. 5B,C). When an ITD is imposed in our model, the average time delay between the excitation and inhibition is equal to the ITD (and −ITD on the contralateral LSO; Fig. 5A,B). Finally, the difference between spike times is assumed to follow a normal distribution, reflecting jitter in the afferent spike trains. In our model, we identify the probability of the spike being canceled by inhibition as the BIC amplitude, and the average time of successful inhibition as the BIC latency. The total BIC amplitude and latency are obtained by summing the contributions of the ipsilateral and contralateral LSOs (Fig. 5D, dashed gray line on top left panel; see Materials and Methods).
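The verbal description above can be illustrated with a small Monte Carlo sketch. It is not the authors' implementation. It assumes that the excitatory-minus-inhibitory arrival time difference is Gaussian with mean +ITD on one LSO and −ITD on the other, and that a spike is canceled when that difference falls between 0 and the window duration. The function name and the values σ = 200 μs and w = 800 μs are hypothetical and chosen only for illustration.

```python
import random


def simulate_bic_amplitude(itd_us, sigma_us, window_us, n_trials=20000, seed=0):
    """Monte Carlo estimate of the summed cancellation probability of both LSOs.

    Assumptions (illustrative, not taken from the paper's Methods):
    - the excitatory-minus-inhibitory spike time difference is Gaussian,
      with mean +ITD on one LSO and -ITD on the other, and SD sigma_us;
    - an excitatory spike is canceled when that difference lies in
      [0, window_us], i.e. inhibition leads by at most the window duration.
    """
    rng = random.Random(seed)
    canceled = 0
    for _ in range(n_trials):
        for mean_delay in (itd_us, -itd_us):  # one draw per LSO
            delay = rng.gauss(mean_delay, sigma_us)
            if 0.0 <= delay <= window_us:
                canceled += 1
    # Sum of the two per-LSO probabilities (each estimated over n_trials).
    return canceled / n_trials


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    for itd in (0.0, 250.0, 500.0, 1000.0, 2000.0):
        amp = simulate_bic_amplitude(itd, sigma_us=200.0, window_us=800.0)
        print(f"ITD = {itd:6.0f} us  ->  simulated amplitude = {amp:.3f}")
```

Under these assumptions the simulated amplitude is largest near 0 ITD and vanishes once the absolute ITD greatly exceeds the window duration.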
Figure 5-1
Under the assumptions above, BIC magnitude and latency can be obtained analytically as a function of ITD (Fig. 5D; see Materials and Methods). The model predictions qualitatively matched the measured BIC amplitude and latency as a function of ITD: a monotonic decrease in amplitude and a monotonic increase in latency with increasing absolute ITD. Indeed, as the difference in arrival time of excitation and inhibition increases (i.e., absolute ITD increases), the likelihood that an excitatory spike follows the inhibitory spike within a given time window drops. Accordingly, the overlap of the green distribution in Figure 5B and the shaded area of effective inhibition decreases. At the same time, spikes that are effectively canceled are more likely to be near the end of the inhibition time window, which also predicts an increased latency. In our model, the two LSOs have exactly symmetric contributions to the BIC, and they differ only by the sidedness of their excitatory and inhibitory inputs.
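A closed-form version can be sketched as follows. The exact expressions are given in Materials and Methods, which is not reproduced here, so this sketch rests on an assumed form: a Gaussian spike time difference with mean ±ITD and SD σ, and an inhibition window running from 0 to w. The per-LSO latency is taken as the mean of that Gaussian truncated to the window. The rule for combining latencies across the two LSOs is deliberately left out, and all function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    """Standard normal probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def lso_contribution(mean_delay, sigma, window):
    """Return (cancellation probability, mean delay of canceled spikes) for one LSO.

    Assumed form: the spike time difference is N(mean_delay, sigma^2) and
    inhibition is effective when the difference lies in [0, window].
    """
    a = (0.0 - mean_delay) / sigma
    b = (window - mean_delay) / sigma
    prob = normal_cdf(b) - normal_cdf(a)
    if prob < 1e-12:
        return 0.0, float("nan")  # no canceled spikes, latency undefined
    # Mean of a normal distribution truncated to [0, window].
    latency = mean_delay + sigma * (normal_pdf(a) - normal_pdf(b)) / prob
    return prob, latency


def bic_amplitude(itd, sigma, window):
    """Sum of the cancellation probabilities of the two mirror-symmetric LSOs."""
    p_one, _ = lso_contribution(itd, sigma, window)
    p_other, _ = lso_contribution(-itd, sigma, window)
    return p_one + p_other


if __name__ == "__main__":
    # Hypothetical parameter values (in microseconds), for illustration only.
    for itd in (0.0, 200.0, 400.0, 800.0, 1200.0):
        print(itd, bic_amplitude(itd, sigma=200.0, window=800.0))
```

In this sketch the summed amplitude is symmetric in ITD and does not increase with absolute ITD, and the per-LSO mean delay of canceled spikes grows as the mean arrival time difference moves toward the end of the window.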
The two model parameters σ and w affect the predictions of the model in qualitatively distinct ways. Briefly, when the dispersion of the spike time differences σ is small relative to the inhibition time window duration w, then the BIC amplitude is maximal and plateaus at ∼0 ITD (Fig. 5D, top left) and the width of the curve is given by the value of w. Increasing σ changes the steepness of the flanking slopes of the amplitude versus ITD curve (Fig. 5D, top right). The latencies are monotonically increasing with ITD (Fig. 5D, bottom), but show a different behavior around the plateau and slopes of the BIC amplitude curves: they increase rapidly with ITD when the BIC amplitude plateaus, and less so when the BIC amplitude decreases.
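The distinct roles of σ and w can be made explicit with a short derivation. It is a sketch under an assumed form of the model (Gaussian spike time difference with mean ±ITD and SD σ, inhibition effective on the interval from 0 to w), not the expressions of Materials and Methods. Here Φ and φ denote the standard normal cumulative distribution and density, and A is the summed amplitude of the two LSOs.

```latex
% Assumed form: spike time difference ~ N(\pm\mathrm{ITD}, \sigma^2),
% inhibition effective on [0, w]; A sums the two LSO contributions.
\begin{align}
A(\mathrm{ITD})
  &= \Big[\Phi\big(\tfrac{w-\mathrm{ITD}}{\sigma}\big) - \Phi\big(\tfrac{-\mathrm{ITD}}{\sigma}\big)\Big]
   + \Big[\Phi\big(\tfrac{w+\mathrm{ITD}}{\sigma}\big) - \Phi\big(\tfrac{\mathrm{ITD}}{\sigma}\big)\Big] \\
  &= \Phi\big(\tfrac{w-\mathrm{ITD}}{\sigma}\big) + \Phi\big(\tfrac{w+\mathrm{ITD}}{\sigma}\big) - 1,
  \qquad \text{since } \Phi(x) + \Phi(-x) = 1, \\
\frac{dA}{d\,\mathrm{ITD}}
  &= \frac{1}{\sigma}\Big[\varphi\big(\tfrac{w+\mathrm{ITD}}{\sigma}\big) - \varphi\big(\tfrac{w-\mathrm{ITD}}{\sigma}\big)\Big].
\end{align}
% Limiting behavior for \sigma \ll w:
\begin{align}
A(0) &= 2\,\Phi(w/\sigma) - 1 \approx 1, \\
A(\pm w) &= \Phi(0) + \Phi(2w/\sigma) - 1 \approx \tfrac{1}{2}, \\
\left.\frac{dA}{d\,\mathrm{ITD}}\right|_{\mathrm{ITD}=w}
  &= \frac{1}{\sigma}\big[\varphi(2w/\sigma) - \varphi(0)\big] \approx -\frac{1}{\sigma\sqrt{2\pi}}.
\end{align}
```

Under these assumptions the amplitude falls to half its maximum at an absolute ITD of about w, so the full width of the curve is about 2w, and the magnitude of the flank slope is about 1/(σ√(2π)), which becomes shallower as σ grows.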
We fitted the model of BIC amplitudes and latencies to individual BIC curves across ITDs within ±750 μs. The fitting procedure is efficient because the Jacobian matrix of the associated minimization problem can be expressed analytically (see Materials and Methods). It converged for all individual recordings and provided a good quantitative fit to the amplitudes and the latencies (average root mean squared errors: 2% in amplitude and 16 μs in latency). Figure 5E shows an example fit on chinchilla DN1 recordings (fits for all subjects are shown in Fig. 5-1A). The fit is also good qualitatively: both the latencies and the amplitudes are well captured across ITDs, including in this case outside the range over which the data were fitted. Notably, the model captures BIC amplitudes and latencies equally well, although its parameters do not predict those quantities independently. This success supports the hypothesis that the BIC is the result of excitatory–inhibitory synaptic interactions at the level of the SOC, and does not require a specific arrangement of neural delays to operate. In addition, parameters of the fit (σ, w) did not clearly relate to the species under study (Fig. 5E,F). This fact further supports the alternative explanation that the LSO can be the generator of the BIC.
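To illustrate why an analytic Jacobian makes such a fit cheap, the following sketch fits σ and w with a hand-written damped Gauss-Newton (Levenberg-Marquardt style) loop. It is a simplified stand-in, not the authors' procedure: it fits normalized amplitudes only (the paper fits amplitudes and latencies), it assumes the amplitude form A(ITD) = Φ((w − ITD)/σ) + Φ((w + ITD)/σ) − 1, and the data are synthetic, generated from hypothetical values σ = 150 μs and w = 700 μs.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def model(itd, sigma, w):
    """Assumed summed amplitude of the two LSOs (illustrative form)."""
    return Phi((w - itd) / sigma) + Phi((w + itd) / sigma) - 1.0


def jacobian_row(itd, sigma, w):
    """Analytic partial derivatives of model() with respect to (sigma, w)."""
    u, v = (w - itd) / sigma, (w + itd) / sigma
    d_sigma = -(u * phi(u) + v * phi(v)) / sigma
    d_w = (phi(u) + phi(v)) / sigma
    return d_sigma, d_w


def cost(itds, amps, sigma, w):
    """Sum of squared residuals."""
    return sum((model(x, sigma, w) - y) ** 2 for x, y in zip(itds, amps))


def fit(itds, amps, sigma=300.0, w=500.0, n_iter=200):
    """Damped Gauss-Newton fit of (sigma, w) using the analytic Jacobian."""
    lam = 1e-3
    current = cost(itds, amps, sigma, w)
    for _ in range(n_iter):
        # Accumulate the 2x2 normal equations J^T J and the gradient J^T r.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, y in zip(itds, amps):
            r = model(x, sigma, w) - y
            js, jw = jacobian_row(x, sigma, w)
            a11 += js * js
            a12 += js * jw
            a22 += jw * jw
            g1 += js * r
            g2 += jw * r
        d11, d22 = a11 * (1.0 + lam), a22 * (1.0 + lam)  # Marquardt damping
        det = d11 * d22 - a12 * a12
        if det <= 0.0:
            break  # degenerate system, stop iterating
        step_sigma = -(d22 * g1 - a12 * g2) / det
        step_w = -(d11 * g2 - a12 * g1) / det
        if abs(step_sigma) + abs(step_w) < 1e-9:
            break  # converged
        new_sigma, new_w = sigma + step_sigma, w + step_w
        trial = float("inf")
        if new_sigma > 0.0 and new_w > 0.0:  # keep both parameters positive
            trial = cost(itds, amps, new_sigma, new_w)
        if trial < current:
            sigma, w, current = new_sigma, new_w, trial
            lam = max(lam / 3.0, 1e-12)  # trust the Gauss-Newton step more
        else:
            lam *= 3.0  # reject the step and increase damping
    return sigma, w


if __name__ == "__main__":
    # Synthetic, noise-free data from hypothetical parameters (microseconds).
    itds = [float(x) for x in range(-750, 751, 125)]
    amps = [model(x, 150.0, 700.0) for x in itds]
    print(fit(itds, amps))
```

Because the derivatives are available in closed form, each iteration needs only one pass over the data and no finite-difference evaluations of the model.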
The time window parameter in our model can be tied to the inhibition window observed in LSO neurons (Joris and Yin, 1998). Empirically, the time over which an action potential can be inhibited in single LSO neurons in vivo is a little under 1 ms (measured with clicks; Joris and Yin, 1995; Park, 1998; Irvine et al., 2001; Beiderbeck et al., 2018), which is consistent with that measured in vitro (Wu and Kelly, 1992), and shorter than in downstream nuclei (e.g., the inferior colliculus; Brown and Tollin, 2016). In addition, to account for empirical in vivo electrophysiological data, the LSO model of Ashida et al. (2016, 2017) required an inhibition window of 0.8 ms. We note that our model's window parameter is consistent with these prior results.
Discussion
Previous studies showed that the BIC represents binaural processing occurring in binaural auditory brainstem neurons in the MSO and/or the LSO (Laumen et al., 2016b). To locate its origin, we analyzed the BIC while varying the ITDs in several animal species whose head size and brainstem nuclei composition widely vary. These data, together with a simple model of the dependence of the BIC on ITDs, strongly suggest that the LSO, but not the MSO, is the generator of the BIC.
Natural ablation of brainstem nuclei across species reveals the LSO as the origin of the BIC
The relative sizes and numbers of neurons of the MSOs and LSOs vary significantly across the species considered in this study (Glendenning and Masterton, 1998), which we exploited to infer the origin of the BIC. Masterton et al. (1975) dubbed this approach “natural ablation”: they reasoned that behaviors and physiological responses that rely on a particular nucleus would also “scale” with its size across species. For example, mice and rats have large and well developed LSOs but small or nonexistent MSOs (Fischl et al., 2016). Consistently, both species can use high-frequency ILDs, but neither can use low-frequency ITDs for sound localization (Heffner et al., 2001; Allen and Ison, 2010). Other low-frequency-hearing mammals (including cats, chinchillas, guinea pigs, gerbils, and humans) have well developed LSO and MSO nuclei and use both high-frequency ILDs and low-frequency ITDs for sound localization.
Here we report that the BIC waveform is preserved across mammalian species and provide the first evidence of a BIC in mice, which have a very small MSO (with little evidence of binaural functionality; Fischl et al., 2016). In rats, whose MSO is much smaller than the LSO, we found a BIC whose characteristics were similar to the BIC observed in mammals whose MSO and LSO are the same size (e.g., in cats; Irving and Harrison, 1967; Moore and Moore, 1971). We conclude that the MSO cannot be the unique or main generator of the BIC in mammals. We additionally predict that species with small, disorganized LSOs should exhibit a reduced or eliminated BIC. For example, horses have small LSOs relative to the total SOC volume (Glendenning and Masterton, 1998), and are known to be unable to localize sound sources for frequencies >∼1 kHz, consistent with their SOC neuroanatomy (Heffner and Heffner, 1984).
Basing their findings on acute SOC lesions, Melcher and Kiang (1996) have argued that the MSO generates the BIC. They unilaterally lesioned the anterior portion of the anteroventral cochlear nucleus (AVCNa) in cats, which reduced the amplitude of the BIC by ∼50%. Because spherical bushy cells (SBCs) forming the AVCNa provide bilateral excitatory input to the MSO (Fig. 1A), the authors argued that MSO was the BIC generator. Yet, SBCs also provide the excitatory input to the ipsilateral LSO (Tollin, 2003), such that those results are consistent with an LSO origin of the BIC: elimination of excitatory input to the LSO should suppress binaural interaction in the LSO on the side of the lesion but not the opposite one, thus reducing the BIC but not eliminating it. This is consistent with the observations of Zaaroor and Starr (1991), who reported that BIC amplitude was reduced in correlation with the extent of lesions to the medial nucleus of the trapezoid body (MNTB) and LSO.
Melcher (1996) also found that lesioning the posterior portion of the AVCN (AVCNp) had no effect on the BIC, which was taken to rule out the LSO as the source because AVCNp globular bushy cells (GBCs) provide the inhibitory input to the LSO (through the MNTB; Fig. 1A). However, this conclusion was based on just a single animal and relied on gross localization of the lesion in the AVCNp. Moreover, GBCs are diffusely distributed such that a substantial number of cells are found outside the AVCNp (Young and Oertel, 2010). We contend that, in this subject, a sufficient number of LSO afferents were spared to preserve the BIC.
LSO as the origin of the BIC: BIC versus ITD relationship
In humans, the range of ITDs leading to perceptual fusion of transients matches the range of naturally occurring acoustical ITDs (800 μs; Benichoux et al., 2016) and the ITD range over which a BIC is elicited (Furst et al., 1985; McPherson and Starr, 1995). In addition, the BIC dependence on ITDs changes during development as the head size and experienced acoustical ITDs increase (Jiang and Tierney, 1996; Furst et al., 2004). These arguments suggested that the range of ITDs the auditory system is exposed to determines the ITD range over which a BIC can be elicited, and in turn that MSO neurons generate the BIC.
MSO neurons presented with a transient stimulus are maximally active for a given ITD, termed the “best delay” (BD; Joris and Yin, 2007; Grothe et al., 2010), such that the BIC–ITD curve can be thought of as reflecting the distribution of BDs across MSO neurons. The canonical MSO model suggests that the BDs of MSO cells lie within the range of naturally occurring ITDs (Jeffress, 1948), although it was recently proposed that BDs lie just outside of the range of ethological ITDs in mammals (McAlpine et al., 2001) or on the so-called π-limit (a neuron's BD is inversely proportional to its preferred frequency; Harper and McAlpine, 2004). In all cases, the width of the distribution of BDs should scale with the range of naturally occurring ITDs in that species, and so in turn should the width of the BIC amplitude versus ITD relationship.
Here we specifically tested the hypothesis that the range of ITDs that elicit a BIC is linked to the range of naturally occurring ITDs in that species. We found no evidence for it, even though the range of naturally occurring ITDs varied threefold across the species considered here (eightfold if including human and cat data). We conclude that the range of ITDs that the auditory system is exposed to does not affect the BIC, which suggests that the BIC is not generated by the MSO but rather by a binaural mechanism preserved across all species tested. Our conclusion is consistent with prior observations that BIC properties are conserved across species with widely different head sizes (Laumen et al., 2016b).
A coincidence mechanism as the origin of the BIC
We provide a model of the amplitude of the BIC as a function of ITDs that emulates the excitatory–inhibitory interactions occurring in the LSO. It implements coincidence detection similar to that recently used to explain MSO (Franken et al., 2014) and LSO (Ashida et al., 2016, 2017) extracellular responses measured in vivo. We find that this model qualitatively and quantitatively explains the dependence of the BIC on ITDs, and in particular explains the relatively constant BIC magnitude observed over a range of ITD values at ∼0 μs (Furst et al., 1985). It can also make specific predictions about unilateral or bilateral hearing impairment. Importantly, it does not hypothesize any specific arrangement of neural delays leading onto the neurons generating the BIC (contrary to earlier models; Ungan et al., 1997; Riedel and Kollmeier, 2006), and remains able to predict the ITD dependence of BIC latency. Therefore, we argue that the BIC cannot be used to infer the arrangement of neural delays onto binaural SOC neurons.
The BIC is the largest evoked electrophysiological trace specific to binaural processing in the SOC, and we found that it is not related to auditory space (i.e., the range of naturally occurring ITDs) but rather to a generic synaptic coincidence detection mechanism in the LSO. This goes against the textbook view that SOC neurons perform initial analysis of auditory cues to sound-source location (i.e., spatial hearing). Rather, our results suggest that the bulk of SOC neural processing is related to binaural hearing in a strict sense: aggregating partially overlapping signals from two ears. A possibility is that the BIC reflects the range of binaural auditory fusion: the ability to perceive sounds presented in isolation at each ear as fused in a single perceptual event (Furst et al., 1985).
Clinical implications
Our findings reinforce the interest in the BIC as a clinically useful biomarker for the fidelity of binaural auditory brainstem processing in humans. The BIC is an attractive tool to detect deficits in binaural functionality because it requires equipment already available in audiology clinics. As such, the BIC has potential applications in areas of normal and abnormal development, central auditory processing disorder, conductive hearing loss, neurodegenerative disease (e.g., multiple sclerosis), and fitting of bilateral cochlear implants (for review, see Laumen et al., 2016b). An LSO origin of the BIC may also explain why the BIC has been reported by some researchers to be unreliable to measure in humans (Polyakov and Pratt, 1996; Brantberg et al., 1999; Delb, 2003), because humans have a relatively small LSO (Moore and Moore, 1971; Moore, 2000) and a small MNTB (Kulesza and Grothe, 2015), which provides the inhibitory input to LSOs. With knowledge of the LSO source of the BIC, improved measurement techniques might be explored (acoustic stimuli, recording electrode montages, etc.).
Footnotes
This work was supported by National Institutes of Health Grants R01-DC011555 (D.T. and V.B.) and F30-DC013932 (A.F.) and by Fondation pour l'Audition Grant FPA RD-2016-4 (V.B.). We thank Dr. A. D. Brown for comments on this manuscript.
The authors declare no competing financial interests.
Correspondence should be addressed to Victor Benichoux at the above address. victor.benichoux{at}pasteur.fr