Abstract
The auditory system operates over a vast range of sound pressure levels (100–120 dB) with nearly constant discrimination ability across most of the range, well exceeding the dynamic range of most auditory neurons (20–40 dB). Dean et al. (2005) have reported that the dynamic range of midbrain auditory neurons adapts to the distribution of sound levels in a continuous, dynamic stimulus by shifting toward the most frequently occurring level. Here, we show that dynamic range adaptation, distinct from classic firing rate adaptation, also occurs in primary auditory neurons in anesthetized cats for tone and noise stimuli. Specifically, the range of sound levels over which firing rates of auditory nerve (AN) fibers grow rapidly with level shifts nearly linearly with the most probable levels in a dynamic sound stimulus. This dynamic range adaptation was observed for fibers with all characteristic frequencies and spontaneous discharge rates. As in the midbrain, dynamic range adaptation improved the precision of level coding by the AN fiber population for the prevailing sound levels in the stimulus. However, dynamic range adaptation in the AN was weaker than in the midbrain and not sufficient (0.25 dB/dB, on average, for broadband noise) to prevent a significant degradation of the precision of level coding by the AN population above 60 dB SPL. These findings suggest that adaptive processing of sound levels first occurs in the auditory periphery and is enhanced along the auditory pathway.
Introduction
A longstanding issue in hearing research is the “dynamic range problem,” the discrepancy between the dynamic range of behavioral intensity discrimination and that of single auditory neurons (Viemeister, 1988; Colburn et al., 2003). The auditory system responds behaviorally to an enormous range (100–120 dB) of sound pressure levels (SPLs), yet discriminates small level differences with nearly constant acuity (∼1 dB) across almost the entire range (Viemeister, 1983; Florentine et al., 1987). In contrast, the firing rates of most primary auditory neurons change with sound level over a range of only 20–40 dB and saturate at stimulus levels well below the upper limit of the behavioral range (Sachs and Abbas, 1974). This dynamic range problem motivates the search for neural mechanisms, such as novel forms of adaptation, that would extend the dynamic range of auditory neurons to higher sound levels.
Adaptation occurs in all sensory systems, anywhere from the sensory organ to the neocortex, and in response to changes in overall intensity, contrast (or variance) as well as complex spatio-temporal features (Wark et al., 2007). For example, retinal adaptation to changes in luminance produces a change in sensitivity that allows precise coding of contrast over a huge range of light intensity despite the limited range of firing rates of each neuron (Sakmann and Creutzfeldt, 1969; Shapley and Enroth-Cugell, 1984). Retinal adaptation differs from short-term adaptation in the auditory nerve (AN), which is characterized by a decrease in firing rates without a change in sensitivity or dynamic range (Smith, 1977, 1979; Harris and Dallos, 1979). However, studies of AN adaptation used static stimuli unlike natural sounds, in which the amplitude continually fluctuates over a wide range.
Using continuous stimuli in which the sound level varies rapidly, Dean et al. (2005) found that the dynamic range of midbrain auditory neurons adapts to the sound level distribution. They showed that the rate-level functions of inferior colliculus (IC) neurons shift toward the most probable levels in the stimulus. This form of adaptation extends the dynamic range of IC neurons, allowing them to code high-level sounds for which the rate responses would otherwise be saturated. Dynamic range adaptation has also been observed in the mammalian primary auditory cortex (Watkins and Barbour, 2008) and in the songbird auditory forebrain (Nagel and Doupe, 2006).
In the present study, we investigate whether dynamic range adaptation to mean sound level already occurs in primary auditory neurons and, if so, whether this adaptation can account for that found in the midbrain. Because each spiral ganglion cell (whose axons comprise the AN) forms just one synapse on one inner hair cell, recording from AN fibers offers an opportunity to study the mechanisms underlying dynamic range adaptation in a relatively simple system.
To investigate dynamic range adaptation, we measured rate responses of AN fibers in anesthetized cats to continuous, dynamic sound stimuli in which the level distribution contained a high-probability region (HPR). We show that the rate-level functions shift toward the HPR, an effect unexpected from classic descriptions of firing rate adaptation in the AN. This dynamic range adaptation extends the level range over which neural rate responses can support fine intensity discrimination.
Materials and Methods
Neurophysiology.
Methods for single-unit recordings from AN fibers in anesthetized cats were essentially the same as described by Kiang et al. (1965) and Cariani and Delgutte (1996) and were approved by the Animal Care and Use Committees of both the Massachusetts Eye and Ear Infirmary and the Massachusetts Institute of Technology. Cats were anesthetized with Dial in urethane (75 mg/kg), with supplementary doses given as needed to maintain an areflexic state. Throughout the experiment, the cat was given injections of dexamethasone (0.26 mg/kg) to prevent the brain from swelling and Ringer's solution (50 ml/d) to prevent dehydration. The AN was exposed by a posterior fossa craniotomy and retraction of the cerebellum. The tympanic bulla was widely opened to expose the round window, and a small opening was made in the bony septum to vent the middle-ear cavity. A silver electrode was placed near the round window to measure the compound action potential in response to click stimuli as an assay of cochlear function.
Single-unit recordings were performed on a vibration-isolated table in an electrically shielded, soundproof chamber. Sound was delivered to the cat's ear monaurally through a calibrated closed acoustic assembly driven by an electrodynamic speaker (Realistic 40-1377). Stimuli were generated by a 24-bit digital-to-analog converter (National Instruments NIDAC 4461) using sampling rates of either 50 or 100 kHz. Noise stimuli were digitally filtered to equalize the transfer characteristics of the acoustic system. This equalization was only possible up to 35 kHz, thereby setting an upper frequency limit to the noise stimuli.
Spike activity was recorded with glass micropipettes filled with 2 M KCl. The electrode was inserted into the nerve and mechanically advanced using a micropositioner (Kopf 650). The electrode signal was bandpass filtered and fed to a software spike detector triggering on level crossings. The spike times were recorded and saved to disk for subsequent analysis.
A click stimulus at ∼55 dB SPL was used to search for single units. Upon contact with a fiber, a frequency tuning curve was measured by an automatic tracking algorithm (Kiang et al., 1970) using 50 ms tone bursts, and the characteristic frequency (CF) was determined. The spontaneous firing rate (SR) of the fiber was measured over an interval of 20 s. Then, responses to tones at the CF or to broadband noise were studied as a function of stimulus level using the paradigms described below. The broadband noise was a burst of exactly reproducible noise with a bandwidth of either 25 or 35 kHz. The same noise token was used in all neurons and all experiments. Noise levels are given in decibels SPL over the entire bandwidth; the spectrum levels (in decibels relative to 20 μPa/√Hz) are ∼45 dB lower. All tone and noise bursts were 50 ms in duration, including 2 ms rise–fall times.
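As a rough check on the ∼45 dB offset between overall level and spectrum level, and assuming an idealized noise whose spectrum is flat over its bandwidth B (an assumption made here only for illustration), the spectrum level L_s follows from the overall level L as:

```latex
L_s = L - 10\log_{10}\!\left(\frac{B}{1\ \mathrm{Hz}}\right), \qquad
10\log_{10}(25000) \approx 44.0\ \mathrm{dB}, \qquad
10\log_{10}(35000) \approx 45.4\ \mathrm{dB}.
```

Both bandwidths therefore give an offset within about 1 dB of 45 dB.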
Stimulation paradigms.
Two different stimulation paradigms were used to measure rate-level functions. In the baseline paradigm, the sound levels of pure tone or broadband noise bursts are randomly drawn from a uniform distribution spanning 75 dB in 1 dB increments. Each tone or noise burst lasted 50 ms, with a 250 ms silent interval between successive stimuli. Typically, baseline rate responses were obtained from 10 trials at each level. This baseline paradigm is intended to minimize adaptation by introducing silent intervals between stimulus presentations and is similar to paradigms used in most studies of level coding in auditory neurons (Sachs and Abbas, 1974).
We also used the paradigm of Dean et al. (2005) to investigate whether the dynamic range of AN fibers adapts to the sound level distribution. In this paradigm, a pure tone or broadband noise stimulus is presented continuously, and the stimulus level is drawn at random every 50 ms from a probability distribution containing an HPR superimposed on a broad plateau (HPR paradigm) (Fig. 1A,B). As in the baseline paradigm, the level distribution spans 75 dB in 1 dB increments, but it also contains a 12-dB-wide HPR (e.g., centered at 72 dB SPL and spanning 66–78 dB SPL in Fig. 1A, B). Levels within the HPR have an 80% overall probability of occurrence, whereas levels outside the HPR have an overall probability of 20%. The HPR stimulus resembles many natural sounds such as speech in that the continuously varying sound level tends to be concentrated over a relatively narrow range, while also covering a wide range (Fig. 1C). Typically, responses were obtained from 380 stimulus trials (each 50 ms in duration) for each level within the HPR and 20 trials for each level outside the HPR, for a total of 6180 trials (about 5 min). After a run was completed with the HPR centered at one level (e.g., 72 dB SPL in Fig. 1B), the measurement was repeated with different HPR mean levels (taken from the set 36, 48, 60, 72, and 84 dB SPL) so as to sample the fiber's dynamic range. Typically, we obtained rate-level responses for three to five different HPR mean levels. We deliberately altered the order of HPR mean levels from one unit to the next to avoid any order effect across the neural population.
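The trial counts of the HPR paradigm can be checked with a short sketch. It assumes a grid of 75 levels in 1 dB steps, 13 of which fall inside the 12-dB-wide HPR. The starting level of 21 dB SPL, the variable names, and the use of a shuffled fixed-count sequence are assumptions made for illustration and may differ from the actual stimulus generation.

```python
import numpy as np

# Assumed level grid: 75 levels in 1 dB steps (the 21 dB SPL lower edge is hypothetical).
levels = np.arange(21, 96)

# HPR centered at 72 dB SPL and 12 dB wide, i.e. 66-78 dB SPL inclusive.
hpr_center, hpr_width = 72, 12
in_hpr = np.abs(levels - hpr_center) <= hpr_width / 2

# Trial counts per level as stated in the text: 380 inside the HPR, 20 outside.
n_trials = np.where(in_hpr, 380, 20)
total_trials = int(n_trials.sum())

# Overall probability of a level falling inside the HPR, and run duration at 50 ms per trial.
p_hpr = n_trials[in_hpr].sum() / total_trials
duration_min = total_trials * 0.050 / 60

# One possible random level sequence with exactly these counts (assumed procedure).
rng = np.random.default_rng(0)
sequence = rng.permutation(np.repeat(levels, n_trials))
```

With these assumptions the counts sum to the 6180 trials quoted above, the HPR probability comes out very close to 80%, and the run lasts slightly more than 5 min.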
Analysis.
For each stimulus trial, we counted the number of spikes over a window extending from the response latency to the stimulus offset. The latency was measured for each fiber from peristimulus time histograms averaged across all levels in the baseline paradigm. Specifically, the latency was defined as the first peristimulus time when the firing rate exceeded 4 SDs above the SR (estimated from the silent intervals between stimuli). For both stimulus paradigms, we obtained a rate-level function by averaging the spike count across all trials for each level. We also computed the SD of the rate across trials and the spike count distribution across trials for signal detection analyses (see below).
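For illustration only, the latency criterion and counting window can be sketched as follows. The function names, the 1 ms bin width, and all numerical values are hypothetical. How the SD of the spontaneous rate is estimated is not specified in the text, so it is simply passed in as a parameter.

```python
import numpy as np

def response_latency(psth_rate, bin_starts_ms, sr_mean, sr_sd, n_sd=4.0):
    """Return the start time (ms) of the first PSTH bin whose rate exceeds
    sr_mean + n_sd * sr_sd, or None if no bin crosses the criterion."""
    above = np.flatnonzero(np.asarray(psth_rate) > sr_mean + n_sd * sr_sd)
    return float(bin_starts_ms[above[0]]) if above.size else None

def spike_count(spike_times_ms, latency_ms, offset_ms):
    """Count spikes in the window [latency, stimulus offset)."""
    t = np.asarray(spike_times_ms)
    return int(np.count_nonzero((t >= latency_ms) & (t < offset_ms)))

# Hypothetical PSTH in 1 ms bins: spontaneous rate for 3 ms, then a driven response.
bin_starts_ms = np.arange(50)
psth_rate = np.full(50, 50.0)
psth_rate[3:] = 400.0

latency_ms = response_latency(psth_rate, bin_starts_ms, sr_mean=50.0, sr_sd=20.0)

# Hypothetical spike times (ms) for one 50 ms trial; only spikes inside the window count.
n_spikes = spike_count([1.0, 3.5, 10.2, 49.9, 52.0], latency_ms, offset_ms=50.0)
```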
Rate-level functions were fit with the five-parameter model of Sachs and Abbas (1974) and Winslow and Sachs (1988). In this model, the mean firing rate r is given as a function of sound pressure P (in Pa) by Equation 1, where Rmin and Rmax are the minimum and maximum firing rates, respectively, θi and θe are parameters specifying the curve's position along the level axis, and N is an exponent characterizing the steepness of the growth in firing rate. The Matlab function “lsqnonlin” was used to fit the model to the data by the least-squares method. The fitted curves were used to characterize how the rate-level function changes with HPR mean level. To quantify the decrease in maximum firing rate with increasing HPR mean level, we normalized Rmax by its value for the baseline rate-level function and fit a straight line to the change in normalized Rmax with HPR mean level. To quantify the horizontal shift of rate-level functions with HPR mean level, we computed the sound level “midpoint” L50, where the firing rate is halfway between its minimum Rmin and its maximum Rmax (Costalupes et al., 1984; Gibson et al., 1985). A straight line was then fit to the growth in L50 with HPR mean level, and the slope of this line (in decibels per decibel) was used as a metric for the strength of dynamic range adaptation.
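The midpoint and slope computations can be illustrated with a minimal sketch. A logistic curve is used here purely as a stand-in for a fitted rate-level function; it is not the five-parameter model, and the original fits were obtained with Matlab's lsqnonlin. All parameter values are hypothetical except the two end midpoints (38 and 48 dB SPL), which are taken from the example fiber described in the Results.

```python
import numpy as np

def sigmoid_rate(level, r_min, r_max, mid, k):
    """Stand-in saturating rate-level curve (NOT the five-parameter model)."""
    return r_min + (r_max - r_min) / (1.0 + np.exp(-(level - mid) / k))

def l50(levels, fitted_rate):
    """Level at which an increasing fitted curve is halfway between its
    minimum and maximum over the sampled level range."""
    half = 0.5 * (fitted_rate.min() + fitted_rate.max())
    return float(np.interp(half, fitted_rate, levels))

levels = np.linspace(0.0, 100.0, 1001)
hpr_means = np.array([36.0, 48.0, 60.0, 72.0])
assumed_mids = [38.0, 41.5, 44.5, 48.0]        # end values from the text; middle two hypothetical
assumed_rmax = [200.0, 190.0, 180.0, 170.0]    # hypothetical decrease in maximum rate

# Midpoint of each stand-in curve, then a straight-line fit of L50 against HPR mean level.
midpoints = np.array([
    l50(levels, sigmoid_rate(levels, 20.0, r_max, mid, 3.0))
    for mid, r_max in zip(assumed_mids, assumed_rmax)
])
slope, intercept = np.polyfit(hpr_means, midpoints, 1)   # slope in dB/dB
```

Because L50 is defined relative to each curve's own minimum and maximum, the assumed change in maximum rate does not affect the recovered midpoints, and the fitted slope is close to 0.28 dB/dB for these values.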
Neural discriminability depends not only on the mean responses but also on variability in responses. To characterize discriminability for single AN fibers, we calculated the sensitivity index δ′ at each level (Colburn et al., 2003); δ′ is the slope of the fitted rate-level curve divided by the SD of the firing rate across trials. We used the Bayesian adaptive regression splines (BARS) algorithm (DiMatteo et al., 2001) to obtain smooth estimates of the SD as a function of level (see Fig. 7C). The neural just-noticeable difference (JND) in level based on rate information is approximately the reciprocal of δ′ (Colburn et al., 2003). Following Winslow and Sachs (1988), we also defined a statistical threshold Lth, as the level where the fitted firing rate is one SD above the minimum rate Rmin, and a statistical saturation level Lsat, where the fitted rate is one SD below its maximum Rmax.
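A minimal numerical sketch of these definitions is given below. It assumes a logistic stand-in for the fitted rate-level curve and a constant SD in place of the BARS-smoothed SD. Function names and values are hypothetical, and the SD used for Lth and Lsat is taken to be the SD at each level, which is one possible reading of the definition.

```python
import numpy as np

def delta_prime(levels, fitted_rate, sd):
    """Slope of the fitted rate-level curve divided by the across-trial SD."""
    return np.gradient(fitted_rate, levels) / sd

def threshold_and_saturation(levels, fitted_rate, sd):
    """First level at which the fitted rate reaches one SD above its minimum (L_th)
    and first level at which it comes within one SD of its maximum (L_sat)."""
    r_min, r_max = fitted_rate.min(), fitted_rate.max()
    l_th = levels[np.argmax(fitted_rate >= r_min + sd)]
    l_sat = levels[np.argmax(fitted_rate >= r_max - sd)]
    return float(l_th), float(l_sat)

# Hypothetical fitted curve (spikes/s) and a constant SD standing in for the smoothed SD.
levels = np.linspace(0.0, 100.0, 1001)
fitted_rate = 10.0 + 190.0 / (1.0 + np.exp(-(levels - 45.0) / 4.0))
sd = np.full_like(levels, 15.0)

dp = delta_prime(levels, fitted_rate, sd)
dp_max = float(dp.max())                    # peak sensitivity per dB
l_dp_max = float(levels[dp.argmax()])       # level at which sensitivity peaks
jnd_min = 1.0 / dp_max                      # approximate minimum neural JND in dB
l_th, l_sat = threshold_and_saturation(levels, fitted_rate, sd)
```

For a symmetric curve with constant SD, the sensitivity peak coincides with the midpoint, and Lth and Lsat lie symmetrically on either side of it.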
To quantify the precision of level coding by our population of AN fibers for broadband noise stimuli, we calculated the average Fisher information (FI) using the method of Dean et al. (2005). To further characterize an average AN fiber's sensitivity to level change, we also calculated the mean neural JND by taking the reciprocal of the square root of the mean FI. Because our estimate of the FI is based on finite data, it has a positive bias (Gordon et al., 2008). To estimate this bias, we randomly shuffled the association between firing rate and sound level for each individual trial and recomputed the FI for the shuffled data. This procedure was repeated for 20 different random permutations of the data. The mean of the shuffled FIs was taken as a “noise floor” below which the estimated FI is considered to be an artifact of finite sampling.
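The shuffling procedure can be sketched as follows. This is only a rough illustration under stated assumptions: FI is estimated from empirical spike-count histograms with a pseudocount and a finite-difference derivative across levels, which is not necessarily the estimator of Dean et al. (2005). The simulated Poisson fiber, function names, and parameter values are hypothetical.

```python
import numpy as np

def fisher_information(levels, counts, pseudo=0.5):
    """Estimate FI(s) = sum_r P(r|s) * (d ln P(r|s) / ds)^2 from trial data.
    levels: level of each trial; counts: integer spike count of each trial."""
    levels = np.asarray(levels)
    counts = np.asarray(counts)
    uniq = np.unique(levels)
    n_bins = counts.max() + 1
    p = np.empty((uniq.size, n_bins))
    for i, lev in enumerate(uniq):
        # Pseudocount avoids log(0) for spike counts never observed at this level.
        hist = np.bincount(counts[levels == lev], minlength=n_bins) + pseudo
        p[i] = hist / hist.sum()
    dlogp = np.gradient(np.log(p), uniq.astype(float), axis=0)
    return uniq, np.sum(p * dlogp ** 2, axis=1)

def shuffled_floor(levels, counts, n_perm=20, seed=0):
    """Mean FI over random permutations of the level labels (bias estimate)."""
    rng = np.random.default_rng(seed)
    fis = [fisher_information(rng.permutation(levels), counts)[1] for _ in range(n_perm)]
    return np.mean(fis, axis=0)

# Simulated single fiber: Poisson spike counts whose mean grows sigmoidally with level.
rng = np.random.default_rng(1)
levels = np.repeat(np.arange(40, 61), 400)
mean_count = 1.0 + 9.0 / (1.0 + np.exp(-(levels - 50) / 2.0))
counts = rng.poisson(mean_count)

uniq, fi = fisher_information(levels, counts)
floor = shuffled_floor(levels, counts)
jnd = 1.0 / np.sqrt(fi)   # for a population, FI would be averaged across fibers first
```

In this simulation the estimated FI near the midpoint of the rate-level function lies well above the shuffled noise floor, whereas estimates comparable to the floor would be treated as sampling artifacts.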
Note that we use two slightly different metrics for quantifying the precision of level coding: δ′ for single fibers and FI for the AN fiber population. Both metrics are equivalent for Gaussian random variables, but FI is more general because it does not require a Gaussian assumption. We use δ′ to characterize coding precision in single fibers because it is simple to estimate (it only depends on the mean and variance of the firing rates at each level) and has a smooth dependence on level when computed from fitted curves. In contrast, FI cannot be reliably estimated for single fibers based on the 20 stimulus trials that are available for levels outside the HPR. Because of its generality, FI is used to characterize coding precision for the average fiber in the population, where the noise in single-fiber FI estimates tends to average out. We compared the population average FI to the population average δ′² and found them to match closely (data not shown), suggesting that deviations from the Gaussian assumption may not be severe.
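The relation between the two metrics can be made explicit with the standard expression for the Fisher information of a Gaussian variable. Here μ(L) and σ(L) denote the mean and SD of the firing rate at level L; this notation is introduced for illustration only.

```latex
r \sim \mathcal{N}\!\big(\mu(L), \sigma^2(L)\big)
\;\Longrightarrow\;
\mathrm{FI}(L) = \frac{\mu'(L)^2}{\sigma(L)^2} + 2\,\frac{\sigma'(L)^2}{\sigma(L)^2}
= \delta'(L)^2 + 2\left(\frac{\sigma'(L)}{\sigma(L)}\right)^{2}.
```

FI therefore reduces to δ′² when the SD varies slowly with level, in which case the neural JND is approximately 1/δ′ = 1/√FI.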
To quantitatively compare dynamic range adaptation between AN fibers and IC neurons, we quantified the rate of dynamic range shift with HPR mean level for the IC neurons studied by Dean et al. (2005). Because of the wide variety of shapes of rate-level functions in the IC, we could not always fit the IC data using the five-parameter model (Eq. 1) used for the AN data. Instead, we used the BARS algorithm (DiMatteo et al., 2001) to fit smooth curves to IC neurons' rate-level functions. For each HPR mean level, the level midpoint L50 of IC neurons was defined as the level where the BARS fit is halfway between its maximum and minimum across all levels tested. For nonmonotonic rate-level functions, only the steep portion from threshold to the peak level was considered because that part shifts most systematically with the HPR mean level. To avoid any possible bias from using different methods to fit the AN and IC data, we also fit the AN fiber rate-level functions using the BARS algorithm. The L50 slopes obtained using the two fitting algorithms (Eq. 1 and BARS) were strongly correlated (r = 0.93), and their mean values were not statistically different (paired t test, p = 0.61).
Results
Our results are based on recordings from 72 AN fibers in five cats with CFs ranging from 200 Hz to 37 kHz. Sixty percent of these units had high spontaneous discharge rates (SR, >18 spikes/s), 28% had medium SRs (between 0.5 and 18 spikes/s), and 12% had low SRs (<0.5 spikes/s). Rate-level functions were measured with at least three HPR mean levels in 28 fibers for broadband noise stimuli and 21 fibers for CF tones.
AN fiber dynamic range shifts with HPR mean level
Figure 2A shows the rate-level functions of a high-SR AN fiber for pure tones at the CF (550 Hz), measured using stimulus levels drawn from a uniform distribution (baseline paradigm) and from four distributions with HPRs centered at 36, 48, 60, and 72 dB SPL, respectively (HPR paradigm). All five rate-level functions show flat saturation and are well fit by the five-parameter model (Sachs and Abbas, 1974; Winslow and Sachs, 1988). The model fits to the data are particularly tight in the HPRs, where the variability in mean firing rates is low because of the large number of trials. For all levels, the firing rates are highest for the baseline condition and decrease with increasing HPR mean level. Importantly, the range of levels over which the firing rate grows rapidly with level shifts to the right with increasing HPR mean level.
To interpret the effect of HPR mean level on rate-level functions as shown in Figure 2A, it is important to provide a precise definition of dynamic range adaptation and contrast it with classic firing rate adaptation (Fig. 3). Pure dynamic range adaptation (Fig. 3A) would be a change in sensitivity or operating point with no change in responsiveness (i.e., maximum and minimum firing rates), a horizontal shift of the rate-level function. This is close to what is seen in some IC neurons with HPR stimuli [Dean et al. (2005), their Fig. 1e]. The changes in sensitivity that characterize dynamic range adaptation are clearly distinct from the changes in firing rate that characterize classic firing rate adaptation (Fig. 3B). Firing rate adaptation in the AN is manifested by two related phenomena (Kiang et al., 1965; Smith and Zwislocki, 1975; Harris and Dallos, 1979; Smith, 1979; Chimento and Schreiner, 1991): a decay in firing rate in response to a sustained sound stimulus and suppression of the rate response to a probe stimulus presented after a stimulus that induces adaptation. Firing rate adaptation by preceding stimuli is expected to occur with HPR stimuli, for which the vast majority (80%) of adaptor levels are located in the HPR. Thus, the decrease in firing rates with increasing HPR mean level observed in Figure 2A is consistent with classic firing rate adaptation where the probe response is increasingly suppressed with increasing adaptor level (Smith, 1977; Harris and Dallos, 1979). For this fiber, the maximum firing rate normalized to that in the baseline condition decreases with increasing HPR mean level at a rate of 0.57%/dB (Fig. 2B).
In addition to a decrease in firing rate, Figure 2A also shows that the range of sound levels over which the firing rate grows rapidly shifts systematically with HPR mean level. This dynamic range shift is not expected from previous descriptions of AN firing rate adaptation (Smith and Zwislocki, 1975; Smith, 1979) because an adapting stimulus produces a constant decrement in probe firing rate without altering the operating point or sensitivity (Fig. 3B). Thus, the rate-level functions measured with the HPR paradigm in Figure 2A show mixed adaptation (Fig. 3C) comprising both a rate decrement consistent with classic firing rate adaptation and a change in sensitivity or operating point characteristic of dynamic range adaptation.
To separate dynamic range adaptation from firing rate adaptation, each rate-level function in Figure 2A was normalized to the minimum and maximum of the fitted curves (Fig. 2C). On this normalized scale, the rate-level functions for the different HPR mean levels closely parallel one another. The midpoint L50 of the normalized rate-level function increases from 38 dB SPL, for the HPR centered at 36 dB SPL, to 48 dB SPL, for the HPR centered at 72 dB SPL. The dependence of L50 on HPR mean level is well characterized by a straight line with a slope of 0.27 dB/dB (Fig. 2D). The statistical threshold Lth and saturation level Lsat (see Materials and Methods) also increase nearly linearly with HPR mean level (Fig. 2D). For this fiber, the growth rates are similar for all three metrics (Lth, Lsat, and L50), which is typical for high-SR fibers.
The mixed adaptation seen with CF tones using the HPR paradigm was also observed with broadband noise stimuli. Figure 4A shows an example from a medium-SR AN fiber (CF, 1300 Hz) using level distributions with four different HPRs (centered at 48, 60, 72, and 84 dB SPL). As with the pure-tone responses shown in Figure 2, the maximum firing rate decreases, and the level midpoint shifts toward higher intensities with increasing HPR mean level. Dynamic range adaptation in this medium-SR fiber behaves somewhat differently from that in the high-SR fiber in Figure 2 in that the rate of shift for the saturation level Lsat (0.07 dB/dB) (Fig. 4D) is lower than that for the midpoint L50 (0.27 dB/dB). Such differences between L50 slopes and Lsat slopes are frequently observed for low- and medium-SR fibers, although they are not usually as large as in Figure 4D.
Characteristics of dynamic range adaptation in AN fiber population
Figure 5 summarizes how the dynamic range shifts with HPR mean level for the AN fiber population in response to both CF tones and broadband noise stimuli. In general, the shift in the level midpoint L50 with HPR mean level was well characterized by a straight line. Specifically, we considered the growth in L50 to be linear if the maximum absolute deviation between the data and the fitted line was no more than 3 dB. This criterion was met by all 21 AN fibers for tones and by 27 of 28 fibers for noise. The rates of shift of L50 (the L50 slopes) ranged from 0.16 to 0.47 dB/dB for tones and from 0.05 to 0.39 dB/dB for broadband noise and showed no obvious dependence on either CF (Fig. 5A) or SR (Fig. 5B). Thus, no AN fiber approached the ideal slope of 1 dB/dB that would be required to keep the dynamic range and the HPR in alignment.
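The linearity criterion amounts to a simple check on the residuals of the fitted line, sketched below. The L50 values are hypothetical and the function name is an assumption made for illustration.

```python
import numpy as np

def l50_linearity(hpr_means, l50_values, max_dev_db=3.0):
    """Fit L50 = slope * HPR mean + intercept and test whether the largest
    absolute deviation from the line stays within max_dev_db."""
    hpr_means = np.asarray(hpr_means, dtype=float)
    l50_values = np.asarray(l50_values, dtype=float)
    slope, intercept = np.polyfit(hpr_means, l50_values, 1)
    max_dev = float(np.max(np.abs(l50_values - (slope * hpr_means + intercept))))
    return float(slope), max_dev, max_dev <= max_dev_db

# Hypothetical midpoints (dB SPL) for four HPR mean levels.
slope, max_dev, is_linear = l50_linearity([48, 60, 72, 84], [50.0, 53.5, 56.0, 60.0])
```

For these values the largest deviation from the line is 0.5 dB, so the fiber would count as linear under the 3 dB criterion.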
The mean L50 slope for tones (0.34 dB/dB) is significantly greater than the 0.25 dB/dB mean L50 slope for broadband noise (two-sample t test, p < 0.001). One possible reason for this difference is that, on average, HPR mean levels relative to threshold were lower for noise than for tones. Although the HPR mean levels in decibels SPL were the same for both stimuli, noise thresholds are higher than CF tone thresholds so that HPR levels relative to threshold (re. threshold) were ∼13 dB lower for noise than for tones, on average. Figure 5C shows that the L50 slope tends to increase with the average HPR mean level re. threshold across our fiber sample (r = 0.59; p < 0.001). The data for tones and noise seem to follow the same trend, suggesting that the difference in L50 slopes for the two stimuli can be primarily accounted for by differences in mean HPR levels re. threshold. The lower slopes when HPR mean levels are close to threshold may arise because L50 tends to reach a lower limit as the HPR mean level approaches threshold and adaptation becomes minimal. This L50 asymptote near threshold may lead to an underestimation of the slope when HPR mean levels are concentrated near threshold. Thus, the greater rates of dynamic range shifts observed for CF tones than for broadband noise may reflect a bias in our measurement procedure rather than a genuine difference in AN fibers' response properties between these two stimuli.
Dynamic range shift, as characterized by the shift in L50, results from the combined effects of shifts in threshold Lth and saturation level Lsat with HPR mean level. Whereas the shift in Lth generally paralleled that in L50, and the rates of shift did not significantly differ for the two metrics across the AN fiber population (paired t test, p = 0.49 for tones and p = 0.26 for broadband noise), Lsat behaved somewhat differently in some units. Figure 5D shows a scatter plot of Lsat slope versus L50 slope for both CF tones and broadband noise. The two metrics are significantly correlated for both stimuli (tones: r = 0.78, p < 0.001; noise: r = 0.66, p < 0.001). The vast majority of data points in Figure 5D lie below the main diagonal, indicating that the slope for Lsat tends to be smaller than the slope for L50. This observation is confirmed by paired t tests on the mean slopes (tones: 0.25 vs 0.34 dB/dB, p < 0.001; noise: 0.16 vs 0.25 dB/dB, p < 0.001).
Two factors contribute to the slower growth in Lsat than in L50. One factor important for low- and medium-SR fibers is that the constant rate-decrement property in classic descriptions of AN firing rate adaptation (Fig. 3B) cannot strictly hold at low stimulus levels if the rate decrement becomes greater than the (unadapted) SR, because this would lead to negative firing rates at low levels (Fig. 3D). In such cases, our procedure for estimating the midpoint between minimum and maximum firing rates could yield an apparent shift in L50, even in the absence of genuine dynamic range shift (Fig. 3D). When this happens, the shift in the saturation level Lsat may give a more accurate estimate of dynamic range adaptation than the shift in L50. Consistent with this idea, the difference between L50 slopes and Lsat slopes was significantly greater for low- and medium-SR fibers than for high-SR fibers (two-sample t test, p = 0.003). Despite this effect, the Lsat slopes were significantly above zero for most low- and medium-SR fibers, indicating that dynamic range adaptation is not limited to high-SR fibers.
A second factor contributing to the difference between Lsat slopes and L50 slopes is the decrease in the range of firing rates at higher HPR mean levels. Since Lsat is, by definition, the level at which firing rate is just 1 SD below the maximum rate, the reduction in maximum firing rate with increasing HPR mean level limits the growth in Lsat. In contrast, L50 is not affected by changes in firing rates because it is determined relative to the minimum and maximum rates. This robustness to rate changes makes L50 shift the best overall metric of dynamic range adaptation, although it may somewhat overestimate shifts in low- and medium-SR fibers.
As with L50 slopes, the slopes characterizing the decrease in maximum firing rate with increasing HPR mean level (in percentage per decibel, measured as in Figs. 2B and 4B) vary substantially across fibers in our sample but do not systematically depend on CF (Fig. 6A). This maximum rate slope shows a weak but significant negative correlation with the L50 slope (r = 0.34; p = 0.02), meaning that AN fibers showing a relatively large shift in dynamic range tend to show a relatively small decrease in firing rate (Fig. 6B). Moreover, although L50 slopes are greater for tones than for noise, the absolute values of the slopes for maximum rate decrement show the opposite trend (−0.53%/dB for tones and −0.85%/dB for noise; p = 0.0012). The weak negative correlation between decrease in firing rate and dynamic range shift, and the dissociation between the two metrics in the effect of stimulus type (tone vs noise) provide additional evidence that dynamic range adaptation is a distinct phenomenon from classic firing rate adaptation.
So far, we have assumed that the dynamic range shifts are entirely caused by changes in HPR mean level. A complication is that the variance of the level distribution also changes when the mean is altered. Specifically, the variance is minimal when the HPR is near the center of the 75-dB-wide level range and maximal at the extremes of the level range. In practice, the change in SD (the square root of the variance) is fairly small: from a minimum of 11.8 dB when the HPR is centered at 60 dB SPL to a maximum of 16.1 dB when the HPR is centered at 36 dB SPL. Nevertheless, because both central auditory neurons (Kvale and Schreiner, 2004; Nagel and Doupe, 2006) and somatosensory cortical neurons (Garcia-Lazaro et al., 2007) have been shown to adapt to changes in the variance of the stimulus-level distribution, it is possible that some of the dynamic range shifts we observe are attributable to changes in variance. To test for this possibility, we fit an alternative model in which L50 is a linear function of both the mean and the SD of the level distribution (as opposed to the mean only in the standard model). The bivariate model provided a significantly better fit (F test, p < 0.05) than the standard, univariate model in only two AN fibers for CF tones and in three fibers for broadband noise. Thus, although we cannot rule out a small adaptation to changes in the variance of the level distribution for a minority of fibers, the effects of changes in variance were negligible compared with those of HPR mean level for our stimuli. A more definitive assessment of whether adaptation to level variance occurs in AN fibers would require independent manipulation of the mean and the variance of the level distribution (Nagel and Doupe, 2006).
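The comparison between the univariate and bivariate models is a nested-model F test, which can be sketched as below. The L50 values are hypothetical, and only the two SDs quoted above (11.8 and 16.1 dB) come from the text; the remaining SDs are placeholders. With three regression parameters, the test requires at least four HPR mean levels.

```python
import numpy as np
from scipy import stats

def nested_f_test(y, X_reduced, X_full):
    """F test for whether the extra columns of X_full significantly reduce
    the residual sum of squares relative to the nested model X_reduced."""
    def rss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return float(np.sum((y - X @ beta) ** 2))
    rss_r, rss_f = rss(X_reduced), rss(X_full)
    df_num = X_full.shape[1] - X_reduced.shape[1]
    df_den = len(y) - X_full.shape[1]
    f_stat = ((rss_r - rss_f) / df_num) / (rss_f / df_den)
    return f_stat, float(stats.f.sf(f_stat, df_num, df_den))

hpr_mean = np.array([36.0, 48.0, 60.0, 72.0, 84.0])
level_sd = np.array([16.1, 13.0, 11.8, 13.0, 16.1])   # 16.1 and 11.8 from the text; others hypothetical
l50 = np.array([38.2, 41.0, 44.3, 47.9, 51.1])        # hypothetical midpoints (dB SPL)

ones = np.ones_like(hpr_mean)
X_uni = np.column_stack([ones, hpr_mean])             # L50 = a + b * mean
X_biv = np.column_stack([ones, hpr_mean, level_sd])   # L50 = a + b * mean + c * SD
f_stat, p_value = nested_f_test(l50, X_uni, X_biv)
```

A p value below 0.05 would indicate that adding the SD term significantly improves the fit for that fiber.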
Dynamic range adaptation improves the precision of level coding in single fibers
Does dynamic range adaptation improve the precision of level coding by AN fibers within the HPR? To address this question, we used two complementary approaches (see Materials and Methods). We used the sensitivity per decibel δ′ (Colburn et al., 2003) to assess the coding precision based on rate responses of single AN fibers. δ′ represents the change in mean firing rate produced by a 1 dB increment in level, expressed as a fraction of the SD across trials. We also used the more general FI to assess the precision of level coding for the average fiber within the AN population (Dean et al., 2005). Both δ′ and FI are inversely related to the neural JND, the smallest level change that can be reliably detected by an ideal observer based on average rate information from one AN fiber or a population of fibers (Colburn et al., 2003).
Figure 7 illustrates how δ′ was calculated for a low-SR fiber (CF, 1180 Hz) in response to broadband noise stimuli with four level distributions (HPR mean levels 48, 60, 72, and 84 dB SPL). The mean firing rates for this fiber (Fig. 7A) are well fit by the five-parameter model (Sachs and Abbas, 1974; Winslow and Sachs, 1988). The fitted curves were differentiated with respect to sound level to obtain the slopes shown in Figure 7B. Each slope curve is bell shaped, and its maximum occurs close to the level midpoint L50 because the rate-level functions are nearly symmetric around the midpoint. The peak slope decreases, and the peak location shifts toward higher levels with increasing HPR mean level. As with the mean rate, the SD of the firing rate (Fig. 7C) first grows rapidly with stimulus level and then saturates, and the range of rapid growth shifts systematically with HPR mean level. The SDs show considerable scatter, particularly outside the HPR where only 20 stimulus trials were presented. Nevertheless, the main trends in the data are captured by the fitted BARS (see Materials and Methods). For each HPR mean level, the sensitivity index δ′ is equal to the rate-level function slope (Fig. 7B) divided by the BARS fit to the SD (Fig. 7C). The resulting δ′-level functions (Fig. 7D) are also bell shaped, with a peak that shifts systematically with HPR mean level. The peak value δ′max decreases with increasing HPR mean level because of the decrease in maximum firing rates (firing rate adaptation), which reduces sensitivity. The sound level Lδ′max at which δ′ peaks increases nearly linearly with HPR mean level with a slope of 0.25 dB/dB (Fig. 7E), which is slightly lower than the L50 slope for this fiber (0.30 dB/dB). This relatively small difference in slopes was typical. The HPR mean level at which the regression line for Lδ′max intersects the identity line is the “optimally coded” HPR mean level Loc (67 dB SPL for this fiber) (Fig. 7E). 
For a dynamic stimulus whose level distribution has an HPR, stimulus levels within the HPR are most precisely coded by this fiber when the HPR is centered at Loc. However, coding precision outside the HPR can be higher for dynamic stimuli with HPRs centered below Loc because of the wider range of firing rates at lower HPR mean levels.
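The intersection that defines Loc can be illustrated with a short Python sketch. The function name `optimally_coded_level` and the four Lδ′max values are hypothetical: the values were constructed to lie exactly on a line with the 0.25 dB/dB slope and the 67 dB SPL intersection reported for this fiber, and are not the measured data.

```python
import numpy as np

def optimally_coded_level(hpr_means, peak_levels):
    """Fit a least-squares line to (HPR mean level, peak level) pairs and
    return (slope, L_oc), where L_oc is the HPR mean level at which that
    line crosses the identity line."""
    slope, intercept = np.polyfit(hpr_means, peak_levels, 1)
    if np.isclose(slope, 1.0):
        raise ValueError("regression line is parallel to the identity line")
    # Solve intercept + slope * L = L for L.
    return slope, intercept / (1.0 - slope)

# HPR mean levels used for the example fiber (dB SPL).
hpr_means = np.array([48.0, 60.0, 72.0, 84.0])
# Illustrative peak levels only (assumption): placed on a line with
# slope 0.25 dB/dB that crosses the identity line at 67 dB SPL.
peak_levels = 50.25 + 0.25 * hpr_means

slope, l_oc = optimally_coded_level(hpr_means, peak_levels)
print(f"slope = {slope:.2f} dB/dB, L_oc = {l_oc:.1f} dB SPL")
```

Because the slope is well below 1 dB/dB, the regression line crosses the identity line at a single HPR mean level, and the peak of δ′ falls above the HPR for lower HPR mean levels and below it for higher ones.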
The sensitivity δ′ is approximately the inverse of the single-fiber, rate-based neural JND for level discrimination (Delgutte, 1987; Colburn et al., 2003), which can be directly compared with psychophysical JNDs. Figure 7F shows both the minimum neural JND (the inverse of δ′max) and the JND at the HPR mean level (the inverse of δ′ at each HPR mean level) as a function of HPR mean level. The minimum JND slowly increases (meaning poorer performance) from 2.8 to 5.3 dB as the HPR mean level increases from 48 to 84 dB SPL, reflecting the decreasing range of firing rates. The JND at the HPR mean level approaches the minimum JND between 60 and 72 dB SPL, which contains Loc (67 dB SPL), meaning that dynamic stimuli with HPRs within that range are optimally coded by this fiber. However, for HPR mean levels on either side of this optimal range, the JND at the HPR mean level becomes much larger than the minimum JND. For HPRs centered below Loc, best performance (minimum JND) occurs for levels above the HPR, whereas for HPRs centered above Loc, best performance occurs below the HPR (Fig. 7E). This failure of the fiber's sensitivity maximum to fully track changes in HPR reflects the slow (0.25 dB/dB) growth of Lδ′max with HPR mean level, compared with the 1 dB/dB slope that would be required for robust coding over a wide range of HPR mean levels. Thus, dynamic range adaptation, while helpful, is insufficient to provide precise coding of the prevailing levels in a dynamic stimulus over a wide range for this typical fiber.
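The relation between the rate-level slope, the SD, δ′, and the single-fiber JND can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code used here: the logistic rate-level function stands in for the fitted five-parameter model, the SD is simply assumed to grow with the square root of the mean rate rather than being a BARS fit, and all function names and parameter values are hypothetical.

```python
import numpy as np

def delta_prime(levels, mean_rate, sd_rate):
    """Sensitivity per decibel: slope of the rate-level function
    (spikes/s per dB) divided by the SD of the firing rate."""
    slope = np.gradient(mean_rate, levels)
    return slope / sd_rate

def neural_jnd(dprime):
    """Approximate rate-based JND (dB) as the inverse of delta-prime."""
    return 1.0 / dprime

levels = np.arange(20.0, 101.0, 1.0)  # dB SPL
# Hypothetical sigmoidal rate-level function with midpoint at 60 dB SPL.
mean_rate = 10.0 + 150.0 / (1.0 + np.exp(-(levels - 60.0) / 5.0))
# Hypothetical SD model (assumption): proportional to sqrt(mean rate).
sd_rate = 2.0 * np.sqrt(mean_rate)

dp = delta_prime(levels, mean_rate, sd_rate)
jnd = neural_jnd(dp)

l_peak = levels[np.argmax(dp)]           # level of maximum sensitivity
jnd_min = jnd.min()                      # minimum JND = 1 / max delta-prime
hpr_mean = 72.0                          # example HPR mean level (dB SPL)
jnd_at_hpr = np.interp(hpr_mean, levels, jnd)

print(f"peak sensitivity at {l_peak:.0f} dB SPL, minimum JND {jnd_min:.2f} dB")
print(f"JND at HPR mean level {hpr_mean:.0f} dB SPL: {jnd_at_hpr:.2f} dB")
```

When the HPR mean level lies away from the peak of δ′, the JND at the HPR mean level exceeds the minimum JND, which is the mismatch described above.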
Figure 8A shows a scatter plot of the rate of growth in Lδ′max with HPR mean level (the Lδ′max slope) against the L50 slope for our population of AN fibers with both tone and noise stimuli. The two slopes are strongly correlated (r = 0.61; p < 0.001). In addition, the mean Lδ′max slope (0.30 dB/dB for tones, 0.21 dB/dB for broadband noise) is not significantly different from the corresponding mean L50 slope (paired t test, p = 0.11 for tones and p = 0.09 for noise). This suggests that the steep part of the rate-level function near L50 plays a dominant role in determining the coding precision of sound level in single fibers.
Figure 8B shows the optimally coded HPR mean level Loc as a function of baseline threshold for the fiber population. The two metrics are strongly correlated (r = 0.80; p < 0.001), and the slope of the regression line is close to unity (1.1 dB/dB). On average, Loc is ∼12 dB above the threshold (25–75% quartiles, 8.1–18.7 dB). For noise, Loc ranges from below 10 to above 70 dB SPL, with half of the data concentrated between 47 and 61 dB SPL. The tight correlation between threshold and Loc suggests that the clustering of Loc around mid-levels simply reflects the distribution of the AN fibers' thresholds for noise. The Loc distribution for tones is harder to interpret because the stimulus frequency is always at the CF and thus different for every fiber.
Dynamic range adaptation improves the robustness of level coding by the AN fiber population
Although each AN fiber can optimally code the prevailing levels in dynamic stimuli over only a narrow range of HPRs, the population as a whole might still robustly code level over a fairly wide range of HPRs if the threshold (or Loc) distribution for the population is broad enough. Following Dean et al. (2005), we used the FI to quantify the precision of level coding for dynamic noise stimuli by the AN population. The FI takes into account both the range over which each fiber provides precise coding and the Loc distribution across the population. The inverse of FI provides a general lower bound on the variance (and thus a limit on the precision) with which a parameter (here, stimulus level) of a probability distribution can be estimated from observations drawn from that distribution (here, spike counts) (Gordon et al., 2008). To the extent that the firing patterns from different AN fibers are statistically independent (Johnson and Kiang, 1976), FI is additive over a population of AN fibers responding to the same stimulus.
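As a rough illustration of how FI about stimulus level can be computed from spike-count distributions, the following Python sketch evaluates FI(L) = Σ_r [∂P(r|L)/∂L]² / P(r|L) numerically. It assumes Poisson-distributed spike counts with a hypothetical linear mean-count function, whereas the FI reported here was estimated from the measured spike-count distributions; the function names, the count ceiling, and the step size are likewise assumptions.

```python
import numpy as np

def poisson_pmf(counts, mean):
    """Poisson probabilities for counts = 0, 1, ..., N (as np.arange)."""
    log_factorial = np.concatenate(
        ([0.0], np.cumsum(np.log(np.arange(1, counts.size))))
    )
    return np.exp(counts * np.log(mean) - mean - log_factorial)

def fisher_information(level, mean_count_fn, max_count=200, h=0.01):
    """FI about level (dB^-2) from spike-count distributions, using a
    central difference of step h (dB) for the derivative of P(r | L)."""
    counts = np.arange(max_count + 1)
    p = poisson_pmf(counts, mean_count_fn(level))
    dp = (poisson_pmf(counts, mean_count_fn(level + h))
          - poisson_pmf(counts, mean_count_fn(level - h))) / (2.0 * h)
    keep = p > 0  # skip counts whose probability underflows to zero
    return float(np.sum(dp[keep] ** 2 / p[keep]))

def mean_count(level):
    """Hypothetical mean spike count: 10 spikes at 50 dB SPL, 2 spikes/dB."""
    return 10.0 + 2.0 * (level - 50.0)

fi = fisher_information(50.0, mean_count)
print(f"FI at 50 dB SPL: {fi:.4f} dB^-2")
print(f"lower bound on the SD of a level estimate: {1.0 / np.sqrt(fi):.2f} dB")
```

For Poisson counts the numerical result can be checked against the closed form (dμ/dL)²/μ, which is 2²/10 = 0.4 dB⁻² in this example.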
Figure 9A shows the mean FI for the AN fiber population in response to broadband noise stimuli with HPRs centered at 36, 48, 60, 72, and 84 dB SPL. Because FI is estimated from a finite amount of data (rather than from ideal probability distributions), it has an inherent positive bias or noise floor that was estimated by randomly permuting the rate-level data (see Materials and Methods). The estimated FIs lie above the noise floor (dashed line) over almost the entire range of levels. As was the case for δ′ from single AN fibers, the population-based mean FI curves are bell shaped, and the peaks systematically shift with HPR mean level so as to locate the peak closer to the center of the HPR. The level LFImax where the FI peaks increases nearly linearly with HPR mean level at a rate of 0.26 dB/dB (Fig. 9B), which is close to the mean Lδ′max slope (0.21 dB/dB) for single fibers with broadband noise stimuli.
The minimum neural JND (the inverse square root of the maximum FI) monotonically increases from 4.2 to 5.0 dB with HPR mean level (Fig. 9C, open circles). In contrast, the neural JND at the HPR mean level (the inverse square root of FI at the HPR mean level) has a U-shaped dependence on HPR mean level (Fig. 9C, filled circles). This curve coincides with the minimum JND curve (4.5 dB) at 60 dB SPL but deviates at lower and higher levels. This indicates that the population of AN fibers as a whole most precisely codes the prevailing sound levels in a dynamic broadband noise stimulus when the HPR mean level is ∼60 dB SPL. This value is close to the median single-fiber Loc across the AN fiber population (55 dB SPL), indicating that level is optimally coded by the population as a whole in the range where a majority of individual fibers provide optimal coding for broadband noise stimuli. Thus, the U shape of the population JND curve as a function of HPR mean level results from the interaction of two factors: (1) the basic U shape of the JND curves for individual fibers, each with a minimum at its Loc; and (2) the threshold distribution across the population, which essentially determines the Loc distribution (Fig. 8B). In turn, the U shape for single-fiber JNDs is determined by both the narrow range over which firing rate increases rapidly with level and the slow rate of dynamic range shift compared with the 1 dB/dB that would be required to maintain each fiber's sensitivity maximum in perfect alignment with the HPR.
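The conversion between FI and the neural JND, and the consequence of FI being additive across independent fibers, can be sketched in a few lines of Python. The function names and the fiber count of 100 are hypothetical and chosen only for illustration; the 4.5 dB value is the average-fiber JND quoted above.

```python
import math

def jnd_from_fi(fi):
    """Neural JND (dB) as the inverse square root of FI (dB^-2)."""
    if fi <= 0:
        raise ValueError("FI must be positive")
    return 1.0 / math.sqrt(fi)

def population_jnd(mean_fi, n_fibers):
    """JND for n_fibers statistically independent fibers, each with FI
    equal to mean_fi, using additivity of FI across fibers."""
    return jnd_from_fi(n_fibers * mean_fi)

# FI corresponding to the 4.5 dB average-fiber JND at 60 dB SPL.
mean_fi = 1.0 / 4.5 ** 2
n_fibers = 100  # hypothetical number of independent fibers (assumption)

print(f"average-fiber FI: {mean_fi:.4f} dB^-2")
print(f"average-fiber JND: {jnd_from_fi(mean_fi):.2f} dB")
print(f"JND for {n_fibers} independent fibers: "
      f"{population_jnd(mean_fi, n_fibers):.2f} dB")
```

Under the independence assumption, pooling N fibers divides the JND by the square root of N, so the shape of the JND curve across levels is unchanged while its overall value scales down.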
To quantitatively assess the benefit of dynamic range adaptation for the robustness of level coding by the average AN fiber, Figure 9C compares the neural JNDs in “unadapted” and adapted conditions. The red curve (the inverse square root of the red curve in Fig. 9A) shows the neural JND of the average AN fiber as a function of level for a level distribution centered at the optimally coded HPR mean level, 60 dB SPL. This curve represents the precision of level coding with dynamic range adaptation held at a fixed set point and is contrasted with the curve for the neural JND at the HPR mean level (black), in which the dynamic range naturally adapts to changes in the level distribution. Whereas the red curve characterizes an average fiber's ability to detect a change in instantaneous level during a continuous, dynamic stimulus, the black curve represents the ability to detect a change in the overall (mean) level of the dynamic stimulus. We compare the range of levels over which each of these two curves stays within an arbitrary criterion: 2 dB of the grand minimum JND across all levels and level distributions (Fig. 9C, gray dotted line). Based on this criterion, the range of robust level coding for the average fiber is 42.4 dB with naturally occurring dynamic range adaptation, compared with 32.4 dB with the dynamic range adaptation held at a fixed set point. The 10 dB increase quantifies the benefit of dynamic range adaptation for robust level coding by the average AN fiber. Despite this substantial benefit, the neural JND with adaptation still degrades markedly at high HPR mean levels, consistent with the nonideal (<1 dB/dB) dynamic range shifts with HPR mean level (Fig. 9B). 
This degradation in performance contrasts with the nearly constant behavioral JNDs (Weber's law) for broadband noise stimuli (Florentine et al., 1987), indicating that the rate responses of the AN fiber population fail to predict trends in psychophysical performance, even when dynamic range adaptation is taken into account.
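The criterion-based comparison above (the range of levels over which a JND curve stays within 2 dB of the grand minimum) can be sketched in Python as follows. The two V-shaped JND curves are hypothetical stand-ins for the adapted and fixed-set-point curves, chosen so that the ranges are easy to verify by hand; they do not reproduce the measured 42.4 and 32.4 dB values, and the function name is an assumption.

```python
import numpy as np

def robust_coding_range(levels, jnd, grand_min_jnd, criterion_db=2.0,
                        n_fine=10001):
    """Width (dB) of the level range over which the JND stays within
    criterion_db of grand_min_jnd. Assumes a single U-shaped curve, so the
    qualifying levels form one contiguous region."""
    fine_levels = np.linspace(levels[0], levels[-1], n_fine)
    fine_jnd = np.interp(fine_levels, levels, jnd)  # linear interpolation
    ok = fine_jnd <= grand_min_jnd + criterion_db
    if not ok.any():
        return 0.0
    return float(fine_levels[ok].max() - fine_levels[ok].min())

levels = np.arange(20.0, 101.0, 1.0)  # dB SPL
# Hypothetical JND curves (assumption), both with a 4 dB minimum at 60 dB SPL.
jnd_adapted = 4.0 + np.abs(levels - 60.0) / 10.0  # shallower: adaptation on
jnd_fixed = 4.0 + np.abs(levels - 60.0) / 7.5     # steeper: fixed set point

grand_min = min(jnd_adapted.min(), jnd_fixed.min())
range_adapted = robust_coding_range(levels, jnd_adapted, grand_min)
range_fixed = robust_coding_range(levels, jnd_fixed, grand_min)

print(f"range with adaptation: {range_adapted:.1f} dB")
print(f"range with fixed set point: {range_fixed:.1f} dB")
print(f"benefit: {range_adapted - range_fixed:.1f} dB")
```

The benefit of adaptation is then the difference between the two widths, which is how the 10 dB figure quoted above is obtained from the measured curves.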
Dynamic range adaptation is weaker in the AN than in the IC
The dynamic range adaptation we observed in AN fibers in anesthetized cats is qualitatively similar to that reported for IC neurons in anesthetized guinea pigs using the same stimulus paradigm (Dean et al., 2005). This indicates that adaptation in the midbrain is at least partially inherited from the auditory periphery. However, a quantitative analysis reveals differences between the two sites in dynamic range adaptation.
Figure 10A–C shows the rate-level functions of three IC neurons from Dean et al. (2005) in response to broadband noise at four HPR mean levels. In contrast to the stereotypical sigmoidal rate-level function of AN fibers, rate-level functions of IC neurons exhibit a great diversity of shapes, including monotonic saturating (Fig. 10A), nonmonotonic (Fig. 10B), and monotonic nonsaturating (Fig. 10C). Despite these differences, all three neurons in Figure 10 clearly show dynamic range adaptation. We characterized the rate of dynamic range shift in IC neurons by measuring the level midpoint L50 from smooth curves fit to the rate-level functions using the BARS algorithm (Fig. 10A–C, solid lines). A majority of the IC neurons (23 of 30) showed a nearly linear growth in L50 with HPR mean level, as assessed using the same criterion as for AN fibers. Among these 23 neurons, the mean rate of dynamic range shift was 0.39 dB/dB, which is significantly greater than the 0.25 dB/dB mean L50 slope for the 27 AN fibers that met the linearity criterion with broadband noise (two-sample t test, p < 0.001) (Fig. 10D). The cross-neuron variability is also higher in the IC, with L50 slopes exceeding 0.6 dB/dB in some neurons. Although Watkins and Barbour (2008) have reported that neurons with nonmonotonic rate-level functions show little dynamic range adaptation in the primary auditory cortex of awake marmosets, all three nonmonotonic IC neurons from Dean et al. (2005) showed clear dynamic range shifts (one of them is shown in Fig. 10B).
These relatively large rates of dynamic range shift allow some IC neurons to robustly code level at higher HPR mean levels than AN fibers. For two of the three IC neurons shown in Figure 10, the shifts were large enough for the HPR to still be within the steeply rising part of the rate-level function at the highest HPR mean level tested (75 dB SPL). In contrast, for AN fibers, the optimally coded HPR mean level Loc averaged 55 dB SPL and never exceeded 72 dB for broadband noise (Fig. 8B).
The differences in dynamic range adaptation between individual AN and IC neurons are reflected in the average FI curves for the two neural populations [compare our Fig. 9A with Fig. 2 of Dean et al. (2005)]. In the AN, the location of the maximum FI LFImax (where precision of level coding is best) grows linearly with HPR mean level at a rate of 0.26 dB/dB (Fig. 9B), and the overall level of dynamic noise stimuli is optimally coded only in a narrow range centered at 60 dB SPL (Fig. 9C). In the IC, the growth in LFImax with HPR mean level is faster and less linear than in the AN (Fig. 9B), and the range of optimally coded HPR mean levels is wider, spanning 60–75 dB [Dean et al. (2005), their Fig. 2]. Thus, the average IC neuron is better suited than the average AN fiber to precisely code the sound levels of dynamic stimuli with level distributions biased toward higher intensities.
Important caveats regarding this comparison between the AN and IC studies are that, besides species differences, there are also differences in anesthesia (urethane/Dial vs urethane/Hypnorm) and modes of stimulation (monaural vs diotic). Nevertheless, the comparison is worthwhile because it offers a rare opportunity to compare data from two different laboratories that were collected with the same stimulus paradigm and analyzed with the same techniques.
In summary, dynamic range adaptation in the AN and IC appears to differ in at least two ways: (1) the rate of dynamic range shift with HPR mean level is greater in the IC for both single units and the population mean FI; and (2) the range of stimulus levels over which dynamic stimuli are optimally coded extends to higher levels in the IC. These results suggest that dynamic range adaptation in the auditory midbrain is only partially accounted for by that occurring in the periphery, implying that the adaptation is enhanced along the auditory pathway. Additional dynamic range adaptation might occur at each intervening synapse and through neural network interactions in the auditory brainstem and midbrain.
Discussion
We investigated whether dynamic range adaptation to the sound level distribution first observed in the auditory midbrain also occurs in primary auditory neurons. We measured rate-level functions of AN fibers in anesthetized cats in response to continuous, dynamic sound stimuli in which the instantaneous sound level occurred in a narrow range (12 dB) with high probability (80%). We found that all AN fibers, regardless of CF or SR, showed dynamic range adaptation to the mean sound level for both CF tones and broadband noise stimuli. Dynamic range adaptation is manifested by a shift of the rate-level function toward the most probable sound level in the dynamic stimulus and is distinct from classic firing rate adaptation, which is manifested by a decrease in firing rates without a change in sensitivity. The shift relocates the dynamic region of the rate-level function toward the mean sound level, resulting in higher coding precision of the prevailing levels in the dynamic stimulus. The shifts are nevertheless too small to provide robust coding of sound level by the AN fiber population over a wide range and smaller than those measured in IC neurons by Dean et al. (2005).
Dynamic range adaptation and firing rate adaptation
Classic descriptions of short-term adaptation in the AN (Smith and Zwislocki, 1975; Smith, 1977, 1979; Harris and Dallos, 1979) used a brief probe tone stimulus preceded by an adapting tone, with long silent intervals between each presentation of the stimulus pair. With this paradigm, adaptation results in a constant decrement in the rate response to the probe for all probe levels, without a change in sensitivity or operating point. Although the decrease in firing rates with increasing HPR mean level we observe with our stimuli is consistent with these descriptions, the dynamic range shift constitutes a different phenomenon. Additional evidence that dynamic range adaptation is distinct from classic firing rate adaptation is the weak negative correlation between firing rate decrement and dynamic range shift across the AN fiber population, and the opposite effects of stimulus type (tone vs noise) on the two metrics (Fig. 6B).
The HPR stimuli differ from those used in classic studies of AN adaptation in two respects: the stimulus level varies rapidly, and there are no silent intervals between stimulus presentations. Which of these two properties is more important to produce dynamic range adaptation? Studies of AN fiber responses to probe tones in static backgrounds (Costalupes et al., 1984; Gibson et al., 1985) shed light on this question. These studies observed both decreases in firing rate and small dynamic range shifts with “inversely gated” tone or noise adapting stimuli presented continuously except during the probe tones. Inversely gated backgrounds also produce small dynamic range shifts in the cochlear nucleus (Gibson et al., 1985) and IC (Rees and Palmer, 1988). At all three sites, the dynamic range shifts obtained with inversely gated backgrounds are much smaller than those produced by continuous backgrounds, an effect that was attributed to the additional contribution of two-tone suppression with continuous backgrounds (Costalupes et al., 1984). Gibson et al. (1985) reported mean shift rates of 0.24 dB/dB in the AN for inversely gated tone backgrounds, which is within the range of shift rates we observed with dynamic HPR tones (Fig. 5A) but lower than the 0.34 dB/dB mean shift rate for HPR tones. Thus, nearly continuous stimulation with static sounds seems to produce some dynamic range adaptation in the AN, but rapid modulation of the stimulus amplitude may also contribute additional adaptation. A direct comparison of the dynamic range shifts produced by dynamic and static stimuli is needed to resolve the issue.
Mechanisms underlying dynamic range adaptation in AN
The mechanisms underlying dynamic range adaptation in the AN could potentially arise anywhere from cochlear mechanics to transmission at the inner hair cell synapse. Although single-unit recordings from AN fibers do not, by themselves, allow an identification of these mechanisms, some clues can be gleaned from the characteristics of the adaptation.
The shifts in sensitivity that characterize dynamic range adaptation resemble those produced by two-tone suppression (Javel, 1981; Delgutte, 1990) and stimulation of the medial olivocochlear (MOC) efferents (Guinan, 2006), both of which are primarily caused by changes in the gain of the cochlear amplifier (Cai and Geisler, 1996; Guinan, 2006). Despite this similarity, several observations argue against a mechanical origin for dynamic range adaptation. Rates of dynamic range shift with HPR mean level show no obvious dependence on CF (Fig. 5A), whereas the cochlear amplifier is thought to have a higher gain in the cochlear base than in the apex (Cooper and Yates, 1994; Cooper, 2004). Effects of MOC efferents on AN fiber responses are maximal in the 5–10 kHz CF region in cats (Guinan and Gifford, 1988), whereas this region does not stand out in our data. A priori, the MOC system is unlikely to play a major role in our Dial-anesthetized animals because the sound-evoked MOC reflex is generally very weak with barbiturate anesthesia (Boyev et al., 2002). Furthermore, observation of dynamic range adaptation in other sensory modalities such as in the retina (Sakmann and Creutzfeldt, 1969; Shapley and Enroth-Cugell, 1984) and the somatosensory cortex (Garcia-Lazaro et al., 2007) suggests that its underlying mechanism is not solely dependent on cochlear mechanics.
The inner hair cell ribbon synapse is a likely site for various forms of adaptation to occur. Short-term firing rate adaptation in the AN appears to be primarily caused by the depletion of presynaptic neurotransmitter stores at the inner hair cell synapse (Furukawa et al., 1978; Moser and Beutner, 2000; Spassova et al., 2004; Goutman and Glowatzki, 2007). Whereas the fast kinetic component of exocytosis of vesicles at the synapse occurs on a time scale similar to AN short-term adaptation (Moser and Beutner, 2000; Spassova et al., 2004), exocytosis also has slower components (Nouvian et al., 2006) that might contribute to dynamic range adaptation. Postsynaptic mechanisms such as receptor desensitization also contribute to firing rate adaptation (Goutman and Glowatzki, 2007) and may play a role in dynamic range adaptation as well. Lateral olivocochlear neurons directly innervate inner hair cells and their afferent synapses and are thus well positioned to produce dynamic range adaptation by modulating synaptic transmission, although their effects may be too slow.
In summary, although we cannot rule out a cochlear mechanical origin for dynamic range adaptation (through either direct changes in mechanical properties or modulation by olivocochlear efferents), the inner hair cell synapse appears to be a more likely site for the adaptation mechanism given the wide diversity of neuromodulatory mechanisms on both presynaptic and postsynaptic sides.
Functional significance of dynamic range adaptation
The stimuli used in the HPR paradigm resemble human speech and other natural sounds in that the sound level varies dynamically and the level distribution is concentrated over a fairly narrow range but also has wide tails (Fig. 1C). Thus, dynamic range adaptation has implications for the neural processing of natural acoustic stimuli. Dynamic range adaptation moves the locus of maximal neural sensitivity to level change toward the center of the HPR, thereby improving the precision of neural coding for the prevailing stimulus levels. However, for AN fibers, the rate of dynamic range shift is substantially less than the 1 dB/dB that would be required for the region of maximum neural sensitivity to match the HPR over a wide range of levels. Moreover, the benefits of dynamic range adaptation are partly counteracted by firing rate adaptation, which reduces the overall neural sensitivity. As a result, the precision of level coding by the AN population degrades significantly above 60 dB SPL for broadband noise.
Several studies (Delgutte, 1987; Viemeister, 1988; Winslow and Sachs, 1988; Heinz et al., 2001; Colburn et al., 2003) have used signal detection theory for predicting performance limits in level discrimination based on the AN fibers' rate responses. The predicted performance for an ideal observer always exceeds psychophysical performance in quiet. However, the predicted performance degrades severely at higher sound levels when using only rate information from fibers tuned to the target frequency, whereas psychophysical performance is stable with level, even when listening is restricted to a narrow frequency band by masking noise (Viemeister, 1983). The severe degradation in rate-based performance suggests that other types of information, such as that available in the spatio-temporal patterns of discharge, may be needed to account parsimoniously for psychophysical performance over a wide intensity range (Heinz et al., 2001; Colburn et al., 2003). Our results shown in Figure 9 suggest that, if dynamic range adaptation were engaged during intensity discrimination tasks, then the performance of an ideal observer based on AN rate information would be somewhat improved at higher stimulus levels, but would still degrade above 60 dB SPL. Thus, dynamic range adaptation in the AN is too weak to fundamentally alter the conclusions of previous studies regarding the adequacy of rate information for intensity discrimination.
We have shown that the dynamic range of AN fiber rate responses adapts to the sound level distribution of continuous, dynamic stimuli by shifting toward the most probable levels in the stimulus, thereby improving the precision of level coding. While the adaptive dynamic range shifts are too small to account for psychophysical performance over a wide level range, they are likely to contribute significantly to the stronger dynamic range adaptation observed in the IC (Dean et al., 2005). The AN may be the first stage of adaptive processing in the auditory pathway that facilitates the processing of natural sounds by adjusting the neural dynamic range to maximize the information about the prevailing sound levels.
Footnotes
- This work was supported by National Institutes of Health Grants RO1 DC002258 and P30 DC005209 (B.D.) and a Royal Society Dorothy Hodgkin Fellowship (I.D.). We thank K. Hancock for developing the experimental software, C. Miller for expert surgical assistance, and N. Harper for providing the code for computing the Fisher information. J. Guinan and two anonymous reviewers made valuable comments on this manuscript.
- Correspondence should be addressed to Bo Wen, Eaton-Peabody Laboratory, Massachusetts Eye and Ear Infirmary, 243 Charles Street, Boston, MA 02114. bwen{at}mit.edu