INTRODUCTION

Many aspects of hearing may be described in terms of logarithmic scales. The logarithm of loudness, for example, is, over a wide range, an approximately linear function of the sound level in decibels (dB), which is a logarithmic measure. Corresponding relationships were found in physiological studies. The logarithm of the auditory-nerve spike count, for instance, seems to increase almost linearly with sound level, over a range of roughly 60 dB (Relkin and Doucet 1997). Such laws cannot be extrapolated to low intensities, though: the loudness at threshold would be grossly overestimated this way (Buus et al. 1998). As far as auditory-evoked responses near threshold are concerned, several studies suggest that it is the response amplitude itself, not its logarithm, which increases roughly linearly with sound level. In the case of the compound action potential (CAP) of the auditory nerve, such an increase can be found between threshold and 40–60 dB above threshold (see e.g., Fig. 2a of Eggermont and Odenthal 1974; Fig. 9 of Versnel et al. 1992). The range appears to shrink when moving up the auditory pathway; the upper limits observed in brainstem auditory-evoked potentials (Elberling and Don 1987) and auditory-evoked fields of cortical origin (Lütkenhöner and Klein 2007) were 30–40 and 15 dB, respectively.

A linear relationship between response amplitude and sound level in dB may appear natural at threshold, but in fact it is not. The point is the logarithmic nature of the dB scale. A linear increase with sound level implies a logarithmic increase with sound pressure, and such a nonlinear behavior does not appear plausible at threshold. Instead, we would expect that the response amplitude is a linear function of an elementary physical quantity such as sound pressure or intensity. Such a relationship was indeed found in auditory-nerve fibers. Yates et al. (1990) showed, in guinea pig, that the stimulus-driven component of the firing rate is, at low levels, proportional to intensity. Earlier data by Geisler et al. (1985) support that view, as demonstrated by Yates (1990). Consistent observations were made by Eatock et al. (1991) in a nonmammalian cochlea.

The apparent discrepancy between the auditory nerve as a whole and its individual fibers is astonishing. This article clears up the seeming contradiction. The model that will be presented is, in the low-intensity limit, linear with respect to intensity. As intensity increases, it becomes linear with respect to sound level. Although the model was developed for the gross firing rate of the auditory nerve, it is also applicable to the CAP, which may be modeled as a convolution of the gross firing rate with a unit response (Goldstein and Kiang 1958; Kiang et al. 1976; Antoli-Candela and Kiang 1978). For the sake of simplicity, we will not always distinguish between gross firing rate and CAP, but use the unspecific term “response” for a stimulus-driven effect. Some basic conclusions are presumably valid for higher levels of the auditory system, too. The results might also help to improve models for loudness at threshold.

THE MODEL

Rate-intensity function of auditory-nerve fiber

The discharge rate of an auditory-nerve fiber may be described as the sum of a spontaneous rate \( r_{0} \) and a stimulus-driven rate \( r_{\text{D}} \) (Sachs 1969). Only the latter is considered in this study because spontaneous background activity does not contribute to an auditory-evoked response. In the model proposed by Sachs and Abbas (1974), the driven rate \( r_{\text{D}} \) is related to the amplitude of the basilar membrane displacement, d, by an equation of the form \( r_{\text{D}} = r_{\text{m}} d^{\kappa} / \left( \theta + d^{\kappa} \right) \), where \( r_{\text{m}} \) is the maximum driven rate (difference between saturation rate and spontaneous rate). To understand the meaning of the parameter θ, it is useful to substitute \( \theta = \vartheta^{\kappa} \) and to rewrite the equation as \( r_{\text{D}} = r_{\text{m}} / \left( 1 + \left( \vartheta / d \right)^{\kappa} \right) \). The new parameter ϑ evidently represents the basilar membrane displacement that results in the stimulus-driven discharge rate \( r_{\text{m}} / 2 \). At sufficiently low intensities, d may be assumed to be proportional to sound pressure P. Thus, an appropriate normalization of sound pressure finally yields

$$ r_{\text{D}} = \frac{r_{\text{m}}}{1 + P^{-\kappa}} $$
(1)

For the parameter κ, Sachs and Abbas (1974) derived the value 1.77.

The model was largely confirmed by Yates et al. (1990): Their data were in excellent agreement with the assumption κ = 2, but inconsistent with the assumption κ = 1. Other values of κ were not considered in that study. Thus, it cannot be excluded that an algorithmic parameter optimization would have resulted in a slight deviation from the integer 2, as proposed by Sachs and Abbas (1974). Nevertheless, there is a strong theoretical argument against the view that κ is a parameter that may be arbitrarily optimized by a curve-fitting algorithm. It appears plausible to assume that the low-intensity approximation of Eq. 1, \( r_{\text{D}} \approx r_{\text{m}} P^{\kappa} \), represents the leading term of a series expansion of \( r_{\text{D}} \) with respect to sound pressure P, and this expectation requires κ to be an integer. Direct evidence was obtained by Eatock et al. (1991), although for nonmammalian auditory neurons (alligator lizard). In contrast to Yates et al. (1990), they determined the parameter κ without limiting their consideration to integer values. The distribution of exponents showed a distinct maximum around κ = 2, and most of the variability in the estimated exponents could be attributed to inherent noise in the data. The assumption κ = 2 represents one of the basic postulates of the present modeling study. For the sake of completeness, it shall be added that, for a specific subset of fibers (tectorial fibers), Eatock et al. (1991) obtained a distribution with a pronounced maximum around κ = 3.

Sachs and Abbas (1974) defined a normalized rate as \( r_{\text{D}} / r_{\text{m}} \). For the model of Eq. 1, with κ = 2, the normalized rate may be expressed as a function of the normalized stimulus intensity, \( I = P^{2} \), as

$$ f_{0}(I) = \frac{1}{1 + 1/I}. $$
(2)

Figure 1 shows two alternative representations of this function (solid curves): on the right, the intensity scale is linear, whereas on the left it is logarithmic. More precisely, the left panel of Figure 1 shows \( f_{0} \) as a function of sound level in dB, defined as \( L = 10 \log_{10}(I) \). Corresponding points in the two panels are connected by dotted lines. The intensity scale is normalized such that I = 1 (0 dB) corresponds to \( f_{0} = 1/2 \). The curve on the left has its steepest slope at that point, and it is antisymmetric with respect to that point: \( f_{0} \) goes to zero with decreasing level in the same way as it goes to one with increasing level. The antisymmetry is a consequence of the relationship

$$ f_{0}(I) + f_{0}(1/I) = 1. $$
(3)
FIG. 1

Normalized firing rate of an auditory-nerve fiber (according to Yates et al. 1990). The intensity scale is logarithmic on the left (level in dB) and linear on the right. The firing rate is half the maximum rate at a normalized intensity of one (0 dB). The solid curve has a linear low-intensity approximation (dashed curve).

With \( \tilde{f}_{0}(L) = f_{0}\left( 10^{L/10} \right) - 1/2 \), this relationship may be rewritten as \( \tilde{f}_{0}(L) = -\tilde{f}_{0}(-L) \). The singular status of the 0-dB point is not so obvious in the plot on the right. The solid curve in that plot emphasizes another feature instead: at very low intensities, the normalized firing rate is basically identical with the normalized intensity, i.e., \( f_{0}(I) \approx I \). This linear approximation of the function \( f_{0} \) is shown as a dashed curve, in both panels of Figure 1.
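The properties of Eq. 2 discussed above can be checked numerically. The following Python sketch is only an illustration; the function names `f0` and `f0_level` are hypothetical and do not appear in the original study.

```python
import math

def f0(intensity):
    """Normalized driven rate of a single fiber (Eq. 2, kappa = 2, I = P**2)."""
    return 1.0 / (1.0 + 1.0 / intensity)

def f0_level(level_db):
    """Eq. 2 as a function of level L = 10*log10(I), shifted down by 1/2."""
    return f0(10.0 ** (level_db / 10.0)) - 0.5

for level in (-20.0, -5.0, 0.0, 5.0, 20.0):
    i = 10.0 ** (level / 10.0)
    # Column 3 illustrates Eq. 3; column 4 the antisymmetry around 0 dB.
    print(level, f0(i), f0(i) + f0(1.0 / i), f0_level(level) + f0_level(-level))

# Low-intensity limit: the normalized rate approaches the normalized intensity.
print(f0(1e-4), 1e-4)
```

The third printed column should stay at one and the fourth at zero (up to rounding), which is the antisymmetry stated in the text.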

Gross response of the population of auditory-nerve fibers: integration over cochlear location

The magnitude of the basilar membrane (BM) displacement caused by a pure tone has a distinct maximum at a particular location, called the characteristic place, and sharply decreases with increasing distance from that location. Figure 2A shows data from guinea pig (read from Fig. 1D of Russell and Nilsen 1997). The curves illustrate for four sound levels how BM displacement varies with distance from apex. In Figure 2B, the BM displacement at the characteristic place is normalized to one, but otherwise the curves are the same. As a first approximation, the logarithm of BM displacement decreases as a linear function of distance from characteristic place (dashed lines). This view is in good qualitative agreement with excitation patterns derived from other experimental data (e.g., Allen and Fahey 1993; de Boer and Nuttall 2000). For the moment, we disregard the fact that the decrease seems to be intensity dependent, to some extent. Moreover, we also ignore that, with increasing sound level, the maximum BM displacement becomes more and more affected by a compressive nonlinearity (Fig. 2C). Later in this article, we will see how these shortcomings of the model can be overcome or at least alleviated. In the low-intensity limit, they may be expected to be negligible anyway.

FIG. 2

A BM displacement as a function of distance from apex (guinea pig data, read from Fig. 1d of Russell and Nilsen 1997). B Same curves as in A, but normalized. The logarithm of BM displacement decreases approximately linearly to both sides of the maximum (dashed lines). C Level dependence of maximum BM displacement. The solid curve represents an empirical fit to the data. BM displacement is assumed to be proportional to sound pressure at low levels (dotted line).

In the simple model of Figure 3, the vertical scale in the upper left represents the cochlear location, x. The origin of the coordinate system corresponds to the characteristic place; the units are arbitrary. The auditory neurons are assumed to have a normalized firing rate according to Eq. 2, as in the previous subsection. However, the normalized intensity now depends on the cochlear location, according to the equation

$$I_{x} {\left( x \right)} = I \cdot 10^{{ - {\left| x \right|}}} ,$$
(4)

where I is the normalized intensity at the characteristic place. The corresponding level (bottom scale) is \( L_{x} {\left( x \right)} = L - 10 \cdot {\left| x \right|} \), where L is the level at the characteristic place. Here we assume, for the sake of simplicity, that the intensity (level) decreases symmetrically with increasing distance from the characteristic place, but this special assumption is irrelevant for the results (proven in Appendix A). The upper panels in Figure 3 show three different excitation patterns (the dots represent 21 locations, equidistantly distributed over the cochlea). The levels at x = 0 are −20 dB (left), 0 dB (middle), and 20 dB (right). The bottom panels show the corresponding normalized firing rates; the curve shows the function f 0. Neurons near the characteristic place (x = 0) show marginal excitation in the example on the left, whereas they are almost fully excited in the example on the right. In the middle panel, the normalized firing rate at the characteristic place corresponds to half the maximal rate, whereas neurons near the ends of the cochlea model (x = ±4) are basically unexcited.

FIG. 3

From cochlear excitation to firing rate. The upper panels show three cochlear excitation patterns (cochlear location, in arbitrary units, indicated by the vertical scale on the left). The normalized intensity (corresponding level represented by scale at the bottom) has a maximum at location zero, corresponding to −20 dB (left example), 0 dB (middle), and 20 dB (right example). The associated normalized firing rates can be looked up using the normalized rate-intensity function of Figure 1 (bottom panels).

In the example on the right of Figure 3, the normalized rates exhibit a kind of symmetry, which means that neurons may be paired such that the sum of their normalized rates is 1. Thus, the mean normalized rate of all neurons is around 0.5 in this example. The apparent symmetry of the normalized firing rates is a consequence of Eq. 3. The symmetry is not perfect in the present example because the neurons at x = ±4 are associated with a single neuron at x = 0. Thus, the mean normalized firing rate (0.477 for N = 21 neurons) slightly deviates from the analytical solution calculated using Eq. 20 of the Appendix, which is log10(101 / 1.01)/4 = 0.5. However, this imperfection of the numerical model becomes negligible as the number of neurons increases.
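The numbers quoted for the 21-neuron example can be reproduced with a few lines of Python. This is a minimal sketch, not the original implementation: the function names are hypothetical, and the closed form in `mean_rate_finite` is an assumption inferred from the expression log10(101 / 1.01)/4 quoted above (Eq. 20 itself is given in the Appendix and is not reproduced here).

```python
import math

def f0(intensity):
    """Normalized single-fiber rate (Eq. 2)."""
    return intensity / (1.0 + intensity)

def mean_rate_discrete(level_db, n_neurons=21, nu=4.0):
    """Mean normalized rate of n_neurons spaced evenly over x in [-nu, nu]."""
    intensity = 10.0 ** (level_db / 10.0)
    step = 2.0 * nu / (n_neurons - 1)
    total = 0.0
    for k in range(n_neurons):
        x = -nu + k * step
        total += f0(intensity * 10.0 ** (-abs(x)))  # local intensity, Eq. 4
    return total / n_neurons

def mean_rate_finite(level_db, nu=4.0):
    """Assumed closed form for a finite cochlea (inferred, see lead-in)."""
    intensity = 10.0 ** (level_db / 10.0)
    i_end = intensity * 10.0 ** (-nu)  # intensity at the ends, x = +/- nu
    return math.log10((1.0 + intensity) / (1.0 + i_end)) / nu

print(mean_rate_discrete(20.0, 21))    # coarse model of Figure 3 (right)
print(mean_rate_discrete(20.0, 1000))  # refined numerical model
print(mean_rate_finite(20.0))          # analytical value
```

With 21 neurons the discrete mean falls slightly below the analytical value because the end points carry full weight; the gap shrinks as the number of neurons grows.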

Mean normalized rates were also calculated for the other two examples in the figure. The results are displayed as filled gray circles in Figure 4, which shows the mean normalized rate as a function of the level (on the left) and the normalized intensity (on the right) at x = 0 (inset with magnified scales). More numerical results, based on 1,000 neurons rather than the 21 neurons considered in Figure 3, are represented by black dots. The solid curve shows an approximation of the analytical solution derived in Appendix A (Eq. 20):

$$ f_{1}(I) \approx \frac{\log_{10}(1 + I)}{\nu} $$
(5)
FIG. 4

Mean normalized firing rate of all fibers in the auditory nerve. The intensity scale is logarithmic on the left (level in dB) and linear on the right; the inset shows a magnified view of low intensities. The analytical solution given in Eq. 5 is represented by a solid curve; approximations for low and high intensities are shown as a dashed and a dotted curve, respectively. The mean rates for the three coarse examples of Figure 3 (with only 21 neurons) are shown as filled gray circles; the black dots were obtained for a refined numerical model with 1,000 neurons.

The parameter ν in the denominator, having the value 4 in the present example, corresponds to \( \log_{10}(I / I^{*}) \), where \( I^{*} \) is the intensity at the ends of the cochlea model (x = ±4 in the example of Fig. 3). The factor 1 / ν in Eq. 5 accounts for the fact that the percentage of basically unexcited neurons increases with decreasing \( I^{*} \). At higher intensities, \( f_{1} \) is roughly proportional to sound level: the dotted line in the left panel of Figure 4 shows the approximation \( f_{1}(I) = \log_{10}(I) / \nu \). At low intensities, \( f_{1} \) is proportional to I: the dashed line in the right panel of Figure 4 shows the linear approximation \( f_{1}(I) = I / (\nu \ln 10) \). Both approximations are displayed also in the respective other panel, as a dashed curve (low-intensity approximation) or a dotted curve (high-intensity approximation).
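To see where the two approximations of Eq. 5 hold, the three expressions can be tabulated side by side. The sketch below assumes ν = 4 as in the example; the function names are illustrative only.

```python
import math

NU = 4.0  # value used in the example of Figure 3

def f1(intensity, nu=NU):
    """Eq. 5: mean normalized rate, infinite-cochlea approximation."""
    return math.log10(1.0 + intensity) / nu

def f1_low(intensity, nu=NU):
    """Low-intensity approximation, linear in intensity."""
    return intensity / (nu * math.log(10.0))

def f1_high(intensity, nu=NU):
    """High-intensity approximation, linear in level."""
    return math.log10(intensity) / nu

for level in range(-30, 31, 10):
    i = 10.0 ** (level / 10.0)
    print(level, f1(i), f1_low(i), f1_high(i))
```

Near 0 dB neither approximation is adequate: the high-intensity expression crosses zero there, while the low-intensity expression already overshoots Eq. 5.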

Accounting for variations in the threshold of single neurons

So far we assumed that all neurons have the same sensitivity. However, there is ample evidence that auditory neurons of the same characteristic frequency exhibit significant differences. This feature will now be accounted for. A consideration of near-threshold intensities can be confined to neurons with high spontaneous rates, which exhibit the highest sensitivity (Liberman 1978; Ohlemiller et al. 1991). The thresholds of these neurons vary within a range of roughly 20 dB (Liberman 1978; Geisler et al. 1985; Shofner and Sachs 1986; Relkin and Pelli 1987; Schmiedt 1989; Winter et al. 1990; Ohlemiller et al. 1991; Robertson and Wilson 1991; Jackson and Relkin 1998; Heinz et al. 2005).

The intensity scale in the previous subsection was defined in such a way that, at 0 dB (I = 1), a neuron at the characteristic place fired at half its maximum driven rate (\( r_{\text{m}} / 2 \)). In principle, this reference intensity could be measured for a given neuron. The result of such a hypothetical experiment, in which the intensity would be determined on a scale of the external world rather than the neuron, shall be denoted as \( I_{\text{ref}} \). A modified version of Eq. 5, now defined in terms of this new intensity scale, can be obtained by substituting \( I / I_{\text{ref}} \) for I, and neural populations with different sensitivities may be simulated by varying \( I_{\text{ref}} \). For the following considerations it is convenient to normalize the intensity scale in such a way that the population of the most sensitive neurons is characterized by \( I_{\text{ref}} = 1 \). Thus, I = 1 (0 dB) henceforth corresponds to the intensity at which the most sensitive neuron at the characteristic place fires at half its maximum driven rate.

The above ideas are illustrated in Figure 5. Let us first consider the panel on the left, where the abscissa represents the sound level. The leftmost curve, corresponding to \( I_{\text{ref}} = 1 \), is identical with the solid curve in the left panel of Figure 4; the associated high-intensity approximation is again shown as a dotted line. Increasing \( I_{\text{ref}} \), thus decreasing the sensitivity of the neuron, shifts the curve to the right. In the present example, \( \log_{10}(I_{\text{ref}}) \) is assumed to be uniformly distributed between 0 and μ = 2, which corresponds to the 20-dB range of threshold variation observed in neurophysiological experiments. Eleven values of \( I_{\text{ref}} \) are considered in Figure 5, each represented by a solid curve. Averaging these curves, and many intermediate curves that are not displayed in the figure (for details see caption), results in the fat dashed curve; the basically congruent solid curve represents the exact analytic solution derived in Appendix A (Eq. 28). For higher intensities (roughly above 20 dB), this exact solution may be approximated as

$$ f_{2} {\left( I \right)} \approx \frac{1} {\nu }{\left( {\log _{{10}} {\left( I \right)} - \frac{\mu } {2}} \right)} $$
(6)

(dotted line). Thus, the high-intensity approximation is the same as in the previous subsection (where all neurons had the same sensitivity), except for a horizontal shift by 10 dB (corresponding to log10(I) = μ / 2).

FIG. 5

Averaging the mean normalized firing rates over nerve-fiber populations with different sensitivities. The intensity scale is logarithmic on the left (level in dB) and linear on the right; the inset shows a magnified view of low intensities. The leftmost curve in the left panel and the upper curve in the right panel are identical with the respective curves in Figure 4. Now, these curves are assumed to represent the most sensitive population of nerve fibers. Curves for less sensitive populations may be obtained by shifting, in the left panel, the curve for the most sensitive population to the right. In the present example, shifts between 2 and 20 dB were performed in steps of 2 dB. The fat dashed curve shows the mean rate of all populations (shifts in steps of 0.05 dB in this numerical calculation); the basically congruent solid curve represents the exact analytical solution according to Eq. 28. The thin dashed curve represents an approximation for intensities around 0 dB (Eq. 43). For other intensities, this approximation makes unrealistic predictions.

In the right panel of Figure 5, the same functions are displayed with a linear intensity scale. The situation at very low intensities is best understood by considering the magnified detail shown in the inset. Here, all curves are basically straight lines through the origin, and the exact analytical solution (solid curve congruent with the fat dashed curve) may be linearly approximated as

$$ f_{2} {\left( I \right)} \approx \frac{I} {{\mu \nu \cdot {\left( {\ln {\text{ }}10} \right)}^{2} }}. $$
(7)

Thus, apart from scaling issues, accounting for the threshold variability of the auditory neurons has no effect on the low-intensity approximation of the mean firing rate.

In conclusion, the mean rate \( f_{2} \) is roughly a linear function of I for I << 1 and a linear function of \( \log_{10}(I) \), i.e., a linear function of level, for I >> 1. In between, there is a transition range centered at I = 1 (0 dB). An approximation derived in Appendix B (Eq. 43) shows that \( f_{2} \) is roughly a parabolic function of \( \log_{10}(I) \) in that range. This approximation is represented by the thin dashed curve in Figure 5. Except for the vicinity of the 0-dB point, this approximation severely fails, especially towards very low intensities.
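The two limiting cases can be compared with a direct numerical average of Eq. 5 over the assumed sensitivity distribution. The following sketch uses ν = 4 and μ = 2 as in Figure 5 and a simple midpoint rule; it does not use the exact solution of Eq. 28, and all function names are hypothetical.

```python
import math

NU, MU = 4.0, 2.0  # values used in the example of Figure 5

def f1(intensity, nu=NU):
    """Eq. 5 for a single population of equally sensitive fibers."""
    return math.log10(1.0 + intensity) / nu

def f2_numeric(intensity, nu=NU, mu=MU, n_steps=2000):
    """Average Eq. 5 over log10(I_ref) uniform on [0, mu] (midpoint rule)."""
    h = mu / n_steps
    total = 0.0
    for k in range(n_steps):
        u = (k + 0.5) * h  # u = log10(I_ref)
        total += f1(intensity * 10.0 ** (-u), nu)
    return total / n_steps

def f2_high(intensity, nu=NU, mu=MU):
    """Eq. 6: high-intensity approximation."""
    return (math.log10(intensity) - mu / 2.0) / nu

def f2_low(intensity, nu=NU, mu=MU):
    """Eq. 7: low-intensity approximation."""
    return intensity / (mu * nu * math.log(10.0) ** 2)

for level in (-30, -20, -10, 0, 10, 20, 30, 40):
    i = 10.0 ** (level / 10.0)
    print(level, f2_numeric(i), f2_low(i), f2_high(i))
```

Under these assumptions, the numerical slope at very low intensities comes out about 1% below Eq. 7. This is consistent with a factor \( 1 - 10^{-\mu} \) that arises when the average is carried out over a finite range and that is negligible for μ = 2.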

A glimpse at higher intensities

Although the emphasis of this article is on low intensities, it appears appropriate to briefly consider higher intensities as well. First, it is worthwhile to get to know the model predictions for higher intensities, at least qualitatively. Second, and more important, it is essential to understand the limitations of the formulas presented above; disregarding these limitations could result in serious misunderstandings.

Figure 6A corresponds to the left-hand side of Figure 5, but the range of levels is extended, and the curves are not clipped anymore at a mean normalized firing rate of 0.5. Another, less obvious difference is that the numerical calculations are based on Eq. 20 rather than on the approximation given in Eq. 5. This means that a finite cochlea was considered rather than an infinite one. The analytical solution for an infinite cochlea (Eq. 28) is represented by a thick solid curve. Below 30 dB (vertical gray line), this curve is basically congruent with the numerical solution for a finite cochlea (fat dashed curve). This means that the infinite-cochlea approximation is well justified at low intensities. Between 30 and 40 dB, however, the curves begin to diverge. While the analytical solution for an infinite cochlea continues to grow, approximately as a linear function of level, the numerical solution for a finite cochlea saturates. In the high-intensity limit, all neurons of the finite cochlea are fully excited so that the mean normalized firing rate is one.

FIG. 6

A As left panel of Figure 5, but for a finite rather than an infinite cochlea. The fat dashed curve shows again the numerically calculated mean normalized rate of all neural populations. At lower sound levels, this curve basically coincides with the analytical solution for an infinite cochlea (thick solid curve). The latter does not saturate at high levels, though. B By choosing different parameters, the mean normalized rate can be made to increase more slowly, resulting in a wider dynamic range.

Figure 6B is analogous to Figure 6A, but the parameter ν was increased from 4 to 6. This means that the difference in sensitivity between the characteristic location and the ends of the cochlea model amounts to 60 dB rather than 40 dB. The mean firing rate increases more slowly now so that the dynamic range is wider.
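The difference between the finite and the infinite cochlea, and the effect of increasing ν, can be illustrated numerically. In the sketch below, `f1_finite` is an assumed closed form in which the excitation pattern ends 10·ν dB below its peak; it is inferred from the worked example accompanying Figure 3 and is not a transcription of Eq. 20. The sensitivity range μ = 2 follows Figure 5.

```python
import math

def f1_infinite(intensity, nu):
    """Eq. 5: infinite-cochlea approximation for one fiber population."""
    return math.log10(1.0 + intensity) / nu

def f1_finite(intensity, nu):
    """Assumed finite-cochlea form (see lead-in); saturates at one."""
    i_end = intensity * 10.0 ** (-nu)
    return math.log10((1.0 + intensity) / (1.0 + i_end)) / nu

def population_mean(rate_fn, level_db, nu, mu=2.0, n_steps=400):
    """Average rate_fn over log10(I_ref) uniform on [0, mu]."""
    intensity = 10.0 ** (level_db / 10.0)
    h = mu / n_steps
    total = 0.0
    for k in range(n_steps):
        i_ref = 10.0 ** ((k + 0.5) * h)
        total += rate_fn(intensity / i_ref, nu)
    return total / n_steps

for level in (0, 20, 30, 40, 60, 100):
    print(level,
          round(population_mean(f1_finite, level, 4.0), 3),    # finite, nu = 4
          round(population_mean(f1_infinite, level, 4.0), 3),  # infinite, nu = 4
          round(population_mean(f1_finite, level, 6.0), 3))    # finite, nu = 6
```

The first two columns should agree closely at low levels and separate as the finite model approaches one; the third column grows more slowly, corresponding to the wider dynamic range described for Figure 6B.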

ALTERNATIVE VIEWS OF THE MODEL

Up to this point, all considerations were based on specific assumptions about the model. Given the assumptions, the formulas derived allow quantitative predictions about hypothetical experiments. However, the model was intended to be used the other way around: to qualitatively explain available observations. For that purpose, it is useful to reconsider various aspects of the model in somewhat different ways.

Nonlinearities compared

The mean firing rate near threshold is a linear function of stimulus intensity, in all stages of the model development. With increasing intensity, however, the various versions of rate-intensity functions show fundamental differences. To illustrate these differences, the functions were normalized in such a way that their low-intensity approximations became identical (normalized rate equal to normalized intensity). Figure 7 shows the result. The vertical axis is labeled “normalized response amplitude” now, to make clear that the normalization in this figure is generally not consistent with the normalization in the previous figures. As in Figures 4 and 5, the intensity scale is logarithmic in the left panel (level in dB) and linear in the right panel, where the inset again provides a magnified view of the lowest intensities. The rate-intensity function of a single nerve fiber (Eq. 2) is shown as a thin solid curve. A renormalization was not required in this case so that the curve is exactly the same as in Figure 1. A linear approximation (dashed curve) is appropriate only at the lowest intensities (see inset). At a normalized level of 10 dB, the response is almost fully saturated. Quite different is the situation for the auditory nerve as a whole. The gray curve (derived from Eq. 5) corresponds to the assumption that all fibers have the same sensitivity (homogeneous auditory-nerve model), whereas the thick black curve (derived from Eq. 28) corresponds to the assumption that the sensitivities of the fibers are homogeneously distributed over a 20-dB range (inhomogeneous auditory-nerve model); the dash-dotted curve represents the low-intensity approximation (Eq. 29). In either model, the high-intensity approximation (dotted line) increases linearly with sound level. Compared to the rate-intensity function of a single fiber, the approximately linear range is somewhat extended (see inset), and, most important, the dynamic range appears unlimited. Remember, however, that it is in fact limited (see Fig. 6).

FIG. 7

Comparison of normalized response amplitudes. The intensity scale is logarithmic on the left (level in dB) and linear on the right; the inset shows a magnified view of low intensities. The normalized rate-intensity function of a single neuron is represented by a thin solid curve; the homogeneous auditory-nerve model (all fibers having the same sensitivity) is represented by a thick gray curve, the inhomogeneous auditory-nerve model (sensitivity distributed over a 20-dB range) is represented by a thick black curve. All these curves were normalized so that, in the low-intensity limit, response amplitude and normalized intensity are identical (dashed curve). The left panel also shows the high-intensity approximations for the two auditory-nerve models (dotted lines). In addition, the approximation given in Eq. 29 is shown as a dash-dotted curve.

An experimenter’s perspective

In the “INTRODUCTION”, studies were mentioned where the amplitude of auditory-evoked responses at low intensities linearly increased with sound level. This is exactly the kind of amplitude increase predicted by our auditory-nerve models for higher intensities (dotted lines in the left panel of Fig. 7). Thus, intensities that may appear low for an experimenter are not necessarily low in the context of our models. An experimenter who wishes to interpret their own data in terms of these models has to match the intensity scales. A pragmatic solution would be to extrapolate the linear level dependences towards the threshold (here corresponding to zero amplitude) and to define this level as 0 dB. For the homogeneous auditory-nerve model (gray curve in Fig. 7), a redefinition of the intensity scale is not required, because the threshold extrapolated from the high-intensity approximation is exactly 0 dB. For the inhomogeneous model (thick black curve in Fig. 7), the extrapolated threshold is 10 dB, and the level scale has to be shifted by this amount. A comparison between data and model also requires a common amplitude scale. This problem may be solved by an appropriate amplitude normalization; for example, the amplitude found 10 dB above the extrapolated threshold may be normalized to one. The gray and the thick black curve in Figure 8 were obtained by applying such transformations to the respective curves in Figure 7. The curves share the same high-intensity approximation now (dotted line); the 10-dB value of this approximation was normalized to one. The differences that can be observed at lower intensities are surprisingly small. For example, the sound levels where the amplitude is just 10% of the value at 10 dB differ by less than 3 dB (−5.9 vs −8.6 dB). The low-intensity approximations (dashed curves), being linear with respect to intensity, noticeably deviate from the exact functions even at such low levels.
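The two levels quoted for the 10% amplitude can be approximated with a short numerical sketch. It assumes ν-independent normalized amplitudes: Eq. 5 multiplied by ν for the homogeneous model, and a midpoint-rule average over log10 of the reference intensity (μ = 2, level scale shifted by 10 dB) for the inhomogeneous model. The exact solution of Eq. 28 is not used, and the function names are hypothetical.

```python
import math

MU = 2.0  # log10 range of reference intensities (20 dB), as in Figure 5

def amp_homogeneous(level_db):
    """Eq. 5 scaled so that the high-intensity approximation equals L / 10."""
    return math.log10(1.0 + 10.0 ** (level_db / 10.0))

def amp_inhomogeneous(level_db, n_steps=2000):
    """Population average with the level scale shifted by 10 dB (mu / 2)."""
    intensity = 10.0 ** ((level_db + 10.0 * MU / 2.0) / 10.0)
    h = MU / n_steps
    total = sum(math.log10(1.0 + intensity * 10.0 ** (-(k + 0.5) * h))
                for k in range(n_steps))
    return total / n_steps

def level_for_amplitude(amp_fn, target, lo=-40.0, hi=20.0):
    """Bisection; amp_fn is assumed to increase monotonically with level."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if amp_fn(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

level_hom = level_for_amplitude(amp_homogeneous, 0.1)
level_inh = level_for_amplitude(amp_inhomogeneous, 0.1)
print(round(level_hom, 1), round(level_inh, 1), round(level_hom - level_inh, 1))
```

Under these assumptions, the two levels should come out close to the values of −5.9 and −8.6 dB given in the text.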

FIG. 8

The two auditory-nerve models from an experimenter’s point of view. At higher intensities, the response amplitude of either model increases linearly with sound level. A similar increase was observed in physiological studies. Thus, a data transformation based on this shared feature may help to compare data and model. Such a transformation is illustrated here for the two auditory-nerve models. The respective curves in the left panel of Figure 7 were transformed in such a way that they got the same high-intensity approximation (dotted line), with normalized response amplitudes of zero and one at 0 and 10 dB, respectively. Thus, 0 dB may be interpreted now as the threshold extrapolated from observations at higher intensities. By applying a corresponding transformation to real data, an experimenter may get an idea of the intensity range where a roughly proportional relationship between response amplitude and intensity (dashed curves) is to be expected. Such proportionality may be expected below −10 dB for the inhomogeneous auditory nerve model (black curve) and below −5 dB for the homogeneous auditory-nerve model (gray curve).

In view of the fact that the differences between the homogeneous and the inhomogeneous model largely disappear after applying appropriate normalizations, it seems that the much simpler homogeneous model will generally be sufficient for qualitative considerations. Accordingly, the exact sensitivity distribution of the inhomogeneous model is probably of secondary importance. This view is, of course, untenable at higher intensities, where neurons with low spontaneous rates are to be accounted for.

Higher threshold, faster growth

The growth of the mean firing rate with increasing intensity depends on the sharpness of the cochlear excitation pattern. This dependence is expressed by the factor 1 / ν in Eqs. 5, 6, and 7. The factor has interesting implications, which are illustrated in Figure 9. The figure is organized in the same way as Figures 4 and 5, with a logarithmic intensity scale on the left (level in dB), a linear intensity scale on the right, and a magnified view of the low-intensity limit in the inset. The thick solid curve corresponds to the situation considered in Figure 4. Thus, it represents a cochlear excitation pattern characterized by ν = 4. The thin solid curve was derived from the thick solid curve by a 10-dB reduction in sensitivity (the curve in the left panel is shifted by this amount to the right). In terms of the reference intensity introduced in the context of Figure 5, the two curves correspond to I ref = 1 and I ref = 10, respectively. The dashed curve finally represents a modification of the latter condition, where the sharpness of the cochlear excitation pattern is slightly reduced (ν = 3). High-intensity approximations to the three curves (not accounting for the saturation that results from the assumption of a finite cochlea; cf. Fig. 6) are represented by the dotted lines in the left panel.

FIG. 9

Higher threshold associated with faster growth (example). The intensity scale is logarithmic on the left (level in dB) and linear on the right; the inset shows a magnified view of low intensities. The thick solid curve corresponds to the solid curve in Figure 4 (ν = 4). The thin solid curve is identical, except that the sensitivity is reduced by 10 dB. The dashed curve demonstrates that such a reduction in sensitivity can be compensated for, at higher intensities, by a faster growth of the mean firing rate. The faster growth was achieved by assuming ν = 3. A reduction of the parameter ν corresponds to a loss in sharpness of the cochlear excitation pattern.

Independent of the definition of threshold, the threshold difference between the conditions represented by the thick and the thin solid curve in Figure 9 can be said to be exactly 10 dB, corresponding to the horizontal shift in the left panel. The situation is less clear for the dashed curve. An experimenter who extrapolated a threshold from firing rates showing a roughly linear dependence on sound level (see previous subsection) would find the same threshold as for the thin solid curve (the corresponding dotted curves intersect at 10 dB). By contrast, if threshold were defined as a specific minimum amplitude, the dashed curve would always be associated with the lower threshold. For example, threshold could be defined as the intensity corresponding to a mean normalized firing rate of 0.01 (maximum rate considered in the inset). The threshold intensities for the thick solid, dashed, and thin solid curves would then be 0.096, 0.72, and 0.96 (indicated by the dotted vertical lines in the inset), which corresponds to −10.2, −1.5, and −0.2 dB, respectively. Thus, with that definition of threshold, there would be a threshold difference of 1.3 dB between dashed curve and thin solid curve. Despite this minor dependence on criteria, it appears reasonable to say that the dashed curve corresponds to a roughly 10-dB higher threshold than the thick solid curve.
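
The conversion between the threshold intensities and the dB values quoted above can be checked with a few lines of Python. This is only a minimal sketch: the function name `intensity_to_db` is an illustrative choice, the intensities are the rounded values given in the text, and the assignment of the three values to the three curves is our reading of the passage.

```python
import math

def intensity_to_db(normalized_intensity):
    """Convert a normalized intensity to a level in dB (10 * log10)."""
    return 10.0 * math.log10(normalized_intensity)

# Rounded threshold intensities from the text for a criterion rate of 0.01.
# The curve labels are an assumption based on the surrounding discussion.
thresholds = {"thick solid": 0.096, "dashed": 0.72, "thin solid": 0.96}
levels = {name: intensity_to_db(i) for name, i in thresholds.items()}

for name, level in levels.items():
    print(f"{name}: {level:.1f} dB")

# The two solid curves differ by the 10-dB reduction in sensitivity.
shift = levels["thin solid"] - levels["thick solid"]
```

With the rounded intensity 0.72, the level of the dashed curve comes out near −1.4 dB rather than −1.5 dB, presumably because the quoted intensity is itself rounded.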

With increasing intensity, the dashed curve approaches the thick solid curve, and both curves converge to the asymptotic value 1. This exemplifies that, regarding the mean firing rate, an elevated threshold can be fully compensated for by a reduced sharpness of cochlear excitation, since the latter causes a faster growth. In the infinite-cochlea model, the difference in threshold is even overcompensated at high intensities, as indicated by the fact that the respective high-intensity approximations (dotted lines in the left panel) intersect at 40 dB.

Pseudo-linearity with respect to sound pressure

Figure 8 suggests that it may be a challenge for an experimenter to directly observe response amplitudes that increase linearly with intensity. According to our model, this would require studying intensities far below the range where the amplitude increases approximately linearly with sound level. In a real experiment, the responses at such low intensities might be so small that they are buried under the noise floor. An interesting situation would arise if the smallest response that can be detected has the amplitude 0.1 (in the scale of Fig. 8). According to our inhomogeneous auditory-nerve model (thick black curve), an experimenter would notice that an intensity increase by approximately 20 dB is required to augment a just detectable response by a factor of 10 (a corresponding observation was reported by Eggermont and Odenthal 1974). Such a finding is consistent with the idea that the response amplitude is proportional to sound pressure. So it would be natural to study the effect more systematically by plotting all the measured response amplitudes versus sound pressure. The prediction of our model, shown as a solid curve in Figure 10, is remarkable. Over a wide range, the curve is close to the dashed line, which represents the tangent to the curve at a normalized intensity of −10 dB (see Eq. 44 in Appendix B). With noisy data, an experimenter would undoubtedly arrive at the conclusion that the response amplitude near threshold increases linearly with sound pressure, up to a level that is roughly 15 dB above the level corresponding to a normalized amplitude of 0.1 (in this consideration representing the just detectable response). For the sake of comparison, the figure also shows the high-intensity approximation of Figure 8 (dotted curve).
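
The 20-dB argument above can be spelled out as a short worked step. Assume, for illustration only, that the response amplitude A is proportional to the normalized sound pressure p, and let ΔL denote a level difference in dB (the symbols A, p, and ΔL are introduced here and are not part of the model equations):

$$ \Delta L = 20 \log_{10} \frac{p_{2}}{p_{1}} \quad \Rightarrow \quad \frac{A_{2}}{A_{1}} = \frac{p_{2}}{p_{1}} = 10^{\Delta L / 20} = 10 \quad \text{for } \Delta L = 20\ \text{dB}. $$

If the amplitude were instead proportional to intensity, which scales with the square of sound pressure, the same 20-dB step would give a ratio of 10^{ΔL/10} = 100. A tenfold amplitude increase over 20 dB is therefore what strict proportionality to sound pressure would produce.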

FIG. 10

Pseudo-linearity with respect to sound pressure. In contrast to Figure 8, the normalized response amplitude of the inhomogeneous auditory-nerve model (solid curve; high-intensity approximation represented by dotted curve) is shown as a function of normalized sound pressure. The corresponding dB value may be looked up in the upper scale. Up to a normalized sound pressure of about 2.5 (8 dB), the curve is close to the dashed line, which represents the tangent at a normalized level of −10 dB. Thus, noisy data may easily be misinterpreted as showing a linear dependence of the response amplitude on sound pressure. Extrapolation of this linear law would yield a threshold of about −18 dB (see magnified view in the inset).

A careful experimenter would now perform more sensitive experiments to obtain data at lower intensities. The inset of Figure 10 shows the results that are to be expected. Down to a normalized sound pressure of 0.2 (−14 dB), yielding a response with a normalized amplitude of 0.034, the small deviation from the conjectured linear relationship would probably go unnoticed. Only a minor detail might arouse suspicion: a linear extrapolation of high-quality data would predict the disappearance of the response at a normalized sound pressure corresponding to about −18 dB (see context of Eq. 44 in Appendix B; but note that the intensity scale in Figure 10 is shifted by 10 dB). This would mean that there is a sensory threshold. However, such an inference would be inconsistent with signal detection theory, a theory that denies the existence of a sensory threshold (Swets 1961). To conclusively disprove the idea of a linear relationship between response amplitude and sound pressure, it would be necessary to repeat the experiment at even lower intensities. This would finally show that the response does not abruptly disappear at a sensory threshold, but smoothly fades away.

From auditory nerve to higher levels of the auditory system

Although our model was developed for the auditory nerve, some basic conclusions are probably valid for higher levels of the auditory system, too. This supposition is supported not only by experimental data, as will be elaborated in the “DISCUSSION” section, but also by theoretical arguments. In what follows, we will consider ultralow intensities. We call an intensity ultralow if it is so far below the detection threshold of the subject that stimuli are detected basically at chance level. Nevertheless, in accordance with signal detection theory (Swets 1961), we assume that ultralow stimuli cause a response in the auditory pathway. Whether such a response can be observed in a physiological experiment depends on the experimental effort; the required measurement time might be virtually infinite.

Stimulation at an ultralow intensity may be assumed to cause an almost infinitesimal perturbation of the spontaneous activity in the auditory nerve. Under such circumstances, a single stimulus does not provide significant information that could be processed at higher levels of the auditory pathway. Nevertheless, a perturbation of the spontaneous activity at the level of the auditory nerve presumably perturbs the firing statistics at higher levels so that the linear intensity dependence proposed for the auditory nerve finally reaches the cortex. Thus, the low-intensity approximation of our model is probably applicable to all levels of the auditory pathway. As it would be unreasonable to assume a sharp border between ultralow and low intensities, we may expect that the model predictions are valid, to some extent, also above the subject’s threshold of hearing. The upper limit of that range of validity probably decreases when moving up the auditory pathway, owing to additional nonlinearities that come into play.

Comparison with the loudness function of Zwislocki (1965)

Model predictions that are valid for physiological responses in auditory cortex might be approximately valid also for the psychophysical correlate of stimulus intensity, loudness. Here we show that the loudness function proposed by Zwislocki (1965) is, at low intensities, strikingly similar to our model. In a normalized form, Zwislocki’s function may be written as

$$ f_{Z} {\left( I \right)} = \frac{1} {m}{\left( {{\left( {1 + I} \right)}^{m} - 1} \right)}, $$
(8)

where f Z (I) may be interpreted as a normalized loudness [the formula can be derived, e.g., from Eq. 2 of Buus and Florentine (2002) by an appropriate normalization of both the intensity and the loudness scale]. For the parameter m, Zwislocki (1965) determined the value 0.27. In the limit of very low intensities, normalized loudness and normalized intensity I are identical. Thus, in this respect the function f Z completely agrees with the functions f 0, f 1, and f 2 that were considered above.
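
The two limiting behaviors of Eq. 8 can be verified numerically. The following sketch implements the normalized function with m = 0.27 as quoted above; the function name `f_z` and the test intensities are illustrative choices, and `expm1`/`log1p` are used only to keep the evaluation accurate at very small I.

```python
import math

M_ZWISLOCKI = 0.27  # exponent m of Zwislocki (1965), as quoted in the text

def f_z(intensity, m=M_ZWISLOCKI):
    """Normalized loudness of Eq. 8: ((1 + I)**m - 1) / m.

    Written as expm1(m * log1p(I)) / m, which is mathematically identical
    but avoids cancellation errors for very small intensities.
    """
    return math.expm1(m * math.log1p(intensity)) / m

# Low-intensity limit: f_Z(I) / I approaches 1, i.e. loudness equals intensity.
low_ratio = f_z(1e-6) / 1e-6

# High intensities: f_Z(I) approaches the compressive power law I**m / m.
high_intensity = 1e8
high_ratio = f_z(high_intensity) / (high_intensity ** M_ZWISLOCKI / M_ZWISLOCKI)

print(low_ratio, high_ratio)
```

Both ratios are close to 1, which illustrates the transition from proportionality at very low intensities to power-law growth at high intensities.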

A plot of the normalized loudness f Z is shown as a thick dashed curve in Figure 11. The figure was derived from the right panel of Figure 7. Thus, the gray curve shows again the normalized response amplitude for the homogeneous auditory-nerve model (function f 1), whereas the thick black curve corresponds to the inhomogeneous auditory-nerve model (function f 2). The curve of f Z runs roughly in the middle between these two curves (inset shows magnified view of low intensities). The functions f 1 and f 2 may be considered as special cases of a function f n , where n is a parameter (see Eq. 30 in Appendix A). By adjusting n, an almost perfect match between f n and f Z can be achieved as shown by the thin solid curve in Figure 11 (calculated for n = 1.55).

FIG. 11

Comparison between normalized response amplitude and loudness. As in the right panel of Figure 7, the normalized response amplitudes of the homogeneous and the inhomogeneous auditory-nerve model are shown as a thick gray and a thick black curve, respectively (inset providing magnified view of low intensities). Roughly in the middle between these two curves, the thick dashed curve is found, which represents a normalized version of Zwislocki’s (1965) loudness function. A similar function (thin black curve) is obtained by assuming that a real auditory nerve behaves like a hybrid of the above two model variants.

POSSIBLE PROBLEMS, LIMITATIONS, AND WORKAROUNDS

To allow analytical evaluations, simplifying assumptions about the spike generation process had to be made. Specifically, we assumed that the spike generation is driven by a “force” that is proportional to intensity, defined as sound pressure squared. The assumption appears adequate for sound intensities near the threshold of hearing, but becomes more and more problematic with increasing intensity. This section introduces minor modifications to the model, particularly a redefinition of intensity, to extend its scope. Another point deserving attention is that, in psychophysical studies, longer stimuli are associated with lower thresholds.

Intensity dependence of cochlear excitation pattern

Equation 4 and the more general Eq. 14 in Appendix A are based on the supposition that the BM displacement is proportional to sound pressure and that its logarithm decreases as a linear function of distance from characteristic place. Moreover, we postulated that the cochlear excitation pattern is independent of stimulus intensity, except for scaling issues. Figure 2 showed that these assumptions are not strictly valid.

Two different effects of stimulus intensity on the cochlear excitation pattern were distinguished in Figure 2. The first effect is illustrated in Figure 2C and may be described as a compression of the maximum BM displacement. The solid curve shows an empirical fit to the data. To understand this curve, it is useful to first consider its approximation for low intensities (dotted line). According to this approximation, maximum BM displacement d (in nanometers) and sound level L (in dB SPL) are related by the equation

$$ d_{\text{linear}}(L) = 10^{(L - L_{0})/20} . $$
(9)

The parameter L 0 = 16 dB specifies the level at which the approximation has the value 1 nm. The equation simply means that the maximum BM displacement is proportional to sound pressure, as is to be assumed for very low sound levels (see e.g., Robles and Ruggero 2001). But except for the lowest levels, the data displayed in Figure 2C clearly deviate from this law and roughly follow the solid curve, which represents the function

$$ d(L) = d_{0} \ln \left( 1 + d_{\text{linear}}(L) / d_{0} \right), $$
(10)

with d 0 = 4 nm. The equation describes a smooth transition between linear and logarithmic growth, and it is similar to Eq. 5, although the context is different. A compressive nonlinearity as in Eq. 10 can easily be accounted for by slightly modifying our model. Instead of defining the normalized intensity I as the square of normalized sound pressure, we define it as the square of BM displacement (suitably normalized). With this redefinition of I, there is no need to change any of our equations. A second effect of stimulus intensity on cochlear excitation is that normalized excitation patterns for different intensities are not necessarily congruent (Fig. 2B). Our model is easily improved also in that respect, because we may assume that the parameter ν in Eqs. 5, 6, and 7 depends on intensity, as specified in Eq. 23 of Appendix A.
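
Equations 9 and 10 and the proposed redefinition of intensity can be sketched in code as follows. The parameter values L 0 = 16 dB and d 0 = 4 nm are taken from the text; the function names and, in particular, the normalization of the redefined intensity to a reference level are hypothetical choices made for this illustration, since the text only requires that the displacement be "suitably normalized".

```python
import math

L0_DB = 16.0  # level (dB SPL) at which the linear approximation equals 1 nm
D0_NM = 4.0   # parameter d0 of Eq. 10, in nm

def d_linear(level_db):
    """Eq. 9: maximum BM displacement (nm), proportional to sound pressure."""
    return 10.0 ** ((level_db - L0_DB) / 20.0)

def d_compressed(level_db):
    """Eq. 10: smooth transition from linear to logarithmic growth."""
    return D0_NM * math.log1p(d_linear(level_db) / D0_NM)

def redefined_intensity(level_db, ref_level_db=L0_DB):
    """Square of BM displacement, normalized to its value at a reference level.

    The reference level is an assumption of this sketch, not a model parameter.
    """
    return (d_compressed(level_db) / d_compressed(ref_level_db)) ** 2

for level in (-20, 16, 50, 80):
    print(level, d_linear(level), d_compressed(level), redefined_intensity(level))
```

At low levels the two displacement functions nearly coincide, whereas at high levels each additional 20 dB adds only about d 0 ln 10 ≈ 9.2 nm to the compressed displacement.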

A remaining question is to what extent deviations from the assumed linear relationship between logarithm of BM displacement and distance from characteristic location might interfere with our model predictions. This question is investigated in Figure 12. The upper thin curve in Figure 12A is basically identical with the 50-dB curve in Figure 2A. The dB scale was normalized so that the maximum BM displacement corresponds to 20 dB. An intensity dependence of cochlear excitation was simulated by vertically shifting this curve in steps of 5 dB. Alternatively, we could have taken all the curves in Figure 2A; but a comparison with our analytical model would have been severely hampered by the fact that the cochlear region for which data points are available shrinks with decreasing sound level. The thick curves in Figure 12A show linear approximations to the thin curves (focusing on locations close to the characteristic place) and represent the model that will be considered in our analytical evaluations. Dotted vertical lines mark the boundaries of the cochlear region accounted for (corresponding to the cochlear locations x and x + in Eq. 19 of Appendix A).

FIG. 12

Simulation showing that the model is robust against minor deviations from the assumed linear relationship between logarithm of BM displacement and distance from characteristic location. A The thin curves correspond to the 50-dB curve in Figure 2A, except for vertical shifts (in steps of 5 dB). The curves simulate data that deviate from an exact linear law. The thick lines show linear approximations to the thin curves, and they represent the model. The boundaries of the cochlear region accounted for are marked by dotted vertical lines. B The curves of A were transformed into normalized firing rates. C Mean normalized firing rate as a function of maximum BM displacement in dB. The thick curve represents the model; the thin curve the simulated data. The two curves show an excellent agreement, at least qualitatively.

Using Eq. 2 and equating the dB scale of Figure 12A with that of Figure 1, BM displacement was transformed into normalized firing rate. The results are presented in Figure 12B. The figure suggests that deviations from the linear law proposed in the model are irrelevant at low intensities, but have a significant effect at higher intensities (deviations between thin and thick curves are found especially towards the apex).

Figure 12C shows the mean normalized firing rate as a function of maximum BM displacement (in dB). The thin curve was derived, by numerical integration, from the thin curves in Figure 12B (and many corresponding curves that are not displayed). The thick curve represents the model; numerical integration and analytical evaluation using Eq. 19 of Appendix A yielded congruent results. The good agreement between thin and thick curve suggests that minor deviations from the proposed linear relationship between the logarithm of BM displacement and distance from characteristic place are tolerable. To prevent misunderstandings, it should be noted that the apparent saturation at higher intensities, which appears to be inconsistent with Figure 4, is a consequence of the boundaries of the cochlear region considered (same effect as in Figs. 6 and 9).

Adaptation

Equation 2, one of the cornerstones of our model, is based on experiments by Yates et al. (1990), who determined firing rates by counting the number of discharges during the presentation of a 100-ms tone burst. The saturation described by Eq. 2 essentially results from short-term adaptation, which has a time constant of about 40 ms (Smith 1977). But adaptation also has components with shorter and longer time constants so that the question arises as to how appropriate our model is for stimulus durations other than 100 ms. Because significant adaptation effects are not to be expected near the threshold of hearing, this question is not relevant for very low intensities. However, it gains importance as intensity increases.

At high intensities, the onset response of auditory neurons is considerably affected by rapid adaptation (Westerman and Smith 1984; Yates et al. 1985) and neural refractoriness (Lütkenhöner and Smith 1986). But only a brief period of time is concerned so that we may expect that these factors had only a relatively small effect on the spikes counted by Yates et al. (1990) within a 100-ms time window. The situation would be quite different for stimulus durations of the order of a few milliseconds, corresponding to the time constant of rapid adaptation. This is not necessarily a major problem for our model, because it is conceivable that the issue can be solved by a suitable renormalization of the intensity scale. But since a definite answer cannot be given at this point, it appears appropriate to sound a note of caution: For stimuli that are much shorter than 100 ms, the validity range of our model might be restricted to relatively low intensities.

In the case of long and very long stimulus durations, long-term adaptation with a time constant of a few seconds and very-long-term adaptation with a time constant of the order of a minute are observed (Javel 1996). A corresponding phenomenon was found at the cortical level in humans (Lammertmann and Lütkenhöner 2001). Adaptation components with long time constants can easily be accounted for by our model, even though the model is static by nature. We simply have to assume that the reference intensity being used for normalizing the intensity scale is a slowly varying function of time which mimics the time course of long-term adaptation. A time-dependent normalized intensity would cause a synchronous change in mean firing rate.

Temporal integration or not

The perception of short stimuli requires higher stimulus intensities than the perception of longer stimuli. This phenomenon is generally called temporal integration. Meddis (2006b) pointed out that the term is unfortunate and that it represents an example of how the name of a putative mechanism is used to indicate a phenomenon that it might (or might not) explain. The statement was made in reply to Krishna’s (2006) criticism of a recent computer model of the auditory periphery (Meddis 2006a), in which the auditory-nerve first-spike latency data of Heil and Neubauer (2001) were simulated. Krishna (2002, 2006) showed that integration over long time scales is not necessary to model such experimental data, because they may be explained by stochasticity in the synaptic events leading up to spike generation in the auditory-nerve fibers. Put simply, Krishna’s view means that a near-threshold stimulus of steady intensity establishes a low probability of a neural event, and the longer the stimulus, the greater is the chance that the event will occur before the end of the stimulus (Meddis 2006b).
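
The probabilistic reading of "temporal integration" described above can be illustrated with a toy calculation. The sketch assumes, purely for illustration, that the neural event is generated by a homogeneous Poisson process whose rate is proportional to intensity. It is not the model of Krishna (2002, 2006) or Meddis (2006a), and the names `event_probability` and `threshold_intensity`, the gain `k`, and the criterion of 0.5 are hypothetical.

```python
import math

def event_probability(intensity, duration_s, k=1.0):
    """Probability of at least one event before the end of the stimulus.

    Assumes a homogeneous Poisson process with rate k * intensity (events/s);
    both the process and the gain k are illustrative assumptions.
    Computes 1 - exp(-k * intensity * duration) in a numerically stable way.
    """
    return -math.expm1(-k * intensity * duration_s)

def threshold_intensity(duration_s, criterion=0.5, k=1.0):
    """Intensity at which the event probability reaches the criterion."""
    return -math.log1p(-criterion) / (k * duration_s)

# Longer stimuli reach the same criterion at a lower intensity.
for duration in (0.01, 0.1, 1.0):
    print(duration, threshold_intensity(duration))
```

In this toy setting, a tenfold increase in stimulus duration lowers the threshold intensity by a factor of ten, although the sketch contains no integrator; the trade-off arises solely from the growing chance that an event occurs during a longer stimulus.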

The alternative view of “temporal integration” suggested by Krishna (2002, 2006) fits neatly into the theory developed in the present study. So far, our model does not comprise an element related to temporal integration, and Krishna’s results suggest that this is not a shortcoming. Should future research give reason to modify that view, there is an easy solution for our model. Instead of defining intensity as sound pressure squared or BM displacement squared, it could be defined, for example, as inner-hair-cell receptor potential squared.

DISCUSSION

Implications regarding auditory-evoked responses

According to the model presented in this article, an experimenter studying auditory evoked responses at low intensities may observe three types of relationship between response amplitude and stimulus intensity. The amplitude may increase linearly (I) with intensity, (II) with sound pressure, or (III) with the logarithm of either of these two quantities (which is mathematically equivalent, except for a factor of two). If the data are noisy, as is typical for low intensities, it may be difficult to decide which of the three relationships is the most adequate one, and appearances may be deceiving. Figure 10, for example, demonstrates by means of simulated data that an experiment may provide seemingly irrefutable evidence of a linear relationship between response amplitude and sound pressure, even though such a conclusion would be contradictory to the assumptions underlying the model.

The implications of the model are best understood by first looking at very low intensities. A threshold does not exist in the model, in full agreement with signal detection theory (Swets 1961). This means that even the softest sound is predicted to elicit a minute response. A different question is whether the response would be above the noise floor in a given physiological experiment or whether a listener participating in a detection experiment would perform better than chance. This is the point where threshold comes into play. Only sounds with an intensity exceeding a certain level, the threshold, will elicit a response that fulfills predefined criteria of significance (specific to each experiment). For very low intensities, intensity and response amplitude are proportional. The normalized intensity scales of this study do not allow specifying in absolute terms what “very low” actually means for a real experiment. Nevertheless, it appears safe to say (cf. the considerations in the context of Fig. 8) that detecting proportionality between stimulus intensity and response amplitude will generally be a challenging task for an experimenter, at least in noninvasive experiments such as the recording of auditory evoked potentials from the surface of the human scalp.

In Figure 8, the deviation between exact model prediction (thick black curve, representing the more realistic version of our model, i.e. the inhomogeneous auditory nerve) and linear approximation (associated dashed curve) rapidly increases for normalized intensities greater than about −12 dB, which may therefore be considered as the approximate upper limit of the intensity range showing a more or less proportional relationship between intensity and response amplitude (range I). According to Figure 10, there is an overlapping range with an apparently linear relationship between sound pressure and response amplitude (range II), which roughly extends from −15 to 8 dB (consider the upper scale in Fig. 10). The upper limit of that range comes close to the lower limit of the subsequent range (III), where the response amplitude increases basically linearly with sound level. In Figure 10 (cf. upper scale), the transition between ranges II and III occurs roughly between 5 and 15 dB. In that range, the exact solution (solid curve) drifts apart from the dashed line and approaches the dotted curve. The latter represents a linear increase with respect to sound level, i.e. a logarithmic increase with respect to intensity. In the formulas derived, this logarithmic increase continues without limit. However, this is merely a consequence of a simplification in our analytical considerations, which focused on low intensities: we assumed an infinite cochlea. This simplification is clearly not applicable at high intensities, and Figure 6 demonstrates that a more realistic model definitely shows saturation.

The model offers a simple explanation of why physiological experiments at low sound levels (Eggermont and Odenthal 1974; Elberling and Don 1987; Versnel et al. 1992; Lütkenhöner and Klein 2007; Lütkenhöner et al. 2007) showed a response amplitude that linearly increased with level. Besides that, the model predicts that proving the proposed linear dependence on sound intensity will require studying levels that are clearly lower than commonly considered. Unless meaningful data can be recorded at sufficiently low levels, an experiment may falsely suggest a linear dependence on sound pressure. First evidence of the correctness of these predictions was recently obtained in a study of wave V of the brainstem auditory evoked potential (Lütkenhöner et al. 2007). The results of that study turned out to be qualitatively consistent with Figure 10. Comparisons between data and model should be done with care, though. All our model predictions were made under the assumption that the response is not affected by additional saturating nonlinearities, and this assumption becomes problematic with increasing intensity, especially at higher levels of the auditory system. Additional nonlinearities would inevitably affect a threshold extrapolation from observations at higher intensities, making a transformation of the intensity scale such as in Figure 8 a delicate procedure. Additional nonlinearities may also explain why the upper limit of intensity range III appears to shrink when moving up the auditory pathway (cf. “INTRODUCTION”).

Reference to psychophysics

The amplitudes of physiological responses and the psychophysical quantity “loudness” have in common that they tend to grow with increasing intensity. While it would be naïve to expect a simple relationship over the full intensity range, there might be at least a qualitative correspondence at low intensities, and physiological experiments at very low intensities could help to resolve controversies about loudness perception at threshold (Buus and Florentine 2002; Moore 2004).

Figure 11 shows that the normalized loudness function f Z , derived from Zwislocki (1965), is, at low intensities, basically an intermediate variant of our normalized rate functions f 1 and f 2. Either function represents a specific assumption about the sensitivity distribution of the auditory-nerve fibers. In the first case, all fibers have the same sensitivity, whereas in the second case the sensitivity is uniformly distributed over a 20-dB range. Both assumptions are, of course, artificial, and a more realistic model might have properties that are in between the two cases. As there is insufficient information about the true sensitivity distribution of the auditory neurons, it may be useful to consider the functions f 1 and f 2 as special cases of a generalized function f n , and to regard n as an adjustable parameter (cf. Appendix B). With n = 1.55, f n would indeed agree almost perfectly with Zwislocki’s model (Fig. 11).

Thus, all in all it appears that our basic conclusions regarding low intensities apply not only to physiological responses, but also to the psychophysical quantity loudness. A careful consideration of low intensities might therefore be a good starting point for relating loudness models to physiology. If this supposition is correct, Figure 9 might help to understand recent psychophysical results presented by Mauermann et al. (2004). They carefully studied the fine structure of individual hearing thresholds and showed that a tone that lies within a threshold minimum exhibits a slower growth of loudness than a tone at a threshold maximum. The fine structure of equal loudness contours consequently flattened out at levels around 30–40 dB SPL. Figure 9 suggests that the observed differences in growth of loudness might be related to the sharpness of the cochlear excitation pattern. According to this view, a threshold minimum is associated with a sharper cochlear excitation pattern than a threshold maximum.

Loudness recruitment in hearing-impaired subjects could be explained in basically the same way. However, such an explanation would be contradictory to experimental results that were recently presented by Heinz et al. (2005). Their data suggest that loudness recruitment cannot be accounted for based on summed auditory-nerve response firing rates and may depend on neural mechanisms involved in the central representation of intensity, aspects that are clearly beyond the scope of the present study.