## Abstract

We investigate the transduction of sound stimuli into neural responses and focus on locust auditory receptor cells. As in other mechanosensory model systems, these neurons integrate acoustic inputs over a fairly broad frequency range. To test three alternative hypotheses about the nature of this spectral integration (amplitude, energy, pressure), we perform intracellular recordings while stimulating with superpositions of pure tones. On the basis of online data analysis and automatic feedback to the stimulus generator, we systematically explore regions in stimulus space that lead to the same level of neural activity. Focusing on such iso-firing-rate regions allows for a rigorous quantitative comparison of the electrophysiological data with predictions from the three hypotheses that is independent of nonlinearities induced by the spike dynamics. We find that the dependence of the firing rates of the receptors on the composition of the frequency spectrum can be well described by an energy-integrator model. This result holds at stimulus onset as well as for the steady-state response, including the case in which adaptation effects depend on the stimulus spectrum. Predictions of the model for the responses to bandpass-filtered noise stimuli are verified accurately. Together, our data suggest that the sound-intensity coding of the receptors can be understood as a three-step process, composed of a linear filter, a summation of the energy contributions in the frequency domain, and a firing-rate encoding of the resulting effective sound intensity. These findings set quantitative constraints for future biophysical models.

- mechanosensory transduction
- spectral integration
- auditory receptor
- hearing
- sound intensity
- energy
- model
- locust

Auditory receptor cells are commonly characterized by their responses to pure tones. For example, threshold curves characterize the minimum intensity needed to evoke a response as a function of the frequency of a pure tone; rate-intensity functions describe how the response depends on the tone's intensity. Natural signals, however, are only rarely restricted to single frequencies, and receptor cells often show a broad frequency tuning. Our understanding of auditory coding is thus not satisfactory as long as we do not know how the relative intensities of different frequencies contained in a sound signal are integrated by auditory receptors. Investigating this spectral integration helps us also to scrutinize basic principles of the mechanosensory transduction process.

In general, the response of the receptor could be any complicated, nonlinear function of the frequency spectrum. One may hope, however, that the underlying mechanism is simple enough to allow for a straightforward phenomenological description. One such way of combining different spectral contents would be the extraction of a single physical stimulus property. Its nature is intensely debated with respect to the question of temporal integration, i.e., how stimulus intensities are combined over time. Psychoacoustic measurements of intensity-duration tradeoffs suggest that the stimulus energy is the crucial variable (Garner, 1947; Plomp and Bouman, 1959; Zwislocki, 1965; Florentine et al., 1988), while a recent investigation of first-spike latencies in mammalian auditory-nerve fibers finds the time-integrated pressure as the decisive stimulus attribute (Heil and Neubauer, 2001). In insect auditory receptors, the differences between thresholds for one- and two-click stimuli and intensity-duration tradeoffs are consistent with temporal energy integration (Tougaard, 1996, 1998). Care must be taken, however, in the interpretation of these data because temporal integration also depends on the time course of several biophysical processes after the primary signal transduction such as internal calcium dynamics and spike generation.

Spectral integration, on the other hand, depends at least in insects almost exclusively on the mechanosensory transduction process; any fluctuations on the several kilohertz scale of relevant sound frequencies that were still present after the transduction (i.e., in the cell-membrane conductance) would be highly attenuated by the low-pass filter properties of the cell membrane (Koch, 1999). Looking at spectral integration instead of temporal integration therefore enables us to focus on the site of primary signal transduction.

For these reasons, we develop a descriptive model for the responses of auditory receptor neurons to stationary stimuli with arbitrary power spectrum. The model comprises three steps, which correspond to the coupling, the transduction, and the encoding of the primary signal (Eyzaguirre and Kuffler, 1955; French, 1992). Focusing on the locust auditory system, we investigate three alternative hypotheses about which stimulus property governs the transduction process: the maximum amplitude of the stimulus, the stimulus energy, and the average half-wave-rectified signal amplitude. To test the model framework and distinguish between the rival hypotheses, we perform intracellular recordings from the axons of receptor cells. Based on a systematic exploration of stimuli that cause identical neural responses, the recordings reveal how the individual spectral contributions are integrated into one effective sound intensity.

## MATERIALS AND METHODS

*Electrophysiology.* All experiments were performed on adult *Locusta migratoria*. The tympanal auditory organ of these animals is located in the first abdominal segment. After decapitation, removal of the legs, wings, intestines, and the dorsal part of the thorax, the animal was waxed to a holder, and the metathoracic ganglion and auditory nerve were exposed. Action potentials from auditory receptor cells were recorded intracellularly in the auditory nerve with standard glass microelectrodes (borosilicate, GC100F-10; Harvard Apparatus Ltd., Edenbridge, UK) filled with a 1 M KCl solution (50–110 MΩ resistance). The signals were amplified (BRAMP-01; NPI Electronic, Tamm, Germany) and recorded by a data acquisition board (PCI-MIO-16E-1; National Instruments, München, Germany) with a sampling rate of 10 kHz. Detection of action potentials and generation of acoustic signals were controlled on-line by the custom-made Online Electrophysiology Laboratory (OEL) software. Stimuli were transmitted by the above-mentioned data acquisition board with a conversion rate of 100 kHz to the loudspeakers [Esotec D-260, Dynaudio (Skanderborg, Denmark) on a DCA 450 amplifier (Denon Electronic GmbH, Ratingen, Germany)]. These were mounted at 30 cm distance on each side of the animal so that the incidence of sound-pressure waves was orthogonal to the body axis. Stimuli were played only by the loudspeaker ipsilateral to the recorded auditory nerve. The linearity of the loudspeakers for superpositions of multiple tones was verified by playing samples of the stimuli used in the experiments while recording the sound at the site of the animals with a high-precision microphone [40AC, G.R.A.S. Sound & Vibration (Vedbæk, Denmark) on a 2690 conditioning amplifier (Brüel & Kjær, Langen, Germany)]. During the experiments, animals were kept either at room temperature (∼20°C) or at a constant temperature of 30°C. No systematic trends regarding a possible temperature dependence of the studied phenomena were observed. All experiments were performed in a Faraday cage lined with sound-attenuating foam to reduce echoes. Recordings from 45 receptor cells stemming from 18 animals (with at most 4 cells from the same animal) were used in this study.

The experimental protocol complied with German law governing animal care.

*Measurement of rate-intensity functions.* In general, each sound stimulus was presented for a duration of 100 msec, separated by pauses of at least 400 msec. To investigate adaptation effects, control experiments with longer stimuli and pauses (300/500 msec or 500/750 msec) were performed. All of these stimuli are decidedly longer than typical integration times of insect auditory receptors (1–3 msec as determined by reverse correlation for locust auditory receptors; data not shown) (see also Tougaard, 1998). Responses were measured by the average firing rate, calculated as the total number of spikes divided by the stimulus length. Spikes were detected on-line and counted from stimulus onset until 20 msec beyond stimulus offset to include all spikes elicited by the stimulus. This is justified because the investigated cells show no or only very low spontaneous activity and no offset response. Spike trains from the control experiments were also used for off-line analysis of specific response episodes.
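The rate measure described above can be written as a short routine. The following Python sketch is illustrative only: the function name, the argument names, and the convention that spike times are given in milliseconds relative to stimulus onset are assumptions and are not taken from the OEL software.

```python
def average_firing_rate(spike_times_ms, stimulus_ms=100.0, extra_window_ms=20.0):
    """Average firing rate in Hz for one stimulus presentation.

    spike_times_ms: spike times in ms relative to stimulus onset (assumed convention).
    """
    # Count spikes from stimulus onset until extra_window_ms after stimulus offset.
    window_end = stimulus_ms + extra_window_ms
    count = sum(1 for t in spike_times_ms if 0.0 <= t <= window_end)
    # Normalize by the stimulus length, not by the length of the counting window.
    return 1000.0 * count / stimulus_ms
```

Dividing by the stimulus length rather than by the 120 msec counting window follows the definition in the text; it relies on the cells having negligible spontaneous activity.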

Rate-intensity functions were determined in the following way. First, the stimulus was presented in steps of 5 dB between 20 and 100 dB sound pressure level (SPL) (for a definition see Eq. 21 in the Appendix) to obtain the general shape of the rate-intensity function. These data were used to identify the intensity range that gave rise to firing rates between 50 and 250 Hz. Within this dynamic range of ∼10–15 dB, additional measurements in steps of 1 or 2 dB were performed, and these were repeated 4–10 times to yield average firing rates and their SDs.

Stimulus intensities corresponding to given firing rates were obtained by fitting a straight line through the four points closest to the desired firing rate as shown in Figure 1. Errors on these measurements follow from the errors of the fitted parameters according to the law of error propagation. Thresholds were determined by linear extrapolation to zero firing rate from data points with a low, but significant firing rate.
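As an illustration of this interpolation step, the sketch below fits an ordinary least-squares line through the four measurements whose firing rates lie closest to the target rate and inverts it. The function and variable names are hypothetical, the fit is unweighted, and the propagation of the fit errors mentioned above is omitted for brevity.

```python
def intensity_at_rate(intensities, rates, target_rate, n_points=4):
    """Intensity (dB SPL) expected to evoke target_rate (Hz), from a local line fit."""
    # Select the n_points measurements whose firing rates are closest to the target.
    pairs = sorted(zip(intensities, rates), key=lambda p: abs(p[1] - target_rate))
    pairs = pairs[:n_points]
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares straight line: rate = slope * intensity + intercept.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Invert the fitted line at the target firing rate.
    return (target_rate - intercept) / slope
```

Calling the same routine with `target_rate=0` on data points with low but significant firing rates corresponds to the linear extrapolation used for thresholds.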

*Superposition of pure tones.* Measuring rate-intensity functions for pure tones allows one to understand how the firing rate r depends on the amplitude A of a single tone for a certain sound frequency, r = r(A). Investigating spectral integration amounts to asking whether this understanding can be extended to stimuli that contain multiple tones simultaneously. We therefore try to obtain a description of the firing rate r depending on the amplitudes A_{1}, A_{2}, … of the different frequency components of such stimuli, r = r(A_{1}, A_{2}, …).

In a first set of experiments, stimuli were sound-pressure waves S(t) consisting of two or three pure tones of amplitudes A_{n}, frequencies f_{n}, and phase offsets ϕ_{n}, n = 1, 2, 3:

$$S(t) = \sum_{n=1}^{3} A_{n} \sin(2\pi f_{n} t + \phi_{n}) \tag{1}$$

with A_{3} = 0 for the two-tone experiments. We used stimuli that were far longer than the periods of the sine waves and avoided combinations of frequencies that are related to each other by small integer factors. This makes the measurements insensitive to the relative phases of the individual sine tones, which we cannot control in our experiments because of putative phase shifts at the tympanal membrane (Michelsen, 1971b). For concreteness, we set ϕ_{1} = ϕ_{2} = ϕ_{3} = 0 in all experiments. The frequencies were chosen to be far enough apart to avoid beating. The two-tone experiments were performed with sound frequencies f_{1} = 4 kHz and f_{2} = 3/π · 10 kHz ≈ 9.55 kHz, the three-tone experiments with f_{1} = 4 kHz, f_{2} = 3/π · 10 kHz ≈ 9.55 kHz, and f_{3} = 10/π^{2} · 15 kHz ≈ 15.20 kHz or with f_{1} = 6 kHz, f_{2} = 3/π · 9 kHz ≈ 8.59 kHz, and f_{3} = 10/π^{2} · 17 kHz ≈ 17.22 kHz.

Within the present approach, we are concerned only with the encoding of sound intensity and not with temporal aspects. We thus restricted our attention to stationary stimuli with constant envelope as described above. This is justified because the responses of locust auditory receptors do not phase lock to sound frequencies in the kilohertz range (Suga, 1960; Hill, 1983a).

The experiments were designed to identify, for individual receptors, sets of amplitude combinations (A_{1}, A_{2}) or (A_{1}, A_{2}, A_{3}), respectively, that result in the same firing rate. The recorded data were analyzed within a model framework, which includes explicit predictions about how these amplitude combinations should be related to each other. In Results, the model is systematically developed and discussed. Here, we only present the main aspects and cover technical issues and questions regarding the model's role within the data analysis.

In summary, we compute the average firing rate of a receptor cell in the following three-step process.

(1) The stimulus is a sound-pressure wave S(t), a superposition of pure tones with frequencies f_{n}, amplitudes A_{n}, and phase offsets ϕ_{n}, S(t) = ∑_{n} A_{n} sin(2π f_{n}t + ϕ_{n}). In the first step, this signal is linearly filtered and thereby turned into:

$$\tilde{S}(t) = \sum_{n} \frac{A_{n}}{C_{n}} \sin(2\pi f_{n} t + \tilde{\phi}_{n}) \tag{2}$$

This means that every tone receives a gain factor 1/C_{n}. In addition, the phase may change from ϕ_{n} to ϕ̃_{n}. The inverse of the filter constant C_{n} thus corresponds to the sensitivity for the frequency f_{n}: the smaller the C_{n}, the more sensitive the receptor at the corresponding sound frequency.

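(2) An effective sound intensity J is computed according to one of the following three hypotheses:

$$\begin{aligned} \text{amplitude hypothesis (AH):} \quad & J_{AH} = \max_{t} |\tilde{S}(t)| \\ \text{energy hypothesis (EH):} \quad & J_{EH} = \langle \tilde{S}(t)^{2} \rangle \\ \text{pressure hypothesis (PH):} \quad & J_{PH} = \langle |\tilde{S}(t)| \rangle \end{aligned}$$

where S̃(t) is the filtered signal from Equation 2, |x| denotes the absolute value of x, and 〈y(t)〉 is the temporal average of y(t).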

(3) The average firing rate r is determined according to a single nonlinear function r(J).
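The first two model steps can be illustrated numerically. The sketch below is a hedged example, not the analysis code of the study: it assumes that the three effective intensities are the maximum of |S̃(t)| (amplitude hypothesis), the temporal average of S̃(t)² (energy hypothesis), and the temporal average of |S̃(t)| (pressure hypothesis), sets all phases to zero, and uses hypothetical function names and a hypothetical sampling rate.

```python
import math

def filtered_signal(t, amps, freqs, filt):
    """Step 1: linear filtering, every tone is scaled by the gain 1/C_n (phases set to 0)."""
    return sum(a / c * math.sin(2.0 * math.pi * f * t)
               for a, f, c in zip(amps, freqs, filt))

def effective_intensity(amps, freqs, filt, hypothesis, duration=0.1, rate=100_000):
    """Step 2: effective sound intensity J under one of the three hypotheses."""
    n = round(duration * rate)
    samples = [filtered_signal(i / rate, amps, freqs, filt) for i in range(n)]
    if hypothesis == "AH":   # maximum amplitude of the filtered signal
        return max(abs(s) for s in samples)
    if hypothesis == "EH":   # temporal average of the squared signal (energy)
        return sum(s * s for s in samples) / n
    if hypothesis == "PH":   # temporal average of the rectified signal
        return sum(abs(s) for s in samples) / n
    raise ValueError("hypothesis must be 'AH', 'EH', or 'PH'")
```

For a single tone with A/C = 1, the three measures evaluate to approximately 1, 1/2, and 2/π, respectively. Step 3 would then map J onto a firing rate through one cell-specific function r(J), which is not modeled here.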

Note that the effective sound intensity J as defined above is distinct from the physical sound intensity, commonly measured in decibels SPL (compare Eq. 21 in the Appendix), which we denote by I throughout the text. Whereas I measures the stimulus itself, J is a derived quantity that incorporates the filter constants C_{n} and therefore also reflects the sensitivity of the specific receptor cell. Furthermore, I is defined as a logarithmic measure (relative to a predefined reference intensity); J is not, which facilitates the notation.

Within the model framework, the filter constants C_{n} are determined only up to a common factor, which can be absorbed in the function r(J). In other words, the model remains unchanged if all C_{n} are multiplied by the same constant and r(J) is at the same time adjusted appropriately. It follows that one way to determine the C_{n} is to choose a fixed firing rate, find for each frequency f_{n} the amplitude Â_{n} that leads to this firing rate, and set C_{n} = Â_{n}.

In the following description of the experimental procedure, we will for simplicity focus on the case of superpositions of two tones. The generalization of concepts and formulas to the three-tone case is straightforward.

The three alternative hypotheses result in different predictions about which combinations of amplitudes (A_{1}, A_{2}) are expected to lead to the same firing rate. Because the model implies that equal firing rate follows from equal effective sound intensity J (step 3), curves of constant firing rate can easily be calculated for each hypothesis by setting J constant in the equations of the second step in the model. These “iso-firing-rate curves” are shown in Figure 2. From the amplitude hypothesis, pairs (A_{1}, A_{2}) yielding the same firing rate are expected to lie on a straight line. Likewise, from the energy hypothesis, they are expected to lie on an ellipse. For the pressure hypothesis, they should fall on an even more strongly bent curve. The corresponding shape has to be computed numerically by solving the equation:

$$\frac{1}{\tau} \int_{0}^{\tau} \left| \frac{A_{1}}{C_{1}} \sin(2\pi f_{1} t) + \frac{A_{2}}{C_{2}} \sin(2\pi f_{2} t) \right| dt = \text{const} \tag{3}$$

for pairs (A_{1}, A_{2}). The duration τ has to be chosen large enough to cover many cycles of the sine waves in the signal, so that the phases ϕ̃_{n} can be neglected. Note that the shape of these three alternative iso-firing-rate curves is not influenced in any way by the form of r(J).
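A numerical solution for the pressure hypothesis can be sketched as follows. The example works in normalized coordinates (x, y) = (A_{1}/C_{1}, A_{2}/C_{2}) and exploits that the time-averaged rectified signal scales linearly with a common amplitude factor, so one evaluation of the average per direction suffices. The function names, the sampling rate, the duration τ = 0.1 s, and the normalization to the single-tone value 2/π are choices made for this illustration.

```python
import math

F1, F2 = 4000.0, 30000.0 / math.pi  # tone frequencies of the two-tone experiments (Hz)

def mean_abs(x, y, tau=0.1, rate=100_000):
    """Time average of |x sin(2 pi f1 t) + y sin(2 pi f2 t)| over the duration tau (s)."""
    n = round(tau * rate)
    total = 0.0
    for i in range(n):
        t = i / rate
        total += abs(x * math.sin(2.0 * math.pi * F1 * t)
                     + y * math.sin(2.0 * math.pi * F2 * t))
    return total / n

def pressure_iso_point(alpha_deg):
    """Point (x, y) on the pressure-hypothesis iso-firing-rate curve at angle alpha."""
    alpha = math.radians(alpha_deg)
    ux, uy = math.cos(alpha), math.sin(alpha)
    target = 2.0 / math.pi          # rectified average of a single tone with A/C = 1
    d = target / mean_abs(ux, uy)   # the average is linear in a common scale factor
    return d * ux, d * uy
```

At 45° the resulting radial distance is larger than 1, whereas the energy hypothesis gives 1 and the amplitude hypothesis 1/√2 in the same coordinates, in line with the ordering of the three curves described above.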

To relate these predictions to experimental results, we determined a set of amplitude combinations leading to the same average firing rate in the following way. We start by measuring a first rate-intensity function for a single pure tone with frequency f_{1}. From this rate-intensity function, we determine the amplitude A_{1}^{1} that leads to a firing rate of, e.g., 150 Hz as shown in Figure 1. (In the notation A_{i}^{n}, the subscript i refers to the frequency f_{i} at which the amplitude is measured, and the superscript n indicates the number of the measurement, 1 ≤ n ≤ N, where N denotes the total number of measurements.) Because the amplitude A_{2} of the second frequency component is zero for this stimulus, we denote the result as a data point (A_{1}^{1}, 0), i.e., a point on the A_{1} axis in a graph such as that of Figure 2. The same procedure is performed for a pure tone with frequency f_{2}, leading to a second data point (0, A_{2}^{2}) on the A_{2} axis that also corresponds to a firing rate of 150 Hz. These two amplitudes A_{1}^{1} and A_{2}^{2} can already serve as estimates of the filter constants C_{1} and C_{2}, respectively. We proceed by measuring rate-intensity functions for superpositions of the two tones where the ratio of the amplitudes A_{1} and A_{2} is held fixed. To do so, we set A_{1} = k·A_{2} and then jointly vary the intensity of A_{1} and A_{2}. This corresponds to measuring the rate-intensity functions along straight lines in radial direction as pictured by the *gray arrows* in Figure 2. It is also evident from the figure that the radial direction is well suited for accurate measurements of the iso-firing-rate curves and for discriminating between the hypotheses. The resulting rate-intensity functions are similar in shape to the ones for the pure tones, and we can again determine the stimulus that leads to a firing rate of 150 Hz as in Figure 1. This yields a third data point (A_{1}^{3}, A_{2}^{3}) with A_{1}^{3}/A_{2}^{3} = k. The procedure is continued for several different ratios k so that a set of amplitude pairs (A_{1}^{n}, A_{2}^{n}) is obtained.

A technical but important question is which ratios k should be used in the experiment. If the neuron is much more sensitive to one of the sound frequencies and if both amplitudes are comparable in size, i.e., A_{1} ≈ A_{2}, the response will be determined almost exclusively by the more effective sound frequency. To be most informative, the measurement should thus take the relative sensitivities into account. This is done by choosing k so that A_{1}/C_{1} and A_{2}/C_{2} are of the same order of magnitude, which assures that the effect of both tones is roughly the same. To do so, we use the estimates of C_{1} and C_{2} that have been obtained from the first two rate-intensity functions for the pure tones as explained above. In particular, the different ratios of A_{1} and A_{2} for subsequent measurements are selected on-line in such a way that after taking C_{1} and C_{2} into account, the directions along which the rate-intensity functions are measured are evenly spaced. The *gray arrows* in Figure 2 are such directions. Note that their even spacing depends on the scales of the axes given by C_{1} and C_{2}. The calculation that achieves this is as follows: choose angles α that are evenly spaced in the interval [0°, 90°] and use the relation for the slope ρ of a straight line:

$$\rho = \tan\alpha = \frac{A_{2}/C_{2}}{A_{1}/C_{1}}$$

In the off-line analysis, the parameters C_{1} and C_{2} were newly determined by χ^{2} fits of each of the three curves in Figure 2 to the complete data set (A_{1}^{n}, A_{2}^{n}). These fitted values of C_{1} and C_{2} should be more reliable than the initial estimates, which were obtained on-line from the pure-tone rate-intensity functions only.

A further technical detail concerns the choice of the fitting procedure. The procedure should treat A_{1} and A_{2} in a symmetric fashion, and it should not be affected by potentially large differences in the relative sensitivities for the two tones. This rules out, e.g., the simplest choice of regarding A_{2} as a function of A_{1} or vice versa. Instead, we normalized the amplitudes by the filter constants and looked at the radial distance of the data points:

$$\left( \frac{A_{1}^{n}}{C_{1}}, \frac{A_{2}^{n}}{C_{2}} \right)$$

from the origin, which is given by:

$$d_{n} = \sqrt{ \left( \frac{A_{1}^{n}}{C_{1}} \right)^{2} + \left( \frac{A_{2}^{n}}{C_{2}} \right)^{2} }$$

as a function of the ratio:

$$\rho_{n} = \frac{A_{2}^{n}/C_{2}}{A_{1}^{n}/C_{1}}$$

This is a natural choice because the rate-intensity functions that led to the data points were measured in this radial direction. For the three hypotheses, we denote the predicted radial distance by d_{n}^{m}, where m stands for the particular model hypothesis (m = AH, EH, or PH). d_{n}^{m} can be obtained from the model as a function of ρ_{n} and corresponds to the normalized distance from the origin to the respective iso-firing-rate curve in Figure 2. For the amplitude hypothesis, one obtains:

$$d_{n}^{AH} = \frac{\sqrt{1 + \rho_{n}^{2}}}{1 + \rho_{n}}$$

for the energy hypothesis, d_{n}^{EH} = 1, and for the pressure hypothesis, d_{n}^{PH} has to be determined numerically using the solutions of Equation 3.

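Estimating C_{1} and C_{2} then corresponds to minimizing the χ^{2} function for the deviation of the measured radial distances d_{n} from the predictions d_{n}^{m} of each model m:

$$\chi^{2} = \sum_{n=1}^{N} \frac{(d_{n} - d_{n}^{m})^{2}}{\varsigma_{n}^{2}} \tag{4}$$

with respect to C_{1} and C_{2}. The contributions of the data points are weighted by the measurement errors ς_{n}, which follow from the measurement errors ΔA_{1}^{n} and ΔA_{2}^{n} for A_{1}^{n} and A_{2}^{n}, respectively, by the law of error propagation as:

$$\varsigma_{n}^{2} = \left( \frac{A_{1}^{n}}{C_{1}^{2} d_{n}} \right)^{2} (\Delta A_{1}^{n})^{2} + \left( \frac{A_{2}^{n}}{C_{2}^{2} d_{n}} \right)^{2} (\Delta A_{2}^{n})^{2} \tag{5}$$

The fitted curves and the χ^{2} values obtained from the fits were used for further statistical analysis (see below).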
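The χ^{2} function for the radial distance can be evaluated as in the following sketch. It covers only the amplitude and energy hypotheses, which have closed-form predictions, and it makes several assumptions that should be read as such: the direction of a data point is expressed by the angle α with tan α = (A_{2}/C_{2})/(A_{1}/C_{1}) to avoid a division by zero for pure tones, the error ς_{n} is obtained by standard Gaussian error propagation for independent errors of A_{1} and A_{2}, and all function names are hypothetical.

```python
import math

def predicted_distance(alpha, model):
    """Normalized radial distance of the iso-firing-rate curve at angle alpha (rad)."""
    if model == "AH":   # straight line x + y = 1 in normalized coordinates
        return 1.0 / (math.cos(alpha) + math.sin(alpha))
    if model == "EH":   # unit circle x^2 + y^2 = 1 in normalized coordinates
        return 1.0
    raise ValueError("closed forms are available only for 'AH' and 'EH'")

def chi_square(points, errors, c1, c2, model):
    """Chi-square of the radial distances for filter constants c1 and c2.

    points: list of (A1, A2); errors: list of (dA1, dA2) for the same data points.
    """
    total = 0.0
    for (a1, a2), (da1, da2) in zip(points, errors):
        x, y = a1 / c1, a2 / c2          # normalized amplitudes
        d = math.hypot(x, y)             # measured radial distance
        alpha = math.atan2(y, x)         # direction of the data point
        # Propagated error of d, assuming independent errors of A1 and A2.
        sigma = math.hypot(x * da1 / c1, y * da2 / c2) / d
        total += ((d - predicted_distance(alpha, model)) / sigma) ** 2
    return total
```

Fitting C_{1} and C_{2} then amounts to minimizing `chi_square` over `c1` and `c2` with any general-purpose optimizer; the pressure hypothesis would additionally require the numerical solution of Equation 3 for the predicted distance.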

For the control experiments with stimulus lengths of 300 or 500 msec, the onset response and the steady-state response were analyzed individually. For the onset, only spikes in the first 30 msec after stimulus onset were taken into account; for the steady state, the first 200 msec of the response were disregarded. The control experiments were aimed at investigating the effect of adaptation on our model description. We therefore performed the same analysis as explained above on the firing rates obtained for the onset and the steady state and fitted the filter constants C_{1} and C_{2} separately in each case. In addition, we compared C_{1} and C_{2} as well as their ratio R = C_{1}/C_{2} for the onset with the respective values for the steady state. The relative change of R was computed from the onset value, R_{O}, and the steady-state value, R_{S}, as ΔR = |R_{O} − R_{S}|/R_{S}. For the total response, the ratio of C_{1} and C_{2} is denoted by R_{total}. To estimate the significance of changes in C_{1}, C_{2}, and R between the onset and the steady state, error measures for these parameters were computed for each cell individually by taking several nonoverlapping stretches of 30 msec during the steady state for the analysis, determining C_{1}, C_{2}, and R in each case, and computing the respective SDs.

Experiments with superpositions of three pure tones were performed and analyzed in the same way as the two-tone experiments. We first measured rate-intensity functions for each pure tone and from these obtained initial estimates of the respective filter constants C_{1}, C_{2}, and C_{3}. Subsequently, rate-intensity functions were measured along different directions in the three-dimensional stimulus space (A_{1}, A_{2}, A_{3}). The ratios A_{1}/C_{1} : A_{2}/C_{2} : A_{3}/C_{3} were taken as 1:1:1, 2:1:1, 1:2:1, and 1:1:2. Final fits of the model parameters C_{1}, C_{2}, and C_{3} were obtained in an analogous way as for the superposition of two tones.

*Statistical analysis.* The χ^{2} values obtained from the fits were used to test the statistical significance of deviations of the data from the models by a standard χ^{2} test for each cell individually.

The Bayesian probability of a model given the data can be used as a measure for the preference of one hypothesis over another. It is calculated from Bayes' formula:

$$p(\text{model } m \mid \text{data}) = \frac{p(\text{data} \mid \text{model } m) \, p(\text{model } m)}{p(\text{data})} \tag{6}$$

where p(data) = ∑_{m} p(data | model m) · p(model m). If there is no a priori evidence for any model, the prior probabilities for the models are to be set to p(model m) = 1/M, where M is the number of models investigated. The probabilities p(data | model m) were calculated from the difference between the measured radial distances d_{n} and the corresponding model predictions d_{n}^{m} by assuming independent errors with a Gaussian distribution of SDs ς_{n} (given by the measurement errors) and a finite and fixed measurement resolution Δ:

$$p(\text{data} \mid \text{model } m) = \prod_{n=1}^{N} \frac{\Delta}{\sqrt{2\pi} \, \varsigma_{n}} \exp\left( -\frac{(d_{n} - d_{n}^{m})^{2}}{2 \varsigma_{n}^{2}} \right) \tag{7}$$
An analogous formula was used in the case of superpositions of three pure tones.
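The model comparison can be sketched in a few lines. The example assumes a Gaussian likelihood with independent errors, in which every data point contributes a factor Δ/(√(2π) ς_{n}) · exp(−(d_{n} − d_{n}^{m})²/(2ς_{n}²)), and equal priors for all models; the function names and the use of log-likelihoods for numerical stability are choices made for this illustration.

```python
import math

def log_likelihood(measured, predicted, sigmas, resolution):
    """Log of p(data | model) for independent Gaussian errors and resolution Delta."""
    total = 0.0
    for d, p, s in zip(measured, predicted, sigmas):
        total += math.log(resolution / (math.sqrt(2.0 * math.pi) * s))
        total -= (d - p) ** 2 / (2.0 * s * s)
    return total

def posterior_probabilities(log_likelihoods):
    """Posterior p(model | data) for equal priors 1/M.

    log_likelihoods: dict mapping a model name to its log-likelihood.
    """
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    top = max(log_likelihoods.values())
    weights = {m: math.exp(v - top) for m, v in log_likelihoods.items()}
    norm = sum(weights.values())
    return {m: w / norm for m, w in weights.items()}
```

Because the resolution Δ and the prior 1/M enter every model in the same way, they cancel in the posterior; only the differences between data and model predictions, weighted by the errors, determine the preference.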

Trends in the data were tested for statistical significance by a standard run test (Barlow, 1989). For a given model, the data points were subdivided into those sequences of points that lie consecutively either above or below the model prediction, and the number of these sequences was tested for significant deviations from the null hypothesis of independently scattered data points around the model prediction.
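A run test of this kind can be sketched as follows. The example uses the standard expressions for the expected number of runs and its variance under the null hypothesis of randomly ordered signs (the Wald-Wolfowitz form); whether this matches the exact variant applied in the study is an assumption, and the function name is hypothetical. The input is the list of residuals (data minus model prediction) in the order of the data points.

```python
import math

def run_test(residuals):
    """Number of runs, its expectation and variance under the null hypothesis, and z-score.

    Assumes that both positive and negative residuals occur.
    """
    signs = [r > 0 for r in residuals if r != 0]   # True: above, False: below the model
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    # A new run starts whenever the sign changes between consecutive points.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    expected = 1.0 + 2.0 * n_pos * n_neg / n
    variance = (2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n)
                / (n * n * (n - 1.0)))
    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, variance, z
```

A strongly negative z indicates too few runs, i.e., a systematic trend of the data relative to the model curve. For the small numbers of data points per cell, the Gaussian approximation behind the z-score is only a rough guide.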

*Comparison of pure-tone and noise stimuli.* In another set of experiments, we tested whether our understanding of spectral integration allows accurate predictions of firing rates for more complex stimuli and focused on bandpass-filtered noise. To calibrate the model for a specific receptor, we measured the rate-intensity function for a pure tone as well as a set of filter constants in the relevant frequency band of the noise stimulus. According to our model, we can use these pure-tone results to calculate a prediction for the rate-intensity function of the noise stimulus (see below). The filter constants are needed for the calculation of the effective sound intensity J for the noise stimulus (model step 2), and the pure-tone rate-intensity function is needed because it implicitly contains the information about the shape of the response function r(J) of model step 3. To assess the reliability of the prediction, the rate-intensity function for the noise stimulus was also measured experimentally. The particular noise stimulus that we used was Gaussian white noise, cut off at ±3 SDs and bandpass filtered between 5 and 10 kHz, and the frequency of the pure-tone stimulus was 4 kHz. Note that when the amplitude of a noise stimulus is varied, all amplitudes in the signal are scaled by a common factor.

The prediction for the rate-intensity function of the noise signal is obtained in the following way. According to our model, the rate-intensity function of the pure tone, r^{pt}(I), and the rate-intensity function of the noise stimulus, r^{noise}(I), should have the same shape and be related to each other by a shift ΔI along the decibel-intensity axis:

$$r^{\text{noise}}(I) = r^{\text{pt}}(I - \Delta I) \tag{8}$$

For notational simplicity, we always use the same symbol r to denote the firing rate regardless of whether we consider its dependence on the sound intensity I, r(I), or on the effective sound intensity J, r(J). Strictly speaking, r(I) and r(J) are different functions, but from the context, it will always be clear to which function we refer.

Let us briefly describe the reason for the relation of Equation 8. For concreteness, we focus on the energy hypothesis; the amplitude and the pressure hypotheses can be dealt with in an analogous way. Consider an arbitrary sound signal S(t) composed of a set of pure tones with amplitudes A_{n}. From these, we can calculate the intensity, which is defined as:

$$I = 10 \log_{10} \left( \frac{\tfrac{1}{2} \sum_{n} A_{n}^{2}}{p_{0}^{2}} \right) \tag{9}$$

where p_{0} denotes the reference pressure of the decibel SPL scale, as well as the effective sound intensity:

$$J_{EH} = \frac{1}{2} \sum_{n} \frac{A_{n}^{2}}{C_{n}^{2}}$$

The essential observation is that multiplying every A_{n} by the same factor k amounts to adding a constant 20 log_{10} k to the intensity I (if k < 1, this constant is negative), whereas J_{EH} is multiplied by a factor k^{2}.

We now consider a noise stimulus with intensity I^{noise} and effective sound intensity J_{EH}^{noise}. To compare the response with that of a pure tone, we find the intensity I^{pt} that yields the same firing rate as the noise stimulus by setting both effective sound intensities equal:

$$J_{EH}^{\text{noise}} = J_{EH}^{\text{pt}} = \frac{(A^{\text{pt}})^{2}}{2 \, (C^{\text{pt}})^{2}}$$

The parameter C^{pt} denotes the filter constant for the pure tone. From the preceding equation, we can calculate the pure-tone amplitude A^{pt} and thus the intensity I^{pt} of the pure tone, for which the firing rate is the same as for the noise signal with given intensity I^{noise}. Let us denote the difference between I^{noise} and I^{pt} by ΔI.

If we multiply all amplitudes (those of the noise signal as well as A^{pt}) by the same factor k, the intensities I^{noise} and I^{pt} are changed by the same amount. Consequently, the difference between the new intensities is still given by ΔI. Likewise, the effective sound intensities are multiplied by the same factor, i.e., we still have J_{EH}^{noise} = J_{EH}^{pt}. Because the firing rate depends only on the value of J, this means that for the new intensities, the firing rates are also equal. It follows that whenever the intensities of the pure tone and the noise signal differ by ΔI, the firing rates for the two stimuli are the same. A thorough mathematical derivation of this concept, which also yields explicit expressions for the amount of the shifts for the energy and pressure hypotheses, can be found in the Appendix.

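The derivation shows that the predicted ΔI is given by:

$$\Delta I_{EH} = 10 \log_{10} \left( \frac{\sum_{n} A_{n}^{2}}{(C^{\text{pt}})^{2} \sum_{n} A_{n}^{2} / C_{n}^{2}} \right) \tag{10}$$

for the energy hypothesis and by:

$$\Delta I_{PH} = 10 \log_{10} \left( \frac{4 \sum_{n} A_{n}^{2}}{\pi \, (C^{\text{pt}})^{2} \sum_{n} A_{n}^{2} / C_{n}^{2}} \right) \tag{11}$$

for the pressure hypothesis. The two predictions for ΔI differ by −10 log_{10}(π/4) ≈ 1.05 dB. Because this is below our measurement accuracy, we do not use this experiment for distinguishing between the hypotheses, but rather as a test of the generality and the predictive power of the model per se.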

Evaluating Equations 10 and 11 is possible if one knows the filter constants and the A_{n}^{2} for the amplitudes in the noise signal. The latter are given by the power spectrum of the noise signal, which we calculated in discretized bins of width 0.05 kHz (using a triangular Bartlett window). Filter constants C_{n} were measured for pure tones between 5 and 10 kHz at every 0.2–1 kHz (depending on the length of the recording) by determining the amplitude that led to a firing rate of 260 Hz. Additional filter constants C_{n}, for all center frequencies of the power-spectrum bins, can be determined by linear interpolation from the measured filter constants.
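The evaluation for the energy hypothesis can be sketched as follows. The example assumes that the shift takes the form ΔI = 10 log_{10}[∑ A_{n}² / ((C^{pt})² ∑ A_{n}²/C_{n}²)], which results from equating the energy-based effective intensities of noise and pure tone; the function and argument names are hypothetical. The power-spectrum values only need to be proportional to A_{n}², because a common factor cancels in the ratio.

```python
import math

def interpolate(x, xs, ys):
    """Piecewise-linear interpolation; xs must be sorted in ascending order."""
    if x <= xs[0]:
        return ys[0]
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x <= x1:
            return y0 + (x - x0) * (y1 - y0) / (x1 - x0)
    return ys[-1]

def predicted_shift_eh(bin_freqs, bin_power, tone_freqs, tone_consts, c_pt):
    """Predicted shift (dB) between noise and pure-tone rate-intensity functions.

    bin_freqs, bin_power: center frequencies and power of the noise-spectrum bins.
    tone_freqs, tone_consts: frequencies and measured filter constants of pure tones.
    c_pt: filter constant at the frequency of the reference pure tone.
    """
    # Filter constant for every spectral bin by linear interpolation.
    consts = [interpolate(f, tone_freqs, tone_consts) for f in bin_freqs]
    total_power = sum(bin_power)
    filtered_power = sum(p / c ** 2 for p, c in zip(bin_power, consts))
    return 10.0 * math.log10(total_power / (c_pt ** 2 * filtered_power))
```

If the receptor is equally sensitive at all frequencies (all C_{n} equal to C^{pt}), the predicted shift vanishes; if it is half as sensitive throughout the noise band, the noise has to be about 6 dB more intense than the tone to evoke the same firing rate.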

The prediction for the noise-stimulus rate-intensity function that results from shifting the pure-tone rate-intensity function r^{pt}(I) by ΔI is compared with the measured curve r^{noise}(I). To do this quantitatively, the prediction of ΔI is related to the true shift ΔI_{true} that can be extracted from the measured rate-intensity functions of the pure tone and the noise signal as the distance between these two functions. Because the rate-intensity functions are given by individual pairs of intensity and firing rate, (I, r), we use the distance of such a data point of one rate-intensity function to the approximate location of the other rate-intensity function. For a data point (I^{pt}, r^{pt}) from pure-tone stimulation, e.g., we thus determine the intensity Î^{noise} that would be expected to lead to the same firing rate r^{pt}, but for the noise stimulus. The determination of Î^{noise} given the firing rate r^{pt} is again done by linear interpolation of the noise rate-intensity function as in Figure 1. We thus find for every intensity I_{i}^{pt} of the pure-tone rate-intensity function a corresponding Î_{i}^{noise}, and similarly for every intensity I_{j}^{noise} of the noise rate-intensity function a corresponding Î_{j}^{pt}. Because ideally, these should be related by Î_{i}^{noise} = I_{i}^{pt} + ΔI_{true} and Î_{j}^{pt} = I_{j}^{noise} − ΔI_{true}, we can estimate ΔI_{true} by minimizing the χ^{2} function:

$$\chi^{2} = \sum_{i} \frac{(\hat{I}_{i}^{\text{noise}} - I_{i}^{\text{pt}} - \Delta I_{\text{true}})^{2}}{\sigma_{i}^{2}} + \sum_{j} \frac{(\hat{I}_{j}^{\text{pt}} - I_{j}^{\text{noise}} + \Delta I_{\text{true}})^{2}}{\sigma_{j}^{2}} \tag{12}$$

where σ_{i} and σ_{j} denote the errors of the interpolated intensities. Because the subthreshold part and the saturation are not important in the determination of the actual shift, only data points (I, r) with r between 20 and 80% of the maximum firing rate of the cell were taken into account.
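The estimation of the measured shift can be illustrated with a simplified sketch. Two simplifications are assumed here and are not taken from the study: the inverse rate-intensity function is obtained by linear interpolation between the two neighboring data points (instead of a line fit through four points), and all interpolated intensities are given equal weight, in which case the minimum of the sum of squared deviations is simply the mean difference. Function names are hypothetical.

```python
def intensity_for_rate(rate, intensities, rates):
    """Intensity at which a rate-intensity function reaches the given rate, or None."""
    points = list(zip(intensities, rates))
    for (i0, r0), (i1, r1) in zip(points, points[1:]):
        if r0 != r1 and min(r0, r1) <= rate <= max(r0, r1):
            return i0 + (rate - r0) * (i1 - i0) / (r1 - r0)
    return None  # rate is not covered by the measured function

def estimate_shift(pure_tone, noise, max_rate):
    """Least-squares estimate of the shift (dB) for equally weighted data points.

    pure_tone, noise: lists of (intensity, rate) pairs sorted by intensity.
    """
    low, high = 0.2 * max_rate, 0.8 * max_rate   # exclude threshold and saturation
    pt_i, pt_r = zip(*pure_tone)
    noise_i, noise_r = zip(*noise)
    diffs = []
    for intensity, rate in pure_tone:
        if low <= rate <= high:
            est = intensity_for_rate(rate, noise_i, noise_r)
            if est is not None:
                diffs.append(est - intensity)     # interpolated noise minus tone
    for intensity, rate in noise:
        if low <= rate <= high:
            est = intensity_for_rate(rate, pt_i, pt_r)
            if est is not None:
                diffs.append(intensity - est)     # noise minus interpolated tone
    return sum(diffs) / len(diffs)
```

With individual errors for the interpolated intensities, the plain mean would be replaced by the correspondingly weighted mean.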

## RESULTS

The objective of this study is to develop a descriptive model for the responses of auditory receptor neurons to arbitrary stationary acoustic stimuli. This is done to identify the dominant physical stimulus property governing the encoding of sound intensity. We first develop a general mathematical framework for the transformation of the incoming sound into the neural response. Subsequently, we apply the model to locust auditory receptors and show that the experimental data are well described by only one of three rival hypotheses about the nature of the primary signal transduction.

### Derivation of the mathematical model

In locusts, auditory signals are encoded by 60–80 receptor neurons at each ear with similar general properties but considerable variability in the parameter values describing the sensitivity of individual neurons to specific sound frequencies (Römer, 1985). In response to a pure tone with sufficient intensity, the firing rate of a receptor cell increases in a sigmoidal fashion with stimulus intensity (Fig. 3*A*). The steepness and level of saturation of this rate-intensity function depend on the individual cell and temperature. Below a threshold intensity, there is no or only very low spontaneous activity. The regime between threshold and saturation usually spans ∼15–30 dB, and maximum firing rates lie at ∼300 Hz for room temperature and ∼500 Hz for 30°C.

The frequency-resolved sensitivity of the receptors can be characterized by a threshold curve, i.e., the dependence of the threshold on the sound frequency (Fig. 3*D*). The receptors are fairly broadly tuned with characteristic frequencies in the range of 4 kHz (low-frequency receptors) to 15 kHz (high-frequency receptors), and the absolute sensitivities vary strongly between individual neurons (Römer, 1985; Jacobs et al., 1999).

Measuring rate-intensity functions from a single receptor cell for many different sound frequencies reveals another property of the receptors: to good approximation, the rate-intensity functions are shifted versions of one another along the intensity axis, where intensity is measured in the logarithmic units of sound-pressure level, decibels SPL. This phenomenon has been reported previously by Suga (1960) and Römer (1976). A detailed example with frequencies spanning the whole sensitivity range of a typical low-frequency receptor (characteristic frequency of ≈5 kHz) can be seen in Figure 3*B*. The generic shape of the rate-intensity functions becomes even clearer if they are shifted relative to each other and aligned at 250 Hz firing rate (Fig. 3*C*). Figure 3*D* shows the threshold curve together with curves denoting the intensities that lead to firing rates of 150 and 300 Hz. As a consequence of the generic shape of the rate-intensity functions, all curves are approximately parallel to each other.

These key findings indicate that over the whole frequency range, the coupling of the physical stimulus is not substantially influenced by mechanical nonlinearities. In fact, a simple filtering mechanism captures the essence of the observed phenomenon. Let us assume that for all pure tones the firing rate is given by a single function r(A_{n}/C_{n}). A_{n} denotes the amplitude of a specific pure tone of frequency f_{n}, and C_{n} is a frequency-dependent filter constant such that the firing rate depends only on the ratio A_{n}/C_{n}. This corresponds to a gain factor of 1/C_{n} for each sound frequency. For two different frequencies f_{1} and f_{2}, the firing rates r(A_{1}/C_{1}) and r(A_{2}/C_{2}) are then the same when A_{1}/C_{1} = A_{2}/C_{2}, i.e., when the amplitudes take on a constant ratio A_{1}/A_{2} = C_{1}/C_{2}. Because the intensity I in decibels SPL is defined as a logarithmic measure of the amplitude, I = 20 log_{10}(A/(√2·20 μPa)), this constant amplitude ratio corresponds to a constant intensity difference, ΔI = I_{1} − I_{2} = 20 log_{10}(C_{1}/C_{2}). The firing rates for the two tones are therefore always the same if their intensities differ by ΔI. The rate-intensity functions are thus shifted versions of one another separated by ΔI as found in the experiment. Generalizing this idea to stimuli containing more than one frequency leads us to the first step of our model:

### Step 1: coupling to the stimulus

The sound pressure wave S(t), written as a Fourier series:
S(t) = Σ_{n} A_{n} sin(2πf_{n}t + ϕ_{n}) (Equation 13)
where the f_{n} denote the frequencies, ϕ_{n} the phase offsets, and A_{n} the respective amplitudes, is initially transformed into a filtered signal S̃(t):
S̃(t) = Σ_{n} (A_{n}/C_{n}) sin(2πf_{n}t + ϕ̃_{n}) (Equation 14)
The amplitudes are multiplied by frequency-dependent gain factors 1/C_{n}. These describe the frequency-resolved sensitivity, i.e., the tuning of the receptor cell, and correspond directly to the values of the threshold curve at the frequencies f_{n}. In addition, a putative phase shift turns ϕ_{n} into ϕ̃_{n}.

Although the above reasoning for using a linear filter as the first model step is based on electrophysiological observations only, it corresponds well with biophysical findings regarding the tympanal membrane. Schiolten et al. (1981) observed that the tympanal membrane behaves approximately as a linear oscillator with a short damping time constant of ∼100 μsec. The resonance properties of this oscillator are thought to be responsible for the frequency-resolved gain of the receptors and therefore also for the shapes of the threshold curves (Michelsen, 1971a, 1971b, 1979). Michelsen and Rohrseitz (1995) also note that the amplitude of the tympanal vibration depends linearly on the sound pressure for pure tones.

### Step 2: mechanosensory transduction

Receptor cells are attached to the tympanal membrane with a cilium protruding from the dendrite and several auxiliary cells surrounding a receptor (Gray, 1960). The biophysical functioning of this machinery is not yet understood, but oscillations of the tympanal membrane presumably lead to conductance changes in the receptors' dendrites that give rise to membrane depolarizations (Hill, 1983a, 1983b). This is where a spectral integration of frequency-dependent stimulus attributes must occur. Voltage fluctuations in the range of the relevant sound frequencies (several kilohertz) cannot be transmitted by the cell membrane because of its low-pass filter properties. Information about the spectral content is therefore lost at the level of the membrane potential, which, instead, is expected to correspond to an integrated stimulus property. The spectrum of the generator potential after acoustic stimulation is indeed found to contain no trace of the sound frequency used (Hill, 1983a).

Following ideas from the literature concerning temporal integration in auditory receptor cells (Tougaard, 1996; Heil and Neubauer, 2001), we set up three hypotheses for the spectral integration by calculating an “effective sound intensity” J from S̃(t).

#### Amplitude hypothesis (AH)

J corresponds to the maximum amplitude of S̃(t). This is the common view of a threshold: a response occurs once the signal reaches a certain value. In the case of few frequency components, J is given by the sum of the scaled amplitudes:
J = Σ_{n} A_{n}/C_{n} (Equation 15)

#### Energy hypothesis (EH)

J corresponds to the temporal mean of the squared signal [throughout what follows, 〈x(t)〉 denotes the temporal mean of x(t)]:
J = 〈S̃(t)^{2}〉 (Equation 16)
From Parseval's Theorem (Press et al., 1992), we see that this expression can be rewritten as the sum of the squares of the scaled amplitudes:
J = (1/2) Σ_{n} (A_{n}/C_{n})^{2} (Equation 17)
Because the square of the amplitude of a sinusoidal oscillation is proportional to the energy contained in the oscillation, this hypothesis reflects an energy-integration mechanism.

#### Pressure hypothesis (PH)

J corresponds to the temporal mean of the absolute value of S̃(t):
J = 〈|S̃(t)|〉 (Equation 18)
This hypothesis complies with a pressure-integration mechanism after half-wave rectification.
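To make the three candidate definitions of J concrete, the following Python sketch evaluates them numerically for a sampled filtered signal. It is only an illustration under stated assumptions: the function name, the sine convention for the Fourier components, the sampling rate, and the example amplitudes and filter constants are all hypothetical choices, not values from the experiments.

```python
import numpy as np

def effective_intensities(amps, consts, freqs, phases=None,
                          fs=200_000, duration=0.1):
    """Return J under the amplitude (AH), energy (EH), and pressure (PH)
    hypotheses for a superposition of pure tones."""
    amps, consts, freqs = (np.asarray(a, dtype=float)
                           for a in (amps, consts, freqs))
    phases = (np.zeros_like(freqs) if phases is None
              else np.asarray(phases, dtype=float))
    t = np.arange(int(round(fs * duration))) / fs
    scaled = amps / consts  # amplitudes after the linear filter (gain 1/C_n)
    s = np.sum(scaled[:, None]
               * np.sin(2 * np.pi * freqs[:, None] * t + phases[:, None]),
               axis=0)
    return {
        "AH": np.max(np.abs(s)),   # maximum amplitude of the filtered signal
        "EH": np.mean(s ** 2),     # temporal mean of the squared signal
        "PH": np.mean(np.abs(s)),  # temporal mean of the absolute value
    }

# Two tones at 4 and 9.55 kHz with arbitrary example values.
j = effective_intensities(amps=[0.3, 0.8], consts=[1.0, 2.0],
                          freqs=[4000.0, 9550.0])
# Parseval: the temporal mean square equals half the sum of squared
# scaled amplitudes.
parseval = 0.5 * np.sum((np.array([0.3, 0.8]) / np.array([1.0, 2.0])) ** 2)
single = effective_intensities(amps=[1.0], consts=[1.0], freqs=[4000.0])
```

In this example the numerically computed EH value agrees with the Parseval expression, and the AH value stays at or just below the sum of the scaled amplitudes.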

### Step 3: encoding by firing rates

The response of an auditory receptor to a signal of constant intensity can be characterized by a mean firing rate r. The rate is obtained from a one-dimensional, nonlinear transformation of the effective sound intensity J:
r = r(J) (Equation 19)
Note that the effective sound intensity J is a theoretical construct, which does not necessarily correspond to a biophysical property. It is used here to describe regions of constant firing rate in stimulus space because these correspond to regions of constant J. Therefore, instead of the specifically simple versions of J given above, we could also use any transformation J̃ = f(J) with some appropriate function f. This transformation does not affect the shape of the regions of constant J, but we can speculate that for the correct choice of f, J̃ has a direct biophysical interpretation, such as the change in membrane conductance caused by the stimulus.

Measured spike-train responses have a strong transient attributable to adaptation. In a first approach, we average over this temporal structure in the response and consider only the total number of spikes elicited by the stimulus. In a second, more detailed analysis, we analyze individual parts of the response to explicitly test how this structure in the spike trains might affect our model description.

### Electrophysiological experiments

#### Experimental strategy

To directly address the question of spectral integration and the hypotheses in step 2 of our model, we compare only stimuli that lead to the same firing rate of a given neuron. With this strategy, we circumvent complications attributable to the nonlinearity induced by the spike-generation mechanism. In terms of our model, a constant firing rate implies a constant effective sound intensity J and vice versa, independently of the specific shape of r(J). As a crucial element of our analysis, we therefore identify regions of constant J in stimulus space (A_{1}, A_{2}, …) by searching for stimulus combinations that result in the same firing rate. We denote these regions as iso-firing-rate regions. They are then compared with the predictions of the three hypotheses and reveal how J is composed of contributions from the single amplitudes. Because the rate-intensity functions are found to be fairly smooth in the rising part between threshold and saturation, extracting iso-firing-rate regions can be accurately done by linear interpolation as is shown in Figure 1.

#### Superpositions of two pure tones

The complete stimulus space of stationary stimuli is, of course, high dimensional. We thus started with low-dimensional subspaces using only two or three pure tones and their superpositions. In the two-dimensional subspace (A_{1}, A_{2}), each point represents a linear combination of two pure-tone signals at frequencies f_{1} and f_{2} (4 and 9.55 kHz):
Equation 20
For these stimuli, we determined combinations (A_{1}, A_{2}) that yield the same fixed firing rate. Figure 2 shows the shapes of the iso-firing-rate curves as predicted by the three hypotheses.
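The predicted shapes can be sketched numerically. The Python example below works in normalized coordinates x_n = A_n/C_n, scaled so that a pure tone on the iso-firing-rate curve has x = 1. The amplitude and energy hypotheses then give a straight line and a quarter ellipse in closed form, whereas the pressure-hypothesis curve is found by bisection. As an assumption of this sketch, the temporal mean is replaced by an average over two independent phases, which is reasonable for two tones whose frequencies are not in a simple ratio. All function names are hypothetical.

```python
import numpy as np

def iso_ah(x1):
    """Amplitude hypothesis: x1 + x2 = 1 (straight line)."""
    return 1.0 - x1

def iso_eh(x1):
    """Energy hypothesis: x1^2 + x2^2 = 1 (ellipse in A1, A2)."""
    return np.sqrt(1.0 - x1 ** 2)

def mean_abs(x1, x2, n=256):
    """Mean of |x1 sin(t1) + x2 sin(t2)| over independent phases t1, t2."""
    theta = 2 * np.pi * (np.arange(n) + 0.5) / n
    t1, t2 = np.meshgrid(theta, theta)
    return np.mean(np.abs(x1 * np.sin(t1) + x2 * np.sin(t2)))

def iso_ph(x1, n_iter=40):
    """Pressure hypothesis: solve mean_abs(x1, x2) = mean_abs(1, 0) for x2."""
    target = mean_abs(1.0, 0.0)
    lo, hi = 0.0, 1.0  # mean_abs is non-decreasing in x2 on this interval
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if mean_abs(x1, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x1 = 0.6
curve_points = {"AH": iso_ah(x1), "EH": iso_eh(x1), "PH": iso_ph(x1)}
```

For intermediate mixtures the three curves are ordered, with the amplitude-hypothesis line innermost and the pressure-hypothesis curve outermost, so data points in the middle of the plot discriminate best between the hypotheses.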

Responses to superpositions of two pure tones with stimulus duration of 100 msec were measured for 17 cells. Figure 4 depicts sets of amplitude combinations (A_{1}, A_{2}) that led to a firing rate of 150 Hz in each of the four cells presented. Fitted iso-firing-rate curves corresponding to the three hypotheses are also shown.

Performing a χ^{2} test on the fits of the three hypotheses showed that the amplitude hypothesis is rejected at the 1% level for all 17 cells, whereas the energy hypothesis is not rejected for any cell and the pressure hypothesis is rejected for 4 cells. For an in-depth analysis, we therefore considered only the energy and the pressure hypotheses.

To further distinguish between these two hypotheses, we directly compared the goodness of fit given by the χ^{2} values. The energy hypothesis yielded a lower χ^{2} than the pressure hypothesis in 16 of 17 cases. We also calculated a Bayesian estimate of the probability p(model | data) of the model given the data (with prior probabilities of 0.5 for both the energy and the pressure hypothesis). The mean of p(EH | data) was obtained as 0.884 with 0.167 SD and median 0.978 (N = 17), whereas p(PH | data) equals 1 − p(EH | data) and therefore had a mean of only 0.116.
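One simple way to turn two χ^{2} values into such a posterior probability is sketched below in Python. The details of the Bayesian estimate are not given in this section, so the sketch rests on explicit assumptions: Gaussian measurement errors with known variances, likelihoods evaluated at the best-fit parameters (so that each likelihood is proportional to exp(−χ^{2}/2)), and only the two competing models. The function name is hypothetical.

```python
import math

def posterior_eh(chi2_eh, chi2_ph, prior_eh=0.5):
    """Posterior probability of the energy hypothesis against the pressure
    hypothesis, assuming likelihoods proportional to exp(-chi2 / 2)."""
    log_odds = (-0.5 * (chi2_eh - chi2_ph)
                + math.log(prior_eh / (1.0 - prior_eh)))
    # Clamp to avoid overflow in exp for very large chi-square differences.
    log_odds = max(min(log_odds, 700.0), -700.0)
    return 1.0 / (1.0 + math.exp(-log_odds))

p_equal = posterior_eh(10.0, 10.0)   # equally good fits
p_better = posterior_eh(8.0, 12.0)   # energy hypothesis fits better
```

Under these assumptions, a χ^{2} difference of 4 in favor of the energy hypothesis already yields a posterior probability of about 0.88.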

Furthermore, data points for which A_{1}/C_{1} and A_{2}/C_{2} were approximately equal (i.e., data points in the middle sections of the plots in Fig. 4) were in general below the fitted iso-firing-rate curve of the pressure hypothesis instead of scattered around it as would be expected if the deviations resulted from independent measurement errors. We investigated this trend by a run test for those fits of the model that had at least 10 df (i.e., 12 data points). For these nine cells, the run test showed significant deviations (p < 0.01) from the pressure hypothesis in three cases. All three cells were different from those that had led to statistically significant deviations from the pressure hypothesis according to the χ^{2} test. For the energy hypothesis, such a trend was not observable.
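A run test of this kind can be sketched as follows. The Python example implements the Wald-Wolfowitz runs test on the signs of the fit residuals using the normal approximation. Whether the original analysis used this approximation or the exact distribution is not stated here, so this is an assumption, as is the function name. The residuals must be ordered along the iso-firing-rate curve, e.g., by the angle of the data point in the (A_{1}/C_{1}, A_{2}/C_{2}) plane.

```python
import math

def runs_test(residuals):
    """Wald-Wolfowitz runs test on the signs of ordered residuals.
    Returns (number of runs, z score, two-sided p value)."""
    signs = [r > 0 for r in residuals if r != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need residuals of both signs")
    # A new run starts wherever the sign changes.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    mu = 2.0 * n_pos * n_neg / n + 1.0
    var = (mu - 1.0) * (mu - 2.0) / (n - 1.0)
    if var <= 0:
        raise ValueError("too few data points for the normal approximation")
    z = (runs - mu) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return runs, z, p

# Twelve ordered residuals that fall systematically on one side of the
# fitted curve for half of the range and on the other side for the rest.
runs, z, p = runs_test([1.0] * 6 + [-1.0] * 6)
```

Too few runs (negative z) indicate a systematic trend in the residuals, as described above for the pressure hypothesis.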

From the combined evidence, we conclude that the amplitude as well as the pressure hypotheses can be rejected. The energy hypothesis, on the other hand, provides a good description of the data for spectral integration in the two-tone case.

The values obtained for the filter constants corresponding to the energy hypothesis can be read from the graphs in Figure 4 as the half-axes of the ellipses, i.e., as the intersection points of the ellipses with the two coordinate axes. Values range from ∼1 to 2000 mPa, corresponding to the large variability in overall sensitivity of the receptors. As stated in Materials and Methods, the filter constants are determined only up to a common factor. Their ratios are, however, a direct measure of the relative sensitivity for the two chosen sound frequencies. In our experiments, we found ratios of C_{1} and C_{2} of up to 30:1 (Fig. 4*B*), which means that spectral integration can be accurately determined even if the sensitivities for the two sound frequencies diverge by as much as 30 dB and possibly more.
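The correspondence between a ratio of filter constants and a sensitivity difference in decibels follows from the relation ΔI = 20 log_{10}(C_{1}/C_{2}) introduced in the model derivation. A short Python check, with hypothetical function names and example amplitudes, is:

```python
import math

P_REF = 20e-6  # reference pressure in Pa (20 micropascal)

def spl_db(amplitude_pa):
    """Intensity in dB SPL of a pure tone with peak amplitude in Pa."""
    return 20.0 * math.log10(amplitude_pa / (math.sqrt(2.0) * P_REF))

def sensitivity_difference_db(c1, c2):
    """Intensity difference in dB for a ratio c1/c2 of filter constants."""
    return 20.0 * math.log10(c1 / c2)

shift = sensitivity_difference_db(30.0, 1.0)
# Two tones with the same amplitude ratio differ by the same number of dB.
spl_gap = spl_db(0.03) - spl_db(0.001)
```

A ratio of 30:1 thus corresponds to a difference of about 29.5 dB.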

We also see from Figure 4 that the initial estimates of C_{1} and C_{2} (taken from the pure-tone rate-intensity functions; see Materials and Methods) are already very close to the values obtained by the fit of the energy hypothesis. The initial estimates for C_{1} and C_{2} are given by the data points on the coordinate axes and closely coincide with the intersection points of the ellipses. This shows that the filter constants measured with pure tones are approximately the same as those obtained from fitting the energy hypothesis to all data points.

As an additional test of the energy hypothesis, we investigated how iso-firing-rate curves that were obtained separately for different firing rates are related to one another. Figure 5 shows pairs (A_{1}, A_{2}) corresponding to several firing rates between 100 and 200 Hz. Pairs corresponding to the same firing rate are accurately fitted by ellipses. Each ellipse corresponds to an independent fit to the data points of the same firing rate. To good approximation, all ellipses are scaled versions of one another. This result is in accordance with the energy hypothesis, because the ratio of the half-axes of the ellipses should always equal the ratio of the filter constants C_{1} and C_{2}. Such a behavior was observed for all cells measured. For each cell, we determined the ratios R_{100} and R_{150} of half-axes of the ellipses corresponding to 100 and 150 Hz, respectively, and their relative deviations |(R_{150} − R_{100})/R_{150}|. We found that with a mean of 0.044 (SD 0.026), these were always small.

#### Analysis of specific response episodes

Up to now, we have disregarded the fact that the spike-train responses contain a pronounced transient attributable to adaptation. Typical spike trains from receptor cells for 300 msec pure-tone stimuli and the corresponding instantaneous firing rates can be seen in Figure 6, *A* and *E*. The transient usually spans approximately the first 40 msec but can last as long as 100 msec. Afterward, the cell has adapted to the sound intensity, and the response is approximately in a stationary steady state for the rest of the stimulus duration. When the stimulus ends, the receptor cells do not show an offset response, but stop firing or return to their usually low spontaneous activity. To investigate how the transients influence our model description, we explicitly analyzed the validity of the hypotheses for the onset as well as the steady-state response.

Spike trains from 10 cells were recorded with stimuli of either 300 msec (in 6 cases) or 500 msec duration (in 4 cases). The same analysis as before was applied to the onset by using only the first 30 msec of the response and to the steady state by disregarding the first 200 msec after stimulus onset. Two examples are shown in Figure 6. In each case, the data points are best fitted by the ellipses from the energy hypothesis. We again performed a statistical analysis of the goodness of fit. Longer stimulus durations resulted in fewer measurements per cell so that the data points often had larger experimental errors. This effect is even stronger for the analysis of the onset response, which relies on considerably shorter stretches of data. Nevertheless, the data from two cells during steady state deviated significantly (p < 0.05) from the pressure hypothesis, whereas the energy hypothesis always gave a good fit. Furthermore, the Bayesian test favored the energy hypothesis over the pressure hypothesis strongly for both onset and steady state [p(EH | data) for onset response: mean 0.642 with 0.100 SD, median 0.649; p(EH | data) for steady state: mean 0.795 with 0.131 SD, median 0.782; N = 10]. We conclude that the energy hypothesis yields an appropriate description also for the specific episodes of the response.

We may now use our description of spectral integration to investigate a possible dependence of the adaptation on the sound frequency. Adaptation mechanisms in mechanoreceptors have been identified for stimulus coupling (Eyzaguirre and Kuffler, 1955; Chapman et al., 1979), transduction (Ricci et al., 1998; Holt and Corey, 2000), and encoding (Matthews and Stein, 1969; Purali and Rydqvist, 1998). For insect mechanoreceptors, spike adaptation in the encoding stage often seems to be the dominant source (French, 1984a, 1984b). The fact that the time constants of adaptation depend strongly and systematically on the firing rate for locust auditory receptors also indicates that spike adaptation is an important mechanism (Benda, 2002). Because spike adaptation takes place after spectral integration, it is independent of the sound frequency. Our model description is not affected by such a frequency-independent adaptation as long as we focus on a fixed response episode. The number of spikes occurring during such an episode is still a function of the effective sound intensity, although the distribution of the spikes may display a certain structure within the response. For different response episodes, the decrease in the firing rate over time is simply reflected in an increase of the filter constants C_{1} and C_{2} by a common factor, which could also be absorbed in the function r(J). In particular, the ratios R = C_{1}/C_{2} of the filter constants for the onset, R_{O}, and the steady state, R_{S}, should be the same.

The development of the firing rates in Figure 6 shows that the transient parts of the response are generally similar and that they have approximately the same time constant independent of sound frequency. For the cell that is depicted in the *right column* of Figure 6, though, the firing rates for the two sound frequencies clearly differ in the first 30 msec. This indicates that, on short time scales, the adaptation dynamics can depend on sound frequency, which implies an adaptation mechanism within the coupling or transduction processes. For our model description, such a phenomenon results in a difference between R_{O} and R_{S}. This can be observed, e.g., in the *right column* of Figure 6, where the ellipses for the onset (Fig. 6*F*) and steady state (Fig. 6*G*) have different shapes.

We analyzed this effect quantitatively for the 10 investigated cells by determining the relative change ΔR = |R_{O} − R_{S}|/R_{S}. We found values of ΔR between 1 and 25%, which must be compared with the error measures for the values of R of ∼10%. Half of the cells had a ΔR value that was larger than their noise level. The cell depicted in the *right column* of Figure 6 showed the largest ΔR of the 10 cells. The total values of the filter constants C_{1} and C_{2}, on the other hand, change between onset and steady state by 10–50%, with error measures of 5–10%. We conclude that all cells that we analyzed were affected by adaptation and that in some cells, a small fraction of the adaptation phenomenon might be attributed to frequency-dependent mechanisms. These frequency-dependent effects are restricted to approximately the first 30 msec. Analyzing the time window from 40 to 70 msec after stimulus onset, e.g., gives very similar ratios of C_{1} and C_{2} as for the steady state. Consequently, the frequency-dependent changes are negligible for the model description of the average response to longer stimuli. For example, the area between the two firing rate curves in the first 30 msec of Figure 6*E* (*bottom*), which denotes the difference in spike count attributable to the frequency dependence, corresponds to only ∼2% of the total spike count. For the remaining part of this study, we therefore use the full responses to 100 msec stimuli, for which it is easier to collect a sufficient amount of data in the limited recording time.

#### Superpositions of three pure tones

To see whether the findings from the two-tone experiments generalize to sounds with more complex frequency spectra, responses to superpositions of three pure tones were analyzed. We applied the same approach as for the two-tone experiments with 100 msec stimuli and identified iso-firing-rate surfaces in the three-dimensional subspace (A_{1}, A_{2}, A_{3}). The three hypotheses yield predictions about these surfaces in the form of a plane (amplitude hypothesis), an ellipsoid (energy hypothesis), and a more strongly bent surface (pressure hypothesis) the exact shape of which has to be determined numerically.

Responses to superpositions of three pure tones were measured for eight cells. From the rate-intensity functions, we determined amplitude triplets corresponding to a firing rate of 150 Hz. Figure 7 illustrates the results for one cell and also shows the fitted ellipsoid corresponding to the iso-firing-rate surface of the energy hypothesis. We applied a χ^{2} test and found that the amplitude hypothesis is rejected at the 1% level for all eight cells, whereas the energy hypothesis is rejected for one cell and the pressure hypothesis is rejected for four cells. We again compared the fits for the energy and the pressure hypothesis in more detail. In all cases, the energy hypothesis gave a lower χ^{2} than the pressure hypothesis, and the Bayesian estimate of the probability of the model given the data again strongly favored the energy hypothesis over the pressure hypothesis [mean of p(EH | data) was 0.916 with 0.109 SD, median 0.987, N = 8]. Thus, spectral integration for three pure tones is also best described by the energy hypothesis, whereas the amplitude and the pressure hypothesis are rejected by the data.

#### Comparison of pure-tone and noise stimuli

So far we have found that the energy hypothesis describes spectral integration of mixtures of two and three pure tones. We now pose the question whether this hypothesis also applies to stimuli composed of many frequencies. In particular, we aim at predicting the response to a bandpass-filtered Gaussian white noise based on the knowledge of the filter constants C_{n} and a pure-tone rate-intensity function. Spike-train responses to the noise stimulus have the same structure as responses to pure-tone stimulation (data not shown). We therefore again focus on the firing rate and measure rate-intensity functions for the noise stimulus. Our model predicts that these should have the same shape as the pure-tone rate-intensity functions (see Materials and Methods). The expected distance ΔI between the two rate-intensity functions can be calculated if the filter constants and the power spectrum of the noise stimulus are known. The values for the energy hypothesis, ΔI_{EH}, and the pressure hypothesis, ΔI_{PH}, are given in Equations 10 and 11.
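How such a shift follows from the energy hypothesis can be illustrated with a short Python sketch. Equations 10 and 11 are not reproduced in this section, so the expression below is derived here directly from the energy hypothesis and should be read as an assumption-laden illustration: the noise is treated as a sum of discrete frequency components that carry fractions w_{n} of the total power, and the noise intensity is taken to be specified by the total rms pressure. Equating J for the noise and for a pure tone with filter constant C^{pt} then gives ΔI_{EH} = −10 log_{10}[(C^{pt})^{2} Σ_{n} w_{n}/C_{n}^{2}]. The function name is hypothetical.

```python
import numpy as np

def predicted_shift_eh(c_pt, c_n, power_weights=None):
    """Predicted shift (dB) of the noise rate-intensity function relative
    to the pure-tone one under the energy hypothesis.

    c_pt: filter constant at the pure-tone frequency.
    c_n: filter constants at the frequencies contained in the noise.
    power_weights: relative power per component (flat spectrum if None).
    """
    c_n = np.asarray(c_n, dtype=float)
    if power_weights is None:
        w = np.full(c_n.shape, 1.0 / c_n.size)
    else:
        w = np.asarray(power_weights, dtype=float)
        w = w / w.sum()
    return -10.0 * np.log10(c_pt ** 2 * np.sum(w / c_n ** 2))

# If the cell is half as sensitive (twice the filter constant) throughout
# the noise band, the noise curve is shifted to higher intensities.
shift_db = predicted_shift_eh(1.0, [2.0, 2.0, 2.0])
no_shift = predicted_shift_eh(1.0, [1.0, 1.0])
```

A positive value means that the noise stimulus requires a higher intensity than the pure tone to elicit the same firing rate.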

The pure-tone stimulus has a frequency of 4 kHz, and the noise stimulus is bandpass filtered between 5 and 10 kHz, a region in which many receptors are most sensitive. Rate-intensity functions for these two types of stimuli were measured for 10 cells. In addition, filter constants in the range of 5–10 kHz were determined independently by measuring the amplitudes of pure tones leading to a firing rate of 260 Hz. Figure 8 shows rate-intensity functions for the pure-tone as well as the noise stimulus together with the predictions that are obtained from shifting the pure-tone rate-intensity functions by ΔI_{EH}. In each case, the two measured rate-intensity functions are almost identical in shape, as expected from the model. Furthermore, the measured noise-stimulus rate-intensity function and the shifted pure-tone rate-intensity function coincide closely in most cases. Note that only results from pure-tone stimulation are used for the prediction of the noise-signal responses. To assess the results quantitatively, we calculated the deviation of ΔI_{EH} from the actual distance between the rate-intensity functions, ΔI_{true}, in each case. For the energy hypothesis, ΔI_{EH} − ΔI_{true} has a mean of −0.62 ± 0.68 dB (SE). The spread of these data (SD of 2.16 dB) corresponds to the expected measurement accuracy, which can be estimated to be ∼2 dB; the determination of C^{pt}, the collection of C_{n}, and the locations of both r^{pt}(I) and r^{noise}(I) all contribute independently with ∼1 dB error range. The pressure hypothesis yields ΔI_{PH} − ΔI_{true} with a mean of 0.43 ± 0.68 dB (SE) and is thus not ruled out by this experiment.

The results suggest that the description of spectral integration by the energy hypothesis, as derived from the two- and three-tone experiments, is also applicable to more complex stimuli. The model can be used for an accurate prediction of the location of the rate-intensity function after measuring the filter constants from pure-tone responses. The predictability of actual firing rates, however, is limited by the steepness of the rate-intensity functions. Because the range between threshold and saturation usually spans only ∼15–30 dB, small inaccuracies of a few decibels about the prediction of the shift between the rate-intensity functions have a strong effect on individual firing-rate predictions.

## DISCUSSION

Spectral integration is an important feature of auditory encoding and is closely connected to the mechanosensory transduction process. Our data show that the response of locust auditory receptor cells to stationary sound stimuli is determined by an “effective sound intensity” J that can be calculated from the stimulus spectrum and the sensitivity of the receptor at different sound frequencies. The sound-intensity coding of the receptor cells can thus be described in a three-step process. First, the tympanal membrane acts as a linear filter. The relevant characteristics of the filter can be determined using pure-tone stimuli by measuring which intensities correspond to a given firing rate of the receptor. Second, the effective sound intensity J is obtained by summing up all energies contained in the individual frequency components of the filtered signal. We believe that this summation reflects the dynamic properties of the mechanosensory transduction channels. Ultimately, a biophysical investigation is required to confirm this view. In a final step, the effective sound intensity is put through a nonlinear response function independent of the spectral contents of the original signal. The shape of this response function can be derived from the measurement of a single rate-intensity function with arbitrary but fixed spectral content.

Alternative hypotheses that compute J as the maximum amplitude or the integrated pressure are rejected by the analysis of the responses of the receptors to superpositions of two or three pure tones. Although the amplitude hypothesis can be clearly discarded, the energy and pressure hypotheses are more similar in their predictions regarding spectral integration. However, the combined evidence from several statistical investigations demonstrates that the pressure hypothesis fails in several single cases and that the energy hypothesis provides a far better description of the data.

Comparison between responses to pure tones and bandpass-filtered noise shows that the energy hypothesis also accounts for the responses to more complex stimuli. The model can therefore be used to accurately predict the rate-intensity functions for noise-like signals.

### Effects of stimulus onset and adaptation

A more detailed analysis reveals that the energy hypothesis also describes spectral integration for the onset and the steady-state responses individually. The model parameters depend on the response episode investigated. The main effect is a change in the filter constants by a common factor attributable to spike-dependent adaptation. For a generalization of the model to fluctuating stimuli, this dependence of the parameters on the adaptation state could be explicitly incorporated, e.g., by using the generic model of Benda et al. (2001) and Benda (2002). In some cells, changes in the ratio of the filter constants between onset and steady state suggest additional, although smaller, sound-frequency-dependent dynamics of the adaptation. Besides the dominant spike adaptation, there might thus be a second adaptation phenomenon, which occurs before spectral integration. It might be caused by either a mechanical effect of the vibrations of the tympanum and associated structures or a property of the transduction channels. Because the effect is generally small and restricted to the first 30 msec, it can be neglected for the description of the average response to longer stimuli, which was the focus of the present study.

### Conceptual framework

Combining the two concepts “spectral integration” and “iso-firing-rate regions” allowed us to rigorously compare different transduction models. A key ingredient in the experimental procedure was the systematic exploration of regions of constant activity under variation of the stimulus composition, in the present case the spectrum of a sound signal. Investigating such regions implies a change of the traditional perspective regarding neural input–output relations. Instead of asking what output is produced by a given input, one seeks to identify input ensembles that are associated with a fixed output. Reliable on-line analysis and automatic feedback to the stimulus generation are a central aspect of this approach. Based on increasingly available high-speed computing power, the method could be easily extended to identify more general invariant regions in auditory and other stimulus domains.

Our framework may be compared with the technique of silent substitution (Estévez and Spekreijse, 1982), in which the spectral composition of a visual stimulus is varied systematically such that the resulting stimuli always lead to the same activity of one (or more) receptor types in the retina. Fluctuations in visually evoked potentials can then be interpreted as being caused by the remaining receptors. In this case, however, the iso-activity regions of the receptors are not explored, but must be known accurately beforehand, and they are not compared with alternative model predictions.

### Comparison with studies of temporal integration in other auditory systems

Our results are consistent with the finding that temporal energy integration describes firing thresholds for double-click and intensity-duration trade-off experiments in receptor cells of moths (Tougaard, 1996, 1998). If this finding can also be confirmed for locust auditory receptors, spectral and temporal integration could be combined in a single simple model. Trade-offs between intensity and duration would then be expected to occur for stimuli on time scales of a few milliseconds, the apparent integration time of the receptors, which is well below the stimulus durations used in this study. In mammalian auditory nerve fibers, on the other hand, first-spike latencies correspond to the integrated pressure and not the energy (Heil and Neubauer, 2001). It is possible that this is caused by a fundamental difference in the transduction mechanisms of hair cells and insect auditory receptors. However, latency measurements reflect properties of the transduction as well as properties of additional dynamic processes, such as synaptic transmission, internal calcium dynamics, and spike generation. In this context, it should be noted that the latency in type I excitable membranes depends strongly and nonlinearly on the input strength (Hodgkin, 1948; Rinzel and Ermentrout, 1998; Izhikevich, 2000). This opens up the possibility that properties of the spike generator alter the effective input in such a way that energy integration is in accordance with the observed correspondence between latency and the temporal pressure integral. In fact, Ermentrout (1996) showed that in type I membranes, the firing rate r to a constant stimulus S above the firing threshold S* approximately obeys the square-root relation r(S) ∼ √(S − S*). For a simplified phase-integrator model (Hoppensteadt, 1997), the latency Δt is then given by the condition that the integral ∫ r(S(t)) dt, taken from stimulus onset to Δt, reaches a threshold value. According to the energy hypothesis, S is proportional to the square of the pressure amplitude A of a pure tone and in most cases large compared with S*. The square root then cancels the square, r ∼ √S ∼ A, resulting in the latency condition ∫ A(t)dt = const, the dominant component of the model proposed by Heil and Neubauer (2001). The above considerations may also explain the apparent discrepancy between the latency measurements and the fact that psycho-acoustic studies successfully apply energy-integration models (Garner, 1947; Plomp and Bouman, 1959; Zwislocki, 1965; Florentine et al., 1988). Further experiments are needed to test this explanation, however.
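The latency argument can be illustrated with a minimal numerical sketch. It assumes the energy hypothesis in the form S(t) = A(t)^2, a type I rate law r(S) = gain · √(S − S*), and a phase integrator that fires once ∫ r dt reaches a fixed value. The function name `first_spike`, all parameter values, and the linear onset ramps are hypothetical choices for illustration and are not taken from the experiments.

```python
import math

def first_spike(envelope, s_star=0.01, gain=1.0, phase_threshold=1.0,
                dt=1e-5, t_max=5.0):
    """Return (latency, pressure_integral) for a simple phase integrator.

    envelope: function t -> pressure amplitude A(t) of the tone.
    Assumed energy hypothesis: effective input S(t) = A(t)**2.
    Assumed type I rate law: r(S) = gain * sqrt(S - s_star) above threshold.
    """
    phase = 0.0      # running integral of r(S(t)) dt
    pressure = 0.0   # running integral of A(t) dt
    t = 0.0
    while t < t_max:
        a = envelope(t)
        s = a * a
        rate = gain * math.sqrt(s - s_star) if s > s_star else 0.0
        phase += rate * dt
        pressure += a * dt
        t += dt
        if phase >= phase_threshold:
            return t, pressure
    return math.inf, pressure

# Linear onset ramps with different slopes (arbitrary units per second).
results = {k: first_spike(lambda t, k=k: k * t) for k in (50.0, 100.0, 200.0)}
for k, (lat, p_int) in results.items():
    print(f"slope {k:6.1f}: latency {lat:.4f} s, pressure integral {p_int:.4f}")
```

In this sketch the latency shortens with steeper ramps, whereas the pressure integral at the first spike stays close to phase_threshold / gain as long as S is large compared with S*. Integrating the energy directly would not give a constant value across ramps.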

Response properties of hair cells and mammalian auditory nerve fibers are complicated by mechanical nonlinearities induced by the cochlea and a more intricate signal pathway than is the case in insect auditory systems. Nevertheless, measurements of basilar-membrane vibrations indicate that outside a region around the characteristic frequency, the stimulus coupling to mammalian auditory receptors occurs in an approximately linear fashion (Eguíluz et al., 2000; Ruggero et al., 2000). This suggests that a phenomenological study along the lines of the present investigation might also reveal interesting properties of the transduction process in hair cells.

### Implications for the locust auditory system

Practical implications of our results include a more reliable characterization of insect auditory receptor sensitivity by measuring the intensities necessary to provoke a given non-zero firing rate instead of the threshold curve. The latter is notoriously difficult to measure because the rate-intensity functions usually flatten out near the threshold and are corrupted by background activity (Michelsen, 1971c). At least in locust receptor cells, the threshold curve runs approximately parallel to any other curve of equal response, and a single additional rate-intensity function can determine the distance between the measured curve and the actual threshold curve. Furthermore, our results show that average responses of an auditory receptor to complex stimuli can be well predicted once the cell-specific effective sound intensity J has been measured. The resulting quantitative correspondence between the stimulus spectrum and the firing rate differs from the predictions of an earlier heuristic approach (Lang, 2000). Our result will thus be helpful for systematic investigations of the processing of natural communication signals, such as grasshopper calling songs (Machens et al., 2001).
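The suggested procedure for estimating a threshold curve from a curve of equal response might be sketched as follows. All numbers are invented for illustration. The threshold is assumed here to be the intensity at which a straight-line fit to the rising part of the rate-intensity function extrapolates to zero firing rate, which is only one possible definition. The helper names `fit_line` and `estimate_threshold_curve` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def estimate_threshold_curve(iso_db, target_rate, ri_db, ri_rate, min_rate=20.0):
    """Shift an iso-response curve down to the estimated threshold curve.

    iso_db: intensities (dB SPL) that evoke target_rate at each frequency.
    ri_db, ri_rate: one rate-intensity function measured at a single frequency.
    Only points with rate > min_rate enter the fit, to exclude the flat part
    near threshold that is dominated by background activity.
    """
    rising = [(i, r) for i, r in zip(ri_db, ri_rate) if r > min_rate]
    slope, _ = fit_line([i for i, _ in rising], [r for _, r in rising])
    # On a straight line, the distance between rate 0 and target_rate is:
    offset_db = target_rate / slope
    return [i - offset_db for i in iso_db], offset_db

# Invented example data: iso-response curve for a firing rate of 150 Hz.
iso_intensity_db = [62.0, 48.0, 55.0, 60.0, 71.0]
# Invented rate-intensity function at one of these frequencies.
ri_intensity_db = [30.0, 35.0, 40.0, 45.0, 50.0, 55.0]
ri_rate_hz = [2.0, 3.0, 40.0, 110.0, 180.0, 250.0]

threshold_db, offset_db = estimate_threshold_curve(
    iso_intensity_db, 150.0, ri_intensity_db, ri_rate_hz)
print(offset_db, threshold_db)
```

The sketch relies on the curves of equal response being parallel on the decibel scale, so a single offset applies at all frequencies.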

### Linear versus nonlinear models

The simplicity of our model, which is linear up to a final static nonlinearity, is consistent with the fact that previous studies have found no indications of dominant nonlinearities or active movement of the sensory cilia (cf. Eberl, 1999). Distortion-product otoacoustic emissions from locust ears indicate slight nonlinearities at the tympanal membrane, but only at ∼50 dB below the stimulating intensities (Kössl and Boyan, 1998). Many other auditory systems, on the other hand, are strongly affected by nonlinear mechanisms and active signal amplification leading to increased sensitivity and frequency resolution. This phenomenon is common in vertebrate ears (Fettiplace and Fuchs, 1999; Hudspeth et al., 2000) but has also been shown to exist in some insect auditory systems (Göpfert and Robert, 2001).

### Implications for other mechanosensory systems

It can be speculated that the nonlinearities mentioned above are additional features on top of the same underlying mechanosensory transduction process. Recent findings of structural and functional similarities between hair cells and the *Drosophila* sensory bristle as well as the discovery in *Drosophila* of homologs of mammalian genes related to hearing and deafness support this view and suggest that many aspects of mechanosensory transduction among insects and vertebrates are conserved (Adam et al., 1998; Bermingham et al., 1999; Eberl, 1999; Fritzsch et al., 2000; Walker et al., 2000; Gillespie and Walker, 2001). The energy hypothesis might thus be extended to account for spectral integration in other mechanosensory systems as well, possibly after modifications that take the system-specific nonlinearities explicitly into account.

Mechanosensory transduction is also involved in a wide range of other senses, including touch, proprioception, and the sense of balance. Unlike transduction mechanisms that involve second-messenger signaling suited for biochemical analysis, mechanosensory changes of the membrane conductance result from a direct coupling with the mechanical stimulus: stretch, compression of the cell, or deflection of associated processes or cilia (Corey and Hudspeth, 1979; Hudspeth, 1985; Hudspeth and Logothetis, 2000). This direct and fast electrophysiological response has so far resisted a detailed biophysical analysis (Gillespie, 1995). Our method of finding regions of constant neural response for varying spectral composition provides a novel approach for distinguishing between different hypotheses about receptor integration, sets quantitative constraints that any future biophysical model has to satisfy, and is applicable to a wide range of other (mechano)sensory systems.

## Calculation of the intensity shift between pure-tone and noise signals

We denote the effective sound intensities of the noise signal by J_{E}^{noise} and J_{P}^{noise} and the effective sound intensities of the pure-tone signal by J_{E}^{pt} and J_{P}^{pt} according to the energy and pressure hypotheses, respectively. The intensity in the decibel SPL scale is defined as:

Equation 21

with A_{0} = 20 μPa. For the noise signal:

Equation 22

the root-mean-square is obtained as:

Equation 23

which implies that the intensity is given by:

Equation 24

The effective sound intensity of the energy hypothesis, Equation 17, can thus be written as:

Equation 25

where the dependence on the intensity is given explicitly because the remaining term is invariant to intensity changes.

For J_{P}^{noise}, we note that the values of S̃(t) are distributed according to a Gaussian distribution with variance ς^{2} that is given by:

Equation 26

For a Gaussian distribution with SD ς, the mean of the absolute value can be calculated as ς√(2/π), and we therefore obtain from Equation 18:

Equation 27

Equivalently, we find for the pure-tone stimulus S^{pt}(t) = A^{pt} sin(2πft):

Equation 28

Equation 29

where C^{pt} denotes the filter constant for the pure tone. These latter relationships can be inverted to yield I^{pt} as a function of J^{pt}. Because equal J implies equal firing rate, we can then replace J^{pt} with J^{noise} to obtain the intensity of the pure tone that leads to the same firing rate as a given intensity of the noise signal:

Equation 30

Equation 31
From these formulas, we can directly read out ΔI for the two hypotheses by comparison with Equation 8.
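The comparison can be sketched numerically under stated assumptions. The noise is taken to be a sum of sinusoids with random phases, the filtered signal S̃(t) is obtained by dividing each component amplitude by its filter constant, and the effective sound intensities are taken as J = ⟨S̃^2⟩ for the energy hypothesis and J = ⟨|S̃|⟩ for the pressure hypothesis. These forms, the function names, and the example values are assumptions made for illustration, not a transcription of Equations 21 to 31.

```python
import math

A0 = 20e-6  # reference pressure in Pa (20 μPa)

def db_spl(rms_pressure):
    """Intensity in dB SPL from the root-mean-square pressure."""
    return 20.0 * math.log10(rms_pressure / A0)

def pure_tone_equivalent(amplitudes, filter_consts, c_pt):
    """Pure-tone intensities (dB SPL) giving the same J as the noise signal.

    amplitudes, filter_consts: amplitude and assumed filter constant of each
    sinusoidal noise component; c_pt: assumed filter constant of the pure tone.
    """
    # SD of the filtered noise; component powers add for random phases.
    sigma = math.sqrt(0.5 * sum((a / c) ** 2
                                for a, c in zip(amplitudes, filter_consts)))
    noise_rms = math.sqrt(0.5 * sum(a * a for a in amplitudes))

    # Energy: sigma**2 = (A_pt / c_pt)**2 / 2, so the tone rms is c_pt * sigma.
    rms_pt_energy = c_pt * sigma

    # Pressure: Gaussian mean |x| = sigma * sqrt(2/pi); sine mean |x| = 2A/pi.
    amp_pt_pressure = c_pt * sigma * math.sqrt(math.pi / 2.0)
    rms_pt_pressure = amp_pt_pressure / math.sqrt(2.0)

    return {"noise": db_spl(noise_rms),
            "energy": db_spl(rms_pt_energy),
            "pressure": db_spl(rms_pt_pressure)}

# Example: 50 equal components and a flat filter (all constants equal to 1).
result = pure_tone_equivalent([0.01] * 50, [1.0] * 50, 1.0)
print(result)
print("energy minus pressure (dB):", result["energy"] - result["pressure"])
```

Under these assumptions the two hypotheses differ by a fixed 20 log10(2/√π) ≈ 1.05 dB in the predicted pure-tone intensity, independent of the filter constants. The filter enters only through a term common to both hypotheses.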

## Footnotes

This work was supported by Boehringer Ingelheim Fonds (T.G.) and the Deutsche Forschungsgemeinschaft. We are grateful to Christian Machens and Martin Stemmler for fruitful discussions and Peter Heil, Matthias Hennig, and Rüdiger von der Heydt for valuable comments on this manuscript.

Correspondence should be addressed to Andreas V. M. Herz, Institute for Theoretical Biology, Department of Biology, Humboldt University, 10115 Berlin, Germany. E-mail: herz@itb.biologie.hu-berlin.de.

H. Schütze's present address: Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, MD 21218.

J. Benda's present address: Department of Physics, University of Ottawa, Ottawa, Ontario, Canada K1N 6N5.