Abstract
We investigate the transduction of sound stimuli into neural responses and focus on locust auditory receptor cells. As in other mechanosensory model systems, these neurons integrate acoustic inputs over a fairly broad frequency range. To test three alternative hypotheses about the nature of this spectral integration (amplitude, energy, pressure), we perform intracellular recordings while stimulating with superpositions of pure tones. On the basis of online data analysis and automatic feedback to the stimulus generator, we systematically explore regions in stimulus space that lead to the same level of neural activity. Focusing on such iso-firing-rate regions allows for a rigorous quantitative comparison of the electrophysiological data with predictions from the three hypotheses that is independent of nonlinearities induced by the spike dynamics. We find that the dependence of the firing rates of the receptors on the composition of the frequency spectrum can be well described by an energy-integrator model. This result holds at stimulus onset as well as for the steady-state response, including the case in which adaptation effects depend on the stimulus spectrum. Predictions of the model for the responses to bandpass-filtered noise stimuli are verified accurately. Together, our data suggest that the sound-intensity coding of the receptors can be understood as a three-step process, composed of a linear filter, a summation of the energy contributions in the frequency domain, and a firing-rate encoding of the resulting effective sound intensity. These findings set quantitative constraints for future biophysical models.
Key words: mechanosensory transduction; spectral integration; auditory receptor; hearing; sound intensity; energy; model; locust
Auditory receptor cells are commonly characterized by their responses to pure tones. For example, threshold curves characterize the minimum intensity needed to evoke a response as a function of the frequency of a pure tone; rate-intensity functions describe how the response depends on the tone's intensity. Natural signals, however, are only rarely restricted to single frequencies, and receptor cells often show a broad frequency tuning. Our understanding of auditory coding is thus not satisfactory as long as we do not know how the relative intensities of different frequencies contained in a sound signal are integrated by auditory receptors. Investigating this spectral integration also helps us to scrutinize basic principles of the mechanosensory transduction process.
In general, the response of the receptor could be any complicated, nonlinear function of the frequency spectrum. One may hope, however, that the underlying mechanism is simple enough to allow for a straightforward phenomenological description. One such way of combining different spectral contents would be the extraction of a single physical stimulus property. Its nature is intensely debated with respect to the question of temporal integration, i.e., how stimulus intensities are combined over time. Psychoacoustic measurements of intensity-duration trade-offs suggest that the stimulus energy is the crucial variable (Garner, 1947; Plomp and Bouman, 1959; Zwislocki, 1965; Florentine et al., 1988), while a recent investigation of first-spike latencies in mammalian auditory-nerve fibers finds the time-integrated pressure as the decisive stimulus attribute (Heil and Neubauer, 2001). In insect auditory receptors, the differences between thresholds for one- and two-click stimuli and intensity-duration trade-offs are consistent with temporal energy integration (Tougaard, 1996, 1998). Care must be taken, however, in the interpretation of these data because temporal integration also depends on the time course of several biophysical processes after the primary signal transduction such as internal calcium dynamics and spike generation.
Spectral integration, on the other hand, depends at least in insects almost exclusively on the mechanosensory transduction process; any fluctuations on the several kilohertz scale of relevant sound frequencies that were still present after the transduction (i.e., in the cell-membrane conductance) would be highly attenuated by the low-pass filter properties of the cell membrane (Koch, 1999). Looking at spectral integration instead of temporal integration therefore enables us to focus on the site of primary signal transduction.
For these reasons, we develop a descriptive model for the responses of auditory receptor neurons to stationary stimuli with arbitrary power spectrum. The model comprises three steps, which correspond to the coupling, the transduction, and the encoding of the primary signal (Eyzaguirre and Kuffler, 1955; French, 1992). Focusing on the locust auditory system, we investigate three alternative hypotheses about which stimulus property governs the transduction process: the maximum amplitude of the stimulus, the stimulus energy, and the average half-wave-rectified signal amplitude. To test the model framework and distinguish between the rival hypotheses, intracellular recordings from the axons of receptor cells are performed. Based on a systematic exploration of stimuli that cause identical neural responses, the recordings reveal how the individual spectral contributions are integrated into one effective sound intensity.
MATERIALS AND METHODS
Electrophysiology. All experiments were performed on adult Locusta migratoria. The tympanal auditory organ of these animals is located in the first abdominal segment. After decapitation, removal of the legs, wings, intestines, and the dorsal part of the thorax, the animal was waxed to a holder, and the metathoracic ganglion and auditory nerve were exposed. Action potentials from auditory receptor cells were recorded intracellularly in the auditory nerve with standard glass microelectrodes (borosilicate, GC100F-10; Harvard Apparatus Ltd., Edenbridge, UK) filled with a 1 M KCl solution (50–110 MΩ resistance). The signals were amplified (BRAMP-01; NPI Electronic, Tamm, Germany) and recorded by a data acquisition board (PCI-MIO-16E-1; National Instruments, München, Germany) with a sampling rate of 10 kHz. Detection of action potentials and generation of acoustic signals were controlled online by the custom-made Online Electrophysiology Laboratory (OEL) software. Stimuli were transmitted by the above-mentioned data acquisition board with a conversion rate of 100 kHz to the loudspeakers [Esotec D260, Dynaudio (Skanderborg, Denmark) on a DCA 450 amplifier (Denon Electronic GmbH, Ratingen, Germany)]. These were mounted at 30 cm distance on each side of the animal so that the incidence of sound-pressure waves was orthogonal to the body axis. Stimuli were played only by the loudspeaker ipsilateral to the recorded auditory nerve. The linearity of the loudspeakers for superpositions of multiple tones was verified by playing samples of the stimuli used in the experiments while recording the sound at the site of the animals with a high-precision microphone [40AC, G.R.A.S. Sound & Vibration (Vedbæk, Denmark) on a 2690 conditioning amplifier (Brüel & Kjær, Langen, Germany)]. During the experiments, animals were kept either at room temperature, which was ∼20°C, or at a constant temperature of 30°C.
No systematic trends regarding a possible temperature dependence of the studied phenomena were observed. All experiments were performed in a Faraday cage lined with sound-attenuating foam to reduce echoes. Recordings from 45 receptor cells stemming from 18 animals (with at most 4 cells from the same animal) were used in this study.
The experimental protocol complied with German law governing animal care.
Measurement of rate-intensity functions. In general, each sound stimulus was presented for a duration of 100 msec, separated by pauses of at least 400 msec. To investigate adaptation effects, control experiments with longer stimuli and pauses (300/500 msec or 500/750 msec) were performed. All of these stimuli are decidedly longer than typical integration times of insect auditory receptors (1–3 msec as determined by reverse correlation for locust auditory receptors; data not shown) (see also Tougaard, 1998). Responses were measured by the average firing rate, calculated as the total number of spikes divided by the stimulus length. Spikes were detected online and counted from stimulus onset until 20 msec beyond stimulus offset to include all spikes elicited by the stimulus. This is justified because the investigated cells show no or only very low spontaneous activity and no offset response. Spike trains from the control experiments were also used for offline analysis of specific response episodes.
Rate-intensity functions were determined in the following way. First, the stimulus was presented in steps of 5 dB between 20 and 100 dB sound pressure level (SPL) (for a definition, see Eq. 21 in the Appendix) to obtain the general shape of the rate-intensity function. These data were used to identify the intensity range that gave rise to firing rates between 50 and 250 Hz. Within this dynamic range of ∼10–15 dB, additional measurements in steps of 1 or 2 dB were performed, and these were repeated 4–10 times to yield average firing rates and their SDs.
Stimulus intensities corresponding to given firing rates were obtained by fitting a straight line through the four points closest to the desired firing rate as shown in Figure 1. Errors on these measurements follow from the errors of the fitted parameters according to the law of error propagation. Thresholds were determined by linear extrapolation to zero firing rate from data points with a low, but significant firing rate.
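This interpolation step is easily automated. The following Python sketch mirrors the procedure just described (the function name is ours, and error propagation on the fitted slope and offset is omitted for brevity):

```python
import numpy as np

def intensity_at_rate(intensities, rates, target_rate):
    """Estimate the stimulus intensity (dB SPL) that evokes `target_rate`
    by fitting a straight line through the four measured points whose
    firing rates are closest to the target and inverting the fit."""
    intensities = np.asarray(intensities, dtype=float)
    rates = np.asarray(rates, dtype=float)
    idx = np.argsort(np.abs(rates - target_rate))[:4]   # four closest points
    slope, offset = np.polyfit(intensities[idx], rates[idx], 1)
    return (target_rate - offset) / slope
```

Thresholds follow from the same routine with `target_rate` set to zero, applied to the lowest data points with significant firing rates.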
Superposition of pure tones. Measuring rate-intensity functions for pure tones allows one to understand how the firing rate r depends on the amplitude A of a single tone for a certain sound frequency, r = r(A). Investigating spectral integration amounts to asking whether this understanding can be extended to stimuli that contain multiple tones simultaneously. We therefore try to obtain a description of the firing rate r depending on the amplitudes A_{1}, A_{2}, … of the different frequency components of such stimuli, r = r(A_{1}, A_{2}, …).
In a first set of experiments, stimuli were sound-pressure waves S(t) consisting of two or three pure tones of amplitudes A_{n}, frequencies f_{n}, and phase offsets ϕ_{n}, n = 1, 2, 3: S(t) = ∑_{n} A_{n} sin(2πf_{n}t + ϕ_{n}).
Within the present approach, we are concerned only with the encoding of sound intensity and not with temporal aspects. We thus restricted our attention to stationary stimuli with constant envelope as described above. This is justified because the responses of locust auditory receptors do not phase lock to sound frequencies in the kilohertz range (Suga, 1960; Hill, 1983a).
The experiments were designed to identify, for individual receptors, sets of amplitude combinations (A_{1}, A_{2}) or (A_{1}, A_{2}, A_{3}), respectively, that result in the same firing rate. The recorded data were analyzed within a model framework, which includes explicit predictions about how these amplitude combinations should be related to each other. In Results, the model is systematically developed and discussed. Here, we present only its main aspects and cover technical issues as well as questions regarding the model's role within the data analysis.
In summary, we compute the average firing rate of a receptor cell in the following threestep process.
(1) The stimulus is a sound-pressure wave S(t), a superposition of pure tones with frequencies f_{n}, amplitudes A_{n}, and phase offsets ϕ_{n}, S(t) = ∑_{n} A_{n} sin(2πf_{n}t + ϕ_{n}).
(2) An effective sound intensity J is computed from the filtered signal S̃(t) = ∑_{n} (A_{n}/C_{n}) sin(2πf_{n}t + ϕ_{n}), where the C_{n} are frequency-dependent filter constants, according to one of the following three hypotheses: the amplitude hypothesis, J = ∑_{n} A_{n}/C_{n}; the energy hypothesis, J = 〈S̃^{2}(t)〉 = ½∑_{n}(A_{n}/C_{n})^{2}; or the pressure hypothesis, J = 〈|S̃(t)|〉.
(3) The average firing rate r is determined according to a single nonlinear function r(J).
Note that the effective sound intensity J as defined above is distinct from the physical sound intensity, commonly measured in decibels SPL (compare Eq. 21 in the Appendix), which we denote by I throughout the text. Whereas I measures the stimulus itself, J is a derived quantity that incorporates the filter constants C_{n} and therefore also reflects the sensitivity of the specific receptor cell. Furthermore, I is defined as a logarithmic measure (relative to a predefined reference intensity); J is not, which facilitates the notation.
Within the model framework, the filter constants C_{n} are determined only up to a common factor, which can be absorbed in the function r(J). In other words, the model remains unchanged if all C_{n} are multiplied by the same constant and r(J) is at the same time adjusted appropriately. It follows that one way to determine the C_{n} is to choose a fixed firing rate, find for each frequency f_{n} the amplitude Â_{n} that leads to this firing rate, and set C_{n} = Â_{n}.
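The three-step computation of the firing rate can be sketched as follows. The saturating form of r(J) and its parameters J0 and rmax are purely illustrative assumptions (in the experiments, r(J) is measured, not assumed); the pressure hypothesis is evaluated by a direct time average:

```python
import numpy as np

def effective_intensity(amps, C, hypothesis, freqs=None, phases=None):
    """Model step 2: combine the scaled amplitudes a_n = A_n/C_n into one
    effective sound intensity J under the chosen hypothesis."""
    a = np.asarray(amps, dtype=float) / np.asarray(C, dtype=float)
    if hypothesis == "amplitude":        # J = sum of scaled amplitudes
        return a.sum()
    if hypothesis == "energy":           # J = <S~^2(t)> = 0.5 * sum a_n^2
        return 0.5 * np.sum(a ** 2)
    if hypothesis == "pressure":         # J = <|S~(t)|>, via a time average
        t = np.linspace(0.0, 0.1, 200_000, endpoint=False)  # 100-msec window
        s = sum(an * np.sin(2.0 * np.pi * f * t + p)
                for an, f, p in zip(a, freqs, phases))
        return np.mean(np.abs(s))
    raise ValueError(hypothesis)

def firing_rate(J, J0=1.0, rmax=300.0):
    """Model step 3: a monotonic, saturating rate function r(J).  This
    particular shape and the parameters J0, rmax are assumptions made
    only for illustration."""
    return rmax * J / (J + J0)
```

Because the same r(J) is applied regardless of the hypothesis, equal J always implies equal firing rate, which is the property exploited by the iso-firing-rate analysis below.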
In the following description of the experimental procedure, we will for simplicity focus on the case of superpositions of two tones. The generalization of concepts and formulas to the three-tone case is straightforward.
The three alternative hypotheses result in different predictions about which combinations of amplitudes (A_{1}, A_{2}) are expected to lead to the same firing rate. Because the model implies that equal firing rate follows from equal effective sound intensity J (step 3), curves of constant firing rate can easily be calculated for each hypothesis by setting J constant in the equations of the second step in the model. These "iso-firing-rate curves" are shown in Figure 2. From the amplitude hypothesis, pairs (A_{1}, A_{2}) yielding the same firing rate are expected to lie on a straight line. Likewise, from the energy hypothesis, they are expected to lie on an ellipse. For the pressure hypothesis, they should fall on an even more strongly bent curve. The corresponding shape has to be computed numerically by solving the equation 〈|(A_{1}/C_{1}) sin(2πf_{1}t + ϕ_{1}) + (A_{2}/C_{2}) sin(2πf_{2}t + ϕ_{2})|〉 = J for constant J.
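The three curve shapes can be computed as sketched below, in scaled amplitudes a_i = A_i/C_i. For the pressure hypothesis, the temporal mean over two tones with incommensurate frequencies is written as an average over two independent phases and inverted by bisection; the grid size and the bisection bracket are arbitrary choices of this sketch:

```python
import numpy as np

def mean_abs(a1, a2, n=400):
    """<|a1 sin(u) + a2 sin(v)|> for two tones with incommensurate
    frequencies, expressed as an average over two independent phases."""
    u = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    v = u[:, None]
    return np.mean(np.abs(a1 * np.sin(u) + a2 * np.sin(v)))

def iso_curve(a1, J, hypothesis):
    """Scaled amplitude a2 = A2/C2 that keeps J constant for a given
    a1 = A1/C1 (a1 must stay below the curve's intercept)."""
    if hypothesis == "amplitude":        # straight line
        return J - a1
    if hypothesis == "energy":           # ellipse: a circle in scaled units
        return np.sqrt(2.0 * J - a1 ** 2)
    if hypothesis == "pressure":         # invert <|.|> = J by bisection
        lo, hi = 0.0, 10.0 * J
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if mean_abs(a1, mid) < J:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)
    raise ValueError(hypothesis)
```

For a single tone (a1 = 0), the pressure curve intercepts the axis at a2 = Jπ/2, since 〈|sin|〉 = 2/π.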
To relate these predictions to experimental results, we determined a set of amplitude combinations leading to the same average firing rate in the following way. We start by measuring a first rate-intensity function for a single pure tone with frequency f_{1}. From this rate-intensity function, we determine the amplitude Â_{1} that leads to a predefined target firing rate and use it as the initial estimate of the filter constant C_{1}. A second rate-intensity function, measured for a pure tone with frequency f_{2}, yields the initial estimate of C_{2} in the same way. Subsequently, rate-intensity functions are measured for superpositions of the two tones with fixed amplitude ratios k = A_{2}/A_{1}, and the amplitude combinations (A_{1}, A_{2}) that lead to the target firing rate are again obtained by linear interpolation.
A technical but important question is which ratios k should be used in the experiment. If the neuron is much more sensitive to one of the sound frequencies and if both amplitudes are comparable in size, i.e., A_{1} ≈ A_{2}, the response will be determined almost exclusively by the more effective sound frequency. To be most informative, the measurement should thus take the relative sensitivities into account. This is done by choosing k so that A_{1}/C_{1} and A_{2}/C_{2} are of the same order of magnitude, which assures that the effect of both tones is roughly the same. To do so, we use the estimates of C_{1} and C_{2} that have been obtained from the first two rate-intensity functions for the pure tones as explained above. In particular, the different ratios of A_{1} and A_{2} for subsequent measurements are selected online in such a way that after taking C_{1} and C_{2} into account, the directions along which the rate-intensity functions are measured are evenly spaced. The gray arrows in Figure 2 are such directions. Note that their even spacing depends on the scales of the axes given by C_{1} and C_{2}. The calculation that achieves this is as follows: choose angles α that are evenly spaced in the interval [0°, 90°] and use the relation for the slope ρ of a straight line in the scaled coordinates (A_{1}/C_{1}, A_{2}/C_{2}), ρ = tan α, so that the amplitude ratio follows as k = A_{2}/A_{1} = (C_{2}/C_{1})·tan α.
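This direction-selection step can be written compactly; assuming the relation ρ = tan α for the slope in the sensitivity-scaled plane, the ratio k follows directly (the number of directions is a free parameter of the sketch):

```python
import numpy as np

def amplitude_ratios(C1, C2, n_directions=5):
    """Amplitude ratios k = A2/A1 whose measurement directions are evenly
    spaced in the sensitivity-scaled plane (A1/C1, A2/C2): an angle alpha
    there corresponds to slope rho = tan(alpha), hence
    k = (C2/C1) * tan(alpha).  The endpoints alpha = 0 and 90 degrees are
    the two pure tones."""
    alphas = np.linspace(0.0, np.pi / 2.0, n_directions)
    return [float("inf") if np.isclose(a, np.pi / 2.0)
            else (C2 / C1) * np.tan(a) for a in alphas]
```

A 30-fold difference in sensitivity (C2/C1 = 30) thus simply rescales all finite ratios by 30, keeping the scaled directions evenly spread.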
In the offline analysis, the parameters C_{1} and C_{2} were determined anew by χ^{2} fits of each of the three curves in Figure 2 to the complete data set (A_{1}, A_{2}) of amplitude combinations that lead to the same firing rate.
A further technical detail concerns the choice of the fitting procedure. The procedure should treat A_{1} and A_{2} in a symmetric fashion, and it should not be affected by potentially large differences in the relative sensitivities for the two tones. This rules out, e.g., the simplest choice of regarding A_{2} as a function of A_{1} or vice versa. Instead, we normalized the amplitudes by the filter constants and looked at the radial distance of the data points, d_{i} = √[(A_{1}^{(i)}/C_{1})^{2} + (A_{2}^{(i)}/C_{2})^{2}].
Estimating C_{1} and C_{2} then corresponds to minimizing the χ^{2} function for the radial distance for each model m: χ_{m}^{2} = ∑_{i} [d_{i} − d_{i}^{(m)}]^{2}/σ_{i}^{2}, where d_{i}^{(m)} is the radial distance predicted by model m along the direction of data point i and σ_{i} denotes the measurement error of d_{i}.
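For the energy hypothesis, this fit can even be linearized: the condition that the scaled data points lie on a circle of unit radius (the free common factor of the filter constants being absorbed by this choice) is linear in 1/C_{1}^{2} and 1/C_{2}^{2}. The following unweighted least-squares sketch illustrates the idea; the actual analysis instead minimizes the error-weighted radial χ^{2}:

```python
import numpy as np

def fit_filter_constants_energy(A1, A2):
    """Fit C1, C2 for the energy hypothesis.  Iso-rate points should obey
    (A1/C1)^2 + (A2/C2)^2 = 1 (unit radius fixes the free common factor
    of the filter constants).  This condition is linear in x = 1/C1^2 and
    y = 1/C2^2, so an ordinary least-squares solve suffices."""
    M = np.column_stack([np.asarray(A1, dtype=float) ** 2,
                         np.asarray(A2, dtype=float) ** 2])
    coef, *_ = np.linalg.lstsq(M, np.ones(len(M)), rcond=None)
    return 1.0 / np.sqrt(coef[0]), 1.0 / np.sqrt(coef[1])
```

Because both amplitudes enter symmetrically, the result is unaffected by large sensitivity differences between the two tones.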
For the control experiments with stimulus lengths of 300 or 500 msec, the onset response and the steady-state response were analyzed individually. For the onset, only spikes in the first 30 msec after stimulus onset were taken into account; for the steady state, the first 200 msec of the response were disregarded. The control experiments were aimed at investigating the effect of adaptation on our model description. We therefore performed the same analysis as explained above on the firing rates obtained for the onset and the steady state and fitted the filter constants C_{1} and C_{2} separately in each case. In addition, we compared C_{1} and C_{2} as well as their ratio R = C_{1}/C_{2} for the onset with the respective values for the steady state. The relative change of R was computed from the onset value, R_{O}, and the steady-state value, R_{S}, as ΔR = |R_{O} − R_{S}|/R_{S}. For the total response, the ratio of C_{1} and C_{2} is denoted by R_{total}. To estimate the significance of changes in C_{1}, C_{2}, and R between the onset and the steady state, error measures for these parameters were computed for each cell individually by taking several nonoverlapping stretches of 30 msec during the steady state for the analysis, determining C_{1}, C_{2}, and R in each case, and computing the respective SDs.
Experiments with superpositions of three pure tones were performed and analyzed in the same way as the two-tone experiments. We first measured rate-intensity functions for each pure tone and from these obtained initial estimates of the respective filter constants C_{1}, C_{2}, and C_{3}. Subsequently, rate-intensity functions were measured along different directions in the three-dimensional stimulus space (A_{1}, A_{2}, A_{3}), chosen, as in the two-tone case, such that they were evenly spaced after scaling the amplitudes by the filter constants.
Statistical analysis. The χ^{2} values obtained from the fits were used to test the statistical significance of deviations of the data from the models by a standard χ^{2} test for each cell individually.
The Bayesian probability of a model given the data can be used as a measure for the preference of one hypothesis over another. It is calculated from Bayes' formula: p(m|data) = p(data|m)·p(m) / ∑_{m′} p(data|m′)·p(m′), where p(m) denotes the prior probability of model m and the sum runs over the competing models.
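With two models and a Gaussian-error likelihood p(data|m) ∝ exp(−χ_{m}^{2}/2) — an assumed likelihood form for this sketch — the posterior reduces to a simple expression:

```python
import numpy as np

def posterior_energy(chi2_eh, chi2_ph):
    """Posterior probability p(EH|data) for two rival models with equal
    priors p(EH) = p(PH) = 0.5 and Gaussian-error likelihoods
    p(data|m) proportional to exp(-chi2_m / 2)."""
    L_eh = np.exp(-0.5 * chi2_eh)
    L_ph = np.exp(-0.5 * chi2_ph)
    return L_eh / (L_eh + L_ph)
```

Equal χ^{2} values give a posterior of 0.5; a difference of a few units in χ^{2} already pushes the posterior close to 0 or 1.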
Trends in the data were tested for statistical significance by a standard run test (Barlow, 1989). For a given model, the data points were subdivided into those sequences of points that lie consecutively either above or below the model prediction, and the number of these sequences was tested for significant deviations from the null hypothesis of independently scattered data points around the model prediction.
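A minimal implementation of the run-test statistic (normal approximation for the z-score; for the small samples considered here, exact tail probabilities would be used instead):

```python
import math

def count_runs(signs):
    """Number of consecutive sequences of points lying on the same side
    of the model prediction."""
    runs = 1
    for a, b in zip(signs, signs[1:]):
        if a != b:
            runs += 1
    return runs

def run_test_z(residuals):
    """z-score of the observed number of runs under the null hypothesis
    of independently scattered residuals (normal approximation)."""
    signs = [r > 0 for r in residuals]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    mu = 2.0 * n_pos * n_neg / n + 1.0
    var = (2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n)
           / (n ** 2 * (n - 1.0)))
    return (count_runs(signs) - mu) / math.sqrt(var)
```

Systematic trends, such as data points lying consistently below the prediction in one section of the curve, show up as too few runs and hence a significantly negative z-score.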
Comparison of pure-tone and noise stimuli. In another set of experiments, we tested whether our understanding of spectral integration allows accurate predictions of firing rates for more complex stimuli and focused on bandpass-filtered noise. To calibrate the model for a specific receptor, we measured the rate-intensity function for a pure tone as well as a set of filter constants in the relevant frequency band of the noise stimulus. According to our model, we can use these pure-tone results to calculate a prediction for the rate-intensity function of the noise stimulus (see below). The filter constants are needed for the calculation of the effective sound intensity J for the noise stimulus (model step 2), and the pure-tone rate-intensity function is needed because it implicitly contains the information about the shape of the response function r(J) of model step 3. To assess the reliability of the prediction, the rate-intensity function for the noise stimulus was also measured experimentally. The particular noise stimulus that we used was Gaussian white noise, cut off at ±3 SDs and bandpass filtered between 5 and 10 kHz, and the frequency of the pure-tone stimulus was 4 kHz. Note that when the amplitude of a noise stimulus is varied, all amplitudes in the signal are scaled by a common factor.
The prediction for the rate-intensity function of the noise signal is obtained in the following way. According to our model, the rate-intensity function of the pure tone, r^{pt}(I), and the rate-intensity function of the noise stimulus, r^{noise}(I), should have the same shape and be related to each other by a shift ΔI along the decibel-intensity axis: r^{noise}(I) = r^{pt}(I − ΔI) (Equation 8).
Let us briefly describe the reason for the relation of Equation 8. For concreteness, we focus on the energy hypothesis; the amplitude and the pressure hypotheses can be dealt with in an analogous way. Consider an arbitrary sound signal S(t) composed of a set of pure tones with amplitudes A_{n}. From these, we can calculate the intensity, which is defined as I = 10 log_{10}(〈S^{2}(t)〉/p_{0}^{2}) = 10 log_{10}(∑_{n}A_{n}^{2}/(2p_{0}^{2})), where p_{0} denotes the reference pressure of the decibel SPL scale.
We now consider a noise stimulus with intensity I^{noise} and effective sound intensity J^{noise}, together with a pure tone whose amplitude A^{pt} is chosen such that its effective sound intensity matches, J^{pt} = J^{noise}. According to the model, the two stimuli then evoke the same firing rate, and the corresponding intensities differ by ΔI = I^{noise} − I^{pt}.
If we multiply all amplitudes, those of the noise signal as well as A^{pt}, by the same factor k, the intensities I^{noise} and I^{pt} are changed by the same amount. Consequently, the difference between the new intensities is still given by ΔI. Likewise, the effective sound intensities are multiplied by the same factor, i.e., we still have J^{pt} = J^{noise} and thus equal firing rates. The shift ΔI between the two rate-intensity functions is therefore the same at all firing rates, which yields the relation of Equation 8.
The derivation shows that the predicted ΔI is given by ΔI = I^{noise} − I^{pt}, evaluated for a noise stimulus and a pure tone with equal effective sound intensity. For the energy hypothesis, the matching pure-tone amplitude obeys (A^{pt})^{2} = C_{pt}^{2}∑_{n}(A_{n}/C_{n})^{2} (Eq. 10), and hence ΔI = 10 log_{10}[∑_{n}A_{n}^{2}/(A^{pt})^{2}] (Eq. 11).
Evaluating Equations 10 and 11 is possible if one knows the filter constants and the amplitudes A_{n} that make up the noise stimulus.
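This evaluation can be sketched directly from the noise amplitudes and the filter constants; the decibel reference pressure cancels in the difference. A sketch using the sign convention ΔI = I^{noise} − I^{pt}:

```python
import numpy as np

def predicted_shift_dB(A_noise, C_noise, C_pt):
    """Predicted shift Delta-I = I_noise - I_pt under the energy
    hypothesis.  A_noise: spectral amplitudes of the noise stimulus;
    C_noise: filter constants at those frequencies; C_pt: filter constant
    of the pure tone.  The matching pure-tone amplitude satisfies
    A_pt^2 = C_pt^2 * sum((A_n/C_n)^2); the reference pressure of the
    decibel scale cancels, and the result is independent of the overall
    stimulus level."""
    A = np.asarray(A_noise, dtype=float)
    C = np.asarray(C_noise, dtype=float)
    A_pt_sq = C_pt ** 2 * np.sum((A / C) ** 2)
    return 10.0 * np.log10(np.sum(A ** 2) / A_pt_sq)
```

Scaling all noise amplitudes by a common factor leaves the predicted shift unchanged, in line with the argument above.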
The prediction for the noise-stimulus rate-intensity function that results from shifting the pure-tone rate-intensity function r^{pt}(I) by ΔI is compared with the measured curve r^{noise}(I). To do this quantitatively, the prediction of ΔI is related to the true shift ΔI_{true} that can be extracted from the measured rate-intensity functions of the pure tone and the noise signal as the distance between these two functions. Because the rate-intensity functions are given by individual pairs of intensity and firing rate, (I, r), we use the distance of such a data point of one rate-intensity function to the approximate location of the other rate-intensity function. For a data point (I^{pt}, r^{pt}) from pure-tone stimulation, e.g., we thus determine the intensity Î^{noise} that would be expected to lead to the same firing rate r^{pt}, but for the noise stimulus. The determination of Î^{noise} given the firing rate r^{pt} is again done by linear interpolation of the noise rate-intensity function as in Figure 1. We thus find for every intensity I^{pt} a corresponding intensity Î^{noise}; ΔI_{true} is then obtained by averaging the differences Î^{noise} − I^{pt} over all data points (and analogously for data points from noise stimulation).
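Extracting the measured shift amounts to interpolation at matched firing rates; this sketch implements one direction of the symmetric procedure (pure-tone points mapped onto the noise curve):

```python
import numpy as np

def true_shift(I_pt, r_pt, I_noise, r_noise):
    """Measured shift between the pure-tone and the noise rate-intensity
    function: interpolate the noise curve at each pure-tone firing rate
    and average the intensity differences.  (np.interp requires the rates
    to be increasing, which holds between threshold and saturation.)"""
    I_hat = np.interp(r_pt, r_noise, I_noise)
    return float(np.mean(I_hat - I_pt))
```

Averaging the analogous differences obtained from the noise data points onto the pure-tone curve, as described above, symmetrizes the estimate.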
RESULTS
The objective of this study is to develop a descriptive model for the responses of auditory receptor neurons to arbitrary stationary acoustic stimuli. This is done to identify the dominant physical stimulus property governing the encoding of sound intensity. We first develop a general mathematical framework for the transformation of the incoming sound into the neural response. Subsequently, we apply the model to locust auditory receptors and show that the experimental data are well described by only one of three rival hypotheses about the nature of the primary signal transduction.
Derivation of the mathematical model
In locusts, auditory signals are encoded by 60–80 receptor neurons at each ear with similar general properties but considerable variability in the parameter values describing the sensitivity of individual neurons to specific sound frequencies (Römer, 1985). In response to a pure tone with sufficient intensity, the firing rate of a receptor cell increases in a sigmoidal fashion with stimulus intensity (Fig. 3A). The steepness and level of saturation of this rate-intensity function depend on the individual cell and temperature. Below a threshold intensity, there is no or only very low spontaneous activity. The regime between threshold and saturation usually spans ∼15–30 dB, and maximum firing rates lie at ∼300 Hz for room temperature and ∼500 Hz for 30°C.
The frequency-resolved sensitivity of the receptors can be characterized by a threshold curve, i.e., the dependence of the threshold on the sound frequency (Fig. 3D). The receptors are fairly broadly tuned with characteristic frequencies in the range of 4 kHz (low-frequency receptors) to 15 kHz (high-frequency receptors), and the absolute sensitivities vary strongly between individual neurons (Römer, 1985; Jacobs et al., 1999).
Measuring rate-intensity functions from a single receptor cell for many different sound frequencies reveals another property of the receptors: to good approximation, the rate-intensity functions are shifted versions of one another along the intensity axis, where intensity is measured in the logarithmic units of sound-pressure level, decibels SPL. This phenomenon has been reported previously by Suga (1960) and Römer (1976). A detailed example with frequencies spanning the whole sensitivity range of a typical low-frequency receptor (characteristic frequency of ≈5 kHz) can be seen in Figure 3B. The generic shape of the rate-intensity functions becomes even clearer if they are shifted relative to each other and aligned at 250 Hz firing rate (Fig. 3C). Figure 3D shows the threshold curve together with curves denoting the intensities that lead to firing rates of 150 and 300 Hz. As a consequence of the generic shape of the rate-intensity functions, all curves are approximately parallel to each other.
These key findings indicate that over the whole frequency range, the coupling of the physical stimulus is not substantially influenced by mechanical nonlinearities. In fact, a simple filtering mechanism captures the essence of the observed phenomenon. Let us assume that for all pure tones the firing rate is given by a single function r(A_{n}/C_{n}). A_{n} denotes the amplitude of a specific pure tone of frequency f_{n}, and C_{n} is a frequency-dependent filter constant such that the firing rate depends only on the ratio A_{n}/C_{n}. This corresponds to a gain factor of 1/C_{n} for each sound frequency. For two different frequencies f_{1} and f_{2}, the firing rates r(A_{1}/C_{1}) and r(A_{2}/C_{2}) are then the same when A_{1}/C_{1} = A_{2}/C_{2}, i.e., when the amplitudes take on a constant ratio A_{1}/A_{2} = C_{1}/C_{2}. Because the intensity I in decibels SPL is defined as a logarithmic measure of the amplitude, I = 20 log_{10}(A/A_{0}) with a fixed reference amplitude A_{0}, a constant amplitude ratio corresponds to a constant intensity difference of 20 log_{10}(C_{1}/C_{2}) dB, in agreement with the observed parallel shift of the rate-intensity functions.
Step 1: coupling to the stimulus
The sound-pressure wave S(t), written as a Fourier series, S(t) = ∑_{n} A_{n} sin(2πf_{n}t + ϕ_{n}), is passed through a linear filter with gain factors 1/C_{n}, yielding the filtered signal S̃(t) = ∑_{n} (A_{n}/C_{n}) sin(2πf_{n}t + ϕ_{n}).
Although the above reasoning for using a linear filter as the first model step is based on electrophysiological observations only, it corresponds well with biophysical findings regarding the tympanal membrane. Schiolten et al. (1981) observed that the tympanal membrane behaves approximately as a linear oscillator with a short damping time constant of ∼100 μsec. The resonance properties of this oscillator are thought to be responsible for the frequency-resolved gain of the receptors and therefore also for the shapes of the threshold curves (Michelsen, 1971a, 1971b, 1979). Michelsen and Rohrseitz (1995) also note that the amplitude of the tympanal vibration depends linearly on the sound pressure for pure tones.
Step 2: mechanosensory transduction
Receptor cells are attached to the tympanal membrane with a cilium protruding from the dendrite and several auxiliary cells surrounding a receptor (Gray, 1960). The biophysical functioning of this machinery is not yet understood, but oscillations of the tympanal membrane presumably lead to conductance changes in the receptors' dendrites that give rise to membrane depolarizations (Hill, 1983a, 1983b). This is where a spectral integration of frequency-dependent stimulus attributes must occur. Voltage fluctuations in the range of the relevant sound frequencies (several kilohertz) cannot be transmitted by the cell membrane because of its low-pass filter properties. Information about the spectral content is therefore lost at the level of the membrane potential, which, instead, is expected to correspond to an integrated stimulus property. The spectrum of the generator potential after acoustic stimulation is indeed found to contain no trace of the sound frequency used (Hill, 1983a).
Following ideas from the literature concerning temporal integration in auditory receptor cells (Tougaard, 1996; Heil and Neubauer, 2001), we set up three hypotheses for the spectral integration by calculating an “effective sound intensity” J from S̃(t).
Amplitude hypothesis (AH)
J corresponds to the maximum amplitude of S̃(t). This is the common view of a threshold: a response occurs once the signal reaches a certain value. In the case of few frequency components, J is given by the sum of the scaled amplitudes: J = ∑_{n} A_{n}/C_{n}.
Energy hypothesis (EH)
J corresponds to the temporal mean of the squared signal [throughout what follows, 〈x(t)〉 denotes the temporal mean of x(t)]: J = 〈S̃^{2}(t)〉 = ½∑_{n}(A_{n}/C_{n})^{2}.
Pressure hypothesis (PH)
J corresponds to the temporal mean of the absolute value of S̃(t): J = 〈|S̃(t)|〉.
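As a quick numerical sanity check of the energy hypothesis's closed form, the time average 〈S̃^{2}(t)〉 of a two-tone signal equals ½∑_{n}a_{n}^{2} independently of the phase offsets, consistent with restricting the analysis to stationary stimuli. A short sketch (the two tone frequencies are those used in the two-tone experiments; window length and sampling rate are arbitrary choices that contain integer numbers of cycles):

```python
import numpy as np

# 100-msec window at 2 MHz sampling; both tone frequencies complete an
# integer number of cycles here, so the time averages are exact
t = np.linspace(0.0, 0.1, 200_000, endpoint=False)

def filtered_signal(a, freqs, phases):
    """S~(t) for scaled amplitudes a_n = A_n/C_n."""
    return sum(an * np.sin(2.0 * np.pi * f * t + p)
               for an, f, p in zip(a, freqs, phases))

a, freqs = [1.0, 0.5], [4000.0, 9550.0]   # the two tones used experimentally
rng = np.random.default_rng(1)
energies = [np.mean(filtered_signal(a, freqs,
                                    rng.uniform(0.0, 2.0 * np.pi, 2)) ** 2)
            for _ in range(5)]
# every entry equals 0.5 * (1.0**2 + 0.5**2) = 0.625, whatever the phases
```

The cross terms between the two tones average to zero over the window, which is why the energy measure depends only on the amplitudes.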
Step 3: encoding by firing rates
The response of an auditory receptor to a signal of constant intensity can be characterized by a mean firing rate r. The rate is obtained from a onedimensional, nonlinear transformation of the effective sound intensity J:
Measured spiketrain responses have a strong transient attributable to adaptation. In a first approach, we average over this temporal structure in the response and consider only the total number of spikes elicited by the stimulus. In a second, more detailed analysis, we analyze individual parts of the response to explicitly test how this structure in the spike trains might affect our model description.
Electrophysiological experiments
Experimental strategy
To directly address the question of spectral integration and the hypotheses in step 2 of our model, we compare only stimuli that lead to the same firing rate of a given neuron. With this strategy, we circumvent complications attributable to the nonlinearity induced by the spike-generation mechanism. In terms of our model, a constant firing rate implies a constant effective sound intensity J and vice versa, independently of the specific shape of r(J). As a crucial element of our analysis, we therefore identify regions of constant J in stimulus space (A_{1}, A_{2}, …) by searching for stimulus combinations that result in the same firing rate. We denote these regions as iso-firing-rate regions. They are then compared with the predictions of the three hypotheses and reveal how J is composed of contributions from the single amplitudes. Because the rate-intensity functions are found to be fairly smooth in the rising part between threshold and saturation, extracting iso-firing-rate regions can be done accurately by linear interpolation as is shown in Figure 1.
Superpositions of two pure tones
The complete stimulus space of stationary stimuli is, of course, high dimensional. We thus started with low-dimensional subspaces using only two or three pure tones and their superpositions. In the two-dimensional subspace (A_{1}, A_{2}), each point represents a linear combination of two pure-tone signals at frequencies f_{1} and f_{2} (4 and 9.55 kHz): S(t) = A_{1} sin(2πf_{1}t + ϕ_{1}) + A_{2} sin(2πf_{2}t + ϕ_{2}).
Responses to superpositions of two pure tones with stimulus duration of 100 msec were measured for 17 cells. Figure 4 depicts sets of amplitude combinations (A_{1}, A_{2}) that led to a firing rate of 150 Hz in each of the four cells presented. Fitted iso-firing-rate curves corresponding to the three hypotheses are also shown.
Performing a χ^{2} test on the fits of the three hypotheses showed that the amplitude hypothesis is rejected at the 1% level for all 17 cells, whereas the energy hypothesis is not rejected for any cell and the pressure hypothesis is rejected for 4 cells. For an in-depth analysis, we therefore considered only the energy and the pressure hypotheses.
To further distinguish between these two hypotheses, we directly compared the goodness of fit given by the χ^{2} values. The energy hypothesis yielded a lower χ^{2} than the pressure hypothesis in 16 of 17 cases. We also calculated a Bayesian estimate of the probability p(model|data) of the model given the data (with prior probabilities of 0.5 for both the energy and the pressure hypothesis). The mean of p(EH|data) was obtained as 0.884 with 0.167 SD and median 0.978 (N = 17), whereas p(PH|data) equals 1 − p(EH|data) and therefore had a mean of only 0.116.
Furthermore, data points for which A_{1}/C_{1} and A_{2}/C_{2} were approximately equal (i.e., data points in the middle sections of the plots in Fig. 4) were in general below the fitted iso-firing-rate curve of the pressure hypothesis instead of scattered around it as would be expected if the deviations resulted from independent measurement errors. We investigated this trend by a run test for those fits of the model that had at least 10 df (i.e., 12 data points). For these nine cells, the run test showed significant deviations (p < 0.01) from the pressure hypothesis in three cases. All three cells were different from those that had led to statistically significant deviations from the pressure hypothesis according to the χ^{2} test. For the energy hypothesis, such a trend was not observable.
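Such a trend analysis can be sketched as a run test on the signs of the residuals around the fitted curve; this is a minimal Wald-Wolfowitz version with a normal approximation and a one-sided p-value for too few runs (function name is our own):

```python
import math

def runs_test_p(signs):
    """One-sided runs test on residual signs (+1/-1): a small p-value
    indicates too few runs, i.e., residuals clustering on one side of
    the fitted curve rather than scattering independently."""
    n_pos = sum(1 for s in signs if s > 0)
    n_neg = len(signs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("residuals must fall on both sides of the fit")
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a * b < 0)
    n = n_pos + n_neg
    mu = 2.0 * n_pos * n_neg / n + 1.0
    var = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n ** 2 * (n - 1))
    z = (runs - mu) / math.sqrt(var)
    # Normal-approximation probability of this few or fewer runs
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Residuals that all cluster below the curve in the middle of the data produce few runs and a small p-value, as described for the pressure hypothesis above.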
From the combined evidence, we conclude that the amplitude as well as the pressure hypotheses can be rejected. The energy hypothesis, on the other hand, provides a good description of the data for spectral integration in the two-tone case.
The values obtained for the filter constants corresponding to the energy hypothesis can be read from the graphs in Figure 4 as the half-axes of the ellipses, i.e., as the intersection points of the ellipses with the two coordinate axes. Values range from ∼1 to 2000 mPa, corresponding to the large variability in overall sensitivity of the receptors. As stated in Materials and Methods, the filter constants are determined only up to a common factor. Their ratios are, however, a direct measure of the relative sensitivity for the two chosen sound frequencies. In our experiments, we found ratios of C_{1} and C_{2} of up to 30:1 (Fig. 4 B), which means that spectral integration can be accurately determined even if the sensitivities for the two sound frequencies diverge by as much as 30 dB and possibly more.
We also see from Figure 4 that the initial estimates of C_{1} and C_{2} (taken from the pure-tone rate-intensity functions; see Materials and Methods) are already very close to the values obtained by the fit of the energy hypothesis. The initial estimates for C_{1} and C_{2} are given by the data points on the coordinate axes and closely coincide with the intersection points of the ellipses. This shows that the filter constants measured with pure tones are approximately the same as those obtained from fitting the energy hypothesis to all data points.
As an additional test of the energy hypothesis, we investigated how iso-firing-rate curves that were obtained separately for different firing rates are related to one another. Figure 5 shows pairs (A_{1}, A_{2}) corresponding to several firing rates between 100 and 200 Hz. Pairs corresponding to the same firing rate are accurately fitted by ellipses. Each ellipse corresponds to an independent fit to the data points of the same firing rate. To good approximation, all ellipses are scaled versions of one another. This result is in accordance with the energy hypothesis, because the ratio of the half-axes of the ellipses should always equal the ratio of the filter constants C_{1} and C_{2}. Such a behavior was observed for all cells measured. For each cell, we determined the ratios R_{100} and R_{150} of half-axes of the ellipses corresponding to 100 and 150 Hz, respectively, and their relative deviations |(R_{150} − R_{100})/R_{150}|. We found that with a mean of 0.044 (SD 0.026), these were always small.
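The ellipse fit behind this consistency check can be sketched as a linear least-squares problem in the squared amplitudes, since the energy-hypothesis curve (A_{1}/C_{1})² + (A_{2}/C_{2})² = 1 is linear in 1/C_{1}² and 1/C_{2}² (function name and the synthetic data in the test are our own):

```python
import numpy as np

def fit_ellipse_halfaxes(a1, a2):
    """Fit (A1/C1)^2 + (A2/C2)^2 = 1 to iso-firing-rate amplitude
    pairs by linear least squares in the squared amplitudes.
    Returns the half-axes (C1, C2) of the fitted ellipse."""
    M = np.column_stack([np.square(a1), np.square(a2)])
    coef, *_ = np.linalg.lstsq(M, np.ones(len(M)), rcond=None)
    return 1.0 / np.sqrt(coef[0]), 1.0 / np.sqrt(coef[1])
```

Applying the fit separately to the point sets of two firing rates and forming |(R_{150} − R_{100})/R_{150}| from the half-axis ratios reproduces the scaling check described above.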
Analysis of specific response episodes
Up to now, we have disregarded the fact that the spike-train responses contain a pronounced transient attributable to adaptation. Typical spike trains from receptor cells for 300 msec pure-tone stimuli and the corresponding instantaneous firing rates can be seen in Figure 6, A and E. The transient usually spans approximately the first 40 msec but can last as long as 100 msec. Afterward, the cell has adapted to the sound intensity, and the response is approximately in a stationary steady state for the rest of the stimulus duration. When the stimulus ends, the receptor cells do not show an offset response, but stop firing or return to their usually low spontaneous activity. To investigate how the transients influence our model description, we explicitly analyzed the validity of the hypotheses for the onset as well as the steady-state response.
Spike trains from 10 cells were recorded with stimuli of either 300 msec (in 6 cases) or 500 msec duration (in 4 cases). The same analysis as before was applied to the onset by using only the first 30 msec of the response and to the steady state by disregarding the first 200 msec after stimulus onset. Two examples are shown in Figure 6. In each case, the data points are best fitted by the ellipses from the energy hypothesis. We again performed a statistical analysis of the goodness of fit. Longer stimulus durations resulted in fewer measurements per cell so that the data points often had larger experimental errors. This effect is even stronger for the analysis of the onset response, which relies on considerably shorter stretches of data. Nevertheless, the data from two cells during steady state deviated significantly (p < 0.05) from the pressure hypothesis, whereas the energy hypothesis always gave a good fit. Furthermore, the Bayesian test favored the energy hypothesis over the pressure hypothesis strongly for both onset and steady state [p(EH|data) for onset response: mean 0.642 with 0.100 SD, median 0.649; p(EH|data) for steady state: mean 0.795 with 0.131 SD, median 0.782; N = 10]. We conclude that the energy hypothesis yields an appropriate description also for the specific episodes of the response.
We may now use our description of spectral integration to investigate a possible dependence of the adaptation on the sound frequency. Adaptation mechanisms in mechanoreceptors have been identified for stimulus coupling (Eyzaguirre and Kuffler, 1955; Chapman et al., 1979), transduction (Ricci et al., 1998; Holt and Corey, 2000), and encoding (Matthews and Stein, 1969; Purali and Rydqvist, 1998). For insect mechanoreceptors, spike adaptation in the encoding stage often seems to be the dominant source (French, 1984a, 1984b). The fact that the time constants of adaptation depend strongly and systematically on the firing rate for locust auditory receptors also indicates that spike adaptation is an important mechanism (Benda, 2002). Because spike adaptation takes place after spectral integration, it is independent of the sound frequency. Our model description is not affected by such a frequency-independent adaptation as long as we focus on a fixed response episode. The number of spikes occurring during such an episode is still a function of the effective sound intensity, although the distribution of the spikes may display a certain structure within the response. For different response episodes, the decrease in the firing rate over time is simply reflected in an increase of the filter constants C_{1} and C_{2} by a common factor, which could also be absorbed in the function r(J). In particular, the ratios R = C_{1}/C_{2} of the filter constants for the onset, R_{O}, and the steady state, R_{S}, should be the same.
The development of the firing rates in Figure 6 shows that the transient parts of the response are generally similar and that they have approximately the same time constant independent of sound frequency. For the cell that is depicted in the right column of Figure 6, though, the firing rates for the two sound frequencies clearly differ in the first 30 msec. This indicates that, on short time scales, the adaptation dynamics can depend on sound frequency, which implies an adaptation mechanism within the coupling or transduction processes. For our model description, such a phenomenon results in a difference between R_{O} and R_{S}. This can be observed, e.g., in the right column of Figure 6, where the ellipses for the onset (Fig. 6 F) and steady state (Fig. 6 G) have different shapes.
We analyzed this effect quantitatively for the 10 investigated cells by determining the relative change ΔR = |R_{O} − R_{S}|/R_{S}. We found values of ΔR between 1 and 25%, which must be compared with the error measures for the values of R of ∼10%. Half of the cells had a ΔR value that was larger than their noise level. The cell depicted in the right column of Figure 6 showed the largest ΔR of the 10 cells. The total values of the filter constants C_{1} and C_{2}, on the other hand, change between onset and steady state by 10–50%, with error measures of 5–10%. We conclude that all cells that we analyzed were affected by adaptation and that in some cells, a small fraction of the adaptation phenomenon might be attributed to frequency-dependent mechanisms. These frequency-dependent effects are restricted to approximately the first 30 msec. Analyzing the time window from 40 to 70 msec after stimulus onset, e.g., gives very similar ratios of C_{1} and C_{2} as for the steady state. Consequently, the frequency-dependent changes are negligible for the model description of the average response to longer stimuli. For example, the area between the two firing rate curves in the first 30 msec of Figure 6 E (bottom), which denotes the difference in spike count attributable to the frequency dependence, corresponds to only ∼2% of the total spike count. For the remaining part of this study, we therefore use the full responses to 100 msec stimuli, for which it is easier to collect a sufficient amount of data in the limited recording time.
Superpositions of three pure tones
To see whether the findings from the two-tone experiments generalize to sounds with more complex frequency spectra, responses to superpositions of three pure tones were analyzed. We applied the same approach as for the two-tone experiments with 100 msec stimuli and identified iso-firing-rate surfaces in the three-dimensional subspace (A_{1}, A_{2}, A_{3}). The three hypotheses yield predictions about these surfaces in the form of a plane (amplitude hypothesis), an ellipsoid (energy hypothesis), and a more strongly bent surface (pressure hypothesis), the exact shape of which has to be determined numerically.
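The three candidate rules can be sketched for an arbitrary number of tones as follows. The normalizations here are illustrative assumptions, not the paper's equations; only the qualitative iso-surface shapes (plane, ellipsoid, numerically determined surface) follow the text, and the pressure case is evaluated numerically as the text notes:

```python
import numpy as np

def effective_intensity(amps, consts, freqs, hypothesis, fs=200000, dur=0.05):
    """Effective sound intensity J for filtered tone amplitudes
    x_n = A_n / C_n under the three candidate integration rules."""
    x = np.asarray(amps, float) / np.asarray(consts, float)
    if hypothesis == "amplitude":
        return x.sum()                # linear sum -> planar iso-surface
    if hypothesis == "energy":
        return (x ** 2).sum()         # summed energy -> ellipsoidal iso-surface
    if hypothesis == "pressure":
        # time-averaged rectified pressure of the filtered superposition
        t = np.arange(int(dur * fs)) / fs
        s = sum(xi * np.sin(2 * np.pi * f * t) for xi, f in zip(x, freqs))
        return np.abs(s).mean()
    raise ValueError("unknown hypothesis: %s" % hypothesis)
```

Holding J fixed and solving for the amplitude combinations traces out the predicted iso-firing-rate surface for each rule.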
Responses to superpositions of three pure tones were measured for eight cells. From the rate-intensity functions, we determined amplitude triplets corresponding to a firing rate of 150 Hz. Figure 7 illustrates the results for one cell and also shows the fitted ellipsoid corresponding to the iso-firing-rate surface of the energy hypothesis. We applied a χ^{2} test and found that the amplitude hypothesis is rejected at the 1% level for all eight cells, whereas the energy hypothesis is rejected for one cell and the pressure hypothesis is rejected for four cells. We again compared the fits for the energy and the pressure hypothesis in more detail. In all cases, the energy hypothesis gave a lower χ^{2} than the pressure hypothesis, and the Bayesian estimate of the probability of the model given the data again strongly favored the energy hypothesis over the pressure hypothesis [mean of p(EH|data) was 0.916 with 0.109 SD, median 0.987, N = 8]. Thus, spectral integration for three pure tones is also best described by the energy hypothesis, whereas the amplitude and the pressure hypothesis are rejected by the data.
Comparison of pure-tone and noise stimuli
So far we have found that the energy hypothesis describes spectral integration of mixtures of two and three pure tones. We now pose the question whether this hypothesis also applies to stimuli composed of many frequencies. In particular, we aim at predicting the response to a band-pass-filtered Gaussian white noise based on the knowledge of the filter constants C_{n} and a pure-tone rate-intensity function. Spike-train responses to the noise stimulus have the same structure as responses to pure-tone stimulation (data not shown). We therefore again focus on the firing rate and measure rate-intensity functions for the noise stimulus. Our model predicts that these should have the same shape as the pure-tone rate-intensity functions (see Materials and Methods). The expected distance ΔI between the two rate-intensity functions can be calculated if the filter constants and the power spectrum of the noise stimulus are known. The values for the energy hypothesis, ΔI_{EH}, and the pressure hypothesis, ΔI_{PH}, are given in Equations 10 and 11.
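The idea behind the energy-hypothesis prediction can be sketched as follows; the exact form of Equations 10 and 11 is not reproduced here, so the formula below is an illustrative reconstruction in which `power_fracs` are the normalized power fractions of the noise band and all names are our own:

```python
import math

def delta_i_energy(c_pt, band_consts, power_fracs):
    """Illustrative intensity shift (dB) between the noise and
    pure-tone rate-intensity functions under the energy hypothesis:
    a noise signal of the same overall level, with power fractions
    w_n at band frequencies having filter constants C_n, produces
    the same effective intensity J as a pure tone (filter constant
    c_pt) whose level is shifted by this amount."""
    s = sum(w / c ** 2 for w, c in zip(power_fracs, band_consts))
    return 10.0 * math.log10(c_pt ** 2 * s)
```

If the band frequencies are exactly as sensitive as the pure-tone frequency, the predicted shift vanishes; less sensitive band frequencies (larger C_{n}) shift the noise rate-intensity function toward higher intensities.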
The pure-tone stimulus has a frequency of 4 kHz, and the noise stimulus is band-pass filtered between 5 and 10 kHz, a region in which many receptors are most sensitive. Rate-intensity functions for these two types of stimuli were measured for 10 cells. In addition, filter constants in the range of 5–10 kHz were determined independently by measuring the amplitudes of pure tones leading to a firing rate of 260 Hz. Figure 8 shows rate-intensity functions for the pure-tone as well as the noise stimulus together with the predictions that are obtained from shifting the pure-tone rate-intensity functions by ΔI_{EH}. In each case, the two measured rate-intensity functions are almost identical in shape, as expected from the model. Furthermore, the measured noise-stimulus rate-intensity function and the shifted pure-tone rate-intensity function coincide closely in most cases. Note that only results from pure-tone stimulation are used for the prediction of the noise-signal responses. To assess the results quantitatively, we calculated the deviation of ΔI_{EH} from the actual distance between the rate-intensity functions, ΔI_{true}, in each case. For the energy hypothesis, ΔI_{EH} − ΔI_{true} has a mean of −0.62 ± 0.68 dB (SE). The spread of these data (SD of 2.16 dB) corresponds to the expected measurement accuracy, which can be estimated to be ∼2 dB; the determination of C^{pt}, the collection of C_{n}, and the locations of both r^{pt}(I) and r^{noise}(I) all contribute independently with ∼1 dB error range. The pressure hypothesis yields ΔI_{PH} − ΔI_{true} with a mean of 0.43 ± 0.68 dB (SE) and is thus not ruled out by this experiment.
The results suggest that the description of spectral integration by the energy hypothesis, as derived from the two- and three-tone experiments, is also applicable to more complex stimuli. The model can be used for an accurate prediction of the location of the rate-intensity function after measuring the filter constants from pure-tone responses. The predictability of actual firing rates, however, is limited by the steepness of the rate-intensity functions. Because the range between threshold and saturation usually spans only ∼15–30 dB, small inaccuracies of a few decibels in the predicted shift between the rate-intensity functions have a strong effect on individual firing-rate predictions.
DISCUSSION
Spectral integration is an important feature of auditory encoding and is closely connected to the mechanosensory transduction process. Our data show that the response of locust auditory receptor cells to stationary sound stimuli is determined by an “effective sound intensity” J that can be calculated from the stimulus spectrum and the sensitivity of the receptor at different sound frequencies. The sound-intensity coding of the receptor cells can thus be described as a three-step process. First, the tympanal membrane acts as a linear filter. The relevant characteristics of the filter can be determined using pure-tone stimuli by measuring which intensities correspond to a given firing rate of the receptor. Second, the effective sound intensity J is obtained by summing up all energies contained in the individual frequency components of the filtered signal. We believe that this summation reflects the dynamic properties of the mechanosensory transduction channels. Ultimately, a biophysical investigation is required to confirm this view. In a final step, the effective sound intensity is put through a nonlinear response function independent of the spectral contents of the original signal. The shape of this response function can be derived from the measurement of a single rate-intensity function with arbitrary but fixed spectral content.
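The three-step description can be summarized in a few lines; this is a sketch in which the linear filter is expressed through the frequency-specific constants C_{n}, and the saturating rate function is a hypothetical stand-in for the measured r(J):

```python
import numpy as np

def firing_rate(amplitudes, filter_consts, rate_fn):
    """Three-step model sketch: (1) linear filtering via the
    frequency-specific constants C_n, (2) summation of the energy
    contributions into the effective sound intensity J, and
    (3) a static nonlinear rate function r(J) that is shared by
    all spectral compositions."""
    x = np.asarray(amplitudes, float) / np.asarray(filter_consts, float)
    J = np.sum(x ** 2)
    return rate_fn(J)

# Hypothetical saturating nonlinearity standing in for the measured r(J)
r = lambda J: 300.0 * J / (1.0 + J)
```

Any two amplitude vectors that yield the same J produce identical firing rates, regardless of how the energy is distributed across frequencies; this is exactly the property exploited by the iso-firing-rate analysis.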
Alternative hypotheses that compute J as the maximum amplitude or the integrated pressure are rejected by the analysis of the responses of the receptors to superpositions of two or three pure tones. Although the amplitude hypothesis can be clearly discarded, the energy and pressure hypotheses are more similar in their predictions regarding spectral integration. However, the combined evidence from several statistical investigations demonstrates that the pressure hypothesis fails in several single cases and that the energy hypothesis provides a far better description of the data.
Comparison between responses to pure tones and band-pass-filtered noise shows that the energy hypothesis also accounts for the responses to more complex stimuli. The model can therefore be used to accurately predict the rate-intensity functions for noise-like signals.
Effects of stimulus onset and adaptation
A more detailed analysis reveals that the energy hypothesis also describes spectral integration for the onset and the steady-state responses individually. The model parameters depend on the response episode investigated. The main effect is a change in the filter constants by a common factor attributable to spike-dependent adaptation. For a generalization of the model to fluctuating stimuli, this dependence of the parameters on the adaptation state could be explicitly incorporated, e.g., by using the generic model of Benda et al. (2001) and Benda (2002). In some cells, changes in the ratio of the filter constants between onset and steady state suggest additional, although smaller, sound-frequency-dependent dynamics of the adaptation. Besides the dominant spike adaptation, there might thus be a second adaptation phenomenon, which occurs before spectral integration. It might be caused by either a mechanical effect of the vibrations of the tympanum and associated structures or a property of the transduction channels. Because the effect is generally small and restricted to the first 30 msec, it can be neglected for the description of the average response to longer stimuli, which was the focus of the present study.
Conceptual framework
Combining the two concepts “spectral integration” and “iso-firing-rate regions” allowed us to rigorously compare different transduction models. A key ingredient in the experimental procedure was the systematic exploration of regions of constant activity under variation of the stimulus composition, in the present case the spectrum of a sound signal. Investigating such regions implies a change of the traditional perspective regarding neural input–output relations. Instead of asking what output is produced by a given input, one seeks to identify input ensembles that are associated with a fixed output. Reliable online analysis and automatic feedback to the stimulus generation are a central aspect of this approach. Based on increasingly available high-speed computing power, the method could be easily extended to identify more general invariant regions in auditory and other stimulus domains.
Our framework may be compared with the technique of silent substitution (Estévez and Spekreijse, 1982), in which the spectral composition of a visual stimulus is varied systematically such that the resulting stimuli always lead to the same activity of one (or more) receptor types in the retina. Fluctuations in visually evoked potentials can then be interpreted as being caused by the remaining receptors. In this case, however, the iso-activity regions of the receptors are not explored, but must be known accurately beforehand, and they are not compared with alternative model predictions.
Comparison with studies of temporal integration in other auditory systems
Our results go along well with the fact that temporal energy integration describes firing thresholds for double-click and intensity-duration trade-off experiments in receptor cells of moths (Tougaard, 1996, 1998). If this finding can also be confirmed for locust auditory receptors, spectral and temporal integration could be combined in a single simple model. Trade-offs between intensity and duration would then be expected to occur for stimuli on time scales of a few milliseconds, the apparent integration time of the receptors, which is well below the stimulus durations used in this study. In mammalian auditory nerve fibers, on the other hand, first-spike latencies correspond to the integrated pressure and not the energy (Heil and Neubauer, 2001). It is possible that this is caused by a fundamental difference in the transduction mechanisms of hair cells and insect auditory receptors. However, latency measurements reflect properties of the transduction as well as properties of additional dynamic processes, such as synaptic transmission, internal calcium dynamics, and spike generation. In this context, it should be noted that the latency in type I excitable membranes depends strongly and nonlinearly on the input strength (Hodgkin, 1948; Rinzel and Ermentrout, 1998; Izhikevich, 2000). This opens up the possibility that properties of the spike generator alter the effective input in such a way that energy integration is in accordance with the observed correspondence between latency and the temporal pressure integral. In fact, Ermentrout (1996) showed that in type I membranes, the firing rate r to a constant stimulus S above the firing threshold S* approximately obeys the square-root relation r(S) ∼ (S − S*)^{1/2}.
Response properties of hair cells and mammalian auditory nerve fibers are complicated by mechanical nonlinearities induced by the cochlea and a more intricate signal pathway than is the case in insect auditory systems. Nevertheless, measurements of basilar-membrane vibrations indicate that outside a region around the characteristic frequency, the stimulus coupling to mammalian auditory receptors occurs in an approximately linear fashion (Eguíluz et al., 2000; Ruggero et al., 2000). This suggests that a phenomenological study along the lines of the present investigation might also reveal interesting properties of the transduction process in hair cells.
Implications for the locust auditory system
Practical implications of our results include a more reliable characterization of insect auditory receptor sensitivity by measuring the intensities necessary to provoke a given nonzero firing rate instead of the threshold curve. The latter is notoriously difficult to measure because the rate-intensity functions usually flatten out near the threshold and are corrupted by background activity (Michelsen, 1971c). At least in locust receptor cells, the threshold curve runs approximately parallel to any other curve of equal response, and a single additional rate-intensity function can determine the distance between the measured curve and the actual threshold curve. Furthermore, our results show that average responses of an auditory receptor to complex stimuli can be well predicted once the cell-specific effective sound intensity J has been measured. The resulting quantitative correspondence between the stimulus spectrum and the firing rate differs from the predictions of an earlier heuristic approach (Lang, 2000). Our result will thus be helpful for systematic investigations of the processing of natural communication signals, such as grasshopper calling songs (Machens et al., 2001).
Linear versus nonlinear models
The simplicity of our model, which is linear up to a final static nonlinearity, is consistent with the fact that previous studies have found no indications of dominant nonlinearities or active movement of the sensory cilia (cf. Eberl, 1999). Distortion-product otoacoustic emissions from locust ears indicate slight nonlinearities at the tympanal membrane, but only at ∼50 dB below the stimulating intensities (Kössl and Boyan, 1998). Many other auditory systems, on the other hand, are strongly affected by nonlinear mechanisms and active signal amplification leading to increased sensitivity and frequency resolution. This phenomenon is common in vertebrate ears (Fettiplace and Fuchs, 1999; Hudspeth et al., 2000) but has also been shown to exist in some insect auditory systems (Göpfert and Robert, 2001).
Implications for other mechanosensory systems
It can be speculated that the nonlinearities mentioned above are additional features on top of the same underlying mechanosensory transduction process. Recent findings of structural and functional similarities between hair cells and the Drosophila sensory bristle as well as the discovery in Drosophila of homologs of mammalian genes related to hearing and deafness support this view and suggest that many aspects of mechanosensory transduction among insects and vertebrates are conserved (Adam et al., 1998; Bermingham et al., 1999; Eberl, 1999; Fritzsch et al., 2000; Walker et al., 2000; Gillespie and Walker, 2001). The energy hypothesis might thus be extended to account for spectral integration in other mechanosensory systems as well, possibly after modifications that take the systemspecific nonlinearities explicitly into account.
Mechanosensory transduction is also involved in a wide range of other senses, including touch, proprioception, and the sense of balance. Unlike transduction mechanisms that involve second-messenger signaling suited for biochemical analysis, mechanosensory changes of the membrane conductance result from a direct coupling with the mechanical stimulus: stretch, compression of the cell, or deflection of associated processes or cilia (Corey and Hudspeth, 1979; Hudspeth, 1985; Hudspeth and Logothetis, 2000). This direct and fast electrophysiological response has so far resisted a detailed biophysical analysis (Gillespie, 1995). Our method of finding regions of constant neural response for varying spectral composition provides a novel approach for distinguishing between different hypotheses about receptor integration, sets quantitative constraints that any future biophysical model has to satisfy, and is applicable to a wide range of other (mechano)sensory systems.
CALCULATION OF THE INTENSITY SHIFT BETWEEN PURE-TONE AND NOISE SIGNALS
We denote the effective sound intensities of the noise signal by J
For J
Footnotes

This work was supported by Boehringer Ingelheim Fonds (T.G.) and the Deutsche Forschungsgemeinschaft. We are grateful to Christian Machens and Martin Stemmler for fruitful discussions and Peter Heil, Matthias Hennig, and Rüdiger von der Heydt for valuable comments on this manuscript.

Correspondence should be addressed to Andreas V. M. Herz, Institute for Theoretical Biology, Department of Biology, Humboldt University, 10115 Berlin, Germany. Email: herz{at}itb.biologie.hu-berlin.de.

H. Schütze's present address: Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, MD 21218.

J. Benda's present address: Department of Physics, University of Ottawa, Ottawa, Ontario, Canada K1N 6N5.