The Journal of Neuroscience, December 1, 2002, 22(23):10434-10448
Energy Integration Describes Sound-Intensity Coding in an
Insect Auditory System
Tim Gollisch, Hartmut Schütze, Jan Benda, and Andreas V. M. Herz
Institute for Theoretical Biology, Department of Biology, Humboldt
University, 10115 Berlin, Germany
ABSTRACT
We investigate the transduction of sound stimuli into neural
responses and focus on locust auditory receptor cells. As in other
mechanosensory model systems, these neurons integrate acoustic inputs
over a fairly broad frequency range. To test three alternative hypotheses about the nature of this spectral integration (amplitude, energy, pressure), we perform intracellular recordings while
stimulating with superpositions of pure tones. On the basis of online
data analysis and automatic feedback to the stimulus generator, we systematically explore regions in stimulus space that lead to the same
level of neural activity. Focusing on such iso-firing-rate regions
allows for a rigorous quantitative comparison of the
electrophysiological data with predictions from the three hypotheses
that is independent of nonlinearities induced by the spike dynamics. We
find that the dependence of the firing rates of the receptors on the
composition of the frequency spectrum can be well described by an
energy-integrator model. This result holds at stimulus onset as well as
for the steady-state response, including the case in which adaptation effects depend on the stimulus spectrum. Predictions of the model for
the responses to bandpass-filtered noise stimuli are verified accurately. Together, our data suggest that the sound-intensity coding
of the receptors can be understood as a three-step process, composed of
a linear filter, a summation of the energy contributions in the
frequency domain, and a firing-rate encoding of the resulting effective
sound intensity. These findings set quantitative constraints for future
biophysical models.
Key words:
mechanosensory transduction; spectral integration; auditory receptor; hearing; sound intensity; energy; model; locust
INTRODUCTION
Auditory receptor cells are commonly
characterized by their responses to pure tones. For example, threshold
curves characterize the minimum intensity needed to evoke a response as
a function of the frequency of a pure tone; rate-intensity functions
describe how the response depends on the tone's intensity. Natural
signals, however, are only rarely restricted to single frequencies, and receptor cells often show a broad frequency tuning. Our
understanding of auditory coding is thus not satisfactory as long
as we do not know how the relative intensities of different frequencies
contained in a sound signal are integrated by auditory receptors.
Investigating this spectral integration helps us also to scrutinize
basic principles of the mechanosensory transduction process.
In general, the response of the receptor could be any complicated,
nonlinear function of the frequency spectrum. One may hope, however,
that the underlying mechanism is simple enough to allow for a
straightforward phenomenological description. One such way of combining
different spectral contents would be the extraction of a single
physical stimulus property. Its nature is intensely debated with
respect to the question of temporal integration, i.e., how stimulus
intensities are combined over time. Psychoacoustic measurements of
intensity-duration tradeoffs suggest that the stimulus energy is the
crucial variable (Garner, 1947; Plomp and Bouman, 1959; Zwislocki, 1965; Florentine et al., 1988), whereas a recent investigation of first-spike latencies in
mammalian auditory-nerve fibers finds the time-integrated pressure to be
the decisive stimulus attribute (Heil and Neubauer, 2001). In insect auditory receptors, the differences between
thresholds for one- and two-click stimuli and intensity-duration
tradeoffs are consistent with temporal energy integration
(Tougaard, 1996, 1998). Care must be taken, however, in the interpretation of these data because temporal integration also depends on the time course of several biophysical processes after the primary signal transduction, such as internal calcium dynamics and spike generation.
Spectral integration, on the other hand, depends at least in insects
almost exclusively on the mechanosensory transduction process; any
fluctuations on the several kilohertz scale of relevant sound
frequencies that were still present after the transduction (i.e., in
the cell-membrane conductance) would be highly attenuated by the
low-pass filter properties of the cell membrane (Koch, 1999). Looking at spectral integration instead of temporal
integration therefore enables us to focus on the site of primary signal transduction.
For these reasons, we develop a descriptive model for the
responses of auditory receptor neurons to stationary stimuli with arbitrary power spectrum. The model comprises three steps, which correspond to the coupling, the transduction, and the encoding of the
primary signal (Eyzaguirre and Kuffler, 1955; French, 1992). Focusing on the locust auditory system,
we investigate three alternative hypotheses about which stimulus
property governs the transduction process: the maximum amplitude of the
stimulus, the stimulus energy, and the average half-wave-rectified
signal amplitude. To test the model framework and distinguish between
the rival hypotheses, intracellular recordings from the axons of
receptor cells are performed. Based on a systematic exploration of
stimuli that cause identical neural responses, the recordings reveal
how the individual spectral contributions are integrated into one
effective sound intensity.
MATERIALS AND METHODS
Electrophysiology. All experiments were performed on
adult Locusta migratoria. The tympanal auditory organ of
these animals is located in the first abdominal segment. After
decapitation, removal of the legs, wings, intestines, and the dorsal
part of the thorax, the animal was waxed to a holder, and the
metathoracic ganglion and auditory nerve were exposed. Action
potentials from auditory receptor cells were recorded intracellularly
in the auditory nerve with standard glass microelectrodes
(borosilicate, GC100F-10; Harvard Apparatus Ltd., Edenbridge, UK)
filled with a 1 M KCl solution (50-110 MΩ resistance).
The signals were amplified (BRAMP-01; NPI Electronic, Tamm, Germany)
and recorded by a data acquisition board (PCI-MIO-16E-1; National
Instruments, München, Germany) with a sampling rate of 10 kHz.
Detection of action potentials and generation of acoustic signals were
controlled on-line by the custom-made Online Electrophysiology
Laboratory (OEL) software. Stimuli were transmitted by the
above-mentioned data acquisition board with a conversion rate of 100 kHz to the loudspeakers [Esotec D-260, Dynaudio (Skanderborg, Denmark)
on a DCA 450 amplifier (Denon Electronic GmbH, Ratingen, Germany)].
These were mounted at 30 cm distance on each side of the animal so that
the incidence of sound-pressure waves was orthogonal to the body axis.
Stimuli were played only by the loudspeaker ipsilateral to the recorded auditory nerve. The linearity of the loudspeakers for superpositions of
multiple tones was verified by playing samples of the stimuli used in
the experiments while recording the sound at the site of the animals
with a high-precision microphone [40AC, G.R.A.S. Sound & Vibration
(Vedbæk, Denmark) on a 2690 conditioning amplifier (Brüel & Kjær, Langen, Germany)]. During the experiments, animals were kept
either at room temperature, which was ~20°C or at a constant
temperature of 30°C. No systematic trends regarding a possible
temperature dependence of the studied phenomena were observed.
All experiments were performed in a Faraday cage lined with
sound-attenuating foam to reduce echoes. Recordings from 45 receptor
cells stemming from 18 animals (with at most 4 cells from the same
animal) were used in this study.
The experimental protocol complied with German law governing animal care.
Measurement of rate-intensity functions. In general, each
sound stimulus was presented for a duration of 100 msec, separated by
pauses of at least 400 msec. To investigate adaptation effects, control
experiments with longer stimuli and pauses (300/500 msec or 500/750
msec) were performed. All of these stimuli are decidedly longer than
typical integration times of insect auditory receptors (1-3 msec as
determined by reverse correlation for locust auditory receptors; data
not shown) (see also Tougaard, 1998). Responses were
measured by the average firing rate, calculated as the total number of
spikes divided by the stimulus length. Spikes were detected on-line and counted from stimulus onset until 20 msec beyond stimulus offset to include all spikes elicited by the stimulus. This is justified because the investigated cells show no or only very low
spontaneous activity and no offset response. Spike trains from the
control experiments were also used for off-line analysis of specific
response episodes.
Rate-intensity functions were determined in the following way. First,
the stimulus was presented in steps of 5 dB between 20 and 100 dB sound
pressure level (SPL) (for a definition see Eq. 21 in the Appendix) to
obtain the general shape of the rate-intensity function. These data
were used to identify the intensity range that gave rise to firing
rates between 50 and 250 Hz. Within this dynamic range of ~10-15 dB,
additional measurements in steps of 1 or 2 dB were performed, and these
were repeated 4-10 times to yield average firing rates and their SDs.
Stimulus intensities corresponding to given firing rates were obtained
by fitting a straight line through the four points closest to the
desired firing rate as shown in Figure 1.
Errors on these measurements follow from the errors of the fitted
parameters according to the law of error propagation. Thresholds were
determined by linear extrapolation to zero firing rate from data points
with a low, but significant firing rate.
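The readout of a stimulus intensity for a given firing rate can be sketched as follows. This is a minimal illustration, assuming an unweighted least-squares line through the four samples closest to the target rate; the function name `intensity_at_rate` is hypothetical, and the propagation of the fit errors mentioned above is not included.

```python
def intensity_at_rate(intensities_db, rates_hz, target_hz, n_points=4):
    """Return the intensity (dB) at which a local straight-line fit
    of the rate-intensity function reaches target_hz."""
    # Keep the n_points samples whose firing rates are closest to the target.
    pairs = sorted(zip(intensities_db, rates_hz),
                   key=lambda p: abs(p[1] - target_hz))[:n_points]
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept of r = slope * I + intercept.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Invert the fitted line at the target firing rate.
    return (target_hz - intercept) / slope
```

Calling the function once per target rate (for example 100 and 150 Hz) mirrors the two readouts illustrated in Figure 1B.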

Figure 1.
Determination of sound intensities corresponding
to given firing rates. A, Example of a spike train recorded
intracellularly from an axon of a receptor cell. Calibration is given
to the right. The thick bar below the voltage
trace denotes the 500 msec pure-tone stimulus. The vertical
bars below show the spike times as determined by the
spike-detection algorithm. The firing rate is calculated by counting
the spikes and averaging over several stimulus repetitions.
B, Example of the rising part of a rate-intensity function
measured in steps of 1 dB. Each stimulus was repeated multiple
times. Vertical bars denote the SD of each measurement.
Linear fits through the four points closest to the firing rates of
interest, here 100 and 150 Hz, are depicted as dotted and
dashed lines, respectively. The arrows indicate
the readout of the corresponding intensities.
Superposition of pure tones. Measuring rate-intensity
functions for pure tones allows one to understand how the firing rate r depends on the amplitude A of a single tone for
a certain sound frequency, r = r(A). Investigating
spectral integration amounts to asking whether this
understanding can be extended to stimuli that contain
multiple tones simultaneously. We therefore try to obtain a description
of the firing rate r depending on the amplitudes A1, A2, ... of the
different frequency components of such stimuli, r = r(A1, A2, ...).
In a first set of experiments, stimuli were sound-pressure waves
S(t) consisting of two or three pure tones of amplitudes An, frequencies
fn, and phase offsets φn, n = 1, 2, 3:

$$S(t) = \sum_{n=1}^{3} A_n \sin(2\pi f_n t + \varphi_n) \qquad (1)$$

with A3 = 0 for the two-tone
experiments. We used stimuli that were far longer than the periods of
the sine waves and avoided combinations of frequencies that are related
to each other by small integer factors. This makes the measurements
insensitive to the relative phases of the individual sine tones, which
we cannot control in our experiments because of putative phase shifts at the tympanal membrane (Michelsen, 1971b). For
concreteness, we set φ1 = φ2 = φ3 = 0 in all experiments. The frequencies were chosen to be far enough apart to avoid beating. The two-tone
experiments were performed with sound frequencies f1 = 4 kHz and f2 = (3/π) · 10 kHz ≈ 9.55 kHz, the three-tone experiments with f1 = 4 kHz, f2 = (3/π) · 10 kHz ≈ 9.55 kHz, and f3 = (10/π²) · 15 kHz ≈ 15.20 kHz or with
f1 = 6 kHz, f2 = (3/π) · 9 kHz ≈ 8.59 kHz, and f3 = (10/π²) · 17 kHz ≈ 17.22 kHz.
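A superposition of pure tones of this kind can be generated as in the following sketch. It assumes zero phase offsets and the 100 kHz conversion rate stated above; the function name `multi_tone` and the example amplitudes are illustrative only.

```python
import math

SAMPLE_RATE_HZ = 100_000  # D/A conversion rate given in the Methods

def multi_tone(amplitudes, freqs_hz, duration_s, phases=None):
    """Sample S(t) = sum_n A_n * sin(2*pi*f_n*t + phase_n)."""
    if phases is None:
        phases = [0.0] * len(amplitudes)  # all phase offsets set to zero
    n_samples = int(round(duration_s * SAMPLE_RATE_HZ))
    return [
        sum(a * math.sin(2 * math.pi * f * (i / SAMPLE_RATE_HZ) + p)
            for a, f, p in zip(amplitudes, freqs_hz, phases))
        for i in range(n_samples)
    ]

# Frequencies related by irrational factors, as in the first three-tone set.
f1 = 4_000.0
f2 = 3 / math.pi * 10_000.0        # about 9.55 kHz
f3 = 10 / math.pi ** 2 * 15_000.0  # about 15.20 kHz

# Two-tone stimulus of 100 msec: the third amplitude is zero.
signal = multi_tone([1.0, 0.5, 0.0], [f1, f2, f3], duration_s=0.1)
```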
Within the present approach, we are concerned only with the
encoding of sound intensity and not with temporal aspects. We thus
restricted our attention to stationary stimuli with constant envelope
as described above. This is justified because the responses of locust
auditory receptors do not phase lock to sound frequencies in the
kilohertz range (Suga, 1960
; Hill,
1983a
).
The experiments were designed to identify, for individual receptors,
sets of amplitude combinations (A1,
A2) or (A1,
A2, A3), respectively, that
result in the same firing rate. The recorded data were analyzed within
a model framework, which includes explicit predictions about how these
amplitude combinations should be related to each other. In Results, the
model is systematically developed and discussed. Here, we only present
the main aspects and cover technical issues and questions regarding the
model's role within the data analysis.
In summary, we compute the average firing rate of a receptor cell in
the following three-step process.
(1) The stimulus is a sound-pressure wave S(t), a
superposition of pure tones with frequencies
fn, amplitudes
An, and phase offsets
φn, $S(t) = \sum_n A_n \sin(2\pi f_n t + \varphi_n)$. In
the first step, this signal is linearly filtered and thereby turned
into:

$$\tilde{S}(t) = \sum_n \frac{A_n}{C_n} \sin(2\pi f_n t + \tilde{\varphi}_n) \qquad (2)$$

This means that every tone receives a gain factor
1/Cn. In addition, the phase may change from
φn to $\tilde{\varphi}_n$. The
inverse of the filter constant Cn thus
corresponds to the sensitivity for the frequency
fn: the smaller the
Cn, the more sensitive the receptor at
the corresponding sound frequency.
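(2) An effective sound intensity J is computed according to
one of the following three hypotheses:

Amplitude hypothesis (AH): $J_{AH} = \max_t |\tilde{S}(t)|$
Energy hypothesis (EH): $J_{EH} = \langle \tilde{S}(t)^2 \rangle$
Pressure hypothesis (PH): $J_{PH} = \langle |\tilde{S}(t)| \rangle$

where $\tilde{S}(t)$ is the filtered signal from Equation 2, |x| denotes the absolute value of x, and
$\langle y(t) \rangle$
is the temporal average of y(t).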
(3) The average firing rate r is determined according to a
single nonlinear function r(J).
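The first two model steps can be sketched numerically as follows. The sketch assumes that the three hypotheses correspond to the maximum, the mean square, and the mean absolute value of the filtered signal, sets all phases to zero, and uses a hypothetical function name; the response function r(J) of step 3 is left out because its form is not specified here.

```python
import math

def effective_intensities(amplitudes, filter_consts, freqs_hz,
                          duration_s=0.1, sample_rate_hz=100_000):
    """Return (J_AH, J_EH, J_PH) for a superposition of pure tones."""
    n = int(round(duration_s * sample_rate_hz))
    # Step 1: linear filter, every tone is scaled by 1/C_n.
    filtered = [
        sum(a / c * math.sin(2 * math.pi * f * i / sample_rate_hz)
            for a, c, f in zip(amplitudes, filter_consts, freqs_hz))
        for i in range(n)
    ]
    # Step 2: one scalar per hypothesis.
    j_ah = max(abs(s) for s in filtered)       # maximum amplitude
    j_eh = sum(s * s for s in filtered) / n    # mean square (energy)
    j_ph = sum(abs(s) for s in filtered) / n   # mean rectified amplitude
    return j_ah, j_eh, j_ph
```

For a single tone of normalized amplitude A/C, these values approach A/C, (A/C)²/2, and 2(A/C)/π, respectively, up to sampling error.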
Note that the effective sound intensity J as defined
above is distinct from the physical sound intensity, commonly measured in decibels SPL (compare Eq. 21 in the Appendix), which we denote by
I throughout the text. Whereas I measures the
stimulus itself, J is a derived quantity that incorporates
the filter constants Cn and therefore also
reflects the sensitivity of the specific receptor cell. Furthermore,
I is defined as a logarithmic measure (relative to a
predefined reference intensity); J is not, which facilitates
the notation.
Within the model framework, the filter constants
Cn are determined only up to a common factor,
which can be absorbed in the function r(J). In other words,
the model remains unchanged if all Cn are
multiplied by the same constant and r(J) is at the same time
adjusted appropriately. It follows that one way to determine the
Cn is to choose a fixed firing rate, find for
each frequency fn the amplitude
Ân that leads to this firing rate, and set
Cn = Ân.
In the following description of the experimental procedure, we
will for simplicity focus on the case of superpositions of two tones.
The generalization of concepts and formulas to the three-tone case is straightforward.
The three alternative hypotheses result in different predictions about
which combinations of amplitudes (A1,
A2) are expected to lead to the same firing
rate. Because the model implies that equal firing rate follows from
equal effective sound intensity J (step 3), curves of
constant firing rate can easily be calculated for each hypothesis by
setting J constant in the equations of the second step in
the model. These "iso-firing-rate curves" are shown in Figure
2. From the amplitude hypothesis, pairs
(A1, A2) yielding the
same firing rate are expected to lie on a straight line. Likewise, from
the energy hypothesis, they are expected to lie on an ellipse. For the
pressure hypothesis, they should fall on an even more strongly
bent curve. The corresponding shape has to be computed numerically by
solving the equation:

$$\frac{1}{T}\int_0^T \left| \frac{A_1}{C_1}\sin(2\pi f_1 t) + \frac{A_2}{C_2}\sin(2\pi f_2 t) \right| dt = \mathrm{const} \qquad (3)$$

for pairs $(A_1/C_1, A_2/C_2)$.
The duration T has to be chosen large enough to cover many
cycles of the sine waves in the signal, so that the phases
φn can be neglected. Note that the shape
of these three alternative iso-firing-rate curves is not influenced in
any way by the form of r(J).
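The numerical computation of the pressure-hypothesis curve can be sketched as below. The sketch assumes that the curve is normalized to pass through the points (1, 0) and (0, 1) in units of the filter constants, and it exploits the fact that the mean rectified amplitude scales linearly with a common amplitude factor, so no root finding is needed. The angle `alpha_rad` (direction in the normalized amplitude plane) and the function name are illustrative choices.

```python
import math

def pressure_iso_radius(alpha_rad, f1_hz=4000.0, f2_hz=3 / math.pi * 10_000.0,
                        duration_s=0.1, sample_rate_hz=100_000):
    """Radial distance of the pressure-hypothesis iso-firing-rate curve
    along direction alpha in the (A1/C1, A2/C2) plane."""
    n = int(round(duration_s * sample_rate_hz))
    x, y = math.cos(alpha_rad), math.sin(alpha_rad)
    # Time-averaged rectified signal for a unit-length amplitude vector.
    mean_abs = sum(
        abs(x * math.sin(2 * math.pi * f1_hz * i / sample_rate_hz)
            + y * math.sin(2 * math.pi * f2_hz * i / sample_rate_hz))
        for i in range(n)
    ) / n
    # A pure tone of unit normalized amplitude has mean |sin| = 2/pi;
    # scale the radius so that the mixture reaches the same value.
    return (2 / math.pi) / mean_abs
```

Under these assumptions the radius at 45° comes out larger than 1, so the curve bulges outside the energy-hypothesis ellipse, whereas the straight line of the amplitude hypothesis lies inside it.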

Figure 2.
Prediction of iso-firing-rate curves for the
superposition of two pure tones. Depending on the model, the effective
sound intensity J as well as the firing rate are expected to
be constant along different curves in the two-dimensional space of
amplitude combinations. A1 and
A2 denote the amplitudes of the respective
components. According to the amplitude hypothesis
(AH), iso-firing-rate curves are straight lines (one
example shown by the dashed line); according to the energy
hypothesis (EH), they are ellipses (solid
line); and according to the pressure hypothesis
(PH), they are even more strongly bent curves
(dash-dotted line), the exact shape of which has to be
determined numerically. The scale of the axes is given by the filter
constants C1 and C2. Note
that when the hypotheses are fitted to the data, the obtained filter
constants will in general be different for each model, and the
intersection points with the axes will not coincide because
C1 and C2 are free
parameters for each model. The gray arrows indicate equally
spaced directions along which the rate-intensity curves are measured.
In each direction, the intensity increases with increasing amplitudes
A1 and A2, whereas
A1/A2 is kept fixed and
determined by the angle α. (One example for this angle is denoted in
the figure.) The intersection points of the
arrows with the iso-firing-rate curves denote the
amplitude combinations that are expected to yield the specified
firing rate according to each of the three alternative hypotheses.
Because the three intersection points on each gray
arrow clearly differ from each other, the measurements of the
iso-firing-rate curves can be used to distinguish between the
hypotheses.
To relate these predictions to experimental results, we determined a
set of amplitude combinations leading to the same average firing rate
in the following way. We start by measuring a first rate-intensity
function for a single pure tone with frequency f1. From this rate-intensity function, we
determine the amplitude $A_1^1$ that leads to
a firing rate of, e.g., 150 Hz as shown in Figure 1. (In the notation
$A_i^n$, the subscript i refers to
the frequency fi at which the amplitude is
measured, and the superscript n indicates the number of the measurement, 1 ≤ n ≤ N, where N
denotes the total number of measurements.) Because the amplitude
A2 of the second frequency component is zero for
this stimulus, we denote the result as a data point
($A_1^1$, 0), i.e., a point on the
A1 axis in a graph such as that of Figure 2. The
same procedure is performed for a pure tone with frequency f2, leading to a second data point
(0, $A_2^2$) on the A2
axis that also corresponds to a firing rate of 150 Hz. These two
amplitudes $A_1^1$ and $A_2^2$ can already serve as estimates of the
filter constants C1 and
C2, respectively. We proceed by measuring
rate-intensity functions for superpositions of the two tones where the
ratio of the amplitudes A1 and
A2 is held fixed. To do so, we set
A1 = k·A2 and then jointly
vary the intensity of A1 and
A2. This corresponds to measuring the
rate-intensity functions along straight lines in radial direction as
pictured by the gray arrows in Figure 2. It is also evident
from the figure that the radial direction is well suited for accurate
measurements of the iso-firing-rate curves and for discriminating
between the hypotheses. The resulting rate-intensity functions are
similar in shape to the ones for the pure tones, and we can again
determine the stimulus that leads to a firing rate of 150 Hz as in
Figure 1. This yields a third data point ($A_1^3$, $A_2^3$) with
$A_1^3/A_2^3 = k$. The
procedure is continued for several different ratios k so
that a set of amplitude pairs ($A_1^n$, $A_2^n$) is obtained.
A technical but important question is which ratios k should
be used in the experiment. If the neuron is much more sensitive to one
of the sound frequencies and if both amplitudes are comparable in size,
i.e., A1 ≈ A2, the
response will be determined almost exclusively by the more effective
sound frequency. To be most informative, the measurement should thus
take the relative sensitivities into account. This is done by choosing
k so that A1/C1
and A2/C2 are of the same
order of magnitude, which assures that the effect of both tones is
roughly the same. To do so, we use the estimates of
C1 and C2 that have been
obtained from the first two rate-intensity functions for the pure tones
as explained above. In particular, the different ratios of
A1 and A2 for subsequent
measurements are selected on-line in such a way that after taking
C1 and C2 into account,
the directions along which the rate-intensity functions are measured
are evenly spaced. The gray arrows in Figure 2 are such
directions. Note that their even spacing depends on the scales of the
axes given by C1 and C2.
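The calculation that achieves this is as follows: choose angles α
that are evenly spaced in the interval [0°, 90°] and use the
relation for the slope
of a straight line:

$$\tan\alpha = \frac{A_2/C_2}{A_1/C_1}, \quad \text{i.e.,} \quad k = \frac{A_1}{A_2} = \frac{C_1}{C_2 \tan\alpha}$$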
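The selection of amplitude ratios can be sketched as follows, assuming that the angle (called alpha here, an assumed symbol) is measured in the plane of normalized amplitudes A1/C1 and A2/C2, so that tan(alpha) = (A2/C2)/(A1/C1). The function name and the choice to exclude the two pure-tone directions at 0° and 90° are illustrative.

```python
import math

def amplitude_ratios(c1, c2, n_directions):
    """Return k = A1/A2 for n_directions angles evenly spaced strictly
    between 0 and 90 degrees in the normalized amplitude plane."""
    ratios = []
    for j in range(1, n_directions + 1):
        alpha = math.radians(90.0 * j / (n_directions + 1))
        # tan(alpha) = (A2/C2) / (A1/C1)  =>  A1/A2 = C1 / (C2 * tan(alpha))
        ratios.append(c1 / (c2 * math.tan(alpha)))
    return ratios
```

With C1 = 2 and C2 = 1, a single intermediate direction (45°) gives k = 2, i.e., the less effective tone receives twice the amplitude so that both tones contribute equally.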
In the off-line analysis, the parameters
C1 and C2 were newly
determined by χ² fits of each of the three curves in
Figure 2 to the complete data set ($A_1^n$, $A_2^n$). These fitted values of
C1 and C2 should be more
reliable than the initial estimates, which were obtained on-line from
the pure-tone rate-intensity functions only.
A further technical detail concerns the choice of the fitting
procedure. The procedure should treat A1 and
A2 in a symmetric fashion, and it should not be
affected by potentially large differences in the relative sensitivities
for the two tones. This rules out, e.g., the simplest choice of
regarding A2 as a function of
A1 or vice versa. Instead, we normalized the
amplitudes by the filter constants and looked at the radial distance of
the data points:

$$\left(\frac{A_1^n}{C_1}, \frac{A_2^n}{C_2}\right)$$

from the origin, which is given by:

$$d_n = \sqrt{\left(\frac{A_1^n}{C_1}\right)^2 + \left(\frac{A_2^n}{C_2}\right)^2}$$

as a function of the ratio:

$$\tan\alpha_n = \frac{A_2^n/C_2}{A_1^n/C_1}$$

This is a natural choice because the rate-intensity functions
that led to the data points were measured in this radial direction. For
the three hypotheses, we denote the predicted radial distance by
$d_n^m$, where m stands for the
particular model hypothesis (m = AH, EH, or PH).
$d_n^m$ can be obtained from the model as a
function of $\alpha_n$ and corresponds to the
normalized distance from the origin to the respective iso-firing-rate curve in Figure 2. For the amplitude hypothesis, one obtains:

$$d_n^{AH} = \frac{1}{\cos\alpha_n + \sin\alpha_n}$$

for the energy hypothesis,
$d_n^{EH} = 1$, and for the pressure
hypothesis, $d_n^{PH}$ has to be
determined numerically using the solutions of Equation 3.
Estimating C1 and C2 then
corresponds to minimizing the χ² function for the radial
distance for each model m:

$$\chi_m^2 = \sum_{n=1}^{N} \frac{(d_n - d_n^m)^2}{\sigma_n^2} \qquad (4)$$

with respect to C1 and
C2. The contributions of the data points are
weighted by the measurement errors σn,
which follow from the measurement errors
$\Delta A_1^n$ and $\Delta A_2^n$ for
$A_1^n$ and $A_2^n$,
respectively, by the law of error propagation as:

$$\sigma_n = \frac{1}{d_n}\sqrt{\left(\frac{A_1^n\,\Delta A_1^n}{C_1^2}\right)^2 + \left(\frac{A_2^n\,\Delta A_2^n}{C_2^2}\right)^2} \qquad (5)$$

The fitted curves and the χ² values obtained from
the fits were used for further statistical analysis (see below).
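A radial-distance χ² of the kind described above can be evaluated as in the following sketch for the amplitude and energy hypotheses. It assumes iso-firing-rate curves normalized to pass through (1, 0) and (0, 1), so that the predicted distance is 1 for the energy hypothesis and 1/(cos α + sin α) for the amplitude hypothesis, and it propagates only the amplitude errors into the radial distance. The function name is hypothetical, and the minimization over C1 and C2 is left to any standard optimizer.

```python
import math

def chi_square(points, errors, c1, c2, model):
    """points: (A1, A2) pairs; errors: (dA1, dA2) pairs; model: 'EH' or 'AH'."""
    total = 0.0
    for (a1, a2), (da1, da2) in zip(points, errors):
        x, y = a1 / c1, a2 / c2          # normalized amplitudes
        d = math.hypot(x, y)             # measured radial distance
        if model == "EH":
            d_model = 1.0                # unit circle in normalized units
        elif model == "AH":
            d_model = d / (x + y)        # equals 1 / (cos(alpha) + sin(alpha))
        else:
            raise ValueError("model must be 'EH' or 'AH'")
        # Error propagation of d with respect to A1 and A2.
        sigma = math.hypot(x * da1 / c1, y * da2 / c2) / d
        total += ((d - d_model) / sigma) ** 2
    return total
```

For data lying exactly on an ellipse with semi-axes C1 and C2, the energy-hypothesis value is zero while the amplitude-hypothesis value grows with the number and precision of the intermediate directions.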
For the control experiments with stimulus lengths of 300 or 500 msec,
the onset response and the steady-state response were analyzed
individually. For the onset, only spikes in the first 30 msec after
stimulus onset were taken into account; for the steady state, the first
200 msec of the response were disregarded. The control experiments were
aimed at investigating the effect of adaptation on our model
description. We therefore performed the same analysis as explained
above on the firing rates obtained for the onset and the steady state
and fitted the filter constants C1 and
C2 separately in each case. In addition, we
compared C1 and C2 as
well as their ratio R = C1/C2 for the onset with the respective values for the steady state. The relative change of R was computed from the onset value,
RO, and the steady-state value,
RS, as ΔR = |RO − RS|/RS. For the total
response, the ratio of C1 and
C2 is denoted by Rtotal.
To estimate the significance of changes in C1,
C2, and R between the onset and the
steady state, error measures for these parameters were computed for
each cell individually by taking several nonoverlapping stretches of 30 msec during the steady state for the analysis, determining
C1, C2, and
R in each case, and computing the respective SDs.
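The separation into onset and steady-state responses can be sketched as follows. Spike times are taken in milliseconds relative to stimulus onset; ending the steady-state window at stimulus offset is an assumption of this sketch, and all function names are illustrative.

```python
def window_rate(spike_times_ms, start_ms, stop_ms):
    """Firing rate in Hz from spikes falling in [start_ms, stop_ms)."""
    count = sum(1 for t in spike_times_ms if start_ms <= t < stop_ms)
    return 1000.0 * count / (stop_ms - start_ms)

def onset_and_steady_rates(spike_times_ms, stimulus_ms):
    """Onset: first 30 msec; steady state: everything after 200 msec."""
    onset = window_rate(spike_times_ms, 0.0, 30.0)
    steady = window_rate(spike_times_ms, 200.0, stimulus_ms)
    return onset, steady

def relative_change(ratio_onset, ratio_steady):
    """Relative change of R = C1/C2 between onset and steady state."""
    return abs(ratio_onset - ratio_steady) / ratio_steady
```

The two rates feed separate fits of C1 and C2, and `relative_change` is then applied to the resulting ratios rather than to the firing rates themselves.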
Experiments with superpositions of three pure tones were performed and
analyzed in the same way as the two-tone experiments. We first measured
rate-intensity functions for each pure tone and from these obtained
initial estimates of the respective filter constants
C1, C2, and
C3. Subsequently, rate-intensity functions were
measured along different directions in the three-dimensional stimulus
space (A1, A2, A3). The ratios A1/C1 : A2/C2 : A3/C3
were taken as 1:1:1, 2:1:1, 1:2:1, and 1:1:2. Final fits of the
model parameters C1,
C2, and C3 were obtained
in an analogous way as for the superposition of two tones.
Statistical analysis. The χ² values obtained
from the fits were used to test the statistical significance of
deviations of the data from the models by a standard χ²
test for each cell individually.
The Bayesian probability of a model given the data can be used as a
measure for the preference of one hypothesis over another. It is
calculated from Bayes' formula:

$$p(\text{model } m \mid \text{data}) = \frac{p(\text{data} \mid \text{model } m)\; p(\text{model } m)}{p(\text{data})} \qquad (6)$$

where p(data) = Σm p(data|model m)·p(model m). If there is
no a priori evidence for any model, the prior probabilities for the
models are to be set to p(model m) = 1/M,
where M is the number of models investigated. The
probabilities p(data|model m) were calculated
from the difference between the measured radial distances $d_n$
and the corresponding model predictions $d_n^m$
by assuming independent errors with a Gaussian distribution of SDs
σn (given by the
measurement errors) and a finite and fixed measurement resolution δ:

$$p(\text{data} \mid \text{model } m) = \prod_{n=1}^{N} \frac{\delta}{\sqrt{2\pi}\,\sigma_n} \exp\left(-\frac{(d_n - d_n^m)^2}{2\sigma_n^2}\right) \qquad (7)$$
An analogous formula was used in the case of superpositions of
three pure tones.
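With equal priors, the posterior model probabilities follow from the likelihoods alone. The sketch below additionally assumes that the error terms and the measurement resolution are the same for all models, so that the Gaussian prefactors cancel and each likelihood is proportional to exp(−χ²/2); the function name and the example χ² values are hypothetical.

```python
import math

def model_posteriors(chi2_by_model):
    """Posterior probability of each model, given its chi-square value,
    for equal priors and Gaussian errors shared across models."""
    # Subtract the smallest chi-square first to avoid numerical underflow.
    best = min(chi2_by_model.values())
    weights = {m: math.exp(-(c - best) / 2.0) for m, c in chi2_by_model.items()}
    norm = sum(weights.values())
    return {m: w / norm for m, w in weights.items()}

posteriors = model_posteriors({"AH": 40.0, "EH": 10.0, "PH": 12.0})
```

If the error terms differ between models (for example because they depend on the fitted filter constants), the full product of Gaussian factors has to be evaluated for each model instead.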
Trends in the data were tested for statistical significance by a
standard run test (Barlow, 1989). For a given model, the data points were subdivided into those sequences of points that lie
consecutively either above or below the model prediction, and the
number of these sequences was tested for significant deviations from
the null hypothesis of independently scattered data points around the
model prediction.
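A run test of this kind can be sketched as follows. The sketch uses the normal approximation to the distribution of the number of runs, which is an assumption made here for brevity; for the small numbers of data points typical of a single cell, exact run-count probabilities would be more appropriate. The function name is hypothetical.

```python
import math

def runs_test(residuals):
    """Return (number of runs, z score, two-sided p value) for the signs
    of the residuals (data minus model prediction), in measurement order."""
    signs = [r > 0 for r in residuals if r != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need residuals on both sides of the prediction")
    n = n_pos + n_neg
    # A new run starts whenever the sign changes between neighbours.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    # Mean and variance of the run count under independent scatter.
    mean = 2.0 * n_pos * n_neg / n + 1.0
    var = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n * n * (n - 1.0))
    z = (runs - mean) / math.sqrt(var)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return runs, z, p_two_sided
```

Too few runs (strongly negative z) indicate a systematic trend of the data relative to the model curve.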
Comparison of pure-tone and noise stimuli. In another set of
experiments, we tested whether our understanding of spectral integration allows accurate predictions of firing rates for more complex stimuli and focused on bandpass-filtered noise. To calibrate the model for a specific receptor, we measured the rate-intensity function for a pure tone as well as a set of filter constants in the
relevant frequency band of the noise stimulus. According to our model,
we can use these pure-tone results to calculate a prediction for the
rate-intensity function of the noise stimulus (see below). The filter
constants are needed for the calculation of the effective sound
intensity J for the noise stimulus (model step 2), and the
pure-tone rate-intensity function is needed because it implicitly
contains the information about the shape of the response function
r(J) of model step 3. To assess the reliability of the
prediction, the rate-intensity function for the noise stimulus was also
measured experimentally. The particular noise stimulus that we used was
Gaussian white noise, cut off at ±3 SDs and bandpass filtered between
5 and 10 kHz, and the frequency of the pure-tone stimulus was 4 kHz.
Note that when the amplitude of a noise stimulus is varied, all
amplitudes in the signal are scaled by a common factor.
The prediction for the rate-intensity function of the noise signal is
obtained in the following way. According to our model, the
rate-intensity function of the pure tone,
rpt(I), and the rate-intensity
function of the noise stimulus,
rnoise(I), should have the
same shape and be related to each other by a shift ΔI
along the decibel-intensity axis:

$$r_{noise}(I) = r_{pt}(I - \Delta I) \qquad (8)$$
For notational simplicity, we always use the same symbol
r to denote the firing rate regardless of whether we
consider its dependence on the sound intensity I, r(I), or
on the effective sound intensity J, r(J). Strictly speaking,
r(I) and r(J) are different functions, but from
the context, it will always be clear to which function we refer.
Let us briefly describe the reason for the relation of Equation 8. For
concreteness, we focus on the energy hypothesis; the amplitude
and the pressure hypotheses can be dealt with in an analogous way.
Consider an arbitrary sound signal S(t) composed of a set of
pure tones with amplitudes An. From these, we
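can calculate the intensity, which is defined as:

$$I = 10\log_{10}\left(\frac{\frac{1}{2}\sum_n A_n^2}{p_{ref}^2}\right)\ \text{dB} \qquad (9)$$

with $p_{ref}$ the reference pressure of the SPL scale, as well as the effective sound intensity:

$$J_{EH} = \frac{1}{2}\sum_n \frac{A_n^2}{C_n^2}$$

The essential observation is that multiplying every
An by the same factor k amounts to
adding a constant 20 log10 k to the intensity
I (if k < 1, this constant is negative),
whereas JEH is multiplied by a factor
k².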
We now consider a noise stimulus with intensity
Inoise and effective sound intensity
$J_{EH}^{noise}$. To compare the response
with that of a pure tone, we find the intensity
Ipt that yields the same firing rate as
the noise stimulus by setting both effective sound intensities equal:

$$J_{EH}^{pt} = \frac{A_{pt}^2}{2C_{pt}^2} = J_{EH}^{noise}$$

The parameter Cpt denotes the
filter constant for the pure tone. From the preceding equation, we can
calculate the pure-tone amplitude Apt and
thus the intensity Ipt of the pure tone,
for which the firing rate is the same as for the noise signal with
given intensity Inoise. Let us denote the
difference between Inoise and
Ipt by ΔI.
If we multiply all amplitudes, i.e., the
amplitudes of the noise signal as well as
Apt, by the same factor k, the intensities
Inoise and
Ipt are changed by the same amount.
Consequently, the difference between the new intensities is still given
by ΔI. Likewise, the effective sound intensities are
multiplied by the same factor, i.e., we still have
$J_{EH}^{pt} = J_{EH}^{noise}$. Because the firing rate depends only on the value of J, this
means that for the new intensities, the firing rates are also equal. It
follows that whenever the intensities of the pure tone and the noise
signal differ by ΔI, the firing rates for the two stimuli are the same. A thorough mathematical derivation of this concept, which
also yields explicit expressions for the amount of the shifts for the
energy and pressure hypotheses, can be found in the Appendix.
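The derivation shows that the predicted ΔI is given
by:

$$\Delta I_{EH} = 10\log_{10}\left(\frac{\sum_n A_n^2}{C_{pt}^2 \sum_n A_n^2/C_n^2}\right) \qquad (10)$$

for the energy hypothesis and by:

$$\Delta I_{PH} = 10\log_{10}\left(\frac{4}{\pi}\cdot\frac{\sum_n A_n^2}{C_{pt}^2 \sum_n A_n^2/C_n^2}\right) \qquad (11)$$

for the pressure hypothesis. The two predictions for
ΔI differ by 10 log10(4/π) ≈ 1.05 dB. Because this is below our measurement accuracy, we do not use this
experiment for distinguishing between the hypotheses, but rather as a
test of the generality and the predictive power of the model per se.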
Evaluating Equations 10 and 11 is possible if one knows the
filter constants and the $A_n^2$ for the
amplitudes in the noise signal. The latter are given by the power
spectrum of the noise signal, which we calculated in discretized bins
of width 0.05 kHz (using a triangular Bartlett window). Filter
constants Cn were measured for pure tones
between 5 and 10 kHz at every 0.2-1 kHz (depending on the length
of the recording) by determining the amplitude that led to a firing
rate of 260 Hz. Additional filter constants
Cn, for all center frequencies of the
power-spectrum bins, can be determined by linear interpolation from the
measured filter constants.
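As a minimal sketch of this evaluation, the following Python function computes the predicted shifts from a binned power spectrum and a set of measured filter constants. The function name `predicted_shifts` and its arguments are hypothetical; the energy-hypothesis expression is the one obtained by equating (1/2)Σ An²/Cn² for the noise and the pure tone, and the pressure-hypothesis offset assumes Gaussian-distributed filtered noise.

```python
import numpy as np

def predicted_shifts(freqs_khz, power, f_meas_khz, c_meas, f_pt_khz):
    """Predicted shift I_noise - I_pt (in dB) for the energy and pressure hypotheses.

    freqs_khz, power   : bin centre frequencies and power per bin (proportional to A_n**2)
    f_meas_khz, c_meas : frequencies (ascending) and values of the measured filter constants
    f_pt_khz           : frequency of the reference pure tone
    """
    freqs_khz = np.asarray(freqs_khz, dtype=float)
    power = np.asarray(power, dtype=float)
    # Linear interpolation of the filter constants at the bin centres and at the
    # pure-tone frequency. Note: np.interp holds the end values constant outside
    # the measured range instead of extrapolating.
    c_n = np.interp(freqs_khz, f_meas_khz, c_meas)
    c_pt = np.interp(f_pt_khz, f_meas_khz, c_meas)
    # Energy hypothesis: total power relative to the filter-weighted power.
    # Any common normalisation of the power spectrum cancels in this ratio.
    delta_eh = 10.0 * np.log10(power.sum() / (c_pt**2 * np.sum(power / c_n**2)))
    # Pressure hypothesis (Gaussian-noise assumption): constant offset of about 1.05 dB.
    delta_ph = delta_eh + 10.0 * np.log10(4.0 / np.pi)
    return delta_eh, delta_ph
```

For a receptor with a flat filter (all Cn equal), the energy-hypothesis shift is zero: noise and pure tone of equal intensity would then be predicted to evoke the same firing rate.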
The prediction for the noise-stimulus rate-intensity function that
results from shifting the pure-tone rate-intensity function rpt(I) by
ΔI is compared with the measured curve
rnoise(I). To do this
quantitatively, the prediction of ΔI is related to the
true shift ΔItrue that can be extracted from
the measured rate-intensity functions of the pure tone and the noise
signal as the distance between these two functions. Because the
rate-intensity functions are given by individual pairs of intensity and
firing rate, (I, r), we use the distance of such a data
point of one rate-intensity function to the approximate location of the
other rate-intensity function. For a data point
(Ipt, rpt)
from pure-tone stimulation, e.g., we thus determine the intensity Înoise that would be expected to
lead to the same firing rate rpt, but for
the noise stimulus. The determination of
Înoise given the firing rate
rpt is again done by linear interpolation
of the noise rate-intensity function as in Figure 1. We thus find for
every intensity Ipt of the pure-tone
rate-intensity function a corresponding Înoise, and similarly
for every intensity Inoise of the
noise rate-intensity function a corresponding Îpt. Because ideally, these
should be related by Înoise = Ipt + ΔItrue and
Îpt = Inoise − ΔItrue, we can estimate
ΔItrue by minimizing the χ² function:
|
(12)
|
Because the subthreshold part and the saturation are not
important in the determination of the actual shift, only data points (I, r) with r between 20 and 80% of the maximum
firing rate of the cell were taken into account.
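The procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the original analysis code: the function names are hypothetical, the rate-intensity curves are assumed to be monotonically increasing, the maximum firing rate of the cell is approximated by the largest measured rate, and the squared residuals are left unweighted (the weighting used in Equation 12 may differ).

```python
import numpy as np

def intensity_at_rate(rate, intensities, rates):
    """Intensity at which a sampled rate-intensity curve reaches `rate`,
    found by linear interpolation (curve assumed monotonically increasing)."""
    rates = np.asarray(rates, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    order = np.argsort(rates)
    return np.interp(rate, rates[order], intensities[order])

def estimate_true_shift(I_pt, r_pt, I_noise, r_noise, lo=0.2, hi=0.8):
    """Estimate the shift I_noise - I_pt between two rate-intensity curves."""
    I_pt, r_pt, I_noise, r_noise = (
        np.asarray(x, dtype=float) for x in (I_pt, r_pt, I_noise, r_noise)
    )
    # Stand-in for the maximum firing rate of the cell.
    r_max = max(r_pt.max(), r_noise.max())

    def usable(r, r_other):
        # Keep points between 20 and 80% of the maximum rate that the other
        # curve also covers, so that interpolation never leaves its range.
        return ((r >= lo * r_max) & (r <= hi * r_max)
                & (r >= r_other.min()) & (r <= r_other.max()))

    m_pt = usable(r_pt, r_noise)
    m_noise = usable(r_noise, r_pt)
    # Both sets of residuals should ideally equal the true shift.
    d_pt = intensity_at_rate(r_pt[m_pt], I_noise, r_noise) - I_pt[m_pt]
    d_noise = I_noise[m_noise] - intensity_at_rate(r_noise[m_noise], I_pt, r_pt)
    # An unweighted sum of squared residuals is minimised by their mean.
    return float(np.mean(np.concatenate([d_pt, d_noise])))
```

With unweighted residuals the minimization has a closed-form solution, so no numerical optimizer is needed; a weighted variant would replace the mean by a weighted mean.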
 |
RESULTS |
The objective of this study is to develop a descriptive model for
the responses of auditory receptor neurons to arbitrary stationary
acoustic stimuli. This is done to identify the dominant physical
stimulus property governing the encoding of sound intensity. We first
develop a general mathematical framework for the transformation of the
incoming sound into the neural response. Subsequently, we apply the
model to locust auditory receptors and show that the experimental data
are well described by only one of three rival hypotheses about the
nature of the primary signal transduction.
Derivation of the mathematical model
In locusts, auditory signals are encoded by 60-80 receptor
neurons at each ear with similar general properties but considerable variability in the parameter values describing the sensitivity of
individual neurons to specific sound frequencies (Römer, 1985). In response to a pure tone with sufficient intensity,
the firing rate of a receptor cell increases in a sigmoidal fashion
with stimulus intensity (Fig. 3A). The steepness and level
of saturation of this rate-intensity function depend on the individual
cell and temperature. Below a threshold intensity, there is no activity or only very low spontaneous activity. The regime between threshold and saturation usually spans ~15-30 dB, and maximum firing rates are
~300 Hz at room temperature and ~500 Hz at 30°C.

Figure 3.
Firing-rate responses of
a locust auditory receptor cell. A, Rate-intensity function
for a 7 kHz pure tone. The observed sigmoidal shape of the
rate-intensity function is typical for many receptor types. Below a
threshold of ~45 dB SPL, the cell shows virtually no response.
B, Rate-intensity functions of the same neuron for many
different pure tones between 1.25 and 28 kHz. Connected
points belong to the same sound frequency. Curves
farther to the left correspond to frequencies at which the
cell is more sensitive. Although there are large differences concerning
the intensity range where the individual rate-intensity functions rise
from threshold to saturation, their overall shape is very similar. For
example, all measured rate-intensity functions have approximately the
same slope in the rising part of the curves and saturate at around the
same level. C, The same rate-intensity functions as in
B, now shifted along the decibel axis such that they align
at a firing rate of 250 Hz. This demonstrates the generic shape of the
rate-intensity functions. D, Curves denoting equal firing
rates at different sound intensities for the same cell. The threshold
curve (solid line) and the intensities corresponding to
constant firing rates of 150 Hz (dashed line) and 300 Hz (dotted line) are shown for pure tones between 1.25 and
28 kHz. The three curves are approximately parallel to each other,
reflecting the similarity of the rate-intensity functions for different
frequencies.
The frequency-resolved sensitivity of the receptors can be
characterized by a threshold curve, i.e., the dependence of the threshold on the sound frequency (Fig. 3D). The receptors
are fairly broadly tuned with characteristic frequencies in the range of 4 kHz (low-frequency receptors) to 15 kHz (high-frequency
receptors), and the absolute sensitivities vary strongly between
individual neurons (Römer, 1985; Jacobs et al., 1999).
Measuring rate-intensity functions from a single receptor cell for many
different sound frequencies reveals another property of the receptors:
to a good approximation, the rate-intensity functions are shifted
versions of one another along the intensity axis, where intensity is
measured in the logarithmic units of sound-pressure level, decibels
SPL. This phenomenon has been reported previously by Suga (1960) and Römer (1976). A detailed
example with frequencies spanning the whole sensitivity range of a
typical low-frequency receptor (characteristic frequency of 5 kHz)
can be seen in Figure 3B. The generic shape of the
rate-intensity functions becomes even clearer if they are shifted
relative to each other and aligned at a firing rate of 250 Hz (Fig. 3C). Figure 3D shows the threshold curve together
with curves denoting the intensities that lead to firing rates of 150 and 300 Hz. As a consequence of the generic shape of the rate-intensity
functions, all curves are approximately parallel to each other.
These key findings indicate that over the whole frequency range, the
coupling of the physical stimulus is not substantially influenced by
mechanical nonlinearities. In fact, a simple filtering mechanism
captures the essence of the observed phenomenon. Let us assume that for
all pure tones the firing rate is given by a single function
r(An/Cn).
An denotes the amplitude of a specific pure tone of
frequency fn, and
Cn is a frequency-dependent filter constant such
that the firing rate depends only on the ratio
An/Cn. This corresponds to a
gain factor of 1/Cn for each sound frequency. For two different frequencies f1 and
f2, the firing rates
r(A1/C1) and
r(A2/C2) are then the
same when A1/C1 = A2/C2, i.e., when the
amplitudes take on a constant ratio
A1/A2 = C1/C2. Because the intensity
I in decibels SPL is defined as a logarithmic measure of the
amplitude, I = 20log10(A/(√2·20 µPa)), this
constant amplitude ratio corresponds to a constant intensity difference,
ΔI = I1 − I2 = 20log10(C1/C2).
The firing rates for the two tones are therefore always the same if
their intensities differ by ΔI. The rate-intensity
functions are thus shifted versions of one another separated by
ΔI as found in the experiment. Generalizing this idea to
stimuli containing more than one frequency leads us to the first step
of our model:
Step 1: coupling to the stimulus
The sound pressure wave S(t), written as a Fourier
series:
$$S(t) = \sum_n A_n \sin(2\pi f_n t + \varphi_n) \qquad (13)$$
where the fn denote the frequencies,
φn phase offsets, and the
An the respective amplitudes, is initially
transformed into a filtered signal S̃(t):
$$\tilde{S}(t) = \sum_n \frac{A_n}{C_n} \sin(2\pi f_n t + \tilde{\varphi}_n) \qquad (14)$$
The amplitudes are multiplied by frequency-dependent gain factors
1/Cn. These describe the frequency-resolved
sensitivity, i.e., the tuning of the receptor cell, and correspond
directly to the values of the threshold curve at the frequencies
fn. In addition, a putative phase shift turns
φn into φ̃n.
Although the above reasoning for using a linear filter as the first
model step is based on electrophysiological observations only, it
corresponds well with biophysical findings regarding the tympanal
membrane. Schiolten et al. (1981) observed that the tympanal membrane behaves approximately as a linear oscillator with a
short damping time constant of ~100 µsec. The resonance properties
of this oscillator are thought to be responsible for the
frequency-resolved gain of the receptors and therefore also for the
shapes of the threshold curves (Michelsen, 1971a, 1971b, 1979). Michelsen and
Rohrseitz (1995) also note that the amplitude of the tympanal
vibration depends linearly on the sound pressure for pure tones.
Step 2: mechanosensory transduction
Receptor cells are attached to the tympanal membrane with a cilium
protruding from the dendrite and several auxiliary cells surrounding a
receptor (Gray, 1960). The biophysical functioning of
this machinery is not yet understood, but oscillations of the tympanal
membrane presumably lead to conductance changes in the receptors'
dendrites that give rise to membrane depolarizations (Hill, 1983a, 1983b). This is where a
spectral integration of frequency-dependent stimulus attributes must
occur. Voltage fluctuations in the range of the relevant sound
frequencies (several kilohertz) cannot be transmitted by the cell
membrane because of its low-pass filter properties. Information about
the spectral content is therefore lost at the level of the membrane
potential, which, instead, is expected to correspond to an integrated
stimulus property. The spectrum of the generator potential after
acoustic stimulation is indeed found to contain no trace of the sound
frequency used (Hill, 1983a).
Following ideas from the literature concerning temporal integration in
auditory receptor cells (Tougaard, 1996; Heil and
Neubauer, 2001), we set up three hypotheses for the
spectral integration by calculating an "effective sound intensity"
J from S̃(t).
Amplitude hypothesis (AH)
J corresponds to the maximum amplitude of
S̃(t). This is the common view of a threshold: a
response occurs once the signal reaches a certain value. In the case of
few frequency components, J is given by the sum of the
scaled amplitudes:
$$J = \sum_n \frac{A_n}{C_n} \qquad (15)$$
Energy hypothesis (EH)
J corresponds to the temporal mean of the squared
signal [throughout what follows, ⟨x(t)⟩ denotes the
temporal mean of x(t)]:
$$J = \langle \tilde{S}(t)^2 \rangle \qquad (16)$$
From Parseval's Theorem (Press et al., 1992), we
see that this expression can be rewritten as the sum of the squares of
the scaled amplitudes:
$$J = \frac{1}{2} \sum_n \frac{A_n^2}{C_n^2} \qquad (17)$$
Because the square of the amplitude of a sinusoidal oscillation is
proportional to the energy contained in the oscillation, this
hypothesis reflects an energy-integration mechanism.
Pressure hypothesis (PH)
J corresponds to the temporal mean of the absolute
value of S̃(t):
$$J = \langle |\tilde{S}(t)| \rangle \qquad (18)$$
This hypothesis complies with a pressure-integration mechanism
after half-wave rectification.
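To make the three hypotheses concrete, the following Python sketch evaluates each effective sound intensity for a superposition of pure tones. The function name `effective_intensities`, the sampling rate, and the stimulus duration are illustrative assumptions; the filtered signal is synthesized as a sum of sinusoids with amplitudes An/Cn.

```python
import numpy as np

def effective_intensities(amps, freqs_hz, consts, phases=None,
                          duration=0.1, fs=1_000_000):
    """Effective sound intensity J under the three hypotheses for a sum of tones."""
    amps, freqs_hz, consts = (
        np.asarray(x, dtype=float) for x in (amps, freqs_hz, consts)
    )
    if phases is None:
        phases = np.zeros_like(amps)
    phases = np.asarray(phases, dtype=float)
    # Sample times; the sample count is rounded to avoid floating-point off-by-one.
    t = np.arange(int(round(duration * fs))) / fs
    scaled = amps / consts  # amplitudes after the linear filter, A_n / C_n
    s_filt = np.sum(
        scaled[:, None] * np.sin(2 * np.pi * freqs_hz[:, None] * t + phases[:, None]),
        axis=0,
    )
    return {
        "AH_sum": scaled.sum(),               # sum of scaled amplitudes
        "AH_max": np.max(np.abs(s_filt)),     # numerical maximum of the filtered signal
        "EH_mean_square": np.mean(s_filt**2), # temporal mean of the squared signal
        "EH_parseval": 0.5 * np.sum(scaled**2),
        "PH": np.mean(np.abs(s_filt)),        # temporal mean of the absolute value
    }
```

When the duration covers an integer number of periods of every component, `EH_mean_square` and `EH_parseval` agree up to rounding, which illustrates why the energy hypothesis is independent of the phases. `AH_max` never exceeds `AH_sum` and approaches it only when the peaks of all components coincide within the stimulus.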
Step 3: encoding by firing rates
The response of an auditory receptor to a signal of constant
intensity can be characterized by a mean firing rate r. The
rate is obtained from a one-dimensional, nonlinear transformation of the effective sound intensity J:
$$r = r(J) \qquad (19)$$
Note that the effective sound intensity J is a
theoretical construct, which does not necessarily correspond to a
biophysical property. It is used here to describe regions of constant
firing rate in stimulus space because these correspond to regions of constant J. Therefore, instead of the particularly simple
versions of J given above, we could also use any
transformation J̃ = f(J) with some appropriate function
f. This transformation does not affect the shape of the
regions of constant J, but we can speculate that for the
correct choice of f, J̃ has a direct biophysical interpretation, such as the change in membrane conductance caused by
the stimulus.
Measured spike-train responses have a strong transient attributable to
adaptation. In a first approach, we average over this temporal
structure in the response and consider only the total number of spikes
elicited by the stimulus. In a second, more detailed analysis, we
analyze individual parts of the response to explicitly test how this
structure in the spike trains might affect our model description.
Electrophysiological experiments
Experimental strategy
To directly address the question of spectral integration and the
hypotheses in step 2 of our model, we compare only stimuli that lead to
the same firing rate of a given neuron. With this strategy, we
circumvent complications attributable to the nonlinearity induced by
the spike-generation mechanism. In terms of our model,