Abstract
Accurate detection of sensory input is essential for the survival of a species. Weakly electric fish use amplitude modulations of their self-generated electric field to probe their environment. P-type electroreceptors convert these modulations into trains of action potentials. Cumulative relative refractoriness in these afferents leads to negatively correlated successive interspike intervals (ISIs). We use simple and accurate models of P-unit firing to show that these refractory effects lead to a substantial increase in the animal's ability to detect sensory stimuli. This assessment is based on two approaches, signal detection theory and information theory. The former is appropriate for low-frequency stimuli, and the latter for high-frequency stimuli. For low frequencies, we find that signal detection depends on differences in mean firing rate and is optimal for a counting time at which spike train variability is minimal. Furthermore, we demonstrate that this minimum arises from the presence of negative ISI correlations at short lags and of positive ISI correlations that extend out to long lags. Although ISI correlations might be expected to reduce information transfer, we find that they in fact improve information transmission about time-varying stimuli. This is attributable to the differential effect that these correlations have on the noise and baseline entropies. Furthermore, the gain in information transmission rate attributable to correlations exhibits a resonance as a function of stimulus bandwidth; the maximum occurs when the inverse of the cutoff frequency of the stimulus is of the order of the decay time constant of refractory effects. Finally, we show that the loss of potential information caused by a decrease in spike-timing resolution is smaller for low stimulus cutoff frequencies than for high ones. This suggests that a rate code is used for the encoding of low-frequency stimuli, whereas spike timing is important for the encoding of high-frequency stimuli.
Key words: electrosensory afferent; electrolocation; interspike intervals; spike train variability; weak signal detection; correlations; information theory; resonance
Detecting external stimuli is essential for an animal to survive in its environment. A variety of methods have been used to quantify information transmission in sensory systems, such as information theory (Borst and Theunissen, 1999; Buračas and Albright, 1999), signal detection theory (Green and Swets, 1966; Gabbiani and Koch, 1998), and stimulus reconstruction (Rieke et al., 1997; Gabbiani and Koch, 1998). Information theory makes no assumptions about the nature of the neural code and uses mutual information to quantify information transfer in sensory systems. The mutual information I(X,Y) between two random variables X and Y is equal to H(X) − H(X|Y), where H(X) is the entropy of X, and H(X|Y) is the conditional entropy of X given Y. Because entropy increases with variability, it is clear that I(X,Y) will increase if the variability of X is high and the variability of X given Y is low. In neural systems, we can identify a time-varying stimulus with Y and a spike train with X. Thus, information theory suggests that high variability of the spike train in the absence of the stimulus, combined with low variability in the presence of the stimulus, will maximize information transmission. In particular, because correlations are known to reduce entropy (Shannon, 1948), one might expect that the correlations in a spike train will degrade information transmission.
On the other hand, signal detection theory, in the context of neural spike trains, aims to determine the presence versus the absence of a stimulus based on differences in the number of spikes counted in an interval of length T. The discriminability d between the two cases is given by the absolute value of the difference between the mean spike counts with and without the stimulus, divided by the square root of the summed variances of these spike counts (see Eq. 6). It is clear that discriminability is enhanced by low variability in the spike train, because this results in low variability of the spike count (low variance) over the counting window. In this paper, we show that negative interspike interval (ISI) correlations, i.e., the tendency for long ISIs to be followed by short ISIs (and vice versa), reduce spike count variability, whereas positive ISI correlations increase it. Together, these effects lead to an optimal spike counting time at which discriminability is maximal.
Contrasting the information theoretic and signal detection approaches points to an apparent contradiction with respect to spike train variability: information theory requires high variability, whereas we have just shown that signal detection requires low variability. However, we will show, in the context of the electrosensory system, that each approach is best suited to a particular stimulus frequency range. Also, we will show that negative ISI correlations enhance both the mutual information I and the discriminant measure d .
Gymnotiform weakly electric fish are particularly adept at detecting prey (Nelson and MacIver, 1999) and each other (Dulka et al., 1995; Bastian et al., 2001). They emit a sinusoidal time-varying electric field through their electric organ discharge (EOD; frequency, 600–1200 Hz). P-type electroreceptors on their skin detect amplitude modulations (AMs) of this field caused by nearby objects or conspecifics (for review, see Bastian, 1981; Zakon, 1986). These P-units fire action potentials when driven by the EOD alone, i.e., without AMs. Detection of external stimuli is therefore presumably based on a change from this baseline activity (Ratnam and Nelson, 2000). Furthermore, these P-units exhibit negative ISI correlations at low lags (Longtin and Racicot, 1997; Chacron et al., 2000; Ratnam and Nelson, 2000), resulting from cumulative relative refractoriness (Chacron et al., 2000).
For Apteronotus leptorhynchus (brown ghost knifefish), P-unit spike train variability decreases (as measured by the Fano factor; see Materials and Methods) by as much as two orders of magnitude for counting times varying between 40 and 1000 EOD cycles (Ratnam and Nelson, 2000) before increasing again. Our computational study provides an explanation for this result and for the fact that P-units can transmit information about both low- and high-frequency stimuli. Our study uses simple and accurate biophysically plausible models of P-unit activity to generate the large data sets necessary for mutual information analysis and for establishing the role of ISI correlations in the enhancement of stimulus coding.
MATERIALS AND METHODS
Interspike interval analysis
Let us denote by {I_{j}} the ISI sequence with mean 〈I〉 and variance VAR(I). The coefficient of variation (CV) of the interspike interval histogram (ISIH) provides a good measure of spike train variability on time scales of the order of the mean ISI. If {I_{j}} is stationary, then the coefficient of variation is defined as:
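A minimal NumPy sketch of this quantity, assuming the standard definition CV = √VAR(I)/〈I〉 (function name ours):

```python
import numpy as np

def coefficient_of_variation(isis):
    """CV of a stationary ISI sequence: sqrt(VAR(I)) / <I>.
    CV = 0 for a perfectly regular train; CV = 1 for a Poisson process."""
    isis = np.asarray(isis, dtype=float)
    return isis.std() / isis.mean()
```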
Serial correlation coefficients
One common measure of memory effects in a time series of events is the serial correlation coefficients (SCCs) ρ_{j} defined for lag j by:
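In practice, the SCCs are estimated from a recorded ISI sequence; a sketch using the standard normalized-autocovariance estimator, ρ_j = (〈I_i I_{i+j}〉 − 〈I〉²)/VAR(I) (function name ours):

```python
import numpy as np

def serial_correlation(isis, lag):
    """SCC rho_j at the given lag: covariance between ISIs separated by
    `lag` intervals, normalized by the ISI variance (so rho_0 = 1)."""
    x = np.asarray(isis, dtype=float) - np.mean(isis)
    n = len(x)
    return np.dot(x[:n - lag], x[lag:]) / (n - lag) / np.var(isis)
```

For a sequence that strictly alternates between a short and a long interval, this estimator returns ρ_1 ≈ −1, the extreme case of the negative lag-1 correlations discussed below.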
Spectral density function
The spectral density function (SDF) is the discrete Fourier transform of the SCCs (Cox and Lewis, 1966) and is defined for positive frequencies f as:
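A sketch of this transform in the cosine form s(f) = 1 + 2 Σ_j ρ_j cos(2πfj); normalization conventions vary between texts, so only the shape, not the absolute scale, should be read from this:

```python
import numpy as np

def interval_sdf(rho, freqs):
    """Spectral density of the ISI sequence, up to normalization:
    s(f) = 1 + 2 * sum_j rho_j * cos(2*pi*f*j), f in cycles per interval.
    A negative rho_1 depresses s(f) at low f; weak positive SCCs at long
    lags raise it there instead."""
    rho = np.asarray(rho, dtype=float)
    j = np.arange(1, len(rho) + 1)
    return np.array([1.0 + 2.0 * np.dot(rho, np.cos(2.0 * np.pi * f * j))
                     for f in freqs])
```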
Pulse number distributions and the Fano factor time curve
The pulse number distribution (PND) (Barlow and Levick, 1969a,b; Teich and Khanna, 1985) P(n,T) is defined as the probability of observing n spikes during a counting time T (the PND is sometimes referred to as the spike count distribution). It is calculated by first dividing the spike train into non-overlapping time windows of length T and then counting the number of action potentials in each window. A normalized histogram of these counts then yields the PND. The Fano factor (Fano, 1947) is defined as the variance-to-mean ratio of the PND:
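A direct estimate from a list of spike times follows the definition, F(T) = VAR(N)/〈N〉 over non-overlapping windows of length T; a sketch (units and window length are illustrative):

```python
import numpy as np

def fano_factor(spike_times, T, t_end):
    """F(T): variance-to-mean ratio of spike counts in non-overlapping
    windows of length T covering [0, t_end)."""
    n_windows = int(t_end // T)
    edges = np.arange(n_windows + 1) * T
    counts, _ = np.histogram(spike_times, bins=edges)
    return counts.var() / counts.mean()
```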
The Fano factor always approaches unity for small T (Teich et al., 1997). For a Poisson process, we have F(T) = 1 for any T (Cox and Lewis, 1966). Processes with F(T) < 1 are thus considered less variable than Poisson, whereas those with F(T) > 1 are more variable (Gabbiani and Koch, 1998). The asymptotic value F_{∞} of the Fano factor is related to the SCCs of the ISI sequence (Cox and Lewis, 1966) according to:
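In the standard form given by Cox and Lewis (1966), and consistent with its uses below (for a renewal process all ρ_j vanish and F_∞ reduces to CV²; the truncated sums quoted in Results replace the infinite sum with a finite one), the relation reads:

```latex
F_{\infty} \;=\; \lim_{T\to\infty} F(T) \;=\; \mathrm{CV}^{2}\left(1 + 2\sum_{j=1}^{\infty}\rho_{j}\right)
```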
Signal detection theory
The ideal observer paradigm is based on the optimal discrimination between PNDs obtained in the presence and absence of a stimulus (Green and Swets, 1966); for examples of applications to neural systems, see Nachimas (1972), Shofner and Dye (1989), Gabbiani and Koch (1998), and Gabbiani and Metzner (1999). Let P_{0}(n,T) be the PND obtained without stimulus and let P_{1}(n,T) be the one obtained in the presence of the stimulus. Then, we define the probability of false alarm P_{FA}, i.e., of reporting a signal when it is not there, as (Gabbiani and Koch, 1998):
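In code, taking P_FA(θ) as the probability that the count meets or exceeds a threshold θ under P_0, together with the corresponding probability of detection P_D under P_1 (sweeping θ traces out the ROC curve; function name ours):

```python
import numpy as np

def roc_curve(counts0, counts1):
    """For each count threshold th, P_FA = P(n >= th | no stimulus) and
    P_D = P(n >= th | stimulus), estimated from spike-count samples."""
    counts0 = np.asarray(counts0)
    counts1 = np.asarray(counts1)
    thresholds = np.arange(0, max(counts0.max(), counts1.max()) + 2)
    p_fa = np.array([(counts0 >= th).mean() for th in thresholds])
    p_d = np.array([(counts1 >= th).mean() for th in thresholds])
    return p_fa, p_d
```

When the two count distributions do not overlap, some threshold yields P_FA = 0 with P_D = 1, i.e., perfect detection.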
Information theory
The entropy of a discrete random variable X with probability density function P(X) is defined as (Shannon, 1948; Cover and Thomas, 1991):
The variability of the neural response to an ensemble of stimuli is characterized by the total entropy H_{total} (Strong et al., 1998; Buračas and Albright, 1999) and is estimated from the neural response to an unrepeated Gaussian stimulus. The trial-to-trial variability of the neural response to a repeated stimulus is characterized by the noise entropy H_{noise} (Strong et al., 1998; Reinagel and Reid, 2000), defined as the conditional entropy of the spike train given a stimulus (previously referred to as H(X|Y) in the introductory remarks). Thus H_{noise} represents the variability (“noise”) in the spike train that cannot be accounted for by the stimulus.
A good stimulus encoder must produce a neural response that varies strongly across different stimuli while remaining highly reliable across repeated presentations of the same stimulus. In the case of neurons that are silent in the absence of sensory input, an estimate of the mutual information about the stimulus is given by (Strong et al., 1998):
Modeling
Because we are interested in understanding how correlations of P-unit ISIs affect information transfer, we use two models of P-type electroreceptors. The Nelson model (Nelson et al., 1997) accounts for first-order ISI statistics (i.e., the ISIH) and for the gain and phase response to sinusoidal AMs; however, it is memoryless in the sense that the significant ISI correlations displayed by experimental data are absent (see Fig. 1 and Chacron et al., 2000). The other model uses the filtering property of the Nelson model to extend the model proposed by Chacron et al. (2000). The latter model was shown to reproduce first- and second-order statistics of experimental data, namely the ISIH, the ISI return map (I_{j+1} as a function of I_{j}), and the ISI correlations as measured by the SCCs. We start with a description of the Nelson model and then describe ours. For convenience, a list of all symbols and acronyms used in this paper is provided in one appendix, and a second appendix summarizes all equations used for both models.
The Nelson model
It is known from experiments that various filtering mechanisms are at work inside a P-unit (Hopkins, 1976; Bastian, 1981; Wessel et al., 1996; Xu et al., 1996; Nelson et al., 1997). Nelson et al. (1997) measured the gain and phase response characteristics of P-units to sinusoidal AMs of frequencies in the range 0.1–200 Hz, leading to the following set of differential equations:
A modified integrate-and-fire-type model
Biophysical justification. We begin with a description of P-type electroreceptors to biophysically justify our model. A P-unit is composed of 25–40 receptor cells and a nerve fiber making synaptic contact onto at least 16 active neurotransmitter release sites per receptor cell (Bennett et al., 1989). Although it is currently impossible to record intracellularly from these cells, there is much indirect evidence that the EOD amplitude changes individual receptor potentials, which govern the rate of release of neurotransmitter onto the afferent nerve. Fluctuations in this rate are thus one expected source of variability in such systems (Stein, 1965; Nelson et al., 1997). Another possible source is the conductance fluctuations of the ionic channels at the spike initiation zone in the axon.
As mentioned above, we expect relative refractoriness to be important at such high firing rates (Stein, 1965). It leads to negative SCCs for the ISIs at low lags (Geisler and Goldberg, 1966; Chacron et al., 2000). Because there is currently no biophysical characterization of the ionic conductances inside the P-unit, we can only speculate as to the possible mechanisms responsible for this experimentally observed relative refractoriness. The physiological mechanisms responsible for these correlations could be presynaptic in origin; for example, long-term depression at the synapses connecting the receptor cells to the afferent nerve (Bennett et al., 1989; Hausser and Roth, 1997) would lead to relative refractoriness. However, the recovery time constant of the neurotransmitter at typical synapses is usually in the thousands-of-milliseconds range (von Gersdorff et al., 1997), which is much too long for the phenomenon at work here, because serial correlations are significant only up to lag 2 (Chacron et al., 2000) (see Fig. 1). Thus, we are looking at a time scale of approximately 10 EOD cycles (twice the mean ISI, which is 5 EOD cycles long for the unit considered here). A likely candidate would be a postsynaptic spike-activated potassium channel that slowly deactivates and thus summates to produce a negative adaptation current. The KV3.1 channel has the right activation and deactivation kinetics (Wang et al., 1998). Members of the KV3 family are richly expressed in the electrosensory system (A. J. Rashid, personal communication), but it remains to be shown whether similar channels are present in P-units.
From the foregoing discussion, we feel that the best approach for studying the effects of correlations on information transfer and signal detection is to use a simple yet biophysically plausible model that reproduces the essential features of Punit baseline and evoked discharge.
Description. First, we will give an expression for the synaptic current at the spike initiation zone in the axon and then describe the spiking mechanism. We use the simple model in Chacron et al. (2000) that was proposed to model the baseline firing dynamics of P-type electroreceptors and extend it to obtain the proper responses to time-varying AMs of the EOD amplitude. We write the transdermal potential on the fish's skin as (A(t) + A_{0}) sin(2πf_{EOD}t), where A_{0} is the constant EOD amplitude corresponding to baseline firing dynamics (this is similar to r_{base} in Nelson's model) and A(t) is the AM. The stimulus A(t) is filtered using Equations 3–5. Because many receptors rectify a periodic input (French et al., 1972; Gabbiani, 1996), we take the total synaptic current to be:
In contrast, we take τ_{2} to be much greater than the EOD cycle. Hence, this noise term can be thought of as “slow” compared with the model dynamics (it is almost constant for time scales much smaller than τ_{2}) (see Fig. 2 a). It thus models fluctuations on slow time scales (e.g., fluctuations in vesicular release rate; see below). This term is needed to accurately reproduce the Fano factor curve as shown in Results.
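Such a slow noise term can be generated by Euler–Maruyama integration of an Ornstein–Uhlenbeck process; a sketch, assuming the common parameterization with stationary variance D·τ (parameter values illustrative):

```python
import numpy as np

def ou_noise(n_steps, dt, tau, D, rng):
    """Ornstein-Uhlenbeck process d(lam) = -(lam/tau) dt + sqrt(2 D) dW,
    integrated with Euler-Maruyama. Correlation time tau, stationary
    variance D * tau; for dt << tau the trace is 'slow' relative to dt."""
    lam = np.zeros(n_steps)
    amp = np.sqrt(2.0 * D * dt)
    for i in range(1, n_steps):
        lam[i] = lam[i - 1] * (1.0 - dt / tau) + amp * rng.standard_normal()
    return lam
```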
The spiking mechanism is a simple extension of the leaky integrate-and-fire (LIF) model, in which a spike is said to have occurred when the membrane potential V reaches a constant threshold θ. Immediately afterward, the voltage is reset to its resting value (usually taken to be 0). LIF models are memoryless in the sense that consecutive ISIs are not correlated (ρ_{i} = 0 for all i > 0). To include refractory effects, we make θ a dynamical variable as well (Geisler and Goldberg, 1966). But instead of making it random (Gestri et al., 1980; Gabbiani and Koch, 1996, 1998), we let it carry the memory through the following firing rule (Chacron et al., 2000): when the voltage reaches the threshold, the voltage is reset to zero as in the LIF model, whereas the threshold is incremented by a constant amount Δθ and kept constant for the duration of the absolute refractory period T_{r}, after which it relaxes exponentially toward its equilibrium value θ_{0} until the next spiking time. The equations for voltage and threshold between action potentials and after the absolute refractory period are thus:
For the remainder of this paper, we refer to our model as the leaky integrate-and-fire with dynamic threshold (LIFDT) model. Note that our dynamic threshold can model the aforementioned KV3.1 channel but could also result from any current that increases the effective distance between voltage and threshold immediately after a spike. It is thus very general and is used here to model cumulative refractory effects and consequently endow the ISI sequence with the proper second-order statistics (Chacron et al., 2000) (Fig. 1). The experimentally obtained SCC at lag 1 for the P-receptor data was −0.35 (Chacron et al., 2000), whereas it is −0.385 for the model. Note that the three parameters for the spiking mechanism (θ_{0}, τ_{v}, and τ_{θ}) can be adjusted to give the proper first- and second-order statistics (ISIH, ISI return map, and SCCs) for other P-units with different firing rates (data not shown). This spiking model could thus be used to model other neural systems in which negative ISI correlations at short lags have been observed (e.g., the auditory system) (Lowen and Teich, 1992). The reason why the dynamic threshold gives rise to negative ISI correlations can be understood as follows: suppose an ISI shorter than 〈I〉 has just occurred; the threshold (having had less time to decay) will then typically be high after the spike and will thus take a long time to decay. Consequently, the next ISI will (on average) be longer than 〈I〉. This gives rise to a negative SCC at lag 1. The same holds when an ISI longer than 〈I〉 occurs; the threshold will then be lower, and the next ISI will be shorter than 〈I〉. However, because of the strong noise λ_{1}, negative ISI correlations at longer lags are washed out. An extended explanation of the role of the noise in the model and of why the dynamic threshold leads to negative ISI correlations in the presence of noise can be found in Chacron et al. (2001).
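The firing rule above can be sketched as follows. All parameter values are illustrative rather than the paper's fitted values, and the absolute refractory period is approximated here by simply enforcing a minimum ISI instead of clamping the threshold:

```python
import numpy as np

def lifdt_isis(n_isis, dt=0.02, tau_v=1.0, tau_th=3.0, theta0=1.0,
               dtheta=1.0, t_ref=0.1, mu=1.5, sigma=0.3, seed=0):
    """Leaky integrate-and-fire with dynamic threshold (LIFDT), sketch.
    On each spike, V resets to 0 and theta jumps by dtheta; between spikes,
    V integrates a noisy input while theta decays back toward theta0.
    Returns the resulting ISI sequence."""
    rng = np.random.default_rng(seed)
    v, theta, t, t_last = 0.0, theta0, 0.0, 0.0
    amp = sigma * np.sqrt(dt)
    isis = []
    while len(isis) < n_isis:
        t += dt
        v += dt * (mu - v / tau_v) + amp * rng.standard_normal()
        theta += dt * (theta0 - theta) / tau_th
        if v >= theta and t - t_last >= t_ref:
            isis.append(t - t_last)
            v, t_last = 0.0, t
            theta += dtheta   # cumulative refractoriness carries the memory
    return np.array(isis)
```

With the threshold decay time comparable to the mean ISI, a short interval leaves the threshold high at the next spike, lengthening the following interval, so the simulated SCC at lag 1 comes out negative, as in the text.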
Also, our extended model gives the correct responses to sinusoidal AMs (see Fig. 3 and the next section).
Stimulation
Baseline firing statistics were computed for both models. Their ability to encode time-varying stimuli was tested using AMs of the EOD amplitude. For the LIFDT model, the EOD amplitude minus its baseline value was given by:
The stimulus cς_{stim}s(t) was then presented to the Nelson model, and the constant c was adjusted so that both models gave identical gains and phases over the frequency range of the sinusoidal AMs. For the purpose of quantifying the amount of information transmitted, we then took s(t) to be low-pass filtered Gaussian white noise of mean 0 and variance 1. A fourth-order Butterworth filter with cutoff frequency f_{c} was used (Wessel et al., 1996). As mentioned above, this type of stimulus has been widely used to quantify the ability of neurons to encode time-varying stimuli by means of the stimulus reconstruction technique (Gabbiani, 1996; Gabbiani and Koch, 1996, 1998; Wessel et al., 1996; Chacron et al., 2000; Kreiman et al., 2000).
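A sketch of such a stimulus; here the fourth-order Butterworth magnitude response is applied in the frequency domain via the FFT rather than with a time-domain filter, and the sampling rate and cutoff are illustrative:

```python
import numpy as np

def lowpass_gaussian_stimulus(n, fs, fc, rng):
    """Gaussian white noise low-pass filtered with a 4th-order Butterworth
    magnitude response, |H(f)|^2 = 1 / (1 + (f/fc)^8), then renormalized
    to zero mean and unit variance."""
    white = rng.standard_normal(n)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    H = 1.0 / np.sqrt(1.0 + (freqs / fc) ** 8)
    s = np.fft.irfft(np.fft.rfft(white) * H, n)
    return (s - s.mean()) / s.std()
```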
RESULTS
We first explain how negative ISI correlations (Fig. 1) lead to a decrease in spike train variability and how positive ISI correlations increase this variability. We then show how a combination of strong negative ISI correlations at short lags and weak positive ISI correlations induced by a weak correlated noise (Fig. 2) and extending out to long lags gives rise to the minimum in the Fano factor curve F(T) seen experimentally by Ratnam and Nelson (2000). We consider the role of these ISI correlations for signal detection using the ideal observer paradigm by considering two models (see Materials and Methods) that have identical responses to sinusoidal AMs (Fig. 3). We then study their contribution to the ability of the receptor to encode time-varying stimuli in the form of low-pass filtered Gaussian white noise. This is done by computing the mutual information rate as a function of stimulus contrast and cutoff frequency. Finally, we show that signal detection is best suited for low-frequency stimuli (e.g., electrolocation signals) (Nelson and MacIver, 1999), because spike train variability is low on long time scales, and that information theory is best suited for high-frequency stimuli (e.g., electrocommunication signals) (Zupanc and Maler, 1993; Bastian et al., 2001), because spike train variability is high on short time scales.
Fano factor
The Fano factor curve obtained for the LIFDT model with D_{2} = 0 is plotted in Figure 4 (triangles). We see that the electroreceptor is more regular at all time scales than a Poisson process, because F(T) < 1 (Cox and Lewis, 1966). F(T) decreases for T in the 1–5000 EOD cycle range and has an asymptotic value of 0.00685 (n = 5 line). If we take only the SCCs up to lag 5 to be nonzero, then we get F_{∞} = 0.00681 from Equation 1, which is very close to the observed asymptotic value of 0.00685. For comparison, the Fano factor time curve obtained by random shuffling of the ISI sequence is also plotted in Figure 4 (diamonds). Because all ISI correlations have been eliminated by this operation, we now have a renewal process (Cox and Lewis, 1966) for which F(T) tends toward CV^{2} by Equation 1 (CV^{2} line in Fig. 4). Note that because CV^{2} ≈ 0.0436 < 1 in our case, F_{∞} < 1 even in the absence of ISI correlations. The two curves (triangles and diamonds) lie almost on top of one another for short counting times (<10 EOD cycles), implying that correlations do not play a significant role over this range from the Fano factor perspective; as we will see below, however, this is not the case from an information theoretic perspective. The curves diverge for longer times: the Fano factor curve without correlations tends toward CV^{2}, whereas the one with correlations has a lower asymptotic value. We plot the Fano factor curve obtained with the Nelson model in Figure 4 (squares). It matches the one obtained for randomly shuffled ISI sequences from the LIFDT model. This match is not surprising, because the two models have, for all practical purposes, identical ISI distributions and thus identical CVs (Fig. 1). Furthermore, it demonstrates that there are no significant ISI correlations in the Nelson model.
Because the SCCs are effectively negligible beyond lag 5 in the model, the Fano factor tends toward a constant for long counting times given by Equation 1; this is not what is observed experimentally.
We now plot the Fano factor curve obtained with D_{2} = 9 × 10^{−6}(EOD cycles)^{−2} in the LIFDT model. This curve is on top of the other ones for short counting times. In particular, the noise λ_{2} is weak (Fig.2 a) and has negligible effect on the ISIH and the SCCs at low lags (see below). However, the Fano factor curve differs from the others by increasing in a power law fashion for long counting times before saturating. The behavior can be understood from a plot of the mean and variance of the PND (Fig. 5). The mean increases linearly with counting time; this is because the mean number of spikes that is expected in a time window of length T is equal to the length of that window multiplied by the mean firing rate. However, the variance is almost constant for short counting times; hence, F(T) decreases. The variance then increases at a greater rate than the mean; hence, F(T) increases. Finally, the variance and mean both increase with the same rate, and the Fano factor is constant.
We now show that the increase in the Fano factor at long counting times is caused by the presence of weak positive SCCs that extend out to long lags. These positive correlations are extremely small and cannot be seen from a plot of the SCCs ρ_{j} as a function of j. However, their presence is revealed by the increase of the spectral density function at low frequencies (Cox and Lewis, 1966) (Fig. 6). This can be seen in the following simple example, in which the SCCs at lags >0 are assumed to take the form:
It is known experimentally that the Fano factor curves obtained for different Punits have the same qualitative shape, although their minimum occurs between 40 and 1000 EOD cycles (Ratnam and Nelson, 2000). Our model produces Fano factor curves quantitatively similar to those obtained experimentally. Furthermore, the location of the minimum is found to be dependent mainly on D_{2}; in fact, by a suitable choice of D_{2}, we can obtain Fano factor curves matching the full experimental range (data not shown). For example, if we take D_{2} = 10^{−4} (EOD cycles)^{−2}, then the minimum of the Fano factor curve is at T = 40 EOD cycles. On the other hand, if we take D_{2} = 10^{−6} (EOD cycles)^{−2}, then the minimum is at T = 1000 EOD cycles.
These results imply that the remarkable regularity of Punit spike trains at counting times of ∼250 EOD cycles (within the experimentally observed range) (Ratnam and Nelson, 2000) can be entirely explained by negative ISI correlations that are present experimentally and in the LIFDT model at low lags. These arise from cumulative relative refractoriness exhibited by our dynamic threshold. We have shown here that negative SCCs contribute to the reduction of the Fano factor and that positive SCCs increase the Fano factor according to Equation 1. Ratnam and Nelson (2000) have shown that modeling the ISI sequence by a firstorder Markov chain gave the correct SCC at lag 1. However, their SCCs at longer lags had higher absolute values than observed experimentally and alternated in sign. As a consequence, the SCC sum for their model was greater (less negative) than the one calculated from the experimental data. Thus, the Fano factor calculated with their Markov chain model did not decrease as much as the Fano factor from their experimental data. In contrast, our model reproduces the descending part of the Fano factor curve seen experimentally [compare our Fig. 4 with Ratnam and Nelson (2000), their Fig. 11F].
We now discuss the power-law increase of the Fano factor curve in more detail. We first note that because we are using Ornstein–Uhlenbeck noise, which has a finite correlation time τ_{2}, the Fano factor will eventually saturate to a finite value. This value is equal to 0.36188 (Fig. 4, n = 4000 line) and is given approximately by taking the SCCs up to lag 4000 in Equation 1 (hence ρ_{j} = 0 for j > 4000). This implies that the 4000th ISI is still correlated with the first one. Ratnam and Nelson (2000) found that, for many P-units, the ISI sequence cannot be modeled by a Markov process of order <10. Our results suggest that a Markov chain of ISIs of order at least 4000 would be required to correctly reproduce this feature of the Fano factor curve. In contrast, our simple dynamical model accurately accounts for the full behavior of the Fano factor curve. One plausible origin of the slow Ornstein–Uhlenbeck noise might be fluctuations in synaptic neurotransmitter secretion rates, which exhibit long-term correlations (Lowen et al., 1997). However, it could also be caused by slow drifts in EOD amplitude or frequency (Moortgat et al., 1998).
Pulse number distributions and ROC curves
We show in Figure 7 the PNDs obtained in the presence and absence of ISI correlations for four different counting times for the LIFDT model. ISI correlations have minimal effects at short counting times such as 20 EOD cycles (Fig. 7a). The effect of negative ISI correlations increases with counting time (Fig. 7b) and is most pronounced around 255 EOD cycles (Fig. 7c), at which the variance of the PND is reduced while the mean is left unchanged. This effect diminishes for longer counting times (Fig. 7d), at which positive ISI correlations contribute to the broadening of the PND.
This has implications for weak signal detection using the ideal observer paradigm (Green and Swets, 1966). Intuitively, if P_{0}(n,T) and P_{1}(n,T) do not overlap much, then we have a very good detector (Gabbiani and Koch, 1998). As mentioned previously, a good measure for this is the discriminant measure d (Green and Swets, 1966; Nachimas, 1972; Snippe and Koenderink, 1992) defined by:
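From count samples, the discriminant measure described in the introductory remarks (difference of means over the square root of the summed variances) reads, as a sketch:

```python
import numpy as np

def discriminability(counts0, counts1):
    """d = |<n>_1 - <n>_0| / sqrt(VAR(n)_0 + VAR(n)_1), from baseline
    (counts0) and stimulus-driven (counts1) spike-count samples."""
    counts0 = np.asarray(counts0, dtype=float)
    counts1 = np.asarray(counts1, dtype=float)
    return (abs(counts1.mean() - counts0.mean())
            / np.sqrt(counts0.var() + counts1.var()))
```

Both effects discussed below enter here directly: a stimulus-induced shift of the mean increases the numerator, and negative ISI correlations shrink the count variances in the denominator.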
In the above analysis, we did not consider the effect on signal detection of the transients in firing rate associated with changes in EOD amplitude (Xu et al., 1996; Nelson et al., 1997). Transients will help the animal detect weak signals by increasing or decreasing the firing rate, thus shifting the mean of the PND with stimulus away from the baseline PND. This is confirmed in Figure 8, in which the transients resulting from a 4% step increase in EOD amplitude lead to near-perfect detection of the signal. These transients are present in both models because both incorporate the filter given by Equations 3–5. A full exploration of their effects will be done elsewhere.
Thus, there are two phenomena at work here: transients resulting from a change in EOD amplitude will shift the mean of the PND, whereas negative correlations will reduce the variance. The combined effect is better discriminability between the PNDs, hence a better ROC curve and a lower stimulus contrast threshold for signal detection.
Usually, fluctuations caused by noise average out over time, and the ROC curve improves (Gabbiani and Koch, 1998). In our case, the ROC curve becomes worse at longer counting times as the positive ISI correlations increase variability (Fig. 7 d). Thus, there is a time window in which signal detectability is optimal for the animal, and it corresponds to the counting time at which the Fano factor is minimal. This optimal counting time has been shown to vary between 40 and 1000 EOD cycles (Ratnam and Nelson, 2000) within the Punit population.
Entropy
All of the previous analysis assumed a rate code, and thus spike timing was considered unimportant. A recent study of the electroreceptors of a very similar electric fish (Kreiman et al., 2000), using the stimulus reconstruction technique (Rieke et al., 1997; Gabbiani and Koch, 1998), showed that significant jitter could be added to the spike train without affecting the quality of encoding when the stimulus cutoff frequency was low (<20 Hz). This suggests that a rate code might be more relevant for the encoding of low-frequency stimuli.
However, electrocommunication signals contain much higher frequency components (Metzner and Heiligenberg, 1991; Zupanc and Maler, 1993; Dulka et al., 1995). In fact, the animal can detect AM frequencies >200 Hz (Bastian et al., 2001). We use information theoretic measures to assess the quality of encoding of time-varying stimuli by P-units at such frequencies. In particular, we assess the role of ISI correlations by comparing the information rates obtained with and without them.
To assess the ability of the P-unit to encode stimuli of different frequencies, we used low-pass filtered Gaussian white noise with a variable cutoff frequency f_{c} (Wessel et al., 1996). The same stimulus was presented to both models, and the resulting baseline and noise entropies for word lengths up to 16 were calculated. Because the unit can fire at most once per EOD cycle, it is natural to make the bin size Δτ equal to one EOD cycle. We have also plotted results obtained with a binomial process (each bin has probability p of being assigned the value 1 and probability 1 − p of being assigned the value 0, p being the probability of firing per EOD cycle, equal to 0.2 in our case). For such a process, the entropy of words of length L, plotted as a function of L^{−1}, is constant (Shannon, 1948; Cover and Thomas, 1991); this allows us to verify the accuracy of our algorithm. The calculated entropy for the binomial process was significantly lower than the true value (Fig. 9) for words of length >6. This is attributable to the finite length of the spike train considered, which leads to undersampling for longer words (Strong et al., 1998). We thus considered only words up to length 6 (i.e., 1/L ranges from 1 to 1/6). We plot the baseline entropies obtained for each model in Figure 9. Because a clear deviation from linearity can be seen in each case, we performed a quadratic fit according to Equation 2. Note that the baseline entropy values for our model are always lower than for Nelson's, because correlations reduce entropy (Shannon, 1948; Cover and Thomas, 1991). However, noise entropies will also be lower for our model for the same reason. It is thus not clear a priori what effect correlations will have on information transfer.
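The word-entropy estimate used here can be sketched as follows (binning into 0/1 words and plug-in entropy, after the direct method of Strong et al., 1998; the undersampling bias at long word lengths is exactly the effect described above):

```python
import numpy as np
from collections import Counter

def word_entropy_rate(binary, L):
    """Plug-in entropy rate (bits per bin) of non-overlapping L-bin words
    of a binary (0/1 per EOD cycle) spike train."""
    words = [tuple(binary[i:i + L]) for i in range(0, len(binary) - L + 1, L)]
    n = len(words)
    H = -sum((c / n) * np.log2(c / n) for c in Counter(words).values())
    return H / L
```

For a binomial spike train with firing probability p per bin, the estimate should stay near the true rate −p log2 p − (1 − p) log2(1 − p) for all word lengths that are adequately sampled, which is the consistency check described in the text.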
We thus calculated the information rates at different stimulus contrasts ς_{stim} for both models. The baseline and noise entropy rates for different contrasts were calculated at a fixed cutoff frequency f_{c} = 100 Hz to assess the capacity of the P-unit to encode high-frequency stimuli (as discussed above). We plot in Figure 10 the information rate for both models. As expected, it increases with stimulus contrast for both models; however, the information rate for the LIFDT model is always higher than that obtained with Nelson's model. For example, for stimulus contrasts between 0.04 and 0.05 mV, the gain in information rate is approximately 0.04 bits per EOD cycle, which corresponds to 40 bits per second. For both models, the correlation between successive spikes in the presence of a stimulus increases with ς_{stim}, resulting in a decrease in noise entropy rate. This leads to an increase in the information rate. However, this correlation is higher for the LIFDT model because of the dynamic threshold, which reduces the noise entropy even more and leads to a higher information rate.
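The underlying decomposition (information rate = baseline entropy rate − noise entropy rate) can be sketched for word length 1 with a toy binary response; the per-bin firing probabilities and trial counts below are hypothetical, not model output:

```python
import numpy as np

rng = np.random.default_rng(0)

def h_bit(p):
    """Entropy (bits) of a binary variable taking value 1 with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

# hypothetical stimulus-driven firing probability in each of 1000 time bins
p_t = rng.uniform(0.05, 0.5, size=1000)
# 200 repeated presentations of the same stimulus
trials = (rng.random((200, 1000)) < p_t).astype(int)

# baseline (total) entropy per bin, from the time-averaged firing probability
total_h = h_bit(trials.mean())
# noise entropy per bin, from the trial-to-trial variability at each time
noise_h = np.mean([h_bit(trials[:, t].mean()) for t in range(1000)])
info = total_h - noise_h   # bits per bin transmitted about the stimulus
```

Because entropy is concave in p, locking the response to the stimulus makes the per-bin conditional entropy smaller than the entropy of the pooled response, and the difference is the transmitted information.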
We now look at the dependence of information rate on stimulus cutoff frequency f_{c}. We thus construct an “information tuning curve” for the two models (Fig. 11a), i.e., the dependence of information rate on stimulus cutoff frequency. For both models, information rates are small at low cutoff frequencies and increase for higher cutoff frequencies, such as those near the mean baseline firing rate of the neuron (200 Hz in our case). The general shape is attributable to the fact that the noise entropy decreases as a function of f_{c} (over the range of interest). This decrease may be attributable in part to the high-pass filter characteristics of both models (Eqs. 3–5) over the range of frequencies considered. A more complete analysis of this result will be presented elsewhere.
Information rates were higher for the LIFDT model than for the Nelson model over the full range considered. A surprising result is that the gain in information caused by ISI correlations exhibits a maximum at frequencies of ∼100–150 Hz (Fig. 11b). Hence, our dynamic threshold enhances the information tuning curve, and this effect is maximal for stimuli with a cutoff frequency f_{c} of the order of the inverse of the decay time constant of the dynamic threshold τ_{θ}. This effect can be understood intuitively as follows: cumulative relative refractoriness leads to a less variable neural response to repeated stimuli; if the unit fires on the rising phase of the stimulus, then it has time to recover and fire on the next rise of the stimulus, which occurs on average at least one correlation time f_{c}^{−1} later.
Behavioral experiments have demonstrated that the animal can reliably detect AMs with frequencies of 100–200 Hz and that this might be relevant for courtship behavior (Bastian et al., 2001). We hypothesize that the resonance in the P-unit information tuning curve caused by our dynamic threshold may be responsible for this observed sensitivity.
Note that if the stimulus correlation time f_{c}^{−1} is much longer than τ_{θ}, refractory effects have fully decayed before the next rise of the stimulus, whereas if it is much shorter, the unit cannot recover in time to fire on successive rises; in both limits the gain from correlations is reduced, giving rise to the resonance.
Effect of blurring on information rate
Finally, we show that the importance of spike timing increases with stimulus cutoff frequency. We introduce “blurring” in the spike train by making the bin width Δτ greater and studying the incurred loss in potential information for different cutoff frequencies of the stimulus. We see that this loss is greater for higher cutoff frequencies (Fig. 13). This suggests that spike timing contributes minimally to encoding information about slowly varying stimuli. Furthermore, this finding agrees with the fact that significant jitter does not affect the quality of the reconstructed stimulus for low stimulus cutoff frequencies (Kreiman et al., 2000). However, spike timing is important for high-frequency time-varying stimuli, and our results show in fact that electroreceptors can encode stimuli that vary on time scales at least as fast as 5–10 EOD cycles. This agrees with the results of a previous study that found that spike-timing jitter in electroreceptors was on the order of one to two EOD cycles (Kreiman et al., 2000). This also justifies our use of PNDs for low-frequency stimuli.
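The blurring operation — merging time bins so that spike timing within a coarse bin is lost — can be sketched as follows; the binomial spike train stands in for real data, and the blur factors are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# one bin per EOD cycle, firing probability 0.2 per cycle
spikes = (rng.random(100000) < 0.2).astype(int)

def h_bit(p):
    """Entropy (bits) of a binary variable taking value 1 with probability p."""
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def coarsen(s, k):
    """Blur by a factor k: a coarse bin is 1 if any of its k fine bins spiked."""
    n = (len(s) // k) * k
    return s[:n].reshape(-1, k).max(axis=1)

# entropy per original bin shrinks as bins are merged: timing capacity is lost
rates = {k: h_bit(coarsen(spikes, k).mean()) / k for k in (1, 2, 4)}
```

This shows only the loss of coding capacity with coarser bins; the paper's point is that the resulting loss of transmitted information is large for high-frequency stimuli and small for low-frequency ones.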
Contrasting signal detection and information theory
As the fish swims by a prey, it will experience a small change in transdermal potential in a time window of ∼200 EOD cycles (Nelson and MacIver, 1999). However, information rates are almost zero for such low cutoff frequencies (this corresponds to frequencies of ∼5 Hz; the information rate is almost zero for f_{c} < 50 Hz for the particular P-unit modeled here) (Fig. 11a). The fact that the animal can readily detect these signals (Nelson and MacIver, 1999) suggests that the mutual information rate calculated using the baseline entropy is poorly suited to the coding of these low-frequency stimuli. This remains the case after the correlation time of the stimulus has been taken into account (Fig. 12a). However, we have shown that measures based on signal detection theory were adept at quantifying the ability of the electroreceptor to transmit information about this type of stimulus (Figs. 7, 8). It is thus more natural to analyze slowly varying stimuli by looking at the trial-to-trial variability of the PND.
We use a weak, slowly varying stimulus (ς_{stim} = 0.01 mV, f_{c} = 1.96 Hz) and look at the PND in a time window of 255 EOD cycles. The portion of stimulus used is shown in Figure 14a. The PNDs obtained with both models are shown in Figure 14b. Note that for each model, the PNDs with and without stimulus have almost the same variances. Thus, the main factor for discriminating between the two distributions is the difference in their means. This is not captured by entropy measures, because the entropy of a distribution does not depend on its mean (Shannon, 1948).
However, as shown previously (Fig. 7), the variances of the PNDs obtained with the LIFDT model are smaller than those obtained with the Nelson model. Note that the means are separated by as few as three spikes in both cases (as little as a 6% difference). The smaller overlap for the LIFDT model results in a dramatic improvement in discriminability. For example, the probability of obtaining 53 spikes for the LIFDT model is 0.3 with stimulus and 0.05 without stimulus, corresponding to a ratio of 6. For the Nelson model, this ratio is 2.5. Our results thus show that a single P-unit could signal as few as two extra spikes in a time window of 255 EOD cycles, as already suggested by Ratnam and Nelson (2000). Their first-order Markov process was not able to reproduce the experimentally observed probabilities of correct detection because the Fano factor for their model was higher than that of the experimental data. The key factor in this remarkable sensitivity is the negative ISI correlations that were observed experimentally and that result from the cumulative relative refractoriness exhibited by our simple model with dynamic threshold.
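The effect of PND variance on discriminability can be illustrated with hypothetical Gaussian pulse number distributions; the means, standard deviations, and the three-spike separation below are illustrative, not the measured values:

```python
from math import exp, sqrt, pi

def npdf(x, mu, sd):
    """Gaussian density, standing in for a pulse number distribution."""
    return exp(-(x - mu) ** 2 / (2 * sd * sd)) / (sd * sqrt(2 * pi))

mu_base, mu_stim = 50.0, 53.0   # three extra spikes in the counting window

# likelihood ratio at the stimulus mean for a narrow and a wide PND
lr_narrow = npdf(53, mu_stim, 1.5) / npdf(53, mu_base, 1.5)   # low-variance counts
lr_wide = npdf(53, mu_stim, 4.0) / npdf(53, mu_base, 4.0)     # high-variance counts
```

With equal variances the ratio depends only on (Δμ/σ)², so halving the spread of the counts multiplies the log evidence that a single observed count provides by four.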
DISCUSSION
Using two models, one with baseline ISI correlations and one without, we have shown that the negative ISI correlations that are present experimentally play an important role in the animal's ability to detect both slowly and rapidly time-varying stimuli. A first analysis based on signal detection theory, which assumes rate coding, revealed that the ISI correlations dramatically enhanced the detectability of weak low-frequency signals. We then used information theory to quantify the ability of the afferents to encode time-varying stimuli with various cutoff frequencies. We found that ISI correlations also helped in this case and that the effect was maximal for cutoff frequencies on the order of the inverse of the decay time constant of the dynamic threshold used to model cumulative relative refractoriness. By comparing both approaches, our study suggests that a rate code can be assumed for low-frequency stimuli, whereas spike timing is important for high-frequency signals.
Comparison of models
Our previous simple model (Chacron et al., 2000) reproduced the baseline first- and second-order ISI statistics seen experimentally, such as the ISI correlations. Here, this model was extended to include the linear response properties to sinusoidal AMs measured experimentally by Nelson et al. (1997). To quantify the effects of correlations, we used a second model, proposed by Nelson et al. (1997), that gave identical first-order statistics and responses to sinusoidal AMs but displayed no ISI correlations.
Spike train variability and signal detection
Spike train variability as measured by the Fano factor was computed for both models at various time scales. Although able to reproduce first-order ISI statistics as well as responses to AMs, the Nelson model did not reproduce the experimentally observed spike train variability at counting times >10 EOD cycles (Fig. 4) (Ratnam and Nelson, 2000). The asymptotic value of the Fano factor for the Nelson model was shown to equal the square of the coefficient of variation of the ISIH, as expected from Equation 1 in the absence of ISI correlations. The Fano factor obtained from our LIFDT model also tends toward this value when the ISI sequence is randomly shuffled to eliminate ISI correlations, thus confirming that the Nelson model did not display any significant ISI correlations.
The negative ISI correlations obtained with our simple LIFDT model bring the Fano factor down (Cox and Lewis, 1966) (Eq. 1) to experimentally determined values at longer counting times; a discrepancy was, however, observed for counting times >150 EOD cycles, because the Fano factor decreased monotonically instead of increasing. To obtain this increase in variability at long counting times, it was necessary to add a weak noise with a long correlation time to the model dynamics. This noise had no effect at short counting times, as explained in “Fano factor” and in Appendix. However, it led to positive ISI correlations extending to long lags, as seen in the increase of the spectral density function at low frequencies (Fig. 6). The positive ISI correlations were weak, decayed exponentially out to long lags, and summed according to Equation 1 to give the increase in the Fano factor. This increase has been observed in many neurons and may be of synaptic origin or attributable to drifts in EOD amplitude or frequency. These ISI correlations make modeling of electroreceptors by Markov chains of ISIs difficult because the correlations can extend out to lags in the thousands. However, the addition of an extra noise term in our simple model gave quantitatively accurate results and could be used to incorporate these effects into other neuron models.
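The link between negative ISI correlations and the long-time Fano factor (Eq. 1) can be sketched with a synthetic ISI sequence in which only the lag-1 serial correlation is nonzero; the MA(1) construction and all numbers are illustrative, not model output:

```python
import numpy as np

rng = np.random.default_rng(1)
e = rng.normal(0, 0.5, 200001)
isis = 5.0 + e[1:] - e[:-1]        # MA(1) ISIs: lag-1 SCC = -0.5, all other lags 0
shuffled = rng.permutation(isis)   # shuffling destroys the ISI correlations

scc1 = np.corrcoef(isis[:-1], isis[1:])[0, 1]   # close to -0.5

def fano(isi_seq, T):
    """Fano factor of spike counts in windows of length T."""
    t = np.cumsum(isi_seq)
    counts = np.bincount((t // T).astype(int))[:-1]  # drop the partial last window
    return counts.var() / counts.mean()

f_corr = fano(isis, 100)       # negative correlations suppress count variance
f_shuf = fano(shuffled, 100)   # shuffled sequence approaches CV^2 at long times
```

The negative term in the correlation sum of Equation 1 pulls the Fano factor of the correlated sequence far below its shuffled counterpart at long counting times, which is the mechanism invoked for Figure 4.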
We used the ideal observer paradigm (Green and Swets, 1966) to optimally discriminate between pulse number distributions obtained with and without stimulus (Gabbiani and Koch, 1998). It was shown that negative ISI correlations reduced the variance of these pulse number distributions without significantly changing their means, hence increasing their discriminability. This led to an improvement in the receiver operating characteristic curve that was maximal at counting times at which the Fano factor was minimal. A criterion for near-perfect discrimination was derived, and it was shown that signals leading to a 6.5% increase in firing rate would be discriminated in this manner on the basis of steady-state dynamics alone. Transients caused by the filter further lowered this detection threshold by increasing or decreasing the firing rate computed over the duration of the stimulus, leading to a further increase in discriminability. Signal detection analysis is not appropriate for zero-mean high-frequency signals (e.g., beats generated by fish with large EOD frequency differences) because spike train variability is high at short counting times (Softky and Koch, 1993; Borst and Theunissen, 1999). It is, however, appropriate for low-frequency signals such as those caused by prey.
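The improvement in the receiver operating characteristic with reduced count variance can be sketched by Monte Carlo with hypothetical Gaussian pulse number distributions; the means, spreads, and sample sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def roc_area(sd, n=200000):
    """P(stimulus count > baseline count): area under the ideal observer's ROC."""
    base = rng.normal(50.0, sd, n)   # hypothetical baseline PND
    stim = rng.normal(53.0, sd, n)   # hypothetical PND with stimulus
    return (stim > base).mean()

auc_narrow = roc_area(1.5)   # low-variance counts (LIFDT-like)
auc_wide = roc_area(4.0)     # high-variance counts, same mean separation
```

For equal-variance Gaussians the ROC area is Φ(d′/√2) with d′ = Δμ/σ, so the same three-spike separation yields far better discrimination when the counting-time variance (the Fano factor) is small.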
Using information theory to quantify the coding of timevarying stimuli
The trial-to-trial variability of the neural response to a repeated stimulus was characterized by the noise entropy rate. The entropy rate of the baseline spike train was estimated, and the difference between baseline and noise entropy rates was used to quantify the information transfer rate about the stimulus. This definition is natural in our case because we have baseline firing activity and signal detection must be based on a change from this baseline activity (Ratnam and Nelson, 2000). We compared the information rates obtained with our model and Nelson's. The information rate increased in both cases with stimulus contrast. This agrees with the fact that the electroreceptor is better at encoding stronger stimuli (Wessel et al., 1996). However, negative ISI correlations could help the electroreceptor in the coding of fast time-varying stimuli because information rates computed with our model were higher. This is because correlations reduce the noise entropy even more than the baseline entropy. We note that our results are consistent with previous studies demonstrating that a refractory period can improve the linear correlation between the stimulus and the instantaneous firing rate (Chialvo et al., 1997) as well as the neural precision (and hence the mutual information) (Berry and Meister, 1998). However, those models did not incorporate ISI correlations. Furthermore, in Berry and Meister (1998), the increase in refractory period led to a decrease in mean firing rate, and the causes of the increase in mutual information were not clear. In our analysis, we examined the effect of ISI correlations on mutual information and signal detection without concomitant changes in the mean firing rate.
Information rates as a function of stimulus cutoff frequency were also computed. Our results show that the information rate increased with cutoff frequency, as expected from the high-pass characteristics of P-afferents (Xu et al., 1996). Information rates were higher for the LIFDT model, and the gain attributable to correlations exhibited a resonance with a maximum at ∼100 Hz. This frequency corresponds to the inverse of the decay time constant of our dynamic threshold. Remarkably, there is evidence that these fish respond preferentially to electrocommunication signals with a frequency content of ∼100 Hz (Bastian et al., 2001).
Comparing signal detection and information theory
Finally, we have shown that the loss of information about the stimulus incurred by decreasing the spike-timing resolution increases with cutoff frequency (Fig. 13). This suggests that spike timing is not as important for low-frequency stimuli as it is for high-frequency stimuli. However, signal detection theory assumes a rate code and can be applied in cases in which spike timing is not important. Our results show that an optimal detector receiving a single P-unit spike train with baseline negative ISI correlations can detect the presence of stimuli that would give rise to as few as two extra spikes over a counting time of ∼250 EOD cycles, as suggested by Ratnam and Nelson (2000). For low-frequency stimuli, signal detection theory is appropriate because a change in mean firing rate (computed over an appropriate time window) without a concomitant change in firing rate variance can signal the presence or absence of the stimulus. Without a change in variance, there will be no change in entropy (because the entropy of a random variable does not depend on its mean) (Shannon, 1948) and thus almost no mutual information as per our measure. For zero-mean high-frequency stimuli, there may be almost no change in mean firing rate computed over a long time window; hence, signal detection theory is not appropriate. However, because baseline entropy is high (owing to the high variability at short counting times) and because ISI correlations reduce the noise entropy, the mutual information will be high. The ideas presented above are compatible with a recent analysis by Salinas and Sejnowski (2000) in which neurons can be driven either by the mean excitatory level (mean firing rate as assessed by the PND in our case) or by fluctuations around this mean (as assessed by mutual information in our case).
Hence, the high variability of P-afferent spike trains observed at short counting times gives a high mutual information for high-frequency stimuli, whereas the low variability at longer counting times, caused by negative ISI correlations at short lags, improves signal detectability at low frequencies.
Conclusion and outlook
We have shown that the negative ISI correlations seen experimentally can improve the ability of a neuron to code both slowly and rapidly time-varying stimuli. The P-units we have studied are known to converge onto basilar pyramidal cells of the electrosensory lateral line lobe (ELL) (Bastian, 1981). Population averaging is thus expected and might explain the extreme behavioral sensitivity to AMs, down to 0.1% of the baseline EOD amplitude (Knudsen, 1974; Nelson and MacIver, 1999). Also, as mentioned before, the position of the minimum of the Fano factor is highly variable (40–1000 EOD cycles). Moreover, it is possible that different P-units (the probability of firing per EOD cycle ranges from 0.1 to 0.5) (Nelson et al., 1997) will have different cumulative relative refractoriness decay time constants. It will thus be very interesting to study ELL decoding of slowly versus rapidly varying input processed by this heterogeneous P-unit population.
Appendix
In this appendix, we give a list of all the symbols and acronyms used in this paper. When a symbol is constant throughout the paper (e.g., a model parameter), its value is given. Units are given only for symbols, not acronyms.
Appendix
In this appendix, we summarize the equations used for both models.
Nelson model
The AM A(t) is filtered using the following set of equations:
LIFDT model
In our model, the AM A(t) is filtered using the same set of equations as for the Nelson model:
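Independently of the exact filter equations, the spike-generation mechanism — leaky integration with a threshold that jumps at each spike and decays back slowly — can be sketched as follows. All parameter values are illustrative choices, not the fitted values used in the paper, but the construction reproduces the negative ISI correlations central to the results:

```python
import numpy as np

rng = np.random.default_rng(3)
dt = 0.05
n_steps = 400000
tau_v, tau_th = 1.0, 10.0    # membrane and threshold time constants (illustrative)
th0, dth = 1.0, 0.3          # resting threshold and per-spike threshold increment
bias, sigma = 1.2, 0.3       # constant input and noise strength (illustrative)

noise = sigma * np.sqrt(dt) * rng.normal(size=n_steps)
v, th = 0.0, th0
isis, last_spike = [], 0.0
for i in range(n_steps):
    t = (i + 1) * dt
    v += dt * (bias - v / tau_v) + noise[i]   # leaky integration of input + noise
    th += dt * (th0 - th) / tau_th            # threshold relaxes back toward th0
    if v >= th:                               # spike: reset voltage, raise threshold
        v = 0.0
        th += dth
        isis.append(t - last_spike)
        last_spike = t

isis = np.array(isis)
# cumulative relative refractoriness: a long ISI lets the threshold decay,
# favoring a short next ISI, so adjacent ISIs are negatively correlated
scc1 = np.corrcoef(isis[:-1], isis[1:])[0, 1]
```

Because the threshold outlasts a single ISI (tau_th spans several mean intervals here), its state carries memory across spikes, which is the source of the negative serial correlation.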
Appendix
In this appendix, we give an intuitive explanation of why adding a slow noise to our model will give rise to positive ISI correlations.
We first recall that the Ornstein–Uhlenbeck process λ_{2} varies on a time scale much greater than either the voltage or threshold variables in the LIFDT model, because τ_{2} = 50,000 EOD cycles is much greater than an EOD cycle. Thus, we can treat λ_{2} as a quasi-static variable with respect to voltage and threshold, because it is almost constant on time scales much smaller than τ_{2}. Because the mean ISI is five EOD cycles long, many ISIs (10,000 on average) will have occurred during a time window of length τ_{2}. Suppose λ_{2} takes some value; it will then keep that value (approximately) for a long time. If the value is positive, the synaptic current I_{syn} is larger than it would be if λ_{2} = 0, and we thus expect a long sequence of ISIs of shorter duration (but not much shorter, because λ_{2} is weak) than average (when λ_{2} = 0). When λ_{2} has a negative value, one can expect sequences of ISIs longer than average by the same argument. Thus, we will get long sequences of ISIs that are shorter than average and long sequences of ISIs that are longer than average. It is these long sequences that lead to positive SCCs (the SCC at a given lag is positive only when two ISIs separated by this lag tend to be both shorter or both longer than average). Because those sequences can be very long, the ISI correlations extend to long lags. As mentioned in the text, a full mathematical explanation will be presented elsewhere.
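The argument can be checked with a synthetic ISI sequence whose mean is modulated by a discretized Ornstein–Uhlenbeck process; the time constant, modulation depth, and noise levels below are illustrative, not the paper's values:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100000
tau2 = 200.0                       # slow time scale, measured in ISIs (illustrative)
a = np.exp(-1.0 / tau2)

# discretized Ornstein-Uhlenbeck process with unit stationary variance
lam = np.empty(n)
lam[0] = 0.0
steps = rng.normal(size=n)
for i in range(1, n):
    lam[i] = a * lam[i - 1] + np.sqrt(1 - a * a) * steps[i]

# ISIs: mean of 5 EOD cycles, weak slow drift from lam, fast independent jitter
isis = 5.0 + 0.4 * lam + rng.normal(0, 0.5, n)

def scc(x, k):
    """Serial correlation coefficient of the ISI sequence at lag k."""
    return np.corrcoef(x[:-k], x[k:])[0, 1]
```

Because lam changes little over many consecutive ISIs, runs of shorter- or longer-than-average intervals appear, and the SCCs remain positive out to lags comparable with tau2, exactly as argued above.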
Footnotes

This research was supported by the Natural Sciences and Engineering Research Council of Canada (M.J.C. and A.L.) and Canadian Institutes of Health Research (L.M.) Canada. We thank B. Doiron, D. Bueti, and J. Bastian for useful discussions. We give special thanks to P. Lánský, J. Lewis, and A. M. Oswald for their careful reading of this manuscript as well as useful discussions. Finally, we thank two anonymous reviewers for their helpful comments.
Correspondence should be addressed to Maurice J. Chacron, Physics Department, 150 Louis-Pasteur, Ottawa, Ontario, Canada K1N 6N5. Email: mchacron{at}physics.uottawa.ca.