Abstract
Neurons are often assumed to operate in a highly unreliable manner: a neuron can signal the same stimulus with a variable number of action potentials. However, much of the experimental evidence supporting this view was obtained in the visual cortex. We have, therefore, assessed trial-to-trial variability in the auditory cortex of the rat. To ensure single-unit isolation, we used cell-attached recording. Tone-evoked responses were usually transient, often consisting of, on average, only a single spike per stimulus. Surprisingly, the majority of responses were not just transient, but were also binary, consisting of 0 or 1 action potentials, but not more, in response to each stimulus; several dramatic examples consisted of exactly one spike on 100% of trials, with no trial-to-trial variability in spike count. The variability of such binary responses differs from that of comparably transient responses recorded in visual cortical areas such as area MT, and represents the lowest trial-to-trial variability mathematically possible for responses of a given firing rate. Our study thus establishes for the first time that transient responses in auditory cortex can be described as a binary process, rather than as a highly variable Poisson process. These results demonstrate that cortical architecture can support a more precise control of spike number than was previously recognized, and they suggest a re-evaluation of models of cortical processing that assume noisiness to be an inevitable feature of cortical codes.
- auditory cortex
- Poisson spiking
- neural coding
- neural reliability
- neural computation
- cell-attached recording
Introduction
Since the earliest single-unit cortical recordings (Hubel and Wiesel 1959), it has been generally accepted that the train of action potentials elicited by repeated presentations of the same stimulus is highly variable. This unreliability has contributed to the widely held view that cortical spike trains are so noisy that only their average activity can be used to encode stimuli and that the details of spike count and timing must reflect noise. Conversely, cortical variability is sometimes taken to reflect a fundamental limitation on the fidelity of cortical computation. In this view, unreliability is an unavoidable consequence of cortical architecture, and it can be used to make inferences about the general principles of cortical organization (Shadlen and Newsome 1994, 1998; Mazurek and Shadlen, 2002). Variability thus appears to impose severe constraints on cortical representation and computation. Precisely how cortical circuits overcome this noise limitation and perform so well as computational devices has been the subject of much controversy (Softky and Koch, 1993; Marsalek et al., 1997; Shadlen and Newsome, 1998; Diesmann et al., 1999; Manwani and Koch, 1999; Pouget et al., 2000; Kistler and Gerstner, 2002; Mazurek and Shadlen, 2002), but the empirical observation that cortical spike trains are variable has, until recently, gone widely unquestioned (but see Gur et al., 1997; Gershon et al., 1998; Kara et al., 2000).
Spike count variability is often quantified in terms of the “Fano factor” (Buracas et al., 1998), defined as the ratio of the variance to the mean spike count over trials. A perfectly repeatable neural response has a Fano factor of zero, whereas a Poisson process (e.g., the ticks of a Geiger counter) has a Fano factor of one. A Fano factor of order unity is therefore often interpreted as evidence of a highly random underlying spike-generating process.
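As a minimal sketch of this definition, the Fano factor of a list of per-trial spike counts can be computed as follows. The function name `fano_factor` is ours, and the use of the unbiased (n - 1) sample variance is an assumption inferred from the worked example in Materials and Methods.

```python
from statistics import mean, variance

def fano_factor(spike_counts):
    """Variance-to-mean ratio of per-trial spike counts (sample variance, n - 1)."""
    m = mean(spike_counts)
    if m == 0:
        # The mean is the denominator, so an all-zero response set has no defined ratio.
        raise ValueError("Fano factor is undefined when no trial contains a spike")
    return variance(spike_counts) / m

perfectly_reliable = [1, 1, 1, 1, 1, 1, 1, 1]   # exactly one spike on every trial
print(fano_factor(perfectly_reliable))           # 0.0
```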
The variability of cortical responses has been well studied in several areas of visual cortex in anesthetized cats and in both anesthetized and awake primates. An almost universal finding is that the Fano factor is greater than or approximately equal to one (Heggelund and Albus, 1978; Dean, 1981; Tolhurst et al., 1983; Buracas et al., 1998; Oram et al., 1999), although several exceptions to this rule have recently been reported (Gur et al., 1997; Gershon et al., 1998; Kara et al., 2000). By contrast, the variability of neurons in other noncortical sensory areas, including the retina (Berry et al., 1997) and the motion-sensitive neuron of the fly (de Ruyter van Steveninck et al., 1997), can be substantially lower.
Is high trial-to-trial variability thus a general feature of cortical circuitry? Surprisingly, cortical variability has only rarely been studied outside of the visual system (Lee et al., 1998), so the widespread belief that cortical spike trains are highly unreliable is based mainly on experiments in visual cortex. The trial-to-trial response variability of well isolated single units in the auditory cortex has not previously been quantified.
Here we show that the majority of spiking responses generated by neurons in the rat auditory cortex are binary, consisting of either 0 or 1, but not more, action potentials in response to a stimulus. Binary spiking represents the lowest variability (Fano factor) possible for a given spike rate; for some responses, the reliability of the responses we observe is perfect, i.e., the Fano factor is zero. Moreover, we show binary spiking is not simply the result of the transient nature of auditory cortical responses. Our results demonstrate that cortical architecture can support a more precise control of spike number than was previously recognized, and they suggest a re-evaluation of models of cortical processing that assume noisiness to be an inevitable feature of cortical codes.
Materials and Methods
Surgery. Sprague Dawley rats (17-24 d) were anesthetized in strict accordance with the National Institutes of Health guidelines, as approved by the Cold Spring Harbor Laboratory Animal Care and Use Committee. Recordings were performed under ketamine (60 mg/kg) and medetomidine (0.50 mg/kg). After the animal was deeply anesthetized, it was placed in a custom naso-orbital restraint that left the ears free and clear. Local anesthetic was applied to the scalp, and a 1 × 2 mm craniotomy and durotomy were performed above the auditory cortex. A cisternal drain was performed before the craniotomy. Before the introduction of electrodes, the cortex was covered with physiological buffer (in mM: NaCl, 127; Na2CO3, 25; NaH2PO4, 1.25; KCl, 2.5; MgCl2, 1; and glucose, 25) mixed with 1.5% agar. Rectal temperature was monitored and maintained at 37°C using a feedback-controlled blanket (Harvard Apparatus, Holliston, MA). Breathing and response to noxious stimuli were monitored throughout the experiment, and supplemental dosages of anesthetic were provided when required.
Electrophysiology. Multiunit recordings were obtained using 1 MΩ tungsten electrodes (World Precision Instruments, Sarasota, FL) and a Cyberamp 380 (Axon Instruments, Foster City, CA). Cell-attached recordings were obtained using an Axopatch 200B (Axon Instruments) and a data acquisition program written by Bernardo Sabatini in the Igor programming language. For cell-attached recordings, pipettes were filled with an internal solution consisting of (in mM): KCl, 10; KGluconate, 140; HEPES, 10; MgCl2, 2; CaCl2, 0.05; Mg-ATP, 4; Na2-GTP, 0.4; Na2-Phosphocreatine, 10; BAPTA, 10; and biocytin, 1%, pH 7.25; diluted to 290 mOsm. Resistance to bath was 3-5 MΩ before seal formation.
One hundred and seventy five cell-attached recordings (from 16 animals) passed our criteria for inclusion in the analysis: recordings had to be stable for at least 5 min; electrode capacitance had to be sufficiently well compensated and seal resistance sufficiently high (range, 10-100 MΩ) to allow unambiguous identification of every spike (see Fig. 2), and at least one action potential had to be observed. Only stationary epochs were analyzed.
Stimuli. All experiments were conducted in a double-walled sound booth (Industrial Acoustics Company, Bronx, NY). Free-field stimuli were presented using a System II (Tucker-Davis Technologies, Gainesville, FL) running on a host Pentium III computer connected to an amplifier (Stax SRM 313), which drove a calibrated electrostatic speaker (taken from the left side of a pair of Stax SR303 headphones). The stimuli consisted of 25, 50, and 100 msec pure-tone pips of 32 different frequencies (logarithmically spaced between 2 kHz and 46731 Hz) with 5 msec cosine-squared windows applied to the onset and termination of each pip. All 32 tones were repeatedly presented at 65 dB in a fixed pseudorandom order at a rate of 2 tones/sec.
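The stimulus set described above can be sketched as follows. This is an illustration only, not the software used in the experiments: the sample rate of 200 kHz, the unit amplitude, and the names `frequencies_hz` and `tone_pip` are assumptions of ours; only the frequency range, the number of tones, the pip durations, and the 5 msec cosine-squared ramps come from the text.

```python
import math

N_TONES = 32
F_MIN_HZ = 2000.0
F_MAX_HZ = 46731.0

# Logarithmic spacing: consecutive frequencies differ by a constant ratio.
frequencies_hz = [
    F_MIN_HZ * (F_MAX_HZ / F_MIN_HZ) ** (i / (N_TONES - 1)) for i in range(N_TONES)
]

def tone_pip(freq_hz, duration_ms=25.0, ramp_ms=5.0, sample_rate_hz=200_000):
    """Pure tone with cosine-squared onset and offset ramps (unit peak amplitude).

    sample_rate_hz is a hypothetical value chosen to exceed twice the highest tone.
    """
    n = int(round(duration_ms * sample_rate_hz / 1000.0))
    n_ramp = int(round(ramp_ms * sample_rate_hz / 1000.0))
    samples = []
    for k in range(n):
        if k < n_ramp:
            envelope = math.sin(0.5 * math.pi * k / n_ramp) ** 2                   # rises 0 -> 1
        elif k >= n - n_ramp:
            envelope = math.cos(0.5 * math.pi * (k - (n - n_ramp)) / n_ramp) ** 2  # falls 1 -> 0
        else:
            envelope = 1.0
        samples.append(envelope * math.sin(2.0 * math.pi * freq_hz * k / sample_rate_hz))
    return samples

pip = tone_pip(frequencies_hz[0])   # 25 msec pip at the lowest frequency (2 kHz)
```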
The natural stimulus depicted in Figure 5d is an 8 sec segment of a vocalization of the common nightingale, sampled at 44,100 Hz and taken from an audio CD called “The Diversity of Animal Sounds”, available from the Cornell Laboratory of Ornithology.
Response variability analysis. Because the Fano factor of the spike count in response to repeated presentations of the same tone has the mean spike count in the denominator, it is only defined for sets of responses that include at least one spike on at least one trial. Therefore, we only included results from such tones in the variability analysis for any given neuron or multiunit penetration site. For each response set, we chose our window for counting spikes so that it contained the entire region of the peristimulus time histogram (PSTH), including responses to all presentations of all tones, that was greater than the spontaneous rate. For example, we used a 45-msec-long window starting 8 msec after stimulus onset for the neuron shown in Figure 1b,c. The mean window size across all 175 neurons was 46 msec.
Group statistics analysis. We could not directly assess the degree to which our data were consistent with binomial statistics without introducing a specific noise model to account for the occasional occurrence of multispike responses. Rather than do this, we quantified the statistical significance of the low variability of our data by comparing with the null hypothesis that the neurons obeyed Poisson statistics. The ability to assess significance depended on two parameters: the sample size and the firing probability. Intuitively, the dependence on firing probability arises because at low firing rates most responses produce only trials with zero or one spikes under both the Poisson and binary models; only when firing probability is high do the two models make different predictions, because in that case the Poisson model includes many trials with two or even three spikes, whereas the binary model generates only solitary spikes (see Fig. 4).
We recorded responses to 32 different 25 msec tones from each of 175 neurons, repeating each tone between five and 75 times (mean, 19 trials). Thus, our initial ensemble consisted of 32 × 175 = 5600 response sets, with between five and 75 samples in each set. Of these, 3055 response sets contained at least one spike on at least one trial. For each response set, we tested whether the observed variability was significantly lower than expected from the null hypothesis of a Poisson process.
For each response set, we computed the cumulative distribution function (cdf) for the “Fano factor” (defined as the variance divided by the mean spike count across all trials; this is sometimes called the “coefficient of dispersion” in the statistics literature) for a sample drawn from a Poisson process with the same number of trials and mean spike count as the original data set. Because we were not able to obtain a useful, closed-form, analytical expression for this cdf, we instead used the brute-force approach of empirically computing the Poisson probability, and Fano factor, for every possible response set consisting of between zero and three spikes on any trial; for response sets with means greater than two spikes per trial, we considered all response sets with up to five spikes per trial. We then made an empirical, weighted histogram from this set of Fano factors, in which the contribution from each response set was weighted by its Poisson probability. We verified the accuracy of the central region of each estimated pdf using a Monte Carlo procedure (100,000 simulations), and we analytically verified the accuracy of the tail near zero, which was crucial for our analysis. We assigned a Fano factor of zero to every response set consisting of all zeros.
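The enumeration described above can be sketched as follows, under stated assumptions. Because the Fano factor does not depend on the order of trials, the sketch groups response sets by their spike-count histogram and weights each histogram by its multinomial Poisson probability, which is equivalent to visiting every ordered response set. The function names are ours, the sample variance uses n - 1 (inferred from the worked example), and the probability mass lost by truncating at `max_count` spikes is simply left out rather than renormalized; the original analysis may have handled these details differently.

```python
from math import exp, factorial, inf

def poisson_pmf(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

def compositions(n, parts):
    """Yield every tuple of `parts` non-negative integers that sums to n."""
    if parts == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, parts - 1):
            yield (first,) + rest

def fano_from_histogram(hist):
    """hist[k] = number of trials with k spikes (at least two trials in total)."""
    n = sum(hist)
    total = sum(k * c for k, c in enumerate(hist))
    if total == 0:
        return 0.0   # all-zero response sets are assigned a Fano factor of zero
    m = total / n
    var = sum(c * (k - m) ** 2 for k, c in enumerate(hist)) / (n - 1)
    return var / m

def poisson_fano_cdf(n_trials, lam, threshold, max_count=3):
    """P(Fano factor <= threshold) for n_trials Poisson(lam) counts, each <= max_count."""
    pk = [poisson_pmf(k, lam) for k in range(max_count + 1)]
    cumulative = 0.0
    for hist in compositions(n_trials, max_count + 1):
        if fano_from_histogram(hist) <= threshold + 1e-12:
            ways = factorial(n_trials)   # multinomial coefficient, built up exactly
            weight = 1.0
            for k, c in enumerate(hist):
                ways //= factorial(c)
                weight *= pk[k] ** c
            cumulative += ways * weight
    return cumulative

# 20 trials at a mean of 0.60 spikes per trial; 8/19 (about 0.42) is the binary minimum.
p_at_minimum = poisson_fano_cdf(20, 0.6, threshold=8 / 19)
```

With 20 trials and counts of 0 to 3 there are only 1771 histograms to visit, so the exhaustive approach is cheap at the sample sizes used here.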
We identified all response sets for which significance could be assessed by calculating the smallest possible value that the Fano factor could have taken given the observed sample mean, which corresponds to the set of responses containing only ones and zeros and which has the same mean response and total number of trials as the data. This was not possible for cases in which the mean spike count was greater than one; for these cases we set the minimum Fano factor to zero. If the cumulative probability of this minimum Fano factor was found to be less than our significance criterion p, then it was possible to assess the significance of the response set. If the cumulative probability of the observed Fano factor was less than p, the response set was considered significant.
An example of the procedure for determining statistical significance of response sets. Suppose that, in response to 20 repeated presentations of the same tone, we observe the following set of spike counts: [1,1,0,0,1,1,1,0,1,0,2,0,1,0,0,1,0,1,1,0] (1), which includes only one multispike response (a doublet on trial 11). Response set (1) has a mean of 0.60 spikes per trial and a variance of ∼0.36 (spikes per trial)², resulting in a Fano factor of ∼0.60 spikes per trial.
To determine whether we can assess the statistical significance of the low variability of this response set, we first construct the least variable (i.e., most “binary”) set of responses we could have observed with the same empirical mean: [1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0] (2). (For convenience, we chose to place all the ones in the early trials, which will not affect any of the calculations). This has a mean of 0.60 spikes per trial and a variance of ∼0.25 (spikes per trial)², resulting in a Fano factor of ∼0.42 spikes per trial. Thus, because of integer-counting statistics, this is the lowest possible Fano factor we could have observed for the given number of trials and the observed sample mean.
Next, we compute the probability distribution of the Fano factor under the null hypothesis of a Poisson process with an event rate equal to the observed mean (0.60 spikes per trial), and we find that the cumulative probability that the Fano factor could have been equal to or less than the minimum possible value (0.42 spikes per trial) is p = 0.0045, which satisfies our significance criterion p < 0.01. Therefore, we can assess the statistical significance of response set (1), and so it would have been included in our analysis. However, the cumulative distribution for the observed Fano factor (0.60 spikes per trial) is 0.058, which is >0.01, and thus does not satisfy our criterion. Accordingly, despite the low occurrence of multispike responses (5% = 1/20), we would have concluded that response set (1) is not binary because it is not statistically significantly different from a Poisson process at the p < 0.01 level.
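The arithmetic of this example can be checked with a short sketch. The helper names `most_binary_set` and `classify` are ours, the variance is the unbiased (n - 1) sample variance, and the two cumulative probabilities passed to `classify` are simply the values quoted above rather than values computed here.

```python
from statistics import mean, variance

def fano_factor(counts):
    return variance(counts) / mean(counts)

def most_binary_set(counts):
    """Least variable set of 0s and 1s with the same number of trials and total spikes."""
    n, total = len(counts), sum(counts)
    if total > n:
        raise ValueError("mean exceeds one spike per trial; no purely binary set exists")
    return [1] * total + [0] * (n - total)

def classify(p_at_minimum, p_at_observed, alpha=0.01):
    """Two-step decision: can significance be assessed, and if so, is it reached?"""
    if p_at_minimum >= alpha:
        return "unclassifiable"
    return "sub-Poisson" if p_at_observed < alpha else "not significant"

observed = [1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 2, 0, 1, 0, 0, 1, 0, 1, 1, 0]   # response set (1)
binary = most_binary_set(observed)                                       # response set (2)

print(round(mean(observed), 2), round(variance(observed), 2), round(fano_factor(observed), 2))
# 0.6 0.36 0.6
print(round(mean(binary), 2), round(variance(binary), 2), round(fano_factor(binary), 2))
# 0.6 0.25 0.42
print(classify(0.0045, 0.058))
# not significant
```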
Results
We recorded responses of neurons in the auditory cortex of ketamine-anesthetized rats to pure-tone pips of different frequencies (Sally and Kelly, 1988; Kilgard and Merzenich, 1998). Each pip was presented repeatedly, allowing us to assess the variability of the neural response to multiple presentations of each stimulus.
Multiunit recordings
We first recorded multiunit activity with conventional low-impedance tungsten electrodes (Fig. 1a). The number of spikes in response to each pip fluctuated markedly from one trial to the next, as though governed by a random mechanism such as that generating the ticks of a Geiger counter. Such highly variable responses are comparable to those recorded throughout the visual cortex (Tolhurst et al., 1983; Softky and Koch, 1993; Buracas et al., 1998; Shadlen and Newsome, 1998; Stevens and Zador, 1998) and have contributed to the widely held view that cortical spike trains are so noisy that only the average firing rate can be used to encode stimuli.
Cell-attached single-unit recordings
Because we were recording the activity of an unknown number of neurons, we could not be sure of the relationship between the strong trial-to-trial fluctuations observed in the population and the underlying variability of the single units. We therefore used an alternative technique, cell-attached recording with a patch pipette (Otmakhov et al., 1993; Friedrich and Laurent, 2001; Margrie et al., 2002), to ensure single-unit isolation (Fig. 2). This recording mode minimizes both of the main sources of error in spike detection: failure to detect a spike in the unit under observation (false negatives) and contamination by spikes from nearby neurons (false positives). Although single-unit isolation can also be obtained using high-impedance tungsten electrodes, cell-attached recording also differs from conventional extracellular recording methods in its selection bias. With cell-attached recording, neurons are selected solely on the basis of the experimenter's ability to form a seal, rather than on the basis of neuronal activity such as spontaneous activity or responsiveness to particular stimuli, as in conventional methods.
Surprisingly, single-unit responses were far more orderly than suggested by the multiunit recordings; responses typically consisted of either 0 or 1 spikes per trial, and not more. In the most dramatic examples, each presentation of the same tone pip elicited exactly one spike (Fig. 1b). In most cases, however, some presentations failed to elicit a spike (Fig. 1c). Thus, these single-unit responses could be characterized as a noisy binary process: “binary” because neurons produced either 0 or 1 spikes, and “noisy” because some stimuli elicited single spikes on some trials, but no spikes on others.
Eleven of the 32 tones presented to this neuron elicited at least one spike on at least one trial (out of 27 repetitions). For these eleven tones, we compared the mean spike count to the Fano factor (the ratio of the variance to the mean of the distribution of spike counts on individual trials). The Fano factor for any neuron that displays binary spiking is the same as for a binomial process with the same probability of spiking per trial, p, and is given by variance/mean = [p (1 - p)]/(p) = 1 - p, independent of the number of trials. Thus, on a plot of p versus Fano factor, a collection of perfectly binary responses (i.e., trials consisting of no multispike responses) falls along the diagonal (1 - p) connecting the top left and bottom right corners of the unit square (Fig. 3a). For this neuron, 10 of the 11 tones elicited perfectly binary responses and so fell exactly along the diagonal, whereas one point deviated slightly from the diagonal because of a lone double-spiked response.
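The step from binary spiking to the diagonal can be spelled out. The notation is ours: X is the spike count on a single trial, n is the number of trials, and k is the number of trials containing a spike. The last line assumes that the variance is estimated with the unbiased n - 1 estimator, which the worked example in Materials and Methods appears to use.

```latex
\begin{aligned}
X \in \{0,1\},\quad \Pr(X=1)=p
  \;&\Rightarrow\; \mathrm{E}[X]=p,\quad \mathrm{E}[X^{2}]=p,\quad
     \mathrm{Var}[X]=p-p^{2}=p\,(1-p),\\[4pt]
F &= \frac{\mathrm{Var}[X]}{\mathrm{E}[X]} = 1-p,\\[4pt]
\hat{F} &= \frac{1}{\hat{p}}\cdot\frac{n\,\hat{p}\,(1-\hat{p})}{n-1}
         = (1-\hat{p})\,\frac{n}{n-1},
\qquad \hat{p}=\frac{k}{n}.
\end{aligned}
```

For n = 20 and an observed firing probability of 0.60, the last line gives 0.40 × 20/19 ≈ 0.42, matching the minimum Fano factor quoted in Materials and Methods; the factor n/(n - 1) approaches one as the number of trials grows.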
By comparison, most of the multiunit responses fell far above the diagonal on this plot (Fig. 3a). Indeed, all but one of the multiunit responses fell above the horizontal line corresponding to the Fano factor of unity, indicating trial-to-trial variability in excess of that expected from Poisson spiking. Below we show how the multiunit and single-unit responses can be reconciled by assuming correlations between units (see Reconciling multiunit and single-unit recordings below).
The probability of firing a single spike was related to stimulus frequency (Fig. 3b) (Calford and Semple, 1995). This suggests that the conventional notion of a tuning curve, in which spike rate is related to some stimulus parameter, can readily be extended to binary responses.
The majority of responses are binary
How prevalent were binary responses such as those illustrated by Figures 1, b and c, and 3a? One approach to answering this question would be to assess directly the degree to which our data were consistent with binomial statistics. However, in a real experimental setting, small deviations from perfectly binary spiking are to be expected, and we do not have a good noise model to account for these deviations. For example, although it might seem reasonable to model the deviant spikes that occur during tones as “spontaneous”, i.e., as occurring at the same rate as the spikes that occur during the intertone interval, such a model includes an implicit assumption about the additivity of noise during spontaneous and evoked activity. An ideal statistical test would include no such strong ad hoc assumptions.
We therefore adopted an alternate approach. We devised a statistical test to distinguish binary responses from those consistent with a Poisson process (the null hypothesis). We formulated the question in terms of deviations from the null hypothesis of a Poisson process by asking whether the observed variability was significantly below that expected from a Poisson process (see Materials and Methods for details). Expressed this way, establishing significance requires that two conditions be satisfied. First, the amount of data had to be sufficient (given the observed firing rate and number of repetitions) to distinguish Poisson from binary firing. This is because, at low firing rates, Poisson and binary firing are indistinguishable given limited data (Fig. 4); thus only some responses can be classified, whereas others must be regarded as potentially consistent with either hypothesis. Second, the neuronal variability (as quantified by the Fano factor) had to be sufficiently low that the chance of observing such variability was less than some statistical confidence level. Intuitively, our test assessed whether, when plotted as in Figure 3a, points were significantly below the horizontal (Poisson) line, in the direction of the diagonal (binary) line. This test is highly conservative, because a perfectly binary response might nonetheless be characterized as unclassifiable if the response probability were too low or the sample size too small.
Figure 4 (compare c, d) emphasizes the difference, evident at high firing rates, between Poisson and binary spiking. In each set of simulated trials there are exactly 20 spikes; thus, the mean spike count is one spike per trial in both cases. In the Poisson set (Fig. 4c), some trials have no spikes at all, some have one, some two, and some three spikes. By contrast, in the binary set (Fig. 4d), the same 20 spikes are arrayed over the 20 trials in a much more orderly manner, with exactly one spike per trial. The two sets of rasters are thus clearly different, even in the case where there is on average one spike per trial. It is this difference that our statistical test captures.
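The contrast between the two kinds of raster can be reproduced with a small simulation. This is a sketch under assumptions of ours: the seed is arbitrary, the Poisson sampler is Knuth's multiplication method, and, unlike the set shown in Figure 4c, the simulated Poisson set is not constrained to contain exactly 20 spikes, so only its expected count is one spike per trial.

```python
import math
import random
from statistics import mean, variance

def poisson_sample(lam, rng):
    """Draw one Poisson(lam) count with Knuth's method; adequate for small lam."""
    limit = math.exp(-lam)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def fano_factor(counts):
    return variance(counts) / mean(counts)

rng = random.Random(0)            # fixed seed, chosen arbitrarily
n_trials = 20
poisson_counts = [poisson_sample(1.0, rng) for _ in range(n_trials)]
binary_counts = [1] * n_trials    # exactly one spike on every trial

print(sorted(poisson_counts))     # a mix of 0, 1, 2, ... spikes per trial
print(fano_factor(binary_counts)) # 0.0
```

Over many trials the Fano factor of the Poisson counts settles near one, whereas the binary set stays at zero regardless of the number of trials.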
The majority of response sets (370/624 = 59%) for which statistical significance could be assessed (at the p < 0.01 significance level) were well characterized as binary (Fig. 5a). We emphasize that, by definition, <1 of 100 responses (i.e., no more than ∼6 of the 624 for which significance could be assessed) would have been expected to show such low variability by chance, given the null hypothesis of a Poisson process. Moreover, the majority of the 91 neurons (75/91 = 82%) for which significance could be assessed showed at least one significantly sub-Poisson response (p < 0.01). Even using a more stringent criterion of p < 0.001, half (239/458 = 52%) of the response sets and 68% (49/72) of the neurons were still significantly sub-Poisson. Therefore, low-variability spiking was not an anomalous finding, characteristic of a limited subset of neurons or responses, but was instead a typical mode of firing among neurons in our sample.
Most responses with sub-Poisson variability consisted of either one or zero spikes on nearly every trial, as in Figure 1, b and c, and thus were well characterized as binary. However, 13 neurons achieved low variability for at least one tone by firing stereotyped multispike bursts in which nearly every spike count was, for example, either 0 or 3, but not 1, 2, 4, or greater (Fig. 5b). Such bursty responses have been previously described in the anesthetized cat (Phillips and Sark, 1991). Note that we use the term burst here phenomenologically, with no suggestion of the mechanism underlying the multispike response.
Approximately 41% of the responses were not significantly sub-Poisson. In some cases response variability was supra-Poisson (i.e., Fano factor greater than unity), as expected from recordings in other cortical regions. Heterogeneity in response variability has also been reported in the primary visual cortex of the anesthetized cat, in which only well isolated and well driven layer 4 neurons show markedly sub-Poisson variability (Kara et al., 2000). However, we found no comparable dependence of response variability on recording depth. Although our depth measurements were not validated with electrolytic lesions (cf. Kara et al., 2000) and should therefore be treated as crude estimates only, our data do not support the hypothesis that binary spiking in the auditory cortex is limited to a particular layer, but suggest instead that it is a general feature of the entire neuronal population recorded using patch electrodes.
We wondered whether binary spiking resulted from the brevity (25 msec) of the stimuli we typically used. We therefore subjected 12 neurons to an additional protocol consisting of at least 10 interleaved presentations each of 100 msec tones and 25 msec tones of all 32 frequencies (Fig. 5c). Of the 100 msec stimulation response sets, 45 were found to be significantly sub-Poisson at the p < 0.05 level, in good agreement with the 47 found to be significant among the responses to 25 msec tones. Thus, binary spiking was not attributable to the brevity of the stimuli. Moreover, even complex stimuli with rich spectrotemporal structure can produce binary behavior (Fig. 5d).
Response timing
In many neurons, binary responses showed high temporal precision, with latencies sometimes exhibiting SD values as low as 1 msec (Fig. 6) (see also Fig. 1b,c), comparable to previous observations in the auditory cortex (Heil, 1997), and only slightly more precise than in visual area MT (Buracas et al., 1998) of the alert monkey. High temporal precision was positively correlated with high response probability, both within (Fig. 6a) and across (Fig. 6b) cells.
Poisson model with refractory period
The low trial-to-trial variability ruled out the possibility that the firing statistics could be accounted for by a simple rate-modulated Poisson process (Fig. 5a). To illustrate this, we compared the observed spike rasters (Fig. 7a) with simulated spike trains generated using a rate-modulated Poisson process whose event rate was fit to the smoothed PSTH derived from the observed rasters. As expected, the simulated spike trains contained several multispike responses (Fig. 7b). The simulated spike trains seen here are qualitatively similar to those observed in visual area MT during presentation of dynamic stimuli (Buracas et al., 1998).
In other systems, low trial-to-trial variability has sometimes been explained in terms of a Poisson process followed by a post-spike refractory period (Berry et al., 1997; Kara et al., 2000). In the present context, if the underlying firing rate were elevated for a time shorter than the refractory period, then at most one spike per trial could be generated. Thus, whether a refractory period can provide a full account of binary spiking depends critically on whether the refractory period is longer than the period of elevated firing.
During periods of spontaneous firing, and occasionally during stimulus-evoked responses, interspike intervals as short as 2 msec were observed, as expected from previous cortical recordings. These short interspike intervals provide an upper bound on the hard refractory period, which is presumably caused by the intrinsic properties (for example, the time course with which sodium channels recover from inactivation) of the spike-generating mechanism of the neuron. The inclusion of such a hard refractory period did not substantially reduce the variability of the simulated Poisson process; several multispike responses were still observed (Fig. 7c).
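The refractory-period argument can be illustrated with a toy simulation. This is not the simulation behind Figure 7: the rate profile below (50 Hz for 20 msec, giving about one spike per trial) is hypothetical rather than fit to a PSTH, time is discretized into 0.1 msec bins, and all names are ours.

```python
import random

def simulate_trial(rate_hz, duration_ms, refractory_ms, rng, dt_ms=0.1):
    """Spike times (msec) from a rate-modulated Poisson process with a hard refractory period.

    rate_hz maps time in msec to a firing rate in Hz. Each bin of width dt_ms
    emits a spike with probability rate * dt unless the unit is still refractory.
    """
    n_steps = int(round(duration_ms / dt_ms))
    refractory_steps = int(round(refractory_ms / dt_ms))
    spike_steps = []
    for step in range(n_steps):
        if spike_steps and step - spike_steps[-1] < refractory_steps:
            continue   # still refractory after the previous spike
        if rng.random() < rate_hz(step * dt_ms) * dt_ms / 1000.0:
            spike_steps.append(step)
    return [s * dt_ms for s in spike_steps]

def brief_response(t_ms):
    """Hypothetical rate profile: 50 Hz from 10 to 30 msec, silent otherwise."""
    return 50.0 if 10.0 <= t_ms < 30.0 else 0.0

def spike_counts(refractory_ms, n_trials=500, seed=0):
    rng = random.Random(seed)
    return [len(simulate_trial(brief_response, 50.0, refractory_ms, rng))
            for _ in range(n_trials)]

short_refractory = spike_counts(refractory_ms=2.0)    # hard refractory period only
long_refractory = spike_counts(refractory_ms=25.0)    # outlasts the elevated rate
print(max(short_refractory), max(long_refractory))
```

With a 2 msec refractory period a sizeable fraction of trials still contain two or more spikes, whereas a refractory period longer than the 20 msec of elevated rate caps every trial at one spike. The toy model therefore produces binary spiking only when the refractory period outlasts the response.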
These simulations suggest that binary spiking did not result from the intrinsic properties of the spike-generating mechanism of the neuron. Rather, the fact that stimulus-evoked responses consisted of at most a single spike was more likely the result of circuit-level mechanisms. Tones elicit a precisely timed sequence of excitation, followed by strong inhibition (Wehr and Zador, 2002); the inhibition quenches the response and thereby enforces a very short window for temporal integration during which only a single spike can occur. However, this inhibition typically decays within 50-100 msec, and is therefore unlikely to account for the long-lasting suppression observed after stimuli that elicit spikes with high probability (Fig. 8); such longer-lasting suppression may be attributable, at least in part, to short-term synaptic depression (Chung et al., 2002).
Reconciling multiunit and single-unit recordings
How can the highly variable multiunit recordings (Fig. 1a) be reconciled with the single-unit binary recordings (Fig. 1b,c)? According to the simplest model, each multiunit response would consist of the summed activity of several single-unit recordings. If responses from enough low firing probability, statistically independent binary units are combined, then the trial-to-trial variability of the population approaches a value close to unity. Thus, we should expect the multiunit data to be about as variable as a Poisson process, even if the individual units are statistically independent of one another.
However, more careful examination of the multiunit responses reveals that the observed variability for most responses actually exceeds that of a Poisson process (i.e., the Fano factor exceeds unity) by a substantial margin. This supra-Poisson variability suggests that neurons are correlated with each other (Zohary et al., 1994; Abbott and Dayan, 1999). A simple simulation illustrates how positive correlations can lead to supra-Poisson variability. Figure 9 shows a simulated multiunit recording consisting of five noisy binary neurons, each with a per trial spiking probability of 0.2. The responses of the neurons were designed so that a fraction of the spikes of each neuron was shared by all neurons. For example, the point at the far left of the graph corresponds to the case of five statistically independent neurons; at the far right, the activity of each neuron is identical. The simulated multiunit variability increased with the fraction of shared spikes. Thus comparison of single- and multiunit data supports the idea that binary units are positively correlated.
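One way to build such a simulation is sketched below. The sharing scheme is an assumption of ours, since the text does not specify the mechanism: on a fraction c of trials a single draw decides whether all five neurons fire together, and on the remaining trials each neuron fires independently. Under this scheme each neuron keeps a spiking probability of 0.2, and the multiunit Fano factor has the closed form (1 - p)(1 + (N - 1)c), which the simulation should approximate.

```python
import random
from statistics import mean, variance

def multiunit_counts(shared_fraction, n_neurons=5, p_spike=0.2, n_trials=20000, seed=0):
    """Summed spike counts of binary neurons that share a fraction of their spikes."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_trials):
        if rng.random() < shared_fraction:
            # Shared trial: one draw decides whether every neuron fires.
            counts.append(n_neurons if rng.random() < p_spike else 0)
        else:
            # Independent trial: each neuron makes its own draw.
            counts.append(sum(rng.random() < p_spike for _ in range(n_neurons)))
    return counts

def expected_fano(shared_fraction, n_neurons=5, p_spike=0.2):
    """Closed form for this particular sharing scheme."""
    return (1 - p_spike) * (1 + (n_neurons - 1) * shared_fraction)

for c in (0.0, 0.5, 1.0):
    counts = multiunit_counts(c)
    print(c, round(variance(counts) / mean(counts), 2), round(expected_fano(c), 2))
```

With no shared spikes the pooled Fano factor is 1 - p = 0.8, below the Poisson value of one; with fully shared spikes it reaches 4. Independent binary units alone cannot push the pooled Fano factor above unity in this model, so supra-Poisson multiunit variability requires positive correlation.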
Discussion
We have used in vivo cell-attached recording to assess the trial-to-trial variability of neurons in the auditory cortex of ketamine-anesthetized rats. We find that the majority of responses can be well characterized as a binary process (i.e., as a response consisting of 0 or 1 spikes, but not more) rather than as the more variable Poisson process that has usually been assumed to be the rule in cortex. Our results demonstrate that cortical architecture can support a more precise control of spike number than has previously been recognized, and suggest a reevaluation of models of cortical processing that assume noisiness to be an inevitable feature of cortical codes.
Relation to previous studies of auditory cortex
Response variability has been extensively studied in the visual cortex (Heggelund and Albus, 1978; Dean, 1981; Tolhurst et al., 1983; Buracas et al., 1998; Oram et al., 1999), where with three exceptions (Gur et al., 1997; Gershon et al., 1998; Kara et al., 2000), Poisson or supra-Poisson variability has been observed. However, the trial-to-trial variability of single units has not been previously assessed in the auditory cortex, although the variability of stimulus-evoked local field potentials (LFPs) has been considered (Kisley and Gerstein, 1999).
It has long been known that tone-evoked responses in the auditory cortex can be transient. For example, in one early study of auditory cortical responses, it was noted that “nearly 95% of the neurons responded with a single spike, or short bursts of 2 to five spikes, to pure tones delivered monaurally or binaurally regardless of the duration of the tone” (Brugge et al., 1969). Subsequent work over the intervening three decades has supported the view that transient responses are common (Calford and Semple, 1995; Heil, 1997; Sutter et al., 1999), although more sustained responses are also sometimes observed, even in the anesthetized preparation (Phillips and Sark, 1991; Furukawa et al., 2000) (Fig. 5b).
However, the binary responses we describe are not simply transient. To demonstrate the transient nature of auditory spiking, it would have sufficed to show PSTHs, because the brevity of the response can be fully assessed by the mean activity. “Transient” refers to the time course and the mean spike count per trial, whereas “binary” makes a statement about the variance of the spike count as well. Our study thus establishes for the first time that transient responses in auditory cortex can be described as a binary process, rather than as a highly variable Poisson process.
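The distinction between mean and variance can be made explicit with a short worked comparison. The notation is introduced here for illustration only: N is the spike count per trial, p the per-trial spiking probability, and F the Fano factor (variance divided by mean).

```latex
\begin{align}
\text{Binary (spike probability } p\text{):}\quad
  & \mathbb{E}[N] = p, \quad \operatorname{Var}[N] = p(1-p), \quad
    F = \frac{\operatorname{Var}[N]}{\mathbb{E}[N]} = 1 - p \\
\text{Poisson (mean } p\text{):}\quad
  & \mathbb{E}[N] = p, \quad \operatorname{Var}[N] = p, \quad F = 1 \\
\text{Any integer count with } \mathbb{E}[N] = \mu \le 1\text{:}\quad
  & N^2 \ge N \;\Rightarrow\;
    \operatorname{Var}[N] = \mathbb{E}[N^2] - \mu^2 \ge \mu(1-\mu)
\end{align}
```

The last line holds with equality only when N takes the values 0 and 1, which is why a binary response has the lowest spike count variance possible for a given mean of at most one spike per trial. At p = 1 the Fano factor is zero, corresponding to exactly one spike on every trial.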
Anesthesia and transient responses in auditory cortex
Although there has recently been renewed interest in the stimulus dependence of sustained spiking in the awake auditory cortex [particularly in response to complex stimuli (deCharms et al., 1998; Lu et al., 2001)], purely transient responses have also commonly been reported in this preparation since they were observed in single-unit recordings over 40 years ago (Hubel et al., 1959; Abeles and Goldstein, 1972; Dear et al., 1993). Such transient responses can be seen, for example, in early studies on responses to complex vocalizations (Wollberg and Newman, 1972; Creutzfeldt et al., 1980). Particularly clear examples of purely transient spiking are evident in an early study of attentional modulation of auditory cortical responses (Hocherman et al., 1976). In the awake rat, up to 50% of neurons show phasic short-latency responses such as those observed here (Talwar and Gerstein, 2001). Indeed, even in a recent study focusing on the contrast dependence of the sustained component, a substantial fraction of neurons show purely transient responses under all stimulus conditions (Barbour and Wang, 2003). Thus transient firing per se is not an artifact of anesthesia. However, none of these earlier studies distinguished between Poisson and binary responsiveness, and so they provide no insight into the reliability of cortical coding.
There is nevertheless little doubt that sustained responses are less common in the anesthetized preparation. Unfortunately, there have been relatively few studies in which the activity of individual single units is compared when the animal is in different states of arousal. In one study comparing single-unit activity in sleeping to awake rats, most response properties remained primarily unchanged, with at least some neurons remaining transient under all conditions (Edeline et al., 2001).
Interestingly, firing rates observed with cell-attached and whole-cell recording methods in the awake preparation (Margrie et al., 2002) are much lower than previously reported based on conventional extracellular recordings, supporting the idea that these methods differ in their selection bias. There is at present no evidence to suggest that responses of well isolated transient responders are any less reliable in the awake preparation, and it remains an open question whether the subpopulation of transient responders in the auditory cortex of awake animals shows the binary behavior we describe here.
Relation to other areas
Transient responses are also observed in other cortical areas. However, in contrast to the present results, transient responses in other cortical areas typically show the same high trial-to-trial variability as sustained responses and can, to first approximation, be considered to result from a rapidly modulated Poisson process. For example, in area MT of the awake monkey, even when the response to a brief stimulus consists of on average only a single spike per trial, individual trials may show as many as two or three spikes (Buracas et al., 1998). These area MT responses thus have more in common with the simulated responses shown in Figure 7, b and c, than with the binary auditory responses described here.
Whereas this is the first description of binary spiking, there have been several reports of low variability spiking. Under some (de Ruyter van Steveninck et al., 1997) but not all (Warzecha and Egelhaaf, 1999) stimulus conditions, motion-sensitive neurons in the fly visual system can show sub-Poisson firing. Similarly, neurons in the vertebrate retina (Berry and Meister, 1998; Kara et al., 2000) and thalamus (Kara et al., 2000) have been reported to respond with low variability under some conditions. There have also been three reports describing sub-Poisson variability in the cortex (Gur et al., 1997; Gershon et al., 1998; Kara et al., 2000), although even the least variable neurons from these studies do not approach the extremely low variability attainable by the binary units we describe here. It is interesting to note that variability of responses reported in the auditory nerve is high (typical Fano factors are of order unity) (Teich et al., 1990), indicating that in the auditory system spike count variability decreases centripetally; in contrast, in the visual system spike count variability shows the opposite trend, increasing from the retina to the visual cortex.
Cortical physiology and circuitry are similar across many different cortical regions, and it is tempting to speculate that basic cortical computations are, as a result, also similar. It is therefore somewhat puzzling that different regions should differ so strikingly in so fundamental a characteristic as the operating fidelity. One possibility is that auditory and visual processing are indeed fundamentally different. An alternative interpretation is that the difference results not from the sensory modality, but instead from the difference between the stimuli used. In this view, the binary responses may not be limited to the auditory cortex; neurons in visual and other sensory cortices might exhibit similar responses to appropriately punctate stimuli. Conversely, auditory stimuli analogous to edges or gratings (Kowalski et al., 1996; deCharms et al., 1998) or other complex stimuli (Lu et al., 2001) may be more likely to elicit conventional, rate-modulated Poisson responses in the auditory cortex.
Sparse and efficient representations
The first spike is privileged in that it often carries most of the information in the spike train (Heil, 1997; Buracas et al., 1998; Panzeri et al., 2001). In fact, it has been suggested that complex image recognition can occur with only a single spike per neuron (Delorme and Thorpe, 2001). Because in the binary mode we have described each spike is a “first spike,” binary spiking is an efficient or sparse representation (Olshausen and Field, 1996; Hahnloser et al., 2002).
Because binary responses consist of at most a single spike, no possible information can be contained in the precise substructure of the spike train, ruling out the possibility of a privileged role for bursts (Martinez-Conde et al., 2000) or temporal multiplexing (Richmond and Optican, 1987) as has been reported in visual cortex; a stimulus parameter, such as the frequency of a tone, is encoded as the probability of firing a single spike (Fig. 3b). From the perspective of a single neuron, this probability fully specifies the response and implies that any additional information about the stimulus can be decoded only by looking over the neuronal population.
Implications for cortical processing
The well established observation that neuronal firing in the visual cortex typically shows Poisson or supra-Poisson variability has often been assumed to be a general principle true of all cortical areas (Shadlen and Newsome, 1998) or even every cortical neuron (Pouget et al., 2000). High variability is thus often taken as a starting point for a general theory of cortical dynamics, a constraint that any biologically plausible theory must satisfy. Natural questions then become: what biophysical (Softky and Koch, 1993) or circuit (van Vreeswijk and Sompolinsky, 1996) mechanisms allow such variability to arise or propagate (van Rossum et al., 2002) and how can populations of noisy neurons represent sensory stimuli with fidelity (Pouget et al., 2000)? On the other hand, if such high variability is not a necessary feature of cortical function, then inquiry turns naturally to the question of why one area or modality should show high variability whereas another does not, or to the dependence of variability on stimulus parameters.
High variability greatly constrains the kinds of processing that might occur at the level of a single neuron and may complicate models in which separate computations are performed on major dendritic branches (Shepherd and Brayton, 1987; Mel, 1994). For example, consider a hypothetical patch of dendrite in which the appropriate complement of voltage-dependent channels is arrayed to produce a logical AND gate of the activity of two nearby inputs (Shepherd and Brayton, 1987). In other words, suppose that the arrival of two presynaptic action potentials to one or both of these synapses within a short time period can result in a much greater signal to the soma than twice the effect of one action potential alone. In conventional high-variability models of cortical processing, a response that, on average, consisted of a single spike in input A would, on particular trials, often consist of a pair of spikes, and so would be indistinguishable from the simultaneous firing of A and B. If, however, each input neuron A and B reliably produced either zero or one spike, then such a scheme could sensibly signal the presence of simultaneous activity in both neurons. Similarly, some models for harnessing the computational power of dynamic synapses depend on temporally precise, low-variability spiking (Maass and Zador, 1999). Thus, binary spiking provides a possible substrate for models requiring a degree of control over spike number that, heretofore, had not been documented in the cortex.
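The size of the problem for the hypothetical AND gate can be estimated with a minimal calculation. The sketch below assumes, for illustration only, that the entire response of input A falls within the gate's coincidence window and that a Poisson input has a mean of one spike per trial; the function names are hypothetical.

```python
import math

def prob_two_or_more_poisson(mean_count):
    """P(N >= 2) for a Poisson spike count with the given mean."""
    # P(N = 0) + P(N = 1) = exp(-m) * (1 + m) for a Poisson count.
    return 1.0 - math.exp(-mean_count) * (1.0 + mean_count)

def prob_two_or_more_binary(spike_probability):
    """P(N >= 2) for a binary (0-or-1 spike) response: always zero."""
    return 0.0

# Fraction of trials on which input A alone delivers two or more spikes
# and so mimics coincident firing of A and B (assumed single window).
false_and_poisson = prob_two_or_more_poisson(1.0)
false_and_binary = prob_two_or_more_binary(1.0)

print(f"Poisson input A alone mimics A AND B on {false_and_poisson:.1%} of trials")
print(f"Binary input A alone mimics A AND B on {false_and_binary:.1%} of trials")
```

Under these assumptions, a Poisson input with a mean of one spike delivers two or more spikes on roughly a quarter of trials (1 - 2/e), whereas a binary input never does.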
The precise organization of both spike number and time we have observed suggests that cortical activity consists, at least under some conditions, of packets of spikes synchronized across populations of neurons. Theoretical work (Marsalek et al., 1997; Diesmann et al., 1999; Kistler and Gerstner, 2002) has shown how such packets can propagate stably from one population to the next, but only if neurons within each population fire at most one spike per packet; otherwise, the number of spikes per packet (and hence the width of each packet) grows at each propagation step. Interestingly, one prediction of stable propagation models is that timing precision should increase with increasing spike probability, a prediction borne out by our observations (Fig. 6). The role of these packets in computation remains an open question.
Footnotes
This work was supported by grants from the National Institutes of Health, the Sloan Foundation, the Packard Foundation, and the Mathers Foundation to A.M.Z., and by a Swartz fellowship to M.R.D.
Correspondence should be addressed to Anthony Zador, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724. E-mail: zador{at}cshl.org.
Copyright © 2003 Society for Neuroscience 0270-6474/03/237940-10$15.00/0