The synchronized activity of cortical neurons often features spike delays of several milliseconds. Usually, these delays are considered too small to play a role in cortical computations. Here, we use simultaneous recordings of spiking activity from up to 12 neurons to show that, in the cat visual cortex, the pairwise delays between neurons form a preferred order of spiking, called firing sequence. This sequence spans up to ∼15 ms and is referenced not to external events but to the internal cortical activity (e.g., beta/gamma oscillations). Most importantly, the preferred sequence of firing changed consistently as a function of stimulus properties. During beta/gamma oscillations, the reliability of firing sequences increased and approached that of firing rates. This suggests that, in the visual system, short-lived spatiotemporal patterns of spiking defined by consistent delays in synchronized activity occur with sufficient reliability to complement firing rates as a neuronal code.
Does fine spike timing determine what we see? Cortical neurons contain an arsenal of cellular mechanisms sensitive to the precise timing of inputs (Bi and Poo, 1998; Usrey et al., 2000; Ahissar and Arieli, 2001; Azouz and Gray, 2003; Meliza and Dan, 2006; Fries et al., 2007; Hausselt et al., 2007; Cardin et al., 2009; Branco et al., 2010). However, it is unclear whether such mechanisms are actually exploited (i.e., whether the fine temporal structure of cortical activity reflects features of the environment). That is, if neurons listen to the timing of inputs, do they hear anything worth listening to?
At early stages of sensory processing, information about stimulus properties is available in the precise temporal relationships between action potentials: More optimally stimulated neurons respond earlier to a stimulus than less optimally stimulated ones (Heil, 2004; Johansson and Birznieks, 2004; VanRullen et al., 2005; Gollisch and Meister, 2008). In the hippocampus, the relative timing of action potentials also contains information, for example, about the animal's spatial position (O'Keefe and Recce, 1993; Skaggs et al., 1996; Harris et al., 2002). An important difference to the sensory systems is that, in the hippocampus, action potentials do not lock to the timing of external inputs (onset and offset) but are instead referenced internally to the oscillatory rhythm of the population at theta frequencies (∼5–8 Hz) (O'Keefe and Recce, 1993; Dragoi and Buzsáki, 2006; Geisler et al., 2007). Thus, the information encoded by the precise timing of a hippocampal neuron is detectable only in relation to the activity of other neurons.
In the visual cortex, oscillations related to stimulus processing occur predominantly in the beta/low-gamma range (20–60 Hz). These fast oscillations synchronize neurons (Gray et al., 1989; Engel et al., 1991; König et al., 1995a; Maldonado et al., 2000), and the synchronization exhibits small delays of up to several milliseconds (Buzsáki and Chrobak, 1995; König et al., 1995b; Roelfsema et al., 1997; Schneider and Nikolić, 2006). These delays were thought to reflect imprecise locking of spiking activity to the oscillation cycle (Buzsáki and Chrobak, 1995; Roelfsema et al., 1997). In contrast, we explore the possibility that these spike delays reflect stimulus information, similar to the phase coding of spatial information in the hippocampus (Schneider et al., 2006; Fries et al., 2007).
We made simultaneous recordings in the primary visual cortex of anesthetized cats with up to 32 electrodes and extracted the preferred temporal sequence of spiking from pairwise cross-correlations of neurons (Schneider and Nikolić, 2006; Schneider et al., 2006; Nikolić, 2007; Havenith et al., 2009). We find that (1) groups of synchronized neurons form firing sequences, each neuron having its own preferred firing time relative to all other neurons in the group. (2) These sequences last shorter than a single cycle of beta/gamma oscillation. (3) Firing sequences are not fixed (“hardwired”) but change as a function of stimulus properties. (4) The changes are systematic and permit inferences on stimulus properties. (5) The degree to which firing sequences replicate across repeated stimulus presentations allows for reliable coding.
Materials and Methods
In five adult cats (three males; two females), anesthesia was induced with a mixture of ketamine (Ketanest; Parke-Davis; 10 mg/kg, i.m.) and xylazine (Rompun; Bayer; 2 mg/kg, i.m.). After tracheotomy, anesthesia was maintained with 70% N2O and 30% O2, supplemented with ∼1.0% halothane. After craniotomy, the level of halothane was reduced to ∼0.5%. When anesthesia was stable and sufficiently strong to prevent any vegetative reactions, the animal was paralyzed with pancuronium bromide applied intravenously (pancuronium; Organon; 0.15 mg · kg−1 · h−1). The respiration rate was adjusted manually to keep end-tidal CO2 at 3–4%. Rectal temperature was kept at 37–38°C. Glucose and electrolytes were supplemented intravenously. All procedures complied with the German law for the protection of animals and were overseen by a veterinarian.
Recordings were acquired from area 17 using silicon-based 16-site probes supplied by the Center for Neural Communication Technology at the University of Michigan (illustrated in supplemental Fig. S7, available at www.jneurosci.org as supplemental material), as described previously by Biederlack et al. (2006). The electrode contacts (recording sites) had an impedance of 0.3–0.5 MΩ at 1000 Hz and were organized in a 4 × 4 matrix on four shanks, with a distance of 0.2 mm between the neighboring contacts. Thus, the recording area of one probe spanned ∼0.6 × 0.6 mm. The probes were carefully placed to enter the cortex at a 90° angle (i.e., perpendicularly to the surface of the cortex), and the uppermost row of electrode contacts was lowered 100–200 μm below the cortex surface. Thus, the same probe recorded neurons from different layers and different orientation columns. We always inserted two probes simultaneously into area 17 of the same hemisphere, thus recording from 32 electrode contacts in parallel. In one probe, good signals were available only on nine recording channels. Thus, in total, we recorded from 153 sites.
Signals were amplified 1000×, filtered at 500 Hz to 3.5 kHz, and sampled with a frequency of 32 kHz. Action potentials (spikes) were detected with a two-sided threshold discriminator adjusted manually to yield a signal-to-noise ratio of ∼2:1. For each detected action potential, the time of the event was recorded together with the spike waveform over a duration of 1.2 ms. Single-unit activity was then extracted off-line by spike sorting. The distance between neighboring electrode contacts was sufficient to prevent them from recording activity of the same neuron—in no case did cross-correlation histograms (CCHs) show the high, narrow peaks in the very center of the CCH characteristic of autocorrelations. Thus, spike sorting was applied to the signals from each recording site separately. Spike waveforms were sorted using principal component analysis (PCA), according to the following two criteria: (1) waveform separation: the shape of the spike waveforms should be clearly defined and different from all other waveforms in the multiunit (MU), as indicated by the first three PCA components; (2) refractory period: not more than 0.5% of spikes should occur with an interspike interval of <1 ms. The 126 isolated single units (SUs) that satisfied these criteria (7–20 neurons per recording probe) had an average firing rate of 18 ± 15 spikes/s (mean ± 2 SD) in the most optimal stimulation condition.
Visual stimuli were presented on a 21 inch Hitachi CM813ETPlus monitor with a refresh rate of 100 Hz, positioned at a distance of 57 cm from the eyes. The pupils were dilated with atropine, the nictitating membrane retracted with neosynephrine and the cornea protected with contact lenses containing artificial pupils of 2 mm diameter. After refraction, the eyes were focused on the monitor with correcting lenses. The optical axes of the two eyes were aligned by mapping the receptive fields (RFs) for each eye separately and placing a prism in front of one eye to achieve binocular fusion of the RFs. All RFs were determined by manual mapping with high-contrast white bars on a black background. Because of the spatial proximity between electrode contacts, the RFs recorded from one probe always overlapped, producing clusters that spanned up to ∼10° of visual angle. The stimuli were always positioned such that their centers matched the center of the RF cluster.
Stimuli were always presented binocularly, using the presentation software ActiveSTIM (www.ActiveSTIM.com). In the main set of recordings, for which the largest part of the reported analyses was performed, we presented high-contrast drifting sinusoidal gratings of circular shape, and varied the orientation and drifting direction in 12 30° steps, thereby covering the entire circle of 360°. In three cats, the gratings had a spatial frequency of 2.4°/cycle and covered 12° of visual angle, in two cases moving with a speed of 2.2°/s and in one with 6.6°/s. In the remaining two cats, gratings had a spatial frequency of 1 and 2.5°/cycle, respectively, covering 7 and 18° of visual angle and moving with a speed of 1.5 and 2°/s.
To exclude stimulus-locked spike timing as a cause for the time delays in CCHs, in three animals we additionally recorded responses to high-contrast moving bars. The bars were oriented to match the orientation preference of a majority of the cells and were moved across the receptive fields in the two directions perpendicular to the orientation of the bar. Bars spanned 0.5 × 6° of visual angle, appeared on the screen at an eccentricity of 2° from the center of the RF cluster, and moved across the center of the cluster with a speed of 1°/s, disappearing after 4 s.
All stimuli were presented 40–120 times (trials) each, in randomized order. The recordings were made in two to six blocks, each block consisting of 20–60 trials per stimulus condition. Each block lasted 20–100 min, and often several-hour-long breaks were made between different blocks. This enabled us to investigate the stability of the relative firing times over longer periods of time. The responses obtained from the two simultaneously inserted Michigan probes were analyzed separately (i.e., timing relationships between units recorded from different probes were not investigated here). Thus, for each cat we obtained two datasets (denoted A and B). Thus, the total of 8 experiments (5× drifting gratings + 3× bars) yielded a total of 16 data sets, 10 in which we investigated the stimulus-induced changes in the firing sequences in response to grating stimuli, and 6 with responses to bar stimuli used to analyze the relationship between the preferred spike delays and the external temporal dynamics of the stimulus (i.e., to prove the internal origin of the delays).
Extracting pairwise preferred delays.
The preferred spike delay between pairs of neurons was determined by computing their CCH with 1 ms resolution across the sustained responses to each stimulus (i.e., excluding the responses to stimulus onset and offset to eliminate rate covariation from the CCHs). To ensure that CCHs did not include rate covariation (Brody, 1999), the responses showing the strongest rate variation (i.e., those immediately after the onset and offset of the stimuli) were not included in the CCH analysis. Hence, CCHs were computed only for the sustained responses over a duration of 3.2 or 1.8 s for grating and bar stimuli, respectively. Stimulus-locked rate covariation can be detected by computing CCHs between responses with shuffled trial orders (i.e., shift predictors) (Perkel et al., 1967). We computed shift predictors for a random sample of four datasets and most of these 423 shifted CCHs were flat, indicating that rate covariation had not contributed to the peaks in the CCHs. More importantly, regardless of the shapes of the shift predictors, the estimates of phase shifts were identical with and without subtraction of shift predictors (r = 0.99; n = 423).
We then fitted a Gaussian function to the central peak of the CCH (±15 ms) and located its maximum on the x-axis (see Fig. 1B). When applied on reasonably smooth CCHs, particularly with 1 ms binning, this approach estimates preferred spike delays with submillisecond precision (Havenith et al., 2009). Compared with other typical fitting methods for CCHs, like damped Gaussian (i.e., Gabor) functions (König, 1994; König et al., 1995a; Roelfsema et al., 1997; Nikolić, 2007) or cosine functions (Schneider and Nikolić, 2006; Schneider et al., 2006), the Gaussian function applied here produced the same results as the other two (r > 0.95; 91 CCHs tested) but was least sensitive to initial parameter settings and least likely to enter local minima during the fitting process.
The submillisecond precision of the fitting procedure could be achieved only if the CCH peak consisted of at least three consecutive bins (1 ms width each) with at least N = ∼8 entries (coincident events). With fewer entries, precision decreased rapidly because of fitting errors (Havenith et al., 2009). This requirement had several implications for the selection of the data included in the analyses as follows.
(1) Not all neurons in all stimulation conditions generated enough spikes to enable a reliable assessment of the preferred time delays from the pairwise CCHs. This was usually a problem with stimuli of nonoptimal orientation, which did not drive the neurons well. If a neuron had overall low firing rates and participated in CCHs with insufficient entries, it was completely excluded from the analysis. However, we did retain neurons that only had “poor” CCHs in some stimulation conditions. The reason is that our analysis required that the networks of pairwise preferred delays were complete (i.e., did not contain missing values). The criterion for the inclusion of a unit was that the Gaussian functions fitted the CCHs with r2 ≥ 0.5 in >50% of the cases. Since the overall preferred firing time of a neuron was computed from multiple pairwise delays, this criterion provided sufficiently accurate estimates. This procedure eliminated 48 of the 126 isolated SUs (38%), leaving 78 neurons for additional analysis (5–12 per recording probe; median, 7 neurons). This criterion could not be applied for the polar plots of firing times shown in Figure 5B, and for the time-resolved changes in firing times shown in Figure 5, E and F. These analyses needed to include also responses with very low spike counts (e.g., for nonpreferred grating orientations). Thus, all neurons with nonempty CCHs were included in these analyses, regardless of the goodness of fit of Gaussian functions.
(2) As mentioned, some grating orientations evoked particularly low firing rates in the recorded neurons, producing very sparse, or sometimes even completely empty, CCHs. Therefore, in some analyses, we had to exclude the responses to certain grating orientations. A stimulation condition was removed from the analysis if it was not possible to compute a firing sequence for five or more neurons after applying the criterion explained in (1). In one data set, neurons responded vigorously enough to compute firing sequences for all 12 stimulation conditions. The smallest number of stimulation conditions that entered the analysis was two, and the median was 3.5. As mentioned above, we did not use these criteria for the computation of the orientation tuning of the relative firing times.
(3) Finally, it was not feasible to estimate the intertrial variability of firing sequences directly by determining pairwise time delays for single trials and to make at the same time a reasonable comparison to firing rates. This is because, as a result of low firing rates, insufficient entries were available to estimate CCHs from single trials, which lead to a disproportionate increase in the error of time delay estimates compared with estimates of firing rates [for an illustration of the trial-to-trial variability in estimated time delays, see Nikolić (2007) and supplemental Fig. S9, available at www.jneurosci.org as supplemental material]. This lack of distributivity of errors in the estimates of time delays is not an indication of inherent imprecision of time delays but occurs because of the fact that the measure of delay is extracted indirectly, by fitting a Gaussian function. The fitting errors of each parameter do not reduce with equal speed. For example, the baseline of the Gaussian can be estimated with accuracy that is proportional to 1/, where n would denote the amount of available data (e.g., the number of trials). However, other parameters (e.g., the mean and the width) can be estimated with a similar rate only when the baseline is estimated accurately, and the estimate of the peak delay of the Gaussian is in turn reliant on all three of these other parameters. Consequently, the error with which the mean delay of the Gaussian is measured, is for small numbers of data points, much larger than would be expected from the widths of the center peaks. In other words, with a small number of trials (that is, CCHs containing only a small number of coincident events), the best fit is likely to position the Gaussian at values far outside the region within which the center peaks of CCHs vary (e.g., an estimated delay of 30 ms can be easily encountered). Consequently, an average of k Gaussian fits, each of which is applied to 1/k of the total available amount of data, will be much more inaccurate than a single Gaussian fit calculated for a CCH including all data. This is referred to as the lacking distributivity of errors. Thus, to obtain accurate estimates in single trials, each trial would be required to contain an unrealistically high number of entries (e.g., rich MU instead of SU). Therefore, time delays had to be estimated always from CCHs calculated across multiple trials. Because of these limitations, to estimate the variability of firing times across repeated measurements we randomly split the data sets into two halves 100 times and calculated CCHs from each half of the data. This restriction does not apply to rate responses, as they can be calculated for n individual trials and subsequently averaged to obtain an estimate with the error decreasing with the usual factor of ∼1/.
Delays in RF activation.
To estimate the delays in activation caused by the RF properties of different units, we used two different techniques. For the data shown in Figure 1D and supplemental Figure S1, B and C (available at www.jneurosci.org as supplemental material), we applied the double sliding window technique for the measurement of onset latencies, based on peristimulus time histograms (PSTHs) (Berényi et al., 2007). We computed PSTHs with a resolution of 5 ms, and the width of the sliding window was 400 ms. The method rarely entered local minima, and the estimated latencies of the response onset corresponded well to those obtained by visual inspection of the PSTHs. For the data shown in supplemental Figure S1, B and C, we computed PSTHs with a bin size of 10 ms, and then generated cross-correlation functions of those PSTHs for delays of ±1000 ms. The delays in PSTH peaks were then estimated by the same procedures based on fitting a Gaussian as used for the regular CCHs.
Using the method described by Schneider et al. (2006) and Nikolić (2007), the matrices of cross-correlation delays (also referred to as networks) (see Fig. 2B) between simultaneously recorded neurons were transformed into linear sequences of preferred firing times. This is justified if a data set satisfies the principle of additivity: For any three units A–C, the time delay ϕ between units A and C equals the sum of the time delays between units A and B and units B and C, as follows: If this relationship holds, one can assign each unit k a temporal position indicating the preferred time xk at which it fires action potentials relative to all other units (see Fig. 2C). xk is determined as the normalized sum of all time delays ϕjk of unit k with respect to all other units j as follows: where n indicates the total number of units. Thus, if a unit tends to fire earlier than others, it is assigned a positive value and vice versa, the sum of all values being always zero. In this way, n(n − 1)/2 pairwise time delays are condensed into n temporal positions on a single time axis (see Fig. 2F). To estimate how accurately this linear representation matched the original pairwise delays, we computed a measure of the error as the normalized difference between the distance of the positions of two neurons on the time axis on the one hand, and the preferred delay measured in their CCH on the other. This error is referred to as additivity error σAdd because it is directly related to the deviation of the original pairwise delays from additivity (see Eq. 1). The overall error σAdd takes into account all possible pairs of neurons, whereas the individual error σAdd(k) of a unit k only takes into account its pairwise connections with all other neurons. For more details, see Schneider et al. (2006).
Error measures of firing sequences.
To estimate the error variance of the preferred firing times across repeated measurements, we used several different error measures. First, we applied a bootstrapping procedure, whereby we computed firing sequences from a randomly chosen subsample of trials. Each subsample contained one-half of the total number of available trials, which were sampled with replacement. For each data set, 100 such samples were made and the SD of the obtained relative firing times was used as an estimate of the measurement error, referred to as “bootstrap error,” and is shown in Figures 3E and 4C, and also used in the computation of mutual information detailed below.
To estimate the stability of the firing sequences over time, we split the data in two halves as they were recorded in time (i.e., early vs late recording blocks) (unlike the random choice of trials used in the bootstrapping procedure). For the number of recordings used in each analysis and other details, see supplemental Table S1 (available at www.jneurosci.org as supplemental material).
Finally, we also computed a measure of the error variance resulting from the imperfect additivity of the measured pairwise delays. Pairwise spiking delays are not guaranteed to be additive. In principle, spike trains can also produce nonadditive time delays (Schneider et al., 2006). Therefore, we needed to ensure that the time delays were sufficiently additive to warrant the application of our analyses. The degree to which the preferred firing sequence correctly represents the measured pairwise time delays is expressed in the average positioning error per unit, σAdd [defined as σAdd≡σ̂x in the study by Schneider et al. (2006)]. To obtain σAdd, first the differences between the distance δij separating two units i and j on the time axis and their measured time delay ϕij are summed up as follows: Next, QAdd is normalized by the number of time delays that enter its computation [adapted from the study by Schneider et al. (2006), their Eq. 4] as follows: σAdd is also referred to as the additivity error because the measured time delay between two units and their distance on the time axis correspond only if a data set is additive (see Fig. 2C–E).
In addition to this average error across all units, we computed an individual positioning error σAdd(k) for each unit k, which was used to determine the widths of the Gaussians above each unit in the representations of firing sequences shown in Figures 2F, 3A,G, 4A, and 5A. Similar to the average error for all units, σAdd(k) was also computed as a normalized sum of errors, but only for the time delays ϕjk of the unit k with all other units j (j ≠ k). It can be shown that the mean of σ2Add(k) across all units k equals the overall additivity error σ2Add.
Statistical testing of firing sequences.
To statistically test the stimulus-dependent changes of firing sequences, we applied three tests based on different types of error variance. First, we tested whether the difference between two firing sequences exceeded the variance resulting from the imperfect additivity of pairwise delays. For this purpose, we used the ANOVA presented in the study by Schneider et al. (2006), termed here ANOVA-Add, which is based on the positioning error σAdd (defined in the previous section), The resulting F value, computed as the ratio between the stimulus-dependent differences between two mean firing sequences, and the average additivity error σAdd, is distributed with (n − 1) and (n − 1)(n − 2) degrees of freedom and assumes large values if the positioning error is small and the difference between the firing sequences is large. The ANOVA-Add is designed only for comparisons between two firing sequences. Thus, when comparing changes in firing sequences for more than two stimulus conditions, we conducted multiple pairwise ANOVA-Adds and controlled for the type I error by correcting p values using the Dunn–Sidak correction for multiple comparisons (Dunn, 1961). A stimulus-dependent change of firing sequences was considered significant if across all comparisons the smallest corrected p value did not exceed p = 0.05.
Second, we corroborated this parametric analysis with a nonparametric statistical test also designed to investigate pairwise delays (Nikolić, 2007). This test considers only the direction of time delays, ignoring their magnitudes, and tests whether these directions indicate a consistent rank order of firing across the units (i.e., if unit A fires before B, and B fires before C, then A fires before C), a property known as transitivity. Similar to, for example, Wilcoxon's signed-ranks test, changes between two firing sequences are evaluated by computing first the differences between all pairwise time delays across two stimulation conditions, and then testing the transitivity of this difference network (matrix). To this end, all subnetworks of size three (triples) are considered, the total number of nontransitive triples, CNTrans, is counted, and its significance for a given network size is taken from a look-up table provided in the study by Nikolić (2007). The test makes no assumptions about normal distribution of errors. To correct for multiple comparisons in the transitivity test, we used an α level of 0.01.
Finally, to compare the stimulus-dependent changes of firing sequences against their error variance across multiple measurements, we used a two-way (stimulus by unit) ANOVA for repeated measurements (ANOVA-RM). We obtained repeated measurements by splitting each dataset into two parts, analyzing either odd or even trials (according to the order of presentation). The data sets were only split in two parts because the estimates of preferred firing delays required fitting CCH with Gaussian functions, and this procedure is particularly susceptible to producing large errors when a small number of data points is used (for details, see above, Extracting pairwise preferred delays). On these split-half measures, we applied the ANOVA-RM. Note that a significant change of firing sequences cannot be detected by testing the main effect of the factor “stimulus.” The reason is that the mean of a firing sequence is, by definition, always zero and, therefore, does not change across stimulus conditions. Instead, significant changes in firing sequences have to be detected through the interaction between the factors “stimulus” and “unit,” because the positions of individual units on the time axis change as a function of the stimulus even though the overall mean of all positions is fixed to zero. The same split-half repeated-measurement design was also used for the analysis of rate responses. A comprehensive table of results for all statistical tests on all data sets is provided in supplemental Table S1 (available at www.jneurosci.org as supplemental material) (also see below, Unitary event analysis, for a statistical test for detection of temporal sequence directly in the spike trains).
Sequence-triggered average of the local field potential.
The sequence-triggered average (SQTA) of the local field potential (LFP) displayed in Figure 3B is computed by first averaging the LFP signals recorded across all electrodes with which we recorded also spiking activity. Next, the classical spike-triggered average (STA) of the LFP is computed for each of the neurons in the sequence. Finally, to obtain the SQTA, the relative positions of the individual STAs are offset according to the relative firing times of the corresponding neurons, and then averaged. For example, if a sequence includes two neurons with the preferred firing times of +2 and −2 ms, the two corresponding STAs would be averaged with an offset of 4 ms.
Oscillation frequency of CCHs.
To determine the dominant frequency of oscillation for each CCH, we first generated an extended CCH over a time window of ±128 ms, and then computed its fast Fourier transform to obtain the power spectrum. The power spectra showed a well defined single peak in the range of 20–60 Hz in almost all cases. The frequency of each CCH was defined as the frequency with the highest power within this frequency range.
Orientation and direction tuning of firing times.
In each of the polar plots in Figure 5B, a constant time, tc, needed to be subtracted from the firing times of the unit relative to all the other units for all 12 stimulus directions. The reason was that, unlike rate responses, relative firing times do not have an absolute zero. Hence, the choice of tc was arbitrary and, in turn, affected strongly the degree to which firing times appeared direction selective (i.e., the “sharpness” of the tuning curve). Thus, in Figure 5, B and C, the sharpness of orientation and direction selectivity should not be compared across firing rates and firing times. We chose values of tc that approximately matched the sharpness of the corresponding rate tuning curves to facilitate visual comparisons of their orientation and direction preferences. Importantly, the subtraction of tc neither affects the estimates of the preferred stimulus direction nor the Pearson's correlation coefficient, r, calculated between firing rates and firing times across the 12 stimulation conditions.
The time-resolved analysis of firing times and firing rates was applied to three data sets of 5–7 SUs (n = 17 units in total) that were taken from the three oscillatory recordings. Firing sequences and mean firing rates were extracted for analysis windows of 450–550 ms duration that were moved in 10 ms steps. All the units in a group had overlapping RFs. To compare the modulation amplitudes of firing rates and firing times, we fitted both types of responses with sine functions of the same temporal frequency as the presented grating stimuli. For firing rates, the modulation amplitude was then determined by comparing the amplitude of the sinusoidal modulation of the rate response by the grating (F1) to the mean change in response amplitude from spontaneous to stimulus-evoked activity (F0), known as F1/F0 ratio (Movshon et al., 1978a,b; Skottun et al., 1991). Rate responses with an F1/F0 ratio >1 were classified as simple, whereas rate responses with an F1/F0 ratio <1 were classified as complex. For firing times, the modulation amplitude of the fitted sine was not normalized, but simply expressed in milliseconds.
The mutual information (MI) was calculated between rate or time responses (r) of individual neurons on one side, and the visual stimuli (s), on the other side by using the following equation: where p(r) and p(s) are probability distributions of responses and stimuli, respectively, and p(r,s) is the joint probability of responses and stimuli. In the experiment, the p(s) was kept constant for all possible stimuli [i.e., p(s) = 1/n, where n is the number of different stimuli]. p(r) and p(r,s) were calculated based on conditional probability p(r|s) by using the following equations: and The conditional probability p(r | s) for both firing rates and firing times was approximated by assuming that the responses follow Gaussian distribution for which the mean and SD were estimated by a bootstrapping procedure described above (see Error measures of firing sequences).
The maximum possible entropy of stimuli, MImax, is the upper bound for mutual information with other variables and varied between 1 bit, for two different stimuli, and 3.6 bits, for 12 stimuli. Consequently, the capacity for carrying stimulus-related information, pMI, was expressed as proportion of the maximum possible information that was transferred: The MI and pMI values for rate responses and for relative firing times of a neuron were calculated from identical spike trains.
The oscillation strength in a data set was estimated by computing its oscillation score (OS) (Muresan et al., 2008). To this end, an autocorrelation histogram (ACH) was computed with 1 ms resolution for each unit and for each investigated stimulus condition. The OS is defined as the ratio between the power at the strongest frequency within a band of interest, which covers in our case the upper beta/lower gamma band (25–50 Hz), and the average power of all other frequencies in that band. The OS computes a corrected oscillation power by removing the artifacts induced by the presence of the center peak in the ACH and by possible bursting behavior of the cells. An OS of 5 for a preferred frequency of 40 Hz indicates that the oscillatory power at 40 Hz is five times larger than the average power in all the other investigated frequencies.
Within a data set, the units tended to have similar OS, indicating a collective fluctuation in the level of oscillatory activity. Thus, the overall oscillation strength for a data set was estimated simply as the average OS for all individual ACHs. In accordance with the recommendations given by Muresan et al. (2008), data sets with an average OS ≥ 5 were classified as oscillatory, and data sets with an OS < 5 as nonoscillatory. For the responses to gratings drifting in different directions, seven data sets were classified as nonoscillatory (OS, 1.5–4.8; mean, 3.2; median, 3.5) and three as oscillatory (OS, 6.0, 6.3, and 7.2).
Unitary event analysis.
Unitary events were determined by dividing the spike responses to gratings (300 ms after stimulus onset until stimulus offset) into short consecutive time windows (2 and 5 ms; results shown only for 5 ms windows). A unitary event was defined as a group of spikes fired within a time window, and the size of the unitary event was defined as the number of spikes in that window. The frequency of occurrence (events per second) was computed for each size of unitary event.
To investigate the stimulus specificity of unitary event distributions, spike trains were shifted in two ways: In one, the shift of each spike train compensated for the delay in the relative firing time of that neuron in the preferred firing sequence (“correct shift”). These shifts were expected to reduce maximally the average delay between action potentials. Hence, this procedure should maximize the number of large unitary events (and concomitantly reduce the number of small unitary events). In the second type of shift, spike trains were shifted according to the preferred firing sequences obtained in response to the other stimuli (“incorrect shift”). If the temporal structure of the spike trains is stimulus dependent, this type of shift should either not reduce the delays between action potentials (e.g., if stimuli are mostly dissimilar) or reduce them less efficiently (e.g., if similar stimuli are used for correct and incorrect shifts).
To assess the significance of the differences in the sizes of unitary events f, we computed the average frequency of occurrence (number of events per second) in spike trains shifted either correctly, Fc, or incorrectly, Fi, and then computed the normalized frequency based on the average for correct and incorrect shifts: Fnc = Fc/(½ Fc + ½ Fi) and Fni = Fi/(½ Fc + ½ Fi).
Given that the frequencies with which unitary events of different sizes occurred were mutually dependent—if the number of large unitary events increased, the number of small ones necessarily decreased—we had to test for the significance of a change by examining the interaction between the correctness of the of the shift and the size of the unitary event in a two-way ANOVA. The p value of the interaction term indicated whether shifting the spike trains in the different ways affected the shapes of the distributions of unitary event sizes. The Fn ratios obtained for each stimulus condition were treated as repeated measurements and the null-hypothesis (H0) was that the distributions of unitary event sizes do not increase as a result of a correct shift of a spike train compared with incorrect shift. The results of this test are listed in supplemental Table S1 (available at www.jneurosci.org as supplemental material).
To investigate whether oscillations had an effect on the sizes of unitary events, this approach was modified such that the two-way ANOVA had a factor strength of oscillation instead of correctness of the shift and was made using the Fnc values only (see Fig. 8I).
The significance of changes for each event size separately (see Fig. 8E–H) was tested nonparametrically: We computed ratios between correctly shifted and original spike trains, Fc/Fo, incorrectly shifted and original spike trains, Fi/Fo, and correctly and incorrectly shifted spike trains, Fc/Fi. A binomial test was made for the number of firing sequences with the ratio >1.0, given the total number of firing sequences with unitary events of that size. The binomial probability was computed for obtaining the observed count by chance (i.e., assuming that each firing sequence has equal probability of 0.5 of producing a ratio >1.0 and <1.0).
As a control, we applied all of the above-described analyses to trial-shuffled spike trains. Trial shuffling eliminates unitary events that are generated because of the internal organization but retains those that emerge from the responses that are time-locked to the external stimulus and those that occur by chance. In this case, the average frequencies of unitary events were estimated from 200 repetitions of the shuffling procedure.
Neurons fire with consistent delays at the millisecond scale
We obtained simultaneous extracellular recordings of SU activity from area 17 of five cats using 16-channel Michigan probes. Visual stimuli consisted of sinusoidal gratings drifting in 12 different directions closing a full cycle of 360°, or of bars moving in two different directions. Gratings were presented 40–120 times (trials) each, and bars 20–40 times each. In each recording, we investigated firing sequences formed by 5–12 well isolated SUs recorded simultaneously (78 SUs in total) (for examples of spike waveforms, see supplemental Fig. S1A, available at www.jneurosci.org as supplemental material).
Internally generated temporal relationships can be determined using as reference the population activity (e.g., the LFP) (Dragoi and Buzsáki, 2006; Kayser et al., 2009; Vinck et al., 2010). When the activity of multiple neurons is recorded simultaneously, another possibility is available: Temporal relationships between the recorded neurons can be extracted directly from their own action potentials (Schneider et al., 2006; Nikolić, 2007). In the present study, we opted for this second possibility.
To this end, we first estimated the dominant (preferred) delays between pairs of neurons, by computing CCHs and fitting a Gaussian function to the central peak (Fig. 1B) (König, 1994; Schneider and Nikolić, 2006; Havenith et al., 2009) (see Materials and Methods). The dominant delay is defined by the peak of the fitted function. Clearly, this dominant delay does not occur every time a pair of neurons generates coincident spikes, as indicated by the widths of the CCH peaks, whose width at half-height varied between 5 and 20 ms. For many pairs of neurons, the deviations from the preferred spike delay were sufficiently small to maintain consistently the order of firing—one neuron firing reliably earlier than the other (see supplemental Fig. S2, available at www.jneurosci.org as supplemental material).
We next verified that the preferred delays had a truly internal origin by computing shift predictors (Perkel et al., 1967). The shift predictors were flat for all CCHs, indicating that the CCH peaks, and their shifts, were not produced by locking of the responses to the stimulus. Single bars moving over RFs (Fig. 1A,B) activated cells at different times because of the spatial offsets of the RFs, producing effectively delays in activation onset (Fig. 1C). However, these onset delays were not correlated with those extracted from the CCHs (r = 0.14; p = 0.14; n = 107) (Fig. 1D), and neither were the delays with which the cells reached the peaks of activation (supplemental Fig. S1B,C, available at www.jneurosci.org as supplemental material). Moreover, the activation delays were ∼2 orders of magnitude larger than those in the CCHs (Fig. 1D). Thus, firing sequences were not generated by sequential stimulation of spatially shifted border of RFs.
Pairwise spike delays reflect a preferred sequence of firing
Figure 2 illustrates the extraction of a firing sequence for a set of seven neurons with overlapping RFs (Fig. 2A), activated by a drifting sinusoidal grating. Although the network of pairwise delays appears complex (Fig. 2B), its structure is simple because the delays are predominantly additive. That is, for any three units A–C, the delay A–C approximates the sum of the delays A–B and B–C (Fig. 2C) (Eq. 1) (Schneider et al., 2006; Nikolić, 2007). If all neurons fire in all spike sequences (i.e., all spike sequences are complete), the additivity of preferred delays is a mathematical necessity despite any variability of pairwise firing delays. However, under more realistic conditions, neuronal activity is sparse and only a fraction of neurons participates in a synchronized population response. In this case, spike sequences are mostly incomplete. Under such circumstances, additivity of spike delays is possible (Fig. 2D) but not guaranteed (Fig. 2E).
Based on the additivity of delays, a simple arithmetic (Schneider et al., 2006; Nikolić, 2007) (Eq. 2) allows one to transform the entire network of temporal relationships into a simple one-dimensional sequence (Fig. 2F). This sequence shows the preferred firing times of neurons relative to each other. The deviations from additivity, and therefore the uncertainty with which the units can be positioned on the time axis, are indicated by Gaussians (Fig. 2C) (Eq. 5). These errors refer to the inconsistencies between the preferred delays (i.e., CCH peak positions) and do not indicate directly the variability across individual coincident events. The latter is indicated by the widths of the CCH peaks (shown in Fig. 2G; for more examples, see supplemental Fig. S3, available at www.jneurosci.org as supplemental material), which are much larger. This difference between the two indicators of variability in firing sequences corresponds—both conceptually and quantitatively—to the difference between the error of the mean (relatively small) and the variability within the original statistical sample (relatively large). A very similar increase in variability can be seen in firing rates when, similarly to the computation of a CCH, they are estimated on the basis of individual interspike intervals (see supplemental Fig. S4, available at www.jneurosci.org as supplemental material).
In Figure 3A, we show a firing sequence of five neurons in relation to the period of the underlying beta/gamma oscillation. The total span of the sequence, 7.88 ms, was much shorter than the oscillation period (37.24 ms; corresponding to 26.9 Hz), illustrated here by the sequence-triggered average of the LFPs (adapted from the spike-triggered average of the LFP) (see Materials and Methods) (Fig. 3B). Thus, the firing sequence occupied mostly the peak of the oscillation cycle. Similar results were obtained in all recordings: The 43 investigated firing sequences (10 data sets with 2–12 stimulus conditions, one sequence per stimulus condition; each sequence containing 5–12 neurons; median, 7 neurons) covered on average 6.60 ± 1.49 ms (mean ± SD), whereas the underlying oscillation cycles lasted 59.14 ± 16.14 ms (21.2 ± 4.3 Hz). The firing sequences never spanned more than approximately one-third of the length of the oscillation cycle. In contrast to the hippocampus in which the delay between the discharges of different place cells is directly related to the period of the theta oscillation (Geisler et al., 2007), we found no evidence that the time span of the firing sequences depended on the period of the underlying oscillation (Fig. 3C). Consequently, relative firing times were always expressed in milliseconds rather than phase angles.
The time spans of firing sequences were much larger than the measurement errors (Fig. 3D). On average, the median additivity error was only 0.25 ms (highly skewed distribution; average additivity error, 0.44 ± 0.49 ms; the examples in Figs. 2F and 3A have average additivity errors of 0.25 and 0.53 ms, respectively). Similar results were obtained when, instead of the additivity error, we used the bootstrap error computed across multiple estimates of firing sequences (Fig. 3E), or if, instead of the total time span of the sequence, we computed its dispersion (i.e., the SD of the individual firing times) (data not shown). In all cases, the preferred firing times of neurons exhibited little variability and could be distinguished reliably. The variability of the delays across individual coincident events, as indicated by the average width of the CCH peaks at half-height (shown, for example, in Fig. 2G) was, as would be expected, considerably larger. Yet in most cases even the half-widths of the center peaks were smaller than the total time span of most firing sequences (Fig. 3F).
We also determined whether firing sequences were stable over prolonged periods of time. We repeatedly recorded the responses of neurons to the same stimuli and compared blocks of recordings made 3–12 h apart. An example is shown in Figure 3G. Although small changes in the sequence can be seen, the relative firing times of the majority of neurons shifted only within the range of the additivity errors. Also, the ordinal positions of the neurons in the firing sequence remained mostly fixed. Thus, the firing sequence can be characterized as relatively stable over this prolonged period of time. Similar results were obtained with all other firing sequences: The jitter of the relative firing times had a median of only 0.86 ms (highly skewed distribution; average root-mean-squared deviation, 1.65 ± 1.16 ms). For the example in Figure 3G, the mean jitter was 0.56 ms. Thus, neurons that fired early (late) during one recording fired early (late) also during a later recording. In supplemental Figure S5 (available at www.jneurosci.org as supplemental material), we show that the estimates of firing sequences are also robust against subsampling of the neurons belonging to a synchronized assembly—the relative firing time of a neuron can be estimated accurately when compared with a small number of other neurons, without referencing to the global population activity, as reflected, for example, by the LFP (Schneider et al., 2006).
The preferred firing sequence is stimulus dependent
To explore stimulus-dependent changes, we analyzed firing sequences in response to gratings drifting in different directions. Figure 4A shows the change of the firing sequence from Figure 3G when the drifting direction of the grating was altered. The relative firing times of the neurons clearly changed to a much larger extent. For different drift directions (Fig. 4A), the largest shifts in relative firing time exceeded 4 ms, with a root-mean-squared distance of 2.06 ms, compared with 0.56 ms for the variability of responses to the same stimulus (Fig. 3G). The corresponding CCHs in Figure 4B illustrate these results on a pairwise basis. To test the significance of the changes shown in Figure 4A (different stimuli), we used first a repeated-measurement ANOVA (ANOVA-RM), based on variability of the estimates of delays from different subsets of trials. The differences were statistically significant (FRM(10,22) = 20.6; p < 0.001). Second, we examined whether the additivity between pairwise delays also indicated a significant change. To this end, we used specialized tests for additivity statistics, which come both in a parametric version, called ANOVA-Add (Schneider et al., 2006), and in a nonparametric version, called transitivity test (Nikolić, 2007). In both cases, the differences were significant [FAdd(10,90) = 9.5, p < 0.001; CTrans(11) = 10, p = 0.001]. In contrast, changes in Figure 3G (same stimulus) were, according to all these tests, not significant (all values of p > 0.08) (for detailed statistical results, see supplemental Table S1, available at www.jneurosci.org as supplemental material). Across all recordings, the observed magnitudes of stimulus-induced changes were consistently larger than either additivity errors, bootstrap errors, or errors across repeated recordings. However, as would be expected from, for example, Figure 2G and supplemental Figure S4 (available at www.jneurosci.org as supplemental material), stimulus-induced changes were often smaller than the total variability of individual delays in firing times, as indicated by the widths of CCH peaks (Fig. 4C) (for similar conclusions being derived from a direct analysis of the timing of individual spiking events, see Fig. 8).
In Figure 5A, we show changes of a firing sequence of eight neurons for 12 different stimulus conditions—the grating direction shifting in 30° steps. A gradual change in the grating direction evokes a gradual change in the relative firing times—which allows us to determine orientation “preferences” and distinct tuning curves, shown as polar plots in Figure 5B. Hence, similarly to the rate responses in early visual areas (Hubel and Wiesel, 1962) (Fig. 5C), relative firing times are selective for the orientation and movement direction of visual stimuli. The orientation sensitivity of firing times was detectable also in multiunit signals (supplemental Fig. S6, available at www.jneurosci.org as supplemental material). The firing sequences of multiunits covered shorter time spans, but, interestingly, the reliability of these firing sequences was even higher, which could be explained by an increase in the total number of coincident spiking events that entered the analysis (supplemental Fig. S6, available at www.jneurosci.org as supplemental material).
These results were replicated in all our recordings. Firing sequences always changed as a function of the grating orientation, and, as would be expected from an interpretable code, the degree of significance depended on the magnitude of the stimulus change, larger changes in grating orientation causing significant changes more frequently than small ones (e.g., for ANOVA-Add, 67.4% of comparisons were significant for 30° changes, but 90.3 and 86.7% for 60 and 90° changes). The significance of the changes also depended on the number of action potentials included in the analysis. In Figure 5A, the smallest additivity error [widths of Gaussians (e.g., directions 210 and 240°)], and hence the highest degree of significance, was obtained in the conditions with the largest total number of action potentials (Fig. 5A, bars on the far right). When the data sets were split into two equally sized groups according to mean firing rates, the data sets with high firing rates showed smaller additivity errors, and correspondingly more significant changes in firing times than those with low firing rates (mean additivity error of 0.34 ± 0.39 ms and 0.53 ± 0.34 ms, resulting in 100 vs 40% of data sets significant according to ANOVA-Add and 100 vs 60% according to ANOVA-RM). The stimulus-dependent changes of firing sequences across different cortical depths and cell types (regular spiking vs fast spiking) are shown in supplemental Figures S7 and S8 (available at www.jneurosci.org as supplemental material), and statistical analyses are detailed in supplemental Table S1 (available at www.jneurosci.org as supplemental material).
Firing sequences change independently of firing rates
Interestingly, the tuning curves of relative firing times matched only partially those of the rate responses. Figure 5C shows rate tuning curves for the same eight neurons as Figure 5B. For some neurons, rate and firing time tuning matched (e.g., the fourth and sixth neuron from the left); for others, it did not (e.g., the second and fifth unit). Correspondingly, when correlations between rate and firing time tuning were computed across all 27 neurons for which tuning curves could be determined, the mean correlation coefficient was positive but small [r = 0.28 (i.e., 8% of shared variance)] (for the distribution of r values, see Fig. 5D). The results shown in supplemental Figure S5 (available at www.jneurosci.org as supplemental material) suggest that this small correlation cannot be explained by the choice of neurons in relation to which the relative firing time is estimated.
A sliding-window analysis along the time course of the stimulus presentation also revealed that the firing rates and firing times changed independently. In three data sets of five to seven SUs (n = 17 in total), an analysis window of 450–550 ms width was slid in 10 ms steps to investigate neuronal responses to a grating drifting in a direction preferred by a majority of cells. For each sliding step, the relative firing times and mean firing rates of the neurons were computed from 40 to 120 stimulus presentations. Typically, the preferred firing times changed depending on the phase of the grating (Fig. 5E), but these changes were not correlated to the changes of firing rates (Fig. 5F). Moreover, the amplitudes of the modulations, measured by the F1/F0 ratio (Skottun et al., 1991), were also unrelated (Fig. 5G): Across the 17 SUs included in the analysis, the correlation between the modulation amplitudes of firing times and firing rates was r = 0.07 (Fig. 5G), and units classified as complex (F1/F0 <1) did not differ in the modulation strength of their relative firing times from units classified as simple (F1/F0 >1) [t test: t(15) = 0.37; p = 0.72]. These moment-to-moment adjustments of firing sequences to continuously changing stimulus properties account partly for the variability of spike delays indicated by the widths of CCH peaks typically computed along the entire stimulus presentation. Also, these results confirmed our initial result that firing rates and firing times contain mainly complementary information about visual stimuli.
Firing sequences encode stimulus information less reliably than firing rates
As Figure 3G illustrates, firing sequences do not repeat exactly but vary somewhat across repeated stimulus presentations—as is the case with any other biological measure, including firing rates (Softky and Koch, 1993; Shadlen and Newsome, 1998). To investigate how firing rates and firing times compared in the reliability with which they carried stimulus-related information, we computed the MI between the stimuli on the one hand and the firing times and rates on the other.
In Figure 6, A and B, we illustrate, for a representative neuron, the variability across repeated measurements of firing times and firing rates. The error variance across these measurements was always smaller than the stimulus-related variance. Consequently, a good share of information about the stimulus was transferred by both measures (42 and 65% of the available information transferred by firing times and firing rates, respectively; maximum MI = 3.6 bits).
Overall, both firing rates and firing times exhibited ample capacity for carrying stimulus-related information. Firing sequences could not always be computed across all 12 stimulus conditions, because not all stimuli evoked a sufficient number of action potentials to calculate a complete set of CCHs (see Materials and Methods, Extracting pairwise preferred delays). Consequently, the number of stimuli included in the MI analysis ranged between 2 and 12. To take into account the corresponding differences in total entropy (1–3.6 bits), the results are presented as a ratio of MI and stimulus entropy, indicated as pMI (1 indicating that all available information was transferred). For firing rates and firing times, the average pMI values were 0.75 ± 0.23 and 0.44 ± 0.23, respectively. Thus, firing rates showed a significant advantage of ∼30% of transmitted information over firing times (paired t test, t(77) = 9.0; p < 0.001) (Fig. 6C).
These conclusions were also supported by a trial-by-trial analysis (supplemental Fig. S9, available at www.jneurosci.org as supplemental material): The coefficient of variation (the ratio between SD and mean) of the stimulus-dependent changes across individual stimulus presentations tended to be smaller in firing rates than in firing times, but, again, firing times never lagged far behind. The same held true for the distributions of individual spiking events—coincident spikes in the case of firing times and interspike intervals in the case of firing rates. The overlaps within the two groups of distributions were similar (see supplemental Fig. S4, available at www.jneurosci.org as supplemental material).
Coding capacity depends on oscillatory state
It has been reported previously that oscillations at gamma frequencies reduce the variability of rate responses (Rodriguez et al., 2010). As beta/gamma oscillations support precise synchrony (Gray et al., 1989; Engel et al., 1991; König et al., 1995a; Maldonado et al., 2000), we hypothesized that oscillations should also augment the coding capacity of relative firing times.
Oscillation strength was assessed by calculating the OS (Muresan et al., 2008) for each neuron, and the pMI—values of both firing times and rates covered a broader range in neurons with weak oscillations (OS < ∼4) than in those with strong oscillations (OS > ∼6) (Fig. 7A,B) (oscillation strength and frequency of all neurons are shown in supplemental Fig. S10A–C, available at www.jneurosci.org as supplemental material). Consequently, when data sets were categorized according to the overall oscillation strength, the median pMI was higher in recordings with strong oscillations than in those with weak oscillations for both firing times (pMI = 0.37 vs 0.54; 46% difference) and firing rates (0.73 vs 0.77; 5% difference). A two-way ANOVA indicated that both factors, oscillation (strong vs weak) and type of response (rate vs time) were significant [F(1,152) = 8.3, p = 0.0046; and F(1,152) = 66.1, p < 0.001, respectively]. Thus, beta/gamma oscillations augmented the amount of information transmitted by discharge rate and by firing sequences—firing rates encoding the grating directions more accurately than firing times. However, beta/gamma oscillations improved the time code considerably more than the rate code (46 vs 5% improvement) (Fig. 7C,D), and this difference was marginally significant (ANOVA interaction: F(1,152) = 3.2, p = 0.076). These results could not be explained by different spike counts in recordings with strong and weak oscillations (supplemental Fig. S10D–G, available at www.jneurosci.org as supplemental material).
Firing sequences detected directly in raw spike trains
Oscillations also strongly enhanced the consistency with which individual coincident spikes reflect the preferred firing sequence. First, as shown in supplemental Figure S2 (available at www.jneurosci.org as supplemental material), during pronounced oscillations the pairwise spike delays between neurons were more tightly clustered around the preferred delay. Consequently, a larger proportion of coincident spikes were fired in the preferred temporal order.
Second, we traced the firing sequences of entire neuronal assemblies directly in the spike trains by analyzing the statistics of unitary events. Significance tests based on ANOVA-RM and ANOVA-Add cannot assess directly the significance of stimulus-dependent changes in firing sequences at the level of individual trials. The present code based on precise spike timing may be most useful when decoding is fast (e.g., at single trials or even oscillation cycles) (for putative decoding mechanisms at different timescales, see supplemental Fig. S11E–G, available at www.jneurosci.org as supplemental material). An analysis addressing much closer the issue of fast decoding was that of unitary events. A unitary event was defined as a group of spikes fired within a short time window (5 ms), and the size of the unitary event as the number of spikes within that window (Fig. 8A). We then compared the frequency with which unitary events of different sizes occurred in the original spike trains and in spike trains shifted in time to various degrees (Fig. 8A–D). To investigate how abolishing the delays affects the sizes of unitary events, we shifted the raw spike trains in two different ways: In one, the shift of each spike train compensated for the time delays extracted from CCHs in these data (correct shift) (i.e., the amount of the shift was the negative value of the relative firing time of that neuron in the firing sequence). The other type of shift was based on the firing sequences obtained in response to other stimuli (incorrect shift). We expected a larger number of large unitary events with a correct than with incorrect shift.
Consistent with this prediction, correct shifts resulted in a severalfold increase in the frequency of large unitary events (those including five or more spikes) and in a decrease of small unitary events (one or two spikes) (Fig. 8E). In contrast, shifting spike trains incorrectly increased the number of large unitary events only marginally and this increase was not significant (Fig. 8F). Consequently, large unitary events were more frequent in correctly than in incorrectly shifted spike trains (Fig. 8G). Conversely, small unitary events (one to two spikes) occurred less often in correctly than in incorrectly shifted spike trains (Fig. 8G). This predominance in large unitary events of correctly over incorrectly shifted spike trains was present in all data sets and reached statistical significance in 6 of 10 cases (two-way ANOVA) (for details, see Materials and Methods and supplemental Table S1, available at www.jneurosci.org as supplemental material), and the effect was much more pronounced in the data sets with strong oscillations (Fig. 8H) (two-way ANOVA, F(1,354) = 19.6, p < 0.0001) (for details, see Materials and Methods). Similar results were obtained when using 2 ms rather than 5 ms windows to calculate unitary events (results not shown). In a control analysis based on trial-shuffled data (see Materials and Methods), the overall sizes of unitary events were considerably reduced and no differences were found between correct and incorrect shifts (all values of p > 0.05) (Fig. 8I). The results support directly our findings obtained from CCHs: Synchronized neurons tend to fire in sequences and the order within these sequences is specific to the stimulus.
These findings provide evidence that the fine temporal structure of neuronal activity in the visual cortex reflects stimulus properties. This temporal structure is not defined relative to external events but is defined internally, by the relative timing of synchronized spiking events: Synchronized cortical neurons distribute their action potentials over a temporal window of up to ∼15 ms, and within this time period, each neuron has its own preferred time for generating action potentials, some neurons firing early, and others late. Most importantly, these preferred firing times are neither fixed nor arbitrary, but change as a function of stimulus properties. Thus, the relative firing times have the potential to convey stimulus information that can be read out by cellular mechanisms sensitive to spike timing.
Potential origins of firing sequences
In the past, small delays in synchronized activity have been used to infer chains of synaptic connections and the associated conduction delays (Alonso and Martinez, 1998; Usrey et al., 2000). In our case, such hardwired delays may determine a “baseline” structure of delays but cannot explain the stimulus-dependent changes. The observed graded, stimulus-specific tuning exhibited by firing times cannot arise from anatomically fixed delays.
Mechanisms allowing for flexible adjustments of spike timing have been investigated in the context of hippocampal theta phase precession (Magee, 2001; Harris et al., 2002; Mehta et al., 2002; Harvey et al., 2009), and more recently also in the neocortex in relation to beta/gamma oscillations (Schneider et al., 2006; Fries et al., 2007; Tsubo et al., 2007; Tiesinga et al., 2008). In both cases, spike times appear to be determined by an interplay between global fluctuations of excitability, likely mediated by the network of inhibitory interneurons, dynamic changes in excitatory drive to individual neurons, and by the ion conductances in those neurons (Magee, 2001; Harris et al., 2002; Mehta et al., 2002; Fries et al., 2007; Tsubo et al., 2007; Tiesinga et al., 2008; Harvey et al., 2009) (see supplemental Discussion 1 and supplemental Fig. S11A–C, available at www.jneurosci.org as supplemental material). In supplemental Discussion 1, we point out several differences between firing sequences in the visual cortex and phase sequences in the hippocampus. One prediction of these studies is that strong activation should cause early firing. However, this does not necessarily imply high correlations between firing times and firing rates, because additional variables like lateral inhibition and adaptation shape firing rates (Harris et al., 2002; Mehta et al., 2002), and another, presently unknown, set of factors may shape firing times. Correspondingly, we found that discharge rate and spike timing are only loosely correlated. This suggests that rate and time can convey information independently and may serve as complementary codes (Harris et al., 2002; Mehta et al., 2002; Dragoi and Buzsáki, 2006). Thus, allocortex and neocortex may use similar strategies to extend their coding capabilities by using both rate and relative spike times as coding space.
When exploring the potential origin of firing sequences in the visual cortex, it is also important to consider the temporal structure of the incoming activity. Neurons in the lateral geniculate nucleus (LGN) generate spike patterns that are timed internally with similar precision as observed in the present study (Desbordes et al., 2008; Koepsell et al., 2009). Although it is not clear whether the internally timed spikes in the LGN form stimulus-dependent sequences, it is unlikely that the firing sequences reported here are inherited from the LGN. The reason is that the oscillations that serve as a timing reference in the LGN occur in the 60 Hz range, whereas the firing sequences in the visual cortex are referenced to slower oscillatory rhythms (20–30 Hz). For a more detailed treatment of this issue, see supplemental Discussion 1 (available at www.jneurosci.org as supplemental material).
Potential readout mechanisms
Relative firing times can only serve a function if changes in firing sequences affect cortical processing (i.e., if the coded information can be read out). A potential obstacle is the fact that firing sequences are usually incomplete as neurons do not fire in each oscillation cycle (supplemental Figs. S11A–D, S12, available at www.jneurosci.org as supplemental material). Thus, a readout mechanism sensitive to input time at the millisecond scale needs to tolerate variations in spike timing as well as the dropout of spikes from individual neurons (supplemental Fig. S11A–C, available at www.jneurosci.org as supplemental material). This requires a mechanism that averages inputs at the population level without losing track of precise timing relationships (supplemental Fig. S11D, available at www.jneurosci.org as supplemental material). Our finding that stimulus information could be extracted reliably from firing sequences of multiunit responses suggests that timing information remains available even when the identity of individual units is lost when the responses of a local population are pooled.
One mechanism capable of evaluating small time delays despite jitter is spike-timing-dependent plasticity (STDP) (Bi and Poo, 1998; Yao et al., 2004). This mechanism is exquisitely sensitive to temporal sequences, operates at the millisecond level, and effectively averages over recurring sequences. Correspondingly, stimulus-locked spike delays of the same magnitude and jitter as observed in the present study have been shown to efficiently modify the response preferences of V1 neurons in vivo (Yao and Dan, 2001; Yao et al., 2004; Meliza and Dan, 2006). In these previous studies, to evoke spike delays that triggered STDP, visual stimuli had to be flashed in fast succession (8–16 ms presentation delay). In contrast, in natural viewing, stimuli often evolve more slowly. Our findings show that, by generating stimulus-dependent spike sequences internally, the cortex can translate visual information from slower timescales onto timescales compatible with neuronal integration mechanisms like coincidence detection and STDP. This principle has a precedence in the hippocampus, where the place sequences produced by phase precession are used to translate information about a slower external process (the animal's movement in space) onto timescales compatible with neuronal integration and plasticity (the theta cycle) (Skaggs et al., 1996; Dragoi and Buzsáki, 2006). The firing sequences described presently could thus be exploited for adaptive processes such as perceptual learning (supplemental Fig. S11E, available at www.jneurosci.org as supplemental material).
In addition to learning processes, small time delays should also be relevant in real-time signal processing. There is ample evidence that neurons are highly sensitive coincidence detectors and can distinguish between precisely coinciding and temporally dispersed inputs (Bi and Poo, 1998; Usrey et al., 2000; Ahissar and Arieli, 2001; Azouz and Gray, 2003; London and Häusser, 2005; VanRullen et al., 2005; Meliza and Dan, 2006; Fries et al., 2007; Hausselt et al., 2007; Tsubo et al., 2007; Womelsdorf et al., 2007; Cardin et al., 2009; Branco et al., 2010). By switching positions in the firing sequence, the inputs to target cells can either coincide or disperse in time, and in the case of dispersion, the sequence can change. This, too, is a relevant variable for postsynaptic activity, because it matters whether EPSPs at distal dendritic compartments are generated before or after those at proximal sites (London and Häusser, 2005; Hausselt et al., 2007; Branco et al., 2010) (supplemental Fig. S11F,G, and supplemental Discussion 2, available at www.jneurosci.org as supplemental material). As we show in Figure 8, if the time delays are properly accounted for by a readout, the magnitude of the input—as represented by the total number of conjointly incoming action potentials—can increase considerably. Such a readout would have to be able to deal with the variability of individual spike delays, as indicated for example by the widths of CCH peaks, and also would have to distinguish between two potential sources of this variability: One is an inherent imperfectness in the time at which neurons generate action potentials relative to the other neurons and relative to the underlying oscillation cycle. Because of this variability, an efficient use of the present code may require a certain degree of information averaging that can be performed either over time, as made in the present analyses, or over space (i.e., over larger number of neurons), as is likely to occur during information processing in the brain. The second source of the variability of time delays is the continuous change in the contents of this temporal code. For example, we showed that the relative firing times adjust continuously along the presentation trial as gratings drift over the receptive fields. Therefore, another part of the variability of the relative firing times is not inaccuracy of the code but modifications of the message communicated by that code.
Oscillation: a “switch” toward coding by firing sequences?
Beta/low-gamma oscillations considerably increased the coding capacity of firing sequences, making it comparable with the capacity of firing rates. This is remarkable given that stimulus orientation is one of the most effective variables for modulating rate responses in the visual cortex. In addition, during oscillations, preferred firing sequences could be detected more easily in the spike trains. These findings suggest that sequences are expected to be most efficient during behavioral states associated with beta/gamma oscillations. This is the case, for example, during focused attention (Jensen et al., 2007; Womelsdorf and Fries, 2007; Tallon-Baudry, 2009), suggesting that the brain is capable of switching between the relative use of rate codes and temporal codes in a task- and state-dependent way. Future studies will have to clarify whether other visual features are encoded more efficiently in sequences than in rate modulations, for example, stimuli that cause only minor changes in discharge rate but major modulation of neural synchrony (Biederlack et al., 2006).
These considerations suggest that precise timing relationships between the discharges of synchronized neurons can be exploited to transmit stimulus information. With an appropriate readout mechanism that averages across a large number of input sources, information carried by a firing sequence could be possibly read out even from a single oscillation cycle. This suggests that many of the principles of spike timing discovered in the hippocampus—such as internal referencing, complementary coding to firing rates, translation of slow external events onto timescales compatible with neuronal integration—may find their place also in the cortex in a slightly altered form associated with faster oscillations in beta and gamma range and with neural synchrony.
This work was supported by Deutsche Forschungsgemeinschaft Grant NI 708/2-1, The Alexander von Humboldt Foundation, The Hertie Foundation, and by Research Grant 01GQ0840 of the German Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung) within the “Bernstein Focus: Neurotechnology.” M.N.H. was supported by The Ernst Schering Foundation, the Royal Society, and the German National Academic Foundation. We thank Johanna Klon-Lipok, Sergio Neuenschwander, Ralf Galuske, and Kerstin Schmidt for assistance during data acquisition. We are also thankful to Raul C. Muresan, Gaby Schneider, Alexander Martinez-Marco, and Weija Feng for valuable input during data analysis.
- Correspondence should be addressed to Dr. Danko Nikolić, Max Planck Institute for Brain Research, Deutschordenstrasse 46, 60528 Frankfurt am Main, Germany.