Abstract
Cortical responses can vary greatly between repeated presentations of an identical stimulus. Here we report that both trial-to-trial variability and faithfulness of auditory cortical stimulus representations depend critically on brain state. A frozen amplitude-modulated white noise stimulus was repeatedly presented while recording neuronal populations and local field potentials (LFPs) in auditory cortex of urethane-anesthetized rats. An information-theoretic measure was used to predict neuronal spiking activity from either the stimulus envelope or simultaneously recorded LFP. Evoked LFPs and spiking more faithfully followed high-frequency temporal modulations when the cortex was in a desynchronized state. In the synchronized state, neural activity was poorly predictable from the stimulus envelope, but the spiking of individual neurons could still be predicted from the ongoing LFP. Our results suggest that although auditory cortical activity remains coordinated as a population in the synchronized state, the ability of continuous auditory stimuli to control this activity is greatly diminished.
Introduction
The activity of the cerebral cortex depends on brain state. The most striking changes of brain state occur with the sleep cycle. Slow-wave sleep is characterized by a synchronized, or inactivated, state displaying low-frequency local field potential (LFP) fluctuations, corresponding to an alternation of up states of global activity and down states of network silence; in contrast, rapid eye movement (REM) sleep is characterized by a desynchronized, or activated, state in which large up–down alternations are suppressed (Steriade et al., 2001). Cortical state also varies within wakefulness; desynchronized, higher frequency patterns are seen during alert and attentive conditions, whereas lower frequency oscillatory patterns are more typical of quiescence or drowsiness (Buzsaki et al., 1988; Wiest and Nicolelis, 2003; Gervasoni et al., 2004; Luczak et al., 2007, 2009; Poulet and Petersen, 2008; Sakata and Harris, 2009). Attention and behavioral engagement can suppress low-frequency LFP and EEG power (Bastiaansen et al., 2001; Fries et al., 2001; Chalk et al., 2010), suggesting that attention might affect cortical state by enhancing desynchronization.
Sensory responses have been observed to be state-dependent in multiple modalities (Livingstone and Hubel, 1981; Wörgötter et al., 1998; Fanselow and Nicolelis, 1999; Gaese and Ostwald, 2001; Edeline, 2003; Castro-Alamancos, 2004a; Murakami et al., 2005; Otazu et al., 2009). Within the synchronized state, stimulus responses exhibit a complex, nonlinear interaction between stimuli and ongoing activity such as stimulus-evoked flips between up and down states (Hasenstaub et al., 2007; Curto et al., 2009), as well as prominent adaptation at both thalamic and cortical levels (Castro-Alamancos, 2004b). It has been suggested that spontaneous excitability fluctuations may account for variability and noise in sensory responses (Arieli et al., 1996; Azouz and Gray, 1999; Kisley and Gerstein, 1999; Petersen et al., 2003). In contrast, desynchronized brain states may better support the representation of temporally extended stimuli such as rapid stimulus trains (Castro-Alamancos, 2004a) and natural movies (Goard and Dan, 2009).
To investigate how brain state modulates cortical representations of continuous auditory stimuli, we recorded from neuronal populations in rat auditory cortex under urethane anesthesia while presenting frozen amplitude-modulated white noise (AM noise). Although anesthesia typically produces a synchronized pattern, under urethane, the cortex can exhibit transient periods of desynchronization (Duque et al., 2000; Clement et al., 2008; Renart et al., 2010). This allowed us to compare responses across synchronized and desynchronized states. We found that in the desynchronized state, the stimulus envelope is represented more faithfully in both LFPs and spiking activity, whereas in the synchronized state, cortical activity is largely decoupled from the stimulus.
Materials and Methods
Experimental procedures.
All experiments were performed in accordance with protocols approved by the Rutgers University Animal Care and Use Committee. Six male Sprague Dawley rats (250–450 g) were anesthetized with urethane (1.2–1.5 g/kg) plus supplementary doses of ketamine and xylazine (15 and 2 mg/kg, respectively), as required. In some experiments, subcutaneous injections of dexamethasone (0.2–0.5 mg/kg) and atropine methyl-nitrate (0.1–0.2 mg/kg) were administered to lessen edema and secretions, respectively, and acepromazine (0.5–1 mg/kg) was administered to regularize heart and breathing rhythms. A tracheotomy was performed to minimize noise from breathing. Animals were placed in a custom orbitonasal restraint that left the ears free. The temporal muscle was reflected, a 2 × 3 mm craniotomy was drilled above left Te1, the dura was removed, and the exposed cortex was covered with 1% agar in artificial CSF. Sixteen- or 32-channel silicon probes (NeuroNexus Technologies) were lowered to putative layer V/VI (0.8–1.2 mm from the surface), and auditory responses were evaluated online to confirm electrode placement on the basis of response latency and tone responsiveness. Desynchronization of the EEG was induced by applying 30 s to 1 min of pressure to the tip of the tail (tail pinch), and also occurred spontaneously. Desynchronized epochs were identified as periods when the total spectral power below 6 Hz was significantly reduced for >5–10 s. Intermediate periods in which the EEG was not clearly synchronized or desynchronized were not used in our analyses. Of the six experiments, only four yielded sufficient desynchronized epochs for statistical analysis.
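For illustration, a minimal sketch of how such a low-frequency power criterion could be computed (written in Python; the window length, spectral parameters, and the threshold in the usage example are assumptions, not the criterion used in the study):

    import numpy as np
    from scipy.signal import welch

    def low_freq_fraction(lfp, fs, win_s=5.0, cutoff_hz=6.0):
        """Fraction of total LFP power below cutoff_hz in successive
        non-overlapping windows of length win_s seconds."""
        win = int(win_s * fs)
        fractions = []
        for start in range(0, len(lfp) - win + 1, win):
            f, pxx = welch(lfp[start:start + win], fs=fs, nperseg=min(win, 2048))
            fractions.append(pxx[f < cutoff_hz].sum() / pxx.sum())
        return np.asarray(fractions)

    # Windows whose low-frequency fraction falls well below its typical value
    # could be flagged as putatively desynchronized; the factor of 0.5 is
    # purely illustrative.
    # fractions = low_freq_fraction(lfp, fs=1250.0)
    # desynchronized = fractions < 0.5 * np.median(fractions)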
All experiments were performed in a single-walled sound-proof chamber (IAC). Acoustic stimuli consisted of a repeatedly presented 50 s frozen-noise stimulus, generated by pointwise multiplication of a Gaussian white noise carrier with an envelope made by exponentiating a bandpass-filtered (1–100 Hz) second Gaussian white-noise sequence. The resulting signal had a mean amplitude of 63 dB sound pressure level (SPL; range, ∼30–100 dB SPL). Stimuli were delivered free field via a TDT RP2 processor, ED1 speaker driver, and ES1 electrostatic speaker (all from Tucker-Davis Technologies). An ACO-7012 microphone (ACO Pacific) was placed by the ear and audio recorded to disk at 160 kHz for sound level calibration and to control for extraneous noises during the experiment. Electrophysiological signals were amplified and recorded to disk at 20 kHz using custom software.
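As an illustrative sketch of this stimulus construction (only the 1–100 Hz envelope band and the multiplicative structure follow the text; the sampling rate, filter order, and random seed are assumptions, and calibration to the 63 dB SPL mean level is omitted):

    import numpy as np
    from scipy.signal import butter, sosfiltfilt

    def make_am_noise(duration_s=50.0, fs=97656.25, band=(1.0, 100.0), seed=0):
        """Frozen AM noise: a Gaussian white-noise carrier multiplied pointwise
        by the exponential of a bandpass-filtered (1-100 Hz) Gaussian noise
        sequence."""
        rng = np.random.default_rng(seed)
        n = int(duration_s * fs)
        carrier = rng.standard_normal(n)
        sos = butter(2, band, btype='bandpass', fs=fs, output='sos')  # envelope band
        envelope = np.exp(sosfiltfilt(sos, rng.standard_normal(n)))   # exponentiated
        return carrier * envelope, envelope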
Data analysis.
Spike sorting was performed using previously described methods (Harris et al., 2000), and LFPs were extracted by low-pass filtering and downsampling the raw traces to 1.25 kHz. LFP spectrograms were computed using the multitaper method (www.chronux.org), and coherence by Welch's method (MathWorks). To measure spike count variability, the 50 s stimulus was divided into successive 100 ms time bins. For each combination of neuron and time bin, a set of spike counts was accumulated over all stimulus repetitions during which the cortex was in the required state, and the Fano factor was computed as variance divided by mean; when the two states had different numbers of stimulus repetitions, a random subset of repetitions from the state with more was taken to equalize group sizes. Each cell's mean Fano factor was computed for both states by averaging over all time bins, excluding any for which the Fano factor was undefined due to zero mean spike count.
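A minimal sketch of this Fano factor computation for one neuron, under the conventions described above (the array layout and function name are illustrative):

    import numpy as np

    def mean_fano(counts_by_state, seed=0):
        """Mean Fano factor per state for one neuron.  counts_by_state maps a
        state label to an array of shape (n_repetitions, n_time_bins) of spike
        counts in 100 ms bins.  Repetition counts are equalized by random
        subsampling, and bins with zero mean count are excluded."""
        rng = np.random.default_rng(seed)
        n_min = min(c.shape[0] for c in counts_by_state.values())
        fano = {}
        for state, counts in counts_by_state.items():
            sub = counts[rng.choice(counts.shape[0], size=n_min, replace=False)]
            mean = sub.mean(axis=0)
            var = sub.var(axis=0, ddof=1)
            valid = mean > 0                      # exclude undefined bins
            fano[state] = np.mean(var[valid] / mean[valid])
        return fano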
To quantify how well individual neurons were entrained by the AM noise stimuli, we used a spike-train prediction method (Harris et al., 2003; Itskov et al., 2008). A function for predicting each neuron's firing rate from the stimulus envelope was estimated from a training set of all but one stimulus repetition, and its quality was evaluated by comparison with the spike train observed on the remaining presentation (the test set). Prediction quality was assessed by the difference of log-likelihoods, L_f = −∫f(t)dt + Σ_s log f(t_s), of the test set spike train {t_s} under the predicted firing probability f(t) relative to a constant probability given by the mean rate on the training set. The resulting log-likelihood ratio was divided by log(2) and by the number of test set spikes to yield an estimate of how many bits per spike a communicator of the test set spike train could save by knowing the stimulus envelope rather than only the mean rate, assuming spikes were generated by an inhomogeneous Poisson process of rate f(t). Note that f was not chosen by directly maximizing L_f, but by one of the two algorithms described below. The procedure was repeated with each stimulus repetition taking its turn as the test set, and an average was computed. Note that with this method, estimated information rates can be negative if predictions perform worse than the mean firing rate.
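In discrete time bins, this measure could be computed as in the following sketch (the bin width, argument names, and zero-spike handling are assumptions; the formula follows the log-likelihood difference above):

    import numpy as np

    def bits_per_spike(f_pred, spike_counts, mean_rate, dt):
        """Difference of inhomogeneous-Poisson log-likelihoods of the test
        spike train under the predicted rate f_pred (Hz, one value per bin of
        width dt seconds) and under the constant training-set mean rate,
        normalized to bits per spike.  Negative values indicate a prediction
        worse than the mean rate."""
        eps = 1e-12
        n_spikes = spike_counts.sum()
        if n_spikes == 0:
            return np.nan                      # undefined with no test spikes
        ll_pred = -np.sum(f_pred) * dt + np.sum(spike_counts * np.log(f_pred + eps))
        ll_mean = -mean_rate * dt * len(f_pred) + n_spikes * np.log(mean_rate + eps)
        return (ll_pred - ll_mean) / (np.log(2) * n_spikes)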
Prediction functions were estimated by two methods. The first was a linear–nonlinear method, where the prediction function was fit by first finding the optimal linear filter for predicting the spiking activity, and then fitting a static nonlinear function that links this linear prediction to firing rates (Chichilnisky, 2001). Because the noise envelope was approximately white in the pass band of 1–100 Hz, the optimal linear filter could be obtained simply by computing the spike-triggered average (STA) of the envelope. The link function was constructed by binning the filter output into 100 bins and computing the smoothed firing probability in each bin. For brevity, we refer to this method as the linear STA method.
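A simplified sketch of this linear STA prediction is given below (the lag range and quantile bin edges are illustrative choices, and the smoothing of the link function is omitted for brevity):

    import numpy as np

    def linear_sta_predict(env_train, spk_train, env_test, n_lags=100, n_bins=100):
        """Linear-nonlinear prediction: the linear filter is the spike-triggered
        average (STA) of the envelope (optimal for an approximately white input),
        and a binned link function maps the filter output to a predicted spike
        count per time bin on the test set."""
        # STA: spike-weighted average of the envelope preceding each spike
        # (sta[k] corresponds to the envelope k samples before the spike).
        sta = np.zeros(n_lags)
        total = 0.0
        for t in np.nonzero(spk_train)[0]:
            if t < n_lags - 1:
                continue
            sta += spk_train[t] * env_train[t - n_lags + 1:t + 1][::-1]
            total += spk_train[t]
        sta /= max(total, 1.0)

        def filter_output(env):
            return np.convolve(env, sta, mode='full')[:len(env)]

        g_train, g_test = filter_output(env_train), filter_output(env_test)
        # Static nonlinearity: mean training spike count as a function of the
        # binned filter output, used as a look-up table on the test set.
        edges = np.quantile(g_train, np.linspace(0.0, 1.0, n_bins + 1))
        idx_train = np.clip(np.digitize(g_train, edges) - 1, 0, n_bins - 1)
        idx_test = np.clip(np.digitize(g_test, edges) - 1, 0, n_bins - 1)
        link = np.full(n_bins, spk_train.mean())
        for b in range(n_bins):
            in_bin = idx_train == b
            if in_bin.any():
                link[b] = spk_train[in_bin].mean()
        return link[idx_test]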
The second prediction method, termed the two-dimensional (2D) STA method, was simplified from the method of Sharpee et al. (Sharpee et al., 2006; Atencio et al., 2008). In this approach, firing probability was predicted as a nonlinear function of the stimulus envelope and its instantaneous derivative at a fixed time lag into the past. The stimulus envelope and its derivative were binned to form a 192 × 192 grid of possible signal and derivative values, and firing rates were estimated in this 2D space using a smoothing method previously described for hippocampal place fields (Harris et al., 2001), in which a smoothed spike count map is divided pointwise by a smoothed occupancy map (13-point Gaussian kernel for both). To predict the firing rate on the test set, the rate map computed from the training set was used as a look-up table. The likelihood ratio was averaged over all cross-validation repeats to yield a mean prediction quality as a function of the time lag parameter. The maximum of this curve was taken as the prediction quality.
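A minimal sketch of the rate-map construction at a single time lag follows (the grid size follows the text; the Gaussian smoothing width stands in for the 13-point kernel and is an assumption of this sketch):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def rate_map_2d(sig, dsig, spk, n_grid=192, sigma=2.0):
        """Smoothed 2D rate map over (envelope value, envelope derivative):
        a smoothed spike-count map divided pointwise by a smoothed occupancy
        map.  sig and dsig are the envelope and its derivative at the chosen
        lag before each time bin; spk is the spike count per bin."""
        # Bin the signal and its derivative onto an n_grid x n_grid grid
        i = np.clip(((sig - sig.min()) / np.ptp(sig) * (n_grid - 1)).astype(int), 0, n_grid - 1)
        j = np.clip(((dsig - dsig.min()) / np.ptp(dsig) * (n_grid - 1)).astype(int), 0, n_grid - 1)
        occupancy = np.zeros((n_grid, n_grid))
        spike_map = np.zeros((n_grid, n_grid))
        np.add.at(occupancy, (i, j), 1.0)
        np.add.at(spike_map, (i, j), spk)
        occ_s = gaussian_filter(occupancy, sigma)
        spk_s = gaussian_filter(spike_map, sigma)
        return np.where(occ_s > 0, spk_s / np.maximum(occ_s, 1e-12), 0.0)

    # On the test set, this training-set map serves as a look-up table:
    # the predicted rate is rate_map[i_test, j_test].  The procedure is
    # repeated over a range of time lags, keeping the lag with the best
    # cross-validated likelihood.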
Results
To test the state dependence of auditory cortical representation of continuous stimuli, we recorded neural populations using multisite silicon electrodes in auditory cortex of urethane-anesthetized rats. Acoustic stimuli consisted of a repeatedly presented 50 s frozen amplitude-modulated noise stimulus, with an amplitude envelope that had power in the range 1–100 Hz.
An example of data collected with this method can be seen in Figure 1. LFPs during the synchronized state were dominated by a low-frequency (<10 Hz) pattern, whereas the desynchronized state LFPs exhibited greatly reduced low-frequency power (Fig. 1A). The smaller, narrowband oscillation at 3–4 Hz seen in the desynchronized state likely corresponds to volume-conducted hippocampal theta [which has a lower frequency than in awake rats, and occurs together with cortical desynchronization (Sirota et al., 2008)].
Presentation of the stimulus did not change these low-frequency patterns, but it did cause an increase in higher frequency power, which was more prominent in the desynchronized state. Rasters of population activity (Fig. 1B) show that the synchronized state consists of alternations between periods of generalized spiking activity (up states) accompanied by negative LFP deflections, and periods of very little spiking (down states), accompanied by positive LFP deflections; stimulus presentation did little to change this pattern. In the desynchronized state, such global oscillations were not seen, but instead cells fired more continuously during both AM noise stimulation and silence.
The use of a repeatedly presented frozen-noise stimulus allowed us to examine the reliability with which the cortex responded to the stimulus. In Figure 2A, the evoked LFPs from two presentations of the stimulus are overlaid (synchronized, blue and cyan; desynchronized, red and magenta); below each pair of traces is a raster representation of a single neuron's response to multiple stimulus repetitions. It can be seen that the response in the desynchronized state is highly reliable from trial to trial. In the synchronized state, cortical activity is modulated by the stimulus in a less reliable way. The reliability of LFP responses was quantified using the coherence of the evoked LFP with the stimulus envelope (Fig. 2B). In all cases, the LFP showed greater coherence with the stimulus envelope in the desynchronized state. To quantify the reliability of spiking responses across multiple stimulus repetitions, we computed Fano factors for each cell in the two states (see Materials and Methods, above) (Fig. 2C). In the synchronized state, Fano factors were typically >1 (p < 0.001, one-sample t test; mean ± SD, 1.19 ± 0.36), indicating that spiking was more variable than expected from an (inhomogeneous) Poisson process, whereas in the desynchronized state, Fano factors were typically <1 (p < 0.001, one-sample t test; mean ± SD, 0.84 ± 0.25), indicating that spiking was less variable than Poisson. A significant difference was also found between the two states (p < 0.001, paired t test).
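For illustration, such a coherence estimate could be computed as in the sketch below (the segment length, sampling rate, and the use of the trial-averaged LFP are assumptions of this sketch, not specifications from the study):

    import numpy as np
    from scipy.signal import coherence

    def lfp_stim_coherence(lfp_trials, envelope, fs=1250.0, nperseg=4096):
        """Coherence (Welch's method) between the evoked LFP, averaged over
        stimulus repetitions, and the stimulus amplitude envelope resampled
        to the LFP rate."""
        evoked = np.asarray(lfp_trials).mean(axis=0)   # repetitions x samples
        return coherence(evoked, envelope, fs=fs, nperseg=nperseg)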
We next set out to quantify the degree to which individual neurons were reliably entrained by the AM noise stimuli. To do this, we used a spike-train prediction method, in which the stimulus envelope was used to generate a predicted firing rate, which was then compared with the spike train actually observed. To avoid overfitting, we used cross-validation: parameters of the prediction function were estimated from one part of the data (the training set) and evaluated on another (the test set). Prediction was assessed by log-likelihood ratio compared with the prediction of constant mean firing rate, and normalized by the number of spikes, resulting in a measurement in bits/spike (see Materials and Methods, above).
Two methods were used to predict spike-firing probability from the amplitude envelope. The first was based on convolution with a linear filter followed by a static nonlinearity (see Materials and Methods, above). To ensure results were not dependent on this specific prediction method, we also applied a second technique we termed the 2D STA, simplified from the method of Sharpee et al. (Sharpee et al., 2006; Atencio et al., 2008). In this approach, firing probability was predicted as a nonlinear function of the amplitude and slope of the envelope at a fixed time lag in the past (Fig. 3A). Figure 3B illustrates how both the shape of the 2D STA and quality of the prediction vary as a function of the time lag. When quantifying predictions using this method, the value of time lag giving optimal performance was used.
Three examples of this prediction can be seen in Figure 4A. In the desynchronized state, the first neuron (Fig. 4A1) showed a preference for high amplitudes ∼16 ms before spiking, visible as sharp peaks in the linear STA and near the top of the 2D STA plot, with predictability of ∼1.1–1.4 bits/spike for both methods. In the synchronized state, however, this predictability was completely abolished, with a flat linear STA and unstructured 2D STA plot. For the second neuron (Fig. 4A2), the desynchronized linear STA showed a broad peak spanning −40 to −20 ms, yielding predictability of ∼1.3 bits/spike. The 2D STA showed a diffuse peak in the upper half, with poorer predictability reflecting the inability of the amplitude and derivative at any single time point to accurately capture the lower-frequency amplitude modulations that drove this neuron. As with the first example, however, predictability according to both measures was abolished in the synchronized state. The third example cell (Fig. 4A3) showed a complex receptive field structure in the desynchronized state, with a biphasic linear STA and a sharp peak at the right side of the 2D STA plot indicating preference for the rising phase with a lag of ∼16 ms. Unlike the other examples, this neuron did show some predictability in the synchronized state, but its 2D STA moved from a sharp peak to a more diffuse ring, indicating that loud sounds would make it fire, but with unreliable timing. Consistent with this picture, the linear STA in the synchronized state was broad but provided no information about spiking. These examples suggest that the two prediction methods give similar though not always identical results, but that with either method, predictions from stimulus envelope are worse in the synchronized state.
Cortical activity is not simply a deterministic function of sensory input, and cortical circuits can exhibit autonomous activity independent of external stimuli. Thus, even if a cell is poorly predicted from sensory stimuli, it is possible that its activity is strongly related to internally generated activity patterns. To determine whether this was the case, we applied the same methods to predict neural activity from the LFP signal (averaged over neighboring recording shanks to avoid contamination by the neuron's own waveform). The results of this analysis are seen in Figure 4B for the same three example cells as before. Prediction from LFPs was typically better with the nonlinear method. The optimal time lag near 0 indicated that the instantaneous LFP amplitude and derivative were good predictors of spiking. 2D STAs showed peaks to the left of or below the origin, consistent with a preference to fire on the descending phase or trough of LFP oscillations. In contrast to prediction from the stimulus, prediction from the LFP was often better in the synchronized state, likely as a result of the strong modulation of population activity by up states and down states.
The intuition suggested by the above examples is confirmed by group-level analysis. The two STA methods produce highly correlated predictions (Fig. 5A,B), with a slight advantage to the linear method when predicting from stimulus envelopes in the desynchronized state (synchronized, p = 0.051; desynchronized, p < 0.001, paired t test) (Fig. 5A), and to the nonlinear method when predicting from LFPs in both states (p < 0.001, paired t test) (Fig. 5B). For further analyses, we therefore used the best method in each case (linear for AM noise envelope, 2D for LFP). Prediction from the AM noise envelope was better by a large margin in the desynchronized state (p < 0.001, paired t test) (Fig. 5C), and prediction from the LFP was generally better in the synchronized state (p < 0.001, paired t test) (Fig. 5D). Comparing prediction from LFP to prediction from the stimulus, we found that LFP prediction outperformed the poor AM noise prediction by large margins in synchronized states (p < 0.001, paired t test) (Fig. 5E), and that in desynchronized states, the LFP was also generally a better predictor of neural activity than the stimulus envelope (p < 0.005, paired t test) (Fig. 5F) when using the optimal method in each case.
Discussion
We analyzed the response of auditory cortical neurons to frozen-noise stimuli as a function of brain state under urethane anesthesia. In the synchronized state, the activity of individual neurons was strongly predictable from the LFP, but not from the stimulus. In the desynchronized state, however, neural activity could be well predicted from both the stimulus and the LFP.
The fact that neural activity in the synchronized state was strongly predictable from the LFP, an indicator of global neuronal activity, but only poorly predictable from the AM noise envelope, suggests that cortical activity had largely decoupled from the stimulus. Even in the desynchronized state, where spike times were predictable from the stimulus envelope, prediction from the LFP could be better still. If cortical activity were deterministically controlled by the stimulus, one would expect the LFP to predict any neuron's activity as well as the stimulus envelope, but no better. The fact that spiking was better predicted by the LFP suggests that auditory cortex showed coordinated population activity beyond that imposed by the stimulus. Quantification of the predictability of spiking activity is subject to the caveat that the method of prediction chosen may not be optimal; however, the use of two prediction methods (linear and 2D STA), which had highly correlated results, helped mitigate this concern.
Cortical desynchronization can occur through both neuromodulatory input to the cortex, and increased tonic firing of thalamic relay neurons, which may in turn reflect neuromodulation in thalamus (Metherate et al., 1992; Steriade, 2004; Hirata and Castro-Alamancos, 2010). Desynchronization evoked by tail pinch or occurring spontaneously under urethane is accompanied by altered activity in multiple subcortical neuronal classes, including increased spiking of cholinergic neurons of the basal forebrain (BF) and pedunculopontine tegmental nuclei (PPT), which target the cortex and thalamus, respectively (Duque et al., 2000; Manns et al., 2000; Boucetta and Jones, 2009). Desynchronization can be evoked under anesthesia by electrical stimulation of the BF, PPT, and other nuclei (Metherate et al., 1992; Dringenberg and Vanderwolf, 1997). Electrical stimulation of any one site, however, is likely to activate a larger subcortical network; for example, stimulation of the BF produces increased tonic firing in lateral geniculate nucleus (Goard and Dan, 2009), even though it does not directly project there (Kolmac and Mitrofanis, 1999). Thus, it seems probable that spontaneous, tail pinch-evoked, and BF/PPT stimulation-evoked desynchronization under urethane involve activation of complex but largely overlapping subcortical networks (Clement et al., 2008).
Although the AM noise stimulus was unable to reliably entrain cortical activity in the synchronized state, this is not because auditory sensory responses cannot occur in this state. Indeed, robust responses to clicks and to the onsets of tones and natural sounds occur in the synchronized state under urethane (Bartho et al., 2009; Curto et al., 2009; Luczak et al., 2009). Those repeatable responses that did occur in the synchronized state were typically seen after large transients (Fig. 2A, at 14.2 s). Smaller amplitude modulations, by contrast, led to repeatable responses in the desynchronized but not synchronized state, suggesting that they had been filtered out. These results therefore complement data from other sensory modalities that suggest that the synchronized state leads to increased adaptation to prolonged or rapidly repeated stimuli. In barrel cortex, the response to a single stimulus is larger in synchronized/quiescent states than in desynchronized/information-processing states, but responses to rapidly repeated stimuli show more adaptation in synchronized states (Castro-Alamancos, 2004a). In auditory cortex, the response to the first click of a train is larger in passive than behaviorally engaged rats, but increased adaptation leads to a similar steady-state response at high repetition rates (Otazu et al., 2009); increased adaptation to 50 ms click pairs is also seen in the synchronized state under urethane (Hollender et al., 2008). In visual cortex, the reliability of responses to natural movies is enhanced by BF stimulation, consistent with a filtering out of certain features of these prolonged stimuli in the synchronized state (Goard and Dan, 2009). This filtering, however, need not take place at the cortical level. Thalamic burst mode has been suggested to allow large wake-up call responses to stimulus transients, whereas tonic mode would provide a more linear representation of temporally extended stimuli (Sherman, 2001). In auditory as in other cortices, thalamic bursting is more common in synchronized states (Massaux et al., 2004).
Although for our analysis we divided data into the most synchronized and desynchronized states we recorded, a continuum of states, corresponding to a continuum of LFP and EEG power spectra, can be observed both under anesthesia (Clement et al., 2008; Curto et al., 2009) and during wakefulness (Gervasoni et al., 2004). We suggest that one consequence of cortical desynchronization is to put the cortex under progressively greater control of sensory stimuli, and to tone down the role of intrinsic dynamics in shaping population activity. In primates, attention causes decreased low-frequency LFP power in multiple areas (Fries et al., 2001; Chalk et al., 2010), broadly similar to the changes in low-frequency power seen in our data. When an animal attends to a changing stimulus, it may allow cortical activity to more faithfully follow that stimulus, whereas an unattended stimulus would be less able to control neural spiking. Our data suggest such an effect could be achieved by placing the parts of the cortex that represent the attended stimulus in a more desynchronized state.
Footnotes
This work was supported by the National Institutes of Health (Grants MH073245 and DC009947) and the National Science Foundation (Grant SBE-0542013 to the Temporal Dynamics of Learning Center, a National Science Foundation Science of Learning Center). We thank members of the Harris lab for productive discussions, Shuzo Sakata for assistance with pilot experiments and Artur Luczak for help with stimulus design.
The authors declare no competing financial interests.
Correspondence should be addressed to Kenneth D. Harris, Departments of Bioengineering, Electrical and Electronic Engineering, Imperial College, South Kensington Campus, London SW7 2AZ, UK. kenneth.harris@imperial.ac.uk