Abstract
The brain is widely assumed to be a paradigmatic example of a complex, selforganizing system. As such, it should exhibit the classic hallmarks of nonlinearity, multistability, and “nondiffusivity” (large coherent fluctuations). Surprisingly, at least at the very large scale of neocortical dynamics, there is little empirical evidence to support this, and hence most computational and methodological frameworks for healthy brain activity have proceeded very reasonably from a purely linear and diffusive perspective. By studying the temporal fluctuations of power in human restingstate electroencephalograms, we show that, although these simple properties may hold true at some temporal scales, there is strong evidence for bistability and nondiffusivity in key brain rhythms. Bistability is manifest as nonclassic bursting between high and lowamplitude modes in the alpha rhythm. Nondiffusivity is expressed through the irregular appearance of high amplitude “extremal” events in beta rhythm power fluctuations. The statistical robustness of these observations was confirmed through comparison with Gaussianrendered phaserandomized surrogate data. Although there is a good conceptual framework for understanding bistability in cortical dynamics, the implications of the extremal events challenge existing frameworks for understanding largescale brain systems.
Introduction
The characterization of spontaneous or “restingstate” data has been traditionally overshadowed by the study of taskrelated activity, such as the evoked response potential of psychophysiology. However, there is growing interest in restingstate data (Biswal et al., 1995), both as an influential precursor to sensory processing (Makeig et al., 2002) or motor activity (Fox et al., 2006) and as a window into the highly structured nature of spontaneous largescale brain activity per se (Greicius et al., 2003; Achard et al., 2006). The use of restingstate electroencephalogram (EEG) data is ideal in these regards, because its high temporal resolution allows the study of activity from long to very short timescales. Lengthy acquisitions of such data permit estimation of the statistics of activity at various scales, including the mean power (or power spectra) plus higherorder moments, as reflected in longrange temporal (LinkenkaerHansen et al., 2001) and spatial interdependences (Breakspear and Terry, 2002). However, the only measure traditionally reported is the mean power (i.e., the power spectrum). This leaves the nature of spontaneous fluctuations around the mean unknown. An understanding of these fluctuations may have farreaching methodological and computational consequences.
We hence examine the likelihood distribution of power fluctuations across a range of timescales (0.5–35 Hz) in human restingstate EEG and seek lowdimensional parametric forms at each frequency. In particular, we test the null hypothesis that the distribution at all frequencies could be well described by the simple oneparameter family of exponential distributions, which contain no additional information that could be inferred from knowledge of the mean. Such distributions arise when the underlying system generates uncorrelated stochastic events and have nearmaximum entropy (i.e., are close to thermodynamic equilibrium). This would accord with the view that cortical activity is well described as filtered Gaussian noise, justifying the sole use of the power spectra when reporting the properties of scalp EEG data. Rejection of this null hypothesis would imply the presence of nonGaussian processes in the cortex and suggest the need for more sophisticated generative models.
Stochastic systems that generate Gaussian statistics are typically close to equilibrium, dominated by shortrange interactions and characterized by weak global correlations (Tsallis and Brigatti, 2004). They may comprise complex local units but are of sufficiently high dimension that their largescale statistics are nonetheless essentially uncorrelated. However, the brain has dense internal connectivity with longrange projections that scale hierarchically (Hilgetag and Kaiser, 2004; Sporns and Zwi, 2004). Through its abundant use of energy, the brain resides in a strongly nonequilibrium state. These are the hallmarks of complex systems, which exist in a wide variety of physical and biological fields and exhibit selforganization, multiple excited modes and highly correlated, nondiffusive processes (Zaslavsky, 2002). From this perspective, it might seem remarkable that, in contrast to many complex systems, the brain would exhibit only simple diffusive dynamics at the global scale. However, current computational models of neural population dynamics, such as the Fokker–Plank formulation (Renart et al., 2003; Deco et al., 2008), are premised on the “diffusion approximation,” namely that the inputs impinging on individual neurons within a population can be treated as temporally uncorrelated when computing instantaneous postsynaptic firing rates. Likewise, Bayesian formulations of human inference propose that cortical populations compute via uncorrelated Poisson processes (Ma et al., 2006). Ultimately, the spatial and temporal scales at which these assumptions are valid are empirical questions that can only be determined by examining the statistics of in vivo neuronal activity. The goal of the present study is to address this at the macroscopic spatial scale.
Materials and Methods
Subjects and data acquisition
Scalp EEG data were collected from a total of 16 healthy, adult volunteers (11 females; mean age, 25.3 years; range, 20–31 years). Written informed consent was obtained from each subject before their participation. Subjects were requested to rest with eyes closed while maintaining alertness. Acquisition times ranged in duration from 14 to 30 min.
All data were acquired using magnetic resonancecompatible amplifiers (hardware bandpass filter, 0.1–250 Hz; BrainAmp; Brain Products) and EEG caps (EasyCap; FMS) arranged according to the International 10–20 System, referenced against an electrode centered between Cz and Pz. Impedances of all electrodes were set below 5 kΩ. In seven subjects, EEG data were acquired from 60 scalp channels in a sound and lightattenuated room. In nine subjects, EEG data were acquired from 29 scalp channels and recorded simultaneously with functional magnetic resonance imaging (fMRI) in a 1.5 T MR Vision tomograph (Siemens) using a T2*weighted blood oxygen leveldependent (BOLD)sensitive gradient echo planar imaging sequence (scan repetition time, 2200 ms; acquisition time, 2050 ms; echo time, 60 ms; volumes, 750; slices, 20; flip angle, 90°; voxel size, 3 × 3 × 5 mm). For additional details of the EEG–fMRI setup, see Moosmann et al. (2003) and Ritter et al. (2009a). EEG sampling rate was 5 kHz. In all cases, the electrooculogram and electrocardiogram were also recorded.
Data preprocessing
The nine datasets acquired simultaneously with fMRI contain artifacts from MR gradient switching (Ritter et al., 2009b) and ballistocardiogram effects, which are generated by heart rateassociated movements of the electrodes in the static B0 field of the MR scanner. Both types of artifacts were corrected using template estimation and subtraction (Allen et al., 1998, 2000; Ritter et al., 2007) as implemented in the VisionAnalyzer software (version 1.03; Brain Products).
Temporal independent component analyses (ICAs) were performed on all 16 datasets. Components reflecting eye movement and scalp muscle artifact were identified and subtracted from the data. Data were then visually inspected to ensure they were artifact free. For this study, we characterized temporal fluctuations from the right occipital region (channel O2). These singlechannel data were lowpass filtered (70 Hz high cutoff frequency) and downsampled to 200 Hz for subsequent analysis.
Data analysis
Wavelet decompositions and derivation of empirical probability distribution functions.
Waveletbased estimators of the dynamic spectral content of time series are a well established approach, with robust theoretical support (Torrence and Compo, 1998) and recent extensions to inference on dynamic coherence (Maraun et al., 2007). For the present study, dynamic spectrograms were derived by convolving the data with complex Morlet wavelets (center frequency, 1 Hz; bandwidth parameter, 10 s; corresponding to a kernel with a fullwidth at halfmaximum of 0.5 s at 10 Hz). Conversion from scales to frequency was estimated as the inverse of the scaling parameter at each level of the wavelet decomposition, yielding a frequency range from 0.5 to 35 Hz in steps of 0.5 Hz. Power was estimated as the modulus squared of the corresponding wavelet coefficients. Frequencyspecific probability distribution functions (PDFs) were then obtained by partitioning the fluctuations of power separately at each frequency and counting the number of observations in each bin. Bin width was determined individually at each frequency by dividing the frequencyspecific power range into 200 equally sized bins.
Estimating and fitting exponential PDFs.
A Gaussian process is expected when the observed events arise independently from a stochastic source. According to the central limit theorem, such events need not be drawn from a single distribution with stationary statistics but may be drawn from a large (limiting to infinite) number of independent processes that, by themselves, need not be Gaussian, provided they have finite variance. In the context of scalp EEG, which reflects summed activity across a number of large neuronal populations, this implies that the temporal fluctuations in activity may be Gaussian even if neuronal activity on smaller scales is not Gaussian (e.g., if it is a nonlinear process) as long as they are statistically independent when observed at the macroscopic scale. In the present study, this forms the null hypothesis. Rejection of this null hypothesis attributable to nonGaussian statistics is expected when such spatially distributed macroscopic processes become sufficiently correlated.
For processes exhibiting Gaussian fluctuations in amplitude, the functional form of the corresponding power distribution (i.e., the amplitude squared) follows an exponential PDF (Balakrishnan and Basu, 1996): where x is the power, and λ is the shape parameter that can be estimated from an empirical distribution by taking the log of the likelihood and estimating the slope of the resulting line. For the present purposes, this was achieved by a linear regression of the PDF in linear–log coordinates. High variance at the lower end of the loglikelihood line typically arises as a result of the relatively infrequent observations. We hence iteratively excluded such bins until a goodnessoffit threshold (R^{2} > 0.95) was achieved. It is important to note that this procedure was only pursued to provide a best estimate of the shape parameter λ when Equation 1 was at least approximately satisfied, not to make any inferences regarding the best functional form for the distribution. To gain a better insight into the functional form of the PDF, particularly the asymptotic scaling behavior of both tails, the fitted PDFs were formally evaluated in log–linear and log–log coordinates. For the data, this was achieved by partitioning the range of the log of the power into equally sized bins and counting the number of observations in each bin. For the fitted distributions, this required a change of variables to y = log(x) and, hence,
Independent component decomposition.
It is possible that deviations from an exponential fit could reflect multiple independent spatial sources (i.e., cortical regions) contributing in a complex, heterogeneous manner to scalp channels. We hence also applied the estimation and fitting procedure to time series data obtained from ICA of the wholehead EEG (Makeig et al., 1996). This algorithm produces the maximally temporally independent signals available in the channel data, hence unmixing spatially discrete sources of the scalp channel data.
Phaserandomized surrogate data.
Nonnegligible residual errors and biases in an empirical fit to an exponential PDF may arise from trivial causes such as a finite sample set, measurement bias, or methodological processes, such as waveletbased estimates of momenttomoment power in a colored noise dataset. To test the null hypothesis that observed deviations are attributable to such trivial effects, we repeated the analysis on surrogate data constructed by Fourierbased phase randomization of the original EEG data (Theiler et al., 1992). Such a process preserves the mean power spectrum of the original data while rendering its statistics Gaussian. Two hundred such surrogate datasets were computed for each subject and used to represent the null distribution.
Results
Representative distributions and parametric fits at different temporal scales are presented first. A systematic survey across the frequency domain is then provided, together with a formal modelbased comparison of Gaussian and candidate nonGaussian fits on empirical and phaserandomized surrogate data.
Exemplar results: Gaussian, bimodal, and superGaussian
In this section, we present examples of empirical PDFs of power across a range of different temporal frequencies.
Example of a simple unimodal exponential distribution
At some temporal scales, empirical distributions of the log power were observed to exhibit a unimodal form with featureless tails above and below the mean. Exponential PDFs fitted to these distributions capture the observed variability across the entire power domain. Figure 1 shows such an example from a single subject at 34 Hz, of the powerlikelihood distributions in log–linear (Fig. 1a) and log–log (Fig. 1b) coordinates. In this example, the exponential PDF is an excellent quantitative estimate of the data, as evident in both log–linear and log–log space. Importantly, there are no obvious systematic deviations (biases) evident above or below the mean. Residual error variance increases near the lower tail (lefthand), but this is spread across both sides of the tail and hence does not show a systematic bias. Increased variance is expected at the ends of the tails because of the relatively few observations.
The observed and fitted distributions derived from the same data, after phase randomization, are plotted in Figure 1, c and d. It is hence evident that the exponential PDF closely fits a time series that has been deliberately rendered Gaussian. Once again, there are some unbiased residual errors at the lower left tail. Although fitting phaserandomized data with an exponential PDF is an expected result, it is nonetheless important to verify empirically because it confirms that combined use of the wavelet decomposition and curvefitting procedures used have prima facie validity, that is, they do capture the nature of a distribution when it is known to be Gaussian by construction. Comparison of Figure 1, a and b, with Figure 1, c and d, also shows that, in this example, phase randomization had no discernable effect on the statistics of the original empirical distribution. This accords with a view that power fluctuations in this subject, at this scale, likely arise from uncorrelated stochastic processes.
Examples of bimodal exponential distributions
In most subjects, the appearance of two distinct modes in the alpha rhythm (8–12 Hz) marks a striking deviation from the unimodal exponential distribution shown above. An example of an observed bimodal distribution in a single subject at 10 Hz in log–linear and log–log coordinates is given in Figure 2. To model this distribution, a singleexponential PDF was regressed to fit the dominant mode (black line), which in this case is the mode with greatest power. A second exponential distribution was then estimated from the residuals of this fit (gray line). It can be seen that the two exponentials provide a remarkably good fit, as evident by comparing their sum (red line) with the original distribution. Figure 2, c and d, shows the same data rendered Gaussian through phase randomization. The second mode has been abolished and the surrogate data hence affords a close fit with a singleexponential PDF. Forcing a second exponential fit to the residual errors in the phaserandomized case actually renders a worse fit than one mode alone, as evident in the large errors of the summed exponentials (red line) in Figure 2, c and d.
To ensure that the abolishment of the second mode is not an artifact of using the Fourierbased phaserandomization method of producing surrogate data, we also inspected PDFs derived from a waveletbased resampling technique (Breakspear et al., 2003). Briefly, these surrogate data are created by resampling the coefficients of a discrete wavelet decomposition, using a loworder discrete wavelet basis set (we used the Daubechies wavelet of order 3). The results accorded with those derived from the Fourier approach. Supplemental Figure S1 (available at www.jneurosci.org as supplemental material) shows the outcome of applying this alternative procedure to the same data as shown in Figure 2, showing that the outcome is not dependent on the particular surrogate algorithm used.
Clear visually evident bimodal distributions of alpha activity (8–12 Hz) were observed in 13 of the total 16 subjects whose data was acquired and inspected. These distributions are shown in supplemental Figure S2 (available at www.jneurosci.org as supplemental material). Data from two other subjects (S05 and S06) exhibited somewhat less distinct bimodal distributions at 8 and 23 Hz. For completeness, the PDFs for these two subjects at the respective (nonalpha) frequencies where they were observed are shown in supplemental Figure S3 (available at www.jneurosci.org as supplemental material). One additional subject (S07) had no clear bimodal distribution within any of the frequencies studied.
To address the possibility that the bimodal distributions arise from two spatially distinct sources, each with different amplitude statistics, we analyzed time series derived from independent component decompositions of the full multichannel EEG, identifying the single dominant posterior alpha component. In subjects who showed a bimodal alpha distribution, the corresponding posterior alpha mode was invariably bimodal, typically with even greater separation of the modes. Figure 3 shows an example of a bimodal ICA distribution, again in log–log (a) and log–linear (b) coordinates taken from a single subject at 10.5 Hz. The spatial topology of this ICA mode (Fig. 3c) shows a single posterior source, consistent with the principles that ICA unmixes single (linearly mixed) brain sources of scalp activity (Makeig et al., 1996). The power spectrum of this component (Fig. 3d) evidences a strong peak in the alpha rhythm, albeit obscuring the bimodal nature of this peak.
These observations suggest that the two modes arise through a temporally heterogeneous process, whereby two distinct dynamical processes arising from a single cortical region are alternately expressed. Figure 4 shows examples of the temporal expression of bimodal distributions in two representative subjects at their peak alpha frequency. In the first subject (Fig. 4a–c), the second (higherpower) mode is expressed erratically throughout the entire 500 s example (Fig. 4a). The time series exhibits a burstlike behavior from the lowerpower first mode (Fig. 4b), evident as sudden increases in the magnitude of the first temporal derivative of power (Fig. 4c). Each burst is weakly visible in the raw electrode time series. In this subject, this bursting behavior is present throughout the entire dataset. In contrast, data from the second subject (Fig. 4d–f) shows two distinct periods of time (t < 180 s and t > 1000 s) when the second (higherpower) mode is constantly present. In the interim period (180 s < t < 1000 s), there are sporadic instances when the second mode appears to be expressed, although these are typically lower in amplitude. Greater irregularity is also visible in the raw electrode time series (Fig. 4a,d), underlining the gross, macroscopic impact of alpha rhythm bimodality.
Although the bimodal distribution clearly affords a far better fit to the data, it does include a second parameter. To formally compare the fits, we used the Bayesian information criterion (BIC), which includes a penalty term for model complexity: where RSS is the sum of the squared residuals, n is the number of observations (discrete power bins), and k is the number of free parameters (k = 1 for the unimodal fit and k = 2 for the bimodal fit). Given two or more candidate models, the “best” model will yield relatively low values, reflecting small residual variance after penalization for the number of free parameters. Figure 5 shows a (singlesubject) example of the difference between the BIC values for a unimodal versus a bimodal fit across all frequencies studied. Hence, a strongly positive value favors the bimodal fit, whereas a strongly negative value favors a unimodal fit. Results for the empirical data are shown in black squares. The mean and 95% confidence intervals, estimated parametrically from 200 surrogate datasets, are depicted in red circles.
The standout feature in the original data is a strong upward deflection in the BIC difference between 8 and 11.5 Hz. This supports the results clearly visible in Figure 2 (the same subject) while additionally providing a quantitative measure of model preference and indicating the approximate frequency domain over which the second mode is expressed. In stark contrast, results from the surrogate data are consistently negative, indicating that, after penalizing for the second parameter, the bimodal fit is statistically inferior to a singleexponential fit for the unimodal distribution that is invariably generated from surrogate data. The original data also show a second peak at ∼14.5–15.5 Hz. Visual inspection of the PDF at these frequencies reveals only weak evidence for a second mode, evident as a point of inflection in the righthand tail. The significance of the broader positive region of the curve in the beta range (13–30 Hz) is considered below.
In systems with multistability, the distribution of times for which the system “dwells” in each mode can also be informative of the possible nature of the underlying process (Nakamura et al., 2007). Figure 6 shows the cumulative distribution of dwell times derived from the two same subjects whose data was shown in Figure 4. It can be seen that dwelltime distributions of each mode (which have been rescaled to their mean) appear to follow similar functional forms. When viewed in log–log coordinates (Fig. 6b,d), it is clear that the dwelltime distributions in these subjects do not follow a powerlaw relationship. When viewed in linear–log coordinates (Fig. 6a,c), the distributions instead suggest a gammatype stretched exponential function: Solid lines in each panel show fits to the respective data. As evident in the form of Equation 4, the parameter β determines the curvature of the curves (in linear–log space). When β is close to 1, Equation 4 approaches a basic exponential form e^{−αx}, whereas if β approaches 0, the curve becomes highly concave (in linear–log coordinates), trending toward a powerlaw effect. In the first subject (Fig. 6a,b), β = 0.45 for the lowpower mode and β = 0.5 for the highpower mode. For the second subject, the corresponding values are β = 0.45 (lowpower mode) and β = 0.75 (highpower mode). Hence, the dwell times for the highpower mode in this subject are closer to a basic exponential form. Similar fits were obtained in all 13 subjects showing a bimodal alpha pattern. These are shown in supplemental Figure S4 (available at www.jneurosci.org as supplemental material). For the lowerpower mode, β was generally close to 0.5 (mean, 0.49; range, 0.40–0.60), whereas for the higher mode, β was higher in all subjects except one (mean, 0.68; range, 0.50–0.90).
Examples of unimodal nonGaussian distributions
In all subjects and across a broad range of frequencies, typically in the beta range (∼13–30 Hz), the empirical PDFs frequently show a consistent upward bias of the rightside tail (i.e., above the mean) from exponential toward powerlaw scaling. Figure 7 provides examples of this effect in two subjects. It is critical to note that the empirical distributions (Fig. 7a,b) show a systematic bias and not simply large unbiased residuals over the righthand tails. In clear contrast, this bias is absent in PDFs derived from the surrogate data (Fig. 7c,d).
Research into complex, highly correlated systems motivates the consideration of an alternative stochastic model (Bramwell et al., 1998). Above the mean, fluctuations in these systems scale according to the following functional form: where the parameter λ is estimated in the same manner as for the exponential PDF. Setting a = 0.5 yields the heavy righthand tail that has been empirically observed in data from numerous complex systems (Bramwell et al., 2000). This doubleexponential form, also known as the “Fisher–Tippett distribution,” clearly captures the trend in the empirical PDF, correcting the bias resulting from a simpleexponential fit above the mean. Phase randomization returns the tails back to simpleexponential scaling. Similar results were seen at many frequencies in all subjects and are presented in supplemental Figure S5 (available at www.jneurosci.org as supplemental material).
By fixing a = 0.5, Equation 5 becomes a oneparameter double exponential and can hence be compared directly against the simple exponential without need for parameter penalization. Figure 8 shows the ratio of the RSS for the exponential and doubleexponential fits in a single subject. In this subject, the ratio is relatively small (<10) for frequencies below 16 Hz and above 30 Hz. However, within the beta band, between 16 and 30 Hz, there is a very strong increase in the ratio, reflecting a decrease in the residual error for the Fisher–Tippett fit in this range and hence a corresponding increase in the relative log evidence, peaking at 21 Hz.
In comparison, the ratios derived from the surrogate data (shown in red) are uniformly small, typically <1, with low variability between surrogate realizations. However, in this subject, as in all others, the ratio of the residuals in the empirical data is outside the surrogate distribution at most frequencies, although they are relatively low (<10). Visual inspection of the data shows that this is typically attributable to a small number of outlier observations, typically near the very edges of the righthand tail, and not to a consistent bias as evident in Figure 7 and supplemental Figure S3 (available at www.jneurosci.org as supplemental material). Such outlier observations increase the residual error for both the simple and doubleexponential PDFs. However, the relative ratio of the RSS is invariably only large (>10) when there is a consistent upward bias of the righthand tail associated with very small errors for the Fisher–Tippett fit.
The preferential performance of the Fisher–Tippett fit over the exponential PDF for the righthand tail implies that largeamplitude (“extremal”) events occur far more frequently than expected from the (null) simpleexponential form. Such events can be clearly seen in time series of power at the frequencies exhibiting Fisher–Tippett statistics versus the same data after phase randomization (Fig. 9a,b). Notably, compared with amplitude dynamics at frequencies showing bimodal distributions (Fig. 4), power fluctuations here do not exhibit obvious irregular switching because they are still drawn from a unimodal distribution. Hence, fluctuations are sampled from a single underlying mode. The critical difference is that their log likelihood falls less rapidly than an exponential functional form, and hence largeamplitude “events” are sporadically observed. Hence, the kurtosis—a measure of the width of the tails—is increased approximately twofold.
What are the characteristics of extremal events in the time–frequency plane? By calculating the ratio of the simpleexponential and Fisher–Tippett curves, it is possible to define the threshold above which amplitude events are 95% more likely to be drawn from a Fisher–Tippett rather than a simpleexponential PDF. Examples of extremal events in the time–frequency plane, after imposing this threshold in a single subject, are shown in Figure 9, c and d. For example, the double peak at 22 Hz in a is seen to span several adjacent frequencies in d at ∼1328 s. Their appearance in this case, as in all subjects, suggests that, after taking into account the smoothing kernel of the complex Morlet wavelet, extremal events are relatively isolated in both temporal and frequency domains. Detailed inspection of the data failed to reveal more complex phenomena, such as chirped signals tracking diagonally across the time–frequency plane.
Grouplevel findings
In this section, we present the intersubject consistency and variability of the observed bimodal and nonGaussian statistics.
Bimodal distribution
The summary statistics of bimodal PDFs observed at the singlesubject level in all 16 subjects are presented in Table 1. These results were compiled by inspecting the BIC differences for each subject and verifying the results against visual inspection of the frequencyspecific PDFs to exclude spurious results attributable to outlier observations or poor performance of the fitting routine. They evidence remarkable consistency in 13 of the subjects showing a robust center frequency in the alpha range (mean ± SD, 10.65 ± 0.72 Hz; range, 9.5–12 Hz).
One subject (S06) had bimodal activity that spanned a wide range of frequencies (19.0–27.0 Hz) in the beta range. In one subject (S05), the bimodal activity was somewhat less distinct and evident just below the defined alpha range. For completeness, the PDFs for these two subjects at the respective (nonalpha) frequencies where they were observed are shown in supplemental Figure S3 (available at www.jneurosci.org as supplemental material). The single case of bimodal beta activity in the beta activity (S06) is notable for exhibiting the same classic bimodal form as typically evident at approximately half that frequency in most other subjects. Only one of the 16 subjects (S07) had no evidence of bimodal activity in the frequencies studied. Interestingly, this subject had consistently highamplitude alpha activity and hence a highpower peak in the alpha range, suggesting that the data were acquired during a period of time when the alpha rhythm was consistently in the highpower mode. For thoroughness, the empirical PDFs for these three subjects at 10.5 Hz (the peak frequency for bimodal activity in the grand average) are also shown in supplemental Figure S6 (available at www.jneurosci.org as supplemental material). The lack of any suggestion of a second mode is quite clear in these PDFs. These results hence argue that the existence of bimodal alpha activity is not consistently observed in all subjects.
To further characterize the nature of the bimodal alpha activity in those 13 subjects in whom it was apparent, we calculated the grand average difference between the unimodal and bimodal BIC across this subgroup. Robust evidence for a bimodal fit, when compared with the surrogate data, is evident across the conventional alpha range (8–12 Hz). Above 12 Hz, there exists a broad range of frequencies for which the difference is close to 0, yet nonetheless well above the surrogate distribution, a phenomenon seen at the singlesubject level (as in Fig. 5). However, detailed inspection of the PDFs at the singlesubject level shows that this finding is not attributable to weak bimodality (such as may be evident as a point of inflection). Rather, it is typically because of the existence of a Fisher–Tippett distribution. When compared with a singleexponential PDF, forcing a second fitted exponential PDF, usually with a very low maximum likelihood, is typically able to decrease the residual error term at least sufficiently to offset the parameter penalty term in Equation 3, hence yielding near equivalent BIC values for bimodal and unimodal exponential fits between ∼12 and 28 Hz.
It is, of course, possible to include the extra three subjects in such a grouplevel approach. Their statistics have only a minor effect on the mean but naturally increase the variance. However, the present goal is not to make a populationlevel inference regarding the existence of a single frequency at which a bimodal PDF occurs, because some subjects clearly do not evidence this in a single restingstate recording. Intersubject variability in the existence and character of bimodal distributions is an interesting subject that would arguably require a larger database.
To further investigate the deviation from a unimodal exponential process within the alpha range using a conventional nonparametric statistic, we applied the Kolmogorov–Smirnov (K–S) test on the empirical distributions of both original and surrogate data at each timescale and compared the resulting statistics. The K–S test is based on comparing the empirical distribution function (eCDF) to a theoretical distribution function (tCDF). The resulting statistics reflect the maximum distance between eCDF and tCDF and can be read as a goodnessoffit measure (Chakravarti et al., 1967; Stephens, 1974). For the tCDF, we used the CDF corresponding to the simpleexponential PDF (Eq. 1), representing the null hypothesis of an underlying Gaussian process. However, it must be noted that the conversion from a K–S statistic to an associated p value rests on the assumption that the data are serially independent, which is certainly not the case for raw EEG data or coefficients derived from a continuous wavelet decomposition. We hence used the K–S test simply as a measure of goodness of fit and used the same Fourierresampling approach for statistical inference.
The grandaveraged frequencyspecific K–S statistics are plotted for original and surrogate data in supplemental Figure S7 (available at www.jneurosci.org as supplemental material). Essentially the K–S test replicates the finding observed when using log RSS (compare this figure with Fig. 10). In particular, the K–S statistic diverges for the empirical (black), but not surrogate (red), distributions specifically within the alpha range (8–12 Hz). It is also worth noting that the K–S statistic is lower for the surrogate compared with the empirical data throughout the measured frequency range; this is likely attributable to a variety of ad hoc reasons, such as outlier observations and nonGaussian (Fisher–Tippett) distributions within the beta range. Also of note is that the K–S statistic gradually decreases with increasing frequency for the surrogate distributions. This is likely attributable to the decreasing length of serial correlations at higher frequencies and hence the increasing effective degrees of freedom in the data at higher frequencies (to which the K–S test is sensitive).
NonGaussian unimodal distributions
Table 2 summarizes the statistics of nonGaussian PDFs in all 16 subjects. It hence documents the existence of nonGaussian “fat righthand tails” within the beta range in all subjects. Evidence of nonGaussian scaling is also often present at frequencies in the theta and low alpha range (4–8.5 Hz). In contrast to the bimodal PDFs, nonGaussian distributions showed far greater intersubject variability. It is not uncommon for the relative log evidence to dip close to unity for narrowfrequency intervals separating broader intervals when the relative log evidence is consistently high (>10). Because of this high intersubject frequency evidence, a secondlevel subgroup statistic (equivalent to Fig. 10) was not performed.
Notably, the only subject without bimodal activity (S07) had low RSS for a Fisher–Tippett distribution in the alpha range.
At a number of frequencies in all subjects, there existed outlier observations in a small number of power subintervals. The resulting PDFs did not show a clear scaling behavior suggestive of either a single or double exponential, producing quite large errors for both unimodal fits. Because these outlier observations were inevitably above the background likelihood trend, they invariably yielded smaller errors for the Fisher–Tippett distribution, although the relative log ratio was typically <10. These frequencies were not included in Table 2 as evidence for Fisher–Tippett scaling.
Discussion
After an extensive survey of neuronal oscillations across a hierarchy of scales and the 1/f scaling laws of their power spectrum, Buzsaki (2006, p 128) surmised that “rare but extremely large events are inevitable, because at one point 1/f dynamics become supersensitive to either external perturbations or its internal processes, responding with very large synchronized events.” However, he then concluded that “such large events never occur.” In addressing this issue, we have shown that the problem does not lie in observing such events but rather in forming a proper representation of the null distribution for their nonoccurrence. By estimating PDFs across a spectrum of timescales in spontaneous EEG data and comparing the empirical PDFs with null distributions generated from phaserandomized surrogate data, we observe the robust occurrence of two classes of largeamplitude events in restingstate EEG. First, in the alpha range (8–12 Hz), 13 of 16 subjects showed a distinctly bimodal distribution with a highpower mode expressed irregularly in discrete time windows. Second, in all subjects and at many temporal scales, the empirical PDFs exhibited a consistent upward bias above the mean from exponential toward powerlaw scaling. This translates into sporadically occurring highamplitude events in the time–frequency domain. We therefore argue that the extremal events foreseen by Buzsaki are a robust and routine property of spontaneous cortical activity.
The present approach was informed by the characterization of correlated, nonGaussian processes in a wide variety of other complex systems, whereby a simple functional form is sought for the probability distribution of a measure of system activity (Bramwell et al., 1998, 2000). As shown by Chapman et al. (2002), it is the presence of nonvanishing higherorder moments that distinguishes systems with nontrivial, strongly correlated fluctuations from those dominated by uncorrelated, diffusive ones. Intriguing applications of this approach have revealed universal powerlaw distributions in human locomotor activity (Nakamura et al., 2007) and nonGaussian statistics in heart rate variability during daily activity (Kiyono et al., 2005). The present report also mirrors the observation of nonPoisson statistics in a variety of human activities ranging from communication to work patterns (Barabási, 2005), as can be seen by comparing Figure 1 of Barabási's report with Figure 9 of the present one. To our knowledge, this is the first observation of such statistics in spontaneous cortical electrical activity. It accords with an innovative methodological advance by Anemüller et al. (2003) who generalized ICA to the complex domain for cortical EEG data. Their approach enables disambiguation of nonstationary, correlated sources of cortical activity that each exhibit superGaussian statistics. The detailed characterization of superGaussian empirical PDFs in the present study, particularly in the beta range, arguably provides strong support for this approach over standard ICA models. Another related study is that of Suckling et al. (2008) who estimated the Hölder exponent, a measure of correlated fluctuations, from time series of the restingstate cortical BOLD signal. As with the present approach, they also analyzed timescalespecific activity via a multiscale wavelet decomposition. They hence reported scale invariance of this exponent that was, moreover, sensitive to age and pharmacological perturbation. However, the deviation from a Gaussian process in their data was marginal, possibly attributable to the fact that the Hölder exponent is primarily intended as a singularity analysis and may be less sensitive to nonGaussian amplitude fluctuations or attributable to the lowpass temporal filtering property of the hemodynamic response function that converts neuronal activity into an fMRI BOLD signal.
An important study of nontrivial temporal correlations in cortical activity is that of LinkenkaerHansen et al. (2001) who studied modulations in alpha (8–13 Hz) and beta (15–25 Hz) power in both EEG and magnetoencephalographic data. They observed powerlaw scaling of temporal correlations at timescales ranging from a few seconds to several minutes. These longrange correlations were also present, although attenuated, during sensory input (LinkenkaerHansen et al., 2004). It has been proposed that this observation reflects the propagation of “avalanches” of critical neuronal activity in a network close to an instability (Poil et al., 2008), as detailed previously in neuronal cultures by Beggs and Plenz (2003). Longtailed, possibly scalefree, distributions of synchronous activity in multichannel EEG (Breakspear et al., 2004; Stam and de Bruin, 2004), MEG and fMRI (Kitzbichler et al., 2009) have also been reported across a broad spectrum of timescales, including those of the present study. By revealing the typical presence of two distinct modes of alpha activity and formally estimating parameters for each underlying mode, our study advances this research by providing a more principled (parametric) estimate for switching between high and lowpower modes and hence for estimating the scaling behavior of their respective dwell times. Although we did find longtailed dwell times for both modes, these appeared to be better captured by stretched exponential forms rather than the powerlaw scaling functions used by Poil et al. (2008).
A number of methodological points require careful consideration. The present goal of describing power distributions in scalp EEG arose from an objective to characterize the covariance structure of simultaneously acquired EEG and fMRI data. We hence observed the bimodal and nonGaussian PDFs in the nine simultaneous acquisitions reported in Tables 1 and 2. The classic appearance of eyes closed, restingstate alpha fluctuations underlying the bimodal distributions in these data, and the intersubject variability seen in the nonGaussian statistics provided prima facie support for the neuronal origin of these phenomena. In addition, an exploratory regression of both bimodal and extremal EEG events onto the BOLD data showed a variety of regional signal increases and decreases in cortical regions and did not suggest a classic artifactual effect, such as a ventricular or pericranial pattern. A full characterization of these BOLD correlates is to be the subject of a future report. Additional empirical support arose from the replication of both bimodal and nonGaussian highpower tails in the additional seven EEG recordings acquired without fMRI in a sound and lightattenuated room, as also reported in the present study.
It is also critical to note that we estimated the exponential shape parameter λ from a linear regression in linear (power)–log (likelihood) coordinates. However, noting that the estimation of residual errors using a linear regression may yield varied results (Armstrong, 1985; Clauset et al., 2009), we studied the nonlinear functional forms of the PDFs in log (power)–log (likelihood) coordinates. This enables a more accurate visualization of the scaling behavior of the tails of the PDFs and, crucially, quantitative analysis of any systematic scaling bias. We also avoided direct inference on the goodness of fit of the original regression, focusing instead on relative model selection of several candidate models using the BIC approach. The findings were also consistent with those derived from the K–S statistic (Clauset et al., 2009) and validated using surrogate data. Our goal was hence to find the best unbiased model, through a formal model comparison, for these functional forms. Moreover, we also studied the behavior of the candidate models on Gaussianrendered data to ensure that no systematic methodological bias could lead to spurious nonGaussian distributions. Finally, we also repeated the groupwise analysis using the K–S test, replicating the finding using BIC.
We chose to analyze the original channel data, rather than source reconstructed data, because source reconstruction techniques inevitably embody a number of assumptions about the noise structure of the underlying data sources. However, finding nonGaussian distributions in channel data, which arise from large regions of cortex, arguably has a positive meaning. According to the central limit theorem, the linear superposition of uncorrelated sources is expected to converge toward a Gaussian distribution even if the sources themselves are not Gaussian. Hence, the finding of a nonGaussian distribution in scalp channel EEG data argues, conversely, that the underlying sources must be correlated and nonGaussian. Recent theoretical work suggests a nonGaussian central limit for correlated stochastic processes (Tsallis, 2006). In other words, we submit that both the bimodal and superGaussian distributions that we observe in channel data reflect the presence of the same nontrivial distributions in cortical sources and indeed at all scales from the macroscopic down.
The findings we report here have significant implications. The observation of a bimodal PDF in the alpha range supports previous arguments for the presence of a weakly stable, noiseperturbed steadystate attractor in the underlying corticothalamic system (Stam et al., 1999; Valdes et al., 1999) and argues for cortical field models that allow for a nearby bifurcation (Robinson et al., 1997, 2001, 2002; Lopes da Silva et al., 2003). However, the frequent occurrence of regular switching between two distinct modes in our data adds critical details. First, the bifurcation should arguably be of a subcritical Hopf nature to permit distinct jumps between two distinct modes, because a noiseperturbed supercritical bifurcation would be expected to lead to constant (and therefore unimodal) mixing of subthreshold and suprathreshold dynamics. Second suprathreshold oscillations hence occur in healthy states and not only during pathological conditions such as epilepsy, as argued previously (Robinson et al., 1997, 2002; Breakspear et al., 2006). Theoretical work on Hopf bifurcations in timedelayed stochastic systems (Longtin, 1991) provides a potential for understanding the stretched exponential forms that describe the dwelling distributions in our data.
The observation of superGaussian (doubleexponential) PDFs primarily in the beta range is also of importance, particularly because this frequency range underlies critical cognitive and behavioral processes, such as motor learning and coordination (Boonstra et al., 2007). This finding suggests the presence of nontrivial spatial and temporal correlations among macroscopic neuronal populations. Although these may be expected to occur in smallscale neural systems, as supported by an abundance of theoretical arguments (Amari et al., 2003; Kuhn et al., 2003), current computational models of largescale neural systems are premised on the diffusion approximation, namely that inputs impinging on individual neurons within a population can be treated as being temporally uncorrelated when computing firing rates (Renart et al., 2003; Deco et al., 2008).
Such models do, however, predict spatial and temporal correlations across multiple scales, including 1/f behavior near critical points where instabilities occur (Robinson et al., 1997, 2001, 2002; Robinson, 2003). Extending such models to more explicitly incorporate strong correlations and nontrivial higherorder moments would link them to the rich theoretical developments that currently underpin the study of other complex correlated systems (Zaslavsky, 2002; Tsallis, 2006). These developments explain how systems with both dynamical and stochastic processes, such as a densely connected physiological system under the influence of noise, can intermittently be dominated by the dynamical forces, causing the system to be transiently “trapped” near lowdimensional, highly synchronous manifolds (Shlesinger et al., 1993).
Such processes arise (although not exclusively so) in systems of coupled oscillators that unfold on an underlying structure with scalefree (i.e., powerlaw) connectivity (Vasily et al., 2006), a possible basic structural principle of the cortex (Freeman and Breakspear, 2006). Characterizing the temporal statistics of the sporadic expression of extremal events, for example, if they are clustered in time or display formal intermittency (Platt et al., 1993), may help to formally investigate these possibilities. Future work is also required to understand their computational role and metabolic correlates.
Footnotes

P.A.R. and K.A were supported by the Australian Research Council (ARC). P.R. was supported by the German Federal Ministry of Education and Research (Berlin Neuroimaging Center; Bernstein Center for Computational Neuroscience, Bernstein Focus State Dependencies of Learning) and the Robert Bosch Foundation. M.B. acknowledges the support of Brain Network Recovery Group Grant JSMF22002082 and ARC Grants DP0667065 and TS0669860. We thank A. R. McIntosh, V. K. Jirsa, and K. Friston for helpful discussions.
 Correspondence should be addressed to Michael Breakspear, Queensland Institute for Medical Research, Herston, QLD 4006, Australia. Michael.Breakspear{at}qimr.edu.au