## Abstract

What constitutes normal cortical dynamics in healthy human subjects is a major question in systems neuroscience. Numerous *in vitro* and *in vivo* animal studies have shown that ongoing or resting cortical dynamics are characterized by cascades of activity across many spatial scales, termed neuronal avalanches. In experiment and theory, avalanche dynamics are identified by two measures: (1) a power law in the size distribution of activity cascades with an exponent of −3/2 and (2) a branching parameter of the critical value of 1, reflecting balanced propagation of activity at the border of premature termination and potential blowup. Here we analyzed resting-state brain activity recorded using noninvasive magnetoencephalography (MEG) from 124 healthy human subjects and two different MEG facilities using different sensor technologies. We identified large deflections at single MEG sensors and combined them into spatiotemporal cascades on the sensor array using multiple timescales. Cascade size distributions obeyed power laws. For the timescale at which the branching parameter was close to 1, the power law exponent was −3/2. This relationship was robust to scaling and coarse graining of the sensor array. It was absent in phase-shuffled controls with the same power spectrum or empty scanner data. Our results demonstrate that normal cortical activity in healthy human subjects at rest organizes as neuronal avalanches and is well described by a critical branching process. Theory and experiment have shown that such critical, scale-free dynamics optimize information processing. Therefore, our findings imply that the human brain attains an optimal dynamical regime for information processing.

## Introduction

Resting state activity of the human brain, which is maintained in the absence of any particular sensory input or motor output, has provided novel insights into the functional and anatomical organization of the cortex (Raichle et al., 2001). Resting state activity forms well described networks of functional interactions (Greicius et al., 2003; Damoiseaux et al., 2006) that are correlated with the underlying anatomical connectivity (Hagmann et al., 2008; Greicius et al., 2009). The resting state also differs between healthy subjects and patients suffering from mental or neurological disorders (Broyd et al., 2009; Montez et al., 2009; Hawellek et al., 2011), implying its potential use for clinical diagnostics. Most analysis of human resting state has considered spatial correlations across long timescales, primarily using functional magnetic resonance imaging (Greicius et al., 2003; Damoiseaux et al., 2006; Hagmann et al., 2008; Greicius et al., 2009; Lewis et al., 2009). At faster timescales using electroencephalography (EEG) or electrocorticography, analysis has focused on the presence of oscillations (Logothetis et al., 2001; Buzsáki, 2006), their internal nesting (He et al., 2010), and corresponding long-range temporal correlations (Linkenkaer-Hansen et al., 2001). Although it is now established that resting state activity differs significantly from noise, the general principles underlying these dynamics are not well understood (Deco et al., 2011).

*In vitro* and *in vivo* animal studies have shown that spontaneous cortical activity organizes into spatiotemporal cascades of discrete events extracted from large deflections in the local field potential (LFP) termed neuronal avalanches (Beggs and Plenz, 2003; Gireesh and Plenz, 2008; Petermann et al., 2009). These cascades have been described successfully by a critical branching process (Harris, 1989; Beggs and Plenz, 2003; Plenz, 2012). In branching processes, activity propagates from one active group of neurons to another in a cascade. The ratio between the number of activations in consecutive time steps is termed the branching parameter, σ. When σ = 1, the system is critical, operating at an exquisite balance at which activity can propagate long distances without runaway excitation. Analytically, a branching process at criticality produces a scale-invariant cascade size distribution following a power law with exponent α of −3/2. Indeed, σ = 1 and α = −3/2 are the experimentally observed characteristics of neuronal avalanches in superficial layers of cortex *in vitro* (Beggs and Plenz, 2003) and *in vivo* (Plenz, 2012). The idea that neuronal avalanches identify cortical dynamics at criticality has been influential in recent years. Theory and experiment show that criticality gives rise to optimal information processing (Kinouchi and Copelli, 2006; Shew et al., 2009; Shew et al., 2011), and the resulting scale-invariant organization provides a unified framework for describing cortical activity at multiple spatial and temporal scales (Plenz and Thiagarajan, 2007; Plenz, 2012).

The scale invariance of neuronal avalanches implies that they might be observable at the scale of the entire human cortex with noninvasive neuroimaging methods. Using the temporally precise and spatially localized magnetoencephalography (MEG), we show that resting state activity of healthy human subjects consists of neuronal avalanches, suggesting that it is a critical state.

## Materials and Methods

##### Data acquisition and preprocessing.

Spontaneous brain activity was recorded from healthy human subjects in the MEG core facility at the National Institutes of Health (NIH)–National Institute of Mental Health in the United States and the Medical Research Council Cognition and Brain Sciences Unit in Cambridge, United Kingdom(Shriki et al., 2011; an earlier version of this study). The NIH facility recorded activity from 104 subjects (38 males and 66 females; age, 31.8 ± 11.8 y) for 4 min at rest with eyes closed. The data were recorded using a CTF MEG system (CTF Systems). They were sampled at 600 Hz and band-pass filtered (1–80 Hz). The sensor array consisted of 275 axial first-order gradiometers. Two dysfunctional sensors were removed, leaving 273 sensors in the analysis. Analysis was performed directly on the axial gradiometer waveforms and on data transformed to planar gradiometers using the FieldTrip toolbox in MATLAB (MathWorks).

The Cambridge facility used an Elekta Neuromag MEG system to record activity from 20 participants (15 males and 5 females; age, 28.8 ± 7.2 y) for 3.5 min at rest with eyes closed. Maxfilter (Elekta Neuromag) was used to remove external noise with signal space separation (Taulu et al., 2004) and to correct for head movements (Taulu and Simola, 2006). Independent component analysis was used to remove independent components associated with eye movements (unless otherwise specified). The data were sampled at 1 kHz, down-sampled to 250 Hz, and band-pass filtered (1–80 Hz). The sensor array consisted of 102 pairs of planar gradiometers. Fifteen sensors were found to be associated with artifacts in many subjects and were removed, leaving 87 sensor pairs in the analysis.

Notch frequencies associated with alternating current of the electricity network were removed (60 Hz for the NIH data, 50 Hz for the Cambridge data). Power spectrum density functions were calculated for each sensor, averaged for each subject, and the exponent β was fit for 10–50 Hz.

Independent components associated with the magnetocardiogram (MCG) were removed. Independent component analysis was performed using the FieldTrip MATLAB toolbox. MCG components have a highly characteristic waveform with sharp deflections at regular intervals corresponding to the heartbeat. We therefore identified MCG components based on their temporal regularity. For analysis, we used the 25 independent components that had the highest variance for each subject. Using a threshold (±3 SD; see also Signal Discretization below), discrete suprathreshold events were identified in the waveform associated with each component. We then examined the statistics of the inter-event intervals by calculating the coefficient of variation (CV). In expectation with the regularity of heartbeat artifacts, the CV distribution was bimodal; a significant number of components had very low CVs that were typically between 0 and 0.7 (0.20 ± 0.18 mean ± SD; *n* = 104). In contrast, most other components centered at 1 with some components reaching up to 6. The width of the central peak was relatively narrow toward smaller values, resulting in a clear separation of the cluster of CV values below 0.7 (the SD of a Gaussian fit to the left side of the central peak was 0.06). Visual inspection of the actual waveforms revealed regular spikes at heartbeat frequency, confirming our identification of MCG components based on low CV values. In most NIH subjects, we found between one and three relevant components and a few subjects did not have any MCG-associated component. For most Cambridge subjects, there was no MCG-associated component and a few had a single relevant component. Sensor waveforms were then reconstructed without MCG components using the FieldTrip MATLAB toolbox.

##### Signal discretization.

For each sensor, positive and negative excursions beyond a threshold were identified. A single event was identified per excursion at the most extreme value (maximum for positive excursions and minimum for negative excursions; Fig. 1*A*). Comparison of the signal distribution to the best fit Gaussian (Fig. 1*B*) indicates that the two distributions start to deviate from one another at ∼±2.7 SD. Therefore, thresholds smaller than ±2.7 SD will lead to the detection of many events related to noise in addition to real events, whereas much larger thresholds will miss many of the real events. To strike a compromise, the thresholds chosen were ±3 SD, which amounts to a false positive probability of ∼0.1%. As will be shown further below (Fig. 5), similar results were found with thresholds in the range of ±2.7 to ±4.2 SD (Fig. 5*E*), as well as when using a peak detection method that considers all local extremum points beyond threshold as events (Fig. 5*F*).

All planar gradiometers are in orthogonal pairs tangential to the skull. This refers to both the virtual planar gradiometers in the NIH data and the original planar gradiometers in the Cambridge data. Event rasters of the gradiometer pairs were combined into a single channel, or sensor, using an OR operation. That is, events observed at the same time in a pair of gradiometers were counted as one event.

##### Effect of signal discretization on pairwise correlations.

For each sensor, we calculated two event-triggered averages (ETAs), one for positive and one for negative events (±350 ms relative to the event; Fig. 1*C*). To reconstruct the continuous sensor signal, we convolved the positive ETA with the train of positive events, convolved the negative ETA with the train of negative events, and summed the two. For each pair of sensors, we calculated the corresponding cross-correlation based on the original signals and on the reconstructed signals. Scatter plots of the pairwise correlations for the original and reconstructed signals are shown in Fig. 1*D*. To evaluate the effect of signal discretization on the correlations, we calculated the correlation coefficient for these values for each subject.

##### Cascade-size distributions and power law statistics.

The time series of events obtained from each sensor was individually discretized with time bins of duration Δ*t*. The timescale of the analysis, Δ*t*, was explored systematically in multiples of Δ*t*_{min}, which was the inverse of the data acquisition sampling rate for each system. A cascade was defined as a continuous sequence of time bins in which there was an event on any sensor, ending with a time bin with no events on any sensor. The number of events on all sensors in a cascade was defined as the cascade size.

We analyzed the avalanche size distribution's fit to a power law using methods described previously (Clauset et al., 2007; Klaus et al., 2011). The candidate distributions were the power law and exponential distributions, each limited to the range between a minimum and a maximum size. This focus on power laws stems from the theory of critical branching processes, which predicts power law behavior (Harris, 1989). We note that alternative heavy-tail distributions such as the gamma and log-normal distributions are characterized by two parameters and can give better fits than power laws or single exponentials simply due to the additional degree of freedom.

Power laws were modeled as follows:
Exponential functions were modeled as follows:
where *C*_{α and} *C*_{λ} are normalization factors. The parameter *x*_{min} was set to 1, the minimal avalanche size, and *x*_{max} was set to 1.5 times the total number of sensors included in the analysis, which included virtually all observed avalanches.

Assuming independence of avalanche sizes and a sample of *n* avalanches, the likelihood of each model, given a parameter α or λ, is as follows:
For ease of computation, the logarithms of the likelihoods can be calculated instead, producing the log-likelihood as follows:
The best fit parameters for both distributions (α̂ and λ̂) can then be calculated by maximizing this log-likelihood as a function of the parameter. To determine whether the power law or exponential distribution was a better fit to the data, the log of the likelihood-ratio (LLR) was taken with the best fit parameters as follows:
If the LLR is positive, the power law model is more likely, and if it is negative, the exponential model is more likely. If the LLR is zero, neither distribution is more likely. To determine whether the LLR was significantly different from zero, the *p*-value of the LLR was calculated as follows:
where:
with ℓ̅_{α} = ℓ(α | *x*^{(n)})/*n* and ℓ̅_{λ} = ℓ(λ | *x*^{(n)})/*n*.

For further control, surrogate data were generated for each subject by shuffling the component frequencies of each individual sensor, which maintains the power spectrum but destroys phase relationships across frequencies and sensors.

##### Finite-size scaling and coarse-graining of sensor arrays.

To study the effect of sensor array size on the cascade size distributions, cascades were identified using data from contiguous subsections of the array with *N* sensors. The distributions were calculated for cascade size, *S*, and normalized cascade size, *Z* = *S*/*N*. The power law cutoff was identified as the right-most point in the distribution before the tail of the distribution consistently remained below the fitted power law. Finite size scaling was quantified by correlating the size of the cutoff with the size of the sensor array used for avalanche analysis. Coarse graining of the sensor array was performed by grouping clusters of neighboring sensors and combining their event rasters with a logical OR operation. Therefore, multiple events observed during the same time bin for any sensor group were counted as one event.

##### Estimation of the branching parameter.

The branching parameter σ was estimated by calculating the ratio of the number of events in the second time bin of a cascade to that in the first time bin. This ratio was averaged over all cascades for each subject with no exclusion criteria (Beggs and Plenz, 2003) as follows:
where *N*_{av} is the total number of avalanches in the dataset and *n*_{events} represents the number of events in a particular bin. Note that for single bin cascades, the second bin is an empty bin and therefore the corresponding ratio is 0.

##### Phase synchronization analysis.

Kitzbichler et al. (Kitzbichler et al., 2009) suggested probing for critical dynamics using phase synchronization, which we both replicated and simplified here. The metric involves measuring the durations of phase synchrony between pairs of sensors and then examining the distribution of such phase synchrony durations across all channel pairs.

The data in each pair of channels were first transformed into a complex representation of instantaneous amplitude and phase as follows:
The method for this transformation used in Kitzbichler et al. (2009), along with an additional, simplified method, are described below. Once an amplitude and phase representation is obtained, the instantaneous phase difference between two channels can be computed as follows:
Due to rapid noise fluctuations, a more reliable measure is obtained by smoothing as follows:
where 〈 … 〉 denotes a running average with a time window of Δ*t*.

Between two channels, a period of phase synchrony is identified when both the phase difference is below a threshold and the amplitudes of the channels are above a threshold, as follows: The durations of all such periods are recorded for all channel pairs. In Kitzbichler et al., (2009), it was found that the probability distribution of these phase locking intervals (PLIs) follows a power law in the Ising and Kuramoto models when at criticality, as well as in human resting state fMRI and MEG signals.

To characterize phase synchronization among channels, the raw time series data were transformed into a complex representation of instantaneous amplitude and phase. In Kitzbichler et al. (2009), this was done by filtering with Hilbert wavelet pairs, as described previously (Selesnick, 2001; Whitcher et al., 2005). Results for this method are shown in Figure 9*A*, *B*. In addition, a simplified method was used in which the signal was first band-pass filtered, and then the Hilbert transform was applied to produce the instantaneous amplitude and phase. Specifically, the signals in each channel, *x*(*t*), were first filtered in the band of interest using a 25-pole finite impulse response filter. The filter was applied first forward and then backward in time to eliminate phase distortions. The complex analytic representation of the filtered signal, *y*(*t*), was calculated as follows:
where *y _{H}*(

*t*) is the Hilbert transform, defined as follows: The Hilbert transform represents the convolution of the signal with the function

*C–H*.

##### Model for testing the effect of linear mixing among sensors.

The analyses we performed were based on sensor data. Each MEG sensor linearly sums contributions from many sources and some overlap in the sensitivity of nearby sensors exists. Therefore, the transformation from sources to sensors involves a certain degree of linear mixing of the underlying sources. To evaluate the effect of linear mixing on the results, we ran simulations of a simple neural network with a two-dimensional layout and added a layer of simulated sensors, which linearly mix the activations in the underlying network.

The model was based on one described previously (Shew et al., 2009) and consisted of *N* = 2025 binary neurons arranged in a 45 × 45 grid. The synaptic coupling strengths are denoted by *p _{ij}* and represent the probability that neuron

*i*will spike at time

*t*+ 1 if neuron

*j*was the only one that spiked at time

*t*. In general, if a set of neurons

*J*(

*t*) spikes at time

*t*, the neurons obey the following dynamics: where Θ[

*x*] is the unit step function,

*p*=

_{iJ}(t)*t*) is a random number from a uniform distribution in [0,1]. The coupling strengths,

*p*, are random and fall with the distance between neurons. Specifically, we used a Gaussian envelope with a SD of four times the distance between neighboring neurons. The branching parameter of the network is defined by σ ≡

_{ij}*N*

^{−1}

The sensor layer consisted of *M* = 64 sensors arranged in an 8 × 8 grid. To avoid boundary effects, the sensor array was located within the neuron grid with wide margins on each side. Specifically, we first created a 12 × 12 sensor grid covering the total area of the neuron grid and then considered only the 8 × 8 central sensors. Therefore, the distance between neighboring sensors, denoted by *d*, was exactly four times the distance between neighboring neurons and the margins on each side corresponded to two times the distance between neighboring sensors, or equivalently, to eight times the distance between neighboring neurons. Each sensor summed the activity of the underlying neurons with a Gaussian weighting function centered around its location on the grid. The sum of the weights was normalized to 1. The SD of this Gaussian was varied from 0.5 *d* (small overlap) to 1.5 *d* (large overlap). A threshold function was then applied to the summed activation, with the threshold proportional to the peak value of the Gaussian. The proportion factor was chosen to be 0.35. High threshold values result in misdetection of events on the grid and thus in underestimation of the branching parameter. Similarly, low threshold values tend to result in multiple events on the sensor array in response to a single event on the neuron grid, which may lead to underestimation of size 1 cascades. Our choice reflected a tradeoff between these two effects, but the results were robust for a wide range of values. After the thresholding operation, in each time step, there was an *N*-by-1 binary vector representing the network state and an *M*-by-1 binary vector representing the corresponding state of the sensor array. The simulation was run for 60,000 cascades. The power low exponent and the branching parameter of the sensor array were estimated in the same way as in the MEG data. In particular, the branching parameter was estimated using the ratio of events in the first two time steps of each cascade.

We also simulated independent Poisson processes on the neuron grid and recorded the corresponding activity on the sensor array. In each time step of a cascade, neuronal activations were chosen from the Poisson process and the cascade terminated when no neuron was active. The mean rate of the Poisson process was set to match the mean activation probability of neurons in the critical network simulation.

## Results

We analyzed resting state activity from a total of 124 healthy subjects recorded at two different facilities using two different MEG systems (Hansen et al., 2010). At the NIH facility, 104 subjects (38 males, 66 females; age, 31.8 ± 11.8 y) were recorded for 4 min each using 273 axial gradiometer sensors. At the Cambridge facility, 20 subjects (15 males, 5 females; age, 28.8 ± 7.2 y) were recorded on 2 separate days for 3.5 min each using 87 pairs of orthogonal planar gradiometer sensors, resulting in 40 recordings. All recordings were band-pass filtered before analysis from 1–80 Hz.

To measure spatiotemporal cascades of discrete events, we first identified prominent deflections in the continuous signal collected from each sensor (Fig. 1*A*). The corresponding amplitude distributions differed from a Gaussian fit for values larger than ±2.7 SD (Fig. 1*B*). A Gaussian distribution of amplitudes is expected to arise from a superposition of many uncorrelated sources. Conversely, avalanche analysis focuses on the occurrence of local synchronized group activity that is expected to generate amplitude distributions that deviate from a Gaussian. Therefore, for further analysis, we identified large positive and negative signal deflections using an amplitude threshold of ±3 SD, followed by the detection of the extreme of each deflection (Fig. 1*A*). This signal discretization maintained most of the strong correlations found in the continuous MEG signals recorded from different brain regions as quantified by examining pairwise correlations between sensors before and after discretization (Figs. 1*C–D*). For each sensor, ETAs were calculated for both positive and negative events (Fig. 1*C*) and a continuous signal was reconstructed by convolving the event sequence with the ETAs (see Materials and Methods). The scatter plot of cross-correlations between pairs of original signals and the corresponding pairs of reconstructed signals (Fig. 1*D*) reveals that the discretization largely maintained the correlations between different brain sites despite a slight bias to reduce weak correlations, as expected for a thresholding operation (*r* = 0.91 reconstructed vs original values). Similar results were obtained for all subjects (*r* = 0.89 ± 0.04 mean ± SD).

In Figure 2*A*, we show a typical event raster from one subject with 273 sensors and a duration of 10 s. Events were found to cluster in time across subgroups of sensors. This markedly nonrandom organization of neuronal activity in space and time suggests that neuronal group activity during rest was correlated across different brain sites. To gain further insight into this organization, we quantified event clusters for each subject. First, the raster was binned at several temporal resolutions Δ*t*, which were multiples of the sampling time step (NIH: 1.67 ms, Cambridge: 4 ms). Then, clusters were identified by concatenating series of successive bins with at least one event (Fig. 2*B*,*C*). This approach captures a wide range of spatiotemporal organizations in neuronal group synchronization and recognizes correlations between neuronal groups that are successively active at temporal resolution Δ*t*, as well as neuronal groups that are near simultaneously active within a Δ*t* period. Event clusters frequently engaged spatially contiguous sensors (Fig. 2*D*).

### Cascade size distributions reveal scale-invariant dynamics

We next calculated the size of each cluster, *s*, simply defined as the number of events in the cluster, and plotted the probability of cluster size, *P*(*s*). The distribution for a single subject (273 axial gradiometers) analyzed at Δ*t* = 3.3 ms appears approximately linear in double-logarithmic coordinates with a cutoff at ∼100–200 sensors (Fig. 3*A*). A maximum likelihood based analysis (see Materials and Methods; Clauset et al., 2007; Klaus et al., 2011) demonstrated a significantly better fit to a power law compared with an exponential function (for the depicted subject, *P* < 0.05). Similar results were obtained for >98% of the 104 NIH subjects at Δ*t* = 1.67 ms, which slightly decreased to 89% at Δ*t* = 10 ms (*p* < 0.05 for all Δ*t*). The power law demonstrates long-range spatiotemporal correlations beyond chance among MEG sensor locations. Accordingly, phase shuffling of the original continuous MEG signal (see Materials and Methods) destroys synchronization among brain sites and the power law (Fig. 3*A*, broken red line). The phase shuffling maintains the power spectrum density function, which, for the range of 10–50 Hz, followed a power law of the form 1/*f*^{β} with β = −2.8 ± 0.6 (mean ± SD; *n* = 104 subjects).

Several controls further demonstrate that the cluster size distribution in resting MEG underlies scale-invariant cortical dynamics. First, the cutoff of the power law reflects the finite size of the sensor array and does not indicate a spatial limit of the cortical dynamics. We divided the original sensor array into subarrays of different sizes and recalculated the cluster distribution for each size. As the sensor array increases in size, the power law cutoff also increases (Fig. 3*B*). The mean estimated power law cutoff (see Materials and Methods) across all subjects increased linearly with the size of the sensor array (*R*^{2} = 0.99; *n* = 104; Δ*t* = 3.3 ms). Accordingly, the corresponding distributions collapsed when rescaled for array size (Fig. 3*B*, inset). Second, the power law did not change when the spatial resolution used to identify cortical dynamics was reduced. Coarse graining by combining adjacent sensors using a logical OR operation for event detection maintained the power law in cluster sizes (Fig. 3*C*; *p* < 0.05 for >97% of the *n* = 104 subjects at Δ*t* = 1.67 ms, decreasing to 90% for Δ*t* reaching 10 ms). Cluster size distributions for sensor signals recorded in the absence of human subjects (“empty scanner” data) were not heavy tailed and differed sharply from a power law distribution (Fig. 4).

To further study the robustness of our results, NIH datasets were converted from axial to virtual planar gradiometers (see Materials and Methods). Planar sensors estimate activity from smaller, more localized cortical regions, thereby reducing potential contributions from overlapping sensor measurements. As shown in Figure 3*D–F*, cascade size distributions followed power laws (*p* < 0.05 for >99% of the *n* = 104 subjects; Δ*t* = 1.67–10 ms), obeyed finite size scaling (*R*^{2} = 0.99; *n* = 104; Δ*t* = 3.3 ms), and could be coarse grained consistently with our results for axial gradiometers (*p* < 0.05 for >99% of the *n* = 104 subjects; Δ*t* = 1.67–10). Similar results were found with real planar gradiometers for the Cambridge MEG system (Fig. 3*G–I*; Δ*t* = 12 ms). Although the overall quality of the power law fits was not as strong as that obtained with the NIH system, >82% of Cambridge subjects demonstrated significant power law fits for Δ*t* of 8–20 ms at the *P* < 0.05 level. The distributions revealed finite size scaling (Fig. 3*H*) with a power law cutoff that increased linearly with the size of the sensor array (*R*^{2} = 0.99; *n* = 20 subjects; Δ*t* = 12 ms). They were also robust to coarse graining (Fig. 3*I*), although the power law fit after coarse graining decreased markedly for higher Δ*t* (*p* < 0.05 for 85% of *n* = 20 subjects at Δ*t* = 8 ms, decreasing to 40% of subjects at Δ*t* = 20 ms).

### Resting state dynamics is captured by a critical branching process

We next tested the hypothesis that a critical branching process describes the dynamics underlying the power law in cluster sizes. A branching process is described by the branching parameter, σ, which is the ratio of descendant events to ancestor events. Here, ancestors constitute events in the very first time bin of a cascade, followed by descendants in the subsequent time bin (Beggs and Plenz, 2003). The average ratio of these two numbers over all cascades estimates σ for each subject. For a critical branching process, σ is expected to be exactly 1; that is, dynamics are balanced and one event on average leads to one future event. The corresponding cascade size distribution is expected to form a power law with exponent α equal to −3/2 (Harris, 1989).

Although critical dynamics are scale invariant in both space and time, the experimental necessity of a sensor array with characteristic spacing between sensors imposes a characteristic timescale on the observed events in the system. This relationship between timescale and sensor spacing was demonstrated previously for neuronal avalanches (Beggs and Plenz, 2003) and follows from the average speed of activity propagation in cortical tissue; as the sensor spacing increases, the time period Δ*t* required for activity to propagate across sensors will increase. In the present study, the folding of the cortex makes exact prediction of the timescale imposed by the sensor spacing difficult. We therefore systematically explored multiple timescales for evidence of a critical branching process.

Using axial sensors and different timescales (i.e., values of Δ*t*), we found that both α and σ increased with longer Δ*t* for all NIH subjects (Fig. 5*A*), which is consistent with previous findings on neuronal avalanches *in vitro* (Beggs and Plenz, 2003). For the value of Δ*t* where σ = 1, α was close to −3/2 (Fig. 5*A*,*B*), which is consistent with a critical branching process. This Δ*t* value was 3.3 ms and at that timescale, the average avalanche rate across all NIH subjects was 18.3 ± 4.1 avalanches/s given a threshold of ±3 SD. The intersection of the critical values σ = 1, α = −3/2 persisted when taking subarrays (Fig. 5*C*), coarse graining the array (Fig. 5*D*), or changing the event threshold (Fig. 5*E*). These findings were also robust to different peak detection methods. Taking the most extreme peak of the signal beyond threshold excursion or, alternatively, including multiple local peaks beyond threshold excursion resulted in virtually identical results in the (σ, α) plane for systematically varied thresholds ranging from ±2.7 to ±4.2 SD (Fig. 5*E*,*F*). Similar results were found for virtual planar gradiometers (Fig. 5*B*,*D*) and for real planar gradiometers at the Cambridge site (Fig. 6), although for the latter, the cascade size distributions were shallower; in particular, α was more positive than −3/2 for σ = 1 (Fig. 6*A–D*). Eye closure or the removal of eye movement artifacts did not change these findings for the Cambridge data (Fig. 6*E*,*F*).

For each subject, both the power law exponent and branching parameter were consistent over time. Subjects at the NIH facility had well correlated results between the first and second half of the recording (Fig. 7*A*,*B*; axial: α: *R*^{2} = 0.83; σ: *R*^{2} = 0.72) and subjects at the Cambridge facility had correspondence across 2 visits separated by at least 1 week (Fig. 7*C*,*D*; α: *R*^{2} = 0.84; σ: *R*^{2} = 0.73).

### Effect of linear mixing at the sensors on estimating avalanche dynamics

Our analysis relied on data at the sensor level. In principle, the transformation from sources to sensors, which involves linear mixing of the sources due to sensor overlap, could have affected our results. To examine the potential effects of linear mixing, we therefore simulated a neural network with a two-dimensional layout (*N* = 2025 neurons arranged in a 45 × 45 grid) together with the response of a simulated sensor array (*M* = 64 sensors arranged in an 8 × 8 grid; see Materials and Methods). The arrangement of the neuron and sensor array is sketched in Figure 8*A*. We compared the cascade size distributions on the neuronal grid to those measured by the simulated sensor array. Each sensor had a Gaussian weight function and we explored the effect of sensor overlap by changing the width of the Gaussian while maintaining the spacing among sensors (Fig. 8*B*, illustration). To determine whether uncorrelated sources can give rise to a power law distribution at the sensor level, we simulated independent Poisson processes (see Materials and Methods). The cascade size distribution from the activities on the neuron grid differed markedly from a power law (Fig. 8*C*) and a similar exponential behavior was obtained at the sensor level (Fig. 8*D*), albeit with a slower decay for larger sensor overlap. When the network was critical, it displayed a power law distribution of cascade sizes (Fig. 8*E*) and a similar power law distribution was observed at the sensor level (Fig. 8*F*). The power low exponent was close to −1.5 for small sensor overlap (SD = 0.5 *d*), but the power laws became more shallow with increasing sensor overlap (Fig. 8*G*). Increasing the overlap among sensors also caused a slight underestimation of small cascades, because even a single event on the grid now tended to be registered by more than a single sensor (Fig. 8*F*). In contrast, the estimated branching parameter on the sensor array remained close to 1 for the range of sensor overlap studied. These simulations suggest that uncorrelated activity cannot appear as neuronal avalanches due to sensor overlap and that sensor overlap mainly affects the power law exponent of the cascade size distribution when the system is critical.

Recently, a power law distribution of phase-synchrony periods between two sites has been suggested as evidence for criticality in human MEG (Kitzbichler et al., 2009). For comparison, we therefore calculated the duration of PLIs for different frequency bands in single subjects (Fig. 9*A*,*C*,*E*) and the grand average across all NIH subjects (Fig. 9*G*). For both the NIH and the Cambridge data, we found clear power laws in the PLI distributions for each frequency band, confirming the previous report of Kitzbichler et al. (2009) and expanding this finding to two different MEG systems. However, we also found similar power law behavior in our control, “empty scanner” datasets for both the NIH and Cambridge data (Fig. 9*B*,*D*,*F*,*H*). This should be contrasted with the avalanche metric, which did not display power law behavior on empty scanner data (Fig. 4).

## Discussion

Our results show that resting state activity of the human brain as measured by MEG is well described by a critical (i.e., balanced) branching process, which produces scale-invariant neuronal avalanches. These findings correspond to neuronal avalanches previously identified at smaller spatial scales in *in vitro* and *in vivo* (e.g., in the awake monkey), implying a universality of cortical activity (Beggs and Plenz, 2003; Gireesh and Plenz, 2008; Petermann et al., 2009).

The cortical MEG signal is generated mainly by synchronous synaptic activity that induces current flow along extended structures aligned in parallel, such as apical dendrites of pyramidal neurons (Hansen et al., 2010). It also has a time resolution that matches the millisecond timescale of neural activity propagation, in contrast to the relatively slow timescale of the BOLD signal in fMRI. Compared with EEG, the MEG signal is less distorted by the meninges and skull. Here, we transformed the continuous, *z*-normalized MEG signal into discrete sequences of significant events using a threshold. This approach is similar to the analysis of significant events in the LFP in previous avalanche studies (Beggs and Plenz, 2003) and is consistent with the strong correspondence between evoked MEG signals and the LFP (Zhu et al., 2009). We have demonstrated that this transformation largely preserves the instantaneous correlations in neuronal activity between brain sites that underlie the MEG signal; while maintaining the strong pairwise correlations, it slightly reduces weak correlations. Our approach is thus similar to that commonly used in the reconstruction of functional connectivity graphs when weak correlations are ignored (Hagmann et al., 2008). Our results were robust to changes in the threshold level beyond noise, which is consistent with previous findings on the fractal organization of neuronal avalanche amplitudes in the awake monkey (Petermann et al., 2009). The raster of the detected events showed a notable spatiotemporal structure with cascades of activity. We identified these cascades using several timescales and found that cascade size distributions followed a power law. The cutoff of the power law was a function of the size of the sensor array, demonstrating clear, finite-size scaling that is consistent with expectations for scale-invariant dynamics.

At the timescale when the branching parameter σ was 1, we observed a power law exponent α of −3/2. These are the hallmarks of a critical branching process. Our results thus demonstrate that human resting state activity is at a critical point where activity neither blows up to encompass the whole system nor dies quickly close to the starting point, but produces a rich repertoire of cascades covering all spatial scales. The power law distribution means that cascades covering large portions of the brain are far more common than by chance. We propose that these dynamics allow for intermittent and selective integration of activity across many cortical areas as required for, for example, workspace theory (Dehaene and Naccache, 2001; Kitzbichler et al., 2011).

Our results were obtained for different sensor types and MEG systems. However, cascade size distributions for the Cambridge data tended to be more shallow (Fig. 6), which, based on our linear mixing simulations, could point to a larger sensor overlap in these recordings (Fig. 8*G*).

Theory and experiment have shown that systems at criticality maximize various information processing features. Analytical and numerical studies predict that critical systems have maximized dynamic range (Kinouchi and Copelli, 2006), information transmission (Beggs and Plenz, 2003; Tanaka et al., 2009), and entropy of events and states (Haldeman and Beggs, 2005; Ramo et al., 2007). By experimentally manipulating the balance of excitation and inhibition, neuronal avalanche dynamics has been shown to maximize the dynamic range (Shew et al., 2009), pattern variability (Stewart and Plenz, 2006), and information capacity and input-output mutual information (Shew et al., 2011), and to represent the most diverse state of intermittent phase locking between distant cortical sites (Yang et al., 2012). Our results suggest that human cortex realizes these information processing advantages and maintains its dynamics at the critical branching parameter of 1. To stay at this critical point, there must be a balance of excitatory and inhibitory forces. Experimentally, perturbing dopamine tone (Stewart and Plenz, 2006; Gireesh and Plenz, 2008) or the balance of GABAergic and glutamatergic transmission (Shew et al., 2009) drives the system away from criticality and destroys neuronal avalanches. Models have shown that critical dynamics can also be maintained on a wide range of parameters through network topology (Rubinov et al., 2011) and such mechanisms as short-term synaptic depression (Levina et al., 2007; Millman et al., 2010). Because functional relationships established during specific tasks or experiences are reflected in corresponding changes of resting functional connectivity (Lewis et al., 2009), our findings imply that the brain employs homeostatic mechanisms to maintain the critical state in the face of plasticity.

At present, evidence for scale invariance and criticality in human brain activity is largely based on temporal measurements such as long-range temporal correlations (Linkenkaer-Hansen et al., 2001) and power-law power spectra (He et al., 2010). Conversely, the neuronal avalanche metric relies on both spatial and temporal aspects. A variation of the neuronal avalanche metric with events defined as moments of rapid amplitude transitions was applied to EEG data, but yielded poorer power law fits and α close to −1.9 (Allegrini et al., 2010). We could replicate recent findings of a power law distribution for phase-synchrony periods between two sites, which has been suggested as evidence for criticality in human MEG (Kitzbichler et al., 2009). However, we found this metric to be ambiguous, because power law behavior with the same exponents was also found for empty MEG scanner data. This suggests that additional steps such as amplitude comparisons might have to be taken into account to insure that power laws obtained from PLI analysis truly reflect the organization of brain activity. In contrast, our neuronal avalanche analysis exhibited power law behavior only for human data and not for empty scanners (Fig. 4). Consistent with our spatiotemporal approach, a recent study applied a point-process analysis to the BOLD signal from human fMRI resting state recordings and identified spatiotemporal contiguous clusters that obey a power law size distribution (Tagliazucchi et al., 2012). Another recent study addressed criticality by comparing the functional connectivity derived from resting state fMRI to the one obtained from simulations of a large-scale network model with the same anatomical connectivity. It was found that the model best fits the data when the scaling of the interactions is set at the edge of instability (Deco and Jirsa, 2012).

Perhaps the most significant advancement of the neuronal avalanche analysis is the specific generative mechanism of the branching process and related power law in size distribution. Here we demonstrate that the branching process model yields the numerical predictions of α = −3/2 and σ = 1 in the MEG, and these values are consistent with previous research *in vitro* and with animal models. This suggests that a critical branching process is indeed a universal property of cortical dynamics and thus opens a translational corridor between animal and human studies of normal and pathological brain activity. Cortical networks at criticality optimize numerous features of information processing (Kinouchi and Copelli, 2006; Shew et al., 2009; Shew et al., 2011), paving the way for new identifiers in brain disorders associated with dysfunctional information processing. These insights may produce mechanistically founded MEG biomarkers that could be linked to other MEG-based biomarkers (Langheim et al., 2006; Georgopoulos et al., 2007; Tsiaras et al., 2011; Zamrini et al., 2011). Specifically, the utility of critical dynamics to the human brain can be determined by comparing the normal avalanche metrics described here with evoked states, pharmacologically modified states, and neuropsychiatric disorders such as during the excitation/inhibition imbalance of epilepsy or disturbances of dopaminergic tone as found for schizophrenia.

## Footnotes

This work was supported in part by the Intramural Research Program of the U.S. National Institute of Mental Health, the U.K. Medical Research Council (Grant #MC_US_A060_0046), and the Wellcome Trust. The Cambridge MEG study was sponsored by GlaxoSmithKline Research and Development.

The National Institute of Mental Health has filed a U.S. patent application that covers neuronal avalanche dynamics as a potential diagnostic assay using MEG in humans (U.S. Patent Application 11/990419 filed August 14, 2006 claiming priority to August 12, 2005).

- Correspondence should be addressed to Dr. Oren Shriki, Section on Critical Brain Dynamics, National Institute of Mental Health, PNRC, Rm 3A-114, 35 Convent Drive, Bethesda, MD 20892. shrikio{at}mail.nih.gov or oren70{at}gmail.com