## Abstract

Interactions among different brain regions are usually examined through functional connectivity (FC) analysis, which is exclusively based on measuring pairwise correlations in activities. However, interactions beyond the pairwise level, that is, higher-order interactions (HOIs), are vital in understanding the behavior of many complex systems. So far, whether HOIs exist among brain regions and how they can affect the brain's activities remains largely elusive. To address these issues, here, we analyzed blood oxygenation level-dependent (BOLD) signals recorded from six typical macroscopic functional networks of the brain in 100 human subjects (46 males and 54 females) during the resting state. Through examining the binarized BOLD signals, we found that HOIs within and across individual networks were both very weak regardless of the network size, topology, degree of spatial proximity, spatial scales, and whether the global signal was regressed. To investigate the potential mechanisms underlying the weak HOIs, we analyzed the dynamics of a network model and also found that HOIs were generally weak within a wide range of key parameters provided that the overall dynamic feature of the model was similar to the empirical data and it was operating close to a linear fluctuation regime. Our results suggest that weak HOI may be a general property of brain's macroscopic functional networks, which implies the dominance of pairwise interactions in shaping brain activities at such a scale and warrants the validity of widely used pairwise-based FC approaches.

**SIGNIFICANCE STATEMENT** To explain how activities of different brain areas are coordinated through interactions is essential to revealing the mechanisms underlying various brain functions. Traditionally, such an interaction structure is commonly studied using pairwise-based functional network analyses. It is unclear whether the interactions beyond the pairwise level (higher-order interactions or HOIs) play any role in this process. Here, we show that HOIs are generally weak in macroscopic brain networks. We also suggest a possible dynamical mechanism that may underlie this phenomenon. These results provide plausible explanation for the effectiveness of widely used pairwise-based approaches in analyzing brain networks. More importantly, it reveals a previously unknown, simple organization of the brain's macroscopic functional systems.

- default mode network
- frontoparietal network
- functional connectivity
- pairwise correlation
- resting-state fMRI

## Introduction

By using functional magnetic resonance imaging (fMRI), complex neuronal dynamics have been observed even when a subject is not engaged in specific tasks, reflecting that different brain areas interact with each other in the resting state (Biswal et al., 1995; Raichle et al., 2001; Greicius et al., 2003; Greicius et al., 2004; Beckmann et al., 2005; Fox et al., 2005; Fox and Raichle, 2007). During the past two decades, the most widely used method for studying these interactions is the analysis of functional connectivity (FC), which is defined as a pairwise correlation in BOLD activities observed from two spatially separated brain areas (Biswal et al., 1995; Greicius et al., 2003; Bullmore and Sporns, 2009). Based on FC, many functional systems of the brain, including the default mode network (DMN; Raichle et al., 2001; Greicius et al., 2003; Buckner et al., 2008) and the frontoparietal network (FPN; Daffner et al., 2003; Dosenbach et al., 2008; Zanto and Gazzaley, 2013), have been identified and studied intensively, providing important insights into the organization of macroscopic neuronal interactions in both normal and pathological conditions (Greicius et al., 2004; Greicius et al., 2007; Bullmore and Sporns, 2009; Brier et al., 2012). However, for a network with more than two nodes, there could be interactions among triplets, quadruplets, etc., of nodes termed higher-order interactions (HOIs). These HOIs can play important role for information processing [e.g., exclusive-OR (XOR) function] and cannot be revealed by pairwise-based analysis (Schneidman et al., 2003; Schneidman et al., 2006; Ganmor et al., 2011; Yu et al., 2011). Indeed, HOIs have been observed in a number of complex systems (Ohiorhenuan et al., 2010; Ganmor et al., 2011; Ohiorhenuan and Victor, 2011; Shemesh et al., 2013). More importantly, it has been reported that neuronal interactions at the mesoscopic level, reflected by spiking activities, exhibit significant HOIs (Ohiorhenuan et al., 2010; Ganmor et al., 2011). These results suggest the possibility that HOIs may also exist at the macroscopic level and may even play a role in orchestrating interactions among different brain areas. Nevertheless, so far, the HOIs among macroscopic brain networks remain largely unknown.

It is known that HOIs in a network are determined by a number of factors, including the system's intrinsic organization (e.g., whether it contains operation such as XOR) and other confounding influences such as the size (Roudi et al., 2009), topology, and node composition (Schneidman et al., 2006), as well as the inputs that drive the system (Ganmor et al., 2011; Macke et al., 2011). A recent study has claimed that pairwise interactions are good enough in characterizing the activities of two functional networks, the FPN and the DMN, under a specific condition: with the global signal (GS) regressed (Watanabe et al., 2013). However, many important questions remain. First, we do not know how these results can be extended to cover other functional systems under various conditions; for example, different network sizes, observation scales, spatial relations among nodes, and lack of GS regression (noGSR). Second, what are the dynamical mechanisms that may explain the HOI structures that can be observed from the empirical data? Answering these questions will provide important information regarding the organization of neuronal interactions at the macroscopic level. In the present study, we combine comprehensive HOI analysis based on both fMRI data recorded during the resting state and the simulation of a dynamic network model under various conditions to address these issues.

## Materials and Methods

#### Dataset and preprocessing

The resting state fMRI (rs-fMRI) data analyzed here came from all 100 subjects (46 males and 54 females) of the Human Connectome Project (HCP) Q3 public data release (http://www.humanconnectome.org/data). During the resting-state scan, subjects underwent passive fixation in two separate sessions. In the two-fold cross-validation analyses, we separated the whole dataset into two halves, each one containing the data of one session from all subjects. Parameters for the resting data were as follows: TR = 720 ms, TE = 33.1 ms, FA = 52°, 2 × 2 × 2 mm voxels, FOV = 208 × 180 mm, and 72 oblique axial slices (Feinberg et al., 2010; Moeller et al., 2010; Setsompop et al., 2012; Van Essen et al., 2012). Each session contained 1200 time points for a total of 14.4 min. In total, 228,000 frames (approximately 46 h) of rs-fMRI data were used, with the first 60 frames of each scanning discarded.

The released fMRI data went through initial preprocessing by the HCP (Glasser et al., 2013). The detailed processing step adopted by HCP included: (1) gradient distortion correction, (2) motion correction, (3) distortion correction, (4) registration to the structural scan, (5) spline resampling to Montreal Neurological Institute (MNI) 2 × 2 × 2 mm^{3} space using the FSL software package (Jenkinson et al., 2002; Smith et al., 2004), and (6) intensity normalization to mean of 10,000 and bias field correction. We added the following preprocessing steps for our analysis: (1) band-pass filtering the time-series (0.01 Hz < *f* < 0.1 Hz); (2) censoring motion related frames with framewise displacement >0.2 mm (Ciric et al., 2017), performed in the analysis comparing the effect of different methods of motion artifacts rejection; and (3) regressing nuisance signals of 36 regressors described previously (Ciric et al., 2017). The regressors included tissue mean signal *T* (white matter mean signal, cerebrospinal fluid mean signal, global mean signal) and six head motion sequences *R*, their ([*T*, *R*]) derivatives, quadratic terms, and squares of derivatives. To examine the effect of global signal on higher-order interaction, we preprocessed the data separately with and without regressing the global mean signal (GSR vs noGSR). For the pipeline of noGSR, the four components of the GS were dropped (global mean, first derivatives of global mean, square of global mean, and square of the first derivatives).

We extracted mean region-of-interest (ROI) signals of six typical functional networks from all the brain regions (50 ROIs), which included the FPN (11 ROIs), the DMN (12 ROIs), the dorsal attention network (DAN, 8 ROIs), the executive control network (CON, 5 ROIs), the salience network (SAL, 7 ROIs), and the sensory-motor network (SMN, 7 ROIs). We defined each ROI by using a sphere with radius of 2, 5, and 10 mm, centering at the peak coordinate chosen based on previous studies (Brier et al., 2012; Watanabe et al., 2013; Zhan et al., 2016). All ROIs were masked by a gray matter probability map thresholded at 10% provided by the SPM12 software package (http://www.fil.ion.ucl.ac.uk/spm/software/spm12/), with corresponding peak coordinates listed in Table 1.

To estimate HOIs in finer scale and spatial closer networks comparing with the 6 typical networks, we constructed 9-node networks, with all of their nodes (2 mm radius ROI for each one) completely contained within individual 10-mm-radius ROIs of 6 typical functional networks. For such networks, their nodes were centered at 8 vertices and geometric center of a × 10 mm^{3} cuboid that shared the same center with the 10-mm-radius ROI. In total, due to the constraint of the gray matter mask, 23 such finer-scale networks out of all 50 10-mm-radius ROIs could be constructed.

#### Data analysis

##### Outline of network analysis.

To reduce the dimensionality of the fMRI data to analyze HOIs properly, we dichotomized the continuous signals into binary time series. Specifically, we converted the continuous BOLD signals into discrete, binary time series of activity patterns by using different thresholds and different time bins (threshold ranges from 0.25 to 2 times the SD for individual ROIs; time bin width equals 1, 2, 4, 6, and 12 times of TR). The empirical pattern probability distribution was fitted with an independent model (Schneidman et al., 2006), a pairwise model (the Ising model; Schneidman et al., 2006; Tang et al., 2008; Yu et al., 2008), and a model with thresholding-induced HOIs [Macke et al., 2009; Yu et al., 2011; the dichotomized Gaussian (DG) model]. The strengths of different-order interactions were estimated directly from the distribution of experimental data and corresponding Ising and DG models (Yu et al., 2011). The fitting accuracy was quantified by Jensen–Shannon (JS) divergence *D*_{JS} (Schneidman et al., 2006; Yu et al., 2011; see below for details).

##### Three interaction models used in the present study.

The independent model (Schneidman et al., 2006) is a maximal entropy model containing no interactions. Its pattern probability distribution is given by *P*(σ_{1}, σ_{2}, …, σ* _{n}*) = exp(∑

*σ*

_{i}h_{i}*) where (σ*

_{i}_{1}, σ

_{2}, …, σ

*) denotes the state pattern of a network. The state variable σ*

_{n}*is the binary state (+1 or −1) of the region*

_{i}*i*. The intrinsic preference

*h*is the parameter representing the likelihood of individual nodes to be active, which is determined by the empirically observed activity rate < σ

_{i}*>. The partition function*

_{i}*Z*is a normalization factor.

The theoretical basis for understanding interaction structure in neural networks is information geometry, in which the complex interactions are decomposed into different orders (Nakahara and Amari, 2002). The second-order interactions were studied using the Ising model (Schneidman et al., 2006; Tang et al., 2008; Yu et al., 2008), which is a maximal entropy model containing pairwise interactions only and its expression is *P*(σ_{1}, σ_{2}, …, σ* _{n}*) = exp(∑

*σ*

_{i}h_{i}*+ ∑*

_{i}_{i<j}

*J*σ

_{ij}*σ*

_{i}*). The intrinsic preference*

_{j}*h*is the strength of the external field and the exchange interaction

_{i}*J*is the strength of the second-order interaction between the regions

_{ij}*i*and

*j*. The value of both

*h*and

_{i}*J*were determined by the observed activity rate < σ

_{ij}*> and pairwise correlation < σ*

_{i}*σ*

_{i}*>, respectively.*

_{j}The DG model (Macke et al., 2009; Yu et al., 2011) is a two-layer model containing a bottom layer of continuous, multivariate Gaussian variables **u** and the top layer obtained by converting these Gaussian variables into binary variables by thresholding (i.e., dichotomization). The state variable **u** = (*u*_{1}, *u*_{2}, …, *u _{n}*) ∼ N(μ, δ) where

*n*is the dimensionality and N(μ, δ) is a standard multivariate Gaussian distribution with the mean of μ and the covariance of δ. To make the binary variables fit the mean

**r**and covariance ∑ of the experimentally observed data, the choice of μ

_{i}and δ

_{ij}need to satisfy the relationship of μ

*= Φ*

_{i}^{−1}(

*r*) and ∑

_{i}*= Φ*

_{ij}_{2}(μ

*, μ*

_{i}*, δ*

_{j}*) − Φ*

_{ij}_{1}(μ

*)Φ*

_{i}_{1}(μ

*), respectively. The functions of φ*

_{j}_{1}and φ

_{2}are the cumulative probability of one- and two-dimensional standard Gaussian distribution, respectively, and φ

^{−1}is the inverse function of φ

_{1}. We applied the algorithm in Macke et al., 2009 to obtain the parameters μ and δ in the multivariate Gaussian distribution. Based on the distribution, the pattern probabilities

*P*(σ

_{1}, σ

_{2}, …, σ

*) were calculated by integrating the multivariate Gaussian distribution from −∞ to the thresholding value or from the thresholding value to +∞.*

_{n}##### Calculating strengths of different order interactions.

Strengths of different order interactions can be measured by the method used previously (Ganmor et al., 2011) when the pattern probability distribution *P*(σ_{1}, σ_{2}, …, σ* _{n}*) was known. Generally, the probability can be calculated as follows:

*P*(σ

_{1}, σ

_{2}, …, σ

*)= exp(∑*

_{n}*α*

_{i}*σ*

_{i}*+ ∑*

_{i}_{i < j}β

*σ*

_{ij}*σ*

_{i}*+ ∑*

_{j}_{i < j < k}γ

*σ*

_{ijk}*σ*

_{i}*σ*

_{j}*+ …) where α, β, and γ are strengths of the first-, second-, and third-order interactions (3OIs), respectively. Specifically, α*

_{k}_{1}= log

*P*(100…) + log

*Z*where

*P*(100…) is the probability of the pattern in which the first node is active while all other nodes are quiescent, and

*Z*= 1/

*P*(000…); β

*= log*

_{12}*P*(1100…) + log

*Z*− ∑

_{i=1}

^{2}α

*; and γ*

_{i}_{123}= log

*P*(11100…) + log

*Z*− ∑

_{i=1}

^{3}α

*− β*

_{i}_{12}− β

_{13}− β

_{23}.

##### Time bin and thresholding methods for analyzing empirical data.

Different time bins (1, 2, 4, 6, and 12 times of TR) were used here to analyze the empirical data. In all presented results for the empirical data, the time bin was chosen to equal one time of TR. For cases with larger time bins, two methods were used to determinate the value for individual bins: the peak-value method and the mean-value method (the maximum value or the mean value among the data within a given time bin). Three different thresholding methods were tested here: (1) the absolute value method, which uses a predefined value as the threshold; (2) the global method, which uses the SD of all ROIs in a network to calculate the threshold; and (3) the scaled method, which uses the SD for individual ROIs to calculate the threshold. For the global and scaled methods, the threshold used was the summation of the mean and certain times of SD, as defined above. The global method was applied for all results of the empirical data shown here. Similar results can be obtained for the different time bin and thresholding methods (e.g., see Fig. 5*C*,*D* for the results obtained using the peak-value and mean-value thresholding methods). We note that the results presented in the main text were based on positive thresholding, that is,the dataset of positive deflections in the BOLD signals. The results based on various levels of negative thresholds were the same as that of positive thresholds.

##### Quantification of model performance.

To measure the accuracy of models' prediction to empirically observed pattern probabilities, we used the JS divergence (Schneidman et al., 2006; Yu et al., 2011). The JS divergence between probability distributions *P* and *Q* is defined by *D*_{JS}(*P*, *Q*) = 0.5 × *D*_{KL}(*P*‖*M*) + 0.5 × *D*_{KL}(*Q*‖*M*), where *M* = (*P* + *Q*)/2 and *D*_{KL} is the Kullback–Leibler divergence, defined as *D*_{KL}(*P*‖*Q*) = ∑* _{i}p_{i}* log(

*p*/

_{i}*q*). We note that the JS divergence can quantify the distance between two distributions in a statistical sense, but it cannot address the functional implication of such distance. To quantify the improvement in performance of a model over the other, we define the relative accuracy index (RAI) as |

_{i}*D*

_{JS}

^{model1}−

*D*

_{JS}

^{model2}|/

*D*

_{JS}

^{model1}, which is adapted from the accuracy index used previously (Watanabe et al., 2013). In addition, to measure the higher-order information beyond the model, we quantified the entropy difference between various models and the empirical data as

*I*

_{model}

^{(H)}=

*H*

_{model}−

*H*

_{data}, where

*H*is the entropy calculated by

*H*= − ∑

*P*log(

_{i}*P*).

_{i}To determine whether a parametric model can provide accurate enough prediction for the data, we used a two-fold cross-validation method. The original dataset was separated into two equally sized sets. The Ising and the DG model fitted to the first set were used to predict the probabilities for the second set. Their prediction performance was compared with the “Exp” method, which uses the observed pattern probabilities of the first set to predict directly that of the second one. Previous study has shown that the performance of Exp method was highly sensitive to the amount of training data (Yu et al., 2011), so the two-fold cross-validation, which can provide the largest possible training dataset compared with other cross-validation methods, was used here to test the performance of the Ising and DG models in a more stringent way.

##### Examining statistical significance.

To determine whether the improvements of the DG model over the Ising and the Exp models in approximating both the activities of all six functional networks and internetwork activities were statistically significant, we separated the entire dataset randomly into two halves to perform the two-fold cross-validation test 100 times and, in each time, the performance of the three methods (Ising, DG, and Exp) was quantified by *D*_{JS}, providing the distributions to assess the mean and variability of their performance. Statistical significance was then examined using the permutation test (Nichols and Holmes, 2002). Specifically, *D*_{JS} of any two methods in all 100 times of two-fold cross-validation was first pooled together and then permuted and resampled (without replacement) 10,000 times, providing the permutation distribution of difference in *D*_{JS} under the null hypothesis that the two models' performance was the same. This null hypothesis was rejected only if the actual difference between the mean *D*_{JS} of these two models (averaged across 100 times of two-fold cross-validation) was larger than 95% of the cases in the permutation distribution. In addition, conventional statistical tests were conducted when enough independent samples could be obtained in two-fold cross-validations, such as in the cases for three-node subnetworks, finer-scale and spatially close networks, and 10-node networks with nodes randomly chosen from all 50 ROIs. In those cases, to examine the statistical significance, we applied the Kruskal–Wallis test, followed by the Mann–Whitney *U* test for *post hoc* analysis when it was appropriate. The significance level was set to be *p* < 0.05 for both the Kruskal–Wallis and Mann–Whitney *U* tests.

##### Method for estimating the significant HOIs on subnetwork state probabilities.

To estimate whether empirically observed 3OIs of specific subnetworks were significantly different from the DG model's prediction, we used the corresponding DG model to generate 1000 synthetic datasets, each with the same sample size of the experimental data. An empirically observed 3OI was considered significant if its value was outside the 95% confidence interval of 3OIs calculated from this synthetic dataset.

##### Data generated by three-neuron interaction model.

The data in Figure 1, *B* and *C*, were generated according to the probability distribution *P* as follows:
The strengths of the second- and third-order interactions in the system were changed by adjusting the parameters β* _{ij}* and γ

*, respectively. Mean correlation observed in the system was calculated by averaging Pearson correlation coefficients (PCCs) of all pairs of nodes.*

_{ijk}#### Network model for generating simulated BOLD signals

##### Overview of the network model.

The network model consists of a number of “brain regions.” Each “region” is a collection of interacting neurons, with 80% excitatory and 20% inhibitory ones (Abeles, 1991; Brunel and Wang, 2001; Wang, 2002; Braitenberg and Schüz, 2013; Deco et al., 2014). The long-range connections between different regions were set according to the predefined connections and were exclusively excitatory. Within each region, neuronal activities were affected by the balance between the excitation and the inhibition, which was approximated by a slow-variable dominant dynamical equation using the mean-field method (Wong and Wang, 2006; Deco et al., 2013). In this system, the most important parameter that can influence the whole network's behavior is the strength of the long-range connections bridging different areas. For the sake of simplicity, we assumed the same strength for all interareal connections (termed global coupling). To simulate the fMRI signals, we then used the Balloon–Windkessel hemodynamic model that converts the fast neuronal dynamics unfolding at the neural network level to slower BOLD signals (Friston et al., 2003; Deco et al., 2013).

##### Mean-field neural networks.

In mean-field neural networks, the dynamics of each node, or brain region, could be described by the following nonlinear stochastic differential equations (Wong and Wang, 2006; Deco et al., 2013):
where *S _{i}*(

*t*),

*H*(

*x*), and

_{i}*z*(

_{i}*t*) denote the average synaptic activity variable (the gating variable), the mean firing rate and the total inputs for the local region

*i*, respectively. The connection matrix

*C*represents the underlying structure of the network, with

*C*= 1 if there is a connection from region

_{ij}*j*to

*I*and

*C*= 0 otherwise. The global coupling strength

_{ij}*G*controls the interarea interactions. The local coupling strength

*w*contains the effects of both excitation and inhibition within individual areas. The transmission delay τ

*represents the time needed for activities traveling between areas*

_{ij}*i*and

*j*. The external input to the model, σξ

*(*

_{i}*t*), has two components: ξ

*(*

_{i}*t*) = ϵξ

_{com}(

*t*) + (1 − ϵ)ξ

_{uni}(

*t*) where ξ

_{com}(

*t*) and ξ

_{uni}(

*t*) are different standard Gaussian noise, whereas ξ

_{com}(

*t*) is the common input to all areas and ξ

_{uni}(

*t*) is unique for individual areas. The parameter ϵ controls the ratio of common inputs. Parameters used for the activation function

*H*were as follows:

*a*= 270(VnC)

^{−1},

*b*= 108 Hz, and

*d*= 0.154 s. The kinetic parameter γ = 0.641/1000 (the factor 1000 was used to convert the unit of time from second to millisecond) and the time constant τ

_{s}= 100 ms. The synaptic conductance

*J*= 0.2609 nA. The local coupling strength

*w*was set to 0.85. The background input

*I*

_{0}= 0.3 nA. σ

*v*(

_{i}*t*) denotes the synaptic noise in the region

*i*and ξ

*(*

_{i}*t*) is the standard Gaussian noise with the amplitude σ = 0.25. The transmission delay τ

*is usually set to be zero except for the control analysis, in which it was randomly chosen from the range of [0, 50] ms.*

_{ij}##### The Balloon–Windkessel hemodynamic model.

The Balloon–Windkessel hemodynamic model was used to translate the simulated neural synaptic signals into BOLD signals (Friston et al., 2003; Deco et al., 2013). The BOLD signal of a cortical region *I* was computed by the mean synaptic activity *S _{i}* (Eq. 2) within this region. In short, for the

*i*

^{th}region, the synaptic activity

*S*induces an increase in the vasodilatory signal

_{i}*x*(Eq.5). The vasodilatory signal

_{i}*x*is also affected by a leakage term (κ

_{i}*in Eq. 5) and the feedback from the blood inflow*

_{i}x_{i}*f*, which is in turn proportional to

_{i}*x*itself (Eq. 6). The change in the blood volume

_{i}*v*is the linear function of

_{i}*f*and a leakage term (Eq. 7). The deoxyhemoglobin content

_{i}*q*is a nonlinear function of

_{i}*f*,

_{i}*v*, the resting oxygen extraction fraction ρ and itself (Eq. 8). Finally, the BOLD signal was calculated by the following: where

_{i}*V*

_{0}= 0.02,

*k*

_{1}= 7ρ,

*k*

_{2}= 2, and

*k*

_{3}= 2ρ − 0.2. The parameters used in the present study were set according to a previous study (Friston et al., 2003). It should be noted that, in this model, the BOLD signal of a region is the function of the synaptic activity within itself and is not affected by synaptic activities of other regions.

##### Network simulations with different topological structures.

Most of the simulation results were based on the structure of the FPN, with the exception of the case in which the cluster coefficient of networks was changed. To study the effects of inhibitory connections, we started with the FPN topology and randomly changed the directed excitatory connections to inhibitory ones with increasing probability, thereby gradually adjusting the ratio of inhibitory connections. To examine the effects of different cluster coefficients, we started with regular 11-node networks with mean degree < *k* > = 4, in which all nodes were arranged in a ring configuration and individual nodes were connected to two nearest neighbors from each side. Then, we rewired the directed connections randomly with an increasing probability (Watts and Strogatz, 1998). For each level of the cluster coefficient examined, six networks were generated.

##### Down-sampled data.

In our simulation, both the synaptic activities and the BOLD signals were down-sampled with an interval of TR_{s} = 240 ms for the HOI analysis and the total time for each simulation was 60 h. Similar results were obtained from the data with different time bin widths; for example: 3, 6, 12, 24, and 36 times TR_{s}, by using the global thresholding operation based on the peak value in the larger time bins.

##### Criteria of the network model constrained by the empirical data.

To ensure that the simulation can capture important features of the empirical rs-fMRI signals, two criteria were used: (1) the distribution of the BOLD signal amplitude should be unimodal and nearly symmetric; that is, a Gaussian-like distribution, which was the case for the empirical data; and (2) the overall level of pairwise correlations observed in simulations should cover the values exhibited in the empirical data.

## Results

We first illustrated how HOIs can be important in determining a system's behavior and how they can introduce spurious pairwise correlations. We then analyzed the effects of HOIs in recordings of macroscopic brain dynamics based on fMRI data collected in 100 human subjects during the resting state. Finally, we constructed a two-layer model to simulate the generation of BOLD signals in distributed brain networks, examining the possible mechanisms underlying the behavior of HOIs that we observed in the empirical data.

### Why are HOIs important?

Let us assume a small network composed of three nodes and each one could be either active or quiescent at a given moment. In Figure 1*A*, we depict one of such system (System 1) having three nodes that behave independently. In this case, there were quite a few instances in which three nodes were all active together by chance, representing a (trivial) synchronous state of the network. In another system (System 2), the probability of any nodes to be active (the first-order statistics), as well as the pairwise correlation between all pairs (the second-order statistics) were exactly the same as in System 1; that is, the correlations were all zero. Despite having the same first- and second-order statistics, System 2 exhibited no synchronous activities due to the injected third-order interaction (equivalent to the XOR operation) that actively prevented them. This simple example clearly illustrates that HOIs could be very important in shaping the systems' overall behavior and such an influence could escape our attention completely had the system been examined only at the pairwise level.

It was not just functionally important interactions such as the XOR operation in the above example that could be undetected at the pairwise level, HOIs can also “project” to the pairwise level, resulting in spurious pairwise correlations. In Figure 1, *B* and *C*, we show the results for two three-node systems, in which we kept the pairwise interactions stable while the 3OIs were increased systematically (see Materials and Methods for details). Despite the unchanged true pairwise interactions, the measured correlations between any two nodes were affected profoundly by the 3OIs. Importantly, there is no simple way to filter out such influence because the effects are largely determined by how active the individual nodes are (Fig. 1*B*) and how strong the true pairwise interactions are (Fig. 1*C*). In a large system composed of many nodes, there are many different HOIs; for example, fourth order, fifth order, etc., which can also “project” their effects to the pairwise level, adding yet another level of complexity.

In a system with abundant HOIs, the measured pairwise correlations are very difficult to interpret and the functional network extracted based on such measures should not be considered as a faithful representation of the true, underlying interaction structure.

### HOIs in macroscopic functional networks

To determine whether HOIs are present in actual brain dynamics, we analyzed activities of six typical brain networks, including the FPN (containing 11 ROIs; Daffner et al., 2003; Watanabe et al., 2013; Zanto and Gazzaley, 2013), the DMN (containing 12 ROIs; Raichle et al., 2001; Greicius et al., 2003; Watanabe et al., 2013), the DAN (containing 8 ROIs; Fox et al., 2006; Brier et al., 2012; Zhan et al., 2016), the CON (containing 5 ROIs; Seeley et al., 2007; Brier et al., 2012; Zhan et al., 2016), the SAL (containing 7 ROIs; Zhou et al., 2010; Zielinski et al., 2010; Brier et al., 2012; Zhan et al., 2016), and the SMN (containing 7 ROIs; Fox et al., 2009; Brier et al., 2012; Zhan et al., 2016), in the rs-fMRI data of 100 human subjects collected in HCP (see Materials and Methods for details). Below, we first present the results in details for the FPN and the DMN and then show more comprehensive results for all six typical functional networks and other networks with various parameters.

#### Effects of HOIs in the FPN and DMN

As shown in Figure 2*A*, the BOLD signals recorded from all ROIs within individual networks were first converted to binary activity patterns using a thresholding operation, with 1 indicating superthreshold episodes (active) and −1 indicating subthreshold episodes (quiescent). Next, we measured the probabilities of all possible patterns of the networks, which provide important information about the system's behavior. For example, for FPN (with 11 nodes), we measured the probabilities for all 2^{11} different activity patterns. We then tried to predict/explain all these probabilities using various parametric models, which take into account different orders of statistics measured from the empirical data. These models included: (1) the independent model, which introduces the same first-order statistics (how active a node is) as empirically observed but assumes that all nodes behave independently (Schneidman et al., 2006; Yu et al., 2011); (2) the pairwise maximal entropy model or Ising model, which introduces the same first- and second-order (pairwise correlations) statistics as empirically observed but assumes all HOIs are zero (Schneidman et al., 2006; Tang et al., 2008; Yu et al., 2011); and (3) the DG model. In addition to the observed first- and second-order statistics, the DG model also takes into account a specific structure of HOIs due to the thresholding operation used in extracting the binary activity patterns (Amari et al., 2003; Macke et al., 2009; Yu et al., 2011). To determine how well those models can predict/explain the probabilities of all patterns, we used the JS divergence (*D*_{JS}; see Materials and Methods for details). Comparing different models' performances gave us a quantitative measure of the relative importance of interactions of different orders in determining the networks' behavior.

In fMRI signal analysis, GSR is a widely adopted preprocessing step (Fox et al., 2009; Murphy et al., 2009; Van Dijk et al., 2012). However, the GS contains important information about neuronal dynamics and can be used as a biomarker in monitoring brain diseases (Schölvinck et al., 2010; Yang et al., 2014). Importantly, it may also introduce strong HOIs to the system and drastically change the overall interactions structure (Macke et al., 2011; Yu et al., 2011). Therefore, in our HOI analysis, we examined the GSR and noGSR conditions separately.

In Figure 2, *B* and *C*, we show results for the FPN and the DMN, respectively, under the GSR condition. Clearly, if the probabilities were predicted using the first-order statistics only (the independent model, indicated by black dots in Fig. 2*B*,*C*), then the prediction had enormous errors. If the second-order statistics were included into the prediction (the Ising model, indicated by blue asterisks in Fig. 2*B*,*C*), then the accuracy was greatly improved. Measured by the RAI, defined as |*D*_{JS}^{model1} − *D*_{JS}^{model2}|/*D*_{JS}^{model1}; see Materials and Methods for details; Watanabe et al., 2013), the prediction was improved by 96.35% and 93.1% for the FPN and the DMN, respectively, compared with the independent model. Moreover, if the threshold-induced HOIs were taken into account (the DG model, indicated by red diamonds in Fig. 2*B*,*C*), RAI could be further improved (30.9% and 32.4%) compared with the Ising model.

In Figure 2, *D* and *E*, we show the results for the same analyses but under the condition of noGSR. We found that the improvements in RAI achieved by the Ising model (over the independent model) were quite similar to the GSR condition (94.9% and 91.5% for the FPN and the DMN, respectively). However, in contrast to the GSR condition, now for the patterns that were mostly likely to occur, the Ising model gave much worse results, as indicted by large deviation of those data points from the equality line, suggesting an increased amount of HOI under such condition. Importantly, the DG model's prediction in the noGSR condition was as accurate as that in the GSR condition, resulting in the RAI increasing by >60% (67.6% and 64.9% for the FPN and the DMN, respectively) compared with the Ising model. These results indicated that the apparent HOIs in the noGSR condition can be satisfactorily explained by considering the thresholding operation with the DG model.

The examples analyzed above suggest that, although the Ising model gives good prediction for the GSR condition, a specific structure of HOIs due to thresholding are needed to explain the systems' behavior for the noGSR condition and the RAI as a measure to judge whether a model can give accurate prediction to the empirical data has important limitations. Therefore, we next performed more comprehensive analyses using three different measures and under various thresholding levels to further examine the importance of pairwise and HOIs for these two networks.

First, consistent with the above results, we found that the JS divergence, which is another way to measure the prediction accuracy, of the Ising model's prediction differed significantly under the GSR and noGSR conditions, whereas that of the DG model's prediction kept almost identical between the two conditions (Fig. 3*A*,*B*), suggesting that the GS does not introduce true HOIs. Second, to examine directly the information encoded by HOIs, we defined the higher-order information beyond a model as the entropy difference between the system and that model (see Materials and Methods for details). In Figure 3, *C* and *D*, we show that the higher-order information beyond the Ising model was much larger under noGSR condition. However, the higher-order information beyond the DG model was very similar between these two conditions, confirming that there was actually no information encoded by HOIs in the noGSR condition. Finally, we determined the strength of 3OIs directly in the systems and compared that with those predicted by the DG model. We found that, although much stronger 3OIs were observed empirically under the noGSR condition, this increase could be explained accurately by the DG model (Fig. 3*E*,*F*). Again, these results suggest that, in the noGSR condition, observed HOIs can largely be explained by the thresholding operation.

#### Comprehensive analyses of HOIs for large-scale functional networks

In addition, results similar to Figure 3 were obtained for other four networks, including the DAN, CON, SAL, and SMN. In Figure 4, we first show the diversity of the PCC of BOLD signals across all six networks with and without the GSR (Fig. 4*A*,*B*). To determine whether such a correlation structure can be well captured by studying the thresholded signals, we compared the correlation coefficients between all pairs of continuous BOLD signals with their counterparts calculated based on dichotomized signals. These two measurements strongly correlated with each other (*r* > 0.93, *p* < 0.01 for both GSR and noGSR data). In addition, the negative correlations in the continuous signals were mostly preserved after the binarization (Fig. 4*C*,*D*). These results suggest that the dichotomized signals indeed reflect the correlation structure of the original data. Next, we summarized the overall results concerning HOIs in all six networks using the two-fold cross-validation method. Specifically, the original dataset was separated into two equally sized sets. The Ising and the DG model fitted to the first set were used to predict the probabilities for the second set. In addition, we also used the observed pattern probabilities of the first set to predict directly that of the second one (Exp method). We found that, in both the GSR (Fig. 4*E*) and noGSR (Fig. 4*F*) conditions, the prediction of the DG model, measured by the JS divergence, was better than that of the Exp method for all six networks (*p* < 0.05, permutation test, see Materials and Methods for details). This is possible because, with the limited sampling (the first half of data), the DG model needs to estimate a much smaller number of parameters (only the first- and second-order statistics for a network with *N* nodes; ∼*N*^{2} parameters) compared with the naive model (Exp, ∼2^{N} parameters). Because one half of the dataset equals ∼23 h recording, our results indicate that the DG model, with ∼*N*^{2} parameters, could characterize brain networks' behavior very accurately. Our results from all six networks suggest that the lack of HOIs may be a fundamental feature for all major functional networks of the resting brain.

Because there were significant PCCs between the regions belonging to different networks (Fig. 4*A*,*B*), we also analyzed the effect of HOIs among regions belonging to different networks with the same method. Specifically, we studied the HOIs in five networks of 10 nodes, which were chosen randomly from all 50 ROIs (without replacement). We found that, in both the GSR and noGSR conditions, the prediction of the DG model was better than that of the Exp method (Fig. 4*G*,*H*; *p* < 0.05, permutation test), confirming that HOIs are also weak for interactions among different networks. Similarly, significantly better performance of the DG model can be demonstrated for 100 random 10-node networks, with nodes chosen randomly from all 50 ROIs (with replacement; Fig. 5*A*; *p* < 0.05, Kruskal–Wallis and Mann–Whitney *U* tests, see Materials and Methods for details). Furthermore, we found that the DG model could predict activities correctly for the entire network consisting of all 50 ROIs from these six networks (Fig. 4*I*,*J*), suggesting that weak HOIs hold true even for a global network of the resting brain.

Because the current study was aimed primarily at examining the functional networks at the macroscopic level, in the results mentioned above, we focused on relatively large spatial scale (10-mm-radius ROIs) and a specific temporal scale (TR = 720 ms as the time bin width). First, to determine whether, in a finer scale of ROIs, the HOIs would be more pronounced, we performed the same analyses for both the PFN and DMN with 2- and 5-mm-radius ROIs. The results obtained at these two finer scales are very similar to the results shown in Figure 4, *C* and *D*, demonstrating the significantly better performance of the DG model compared with the Exp method (*p* < 0.05, permutation test; results of the FPN are shown in Fig. 5*B*). Second, for different temporal scales, similar conclusions could be drawn for data obtained with dramatically different time bin widths (*p* < 0.05, permutation test; results of the FPN are shown in Fig. 5*C*,*D*). These results indicate that the lack of strong HOIs is not dependent on specific spatial or temporal scale at which the BOLD signals were analyzed during the resting state.

Because the motion artifacts had been demonstrated to be important in analyzing brain networks' activities based on fMRI signals (Power et al., 2012; Van Dijk et al., 2012; Power et al., 2014), we scrutinized the data using a more stringent method for motion artifact rejection suggested in a recent study (Ciric et al., 2017). The results were highly consistent with the previous ones. We found that the performance of the DG model was significantly better than the Exp method in all 6 functional networks under both the GSR and noGSR conditions (Fig. 5*E*, *p* < 0.05, permutation test), indicating that our conclusions were not affected by motion artifacts in the original BOLD signals.

#### Effects of HOIs on subnetworks

It remains to be tested whether HOIs are strong at specific subnetworks; for example, among three nodes, within these typical functional networks. To address this issue, using the FPN and the DMN as examples, we examined all three-node subnetworks within each of them and compared the strengths of observed 3OIs with that predicted by the DG model. Because the networks extracted in the noGSR condition exhibit stronger HOIs, here, we focus the analysis on this condition. We found that ∼65% of 3OIs in the real data were not significantly different from the DG model's prediction, judged as its value within 95% confidence interval of 1000 length-matched samples generated by the DG model (see Materials and Methods for details). For the remaining ∼35% of 3OIs that could not be fully explained by the DG model (referred to as significant 3OIs hereafter), we found that they had very limited influence on corresponding three-node subnetworks' activities. To demonstrate this, in Fig. 6, *A* and *B*, we plot the observed versus predicted (by the DG model) pattern probabilities for all triplets chosen within the FPN and the DMN. For the subnetworks composed of the triplets with significant 3OIs, we examined how these 3OIs were important in shaping subnetwork's activities, by using the same two-fold cross-validation analysis. We did not find larger prediction error for subnetworks containing significant 3OIs. Furthermore, the DG model was still significantly superior compared with the Exp method for this group of subnetworks (Fig. 6*A*,*B*, insets; *p* < 0.05, Kruskal–Wallis and Mann–Whitney *U* tests), resembling the situation for subnetworks without significant 3OIs. These results demonstrated that, even for the subnetworks with significant HOIs, the DG model's prediction is accurate enough so that the error is smaller than statistical fluctuations in tens of hours of recording (reflected by the Exp method). Therefore, similar to the whole network, the HOIs that cannot be explained by the DG model play a very minor role, if any, in shaping activities at the subnetwork level. Very similar results were obtained when we investigated the 3OIs for subnetworks consisted of nodes belonging to different networks (Fig. 6*C*,*D*). Taken together, we did not find significant HOIs that are important in shaping activities of subnetworks at both the intranetwork and internetwork levels.

#### Effects of spatial vicinity on HOIs

At the mesoscopic scale, that is, the networks composed of individual neurons, it has been reported that, the closer the neurons are to each other in space, the stronger the HOIs that will be found among them (Ohiorhenuan et al., 2010). To determine how spatial proximity can affect the HOIs among macroscopic functional networks, we analyzed the HOIs among networks at finer scale and in which individual nodes were much closer to each other. Specifically, we used 2 mm as the radius to define ROIs and chose 9 nodes within a single coarse-grained DMN or PFN node (10-mm-radius ROI). In total, 23 of such finer-scale networks were studied (see Materials and Methods for details). The results were highly consistent with what we found for spatially separated networks because the DG model performed significantly better than the Ising model across all threshold levels examined (Fig. 7*A*,*B*; *p* < 0.05, Kruskal–Wallis and Mann–Whitney *U* tests). Although there were apparently sizable HOIs exhibited in both the GSR and noGSR conditions, they could be satisfactorily explained by the DG model (Fig. 7*C*). To further determine whether the DG model's approximation of the network activities was accurate enough, we applied the same two-fold cross-validation test to these 23 finer scale networks. The results confirmed that the DG model was better than the Exp method in both the GSR and noGSR conditions (Fig. 7*D*, *p* < 0.05, Kruskal–Wallis and Mann–Whitney *U* tests). Therefore, the weak HOIs seem to be a general feature for functional brain networks extracted by BOLD signals at rest regardless of the spatial relationship among the constituent network nodes.

### Behavior of HOIs in dynamic network models

#### Overview of the model

To better understand our empirical results, including the largely absent HOIs and the difference between the GSR and noGSR condition, we built a network model that covered the condition similar to the empirical data (see Materials and Methods for details) but extended to a much wider range of conditions to examine the behavior of HOIs thoroughly. In the model with predefined network structure (e.g., the structure of the FPN based on Gong et al., 2009 and Watanabe et al., 2013 as shown in Fig. 8*A*, top), the synaptic dynamics were generated by a mean-field neural network model (Wong and Wang, 2006; Deco et al., 2013) and then the Balloon–Windkessel hemodynamic model was used to convert the fast synaptic dynamics (Fig. 8*A*, middle) to slower BOLD signals (Fig. 8*A*, bottom) (Friston et al., 2003; Deco et al., 2013; see Materials and Methods for details).

#### Effects of global coupling and common inputs on HOIs

In this system, the connection strength between different regions (termed global coupling, assumed to be homogeneous) plays an important role in shaping the whole network's behavior. In Figure 8, *B* and *C*, we show the results of the HOIs detected in the model as a function of global coupling, with the same topology of interarea connections as shown in Figure 8*A*. We found that the mean strengths of 3OIs in simulated synaptic activities and BOLD signals were weak. Similar to the empirical results, such an absence of HOIs could be correctly approximated by the DG model. Importantly, the mean strength of 3OIs was not affected by the increase of global coupling (Fig. 8*B*,*C*) from 0 to 0.1, which was chosen to make the average pairwise correlation in the simulated signals cover the range of that in the empirical data (see Materials and Methods for details). It is noteworthy that, based on the comparison between the results of simulated synaptic activities and BOLD signals, we can also conclude that the absence of HOIs is not due to the lower temporal resolution of the BOLD signal. Instead, it is a property rooted in how neuronal populations among different brain areas interact with each other.

Based on the empirical data, we demonstrated that, under the noGSR condition, apparent HOIs are much stronger. The nature and origin of the GS in the brain are not completely understood yet, but the common inputs mediated through diffuse ascending projections from subcortical structures, including the locus ceruleus (Sara, 2009), have been suggested as possible sources. Therefore, to understand how global signal gives rise to apparently stronger HOIs, we introduced global signal in the model by injecting common inputs to the entire network. Specifically, we systemically changed the ratio of common input to all external inputs, mimicking the effects of global signal in the system. We found that, in both synaptic activities and BOLD signals, strong common inputs led to strong 3OIs (Fig. 8*D*,*E*). These results explain both weak HOIs in data with GSR and apparently strong HOIs in data without GSR. Moreover, consistent with our empirical results, the HOIs induced by common inputs could be well explained by the DG model. The mean strengths of 3OIs in empirical data can be largely predicted by the DG model for both synaptic activities and BOLD signals, indicating that they were not true HOIs intrinsic to the system. To further mimic the situation of GSR in the empirical data, we also preprocessed the simulated data with GSR similar to that was done in the empirical data. We found that at various levels of common inputs, both simulated synaptic activities and BOLD signals showed no significant HOIs that cannot be explained by the DG model, resembling the empirical results with the GSR condition (Fig. 8*F*,*G*).

#### Similar dynamic features between the empirical and simulated BOLD signals

To determine whether the parameter space explored above covered the network states having similar dynamic features with the real BOLD signals, we compared the strength of pairwise correlations between the empirical and simulated data. We found that, with the proper choice of global coupling and common input ratio, the model exhibited distribution of correlation strength highly similar to the empirical data (Fig. 9). Specifically, when the global coupling was chosen to be 0.1 (without common inputs), the distribution of PCC (Fig. 9*A*) and mean magnitude squared coherence (MSC, averaged across the frequency range of 0.01–0.1 Hz; Sun et al., 2004; Fig. 9*C*) observed in the simulation was similar to that of the empirical data with GSR. When the common input ratio was chosen to be 0.3, the distribution of PCC (Fig. 9*B*) and mean MSC (Fig. 9*D*) observed in the model was similar to that of the empirical data without GSR. These results suggested that the model indeed captured essential features of brain dynamics and therefore can be used to study the mechanisms underlying the lack of strong HOIs observed in the empirical data.

#### Robustness of weak HOIs in various conditions

The lack of strong HOIs in the model was highly robust in both the GSR and noGSR conditions within a wide range of other key parameters such as ratio of inhibitory connections of the network (Fig. 10*A*), transmission delays among different regions (Fig. 10*B*), and time scale used for analysis (Fig. 10*C*). Specifically, we investigated how HOIs can be affected by the typological feature (e.g., the clustering coefficient) of the underlying network. We found that changing the clustering coefficient from 0.2 to 0.6 in the simulated networks (see Materials and Methods for details) did not introduce significant HOIs that cannot be explained by the DG model (Fig. 10*D*). Consistently, the DG model significantly outperformed the Exp method in the two-fold cross-validation regardless of the networks' clustering coefficient (*p* < 0.05, Mann–Whitney *U* tests; Fig. 10*D*, inset).

In summary, our simulation results demonstrated that, given a similar level of overall pairwise interaction strength as the empirical data, the network model exhibited very weak true HOIs in a wide range of parameter settings. Combined with the results regarding the apparent HOIs in the model due to strong common input, they captured well the characteristic of weak HOIs under the GSR condition and spurious HOIs under the noGSR condition in the empirical data.

## Discussion

In the present study, we quantified the strength of HOIs in BOLD signals recorded during the resting state. The results demonstrated that HOIs are generally weak in brain networks and thus play little role in forming functional systems at the macroscopic scale. Our modeling results further suggest that the lack of HOIs may be an intrinsic property of the macroscopic brain dynamics during the resting state.

### Methodological considerations

Is it possible that the lack of HOIs is due to limitations in our analysis, for example, the dichotomization of BOLD signals, movement artifacts in the signal, or the slow fluctuation of the fMRI signals?

To reduce the dimensionality of the data for HOI analysis, we dichotomized continuous BOLD signals. Similar approaches have been used in analyzing local field potentials (LFPs), magnetoencephalography, and fMRI signals (Petermann et al., 2009; Yu et al., 2011; Watanabe et al., 2013; Shriki et al., 2013). It is known that fMRI signals are closely related to LFPs (Logothetis et al., 2001; Goense and Logothetis 2008; Magri et al., 2012; Logothetis and Panzeri 2015), whereas previous work has demonstrated that neuronal interactions can be correctly revealed by studying binarized LFP signals (Petermann et al., 2009; Yu et al., 2011, 2013). Our results showing the monotonic relationship between correlations measured based on the continuous BOLD signals and their binarized counterparts provide more direct evidence that the correlation structure in the original signals was indeed well preserved after the thresholding. Importantly, we emphasize that the threholding operation introduced a specific structure of HOIs, which can be well captured by the DG model. Therefore, the good performance of the DG model actually indicates the lack of true HOIs in the original, continuous BOLD signals.

It is known that head movements during fMRI scanning can induce systematic, yet spurious correlations among different brain areas (Power et al., 2012; Van Dijk et al., 2012; Power et al., 2014), thereby artificially increasing the synchrony of the recorded signals. In the current study, we found that there was no intrinsic HOI in either the GSR or noGSR condition. Because GSR is an effective way to remove motion artifacts in the fMRI data (Power et al., 2014), these results provide evidence that our conclusion are not affected by possible movement artifacts. This point was further supported by the fact that consistent results were obtained by using a more stringent method of motion artifact rejection (Ciric et al., 2017).

Last, is it possible that HOIs are difficult to be detected given the time scale of fluctuations in fMRI signals? For example, the interactions mediated by slow signals may need a longer temporal scale to reveal. However, we increased the width of time bin used for analyzing both the empirical and simulated BOLD signals in a wide range, but the results of weak HOIs were obtained consistently. Another possibility is that the HOIs existing in the faster temporal scales (e.g., neuronal interactions) were obscured in the slow fluctuations of the BOLD signals. This is also unlikely because we have analyzed the simulated synaptic activities, which have faster temporal fluctuation yet exhibit the same weak HOIs as the corresponding BLOD signals. Together, these results strongly suggest that the conclusion of weak HOIs may not depend on specific time scale of the analyzed BOLD signals.

### Pairwise interactions determine activities of macroscopic brain networks

For microscopic brain networks, that is, the network of individual neurons, it is known that, although HOIs are in general weak (Schneidman et al., 2006; Tang et al., 2008; Yu et al., 2008), in certain conditions such as being driven by specific types of inputs (Ganmor et al., 2011; Macke et al., 2011) or with constituent neurons in close proximity (Ohiorhenuan et al., 2010), HOIs can be strong and no longer negligible. Previously, few studies have examined the HOIs at the macroscopic level. Although many researches have applied various graph-theory-based analyses on functional networks extracted using BOLD signals, these results do not provide information regarding how different brain areas can interact at higher orders because the widely used FC analysis is fundamentally based on measuring pairwise interactions. Therefore, examining the structure of HOIs will provide information about the functional organization of the brain complementary to the existing knowledge obtained by FC analyses.

Recently, Watanabe et al. (2013) reported that, compared with the independent model, the pairwise maximal entropy model (the Ising model) could significantly improve the accuracy in approximating activities of two functional networks under the special condition of GSR, although we demonstrated here that the Ising model falls short in predicting pattern probabilities for the condition without GSR. Using the DG model, we demonstrated that the lack of HOIs is a feature generally true for macroscopic brain networks with different sizes, topologies, degrees of spatial proximity, and spatial scales. These results provide a sound basis for the conventional, pairwise-based FC analysis, explaining why the FC analysis can indeed extract the functional organizations of the brain (Honey et al., 2007; Honey et al., 2009). In addition, we note that the lack of HOI does not conflict with the complex interaction structure, reflected by clusters/modules consisting of multiple brain regions, that was reported recently (Bassett et al., 2011; Stolz 2014; Davison et al., 2015). Such clusters could be formed by HOIs or aggregated pairwise interactions. Our results strongly imply that the latter probably is the case. From this perspective, the current results reveal the level of complexity at which the brain networks are organized, suggesting that it may be unnecessary to go beyond the pairwise level to study the macroscopic interactions in the resting brain.

Moreover, we took special attention to compare the conditions with and without the GSR. Although it was reported that the GSR can improve the effectiveness in detecting functional correlations, thereby providing better correspondence between observed FC and anatomy (Fox et al., 2009), other studies have suggested that the GS itself may have physiological importance (Schölvinck et al., 2010) and removing it will cause unnecessary distortions to the signals (Saad et al., 2012) or impede the detection of disease-related changes (Yang et al., 2014). Here, we did not aim to address the nature of the GS nor the validity in removing it. Instead, we demonstrated that, with or without the GS, the BOLD signals at rest do not contain true HOIs. Therefore, conventional FC analysis can be appropriately applied in both conditions.

It is worth noting that the current results demonstrated the validity of FC analysis at the pairwise level, but we do not address the question of which pairwise measures should be used. Although the simple PCC has been widely used, it has been suggested that more sophisticated pairwise measures such as partial correlation (Salvador et al., 2005; Salvador et al., 2008), magnitude squared coherence (Sun et al., 2004), mutual information (Salvador et al., 2008; Salvador et al., 2010), and exchange interactions (Watanabe et al., 2013) could provide more accurate results by taking into account the nonlinear correlations or the aggregated effects of pairwise correlations of the entire network.

### Dynamic mechanism underlying the weak HOIs

It has been demonstrated that dynamic models are instrumental in revealing the relationship between the anatomical connectivity and dynamics of brain networks (Zhou et al., 2006; Honey et al., 2007; Honey et al., 2009). Therefore, we examined the HOIs in a network model constructed to match the structural and dynamical characteristics of the real brain networks. Our simulation results suggest a possible explanation for the lack of HOIs at the macroscopic brain, that is, at this scale the dynamics are driven by background inputs and recurrent interactions that are both weak. It is a regime close to a vicinity of linear fluctuation, which can be reasonably approximated by a Gaussian linear process: the DG model. Consistently, we found that the amplitude distributions of the empirical BOLD signals are Gaussian like. This was also the case for our simulated BOLD signals provided that the global coupling was <0.1 and the common input was <0.3, in which condition we found no strong HOIs can be generated. Therefore, the dynamic of macroscopic brain networks that is close to the linear fluctuation regime likely limits the existence of strong HOIs. This view is also supported by the finding that a feedforward neuronal circuit model generally produces very weak HOIs at the mesoscopic level unless bimodal common inputs are imposed to the circuit (Barreiro et al., 2014), indicating deviation from the vicinity of linear fluctuation.

We note that the current model has made a number of assumptions, including the effectiveness of the mean-field approximation, spatial homogeneity in both the mean-field model and the Balloon–Windkessel hemodynamic model, as well as specific setting of many parameters. These approximations and similar parameter setting were commonly used in previous studies (Friston et al., 2003; Wong and Wang, 2006; Deco et al., 2013) and we have verified the effectiveness of our model by showing the similarity of dynamical features between the model and the empirical data. Nevertheless, it awaits future studies to investigate to what degree these assumptions and specific parameters may affect the behavior of HOIs in a network.

Finally, we emphasize that the lack of strong HOIs demonstrated here is only applicable to the resting state so far. During execution of tasks, input to the systems may be changed significantly (Ganmor et al., 2011). In addition, interactions may undergo specific fine tuning to perform coordinated information processing. More importantly, functional benefits of HOIs such as XOR operation (Schneidman et al., 2006) may play an essential role in such conditions. These issues need to be further explored by studying HOIs under task conditions and, if they are present, determining how they can emerge during the transition from the resting state to the task execution state.

## Footnotes

This work was supported by the National Key Research and Development Program of China (Grant 2017YFA0105203), the Natural Science Foundation of China (Grants 81471368, 11505283, 91132301, and 91432302), the Strategic Priority Research Program of the Chinese Academy of Sciences (CAS Grants XDB02060003 and XDB02030300), the China Postdoctoral Science Foundation (Special Financial Grant 2015T80154), the Hundred-Talent Program of CAS (S.Y.), the CAS Key Research Program of Frontier Sciences (Grant QYZDJ-SSW-SMC019), and the Beijing Municipal Science and Technology Commission (Grant Z161100000216139).

The authors declare no competing financial interests.

- Correspondence should be addressed to Shan Yu, Brainnetome Center, Institute of Automation, Chinese Academy of Sciences, 95 Zhongguancun East Road, Beijing 100190, China. shan.yu{at}nlpr.ia.ac.cn