## Abstract

Characterization of synaptic connectivity is essential to understanding neural circuit dynamics. For extracellularly recorded spike trains, indirect evidence for connectivity can be inferred from short-latency peaks in the correlogram between two neurons. Despite their predominance in cortex, however, significant interactions between excitatory neurons (E) have been hard to detect because of their intrinsic weakness. By taking advantage of long duration recordings, up to 25 h, from rat prefrontal cortex, we found that 7.6% of the recorded pyramidal neurons are connected. This corresponds to ∼70% of the local E–E connection probability that has been reported by paired intracellular recordings (11.6%). This value is significantly higher than previous reports from extracellular recordings, but still a substantial underestimate. Our analysis showed that long recording times and strict significance thresholds are necessary to detect weak connections while avoiding false-positive results, but will likely still leave many excitatory connections undetected. In addition, we found that hyper-reciprocity of connections in prefrontal cortex that was shown previously by paired intracellular recordings was only present in short-distance, but not in long distance (∼300 micrometers or more) interactions. As hyper-reciprocity is restricted to local clusters, it might be a minicolumnar effect. Given the current surge of interest in very high-density neural spike recording (e.g., NIH BRAIN Project) it is of paramount importance that we have statistically reliable methods for estimating connectivity from cross-correlation analysis available. We provide an important step in this direction.

## Introduction

Characterization of synaptic interactions on a large scale is essential to understand information processing in neural circuits. In an attempt to characterize local circuit dynamics, combined electrophysiological and imaging techniques have started to map out neocortical circuits (Bock et al., 2011; Ko et al., 2011). With recent developments in multielectrode recording technology, it has also become possible to monitor large numbers of neurons simultaneously with high temporal resolution (McNaughton et al., 1983; Csicsvari et al., 2003; Buzsaki, 2004). Even though this number is still small relative to the total number of neurons, one advantage of ensemble recordings is that it can greatly facilitate the study of spike-train interactions, because the number of neuron pairs increases as the square of the number of units recorded. The obtained spike trains are often analyzed by cross-correlations to infer synaptic interactions (Alonso and Martinez, 1998; Ostojic et al., 2009). The cross-correlogram, a well established technique to investigate temporal relationships between neural spikes, describes the covariance between the binned spike trains of two neurons at various time lags (Perkel et al., 1967b; Kirkwood, 1979; Aertsen and Gerstein, 1985; Brown et al., 2004). A plausible argument for synaptic connectivity can be made from the presence of short-latency peaks in the correlogram within the range of central glutamatergic EPSP and GABAergic IPSP rise times (Csicsvari et al., 1998; Barthó et al., 2004; Fujisawa et al., 2008).

Despite the potential usefulness of the cross-correlation method, the detection of excitatory interactions between pyramidal neurons (E–E interaction) has been difficult (Barthó et al., 2004; Fujisawa et al., 2008), because these interactions are generally weak (McNaughton, 1980; Mason et al., 1991; Deuchars et al., 1994; Markram et al., 1997; Thomson and Deuchars, 1997; Reyes and Sakmann, 1999; Thomson et al., 2002). Excitatory synaptic strength of intracellularly measured cortical connections has been shown to follow a lognormal distribution (Song et al., 2005), suggesting that a very small number of strong connections are embedded in a large number of weak connections. In addition, the duration of typical ensemble recordings is not long (up to a few hours). The consequence is a limited sampling of spike occurrences and also a poor sampling of the possible state-space occupancy distribution (Perkel et al., 1967a,b). Cells are tuned to respond to a given set of input vectors and some vectors in the set may never occur during the sampling epoch. Thus, for some connected cells in the sample, the input which brings them close enough to threshold to allow synaptic interaction may not occur during the sampling period. In other words, the effective contribution of cell A to the firing of cell B is not independent of the activity of other cells in the network. Hence, some connections may not be visible in spike cross-correlograms from a given sample epoch.

To explore these issues, we analyzed neural ensembles from rat prefrontal cortex and estimated the asymptotic detection probability of excitatory connections and investigated the nature of the distribution of the detected excitatory connections.

## Materials and Methods

##### Recording experiment.

Twenty-five hour continuous, multineuron recording datasets were collected from three adult male Brown Norway/Fischer 344 hybrid rats that were trained to run a spatial sequence task (Euston et al., 2007) and were also subjected to a novel object exposure procedure (Tatsuno et al., 2006). The recording sessions were divided into two 12 h sleep/rest sessions interrupted by 1 h behavior, which was either the sequence task or the novel object exposure, to yield a total recording duration of 25 h per recording session. The animals were implanted with a hyperdrive with 12 independently movable tetrodes (Gothard et al., 1996; <1 MΩ impedance) in the left medial prefrontal cortex (3.2 mm AP, 1.3 mm ML relative to bregma) to record from dorsal anterior cingulate cortex and prelimbic cortex. The tetrodes were used to record extracellular single units. The anatomical positions of the tetrodes were confirmed histologically.

Two additional probes were placed 4–5 mm deep in the medial prefrontal cortex to record a differential reference signal. The animal's position was tracked at 60 frames/s using LEDs mounted on the headstage that were detected by a color camera that was mounted on the ceiling of the recording room (∼0.33 cm/pixel). The thresholded signals were recorded with a Cheetah Data Acquisition System (Neuralynx), digitized at 32 kHz and bandpass filtered between 600 Hz and 6 kHz. Local field potential data were filtered between 1 and 475 Hz and sampled at 2 kHz.

Animal care and surgeries were conducted in accordance with National Institutes of Health guidelines and approved Institutional Animal Care and Use Committee protocols. After surgery, rats were administered 26 mg of acetaminophen (children's Tylenol, McNeil) and also received 2.7 mg/ml acetaminophen in the drinking water for 1–2 d after surgery. In addition, they were given oral ampicillin on a 10 d on/off regimen for the duration of the experiment.

##### Spike sorting.

Spiking activity was analyzed offline using an automated spike-sorting algorithm (KlustaKwik by K.D. Harris, University College London, London, UK) to isolate units and separate them from noise. The resulting clusters were manually refined using cluster cutting software (MClust 3.0 by A.D. Redish, University of Minnesota, Minneapolis, MN, with customizations by P. Lipa, University of Arizona, Tucson, AZ, S. L. Cowen, University of Arizona, Tucson, AZ, and D. Euston, University of Lethbridge, Canada) in a multiparameter space including features, such as energy (area under the waveform), peak to trough distance, principal component, time (to control for stability of the recording of the unit over the entire recording duration), and cross correlograms. In addition, all units were verified by waveform visualization software (WaveformCutter 1.0 by S. Cowen). Only units with <0.3% interspike intervals falling within 2 ms refractory period were accepted.

During long recording times, the position of the electrode can shift and, consequently, the shape of the recorded waveform can change. This can lead to errors in spike sorting and can cause false temporal firing relationships between neuron pairs. For this reason, we only included neurons that show good isolation and little variance in their waveforms and *Z*-scored peak amplitudes over the entire length of recordings (Fig. 1*a*). As a measure for the stability of the neurons, we calculated the fractional change of firing rate between the first and last 4 h periods. We found that the fractional change for the majority (95%) of the neurons that were selected by the first criteria was <80%, suggesting that the neurons were stable for the 25 h period of recordings.

In addition, inspired by the recent finding by Mizuseki and Buzsaki (2013) that the distribution of firing rates follows a lognormal distribution in the hippocampal formation, we also calculated the distribution of firing rate of our data. Figure 2*c* shows the firing rate distribution during the first and last 4 h of recording. We found that not only does the firing rate follow a lognormal distribution, but also the shape of the distribution did not change between the two periods. We also confirmed that the firing rate calculated of the entire recording follows a lognormal distribution.

##### Neuron classification.

Neurons are classified into excitatory, inhibitory, or unclassified based on statistical dependency with other neurons. First, for each reference neuron, cross-correlations against all other neurons were calculated with a bin size of 1 ms and a window size of ±50 ms. Second, each cross-correlation was converted to a *Z*-score with its mean and SD. This alleviated the problem of spike rate differences. Next, we calculated the moving average (window size: 15 ms) of the *Z*-scored cross-correlations and subtracted it from the original signal to obtain a detrended *Z*-scored cross-correlation. This procedure removed modulations in intermediate temporal ranges that are slower than monosynaptic interactions. Correlations that showed putative common input (measured as a peak or trough encompassing lag [−1:1]) were excluded from further cell classification analysis. Cross-correlation pairs that did not contain relevant information were also removed [if none of the *Z*-scored correlation values within [−4, 4] exceeded 2 or −2, the pair was removed]. Finally, the average cross-correlation of the remaining pairs was calculated (Fig. 1*b*). Mean and 1 SD of the average cross-correlation for the bins outside of [−4, 4] ms were used as a threshold to determine whether a bin within [1, 4] ms has a relevant correlation signal for cell classification; if at least one bin exceeded the upper threshold (mean + 1 SD) and no bin undershot the lower threshold (mean − 1 SD), the reference neuron was classified as a putative excitatory neuron. If at least one bin undershot the lower threshold (mean − 1 SD) and no bin exceeded the upper threshold (mean + 1 SD), the reference neuron was classified as a putative inhibitory neuron. Otherwise, the reference neuron was left unclassified. We also inspected all individual cross-correlations visually to verify whether the decision of the algorithm we applied on the average detrended *Z*-scored cross-correlations was consistent with the judgment made by human observers. In summary, this procedure aims at classifying a reference neuron based on the average statistical influence of the reference neuron to all possible postsynaptic target neurons.

##### Cross-correlation analysis.

All simultaneously recorded neurons from each animal were cross correlated with each other with a bin size of 1 ms. The cross-correlation values were normalized by the firing rate of the reference neuron. This procedure resulted in *n*·(*n* − 1)/2 correlation pairs (*n* = number of recorded units; self-correlation was excluded). For each neuron pair, the spikes of one spike train were jittered randomly within an interval of [−5, 5] ms to break up the short-latency relationships between spikes. The bin size and jitter interval were chosen based on synaptic integration times in frontal cortex neurons *in vivo* (Léger et al., 2005). Then, the jittered spike trains were cross-correlated with a bin size of 1 ms. This procedure was repeated 1000 times to compute a surrogate dataset of cross-correlations. Various significance thresholds (α levels) were computed from the distribution of jittered data to detect peaks or troughs with short latency and duration (≤2 ms) in the cell pair correlations (Barthó et al., 2004; Fujisawa et al., 2008): α levels used are 0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 5 × 10^{−3}, 2 × 10^{−3}, 1 × 10^{−3}, 5 × 10^{−4}, 2 × 10^{−4}, 1 × 10^{−4}, 5 × 10^{−5}, 2 × 10^{−5}, 1 × 10^{−3}, and the global min/max value (referred to as absolute limit: abs) which corresponds to the overall maximum and minimum values of all bins in the surrogate dataset. If a single bin exceeded the significance level within the first 4 ms before or after lag 0, the correlation was scored as significant excitation and the pair was considered monosynaptically connected. Similarly, if two neighboring bins undershot the significance level within the first 4 ms before or after lag 0, the connection was scored as significant inhibition (Barthó et al., 2004). This way, significance was tested for both possible directions of interaction between the two neurons. To verify our method and temporal window within which we assess significance of the short-latency peaks, we repeated the same analysis as described above for 4 ms windows ∼±25 ms and ±50 ms.

The length of the recording period used for the cross correlogram calculation was varied from 1 to 25 h with 1 h increments. First, for each hour segment and for a given significance level α, we constructed a matrix of significant cross-correlations for all possible nonoverlapping blocks (i.e., 25 blocks for 1 h segments, 12 blocks for 2 h segments, etc.). This resulted in a matrix with an entry for each significant correlation type EE1, EE2, EI1, EI2, IE1, II1, EU1, and IU1 (EE1 represents unidirectional excitation between two excitatory neurons; EE2 represents reciprocal excitation between excitatory neurons; EI1 represents unidirectional excitation from an excitatory neuron onto an inhibitory neuron; EI2 represents reciprocal connection between an inhibitory and an excitatory neuron; IE1 represents unidirectional inhibition from an inhibitory neuron onto an excitatory neuron; II1 represents unidirectional inhibition between two inhibitory neurons; EU1 represents unidirectional excitation from an excitatory neuron onto an unclassified neuron; IU1 represents unidirectional inhibition from an inhibitory neuron onto an unclassified neuron). Because swapping reference- and target-neurons results in the mirror image of the original cross correlogram, only the upper triangle of the result matrix was considered. We then summed the number of significant correlations for each correlation type. To estimate the distribution of the number of detected correlations for each hour segment, we used bootstrapping with replacement (Mooney and Duval, 1993; Hoffman and McNaughton, 2002). The matrix was randomly resampled with replacement as many times as needed to obtain 1000 samples for each hour segment. For instance, for the 1 h segment, each upper triangle of the 25 result matrices was resampled 40 times. For the 25 h segment, only one result matrix was obtained and was therefore resampled 1000 times. The median, 84^{th} and 16^{th} percentile were calculated to characterize the distribution. For the tetrode-wise analysis, the connection matrix of each tetrode was resampled with replacement 1000 times. The result was summed across tetrodes and datasets. Again, the median, 84^{th} and 16^{th} percentile of the bootstrapped sample distribution were calculated.

The experimentally derived detection probability of excitatory connections *p*_{α}^{n} for the significance level α and the recording length of *n* hours was defined as follows:
where #(*EE*1) is the number of significant unidirectionally connected pairs, #(*EE*2) is the number of significant reciprocally connected pairs, and *N* is the total number of neuron pairs.

##### Extrapolation by fitting the statistical power function using simulated annealing.

To investigate how the experimentally derived detection probability of weak E–E connections *p*_{α}^{n} may improve beyond the recording duration of 25 h, we applied curve fitting based on statistical power. To obtain a reasonable estimate, we used a framework for the case that both the data from the alternative hypothesis and the data from the null hypothesis follow the normal distribution with a known SD. The statistical power function, defined as the probability of rejecting the null hypothesis when the alternative is true, is written as follows:
where the *cutoff* is the threshold value for rejecting the null hypothesis. When the alternative is true, μ_{1} is the mean of the alternative hypothesis (the mean peak cross-correlation value within −4 to +4 ms excluding 0), σ is the SD of the null and alternative hypotheses, *n* is the sample size (the length of recording in hours), and:
Using the significance level α, the *cutoff* is written as follows:
where μ_{0} is the mean of the null hypothesis (chance-level cross-correlation value estimated by spike-jittering) and:
Plugging Equation 4 into Equation 2 yields the following:
We conjectured that the probability of detecting a significant cross-correlation between two neurons, if a connection exists and the conditions for its making a contribution to postsynaptic spiking are realized, follows the same form of statistical power. The fitted detection probability of excitatory connections *C*_{α}^{n} for the significance level α and the recording length of *n* hours can be obtained as follows:
where *CEE*_{α} is the asymptotic detection probability of excitatory connections *p*_{α}^{n} at the significance level α. Finally, to obtain a better fit, the significance level α and the SD σ in the right hand side in Equation 7 were treated as fitting parameters α* _{param}* and σ

*, respectively. This gives the final fitting equation as follows: In summary,*

_{param}*CEE*

_{α}, α

*, and σ*

_{param}*in Equation 8 were optimized to fit the data of the experimentally derived detection probability of weak E–E connections*

_{param}*p*

_{α}

^{n}. This yields the fitted detection probability of excitatory connections

*C*

_{α}

^{n}. Note that in a standard testing of a normal mean with a known standard distribution, α and α

*are identical and they represent the significance level of the test. Similarly, σ and σ*

_{param}*are identical and they represent the standard distribution of the null and alternative hypotheses. In this study, however, α*

_{param}*was separated from α and σ*

_{param}*was separated from σ, and they were treated as fitting parameters. This distinction was necessary to obtain a good fit to our experimental data. We speculate that it might be partly due to the fact that μ*

_{param}_{1}follows a skewed continuous distribution rather than a normal distribution. Given the fact that many different fitting functions were indeed able to fit our experimental data almost equally well, but would produce greatly different asymptotes, we chose a fitting function that is based on statistical power.

Actual curve fitting was conducted by simulated annealing (Metropolis et al., 1953; Kirkpatrick et al., 1983) using MATLAB Global Optimization Toolbox (MathWorks). Simulated annealing is a well established, general optimization method that has been used as a powerful optimizer for *n*-body problems including the classic traveling salesman problem (Metropolis et al., 1953; Kirkpatrick et al., 1983). The method can be pictured as the physical process of first heating a material and then lowering the temperature slowly, which corresponds to minimizing the system's energy. More precisely, at a sufficiently high temperature, the system is slightly perturbed and the change of energy, Δ*E*, is calculated. The new state is accepted if Δ*E* is negative. If the energy increases, the new state is accepted under a certain probability (*p* = *exp* [− Δ*E*⧸(*k _{B}T*)], where

*k*= Boltzmann constant and

_{B}*T*= temperature) to avoid that the system gets stuck in local minima. This process is repeated many times at the current temperature. The temperature is slowly decremented until the system is frozen (Kirkpatrick et al., 1983). The method may not find the optimum solution but is most likely to find one of the near-optimum solutions. We ran 1000 optimization trials to select the best parameter values that would give the least sum of squares. See Table 2 for optimization results for all α levels. The

*r*-squares,

*R*

^{2}

_{AllEx}and

*R*

^{2}

_{AllExTT,}were computed to estimate the goodness of fit of the optimized function to the data.

##### Calculation of the effect size, *h*.

To compare the size of the short-latency peak across neuron pairs, we introduced the effect size *h*, which was defined as the normalized significant peak as follows:
where *h _{peak}* is the height of the maximum peak in the cross correlogram in [−4,−1] ms and [1,4] ms (i.e., the maximum peak taken of this 8 ms window) for unidirectional excitatory connections (EE1) and the height of the maximum peaks in the cross correlogram in [−4,−1] ms and [1,4] ms for reciprocal excitatory connections (EE2; i.e., the maximum peak taken of each 4 ms window), and μ

*and σ*

_{jitter}_{jitter}are the mean and SD of the jittered data for the corresponding cross-correlation, respectively (Fujisawa et al., 2008).

## Results

Long-term recordings of 25 h length were analyzed using cross-correlations to identify excitatory connections between prefrontal putative pyramidal neurons. Six datasets obtained from three rats that were subjected to both a repeated sequence running task and novel object experience, were analyzed (two novel-object datasets were described previously; Tatsuno et al., 2006). Table 1 shows the number of neuron pairs in each session. In total, we analyzed 237 neurons and 4787 correlation pairs. An example of units included in the analysis is shown in Figure 1*a*. We classified the neurons into excitatory (54%), inhibitory (8%), and unclassified neurons (38%) based on their overall short-latency effects on postsynaptic neurons (see Materials and Methods; Fig. 1*b*), as the different classes were not unambiguously differentiable based on their waveform characteristics. This may be due to our recording parameters, e.g., using a rather tight bandpass filter between 600 Hz and 6 kHz, which affects the shape of the recorded spikes.

### Changes of detection probability over various lengths of recordings

Significant excitatory connections between neuron pairs were detected by using a spike train jitter method (Barthó et al., 2004; Fujisawa et al., 2008). As the threshold for significant interactions is chosen somewhat arbitrarily by the experimenter and influences the size of the type I error (false-positives), we investigated a wide range of significance levels; 16 different α levels ranging from 0.5 to absolute limit (abs). The absolute limit is derived from the absolute maximum value of all bins in the jittered data for each neuron pair and represents the strictest α level. Note that it is not much smaller than 10^{−5} (the smallest used before abs).

Figure 2*a*,*b* shows four examples of cross correlograms between two putative pyramidal neurons with the absolute limit. Although no significant short-latency peak was detected with a 1 h recording (Fig. 2*a*,*b*, top row), it was detected with a 25 h recording (Fig. 2*a*,*b*, middle row). The reduced fluctuations in the correlogram reflect the improved signal-to-noise ratio and enable the detection of small peaks. As expected, longer recording times increase the chance of detecting weak excitatory connections. In addition, we also calculated the jitter-corrected cross-correlograms (Fig. 2*a*,*b*, bottom row; Hirabayashi et al., 2013; Mizuseki and Buzsaki, 2013). This is obtained by subtracting the mean jittered cross-correlations for each bin from the original cross-correlation. The results show that the significant peaks clearly stand out. The firing rate distribution of the analyzed dataset follows a lognormal distribution as previously shown (Mizuseki and Buzsaki, 2013) and does not change between the first and last 4 h of recordings (Fig. 2*c*). The distribution of the jitter corrected significant peaks follows a lognormal distribution as well (Fig. 2*d*), which is also consistent with previous findings (Mizuseki and Buzsaki, 2013).

The experimentally derived detection probability of excitatory connections *p*_{α}^{n} for the significance level α and the recording length of *n* hours was given by Equation 1. To make the results comparable to paired intracellular recordings between pyramidal neurons (Song et al., 2005), *N* was taken as the total number of possible excitatory neuron pairs (*N _{AllEx}* = 1443; Table 1). In addition, because intracellular recordings are typically done in slices and are limited to a smaller volume of tissue, we also calculated N as the number of possible excitatory neuron pairs per tetrode (

*N*= 275; Table 1). Although the results for both

_{AllExTT}*N*and

_{AllEx}*N*are presented, we consider that

_{AllExTT}*N*is more directly comparable to previous intracellular recordings (Song et al., 2005). The results of detection probabilities normalized by all recorded neuron pairs (Barthó et al., 2004; Fujisawa et al., 2008;

_{AllExTT}*N*= 4787), including connection types other than excitatory interactions are shown in Figure 9.

_{All}Figure 3*a* (top, data points) shows how the experimentally derived detection probabilities of excitatory connections *p*_{α}^{n} changes from 1 h recordings to 25 h recordings for four representative α levels: 0.05, 0.01, 10^{−5}, and the absolute limit. The data points represent the median and the background band shows the range for the 84^{th} and 16^{th} percentile estimated by bootstrapping. The detection probabilities increased with longer recording times for both ways of calculating the connection probability. For α = 0.05, the detection probability increased from 6.7% (*N _{AllEx}*) and 8% (

*N*) for 1 h recording segments to 16.4% (

_{AllExTT}*N*) and 25% (

_{AllEx}*N*) for 25 h recording segments, respectively. For α = 0.01, the detection probability increased from 1.6% (

_{AllExTT}*N*) and 2.9% (

_{AllEx}*N*) to 7.2% (

_{AllExTT}*N*) and 17.6% (

_{AllEx}*N*), respectively. For the stricter α values, 10

_{AllExTT}^{−5}and the absolute limit, which show very similar trends for both ways of normalization, the detection probability increased from 0.1% (

*N*) and 0.5% (

_{AllEx}*N*), to 1.7% (

_{AllExTT}*N*) and 7.6% (

_{AllEx}*N*), respectively.

_{AllExTT}### Curve fitting and extrapolation of the relationship between statistical power and sample size

Next, we investigated how the detection probability of excitatory connections between putative pyramidal neurons would improve beyond the recording duration of 25 h. The fitting of Equation 8 was applied to the experimentally derived detection probabilities *p*_{α}^{n} with 16 different α levels (0.5 to the absolute limit) to obtain the fitted detection probabilities *C*_{α}^{n} beyond 25 h. The optimization results are summarized in Table 2. Figure 3*a* (top) show the fitted results for four representative significance thresholds (α levels for 0.05, 0.01, 10^{−5}, and abs). The corresponding power function, defined by *C*_{α}^{n}/*CEE*_{α}, where *CEE*_{α} is the asymptotic value of *p*_{α}^{n}, was also obtained (Fig. 3*a*, bottom). We found that the fitted functions converge to different asymptotic values of the experimentally derived detection probability of excitatory connections *p*_{α}^{n} but the difference becomes very small for stricter α levels (10^{−5} and absolute limit). There indeed appears to exist a lower limit asymptote as the threshold becomes stringent (∼2%, *N _{AllEx}*; ∼8.5%,

*N*), indicated by the blue dashed lines in the plots. The

_{AllExTT}*CEE*

_{α}values for the stricter α levels (∼2% with

*N*and ∼8.5% with

_{AllEx}*N*) reach ∼20 and 70% of the local connectivity as reported by Song et al. (2005), respectively. As was discussed in the previous section, we consider that

_{AllExTT}*N*is more directly comparable to their study. Therefore, we conclude that ∼70% of local E–E connections could be detected by long extracellular recordings.

_{AllExTT}For α levels of 0.05 and 0.01, the power function (*C*_{α}^{n}/*CEE*_{α}) approaches 1 more quickly than that of 10^{−5} and absolute limit (Fig. 3*a*, bottom); ∼90% of power could be achieved with 20 h of recording. However, their detection probability also increases quickly and exceeds Song et al.'s (2005) detection probability of 11.6%; α = 0.05 with *N _{AllEx}* normalization and α = 0.05 and 0.01 with

*N*normalization (Fig. 3

_{AllExTT}*a*, top). Because intracellular recordings have intrinsically a higher chance of detecting monosynaptic connections, as the presynaptic neurons are stimulated and even subthreshold responses in the postsynaptic neurons recorded, the overshoot for α levels 0.05 and 0.01 suggests that these acceptance criteria detect excessive false-positives.

To assess how the same asymptote could be possibly reached, *CEE*_{α} are plotted against the α levels in percentage on a log–log scale (Fig. 3*b*). We found that as α levels get stricter, the slope of the consecutive data points for both ways of the *CEE*_{α} estimation gets increasingly smaller (open square and asterisk correspond to *N _{AllEx}* and

*N*normalization, respectively). This result indicates there is a lower limit asymptote. In contrast, when choosing a window of identical size at longer latency lags (±50 ms), the

_{AllExTT}*CEE*

_{α}estimation approaches zero almost following a diagonal line (open circle and open triangle correspond to

*N*and

_{AllEx}*N*normalization, respectively). A similar result was also obtained for the latency lags of ±25 ms. These results demonstrate that the windows at the latency lags outside of monosynaptic interactions (e.g., ±25 ms or ±50 ms) do not contain monosynaptic interactions.

_{AllExTT}In the final form of the fitting function (Eq. 8), the significance level α was treated as a fitting parameter α* _{param}* to obtain a better fit. We confirmed that α and α

*were monotonically related, indicating that fitting α*

_{param}*does not violate the original relationship of α = α*

_{param}*significantly (Fig. 4). However, α*

_{param}*deviates gradually when α gets smaller. This could be due to the fact that μ*

_{param}_{1}will follow a skewed distribution, which is not accounted for in Equation 8. Thus, Equation 8 would need to be rewritten accordingly but it is beyond the scope of the present paper. Together, these results show that our method using short-latency peaks of cross-correlations with strict α levels and using the approach of fitting a statistical power function is valid for detecting monosynaptic E–E interactions and estimating their asymptotic detection probability in local networks.

### False-positives and the distribution of effect size

We investigated whether the connections detected by less stringent α levels (e.g., 0.05 and 0.01) were contaminated by false-positives by plotting the distribution of the effect size *h* (Eq. 9). The distribution of the effect size within tetrodes (short range) and between tetrodes (long range) for four representative α levels (0.05, 0.01, 10^{−5}, and abs) are plotted in Figure 5*a* and *b*, respectively. Both unidirectional (EE1) and reciprocal excitatory connections (EE2) are included in the analysis. The results show that the distributions for the stricter α levels (10^{−5} and abs) were almost identical and contained strong effects only (Fig. 5, right columns). On the contrary, less strict α levels (0.05 and 0.01) contained additional large counts of smaller effect sizes (Fig. 5, left columns). Subtraction of the most stringent distribution (α = abs) from less stringent groups (α = 10^{−5}, 0.01, and 0.05) confirmed that the stricte*r* α levels (10^{−5} and abs) are very similar to each other but less stringent α levels (0.05 and 0.01) have higher counts of weak effect sizes (Fig. 5, insets). This is likely due to peak counts being contaminated by a large proportion of false-positives. Because false-positives would occur randomly, their distribution is expected to be Gaussian-like, which is consistent with what we obtained (Fig. 5, insets). In addition, most of the stronger effects are found in short range interactions (within tetrodes; Fig. 5*a*), not in long-range interactions (between tetrodes; Fig. 5*b*), suggesting that the strongest connections are mostly local. We also investigated the distribution of the effect size for unidirectional (EE1) and bidirectional connections (EE2) separately; Figure 6*a* for EE1 (within tetrodes), Figure 6*b* for EE1 (between tetrodes), Figure 6*c* for EE2 (within tetrodes), and Figure 6*d* for EE2 (between tetrodes). We found that the majority of strong effects for both EE1 and EE2 are local and that the long-range excitatory connections are predominantly unidirectional (EE1).

### Hyper-reciprocity in mPFC

Further support of the conclusion that less stringent α levels (e.g., 0.01 and 0.05) are massively contaminated by false-positives came from comparing the counts of significant connection pairs at different α levels and the number of connection pairs predicted by a random connection assumption (Fig. 7). Given that a network is randomly connected by a connection probability *p*_{α}^{n} (Eq. 1), the expected number of unconnected pairs is *N*(1 − *p*_{α}^{n})^{2}, the expected number of unidirectionally connected pairs is 2*Np*_{α}^{n} (1 − *p*_{α}^{n}), and the expected number of reciprocally connected pairs is *N* (*p*_{α}^{n})^{2}. Song et al. (2005) reported that the count of reciprocally connected excitatory neuron pairs (EE2) in rat visual cortex is four times higher than the number predicted. Wang et al. (2006) reported 3.5 and 7.9 times more EE2 connections than predicted in visual cortex and prefrontal cortex, respectively, in young ferrets. The ratios of observed EE2 to predicted EE2 are comparable in visual cortex for both rodent studies (4 in rat and 3.5 in ferret). EE2 connections are twice as likely in ferret medial prefrontal cortex than in visual cortex (7.9/3.5 = 2.3) and if the relationship holds true for rat, we would expect the EE2 connection probability in rat medial prefrontal cortex to be up to eight times higher than predicted by a random connectivity assumption. Based on these considerations, we predicted that the EE2 connection in rat medial prefrontal cortex would be four to eight times higher than predicted. For the stringent α levels of the absolute limit and 10^{−5}, we found that medians of observed EE2 connections are 4.6 and 5.6 times higher than the number predicted, respectively (Fig. 7, right columns). In addition, their 95% confidence intervals (Fig. 7, error bars) overlap with the predicted four to eight times range, suggesting that these results are consistent with what we predicted based on the numbers by Song et al. (2005) and Wang et al. (2006). However, for less stringent α levels, the difference between observed EE2 counts and predicted EE2 counts became lower than predicted; 3.2 times for α = 0.01, 2 times for α = 0.05 and 1.3 for α = 0.5 (Fig. 7, left three columns). Furthermore, their 95% confidence intervals do not overlap with the four to eight times range, except for α = 0.01, where there is a small overlap, indicating that the statistics of the observed EE2 connections were different for less stringent α levels. Because false-positives detected in cross correlograms are expected to occur randomly, the observed decrease of hyper-reciprocity suggests that more false-positives are included if less stringent α levels were used. We therefore conclude that stricter α levels together with long-term recordings are necessary for reliable detection of weak E–E interactions in extracellular recordings.

### Excitatory connectivity in mPFC is predominantly local

The relationship between local excitatory connectivity (within tetrode connections, normalized by *N _{AllExTT}*) and total excitatory connectivity (within and between tetrode connections, normalized by

*N*) can be further investigated by plotting them against each other for different α levels (Fig. 8). For stricter α levels, observed connections were localized within the upper-left triangle, suggesting that predominantly local connections were detected. With increasingly relaxed α levels the difference between local and total connectivity disappeared (at α = 0.2, indicating that the difference was masked by an increasing number of false-positives. Interestingly, total connectivity became larger than local connectivity for α = 0.5. If there was a general tendency of underestimating the local connectivity and overestimating the total connectivity, then this could explain why the total connectivity is larger than the local connectivity for α = 0.5. One could speculate that over elimination of spikes during spike sorting resulting in decreased correlations (Cohen and Kohn, 2011) could lead to the observed effect as this would affect the detection of significant cross-correlations within tetrodes more than between tetrodes. Incapability of recording overlapping spikes within tetrodes may also enhance this tendency.

_{AllEx}### Various types of excitatory connections are the most abundant in mPFC and their detection probability increases with sample size

In addition to E–E interactions (EE1 and EE2), our 25 h recordings also allowed us to investigate six additional connection categories (EI1, EI2, IE1, II1, EU1 and IU1, see Materials and Methods); Although, II2, reciprocal inhibition between putative inhibitory neurons, could be defined as well, we excluded this connection probability from our analysis because its cross-correlation cannot be easily distinguished from the case of common inhibitory input. The detection probabilities of connection categories involving excitatory neurons (EE1, EE2, EI1, EI2, IE1, and EU1) change with increased recording duration for two representative α levels (10^{−5} and abs; Fig. 9). Note that the normalization by all recorded neuron pairs (*N _{All}*) was used because all neuron types (excitatory, inhibitory, and unclassified) were included. We also found that EE1 (Fig. 9

*b*) and EU1 (Fig. 9

*g*) are the most abundant connection types followed by EE2 (Fig. 9

*c*), EI1 (Fig. 9

*a*), and EI2 (Fig. 9

*d*), suggesting that excitatory interactions are the major portion of cortical connections in rat prefrontal cortex. Inhibitory interactions, IE1 (Fig. 9

*e*), II1 (Fig. 9

*f*), and IU1 (Fig. 9

*h*), are detected much less frequently.

## Discussion

Pyramidal neurons are the most abundant cell type in neocortex. Pyramid–pyramid connections (E–E interactions) provide the majority of intracortical and extracortical projections. However, most E–E synapses are extremely weak (McNaughton, 1980; Mason et al., 1991; Deuchars et al., 1994; Markram et al., 1997; Thomson and Deuchars, 1997; Reyes and Sakmann, 1999; Thomson et al., 2002). With long-term continuous recordings, we tried to capture weak connections between pyramidal neurons. As opposed to most electrophysiological studies that generally record for ∼0.5–2 h per session, we were able to detect excitatory interactions between putative pyramidal cells in 8.5% of cell pairs with recording lengths up to 25 h. This connection probability of 8.5%, estimated using the number of putative excitatory neuron pairs per tetrode, corresponds to ∼70% of the local E–E connections reported in Song et al. (2005). As was discussed, this normalization is considered to be most comparable to Song et al. (2005). The smaller connection probability (2%) found for the total excitatory connectivity indicates that on average the connection probability falls off with distance (Song et al., 2005; Fujisawa et al., 2008). In addition, 1 h fragments of our recordings yielded a consistent number of monosynaptically coupled pairs (0.2% for strictest α levels: abs and 10^{−5} and 1.04% for α 0.01; counts were normalized using all recorded neuron pairs, Fig. 9) to previously reported studies (0.2–0.8%; Csicsvari et al., 1998; Barthó et al., 2004; Maurer et al., 2006; Fujisawa et al., 2008).

The strength of a pyramidal–pyramidal connection is partly determined by the location of the synapse and the type of receptors activated (Deuchars et al., 1994). For instance, because lateral connections between pyramidal neurons in cortex are predominantly on distal parts of the dendrites, they are weakened as the potential propagates toward the soma. Those inputs are nearly only transferred if they coincide with other inputs (Deuchars et al., 1994). Therefore the E–E interactions detected by relatively short duration electrophysiological recordings are presumably the strongest ones. However, those are embedded in a much larger matrix of weak connections, which are difficult to capture with extracellular recordings that only detect action potential transmission. With regard to successful spike transmission, pyramidal–interneuron pairs have been shown to be more reliable (Mizumori et al., 1989; Marshall et al., 2002; Holmgren et al., 2003; Swadlow, 2003). Using a less stringent detection threshold, such as 1%, as used in many previous studies, is therefore less likely to be severely affected by false-positives. A higher convergence of excitatory inputs among pyramidal neurons is necessary to bring a postsynaptic pyramidal neuron to firing threshold (McNaughton et al., 1981; Markram et al., 1997). In this respect, long-term recordings dramatically increased the detection probability of the statistically rare E–E events.

In Figure 5, we reported that most connections are local, i.e., both presynaptic and postsynaptic neurons are within the recording radius of a single tetrode. In the hippocampus, monosynaptically coupled pairs of pyramidal neurons and interneurons were also more frequently observed on the same tetrode than on different tetrodes (Csicsvari et al., 1998; Maurer et al., 2006). Connection strength tends to cluster around a few neurons (Barthó et al., 2004; Song et al., 2005; Fujisawa et al., 2008), often referred to as hub neurons. We also observed that many connections converge onto a few neurons. The majority of detected connections are local (Figs. 10*a*, 11), recorded on the same tetrode (Fig. 10*a*, red dashed lines). A schematic diagram created from Figure 10*a* also confirmed that there exists a hub neuron (Fig. 10*b*, neuron #16). Its inhibitory property (a hub neuron is an inhibitory neuron) is consistent with previous studies (Barthó et al., 2004; Fujisawa et al., 2008).

In this study, we showed that the detection probability of weak E–E connections can be significantly improved by long-term recordings (25 h) due to the substantially improved signal-to-noise ratio reflected in smoother cross-correlation signals for 25 h compared with 1 h recordings (Fig. 2). We were also able to detect the hyper-reciprocity of excitatory connections in medial prefrontal cortex with our long-term extracellular recordings (Song et al., 2005; Wang et al., 2006; Fig. 7). This strongly suggests that connectivity is not random, but highly structured. Despite the fact that increased recording times improve the detection of connections among excitatory neurons, extracellular recordings still underestimate the local connectivity as is reported by paired intracellular recordings. At best, they identify ∼70% of local excitatory connections at an α level that is strict enough to exclude most type I errors (Fig. 3). This limitation can be partly attributed to the fact that extracellular recordings in behaving and resting animals are bound to the brain state spaces visited over the duration of recordings, which the experimenter has only limited control over. In contrast, through direct stimulation of the presynaptic neuron, any subthreshold response in the postsynaptic neuron can be monitored in paired intracellular recordings. In addition, our analysis indicates that, to estimate reliably an asymptotic value for the local connectivity, stable recordings of at least 20–25 h, ideally longer, have to be used. This can be a challenge under many experimental conditions. A further caveat of the correlation method is the fact that the nature of the measure is a correlation and therefore reflects synaptic interactions only indirectly. Consequently, the cross-correlation analysis of extracellularly recorded spike trains should not be considered a definitive tool for estimating connectivity and should be used with caution. Despite this caveat, the method can provide important preliminary information for within- and inter-regional circuit connectivity and changes in their connectivity due to experimental manipulations that may not be obtainable with other current methods.

The cross-correlogram, like all other statistical methods, has limitations in terms of its applicability to the interference of causal interactions from spike-train data. First, the method assumes that the spike trains are stationary, meaning that their stochastic properties do not change over time. This assumption is not always easy to justify; for example, with repeated stimulus presentations neurons tend to adapt their responses (Fairhall et al., 2001). Second, cross-correlations are affected by firing rate differences between the neurons (de la Rocha et al., 2007; Amari, 2009). Appropriate normalization, however, can alleviate this problem (Hirase et al., 2001). It has been proposed that the information-geometric measure could be an alternative correlation measure that is statistically orthogonal to the change of firing probability (Amari and Nagaoka, 2000; Amari, 2001). In addition, recent theoretical studies show that an information-geometric measure could be more directly related to synaptic interactions (Tatsuno and Okada, 2004; Tatsuno et al., 2009; Nie and Tatsuno, 2012) and that it can be applied to nonstationary data (Shimazaki et al., 2012). Other promising methods would include a Bayesian approach (Brown et al., 2004; Eldawlatly et al., 2010) and a regularized logistic approach (Zhao et al., 2012). These measures may perform well in identifying causal, nonlinear relationships between neurons. Further investigation of long-term electrophysiological data by such statistical methods would promote the detection of neural interaction and hence contribute to the understanding of the circuit dynamics underlying complex behavior.

## Footnotes

This work was funded by an Alberta Innovates Health Solutions (AIHS) Polaris Award and National Institute of Mental Health R01MH46823-16 Grant awarded to B.L.M., an Innovates Centre of Research Excellence SCH001 Grant awarded to M.T., and an AIHS graduate Fellowship awarded to C.D.S. We thank Francesco Battaglia for helpful comments on a previous version of the paper.

The authors declare no competing financial interests.

- Correspondence should be addressed to Masami Tatsuno, Department of Neuroscience, Canadian Centre for Behavioural Neuroscience, The University of Lethbridge, 4401 University Drive W, Lethbridge, AB T1K 3M4, Canada. tatsuno{at}uleth.ca

This article is freely available online through the *J Neurosci* Author Open Choice option.