Abstract
To reduce statistical redundancy of natural inputs and increase the sparseness of coding, neurons in primary visual cortex (V1) show tuning for stimulus size and surround suppression. This integration of spatial information is a fundamental, context-dependent neural operation involving extensive neural circuits that span across all cortical layers of a V1 column, and reflects both feedforward and feedback processing. However, how spatial integration is dynamically coordinated across cortical layers remains poorly understood. We recorded single- and multiunit activity and local field potentials across V1 layers of awake mice (both sexes) while they viewed stimuli of varying size and used dynamic Bayesian model comparisons to identify when laminar activity and interlaminar functional interactions showed surround suppression, the hallmark of spatial integration. We found that surround suppression is strongest in layer 3 (L3) and L4 activity, where suppression is established within ∼10 ms after response onset, and receptive fields dynamically sharpen while suppression strength increases. Importantly, we also found that specific directed functional connections were strongest for intermediate stimulus sizes and suppressed for larger ones, particularly for connections from L3 targeting L5 and L1. Together, the results shed light on the different functional roles of cortical layers in spatial integration and on how L3 dynamically coordinates activity across a cortical column depending on spatial context.
SIGNIFICANCE STATEMENT Neurons in primary visual cortex (V1) show tuning for stimulus size, where responses to stimuli exceeding the receptive field can be suppressed (surround suppression). We demonstrate that functional connectivity between V1 layers can also have a surround-suppressed profile. Layer 3 seems to play a particularly prominent role: its functional connections to layers 5 and 1 are strongest for stimuli of optimal size and weaker for large stimuli. Our results therefore point toward a key role of layer 3 in coordinating activity across the cortical column according to spatial context.
Introduction
One of the fundamental computations performed by the primary visual cortex (V1) is the integration of visual information across space. In V1, neurons have a spatially localized classical receptive field (RF), but their activity strongly depends on the spatial context of the stimulus. Responses typically become larger for stimuli of increasing size, but can be suppressed if the stimulus extends beyond the RF (Blakemore and Tobin, 1972; Nelson and Frost, 1978; Allman et al., 1985; Gilbert and Wiesel, 1990; Knierim and van Essen, 1992; DeAngelis et al., 1994). This phenomenon, known as surround suppression, has been described at all stages of the retino-geniculo-cortical pathway, with dedicated neuronal circuits likely working in parallel (Angelucci et al., 2017). Surround suppression is thought to be a key mechanism for reducing redundancies in the natural input, perceptual pop-out, and segmentation of object boundaries (Coen-Cagli et al., 2012; Sachdev et al., 2012; Schmid and Victor, 2014; Angelucci et al., 2017).
In area V1, surround suppression has been observed in most cortical layers, but suppression is strongest and RF size smallest in L2/3 (Jones et al., 2000; Shushruth et al., 2009; Nienborg et al., 2013; Vaiceliunaite et al., 2013; Self et al., 2014). There, optogenetic studies in mice have revealed that one of the key circuits for surround suppression consists of L2/3 somatostatin-positive (SOM+) inhibitory interneurons, which are preferentially recruited by cortical horizontal axons (Adesnik et al., 2012). L2/3 SOM+ inhibitory interneurons have large RFs that effectively sum information across space while showing little surround suppression themselves. Furthermore, their inactivation results in decreased suppression of L2/3 principal cells (Adesnik et al., 2012). Consistent with a more general role in lateral inhibition, SOM+ neurons also seem to control frequency tuning in L2/3 of mouse auditory cortex (Kato et al., 2017).
Beyond horizontal connections and local inhibitory interneurons in L2/3, other mechanisms and neural circuits might control V1 spatial integration, potentially with differential impact across V1 layers. V1 receives extensive inter-areal feedback connections, preferentially terminating in L1 and L5/6 (Coogan and Burkhalter, 1990; Markov et al., 2014) and extending across multiples of the V1 RF diameter, which have been proposed to mediate suppressive influences from surround regions further away (Angelucci et al., 2002; Angelucci and Bressloff, 2006; Nurminen et al., 2018). Moreover, at least some surround suppression might be inherited from dLGN (Müller et al., 2003; Alitto and Usrey, 2008; Piscopo et al., 2013; Erisken et al., 2014). Although each of these mechanisms and circuits likely contributes to surround suppression across several V1 layers, the coordination of suppression across V1 layers remains poorly understood.
To shed light on how activity across layers of a V1 column is differentially orchestrated during spatial integration, we characterized with high temporal resolution surround-suppressed activity within each layer and surround-suppressed functional connectivity between layers using the Granger causality framework (Granger, 1969; Plomp et al., 2014b; van Kerkoerle et al., 2014; Seth et al., 2015; Liang et al., 2017). In awake mice, we recorded single- and multiunit activity and local field potentials (LFPs) simultaneously across all six layers, computed dynamic functional connectivity estimates (Baccalá and Sameshima, 2001; Milde et al., 2010; Plomp et al., 2014b), and used a novel Bayesian model comparison approach to identify at which latencies surround suppression was evident in laminar activity and in interlaminar functional connectivity strengths. We found sustained surround suppression in spiking activity of L4 and L3, with a rapid sharpening of the tuning profile shortly after stimulus onset. In addition, L3 functional connectivity to L1 and L5 also had a surround-suppressed profile: the L3→L1 and L3→L5 connections were strongest for stimuli covering the RF and much reduced in the presence of a surround. Together, these results demonstrate a key role of L3 in orchestrating activity across layers of V1 in a size-dependent way.
Materials and Methods
Experimental procedures.
All experiments were performed on awake, adult mice. The procedures complied with the European Communities Council Directive 2010/63/EC and the German Law for Protection of Animals and were approved by local authorities following appropriate ethics review.
Mice.
We used seven adult mice, four males and three females (two C57BL/6J and five mice with floxed NR1 receptors used as controls for a different study; see Korotkova et al., 2010 for details on the mouse line), which ranged in age from 2 to 7 months. We used recordings with at least two contacts both in L1 and L6 (25 experiments from 13 penetrations), allowing bipolar derivation for L1–L6 (see below).
Surgical procedures.
Surgeries were performed as described previously (Erisken et al., 2014). Briefly, mice were anesthetized using 3% isoflurane, which was maintained for the duration of the surgery at 1.5–2%. Analgesic (buprenorphine, 0.1 mg/kg, s.c.) was administered and eyes were protected from dehydration with an ointment (Bepanthen). The animal's temperature was kept at 37°C via a feedback-controlled heating pad (WPI). A custom-designed head post was attached to the anterior part of the skull using dental cement (Tetric EvoFlow, Ivoclar Vivadent) and two miniature screws were placed in the bone over the cerebellum, serving as reference and ground (#00–96 X 1/16, Bilaney). Following the surgery, antibiotics (Baytril, 5 mg/kg, s.c.) and long-lasting analgesics (carprofen, 5 mg/kg, s.c.) were administered for 3 consecutive days. After recovery, mice were placed on a Styrofoam ball and habituated to head fixation for several days. The day before electrophysiological recordings, mice were again anesthetized (isoflurane 2%), and a craniotomy (∼1 mm²) was performed over V1 (3 mm lateral from the midline suture, 1.1 mm anterior to the transverse sinus). The exposed brain was sealed with the silicone elastomer Kwik-Cast at the end of each recording session. Recording sessions always started at least 1 d after surgery.
Experimental design.
Visual stimuli were created with custom software (Expo, https://sites.google.com/a/nyu.edu/expo/home) and presented on a gamma-corrected LCD monitor (Samsung 2233RZ; mean luminance 50 cd/m²) placed 25 cm from the animal's eyes. Stimulus onset was measured using a photodiode detecting frame-by-frame luminance changes between white and black at the top left corner of the monitor; this photodiode signal was sampled as an additional channel by the data acquisition system (see below) and used for temporal alignment of electrophysiological data to visual stimulation.
To measure RFs, we mapped the ON and OFF subfields with a sparse noise stimulus. The stimulus consisted of white and black squares (4° diameter) briefly flashed for 150 ms on a square grid (40° diameter). To measure size tuning, we centered circular square-wave gratings (spatial frequency 0.02 cycles/°) of 10 different diameters (3.9°, 5.6°, 7.8°, 12.1°, 15.5°, 21.8°, 30.6°, 43.1°, 60.5°, or 67.3° of visual angle) on online estimates of RF centers based on threshold crossings and presented each stimulus in pseudorandom order for 750 ms, followed by a 500 ms ISI. We also included a blank screen condition in which only the mean luminance gray screen was presented. Orientation of the gratings was chosen to match the average preferred orientation tuning based on threshold crossings. The number of trials for each stimulus size varied across animals (mean 208, range 50–500). For determining the L4/L5 border using current source density analysis, we presented a full-field, contrast-reversing checkerboard at 100% contrast, with a spatial frequency of 0.02 cycles/° and a temporal frequency of 0.5 cycles/s.
Extracellular recordings.
Extracellular recordings were performed in awake, head-fixed mice placed on a Styrofoam ball. Recordings of neural activity were performed with a 32 channel linear silicon probe with 25 μm intercontact spacing (Neuronexus, A1x32–5 mm-25-177-A32, thickness: 15 μm, maximal shank width: 145 μm). We used a motorized micromanipulator (MP 225, Sutter Instruments) suitable for intracellular and extracellular recordings to advance probes into cortex at very low speeds (0.5 μm/s) and let the tissue settle for at least 30 min before the first recording. We typically performed four to five recording sessions in the same animal (maximum 10) and avoided regions where implantations were previously made. The frame for head fixation, the cup of the air-floating ball, and the micromanipulator were mounted on an air table to avoid differential movement of the animal relative to the electrode.
Extracellular signals were recorded at 30 kHz (Blackrock Microsystems) and analyzed with the NDManager software suite (Hazan et al., 2006). For spike sorting, we divided the linear array into five “octrodes” (eight channels per group with two channels overlapping). Using a robust spike detection threshold (Quiroga et al., 2004) set to 6 SDs of the background noise, we extracted spike waveshapes from the high-pass-filtered continuous signal. The first three principal components of each channel were used for automatic clustering with a Gaussian Mixture Model in KlustaKwik (Henze et al., 2000) and the resulting clusters were manually refined with Klusters (Hazan et al., 2006). Duplicate spike clusters, which can arise from separating the electrode channels in different groups for sorting, were defined as pairs of neurons for which the cross-correlogram's zero-bin was three times larger than the mean of nonzero bins; one neuron of each such pair was removed from the analysis.
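The duplicate-cluster criterion described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the bin layout, and the example counts are hypothetical, and "nonzero bins" is assumed to mean all cross-correlogram bins other than the zero-lag bin.

```python
def is_duplicate_pair(ccg_counts, zero_index, ratio=3.0):
    """Flag a unit pair whose zero-lag bin exceeds `ratio` times the mean of all other bins."""
    other_bins = [c for i, c in enumerate(ccg_counts) if i != zero_index]
    mean_other = sum(other_bins) / len(other_bins)
    return ccg_counts[zero_index] > ratio * mean_other


# Hypothetical cross-correlogram counts; the zero-lag bin sits at index 3.
ccg_same_neuron = [4, 5, 6, 40, 5, 4, 6]       # sharp zero-lag peak
ccg_different_neurons = [4, 5, 6, 7, 5, 4, 6]  # no pronounced peak

print(is_duplicate_pair(ccg_same_neuron, 3))
print(is_duplicate_pair(ccg_different_neurons, 3))
```

In this toy case only the first pair would be treated as a duplicate and reduced to a single unit.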
For calculating the envelope of multiunit activity (MUAe), we full-wave rectified the median-subtracted, high-pass filtered signals before low-pass filtering (200 Hz) and down-sampling to 2000 Hz (Supèr and Roelfsema, 2005; van der Togt et al., 2005; Self et al., 2014). To ensure spatial alignment of RFs across cortical depth, we routinely assessed RF maps obtained by the sparse noise stimulus, for which we used average MUAe between 50 and 175 ms after stimulus onset (for an example, see Fig. 1b). For analysis of size tuning, MUAe was normalized to prestimulus values for each single trial and layer. Response onset (see Fig. 2) was quantified as the first latency after which the 95% confidence interval (CI) across animals remained >0 for at least 10 ms for the largest presented grating.
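The response-onset rule above (first latency after which the lower bound of the 95% CI stays above zero for at least 10 ms) can be sketched as below. The function name and the synthetic CI trace are hypothetical; the 0.5 ms step merely mirrors the 2000 Hz MUAe sampling rate.

```python
def response_onset(ci_lower, times_ms, min_duration_ms=10.0):
    """Return the first time from which the lower CI bound stays > 0 for min_duration_ms."""
    n = len(times_ms)
    for start in range(n):
        if ci_lower[start] <= 0:
            continue
        end = start
        # Walk forward while the bound stays positive.
        while end < n and ci_lower[end] > 0:
            if times_ms[end] - times_ms[start] >= min_duration_ms:
                return times_ms[start]
            end += 1
    return None  # no sustained positive period found


# Synthetic example: a brief 2 ms blip at 20 ms, then a sustained response from 32 ms.
times = [i * 0.5 for i in range(200)]
ci_low = [1.0 if (20 <= t < 22 or t >= 32) else -1.0 for t in times]
onset = response_onset(ci_low, times)
print(onset)
```

The short blip at 20 ms is ignored because it does not last 10 ms, so the sketch reports the start of the sustained period.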
The LFP was computed by downsampling the data to 1250 Hz, and high-pass, forward-backward filtering at 1 Hz (second-order Butterworth). To map electrode contacts to cortical layers, we computed current source density (CSD) from the second spatial derivative of the LFP (Mitzdorf, 1985) and assigned the base of L4 to the contact that was closest to the earliest CSD polarity inversion from sink to source. The remaining contacts were assigned putative layer labels based on the known relative thickness of V1 layers (Heumann et al., 1977) and an assumed total thickness of ∼1 mm. We checked the L4–L5 boundary localization using CSD methods that do not assume constant activity in the horizontal direction (Pettersen et al., 2006) and obtained identical depth estimates.
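As an illustration of the standard second-spatial-derivative CSD estimate, the sketch below computes it for one time sample across equally spaced contacts. It assumes unit conductivity (arbitrary units) and the sign convention in which current sinks are negative; the function name and the example voltages are hypothetical.

```python
def csd_second_derivative(lfp_by_depth, spacing_mm=0.025):
    """Estimate CSD at interior contacts as the negative second spatial difference of the LFP."""
    csd = []
    for i in range(1, len(lfp_by_depth) - 1):
        second_diff = lfp_by_depth[i - 1] - 2.0 * lfp_by_depth[i] + lfp_by_depth[i + 1]
        csd.append(-second_diff / spacing_mm ** 2)
    return csd


# Hypothetical LFP snapshot across five contacts with a local negativity at the middle contact.
lfp = [0.0, 0.0, -1.0, 0.0, 0.0]
csd = csd_second_derivative(lfp)
print(csd)
```

With this convention the local LFP negativity yields a negative CSD value (a sink) at the middle contact, flanked by positive values (sources). Note that the estimate is undefined at the two outermost contacts.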
We then selected for further analysis the channel closest to the middle of each layer as the representative signal for that layer. To reduce volume conduction effects from neural and noise sources (e.g., muscle artifacts), we derived bipolar LFPs by subtracting signals from the two neighboring electrodes (Bastos et al., 2015; Trongnetrpunya et al., 2015; Rohenkohl et al., 2018). Power spectral density was calculated with the S-transform on epochs of −500 to 500 ms around stimulus onset and rectified as relative increases with respect to prestimulus activity (Roberts et al., 2013).
Time-varying directed connectivity.
We used a time-varying implementation of the partial directed coherence (PDC) (Baccalá and Sameshima, 2001; Milde et al., 2010), a multivariate, directed connectivity measure based on the notion of Granger causality, or the relative predictability of signals from one another (Granger, 1969; Bressler and Seth, 2011). Directed functional connectivity values were calculated from single-trial bipolar LFP signals from −50 to 300 ms relative to stimulus onset. We chose bipolar LFPs (Bastos et al., 2015; Trongnetrpunya et al., 2015; Rohenkohl et al., 2018) because possible noise amplification in CSD calculations can negatively affect connectivity analysis (Trongnetrpunya et al., 2015). Although more sophisticated CSD methods can alleviate this problem using spatial filtering (Pettersen et al., 2006), they introduce parameters that critically determine temporal properties of the resulting CSD signal. Such parameter-dependent dynamics pose problems for analyses that quantify temporal relations between signals, like the PDC, especially when true parameter values are unknown, as for our recordings. For a similar reason, the MUAe signal is not suitable for such analyses. The MUAe provides a continuous measure of spiking activity by taking the envelope of the high-frequency power in the signal (Supèr and Roelfsema, 2005), but this envelope removes phase information from the signal so that the precise timing relations between signals are lost.
PDC was derived from a multivariate autoregressive model based on a fixed model order that reflects the maximum time lag of observations included in the model (Baccalá and Sameshima, 2001). Optimal model orders were determined by minimizing Akaike's information criterion across epochs within animals for each stimulus size (Barnett and Seth, 2014) and ranged between 13 and 15 (10–12 ms). This parametric approach avoids known pitfalls of some nonparametric approaches (Stokes and Purdon, 2017).
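The relation between model order and maximum time lag can be checked with a short sketch. The LFP sampling rate of 1250 Hz is taken from the text; the AIC values and function names are hypothetical and only illustrate picking the order with minimal AIC.

```python
def lag_ms(model_order, fs_hz):
    """Maximum time lag (ms) spanned by an autoregressive model of the given order."""
    return 1000.0 * model_order / fs_hz


def best_order(aic_by_order):
    """Return the candidate model order with the smallest AIC value."""
    return min(aic_by_order, key=aic_by_order.get)


# Hypothetical AIC values for candidate orders (illustrative numbers only).
aic = {12: -1510.2, 13: -1523.8, 14: -1521.1, 15: -1519.4}
order = best_order(aic)
print(order, lag_ms(order, 1250.0))
```

At 1250 Hz, orders 13 and 15 correspond to lags of about 10.4 and 12 ms, which agrees with the 10–12 ms range quoted above.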
To obtain time-varying multivariate autoregressive (tvMVAR) models, we used a Kalman filter approach (Milde et al., 2010). The constants that determine adaptation speed during parameter estimation were fixed at 0.02 following previous work (Astolfi et al., 2008; Plomp et al., 2014a, 2014b). Within animals, tvMVAR parameter estimates were averaged across trials for the 11 conditions (Ghumare et al., 2015).
We orthogonalized the tvMVAR parameters to further guard against possible volume conduction effects (Hipp et al., 2011; Omidvarnia et al., 2014) and obtained orthogonalized PDC (OPDC) values using a row-wise normalization to optimize sensitivity to information outflows (Kuś et al., 2004; Astolfi et al., 2007) as follows: where A is the frequency-transformed tvMVAR parameter matrix. We squared OPDC values to enhance accuracy and stability (Astolfi et al., 2006). Resulting PDC matrices were normalized (0–1) for each animal across conditions, time, and frequencies (1–150 Hz) and multiplied by the normalized spectral power across conditions, time, and frequencies, obtaining a weighted PDC (wPDC) estimator that has been shown to better reflect the underlying physiological processes (Plomp et al., 2014b).
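For reference, one form of the OPDC consistent with this description is sketched below. It is an assumption rather than a formula given in the text: it takes the OPDC definition of Omidvarnia et al. (2014) and replaces the usual column-wise normalization with the row-wise normalization mentioned above. The indices i (target), j (source), and k (running over channels) are introduced here for illustration.

```latex
\mathrm{OPDC}_{ij}(f,t) =
  \frac{\left|\operatorname{Re} A_{ij}(f,t)\right|}{\sqrt{\sum_{k} \left|A_{ik}(f,t)\right|^{2}}}
  \cdot
  \frac{\left|\operatorname{Im} A_{ij}(f,t)\right|}{\sqrt{\sum_{k} \left|A_{ik}(f,t)\right|^{2}}},
  \qquad i \neq j
```

Under this assumed form, the product of real and imaginary parts is what discounts instantaneous (zero-lag) contributions, and the denominator sums over the row of the target channel i.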
Statistical analysis.
In V1, the suppressive influence from the extraclassical surround is generally considered a phenomenon accounted for by divisive normalization (Carandini and Heeger, 2011). On a descriptive level, effects of surround suppression in spatial tuning can be captured by a ratio of Gaussians (RoG) model (Cavanaugh et al., 2002), where a center Gaussian with independent amplitude and width is normalized by a Gaussian representing the surround. Therefore, responses are given by the following:

$$R(x) = \frac{k_c L_c(x)}{1 + k_s L_s(x)}, \qquad L_c(x) = \left(\frac{2}{\sqrt{\pi}} \int_0^x e^{-(y/w_c)^2}\,dy\right)^2, \qquad L_s(x) = \left(\frac{2}{\sqrt{\pi}} \int_0^x e^{-(y/w_s)^2}\,dy\right)^2$$

where x is the stimulus diameter, kc and ks are the gains of center and surround, wc and ws their respective spatial extents, and Lc and Ls are the summed squared activities of the center and surround mechanisms, respectively. Our use of the RoG model is not meant to reflect a particular biophysical implementation, but should only serve as a quantitative description of tuning; however, it has been shown that the RoG model applied to V1 responses can outperform models assuming subtractive influences from the surround (Cavanaugh et al., 2002).

Although RoG models can also capture nonsuppressed responses, responses increasing monotonically for most of the tested stimulus sizes can be more parsimoniously explained by a simple linear null model with only two parameters as follows:

$$R(x) = a + b\,x$$

where a and b reflect intercept and slope, respectively.

In Bayesian statistics, the evidence in favor of one model (M1; RoG model) over another (M2; linear-null model) given the data (D) is the ratio of their posterior probabilities or Bayes factor (Raftery, 1995; Jeffreys, 1998; Rouder et al., 2009):

$$B_{12} = \frac{\Pr(M_1 \mid D)}{\Pr(M_2 \mid D)}$$

We assumed both models to be equally likely a priori, and set the summed prior probabilities to 1. Bayes factors were approximated using the Bayesian information criterion (BIC) values associated with M1 and M2 (Raftery, 1995) as follows:

$$B_{12} \approx \exp\left(\frac{\mathrm{BIC}_{M_2} - \mathrm{BIC}_{M_1}}{2}\right)$$

Model comparisons based on BIC values penalize for the number of parameters and here provide a conservative approach for detecting evidence in favor of the RoG model.
The Bayes factor (B12) quantifies the relative amount of evidence in the data for each model. B12 > 3 is generally considered positive evidence in favor of M1 (Kass and Raftery, 1995; Raftery, 1995). We report B12 values on logarithmic scale.
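The model comparison can be illustrated with a minimal sketch. It assumes the RoG form of Cavanaugh et al. (2002), in which each mechanism is the squared integral of a Gaussian (which reduces to w·erf(x/w) before squaring), the least-squares form of BIC, and the approximation ln B12 ≈ (BIC2 − BIC1)/2. All parameter values and the alternating perturbation are hypothetical, and the RoG residuals are a stand-in for a real nonlinear least-squares fit.

```python
import math


def rog(x, kc, wc, ks, ws):
    """Ratio of Gaussians; (2/sqrt(pi)) * integral_0^x exp(-(y/w)^2) dy equals w * erf(x / w)."""
    lc = (wc * math.erf(x / wc)) ** 2
    ls = (ws * math.erf(x / ws)) ** 2
    return kc * lc / (1.0 + ks * ls)


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b


def bic_least_squares(residuals, n_params):
    """BIC of a Gaussian-error least-squares fit, up to a constant shared by both models."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + n_params * math.log(n)


def log_bayes_factor(bic_m1, bic_m2):
    """ln B12 approximated from BIC values of M1 and M2."""
    return (bic_m2 - bic_m1) / 2.0


sizes = [3.9, 5.6, 7.8, 12.1, 15.5, 21.8, 30.6, 43.1, 60.5, 67.3]
params = dict(kc=1.0, wc=10.0, ks=0.005, ws=30.0)  # hypothetical RoG parameters
noise = [0.5 if i % 2 == 0 else -0.5 for i in range(len(sizes))]
responses = [rog(x, **params) + e for x, e in zip(sizes, noise)]

rog_residuals = noise  # stand-in for a fitted RoG model that recovers the generating curve
a, b = fit_line(sizes, responses)
lin_residuals = [y - (a + b * x) for x, y in zip(sizes, responses)]

ln_b12 = log_bayes_factor(bic_least_squares(rog_residuals, 4),
                          bic_least_squares(lin_residuals, 2))
print(ln_b12 > math.log(3))
```

For this clearly suppressed synthetic tuning curve, the evidence favors the RoG model despite its two extra parameters; for a monotonically increasing curve the BIC penalty would instead tend to favor the linear null model.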
For each time point in the MUAe activity and for each time–frequency point in the connectivity analysis, we fitted amplitudes as a function of stimulus size with linear and RoG models (nonlinear least squares, Port algorithm), enforcing wc < ws (Cavanaugh et al., 2002). Model comparisons were done separately for each animal to avoid effects driven by single animals or outliers.
From the RoG models, we obtained RF center size as the stimulus diameter eliciting peak amplitude (MUAe or connectivity strength); models with center sizes <3.9° (smallest presented size) were not further analyzed. We quantified strength of suppression using the suppression index (SI) as follows:

$$\mathrm{SI} = \frac{A_{\mathrm{opt}} - A_{\mathrm{supp}}}{A_{\mathrm{opt}}}$$

where A is MUAe, spike rate, or wPDC amplitude, Aopt is the model's peak amplitude, and Asupp is the amplitude at the largest presented size (67.3°) (DeAngelis et al., 1994; Self et al., 2014).
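Center size and SI can be read off a fitted curve as in the sketch below. It assumes the RoG form of Cavanaugh et al. (2002) and the usual definition SI = (Aopt − Asupp)/Aopt; the grid search, the function names, and the parameter values are hypothetical choices for illustration.

```python
import math


def rog(x, kc, wc, ks, ws):
    """Ratio-of-Gaussians tuning curve (squared Gaussian integrals written via erf)."""
    lc = (wc * math.erf(x / wc)) ** 2
    ls = (ws * math.erf(x / ws)) ** 2
    return kc * lc / (1.0 + ks * ls)


def center_and_si(model, size_min=3.9, size_max=67.3, step=0.1):
    """Grid-search the peak of a tuning curve; return (center size, suppression index)."""
    n_steps = int(round((size_max - size_min) / step))
    grid = [size_min + i * step for i in range(n_steps + 1)]
    values = [model(x) for x in grid]
    a_opt = max(values)                      # peak amplitude of the model
    center = grid[values.index(a_opt)]       # diameter eliciting the peak
    a_supp = model(size_max)                 # amplitude at the largest presented size
    return center, (a_opt - a_supp) / a_opt


# Hypothetical fitted parameters.
center, si = center_and_si(lambda x: rog(x, 1.0, 10.0, 0.005, 30.0))
print(round(center, 1), round(si, 2))
```

An SI of 0 indicates no suppression at the largest size, and values approaching 1 indicate nearly complete suppression relative to the peak.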
We identified data points with a suppressed tuning curve profile as those that showed both positive evidence in favor of the RoG model (B12 > 3, corresponding to ln B12 > ∼1.1) and a positive SI (SI > 0). This latter requirement ensured that responses with asymptotic or other nonlinear monotonic increases were not further considered. For further analysis, we retained data points where at least six of seven animals passed both criteria (conjunction analysis). When an animal failed a criterion, its results were not included for further summaries. Layers or functional connections with only one data point were not further analyzed (L6 MUAe at 288 ms; L2→L1 connection at 82 ms, 6 Hz). All model comparisons and analyses were done in R (www.r-project.org). For graph visualization (Fig. 6), the layout was determined using the Fruchterman–Reingold algorithm (Fruchterman and Reingold, 1991) applied to a binary adjacency matrix as implemented in the igraph library for R.
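The conjunction rule can be expressed compactly. The sketch below assumes natural-log Bayes factors and uses hypothetical per-animal values; function names are illustrative only.

```python
import math


def passes_criteria(ln_b12, si, threshold=3.0):
    """True if there is positive evidence for the RoG model and the suppression index is positive."""
    return ln_b12 > math.log(threshold) and si > 0


def conjunction(per_animal, min_animals=6):
    """per_animal: (ln B12, SI) pairs for one data point; True if enough animals pass both criteria."""
    n_pass = sum(1 for ln_b12, si in per_animal if passes_criteria(ln_b12, si))
    return n_pass >= min_animals


# Hypothetical values for seven animals at a single time point.
six_pass = [(2.0, 0.5), (1.5, 0.3), (3.1, 0.6), (1.2, 0.1), (2.4, 0.7), (1.9, 0.4), (0.2, 0.5)]
five_pass = six_pass[:5] + [(2.0, -0.1), (0.2, 0.5)]

print(conjunction(six_pass), conjunction(five_pass))
```

In the second example one animal has strong evidence for the RoG model but a negative SI, so it fails the second criterion and the data point is not retained.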
For the model comparison analysis of single-unit RF dynamics, we included sorted units from the central contact in the target layer and the two contacts immediately above and below (i.e., across five contacts, covering 125 μm centered on the middle of L3 or L4, see Fig. 1b). We first fit RoG models to the spike rates between 0 and 300 ms after stimulus onset to identify units with R2 > 0.5, SI > 0 and center size > 4°. For these units, we then dynamically fit RoG models in 50 ms bins sliding between 25 and 250 ms after stimulus onset (1 ms shift size).
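The sliding-window rate estimate can be sketched as follows. It assumes that 25 and 250 ms denote the centers of the first and last 50 ms bins (so that the windows span 0–50 ms up to 225–275 ms); the text does not state this explicitly, and the function name and spike times are hypothetical.

```python
def sliding_rates(spike_times_ms, first_center=25, last_center=250, width_ms=50, step_ms=1):
    """Return (window center in ms, firing rate in spikes/s) for each sliding window."""
    half = width_ms / 2.0
    rates = []
    for center in range(first_center, last_center + 1, step_ms):
        count = sum(1 for t in spike_times_ms if center - half <= t < center + half)
        rates.append((center, count / (width_ms / 1000.0)))
    return rates


# Hypothetical spike times (ms after stimulus onset), pooled over trials of one stimulus size.
spikes = [10.0, 20.0, 30.0, 120.0]
rates = sliding_rates(spikes)
print(len(rates), rates[0])
```

Repeating this for every stimulus size yields, for each window, one size-tuning curve to which a RoG model can then be fitted.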
Results
In awake, head-fixed mice (Fig. 1a), we performed extracellular recordings across all layers of area V1 (Fig. 1b). We assigned electrode contacts to layers based on CSD analysis (Mitzdorf, 1985) (Fig. 1b) and computed from the recorded signals the LFPs and the MUAe (Supèr and Roelfsema, 2005; van der Togt et al., 2005). MUAe responses reflect the number and amplitude of spikes close to the electrode, resembling thresholded multiunit data and average single-unit activity (Supèr and Roelfsema, 2005; Self et al., 2014). To assess spatial integration, we presented gratings of various diameters centered on the RFs of the recorded neurons. Similar to numerous studies before, we found that time-averaged multiunit activity at the granular and supragranular layers varied systematically with grating diameter, typically peaking at intermediate sizes of around 20–30° of visual angle and showing surround suppression with larger diameters (Fig. 1c). Beyond these time-averaged response patterns, we observed considerable variation in response latencies, amplitudes, and time courses of MUAe responses across layers depending on stimulus size (Fig. 1d).
To gain initial insight into the dynamics of size tuning at each layer, we used Bayesian model comparisons to identify at what latencies MUAe activity showed more evidence in favor of a RoG model than a linear model (Fig. 1e). The RoG models consist of two Gaussians with the same center location but different widths and amplitudes and can well capture V1 size tuning curves (Cavanaugh et al., 2002; Vaiceliunaite et al., 2013). In this model, preferred center size is given by the peak location of the fitted curve and suppression strength is the amplitude reduction for large stimuli relative to peak response (SI) (Van den Bergh et al., 2010). By selecting RoG models with B12 values > 3 and SI > 0, we identified layers and time points where MUAe amplitudes consistently reflected a surround-suppressed tuning curve (see Materials and Methods, “Statistical analysis”).
Multiunit activity in L3–L5 is dynamically suppressed
Using the above outlined, stringent model comparison approach, we found that the major time points where MUAe activity consistently reflected surround suppressed tuning curves occurred in L3, L4, and L5 (Fig. 2a). Surround suppression was first evident in L4, emerging at 44 ms after stimulus onset (first data point with consistent evidence in favor of suppression in 6/7 animals). L4 suppression onset came 12 ms after response onset at 32 ms (first data point after which the 95% CI across animals exceeded baseline for at least 10 ms; Fig. 2b, middle). The observed short delay between stimulus-driven activity and surround suppression onset is consistent with the hypothesis that even at the earliest latencies in L4 of mouse V1, surround suppression is not solely inherited from the dLGN, but is rapidly shaped by intracortical circuits (Knierim and van Essen, 1992; Smith et al., 2006). L4 MUAe suppression continued until 114 ms after stimulus onset and also showed a later, sustained surround-suppressed response component from 183 ms onward (Fig. 2a). Across time points, L4 MUAe peaked for stimulus diameters of 24° (range: 14–52°), with a median SI of 0.61 (range: 0.04–0.84) (Fig. 2c, d).
In L3, MUAe onset (40 ms) occurred later than in L4, but the onset of suppression was similar to that of L4 (47 ms; Fig. 2b) and, overall, surround suppression had a similar time course and strength (Fig. 2a, c, d). In particular, L3 MUAe showed surround suppression during broadly two periods: an early period starting slightly after response onset from 47 to 115 ms and a later one between 187 and 300 ms (end of epoch; Fig. 2a). Overall, L3 MUAe median center size was 22° (range: 5–36°) with suppression strengths (SI) of 0.6 (0.05–0.96) (Fig. 2c, left), indicating considerable suppression of RF activity in L3 during surround stimulation.
For L5 MUAe, surround suppression started relatively late, 55 ms after stimulus onset or 22 ms after response onset (Fig. 2a, b), and suppression was more concentrated in time (55–65 ms) than in the more superficial layers. At these relatively few time points, however, consistent evidence for surround suppression was found in all mice. Here, L5 MUAe preferred stimulus diameters of 25° (range: 9–49°) and had a median SI of 0.31 (0.11–0.71). Model comparison results and RoG model parameters for the surround suppressed MUAe per layer are summarized in Table 1.
None of the MUAe in L1, L2, or L6 showed strong and consistent evidence for surround suppression at any single time point. In L1, stimulus-evoked transient MUAe responses were small and showed only modest variations with stimulus size across time (Fig. 1d). This is not surprising given that L1 has relatively few neurons (Hestrin and Armstrong, 1996; Gonchar et al., 2007), the spiking activity of which is difficult to pick up with extracellular recordings. L2 showed evoked responses (Self et al., 2014) and time-averaged MUAe had a surround-suppressed profile (Nienborg et al., 2013; Vaiceliunaite et al., 2013; Self et al., 2014), but variability was considerable and consistency across animals lower than for L3 and L4 MUAe. The lack of surround suppression at any single time point in L1 and L2 indicates that surround-suppressed activity in those layers is less consistently time locked to stimulus onset than the suppression that we dynamically observed in L3, L4, and L5. This suggests that L1 and L2 serve less time-critical functions in spatial processing. In L6, by contrast, stimulus-evoked MUAe was strong, but it increased monotonically with stimulus size without showing consistent suppression at any single time point (Fig. 1c, d). The weak average SI in L5 and the relative lack of surround suppression in L6 are generally consistent with previous studies reporting broader spatial tuning for V1 infragranular layers in cats (Jones et al., 2000) and mice (Nienborg et al., 2013; Vaiceliunaite et al., 2013; Self et al., 2014).
Functional connectivity between layers shows surround suppression
Having observed strikingly different effects of surround suppression across cortical layers and in time, we next assessed how cortical layers dynamically orchestrate activity during spatial integration by analyzing interlaminar functional connectivity. We calculated time-varying connectivity between all layers based on wPDC, a multivariate variant of Granger causality in the frequency domain (Baccalá and Sameshima, 2001; Bressler and Seth, 2011; Plomp et al., 2014b; Seth et al., 2015) (see also Materials and Methods, “Time-varying directed connectivity”). Granger causality is a statistical measure of time-lagged regularities between recorded signals (here, LFPs with bipolar derivation), where increased connectivity means that the future activity of the target layer becomes better predictable from the activity at the source layer; that is, that the source layer more strongly drives activity in the target layer. Because driving in this functional connectivity framework might or might not occur via direct anatomical connectivity, we decided to interpret our results in light of the simultaneously recorded spiking activity (MUAe) and the known structural connectivity while also emphasizing alternative explanations and limits of the technique (see also Discussion). In the past, similar connectivity analyses have helped to better understand laminar interactions at rest and during sensory processing (Bollimunta et al., 2008; Plomp et al., 2014a; van Kerkoerle et al., 2014; Chen et al., 2017; Liang et al., 2017).
We calculated wPDC strengths from our V1 laminar recordings and investigated how directed functional connectivity depended on spatial context. We reasoned that, in the same way that surround-suppressed activity reflects contextual influences on neuronal responsivity, surround-suppressed connections would reflect how stimulus context parametrically varies the influence that a source layer has on future activity of its target layer. We illustrate this reasoning in Figure 3 using two example connections. Figure 3a shows directed connectivity strengths in response to a large-sized grating from L4 to each of the other layers. Consistent with known L4 projections, the main targets of L4 driving were L3 and L5 (Thomson and Bannister, 2003; Harris and Shepherd, 2015; Pluta et al., 2015; Xu et al., 2016). Figure 3b shows connectivity strengths of the L3→L1 connection for different stimulus sizes and illustrates that connection strength can depend on spatial context.
Applying the same Bayesian model comparison approach as above to directed functional connectivity strengths at each time (0–300 ms) and frequency point (1–150 Hz), we obtained for each connection a time–frequency distribution of Bayes factors (B12). Using the same thresholding and conjunction analysis as for MUAe, we identified five major interlaminar connections whose strength followed a surround-suppressed tuning curve and thus relayed information about stimulus size and context.
The earliest surround-suppressed connection extended from L4 to L2 at latencies between 49 and 51 ms after stimulus onset and operated in the beta band (Fig. 4a; Table 2). Across time and frequency points, the L4→L2 connection was strongest for stimuli of 23° (range across animals: 10–41°) and strongly suppressed for larger stimuli (median SI = 0.65, range 0.05–0.89). In general, functional connectivity from L4 to L2 is consistent with the known ascending projections from L4 to L2/3 (Thomson and Bannister, 2003; Harris and Shepherd, 2015; Xu et al., 2016; Pluta et al., 2017). The surround-suppressed L4→L2 connectivity coincided with the onset of surround-suppressed MUAe at L4 (Fig. 2a), consistent with the notion that size-tuned multiunit activity plays a role in the relay of size information to L2. Remarkably, the time point by time point MUAe analysis of L2 did not show consistent evidence for surround-suppressed activity. This indicates that the L4→L2 driving does not immediately and consistently result in size-tuned MUAe at L2. Instead, it suggests that L4 activity parametrically drives postsynaptic potentials at L2 in a less time-locked manner, likely contributing to the surround-suppressed activity obtained in time-averaged data (Fig. 1c).
A second ascending connection with a surround-suppressed driving profile was the L3→L1 connection, which showed surround-suppressed connectivity in the beta and low gamma band between 65 and 78 ms (Fig. 4b). For this connection, median preferred size was 39° (16–44°), with a suppression strength of SI = 0.3 (0.1–0.83) (Fig. 4d). As with the L4→L2 connection, the latencies of surround-suppressed driving coincided with a period of surround-suppressed MUAe in the source layer and with an absence of evidence for time-resolved surround suppression in the target layer (Fig. 2d), suggesting that size-tuned activity at L3 relays size information to L1 by driving postsynaptic potentials with low temporal precision. Target L1 is an important recipient of thalamic and cortical feedback (Coogan and Burkhalter, 1993; Ji et al., 2015; D'Souza and Burkhalter, 2017). L1 postsynaptic potentials, in turn, have modulatory influence throughout the column because neurons in most layers have apical dendrites in L1, allowing L1 to change spike likelihoods in deeper layers (Larkum et al., 1999; Jiang et al., 2013; Egger et al., 2015). The L3→L1 driving was largest for stimuli inside the RF of the column, indicating that the processing at L1 is shaped in a size- and context-dependent way. The L3→L1 connection could thus potentially modulate how feedback arriving at L1 affects activity throughout the column, but in the absence of known L3→L1 excitatory projections, this effect could also result from a common circuit mechanism that manifests itself slightly earlier in L3 than in L1. These hypotheses provide interesting directions for future investigations.
In addition to these ascending size-tuned connections, size information was also relayed via descending functional connectivity from L3. Driving from L3→L5 showed a surround-suppressed profile across several frequency bands at latencies between 50 and 107 ms (Fig. 4c). Functional connectivity was overall strongest for stimuli spanning 17° (range: 10–50°) with a suppression strength of 0.59 (SI, range 0.11–0.73). L2/3 pyramidal cells constitute the main input to L5 and L3 contains apical dendrites from L5 pyramidal cells (Thomson and Bannister, 2003; Xu et al., 2016). Functional synaptic coupling between L3 and L5 has been previously established using laminar population analysis (Einevoll et al., 2007). The size-tuned L3→L5 connection coincided with size-tuned MUAe at L3, consistent with the idea that spiking activity in L3 drives postsynaptic potentials at L5 in a context-dependent manner. At the target layer L5, size-tuned MUAe coincided with this connection, indicating that the surround-suppressed L3→L5 connection may immediately contribute to surround-suppressed spiking activity at L5. L5 is also an important recipient of feedback connections (Coogan and Burkhalter, 1990; Markov et al., 2014), suggesting the possibility that this L3 driving modulates the influence of input from higher cortical areas on processing in this column.
At longer latencies between 290 and 300 ms (end of epoch), the L3→L4 connection showed surround suppression in the high-gamma band (Fig. 4d), consistent with known excitatory projections (Xu et al., 2016). The median center size of this connection was 21° (range: 11–27°) with an SI of 0.68 (0.45–0.82). This timing corresponds to the second period of size-tuned MUAe in L4. Although our results are thus consistent with the hypothesis that late L3→L4 driving shapes L4 size-tuned activity, it is likely that other influences at these latencies also contribute to L4 surround suppression.
The relays of surround-suppressed size information from L3 to both L5 and L1 and later to L4 thus all occurred simultaneously with L3 surround-suppressed MUAe (Fig. 2). The coexistence of surround-suppressed spiking activity and surround-suppressed driving from L3 is consistent with the interpretation that surround-suppressed population activity at L3 has a major role in orchestrating activity across V1 layers in a context-dependent way.
Last, the ascending L6→L4 connection briefly showed surround-suppressed size tuning in the gamma band, with peak driving for stimuli of 31° (range: 19–36°) and median SI of 0.5 (0.35–0.78; Fig. 4e). Occurring at 75 ms, considerably later than V1 response onset, this functional connection might be driven by fast feedback from higher-level areas to L6 (Domenici et al., 1995; Nowak et al., 1997; Zhang et al., 2014). Simultaneously, size-tuned MUAe was seen at target layer L4, suggesting that L6 driving contributes to size-tuned activity at L4 at those latencies. The surround-suppressed L6→L4 connectivity might enhance the gain of visual input in L4 through intracortical circuits (Raizada and Grossberg, 2003). In our data, however, the mechanism of this driving remains unclear because L6 MUAe did not show surround suppression.
In summary (see also Table 2), we found that directed functional connectivity strengths from L3, L4, and L6 resemble surround-suppressed tuning curves that are typically observed for single-unit or multiunit spiking activity. These connections most strongly influenced target layers for stimuli covering the RF and showed reduced driving for larger stimuli. L3 in particular, but also L4 and L6, thus effectively relay information about stimulus size and play an active role in coordinating laminar activity patterns through parametric variations in connection strengths that depend on spatial context.
We did not observe consistent and time-locked, size-tuned driving arising from L1. This might be somewhat surprising given that feedback from extrastriate visual areas has a prominent termination in L1 (Coogan and Burkhalter, 1990; Ji et al., 2015; D'Souza and Burkhalter, 2017) and is important in size tuning (Nassi et al., 2013; Angelucci et al., 2017; Nurminen et al., 2018). Several explanations could account for the absence of size-tuned connectivity arising from L1 in our study. First, as suppressive feedback from L1 might increase with stimulus diameter rather than following a tuning curve, L1's influence over the column might not be captured well by our analysis targeting size-tuned driving. Second, studies on corticocortical feedback influences on surround suppression have so far mostly been performed in primates (Nassi et al., 2013; Angelucci et al., 2017; Nurminen et al., 2018) and it is unclear whether such feedback targeting L1 is equally important across species. In fact, a recent study in mouse V1 using pharmacological silencing of superficial layers (Self et al., 2014) did not observe changes in the strength of general surround suppression or orientation-tuned surround suppression in lower layers. Finally, although we took utmost care during the electrophysiological recordings, we cannot exclude that an underrepresentation of L1 driving arises from potential cortical damage inflicted by the depth probes, which might be more severe for the upper layers than for the lower ones. More studies are clearly needed to directly address the role of feedback targeting L1 in surround suppression in mouse V1.
Dynamic size tuning
Previous work has shown that spatial RF properties in V1 can undergo fast dynamics after response onset, showing rapid decreases in preferred size and increases in suppression in both cat and monkey (Wörgötter et al., 1998; Malone et al., 2007; Briggs and Usrey, 2011). We therefore investigated whether such coarse-to-fine tuning dynamics occur in L3 and L4 MUAe of mouse V1 and whether the size-tuned functional connections from L3 follow these dynamics as well.
We first investigated center size and SI dynamics for size-tuned MUAe in L4 and L3 relative to stimulus onset (Fig. 5a). Both L4 and L3 MUAe showed an initial phase with rapidly decreasing center sizes and increasing SIs, followed by a phase with stable RF properties. Similar two-stage dynamics have previously been shown in cat LGN (Ruksenas et al., 2007; Einevoll et al., 2011). To better quantify the observed dynamics and test whether they held across mice, we investigated whether center sizes negatively correlated with SI in the initial phase using linear mixed-effects models allowing variable intercepts and slopes across mice. This revealed a consistent inverse relationship between RF center size and SI for L4 MUAe (slope −0.014; F(1, 371) = 17.36, p < 0.001) and L3 MUAe (slope −0.012; F(1, 131) = 10.04, p = 0.002). These results demonstrate that a rapid sharpening of tuning-curve profiles in the first 150 ms occurs reliably across mice.
We found similar coarse-to-fine tuning when we investigated RF dynamics of single-unit activity (see Materials and Methods, “Extracellular recordings”). We identified single neurons in L3 and L4 that showed surround suppression (L3, n = 35; L4, n = 29) and used a moving window approach to determine their RF dynamics, obtaining good RoG model fits. Inspecting RF center size and suppression strength dynamically in 50 ms moving windows, we found a sharpening of RFs between 50 and 100 ms after stimulus onset with simultaneously decreasing center sizes and increasing suppression strength (Fig. 5b). This RF sharpening observed in single units provides a physiological basis for the sharpening seen in MUAe and functional connectivity strengths, lending support to the notion that surround-suppressed MUAe and functional connections qualitatively reflect the underlying activity of single units.
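To make the notion of RF sharpening concrete, the sketch below evaluates one common parameterization of a ratio-of-Gaussians (RoG) size-tuning curve and extracts preferred size and SI numerically. It is an illustration under stated assumptions rather than the fitting procedure used here: the exact RoG form, the parameter values for the "early" and "late" windows, and the function names are all hypothetical, and the SI definition is the assumed (R_pref − R_largest) / R_pref.

```python
import math


def rog_response(size_deg, k_c, w_c, k_s, w_s):
    """Ratio-of-Gaussians response to a stimulus of a given diameter.

    Center and surround are integrated Gaussians (squared error
    functions) with widths w_c and w_s and gains k_c and k_s.
    This particular form is an assumed, commonly used variant.
    """
    center = math.erf(size_deg / w_c) ** 2
    surround = math.erf(size_deg / w_s) ** 2
    return k_c * center / (1.0 + k_s * surround)


def preferred_size_and_si(params, max_size=60.0, step=0.5):
    """Evaluate the model on a size grid; return (preferred size, SI)."""
    n = int(max_size / step)
    sizes = [step * i for i in range(1, n + 1)]
    resp = [rog_response(s, *params) for s in sizes]
    peak = max(range(len(resp)), key=lambda i: resp[i])
    si = (resp[peak] - resp[-1]) / resp[peak]
    return sizes[peak], si


# Hypothetical parameter sets (k_c, w_c, k_s, w_s) for two time windows.
early = (10.0, 15.0, 1.0, 40.0)
late = (10.0, 8.0, 3.0, 30.0)
early_size, early_si = preferred_size_and_si(early)
late_size, late_si = preferred_size_and_si(late)
```

In this toy case the later window has a narrower center and a stronger surround, so the preferred size shrinks while SI grows, which is the coarse-to-fine pattern described above. In a moving-window analysis, such a fit would be repeated for each window position.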
We finally inspected whether similar dynamics existed for the L3→L5 and L3→L1 connectivity strengths, which are the most sustained of the surround-suppressed connections (Fig. 4). We found that these connections showed tuning dynamics similar to those observed in MUAe and single units, with decreasing center size and increasing SI between 50 and 150 ms after stimulus onset (Fig. 5c). As for MUAe, center sizes negatively correlated with SI across animals for the L3→L5 (−0.03; F(1,397) = 13.14, p < 0.001) and L3→L1 connection (−0.02; F(1,103) = 47.06, p < 0.001). This sharper tuning of functional connections proceeded in parallel with sharper tuning of MUAe in L3, suggesting that relays of size information to L5 and L1 qualitatively follow the coarse-to-fine dynamics observed in L3 MUAe.
It is remarkable that MUAe, single-unit activity, and functional connectivity strengths showed similar coarse-to-fine dynamics of tuning parameters, particularly because functional connections were derived from low-frequency LFP signals that reflect a complex mixture of cellular and postsynaptic currents (Buzsáki et al., 2012; Einevoll et al., 2013). A parsimonious interpretation of these converging findings is that there is a common local source for this sharpening in single-unit activity.
Discussion
We here provide a dynamic view on how spatial integration evolves across cortical layers of mouse V1 based on a Bayesian model comparison approach applied to laminar multiunit activity and interlaminar functional connectivity strengths. Our analyses reveal that information about stimulus size and context evolves across time and is dynamically communicated between cortical layers through a network of size-tuned functional connections (Fig. 6). These connections from L3, L4, and L6 parametrically vary with spatial context, driving activity in target layers L1, L2, L4, and L5 most strongly for intermediate stimulus sizes while showing reduced influence for larger ones. Among these functional connections, L3 occupies a central role, exhibiting surround suppression in its single- and multiunit activity, as well as in its impact on other layers. These findings shed new light on how laminar activity is coordinated across a cortical column and on the different functional roles of cortical layers in spatial integration.
Consistent with previous anatomical and circuit-level results, our functional connectivity analyses reveal a major role for L3 in dynamically orchestrating spatial integration across cortical layers. In visual cortex of many mammalian species, L3 exhibits prominent horizontal connectivity (Rockland and Lund, 1982; Gilbert and Wiesel, 1983), where neurons can extend their axons within the layer beyond their own RF. Being preferentially connected according to similarity in orientation preference (Bosking et al., 1997; Ko et al., 2011) makes these pyramidal cells optimally suited to mediate the well-known orientation dependency of surround modulations (Nelson and Frost, 1978; Self et al., 2014). In addition, the preferential recruitment of SOM+ inhibitory interneurons by L2/3 pyramidal cells is a circuit motif in accordance with their prominent role in L2/3 surround suppression (Adesnik et al., 2012) or lateral inhibition (Pluta et al., 2017). Our finding of consistent and relatively strong surround suppression for considerable durations in L3 multiunit and single-unit activity is consistent with this notion of a prominent role of L2/3 in shaping spatial integration.
In addition to exhibiting surround-suppressed activity, our functional connectivity analyses also revealed that L3 coordinates activity in the column by modulating activity in L5, L1, and L4 according to spatial context. The predominant frequencies of L3 driving were in the beta and lower-gamma band. Gamma-band activity has been associated with feedforward streams at L3 (Markov et al., 2014; Bastos et al., 2015), suggesting a feedforward interpretation. The L3→L5 connection also showed driving in lower bands, at longer latencies that may play a role in feedback from downstream areas (von Stein and Sarnthein, 2000). Generally, among the size-tuned connections from L3, the L3→L5 driving was most prominent and could contribute to establishing surround-suppressed activity in L5, which occurred there at slightly longer latencies and was notably briefer and weaker. Such a dual role of L3 in providing horizontal competition within L3 while driving a less suppressed signal in L5 is reminiscent of results in somatosensory cortex, where a layer-specific excitation–inhibition ratio creates lateral suppression in L3 and feedforward facilitation in L5 (Adesnik and Scanziani, 2010).
Remarkably, L5 surround-suppressed activity itself did not parametrically drive activity in other layers. This suggests that, although L5 is known to propagate activity across the column (Sakata and Harris, 2009; Plomp et al., 2017) and can sustain L2/3 activity, including its horizontal spread (Wester and Contreras, 2012), these functional connections are not suppressed by spatial context. Similarly, our results suggest that the influence of L4 activity on other layers in the column does not strongly vary with spatial context even though L4 activity itself shows sustained periods of strong surround suppression.
Our results are based on directed functional connectivity analysis within a multivariate Granger causality framework (Granger, 1969; Baccalá and Sameshima, 2001; Bressler and Seth, 2011), reflecting the predictability between recorded signals while accounting for both the directedness of neural interactions and their dynamics. An interpretation in terms of neural circuits, however, is not immediately warranted (Seth et al., 2015; Chen et al., 2017). The presence of a functional connection only indicates that activities systematically covary in time; likewise, the presence of a structural connection only implies the potential for interaction (Battaglia et al., 2014), the magnitude of which will depend on synaptic strength and various circuit-level forms of gating (Wang and Yang, 2018). Despite the lack of a 1:1 relationship, functional connectivity analysis of LFPs has previously helped define the role of thalamocortical interactions in visual attention (Saalmann et al., 2012), interactions between cortical layers at rest and during stimulation (Brovelli et al., 2004; Bollimunta et al., 2008; Plomp et al., 2014a; Chen et al., 2017; Liang et al., 2017), as well as frequency-specific feedforward and feedback interactions between visual areas that are in excellent agreement with known anatomy (van Kerkoerle et al., 2014; Bastos et al., 2015; Michalareas et al., 2016).
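As a reminder of what such a directed measure quantifies, the sketch below computes plain partial directed coherence (Baccalá and Sameshima, 2001) from the coefficients of a fixed multivariate autoregressive model. It is not the weighted, time-varying wPDC pipeline used in this study; the two-channel toy model, the coefficient values, and the function name `pdc` are hypothetical.

```python
import cmath
import math


def pdc(coeffs, freq):
    """Partial directed coherence at one normalized frequency.

    coeffs: list of lag matrices A_1..A_p, where A_k[i][j] is the
            influence of channel j at lag k on channel i.
    freq:   frequency in cycles per sample (0 to 0.5).
    Returns a matrix whose entry [i][j] is the PDC from j to i.
    """
    n = len(coeffs[0])
    # A_bar(f) = I - sum_k A_k * exp(-i * 2 * pi * f * k)
    a_bar = [[complex(1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    for k, a_k in enumerate(coeffs, start=1):
        phase = cmath.exp(-2j * math.pi * freq * k)
        for i in range(n):
            for j in range(n):
                a_bar[i][j] -= a_k[i][j] * phase
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Normalize each source column by its total outflow.
        norm = math.sqrt(sum(abs(a_bar[m][j]) ** 2 for m in range(n)))
        for i in range(n):
            out[i][j] = abs(a_bar[i][j]) / norm
    return out


# Toy model: channel 0 drives channel 1 at lag 1, but not vice versa.
lag1 = [[0.5, 0.0],
        [0.4, 0.5]]
result = pdc([lag1], freq=0.1)
```

In this toy model, `result[1][0]` is nonzero whereas `result[0][1]` vanishes, reflecting the one-way dependence built into the coefficients. As noted above, such directed predictability by itself does not establish a specific anatomical pathway.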
Consistent with previous findings of RF dynamics in monkey and cat (Wörgötter et al., 1998; Malone et al., 2007; Ruksenas et al., 2007; Briggs and Usrey, 2011), we show that spatial RFs also evolve from broad to sharp spatial tuning within the first 150 ms after stimulation in mouse V1. This dynamic sharpening was not only observed in L3 and L4 MUAe, but also in single-unit spike rates and functional connection strengths from L3 (Fig. 5). The RF sharpening of L3 activity seems to be independent of activity elsewhere in the column because, during this dynamic sharpening, size-specific connections did not target L3. Therefore, the RF sharpening might be related to known mechanisms within L3, such as surround suppression through the inhibition of excitatory cells via horizontal connections (Ozeki et al., 2009; Adesnik et al., 2012). What could be the role of such coarse-to-fine visual processing? The size of RFs is known to be contrast-dependent, such that the observed shrinkage over time suggests that the local processing in L3 and L4 might perform an enhancement of effective contrast, sharpening the representation of visual space. Consistent with this, the latencies match those of boundary detection processes, as observed in macaque V1 (Poort et al., 2016).
In the past, response onset latencies have been compared with suppression onset to differentiate between inheritance and local mechanisms in surround suppression (Alitto and Usrey, 2008). In our data, we observed delays of suppression relative to response onset of about 10 ms in L4 and L3. Similar delays, albeit of overall larger magnitude, between response onsets and suppression onsets have been previously reported in V1 of the anesthetized mouse (Self et al., 2014) and in macaque V1 for uniform patterns (Knierim and van Essen, 1992; Smith et al., 2006). Other studies, however, have found approximately instantaneous onsets of suppression at response onset (Müller et al., 2003). It has become clear that the temporal evolution of surround modulation depends systematically on the strength of surround stimulation (Henry et al., 2013), with weaker suppression observed later in time. Because most of our main results rely on activity of local populations, which in rodents lack a large-scale organization according to preferred orientation (Ohki et al., 2005), our stimulus, at least on average, might provide suboptimal drive to the extraclassical orientation-tuned surround mechanisms in V1. Therefore, in mouse V1, onset of surround suppression might be relatively slow. Although our results of delayed suppression might therefore point toward cortical mechanisms, including intracortical feedback, dynamically shaping V1 spatial integration, delayed onsets of suppression in V1 could in principle also be attributed to slowly developing signals inherited from dLGN or even the retina. Future studies performing a dynamic analysis of simultaneously recorded LGN and V1 suppression effects are clearly needed to unequivocally distinguish between these possibilities.
During our recordings, awake mice were placed on an air-cushioned ball that allowed them to either sit or run, but to maximize the number of trials our analysis was performed regardless of behavioral state. It is known that locomotion can alter spatial integration in mouse area V1 (Ayaz et al., 2013; Dipoppa et al., 2018) and dLGN (Erisken et al., 2014), increasing the RF center size and reducing surround suppression. In our experiments, however, effects of locomotion are unlikely to induce systematic biases because bouts of locomotion occur spontaneously across the 11 randomly interleaved stimulus conditions. Having established now the basic laminar profile of surround suppression within our dynamic connectivity analysis framework, it will be interesting to investigate the influence of behavioral state on laminar relay of size and context information in future studies.
Our results are limited to the ascending and descending relays of size information within a V1 column. In the future, it will be important to extend our recording approach to multishank probes and to determine the relative contributions of vertical and horizontal interactions in V1 during spatial integration (Kätzel et al., 2011; Constantinople and Bruno, 2013; Narayanan et al., 2015). Our analysis approach would also benefit from layer-specific or interneuron-specific perturbations of neural circuits in vivo. Such causal manipulations would strongly constrain the functional connectivity models and provide further guidance in the interpretation of the model results.
Footnotes
This work was supported by the Swiss National Science Foundation (Grant PP00P1_157420 to G.P.) and the German Research Council (DFG Grant BU 1808/5-1 to L.B.). We thank Mattia F. Pagnotta for implementing row-normalized OPDC.
The authors declare no competing financial interests.
Correspondence should be addressed to Gijs Plomp, Department of Psychology, University of Fribourg, Rue de Faucigny 2, 1700 Fribourg, Switzerland. gijs.plomp{at}unifr.ch