Abstract
The brain combines two-dimensional images received from the two eyes to form a percept of three-dimensional surroundings. This process of binocular integration in the primary visual cortex (V1) serves as a useful model for studying how neural circuits generate emergent properties from multiple input signals. Here, we perform a thorough characterization of binocular integration using electrophysiological recordings in the V1 of awake adult male and female mice by systematically varying the orientation and phase disparity of monocular and binocular stimuli. We reveal widespread binocular integration in mouse V1 and demonstrate that the three commonly studied binocular properties—ocular dominance, interocular matching, and disparity selectivity—are independent of each other. For individual neurons, the responses to monocular stimulation can predict the average amplitude of binocular response but not its selectivity. Finally, the extensive and independent binocular integration of monocular inputs is seen across cortical layers in both regular-spiking and fast-spiking neurons, regardless of stimulus design. Our data indicate that the current model of simple feedforward convergence is inadequate to account for binocular integration in mouse V1, thus suggesting an indispensable role played by intracortical circuits in binocular computation.
SIGNIFICANCE STATEMENT Binocular integration is an important step of visual processing that takes place in the visual cortex. Studying the process by which V1 neurons become selective for certain binocular disparities is informative about how neural circuits integrate multiple information streams at a more general level. Here, we systematically characterize binocular integration in mice. Our data demonstrate more widespread and complex binocular integration in mouse V1 than previously reported. Binocular responses cannot be explained by a simple convergence of monocular responses, contrary to the prevailing model of binocular integration. These findings thus indicate that intracortical circuits must be involved in the exquisite computation of binocular disparity, which would endow brain circuits with the plasticity needed for binocular development and processing.
Introduction
Neural circuits in the visual system combine inputs from the two eyes and transform them into signals that are then used to guide animal behavior. In humans, the process of binocular integration is key to stereoscopic vision as it uses the small difference between the two retinal images (i.e., binocular disparity) to encode depth (Cumming and DeAngelis, 2001). More generally, early stages of binocular integration are an excellent model for studying how neural circuits combine multiple streams of signals into one and generate useful emergent properties for subsequent processing.
The prevailing view of binocular integration came from a series of studies done in the cat primary visual cortex (V1), which led to the disparity energy model (Ohzawa et al., 1990, 1996, 1997). In cats, binocular convergence occurs downstream of V1 layer 4. Convergence of the feedforward geniculocortical projections creates oriented receptive fields (RFs) in layer 4 neurons (Hubel and Wiesel, 1962; Priebe and Ferster, 2012), which, according to the disparity energy model, are then combined to generate selectivity for binocular disparity. Specifically, the model posits that the two monocular RFs have different degrees of spatial offset, and only specific disparities that match this offset would cause the neuron to discharge because of spiking threshold. Although this model has been successful in explaining many experimental observations in cat and monkey V1 (Cumming and Parker, 1997; Anzai et al., 1999; Prince et al., 2002; Tsao et al., 2003), it is important to note that it is purely feedforward and only considers excitatory connections.
The mouse has been a useful model in binocular vision studies, especially for ocular dominance (Espinosa and Stryker, 2012), a measure of relative response magnitude of individual V1 neurons through the two eyes. Studies have also investigated how V1 neurons match their orientation preference through the two eyes, which is driven by visual experience during a critical period in early life (Wang et al., 2010, 2013; Gu and Cang, 2016). A few studies reported large proportions of monocular neurons in the binocular zone of mouse V1 (Salinas et al., 2017; Huh et al., 2020; Jenks and Shepherd, 2020; Tan et al., 2020, 2021), inconsistent with the results of many previous studies (Gordon and Stryker, 1996; McGee et al., 2005; Mrsic-Flogel et al., 2007; Kameyama et al., 2010). This discrepancy could be because of technical differences but also highlights the potential confusion caused by the classification of monocular and binocular neurons in studies where the two eyes were never stimulated simultaneously (Cang et al., 2023). The nomenclature “monocular” implies a specific wiring pattern, whereas in fact it refers to a neuronal property measured at the level of spikes. Indeed, studies in a number of species, including mice, indicate that ocular dominance is not directly related to having selectivity to binocular disparity (LeVay and Voigt, 1988; Read and Cumming, 2004; Kara and Boyd, 2009; Scholl et al., 2013; Chioma et al., 2020). However, how V1 neurons transform monocular inputs into binocular responses in mice has not been systematically studied.
Here, we set out to study binocular integration in mouse V1 by interleaving monocular and binocular stimulations using dichoptic presentation. We found that mouse V1 neurons heavily dominated by one eye or having mismatched orientation preference were just as likely to be tuned to phase disparity as neurons with more balanced activation or matched preference through both eyes. Consequently, almost all neurons in the V1 binocular zone showed evidence of binocular integration. In addition, for individual neurons, monocular responses were poorly associated with responses to binocular stimulation. Because a simple feedforward model would dictate that any binocular response was the result of converging monocular inputs, this decoupling of monocular and binocular responses indicates that binocular integration is unlikely to be explained by feedforward mechanisms alone. Rather, intracortical circuits must play a major role in binocular integration. This mechanism may provide visual circuits with the plasticity they need for binocular processing.
Materials and Methods
Animals
Male and female C57BL/6 mice (n = 28; 16 male, 12 female) beyond 8 weeks of age (63–117 d) were used for all experiments. All procedures were approved by the Institutional Animal Care and Use Committee at the University of Virginia.
Surgery
Mice were placed under isoflurane anesthesia (5% for induction, 2% for maintenance, in O2, ∼0.5 L/min; VetFlo, Kent Scientific) for implantation of a custom-designed titanium head plate to immobilize the head during recordings. Atropine (0.3 mg/kg in 10% saline) and dexamethasone (2.0 mg/kg in 10% saline) were administered subcutaneously, and the core body temperature was monitored and maintained at 37°C (Frederick Haer) for the duration of surgery. Artificial tears (Henry Schein Medical) were administered to prevent drying and injury to the corneas. The head of the mouse was held in place using a stereotaxic frame equipped with ear bars (Kopf Instruments), after which the scalp was resected to expose a portion of the occipital skull centered over binocular V1. Subsequently, the head plate was adhered to the skull using Metabond (Parkell). Mice were placed on a heating pad to recover from isoflurane until bright, alert, and responsive, after which they were returned to their home cages. Postoperative monitoring continued for 4 d.
Habituation
Following head plate surgery, mice were habituated to the electrophysiological recording rig for at least 4 d before recording. Mice were head fixed to a post that allowed them to run on a cylindrical Styrofoam wheel (6 inch diameter) that could freely rotate around its axis. Each habituation session lasted for 30 min, and mice were limited to 1 session per day. Recording proceeded once mice displayed no signs of distress or agitation while on the running wheel.
Physiologic recording
At least 12 h before recording, a craniotomy was performed while the mouse was placed under isoflurane anesthesia. The craniotomy was positioned above the left visual cortex (∼2.0 mm diameter, ∼3.15 mm lateral and ∼0.5 mm anterior from lambda; Fig. 1B). The craniotomy was covered with agarose (2.5%) and Kwik-Cast silicone epoxy (World Precision Instruments), which was removed immediately before recording.
Mice were head fixed and allowed to run freely on the running wheel for the duration of recording. A silicon multielectrode probe (64M, 64D, or 128AxN models, Masmanidis Lab; Yang et al., 2020) was attached to a data acquisition system (RHD 128-Channel Headstage, Intan Technologies) to record electrophysiological signals during visual stimulation. The probe was centered over the craniotomy, and following penetration into the cortex, the area was covered with agarose (2.5%), into which a reference wire was immersed. The probe was advanced until the tip channels were embedded in white matter. Analog voltage signals were digitized at a 20 kHz sampling rate (RHD Evaluation System, Intan Technologies), and the timing of visual stimulation condition changes was recorded with transistor-to-transistor logic pulses simultaneously with voltage signals to enable later off-line synchronization for analysis.
After the recording session ended, the probe was retracted, the craniotomy was covered with agarose and Kwik-Cast, and animals were returned to their home cages. Recording continued on consecutive days until the recording area showed visible signs of damage, after which animals were killed according to Institutional Animal Care and Use Committee protocols.
Spikes in the voltage time series were sorted into separate units using MountainSort software (Chung et al., 2017). The spike waveform of each detected spike-like event was projected onto a 1D feature space. The spike-sorting algorithm comprises a series of nonparametric statistical tests for unimodality. Noise overlap is the fraction of noise events in a cluster, that is, spike-like events not associated with this or any of the other clusters. Spike clusters were considered single units if they passed criteria for noise overlap (<0.08), indicating that spike waveforms were distinguishable from randomly sampled noise waveforms, and isolation (>0.96), indicating that spike waveforms were distinguishable from those of other spike clusters in feature space (Fig. 1C,D). Single units were then retained for further analysis.
Visual stimulation
Visual stimuli were generated in MATLAB using the Psychophysics Toolbox package (RRID:SCR_002881; Kleiner et al., 2007). Stimuli were shown to mice using a combined projector and polarization modulator system, as described previously (Tanabe et al., 2022). Briefly, the graphics processor (AMD Radeon Pro WX 7100) generated left-eye and right-eye images on the top and bottom half of every video frame, respectively. The projector (Optoma HD27HDR) then displayed the two halves in interleaved video frames at 120 Hz so that frames intended to be viewed by the left and right eyes were shown in an alternating sequence. The left eye and right eye received the images asynchronously, with the left eye preceding by 8.3 ms. These frames were then filtered through a polarization modulator (DepthQ Passive Bundle) and projected onto a polarization-preserving screen (Stewart FilmScreen 150). Animals viewed the stimulus frames through passive polarization filters that were mounted in 3D-printed frames and secured to the head-fixing post to maintain a constant position in front of the eyes of the animal. The passive filters allowed the animal to view left-eye-intended frames only with the left eye, and vice versa for the right eye (Fig. 1A), and each eye received stimulation at 60 Hz. Gamma correction was applied for a linear transformation from grayscale values to luminance (range 6–87 cd/m2), and crosstalk with this system was measured with a photometer to be 1.9%. A custom 3D-printed shield was used to prevent light from the projector from generating photoelectric artifacts in the physiological recording data. The viewing distance from the mouse to the projector screen was 25 cm, consistent with previous mouse studies (Samonds et al., 2019).
The visual stimulus was centered over the estimated RF locations of the neuronal population being recorded. To map the RF, we used either a contrast-reversing bar on a gray background or a flashing bright bar on a dark background. RFs needed to be within 20° of the contralateral visual field from the vertical meridian on the horizontal axis and within the stimulus screen along the vertical axis. If the RF was not located within these boundaries, the probe was retracted and reinserted in a location closer to the retinotopic center of the visual field. After three to four neurons had their RFs mapped with sufficient confidence, the average position of the RF centers was used as the visual stimulus center position.
We used a contrast-reversing checkerboard pattern (reversal rate 0.5 Hz, 10° × 10°) presented to both eyes to estimate the depth of the electrode contacts relative to the cortical layers. The checkerboard patch was centered at the average position of the previously mapped RFs and covered an area of 50° × 50°.
Drifting sinusoidal gratings were used to assess the orientation and disparity tuning of V1 neurons. Gratings were presented in a circular patch (radius 30°) that was centered over the average of the previously mapped RFs. Disparity in binocular gratings was generated by shifting the phase of the sinusoid seen by the right eye and was referred to as phase disparity. The orientation of binocular gratings ranged from 0° (vertical orientation, drifting rightward) to 157.5°, in 22.5° counterclockwise steps. The full 360° cycle of phase disparity was sampled in 45° steps (eight values, from 0° to 315°) for each orientation of the grating. The stimulus set also included monocular gratings for the left and right eyes, at the orientations specified above, and a control condition in which both eyes were shown a gray screen, for a total of 81 conditions. Stimulus conditions were presented in a pseudorandom order, where every condition was presented at least once before any condition was repeated, and recording continued until each stimulus had been repeated at least 10 times. Each trial consisted of a stimulus-on duration of 1 s, followed by an interstimulus interval of 0.5 s. The spatial frequency and temporal frequency of gratings were fixed at 0.04 cycles/degree and 2 Hz, respectively, and gratings were presented with full contrast.
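The size of the stimulus set and the block-wise pseudorandom ordering described above can be checked with a minimal Python sketch. This is not the MATLAB/Psychophysics Toolbox code used in the study; the function names, the tuple layout, and the fixed seed are hypothetical, and the eight phase disparities are assumed to run from 0° to 315° because 360° coincides with 0°.

```python
import random

ORIENTATIONS = [i * 22.5 for i in range(8)]       # 0° to 157.5° in 22.5° steps
PHASE_DISPARITIES = [i * 45.0 for i in range(8)]  # 0° to 315°; 360° is the same as 0°


def build_conditions():
    """Return every stimulus condition as (kind, orientation, phase_disparity)."""
    conditions = [("binocular", ori, phase)
                  for ori in ORIENTATIONS
                  for phase in PHASE_DISPARITIES]
    conditions += [(eye, ori, None)
                   for eye in ("left", "right")
                   for ori in ORIENTATIONS]
    conditions.append(("blank", None, None))      # gray screen to both eyes
    return conditions


def block_randomized_order(conditions, n_repeats, seed=0):
    """Shuffle within blocks so each condition is shown once before any repeat."""
    rng = random.Random(seed)                     # seed chosen only for reproducibility
    order = []
    for _ in range(n_repeats):
        block = list(conditions)
        rng.shuffle(block)
        order.extend(block)
    return order


conditions = build_conditions()
order = block_randomized_order(conditions, n_repeats=10)
print(len(conditions), len(order))                # prints: 81 810
```

With 64 binocular, 16 monocular, and 1 blank condition, the count matches the 81 conditions stated in the text. At 1.5 s per trial, ten repeats of the full set would take about 20 min of stimulation.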
Data analysis
Spike timing was aligned with the onset timing of the stimulus. Tuning functions were then constructed by calculating the firing rate during the response window for each stimulus condition. The response window was delayed by 60 ms relative to stimulus onset.
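The construction of a tuning function from aligned spike times can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the text gives only the 60 ms delay, so the 1 s window length (matching the stimulus-on duration) is an assumption, and the function names and the `(condition, onset)` trial layout are hypothetical.

```python
def firing_rate(spike_times_s, onset_s, delay_s=0.060, window_s=1.0):
    """Spikes per second in a window starting `delay_s` after stimulus onset.

    window_s = 1.0 is an assumption; only the 60 ms delay is stated in the text.
    """
    start = onset_s + delay_s
    stop = start + window_s
    count = sum(1 for t in spike_times_s if start <= t < stop)
    return count / window_s


def tuning_function(spike_times_s, trials):
    """Average the per-trial rates of each condition.

    `trials` is an iterable of (condition, onset_s) pairs.
    """
    per_condition = {}
    for condition, onset_s in trials:
        rate = firing_rate(spike_times_s, onset_s)
        per_condition.setdefault(condition, []).append(rate)
    return {c: sum(v) / len(v) for c, v in per_condition.items()}


spikes = [0.05, 0.10, 0.50, 1.05, 1.20, 2.00]
print(tuning_function(spikes, [("a", 0.0), ("a", 1.5)]))
```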
Tuning functions generated from the responses to monocular grating stimuli were used to calculate the ocular dominance index (ODI) and quantify the degree of interocular mismatch of orientation preference (ΔO) of a cell. We calculated the trial-averaged firing rate for each condition and generated two orientation tuning functions, one for monocular stimulation of the contralateral eye and one for monocular stimulation of the ipsilateral eye.
To calculate ODI, the peak of the tuning function associated with the contralateral eye was estimated using the zeroth- and first-order harmonics of orientation tuning as follows:
For comparisons involving the strength of orientation tuning, a global orientation selectivity index (gOSI) was calculated by taking the ratio between the amplitudes of the zeroth- and first-order harmonics of the orientation tuning (Mazurek et al., 2014) as follows:
The angle of the vector sum
The strength of disparity tuning in response to binocular gratings was quantified using the phase disparity selectivity index (PDSI). To calculate the PDSI, we used the same method as described above to calculate the zeroth- and the first-order harmonics. The array of binocular conditions in our experimental design resulted in multiple phase disparity tuning curves, one per orientation.
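The harmonic-based indices described above can be sketched in Python. This is a hedged illustration, not the authors' code. The selectivity index follows the vector-sum form of Mazurek et al. (2014), |F1|/F0. The peak estimate F0 + 2|F1| (the maximum of a cosine fit), the ODI form (C − I)/(C + I), the wrapping of ΔO to 0–90°, and taking the PDSI as the largest value across orientations are assumptions consistent with the text; all function names are hypothetical.

```python
import cmath
import math


def harmonics(rates, angles_deg, cycle_deg):
    """Return (F0, F1): mean rate and complex first harmonic over one cycle."""
    n = len(rates)
    f0 = sum(rates) / n
    f1 = sum(r * cmath.exp(2j * math.pi * a / cycle_deg)
             for r, a in zip(rates, angles_deg)) / n
    return f0, f1


def selectivity_index(rates, angles_deg, cycle_deg):
    """|F1| / F0: 0 for flat tuning, 1 when the response occurs at one angle only."""
    f0, f1 = harmonics(rates, angles_deg, cycle_deg)
    return abs(f1) / f0 if f0 > 0 else 0.0


def preferred_angle(rates, angles_deg, cycle_deg):
    """Angle of the vector sum, mapped back to [0, cycle_deg)."""
    _, f1 = harmonics(rates, angles_deg, cycle_deg)
    return (math.degrees(cmath.phase(f1)) * cycle_deg / 360.0) % cycle_deg


def harmonic_peak(rates, angles_deg, cycle_deg):
    """Assumed peak estimate: maximum of the cosine fit, F0 + 2|F1|."""
    f0, f1 = harmonics(rates, angles_deg, cycle_deg)
    return f0 + 2.0 * abs(f1)


def odi(contra, ipsi, oris_deg):
    """Assumed conventional form (C - I) / (C + I) on the harmonic peaks."""
    c = harmonic_peak(contra, oris_deg, 180.0)
    i = harmonic_peak(ipsi, oris_deg, 180.0)
    return (c - i) / (c + i)


def delta_o(contra, ipsi, oris_deg):
    """Interocular difference in preferred orientation, wrapped to [0, 90]."""
    d = abs(preferred_angle(contra, oris_deg, 180.0)
            - preferred_angle(ipsi, oris_deg, 180.0))
    return min(d, 180.0 - d)


def pdsi(binocular_rates_by_orientation, phases_deg):
    """Largest phase-disparity selectivity across orientations."""
    return max(selectivity_index(r, phases_deg, 360.0)
               for r in binocular_rates_by_orientation)


oris = [i * 22.5 for i in range(8)]
phases = [i * 45.0 for i in range(8)]
contra = [10 + 8 * math.cos(math.radians(2 * (o - 45.0))) for o in oris]
ipsi = [5 + 4 * math.cos(math.radians(2 * (o - 67.5))) for o in oris]
tuned = [6 + 3 * math.cos(math.radians(p - 180.0)) for p in phases]
flat = [6.0] * 8
print(odi(contra, ipsi, oris), delta_o(contra, ipsi, oris), pdsi([flat, tuned], phases))
```

Orientation is treated as a 180° cycle and phase disparity as a 360° cycle, so the same harmonic routine serves both the gOSI and the PDSI.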
The recorded neuronal population was also classified into putative fast-spiking (FS) inhibitory neurons versus regular-spiking neurons. A well-characterized feature of FS neurons is that their spike waveforms are narrow; specifically, their voltage polarity can switch in <0.2 ms (Bruno and Simons, 2002; Atencio and Schreiner, 2008). To differentiate between these two types of cells, we calculated the slope of the waveform at a time window where fast-spiking cells would have completed a polarity switch already, whereas regular-spiking cells would not have (Niell and Stryker, 2008). For this analysis, we took the spike template from the channel with the largest amplitude and extracted three time samples, centered at 0.5 ms after the negative peak. A linear regression of these three voltage values was used to approximate the slope of the waveform around those time points.
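The slope measurement described above can be illustrated with a short Python sketch. It assumes a template sampled at 20 kHz (the digitization rate given earlier) and expressed in microvolts; the function name is hypothetical, and no classification threshold is applied because none is stated in the text.

```python
def slope_after_trough(template_uv, sample_rate_hz=20000.0, offset_ms=0.5):
    """Least-squares slope (µV/ms) through three samples centered
    `offset_ms` after the negative peak of a spike template."""
    # Index of the negative peak (trough) of the waveform.
    trough = min(range(len(template_uv)), key=template_uv.__getitem__)
    center = trough + round(offset_ms * 1e-3 * sample_rate_hz)
    if center + 1 >= len(template_uv):
        raise ValueError("template too short for the requested offset")
    y_before, _, y_after = template_uv[center - 1:center + 2]
    dt_ms = 1e3 / sample_rate_hz
    # For three equally spaced samples, the regression slope reduces to
    # (last - first) / (2 * dt); the middle sample only affects the intercept.
    return (y_after - y_before) / (2.0 * dt_ms)


# Synthetic template: trough of -100 µV, then a steady 10 µV/sample recovery.
template = [0.0] * 5 + [-100.0] + [-100.0 + 10.0 * k for k in range(1, 21)]
print(slope_after_trough(template))
```

Under the description above, a waveform that has already switched polarity by 0.5 ms would be expected to have a different slope sign at that point than one that is still recovering, which is what separates the two putative cell classes.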
The local field potential (LFP) was the extracellular voltage time series bandpass filtered between 1 and 120 Hz (second-order Butterworth filter). The LFP was trial averaged across all contrast reversals of the checkerboard pattern. Strong negative sinks typically occurred in a limited range of depth. We estimated the deepest point of the sink using spline interpolation and used that as the center of layer 4 of V1. If necessary, the probe depth was then adjusted so that the tip penetrated ∼250–300 µm deeper into the cortex to span all cortical layers. The current-source density was used to verify the existence of a sink (Mitzdorf, 1985). We then subtracted this depth from the position of the electrode contact on the microprobe where the spikes of each neuron were detected. This was the relative depth of each neuron with respect to layer 4.
Statistics
Neurons in this study were determined to be visually responsive if they passed a bootstrapped permutation test (Siegle et al., 2021), where based on their firing patterns, we were able to reject the null hypothesis that the firing rate during visual stimulation was indistinguishable from the firing rate in response to a blank stimulus condition. Deviation from the null hypothesis of the original observed data was quantified using the chi-square value as follows:
In this equation, ri and r0 represent the trial-average firing rates of the ith stimulus condition and the blank condition, respectively. Random permutation was used to estimate the null distribution of the chi-square value. The trial-to-trial firing rate, including the firing rate of the blank condition, was randomly permuted so that any association between the firing rate and any given stimulus condition was lost. These permuted data were then used to recalculate a null chi-square value, and this process was repeated 1000 times to generate a null chi-square value distribution. The 95th percentile of this null distribution was used as the cutoff for visual responsiveness; that is to say, only neurons that had an original chi-square value greater than the 95th percentile of the randomly permuted distribution were considered significantly responsive to visual stimulation.
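The permutation procedure can be sketched in Python as follows. The exact form of the chi-square statistic is not reproduced here, so the sketch assumes a Pearson-style form, the sum over conditions of (ri − r0)² divided by r0; that form, the function names, and the dictionary layout of the trial data are all assumptions made for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def chi_square(trials, blank):
    """Assumed statistic: sum_i (r_i - r_0)^2 / r_0 over stimulus conditions."""
    r0 = mean(trials[blank])
    total = sum((mean(rates) - r0) ** 2
                for cond, rates in trials.items() if cond != blank)
    return total / max(r0, 1e-9)          # guard against a silent blank condition


def is_visually_responsive(trials, blank, n_perm=1000, alpha=0.05, seed=0):
    """Compare the observed statistic with the (1 - alpha) percentile of a
    null distribution built by shuffling trial labels."""
    rng = random.Random(seed)             # seed chosen only for reproducibility
    observed = chi_square(trials, blank)
    conditions = list(trials)
    pooled = [rate for cond in conditions for rate in trials[cond]]
    null = []
    for _ in range(n_perm):
        rng.shuffle(pooled)               # break the rate-to-condition association
        shuffled, start = {}, 0
        for cond in conditions:
            n = len(trials[cond])
            shuffled[cond] = pooled[start:start + n]
            start += n
        null.append(chi_square(shuffled, blank))
    null.sort()
    cutoff = null[min(n_perm - 1, round((1 - alpha) * n_perm))]
    return observed > cutoff


responsive = {"blank": [1.0] * 10, "a": [20.0] * 10, "b": [1.0] * 10}
silent = {"blank": [5.0] * 10, "a": [5.0] * 10, "b": [5.0] * 10}
print(is_visually_responsive(responsive, "blank"),
      is_visually_responsive(silent, "blank"))
```

Shuffling the pooled per-trial rates while keeping the number of trials per condition fixed mirrors the description that the blank condition is included in the permutation.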
The MATLAB Statistics and Machine Learning Toolbox was used for all statistical tests. Spearman's partial rank correlation coefficient was used to quantify the correlation between response indices. Hartigan's dip test was used to verify unimodality for distributions (Hartigan and Hartigan, 1985). The Mann–Whitney U test was used to compare data from different groups of cells. Further details on the number of cells and animals, as well as statistical tests used for specific comparisons, are provided (see below, Results). We did not use statistical methods to predetermine sample sizes, and animals were not assigned to control or experimental groups because such considerations were inapplicable to the design of this characterization study.
Results
Systematic measurement of binocular response properties
To systematically measure the relationship between monocular and binocular responses of individual V1 neurons, we used a dichoptic display system to present drifting sinusoidal gratings of different orientations, either binocularly with specific phase disparity or monocularly to either the ipsilateral or contralateral eye (Fig. 1A). Neurons were recorded from all cortical layers of the binocular zone of V1, using a 64-channel silicon microprobe spanning the entire cortical depth (Fig. 1B,C,D). Many neurons displayed response properties that were not expected from a simple convergence model. For example, the neuron in Figure 1E showed a complete dominance of the contralateral eye over the ipsilateral eye (Fig. 1E, orange vs blue). A neuron with such a strong ocular dominance would be traditionally classified as monocular, and one would expect it to be insensitive to any stimulus to the ipsilateral eye. In striking contrast to this expectation, this neuron was in fact tuned to the phase disparity of the binocular gratings. At the preferred orientation (90°), this neuron was most responsive when the gratings in the two eyes were opposite in phase (i.e., a phase disparity of 180°), and the response was reduced at other disparities (Fig. 1E, black curve). The tuning to phase disparity is unequivocal evidence that this neuron, despite its strong ocular dominance, integrates signals originating in the two eyes. In other words, binocular integration in mouse V1 does not necessarily produce a simple combination of monocular signals, thus highlighting the need for a careful characterization using well-designed stimulus sets.
Methods and example tuning curves. A, Schematic diagram of dichoptic stimulus system. Bottom left, Stimulus frames for the left and right eyes were interleaved (120 frames/s) and the alternating frames had different polarities after passing a polarizing modulator. Bottom right, The animal viewed the projected images through a pair of passive polarization filters, and each eye only viewed alternating frames that had a matching polarization. B, Intrinsic signal imaging was used to identify the V1 binocular zone. The azimuthal visual field through the ipsilateral eye is represented in color (central/peripheral, blue/red). C, Spike sorting into single units. Spike waveforms that were considered for further analysis as single units had to pass two criteria, noise overlap <0.08 and isolation cutoff >0.96. D, Autocorrelograms of all single-unit spikes. Each line is an isolated single unit, and spike counts within each bin were divided by the total number of spikes of that unit. A clear dip around short delay times (τ < 2 ms) is present. E, Example tuning curves from a phase disparity selective neuron. Binocular stimuli were shown at eight orientations and eight phase disparities, allowing for the generation of eight disparity tuning curves (black), one per orientation. The responses to monocular stimulation are shown in orange (contralateral) and blue (ipsilateral).
Across the population of recorded neurons in V1 (n = 594, 13 mice), 47.5% (n = 282) showed significant responses to the stimulus set (see above, Materials and Methods for details of classifying responsiveness). There was a wide variety of ocular dominance, which we were able to quantify in neurons that responded significantly to monocular stimulation through the contralateral and/or ipsilateral eye (n = 244). Some neurons were comparably driven by stimulation through either eye (Fig. 2A), whereas others were dominated by one eye, the contralateral eye in most cases (Fig. 2B). The distribution of the ODI spanned the entire range from −1 to 1, with a small bias toward the contralateral eye (median, 0.12; Fig. 2C). For the subset of neurons that were significantly responsive to monocular stimulation through both eyes (n = 136), we estimated the preferred orientation associated with each eye and compared them interocularly. Many neurons had similar orientation preferences between the two eyes (Fig. 2D), whereas a minority had orientation preferences that were quite different (Fig. 2E). We calculated the difference in preferred orientation between the two eyes (ΔO) for each neuron. The population distribution of ΔO values had a peak near 0° with a skewed tail that tapered off at higher ΔO values (median = 27.4°; Fig. 2F), largely consistent with previous results in anesthetized mice (Wang et al., 2010, 2013; Sarnaik et al., 2014; Gu and Cang, 2016; Levine et al., 2017). Also consistent with previous findings (Wang et al., 2010; Levine et al., 2017), the degree of interocular matching depended on the strength of orientation tuning (ρ = −0.435, p < 0.001, between ΔO and the gOSI of the nondominant eye; Fig. 2J) but not on the response amplitude (ρ = −0.148, p = 0.092, between ΔO and the peak amplitude of the nondominant eye; Fig. 2K), and the strength of orientation tuning was significantly matched interocularly (ρ = 0.242, p < 0.01; Fig. 2L).
Quantification of V1 binocular response properties. A, Example monocular orientation tuning curves. This neuron was balanced in its response to contralateral and ipsilateral stimulation (orange and blue, respectively; ODI −0.13). B, Orientation tuning curves from a neuron that is dominated by contralateral stimulation (ODI 0.82). C, Histogram of ODI across the population (n = 244). The population had a small bias toward the dominance of the contralateral eye with a median ODI of 0.12 (red vertical line). D, Example orientation tuning curves represented in polar coordinates. The angle and radial distance represent the stimulus orientation and response magnitude, respectively. The tuning curve was duplicated to complete the full range of angles. This neuron had matched orientation preference through the two eyes (ΔO = 4.5°). E, Example orientation tuning curves from a neuron mismatched in orientation preference through the two eyes (ΔO = 52.7°). F, Histogram of ΔO values across the population (n = 136), with a median of 27.4° (red vertical line). G, Example phase disparity tuning curve. This neuron was tuned to the phase disparity of the binocular grating (PDSI = 0.32). H, Example phase disparity tuning curve from a nonselective neuron (PDSI = 0.03). I, Histogram of PDSI values across the population (n = 246), with a median of 0.31 (red vertical line). J, ΔO was correlated with gOSI through the nondominant eye (Spearman correlation, ρ(129) = −0.435, p < 0.001). K, The magnitude of ΔO was not correlated with the peak spike rate in response to the nondominant eye (Spearman correlation, ρ(129) = −0.148, p = 0.09). L, Correlation of gOSI through the two eyes (Spearman correlation, ρ(129) = 0.242, p = 0.005).
The characterization of ocular dominance and interocular matching of orientation preference is limited to responses to monocular stimulation. As illustrated in Figure 1, we also examined responses of these neurons to binocular stimulation of various disparities in the same recordings. Many neurons were tuned to the phase disparity of a binocular grating (Fig. 2G). Others responded strongly to binocular gratings, without obvious tuning to the phase disparity (Fig. 2H). We measured the strength of disparity tuning by calculating a PDSI for each orientation of the stimulus. We then chose the PDSI associated with the strongest modulation by phase disparity to represent the strength of disparity tuning. Most visually responsive neurons displayed significant phase disparity selectivity (median = 0.31; Fig. 2I; n = 246; with n = 148, i.e., 60.2%, >0.25, a level of high selectivity).
Ocular dominance, interocular matching, and disparity selectivity are independent measures of binocular integration
Binocular integration is often assumed to depend primarily on the convergence of feedforward monocular inputs representing the two eyes, which are the same projections that create orientation tuning. This assumption would lead to certain expectations of the relationship among ocular dominance, interocular matching, and disparity selectivity of individual V1 neurons. For example, neurons that receive balanced innervation from the two eyes may be more likely to have interocularly matched orientation preference and show strong disparity tuning. We found no evidence for such relationships. Neurons with balanced monocular responses (i.e., ODI near zero) could have very different values of ΔO and PDSI, ranging from mismatched to matched and from nonselective to highly selective (Fig. 3A,B). On the other hand, neurons with strong ocular dominance (i.e., ODI near 1 or −1) could still be highly selective for phase disparity (Fig. 3C,D). Across the population, there is no correlation between ODI and ΔO (ρ = −0.010; p = 0.91; Fig. 3E) or between ODI and PDSI (ρ = −0.110; p = 0.22; Fig. 3F). Furthermore, there is also no correlation between ΔO and PDSI (ρ = −0.138; p = 0.12; Fig. 3G). These three measures of binocular responses are therefore independent from each other, and they likely encode orthogonal features of binocular stimuli. These data suggest that a simple convergence of monocular inputs is unlikely to be the deciding mechanism of binocular integration.
No correlation among ocular dominance, interocular matching, and binocular disparity selectivity. A–D, Four example neurons showing different levels of ocular dominance, interocular matching, and phase disparity selectivity. Monocular orientation tuning curves of each neuron are shown to the left in polar plots (contralateral in orange and ipsilateral in blue) and disparity tuning curve to the right (green curve). Bottom, The associated ODI, ΔO (if responsive to both monocular stimulations, not applicable (N/a) otherwise), and PDSI. No systematic relationship is seen among the three measures. E, No significant correlation was found between ΔO and ODI (Spearman partial rank correlation, ρ(129) = −0.010, p = 0.91). Example cells from A and B are labeled in magenta and orange, respectively. F, No significant correlation was found between PDSI and ODI (Spearman partial rank correlation, ρ(200) = −0.110, p = 0.218). Example cells from A, B, C and D are labeled in magenta, orange, blue, and green, respectively. G, No significant correlation was found between PDSI and ΔO (Spearman partial rank correlation, ρ(125) = −0.138, p = 0.123). Example cells from A and B are labeled in magenta and orange, respectively.
Additionally, few cells in our dataset had monocularly dominated ODI values (e.g., |ODI| >0.80, n = 16/244, i.e., 6.6%), and a large portion of those that did showed high disparity selectivity (e.g., PDSI >0.25 and |ODI| >0.8, n = 6/16, i.e., 37.5%; Fig. 3F). In other words, neurons in the mouse V1 binocular zone do not form distinct contralateral, ipsilateral, or binocular groups as suggested by some recent studies. Instead, binocular integration is widespread and almost all neurons are binocular.
Disparity tuning and orientation selectivity in joint parameter space
Orientation tuning and phase disparity tuning are often measured separately. This would be fine if separate populations of dLGN afferents truly converged to produce the two tuning properties. Orientation tuning would reflect the property of the dLGN afferents converging from the compartment of the contralateral eye, whereas phase disparity tuning would reflect the property of the dLGN afferents converging across the contralateral and ipsilateral compartments. Given that our data suggest that convergence is unlikely to be the only mechanism producing phase disparity tuning, we explored the tuning in the joint parameter space of orientation and phase disparity. Specifically, each combination of orientation and phase disparity of a binocular grating has a vector, or a position, on a surface. A schematic illustrating these two variables is shown in Figure 4A. Binocular gratings with a phase disparity of 90° and orientation of 0° (left, blue outline), once rotated 180° (right, magenta outline), become identical to gratings with a phase disparity of 270° and orientation of 0° (left, magenta outline). Similarly, gratings with a phase disparity of 270° and orientation of 0° (left, magenta outline), once rotated 180° (right, blue outline), become identical to those with a phase disparity of 90° and orientation of 0° (left, blue outline). Because of the inverted nature by which the two axes wrap around into a circle, the phase disparity and orientation conditions that define the points on this map can also be understood as tiling the surface of a Klein bottle (Tanaka, 1997; Tanabe et al., 2022).
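The wrap-around rule described above, in which a 180° rotation inverts the phase disparity axis, can be written as a small Python sketch. The function name is hypothetical, and the sketch encodes only the identification stated in the text: a grating at orientation θ + 180° with phase disparity φ is treated as the same stimulus as orientation θ with phase disparity 360° − φ.

```python
def canonical_condition(orientation_deg, phase_deg):
    """Map any (orientation, phase disparity) pair to orientation in [0, 180)
    and phase disparity in [0, 360), following the Klein-bottle identification."""
    orientation = orientation_deg % 360.0
    phase = phase_deg % 360.0
    if orientation >= 180.0:
        # Rotating by 180° flips the sign of the interocular phase shift.
        orientation -= 180.0
        phase = (-phase) % 360.0
    return orientation, phase


# The example from the text: 90° disparity at 0°, rotated by 180°,
# coincides with 270° disparity at 0°.
print(canonical_condition(180.0, 90.0))   # prints: (0.0, 270.0)
# Stepping one 22.5° column past 157.5° re-enters the map at 0° with the phase axis inverted.
print(canonical_condition(157.5 + 22.5, 45.0))
```

This is why a hotspot near the orientation edge of the heat map reappears on the opposite edge at the mirrored phase disparity rather than at the same one.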
Analysis of preferred phase disparity and orientation selectivity. A, Schematic illustrating the systematic variation of stimulus orientation (x-axis) and phase disparity (y-axis). The two magenta outlined squares and the two blue outlined squares mark the identical stimulus condition (see above, Disparity tuning and orientation selectivity in joint parameter space). B, The response of each neuron to a particular stimulus orientation and phase disparity is represented in a position on a 2D heat map. Two example neurons are shown with localized hot spots on the heat map. C, Distribution of the preferred phase disparity separately for each orientation. Each dot represents a column of the heat map of a given neuron with its degree of disparity selectivity (PDSI) shown in grayscale. The preferred phase disparity tended to cluster ∼0°, particularly for tuning curves that had higher PDSI values (darker data points, where the estimation of preferred disparity is more accurate with stronger phase disparity tuning). D, Distribution of preferred phases represented in a polar histogram, including all data points in C. E, Distribution of preferred phases, where for each neuron only the orientation that induced the highest PDSI was included. A strong bias toward phase disparity 0° is seen (and also 180°), with gaps near orthogonal phase disparities 90° and 270°. F, Example heat map from a neuron with a vertically elongated hotspot. G, Relationship between orientation selectivity and disparity selectivity. For each neuron, we identified the peak in the heat map and calculated the gOSI and PDSI from the two cross-sections that run through the peak.
We illustrated the firing rate of each neuron for each parameter combination in a heat map. Many neurons had heat maps with a clearly defined hotspot (Fig. 4B). Because of the way in which the orientation and phase disparity axes wrap around into a circle, hotspots located near the edges appear split across two areas of the heat map (Fig. 4B, left, top left quadrant and bottom right quadrant). Given this complex topological structure, a single vertical slice of the heat map may not serve as a representative of phase disparity tuning. We thus calculated the preferred phase disparity for each vertical slice of the heat map and plotted it and the associated PDSI (grayscale) for the corresponding orientation (Fig. 4C; n = 246). Interestingly and unexpectedly, the PDSI was similarly high in many orientations, including both 0° (vertical) and 90° (horizontal; Wilcoxon's signed rank, p = 0.5). In other words, in mouse V1, phase disparity tuning is not limited to horizontal disparity of vertical gratings, as one would expect from stereoscopic depth perception.
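As a rough illustration of the per-slice analysis, the sketch below computes a preferred phase disparity and a selectivity index for one vertical slice of the heat map. It assumes that the PDSI is a normalized vector sum (magnitude of the response-weighted sum of unit vectors divided by the summed response), which may differ in detail from the definition given in Materials and Methods; the function name is hypothetical.

```python
import cmath
import math

def disparity_vector_sum(responses):
    """Return (preferred_disparity_deg, pdsi) for one heat-map column.

    responses: firing rates at equally spaced phase disparities that
    start at 0 deg and cover one full cycle (assumed sampling).
    """
    n = len(responses)
    total = sum(responses)
    if total <= 0:
        return None, 0.0          # no response, preference undefined
    # Response-weighted sum of unit vectors on the disparity circle.
    vec = sum(r * cmath.exp(2j * math.pi * k / n)
              for k, r in enumerate(responses))
    preferred = math.degrees(cmath.phase(vec)) % 360.0
    pdsi = abs(vec) / total       # assumed normalization, range 0..1
    return preferred, pdsi
```

Applying this function to every column of a neuron's heat map gives one (preferred disparity, PDSI) pair per orientation, which is the kind of data summarized in Figure 4C.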
Furthermore, in several of the orientations (e.g., 90°), there were biases toward phase disparity of 0° and 180°. The bias was partially visible in the pooled distribution of the preferred phase disparity across all stimulus orientations (Fig. 4D; circular mean 338.6°). We examined this bias further by selecting the preferred disparity associated with the highest PDSI for each neuron. The bias in the preferred phase disparity became much clearer (Fig. 4E), showing a bimodal distribution with peaks at 0° and 180° and dips at 90° and 270° (Hartigan's dip test, p < 0.001). Mouse V1 neurons were therefore most strongly driven by a binocular grating that had the same phase in both eyes or the opposite phase across the eyes.
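The two summary steps in this paragraph, pooling preferred disparities with a circular mean and keeping only the most selective orientation for each neuron, can be sketched as follows. The function names and the input layout (a list of per-orientation pairs) are assumptions made for illustration.

```python
import cmath
import math

def circular_mean_deg(angles_deg):
    """Circular mean of angles given in degrees, returned in [0, 360)."""
    vec = sum(cmath.exp(1j * math.radians(a)) for a in angles_deg)
    return math.degrees(cmath.phase(vec)) % 360.0

def most_selective_column(columns):
    """Pick the (preferred_disparity_deg, pdsi) pair with the highest PDSI.

    columns: one pair per stimulus orientation for a single neuron.
    """
    return max(columns, key=lambda pair: pair[1])
```

A plain arithmetic mean would be misleading here because preferred disparities near 350° and 10° should average to about 0°, not 180°.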
In a small subset of the neurons, the hotspot was elongated vertically, indicating weak phase disparity tuning but strong orientation tuning (Fig. 4F). In contrast, few neurons had a hotspot stretching across the orientation axis. We therefore analyzed the relationship of the strength of tuning along the phase disparity axis and the orientation axis. To do this, we generated two cross sections that ran through the heat map peak, one cross section along the orientation axis and the other along phase disparity axis. We calculated the gOSI and the PDSI of the respective cross sections. In the scatterplot comparing these values (Fig. 4G), there was a noticeable dearth of points in the top left corner; that is, few neurons were highly phase disparity selective but not orientation selective. This result could reflect a hierarchical processing mechanism in which orientation tuning is generated before the phase disparity tuning.
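The cross-section analysis can be sketched as below. The sketch assumes that the heat map is stored as rows of phase disparity by columns of orientation, that the orientation axis spans 180° and the disparity axis spans 360°, and that both gOSI and PDSI are normalized vector sums over one cycle of their respective axes. All names are hypothetical.

```python
import cmath
import math

def one_cycle_selectivity(responses):
    """Normalized vector-sum selectivity for samples spanning one cycle."""
    n = len(responses)
    total = sum(responses)
    if total <= 0:
        return 0.0
    vec = sum(r * cmath.exp(2j * math.pi * k / n)
              for k, r in enumerate(responses))
    return abs(vec) / total

def peak_cross_sections(heat):
    """Return (gosi, pdsi) from the two slices through the heat-map peak.

    heat[d][o]: firing rate at disparity index d and orientation index o.
    """
    n_disp, n_ori = len(heat), len(heat[0])
    peak_d, peak_o = max(
        ((d, o) for d in range(n_disp) for o in range(n_ori)),
        key=lambda idx: heat[idx[0]][idx[1]],
    )
    orientation_slice = heat[peak_d]                  # along orientation
    disparity_slice = [row[peak_o] for row in heat]   # along disparity
    return (one_cycle_selectivity(orientation_slice),
            one_cycle_selectivity(disparity_slice))
```

A vertically elongated hotspot, as in Figure 4F, yields a high gOSI and a low PDSI under this scheme.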
Predicting binocular tuning from responses to monocular stimuli
We next examined how binocular tuning might derive from upstream monocular processes. We noticed that in many neurons, the strongest phase disparity tuning was observed when the binocular grating was given at the orientation that elicited the strongest monocular response (Fig. 5A). The shape of the phase disparity tuning had the characteristics of a sinusoid, wherein the baseline and amplitude both reach their peaks at the preferred orientation. To quantify this relationship, we decomposed the phase disparity tuning curve into its zeroth order (F0, the mean) and first-order Fourier components (F1, the vector sum). Both F0 and F1 peaked at the same orientation for this example neuron (Fig. 5B; black solid and black dashed lines, respectively), and both curves closely resembled the orientation tuning obtained with contralateral eye stimulation. For this neuron, the orientation tuning of the contralateral eye stimulation accurately predicted the orientation tuning of both F0 and F1.
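A minimal sketch of this decomposition is given below. It assumes that the tuning curve is sampled at equally spaced phase disparities over one cycle and that F1 is normalized as the magnitude of the mean response vector, so that a curve of the form b + a·cos(φ − φ0) gives F0 = b and F1 = a/2. The normalization and the function name are assumptions, not necessarily those used in the analysis.

```python
import cmath
import math

def fourier_f0_f1(tuning):
    """Return (F0, F1) of a phase disparity tuning curve.

    F0 is the mean response; F1 is the magnitude of the first Fourier
    component (mean response vector, assumed normalization).
    """
    n = len(tuning)
    f0 = sum(tuning) / n
    f1 = abs(sum(r * cmath.exp(-2j * math.pi * k / n)
                 for k, r in enumerate(tuning))) / n
    return f0, f1
```

Repeating this for the tuning curve measured at each orientation gives F0 and F1 as functions of orientation, which can then be compared with the monocular orientation tuning curves.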
Relationship between monocular and binocular responses. A, Tuning curves of an example neuron. This neuron showed the strongest disparity tuning at the orientation it preferred monocularly (112.5°). B, Orientation tuning of the F0 (black solid) and the F1 (black dashed) components of the phase disparity tuning curves from A. The monocular orientation tuning curves are superimposed (blue for ipsilateral and orange for contralateral). F0 and F1 as a function of orientation have indistinguishable shapes, and the monocular responses are good predictors of their magnitude. C, Tuning curves from an example binocular neuron whose disparity tuning was dissociated from the monocular orientation tuning. Strong disparity tuning was obtained with an orientation (157.5°) that was perpendicular to its preferred orientation. D, Same plot as B, but for the neuron in C. F0 and F1 as a function of orientation are dissociated from each other, and the monocular responses are good predictors for the magnitude of F0 but not for F1. E, Scatterplot of F0 and the sum of the responses to monocular stimulation. Each dot corresponds to one point of the orientation tuning curve in D. The dots fell along the diagonal, showing a high correlation. F, Scatterplot of F1 and the sum of the monocular responses. Most dots fell below the diagonal and filled the gap between the diagonal and the horizontal axis. F1 was dissociated from the monocular responses. G, Scatterplot of F1 and F0, with most dots filling the entire space between the diagonal and the horizontal axis. H, Simulation of the spike threshold effect on the relationship between F0 and F1. Sinusoids with all possible combinations of F0 and F1 were generated (i.e., no correlation between F0 and F1) and then passed through a threshold nonlinearity. The output was then decomposed into F0 and F1, whose amplitudes are shown in this plot. The top left corner of the plot was empty, which created a moderate level of spurious correlation when in fact the subthreshold sinusoids have zero correlation.
In other neurons, the monocular responses were good predictors of F0, but poor predictors of F1 (Fig. 5C). Orientation tuning of F0 had a clear peak at 67.5° for this neuron (Fig. 5D), resembling the peak of the contralateral eye stimulation. In contrast, the orientation tuning of F1 had two peaks, including one that hardly elicited responses when the grating was monocular (orientation 157.5°; Fig. 5D). This shape was dissimilar to the orientation tuning of either contralateral or ipsilateral stimulations.
To quantify the similarities of the four orientation tuning curves, we first summed the two monocular orientation tuning curves (contralateral Lmonoc and ipsilateral Rmonoc) and then compared the sum with the binocular response (F0 and F1). If the binocular combination is as simple as a linear summation, as postulated in the disparity energy model, we would expect the summed orientation tuning curve to be a good predictor of the orientation tuning curves of both F0 and F1. This was indeed true for the F0 component, where F0 plotted against Lmonoc + Rmonoc fell along the identity line (ρ = 0.776; p < 0.001; Fig. 5E). However, the F1 component did not follow this pattern. F1 plotted against Lmonoc + Rmonoc not only fell below the identity line but also scattered across a wide range between the horizontal axis and the identity line (ρ = 0.650; p < 0.001; Fig. 5F). The large scatter was partially attributable to the discrepancy in orientation tuning between F0 and F1 (Fig. 5G). Despite the high value of the rank correlation between F1 and F0 (ρ = 0.742; p < 0.001), the actual relationship was in fact poor. This is because rectification by the firing threshold prevents F1 from exceeding F0 (Fig. 5H), which keeps points from falling above the identity line. Therefore, our data suggest that the orientation tuning of F1 is dissociated from the monocular orientation tuning. This finding indicates that the F0 of phase disparity tuning is largely relayed from monocular processes, whereas the F1 of phase disparity tuning is likely generated by a separate mechanism in downstream circuits where binocular combination has already occurred.
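The threshold argument illustrated in Figure 5H can be reproduced with a small simulation. The sketch below is not the simulation used for the figure: it assumes a simple half-wave rectification at zero, a regular grid of subthreshold baselines and amplitudes, and an F1 normalized as the magnitude of the mean response vector. All names and parameter ranges are hypothetical.

```python
import cmath
import math

def f0_f1(curve):
    """Mean (F0) and first Fourier magnitude (F1) of a sampled curve."""
    n = len(curve)
    f0 = sum(curve) / n
    f1 = abs(sum(r * cmath.exp(-2j * math.pi * k / n)
                 for k, r in enumerate(curve))) / n
    return f0, f1

def simulate_threshold(n_phase=16, steps=9):
    """Pair each subthreshold (F0, F1) with its rectified (F0, F1)."""
    results = []
    for i in range(steps):
        baseline = -2.0 + 4.0 * i / (steps - 1)      # subthreshold mean
        for j in range(steps):
            amplitude = 2.0 * j / (steps - 1)        # sinusoid amplitude
            sub = [baseline + amplitude * math.cos(2 * math.pi * k / n_phase)
                   for k in range(n_phase)]
            out = [max(0.0, v) for v in sub]         # firing threshold at 0
            results.append((f0_f1(sub), f0_f1(out)))
    return results
```

Because the rectified output is nonnegative, its F1 can never exceed its F0, even though the subthreshold baselines and amplitudes were chosen independently. This empties the region above the identity line and can inflate a rank correlation between F0 and F1.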
Specificity among cell types and cortical layers
We next classified the recorded neurons into fast-spiking and regular-spiking neurons and analyzed their binocular response. The fast-spiking neurons are known to overlap with a specific subtype of inhibitory interneuron that expresses the molecular marker parvalbumin (PV; Kawaguchi and Kondo, 2002). When the waveforms of all visually responsive neurons in the dataset were superimposed, we observed two distinct patterns of waveforms (Fig. 6A). To separate these two categories quantitatively, we calculated the slope of the spike at 0.5 ms after spike detection. A histogram of the slope values showed a bimodal distribution with a clear split at a slope of zero (Fig. 6B). We classified neurons with a negative slope as fast spiking (n = 67), and neurons with a positive slope as regular spiking (n = 179). The fast-spiking population had weaker orientation tuning than the regular-spiking counterpart (z = −3.610; p < 0.001; Fig. 6C). The fast-spiking population also had weaker phase disparity tuning than the regular-spiking counterpart (z = −3.829; p < 0.001; Fig. 6D). These results are independent confirmations of previously published results on their orientation selectivity and disparity selectivity (Niell and Stryker, 2008; Liu et al., 2009; Scholl et al., 2015). We next applied the classification to our measurement of binocular response properties. The low selectivity metrics of fast-spiking neurons could be obscuring any pattern that was otherwise present in the remaining regular-spiking population. This was not the case. Even after splitting the population into the two classes, we found no correlation between ODI and ΔO (Fig. 6E,F), between PDSI and ODI (Fig. 6G,H), or between PDSI and ΔO (Fig. 6I,J).
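The waveform criterion can be sketched as follows. The sketch assumes that the mean waveform is sampled at a known rate with index 0 at the time of spike detection, and it estimates the slope with a central difference around the 0.5 ms sample; the actual estimation window (gray shading in Fig. 6A) may differ. Function names and the sampling rate in the example are hypothetical.

```python
def slope_at(waveform, sample_rate_hz, t_ms=0.5):
    """Central-difference slope (amplitude per ms) at t_ms after detection."""
    i = round(t_ms * 1e-3 * sample_rate_hz)   # sample index nearest t_ms
    dt_ms = 1e3 / sample_rate_hz              # sample spacing in ms
    return (waveform[i + 1] - waveform[i - 1]) / (2.0 * dt_ms)

def classify_waveform(waveform, sample_rate_hz):
    """Negative slope -> fast-spiking; otherwise regular-spiking.

    A slope of exactly zero is assigned to regular-spiking here, which
    is an arbitrary choice for this sketch.
    """
    if slope_at(waveform, sample_rate_hz) < 0:
        return "fast-spiking"
    return "regular-spiking"
```

With a hypothetical 10 kHz sampling rate, a waveform that has already passed its repolarization peak and is decaying at 0.5 ms is labeled fast-spiking, whereas one that is still rising toward baseline is labeled regular-spiking.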
Binocular response properties of fast-spiking versus regular-spiking neurons. A, Aggregated spike waveforms display distinct populations of fast-spiking and regular-spiking neurons. Area shaded in gray indicates the time window where the slope of the waveform was estimated for quantification. B, Bimodal distribution of estimated waveform slopes. Positive values belong to regular-spiking neurons, whereas negative values belong to fast-spiking neurons, or putative PV inhibitory interneurons. C, Population of fast-spiking neurons displayed broader orientation tuning than regular-spiking neurons (Wilcoxon rank-sum, z = −3.61, p = 1.29e−4). D, Population of fast-spiking neurons displayed broader phase disparity tuning than regular-spiking neurons (Wilcoxon rank-sum, z = −3.81, p = 1.36e−4). E, No significant correlation was found between ΔO and ODI within fast-spiking neurons (Spearman correlation, ρ(42) = 0.001, p = 0.994). F, No significant correlation was found between PDSI and ODI within regular-spiking neurons (Spearman correlation, ρ(85) = −0.072, p = 0.504). G, No significant correlation was found between PDSI and ΔO within fast-spiking neurons (Spearman correlation, ρ(58) = −0.159, p = 0.225). H, No significant correlation was found between ΔO and ODI within regular-spiking neurons (Spearman correlation, ρ(140) = −0.042, p = 0.624). I, No significant correlation was found between PDSI and ODI within fast-spiking neurons (Spearman correlation, ρ(41) = 0.213, p = 0.170). J, No significant correlation was found between PDSI and ΔO within regular-spiking neurons (Spearman correlation, ρ(82) = 0.138, p = 0.210).
Next, we examined binocular responses across cortical depth. We measured the LFP in response to a contrast-alternating checkerboard pattern and used it to determine the center of layer 4 (Fig. 7A,B; see above, Materials and Methods). Both the LFP and the current-source density analysis revealed a sink within a specific depth window after a short delay (71.4 ms; 500 µm from the tip of the probe). We found no evidence of layer specificity in any of the binocular response metrics, ODI, ΔO, or PDSI (Figs. 7C–E).
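For readers unfamiliar with current source density analysis, a common estimate is the negative second spatial derivative of the LFP across equally spaced channels. The sketch below uses that standard approximation with an arbitrary conductivity of 1; it is not necessarily the procedure described in Materials and Methods, and the function names are hypothetical.

```python
def csd_profile(lfp, spacing, conductivity=1.0):
    """Second-difference CSD estimate at the interior channels.

    lfp: LFP values at equally spaced depths for one time point.
    spacing: distance between adjacent channels (any consistent unit).
    Sinks are negative under this sign convention.
    """
    return [
        -conductivity * (lfp[i - 1] - 2.0 * lfp[i] + lfp[i + 1]) / spacing ** 2
        for i in range(1, len(lfp) - 1)
    ]

def sink_channel(lfp, spacing):
    """Index (into lfp) of the channel with the strongest sink."""
    csd = csd_profile(lfp, spacing)
    return 1 + min(range(len(csd)), key=lambda i: csd[i])
```

In this convention, a local trough in the LFP depth profile produces a negative CSD value at the same channel, which matches the co-located negativity in the LFP and CSD maps of Figure 7A,B.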
Binocular response properties are not specific to discrete cortical layers. A, Example LFP as a function of time and depth in the cortex. Time t = 0 s was when the stimulus switched polarity, and depth z = 0 was the position of the channel closest to the tip of the microelectrode probe. Dark blue represents negativity. B, Current source density (CSD) calculated from A. The region with strongest negativity in the LFP was also where the strongest negativity appeared in the CSD. C, No layer specificity was found with ocular dominance. D, No depth specificity was found with ΔO. E, No depth specificity was found with the width of phase disparity tuning.
Finally, we asked whether stimulus design could have an impact on our observations of binocular integration. Thus far, we had interleaved monocular and binocular stimulations to rule out any potential complications arising from neuronal adaptation to the statistics of sensory stimulation. If the stimuli were to be presented in a series of blocks, the statistics of the stimulus would have varied from one block to the next. It is conceivable that neurons change their response property as an adaptation to the statistics of the sensory stimuli, which could make direct comparisons between monocular and binocular responses difficult to interpret. To test whether such an adaptation might influence our measurements, we acquired an additional dataset where we presented all the monocular stimulation in the first block to measure ocular dominance and interocular matching, followed by all the binocular stimulation in the second block to measure phase disparity selectivity (two-block design; n = 22 mice, including 7 used in the one-block design). We observed no major differences between the population distributions of ODI, ΔO, or PDSI when using one stimulus recording block versus two blocks, nor any correlation between the response indices (Fig. 8).
Consistency of measurements with block-structured stimulation. A, Comparison of ODI distribution obtained with the interleaved stimulation versus the block-structured stimulation. Left, ODI obtained with interleaved stimulation, as in Figure 1D. Median ODI (0.12) is indicated in red. Right, ODI obtained with block-structured stimulation. The monocular stimulation block preceded the binocular stimulation block. Median ODI (0.19) is indicated in red. B, No significant correlation was found between ΔO and ODI with the block-structured stimulation (Spearman correlation, ρ(155) = 0.048, p = 0.552). C, Left, ΔO obtained with interleaved stimulation, as in Figure 1D. Median ΔO (27.4°) is indicated in red. Right, ΔO obtained with block-structured stimulation. Median ΔO (20.7°) is indicated in red. D, No significant correlation was found between PDSI and ODI with the block-structured stimulation (Spearman correlation, ρ(245) = −0.016, p = 0.801). E, Left, PDSI obtained with interleaved stimulation, as in Figure 1D. Median PDSI (0.31) is indicated in red. Right, PDSI obtained with block-structured stimulation. Median PDSI (0.32) is indicated in red. F, No significant correlation was found between PDSI and ΔO with the block-structured stimulation (Spearman correlation, ρ(106) = −0.048, p = 0.625).
Together, our results indicate that regardless of stimulus design, cortical depth, and cell type, binocular integration in mouse V1 is widespread, and importantly this integration process does not simply depend on a convergence of feedforward monocular inputs.
Discussion
In this study, we tested the extent of binocular integration in mouse V1 and found that almost all responsive neurons showed evidence of binocular integration. Even neurons that were heavily dominated by one eye showed clear tuning to the phase disparity of a binocular grating, unequivocally demonstrating binocular interaction. Our data do not support the notion that neurons dominated by one eye are relaying signals from an upstream monocular structure, as implied by their classification nomenclature of monocular cells. Furthermore, we showed that ocular dominance, interocular matching, and disparity selectivity are independent measures of binocular integration. Finally, we found no direct association between phase disparity tuning and orientation tuning, suggesting that the two properties are generated via separate mechanisms. The major mechanism for creating phase disparity tuning is likely intracortical circuitry, unlike the convergence of geniculocortical projections for orientation tuning.
Ocular dominance, interocular matching, and disparity selectivity
Ocular dominance is a measure of binocularity that has been long studied in the visual cortex (Hubel and Wiesel, 1962). Decades of ocular dominance studies have significantly advanced our understanding of visual development and plasticity, as well as their critical period regulation (Hofer et al., 2006; Espinosa and Stryker, 2012; Priebe and McGee, 2014; Kaneko and Stryker, 2017; Ribic, 2020). However, these studies may have led to the notion that separate groups of monocular and binocular neurons exist even within the binocular zone of V1. For example, several recent studies using 2-photon calcium imaging reported large proportions of monocular neurons (>50%) in layer 2/3 of mouse binocular V1 (Salinas et al., 2017; Huh et al., 2020; Tan et al., 2020). The notion of purely monocular neurons in binocular V1 might be an acceptable approximation of layer 4 in cats and monkeys, where geniculocortical projections form alternating stripes of ocular dominance (Levay et al., 1978). However, it would raise important questions about the nature of binocular integration if monocular cells indeed exist at such a large population after the convergence of eye-specific inputs, which takes place in V1 layer 4 in mice (Gordon and Stryker, 1996) and even to some extent in the dLGN (Guido et al., 1989; Howarth et al., 2014).
A careful study of binocular integration requires an apparatus that presents visual stimuli dichoptically to the subject. The tuning to binocular disparity shows the neuron's sensitivity to binocular interactions. A truly monocular neuron could never have this property, and therefore disparity tuning is direct evidence of binocular integration. Here, we found widespread disparity-selective neurons in the mouse binocular V1 across all cortical layers. Together with the observed ODI distribution, our data demonstrate that almost all neurons in the binocular V1 show evidence of being influenced by both eyes. The monocular cells reported in previous mouse studies likely resulted from the lower sensitivity of calcium imaging and the lack of true binocular stimulation (Cang et al., 2023). More generally, we found a lack of correlation between ocular dominance and disparity selectivity. This result is consistent with findings across multiple studies performed in cats, monkeys, and mice (LeVay and Voigt, 1988; Read and Cumming, 2004; Kara and Boyd, 2009; Scholl et al., 2013; Chioma et al., 2020). In other words, disparity tuning and ocular dominance are independent measures of the binocular integration process in V1. Notably, this is inconsistent with the disparity energy model (Ohzawa and Freeman, 1986; Ohzawa et al., 1990, 1997), which predicts that more strongly tuned neurons should have a more balanced ocular dominance.
The original feedforward model of binocular integration also assumed that monocular RFs of a neuron in the two eyes should be similar (including orientation), except for offsets in position or phase, for the neuron to be useful in representing stereoscopic depth (Anzai et al., 1999). Again, we did not find such a relationship between disparity selectivity and interocular matching of orientation preference. More generally, the disparity energy model postulates that neuronal responses to binocular stimulation can be predicted from the measured responses to monocular stimulation, up to a first-order approximation. Our data showed that this was reasonably the case for the average binocular response magnitude but not for the degree of disparity selectivity.
Together, our study has revealed considerable discrepancies between actual binocular responses and the ones predicted by the disparity energy model. One way to reconcile these discrepancies is to consider a neuronal population in which binocular convergence has the same mechanism as the disparity energy model, but the population is synaptically coupled via intracortical connections. To a first-order approximation, the response property of each neuron closely follows the disparity energy model. When the stimulus drive is sufficiently strong as for a sinusoidal grating, neurons in the population influence each other through nonlinear interactions. The response of a particular neuron in such a coupled network will depend on the strength and feature of the stimulus because of the propagation and recurrent feedback of signals across the population. We have implemented such a recurrent network model in a previous study (Tanabe et al., 2022), which was able to explain the observed stimulus-dependent differences in disparity tuning between mice and tree shrews. It also predicts a reduced disparity tuning of inhibitory neurons in mice, a prediction validated by the observed tuning of fast-spiking neurons. Whether recurrent intracortical connectivity indeed underlies binocular integration will have to be studied in the future.
Limitations of this study
One limitation of our study is the lack of eye-tracking data. Disconjugate eye movements could potentially wash out disparity tuning if they are unchecked. Mice are known to make disconjugate eye movements under freely moving conditions (Meyer et al., 2020). Under head-fixed conditions, it is less clear. Vergence eye movements are likely smaller than what is detectable with current eye-tracking methods (Samonds et al., 2019). The sheer fact that we saw strong disparity tuning is evidence that vergence eye movements are small enough that they do not wash out the tuning. To study how vergence eye movements might affect the encoding of disparity, future studies would need to develop new methods of eye tracking with an accuracy several orders of magnitude higher.
Another limitation was that running speed was not recorded. In the mouse visual cortex, locomotion is known to amplify the sensory signal (Niell and Stryker, 2010). When we include running states in the same condition as stationary states, we expect an increase in the mean and variance of firing rates across all stimuli. One possibility we could explore in future studies is whether locomotion amplifies sensory signals of certain stimulus features (e.g., orientation) more than others (e.g., disparity).
Implications for natural behaviors and visual development
Many of the studies using cats and primates concern the mechanisms of stereoscopic depth perception. Stereopsis is particularly advantageous for animals with frontally facing eyes, but it is not clear whether the same advantage holds for mice with laterally facing eyes. Recent studies have shown that mice are capable of solving some stereoscopic tasks (Samonds et al., 2019; Boone et al., 2021). On the other hand, it is also possible that evolutionary pressure has led mice to use binocular vision in very different ways.
Binocular vision is best understood in terms of the geometry of visual projections from the environment onto the retina. For the study of binocular vision in humans, eye-movement tracking is vital because visual projection geometry will critically depend on the point of fixation. The same does not necessarily hold for the study of binocular vision in mice. One of the patterns of eye movement that mice make is in response to head movement. A compelling explanation is that those eye movements serve to stabilize their gaze despite head movement (Meyer et al., 2020). During head movement, there is a transient period during which the retinal images become misaligned, and binocular vision may be useful to the animal in that time period.
The magnitude of misalignment depends critically on the interocular distance. For instance, suppose there is an object in the upper binocular visual field, and the head is rotated clockwise along the roll axis. The left eye sees the object from a higher viewpoint than the right eye. Geometrically, the projection of the object to the left eye is in a shallower angle than the projection to the right eye. The disparity produced by this head rotation is directly proportional to the interocular distance. To compute the magnitude of the eye movement for realigning the images, the interocular distance will be a necessary parameter. For a developing mouse, whose interocular distance continues to change with the growth of the head, this might be a challenge. A mechanism that allows continuous recalibration, such as synaptic plasticity, might resolve this challenge during development. Having the primary circuitry for this computation in the cortex, as opposed to the geniculocortical projections, might help in the continuous recalibration.
Another product of activity-dependent development is the matching of orientation preference (Wang et al., 2010; Chang et al., 2020), which raises the question of how the development of phase disparity tuning might mirror it, and whether the two properties develop on a slightly offset time course. The paucity of the current literature on this subject makes it difficult to speculate on how exactly the two processes might codevelop, but we hope future studies will help us understand how the wiring for orientation tuning and phase disparity tuning are determined during the course of development.
In conclusion, we have found widespread and multifaceted binocular integration in mouse V1, which includes neurons that are heavily dominated by one eye. The practice of classifying and separating out these neurons as being part of a monocular mechanism will consequently underestimate the role of recurrent circuits in binocular integration. Our results indicate a critical role of intracortical circuits in binocular computation. Future studies will be needed to reveal the exact cortical circuits, determine how developmental plasticity shapes them, and understand how binocular integration and development help mouse behavior.
Footnotes
This work was supported by National Institutes of Health Grants EY026286 and EY020950 and Jefferson Scholars Foundation to J.C. We thank Sotiris Masmanidis at University of California, Los Angeles for supplying the multielectrode silicon probes.
The authors declare no competing financial interests.
Correspondence should be addressed to Jianhua Cang at cang@virginia.edu