Abstract
During binocular viewing, visual inputs from the two eyes interact at the level of visual cortex. Here we studied binocular interactions in the visual cortex of human participants of both sexes, using source-imaged steady-state visual evoked potentials over a wide range of relative contrast between the two eyes. The ROIs included areas V1, V3a, hV4, hMT+, and lateral occipital cortex. Dichoptic parallel grating stimuli in each eye modulated at distinct temporal frequencies allowed us to quantify spectral components associated with the individual stimuli from monocular inputs (self-terms) and responses due to interaction between the inputs from the two eyes (intermodulation [IM] terms). Self-term data revealed an interocular suppression effect, in which the responses to the stimulus in one eye were reduced when a stimulus was presented simultaneously to the other eye. The suppression magnitude varied with visual area and with the relative contrast between the two eyes. Suppression was strongest in V1 and V3a (50% reduction) and weakest in lateral occipital cortex (20% reduction). IM-term data revealed another form of binocular interaction. The IM response was strongest in V1 and weakest in hV4. Fits of a family of divisive gain control models to both self- and IM-term responses within each cortical area indicated that both forms of binocular interaction shared a common gain control nonlinearity. However, our model fits revealed different patterns of binocular interaction along the cortical hierarchy, particularly in terms of excitatory and suppressive contributions.
SIGNIFICANCE STATEMENT Using source-imaged steady-state visual evoked potentials and frequency-domain analysis of dichoptic stimuli, we measured two forms of binocular interactions: one is associated with the individual stimuli that represent interocular suppression from each eye, and the other is a direct measure of interocular interaction between inputs from the two eyes. We demonstrated that both forms of binocular interactions share a common gain control mechanism in striate and extra-striate cortex. Furthermore, our model fits revealed different patterns of binocular interaction along the visual cortical hierarchy, particularly in terms of excitatory and suppressive contributions.
Introduction
Psychophysical studies have documented behavioral responses to dichoptic stimuli with different combinations of target and mask contrasts in the two eyes (Legge, 1984a,b; Maehara and Goryo, 2005; Ding and Sperling, 2006; Meese et al., 2006). Extensive physiological studies have used dichoptic stimuli to demonstrate interocular suppression in primary visual cortex, V1 (Sengpiel and Blakemore, 1994; Sengpiel et al., 1995, 2006; Smith et al., 1997; Truchard et al., 2000; Macknik and Martinez-Conde, 2004; Li et al., 2005; Sengpiel and Vorobyov, 2005; Busse et al., 2009), and in V2 (Bi et al., 2011). The degree of suppression in V1 and V2 was found to be similar in strabismic monkeys (Bi et al., 2011). By contrast, there are few electrophysiological studies of interocular suppression in human V1. Two recent MEG (Chadnova et al., 2017, 2018) and one EEG (Busse et al., 2009) source-imaging studies in humans used a two-frequency dichoptic noise-masking paradigm and varied contrast and luminance levels in the two eyes. They demonstrated interocular suppression in V1, where dichoptic masking decreased responses by ∼50% in participants with normal binocular vision (Chadnova et al., 2018). It is not clear how such binocular contrast interaction propagates from V1 to extra-striate visual cortex.
Our first goal was to obtain a more complete profile of interocular suppression along the human visual cortical hierarchy, including areas of V1, V3a, hV4, hMT+, and lateral occipital cortex (LOC). Here we used source-imaged steady-state visual evoked potentials (SSVEPs) with stimuli in each of the two eyes tagged with distinct temporal frequencies (Hou et al., 2016). Our approach was similar to Chadnova et al. (2017, 2018), but we used a pair of parallel sinusoidal gratings instead of a noise-masking paradigm. Parallel gratings evoke stronger masking/suppression of the harmonic responses to each eye's stimulus frequency (referred to as self-terms, nF1 and nF2) than orthogonal gratings (Morrone and Burr, 1986; Burr and Morrone, 1987; Brown et al., 1999; Candy et al., 2001; Moradi and Heeger, 2009). This is true for the neurons in cat's visual cortex as well (Ohzawa and Freeman, 1986; DeAngelis et al., 1992). As self-term responses to each eye's input are reduced by simultaneously presenting stimuli to the other eye (Brown et al., 1999; Chadnova et al., 2017; Chadnova et al., 2018), we investigated dynamic interocular suppression in various cortical areas simultaneously by using different combinations of contrasts in the two eyes.
In addition to self-term responses, the interaction between the two unique frequency components presented to each eye also evokes intermodulation (IM) terms (nF1 ± mF2) (Baitch and Levi, 1988; Suter et al., 1996; Brown et al., 1999; Sutoyo and Srinivasan, 2009; Baker and Wade, 2017; Cunningham et al., 2017). The presence of such IM components constitutes objective neural evidence for interocular interaction (Brown et al., 1999; for review, see Norcia et al., 2015). Therefore, our second goal was to use IM responses as a novel method to measure interocular interaction along the visual cortical hierarchy.
Furthermore, to characterize these two forms of binocular interactions (interocular suppression and interaction revealed by self and IM terms, respectively) and their relation to contrast gain control, we fit both self and IM data simultaneously to a family of divisive gain control models (Tsai et al., 2012) across various cortical areas. Previous masking SSVEP studies (Baker and Wade, 2017; Chadnova et al., 2017, 2018) used a static contrast input in gain control models, which limited their ability to deal with temporal dynamics and the range of input contrasts in our study. The Tsai et al. (2012) model includes a time-varying contrast input and can explain the full range of frequency-domain responses. Model fits of both interocular suppression (self-terms) and interocular interaction (IM terms) provided a more complete description of binocular interactions in human striate and extra-striate visual cortex within the normalization framework of contrast gain control.
Materials and Methods
Participants
Fifteen participants (7 females) between 22 and 68 years old (mean age 44 ± 14 years) volunteered for the study. All had normal or corrected-to-normal vision (20/20 or better in each eye with the Bailey-Lovie LogMAR chart). Their stereoacuity was at least 40 arcsec (Random-dot stereo butterfly, Stereo Optical). The dominant and nondominant eyes were determined using the hole-in-the-card test. The research protocol was approved by the Institutional Review Board of the Smith-Kettlewell Eye Research Institute and conformed to the tenets of the Declaration of Helsinki. Written informed consent was obtained before the experiments.
Experimental design
Display and stimuli.
Figure 1 illustrates the stimuli and experimental design in this study. A pair of 2 cpd parallel sinusoidal gratings was presented on two matched Sony Trinitron monitors (model 110GS) viewed through cross-polarized filters (goggles) at a distance of 100 cm. Each screen had a resolution of 1024 × 768 pixels and was refreshed at 85 Hz. The mean luminance of the display was 46.2 cd/m2. The gratings were contrast-reversed at different temporal frequencies (8.5 and 6.07 Hz) and presented separately to each eye (as seen in Fig. 1A). Figure 1B shows a sample fMRI scan from 1 participant in a separate session to define the regions of interest (ROIs: V1, V3a, hV4, hMT+, and LOC) in visual cortical areas. These ROIs were used for source imaging of EEG scalp potentials. SSVEPs were measured in three conditions from each of 15 participants. In the Target-alone condition, the target grating was contrast-reversed at 8.5 Hz in the nondominant eye, and its contrast was swept from 1.7% to 40% in 10 logarithmic steps within 10 s, while the mask contrast in the dominant eye was set at 0%. The spectrum in this condition was dominated by the second harmonic response (2F1) to the target at 17 Hz (Fig. 1C, top). In the Mask-alone condition, the mask grating was contrast-reversed at 6.07 Hz in the dominant eye, and its contrast was fixed at 20% for the trial, which lasted 10 s, while the target contrast in the nondominant eye was set at 0%. The spectrum in this condition was dominated by the second harmonic response (2F2) to the mask at 12.14 Hz (Fig. 1C, middle). In the Target+Mask condition, the target contrast in the nondominant eye was swept from 1.7% to 40% in 10 logarithmic steps within 10 s, whereas the mask contrast in the dominant eye was fixed at 20%. The spectrum in this condition consisted of self-terms (Target 2F1 and Mask 2F2) and also IM terms (F1 + F2 at 14.57 Hz, F1 − F2 at 2.43 Hz), as seen in Figure 1C (bottom).
We also repeated the Target+Mask condition with reversed temporal frequencies in the two eyes, with the target and mask gratings contrast-reversing at 6.07 and 8.5 Hz, respectively. This repeated Target+Mask condition served to determine whether different temporal frequencies affect target and mask responses.
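As an arithmetic check on the frequency tags and the contrast sweep described above, the following Python sketch lists the expected response frequencies and ten logarithmically spaced contrast steps. It assumes that the ten steps span 1.7% to 40% inclusive (the text does not state whether these values are bin edges or bin centers); all variable names are illustrative.

```python
import numpy as np

F1 = 8.5   # target temporal frequency (Hz), from the text
F2 = 6.07  # mask temporal frequency (Hz), from the text

# Self-terms (second harmonics) and the two lowest-order IM terms
components = {
    "2F1": 2 * F1,
    "2F2": 2 * F2,
    "F1+F2": F1 + F2,
    "F1-F2": F1 - F2,
}

# Assumption: 10 logarithmic steps with endpoints at 1.7% and 40%
contrast_steps = np.geomspace(1.7, 40.0, num=10)

for name, freq in components.items():
    print(f"{name}: {freq:.2f} Hz")
print(np.round(contrast_steps, 2))
```

Under this assumption, each step multiplies the contrast by a constant factor of (40/1.7)^(1/9), roughly 1.42.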
SSVEP data acquisition and source localization
EEG data were collected from 15 participants with 128-channel HydroCell Sensor Nets and a Net Station acquisition system (EGI), bandpass filtered from 0.1 to 50 Hz, and digitized at 500 Hz. Four stimulus conditions (Target alone, Mask alone, Target+Mask, and repeated Target+Mask with reversed temporal frequencies) were presented in random order. Each trial lasted 10 s and was divided into 12 bins (10 core + 1 prelude + 1 postlude), with intervals of 3 ± 0.5 s between trials. The prelude and postlude bins were discarded from data analysis to eliminate onset/offset transients. Twenty trials of each stimulus condition were acquired. Participants were instructed to fixate a central marker and avoid blinking during stimulus presentation. At the end of the EEG session, the 3D locations of the 128 sensors and three fiducials (nasion, left and right preauricular) were recorded for each participant using a Fastrak radio-frequency 3D digitizer (Polhemus) and coregistered to participants' T1-weighted anatomical magnetic resonance scans, from which a three-shell boundary element model of the skull and scalp was computed.
EEG signals were postprocessed, including rejection of eye movement and blink artifacts, using a custom software package designed by the Norcia research group (Ales et al., 2013). Details of EEG source localization have been described previously (Appelbaum et al., 2006; Cottereau et al., 2011; Hou et al., 2016; Hou et al., 2017). In brief, an L2 minimum norm inverse was computed with sources constrained to the location and orientation of the cortical surface (Hämäläinen et al., 1993). ROIs corresponding to visual areas V1, V2v, V2d, V3v, V3d, V3a, and hV4 were defined by a separate procedure based on retinotopic mapping using fMRI (Engel et al., 1997). The area hMT+ was identified using low-contrast motion stimuli (Huk and Heeger, 2002). The LOC was defined using a block-design fMRI localizer scan with stimuli from Kourtzi and Kanwisher (2000).
ROI-based analysis
In this study, we specifically examined responses in V1, V3a, hV4, hMT+, and LOC. The areas V2 and V3 were excluded due to cross talk from other areas (Cottereau et al., 2011). To measure contrast response functions, raw EEG recordings for each trial were divided into 10 sequential core bins that corresponded to the swept stimulus values (contrast). For each bin, a recursive least-square adaptive filter (Tang and Norcia, 1995) was used to generate a series of complex-valued spectral coefficients representing the amplitude and phase of harmonic responses (Hou et al., 2017). Voltage versus contrast response functions were obtained by coherently averaging the spectral coefficients for each bin across trials for each participant, ROI, harmonic, and stimulus condition. To take into account the different noise levels for each participant (Vialatte et al., 2010), we computed the signal-to-noise ratio (SNR) for each participant by dividing peak amplitudes by the associated noise, which was defined for a given frequency as the average amplitude at the two neighboring frequencies (stimulus frequency ± 1.21 Hz). Then, we averaged the SNRs across the 15 participants. As there were no significant differences between left and right hemisphere responses (F(1,14) = 3.16, p = 0.097) when collapsing across stimulus conditions, harmonics, and ROIs, we averaged the data from both hemispheres.
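The coherent averaging and SNR definition above can be illustrated with a minimal Python sketch. It is not the analysis software used in the study: the function names are hypothetical, the adaptive filter is not reproduced, and the amplitude spectrum is assumed to be already available on a frequency grid that contains the signal frequency and its two neighbors at ± 1.21 Hz.

```python
import numpy as np

def coherent_average(coeffs):
    """Average complex spectral coefficients across trials (axis 0).

    Averaging complex values preserves phase, so responses that are not
    phase-locked to the stimulus tend to cancel.
    """
    return np.mean(np.asarray(coeffs, dtype=complex), axis=0)

def snr(amplitudes, freqs, f_signal, df=1.21):
    """Amplitude at f_signal divided by the mean amplitude at f_signal +/- df."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    freqs = np.asarray(freqs, dtype=float)

    def amp_at(f):
        # Nearest available frequency sample (assumed grid, see lead-in)
        return amplitudes[np.argmin(np.abs(freqs - f))]

    noise = 0.5 * (amp_at(f_signal - df) + amp_at(f_signal + df))
    return amp_at(f_signal) / noise

# Toy example: 2F1 = 17 Hz with neighbors at 15.79 and 18.21 Hz
freqs = np.array([15.79, 17.0, 18.21])
amps = np.array([0.5, 3.0, 1.0])
print(snr(amps, freqs, 17.0))
```

With this definition, an SNR near 1 indicates a response indistinguishable from the neighboring-frequency noise.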
Cross talk in ROI
To estimate the contribution of activities from areas outside the designated ROI, we computed a cross talk matrix, using the calculation described by Lauritzen et al. (2010) and Cottereau et al. (2011), as seen in Figure 2. Cross talk refers to the neural activities generated in different areas that are attributed to a particular ROI due to the smoothing of the electric field by the head volume. Ideally, the cortical current densities would show 0 cross talk, and the associated matrix would be equal to identity; however, the skull, dura, and intervening media smear the source localization. Nevertheless, the visual areas (V1, V3a, hV4, hMT+, and LOC) chosen for our study received on average <25% cross talk in our study, allowing us to conclude that the results we observed arise predominantly in the designated areas.
SSVEP contrast threshold estimation
The SSVEP contrast thresholds were estimated for a given ROI in each individual participant by extrapolating the second harmonic SSVEP response amplitude, as a function of target contrast, to 0 mV, using in-house software for contrast sweep VEP (Norcia et al., 1990). The threshold extrapolation algorithm incorporated the signal-to-noise ratio and phase-consistency criteria described by Norcia et al. (1990), with additional cross-checks performed using the t2circ statistic of Victor and Mast (1991).
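The extrapolation step can be sketched in Python as follows. This is a simplified illustration, not the in-house software: it assumes that amplitude is regressed linearly on log contrast over bins already selected by the caller, and it omits the signal-to-noise, phase-consistency, and t2circ criteria. The function name is hypothetical.

```python
import numpy as np

def extrapolated_threshold(contrast_pct, amplitude):
    """Contrast (%) at which a line fit to amplitude vs log10(contrast) reaches 0.

    Returns NaN when the fitted slope is not positive, because the
    extrapolation to zero amplitude is then undefined.
    """
    x = np.log10(np.asarray(contrast_pct, dtype=float))
    y = np.asarray(amplitude, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    if slope <= 0:
        return float("nan")
    return float(10 ** (-intercept / slope))

# Synthetic response that is exactly linear in log contrast, zero at 3%
contrast = np.array([5.0, 10.0, 20.0, 40.0])
amplitude = 2.0 * np.log10(contrast / 3.0)
print(extrapolated_threshold(contrast, amplitude))
```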
Contrast response modeling
In a nondichoptic masking study, Tsai et al. (2012) extended a well-established description of the contrast response function–the hyperbolic ratio function (Naka and Rushton, 1966; Albrecht and Hamilton, 1982) and included a time-varying contrast input that explained the full range of frequency-domain responses. To adapt the Tsai et al. (2012) model to our dichoptic masking paradigm, we made several modifications: (1) we presented the target and mask stimuli to different eyes; (2) we introduced a weighting factor for the mask contrast input as the relative contributions of the target and the mask are different in our dichoptic masking study; and (3) we introduced an additive baseline parameter to account for the SNR floor not being ∼1, as the responses of Target alone 2F1 and Target+Mask 2F2 were suprathreshold with SNR > 1. Thus, the variant of the Tsai et al. (2012) model used in our study is described as follows: Equation 1 defines the time-varying contrast c(t), where ctarget and cmask are the contrasts of the Target and Mask, respectively, ftarget and fmask are their temporal frequencies (Carandini, 2004; Bonin et al., 2006; Tsai et al., 2012), and wmask is a weighting factor of mask contrast relative to target. Equation 2 defines the nonlinearity, where σ is a semisaturation constant representing contrast sensitivity, p is an exponent of excitation, and q is an exponent of divisive suppression (Foley, 1994; Chen et al., 2001; Xing and Heeger, 2001; Peirce, 2007), as described by Tsai et al. (2012). Equation 3 is the function fit to the data, where R(f) is the SNR at frequency f. The parameter R0 is a frequency-dependent baseline parameter we added to account for the signal-to-noise floor, and Rm is the response gain factor. U(f) indicates the amplitude of the Fourier transform of time series u(t) at frequency f. Parameter values were obtained by nonlinear constrained optimization (MATLAB function fmincon) to minimize the sum of the squared residual error. 
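The three model stages described above can be sketched numerically in Python. The functional forms below are assumptions chosen to be consistent with the parameter definitions in the text, not the published equations: a full-wave rectified sum of the two contrast modulations for c(t), a hyperbolic ratio u = c^p / (c^q + σ^q) for the nonlinearity, and R(f) = R0 + Rm·U(f) for the fitted function. The mask frequency is taken as 85/14 Hz (about 6.07 Hz), an assumption based on the 85 Hz refresh rate; contrasts are expressed as fractions, and all default parameter values are arbitrary.

```python
import numpy as np

BASE = 85.0 / 70.0    # assumed common fundamental of both tags (Hz)
F_TARGET = 7 * BASE   # 8.5 Hz
F_MASK = 5 * BASE     # about 6.07 Hz (assumed to be exactly 85/14 Hz)

def model_snr(f, c_target, c_mask, w_mask=1.0, sigma=0.1, p=2.0, q=2.0,
              r0=1.0, rm=10.0, n_cycles=10, n_samples=7000):
    """Model SNR at frequency f (Hz) under the assumed equation forms."""
    # Whole number of common periods, so tagged frequencies fall on DFT bins
    duration = n_cycles / BASE
    t = np.arange(n_samples) * duration / n_samples
    # Assumed Equation 1: rectified, weighted sum of the two inputs
    c = np.abs(c_target * np.sin(2 * np.pi * F_TARGET * t)
               + w_mask * c_mask * np.sin(2 * np.pi * F_MASK * t))
    # Assumed Equation 2: divisive gain control nonlinearity
    u = c ** p / (c ** q + sigma ** q)
    # Amplitude of the Fourier component of u(t) at frequency f
    U = 2.0 * np.abs(np.mean(u * np.exp(-2j * np.pi * f * t)))
    # Assumed Equation 3: baseline plus scaled spectral amplitude
    return r0 + rm * U

f_im = F_TARGET + F_MASK
print(model_snr(f_im, 0.2, 0.0))  # target alone: stays at the baseline r0
print(model_snr(f_im, 0.2, 0.2))  # target plus mask: IM term above baseline
```

Under these assumed forms, the F1 + F2 term appears only when both inputs are present, which matches the interpretation of IM terms as a signature of interaction between the two inputs. In a fit, the parameters would be adjusted by constrained optimization to minimize the squared residuals against the measured SNRs.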
The coefficient of determination (R2) was used to assess goodness of fit of a model. The SDs and CIs of the fit parameter values were estimated from the distributions of 1000 bootstrap resamplings (each of size 15), drawn randomly from participant data with replacement.
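The goodness-of-fit measure and the bootstrap procedure can be sketched in Python as follows. The function names are hypothetical, and the percentile method for the CI is an assumption, since the text does not state how CIs were derived from the bootstrap distribution. In the study, the statistic would be the model parameters refit to each resampled group average; a simple mean stands in for that step here.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def bootstrap_stat(data, stat_fn, n_boot=1000, seed=0):
    """SD and 95% percentile CI of stat_fn over resampled participants.

    data: array whose first axis indexes participants. Each resample
    draws the same number of participants with replacement.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    stats = np.array([stat_fn(data[rng.integers(0, n, size=n)])
                      for _ in range(n_boot)])
    sd = stats.std(axis=0, ddof=1)
    ci = np.percentile(stats, [2.5, 97.5], axis=0)
    return sd, ci

# Stand-in data: one value per participant (15 participants)
values = np.arange(15.0)
sd, ci = bootstrap_stat(values, np.mean)
print(sd, ci)
```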
Results
Profile of interocular suppression in V1 and extra-striate cortex
Self-term responses represent the response to each eye's input (i.e., target eye and mask eye). As one eye's self-term response is reduced by simultaneously presenting the stimuli to the other eye (Chadnova et al., 2018), we measured this component to investigate interocular suppression under dynamic combinations of contrasts from the two eyes.
Figure 3 plots SNR at the second harmonic responses averaged across 15 participants from 3 stimulus conditions that produced the following self-term response components: the second harmonic of the target frequency when the target was presented alone (Target alone 2F1) and when it was presented with a mask (Target+Mask 2F1), and the second harmonic of the mask frequency when the mask was presented alone (Mask alone 2F2), and when it was presented with the target (Target+Mask 2F2). Data from the different ROIs are shown in different colors. The response was a factor of ∼2 higher in V1 than in extra-striate cortex, regardless of the absence (Fig. 3A,B, top) or presence (Fig. 3C,D, top) of a stimulus in the other eye. An initial ANOVA was conducted with the factors of component (Target alone, Target+Mask at 2F1, and Target+Mask at 2F2) and ROI (V1, V3a, hV4, hMT+, and LOC) collapsed (averaged) across 10 target contrast levels. The Mask-alone component (Fig. 3B, top) was excluded because it included only one contrast level. As expected, the interaction of component and ROI was significant (F(8,7) = 5.06, p = 0.023), suggesting that the responses are different among the conditions and the ROIs. This difference was likely driven by the overall weaker responses in extra-striate cortex, compared with the responses in V1. Therefore, we further compared the responses between V1 and extra-striate ROIs (averaged responses of areas V3a, hV4, hMT+, and LOC) across 10 target contrast levels for each component. ANOVA with factors of ROI and target contrast level revealed that V1 had significantly stronger responses than extra-striate cortex for all three components: Target alone 2F1 (F(1,14) = 13.80, p = 0.002), Target+Mask 2F1 (F(1,14) = 7.97, p = 0.014), and Target+Mask 2F2 (F(1,14) = 10.27, p = 0.006). There were no significant differences among target contrast levels for each component (F(9,6) = 2.02, p > 0.05).
Furthermore, as seen in Figure 3C,D, when a mask was presented to the other eye, the responses to the Target (Target+Mask at 2F1; Fig. 3C, top) were reduced in all ROIs, compared with the Target-alone condition (Fig. 3A, top). In contrast, the responses to the Mask (Target+Mask at 2F2; Fig. 3D, top) decreased monotonically as the target contrast was increased. The pattern of self-term responses at both target frequency (2F1) and mask frequency (2F2) was consistent with Winner-Take-All competition (Busse et al., 2009; Carandini and Heeger, 2011), as the response seems to be dominated by the higher contrast component. Repeating the Target+Mask condition with reversed temporal frequencies in the two eyes did not show significant differences (F(1,14) = 0.32, p = 0.581, repeated-measures ANOVA). This result indicated that different temporal frequencies did not affect the pattern of target and mask responses. The replication of the results also suggests good test-retest reliability for the study.
We also measured individual contrast thresholds by extrapolating the SSVEP response amplitude at Target+Mask 2F1 versus target contrast response function to 0 mV (Norcia et al., 1990). Previous studies found that extrapolation of contrast response curves efficiently predicts psychophysical thresholds (Campbell and Maffei, 1970). The averaged threshold across participants in each ROI is plotted in Figure 3C (top). Contrast threshold for the target in the presence of a fixed contrast dichoptic mask was 2.9% in V1, which was significantly lower than the threshold of 3.69% in extra-striate ROIs (averaged across V3a, hV4, hMT+, and LOC) (p = 0.011, paired t test). However, we were not able to measure contrast thresholds for the 2F1 component in the Target-alone condition because the contrast sweep started at 1.7% contrast, which is well above the unmasked threshold measured in a previous study (Norcia et al., 1990). Contrast thresholds measured with a contrast sweep method similar to the one used in our study (Norcia et al., 1990) were ∼0.22%-0.32% for adults at low spatial frequencies (0.5-2 cpd). These values should be similar to thresholds for the 2 cpd grating used in our study. Compared with the thresholds for unmasked stimuli, the thresholds in the presence of a dichoptic mask are approximately an order of magnitude higher (2.9% in V1 and 3.69% in extra-striate cortex), suggesting interocular suppression of contrast sensitivity.
The response phase depended on whether the mask was present and whether the responses were to the target or to the mask. In the Target-alone condition, as seen in Figure 3A (bottom), we observed a “phase advance” of the 2F1 component, as described previously (Burr and Morrone, 1987), where the response phase progressively increased with increasing target contrasts, with a steeper increase for intermediate contrasts (∼5%-10% target contrast in our study) in all ROIs, except in V1. V1 showed a phase decrease of ∼−20° to −40° for intermediate contrasts. When the parallel dichoptic mask was present, the “phase advance” was reduced (from 60° in Target-alone condition to 40° in Target+Mask condition) in hMT+ and LOC. Interestingly, the “phase advance” in V3a disappeared, and the phase in hV4 showed a “phase decrease” (Fig. 3C, bottom). The phase in V1, as in hV4, also showed a “phase decrease” with increasing target contrast. These findings suggest that the “phase decrease,” including reduction of “phase advance,” might be a signature of interocular suppression in parallel dichoptic masking. The observation that the phase advance is abolished in V3a with the introduction of a parallel grating mask is similar to the findings in a previous VEP study (Burr and Morrone, 1987). It is possible that the responses recorded at POz from Burr and Morrone (1987) also represent more dorsal areas, such as V3a.
For the responses to the mask frequency 2F2, the phase showed an opposite pattern, compared with the pattern of the target frequency 2F1 (Fig. 3D, bottom). All ROIs, except V1, showed a “phase decrease” at ∼10% and 20% target contrast. V1 showed a “phase advance” as described in the Burr and Morrone (1987) study. The opposite pattern of phase change for 2F1 and 2F2 suggests that the phase responses to the stimuli rely on the relative contrast between the two eyes. A direct comparison of phase responses in each ROI in the Target-alone and Target+Mask conditions can be seen clearly in the bottom panels of Figure 4A for target frequency (2F1) and of Figure 4B for mask frequency (2F2).
Figure 4 demonstrates the profile of interocular suppression in V1 and the ROIs in extra-striate cortex. In all ROIs, the responses to the Target (Fig. 4A, top) and to the Mask (Fig. 4B, top) were suppressed (reduced) by presenting stimuli to the other eye. As seen in Figure 4A (top), the effect of the dichoptic mask on the target was to shift all contrast response functions of Target+Mask 2F1 downward and rightward. The downward and rightward shifts are quantified by the modified version of the Tsai et al. (2012) model, in which Target alone and Target+Mask responses in the top row of Figure 4A were fit separately in each ROI. The detailed fit results are presented below in Monocular versus dichoptic processing in higher visual areas.
Figure 5 plots suppression percentage in each ROI as the difference between with- and without-mask SNR, divided by the Target-alone SNR. Negative and positive values indicate suppression and facilitation, respectively. For suppression of the target by the mask, all ROIs showed suppression from 1.7% to 40% target contrast. The suppression was evident at very low target contrasts (<3.5%) and reached a maximum at 7% target contrast, which was far below the contrast at which the two eyes' contrasts were matched (20%). V1 had the strongest suppression (∼50% reduction), and LOC had the least suppression (∼20% reduction) at the contrast corresponding to peak suppression. The suppression in V3a was similar to V1.
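The suppression index plotted in Figure 5 follows directly from the definition above and can be written as a short Python function. The function name and the example SNR values are illustrative only.

```python
import numpy as np

def suppression_percent(snr_with_mask, snr_target_alone):
    """Percent change of the masked SNR relative to the Target-alone SNR.

    Negative values indicate suppression; positive values indicate
    facilitation.
    """
    with_mask = np.asarray(snr_with_mask, dtype=float)
    alone = np.asarray(snr_target_alone, dtype=float)
    return 100.0 * (with_mask - alone) / alone

# Illustrative values: halving the SNR corresponds to -50%
print(suppression_percent([2.0, 4.0], [4.0, 5.0]))
```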
Profile of interocular interaction in V1 and extra-striate cortex
IM terms can only result from the interaction of the two eyes' inputs; thus, their presence is neural evidence for interocular interaction (Brown et al., 1999). Among the IM components, F1 + F2 was dominant in our study as seen in Figure 1C (bottom), so we used the F1 + F2 IM component to index “interocular interaction” in our study. When both target and mask were present, the IM responses as a function of target contrast across all ROIs were nonmonotonic, as seen in Figure 6 (left): they increased with target contrast to a peak and then declined thereafter. Although overall response magnitudes in extra-striate cortical areas were weaker than in V1, the peak IM responses in all ROIs occurred at ∼20% target contrast, when the target contrast matched that of the mask. In V1, the IM response became evident at ∼2.5% target contrast, while in extra-striate ROIs IM responses were evident at ∼4%-5% target contrast. One-factor ANOVA with ROI (V1, V3a, hV4, hMT+, and LOC) at the peak response, where the target and mask contrasts were matched at 20%, revealed significance (F(1,14) = 192.24, p < 0.001), suggesting an effect of ROI that is likely driven by the higher response in V1. To investigate this further, we compared the responses between V1 and extra-striate ROIs (averaged responses of areas V3a, hV4, hMT+, and LOC) across 10 target contrast levels. Similar to the self-terms noted earlier, ANOVA with factors of ROI and target contrast level revealed that V1 had significantly stronger interocular interaction than extra-striate cortex (F(1,14) = 7.94, p = 0.014). Among the extra-striate ROIs, hV4 showed the smallest IM response. However, the difference between hV4 and averaged responses of V3a, hMT+, and LOC did not reach significance (F(1,14) = 1.94, p = 0.185). The phase of the IM response also changed with increasing target contrast.
However, the pattern of the IM phase change was different from the patterns of phase change for the 2F1 component in the Target-alone and Target+Mask conditions. V1 and hV4 showed a “phase advance” as contrast increased from ∼5% to ∼20%-40%. The phase in V3a, hMT+, and LOC showed a small decrease with increasing target contrast and remained at ∼−40° for contrasts >10%.
Monocular versus dichoptic processing in higher visual areas
As mentioned earlier, we quantified the extent of interocular suppression by fitting the target 2F1 responses in the absence and presence of mask (Fig. 4A, top row) separately using a divisive gain control model (Tsai et al., 2012) that we modified for our dichoptic masking study. The model fits are shown in Figure 7, where the black solid lines indicate the best fitting model. The corresponding fit parameters and values in each ROI are listed on the left side of the panels for the Target-alone condition and on the right side of the panels for the Target+Mask condition. Several key changes were observed in target 2F1 responses in the presence of a dichoptic mask (binocular interaction), compared with those in the absence of mask (monocular processing), which we describe in detail below.
Downward and rightward shifts of the contrast response function in the presence of dichoptic mask
The changes to the 2F1 response by the addition of a dichoptic mask are characterized primarily by changes to the semisaturation constant (σ) and to the baseline parameter R0, as shown in Table 1. The σ value increased from 4.7% to 11% in V1, and from 10% to 32% in extra-striate cortex (averaged over extra-striate ROIs). The σ values in the presence of mask were distributed over a wide range, showing no significant differences between the ROIs. However, the overall rightward shift of the 2F1 responses due to the addition of a dichoptic mask is obvious compared with those in the absence of the mask. The increased σ indicates that the mask reduced contrast sensitivity. Table 1 also shows a clear decrease in the baseline level (R0) on average across ROIs from 1.47 ± 0.19 in the absence of the mask to 1.06 ± 0.05 in the presence of the mask in all ROIs. The downward shift in the baseline parameter R0 is likely due to the fact that even the lowest value of the sweep contrast (1.7%) in the Target-alone condition was at a suprathreshold level, which is known to evoke significant VEP responses (Norcia et al., 1990; Hou et al., 2014).
Response gain reduced in the presence of dichoptic mask
We have seen that the dichoptic mask reduced response amplitudes (Fig. 4A, top row). The response gain parameter Rm also showed a similar effect (Fig. 8A). Response gain decreased by a factor of ∼2 in all ROIs in the presence of a dichoptic mask. Response gain in V1 was higher than in extra-striate cortex, regardless of the presence or absence of mask. A previous MEG source-imaging study (Hagler, 2014) also reported that V1 had a factor of ∼2 stronger responses than V2, V3, and V3a for checkerboard pattern stimuli. The strong response gain in V1 might be because the stimuli (contrast in our study, checkerboard in Hagler, 2014) were low level and primarily represented functional properties of V1 neurons. We speculate that the strong response gain in V1 might also be driven by the specific role of V1 as the first locus where binocular information is combined. It is also possible that the more robust EEG responses in V1 are because it has a larger cortical surface area in close proximity to the scalp electrodes, compared with extra-striate cortex, such as areas hV4, hMT+, and LOC.
Excitatory and divisive suppression in monocular and dichoptic processing
Our model fits revealed a change in the relationship between excitatory (p) and suppressive (q) contributions in the presence of a dichoptic mask. Although the values of p and q differed across cortical ROIs (Fig. 7), divisive suppression across all ROIs was in general stronger than excitation in the presence of a mask, and in general weaker than excitation in the absence of a mask. This is evident in Figure 8B, which shows the differences between q and p for the 2F1 component in the Target-alone and Target+Mask conditions. In the absence of a mask (monocular processing), p is slightly more dominant (as seen in Fig. 8B, bottom, pink). In the presence of a mask (dichoptic processing), q is dominant (Fig. 8B, top, green) in extra-striate areas. However, this alternation was not observed in V1, where the mean difference between q and p did not reach significance due to the large SD of the fit.
Weight of mask contrast in dichoptic processing
As seen in Figure 4, overall suppression was weaker from the target eye (swept contrast) to the mask eye (fixed contrast) (Fig. 4B, top), compared with the suppression from the mask eye to the target eye (Fig. 4A, top). Our model fits are also consistent with this observation. More specifically, we modified the Tsai et al. (2012) model to allow for a parameter that determines the relative contribution of mask contrast, Wmask. We found that the effective contrasts of target and mask were different in our dichoptic-masking paradigm, compared with nondichoptic masking (Tsai et al., 2012) where both target and mask contrast inputs had equal weights. This is evident in the values of the weighting factor (Wmask) in the Target+Mask condition listed in Figure 7 (right) and in Figure 8C (right), where the mask contrast across ROIs had to be attenuated to ∼0.74 on average (± 0.225 SD) to achieve good fits of the model. For the dichoptic masking condition, there was no significant difference in Wmask between ROIs. For the Target-alone condition, Wmask was disabled (set to 0 for the fits in Fig. 7, left). It is not clear why the relative effectiveness of target and mask contrasts was different, but the attenuation of the mask needed to account for the dichoptic masking data suggests that the suppression from the mask eye (fixed contrast) to the target eye (varied contrast) is stronger than the other way around.
Relation of the two forms of binocular interactions to contrast gain control
To characterize the two forms of binocular interaction (i.e., interocular suppression represented by self-terms vs interocular interaction represented by IM terms) and their relation to contrast gain control across the visual cortical hierarchy, we fit all the dichoptic masking data (both self- and IM-term responses) simultaneously in each cortical area to a family of divisive gain control models (Tsai et al., 2012) that we modified for our dichoptic masking study. The model fits are shown in Figure 9, where the black solid lines indicate the best fitting model. The corresponding fit parameters and values in each ROI are listed on the right side of the panels, as well as in Table 2 along with SDs. The fit parameters in Figure 9 typically inherit the patterns of Target+Mask 2F1 in Figure 7, where Wmask is <1 and q remains dominant across all ROIs, as these components (Target 2F1, Mask 2F1, and IM) are all dichoptic responses. However, simultaneously fitting all three response components in a given ROI resulted in larger SDs than fitting a single component (Fig. 7; Target 2F1). Nonetheless, as seen in Figure 9, both self and IM terms in V1 and extra-striate cortex are simultaneously fit well, suggesting that both forms of binocular interactions during dichoptic presentation have the same gain control parameters within a given ROI. The goodness of fit (R²) varied across the ROIs between 0.875 and 0.967. This finding indicates that both forms of binocular interactions share a common gain control nonlinearity within a cortical area.
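One way to summarize a simultaneous fit is a single R² computed over all response components pooled together. The sketch below assumes the standard coefficient of determination, 1 minus the ratio of residual to total sum of squares. It is not necessarily the exact computation behind the values reported above, and the function name, component labels, and amplitudes are hypothetical.

```python
def pooled_r_squared(observed_by_component, predicted_by_component):
    """Coefficient of determination over all components pooled into one data set.

    Both arguments map a component label to a list of response amplitudes,
    one value per contrast level, in the same order.
    """
    observed, predicted = [], []
    for label, values in observed_by_component.items():
        observed.extend(values)
        predicted.extend(predicted_by_component[label])

    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_residual / ss_total


# Hypothetical amplitudes (arbitrary units) for three dichoptic components.
observed = {
    "target_2F1": [1.0, 2.0, 3.0],
    "mask_2F1": [2.0, 2.0, 1.0],
    "IM": [0.5, 1.0, 0.5],
}
# Hypothetical predictions from one shared set of model parameters.
predicted = {
    "target_2F1": [1.1, 1.9, 2.8],
    "mask_2F1": [2.1, 1.8, 1.2],
    "IM": [0.4, 0.9, 0.6],
}

print(pooled_r_squared(observed, predicted))
```

Because the residuals of all components enter one sum, a high pooled R² requires a single parameter set to describe the self and IM terms together.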
Discussion
Using source-imaged SSVEP and frequency-domain analysis, we examined the neural dynamics of binocular interactions in various human visual cortical areas over a wide range of relative stimulus contrast between the two eyes. The stimuli in each eye were tagged with distinct temporal frequencies, which allowed us to quantify spectral components associated with the individual stimuli in each eye (self-terms) and the responses due to interaction between the inputs from the two eyes (IM terms). These two terms provided a more complete description of neural activity related to binocular interactions in human striate and extra-striate cortex, including areas V1, V3a, hV4, hMT+, and LOC.
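The logic of frequency tagging can be illustrated with a toy example: two sinusoidal inputs at different frequencies produce responses at sum and difference frequencies only if they pass through a common nonlinearity. The sketch below uses a squaring nonlinearity purely as an assumption for illustration. It is not the gain control model fitted in this study, and the tagging frequencies (5 and 7 Hz), sampling rate, and function name are hypothetical.

```python
import cmath
import math


def amplitude_at(signal, freq_hz, sample_rate_hz):
    """Peak amplitude of the sinusoidal component of `signal` at freq_hz."""
    n = len(signal)
    total = sum(
        value * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
        for k, value in enumerate(signal)
    )
    return 2.0 * abs(total) / n


# Hypothetical tagging frequencies (Hz); not the values used in the study.
F1, F2 = 5.0, 7.0
SAMPLE_RATE = 200.0
N_SAMPLES = 200  # 1 s of data, so each frequency of interest completes whole cycles

times = [k / SAMPLE_RATE for k in range(N_SAMPLES)]
# Linear summation of the two "eye" inputs: no interaction between them.
linear_sum = [math.sin(2 * math.pi * F1 * t) + math.sin(2 * math.pi * F2 * t) for t in times]
# Toy nonlinearity applied after combination: squaring the summed input.
squared = [x * x for x in linear_sum]

for label, freq in [("2F1 (self)", 2 * F1), ("2F2 (self)", 2 * F2),
                    ("F2-F1 (IM)", F2 - F1), ("F1+F2 (IM)", F1 + F2)]:
    print(label,
          round(amplitude_at(linear_sum, freq, SAMPLE_RATE), 6),
          round(amplitude_at(squared, freq, SAMPLE_RATE), 6))
```

For the squaring case, the identity 2 sin a sin b = cos(a − b) − cos(a + b) shows that the cross term carries energy at F2 − F1 and F1 + F2, whereas the linear sum has none at those frequencies. IM terms therefore index a stage at which signals from both eyes have already been combined.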
Contrast normalization accounts for two forms of binocular interactions in the visual cortical hierarchy
One of the important findings in this study is that the two forms of binocular interactions (interocular suppression and interaction revealed by self and IM terms, respectively) share a common gain control mechanism. This shared gain control is evident in Figure 9, in which both self and IM terms are simultaneously fit well not only in V1, but also in extra-striate cortex to a family of divisive gain control models (Tsai et al., 2012). These findings suggest that contrast normalization successfully predicts neural population responses during binocular processing in both striate and extra-striate visual cortex. Our study is the first to demonstrate that interocular interaction as indexed by the IM term is also consistent with divisive gain control in extra-striate cortex.
It is surprising that our dichoptic contrast masking data are fit well by a contrast gain model developed to explain nondichoptic contrast masking (Tsai et al., 2012), with minor variations, such as presenting target and mask stimuli to different eyes and attenuating mask contrast input. The latter is critical to achieve good fits of the model for dichoptic masking, as it allows for a parameter (Wmask) that determines the relative contribution of the mask contrast. The original nondichoptic version of the model (Tsai et al., 2012) without Wmask fit our monocular data very well, but not our dichoptic data, as can be seen in Figure 8C. This finding suggests that the relative contributions of each eye to contrast gain may vary under certain circumstances, such as a constant mask in one eye and a varying contrast target in the other eye. We found that the effect from the target eye to the mask eye is weaker than the effect from the mask eye to the target eye. Our variant of the model provides a quantitative estimate of the weighting of contrast from the mask eye (Wmask), which could be useful to evaluate amblyopic suppression from the nonamblyopic fellow eye to the amblyopic eye.
Previous studies have used a two-stage model (Meese et al., 2006) to describe binocular combination, where the first stage has monocular inputs that are normalized by inputs from both eyes before a second stage where they are combined binocularly. Our data are well fit by the single stage of normalization in the Tsai et al. (2012) model, but that is perhaps because we are measuring responses in V1 and beyond, where the inputs from the two eyes have already been combined binocularly. It is also possible that the relative weighting of the two eyes' input occurs at a first stage, which we implement in our model by attenuating mask contrast before binocular combination. Our dichoptic masking study, along with previous dichoptic (Chadnova et al., 2017, 2018) and nondichoptic masking (Candy et al., 2001; Tsai et al., 2012) studies, further supports the view that normalization serves as a canonical neural computation (Carandini and Heeger, 2011; Baker and Wade, 2017).
Two forms of binocular interactions in the visual cortical hierarchy
The overall binocular interactions represented by self and IM terms in our study revealed that both forms have stronger responses in V1 than in extra-striate cortex, suggesting that the bulk of binocular interactions likely occurs in V1. This is evident in Figure 7 and its corresponding fit parameters, in which Wmask is similar in all ROIs. In the presence of a dichoptic mask, the decrease in response at low contrast is greatest in V1 and appears to be propagated to higher visual areas (Fig. 3C, top). In general, the fit parameters change between V1 and extra-striate areas; for example, the suppression factor (q) becomes dominant in the presence of dichoptic masking only in extra-striate cortex (Fig. 8B). Our model fits suggest that the results are compatible with a gain control model that combines inputs from the two eyes in V1, but allows for cascading stages of gain control in higher visual areas (Simoncelli and Heeger, 1998).
The suppression effect in V1 in our study is similar to the report in human V1 from an MEG source-imaging study (Chadnova et al., 2018), in which dichoptic masking decreased responses by ∼50% in participants with normal binocular vision. Our result in V1 was also similar to low-channel SSVEP responses recorded at Oz and POz locations (Baker and Wade, 2017). We should point out that our findings in V1 do not depend critically on SSVEP source imaging, as data from a single electrode at Oz, referenced to the average, are very similar to V1 (Ales et al., 2010). It is possible that the responses at POz in Baker and Wade (2017) represent more dorsal areas, such as V3a, where suppression was found to be similar in magnitude to that found for V1 in our study. An fMRI study reported that V1, V2, and V3 have the same suppression pattern (Moradi and Heeger, 2009). However, we cannot directly compare our data at V2 and V3 with this fMRI study, as there is considerable cross talk from other areas to these two areas in our source-imaging technique (Cottereau et al., 2012, 2015). To our knowledge, there are no comparable studies of interocular suppression beyond V3a or higher-level visual cortical areas.
The IM terms revealed another aspect of binocular interaction, compared with self-term responses (interocular suppression) described above and in previous studies (Chadnova et al., 2017, 2018). Our results showed that the IM response was strongest at V1 and was least at hV4. Nevertheless, the peak IM responses occurred at ∼20% target contrast across all ROIs when the target and mask contrasts were equal. IM responses in our study were similar at V3a, hMT+, and LOC, as these areas are known to be involved in binocular processing and disparity coding (Preston et al., 2008; Cottereau et al., 2011, 2012). There are no comparable studies using IM responses in extra-striate visual cortex.
Although our modeling results suggest that a single contrast gain control mechanism accounts for both the interocular suppression and binocular combination results (self- and IM-term responses, respectively) in neurotypical observers, studies of dichoptic masking in amblyopia suggest that these two processes might be differentially affected. Several studies suggest that interocular suppression is intact (Sengpiel and Blakemore, 1996; Hallum et al., 2017; Shooner et al., 2017), but that binocular combination is compromised (Levi et al., 1979).
Excitatory and suppressive contributions in binocular interactions
Another important finding in this study is that the contributions from excitation (p) and divisive suppression (q) differed during monocular and dichoptic processing, as seen in Figure 8B. Importantly, our results show, for the first time, that the divisive suppression is stronger in extra-striate cortex than in V1 during dichoptic processing.
Limitation of our model
We used a variant of the model from Tsai et al. (2012) to explain our dichoptic masking data. As described by Tsai et al. (2012), one limitation of the model is its relative deficiency in predicting the absolute phase of SSVEP responses correctly. Nonetheless, the modeling here for SSVEP response amplitudes does demonstrate that contrast normalization accounts for binocular interactions in human striate and extra-striate visual cortex.
In conclusion, we studied neural dynamics of binocular interactions at different levels of human visual cortex over a wide range of relative contrast between the two eyes. We measured two forms of binocular interactions: one associated with the individual stimuli that represent interocular suppression from each eye, and the other a direct measure of interocular interaction between inputs from the two eyes. We demonstrated that these two forms of binocular interactions share a common gain control mechanism in both striate and extra-striate cortex but have different characteristics in areas along the visual cortical hierarchy. Furthermore, our model fits revealed two important characteristics of dichoptic masking: attenuation of mask contrast and dominance of divisive suppression in extra-striate cortex. Our study provides a more complete description of binocular interaction in human striate and extra-striate cortex and provides a basis for future comparisons with clinical populations with abnormal binocular vision, such as strabismus and amblyopia.
Footnotes
This work was supported by National Institutes of Health Grant R01-EY025018 to C.H. We thank Dr. Jeffrey J. Tsai for sharing and providing the model script used in Tsai et al. (2012); and Margaret Q. McGovern for assistance in recruiting the participants.
The authors declare no competing financial interests.
Correspondence should be addressed to Chuan Hou at chuanhou{at}ski.org