Abstract
We tested how a stimulus gestalt, defined by the neuronal interaction between local and global features of a stimulus, is represented within human primary visual cortex (V1). We used high-resolution fMRI, which serves as a surrogate of neuronal activation, to measure co-fluctuations within subregions of V1 as (male and female) subjects were presented with peripheral stimuli, each with different global configurations. We found stronger cross-hemisphere correlations when fine-scale V1 cortical subregions represented parts of the same object compared with different objects. This result was consistent with the vertical bias in global processing and, critically, was independent of the task and local discontinuities within objects. Thus, despite the relatively small receptive fields of neurons within V1, global stimulus configuration affects neuronal processing via correlated fluctuations between regions that represent different sectors of the visual field.
SIGNIFICANCE STATEMENT We provide the first evidence for the impact of global stimulus configuration on cross-hemispheric fMRI fluctuations, measured in human primary visual cortex. Our results are consistent with changes in the level of γ-band synchrony, which has been shown to be affected by global stimulus configuration, being reflected in the level of fMRI co-fluctuations. These data help narrow the gap between knowledge of global stimulus configuration encoding at the single-neuron level versus at the behavioral level.
- binding problem
- feature conjunction
- high-resolution fMRI
- primary visual cortex (V1)
- spontaneous activity
Introduction
In everyday life, the visual system is bombarded with a multitude of stimuli. In human and nonhuman primates, the local features of stimuli are to a large part encoded by neurons in the primary visual cortex (V1) that possess small receptive fields. These locally encoded features need to be bound together to represent the gestalt (i.e., the overall shape) of the stimulus. Binding is thus crucial to encode a global configuration as well as to avoid illusory conjunctions (Treisman and Schmidt, 1982; von der Malsburg and Schneider, 1986).
The neural mechanisms that underlie feature binding have been a topic of interest for many decades (Rosenblatt, 1961). One of the first hypothetical mechanisms was changes in the extent of synchronous neuronal activity (von der Malsburg and Schneider, 1986). According to the neuronal synchrony hypothesis, the absolute firing rate of neurons encodes the significance of the encountered features, while the level of temporal correlation across different neurons “tags” the binding between encoded features.
Evidence in support of the neuronal synchrony hypothesis was first provided by Gray et al. (1989), who showed that, in cats, the level of coherence between V1 neurons was higher when the encoded features belonged to the same rather than different objects. Also, this coherency-based encoding was more apparent in the γ-band (i.e., 30–80 Hz) than at lower frequencies. These findings suggested that global stimulus configuration can influence local feature encoding beyond what is expected from the classical definition of the neural receptive field (Gray et al., 1989; Kapadia et al., 1995; but see also Riesenhuber and Poggio, 1999).
Evidence for feature binding and global stimulus configuration encoding via temporally synchronized neuronal activity in the human brain is mostly limited to studies based on EEG recordings. For instance, Rose et al. (2006) observed an increase in synchronous γ-band power between the cerebral hemispheres when they preferentially encoded features that belonged to the same objects. However, the low spatial resolution of the EEG technique and ambiguities inherent in source localization (Hämäläinen and Ilmoniemi, 1994) make it difficult to accurately localize the fine-scale neural mechanisms, at the level of cortical columns, that underlie synchronized EEG waves.
In contrast to EEG, BOLD fMRI provides a relatively high spatial resolution (Goense et al., 2016; Dumoulin et al., 2018; Polimeni and Wald, 2018) that in many cases is comparable to the resolution achieved by invasively measured local field potentials (Berens et al., 2008; Nauhaus et al., 2009). Importantly, multiple studies have linked the ultra-slow spontaneous fluctuations in the fMRI signal to the change in the level of γ-band neural activity (Nir et al., 2007; Scholvinck et al., 2010; Scheeringa et al., 2011; Mateo et al., 2017). Specifically, changes in the level of γ-band neuronal activity can drive vasomotive oscillations in pial arterioles on the cortical surface; this mechanism influences the supply of oxygenated blood to the underlying tissue and subsequently causes changes in the BOLD signal (Mateo et al., 2017; Drew et al., 2020). This interaction between neuronal activity and the supply of energy substrates makes fMRI a suitable technique to test the impact of global stimulus configuration on the level of synchrony between cortical subregions.
In this study, we tested whether the correlation between fluctuations in the BOLD fMRI signal, evoked within fine-scale cortical structures of human area V1, varied when these structures represent parts of the same versus different objects. We focused on the impact of global configuration on “cross-hemispheric” coherence in neuronal activity. This was mainly because the impact of global configuration on “within-hemisphere” coherence is limited to neighboring neural columns (Gray et al., 1989; Engel et al., 1991), which appears to be beyond the spatial resolution of current fMRI techniques (see Materials and Methods). We also tested whether this phenomenon is impacted by the subject's level of attention as well as by vertical asymmetries in visual perception, as expected from human behavioral data (Previc, 1990; Nasr and Tootell, 2020).
Materials and Methods
Participants
In total, 29 human volunteers (18 females, 20-42 years of age) participated in this study. Among them, 18 subjects (12 females, 21-37 years old) participated in Experiment 1. Of these 18 subjects, 7 subjects (6 females, 21-37 years old) also participated in Experiment 2. The remaining 11 subjects (6 females, distinct from those who participated in Experiments 1 and 2, 20-42 years old) participated only in Experiment 3.
All subjects had normal or corrected-to-normal vision (based on a Snellen test) and no history of neurologic and/or psychiatric illness. All procedures were in compliance with the guidelines of the Institutional Review Board of the Massachusetts General Hospital. Procedures were fully explained to all subjects, and informed written consent was obtained before scanning in accordance with the Declaration of Helsinki.
Visual stimuli and procedure
Experiment 1
Inside the MRI scanner, subjects were presented with four unfilled elliptical objects (6° distance between focal points, ρ1/ρ2 = 4; border width = 1 pixel) drawn peripherally (R = 7.8° eccentricity) (Fig. 1A,B). Objects appeared concurrently on the screen, against a gray background, ∼30 s before initiating fMRI data collection and remained visible during the entire run (240 s) without any change. This early stimulus presentation relative to the data collection enabled us to reduce (if not eliminate) the impact of stimulus onset on the fMRI activity co-fluctuations.
Figure 1. Global stimulus configurations used in different experiments. A, The stimulus configuration in Experiments 1 and 2. Stimulus configuration remained unchanged during each run and only changed between runs. B, The difference in stimulus configuration between runs. Red dashed lines indicate location of the “focal points” (i.e., the ROIs). Arrowheads point to the adjacent focal points that belong to the same (solid yellow lines) versus different (dashed yellow lines) objects. C, D, The stimulus configurations across Experiment 3. In half of the runs (C, left and right), we used temporally varying noise patterns to partially fill the area in the focal points of the ellipse objects to add local discontinuity. In the other half of the runs (D, left and right), we used the same noise pattern to fill the entire area of the ellipse objects. Similar to the previous tests, the global configuration only changed between (not within) runs.
Each subject participated in two runs. Between runs, the entire stimulus was rotated by 45°, resulting in a change in global properties of the ellipses' focal points across left and right hemifields (as shown in Fig. 1A,B). Specifically, in one run, adjacent cross-hemisphere focal points belonged to the same object. In the other run, they were positioned in two different objects. In Experiment 1 (and in Experiment 2, described below), the locations of the focal points were not stimulated. The order of the runs was counterbalanced across subjects.
As a control for the attention of subjects during the experiment, subjects were instructed to look at a centrally presented white fixation target (subtending 0.15° × 0.15°) and to report any change in the shape of the fixation target (from circle to square, or vice versa every 2-7 s) by immediately pressing a key on an MRI-compatible keypad. During the experiments, subjects received no feedback about the accuracy of their responses.
For the 11 subjects who only participated in Experiment 1, but not Experiment 2, we also collected one additional run (in the same scan session) during which subjects were asked to close their eyes but stay awake without any explicit task (i.e., we collected one run of resting-state fMRI). The duration of this resting-state run was the same as the task runs (i.e., 240 s). The sequence of runs was counterbalanced between subjects.
Experiment 2
This experiment was designed to increase the subject's attention to the fixation task and to reduce the amount of attention to the periphery (compared with Experiment 1). During these scans, stimuli were identical to those used in Experiment 1, but here subjects were required to look at a red fixation target (subtending 0.15° × 0.15°) and to report any change in color intensity of the target (dark-red to light-red, or vice versa). The amount of change in color intensity was adjusted dynamically for each subject, using a staircase method, to keep their change-detection accuracy at ∼70% (see Results). Here again, the sequence of runs was counterbalanced. All other details were identical to Experiment 1.
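The adaptive adjustment can be illustrated with a transformed up-down staircase. The source does not state which rule was used; the sketch below assumes a 2-down/1-up rule, which converges near 70.7% correct, and the class name, starting level, step size, and bounds are all hypothetical.

```python
class TwoDownOneUpStaircase:
    """Hypothetical 2-down/1-up staircase over the size of the color change."""

    def __init__(self, start=0.5, step=0.05, lo=0.05, hi=1.0):
        self.level = start        # current intensity difference (arbitrary units)
        self.step = step
        self.lo, self.hi = lo, hi
        self._correct_run = 0     # consecutive correct responses at this level

    def update(self, correct):
        """Adjust the level after one trial and return the new level."""
        if correct:
            self._correct_run += 1
            if self._correct_run == 2:      # two in a row: make the task harder
                self.level = max(self.lo, self.level - self.step)
                self._correct_run = 0
        else:                               # any miss: make the task easier
            self.level = min(self.hi, self.level + self.step)
            self._correct_run = 0
        return self.level
```

Calling `update(True)` or `update(False)` after each response yields the intensity difference to use on the next change event.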
Experiment 3
Here we tested the impact of local discontinuities on the level of correlation between evoked fMRI activation. Subjects were presented with similar elliptical objects, as used in Experiments 1 and 2, with one exception. Here, all shapes were filled either partially (i.e., only within circular regions centered on the focal points; R < 2.5°) (Fig. 1C) or completely (Fig. 1D) with random-noise patterns composed of binary-valued black-and-white noise that was spatially and temporally independent and was updated every 0.14 s. In contrast to Experiments 1 and 2, here stimulation was presented within the focal points. Similar to Experiment 1, subjects were instructed to look at a centrally presented fixation target and to report its shape change by immediately pressing a key on a keypad. All other details were identical to Experiment 1.
Retinotopy mapping
For each subject, at the end of the experimental session, during separate runs relative to those used for the main tests (see above), we localized the cortical retinotopic representations of (1) the focal points of the ellipse stimuli, used as ROIs in our data analysis (see below) and (2) the horizontal and vertical meridians used to functionally define the V1 borders and topographic layout. For mapping these locations, we used a conventional block-design paradigm, during which subjects were presented with contrast-reversing scaled checkerboards flashing at 4 Hz that were masked to be (1) limited to the region around the focal points (R < 2.5°) (Fig. 2A, right), (2) limited to the area outside the focal point region (R > 2.5°) (Fig. 2A, left), (3) along the horizontal meridian (i.e., ± 15 angular degrees), or (4) along the vertical meridian (i.e., ± 15 angular degrees) against a uniform gray background.
Figure 2. For each subject, the ROIs that represented the focal points of the ellipse objects were localized based on retinotopy mapping. A, The stimuli used for retinotopy mapping of the focal points. The two stimulus configurations were presented in different blocks; and in each block, the stimulus contrast reversed with 8 Hz frequency. B, The significance (p value) of activity map for one individual subject evoked by contrasting the response to the stimuli shown in A (left – right). The location of ROIs (indicated by white arrows) was defined based on their significant (p < 0.05) response to this contrast. The border of area V1 (dashed black lines) was localized by contrasting the response evoked by stimulating horizontal versus vertical meridians.
Each subject participated in 6 runs for retinotopy mapping. Each run lasted 216 s and consisted of 8 blocks (i.e., 2 blocks per stimulus condition), and each block lasted 24 s. Each run started and ended with 12 s of neutral gray background presentation. The sequence of blocks within a run was pseudo-randomized with the constraint that, within a run, stimulus conditions could not be repeated immediately. Subjects were asked to fixate on the fixation target and to report when the color of fixation target changed (i.e., red to green or vice versa) by immediately pressing a key on a keypad.
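The pseudo-randomization constraint can be sketched as rejection sampling: shuffle the eight blocks and keep the first order in which no condition follows itself. The source does not describe how the sequences were actually generated; the function name, condition labels, and seed below are hypothetical.

```python
import random

def block_sequence(conditions, repeats=2, rng=None, max_tries=10_000):
    """Shuffle the blocks until no condition occurs twice in a row."""
    rng = rng or random.Random()
    blocks = [c for c in conditions for _ in range(repeats)]
    for _ in range(max_tries):
        rng.shuffle(blocks)
        if all(a != b for a, b in zip(blocks, blocks[1:])):
            return list(blocks)
    raise RuntimeError("no valid sequence found")

# Hypothetical labels for the four retinotopy conditions.
CONDITIONS = ["focal_points", "surround", "horizontal_meridian", "vertical_meridian"]
sequence = block_sequence(CONDITIONS, rng=random.Random(0))
```

With four conditions repeated twice, roughly one shuffle in three satisfies the constraint, so the retry limit is rarely approached.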
Apparatus
Stimulus presentation was controlled using MATLAB (The MathWorks) and Psychtoolbox (Brainard, 1997). Stimuli were back-projected on a translucent projection screen, using a Sharp XG-P25 video projector (1024 × 768 pixels resolution, 60 Hz refresh rate). Subjects were able to see the stimuli through a mirror mounted on the housing of the head coil.
Training
Before the functional scans, subjects were familiarized with the stimuli and task. Subjects practiced controlling their eye movements for at least 90 s. During this practice, in contrast to the actual test, the elliptical objects rotated around the screen in increments of 45° to act as a distracter for the fixation task. Subjects were explicitly instructed to avoid shifting their gaze toward the elliptical objects and to only focus on the shape of the fixation target. They were also informed that the movement of objects is limited to the practice runs, and they should not expect any peripheral change during the actual runs. During the practice, one of the experimenters sat close to the subject and monitored the eye movements visually. The volunteers continued to practice their fixation inside the scanner. The experiment only started when the subjects were confident about their fixation stability.
It is also noteworthy that the chance of eye movement is higher when stimuli first appear on screen. To avoid this transient period of eye movements, and to eliminate the impact of stimulus onset on the fMRI data, we initiated the fMRI data collection ∼30 s after the stimulus onset. These procedures reduce the chance of involuntary eye movement during the fMRI data acquisition.
Imaging
MRI data were collected with a 3T TimTrio whole-body human MRI scanner (Siemens Healthineers), with the standard vendor-supplied 32-channel head coil array. fMRI data were acquired using standard 2D gradient-echo BOLD-weighted EPI (TR = 3000 ms, TE = 32 ms, flip angle = 90°, in-plane acceleration factor R = 3, nominal echo spacing = 0.9 ms, no partial Fourier, voxel size = 1.2 × 1.2 × 1.2 mm³, 41 slices, and FOV = 192 × 192 × 49.2 mm³). Each run of the main experiment and the retinotopy mapping experiment consisted of 80 and 72 TRs, respectively. The slices were positioned in an oblique-axial orientation centered on and parallel to the long axis of the calcarine sulcus, such that V1 was included in the fMRI acquisition.
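As a quick consistency check, the run durations and slab thickness reported in this section follow from the acquisition parameters. The snippet below is only an arithmetic sketch; the variable names are illustrative.

```python
import math

TR_S = 3.0                    # repetition time in seconds (TR = 3000 ms)
SLICE_THICKNESS_MM = 1.2
N_SLICES = 41

main_run_s = 80 * TR_S        # main-experiment runs: 80 TRs
retinotopy_run_s = 72 * TR_S  # retinotopy runs: 72 TRs
slab_mm = N_SLICES * SLICE_THICKNESS_MM

assert main_run_s == 240.0
# 8 blocks of 24 s plus 12 s of gray background at each end of the run.
assert retinotopy_run_s == 8 * 24 + 2 * 12
# Floating-point product, so compare with a tolerance.
assert math.isclose(slab_mm, 49.2)
```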
For all subjects, at the beginning of the session, we collected anatomic reference data using a standard 3D T1-weighted multiecho MPRAGE pulse sequence with protocol parameter values: TR = 2530 ms, four echoes with TE1 = 1.64 ms, TE2 = 3.5 ms, TE3 = 5.36 ms, TE4 = 7.22 ms, TI = 1200 ms, flip angle = 7°, echo spacing = 10.3 ms, acceleration factor = 2, no partial Fourier, bandwidth = 651 Hz/pix, voxel size = 1.0 × 1.0 × 1.0 mm³, FOV = 256 × 256 × 176 mm³.
Data analysis
Functional and anatomic MRI data were preprocessed and analyzed using FreeSurfer and FS-FAST (version 6.0; http://surfer.nmr.mgh.harvard.edu) (Fischl, 2012). For each subject, cortical surfaces, including the “white matter surface” at the gray matter/white matter interface (deep) and the “pial surface” at the gray matter/CSF interface (superficial), were reconstructed based on the T1-weighted anatomic data, after which inflated representations were generated for visualization (Dale et al., 1999; Fischl et al., 1999, 2002). All functional images were rigidly aligned to the subject's own anatomic reference scan using Boundary-Based Registration (Greve and Fischl, 2009) with 6 degrees of freedom and then were corrected for motion. For the data collected during the main tests, no spatial smoothing (i.e., 0 mm FWHM), no HRF deconvolution, and no temporal filtering were applied; the latter was omitted because no slow temporal drifts were detected in the data.
To test whether the change in the fMRI co-fluctuations is detectable in both deep and superficial cortical layers, as expected from the intercolumnar synchrony (Gray et al., 1989), we analyzed fMRI activation separately at the outermost and innermost borders of the cortical gray matter as follows. First, for each subject, surface reconstructions corresponding to the gray-white interface (“deep”) and the gray-CSF interface (“superficial”) were generated automatically based on the subject's own high-resolution structural scans (see above) (Dale et al., 1999; Fischl et al., 1999, 2002). Second, fMRI activity in each functional voxel intersecting these two surfaces was projected onto the corresponding vertices of the surface mesh. Then statistical analysis was performed on the corresponding fMRI activity (see below).
For the retinotopy mapping runs, the acquired fMRI data were spatially smoothed using a surface-based 2D Gaussian filter with a 1.5 mm FWHM. A standard hemodynamic response model based on a γ function was convolved with the stimulus timing to generate a task regressor for the fMRI signal, which was used in a voxel-wise standard univariate GLM framework to estimate the significance of the BOLD response. The resultant significance (i.e., p value) maps were projected onto the subject's cortical surface reconstructions (Fig. 2B) (also see below).
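The construction of a task regressor and its use in a univariate fit can be sketched as follows. This is a minimal illustration rather than the FS-FAST implementation: the source only specifies "a γ function", so the delay, time constant, and exponent are assumed values, the block onsets are hypothetical, and the one-regressor least-squares fit stands in for the full GLM.

```python
import math

TR = 3.0  # seconds, as in the acquisition protocol

def gamma_hrf(t, delay=2.25, tau=1.25, alpha=2.0):
    """Gamma-shaped response; delay, tau, and alpha are assumed values."""
    if t <= delay:
        return 0.0
    x = (t - delay) / tau
    return x ** alpha * math.exp(-x)

def task_regressor(boxcar, tr=TR, hrf_len_s=30.0):
    """Convolve a 0/1 stimulus boxcar (one value per TR) with the HRF."""
    kernel = [gamma_hrf(i * tr) for i in range(int(hrf_len_s / tr) + 1)]
    scale = sum(kernel)
    kernel = [k / scale for k in kernel]          # unit-sum kernel
    return [sum(boxcar[i - j] * kernel[j] for j in range(len(kernel)) if i - j >= 0)
            for i in range(len(boxcar))]

def glm_beta(signal, regressor):
    """Least-squares slope for signal = beta * regressor + intercept."""
    n = len(signal)
    mx, my = sum(regressor) / n, sum(signal) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(regressor, signal))
    sxx = sum((x - mx) ** 2 for x in regressor)
    return sxy / sxx

# One condition in a 72-TR run: two 8-TR (24 s) blocks at hypothetical onsets.
boxcar = [0] * 72
for onset in (4, 36):
    boxcar[onset:onset + 8] = [1] * 8
regressor = task_regressor(boxcar)
```

In a full analysis, one such regressor per stimulus condition would enter the design matrix, and contrasts between the fitted weights would give the significance maps.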
ROI definition
The ROIs included cortical representations of elliptical object focal points within V1, detected based on the retinotopic mapping of these locations within each subject (see Visual stimuli and procedure). Specifically, for each subject, the activity map evoked by contrasting the response to stimulation of focal points versus the surrounding regions (Fig. 2A) was thresholded (p < 0.05). Those vertices that showed a significant response (p < 0.05) to stimulation of focal points were used to define the ROI. The individual focal points could then be identified uniquely based on the known retinotopic layout of V1 because (1) in each hemisphere, the activation map represented the stimuli presented within the contralateral visual fields and (2) the upper-to-lower visual fields are represented within the ventral-to-dorsal portions of V1, respectively (Tootell et al., 1998).
On average, each ROI consisted of 38.2 ± 4.0 (mean ± SEM) vertices (i.e., 22.3 ± 2.4 mm²). An application of two-way repeated measures ANOVA [hemisphere (left vs right) and side (dorsal vs ventral)] to the measured number of vertices per ROI (measured in 29 subjects) did not yield any significant effect of hemisphere (F(1,28) = 0.15, p = 0.70), side (F(1,28) = 0.06, p = 0.80), and/or interaction between them (F(1,28) = 0.39, p = 0.53). A similar result (i.e., no significant difference, p > 0.33) was also found when the same test was applied to the size of ROI measured in mm².
Statistical analysis
Motion-corrected fMRI data were spatially averaged within each ROI separately. Then, the level of correlation (i.e., r value) between adjacent ROIs was calculated, based on using all collected time points (80 TRs; see Imaging section), using a Pearson test of correlation. To make sure that the sampled r values have a normal distribution, all measured r values were transformed to z values using the Fisher transformation.
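The three steps in this paragraph (spatial averaging within an ROI, Pearson correlation across the 80 time points, and the Fisher transformation) can be sketched in plain Python. The function names and the data layout (one list of samples per vertex) are assumptions made for illustration.

```python
import math

def roi_mean_timecourse(vertex_timecourses):
    """Average the time courses of all vertices in one ROI (one value per TR)."""
    n_vertices = len(vertex_timecourses)
    return [sum(samples) / n_vertices for samples in zip(*vertex_timecourses)]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length time courses."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r):
    """Fisher transformation: z = atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)
```

For a pair of adjacent ROIs, `fisher_z(pearson_r(roi_mean_timecourse(a), roi_mean_timecourse(b)))` gives the z value that enters the group statistics.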
Unless otherwise mentioned, for each individual subject, z values measured across dorsal/ventral cross-hemisphere ROIs and left/right within-hemisphere ROIs were averaged to increase the signal-to-noise ratio. In other words, we only used two z values in our graphs and in our statistical analysis. To test the vertical asymmetry in the level of correlation (as expected from human behavior) (Previc, 1990), we also reported and compared the z values measured in dorsal and ventral ROIs.
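The reduction to two z values per subject and run can be sketched as below. The ROI labels and the dictionary layout are hypothetical; the pairing follows the text, with cross-hemisphere pairs averaged over dorsal and ventral ROIs and within-hemisphere pairs averaged over left and right hemispheres.

```python
from statistics import mean

# Hypothetical ROI labels: hemisphere (L/R) x V1 portion (dorsal/ventral).
CROSS_PAIRS = [("L_dorsal", "R_dorsal"), ("L_ventral", "R_ventral")]
WITHIN_PAIRS = [("L_dorsal", "L_ventral"), ("R_dorsal", "R_ventral")]

def summarize_subject(z_by_pair):
    """Reduce the four adjacent-pair z values of one run to two values."""
    return {
        "cross": mean(z_by_pair[p] for p in CROSS_PAIRS),
        "within": mean(z_by_pair[p] for p in WITHIN_PAIRS),
    }
```

For the vertical-asymmetry analysis, the two cross-hemisphere entries would instead be kept separate (dorsal versus ventral) rather than averaged.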
To examine the significance of independent parameters in each experiment, we used repeated-measures ANOVA. Repeated-measures ANOVA is particularly susceptible to the violation of sphericity assumption, caused by the correlation between measured values and unequal variance of differences between experimental conditions. To address this problem, when necessary (determined using a Mauchly test), results were corrected for violation of the sphericity assumption, using the Greenhouse–Geisser method.
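For reference, the Greenhouse–Geisser estimate of ε can be computed from the double-centered covariance matrix of the repeated conditions. The sketch below uses the standard textbook formula for a single within-subject factor; it is not taken from the source, and the function name and data layout are assumptions.

```python
def greenhouse_geisser_epsilon(data):
    """Estimate epsilon; rows are subjects, columns are k repeated conditions."""
    n, k = len(data), len(data[0])
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    # Sample covariance matrix of the k conditions.
    cov = [[sum((row[i] - col_means[i]) * (row[j] - col_means[j]) for row in data) / (n - 1)
            for j in range(k)] for i in range(k)]
    # Double-center the (symmetric) covariance matrix.
    row_means = [sum(cov[i]) / k for i in range(k)]
    grand = sum(row_means) / k
    centered = [[cov[i][j] - row_means[i] - row_means[j] + grand for j in range(k)]
                for i in range(k)]
    trace = sum(centered[i][i] for i in range(k))
    sum_sq = sum(v * v for row in centered for v in row)
    return trace ** 2 / ((k - 1) * sum_sq)
```

The estimate lies between 1/(k − 1) and 1, and the ANOVA degrees of freedom are multiplied by it. For a factor with only two levels the estimate is exactly 1, so the correction leaves the degrees of freedom unchanged.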
Data availability
Data will be shared on request.
Results
First, we tested whether the global stimulus configuration affects the level of correlation of fMRI fluctuations measured at the cortical representations of local features. Specifically, during Experiment 1, we tested whether the level of cross-hemisphere co-fluctuations increased when the ROIs in the visual cortex represented parts of the same compared with different objects (see Materials and Methods). This test was applied to fMRI measured in deep and superficial layers to clarify whether changes in the level of co-fluctuation are detectable across cortical layers, as expected from V1 columnar organization and as shown by others in animals (Gray et al., 1989). During the measurements, subjects performed a shape-change detection task with the fixation target (response accuracy 95.0 ± 1.6%, mean ± SE).
We measured the correlation between spontaneous fMRI fluctuations at the representation of the stimulus focal points in cortical area V1 (Fig. 3A). These representations were localized retinotopically for each subject in the same scan session (see Materials and Methods; Fig. 2). Consistent with our hypothesis, we found stronger correlations between fluctuations within cortical ROIs that represented focal points from the same object relative to the correlations between fluctuations within ROIs that represented focal points from different objects. To test the statistical significance of this effect, we used a three-way repeated-measures ANOVA with focal-points grouping (FPG) of the same versus different objects, ROI-side grouping of cross- versus within-hemisphere, and grouping by superficial versus deep cortical layers (Table 1; Fig. 3B). This yielded a significant effect of the FPG (p < 0.01) and a significant FPG × ROI-side interaction (p < 10⁻³). The observed cross-hemispheric coherence is consistent with findings based on single-cell recordings (Engel et al., 1991) and EEG (Rose et al., 2006), showing that global stimulus configuration has a significant impact on the level of correlation between activity evoked across hemispheres.
Table 1. The results of three-way repeated-measures ANOVA applied to the results of Experiment 1
Figure 3. Global stimulus configuration impacts the level of correlation between fMRI fluctuations evoked across different V1 subregions. A, The level of correlation between the fMRI fluctuations measured from the cross-hemisphere (left) and within-hemisphere (right) ROIs, in superficial (red) and deep (blue) cortical layers. In both cortical layers, the level of correlation was higher when the cross-hemisphere ROIs represented those focal points that belonged to the same objects rather than different objects (Table 1). Error bars indicate 1 SEM. B, The impact of global configuration for each individual subject by subtracting the level of correlation between adjacent ROIs when they were contained within different objects from their level of correlation when they were contained within the same object. We found stronger correlation when the cross-hemisphere ROIs were contained within the same compared with different objects in 15 (of 18) individual subjects. Each point in B represents data from 1 subject, measured separately for cross- versus within-hemisphere ROIs, individually for voxels sampling from superficial (red) versus deep (blue) cortical depths.
Importantly, the absence of an impact of global configuration on within-hemisphere co-fluctuations is consistent with our hypothesis and could be anticipated from the separation distance along the cortex between the within-hemispheric ROIs (10.6 ± 1.6 mm geodesic distance). In particular, single-cell studies have shown that the global configuration of the stimulus leads to coherent neuronal activity only up to cortical distances of 7 mm (Gray et al., 1989; Engel et al., 1991). This lack of an effect on within-hemisphere co-fluctuations, plus the extensive training before the tests (see Materials and Methods), also weakens the possibility that the effect of global configuration is due to eye movements. To clarify, the eye movement pattern is not expected to vary between within- and cross-hemisphere ROIs since they are positioned at equidistant locations (Fig. 2A).
We further found a significant effect of cortical depth, which indicates a higher correlation observed within superficial compared with deep cortical layers (p < 10⁻⁵). This likely results from the stronger gradient-echo BOLD response found in voxels near large veins at the pial surface compared with voxels near the white matter (Koopmans et al., 2010; Polimeni et al., 2010; De Martino et al., 2013; Nasr et al., 2016). However, it can also result, in part, from the stronger γ-band synchrony in more superficial compared with deeper cortical layers (Buffalo et al., 2011) (see Discussion).
Despite this difference in the overall level of correlation between deep versus superficial cortical layers, we did not find any significant FPG × cortical depth interaction (p = 0.81). The absence of this interaction suggests that larger BOLD signal changes, expected to be observed in the superficial layers, do not necessarily lead to a stronger FPG effect. Thus, changes in the level of co-fluctuation are not associated with changes in the amplitude of the BOLD signal, or at least this association is not linear.
All told, correlations between ROIs that represent focal points from the same object exceed the correlations between ROIs that represent focal points from different objects. Further, the effect of global stimulus configuration on correlations between adjacent ROIs is stronger for ROIs positioned across hemispheres than for adjacent ROIs within the same hemisphere.
Previous behavioral studies have shown that the encoding of global stimulus configuration is stronger within the lower compared with upper visual field (Previc, 1990; Levine and McAnany, 2005; Nasr and Tootell, 2020). We tested whether this effect is reflected in the level of cross-hemisphere correlation between the focal-point ROIs in V1 (Fig. 4). We found a stronger cross-hemisphere correlation between dorsal ROIs, which represent the lower visual field, compared with the ventral ROIs, which represent the upper visual field. Further, the impact of global configuration on the level of cross-hemisphere correlation was stronger in dorsal compared with ventral ROIs. A three-way repeated-measures ANOVA, similar to that used above, yielded a significant effect of the FPG (p < 10⁻³) and ROI location (p < 0.01), along with a significant FPG × ROI location interaction (p = 0.01) (Table 2). These results suggest that the vertical bias in global configuration encoding is at least partly reflected in the level of correlation between cross-hemisphere ROIs. Notably, the main effect of ROI location in this analysis may be (at least partly) because of the shorter distance between dorsal (compared with ventral) ROIs and the surface receive coil array (e.g., see Fig. 2B), which is expected to affect the noise level.
Table 2. The results of three-way repeated-measures ANOVA applied to compare the impacts of global configuration in dorsal versus ventral ROIs (Experiment 1)
Figure 4. The impact of global configuration on the ROIs within dorsal and ventral cortical regions. Global configuration of the stimuli had a stronger impact on the dorsal ROIs (left), which represented the lower visual field, compared with the ventral ROIs (right), which represented the upper visual field (see also Table 2). Other details are similar to Figure 3A.
We further tested whether the aforementioned difference in correlation may be explained as an increase in the level of correlation when cross-hemisphere ROIs were within the same object, as opposed to a decrease in the level of correlation when ROIs were within different objects. In a subset of subjects (n = 11) for whom resting-state fMRI data were acquired, we compared the measured correlation levels during the stimulus presentation relative to those measured during resting state (with eyes closed), which can be viewed as a baseline condition (Fig. 5). A two-way repeated-measures ANOVA showed a significant effect of FPG (p = 0.01) but no effect of cortical depth (p = 0.60) and no FPG × cortical depth interaction (p = 0.73). The same conclusions were reached from four separate t tests (Table 3). These results show that there is a “decrease” in the level of correlation relative to the resting-state condition (i.e., baseline) when the ROIs represented different objects.
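The baseline-relative comparison can be sketched as a per-subject subtraction followed by a one-sample t statistic against zero. The source does not specify the exact form of the t tests, so this is an assumed reading; the function names are hypothetical.

```python
import math
from statistics import mean, stdev

def relative_to_rest(z_stim, z_rest):
    """Per-subject difference: stimulus-run z minus resting-state z."""
    return [s - r for s, r in zip(z_stim, z_rest)]

def one_sample_t(values, popmean=0.0):
    """Return (t, degrees of freedom) for H0: mean(values) == popmean."""
    n = len(values)
    t = (mean(values) - popmean) / (stdev(values) / math.sqrt(n))
    return t, n - 1
```

A negative t for the "different objects" condition would correspond to a correlation below the resting-state baseline. Converting t to a p value requires the t distribution, which is available in standard statistics packages.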
Table 3. The impact of global configuration on fMRI fluctuations when the correlations were measured relative to the correlation during the resting-state condition (Experiment 1)
Figure 5. The global configuration impact can also be seen on the normalized level of correlation between the adjacent cross-hemisphere ROIs. Here, we show the level of correlation between the adjacent cross-hemisphere ROIs when measured relative to their level of correlation during the resting-state condition (with eyes closed) (see Table 3). The negative values indicate that the level of correlation was higher during the resting state compared with when subjects were looking at stimuli on the screen. Other details are similar to Figure 3A.
Subsequently, in Experiment 2, we tested whether attentional modulation influences the impact of stimulus configuration on correlated fMRI co-fluctuations measured in V1. According to previous findings in monkeys (Buffalo et al., 2011; Bosman et al., 2012) based on more invasive techniques, we expected to see a weak to no effect of attention on the level of correlation between ROIs located within the primary visual cortex. Notably, a previous fMRI study in humans (Müller and Kleinschmidt, 2003) suggested that object-based attention may affect the amplitude of the BOLD response in unattended parts of an object. However, as mentioned above, if object-based attention influences the BOLD response within stimulated voxels, it does not necessarily follow that this would result in a correspondingly stronger BOLD correlation between these voxels.
Here, we asked a subset of individuals who participated in our first test (n = 7) to perform a more demanding fixation task during which they were required to report any color change of the fixation target (see Materials and Methods). By controlling the level of color change, using a staircase method, we increased the task difficulty (i.e., by making it more attention demanding). These subjects' response accuracy dropped significantly (t test; t(6) = 6.71; p < 10⁻³), from 94.5 ± 1.9% to 73.9 ± 3.6%, between the original shape-change detection task and the more demanding color-change detection task.
Despite the higher attention demand during the adaptive color-change detection task, which required more attention toward the center of the screen (i.e., farther from the ellipse objects), the correlations of fMRI fluctuations again showed a strong impact of stimulus configuration, comparable to that observed during the less demanding task of shape-change detection (Fig. 6). We checked the statistical significance of the findings using a four-way repeated-measures ANOVA, similar to that above (Tables 1 and 2) but adding the task contingency of adaptive color versus shape change. This yielded a significant effect of the FPG (p < 0.01) and a significant FPG × ROI location interaction (p = 0.03), consistent with the results above (Table 4). However, it did not yield a significant effect of task (p = 0.57) or a significant task × FPG interaction (p = 0.33). These results suggest that changing the difficulty level of the central fixation task does not have a significant impact on the effect of stimulus configuration in V1. However, further tests are required to determine whether fMRI fluctuations could be influenced by directing attention toward the peripheral objects (see Discussion).
The results of four-way repeated-measures ANOVA applied to test the interaction between the impacts of attention demand and global configuration on fMRI fluctuations (Experiment 2)
Attention demand does not change the impact of global configuration on fMRI co-fluctuations. A, The impact of global configuration on fMRI co-fluctuations in cross- and within-hemisphere ROIs. Subjects included a subset of those individuals who participated in Experiment 1 (n = 7; Fig. 3) (see Materials and Methods). They were instructed to perform a relatively low attention demand task for the fixation target. fMRI fluctuations were more correlated when the ROIs represented the same compared with different objects. B, The fMRI co-fluctuations when the same subjects (during the same scan session) were instructed to perform a significantly higher attention demand task, which required more attention to the center of the screen (i.e., farther from the ellipse objects). The other aspects of the stimuli remained the same between the two tasks. Despite the significant difference between subjects' level of attention across the two tasks, they still showed a statistically equivalent change in the level of fMRI co-fluctuations because of the change in global configuration (Table 4). However, the difference in the overall level of correlation across cortical layers was more apparent in the low attention demand compared with the higher attention demand task. All other details are similar to Figure 3A.
These control data also indicate a larger effect of cortical depth level during a more attention-demanding task (Fig. 6). Specifically, we found a larger correlation between the fMRI fluctuations measured within superficial compared with deep cortical depth level as the attentional demand increased. This phenomenon was indicated by the significant task × cortical depth level interaction (p = 0.04). Thus, consistent with the findings based on more invasive techniques in nonhuman primates (Buffalo et al., 2011; Bosman et al., 2012), the relationship between the activity measured across cortical depth levels is not always the same and may vary with parameters, such as the task and the attentional demand (see below and Discussion).
Furthermore, these results argue against the possibility that fMRI co-fluctuations between ROIs were caused by eye movements. Specifically, with an increase in the level of central attention in Experiment 2 compared with Experiment 1, one expects a decrease in the level of (involuntary) eye movements toward the periphery. If those involuntary eye movements were responsible for an increase in the level of fMRI co-fluctuations, these co-fluctuations would be expected to decrease in Experiment 2 compared with Experiment 1. Rather, we found comparable effects between the two tasks. Thus, it appears unlikely that eye movements are the cause of the observed correlations between cross-hemisphere ROIs.
In Experiment 3, as a control, in a separate group of subjects (n = 11) we also tested whether the global configuration versus local discontinuity (e.g., the edges of the white elliptical contour) influences the level of correlation between fMRI fluctuations measured in V1 cortical subregions. Here, we used a new set of stimuli that included local discontinuities that are generated by spatiotemporal-noise patterns presented within the elliptical objects with partially filled objects, where only the circular focal points were filled (Fig. 1C), or fully filled objects, where the entire ellipse was filled (Fig. 1D). Here again, the global stimulus configuration varied between runs by rotating the overall stimulus by 45° (Fig. 1C,D). As before, subjects showed an almost perfect performance in the attention-demanding shape-change detection task (92.4 ± 2.6%).
The overall pattern of results (Fig. 7) with the partially filled and fully filled objects remained the same as with the empty objects (Figs. 3 and 6). We again found stronger correlations between fMRI fluctuations measured within cross-hemisphere ROIs when they represented the same rather than different objects. We applied a four-way repeated-measures ANOVA, as above (Tables 1 and 2), but adding fully versus partially filled ellipse type as an independent parameter. The results showed a significant FPG × ROI-side interaction (p = 0.02) without any significant effect of ellipse type (p = 0.35) (Table 5). These control results imply that global configuration, but not local stimulus discontinuity, influences the cross-hemisphere correlations.
The results of four-way repeated-measures ANOVA applied to test the interaction between the impacts of local discontinuities and global configuration on fMRI fluctuations (Experiment 3)
The change in the level of correlations between fMRI fluctuations is because of the change in global configuration, not the local discontinuities. A, B, The level of correlation between the fMRI fluctuations measured within adjacent ROIs either from across the two hemispheres (left columns) or within a hemisphere (middle columns). In superficial cortical layers, the level of correlation was higher when the adjacent cross-hemisphere ROIs represented focal points that belonged to the same objects rather than different objects (Table 5). This effect was weaker when measured within deep (compared with superficial) cortical layers. In each panel, the right column represents the impact of global configuration and local discontinuity for each individual subject, measured as described for Figure 3B. We found a stronger correlation when the cross-hemisphere ROIs represented the same compared with different objects in 8 and 9 (of 11) subjects for filled and partially filled stimuli, respectively (see also Fig. 3). All other details are similar to Figure 3.
We also found a significant FPG × cortical depth level interaction (p < 0.01) as a result of the stronger impact of the FPG in superficial compared with deep cortical depth levels. Thus, consistent with the previous test (see above; Fig. 6), here again we found that the relationship between the activity measured across different cortical depth levels is not constant. Rather, it may also vary with stimulus configuration, in addition to the task (see Discussion).
Discussion
We have presented evidence of the impact of global stimulus configuration on fMRI co-fluctuations measured within fine-scale neural structures across human V1. Our findings show that the level of correlation between activity within V1 subregions is higher when they represent the same rather than different objects. This phenomenon was detected regardless of the subject's level of attention, suggesting that local mechanisms, rather than top-down attentional modulations, are responsible for this correlation. Further, this effect was stronger in the dorsal cortical regions that represent the lower visual fields compared with the ventral regions that represent the upper visual fields. This is consistent with observations of superior global configuration encoding in the lower versus upper visual fields seen in humans.
Impact of attentional modulation
Attention plays a large role in the response of extrastriate visual areas, including areas V4 and MT, in which neurons have relatively large receptive fields and bias their response toward attended objects (Qian and Andersen, 1994; Reynolds and Desimone, 1999). Directly related to our findings, Buffalo et al. (2011) have reported that γ-band synchrony in the superficial layers of monkey V2 and V4 cortices was enhanced by attention. However, the same group reported that the attentional enhancement of γ-band synchrony in V1 appeared to be weaker and inconsistent across the two tested animals (Buffalo et al., 2011). A later study also suggested that the impact of attention may be more apparent as a shift in the peak frequency of γ-band synchrony (Bosman et al., 2012).
Consistent with previous findings in humans and nonhuman primates, we found that, even when subjects directed their attention away from the visual objects, the level of co-fluctuations between the V1 subregions that represented the same objects remained intact (Fig. 6). We only found a significant interaction between task and cortical depth, which indicated a larger overall difference in correlations measured across cortical depths during the more (compared with the less) attention-demanding task. Thus, it is unlikely that the attentional modulation is solely responsible for the co-fluctuations between V1 subregions.
However, three points need to be considered regarding the interpretation of our findings. First, although we showed a significant drop in subjects' response accuracy during the task that demanded greater attention, this does not rule out the possibility that there were residual attentional resources allocated to processing the elliptical objects. A minimum level of attention may still be necessary for generation of fMRI co-fluctuations between V1 subregions that represented different parts of the stimuli. However, this possibility is not incompatible with our conclusion that attentional modulation is unlikely to be the sole mechanism that underlies the fMRI co-fluctuations. It is noteworthy that the classical evidence of synchronous activity was recorded in anesthetized animals in which the level of attention can be considered minimal.
Second, since the correlation was measured over a prolonged time interval (i.e., 240 s), we could not test the possibility that the impact of attention varied with time. Specifically, the impact of attention could be limited to the early interval after the stimulus onset and could then become insignificant, although studies in nonhuman primates, based on invasive methods with high temporal precision, still did not find any evidence for the impact of attention on γ-band synchrony within V1 (Buffalo et al., 2011).
Third, these findings do not rule out the possibility that feedback projections from the extrastriate regions, in which neurons have larger receptive fields (Smith et al., 2001; Dumoulin and Wandell, 2008), may play a role in generation of fMRI co-fluctuations within V1. Unfortunately, our limited imaging FOV did not allow us to measure fMRI activity beyond V1. This intriguing possibility can, however, be tested in future studies.
Vertical asymmetry in the impact of global stimulus configuration
Humans perceive visual stimuli more “globally” when stimuli are presented within the lower visual field compared with the upper visual field (Previc, 1990; Christman, 1993; Levine and McAnany, 2005). This phenomenon is also reflected in the stronger sensitivity to low spatial frequency components, crucial for global configuration encoding (Shulman et al., 1986; Shulman and Wilson, 1987; Lagasse, 1993; Robertson et al., 1993; Flevaris et al., 2010). In particular, low spatial frequency features are encoded more accurately when presented within the lower, rather than the upper, visual fields (Skrandies, 1987; Niebauer and Christman, 1998; Thomas and Elias, 2011; Nasr and Tootell, 2020). Recently, it has been shown that this vertical asymmetry is likely caused by: (1) higher sensitivity of near- compared with far-preferring cortical clusters to low spatial frequency components (Nasr and Tootell, 2020) and (2) more frequent distribution of near-preferring neural clusters within the dorsal, compared with ventral, portion of extrastriate visual cortical areas V2, V3, and V3A (Nasr and Tootell, 2018) that preferentially represent the lower, compared with the upper, visual field.
Here, we extended those prior findings by providing evidence of sensitivity to vertical position in the coding of the global configuration of a stimulus by V1. Despite the fundamental differences between the two phenomena (i.e., activity correlation measured here vs enhanced stimulus preference shown previously), it is not clear whether they are fully distinct or they are two manifestations of the same phenomenon. To clarify, the majority of input to extrastriate visual areas is from V1 (Felleman and Van Essen, 1991), and more synchronous brain activity (in V1) may result in stronger fMRI signaling in the extrastriate areas (Niessing et al., 2005). However, if true, one may expect a stronger co-fluctuation in interblob (compared with blob) regions of V1 that send a stronger input to thick stripes in V2 cortex (Federer et al., 2009, 2013) that comprise near- and far-preferring neural clusters (Nasr and Tootell, 2018). Testing this possibility requires a higher spatial resolution beyond what was achieved in this study.
Are V1 cortical co-fluctuations enough to avoid illusory conjunction?
Our results indicate that activity co-fluctuations remain intact even when attention is directed away from the objects. However, at the behavioral level, illusory conjunction happens more frequently among unattended compared with attended objects (Treisman and Schmidt, 1982). Thus, it appears that encoding through co-fluctuations in neural activity is not the only mechanism in the brain that can overcome the binding problem. Rather, other attention-dependent mechanisms should also exist, most likely in extrastriate visual areas, to encode the binding between visual features.
Cortical depth-dependent variation in fMRI co-fluctuations
The configuration of the stimulus affects the co-fluctuations in the fMRI signals at both superficial and deep cortical depths without any noticeable difference (Fig. 3). However, with the addition of a more attention-demanding task (Fig. 6) and/or random spatiotemporal noise patterns to the stimuli (Fig. 7), the relationship between the co-fluctuations within superficial and deep cortical depth levels changed.
These observations can be linked to one or both of two phenomena. On the one hand, neuronal processing and connectivity differ across cortical depths. It is known that, in primates, the superficial layers of V1 are more connected to the higher visual areas (i.e., V2, V3, V4, and MT), whereas the deep cortical layers are more strongly connected to the subcortical areas (Felleman and Van Essen, 1991). In this condition, the impact of the stimulus noise patterns is preferentially diminished in the superficial layers, likely because of feedback from other cortical regions and/or intercolumnar (local) processing within V1 (Casagrande and Kaas, 1994; Ito and Gilbert, 1999; Liang et al., 2017).
On the other hand, it has been shown that gradient-echo BOLD fMRI responses are stronger in more superficial compared with deeper cortical layers (Koopmans et al., 2010; Polimeni et al., 2010; De Martino et al., 2013; Nasr et al., 2016). This effect partly results from the impact of the large draining veins on the pial surface. One may thus expect less sensitivity to the stimulus noise because the overall fMRI signal is stronger in voxels sampling the superficial cortex.
Notably, multiple factors, including the existence of radial ascending venules (Duvernoy et al., 1981, 1983; Markuerkiaga et al., 2016) and our 1.2 mm isotropic voxel size, may artifactually increase the level of correlation between deep and superficial cortical depth levels. These factors would act to reduce the impact of the stimulus pattern and/or the subject's task on the level of co-fluctuations. This suggests that the true differences in the level of correlation between neurons within the deep and superficial layers may be stronger than what we have observed in our data.
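The reasoning above can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that the signal measured at each depth in an ROI is that depth's own unit-variance signal plus a fraction `alpha` of the signal from the other depth in the same ROI (standing in for venous drainage and partial-volume effects), and that the underlying deep and superficial signals are otherwise uncorrelated. None of these assumptions, names, or parameter values come from the study.

```python
def measured_correlations(rho_sup, rho_deep, alpha):
    """Toy signal-mixing model (hypothetical, for illustration only).

    rho_sup, rho_deep: true between-ROI correlations at each depth.
    alpha: fraction of the other depth's signal mixed into each measurement.
    Returns (measured superficial r, measured deep r, cross-depth r within ROI).
    """
    norm = 1.0 + alpha ** 2  # variance of each measured (mixed) signal
    meas_sup = (rho_sup + alpha ** 2 * rho_deep) / norm
    meas_deep = (rho_deep + alpha ** 2 * rho_sup) / norm
    # Mixing alone makes the two depths within an ROI correlated.
    cross_depth = 2.0 * alpha / norm
    return meas_sup, meas_deep, cross_depth

true_sup, true_deep = 0.6, 0.3  # hypothetical true correlations
meas_sup, meas_deep, cross_depth = measured_correlations(true_sup, true_deep, 0.3)

print(f"true depth difference:     {true_sup - true_deep:.3f}")
print(f"measured depth difference: {meas_sup - meas_deep:.3f}")
print(f"artifactual cross-depth r: {cross_depth:.3f}")
```

Under these assumptions the measured depth difference equals the true difference scaled by (1 − alpha²)/(1 + alpha²), so any nonzero mixing makes the depths look more alike than they are.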
Link between co-fluctuations in the sluggish BOLD fMRI signal versus γ-band neuronal synchrony
Our results are consistent with the possibility that the change in the γ-band synchrony level, caused by global stimulus configuration (Gray et al., 1989; Engel et al., 1991), may be reflected in the level of fMRI co-fluctuations. A change in the synchrony level likely leads to an enhanced read-out of near-synchronized neuronal input, as opposed to asynchronous input, by downstream neurons (Grannan et al., 1993). Multiple previous studies have shown a significant relationship between fMRI spontaneous fluctuations and γ-band neuronal activity (Nir et al., 2007; Scholvinck et al., 2010; Scheeringa et al., 2011; Mateo et al., 2017). Modulation of γ-band neuronal activity entrains vasomotive oscillations in pial arterioles on the cortical surface and influences the supply of oxygenated blood to the underlying tissue (Mateo et al., 2017; Drew et al., 2020). Thus, despite the sluggish nature of the BOLD signal, fMRI co-fluctuations may carry valuable information about the configuration of stimuli across the visual field that is originally encoded via γ-band synchrony. By virtue of its spatial coverage, BOLD fMRI provided the ability to measure these co-fluctuations over a larger cortical region than what can be accessed using conventional invasive methods in animal models. Given our ability to use BOLD fMRI to detect changes in gestalt in V1, future fMRI studies can potentially address the link between co-fluctuating activity within extrastriate visual areas and between these areas and V1.
Footnotes
This work was supported in part by the National Institutes of Health, National Institute of Biomedical Imaging and Bioengineering Grants P41-EB015896 and R01-EB019437; National Institutes of Health, National Institute of Neurological Disorders and Stroke Grant R35-NS097265; National Institutes of Health, National Eye Institute Grant R01-EY030434; BRAIN Initiative, National Institutes of Health, National Institute of Mental Health Grants R01-MH111438 and R01-MH111419; and Massachusetts General Hospital/HST Athinoula A. Martinos Center for Biomedical Imaging. We thank Drs. Lars Muckli, Haim Sompolinsky, and Wim Vanduffel for discussions; and Kyle Droppa and Nina Fultz for help with volunteer recruitment and MRI scanning.
The authors declare no competing financial interests.
- Correspondence should be addressed to Shahin Nasr at shahin.nasr{at}mgh.harvard.edu