Abstract
How the human brain reconstructs the three-dimensional (3D) world from two-dimensional (2D) retinal images has received a great deal of interest, as has how we shift attention in 2D space. In contrast, it remains poorly understood how visuospatial attention is shifted in depth. In this fMRI study, by constructing a virtual 3D environment in the MR scanner and by presenting targets either close to or far from the participants in an adapted version of the Posner spatial-cueing paradigm, we investigated the behavioral and neural mechanisms underlying visuospatial orienting/reorienting in depth. At the behavioral level, although covering the same spatial distance, attentional reorienting to objects unexpectedly appearing closer to the observer and in the unattended hemispace was faster than reorienting to unexpected objects farther away. At the neural level, we found that in addition to the classical attentional reorienting system in the right temporoparietal junction, two additional brain networks were differentially involved in aspects of attentional reorienting in depth. First, bilateral premotor cortex reoriented visuospatial attention specifically along the third dimension of visual space (i.e., from close to far or vice versa), compared with attentional reorienting within the same depth plane. Second, a network of areas reminiscent of the human “default-mode network,” including posterior cingulate cortex, orbital prefrontal cortex, and left angular gyrus, was involved in the neural interaction between depth and attentional orienting, by boosting attentional reorienting to unexpected objects appearing both closer to the observer and in the unattended hemispace.
Introduction
By using binocular as well as monocular cues, the human brain reconstructs a three-dimensional (3D) world from two-dimensional (2D) retinal images. Stereoscopic vision allows us to extract not only the 3D shape of objects (Sereno et al., 2002; Tsutsui et al., 2002; Durand et al., 2007; Georgieva et al., 2009), but also their spatial locations in depth based on which visuospatial attention can be shifted along the third dimension of visual space. Yet, surprisingly little is known concerning the neural basis of how visuospatial attention is shifted in depth.
Attentional reorienting between spatial locations within a two-dimensional frontoparallel plane involves a right hemisphere-dominant ventral frontoparietal network that interrupts and resets ongoing activity (Corbetta et al., 2000, 2008; Hopfinger et al., 2000; Corbetta and Shulman, 2002). In real life, however, the distance in depth of a potentially threatening/rewarding stimulus relative to the observer is crucial for its evaluation: stimuli that unexpectedly appear close to us instantaneously demand a shift of attention toward them, to secure survival. In contrast, stimuli far from us may be less attention demanding (Downing and Pinker, 1985; de Gonzaga Gawryszewski et al., 1987; Previc, 1998; Graziano and Cooke, 2006). Therefore, it is important not only to determine the neural mechanisms underlying attentional reorienting in depth per se, but also to characterize the direction-specific mechanisms by which visuospatial attention is directed to stimuli unexpectedly appearing close to, rather than far from, the observer, due to their differential ecological importance.
We adapted the Posner spatial-cueing task (Posner, 1980) to a virtual 3D world setting (Fig. 1). A fixation cross was centered upon the midsagittal plane of the observer, with an equal spatial distance to the spatial locations in the closer and the farther depth planes (Fig. 1A). Participants were instructed to maintain fixation at the cross throughout the experiment. For each trial, a spatial cue was first presented at the location of the fixation cross, pointing to one of the four spatial locations (Fig. 1A,B). In 80% of the trials, the target appeared in the cued location. In one-third of the 20% invalid trials, visuospatial attention was reoriented between different locations within the same depth plane. In another one-third of the 20% invalid trials, visuospatial attention was reoriented across depth (i.e., from close to far or vice versa), but not across hemispace. In the last one-third of the 20% invalid trials, visuospatial attention was reoriented across depth and hemispace. With this 3D paradigm, we aimed to assess the following: (1) the neural correlates of attentional reorienting in depth, by comparing conditions that required attentional reorienting between different depth planes (i.e., from close to far or from far to close) with those that required reorienting within a given depth plane; and (2) the interactive effects between depth and cue validity: in particular, the neural correlates involved in reorienting visuospatial attention toward the observer (i.e., from far to close) in response to unexpected stimuli, as opposed to reorienting away from the observer (i.e., from close to far).
Figure 1. A, B, The top view (A) and frontal view (B) of exemplary trials of the experimental paradigm. The default visual scene consisted of four black placeholders (two in a closer and two in a farther depth plane) and one fixation cross. The spatial distance between the two placeholders in close and far depth planes was matched in visual angle. The same 3D objects were presented in the close and far depth planes. All participants reported seeing the same objects in close and far depth planes because of the size–distance constancy effect (Boring, 1964). To avoid complete occlusions between close and far stimuli, the vertical distance between the retinal images of the close and far stimuli was set to 1.5°. Retinal positions of the closer and farther stimuli being upper or lower were counterbalanced across participants. Purple arrows indicate the paths of covert attentional reorienting in the 3D space. WV, Within-depth, valid; WIV, Within-depth, invalid; BIV_SH, Between-depth, same hemispace; BIV_DH, Between-depth, different hemispace.
Materials and Methods
Experiment 1: 3D fMRI experiment
Participants.
Twenty healthy volunteers (9 female and 11 male; 21.5 ± 2 years old) participated in the present experiment. They were all right handed, and had normal color vision and visual acuity. None of them had a history of neurological or psychiatric disorders. All participants gave informed consent before the experiment in accordance with the Helsinki Declaration. This experiment was approved by the Ethics Committee of the German Society for Psychology.
Apparatus, stimuli, and experimental setup.
A goggle-based MR-compatible system (VisuaStim Digital, Resonance Technologies) provided two separate VGA monitors with dual digital video inputs for stereoscopic display, each with a resolution of 800 (horizontal) × 600 (vertical) pixels at a 60 Hz refresh rate, which was driven by a 3.0 GHz PC and a NVIDIA GeForce FX 5200 graphic card. The horizontal extent of the field of view was 30°. The dual-display stereoscopic video, with a resolution of 500,000 pixels per 0.25 square inch area, yielded 3D images by delivering slightly disparate images to each eye (binocular disparity). All the 3D objects were generated by Blender (free open-source 3D content creation software, available at: http://www.blender.org), exported as DirectX files, and presented on a gray background by custom made Presentation scripts (Presentation Software package, Neurobehavioral Systems).
The virtual display in each trial contained two black placeholders in a closer depth plane and two placeholders in a farther depth plane, thereby marking four spatial locations in the virtual three-dimensional world (Fig. 1). The 3D objects located at the closer depth plane popped out of the goggles screen, whereas objects at the farther depth plane appeared inside the goggles screen. A central fixation cross was positioned on the midsagittal plane of the observer with an equal spatial distance to the closer and farther depth planes. The closer depth plane was 50 cm away from the participants' eyes, and the farther depth plane was 150 cm away. The central fixation cross was therefore midway between the closer and the farther stimuli (i.e., 100 cm from the participants' eyes). The different target distances were simulated by adjusting the binocular disparity. The binocular disparity between the closer and farther depth planes was ±68.76 min of arc relative to the fusion plane at which the fixation cross was presented (zero disparity). These values are near the maximum disparity separation possible without loss of fusion; a pilot experiment showed that trained observers had no trouble fusing the two images with a binocular disparity of ±68.76 min of arc. Observers reported having a clear perception of the closer and farther depth planes while fixating at the central cross. The size/extent of the central fixation cross was 1.67° in visual angle. The horizontal spatial distance (along the x-axis) between the two placeholders in the closer and farther depth planes was matched in visual angle (22.62°) (i.e., the retinal distance between the two spatial positions was the same for the closer and farther depth planes) (Fig. 1B).
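The matching of horizontal separation in visual angle follows the standard relation θ = 2·atan(s / 2d), where s is the physical extent and d the viewing distance. The following minimal sketch illustrates that relation only. The 20 cm separation in the closer plane is a hypothetical value, chosen because it reproduces the reported 22.62° at 50 cm; the text reports only the angle and the two simulated distances, and the function names are illustrative.

```python
import math

def visual_angle_deg(extent_cm: float, distance_cm: float) -> float:
    """Visual angle (degrees) subtended by an extent centered on the line of sight."""
    return math.degrees(2.0 * math.atan(extent_cm / (2.0 * distance_cm)))

def extent_for_angle_cm(angle_deg: float, distance_cm: float) -> float:
    """Physical extent (cm) that subtends a given visual angle at a given distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

NEAR_CM, FAR_CM = 50.0, 150.0       # simulated depth planes reported in the text
ASSUMED_NEAR_SEPARATION_CM = 20.0   # hypothetical value, not stated in the source

# Angle subtended by the assumed separation in the closer plane.
near_angle = visual_angle_deg(ASSUMED_NEAR_SEPARATION_CM, NEAR_CM)

# Separation needed in the farther plane to subtend the same angle.
far_separation = extent_for_angle_cm(near_angle, FAR_CM)
far_angle = visual_angle_deg(far_separation, FAR_CM)

print(f"near angle: {near_angle:.2f} deg; matched far separation: {far_separation:.1f} cm")
```

Under this assumption, matching the visual angle means that the physical separation scales linearly with viewing distance.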
The 3D objects in the closer and farther depth planes were the same, resulting in slightly different retinal sizes of the closer and farther stimuli: 2.86° for the closer placeholders and 1.85° for the farther placeholders (Fig. 1B). Had the retinal sizes of the objects in the close and far depth planes also been matched, participants would have subjectively perceived abnormally large objects in far space, which is ecologically implausible in everyday life (Downing and Pinker, 1985; de Gonzaga Gawryszewski et al., 1987; Andersen, 1990; Andersen and Kramer, 1993). Moreover, the differential retinal sizes of stimuli in close and far depth planes would only raise problems when testing for the main effect of the depth of targets (i.e., close vs far and vice versa). Since we were interested in the main effects of cue validity as well as in the neural interaction between depth and cue validity, these retinal size differences were balanced across the experimental conditions of interest. The closer and farther placeholders were tilted by 13° toward the participants. To avoid occlusion between stimuli in the closer and farther depth planes, the positions in the closer and farther depth planes were vertically offset by 1.5° of visual angle on the retina (Fig. 1B). For 10 participants, the two positions in the closer depth plane were higher while the two positions in the farther depth plane were lower, and vice versa for the other 10 participants. The target was a blue sphere. The endogenous cue in the present study consisted of a red cone with a thin black cylinder attached as its handle.
Experimental paradigm and design.
We adapted the Posner-type central predictive cueing paradigm (Posner, 1980). At the start of each trial, the cue appeared at the position of the central fixation cross for 200 ms, pointing to one of the four spatial locations either in the closer or farther depth plane. After the cue disappeared, the target was presented for 100 ms. To prevent temporal orienting, we used two randomly occurring cue–target intervals (400 and 700 ms; i.e., the time interval between the cue offset and the target onset varied). The cues were valid in 80% of the trials [i.e., the targets appeared not only in the depth plane, but also in the hemispace the cues pointed to, the within-depth valid condition, Within_Valid (WV)] (Fig. 1A). The cues were invalid in 20% of the trials. In one-third of the invalid trials (i.e., 6.67% of the total trials), the targets appeared in the depth plane that the cue pointed to, but in the opposite location/hemispace [the within-depth invalid condition, Within_Invalid (WIV)]. In another one-third of the invalid trials (i.e., 6.67% of the total trials), the targets appeared in the uncued depth plane, but at the position that was located in the same hemispace as the cue (i.e., the cue and the target appeared either both left or both right) [condition Between_Invalid_Same_Hemispace (BIV_SH)]. In the remaining one-third of the invalid trials (i.e., the last 6.67% of all trials), the targets appeared not only in the uncued depth plane, but also in the spatial location that was located in the opposite hemispace [condition Between_Invalid_Different_Hemispace (BIV_DH)]. Participants were required to fixate at the central fixation cross (zero disparity) throughout the experiment. Therefore, binocular disparity was constant in the eight experimental conditions regardless of both depth of target and cue validity. All participants reported a clear and stable perception of the closer and the farther locations while maintaining central fixation.
The event-related fMRI design was thus a two (depth of target: close vs far) by four (cue validity: WV, WIV, BIV_SH, and BIV_DH) factorial design. There were eight experimental conditions in total. The experiment consisted of 720 experimental trials, including 576 WV trials, 48 WIV trials, 48 BIV_SH trials, and 48 BIV_DH trials (close and far space combined). To prevent participants from responding before the target onset, 32 catch trials were also included, during which only the cues, but not the targets, were presented. Additionally, 360 “null events” were included, during which the baseline display (i.e., four placeholders and one central fixation cross) was presented (Josephs and Henson, 1999), effectively resulting in jittered intertrial intervals between experimental trials (e.g., 2000, 4000, and 6000 ms).
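The trial counts above follow directly from the stated proportions, as the following minimal sketch shows. The equal split between close and far targets within each cue-validity level is an assumption (the text reports counts with close and far space combined), and all variable names are illustrative.

```python
from fractions import Fraction

TOTAL_TRIALS = 720                  # experimental trials reported in the text
P_VALID = Fraction(4, 5)            # 80% valid cues
INVALID_TYPES = ("WIV", "BIV_SH", "BIV_DH")
DEPTHS = ("close", "far")
N_CATCH, N_NULL = 32, 360           # catch trials and null events from the text

# Valid trials, and the invalid trials split evenly across the three invalid types.
n_valid = int(TOTAL_TRIALS * P_VALID)
n_invalid_each = int(TOTAL_TRIALS * (1 - P_VALID) / len(INVALID_TYPES))

trials_per_validity = {"WV": n_valid}
trials_per_validity.update({name: n_invalid_each for name in INVALID_TYPES})

# Assumption: close and far targets are equally frequent within each validity level.
trials_per_cell = {
    (depth, validity): count // len(DEPTHS)
    for validity, count in trials_per_validity.items()
    for depth in DEPTHS
}

total_events = TOTAL_TRIALS + N_CATCH + N_NULL

print(trials_per_validity)
print(total_events)
```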
In the middle of the scanning session, an instruction (6 s) was displayed asking participants to switch hands. Ten participants switched from left hand to right hand, and vice versa for the other 10 participants. For both hands, participants were required to use their index finger to press one button on the response pad. Before the fMRI experiment, all participants completed a training session lasting 15 min with the same setting (Z800 3D Visor, eMagin) outside the scanner to familiarize them with the tasks and the experimental setup.
Eye movement control.
Eye position was monitored during scanning with an MR-compatible infrared eye tracker (SensoMotoric Instruments). The sampling rate of the eye tracker was 60 Hz, and its spatial resolution was 800 × 600 pixels. Eye movement data were analyzed using ILAB (Gitelman, 2002) to assess whether participants maintained fixation in response to the cue and the target stimuli, regardless of depth of the targets and cue validity. Artifacts related to blinking were filtered out. A region of interest (ROI) within 1.5° of central fixation (i.e., a square central region whose height and width were both 3° of visual angle) was defined as the fixation area. For each of the eight experimental conditions, we calculated the ratio between the overall time that participants kept their gaze within this ROI and the duration of the time frame from the cue appearance to the target disappearance (700 or 1000 ms, depending on the cue–target interval on each trial). The variability of vertical and horizontal eye positions from the cue appearance to the target disappearance, quantified as SDs along the x-axis and y-axis within each trial, was also calculated as a function of the eight experimental conditions. Furthermore, we separated the cue phase (from the cue appearance until the target appearance) and the target phase (from the target onset to offset), and calculated the vertical and horizontal eye positions and the corresponding variances as a function of the direction of the cue and the target (i.e., close left, close right, far left, and far right), respectively.
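The fixation measures described above can be sketched as follows. This is an illustrative reimplementation, not the ILAB analysis itself. It assumes that gaze samples have already been converted to degrees relative to the fixation cross, that blink artifacts have been removed, and that sampling is uniform, so that the fraction of samples inside the ROI approximates the fraction of time. The function names, the toy samples, and the use of the population SD are all assumptions.

```python
from statistics import pstdev

def analysis_window_ms(cue_target_interval_ms, cue_ms=200, target_ms=100):
    """Duration from cue appearance to target disappearance for one trial."""
    return cue_ms + cue_target_interval_ms + target_ms

def fixation_ratio(samples_deg, half_width_deg=1.5):
    """Fraction of gaze samples inside a square ROI centered on fixation."""
    inside = sum(
        1 for x, y in samples_deg
        if abs(x) <= half_width_deg and abs(y) <= half_width_deg
    )
    return inside / len(samples_deg)

def position_sd(samples_deg):
    """SD of horizontal and vertical gaze position within one trial."""
    xs = [x for x, _ in samples_deg]
    ys = [y for _, y in samples_deg]
    return pstdev(xs), pstdev(ys)

# Toy trial: 42 samples (700 ms at 60 Hz), two of which fall outside the ROI.
toy_samples = [(0.1, -0.2)] * 40 + [(2.0, 0.0)] * 2

ratio = fixation_ratio(toy_samples)
sd_x, sd_y = position_sd(toy_samples)
print(analysis_window_ms(400), analysis_window_ms(700), round(ratio, 3))
```

Per-condition values would then be obtained by averaging these per-trial measures within each of the eight conditions.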
Statistical analysis of behavioral data.
Omissions, incorrect responses, and trials with reaction times (RTs) faster than 100 ms (i.e., anticipations) were excluded from further analysis. False alarm responses during catch trials were also calculated as percentage values. Mean RTs were then calculated for each of the eight trial types for each subject, and were submitted to a 2 (depth of targets: close vs far) × 4 (cue validity: WV, WIV, BIV_SH, and BIV_DH) repeated-measures ANOVA (with Greenhouse–Geisser correction). Significant effects were further examined by planned t tests (with Bonferroni's correction).
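The exclusion and averaging steps can be sketched as follows. The record layout (keys `depth`, `validity`, `rt_ms`, `correct`) and the toy data are hypothetical; only the exclusion criteria (omissions, incorrect responses, and RTs faster than 100 ms) come from the text.

```python
from collections import defaultdict
from statistics import mean

ANTICIPATION_MS = 100  # RTs faster than this are treated as anticipations

def mean_rt_per_condition(trials):
    """Mean RT per (depth, validity) cell after applying the exclusion criteria.

    Each trial is a dict with hypothetical keys 'depth', 'validity',
    'rt_ms' (None for an omission), and 'correct'.
    """
    kept = defaultdict(list)
    for trial in trials:
        if trial["rt_ms"] is None or not trial["correct"]:
            continue  # omission or incorrect response
        if trial["rt_ms"] < ANTICIPATION_MS:
            continue  # anticipation
        kept[(trial["depth"], trial["validity"])].append(trial["rt_ms"])
    return {condition: mean(rts) for condition, rts in kept.items()}

# Toy data for one hypothetical participant.
toy_trials = [
    {"depth": "close", "validity": "WV", "rt_ms": 350, "correct": True},
    {"depth": "close", "validity": "WV", "rt_ms": 370, "correct": True},
    {"depth": "close", "validity": "WV", "rt_ms": 80, "correct": True},     # anticipation
    {"depth": "far", "validity": "BIV_DH", "rt_ms": None, "correct": False},  # omission
    {"depth": "far", "validity": "BIV_DH", "rt_ms": 430, "correct": True},
]

cell_means = mean_rt_per_condition(toy_trials)
print(cell_means)
```

The resulting per-participant cell means are the values that would enter the 2 × 4 repeated-measures ANOVA.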
Imaging data acquisition and preprocessing.
A 3T Siemens Trio system with a standard head coil was used to obtain T2*-weighted echoplanar images (EPIs) with blood oxygenation level-dependent (BOLD) contrast (matrix size: 64 × 64; voxel size: 3.1 × 3.1 × 3.0 mm³). Thirty-six transversal slices of 3 mm thickness that covered the whole brain were acquired sequentially with a 0.3 mm gap (TR = 2.2 s, TE = 30 ms, FOV = 220 mm, flip angle = 90°). There was one run of functional scanning that included 931 EPI volumes. The first five volumes were discarded to allow for T1 equilibration effects. Additional high-resolution anatomical images (voxel size: 1 × 1 × 1 mm³) were acquired using a standard T1-weighted 3D MP-RAGE sequence.
Data were preprocessed with Statistical Parametric Mapping software SPM5 (Wellcome Department of Imaging Neuroscience, London, UK; available at: http://www.fil.ion.ucl.ac.uk). Images were realigned to the first volume to correct for interscan head movements. Then the mean EPI image of each participant was computed and spatially normalized to the MNI single-subject template (Collins et al., 1994; Evans et al., 1994; Holmes et al., 1998) using the “unified segmentation” function in SPM5. This algorithm is based on a probabilistic framework that enables image registration, tissue classification, and bias correction to be combined within the same generative model. The resulting parameters of a discrete cosine transform, which define the deformation field necessary to move individual data into the space of the MNI tissue probability maps (Evans et al., 1994), were then combined with the deformation field, transforming between the latter and the MNI single-subject template. The ensuing deformation was subsequently applied to individual EPI volumes. All images were thus transformed into standard MNI space and resampled to 2 × 2 × 2 mm³ voxel size. The data were then smoothed with a Gaussian kernel of 8 mm full-width half-maximum to accommodate intersubject anatomical variability.
Statistical analysis of imaging data.
The preprocessed data were high-pass filtered at 1/128 Hz and were then analyzed with a general linear model (GLM) as implemented in SPM5. Temporal autocorrelation was modeled using an AR(1) process. At the individual level, the GLM was used to construct a multiple regression design matrix that included the following eight experimental conditions: Close_WV, Close_WIV, Close_BIV_SH, Close_BIV_DH, Far_WV, Far_WIV, Far_BIV_SH, and Far_BIV_DH. The eight event types were time locked to the onset of the target of each trial by a canonical synthetic hemodynamic response function and its time and dispersion derivatives, with an event duration of 0 s. The inclusion of the dispersion derivatives took into account the different durations of neural processes induced by the variable cue–target intervals and allowed for changes in dispersion of the BOLD responses induced by different cue–target intervals. Additionally, all the instructions, omissions, error trials, and trials with RTs faster than 100 ms (anticipated responses) were separately modeled as regressors of no interest. Parameter estimates were subsequently calculated for each voxel using weighted least-squares to provide maximum likelihood estimators based on the temporal autocorrelation of the data. No global scaling was applied.
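To illustrate how zero-duration events time locked to target onset translate into a predicted BOLD time course, the following minimal sketch builds one regressor from a double-gamma response function. It is not SPM5 code. The shape parameters (6 and 16) and the undershoot ratio (1/6) are assumed values commonly quoted for a canonical double-gamma HRF, the time and dispersion derivatives are omitted, the onsets are made up, and the function names are illustrative.

```python
import math

TR_S = 2.2  # repetition time reported in the text

def gamma_pdf(t, shape, rate=1.0):
    """Gamma probability density; zero for non-positive time."""
    if t <= 0:
        return 0.0
    return (rate ** shape) * t ** (shape - 1) * math.exp(-rate * t) / math.gamma(shape)

def double_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, undershoot_ratio=6.0):
    """Double-gamma response: an early peak minus a scaled, later undershoot.

    Parameter values are assumptions, not taken from the source.
    """
    return gamma_pdf(t, peak_shape) - gamma_pdf(t, undershoot_shape) / undershoot_ratio

def event_regressor(onsets_s, n_scans, tr_s=TR_S):
    """Predicted response at each scan time for zero-duration events.

    Convolving a train of delta functions with the HRF reduces to summing
    copies of the HRF shifted to each event onset.
    """
    return [
        sum(double_gamma_hrf(scan * tr_s - onset) for onset in onsets_s)
        for scan in range(n_scans)
    ]

# Hypothetical target onsets (in seconds) for one condition.
toy_onsets = [0.0, 13.2]
regressor = event_regressor(toy_onsets, n_scans=20)

# Locate the HRF peak on a fine grid (0 to 30 s in 0.1 s steps).
grid = [i / 10 for i in range(301)]
peak_time = max(grid, key=double_gamma_hrf)
print(f"HRF peak at about {peak_time:.1f} s")
```

In a full model, one such regressor (plus its derivatives) would be built per condition and entered as a column of the design matrix.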
For each participant, simple main effects for each of the eight experimental conditions were computed by applying appropriate [1 0] baseline contrasts [i.e., the experimental conditions vs implicit baseline (null trials) contrasts]. The eight first-level individual contrast images were then fed into a 2 × 4 within-subjects ANOVA at the second group level using a random-effects model (i.e., the flexible factorial design in SPM5 including an additional factor modeling the subject means). In the modeling of variance components, we allowed for violations of sphericity by modeling nonindependence across parameter estimates from the same subject, and allowed for unequal variances between conditions and between subjects using the standard implementation in SPM5. Areas of activation in the main effects were identified as significant only if they passed the threshold of p < 0.05, familywise error (FWE) corrected for multiple comparisons at the voxel level, with a cluster size of >10 voxels (Poline et al., 1997). Areas of activation in the interaction contrasts were identified as significant only if they passed the threshold of p < 0.01, FWE corrected for multiple comparisons at the cluster level with an underlying voxel level of p < 0.001, uncorrected.
Experiment 2: 2D control experiment
Since the same set of 3D objects (placeholders and targets) was used for the closer and the farther depth planes in our experiments, the retinal sizes of the closer and the farther objects were different, which could—at least in principle—constitute a confound to our results. To rule out this possibility, we ran a 2D control experiment, in which a group of new participants performed the same attentional reorienting task as in Experiment 1 but in two dimensions. Hence, in Experiment 2 the retinal sizes and the retinal positions of the objects were identical to those in Experiment 1, but the depth dimension was removed. Therefore, if the 2D experiment revealed the same pattern of results as Experiment 1, the differential retinal images of objects, rather than reorienting in depth, might have caused the results. In contrast, if the pattern of results differed in the 2D experiment, the results can most likely be attributed to reorienting in depth per se.
Participants.
Twenty healthy volunteers (10 female; 21 ± 1.5 years old) participated in the present experiment. They were all right handed, and had normal color vision and visual acuity. None of them had a history of neurological or psychiatric disorders. All participants gave informed consent before the experiment in accordance with the Helsinki Declaration. This experiment was approved by the Ethics Committee of the Department of Psychology, South China Normal University.
Apparatus, stimuli, and experimental setup.
The experiment was run in a dimly lit room. Participants sat in front of a monitor screen on which the 2D stimuli were presented, with an eye-to-monitor distance of 75 cm. The 2D stimuli in the present experiment produced the same retinal images as, and were matched in retinal size to, the 3D stimuli in Experiment 1 (Fig. 1B).
Experimental paradigm and design.
The experimental paradigm and task were the same as in Experiment 1. The experimental design was a 2 (retinal size of targets: large vs small) × 4 (cue validity: WV, WIV, BIV_SH and BIV_DH) within-subject design. Participants were asked to detect the appearance of a blue sphere in one of the four spatial locations on the screen while maintaining central fixation.
Statistical analysis of behavioral data.
The statistical analysis of Experiment 2 followed that of Experiment 1.
Experiment 3: effect of specific depth
Although we could track one eye in the scanner to rule out gross differences in eye movement as a putative confound to attentional reorienting processes, one may still argue that the pattern of our results might be confounded by differential binocular disparity. Due to technical limitations, we were unable to track both eyes in the MR scanner and, accordingly, could not calculate the exact binocular disparity during the fMRI experiment. However, to rule out that our results can simply be attributed to differences in binocular disparity per se, we ran the same experiment with the fixation cross being displayed at three different depths (close, medium, and far) as a behavioral control experiment. Since the specific depth and the binocular disparity parametrically varied across the three levels of depth, a consistent pattern of behavioral results regardless of the specific depth at which the fixation cross was displayed would provide evidence that the findings of Experiment 1 cannot simply be due to differences in binocular disparity.
Participants.
Eighteen healthy volunteers (9 female and 11 male, 20 ± 1.5 years old) participated in Experiment 3. They were all right handed, and had normal color vision and visual acuity. None of them had a history of neurological or psychiatric disorders. All participants gave informed consent before the experiment in accordance with the Helsinki Declaration. This experiment was also approved by the Ethics Committee of the Department of Psychology, South China Normal University.
Apparatus, stimuli, and experimental setup.
The 3D stimuli in this experiment were delivered via an eMagin Z800 3D Visor with the same settings as those in Experiment 1, except that the whole stimulus display in the present experiment was presented at three depths (with the central fixation being at 65, 100, and 135 cm from the eyes, respectively), resulting in three levels of eye convergence.
Experimental paradigm and design.
The experimental paradigm and task were exactly the same as in Experiment 1. The design in this experiment constituted a 3 (depth of the stimulus display: close, medium, far) × 2 (depth of targets: close vs far) × 4 (cue validity: WV, WIV, BIV_SH, and BIV_DH) within-subject design, with the depth of the stimulus display blocked. The order in which participants performed the close, medium, and far blocks was counterbalanced. Participants were required to fixate at the central fixation (at three depths) during the close, medium, and far blocks.
Statistical analysis of behavioral data.
Omissions, incorrect responses, and trials with RTs faster than 100 ms (i.e., anticipations) were excluded from further analysis. False alarm responses during catch trials were also calculated as percentage values. Mean RTs were then calculated for each of the experimental conditions for each subject, and were submitted to a 3 (depth of the stimulus display: close, medium, far) × 2 (depth of targets: close vs far) × 4 (cue validity: WV, WIV, BIV_SH, and BIV_DH) repeated-measures ANOVA (with Greenhouse–Geisser correction). Significant effects were further examined by planned t tests (with Bonferroni's correction).
Results
Behavioral data
Experiment 1
Mean RTs in the eight experimental conditions were submitted to a 2 (depth of targets: close vs far) × 4 (cue validity: WV, WIV, BIV_SH, and BIV_DH) repeated-measures ANOVA. The main effect of depth of targets was significant (F(1,19) = 26.15; p < 0.001), indicating that RTs to closer targets (391 ± 14 ms) were faster than RTs to the farther targets (401 ± 13 ms) (Fig. 2A). The main effect of cue validity was also significant (F(2,44) = 46.98; p < 0.001). RTs in the WV condition (353 ± 14 ms) were faster than RTs in the three invalid conditions (WIV: 417 ± 14 ms; BIV_SH: 395 ± 14 ms; BIV_DH: 419 ± 15 ms) (all p < 0.001, with Bonferroni's correction), indicating RT costs whenever visuospatial attention was reoriented to an invalidly cued location, regardless of depth and hemispace. Among the three invalid conditions, RTs in the BIV_SH condition were significantly faster than RTs in the WIV and the BIV_DH conditions (both p < 0.05, with Bonferroni's correction), while there was no significant difference between the latter two conditions (p > 0.1).
Figure 2. A–C, Mean RTs shown as a function of the experimental conditions in Experiments 1 (A), 2 (B), and 3 (C). The error bars represent SEM. Asterisks denote pairs of experimental conditions that differed significantly.
Furthermore, the interaction between depth and cue validity was significant (F(2,36) = 6.37; p = 0.005). Planned paired t tests between the closer versus the farther targets in the BIV_DH, BIV_SH, and WIV conditions, respectively, suggested that attentional reorienting to unexpected targets in the closer depth plane was significantly faster than attentional reorienting to the farther depth plane only when the behavioral targets unexpectedly appeared both in the unattended depth plane and the unattended hemispace [i.e., only in the BIV_DH condition (t(19) = 5.32; p < 0.001), but not in the BIV_SH condition (t(19) = 0.89; p = 0.39; with attentional reorienting to the unattended depth only) or in the WIV condition (t(19) = 0.86; p = 0.40; with attentional reorienting to the unattended hemispace only)] (Fig. 2A).
Experiment 2: 2D control experiment
Mean RTs were submitted to a 2 (retinal size of targets: large vs small) × 4 (cue validity: WV, WIV, BIV_SH, and BIV_DH) repeated-measures ANOVA (Fig. 2B). The only significant effect was the main effect of cue validity (F(2,40) = 43.0; p < 0.001). RTs in the WV condition (295 ± 12 ms) were faster than RTs in the three invalid conditions (WIV: 335 ± 13 ms; BIV_SH: 309 ± 15 ms; BIV_DH: 333 ± 13 ms; all p < 0.05, with Bonferroni's correction). Among the three invalid conditions, RTs in the BIV_SH condition were significantly faster than RTs in the WIV and the BIV_DH conditions (both p < 0.05, with Bonferroni's correction), while there was no significant difference between the latter two conditions (p > 0.1). Neither the main effect of the retinal size of targets (F(1,19) = 1.36; p = 0.26) nor the interaction between the two factors (F(2,44) = 0.13; p = 0.94) was significant. The data clearly demonstrate that the behavioral effect observed in Experiment 1 was caused by attentional reorienting in depth between closer and farther objects rather than by the different retinal sizes of these closer and farther objects. In particular, when the depth dimension was removed but the retinal sizes of objects were kept the same as those in Experiment 1, the interactive effect disappeared. Attentional reorienting to objects with bigger retinal sizes (corresponding to the closer objects in Experiment 1) was no longer faster than to objects with smaller retinal sizes (corresponding to the farther objects in Experiment 1).
Experiment 3: effect of specific depth
Mean RTs were submitted to a 3 (depth of the stimulus display: close, medium, far) × 2 (depth of targets: close vs far) × 4 (cue validity: WV, WIV, BIV_SH, and BIV_DH) repeated-measures ANOVA (Fig. 2C). The main effect of the depth of targets was significant (F(1,17) = 12.79; p = 0.002), indicating that RTs to close targets (344 ± 20 ms) were faster than RTs to far targets (350 ± 21 ms). The main effect of cue validity was significant (F(2,26) = 35.86; p < 0.001). RTs in the WV condition (321 ± 19 ms) were faster than RTs in the three invalid conditions (WIV: 362 ± 21 ms; BIV_SH: 340 ± 21 ms; BIV_DH: 365 ± 22 ms; all p < 0.001, with Bonferroni's correction). Among the three invalid conditions, RTs in the BIV_SH condition were significantly faster than RTs in the WIV and the BIV_DH conditions (both p < 0.05, with Bonferroni's correction), while there was no significant difference between the latter two conditions (p > 0.1). Moreover, the interaction between the depth of target and cue validity was significant (F(2,32) = 11.38; p < 0.001). There were no other significant effects. Further planned t tests on simple effects suggested that for all three levels of the depth of the stimulus display, attentional reorienting to unexpected targets in the closer depth plane was significantly faster than attentional reorienting to the farther depth plane only in the BIV_DH condition (all p < 0.05, with Bonferroni's correction), and in neither the BIV_SH condition nor the WIV condition (all p > 0.1; Fig. 2C). Therefore, the data show that the behavioral effects observed in Experiment 1 are independent of the specific depth at which the fixation cross was displayed (and therefore independent of differences in binocular disparity): the interactive effect was consistently present at the three different depths (with luminance and visual angles of stimuli equated across depths).
Eye movement data during fMRI scanning
The mean percentage of time that participants maintained fixation during each of the eight experimental conditions was entered into a 2 (depth of targets: close vs far) × 4 (cue validity: WV, WIV, BIV_SH, and BIV_DH) repeated-measures ANOVA (degrees of freedom corrected using a Greenhouse–Geisser correction). Neither the main effect of the depth of targets (F(1,19) = 0.37, p = 0.55) nor the main effect of cue validity (F(2,33) = 0.44, p = 0.62) was significant. The interaction between depth and cue validity was also not significant (F(2,47) = 0.87; p = 0.45). These results suggested that participants maintained central fixation equally well in all of the eight experimental conditions (Fig. 3A).
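The fractional degrees of freedom behind values such as F(2,33) for a four-level factor can be illustrated with a short calculation. The sketch below, in Python, applies the standard Greenhouse–Geisser scaling of both degrees of freedom by epsilon. Two inputs are assumptions rather than reported values: the sample size of 20 is inferred from the error degrees of freedom of 19 for the two-level factor, and epsilon is back-calculated from the reported error degrees of freedom (33 of an uncorrected 57).

```python
def gg_corrected_df(n_subjects, n_levels, epsilon):
    """Greenhouse-Geisser corrected df for a within-subject main effect.

    Both the effect df (k - 1) and the error df (k - 1)(n - 1) are
    multiplied by epsilon, which lies between 1 / (k - 1) and 1.
    """
    lower_bound = 1.0 / (n_levels - 1)
    if not lower_bound <= epsilon <= 1.0:
        raise ValueError("epsilon must lie between 1/(k-1) and 1")
    df_effect = (n_levels - 1) * epsilon
    df_error = (n_levels - 1) * (n_subjects - 1) * epsilon
    return df_effect, df_error


# Assumed inputs (not stated in the text): n inferred from F(1,19),
# epsilon back-calculated from the reported error df of 33.
N_SUBJECTS = 20
N_LEVELS = 4  # WV, WIV, BIV_SH, BIV_DH
EPSILON = 33 / 57

df_effect, df_error = gg_corrected_df(N_SUBJECTS, N_LEVELS, EPSILON)
print(round(df_effect, 2), round(df_error, 2))
```

Under these assumptions the corrected degrees of freedom are roughly 1.74 and 33, which would round to the reported F(2,33).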
Eye-tracking data from cue appearance to target disappearance shown as a function of the eight experimental conditions. A, Central fixation rates. B, Variance of vertical (left) and horizontal (right) eye positions. Units of y-axis are in pixels of the screen.
The SDs of eye positions along the x-axis and y-axis during each trial, from the cue appearance to the target disappearance, were also separately submitted to a 2 (depth of targets: close vs far) × 4 (cue validity: WV, WIV, BIV_SH, and BIV_DH) repeated-measures ANOVA (degrees of freedom corrected using a Greenhouse–Geisser correction). For the variances of the vertical (Fig. 3B, left) and horizontal (Fig. 3B, right) eye positions, in terms of SD, no significant effects were found (all F < 1), suggesting that neither the vertical nor the horizontal eye positions changed as a function of the eight experimental conditions. We further separated the cue and the target phases, and tested whether the vertical and horizontal eye positions and their variances varied as a function of the spatial direction of the cue and the target. During the cue phase, the mean vertical and horizontal eye positions and their variances were separately calculated as a function of the spatial direction of the cue, and were submitted to a 2 (depth: close vs far) × 2 (side: left vs right) repeated-measures ANOVA (degrees of freedom corrected using a Greenhouse–Geisser correction). No significant effect was found for either the vertical and horizontal eye positions or their variances (all F < 1), indicating that eye positions did not change as a function of the spatial direction of the cue (Fig. 4A). Similarly during the target phase, no significant effect was found for x and y eye positions and their variances (all F < 1), indicating that eye positions did not change as a function of the spatial location of the target.
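The per-trial variability measure described above can be sketched as follows. This is a minimal illustration in Python, not the analysis code used in the study: the function names, the sample format (a list of (x, y) gaze samples in screen pixels per trial), and the use of the population SD are all assumptions, and the example samples are invented purely to show the computation.

```python
from statistics import mean, pstdev


def trial_sd(samples):
    """Return (SD of x, SD of y) for one trial's gaze samples in pixels."""
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    # Population SD is an assumption; the text does not specify the estimator.
    return pstdev(xs), pstdev(ys)


def condition_mean_sd(trials_by_condition):
    """Average the per-trial SDs within each experimental condition."""
    summary = {}
    for condition, trials in trials_by_condition.items():
        sds = [trial_sd(trial) for trial in trials]
        summary[condition] = (
            mean([sd_x for sd_x, _ in sds]),
            mean([sd_y for _, sd_y in sds]),
        )
    return summary


# Hypothetical gaze samples for two trials of one condition.
example = {
    "Close_WV": [
        [(400.0, 300.0), (402.0, 300.0)],
        [(398.0, 299.0), (398.0, 301.0)],
    ]
}
print(condition_mean_sd(example))
```

The resulting per-condition values for each participant would then enter the 2 × 4 repeated-measures ANOVA described in the text.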
Eye-tracking data during the cue and the target phases. A, Vertical and horizontal eye positions and the corresponding variances during the cue phase, in terms of pixels of the screen, are shown as a function of the spatial direction of the cue. B, Vertical and horizontal eye positions and the corresponding variances during the target phase, in terms of pixels of the screen, are shown as a function of the spatial location of the target.
fMRI results
General effects of attentional reorienting
Compared with the WV condition, the three invalid conditions (WIV, BIV_SH, and BIV_DH; close and far combined) induced significantly higher neural activity in the right temporoparietal junction (TPJ) (MNI: 52, −42, 15; t = 8.17; 810 voxels), together with a dorsal, right-dominant frontoparietal network and some posterior areas in bilateral middle occipital gyrus and left middle temporal gyrus (p < 0.05, FWE correction at the voxel level; cluster size > 10 voxels) (Fig. 5A; Table 1). To further characterize the activation changes in TPJ with respect to the different conditions with invalid cues (i.e., to elucidate whether there was any differential effect of the dimension in which the reorienting of attention occurred), parameter estimates under the eight experimental conditions were extracted from the activated cluster in the right TPJ, and submitted to a 2 (depth of target: close vs far) × 4 (cue validity: WV, WIV, BIV_SH and BIV_DH) repeated-measures ANOVA (Fig. 5A). The main effect of cue validity was the only significant effect (F(2,41) = 13.4; p < 0.001). Neither the main effect of depth of targets nor the interaction was significant (both F < 1). Planned t tests on simple effects suggested that regardless of the depth of targets, neural activity in the three invalid conditions (WIV, BIV_SH, and BIV_DH) was significantly higher than neural activity in the valid condition (WV; all p < 0.05), while there was no significant difference among the three invalid conditions (all t < 1). Therefore, the right TPJ showed higher neural activity whenever visuospatial attention needed to be reoriented, regardless of the dimension in which the reorientation occurred. No significant activations were found in the reverse contrast [i.e., WV > (WIV + BIV_SH + BIV_DH); close and far combined].
A, B, Reorienting visuospatial attention per se (A) and across depth (B). A, The general effect of attentional reorienting as captured by the comparison of all the invalid conditions with the valid conditions (close and far combined). B, Brain regions specifically involved in attentional reorienting in depth. All the between-depth invalid conditions were contrasted with all the within-depth invalid conditions (close and far combined). Parameter estimates were extracted from the activated clusters and are shown as a function of the eight experimental conditions. Asterisks denote significant differences between experimental conditions (with Bonferroni's correction).
Main effects of cue validity
Attentional reorienting in depth
To further localize neural mechanisms specifically involved in reorienting attention in depth, we subsequently performed planned comparisons among the three invalid conditions. The between-depth invalid conditions were directly contrasted with the within-depth invalid conditions (i.e., BIV_SH + BIV_DH > WIV, close and far combined). Here, the premotor cortex (left: MNI, −42, −6, 51; t = 5.85; 12 voxels; right: MNI, 50, −10, 49; t = 5.78; 21 voxels) was significantly activated bilaterally (p < 0.05, FWE correction; cluster size > 10 voxels) (Fig. 5B). Since the different types of invalid conditions were directly contrasted, the general reorienting component in the right TPJ was cancelled out. Thus, bilateral premotor cortex showed significantly higher neural activity whenever visuospatial attention was reoriented from close to far or vice versa (between depths), compared with when visuospatial attention was reoriented within the same depth plane (Fig. 5B). Please note that a potential influence of differential eye movements on the observed premotor activation is unlikely, as (1) the participants were able to maintain fixation in all eight experimental conditions (Fig. 3), and (2) the vertical and horizontal eye positions and their variances did not change as a function of either the eight experimental conditions or the spatial directions of the cue and the target (Figs. 3, 4). There were no significant activations in the reverse contrast (i.e., WIV > BIV_SH + BIV_DH; close and far combined).
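Why a reorienting component shared by all invalid conditions cancels out in this contrast can be shown with a toy calculation. The sketch below, in Python, assumes zero-sum contrast weights of −2 for WIV and +1 each for BIV_SH and BIV_DH at both depths; the text gives only the form of the contrast (BIV_SH + BIV_DH > WIV), so these weights, the helper names, and the parameter estimates are hypothetical and serve only to illustrate the logic.

```python
VALIDITY = ("WV", "WIV", "BIV_SH", "BIV_DH")
DEPTHS = ("Close", "Far")
CONDITIONS = [f"{depth}_{validity}" for depth in DEPTHS for validity in VALIDITY]

# Assumed zero-sum weights for BIV_SH + BIV_DH > WIV (close and far combined).
BETWEEN_VS_WITHIN = {"WV": 0, "WIV": -2, "BIV_SH": 1, "BIV_DH": 1}


def contrast_value(betas, weights_by_validity):
    """Weighted sum of parameter estimates over the eight conditions."""
    return sum(
        weights_by_validity[cond.split("_", 1)[1]] * betas[cond]
        for cond in CONDITIONS
    )


def toy_betas(baseline, shared_reorienting, depth_specific):
    """Hypothetical betas: every invalid condition carries the shared
    component; only between-depth conditions carry the depth-specific one."""
    betas = {}
    for cond in CONDITIONS:
        validity = cond.split("_", 1)[1]
        value = baseline
        if validity != "WV":
            value += shared_reorienting
        if validity.startswith("BIV"):
            value += depth_specific
        betas[cond] = value
    return betas


# A region with only a general reorienting response (TPJ-like pattern).
general_only = contrast_value(toy_betas(1.0, 0.5, 0.0), BETWEEN_VS_WITHIN)
# A region with an additional between-depth response (premotor-like pattern).
depth_sensitive = contrast_value(toy_betas(1.0, 0.5, 0.3), BETWEEN_VS_WITHIN)
print(general_only, depth_sensitive)
```

With these invented values the contrast is zero for the region that responds equally to all invalid conditions and positive only for the region with an additional between-depth response, whatever the size of the shared component.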
Neural interaction between depth and cue validity
Corresponding to the significantly faster reorienting process from far to close when both depth and hemispace were invalidly cued (Fig. 2A), the neural interaction contrast ([Close_BIV_DH > (Close_WIV + Close_BIV_SH)] > [Far_BIV_DH > (Far_WIV + Far_BIV_SH)]) revealed significant activations in posterior cingulate cortex (PCC; MNI: −8, −50, 35; t = 5.16; 1432 voxels), orbital prefrontal cortex (OPFC; MNI: −10, 32, −9; t = 4.75; 371 voxels; MNI: −16, 54, 3; t = 4.59; 543 voxels), and left angular gyrus (AG; MNI: −48, −68, 29; t = 4.65; 141 voxels) (p < 0.01, corrected for multiple comparisons at the cluster level with an underlying voxel-level threshold of p < 0.001, uncorrected; Fig. 6). These three areas showed enhanced neural activity during reorienting from far to close, but only when the hemispace of the target was also unattended (Close_BIV_DH), while significant deactivations occurred for the Far_BIV_DH condition. Since this interaction contrast was orthogonal to the main effect of cue validity, the current pattern of neural activity in PCC, OPFC, and left AG (Fig. 6) was dissociable both from the general reorienting system in the right TPJ (Fig. 5A) and the depth-specific reorienting system in bilateral premotor cortex (Fig. 5B).
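The orthogonality between the interaction contrast and the main effect of cue validity can be checked directly on contrast weights. The following Python sketch assumes one plausible zero-sum weighting of the eight conditions for each contrast; the text specifies only the form of the contrasts, so the exact weights and all names below are assumptions.

```python
# Condition order: Close then Far, each with WV, WIV, BIV_SH, BIV_DH.
CONDITION_ORDER = [
    "Close_WV", "Close_WIV", "Close_BIV_SH", "Close_BIV_DH",
    "Far_WV", "Far_WIV", "Far_BIV_SH", "Far_BIV_DH",
]

# Assumed weights: invalid > valid, close and far combined.
MAIN_EFFECT_VALIDITY = [-3, 1, 1, 1, -3, 1, 1, 1]

# Assumed weights: [Close_BIV_DH > (Close_WIV + Close_BIV_SH)]
#                > [Far_BIV_DH > (Far_WIV + Far_BIV_SH)].
INTERACTION = [0, -1, -1, 2, 0, 1, 1, -2]

# Assumed weights: BIV_SH + BIV_DH > WIV, close and far combined.
BETWEEN_VS_WITHIN_DEPTH = [0, -2, 1, 1, 0, -2, 1, 1]


def dot(weights_a, weights_b):
    """Inner product of two contrast vectors; zero means orthogonal."""
    return sum(a * b for a, b in zip(weights_a, weights_b))


print(dot(INTERACTION, MAIN_EFFECT_VALIDITY))     # 0
print(dot(INTERACTION, BETWEEN_VS_WITHIN_DEPTH))  # 0
```

Under this weighting the interaction contrast has a zero inner product with both the general reorienting contrast and the between-depth versus within-depth contrast, which is the sense in which the three activation patterns can be treated as dissociable.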
Neural interaction between depth and cue validity. Parameter estimates were extracted from the activated clusters and are shown as a function of the eight experimental conditions.
No brain areas were found to be activated significantly by the main effect of the depth of target (i.e., close vs far or far vs close).
Discussion
By constructing a virtual 3D environment, we aimed to investigate the cognitive and neural mechanisms underlying attentional reorienting in depth. Behaviorally, although covering the same spatial distance, covert attentional reorienting to objects unexpectedly appearing both closer to the observer and in the unattended hemispace was faster than reorienting to unexpected objects farther away (Fig. 2A), and this effect was independent of both the retinal sizes of objects (Fig. 2B) and the specific depth at which the fixation cross was displayed (Fig. 2C). Neurally, we found that in addition to the ventral frontoparietal attention network, two additional brain networks were involved in different aspects of attentional reorienting along the third dimension of space.
First, compared with attentional reorienting within the same depth plane, bilateral premotor cortex was specifically involved in reorienting visuospatial attention between different depths (from close to far or vice versa). Bilateral human premotor cortex has been suggested to be involved in extracting and processing 3D shapes from disparity (Georgieva et al., 2009). In our study, however, participants fixated at the central cross throughout the experiment and oriented/reoriented spatial attention covertly between spatial locations without moving their eyes or head. Thus, the process of extracting 3D shapes of objects based on disparity (Georgieva et al., 2009) should be consistent across all experimental conditions. Therefore, the neural activations associated with this extracting process should be cancelled out when experimental conditions were contrasted with each other. To covertly shift attention between spatial locations in a 3D environment, the spatial locations of external objects, in terms of both 2D and depth coordinates, need to be first coded as an internal representation map, and then attention can be shifted between these internal spatial representations. The posterior parietal cortex has been suggested to code object positions in multiple spatial reference frames (Colby, 1998; Andersen and Buneo, 2002). Our results further suggest that the premotor cortex is most sensitive to the depth coordinates of spatial representations. In addition to supporting attentional shifts along the depth dimension of space, as in our study, the premotor cortex also transforms the internal spatial representations with three-dimensional coordinates into goal-directed actions at the motor level. For example, the premotor cortex updates spatial positions of objects, especially along the depth dimension of space, when human participants point to the objects after self-motion (Wolbers et al., 2008). In Wolbers et al. (2008), participants first memorized the initial locations of one to four objects, and then experienced a virtual forward movement (i.e., self-motion along the third dimension of space). Participants were asked to point toward the memorized location of one of the objects. This task required participants to calculate the revised object locations dependent upon the extent of their forward motion. Neural activity in the left premotor cortex increased linearly with the number of spatial locations that needed to be updated during pointing, indicating a functional role of the premotor cortex in updating the depth coordinates of objects for motor actions. Note, however, that only the left premotor cortex was involved in the above spatial updating processes during motor actions, probably because participants used their right hands to provide the responses (Wolbers et al., 2008): changes in the depth coordinates of objects were transformed into action representations in the left premotor cortex, thereby supporting the pointing responses of the right hand. An alternative explanation for the bilateral premotor activation observed in our study could be the small differences in vergence eye movements, which may accompany reorienting between different depths. For example, bilateral anterior frontal eye field (FEF) is involved in alternately fixating at three spatial locations centrally aligned along the participants' midline, resulting in changes in the degree of eye vergence (Alvarez et al., 2010; Alkan et al., 2011). There are, however, two critical differences between the aforementioned fMRI studies on vergence eye movements and our study. First, in the former case participants made overt eye movements between different central depth locations, while in our study participants fixated at the central cross and covertly reoriented attention between peripheral locations at different depths without explicit eye/head movements. Second, the bilateral premotor activation in our study is more ventral (by ∼20 mm) and more posterior (by ∼20 mm) than the FEF activations reported in the two fMRI studies on vergence eye movements. Together, our results, combined with previous evidence, strongly suggest that premotor cortex is specifically involved in representing object locations along the third dimension of space, allowing for both covertly reorienting attention between depths and guiding overt eye/hand actions along the third dimension of space.
Second, we found a functional dissociation between the ventral frontoparietal attention network and a network including PCC, OPFC, and left AG during attentional reorienting in 3D space. On the one hand, compared with the valid condition, the right TPJ, together with frontoparietal and posterior areas, showed higher neural activity whenever visuospatial attention was reoriented, regardless of the reorienting dimension. When an unexpected event evokes reorienting of attention between two locations on a two-dimensional display, both the ventral and the dorsal frontoparietal networks are transiently activated (Corbetta et al., 2000, 2008; Corbetta and Shulman, 2002), while the ventral areas might signal the occurrence of unexpected events to the dorsal areas (Vossel et al., 2012). Ventral frontal areas have been found to be activated when reorienting is unexpected (Shulman et al., 2009) and may generally be involved in the detection of irregular events (Vossel et al., 2011). Our results extend these previous findings by showing that the right ventral frontoparietal attention network is activated by unexpected stimuli not only within a two-dimensional plane, but also along the third dimension of space (depth).
On the other hand, our behavioral data suggest that participants were able to rapidly reorient attention to an object unexpectedly appearing both closer to the observer and in the unattended hemispace (Fig. 2A,C). This ecologically important behavior specifically engaged a network of areas including PCC, OPFC, and left AG. These areas showed enhanced neural activity during reorienting from far to close into the unattended hemispace while they were significantly deactivated during reorienting from close to far into the unattended hemispace (Fig. 6). This pattern of activations corresponds to activations observed in visual search tasks where subjects switched between different visual dimensions, such as color and motion direction (Pollmann et al., 2000; Weidner et al., 2002), reflecting shifts of attentional weight between separate processing modules. However, the depth and the hemispace of the target simultaneously changed not only in the Close_BIV_DH condition but also in the Far_BIV_DH condition, yet increased neural activity of the PCC, OPFC, and left AG network was found only for the former condition, suggesting that these activations are specific to the closer invalid targets. Please note that in both the Close_BIV_SH and the Close_BIV_DH conditions, the targets unexpectedly appeared closer to the observer, but only in the latter condition, associated with speeded attentional reorienting to the closer unexpected targets, were PCC, OPFC, and left AG involved. Gradient models of visuospatial attention suggest that attention is allocated, in a viewer-centered fashion, from the observer to the precued location in depth (Downing and Pinker, 1985; de Gonzaga Gawryszewski et al., 1987). When a location in a farther depth plane is precued, the hemispace between the observer and the precued location should be more attended than the other hemispace. Therefore, the Close_BIV_SH location should receive more attentional resources than the Close_BIV_DH location. 
Our data suggest that only when the targets unexpectedly appear both closer to the observer and at the most unattended location, the neural mechanisms in PCC, OPFC, and left AG will be evoked to speed up the attentional reorienting process.
To survive everyday life while navigating through the 3D world around us, one of the most important behaviors from a biological/ecological perspective is to rapidly redirect our attention to potentially threatening or rewarding stimuli that unexpectedly invade the space closer to us. In contrast, reorienting to stimuli far from us, although also important for everyday life, usually lacks the same urgency and immediacy. Therefore, attentional reorienting along the third dimension of space may involve self-related processing: objects that unexpectedly approach the observer have higher self-relevance, while farther unexpected objects involve fewer self-related thoughts. Correspondingly, PCC, OPFC, and left AG have previously been implicated in the default-mode network of the human brain, a network known to show enhanced neural activity during the evaluation of self-relevant information from the body and the world (Gusnard and Raichle, 2001; Raichle et al., 2001; Fox et al., 2005; Corbetta et al., 2008; Qin and Northoff, 2011).
To summarize, by adapting the Posner spatial-cueing paradigm in three-dimensional space, we found that three independent neural networks were involved in aspects of attentional orienting in depth: the right ventral frontoparietal network was involved in reorienting to unattended objects regardless of depth; bilateral premotor cortex was specifically involved in reorienting attention between depths (from close to far or vice versa); and a network of areas reminiscent of the default-mode network was involved in boosting attentional reorienting to objects unexpectedly appearing closer to the observer and in the unattended hemispace.
Footnotes
This work was supported by a grant from the Deutsche Forschungsgemeinschaft (DFG) (DFG-KFO112, TP 1) to G.R.F. S.V. (Vo 1733/1-1) and R.W. and G.R.F. (We4299/3-1) are supported by the DFG. Q.C. is supported by the Foundation for the Author of National Excellent Doctoral Dissertation of People's Republic of China (200907) and by grants from the Natural Science Foundation of China (30970895 and 31070994).
The authors declare no conflict of interest.
Correspondence should be addressed to Dr. Qi Chen, Cognitive Neuroscience, Institute of Neuroscience and Medicine (INM-3), Research Center Jülich, Leo-Brandt-Strasse 5, 52425 Jülich, Germany. qi.chen27{at}gmail.com