Abstract
One of the more enduring mysteries of neuroscience is how the visual system constructs robust maps of the world that remain stable in the face of frequent eye movements. Here we show that encoding the position of objects in external space is a relatively slow process, building up over hundreds of milliseconds. We display targets to which human subjects saccade after a variable preview duration. As they saccade, the target is displaced leftwards or rightwards, and subjects report the displacement direction. When subjects saccade to targets without delay, sensitivity is poor; but if the target is viewed for 300–500 ms before saccading, sensitivity is similar to that during fixation with a strong visual mask to dampen transients. These results suggest that the poor displacement thresholds usually observed in the “saccadic suppression of displacement” paradigm are a result of the fact that the target has had insufficient time to be encoded in memory, and not a result of the action of special mechanisms conferring saccadic stability. Under more natural conditions, trans-saccadic displacement detection is as good as in fixation, when the displacement transients are masked.
Introduction
The visual system needs to build a representation of space that remains stable in the face of the frequent displacements on the retina each time the eyes move. Saccades create ambiguity about whether motion originates from the self-induced displacement of the retina or from motion in the external world. To track object locations across the transient periods of saccades, presaccadic and postsaccadic object positions must be matched. The object-matching process requires knowledge about the upcoming saccade vector, which is probably mediated by a corollary discharge signal (Sperry, 1950; Sommer and Wurtz, 2002). Information about the saccade vectors has been shown to be present in neurons that predictively shift their receptive fields to the future saccade landing position (Duhamel et al., 1992; Umeno and Goldberg, 1997, 2001; Nakamura and Colby, 2002). Indeed, eye-position information is available in many cortical dorsal areas 100 ms before the eye movements commence, but remains inaccurate until saccadic landing (Morris et al., 2012). These effects may be the origin of many spatial misperceptions at the time of saccades (Honda, 1989; Dassonville et al., 1995; Ross et al., 2001).
A popular way to study the effects of saccades on spatial localization is with the method termed “saccadic suppression of displacement” (SSD). As observers initiate a saccade to a flashed target, the target is displaced parallel to the saccade, and observers report the direction of the displacement (Bridgeman et al., 1975; Deubel et al., 1996; Collins et al., 2009; Demeyer et al., 2010). Observers perform far worse at this task during saccades than during fixation. Recently, physiological studies have identified a possible neural site for the trans-saccadic effects: neurons in the frontal eye field (FEF) seem to be tuned for the stimulus shift across saccades (Crapse and Sommer, 2009). Applying transcranial magnetic stimulation to frontal cortex in humans systematically biases trans-saccadic position estimations, suggesting that the FEFs are implicated in this task (Ostendorf et al., 2012).
The SSD paradigm usually requires subjects to saccade reactively to targets that are turned on abruptly. While experimentally convenient, this situation is very unlike real-life situations, where objects rarely materialize suddenly, but persist over moderately long times. In natural viewing, observers typically saccade to a stimulus that has been present for some time, so the system has time to encode its position well before executing the saccade. Using an adaptation paradigm, we have recently shown that spatiotopic representations take time to construct (Zimmermann et al., 2013). Therefore the amount of preview time before the saccade is executed may be important, and may affect the results of the SSD task.
In this study we investigated localization of spatial position over saccades, under conditions when the oculomotor system has sufficient time to encode the presaccadic target location. We found that discrimination of displacement improved considerably with preview duration. The preview benefit was nearly identical in a fixation task, with a strong mask to dampen the displacement transients. The results reveal a new and overlooked aspect in trans-saccadic position estimation.
Materials and Methods
Participants.
Seven subjects (one male author; two male and four female naive subjects; mean age, 28 years; range, 25–33 years) participated in the saccade task of the main experiment. Eight subjects (mean age, 31 years; range, 28–37 years) participated in the fixation task of the main experiment. Seven subjects (mean age, 32 years; range, 28–37 years) participated in the contrast control experiment.
All subjects had normal or corrected-to-normal vision. Subjects gave informed consent. The experiments were carried out along the principles laid down in the Declaration of Helsinki.
Saccade condition.
Figure 1 describes the general experimental procedure. Subjects sat 57 cm from a 22 inch CRT color monitor (120 Hz, 800 × 600 pixels; Barco Calibrator) displaying a uniform gray field of 18.8 cd/m2. A black fixation point (0.75° diameter, 0.5 cd/m2, 0.97 contrast) appeared at −10, 0° (where 0, 0° refers to screen center), to which the subject directed gaze. The fixation point was switched off after 100 ms (to reduce visual references) but the subject maintained fixation. After 1000 ms, a target appeared at +10°, to which the subject saccaded on cue (a beep), presented between 0 and 500 ms after the saccade target appeared. With this method we could systematically vary the preview duration of the saccade target before saccade execution.
Figure 1. Experimental setup for saccade and fixation trials. The black circles indicate eye position. The red squares indicate targets for fixation or saccades. A, Saccade trials. A trial started with subjects directing gaze to a fixation point. After 100 ms, the fixation point was turned off and subjects continued to maintain fixation on the blank screen. The saccade target T1 appeared 1000 ms later. Subjects saccaded to it on auditory cue 0–500 ms after saccade target onset. As soon as the saccade was detected, the saccade target was displaced either leftwards or rightwards (T2). At the end of the trial, the subject indicated the direction of the target displacement by key press. B, Fixation trials. The sequence was similar to that of the saccade trials, except that subjects maintained fixation at the position of the fixation point for the entire trial. T1 was displayed for 0–900 ms, and followed by 60 ms of high-contrast mask (simulating the masking effect of the saccade). T2 was then displayed leftwards or rightwards of its original position and subjects reported the direction of the shift by key press.
As soon as the eyes had moved 2.5° in the direction of the target, the target was displaced either leftwards or rightwards. Thirteen different displacement sizes between ±3° (including zero) were drawn pseudo-randomly with equal probability. The subject indicated by key press whether the target was displaced to the left or to the right (2-AFC task). Then the next trial began with the presentation of a fixation point.
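The displacement schedule and the gaze-contingent trigger described above can be sketched as follows. This is a minimal illustration under stated assumptions: the text gives only the range (±3°) and the number of levels (13), so equal spacing in 0.5° steps is assumed, and "pseudo-random with equal probability" is read as independent uniform draws. All function names are hypothetical.

```python
import random

def displacement_levels(max_deg=3.0, n_levels=13):
    """Equally spaced displacements from -max_deg to +max_deg (zero included).
    Equal spacing is an assumption; the text states only range and count."""
    step = 2 * max_deg / (n_levels - 1)
    return [round(-max_deg + i * step, 6) for i in range(n_levels)]

def draw_displacement(rng, levels):
    """Pick one displacement with equal probability for every level."""
    return rng.choice(levels)

def should_displace(start_x_deg, current_x_deg, criterion_deg=2.5):
    """True once gaze has travelled criterion_deg rightward, toward the target."""
    return (current_x_deg - start_x_deg) >= criterion_deg

levels = displacement_levels()
rng = random.Random(0)  # fixed seed only to make the sketch repeatable
example_displacement = draw_displacement(rng, levels)
```

Negative values stand for leftward and positive values for rightward displacements, matching the sign convention of the screen coordinates used in the text.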
Masking condition.
To examine whether the results were peculiar to saccades, we also ran the experiment with subjects maintaining fixation, with a brief visual whole-field texture mask (0.5 × 0.5° pixel size with luminance varying randomly from zero to twice background luminance within each pixel) to suppress motion transients (simulating to some extent the action of the saccade). As before, subjects fixated on the fixation point, which was switched off after 100 ms, and maintained fixation for 1000 ms on the blank screen. The target was then presented 10° to the right of fixation for 100–700 ms, followed by a 60 ms visual mask. The target was presented again, horizontally displaced over the same range as used in the saccade condition. For the entire session the subject maintained fixation at the initial fixation position, where the fixation point had been.
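A texture mask of the kind described above could be generated along these lines. This is a sketch, not the stimulus code used in the experiment: a uniform luminance distribution is assumed (the text says only that luminance varied randomly between zero and twice background), the mask extent in degrees is a free parameter because the screen extent is not stated, and `make_mask` is a hypothetical name.

```python
import random

BACKGROUND_CD_M2 = 18.8  # background luminance given in the text

def make_mask(width_deg, height_deg, element_deg=0.5,
              background=BACKGROUND_CD_M2, seed=None):
    """Return a 2-D list of luminances (cd/m2), one per mask element.
    Each element is drawn uniformly from [0, 2 * background] (assumed distribution)."""
    rng = random.Random(seed)
    n_cols = int(round(width_deg / element_deg))
    n_rows = int(round(height_deg / element_deg))
    return [[rng.uniform(0.0, 2.0 * background) for _ in range(n_cols)]
            for _ in range(n_rows)]

mask = make_mask(40.0, 30.0, seed=1)  # 40 x 30 deg extent is illustrative only
```

Under this assumption the mean luminance of the mask matches the background, so the mask adds contrast energy without changing mean luminance.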
Eye movements and data analysis.
Eye movements were monitored by the EyeLink 2000 system (SR Research), which samples gaze positions with a frequency of 2000 Hz. Viewing was binocular, but only the dominant eye was recorded. The system marked the start of a saccade when eye velocity exceeded 22°/s and acceleration exceeded 4000°/s2, and the end of the saccade when they fell below these values.
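The velocity and acceleration criteria can be illustrated with a simple offline detector. This sketch is not the eye tracker's own parser, whose internal filtering is not described here; it assumes raw finite differences on a noise-free horizontal position trace, and `detect_saccade_onset` is a hypothetical name. Real recordings would need smoothing before differentiation.

```python
def detect_saccade_onset(x_deg, sample_rate_hz=2000.0,
                         vel_thresh=22.0, acc_thresh=4000.0):
    """Return the index of the first sample at which both the velocity (deg/s)
    and acceleration (deg/s^2) criteria are exceeded, or None if never."""
    dt = 1.0 / sample_rate_hz
    # First difference: velocity between consecutive samples.
    vel = [(x_deg[i + 1] - x_deg[i]) / dt for i in range(len(x_deg) - 1)]
    # Second difference: acceleration between consecutive velocity estimates.
    acc = [(vel[i + 1] - vel[i]) / dt for i in range(len(vel) - 1)]
    for i, a in enumerate(acc):
        if abs(vel[i + 1]) > vel_thresh and abs(a) > acc_thresh:
            return i + 2  # latest position sample used by this estimate
    return None

# Synthetic trace: 100 samples of fixation at -10 deg, then a 200 deg/s ramp.
trace = [-10.0] * 100 + [-10.0 + 0.1 * k for k in range(1, 51)]
onset_index = detect_saccade_onset(trace)
```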
All saccades with amplitude larger than 10° and latency between 100 and 1000 ms went into analysis (97% of all data). Average saccade latency was 244 ms (SD, 48 ms), and average landing position was 9.6° (SD, 0.8°). The value did not vary significantly with preview duration. The saccade target presentation duration was defined as the duration between target onset and target displacement, which was triggered by the eye movement. This duration thus depended on the delay until the saccade cue was played plus the saccade reaction time of the subjects. For each subject, data were binned according to preview duration into five equal intervals of 125 ms, and the last bin with an open interval. When a bin contained <5 data points, we averaged the data from this bin with those from its neighbor.
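The binning rule can be sketched as below. Several details are assumptions of this illustration rather than statements from the text: five closed 125 ms bins followed by one open-ended bin, sparse bins pooled with the preceding bin (or with the following one when the first bin is sparse), and pooling at the level of trials. The function names are hypothetical.

```python
def bin_index(duration_ms, width_ms=125.0, n_closed=5):
    """Bin index for a preview duration; index n_closed is the open-ended last bin."""
    return min(int(duration_ms // width_ms), n_closed)

def bin_trials(durations_ms, width_ms=125.0, n_closed=5, min_count=5):
    """Group durations into bins, then pool any bin holding fewer than
    min_count trials with its neighbor (assumed merge direction: leftward)."""
    bins = [[] for _ in range(n_closed + 1)]
    for d in durations_ms:
        bins[bin_index(d, width_ms, n_closed)].append(d)
    merged = []
    for b in bins:
        # Pool when the current bin or the running previous bin is too sparse.
        if merged and (len(b) < min_count or len(merged[-1]) < min_count):
            merged[-1].extend(b)
        else:
            merged.append(list(b))
    return merged

durations = [50] * 6 + [200] * 2 + [300] * 7 + [700] * 5
pooled = bin_trials(durations)
```

In this example the two trials at 200 ms are too few to form their own bin, so they are pooled with the first bin, leaving three bins of 8, 7, and 5 trials.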
The psychophysical data were expressed as “proportion rightward” as a function of displacement. Gaussian error functions were fit to the raw data, and the SDs of these functions taken as a measure of displacement sensitivity. Mean and SEM were estimated across subjects.
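A fit of this kind can be sketched as follows. The fitting routine used in the study is not specified, so this illustration assumes a maximum-likelihood fit of a cumulative Gaussian found by a coarse grid search over the mean (bias) and SD (threshold); the grid ranges, the 0.05° resolution, and all function names are hypothetical choices.

```python
import math

def cumulative_gaussian(x, mu, sigma):
    """Probability of a 'rightward' response at displacement x (deg)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def neg_log_likelihood(data, mu, sigma):
    """data: iterable of (displacement_deg, n_rightward, n_trials)."""
    nll = 0.0
    for x, k, n in data:
        # Clamp to avoid log(0) at extreme displacements.
        p = min(max(cumulative_gaussian(x, mu, sigma), 1e-9), 1.0 - 1e-9)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll

def fit_psychometric(data):
    """Grid-search estimate of (mu, sigma); sigma is the threshold measure."""
    mus = [i * 0.05 for i in range(-40, 41)]   # -2.0 to 2.0 deg
    sigmas = [i * 0.05 for i in range(1, 81)]  # 0.05 to 4.0 deg
    return min(((m, s) for m in mus for s in sigmas),
               key=lambda ms: neg_log_likelihood(data, ms[0], ms[1]))
```

A steeper psychometric function corresponds to a smaller fitted SD, which is why the SD serves as the threshold measure throughout the Results.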
Results
Displacement sensitivity in saccade and fixation trials
The left-hand plots of Figure 2 show psychometric functions of three typical subjects for judging displacement of the saccade target under two conditions: when saccades were performed immediately to the target (the standard condition to study saccadic suppression of displacement, shown by triangles) and when they were delayed by an average of 731 ms (SD, 92 ms; shown by open circles). With the delayed saccades, the psychometric functions are far steeper, indicating lower thresholds (given by the SD of the fitted error functions). Delaying saccades, which increases preview duration of the saccade target, results in higher displacement sensitivity.
Figure 2. A, Example psychometric functions for the saccade condition, for three representative subjects. Filled triangles refer to trials where the saccade was cued as soon as the target was presented (the standard condition for this type of study). Open circles refer to trials with the longest preview duration (731 ms; SD, 92 ms). The data are fitted with a Gaussian error function whose SD (σ) is taken as an estimate of threshold (just-noticeable difference). Thresholds are lower (steeper curves) for all subjects with the longest preview duration. B, Psychometric functions for the fixation condition. Same conventions as in A. The longest preview duration in fixation trials was 900 ms.
Psychometric functions for judging displacements during fixation are shown in the right-hand plots of Figure 2. In this task, a target was briefly masked and reappeared after mask offset in a displaced position. Data are shown for when the first target was shown for 100 ms before mask onset (triangles) or for 900 ms (circles). The curves are steeper for the longer presentation duration.
Figure 3A plots average displacement thresholds (means of the SDs of the psychometric functions fitted to the individual data) as a function of the duration of target presentation before saccade onset (measured accurately for each trial). For an average target duration of 183 ms (the reaction time of a reactive saccade), the average displacement threshold was 1.14°, similar to that reported in the standard SSD task, where subjects saccade immediately on target presentation (Bridgeman et al., 1975; Deubel et al., 1996). However, threshold steadily decreased as a function of target duration, to 0.66° for an average duration of 731 ms.
Figure 3. A, Displacement thresholds (geometric mean of thresholds, calculated separately for all subjects) for the saccade condition as a function of presentation duration. Saccade latencies were measured for all trials, and target presentation time binned into 100 ms bins. Error bars represent standard error across subjects. The continuous curves show exponential fits to the data (Eq. 1). B, Displacement thresholds in the fixation condition.
Figure 3B shows the results for the fixation condition, where subjects maintained fixation throughout the trial, and a visual mask was presented to attenuate transient displacement signals after variable intervals. When the target was displayed for only 100 ms before displacement, sensitivity was poor (threshold, 1.26°); at 900 ms predisplacement duration, the threshold decreased to 0.74°. Longer target durations clearly yielded significant improvement of performance, when compared with the 100 ms target duration.
A two-way ANOVA revealed a significant main effect for preview duration (df = 4, F = 4.15, p = 0.0047). No statistical difference between data from the saccade and fixation conditions was found (df = 1, F = 0.98, p = 0.32). The absence of an interaction effect suggests that saccade and fixation data were similarly affected by the preview duration (df = 4, F = 0.34, p = 0.84).
The curves passing through the data are exponential decay functions of the following form, Equation 1: T = T∞ + k × exp(−t/τ), where T is threshold, t time, k a positive constant governing gain, τ the decay constant, and T∞ the saturation threshold to which T falls as t increases. The decay constants were similar in the two conditions, 380 ± 260 ms for the saccade condition, and 318 ± 410 ms for the fixation condition. The saturation threshold T∞ was 0.5° in the saccade condition and 0.7° in the fixation condition.
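As a rough consistency check on the reported fit parameters, the saccade-condition curve can be evaluated numerically. The gain constant is not reported, so this sketch represents it as a positive `amplitude` (the excess of the initial threshold over the saturation value) and, purely for illustration, anchors it to the first reported data point (1.14° at 183 ms). This is not the authors' fitting procedure, and the function names are hypothetical.

```python
import math

def threshold_deg(t_ms, t_inf, amplitude, tau_ms):
    """Threshold that decays exponentially toward the saturation value t_inf."""
    return t_inf + amplitude * math.exp(-t_ms / tau_ms)

def amplitude_from_point(t_ms, threshold, t_inf, tau_ms):
    """Amplitude that makes the curve pass through one (t_ms, threshold) point."""
    return (threshold - t_inf) / math.exp(-t_ms / tau_ms)

# Saccade condition: reported saturation 0.5 deg and decay constant 380 ms.
amp = amplitude_from_point(183.0, 1.14, t_inf=0.5, tau_ms=380.0)
predicted_731 = threshold_deg(731.0, 0.5, amp, 380.0)
```

With these values the curve predicts a threshold of roughly 0.65° at 731 ms, close to the measured 0.66°, so the reported saturation threshold and decay constant are mutually consistent with the two quoted data points.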
Saccadic landing
To check whether differences in saccade landing accuracy might explain the changes in trans-saccadic displacement detection, we calculated responses to displacement as a function of the difference between the target position (after displacement) and eye landing position. Figure 4A shows for an example subject the psychometric functions calculated after compensation for eye landing for each individual trial (black triangles). The gray open circles show for comparison the results as a function of the physical position of the target, without correcting for landing position. Figure 4B shows sensitivity measured with respect to saccadic landing against that measured from physical position. The two measurements are clearly comparable (R2 = 0.97, p < 0.001), as Deubel et al. (1996) previously observed. There was also a trend for sensitivity measured from physical position to have lower thresholds than sensitivity measured from saccadic landing (two-way ANOVA, main effect physical or landing position: df = 1, F = 5.557, p = 0.056). Clearly, using saccadic landing as a reference does not change the slope, which becomes progressively steeper with exposure duration (main effect preview duration: df = 4, F = 10.022, p = 0.002; interaction: df = 1, F = 0.57, p = 0.543).
Figure 4. A, Psychometric functions for a typical subject, calculated either as a function of the physical displacement of the target (gray open circles), or the difference between the target position (after displacement) and eye landing position (black triangles). The slope of the functions is similar for both conditions. B, Average displacement thresholds, calculated from individual psychometric functions considering saccadic landing (ordinate) against those considering only the physical position of the target (abscissa). The continuous line is a linear regression (−0.09 + 1.23x, R2 = 0.98). The dashed line is the equality line. C, Average displacement bias, calculated from individual psychometric functions considering saccadic landing (ordinate) against those considering only the physical position of the target (abscissa). The continuous line is a linear regression (0.07 + 0.67x, R2 = 0.74). The dashed line is the equality line.
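The landing-position compensation amounts to re-expressing each trial's displaced target position relative to where the eye actually landed. A minimal sketch, assuming horizontal positions in screen coordinates (degrees) and a hypothetical function name:

```python
def displacement_re_landing(target_x_deg, displacement_deg, landing_x_deg):
    """Position of the displaced target relative to the eye's landing position."""
    return (target_x_deg + displacement_deg) - landing_x_deg

# Target at +10 deg, displaced 1 deg rightward, eye landing at 9.6 deg
# (the average landing position reported in the Methods).
relative = displacement_re_landing(10.0, 1.0, 9.6)
```

In this example a physical displacement of 1° corresponds to a target 1.4° to the right of the landing position, because the saccade undershot the target by 0.4°.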
Contrast
To check whether effective visibility may explain the benefit for longer target presentation durations, we measured displacement thresholds for seven subjects with low-contrast and high-contrast stimuli. We used light fixation points and targets of 20.7 cd/m2, a contrast of 10% on the 18.8 cd/m2 background (Fig. 5, filled triangles). The high-contrast stimuli (open circles) were dark disks of 97% contrast, as before. For both contrasts, thresholds decreased with duration with very similar values over the range of durations.
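The two contrast values can be reproduced from the stated luminances. The contrast definition is not given in the text; this sketch assumes Weber contrast (absolute luminance difference divided by background luminance), which matches both reported figures.

```python
def weber_contrast(stimulus_cd_m2, background_cd_m2):
    """Unsigned Weber contrast of a stimulus on a uniform background."""
    return abs(stimulus_cd_m2 - background_cd_m2) / background_cd_m2

high = weber_contrast(0.5, 18.8)   # dark disk used in the main experiment
low = weber_contrast(20.7, 18.8)   # light target used in the control
```

Rounded to two decimals these come out at 0.97 and 0.10, in agreement with the 97% and 10% contrasts quoted above.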
Figure 5. Thresholds as a function of target exposure duration for two different contrasts, 97% (open circles) and 10% (filled triangles), averaged over seven subjects. Error bars refer to ±1 standard error across subjects.
A two-way ANOVA confirmed a significant main effect of preview duration (df = 4, F = 6.57, p < 0.001). However, no significant main effect was found for the contrast modulation (df = 1, F = 1.43, p = 0.23) and no significant interaction effect (df = 4, F = 0.16, p = 0.95). It is clear that the effect of target preview duration does not result from increased visibility or saliency of the longer target durations.
Discussion
The primary result of this study is that discrimination of position depends on the period of time available to encode target location. Discrimination sensitivity improved steadily with increasing duration of target presentation, to a threshold of 0.5°, up to approximately 500 ms. Discrimination sensitivity also improved with duration to a threshold of 0.7° in the condition where subjects made no saccades, but a visual mask came on after a variable duration. The increase in position discrimination sensitivity was similar in the saccade and fixation conditions. In these latter trials, a mask was presented to mimic the suppression of motion transients, which are active at the time of saccades. The similar performance with and without saccades suggests a central mechanism for position estimation that occurs whether the eye has moved or not. Since saccades barely diminish the accumulated discrimination sensitivity, the compensatory remapping for the executed saccade vector must be much more precise than previously thought.
Two kinds of cues are available to detect the displacement of a stimulus: motion transients, which should stimulate motion detection mechanisms, and a nondynamic comparison between prelocations and postlocations. If motion signals are unavailable, either because they are damped during saccades (Burr et al., 1982; 1994; Volkmann, 1986; Bremmer et al., 2009; Allison et al., 2010) or masked by a high-contrast mask (like the change-blindness paradigm; Rensink et al., 1997; Simons and Levin, 1997), the visual system must rely on comparison of the remembered position of the first stimulus location with the location of the second stimulus. The results of this study show that this process depends on the time available to encode the first stimulus location. This suggests that the precise encoding of spatial position is not immediate, but improves over relatively long intervals of time, up to 500 ms.
It may be argued that it is not duration per se that improves localization of stimuli, but stimulus saliency. For this reason we also varied contrast, and showed that it had no effect. For stimulus contrasts of both 10 and 97%, thresholds decreased progressively with duration, at very similar rates. Clearly, it is exposure duration that is important for precise localization, not stimulus strength or saliency.
Theories of visual stability have highlighted the role of the saccade target in trans-saccadic position matching (Deubel et al., 1996; 2002; McConkie and Currie, 1996). These theories assume that the saccade target acts as a reference to help re-establish object position after the eyes have landed. Changes in the saccade target itself go unnoticed because a second reference would be necessary to enable displacement detection. Thus SSD has been thought to reveal mechanisms involved in stability during saccades (Deubel et al., 1996). However, the results of this study suggest a different explanation. The poor displacement thresholds observed when subjects make reactive saccades to targets (as they do in the standard SSD paradigm) do not, we believe, result from the action of special mechanisms conferring saccadic stability, but from the fact that the target has had insufficient time to be encoded in memory. If stimuli are previewed for a longer time, as occurs in natural viewing, thresholds are greatly reduced. So under more natural conditions, trans-saccadic displacement detection is much better than revealed by the standard techniques of making reactive saccades to abruptly appearing transient stimuli.
The long time course of position sensitivity (≤500 ms) makes it an unlikely candidate for preserving visual stability in real time. We usually make saccades every 300 ms and stability mechanisms are required after every saccade. Other, more rapid-acting mechanisms that anticipate the action of the saccades (Duhamel et al., 1992; Wurtz, 2008; Morris et al., 2012) must be involved in generating transient spatiotopy (for a recent review, see Burr and Morrone, 2012).
That postsaccadic blanking destroys SSD has been taken as strong evidence for the “reference theory of visual stability” (Deubel et al., 1996). It is claimed that visual stability works to a large extent on the assumption that a stationary target will not move during an eye movement, and therefore relatively large displacements go unnoticed. If the stimulus “disappears,” even briefly, then the stability assumption is broken and displacements become detectable. However, one possibility is that the blanking period gives the system extra time to encode the stimulus in its original position, before the stimulus reappears in its displaced position. Another is that the blank creates extra transient signals outside the interval of maximum saccadic suppression, and these aid displacement detection. We are currently investigating these possibilities with stimuli of various durations, blanked after saccadic landing.
The long duration required for position encoding agrees well with other recent results from our laboratory showing that a spatiotopic representation of an image develops over time, again of the order of 500 ms. The displacement thresholds of this study require a spatiotopic representation, as the presaccadic and postsaccadic images occur at very different retinal positions. Spatiotopic position shifts in saccadic adaptation (Zimmermann et al., 2011) also require >250 ms to build up (Zimmermann et al., 2013). Thus they reinforce the notion that spatiotopic representations are built over this time scale. This also agrees with a study by Bastin et al. (2013) showing that allocentric representations in parahippocampal gyrus build up over a similar time period. That the results are very similar in the fixation condition, where there is no clear dissociation between retinotopic and spatiotopic representations, suggests that spatiotopic representations may be used for this task, even when it is not strictly necessary to do so.
There is much evidence for spatiotopy in human vision. This comes from fMRI studies (d'Avossa et al., 2007; Crespi et al., 2011), spatial specificity of adaptation effects (Melcher, 2005; Burr et al., 2007; Turi and Burr, 2012), trans-saccadic summation (Hayhoe et al., 1991; Melcher and Morrone, 2003; Prime et al., 2007), and apparent motion (Rock and Ebenholtz, 1962; Fracasso et al., 2010; Szinte and Cavanagh, 2011). However, there has been a good deal of controversy. For example, for BOLD responses to be spatiotopically selective, attention has to be directed to the stimuli (Gardner et al., 2008; Crespi et al., 2011). And spatiotopic selectivity of the tilt aftereffect occurs only with sufficient exposure of the saccadic target before saccading to it (Zimmermann et al., 2013). So it is possible that much of the controversy may be explained by the fact that spatiotopic representations are not automatic and immediate, but build up actively over hundreds of milliseconds.
Conclusion
This study suggests that encoding of spatial position does not occur immediately, but takes time, saturating at approximately 500 ms. Half a second is a surprisingly long period for encoding spatial information. As humans make continual saccades, up to 3/s, this encoding duration is longer than the duration of a typical fixation. Precise visual position encoding thus is not necessary for visual stability, which is required after each saccade. However, the position encoding compensates almost perfectly for executed saccades, as suggested by the almost identical results between saccades and fixation.
Footnotes
This work was supported by the European Union [Space, Time and Number in the Brain (STANIB), FP7-ERC] and by the Italian Ministry of University and Research [Ministero dell'Istruzione, dell'Università e della Ricerca, Progetti di ricerca di interesse nazionale (MIUR-PRIN)].
- Correspondence should be addressed to Eckart Zimmermann at the above address. ec.zimmermann{at}fz-juelich.de