We are normally not aware of the microscopic eye movements that keep the retinal image in motion during visual fixation. In principle, perceptual cancellation of the displacements of the retinal stimulus caused by fixational eye movements could be achieved either by means of motor/proprioceptive information or by inferring eye movements directly from the retinal stimulus. In this study, we examined the mechanisms underlying visual stability during ocular drift, the primary source of retinal image motion during fixation on a stationary scene. By using an accurate system for gaze-contingent display control, we decoupled the eye movements of human observers from the changes in visual input that they normally cause. We show that the visual system relies on the spatiotemporal stimulus on the retina, rather than on extraretinal information, to discard the motion signals resulting from ocular drift. These results have important implications for the establishment of stable visual representations in the brain and argue that failure to visually determine eye drift contributes to well-known motion illusions such as autokinesis and induced movement.
It is remarkable that we perceive a stable visual world. Unlike our percept, the visual input to the retina is never stationary. Eye movements continually displace the retinal projection of the scene, even when we attempt to maintain steady fixation (Ditchburn, 1973; Steinman et al., 1973). How does the visual system identify and discard the motion of the retinal image caused by eye movements?
Two main groups of theories have been proposed. According to extraretinal theories, motor/proprioceptive signals that convey information about the movement (von Helmholtz, 1867; von Holst and Mittelstaedt, 1950) and/or current position (Sherrington, 1918) of the eye are used to correct input changes resulting from oculomotor activity. According to retinal theories, the visual input itself enables identification of the changes in the retinal stimulus caused by eye movement (Gibson, 1954).
Visual stability has been extensively investigated in the presence of large eye movements, such as saccades and smooth pursuit. Both retinal (MacKay, 1970; Lappe et al., 2000) and extraretinal (Ross et al., 2001; Wurtz, 2008) mechanisms have been reported to contribute under these conditions. However, few studies have focused on how the visual system discards the retinal image motion caused by the small eye movements that occur during visual fixation. During fixation, slow drifts and small saccades move the stimulus over dozens of photoreceptors. While fixational saccades tend to occur relatively rarely (Collewijn and Kowler, 2008), and some observers can even suppress them (Steinman et al., 1967), ocular drift is always present. Fixational drift causes retinal velocities that are well beyond thresholds for motion detection (Nakayama and Tyler, 1978) and evoke strong neuronal responses (Snodderly et al., 2001; Greschner et al., 2002; Kagan et al., 2008). It is unclear how the brain distinguishes the retinal image motion resulting from ocular drift from that caused by an external object moving at a similar velocity.
Both retinal and extraretinal mechanisms could account for our perception of a stable scene during eye drift. Stabilization based on the retinal stimulus is supported by the motion perceived during viewing of a static noise pattern following adaptation to dynamic random noise (Murakami and Cavanagh, 1998). However, extraretinal compensation is also possible, as a large component of drift is under active control of the oculomotor system (Nachmias, 1961; Steinman et al., 1973). The qualitative report of drifting of the scene during subparalysis of the extraocular muscles (Stevens et al., 1976) lends support to an extraretinal mechanism.
Under natural viewing conditions, retinal and extraretinal contributions cannot be easily disentangled, as the motion of the eye is always accompanied by a corresponding change in the retinal stimulus. However, these two theories make opposite predictions about the perception of a stimulus that moves with the eye so as to yield an immobile retinal projection. Retinal theories predict that such a stimulus will be perceived as stationary, since no motion occurs on the retina; extraretinal theories, as moving, since the motion of the eye would no longer be compensated by an equivalent displacement of the retinal stimulus. Furthermore, extraretinal theories predict that the length/speed of drift will affect the probability of perceiving motion with a stationary retinal stimulus, since larger/faster drifts would cause larger cancellation errors. In contrast, such dependence should not occur with retinal mechanisms. Unlike retinal theories, extraretinal theories also predict that stimuli displayed at the same retinal location at different times during ocular drift should be perceived in different spatial positions. In the following sections, we report the results of three experiments that tested these predictions.
Materials and Methods
Thirteen subjects with normal vision (seven males and six females) took part in the study. Twelve subjects were naive about the purposes of the study and were paid to participate. One of the authors (C.L.) participated in experiment 3. Informed consent was obtained from all participants following the procedures approved by the Boston University Charles River Campus Institutional Review Board.
Stimuli consisted of a 4′ dot (2 cd/m²) that was displayed either alone (experiments 1 and 3) or at the center of a 9° square (experiment 2). In experiments 1 and 2, stimuli appeared at the current location of gaze and were displayed for 1 s, a duration sufficiently short to prevent fading of the image during retinal stabilization. In the dot-moving trials of these experiments, the trajectory of the stimulus was randomly selected among 16 possible directions that uniformly covered 360°. In experiment 3, three dots were flashed for 50 ms at intervals of 500 ms. The reference dot was presented with a horizontal displacement of 15′ from the current location of gaze. This displacement randomly alternated in different trials between left and right. Afterimages were created by observing a 22 ms photographic flash (Unomat B20, GN64) placed at a distance of 60 cm and partially covered to yield a square afterimage of 5′. Subjects were instructed to maintain precise fixation at the center of the device to center the afterimage at the preferred retinal location. Stimuli were always observed in complete darkness, with the exception of the light condition of experiment 1.
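The selection of dot trajectories in the dot-moving trials can be illustrated with a short sketch. It assumes that the 16 directions start at 0° and are spaced 22.5° apart, that positions are expressed in arcminutes relative to the dot's starting point, and that one position is computed per monitor frame; the function names are hypothetical and are not taken from the experimental software.

```python
import math
import random

N_DIRECTIONS = 16        # number of possible directions (from the text)
SPEED_ARCMIN_S = 30.0    # dot speed in the dot-moving trials (arcmin/s)
DURATION_S = 1.0         # stimulus duration in experiments 1 and 2
REFRESH_HZ = 200         # refresh rate reported for the monitor


def direction_angles(n=N_DIRECTIONS):
    """Angles (radians) of n directions evenly spaced over 360 degrees.

    Starting at 0 degrees is an assumption; the text only says that the
    directions uniformly covered 360 degrees.
    """
    return [2.0 * math.pi * k / n for k in range(n)]


def dot_trajectory(angle, speed=SPEED_ARCMIN_S, duration=DURATION_S,
                   rate=REFRESH_HZ):
    """Per-frame (x, y) offsets in arcmin for uniform motion along `angle`."""
    n_frames = int(round(duration * rate))
    step = speed / rate  # arcmin travelled per frame
    return [(step * i * math.cos(angle), step * i * math.sin(angle))
            for i in range(n_frames + 1)]


def random_trajectory(rng=random):
    """Pick one of the 16 directions at random and return its trajectory."""
    return dot_trajectory(rng.choice(direction_angles()))
```

With these values the dot advances 0.15′ per frame and covers 30′ over the 1 s presentation.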
Stimuli were displayed on a fast phosphor CRT monitor (Iyama HM204DT) at a resolution of 800 × 600 pixels and vertical refresh rate of 200 Hz. Subjects were kept at a fixed distance of 120 cm from the monitor by means of a dental imprint bite bar and a head rest, which prevented movements of the head. Stimuli were observed monocularly with the right eye, while the left eye was patched.
Eye movements were continuously recorded by means of a Generation 6 Dual Purkinje Image (DPI) Eyetracker (Fourward Technologies). The nominal resolution of this system is ∼20″ with a time delay of ∼0.25 ms (Crane and Steele, 1985). Vertical and horizontal eye positions were sampled at 1 kHz and recorded for subsequent analysis.
Control of the retinal stimulus was obtained by means of EyeRIS, a system that enables flexible gaze-contingent display (Santini et al., 2007). This system acquires eye movement signals from the eyetracker, processes them in real time, and updates the stimulus on the display according to the current gaze location. The maximum delay of the system is equal to the time required to render two frames on the display (10 ms at 200 Hz; typical delay 7.5 ms). Previous analysis of EyeRIS performance has shown that this system stabilizes the retinal image with accuracy higher than 1′ (Rucci et al., 2007).
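The delay figures quoted above follow directly from the refresh rate, as the sketch below shows. The last function is a back-of-the-envelope estimate that is not part of the original analysis: it assumes the eye moves at constant velocity during the delay and uses 30′/s (the dot speed of experiment 1) as an illustrative drift speed.

```python
REFRESH_HZ = 200.0            # vertical refresh rate reported for the monitor
MAX_DELAY_FRAMES = 2.0        # maximum delay stated in the text
DRIFT_SPEED_ARCMIN_S = 30.0   # illustrative speed, an assumption


def frame_duration_ms(refresh_hz):
    """Duration of one display frame in milliseconds."""
    return 1000.0 / refresh_hz


def delay_ms(n_frames, refresh_hz):
    """Delay corresponding to a number of display frames."""
    return n_frames * frame_duration_ms(refresh_hz)


def retinal_slip_arcmin(speed_arcmin_s, delay_in_ms):
    """Distance the eye travels during the delay at constant speed."""
    return speed_arcmin_s * delay_in_ms / 1000.0


max_delay = delay_ms(MAX_DELAY_FRAMES, REFRESH_HZ)            # 10 ms
slip = retinal_slip_arcmin(DRIFT_SPEED_ARCMIN_S, max_delay)   # about 0.3 arcmin
```

Under these assumptions, the eye would travel roughly 0.3′ during the maximum delay, and the typical delay of 7.5 ms corresponds to 1.5 frames.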
Task and procedure.
Three separate experiments were conducted. Experiment 1 consisted of a forced-choice motion discrimination task. Subjects were asked to report whether a small dot displayed on a dark background was stationary or translated at 30′/s (see Fig. 1a). In one condition (normal condition), the dot either was presented at a fixed location on the CRT or translated on the display. In a second condition [eye-movement (EM)-compensated condition], the retinal image motion resulting from fixational eye movements was eliminated by executing dot trajectories in a reference frame fixed to the subject's retina. In a control condition, subjects reported whether or not an afterimage appeared to move. Experiment 2 differed from experiment 1 by the addition of a reference square around the dot. The impact of differential motion on the retina was examined in four conditions in which the dot and the square moved in different ways (see Fig. 3a). Experiment 3 consisted of a two-alternative forced choice (2AFC) task, in which subjects reported which one of two test dots briefly flashed at distinct times during ocular drift was at the same spatial location as a reference dot displayed at the beginning of each trial. One of the two test dots was displayed at the same monitor location as the reference dot, whereas the other dot was displayed at the location that yielded the same projection of the reference dot on the observer's retina.
Data were collected in separate experimental sessions, each ∼40 min long. Every session started with preliminary setup operations that lasted a few minutes and allowed the subject to adapt to darkness or, in the light condition of experiment 1, to the low level of light in the room. These preliminary operations included the following: (1) positioning the subject optimally and comfortably in the apparatus; (2) tuning the eyetracker; and (3) calibrating EyeRIS. Subjects were then presented with two to three blocks of 40–60 experimental trials. Breaks between consecutive blocks ensured that subjects were never constrained in the apparatus for >20 min consecutively.
To convert the eye position measurements given by the eyetracker into screen coordinates, a calibration procedure preceded each block of trials. This calibration consisted of two phases. In the first phase, the subject sequentially fixated on each of nine points equispaced within the working area of the display. For each point, the mean output voltage from the eyetracker was estimated over a period of 3.5 s. Fixation was repeated until the SD of the measurement fell below a predefined threshold. The mapping from eye-position coordinates to degrees of visual angle was determined by bilinear interpolation over the mean eye positions measured at these nine points. This transformation was made possible by virtue of the highly linear behavior of the DPI eyetracker within the central region of the visual field. In the second phase of the calibration procedure, subjects fine-tuned and/or confirmed the gaze-to-pixel mapping. In this phase, subjects controlled the position of a cross, stabilized at the center of gaze and displayed in real time on the screen, while fixating on various locations marked on the screen. Subjects refined the offsets and gains of the calibration by using EyeRIS' joypad.
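The first phase of the calibration can be sketched as follows. This is a simplified stand-in rather than the EyeRIS procedure: instead of piecewise bilinear interpolation between the nine points, it fits a single bilinear model, gaze = a0 + a1·vx + a2·vy + a3·vx·vy, to all nine points by least squares. All function names are hypothetical, and the voltages in the usage note are synthetic.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def features(vx, vy):
    """Bilinear basis evaluated at one eyetracker output sample."""
    return [1.0, vx, vy, vx * vy]


def fit_bilinear(voltages, targets):
    """Least-squares coefficients of the bilinear model (normal equations)."""
    rows = [features(vx, vy) for vx, vy in voltages]
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(4)]
           for i in range(4)]
    Xtt = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(4)]
    return solve_linear(XtX, Xtt)


def calibrate(voltages, gaze_deg):
    """Return a function mapping eyetracker output to gaze in degrees.

    voltages: mean (vx, vy) output measured at each calibration point.
    gaze_deg: known (x, y) position of each point in degrees.
    """
    ax = fit_bilinear(voltages, [g[0] for g in gaze_deg])
    ay = fit_bilinear(voltages, [g[1] for g in gaze_deg])

    def to_degrees(vx, vy):
        f = features(vx, vy)
        return (sum(a * v for a, v in zip(ax, f)),
                sum(a * v for a, v in zip(ay, f)))

    return to_degrees
```

For a quick check, one can generate synthetic voltages from a known linear relation on a 3 × 3 grid and confirm that the returned function recovers the grid positions. The manual refinement of offsets and gains in the second phase is not modeled here.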
To ensure optimal control of retinal image motion, stimuli were always displayed within the central 3° of the CRT, the region in which the DPI eyetracker exhibits the most linear behavior. If the subject looked away from this area for longer than 4 s, a small square was flashed (150 ms) to indicate the center of the monitor. In all experiments, subjects reported their percept by pressing one of two buttons on a joypad at the end of the trial, after the stimulus disappeared.
In the control experiment of Figure 2b, observers reported the presence/absence of motion in the afterimage during successive 1 s intervals. Intervals were separated by a 4′ dot displayed for 1 s at the center of the monitor. Afterimages were observed in complete darkness with no other stimuli on the monitor. The experimental session terminated when the subject reported that the afterimage had become weak, usually after ∼2 min (∼30 trials).
To ensure accurate compensation of fixational eye movements, all trials with blinks and/or saccades, as well as all the trials in which the eye moved >40′, were discarded from data analysis. Saccades were defined as movements with minimal amplitude of 3′ and peak velocity higher than 3°/s. The motion of the eye during drift was characterized by means of several parameters, including the following: (1) mean velocity; (2) drift amplitude, defined as the modulus and the angle of the vector connecting the first and last point of the trajectory followed by the eye during the considered intersaccadic interval; (3) drift length, defined as the average integral along the eye trajectory; and (4) span, defined as the radius of the smallest circle encompassing the trajectory. In the experiments described here, all these parameters yielded similar results, and only the length is reported. In the trials of experiment 3, subjects spent intervals in complete darkness, a condition in which the amplitude of ocular drift varies considerably among observers. To discount this individual variability while preserving the distribution of drift length of each observer, in Figure 5, samples from each subject were normalized by the maximum drift exhibited by the subject across all the available trials. Levels of performance were evaluated over a minimum of 30 drift-only trials for each subject in every experiment and condition.
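Some of the drift parameters listed above can be computed as in the following sketch. It makes several assumptions that the text does not settle: drift length is taken as the summed distance between consecutive samples, mean velocity as that length divided by the duration of the interval, and the input is assumed to be a saccade-free trace in arcminutes that has already been filtered. The span (smallest enclosing circle) is omitted for brevity, and the function names are hypothetical.

```python
import math

SAMPLE_RATE_HZ = 1000  # eye position sampling rate reported in the text


def drift_metrics(xs, ys, rate=SAMPLE_RATE_HZ):
    """Summary measures for one drift segment (positions in arcmin)."""
    steps = [math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
             for i in range(len(xs) - 1)]
    length = sum(steps)                  # path length along the trajectory
    duration = (len(xs) - 1) / rate      # seconds
    dx, dy = xs[-1] - xs[0], ys[-1] - ys[0]
    return {
        "length": length,
        "mean_speed": length / duration,              # arcmin/s
        "amplitude": math.hypot(dx, dy),              # first-to-last distance
        "angle_deg": math.degrees(math.atan2(dy, dx)),
    }


def normalize_by_subject_max(lengths):
    """Scale one subject's drift lengths by that subject's maximum."""
    longest = max(lengths)
    return [value / longest for value in lengths]
```

In unfiltered recordings, measurement noise adds to the summed path length, so some smoothing would normally precede a computation of this kind.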
In a forced-choice discrimination task, subjects were asked to report whether a small dot displayed on a dark background was stationary (dot still) or translated in a random direction at 30′/s (dot moving, see Fig. 1a), a speed comparable to that of ocular drift during intersaccadic fixation. We compared performance measured in two conditions. In the normal condition, dot trajectories were realized in a reference frame fixed to the monitor; that is, the dot either was presented at a fixed location of the CRT or drifted with uniform motion on the display. In the EM-compensated condition, the retinal image motion resulting from fixational eye movements was eliminated by executing dot trajectories in a reference frame fixed to the subject's retina. In this condition, the position of the dot on the screen was adjusted in real time to compensate for the subject's eye movements so that the retinal projection of the dot remained motionless or translated uniformly at 30′/s (Fig. 1b). This operation was accomplished by means of a flexible system for gaze-contingent display, which enabled precise control of the spatiotemporal stimulus on the retina during ocular drift (Santini et al., 2007).
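The difference between the two conditions comes down to the reference frame in which the dot trajectory is specified. The sketch below is an idealized illustration with hypothetical function names: it ignores display latency and assumes that gaze samples and trajectory offsets share the same units and the same frame timing.

```python
def normal_condition(start, trajectory):
    """Screen positions when the trajectory is fixed to the monitor."""
    return [(start[0] + dx, start[1] + dy) for dx, dy in trajectory]


def em_compensated_condition(gaze_samples, trajectory):
    """Screen positions when the trajectory is fixed to the retina.

    Each frame, the dot is drawn at the current gaze position plus the
    desired retinal offset, so eye movements do not shift it on the retina.
    """
    return [(gx + dx, gy + dy)
            for (gx, gy), (dx, dy) in zip(gaze_samples, trajectory)]


def retinal_position(screen_positions, gaze_samples):
    """Position of the dot relative to the line of sight on each frame."""
    return [(sx - gx, sy - gy)
            for (sx, sy), (gx, gy) in zip(screen_positions, gaze_samples)]
```

For a dot-still trial (all trajectory offsets equal to zero), the compensated condition yields a constant retinal position regardless of drift, whereas in the normal condition the retinal position changes by the opposite of the eye displacement.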
To provide a baseline for evaluating our results, the gray line in Figure 1c shows levels of performance measured when the experiment was run in an illuminated environment. As expected, in the normal condition, observers were highly accurate in discriminating whether or not the stimulus moved on the screen. In contrast, performance dropped drastically when the fixational motion of the retinal image was eliminated (t(5) = 6.5, p < 0.01; two-tailed paired t test). This reduction in performance occurred because, paradoxically, subjects experienced motion when observing a retinally immobile dot (Fig. 1d). Since the edges of the display were visible in this experiment, this motion percept could have been caused not only by an uncompensated extraretinal signal but also by the motion of the dot relative to the monitor. To discriminate between these possibilities, we eliminated differential motion by repeating the experiment in complete darkness.
Switching off the light drastically lowered performance. In the normal condition, observers often experienced motion with a stationary stimulus—a phenomenon known as autokinesis (Carr, 1910)—and were also less accurate in detecting motion. This reduction in performance reveals that a precise registration between retinal image motion and extraretinal signals does not occur during fixational drift. However, it does not per se rule out an extraretinal account of visual stability, as the presence of inaccuracies in extraretinal signals has long been postulated to explain autokinesis (Gregory and Zangwill, 1963; Verheijen and Oosting, 1964). Interestingly, unlike in the light, cancellation of fixational instability improved performance in the dark (t(5) = 7.67, p < 0.01) (black line in Fig. 1c). That is, in the absence of visual references, subjects were more accurate in discriminating between a stationary dot and a moving one when the retinal image motion caused by ocular drift was eliminated. Furthermore, in darkness, stimuli immobile on the retina were no longer systematically perceived to be moving as in the light (percentage of moving responses: light 75.7%; dark 38.4%; t = 6.5, p < 0.01). Thus, it appears that relative object motion, rather than an extraretinal drift signal, was responsible for the motion experienced when retinally immobile stimuli were observed in the light.
To better distinguish between different theories of visual stability, we examined the influence of ocular drift on the probability of reporting motion in the dot-still trials. Figure 2 shows that, in the dark, percentages of moving responses increased with the length/speed of drift when the dot was immobile on the display (normal condition) and were unaffected by drift when the dot was immobile on the retina (EM-compensated condition). That is, the amount of motion of the dot on the retina, but not the extent of ocular drift, influenced the probability of reporting motion in the absence of visual references. These results are not compatible with an extraretinal stabilization mechanism that relies on a signal proportional to ocular drift. Such a mechanism predicts that percentages of moving responses on dot-still trials should (1) increase with the extent of drift in the EM-compensated condition, when the extraretinal signal is not counterbalanced by a corresponding shift of the retinal image, and (2) decrease with the length/speed of drift in the normal condition, when the more accurate (i.e., less noisy) corollary discharges associated with larger/faster drifts should yield better cancellation of retinal image motion.
Diametrically opposite correlations were measured in the presence of visual references. As shown by Figure 2, in the light, percentages of moving responses increased with the length/speed of drift in the compensated condition but not during normal viewing. These correlations confirm that sensitivity to relative object motion was the cause for perceiving motion when a retinally immobile dot was observed in an illuminated environment. Relative motion between the dot and the frame of the CRT increased with the extent of drift in the compensated condition, when the dot moved on the monitor following the observer's eye movements, and was zero in the normal condition.
To rule out possible influences from inaccuracies in retinal stabilization, we also examined the probability of perceiving motion during observation of an afterimage, a perfectly stabilized retinal image. As shown in Figure 2b, percentages of moving responses were lower with an afterimage than with a retinally stabilized dot, a difference possibly caused by changes in the response criteria in the two experiments and/or noise in our stabilization apparatus. Different response biases are to be expected with an afterimage, given that the retinal stabilization experiments contained two types of trials (dot still and dot moving), whereas there was only one type of trial in this case. Notably, despite this change in response level, the extent of ocular drift continued to have no impact on the probability of reporting motion even during viewing of an afterimage. These results can hardly be reconciled with an extraretinal mechanism of visual stability.
The data collected in an illuminated environment emphasize the importance of relative object motion. Observers perceived motion when the stimulus moved relative to the display, even though the dot was actually immobile on their retinas, and perceived the stimulus as stationary in the absence of differential motion, even though the dot moved on the retina because of fixational instability. To better quantify the impact of relative and absolute motion on visual stability, in experiment 2, we enriched the stimulus by displaying a square around the dot (see Fig. 3a). Stimuli were observed in complete darkness.
Addition of a visual reference was sufficient to eliminate autokinesis, so that percentages of correct discrimination were very high during normal viewing (condition C1 in Fig. 3). Performance dropped when we eliminated the retinal image motion caused by eye movements, as the presence of the square reintroduced the percept of motion during viewing of a dot immobile on the retina (condition C2). As shown in Figure 3b, performance levels in this condition were virtually identical to those measured with the dot alone (no square) when the room was illuminated. These data confirm that the different results obtained in the dark and light conditions of experiment 1 originated from the impact of visual references and not from possible differences in visual processing caused by changes in illumination.
In two further conditions in this experiment, we precisely controlled differential motion by manipulating the position of the square during a trial. To conclusively determine that relative motion between the dot and the square was indeed the reason why subjects perceived motion during viewing of a retinally immobile dot, we examined the effect of stabilizing the square on the observer's retina (condition C3). This procedure eliminated differential motion in the dot-still trials and restricted it to occur in the dot-moving trials, as in condition C1. As shown in Figure 3c, a retinally immobile dot was no longer perceived to be moving in this condition, and levels of performance were statistically indistinguishable from those measured in C1.
In the last experimental condition (C4), we controlled the square position so that relative motion was restricted to the dot-still trials instead of the dot-moving trials. The square either shifted on the display with the dot or, in the dot-still trials, translated in a random direction at 30′/s. Stimuli were observed normally, without compensating for eye movements. As shown in Figure 3, performance was extremely low in this condition. Observers experienced induced movement (Duncker, 1929) and reported the dot to be moving when it was stationary on the screen and to be still when it actually moved together with the square. That is, the joint drift of the dot and the square was perceptually suppressed, and motion was perceived only when the dot translated relative to the square. These results demonstrate that, with a richer stimulus than a single dot, differential motion in the retinal image was the main element driving perceptual responses.
Figure 4 shows that eye movements had no impact on perceptual reports. Subjects exhibited similar oculomotor activity in the two conditions in which stimuli were observed normally (C1 and C4). They maintained accurate fixation when the dot was stationary on the monitor and followed the dot with their gaze when it moved. As a consequence, ocular drift was larger in the dot-moving trials than in the dot-still trials (Fig. 4a), whereas the motion of the retinal image of the dot was comparable in the two types of trials (Fig. 4b). Yet subjects gave opposite responses in the dot-still and dot-moving trials of conditions C1 and C4 (Fig. 4c). Paradoxically, in C4, the condition of induced motion, subjects systematically reported an accurately tracked moving dot to be stationary and an accurately fixated stationary dot to be moving. Thus, motion judgments were not influenced by the extent of ocular drift, even when drift was actively generated to maintain a moving stimulus at the preferred retinal location.
These results show that small pursuit-like eye movements with amplitudes comparable to those of fixational drifts are under precise control of the oculomotor system; a moving target is accurately tracked, whereas a stationary one is accurately fixated. Yet the visual system ignores this motor information and reconstructs the motion of the eye from the retinal image on the basis of differential motion. These results provide further evidence that extraretinal drift signals are not used in creating a stable visual percept and suggest that erroneous visual estimation of ocular drift might be at the origin of some cases of induced movement.
In addition to yielding different predictions about the perception of motion with a retinally stabilized stimulus, retinal and extraretinal theories also make opposite predictions regarding the accuracy of spatial localization of isolated stimuli. If an extraretinal signal is responsible for canceling out the displacements of the retinal image caused by ocular drift, stimuli displayed at the same retinal position but at different times during drift should be perceived at different spatial locations. In contrast, if visual stability during ocular drift relies on the retinal input, such stimuli should be perceived as spatially coincident. The third experiment of this study was designed to test these predictions.
In a 2AFC task, subjects were asked to report which one of two test dots briefly displayed at distinct times during ocular drift was at the same spatial location as a reference dot shown at the beginning of each trial. As illustrated in Figure 5a, the flashes were separated by intervals of 500 ms, a sufficiently long period for ocular drift to move the eye by a considerable amount. One of the two test dots was displayed at the same monitor location as the reference dot, whereas the other dot was displayed at the location that yielded the same retinal projection as the reference dot. Stimuli were observed in complete darkness to eliminate all visual references.
According to extraretinal theories, by updating spatial representations by means of an extraretinal signal proportional to drift, subjects should be able to successfully select the dot at the same monitor location as the reference. The percentage of trials with a correct response should also increase with larger/faster drifts, which yield stronger and more precise extraretinal signals. According to retinal theories, instead, performance should be below chance level, because subjects would mistakenly choose the dot at the same retinal position as the reference. The proportion of trials in which subjects select the dot at the same monitor location as the reference should also decrease with the extent of drift, as larger eye movements result in larger retinal offsets between the projections of the two test dots and, thus, reduce ambiguity.
Figure 5b shows the results of this experiment. On average, observers were more likely to select the test dot flashed at the same retinal position as the reference rather than the one coincident with the reference's spatial location (mean percentage of “same monitor position” responses: 37.8%; z = −10.02; p < 0.05, two-tailed z test). Furthermore, analysis of subjects' responses as a function of eye movements revealed that the probability of selecting the dot at the correct spatial location decreased with the length of oculomotor drift. The mean percentage of trials in which observers selected the dot at the same monitor position as the reference (1) was close to chance level in the trials in which the eye moved little, when the retinal projections of the three dots were close to each other, and (2) decreased in the trials with longer drift. In these trials, subjects consistently reported the dot and the reference to be at the same spatial location when their positions overlapped on the retina but not on the monitor. That is, the larger the extent of ocular drift, the more frequently observers mistakenly reported that the test dot displayed at the same retinal location as the reference was the one displayed at the reference's spatial location. This correlation is consistent with the predictions of retinal theories and opposite to the trend expected from extraretinal mechanisms.
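A comparison against chance level of this kind can be illustrated with a one-sample z test for a proportion. The text does not specify how the reported statistic was computed, so the sketch below is only one plausible form of the test, and the counts in the usage note are invented for illustration; they are not the trial counts of the experiment.

```python
import math


def proportion_z_test(successes, n, p0=0.5):
    """Two-tailed z test of an observed proportion against p0.

    Uses the normal approximation with the standard error evaluated
    under the null hypothesis. Returns (z, two-tailed p value).
    """
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1.0 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_tailed
```

With hypothetical counts of 40 "same monitor position" responses in 100 trials, `proportion_z_test(40, 100)` gives z close to −2 and a two-tailed p value of about 0.046. A negative z indicates a proportion below chance, as in the reported result.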
The results of this experiment show that spatial representations are not updated by an extraretinal signal proportional to ocular drift. In the absence of visual references, the brain ignores changes in eye position caused by drift.
Identification of the retinal image motion caused by eye movements can be accomplished in at least two distinct ways. The first possibility is to use information about the position and/or movement of the eye, a theory that dates back to Helmholtz (1867). This extraretinal signal could originate either from eye muscle proprioceptors after the eye moves or from the motor command itself before the movement, a signal known as efferent copy or corollary discharge. Alternatively, the brain could infer eye movements from the translation of the image on the retina (Gibson, 1954). Studies that investigated visual stability during saccades have found evidence for extraretinal compensation. The phenomena of saccadic compression (Morrone et al., 1997; Ross et al., 1997) and suppression (Burr et al., 1982) are commonly considered manifestations of such a mechanism. Furthermore, the shifting of cell receptive fields before saccades (Duhamel et al., 1992; Sommer and Wurtz, 2006) and the deficits in planning sequences of saccades during inactivation of the mediodorsal thalamus (Sommer and Wurtz, 2002) provide evidence for a corollary discharge [see Wurtz (2008) for a comprehensive review].
Although most studies have focused on large eye movements, the need for accurate visual stabilization also exists during the small eye movements of fixation (Ditchburn, 1973; Steinman et al., 1973). Unlike saccades, fixational drift yields retinal velocities that are well within the range of sensitivity of the visual system. During sustained fixation, drift typically covers an area with diameter of 30′ and has average velocity of 0.5°/s. Velocity is higher during natural intersaccadic fixation, when it can reach 2–3°/s. This motion causes the stimulus to shift across a large number of receptors in the retina, particularly in the fovea, where receptor density is higher. Since saccadic suppression does not occur at this time, an efficient mechanism of visual stability must be responsible for the perceptual cancellation of the retinal image motion caused by ocular drift. Use of extraretinal signals might not be the best strategy to achieve visual stability during fixation, because of possible imprecision in the control of microscopic eye movements. Visual illusions such as the jitter aftereffect (Murakami and Cavanagh, 1998) and the peripheral drift illusion (Beer et al., 2008), as well as the observation that subjects with poor fixation accuracy tend to possess higher thresholds for motion detection (Murakami, 2004), suggest a retinal cancellation of ocular drift. However, these studies only provide indirect evidence for a retinal compensation mechanism, and contributions from other factors cannot be excluded.
A major difficulty in investigating the mechanisms of visual stability is that possible contributions from retinal and extraretinal mechanisms always co-occur under normal viewing conditions. Whenever the eye moves, the retinal image translates, so that a possible extraretinal signal associated with the eye movement is always accompanied by a corresponding change in the visual input. To isolate retinal and extraretinal contributions, in this study, we used a flexible system for gaze-contingent display (Santini et al., 2007), which enabled precise control of the stimulus on the retina during oculomotor activity. In this way, we decoupled eye movements from the retinal modulations they normally cause and created an unnatural condition in which eye movements had no impact on the motion of the retinal image. This approach allowed direct testing of retinal and extraretinal theories.
Our conclusion that perceptual cancellation of ocular drift occurs via retinal mechanisms, rather than extraretinal ones, is based on multiple experimental observations. Our data show that, during viewing of a sufficiently rich stimulus, (1) the probability of reporting motion increases with the amount of differential motion (experiments 1 and 2); and (2) it does not depend on the absolute motion of the retinal image, nor on which object (the dot or the background) was responsible for causing differential motion on the retina (experiment 2). Furthermore, (3) possible extraretinal drift information is not used with a complex stimulus (experiment 2). In the absence of differential motion, a moving dot is perceived as stationary even though it is tracked by means of eye movements, whereas a well fixated stationary dot is reported as moving when differential motion is present. These results demonstrate that, whether or not an extraretinal drift signal exists in the brain, it is normally overridden by the information available in the retinal input during exposure to the rich stimulation of natural environments.
In addition, a putative extraretinal drift signal does not appear to have an impact even in the absence of visual references. We found that (4) the probability of perceiving motion during viewing of an isolated dot increases with the length/speed of ocular drift when the dot is immobile in space, and (5) it does not depend on the length/speed of ocular drift when the dot is immobile on the retina (experiment 1). Furthermore, (6) spatial localization of stimuli displayed at separate times fails to take into account the displacements caused by ocular drift in the interval between presentations (experiment 3). These results are to be expected from a stabilization based on the retinal input but conflict with the predictions of extraretinal theories. In contrast with our findings, an extraretinal mechanism would predict (1) more accurate perceptual cancellation with larger/faster drifts, (2) an increment in the frequency of motion responses with the extent of drift under retinal stabilization, and (3) no localization errors during separate flashes. As explained below, our results suggest a mechanism of drift cancellation in which a motion signal common to multiple regions of the retina is subtracted from the visual input.
In addition to providing insights into the mechanisms of visual stability, the results of this study help to elucidate the origins of autokinesis, the apparent motion of a small light source in an otherwise completely dark room (Carr, 1910). As in the case of visual stability, it has long been debated whether autokinesis originates from the retinal image motion caused by eye movements (Matin and MacKinnon, 1964) or from the possible failure of an extraretinal compensation mechanism (Gregory, 1958). Extraretinal theories of autokinesis typically postulate the existence of asymmetries in the strengths of the oculomotor muscles and the consequent need for unbalanced motor commands during maintained fixation. These motor differences are not canceled out by a corresponding translation of the retinal image and are, therefore, erroneously interpreted by the brain as motion in the external stimulus (Gregory and Zangwill, 1963; Verheijen and Oosting, 1964). Extraretinal theories predict that autokinesis should decrease as the extent of ocular drift increases, since larger and faster drifts, which are presumably associated with less noisy extraretinal signals, should be more accurate in canceling out the motion of the stimulus on the retina. Furthermore, these theories also predict that autokinesis should not be experienced with a stimulus that remains immobile on the retina when the eye is free to drift, because fixation on a single spatial location is not required in this condition, and unbalanced motor commands are no longer needed (Gregory and Zangwill, 1963).
Our data speak against extraretinal theories of autokinesis. In the experiments of this study, subjects often reported motion when observing an isolated dot at a fixed monitor location, an effect that did not occur in the presence of visual references. Yet, in contrast with the predictions of extraretinal theories, the probability of reporting a stationary stimulus as moving increased with the amplitude of drift. Furthermore, although diminished, autokinesis continued to occur with a retinally stabilized stimulus or with an afterimage. Instead of an extraretinal origin, the results of our experiments suggest that autokinesis originates from a failure to estimate ocular drift with an extremely impoverished stimulus. Uncertainty about eye movement prevents the observer from determining whether the stimulus moved, even in the absence of retinal image motion, which explains our results. Unlike extraretinal accounts of autokinesis, this view has no problem in explaining why autokinesis does not occur in the presence of visual references (Koffka, 1935) and is reduced when the size of the stimulus increases (Luchins, 1954), two cases in which ocular drift can be visually estimated.
While our results show that the visual system relies on the retinal stimulus to discard the input changes caused by fixational drift, the specific mechanisms by which this cancellation occurs remain a matter of investigation. Based on our findings, we propose that this process relies on the coherence of motion signals on the retina. During fixation, the visual system estimates the most common motion signal on the retina. If this signal falls within the range of normal fixational instability, it is assumed to originate from eye movements and perceptually subtracted from the motion of the retinal image. As a result of this subtraction, only the regions of the scene that move with velocity different from that of ocular drift are perceived as moving. This mechanism functions efficiently in most cases, but fails under special circumstances, such as (1) with severely impoverished stimuli, when extraction of common motion is impossible (autokinesis); or (2) if a coherent motion signal compatible with ocular drift is not generated by eye movement, a case that yields an erroneous estimation of ocular drift (induced motion). The medial superior temporal area, an area involved in the analysis of complex optic flow information and self-motion (Britten, 2008), is a possible candidate for performing the proposed estimation of coherent motion. Further work is needed to explore this hypothesis.
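The proposed cancellation scheme can be illustrated with a small numerical sketch. This is not the authors' model or an implementation used in the study; it is a minimal illustration under stated assumptions. The function name, the use of a quantized mode as the "most common motion signal", the 3°/s limit for "normal fixational instability" (borrowed from the upper drift velocity quoted earlier), and the rule that fewer than two regions yields no drift estimate are all hypothetical choices.

```python
import numpy as np


def cancel_common_motion(velocities, drift_limit=3.0, bin_width=0.1, min_regions=2):
    """Subtract the most common retinal motion if it is compatible with drift.

    velocities  : (N, 2) retinal velocities of N image regions, in deg/s.
    drift_limit : ASSUMED upper bound on drift speed (deg/s).
    bin_width   : ASSUMED quantization step used to find the most common signal.
    min_regions : ASSUMED minimum number of regions needed to extract common motion.
    Returns (perceived_velocities, estimated_drift); estimated_drift is None
    when no common motion can be extracted (impoverished stimulus).
    """
    v = np.asarray(velocities, dtype=float)
    if len(v) < min_regions:
        # Impoverished stimulus: no common signal is available. As an
        # assumption of this sketch, the retinal motion is left uncorrected.
        return v.copy(), None

    # Most common motion signal: mode of the velocities after quantization.
    bins = np.round(v / bin_width).astype(int)
    unique_bins, counts = np.unique(bins, axis=0, return_counts=True)
    common = unique_bins[np.argmax(counts)] * bin_width

    # Attribute the common signal to eye movement only if it lies within
    # the range of normal fixational instability.
    drift = common if np.linalg.norm(common) <= drift_limit else np.zeros(2)
    return v - drift, drift


# Five background regions share the drift-induced motion; one dot also moves
# by itself at 1 deg/s relative to the background.
retina = [[0.5, 0.0]] * 5 + [[1.5, 0.0]]
perceived, drift = cancel_common_motion(retina)
print(drift)       # estimated drift
print(perceived)   # only the dot retains nonzero motion
```

In this sketch, a background that moves coherently on the retina while a dot stays retinally immobile makes the dot appear to move in the opposite direction, which corresponds to the induced-motion failure case described above. A single isolated region provides no common signal at all, which corresponds to the autokinesis case.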
This work was supported by Grant EY18363 from the National Institutes of Health and Grants BCS-0719849 and IOS-0843304 from the National Science Foundation. We thank Dan Bullock, David Richters, Eric Schwartz, and Antonino Casile for helpful discussions and comments on this manuscript.
Correspondence should be addressed to Dr. Michele Rucci, Boston University, 64 Cummington Street, Boston, MA 02215.