Abstract
Why does the world appear stable despite the visual motion induced by eye movements during fixation? We find that the answer must reside in how visual motion signals are interpreted by perception, because MT neurons in monkeys respond to the image motion caused by eye drifts in the presence of a stationary stimulus. Several features suggest a visual origin for the responses of MT neurons during fixation: spike-triggered averaging yields a peak image velocity in the preferred direction that precedes spikes by ∼60 ms; image velocity during fixation and firing rate show similar peaks in power at 4–5 Hz; and average MT firing during a period of fixation is related monotonically to the image speed along the preferred axis of the neurons 60 ms earlier. The percept caused by the responses of MT neurons during fixation depends on the distribution of activity across the population of neurons of different preferred speeds. For imposed stimulus motion, the population response peaks for neurons that prefer the actual target speed. For small image motions caused by eye drifts during fixation, the population response is large, but is noisy and does not show a clear peak. This representation of image motion in MT would be ignored if perception interprets the population response in the context of a prior of zero speed. Then, we would see a stable scene despite MT responses caused by eye drifts during fixation.
Introduction
Why does the world appear to be stable even though our eyes move constantly, so that the retinal image of the world is never still? Small smooth eye drifts occur even during periods of steady fixation between saccades and microsaccades (Adler and Fliegelman, 1934; Ratliff and Riggs, 1950; Ditchburn and Ginsborg, 1953; Skavenski et al., 1975; Spauschus et al., 1999; Martinez-Conde et al., 2004). The smooth eye movements of fixation, along with microsaccades, play an important role in visual perception. They prevent fading of images by constantly jittering the retinal locus of the image from a stationary object (Ditchburn and Ginsborg, 1952), and they improve perception of fine spatial detail (Rucci et al., 2007). However, they also pose a potential problem.
The smooth drifts in eye position during fixation create an ever-moving retinal image that must be ignored to achieve the perception of a stable world. Recordings from visual neurons have the potential to reveal how and where in the brain the image motions are ignored. If, for example, a motion area did not respond to the eye movements of fixation, then we could conclude that the expected responses had been nulled at an earlier stage. If a motion area did respond to the eye movements of fixation, then the details of those responses would inform the question of how those motion signals are ignored downstream. In the lateral geniculate nucleus, the primary visual cortex (V1), and extrastriate cortical area MT, many neurons respond to the image motion caused by microsaccades (Bair and O'Keefe, 1998; Leopold and Logothetis, 1998; Martinez-Conde et al., 2000, 2002; Reppas et al., 2002). In V1, a subset of neurons responds vigorously during fixation between microsaccades (Snodderly et al., 2001; Kagan et al., 2008). Thus, motion signals are present in the visual system during fixation.
For a long time, it was popular to think that corollary discharge reports of eye motion were used to null and compensate for the image motions caused by eye motion. However, the recent article by Poletti et al. (2010) and the features of the “jitter illusion” (Murakami and Cavanagh, 1998) argue against the corollary discharge explanation. Instead, it appears that visual mechanisms suppress the motion signals that arise from the smooth eye movements of fixation. The nature of the suppression mechanisms may be revealed through recordings from visual motion neurons during the eye drifts of fixation. In particular, given the role of extrastriate area MT in motion perception (Newsome and Paré, 1988), it will be essential to know the discharge properties of MT neurons during the eye drifts of fixation.
We have found that slow drifts of eye position during fixation cause responses in many neurons in extrastriate visual motion area MT. The responses are direction selective but depend only weakly on the speed preference of the neurons. We outline a mechanism that could operate downstream from MT to maintain the perception of a stable world despite an organized set of motion signals in the response of the population of neurons in MT.
Materials and Methods
General procedures.
We recorded eye movements and neural responses from two adult male rhesus monkeys (Macaca mulatta, 7–13 kg). After initial training, monkeys were implanted with hardware to allow head restraint and scleral search coils to record eye movements, as described in detail previously (Ramachandran and Lisberger, 2005). In a separate surgical procedure, we mounted titanium or CILUX recording chambers (Crist Instruments) over a 20 mm circular opening in the skull to allow access to MT for neural recordings. We also recorded the movements of both eyes in a third monkey who had binocular eye coils.
For each experimental session, monkeys sat in a primate chair with their heads immobilized. They received fluid rewards for accurately fixating or tracking visual targets presented on a screen in front of them. Experiments were conducted five times per week and lasted ∼5 h. We used a Thomas Mini-Matrix 05 microdrive (Thomas Recording) to lower quartz-shielded tungsten electrodes with impedances from 1 to 4 MΩ vertically into MT. We identified MT on the basis of stereotaxic coordinates, directional and speed response properties of neurons, receptive field sizes, retinotopic organization, and surrounding cortical areas (Maunsell and Van Essen, 1983).
All surgical and experimental procedures were approved in advance by the Institutional Animal Care and Use Committee of the University of California, San Francisco, and were in strict compliance with the NIH Guide for the Care and Use of Laboratory Animals.
Visual stimuli and experimental design.
All experiments were conducted in a nearly dark room. Visual stimuli appeared on an analog oscilloscope (model 1304A, Hewlett Packard) with a refresh rate of 250 Hz. The oscilloscope was driven by 16-bit digital-to-analog converters on a digital signal-processing board in a PC. The display was positioned 20.5 cm from the monkey and subtended visual angles of 67° by 54°.
After a neuron was isolated, we mapped its receptive field while the monkey fixated within 1° of a bright spot at the center of the screen. We located the receptive field coarsely by hand using a moving spot, and then measured the direction and speed tuning of each neuron for a patch of moving dots placed within the receptive field. Patches were either 5° × 5° with 50 dots, or 8° × 8° with 128 dots placed at random locations within the patch. Because we recorded from central MT, the sizes of the patches matched the scale of the classical receptive fields of the MT neurons we studied.
For measurement of tuning properties, we delivered five moving stimuli of different speeds or directions within each behavioral trial. During each stimulus presentation, dots moved coherently with given direction and speed for 300 ms. Dots moved behind a stationary aperture so that the stimulus remained fixed on the receptive field of the neuron under study. Dots were stationary and visible for 200 ms between stimulus presentations. At the end of each mapping trial, the monkey received a drop of fluid for having maintained proper fixation.
For analysis of responses to image motion during fixation, each trial began with fixation for 500–900 ms of a stationary 0.3° × 0.3° square in the center of the screen. The monkey continued to fixate while a 5° × 5° or 8° × 8° patch of stationary random dots appeared in the receptive field of the neuron for a random duration of 300–900 ms. Presenting the stationary dots before introducing motion allowed us to separate the MT response to the onset of visual motion from the response to the onset of light in the receptive field. At an unexpected time, the fixation target then disappeared and the dots began to move coherently within the fixed aperture of the patch for 100 ms, followed by en bloc motion of the dots and the aperture for an additional 750 ms. The two intervals of motion provided a stimulus for the initiation of smooth pursuit eye movements. During recordings from an individual neuron, we presented four to six different stimulus speeds and/or directions in random order. Subjects usually completed 2000–3600 pursuit trials in a single daily experiment.
Data acquisition.
Eye position and velocity signals were sampled and stored at 1000 Hz on each channel. Velocity signals were obtained by passing eye position signals through an analog differentiator. The circuit differentiated frequency content from 0 to 25 Hz and filtered higher frequencies with a roll-off of 20 dB/decade. Neural signals were amplified and digitized for on-line spike sorting, and spikes were initially assigned to single neurons using a template-matching algorithm (MAP, Plexon). After the experiment, we used a combination of visual inspection of waveforms, projections onto principal components, template matching, and refractory period violations in Offline Sorter (Plexon) to sort and assign spikes to well isolated single units. After sorting, waveforms were converted to timestamps with 1 ms precision for analysis. Firing rates were obtained by convolving spike trains with a Gaussian function having an SD of 10 ms.
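The final smoothing step can be illustrated with a minimal sketch. It assumes spike timestamps in integer milliseconds and a kernel truncated at ±4 SD; the function names and the truncation width are illustrative assumptions, not details taken from the recordings.

```python
import math

def gaussian_kernel(sd_ms=10.0, half_width_sds=4.0):
    """Unit-area Gaussian kernel sampled at 1 ms steps (truncation is an assumption)."""
    half = int(round(sd_ms * half_width_sds))
    weights = [math.exp(-0.5 * (k / sd_ms) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def firing_rate(spike_times_ms, duration_ms, sd_ms=10.0):
    """Convert spike timestamps (integer ms) into a rate trace in spikes/s."""
    kernel = gaussian_kernel(sd_ms)
    half = len(kernel) // 2
    rate = [0.0] * duration_ms
    for t in spike_times_ms:
        for k, w in enumerate(kernel):
            idx = t + k - half
            if 0 <= idx < duration_ms:
                # One spike in a 1 ms bin corresponds to 1000 spikes/s.
                rate[idx] += w * 1000.0
    return rate
```

Because the kernel has unit area, the integral of the rate trace over time recovers the spike count, provided no spike lies close enough to the edge of the trace for the kernel to be clipped.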
Data analysis.
To quantify the tuning properties of each cell, we fitted the speed and direction tuning data with the following Gaussian functions: where s is speed, θ is direction, R0 is baseline firing rate, a is the amplitude of the speed tuning curve, A is the amplitude of the direction tuning curve, ps is preferred speed, pd is preferred direction, c is the width of the speed tuning curve, C is the width of the direction tuning curve, and d is the skew parameter of the speed tuning curve. We defined tuning curves as “low pass” for target speed if their value for stimulus motion at 0.5°/s was >40% of the distance from baseline to peak firing rate for the neuron.
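The "low pass" criterion can be written as a one-line test. The sketch below assumes that "distance from baseline to peak" means the difference between peak and baseline firing rate, and that the value at 0.5°/s is compared after subtracting baseline; the function and argument names are hypothetical.

```python
def is_low_pass(rate_at_slowest, baseline, peak, fraction=0.4):
    """Classify a speed tuning curve as low pass.

    rate_at_slowest: tuning-curve value for stimulus motion at 0.5 deg/s
    baseline, peak:  baseline and peak firing rate of the neuron (spikes/s)
    fraction:        criterion from the text (>40% of baseline-to-peak distance)
    """
    return (rate_at_slowest - baseline) > fraction * (peak - baseline)
```

For example, a neuron with a baseline of 10 spikes/s and a peak of 50 spikes/s would be called low pass if its response at 0.5°/s exceeded 26 spikes/s.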
To study responses to the image motion created by the eye movements of fixation, we analyzed eye movements and neural responses during the interval when the random dots were stationary and visible, and the monkey fixated the central spot. Because the visual stimulus on the receptive field was stationary during the analysis interval, image motion occurred only as a consequence of small eye movements during fixation. To screen neurons for inclusion in the data analysis, we counted the total number of spikes in all behavioral trials in the interval from 130 to 600 ms after stimulus onset. We included a neuron in our dataset if it provided at least 200 spikes. Because we collected many trials for each neuron, this criterion excluded only the neurons with very low baseline firing rates. We excluded the first 130 ms after the onset of the random dot patch to prevent analysis of spikes driven by the onset of the visual stimulus rather than by the retinal image motion consequent to eye movements. We also screened all trials for eye movement behavior. Trials were included in data analyses only if eye position remained within 1° of the fixation point and microsaccades did not occur during the analysis interval. Across our sample of MT neurons, we found >48,693 trials with intervals of fixation that were >600 ms and contained no microsaccades. On average, we estimate that monkeys emitted fewer than three microsaccades per second in the task conditions we used.
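The screening rules above can be sketched as follows. This is an illustration under stated assumptions: spike times are given in milliseconds after stimulus onset, the microsaccade flag comes from a separate detector that is not specified here, and all function names are hypothetical.

```python
def spikes_in_window(spike_times_ms, start_ms=130, end_ms=600):
    """Count spikes in one trial that fall inside the analysis interval."""
    return sum(1 for t in spike_times_ms if start_ms <= t <= end_ms)

def trial_is_usable(max_eye_offset_deg, has_microsaccade, max_offset_deg=1.0):
    """Keep a trial only if gaze stayed within 1 deg and no microsaccade occurred.

    has_microsaccade is a hypothetical flag supplied by a separate detector.
    Treating exactly 1 deg as acceptable is an assumption.
    """
    return max_eye_offset_deg <= max_offset_deg and not has_microsaccade

def neuron_is_included(trials, min_spikes=200):
    """trials: list of per-trial spike-time lists (ms after stimulus onset)."""
    total = sum(spikes_in_window(spikes) for spikes in trials)
    return total >= min_spikes
```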
For each neuron that was admitted for data analysis, we performed a number of computations, as follows: (1) to test whether the oscillations in eye velocity occurred along a given direction, we computed the principal components of velocity for each experiment; (2) to compare the frequency content of the spike trains and eye movements, the eye velocity and firing rate were multiplied with a symmetric Hanning window, zero padded, and then subjected to Fourier analysis; (3) to determine whether the neural responses during fixation were driven by image motion, we calculated the spike-triggered average (STA) of image velocity, image(t), as follows: where ti is the time of occurrence of a spike, τ is the temporal lag between the spike and eye velocity, and N is the total number of spikes (the amplitude of the STA of image velocity was calculated as the peak minus the trough occurring before the spike); and (4) to determine how the image motion during fixation affected subsequent responses to imposed visual motion, we computed the time average of firing rate aligned on either the appearance of a stationary patch of dots in the receptive field of the neuron under study or the time of motion onset for a patch of dots that had been visible and stationary for some time.
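Computation (3) can be illustrated with a short sketch. It assumes the conventional form of the spike-triggered average, in which the image velocity at lag τ before each of the N spikes is summed and divided by N; the sign convention for τ, the 200 ms maximum lag, and the function names are assumptions made for illustration.

```python
def spike_triggered_average(image_velocity, spike_times_ms, max_lag_ms=200):
    """Return STA[lag] for lag = 0..max_lag_ms, with lag in ms before the spike.

    image_velocity: velocity along one axis, one sample per ms
    spike_times_ms: integer spike times indexing into image_velocity
    """
    sums = [0.0] * (max_lag_ms + 1)
    n_used = 0
    for t in spike_times_ms:
        if t - max_lag_ms < 0 or t >= len(image_velocity):
            continue  # skip spikes without a complete pre-spike window
        n_used += 1
        for lag in range(max_lag_ms + 1):
            sums[lag] += image_velocity[t - lag]
    if n_used == 0:
        raise ValueError("no spikes with a complete pre-spike window")
    return [s / n_used for s in sums]

def sta_amplitude(sta):
    """Peak minus trough of the pre-spike STA."""
    return max(sta) - min(sta)
```

In practice the image velocity would first be projected onto the preferred axis of the neuron, and separately onto the orthogonal axis, before averaging.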
Results
The smooth eye movements of fixation
During epochs of fixation between microsaccades, monkeys show smooth drifts in eye movements that appear as oscillations in the eye velocity traces. In Figure 1A, for example, the horizontal and vertical eye velocity in a single trial showed oscillations with a period of ∼250 ms and additional fluctuations at higher frequencies. The peak-to-peak amplitudes of the oscillations in eye velocity were <2°/s and were somewhat larger in vertical versus horizontal eye velocity in this example.
The distribution of smooth eye velocities during fixation was centered near zero, with the overwhelming majority of the observations falling between −0.75° and +0.75°/s (Fig. 1B). The distribution deviates slightly from circular and is biased in the first and third quadrants, indicating a slight tendency for the smooth eye velocities of fixation to be right and up, or left and down. Figure 1B was constructed from analysis of eye velocity in 1 ms bins between 130 and 600 ms after the appearance of a stationary visual stimulus across 17,809 trials for all experiments on monkey Y. We observed similar distributions of drift velocities in all experimental sessions, and principal component analysis revealed a consistent slight directional bias in agreement with that demonstrated in Figure 1B. In one monkey equipped with binocular scleral search coils, we verified that the smooth drifts during fixation were strongly correlated and conjugate between the two eyes.
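The directional bias can be quantified with a principal component analysis of the two-dimensional velocity samples. The sketch below uses the closed-form eigen decomposition of the 2 × 2 covariance matrix; it is an illustration rather than the procedure used for the analysis, and the function name is hypothetical.

```python
import math

def principal_axis(h_vel, v_vel):
    """Return (angle_deg, var_major, var_minor) for 2D velocity samples.

    angle_deg is the orientation of the first principal component,
    measured counterclockwise from the horizontal axis.
    """
    n = len(h_vel)
    mh = sum(h_vel) / n
    mv = sum(v_vel) / n
    cxx = sum((h - mh) ** 2 for h in h_vel) / (n - 1)
    cyy = sum((v - mv) ** 2 for v in v_vel) / (n - 1)
    cxy = sum((h - mh) * (v - mv) for h, v in zip(h_vel, v_vel)) / (n - 1)
    # Closed-form eigen decomposition of a symmetric 2x2 matrix.
    angle = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    root = math.hypot((cxx - cyy) / 2.0, cxy)
    mean_var = (cxx + cyy) / 2.0
    return math.degrees(angle), mean_var + root, mean_var - root
```

An angle between 0° and 90° corresponds to a bias toward the first and third quadrants, and the ratio of the two variances indicates how far the distribution departs from circular.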
Responses of MT neurons during the smooth eye movements of fixation
We recorded 104 neurons in visual area MT of two monkeys (52 neurons each in monkeys Y and J). Most of the neurons had receptive fields centered within the central 5° of the visual field.
Our behavioral task was designed to study the responses of MT neurons under three conditions that differed in the nature of the visual stimulus. In the first part of the behavioral trial, when the monkey was required to fixate a spot target on an otherwise dark screen (Fig. 2A), firing rate was low in most neurons because of the lack of visual stimulation in the receptive field. When a patch of stationary random dots appeared in the receptive field (Fig. 2B), most neurons emitted a transient “on” response followed by firing at a slightly higher rate than with a dark background. When the dots within the patch moved coherently at a velocity that fell within the preferences of the MT neuron under study, firing rate showed a large “motion onset transient” (Lisberger and Movshon, 1999) followed by an elevated sustained response. We take advantage of the specifics of the three stimulus conditions to understand when and how the smooth eye velocities of fixation affect neural responses in visual area MT.
STAs provided direct evidence that MT neurons respond to the small fluctuations in image velocity caused by the eye movements of fixation. Figure 3 shows STAs calculated from spikes that occurred during the interval of fixation with a patch of stationary dots in the receptive field. On average, spikes in MT neurons were preceded by a positive deviation in image velocity of ∼0.2°/s in the preferred direction of the neuron under study (Fig. 3A). STAs of image velocity along the preferred axis of the neuron under study were remarkably similar across neurons (Fig. 3A) and were accompanied by very little image velocity along the orthogonal axis (Fig. 3B).
Averaging across all MT neurons in our sample revealed STAs with a large central peak and flanking troughs that were very similar for the two monkeys (Fig. 3C,D). The peak of the average STA preceded the spikes by 58 and 67 ms for monkeys Y and J, implying that the image velocity drives the MT response. The latency of the STAs is consistent with the latencies of MT neurons for imposed stimulus motion, especially considering that latencies of MT responses are longer for slow speeds (Lisberger and Movshon, 1999). The shapes of the STAs in our data are similar to those found by Bair and Movshon (2004) using a stimulus that was much “whiter” than ours, suggesting that the inevitable autocorrelations in the smooth eye velocities of fixation did not create the STAs. However, the STAs in their data had smaller inhibitory flanks than seen in our data, suggesting that this feature might arise from the temporal autocorrelations in the smooth eye velocities of fixation.
Many of the MT neurons in our sample were tuned for stimulus speeds >10°/s, but the amplitude of the STAs did not depend on the preferred speeds of the neurons (Fig. 4). Further, the shape and amplitude of the STAs were similar in neurons with Gaussian-shaped tuning curves (open symbols) and in neurons with tuning curves resembling low-pass filters (Orban et al., 1981, 1986; Lagae et al., 1993). We attribute the uniformity of the STAs to the fact that the amplitude of the STA cannot exceed the amplitude of the image velocity during drifts, limiting conclusions about neural tuning from the STAs. Still, it is important that virtually all MT cells responded during fixation to the image velocity generated by drifts in the preferred direction of the neuron, in agreement with the finding of Bair and Movshon (2004) that almost all MT neurons have clear STAs even for stimuli of low temporal frequencies.
Frequency content of eye drifts and MT firing rate
The smooth eye movements of fixation and the firing rates of MT neurons had similarities in their Fourier spectra that suggest a relationship. For both horizontal and vertical eye velocity, the spectrum had a peak in amplitude of ∼0.2°/s at 4–5 Hz (Fig. 5A,B,D,E). We have shown previously that eye velocity oscillations in this frequency range occur during steady-state pursuit eye movements and are driven by visual motion inputs rather than by an intrinsic oscillator (Goldreich et al., 1992; Churchland and Lisberger, 2001). The spectrum of firing rate had a peak over the same frequency range as did the eye movements (Fig. 5C,F). The spectrum of eye velocity also showed a second peak at ∼15 Hz that was particularly pronounced for horizontal eye velocity in monkey Y (Fig. 5A). In contrast, the spectrum of firing rate of MT neurons did not show a peak in amplitude at higher frequencies. The Fourier spectra for both eye velocity and firing rate must be attributed to smooth eye drifts rather than to microsaccades, because the data were taken from intervals that lacked microsaccades during fixation of a stationary spot with a patch of stationary dots in the receptive field of the neuron under study.
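The spectral analysis (symmetric taper, zero padding, Fourier transform) can be sketched as follows. The sketch is written with the standard library only, so it uses a direct discrete Fourier transform; a fast transform such as `numpy.fft.rfft` would be the practical choice for real data. Removing the mean before tapering, the padded length, and the amplitude scaling are assumptions made for illustration.

```python
import cmath
import math

def hann_window(n):
    """Symmetric Hann (Hanning) taper of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]

def amplitude_spectrum(signal, sample_rate_hz=1000.0, padded_length=1024):
    """Return (frequencies_hz, amplitudes) for a tapered, zero-padded signal."""
    n = len(signal)
    if padded_length < n:
        raise ValueError("padded_length must be at least len(signal)")
    mean = sum(signal) / n  # mean removal is an assumption, not from the text
    window = hann_window(n)
    tapered = [(x - mean) * w for x, w in zip(signal, window)]
    padded = tapered + [0.0] * (padded_length - n)
    scale = 2.0 / sum(window)  # so a sinusoid of amplitude A peaks near A
    freqs, amps = [], []
    for k in range(padded_length // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / padded_length)
                    for i, x in enumerate(padded))
        freqs.append(k * sample_rate_hz / padded_length)
        amps.append(scale * abs(coeff))
    return freqs, amps
```

Applying the same function to eye velocity and to firing rate from the same fixation intervals allows the locations of the spectral peaks to be compared directly.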
Effect of image velocity on subsequent MT responses
To obtain more direct evidence that MT neurons respond to the image motion created by the smooth eye drift during fixation, we assessed the relationship between image velocity and subsequent firing rates during fixation of a stationary visual stimulus. We combined the data across our full sample of neurons and divided the trials into groups according to the magnitude of image velocity along the preferred axis of each neuron in an interval from 200 to 250 ms after the appearance of the stimulus. We then averaged the image velocity and firing rate as a function of time within each group. Altogether there were 13 groups, of which we show data from 5 in the families of traces that appear in Figures 6A,B. As expected, given how the traces had been grouped, the image velocity for the different groups had different amplitude deflections that reached peaks ∼225 ms after the appearance of the stationary visual stimulus (Fig. 6A). The firing rates in the same groups (Fig. 6B) reached peak values ∼60 ms later, consistent with the latencies of MT neurons to imposed visual motion. The magnitude of the peaks in firing rate varied across groups, in the same sequence as the peaks in image velocity. Modulation of firing rate was large and positive for image motion in the preferred direction and negative but smaller for image motion in the null direction. The peak/trough of firing rate varied systematically from −10 to +25 spikes/s relative to baseline firing as the image velocity varied from −0.6 to 0.7°/s. We realize that the analysis in Figure 6 treads on oscillations in eye velocity and firing rate, both of which can be seen in the data traces. The important point is that the relationship between image velocity at 225 ms and firing rate 60 ms later is consistent with the conclusion that oscillations in image motion drive the oscillations in firing rate.
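The grouping analysis can be sketched in a few lines. The sketch assumes that each trial provides an image velocity trace along the preferred axis and a firing rate trace, both sampled at 1 ms and aligned on stimulus appearance. For simplicity it reads both traces at fixed times (225 ms and 60 ms later) rather than locating the peak or trough of each averaged trace; the bin edges and function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def group_trials_by_velocity(trials, bin_edges, window=(200, 250)):
    """Assign each (velocity_trace, rate_trace) trial to a velocity bin.

    The bin is chosen from the mean image velocity inside `window` (ms).
    """
    groups = {i: [] for i in range(len(bin_edges) - 1)}
    for velocity, rate in trials:
        v = mean(velocity[window[0]:window[1]])
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= v < bin_edges[i + 1]:
                groups[i].append((velocity, rate))
                break
    return groups

def velocity_response_curve(trials, bin_edges, velocity_time_ms=225, latency_ms=60):
    """Return [(mean velocity at 225 ms, mean rate 60 ms later)] per non-empty group."""
    curve = []
    groups = group_trials_by_velocity(trials, bin_edges)
    for i in sorted(groups):
        members = groups[i]
        if not members:
            continue
        v = mean([vel[velocity_time_ms] for vel, _ in members])
        r = mean([rate[velocity_time_ms + latency_ms] for _, rate in members])
        curve.append((v, r))
    return curve
```

Subtracting the rate of the near-zero velocity group from every other group would then give the evoked response as a function of image velocity.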
We quantified the effects of small image motions resulting from the smooth eye movements of fixation for the full set of 13 groups of image velocities, rather than the 5 examples used to construct Figure 6, A and B. For each trace, we measured the peak/trough of image velocity caused by eye drifts 225 ms after the appearance of the stationary stimulus and the peak/trough of firing rate 60 ms later. We measured the baseline firing rate from the group of trials with image velocities near zero and subtracted the baseline from the firing rates for each other group. There was a strong relationship between the evoked firing rate and image velocity when the visual stimulus was stationary in the receptive field of the neuron under study (Fig. 6C, open circles). In contrast, the relationship was much shallower if we conducted the same analysis from recordings during the earlier part of the trials when the monkey fixated a central spot without a stationary stimulus in the receptive field of the neuron (Fig. 6C, open triangles). We attribute the small remaining modulation to the fact that the room was lit dimly so that the face of the oscilloscope was visible and provided a low-contrast, dim stationary stimulus that would evoke small responses in MT neurons (Sclar et al., 1990). Figure 6 supports our conclusion that the responses in MT during fixation probably are driven by the small image motions consequent to eye drifts, rather than by extraretinal inputs related to the eye movements themselves.
So far, we have analyzed the responses of MT neurons during fixation, so that the image motion that drove firing rate arose only from small fluctuations of eye velocity during fixation. Because the fixation periods always led to the onset of target motion of a limited set of velocities, we were able to use data from the same behavioral trials to show that the smooth eye movements of fixation also affected the first 100 ms of the responses of MT neurons to the onset of target motion.
We again divided trials into groups according to the image velocity caused by the eye movements of fixation, now using image velocity in the 50 ms surrounding the onset of target motion. We then computed the average eye velocity and firing rate as a function of time within each group (Fig. 7A,B). Comparison of the firing rates across groups (Fig. 7B) revealed that the initial rising phase of the firing rates of MT neurons was delayed by amounts related to the image velocity present at the time the target started to move (Fig. 7A). Image motion in the nonpreferred direction of the neuron under study caused a delay in the rising phase of firing rate. In contrast, the magnitude of the peak firing rate of MT neurons was largely unaffected by the direction or speed of the image motion present at the time the stimulus started to move.
To quantify the effects of small image motions resulting from the smooth eye movements of fixation, we measured image velocity at the time of motion onset and firing rate 45 ms later, the time point at which the effect of the eye movements was largest. Then, we subtracted the firing rates for the group of trials with zero image velocity at the time of motion onset from the firing rates for each other group. Even though the effect of image motion during fixation on the responses of MT neurons to a moving stimulus is a temporal shift in the rising phase of firing rate (Fig. 7B), this effect appears as a relationship between absolute firing rate and prior image motion in Figure 7C. There is a strong relationship between the firing rate measured 45 ms after the onset of stimulus motion and image velocity at the time the stimulus started to move.
The relationship in Figure 7C did not depend on whether the speed of the stimulus was above (filled triangles) or below (open triangles) the preferred speed of the neuron. This observation is important because we would expect a direct effect of ongoing small eye movements on the magnitude of image motion to depend on whether the stimulus motion was on the rising or falling arm of the speed-tuning curve. Ongoing smooth eye drifts will have small effects on the image speed caused by a given physical stimulus motion across the screen. If image speed were on the rising arm, then a slight increase in the image speed caused by a given stimulus speed would cause the relationships in Figure 7C to have a positive slope. If image speed were on the falling arm of the speed tuning curve, then the same slight increase in image speed should cause a decrease in the MT response and a negative slope in Figure 7C. Because we do not see a reversal of the slope depending on target speed relative to preferred speed, we cannot explain the effect of prior image motion on MT response to imposed motion as an effect of eye drift on the physical image velocities created by the combination of drift velocity and stimulus velocity. Instead, it must be an adaptation phenomenon like that reported by Lisberger and Movshon (1999), where presentation of brief image motion in the null direction just before a motion stimulus caused a time delay in the responses of MT neurons for stimulus motion in the preferred direction. The time contingencies in the recordings of Lisberger and Movshon (1999) were similar to those during the eye movements of fixation, with no delay between the null direction and preferred direction motion and the very brief adapting stimuli.
MT population response during smooth eye movements of fixation
So far, we have presented averages across neurons to document MT responses during the eye movements of fixation. However, the percept of motion caused (or not caused) by the image motions present during fixation will depend on the distribution of firing rates across the population of MT neurons tuned for different speeds and directions. To estimate the population response in area MT during the smooth eye movements of fixation, we have repeated the grouping analysis described by Figure 6, but now analyzing each neuron separately. For each neuron, we selected trials where the image velocity was within 0.1°/s of −0.5°, 0°, or +0.5°/s along the preferred axis of the neuron in the interval from 200 to 250 ms after the appearance of the stationary dots in the receptive field. We used the average firing rate across all trials as baseline firing, and subtracted the baseline responses from the other three measures. Then, we measured the peak or trough of the average firing rate ∼60 ms after the time used to bin the trials for each of the three groups of trials, and plotted three symbols for each neuron showing the firing rate response as a function of the neuron's preferred speed. When image speeds were ∼−0.5° or +0.5°/s (Fig. 8A, blue vs red symbols), the firing rates of most MT neurons were higher or lower than when image speed was close to 0°/s (black symbols).
Other properties of the neurons' responses were not predictive of their responses to small drifts during fixation. The response to drifts of +0.5°/s was not correlated with the peak response for imposed motion of preferred speed or direction, except that weakly responsive neurons for imposed motion also failed to respond to smooth image velocities caused by drift (data not shown). There was a weak correlation (r = 0.3) between the magnitude of the response to 0.5°/s drifts during fixation and the width of the speed tuning curves.
It is not clear what function to use to describe the population response in Figure 8A. Therefore, we chose to fit the red and blue symbols with log-linear regression lines. Importantly, the population response is quite noisy and does not show a clear peak at any preferred speed, as it would for image motion created by imposed stimulus motions. Thus, the population response to the smooth eye movements of fixation appears to signal the direction, but not the speed of image motion.
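A log-linear regression of this kind can be sketched as an ordinary least-squares fit of response against the logarithm of preferred speed. The use of base-10 logarithms and the function name are assumptions; the text does not specify them.

```python
import math

def log_linear_fit(preferred_speeds, responses):
    """Least-squares fit of response = intercept + slope * log10(preferred speed).

    preferred_speeds: preferred speed of each neuron in deg/s (must be > 0)
    responses:        baseline-subtracted firing rate of each neuron (spikes/s)
    """
    x = [math.log10(s) for s in preferred_speeds]
    n = len(x)
    mx = sum(x) / n
    my = sum(responses) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, responses))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept
```

A slope near zero would indicate that the response to drift is roughly uniform across preferred speeds, which is the pattern expected for a population response without a clear peak.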
Discussion
Several features of our data argue that image motion during fixation causes the responses we have recorded in MT neurons. First, the responses of MT neurons have the same direction selectivity as the responses to imposed stimulus motion, and lag the image motion of fixation by ∼60 ms, an appropriate visual latency for slow target motions (Lisberger and Movshon, 1999). Second, responses during fixation are large when a high-contrast patch of stationary dots is positioned on the receptive field, and are much smaller when the patch is absent so that the visual stimulus is provided by dim, low-contrast reflections of the room on the monitor of the oscilloscope (Sclar et al., 1990). Third, the shapes of the spike-triggered averages obtained during the smooth drifts of fixation agree with those reported by Bair and Movshon (2004) for imposed visual stimuli in anesthetized monkeys. Finally, MT responses during fixation are graded in relation to the prior image velocity (Fig. 6C), as would be expected if the neurons were operating on the rising arm of a speed tuning curve.
The presence of responses across the population of MT neurons without regard for their preferred speed raises some doubt about the visual origin of the responses. In principle, the responses could be caused, or amplified, by signals related to the eye movements themselves. However, MT neurons lack a corollary discharge related to smooth eye movements (Newsome et al., 1988). In addition, we found that the responses of MT neurons during fixation become very small when the visual stimulus on the receptive field of the neuron under study is very dim and low in contrast. Instead, we attribute the unexpected properties of MT responses during the image motion caused by fixation drifts to our observation that many MT neurons show above-baseline responses for target speeds as slow as 0.5°/s, and to forms of short-term adaptation. For example, responses to small image velocities might be enhanced because MT neurons are especially sensitive to a visual stimulus that crosses from motion in the nonpreferred direction to motion in the preferred direction (Lisberger and Movshon, 1999). This sequence approximates what occurs during the image motions caused by the eye movements of fixation. As we described in the Results, adaptation also may account for the effect of prior image motion on the latency of MT responses to imposed motion (Fig. 7B). Thus, even though we cannot disprove alternate explanations, the combined features of our data can be understood in terms of MT responses driven solely by slow image motion during fixation.
Implications of MT responses to the image motion caused by eye movements
Our data have an important functional implication. That MT neurons respond to drifts (our data) and microsaccades (Bair and O'Keefe, 1998) implies that the eye movements of fixation are large enough to prevent adaptation of neurons, and visual fading, when we hold our gaze still on a static visual scene (Steinman et al., 1973). Drifts and microsaccades almost certainly act in concert to maintain vision (Ditchburn and Ginsborg, 1952; Riggs et al., 1953; Gerrits and Vendrik, 1974; Rucci and Desbordes, 2003; Martinez-Conde et al., 2004). Drifts would be particularly important during intervals when we make few microsaccades, because visual fading occurs in as short a time as 80 ms (Coppola and Purves, 1996).
The impressive response of MT neurons to the image motion caused by eye drifts during fixation also has technical implications. Several articles have reported noise correlations between the spike counts of pairs of MT neurons under the conditions used in our recordings: fixation at a central spot while a visual stimulus is presented on an eccentric receptive field (Bair et al., 2001; Huang and Lisberger, 2009). In principle, part of these “noise correlations” could arise from the common visual signal generated by eye drifts during fixation. In practice, the noise correlations of MT neurons are likely to be real, and not an artifact introduced by image motion during fixation. First, the magnitude of drift at the onset of motion affects only the earliest part of the response to imposed motion, but the noise correlations persist unabated throughout the response (Huang and Lisberger, 2009). Second, noise correlations are present on short time scales between pairs of MT neurons in anesthetized monkeys, and therefore cannot be attributed solely to drifts of eye position during fixation (A. Graf, personal communication).
Visual stability despite MT responses during fixation
During fixation, the population of MT neurons signals the direction of image motion caused by small drifts in eye position. Why do we perceive a stable world despite substantial responses in a visual area that provides signals to guide both perception (Newsome and Paré, 1988) and pursuit eye movements (Newsome et al., 1985)?
Psychophysical reports during stabilized vision (Poletti et al., 2010) and the properties of the jitter illusion argue against the classical explanation in terms of negating the image motion with oppositely signed corollary discharge signals related to the eye movement (Murakami and Cavanagh, 1998). In the jitter illusion, adaptation of one part of the visual field with dynamic noise leads to awareness of the visual motion caused by the eye movements of fixation, but only in nonadapted regions of the visual field. Cancellation by corollary discharge should apply across the entire visual field. Thus, the perception of a stable world must depend on visual mechanisms that suppress the responses to image motion caused by eye movements during fixation. Because MT neurons have strong responses in relation to the image motion caused by the eye movements of fixation, we imagine that downstream structures are important for rendering invisible the image motion during fixation. Our results contradict a prior model showing how the effects of eye movements could be removed from visual signals before they reach MT (Pitkow et al., 2007).
We also do not think that the extensive surround antagonism found in the motion responses of some MT neurons (Allman et al., 1985a,b; Tanaka et al., 1986; Raiguel et al., 1995; Born, 2000) would contribute to rendering invisible the image motion present during fixation. In real life, the smooth eye drifts during fixation would stimulate both the classical motion-sensitive receptive field of MT neurons and the antagonistic inputs from outside the classical receptive field. Whole-field image drift might not elicit a response in many MT neurons. However, the small static patches used in our experiments appear stationary under control conditions even though they stimulate only the classical receptive field of MT neurons. The jitter illusion also does not depend on surround suppression because it is strong for visual stimuli on the spatial scale of the receptive fields of MT neurons (Murakami and Cavanagh, 2001). Therefore, the antagonistic surrounds of MT neurons may contribute, but they cannot be the mechanism that maintains a stable percept in the face of image motion caused by the eye movements of fixation.
Based on the features of the jitter illusion, Murakami and Cavanagh (1998) proposed that our percept of the world normally is stable because perception assumes that the weakest motion signals come from stable images and uses those signals as a baseline. Motion is perceived only for regions of the visual field with motion signals that are clearly stronger than the baseline. Normally, all regions of the visual field provide equally weak motion signals and the world is perceived as stationary despite the image motion caused by the eye movements of fixation. The jitter illusion emerges because the adapted region of the visual field contains very weak motion signals and, by comparison, the motion signals from stationary stimuli in other regions of the visual field are larger and yield the percept of motion.
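The baseline comparison described above can be illustrated with a toy sketch. It assumes that each region of the visual field is summarized by one scalar motion-signal strength and that "clearly stronger" means exceeding the weakest signal by a fixed additive margin. The region names, the signal values, and the `margin` parameter are hypothetical and are not taken from Murakami and Cavanagh (1998).

```python
def perceived_motion(signals, margin=0.5):
    """Flag regions whose motion signal clearly exceeds the weakest one.

    signals: dict mapping region name -> motion-signal strength (arbitrary units).
    margin:  hypothetical additive criterion for "clearly stronger than baseline".
    """
    baseline = min(signals.values())  # weakest signal is assumed to come from a stable image
    return {region: (s - baseline) > margin for region, s in signals.items()}

# Normal viewing: fixation drift produces equally weak signals everywhere.
normal = {"left": 1.0, "center": 1.0, "right": 1.0}

# Jitter illusion: adaptation has weakened the signal in one region only.
adapted = {"left": 0.2, "center": 1.0, "right": 1.0}

normal_percept = perceived_motion(normal)
jitter_percept = perceived_motion(adapted)
print(normal_percept)
print(jitter_percept)
```

In this sketch, no region is flagged during normal viewing, whereas lowering the baseline in the adapted region causes the unadapted regions to be flagged as moving.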
We can add one element to the mechanisms of stable percepts based on our finding of substantial responses in MT neurons during the image motion caused by small smooth eye drifts during fixation. The population response in MT during fixation does not have the tuned characteristic expected for imposed stimulus motion, but many neurons respond. Decoding the population response in Figure 8A for image motion at +0.5°/s with vector averaging estimates an image speed of 12.5°/s. It is difficult to imagine that such a large estimate of target speed could be seen as stationary. However, the population response in Figure 8A provides more noise than signal, which might be interpreted as extreme uncertainty about image speed (Ma et al., 2006) and lead to the percept of a stationary world. Stated more formally, the influence of a prior for zero speed in visual perception (Hürlimann et al., 2002; Weiss et al., 2002; Stocker and Simoncelli, 2006) would yield a very low estimate of image speed from the population response in Figure 8. We demonstrate this idea using a Bayesian framework where the prior distribution is an exponential function like that used by Stocker and Simoncelli (2006) and the posterior probability distribution is the product of the prior and likelihood distributions.
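To illustrate why an untuned population response can decode to a large speed, the following minimal sketch applies one common form of vector averaging: a response-weighted mean of the logarithm of preferred speed. Averaging in log2 units, the set of preferred speeds, and both response profiles are assumptions made for illustration; they are not the recorded data and do not reproduce the 12.5°/s estimate.

```python
import math

def vector_average_speed(preferred_speeds, responses):
    """Response-weighted mean of log2(preferred speed), converted back to deg/s."""
    total = sum(responses)
    if total <= 0:
        raise ValueError("population response must have a positive total")
    mean_log = sum(r * math.log2(ps)
                   for ps, r in zip(preferred_speeds, responses)) / total
    return 2.0 ** mean_log

# Hypothetical preferred speeds, one octave apart: 0.5, 1, 2, ..., 128 deg/s.
preferred = [0.5 * 2 ** k for k in range(9)]

# Tuned response: Gaussian in log2 speed centered on a 16 deg/s target (assumed shape).
tuned = [math.exp(-((math.log2(ps) - math.log2(16.0)) ** 2) / (2 * 1.5 ** 2))
         for ps in preferred]

# Untuned response: every neuron fires equally, whatever its preferred speed.
untuned = [1.0] * len(preferred)

est_tuned = vector_average_speed(preferred, tuned)
est_untuned = vector_average_speed(preferred, untuned)
print(est_tuned, est_untuned)
```

With the tuned profile, the estimate falls close to the target speed. With the untuned profile, it falls at the geometric mean of the preferred speeds in the population, regardless of how slow the image motion actually is.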
We base our assumptions about the shape of the likelihood distribution on the conclusion of Ma et al. (2006) that plotting the response of each neuron as a function of its preferred speed provides a good estimate of the likelihood function. Further, we plot the population response during the drifts of fixation as a function of the preferred speed of the neuron for imposed stimulus motion, because the perceptual system should have a fixed interpretation of each neuron's response on the basis of its tuning for imposed motion, without regard for the conditions that lead to the response.
If the population response in MT is tuned, then the likelihood function also will be tuned. Bayesian estimation will lead to a posterior distribution that peaks somewhat below target speed (Fig. 8B, red curve). If, however, the likelihood distribution is formed from the log-linear population responses in Figure 8A, then Bayesian estimation will lead to a posterior distribution that peaks at very low speeds (Fig. 8C, red curve). Such a posterior could reasonably be interpreted to originate from a stationary object. To be consistent with the model of Murakami and Cavanagh (1998), we propose that Bayesian estimation with a prior of zero speed is used in spatially restricted areas; the estimates from different locations in space would be compared to decide what parts of the visual field contain real motion.
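The Bayesian argument can be sketched numerically as follows. The sketch assumes an exponential prior with a hypothetical decay constant `TAU`, a Gaussian-in-log-speed likelihood for the tuned case, and a flat likelihood as a crude stand-in for the untuned population response during fixation. None of these values are fitted to the data in Figure 8.

```python
import math

def posterior(speeds, likelihood, tau):
    """Normalized product of an exponential prior exp(-v / tau) and a likelihood."""
    prior = [math.exp(-v / tau) for v in speeds]
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def peak_speed(speeds, post):
    """Speed at which the posterior is largest."""
    return speeds[max(range(len(post)), key=post.__getitem__)]

speeds = [0.1 * k for k in range(1, 641)]  # 0.1 to 64 deg/s in 0.1 deg/s steps
TAU = 20.0                                 # hypothetical prior decay constant (deg/s)

# Tuned likelihood: Gaussian in log2 speed, centered on a 16 deg/s target (assumed).
tuned = [math.exp(-((math.log2(v) - math.log2(16.0)) ** 2) / (2 * 0.5 ** 2))
         for v in speeds]

# Flat likelihood: stand-in for a population response with no clear peak.
flat = [1.0] * len(speeds)

post_tuned = posterior(speeds, tuned, TAU)
post_flat = posterior(speeds, flat, TAU)
peak_tuned = peak_speed(speeds, post_tuned)
peak_flat = peak_speed(speeds, post_flat)
print(peak_tuned, peak_flat)
```

Under these assumptions, the tuned likelihood yields a posterior peak somewhat below the target speed, while the flat likelihood leaves the prior in control and the posterior peaks at the slowest speed on the grid.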
Footnotes
This work was supported by the Howard Hughes Medical Institute, and NIH Grants EY03878 and T32 EY007120. We thank Drs. Allison Doupe, Philip Sabes, Christoph Schreiner, Yang Dan, and the Lisberger laboratory for helpful comments on an earlier version of the paper; and Valerio Mante for invaluable input throughout this work. We are grateful to Stefanie Tokiyama, Elizabeth Montgomery, Karen MacLeod, Dirk Kleinhesselink, Scott Ruffner, David Wolfgang-Kimball, Darrell Floyd, and Ken McGary for technical assistance.
Correspondence should be addressed to Stephen G. Lisberger, Department of Physiology, UCSF, Box 0444, 513 Parnassus Avenue, Room HSE-802A, San Francisco, CA 94143-0444. SGL@phy.ucsf.edu