Abstract
Heading perception in primates depends heavily on visual optic-flow cues. Yet during self-motion, heading percepts remain stable, even though smooth-pursuit eye movements often distort optic flow. According to theoretical work, self-motion can be represented accurately by compensating for these distortions in two ways: via retinal mechanisms or via extraretinal efference-copy signals, which predict the sensory consequences of movement. Psychophysical evidence strongly supports the efference-copy hypothesis, but physiological evidence remains inconclusive. Neurons that signal the true heading direction during pursuit are found in visual areas of monkey cortex, including the dorsal medial superior temporal area (MSTd). Here we measured heading tuning in MSTd using a novel stimulus paradigm, in which we stabilize the optic-flow stimulus on the retina during pursuit. This approach isolates the effects on neuronal heading preferences of extraretinal signals, which remain active while the retinal stimulus is prevented from changing. Our results from 3 female monkeys demonstrate a significant but small influence of extraretinal signals on the preferred heading directions of MSTd neurons. Under our stimulus conditions, which are rich in retinal cues, we find that retinal mechanisms dominate physiological corrections for pursuit eye movements, suggesting that extraretinal cues, such as predictive efference-copy mechanisms, have a limited role under naturalistic conditions.
SIGNIFICANCE STATEMENT Sensory systems discount stimulation caused by an animal's own behavior. For example, eye movements cause irrelevant retinal signals that could interfere with motion perception. The visual system compensates for such self-generated motion, but how this happens is unclear. Two theoretical possibilities are a purely visual calculation or one using an internal signal of eye movements to compensate for their effects. The latter can be isolated by experimentally stabilizing the image on a moving retina, but this approach has never been adopted to study motion physiology. Using this method, we find that extraretinal signals have little influence on activity in visual cortex, whereas visually based corrections for ongoing eye movements have stronger effects and are likely most important under real-world conditions.
Introduction
In primates, optic flow is an indispensable cue for navigating through the world (Gibson, 1958; Warren and Hannon, 1988; Lappe et al., 1999). When the eyes are stationary, the brain can easily determine the instantaneous heading, which corresponds to the focus of expansion in the optic-flow pattern. When the eyes rotate during smooth pursuit, however, retinal slip distorts the optic flow and upsets this correspondence (Gibson, 1950). Despite these distortions on the retina, humans and nonhuman primates alike perceive very little distortion in their heading direction as they pursue moving objects (Royden et al., 1992; Britten and van Wezel, 2002). These results imply that the brain discounts distortions to the optic-flow field. Such compensation is a general property of sensory systems, which often discount stimulation caused by the animal's own behavior (termed reafferent stimulation). It is still unclear, however, how this perceptual stability is maintained.
Two main classes of mechanisms by which the brain discounts the retinal-slip distortions have been proposed: retinal and extraretinal (Lappe et al., 1999; Britten, 2008). Under the retinal hypothesis, cortical areas selective for visual motion extract heading from the optic flow pattern directly by calculating and discounting motion components due to eye rotation (Royden, 1997; Perrone and Stone, 1998; Beyeler et al., 2016). Retinal mechanisms depend on depth cues to dissociate flow components due to retinal slip from those due to self-motion (Longuet-Higgins and Prazdny, 1980; Hildreth and Royden, 1998). On the other hand, the extraretinal hypothesis proposes that information about the retinal flow pattern is modified by an internal signal that tracks eye velocity to recover heading. These extraretinal signals likely originate from the efference copy (or corollary discharge) of motor commands for smooth pursuit rather than from proprioception (Sperry, 1950; von Holst and Mittelstaedt, 1950; Bridgeman and Stark, 1991; Crapse and Sommer, 2008). Psychophysical investigations have favored the extraretinal hypothesis, based on the comparison of perceived heading differences between normal pursuit and simulated pursuit, in which the eyes remain fixed while the experimenters artificially add rotation to the flow stimulus (Royden et al., 1994; Banks et al., 1996; Crowell and Andersen, 2001; but see Warren and Hannon, 1988; Stone and Perrone, 1997; Li et al., 2006).
Neurophysiological evidence for retinal versus extraretinal mechanisms in self-motion processing is mixed. Neural responses in heading-selective areas, such as dorsal medial superior temporal area (MSTd), compensate for changes in the speed (Inaba et al., 2007; Chukoskie and Movshon, 2009) and direction of optic flow during smooth pursuit. One group reported a large extraretinal influence on the heading-direction preferences of MSTd neurons based on changes in activity between normal pursuit and simulated pursuit (Bradley et al., 1996; Shenoy et al., 2002). However, their stimulus lacked depth cues, which underlie many proposed mechanisms of retinally based corrections. Supporting this concern, MSTd neurons compensate much better for direction distortions when flow stimuli contain motion parallax and perspective cues to depth (Maciokas and Britten, 2010). However, in that study, one could not identify whether the pursuit-invariant responses were due to a retinal or extraretinal mechanism.
In the present study, we designed a novel optic flow stimulus that isolates the effects of extraretinal influences on the motion responses of heading-selective neurons. The stabilized pursuit condition manipulates the relationship between eye rotation and the resulting retinal motion by rotating the stimulus with the eye as it pursues a target, effectively eliminating distortions to the optic flow while maintaining the influence of efference-copy signals. In this condition, as well as normal and simulated pursuit conditions, we recorded from heading-selective MSTd neurons to identify the signal source responsible for the stability of heading responses during pursuit. As in previous work from our laboratory, we have designed our flow stimulus to isolate motion parallax cues to depth. We found that extraretinal mechanisms contribute only a small, though significant, amount to this stability, whereas retinal mechanisms have a considerably larger effect.
Materials and Methods
Animals and surgical procedures.
Three adult female macaque monkeys (Macaca mulatta) were used in this study. Each monkey was surgically equipped with a head post, chronic recording cylinder (Crist Instrument), and scleral search coil (Judge et al., 1980) to stabilize its head, provide access for electrical recordings, and record its eye movements, respectively. Recording cylinders were placed under the guidance of prior structural MRI and stereotaxic atlases. All components were implanted under general anesthesia using sterile technique in a dedicated surgical suite. All procedures and experiments were performed in accordance with the National Institutes of Health guidelines and approved by the University of California, Davis Institutional Animal Care and Use Committee.
Electrophysiological recordings.
At the beginning of each recording session, we penetrated the dura mater with a stainless-steel guide tube positioned within a polymer grid that ensured consistent access to the superior temporal sulcus. We then advanced single epoxy-coated tungsten microelectrodes (FHC) through the guide tube under the control of an electrical micromanipulator (National Instruments). Electrode signals were amplified (Bak Electronics), filtered for line noise, and passed through a dual voltage-time window discriminator (Bak Electronics) to isolate action potentials from single units. Timestamps from the individual spikes were then digitized at 1 ms intervals by the experimental control computer using the REX environment (Hays et al., 1982).
Before recording data for the main experiment, we mapped the dorsal subdivision of MST. Guided by MRI reconstruction and stereotaxic atlases, we identified MSTd based on the pattern of gray and white matter transitions as the electrode was advanced and previously described response characteristics (Tanaka and Saito, 1989; Graziano et al., 1994). Its neurons respond vigorously to patterns of moving dots, are often selective for complex motion patterns, and have large (compared with MT) receptive fields that often contain the fovea and portions of the ipsilateral visual field. We avoided recording from motion-selective cells in neighboring area 7a by ensuring that we recorded from a sufficiently ventral stereotaxic position.
Visual stimuli.
We presented visual stimuli on a rear-projection screen with a PROPixx DLP LED projector (VPixx Technologies) at a display resolution of 1920 × 1080 pixels with a 120 Hz refresh rate. At 50 cm from the monkey, the projected image subtended 100° (horizontal) × 68° (vertical) of visual angle. The recording room was as dark as possible (minimum screen luminance of 0.78 cd/m²), and the monkey was kept in a light-adapted state by fully illuminating (110 cd/m²) the screen during the intertrial intervals. These measures minimized the contributions of scattered light in the recording room to overall retinal motion. Throughout each experiment, we sampled eye position (National Instruments, 12-bit ADC) at 1 kHz with a magnetic search coil system (DNI). We initially presented the stimuli under binocular viewing conditions for Monkey Q, but we recorded the majority of the neuronal data under monocular occlusion of the ipsilateral eye to reduce conflict between stereoscopic and motion parallax cues to depth.
To simulate self-motion in the main experiment, we developed a paradigm that translates and rotates a virtual camera through a 3D cloud of randomly positioned dots.
During each trial, graphics commands were sent via a dedicated TCP/IP connection from the computer running the REX experimental control environment to a dedicated rendering machine. Stimuli were then generated on this machine with a custom software application that rendered each frame synchronously with the vertical refresh period of the projector.
In this environment, the viewable volume was a frustum (see Fig. 1A, top left) bounded by a near plane located at the surface of the screen, 50 cm from the observer, a far plane 150 cm from the observer, and the edges of the projected image on the screen. Dot density was 1000 dots/m³, which made ∼3500 dots viewable at any time. Each dot was a white (110 cd/m²) square that subtended ∼0.1° of visual angle on a black (0.78 cd/m²) background. To isolate motion-related signals, no looming or stereo cues to depth were present in these stimuli, despite the fact that all three cue types (motion parallax, looming, and stereo) are present in natural scenes. Another benefit of the unchanging dot size was that the spatial-frequency spectrum and average luminance of the stimuli did not change during a trial. The viewing frustum was embedded in a larger volume that ensured that the dot density was approximately constant as dots entered and exited the field of view. Throughout all self-motion conditions, translation speed was held at a constant 50 cm/s. Although the resulting pattern of dot motion on the screen is consistent with an infinite number of dot distance-observer speed combinations (if the ratio of the two is held constant), we refer to exact physical quantities here for clarity.
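The dot count quoted above can be checked from the stated geometry. The following minimal sketch assumes a symmetric frustum centered on the line of sight (an assumption; the function and constant names are illustrative only) and multiplies its volume by the stated dot density.

```python
import math

# Values stated in the text; the symmetric-frustum geometry is an assumption.
H_FOV_DEG, V_FOV_DEG = 100.0, 68.0   # full visual angle of the projected image
NEAR_CM, FAR_CM = 50.0, 150.0        # near and far planes of the viewing volume
DENSITY_PER_M3 = 1000.0              # dots per cubic meter


def frustum_volume_m3(h_fov_deg, v_fov_deg, near_cm, far_cm):
    """Volume of a symmetric viewing frustum between two depth planes."""
    tan_h = math.tan(math.radians(h_fov_deg / 2))
    tan_v = math.tan(math.radians(v_fov_deg / 2))
    near_m, far_m = near_cm / 100, far_cm / 100
    # The cross-section at depth z is (2 z tan_h) x (2 z tan_v);
    # integrating that area from the near to the far plane gives the volume.
    return 4 * tan_h * tan_v * (far_m ** 3 - near_m ** 3) / 3


volume_m3 = frustum_volume_m3(H_FOV_DEG, V_FOV_DEG, NEAR_CM, FAR_CM)
n_dots = DENSITY_PER_M3 * volume_m3
print(f"volume = {volume_m3:.2f} m^3, visible dots = {n_dots:.0f}")
```

Under these assumptions the volume is close to 3.5 m³, in line with the ∼3500 visible dots reported.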
Depending on the stimulus condition, the monkey either fixated on a central dot or pursued a moving target during simulated self-motion. This target was a red dot that subtended 0.25° of visual angle and moved independently of the other dots embedded in the 3D environment. Each monkey's gaze had to remain within a 1.75° square window during fixation or a 2° window during pursuit; otherwise, the trial was aborted. To minimize the number of catch-up saccades, the pursuit target on all but the earliest experiments moved in a step-ramp fashion (Rashbass, 1961), with the initial step magnitude (see Fig. 1B) held constant across monkeys and chosen to roughly approximate the lag time between target motion onset and smooth pursuit initiation.
Pursuit manipulations.
The stimuli in the main experiment consisted of four different pursuit conditions over the same set of simulated heading directions (see Fig. 1A). In the first condition (fixation), the monkey simply had to remain fixated on a central target while we simulated self-motion. The second condition (normal pursuit) required the animal to pursue the target moving either leftward or rightward in the plane of the screen during simulated self-motion. In the next two conditions, the correspondence between eye rotation and the resulting reafferent motion on the retina was manipulated. During simulated pursuit, the monkey remained fixated on a central target while the effects of reafferent motion on optic flow were simulated by rightward or leftward rotation of the camera as it translated through the virtual environment. This produced a dynamic retinal image identical to that found in normal pursuit while the eyes were stationary. Finally, in our stabilized pursuit condition, we eliminated reafferent motion while the eyes were in motion by using online estimates of instantaneous eye velocity to rotate the camera in the opposite direction. This produced a retinal flow pattern nearly identical to that found during fixation.
To estimate instantaneous eye velocity for stabilization, we first took a 10 ms sliding window average of eye position throughout the trial. From this running average, eye velocity was estimated with numerical differentiation and used to estimate the rotation of the camera view between frame draws. This rotation was incorporated into the calculations used for the simulated translation of the camera for the next frame (see Fig. 2). Across all trials, this resulted in roughly a single-frame lag (mean = 9.6 ms) between changes in eye velocity and the resulting corrections on the screen (see Fig. 2, inset).
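The estimation steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the rendering software itself: the class and function names are hypothetical, the differentiation is a simple first difference of successive window means, and the sign convention (camera rotation opposite to eye rotation) follows the description of the stabilized pursuit condition.

```python
from collections import deque

SAMPLE_DT_S = 0.001     # 1 kHz eye-position sampling (from the text)
WINDOW_SAMPLES = 10     # 10 ms sliding window (from the text)
FRAME_DT_S = 1 / 120    # 120 Hz projector refresh (from the text)


class EyeVelocityEstimator:
    """Running estimate of eye velocity from smoothed eye position (hypothetical helper)."""

    def __init__(self):
        self._window = deque(maxlen=WINDOW_SAMPLES)
        self._prev_mean = None
        self.velocity_deg_s = 0.0

    def update(self, eye_pos_deg):
        """Add one position sample and return the current velocity estimate."""
        self._window.append(eye_pos_deg)
        if len(self._window) < WINDOW_SAMPLES:
            return self.velocity_deg_s  # window not yet full
        mean = sum(self._window) / WINDOW_SAMPLES
        if self._prev_mean is not None:
            # First difference of successive window means (assumed scheme).
            self.velocity_deg_s = (mean - self._prev_mean) / SAMPLE_DT_S
        self._prev_mean = mean
        return self.velocity_deg_s


def camera_rotation_step_deg(eye_velocity_deg_s, frame_dt_s=FRAME_DT_S):
    """Camera rotation to apply before the next frame, opposite to the eye."""
    return -eye_velocity_deg_s * frame_dt_s
```

For a steady 10°/s pursuit, this scheme would counter-rotate the camera by roughly 0.083° per frame.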
Experimental protocol.
Upon isolation, each cell was initially characterized for its heading and pursuit direction preferences. Heading preference was estimated by simulating self-motion in 26 evenly spaced directions in 2D heading space (i.e., elevation and azimuth) while we recorded spiking activity from a given cell. Each cell's preferred heading direction was determined by fitting these data with a modified Kent (FB5) distribution (Kent, 1982) using MATLAB's fmincon as follows:
where A is a tuning amplitude parameter, κ determines the overall tuning selectivity, β determines the degree of anisotropy, γ1 is a vector that determines the preferred heading direction, and γ2 and γ3 determine the major and minor axes of the tuning function, respectively.
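For reference, the standard Kent (FB5) density has the exponential form sketched below. Treating the amplitude A as a stand-in for the normalizing constant is an assumption made here for illustration, as is the symbol x for the unit vector of the presented heading; the exact modification used for the fits may differ.

```latex
r(\mathbf{x}) = A \exp\!\left\{ \kappa\, \boldsymbol{\gamma}_1^{\top}\mathbf{x}
  + \beta \left[ \left(\boldsymbol{\gamma}_2^{\top}\mathbf{x}\right)^{2}
  - \left(\boldsymbol{\gamma}_3^{\top}\mathbf{x}\right)^{2} \right] \right\}
```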
To reduce the number of unique conditions in the main experiment to a manageable number, we used these responses to select a subset of headings for presentation. In this subset, we chose horizontally varying heading directions along a single elevation that were centered around the heading that evoked the maximal response from each neuron. Heading azimuths were chosen to span a range that covered most or all of the cell's response range, encompassing the peak response when possible. Cells were also hand-mapped to estimate the spatial extent of their receptive fields and their rough tuning in spiral space (Graziano et al., 1994). When cells could be held long enough, we ran an additional automated procedure to measure their tuning to planar and spiral space motion after the main experiment.
Pursuit direction preferences were estimated by presenting targets moving in 8 equally spaced directions in the plane of the screen without optic flow stimuli using a step-ramp protocol. We also included a target blink period in the middle of the ramp epoch to assess cells for tuning to extraretinal signals related to pursuit (Newsome et al., 1988). During this brief (150 ms) period, the pursuit target was extinguished (i.e., no stimuli were present in the animals' visual field) while monkeys continued to pursue the implied path of the target. Pursuit direction preference was determined by fitting these data with a von Mises function (outlined in Eq. 9).
During the main experiment, the four pursuit conditions were pseudorandomly interleaved. Each trial (see Fig. 1B) consisted of an initial fixation epoch followed by a brief period in which the dot-filled viewing volume appeared before the onset of camera translation. Once the virtual camera started moving, the pursuit target remained stationary for 400 ms before following the step-ramp trajectory to the left or right. Neuronal activity during pursuit initiation was ignored, and spikes were counted during a window (see Fig. 1B, shaded box) for subsequent analyses. To control for the effects of eye position on neural responses (Bremmer et al., 1997), we positioned the pursuit target such that it had the same mean position across the length of this analysis window in each of the pursuit conditions (see Fig. 1B, eye position traces). Finally, each trial was followed by an intertrial interval in which the screen was fully illuminated to maintain light adaptation. This eliminated retinal slip caused by other objects in the recording room that were dimly illuminated by light scattering off the screen.
Effects of pursuit on retinal flow patterns.
To estimate the magnitude of retinally based compensation for pursuit in MSTd, we developed a method to compare a neuron's heading preferences during simulated pursuit to an estimate of the cell's heading preference if it did not compensate at all. We determined this noncompensating response by analyzing the distortions to optic flow during smooth pursuit. In any single virtual-depth plane, retinal slip from pursuit shifts the center of motion of an expanding pattern in the direction of eye movement (or the opposite for contracting patterns). For scenes with more than one depth plane, the magnitude of these shifts increases with distance from the observer, producing an apparent curvature in the flow pattern on the retina. This increasing shift with virtual depth results from motion parallax: the motion of the observer produces lower-velocity motion at greater distances. Therefore, pursuit-induced retinal velocities, which do not vary with depth, have greater relative effects at larger virtual distances. We reasoned that an MSTd neuron that did not compensate for pursuit would match a heading direction to these distorted patterns by aggregating center of motion shifts across depth planes into a single location. Therefore, we can estimate how much this hypothetical, noncompensating neuron would appear to shift its heading preference by subtracting this matched heading direction from the neuron's preferred heading direction.
To model the expected effects of smooth pursuit on neuronal responses, we derived a set of differential equations to describe the motion of a texture element for an arbitrary set of heading and eye rotations. To do so, we followed the method of Longuet-Higgins and Prazdny (1980) while incorporating the screen distance of our setup into the perspective projection (i.e., x = Dscreen × X/Z) as follows:
where X, Y, and Z are the positions of dots in 3D space, x and y are the horizontal and vertical screen positions of each projected dot, RX, RY, and RZ are the components of the eye's rotation about each respective axis, TX, TY, and TZ are the translational components, and Dscreen is the distance of the screen from the observer. If we assume that the center of motion determines each MSTd neuron's response to different retinal flow patterns, and restrict eye rotations to those about the y axis, as we do in this experiment, we can calculate the horizontal (xCoM) and vertical (yCoM) position of this center on the screen as follows:
Solving for the quadratic in Equation 3 yields the horizontal component of the center of motion for any arbitrary combination of heading direction and eye rotation. Additionally, we find that the x component of this center is independent of the TY component of the heading direction, so we can determine the expected horizontal shift in center of motion in the presence of left or right pursuit for any given heading direction. To retrieve the vertical component, we can rearrange Equation 4 to produce the following:
Here, we find that there will be no vertical shift in the center of motion during left or right pursuit for any heading with TY = 0 (i.e., self-motion parallel to the ground plane). Finally, the following transform expresses the position of the center of motion on the tangent screen in degrees of visual angle:
These equations were used to estimate the magnitude of tuning shifts in a hypothetical neuron that did not compensate for retinal flow distortions during pursuit. We first determined the shift in the center of motion for each stimulus depth plane in 1 cm increments of depth. From this, we determined the purely translational heading direction that best matched the aggregate center of motion in the pursuit-distorted retinal flow pattern. To do so, we calculated the mean center of motion over the nearest visible 10 cm of the stimulus (50–59 cm), or just <1% of the total volume of the viewing frustum. This value was chosen as a conservative estimate of the neurons' weighting of each depth plane (as the nearest planes shift less than the farthest ones) while also allowing enough dot motion (∼35 dots) to be visible for a reasonable amount of time (∼20–200 ms). We then calculated predicted tuning shifts using a numerical method (with MATLAB's fmincon function) that found the heading direction that minimized the difference between the aggregate center of motion position and the center of motion position that matched the cell's heading preference. Using either an inverse depth-weighted mean of the set of centers of motion (to reflect velocity scaling due to motion parallax) or an unweighted mean (instead of the mean over the nearest 10 planes) produced larger uncorrected tuning shifts, but the results were otherwise qualitatively similar to those we present.
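The center-of-motion calculation can be illustrated with a short sketch. It assumes one common sign convention for the Longuet-Higgins and Prazdny (1980) flow equations with the projection x = Dscreen × X/Z, rotation about the vertical axis only, and positive rotation toward +x (rightward); these conventions and all function names are assumptions for illustration. The matched heading is obtained in closed form (under pure translation the center of motion lies at x = Dscreen × TX/TZ) rather than with a numerical optimizer, and the aggregate is an unweighted mean over the 50–59 cm planes.

```python
import math

D_SCREEN_CM = 50.0                  # screen distance (from the text)
SPEED_CM_S = 50.0                   # translation speed (from the text)
PURSUIT_RAD_S = math.radians(10.0)  # 10 deg/s rotation about the vertical axis


def horizontal_flow(x, z, t_x, t_z, r_y, d=D_SCREEN_CM):
    """Horizontal screen velocity of a dot at screen position x and depth z.

    Assumed convention: translational term (-d*t_x + x*t_z)/z and
    rotational term -(d + x^2/d)*r_y, with rotation about y only.
    """
    return (-d * t_x + x * t_z) / z - (d + x * x / d) * r_y


def center_of_motion_x(z, t_x, t_z, r_y, d=D_SCREEN_CM):
    """Root of horizontal_flow(x) = 0 that tends to d*t_x/t_z as r_y -> 0."""
    a = r_y / d
    b = -t_z / z
    c = d * t_x / z + d * r_y
    disc = b * b - 4 * a * c
    if disc < 0 or b == 0:
        raise ValueError("no center of motion for these inputs")
    # Stable form of the quadratic root that stays finite for small a.
    return 2 * c / (-b - math.copysign(math.sqrt(disc), b))


def matched_heading_deg(t_x, t_z, r_y, planes_cm=range(50, 60), d=D_SCREEN_CM):
    """Azimuth of the pure-translation heading matching the mean center of motion."""
    xs = [center_of_motion_x(float(z), t_x, t_z, r_y, d) for z in planes_cm]
    mean_x = sum(xs) / len(xs)
    return math.degrees(math.atan2(mean_x, d))


# Straight-ahead heading with rightward rotation at 10 deg/s.
shift_deg = matched_heading_deg(0.0, SPEED_CM_S, PURSUIT_RAD_S)
print(f"matched heading azimuth: {shift_deg:.1f} deg")
```

In this sketch the shift grows with plane depth, so averaging over only the nearest planes yields a smaller, more conservative estimate than averaging over the full volume.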
Data analysis and statistical testing.
All eye position and electrophysiological data were analyzed using custom scripts in MATLAB (The MathWorks, RRID:SCR_001622). We determined the lag between changes in eye velocity and subsequent corrections on screen during the stabilized pursuit trials by finding the lag time that maximized the cross-correlation between the camera and eye-position traces for each trial (see Fig. 2). Before the cross-correlation, the camera position traces were upsampled from 120 to 1000 Hz. Trial-wise lags were then aggregated, revealing a mean stabilization lag time of 9.6 ms.
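The lag estimate can be sketched as a search for the delay that maximizes the correlation between the two traces. The sketch below assumes both traces are already sampled at 1 kHz (so one sample equals 1 ms) and expressed with the same sign, and it uses a normalized (Pearson) correlation over the overlapping samples; the function names and the 50 ms search range are illustrative choices, not the original analysis code.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def stabilization_lag(eye, camera, max_lag=50):
    """Delay (in samples) of the camera trace relative to the eye trace."""
    best_lag, best_r = 0, -math.inf
    for lag in range(max_lag + 1):
        # Compare the eye trace with the camera trace shifted back by `lag`.
        r = pearson(eye[:len(eye) - lag], camera[lag:])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag
```

Note that a perfectly linear ramp correlates equally well at every delay, so an estimate of this kind relies on small fluctuations in eye velocity within each trial.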
To quantify how well the monkey followed the pursuit target, we calculated the pursuit gain and number of saccades for each trial. We estimated instantaneous eye velocity (in °/s) with numerical differentiation of the raw eye position (p) using a symmetric difference quotient as follows:
The denominator of the difference quotient captures the duration, in milliseconds, that the estimate spans. Differentiation was followed by smoothing with a 10 ms sliding-window average. We calculated pursuit gain by dividing the average pursuit velocity in the analysis window by the constant velocity of the pursuit target. Saccade episodes were identified as times when the estimated 2D eye velocity exceeded the SD for that trial by a factor of 3.
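A minimal sketch of these eye-movement measures is given below. It assumes 1 kHz position samples, a symmetric difference spanning two samples, a smoothing window restricted to full windows, and a saccade threshold of three times the SD of eye speed within the trial; the function names are illustrative and the exact form of the original quotient may differ.

```python
import statistics


def eye_velocity(pos_deg, dt_s=0.001):
    """Symmetric difference quotient in deg/s; the two endpoints are dropped."""
    return [(pos_deg[i + 1] - pos_deg[i - 1]) / (2 * dt_s)
            for i in range(1, len(pos_deg) - 1)]


def smooth(values, window=10):
    """Sliding-window mean computed over full windows only."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def pursuit_gain(velocity_deg_s, target_speed_deg_s):
    """Mean eye velocity in the analysis window divided by target velocity."""
    return sum(velocity_deg_s) / len(velocity_deg_s) / target_speed_deg_s


def saccade_samples(speed_deg_s, factor=3.0):
    """Indices where eye speed exceeds `factor` times the trial's SD."""
    threshold = factor * statistics.pstdev(speed_deg_s)
    return [i for i, s in enumerate(speed_deg_s) if s > threshold]
```

For 2D data, the speed passed to `saccade_samples` would be the Euclidean norm of the horizontal and vertical velocity components.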
Cells were selected for inclusion based on four criteria: isolation quality, minimum number of trial repeats, significant heading tuning, and Gaussian-like tuning for heading direction. The spike trains of each cell had to exhibit a clear refractory period as assessed by autocorrelation. To be included, each unit had to have at least four repeats for each of the 49 conditions in the main experiment (mean number of repeats per condition was 8). As a first pass to ensure we included only heading-selective neurons, we used a one-way Welch's ANOVA to test for significant differences in firing between the 26 presented directions in the heading tuning protocol. Units that passed this test had their response patterns under the fixation condition fit with a von Mises function using MATLAB's fmincon as follows:
where A is an amplitude parameter, θ and θpref are the presented and preferred heading directions, respectively, σ is a parameter that sets heading selectivity, and R is the response offset that sets the response of the cell to the antipreferred heading (see Fig. 4A, bottom). Upper and lower bounds for the parameters were chosen to be physiologically realistic (hard lower bound at 0 for amplitude and response offset), to avoid excessively peaked fits (σ ≥ 1.5 times interheading spacing), and to keep the estimate of preferred tuning within one interheading spacing of the most extreme angles tested. Cells with poor fits during the fixation condition (r² < 0.5) were excluded from all further analysis.
In the main experiment, the responses of each cell in each of the six pursuit conditions (normal pursuit, simulated pursuit, and stabilized pursuit to the left or right) to the seven presented heading directions were fit individually with the von Mises with all four parameters free. To obtain the tuning bandwidth (full-width of tuning curve at half-maximum) from these fits, we used the following equation:
Because our analyses used pairwise comparisons of parameter fits between fixation and pursuit, if any pair member contained a fit that was not sufficiently better than a mean fit (threshold r² > 0.5), the pair was excluded from further analysis.
To combine changes in preferred heading direction θpref between leftward and rightward pursuit trials as well as in cells that preferred forward and backward headings, we needed to take into account that these conditions induce different directions of curvature in the retinal flow pattern. For example, rightward pursuit produces rightward curvature for forward heading, but leftward curvature for backward headings (and vice versa). Because these distortions should produce different signs of tuning curve shifts between the fixation and pursuit conditions, we reversed the signs for data obtained under leftward pursuit and backward heading so that all analyzed shifts are consistent with rightward pursuit and forward heading.
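The sign-reversing procedure can be written compactly. The sketch below assumes that leftward pursuit and backward heading each flip the sign independently, so a cell tested with both is flipped twice and keeps its original sign, consistent with the curvature directions described above; the function name and argument encoding are illustrative.

```python
def aligned_shift(shift_deg, pursuit_direction, heading_is_forward):
    """Express a tuning shift in the rightward-pursuit, forward-heading frame."""
    sign = 1.0
    if pursuit_direction == "left":
        sign = -sign  # leftward pursuit reverses the curvature direction
    if not heading_is_forward:
        sign = -sign  # backward heading reverses it again
    return sign * shift_deg
```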
To test for significant tuning shifts in the cell sample, we used a Wilcoxon signed-rank test to assess the paired difference between heading preferences during fixation and during each of the three pursuit conditions after the sign-reversing procedure. We used the same test to compare heading preference shifts between normal and simulated pursuit (see Fig. 8), and between monocular and binocular viewing conditions (see Fig. 10).
For the amplitude and bandwidth parameters, we assessed differences between fixation and pursuit with a Wilcoxon signed-rank test in a similar manner. Given the distribution of fits, we tested for pairwise differences in the offset parameter using the following permutation procedure. For each pursuit condition, we pooled the calculated offset parameters from both fixation and pursuit and selected values (without replacement) to form two new samples and calculated the median paired difference between them. This procedure was then repeated 10,000 times to obtain a distribution under the null hypothesis, which was then used to calculate the probability of obtaining the actual median paired difference.
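The permutation procedure for the offset parameter can be sketched as follows. The two-tailed comparison and the fixed random seed are assumptions made for this illustration, and the function name is hypothetical.

```python
import random
import statistics


def permutation_p_value(fixation, pursuit, n_perm=10_000, seed=0):
    """Median paired difference and its permutation p value (two-tailed, assumed)."""
    rng = random.Random(seed)
    n = len(fixation)
    observed = statistics.median([p - f for f, p in zip(fixation, pursuit)])
    pooled = list(fixation) + list(pursuit)
    n_extreme = 0
    for _ in range(n_perm):
        # Reassign pooled values to two new samples without replacement.
        rng.shuffle(pooled)
        sample_a, sample_b = pooled[:n], pooled[n:]
        null_median = statistics.median(
            [b - a for a, b in zip(sample_a, sample_b)])
        if abs(null_median) >= abs(observed):
            n_extreme += 1
    return observed, n_extreme / n_perm
```

The returned proportion is the fraction of shuffled datasets whose median paired difference is at least as large in magnitude as the observed one.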
A model II regression (major axis regression) was used to quantify the paired relationship between the simulated-normal pursuit and stabilized pursuit-fixation heading preference shifts (see Fig. 8B). In addition, we calculated Spearman's rank correlation coefficient between the two pairs of differences and assessed the significance of this result with a t test. To identify neurons that showed significant differences between stabilized pursuit and fixation and between normal and simulated pursuit, we used a permutation test to construct a distribution under the null hypothesis. Responses from individual trials were pooled within each of these pairs (separately for each heading direction) and used to generate new sets of responses for each of the 49 different conditions. These shuffled responses were again fit with a von Mises function to estimate heading preference shifts. After repeating this procedure 500 times, tuning shifts in the original dataset were deemed significant with a two-tailed test at the 5% level.
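Major axis regression treats both variables as subject to error and fits the line along the first principal axis of the data. A minimal sketch using the standard closed-form slope is shown below; the function name is illustrative, and the formula is the textbook expression rather than a reproduction of the original analysis code.

```python
import math


def major_axis_regression(x, y):
    """Slope and intercept of the major axis (model II) regression line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if s_xy == 0:
        raise ValueError("major axis slope is undefined for zero covariance")
    # Slope of the first principal axis of the 2D scatter.
    slope = (s_yy - s_xx + math.sqrt((s_yy - s_xx) ** 2 + 4 * s_xy ** 2)) / (2 * s_xy)
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Unlike ordinary least squares, swapping the two variables yields the reciprocal slope, which is why this method suits comparisons in which neither set of shifts is the independent variable.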
To compare the differences between expected tuning shifts and measured shifts during simulated pursuit, we used a model I linear regression and a Wilcoxon signed-rank test for paired differences.
Code accessibility.
All data analysis and modeling code used to define uncorrected heading preference shifts can be accessed as freeware (https://github.com/tsmanning/EfferenceCopyMST).
Software accessibility.
The software developed for stimulus presentation and image stabilization (render) is available upon request.
Results
Manipulating the relationship between eye velocity and retinal slip isolates the effects of retinal and extraretinal signals on neuronal responses to optic flow
To investigate the origin of the corrective signals for pursuit in MSTd, we manipulated the relationship between retinal slip and rotational eye velocity as monkeys viewed a set of stimuli simulating self-motion through a 3D scene filled with randomly placed dots (Fig. 1A). In our unmanipulated baseline condition (normal pursuit), each monkey pursued a red target that moved to the left or the right independently of the dots in the optic flow pattern. In the simulated-pursuit condition, we recreated the same retinal flow pattern present during normal pursuit by rotating the viewing direction of the virtual camera while the monkey fixated on a stationary central target (Fig. 1A, bottom left). Because the eyes were stationary, this condition eliminated efference-copy inputs and therefore isolated the effects of retinal mechanisms of flow stabilization on MSTd activity.
Figure 1. Experimental setup and trial time course. A, Geometry of the viewing volume and schematic of the three pursuit conditions. In all cases, heading was simulated by translating a virtual camera through a 3D space embedded with randomly placed dots. (Dot density was equal throughout the viewing volume.) For clarity, the contrast sign of the dots has been inverted from what was presented. The dots have also been increased in size and decreased in number. In the normal and stabilized pursuit conditions, the red target moved to the right or left at 10°/s, independent of the background dots. In simulated pursuit, the red target remained fixed while the virtual camera was rotated right or left. For the stabilized pursuit condition, the virtual camera was rotated with a rotational velocity opposite to that of the eye while the monkeys pursued the target. B, Trial time course and eye position traces. Before and after each trial (ITI), the screen is fully illuminated. After the red fixation dot appears on the screen (t−2) at one of three locations, the monkey must remain fixated for 500 ms before the dots appear (t−1) in the volume. Following a 180 ms pause, the translation epoch (t0) begins as the monkey continues to fixate. For pursuit trials, the fixation dot initially steps at the beginning of the epoch (t1) to a more eccentric position before traveling to the left or to the right.
We also developed a novel manipulation that largely eliminated the distortions to the pattern of retinal flow arising from pursuit and therefore isolated the contributions of extraretinal mechanisms. In this stabilized-pursuit condition (Fig. 1A, bottom right), the monkey pursued a moving target as in normal pursuit, but we rotated the virtual camera in the direction opposite to the eye rotation based on online estimates of eye velocity (see Materials and Methods). As a result, the pattern of retinal flow was nearly identical to the undistorted pattern present during the fixation condition, even though the monkey still pursued the moving target. To ensure that stabilization occurred with the shortest lag possible, we updated our stimulus at a high frame rate and estimated eye velocity online with scleral search coils. The corrective rotations to the virtual camera faithfully matched those of the eye during pursuit with a mean lag approximately equal to a single frame update cycle (9.6 ms; Fig. 2).
Performance of retinal stabilization in a single trial and across all stabilized pursuit trials. In the trial shown, the pursuit target (black) is moving to the left at 10°/s. Orange represents eye position. Blue represents corrective rotations to the camera viewing angle. For each trial, stabilization lag was determined by finding the lag time that maximized the cross-correlation between the eye position and camera position traces. Inset, Histogram of stabilization lag times across all stabilized pursuit trials. Mean lag time was calculated to be 9.6 ms, which is approximately a single-frame lag with the projector refresh rate at 120 Hz.
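The lag estimate described in this caption can be illustrated with a minimal sketch. It is not the authors' analysis code: the sampling rate, the synthetic traces, the use of a Pearson-normalized correlation, and the names `pearson` and `best_lag` are all assumptions made for illustration. The camera trace is expressed with the same sign as eye position so that the two can be compared directly.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_lag(eye, camera, max_lag):
    """Lag in samples (0..max_lag) at which the camera trace best matches the eye trace."""
    # Pair eye[t] with camera[t + lag], i.e. assume the camera trails the eye.
    scores = {lag: pearson(eye[:len(eye) - lag], camera[lag:])
              for lag in range(max_lag + 1)}
    return max(scores, key=scores.get)

# Synthetic example (all values hypothetical): 1 kHz sampling, leftward pursuit
# at 10 deg/s with a small wobble so that the traces are not a pure ramp.
SAMPLE_RATE_HZ = 1000
eye = [-10.0 * t / SAMPLE_RATE_HZ
       + 0.3 * math.sin(2 * math.pi * 3 * t / SAMPLE_RATE_HZ)
       for t in range(1000)]
TRUE_LAG = 10  # samples
# Camera correction, sign-matched to eye position and delayed by TRUE_LAG samples.
camera = [eye[max(t - TRUE_LAG, 0)] for t in range(1000)]

lag_ms = 1000.0 * best_lag(eye, camera, max_lag=30) / SAMPLE_RATE_HZ
print(f"estimated stabilization lag: {lag_ms:.1f} ms")
```

A perfectly linear position ramp correlates equally well with any delayed copy of itself, so an estimate of this kind relies on the small velocity fluctuations that real pursuit traces contain.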
Pursuit eye movements distort optic flow by introducing an apparent curvature in depth to the motion pattern in the same direction as the eye rotation for forward heading directions (Fig. 3A, compare upper and lower flow patterns). More specifically, rightward pursuit produces rightward shifts in the centers of motion that increase proportionally with the distance from the viewer. If the tuning curves were a simple representation of the centers of motion without compensating for their shifts on the retina, then one would expect a substantial shift in the tuning curves in the direction opposite that of the center of motion shifts (Fetsch et al., 2007; Lee et al., 2011). This reflects the degree to which the center of motion in the stimulus overlaps with the center of motion associated with a cell's preferred heading direction. However, pursuit eye movements produce only small shifts in the tuning curves of heading-selective neurons in MSTd (Bradley et al., 1996; Shenoy et al., 2002; Maciokas and Britten, 2010). These shifts reflect an undercompensation for pursuit-related distortions; if the responses of MSTd neurons were completely stable during pursuit, the tuning curves would completely overlap. These tuning curve shifts are illustrated in Figure 3A. The flow pattern resulting from translational self-motion and eye rotation in Figure 3A that best matches the preferred pattern of the neuron without pursuit is displaced by ∼10 degrees to the left, which should produce a similar shift in the tuning curve. As the difference between the peaks of the curves is much closer to 0° than 10°, we can surmise that MSTd uses either efference copy or a retinal cue of pursuit eye movements to achieve stability.
Effects of smooth pursuit on retinal flow and neuronal responses, and predictions under two hypothesized mechanisms of heading stability. A, Top, Optic flow patterns for three example heading directions about straight ahead (−10°, 0°, 10°). Middle, Retinal flow patterns for the same three heading directions during rightward pursuit. Summing the motion components due to self-motion and eye rotation produces an apparent rightward curvature to the flow pattern due to increasing rightward shifts in the centers of motion with each successive depth plane. Bottom, Results from Maciokas and Britten (2010) demonstrating the effects of these distortions on the responses of a single unit in MSTd. Given the rightward displacement of the center of motion in each of the flow patterns with rightward pursuit, the cell's tuning curve is expected to move substantially to the left; yet it shifts only a small amount. B, C, Predicted tuning curve shifts under each of the pursuit conditions in the present study in the case where cells achieve partial stability for heading direction encoding during pursuit using purely an extraretinal (B) or purely a retinal (C) signal. For reference, the expected uncorrected tuning curve for a noncompensating neuron is also shown at the far left in C. The horizontal axis has been magnified here relative to A for clarity.
If we assume that efference-copy and retinal-cue mechanisms are mutually exclusive, we can predict how a neuron's heading-direction preference will change under each stimulus condition. Under the hypothesis that efference copy alone is responsible for response stability, we would expect to observe the responses in Figure 3B during rightward pursuit from a neuron that prefers forward self-motion (0° Az). Responses during fixation and normal pursuit would be similar to those in Figure 3A (bottom), with a small leftward shift in the tuning curve during pursuit. In simulated pursuit, we eliminate the influence of efference copy on MSTd neurons while retaining the distorted pattern of retinal flow. We therefore would expect to see an even greater leftward shift, indicating that the cell signals the true shift in the flow pattern's center of motion on the retina. Conversely, in stabilized pursuit, we retain the putative efference copy inputs while eliminating the flow distortions. If we assume that efference copy pushes the preferred heading direction of the cells in the direction of pursuit to counteract the distortions (i.e., shifts the simulated pursuit tuning curve toward the normal pursuit curve), we would expect to see a rightward shift in tuning compared with fixation during stabilized pursuit. We therefore would expect the difference in tuning between simulated and normal pursuit and between fixation and stabilized pursuit to be of the same magnitude and sign, as efference copy should have the same effect in both cases.
Under the hypothesis that retinal mechanisms alone account for response stability, we would predict the responses in Figure 3C. In this case, we assume that MSTd neuronal responses will be purely determined by the retinal flow pattern. As shown previously, during normal pursuit, we would see a leftward shift in tuning compared with fixation. The magnitude of this shift is assumption-dependent; we model this quantitatively for each cell below. In simulated pursuit, we would expect the cell's responses to be identical to those found in normal pursuit, as the distorted flow patterns are identical in the two conditions. Likewise, motion on the retina is identical and undistorted under both fixation and stabilized-pursuit conditions, so we would expect neuronal responses to be identical in these conditions.
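The two sets of predictions can be summarized in a short sketch. The function name `predicted_shifts`, its parameters, and the example magnitudes are hypothetical and chosen only to make the logic explicit. Shifts are expressed relative to fixation for a forward-preferring cell during rightward pursuit, with negative values denoting leftward shifts.

```python
def predicted_shifts(uncorrected_shift, compensation, mechanism):
    """Predicted change in preferred heading (deg) relative to fixation.

    uncorrected_shift: magnitude of the leftward shift expected with no compensation.
    compensation: magnitude of the shift removed by the hypothesized mechanism.
    mechanism: "extraretinal" (efference copy only) or "retinal" (retinal cues only).
    """
    residual = -(uncorrected_shift - compensation)  # partial stability
    if mechanism == "extraretinal":
        return {
            "fixation": 0.0,
            "normal": residual,               # distorted flow, efference copy present
            "simulated": -uncorrected_shift,  # distorted flow, no efference copy
            "stabilized": +compensation,      # undistorted flow, efference copy present
        }
    if mechanism == "retinal":
        return {
            "fixation": 0.0,
            "normal": residual,     # response set by the retinal flow alone
            "simulated": residual,  # same retinal flow as normal pursuit
            "stabilized": 0.0,      # same retinal flow as fixation
        }
    raise ValueError("mechanism must be 'extraretinal' or 'retinal'")

# Hypothetical magnitudes, for illustration only.
extra = predicted_shifts(10.0, 8.0, "extraretinal")
retinal = predicted_shifts(10.0, 8.0, "retinal")
print(extra)
print(retinal)
```

Under the extraretinal account, the efference-copy contribution appears twice: as the normal-minus-simulated difference and as the stabilized-minus-fixation difference. Under the retinal account, both differences are zero.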
Extraretinal signals have a relatively small influence on heading tuning in MSTd neurons compared with retinally based compensatory signals
To test these predictions, we recorded from MSTd neurons in 3 female macaque monkeys (Monkey P: left hemisphere; Monkey Q: right hemisphere; Monkey R: left hemisphere) while they performed the fixation or pursuit tasks. Before the main experiment, we categorized each cell in terms of heading, spiral space, and pursuit tuning (see Materials and Methods). To estimate the changes in preferred heading direction between the conditions, we fit the set of responses in each condition independently with a modified von Mises function (see Materials and Methods). Each parameter of the tuning curve (Fig. 4A, bottom) was free to vary during the fitting procedure. Three example cells (Fig. 4B–D) illustrate that retinal mechanisms are responsible for the majority of compensation for pursuit eye movements, based on the predictions outlined in Figure 3B.
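As an illustration of how a preferred heading can be extracted per condition and then differenced, the sketch below fits a generic von Mises tuning function by grid search. The exact modified von Mises form and the fitting routine used here are given in Materials and Methods and are not reproduced. The parameterization, the grids, the tested headings, and the names `von_mises` and `fit_von_mises` are assumptions.

```python
import math

def von_mises(theta_deg, pref_deg, kappa, amplitude, offset):
    """Generic von Mises tuning curve (assumed form): peak = offset + amplitude."""
    d = math.radians(theta_deg - pref_deg)
    return offset + amplitude * math.exp(kappa * (math.cos(d) - 1.0))

def fit_von_mises(headings_deg, rates, pref_grid, kappa_grid):
    """Grid search over preferred heading and kappa.

    For each grid point, amplitude and offset are solved in closed form by
    ordinary least squares (they are left unconstrained in this sketch).
    """
    n = len(rates)
    mean_r = sum(rates) / n
    best = None
    for pref in pref_grid:
        for kappa in kappa_grid:
            basis = [von_mises(h, pref, kappa, 1.0, 0.0) for h in headings_deg]
            mean_b = sum(basis) / n
            var_b = sum((b - mean_b) ** 2 for b in basis)
            if var_b == 0.0:
                continue  # flat basis: amplitude is not identifiable
            cov = sum((b - mean_b) * (r - mean_r) for b, r in zip(basis, rates))
            amplitude = cov / var_b
            offset = mean_r - amplitude * mean_b
            sse = sum((offset + amplitude * b - r) ** 2
                      for b, r in zip(basis, rates))
            if best is None or sse < best[0]:
                best = (sse, pref, kappa, amplitude, offset)
    _, pref, kappa, amplitude, offset = best
    return {"pref_deg": pref, "kappa": kappa,
            "amplitude": amplitude, "offset": offset}

# Noiseless synthetic responses (hypothetical headings and parameters).
headings = list(range(-60, 61, 15))
fixation = [von_mises(h, 0, 4.0, 30.0, 5.0) for h in headings]
pursuit = [von_mises(h, -5, 4.0, 30.0, 5.0) for h in headings]
pref_grid = range(-30, 31)          # 1 deg resolution
kappa_grid = [1.0, 2.0, 4.0, 8.0]

fit_fix = fit_von_mises(headings, fixation, pref_grid, kappa_grid)
fit_pur = fit_von_mises(headings, pursuit, pref_grid, kappa_grid)
shift = fit_pur["pref_deg"] - fit_fix["pref_deg"]
print(f"shift in preferred heading: {shift} deg")
```

Fitting each condition independently, as in this sketch, lets amplitude, bandwidth, and offset vary freely, so a change in preferred heading is not forced by changes in the other parameters.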
Example MSTd unit responses. A, Top, Heading and pursuit direction conventions. Bottom, Naming conventions for cell response parameters extracted with von Mises curve fit. B–D, Examples of single-unit responses under pursuit manipulations. Heading tuning is shown with an equirectangular projection of 2D heading space. Error bars indicate SD in firing rate across the repeats of each condition. Heading directions on the abscissae for backward-preferring neurons continue to increase in a clockwise fashion to avoid discontinuities. B, A unit that is broadly tuned for leftward pursuit and backward heading directions. C, A unit tuned for upward pursuit and broadly tuned for backward heading. This unit exhibited substantial tuning for clockwise spiral motion. D, An example unit that is suppressed during the smooth-pursuit tuning protocol (mean response during fixation: 6 sp/s). The cell is also well tuned for forward and slightly upward heading direction. White bars in the heading tuning panels represent the range of headings that were used for the main experiment.
In total, we recorded from 147 neurons, of which 101 cells passed our conservative inclusion criteria. First, we selected only cells that were both well isolated (118 of 147) and were held long enough for us to record full datasets (126 of 147). We excluded cells that were nonselective (10 of 147; p > 0.05 for Welch's ANOVA) or had response patterns during fixation that were poorly fit with a von Mises function (9 of 147; r² < 0.5). For each included cell, we also excluded stimulus conditions (i.e., pursuit direction or manipulation condition) in which responses were poorly fit (Table 1). The heading preferences were biased for forward self-motion (Fig. 5A), characteristic of MSTd (Graziano et al., 1994; Takahashi et al., 2007). Pursuit preferences were fairly evenly spread across the eight pursuit directions tested (Fig. 5B).
Number of included cells in each pursuit condition
Distribution of heading and pursuit direction preferences for all included cells in the sample. A, Histogram of heading preferences determined by Kent function fit. B, Polar histogram of pursuit preferences as determined by direction of maximal cell response.
We initially recorded neurons under binocular conditions but recorded the bulk of the data under monocular occlusion to eliminate cue conflict between motion parallax and binocular disparity cues to depth. With monocular viewing, stereoscopic depth is undefined, so it cannot conflict with the motion parallax cue we were most interested in. To compare the effects of the two conditions, we also ran the main experiment under both monocular and binocular viewing on a subset of cells. For cells recorded under both viewing conditions, only the monocular data were included in the subsequent analyses (Table 2).
Number of included cells under each viewing condition
To analyze the contribution of retinal and extraretinal mechanisms to pursuit compensation, we compared average fits between fixation and each of the pursuit conditions (Fig. 6). Because optic flow patterns are opposite for forward (expanding patterns) and backward (contracting patterns) headings, pursuit in a given direction will produce opposite shifts in the centers of motion and tuning curve peaks. Similarly, leftward and rightward pursuit directions produce opposing directions of retinal slip and therefore shifts in the centers of motion. Accordingly, we transformed each cell's tuning into a common coordinate frame corresponding to forward heading and right pursuit (consistent with Fig. 3 conventions; for details, see Materials and Methods). Two comparisons are particularly revealing in these sample-average curves. First, the fixation (gray curve) and stabilized pursuit (red curve) tunings are nearly identical, consistent with the predictions of a retinal model (Fig. 3C). Also consistent with a retinal-model prediction, the simulated and normal pursuit curves are both shifted well to the left. However, the simulated-pursuit case is shifted further to the left, suggesting that there is some influence of an extraretinal signal in the normal pursuit condition not seen in stabilized pursuit.
Mean tuning curve fits under fixation and each of the pursuit conditions. Each cell in the sample was individually fit for each pursuit condition, and then the parameters of all the fits were averaged within each pursuit condition to produce a single average function. Arrows indicate the mean preferred heading direction relative to fixation.
To examine the magnitudes of these changes more closely, we represent the same data as sample histograms in Figure 7. We find significant shifts in heading tuning for the normal pursuit versus fixation comparison (Fig. 7A: median = −5.29, p = 1.2 × 10⁻⁵, Wilcoxon signed-rank test) and for the simulated pursuit versus fixation comparison (Fig. 7B: median = −8.64, p = 9.02 × 10⁻¹⁵, Wilcoxon signed-rank test). The stabilized pursuit versus fixation comparison shows a small shift in the direction opposite of the normal pursuit comparison, but this is not significant (Fig. 7C: median = 0.751, p = 0.0596, Wilcoxon signed-rank test). Overall, we find that the shift in heading preference is only modestly larger during simulated pursuit than during normal pursuit, and no significant tuning shifts occur during stabilized pursuit. There were no significant differences in results among the 3 animals (ANOVA; Table 3). Together, our results are inconsistent with the hypothesis that efference copy alone can substantially alter heading tuning in MSTd during smooth pursuit.
Distribution of changes in heading preferences across entire cell sample under different pursuit manipulation conditions. To produce a consistent expected direction of tuning shift (i.e., forward heading and rightward pursuit), the sign of the change was reversed for cells tested with backward headings and leftward pursuit (see Materials and Methods). Each cell therefore contributes two counts to the histogram (i.e., for left and right pursuit). A, Normal pursuit. B, Simulated pursuit. C, Stabilized pursuit. Significance for shift in preferred heading direction between fixation and each pursuit condition was determined with a Wilcoxon signed-rank test and threshold of p < 0.05. Black arrow indicates median paired differences. *Significant difference.
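The sign convention described in this caption can be sketched as follows. This is an illustrative reading, not the procedure from Materials and Methods: it assumes that each of the two reversals (backward heading, leftward pursuit) flips the sign once, so that a cell tested with both keeps its original sign. The function name `to_common_frame` and the example values are hypothetical.

```python
from statistics import median

def to_common_frame(shift_deg, heading, pursuit):
    """Express a tuning shift as if measured with forward heading and rightward pursuit."""
    sign = 1
    if heading == "backward":
        sign = -sign  # contracting flow reverses the expected shift direction
    if pursuit == "left":
        sign = -sign  # leftward pursuit reverses it again
    return sign * shift_deg

# Hypothetical shifts (deg): (raw shift, heading preference, pursuit direction).
records = [
    (-5.0, "forward", "right"),
    (7.0, "backward", "right"),
    (4.0, "forward", "left"),
    (-9.0, "backward", "left"),
]
common = [to_common_frame(s, h, p) for s, h, p in records]
median_shift = median(common)
print(common, median_shift)
```

After this transformation, an uncompensated or partially compensated cell is expected to show a negative (leftward) shift regardless of the heading and pursuit direction with which it was tested.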
Results from three-way ANOVA for interanimal differences in tuning shifts
Because individual cells in the sample differed in their shift magnitudes, we were interested in whether these differences were systematic or random. Following the predictions of Figure 3, retinally based correction signals and efference-copy signals should appear in specific comparisons between our conditions. Retinal signals of pursuit should be revealed by the comparison between simulated pursuit and fixation, as well as between normal pursuit and fixation. We reasoned that cells that differed in the magnitude of their reliance on retinal cues would lead to a positive correlation in the magnitude of these two shifts, and this was indeed reflected in our results (Fig. 8A; r = 0.463, p = 1.54 × 10⁻¹¹). Incomplete correction based on retinal signals alone would lead to a difference in the magnitudes of these two shifts, which we also observed (median difference = 3.14, p = 3.02 × 10⁻⁵, Wilcoxon signed-rank test). This difference (offset from the diagonal in Fig. 8A) is presumably due to an extraretinal signal only present in the normal pursuit condition. Therefore, we wished to examine this difference in comparison with stabilized pursuit, which should also directly reveal the magnitude of the extraretinal signal.
Effects of efference copy on heading tuning in cell sample. A, Scatter plot and marginal histogram comparing the changes in heading-direction preferences between the simulated and normal pursuit conditions and fixation for all included units. Arrow indicates median difference. *Significant difference. B, Paired comparison of heading shifts for each unit between normal and simulated pursuit and between the stabilized pursuit and fixation conditions. Open squares represent significant stabilized pursuit versus fixation differences. Open circles represent significant normal versus simulated pursuit differences. Asterisks indicate significant differences for both comparisons.
Simulated pursuit removes extraretinal signals compared with normal pursuit, whereas stabilized pursuit adds the signal compared with fixation. In both, the retinal stimulus remains the same during the experimental manipulation. Cells that showed large shifts during stabilized pursuit should therefore show large shift differences between normal and simulated pursuit, on the assumption that these cells carry stronger extraretinal signals. When we compare these differences with the shifts during stabilized pursuit (Fig. 8B), we find a significant positive correlation between the two (r = 0.387, p = 1.15 × 10⁻⁷). A model II regression found that the magnitude of the shift in stabilized pursuit was smaller than the shift difference between normal and simulated pursuit (slope = 0.636). Using a permutation test (see Materials and Methods), we identified significant shift differences between normal and simulated pursuit (N = 13), stabilized pursuit and fixation (N = 11), or both (N = 2). In sum, these results are consistent with a small but significant contribution of efference copy to heading preferences during pursuit for a minority (24 of 101) of cells. Additionally, our data suggest that retinal and extraretinal signals might act synergistically in MSTd, given the larger shift magnitude in normal pursuit, where both retinal and extraretinal eye-movement cues are present.
Given the small contribution of extraretinal signals to pursuit compensation, we designed a procedure to estimate the magnitude of retinal compensation in our neuronal sample. In the absence of simulated depth, tuning in purely retinal coordinates (no pursuit compensation) is straightforward to estimate, as the focus of expansion undergoes a single unique shift. However, because the shift from pursuit depends on depth, additional assumptions are needed to estimate the expected responses of a neuron in purely retinal coordinates to stimuli varying in virtual depth. The problem is that shifts increase with depth, and thus multiple shifts will be simultaneously present in the RF of an MSTd neuron. For the same reason, each region within the RF of an MSTd neuron will contain multiple motion vectors. We assumed that MSTd neurons weight near depth planes more than far, based on psychophysical (Royden et al., 1994) and physiological (Tanaka and Saito, 1989; Upadhyay et al., 2000; Inaba and Kawano, 2010) evidence. This assumption is conservative, as these planes will have the smallest shifts; any contribution of farther depths would produce higher values for the predicted, uncorrected shift. For each neuron, we calculated the average shift in the closest 10 cm of the stimulus and used it as an estimate of the uncorrected retinal prediction.
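The depth dependence that motivates this procedure can be made concrete with a simplified geometric sketch. For a point in the horizontal plane at distance Z, forward translation at speed T produces angular velocity (T/Z)·sin(θ − h) at azimuth θ for heading h, and pursuit at rate ω adds a uniform angular velocity of magnitude ω; the two cancel where sin(θ − h) = ωZ/T. The code below averages the resulting shift over the nearest slab of the volume. Only the 10°/s pursuit speed and the 10 cm slab come from the text. The translation speed, near distance, uniform averaging over distance, and function names are assumptions, and the per-neuron procedure in Materials and Methods may differ.

```python
import math

def center_of_motion_shift_deg(distance_m, pursuit_deg_s, translation_m_s):
    """Angular shift of the center of motion for points at one distance."""
    ratio = math.radians(pursuit_deg_s) * distance_m / translation_m_s
    if ratio > 1.0:
        # Rotation exceeds the largest translational flow: no center of motion.
        raise ValueError("no center of motion exists at this distance")
    return math.degrees(math.asin(ratio))

def mean_shift_nearest_slab(near_m, slab_m, pursuit_deg_s, translation_m_s,
                            steps=100):
    """Average shift over distances in [near_m, near_m + slab_m] (midpoint rule)."""
    total = 0.0
    for i in range(steps):
        d = near_m + slab_m * (i + 0.5) / steps
        total += center_of_motion_shift_deg(d, pursuit_deg_s, translation_m_s)
    return total / steps

PURSUIT_DEG_S = 10.0     # from the text
SLAB_M = 0.10            # closest 10 cm, from the text
NEAR_M = 0.20            # hypothetical near edge of the dot volume
TRANSLATION_M_S = 0.5    # hypothetical simulated self-motion speed

uncorrected = mean_shift_nearest_slab(NEAR_M, SLAB_M, PURSUIT_DEG_S,
                                      TRANSLATION_M_S)
print(f"predicted uncorrected shift: {uncorrected:.2f} deg")
```

Because the shift grows monotonically with distance, restricting the average to the nearest slab yields the smallest prediction available from the volume, which is the sense in which the near-plane assumption is conservative.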
We ran this procedure across all neurons in our sample and compared the calculated uncorrected shift and the actual shifts seen during simulated pursuit (Fig. 9). The median shift in our data was ∼60% of the predicted value (median difference = 6.7°, p = 0.0081, Wilcoxon signed-rank test), indicating that there was large tuning compensation based on retinal signals alone. We found a substantial range of relative shifts, with some cells shifting much more than predicted. These cells with higher shifts presumably place more weight on farther depth planes, probably because of a preference for the lower speeds that are present at greater virtual distances.
Observed compensation relative to predictions based on a complete lack of correction. The observed shift was derived from the difference between the simulated pursuit condition and fixation for each cell in our sample (n = 101). As in Figure 7, each cell is counted twice. Arrow indicates median shift.
These data show that, consistent with previous reports, MST neurons can substantially correct for the biases in heading tuning consequent to pursuit eye movements using a correction based on motion cues in the scene itself. The finding that these corrections are at least twice (6.7° vs 3.1°) as large as those seen when extraretinal cues are added (Fig. 8A) supports the conclusion that retinal compensation mechanisms dominate extraretinal ones under our conditions.
Viewing condition and pursuit accuracy do not significantly alter heading preference changes
We wanted to confirm that our inclusion of two different viewing conditions did not significantly influence our results. To do so, we recorded from a subset of cells under both binocular and monocular viewing, when cells could be held for long enough. A paired comparison (Fig. 10) failed to show a significant median difference in heading preference shifts between the two viewing conditions (normal vs fixation: median = 0.219, p = 0.411; simulated vs fixation: median = 0.662, p = 0.89; stabilized pursuit vs fixation: median = −1.01, p = 0.42, Wilcoxon signed-rank test).
Comparison of heading preference shifts between monocular and binocular viewing conditions. Unit responses (N = 19) under each of the pursuit conditions were recorded with and without monocular occlusion, and the order of presentation was pseudorandomized across cells. Dashed black line indicates line of unity. Pursuit direction color conventions are the same as for Figure 6.
We also investigated whether differences in eye movements between each of the pursuit conditions could account for the changes in heading preference. During fixation and simulated pursuit, the pursuit gain is fixed at one (i.e., the mean horizontal eye velocity in a given trial is equal to the horizontal velocity of the pursuit target); whereas in normal and stabilized pursuit, it depends on each animal's performance. This dependence could inflate the shift difference between normal and simulated pursuit if eye speed is substantially lower than that of the pursuit target. The mean pursuit gain across all trials with active pursuit and across the 3 monkeys is 0.88 ± 0.069 (SD), which is similar to previous reports for tracking eye movements in the presence of a textured background (Takeichi et al., 2003).
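Pursuit gain as defined here (mean horizontal eye velocity divided by target velocity) can be computed from an eye position trace as in the following sketch. The sampling rate, the synthetic trace, and the function name `pursuit_gain` are assumptions, and the sketch presumes that saccades have already been removed from the analysis window.

```python
def pursuit_gain(eye_pos_deg, sample_rate_hz, target_velocity_deg_s):
    """Mean horizontal eye velocity over the window divided by target velocity."""
    duration_s = (len(eye_pos_deg) - 1) / sample_rate_hz
    # Mean velocity over the window equals net displacement divided by duration.
    mean_eye_velocity = (eye_pos_deg[-1] - eye_pos_deg[0]) / duration_s
    return mean_eye_velocity / target_velocity_deg_s

# Hypothetical trial: 0.5 s of rightward pursuit sampled at 1 kHz,
# with the eye moving at 8.8 deg/s while the target moves at 10 deg/s.
SAMPLE_RATE_HZ = 1000
eye_trace = [8.8 * t / SAMPLE_RATE_HZ for t in range(501)]
gain = pursuit_gain(eye_trace, SAMPLE_RATE_HZ, target_velocity_deg_s=10.0)
print(f"pursuit gain: {gain:.2f}")
```

A gain below one means that the eye lags the target, so the rotational component of retinal flow during normal pursuit is somewhat smaller than the one imposed in simulated pursuit.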
While this value is below unity, gains for each condition were quite variable. We took advantage of this variability to see whether pursuit gain predicted the magnitude of shifts in heading preference. We found no significant correlation between average pursuit gain and heading preference shift on a cell-by-cell basis (Fig. 11A: normal pursuit, left: r = 0.0735, p = 0.50; right: r = 0.0022, p = 0.984; Fig. 11B: stabilized pursuit, left: r = −0.0013, p = 0.991; right: r = 0.114, p = 0.310). Because failures of pursuit usually lead to catch-up saccades, we also examined whether the tuning shifts were related to saccade frequency, and also found no significant relationships (analysis not shown). Together, these additional analyses support the view that extraretinal signals of eye movements have a limited effect on heading tuning in MST.
Relationship between pursuit gain and heading tuning shifts. A, Normal pursuit. B, Stabilized pursuit. The slightly higher gain in both right pursuit conditions was likely due to a small overrepresentation of leftward headings in our set of tested stimuli.
Pursuit manipulations have negligible effect on other aspects of neuronal tuning
Our analysis thus far focused on the stability of heading tuning preferences during pursuit eye movements, but theoretical work has proposed that MSTd can encode veridical heading direction at the population level with gain fields or by modulating response offset. Gain fields for eye velocity may reflect the first stage of the process used to produce shifts in heading preference, as found for positional coordinate transforms in parietal areas (Andersen et al., 1985; Beintema and van den Berg, 1998). Response offsets (i.e., the response of the cell to the antipreferred heading) of the cell population could be additively modulated with an efference-copy signal to perform a vector operation that would subtract off the pursuit-related components from retinal flow (Perrone and Krauzlis, 2008).
To test these hypotheses, we investigated whether other parameters of the cell-response fits were modulated between fixation and the different pursuit conditions. These parameters (Fig. 4A, bottom) were extracted at the same time as the estimation of each cell's preferred direction, using a modified von Mises function in which all parameters are independent (see Materials and Methods). Overall, the cells showed a decrease in their firing range across all three pursuit conditions, as seen in the response-amplitude parameter (Fig. 12; normal vs fixation: median = −3.29, p = 2.91 × 10⁻⁵; simulated vs fixation: median = −1.75, p = 1.87 × 10⁻⁴; stabilized pursuit vs fixation: median = −2.75, p = 6.16 × 10⁻⁶, Wilcoxon signed-rank). Neither the tuning bandwidth (normal vs fixation: median = −1.17, p = 0.593; simulated vs fixation: median = −4.53, p = 0.179; stabilized pursuit vs fixation: median = −2.08, p = 0.0823, Wilcoxon signed-rank), nor the offset (normal vs fixation: median = −2.42 × 10⁻⁷, p = 0.693; simulated vs fixation: median = −1.15 × 10⁻⁶, p = 0.404; stabilized pursuit vs fixation: median = −2.37 × 10⁻⁷, p = 0.669; for permutation test, see Materials and Methods) parameters showed any significant difference between fixation and pursuit across the cell sample. Further analysis revealed no significant differences in parameter changes between left and right pursuit (Table 4).
Effects of pursuit manipulations on other tuning curve parameters. Each plot is a cell-by-cell paired comparison of best-fit parameters for fixation and pursuit. A, Amplitude parameter. B, Tuning curve bandwidth or FWHM, as derived from σ parameter. C, Response offset parameter (sometimes referred to as baseline).
Differences in tuning curve parameters between left and right pursuit
Offset parameters were most often best fit with a value of zero, which together with the median change across the sample is at odds with theoretical mechanisms that depend on this response property (Perrone and Krauzlis, 2008). Although we were not able to replicate previous findings of gain fields for eye velocity (Squatrito and Maioli, 1997) with the amplitude parameter, our sampling of pursuit space was limited to two different directions that were often misaligned with the cells' null-preferred pursuit direction axes. In sum, we find that our pursuit manipulations mostly affected neurons' preferred heading directions rather than other aspects of cell tuning.
Discussion
We investigated how animals compensate for the sensory consequences of their own behavior, using monkey visual-motion perception as a model. Our experiments measured the relative contributions of extraretinal and retinal mechanisms to the stability of heading responses in MSTd during smooth pursuit. To this end, we developed a novel manipulation of an optic flow stimulus that actively stabilizes the flow pattern on the retina during smooth pursuit to eliminate the effects of retinal mechanisms on the heading preferences of MSTd neurons. Using this manipulation alongside a simulated pursuit paradigm revealed that the contributions of extraretinal mechanisms to the stability of heading tuning during pursuit are small compared with the contributions of retinal mechanisms. These results show that high-level visual cortex discounts reafferent signals of eye movements largely through retinally based calculations when motion parallax is the only available depth cue. However, our results show more evidence for a contribution of extraretinal signals under normal pursuit than when retinal cues are removed via stabilization. This finding suggests that retinal and extraretinal compensation mechanisms interact synergistically, rather than additively.
Relationship with previous physiological work
Our work differs in some important ways from previous experiments that identified a larger role for efference copy in MSTd. The Andersen laboratory (Bradley et al., 1996; Shenoy et al., 2002) reported large changes in preferred headings between normal and simulated pursuit, which could be due to a variety of factors. Their flow stimulus included only a single depth plane, was limited to a 50° × 50° FOV, and had a higher ratio of pursuit speed to self-motion speed (which produces a larger center of motion shift). More recent physiological work has shown that compensation for pursuit is much more complete in the presence of simulated depth from motion parallax (Maciokas and Britten, 2010; Sunkara et al., 2015). This is consistent with most proposed retinal mechanisms of pursuit compensation, which depend on the presence of depth cues in the scene. Because the components of retinal flow due to retinal slip affect all depth planes equally, while the self-motion components are subject to motion parallax, the brain could theoretically subtract the full-field slip to retrieve the undistorted pattern of motion (Longuet-Higgins and Prazdny, 1980; Heeger and Jepson, 1992; Royden, 1997). With only a single depth plane available, MSTd neurons would depend entirely on efference copy to solve the rotation problem, increasing the difference in heading preference shifts between normal and simulated pursuit due to increased shifts during simulated pursuit (as discussed above; see Fig. 9). Thus, the single-plane stimulus enables an accurate and assumption-free estimate of the retinal shift, but at the cost of removing the most profound cue that enables a retinally based solution.
Conclusions similar to ours have been drawn about the dominance of retinal stability mechanisms in the ventral intraparietal area (VIP) (Sunkara et al., 2015). VIP represents heading direction at a nearly identical level of precision to MSTd and is similarly tolerant to pursuit distortions (Maciokas and Britten, 2010). It also appears to code heading direction gleaned from optic flow in eye-centered coordinates, like MSTd (Chen et al., 2013), although some neurons' receptive fields are more consistent with a head-centered coordinate system (Duhamel et al., 1997). Sunkara et al. (2015) also find only small differences in preferred heading shifts between normal and simulated pursuit, and conclude that retinal factors contribute more to pursuit tolerance in neurons than does efference copy. Their study differs from ours and those previously mentioned in their inclusion of binocular depth cues, which provide additional information for a retinally based stability mechanism. Changes in the disparity structure in the scene can improve heading judgments during simulated pursuit, mainly in the presence of considerable motion noise (van den Berg and Brenner, 1994; Grigo and Lappe, 1998). Both MT (DeAngelis et al., 1998) and MSTd (Roy et al., 1992) neurons are sensitive to disparity and may use this cue instead of efference copy to compute true heading direction during pursuit (Kim et al., 2015). As in our current study, however, Sunkara et al. (2015) do not find substantial differences in tuning shifts between monocular and binocular viewing conditions during simulated and normal pursuit. They propose that motion parallax and dynamic perspective (i.e., the changing angle between the viewing direction and the visual scene during pursuit) cues alone can support retinally based mechanisms. Overall, the richness of these retinal cues helps explain why our results show that retinal mechanisms dominate pursuit compensation in higher-level visual cortex.
Theoretical retinal mechanisms
How neurons use retinal cues to extract the true heading from the distorted flow patterns is debated. One group proposed that MSTd neurons directly encode heading direction with a dense array of templates that combine varying amounts of retinal slip with optic flow fields resulting from self-motion (Perrone and Stone, 1994, 1998). A downstream decoder would then choose the preference of the most active neuron as the current heading direction. Others suggested that the true heading is instead extracted at the population level via sparse decomposition of the flow field (Beyeler et al., 2016). In their study, the retinal flow field was represented by a comparatively small patchwork of MSTd-like units selective for complex motion patterns covering large portions of the visual field. Subsequent decoding of this population with a linear combination of units could recapitulate some (but not all) of the psychophysical results comparing heading biases between simulated and normal pursuit.
Alternatively, other groups proposed that retinal slip is separated from flow due to self-motion by combining tuning shifts with gain modulation of neuronal responses (Beintema and van den Berg, 1998). Analysis of single-unit responses in VIP to flow patterns with different combinations of translational and rotational components revealed that some cells showed joint, inseparable tuning for specific combinations (Sunkara et al., 2016). These responses were similar to those seen in real pursuit, suggesting that VIP simultaneously represents both the distorted retinal flow pattern and the veridical heading, which could be extracted by a downstream area. Similar joint tuning may be present in MSTd. Along this line, another group also separately extracted pursuit direction and the center of motion in head-centered coordinates using an optimal linear estimator on MSTd responses to optic flow from a single depth plane, concluding that MSTd encodes eye and self-motion with a set of basis functions (Ben Hamed et al., 2003).
These different model formulations make testable predictions for future experiments. The spatial extent of MSTd receptive fields can be revealed by further investigation into responses to local motion patches and interactions between patches (Heuer and Britten, 2007; Mineault et al., 2012). Although our stimulus paradigm was not designed to test these models directly, one could identify joint tuning curves for pursuit and heading with denser sampling of pursuit space within the simulated pursuit paradigm.
Relationship with human psychophysical literature
Despite our physiological findings in MSTd, many psychophysical results support the hypothesis that pursuit compensation in heading perception depends on efference-copy signals (Royden et al., 1994; Banks et al., 1996; Haarmeier et al., 1997). These results may be sensitive to the exact parameters of the experiment, including pursuit speed (Warren and Hannon, 1988), task instructions (Li et al., 2006), and the presence of reference objects in the visual scene (Li and Warren, 2000). Our animals were not trained to make heading judgments, which may have affected top-down influences on sensory cortex. The pursuit and self-motion speeds used in our experiments were similar to those used by Royden et al. (1994), except that our stimulus covered a larger portion of the visual field and contained more dots than theirs, both of which provided richer cues for a retinal mechanism. The size of the FOV in particular has been linked to the precision with which observers can identify the axis of eye rotation and the center of motion in retinal flow (Koenderink and van Doorn, 1987). Going forward, it will be critical to identify how retinal and extraretinal mechanisms might be weighted differently at the neuronal level based on these parameters. Some conditions (e.g., scenes with limited depth cues) may force heading-processing areas to rely more on efference copy, whereas rich retinal inputs, when available, may allow retinal mechanisms to dominate (Crowell and Andersen, 2001; Wilkie and Wann, 2002).
To resolve these differences between our results and those from the human literature, the stabilized pursuit manipulation needs to be used in psychophysical experiments. We predict that biases in heading judgments using this manipulation would be small, as we see in the physiology. If, however, biases remain as large as they are under simulated pursuit, that finding would suggest that downstream areas integrate efference-copy cues. Relatively few physiological studies have investigated pursuit compensation in frontal (Yang and Gu, 2017), premotor (Cottereau et al., 2017), or high-level parietal (Siegel and Read, 1997) cortical areas that receive projections from high-level visual-motion cortex. Additional investigations into the roles of efference copy in these areas would therefore benefit the field.
Footnotes
This work was supported by National Eye Institute Grant R01 EY022087 to K.H.B., National Eye Institute Training Grant T32 EY015387 to T.S.M., National Eye Institute Core Facilities Grant P30 EY012576 to UC Davis, principal investigator J.S. Werner, and New Zealand Marsden Fund Grant (to J. Perrone, coinvestigator K.H.B.). We thank Daniel Sperka for designing the code for stimulus display, data recording, and image stabilization; Martin Banks, John Perrone, and Richard Krauzlis for helpful discussion; Conor Weatherford for technical support and animal care; and Sandra Aamodt for help in editing the manuscript. We also appreciate the helpful suggestions of the anonymous reviewers.
The authors declare no competing financial interests.
Correspondence should be addressed to Tyler S. Manning at tmanning@ucdavis.edu