Visual and vestibular signals converge onto the dorsal medial superior temporal area (MSTd) of the macaque extrastriate visual cortex, which is thought to be involved in multisensory heading perception for spatial navigation. Peripheral otolith information, however, is ambiguous and cannot distinguish linear accelerations experienced during self-motion from those resulting from changes in spatial orientation relative to gravity. Here we show that, unlike peripheral vestibular sensors but similar to lobules 9 and 10 of the cerebellar vermis (nodulus and uvula), MSTd neurons respond selectively to heading and not to changes in orientation relative to gravity. In support of a role in heading perception, MSTd vestibular responses are also dominated by velocity-like temporal dynamics, which might optimize sensory integration with visual motion information. Unlike the cerebellar vermis, however, MSTd neurons also carry a spatial orientation-independent rotation signal from the semicircular canals, which could be useful in compensating for the effects of head rotation on the processing of optic flow. These findings show that vestibular signals in MSTd are appropriately processed to support a functional role in multisensory heading perception.
How we orient and move in the world is encoded by sensory information from the visual and vestibular systems. The dorsal medial superior temporal area (MSTd) of extrastriate visual cortex is important for the processing of optic flow, i.e., the retinal flow patterns experienced during navigation (Gibson, 1950; Warren and Hannon, 1990; Warren, 2003). MSTd neurons have large receptive fields and respond to complex optic flow patterns (Tanaka et al., 1986; Duffy and Wurtz, 1991, 1997; Heuer and Britten, 2004; Logan and Duffy, 2006). In addition, microstimulation of MSTd biases heading judgments based on optic flow (Britten and van Wezel, 1998).
MSTd neurons are also tuned during actual self-motion (Duffy, 1998; Bremmer et al., 1999; Page and Duffy, 2003; Gu et al., 2006; Fetsch et al., 2007) and this selectivity is of vestibular origin (Takahashi et al., 2007). Furthermore, MSTd responses are correlated with perceptual decisions in a heading discrimination task based solely on vestibular cues (Gu et al., 2007). These findings are further corroborated by the fact that multimodal MSTd neurons with congruent visual and vestibular preferences show improved directional sensitivity under bimodal stimulation that parallels similar effects on behavior (Gu et al., 2008). Collectively, these results suggest that vestibular signals in MSTd could be functionally relevant for sensory integration for heading perception.
But are vestibular signals in MSTd indeed appropriate for such a role in multisensory perception? First, for cross-modal integration, vestibular signals must be temporally matched to visual signals (Zupan et al., 2002). But early otolith signals encode acceleration (Fernández and Goldberg, 1976a; Si et al., 1997), whereas visual motion responses are typically velocity-like (Rodman and Albright, 1987; Lisberger and Movshon, 1999). Although the population peristimulus time histogram (PSTH) during heading along the preferred direction appears to at least qualitatively follow a velocity-like waveform (Gu et al., 2006), quantitative characterization of the dynamics of vestibular responses in MSTd is missing. Second, these early otolith signals suffer from a sensory ambiguity: they encode net linear acceleration and cannot distinguish accelerations resulting from self-motion from those resulting from changes in spatial orientation relative to gravity (e.g., during tilt) (Angelaki et al., 2004; Dickman et al., 1991; Fernández and Goldberg, 1976a,b). Perceptually this ambiguity rarely constitutes a problem even in darkness, except for very prolonged periods of acceleration (i.e., at very low frequencies) (Merfeld et al., 1999, 2005). This occurs because the sensory ambiguity in the vestibular periphery can be resolved centrally by combining signals from both vestibular sensors, the otolith organs and the semicircular canals (Angelaki et al., 1999; Merfeld and Zupan, 2002; Zupan et al., 2002; Green and Angelaki, 2003, 2004; Green et al., 2005; Shaikh et al., 2005b). Indeed, Yakusheva et al. (2007, 2008) have described such optimal canal/otolith convergence in Purkinje cells of the nodulus and uvula of the cerebellar vermis.
If vestibular responses in MSTd are indeed appropriate for heading perception, they should modulate selectively during self-motion and not in response to changes in spatial orientation relative to gravity. Here we show that vestibular signals in MSTd are indeed temporally and functionally appropriate for such a role in heading perception.
Materials and Methods
Subjects and surgery.
Three rhesus monkeys (Macaca mulatta, 3.5–6 kg) were chronically implanted with an eye coil, head-restraining ring, and a plastic guide tube platform for single unit recordings (see Meng et al., 2005; Gu et al., 2006 for details). All surgical procedures were performed in accordance with institutional and NIH guidelines.
For these experiments, animals were seated in a primate chair that was secured inside a vestibular turntable consisting of a three-axis rotator on top of a linear sled (Acutronics Inc.). The system could deliver yaw, pitch, or roll rotation and translation along any direction in the horizontal plane (but not vertical translations, as in Gu et al., 2006; Takahashi et al., 2007). Animals were placed such that, when upright, their horizontal stereotaxic plane was aligned with the earth-horizontal and all three rotational axes (yaw, pitch, and roll) were aligned with the center of the head.
We recorded extracellular activity of single neurons in area MSTd using epoxy-coated tungsten microelectrodes (FHC; 1–2 MΩ). Electrodes were inserted into 26-gauge transdural guide tubes and advanced by a remote-controlled microdrive (FHC). Neural activities were amplified, filtered (300 Hz–6 kHz) and passed through a dual time-amplitude window discriminator (BAK Electronics). Note that, although horizontal and vertical eye movements were recorded as part of these experiments, they are not further analyzed here. Details about the three-dimensional eye movements evoked during the same experimental protocols as those used in the present experiments have been presented by Angelaki et al. (1999).
The vestibular-responsive region of area MSTd was identified using a combination of anatomical and electrophysiological criteria, as described in detail previously (Gu et al., 2006). Anatomical criteria were based both on stereotaxic coordinates and magnetic resonance imaging (MRI). Physiological criteria were as follows: (1) MSTd was usually the first gray matter modulating to flashing visual stimuli; (2) MSTd neurons had large receptive fields (RF) that often included the contralateral visual field but could also extend into the ipsilateral visual field; (3) our penetrations in MSTd were also guided by the eccentricity of receptive fields in underlying area MT (Gu et al., 2006).
The microelectrode was advanced into area MSTd while the monkey performed a simple fixation task and the neuron's RF was mapped by moving a patch of drifting random dots around the visual field on a custom graphical interface (see Gu et al., 2006 for details). Then the monitor was turned off and the following protocols were all run in total darkness. Note that the sensitivity of MSTd neurons does not change significantly during self-motion in complete darkness versus fixation of a real or imaginary target (Gu et al., 2006, 2007; Takahashi et al., 2007). The present experiments were all performed in total darkness and care was taken not to have any light leak from around the door of the laboratory. We chose to perform these experiments in total darkness and without any behavioral control for two main reasons. First, we wanted to exclude any contribution of retinal slip or attempt to fixate and suppress the vestibulo-ocular reflex. Second, we wanted to make direct comparisons with responses in thalamus, cerebellum and brainstem, where data were collected while the animals were allowed to make eye movements freely. Finally, note that, although MSTd neurons have robust responses to pursuit (Bremmer et al., 1997; Upadhyay et al., 2000; Page and Duffy, 2003; Ilg, 2008), they do not seem to systematically change their firing rate during the vestibulo-ocular reflex in darkness (Gu et al., 2007).
To manipulate translational (inertial) and net (gravitoinertial) linear acceleration (Angelaki et al., 2004; Meng et al., 2007; Yakusheva et al., 2007), four stimuli were delivered: translation only, tilt only or combined translation and tilt (“tilt-translation” and “tilt+translation”). The tilt stimulus consisted of a 0.5 Hz sinusoidal rotation from an upright position with peak amplitude of ±11.5°. This stimulus causes reorientation of the head relative to gravity, such that otolith afferents are stimulated by a ±0.2 G linear acceleration component in the head-horizontal plane. The amplitude of translation was then adjusted to match that induced by the head tilt (±0.2 G, resulting in a displacement of ±20 cm). During combined tilt and translation, the translational and gravitational accelerations combine in either an additive or subtractive manner, depending on the relative phase of the two stimuli. As a result, the net gravitoinertial acceleration in the horizontal plane either doubled (tilt+translation) or was nearly zero (tilt-translation), even though the actual translation remained the same. Each cell was tested at two orientations, θ = 0° and θ = 90°, corresponding to lateral motion/roll tilt and forward/backward motion/pitch tilt, respectively. Whenever single cell isolation was maintained, lateral and forward/backward translations were also delivered at different frequencies: 0.3 Hz (±0.1 G), 1 Hz (±0.2 G), and 2 Hz (±0.3 G).
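The matching of tilt and translation amplitudes described above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming purely sinusoidal motion and using G = 981 cm/s² as defined later in the text; the function names are illustrative and not part of the original methods.

```python
import math

G_CM_S2 = 981.0  # gravitational acceleration in cm/s^2 (value used in the text)

def tilt_horizontal_accel_g(tilt_deg):
    """Head-horizontal component of gravity (in G) at a given tilt angle."""
    return math.sin(math.radians(tilt_deg))

def sinusoid_displacement_cm(peak_accel_g, freq_hz):
    """Peak displacement of a sinusoidal translation with the given peak acceleration."""
    omega = 2.0 * math.pi * freq_hz          # angular frequency in rad/s
    return peak_accel_g * G_CM_S2 / omega ** 2

accel = tilt_horizontal_accel_g(11.5)         # close to 0.2 G
disp = sinusoid_displacement_cm(0.2, 0.5)     # close to 20 cm

# Net horizontal acceleration when tilt and translation add or cancel.
net_plus = accel + 0.2    # tilt+translation: roughly doubled
net_minus = accel - 0.2   # tilt-translation: nearly zero

print(f"{accel:.3f} G, {disp:.1f} cm, {net_plus:.2f} G, {net_minus:.3f} G")
```

An 11.5° tilt yields a horizontal gravity component just under 0.2 G, and a 0.5 Hz sinusoid with ±0.2 G peak acceleration corresponds to a displacement of about ±20 cm, consistent with the values quoted above.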
In addition to pitch and roll tilts (which activate both otolith and semicircular canal afferents), neurons were also tested during rotations (0.5 Hz, ±10°) about an earth-vertical axis (EVR). Such rotations do not change head orientation relative to gravity, thus they activate exclusively semicircular canal (but not otolith) afferents. First, yaw (left-right) rotation was delivered with the animals seated upright. Next, to test vertical canal activation during EVR, the same motion was also delivered with the animal statically tilted: pitched 45° nose-up/down (stimulating a plane half-way between yaw and roll) and/or rolled 45° right/left ear-down (stimulating a plane half-way between yaw and pitch).
These data, collected during EVR (thus activating only the semicircular canals), were then used to compare with pitch/roll tilt responses (which activate both otolith organs and semicircular canals) and quantitatively test the hypothesis that rotation responses in MSTd do not depend on spatial orientation relative to gravity. These comparisons represent the best way to test the canal- versus otolith-driven origin of the MSTd tilt responses (see Results): if earth-horizontal (tilt) and earth-vertical axis responses are identical, then they are gravity-independent and likely originate from the semicircular canals. On the other hand, if they are not identical, then we conclude that otolith-driven signals also contribute to the pitch/roll tilt modulation of MSTd cells. Note that we have used this experimental protocol previously in vestibular nuclei neurons: we found no correlation between the two, allowing us to conclude that some of the tilt responses arise from activation of the otolith organs (Dickman and Angelaki, 2002).
It is important to emphasize that this comparison could not be done by measuring MSTd activity during static tilt; the reason is that otolith-driven central responses are often strongly frequency-dependent (see Fig. 6) (see also Dickman and Angelaki, 2002; Yakusheva et al., 2008; Shaikh et al., 2005a); what happens under static tilt conditions and at 0.5 Hz can be quite different. In particular, if we found no static tilt sensitivity, it would have been incorrect to conclude that there is no otolith-driven tilt response at 0.5 Hz (since we cannot eliminate high-pass tilt dynamics for these neurons). Similarly, if we had found static tilt sensitivity, we could not have concluded that there is an otolith-driven contribution to 0.5 Hz tilt responses (since we cannot eliminate low-pass tilt dynamics). In fact, the simple spike responses of nodulus/uvula Purkinje cells modulate strongly with static tilt, but not during 0.5 Hz tilt, and this difference is striking (Yakusheva et al., 2007, 2008).
Permutation analysis was used to determine whether cells modulated significantly to each sinusoidal stimulus, as follows. Firing rates were first binned (40 bins per cycle) and a Fourier ratio (FR) was defined as the amplitude of the fundamental frequency component over the maximum amplitude among the first 20 harmonics. Subsequently, the 40 response bins were shuffled randomly, thus destroying the systematic modulation in the data but maintaining the inherent variability of the responses. An FR was then computed from each randomly permuted histogram, and the randomization process was repeated 1000 times. If the FR for the original data exceeded that for 99% of the permuted data sets, we considered the temporal modulation to be statistically significant (p < 0.01).
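The permutation procedure can be sketched as follows. This is an illustrative implementation, not the authors' code: it assumes that the denominator of the Fourier ratio is the largest amplitude among harmonics 2 through 20 (the text does not state whether the fundamental itself is included), and the synthetic histogram is made up for demonstration.

```python
import math
import random

N_BINS = 40        # bins per stimulus cycle, as in the text
N_HARMONICS = 20   # number of harmonics considered, as in the text

# Precomputed cosine/sine tables for harmonics k = 1..20.
COS = {k: [math.cos(2 * math.pi * k * i / N_BINS) for i in range(N_BINS)]
       for k in range(1, N_HARMONICS + 1)}
SIN = {k: [math.sin(2 * math.pi * k * i / N_BINS) for i in range(N_BINS)]
       for k in range(1, N_HARMONICS + 1)}

def amplitude(bins, k):
    """Amplitude of the k-th Fourier component of a one-cycle histogram."""
    re = sum(x * c for x, c in zip(bins, COS[k]))
    im = sum(x * s for x, s in zip(bins, SIN[k]))
    return math.hypot(re, im)

def fourier_ratio(bins):
    """Fundamental amplitude over the largest higher-harmonic amplitude (assumed reading)."""
    higher = max(amplitude(bins, k) for k in range(2, N_HARMONICS + 1))
    return amplitude(bins, 1) / higher if higher > 0 else math.inf

def permutation_p_value(bins, n_perm=1000, seed=0):
    """Fraction of shuffled histograms whose FR is at least the observed FR."""
    rng = random.Random(seed)
    observed = fourier_ratio(bins)
    shuffled = list(bins)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)   # destroys temporal order, keeps bin values
        if fourier_ratio(shuffled) >= observed:
            count += 1
    return count / n_perm

# Synthetic example: sinusoidally modulated firing rate plus noise.
noise = random.Random(1)
modulated = [20 + 10 * math.sin(2 * math.pi * i / N_BINS) + noise.gauss(0, 2)
             for i in range(N_BINS)]
p_value = permutation_p_value(modulated)
print(p_value < 0.01)
```

Shuffling the bins preserves the distribution of firing rates while removing any systematic relationship to stimulus phase, which is why the shuffled FR values serve as a null distribution.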
Sinusoidal responses were further quantified using instantaneous firing rate (IFR) (computed as the inverse of interspike interval). First, IFRs from multiple cycles were folded into a single cycle by overlaying neural responses. Subsequently, amplitude and phase were determined by fitting a sine function (clipped off at zero response) to both response and stimulus using a nonlinear least-squares minimization algorithm (Levenberg-Marquardt). Response amplitude refers to half the peak-to-trough modulation. For rotational stimuli, neural response gain was computed as the ratio of response amplitude over peak head velocity (in units of spikes/s per °/s). For translational stimuli, neural gain was calculated as response amplitude divided by either peak acceleration (“acceleration” gains, in units of spikes/s per G; G = 981 cm/s²) or by peak velocity (“velocity” gains, in units of spikes/s per cm/s). Phase was expressed as the difference between peak response and peak velocity (rotation and tilt) or acceleration (translation).
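The gain definitions above can be illustrated with a simplified sketch. Note that this example substitutes a linear least-squares projection onto the fundamental for the clipped-sine Levenberg-Marquardt fit described in the text; the two agree only when the response does not clip at zero. The function names and the synthetic response are assumptions made for illustration.

```python
import math

G_CM_S2 = 981.0  # cm/s^2 per G, as defined in the text

def fit_sinusoid(cycle_rates):
    """Fit rate = dc + A*sin(2*pi*i/n + phase) to one folded, evenly binned cycle.

    Uses orthogonal projection onto sine and cosine, which equals the
    least-squares solution for evenly spaced bins spanning exactly one cycle.
    """
    n = len(cycle_rates)
    dc = sum(cycle_rates) / n
    a = 2.0 / n * sum(r * math.sin(2 * math.pi * i / n) for i, r in enumerate(cycle_rates))
    b = 2.0 / n * sum(r * math.cos(2 * math.pi * i / n) for i, r in enumerate(cycle_rates))
    amplitude = math.hypot(a, b)                 # half the peak-to-trough modulation
    phase_deg = math.degrees(math.atan2(b, a))   # phase relative to sin(2*pi*i/n)
    return dc, amplitude, phase_deg

def translation_gains(amplitude, peak_accel_g, freq_hz):
    """Return (acceleration gain in spikes/s per G, velocity gain in spikes/s per cm/s)."""
    omega = 2.0 * math.pi * freq_hz
    peak_velocity_cm_s = peak_accel_g * G_CM_S2 / omega
    return amplitude / peak_accel_g, amplitude / peak_velocity_cm_s

# Synthetic folded response: 30 spikes/s baseline, 20 spikes/s modulation, 45 deg lead.
rates = [30 + 20 * math.sin(2 * math.pi * i / 40 + math.radians(45)) for i in range(40)]
dc, amp, phase = fit_sinusoid(rates)
acc_gain, vel_gain = translation_gains(amp, peak_accel_g=0.2, freq_hz=0.5)
print(round(amp, 3), round(phase, 3), round(acc_gain, 3))
```

For a cell whose firing rate reaches zero during part of the cycle, the clipped-sine fit used in the paper would be needed to avoid underestimating the amplitude.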
Spatial tuning in the horizontal plane for translation and in the sagittal and frontal planes for rotation was quantified using a spatiotemporal cosine-like tuning model (Angelaki, 1991, 1992; Schor and Angelaki, 1992). The model has four parameters. Three of them characterize the cell's preferred stimulus: the preferred direction, and the response gain and phase for stimulation along the preferred direction. In traditional cosine-tuning, these three parameters are sufficient to characterize responses along any other direction. The spatiotemporal model has a fourth parameter, the response gain along a second response direction, which is always spatially and temporally orthogonal to the preferred direction. This fourth parameter is assumed to be zero for traditional cosine-tuning. The spatiotemporal model is therefore more general than the traditional cosine-tuning model; whereas the latter assumes zero response for perpendicular directions, the spatiotemporal model allows for nonzero response along the axis perpendicular to the preferred direction. The larger the magnitude of this perpendicular response relative to the preferred response, the larger is the departure from traditional cosine-tuning. In general, spatiotemporal tuning allows temporal dynamics and spatial properties to be intermingled, such that more than one temporal parameter (e.g., velocity and acceleration) can be simultaneously coded along different spatial directions. The spatiotemporal model was shown to characterize best the translation tuning of brainstem and cerebellar vestibular neurons (Bush et al., 1993; Angelaki and Dickman, 2000; Shaikh et al., 2005a; Chen-Huang and Peterson, 2006; Yakushin et al., 2006).
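One way to express the four-parameter model numerically is with complex phasors, as in the sketch below. This is an illustrative formulation under the stated assumption that the orthogonal response component leads the preferred-direction component by 90° in time; the parameter names are hypothetical and the sign of the temporal offset is a convention chosen here, not taken from the cited papers.

```python
import cmath
import math

def spatiotemporal_response(theta_deg, pref_dir_deg, gain_max, phase_deg, gain_min=0.0):
    """Gain and phase predicted for sinusoidal stimulation along direction theta.

    gain_max, phase_deg: response along the preferred direction.
    gain_min: response gain along the spatially orthogonal direction, assumed
              to be in temporal quadrature (90 deg) with the preferred response.
    gain_min = 0 reduces the model to traditional cosine tuning.
    """
    d = math.radians(theta_deg - pref_dir_deg)
    along = gain_max * math.cos(d) * cmath.exp(1j * math.radians(phase_deg))
    ortho = gain_min * math.sin(d) * cmath.exp(1j * math.radians(phase_deg + 90.0))
    r = along + ortho
    return abs(r), math.degrees(cmath.phase(r))

# A hypothetical cell with tuning ratio gain_min / gain_max = 0.3.
g_pref, _ = spatiotemporal_response(0.0, 0.0, 100.0, 0.0, 30.0)
g_perp, _ = spatiotemporal_response(90.0, 0.0, 100.0, 0.0, 30.0)
g_mid, _ = spatiotemporal_response(45.0, 0.0, 100.0, 0.0, 30.0)
g_cos_perp, _ = spatiotemporal_response(90.0, 0.0, 100.0, 0.0)  # cosine tuning
print(round(g_pref, 2), round(g_perp, 2), round(g_mid, 2), round(g_cos_perp, 2))
```

In this formulation the gain along any direction is sqrt((gain_max·cos d)² + (gain_min·sin d)²), so the response never falls to zero unless gain_min is zero, and the ratio gain_min / gain_max quantifies the departure from cosine tuning.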
To determine whether cell responses correlated best with translation or net acceleration, linear regression analysis was used to simultaneously fit cumulative cycles of cell modulation during each of the translation, tilt and combined stimuli using “net acceleration” and “translation”-coding models (for details, see Angelaki et al., 2004; Green et al., 2005). To determine how well each of the two models fitted the data, we computed partial correlation coefficients, which were normalized using Fisher's r-to-z transform.
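The partial correlation and Fisher transform steps can be sketched as below. This is a simplified illustration, not the published analysis: the model predictions are idealized one-cycle sinusoids concatenated across the four stimuli, the relative signs of the components in the combined stimuli are simplified, the "cell" is synthetic, and any scaling of the z-scores by sample size is omitted because it is not specified here.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(r_data_model, r_data_other, r_model_other):
    """Correlation of data with one model after removing the other model's contribution."""
    return (r_data_model - r_data_other * r_model_other) / math.sqrt(
        (1 - r_data_other ** 2) * (1 - r_model_other ** 2))

def fisher_z(r):
    """Fisher r-to-z transform (variance-stabilizing)."""
    return math.atanh(r)

# Idealized predictions, concatenated in the order:
# translation, tilt, tilt-translation, tilt+translation.
s = [math.sin(2 * math.pi * i / 40) for i in range(40)]
zero = [0.0] * 40
translation_model = s + zero + s + s
net_accel_model = s + s + zero + [2 * v for v in s]

# Synthetic translation-coding cell with additive noise.
rng = random.Random(1)
data = [30 + 20 * t + rng.gauss(0, 5) for t in translation_model]

r_dt = pearson(data, translation_model)
r_da = pearson(data, net_accel_model)
r_ta = pearson(translation_model, net_accel_model)

z_translation = fisher_z(partial_corr(r_dt, r_da, r_ta))
z_net_accel = fisher_z(partial_corr(r_da, r_dt, r_ta))
print(z_translation > z_net_accel)
```

Because the two models are themselves correlated, simple correlation coefficients would overstate the support for the weaker model; partial correlation removes the shared component before the comparison.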
Results
More than half of the recorded MSTd neurons (92 of 175, 53%) had a significant response modulation (permutation test, p < 0.01; see Materials and Methods) during motion in darkness (Table 1). These motions included yaw (left-right rotation), pitch (nose up-down rotation) and roll (left-right ear-down rotation), as well as lateral and forward/backward translation. Vestibular neurons included approximately equal percentages of translation- and rotation-responding cells (42% and 39% of all recorded cells, respectively; see Table 1). Given that vertical motion-preferring neurons were not identified here, the percentage of translation-responding neurons in total darkness reported here (42%) is slightly lower than the percentage of MSTd neurons (54–64%) tuned to three-dimensional (3D) translation (Gu et al., 2006, 2007; Takahashi et al., 2007). The percentage of rotation-responding neurons reported here (39%) is comparable to that of neurons tuned to 3D rotation in darkness [note that during fixation the percentage of responding cells is higher because of a residual rotational vestibulo-ocular reflex causing retinal slip and thus evoking visual responses in many MSTd neurons during fixation (Takahashi et al., 2007; Chowdhury et al., 2008)]. Most responsive cells (49 of 92; 28% of all 175 recorded cells) were “convergent,” i.e., they responded during both translation and rotation. Nonconvergent neurons were less frequent; 14% (24 of 175) of MSTd cells were exclusively sensitive to translation and 11% (19 of 175) modulated exclusively during rotation.
Spatial tuning in the horizontal plane for translation and in the sagittal and frontal planes for rotation was quantified using a spatiotemporal cosine-like tuning model (Angelaki, 1991, 1992; Schor and Angelaki, 1992) (see Materials and Methods). Preferred directions were broadly and uniformly distributed within the horizontal plane (uniformity test, p = 0.75) (Fig. 1 A), with gains averaging 269 ± 22.7 spikes/s/G at 0.5 Hz (range 39–1036). There was no difference in either gain or preferred direction for convergent versus nonconvergent cells (Wilcoxon test, gain: p = 0.11; preferred direction: p = 0.76) (Fig. 1 A, filled vs open symbols, respectively). Tuning ratios (i.e., the ratio of the minimum over maximum response gain) were unimodally distributed (modality test, p_uni = 0.4) (Fig. 1 B). The majority of MSTd cells had tuning ratios close to zero, suggesting traditional cosine-tuning. A notable proportion (44%, 32 of 73) of MSTd neurons, however, exhibited spatiotemporal properties, with response gains along a perpendicular direction that were larger than 20% of those along the preferred direction (tuning ratio >0.2). The distribution of neuronal phase was uniform (uniformity test, p = 0.3) (Fig. 1 C), as is typical of responses in other vestibular areas (Angelaki and Dickman, 2000; Shaikh et al., 2005a).
Lesion experiments have shown that the responses of MSTd neurons during self-motion in darkness are of labyrinthine origin (Gu et al., 2007; Takahashi et al., 2007). Specifically, translation responses arise from activation of the otolith organs and yaw rotation responses arise from activation of the semicircular canals. Pitch and roll modulation, however, can arise from activation of either the otolith organs or the semicircular canals. This occurs because otolith afferents are sensitive to net linear acceleration. Pitch and roll rotations (referred to here as “tilt”) change the orientation of the head relative to gravity, thus providing an effective stimulus that activates both otolith organs and vertical semicircular canals. The origin of tilt modulation for MSTd neurons (Table 1) is crucial for their proposed role in heading perception. On the one hand, tilt responses might be of semicircular canal origin, thus reflecting a gravity-independent rotation signal that can be used for the processing of optic flow (see Discussion). On the other hand, pitch/roll responses can also be otolith-driven, reflecting the sensitivity of the peripheral otolith sensors to gravitational acceleration. Otolith-driven pitch/roll responses, caused by changes in orientation relative to gravity, would be inappropriate for driving heading perception, because every head tilt would then be perceived as self-motion. Here we test whether the pitch/roll modulation of MSTd neurons arises from gravity-responsive, otolith-driven signals or spatial orientation-independent, canal-driven signals. First, we use translation and tilt, as well as combinations of translation and tilt, to show that MSTd responses correlate best with translation and not net acceleration. Next, we use rotations about different axes to show that rotational responses in MSTd are independent of head orientation relative to gravity.
MSTd neurons correlate better with translation than with net linear acceleration
To investigate whether MSTd neurons selectively encode true heading information or, like otolith afferents, they also modulate in response to gravitational acceleration, MSTd neurons were tested during translation, tilt and combination stimuli (Angelaki et al., 2004), as shown in the top schematics of Figure 2. Because peak tilt amplitude is such that the horizontal linear acceleration caused by gravity is the same as that during translation (see Materials and Methods), when both translation and tilt are presented together, the net horizontal acceleration is either zeroed (tilt-translation) or doubled (0.4 G, tilt+translation, see Materials and Methods).
Representative responses from a typical MSTd cell during lateral translation/roll tilt (θ = 0°) and forward/backward translation/pitch tilt (θ = 90°) are shown in Figure 2. Although net acceleration was the same during translation and tilt (traces on bottom), most heading-sensitive MSTd neurons modulated more strongly during translation than during tilt (Fig. 2 A,B, compare peak-to-trough sinusoidal modulation of firing rate). When translation and tilt are presented simultaneously, such that net horizontal linear acceleration is either zero (tilt-translation, Fig. 2 C) or double (tilt+translation, Fig. 2 D), MSTd neuron responses appear similar to those during translation.
Data from all heading-sensitive MSTd neurons are summarized in Figure 3. Across the population, responses during tilt are significantly attenuated compared with those during translation (Wilcoxon rank test, p ≪ 0.001) (Fig. 3 A). In addition, as expected from neurons that selectively encode translation and ignore changes in orientation relative to gravity (solid red lines), responses during tilt-translation and tilt+translation are similar to those during translation (Fig. 3, B and C, respectively; Wilcoxon rank test, p ≫ 0.05). Correlation slopes are not significantly different from unity (tilt-translation vs translation: 0.96, 95% confidence interval: [0.84, 1.1], r = 0.88, p ≪ 0.001; tilt+translation vs translation: 0.93, 95% confidence interval: [0.81, 1.06], r = 0.88, p ≪ 0.001). In contrast, for both combination stimuli, responses are inconsistent with the predictions of net linear acceleration (Fig. 3 B,C, dashed blue lines). This is most striking during tilt-translation, where net linear acceleration is zero; data points fall along the diagonal axis and not along the abscissa (Fig. 3 B). Thus, during combined stimulation the tilt movement is ignored and the cells modulate exclusively to the translational component of the motion. In fact, the translation and tilt responses of MSTd neurons sum and subtract linearly to generate the responses to combination stimuli, tilt+translation and tilt-translation (Fig. 4). Briefly, for each cell, we computed the vectorial sum and difference between the translation and tilt response and compared the computed gain and phase with those measured during tilt+translation and tilt-translation, respectively. There was no significant difference between either gain or phase of actually measured and predicted tilt+translation and tilt-translation responses (Wilcoxon rank test, p > 0.05). We conclude that the stimuli we used operate within the linear range of MSTd cells.
Note that these observations are true for both nonconvergent and convergent cell types; i.e., those that only modulate in response to translation and those with significant modulation during both rotation and translation (Figs. 3 and 4, open vs filled symbols).
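The vectorial sum and difference used for the linearity test can be written compactly with complex phasors. The sketch below is illustrative only: it assumes that the translation and tilt responses are expressed relative to a common stimulus reference and that the difference is taken as translation minus tilt; both conventions, and all numerical values, are assumptions.

```python
import cmath
import math

def phasor(gain, phase_deg):
    """Represent a sinusoidal response (gain, phase) as a complex number."""
    return cmath.rect(gain, math.radians(phase_deg))

def gain_phase(z):
    """Convert a complex phasor back to (gain, phase in degrees)."""
    return abs(z), math.degrees(cmath.phase(z))

def predict_combined(translation, tilt):
    """Predict responses to the combined stimuli under linear superposition.

    translation, tilt: (gain, phase_deg) tuples measured separately.
    Returns ((gain, phase) for the sum, (gain, phase) for the difference).
    """
    t, r = phasor(*translation), phasor(*tilt)
    return gain_phase(t + r), gain_phase(t - r)

# Hypothetical translation-coding cell: no response to tilt.
plus_tc, minus_tc = predict_combined((100.0, 30.0), (0.0, 0.0))

# Hypothetical net-acceleration-coding cell: identical tilt and translation responses.
plus_na, minus_na = predict_combined((100.0, 30.0), (100.0, 30.0))

print(plus_tc, minus_tc, plus_na[0], round(minus_na[0], 6))
```

For the translation-coding example both predictions equal the translation response, whereas for the net-acceleration example the sum doubles and the difference vanishes, mirroring the two sets of predictions contrasted in Figure 3.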
The finding that heading responses in MSTd reflect true self-motion sensitivity that is independent of gravity is further corroborated by examining response phase (Fig. 3 D–F). Note that tilt+translation and tilt-translation stimuli differ in the relative phase of the translation and tilt (Fig. 2 C,D, bottom traces marked translation and tilt) and that response phase here has been expressed relative to tilt (which is the same for both stimuli). Thus, if indeed the modulation of MSTd neurons reflects selective coding of translation rather than net acceleration, neuronal phase during tilt-translation and translation should be the same; thus, data should fall along the unity-slope, red line (Fig. 3 E). But neuronal phase during tilt+translation should be opposite (i.e., different by 180°) to that during translation (Fig. 3 F, solid red lines). In contrast, if neurons respond to net acceleration, tilt+translation phase should be the same as that during translation (Fig. 3 F, dashed blue lines). Data are clustered around the predictions for coding translation and not net acceleration (Fig. 3 E,F).
To quantify these observations, multiple linear regression analysis was used to compute partial correlation coefficients of how well each neuron's response to translation, tilt, tilt-translation and tilt+translation could be predicted by net acceleration (Fig. 2, bottom traces) or translation-coding models (Fig. 2, third row from bottom). To simplify plotting and visual interpretation, Fisher's r-to-z transform was used to normalize the variances of partial correlation coefficients (Angelaki et al., 2004). Figure 5 shows a scatter plot of the z-transformed partial correlation coefficients, where dotted lines mark the 0.01 level of significance. Most MSTd neurons (84%, 57 of 68) fall in the upper-left quadrant, illustrating that their firing rates are better correlated with coding of translation. Only 2.9% (2 of 68) are better correlated with net acceleration (Fig. 5, lower-right quadrant). This distribution of partial correlation coefficients is not different from that in the nodulus/uvula (MANOVA, p = 0.1) (Yakusheva et al., 2007), but differs from the medial vestibular nuclei (MANOVA, p < 0.001) (Angelaki et al., 2004) and ventrolateral thalamus (MANOVA, p < 0.001) (Meng et al., 2007), where data spanned the whole range and many neurons had net acceleration-like properties.
These results allow us to conclude that translation responses in MSTd encode true heading information and not net linear acceleration; that is, MSTd neurons are not sensitive to the component of otolith activation that results from changes in spatial orientation relative to gravity. Importantly, although several MSTd neurons modulate during pitch and roll tilt (see Table 1 and Fig. 3 A), the analysis of combinations of tilt and translation in Figure 5 does not support the hypothesis that these responses are otolith-driven. As will be shown next, a complementary analysis focusing on rotation responses further demonstrates that (1) MSTd neurons ignore the otolith activation during tilt; and (2) their tilt (rotation) responses reflect gravity-independent signals, likely arising from the semicircular canals. But before we present that analysis, we first describe the frequency dynamics of heading responses of MSTd neurons. Note that, other than a qualitative description of the population PSTH along the preferred direction (Gu et al., 2006), MSTd response dynamics to translation have not been previously quantified.
Translation at different frequencies was used to address whether heading responses are acceleration-like, similar to otolith afferents, or velocity-like, similar to visual responses to optic flow stimulation. We found that translation responses are largest at low frequencies, reaching 442 ± 49 spikes/s/G at 0.3 Hz, and decrease with increasing frequency, as illustrated with a typical example in Figure 6 A. Across the population, acceleration gains (i.e., ratio of peak response over peak acceleration) decrease with increasing frequency (Fig. 6 B; ANCOVA, p < 0.001, slope: −0.57). Response phase was independent of frequency (Fig. 6 C; ANCOVA, p > 0.05) [note that MSTd cells exhibit nonminimum phase characteristics, i.e., phase does not follow similar dependence on frequency as gain; this is typical of all central translation responses (Angelaki and Dickman, 2000; Dickman and Angelaki, 2002; Shaikh et al., 2005a; Yakusheva et al., 2008)]. The decreasing acceleration gain versus frequency plot implies that MSTd neurons, unlike otolith afferents, do not encode acceleration (otherwise acceleration gains would be flat and independent of frequency). A negative unity slope (when plotted in a log-log manner as in Fig. 6 B) would indicate that MSTd neurons encode linear velocity. Thus, the shallower-than-unity negative slope in Figure 6 B suggests that MSTd neurons encode combinations of velocity and acceleration. This is further shown in Figure 6 D, which plots mean (±SEM) acceleration (filled black circles) and mean velocity gains (filled gray squares). As expected, acceleration gains decrease with frequency, but velocity gains increase with frequency (ANCOVA, p < 0.001, slope: 0.43).
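The relationship between the two log-log slopes follows directly from the fact that, for sinusoidal motion, peak acceleration equals peak velocity multiplied by 2πf. A minimal sketch of this arithmetic is given below, using synthetic acceleration gains that follow a power law with exponent −0.57 (the reported population slope); the individual gain values are made up and do not reproduce the data in Figure 6.

```python
import math

G_CM_S2 = 981.0  # cm/s^2 per G

def loglog_slope(freqs, gains):
    """Least-squares slope of log10(gain) versus log10(frequency)."""
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(g) for g in gains]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def velocity_gain(accel_gain, freq_hz):
    """Convert spikes/s per G into spikes/s per cm/s for sinusoidal motion."""
    omega = 2.0 * math.pi * freq_hz
    return accel_gain * omega / G_CM_S2

freqs = [0.3, 0.5, 1.0, 2.0]                        # stimulus frequencies in Hz
accel_gains = [300.0 * f ** -0.57 for f in freqs]   # synthetic power-law gains
vel_gains = [velocity_gain(g, f) for g, f in zip(accel_gains, freqs)]

acc_slope = loglog_slope(freqs, accel_gains)
vel_slope = loglog_slope(freqs, vel_gains)
print(round(acc_slope, 2), round(vel_slope, 2))
```

Multiplying by frequency adds exactly 1 to the log-log slope, so an acceleration-gain slope of −0.57 corresponds to a velocity-gain slope of 0.43. A pure acceleration coder would give slopes of 0 and 1, and a pure velocity coder slopes of −1 and 0.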
The MSTd response dynamics differ from those in the vestibular nuclei (Angelaki and Dickman, 2000; Dickman and Angelaki, 2002) and fastigial nuclei (Shaikh et al., 2005a), where a mixture of increasing, decreasing and flat acceleration gains and strong phase dependence on frequency have been reported. However, both mean gain and its frequency dependence are statistically indistinguishable from those of nodulus/uvula Purkinje cells (Yakusheva et al., 2008) (Fig. 6 D, open symbols and dashed lines) (ANCOVA, main effect comparing MSTd and Purkinje cell data: F (1,245) = 1.9, p = 0.17; interaction: F (3,240) = 0.5, p = 0.7). Velocity-like responses have been suggested based on population PSTHs (Gu et al., 2006), but here we have quantified that MSTd heading responses carry combinations of velocity and acceleration signals.
MSTd rotation responses are independent of spatial orientation relative to gravity
Another way to test whether MSTd neurons indeed ignore the otolith activation during tilt is by showing that their modulation during rotation is independent of spatial orientation relative to gravity. The rationale is as follows. Rotation responses from the semicircular canals are independent of how the head is oriented relative to gravity. In contrast, rotation (i.e., tilt) responses of otolith afferents depend on how the head moves relative to gravity. Thus, our goal is to test whether rotational responses in MSTd depend on how the head is oriented relative to gravity. That is, for the same rotation plane relative to the head (i.e., pitch or roll), we could compare individual MSTd cell activity in response to tilt (i.e., earth-horizontal axis rotations that change head orientation relative to gravity) and EVR (i.e., earth-vertical axis rotations that do not change head orientation relative to gravity). If rotation modulation in MSTd arises exclusively from activation of the semicircular canals, we expect that tilt and EVR responses would be identical. Alternatively, if pitch and roll MSTd responses (Table 1) arise at least partially from otolith activation, tilt and EVR responses would be expected to differ (since the latter only activates semicircular canals, whereas the former incorporates both otolith and semicircular canal signals).
How MSTd cells respond during pitch and roll tilt has already been described (Table 1 and Fig. 3 A). But how do we test for EVR pitch and roll responses? With the animal upright, this is impossible. To characterize EVR pitch/roll responses, macaques must be statically positioned 90° ear-down (to test pitch) and supine or prone (to test roll; see schematics in Fig. 7 A, top). In practice, 90° repositioning is difficult; to avoid the risk of losing cell isolation, we recorded MSTd neuronal activity during EVR with the animals repositioned up to ±45° from upright, thus testing planes halfway between yaw and pitch/roll (Fig. 7 A, top schematics). We then used both the yaw and yaw+roll or yaw+pitch EVR modulation to construct spatial tuning curves using the spatiotemporal cosine-like tuning model (see Materials and Methods) and computed the corresponding EVR pitch and roll modulation of the cell.
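The logic of this reconstruction can be illustrated with a minimal sketch. It assumes a purely linear, cosine-tuned cell in which each response is a complex phasor (gain and phase) and the response to rotation about any axis in a plane is the projection of a complex tuning vector onto that axis. The function names, the sign convention for the ±45° positions, and the example numbers are hypothetical; the model actually fitted is the one described in Materials and Methods.

```python
import cmath
import math

def phasor(gain, phase_deg):
    """Complex phasor from a response gain and phase (degrees)."""
    return cmath.rect(gain, math.radians(phase_deg))

def fit_tuning_vector(r_yaw, r_plus45, r_minus45):
    """Least-squares estimate of the (yaw, orthogonal) tuning components.

    Assumed model: r(theta) = a_yaw*cos(theta) + a_orth*sin(theta), where
    theta is the angle of the rotation axis away from yaw, toward the
    pitch or roll axis. Inputs are phasors measured at 0, +45 and -45 deg.
    """
    c = math.cos(math.radians(45.0))
    # Closed-form normal equations for the three measurement angles.
    a_yaw = (r_yaw + c * (r_plus45 + r_minus45)) / 2.0
    a_orth = c * (r_plus45 - r_minus45)
    return a_yaw, a_orth

def gain_phase(z):
    """Gain and phase (degrees) of a phasor."""
    return abs(z), math.degrees(cmath.phase(z))

if __name__ == "__main__":
    # Hypothetical measured EVR responses (gain, phase in degrees).
    r_yaw = phasor(1.0, 30.0)
    r_p45 = phasor(0.9, 10.0)
    r_m45 = phasor(0.8, 55.0)
    _, a_orth = fit_tuning_vector(r_yaw, r_p45, r_m45)
    print("predicted EVR pitch/roll gain and phase:", gain_phase(a_orth))
```

The predicted orthogonal component is what would then be compared with the gain and phase measured during pitch or roll tilt.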
Responses from a typical MSTd cell during EVR stimulation with the animal positioned not only upright (yaw rotation), but also pitched 45° nose-up/down (resulting in a combination of yaw+roll rotations) and rolled 45° right/left ear-down (resulting in a combination of yaw+pitch rotations) are illustrated in Figure 7 A. Figure 7, B and C, shows the distribution of preferred rotational directions in each of the sagittal and frontal planes, respectively. Each symbol in the plots corresponds to a convergent neuron (i.e., a cell modulating significantly during both rotation and translation; filled symbols) or nonconvergent neuron (i.e., a cell modulating only during rotation; open symbols), with the distance from the origin corresponding to its gain. From these plots (and corresponding tuning curves), we then calculated the predicted response to EVR pitch and roll rotations. Figure 7, D and E, compares the predicted EVR pitch/roll gain and phase with those measured during pitch/roll tilt for 14 MSTd cells. Neither parameter differed significantly between the two conditions (Wilcoxon rank test, gain: p = 0.77; phase: p = 0.82). The fact that the data fall along the diagonal suggests that tilt and EVR responses are not encoded differently by MSTd neurons. Given that the two stimuli differ only in terms of spatial orientation relative to gravity, we conclude that the rotational responses of MSTd cells reflect a spatial orientation-independent (presumably canal-driven) signal. Like the tilt/translation analysis (Fig. 5), this property of MSTd neurons also differs from brainstem responses: unlike in MSTd (Fig. 7), EVR and tilt responses in convergent vestibular nuclei neurons are not identical, reflecting the fact that a component of the tilt response is otolith-driven (Dickman and Angelaki, 2002).
We have used traditional vestibular stimulation to characterize how neurons in an extrastriate visual cortical area, which is believed to be functionally linked to heading perception (Britten and van Wezel, 1998; Gu et al., 2007, 2008), respond during self-motion in darkness and how they compare with those in the brainstem and cerebellar cortex. Two response properties that are particularly relevant to multisensory heading perception were explored here. First, we show that MSTd neurons code combinations of heading velocity and acceleration. Responses closer to velocity make the vestibular modulation of MSTd neurons more similar to, and likely more compatible with, the velocity-like responses to optic flow (Rodman and Albright, 1987; Lisberger and Movshon, 1999; Gu et al., 2006). Second, we show that MSTd vestibular responses are transformed from otolith afferent signals such that they are selective for the motions experienced during navigation (heading) and do not represent generic responses to net linear acceleration. In particular, we have shown that MSTd neurons do not respond to changes in spatial orientation relative to gravity, although they do carry an independent, presumably canal-driven, rotation signal (see below). Such selective coding of heading, rather than net linear acceleration, is appropriate for the proposed role of MSTd in visual/vestibular multisensory cue integration for self-motion perception.
This finding contrasts with a broader representation of both heading-specific and net linear acceleration signals in brainstem and thalamic nuclei (Angelaki et al., 2004; Meng et al., 2007). MSTd heading responses are instead similar to those of vermal Purkinje cells (Yakusheva et al., 2007). Interestingly, the two populations are also similar in other respects, including modulation amplitude and temporal properties (Fig. 4 D). One difference between MSTd and cerebellar cortex responses is that the latter do not modulate during earth-vertical axis rotations (Yakusheva et al., 2007, 2008). In contrast, MSTd neurons do: unlike vermal Purkinje cells, a little less than half of MSTd neurons also carry a heading-independent rotation signal (see also Takahashi et al., 2007). We have shown here that rotation signals in MSTd are independent of spatial orientation relative to gravity and thus likely reflect semicircular canal-driven (and not otolith-driven, tilt-related) signals.
Vestibular translation signals in MSTd are thought to be functionally linked to multisensory integration for heading perception (Britten and van Wezel, 1998, 2002; Gu et al., 2007, 2008). What, then, might be the function of rotation responses? One obvious role would be to disambiguate optic flow that is produced by self-translation from that produced by eye/head/body rotation (Royden et al., 1992; Banks et al., 1996; Crowell et al., 1998). Physiological studies have shown that MSTd neurons can signal heading from optic flow in the presence of pursuit eye movements (Bradley et al., 1996; Page and Duffy, 1999; Shenoy et al., 1999). By analogy, vestibular rotation signals in MSTd may be involved in compensating for the effects of head rotations on the processing of optic flow (see Takahashi et al., 2007, where this hypothesis was introduced). The different functional roles of rotation and translation signals in MSTd are consistent with the fact that mainly translational components of optic flow are used for navigation; rotational optic flow is typically nulled by a compensatory VOR (Angelaki and Hess, 2005).
An important, and perhaps surprising, finding of the present study is the remarkable similarity between the heading properties of MSTd neurons and those of Purkinje cells in vermal lobules 9 and 10 (uvula and nodulus; Yakusheva et al., 2007, 2008). If MSTd and nodulus/uvula are somehow interconnected (and in later sections we postulate they are), this could only be through polysynaptic pathways. Let us first consider the efferents of the nodulus/uvula. Purkinje cells from the nodulus/uvula inhibit neurons in the vestibular and rostral fastigial nuclei (Shojaku et al., 1987; Wylie et al., 1994; Voogd et al., 1996; Fushiki and Barmack, 1997). Vestibular and cerebellar nuclei then project to the ventral lateral and ventral posterior lateral nuclei of the thalamus (Lang et al., 1979; Asanuma et al., 1983; Meng et al., 2001, 2007; Marlinski and McCrea, 2009), although it is presently unclear whether thalamus-projecting neurons are also nodulus/uvula-target cells. The cortical projections of vestibular-responsive cells in the thalamus are also unknown, yet it is unlikely that their targets include MSTd, since thalamic inputs to MSTd appear limited to the inferior/medial pulvinar (Boussaoud et al., 1992; Kaas and Lyon, 2007), areas that do not modulate during vestibular stimulation (Meng and Angelaki, 2008).
Thus, most likely, vestibular signals reach MSTd after a minimum of four synapses, through corticocortical pathways, potentially involving the frontal eye fields (Ebata et al., 2004) and parieto-insular vestibular cortex (PIVC) (Guldin et al., 1992). Shorter-latency connectivity between MSTd and the cerebellum is more likely when considering the afferents to the nodulus/uvula. MSTd has strong projections to the pretectum and pontine nuclei (Boussaoud et al., 1992; Distler et al., 2002), which give rise to both mossy and climbing fibers (the latter through the dorsal cap and ventrolateral outgrowth of the principal olive) (Voogd et al., 1996; Barmack, 2006). As an alternative to the possibility that responses in one area are driven (indirectly) by those in the other, it is equally possible that similar computations are performed independently in different parts of the brain. At present, our findings cannot distinguish between these alternatives.
Based on transneuronal tracing methods involving mainly the cerebellar hemispheres and dentate nucleus, Strick and colleagues (Middleton and Strick, 2001; Dum and Strick, 2003; Kelly and Strick, 2003) have shown that closed-loop circuits might represent a fundamental architectural feature of cerebro-cerebellar interactions. Although not yet verified experimentally, the underlying assumption behind such closed-loop anatomical connections is similarity in physiological properties and underlying function. The similarity in response properties between MSTd and nodulus/uvula might represent a physiological signature of a yet-unidentified interconnectivity and linked function. Although this idea remains merely a hypothesis at present, the current findings provide strong motivation to search for such multisynaptic connectivity. We hypothesize that there is a functional link between MSTd (and perhaps other extrastriate cortical areas with a role in heading perception; e.g., VIP) (for review, see Britten, 2008) and nodulus/uvula, which could be mediated by closed-loop anatomical circuits, an emerging architecture of cerebro-cerebellar interactions (Strick et al., 2009).
Unlike the similarity in heading properties with the nodulus/uvula, MSTd responses differ from those in the thalamus and brainstem/cerebellar nuclei projecting to the thalamus (Angelaki and Dickman, 2000; Dickman and Angelaki, 2002; Musallam and Tomlinson, 2002; Shaikh et al., 2005a; Meng et al., 2007; for review, see Angelaki and Cullen, 2008). Neurons in these subcortical vestibular areas carry mixtures of translation and net acceleration signals and exhibit a wide variety of response dynamics with acceleration gains and phases that remain flat, increase or decrease as a function of frequency. Given that vestibular signals reach the cortex through vestibular, deep cerebellar nuclei and thalamus projections, such differences in response properties are surprising. Does the encoding of heading-related signals really change in these intermediate structures, or is it simply the case that the signals in MSTd and uvula/nodulus are a subset of what is found in the vestibular/cerebellar/thalamic nuclei? No data are currently available to distinguish among these alternatives.
The present findings support the postulated role of MSTd in multisensory heading perception. But what is the functional role of the nodulus/uvula? Are these areas also involved in self-motion perception? The cerebellum has long been implicated in movement adaptation and internal models (Shidara et al., 1993; Wolpert et al., 1995; Glasauer, 2003; Green et al., 2007; Ghasia et al., 2008). In particular, the cerebellum is thought to construct “forward” models, whose function is to predict the consequences of the motor command on behavior, a signal critical to refining the motor command by computing an error between predicted and desired action. Although the concept of internal models has been particularly influential for motor control (Wolpert and Miall, 1996; Kawato and Wolpert, 1998; Kawato, 1999; Hwang and Shadmehr, 2005; Ito, 2005), it is also becoming increasingly popular for spatiotemporal sensory processing for multisensory perception (Merfeld et al., 1999; Angelaki et al., 2004; Zupan et al., 2004; Glasauer et al., 2007). The fact that the nodulus/uvula encodes true heading information, like cortical areas that presumably mediate perception (Gu et al., 2007, 2008), is consistent with the notion that the cerebellum maintains internal models of the sensory world and how it is encoded by the brain (i.e., an internal model of heading perception). Such an internal model of sensory perception, similar to a forward model in motor control, could be fundamental to perceptual learning (Gilbert et al., 2001; Tsodyks and Gilbert, 2004) and to the construction/implementation of Bayesian priors (Jacobs, 1999; Knill and Pouget, 2004; Stocker and Simoncelli, 2006).
Experiments were supported by National Institutes of Health Grants EY017866 and EY019087. We thank Tatyana Yakusheva, Yong Gu, David Dickman, and Greg DeAngelis for collegial contributions to this work. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Eye Institute or the National Institutes of Health.
Correspondence should be addressed to Dr. Dora E. Angelaki, Department of Neurobiology, Box 8108, Washington University School of Medicine, 660 South Euclid Avenue, St. Louis MO 63110.