Abstract
Earlier studies of neurons in the anterior region of the superior temporal polysensory area (STPa) have demonstrated selectivity for visual motion using stimuli contaminated by nonmotion cues, including texture, luminance, and form. The present experiments investigated the motion selectivity of neurons in STPa in the absence of form cues using random dot optic flow displays. The responses of neurons were tested with translation, rotation, radial, and spiral optic flow displays designed to mimic the types of motion that occur during locomotion. Over half of the neurons tested responded significantly to at least one of these displays. On a cell by cell basis, 60% of the neurons tested responded selectively to rotation, radial, and spiral motion, whereas 20% responded selectively to translation motion. The majority of neurons responded maximally to single-component optic flow displays but were also significantly activated by the spiral displays that contained their preferred component. Moreover, there was a bias in the selectivity of the neurons for radial expansion motion. These results suggest that neurons within STPa are contributing to the analysis of optic flow. Furthermore, the preponderance of cells selective for radial expansion provides evidence that this area may be specifically involved in the processing of forward locomotion and/or looming stimuli. Finally, these results provide carefully controlled physiological evidence for an extension and specialization of the motion-processing pathway into the anterior temporal lobe.
Optic flow fields are generated across the retina as an observer moves through the environment, providing effective cues regarding both the heading of the observer and the structure of the environment (Gibson, 1950; Koenderink and Van Doorn, 1981). Multiple cortical regions are involved in the analysis of motion. Neurons in the middle temporal area (MT/V5) respond to motion in a single direction within a small area of the visual field but do not show selectivity for complex motion patterns (Allman et al., 1973; Zeki, 1974; Van Essen et al., 1981; Maunsell and Van Essen, 1983a; Albright, 1984). Neurons in the dorsal division of the medial superior temporal area (MSTd) receive projections from area MT (Maunsell and Van Essen, 1983b; Ungerleider and Desimone, 1986; Boussaoud et al., 1990) and respond selectively to complex patterns of optic flow, including rotation, expansion, and spiral motion (Saito et al., 1986; Tanaka et al., 1986, 1989; Tanaka and Saito, 1989; Duffy and Wurtz, 1991a,b; Orban et al., 1992; Graziano et al., 1994). MST projects to area 7a, the lateral intraparietal area (LIP), and the ventral intraparietal area (VIP) in the parietal cortex, and to the anterior superior temporal polysensory area (STPa) in the temporal cortex (Andersen et al., 1990; Boussaoud et al., 1990; Baizer et al., 1991), all of which have some motion-processing capability (Oram et al., 1993; Schaafsma and Duysens, 1996; Shadlen and Newsome, 1996; Siegel and Read, 1997a). On the basis of these divergent projections from MST, it has been suggested that the dorsal visual pathway can be further divided into two substreams. Areas involved in the analysis of spatial relationships and goal-directed functions form a pathway to the parietal lobe (Siegel and Read, 1997b; Andersen et al., 1997), whereas projections that are directed toward the temporal lobe may constitute a separate pathway for motion analysis (Boussaoud et al., 1990; Morel and Bullier, 1990; Baizer et al., 1991).
Initial electrophysiological studies of STPa used hand-manipulated objects to demonstrate that its cells have large, bilateral receptive fields and respond selectively to translation and radial motion and movement in depth (Bruce et al., 1981; Baylis et al., 1987; Oram et al., 1993; Rodman et al., 1993). In addition, some neurons in STPa are reported to be selective for biological motion (Perrett et al., 1985; Oram and Perrett, 1994). However, these studies confounded the stimulus parameters of form and motion, leaving it unclear whether STPa contributes directly to the analysis of optic flow.
To determine whether STPa is involved in the analysis of self-motion, the responses of neurons were tested using controlled optic flow stimuli in monkeys trained to respond to these stimuli. Many neurons responded selectively to optic flow. Most STPa neurons fired maximally for single-component optic flow rather than for combinations of components, with a bias for radial expansion. Thus, STPa may be an area in the anterior temporal lobe that is specialized for the processing of forward self-motion and/or looming stimuli.
These results have been published previously in abstract form (Anderson and Siegel 1995, 1997).
MATERIALS AND METHODS
The responses of STPa neurons to four types of optic flow stimuli were studied in three hemispheres of two behaving male rhesus monkeys (Macaca mulatta; 6 and 10 kg). All experimental and surgical procedures were in accordance with National Institutes of Health Guidelines on the Care and Use of Animals in Research and approved by the Rutgers University Institutional Review Board for the Use and Care of Animals. During training and recording sessions, the monkey was seated in a chair 57 cm away from a video monitor. The monkey was trained to pull back a lever at the onset of a central 0.3° red point. Two seconds later, a visual display appeared centered around this point. A change (see below) in the display occurred at a random time between 1500 and 4000 msec after the onset of the display. The animal was required to attend to the display and respond to the change in the display by releasing the lever within 800 msec. After the release of the key, the display disappeared. A correct response was rewarded with a drop of juice. A restricted watering schedule during the week provided motivation to perform the task. Figure 1 illustrates the time course of the behavioral task used in these experiments.
Once the monkey could perform this behavioral task at greater than 90% correct, sterile surgery was performed using standard surgical procedures (Siegel and Read, 1997a) to attach a cap of bone cement to the monkey’s skull. A stainless steel T bar was embedded in the cement for head fixation during recording sessions. Each animal was then trained to maintain fixation within a window of 1° for up to 6 sec. Eye position was monitored with a noninvasive infrared video eye-tracking system (RK-416; ISCAN Inc., Cambridge, MA) and was sampled every 32 msec.
After fixation training, a second surgery was performed to implant a 16-mm-diameter stainless steel recording chamber over each hemisphere. The chambers were placed ∼15–16 mm anterior to the interaural plane and 19–20 mm lateral to the midline. These coordinates were chosen based on magnetic resonance images (MRI) of each monkey’s brain and on previous studies of STPa and adjacent areas (Bruce et al., 1981; Richmond et al., 1983; Oram et al., 1993). STPa lies 20–25 mm below these coordinates, with the exact depth depending on the lateral coordinates of the penetration. Single-unit recordings were made with insulated parylene-coated tungsten microelectrodes (Frederick Haer & Co., Bowdoinham, ME). The electrode was passed through a guide tube and lowered using a two-stage stereotaxic microdrive that attached to the recording chamber. The final depth of the electrode was based on the MRI images and on electrophysiological landmarks along the penetration, e.g., auditory cortex, gray and white matter, and general neuronal response properties. The response properties of STPa neurons to auditory stimuli were not formally tested; however, many neurons were found to respond to both auditory and visual stimulation. This property was useful as an indication that the electrode was in STPa (Bruce et al., 1981). Neuronal waveforms were isolated using standard methods (Siegel and Read, 1997a), converted to digital pulses with a window discriminator (Bak Electronics, Germantown, MD), and collected at a resolution of 0.1 msec. Only neuronal data from trials in which the animal maintained fixation and correctly performed the task were analyzed and included in this report.
Optic flow stimuli
The optic flow displays used in these experiments were adapted from earlier studies (Siegel and Andersen, 1990; Siegel and Read, 1997a). All optic flow displays were 40° in diameter. Displays of this size may not encompass the full receptive field of STPa neurons; however, studies in MST, VIP, and 7a have found significant activation and selectivity with optic flow stimuli smaller than the receptive field size (Graziano et al., 1994; Schaafsma and Duysens, 1996; Siegel and Read, 1997a). The stimuli consisted of 128 white dots (32 cd/m2), 0.1° in diameter, and were plotted on a dark background (1.0 cd/m2). The points were plotted asynchronously, and each point was visible for 533 msec (32 frames). Once a dot disappeared, it was replotted at a random location within the display. Consequently, any fortuitous form cues were constantly changing from frame to frame. Point density across the displays was kept constant. New displays were generated for each recording session. Stimulus displays were grouped into blocks and presented in pseudorandom order for 8–10 trials each.
Four types of optic flow were used in these experiments: planar rotation [clockwise (CW) and counterclockwise (CCW)], radial [expansion (EXP) and compression (COM)], spiral [clockwise expansion (CWE) and compression (CWC), and counterclockwise expansion (CCWE) and compression (CCWC)] and translation (eight directions, spaced at 45°). The parameters of the displays (speed, point number, density, lifetime, and size) in the present experiments were demonstrated previously to give a robust perception of structured motion (Siegel and Andersen, 1988, 1990).
Both the planar rotation and the radial expansion displays were generated so that the speed of the points varied as a function of the distance of the point from the center of the display (Siegel and Andersen, 1990; Anderson and Siegel, 1993). The tangential velocity of each point in a rotation display was calculated with the following equation: $V_t = 2\pi f r$, where $V_t$ is the tangential velocity of the point, $f$ is the frequency of rotation for a given angular velocity of the display, and $r$ is the distance of the point from the center of the display. The velocity of each point in the radial displays was calculated using the same equation, and then the direction of the trajectory was rotated 90° so that it moved in a radial direction toward or away from the center of the display. Thus, the rotation and radial displays contained speed gradients that were identical. The angular velocity of the rotation and radial displays used in these experiments was 1°/frame, which at a refresh rate of 60 frames/sec corresponded to 60°/sec or one full rotation in 6 sec. The range of speeds in these displays was calculated to be 0°/sec for points at the center of the display to 20°/sec for points at the edge. The mean speed was empirically measured at 14 ± 5°/sec for both the rotation and radial displays. This discrepancy between the theoretical mean and the actual value was attributable to roundoff error. This occurred because the calculated value of each position of the dots had to be rounded to the nearest pixel when plotted on the screen at a resolution of 640 × 480 pixels. Figure 2, A and B, illustrates the rotation and radial displays used in these experiments.
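The speed gradient described above can be illustrated with a minimal sketch. It assumes Cartesian dot coordinates in degrees with the y axis pointing up, and a rotation period of 6 sec (f = 1/6 Hz) taken from the text; the function names `flow_velocity` and `speed` are hypothetical and are not the software used in the experiments.

```python
import math

def flow_velocity(x, y, kind, period_s=6.0):
    """Velocity (vx, vy) in deg/sec of a dot at (x, y) deg from the display center.

    kind is 'CCW', 'CW', 'EXP', or 'COM'. period_s is the time for one full
    rotation (6 sec in the text). Axis orientation is an assumption.
    """
    omega = 2.0 * math.pi / period_s  # 2*pi*f, so that speed = 2*pi*f*r
    if kind == "CCW":
        # Rotation: velocity is perpendicular to the radius vector.
        return (-omega * y, omega * x)
    if kind == "CW":
        return (omega * y, -omega * x)
    if kind == "EXP":
        # Radial flow: the rotation vector turned by 90 deg, pointing outward.
        return (omega * x, omega * y)
    if kind == "COM":
        return (-omega * x, -omega * y)
    raise ValueError("unknown display type: %r" % kind)

def speed(v):
    """Magnitude of a velocity vector."""
    return math.hypot(v[0], v[1])

# Speed at the display edge (r = 20 deg) and at the center.
edge_speed = speed(flow_velocity(20.0, 0.0, "CW"))
center_speed = speed(flow_velocity(0.0, 0.0, "EXP"))
```

In this sketch the rotation and radial fields share the same speed at every position and differ only in direction, which is the property the text relies on when it describes the speed gradients as identical.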
In the spiral displays, the velocity of the dots was again proportional to the distance from the center of the display (Fig. 2C). To match the tangential speed of the dots in the spiral displays with those in the rotation and radial displays, spiral displays were generated by using vector addition to combine the motion trajectories of the rotation and radial displays. Because the sum of two orthogonal vectors of equal length is 1.414 (√2) times longer than either component, the speeds of the dots in the spiral displays were then scaled down by a factor of 1.414 so that they matched the speeds in the radial and rotation displays. The average speed and distribution of velocities of the spiral displays were the same as those for the rotation and radial displays (14 ± 5°/sec). The eight rotation, radial, and spiral displays were matched on all parameters, including speed and velocity gradients; therefore, these displays were grouped into one block for presentation during recording sessions. Some neurons were tested with rotation and radial optic flow displays only. Figure 2 shows the displays used in this experiment.
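The vector addition and speed matching for the spiral displays can be sketched as follows. This is an illustration under stated assumptions: the coordinate convention (degrees, y axis up), the 6 sec rotation period, and the name `spiral_velocity` are not from the original stimulus code, and the factor of 1.414 is interpreted as a division by √2.

```python
import math

def spiral_velocity(x, y, rotation="CW", radial="EXP", period_s=6.0):
    """Velocity (vx, vy) in deg/sec of a dot in a spiral display.

    The rotation and radial vectors at (x, y) are added and the sum is
    divided by sqrt(2) so that its speed equals that of either component.
    """
    omega = 2.0 * math.pi / period_s
    rot_sign = 1.0 if rotation == "CCW" else -1.0
    rad_sign = 1.0 if radial == "EXP" else -1.0
    rot = (-rot_sign * omega * y, rot_sign * omega * x)  # perpendicular to radius
    rad = (rad_sign * omega * x, rad_sign * omega * y)   # along the radius
    scale = 1.0 / math.sqrt(2.0)                         # undo the sqrt(2) gain
    return ((rot[0] + rad[0]) * scale, (rot[1] + rad[1]) * scale)

# A dot 10 deg from the center in a clockwise-expansion (CWE) display.
vx, vy = spiral_velocity(10.0, 0.0, "CW", "EXP")
cwe_speed = math.hypot(vx, vy)
```

After scaling, the spiral speed at a given eccentricity equals the 2πfr value of the single-component displays, and the motion direction lies midway between the rotation and radial directions.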
Neurons in STPa were also tested for their selectivity for translating motion. A block of eight translation displays with directions separated by 45° was used to test the directional selectivity of isolated neurons. These displays contained dots that moved at the same speed and in the same direction (Fig. 2D). The speed of dots in the translation displays was 12 ± 0.5°/sec (mean ± SD).
Neurons were tested with the onset of fully structured (coherent) rotation, radial, and, in most cases, spiral and translation motion. The task of the animal was to release the key when these displays changed to fully unstructured (noncoherent) motion for the rotation, radial, and spiral displays and when the dots became stationary in the translation displays. The process of unstructuring these displays has been described in detail elsewhere (Siegel and Andersen, 1990; Siegel and Read, 1997a). This behavioral task was selected to ensure that the monkey consistently attended to the stimulus throughout the trial while fixating at the center of the screen.
Statistical analysis
For statistical analysis, average firing rates were calculated for the 500 msec period before and immediately after the onset of the visual display. This interval was chosen so that it incorporated both tonic and burst firing responses of the cell. Both excitatory and inhibitory responses were included in this analysis.
ANOVA. To classify the responses of neurons, a two-way ANOVA was performed on the responses of each neuron within a block of stimuli, with one independent factor corresponding to the type of displays within the block (e.g., CW rotation, CCW rotation, etc.) and the other corresponding to the time period of the firing activity (before vs after onset) (Siegel and Read, 1997a). In this way, small fluctuations in neuronal excitability (baseline) were controlled for throughout the recording period. The significance level was set at p < 0.05. Accordingly, a cell that had no significant response to either the type of display (factor 1) or the time period (factor 2) and no significant interaction between the two factors was classified as “unresponsive” because the stimuli used in these experiments failed to drive the cell. Neurons that had a main effect of time period alone (either excitatory or inhibitory) were classified as “sensitive” but not selective for a display within the block in that they demonstrated a significant change in firing rate to the onset of the visual displays, but this change was equal for all of the displays within the block. Neurons that showed a main effect of both time and display were classified as “selective,” as were neurons that showed an interaction effect. These neurons showed responses that varied not only with the onset of the displays, but also with the individual displays, indicating differential firing across the different displays within the block. A small (5%) proportion of the cells termed “statistical” were found that had a main effect of display but no effect of time period (Siegel and Read, 1997a). These were grouped with the unresponsive cells.
The above design was used to classify a cell as (1) sensitive, (2) selective, or (3) unresponsive. The use of these terms in this study corresponds to the definitions provided by Van Essen (1985), his Table 3. This type of analysis was used instead of calculating selectivity indices to take into account the variability in the baseline firing rate of the neuronal responses from trial to trial. This can lead to a more conservative count of the neurons showing sensitivity and selectivity for particular stimuli (type II error). However, we used this measure to account for changes in the baseline activity of the neuron across many trials and to prevent false positive results. Table 1 illustrates the classification of the statistical responses used in this study.
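The classification rules can be summarized as a small decision function. This is a sketch of the logic as described in the text, taking the three p values from a two-way ANOVA as inputs; the function name `classify_cell` and its argument names are hypothetical, and the ANOVA itself is not computed here.

```python
def classify_cell(p_display, p_time, p_interaction, alpha=0.05):
    """Classify a neuron from the p values of a two-way ANOVA.

    p_display: main effect of display type (factor 1)
    p_time: main effect of time period, before vs after onset (factor 2)
    p_interaction: display x time interaction
    """
    display = p_display < alpha
    time = p_time < alpha
    interaction = p_interaction < alpha
    if interaction or (display and time):
        return "selective"       # response differs across displays
    if time:
        return "sensitive"       # responds to onset, equally for all displays
    if display:
        return "statistical"     # grouped with unresponsive cells in the text
    return "unresponsive"

labels = [
    classify_cell(0.01, 0.001, 0.30),  # main effects of display and time
    classify_cell(0.40, 0.001, 0.30),  # main effect of time only
    classify_cell(0.40, 0.60, 0.01),   # interaction only
    classify_cell(0.01, 0.60, 0.30),   # main effect of display only
    classify_cell(0.40, 0.60, 0.30),   # nothing significant
]
```

Treating any significant interaction as selective follows the statement that neurons showing an interaction effect were classified as selective.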
Estimation of directional tuning. Neurons that showed a selective response (based on the results of the ANOVA) for at least one of the rotation, radial, or spiral displays were further analyzed using a sinusoidal regression model, adapted from Steinmetz et al. (1987), to determine which display produced the maximal firing rate and the dependence of the response on each optic flow component. Each display was assigned an angle in spiral space (θ) according to the amount of each optic flow component within it (Graziano et al., 1994). The following model was then used to fit a sinusoidal function to the data: $R(\theta) = A\cos\theta + B\sin\theta + C$, where $R(\theta)$ is the predicted firing rate for the display at angle θ. In this model, A is the contribution of the radial components to the firing rate of the neuron, and B is the contribution of rotation components. The angle corresponding to the display that elicits the maximal neuronal response is calculated as $\tan^{-1}(B/A)$. The amplitude of the response is $(A^2 + B^2)^{1/2}$. The baseline rate of the cell is C. This model estimated the average firing rate of a neuron against stimulus direction in spiral space. The data were then fit to the sinusoidal function, and a stepwise nonlinear regression method was used to determine the significant parameters of the equation that minimized the difference between the predicted and actual responses of the neuron. Variables that significantly improved the curve fit at p < 0.05 were entered into the model. This model gave the predicted responses of the neuron based on the best fit regression curve and assumed broad tuning for a direction rather than sharp, unitary peaks of activity. Nonsignificant fits indicate that the behavior of the neuron could not be predicted using this model and may suggest that direction selectivity is not robust or that the selectivity is more complex than that which can be described with this periodic function.
Data that could not be fit significantly using the above model were analyzed by performing Bonferroni post hoc tests at a level of p < 0.05 to determine the particular display(s) within a block underlying the responses of the neuron. The Bonferroni test was chosen as the most conservative pairwise test with minimal false positives.
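The sinusoidal tuning model can be illustrated with a short least-squares sketch. It assumes the model has the form R(θ) = A cos θ + B sin θ + C, with radial expansion at 0° and clockwise rotation at 90° as stated in the Results; the placement of the remaining displays at 45° steps, the name `fit_sinusoid`, and the example firing rates are assumptions for illustration. The stepwise significance testing used in the study is not reproduced.

```python
import math

# Assumed angles in spiral space (deg); EXP = 0 and CW = 90 follow the text.
SPIRAL_ANGLE_DEG = {
    "EXP": 0, "CWE": 45, "CW": 90, "CWC": 135,
    "COM": 180, "CCWC": 225, "CCW": 270, "CCWE": 315,
}

def fit_sinusoid(rates):
    """Least-squares fit of R(theta) = A*cos(theta) + B*sin(theta) + C.

    rates maps each of the eight display names to a mean firing rate.
    For equally spaced angles the least-squares solution reduces to the
    closed-form sums used here.
    """
    n = len(SPIRAL_ANGLE_DEG)
    sum_cos = sum_sin = total = 0.0
    for name, deg in SPIRAL_ANGLE_DEG.items():
        theta = math.radians(deg)
        sum_cos += rates[name] * math.cos(theta)
        sum_sin += rates[name] * math.sin(theta)
        total += rates[name]
    A = 2.0 * sum_cos / n          # radial contribution
    B = 2.0 * sum_sin / n          # rotation contribution
    C = total / n                  # baseline term
    # atan2 resolves the quadrant that tan^-1(B/A) alone leaves ambiguous.
    preferred_deg = math.degrees(math.atan2(B, A)) % 360.0
    amplitude = math.hypot(A, B)
    return A, B, C, preferred_deg, amplitude

# Made-up rates for a cell tuned to clockwise rotation (90 deg).
example = {name: 5.0 + 3.0 * math.sin(math.radians(deg))
           for name, deg in SPIRAL_ANGLE_DEG.items()}
A, B, C, preferred_deg, amplitude = fit_sinusoid(example)
```

A preferred angle near a multiple of 90° corresponds to a single-component display, whereas angles near 45°, 135°, 225°, or 315° correspond to spirals.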
Estimation of receptive field shape and size. The receptive fields of neurons were mapped with 10° white stationary squares. The squares were presented at nine different locations on a 3 × 3 grid covering a square area of 40 × 40° on the monitor that was centered on the fixation point. The fixation point was always at the primary position. The luminance of the squares was 32 cd/m2. Receptive field shape and size were determined for the significant responses to stationary squares using the following general quadratic model: $A(x,y) = a_x R_x + a_y R_y + a_{xy} R_x R_y + a_{xx} R_x^2 + a_{yy} R_y^2 + b + \varepsilon_i$, where A is the neural activity in spikes per second (Read and Siegel, 1997). $R_x$ and $R_y$ were the horizontal and vertical retinotopic positions, respectively. The coefficients $a_x$ and $a_y$ are the slopes of the regression in the horizontal and vertical dimensions, respectively. The horizontal–vertical interaction term is $a_{xy}$, and the quadratic terms are $a_{xx}$ and $a_{yy}$. b is the intercept. The error term $\varepsilon_i$ is the difference of the predicted value and the actual value for the ith measurement. The a and b parameters were fit using linear regression by a stepwise procedure to introduce and remove variables at the p < 0.05 level (GLM procedure; SAS Co., Durham, NC). This stepwise procedure removes all terms that do not significantly account for variance in the data at the p < 0.05 level. Thus, a final fit might consist of only three parameters: $a_x$, $a_{xx}$, and b (i.e., $A(x,y) = a_x R_x + a_{xx} R_x^2 + b + \varepsilon_i$). This stepwise approach has the advantage that the model cannot be overdetermined; additional parameters that have no statistical basis will not be estimated. Typically, the significance level of each remaining parameter is p = 0.001. (See Read and Siegel, 1997 for further description and justification.)
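The full quadratic receptive field model can be fit by ordinary least squares, as in the sketch below. It fits all six coefficients at once and does not implement the stepwise introduction and removal of terms used in the study. The grid positions of −15°, 0°, and 15°, the function names, and the example firing rates are assumptions for illustration.

```python
def solve_linear(M, v):
    """Solve M z = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = aug[i][n] - sum(aug[i][k] * z[k] for k in range(i + 1, n))
        z[i] = s / aug[i][i]
    return z

def fit_quadratic_rf(points, rates):
    """Least-squares fit of A = ax*Rx + ay*Ry + axy*Rx*Ry + axx*Rx^2 + ayy*Ry^2 + b."""
    rows = [[x, y, x * y, x * x, y * y, 1.0] for x, y in points]
    p = 6
    # Normal equations: (X^T X) coef = X^T rates
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * a for r, a in zip(rows, rates)) for i in range(p)]
    coef = solve_linear(xtx, xty)
    return dict(zip(["ax", "ay", "axy", "axx", "ayy", "b"], coef))

# Assumed square centers (deg) on the 3 x 3 grid, and made-up rates for a
# cell with a peaked field along the horizontal meridian.
grid = [(x, y) for x in (-15.0, 0.0, 15.0) for y in (-15.0, 0.0, 15.0)]
rates = [10.0 - 0.011 * x * x + 0.1 * y for x, y in grid]
fit = fit_quadratic_rf(grid, rates)
```

A negative `axx` in the result indicates a peak along the horizontal meridian, and a positive value indicates a trough, matching the sign convention described in the Results.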
Histology
After the conclusion of this study, electrolytic lesions were made in two hemispheres of one monkey by passing 4 μA of direct current for 4 sec through the electrode. Histology was performed using standard techniques (Siegel and Read, 1997a). Frozen sections were cut at 25 or 50 μm, mounted on gelatinized slides, and stained with thionin. All (seven of seven) lesions were found to be in the upper bank and fundus of STPa in this monkey. The second monkey is still being used for ongoing experiments in this laboratory. The recording sites in this monkey have been tentatively verified using x-ray (minXray 803, Northbrook, IL) images taken in the coronal and parasagittal planes while the electrode was in place (Nahm et al., 1994). Using landmarks visible on both the x-ray and MRI sections (e.g., skull, ear canals), the x-rays were scaled to and superimposed on the MRI sections taken at the same anteroposterior coordinates to verify that the location was in STPa. Figure 3 shows one lesion in the upper bank of the superior temporal sulcus (STS).
RESULTS
A total of 786 neurons from 140 penetrations in three hemispheres of two monkeys were tested for their responses to visual stimuli. Of these, 514 (65%) exhibited significant responses to the onset of the test stimuli using the two-way ANOVA and were termed visual. The remaining 272 neurons were unresponsive and are not considered further.
Receptive field properties
The receptive fields of 222 visual neurons were mapped with stationary squares. Of these cells, 109 (49%) were unresponsive to the square at any position, and thus, their receptive fields could not be assessed. This was expected because previous studies of STPa have found neurons in this area to be relatively insensitive to stationary stimuli (Bruce et al., 1981). The other 113 neurons showed statistically significant activation to the squares using the two-way ANOVA.
These neurons were divided into two groups based on their responses. One group (51 neurons or 23% of the total tested) responded equally and above baseline to the square in all nine positions, indicating that their receptive fields were beyond the limits of the testing area (40° × 40°). The second group of cells that responded significantly (62 neurons or 28% of the total tested) showed a selective response to the onset of the squares that was dependent on the position of the square, as shown for three representative neurons in Figure 4. Visual inspection of the peristimulus time histograms of the responses of these cells to each position of the square revealed that most of them (44 neurons) showed the maximal firing rate for the square at the center position of the screen where the animal was fixating (Fig. 4A). A further 11 cells did not respond maximally for the center position but still showed a significant increase over baseline for the center position, indicating that the cell was responsive to the stimulus at this center position (Fig. 4B). Only seven cells showed responses that were weaker or inhibited for the center position (Fig. 4C). Although the activity of these cells did not increase for the center position of the square, their activity for other positions was significantly higher than baseline. These cells may be similar to those termed foveal sparing in area 7a by Motter and Mountcastle (1981); however, this pattern of responses does not appear to be a predominant feature of the population of STPa cells studied here. This qualitative examination of the responses was first confirmed by performing Bonferroni post hoc tests on the neuronal responses to the squares that showed a dependence on position and then by using the quadratic receptive field analysis.
The group of 62 neurons whose responses to squares depended on position in the two-way ANOVA was examined using the stepwise linear regression model (Read and Siegel, 1997). Forty-three of the 62 neurons had nonlinear receptive field structures. For these 43 cells, there was a quadratic dependence on horizontal or vertical position (i.e., significant $a_{xx}$ and/or $a_{yy}$ terms), with only seven of these having an interaction term ($a_{xy}$). Of the cells that had a quadratic component, 23 had significant modulation along the horizontal meridian, 6 had significant modulation along the vertical meridian, and 10 were modulated along both the vertical and horizontal meridians. The sign of the quadratic coefficient signifies a peak ($a_{ii} < 0$) or trough ($a_{ii} > 0$) in the receptive field. The population of STPa cells had three times as many cells with peaked receptive fields as with troughs. This supports the impression obtained from visual inspection and the Bonferroni analysis that the receptive fields often had maximal activity at the center position. Twelve of the 62 neurons had a purely linear receptive field structure, with half of the cells having an upper–lower receptive field asymmetry, half having an ipsilateral–contralateral asymmetry, and only one cell having both.
The size of the average STPa receptive field can be estimated from the receptive field width at half-height. This value can be computed using the coefficients from the quadratic equation. The shift in position along the horizontal meridian from the receptive field center that would result in a 50% change in firing rate from the peak or minimum ($X_{50}$) can be computed as follows: $X_{50} = \sqrt{\|c\| / (2\,\|a_{xx}\|)}$, where $\|a_{xx}\|$ is the mean of the absolute value of the horizontal quadratic coefficient, and $\|c\|$ is the absolute value of the intercept of the quadratic equation. (A similar equation may be derived for modulation along the vertical meridian.) The means of the absolute value of the quadratic components were $a_{xx}$ = 0.011 ± 0.002 and $a_{yy}$ = 0.011 ± 0.003 Hz/deg² for the horizontal and vertical quadratic terms, respectively (n = 33; n = 16). The mean intercept (c) was 10 ± 1.9 Hz. Using the equation, $X_{50}$ and $Y_{50}$ are both ∼22°, and the half-height receptive field width is 44°. Thus, the receptive fields of STPa neurons are large.
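The half-height calculation can be checked numerically. The sketch below assumes the relation X50 = sqrt(|c| / (2|axx|)), which follows from setting the quadratic term |axx|·X² equal to half of the intercept; this form is reconstructed from the definitions in the text rather than quoted from it, and the function name `half_height_shift` is hypothetical.

```python
import math

def half_height_shift(quad_coef, intercept):
    """Shift (deg) from the field center giving a 50% change in firing rate.

    Assumes a one-dimensional quadratic field A(x) = c + a_xx * x**2, so the
    change |a_xx| * x**2 equals |c| / 2 at x = sqrt(|c| / (2 * |a_xx|)).
    """
    return math.sqrt(abs(intercept) / (2.0 * abs(quad_coef)))

# Population means reported in the text: 0.011 Hz/deg^2 and 10 Hz.
x50 = half_height_shift(0.011, 10.0)
full_width = 2.0 * x50
```

With the rounded means this gives a shift of roughly 21°, close to the reported value of ∼22°; the small difference is plausibly caused by rounding of the published coefficients, although that is an inference and not something the text states.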
The receptive field regression analysis suggests that the majority of cells in STPa have receptive fields larger than 40°, consistent with previous findings of receptive field sizes in STPa extending 30–150° from the fovea (Desimone and Gross, 1979; Bruce et al., 1981). Furthermore, most of the cells that showed responses that were altered by the position of the square exhibited the maximal response at the center of the screen or responded above baseline for this position (89%).
The next series of experiments examined the responses of STPa neurons to motion stimuli. As it was not possible to test all retinotopic positions with multiple types of optic flow, large-field motion stimuli were positioned at the fixation point in the center of the screen for all neurons to overlap with the static receptive field. In many motion-selective cortical regions, the receptive field in response to static stimuli overlaps with the receptive field of motion stimuli (Allman et al., 1973; Albright, 1984; Lagae et al., 1994; Read and Siegel, 1997). Given the broad spatial tuning of STPa neurons to static stimuli and the preponderance of quadratic tuned cells with strong responses at the center of the receptive field from the regression analysis, it was expected that robust responses to the motion displays would be found with large diameter (40°) motion stimuli centered on the fovea.
Translation motion selectivity
To determine whether neurons in STPa were responsive to translation motion in the frontoparallel plane, neurons were tested with a block of planar translation displays, each moving in one of eight directions spaced 45° apart. Of the 303 visual neurons tested with translation, 172 (57%) showed a significant response to the onset of translating motion. Of these neurons, 48 (28%) showed selective responses for one or more of the eight directions of translation motion, whereas the remaining 124 showed equal responses to all directions of motion (sensitive responses). The direction of motion that elicited the maximal firing rate of cells showing a selective response was assessed using the sinusoidal regression model. In this analysis, 0° corresponded to motion in the rightward direction, 90° to motion upward, and so forth. The responses of 25 neurons (52%) were modeled significantly by this function (p < 0.05) (Fig. 5A,B). The tuning of the cell shown in Figure 5, A and B, was broad in that it showed increased activation to more than one direction of motion.
The selective responses of cells that were not significantly fit with the sinusoidal model were assessed by performing Bonferroni post hoc tests. The direction of motion underlying the maximal responses of 11 cells could be determined with this post hoc test, and all 11 cells showed increases in activity for more than one direction of motion (Fig. 5C). The responses of the remaining cells were not significantly different across directions based on the post hoc results, suggesting that they were only weakly tuned for direction.
Although the tuning of most cells for translation was broad, the preferred direction of motion of 22 of the 25 selective cells was along one of the cardinal axes, particularly in the upward or downward direction as assessed using the distribution of preferred directions (Fig. 5D). These results indicate that a minority of neurons in STPa (16% of all tested) showed selective responses to translation motion; however, their tuning was broad and usually included more than one direction.
Optic flow sensitivity and selectivity
A total of 307 cells were tested for their sensitivity and selectivity to eight directions of optic flow stimuli: rotation (CW and CCW), radial (EXP and COM), and spiral (CWE, CCWE, CWC, and CCWC). Another 182 neurons were tested only with the four single-component optic flow displays (CW and CCW, rotation; EXP and COM, radial). In both of these blocks, all displays began as structured motion and changed to unstructured motion. The monkey was required to release the key in response to this change.
Of the neurons tested with both the single-component and the four spiral displays generated from the combined trajectories of rotation and radial motion, 201 (65%) responded significantly (two-way ANOVA) to the onset of at least one of the displays. Of these neurons, 105 (52%) responded equally to the eight displays and were classified as sensitive but not selective for a particular pattern of flow. The other 96 neurons (48%) that showed significant responses to the displays responded differentially to the eight displays and were classified as selective. The responses of these neurons were affected by the type of display, in that they responded to some displays in the block but not to others. Figure 6A is an example of a cell that showed a selective response. The activity of this cell increased for the EXP, CWE, and CCWE optic flow displays. For this cell, there was little response to the other displays within the block.
The responses of the 96 of 307 (31%) cells that showed selectivity when tested with the block of rotation, radial, and spiral displays were fit with a sinusoidal function using the regression analysis. It was found that the responses of 51 (53%) of these cells could be fit significantly with this function (p < 0.05). The responses of these neurons were plotted in spiral space (Fig. 6B) in which each angle represented the relative contribution of rotation and radial motion to the firing rate of the cell (Graziano et al., 1994). The response to radial expansion was arbitrarily assigned to the 0° position, the response to clockwise rotation to 90°, and the responses to spiral displays were plotted along oblique angles corresponding to the amount of radial and rotation component within the displays (equal in these experiments). For the cell shown in Figure 6, radial expansion evoked the strongest response, whereas the two spiral displays containing expansion motion components (CWE and CCWE) evoked significant but slightly weaker responses. The best fit regression curve for the response of this cell is shown in Figure 6C. The activity of this cell was only slightly above baseline for displays not containing expansion motion. Thus, the firing rate of this cell was maximal for radial expansion motion and decreased as the amount of expansion motion vectors in the display decreased. As a whole, most of the neurons whose responses could be fit with a tuning curve (96%) showed maximal activation for single-component optic flow patterns but were also responsive to spirals containing their preferred pattern. Of these 51 neurons, 28 showed maximal firing rates for radial expansion, 11 for radial compression, 7 for clockwise rotation, and 3 for counterclockwise rotation. Two cells were maximally activated by clockwise compression spirals.
The selective responses of the 45 cells to radial, rotation, and spiral optic flow that could not be fit significantly with this sinusoidal model were evaluated using the Bonferroni post hoc test. Ten (22%) cells showed a maximal firing rate to only one display: four for EXP, two for COM, two for CCW, one for CWE, and one for CCWE. The displays evoking the maximal response of all but one of these cells were single-component displays, similar to the finding reported above. One cell showed significantly greater responses for both radial compression and radial expansion but no difference in firing rate between these two displays. There were 12 cells that showed significantly stronger responses to the four spiral displays than the four single-component displays but did not show a preference for a particular spiral display. Likewise, eight cells showed significantly greater responses to the single-component compared with the spiral displays but showed equal responses to the four single-component displays. These cells appeared to respond to classes of stimuli, either single-component or double-component displays, although it is possible that the analytical tests used were not sensitive enough to detect small differences in their firing rates to these displays. These cells differ from the ones described above in that they seem to show preferential responses to more than one type of optic flow. Furthermore, this preference does not depend on the presence of a preferred direction of flow but rather on the number of optic flow component vectors within the displays. The post hoc tests for the remaining 14 cells did not show significant differences in their firing rates across the displays.
The regression analysis demonstrated that the responses of over half of the neurons that responded selectively to optic flow could be modeled with a sinusoidal function that evaluated the contribution of radial and rotation motion components. All but two of these cells showed maximal firing rates for a single-component display (Fig. 7A). However, these cells also responded significantly to the spiral displays that contained their preferred pattern of motion. Many of the cells evaluated with Bonferroni post hoc tests showed a similar preference for single-component optic flow displays.
Of the 182 neurons that were tested only with the four single-component optic flow displays, 74 (41%) showed significant responses, with 25 (34%) of these showing selective responses for at least one of the displays (two-way ANOVA). Figure 8 shows a cell that responded selectively for radial expansion. Curve-fitting and regression analyses were not performed on the responses of these cells because the sampling was sparse in spiral space. The particular display(s) responsible for the selectivity of these cells was determined by performing post hoc tests on the results of the ANOVA. It was found that 13 of the cells were selectively responsive to radial expansion, similar to the cell shown in Figure 8. Two cells showed selective responses to radial compression, and one cell responded most strongly to counterclockwise rotation. For this set of neurons, there also appeared to be a bias for radial expansion, similar to the findings of neurons tested with the larger blocks of displays that included spiral motion (Fig. 7B). The results of the post hoc tests for the other eight cells that showed a selective response to one of the four single-component optic flow displays on the basis of the ANOVA results were not significant at the p < 0.05 level. Therefore, the particular display responsible for their selective response could not be determined from this post hoc test.
A total of 489 neurons were tested with four or eight optic flow displays. Two hundred seventy-five neurons (56%) responded significantly to the onset of these displays, with 121 neurons (44%) being selective. The selectivity of the responses indicated that there was a bias for radial expansion optic flow in this population of neurons. In addition, most cells showed the strongest response to only one type of flow but were responsive to other displays if the preferred pattern of optic flow was present. A small number of cells responded equally to all spirals or all single-component displays; these could be responding to larger classes of motion rather than a single optic flow pattern.
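The population totals in this summary follow from the counts given for the two stimulus blocks, and the arithmetic can be checked with a short sketch. The counts are taken from the text; the variable names and the rounding to whole percentages are illustrative assumptions.

```python
# Counts reported in the text for the two stimulus blocks.
blocks = {
    "eight displays": {"tested": 307, "responsive": 201, "selective": 96},
    "four displays": {"tested": 182, "responsive": 74, "selective": 25},
}


def percent(part, whole):
    """Percentage of `whole` made up by `part`, rounded to a whole number."""
    return round(100.0 * part / whole)


# Sum each count across the two blocks.
totals = {
    key: sum(block[key] for block in blocks.values())
    for key in ("tested", "responsive", "selective")
}

# Responsive cells as a share of all cells tested, and selective cells
# as a share of the responsive cells.
pct_responsive = percent(totals["responsive"], totals["tested"])
pct_selective = percent(totals["selective"], totals["responsive"])
print(totals, pct_responsive, pct_selective)
```

Note that the percentage of selective cells is expressed relative to the responsive cells, not to all cells tested.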
Comparison of translation and optic flow selectivity on a cell by cell basis
The relative sparseness of direction selectivity for translating motion in STPa (16% compared with 25% for optic flow for all neurons tested) suggests that these neurons respond better to more complex optic flow than to simple translation in one direction. However, these numbers are taken from the population and do not reflect the tuning of cells on an individual basis. More direct evidence comes from a comparison of the responses of individual neurons to translation and the more complex optic flow stimuli (Fig. 9). One block (“translation”) of stimuli consisted of eight different directions of translation motion. The other block (“optic flow”) consisted of the eight different optic flows (CW, CCW, CW-EXP, etc.). The significance of the responses for these neurons was determined using the two-way ANOVA for each block. Cells were categorized on whether they showed significant selective responses in the two-way ANOVA for each block. Significant (p < 0.05) selective activity indicated that a cell had a response to at least one of the individual stimuli that was “different” from the others (see Materials and Methods). Of the 215 cells given the full battery of stimuli, 91 cells passed this stringent test of selectivity for the translation group and/or the optic flow group. Of these 91 neurons, there were 55 cells (60%) that were only selective to the displays of the optic flow block, 19 cells (20%) that were only selective to the translation block, and 17 cells (19%) that were selective to both optic flow and translation (Fig. 9). Of these 17 cells, many were selective only for radial expansion in the optic flow block but were tuned for multiple directions of translation motion. The remaining 124 cells of the initial set of 215 cells either did not respond at all to any of the stimuli (40 cells) or showed responses that were not tuned to the optic flow or translation displays (84 cells).
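The cell by cell categorization described above can be expressed as a simple decision rule. The sketch below assumes that each cell is summarized by three yes/no outcomes (whether it responded at all, and whether the ANOVA for each block showed significant selectivity at p < 0.05); the function name, category labels, and data layout are hypothetical. The counts are those reported in the text.

```python
def classify(responsive, selective_translation, selective_flow):
    """Assign a cell to one of the five categories used in the comparison.

    The three arguments are booleans assumed to come from the per-block
    ANOVA results; how they are computed is outside this sketch.
    """
    if selective_translation and selective_flow:
        return "both"
    if selective_flow:
        return "optic flow only"
    if selective_translation:
        return "translation only"
    return "responsive, not tuned" if responsive else "no response"


# Category counts reported in the text.
counts = {
    "optic flow only": 55,
    "translation only": 19,
    "both": 17,
    "responsive, not tuned": 84,
    "no response": 40,
}

# Cells selective for at least one block, and all cells given the full battery.
n_selective = sum(counts[k] for k in ("optic flow only", "translation only", "both"))
n_total = sum(counts.values())
print(n_selective, n_total)
```

A cell that is selective in either block is counted as selective regardless of its response in the other block, which is why the three selective categories together account for the 91 cells and all five categories account for the 215 cells tested.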
These results support the assertion that the selectivity to global optic flow cannot simply be explained by translation selectivity. It was found that 55 of 91 neurons were selective only to optic flow but not to translation motion. Furthermore, many cells responded to all directions of translation motion, but only a few responded selectively for a particular direction. This response to all directions of translation motion may be indicative of a general response to the presence of motion in all directions of the visual field, as in expansion and compression. This was confirmed when the cells were tested with the more complex optic flow displays. However, the bias for radial expansion probably does not arise from the simple presence of linear motion in many directions, because there was an unequal number of expansion-selective and compression-selective cells (Fig. 7).
DISCUSSION
This study investigated the responses of neurons in STPa to optic flow patterns that result from self-motion. Most neurons in STPa responded to motion stimuli in the absence of other cues, and many responded selectively to optic flow patterns. These neurons were found to be unimodally tuned around a specific direction of optic flow, but they also responded to combinations of optic flow that contained their preferred direction.
Elegant initial studies of the motion properties in STP used hand-held stimuli that contained nonmotion cues, including luminance, texture, density, and speed changes (Bruce et al., 1981; Perrett et al., 1985; Oram et al., 1993). Consequently, although many of the neurons showed no preference for a particular stimulus, the activity of the neurons to the stimulus movement could not be dissociated from their responses to the stimulus form. The present study removes these limitations on the interpretation of motion selectivity in STPa because controlled, computer-generated motion stimuli devoid of nonmotion cues were used to test the responses of STPa neurons. The responses to optic flow reported here are similar in magnitude to, and in some cases greater than, those reported for combinations of form and motion (cf. Oram and Perrett, 1996). This suggests that optic flow is sufficient to selectively activate neurons in STPa and that this area is likely to play a role in the analysis of self-motion. It is possible that STPa neurons also combine form with the motion selectivity as suggested by this earlier study. However, in the same hemispheres studied here, almost no neurons were found that were selective to two-dimensional form defined by both motion and form defined by luminance (Anderson and Siegel, 1998). These negative results argue against a role of STPa in simple form analysis and lead to the suggestion that either uncontrolled stimulus parameters that arise from using hand-held stimuli or the higher complexity of the moving biological forms (Perrett et al., 1985; Oram et al., 1993; Oram and Perrett, 1996) are in part responsible for differences in our results and those published earlier.
Another advance in the present study is the confinement of recordings to the upper bank of the anterior division of STP to sample a more homogeneous population (cf. Hikosaka et al., 1988). In earlier studies, neurons were sampled in both the posterior and anterior regions of STP and in the lower bank of the STS (Desimone and Gross, 1979; Bruce et al., 1981; Perrett et al., 1985). There are cytoarchitectonic differences along the caudal to rostral extent of STP, and these differences likely correspond to functional heterogeneity in this region (Baylis et al., 1987; Cusick et al., 1995).
Selectivity biases in STPa
STPa neurons had two selectivity biases. First, in a cell by cell comparison of the responses to translation and to complex optic flow for 91 neurons, threefold more cells showed selective responses to complex optic flow compared with translation. There are two possible explanations for these differences, given that almost all parameters (e.g., number of points, point life, size, etc.) of the displays were identical. One is that, although the mean speeds of the translation (12 ± 0.5°/sec) and complex optic flow displays (14 ± 5°/sec) were similar, the distributions of speeds were different. The translation displays had only one speed, whereas the complex optic flow displays had a range of speeds (cf. Graziano et al., 1994). This wider range of speeds may have contributed to the increased numbers of neurons showing selectivity for complex optic flow. However, neurons in STPa have been reported to be insensitive to differences in speed, particularly differences of such small magnitudes (Oram et al., 1993). A second explanation for this bias in selectivity toward complex optic flow is that it reflects an actual specialization of cortical processing for optic flow during locomotion. While moving, an organism rarely encounters pure frontoparallel motion. If STPa is indeed specialized for forward locomotion, then fewer neurons would be needed to encode pure translation motion.
The lower percentage of translation-selective neurons does not diminish the importance of the representation of planar motion in STPa. Of the STPa neurons that responded selectively to translation motion, there was a bias for motion in one of the four cardinal directions, particularly for the upward and downward direction based on the responses of 36 of 48 cells tested (Fig. 5B). Similar biases for selectivity in the cardinal directions have been shown in other studies of STPa (Perrett et al., 1985; Oram et al., 1993). Such neuronal population biases may underlie better human discrimination of motion in cardinal directions than in oblique directions (Heeley and Buchanan-Smith, 1992). Furthermore, a bias for stimuli moving in the upward or downward direction may be ecologically relevant for monkeys who direct their gaze and attention to the ground while foraging and tracking, which adds an upward translation component to the resulting change in optic flow. Lesions of STP result in deficits in pursuit eye movements, particularly for targets moving downward, consistent with the increased selectivity for downward motion in STPa found in the present study (Ó Scalaidhe et al., 1995). Thus, a preference in the selectivity of STPa neurons for downward motion provides additional evidence that STPa is selectively activated during locomotion.
A second bias that was discovered in the population of STPa neurons studied here was for radial expansion motion over the other complex optic flows. The predominance of expansion-selective neurons cannot be explained by an increased firing rate to straight motion trajectories over curved ones. Fewer neurons showed selective responses to the translation and compression radial displays, which also contained straight motion trajectories. The bias for expansion motion found in STPa is consistent with the proposed role of STPa in specifically encoding forward locomotion. Strong activation of the STPa population would be expected while subjects are moving forward with their head unmoving on their shoulders. Movements of downward-directed gaze would additionally activate the translation-selective neurons, providing a unique cortical representation of gaze alignment relative to the direction of locomotion. If this is a function of STPa, then its neurons should also represent shifts in the center of the flow fields when radial motion and downward translation motion are combined. These neurons could assist in the compensation for eye movements while maintaining selectivity for the overall pattern of motion (Warren and Hannon, 1988, 1990; Bradley et al., 1996). An additional possibility for the prevalence of expansion-selective cells over rotation cells is that expansion is almost always encountered during locomotion, whereas rotary components more often come from eye movements. Last, the orthogonal expansion flow component that is specially represented within STPa may have arisen from selective advantages during evolution to permit better localization during forward locomotion. Current computational models may need modification to consider the biases in the selectivity of STPa neurons.
Multiple representations of complex motion processing beyond MT
A possible explanation for the emerging complexity of motion processing as one progresses from MT to MST to STPa begins with MT extracting local motion vectors (Allman et al., 1973; Zeki, 1984; Maunsell and Van Essen, 1983a,b). MST then computes a range of optic flows from MT (Tanaka et al., 1986; Zemel and Sejnowski, 1998), whereas STPa neurons use the MST representation to extract the specific motion information patterns for navigation during forward locomotion. Support for this idea may be obtained by contrasting the representation of optic flow in STPa with other representations in the cortex. In contrast to MSTd neurons (Graziano et al., 1994; Duffy and Wurtz, 1997), STPa neurons were sensitive to spiral motion but rarely showed their maximal response to these combinations of optic flow. Rather, the majority of cells responded preferentially to expansion, compression, or rotation (i.e., the “pure” optic flows). A second difference in the responses of neurons in STPa from those in MSTd is the pronounced bias in the selectivity for radial expansion. Both STPa and MSTd have a large percentage of cells selective for expansion; however, the bias in MSTd is not as strong as the bias in STPa (Graziano et al., 1994). These differences indicate that STPa is not representing optic flow in the same way as MSTd, but may be encoding specific components of motion that occur with forward locomotion.
Optic flow is also represented in the parietal cortex. Neurons in VIP respond to optic flow similarly to MSTd and STPa neurons, but in addition, they respond to tactile stimulation near the head. VIP may be integrating head somatosensory cues with motion signals from MSTd to guide the acquisition of visual stimuli for intended head movements (Colby et al., 1993; Schaafsma and Duysens, 1996). VIP neurons also have a bias toward expansion selectivity (Schaafsma and Duysens, 1996), which further supports a role in detecting looming stimuli near the head. LIP neurons have not been specifically tested with complex optic flow stimuli, but their responses to translation motion appear to be more related to encoding the direction of upcoming eye movements to specific targets in space (Shadlen and Newsome, 1996). Neurons in parietal area 7a appear to code for discrete classes of optic flow rather than for a continuum of directions (Siegel and Read, 1997a; Read and Siegel, 1997). In contrast to both MSTd and STPa, area 7a does not appear to be involved in the processing of specific directions of optic flow but rather in the combination of these motion signals with eye position, a function necessary for egocentric localization in space (Read and Siegel, 1997). Given the inputs of area 7a to STPa (Andersen et al., 1990), changes in the eye position and/or head position may also alter the representation in STPa and will need to be studied. Each of these cortical regions (MSTd, LIP, VIP, 7a, and STPa) can be broadly said to process motion; however, each has a specific tuning. The hypothesis that emerges is that higher motion analysis is itself parceled into multiple visual areas depending on the functions selected by environmental and evolutionary pressures.
In summary, the current studies suggest that STPa is an extension of the motion-processing pathway beyond MT and MST into the anterior temporal lobe and that it might contribute to the processing of optic flow that is specifically associated with forward locomotion. Further, STPa is a polysensory region (Bruce et al., 1981); thus, its neurons could be integrating visual optic flow cues with position information derived from auditory (e.g., auditory looming) and proprioceptive cues. It is suggested that STPa, and other recipient zones of MST projections, may be utilizing motion for a particular environmentally based function. If so, the idea of a subdivision of cortical labor into “what” and “where” pathways needs to be reconciled with the multiple representations of motion that are being discovered in cortex.
Footnotes
This work was supported by Office of Naval Research Grant N00014-93-1-0034 and National Institutes of Health Grant R01 EY-9223. We gratefully acknowledge Dr. Charles Schroeder of Albert Einstein College of Medicine and Drs. Lawrence Tannenbaum and Martin Gizzi of the New Jersey Neuroscience Institute for performing the MRI scans on the animals. We also thank Dr. Cassandra Cusick of Tulane University for performing the histology and for helpful discussion on STPa.
Correspondence should be addressed to Dr. Ralph M. Siegel, Center for Molecular and Behavioral Neuroscience, Rutgers University, 197 University Avenue, Newark, NJ 07102.
Dr. Anderson’s present address: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, E25–236, Cambridge, MA 02139.