Abstract
How does the brain process visual information about self-motion? In monkey cortex, the analysis of visual motion is performed by successive areas specialized in different aspects of motion processing. Whereas neurons in the middle temporal (MT) area are direction-selective for local motion, neurons in the medial superior temporal (MST) area respond to motion patterns. A neural network model attempts to link these properties to the psychophysics of human heading detection from optic flow. It proposes that populations of neurons represent specific directions of heading. We quantitatively compared single-unit recordings in area MST with single-neuron simulations in this model. Predictions were derived from simulations and subsequently tested in recorded neurons. Neuronal activities depended on the position of the singular point in the optic flow. Best responses to opposing motions occurred for opposite locations of the singular point in the visual field. Excitation by one type of motion is paired with inhibition by the opposite motion. Activity maxima often occur for peripheral singular points. The averaged recorded shape of the response modulations is sigmoidal, which is in agreement with model predictions. We also tested whether the activity of the neuronal population in MST can represent the directions of heading in our stimuli. A simple least-mean-square minimization could retrieve the direction of heading from the neuronal activities with a precision of 4.3°. Our results show good agreement between the proposed model and the neuronal responses in area MST and further support the hypothesis that area MST is involved in visual navigation.
In the cortical motion pathway of primates, two areas are concerned with optic flow processing. The middle temporal (MT) area (Allman and Kaas, 1971; Dubner and Zeki, 1974) contains many direction-selective cells (Maunsell and Van Essen, 1983a,b; Rodman and Albright, 1987) that, in principle, might form a distributed encoding of the flow field arriving on the retina (Movshon et al., 1985; Bülthoff et al., 1989; Wang et al., 1989; Newsome et al., 1990; Britten et al., 1993). In the medial superior temporal (MST) area, which follows area MT in the motion pathway (Ungerleider and Desimone, 1986; Boussaoud et al., 1990), many neurons respond to large random dot optic flow patterns (Saito et al., 1986; Tanaka et al., 1986; Tanaka and Saito, 1989a; Duffy and Wurtz, 1991a,b), suggesting an involvement in the analysis of optic flow. Several studies capitalized on the fact that, mathematically, any flow field can be locally decomposed into a small number of basic or elementary flow components: divergence, deformation, and curl (Koenderink and van Doorn, 1975). Optic flow stimuli presented to MST neurons were pure expansions, contractions, and rotations (Saito et al., 1986; Tanaka et al., 1986; Tanaka and Saito, 1989a; Duffy and Wurtz, 1991a,b), recently augmented also by deformations (Lagae et al., 1994) and linear combinations of expansion/contraction and rotation (Orban et al., 1992; Graziano et al., 1994). Consistently, it was found that most single neurons in MST responded strongly to several of these stimuli and also to unidirectional frontoparallel motion (Duffy and Wurtz, 1991a,b; Lagae et al., 1994). They do not perform a mathematical decomposition of the flow field (Graziano et al., 1994; Lagae et al., 1994). Different conclusions have been drawn on whether these neurons, or a more restricted subset of them, might be involved in the processing of optic flow fields arising from egomotion.
Here we propose to investigate this question with a combination of experimental and theoretical considerations.
Lappe and Rauschecker (1993b) have devised a network model of visual navigation. This model generates neurons that respond to several of the usually tested optic flow stimuli in a way similar to cells in MST (Lappe and Rauschecker, 1993a). A single neuron might respond to several optic flow patterns and also to frontoparallel, unidirectional movement. Other single-model neurons respond selectively only to a smaller set of basic flow patterns. However, consistent with the findings in MST, no single neuron in the model performs a mathematical decomposition of the flow field and detects a preferred basic component. Rather, the model neurons are designed to contribute to the solution of a specific and important task of optic flow analysis, namely, the detection of the direction of heading. In a retinotopic frame of reference, which is assumed in this paper, this would mean specifically the determination of the direction of heading with respect to the direction of gaze. The selective responses of neurons to certain basic flow patterns result from their function in this task, and they also respond to more complex flow fields, such as linear combinations of expansion/contraction and rotation. This model can be used to derive a number of predictions for optic flow processing neurons that differ from the propositions used in earlier studies. Here we attempt to perform a comparison between computer simulations of single-model neurons and the activities of single neurons recorded in area MST with exactly the same stimulation procedure in both cases. For the first step of this comparison, which comprises the scope of this paper, we focus on a simple and basic simulated egomotion optic flow, namely, a linear translation in three-dimensional space. We will not consider the important issue of simultaneous visual rotations attributable to eye movements. 
We will, however, partially include pure rotations in the frontoparallel plane, mostly for reasons of comparability with previous studies.
We would like to add a few comments on rotational motion patterns and their relation to optic flow occurring during egomotion. It is a mathematical fact that any optic flow field can be locally decomposed into divergence, rotation, and deformation (Koenderink and van Doorn, 1975). However, this decomposition is a local operation, meaning that it is only defined in an infinitesimal neighborhood around the point of interest, i.e., a very tiny patch of the optic flow field. The large size of the receptive fields of MST neurons and their preference for large stimuli are incongruent with a local decomposition of the flow field by MST neurons. On the other hand, it is practically impossible to observe a pure full-field frontoparallel rotation by any normal type of self-motion. It would be required, essentially, to spin around a midsagittal axis running through the center of the eye. This movement is not in the repertoire of a normal human or monkey. Rotational movements that occur during normal primate locomotion are either curved paths of travel or eye rotations resulting from version eye movements. Neither induces a full-field frontoparallel rotational motion pattern. Rather, the former results in curved trajectories of the flow field elements over time (Warren et al., 1991). The latter results in a distortion of the optic flow field on the retina depending on the direction and speed of the eye movement. A limited amount of rotational visual motion is obtained when a moving observer tracks an eccentric point in the environment. In this case, however, the retinal flow field resembles a spiral much more than a rotation [see Lappe and Rauschecker (1995) for a mathematical analysis of this type of self-motion]. So why do we include full-field frontoparallel rotations in our study? For one reason, all previous studies have included frontoparallel rotations in their basic stimulus set, and it serves as a means of comparison.
Moreover, the model neurons also respond to frontoparallel rotation, thus allowing a comparison between model and experiment. However, we would like to emphasize that the model responses to frontoparallel rotation are not a specific selectivity for this type of motion but, rather, a reflection of their selectivity for heading detection.
MATERIALS AND METHODS
Single-unit recordings were performed in two awake, behaving monkeys (Macaca mulatta) performing a fixation task. All procedures were in accordance with published guidelines on the use of animals in research (European Communities Council Directive 86/609/EEC). Experimental methods followed standard procedures that can be found in more detail in Bremmer et al. (1996).
Animal preparation. The animals were surgically prepared for chronic neurophysiological recording. The monkeys were pretreated with atropine and sedated with ketamine hydrochloride. Under general anaesthesia [10 mg/kg, i.v., pentobarbital sodium (Nembutal)] and sterile surgical conditions, the animals were implanted with a chronic device for holding the head. Two scleral search coils were implanted, so as to monitor eye position, and were connected to a plug on top of the skull. A recording chamber for introducing a guide tube and an electrode through the intact dura was implanted over a trephine hole in the skull. The chamber was placed over occipital cortex in a parasagittal stereotaxic plane, tilted 60° off vertical. Recording chamber, eye coil plug, and head holder were all embedded in dental acrylic, which was connected to the skull by self-tapping screws. Analgetics were applied postoperatively, and recording started no sooner than 1 week after surgery.
Behavioral paradigm and recordings. During training and recording sessions, the monkey’s head was restrained on a primate chair while he was performing a fixation task for liquid reward (apple juice). Rewards were given for keeping the eyes within an electronically defined window centered on the fixation target. The fixation target (1.0° diameter) was generated by a light-emitting diode and back-projected on a translucent tangent screen subtending 90° × 90° of visual angle at a viewing distance of 35 cm. The fixation target was always presented in the center of the projection screen. The monkey was required to maintain central fixation throughout the stimulation period. Behavioral paradigm, visual stimulation, and data acquisition were controlled by a PC (Compaq 386-25) and an in-house-developed software package (DADA: Data acquisition and data analysis, U. J. Ilg). At the end of the training or experimental sessions, the monkey was returned to his home cage. Monkey’s weight was monitored daily, and supplementary fruit or water supply was provided.
For cell recordings, tungsten-in-glass electrodes (impedance, 1-2 MΩ at 1 kHz) were advanced using a hydraulic microdrive (Narishige) mounted on the recording chamber. Neuronal activity and electrode depth were noted, so as to establish the relative positions of landmarks, such as gray and white matter, and neuronal response characteristics.
Visual stimulation and data analysis. When a cell was isolated, the receptive field was mapped using a hand-held projector while the monkey fixated a central target. Quantitative testing of basic visual properties was performed using a galvanometer-mounted slide projection system allowing display of light bars or random dot patterns of different sizes. The results of these tests determined the optimal two-dimensional stimulus including optimal speed and preferred direction. Other tests performed on the neurons included visual responses to pattern on- and off-set, responses during smooth pursuit eye movements, and modulations by eye position. Quantitative computation of the preferred stimulus direction of the neurons for full-field frontoparallel two-dimensional motion was done by means of the SDO analysis (Wörgötter and Eysel, 1987) using a full-field random dot pattern.
For testing responses to optic flow stimuli, the main deviations from previous studies are threefold and are similar to a paradigm used in a more recent paper of Duffy and Wurtz (1995). First, we use full-field stimulation covering a central 90° × 90° visual field. We take the center of the visual field as a reference point for the description of the neuronal tuning. Previous studies have often positioned stimuli with respect to the receptive field of a neuron. Second, we varied the position of the singular point in the full-field stimulus instead of the position of the stimulus itself. The singular point is defined as an idealized point in the flow field for which the visual motion is zero, i.e., a point that remains stationary in the optic array. For expansion/contraction stimuli, the singular point is the focus of expansion/contraction. For rotation stimuli, the singular point is the center of rotation. Third, unlike Duffy and Wurtz (1995), we use optic flow fields that simulate self-motion in a visual environment consisting of a random distribution of points in three-dimensional space. For the case of translational forward or backward movement, this results in a random distribution of flow field speeds in the stimulus. Previous studies have mostly used uniform speed distributions or speed distributions that were consistent with a movement toward a frontoparallel plane. All of these deviations are motivated by our intent to search for an involvement of these cells in the processing of optic flow fields arising from egomotion. During egomotion, the entire visual field is moving. The direction of heading has to be specified with respect to the direction of gaze, not with respect to the location of individual receptive fields. 
The variation of the position of the singular point was chosen, because for the simple linear ego-translation that we consider here, the location of the singular point of an expansion/contraction pattern, i.e., the focus of expansion/contraction, is directly related to the direction of heading. The random distribution of flow field speeds was preferred over a uniform distribution because much evidence from psychophysics indicates that the motion parallax in these flow fields is an important source of information for visual navigation.
Optic flow stimuli were generated by a Macintosh Quadra computer. The stimuli consisted of full-field computer-generated sequences that were back-projected onto the tangent screen. Image resolution was 400 × 400 pixels. Movies simulated approaching (expansion), receding (contraction), and rotating (clockwise/counterclockwise) egomotion with respect to a random cloud of dots in three-dimensional space. These dots were white on a dark background. The cloud extended from 2 to 40 m in depth from the monkey. It contained 90 spherical dots all with the same simulated diameter of 20 cm. Visual dot size depended on the simulated distance of the dot from the observer. Median size was ∼1° of visual angle. Simulated speed of the monkey was 3 m/sec for the expansion/contraction stimuli and 60 deg/sec for the rotation stimuli. The movie sequences were generated off-line, stored, and later played back during the recording session. A single sequence lasted 1300 msec and displayed one direction of motion (expansion or clockwise rotation) for a duration of 650 msec, immediately followed by 650 msec stimulation by the opposite direction (contraction or counterclockwise rotation). For data analysis, the mean spike rate during each 650 msec stimulus interval was computed, corrected for the latency of the response onset. The sequences simulated entirely realistic egomotion flow fields. Dots accelerated with eccentricity, grew larger in size as they approached the monkey, and exhibited a nonuniform speed distribution that depended on the simulated distance of each dot from the monkey. For the rotation displays, dots did not accelerate or grow in size. In both cases, the visual motion of the dots was identical to the motion that the monkey would experience when moving relative to such a cloud of dots. In some neurons, we also tested a stimulus in which all dots were assumed to lie on a frontoparallel plane instead of a random cloud.
In this case, the flow field speeds are more evenly distributed and the stimulus does not contain any motion parallax.
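The geometry of these translational stimuli can be illustrated with a minimal sketch. It assumes a pinhole projection with unit focal length and a pure linear translation; the depth range (2 to 40 m), the dot count (90), and the speed (3 m/sec) are taken from the text, whereas the function names, the lateral extent of the cloud, and the random seed are hypothetical choices made only for illustration. It is not the stimulus software used in the experiments.

```python
import math
import random

def make_cloud(n_dots=90, z_near=2.0, z_far=40.0, half_width=1.0, seed=0):
    """Random dots in 3-D space (hypothetical helper).
    Lateral positions are drawn so that every dot starts inside a
    90 x 90 deg field, i.e. |X/Z| and |Y/Z| <= tan(45 deg) = 1."""
    rng = random.Random(seed)
    dots = []
    for _ in range(n_dots):
        z = rng.uniform(z_near, z_far)
        x = rng.uniform(-half_width, half_width) * z
        y = rng.uniform(-half_width, half_width) * z
        dots.append((x, y, z))
    return dots

def translational_flow(dot, t):
    """Image position and image velocity of one dot while the observer
    translates with velocity t = (Tx, Ty, Tz); focal length is 1."""
    X, Y, Z = dot
    tx, ty, tz = t
    x, y = X / Z, Y / Z
    u = (-tx + x * tz) / Z
    v = (-ty + y * tz) / Z
    return (x, y), (u, v)

def focus_of_expansion(t):
    """Image position at which the translational flow vanishes."""
    tx, ty, tz = t
    return (tx / tz, ty / tz)

# Heading 15 deg to the right of the direction of gaze at 3 m/sec.
heading = (3.0 * math.sin(math.radians(15.0)), 0.0,
           3.0 * math.cos(math.radians(15.0)))
speeds = [math.hypot(*translational_flow(d, heading)[1]) for d in make_cloud()]
print(min(speeds), max(speeds))
```

Because the image velocity is divided by the depth Z of each dot, dots at similar image positions but different depths move at different speeds (motion parallax). If all dots are placed at one common depth, as in the frontoparallel plane stimulus, the speed depends on image position only.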
To test whether the responsiveness of neurons was modulated by the position of the singular point in the optic flow stimulus, nine different movie sequences were presented in random order. In each of these nine sequences, the singular point was located either in the center of the screen or at one of eight different locations arranged on a circle around the center of the screen. The radius of the circle could be either 15° or 40°.
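The layout of the nine singular point positions can be sketched as follows. Equal angular spacing of the eight peripheral positions is an assumption (the text states only that they were arranged on a circle), and the function name and the seed are hypothetical.

```python
import math
import random

def singular_point_positions(radius_deg, n_on_circle=8):
    """Screen centre plus n_on_circle positions on a circle, in degrees
    of visual angle. Equal angular spacing is an assumption."""
    positions = [(0.0, 0.0)]
    for k in range(n_on_circle):
        angle = 2.0 * math.pi * k / n_on_circle
        positions.append((radius_deg * math.cos(angle),
                          radius_deg * math.sin(angle)))
    return positions

# One block of nine sequences presented in random order (15 deg circle).
order = singular_point_positions(15.0)
random.Random(0).shuffle(order)
print(order)
```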
Histology. In the last days of recording with the first monkey, electrolytic microlesions (10 nA for 10 sec) and neuronal tracer injections were made. After recording was completed, the monkey was given an overdose of pentobarbital sodium and, after respiratory block and cessation of all reflexes, transcardially perfused. Frozen sections were cut at 50 μm thickness. Sections 250 μm apart were stained with cresyl violet and Klüver Barrera to visualize cytoarchitecture. Another series was stained for myelin with the Gallyas (1979) method as modified by Hess and Merker (1983). Electrode tracks were identified on the basis of the relative location of the penetration to the entire recorded area, the spatial relationship to other tracks and marking lesions or injections, and the depth profile during a penetration. Our penetration scheme covered only a small spatial region. The locations of the microlesions with respect to this penetration scheme allowed us to identify the full area from which we recorded. Approximate location of each recording site on the track was determined, based on the distance from the above specified landmarks as well as the appearance and disappearance of gray matter. Camera lucida drawings of the relevant sections as well as two-dimensional maps of the recorded hemisphere were made as a standard procedure. Most MST recording sites were located in the posterior bank of the STS, near the anterior border of area MT, and in the fundus of the STS. Histology of the second monkey is not yet available because the animal is involved in other experiments. In addition, evidence that a given neuron was located in MST was also obtained from physiological criteria that were used during the recording sessions following the procedure outlined by Celebrini and Newsome (1994).
RESULTS
We first want to give a brief outline of the structure and function of the network model proposed by Lappe and Rauschecker and develop predictions for single-neuron properties. Then we will describe the experimental results and the evaluation of the predictions.
Modeling optic flow processing
In his influential work starting in the 1950s, J. J. Gibson postulated that the changing retinal illumination pattern occurring during egomotion in a visual environment could be used effectively for navigation (Gibson, 1950). Since then, much research in psychophysics and computer vision has been concerned with optic flow processing, but only recently have neurobiologists started to investigate these questions in higher mammals. Humans can accurately detect their direction of heading from optic flow, even in the presence of confounding eye movements (Warren and Hannon, 1988, 1990; van den Berg, 1993). In some situations, however, humans do also need additional extraretinal information about their eye movements to detect correctly their direction of heading (Warren and Hannon, 1990; Royden et al., 1994). Many mathematical investigations have been concerned with the visual decomposition of the retinal flow (Koenderink and van Doorn, 1975; Prazdny, 1980; Longuet-Higgins, 1981; Rieger and Lawton, 1985; Verri et al., 1989), but few neurobiological models of optic flow processing exist (Lappe and Rauschecker, 1993b; Perrone and Stone, 1994; Zemel and Sejnowski, 1995).
The model of Lappe and Rauschecker (1993b) is a two-layer implementation of an algorithm (Heeger and Jepson, 1992) that computes the direction of heading inherent in a measured optic flow field by matching the motion parameters of the observer, i.e., ego-translation T and eye-rotation Ω, to the measured optic flow field according to a least-square criterion. Thus, given a specific flow field as input, it determines which of a possible set of heading directions most likely generated this input flow field. This is equivalent to determining the direction of heading in a retinotopic frame of reference, i.e., the direction of heading relative to the direction of gaze. A schematic layout of the network is shown in Figure 1. In the first layer, direction-selective cells represent the optic flow input. We assume that the response of each cell is maximal for movements of small objects in an individual preferred direction and zero for movements in the null direction. We further assume that each retinal location contains several neurons with different preferred directions that together encode a measured optic flow vector. We regard the first layer of the network as a functional representation of area MT in monkey cortex. Various models have been proposed for the measurement of optic flow and its possible implementation in area MT (Hildreth and Koch, 1987; Bülthoff et al., 1989; Wang et al., 1989; Qian et al., 1994; Nowlan and Sejnowski, 1995). Different suggested mechanisms such as “winner-take-all” or “population coding” have been tested experimentally (Salzman and Newsome, 1994). In our model, the precise nature of the optic flow representation in MT is not critical. We only require the first layer to signal the direction and the speed of an optic flow vector that occurs at a specific location in the visual field. Any biologically plausible algorithm would suffice. For the simulations described later, we used a simple encoding.
A set of 32 simplified neurons encodes the local optic flow at a particular position in the visual field. We assume a cosine-shaped direction tuning and a Gaussian speed tuning. We typically use four direction preference classes (0°, 90°, 180°, 270°) and eight speed preference classes (0.5, 1, 2, 4, 8, 16, 32, 64 deg/sec), i.e., 4 × 8 = 32 neurons per position. We disregard any effects of spatial summation of a neuron caused by an extended receptive field. Instead, we assume that the neuronal responses reflect only the speed and direction of a single point in the flow field. We assume that in the first layer of the network a large number (typically 300) of such functional units are randomly distributed within the visual field.
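A minimal sketch of such a first-layer functional unit is given below. The four direction classes and eight speed classes are taken from the text. The remaining details are assumptions made for illustration: the cosine tuning is scaled to run from 1 (preferred direction) to 0 (null direction), the Gaussian speed tuning is defined on a logarithmic speed axis with a width of one octave, and the two tuning factors are multiplied. All function names are hypothetical.

```python
import math

DIRECTIONS_DEG = (0.0, 90.0, 180.0, 270.0)
SPEEDS_DEG_S = (0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0)

def direction_tuning(flow_dir_deg, pref_dir_deg):
    """Cosine tuning: 1 in the preferred direction, 0 in the null direction."""
    return 0.5 * (1.0 + math.cos(math.radians(flow_dir_deg - pref_dir_deg)))

def speed_tuning(speed, pref_speed, sigma_octaves=1.0):
    """Gaussian tuning on log2(speed); the log axis and the width are assumptions."""
    if speed <= 0.0:
        return 0.0
    distance = math.log2(speed / pref_speed)
    return math.exp(-0.5 * (distance / sigma_octaves) ** 2)

def encode_flow_vector(u, v):
    """Responses of the 4 x 8 = 32 units to one flow vector (u, v) in deg/sec.
    Rows index preferred direction, columns index preferred speed."""
    speed = math.hypot(u, v)
    direction = math.degrees(math.atan2(v, u))
    return [[direction_tuning(direction, d) * speed_tuning(speed, s)
             for s in SPEEDS_DEG_S]
            for d in DIRECTIONS_DEG]

responses = encode_flow_vector(4.0, 0.0)  # rightward motion at 4 deg/sec
for row in responses:
    print(["%.2f" % r for r in row])
```

With this encoding, the direction and speed of the flow vector can be recovered from the relative activities of the 32 units, which is all that the second layer requires of its input.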
The second layer of the network contains neuronal populations individually tuned to specific directions of heading. This layer forms a computational map of possible heading directions. Each map position represents a specific direction of heading, given by the intersection of the axis of translational movement of the observer with the retinal image. A column of neurons occupying a specific map position in layer two separately computes the likelihood that the optic flow field represented in layer one is the result of an egomotion along the axis of translation (the direction of heading) given by its position in the map. These neurons form the population that represents this specific direction of heading. Other directions of heading, which are associated with different locations in the heading map, are served by different populations of neurons. To achieve this computation, the connection strengths between the two layers have to be carefully adjusted. However, this adjustment is not done with a weight update method such as backpropagation. Rather, the required connection strengths are precalculated from the mathematical formalization of the underlying heading detection algorithm (Lappe and Rauschecker, 1993b). Therefore, no training is necessary. The distribution of the connections, on the other hand, can be chosen at random. Each second-layer neuron receives input from a random subset of first-layer units. Only the connection strengths have to be specified. This allows for convergence, divergence, and overlap in the receptive fields of the second-layer neurons. Also, the receptive field sizes can be chosen to be consistent with the typical receptive field dimensions of MST neurons. As a result of the freedom of assignment of first-layer input neurons to second-layer neurons, the receptive fields of the second-layer neurons can be inhomogeneous, i.e., clustering of inputs in parts of the receptive field can occur.
However, as many researchers have noted (Tanaka and Saito, 1989b; Duffy and Wurtz, 1991b; Lagae et al., 1994), the receptive fields of MST neurons are also often irregular and difficult to determine. In the simulations, we typically used 32 input locations per second-layer neuron.
Once the connections have been determined, the network minimizes a certain residual function that describes the error between the measured flow field and a candidate flow field induced by an egomotion into a certain heading direction: a peak of activity in the map occurs at the position where this residual function is minimal. This peak specifies the most likely direction of heading as computed by the network. An example is given in Figure 2. The example simulates movement of an observer on top of a ground plane. The observer simply moves on a linear path toward the plus sign (+) while gazing toward the left of his movement trajectory. During the movement, he keeps a fixed angle between the direction of gaze and the direction of heading. No eye rotation occurs. The resulting optic flow input to the network is a pure expansion with the singular point, the focus of expansion, located in the direction of heading. The network is able to determine the correct direction of heading. The right side of Figure 2 shows the population activities in the second layer. Each square in this grayscale plot corresponds to a specific map position in x and y, i.e., to the retinal projection of a specific direction of heading. The brightness indicates the population activity at this map position. The brightest square in the map indicates the direction of heading as computed by the network. It matches the correct direction (+) within the resolution of the grid (1°). In this and the following simulations, the second layer of the network consisted of 16,000 neurons.
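The principle of a heading map whose peak lies at the minimum of a residual function can be sketched for the rotation-free case considered in this example. The sketch is a deliberate simplification and not the subspace algorithm of Heeger and Jepson (1992) on which the network is based: it assumes that no eye rotation occurs, so that every flow vector must point along the line through the candidate focus of expansion, and it uses the squared deviation from that constraint as the residual. The activity function 1/(1 + residual), the map resolution, the image coordinates in tangent units, and all names are assumptions made for illustration.

```python
import random

def flow_for_heading(points, depths, heading):
    """Rotation-free flow for a focus of expansion at image position
    `heading` (unit forward speed, pinhole projection)."""
    hx, hy = heading
    return [((x - hx) / z, (y - hy) / z) for (x, y), z in zip(points, depths)]

def residual(points, flow, candidate):
    """Sum of squared flow components perpendicular to the radial
    direction from the candidate heading (zero for the true heading)."""
    cx, cy = candidate
    total = 0.0
    for (x, y), (u, v) in zip(points, flow):
        cross = u * (y - cy) - v * (x - cx)
        total += cross * cross
    return total

def heading_map(points, flow, grid):
    """Population activity per map position: the lower the residual,
    the higher the activity (the mapping 1/(1+r) is an assumption)."""
    return {c: 1.0 / (1.0 + residual(points, flow, c)) for c in grid}

rng = random.Random(0)
points = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(300)]
depths = [rng.uniform(2.0, 40.0) for _ in range(300)]
true_heading = (0.2, -0.1)
flow = flow_for_heading(points, depths, true_heading)

grid = [(i / 10, j / 10) for i in range(-10, 11) for j in range(-10, 11)]
activity = heading_map(points, flow, grid)
estimate = max(activity, key=activity.get)
print(estimate)
```

Note that the residual does not require knowledge of the depths of the flow elements; they are used here only to generate the input flow field.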
The example in Figure 2 describes a simple linear movement that does not involve any visual rotations attributable to eye- or head-movements. However, such rotations often occur during locomotion and have a profound influence on the structure of the flow field on the retina (Regan and Beverly, 1982; Warren and Hannon, 1990; Lappe and Rauschecker, 1995). The network has been designed to cope with this situation. To achieve an invariance against eye-rotations, a simple search for the focus of expansion on the retina is misguided. Instead, each second-layer population has to evaluate the residual function and adjust its activity accordingly: the lower the value of the residual function, the higher the output activity of the population. This evaluation cannot be performed by any single neuron alone, but is spread out over all of the cells within a population. Therefore, whereas the population possesses a “preferred” direction of heading, a single cell is not able to signal the direction of heading on its own. The activity of a single cell only serves as one constraint on the heading direction. To compute the most likely heading direction thus requires summation of the outputs of many cells. This procedure is illustrated in Figure 3. The response of a single cell (Fig. 3A) to an optic flow input is a sigmoid function of the direction of heading. Such a cell only signals whether the direction of heading lies roughly in one-half of the visual field (left hemifield in Fig. 3A). Together with a second cell (Fig. 3B), providing information about whether the direction of heading is likely to be located in the right hemifield, the location of the direction of heading is found to lie on a line dissecting the visual field. The combination of many neuronal responses (a second pair of neurons is shown in Fig. 3C,D) finally results in a peak of population activity at the retinal position of the correct direction (Fig. 3E).
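How the summation of a few sigmoidal single-cell responses can produce a localized population peak may be sketched as follows. The sketch uses four hypothetical cells with idealized sigmoid response functions arranged in two opposing pairs; the gain, the offset, the orientations, and the preferred heading of the population are arbitrary illustrative values, not parameters of the model.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cell_response(heading, preferred, normal, offset=10.0, gain=0.3):
    """Sigmoid response of one hypothetical cell as a function of the
    heading (deg). The cell is active when the heading lies on the side
    of a line (orientation given by `normal`) that contains `preferred`."""
    s = ((heading[0] - preferred[0]) * normal[0]
         + (heading[1] - preferred[1]) * normal[1])
    return sigmoid(gain * (s + offset))

# Two opposing pairs of cells: left/right and down/up.
NORMALS = ((-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0))

def population_activity(heading, preferred):
    """Sum of the four single-cell responses."""
    return sum(cell_response(heading, preferred, n) for n in NORMALS)

preferred = (10, -5)  # preferred heading of this population (deg)
headings = [(x, y) for x in range(-40, 41, 5) for y in range(-40, 41, 5)]
peak = max(headings, key=lambda h: population_activity(h, preferred))
print(peak)
```

Each single cell is active over roughly half of the visual field, and each opposing pair restricts the heading to a band around a line. Only the sum over both pairs is maximal at a single position, the preferred heading of the population.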
The structure of the model assumes that each MST cell is associated with a specific population that represents a specific direction of heading. Figure 3 also illustrates why it is difficult to determine from physiological data the specific population or direction of heading with which a given neuron is associated. The most obvious differences in the response curves of the four individual neurons are their orientations. However, these differences do not functionally separate the neurons. In fact, these different orientations within one population are necessary to achieve the desired overlap in the response functions that generates the population selectivity. This principle is very similar to the way local motion information is encoded in a distributed fashion in the MT layer. There, the relative activities of a population of neurons with different directional selectivity give the direction of motion of the stimulus. In MT, these populations are formed by all neurons that occupy the same receptive field location in visual space. A neuron with a receptive field at another location clearly belongs to a different population, encoding motion at that location. In the MST model, receptive field location would not be a good basis to group neurons together in one population. To acquire as much information for the determination of self-motion parameters as possible would instead require covering all of the visual field with the neurons within one population. Thus, neither the response curves directly nor the receptive field positions would be expected to differentiate neurons in one population from neurons in another population. The appropriate parameter to group neurons together would instead be their nearness in “heading space.” However, this parameter manifests itself neither in the individual response curve nor in the receptive field position, but only when neuronal responses are combined. 
Thus, if such a map of heading-populations were anatomically present in MST, it would not be directly visible in the response properties of neighboring neurons. As Figure 3 shows, the response function of neurons within one such population can be quite different from one another. This might explain why attempts to find a map-like organization in MST based on physiological properties such as the selectivity for specific flow patterns have failed.
Properties of single-model neurons
All individual neurons in one column respond to optic flow patterns. However, their optic flow response properties are determined not solely from their position in the heading map, but also from the locations of their first-layer inputs and connections. Thus, different individual neurons might display different optic flow tuning. Such a map structure might explain why many researchers have failed to find a clear-cut topographic order in MST. In this model, neither the receptive field position nor a selectivity for certain optic flow components would necessarily display a topographic order. Rather, a certain orderly arrangement of columns of neurons encoding specific heading directions in a population code would be expected.
Neurons in the model are designed to perform a specific task, namely, to compute the direction of heading. However, they also show specific properties when tested with the abstract flow stimuli that are commonly used in neurophysiological research on optic flow processing. Typically, these stimuli consist of basic flow components such as pure expansions/contractions or pure rotations. When stimulated with such input stimuli, their activity is modulated by the position of the singular point of the optic flow stimulus within the visual field. A singular point of an optic flow field is defined as a point where the optical velocity vanishes. For an expansion/contraction stimulus, the singular point is the focus of expansion or contraction. For a rotation stimulus, the singular point coincides with the retinal projection of the axis of rotation. The response modulation is characterized by complementary response fields for expansion/contraction and clockwise/counterclockwise rotation. For instance, an individual model neuron might favor expansions with the singular point in the left hemifield and contractions with the singular point in the right hemifield, i.e., it reverses its selectivity from expansion to contraction as the singular point is moved in the visual field. In addition, a second-layer cell that is excited by one type of motion at a specific location of the singular point will be inhibited by the reversed motion with the same location of the singular point. This inhibition is an important requirement, because it allows the population to retain a medium activity even when some neurons are excited by the stimulus.
The model neurons also respond to frontoparallel, two-dimensional, unidirectional motion in a direction-selective manner. As with the other optic flow responses, this directional selectivity is also a reflection of the functional requirements of the task of heading detection from optic flow. It does not imply that a neuron is specifically tuned to translation, i.e., that all of its inputs from the first layer have the same preferred direction. Instead, a neuron receives input from many cells with different preferred directions and uses a complex weighting scheme for these inputs. However, if all first-layer neurons are stimulated by a large field translation, then there is always one direction of translation for which the input for a given second-layer neuron is maximal and one for which it is minimal. Thus, this neuron will appear direction-selective, even though it does not receive restricted input only from cells with the same preferred direction.
An example of the responses of a single second-layer model neuron to several optic flow stimuli is shown in Figure 4. It is important to note that, similar to the experimental methods described later, the optic flow stimuli always covered a full central 90° × 90° of the (simulated) visual field. Thus, only the position of the singular point was moved, not the stimulus itself. The example neuron in Figure 4 responds differentially to expanding, contracting, and rotating flow stimuli, depending on where in the visual field the singular point of the flow stimulus is located. For very large displacements of the singular point, i.e., when the singular point is moved from the lower left to the upper right corner of the visual field, the neuron reverses its selectivity from expansions to contractions. For smaller displacements of the singular point, the neuron displays a position invariant selectivity in large parts of the visual field. For instance, the selective response to counterclockwise rotations stays the same within an area covering the left hemifield and extending at least 20° into the right hemifield. The simulated receptive field of this model neuron covers the lower left quadrant. It extends up to 10° into each of the other quadrants. Thus, for this neuron, the reversals of selectivity occur only when the singular points of the respective flow patterns are placed outside its receptive field. Within the receptive field, the neuron responds selectively only to expansions and counterclockwise rotations, and both responses are position-invariant. In addition to the optic flow responses, the neuron also responds to full-field unidirectional motion, favoring movements toward the upper right.
It is also possible that single model neurons do not respond selectively to all of the basic flow patterns. For instance, a second model neuron, shown in Figure 5, lacks any selectivity for clockwise versus counterclockwise rotations. The responses to expanding or contracting patterns remain dependent on the location of the singular point. In addition, the neuron is also direction-selective. The preferred direction for full-field unidirectional motion for this neuron is toward the lower right. The simulated receptive field covered the full 90° × 90° visual field. The difference in the response selectivity of the two model neurons stems from a difference in how they account for visual disturbances caused by eye movements that might occur during egomotion. For an analytical derivation of these properties, see Lappe and Rauschecker (1993a).
An interesting property of the optic flow-selective neurons in MST has been reported by Graziano et al. (1994). Many cells responded very well to linear combinations of expanding/contracting and rotating patterns, i.e., to spiral motion. We have not included spiral motions in our study, but responses to spiraling optic flow patterns are also observed in model neuron simulations. Spiraling optic flow patterns often occur in everyday egomotion conditions and are very efficient stimuli for the human heading-detection system (Lappe and Rauschecker, 1995). For instance, they result when, during egomotion with respect to a ground plane, the gaze is stabilized on a ground plane target by appropriate eye movements. For the heading-detection system implemented by the model, selective responses to spiraling patterns are a natural consequence. Thus, although the simulations and recordings described in this paper were all performed with basic optic flow patterns such as pure expansions, rotations, or translations, the responses of the model neurons are not restricted to these basic patterns. The model neurons do not separate the optic flow into isolated basic components but, rather, form a continuum of selectivities, in which some neurons respond stronger to spiraling patterns at certain positions of the singular point, whereas other neurons respond stronger to the pure flow patterns.
The model neurons in Figures 4 and 5 clearly represent idealized response properties obtained from a mathematically optimal network. Such idealized responses cannot be expected from real neuronal data. However, from the model simulations, a number of predictions can be made that can be tested experimentally. First, neuronal activities will depend on the position of the singular point in a full-field optic flow stimulus. Reversals of selectivity might occur when the singular point is displaced. Second, best responses to opposing stimuli (expansion vs contraction, clockwise vs counterclockwise) will occur for diametrically opposite locations of the singular point with respect to the center of the visual field. Third, excitation by one type of motion at a particular location of the singular point will be paired with inhibition by the opposite type of motion. Fourth, maximum activities will occur for peripheral locations of the singular point. In general, all of the effects should be best observable for large distances of the singular point from the center of the visual field.
EXPERIMENTAL RESULTS
We recorded from 134 neurons. A total of 98 neurons could be tested with expansion/contraction stimuli at different locations of the singular point. Of these 98 neurons, 88 were tested with the expansion/contraction stimuli at 40° eccentricity. Thirty-one neurons were tested with the expansion/contraction stimuli at 15° eccentricity. Twenty-one neurons were tested with both sets of expansion/contraction stimuli. Rotational optic flow stimuli were tested less often. A total of 53 neurons were recorded with rotational flow patterns, 26 with the 15° stimuli and 45 with the 40° stimuli. A total of 18 neurons were tested with 15° and 40° rotation stimuli.
Basic properties
Most neurons we encountered could be well driven by visual stimulation. Visual receptive field dimensions of these neurons were usually large to very large, often covering the whole 90° × 90° screen. However, as has been noted by previous researchers, the receptive fields were sometimes difficult to map, because the responses depended on the stimulus used (bar or random dot pattern), and also because in some neurons inhomogeneities in the receptive fields were observed. However, because our experimental paradigm as well as the natural situation during egomotion involved only full-field stimulation, we considered the estimates obtained with the hand-held projector to be sufficient.
Many neurons responded well to the optic flow stimuli. Most of these neurons also displayed a broad direction selectivity for full-field, frontoparallel, unidirectional motion. However, in 57 (70%) of 81 neurons that were compared, the response recorded during an optic flow stimulation exceeded the response elicited with the unidirectional motion. When making this comparison, it is important to bear in mind that the frontoparallel motion stimuli were not directly comparable to the optic flow stimuli in terms of speed, direction, or dot size. Instead, we used optimized stimuli for the frontoparallel motion responses. These stimuli were chosen from a large set of stimuli differing in speed, direction, stimulus size, dot size, etc., so as to elicit an optimum response of the individual neuron. Thus, although the frontoparallel motion stimuli and the optic flow stimuli were not directly equivalent, we think that the comparison nevertheless provides a conservative assessment of the relative response strengths to frontoparallel motion and optic flow. In addition to visual responses, some neurons also showed pursuit-related activity or extraretinal modulations by eye-position (Bremmer and Hoffmann, 1993; Bremmer et al., 1996).
For most neurons, the recorded activities during the optic flow stimulations depended on the position of the singular point on the tangent screen. Usually, the selectivity for an optic flow pattern could be changed by changing the placement of the singular point. Often, however, a reversal of selectivity from expansion to contraction, or from clockwise to counterclockwise rotation, did not occur within the stimulus set that included only the 15° eccentric positions. But in this case, selectivity reversals could usually be induced using the 40° eccentric stimulus set. Figure 6 shows spike trains and peristimulus time histograms for a neuron tested with the 15° expansion/contraction and the 15° and 40° rotation stimuli. The arrangement of the histograms reflects the screen location of the singular point in the different stimulations. If the singular point is placed in the left to upper left part of the visual field, the neuron fires with increased firing rate in the contraction phase. In contrast, if the singular point is placed in the right hemifield, the neuron fires more strongly in the expansion phase. Within the 15° rotation stimuli (inner histograms in Fig. 6B) no such reversal of selectivity is observed. At most positions within the central 30° of the visual field, the activity of the neuron is larger in the clockwise rotation phase than it is in the counterclockwise rotation phase. However, if the responses to the 40° rotation stimuli are considered (outer histograms in Fig. 6B), it becomes apparent that the activity of the neuron during counterclockwise rotation increases when the singular point is located in the upper left periphery. Thus, the neuron favors clockwise rotations in most of the visual field, but reverses its selectivity in a restricted peripheral area, similar to the model neuron in Figure 4. The receptive field of the neuron covered the entire left hemifield with an area of increased excitability covering the lower left quadrant. Preferred direction for frontoparallel unidirectional motion was toward the right.
The main goal of this study was to determine the shape of the activity modulation of MST neurons when the position of the singular point in the visual field is varied, and to compare it to the response curves obtained in computer simulations. In Figure 7, the activity profiles of an MST neuron for the different optic flow stimuli are plotted as three-dimensional surface graphs. Smooth activity slopes in response to expansion/contraction can be seen to agree with the expansion/contraction response functions of the model neurons in Figures 4 and 5. Activities during rotational stimulation were recorded only with the 15° stimuli set. A strong response to counterclockwise rotation and a dependence on the position of the singular point are apparent. In mapping the receptive field of this neuron, some response could be elicited from all over the visual field, but increased responsiveness was obtained from only the lower left quadrant of the visual field. The neuron was also direction-selective for full-field unidirectional motion, favoring directions toward the lower left.
Evaluation of model predictions
To evaluate the predictions of the model for the recorded neurons, we computed the percentages of neurons that were consistent with the predictions.
Reversals of selectivity for large displacements of the singular point
For each neuron tested with a set of expansion/contraction stimuli, we computed the difference of the mean spike rate during expansion and the mean spike rate during contraction for each of the nine locations of the singular point. If at one location of the singular point a cell responded more strongly to expansion than to contraction, and if at a different location of the singular point the same cell responded more strongly to contraction than to expansion, then direction indices (DI) for this pair of locations were computed, following the standard formula: The neuron was counted as reversing its selectivity when both direction indices exceeded a value of 0.5. The same procedure was applied to clockwise and counterclockwise rotations.
The percentage of neurons that displayed reversals of selectivity is listed in Table 1 for the various sets of stimuli used. Table 1 shows that most neurons reverse their selectivity depending on the position of the singular point, consistent with the model prediction. Also consistent with the model prediction, the reversals become more prominent when the singular point of the optic flow stimulus is moved further in the periphery of the visual field.
Complementary response fields
We next tested the model prediction that best responses to opposing stimuli (expansion vs contraction, clockwise vs counterclockwise rotation) should occur for opposite locations of the singular point in the visual field. Only those neurons that displayed a reversal of selectivity according to the above criteria were considered. For each type of motion, we determined in which direction from the visual field center the area of best response was located. To obtain this direction, we computed the gradient of a two-dimensional regression on the nine activities recorded for a given motion type and stimulus set. The gradients for opposite types of motion were then compared to each other. If the gradient for expansion and the gradient for contraction pointed in directions more than 90° apart, the cell was considered to have complementary response fields for expansion/contraction. Complementary response fields for clockwise and counterclockwise rotation were determined analogously. The percentages of neurons that had complementary response fields are shown in Table 2. The data in Table 2 conform with the predictions from the model. Best responses to opposing stimuli occur at opposite locations in the visual field. Again, the result is clearest when the neurons were tested with the 40° stimuli.
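The gradient test described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the regression is taken to be a plane fit (activity = a·x + b·y + c), the nine singular-point locations are assumed to be the center plus eight points on a ring at 45° spacing, and the function names and the synthetic activities are hypothetical, not taken from the recordings.

```python
import numpy as np


def response_gradient(positions, activities):
    """Fit activity = a*x + b*y + c by least squares and return (a, b)."""
    positions = np.asarray(positions, dtype=float)
    design = np.column_stack([positions, np.ones(len(positions))])
    coeffs, *_ = np.linalg.lstsq(
        design, np.asarray(activities, dtype=float), rcond=None
    )
    return coeffs[:2]


def angle_between(g1, g2):
    """Unsigned angle in degrees (0..180) between two nonzero 2-D vectors."""
    cosang = np.dot(g1, g2) / (np.linalg.norm(g1) * np.linalg.norm(g2))
    return float(np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0))))


def has_complementary_fields(positions, act_a, act_b):
    """True if the two response gradients point more than 90 deg apart."""
    g_a = response_gradient(positions, act_a)
    g_b = response_gradient(positions, act_b)
    return angle_between(g_a, g_b) > 90.0


# Assumed layout: center plus eight locations at 40 deg eccentricity.
ecc = 40.0
ring = np.radians(np.arange(0, 360, 45))
POSITIONS = np.vstack(
    [[0.0, 0.0], np.column_stack([ecc * np.cos(ring), ecc * np.sin(ring)])]
)

# Synthetic activities (spikes/sec), invented for illustration only.
expansion = 20.0 + 0.30 * POSITIONS[:, 0]
contraction = 20.0 - 0.25 * POSITIONS[:, 0] + 0.05 * POSITIONS[:, 1]

complementary = has_complementary_fields(POSITIONS, expansion, contraction)
```

With these synthetic values the expansion gradient points to the right and the contraction gradient points mostly to the left, so the cell would be classified as having complementary response fields.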
Inhibition
Activities dropping below the background level during an optic flow stimulation were observed frequently. Table 3 lists the percentages of neurons for which the activity during an optic flow stimulation dropped below the background level at one or more locations of the singular point. Table 3 shows that inhibition by a nonpreferred optic flow pattern is a common finding that is in agreement with the model. Also, consistent with previous authors (Lagae et al., 1994), we found the background activity in MST to be relatively high. Median background activity for our sample of neurons was 12 spikes/sec.
Activity maxima in the periphery
A fourth prediction from the model simulations was that maximum response should occur in the periphery. We next tested whether the maximum activities for a given optic flow pattern occurred at the central position of the singular point in the visual field or at one of the peripheral locations. Table 4 shows that for the majority of the neurons, maximum activities occurred at one of the peripheral positions. However, with the 40° stimulus set the percentages are near or below the level of chance, which is 89%, because eight peripheral but only one central location had been tested. But from a closer inspection of Figure 4 one can deduce that for the model neurons, the maximum activity might already be approximately reached at the central position, even though the activity modulation is monotonically increasing toward the periphery. An inspection of the measured activities of those MST neurons that failed to show the maximum activity in the periphery revealed that the prevalent response characteristic is that of a maximum in the center paired with an activity of almost the same strength at one or more peripheral locations (Fig. 8A). Truly bell-shaped response curves with a clear single peak in the center (Fig. 8B) were rare. Less than half of those neurons that had a maximum response in the center displayed a single peak response (6 of 14 for expansion/contraction, 4 of 11 for rotation).
Average response curves
To compare further the shape of the activity modulations in MST to those obtained in model simulations, we wanted to generate an average response curve for the population of neurons recorded. The procedure used to generate an average response curve consisted of two steps. First, the response curves from individual neurons had to be aligned. Second, the average over the aligned curves had to be determined. For the alignment, the directions of the areas of best response were used, which were introduced above. Response curves from individual neurons were rotated in the (x,y)-plane in such a way that their response gradients all pointed in the same direction. To enable the averaging, this rotation had to be performed in discrete steps of 45°. Average response curves were then obtained by averaging over the responses of all individual neurons, separately for each location of the singular point.
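The two-step procedure (alignment in 45° steps, then averaging per location) can be sketched as follows. This is a minimal sketch, assuming that each neuron's responses are stored as one central value plus a ring of eight peripheral values ordered by direction (0°, 45°, ..., 315°); the data layout, function names, and example tuning curves are hypothetical.

```python
import numpy as np


def align_ring(ring, gradient_deg):
    """Rotate a ring of 8 peripheral responses so the gradient points to 0 deg.

    The gradient direction is rounded to the nearest multiple of 45 deg,
    so the rotation is a cyclic shift of the ring.
    """
    steps = int(np.round(gradient_deg / 45.0)) % 8
    return np.roll(np.asarray(ring, dtype=float), -steps)


def average_response(centers, rings, gradient_degs):
    """Average aligned responses separately for each singular-point location."""
    aligned = np.array(
        [align_ring(r, g) for r, g in zip(rings, gradient_degs)]
    )
    return float(np.mean(centers)), aligned.mean(axis=0)


# Two synthetic neurons with the same tuning shape but different
# preferred directions (90 and 225 deg); values are illustrative only.
directions = np.arange(8) * 45.0
ring_a = 10.0 + 5.0 * np.cos(np.radians(directions - 90.0))
ring_b = 10.0 + 5.0 * np.cos(np.radians(directions - 225.0))

mean_center, mean_ring = average_response(
    centers=[10.0, 10.0],
    rings=[ring_a, ring_b],
    gradient_degs=[92.0, 230.0],  # gradient directions, e.g. from a plane fit
)
```

After alignment both rings peak at the first ring position, so the average retains the shape of the individual curves instead of blurring it across directions.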
Figure 9 shows the average response curves for all neurons tested with both sets of stimuli, the 15° and the 40° set (N = 21 for expansion/contraction, N = 18 for rotation). Shown on the left are the three-dimensional surface plots also used in the single neuron examples. The arrangement of expansion/contraction and clockwise/counterclockwise rotation curves in opposite directions was justified by the observation that the average response gradients also pointed in opposite directions. The plots on the right of Figure 9 display cross sections through the midline of the response curves. There, five points were measured in a row. These plots serve to illustrate the sigmoidal shape of the response curves along the gradient direction. A comparison with Figure 4 shows that the average response curves for the MST neurons we recorded are in good agreement with the response curves of the model neurons.
One has to keep in mind, however, that averaging the responses of all recorded neurons also includes neurons with different response curves, such as the one in Figure 8. However, we believe that averaging over all neurons recorded provides the most unbiased way to determine global characteristics. This is not to say that all individual neurons behave the same. It simply helps in illustrating a prevalent response pattern.
Two-dimensional direction selectivity
Most of the MST neurons we recorded also displayed direction selectivity for frontoparallel unidirectional motion of a full-field random dot pattern. Direction selectivity, in addition to optic flow selectivity, has often been described for optic flow-responsive neurons in MST. However, different authors have put different emphasis on this observation and on its implication for the optic flow processing capabilities of these neurons. Early investigations (Saito et al., 1986; Tanaka et al., 1986; Tanaka and Saito, 1989a) required optic flow-selective neurons to be directionally unselective for unidirectional motion. Later studies have suggested that optic flow selectivity and direction selectivity can coexist and might not be related to each other (Duffy and Wurtz, 1991a,b). The finding that some neurons reverse their optic flow selectivity when the stimulus is moved such that the local motion direction in part of the receptive field is reversed was taken as evidence against an involvement of these neurons in optic flow processing (Orban et al., 1992; Lagae et al., 1994). According to this argument, only neurons that display a positional invariance when the optic flow stimulus is placed in different parts of the receptive field are considered to contribute to the optic flow analysis. On the other hand, if a neuron behaves completely position-invariant toward the retinal location of, for instance, the focus of expansion, it would also be useless for a navigational task such as heading detection (Graziano et al., 1994). Because of the network model, we are in a position to test the neuronal properties, including the relationship between direction selectivity and optic flow selectivity, in comparison to simulated neurons with a proven capability to perform a complex analysis of the optic flow.
In our sample, we often encountered a positional invariance of the optic flow responses when shifts of the position of the singular point were within the range tested in most of the above studies (≤40°). However, we usually could elicit a reversal of the selectivity when the displacement of the singular point was large (≥40°). To test whether these reversals of selectivity might be related to the direction selectivity of the neurons, it is useful to consider the following example. Consider a neuron that displays selectivity for expansions whenever the singular point is located in the right hemifield and selectivity for contractions when the singular point is located in the left hemifield. For the expansion stimuli centered on the right, most local motion directions would contain a movement component directed to the left. Similarly, for the contraction stimuli centered on the left, most local motion directions would contain a movement component directed to the left, too. Thus, if the same neuron also displays a direction selectivity for leftward motion, its two-dimensional direction selectivity and its optic flow selectivity would be in agreement. However, as mentioned above, such a consistency has often been regarded as evidence against a true involvement in optic flow processing.
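The geometric argument in this example can be checked numerically. The following sketch is a simplification and not the stimulus used in the experiments: it assumes a purely radial flow whose local velocity is proportional to the offset from the singular point (no depth variation), sampled on a regular grid over a 90° × 90° field. The function name and parameters are hypothetical.

```python
import numpy as np


def mean_flow(focus, sign=1.0, half_width=45.0, n=19):
    """Mean flow vector over the field for a radial pattern around `focus`.

    sign=+1 gives an expansion (motion away from the focus),
    sign=-1 gives a contraction (motion toward the focus).
    Velocity is assumed proportional to the offset from the focus.
    """
    xs = np.linspace(-half_width, half_width, n)
    gx, gy = np.meshgrid(xs, xs)
    vx = sign * (gx - focus[0])
    vy = sign * (gy - focus[1])
    return np.array([vx.mean(), vy.mean()])


# Expansion centered 40 deg to the right, contraction centered 40 deg to the left.
expansion_right = mean_flow(focus=(40.0, 0.0), sign=+1.0)
contraction_left = mean_flow(focus=(-40.0, 0.0), sign=-1.0)
```

Under these assumptions both stimuli yield a negative mean horizontal component, i.e., a net leftward motion across the field, which is the component a leftward direction-selective neuron would pick up.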
To see how frequent such an agreement between the optic flow selectivity and the frontoparallel direction selectivity appears in our sample of MST neurons, we calculated the angular difference between the direction of reversal of selectivity for two opposing optic flow stimuli and the preferred direction for frontoparallel, unidirectional motion. To determine the direction of reversal of the optic flow selectivity, we again computed the gradient of a two-dimensional regression on 9 or 18 data points. Because we wanted to find the direction of maximum change from, e.g., expansion to contraction, we used for the regression the difference between the activities during expansion and the activities during contraction. The obtained gradient gives the direction along which the activity of the cell during expansion increases and its activity during contraction decreases. Along this direction, the selectivity of the cell changes maximally from contraction to expansion. Consistency with the preferred direction for frontoparallel motion would imply a 180° angular difference between the two. The same argument can also be applied toward the preferred direction for frontoparallel motion and the direction of reversal of selectivity from counterclockwise to clockwise rotation. Consistency between the two directions would imply a 90° angular difference.
Figure 10, A and B, shows histograms of the angular difference between the directions of reversal of optic flow selectivity and the preferred direction for two-dimensional frontoparallel motion for all neurons tested. Indeed, for most neurons the direction of reversal is consistent with the preferred direction. This is true for both expansion/contraction and rotation stimuli. However, the same consistency is also found in simulations of model neurons. Fig. 10, C and D, shows the results of the same test run on the simulated responses from 250 randomly selected model neurons. In the model neurons, also, the direction of reversal is consistent with the preferred direction for frontoparallel unidirectional motion. The distribution of the model activities is even more peaked than for the physiological data. However, a broader distribution of the physiological data might have been expected simply because of noise in the measurements. Moreover, the direction was calculated by interpolating between eight discrete measurements, which limits the accuracy with which it can be determined. Thus, we think that model and experimental data are in agreement. In addition, from the model simulations it is evident that the observed consistency between the preferred direction of frontoparallel motion and the direction of reversal of optic flow selectivity does not oppose an involvement in optic flow processing.
Motion parallax
In our expansion/contraction stimuli, as well as in most natural situations, the flow field contains motion parallax, i.e., different visual objects move at different visual speeds according to their distance from the observer. Motion parallax is an important cue for the visual system, transmitting information about the three-dimensional layout of a visual scene (Rogers and Graham, 1979) and separating visual rotations caused by eye movements from body translations (Warren and Hannon, 1990). The response strength of MST neurons is little affected when the stimuli contain two or three different speed distributions simulating the motion of dots in different depth planes (Duffy and Wurtz, 1991a). This might indicate that they are not concerned with the recovery of spatial structure. However, motion parallax also carries important information for heading detection, albeit only in the presence of eye movements and the absence of extraretinal signals (Warren and Hannon, 1990). In simulations of psychophysical experiments, the network model also shows a dependence on motion parallax, because its robustness against noise decreases with decreasing depth range of the visual scene (Lappe and Rauschecker, 1993b).
To compare motion parallax influences on the level of single neurons, we recorded the responses of 55 neurons to expansion and contraction using stimuli containing no motion parallax. This was obtained by assuming that all random dots were distributed on a frontoparallel plane instead of in a random three-dimensional cloud. The depth range of the visual scene in this case is zero, and the distribution of speeds in the stimulus is uniform and depends only on visual eccentricity. For each neuron tested, we computed the direction of maximum change from expansion to contraction as described in the previous section. We then compared this gradient to the one obtained using the stimulus set that did contain motion parallax and computed the angular difference between the gradients in these two conditions. Figure 11 shows the angular distribution of these differences for the recorded neurons and for 84 randomly selected model neurons. In both cases, the distributions are centered around zero, but some variation occurs. The response modulations are very similar in the two conditions.
This raises the question of the origin of the psychophysical observations. The simulation results indicate that a modest variation observed in the simulated as well as the experimental data is sufficient to induce an effect similar to the effects observed in humans, i.e., an inability to detect correctly the direction of heading in the presence of slow eye movements when motion parallax is lacking (Warren and Hannon, 1990). This would suggest that a dependence of the heading detection system on motion parallax is a population effect. The dependence of individual cells on motion parallax is only weak but observable in the behavior of the complete system.
Recovering heading direction from neuronal activities
So far we have described the activity modulations of single MST neurons to optic flow stimuli that varied the location of the singular point in the flow field. We have compared these modulations to response curves obtained from single-neuron simulations in the model. For the case of our expansion stimuli, the singular point is a focus of expansion and directly indicates the direction of heading. In a final step, we wanted to test, therefore, whether a representation of the direction of heading by the population response of the recorded MST neurons is possible, as the model suggests. Unfortunately, it is impossible to use the experimental data directly in the mechanisms used by the model. This is for the following reason. Neurons in the model are organized in subpopulations, each of which represents a specific direction of heading (see Fig. 1). The assignment of a single neuron to one such population is used for the calculation of the connection strengths required to perform the task. The specific properties of the neuron, e.g., the orientations of its response curves, result partially from this assignment. But other parameters, such as the spatial distribution of inputs that the neuron receives from the first layer, also influence its response properties (see Fig. 3). Thus, from the measured response curves alone it is not possible to work backward to determine with which subpopulation the neuron is to be associated. Therefore, a direct use of the recorded neuronal activities in the computational algorithm of the model is not possible. However, it is possible to test whether the population of neurons in MST is capable in principle of representing the direction of heading in a manner similar to the model. We used a least-mean-square minimization scheme to derive the position (x,y) of the focus of expansion of an expanding flow pattern from the neuronal activities.
For each neuron, a sigmoid response curve $u(x,y)$, analogous to the one used in the model, was fitted to the recorded activities. Then the actual recorded activity $u_r$ of the neuron was used as a constraint for the location of the singular point: $u_r - u(x,y) = 0$. The constraints from the individual neurons were squared and averaged. The result is a map of the least-mean-squared errors, $U(x,y) = \frac{1}{N}\sum_{i=1}^{N} (u_r - u(x,y))^2$, for each possible location $(x,y)$ of the singular point. This map is similar to the likelihood map of heading directions shown in Figure 2.
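The minimization can be sketched in a few lines. This is an illustration under assumptions, not the procedure actually run on the recordings: the sigmoid is given a hypothetical five-parameter form, the per-neuron parameters are taken as already fitted (here they are simply drawn at random), the candidate locations lie on an assumed 5° grid, and the "recorded" activities are noise-free synthetic values.

```python
import numpy as np


def sigmoid_response(x, y, params):
    """Assumed form: u(x,y) = base + amp / (1 + exp(-(gx*x + gy*y + offset)))."""
    base, amp, gx, gy, offset = params
    return base + amp / (1.0 + np.exp(-(gx * x + gy * y + offset)))


def error_map(recorded, neuron_params, grid_x, grid_y):
    """U(x,y): mean over neurons of (u_r - u(x,y))**2 on a grid of locations."""
    gx, gy = np.meshgrid(grid_x, grid_y)
    errors = [
        (u_r - sigmoid_response(gx, gy, p)) ** 2
        for u_r, p in zip(recorded, neuron_params)
    ]
    return np.mean(errors, axis=0)


def decode_location(recorded, neuron_params, grid_x, grid_y):
    """Return the grid location (x, y) that minimizes U(x,y)."""
    U = error_map(recorded, neuron_params, grid_x, grid_y)
    iy, ix = np.unravel_index(np.argmin(U), U.shape)
    return float(grid_x[ix]), float(grid_y[iy])


# Synthetic population of 31 neurons with random gradient directions.
rng = np.random.default_rng(0)
n_neurons = 31
theta = rng.uniform(0.0, 2.0 * np.pi, n_neurons)
params = [
    (12.0, 30.0, 0.05 * np.cos(t), 0.05 * np.sin(t), rng.uniform(-1.0, 1.0))
    for t in theta
]

true_x, true_y = 15.0, 0.0
recorded = [sigmoid_response(true_x, true_y, p) for p in params]

grid = np.arange(-45.0, 46.0, 5.0)  # candidate locations in degrees
decoded = decode_location(recorded, params, grid, grid)
```

Because the synthetic activities contain no noise and the true location lies on the grid, the minimum of the map falls at the true location; with real, noisy activities the minimum would generally be displaced, which is what the reported mean error quantifies.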
Grayscale plots of such heading maps obtained from the recorded neuronal activities are shown in Figure 12. The gray value at each map position corresponds to the magnitude of U(x,y). Brighter gray levels indicate small values of U(x,y). For these plots, all neurons recorded with the 15° expansion stimuli were used (N = 31). As in Figure 2, the most likely heading direction, implicated by the focus of expansion in the stimulus and represented by the neuronal activities, is given by the brightest square in the map. For comparison with the true heading direction, the nine optic flow stimuli used are plotted on top of the grayscale maps. It is evident that the potential to represent the direction of heading in the situation we studied is present in the neuronal population activity. A computation of the mean error over all nine positions shows that with this simple procedure, the direction of heading, i.e., the location of the singular point, could be retrieved from the neuronal activities with an average precision of 4.3°. This error has to be compared to an average error of 2.5° obtained in a human psychophysical study with comparable stimuli (Warren and Kurtz, 1992). Thus, the precision appears to be in the range of the human data, especially if one considers that only 31 neurons contributed to the computation.
We feel that it is very important to add two caveats, however. First, we want to emphasize again that the procedure used to determine the direction of heading from the neuronal activities differs from the procedure used by the model. Thus, the possibility of recovering the direction of heading from the neuronal activities can only be taken as an indication that this capability is present in MST, not that area MST essentially operates in this manner. In fact, a direct computation of the least-mean-square error by the nervous system is difficult to imagine. The neural network model outlined above is a much more biologically plausible way to achieve the same result. However, because of its structure it would require a much larger number of neurons to achieve similar accuracy. The simulations used up to 16,000 neurons. This is partly because the model also includes the means to cope with ongoing slow eye movements, and partly because it uses an excessive population coding in which the actual computation is spread out over many neurons. Second, it is important to note that the procedure outlined above is only capable of locating the singular point in an optic flow pattern. The one-to-one correspondence between the direction of heading and the retinotopic location of the singular point of an expanding flow pattern only holds under limited conditions, namely, when no eye movements occur. However, when eye movements occur during locomotion, the location of the singular point is different from the direction of heading (Regan and Beverley, 1982; Warren and Hannon, 1990; Lappe and Rauschecker, 1995). So far, we can only claim that the neuronal population in MST can recover the direction of heading for the specific and limited set of stimuli we used.
DISCUSSION
We recorded activities from single neurons in area MST of the macaque monkey during full-field optic flow stimulation and compared them to simulations of a neural network model of heading detection. The neuronal activities in MST are modulated by the retinal position of the singular point of a flow pattern. We demonstrated that these activity modulations could enable the MST population to determine the location of the focus of expansion and, hence, the direction of heading in the case of our stimulus set.
Compliance with model predictions
Work on modeling the optic flow processing capabilities of human observers (Lappe and Rauschecker, 1993b, 1994) resulted in a number of predictions for the properties of optic flow processing neurons. Our experiments were designed to investigate whether optic flow-responsive neurons in area MST conformed with these predictions. We found the majority of the neurons in good agreement. Neuronal activities depended on the position of the singular point. Reversals of selectivity occurred as the singular point was moved a large distance across the visual field. Best responses to opposing stimuli occurred for opposite locations of the singular point within the visual field. Substantial background activity occurred in the absence of visual stimulation. Excitation by one type of motion at a particular location of the singular point was often paired with inhibition by the opposite type of motion. Activity maxima often occurred for peripheral locations of the singular point.
In addition, a correlation between the reversals of optic flow selectivity and the preferred directions for frontoparallel, two-dimensional motion that was evident from the recorded data was similarly found in model simulations. Motion parallax influenced MST neurons in much the same way as it did model neurons.
The clearest demonstration of the similarity between model simulations and experimental data was seen in the average response functions of the recorded neurons. The complementary response characteristics and the sigmoidal modulation by the position of the singular point are immediately apparent in the recorded data.
Our present analysis is based on mean firing rates. We were able to show that the mean firing rates could provide information necessary to determine the focus of expansion. A question might be whether temporal variations in the firing rate could also convey useful information. We think that this is not so much the case for egomotion parameters but possibly for information about the complexity of the flow field and about the structure of the environment. This view is based on a recent related study (Pekel et al., 1996) that compared simulated egomotions in complex, colored, realistic environments with responses to simple random-dot flow fields. In these comparisons, mean spike rates and neuronal optic flow response characteristics calculated from mean spike rates were very similar and independent of environmental features. However, in complex environments neuronal firing patterns showed significantly more temporal variation than the responses to simple homogeneous stimuli.
Comparison to previous studies in MST
Our experimental paradigm differed from most previous studies in essentially three respects. We used full-field stimulation, evaluated the data with respect to the singular point in the stimulus, and used realistic flow fields simulating egomotion in a three-dimensional environment. Nevertheless, a qualitative comparison with previous results reveals that our data are consistent with most of the earlier findings. Most neurons possessed very large receptive fields, responded preferentially to large stimuli, and were direction-selective for two-dimensional motion. Some neurons also exhibited pursuit-related activity. These are well established characteristics of MST cells (Desimone and Ungerleider, 1986; Komatsu and Wurtz, 1988; Newsome et al., 1988; Erickson and Dow, 1989). A frontoparallel direction selectivity in addition to optic flow responses has been described by several authors (Duffy and Wurtz, 1991a,b; Orban et al., 1992; Lagae et al., 1994). In fact, a continuum of selectivities was found with respect to the responses to expansion/contraction, rotation, and translation. Although this makes a distribution of neurons into different classes difficult, a comparison to the classification proposed by Duffy and Wurtz also shows that our sample of cells is in good agreement with the published data (Table 5).
Reversals of optic flow selectivities depending on the placement of the stimulus have been observed previously. However, the extent to which these reversals occurred seems to depend on the exact experimental paradigm. Duffy and Wurtz (1991a,b) reported that between 16 and 40% of the neurons showed reversals of selectivity. Orban et al. (1992) and Lagae et al. (1994) reported that 60% of MST neurons that showed direction-selective responses to optic flow reversed their selectivity when the stimulus was moved. On the other hand, Graziano et al. (1994) reported that all of the cells they recorded kept their selectivity when the stimulus was displaced. In our sample, the selectivity reversals were very prominent. However, a true comparison of the percentages is difficult because of the above mentioned differences in stimulus size, placement, and structure. But an important factor is the range of the displacement used. Our results show that a reversal of selectivity occurs more frequently when the displacement of the position of the singular point is large (40–80°) than when only a limited area is tested (15–30°). This is consistent with model simulations. The ranges used previously were 40° in the studies by Orban et al. (1992) and Lagae et al. (1994), 66° in the study of Duffy and Wurtz (1991a,b), and 10° in the study of Graziano et al. (1994). Thus, an increase in the number of reversals with the range of displacement tested is also indicated by these results.
Another possibility for the differences in the number of neurons that reverse their optic flow selectivity could be the difference in the recording sites. In the above studies, most optic flow-selective neurons were recorded in the dorsal part of area MST. Our recordings were performed in the posterior bank of the superior temporal sulcus, in a part of area MST close to the border with area MT and the fundus of the sulcus. However, most of the properties of the neurons, such as receptive field size, optic flow selectivity, and preferred stimulus size, were very similar to the properties of neurons in the dorsal part of MST. Lagae et al. (1994), who also recorded a number of neurons in the fundus of the STS, found these neurons to be similar to the ones they recorded in dorsal MST.
Tanaka et al. (1993) reported neurons in ventral MST that preferably responded to small stimuli. They proposed that these neurons analyze object motion. But neurons in dorsal MST also respond well to small stimuli and have been suggested to analyze object motion instead of self-motion (Graziano et al., 1994). The neurons we recorded do not seem to fall in this category. Instead, they responded preferentially to large flow patterns, and they clearly have the potential to contribute to an analysis of self-motion. Mathematically, the mechanisms necessary to analyze object motion in three-dimensional space are similar to those necessary to analyze self-motion. A strict differentiation between object and self-motion might be somewhat artificial. Even the preferred stimulus size might not be a decisive criterion. Psychophysically, effects clearly related to self-motion can also be observed with rather small optic flow stimuli (Anderson and Braunstein, 1985; Warren and Kurtz, 1992).
Recently, Duffy and Wurtz (1995) performed a study with a stimulation paradigm similar to ours. They also found that the responses of individual neurons change with the position of the singular point in the visual field. They concluded that MST might contain a map of heading space. Overall, our findings are in agreement with their study. However, Duffy and Wurtz found a larger percentage of neurons that exhibited a peak of response activity to centered optic flow patterns, whereas we found a more equal distribution. A possible explanation might be that our sample of neurons contained a slightly lower percentage of cells that respond to only expansion/contraction or rotation (single component neurons). In their study, these neurons are the ones that display an increased selectivity for centered flow stimuli, whereas neurons that responded to several flow components also displayed an equal distribution of peak positions.
Comparison to other model conceptions
One of the most influential ideas in optic flow research is the decomposition hypothesis, i.e., the mathematical observation that any optic flow field can locally be linearly decomposed in a number of basic flow components. However, recent neurophysiological studies have convincingly demonstrated that MST neurons do not linearly decompose the optic flow into these basic components. Neurons respond to several basic components (Duffy and Wurtz, 1991a,b), cannot extract a preferred component when a different component is superimposed (Orban et al., 1992; Lagae et al., 1994), and often respond better to combinations of basic components (Graziano et al., 1994).
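For reference, the mathematical decomposition itself can be written out in a few lines. The sketch below illustrates the linear algebra only and makes no claim about neuronal processing; the function names and the example values are hypothetical. It takes the four first-order spatial derivatives of a local flow field (u, v) and returns divergence, curl, and the two deformation components, from which the derivatives can be recovered exactly.

```python
def decompose_flow_gradient(ux, uy, vx, vy):
    """Split the local velocity gradient into elementary flow components.

    ux = du/dx, uy = du/dy, vx = dv/dx, vy = dv/dy for image velocity (u, v).
    """
    return {
        "divergence": ux + vy,      # expansion / contraction
        "curl": vx - uy,            # rotation
        "deformation_1": ux - vy,   # stretch along x, compression along y
        "deformation_2": uy + vx,   # stretch along the diagonals
    }


def recompose_flow_gradient(components):
    """Invert the decomposition; returns (ux, uy, vx, vy)."""
    div = components["divergence"]
    curl = components["curl"]
    def1 = components["deformation_1"]
    def2 = components["deformation_2"]
    ux = 0.5 * (div + def1)
    vy = 0.5 * (div - def1)
    uy = 0.5 * (def2 - curl)
    vx = 0.5 * (def2 + curl)
    return ux, uy, vx, vy


# Pure expansion about the origin: u = k * x, v = k * y (k = 0.5, invented).
expansion = decompose_flow_gradient(0.5, 0.0, 0.0, 0.5)

# Pure counterclockwise rotation: u = -w * y, v = w * x (w = 0.3, invented).
rotation = decompose_flow_gradient(0.0, -0.3, 0.3, 0.0)

# A mixed field decomposes and recomposes without loss.
mixed = (0.4, -0.1, 0.25, 0.05)
round_trip = recompose_flow_gradient(decompose_flow_gradient(*mixed))
```

A unit that implemented this decomposition would respond to exactly one of the four components and be unaffected by the others, which is the behavior the cited recordings did not find.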
The properties of the model neurons and our experimental results are in line with the observation that MST neurons do not linearly decompose the optic flow into basic components. Instead, the model neurons act as part of a detection scheme for specific self-motions. The selective responses are a consequence of the neuronal selectivity for the direction of heading.
Another conception of the role of MST neurons in heading detection was put forward recently by Perrone and Stone (1994). In their model, individual neurons code for individual directions of heading, by forming templates for individual flow fields. Because, strictly speaking, an infinite number of possible flow fields would require a large number of individually tuned detectors, a couple of simplifications were made so as to keep the number of detectors required reasonable. However, this approach would require that certain heading detection neurons would individually detect the position of the focus of expansion. A template for a specific location of the focus of expansion would result in a neuron that, much like the computational maps we obtained for the population activity, would exhibit a bell-shaped tuning curve for expansional flow with a single peak at a preferred location of the focus of expansion in a full-field flow pattern. Such a tuning curve was seldom found. Of 134 neurons recorded, only 10 neurons conformed with this prediction. Also, the reversals of selectivity that we observed frequently are difficult to explain in the context of matching the optic flow input to templates.
When a single neuron cannot detect the direction of heading by itself, the question remains whether the population activity needs to be explicitly evaluated in another neuronal structure or even by specialized neurons within the same area. The process of combining individual responses to determine population responses in Figure 3 shows that by a simple summation, selective responses to specific directions of heading could be achieved. Such a summation could be explicitly performed by individual neurons. In this case, an individual peak-shaped selectivity towards a specific direction of heading would result (Fig. 3E). Some MST neurons display a peak-shaped response dependence like this (see also Duffy and Wurtz, 1995) and might individually prefer a single direction of heading. As already put forward in Lappe and Rauschecker (1993b), such neurons might read out the activity of MST subpopulations provided by the more basic sigmoidal-shaped response curves. However, from the computational point of view taken by the model, one would also assume such neurons to be rare, because much more effort is required to establish the population activity than to evaluate it. This might explain the low frequency with which we encountered such cells.
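How a simple summation of sigmoidal responses can yield a peak-shaped selectivity is easy to see in one dimension. The sketch below is a generic illustration, not the model's actual connectivity: it assumes two units with logistic tuning of opposite slope whose inflection points are offset to either side of a preferred heading, and all names and parameter values are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def rising_unit(pos_deg, threshold_deg, width_deg=8.0):
    """Unit whose response increases as the singular point moves rightward."""
    return sigmoid((pos_deg - threshold_deg) / width_deg)


def falling_unit(pos_deg, threshold_deg, width_deg=8.0):
    """Unit with the complementary, decreasing response profile."""
    return sigmoid(-(pos_deg - threshold_deg) / width_deg)


def summed_readout(pos_deg, preferred_deg, half_span_deg=15.0):
    """Sum of two opposed sigmoids with offset inflection points."""
    return (
        rising_unit(pos_deg, preferred_deg - half_span_deg)
        + falling_unit(pos_deg, preferred_deg + half_span_deg)
    )


# Sample the read-out along the horizontal meridian (invented range).
positions = list(range(-40, 41, 5))
readout = [summed_readout(p, preferred_deg=10.0) for p in positions]
peak_index = max(range(len(positions)), key=readout.__getitem__)
peak_position = positions[peak_index]
```

Each input unit on its own is monotonic in the position of the singular point, whereas their sum rises toward the preferred heading from both sides and falls back to a common baseline far from it.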
Heading detection in area MST
We showed that a distributed encoding of the direction of heading is possible in area MST. This distributed encoding is an implicit representation of an external parameter. It is similar to the distributed representation of external space in parietal cortex (Zipser and Andersen, 1988; Pouget et al., 1993), of reaching movements in motor cortex (Georgopoulos et al., 1986), or of gaze shifts in the superior colliculus (Van Gisbergen et al., 1987; Lee et al., 1988). The recorded neurons retain some essential features in accordance with the properties of model neurons, which are designed to solve the task of heading detection from optic flow. However, the important issue of eye movements during egomotion also needs to be considered. Eye movements change the pattern of motion on the retina and destroy the one-to-one correspondence between the direction of heading and the retinal location of the focus of expansion. Neurons in the model are designed to cope with this situation. Some properties of neurons in area MST also indicate that the potential to deal with eye movements during egomotion is present. Many optic flow-responsive neurons also respond to frontoparallel, unidirectional motion which, to a first degree, is an approximation of the flow field generated by an eye movement. This frontoparallel direction selectivity might reflect the use of visual cues in dealing with eye movement issues during egomotion. It also exists in the model neurons. More important, the activity of some MST neurons is modulated when the animal actively performs an eye movement (Komatsu and Wurtz, 1988; Newsome et al., 1988; Erickson and Thier, 1991). These properties might be part of a mechanism that supports the estimation of egomotion parameters by using extraretinal information (Lappe et al., 1994). This approach predicts that neuronal response curves shift along with the eye movement to compensate for the transformations of the retinal flow field. 
Some preliminary reports show an interaction between optic flow responses and pursuit related activity in single MST neurons (Duffy and Wurtz, 1994; Lappe, 1996a). However, the functional integration of these properties with the optic flow responses still awaits further study.
An additional cue present in real-life situations is stereoscopic depth. It has been shown recently that the human visual system uses stereoscopic depth cues in the analysis of optic flow fields and the determination of the direction of heading (van den Berg and Brenner, 1994). It is a reasonable speculation that optic flow processing neurons in MST might modulate their responses dependent on disparity. Using frontoparallel translational motion stimuli, selective responses to meaningful combinations of motion and disparity have already been described in areas MT (Bradley et al., 1995) and MST (Roy and Wurtz, 1992). In fact, the specific combination of translational motion and disparity in area MT has been implicated to account for the use of disparity in human optic flow processing on theoretical grounds (Lappe, 1996b).
We have shown that a small number of MST neurons could already carry enough information to recover the direction of heading in limited circumstances. We have also argued that a biologically plausible mechanism for heading detection would require considerably more neurons to address adequately the needs of self-motion computation in general circumstances, for instance, when eye movements occur. Still, the problem of the determination of self-motion from optic flow might not require the devotion of an entire brain area to it. In fact, many reports have shown that MST is also involved in a different behavioral task, namely, the generation of smooth pursuit and optokinetic eye movements (Dürsteler and Wurtz, 1988; Komatsu and Wurtz, 1988; Erickson and Dow, 1989; Kawano et al., 1994). However, the relationship between retinal flow and eye movements is close. On the one hand, eye movements also heavily influence optic flow fields. On the other hand, any self-motion immediately poses a challenge to the stability of the retinal image, which could lead to stabilizing eye movements. Thus, functionally locating both tasks in the same area is sensible. It is conceivable that optic flow processing neurons might also contribute to the generation of eye movements in a process not yet investigated. The complexity of the task of heading detection from optic flow also increases when, instead of the direction of heading in retinotopic coordinates, a heading signal in exocentric coordinates is required. Although this report has been concerned only with retinotopic representations, there is some indication that a step from retinotopic to exocentric coordinates might also occur already in area MST. MST neuronal responses are modulated by eye position (Bremmer et al., 1996) in a way similar to neurons in higher parietal areas (Andersen and Mountcastle, 1983).
In the sense of a distributed encoding of exocentric spatial position (Zipser and Andersen, 1988; Pouget et al., 1993) combined with optic flow selectivity, area MST could already contain information to encode the direction of self-movement in extrapersonal space.
Footnotes
This work was supported by DFG La 952/1-1, DFG SFB 509, and Esprit Insight II.
Correspondence should be addressed to Dr. Markus Lappe, Department of Zoology and Neurobiology, Ruhr University Bochum, D-44780 Bochum, Germany.