Volume 16, Number 19,
Issue of October 1, 1996
pp. 6265-6285
Copyright ©1996 Society for Neuroscience
Optic Flow Processing in Monkey STS: A Theoretical and
Experimental Approach
Markus Lappe,
Frank Bremmer,
Martin Pekel,
Alexander Thiele, and
Klaus-Peter Hoffmann
Department of Zoology and Neurobiology, Ruhr University Bochum,
D-44780 Bochum, Germany
ABSTRACT
How does the brain process visual information about self-motion? In
monkey cortex, the analysis of visual motion is performed by successive
areas specialized in different aspects of motion processing. Whereas
neurons in the middle temporal (MT) area are direction-selective for
local motion, neurons in the medial superior temporal (MST) area
respond to motion patterns. A neural network model attempts to link
these properties to the psychophysics of human heading detection from
optic flow. It proposes that populations of neurons represent specific
directions of heading. We quantitatively compared single-unit
recordings in area MST with single-neuron simulations in this model.
Predictions were derived from simulations and subsequently tested in
recorded neurons. Neuronal activities depended on the position of the
singular point in the optic flow. Best responses to opposing motions
occurred for opposite locations of the singular point in the visual
field. Excitation by one type of motion is paired with inhibition by
the opposite motion. Activity maxima often occur for peripheral
singular points. The averaged recorded shape of the response
modulations is sigmoidal, which is in agreement with model predictions.
We also tested whether the activity of the neuronal population in MST
can represent the directions of heading in our stimuli. A simple
least-mean-square minimization could retrieve the direction of heading
from the neuronal activities with a precision of 4.3°. Our results
show good agreement between the proposed model and the neuronal
responses in area MST and further support the hypothesis that area MST
is involved in visual navigation.
Key words: visual motion; self-motion; heading; monkey; visual cortex;
modeling
INTRODUCTION
In the cortical motion pathway of primates, two
areas are concerned with optic flow processing. The middle temporal
(MT) area (Allman and Kaas, 1971; Dubner and Zeki, 1974) contains many
direction-selective cells (Maunsell and Van Essen, 1983a,b; Rodman and
Albright, 1987) that, in principle, might form a distributed encoding
of the flow field arriving on the retina (Movshon et al., 1985;
Bülthoff et al., 1989; Wang et al., 1989; Newsome et al., 1990;
Britten et al., 1993). In the medial superior temporal (MST) area,
which follows area MT in the motion pathway (Ungerleider and Desimone,
1986; Boussaoud et al., 1990), many neurons respond to large random dot
optic flow patterns (Saito et al., 1986; Tanaka et al., 1986; Tanaka
and Saito, 1989a; Duffy and Wurtz, 1991a,b), suggesting an involvement
in the analysis of optic flow. Several studies capitalized on the fact
that, mathematically, any flow field can be locally decomposed into a
small number of basic or elementary flow components: divergence,
deformation, and curl (Koenderink and van Doorn, 1975). Optic flow
stimuli presented to MST neurons were pure expansions, contractions,
and rotations (Saito et al., 1986; Tanaka et al., 1986; Tanaka and
Saito, 1989a; Duffy and Wurtz, 1991a,b), recently augmented also by
deformations (Lagae et al., 1994) and linear combinations of
expansion/contraction and rotation (Orban et al., 1992; Graziano et
al., 1994). Consistently, it was found that most single neurons in MST
responded strongly to several of these stimuli and also to
unidirectional frontoparallel motion (Duffy and Wurtz, 1991a,b; Lagae
et al., 1994). They do not perform a mathematical decomposition of the
flow field (Graziano et al., 1994; Lagae et al., 1994). Different
conclusions have been drawn on whether these neurons, or a more
restricted subset of them, might be involved in the processing of optic
flow fields arising from egomotion. Here we propose to investigate this
question with a combination of experimental and theoretical
considerations.
Lappe and Rauschecker (1993b) have devised a network model of visual
navigation. This model generates neurons that respond to several of the
usually tested optic flow stimuli in a way similar to cells in MST
(Lappe and Rauschecker, 1993a). A single neuron might respond to
several optic flow patterns and also to frontoparallel, unidirectional
movement. Other single-model neurons respond selectively only to a
smaller set of basic flow patterns. However, consistent with the
findings in MST, no single neuron in the model performs a mathematical
decomposition of the flow field and detects a preferred basic
component. Rather, the model neurons are designed to contribute to the
solution of a specific and important task of optic flow analysis,
namely, the detection of the direction of heading. In a retinotopic
frame of reference, which is assumed in this paper, this would mean
specifically the determination of the direction of heading with respect
to the direction of gaze. The selective responses of neurons to certain
basic flow patterns result from their function in this task, and they
also respond to more complex flow fields, such as linear combinations
of expansion/contraction and rotation. This model can be used to derive
a number of predictions for optic flow processing neurons that differ
from the propositions used in earlier studies. Here we attempt to
perform a comparison between computer simulations of single-model
neurons and the activities of single neurons recorded in area MST with
exactly the same stimulation procedure in both cases. For the first
step of this comparison, which comprises the scope of this paper, we
focus on a simple and basic simulated egomotion optic flow, namely, a
linear translation in three-dimensional space. We will not consider the
important issue of simultaneous visual rotations attributable to eye
movements. We will, however, partially include pure rotations in the
frontoparallel plane, mostly for reasons of comparability with previous
studies.
We would like to add a few comments on rotational motion patterns and
their relation to optic flow occurring during egomotion. It is a
mathematical fact that any optic flow field can be locally decomposed
into divergence, rotation, and deformation (Koenderink and van Doorn,
1975). However, this decomposition is a local operation,
meaning that it is only defined in an infinitesimal neighborhood around
the point of interest, i.e., a very tiny patch of the optic flow field.
The large size of the receptive fields of MST neurons and their
preference for large stimuli are incongruent with a local decomposition
of the flow field by MST neurons. On the other hand, it is practically
impossible to observe a pure full-field frontoparallel rotation by any
normal type of self-motion. It would be required, essentially, to spin
around a midsagittal axis running through the center of the eye. This
movement is not in the repertoire of a normal human or monkey.
Rotational movements that occur during normal primate locomotion are
either curved paths of travel or eye rotations resulting from version
eye movements. Neither induces full-field frontoparallel rotational
motion patterns. Rather, the former results in curved trajectories of
the flow field elements over time (Warren et al., 1991). The latter
results in a distortion of the optic flow field on the retina depending
on the direction and speed of the eye movement. A limited amount of
rotational visual motion is obtained when a moving observer tracks an
eccentric point in the environment. In this case, however, the retinal
flow field resembles a spiral much more than a rotation [see Lappe and
Rauschecker (1995) for a mathematical analysis of this type of
self-motion]. So why do we include full-field frontoparallel rotations
in our study? For one reason, all previous studies have included
frontoparallel rotations in their basic stimulus set, and it serves as
a means of comparison. Moreover, the model neurons also respond to
frontoparallel rotation, thus allowing a comparison between model and
experiment. However, we would like to emphasize that the model
responses to frontoparallel rotation are not a specific
selectivity for this type of motion but, rather, a reflection of their
selectivity for heading detection.
MATERIALS AND METHODS
Single-unit recordings were performed in two awake, behaving
monkeys (Macaca mulatta) performing a fixation task. All
procedures were in accordance with published guidelines on the use of
animals in research (European Communities Council Directive
86/609/EEC). Experimental methods followed standard procedures that can
be found in more detail in Bremmer et al. (1996).
Animal preparation. The animals were surgically prepared for
chronic neurophysiological recording. The monkeys were pretreated with
atropine and sedated with ketamine hydrochloride. Under general
anaesthesia [10 mg/kg, i.v., pentobarbital sodium (Nembutal)] and
sterile surgical conditions, the animals were implanted with a chronic
device for holding the head. Two scleral search coils were implanted,
so as to monitor eye position, and were connected to a plug on top of
the skull. A recording chamber for introducing a guide tube and an
electrode through the intact dura was implanted over a trephine hole in
the skull. The chamber was placed over occipital cortex in a
parasagittal stereotaxic plane, tilted 60° off vertical. Recording
chamber, eye coil plug, and head holder were all embedded in dental
acrylic, which was connected to the skull by self-tapping screws.
Analgesics were applied postoperatively, and recording started no
sooner than 1 week after surgery.
Behavioral paradigm and recordings. During training and
recording sessions, the monkey's head was restrained on a primate
chair while he was performing a fixation task for liquid reward (apple
juice). Rewards were given for keeping the eyes within an
electronically defined window centered on the fixation target. The
fixation target (1.0° diameter) was generated by a light-emitting
diode and back-projected on a translucent tangent screen subtending
90° × 90° of visual angle at a viewing distance of 35 cm. The
fixation target was always presented in the center of the projection
screen. The monkey was required to maintain central fixation throughout
the stimulation period. Behavioral paradigm, visual stimulation, and
data acquisition were controlled by a PC (Compaq 386-25) and an
in-house-developed software package (DADA: Data acquisition and data
analysis, U. J. Ilg). At the end of the training or experimental
sessions, the monkey was returned to his home cage. The monkey's weight
was monitored daily, and supplementary fruit or water was provided.
For cell recordings, tungsten-in-glass electrodes (impedance, 1-2 MΩ
at 1 kHz) were advanced using a hydraulic microdrive (Narishige)
mounted on the recording chamber. Neuronal activity and electrode depth
were noted, so as to establish the relative positions of landmarks,
such as gray and white matter, and neuronal response
characteristics.
Visual stimulation and data analysis. When a cell was
isolated, the receptive field was mapped using a hand-held projector
while the monkey fixated a central target. Quantitative testing of
basic visual properties was performed using a galvanometer-mounted
slide projection system allowing display of light bars or random dot
patterns of different sizes. The results of these tests determined the
optimal two-dimensional stimulus including optimal speed and preferred
direction. Other tests performed on the neurons included visual
responses to pattern on- and off-set, responses during smooth pursuit
eye movements, and modulations by eye position. Quantitative
computation of the preferred stimulus direction of the neurons for
full-field frontoparallel two-dimensional motion was done by means of
the SDO analysis (Wörgötter and Eysel, 1987) using a
full-field random dot pattern.
For testing responses to optic flow stimuli, the main deviations from
previous studies are threefold and are similar to a paradigm used in a
more recent paper of Duffy and Wurtz (1995). First, we use full-field
stimulation covering a central 90° × 90° visual field. We take the
center of the visual field as a reference point for the description of
the neuronal tuning. Previous studies have often positioned stimuli
with respect to the receptive field of a neuron. Second, we varied the
position of the singular point in the full-field stimulus instead of
the position of the stimulus itself. The singular point is defined as
an idealized point in the flow field for which the visual motion is
zero, i.e., a point that remains stationary in the optic array. For
expansion/contraction stimuli, the singular point is the focus of
expansion/contraction. For rotation stimuli, the singular point is the
center of rotation. Third, unlike Duffy and Wurtz (1995), we use optic
flow fields that simulate self-motion in a visual environment
consisting of a random distribution of points in three-dimensional
space. For the case of translational forward or backward movement, this
results in a random distribution of flow field speeds in the stimulus.
Previous studies have mostly used uniform speed distributions or speed
distributions that were consistent with a movement toward a
frontoparallel plane. All of these deviations are motivated by our
intent to search for an involvement of these cells in the processing of
optic flow fields arising from egomotion. During egomotion, the entire
visual field is moving. The direction of heading has to be specified
with respect to the direction of gaze, not with respect to the location
of individual receptive fields. The variation of the position of the
singular point was chosen, because for the simple linear
ego-translation that we consider here, the location of the singular
point of an expansion/contraction pattern, i.e., the focus of
expansion/contraction, is directly related to the direction of heading.
The random distribution of flow field speeds was preferred over a
uniform distribution because much evidence from psychophysics indicates
that the motion parallax in these flow fields is an important source of
information for visual navigation.
Optic flow stimuli were generated by a Macintosh Quadra computer. The
stimuli consisted of full-field computer-generated sequences that were
back-projected onto the tangent screen. Image resolution was 400 × 400 pixels. Movies simulated approaching (expansion), receding
(contraction), and rotating (clockwise/counterclockwise) egomotion with
respect to a random cloud of dots in three-dimensional space. These
dots were white on a dark background. The cloud extended from 2 to
40 m in depth from the monkey. It contained 90 spherical dots all
with the same simulated diameter of 20 cm. Visual dot size depended on
the simulated distance of the dot from the observer. Median size was
~1° of visual angle. Simulated speed of the monkey was 3 m/sec for
the expansion/contraction stimuli and 60 deg/sec for the rotation
stimuli. The movie sequences were generated off-line, stored, and later
played back during the recording session. A single sequence lasted 1300 msec and displayed one direction of motion (expansion or clockwise
rotation) for a duration of 650 msec, immediately followed by 650 msec
stimulation by the opposite direction (contraction or counterclockwise
rotation). For data analysis, the mean spike rate during each 650 msec
stimulus interval was computed, corrected for the latency of the
response onset. The sequences simulated entirely realistic egomotion
flow fields. Dots accelerated with eccentricity, grew larger in size as
they approached the monkey, and exhibited a nonuniform speed
distribution that depended on the simulated distance of each dot from
the monkey. For the rotation displays, dots did not accelerate or grow
in size. In both cases, the visual motion of the dots was identical to
the motion that the monkey would experience when moving relative to
such a cloud of dots. In some neurons, we also tested a stimulus in
which all dots were assumed to lie on a frontoparallel plane instead of
a random cloud. In this case, the flow field speeds are more evenly
distributed and the stimulus does not contain any motion parallax.
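
The geometry of such a stimulus can be illustrated with a minimal sketch. It is not the stimulus software used in the experiments; it only assumes a pinhole projection with unit focal length, dots placed uniformly in depth and in the image (an assumption, since the text specifies only the depth range and the number of dots), and a pure translation of the observer. The function names `translational_flow` and `heading_from_singular_point` are hypothetical.

```python
import numpy as np

def translational_flow(points, heading, speed):
    """Image positions and flow vectors for a pure observer translation.

    points  : (N, 3) array of dot positions X, Y, Z in metres (Z = depth > 0)
    heading : unit vector (Tx, Ty, Tz) of the simulated movement
    speed   : translation speed in m/sec
    Returns positions and flow in tangent-screen units (focal length 1).
    """
    T = speed * np.asarray(heading, dtype=float)
    X, Y, Z = points.T
    x, y = X / Z, Y / Z                      # pinhole projection
    u = (x * T[2] - T[0]) / Z                # flow scales with 1/depth,
    v = (y * T[2] - T[1]) / Z                # which produces motion parallax
    return np.stack([x, y], axis=1), np.stack([u, v], axis=1)

def heading_from_singular_point(azimuth_deg, elevation_deg):
    """Heading vector whose focus of expansion lies at the given screen angles."""
    a, e = np.radians([azimuth_deg, elevation_deg])
    h = np.array([np.tan(a), np.tan(e), 1.0])
    return h / np.linalg.norm(h)

# Values taken from the text: 90 dots, 2-40 m depth, 3 m/sec forward speed.
# Uniform sampling in depth and in the image is an illustrative assumption.
rng = np.random.default_rng(0)
n_dots = 90
Z = rng.uniform(2.0, 40.0, n_dots)
xy = rng.uniform(-1.0, 1.0, (n_dots, 2))     # +/-45 deg field, tan(45 deg) = 1
points = np.column_stack([xy[:, 0] * Z, xy[:, 1] * Z, Z])

heading = heading_from_singular_point(15.0, 0.0)
positions, flow = translational_flow(points, heading, speed=3.0)
```

In this sketch every flow vector points away from the singular point, and its length depends on the depth of the dot, so dots at similar image positions can move at different speeds. Placing all dots at the same `Z` would give the frontoparallel-plane variant without motion parallax.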
To test whether the responsiveness of neurons was modulated by
the position of the singular point in the optic flow
stimulus, nine different movie sequences were presented in random
order. In each of these nine sequences, the singular point was located
either in the center of the screen or at one of eight different
locations arranged on a circle around the center of the screen. The
radius of the circle could be either 15° or 40°.
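
The layout of the nine singular-point positions can be written down as a short sketch. The text states only that eight positions lay on a circle around the screen center; the equal 45° spacing and the starting angle used here are assumptions, and `singular_point_positions` is a hypothetical helper name.

```python
import math
import random

def singular_point_positions(radius_deg, n_on_circle=8):
    """Center position plus n_on_circle positions on a circle (degrees)."""
    positions = [(0.0, 0.0)]
    for k in range(n_on_circle):
        phi = 2.0 * math.pi * k / n_on_circle   # assumed equal spacing
        positions.append((radius_deg * math.cos(phi),
                          radius_deg * math.sin(phi)))
    return positions

positions_15 = singular_point_positions(15.0)
positions_40 = singular_point_positions(40.0)

# Random presentation order of the nine sequences for one block of trials.
presentation_order = random.sample(range(len(positions_15)), k=len(positions_15))
```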
Histology. In the last days of recording with the first
monkey, electrolytic microlesions (10 nA for 10 sec) and neuronal
tracer injections were made. After recording was completed, the monkey
was given an overdose of pentobarbital sodium and, after respiratory
block and cessation of all reflexes, transcardially perfused. Frozen
sections were cut at 50 µm thickness. Sections 250 µm apart were
stained with cresyl violet and Klüver-Barrera to visualize
cytoarchitecture. Another series was stained for myelin with the
Gallyas (1979) method as modified by Hess and Merker (1983). Electrode
tracks were identified on the basis of the relative location of the
penetration to the entire recorded area, the spatial relationship to
other tracks and marking lesions or injections, and the depth profile
during a penetration. Our penetration scheme covered only a small
spatial region. The locations of the microlesions with respect to this
penetration scheme allowed us to identify the full area from which we
recorded. Approximate location of each recording site on the track was
determined, based on the distance from the above specified landmarks as
well as the appearance and disappearance of gray matter. Camera lucida
drawings of the relevant sections as well as two-dimensional maps of
the recorded hemisphere were made as a standard procedure. Most MST
recording sites were located in the posterior bank of the STS, near the
anterior border of area MT, and in the fundus of the STS. Histology of
the second monkey is not yet available because the animal is involved
in other experiments. In addition, evidence that a given neuron was
located in MST was also obtained from physiological criteria that were
used during the recording sessions following the procedure outlined by
Celebrini and Newsome (1994).
RESULTS
We first want to give a brief outline of the structure and
function of the network model proposed by Lappe and Rauschecker and
develop predictions for single-neuron properties. Then we will describe
the experimental results and the evaluation of the predictions.
Modeling optic flow processing
In his influential work starting in the 1950s, J. J. Gibson
postulated that the changing retinal illumination pattern occurring
during egomotion in a visual environment could be used effectively for
navigation (Gibson, 1950). Since then, much research in psychophysics
and computer vision has been concerned with optic flow processing, but
only recently have neurobiologists started to investigate these
questions in higher mammals. Humans can accurately detect their
direction of heading from optic flow, even in the presence of
confounding eye movements (Warren and Hannon, 1988, 1990; van den Berg,
1993). In some situations, however, humans also need additional
extraretinal information about their eye movements to detect their
direction of heading correctly (Warren and Hannon, 1990; Royden et al.,
1994). Many mathematical investigations have been concerned with the
visual decomposition of the retinal flow (Koenderink and van Doorn,
1975; Prazdny, 1980; Longuet-Higgins, 1981; Rieger and Lawton, 1985;
Verri et al., 1989), but few neurobiological models of optic flow
processing exist (Lappe and Rauschecker, 1993b; Perrone and Stone,
1994; Zemel and Sejnowski, 1995).
The model of Lappe and Rauschecker (1993b) is a two-layer
implementation of an algorithm (Heeger and Jepson, 1992) that computes
the direction of heading inherent in a measured optic flow field by
matching the motion parameters of the observer, i.e., ego-translation
T and eye-rotation Ω, to the measured optic
flow field according to a least-squares criterion. Thus, given a
specific flow field as input, it determines which of a possible set of
heading directions most likely generated this input flow field. This is
equivalent to determining the direction of heading in a retinotopic frame
of reference, i.e., the direction of heading relative to the direction
of gaze. A schematic layout of the network is shown in Figure
1. In the first layer, direction-selective cells
represent the optic flow input. We assume that the response of each
cell is maximal for movements of small objects in an individual
preferred direction and zero for movements in the null direction. We
further assume that each retinal location contains several neurons with
different preferred directions that together encode a measured optic
flow vector. We regard the first layer of the network as a functional
representation of area MT in monkey cortex. Various models have been
proposed for the measurement of optic flow and its possible
implementation in area MT (Hildreth and Koch, 1987; Bülthoff et
al., 1989; Wang et al., 1989; Qian et al., 1994; Nowlan and Sejnowski,
1995). Different suggested mechanisms such as "winner-take-all" or
"population coding" have been tested experimentally (Salzman and
Newsome, 1994). In our model, the precise nature of the optic flow
representation in MT is not critical. We only require the first layer
to signal the direction and the speed of an optic flow vector that
occurs at a specific location in the visual field. Any biologically
plausible algorithm would suffice. For the simulations described later,
we used a simple encoding. A set of 32 simplified neurons encodes the
local optic flow at a particular position in the visual field. We
assume a cosine direction tuning and a Gaussian speed tuning. We
typically use four direction preference classes (0°, 90°, 180°,
270°) and eight speed preference classes (0.5, 1, 2, 4, 8, 16, 32,
64 deg/sec). We disregard any effects of spatial summation of a neuron
caused by an extended receptive field. Instead, we assume that the
neuronal responses reflect only the speed and direction of a single
point in the flow field. We assume that in the first layer of the
network a large number (typically 300) of such functional units are
randomly distributed within the visual field.
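
A minimal sketch of one such functional unit is given below, assuming a cosine direction tuning scaled to be maximal in the preferred direction and zero in the null direction, and a Gaussian speed tuning. The text does not specify the width of the speed tuning or the axis on which it is Gaussian; the log2 speed axis and the width `speed_sigma_octaves` are assumptions made here because the preferred speeds are spaced in octaves. The function name `encode_flow_vector` is hypothetical.

```python
import numpy as np

# Preference classes as stated in the text: 4 directions x 8 speeds = 32 neurons.
PREF_DIRS_DEG = np.array([0.0, 90.0, 180.0, 270.0])
PREF_SPEEDS = np.array([0.5, 1, 2, 4, 8, 16, 32, 64], dtype=float)  # deg/sec

def encode_flow_vector(u, v, speed_sigma_octaves=1.0):
    """Responses of the 32 model neurons to one flow vector (u, v) in deg/sec.

    Returns a (4, 8) array indexed by [direction class, speed class].
    """
    speed = np.hypot(u, v)
    direction = np.arctan2(v, u)
    # Cosine tuning: 1 in the preferred direction, 0 in the null direction.
    dir_resp = 0.5 * (1.0 + np.cos(direction - np.radians(PREF_DIRS_DEG)))
    # Gaussian tuning on a log2 speed axis (assumed, see lead-in).
    log_speed = np.log2(max(speed, 1e-6))
    speed_resp = np.exp(
        -0.5 * ((log_speed - np.log2(PREF_SPEEDS)) / speed_sigma_octaves) ** 2
    )
    return np.outer(dir_resp, speed_resp)

# A rightward flow vector of 4 deg/sec.
responses = encode_flow_vector(4.0, 0.0)
```

For this input, the most active neuron is the one preferring 0° and 4 deg/sec, while all neurons preferring 180° stay silent. The full first layer would hold one such 32-neuron unit at each of the randomly placed visual field positions.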
Fig. 1.
Structure of the network model. In the first
layer, direction-selective cells (A) modeled after
properties of neurons in monkey visual area MT represent the optic flow
input. At each receptive field position, columns of several neurons
with different preferred directions (four in the drawing) encode
the optic flow occurring at that location in the visual field. The
combined activity in layer one encodes the optic flow field
(B). In the second layer, the motion of the observer is
recovered by neuronal populations that are tuned to preferred
directions of heading. It contains a two-dimensional retinotopic map of
heading directions. Each map position represents a specific direction
of heading, given by the retinal projection of the movement of the
observer. It is occupied by a column of neurons that receive inputs
from different parts of the visual field and respond to optic flow
motion patterns. Their response is a function of the direction of
heading inherent in the flow field input (C). Individual
neurons within a single column might carry different optic flow
selectivities, but the combined activity of all neurons within one
column is tuned to the direction of heading symbolized by this position
in the map. Taken together, the population activity in layer two gives
a map of heading directions (D). Connections between the two
layers are randomly assigned. Connection strengths, however, are chosen
in compliance with the heading detection scheme implemented (Lappe and
Rauschecker, 1993a). Neurons from within one second-layer column may
receive input from different, potentially overlapping, regions of the
visual field and retain very large receptive fields.
The second layer of the network contains neuronal populations
individually tuned to specific directions of heading. This layer forms
a computational map of possible heading directions. Each map position
represents a specific direction of heading, given by the intersection
of the axis of translational movement of the observer with the retinal
image. A column of neurons occupying a specific map position in layer
two separately computes the likelihood that the optic flow field
represented in layer one is the result of an egomotion along the axis
of translation (the direction of heading) given by its position in the
map. These neurons form the population that represents this specific
direction of heading. Other directions of heading, which are associated
with different locations in the heading map, are served by different
populations of neurons. To achieve this computation, the connection
strengths between the two layers have to be carefully adjusted.
However, this adjustment is not done with a weight update method such
as backpropagation. Rather, the required connection strengths are
precalculated from the mathematical formalization of the underlying
heading detection algorithm (Lappe and Rauschecker, 1993b). Therefore,
no training is necessary. The distribution of the connections, on the
other hand, can be chosen at random. Each second-layer neuron receives
input from a random subset of first-layer units. Only the connection
strengths have to be specified. This allows for convergence,
divergence, and overlap in the receptive fields of the second-layer
neurons. Also, the receptive field sizes can be chosen to be
consistent with the typical receptive field dimensions of MST neurons.
As a result of the freedom of assignment of first-layer input neurons
to second-layer neurons, the receptive fields of the second layer
neurons can be inhomogeneous, i.e., clustering of inputs in parts of
the receptive field can occur. However, as many researchers have noted
(Tanaka and Saito, 1989b; Duffy and Wurtz, 1991b; Lagae et al., 1994),
the receptive fields of MST neurons are also often irregular and
difficult to determine. In the simulations, we typically used 32 input
locations per second-layer neuron.
Once the connections have been determined, the network minimizes a
certain residual function that describes the error between the measured
flow field and a candidate flow field induced by an egomotion into a
certain heading direction: a peak of activity in the map occurs at the
position where this residual function is minimal. This peak specifies
the most likely direction of heading as computed by the network. An
example is given in Figure 2. The example simulates
movement of an observer on top of a ground plane. The observer simply
moves on a linear path toward the plus sign (+) while gazing toward the
left of his movement trajectory. During the movement, he keeps a fixed
angle between the direction of gaze and the direction of heading. No
eye rotation occurs. The resulting optic flow input to the network is a
pure expansion with the singular point, the focus of expansion, located
in the direction of heading. The network is able to determine the
correct direction of heading. The right side of Figure 2 shows the
population activities in the second layer. Each square in this
grayscale plot corresponds to a specific map position in x
and y, i.e., to the retinal projection of a specific
direction of heading. The brightness indicates the population activity
at this map position. The brightest square in the map indicates the
direction of heading as computed by the network. It matches the correct
direction (+) within the resolution of the grid (1°). In this and the
following simulations, the second layer of the network consisted of
16,000 neurons.
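
The idea of locating the heading at the minimum of a residual function over a map of candidate directions can be sketched as follows. This is a deliberately simplified stand-in, not the Heeger and Jepson (1992) algorithm implemented by the network: it assumes a pure translation without eye rotation, in which case every flow vector must be parallel to the line joining its position to the focus of expansion, and it uses the squared perpendicular flow component as residual. All names and the synthetic flow field are illustrative assumptions.

```python
import numpy as np

def heading_residual_map(pos, flow, candidates):
    """Residual per candidate heading (translation-only simplification).

    pos, flow  : (N, 2) image positions and flow vectors
    candidates : (C, 2) candidate focus-of-expansion positions
    Returns a (C,) array: summed squared flow component perpendicular to
    the radial direction from each candidate.
    """
    r = pos[None, :, :] - candidates[:, None, :]            # (C, N, 2)
    norm = np.linalg.norm(r, axis=2)
    norm = np.where(norm > 1e-12, norm, 1.0)                # avoid division by 0
    perp = (r[..., 0] * flow[None, :, 1] - r[..., 1] * flow[None, :, 0]) / norm
    return np.sum(perp ** 2, axis=1)

# Synthetic expansion flow with random depths (speeds scale with 1/depth).
rng = np.random.default_rng(1)
n = 300
depth = rng.uniform(2.0, 40.0, n)
pos = rng.uniform(-1.0, 1.0, (n, 2))
true_foe = np.array([np.tan(np.radians(10.0)), np.tan(np.radians(-5.0))])
flow = (pos - true_foe) / depth[:, None]

# Map of candidate headings: 40 deg side length, 1 deg resolution.
grid_deg = np.arange(-20.0, 21.0, 1.0)
az, el = np.meshgrid(grid_deg, grid_deg)
cand_deg = np.column_stack([az.ravel(), el.ravel()])
residual = heading_residual_map(pos, flow, np.tan(np.radians(cand_deg)))
best_deg = cand_deg[np.argmin(residual)]
```

In the network, low residual values correspond to high population activity, so the residual minimum found here plays the role of the brightest square in the output map. The residual used in this sketch does not depend on the unknown dot depths, which is why the heading can be recovered from a flow field with a random speed distribution.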
Fig. 2.
Example of a network simulation. A, An
observer moves on top of a ground plane into a direction indicated by
the plus sign (+), which keeps a constant angle with the direction of
gaze (x). B, The resulting optic flow field, which is used
as input to the network, is a pure expansion with the singular point,
i.e., the focus of expansion, located in the direction of the movement
(+). C, The population activities in the second layer of the
network as a grayscale map. Each square corresponds to one
possible direction of heading. The brightness of the square indicates
the activity of the population that represents this direction. The
computed direction of heading corresponds to the brightness peak and is
close to the correct direction (+). Note that the flow field and the
output map are drawn on different scales. The diameter of the flow
field is 100°, whereas the side length of the output map is only
40°.
The example in Figure 2 describes a simple linear movement that does
not involve any visual rotations attributable to eye- or
head-movements. However, such rotations often occur during locomotion
and have a profound influence on the structure of the flow field on the
retina (Regan and Beverly, 1982; Warren and Hannon, 1990; Lappe and
Rauschecker, 1995). The network has been designed to cope with this
situation. To achieve an invariance against eye-rotations, a simple
search for the focus of expansion on the retina is misguided. Instead,
each second-layer population has to evaluate the residual
function and adjust its activity accordingly: the lower the value of
the residual function, the higher the output activity of the
population. This evaluation cannot be performed by any single neuron
alone, but is spread out over all of the cells within a population.
Therefore, whereas the population possesses a "preferred" direction
of heading, a single cell is not able to signal the direction of
heading on its own. The activity of a single cell only serves as one
constraint on the heading direction. To compute the most likely heading
direction thus requires summation of the outputs of many cells. This
procedure is illustrated in Figure 3. The response of a
single cell (Fig. 3A) to an optic flow input is a sigmoid
function of the direction of heading. Such a cell only signals whether
the direction of heading lies roughly in one-half of the visual field
(left hemifield in Fig. 3A). Together with a second cell
(Fig. 3B), providing information about whether the direction
of heading is likely to be located in the right hemifield, the location
of the direction of heading is found to lie on a line dissecting the
visual field. The combination of many neuronal responses (a second pair
of neurons is shown in Fig. 3C,D) finally results in a peak
of population activity at the retinal position of the correct direction
(Fig. 3E).
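
The summation described above can be illustrated with a toy example. It assumes four cells whose responses are logistic sigmoids of the heading position along four different orientations; the offset and slope values, the grid, and the function names are illustrative assumptions and are not taken from the model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cell_response(x, y, preferred, orientation_deg, offset=5.0, slope=3.0):
    """Sigmoid response of one cell as a function of heading (x, y) in degrees.

    The cell is active on one side of a line that passes `offset` degrees
    beyond the preferred heading, along the given orientation.
    """
    theta = np.radians(orientation_deg)
    t = (x - preferred[0]) * np.cos(theta) + (y - preferred[1]) * np.sin(theta)
    return sigmoid((offset - t) / slope)

# Map of possible headings (azimuth x, elevation y), 1 deg resolution.
grid = np.arange(-40.0, 41.0, 1.0)
x, y = np.meshgrid(grid, grid)
preferred = (10.0, -5.0)

# Two opposing pairs of cells, analogous to panels A/B and C/D of Figure 3.
orientations = [0.0, 180.0, 90.0, 270.0]
population = sum(cell_response(x, y, preferred, o) for o in orientations)

iy, ix = np.unravel_index(np.argmax(population), population.shape)
peak = (grid[ix], grid[iy])
```

Each cell on its own is active over roughly half of the map, so its response leaves the heading largely undetermined. Only the summed activity has a single peak, located at the preferred heading of the population.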
Fig. 3.
Schematic illustration of the process of
representing a certain direction of heading with a population of
neurons. A, The response u(x,y)
of a single cell from the second layer of the network to an optic flow
input. The map position (x,y) denotes the azimuth and
elevation of the direction of heading. The response u is a
sigmoid function of the direction of heading. The single cell in
A only responds when the direction of heading lies along the
vertical meridian or in the left hemifield. B, A second cell
from within the same subpopulation in the second layer of the network
signals that the direction of heading is located in the right visual
field. Both activity profiles overlap near the vertical meridian.
Summing the activities of the neurons in A and B
and of two more neurons (C, D), also from within the same
column but with differently oriented response curves, results in a peak
of population activity U for the total neuronal population
in the second layer of the network. This peak of activity signals the
direction of heading.
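The summation illustrated in Figure 3 can be sketched numerically. The following Python fragment is only an illustration and not the network implementation: the logistic form of the sigmoid, the 10° overlap, the slope constant, and the name `cell_response` are assumptions chosen to mimic the schematic.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cell_response(x, y, direction_deg, overlap=10.0, slope=5.0):
    """Sigmoid response u(x, y) over the heading map (assumed logistic form).

    The response rises toward `direction_deg` and extends `overlap` degrees
    past the map centre, so opposite cells overlap near the meridian.
    """
    theta = np.deg2rad(direction_deg)
    projection = x * np.cos(theta) + y * np.sin(theta)
    return sigmoid((projection + overlap) / slope)

# Heading map: azimuth x and elevation y in degrees (1 degree grid).
axis = np.linspace(-45.0, 45.0, 91)
x, y = np.meshgrid(axis, axis)

# Four cells of one population with differently oriented response curves:
# A (left), B (right), C (up), D (down).
directions = [180.0, 0.0, 90.0, 270.0]
U = sum(cell_response(x, y, d) for d in directions)

# Location of the population peak on the heading map.
iy, ix = np.unravel_index(np.argmax(U), U.shape)
peak = (axis[ix], axis[iy])
```

With these assumed parameters the summed activity U is largest at the map centre, where all four sigmoid profiles overlap, although no single cell has its maximum there.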
The structure of the model assumes that each MST cell is associated
with a specific population that represents a specific direction of
heading. Figure 3 also illustrates why it is difficult to determine
from physiological data the specific population or direction of heading
with which a given neuron is associated. The most obvious differences
in the response curves of the four individual neurons are their
orientations. However, these differences do not functionally separate
the neurons. In fact, these different orientations within one
population are necessary to achieve the desired overlap in the response
functions that generates the population selectivity. This principle is
very similar to the way local motion information is encoded in a
distributed fashion in the MT layer. There, the relative activities of
a population of neurons with different directional selectivity give the
direction of motion of the stimulus. In MT, these populations are
formed by all neurons that occupy the same receptive field location in
visual space. A neuron with a receptive field at another location
clearly belongs to a different population, encoding motion at that
location. In the MST model, receptive field location would not be a
good basis to group neurons together in one population. To acquire as
much information for the determination of self-motion parameters as
possible would instead require covering all of the visual field with
the neurons within one population. Thus, neither the response curves
directly nor the receptive field positions would be expected to
differentiate neurons in one population from neurons in another
population. The appropriate parameter to group neurons together would
instead be their nearness in "heading space." However, this
parameter manifests itself neither in the individual response curve nor
in the receptive field position, but only when neuronal responses are
combined. Thus, if such a map of heading-populations were anatomically
present in MST, it would not be directly visible in the response
properties of neighboring neurons. As Figure 3 shows, the response
function of neurons within one such population can be quite different
from one another. This might explain why attempts to find a map-like
organization in MST based on physiological properties such as the
selectivity for specific flow patterns have failed.
Properties of single model neurons
All individual neurons in one column respond to optic flow
patterns. However, their optic flow response properties are determined
not solely from their position in the heading map, but also from the
locations of their first-layer inputs and connections. Thus, different
individual neurons might display different optic flow tuning. Such a
map structure might explain why many researchers have failed to find a
clear-cut topographic order in MST. In this model, neither the
receptive field position nor a selectivity for certain optic flow
components would necessarily display a topographic order. Rather, a
certain orderly arrangement of columns of neurons encoding specific
heading directions in a population code would be expected.
Neurons in the model are designed to perform a specific task, namely,
to compute the direction of heading. However, they also show specific
properties when tested with the abstract flow stimuli that are commonly
used in neurophysiological research on optic flow processing.
Typically, these stimuli consist of basic flow components such as pure
expansions/contractions or pure rotations. When stimulated with such
input stimuli, their activity is modulated by the position of the
singular point of the optic flow stimulus within the visual field. A
singular point of an optic flow field is defined as a point where the
optical velocity vanishes. For an expansion/contraction stimulus, the
singular point is the focus of expansion or contraction. For a rotation
stimulus, the singular point coincides with the retinal projection of
the axis of rotation. The response modulation is characterized by
complementary response fields for expansion/contraction and
clockwise/counterclockwise rotation. For instance, an individual model
neuron might favor expansions with the singular point in the left
hemifield and contractions with the singular point in the right
hemifield, i.e., it reverses its selectivity from expansion to
contraction as the singular point is moved in the visual field. In
addition, a second-layer cell that is excited by one type of motion at
a specific location of the singular point will be inhibited by the
reversed motion with the same location of the singular point. This
inhibition is an important requirement, because it allows the
population to retain a medium activity even when some neurons are
excited by the stimulus.
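The basic stimuli and their singular points can be made concrete with a small sketch. It assumes idealized two-dimensional flow in which speed grows linearly with distance from the singular point, and a coordinate frame with y pointing upward; the function name `flow_field` and the pattern labels are illustrative only.

```python
import numpy as np

def flow_field(points, singular_point, kind):
    """Idealized 2-D flow vectors at `points` for a basic flow pattern.

    Assumption: speed is proportional to the distance from the singular
    point, so the optical velocity vanishes exactly at that point.
    """
    r = np.asarray(points, dtype=float) - np.asarray(singular_point, dtype=float)
    if kind == "expansion":
        return r
    if kind == "contraction":
        return -r
    if kind == "ccw":  # counterclockwise rotation (y axis pointing up)
        return np.stack([-r[:, 1], r[:, 0]], axis=1)
    if kind == "cw":   # clockwise rotation
        return np.stack([r[:, 1], -r[:, 0]], axis=1)
    raise ValueError(f"unknown flow pattern: {kind}")

# Three sample positions in degrees; the first coincides with the singular point.
points = np.array([[10.0, -5.0], [0.0, 0.0], [-20.0, 30.0]])
singular_point = (10.0, -5.0)
expansion = flow_field(points, singular_point, "expansion")
contraction = flow_field(points, singular_point, "contraction")
ccw = flow_field(points, singular_point, "ccw")
```

In this sketch, moving the singular point changes the local motion at every fixed position, while the stimulus itself continues to cover the same part of the visual field.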
The model neurons also respond to frontoparallel, two-dimensional,
unidirectional motion in a direction-selective manner. As with the
other optic flow responses, this directional selectivity is also a
reflection of the functional requirements of the task of heading
detection from optic flow. It does not imply that a neuron is
specifically tuned to translation, i.e., that all of its inputs from
the first layer have the same preferred direction. Instead, a neuron
receives input from many cells with different preferred directions and
uses a complex weighting scheme for these inputs. However, if all
first-layer neurons are stimulated by a large field translation, then
there is always one direction of translation for which the input for a
given second-layer neuron is maximal and one for which it is minimal.
Thus, this neuron will appear direction-selective, even though it does
not receive restricted input only from cells with the same preferred
direction.
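A minimal sketch of this argument follows. It assumes cosine-tuned first-layer cells and draws random numbers as a stand-in for the model's weighting scheme; neither the tuning function nor the weights are taken from the model, and the number of inputs is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs = 32
# Preferred directions of the first-layer inputs (all different).
preferred = rng.uniform(0.0, 2.0 * np.pi, n_inputs)
# Hypothetical synaptic weights; random values stand in for the model's scheme.
weights = rng.normal(0.0, 1.0, n_inputs)

def net_input(direction):
    """Summed input to one second-layer cell during full-field translation.

    Assumption: every first-layer cell is cosine-tuned to motion direction.
    """
    return float(np.sum(weights * np.cos(direction - preferred)))

# Probe all translation directions in 1 degree steps.
directions = np.deg2rad(np.arange(360))
drive = np.array([net_input(d) for d in directions])

best = int(np.argmax(drive))    # direction giving maximal input (degrees)
worst = int(np.argmin(drive))   # direction giving minimal input (degrees)
separation = abs(best - worst)
```

Because a weighted sum of cosines is itself a cosine of the translation direction, the summed input has one direction of maximal and one of minimal drive, although the inputs do not share a preferred direction.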
An example of the responses of a single second-layer model neuron to
several optic flow stimuli is shown in Figure 4. It is
important to note that, similar to the experimental methods described
later, the optic flow stimuli always covered a full central 90° × 90° of the (simulated) visual field. Thus, only the position of the
singular point was moved, not the stimulus itself. The example neuron
in Figure 4 responds differentially to expanding, contracting, and
rotating flow stimuli, depending on where in the visual field the
singular point of the flow stimulus is located. For very large
displacements of the singular point, i.e., when the singular point is
moved from the lower left to the upper right corner of the visual
field, the neuron reverses its selectivity from expansions to
contractions. For smaller displacements of the singular point, the
neuron displays a position invariant selectivity in large parts of the
visual field. For instance, the selective response to counterclockwise
rotations stays the same within an area covering the left hemifield and
extending at least 20° into the right hemifield. The simulated
receptive field of this model neuron covers the lower left quadrant. It
extends up to 10° into each of the other quadrants. Thus, for this
neuron, the reversals of selectivity occur only when the singular
points of the respective flow patterns are placed outside its receptive
field. Within the receptive field, the neuron responds selectively only
to expansions and counterclockwise rotations, and both responses are
position-invariant. In addition to the optic flow responses, the neuron
also responds to full-field unidirectional motion, favoring movements
toward the upper right.
Fig. 4.
Simulated responses
u(x,y) of a single neuron from the second
layer of the model to optic flow stimuli. The stimuli were pure
expansions, contractions, clockwise rotations, and counterclockwise
rotations as a function of the location (x,y) of the
singular point within the visual field. The simulations show sigmoidal
response profiles for all of these stimuli. A comparison of the
responses to opposite stimuli reveals a complementary arrangement of
areas of best response. Best responses to expansion are obtained in the
lower left of the visual field, and best responses to contraction are
obtained in the upper right of the visual field. For rotations,
clockwise rotation is favored in the left and the center of the visual
field, whereas counterclockwise rotation becomes more preferred in the
right periphery of the visual field. The neuron is also
direction-selective. It prefers full-field unidirectional
frontoparallel translation toward the upper right. The receptive field
of the neuron covers the lower left quadrant of the visual field and
extends up to 10° into the other three quadrants. The neuron receives
input from 32 locations from within this receptive field, which are
indicated by black dots. At every such location, inputs from
all possible local movement directions are present but are weighted
according to an algorithm that allows the determination of the
direction of heading. The receptive field of the neuron has a complex
structure. Different parts of the receptive field can have different
selectivities for local motion.
It is also possible that single model neurons do not respond
selectively to all of the basic flow patterns. For instance, a second
model neuron, shown in Figure 5, lacks any selectivity
for clockwise versus counterclockwise rotations. The responses to
expanding or contracting patterns remain dependent on the location of
the singular point. In addition, the neuron is also
direction-selective. Preferred direction for full-field unidirectional
motion for this neuron is toward the lower right. The simulated
receptive field covered the full 90° × 90° visual field. The
difference in the response selectivity of the two model neurons stems
from a difference in their account for visual disturbances caused by
eye movements that might occur during egomotion. For an analytical
derivation of these properties, see Lappe and Rauschecker (1993a).
Fig. 5.
Example of a model neuron that is nonselective for
rotational flow stimuli. The responses
u(x,y) to expansional and contractional
flow stimuli depend in a complementary manner on the location
(x,y) of the singular point. The responses to
rotational stimuli are independent of the location of the singular
point and are identical for both directions of rotation. However, the
neuron is direction selective for full-field frontoparallel translation
toward the lower right. The receptive field covers the central 90° × 90° of the simulated visual field.
An interesting property of the optic flow-selective neurons in MST has
been reported by Graziano et al. (1994). Many cells responded very well
to linear combinations of expanding/contracting and rotating patterns,
i.e., to spiral motion. We have not included spiral motions in our
study, but responses to spiraling optic flow patterns are also observed
in model neuron simulations. Spiraling optic flow patterns often occur
in everyday egomotion conditions and are very efficient stimuli for the
human heading-detection system (Lappe and Rauschecker, 1995). For
instance, they result when, during egomotion with respect to a ground
plane, the gaze is stabilized on a ground plane target by appropriate
eye movements. For the heading-detection system implemented by the
model, selective responses to spiraling patterns are a natural
consequence. Thus, although the simulations and recordings described in
this paper were all performed with basic optic flow patterns such as
pure expansions, rotations, or translations, the responses of the model
neurons are not restricted to these basic patterns. The model neurons
do not separate the optic flow into isolated basic components but,
rather, form a continuum of selectivities, in which some neurons
respond more strongly to spiraling patterns at certain positions of the
singular point, whereas other neurons respond more strongly to the pure flow
patterns.
The model neurons in Figures 4 and 5 clearly represent idealized
response properties obtained from a mathematically optimal network.
Such idealized responses cannot be expected from real neuronal data.
However, from the model simulations, a number of predictions can be
made that can be tested experimentally. First, neuronal activities will
depend on the position of the singular point in a full-field optic flow
stimulus. Reversals of selectivity might occur when the singular point
is displaced. Second, best responses to opposing stimuli (expansion vs
contraction, clockwise vs counterclockwise) will occur for diametrically
opposite locations of the singular point with respect to the center of the
visual field. Third, excitation by one type of motion at a particular
location of the singular point will be paired with inhibition by the
opposite type of motion. Fourth, maximum activities will occur for
peripheral locations of the singular point. In general, all of the
effects should be best observable for large distances of the singular
point from the center of the visual field.
EXPERIMENTAL RESULTS
We recorded from 134 neurons. A total of 98 neurons could be
tested with expansion/contraction stimuli at different locations of the
singular point. Of these 98 neurons, 88 were tested with the
expansion/contraction stimuli at 40° eccentricity. Thirty-one neurons
were tested with the expansion/contraction stimuli at 15°
eccentricity. Twenty-one neurons were tested with both sets of
expansion/contraction stimuli. Rotational optic flow stimuli were
tested less often. A total of 53 neurons were recorded with rotational
flow patterns, 26 with the 15° stimuli and 45 with the 40° stimuli.
A total of 18 neurons were tested with 15° and 40° rotation
stimuli.
Basic properties
Most neurons we encountered could be well driven by visual
stimulation. Visual receptive field dimensions of these neurons were
usually large to very large, often covering the whole 90° × 90°
screen. However, as has been noted by previous researchers, the
receptive fields were sometimes difficult to map, because the responses
depended on the stimulus used (bar or random dot pattern), and also
because in some neurons inhomogeneities in the receptive fields were
observed. Nevertheless, because our experimental paradigm as well as the
natural situation during egomotion involved only full-field
stimulation, we considered the estimates obtained with the hand-held
projector to be sufficient.
Many neurons responded well to the optic flow stimuli. Most of these
neurons also displayed a broad direction selectivity for full-field,
frontoparallel, unidirectional motion. However, in 57 (70%) of 81 neurons that were compared, the response recorded during an optic flow
stimulation exceeded the response elicited with the unidirectional
motion. When making this comparison, it is important to bear in mind
that the frontoparallel motion stimuli were not directly comparable to
the optic flow stimuli in terms of speed, direction, or dot size.
Instead, we used optimized stimuli for the frontoparallel motion
responses. These stimuli were chosen from a large set of stimuli
differing in speed, direction, stimulus size, dot size, etc., so as to
elicit an optimum response of the individual neuron. Thus, although the
frontoparallel motion stimuli and the optic flow stimuli were not
directly equivalent, we think that the comparison nevertheless provides
a conservative assessment of the relative response strengths to
frontoparallel motion and optic flow. In addition to visual responses,
some neurons also showed pursuit-related activity or extraretinal
modulations by eye-position (Bremmer and Hoffmann, 1993; Bremmer et
al., 1996).
For most neurons, the recorded activities during the optic flow
stimulations depended on the position of the singular point on the
tangent screen. Usually, the selectivity for an optic flow pattern
could be changed by changing the placement of the singular point.
Often, however, a reversal of selectivity from expansion to
contraction, or from clockwise to counterclockwise rotation, did not
occur within the stimulus set that included only the 15° eccentric
positions. But in this case, selectivity reversals could usually be
induced using the 40° eccentric stimulus set. Figure 6
shows spike trains and peristimulus time histograms for a neuron tested
with the 15° expansion/contraction and the 15° and 40° rotation
stimuli. The arrangement of the histograms reflects the screen location
of the singular point in the different stimulations. If the singular
point is placed in the left to upper left part of the visual field, the
neuron fires at a higher rate in the contraction phase. In
contrast, if the singular point is placed in the right hemifield, the
neuron fires more strongly in the expansion phase. Within the 15° rotation
stimuli (inner histograms in Fig. 6B) no such
reversal of selectivity is observed. At most positions within the
central 30° of the visual field, the activity of the neuron is larger
in the clockwise rotation phase than it is in the counterclockwise
rotation phase. However, if the responses to the 40° rotation stimuli
are considered (outer histograms in Fig. 6B), it
becomes apparent that the activity of the neuron during
counterclockwise rotation increases when the singular point is located
in the upper left periphery. Thus, the neuron favors clockwise
rotations in most of the visual field, but reverses its selectivity in
a restricted peripheral area, similar to the model neuron in Figure 4.
The receptive field of the neuron covered the entire left hemifield
with an area of increased excitability covering the lower left
quadrant. Preferred direction for frontoparallel unidirectional motion
was toward the right.
Fig. 6.
Spike trains and peristimulus time histograms for
a cell recorded in area MST show a reversal of selectivity depending on
the location of the singular point. A, Neuronal activities
during expansion (first phase of stimulus) and contraction (second
phase) stimulation were recorded for nine locations of the singular
point of the flow field. The arrangement of the histograms reflects the
location of the singular point during the individual stimulations. One
location was in the center of the visual field, and eight locations
were arranged equidistantly on a circle of radius 15° around the
center. The neuron favors the contraction stimulus when the singular
point is placed in the upper left visual hemifield. In contrast, if the
singular point is placed to the right of the visual field center, the
neuron favors
expansion. B, Activities during rotational stimulation were
recorded for 17 locations of the singular point. In addition to the 9 inner locations, 8 more locations were arranged on a second circle of
radius 40°. Within the 9 central rotation stimuli (15° eccentric),
no reversal of selectivity is observed. Instead, the neuron favors
clockwise rotations (first stimulus phase) at most positions within the
central 30° of the visual field. However, when the singular point is
located 40° eccentric, it becomes apparent that the neuron favors
counterclockwise rotation when the singular point is located in the
upper left periphery of the visual field, and clockwise rotation when
the singular point is located in the right or in the lower visual
hemifield. C, Directional tuning for full-field
frontoparallel translation is toward the left. The polar plot of the
directional tuning was obtained by moving a full-field random dot
pattern on a circular path in a frontoparallel plane, thereby covering
all 360° of motion direction in a single trial. The receptive field
covered the left half of the tangent screen, but an area of increased
responsiveness comprised the lower left quadrant.
The main goal of this study was to determine the shape of the
activity modulation of MST neurons when the position of the singular
point in the visual field is varied, and to compare it to the response
curves obtained in computer simulations. In Figure 7,
the activity profiles of an MST neuron for the different optic flow
stimuli are plotted as three-dimensional surface graphs. Smooth
activity slopes in response to expansion/contraction can be seen to
agree with the expansion/contraction response functions of the model
neurons in Figures 4 and 5. Activities during rotational stimulation
were recorded only with the 15° stimulus set. A strong response to
counterclockwise rotation and a dependence on the position of the
singular point are apparent. In mapping the receptive field of this
neuron, some response could be elicited from all over the visual field,
but increased responsiveness was obtained from only the lower left
quadrant of the visual field. The neuron was also direction-selective
for full-field unidirectional motion, favoring directions toward the
lower left.
Fig. 7.
Activities of a cell from area MST recorded during
various optic flow stimulations. The plots are drawn in analogy to the
ones used in the examples of model simulations in Figures 4 and 5. They
show the activity u in spikes/sec for a number of locations
(x,y) of the singular point. Activities during
expansion and contraction stimulation were recorded for 17 locations of
the singular point of the flow field, distributed around the center (0, 0) of the visual field. For the plots, individual activities recorded
at these discrete positions were joined with nearest neighbors by
linear triangular segments. Activities recorded during expansion and
contraction display a smooth graded profile that conforms with the
model predictions. A reversal of selectivity occurs roughly along the
vertical meridian. Best responses to expansion and contraction were
recorded from opposite areas of the visual field: expansion was favored
when the singular point was in the upper visual hemifield
(y > 0), and contraction was favored when the
singular point was in the lower visual hemifield
(y < 0). Responses to rotational flow stimuli
were recorded for 9 locations of the singular point of the flow field,
centered on the fovea, or 15° eccentric. For these stimuli, the
neuron responded only to counterclockwise rotation. However, response
strength is modulated strongly by the location of the singular point.
Directional tuning for frontoparallel translation was toward the lower
left. The receptive field covered the lower left quadrant of the visual
field.
Evaluation of model predictions
To evaluate the predictions of the model for the recorded neurons,
we computed the percentages of neurons that were consistent with the
predictions.
Reversals of selectivity for large displacements of the
singular point
For each neuron tested with a set of expansion/contraction
stimuli, we computed the difference of the mean spike rate during
expansion and the mean spike rate during contraction for each of the
nine locations of the singular point. If at one location of the
singular point a cell responded more strongly to expansion than to
contraction, and if at a different location of the singular point the
same cell responded more strongly to contraction than to expansion,
then direction indices (DI) for this pair of locations were computed,
following the standard formula:
The neuron was counted as reversing its selectivity when both
direction indices exceeded a value of 0.5. The same procedure was
applied to clockwise and counterclockwise rotations.
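The reversal criterion can be sketched as follows. The formula for the direction index is not reproduced in the text above, so the sketch assumes one common definition, DI = 1 - (nonpreferred response / preferred response); this definition, the function names, and the example spike rates are assumptions and not the values used in the study.

```python
def direction_index(preferred, nonpreferred):
    """Assumed direction index: 1 - nonpreferred / preferred (rates in spikes/sec)."""
    if preferred <= 0:
        return 0.0
    return 1.0 - nonpreferred / preferred

def reverses_selectivity(rates_a, rates_b, threshold=0.5):
    """Check the reversal criterion for two opposing stimuli.

    rates_a, rates_b: mean spike rates for the two stimuli (e.g., expansion
    and contraction), one value per location of the singular point.
    A reversal requires one location favoring stimulus A and a different
    location favoring stimulus B, each with a direction index above threshold.
    """
    a_favored = [direction_index(a, b) for a, b in zip(rates_a, rates_b) if a > b]
    b_favored = [direction_index(b, a) for a, b in zip(rates_a, rates_b) if b > a]
    return (any(di > threshold for di in a_favored)
            and any(di > threshold for di in b_favored))

# Hypothetical rates at two singular point locations.
clear_reversal = reverses_selectivity([30.0, 10.0], [10.0, 30.0])
weak_reversal = reverses_selectivity([30.0, 25.0], [10.0, 30.0])
no_reversal = reverses_selectivity([30.0, 30.0], [10.0, 10.0])
```

Under the assumed definition, a change of sign in the rate difference is not sufficient by itself; the second example changes sign but fails the 0.5 threshold at one of the two locations.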
The percentage of neurons that displayed reversals of selectivity is
listed in Table 1 for the various sets of stimuli used.
Table 1 shows that most neurons reverse their selectivity depending on
the position of the singular point, consistent with the model
prediction. Also consistent with the model prediction, the reversals
become more prominent when the singular point of the optic flow
stimulus is moved farther into the periphery of the visual field.
Table 1.
Percentages of optic flow-selective cells that reversed
preferred stimulus direction when the singular point of the optic flow
stimulus was shifted
|          | 15 deg. ecc. | 40 deg. ecc. |
| exp/cont | 28%          | 78%          |
| cw/ccw   | 27%          | 87%          |
For each neuron, we computed the difference of the mean spike
rate during expansion (or clockwise rotation) and the mean spike rate
during contraction (or counterclockwise rotation) for each of the nine
singular point positions. For a neuron to be counted as reversing its
selectivity, two conditions had to be fulfilled. A change of the sign
of this difference value between at least one pair of locations was
required, and the direction indices (computed according to a standard
formula) at those locations had to exceed a value of 0.5. ecc,
Eccentricity; exp, expansion; cont, contraction; cw, clockwise; ccw,
counterclockwise (throughout tables).
Complementary response fields
We next tested the model prediction that best responses to
opposing stimuli (expansion vs contraction, clockwise vs
counterclockwise rotation) should occur for opposite locations of the
singular point in the visual field. Only those neurons that displayed a
reversal of selectivity according to the above criteria were
considered. For each type of motion, we determined in which direction
from the visual field center the area of best response was located. To
obtain this direction, we computed the gradient of a two-dimensional
regression on the nine activities recorded for a given motion type and
stimulus set. The gradients for opposite types of motion were then
compared to each other. If the gradient for expansion and the gradient
for contraction pointed in directions more than 90° apart, the cell
was considered to have complementary response fields for
expansion/contraction. Complementary response fields for clockwise and
counterclockwise rotation were determined analogously. The percentages
of neurons that had complementary response fields are shown in Table
2. The data in Table 2 conform with the predictions from
the model. Best responses to opposing stimuli occur at opposite
locations in the visual field. Again, the result is clearest when the
neurons were tested with the 40° stimuli.
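The gradient comparison can be sketched as a least-squares plane fit. The sketch assumes that the two-dimensional regression is linear, u = a*x + b*y + c, so that the gradient is (a, b); the function names and the example activities are hypothetical.

```python
import numpy as np

def response_gradient(locations, activities):
    """Gradient (a, b) of the plane u = a*x + b*y + c fitted by least squares."""
    locations = np.asarray(locations, dtype=float)
    design = np.column_stack([locations, np.ones(len(locations))])
    coef, *_ = np.linalg.lstsq(design, np.asarray(activities, dtype=float), rcond=None)
    return coef[:2]

def angle_between(g1, g2):
    """Angle in degrees between two gradient vectors (0 if either is zero)."""
    norm = np.linalg.norm(g1) * np.linalg.norm(g2)
    if norm == 0.0:
        return 0.0
    cosine = np.clip(np.dot(g1, g2) / norm, -1.0, 1.0)
    return float(np.degrees(np.arccos(cosine)))

def is_complementary(locations, activities_a, activities_b):
    """True if the gradients for opposing stimuli point more than 90 deg apart."""
    g_a = response_gradient(locations, activities_a)
    g_b = response_gradient(locations, activities_b)
    return angle_between(g_a, g_b) > 90.0

# Nine singular point locations: the centre plus eight on a 15 deg circle.
angles = np.deg2rad(np.arange(0, 360, 45))
locations = np.vstack([[0.0, 0.0],
                       15.0 * np.column_stack([np.cos(angles), np.sin(angles)])])

# Hypothetical activities (spikes/sec) for expansion and contraction.
expansion = 20.0 + 0.5 * locations[:, 0]
contraction = 20.0 - 0.4 * locations[:, 0] + 0.1 * locations[:, 1]
complementary = is_complementary(locations, expansion, contraction)
```

The direction of the fitted gradient serves as the direction, seen from the visual field center, in which the area of best response lies.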
Table 2.
Percentages of optic flow-selective cells that exhibited
reversals of selectivity and that gave best responses to opposing
stimuli (expansion vs contraction, clockwise vs counterclockwise
rotation) for opposite locations of the singular point in the visual
field
|          | 15 deg. ecc. | 40 deg. ecc. |
| exp/cont | 89%          | 94%          |
| cw/ccw   | 57%          | 72%          |
To obtain the direction in which the area of best response for a
given flow pattern was located, we computed the gradient of a
two-dimensional regression on the nine activities recorded for a given
stimulus set. If the gradients for opposing stimuli pointed in
directions more than 90° apart, the cell was considered to have
complementary response fields.
Inhibition
Activities dropping below the background level during an optic
flow stimulation were observed frequently. Table 3 lists
the percentages of neurons for which the activity during an optic flow
stimulation dropped below the background level at one or more locations
of the singular point. Table 3 shows that inhibition by a nonpreferred
optic flow pattern is a common finding that is in agreement with the
model. Also, consistent with previous authors (Lagae et al., 1994), we
found the background activity in MST to be relatively high. Median
background activity for our sample of neurons was 12 spikes/sec.
Table 3.
Percentages of optic flow-selective cells that displayed
inhibition for certain positions of the singular point of the optic
flow
|          | 15 deg. ecc. | 40 deg. ecc. |
| exp/cont | 97%          | 95%          |
| cw/ccw   | 80%          | 89%          |
Activity maxima in the periphery
A fourth prediction from the model simulations was that maximum
response should occur in the periphery. We next tested whether the
maximum activities for a given optic flow pattern occurred at the
central position of the singular point in the visual field or at one of
the peripheral locations. Table 4 shows that for the
majority of the neurons, maximum activities occurred at one of the
peripheral positions. However, with the 40° stimulus set the
percentages are near or below the level of chance, which is 89%,
because eight peripheral but only one central location had been tested.
But from a closer inspection of Figure 4 one can deduce that for the
model neurons, the maximum activity might already be approximately
reached at the central position, even though the activity modulation is
monotonically increasing toward the periphery. An inspection of the
measured activities of those MST neurons that failed to show the
maximum activity in the periphery revealed that the prevalent response
characteristic is that of a maximum in the center paired with an
activity of almost the same strength at one or more peripheral
locations (Fig. 8A). Truly bell-shaped
response curves with a clear single peak in the center (Fig.
8B) were rare. Less than half of those neurons that
had a maximum response in the center displayed a single peak response
(6 of 14 for expansion/contraction, 4 of 11 for rotation).
Table 4.
Percentages of optic flow-selective cells for which the
maximum activity was recorded when the singular point of the optic flow
stimulus was located in the peripheral visual field as opposed to the
visual field center
|          | 15 deg. ecc. | 40 deg. ecc. |
| exp/cont | 94%          | 87%          |
| cw/ccw   | 96%          | 78%          |
Fig. 8.
Examples of different shapes of the response
function of neurons with peak response for centered optic flow stimuli.
In both cases, the maximum response is reached when the stimulus is
placed in the visual field center. The neuron in A, however,
exhibits responses of almost the same strength for a number of
eccentric positions that form a plateau in one part of the visual
field. The neuron in B displays a bell-shaped response
function with a clear single peak in the center.
Average response curves
To compare further the shape of the activity modulations in MST to
those obtained in model simulations, we wanted to generate an average
response curve for the population of neurons recorded. The procedure
used to generate an average response curve consisted of two steps.
First, the response curves from individual neurons had to be aligned.
Second, the average over the aligned curves had to be determined. For
the alignment, the directions of the areas of best response were used,
which were introduced above. Response curves from individual neurons
were rotated in the (x,y)-plane in such a way that their
response gradients all pointed in the same direction. To enable the
averaging, this rotation had to be performed in discrete steps of
45°. Average response curves were then obtained by averaging over the
responses of all individual neurons, separately for each location of
the singular point.
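The two-step procedure can be sketched for a single stimulus set of nine locations. The ordering of the locations (centre first, then eight positions at 45° steps), the use of a linear regression gradient for the alignment, and all names and example values are assumptions made for illustration.

```python
import numpy as np

# Assumed layout: index 0 is the central location, indices 1..8 lie on a
# circle of radius 15 deg at 0, 45, ..., 315 deg.
ANGLES = np.deg2rad(np.arange(0, 360, 45))
LOCATIONS = np.vstack([[0.0, 0.0],
                       15.0 * np.column_stack([np.cos(ANGLES), np.sin(ANGLES)])])

def gradient_steps(activities):
    """Direction of the regression gradient, rounded to steps of 45 deg."""
    design = np.column_stack([LOCATIONS, np.ones(len(LOCATIONS))])
    coef, *_ = np.linalg.lstsq(design, np.asarray(activities, dtype=float), rcond=None)
    angle = np.degrees(np.arctan2(coef[1], coef[0]))
    return int(np.round(angle / 45.0)) % 8

def align(activities):
    """Rotate a response curve so that its gradient points toward 0 deg.

    A rotation by a multiple of 45 deg is a cyclic shift of the eight
    peripheral values; the central value stays in place.
    """
    activities = np.asarray(activities, dtype=float)
    k = gradient_steps(activities)
    return np.concatenate([activities[:1], np.roll(activities[1:], -k)])

def average_response(curves):
    """Average aligned curves separately for each singular point location."""
    return np.mean([align(c) for c in curves], axis=0)

# Two hypothetical neurons with the same profile, oriented 90 deg apart.
neuron_a = 10.0 + 0.2 * LOCATIONS[:, 0]
neuron_b = 10.0 + 0.2 * LOCATIONS[:, 1]
average = average_response([neuron_a, neuron_b])
```

Restricting the rotation to 45° steps keeps every rotated sample on one of the measured locations, so that no interpolation between locations is needed before averaging.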
Figure 9 shows the average response curves for all
neurons tested with both sets of stimuli, the 15° and the 40° set
(N = 21 for expansion/contraction, N = 18 for rotation). Shown on the left are the three-dimensional surface
plots also used in the single neuron examples. The arrangement of
expansion/contraction and clockwise/counterclockwise rotation curves in
opposite directions was justified by the observation that the average
response gradients also pointed in opposite directions. The plots on
the right of Figure 9 display cross sections through the midline of the
response curves. There, five points were measured in a row. These plots
serve to illustrate the sigmoidal shape of the response curves along
the gradient direction. A comparison with Figure 4 shows that the
average response curves for the MST neurons we recorded are in good
agreement with the response curves of the model neurons.
Fig. 9.
Average response curves for different optic flow
stimuli. The curves for expansion/contraction show an average of
individual curves from 21 neurons that were recorded with 17 locations
of the singular point each. Eighteen neurons contributed to the average
response curves for rotation. Average response curves were generated by
first aligning the response curves from the individual neurons and then
averaging over the aligned curves. For the alignment, the gradient of
the two-dimensional linear regression was used (see text).
Left, Three-dimensional surface plots of the response
u depending on the location (x,y) of the
singular point in the flow stimulus. Right, Cross sections
through the midlines of the three-dimensional surface plots.
One has to keep in mind, however, that the average over all recorded
neurons also includes neurons with different response curves,
such as those in Figure 8. Nevertheless, we believe that averaging over
all neurons recorded provides the most unbiased way to determine global
characteristics. This is not to say that all individual neurons behave
the same; the average simply helps to illustrate a prevalent response pattern.
Two-dimensional direction selectivity
Most of the MST neurons we recorded also displayed direction
selectivity for frontoparallel unidirectional motion of a full-field
random dot pattern. Direction selectivity, in addition to optic flow
selectivity, has often been described for optic flow-responsive neurons
in MST. However, different authors have put different emphasis on this
observation and on its implication for the optic flow processing
capabilities of these neurons. Early investigations (Saito et al.,
1986; Tanaka et al., 1986; Tanaka and Saito, 1989a) required optic
flow-selective neurons to be directionally unselective for
unidirectional motion. Later studies have suggested that optic flow
selectivity and direction selectivity can coexist and might not be
related to each other (Duffy and Wurtz, 1991a,b). The finding that some
neurons reverse their optic flow selectivity when the stimulus is moved
such that the local motion direction in part of the receptive field is
reversed was taken as evidence against an involvement of these neurons
in optic flow processing (Orban et al., 1992; Lagae et al., 1994).
According to this argument, only neurons that display a positional
invariance when the optic flow stimulus is placed in different parts of
the receptive field are considered to contribute to the optic flow
analysis. On the other hand, if a neuron were completely
position-invariant toward the retinal location of, for instance, the
focus of expansion, it would also be useless for a navigational task
such as heading detection (Graziano et al., 1994). Because of the
network model, we are in a position to test the neuronal properties,
including the relationship between direction selectivity and optic flow
selectivity, in comparison to simulated neurons with a
proven capability to perform a complex analysis of the optic flow.
In our sample, we often encountered a positional invariance of the
optic flow responses when shifts of the position of the singular point
were within the range tested in most of the above studies (up to about 40°).
However, we usually could elicit a reversal of the selectivity when the
displacement of the singular point was large (40° or more). To test whether
these reversals of selectivity might be related to the direction
selectivity of the neurons, it is useful to consider the following
example. Consider a neuron that displays selectivity for expansions
whenever the singular point is located in the right hemifield and
selectivity for contractions when the singular point is located in the
left hemifield. For the expansion stimuli centered on the right, most
local motion directions would contain a movement component directed to
the left. Similarly, for the contraction stimuli centered on the left,
most local motion directions would contain a movement component
directed to the left, too. Thus, if the same neuron also displays a
direction selectivity for leftward motion, its two-dimensional
direction selectivity and its optic flow selectivity would be in
agreement. However, as mentioned above, such a consistency has often
been regarded as evidence against a true involvement in optic flow
processing.
To see how frequently such an agreement between the optic flow
selectivity and the frontoparallel direction selectivity occurs in our
sample of MST neurons, we calculated the angular difference between the
direction of reversal of selectivity for two opposing optic flow
stimuli and the preferred direction for frontoparallel, unidirectional
motion. To determine the direction of reversal of the optic flow
selectivity, we again computed the gradient of a two-dimensional
regression on 9 or 18 data points. Because we wanted to find the
direction of maximum change from, e.g., expansion to contraction, we
used for the regression the difference between the activities during
expansion and the activities during contraction. The obtained gradient
gives the direction along which the activity of the cell during
expansion increases and its activity during contraction decreases.
Along this direction, the selectivity of the cell changes maximally
from contraction to expansion. Consistency with the preferred direction
for frontoparallel motion would imply a 180° angular difference
between the two. The same argument can also be applied toward the
preferred direction for frontoparallel motion and the direction of
reversal of selectivity from counterclockwise to clockwise rotation.
Consistency between the two directions would imply a 90° angular
difference.
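As an illustration of this comparison, the following Python sketch fits a plane to the difference between expansion and contraction activities and compares its gradient direction with a preferred direction. It is a minimal example under stated assumptions, not the original analysis: the function names, the use of an unweighted least-squares fit, and the convention of reporting an unsigned difference between 0° and 180° are all choices made here for illustration.

```python
import numpy as np

def selectivity_reversal_direction(positions, expansion, contraction):
    """Gradient direction (degrees) of a plane fitted to expansion minus contraction."""
    positions = np.asarray(positions, dtype=float)      # shape (n, 2): singular point (x, y)
    diff = np.asarray(expansion, dtype=float) - np.asarray(contraction, dtype=float)
    # Two-dimensional linear regression diff = a*x + b*y + c.
    design = np.column_stack([positions, np.ones(len(positions))])
    coeffs, *_ = np.linalg.lstsq(design, diff, rcond=None)
    return np.degrees(np.arctan2(coeffs[1], coeffs[0])) % 360.0

def angular_difference(a_deg, b_deg):
    """Unsigned difference between two directions, between 0 and 180 degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)
```

For a hypothetical neuron whose expansion response grows and whose contraction response shrinks as the singular point moves to the right, the gradient points rightward (0°). If the same neuron prefers leftward frontoparallel motion (180°), `angular_difference` returns a value of 180°, the case described above as consistent.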
Figure 10, A and B, shows
histograms of the angular difference between the directions of reversal
of optic flow selectivity and the preferred direction for
two-dimensional frontoparallel motion for all neurons tested. Indeed,
for most neurons the direction of reversal is consistent with the
preferred direction. This is true for both expansion/contraction and
rotation stimuli. However, the same consistency is also found in
simulations of model neurons. Figure 10, C and D,
shows the results of the same test run on the simulated responses from
250 randomly selected model neurons. In the model neurons, too, the
direction of reversal is consistent with the preferred direction for
frontoparallel unidirectional motion. The distribution of the model
activities is even more peaked than for the physiological data.
However, a broader distribution of the physiological data might have
been expected simply because of noise in the measurements. Moreover,
the direction was calculated by interpolating between eight discrete
measurements, which limits the accuracy with which it can be
determined. Thus, we think that model and experimental data are in
agreement. In addition, from the model simulations it is evident that
the observed consistency between the preferred direction of
frontoparallel motion and the direction of reversal of optic flow
selectivity does not oppose an involvement in optic flow processing.
Fig. 10.
Distribution of angular differences between the
directions of reversal of optic flow selectivity and the preferred
direction for full-field, frontoparallel, unidirectional motion. The
direction of reversal of the optic flow selectivity was obtained by
fitting a two-dimensional linear regression to 9 or 18 data points. To
determine the direction of maximum change from expansion to contraction
(or clockwise to counterclockwise rotation, respectively), we performed
a regression on the difference between the activity recorded during
expansion and the activity recorded during contraction. The gradient of
the regression indicates the direction in which the selectivity changes
maximally from contraction to expansion. This direction was then
subtracted from the preferred direction for full-field, frontoparallel,
unidirectional motion. The experimental distributions (top
graphs) show a clear correlation between the directions of
reversal and the preferred direction: the distributions peak at 180°
angular difference for expansion/contraction (top left) and
90° angular difference for rotation (top right). The
distributions for 250 randomly selected model neurons (bottom
graphs) display the same correlation. Thus, a correlation between
the optic flow responses and the directional selectivity is obvious for
most neurons, but it is consistent with an involvement in optic flow
analysis.
Motion parallax
In our expansion/contraction stimuli, as well as in most natural
situations, the flow field contains motion parallax, i.e., different
visual objects move at different visual speeds according to their
distance from the observer. Motion parallax is an important cue for the
visual system, transmitting information about the three-dimensional
layout of a visual scene (Rogers and Graham, 1979) and separating
visual rotations caused by eye movements from body translations (Warren
and Hannon, 1990). The response strength of MST neurons is little
affected when the stimuli contain two or three different speed
distributions simulating the motion of dots in different depth planes
(Duffy and Wurtz, 1991a). This might indicate that they are not
concerned with the recovery of spatial structure. However, motion
parallax also carries important information for heading detection,
albeit only in the presence of eye movements and the absence of
extraretinal signals (Warren and Hannon, 1990). In simulations of
psychophysical experiments, the network model also shows a dependence
on motion parallax, because its robustness against noise decreases with
decreasing depth range of the visual scene (Lappe and Rauschecker,
1993b).
To compare motion parallax influences on the level of single neurons,
we recorded the responses of 55 neurons to expansion and contraction
using stimuli containing no motion parallax. This was obtained by
assuming that all random dots were distributed on a frontoparallel
plane instead of in a random three-dimensional cloud. The depth range
of the visual scene in this case is zero, and the distribution of
speeds in the stimulus is uniform and depends only on visual
eccentricity. For each neuron tested, we computed the direction of
maximum change from expansion to contraction as described in the
previous section. We then compared this gradient to the one obtained
using the stimulus set that did contain motion parallax and computed
the angular difference between the gradients in these two conditions.
Figure 11 shows the angular distribution of these
differences for the recorded neurons and for 84 randomly selected model
neurons. In both cases, the distributions are centered around zero, but
some variation occurs. The response modulations are very similar in the
two conditions.
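The difference between the two stimulus types can be made concrete with the standard equations for the image motion produced by pure observer translation. The Python sketch below is an illustration only: it assumes a pinhole projection with unit focal length, and the dot counts, depth values, and translation vector are hypothetical rather than the parameters of our stimuli. For dots on a frontoparallel plane, image speed divided by distance from the singular point is the same for every dot, whereas for a three-dimensional cloud this ratio varies with depth, which is the motion parallax.

```python
import numpy as np

def translational_flow(points_xy, depths, translation):
    """Image velocities for pure observer translation (pinhole camera, focal length 1)."""
    tx, ty, tz = translation
    x, y = points_xy[:, 0], points_xy[:, 1]
    u = (x * tz - tx) / depths
    v = (y * tz - ty) / depths
    return np.column_stack([u, v])

rng = np.random.default_rng(0)
points = rng.uniform(-0.5, 0.5, size=(200, 2))    # hypothetical image positions of dots
translation = (0.2, 0.0, 1.0)                     # hypothetical self-motion
# The singular point (focus of expansion) lies at (tx/tz, ty/tz).
foe = np.array([translation[0] / translation[2], translation[1] / translation[2]])
distance = np.linalg.norm(points - foe, axis=1)

# Frontoparallel plane: every dot at the same depth, hence no motion parallax.
plane_flow = translational_flow(points, np.full(200, 4.0), translation)
# Three-dimensional cloud: depths differ from dot to dot, hence motion parallax.
cloud_flow = translational_flow(points, rng.uniform(2.0, 8.0, 200), translation)

plane_ratio = np.linalg.norm(plane_flow, axis=1) / distance   # constant, equal to tz / depth
cloud_ratio = np.linalg.norm(cloud_flow, axis=1) / distance   # varies with dot depth
```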
Fig. 11.
Influence of motion parallax. Fifty-five neurons
were tested with expansion/contraction stimuli in which all motion
parallax was removed. This was obtained by assuming that all visible
dots were distributed on a frontoparallel plane instead of in a random
three-dimensional cloud. In this case, the distribution of speeds in
the stimulus is uniform and depends only on visual eccentricity. For
each neuron tested, we computed the direction of maximum change from
expansion to contraction as the gradient of a regression on the
difference between the nine activities recorded during expansion and
those recorded during contraction. This gradient was compared to the
one obtained using the stimulus set that did contain motion parallax.
The graph in A shows the angular distribution of these
differences between the gradients in these two conditions. B
shows the results for 84 randomly selected model neurons. In both
cases, the distributions are centered around zero. This indicates that
the response modulations in the case of stimuli lacking motion parallax
are very similar to the case when motion parallax is present. This
holds for the recorded as well as the simulated data.
This raises the question of the origin of the psychophysical
observations. The simulation results indicate that a modest variation
observed in the simulated as well as the experimental data is
sufficient to induce an effect similar to the effects observed in
humans, i.e., an inability to detect correctly the direction of heading
in the presence of slow eye movements when motion parallax is lacking
(Warren and Hannon, 1990). This would suggest that a dependence of the
heading detection system on motion parallax is a population effect: the
dependence of individual cells on motion parallax is weak, but it becomes
observable in the behavior of the complete system.
Recovering heading direction from neuronal activities
So far we have described the activity modulations of single MST
neurons to optic flow stimuli that varied the location of the singular
point in the flow field. We have compared these modulations to response
curves obtained from single-neuron simulations in the model. For the
case of our expansion stimuli, the singular point is a focus of
expansion and directly indicates the direction of heading. In a final
step, we wanted to test, therefore, whether a representation of the
direction of heading by the population response of the recorded MST
neurons is possible, as the model suggests. Unfortunately, it is
impossible to use the experimental data directly in the mechanisms used
by the model, for the following reason. Neurons in the model
are organized in subpopulations, each of which represents a specific
direction of heading (see Fig. 1). The assignment of a single neuron to
one such population is used for the calculation of the connection
strengths required to perform the task. The specific properties of the
neuron, e.g., the orientations of its response curves, result partially
from this assignment. But other parameters, such as the spatial
distribution of inputs that the neuron receives from the first layer,
also influence its response properties (see Fig. 3). Thus, from the
measured response curves alone it is not possible to infer
with which subpopulation the neuron is to be associated.
Therefore, a direct use of the recorded neuronal activities in the
computational algorithm of the model is not possible. However, it is
possible to test whether the population of neurons in MST is capable,
in principle, of representing the direction of heading in a
manner similar to the model. We used a least-mean-square minimization
scheme to derive the position (x,y) of the focus of
expansion of an expanding flow pattern from the neuronal activities.
For each neuron, a sigmoid response curve u(x,y),
analogous to the one used in the model, was fitted to the recorded
activities. Then the actual recorded activity $u_r$
of the neuron was used as a constraint for the location of the singular
point: $u_r - u(x,y) = 0$. The constraints from the individual neurons were squared and averaged.
The result is a map of the least-mean-squared errors,
$U(x,y) = \frac{1}{N} \sum_{i}^{N} \left(u_r - u(x,y)\right)^2$, for each possible
location (x,y) of the singular point. This map is similar to
the likelihood map of heading directions shown in Figure 2.
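The minimization scheme can be sketched in a few lines of Python. This is a hedged illustration rather than the code used for the analysis: the parametrization of the sigmoid, the function names, and the synthetic noise-free population of 31 neurons are assumptions introduced here, and the step of fitting each sigmoid to recorded data is taken as already done. The sum in the error map runs over the neurons, each contributing its own recorded activity and its own fitted curve.

```python
import numpy as np

def sigmoid_response(x, y, gx, gy, offset, amplitude, baseline):
    """Hypothetical sigmoid surface u(x, y); this parametrization is an assumption."""
    return baseline + amplitude / (1.0 + np.exp(-(gx * x + gy * y - offset)))

def heading_map(recorded, params, grid):
    """Return grid coordinates and the mean squared error U(x, y)."""
    xs, ys = np.meshgrid(grid, grid)
    # One squared constraint (u_r - u(x, y))^2 per neuron, averaged over neurons.
    squared = [(u_r - sigmoid_response(xs, ys, *p)) ** 2
               for u_r, p in zip(recorded, params)]
    return xs, ys, np.mean(squared, axis=0)

def most_likely_location(recorded, params, grid):
    """Grid location of the singular point that minimizes U(x, y)."""
    xs, ys, error_map = heading_map(recorded, params, grid)
    idx = np.unravel_index(np.argmin(error_map), error_map.shape)
    return float(xs[idx]), float(ys[idx])

# Synthetic, noise-free demonstration with 31 made-up neurons.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, 31)
params = [(0.05 * np.cos(a), 0.05 * np.sin(a), rng.uniform(-1.0, 1.0), 30.0, 10.0)
          for a in angles]
true_x, true_y = 15.0, -15.0
recorded = [sigmoid_response(true_x, true_y, *p) for p in params]
grid = np.arange(-40.0, 41.0, 5.0)
estimate = most_likely_location(recorded, params, grid)
```

Because a single sigmoid constrains the singular point only along its gradient direction, the estimate becomes unique only when neurons with differently oriented response curves are combined, which is why the error is averaged over the population.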
Grayscale plots of such heading maps obtained from the recorded
neuronal activities are shown in Figure 12. The gray
value at each map position corresponds to the magnitude of
U(x,y). Brighter gray levels indicate small
values of U(x,y). For these plots, all neurons
recorded with the 15° expansion stimuli were used (N = 31). As in Figure 2, the most likely heading direction, indicated by
the focus of expansion in the stimulus and represented by the neuronal
activities, is given by the brightest square in the map. For comparison
with the true heading direction, the nine optic flow stimuli used are
plotted on top of the grayscale maps. It is evident that the potential
to represent the direction of heading in the situation we studied is
present in the neuronal population activity. A computation of the mean
error over all nine positions shows that with this simple procedure,
the direction of heading, i.e., the location of the singular point,
could be retrieved from the neuronal activities with an average
precision of 4.3°. This error has to be compared to an average error
of 2.5° obtained in a human psychophysical study with comparable
stimuli (Warren and Kurtz, 1992). Thus, the precision appears to be in
the range of the human data, especially if one considers that only 31 neurons contributed to the computation.
Fig. 12.
Grayscale plots of computational heading maps
obtained from the recorded neuronal activities. A least-square
minimization scheme was used to derive the position
(x,y) of the singular point of an expanding optic
flow stimulus from the neuronal activities (see text for details). All
neurons recorded with the 15° expansion stimuli contributed to the
computation (N = 31). In the plots, the obtained
least-square error for a specific heading direction
(x,y) is coded by the gray value at that map
location. Brighter gray levels indicate smaller values of the
mean-square error. The most likely heading direction is given by the
brightest square in the map. For comparison with the true heading
direction, which is the focus of expansion in the case we studied, the
optic flow stimuli used are plotted on top of the grayscale
maps.
We feel that it is very important to add two caveats, however. First,
we want to emphasize again that the procedure used to determine the
direction of heading from the neuronal activities differs from the
procedure used by the model. Thus, the possibility of recovering the
direction of heading from the neuronal activities can only be taken as
an indication that this capability is present in MST, not that area MST
essentially operates in this manner. In fact, a direct computation of
the least-mean-square error by the nervous system is difficult to
imagine. The neural network model outlined above is a much more
biologically plausible way to achieve the same result. However, because
of its structure it would require a much larger number of neurons to
achieve similar accuracy. The simulations used up to 16,000 neurons.
This is partly because the model also includes the means to cope with ongoing
slow eye movements, and partly because it uses an excessive
population coding in which the actual computation is spread out over
many neurons. Second, it is important to note that the procedure
outlined above is only capable of locating the singular point in an
optic flow pattern. The one-to-one correspondence between the direction
of heading and the retinotopic location of the singular point of an
expanding flow pattern only holds under limited conditions, namely,
when no eye movements occur. However, when eye movements occur during
locomotion, the location of the singular point is different from the
direction of heading (Regan and Beverley, 1982; Warren and Hannon, 1990;
Lappe and Rauschecker, 1995). So far, we can only claim that the
neuronal population in MST can recover the direction of heading for the
specific and limited set of stimuli we used.
DISCUSSION
We recorded activities from single neurons in area MST of
the macaque monkey during full-field optic flow stimulation and
compared them to simulations of a neural network model of heading
detection. The neuronal activities in MST are modulated by the retinal
position of the singular point of a flow pattern. We demonstrated that
these activity modulations could enable the MST population to determine
the location of the focus of expansion and, hence, the direction of
heading in the case of our stimulus set.
Compliance with model predictions
Work on modeling the optic flow processing capabilities
of human observers (Lappe and Rauschecker, 1993b, 1994) resulted in a
number of predictions for the properties of optic flow processing
neurons. Our experiments were designed to investigate whether optic
flow-responsive neurons in area MST conformed with these predictions.
We found the majority of the neurons to be in good agreement with them. Neuronal
activities depended on the position of the singular point. Reversals of
selectivity occurred as the singular point was moved a large distance
across the visual field. Best responses to opposing stimuli occurred
for opposite locations of the singular point within the visual field.
Substantial background activity occurred in the absence of visual
stimulation. Excitation by one type of motion at a particular location
of the singular point was often paired with inhibition by the opposite
type of motion. Activity maxima often occurred for peripheral locations
of the singular point.
In addition, a correlation between the reversals of optic flow
selectivity and the preferred directions for frontoparallel,
two-dimensional motion that was evident from the recorded data was
similarly found in model simulations. Motion parallax influenced MST
neurons in much the same way as it did model neurons.
The clearest demonstration of the similarity between model simulations
and experimental data was seen in the average response functions of the
record