The Journal of Neuroscience, April 15, 1999, 19(8):3122-3145
A Theory of Geometric Constraints on Neural Activity
for Natural Three-Dimensional Movement
Kechen Zhang1 and Terrence J. Sejnowski1,2
1 Howard Hughes Medical Institute, Computational
Neurobiology Laboratory, The Salk Institute for Biological Studies,
La Jolla, California 92037, and 2 Department of Biology,
University of California, San Diego, La Jolla, California 92093
ABSTRACT
Although the orientation of an arm in space or the static view of
an object may be represented by a population of neurons in complex
ways, how these variables change with movement often follows simple
linear rules, reflecting the underlying geometric constraints in the
physical world. A theoretical analysis is presented for how such
constraints affect the average firing rates of sensory and motor
neurons during natural movements with low degrees of freedom, such as a
limb movement and rigid object motion. When applied to nonrigid
reaching arm movements, the linear theory accounts for cosine
directional tuning with linear speed modulation, predicts a curl-free
spatial distribution of preferred directions, and also explains why the
instantaneous motion of the hand can be recovered from the neural
population activity. For three-dimensional motion of a rigid object,
the theory predicts that, to a first approximation, the response of a
sensory neuron should have a preferred translational direction and a
preferred rotation axis in space, both with cosine tuning functions
modulated multiplicatively by speed and angular speed, respectively.
Some known tuning properties of motion-sensitive neurons follow as
special cases. Acceleration tuning and nonlinear speed modulation are
considered in an extension of the linear theory. This general approach
provides a principled method to derive mechanism-insensitive neuronal
properties by exploiting the inherently low dimensionality of natural movements.
Key words:
3-D object; cortical representation; visual cortex; tuning curve; motor system; reaching movement; speed modulation; potential function; gradient field; zero curl
INTRODUCTION
For natural movements, such as the
motion of a rigid object or an active limb movement, many sensory
receptors or muscles are involved, but the actual degrees of freedom
are low because of geometric constraints in the physical world. For
example, as illustrated in Figure 1, the
rotation of an object alters many visual cues. How these cues vary in
time is not arbitrary but is fully determined by the rigid motion,
which has only 6 degrees of freedom. As a consequence, neuronal
activity reflecting such natural movements also is likely to be highly
constrained and to have only a few degrees of freedom.

Figure 1.
Axis of rotation determines how the view of an
object changes instantaneously, along with various visual cues, such as
shading, shadow, mirror reflection, glare, and occlusion. Rigid
geometry predicts that the response of a motion-sensitive neuron, to a
first approximation, should have a preferred rotation axis in
three-dimensional space with cosine tuning function and linear angular
speed modulation, regardless of the exact cues used and the exact
computational mechanisms involved.
This paper presents a theoretical analysis of how neuronal activity
correlated with natural movements might be constrained by geometry. The
basic theory, although essentially linear, can account for several key
features of diverse neurophysiological results and generates strong
predictions that are testable with current experimental techniques.
An emerging principle from this analysis is that neuronal activity
tuned to movement often obeys simple generic rules as a first
approximation, insensitive to the exact sensory or motor variables that
are encoded and the exact computational interpretation. Such generic
tuning properties are mechanism insensitive because they are
better described as reflecting the underlying geometric constraints on
movements rather than the actual computational mechanisms. This
simplicity arises when sensory or motor variables represent changes in
time rather than static values. In the example shown in Figure 1, the
viewpoint was fixed and the object was rotated systematically
around different axes. The focus is on how neuronal responses depend on
the rotation axis in three-dimensional space, given approximately the
same view of the object. It is possible to derive a simple cosine
tuning rule for the rotation axis, although various visual cues may
depend on the static geometrical orientation of the object in complex
ways. Three-dimensional object motion is a specific example; the same
principles also apply to several other biological systems, including
nonrigid arm movement.
DIRECTIONAL TUNING FOR ARM MOVEMENT
Although the visual and the motor examples share similar
mechanism-insensitive properties, the reaching arm movement has a simpler mathematical description and more supporting experimental results and will be considered first.
Ubiquity of cosine tuning
A directional tuning curve describes how the mean firing rate of a
neuron depends on the reaching direction of the hand. As illustrated in
Figure 2, broad cosine-like tuning curves
are very typical in many areas of the motor system of monkeys,
including the primary motor cortex (Georgopoulos et al., 1986),
premotor cortex (Caminiti et al., 1991), parietal cortex (Kalaska et
al., 1990), cerebellum (Fortier et al., 1989), basal ganglia (Turner and Anderson, 1997), and somatosensory cortex (Cohen et al., 1994; Prud'homme and Kalaska, 1994). Although the examples shown in Figure 2
are two-dimensional, cosine tuning holds as well for three-dimensional
reaching movement (Georgopoulos et al., 1986; Schwartz et al.,
1988).

Figure 2.
Cosine tuning to hand movement direction is very
common in the monkey motor system. Shown here are examples of average tuning
curves in two-dimensional reaching tasks, with preferred direction
taken as 0°. Left column, Circular normal functions
(solid curves) fit the data slightly better and are
slightly narrower than cosine functions (dashed curves).
Horizontal lines indicate background firing rates without
movement. Right column, Data and the circular normal
functions after subtracting the cosine functions. Data from motor
cortex (M1) and cerebellum (Purkinje cells plus deep nuclei) are from
Figure 2 in Fortier et al. (1993), basal ganglia data (GPe) are from
Figure 8B (decrease type) in Turner and Anderson (1997), and
somatosensory cortex data (S1) are from Figure 11A (no load case) in
Prud'homme and Kalaska (1994), with permission.
The ubiquity of cosine tuning is a hint that this property is generic
and insensitive to the exact computational function of these neurons.
For example, coding of muscle shortening rate is one theoretical
mechanism that can generate cosine tuning (Mussa-Ivaldi, 1988). As
another example, many somatosensory cortical cells related to reaching
had cosine directional tuning, probably because of the geometry of
mechanical deformation of the skin during arm movement (Cohen
et al., 1994; Prud'homme and Kalaska, 1994). Because a cosine tuning
function implies a dot product between a fixed preferred direction and
the actual reaching direction (Georgopoulos et al., 1986), cosine
tuning by itself suggests a linear relation with reaching direction
(Sanger, 1994), which could arise as an approximation to the activity
in a nonlinear recurrent network (Moody and Zipser, 1998). Therefore,
cosine tuning curves should be common in a theoretical model that is
approximately linear.
Basic theory
In this section we derive a general tuning rule for motor neurons
and then discuss its basic properties. This example illustrates what is
meant by mechanism-insensitive properties and the general theoretical
argument based on geometric constraints.
Consider stereotyped reaching movement in which the configuration of
the whole arm is determined completely by the hand position (x,
y, z) in space. In other words, such movements have only 3 degrees
of freedom. Assume that the mean firing rate of a neuron relative
to baseline is proportional to the time derivative of an unknown smooth
function of hand position in space. In other words:
$$f = f_0 + \frac{d}{dt}\,\Phi(x, y, z) \tag{1}$$
where f is the firing rate, f0
is the baseline rate, and Φ is an arbitrary function of the hand
position (x, y, z). A possible small time difference between
the neural activity and the arm movement may also be included, as appropriate.
The function Φ(x, y, z) could have any form and could
include any function of arm configuration, such as muscle length, joint angles, or any combination of those. Mussa-Ivaldi (1988) first used
muscle length to demonstrate the appearance of cosine tuning in a
two-dimensional situation and pointed out that the argument could be
generalized to include other muscle variables. This interesting example
illustrates how cosine tuning property might emerge from some simple
assumptions. The assumption in Equation 1 is more general and the
formalism is simpler than that of Mussa-Ivaldi (1988) because joint
angles are no longer used as intermediate variables in the derivation.
This makes interpretation easier and more flexible and the curl-free
condition more apparent (see below). The precise interpretation of Φ is not the focus of this paper; the only requirement is that it be a
function fully determined by the hand position in the three-dimensional space.
We emphasize that although Equation 1 uses the hand position coordinates as the only
free variables, this does not require that the neuron
directly encode the hand position or end-point in particular, or
kinematic variables in general. Stereotypical reaching movements have
only 3 degrees of freedom and can be conveniently parameterized by the
hand position (x, y, z), although other parameters can also
be used without affecting the final conclusion (see below and Appendix
A). A neuron related to reaching arm movement should be sensitive to
changes of arm posture, which can always be expressed equivalently as
changes in some functions of the hand position (x, y, z).
The simplest estimate of such changes is the first temporal derivative
given in Equation 1. In other words, the above assumption only
postulates a general dependence of the firing rate of a neuron on
changing arm posture as a first approximation, regardless of which
parameters are encoded and how they are encoded.
The assumption in Equation 1 implies that the mean firing rate of a
neuron should follow the tuning rule:
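$$f = f_0 + \mathbf{p} \cdot \mathbf{v} \tag{2}$$
where v = (dx/dt, dy/dt, dz/dt) is the
instantaneous reaching velocity of the hand, and the vector
p is the preferred reaching direction, given by:
$$\mathbf{p} = \nabla\Phi = \left(\frac{\partial \Phi}{\partial x},\ \frac{\partial \Phi}{\partial y},\ \frac{\partial \Phi}{\partial z}\right) \tag{3}$$
The derivation of this result follows immediately from the chain
rule:
$$\frac{d\Phi}{dt} = \frac{\partial \Phi}{\partial x}\frac{dx}{dt} + \frac{\partial \Phi}{\partial y}\frac{dy}{dt} + \frac{\partial \Phi}{\partial z}\frac{dz}{dt} = \nabla\Phi \cdot \mathbf{v} \tag{4}$$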
For hand movements starting from the same position (x, y,
z) in space, the tuning rule in Equation 2 implies cosine
directional tuning and linear speed modulation (see Eq. 12). The
preferred direction vector p = p(x, y,
z) of the neuron may depend on the starting hand position. It can
be regarded as a constant vector when the hand is close to its starting position.
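The step from Equation 1 to Equation 2 can be checked numerically. The sketch below is illustrative only: the potential `phi` is a made-up smooth function (the theory leaves its form open), and the helper names are assumptions of this example. It compares a direct finite-difference estimate of the time derivative of the potential with the dot product of its gradient and the hand velocity, for eight directions and two speeds from one starting position.

```python
import math

def phi(x, y, z):
    # Hypothetical smooth potential; any smooth function of hand position would do.
    return math.sin(x) * y + 0.5 * z * z + x * z

def grad(func, pos, h=1e-5):
    # Central-difference gradient; plays the role of p in Equation 3.
    g = []
    for i in range(3):
        lo, hi = list(pos), list(pos)
        lo[i] -= h
        hi[i] += h
        g.append((func(*hi) - func(*lo)) / (2 * h))
    return g

def rate_above_baseline(pos, vel, dt=1e-5):
    # Direct estimate of d(phi)/dt for a hand at pos moving with velocity vel (Equation 1).
    ahead = [p + v * dt for p, v in zip(pos, vel)]
    behind = [p - v * dt for p, v in zip(pos, vel)]
    return (phi(*ahead) - phi(*behind)) / (2 * dt)

start = (0.3, -0.2, 0.5)
p = grad(phi, start)

errors = []
for speed in (1.0, 2.0):
    for k in range(8):
        angle = 2 * math.pi * k / 8
        v = (speed * math.cos(angle), speed * math.sin(angle), 0.0)
        direct = rate_above_baseline(start, v)
        tuned = sum(pi * vi for pi, vi in zip(p, v))  # p . v from Equation 2
        errors.append(abs(direct - tuned))
max_err = max(errors)
```

Because the agreement holds for every direction and speed, the cosine shape and the linear speed scaling come from the chain rule alone, not from the particular choice of `phi`.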
For hand movements starting from different positions, the preferred
direction vector may vary with the starting hand position (x, y,
z) and thus can be visualized as a vector field (Caminiti et al.,
1990; Moody and Zipser, 1998). It follows from the gradient formula in
Equation 3 that this vector field of preferred direction must have zero
curl:
$$\nabla \times \mathbf{p} = \mathbf{0} \tag{5}$$
because of the equality of mixed second partial derivatives of
Φ. This means that the components of the preferred direction cannot
vary arbitrarily with the starting hand position. An equivalent integral formulation of the curl-free condition is that the path integral of p vanishes along any closed curve in
three-dimensional space:
$$\oint \mathbf{p} \cdot d\mathbf{l} = 0 \tag{6}$$
with dl = (dx, dy, dz), assuming that there
are no singularities in the vector field. This constrains how the
preferred direction of a neuron should vary with the starting hand
position. Any distribution with non-zero curl can be ruled out (Fig.
3).

Figure 3.
The preferred direction field of a hypothetical
neuron that violates the curl-free condition in a planar reaching task.
For each hand position, the preferred direction of this neuron is
always perpendicular to the straight line from the hand
(H) to the shoulder (S), and the length
of the vector is proportional to the distance of HS. This
vector field has constant non-zero curl everywhere in the work space.
The gradient theory does not allow the existence of such a
neuron.
Human eyes are not reliable at judging whether a vector field is
curl-free (see Fig. 10), so numerical computation is needed (Mussa-Ivaldi et al., 1985; Giszter et al., 1993). See Appendix B for
more discussion. A vector field is curl-free if and only if it can be
generated as the gradient of a potential function. A more intuitive
interpretation of the curl-free condition is that when a vector field
is regarded as the velocity field of a fluid, there is no net
circulation along any closed path in space.
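As a rough illustration of such a numerical check, the sketch below estimates the curl of two planar fields by central differences. Both fields are assumptions made for this example: `rotational_field` mimics the arrangement described for Figure 3 with the shoulder placed at the origin, and `gradient_field` is the gradient of an arbitrary made-up potential.

```python
def curl_z(field, x, y, h=1e-4):
    # z-component of the curl of a planar field (px, py), by central differences.
    dpy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dpx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dpy_dx - dpx_dy

def rotational_field(x, y, c=1.0):
    # Perpendicular to the line from the shoulder (origin) to the hand (x, y),
    # with length c times the hand-shoulder distance.
    return (-c * y, c * x)

def gradient_field(x, y):
    # Gradient of the hypothetical potential phi = x*x*y + 3*y.
    return (2 * x * y, x * x + 3)

points = [(0.2, 0.1), (-0.5, 0.7), (1.0, -0.3)]
rot_curls = [curl_z(rotational_field, x, y) for x, y in points]
grad_curls = [curl_z(gradient_field, x, y) for x, y in points]
```

In this sketch the rotational field has curl 2c at every sampled point, whereas the gradient field has zero curl, which is the distinction the gradient theory relies on.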
Under the curl-free condition, the net spike count (integration of the
firing rate with respect to baseline over time) can be used to recover
the value of the unknown potential function:
$$\int_0^T (f - f_0)\,dt = \Phi(x, y, z) - \Phi(x_0, y_0, z_0) \tag{7}$$
where the integral depends only on the initial hand position
(x0, y0,
z0) at time 0 and the final position (x,
y, z) at time T, not on the exact trajectory of hand
movement. For each hand position, the firing rate is the largest when
the hand moves along the local gradient of the potential function,
which defines p.
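The path independence stated here can be illustrated with a small numerical experiment. The sketch assumes a made-up potential `phi` with an analytically known gradient, and integrates p · v along two different hand paths that share the same start and end points. Function names and numerical values are hypothetical.

```python
import math

def phi(x, y, z):
    # Hypothetical potential function of hand position.
    return x * x * y + math.sin(z) + y * z

def grad_phi(x, y, z):
    # Analytic gradient of phi; this is the preferred direction field p.
    return (2 * x * y, x * x + z, math.cos(z) + y)

start, end = (0.0, 0.0, 0.0), (1.0, 0.5, -0.4)

def straight(s):
    # Straight path from start to end, parameterized by s in [0, 1].
    return tuple(a + s * (b - a) for a, b in zip(start, end))

def detour(s):
    # Curved path with the same end points (the bump vanishes at s = 0 and s = 1).
    bump = math.sin(math.pi * s)
    x, y, z = straight(s)
    return (x + 0.8 * bump, y - 0.5 * bump, z + 0.3 * bump)

def net_spike_count(path, n=2000):
    # Midpoint-rule approximation of the integral of p . v dt along the path.
    total = 0.0
    for k in range(n):
        a = path(k / n)
        b = path((k + 1) / n)
        p = grad_phi(*path((k + 0.5) / n))
        total += sum(pi * (bi - ai) for pi, ai, bi in zip(p, a, b))
    return total

count_straight = net_spike_count(straight)
count_detour = net_spike_count(detour)
expected = phi(*end) - phi(*start)
```

Both counts agree with the potential difference in Equation 7 to within the integration error, regardless of the route taken.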
Baseline firing rate
The theory in the preceding section does not constrain the
baseline firing rate f0, which needs to
be considered separately. By definition, the baseline firing rate is
independent of the reaching direction, but it may be modulated by
several other factors. For example, in the motor cortex, Kettner et al.
(1988) have reported that the linear formula:
$$f_0 = a_0 + a_1 x + a_2 y + a_3 z \tag{8}$$
approximately described the baseline firing rate while the hand
was held fixed at position (x, y, z) in the
three-dimensional work space, where a0,
a1, a2, a3 are
constant coefficients. For reaching at speed v, a more
general linear formula for the baseline firing rate is:
$$f_0 = a_0 + a_1 x + a_2 y + a_3 z + a v \tag{9}$$
where the coefficients a0,
a1, a2, a3,
a are independent of the hand position (x, y, z) and
the speed v, but may vary with task conditions. For
instance, the baseline firing rate when the hand is held still (Fig. 2,
horizontal lines) differs from the baseline rate defined as
the average of the cosine curve during reaching. Moran and Schwartz
(1999) showed that a linear speed term for baseline rate should
be included in the fitting formula, although their analysis used the
square root of firing rate instead of the raw firing rate. Indirect
evidence for a linear speed term in baseline rate is provided by the
linear effect of reaching distance (see below).
Note that in Equation 9, the baseline firing rate contains information
about both the static hand position (x, y, z) and its speed
v. As shown by Kettner et al. (1988), the spatial gradient of the spontaneous firing rate for static hand position tends to be
consistent with the preferred direction of the same neuron. In the
current theory, this means that the preferred direction p = ∇Φ tends to point in the same direction as the
vector (a1, a2,
a3) in Equation 9. Therefore, if the potential
function Φ = Φ(x, y, z) can be approximated as a
linear function in x, y, z, we can replace Equation 9
by:
$$f_0 = a_0 + k\,\Phi(x, y, z) + a v \tag{10}$$
where k is a constant coefficient. In this case, the
overall firing rate of a neuron that obeys the basic tuning rule in Equation 2 would convey two pieces of information: the baseline firing
rate f0 would represent the static value of the
potential function Φ, and the directionally tuned part
p · v would represent the spatial gradient of
the same potential function.
Linear theory without gradient
If we simply postulate that the firing rate of a neuron is
linearly related to the components of reaching velocity v = (vx, vy,
vz), we would have the same tuning rule:
$$f = f_0 + p_x v_x + p_y v_y + p_z v_z = f_0 + \mathbf{p} \cdot \mathbf{v} \tag{11}$$
where the components of the preferred direction,
(px, py,
pz) = p, are three arbitrary functions
of the hand position (x, y, z). For a single starting hand
position, this tuning rule is locally indistinguishable from the
prediction of the gradient theory. The difference is that now the
preferred direction field is not required to be the gradient of any
potential function so that its global distribution in hand position
space is not constrained at all. In other words, this vector field need
not be curl-free. The nongradient theory is more general, allowing a
circular distribution of the preferred directions as in Figure 3. The
necessary and sufficient condition for the gradient theory to be true
is that the preferred direction field is curl-free. The existing data cannot distinguish the two theories (see discussion below and Appendix
B).
Comparison with experimental results
Data from a wide range of motor-related brain areas largely
confirm the tuning rule in Equation 2 as a reasonable approximation, together with its various ramifications as follows. Theoretical predictions such as the curl-free distribution remain to be tested.
Cosine directional tuning and multiplicatively linear speed modulation
The tuning rule in Equation 2 captures two main effects: cosine
directional tuning and multiplicatively linear speed modulation, as
clearly seen in its equivalent form:
$$f = f_0 + p\,v \cos\alpha \tag{12}$$
where v = |v| is the reaching speed, the
proportional factor p = |p| is the length of the
preferred direction vector p, and α is the angle between
the instantaneous reaching direction and the preferred direction.
Because the hand trajectory is approximately straight in normal
reaching, the instantaneous velocity v is a vector that
points in the same direction as the reaching direction. If a tuning
function is cosine in three-dimensional space, it must also be cosine
in any two-dimensional subspace, as in the examples in Figure 2.
A cosine function is a good approximation to the directional tuning
data, although a circular normal function (Eq. A17), with one more free
parameter, tends to fit the data slightly better (Fig. 2). The residual
can be roughly accounted for by an additional Fourier term, cos 2α,
with an amplitude less than ~10% of that of the original term, cos α
(see further discussion in Appendix A).
The speed modulation effect predicted by Equation 12 is multiplicative;
that is, the firing rate should be higher for faster reaching speed
without affecting the shape of the cosine tuning function. This is
approximately true as shown by Moran and Schwartz (1999), who,
however, used the square root of firing rate in analysis so that the
linearity of speed modulation on raw firing rate was not directly
quantified. Indirect evidence for linear speed modulation includes
trajectory reconstruction and the curvature power law (see below).
Neuronal population vector
Suppose the firing rate of each neuron i in a
population (i = 1, 2, ... , N) follows the same
tuning rule as considered above:
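$$f_i = f_{0i} + \mathbf{p}_i \cdot \mathbf{v} \tag{13}$$
The population vector u is defined as the vector sum of
the preferred directions pi weighted by
firing rates relative to baselines (Georgopoulos et al., 1986):
$$\mathbf{u} = \sum_{i=1}^{N} (f_i - f_{0i})\,\mathbf{p}_i = \sum_{i=1}^{N} (\mathbf{p}_i \cdot \mathbf{v})\,\mathbf{p}_i \tag{14}$$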
where in the second step, Equation 13 is used. For the population
vector u to be proportional to the true velocity v, namely:
$$\mathbf{u} = \lambda \mathbf{v} \tag{15}$$
the necessary and sufficient condition is that the preferred
directions satisfy:
$$\sum_{i=1}^{N} \mathbf{p}_i \mathbf{p}_i^{T} = \lambda \mathbf{I} \tag{16}$$
where λ is an arbitrary constant, I is the 3 × 3 identity matrix, each pi is a column
vector, and piT is a row vector
(Mussa-Ivaldi, 1988; Gaál, 1993; Salinas and Abbott, 1994;
Sanger, 1994). In particular, when pi are
distributed uniformly, as is roughly true for cells in motor cortex
(Georgopoulos et al., 1988), the condition in Equation 16 is satisfied
so that Equation 15 follows as a consequence. Then the population
vector approximates the reaching direction and reaching velocity (Moran
and Schwartz, 1999).
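A minimal sketch of this condition, using a hypothetical population of six neurons whose preferred directions point along the positive and negative coordinate axes. For this toy population the sum of outer products is twice the identity matrix, so the population vector should equal twice the hand velocity. The baselines, the velocity, and the function names are made up for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def population_vector(preferred, baselines, rates):
    # Vector sum of preferred directions weighted by rate above baseline (Equation 14).
    u = [0.0, 0.0, 0.0]
    for p, f0, f in zip(preferred, baselines, rates):
        for k in range(3):
            u[k] += (f - f0) * p[k]
    return u

def outer_sum(preferred):
    # Sum over neurons of p_i p_i^T, the matrix on the left side of Equation 16.
    return [[sum(p[r] * p[c] for p in preferred) for c in range(3)] for r in range(3)]

# Hypothetical population: preferred directions along +/-x, +/-y, +/-z.
preferred = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
baselines = [10.0, 12.0, 8.0, 15.0, 9.0, 11.0]
v = (0.3, -0.7, 0.2)

rates = [f0 + dot(p, v) for p, f0 in zip(preferred, baselines)]  # Equation 13
u = population_vector(preferred, baselines, rates)

# Dropping the two z-tuned neurons violates Equation 16, and u no longer tracks v.
u_biased = population_vector(preferred[:4], baselines[:4], rates[:4])
```

With all six neurons, `u` is parallel to `v`. With the z-tuned pair removed, the z-component of the population vector vanishes even though the hand is moving in z, which shows why the condition on the preferred directions is needed.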
Trajectory reconstruction
One implication of Equation 15 is that integration of the
population vector over time can reconstruct the hand trajectory, up to a
scaling constant:
$$\int_0^t \mathbf{u}\,dt' = \lambda \int_0^t \mathbf{v}\,dt' = \lambda\,[\mathbf{r}(t) - \mathbf{r}(0)] \tag{17}$$
where r(t) is hand position at time t.
This is consistent with the finding that adding up the population
vector head-to-tail approximately reproduced the shape of the hand
trajectory (Schwartz, 1993, 1994), because head-to-tail addition is a
discrete approximation to the continuous vector integration.
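The head-to-tail construction can be sketched as a running sum. The example assumes the same toy population of six axis-aligned preferred directions (for which the scaling constant is 2) and a made-up helical hand movement. None of these choices come from the experiments cited above.

```python
import math

PREFERRED = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
LAM = 2.0  # for these six directions, the sum of p p^T equals 2 * identity

def hand_velocity(t):
    # Hypothetical helical drawing movement.
    return (math.cos(t), math.sin(t), 0.1)

def population_vector(v):
    # Rates above baseline are p . v (Equation 13); the weighted sum gives u (Equation 14).
    u = [0.0, 0.0, 0.0]
    for p in PREFERRED:
        rate_above_baseline = sum(pk * vk for pk, vk in zip(p, v))
        for k in range(3):
            u[k] += rate_above_baseline * p[k]
    return u

def reconstruct(duration, dt=0.001):
    # Head-to-tail addition of population vectors, rescaled by 1 / LAM.
    pos = [0.0, 0.0, 0.0]
    steps = round(duration / dt)
    for n in range(steps):
        u = population_vector(hand_velocity((n + 0.5) * dt))
        pos = [pk + uk * dt / LAM for pk, uk in zip(pos, u)]
    return pos

estimate = reconstruct(3.0)
true_displacement = (math.sin(3.0), 1.0 - math.cos(3.0), 0.3)
```

The reconstructed end point matches the true displacement of the helix up to the discretization error, and the scaling constant only rescales the whole trajectory.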
Curvature power law
While drawing, the hand moves more slowly when the trajectory is
more highly curved, and obeys a power law:
$$\omega = B\,\kappa^{2/3} \tag{18}$$
where ω is the instantaneous angular velocity with respect to an
instantaneous center determined by the local curvature κ of the
trajectory, and B is a constant (Lacquaniti et al., 1983). Schwartz (1994) showed that the changing direction of the population vector of cells in motor cortex of monkeys followed the same power law
during drawing. This is consistent with Equation 15, which requires
that the population vector u be proportional to the
instantaneous hand velocity v, up to a possible time difference. This form of the power law involves only the direction of
population vector u. To test its length u = |u| or the linearity of firing rate modulation by
reaching speed, one may use the equivalent form of the power law:
$$v = B\,r^{1/3} \tag{19}$$
where v = rω is the hand speed and r = 1/κ is the local radius of curvature of the trajectory.
The length of the population vector u is proportional to the
hand speed v if and only if the population vector follows
the same power law in Equation 19.
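The equivalence of the two forms of the power law is a short substitution. In the sketch below, ω denotes the angular velocity, κ the local curvature, and r = 1/κ the local radius of curvature; the symbol names are a notational assumption of this example.

```latex
\begin{align*}
v &= r\,\omega            && \text{(speed on a locally circular arc of radius } r\text{)} \\
  &= r\,B\,\kappa^{2/3}   && \text{(Equation 18)} \\
  &= B\,r\,r^{-2/3}       && (\kappa = 1/r) \\
  &= B\,r^{1/3}           && \text{(Equation 19)}
\end{align*}
```

Each step is reversible, so a population vector whose direction obeys Equation 18 and whose length is proportional to hand speed must also obey Equation 19.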
Reaching distance
Fu et al. (1993) reported a nearly linear correlation between
firing rates of cells in motor cortex and reaching distance. Although
this result was somewhat confounded by faster reaching for longer
distances, it raises the question of the general effect of reaching
distance. A linear distance effect would be consistent with the basic
model in Equation 2, which implies that:
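$$\int_0^T (f - f_0)\,dt = \int_0^T \mathbf{p} \cdot \mathbf{v}\,dt = \mathbf{p} \cdot \mathbf{d} \tag{20}$$
where vector d = r(T) − r(0) is the final
displacement from the starting position r(0), assuming the
preferred direction p is approximately constant along the movement trajectory. The dot product p · d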
implies a linear relation between the reaching distance and the total spike count above baseline, together with a cosine directional tuning,
regardless of the exact time course of hand velocity.
Note that in Equation 20 the baseline rate f0
has been subtracted. Because the baseline rate itself may contain a
linear speed component as in Equation 9 (Moran and Schwartz, 1999), its
contribution to total spike count should be:
$$\int_0^T f_0\,dt = a_0 T + a \int_0^T v\,dt = a_0 T + a\,|\mathbf{d}| \tag{21}$$
where, for simplicity, a1 = a2 = a3 = 0 has been assumed to ignore the effect of static
hand position. Because the last term is proportional to the reaching
distance |d| but independent of the reaching direction,
it might account for the observation that the modulation of overall
firing rates by reaching distance was often linear but insensitive to
the reaching direction (Fu et al., 1993; Turner and Anderson,
1997).
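The claim that the spike count above baseline depends only on the displacement, and not on the time course of the velocity, can be illustrated as follows. The preferred direction, displacement, duration, and the two speed profiles are all made-up values for this sketch, and the movement is assumed to be straight.

```python
import math

p = (4.0, -1.0, 2.0)      # hypothetical preferred direction vector (spikes per metre)
d = (0.10, 0.20, -0.05)   # hypothetical straight displacement (metres)
T = 0.5                   # movement duration (seconds)

def constant_profile(t):
    # Fraction of the path covered per second at constant speed.
    return 1.0 / T

def bell_profile(t):
    # Smooth start and stop; also integrates to 1 over [0, T].
    return (1.0 - math.cos(2.0 * math.pi * t / T)) / T

def spike_count_above_baseline(profile, n=1000):
    # Midpoint-rule integral of p . v over the movement (left side of Equation 20).
    dt = T / n
    total = 0.0
    for k in range(n):
        rate = profile((k + 0.5) * dt)
        v = [rate * dk for dk in d]
        total += sum(pk * vk for pk, vk in zip(p, v)) * dt
    return total

p_dot_d = sum(pk * dk for pk, dk in zip(p, d))
count_constant = spike_count_above_baseline(constant_profile)
count_bell = spike_count_above_baseline(bell_profile)
```

Both profiles give the same count, equal to p · d, so doubling the reach distance along the same direction doubles the count whatever the speed profile.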
Curl-free distribution of preferred direction
Caminiti et al. (1990, 1991) reported that the preferred direction
of a motor cortical neuron often varied with the starting point of hand
movement. This is allowed by the gradient theory, provided that this
vector field is curl-free, according to Equation 5 or 6. A constant
preferred direction field is always allowed because it has zero curl.
The curl-free condition constrains how the preferred direction of a
neuron may vary in different parts of space. For example, it rules
out the possibility of any circular arrangement of the preferred
directions, such as that in the two-joint planar arm example shown in
Figure 3. The existing data do not include enough points to compute the
curl (see Appendix B). Further experiments would be needed to test
whether the prediction of the gradient theory is correct.
Elbow position
Scott and Kalaska (1997) found that the preferred directions of
some motor cortical cells were altered when the monkey had to reach
unnaturally with the elbow raised to shoulder level. In the current
theoretical framework, adding elbow position as a free parameter is
equivalent to adding one rotation variable ψ, for example, the angle
between the horizontal plane and the plane determined by the hand,
elbow, and shoulder. The same theoretical argument yields the tuning
rule:
$$f = f_0 + \mathbf{p} \cdot \mathbf{v} + K\,\frac{d\psi}{dt} \tag{22}$$
where K is a coefficient that may depend on both hand
position and elbow position. This formula implies two new effects. The
first is that now the preferred direction vector, both its direction
and length, may depend on the elbow position as well as the hand
position (x, y, z):
$$\mathbf{p} = \mathbf{p}(x, y, z, \psi) \tag{23}$$
as reported by Scott and Kalaska (1997). The second effect, a new
prediction, is that the firing rate may contain a component proportional to the angular speed dψ/dt of elbow rotation.
How does this case relate to our earlier results with hand position as
the only free parameter? In the preceding sections, reaching was
assumed to be "stereotypical" in the sense that the elbow position
can be determined completely by the hand position, ignoring forearm
rotation. This assumption may not be true if the final posture
sometimes depends also on the initial hand position (Soechting et al.,
1995). However, when comparing reaching movements starting from the
same initial hand position, it is reasonable to assume that for
stereotypical reaching, the elbow angle can be completely
determined by the hand position (x, y, z), or ψ = ψ(x, y, z). Then the time derivative of ψ, after expanding by the chain rule, can be absorbed into the term
p · v, yielding the original basic tuning rule
in Equation 2. In other words, the assumption of stereotypical movement
reduces the total degrees of freedom to 3, eliminating the elbow
position as an independent variable. Although the elbow angle can still
be used as a free parameter, it is no longer independent of the hand
position. Only three parameters are independent in this case, and their exact choice does not affect the general form of the tuning rule (see
Appendix A for more discussion on coordinate-system independence).
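The reduction described above can be written out explicitly. The sketch below assumes that the potential depends on hand position and on one elbow angle, written here as ψ (the symbol is a notational assumption), and identifies the coefficient K with the partial derivative of the potential with respect to that angle. This identification is one reading of "the same theoretical argument" and is not spelled out in the text.

```latex
\begin{align*}
f - f_0 &= \frac{d}{dt}\,\Phi(x, y, z, \psi)
         = \nabla\Phi \cdot \mathbf{v} + \frac{\partial \Phi}{\partial \psi}\,\frac{d\psi}{dt}
         && \text{(so } \mathbf{p} = \nabla\Phi,\ K = \partial\Phi/\partial\psi\text{)} \\
\frac{d\psi}{dt} &= \nabla\psi \cdot \mathbf{v}
         && \text{(stereotypical reaching: } \psi = \psi(x, y, z)\text{)} \\
f - f_0 &= \left(\nabla\Phi + K\,\nabla\psi\right) \cdot \mathbf{v}
         && \text{(same form as Equation 2)}
\end{align*}
```

Here the gradient is taken with respect to (x, y, z) only. The effective preferred direction in the last line is again the gradient of a function of hand position alone, so the curl-free prediction carries over to the stereotypical case.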
Summary and discussion of more complex cases
As shown above, the basic tuning theory can naturally account for
several important experimental results without making any specific
assumptions about the exact variables encoded or details of the
encoding. These results are generic properties independent of the exact
functional interpretations. This generality makes sense because during
stereotypical movement, redundant variables are inevitably
constrained by the geometry and become highly correlated, so that
they are likely to show similar tuning properties of the same
general type. The theory presented here has formalized this intuition.
The relationship between cosine tuning properties and geometric
constraints is also apparent in the studies of muscle activities and
actions during reaching and isometric tasks. Basic properties resembling those for motor cortical cells have been reported, including
approximately cosine directional tuning curves (but often with a small
secondary peak opposite the preferred direction), speed sensitivity,
and posture dependence (Flanders and Soechting, 1990; Flanders and
Herrmann, 1992; Buneo et al., 1997).
The basic theory needs to be generalized in situations where the hand
position is not the only free parameter. For example, force is one
variable that is often correlated with the activity of motor cortex;
recent examples related to directional tuning include tasks with static
load (Kalaska et al., 1989) and varying isometric forces (Georgopoulos
et al., 1992; Sergio and Kalaska, 1997).
As another example, preparatory activity in motor cortex before onset
of movement can reflect the upcoming reaching direction, as is
especially evident during instructed delay (Georgopoulos et al.,
1989a), and can change rapidly in tasks requiring mental rotation
(Georgopoulos et al., 1989b) or target switching (Pellizzer et al.,
1995).
Moreover, when sensory and motor components were decoupled, some
neurons even from primary motor cortex were more closely related to the
visual movement of a cursor on the computer screen than to the joystick
position or hand movements, in both one-dimensional (Alexander and
Crutcher, 1990) and two-dimensional tasks (Shen and Alexander, 1997a).
By contrast, in virtual reality experiments with visual distortion,
motor cortical activity mainly followed the actual limb trajectory
rather than the animal's visual perception (Moran et al., 1995).
In addition, some differences exist among the neural activity from
different brain areas, although they all show approximate cosine
directional tuning (compare Fig. 2). For instance, compared with
neurons in the motor cortex in a reaching task, the preferred directions in the cerebellum are more variable in repeated trials (Fortier et al., 1989), neurons in the parietal cortex are less sensitive to static load (Kalaska et al., 1990), and neurons in the
premotor cortex are activated earlier, more transiently (Caminiti et
al., 1991; Crammond and Kalaska, 1996), and affected more frequently by
visual cues (Wise et al., 1992; Shen and Alexander, 1997b). In the
motor cortex and elsewhere, there also exist neurons with complex
properties that are either not task-related or hard to describe but
still could have useful functions in a distributed network (Fetz, 1992;
Zipser, 1992; Moody et al., 1998).
In most of these cases, there are additional free variables besides
hand position. The linear theory may still yield useful results in
these more complex cases after including these additional variables.
For example, the planned movement direction is an independent variable,
which could be used to describe some preparatory activity before overt
hand movement. These new variables should be included when deriving the
tuning rule, as demonstrated in the preceding section by adding the
elbow position as a free variable in abducted reaching.
REPRESENTING RIGID OBJECT MOTION
The same geometric argument for arm movement can be applied to
moving rigid objects, which have additional rotational degrees of
freedom around an axis in space (Fig. 1). In the following, we derive a
general tuning rule for rigid motion, discuss its basic properties, and
then contrast the results with concrete models of visual receptive fields.
Description of rigid object motion
Arbitrary instantaneous motion of a rigid object can always be
described by a rotation plus a translation (Fig.
4), but given the same physical motion,
this description is ambiguous up to an arbitrary parallel shift of the
rotation axis. For example, translational velocity can always be
aligned instantaneously with the angular velocity to obtain a screw
motion by passing the rotation axis through the point of zero velocity
in a perpendicular plane (Fig. 4).

Figure 4.
Arbitrary motion of a rigid object can always be
decomposed instantaneously into a translation and a rotation, allowing
arbitrary parallel shift of the rotation axis. The two examples shown
here describe identical physical motion. Parallel shift of rotation
axis affects the translation velocity but not the angular velocity ω.
This ambiguity disappears when the rotation axis is always required to
pass through the same reference center in the object, say, the center
of mass. We assume that the reference center has been chosen so that a
rigid motion can be described uniquely by a translational velocity and
an angular velocity. We return to this topic later.
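The ambiguity in the description, and the screw-motion special case, can be sketched with the standard rigid-body relation that a point of the body moves with the reference velocity plus the cross product of the angular velocity and its offset from the reference center. All numerical values and function names below are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def point_velocity(v_ref, omega, ref, point):
    # Velocity of a point of the rigid body: v_ref + omega x (point - ref).
    return add(v_ref, cross(omega, sub(point, ref)))

def screw_reference(v_ref, omega, ref):
    # Reference point at which the translational velocity is parallel to omega.
    w2 = sum(w * w for w in omega)
    shift = tuple(c / w2 for c in cross(omega, v_ref))
    return add(ref, shift)

omega = (0.0, 0.0, 2.0)   # hypothetical angular velocity
ref_a = (0.0, 0.0, 0.0)   # first choice of reference center
v_a = (1.0, 0.0, 0.5)     # translational velocity at ref_a

ref_b = (0.3, -0.2, 0.7)  # second choice, i.e. a shifted rotation axis
v_b = point_velocity(v_a, omega, ref_a, ref_b)

ref_s = screw_reference(v_a, omega, ref_a)
v_s = point_velocity(v_a, omega, ref_a, ref_s)

test_points = [(1.0, 2.0, 3.0), (-0.5, 0.4, 0.0), (0.0, 0.0, 1.0)]
velocities_a = [point_velocity(v_a, omega, ref_a, q) for q in test_points]
velocities_b = [point_velocity(v_b, omega, ref_b, q) for q in test_points]
```

The two descriptions share the same angular velocity but have different translational velocities, and they predict identical velocities for every point of the object. At `ref_s` the translational velocity is parallel to `omega`, which is the screw-motion description.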
The static position and orientation of a rigid object can be specified
by six independent parameters:
$$(x, y, z, \theta_1, \theta_2, \theta_3) \tag{24}$$
where x, y, z describe the position of the reference
center of the object with respect to a coordinate system fixed to the world, and θ1, θ2,
θ3 are three angular variables that represent the
object's orientation. The translational velocity of the object is:
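$$\mathbf{v} = (\dot{x}, \dot{y}, \dot{z})^T \tag{25}$$
The angular velocity ω = (ωx, ωy, ωz)^T in
world coordinates is always linearly related to the time derivatives of
the orientation variables θ = (θ1, θ2, θ3)^T:
$$\boldsymbol{\omega} = \mathbf{M}\,\dot{\boldsymbol{\theta}} \tag{26}$$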
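where M is an invertible 3 × 3 matrix that
depends only on the orientation (θ1, θ2, θ3). For example, when
Euler angles are used to describe orientation (Fig. 5), we have:
$$(\theta_1, \theta_2, \theta_3) = (\phi, \theta, \psi) \tag{27}$$
and
$$\mathbf{M} = \begin{pmatrix} 0 & \cos\phi & \sin\theta\,\sin\phi \\ 0 & \sin\phi & -\sin\theta\,\cos\phi \\ 1 & 0 & \cos\theta \end{pmatrix} \tag{28}$$
which is invertible as long as det M = −sin θ ≠ 0 (Goldstein, 1980).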

Figure 5. Euler angles (φ, θ, ψ) describe an arbitrary orientation of a rigid object with axes (X', Y', Z') with respect to a standard orientation with axes (X, Y, Z).
Only the abstract linear relation in Equation 26 is needed in the next
section. The actual choice of (θ1, θ2, θ3) is unimportant here.
Because the time derivatives of different sets of variables are
linearly related by a Jacobian matrix, Equation 26 always holds regardless of the exact choice of the parameterization of orientation (see also Appendix A on independence of the coordinate system).
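The linear relation between angular velocity and the time derivatives of the orientation variables can be checked numerically for one concrete parameterization. The sketch below assumes z-x-z Euler angles (φ, θ, ψ) with rotation matrix R = Rz(φ) Rx(θ) Rz(ψ), and the corresponding world-frame matrix M written out in `euler_M`; this convention is an assumption of the example, and other conventions give a different M. The angular velocity is recovered independently from a finite difference of R, using the fact that (dR/dt) R^T is the skew-symmetric matrix of ω.

```python
import math

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def rotation(phi, theta, psi):
    """Assumed convention: R = Rz(phi) Rx(theta) Rz(psi)."""
    return matmul(rot_z(phi), matmul(rot_x(theta), rot_z(psi)))

def euler_M(phi, theta, psi):
    """Matrix M mapping (dphi/dt, dtheta/dt, dpsi/dt) to world-frame omega."""
    sp, cp = math.sin(phi), math.cos(phi)
    st, ct = math.sin(theta), math.cos(theta)
    return [[0.0, cp, st * sp],
            [0.0, sp, -st * cp],
            [1.0, 0.0, ct]]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def omega_linear(angles, rates):
    """Angular velocity from the linear relation omega = M * rates."""
    M = euler_M(*angles)
    return tuple(sum(M[i][j] * rates[j] for j in range(3)) for i in range(3))

def omega_numeric(angles, rates, h=1e-6):
    """Angular velocity from a central difference of R along the trajectory."""
    plus = rotation(*[a + h * r for a, r in zip(angles, rates)])
    minus = rotation(*[a - h * r for a, r in zip(angles, rates)])
    Rdot = [[(plus[i][j] - minus[i][j]) / (2.0 * h) for j in range(3)]
            for i in range(3)]
    W = matmul(Rdot, transpose(rotation(*angles)))  # skew matrix of omega
    return (W[2][1], W[0][2], W[1][0])
```

For generic angles and rates the two estimates agree to within the finite-difference error, and `det3(euler_M(...))` equals −sin θ, so M fails to be invertible only where sin θ = 0.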
Tuning rule for rigid motion
Consider neuronal activity associated with motion of a rigid
three-dimensional object. Assume that the mean firing rate of a
neuron relative to baseline, with a possible time delay, is proportional to the time derivative of a smooth function of the position and orientation of the object in three-dimensional space. In other words:
$$f = f_0 + \frac{d}{dt}\,\Phi(x, y, z, \theta_1, \theta_2, \theta_3) \tag{29}$$
where f is the firing rate, f0
is the baseline rate, and Φ is an arbitrary function of object
position (x, y, z) and orientation (θ1, θ2, θ3), as described in the
preceding section. This equation is analogous to Equation 1.
The exact form of function Φ need not be specified here. It may
depend on both the receptive field properties of the cell and the
visual appearance of the object and its surroundings. This formulation
is quite general. For example, all the visual cues of the object
illustrated in Figure 1 are functions of the position and orientation
of the object that completely determine how light is reflected from
various surfaces, whether diffuse (uniform scattering in all
directions) or specular (energy concentrated around the mirror
reflection direction), giving rise to various visual effects such
as shading, shadows, specular reflections, and highlights (Watt and
Watt, 1992). Given that all sensory cues are determined completely by
the position and orientation of the object, we expect a
motion-sensitive neuron to respond to changes of these variables.
The simplest way to estimate these changes is to compute the first
temporal derivative.
The assumption in Equation 29 allows us to derive a general tuning rule
for neurons sensitive to three-dimensional object motion. Given a
three-dimensional object moving at instantaneous translational velocity
v and angular velocity ω, the mean firing rate
of a generic neuron should depend on these variables in a highly
stereotyped way:
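$$f = f_0 + \mathbf{p} \cdot \mathbf{v} + \mathbf{q} \cdot \boldsymbol{\omega} \tag{30}$$
where f0 is the background firing rate,
p is the preferred translational direction, given
by:
$$\mathbf{p} = \left(\frac{\partial \Phi}{\partial x}, \frac{\partial \Phi}{\partial y}, \frac{\partial \Phi}{\partial z}\right)^T \tag{31}$$
and vector q is the preferred rotation axis,
given by:
$$\mathbf{q} = (\mathbf{M}^T)^{-1}\,\mathbf{q}^* \tag{32}$$
with matrix M as in Equation 26, and:
$$\mathbf{q}^* = \left(\frac{\partial \Phi}{\partial \theta_1}, \frac{\partial \Phi}{\partial \theta_2}, \frac{\partial \Phi}{\partial \theta_3}\right)^T \tag{33}$$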
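is an intermediate vector variable, the transformed preferred
rotation axis in the orientation angle space. Both the preferred translational direction p and the preferred rotation axis q are vectors in the physical space. They may depend on the
object and its position and orientation but not on the translational velocity v and angular velocity ω. The
derivation of Equation 30 follows from the chain rule:
$$\frac{d\Phi}{dt} = \frac{\partial \Phi}{\partial x}\dot{x} + \frac{\partial \Phi}{\partial y}\dot{y} + \frac{\partial \Phi}{\partial z}\dot{z} + \frac{\partial \Phi}{\partial \theta_1}\dot{\theta}_1 + \frac{\partial \Phi}{\partial \theta_2}\dot{\theta}_2 + \frac{\partial \Phi}{\partial \theta_3}\dot{\theta}_3 = \mathbf{p} \cdot \mathbf{v} + \mathbf{q}^* \cdot \dot{\boldsymbol{\theta}} = \mathbf{p} \cdot \mathbf{v} + \mathbf{q} \cdot \boldsymbol{\omega} \tag{34}$$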
where Equations 25 and 26 and the definitions in Equations 31-33
have been used. The derivation of the tuning rule does not depend on
which coordinate system is used (Appendix A).
Before explaining the meaning of the tuning rule in the next section,
first consider the baseline firing rate, which is not constrained by
the present theory and thus requires separate consideration. The
baseline firing rate may itself be modulated by several factors, and
the simplest linear model is:
$$f_0 = a_0 + a_1 x + a_2 y + a_3 z + b_1 \theta_1 + b_2 \theta_2 + b_3 \theta_3 + a\,v + b\,\omega \tag{35}$$
where ai, bi, a,
b are constants, and the position (x, y, z) and the
orientation (θ1, θ2, θ3) of the object are included as possibly
relevant factors related to the static view, together with the
translational speed v and the angular speed ω for object motion, which may also be relevant. This linear equation generalizes Equation 9 for motor neurons. Similarly, Equation 10 can also be generalized by including angular position and speed. This assumes that
the baseline firing rate in general may contain information about both
the static configuration of an object and its instantaneous motion.
Cosine tuning and multiplicative speed modulation
The basic tuning rule in Equation 30 can be rewritten in its
equivalent form:
$$f = f_0 + p\,v \cos\alpha + q\,\omega \cos\beta \tag{36}$$
where v = |v| is the speed of
translation, ω = |ω| is the angular speed of
rotation, p = |p| is the length of the
preferred direction vector, q = |q| is the length of the preferred rotation vector, α is the angle between vectors p and v, and β is the angle between
vectors q and ω.
In other words, given the particular view of a particular object, the
response above baseline should be the sum of two components, one
translational and one rotational. The translational component is
proportional to the cosine of the angle between a fixed preferred translational direction and the actual translational direction. It is
also modulated linearly by the speed of translation, which does not alter the shape of the tuning curve. Similarly, the
rotational component is proportional to the cosine of the angle between
a fixed preferred rotation axis and the actual rotation axis, and it is
modulated linearly by the
angular speed of rotation.
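As an illustration of cosine tuning with multiplicative speed modulation, the following minimal sketch evaluates the inner-product form of the tuning rule (Equation 30) for pure translation in several directions. The baseline rate and the preferred direction are hypothetical values chosen for the example, and the function names are not from the original analysis.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def firing_rate(f0, p, q, v, omega):
    """Tuning rule: baseline plus translational and rotational inner products."""
    return f0 + dot(p, v) + dot(q, omega)

def translation_tuning_curve(f0, p, speed, n=8):
    """Responses to pure translation in n directions within the x-y plane."""
    zero = (0.0, 0.0, 0.0)
    curve = []
    for k in range(n):
        angle = 2.0 * math.pi * k / n
        v = (speed * math.cos(angle), speed * math.sin(angle), 0.0)
        curve.append(firing_rate(f0, p, zero, v, zero))
    return curve

F0 = 10.0                 # hypothetical baseline rate
P = (2.0, 0.0, 0.0)       # hypothetical preferred direction along +x
slow = translation_tuning_curve(F0, P, speed=1.0)
fast = translation_tuning_curve(F0, P, speed=2.0)
```

Each entry of `slow` equals F0 + |p| v cos α for the corresponding direction, and doubling the speed doubles the modulation about baseline in `fast` without changing the shape of the curve. The rotational component behaves in the same way with q and ω in place of p and v.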
Distribution of preferred direction and preferred axis
Thus far, the view of the given object is assumed to be fixed.
That is, the cosine tunings for both translation and rotation are
defined with respect to a particular view of the object. When the view
of the object changes, the preferred translational direction p and preferred rotation axis q of a
motion-sensitive neuron may also change.
The theory constrains this change because the preferred translational
direction p and the transformed preferred rotation axis
q* are derived as gradient fields in Equations 31 and 33.
Here the intermediate vector q* is related to the preferred rotation axis q in physical space by:
$$\mathbf{q}^* = \mathbf{M}^T \mathbf{q} \tag{37}$$
according to Equation 32. In three-dimensional space, where curl
is defined, the gradient property implies that the field of partial derivatives with respect to any three of the six variables (x, y, z, θ1, θ2, θ3), with the other three held fixed, must be
curl-free. For example, when the position (x, y, z) of the
object is fixed, the distribution of the transformed preferred rotation
axis in the orientation space (θ1, θ2, θ3) must be
curl-free:
$$\nabla_{\boldsymbol{\theta}} \times \mathbf{q}^* = \mathbf{0} \tag{38}$$
Any hypothetical neurons with non-zero curl can be ruled out by
this condition (see below). For a gradient field, the zero curl is
simply attributable to the equality of mixed second partial derivatives
of the potential function, which holds also in higher dimensions. The
equivalent path integral formulation is valid also in all
dimensions:
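$$\oint \left(\mathbf{p} \cdot d\mathbf{l} + \mathbf{q}^* \cdot d\boldsymbol{\theta}\right) = 0 \tag{39}$$
along any closed curve in the six-dimensional space, where
dl = (dx, dy, dz), and dθ = (dθ1, dθ2, dθ3). Another equivalent
formulation is that the potential function can be constructed by the path integral:
$$\Phi(\boldsymbol{\xi}) = \Phi_0 + \int_{\boldsymbol{\xi}_0}^{\boldsymbol{\xi}} \left(\mathbf{p} \cdot d\mathbf{l} + \mathbf{q}^* \cdot d\boldsymbol{\theta}\right) \tag{40}$$
which depends only on the end points, not on the exact path. Here
ξ = (x, y, z, θ1, θ2, θ3) is an arbitrary point in the parameter
space, and Φ0 is the value at a given initial point ξ0. Therefore, in the gradient theory, how the preferred translational direction and the preferred rotation axis of a neuron change with the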
view of a given object cannot be arbitrary but is highly constrained. This can provide testable predictions (see below).
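The curl-free condition suggests a simple numerical test for a measured or modeled field of transformed preferred rotation axes. The sketch below estimates the curl of a three-component field by central differences. Both example fields are hypothetical: one is the gradient of an arbitrary potential, and the other is a purely rotational field of the kind the gradient theory would rule out.

```python
import math

def curl(field, point, h=1e-5):
    """Central-difference estimate of the curl of a 3-D vector field at a point."""
    def partial(i, j):
        # Derivative of component i with respect to coordinate j.
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        return (field(plus)[i] - field(minus)[i]) / (2.0 * h)
    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))

def gradient_field(t):
    """Gradient of the hypothetical potential sin(t1) * t2 + t3**2."""
    t1, t2, t3 = t
    return (math.cos(t1) * t2, math.sin(t1), 2.0 * t3)

def rotational_field(t):
    """A field that circulates about the third axis; its curl is (0, 0, 2)."""
    t1, t2, t3 = t
    return (-t2, t1, 0.0)

point = (0.4, -0.3, 1.2)
curl_of_gradient = curl(gradient_field, point)
curl_of_rotation = curl(rotational_field, point)
```

Within finite-difference error, `curl_of_gradient` vanishes, whereas `curl_of_rotation` does not, so a population whose preferred axes followed the second pattern would be inconsistent with the gradient theory.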
Linear nongradient theory
A more general theory can be obtained by directly assuming a
linear relationship between the firing rate and the components of the
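translational velocity v = (dx/dt, dy/dt, dz/dt)
and the time derivatives of the angular variables dθ/dt = (dθ1/dt,
dθ2/dt, dθ3/dt). This yields the same
tuning rule:
$$f = f_0 + p_x \dot{x} + p_y \dot{y} + p_z \dot{z} + q_1^* \dot{\theta}_1 + q_2^* \dot{\theta}_2 + q_3^* \dot{\theta}_3 = f_0 + \mathbf{p} \cdot \mathbf{v} + \mathbf{q}^* \cdot \dot{\boldsymbol{\theta}} = f_0 + \mathbf{p} \cdot \mathbf{v} + \mathbf{q} \cdot \boldsymbol{\omega} \tag{41}$$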
where p = (px,
py, pz) and q* = (q*1, q*2,
q*3) are arbitrary vector fields, not
necessarily gradient fields, and Equations 26 and 32 are used in the
last step. This tuning rule gives the same response properties
predicted by the gradient theory for a single view of the object. The
difference shows up when the view changes. The nongradient theory
imposes no constraint on how preferred translational direction and
preferred rotation axis should vary with the view of the object. The
gradient theory is more restrictive, and therefore makes stronger predictions.
Change of reference center
Because the description of the same physical motion of a rigid
object is ambiguous up to a parallel shift of the rotation axis (Fig.
4), we have assumed in the above that the rotation axis always passes
through the same reference point c = (x, y, z) in the
object to ensure uniqueness of description. When a different reference
center c' is chosen, the form of the basic tuning rule in
Equation 30 remains valid, but the preferred rotation axis is affected
in a predictable way:
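$$f = f_0 + \mathbf{p}' \cdot \mathbf{v}' + \mathbf{q}' \cdot \boldsymbol{\omega}' \tag{42}$$
where v' and ω' are the translational
velocity and angular velocity for the new reference center
c', and:
$$\mathbf{p}' = \mathbf{p} \tag{43}$$
$$\mathbf{q}' = \mathbf{q} - (\mathbf{c}' - \mathbf{c}) \times \mathbf{p} \tag{44}$$
are the new preferred translational direction and rotation axis.
One can readily verify that Equation 42 is valid under Equations 43 and
44, using the relations:
$$\mathbf{v}' = \mathbf{v} + \boldsymbol{\omega} \times (\mathbf{c}' - \mathbf{c}) \tag{45}$$
$$\boldsymbol{\omega}' = \boldsymbol{\omega} \tag{46}$$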
Therefore, changing the reference center of an object has no
effect on the preferred translational direction of a neuron (Eq. 43),
whereas the preferred rotation axis is altered systematically in a
completely predictable manner (Eq. 44). These relations arise purely
from the ambiguity of the description of rigid motion, and thus apply
to both the gradient and the nongradient theories.
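The invariance of the predicted firing rate under a change of reference center can be verified directly. The sketch below assumes the standard rigid-body shift relations v' = v + ω × (c' − c) and ω' = ω, from which the transformed preferred rotation axis q' = q − (c' − c) × p follows by the scalar triple product identity. The function names and the numerical values are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def shift_motion(v, omega, c, c_new):
    """Motion description (v', omega') for the new reference center c_new."""
    return add(v, cross(omega, sub(c_new, c))), omega

def shift_preferred(p, q, c, c_new):
    """Preferred vectors (p', q') for the new reference center c_new."""
    return p, sub(q, cross(sub(c_new, c), p))

def firing_rate(f0, p, q, v, omega):
    return f0 + dot(p, v) + dot(q, omega)

# Hypothetical example values.
f0 = 5.0
p, q = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0)
v, omega = (0.3, 0.1, -0.2), (0.0, 1.5, -0.7)
c, c_new = (0.0, 0.0, 0.0), (1.0, -2.0, 0.5)

v_new, omega_new = shift_motion(v, omega, c, c_new)
p_new, q_new = shift_preferred(p, q, c, c_new)
rate_old = firing_rate(f0, p, q, v, omega)
rate_new = firing_rate(f0, p_new, q_new, v_new, omega_new)
```

The two rates agree to rounding error, while p is unchanged and q is shifted by a term that depends only on p and the displacement between the two reference centers.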
|