Abstract
Although the orientation of an arm in space or the static view of an object may be represented by a population of neurons in complex ways, how these variables change with movement often follows simple linear rules, reflecting the underlying geometric constraints in the physical world. A theoretical analysis is presented for how such constraints affect the average firing rates of sensory and motor neurons during natural movements with low degrees of freedom, such as a limb movement and rigid object motion. When applied to nonrigid reaching arm movements, the linear theory accounts for cosine directional tuning with linear speed modulation, predicts a curl-free spatial distribution of preferred directions, and also explains why the instantaneous motion of the hand can be recovered from the neural population activity. For three-dimensional motion of a rigid object, the theory predicts that, to a first approximation, the response of a sensory neuron should have a preferred translational direction and a preferred rotation axis in space, both with cosine tuning functions modulated multiplicatively by speed and angular speed, respectively. Some known tuning properties of motion-sensitive neurons follow as special cases. Acceleration tuning and nonlinear speed modulation are considered in an extension of the linear theory. This general approach provides a principled method to derive mechanism-insensitive neuronal properties by exploiting the inherently low dimensionality of natural movements.
 3D object
 cortical representation
 visual cortex
 tuning curve
 motor system
 reaching movement
 speed modulation
 potential function
 gradient field
 zero curl
For natural movements, such as the motion of a rigid object or an active limb movement, many sensory receptors or muscles are involved, but the actual degrees of freedom are low because of geometric constraints in the physical world. For example, as illustrated in Figure 1, the rotation of an object alters many visual cues. How these cues vary in time is not arbitrary but is fully determined by the rigid motion, which has only 6 degrees of freedom. As a consequence, neuronal activity reflecting such natural movements also is likely to be highly constrained and to have only a few degrees of freedom.
This paper presents a theoretical analysis of how neuronal activity correlated with natural movements might be constrained by geometry. The basic theory, although essentially linear, can account for several key features of diverse neurophysiological results and generates strong predictions that are testable with current experimental techniques.
An emerging principle from this analysis is that neuronal activity tuned to movement often obeys simple generic rules as a first approximation, insensitive to the exact sensory or motor variables that are encoded and the exact computational interpretation. Such generic tuning properties are mechanism insensitive because they are better described as reflecting the underlying geometric constraints on movements rather than the actual computational mechanisms. This simplicity arises when sensory or motor variables represent changes in time rather than static values. In the example shown in Figure 1, the viewpoint was fixed and the object was rotated systematically around different axes. The focus is on how neuronal responses depend on the rotation axis in three-dimensional space, given approximately the same view of the object. It is possible to derive a simple cosine tuning rule for the rotation axis, although various visual cues may depend on the static geometrical orientation of the object in complex ways. Three-dimensional object motion is a specific example; the same principles also apply to several other biological systems, including nonrigid arm movement.
DIRECTIONAL TUNING FOR ARM MOVEMENT
Although the visual and the motor examples share similar mechanism-insensitive properties, the reaching arm movement has a simpler mathematical description and more supporting experimental results and will be considered first.
Ubiquity of cosine tuning
A directional tuning curve describes how the mean firing rate of a neuron depends on the reaching direction of the hand. As illustrated in Figure 2, broad cosine-like tuning curves are very typical in many areas of the motor system of monkeys, including the primary motor cortex (Georgopoulos et al., 1986), premotor cortex (Caminiti et al., 1991), parietal cortex (Kalaska et al., 1990), cerebellum (Fortier et al., 1989), basal ganglia (Turner and Anderson, 1997), and somatosensory cortex (Cohen et al., 1994; Prud’homme and Kalaska, 1994). Although the examples shown in Figure 2 are two-dimensional, cosine tuning holds as well for three-dimensional reaching movement (Georgopoulos et al., 1986; Schwartz et al., 1988).
The ubiquity of cosine tuning is a hint that this property is generic and insensitive to the exact computational function of these neurons. For example, coding of muscle shortening rate is one theoretical mechanism that can generate cosine tuning (Mussa-Ivaldi, 1988). As another example, many somatosensory cortical cells related to reaching had cosine directional tuning, probably because of the geometry of mechanical deformation of the skin during arm movement (Cohen et al., 1994; Prud’homme and Kalaska, 1994). Because a cosine tuning function implies a dot product between a fixed preferred direction and the actual reaching direction (Georgopoulos et al., 1986), cosine tuning by itself suggests a linear relation with reaching direction (Sanger, 1994), which could arise as an approximation to the activity in a nonlinear recurrent network (Moody and Zipser, 1998). Therefore, cosine tuning curves should be common in a theoretical model that is approximately linear.
Basic theory
In this section we derive a general tuning rule for motor neurons and then discuss its basic properties. This example illustrates what is meant by mechanism-insensitive properties and the general theoretical argument based on geometric constraints.
Consider stereotyped reaching movement in which the configuration of the whole arm is determined completely by the hand position (x, y, z) in space. In other words, such movements have only 3 degrees of freedom. Assume that the mean firing rate of a neuron relative to baseline is proportional to the time derivative of an unknown smooth function of hand position in space. In other words:
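Written out, with the proportionality constant absorbed into Φ (an assumption made here for compactness), the relation referred to as Equation 1 can be sketched as:

```latex
f = f_0 + \frac{d}{dt}\,\Phi(x, y, z) \tag{1}
```

Here f denotes the mean firing rate and f_{0} the baseline rate.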
The function Φ(x, y, z) could have any form and could include any function of arm configuration, such as muscle length, joint angles, or any combination of those. Mussa-Ivaldi (1988) first used muscle length to demonstrate the appearance of cosine tuning in a two-dimensional situation and pointed out that the argument could be generalized to include other muscle variables. This interesting example illustrates how the cosine tuning property might emerge from some simple assumptions. The assumption in Equation 1 is more general and the formalism is simpler than that of Mussa-Ivaldi (1988) because joint angles are no longer used as intermediate variables in the derivation. This makes interpretation easier and more flexible and the curl-free condition more apparent (see below). The precise interpretation of Φ is not the focus of this paper; the only requirement is that it be a function fully determined by the hand position in three-dimensional space.
We emphasize that although Equation 1 uses hand position as the only free variables, this does not require that the neuron must directly encode the hand position or endpoint in particular or kinematic variables in general. Stereotypical reaching movements have only 3 degrees of freedom and can be conveniently parameterized by the hand position (x, y, z), although other parameters can also be used without affecting the final conclusion (see below and Appendix). A neuron related to reaching arm movement should be sensitive to changes of arm posture, which can always be expressed equivalently as changes in some functions of the hand position (x, y, z). The simplest estimate of such changes is the first temporal derivative given in Equation 1. In other words, the above assumption only postulates a general dependence of the firing rate of a neuron on changing arm posture as a first approximation, regardless of which parameters are encoded and how they are encoded.
The assumption in Equation 1 implies that the mean firing rate of a neuron should follow the tuning rule:
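A short derivation, assuming only the chain rule and the time-derivative form of Equation 1 described above, shows where the dot product in the tuning rule comes from. The numbering follows the references to Equations 2 and 3 in the text:

```latex
\begin{aligned}
f - f_0 &= \frac{d\Phi}{dt}
         = \frac{\partial \Phi}{\partial x}\,\dot{x}
         + \frac{\partial \Phi}{\partial y}\,\dot{y}
         + \frac{\partial \Phi}{\partial z}\,\dot{z}
         = \mathbf{p}\cdot\mathbf{v}, && \text{(2)}\\
\mathbf{p} &= \nabla\Phi
         = \left(\frac{\partial \Phi}{\partial x},\,
                 \frac{\partial \Phi}{\partial y},\,
                 \frac{\partial \Phi}{\partial z}\right),
\qquad \mathbf{v} = (\dot{x}, \dot{y}, \dot{z}). && \text{(3)}
\end{aligned}
```

The preferred direction p depends only on the hand position, whereas v is the instantaneous hand velocity, so for a fixed starting position the rate is linear in v.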
For hand movements starting from different positions, the preferred direction vector may vary with the starting hand position (x, y, z) and thus can be visualized as a vector field (Caminiti et al., 1990; Moody and Zipser, 1998). It follows from the gradient formula in Equation 3 that this vector field of preferred direction must have zero curl:
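In vector notation, and writing the preferred direction as p = (p_x, p_y, p_z) (component names chosen here for illustration), the zero-curl condition follows from the equality of mixed partial derivatives of Φ:

```latex
\nabla \times \mathbf{p} = \nabla \times \nabla\Phi = \mathbf{0},
\qquad\text{that is,}\qquad
\frac{\partial p_z}{\partial y} = \frac{\partial p_y}{\partial z},\quad
\frac{\partial p_x}{\partial z} = \frac{\partial p_z}{\partial x},\quad
\frac{\partial p_y}{\partial x} = \frac{\partial p_x}{\partial y}.
```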
Human eyes are not reliable at judging whether a vector field is curl-free (see Fig. 10), so numerical computation is needed (Mussa-Ivaldi et al., 1985; Giszter et al., 1993). See Appendix for more discussion. A vector field is curl-free if and only if it can be generated as the gradient of a potential function. A more intuitive interpretation of the curl-free condition is that when a vector field is regarded as the velocity field of a fluid, there is no net circulation along any closed path in space.
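A numerical check of this kind can be sketched with central differences. The example below is a minimal planar illustration, not the procedure used in the cited studies; the function names and the two test fields (one built as the gradient of a hypothetical potential, one a circular arrangement) are assumptions chosen for the demonstration.

```python
import math

def curl_z(field, x, y, h=1e-5):
    """Central-difference estimate of dFy/dx - dFx/dy for a planar vector field."""
    dFy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dFx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dFy_dx - dFx_dy

def gradient_field(x, y):
    # Gradient of the hypothetical potential Phi = x**2 * y + sin(y)
    return (2 * x * y, x ** 2 + math.cos(y))

def circular_field(x, y):
    # Vectors arranged in circles around the origin; not a gradient field
    return (-y, x)

curl_gradient = curl_z(gradient_field, 0.7, -0.3)
curl_circular = curl_z(circular_field, 0.7, -0.3)
```

The gradient field gives a curl that vanishes up to rounding error, whereas the circular arrangement gives a constant nonzero curl, which is the signature the gradient theory rules out.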
Under the curl-free condition, the net spike count (integration of the firing rate with respect to baseline over time) can be used to recover the value of the unknown potential function:
Baseline firing rate
The theory in the preceding section does not constrain the baseline firing rate f_{0}, which needs to be considered separately. By definition, the baseline firing rate is independent of the reaching direction, but it may be modulated by several other factors. For example, in the motor cortex, Kettner et al. (1988) have reported that the linear formula:
Note that in Equation 9, the baseline firing rate contains information about both the static hand position (x, y, z) and its speed v. As shown by Kettner et al. (1988), the spatial gradient of the spontaneous firing rate for static hand position tends to be consistent with the preferred direction of the same neuron. In the current theory, this means that the preferred direction p = ∇Φ tends to point in the same direction as the vector (a_{1}, a_{2}, a_{3}) in Equation 9. Therefore, if the potential function Φ = Φ(x, y, z) can be approximated as a linear function in x, y, z, we can replace Equation 9 by:
Linear theory without gradient
If we simply postulate that the firing rate of a neuron is linearly related to the components of reaching velocity v = (v_{x}, v_{y}, v_{z}), we would have the same tuning rule:
Comparison with experimental results
Data from a wide range of motorrelated brain areas largely confirm the tuning rule in Equation 2 as a reasonable approximation, together with its various ramifications as follows. Theoretical predictions such as the curlfree distribution remain to be tested.
Cosine directional tuning and multiplicatively linear speed modulation
The tuning rule in Equation 2 captures two main effects: cosine directional tuning and multiplicatively linear speed modulation, as clearly seen in its equivalent form:
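Expanding the dot product makes the two effects explicit. In this sketch v = |v| denotes the reaching speed and α the angle between the preferred direction p and the reaching direction:

```latex
f = f_0 + \mathbf{p}\cdot\mathbf{v} = f_0 + |\mathbf{p}|\, v \cos\alpha
```

The factor cos α gives the directional tuning, and the factor v scales the tuning curve without changing its shape.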
A cosine function is a good approximation to the directional tuning data, although a circular normal function (Eq. EA17), with one more free parameter, tends to fit the data slightly better (Fig. 2). The residual can be roughly accounted for by an additional Fourier term, cos 2α, with an amplitude less than ∼10% of that of the original term, cos α (see further discussion in Appendix).
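One way to quantify how close a tuning curve is to a cosine, sketched here under the assumption that rates are sampled at equally spaced directions, is to project the data onto the first two Fourier harmonics. The function name `fourier_fit` and the synthetic rates are hypothetical and serve only as an illustration; they are not data from the studies cited.

```python
import math

def fourier_fit(angles, rates):
    """Return baseline, first-harmonic amplitude and phase, second-harmonic amplitude.

    Assumes equally spaced angles covering the full circle, so the harmonics
    are orthogonal and simple projections give the least-squares fit.
    """
    n = len(angles)
    baseline = sum(rates) / n

    def harmonic(k):
        a = 2.0 / n * sum(r * math.cos(k * t) for t, r in zip(angles, rates))
        b = 2.0 / n * sum(r * math.sin(k * t) for t, r in zip(angles, rates))
        return math.hypot(a, b), math.atan2(b, a)

    amp1, preferred = harmonic(1)   # cosine term: amplitude and preferred direction
    amp2, _ = harmonic(2)           # residual cos(2*alpha) term
    return baseline, amp1, preferred, amp2

# Synthetic tuning data (illustrative values, not measurements)
n_dirs = 8
angles = [2.0 * math.pi * k / n_dirs for k in range(n_dirs)]
pd_true = math.radians(60.0)
rates = [20.0 + 10.0 * math.cos(t - pd_true) + 0.8 * math.cos(2.0 * (t - pd_true))
         for t in angles]

baseline, amp1, preferred, amp2 = fourier_fit(angles, rates)
```

The ratio `amp2 / amp1` then measures the relative size of the second Fourier term, which can be compared with the roughly 10% bound mentioned above.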
The speed modulation effect predicted by Equation 12 is multiplicative; that is, the firing rate should be higher for faster reaching speed without affecting the shape of the cosine tuning function. This is approximately true as shown by Moran and Schwartz (1999), who, however, used the square root of firing rate in analysis so that the linearity of speed modulation on raw firing rate was not directly quantified. Indirect evidence for linear speed modulation includes trajectory reconstruction and the curvature power law (see below).
Neuronal population vector
Suppose the firing rate of each neuron i in a population (i = 1, 2, … , N) follows the same tuning rule as considered above:
Trajectory reconstruction
One implication of Equation 15 is that integration over the population over time can reconstruct the hand trajectory, up to a scaling constant:
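The reconstruction can be illustrated with a small simulation. This is a minimal planar sketch under idealized assumptions that are not taken from the text: noise-free rates, unit-length preferred directions spaced uniformly around the circle, and a scaling constant of 2/N that follows from that uniform arrangement. All names and the test trajectory are hypothetical.

```python
import math

N = 16  # number of model neurons (illustrative)
preferred = [(math.cos(2 * math.pi * i / N), math.sin(2 * math.pi * i / N))
             for i in range(N)]

def rates_above_baseline(v):
    """Each neuron follows the linear tuning rule f_i - f0_i = p_i . v."""
    return [p[0] * v[0] + p[1] * v[1] for p in preferred]

def population_vector(rates):
    """Sum of preferred directions weighted by rate above baseline."""
    px = sum(r * p[0] for r, p in zip(rates, preferred))
    py = sum(r * p[1] for r, p in zip(rates, preferred))
    return (px, py)

dt, steps = 0.01, 100
pos_true = [0.0, 0.0]
pos_decoded = [0.0, 0.0]
scale = 2.0 / N  # scaling constant for uniformly spaced unit vectors in the plane

for k in range(steps):
    t = k * dt
    v = (math.cos(2 * t), 0.5 + math.sin(3 * t))   # arbitrary test velocity
    pv = population_vector(rates_above_baseline(v))
    for axis in range(2):
        pos_true[axis] += v[axis] * dt              # integrate true velocity
        pos_decoded[axis] += scale * pv[axis] * dt  # integrate population vector
```

Under these assumptions the population vector is exactly proportional to the hand velocity, so its time integral traces the hand trajectory. With nonuniform preferred directions or noisy rates the match would only be approximate.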
Curvature power law
While drawing, the hand moves more slowly when the trajectory is more highly curved, and obeys a power law:
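As an illustration of such a law, the sketch below assumes the widely reported form in which speed is proportional to curvature raised to the power −1/3; that exponent is an assumption here, not a value quoted from the text. An ellipse traced with harmonic motion satisfies this relation exactly, which makes it a convenient test case. The semi-axes and variable names are hypothetical.

```python
import math

a, b = 3.0, 1.0  # ellipse semi-axes (illustrative values)

def speed_and_curvature(t):
    """Speed and curvature of the path x = a*cos(t), y = b*sin(t)."""
    dx, dy = -a * math.sin(t), b * math.cos(t)
    ddx, ddy = -a * math.cos(t), -b * math.sin(t)
    speed = math.hypot(dx, dy)
    curvature = abs(dx * ddy - dy * ddx) / speed ** 3
    return speed, curvature

samples = [speed_and_curvature(2 * math.pi * k / 50) for k in range(50)]
# If speed = gain * curvature**(-1/3), this product is the same at every sample
gains = [v * kappa ** (1.0 / 3.0) for v, kappa in samples]
```

For this path the gain equals (ab)^(1/3) at every point, so the hand slows down precisely where the ellipse is most curved.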
Reaching distance
Fu et al. (1993) reported a nearly linear correlation between firing rates of cells in motor cortex and reaching distance. Although this result was somewhat confounded by faster reaching for longer distances, it raises the question of the general effect of reaching distance. A linear distance effect would be consistent with the basic model in Equation 2, which implies that:
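The distance effect can be sketched by integrating the tuning rule over the movement. The symbols T (movement duration), Δr (net hand displacement), and d (reaching distance) are introduced here for illustration, and the last two equalities assume a straight reach over which the preferred direction p is approximately constant:

```latex
\int_0^T (f - f_0)\,dt
  = \int_0^T \mathbf{p}\cdot\mathbf{v}\,dt
  = \mathbf{p}\cdot\Delta\mathbf{r}
  = |\mathbf{p}|\, d \cos\alpha
```

Under these assumptions the net spike count above baseline grows linearly with distance and does not depend on how fast the reach is made.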
Note that in Equation 20 the baseline rate f_{0} has been subtracted. Because the baseline rate itself may contain a linear speed component as in Equation 9 (Moran and Schwartz, 1999), its contribution to total spike count should be:
Curl-free distribution of preferred direction
Caminiti et al. (1990, 1991) reported that the preferred direction of a motor cortical neuron often varied with the starting point of hand movement. This is allowed by the gradient theory, provided that this vector field is curl-free, according to Equation 5 or 6. A constant preferred direction field is always allowed because it has zero curl. The curl-free condition constrains how the preferred direction of a neuron may vary in different parts of space. For example, it rules out the possibility of any circular arrangement of the preferred directions, such as that in the two-joint planar arm example shown in Figure 3. The existing data do not include enough points to compute the curl (see Appendix). Further experiments would be needed to test whether the prediction of the gradient theory is correct.
Elbow position
Scott and Kalaska (1997) found that the preferred directions of some motor cortical cells were altered when the monkey had to reach unnaturally with the elbow raised to shoulder level. In the current theoretical framework, adding elbow position as a free parameter is equivalent to adding one rotation variable ϕ, for example, the angle between the horizontal plane and the plane determined by the hand, elbow, and shoulder. The same theoretical argument yields the tuning rule:
How does this case relate to our earlier results with hand position as the only free parameter? In the preceding sections, reaching was assumed to be “stereotypical” in the sense that the elbow position can be determined completely by the hand position, ignoring forearm rotation. This assumption may not be true if the final posture sometimes depends also on the initial hand position (Soechting et al., 1995). However, when comparing reaching movements starting from the same initial hand position, it is reasonable to assume that for stereotypical reaching, the elbow angle ϕ can be completely determined by the hand position (x, y, z), or ϕ = ϕ(x, y, z). Then the time derivative of ϕ, after expanding by the chain rule, can be absorbed into the term p · v, yielding the original basic tuning rule in Equation 2. In other words, the assumption of stereotypical movement reduces the total degrees of freedom to 3, eliminating the elbow position as an independent variable. Although the elbow angle can still be used as a free parameter, it is no longer independent of the hand position. Only three parameters are independent in this case, and their exact choice does not affect the general form of the tuning rule (see Appendix for more discussion on coordinate-system independence).
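The absorption step can be written out explicitly. In this sketch the scalar s is a hypothetical name for the coefficient that multiplies the time derivative of ϕ in the extended tuning rule; it is introduced only for illustration:

```latex
\dot{\phi} = \nabla\phi\cdot\mathbf{v}
\quad\Longrightarrow\quad
f - f_0 = \mathbf{p}\cdot\mathbf{v} + s\,\dot{\phi}
        = \left(\mathbf{p} + s\,\nabla\phi\right)\cdot\mathbf{v}
```

The combination p + s∇ϕ plays the role of a new preferred direction, so the rule keeps the form of Equation 2.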
Summary and discussion of more complex cases
As shown above, the basic tuning theory can naturally account for several important experimental results without making any specific assumptions about the exact variables encoded or details of the encoding. These results are generic properties independent of the exact functional interpretations. This generality makes sense because during stereotypical movement, redundant variables are inevitably constrained by the geometry and become highly correlated, so that they are likely to show similar tuning properties of the same general type. The theory presented here has formalized this intuition.
The relationship between cosine tuning properties and geometric constraints is also apparent in the studies of muscle activities and actions during reaching and isometric tasks. Basic properties resembling those for motor cortical cells have been reported, including approximately cosine directional tuning curves (but often with a small secondary peak opposite the preferred direction), speed sensitivity, and posture dependence (Flanders and Soechting, 1990; Flanders and Herrmann, 1992; Buneo et al., 1997).
The basic theory needs to be generalized in situations where the hand position is not the only free parameter. For example, force is one variable that is often correlated with the activity of motor cortex; recent examples related to directional tuning include tasks with static load (Kalaska et al., 1989) and varying isometric forces (Georgopoulos et al., 1992; Sergio and Kalaska, 1997).
As another example, preparatory activity in motor cortex before onset of movement can reflect the upcoming reaching direction, as is especially evident during instructed delay (Georgopoulos et al., 1989a), and can change rapidly in tasks requiring mental rotation (Georgopoulos et al., 1989b) or target switching (Pellizzer et al., 1995).
Moreover, when sensory and motor components were decoupled, some neurons even from primary motor cortex were more closely related to the visual movement of a cursor on the computer screen than to the joystick position or hand movements, in both one-dimensional (Alexander and Crutcher, 1990) and two-dimensional tasks (Shen and Alexander, 1997a). By contrast, in virtual reality experiments with visual distortion, motor cortical activity mainly followed the actual limb trajectory rather than the animal’s visual perception (Moran et al., 1995).
In addition, some differences exist among the neural activity from different brain areas, although they all show approximate cosine directional tuning (compare Fig. 2). For instance, compared with neurons in the motor cortex in a reaching task, the preferred directions in the cerebellum are more variable in repeated trials (Fortier et al., 1989), neurons in the parietal cortex are less sensitive to static load (Kalaska et al., 1990), and neurons in the premotor cortex are activated earlier, more transiently (Caminiti et al., 1991; Crammond and Kalaska, 1996), and affected more frequently by visual cues (Wise et al., 1992; Shen and Alexander, 1997b). In the motor cortex and elsewhere, there also exist neurons with complex properties that are either not task-related or hard to describe but still could have useful functions in a distributed network (Fetz, 1992; Zipser, 1992; Moody et al., 1998).
In most of these cases, there are additional free variables besides hand position. The linear theory may still yield useful results in these more complex cases after including these additional variables. For example, the planned movement direction is an independent variable, which could be used to describe some preparatory activity before overt hand movement. These new variables should be included when deriving the tuning rule, as demonstrated in the preceding section by adding the elbow position as a free variable in abducted reaching.
REPRESENTING RIGID OBJECT MOTION
The same geometric argument for arm movement can be applied to moving rigid objects, which have additional rotational degrees of freedom around an axis in space (Fig. 1). In the following, we derive a general tuning rule for rigid motion, discuss its basic properties, and then contrast the results with concrete models of visual receptive fields.
Description of rigid object motion
Arbitrary instantaneous motion of a rigid object can always be described by a rotation plus a translation (Fig. 4), but given the same physical motion, this description is ambiguous up to an arbitrary parallel shift of the rotation axis. For example, translational velocity can always be aligned instantaneously with the angular velocity to obtain a screw motion by passing the rotation axis through the point of zero velocity in a perpendicular plane (Fig. 4).
This ambiguity disappears when the rotation axis is always required to pass through the same reference center in the object, say, the center of mass. We assume that the reference center has been chosen so that a rigid motion can be described uniquely by a translational velocity and an angular velocity. We return to this topic later.
The static position and orientation of a rigid object can be specified by six independent parameters:
Only the abstract linear relation in Equation 26 is needed in the next section. The actual choice of (θ_{1}, θ_{2}, θ_{3}) is unimportant here. Because the time derivatives of different sets of variables are linearly related by a Jacobian matrix, Equation 26 always holds regardless of the exact choice of the parameterization of orientation (see also Appendix on independence of the coordinate system).
Tuning rule for rigid motion
Consider neuronal activity associated with motion of a rigid three-dimensional object. Assume that the mean firing rate of a neuron relative to baseline, with a possible time delay, is proportional to the time derivative of a smooth function of the position and orientation of the object in three-dimensional space. In other words:
The exact form of function Φ need not be specified here. It may depend on both the receptive field properties of the cell and the visual appearance of the object and its surroundings. This formulation is quite general. For example, all the visual cues of the object illustrated in Figure 1 are functions of the position and orientation of the object that completely determine how light is reflected from various surfaces, whether diffuse (uniform scattering in all directions) or specular (energy concentrated around the mirror reflection direction), giving rise to various visual effects such as shading, shadows, specular reflections, and highlights (Watt and Watt, 1992). Given that all sensory cues are determined completely by the position and orientation of the object, we expect a motion-sensitive neuron to respond to changes of these variables. The simplest way to estimate these changes is to compute the first temporal derivative.
The assumption in Equation 29 allows us to derive a general tuning rule for neurons sensitive to three-dimensional object motion. Given a three-dimensional object moving at instantaneous translational velocity v and angular velocity ω, the mean firing rate of a generic neuron should depend on these variables in a highly stereotyped way:
Before explaining the meaning of the tuning rule in the next section, first consider the baseline firing rate, which is not constrained by the present theory and thus requires separate consideration. The baseline firing rate may itself be modulated by several factors, and the simplest linear model is:
Cosine tuning and multiplicative speed modulation
The basic tuning rule in Equation 30 can be rewritten in its equivalent form:
In other words, given the particular view of a particular object, the response above baseline should be the sum of two components, one translational and one rotational. The translational component is proportional to the cosine of the angle between a fixed preferred translational direction and the actual translational direction. In addition, it is also modulated linearly by the speed of translation, which does not alter the shape of the tuning curve. Similarly, the rotational component is proportional to the cosine of the angle between a fixed preferred rotation axis and the actual rotation axis. In addition, the rotational component is also modulated linearly by the angular speed of rotation.
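The two-component rule described above can be sketched as a small function. This is an illustrative implementation under the assumption that the rule has the form f = f_{0} + p · v + q · ω, with p the preferred translational direction and q the preferred rotation axis; the numerical values are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def rate_linear(f0, p, q, v, omega):
    """Baseline plus dot products with the preferred direction and axis."""
    return f0 + dot(p, v) + dot(q, omega)

def cosine_term(preferred, motion):
    """|preferred| * |motion| * cos(angle); zero if either vector vanishes."""
    gain, speed = norm(preferred), norm(motion)
    if gain == 0.0 or speed == 0.0:
        return 0.0
    cos_angle = dot(preferred, motion) / (gain * speed)
    return gain * speed * cos_angle

def rate_cosine(f0, p, q, v, omega):
    """Same rule written as cosine tuning modulated by speed and angular speed."""
    return f0 + cosine_term(p, v) + cosine_term(q, omega)

# Illustrative values only
f0 = 5.0
p, q = (1.0, 2.0, 0.0), (0.0, 0.5, 1.5)
v, omega = (0.3, -0.1, 0.4), (0.0, 0.2, 1.0)

r_linear = rate_linear(f0, p, q, v, omega)
r_cosine = rate_cosine(f0, p, q, v, omega)

# Doubling the translational speed doubles the translational component
v_fast = tuple(2.0 * c for c in v)
trans_slow = cosine_term(p, v)
trans_fast = cosine_term(p, v_fast)
```

The dot-product form and the cosine form give the same rate, and scaling the speed scales the translational component without changing the angle on which the cosine depends.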
Distribution of preferred direction and preferred axis
Thus far, the view of the given object is assumed to be fixed. That is, the cosine tunings for both translation and rotation are defined with respect to a particular view of the object. When the view of the object changes, the preferred translational direction p and preferred rotation axis q of a motion-sensitive neuron may also change.
The theory constrains this change because the preferred translational direction p and the transformed preferred rotation axis q* are derived as gradient fields in Equations 31 and 33. Here the intermediate vector q* is related to the preferred rotation axis q in physical space by:
Linear nongradient theory
A more general theory can be obtained by directly assuming a linear relationship between the firing rate and the components of the translational velocity v = (x˙, y˙, z˙) and the time derivatives of the angular variables θ˙ = (θ˙_{1}, θ˙_{2}, θ˙_{3}). This yields the same tuning rule:
Change of reference center
Because the description of the same physical motion of a rigid object is ambiguous up to a parallel shift of the rotation axis (Fig. 4), we have assumed in the above that the rotation axis always passes through the same reference point c = (x, y, z) in the object to ensure uniqueness of description. When a different reference center c′ is chosen, the form of the basic tuning rule in Equation 30 remains valid, but the preferred rotation axis is affected in a predictable way:
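The effect of moving the reference center can be checked numerically. The sketch below assumes the tuning rule f = f_{0} + p · v + q · ω and the standard rigid-body relation v′ = v + ω × (c′ − c) for the velocity of the new center. Under those assumptions the firing rate is unchanged if p is kept and the preferred rotation axis becomes q′ = q − (c′ − c) × p. The function names and numerical values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def rate(f0, p, q, v, omega):
    """Assumed tuning rule: baseline plus translational and rotational terms."""
    return f0 + dot(p, v) + dot(q, omega)

def shift_center(p, q, v, omega, d):
    """Describe the same motion and tuning about a center displaced by d = c' - c."""
    v_new = add(v, cross(omega, d))   # velocity of the new reference center
    q_new = sub(q, cross(d, p))       # compensating change of preferred axis
    return v_new, q_new

# Illustrative values only
f0 = 5.0
p, q = (1.0, 2.0, 0.0), (0.0, 0.5, 1.5)
v, omega = (0.3, -0.1, 0.4), (0.0, 0.2, 1.0)
d = (0.5, -1.0, 2.0)

v_new, q_new = shift_center(p, q, v, omega, d)
rate_old = rate(f0, p, q, v, omega)
rate_new = rate(f0, p, q_new, v_new, omega)
```

The cancellation follows from the scalar triple product identity p · (ω × d) = ω · (d × p), so the change in the translational term is offset exactly by the change in the rotational term.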
Summary
Simple assumptions have led to a general tuning rule for how the mean firing rate of a neuron should depend on the instantaneous motion of an arbitrary rigid object. For each given view of the object, the firing rate is predicted to be the sum of two terms, one for the translational motion component and one for the rotational motion component, both with cosine directional tuning and linear speed or angular speed modulation. In general, the preferred translational direction and the preferred rotation axis may depend on the identity of the object as well as its view. This tuning rule is a linear approximation to the geometry of rigid motion and therefore should obtain regardless of the exact computational mechanisms involved. In other words, this rule is expected to be a robust property for motion-sensitive neurons responding to realistic moving objects. When the view of the object changes, both the preferred translational direction and the preferred rotation axis of a neuron may change as well. The gradient theory provides additional constraints on such changes, whereas the nongradient theory imposes no further constraints. As a consequence, these two theories can be distinguished by further experiments. Finally, although the description of rigid motion is ambiguous up to a parallel shift of the rotation axis, the effects on the tuning rule are completely predictable and therefore convey no additional information about the response properties of a neuron.
Examples of motion-sensitive receptive field models
Many neurons in visual cortex, particularly in the dorsal stream leading to parietal cortex, respond selectively to visual motion. Here we consider three-dimensional rigid motion and examine several simple computational mechanisms that yield explicit analytical formulas for the preferred translational direction and the preferred rotation axis. For each fixed view of the object, the results are consistent with the basic tuning rule in Equation 30. For different views, however, the global gradient-field condition for the preferred axes can be violated by the idealized velocity component detectors. This shows that the neuronal behavior predicted by the gradient theory is not always identical to that of an optic-flow detector.
Velocity component detectors
As illustrated in Figure 6A, suppose the firing rate of an idealized neuron detects local motion on the image plane according to:
This idealized neuron obeys the basic tuning rule in Equation 30, namely:
To derive these formulas, note that a point in the object with coordinate r has the velocity:
Finally, the basic tuning rule in Equation 48 still holds for image motion of a rigid object under a perspective projection, which projects each point (x, y, z) in the real world toward the observer at the origin (0, 0, 0), leaving an image at (X, Y) in the image plane at z = η:
Spatiotemporal receptive field
Now consider motion-sensitive linear spatiotemporal receptive fields that obey the basic tuning rule in Equation 30 with a known potential function. Let I(X, Y, t) describe the intensity of an image at location (X, Y) on the image plane at time t, ignoring color and stereo. Suppose the firing rate of a neuron with linear receptive field F(X, Y) is linearly related to how fast the overlap between the image and receptive field is changing:
More generally, consider a neuron with an arbitrary linear spatiotemporal receptive field G so that its firing rate is:
Existence of a global potential function
In all the concrete examples considered above, the basic tuning rule in Equation 30 holds true for each given view of an object. However, for a single view, the gradient and nongradient theories are indistinguishable. By assumption, the nongradient theory allows arbitrary preferred translational direction p and preferred rotation axis q. For a given view, the gradient theory can also generate any desired constant vectors p and q from the gradients of the following potential function:
The gradient theory is globally correct only when a potential function exists for all views of the object. This is the case for the linear spatiotemporal model, where the potential function can be given explicitly (Equation 60). By contrast, for the idealized velocity component detector, a global potential function in general does not exist, as shown in Example 1 below.
Because the existence of a potential function does not depend on the choice of the coordinate system (see Appendix), we only need to show that a potential function does not exist in the Euler angle space: (θ_{1}, θ_{2}, θ_{3}) = (θ, φ, ψ), assuming that the center of the object is fixed. In this three-dimensional space, a potential function exists if and only if the distribution of the transformed preferred rotation axis q* is curl-free. Now consider two special examples that do not admit a global potential function:
Example 1: Constant preferred rotation axis fixed to the world. An explicit example is the model in Figure 6B, where the preferred rotation axis q is the same regardless of the orientation of the spherical object. Here it is assumed that the velocity component detector has two vanishingly small receptive fields that can nevertheless detect the true local velocity components regardless of the orientation of the object. Without loss of generality, take the preferred rotation axis as a unit vector along the negative Y axis:
Example 2: Constant preferred rotation axis fixed to the object. A possible example is a vestibular neuron receiving input from only a single semicircular canal without other influences such as that from the otolith. Then the firing rate has a cosine tuning with respect to a preferred rotation axis fixed on the head of the animal, regardless of the orientation of the head in the world (Baker et al., 1984; Graf et al., 1993). Without loss of generality, let the preferred rotation axis be a unit vector in the positive Z′ axis of the object (head), then in world coordinates this axis is:
Therefore, there are simple computational mechanisms that can violate the gradient theory. As shown above, the gradient theory prohibits a neuron from having a truly invariant preferred rotation axis fixed either to the world or to the object. These neurons, however, are allowed by the nongradient theory. In particular, the prediction of the gradient theory can differ from that of an optic-flow sensor. As another example, the preferred translational direction field for a small moving dot may also have nonzero curl when measured at different regions inside a large MST-like receptive field that has circular arrangement of local preferred directions (Saito et al., 1986). These idealized examples demonstrate that the global property of the gradient theory is quite restrictive, which, however, makes its prediction strong and refutable. Experiments could be performed to test whether the gradient theory accounts for the neuronal responses to three-dimensional object motion.
COMPARISON WITH EXPERIMENTAL RESULTS OF SINGLE NEURONS
The tuning rule for reaching arm movement in Equation 2 is a special case of the general tuning rule in Equation 30 without the rotational terms. Biological evidence from the motor system in support of the tuning rule has already been considered in the preceding sections. In this section we examine several additional biological examples that are consistent with some special cases of the general tuning rule and then discuss more comprehensive tests for moving rigid objects.
One-dimensional example: hippocampal place fields on a linear track
For one-dimensional movement, the linear tuning theory predicts only linear speed modulation, without further constraint on the tuning function. The firing rate is given by:
Two-dimensional example: local translational visual motion
Neurons in middle temporal area (MT or V5) of monkeys respond selectively to the direction of local visual motion (Zeki, 1974; Maunsell and Van Essen, 1983; Albright, 1984), although they are also affected by other factors, such as surround motion (Allman et al., 1985; Tanaka et al., 1986; Raiguel et al., 1995), pattern motion (Movshon et al., 1985), transparency (Stoner and Albright, 1992; Qian and Andersen, 1994), and form cues (Albright, 1992). Consider the following formula obtained by keeping only the translational term in Equation 30:
This simple formula can capture two primary features of many MT neurons: a broad directional tuning curve, and speed modulation without changing the shape of the tuning curves (Rodman and Albright, 1987), while setting aside various other properties accounted for by more detailed models (Sereno, 1993; Nowlan and Sejnowski, 1995; Buračas and Albright, 1996; Simoncelli and Heeger, 1998). For many MT neurons, the tuning curves are sharper than cosine, in which case a circular normal curve in Equation EA17 might provide a better fit because of its closeness to a Gaussian (Albright, 1984). Linear speed modulation is probably a reasonable approximation for some neurons when the velocity is slow, but firing rates typically decrease after reaching a peak at an optimal speed (Maunsell and Van Essen, 1983). It would be interesting to test whether speed modulation is linear when averaged over raw firing rates for a large population of neurons, especially under ecologically plausible stimulus conditions. The above consideration may also apply to many V4 neurons, which responded to visual motion in an MT-like manner (Cheng et al., 1994). Cosine tuning curves for translational motion have also been described in the cerebellum (Krauzlis and Lisberger, 1996) and the parietal area 7a (Siegel and Read, 1997).
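As a hedged illustration of the two features named above, the following sketch assumes that the translational rule has the form f = f0 + |p| |v| cos θ, where θ is the angle between the stimulus direction and the preferred direction. This form is a reading of the verbal description, not a reproduction of the numbered equation, and the function names and parameter values are hypothetical. The sketch checks that, once the baseline is subtracted and the speed is divided out, the directional tuning curve has the same shape at two different speeds.

```python
import math

def translational_rate(direction_deg, speed, preferred_deg=0.0, gain=1.0, baseline=10.0):
    """Assumed linear rule: baseline + gain * speed * cos(angle from preferred direction)."""
    angle = math.radians(direction_deg - preferred_deg)
    return baseline + gain * speed * math.cos(angle)

def normalized_tuning(speed, n_directions=8, baseline=10.0):
    """Tuning curve with the baseline removed and the speed factor divided out."""
    directions = [360.0 * k / n_directions for k in range(n_directions)]
    return [(translational_rate(d, speed, baseline=baseline) - baseline) / speed
            for d in directions]

slow = normalized_tuning(speed=2.0)
fast = normalized_tuning(speed=8.0)
# Under the assumed rule the two normalized curves coincide up to rounding error.
same_shape = all(abs(a - b) < 1e-9 for a, b in zip(slow, fast))
print("tuning shape independent of speed:", same_shape)
```

A neuron whose rate peaks at an optimal speed and then declines would fail this check, which is one simple way to quantify departures from linear speed modulation.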
Three-dimensional object motion
Spiral motion
No direct experimental data are available on how a neuron responds systematically to a realistic moving three-dimensional object with arbitrary translation and rotation. One closely related example is the broad tuning of some neurons to spiral visual motion, which may be generated plausibly by a large moving planar object facing the observer.
As shown in Figure 8, neurons in monkey visual medial superior temporal area (MST), which receive a major input from area MT, typically respond well to wide-field random-dot spiral motion patterns (Graziano et al., 1994). Most neurons in the ventral intraparietal area (VIP) are also sensitive to visual motion (Colby et al., 1993), and some have tuning properties to spiral motion similar to those in area MST (Schaafsma and Duysens, 1996), probably due to input directly from MST and/or integration of inputs from area MT. Area 7a is at a higher level than MST and might have more complex response properties for optic flows (Siegel and Read, 1997). In theory, it is possible to build an MST-like neuron from MT-like local motion inputs, even with position-invariance properties (Saito et al., 1986; Poggio et al., 1991; Sereno and Sereno, 1991; Zhang et al., 1993). Tuning to spiral motion was predicted based on Hebbian learning of optic flow patterns (Zhang et al., 1993) and by other unsupervised learning algorithms (Wang, 1995; Zemel and Sejnowski, 1998).
To explain spiral tuning in terms of rigid motion, regard the environment itself as a large rigid object, moving relative to the observer. For the experiments mentioned above, the environment may be considered as a finely textured screen, oriented vertically, facing the observer. Translating this screen toward or away from the observer induces expansion or contraction, whereas rotating the screen around a perpendicular axis induces circular motion. According to the basic tuning rule in Equation 36, a neuron should respond to arbitrary motion of this screen with the firing rate:
To see why this accounts for spiral tuning, write Equation 73 in the equivalent form:
The above interpretation implies that firing rate should scale linearly with translational speed and angular speed independently of the spiral tuning curve. The responses of most MST neurons do indeed depend on speed (Tanaka and Saito, 1989; Orban et al., 1995), and many are monotonically increasing (Duffy and Wurtz, 1997a). It would be interesting to test quantitatively how well the linearity holds when averaged over a population of cells, especially for an ecologically relevant range of motion.
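The link between the rigid-motion rule and spiral tuning can be sketched numerically. The example below assumes that the response to the screen stimulus is a baseline plus a weighted sum of the expansion rate (translation along the line of sight) and the rotation rate (about the perpendicular axis), and that a spiral stimulus with angle phi mixes the two as cos(phi) and sin(phi). The weights `a` and `b`, the baseline, and this parameterization of spiral space are assumptions made for illustration only.

```python
import math

def spiral_rate(phi, strength, a=3.0, b=4.0, baseline=20.0):
    """Assumed rule: baseline + a * (expansion rate) + b * (rotation rate)."""
    expansion = strength * math.cos(phi)   # motion toward or away from the observer
    rotation = strength * math.sin(phi)    # rotation about the line of sight
    return baseline + a * expansion + b * rotation

def preferred_spiral_angle(a=3.0, b=4.0):
    """Spiral angle at which the assumed rule gives the largest response."""
    return math.atan2(b, a)

def cosine_form(phi, strength, a=3.0, b=4.0, baseline=20.0):
    """Equivalent cosine tuning in spiral space, scaled linearly by stimulus strength."""
    amplitude = math.hypot(a, b)
    return baseline + strength * amplitude * math.cos(phi - preferred_spiral_angle(a, b))

for k in range(8):
    phi = 2 * math.pi * k / 8
    print(round(spiral_rate(phi, 2.0), 3), round(cosine_form(phi, 2.0), 3))
```

The two columns agree for every spiral angle, so a linear combination of expansion and rotation sensitivity is indistinguishable from cosine tuning to spiral direction with a multiplicative speed factor.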
The tuning rule in Equation 73 also implies that the response should depend on the focus of expansion or the translational component in the optic flow, which also occurs for many MST cells (Duffy and Wurtz, 1995, 1997b). Adding a translational velocity vector to the stimulus corresponds to translating the stimulus screen sideways, which affects the angle α in Equation 74 and thus the response in Equation 73. Changing the translational direction and the rotation axis of the stimulus screen can alter both angles α and β in Equation 74 and thus the predicted response in Equation 73.
Motion-sensitive neurons in area MST may be used for purposes such as estimating heading or self-motion (Perrone and Stone, 1994; Lappe et al., 1996) or segmenting multiple moving objects (Zemel and Sejnowski, 1998). MST responses can be affected by various factors, including, for example, surround motion (Tanaka et al., 1986; Eifuku and Wurtz, 1998), disparity (Roy et al., 1992), eye position and movement (Newsome et al., 1988; Bradley et al., 1996; Squatrito and Maioli, 1997), vestibular input (Thier and Erickson, 1992), form cues (Geesaman and Andersen, 1996), the presence of multiple objects (Recanzone et al., 1997), and attention (Treue and Maunsell, 1996). Most experiments used simplified stimuli, although more realistic stimuli were tested recently (Sakata et al., 1994; Pekel et al., 1996). Because most of these examples contain parameters other than the object’s position and orientation, additional variables are needed to account for all of these effects in a model.
Further experimental test
Given all the contributing factors mentioned above, it is natural to ask how a neuron would respond to a more naturally moving three-dimensional object. A simple geometric stimulus is easier to specify and present but may lack important sensory cues needed to predict the response of a neuron to a natural stimulus. Our analysis relies on varying the translational direction and rotation axis and might provide a convenient basic description for response properties in terms of a preferred translational direction and a preferred rotation axis.
To test directly the basic tuning rule in Equation 30 or 36, one should present realistic images of a moving three-dimensional object to motion-sensitive neurons. The simplest way to test the theory is to oscillate an object slightly around a fixed axis. The oscillation should be sufficiently small so that salient visual cues are not occluded. For sinusoidal oscillations with frequency Ω and amplitude ρ:
Similarly, the response to translation in three-dimensional space could be tested by oscillating the whole object along a straight line:
For more efficient tests, the object could be rotated continuously with varying angular speed, covering all relevant views, first with respect to a fixed axis, and then systematically changing the axis. If the basic tuning rule is correct and the system is essentially linear, the tuning function and the preferred rotation axis could be computed for each view of the object. An even more efficient test is possible with a continuously time-varying rotation axis that generates tumbling movements of the object (Stone, 1998).
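The oscillation test can be sketched as follows, under assumptions stated here rather than taken from the numbered equations: the rotation angle is ρ sin(Ωt) about a unit axis n, so the angular velocity is ρΩ cos(Ωt) n, and the firing rate is a baseline plus the dot product of a preferred rotation vector q with the angular velocity. The names `oscillation_rate`, `preferred_axis`, and the numerical values are hypothetical, and Ω is treated as an angular frequency.

```python
import math

def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def oscillation_rate(t, axis, preferred_axis, amplitude, omega, baseline=15.0):
    """Assumed rule: baseline + q . w(t), with w(t) = amplitude * omega * cos(omega * t) * axis."""
    # Angular speed is the time derivative of amplitude * sin(omega * t).
    angular_speed = amplitude * omega * math.cos(omega * t)
    return baseline + dot(preferred_axis, axis) * angular_speed

q = (0.0, 0.0, 2.0)             # hypothetical preferred rotation vector; its length sets the gain
aligned = (0.0, 0.0, 1.0)       # oscillation axis parallel to q
orthogonal = (1.0, 0.0, 0.0)    # oscillation axis perpendicular to q
print(oscillation_rate(0.0, aligned, q, amplitude=0.1, omega=5.0))
print(oscillation_rate(0.0, orthogonal, q, amplitude=0.1, omega=5.0))
```

Under this rule the modulation depth is proportional to ρΩ and to the cosine of the angle between the oscillation axis and the preferred axis, so repeating the measurement for a few axes would be enough to estimate q for one view of the object.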
Eye position is one implicit factor that may affect the preferred translational direction and preferred rotation axis. The present theory allows an eye position effect but provides no additional constraints.
The linear response properties of a neuron for a given object are specified completely by its preferred translational direction, preferred rotation axis, and baseline firing rate for each given view of the object, as well as how these parameters depend on the view. All of these properties are experimentally testable and can be compared with the theoretical predictions in the preceding sections. For example, with the center of the object fixed, the curl-free condition for a given neuron should be tested by measuring its preferred rotation axis for four or more different orientations of the object. For a full test in six-dimensional space, both the preferred translational direction and the preferred rotation axis of the neuron should be measured for seven or more different positions and orientations of the object. See the Appendix for further discussion.
If motion-sensitive neurons with similar tuning properties are clustered in the brain, then it might be possible to use functional magnetic resonance imaging techniques to test the predicted properties of the tuning rule in animal and human subjects using realistic images of moving 3D objects as visual stimuli.
DISCUSSION
An explanation for cosine tuning
The remarkable ubiquity of approximately cosine tuning curves for a wide range of neural responses in the visual and motor systems suggests that there may be a common explanation that transcends the specific mechanisms that generate these response properties. We have shown that the low dimensionality of the geometric variables that underlie object motion and body movements could account for these observations. The gradient formulation of this general principle provides a rigorous framework for unifying the dependence of tuning curves on the axes and speeds of rotation and translation.
This theoretical framework makes a number of specific predictions. The primary prediction is the existence of preferred axes of rotation and translation for moving objects, which can be determined by systematically rotating and translating objects in the receptive field of cortical neurons. The firing rate of a neuron should fall off in proportion to the cosine of the angle between its preferred rotation axis and the true rotation axis, and likewise in proportion to the cosine of the angle between its preferred translational direction and the true translational direction. In addition, the speed and angular speed should modulate the firing rate multiplicatively without changing the shape of the directional tuning functions, somewhat related to the multiplicative gain fields in the parietal cortex for eye position (Andersen et al., 1997; Salinas and Abbott, 1997) and recent evidence for distance modulation of responses in visual cortex (Trotter et al., 1992; Dobbins et al., 1998). A secondary prediction is that the fields of preferred directions of rotation and translation for each individual neuron are curl free; these are global conditions on the overall pattern of vectors. The curl-free condition is relaxed in nongradient theory, which provides a theoretical alternative that can be experimentally tested.
Cosine tuning for the direction of arm movement characterizes many neurons in the motor cortex as well as in other parts of the motor system. If this cosine tuning with direction mainly reflects the geometrical constraint of moving in a three-dimensional space, as we propose, then the specific functions of these neurons in guiding and planning limb movements must be sought in other properties. One way to obtain this information is to measure how the preferred direction varies in space for different hand positions, because this vector field completely specifies the properties of a neuron in a linear theory. When the preferred direction field is curl free, the underlying potential function that generates the gradient field can be constructed empirically.
The cosine tuning function is an approximation to biological data. For example, the averaged directional tuning curves in Figure 2 are all slightly sharper than a cosine function. Such systematic deviation can be accounted for only by nonlinear theories (see Appendix). Ultimately, the underlying neural mechanisms that generate the tuning properties need to be considered in more detailed theories. These tuning properties might be the outcome of learning processes based on correlated neuronal activities induced by movement and motion. In this paper, we have focused on several analytically tractable situations to emphasize the existence of general neuronal tuning properties that are insensitive to the actual mechanisms.
How preferred rotation axes may be used to update a static representation
If the cosine tuning of motion-sensitive neurons is determined essentially by geometric constraints regardless of the actual computational functions, then what is the value of these simple response properties? For three-dimensional object motion, a population of neurons tuned to translational direction and rotation axis should carry sufficient information to determine the instantaneous motion of any given object and therefore could be used to update the static view represented elsewhere. This allows future sensory and motor states to be predicted from the current static state.
Information about the static view of an object is represented in the ventral visual stream in the monkey cortex, leading to the inferotemporal (IT) area (Ungerleider and Mishkin, 1982). The response of a view-sensitive neuron in IT area typically drops off smoothly as the object is rotated away from its preferred view, around either the vertical axis (Perrett et al., 1991) or other axes (Logothetis and Pauls, 1995). View-dependent representations for three-dimensional objects have been studied theoretically (Poggio and Edelman, 1990; Ullman and Basri, 1991) and have motivated several recent psychophysical experiments (Edelman and Bülthoff, 1992; Bülthoff et al., 1995; Liu et al., 1995; Sinha and Poggio, 1996). The general idea of a view-dependent representation in the IT area is consistent with recent neurophysiological results, including single-unit recordings (Perrett et al., 1991; Logothetis and Pauls, 1995; Logothetis et al., 1995) and optical imaging data (Wang et al., 1996).
Information about the instantaneous motion of an object is represented in the dorsal visual stream, including areas MT, MST, superior temporal polysensory area, and the parietal cortex, such as area 7a. Given a population of motion-sensitive neurons tuned to translation and rotation, it should be possible to extract complete information about the instantaneous motion of any object. For example, a six-dimensional population vector can be used to reconstruct rigid motion. More efficient reconstruction methods may also be used and implemented by a biologically plausible feedforward network (Zhang et al., 1998). The same set of neurons can extract the motion of different objects by combining the activities of input neurons differently. Instantaneous translation and rotation determine how the current view of this object is changing at the moment and could be used to update the static view representation in the IT area.
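The population-vector idea can be illustrated in a reduced setting. The sketch below is restricted to two-dimensional translation rather than the full six-dimensional case, and it assumes an idealized population with evenly spaced unit preferred directions, a known common baseline, and the linear rule f_i = f0 + p_i · v. All names and values are hypothetical.

```python
import math

def preferred_directions(n):
    """n unit vectors evenly spaced around the circle (an idealized, hypothetical population)."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n)) for k in range(n)]

def firing_rates(velocity, directions, baseline=10.0):
    """Assumed linear rule for each neuron: baseline + p_i . v."""
    return [baseline + d[0] * velocity[0] + d[1] * velocity[1] for d in directions]

def population_vector(rates, directions, baseline=10.0):
    """Sum of preferred directions weighted by rate above baseline, scaled by 2 / n."""
    n = len(directions)
    x = sum((r - baseline) * d[0] for r, d in zip(rates, directions))
    y = sum((r - baseline) * d[1] for r, d in zip(rates, directions))
    return (2.0 * x / n, 2.0 * y / n)

directions = preferred_directions(12)
true_velocity = (1.5, -0.5)
estimate = population_vector(firing_rates(true_velocity, directions), directions)
print(estimate)
```

With evenly spaced directions the factor 2 / n makes the estimate equal to the true velocity; for a nonuniform population the simple weighted sum would be biased, which is one reason the more efficient reconstruction methods mentioned above may be preferable.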
Broad tuning to static views logically implies that each static view of an object elicits a certain activity pattern in the temporal cortex, and that as the view changes, the pattern of activity also changes smoothly, depending on the axis of rotation and direction of translation. A complete representation of the dynamic state of an object would require representing information about both the current view and how the view is changing, so that the system can effectively update its internal state in accordance with the movement of the object. Such motion information might help improve the speed and reliability of the responses of view-specific neurons to a three-dimensional object during natural movements.
Conclusion
We have shown that simple generic tuning properties arise when an encoded sensory or motor variable reflects changes rather than static configurations. By linearizing the system locally for movement-sensitive neurons, the analysis reveals mechanism-insensitive tuning properties that mainly reflect the geometry of the problem rather than the exact encoding mechanisms, which could be much more complicated. Although a nonlinear analysis is also considered (Appendix), the basic linear theory already captures some essential features of the biological data, such as sensory responses to visual pattern motions and directional tuning for reaching movements. The analysis predicts the existence of a preferred translational direction and a preferred rotation axis in space with cosine tuning functions for representing arbitrary three-dimensional object motion. For natural movements that have an intrinsically low dimensionality, combinations of variables become highly constrained and cannot be changed arbitrarily. It is precisely for these constrained movements that the mechanism-insensitive properties studied here may become useful. By contrast, the analysis may not apply for artificial movements such as computer-generated visual motion stimuli that do not satisfy any simplifying geometric constraints that occur in the real world. The brain should have more efficient representations for those stimulus features that are consistent with commonly encountered configurations in the real world. The analysis presented here may help to predict tuning properties of motion-sensitive neurons in unknown situations by providing a basic description of expected properties with which more detailed characterizations as well as potential deviations can be contrasted.
EXTENDED THEORIES
We first reformulate the linear tuning theory for motion-sensitive neurons in general terms and then make nonlinear extensions. Here it is assumed that the natural movements of interest can be parameterized by a low-dimensional vector variable:
Linear gradient theory
Assume that the mean firing rate of a motion-sensitive neuron is linearly related to the time derivative of a potential function Φ(x) of the state variable x. This leads to the tuning rule:
Linear nongradient theory
Assume directly a linear relationship between the firing rate and the components of the generalized velocity v. This leads to the tuning rule:
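In hedged form, and using the symbols f_0 (baseline rate) and p(x) (coefficient vector) as assumed notation, the two linear rules can be sketched as follows. This is a reading of the verbal assumptions stated for the gradient and nongradient theories, not a reproduction of the numbered equations.

```latex
% Gradient theory: the rate is linear in d\Phi/dt, expanded by the chain rule.
f = f_0 + \frac{d\Phi(\mathbf{x})}{dt}
  = f_0 + \nabla\Phi(\mathbf{x}) \cdot \mathbf{v},
\qquad \mathbf{v} = \dot{\mathbf{x}}

% Nongradient theory: the rate is linear in the components of \mathbf{v},
% with coefficients that may depend on the state \mathbf{x}.
f = f_0 + \mathbf{p}(\mathbf{x}) \cdot \mathbf{v}
```

In this notation the two rules coincide exactly when p(x) is the gradient of some potential Φ(x), which is the property that the curl-free condition tests.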
Coordinate-system independence
Both the gradient and the nongradient theories are independent of which variables are chosen to parameterize the movements. Suppose the old vector variable x and a new variable x̃ are related by:
The tuning rule has the same form in both coordinate systems:
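A small numerical check can make the coordinate-system independence concrete. The sketch below assumes the gradient form, in which the rate change equals the gradient of a potential dotted with the velocity, and compares Cartesian coordinates (x, y) with polar coordinates (r, θ). The potential Φ = x²y + y is an arbitrary choice made for illustration, and the function names are hypothetical.

```python
import math

def rate_change_cartesian(x, y, vx, vy):
    """grad(Phi) . v in Cartesian coordinates for Phi = x**2 * y + y."""
    dphi_dx = 2.0 * x * y
    dphi_dy = x * x + 1.0
    return dphi_dx * vx + dphi_dy * vy

def rate_change_polar(x, y, vx, vy):
    """The same quantity computed from polar coordinates (r, theta) and their time derivatives."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    r_dot = (x * vx + y * vy) / r
    theta_dot = (x * vy - y * vx) / (r * r)
    c, s = math.cos(theta), math.sin(theta)
    # Phi in polar form: r**3 * cos(theta)**2 * sin(theta) + r * sin(theta)
    dphi_dr = 3.0 * r * r * c * c * s + s
    dphi_dtheta = r ** 3 * (c ** 3 - 2.0 * c * s * s) + r * c
    return dphi_dr * r_dot + dphi_dtheta * theta_dot

print(rate_change_cartesian(1.0, 1.0, 1.0, 0.0), rate_change_polar(1.0, 1.0, 1.0, 0.0))
```

Both functions return the same value because the coefficients and the generalized velocity transform in compensating ways under a change of variables, so the rule keeps the form of a dot product in either parameterization.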
Nonlinear theory: circular normal tuning
The circular normal tuning function for firing rate has the general form:
The circular normal function can mimic either a cosine or a Gaussian. When K is very small, exp (K cos α) ∼ 1 + K cos α so that the circular normal function in Equation EA17 approaches the cosine function in Equation EA18 with A′ = A + B and B′ = BK. When K is large, cos α ∼ 1 − α^{2}/2 so that the circular normal function approaches a narrow Gaussian function with the variance 1/K.
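The two limits can be checked numerically. The sketch below assumes the circular normal function has the form A + B exp(K cos α), which is the form implied by the stated relations A′ = A + B and B′ = BK; the function names and the test values of K are hypothetical.

```python
import math

def circular_normal(alpha, A, B, K):
    """Assumed general form: A + B * exp(K * cos(alpha))."""
    return A + B * math.exp(K * math.cos(alpha))

def cosine_limit(alpha, A, B, K):
    """Small-K approximation: (A + B) + (B * K) * cos(alpha)."""
    return (A + B) + B * K * math.cos(alpha)

def gaussian_limit(alpha, A, B, K):
    """Large-K approximation: a Gaussian in alpha with variance 1 / K."""
    return A + B * math.exp(K) * math.exp(-K * alpha * alpha / 2.0)

angles = [math.pi * k / 50 for k in range(-50, 51)]
# Worst-case absolute error of the cosine approximation for a small K.
small_k_error = max(abs(circular_normal(a, 1.0, 1.0, 0.01) - cosine_limit(a, 1.0, 1.0, 0.01))
                    for a in angles)
# Worst-case error of the Gaussian approximation for a large K, relative to the peak height.
peak = circular_normal(0.0, 0.0, 1.0, 50.0)
large_k_error = max(abs(circular_normal(a, 0.0, 1.0, 50.0) - gaussian_limit(a, 0.0, 1.0, 50.0)) / peak
                    for a in angles)
print(small_k_error, large_k_error)
```

Both errors are small for these values of K, which is why a single parameter K is enough to interpolate between cosine-like and Gaussian-like tuning curves when fitting data.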
How can we generate a circular normal tuning function? Because the time-derivative equation for firing rate:
Although Equation EA20 can lead to a circular normal function, it does not specify how this occurs. One plausible biological mechanism is a recurrent network with appropriate lateral connections, which can generate a tuning curve closer to a circular normal function than to a cosine function (Pouget et al., 1998).
Nonlinear theory: acceleration tuning and quadratic speed modulation
In this section the basic tuning theory is generalized by including the second temporal derivative. This leads to acceleration tuning, nonlinear speed modulation, and departure from perfect cosine directional tuning.
Assume that the firing rate of a neuron contains not only the first temporal derivative of a potential function, but also the second temporal derivative of another potential function:
This assumption leads to the new tuning rule:
Example 1: Reaching movement
Here the vector variable x = (x_{1}, x_{2}, x_{3}) = (x, y, z) describes the hand position. The dot product terms in Equation EA23 mean that both the velocity and the acceleration have preferred directions and cosine directional tuning functions, together with multiplicative linear modulation by the speed or the magnitude of acceleration. Ashe and Georgopoulos (1994) included acceleration terms in a different regression formula and found a small number of cells related to hand acceleration. Systematic tests are needed to determine whether the acceleration tuning predicted by Equation EA23 really exists.
Example 2: Rigid object motion
Here the vector variable x = (x_{1}, x_{2}, … , x_{6}) = (x, y, z, θ_{1}, θ_{2}, θ_{3}) describes the object’s position and orientation in space. By transforming (θ̇_{1}, θ̇_{2}, θ̇_{3}) into the angular velocity in physical space, the tuning rule becomes:
Effects of quadratic terms
The quadratic speed terms imply both nonlinear speed modulation and higher-order Fourier components for directional tuning that are speed dependent. To see this, consider a two-dimensional reaching example with the hand velocity:
The second Fourier component can either sharpen or broaden the original cosine function, depending on its sign. As illustrated in Figure 9, if the tuning curve is sharpened by the second component, it becomes even sharper as the speed increases; if the tuning curve is broadened, then it becomes even broader as the speed increases. However, the amplitude of the second component (cos 2θ) should be no more than one-fourth of that of the first one (cos θ) to ensure that the tuning curve has only a single peak. This limits the effects from the second Fourier term. For speed modulation, the quadratic speed factor produces only a slight bend (Fig. 9) and is too weak by itself to produce the ∪-shaped or ∩-shaped curves seen for some neurons in area MT (Maunsell and Van Essen, 1983; Rodman and Albright, 1987) and MST (Orban et al., 1995; Duffy and Wurtz, 1997a). Rodman and Albright (1987) also showed that the average tuning widths of MT neurons were insensitive to speed, although the typical experimental errors for individual neurons might mask the small effects shown here. Thus, the second Fourier component with squared speed may help improve data fitting (compare Fig. 2), but probably only within a narrow range of speeds.
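The one-fourth bound on the second Fourier component can be verified with a short numerical sketch. It considers the normalized curve cos θ + c cos 2θ, whose derivative is −sin θ (1 + 4c cos θ), so extra critical points appear only when |c| exceeds 1/4. The function name `count_peaks` and the grid size are hypothetical choices.

```python
import math

def count_peaks(c, n=720):
    """Count local maxima of cos(theta) + c * cos(2 * theta) on a circular grid of n points."""
    values = [math.cos(2 * math.pi * k / n) + c * math.cos(4 * math.pi * k / n)
              for k in range(n)]
    peaks = 0
    for i in range(n):
        # values[i - 1] wraps around for i = 0, so the grid is treated as circular.
        if values[i] > values[i - 1] and values[i] >= values[(i + 1) % n]:
            peaks += 1
    return peaks

for c in (-0.4, -0.2, 0.2, 0.4):
    print(c, count_peaks(c))
```

For |c| below one-fourth the curve keeps a single peak, sharpened for positive c and broadened for negative c; beyond the bound a second peak appears, either opposite the preferred direction or as a split of the original peak.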
LINEAR VECTOR FIELD FOR DATA ANALYSIS
Linear vector field from experimental data
As shown in the main text, the preferred direction and the preferred rotation axis may be generated by gradient fields of a potential function. Here we consider how to test the gradient condition experimentally. In two and three dimensions, where the curl can be defined, a vector field generated as the gradient of any potential function is always curl free (Fig. 10). In particular, a two-dimensional vector field (u(x, y), v(x, y)) has both zero curl and zero divergence if and only if
For testing the gradient-field condition or the curl-free condition with sparsely sampled data points, an additional smoothness constraint is needed. Linearity is a reasonable smoothness requirement, at least for a local region, such as in measurement of local force fields (Mussa-Ivaldi et al., 1985; Giszter et al., 1993), and in local optic flow analysis (Koenderink and van Doorn, 1976). A linear vector field has the general form:
To determine A and b from data vectors p_{1}, p_{2}, … , p_{N} sampled at positions r_{1}, r_{2}, … , r_{N}, respectively, we require the total number of data points:
The least-squares solution is:
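As a hedged illustration of this fitting step, the sketch below fits a two-dimensional linear field p = A r + b by ordinary least squares through the normal equations and then reads the curl from the antisymmetric part of A. It is a generic least-squares fit written for this example, not the closed-form expression of the appendix, and the function names and sample data are hypothetical.

```python
def solve_linear_system(G, rhs):
    """Solve G w = rhs for a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(G)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (M[r][n] - sum(M[r][c] * w[c] for c in range(r + 1, n))) / M[r][r]
    return w

def fit_linear_field(positions, vectors):
    """Least-squares estimate of A (2 x 2) and b (2) in p = A r + b from sampled vectors."""
    rows = [[x, y, 1.0] for x, y in positions]
    G = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    A, b = [], []
    for comp in range(2):
        rhs = [sum(r[i] * p[comp] for r, p in zip(rows, vectors)) for i in range(3)]
        w = solve_linear_system(G, rhs)
        A.append(w[:2])
        b.append(w[2])
    return A, b

def curl_2d(A):
    """Curl of a linear field (u, v): dv/dx - du/dy = A[1][0] - A[0][1]."""
    return A[1][0] - A[0][1]

positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
gradient_field = [(2.0 * x + y, x) for x, y in positions]   # gradient of x**2 + x*y
rotational_field = [(-y, x) for x, y in positions]          # pure rotation, not a gradient
A_grad, b_grad = fit_linear_field(positions, gradient_field)
A_rot, b_rot = fit_linear_field(positions, rotational_field)
print(curl_2d(A_grad), curl_2d(A_rot))
```

A fitted matrix A that is symmetric within measurement error is consistent with a gradient field, whereas a clear antisymmetric part indicates nonzero curl; with noisy data the comparison would need an error estimate, which this sketch does not provide.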
Examples of the simplex method
In this section, we illustrate an alternative formulation of the curl-free condition with minimal data points in three-dimensional space, which can be extended readily to other dimensions. For local interpolation in three-dimensional space, a vector field should at a minimum be sampled at four locations 1, 2, 3, 4, not all lying in the same plane (Fig. 11). Suppose the coordinates of the four points are x_{1}, x_{2}, x_{3}, x_{4}, and the corresponding data vectors are p_{1}, p_{2}, p_{3}, p_{4}. The simplex method is based on the fact that any point x in three-dimensional space can be expressed as:
To derive an integral formula for the curl-free condition, first integrate along the straight line segment from x_{1} to x_{2} with a linearly interpolated vector, yielding:
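The segment integration can be sketched numerically. For a vector that is linearly interpolated between p_{1} and p_{2}, the line integral along the straight segment equals the average of the two end vectors dotted with the displacement x_{2} − x_{1}. Summing such terms around a closed triangle gives the circulation, which vanishes for a curl-free linear field. The function names and the two test fields below are hypothetical.

```python
def segment_integral(x1, x2, p1, p2):
    """Line integral from x1 to x2 of a vector interpolated linearly between p1 and p2."""
    return 0.5 * sum((a + b) * (d - c) for a, b, c, d in zip(p1, p2, x1, x2))

def loop_circulation(points, vectors):
    """Sum of segment integrals around the closed polygon through the sampled points."""
    n = len(points)
    return sum(segment_integral(points[i], points[(i + 1) % n],
                                vectors[i], vectors[(i + 1) % n])
               for i in range(n))

triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 1.0)]
gradient_samples = [(2.0 * x, z, y) for x, y, z in triangle]    # gradient of x**2 + y*z
rotational_samples = [(-y, x, 0.0) for x, y, z in triangle]     # rotation about the z axis
print(loop_circulation(triangle, gradient_samples))
print(loop_circulation(triangle, rotational_samples))
```

The circulation is zero for the gradient samples and nonzero for the rotational samples, so checking each triangular face of a sampled simplex gives a test of the curl-free condition that uses only the measured vectors and their positions.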
In the experiment by Caminiti et al. (1990), the preferred direction of a motor cortical neuron was sampled at three points at equal distance, similar to the case in the bottom diagram in Figure 11. Here the linearity of the vector field only entails that:
Footnotes

We are grateful to T. D. Albright, G. T. Buračas, G. E. Hinton, R. J. Krauzlis, K. D. Miller, A. B. Schwartz, M. I. Sereno, M. P. Stryker, R. S. Turner, D. Zipser, and two anonymous reviewers for helpful comments on the analysis presented here.
Correspondence should be addressed to Dr. Terrence Sejnowski, Computational Neurobiology Lab, The Salk Institute, La Jolla, CA 92037.