Abstract
A child often learns to ride a bicycle in the driveway, free of unforeseen obstacles. Yet when she first rides in the street, we hope that if a car suddenly pulls out in front of her, she will combine her innate goal of avoiding an accident with her learned knowledge of the bicycle, and steer away or brake. In general, when we train to perform a new motor task, our learning is most robust if it updates the rules of online error correction to reflect the rules and goals of the new task. Here we provide direct evidence that, after a new feedforward motor adaptation, motor feedback responses to unanticipated errors become precisely task appropriate, even when such errors were never experienced during training. To study this ability, we asked how, if at all, online responses to occasional, unanticipated force pulses during reaching arm movements change after adaptation to altered arm dynamics, and specifically whether they change in a task-appropriate manner. In our task, subjects learned novel velocity-dependent dynamics. However, occasional force-pulse perturbations produced unanticipated changes in velocity. Therefore, after adaptation, task-appropriate responses to unanticipated pulses should compensate corresponding changes in velocity-dependent dynamics. We found that after adaptation, pulse responses precisely compensated these changes, although they were never trained to do so. These results provide evidence for a smart feedback controller which automatically produces responses specific to the learned dynamics of the current task. To accomplish this, the neural processes underlying feedback control must (1) be capable of accurate real-time state prediction for velocity via a forward model and (2) have access to recently learned changes in internal models of limb dynamics.
Introduction
Humans have the extraordinary ability to learn a vast array of motor tasks with modest amounts of practice and to effectively compensate unforeseen errors in performing each of those tasks. This ability to perform online error feedback control is remarkable in light of the relatively long sensorimotor loop delays impeding information flow between the brain and the periphery (Cordo, 1990; Flanders et al., 1993). Indeed, the apparent complexity required of such a feedback controller has led to a long-running debate over the precise nature and importance of long-latency supraspinal online feedback control, particularly when movements are short and targeted (Houk and Rymer, 1981). However, cortically modulated long-latency responses are often larger than their short-latency counterparts (Marsden et al., 1976), and high-gain responses to visual position and velocity error information are present throughout the entire course of even relatively quick arm movements (Saunders and Knill, 2003, 2004). Moreover, patients with Huntington's disease, which specifically affects supraspinal structures, have been shown to have profound deficits in online error feedback control, even at the earliest stages of this disease (Smith et al., 2000), despite an intact ability to use error signals for trial-to-trial learning (Smith and Shadmehr, 2005). Together, these observations suggest that cortically modulated mechanisms for error feedback control play a key role in even short, rapid movements.
A significant concern with the implementation of long-latency feedback corrections is stability. For all but the smallest gain corrections, it has been argued that long-latency feedback control may not be feasible because of instability caused by large sensorimotor loop delays. However, if the motor system has a means of predicting future state, these instabilities can be eliminated (Wolpert and Miall, 1996; Bhushan and Shadmehr, 1999; Wolpert and Flanagan, 2001). Long-latency online feedback control, therefore, might rely on a so-called forward model of dynamics, which enables the motor system to predict future motion based on past motor commands and delayed sensory feedback (Duhamel et al., 1992; Jordan and Rumelhart, 1992; Wolpert and Miall, 1996; Bhushan and Shadmehr, 1999; Blakemore et al., 1999; Desmurget and Grafton, 2000; Ariff et al., 2002; Mehta and Schaal, 2002; Sommer and Wurtz, 2002; Flanagan et al., 2003). If this prediction could span the latency at which feedback corrections take place, then feedback control driven by this prediction could essentially occur in real time and avoid the instability normally associated with feedback delays.
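The stability argument above can be illustrated with a deliberately simplified simulation. The sketch below is not the model used in this study: it assumes a one-dimensional point mass, hypothetical gains (`k`, `b`), a hypothetical 100 ms loop delay, and a noise-free forward model that replays the efference copy of recent commands to bring the delayed measurement up to the present.

```python
def simulate(delay_s, use_prediction, k=200.0, b=10.0, mass=1.0,
             dt=0.001, duration=2.0, x0=0.01):
    """Return the position trace of a 1-D point mass under delayed feedback.

    All parameter values are hypothetical and chosen only for illustration.
    """
    n_steps = int(round(duration / dt))
    delay_steps = int(round(delay_s / dt))
    xs, vs, us = [x0], [0.0], []
    for i in range(n_steps):
        j = max(i - delay_steps, 0)      # newest sample the controller can sense
        x_est, v_est = xs[j], vs[j]
        if use_prediction:
            # Forward model: replay the commands issued since the sensed
            # sample (efference copy) to estimate the current state.
            for u_past in us[j:i]:
                x_est, v_est = x_est + v_est * dt, v_est + (u_past / mass) * dt
        u = -k * x_est - b * v_est       # corrective force toward x = 0
        us.append(u)
        xs.append(xs[i] + vs[i] * dt)    # explicit Euler update of the plant
        vs.append(vs[i] + (u / mass) * dt)
    return xs


def peak_tail(xs, n_last=500):
    """Largest absolute position over the final n_last samples."""
    return max(abs(x) for x in xs[-n_last:])


# Same gains and same delay, with and without state prediction.
delayed = peak_tail(simulate(0.1, use_prediction=False))
predicted = peak_tail(simulate(0.1, use_prediction=True))
```

With these assumed values, the purely delayed controller oscillates with growing amplitude, whereas the predicting controller settles; an imperfect forward model would fall somewhere between the two.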
Any long-latency feedback mechanism would be most effective if its motor output depended not only on trajectory error, but also on the present task and environmental dynamics. This could be accomplished if the feedback controller could learn directly from trial-to-trial error signals, independently from the mechanisms for feedforward motor adaptation. However, a more parsimonious feedback control policy could simply translate desired corrective motion into the motor output using the same mechanism that performs this computation in feedforward control: an adaptive inverse dynamics model (Katayama and Kawato, 1993; Shadmehr and Mussa-Ivaldi, 1994; Wolpert and Kawato, 1998; Kawato, 1999). A shared internal model for feedforward and feedback control would allow feedforward adaptation to automatically train feedback responses.
In the present work, we address two key issues about the contribution of state-predicting forward models of dynamics to feedback control. First, is the motor system capable of using delayed sensory information to continuously and accurately predict the values of state variables in real time for use in feedback control? The second issue concerns the relationship between motor adaptation and feedback control. If we are never trained in the art of error correction when learning a novel task, is the motor system's learned internal model of task dynamics automatically made available to the error feedback controller? That is, can forward-model-based feedback control adapt automatically to novel dynamics? And, if so, what does this tell us about the functional relationships between forward and inverse models of dynamics?
Materials and Methods
General task description.
Subjects were instructed to make point-to-point reaching movements in the horizontal plane while holding a handle at the end of a robotic manipulandum (Inmotion2 arm; Interactive Motion Technologies). Visual information on movements was provided through a vertically oriented LCD computer monitor. A small, circular (3 mm diameter) on-screen cursor tracked subjects' movements, and larger (1 cm diameter) circles indicated the target locations. Their right arms were supported in the horizontal plane by a ceiling-mounted sling. The starting position for forward movements corresponded to an initial hand position in the midline 30 cm from the chest, whereas the target position was 10 cm diagonally forward and to the right, with an angle determined for each subject based on the physical dynamics of his or her arm (see below, Perturbations). Although this angle varied, the range was only 43–47°. At the end of each movement, we provided performance feedback based on the time taken to reach the target and the peak hand speed on that trial, with a “good” movement defined as one lasting 0.45–0.55 s with a peak speed of 0.30–0.35 m/s. The color of the target changed to reflect the quality of the movement: red for too fast, blue for too slow, and an expanding circle accompanied by a sound for a good movement. Eleven naive subjects, four male and seven female, aged 18–22, performed the experiment. All subjects provided informed consent, and this study was approved by the Harvard University Institutional Review Board.
The experiment was divided into two phases in blocks of one hundred 10 cm movements, 50 in each of two (back and forth) target directions. The target location for each movement became the starting point for the next. The first phase consisted of a four-block baseline period in which no force field was active, whereas the second was a learning phase of six blocks (blocks 5–10) during which subjects performed movements in a viscous curl force field:
The effect of the field for linear movements is to laterally perturb trajectories with a force proportional in magnitude to the speed of the hand (Fig. 1). After the first two blocks of the learning phase, subjects' performance reached an asymptote, so we considered blocks 7–10 to be the “late-learning” period, a term we use throughout the remainder of this study. During both phases, approximately one-fifth of all movements were laterally error clamped by applying a force channel (Scheidt et al., 2000; Smith et al., 2006) between the initial hand position and the center of the target, which effectively counteracted lateral motion and forced nearly perfect straight line movements to targets (average maximum absolute deviation, <0.7 mm). The force channel was implemented as a highly stiff, viscous one-dimensional spring and damper in the direction orthogonal to the vector between the initial hand position and the center of the target, with K = 6000 N/m and B = 250 N/(m/s). Application of this error clamp allows for high-accuracy measurement of subject-produced lateral forces, because the robotic arm must effectively produce lateral forces precisely equal and opposite to those produced by the subject to clamp lateral error at zero. By inactivating the force field and clamping lateral error at zero for these trials, we also eliminate the effect of changes in subjects' arm stiffness: observed lateral forces must be actively produced by the subject rather than a response to lateral errors, as lateral errors are held to be essentially zero.
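As a minimal sketch of how such a force channel could be computed (an illustration, not the robot's actual control code), the function below projects hand position and velocity onto the unit vector orthogonal to the start-to-target line and applies the spring and damper gains given above. The function name, argument layout, and sign convention are assumptions.

```python
import math

K_CHANNEL = 6000.0  # N/m, channel stiffness stated in the text
B_CHANNEL = 250.0   # N/(m/s), channel viscosity stated in the text


def channel_force(pos, vel, start, target, k=K_CHANNEL, b=B_CHANNEL):
    """Return the (fx, fy) force in newtons that pushes the hand back
    onto the straight line from start to target.

    pos, start, target are (x, y) in meters; vel is (vx, vy) in m/s.
    """
    dx, dy = target[0] - start[0], target[1] - start[1]
    length = math.hypot(dx, dy)
    # Unit vector orthogonal to the movement direction.
    nx, ny = -dy / length, dx / length
    # Lateral deviation and lateral velocity along that normal.
    lat_pos = (pos[0] - start[0]) * nx + (pos[1] - start[1]) * ny
    lat_vel = vel[0] * nx + vel[1] * ny
    # One-dimensional spring and damper acting only in the lateral direction.
    f_lat = -k * lat_pos - b * lat_vel
    return (f_lat * nx, f_lat * ny)


# Hand displaced 1 mm to the right of a straight-ahead 10 cm movement.
fx, fy = channel_force((0.001, 0.05), (0.0, 0.3), (0.0, 0.0), (0.0, 0.1))
```

Because the channel acts only along the normal, longitudinal motion is unaffected, and at zero lateral error the force the robot applies equals and opposes the lateral force the subject produces.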
Movements with perturbations.
During both the baseline and learning phases, randomly distributed on
Because the goal of giving these pulses was to examine the change in the online error feedback response as a function of learning force-field dynamics, we wanted to ensure that the pulses themselves provided subjects with no new feedback between the baseline (null field) condition and the learning (force field) condition. In particular, we wanted to ensure that subjects' only source of lateral error feedback that could influence their lateral force profiles was learning the force field itself on unpulsed trials. To this end, we only gave pulses in a force channel, so that lateral position error was clamped at zero, and the force field was always off during pulsed trials. Thus, a perturbation movement given during the baseline condition was precisely equivalent to a perturbation movement given during the learning phase, aside from changes in subjects' responses. We argue therefore that any changes in subjects' responses to these perturbations between the two phases were attributable to adaptation to the force field on unpulsed trials, and cannot be attributed to any learning from the perturbation trials themselves.
Because we were interested primarily in lateral force profiles during these perturbations, we attempted to further minimize any activity lateral to movement, beyond using the force channel, by carefully choosing the movement direction and the perturbation direction. There were two primary considerations. First, inertial anisotropy in the physical passive dynamics of the coupled human and robot arm system is such that, in general, a force perturbation will produce accelerations both in the direction of the applied force vector and in the direction lateral to it. Thus, we wanted the combination of movement direction and perturbation direction that would minimize these effects.
We noted that for these two-dimensional (i.e., planar) movements, the mass matrix of the coupled human/robot system possesses two orthogonal eigendirections because this matrix is known to be symmetric. The acceleration-dependent component of the dynamics of this system can be expressed as follows:

$$F\ [\mathrm{N}] = M\ [\mathrm{kg}] \cdot a\ [\mathrm{m/s^2}]$$

where F and a represent force and acceleration at the hand, respectively, and M represents the effective inertia matrix of the coupled system. Note that the units are provided alongside each term for clarity. For this system, only forces applied along these eigendirections will produce accelerations in exactly the same directions as the applied forces. Thus, a movement along an eigendirection with a perturbation along its axis (resistive or assistive) should eliminate lateral accelerations caused by the perturbation itself.
To estimate this eigendirection, we had each subject perform a pre-experiment phase during which they were given perturbations in 120 uniformly distributed directions during point-to-point movements. We then used these force data and the accelerations they produced to estimate the mass matrix for the subject, from which we could then calculate the eigenvalues and eigenvectors. Each subject then moved along his or her own eigendirection for the actual experiment, although the range across subjects was tightly bounded between 43 and 47°. Note that this procedure reduced the lateral components of the stretch responses to longitudinal pulses from nearly 15 N in magnitude (data not shown) to ∼2.5 N in the baseline condition (Fig. 4A, dotted lines).
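One way such an estimate could be obtained is sketched below. The fitting method is an assumption (ordinary least squares followed by symmetrization), not necessarily the procedure used here, and the synthetic mass matrix `M_TRUE` is a hypothetical example chosen so that the principal axis lies at 45°.

```python
import math


def fit_mass_matrix(forces, accels):
    """Least-squares 2x2 matrix M with F ~= M a, then symmetrized.

    forces and accels are sequences of (x, y) pairs at the hand.
    """
    s_fa = [[0.0, 0.0], [0.0, 0.0]]  # running sum of F a^T
    s_aa = [[0.0, 0.0], [0.0, 0.0]]  # running sum of a a^T
    for f, a in zip(forces, accels):
        for r in range(2):
            for c in range(2):
                s_fa[r][c] += f[r] * a[c]
                s_aa[r][c] += a[r] * a[c]
    det = s_aa[0][0] * s_aa[1][1] - s_aa[0][1] * s_aa[1][0]
    inv = [[s_aa[1][1] / det, -s_aa[0][1] / det],
           [-s_aa[1][0] / det, s_aa[0][0] / det]]
    # M = (sum F a^T) (sum a a^T)^-1
    m = [[sum(s_fa[r][k] * inv[k][c] for k in range(2)) for c in range(2)]
         for r in range(2)]
    off = 0.5 * (m[0][1] + m[1][0])  # enforce symmetry
    return [[m[0][0], off], [off, m[1][1]]]


def eigendirections(m):
    """Return [(eigenvalue, angle_rad), ...] for a symmetric 2x2 matrix,
    larger eigenvalue first; the two angles are 90 degrees apart."""
    mean = 0.5 * (m[0][0] + m[1][1])
    radius = math.hypot(0.5 * (m[0][0] - m[1][1]), m[0][1])
    theta = 0.5 * math.atan2(2.0 * m[0][1], m[0][0] - m[1][1])
    return [(mean + radius, theta), (mean - radius, theta + math.pi / 2.0)]


# Synthetic check: unit forces in 120 uniformly spaced directions applied to
# a hypothetical effective mass matrix with eigenvalues 2 and 1.
M_TRUE = [[1.5, 0.5], [0.5, 1.5]]
M_INV = [[0.75, -0.25], [-0.25, 0.75]]
forces = [(math.cos(2 * math.pi * i / 120), math.sin(2 * math.pi * i / 120))
          for i in range(120)]
accels = [(M_INV[0][0] * fx + M_INV[0][1] * fy,
           M_INV[1][0] * fx + M_INV[1][1] * fy) for fx, fy in forces]
(major, major_angle), (minor, minor_angle) = eigendirections(
    fit_mass_matrix(forces, accels))
```

With real data, the accelerations would be measured rather than computed, and the fit would be noisy; either eigendirection satisfies the requirement that force and acceleration be collinear.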
Computational modeling.
We implemented two classes of feedback control models for the human motor system to better understand the mechanisms potentially responsible for the important features of our data. The first model is an example of adaptive changes to feedback responses that do not incorporate state prediction, whereas the second model is a particular implementation of a feedback mechanism that incorporates both adaptation and state prediction. Full motivation for implementing these models is given in Results.
In the first model, force-field compensation is learned through a rotation of preferred activation directions of relevant muscles, or a nearly equivalent rotation of baseline torque. In this model, originally suggested by Thoroughman and Shadmehr (1999) and diagrammed in Figure 6A, the motor system does not know the precise dynamics of the force field. The model takes as input a desired trajectory, which we set to a minimum-jerk 10 cm point-to-point movement with a peak speed of 0.27 m/s, which closely approximates the mean unperturbed velocity profile in our data (see Fig. 2B). The “inverse model” of arm dynamics converts this desired trajectory into a planned pattern of joint torques. In the null field, the inverse model simply accounts for the physical passive dynamics of the coupled human- and robot-arm system. Here, the inverse model learns to compensate the viscous curl force-field environment by performing a muscle rotation in the preferred directions of the relevant muscles: biceps, triceps, and anterior and posterior deltoid. There are a number of ways to perform the torque-to-muscle decomposition depending on assumptions about the levels of biarticulation between muscles. Following Bhushan and Shadmehr (1999), we implemented six different muscle-moment arm models (Wood et al., 1989; Thoroughman and Shadmehr, 1999), and rotated the preferred directions of the muscles in these models by the amounts that have been shown to produce torques approximating the learning of this force field (between 12 and 26° of rotation depending on the model and the muscle). After rotation, muscle activations are then recombined into joint torques for each moment arm model. For the torque rotation model, joint torques were simply rotated by the mean rotation angle of the relevant muscles across all moment arm models (20.7°).
Although Thoroughman and Shadmehr (1999) only examined feedforward muscle activations, here we examined the effect of applying such rotations to feedback responses as well. For simplicity, we approximated the feedback pathway as a lumped response with linear gains on position and velocity error (linear stiffness and viscosity) with a single delay. This feedback response (and all other stiffness-viscosity responses described in this study) has the general form $F_{FB} = K \cdot x_{err}(t - t_d) + B \cdot v_{err}(t - t_d)$, where $F_{FB}$ is the feedback force, $K$ is stiffness, $B$ is viscosity, $x_{err}$ and $v_{err}$ are the differences between desired and actual position and velocity, respectively, and $t_d$ is a time delay. We found that a time delay of 50 ms combined with reasonable estimates of stiffness and viscosity (Mussa-Ivaldi et al., 1985) was able to closely reproduce null-field perturbation responses with velocity profiles similar to those in our data (compare Figs. 4C, 6C, dotted lines). In an absolute coordinate frame (elbow and shoulder angles taken relative to the external workspace), the joint stiffness and viscosity we used were as follows:
We first computed for this model a baseline feedback response torque to the assistive and resistive pulses (movement was given in a simulated error clamp in which lateral position and velocity were held to be zero). We gave the model the above desired trajectory, then, as in our data, introduced an unanticipated bell-shaped force pulse of 100 ms duration (minimum-jerk fourth-order polynomial). We then compared this feedback response (without rotation) to an equivalent response with rotations of the preferred directions of the muscles applied to assess the learning-induced changes in pulse responses predicted by this model (see Fig. 6C).
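The simulation inputs described above can be sketched as follows. This is a simplified illustration under stated assumptions: the gains are scalars rather than the joint-space matrices used in the model, the pulse amplitude and onset time are hypothetical parameters, and all function names are ours.

```python
def duration_for_peak_speed(distance, peak_speed):
    """Movement time of a minimum-jerk reach with the given peak speed.
    The peak of a minimum-jerk velocity profile is 1.875 * distance / duration."""
    return 1.875 * distance / peak_speed


def min_jerk_position(t, distance, duration):
    """Minimum-jerk position (m) at time t (s), clamped outside [0, duration]."""
    tau = min(max(t / duration, 0.0), 1.0)
    return distance * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)


def min_jerk_velocity(t, distance, duration):
    """Minimum-jerk velocity (m/s): a bell-shaped fourth-order polynomial."""
    if t <= 0.0 or t >= duration:
        return 0.0
    tau = t / duration
    return (distance / duration) * 30 * tau**2 * (1 - tau)**2


def bell_pulse(t, onset, amplitude, width=0.1):
    """Bell-shaped force pulse with the same fourth-order shape, peaking at
    `amplitude` (hypothetical value, N) halfway through its 100 ms width."""
    tau = (t - onset) / width
    if tau <= 0.0 or tau >= 1.0:
        return 0.0
    return amplitude * 16 * tau**2 * (1 - tau)**2


def delayed_feedback_force(x_err, v_err, i, dt, stiffness, viscosity,
                           delay=0.05):
    """F_FB = K * x_err(t - t_d) + B * v_err(t - t_d) on sampled error traces.
    Returns 0 until one full delay has elapsed."""
    lag = int(round(delay / dt))
    if i < lag:
        return 0.0
    return stiffness * x_err[i - lag] + viscosity * v_err[i - lag]


# Desired trajectory used for the model: 10 cm reach, 0.27 m/s peak speed.
T = duration_for_peak_speed(0.10, 0.27)
```

Under these assumptions the desired movement lasts about 0.69 s; positive and negative values of `amplitude` would correspond to assistive and resistive pulses along the movement direction.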
In the second model, the feedforward pathway, consisting of desired trajectory generation and an inverse model computation of motor output (joint torque), is generally the same as that above, except that the inverse model learns the exact dynamics of the force field, rather than a rotational approximation. However, the feedback control mechanism is entirely different. In addition to a lumped, low-latency (Δt = 30 ms) approximation of spinal feedback and intrinsic mechanical stiffness and viscosity, this model includes a long-latency state-predicting forward-model-based feedback controller (see Fig. 6B). This forward model takes as input delayed sensory information (Δt = 120 ms) as well as efference copy of motor output, and predicts the future value of kinematic state variables (e.g., position and velocity). This prediction is then compared with the desired trajectory to estimate future error, and the controller issues preemptive corrective accelerations to reduce error in real time. In this model, corrective accelerations were derived from linear gains on predicted position and velocity error: Short-latency feedback gains were chosen to be
The short-latency feedback gains were set to the values used in the model of Bhushan and Shadmehr (1999), which were obtained from Gielen and Houk (1987). This combination of parameters was again found to produce null-field pulse responses with velocity profiles closely in agreement with those in our data (compare Figs. 4C, 6D, dotted lines).
Moreover, because the corrections are supraspinally mediated, the feedback controller has access to changes in the internal model of limb dynamics. Forward-model-issued corrective accelerations, then, can be converted to motor output via the inverse model of limb dynamics. Also, the prediction of the forward model of future state is available to the inverse model. This is an essential feature of the model: in general, the inverse model will be highly dependent on state. If there are no unanticipated changes or errors during movement, prediction is unnecessary. If errors occur, however, the torques that the inverse model produces at a given time during movement will, in general, be significantly more appropriate if updated real-time state predictions are available than if it used a desired trajectory alone. In our task in particular, because the late-learning inverse model ideally contains an exact model of the force field, we expect it to be able to issue torques appropriate for the presence of the force field, even for unplanned changes in trajectory, by using the forward model prediction.
As we did for the muscle and torque rotation models, we first tested the forward-model-based controller by examining the null-field response to unanticipated assistive and resistive force pulses given in a simulated lateral error clamp. We then compared these baseline responses with the response of the model to unanticipated force pulses after learning a perfect model of the force field (again in a simulated error clamp).
We also made a few changes to the model itself as presented by Bhushan and Shadmehr (1999) (for details, see supplemental text and supplemental Fig. 1, available at www.jneurosci.org as supplemental material). To address the instability they cited, we gave the forward model the ability to predict the effect of spinal feedback on the future position of the arm. It seems reasonable for a predictor with knowledge of the dynamics of the arm to include in those dynamics the effect of spinal feedback. We also gave the forward model full knowledge of all parts of the task that in late learning we would expect it to have, namely, prediction of the effect of the force field as well as the error clamp, once it is clear that the pulse is active. Note that these final changes are not essential (for a detailed exposition of their effects, see supplemental material, available at www.jneurosci.org).
Results
First, we show that the paradigm itself produced the experimental conditions that we desired. In particular, assistive and resistive pulses produced relatively large deviations in position and velocity trajectories along the direction of movement compared with unperturbed trajectories, whereas the force channel successfully clamped lateral position and velocity very close to zero as shown in Figure 2. Results shown are averaged across all 11 subjects, with the shaded regions representing 95% confidence intervals about the mean.
Movement kinematics associated with force pulse perturbations
We found that the assistively and resistively directed force pulses produced motion perturbations that were nearly equal in magnitude as shown in Figure 2. The maximum difference between perturbed and unperturbed longitudinal velocity profiles averaged 19.5 cm/s for assisted trials and −19.2 cm/s for resisted trials, whereas in position the maximum differences averaged 2.61 cm for assisted trials and −2.55 cm for resisted trials. Across all error-clamp trials, the average absolute maximum lateral displacements were <0.7 mm both in the baseline case as well as after learning the force field (Fig. 2C), so that the lateral components of perturbed trajectories were near zero and nearly identical to one another before and after learning. Thus, lateral errors during pulsed trials were essentially unchanged between the baseline and late-learning epochs, with the difference averaging ∼0.7 mm for both assistively and resistively pulsed trials. Because of how small the lateral errors were on error-clamped trials, both in the baseline and force-field conditions, it is unlikely that our perturbations provided subjects with lateral errors from which to learn, implying that any learning-related effects seen in our data are attributable to force-field learning on unpulsed trials. Even if the submillimeter difference in lateral error seen on pulsed trials between force-field and baseline conditions were somehow large enough to learn from, the direction of the lateral error was opposite that which would be necessary to train task-specific feedback responses.
Note that in resistively pulsed trials, longitudinal velocity is reduced immediately after pulse onset, and remains below baseline levels for an additional 150 ms after pulse offset because of the momentum change conferred by the pulse. At 150 ms after pulse offset, the velocity becomes greater than baseline to compensate the effect of the pulse. The point at which the pulsed velocity profile crosses baseline is actually the point at which the pulse has caused the greatest displacement relative to baseline, and so to correct this displacement in a resistively pulsed trial, the feedback response must provide additional longitudinal velocity so that the movement catches up to the baseline trajectory. After each pulse, we see a general pattern where velocity is first reduced by the pulse, and then increased by the feedback response compared with baseline. This pattern is, of course, the opposite for assistively pulsed trials, but in both cases the turn-around point occurs at about 150 ms after pulse offset (250 ms after pulse onset).
Interestingly, inspection of the longitudinal position and velocity profiles in error-clamp movements shown in Figure 2, A and B, reveals that the motion in the direction of the target is essentially unaltered by exposure to the force-field environment (Fig. 2A,B, compare solid and dashed lines). This is equally true in unpulsed trials and both types of pulsed trials. This finding indicates that any learning-related changes in motor output induced by the force field are confined to the lateral direction, without spillover to longitudinal motion. Changes in lateral pulse response in the late-learning condition compared with baseline could, in general, be related to force-field adaptation on unpulsed trials, or related to experiencing a different pattern of errors on pulse trials. However, our implementation of the error clamp ensures that lateral errors on pulse trials are essentially zero both before and after adaptation (Fig. 2C,D), and the finding that longitudinal motion profiles in our task are unaffected by force-field learning suggests that pulse-induced longitudinal errors are also essentially unchanged. Therefore, any systematic changes in the pulse response between the baseline and late-learning periods can be attributed to motor adaptation induced by the force-field environment rather than to differences in pulse-induced motor errors driving the pulse response.
Force-field learning
Consistent with previous work (Shadmehr and Mussa-Ivaldi, 1994; Smith and Shadmehr, 2005; Smith et al., 2006), subjects learned the force field over the course of the second phase of the experiment, unhindered by the force pulses randomly applied on one of 10 trials interspersed among the normal unpulsed force-field trials. When the force field is initially applied, trajectories are far off course, and lateral force profiles are not at all reflective of what is required to balance the curl force field in real time (Fig. 3A). However, with practice, trajectories become straighter. Figure 3B shows lateral displacements averaged across all subjects 350 ms after completing 2 cm of the movement as a function of trial. In the late-learning period, displacements are significantly less than when the force field is initially applied ($p < 10^{-6}$), and lateral force profiles anticipate the dynamics of the force field. The adaptation coefficient, computed as the coefficient of linear regression between actual and ideal force profiles, is shown in Figure 3C. The learning curve for this adaptation coefficient averaged across all 11 subjects increases significantly from baseline levels ($p < 10^{-6}$), to a value >0.9 (90% learning) late in the training period. Shaded regions indicate 95% confidence intervals about the mean.
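The adaptation coefficient can be illustrated with a short sketch. We assume here that it is the slope of an ordinary least-squares regression (with intercept) of the measured lateral force profile onto the ideal one; the exact regression settings are not specified in the text, and the example force profile is hypothetical.

```python
def adaptation_coefficient(actual, ideal):
    """Slope of the least-squares regression of actual force onto ideal force.

    A value of 1.0 means the subject produced the full ideal force profile;
    0.0 means no force-field-specific compensation.
    """
    n = len(ideal)
    mean_i = sum(ideal) / n
    mean_a = sum(actual) / n
    cov = sum((i - mean_i) * (a - mean_a) for i, a in zip(ideal, actual))
    var = sum((i - mean_i) ** 2 for i in ideal)
    return cov / var


# Hypothetical ideal lateral force profile (N) and a response at 90% of it.
ideal_force = [0.0, 1.5, 3.0, 4.5, 3.0, 1.5, 0.0]
actual_force = [0.9 * f for f in ideal_force]
coefficient = adaptation_coefficient(actual_force, ideal_force)
```

Because the regression includes an intercept, a constant offset in the measured force does not change the coefficient; only the component shaped like the ideal profile is counted.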
Changes in feedback control associated with force-field learning
Figure 4B shows baseline-subtracted, late-learning lateral force profiles averaged across all 11 subjects, with shaded regions indicating 95% confidence intervals. Consistent with previous work with such curl force fields (Smith et al., 2006), our data show that baseline-subtracted lateral force profiles late in learning on unpulsed trials simply reflect the force profile necessary to counteract the force field over the duration of the movement. These forces, shown in orange, were learned directly from lateral errors experienced on force-field trials during the training period shown in Figure 3B.
On movements late in learning for which subjects were randomly given assistive or resistive force pulses, however, we see that the force profile is quite different. In particular, lateral force profiles on these trials, shown in Figure 4B in red and purple, respectively, begin identically to their unperturbed counterpart shown in orange, but ∼200 ms after the onset of the force pulse, begin to deviate strongly from the unperturbed profile. More than simply differing in magnitude from unperturbed trials, however, the apparent deviations late in the movements on perturbed trials are task-specific, reflecting learned dynamics of the force field. In particular, these deviations from the unperturbed force profiles closely match the force profiles that would ideally compensate the effects of perturbation-induced changes in velocity on the newly learned force-field dynamics. If the learned dynamics of the force field automatically transferred to online error feedback responses, an ideal perturbation response would be one in which the subject's force profile appropriately balanced the forces that the force field would produce, were it active: $F_{ideal}(t) = -B_{FF} \times V_{actual}(t)$, where $F_{ideal}$ is the ideal lateral force, $B_{FF}$ is the viscous curl force field, and $V_{actual}$ is the actual velocity. We can see how close subjects' perturbation responses are to ideal by plotting the baseline-subtracted perturbation response alongside the ideal force profile. This is shown in Figure 4C, which illustrates that learning the force field has little effect on lateral perturbation responses (the change is nearly identically zero) until ∼200 ms after perturbation onset, when the motor loop delay is overcome. At this point, force profiles in both assistively perturbed and resistively perturbed trials begin to closely track their ideal counterparts.
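The ideal response defined above reduces to a one-line computation once the field is treated as a scalar gain between longitudinal velocity and lateral force. The sketch below makes that simplifying assumption, uses the 15 N/(m/s) field magnitude given later in this section, and adopts the sign convention of the expression above; the velocity values are hypothetical.

```python
B_FF = 15.0  # N/(m/s), force-field magnitude reported in this study


def ideal_lateral_force(velocity, b_ff=B_FF):
    """Lateral force (N) that would balance the field at each velocity sample."""
    return [-b_ff * v for v in velocity]


def ideal_pulse_response(v_pulsed, v_unpulsed, b_ff=B_FF):
    """Ideal change in lateral force caused by a pulse: the field-balancing
    force for the pulsed velocity minus that for the unpulsed velocity."""
    return [-b_ff * (vp - vu) for vp, vu in zip(v_pulsed, v_unpulsed)]


# Hypothetical samples: a resistive pulse slows the hand from 0.25 to 0.10 m/s.
response = ideal_pulse_response([0.30, 0.10], [0.30, 0.25])
```

Comparing the measured, baseline-subtracted deviation between pulsed and unpulsed trials with this quantity is what distinguishes a task-specific response from a generic stiffness-like one.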
It is clear that, late in movements, on assistive and resistive perturbation trials, lateral force profiles differ substantially from their unperturbed counterparts (Fig. 4D,E). In particular, measured at 400 ms (i.e., 300 ms after perturbation offset), lateral force magnitudes during resisted and assisted trials are greater and less than, respectively, that of an unperturbed trial, both with $p < 10^{-6}$. If we instead examine lateral force magnitude averaged over the duration of the feedback response at 200–600 ms (i.e., 100–500 ms after perturbation offset), we again find that the mean lateral force on resisted trials is greater, and that of assisted trials is less, than the unperturbed condition, also with $p < 10^{-6}$. Note that the shaded regions and error bars throughout Figure 4 represent 95% confidence intervals around the mean.
Real-time state prediction
The result that force profiles in both assistively perturbed and resistively perturbed trials become task specific 150 ms after perturbation offset demonstrates that during the training of simple unpulsed movements in a force field, the motor system automatically updates its rules for online error correction so that responses to force pulses are, after a time delay, fully appropriate for the presence of the force field, although the force pulse and force field were never experienced together. Furthermore, as this particular force field is velocity dependent, an appropriate perturbation response requires that the motor system have accurate information about its velocity after perturbation. In the presence of motor loop delays, the motor command for producing the appropriate lateral force for a given velocity must be issued before sensory information about that velocity is available. Therefore, our results suggest that the motor system has access to a predictive model of limb dynamics that can provide accurate online state prediction for velocity. Such a forward model, which could integrate delayed sensory information with efference copy of motor output to accurately predict future velocity, would be particularly useful to the motor system if the arm is perturbed off course. This can be seen directly in the single-trial data in Figure 5 where, plotted with subjects' actual longitudinal velocity profiles, we show subjects' velocity predictions, namely their lateral force profiles scaled down by the force-field magnitude, $V_{pred} = F_{lateral} / (15\ \mathrm{N/(m/s)})$. In general, the accuracy of this state prediction is quite remarkable. When all pulsed trials from all subjects are considered, lateral force profiles can predict >80% of the variance in longitudinal velocity. Figure 5C shows the $R^2$ value for a linear regression of the lateral forces subjects produced onto the corresponding longitudinal velocities in all late-learning pulse trials.
We performed this regression separately at each time point during the movement to examine how the precision of real-time state prediction during pulsed movements evolves in time. We find that state prediction accuracy initially climbs after movement onset but then falls to zero when the force pulse is applied because this pulse cannot be predicted. Then, ∼150 ms after pulse offset (250 ms after onset), the prediction becomes increasingly accurate. At ∼400 ms after pulse onset, lateral force linearly predicts 82% of the variance in longitudinal velocity. The accuracy of this prediction is maintained fairly well until movement termination (550–600 ms after pulse onset). Note that even when assistive and resistive trials are considered separately, force profiles still show significant ability to predict variability in longitudinal velocity in real time (R^2 = 0.22 and 0.34, assistive and resistive, respectively; p < 10^−6 in both cases).
We also considered whether this state prediction is truly in real time, or whether some nonzero lag between lateral force and longitudinal velocity produced more accurate predictions. In Figure 5D, we performed the same regression analysis as in Figure 5C, but over a range of leads and lags (±200 ms). Because a range of lags was examined at each time point in movement, this analysis produces a prediction surface characterized by the cross-correlation between longitudinal velocity and lateral force over all trials at each time point. Note that the data in Figure 5C represent a vertical slice through this surface at zero lag. Also note that the optimal lag at each time point occurs at the maximum of the corresponding horizontal slice through this surface. In Figure 5D, these points are marked in black, with 95% and 99% confidence intervals shaded in dark gray and light gray, respectively. Although this analysis is intrinsically a type of cross-correlation, for consistency with Figure 5C, we report R^2 rather than R.
The main result here is that in the time period when state prediction (velocity prediction) is best (350–450 ms after perturbation onset), the optimal time lag is <30 ms. The overall best prediction occurs at 300 ms after pulse offset (400 ms after onset) at a lag of just 5 ms (as indicated by the asterisk in Fig. 5D). For time points later than 450 ms after perturbation onset, the reliability of the prediction decreases, but the optimal time lag remains near zero and is not significantly different from zero, suggesting continued real-time state prediction. But note that the contours are relatively flat in the time period around zero lag, indicating that in this region prediction accuracy is not very sensitive to lag, and thus accurate estimation of prediction lags is not possible (as would be expected from the fact that both force profiles and the velocity profiles are relatively smooth and do not tend to change much over 50–80 ms). Earlier in movement, accuracy decreases and the optimal lag is no longer in real time; although even at 300 ms after pulse onset, when the optimal lag is −70 ms (force leads velocity by 70 ms), the difference in prediction accuracy between this lag and real time is only ∼10%.
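The lagged regression analysis described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the analysis code used in the study: trials are stored as lists of equally spaced samples, lags are counted in samples, and a negative lag means that force leads velocity (the sign convention used in the text). All function names and the synthetic data generator are hypothetical.

```python
import random

FIELD_GAIN = 15.0  # N/(m/s), force-field magnitude stated in the text


def r_squared(x, y):
    """R^2 of a simple linear regression of y on x (squared correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def lagged_r2(force, velocity, t, lag):
    """Across-trial R^2 between force at sample t and velocity at sample
    t - lag. Returns None when the shifted sample is out of range."""
    s = t - lag
    if s < 0 or s >= len(velocity[0]):
        return None
    return r_squared([trial[t] for trial in force],
                     [trial[s] for trial in velocity])


def optimal_lag(force, velocity, t, max_lag):
    """(lag, R^2) maximizing R^2 at time t; lag 0 is the real-time slice."""
    best = None
    for lag in range(-max_lag, max_lag + 1):
        r2 = lagged_r2(force, velocity, t, lag)
        if r2 is not None and (best is None or r2 > best[1]):
            best = (lag, r2)
    return best


def synthetic_trials(n_trials, n_samples, lead, seed=0):
    """Toy data in which force equals FIELD_GAIN times the velocity that
    occurs `lead` samples later (so force leads velocity by `lead`)."""
    rng = random.Random(seed)
    velocity = [[rng.gauss(0.3, 0.1) for _ in range(n_samples)]
                for _ in range(n_trials)]
    force = [[FIELD_GAIN * trial[min(s + lead, n_samples - 1)]
              for s in range(n_samples)] for trial in velocity]
    return force, velocity
```

Evaluating `lagged_r2` with `lag=0` at every time sample gives the zero-lag slice that corresponds to Figure 5C, and sweeping both `t` and `lag` gives a surface of the kind shown in Figure 5D. On real data with smooth force and velocity profiles, the maximum would be much flatter than in the toy data generated here, which is why the text cautions that the optimal lag cannot be estimated precisely.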
Model comparisons
The main observation in our data was that feedforward learning of the force-field environment produced consistent changes in feedback responses. Because we carefully controlled motion errors when probing feedback responses, we note that any feedback controller that is not adaptive (in the sense that the relationship between error and response is unaltered by learning) will show no such learning-related changes in pulse response. In other words, it would predict a flat line in Figure 4C. With this in mind, we implemented two classes of adaptive feedback control models. Because our results suggested that a state-predicting forward model of dynamics might be necessary to produce lateral force profiles appropriate for the altered longitudinal velocity trajectories in the postpulse period, we wanted to test (1) whether an adaptive but nonpredictive feedback control policy might be able to explain our results as well and (2) whether a state-predicting forward model of dynamics really could produce the task-appropriate responses observed in our data.
We first explored whether simple adjustments to short-latency feedback responses could account for our results. If changes in short-latency segmental responses were indeed able to account for the learning-related changes in pulse response observed in our data, this would imply that the stiffness (K) and viscosity (B) matrices that characterize these responses would have changed. We note that because the longitudinal component of the pulse response is essentially unaltered during learning, it is unlikely that the overall scale of these matrices substantially changed. We therefore explored the possibility that these matrices changed to rotate (or redirect) feedback responses compared with the baseline state. Specifically, we tested the effects of a rotation of baseline feedback torque and muscle activation in response to force pulses, where the amount and direction of rotation were appropriate for compensating the effect of the force field on unpulsed trials. Thoroughman and Shadmehr (1999) previously suggested a model for feedforward learning in which the motor system can approximate the dynamics of the force field by rotating the preferred direction for the muscles involved in actuating each movement. They found that a simple rotation of ∼20° explained feedforward changes in muscle activation in a force-field adaptation task. Here, we tested whether such rotations of the preferred direction for feedback responses could generally explain our results and produce task-appropriate feedback responses to unpredictable perturbations. In other words, could changes in muscular preferred direction that produce task-appropriate changes in feedforward control also produce task-appropriate changes in feedback responses?
The set of muscle and torque rotation models we studied is diagrammed in Figure 6A (for a complete description, see the Materials and Methods). Figure 6C shows the simulation results for the muscle and torque rotation models. The simulation results shown in this panel are analogous to those shown in Figure 4C for our data. The dotted lines again represent the ideal pattern of forces, as calculated by scaling the pulse-induced change in longitudinal velocity profile by the magnitude of the force field. The solid lines represent the change in lateral pulse response between the null-field condition and after learning the force field. The results show that stiffness-viscosity-based feedback in conjunction with a simple rotation of either muscles' preferred directions or net torque produces consistent learning-related changes in perturbation responses. However, the learning-related changes in these responses do not closely correspond to those observed in our data and are not task appropriate.
Rather, these learning-related changes are substantially opposite in direction from both the ideal learning-related responses and the responses we observed experimentally. An explanation lies in the properties of the feedback control mechanism: the torque and muscle rotations induce lateral (rotated) forces that approximately compensate anticipated force-field dynamics. Note that this rotation only approximates compensation for force-field dynamics because these dynamics are purely based on velocity, whereas torques and muscle activations are based on the more complex physical dynamics of the arm that are dependent on acceleration and position in addition to velocity. On an unperturbed trial, the direction of lateral force production approximately corresponds to the direction of movement and speed, and these muscle and torque rotation models can perform fairly well. However, if a large perturbation is applied, the relationship between applied force and movement velocity changes radically. In particular, when correcting for a resistive perturbation, subjects must pull their arm forward. Applying a rotation to this forward corrective force will produce more lateral force (appropriate for a faster movement), when in fact this resistive perturbation has slowed the hand speed, and a task-appropriate response should therefore reduce lateral force. In this case, forces produced by feedback rotation will be opposite the direction of the movement speed error induced by the pulse; rather than compensating the pulse-induced change, this lateral force pattern will be nearly opposite to it, as shown in Figure 6C. Thus, although these muscle and torque rotation models can grossly approximate anticipated force-field dynamics on unperturbed trials, they cannot account for the learning-induced changes in perturbation responses that our data show. Note that feedback corrections can be conceived of as corrective submovements that add to the feedforward motor output. The simulation results for this feedback rotation model suggest that applying corrective submovements which compensate the feedforward dynamics of the force field would also produce responses with learning-related components that are substantially opposite those seen in our data.
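The sign argument above can be made concrete with a deliberately simplified sketch. It works in Cartesian hand space rather than in the joint-torque or muscle space of the simulated models, takes the positive lateral direction to be the one that compensates the force field, and uses hypothetical values for the corrective force, the velocity error, and the rotation angle (20° is borrowed from the feedforward rotation mentioned earlier purely for illustration).

```python
import math

FIELD_GAIN = 15.0  # N/(m/s), force-field magnitude stated in the text


def rotated_lateral_change(delta_longitudinal_force, angle_deg):
    """Lateral force change (N) produced when a longitudinal corrective
    force is rotated by angle_deg toward the compensatory direction."""
    return delta_longitudinal_force * math.sin(math.radians(angle_deg))


def ideal_lateral_change(delta_longitudinal_velocity):
    """Task-appropriate lateral force change (N) for a pulse-induced change
    in longitudinal velocity (m/s) in the velocity-dependent field."""
    return FIELD_GAIN * delta_longitudinal_velocity


# Hypothetical resistive pulse: the hand is slowed by 0.1 m/s and the
# feedback response pulls forward with 3 N to correct the error.
rotation_response = rotated_lateral_change(3.0, 20.0)  # positive: more lateral force
ideal_response = ideal_lateral_change(-0.1)            # negative: less lateral force
```

In this toy case the rotated feedback response and the ideal response have opposite signs, which is the qualitative mismatch the rotation models show in Figure 6C.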
We next studied the pulse responses of a state-predicting forward model of dynamics. Figure 6D shows the results from this forward-model-based controller. As in the analysis of our data, ideal curves were determined from pulse-induced changes in longitudinal velocity, whereas lateral forces were determined from learning-induced changes in the pulse responses of the model. For this model, we see behavior closely approximating our results (Fig. 4C). In particular, there are no consistent learning-induced changes in pulse response until 180 ms after the onset of the force pulse. At this point, the forward model begins to receive delayed sensory information about the force pulse, and begins to accurately predict the pulsed velocity profile. Importantly, these predictions are then made available to the inverse model, so that subsequently produced lateral forces become appropriate for the perturbed velocity trajectory, and thus track the ideal force profile. Thus, in the output of this model, as in our data, the controller produces additional lateral force (because of the additional predicted longitudinal velocity) late in resisted trials, and produces less (because of less longitudinal velocity) on assisted trials.
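The behavior described for the forward-model-based controller, in which lateral force becomes appropriate only once delayed sensory information about the pulse arrives, can be illustrated with a toy discrete-time sketch. It is not the simulation described in the Materials and Methods: the update rule, the delay expressed in samples, and all names are assumptions chosen for brevity.

```python
FIELD_GAIN = 15.0  # N/(m/s), force-field magnitude stated in the text


def predict_velocity(planned, observed, delay):
    """Estimate the current velocity from a measurement `delay` samples old.

    The stale measurement is advanced by the velocity change that the
    planned (efference-copy) trajectory expects over the delay interval.
    """
    estimate = []
    for t in range(len(planned)):
        past = t - delay
        if past < 0:
            estimate.append(planned[t])  # no feedback yet: rely on the plan
        else:
            estimate.append(observed[past] + (planned[t] - planned[past]))
    return estimate


def lateral_force(velocity_estimate):
    """Lateral force command (N) appropriate for the estimated velocity."""
    return [FIELD_GAIN * v for v in velocity_estimate]


# Hypothetical trial: an unanticipated pulse slows the hand by 0.1 m/s from
# sample 3 onward, and sensory feedback arrives 2 samples late.
planned = [0.1 * k for k in range(8)]
actual = [v - 0.1 if k >= 3 else v for k, v in enumerate(planned)]
estimate = predict_velocity(planned, actual, delay=2)
command = lateral_force(estimate)
```

With these toy numbers the estimate still follows the plan for the first two samples after the pulse and matches the perturbed velocity from sample 5 onward, so the lateral force command tracks the ideal profile only after the feedback delay has elapsed.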
It is important to note that these models are not meant to suggest a precise form for the motor controller, but merely to help argue that any putative feedback controller should contain a predictive forward model, and be altered appropriately by feedforward learning. The muscle and torque rotation models clearly cannot represent all possible models of adaptive but nonpredictive feedback control; however, our results with these models indicate that adaptivity alone is insufficient to produce task-appropriate responses to unanticipated errors. Because feedback control of any kind is inherently delayed, responses must incorporate some type of prediction if they are to account for limb state or other features of the task dynamics at the time these responses actually take effect, as is seen in our data. Similarly, the forward-model simulations do not indicate that any predictive model automatically produces task-appropriate responses, but instead show that this class of adaptive, predictive models can generate task-appropriate responses that reproduce the main features of our data.
Discussion
In this study, we examined the way in which motor adaptation affects online responses to untrained and unanticipated perturbations in humans. We aimed to answer the question: how do newly learned changes in the motor system's internal models for feedforward control of novel dynamics affect feedback control responses? To accomplish this, we studied the feedback responses to unanticipated motor errors when these errors had never themselves been trained in the newly learned force-field dynamics.
In doing so, we were careful to ensure that probing the online feedback response to force pulses did not itself influence the force-field learning in which we were interested. The key feature of our paradigm was the clamping of lateral errors so that error feedback on pulsed movements was constrained to be fully orthogonal to the type of error necessary to bring about force-field learning. As a result, we were able to measure learning-related changes in lateral force production without providing a useful training (or untraining) signal for this force production. Thus, we effectively measured perturbation-specific responses without training perturbation-specific responses. We ensured that force-field-induced motor adaptation generated laterally directed errors and learned responses, whereas the force-pulse perturbations we applied produced errors localized to the direction of movement. This localization was achieved by applying only assistive and resistively directed force-pulse perturbations and refined by using a lateral error clamp and by carefully choosing movement and perturbation directions to take advantage of favorable eigendirection arm dynamics.
It could be argued that the force channel itself constitutes a type of perturbation. However, close examination of our data shows that the error clamp has no significant effect on longitudinal velocity, either in the baseline condition or late in learning the force field (supplemental material, available at www.jneurosci.org). Furthermore, a key feature of our analysis was that we examined the difference between lateral responses before and after learning, both of which were recorded in error-clamp trials. Therefore, unless there was a specific interaction between a putative error-clamp perturbation and the force-field learning, any effect of such a putative error-clamp perturbation would be removed by baseline adjustment (Fig. 4B). Moreover, examination of learning-related changes in lateral force profiles in all types of error-clamped trials that we studied shows that these responses appear to be fully accounted for by a match to the ideal force-field compensation, suggesting little or no interaction between learning and the error clamp.
Two important technical features of our task were that we could directly measure lateral force patterns on error-clamp trials and that the velocity dependence of the viscous curl force field meant that we had a very precise notion of what an ideal lateral compensation for the perturbation-induced longitudinal velocity errors should look like, facilitating analysis of the experimentally observed lateral force profiles. Our paradigm simultaneously allowed for learning of a novel motor adaptation task and examination of its effect on online error feedback, without the latter affecting the former.
Of a number of proposed explanations of the relationship of feedback control to task learning and the formation of internal models, we found one general class of controllers that can explain our data, namely task-specific feedback controllers. In particular, to produce the output seen in our experiment, we suggest that the feedback controller must (1) contain a forward model capable of accurate real-time state prediction for velocity and (2) combine the state predictions of that forward model with recently learned changes in internal models of limb dynamics. These characteristics allow, after a time delay, for lateral force profiles in perturbation responses to be nearly perfect in their task specificity after learning.
Controllers that do not meet these criteria do not appear to be capable of reproducing our data. A model that relies only on standard stiffness-viscosity error feedback is insufficient: this controller would predict no change between the perturbation responses in our baseline and late-learning conditions (Fig. 4C, flat line). Furthermore, if we add to this some sort of learning dependence without the incorporation of real-time state prediction, the controller remains unable to explain our data. To illustrate that an adaptive feedback controller lacking state prediction is unable to account for our data, we implemented controllers which rely on standard stiffness-viscosity feedback in conjunction with force-field-specific rotation of baseline (1) preferred muscle activation direction and (2) feedback torque. We found that these controllers produced responses inconsistent with our data.
Implementing a controller that relies on forward-model state prediction in conjunction with learned changes in internal model dynamics, we demonstrated that such a controller can, in principle, reproduce the main features of our data. The controller, after a time delay, produces a nearly ideal force profile, namely the opposite of what the force field would be expected to produce were the field active during the perturbed movement.
Previous work has suggested that such a state predictor is probably available to the motor system. Ariff et al. (2002) showed that when instructed to track their unseen hands during reaching movements, subjects made saccades that, even after unanticipated perturbations to the arm, accurately predicted the future position of the hand, typically ahead by 150–200 ms. However, the small number of saccades over the course of a reaching movement combined with their discrete nature makes it unclear whether the nervous system can continuously provide real-time state prediction. Our data show that state prediction can be continuous and in real time.
Several studies have found that responses to unexpected perturbations during movement do not always directly reflect the joint kinematics associated with the perturbation (Latash, 2000; Hasan, 2005). Rather, in some cases these feedback corrections appear to generally respond to endpoint errors somewhat independently of the associated joint kinematics in a way that can mimic voluntary movement. This can perhaps be seen most clearly when joint kinetics are redundant so that endpoint errors caused by perturbation at one joint can be compensated by corrective responses at other joints (Latash, 2000).
Others have also suggested previously that a component of feedback control may reflect an internal model of dynamics. Research into human reflex responses to perturbations of the arm during postural conditions has suggested that short- and long-latency components of these responses may respond to different errors and state variables (Lacquaniti and Soechting, 1984; Soechting and Lacquaniti, 1988; Kurtzer et al., 2008). EMG data from these studies show that whereas short-latency spinal reflexes respond solely to muscle stretch, long-latency reflexes may respond to net changes in torque rather than changes in joint angle alone, as would be predicted by a simple stretch response. Because of intrinsic mechanical coupling between the physical dynamics of connected joints such as the elbow and the shoulder, shoulder muscle responses lead to both elbow and shoulder motion and vice versa. Therefore, muscle responses related to joint torques can counteract the consequences of external perturbations more effectively than muscle responses related only to joint motion encoded by muscle stretch. These studies therefore suggest that long-latency reflexes may generate responses to simple perturbations of the arm that take into account its intrinsic physical dynamics. But are the properties of normal arm dynamics intrinsically built in to these long-latency responses in some sense? Or are they learned over a lifetime of experience? Or are these responses constructed so that they can automatically account for current internal models of the limb? Our results suggest the latter.
More recently, Wang et al. (2001) examined how motor output evolved when unanticipated force pulses were delivered during reaching movements between baseline and late force-field training conditions. They reported that this change in motor output reflected learning the force-field dynamics. However, in that study, only the raw motor output produced on pulsed trials, before and after adaptation, was compared. This is equivalent to looking at only the resistively perturbed (purple) trace in Figure 4B (which generally reflects force-field dynamics) without comparing it to the unperturbed (orange) trace. Therefore, overall feedforward changes in motor output associated with learning the force-field dynamics were not dissociated from perturbation-induced feedback changes in these dynamics, and feedback-specific changes in motor output brought about by learning were not examined in that study. Here, we were able to examine such feedback-specific changes and show that they precisely reflect newly learned dynamics.
Our results provide strong support for some properties of the motor system that would be generally required for the implementation of optimal feedback control (OFC), but may be at odds with a key prediction of OFC. OFC responses must incorporate the task goal, current limb state, and knowledge of system dynamics (Kuo, 1995; Todorov and Jordan, 2002; Körding and Wolpert, 2004, 2006; Scott, 2004). Although it is clear that both limb state and task goal can affect feedback responses, we are not aware of previous evidence that learned changes in system dynamics can alone alter feedback responses when limb state and task goal remain unchanged. Our data show that training that modifies knowledge of system dynamics can alter online error feedback responses even when other determinants of these responses are held constant. Furthermore, we show that feedback responses change in ways that are appropriate for the current task dynamics: after learning the force-field dynamics, assistive and resistive perturbation responses accurately reflect what a feedback controller with perfect knowledge of the force field should produce, given the changes in longitudinal velocity induced by these perturbations. Additionally, we show the ability of the feedback controller to incorporate accurate real-time prediction of limb state. Because OFC responses must be a function of limb state, the ability to accurately predict state is essential for implementation of an optimal feedback control policy.
However, it is important to note that learning-related changes in feedback responses we observed very closely reflected those that would be ideal to counteract the expected effect of newly learned dynamics only if subjects intended to follow the same trajectory as in the unpulsed condition (i.e., the ideal force profiles displayed in Fig. 4 correspond to the force profile required for maintenance of the baseline motion trajectory). Although we note that maintenance of a static desired trajectory before and after changes to the force-field environment nicely explains the learning-related changes in feedback responses that we observed, the idea of a static desired trajectory is inherently at odds with ideas about OFC that generally predict that optimal responses involve continuous replanning of future motion trajectories.
Our data provide clear evidence for a neural control system capable of monitoring and predicting in real time the state variables relevant to the learned dynamics of a task and combining these predictions with learned changes in internal models of limb dynamics.
Footnotes
- This work was supported by grants from the Wallace H. Coulter Foundation, the McKnight Endowment Fund for Neuroscience, and the Alfred P. Sloan Foundation. We thank Gary Sing and Joern Diedrichsen for helpful discussions.
- Correspondence should be addressed to Maurice A. Smith, Harvard School of Engineering and Applied Sciences, 29 Oxford Street, 325 Pierce Hall, Cambridge, MA 02138. mas{at}seas.harvard.edu