Abstract
Humans predict the sensory consequences of motor commands by learning internal models of the body and of environment perturbations. When facing a sensory prediction error, should we attribute this error to a change in our body, and update the body internal model, or to a change in the environment? In the latter case, should we update an existing perturbation model or create a new model? Here, we propose that a decision-making process compares the models' prediction errors, weighted by their precisions, to select and update either the body model or an existing perturbation model. When no model can predict a perturbation, a new perturbation model is created and selected. When a model is selected, both the prediction's mean estimate and uncertainty are updated to minimize future prediction errors and to increase the precision of the predictions. Results from computer simulations, which we verified in an arm visuomotor adaptation experiment with subjects of both sexes, account for short aftereffects and large savings after adaptation to large, but not small, perturbations. Results also clarify previous data in the absence of errors (error-clamp): motor memories show an initial lack of decay after a large perturbation, but gradual decay after a small perturbation. Finally, qualitative individual differences in adaptation were explained by subjects selecting and updating either the body model or a perturbation model. Our results suggest that motor adaptation belongs to a general class of learning according to which memories are created when no existing memories can predict sensory data accurately and precisely.
SIGNIFICANCE STATEMENT When movements are followed by unexpected outcomes, such as following the introduction of a visuomotor or a force field perturbation, or the sudden removal of such perturbations, it is unclear whether the CNS updates existing memories or creates new memories. Here, we propose a novel model of adaptation, and investigate, via computer simulations and behavioral experiments, how the amplitude and schedule of the perturbation, as well as the characteristics of the learner, lead to the selection and update of existing memories or the creation of new memories. Our results provide insights into a number of puzzling and contradictory motor adaptation data, as well as into qualitative individual differences in adaptation.
Introduction
It is now well accepted that the CNS predicts the consequences of motor commands by learning internal models of tools or external perturbations (Wolpert and Kawato, 1998; Krakauer et al., 1999; J. Y. Lee and Schweighofer, 2009; Kambara et al., 2011) and of the body (Cothros et al., 2006; Kording et al., 2007; Berniker and Kording, 2008; Kluzik et al., 2008). Internal models are updated to minimize sensory prediction errors, i.e., errors between sensory outcomes and predictions (Mazzoni and Krakauer, 2006; Taylor and Ivry, 2011; K. Lee et al., 2018). It is thought that if the prediction error is small for a given model, then this model will be selected to determine subsequent motor commands, and will be further updated if needed (Wolpert and Flanagan, 2001).
It is not well understood, however, how the CNS decides whether to update existing models or to create new internal models (Shadmehr and Mussa-Ivaldi, 2012). When movements are followed by unexpected errors, such as those caused by a visuomotor or a force field perturbation, should the CNS create an entirely new internal model, update an existing perturbation model, or update the model of the body?
Here, to address this question, we propose a new computational model of motor adaptation, along the lines of a recent model for visual memory (Gershman et al., 2014). The model extends and combines previous models of motor learning known as mixture of experts models (Jordan and Jacobs, 1990; Ghahramani and Wolpert, 1997; Wolpert and Kawato, 1998) and previous models of adaptation based on the Kalman filter (Korenberg and Ghahramani, 2002; Berniker and Kording, 2011). The model contains three types of internal models: (1) “expert” perturbation models that have been previously adapted (Kawato and Wolpert, 1998; J. Y. Lee and Schweighofer, 2009; Lonini et al., 2009), (2) “novice” models with nonspecific predictions and large uncertainties, and (3) a dedicated “baseline” or body model (Berniker and Kording, 2008, 2011). When facing an unexpected sensory outcome, a Bayesian decision-making process compares the models' prediction errors, weighted by their precisions, to determine whether to create a new expert model by updating a novice model, or to select and update the baseline model or an existing perturbation model. When a model is selected, both the prediction's mean estimate and uncertainty (the inverse of precision) are updated to minimize future prediction errors and to increase the precision of the predictions.
If a perturbation is small relative to the baseline model's precision, this model will be selected and updated. The learner will show long aftereffects during de-adaptation, little or no savings during re-adaptation, and gradual decay during trials in which the target errors are artificially clamped to zero (“error-clamp”). If a larger perturbation is similar to a perturbation encountered in the past, an existing expert model will be selected, leading to the recall of its protected memory. If not, a new memory will be created by selecting a novice model and updating it into an expert. The learner can then rapidly switch between the expert model and the body model, resulting in short aftereffects, large savings, and lack of decay for multiple trials during error-clamp. In this process, noise in the prediction errors can yield rapid switches in hand directions during error-clamp.
We tested these predictions in an arm visuomotor adaptation experiment in which we manipulated the perturbation amplitudes and levels of noise. We then inserted trigger trials (each a single perturbation trial) in the error-clamp block to probe for individual differences in adaptation: if the learners had previously formed a perturbation model, the trigger trial would suddenly reduce the prediction error for this model, resulting in its selection. These learners were predicted to have short washout and large savings in previous adaptation blocks. In contrast, if the learners had only updated their baseline model, the trigger trial would have minimal effect. These learners were predicted to have prolonged washouts and no savings in previous adaptation blocks.
Materials and Methods
Computational model
The overall architecture of the model is shown in Figure 1A. As in previous models of visuomotor adaptation (Izawa and Shadmehr, 2011), the motor command represents the hand movement direction. On trial t, the learner generates the motor command u_t to reach the target t_t. Here, we assume that the target is located at the angle 0, the forward direction, for simplicity but without loss of generality. Visual feedback of the hand h_t is determined differently in non-error-clamp trials (i.e., baseline, perturbation, and washout trials) and in error-clamp trials, in which feedback is independent of actual performance (Eq. 1). In Equation 1, p_t is the perturbation at time t, and n_t^p ∼ N(0, σ_p^2) and n_t^ec ∼ N(0, σ_ec^2) are noise sources added to the perturbation or to the error-clamp, respectively. The noise terms n_t^p and n_t^ec were introduced to represent trial-by-trial fluctuations in feedback and to manipulate the noise level in the environment.
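As an illustration of the feedback rule described above, the following minimal sketch assumes one plausible form of Equation 1: in non-error-clamp trials the cursor direction is the hand direction plus the noisy perturbation, and in error-clamp trials it is the target direction (0) plus clamp noise. The function name `visual_feedback` and its arguments are hypothetical and are not taken from the published code.

```python
import random


def visual_feedback(u, p, error_clamp, sd_p=0.0, sd_ec=0.0, rng=random):
    """Cursor direction h_t for one trial (assumed form of Eq. 1).

    u: hand direction (motor command), p: perturbation,
    sd_p / sd_ec: SDs of the perturbation and error-clamp noise.
    """
    if error_clamp:
        # Assumption: clamped feedback ignores the hand and shows the
        # target direction (0) plus error-clamp noise.
        return rng.gauss(0.0, sd_ec)
    # Assumption: otherwise the cursor is the hand direction shifted by
    # the perturbation plus perturbation noise.
    return u + p + rng.gauss(0.0, sd_p)


# A fully adapted hand (+20 deg) under a -20 deg rotation lands on target.
print(visual_feedback(20.0, -20.0, error_clamp=False))
# In error-clamp the same hand direction also yields feedback near 0.
print(visual_feedback(20.0, -20.0, error_clamp=True))
```

With the noise SDs set to zero, both calls return 0, which matches the later observation that the adapted hand direction and the clamped feedback differ by approximately the preceding perturbation.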
In its general form, the model contains N internal models, including one baseline model, each associated with a weight given by the model posterior probability. The estimate of the perturbation p̂_t is given by the prediction from the model with the largest weight, in a winner-take-all manner. Thus, the overall prediction at time t is given by Equation 2, where w_t^i is the weight of model i. To reach the target, we assume that subjects generate a motor command u_t that compensates for the estimated perturbation p̂_t (Eq. 3), where n_t^u ∼ N(0, σ_u^2) is a motor noise term. Receiving the efferent copy of the motor command, each internal forward model independently predicts the sensory feedback from its own perturbation estimate (Eq. 4). The sensory prediction error ε_t^i for each model is given by Equation 5. The weights (which correspond to the “responsibility signals” in previous models such as the MOSAIC models; Wolpert and Kawato, 1998; Haruno et al., 2001; Doya et al., 2002; Bertin et al., 2007) are given by the posterior probability of the models (Eq. 6), where ρ_i is a constant prior weight representing the prior belief that model i is true in the absence of feedback, and Ŝ_t^i is the uncertainty (variance) of model i around its mean estimate p̂_t^i. We note that the inverse of the uncertainty, called the precision, weighs the effect of the sensory prediction errors in the computation of the weights. Selection of a model i will therefore depend on (1) how well the model can reduce the prediction error ε_t^i, (2) how precise the model is, as given by the inverse of its uncertainty, and (3) its prior weight. Note that a crucial difference with the MOSAIC model is that the uncertainties Ŝ_t^i, and therefore the precisions, are time-varying quantities, as described below.
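The precision-weighted selection described in this paragraph can be sketched as follows. This is a minimal illustration, not the published implementation: it assumes that the likelihood of each model is a zero-mean Gaussian density of its prediction error with variance equal to the model uncertainty, multiplied by the prior and normalized. The priors (0.95 and 0.05) and initial uncertainties (0.04 and 4) are the values listed under Simulation parameters, in normalized units where 1 corresponds to 20°; the function names are hypothetical.

```python
import math


def gaussian_pdf(x, var):
    """Density of a zero-mean Gaussian with variance var, evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def model_weights(errors, uncertainties, priors):
    """Posterior weights (assumed form of Eq. 6): prior times likelihood, normalized."""
    unnorm = [rho * gaussian_pdf(e, s)
              for e, s, rho in zip(errors, uncertainties, priors)]
    total = sum(unnorm)
    return [w / total for w in unnorm]


def select_model(weights):
    """Winner-take-all selection: index of the model with the largest weight."""
    return max(range(len(weights)), key=weights.__getitem__)


# Index 0: baseline model, index 1: novice model.
PRIORS = [0.95, 0.05]
UNCERTAINTIES = [0.04, 4.0]

# First perturbation trial: both models still predict no perturbation,
# so both have a prediction error equal to the perturbation itself.
for perturbation in (-1.0, -0.5):  # -20 deg and -10 deg in normalized units
    w = model_weights([perturbation, perturbation], UNCERTAINTIES, PRIORS)
    print(perturbation, [round(x, 3) for x in w], select_model(w))
```

Under these assumptions, the novice model wins for the large perturbation despite its small prior, because the precise baseline model assigns a very small likelihood to a large error, whereas the baseline model wins for the small perturbation.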
The mean predictions and uncertainties of the different models are updated according to the standard Kalman filter equations (Bishop and Welch, 2001), as earlier proposed by others (Korenberg and Ghahramani, 2002; Berniker and Kording, 2008, 2011; Burge et al., 2008; Wei and Kording, 2010; van der Vliet et al., 2018). However, in the present model, the type of update for each model depends on whether it is selected or not (Eqs. 2 and 6). When a model i is selected, it is updated according to the measurement update equations of the Kalman filter (Eq. 7), where 0 < a_i ≤ 1 are decay rate parameters and σ_m^2 is a measurement noise variance parameter.
When a model j ≠ i is not selected, it is updated according to the time update equations of the Kalman filter (Eq. 8), where σ_s^2 is a state noise variance parameter.
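The two kinds of update can be sketched with textbook scalar Kalman filter equations. This is an assumption-laden illustration rather than the exact form of Equations 7 and 8: here the decay rate a is applied only in the time update, the measurement update is the standard correction step, and the uncertainty is floored at a minimum value. The function names and the `s_min` argument are hypothetical.

```python
def measurement_update(p_hat, s_hat, error, sd_m, s_min=0.0):
    """Standard scalar Kalman correction for a selected model.

    p_hat: mean perturbation estimate, s_hat: its uncertainty (variance),
    error: sensory prediction error, sd_m: measurement noise SD.
    """
    gain = s_hat / (s_hat + sd_m ** 2)        # Kalman gain
    p_new = p_hat + gain * error              # move the mean toward the data
    s_new = max((1.0 - gain) * s_hat, s_min)  # shrink uncertainty, with a floor
    return p_new, s_new


def time_update(p_hat, s_hat, a, sd_s):
    """Standard scalar Kalman prediction for a model that is not selected.

    a: decay rate (0 < a <= 1), sd_s: state noise SD.
    """
    return a * p_hat, a * a * s_hat + sd_s ** 2


# A selected novice model (uncertainty 4, measurement noise SD 5.0) after
# one trial with a prediction error of -1 (normalized units).
print(measurement_update(0.0, 4.0, -1.0, 5.0))
# An unselected model: its mean decays slightly and its uncertainty grows.
print(time_update(-1.0, 0.04, 0.9997, 0.05))
```

In this sketch, repeated selection makes a model both more accurate and more precise, whereas a model that is left unselected slowly forgets and becomes less certain.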
All three types of models (novice, expert, and baseline) are selected and updated with Equations 7 and 8, but differ as follows: (1) A novice model is a model that has never been selected and therefore has a large initial uncertainty Ŝ_t^i; its posterior distribution is approximately flat and centered on zero (Fig. 1B). (2) An expert model is a previous novice model that has been selected and updated: its mean prediction p̂_t^i is close to the perturbation amplitude and its uncertainty Ŝ_t^i has been reduced, down to a minimum value Ŝmin (corresponding to the highest possible precision). A model is said to be created when it is selected and changes from its non-informative novice state with an approximately flat posterior to a more peaked posterior centered near the perturbation (Fig. 1B). (3) The baseline model prediction is centered near zero (i.e., no perturbation) and its uncertainty is initialized to the smallest possible uncertainty, equal to the minimum value Ŝmin. In addition, the baseline model has a large prior compared with the novice and expert models. Note the binary treatment of errors for learning, similar to the model of Berniker and Kording (2008): the errors attributed to the body update the baseline model (corresponding to their body model); the errors attributed to external origins update the perturbation model (corresponding to their world model).
Overall simulation design
The simulations (except for the last set, in which we compare savings following gradual and abrupt adaptation) and all experimental conditions comprised a baseline block, a sequence of abrupt adaptation and washout blocks, and an error-clamp block (Fig. 2A), as follows: a baseline block (20 trials), learning block (LB)1 and washout block (WB)1 (60 and 40 trials, respectively), LB2 and WB2 (50 and 20 trials, respectively), LB3 and WB3 (30 and 40 trials, respectively), and LB4 (50 trials), followed by an error-clamp block (120 trials; Fig. 2B). The numbers of trials in the learning and unlearning blocks were purposely varied and determined during piloting to prevent predictable periodicity in the experiment.
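The block structure above can be written down as a trial-by-trial schedule. The sketch below is only an illustration of the listed block lengths; the names `BLOCKS` and `build_schedule` are hypothetical, and noise and trigger trials are left out.

```python
# (block name, number of trials), in the order given in the text.
BLOCKS = [
    ("baseline", 20),
    ("LB1", 60), ("WB1", 40),
    ("LB2", 50), ("WB2", 20),
    ("LB3", 30), ("WB3", 40),
    ("LB4", 50),
    ("error_clamp", 120),
]


def build_schedule(amplitude_deg=-20.0):
    """Return one (block, perturbation in degrees, is_error_clamp) tuple per trial."""
    schedule = []
    for name, n_trials in BLOCKS:
        # Only learning blocks (LB1 to LB4) carry the abrupt perturbation.
        perturbation = amplitude_deg if name.startswith("LB") else 0.0
        clamp = name == "error_clamp"
        schedule.extend((name, perturbation, clamp) for _ in range(n_trials))
    return schedule


schedule = build_schedule()
print(len(schedule))                               # total number of trials
print(sum(1 for _, p, _ in schedule if p != 0.0))  # perturbation trials
```

The same function with `amplitude_deg=-10.0` gives the small perturbation schedule.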
We first tested the effect of perturbation amplitudes on subsequent behavior in unlearning and relearning blocks with a large perturbation amplitude (−20°; Fig. 2B, Condition 1a) versus a small perturbation amplitude (−10°; Fig. 2B, Condition 3). Second, we tested the effect of different perturbation noise levels in error-clamp following adaptation to a large perturbation (−20°). The schedule for these simulations was the same as for the previous conditions, but with a small noise level (Gaussian, 0° mean, SD = 0.5°; Fig. 2B, Condition 1a) versus a large noise level (SD = 4.0°; Fig. 2B, Condition 2a) added to the perturbation. Third, we tested the effect of two separate trigger trials during error-clamp following adaptation to a −20° perturbation. The trigger trials, each a single −20° perturbation trial, were inserted at one-half and three-quarters of error-clamp, respectively (Fig. 2B, Conditions 1b and 2b). We note that these trigger trials are similar to “reinstatement trials” in the classical conditioning literature (Bouton and Peck, 1989). Fourth, we tested for qualitative individual differences in adaptation, washout, savings, and error-clamp. For this, we applied Condition 2b, but made a change to a single parameter in the model, the baseline model uncertainty (and associated minimum uncertainty Ŝmin; see Simulation parameters). Note that, in theory, individual differences could also be achieved via changes in the prior parameters (as the priors also act to modify the weights of Eq. 6). The study of the differences (or similarity) in model selection due to individual differences in baseline model uncertainty and differences in prior is left for future work.
In the experiment, we clustered subjects in the large amplitude groups based on their response to the trigger trials, and studied savings and washout in previous trials for the two clusters.
Finally, we tested the effects of gradual versus abrupt perturbation on savings, using paradigms akin to those used by Roemmich and Bastian (2015). We tested for savings in a second abrupt adaptation block of 100 trials that is preceded by either a gradual (100 trials), abrupt (100 trials), or short adaptation (20 trials) block.
Note that for simplicity, we only implemented a model that accounts for adaptation to a single perturbation. Initially the model therefore only contained a novice model and a baseline model (Fig. 1A,B). Such a minimal model is sufficient to account for our data. The model and analysis code, as well as the data, are available at https://sites.usc.edu/cnrl/resources/.
Simulation parameters
In all simulations, hand direction was modeled between 0 and 1, and then scaled by 20° to match experimental data. We determined a single set of parameters that could account for all experimental results qualitatively (except for individual differences, see end of this section): the SDs of both perturbation noise σp and error-clamp noise σec (Eq. 1) were 0.2 for the large-noise condition and 0.025 for the small-noise condition. The motor noise SD σu was set at 0.17 to qualitatively reproduce the experimental error clamp data in the different amplitude and noise conditions. The state noise σs was 0.05 for both perturbation and body model, and the measurement noise σm was set at 5.0 and 0.8 for the perturbation and body model, respectively, to qualitatively reproduce experimental learning rates. The prior of the baseline model ρ1 was 0.95 and that of the novice model ρ2 was 0.05. Initial mean perturbation values were set at 0. The initial uncertainty for the novice model was 100 times greater than for the expert model, with 4 and 0.04, respectively. We also introduced a minimum value Ŝmin, which was needed to maintain numeric stability. In simulations aimed at studying between-subject qualitative differences in adaptation and de-adaptation, we increased the initial uncertainty for the baseline model (and corresponding minimum uncertainty) to 0.16. Finally, the retention parameters were given different values for the perturbation model (a = 0.9997) and for the baseline model (a = 0.96; for justification of these parameters, see Discussion, Limitations).
Experimental design and statistical analysis
Experimental design.
Fifty-four subjects (22.5 ± 3.8 SD years old, 20 males and 34 females) participated in this visuomotor rotation study after signing an informed consent form. The study was approved by the institutional review board at the University of Southern California. The visuomotor perturbation rotated the cursor position counterclockwise by a given angle with respect to the starting position. We randomly assigned subjects to one of five different conditions (Fig. 2B). The experimental design was closely aligned to the design of the simulations. In particular, in all conditions, the experiment schedule consisted of adaptation, washout, and error-clamp blocks with the same order and number of trials as in the simulations (see Overall simulation design; Fig. 2B).
Conditions 1a, 2a, and 3 differed in either rotation angles (Condition 1a vs Condition 3) or in Gaussian noise levels added to perturbation (Condition 1a vs Condition 2a), as follows: Condition 1a (n = 11): large perturbation (−20°) and small noise level (SD = 0.5°); Condition 2a (n = 11): large perturbation (−20°) and large noise level (SD = 4.0°); Condition 3 (n = 11): small perturbation (−10°) and small noise level (SD = 0.5°). Conditions 1b (n = 11) and 2b (n = 10) were identical to Conditions 1a and 2a, respectively, except for two “trigger trials” inserted at one-half and three-quarters of the error-clamp, respectively. Trigger trials were simply single rotation trials, with the same amplitude as rotation trials in the adaptation blocks (i.e., −20° rotation). All subjects in each condition received exactly the same rotation sequence.
Detailed experimental methods.
Subjects sat in front of a device that matched hand space with visual space via a mirror, and were instructed to hold a stylus pen moving on a digitizer tablet (Wacom Intuos 7). Head and trunk movements were limited via a chin-rest. The experiment took place in a dark room, and the mirror obscured the view of the forearm and hand. A cursor (red dot, 1.2 mm radius) representing the tip of the pen was displayed on the mirror. Before the start of each trial, subjects were instructed to position the cursor inside a home circle of 3 mm radius (∼36 cm away from the subject's torso). We used a polar coordinate system centered on the home circle, with 0° defined as the forward direction and positive direction as clockwise deviation. Subjects were instructed to perform an outward shooting movement toward a circular target of 3° radius. The target appeared at a pseudorandom location at each trial, within 5° around the center of a 120° arc, which was 10 cm away from the starting position and centered on the 0° direction. Subjects were told to initiate a shooting movement as soon as a target appeared and to stop the movement after crossing the arc. The cursor disappeared when the pen tip moved farther than 3 cm from the starting position. When the pen tip crossed the arc, the red dot was displayed at the crossing point and remained there for 1 s. Subjects were encouraged to keep movement duration (time between the moment the cursor disappears and the moment the cursor crosses the arc) relatively constant and short. The messages “Too Slow” or “Too Fast” were displayed when the duration was >300 or <100 ms, respectively. After each shooting movement, subjects performed an inward movement to the home circle, during which only the radial location of the cursor was available.
Statistical analysis.
To quantify the rates and amount of adaptation, and account for individual differences, we fitted exponential mixed-effect models to the (re-)adaptation and de-adaptation data, similar to Ramkumar et al. (2016) and Schweighofer et al. (2018) for other types of motor learning data. The rates of learning were estimated by the time constants in the exponential model. In addition, we included grouping (binary) variables in the models to perform statistical tests for differences in time constants in different blocks, in different conditions, and in different subgroups of subjects.
To test for savings in re-adaptation in the large perturbation amplitude Condition 1a and the small amplitude Condition 3, we first compared the time constants in the first learning block to the time constants of the next three learning blocks. For this repeated-measure comparison, the hand direction h_{i,j} of participant i at trial j was modeled by an exponential mixed-effect model (Eq. 9.1), where t_j is the trial within each learning block (trial 0 at the beginning of each block), ϵ_{i,j} is a residual term, G is a within-subject grouping binary variable, with G = 0 for the first learning block and G = 1 for LB2–LB4, α_i and β_i are the mixed-effect coefficients representing baseline and asymptotic adaptation levels, and τ_i is the mixed-effect time constant of adaptation for the first block. The mixed-effect parameter T_i reflects the difference in time constants between the first block and the following blocks (with T_i < 0 indicating savings).
To test for differences in savings between the large amplitude Condition 1a and the small amplitude Condition 3, we compared the time constants in re-adaptation Blocks 2, 3, and 4 between the two groups with a second exponential mixed-effect model (Eq. 9.2), where G is a between-subject grouping binary variable, with G = 0 for Blocks 2, 3, and 4 of the small amplitude Condition 3 and G = 1 for the corresponding blocks in the large amplitude Condition 1a. A similar model was fit to the three washout blocks to compare the time constants of washout in the large and small amplitude conditions.
The MATLAB function nlmefit was used for fitting the mixed-effect exponential models. Because the direct estimation of τ_i + T_i G created numerical instability, we estimated instead exp(ψ_i + Δ_i G) so that the time constants were always positive. The same models were fit to experimental and simulated data; however, in simulated data, we only reported the fixed effects as an indication of the qualitative behaviors (and not statistical differences between blocks and conditions, because these differences were due to parameter choice).
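The reparameterization of the time constant can be illustrated with a small fixed-effect sketch. The exact form of the exponential model is not reproduced here; the curve below, which starts at a baseline level `alpha` and approaches an asymptote `beta`, is an assumption, and the function names are hypothetical. The example values 7.96 and 1.68 trials are the fixed-effect time constants reported for Condition 1a in the Results.

```python
import math


def time_constant(psi, delta, g):
    """Time constant exp(psi + delta * g), positive for any real psi and delta."""
    return math.exp(psi + delta * g)


def hand_direction(t, alpha, beta, psi, delta, g):
    """Assumed exponential learning curve from alpha (t = 0) toward beta."""
    tau = time_constant(psi, delta, g)
    return alpha + (beta - alpha) * (1.0 - math.exp(-t / tau))


# Parameters chosen to reproduce the reported Condition 1a time constants.
psi = math.log(7.96)            # first learning block (G = 0)
delta = math.log(1.68 / 7.96)   # change for relearning blocks (G = 1)

print(time_constant(psi, delta, 0), time_constant(psi, delta, 1))
print(delta < 0)  # a negative delta corresponds to savings
```

On this log scale, a shorter time constant in the G = 1 blocks appears as a negative Δ, which plays the role of T < 0 in the original parameterization.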
Statistical differences between time constants in the different blocks or conditions were tested by the p values associated with the fixed-effect parameter Δ (obtained after computing z-scores for this parameter). To minimize the number of parameters and increase the chance of convergence, a diagonal covariance structure was used for the random effects. In addition, we added the last trial of the previous block to the fitted data to increase numerical stability. Overall model fit was deemed successful if (1) the fixed-effect time constant was significant, (2) the asymptotic adaptation level was significant, and (3) the variance of the random-effect time constant was not zero (yielding different estimated time constants for all subjects). Criteria 1 and 2 were always met, but criterion 3 was often not met when fitting the models to individual blocks. To solve this issue, we removed the random-effect baseline from the model when fitting individual block data. In addition, the residuals were inspected for outliers and for overall model fit. We report the fixed-effect time constants with their 95% confidence intervals [calculated via the “delta method” (Ver Hoef, 2012) because of the exponential transform]. The time constants for each adaptation and washout block reported in Figures 4 and 7 were computed by fitting the model of Equation 9.1 (without a grouping variable) to each block.
To test for possible qualitative individual differences for all 21 subjects in the large amplitude Conditions 1b and 2b, we first clustered the subjects based on their responses to trigger trials. Specifically, we computed the mean hand directions in the five trials before and after each trigger trial. We then plotted the mean hand direction after the trigger trial as a function of the mean hand direction before the trigger trial. Three distinct behaviors were predicted from the computational model. If the hand direction is already near the adapted (“high”) state, the trigger trial will have little effect, and hand direction will stay high. If the hand direction is near baseline, there are two possibilities: either the trigger trial will have little effect, and the hand direction will stay near baseline (“low” state), or the trigger trial will recall the perturbation model, and the hand direction will suddenly change to the adapted (high) state. Participants in the high versus low states after trigger trials were classified into “responders” and “non-responders”, respectively, using an unsupervised k-means algorithm with two clusters based on their post-trigger trial responses (using the MATLAB function kmeans). Following this clustering of participants into two subgroups, we then analyzed differences in savings and washout using the mixed-effect exponential model as in Equation 9.2, but with the between-subject grouping variable G = 0 for the non-responder subgroup and G = 1 for the responder subgroup. Significance threshold was set at p = 0.05.
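The two-cluster split on post-trigger responses can be sketched with a one-dimensional k-means written in plain Python. This is a stand-in for illustration only, not the MATLAB kmeans call used in the analysis; the function name `two_means_1d`, the initialization at the extreme values, and the data in `post_trigger` are all hypothetical.

```python
def two_means_1d(values, n_iter=100):
    """Two-cluster k-means on scalars; label 1 is the cluster with the higher mean."""
    lo, hi = min(values), max(values)  # initialize the centers at the extremes
    labels = []
    for _ in range(n_iter):
        # Assignment step: each value goes to the nearer center.
        new_labels = [int(abs(v - hi) < abs(v - lo)) for v in values]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: recompute each center as the mean of its members.
        low = [v for v, k in zip(values, labels) if k == 0]
        high = [v for v, k in zip(values, labels) if k == 1]
        if low:
            lo = sum(low) / len(low)
        if high:
            hi = sum(high) / len(high)
    return labels, (lo, hi)


# Hypothetical mean hand directions (deg) in the five trials after a trigger trial.
post_trigger = [1.5, 17.0, 2.0, 18.5, 0.5, 16.0]
labels, centers = two_means_1d(post_trigger)
groups = ["responder" if k == 1 else "non-responder" for k in labels]
print(list(zip(post_trigger, groups)))
```

The resulting binary label can then serve as the between-subject grouping variable G in the mixed-effect model.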
Results
Theoretical predictions
The model makes three clear predictions, which were tested in simulations and behaviorally, about (1) the role of perturbation amplitudes on adaptation, de-adaptation, and re-adaptation (Fig. 1B); (2) the role of noise on model selection during error-clamp; and (3) the effect of re-introducing the perturbation with a single trigger trial in the clamp.
First, if the initial perturbation is sufficiently large given the precision of the baseline model, a new memory will be created by selection and update of the novice model (Eqs. 2 and 6; Figs. 1B, 3, examples of simulation results for Conditions 1 and 2). This is because the large perturbation largely increases the sensory prediction error for the baseline model, yielding a small likelihood (see exponential term in Eq. 6) for this model. As a result, the weight of the baseline model will be smaller than that of the novice model, which will be selected (Eq. 6). With sufficient trials, the novice model will gradually become an expert: its prediction of the perturbation will become both more accurate (resulting in a small prediction error; Eq. 7, first) and more precise (small uncertainty; Eq. 7, second). In contrast, the baseline model will stay protected from update (Fig. 1B). At the onset of the following de-adaptation block, the baseline model will be selected, because it now has the smallest sensory prediction error (Fig. 3). Because of the passage of time, the expert model's mean prediction will decay to some extent and its uncertainty will increase during de-adaptation (Eq. 8, second; Fig. 3, second column). However, at the onset of a subsequent re-adaptation block, its sensory prediction error will still be relatively small, leading to its re-selection (Fig. 3, second and third columns). The learner will therefore switch rapidly between the perturbation and the baseline models, allowing rapid changes of behavior upon environmental changes, expressed as short aftereffects in washout and large savings in relearning. We call this learner a “two-model learner” (Fig. 1B).
On the contrary, if the initial perturbation is sufficiently small or gradual, or if the baseline model has a large uncertainty, the precision-weighted sensory prediction error of the baseline model will be small, and its weight will be larger than that of the novice model. The baseline model will therefore be selected and updated both in the first block (Figs. 1B; Fig. 3, last row, example of simulation results for Condition 3) and in the subsequent washout and relearning blocks of the same amplitude, making transitions slow and gradual, resulting in long aftereffects and no savings (Fig. 3). The novice model will remain unchanged in this condition, with large uncertainty. We call this learner a “one-model learner”.
Second, the model predicts that once a learner has formed two models, stochastic switching can occur between the two models in error-clamp, resulting in “lags” of various duration before sudden decay. If the expert model accurately predicts the perturbation in the last trials of adaptation before the clamp, the sensory prediction error of the model will be near zero early in the clamp. This is because the difference between the hand direction (say, 20°) and the clamped visual feedback (0° plus some noise) is approximately equal to the preceding perturbation (−20°). The perturbation model will therefore continue to be selected. The durations of the lags are stochastic, however. This is because noise affecting the sensory prediction error (because of non-central motor noise, i.e., motor noise not carried by the efferent copy, sensory noise, or experimentally induced perturbation noise) can lead to a sudden reduction of the sensory prediction error of the baseline model, and therefore selection of this model, which results in an abrupt switch of the hand direction toward baseline (Fig. 3, first and last columns, rows 1–4). Therefore, higher levels of noise will lead, on average, to earlier switching toward baseline than low levels of noise. This is illustrated in the simulation examples of Conditions 2a and 2b (large perturbation noise) versus Conditions 1a and 1b (small perturbation noise) in Figure 3. Note that noise can also yield “upward switches” from the baseline to the perturbation model (Fig. 3, illustrated in Condition 2a, third row, first column), although these are less frequent events than downward switches because of the passive memory decay of the perturbation model and the greater prior of the baseline model. In contrast, if a learner has only updated one model (the baseline model), the hand direction will only decay gradually during error-clamp (Fig. 3, Condition 3, last row, first column).
Third, because learners can potentially be one- or two-model learners for the same perturbation amplitude, single trigger trials during error-clamp can lead to large qualitative differences in behavior when the performance has returned near baseline. If a learner has formed an expert model (and is therefore a two-model learner), and the hand direction has previously returned near baseline in the clamp, then a trigger trial will yield a sudden reduction of the sensory prediction error of the expert model, its selection, and a switch of hand direction toward the adapted state; in addition, the hand direction may stay in the adapted state for several trials following trigger trials (Fig. 3, Conditions 1b and 2b). In contrast, for one-model learners, a trigger trial will only cause a transient change because of a small update of the baseline model.
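The second and third predictions can be checked with a short noise-free worked example for a two-model learner. The sketch assumes that a model predicts feedback as the motor command plus its own perturbation estimate, that the prediction error is feedback minus prediction, and that weights are prior times Gaussian likelihood. The expert estimate (−1, i.e., −20° in normalized units) and expert uncertainty (0.4) are hypothetical illustrative values; the priors and baseline uncertainty are those listed under Simulation parameters.

```python
import math

# name: (mean perturbation estimate, uncertainty, prior); expert values are hypothetical.
MODELS = {"baseline": (0.0, 0.04, 0.95), "expert": (-1.0, 0.4, 0.05)}


def selected_model(u, h):
    """Name of the model with the largest unnormalized posterior weight."""
    scores = {}
    for name, (p_hat, s_hat, prior) in MODELS.items():
        error = h - (u + p_hat)  # feedback minus the model's predicted feedback
        likelihood = math.exp(-0.5 * error ** 2 / s_hat) / math.sqrt(2 * math.pi * s_hat)
        scores[name] = prior * likelihood
    return max(scores, key=scores.get)


# Clamp onset: hand still adapted (u = 1), clamped feedback h = 0.
print(selected_model(u=1.0, h=0.0))
# Later in the clamp, after a switch: hand at baseline (u = 0), h = 0.
print(selected_model(u=0.0, h=0.0))
# Trigger trial from baseline: u = 0, perturbation -1 gives h = -1.
print(selected_model(u=0.0, h=-1.0))
```

Under these assumptions, the expert model is retained at clamp onset, the baseline model is retained once the hand has returned to baseline, and a single trigger trial is enough to bring the expert model back.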
Savings and aftereffects following large and small perturbation: experimental and simulation results
Figure 4A shows experimental subject-averaged adaptation data for the large perturbation Condition 1a (−20°/0.5°; top row) and for the small perturbation Condition 3 (−10°/0.5°; bottom row). As shown in Figure 4B, in which we superimposed the (normalized) hand direction for the first 20 trials of each learning block, subjects in Condition 1a show large savings in the relearning blocks (LB2, LB3, and LB4) compared with subjects in Condition 3. Specifically, an exponential mixed-effect model analysis (see Materials and Methods) shows that subjects in Condition 1a had a significantly smaller mean time constant in the relearning blocks than in the initial learning block [LB1 time constant = 7.96 trials, 95% CI: (5.47, 10.46) trials; p < 0.0001; LB2, LB3 and LB4 mean time constant = 1.68 trials, 95% CI: (1.06, 2.31) trials; p < 0.0001 for difference; Fig. 4C]. On the other hand, subjects in the small perturbation Condition 3 only showed a trend for a change in time constants across learning blocks, indicating only a small degree of savings [LB1, time constant = 9.31 trials, 95% CI: (6.24, 12.38) trials; p < 0.0001; LB2, LB3, LB4, mean time constant = 6.42 trials, 95% CI: (4.04, 8.80) trials on average; p = 0.050 for difference]. Note that there was no difference in the time constant of adaptation in the initial block LB1 in the two conditions (p = 0.98). In addition, the mean time constant in relearning blocks (LB2–LB4) of Condition 1a was significantly smaller than the mean time constant for these blocks in Condition 3 (relearning p < 0.0001). Similarly, the mean time constant in washout blocks (WB1–WB3) in Condition 1a was significantly smaller than that of Condition 3 [Condition 3, mean time constant = 4.88 trials, 95% CI: (3.51, 6.24) trials; p < 0.0001; Condition 1a, mean time constant = 2.08 trials, 95% CI: (1.36, 2.81) trials; p < 0.0001 for difference]. Note that these results partially replicate those of Morehead et al. (2015), with the main difference that we used 20° and 10° for the large and small perturbation with a single target, whereas they used 45° and 15° with two targets.
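To make the notion of a time constant concrete, the following minimal sketch estimates it from a single normalized learning curve. It is not the exponential mixed-effect analysis used above: it assumes a noise-free curve of the form y(t) = 1 − exp(−t/τ) and recovers τ by a log-linear least-squares fit through the origin. The function name and the synthetic curves (τ = 8 and τ = 2 trials) are hypothetical and chosen only for illustration.

```python
import math


def estimate_time_constant(normalized_direction):
    """Estimate tau for y(t) = 1 - exp(-t / tau) from one learning curve.

    Assumption: the curve is normalized so that its asymptote is 1.
    Uses log(1 - y) = -t / tau and a least-squares slope through the origin.
    """
    num = 0.0
    den = 0.0
    for t, y in enumerate(normalized_direction):
        residual = 1.0 - y
        # t = 0 carries no slope information; log is undefined for residual <= 0.
        if t == 0 or residual <= 0.0:
            continue
        num += t * math.log(residual)
        den += t * t
    if den == 0.0:
        raise ValueError("need at least one usable trial after t = 0")
    slope = num / den
    return -1.0 / slope


# Hypothetical, noise-free curves over the first 20 trials of a block.
initial_block = [1.0 - math.exp(-t / 8.0) for t in range(20)]
relearning_block = [1.0 - math.exp(-t / 2.0) for t in range(20)]

tau_initial = estimate_time_constant(initial_block)
tau_relearning = estimate_time_constant(relearning_block)
print(f"initial tau = {tau_initial:.2f} trials, relearning tau = {tau_relearning:.2f} trials")
```

In this sketch, savings correspond to a smaller estimated τ in the relearning block than in the initial block. With noisy single-subject data, a nonlinear fit with subject-level random effects, as in the analysis reported above, would be more appropriate than this log-linear shortcut.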
Simulation results (Fig. 4D–F) qualitatively account for these data, with large savings and shorter washouts in the large perturbation amplitude condition compared with the lack of savings and longer washout in the small perturbation amplitude condition. The large perturbation condition produced a time constant of 10.5 trials for the first learning block (LB1), followed by a small mean time constant of 1.55 trials for the subsequent learning blocks (LB2–LB4; Fig. 4F). The washout blocks produced a mean time constant of 1.51 trials across the three washout blocks (WB1–WB3). These short time constants were a direct consequence of model switches upon the perturbation change. On the other hand, the small perturbation condition produced a time constant of 10.0 trials for the first learning block (LB1), followed by a mean time constant of 10.2 trials for the subsequent learning blocks (LB2–LB4), and again a mean time constant of 10.0 trials for the three washout blocks (WB1–WB3). These long time constants were a consequence of the baseline model being continuously updated each time the environment changed.
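The contrast between model switching and continuous baseline updating can be illustrated with a deliberately simplified sketch. It is not the simulation code used for Figure 4: it is noise-free, updates only the mean (not the uncertainty) of the selected model, allows at most one perturbation model, and initializes a new model at the observed perturbation. The variances (16 and 64 deg²), the creation threshold (9), and the learning rate (0.1) are hypothetical values chosen so that a 20° perturbation triggers model creation and a 10° perturbation does not.

```python
def precision_weighted_error(perturbation, model):
    """Squared prediction error multiplied by the model's precision (1 / variance)."""
    error = perturbation - model["mean"]
    return error * error / model["variance"]


def adapt_one_trial(models, perturbation, threshold=9.0, rate=0.1):
    """Select the model with the smallest precision-weighted error and update it.

    If no model predicts the perturbation well (score above the hypothetical
    threshold) and only the baseline model exists, create a perturbation model.
    Returns the index of the selected model.
    """
    scores = [precision_weighted_error(perturbation, m) for m in models]
    best = min(range(len(models)), key=lambda i: scores[i])
    if scores[best] > threshold and len(models) < 2:
        # Assumption: the new model starts at the observed perturbation
        # and inherits the baseline model's variance.
        models.append({"mean": perturbation, "variance": models[0]["variance"]})
        return len(models) - 1
    models[best]["mean"] += rate * (perturbation - models[best]["mean"])
    return best


def simulate(schedule, baseline_variance):
    """Run the schedule from a single baseline model centered at 0 degrees."""
    models = [{"mean": 0.0, "variance": baseline_variance}]
    selected = [adapt_one_trial(models, p) for p in schedule]
    return models, selected


# Learning, washout, relearning (30 trials each); all values are hypothetical.
large = [20.0] * 30 + [0.0] * 30 + [20.0] * 30
small = [10.0] * 30 + [0.0] * 30 + [10.0] * 30

large_models, large_selected = simulate(large, baseline_variance=16.0)
small_models, small_selected = simulate(small, baseline_variance=16.0)
broad_models, broad_selected = simulate(large, baseline_variance=64.0)

print("large perturbation:", len(large_models), "models")
print("small perturbation:", len(small_models), "models")
print("large perturbation, broad baseline:", len(broad_models), "models")
```

Under these assumptions, the large perturbation creates a second model that is reselected at once when the perturbation returns (savings) and dropped at once in washout (short aftereffects), whereas the small perturbation only moves the baseline model gradually. Broadening the baseline variance makes the same 20° perturbation fall below the creation threshold, which parallels the one-model learners described later in the Results.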
Decay in error-clamp: experimental and simulation results
In Figure 5, we show both condition-averaged and individual subjects' hand directions in error-clamp. Although averaged data suggest a continuous and gradual decay, between-subject variability was large. For instance, for subjects in Condition 1a (large perturbation and small noise; Fig. 5A), the average between-subject SD of hand direction in learning and unlearning blocks was 3.9°, whereas in the error-clamp block it was 8.9°. Larger intersubject variability in error-clamp indicates that the dynamics of unlearning in error-clamp may not follow a simple, passive decay. Instead, examination of individual data shows various patterns in error-clamp, with most subjects showing a lag, as predicted by our simulations (compare Fig. 3). For example, Subjects 1, 5, and 10 in Condition 1a showed little or no decay, with hand direction >10° for the whole duration of the clamp. In contrast, Subjects 2, 7, and 8 showed a sudden drop after varying lags following the onset of error-clamp. Finally, Subjects 4, 6, 9, and 11 showed rather gradual decay.
In contrast, in the condition with the larger perturbation noise level (Condition 2a; Fig. 5B), the average hand direction appears to show a faster return to baseline in error-clamp trials compared with the condition with the lower noise level (Condition 1a; Fig. 5A), as predicted by the model. Here again, not all subjects followed a simple gradual decay. In particular, Subjects 23, 27, 28, 30, and 31 exhibited sudden drops, with Subjects 23, 27, 29, and 31 switching back to near 20° spontaneously, resulting in oscillatory patterns.
A density plot of hand directions in error-clamp for all subjects in Condition 1a (Fig. 5D, left, red curve) shows two peaks centered near 0° and 20°. Thus, overall, the hand direction of subjects in Condition 1a remained near the perturbation angle of 20° for a relatively large number of trials, and then switched abruptly to near 0°, with few trials between these two angles. In contrast, the density plot of hand directions for subjects of Condition 2a (Fig. 5D, left, blue curve) shows a single peak centered near 0° with a fat right tail. Thus, overall, the hand direction of subjects in this condition also showed sudden switches between the perturbation angle and baseline, but such switches occurred earlier than in the low-noise Condition 1a, with occasional spontaneous returns to near 20°.
Condition 3 (small perturbation and small noise; Fig. 5C) shows an overall trend of gradual decay. The distribution of hand directions suggests that decay was gradual and slow: whereas the density plot in the large perturbation Conditions 1a and 2a shows at least one peak near 0°, the distribution of hand direction in Condition 3 has a single peak at 6° (Fig. 5D, left, green curve). This suggests that there was no abrupt change, and most subjects did not decay completely to 0°.
For comparison, Figure 5D (right) shows the corresponding distributions of hand directions from multiple independent simulation runs of each condition. Compared with the data, the simulation results show relatively narrower distributions, but they replicate the general patterns of density of hand directions in the experimental data, including the shapes of the distributions and the locations of the peaks in the three conditions. The one exception is that the peak of the small perturbation distribution (Condition 3) is shifted closer to 0°, because the decay parameter in the simulations was set to replicate all the experimental data consistently.
Trigger trials and individual differences in adaptation: experimental and simulation results
To investigate whether a learner had formed two models, i.e., a baseline and a perturbation model, we introduced two trigger trials during error-clamp both in experiments and in simulations of large perturbation Conditions 1b and 2b (Fig. 6).
Condition-averaged hand direction in error-clamp showed instantaneous responses to the trigger trials (i.e., sudden jumps in hand direction), both in Conditions 1b and 2b (Fig. 6A,B, left). The response appears to be sustained for a number of error-clamp trials thereafter. However, the averaged hand direction indicates that the responses to trigger trials were, on average, incomplete in magnitude, i.e., <20°. This is because not all individuals responded to the trigger trials; indeed, three patterns can be observed in the individual responses (Fig. 6A,B, right). First, when the hand direction was near 0° when a trigger trial was presented, a majority of subjects (for instance, Subjects 16 and 19 of Condition 1b) showed immediate jumps in response to the trigger trial to angle values near the adapted state. In contrast, and as expected, when the hand direction was near 20° at the time of the trigger trial, there was no effect (for instance, Subject 15 of Condition 1b). These first two patterns were predicted by the model; compare results for these subjects with the model's response to trigger trials in Figure 3 (first column, second and fourth rows). However, a third pattern is also apparent: several subjects with hand direction near 0° at the time of the trigger trial did not respond to this trial (for instance, Subjects 13 and 14 of Condition 1b).
Figure 6C (right) shows simulation results in a large amplitude condition that account for these patterns of response to trigger trials. Two subgroups of subjects were simulated: a subgroup with the default baseline uncertainty (and with the associated minimum uncertainty) of 0.04; and a subgroup with a broader baseline uncertainty of 0.16. The subgroup with the narrower baseline uncertainty developed a new perturbation model during adaptation (two-model learners), and responded to trigger trials in clamp, as shown in Figure 3. As a result, the two-model learners showed large savings (similar to that of the large perturbation in Fig. 4), with a time constant of 10.7 trials for LB1, followed by a small mean time constant of 1.46 trials for the subsequent learning blocks (LB2–LB4; Fig. 7F). The washout blocks produced a mean time constant of 1.43 trials across the three washout blocks (WB1–WB3). In contrast, the subgroup with the broader baseline model uncertainty did not develop a new perturbation model. These one-model learners produced a time constant of 4.0 trials for LB1, followed by a mean time constant of 4.48 trials for LB2–LB4, and a mean time constant of 4.07 trials for the three washout blocks.
We hypothesized that subjects who responded to trigger trials had previously formed a new perturbation model, that is, were two-model learners. In contrast, subjects who did not respond to trigger trials had previously only updated their baseline model, that is, were one-model learners.
To test this hypothesis, we clustered subjects in Conditions 1b and 2b into responders and non-responders based on the average hand direction of the five trials following trigger trials using k-means clustering (Fig. 6C, left). The average hand direction of the five trials following the trigger trials in Cluster 1 was 17.7 ± 0.7° (SE). In contrast, the average hand direction of the five trials following the trigger trials in Cluster 2 was 3.4 ± 1.0° (SE). Responders were defined as subjects who had at least one response out of the two trigger trials that belongs to Cluster 1 in Figure 6C. This clustering resulted in 15 responders and 6 non-responders. Based on these simulation results, and on the savings results of large and small perturbation shown in Figure 4, we then conjectured that the non-responders would show little or no savings in relearning blocks, as well as gradual aftereffects in washout blocks.
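The clustering and the responder definition can be sketched as follows. This is an illustrative reimplementation, not the analysis code used for Figure 6C: it applies a plain two-cluster k-means (Lloyd's algorithm) to scalar post-trigger averages and then labels a subject as a responder if at least one of the two trigger responses falls in the high cluster. The subject labels and the hand-direction values are hypothetical.

```python
def two_means_1d(values, iterations=100):
    """Lloyd's algorithm with k = 2 on scalar data; returns (centers, labels)."""
    # Initialize the two centers at the extremes of the data.
    centers = [min(values), max(values)]
    labels = []
    for _ in range(iterations):
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1 for v in values]
        new_centers = []
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centers.append(sum(members) / len(members) if members else centers[k])
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return centers, labels


# Hypothetical mean hand direction (degrees) over the five trials after each
# of the two trigger trials, per subject.
responses = {
    "s1": (18.2, 16.9),
    "s2": (2.5, 17.5),
    "s3": (4.1, 3.0),
}

flat = [r for pair in responses.values() for r in pair]
centers, labels = two_means_1d(flat)
high = max(range(2), key=lambda k: centers[k])  # cluster near the adapted state

# A subject is a responder if at least one of the two responses is in the high cluster.
responders = {}
index = 0
for subject, pair in responses.items():
    responders[subject] = any(labels[index + j] == high for j in range(len(pair)))
    index += len(pair)

print(centers)
print(responders)
```

With these made-up values, the second subject is classified as a responder on the strength of a single response near the adapted state, which mirrors the "at least one response out of the two trigger trials" criterion stated above.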
Figure 7, A and B, shows the subject-averaged adaptation data for the 15 responders (top row) and 6 non-responders (bottom row). We then performed an exponential mixed-effect model analysis similar to that in Figure 4 with the data from all 21 subjects, coding for responders and non-responders with a grouping variable. Figure 7C shows the fixed-effect time constants for each block of both learning and washout blocks for both responders and non-responders. Responders showed large savings, i.e., a significant decrease of the mean time constant in the relearning blocks (LB2–LB4), compared with the initial learning block [LB1 time constant = 6.75 trials, 95% CI: (5.26, 8.25) trials; p < 0.0001; LB2–LB4 mean time constant = 2.21 trials, 95% CI: (1.64, 2.79) trials; p < 0.0001 for difference]. On the other hand, non-responders showed no change in time constants in learning and relearning blocks, indicating no savings [LB1 time constant = 6.97 trials, 95% CI: (3.08, 10.87) trials; p < 0.0001; LB2–LB4 mean time constant = 8.86 trials, 95% CI: (4.10, 13.63) trials; p = 0.34 for difference]. Note that there was no difference in time constants in the initial block LB1 between responders and non-responders (p = 0.77). In addition, the mean time constant in relearning blocks (LB2–LB4) of responders was significantly smaller than that for non-responders (relearning p = 0.00088). The responders showed short aftereffects, whereas the non-responders showed larger aftereffects: the mean time constant in washout blocks (WB1–WB3) of responders was significantly smaller than that of non-responders [non-responders, time constant = 6.50 trials, 95% CI: (3.29, 9.71) trials; p < 0.0001; responders, time constant = 3.51 trials, 95% CI: (1.48, 5.55) trials; p = 0.037 for difference]. Note that these results for responders and non-responders parallel those for the large perturbation Condition 1a and the small perturbation Condition 3 in Figure 4, respectively.
Thus, in our paradigm, 20° appears to be a large perturbation for most subjects, leading to the development of a new perturbation model. For a few other subjects, however, the same 20° perturbation appears to only warrant the update of the baseline model.
Reaction time analysis
Following previous modeling work in motor learning and adaptation (Wolpert and Kawato, 1998; Berniker and Kording, 2008), and in visual memory (Gershman et al., 2014), we have proposed a scheme for the formation of internal “perturbation” models from novice models. However, in visuomotor adaptation, a number of recent studies have found that participants can strategically re-aim their hand direction to minimize errors (Taylor et al., 2014; Haith et al., 2015). Thus, it can be asked whether our perturbation models are true internal models or whether they are deliberate/controlled processes that underlie such strategic re-aiming. In this latter scheme, the two-model learners would update a baseline model for small perturbations and re-aim for large perturbations. Re-aiming has been associated with an increase in reaction times (Haith et al., 2015; McDougle and Taylor, 2019). Accordingly, large perturbations would be associated with larger reaction times than small perturbations. In addition, such a scheme would predict increases in reaction times between the pre-trigger trials (when no aiming occurs) and post-trigger trials (when re-aiming occurs). In contrast, selection and update of a true perturbation internal model would show similar reaction times to the baseline model before and after the trigger.
We therefore examined fluctuations in reaction time across the experimental conditions, across the adaptation and de-adaptation blocks, and between the five trials before and after trigger trials. For each subject, we averaged the reaction times within blocks, and then compared the mean reaction times between conditions with t tests or between blocks with either repeated-measures ANOVAs or paired t tests. Reaction times >2 s were discarded.
Mean reaction times for the four adaptation blocks in the large perturbation Condition 1a were 605 ± 48 (SE) ms, 568 ± 39 ms, 561 ± 38 ms, and 568 ± 32 ms in the order of LB1 to LB4. In the small perturbation Condition 3, reaction times were 579 ± 45, 560 ± 42, 542 ± 44, and 498 ± 28 ms, in the same order. An overall repeated-measures ANOVA with Blocks 1–4 as the repeated effect and condition as the between-group effect showed that reaction times decreased over blocks (p = 0.003), but were not different between the large perturbation Condition 1a and the small perturbation Condition 3 (p = 0.34), with no interactions (p = 0.62). In the large perturbation Condition 1a, reaction times were not different across blocks (p = 0.22, repeated-measures ANOVA). In the small perturbation Condition 3, however, there was an effect of blocks (p = 0.007); reaction times were smaller in adaptation Blocks 3 and 4 than in adaptation Block 1 in this condition (both p < 0.05; paired t test).
Across both trigger trials in Conditions 1b and 2b and across subjects, there was no difference between reaction times in the five trials before and after the trigger trials (before 506 ± 17 ms; after 473 ± 18 ms; p = 0.13; paired t test). We then analyzed changes in reaction times for effective triggers (that is, triggers that brought the hand direction from a low state of <10° to a high state) for the responder subgroup (see above). In this case, the reaction times increased by 57 ms on average (five trials before trigger 531 ± 33 ms; five trials after trigger 588 ± 40 ms; p = 0.037; paired t test). Note, however, that although such effective triggers occurred in 14 participants, this increase in reaction time was only present in 9 participants; the other 5 participants showed a decrease in reaction times following trigger trials.
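For readers less familiar with the paired comparison used here, the following minimal sketch computes the paired t statistic from per-subject mean reaction times before and after a trigger trial. The reaction times are hypothetical and are not the experimental data; the function name is likewise an illustrative choice.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for a paired comparison of two conditions.

    Each position in `before` and `after` must belong to the same subject.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample SD with n - 1 in the denominator).
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se, n - 1


# Hypothetical per-subject mean reaction times (ms) over five trials.
rt_before = [520.0, 540.0, 500.0, 560.0, 535.0]
rt_after = [580.0, 590.0, 540.0, 600.0, 610.0]

t_value, dof = paired_t_statistic(rt_before, rt_after)
print(f"t({dof}) = {t_value:.2f}")
```

The p value follows from the t distribution with n − 1 degrees of freedom; in practice a library routine such as `scipy.stats.ttest_rel` returns both the statistic and the p value directly.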
Savings and aftereffects following gradual and abrupt perturbations: simulation results
Here we present simulations that account for “one-trial” savings in the data from Roemmich and Bastian (2015). Specifically, these authors showed that, whereas a gradual perturbation followed by a washout period does not lead to savings in a subsequent abrupt re-adaptation phase, an abrupt perturbation followed by a washout period leads to large savings in a subsequent abrupt re-adaptation phase. A short abrupt perturbation followed by a washout period also leads to savings in a subsequent abrupt re-adaptation phase. Figure 8 shows that our model, without any changes, reproduces these results for the three initial blocks (gradual, abrupt perturbation, and short abrupt perturbation).
Discussion
Using a combined computational and behavioral approach, we showed that qualitative differences in creating, updating, and recalling memories in visuomotor adaptation are linked and explained by a single computational model of motor adaptation based on the mixture of experts framework. In the following, we discuss how our results account for previous experimental data on savings and error clamp, and for qualitative individual differences in adaptation. We then discuss possible implications of our results for studying neural mechanisms.
Savings, error-clamp, and individual differences in adaptation
Savings were dominant in the large perturbation conditions. In contrast, longer-lasting aftereffects were observed in the small perturbation condition. These results are in line with previous studies showing that savings occurred after an initial large abrupt perturbation followed by a washout period (Klassen et al., 2005; Krakauer et al., 2005; Morehead et al., 2015), even with very few trials of adaptation (Huberdeau et al., 2015). In contrast, no savings occurred when a gradual adaptation was followed by a washout period (Herzfeld et al., 2014; Roemmich and Bastian, 2015). Our model explains these differences via creation of a new internal model for large perturbations and via update of the baseline model for small and gradual perturbations. Note, however, that we did observe some degree of savings in the small perturbation condition as well (albeit significantly less than in the large perturbation condition). A model that includes a meta-learning mechanism, such as proposed by Herzfeld et al. (2014), could account for these results. Alternatively, such savings could also be due to the small perturbation being already large enough for creating and updating a perturbation model for some participants. This may be the case, because, in the second learning block, the two smallest time constants in the small perturbation Condition 3 (4.15 and 4.9 s) were similar to the two largest time constants of the responder subgroup in the large perturbation Conditions 1b and 2b (4.8 and 4.9 s).
Our study also reconciles previous controversial results on error-clamps. In the large perturbation conditions, performance during clamp was often sustained near the perturbation level and abruptly terminated after lags of varying durations. The lag duration was, on average, shorter in conditions with large perturbation noise. Such lags were found in previous studies (Scheidt et al., 2000; Vaswani and Shadmehr, 2013; Vaswani et al., 2015). In stark contrast, it has been argued that, in these previous studies, the lags were due to an artifact and that the decay was due to passive forgetting (Brennan and Smith, 2015). Our data are consistent with such a gradual decay in the small perturbation condition and for a number of subjects in the large perturbation conditions.
The distinction between two- and one-model learners in the model accounts for these qualitative differences in savings and error-clamp data. If a learner becomes a two-model learner, then savings, lags, and switching occur. Noise in the prediction errors and memory decay determine the lag durations in error-clamp; if the prediction error of the perturbation model becomes sufficiently large because of a sudden large noise input, model switching ensues. The adapted performance can be re-instated via trigger trials, via a sudden decrease in the perturbation model's prediction error, and thus reselection of this model in these responders. Some subjects did not respond to trigger trials, however. Because these subjects exhibited gradual washout and little savings, we predict that they remained one-model learners and solely updated their baseline models.
Limitations of the model
Our model accounts for the data presented, but not for a number of other phenomena in adaptation. First, we did not dissociate the “strategic” versus implicit components of adaptation (Fernandez-Ruiz et al., 2011; Taylor et al., 2014; Schween and Hegele, 2017). Despite having found no differences in reaction times between the large and small amplitude conditions, most, but not all, subjects in the responder subgroup showed an increase in reaction times following effective triggers. This suggests that these subjects strategically re-aimed their movements, at least to some degree. Such strategic adaptation could explain why, in our simulations, a smaller decay rate was needed in the expert model than that in the baseline model to account for the long lags in error-clamp. This is because strategic learning appears to be “temporally stable”, whereas implicit learning consists of both temporally stable and “temporally labile” components (Miyamoto et al., 2014). We note however that such distinction between strategic and implicit components does not contradict the need to create and select multiple memories for multiple- or even for a single-adaptation task, as we have proposed. Second, and related, we did not model multiple processes with different timescales involved in motor adaptation (Smith et al., 2006; Kording et al., 2007; J. Y. Lee and Schweighofer, 2009; S. Kim et al., 2015; J. Y. Lee et al., 2016). As a result, the model cannot reproduce effects that require both fast and slow processes, such as spontaneous rebounds in error clamp. Addition of a fast process in the model could account for this phenomenon. Third, we largely simplified the simulations by only considering a baseline model and a single novice model. An extension of the model to multiple adaptation could be envisioned in which the number of memories is not specified in advance (Gershman and Blei, 2012). 
Finally, we assumed that selection was via a winner-take-all scheme, which accounted well for switching data in the error clamp and following trigger trials. However, a continuous weighting scheme may better explain multiple adaptation data (Ghahramani and Wolpert, 1997), as well as anterograde interference data. Yet another possible scheme is that the baseline model is always updated (Berniker and Kording, 2011), with the perturbation model selected or not depending on the perturbation. This would account for adaptation to task-irrelevant clamped feedback data, up to large perturbation angles (Morehead et al., 2017; H.E. Kim et al., 2018). Future work is needed, however, as these last two possible architectures would not account well for switching behavior in error clamp.
Implications of our results for studying neural mechanisms
There is strong evidence that the cerebellum is involved in motor adaptation via update of forward models based on sensory prediction errors (Martin et al., 1996; Miall et al., 2007; Tseng et al., 2007; Izawa et al., 2012; Schlerf et al., 2012; Popa et al., 2013). A possibility is that prediction errors are generated by comparison of sensory signals and predictions from the cerebellum. A second possibility is that prediction errors are computed in the parietal cortex (Inoue and Kitazawa, 2018) by the microcircuitry of the cortical column (Bastos et al., 2012). The prediction errors could then be sent to the areas of the cerebellum involved in adaptation (Sasaki et al., 1977; Rabe et al., 2009). Yet a third possibility is that both the parietal cortex and the cerebellum update memories from prediction errors, but at different time scales (S. Kim et al., 2015). A computational imaging study (fMRI or lesion mapping) of adaptation that uses model-derived prediction errors as regressors of BOLD activity may help to shed light on these possibilities.
What could be the neural correlates of the body and perturbation models? A possibility, which can be tested in an imaging study, is that the phylogenetically newer lateral parts and the anterior arm area part (lobules IV–VI) are involved in adaptation to large and small perturbations, respectively (Imamizu et al., 2003; but see Werner et al., 2014). In addition, recent data (Inoue and Kitazawa, 2018) are consistent with parietal area 5 neurons encoding errors for the baseline model and parietal area 7 neurons encoding errors for perturbation models (Shadmehr, 2018). This possibility is in line with an fMRI study showing that the IPL part of area 7, by its influence on the cerebellum, is involved in internal model switching from prediction errors, as previously proposed in MOSAIC (Imamizu and Kawato, 2008).
A distinction with MOSAIC, however, is that in our model both the mean and the uncertainty of the perturbation estimates are updated, with expertise being due to both accurate and precise predictions. To shed light on the neural correlates of precision, model-derived precision-weighted prediction errors could be used as regressors of BOLD activity in a computational-fMRI study (for such an experiment in audiovisual learning, see Iglesias et al., 2013).
Conclusion
We proposed a new model of motor adaptation, which uses multiple precision-weighted prediction errors for memory creation, selection, and update. The model, akin to a model for visual memories proposed by Gershman et al. (2014), provides insights into a number of puzzling and contradictory experimental data on savings and error-clamp, and accounts for large qualitative individual differences. More generally, recent experiments and theories suggest that coding of precision-weighted prediction errors is used in multiple types of human memory (Friston, 2005; Henson and Gagnepain, 2010; Greve et al., 2017). Thus, our simulations and behavioral data of motor adaptation are in line with the general view of human learning according to which new memories are created when no existing memories can account for discontinuities in sensory data.
Footnotes
This work was supported by Grants NSF BCS 1031899 and R56 NS100528 to N.S. We thank Jun Izawa for helpful discussions on models of motor adaptation, and Raphael Schween, Hiroshi Imamizu, Hiroyuki Kambara, and Atsushi Takagi for valuable comments on a previous draft.
The authors declare no competing financial interests.
Correspondence should be addressed to Nicolas Schweighofer at schweigh{at}usc.edu