Abstract
When we adapt our movements to a perturbation, and then adapt to another perturbation, is the initial memory destroyed, or is it protected? Despite decades of experiments, this question remains unresolved. The confusion, in our view, is due to the fact that in every instance the approach has been to assay contents of motor memory by retesting with the same perturbations. When performance in retesting is the same as naive, this is usually interpreted as the memory being destroyed. However, it is also possible that the initial memory is simply masked by the competing memory. We trained humans in a reaching task in field B and then in field A (or washout) over an equal number of trials. To assay contents of motor memory, we used a new tool: after completion of training in A, we withheld reinforcement (i.e., reward) for a brief block of trials and then clamped movement errors to zero over a long block of trials. We found that this led to spontaneous recovery of B. That is, withholding reinforcement for the current motor output resulted in the expression of the competing memory. Therefore, adaptation followed by washout or reverse adaptation produced competing motor memories. The protection from unlearning was unrelated to sudden changes in performance errors that might signal a contextual change, as competing memories formed even when the perturbations were introduced gradually. Rather, reinforcement appears to be a critical signal that affords protection to motor memories, and lack of reinforcement encourages retrieval of a competing memory.
Introduction
When adaptation to a perturbation is followed by reverse adaptation, does the brain protect the memory that was acquired during adaptation, or do the errors during reverse adaptation continue to modify the previously acquired memory? In a typical experiment, a target is presented and the subject produces a movement that is perturbed by amount A. With training, the subject learns to respond to that target by producing motor commands that approximately cancel A. Let us call the result of this training “the motor memory for A.” Now suppose that the perturbation is changed to B. With training, in response to the same target, the brain produces motor commands that cancel B. The central question is whether this learning destroys the memory of A. Despite a half century of research (Lewis et al., 1952; McGonigle and Flook, 1978; Shadmehr and Brashers-Krug, 1997; Caithness et al., 2004; Krakauer et al., 2005), this question remains unresolved.
For example, Caithness et al. (2004) trained reaching movements in the AB paradigm and then retested in A. They found that performance was no different from naive. They wrote, “when people successively encounter opposing transformations (A then B) of the same type (e.g., visuomotor rotations or force fields), memories related to A are reactivated and then modified while adapting to B.” Indeed, there is currently little evidence for protection of A in the AB paradigm. The same question, however, has produced unequivocal results in other fields of memory research: experiments in classical conditioning (Medina et al., 2001; Stollhoff et al., 2005) suggest that memories that are produced during adaptation are protected during reverse adaptation (termed extinction). Indeed, extinction training is believed to produce a distinct memory that competes with the original (for review, see Bouton, 2002).
Why have the two fields of research arrived at different conclusions? A major difference is the method that has been used to assay memory. While in the motor learning literature, the assay of memory is savings (i.e., faster relearning), in classical conditioning, the assay is spontaneous recovery. Spontaneous recovery refers to the observation that extinction training returns performance to baseline, but with passage of time the brain reexpresses the adaptation memory in response to the stimulus. This is taken as evidence that the extinction memory masks expression of the adaptation memory, but passage of time dissolves this mask.
Spontaneous recovery has also been observed in motor learning, but its occurrence has not been viewed as evidence for protection of memories. For example, when a long period of A is followed by a brief period of B, motor output returns to baseline. However, in the following error-clamp trials in which performance errors are clamped to zero (Scheidt et al., 2000), motor output rises from baseline toward A (Smith et al., 2006; Criscimagna-Hemminger and Shadmehr, 2008; Ethier et al., 2008). One view is that this is evidence for multiple timescales of a single-context memory (Smith et al., 2006) in which training in B can destroy the memory of A (Sing and Smith, 2010). Another view is that upon transition to B, at least some component of A is protected (Lee and Schweighofer, 2009), but then the error-clamp trials make the brain uncertain regarding context, resulting in expression of both A and B.
Is the memory of A protected during learning of B? If so, how can we encourage the brain to express it? We show that when the brain is expressing one motor memory, it will switch and express another motor memory merely because of sudden reductions in probability of success. This produces spontaneous expression of a previously acquired memory. In the AB paradigm, at least part of the A memory is protected and can be retrieved through manipulation of reinforcement.
Materials and Methods
We recruited 99 neurologically intact, right-hand-dominant participants for our study (24.2 ± 4.6 years, mean ± SD, including 45 males and 54 females). All volunteers were naive to the paradigm and the purpose of the experiment. Each volunteer signed a consent form approved by the Johns Hopkins University School of Medicine Institutional Review Board. Subjects were trained in a force field (Shadmehr and Mussa-Ivaldi, 1994). They held a two-joint robotic manipulandum while making point-to-point reaching movements with their right hand. The hand was covered by a screen and visual feedback was provided by a small cursor (5 × 5 mm) representing the actual hand position. Participants reached out from a 1 cm² starting point to a 1 cm² target positioned at a distance of 10 cm directly along the body midline. The trial was considered a success if the movement was completed within 500 ± 50 ms, and the tangential velocity during the reach did not exceed 0.55 m/s or fall below 0.20 m/s. If the trial was successful, the target was animated to represent an explosion. Otherwise, there was no explosion. The explosion was the only “reinforcement.” Subsequently, the robot brought the hand back to the center starting point to indicate start of the next trial. We recorded force at the handle, as well as position and velocity of the hand at a rate of 100 Hz. Volunteers were allowed a 2 min break at intervals of 192 trials. A typical experiment lasted ∼2 h.
Error-clamp trials.
We placed error-clamp trials (Scheidt et al., 2000) randomly in the baseline and adaptation phases with one-eighth probability (no error-clamp trials were present during brief 20 trial periods, see below). Immediately following training, all groups were given a block of 364 error-clamp trials. During error-clamp trials, the motion of the hand was constrained to a straight line “channel” to the target by a stiff one-dimensional spring (spring coefficient = 2500 N/m; damping coefficient = 25 N · s/m) that counteracted forces perpendicular to the direction of the target. As in all trials, the target was animated (indicating success) if the hand reached the target within 500 ± 50 ms.
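The channel described above can be sketched as a spring-damper law acting on the lateral (perpendicular to target) axis. This is only an illustration under stated assumptions: the function name and argument names are hypothetical, and lateral position is assumed to be measured from the straight line joining start and target. Only the stiffness and damping values come from the text.

```python
def channel_force(lateral_pos_m, lateral_vel_m_s,
                  stiffness=2500.0, damping=25.0):
    """Restoring force (N) perpendicular to the target direction.

    stiffness is in N/m and damping in N*s/m, as quoted in the text.
    The sign convention (force opposes lateral deviation) is an assumption.
    """
    return -stiffness * lateral_pos_m - damping * lateral_vel_m_s


# A 2 mm lateral deviation at rest is opposed by roughly 5 N.
print(channel_force(0.002, 0.0))
```

With a spring this stiff, lateral deviations stay small, so the force the subject exerts against the channel wall can be read as the motor output in the absence of movement error.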
Our experiments were designed to answer three questions: (1) Does adaptation to a perturbation followed by an equal period of washout or reverse adaptation result in catastrophic destruction of the motor memory? (2) If motor memories are protected from unlearning, does this protection depend on a contextual cue associated with large errors that arise from a change in the perturbation? (3) Under what conditions do previously acquired motor memories show spontaneous expression?
Experiment 1.
We performed this experiment to ask whether adaptation to a perturbation followed by an equal period of washout or reverse adaptation resulted in catastrophic destruction of the motor memory. We trained four groups of subjects in protocols that are illustrated in Figure 1A. All protocols began with a null field for 192 trials during which no forces were imposed on the hand. Field A was clockwise curl in which forces on the hand were dependent on hand velocity ẋ via a viscosity matrix [0 13; −13 0] N · s/m. Field B was a counterclockwise curl. In the BNb group (n = 9), we tested whether null training could erase previous training in B. Participants were trained in B for 384 trials, followed by null for 384 trials, followed by B for 20 trials, and then error clamp for 364 trials. We compared the forces that the BNb group produced in the error-clamp trials with group Nb (n = 9). The hypothesis was that if null training produced unlearning of the memory acquired in prior training in B, then the forces that subjects produced in error-clamp trials following 20 trials in B should be identical in the BNb and Nb groups. Alternatively, if the brain protected the memory of B during the null training, perhaps it would express this protected memory following brief reexposure to B. Next, we tested this same question in a different protocol. In the BAb group (n = 9), we tested whether training in the opposite force field could destroy previous training in B. Participants were trained in B for 384 trials, followed by training in A for 384 trials, followed by B for 20 trials, and then error clamp for 364 trials. We compared the forces that the BAb group produced in the error-clamp trials with group Ab (n = 9).
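As a hedged illustration of the velocity-dependent curl field, the sketch below multiplies the viscosity matrix for field A by a hand velocity vector. The names `FIELD_A`, `FIELD_B`, and `curl_force` are hypothetical, and taking field B to be the exact negative of field A is an assumption of this sketch.

```python
# Viscosity matrix for field A in N*s/m, written row by row.
FIELD_A = ((0.0, 13.0), (-13.0, 0.0))
# Assumption: the counterclockwise field B is the negative of A.
FIELD_B = tuple(tuple(-v for v in row) for row in FIELD_A)


def curl_force(viscosity, velocity):
    """Return the force (Fx, Fy) in N for a hand velocity (vx, vy) in m/s."""
    vx, vy = velocity
    fx = viscosity[0][0] * vx + viscosity[0][1] * vy
    fy = viscosity[1][0] * vx + viscosity[1][1] * vy
    return (fx, fy)


# A straight-ahead reach at 0.4 m/s is pushed sideways in field A,
# and in the opposite direction in field B.
print(curl_force(FIELD_A, (0.0, 0.4)))
print(curl_force(FIELD_B, (0.0, 0.4)))
```

Because the matrix is antisymmetric, the force is always perpendicular to the velocity, which is what makes the field a curl.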
Experiment 2.
The results of Experiment 1 suggested protection of memories. A number of previous models have suggested that sudden large errors that occur when the perturbation is changed from one direction to the opposite direction alert the brain that the context has changed, resulting in protection of the currently active memory and spawning of a new memory (Jacobs et al., 1991; Haruno et al., 2001). Here, we performed an experiment to ask whether protection of motor memories depended on a contextual cue associated with large errors that arise from a sudden change in the perturbation. We trained four groups of subjects in protocols that are illustrated in Figure 1B. In the BgNb group (n = 9), we presented B gradually over 96 trials, maintained it at full strength for 192 trials, and then gradually returned it to null over 96 trials. The null training continued for another 384 trials, and then B was reintroduced at full strength for 20 trials, followed by error clamp for 364 trials. We compared the forces that the BgNb group produced in the error-clamp trials with group Nb (n = 9). In the BgAb group (n = 9), the gradual presentation of B was followed by training in A for 384 trials, followed by 20 trials in B, and then 364 error-clamp trials. We compared the forces that the BgAb group produced in the error-clamp trials with group Ab (n = 9).
Analysis of the data in the groups that learned B gradually demonstrated protection of the B memory. To further examine the nature of this protection, we recruited a new group of subjects and introduced field B even more gradually. In the BggNb group (n = 9), we presented B gradually over 192 trials, maintained full strength for 192 trials, and then gradually returned to null over 192 trials. The null training continued for another 192 trials, and then B was reintroduced at full strength for 20 trials, followed by 364 error-clamp trials.
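The gradual protocols can be sketched as a per-trial field strength expressed as a fraction of full strength. The text does not state the shape of the ramp, so the linear ramp used here is an assumption, and `gradual_schedule` is a hypothetical helper name.

```python
def gradual_schedule(ramp_up, hold, ramp_down):
    """Per-trial field strength (0 to 1) for a ramp-hold-ramp protocol.

    Assumption: strength changes linearly, by an equal step on every trial.
    """
    up = [(i + 1) / ramp_up for i in range(ramp_up)]
    plateau = [1.0] * hold
    down = [1.0 - (i + 1) / ramp_down for i in range(ramp_down)]
    return up + plateau + down


bg = gradual_schedule(96, 192, 96)     # trial counts of the BgNb and BgAb groups
bgg = gradual_schedule(192, 192, 192)  # trial counts of the BggNb group
print(len(bg), len(bgg))
```

Under this assumption, the largest trial-to-trial change in field strength is 1/96 of full strength in the gradual groups and 1/192 in the more gradual group.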
Experiment 3.
In Experiments 1 and 2, we observed that 20 B trials produced spontaneous recovery of the previously acquired B memory, i.e., revisiting B produced partial recall of the previously acquired B memory, despite the intervening washout and reverse adaptation. What were the cues that encouraged the brain to express a previously protected motor memory in the error-clamp block? Initially, we imagined that perhaps the critical cue was the fact that the forces in 20 B trials were the same as the forces in the initial B training. That is, perhaps the forces that subjects experienced in the 20 B trials acted as a cue that produced expression of the previously acquired B memory. To check for this, we trained subjects in ANb. In the ANb group (n = 9), we presented A for 384 trials, followed by null for 384 trials, 20 B trials, and then 364 error-clamp trials (Fig. 1C). We compared the forces that the ANb group produced in error-clamp trials with the Nb group (n = 9). If the critical cue for expression of a memory was similarity between the perturbation forces in the acquisition and reexposure periods, then subjects should not express A in the error-clamp trials after brief exposure to B. In fact, we found that the subjects expressed some of the A memory in the error-clamp trials.
Experiment 4.
One way to account for the results of the above experiments is to imagine that the sudden changes in movement error and probability of success that accompanied the 20 B trials made the brain uncertain regarding which motor output was appropriate, A, B, or null. Perhaps it was this uncertainty that resulted in expression of the previously acquired memory. To test this idea, we trained subjects in B, and then after washout, attempted to make our subjects uncertain through artificially manipulating probability of error and/or probability of success. After we manipulated this uncertainty during 20 trials, we then presented them with the usual error-clamp block. In the BgNR group (n = 9), we presented B gradually over 96 trials, maintained it at full strength for 192 trials, and then gradually returned it to null over 96 trials (Fig. 1D). Following 96 null trials, a random field was presented for 20 trials. This random field consisted of 7 trials of A, 7 trials of B, and 6 null trials, randomly interspersed. This was followed by 364 error-clamp trials. We compared the forces that the BgNR group produced in the error-clamp block with the BgN group (n = 9). The BgN group did not receive the random field before the error-clamp trials.
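The 20-trial random field could be generated along the following lines. This is a minimal sketch: the labels, the function name, and the use of a seeded shuffle are illustrative choices and not a description of the software used in the experiment.

```python
import random


def random_field_block(n_a=7, n_b=7, n_null=6, seed=None):
    """Return a shuffled list of trial labels: 'A', 'B', or 'N' (null)."""
    rng = random.Random(seed)
    trials = ["A"] * n_a + ["B"] * n_b + ["N"] * n_null
    rng.shuffle(trials)  # randomly intersperse the three trial types
    return trials


block = random_field_block(seed=1)
print(block)
```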
To dissociate whether uncertainty arose from sudden changes in the probability of error versus sudden changes in the probability of success, we considered a final group of subjects. In the BgNS group (n = 9), we presented B gradually over 96 trials, maintained it at full strength for 192 trials, and then gradually returned it to null over 96 trials (Fig. 1D). After an additional 96 null trials, we presented 20 error-clamp trials but withheld reinforcement (target explosions) even if velocity and performance time were within the acceptable limits. This was followed by 364 error-clamp trials with the usual success requirements.
Data analysis.
Performance was measured via the force that subjects produced against the channel wall of the error-clamp trials. The force output as a percentage of perturbation was calculated via the ratio of the actual force output, as measured at maximum velocity in error-clamp trials, to the ideal force required to cancel the perturbation at that velocity. During the 20 trials of B, no error-clamp trials were given. To measure performance in these trials, the perpendicular displacement from a straight line to the target at maximum velocity was calculated for each movement. This served as the proxy for movement error in non-error-clamp trials. Repeated-measures ANOVA and post hoc Tukey's test were used to confirm that all groups reached equivalent levels of adaptation during force field trials. Two-tailed t tests were used to quantify the differences in initial bias of the error-clamp block, and to evaluate the bias in average motor output between groups for the last 100 trials of the error-clamp block. All analyses were done using Matlab 7.0.4 and SPSS.
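The force measure can be illustrated with a short sketch. It assumes that the ideal force is the field gain (13 N · s/m) multiplied by the hand velocity at the moment of maximum velocity, and that measured and ideal forces share a sign convention. The function name is hypothetical.

```python
def percent_compensation(measured_force_n, peak_velocity_m_s, gain=13.0):
    """Force against the channel wall as a percentage of the ideal force.

    gain is the curl-field viscosity in N*s/m. The ideal force is the
    force that would exactly cancel the field at the given velocity.
    """
    ideal_force = gain * peak_velocity_m_s
    return 100.0 * measured_force_n / ideal_force


# A subject pushing with 2.6 N at a peak velocity of 0.4 m/s compensates
# for about half of the perturbation.
print(percent_compensation(2.6, 0.4))
```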
We performed a bootstrapping procedure to estimate the strength of memories of A and B that were expressed during error-clamp trials (Criscimagna-Hemminger and Shadmehr, 2008). For example, to estimate how null affected the previously acquired memory of B, we randomly selected one subject from the BNb group (with replacement) and another from the Nb group (with replacement) and found the difference in force output (percentage perturbation) for each trial during the error-clamp block. In other words, the B that remained and was expressed after the null field training is B̂ = BNb − Nb. This is the assay of the memory that was protected from destruction. We repeated this subtraction 100 times, randomly selecting subjects from each group each time, and established a distribution for B̂. Similarly, the memory for Bg that remained after subjects experienced a null field was B̂g = BgNb − Nb. To account for any differences in the 20 trials of B following A training, the Ab group served as the control to assay the B that remained and was expressed after the A field training.
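The subtraction procedure can be sketched as follows. The data layout (one list of per-trial force values per subject) and the function names are assumptions made for illustration, and the example data are synthetic placeholders, not experimental results.

```python
import random


def bootstrap_difference(group_a, group_b, n_boot=100, seed=None):
    """Resample one subject per group (with replacement) and subtract.

    group_a, group_b: lists of per-subject error-clamp traces, each a
    list of force values (percentage of perturbation) of equal length.
    Returns n_boot difference traces (group_a minus group_b).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        a = rng.choice(group_a)
        b = rng.choice(group_b)
        diffs.append([x - y for x, y in zip(a, b)])
    return diffs


def mean_trace(traces):
    """Trial-by-trial mean across a list of traces."""
    return [sum(col) / len(traces) for col in zip(*traces)]


# Synthetic placeholder data: 9 subjects per group, 5 trials each.
bnb = [[-30.0] * 5 for _ in range(9)]
nb = [[-10.0] * 5 for _ in range(9)]
b_hat = bootstrap_difference(bnb, nb, n_boot=100, seed=0)
print(mean_trace(b_hat))
```

The spread of the resampled difference traces at each trial gives the distribution from which the expressed memory can be judged.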
Model.
We compared the predictions of two previously published models of motor adaptation: a model that allowed erasure of memories (Smith et al., 2006) and a model that used sudden errors to protect memories (Lee and Schweighofer, 2009). Both models are multirate, multiple-timescale formulations that allow us to compute the expected patterns of spontaneous recovery in error-clamp trials. Both use the following error equation to drive motor adaptation. For force-field trials,

$$e(n) = f(n) - x(n) \tag{1}$$

In the above equation, $e(n)$ is the prediction error on trial $n$, $f(n)$ is the perturbation force, and $x(n)$ is motor output. In error-clamp trials, the perturbation is equal and opposite to the force produced by the subject, so that $e(n) = 0$. Adaptation in the Smith et al. (2006) model is achieved by two internal states, one fast process that adapts quickly but has poor retention and one slow process that learns slowly but has better retention. The update equations for the net motor output are given by the following:

$$x_f(n+1) = a_f\,x_f(n) + b_f\,e(n), \qquad x_s(n+1) = a_s\,x_s(n) + b_s\,e(n), \qquad x(n) = x_f(n) + x_s(n) \tag{2}$$

The learning rates for the fast and slow states are $1 > b_f > b_s$ and the forgetting rates for these states are $1 > a_s > a_f$.
In the above model (Eq. 2), both the fast and the slow processes are updated by the same prediction error. In this model, memories are not protected. In contrast, in the Lee and Schweighofer (2009) model, there is one fast state and many slow states. The slow states are selected based on contextual cues:

$$x_f(n+1) = a_f\,x_f(n) + b_f\,e(n), \qquad \mathbf{x}_s(n+1) = a_s\,\mathbf{x}_s(n) + b_s\,e(n)\,\mathbf{c}(n), \qquad x(n) = x_f(n) + \mathbf{x}_s(n)^{T}\mathbf{c}(n)$$

Here, $\mathbf{x}_s(n)$ is the vector of slow states and $\mathbf{c}(n)$ is the context vector. The contextual cue is switched based on large errors. This allows for protection of slow memories. In error-clamp trials, there is no contextual cue to allow for explicit memory selection, so the value of each element of the context vector is set to $1/m$, where $m$ is the number of contexts. The parameters for both models were set to those given in Joiner and Smith (2008): $a_f = 0.85$, $a_s = 0.998$, $b_f = 0.11$, and $b_s = 0.021$.
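To make the single-context model concrete, the following sketch simulates a fast and a slow state driven by the same prediction error, using the parameter values quoted above. Several details are assumptions made for illustration: perturbations are expressed in units of full field strength with A = +1 and B = −1, the error is set to zero on error-clamp trials, the error-clamp trials interspersed during training are ignored, and all names are hypothetical.

```python
# Parameters quoted in the text (Joiner and Smith, 2008).
A_F, B_F = 0.85, 0.11     # fast state: retention factor, learning rate
A_S, B_S = 0.998, 0.021   # slow state: retention factor, learning rate

# Illustrative sign convention, in units of full field strength.
NULL, FIELD_A, FIELD_B = 0.0, 1.0, -1.0


def simulate_two_state(blocks, n_clamp=364):
    """Run training blocks, then return motor output in the error clamp.

    blocks: list of (perturbation, n_trials) pairs.
    Returns the net output x(n) on each of n_clamp error-clamp trials.
    """
    x_f = x_s = 0.0
    for f, n_trials in blocks:
        for _ in range(n_trials):
            e = f - (x_f + x_s)          # prediction error
            x_f = A_F * x_f + B_F * e    # fast state update
            x_s = A_S * x_s + B_S * e    # slow state update
    clamp_output = []
    for _ in range(n_clamp):
        clamp_output.append(x_f + x_s)
        # Error is zero in the clamp, so both states only decay.
        x_f, x_s = A_F * x_f, A_S * x_s
    return clamp_output


ab_clamp = simulate_two_state([(NULL, 192), (FIELD_A, 384), (FIELD_B, 20)])
bab_clamp = simulate_two_state(
    [(NULL, 192), (FIELD_B, 384), (FIELD_A, 384), (FIELD_B, 20)]
)
largest_gap = max(abs(p - q) for p, q in zip(ab_clamp, bab_clamp))
print(f"Ab: first clamp trial {ab_clamp[0]:+.2f}, peak {max(ab_clamp):+.2f}")
print(f"largest |BAb - Ab| in the clamp: {largest_gap:.4f}")
```

In this sketch, output at the start of the clamp lies on the B side and then rebounds toward A as the fast state decays. Adding a long block of B before the long block of A leaves the clamp output nearly unchanged, which is the sense in which this model does not protect memories.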
Results
Our first aim was to determine whether adaptation followed by an equal period of washout or reverse adaptation resulted in catastrophic destruction of a motor memory. Experiment 1 (Fig. 1A) was designed to answer this question. Let us begin with some simulations to illustrate how the patterns of spontaneous recovery, i.e., motor output in the error-clamp block, should be affected if memories are protected from unlearning.
Consider a training protocol in which a long period of adaptation in A (∼400 trials) is followed by a brief period (20 trials) of adaptation in B, where B = −A. This training is then followed by error-clamp trials in which movement errors are clamped to zero (Ab paradigm, left column of Fig. 2A). To simulate learning, we considered two existing models. The first model (Fig. 2B) assumed a single context (Smith et al., 2006) in which errors always produced learning/unlearning. The second model assumed multiple contexts (Lee and Schweighofer, 2009) in which sudden errors produced a contextual change that protected a component of the currently activated memory from unlearning (Fig. 2C). Both models assumed that changes in motor output are due to a fast adaptive process that learned strongly from error but had poor retention, and a slow adaptive process that learned weakly from error but had strong retention. The multiple-context model further assumed that the slow component of the memory (but not the fast component) was contextual: a sudden change in error signaled a change in context, resulting in deactivation of the slow trace and instantiation of a new slow trace.
As the left column of Figure 2, B and C, illustrates, both models predict spontaneous recovery in the Ab paradigm. At the end of Ab training, in the single-context model, there is a slow memory of A and a fast memory of B. In the error-clamp block, the different rates of decay of the fast and slow processes produce spontaneous recovery of A. In contrast, in the multiple-context model, the sudden errors that occur in the transition from A to B signal a contextual change. This contextual change deactivates slow A (i.e., it no longer contributes to output) and protects it from unlearning, while activating a slow B. The multiple-context model further assumes that the transition from B to error-clamp trials causes reactivation of A so that both the slow B and slow A are present in the error-clamp trials. The important idea is that whereas both models account for the rise of motor output from baseline toward A in the error-clamp trials, they do so with very different interpretations: the single-context model explains that this rise is due to passive decay of currently activated memories, whereas the multiple-context model explains this pattern as a consequence of reactivation of previously inactive and protected memory of A.
In previously published work, the Ab paradigm indeed produced rise of motor output from baseline toward A followed by a gradual decline (Smith et al., 2006; Criscimagna-Hemminger and Shadmehr, 2008; Ethier et al., 2008). As the above simulations show, both a single-context model that allows for erasure, and a multiple-context model that protects memories can account for this pattern. However, a simple experiment can dissociate between these two models. Consider a training protocol in which a long period of training in B precedes the Ab training (BAb paradigm, right column Fig. 2A). In this scenario, the single-context model predicts that because the length of training in B is equal to the subsequent training in A, the fast and slow memories that are produced by B are transformed to fast and slow memories for A; i.e., A destroys B (Zarahn et al., 2008). As a result, in the single-context model the motor output in the error-clamp block is identical in the BAb and Ab paradigms (Fig. 2D). In contrast, in the multiple-context model the slow B is protected from error-dependent learning during training in A, but is then expressed in the error-clamp block. Consequently, during the error-clamp block the motor output in BAb is biased toward B as compared to Ab (Fig. 2D). In summary, if memories are protected, then we should see that the motor output in the error-clamp block in BAb is biased toward B as compared to Ab. We focus on the bias that persists throughout the error-clamp block as evidence of multiple memories, as this assay should indicate retention of the slow memories without contamination from any fast learning or switching that occurs at the transitions.
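The prediction of a bias toward B under memory protection can be illustrated with a simplified multiple-context sketch. It is not the published Lee and Schweighofer (2009) implementation. The sketch assumes one slow state per field (null, A, B), assigns the active context by hand from the block label instead of inferring it from errors, lets inactive slow states decay at the slow retention rate, and weights every slow state by 1/m in the error clamp. Perturbations use the illustrative convention A = +1, B = −1, and all names are hypothetical.

```python
A_F, B_F = 0.85, 0.11     # fast state: retention factor, learning rate
A_S, B_S = 0.998, 0.021   # slow states: retention factor, learning rate

CONTEXTS = ("N", "A", "B")
FIELD = {"N": 0.0, "A": 1.0, "B": -1.0}  # illustrative sign convention


def simulate_multi_context(blocks, n_clamp=364):
    """Return clamp output for a list of (context_label, n_trials) blocks."""
    x_f = 0.0
    x_s = {c: 0.0 for c in CONTEXTS}
    for label, n_trials in blocks:
        for _ in range(n_trials):
            # Only the slow state of the active context is expressed.
            e = FIELD[label] - (x_f + x_s[label])
            x_f = A_F * x_f + B_F * e
            for c in CONTEXTS:
                # Only the active slow state learns; the others just decay.
                x_s[c] = A_S * x_s[c] + (B_S * e if c == label else 0.0)
    clamp_output = []
    for _ in range(n_clamp):
        # No contextual cue in the clamp: each slow state is weighted 1/m.
        clamp_output.append(x_f + sum(x_s.values()) / len(CONTEXTS))
        x_f *= A_F
        for c in CONTEXTS:
            x_s[c] *= A_S
    return clamp_output


ab = simulate_multi_context([("N", 192), ("A", 384), ("B", 20)])
bab = simulate_multi_context([("N", 192), ("B", 384), ("A", 384), ("B", 20)])
late_ab = sum(ab[-100:]) / 100
late_bab = sum(bab[-100:]) / 100
print(f"mean of last 100 clamp trials: Ab {late_ab:+.3f}, BAb {late_bab:+.3f}")
```

Under these assumptions, the slow B state acquired in the first block survives training in A, so the late error-clamp output in BAb sits closer to B than in Ab.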
The organization of the experiments is as follows: In Experiment 1, we will show that in the BAb and similar paradigms, motor output in the error-clamp trials is biased toward B, suggesting protection of B during adaptation to A. In Experiment 2, we will show that protection of B is unrelated to sudden errors that might signal a contextual change, raising doubt about models in which contextual change is based on kinematic errors. In Experiments 3 and 4, we will show that spontaneous recovery, i.e., expression of a previously acquired memory in error-clamp trials, is an active process of recall and not passive decay of an already active motor memory. Finally, we will show that this active recall is associated with withholding of reinforcement for a current motor output, resulting in the retrieval of a previously acquired memory.
Experiment 1: memories are protected from unlearning
The design of this experiment is shown in Figure 1A. To determine whether the memory of B is protected during subsequent training in A, we compared the forces that subjects produced in the error-clamp block in the BAb and Ab groups. We noted that by the end of training in A, performance of the Ab and BAb groups was indistinguishable (comparison of last five error-clamp trials in A, F(1,16) = 0.041, p > 0.5). However, in the error-clamp block the motor output in the BAb group was biased toward B as compared to Ab (Fig. 3A). For example, the forces on the first error-clamp trial were significantly more biased toward B in BAb than Ab (t test, p = 0.005). Furthermore, the forces reached a much lower plateau in the BAb group as compared to the Ab group (average of last 100 trials, t test, p < 0.001). In comparing the BAb and Ab groups, the motor output in the error-clamp block is similar to the predictions of the multiple-context model (right column, Fig. 2C), suggesting that memory of B was protected during training in A, and then reexpressed in the error-clamp block.
To determine whether the memory of B could be destroyed by subsequent training in null, we compared performance in the error-clamp block in the BNb group versus the Nb group. We noted that by the end of the null trials, performance of the Nb and BNb groups was indistinguishable (comparison of the last five error-clamp trials in null, F(1,16) = 0.121, p > 0.5). However, in the error-clamp block the motor output in the BNb group was biased toward B as compared to Nb (Fig. 3B). For example, the forces on the first error-clamp trial were significantly more negative in BNb than Nb (t test, p = 0.005). This bias, however, vanished beyond the 50th trial of the error-clamp block. Therefore, training in null appeared to have a greater effect on memory of B than training in A. [It is interesting that in the Nb group, 20 trials of B are sufficient to produce a memory that does not decay to zero even after 300 error-clamp trials. We have observed this consistently, regardless of whether the 20 trials are in a clockwise or a counterclockwise field (Keisler and Shadmehr, 2010).] Together, the patterns of motor output during the error-clamp block of Experiment 1 suggested that during adaptation to A or washout in null, the previously acquired memory of B was at least partially protected. Brief reexposure to B produced reexpression of the previously acquired B memory.
Experiment 2: protection despite paucity of sudden errors to signal a contextual change
One possibility is that in Experiment 1, the large movement errors that accompanied introduction of B, or the large errors that accompanied transition to null, acted as context cues that facilitated protection of B. Indeed, such kinematic errors are the basis for contextual change in theoretical models (Jacobs et al., 1991; Haruno et al., 2001; Doya et al., 2002). Without the large errors to mark a change in context, the theoretical models predict that memories will be erased. Furthermore, expression of the B memory in the error-clamp trials may be due to the fact that the errors induced by the 20 B trials were similar to the initial errors experienced during adaptation to B. Thus, we performed Experiment 2 (Fig. 1B) to ask two questions: whether protection of the B memory required sudden change in errors to signal a change in context, and whether recall of this memory relied on cues that were error dependent.
In the BgAb group, field B was introduced gradually, and then after a period of constant perturbation, was gradually returned to null, following which A was introduced abruptly (Fig. 1B). This was followed by 20 B trials and then a long sequence of error-clamp trials. We imagined that if formation of the B memory required a sudden perturbation to “label” it, or if its recall during reexposure (20 B trials) required a similarity between errors during initial learning and reexposure, then the forces in the error-clamp block would be similar in BgAb and Ab. Instead, we found a strong bias toward B in the BgAb group as compared to Ab. For example, the forces on the first error-clamp trial were more negative in BgAb than in Ab (t test, p = 0.004). Furthermore, the forces reached a much lower plateau in the BgAb group as compared to the Ab group (average of last 100 trials, t test, p < 0.001). These results suggested two ideas: (1) that a sudden change in error was not required for establishing a motor memory that could be protected, and (2) that recall of a motor memory did not require errors during reexposure that were similar in magnitude to errors experienced during acquisition. This last point is crucial, as it suggests that expression of B in the error-clamp block is not based on a comparison between errors experienced during acquisition and retesting. Finally, because the forces had a lower plateau in the BgAb group than in the BAb group, it would appear that a memory that is acquired without sudden errors (gradual B) is more resistant to destruction than a memory that is acquired with sudden errors (abrupt B) (Huang and Shadmehr, 2009).
The transition from B to A in the above experiment was abrupt, inducing sudden errors. Is this abrupt transition crucial for protection of the B memory? To check for this crucial assumption of multiple-context models, we considered performance of the BgNb group in comparison to Nb group (Fig. 1B). In the BgNb group, the initial B memory was acquired without sudden changes in error, and its transition to null was also without sudden changes. After the brief reexposure to B, we again found a strong bias toward B in the BgNb group versus the Nb group. For example, the forces on the first error-clamp trial were more negative in BgNb than in Nb (t test, p = 0.030).
In summary, results of Experiment 2 suggested three ideas. First, the protection of the B memory did not rely on a sudden change in errors that may have signaled a change in context. Second, the recall of the B memory did not rely on cues such as error size that might be shared in initial exposure and reexposure. This implies that sudden movement errors were not necessary to contextually label a memory so that it could be protected or later recalled. Finally, gradual adaptation to B produced a motor memory that was more resistant to subsequent training in A or null (as compared to abrupt adaptation to B), as evidenced by a stronger bias toward B in Experiment 2 than in Experiment 1.
Expression of memories in error-clamp trials
While the results from Experiments 1 and 2 suggest that the memory of B was not destroyed by subsequent training in null or A, it is useful to quantify how much of this memory was expressed in the error-clamp trials. We estimated expression of the B memory in the error-clamp block using a bootstrapping method. For example, to quantify B that was expressed in the BNb group, we subtracted the forces produced by the Nb group from the BNb group, i.e., B̂ = BNb − Nb (Fig. 3E). This analysis was critical to determine the contribution of the slow memory of B that was retained and reexpressed, over and above the bias induced by the 20 trials of B exhibited in the Nb condition. The results suggested that while in all experiments a significant amount of B memory was expressed in the error-clamp trials, there was a trend toward stronger expression in the BAb and BgAb groups (calculated by BAb − Ab) than in the BNb and BgNb groups. That is, somewhat surprisingly, the expression of the B memory was more affected by the null washout trials than by adaptation to the opposite perturbation.
Sudden change in probability of success as a possible contextual cue
Figure 4A plots the movement errors in the groups that were abruptly introduced to B and then transitioned abruptly to null. The largest trial-to-trial change in error occurred when B was introduced (∼22 mm), and when null was reintroduced (∼25 mm). In comparison, gradual introduction of B and gradual reintroduction of null produced trial-to-trial changes that were no larger than 3 and 6 mm, respectively (these errors occurred following set breaks). If a sudden change in movement error signals a contextual change, in the gradual condition these cues were less available. Yet, protection and expression of the B memory were more robust in the gradual condition than in the abrupt condition (BgNb vs BNb, Fig. 3). Therefore, it seems unlikely that sudden changes in movement errors act as contextual cues.
Another source of information that can signal a contextual change is probability of success. During initial null field training, all abrupt and gradual BNb conditions displayed comparable levels of success rate (F(2,42) = 1.546, p = 0.225). In the abrupt condition, the probability of a successful trial dropped sharply at the onset of B (Fig. 4A). Interestingly, the probability of success also dropped significantly in the gradual condition (Fig. 4B). For example, when field B was at 25% of full strength, the probability of success had dropped by >80%. Therefore, while in the gradual condition there were no sudden changes in performance errors, the gradual accumulation of the small errors and the nonstationary nature of the environment led to substantial reductions in reinforcement.
Perhaps this decrease in probability of success in the gradual condition was because we increased the strength of the field too quickly (over ∼100 trials), leading to larger trial-to-trial variance and displacement error. That is, perhaps the gradual B in Experiment 2 was not gradual enough. To check for this, we recruited a new group of subjects for a paradigm in which the perturbation was introduced very gradually, over twice as many trials as before (the BggNb group, as illustrated in Fig. 4C, top row, in which the field reached full strength after 192 trials). Once again, we observed a large drop in the probability of success, despite the fact that the perturbation only produced a minimal increase in movement errors. For example, when the perturbation had reached 25% full strength, movement errors had increased by about 3 mm from baseline, but probability of success had dropped by 75%. After washout in null, this new ultra-gradual group also exhibited strong expression of B in the error-clamp block: forces were strongly biased toward B in the BggNb group as compared to Nb (t test, p = 0.020). In summary, in the gradual condition, we observed small incremental increases in movement errors, but much sharper declines in probability of success. It is possible that a large change in probability of success acted as a cue that signaled a contextual change, initiating a search for better motor commands (Izawa and Shadmehr, 2011).
Let us now consider the events that took place immediately before the error-clamp block. Figure 4D displays the movement errors and probability of success in the various groups that experienced null, then brief exposure to B, and then the error-clamp block. In the 20 B trials, movement errors suddenly increased and were then eliminated by the transition to the error-clamp block. Similarly, probability of success suddenly decreased, and then recovered. Therefore, one of the critical events that took place in the 20 B trials was that previously reinforced motor commands (in null or A) were no longer reinforced. It is possible that this withholding of reinforcement for a current motor memory resulted in the expression of the competing motor memory. We will test this idea directly in Experiments 3 and 4.
Experiment 3: spontaneous recovery of a motor memory following a sudden change in performance errors
Why is it that the B memory is being expressed in the error-clamp trials? Is it because the 20 B trials are in the same field as the B that was experienced before? Or is it that the sudden introduction of movement errors and change in probability of success that takes place in the 20 B trials encourage a switching from expression of one memory to another? To decide between these possibilities, we consider a scenario in which the errors that came before the error-clamp block were unrelated to the errors that were experienced during acquisition of the memory. In the ANb group (Fig. 1C), training in A was followed by a long period of training in null, and then 20 trials in B. During these 20 trials, the brain will experience a sudden decline in performance. That is, previously reinforced motor commands (appropriate for null) will no longer be reinforced. Will the 20 trials in B produce spontaneous recovery of A?
By the end of the null trials, performances in the ANb and Nb groups were indistinguishable (average of last five error-clamp trials, F(1,16) = 0.108, p > 0.5). Furthermore, as Figure 5A illustrates, performance of these two groups was indistinguishable during the 20 B trials (perpendicular displacement, F(1,16) = 0.80, p > 0.5). Therefore, during the training in B, there was no evidence of prior training in A in the ANb group. Finally, the forces in the first trial of the error-clamp block were indistinguishable between ANb and Nb (t test, p = 0.398). Remarkably, as the trials in the error-clamp block continued, the ANb group produced forces that became biased toward A (average of last 100 trials, t test, p < 0.001). We performed a bootstrap analysis to quantify expression of A in the error-clamp trials: Â = ANb − Nb. The results (Fig. 5C) demonstrated that the A memory exhibited spontaneous recovery during the error-clamp block, despite the fact that this block occurred hundreds of trials after acquisition of A, and was preceded by training in B. Along with Experiment 2, the data in Figure 5C suggest that the mere presence of sudden errors and/or sudden changes in probability of success produces spontaneous expression of a previously acquired and presumably deactivated motor memory.
Experiment 4: spontaneous recovery of a motor memory following withholding of reward for the competing memory
In the BgNR group (Fig. 1D, top plot), the null training was followed by 20 trials in which the field was random on any given trial (A, B, or null). As a control, we considered the BgN group in which B was introduced gradually and removed gradually, followed by washout in null and then error-clamp trials. Note that in the BgN group, the null training directly leads to the error-clamp trials. Therefore, the BgN group experiences no sudden change in error and/or probability of success before the error-clamp block. In contrast, in the BgNR group, 20 high-error and low-success trials immediately precede the error-clamp block. Indeed, in the BgNR group, we observed robust expression of the B memory in the error-clamp block (Fig. 6A) (average of last 100 trials, t test, p < 0.001). Note that B was learned gradually and without large errors, yet it was reexpressed after washout when the subjects encountered a sequence of random large errors. In comparison, the forces produced by the BgN group in error-clamp trials were indistinguishable from zero (first 250 trials, F(1,16) = 0.617, p > 0.4). Therefore, the error-clamp block by itself was not sufficient to produce expression of a previously acquired memory. Rather, a small number of trials in which there were large errors and low probability of success produced a condition in which a previously acquired memory showed spontaneous recovery.
In the BgNR group, the random condition consisted of a number of trials in which field B was present. It is possible that expression of B was due to occasional presence of this perturbation immediately before the error-clamp trials. The alternate hypothesis is that the brain expressed B because the motor memory for null was no longer producing a rewarding outcome in the random field. Our final experiment was designed to test the idea that the brain switched between motor memories merely because of sudden changes in probability of success.
In the BgNS group, the null trials were followed by 20 no-success error-clamp trials in which regardless of the movement, target explosions were withheld (schematic in Fig. 1D, success probabilities in Fig. 6B). These no-success error-clamp trials were followed by the usual error-clamp block. Before the no-success trials, motor output was comparable between the BgNS and BgN groups (Fig. 6A, open circles). However, as reinforcement was withheld, subjects in the BgNS group began producing forces appropriate for B. Indeed, trial after trial, the withholding of reinforcement encouraged greater expression of B. By the 23rd trial, expression of B was similar in the BgNS and BgNR groups. As the error-clamp trials continued, the motor output in the BgNS group continued to be biased toward B as compared to the BgN group (all 364 error-clamp trials, t test, p < 0.001). Therefore, withholding of reinforcement during expression of the null field memory produced spontaneous recovery of the memory for B.
In summary, Experiments 3 and 4 demonstrated that sudden removal of reinforcement encouraged expression of a previously acquired motor memory. This suggests that in the BAb, BNb, and similar experiments in which memory of B was spontaneously expressed in error-clamp trials, a critical factor was that the current motor commands (A or null) suddenly became unsuccessful in acquiring reinforcement. This sudden change in probability of success encouraged expression of a previously acquired memory, i.e., the memory of B.
Discussion
In numerous experiments, people have adapted their movements to perturbation B, and then adapted to the opposite perturbation A. To determine whether adaptation to A destroyed the memory of B, they were retested in B. When the temporal distance between the training episodes was zero, as in experiments here, performance in retest was usually no different from naive (Lewis et al., 1952; Flook and McGonigle, 1977; Shadmehr and Brashers-Krug, 1997; Bock et al., 2001; Caithness et al., 2004; Krakauer et al., 2005; Overduin et al., 2006), suggesting catastrophic interference. Here, we found that BA training produced two competing memories. When motor output associated with one memory was denied reinforcement, the competing memory was retrieved.
In Experiment 1, subjects trained in B and then in null or A, followed by brief exposure to B. We found that in BAb and BNb, motor output during the error-clamp block was biased toward B as compared to Ab and Nb, respectively. Therefore, adaptation followed by deadaptation did not result in catastrophic destruction. In Experiment 2, we asked whether protection of a memory required sudden errors to mark a context change. We found that despite gradual presentation of the perturbation, forces in the error-clamp block were biased toward B. Therefore, protection was not based on contextual change signaled by large kinematic errors. In Experiment 3, we trained subjects in A and then, after washout, presented 20 B trials. The sudden errors in B produced spontaneous recovery of A. Therefore, spontaneous recovery was an active process of retrieval and not passive decay of an already active memory. Finally, in Experiment 4, we found that random errors could produce spontaneous recovery. Most interestingly, we found that when we trained in B and then denied reinforcement following washout, the brain retrieved the memory of B. Therefore, when current motor commands produced the expected kinematic outcome but were unrewarded, the brain expressed another set of motor commands that were previously rewarded.
The multiple components of motor memory
When one learns to produce motor commands that compensate for a perturbation, and then that perturbation is removed, what prevents erasure of the motor memory? There are a number of computational models of learning in which the system is composed of multiple modules, each an expert with a forward model that predicts behavior in a particular context of the environment, paired with an inverse model or controller that produces motor commands (Wolpert and Kawato, 1998; Haruno et al., 2001; Doya et al., 2002). The forward and inverse models are tightly coupled during acquisition and use. Importantly, switching between modules takes place due to a responsibility selector that assigns credit to each module based on the accuracy of predictions made by its forward model. Such models produce protection of acquired memories when there are sudden large errors in behavior. Our results appear inconsistent with these models: first, we found that motor memories were protected even when perturbations were introduced gradually, preventing large errors. Second, we found that the brain switched from expressing one memory to another merely because current motor commands were not acquiring reward, despite paucity of kinematic errors.
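To make concrete why such multiple-module models depend on large errors for switching, the responsibility selector can be sketched as a normalized Gaussian likelihood of each module's prediction error. This is a simplified illustration in the spirit of the cited models, not a reimplementation of any of them: the two modules, their predicted displacements, and the noise scale `sigma` are hypothetical values chosen for the example. The observed displacements (about 22 mm and 3 mm) echo the abrupt and gradual trial-to-trial changes reported in the Results.

```python
import numpy as np


def responsibilities(observed, predictions, sigma=1.0):
    """Normalized Gaussian likelihoods of each module's prediction error."""
    errors = observed - np.asarray(predictions, dtype=float)
    log_lik = -0.5 * (errors / sigma) ** 2
    log_lik -= log_lik.max()  # subtract the max for numerical stability
    weights = np.exp(log_lik)
    return weights / weights.sum()


# Hypothetical forward-model predictions of hand displacement (mm):
# index 0 = "null" module predicts 0 mm, index 1 = "B" module predicts 20 mm.
predictions = [0.0, 20.0]

# Abrupt onset: a ~22 mm displacement is far better explained by the B module.
abrupt = responsibilities(22.0, predictions, sigma=5.0)

# Gradual onset: a ~3 mm displacement stays well explained by the null module.
gradual = responsibilities(3.0, predictions, sigma=5.0)
```

In this sketch the abrupt error hands nearly all responsibility to the B module, whereas the small error in the gradual case leaves it with the null module, so a selector of this kind has no signal with which to set aside and protect a separate memory when errors remain small.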
A possible approach is to change the Lee and Schweighofer (2009) model so the switch between slow states is based on probability of success, rather than large performance error. In the resulting model, slow states learn from performance errors, but contextually switch based on probability of success. However, as this learning depends entirely on performance errors, we cannot account for the fact that during adaptation, repeated reinforcement of a movement produces a memory independent of the memory produced via error-dependent learning (Huang et al., 2011; Orban de Xivry et al., 2011).
We approach our problem by considering a different model of motor memory. Suppose the process of generating a movement involves two computations: one that transforms a target state xt into motor commands u, i.e., a control policy, possibly in the motor cortex, and one that transforms motor commands into predicted sensory consequences x̂, i.e., a forward model, possibly in the cerebellum (Shadmehr and Krakauer, 2008). Upon exposure to an abrupt perturbation B, sensory prediction errors produce adaptation of the forward model, a process that depends on the cerebellum (Synofzik et al., 2008). At this early stage of learning, motor commands improve not because the controller has changed, but because commands are corrected via internal feedback through the forward model (Chen-Harris et al., 2008). This accounts for the fact that early in training, despite large improvements in performance, there is little or no change in the motor cortex (Paz et al., 2003), and disruption of the motor cortex does not affect the initial rapid phase of adaptation (Orban de Xivry et al., 2011). As training continues, certain motor commands repeat and are reinforced by success. This reinforced repetition produces a distinct motor memory (Diedrichsen et al., 2010; Huang et al., 2011; Verstynen and Sabes, 2011) that depends on the motor cortex (Orban de Xivry et al., 2011), producing plasticity in the controller so that motor outputs are associated with reward (Fig. 7B). The motor command in any given trial is the one that is most likely to be rewarded (the mode of this distribution). At the end of B training, we have acquired a new forward model and controller (Figs. 7B,C).
When the perturbation changes to A, previously successful motor commands are no longer successful. This encourages a search for new motor commands (Izawa and Shadmehr, 2011) and possible disengagement of the controller representing B. The critical hypothesis is that at the end of BA training, we have multiple motor commands associated with success (Fig. 7B), with the mode reflecting motor commands that were successful in A. Therefore, reversal of the perturbation is unlikely to produce erasure in the controller because reward is simply shifted to new motor commands, creating a bimodal distribution reflecting the history of all learning. However, it is possible that by the end of BA the forward model acquired in B is catastrophically affected by training in A (Fig. 7C). The novel prediction is that when a motor memory shows spontaneous recovery, it reflects the output of the controller (which learns from reinforcement) and may not be accompanied by the appropriate forward model. This also leads to the prediction that spontaneous recovery will be absent in basal ganglia patients, but present in cerebellar patients.
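The controller hypothesis above can be illustrated with a toy sketch in which the association between motor commands and reward is a bimodal curve and the expressed command is its mode. Everything numerical here is an assumption made for illustration: the command axis, the bump widths and heights, and in particular the devaluation rule applied on unreinforced trials are hypothetical and are not fitted to the data or taken from the text.

```python
import numpy as np

# Candidate motor commands in arbitrary units: -1 compensates B, +1 compensates A.
commands = np.linspace(-2.0, 2.0, 401)


def bump(center, width=0.3):
    """Gaussian-shaped association centered on one motor command."""
    return np.exp(-0.5 * ((commands - center) / width) ** 2)


def selected_command(value):
    """Express the command most strongly associated with reward (the mode)."""
    return commands[np.argmax(value)]


# Hypothetical state after BA training: two peaks, with the mode at the A command.
value = 0.7 * bump(-1.0) + 1.0 * bump(1.0)
before = selected_command(value)

# Withhold reinforcement. Assumed rule: each unreinforced trial devalues
# commands near the one just expressed; stop once the mode changes sign.
decay = 0.1
trials_to_switch = None
for trial in range(1, 21):
    expressed = selected_command(value)
    value = value * (1.0 - decay * bump(expressed))
    if selected_command(value) < 0:
        trials_to_switch = trial
        break

after = selected_command(value)
```

In this sketch the B peak is never erased by training in A; it is only outcompeted. Once the A command goes unrewarded for a few trials, the mode moves to the B command, which is the qualitative pattern of spontaneous recovery described in the text.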
This model can account for a number of observations in our data. We observed that memories were protected in the gradual condition. This protection may rely not on kinematic errors, but on changes in success probability, which were substantial even in the gradual condition. Selection of the controller based on reward would also account for switching that we observed when reinforcement was withheld. Finally, we observed that BN training produced much less recovery of B than BA training (Fig. 3E). This is accounted for because the peaks of the two resulting associations between motor commands and reward in the controller are closer in BN than in BA training, resulting in greater interference.
This model may relate to state-space models, in that the motor cortex may serve as the site for “slow states” of learning, with “fast” learning taking place in the cerebellum. Evidence for this view comes from a double dissociation: tDCS over M1 increases retention of learned motor memories with no effect on adaptation, whereas tDCS over the cerebellum increases the rate of adaptation without affecting retention (Galea et al., 2010).
Link to operant conditioning
In BAb and similar experiments, protection of B was not due to a contextual cue from sudden errors, and spontaneous recovery of B was not due to a similarity between errors during testing and initial adaptation. Rather, the brain expressed B because the current motor commands (in null or A) were suddenly unsuccessful. This parallels observations in operant conditioning. For example, Mazur (1995) investigated the role of reward in pigeons that were trained to peck at two different keys. Each key delivered reward at a constant probability for a set number of trials, but then changed to a new reinforcement schedule. The pigeons were able to adjust to the new schedule, but at the start of a new block they reverted back to the previous schedule, indicating spontaneous recovery of prior training. That is, previously rewarded behaviors are not erased when new behaviors are rewarded, consistent with the view that in the AB paradigm, reinforcement of B motor commands does not erase the association of A commands with reward.
Limitations
Traditional methods of assaying motor memory have relied on measures of savings, i.e., training then retesting on the same task. The implicit assumption has been that if the memory is present, then there are contextual cues in retesting that should be sufficient to express it. This approach has produced the conclusion that motor memories are erased because there is no evidence of savings. Our work here shows that this conclusion is false, but does not explain why previous methods of assaying motor memory failed to observe protection. Though our model predicts that spontaneous recovery reflects the output of the controller, our current design cannot determine the actual component of motor learning present in error-clamp trials. Furthermore, our work does not address two important issues in motor learning: passage of time appears to strengthen motor memories (Brashers-Krug et al., 1996), and repetition appears to increase resistance to interference (Krakauer et al., 2005). It is unclear whether denial of reinforcement would produce spontaneous recovery that increases with passage of time, and with increased repetition. Finally, though we linked changes in success rates to spontaneous recovery of motor memory, these changes also altered levels of cognitive attention.
Footnotes
This research was supported by a grant from the National Institutes of Health (NS37422). S.E.C.-H. was supported by a predoctoral fellowship from the NIH (NS062647).
Correspondence should be addressed to Sarah E. Pekny, Johns Hopkins University School of Medicine, 720 Rutland Avenue, 416 Traylor Building, Baltimore, MD 21205. sep205{at}gmail.com