Abstract
The early stages of motor skill acquisition are often marked by uncertainty about the sensory and motor goals of the task, as is the case in learning to speak or learning the feel of a good tennis serve. Here we present an experimental model of this early learning process, in which targets are acquired by exploration and reinforcement rather than sensory error. We use this model to investigate the relative contribution of motor and sensory factors to human motor learning. Participants make active reaching movements or matched passive movements to an unseen target using a robot arm. We find that learning through passive movements paired with reinforcement is comparable with learning associated with active movement, both in terms of magnitude and durability, with improvements due to training still observable at a 1 week retest. Motor learning is also accompanied by changes in somatosensory perceptual acuity. No stable changes in motor performance are observed for participants that train, actively or passively, in the absence of reinforcement, or for participants who are given explicit information about target position in the absence of somatosensory experience. These findings indicate that the somatosensory system dominates learning in the early stages of motor skill acquisition.
SIGNIFICANCE STATEMENT The research focuses on the initial stages of human motor learning, introducing a new experimental model that closely approximates the key features of motor learning outside of the laboratory. The finding indicates that it is the somatosensory system rather than the motor system that dominates learning in the early stages of motor skill acquisition. This is important given that most of our computational models of motor learning are based on the idea that learning is motoric in origin. This is also a valuable finding for rehabilitation of patients with limited mobility as it shows that reinforcement in conjunction with passive movement results in benefits to motor learning that are as great as those observed for active movement training.
Introduction
When we first learn to play tennis, or when, as a child, we learn to speak, our desired sensory state and motor commands are frequently marked by uncertainty. As we learn novel movements, we are also determining the sensory state we are trying to achieve. During this early stage of learning, our targets are acquired and refined through trial and error, by exploration and reinforcement. The requirements are different from adaptation procedures, which serve as the current reference model for studying motor learning (Shadmehr and Mussa-Ivaldi, 1994). In adaptation experiments, participants make movements to well-defined sensory targets. Learning is characterized by compensation for the sensory error that results from experimentally imposed perturbations. However, when the target is uncertain, as early in learning, there is limited opportunity for sensory error to drive motor learning. Indeed, this idea is consistent with a growing body of literature indicating that factors other than error are important for motor learning, including reward (Huang et al., 2011) and movement repetition (Diedrichsen et al., 2010). Here we present a new experimental model to study the early stages of motor learning. In this model, determining the desired state is an integral part of the learning. There are no experimentally imposed perturbations.
We use this model to address one of the fundamental questions regarding motor skill acquisition, namely, the extent to which motor versus sensory factors determine the process of learning. Disentangling the respective roles of afferent input and motor outflow is complicated by the fact that, when an action is performed, descending motor commands and afferent sensory information co-occur. Although it is known that exposure to afferent input can facilitate sensorimotor performance (Beste and Dinse, 2013), few investigations have tried to tease apart the dependence of motor learning on afferent factors from those which are more clearly motor in nature. This question can be addressed by studying the motor learning produced by passive movements. A training procedure using passive movements matches movement kinematics to those experienced during actual motion while at the same time minimizing the motor outflow. Some studies report no evidence of learning following passive movement training (Lotze et al., 2003). Other studies showed positive effects of training with passive movement, but only if alternated with active movements (Wong et al., 2012) or in the presence of visual feedback (Beets et al., 2012). The extent to which learning is dependent upon somatosensory information per se remains to be established.
Here we have experimentally contrasted active with matched passive movement training to evaluate their contribution to motor learning. Subjects make reaching movements to an unseen target without vision of the arm. When the subject successfully lands in the target zone, a binary signal indicating success is presented. We show that, when passive movements are paired with reinforcement, the benefits to learning are similar in magnitude and persistence to those observed under active movement conditions. In both cases, there are small but comparable improvements in somatosensory acuity. Our findings indicate that, in acquiring simple motor skills, the somatosensory system provides the leading contribution to learning.
Materials and Methods
Subjects and experimental tasks.
A total of 230 participants of either sex were recruited (mean age ± SD: 21.9 ± 2.1 years). They were all right-handed and reported no history of sensorimotor disorders. Participants provided written consent, and all procedures were approved by the McGill University Institutional Review Board.
Figure 1 shows the experimental setup and tasks. The participants' task was to reach without vision of the arm to a bar that extended from one side of the workspace to the other. Within this bar, there was an unseen target. There were two primary manipulations. One involved a comparison of the effects of training with active versus passive movement. Passive participants experienced the same trajectories as those produced by the active participants, but there was no active generation of movement. The other manipulation involved a comparison of the effects of binary feedback about movement success, here referred to as “reinforcement.” Some participants in the active and passive conditions received positive feedback (an explosion displayed on screen) each time the movement ended within the desired target zone (Reinforcement groups). Other participants received no feedback or any other information to indicate that their movement was correct (Control groups).
Participants were randomly assigned to one of four conditions: active reinforcement (n = 49), passive reinforcement (n = 52), active control (n = 45), and passive control (n = 48). Across these four conditions, a subset of subjects (n = 80 in total) made outward reaching movements along the body midline. The remaining subjects (n = 114) moved out and to the left, at an orientation of 135°. More than one version of the experiment was performed to assess the generality of the findings across movement directions. A further group of participants was recruited for a visual control experiment (n = 36).
Subjects were tested individually in a single session. The session comprised reaching movements and perceptual tests (see Fig. 1b). In each case, subjects held the handle of a two degree-of-freedom planar robotic arm (InMotion2; Interactive Motion Technologies). Subjects were seated, and the arm movements occurred in a horizontal plane with the shoulder abducted to ∼70°. An air sled was used to support the subject's arm, and straps restrained the subject's trunk. A semisilvered mirror, which served as a display screen, was placed just below eye level and blocked vision of the arm and the robot handle. Sixteen-bit optical encoders provided the position of the hand (Gurley Precision Instruments). Applied forces were measured using a force-torque sensor (ATI Industrial Automation) that was mounted below the robot handle.
Reaching movements.
The movement start position was indicated by a green circle, 20 mm in diameter, ∼15 cm from the subject's chest along the body midline (see Fig. 1a). Reaching movements were aimed at a green stripe that extended the entire width of the display screen. In the straight-ahead version of the experiment, the stripe was oriented along a horizontal axis (see Fig. 1c); whereas in the 135° version, it was tilted at 45° from the horizontal (see Fig. 1d). An unseen target area 5 mm in width (8 mm for the 135° version of the experiment) lay within the stripe. In both cases, the perpendicular distance between the starting point and the stripe was 15 cm. The stripe was 1 cm in width.
A thin yellow line (2 mm width) that also extended the width of the display screen provided the subject with visual information about the distance of the hand from the stripe. The lateral position of the hand relative to the actual target was not shown. To enable participants to place the handle at the start position before each trial, a yellow circle (12 mm in diameter) was temporarily superimposed on the yellow line when the subject's hand was at the start position. This information was removed as soon as participants moved outward.
In the straight-ahead condition, participants were told to move straight outward until they reached the green stripe. They were also instructed to make a single movement without correction. In the 135° version, participants were told to move at 135°, also along a straight line, until they hit the stripe. Subjects were instructed to finish each movement in 800 ms. Visual feedback of movement duration was provided at the end of each movement by a stripe color change. The feedback was used to help subjects achieve the desired movement duration, but no trials were removed from analysis if subjects failed to achieve the speed requirement. At the end of each trial, the robot returned the subject's hand to the start position.
The experiment began with baseline movements in which all subjects performed 15 reaching movements toward the stripe without receiving feedback. Participants in the straight-ahead version of the experiment were instructed to move straight out precisely along the body midline. Participants in the 135° version were instructed to move precisely at 135°. This was followed by a baseline measure of somatosensory perceptual classification accuracy (see Perceptual judgments). Participants then underwent a training procedure in which they were assigned to one of four groups.
Participants in the active reinforcement condition performed active reaching movements toward the stripe. The instructions were the same as in the pre-training block. Whenever the movement ended within a prescribed range around the desired target, a “Nice shot!” message appeared on the screen together with the picture of an explosion. The range for reward was ±2.5 mm for the straight-ahead version of the experiment and ±4 mm for the 135° version. Only the lateral dimension along the stripe was taken into account to evaluate whether to deliver reward or not (the reward was delivered regardless of movement amplitude in the sagittal plane, provided that participants had moved at least 5 cm beyond the starting position). No feedback was provided for movements that landed outside the prescribed range; hence, the reinforcement feedback was strictly binary. Subjects were instructed to maximize the number of explosions. Participants in the straight-ahead version of the experiment received reinforcement for movement directed slightly to the right of their actual midline (bias: 15 mm rightward relative to the midline; see Fig. 1c). This bias was introduced following a pilot experiment showing that several participants were able to move accurately straight-ahead under baseline conditions, thus leaving little or no room for motor learning. The rightward direction was chosen to enable us to separate learning associated with our procedure from a spontaneous bias often associated with repeated reaching movements performed with the dominant right hand, which tend toward the left of the body midline (Darainy et al., 2013). Participants in the 135° version instead were reinforced for movements directed exactly at 135° (see Fig. 1d). The training duration was 150 trials in the straight-ahead condition, and 200 trials for the 135° version. 
This choice was made following a pilot experiment at 135°, in which it was determined that the 135° direction was harder to learn compared with the biased straight-ahead direction, and a greater number of trials provided a better opportunity to achieve reliable learning. For the straight-ahead version, a short break was given halfway through. For the 135° version, breaks were given every 50 trials.
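The binary reinforcement rule described above can be sketched in a few lines. This is an illustrative sketch only: the function and parameter names are hypothetical, and treating the edge of the reward range as inclusive is an assumption, since the text does not specify how boundary cases were handled.

```python
def is_reinforced(lateral_error_mm, amplitude_cm,
                  half_width_mm=2.5, min_amplitude_cm=5.0):
    """Return True when a movement would earn the binary reward.

    lateral_error_mm: signed distance along the stripe between the movement
        endpoint and the target center.
    amplitude_cm: how far the hand moved beyond the start position.
    half_width_mm: 2.5 for the straight-ahead version, 4.0 for the 135° version.
    The inclusive comparison (<=) is an assumption, not stated in the text.
    """
    moved_far_enough = amplitude_cm >= min_amplitude_cm
    inside_range = abs(lateral_error_mm) <= half_width_mm
    return moved_far_enough and inside_range


# Straight-ahead version (±2.5 mm) versus 135° version (±4 mm).
print(is_reinforced(3.0, 15.0))                     # outside the ±2.5 mm range
print(is_reinforced(3.0, 15.0, half_width_mm=4.0))  # inside the ±4 mm range
```

Movement amplitude enters only as a minimum-distance check, so a movement that overshoots or undershoots the stripe can still be rewarded as long as its lateral position is within range.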
Participants in the passive reinforcement condition experienced movements generated by the robot under position servo-control. We used the movement trajectories generated by the active reinforcement subjects as a model for the passive reinforcement subjects. Each passive reinforcement participant was randomly yoked to an active reinforcement subject (without intermixing the straight-ahead and 135° versions of the experiment). A participant in the passive reinforcement condition thus experienced exactly the same sequence of movements as a participant in the active reinforcement condition. Subjects in the passive reinforcement group were also given the same rewards at the end of movement as subjects in the active reinforcement condition, whenever the robot-generated movement ended within the reinforcement range. Participants in the passive reinforcement condition were told that they would experience the movements made by another participant who was attempting to learn to move correctly to the target position (defined to participants as “straight-ahead” or “at 135°”) within the stripe. We instructed participants to focus on the feeling of the movements and to use this information to learn the correct movement direction. To ensure that the participants in the passive condition were paying attention during training, they were told to report every occurrence of the reward. They were also instructed not to apply any force to the robotic handle during training. To verify that participants complied with this instruction, we measured the average force applied to the handle in the horizontal plane throughout each passive movement. Breaks were given during training as in the active reinforcement condition.
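The yoking step can be illustrated with a short sketch. The names are hypothetical, and sampling models with replacement is an assumption: the text states only that yoking was random and kept within each version of the experiment.

```python
import random


def yoke_passive_to_active(passive_ids, active_ids, seed=None):
    """Randomly assign each passive participant an active model.

    Both lists are expected to come from the same version of the experiment
    (straight-ahead or 135°), so versions are never intermixed.
    Sampling with replacement is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    return {passive: rng.choice(active_ids) for passive in passive_ids}


# Hypothetical participant labels for one version of the experiment.
pairs = yoke_passive_to_active(["P01", "P02", "P03"], ["A01", "A02"], seed=1)
print(pairs)
```

Calling the function once per version keeps straight-ahead and 135° participants separate.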
Participants in the active control condition performed reaching movements toward the same stripe as subjects in the active reinforcement condition, but they were not provided feedback that served to reward successful achievement of the target position. The instructions to this group were to improve their ability to move precisely straight out along their body midline (for the straight-ahead experiment) or at 135° (for the 135° experiment) by repeating the movement several times. Breaks were given as in the active reinforcement group. This control was used to verify that moving to the rightward biased target or to the 135° target in the reinforcement groups was not merely the result of repeating several reaching movements aimed at the body midline or at 135°, respectively.
Passive control participants experienced the same displacements of the limb as the passive reinforcement group (i.e., the movement trajectories generated by participants in the active reinforcement group), but successful movements were not accompanied by reward. These participants were told that they would experience the movements of another subject that was attempting to learn to move correctly to the target (defined to participants as “straight-ahead” or “at 135°”). Participants were instructed to pay attention to the feeling of the movements and to use this information to learn the correct movement direction. To ensure that participants were paying attention during the training, they were required to report every occurrence of the stripe turning blue (corresponding to a slow robot-generated movement). It was made clear to the participants that the target color was not the most important feature and that they had to pay particular attention to the feeling of the movements. Breaks were given as in the active reinforcement group.
Immediately following the training, participants in all four groups were tested for their ability to perform reaching movements to the target in the absence of any reward. The instructions to participants in this phase were as follows: “Reach precisely straight-ahead in front of you/at 135°,” with the addition, for the reinforcement groups of “just as you have done (“felt,” for the passive reinforcement group) so far to get the explosions. However, this time there will be no explosions.” Each participant performed 15 active reaching movements. Afterward, participants underwent a second test of somatosensory function.
A subset of participants (n = 32; 10 in the active reinforcement, 12 in the passive reinforcement, and 10 in the passive control group) from the 135° version of the experiment was tested for retention ∼7 d later. The retest session was identical to the baseline movement test, comprising 15 reaching movements at 135° in the absence of visual information about the lateral displacement of the hand. No familiarization or warm-up was provided before the retest session.
An additional group of participants (n = 36, visual reinforcement group) was recruited to investigate whether the improvement in motor performance following training could be attributed to knowing the location of the target, in the absence of somatosensory experience. One subgroup was tested in the version of the experiment with the horizontal stripe and baseline movements directed straight-ahead (n = 20); another subgroup was tested in the version of the experiment at 135° (n = 16). For baseline trials, visual control participants received instructions identical to the other groups. However, during training they did not perform movements of any kind. Instead, on each trial, they were shown the endpoint position for each of the movements produced by a randomly yoked participant from the active reinforcement group (model) (for the 135° version of the experiment, we used as models the subgroup of 16 participants who had the fan-shaped somatosensory perceptual test; see below). Endpoint positions were shown as a red dot with a diameter of 5 mm, centered at the final position of the corresponding movement of the model. Endpoint dots remained visible on the screen for 1500 ms. When the dot fell within the actual target zone, thus corresponding to a reward achieved by the participant that served as the model, a “Correct” message was displayed on the screen. The time interval between successive presentations of the endpoint dots corresponded to the interval between the successive movements of the model. Thus, this procedure provided participants in the visual reinforcement group with explicit visual information regarding the target position that was equal in quantity to the number of reinforced movement trials in the active and passive reinforcement groups. During this training period, participants did not hold the robot handle, and they were asked to relax and avoid movements. 
For the straight-ahead version of the experiment, participants were instructed to pay attention to the position of the dots and to learn the particular location at which the appearance of the dot triggered the “Correct” message. This position corresponded to the target 15 mm toward the right relative to the body midline that had been reinforced in the active and passive participants. For the version of the experiment at 135°, participants were instructed to pay attention to the position of the dots and to use this information to accurately learn the location corresponding to 135°. To ensure that participants were paying attention during training, all participants were told to report every occurrence of the “Correct” message. Following training, participants were asked to reach precisely to the location they had been shown during the training. Participants were informed that the “Correct” message would no longer be displayed.
Perceptual judgments.
Participants were tested for somatosensory perceptual classification accuracy before and after movement training. This procedure was primarily aimed at measuring proprioceptive function; however, we describe it as somatosensory as both proprioceptors and cutaneous afferents may well be involved. Subjects were tested with their eyes closed. Three somatosensory perceptual testing procedures were used.
For the straight-ahead condition, the robot was programmed to passively move the subjects' arm through each of 10 fan-shaped trajectories, 15 cm in length (see Fig. 2a). Subjects were required to judge on each trial whether the robot had moved the arm to the right or left. The trajectories were distributed equally to the right or left of a reference line (not shown to the subjects) connecting the starting position with the target of the motor task, which was 15 mm to the right of the body midline. We took a set of lateral deviations used in previous experiments (8°, 5°, 4°, 3°, and 1.5° in both directions) (Darainy et al., 2013) and rotated them to align them to the motor task reference line. This yielded the following set of deviations relative to the body midline: −2.3°, 0.7°, 1.7°, 2.7°, 4.2°, 7.2°, 8.7°, 9.7°, 10.7°, and 13.7°. Each perceptual test involved 100 trials with the above angles tested 4, 10, 10, 14, 12, 12, 14, 10, 10, and 4 times each. All of the test movements followed a bell-shaped velocity profile. Subjects were instructed not to resist the action of the robot.
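The rotated set of angles can be reproduced with a short calculation. The sketch below assumes that the reference line is rotated by the arctangent of the 15 mm target offset over the 15 cm perpendicular distance to the stripe (about 5.7°). That geometry is inferred from the task description rather than stated explicitly, and the function name is hypothetical.

```python
import math

# Lateral deviations from Darainy et al. (2013), applied in both directions.
BASE_DEVIATIONS_DEG = [8.0, 5.0, 4.0, 3.0, 1.5]


def rotated_deviations(target_offset_mm=15.0, reach_distance_mm=150.0):
    """Deviations relative to the body midline after rotating the fan.

    The rotation angle is assumed to be atan(offset / distance).
    """
    reference_deg = math.degrees(math.atan2(target_offset_mm, reach_distance_mm))
    symmetric = sorted([-d for d in BASE_DEVIATIONS_DEG] + BASE_DEVIATIONS_DEG)
    return [round(reference_deg + d, 1) for d in symmetric]


print(rotated_deviations())
# -> [-2.3, 0.7, 1.7, 2.7, 4.2, 7.2, 8.7, 9.7, 10.7, 13.7]
```

Under this assumption the computed values match the reported set of deviations.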
Two different perceptual tests were used for subjects that trained at 135°. One group of subjects underwent perceptual testing using the 10 fan-shaped trajectories described above, but distributed equally to the right or left of a 135° reference line (not shown to the subjects) (see Fig. 2b). Subjects were required to judge on each trial whether the robot had moved the arm above or below the 135° direction. We used lateral deviations of 12.8°, 8°, 6.4°, 4.8°, and 2.4° in both directions relative to the 135° direction. These angles were derived from the set previously used by Darainy et al. (2013) and expanded by a factor of 1.6, to mirror the expansion of the range for reinforcement in the motor task in the 135° condition.
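The expansion factor follows from the ratio of the two reward ranges (±4 mm for the 135° version versus ±2.5 mm for the straight-ahead version). A minimal sketch of that arithmetic, with hypothetical names:

```python
# Lateral deviations from Darainy et al. (2013), in degrees.
BASE_DEVIATIONS_DEG = [8.0, 5.0, 4.0, 3.0, 1.5]


def expanded_deviations(base_half_width_mm=2.5, wide_half_width_mm=4.0):
    """Scale the base deviations by the ratio of the reward ranges (4 / 2.5 = 1.6)."""
    factor = wide_half_width_mm / base_half_width_mm
    return [round(d * factor, 1) for d in BASE_DEVIATIONS_DEG]


print(expanded_deviations())
# -> [12.8, 8.0, 6.4, 4.8, 2.4]
```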
Because of the potential difficulty in making an above/below judgment relative to 135°, we tested the other set of participants in the 135° experiment in an AXB design. Subjects had to judge whether the second movement (X) in a set of three consecutive robot-generated movements felt closer to the first movement (A) or to the third movement (B) (see Fig. 2c). In this procedure, A and B represent extreme deviations and thus function as objective reference points. A deviation of ±12° relative to the 135° reference was used for A and B. For the X deviations, we used the following set of angles: 8°, 6°, 4°, 3°, and 2° in both directions. As in the previous procedures, each block of perceptual testing involved 100 trials with the above angles tested 4, 10, 10, 14, 12, 12, 14, 10, 10, and 4 times each.
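A 100-trial test block with these repetition counts could be assembled as in the sketch below. Two points are assumptions: that the repetition counts map onto the angles ordered from one extreme to the other, and that trial order was randomized. The names are hypothetical.

```python
import random

# X deviations relative to the 135° reference, ordered from one extreme to the other.
X_DEVIATIONS_DEG = [-8, -6, -4, -3, -2, 2, 3, 4, 6, 8]
# Repetitions per angle; assumed to follow the same ordering as the angles.
REPETITIONS = [4, 10, 10, 14, 12, 12, 14, 10, 10, 4]


def build_schedule(deviations, repetitions, seed=None):
    """Expand the per-angle counts into a shuffled list of test trials."""
    trials = [d for d, n in zip(deviations, repetitions) for _ in range(n)]
    random.Random(seed).shuffle(trials)  # random ordering is an assumption
    return trials


schedule = build_schedule(X_DEVIATIONS_DEG, REPETITIONS, seed=0)
print(len(schedule))  # 100 trials per block
```

The four central positions account for 12 + 14 on each side, that is, 52 of the 100 trials.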
Data analysis.
Motor performance was quantified with reference to the absolute value of the subject's lateral perpendicular deviation from a straight line connecting the starting position and the center of the target. Two measures of deviation were considered: (1) the deviation at movement peak velocity (PDmaxv) and (2) the deviation at movement end (PDend). For the 135° version of the experiment, the target reference point was placed exactly at 135°. For the straight-ahead version, the target reference point was placed 15 mm to the right of the subject's actual midline, to reflect the position that was reinforced during the training. The data from the 15 trials recorded in the pretests and post-tests were averaged on a per-subject basis, yielding a single deviation score for each of the pre-training and post-training tests, for each subject. Participants were excluded from analysis following testing if they failed to collect at least 2 rewards over the course of the training, one of which had to be in the last 50 trials. Eight subjects were removed on this basis.
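The two deviation measures and the inclusion rule can be sketched as follows. This is a hedged illustration, not the analysis code used in the study: the function names, the representation of a trajectory as a list of (x, y) samples with a matching list of speeds, and the reward list of booleans are all assumptions.

```python
import math


def perpendicular_deviation(point, start, target):
    """Absolute perpendicular distance from a point to the start-target line."""
    (px, py), (sx, sy), (tx, ty) = point, start, target
    dx, dy = tx - sx, ty - sy
    cross = dx * (py - sy) - dy * (px - sx)
    return abs(cross) / math.hypot(dx, dy)


def deviation_scores(trajectory, speeds, start, target):
    """Return (PDmaxv, PDend) for one movement."""
    peak_index = max(range(len(speeds)), key=speeds.__getitem__)
    pd_maxv = perpendicular_deviation(trajectory[peak_index], start, target)
    pd_end = perpendicular_deviation(trajectory[-1], start, target)
    return pd_maxv, pd_end


def meets_inclusion_criterion(rewards, last_n=50, min_total=2):
    """At least 2 rewards overall, with at least one in the last 50 trials."""
    return sum(rewards) >= min_total and any(rewards[-last_n:])


# Hypothetical movement sampled at three points (mm), target straight ahead.
print(deviation_scores([(0, 0), (3, 70), (5, 150)], [0.0, 1.0, 0.1], (0, 0), (0, 150)))
# -> (3.0, 5.0)
```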
To exclude the possibility of active force production for participants in the passive groups, we measured the average 2D force applied by subjects to the robot handle during the course of the movement (see Fig. 4b). The average 2D force was 1.82 N in the passive reinforcement group and 1.88 N in the passive control group. This difference was not statistically reliable (t = −0.43, p > 0.6). A force of this magnitude (∼180 g) would be expected simply due to the passive stiffness of the arm. This is consistent with the idea that active force production was not a significant factor in the passive movement procedure.
We investigated whether learning in the passive groups was influenced by the nature of the passive movements experienced. To this end, we correlated the change from pre-training to post-training exhibited by participants in the passive groups with the learning of the active reinforcement participants who served as models. We obtained a score for the active tutors by subtracting the PDend score averaged across the first 15 trials of training from the score obtained by averaging the last 15 trials of training.
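The tutor score and its correlation with passive learning can be illustrated as below. The use of Pearson's coefficient is an assumption (the text says only that the scores were correlated), and the names and example values are hypothetical.

```python
import math


def tutor_learning_score(training_pdend, window=15):
    """Mean PDend over the last 15 training trials minus the mean over the first 15.

    Negative values indicate that the active model reduced its error.
    """
    first = sum(training_pdend[:window]) / window
    last = sum(training_pdend[-window:]) / window
    return last - first


def pearson_r(x, y):
    """Pearson correlation coefficient (assumed here; not specified in the text)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical tutor whose error drops from 10 mm to 4 mm over training.
print(tutor_learning_score([10.0] * 15 + [4.0] * 15))  # -> -6.0
```

Each passive participant's pre-to-post change would then be paired with the score of the active model to which that participant was yoked.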
A measure of perceptual classification accuracy was computed using the following: (True positive + True negative)/(Positive + Negative). The measure spans a range from 0 to 1, where 1 represents a perfect match between stimuli and classification and 0 indicates a systematic mismatch (Baldi et al., 2000). The calculation was conducted on a per-subject basis about the point which maximized that subject's percentage of correct responses. Classification accuracy was observed to approach ceiling values for limb positions farther from the center (∼85% accuracy for the outer three test positions on each side). To maximize the sensitivity of the measure, we computed classification accuracy based on the four central positions, which gave us 52 observations for each participant. Participants with baseline perceptual classification accuracy exceeding 2 SDs above or below the overall sample mean were excluded from the statistical analysis of somatosensory perceptual classification (n = 6, 3% of the total).
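One way to compute classification accuracy about the point that maximizes a subject's percentage of correct responses is sketched below. The choice of candidate points (midpoints between adjacent test angles) and all names are assumptions made for illustration. In the analysis described above, only trials from the four central positions would be passed in.

```python
def classification_accuracy(stimuli_deg, responses_right, boundary_deg):
    """(True positive + True negative) / (Positive + Negative) for one boundary.

    A stimulus counts as 'positive' when it lies to the right of the boundary.
    """
    correct = sum((s > boundary_deg) == r for s, r in zip(stimuli_deg, responses_right))
    return correct / len(stimuli_deg)


def best_accuracy(stimuli_deg, responses_right):
    """Accuracy about the boundary that maximizes the proportion correct.

    Candidate boundaries are midpoints between adjacent test angles (an assumption).
    """
    levels = sorted(set(stimuli_deg))
    boundaries = [(a + b) / 2 for a, b in zip(levels, levels[1:])]
    return max(classification_accuracy(stimuli_deg, responses_right, b)
               for b in boundaries)


# Hypothetical subject: four central angles, one response per angle.
print(best_accuracy([-3, -1.5, 1.5, 3], [True, False, True, True]))  # -> 0.75
```

Searching over boundaries allows a subject whose criterion is shifted away from the reference line to be scored on acuity rather than bias.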
Changes in overall motor performance and perceptual classification accuracy following training were evaluated statistically using a mixed-effects ANOVA. Time was a repeated-measures two-level factor (levels: pre-training vs post-training). Training mode and Reinforcement were between-subjects factors (Training mode: active vs passive; Reinforcement: present vs absent).
In a preliminary analysis, we found no evidence that our results were influenced by the version of the experiment (straight-ahead vs 135°). Specifically, in a four-way mixed-effects ANOVA with Time, Training mode, Reinforcement, and Experiment version as the factors, there was no instance of a three-way interaction between Time, Reinforcement, and Experiment version, or between Time, Training mode, and Experiment version, nor any four-way interaction (all p > 0.2 at the least). This applied to all the dependent variables in our study (PDmaxv, PDend, and somatosensory classification accuracy) and similarly if the experiment version was divided into three levels rather than two (Straight vs 135° vs 135°-AXB). Therefore, the version of the experiment was dropped as a factor from the statistical analyses, and the results shown below are collapsed over the different versions.
The analysis for the 1 week retest was performed by means of a multivariate ANOVA (MANOVA) on the differential scores for PDmaxv and PDend that were obtained by subtracting the baseline values from the 1 week movement error scores, with Bonferroni-Holm correction for multiple comparisons.
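The Bonferroni-Holm step-down procedure can be sketched in a few lines. This illustrates the general procedure only, not the specific comparisons made in the study. The function name and example p values are hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    P values are tested from smallest to largest against alpha / (m - rank);
    testing stops at the first comparison that fails.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject


print(holm_bonferroni([0.01, 0.04, 0.03]))  # -> [True, False, False]
```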
In addition to testing overall changes in motor performance from pre-training to post-training, we fit a model to the time-series data of the pre-post change scores for the absolute deviation at movement end. The goal was to assess the stability of learning over the course of the 15 trials following training, which for all subjects involved active and nonreinforced movements. Change scores were obtained by subtracting, on an individual basis, the average of the 15 pre-training scores from each of the 15 scores collected during the post-training session. We then used the generalized additive modeling (GAM) approach (Wood, 2006) to model the change scores. The model contained the following: (1) a smooth function of trial for each individual subject, (2) an intercept term for each experimental group, and (3) a smooth function of trial for each experimental group. The nonlinear smooth functions were thin-plate splines, and fitting was performed using Maximum Likelihood Estimation. Degrees of freedom for the smooth functions were assigned using a cross-validation method as part of the default implementation of the “bam” function (mgcv package in the R environment for statistical computing).
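The change scores that serve as input to the GAM can be computed as in the sketch below. The GAM fit itself is not reproduced here. The function name and the example values are hypothetical.

```python
def change_scores(pre_trials, post_trials):
    """Subtract a subject's mean pre-training error from each post-training trial.

    Negative values indicate lower error than at baseline.
    """
    baseline = sum(pre_trials) / len(pre_trials)
    return [score - baseline for score in post_trials]


# Hypothetical subject with three trials per block (the study used 15).
print(change_scores([10.0, 12.0, 14.0], [6.0, 5.0, 7.0]))  # -> [-6.0, -7.0, -5.0]
```

Keeping one change score per post-training trial, rather than a single average, is what allows the model to assess whether learning is stable across the 15 nonreinforced movements.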
Learning in the visual reinforcement group was assessed by means of a paired sample t test on the average pre-training and post-training deviation scores.
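For reference, the paired-sample t statistic can be computed directly from the per-subject difference scores. The sketch below uses only the Python standard library. The function name and the example data are hypothetical and do not correspond to values from the study.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t, degrees of freedom) for a paired-sample t test.

    t is the mean of the post-minus-pre differences divided by its standard error.
    """
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical deviation scores (mm) for four subjects.
t_value, df = paired_t([10.0, 12.0, 14.0, 16.0], [9.0, 10.0, 13.0, 13.0])
print(round(t_value, 2), df)
```

A negative t indicates lower post-training error; the p value would be obtained from the t distribution with n − 1 degrees of freedom.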
Results
Participants were tested for motor performance and for somatosensory perceptual classification accuracy (Figs. 1, 2) at the beginning of the experimental session as well as following the training. Figure 3 shows movement data for the entire experimental sequence. During training, participants in the active movement condition made reaching movements to an invisible target. For subjects in the passive condition, the robot moved the participant's arm through the trajectories of a randomly matched active subject. Participants in the active and passive reinforcement conditions received reinforcement (an explosion displayed on the screen) if the hand position at movement end fell within a narrow range around the target, and no feedback otherwise. Participants in the control conditions did not receive any feedback or reinforcement. Overall, we found that, when reinforcement was present, both active and passive participants showed a significant reduction in movement error relative to the target. Participants in the reinforcement conditions also showed improvement in somatosensory classification accuracy following training, whereas no perceptual change was observed in the control groups.
Figure 1. Experimental setup and motor tasks. a, Reaching movements were aimed at a green stripe and were performed without vision of the arm. A thin yellow line provided the subject with visual information about the distance to the stripe, without indicating the lateral position of the hand. b, The experiment began with 15 active movements without feedback, followed by a baseline test of somatosensory acuity. Participants were then assigned to four different training groups: active versus passive movements, with versus without reinforcement. An additional control group received visual training with reinforcement but did not produce or experience any movements. Following training, participants repeated the motor and perceptual tests, in the absence of feedback. c, In one version of the experiment (n = 80), the stripe was horizontal. An unseen target area (red shaded area, 5 mm width) lay within the stripe, 15 mm to the right of the actual midline. During the training, participants in the active reinforcement condition received positive feedback (an explosion displayed on screen) whenever their movement ended within the desired target. Participants in the passive reinforcement condition experienced the same movement trajectories as active participants, replayed under robot position servo-control, and also received reinforcement when the movement ended in the target area. Active and passive control participants did not receive any feedback during training. Participants in the visual reinforcement condition did not perform any movement, and instead they were shown the endpoint positions of the movements of active reinforcement participants, coupled with reinforcement for successful movements. d, In a second version (n = 114), the stripe was tilted at 135°. To obtain reinforcement, movements had to end within an unseen target area (8 mm width), centered at 135°.
Description of the somatosensory classification tasks. a, The robot passively moved the subjects' unseen arm through each of 10 fan-shaped trajectories. Subjects judged whether the arm was moved to the right or left. No feedback was provided. This was used for subjects in the motor task described in Figure 1c. b, For half of the subjects who participated in the task described in Figure 1d, the fan-shaped trajectories were distributed equally to the right or left of 135°. Subjects judged whether the arm was moved above or below the 135° direction. c, The other half of the participants had to judge whether the second passive movement (X), in a set of three, felt closer to the first (A) or to the third (B).
Passive and active movements with reinforcement result in similar amounts of motor learning. The figure shows the sequence of motor tasks for the straight-ahead version of the experiment (a), for the version with the target at 135° (b), and for the entire dataset combined (c; training trials 151–200 from the 135° version are not shown). Dots represent the change relative to baseline movements in the absolute lateral distance of the hand from the target at movement end. Negative values indicate improvement. The pre-training and post-training blocks involved 15 active movements in the absence of reinforcement. Participants that received reinforcement during the training (Reinforcement groups) show reduced error in post-training movements, regardless of whether the training involves self-generated active movement or passive arm displacement. Participants who did not receive reinforcement (control groups) do not show learning. The pattern of results is similar for the straight-ahead and 135° versions of the experiment. Dots represent mean ± SE.
Passive training paired with reinforcement produces motor learning
Figure 3 shows data from all four experimental conditions. Figure 3a gives the straight-ahead version of the experiment. Figure 3b shows the 135° version. Figure 3c shows the entire dataset combined. In each panel, error scores are normalized to subjects' average baseline performance. The error score is computed relative to the reinforced target, which was 15 mm to the right of the body midline for the straight-ahead version of the experiment, and at 135° for the version with the bar tilted at 135°. The left and right columns show data before and after training, respectively. The numbered trials in the middle column show the change in error over the course of training. Scores below the dashed line in the center indicate learning, such that a greater distance below the dashed line corresponds to a greater decrease in error relative to baseline performance. The ∼7 mm reduction in movement error, which is shown for subjects in the active reinforcement condition, reflects the achievement of movements closer to the desired target. These subjects also show a progressive increase in the proportion of reinforced trials over the course of training (Fig. 4a). Subjects shown in blue and gray made active movements; those in red and cyan were in the passive condition. Because these latter participants experienced the same movements as subjects in the active reinforcement condition, their training data are identical and are not shown separately. The effects of training on subsequent performance are seen in the right column. In the movement trials following training, subjects in both the active and passive reinforcement groups improve performance, that is, their movement error decreases by ∼7 mm relative to baseline, and they do better than the subjects in the passive condition without reinforcement, who show little change in movement error. 
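The baseline-normalized error score used in these panels can be illustrated with a short sketch. It assumes that normalization means subtracting each subject's mean absolute baseline error (a difference score); the function names and the endpoint values are hypothetical and serve only to show the computation.

```python
from statistics import mean


def absolute_error(endpoints_mm, target_mm):
    """Absolute lateral distance of each movement endpoint from the target."""
    return [abs(x - target_mm) for x in endpoints_mm]


def change_scores(baseline_endpoints_mm, test_endpoints_mm, target_mm=15.0):
    """Error on each test trial minus the subject's mean baseline error.

    Negative values indicate movements that ended closer to the target
    than the subject's baseline average.
    """
    baseline_error = mean(absolute_error(baseline_endpoints_mm, target_mm))
    return [e - baseline_error for e in absolute_error(test_endpoints_mm, target_mm)]


# Hypothetical lateral endpoints (mm), invented purely for illustration.
baseline = [-5.0, 35.0]   # absolute errors of 20 mm each
post_training = [5.0, 22.0]
print(change_scores(baseline, post_training))
```

In this toy case both post-training scores are negative, which corresponds to points plotted below the dashed line.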
Subjects in the active control condition show post-training scores that reflect an increase in movement error relative to baseline performance. We tested for active motor outflow in subjects in the passive conditions by examining the force applied to the handle of the robot arm. The applied force was low, averaging ∼1.8 N, and it was similar for participants in the two passive conditions (Fig. 4b).
a, The percentage of reinforced trials in the Active reinforcement group increases over the course of training. The blue trace represents the average proportion of reinforced trials (±SE). b, The averaged force applied to the handle during the passive training procedure was ∼1.8 N. The applied force was comparable for the Passive reinforcement group (red trace) and for the Passive control group (cyan trace). c, Results of the GAM analysis applied to the movement error change scores for the straight-ahead and 135° datasets combined (see also Fig. 3, bottom right). Shaded areas represent Bayesian CIs. Movement error in the passive control group varied according to a nonlinear smooth function, with a significant improvement compared with baseline for the first three trials following the exposure to nonreinforced passive movements, and a progressive washout of motor learning in the subsequent trials. d, Participants that receive visual information about the target position during training, but do not perform active or passive movements, do not show evidence of learning, relative to baseline (captions and conventions are as in Fig. 3).
Differences in motor learning were assessed with repeated-measures ANOVA using the absolute value of the perpendicular deviation at maximum velocity and the absolute value of the lateral deviation at movement end as dependent measures (PDmaxv and PDend, respectively). The pattern for both dependent measures was similar. Participants who received reinforcement showed a significant reduction in error from pre-training to post-training (Time × Reinforcement interaction for PDmaxv: F(1,190) = 22.25, p < 0.001, ηp2 = 0.11, power = 0.99; post hoc: p < 0.001; PDend: F(1,190) = 22.48, p < 0.001, ηp2 = 0.11, power = 0.99; post hoc: p < 0.001), whereas no changes were observed for participants that did not receive reinforcement (post hoc for PDmaxv: p > 0.7; PDend: p > 0.1). The reinforced and nonreinforced participants were similar at baseline (post hoc for PDmaxv: p > 0.4; PDend: p > 0.7), but participants who received reinforcement showed significantly smaller error in the post-training test compared with nonreinforced participants (post hoc for PDmaxv and PDend: p < 0.001). No differences were found between active and passive participants in their pattern of performance in pre-training versus post-training trials (Training mode × Time interaction: PDmaxv: F(1,190) = 0.06, p > 0.7; PDend: F(1,190) = 2.18, p > 0.1; Training mode × Time × Reinforcement interaction: PDmaxv: F(1,190) = 0.009, p > 0.9; PDend: F(1,190) = 2.85, p > 0.09). One difference between active and passive participants was that, overall, passive participants showed greater movement deviation at peak velocity (main effect of Training mode: PDmaxv: F(1,190) = 4.1, p < 0.05, ηp2 = 0.02, power = 0.5; PDend: F(1,190) = 2.87, p > 0.09). Overall, active and passive reinforced participants decreased their error relative to the target by a similar amount (PDmaxv: 4.1 ± 1 mm, 4.3 ± 1 mm; PDend: 6.9 ± 1 mm, 6.5 ± 2 mm, for the active and passive reinforcement groups, respectively, mean ± SE).
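For readers who wish to relate the reported effect sizes to the F statistics, the standard identity between F and partial eta squared can be sketched as below. The source does not state how the effect sizes were computed, so this is only a consistency check under that assumption; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values taken from the text above.
print(round(partial_eta_squared(22.48, 1, 190), 3))  # Time x Reinforcement, PDend
print(round(partial_eta_squared(4.1, 1, 190), 3))    # main effect of Training mode, PDmaxv
```

Values obtained this way agree with the reported effect sizes to within rounding of the published F statistics.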
While the decrease in error in absolute terms was similar for participants in the active and passive reinforcement conditions, the latter learned less when considered relative to initial movement performance. Specifically, passive reinforcement participants showed a tendency to reach further from the target in initial baseline trials (PDmaxv: active reinforcement = 10.9 ± 1 mm, passive reinforcement = 14.4 ± 1 mm; PDend: active reinforcement = 17.6 ± 2 mm, passive reinforcement = 23.3 ± 2 mm). When combined with a similar decrease in error for the two groups over the course of learning, one obtains a 37%–40% improvement for active reinforced participants and 28%–30% for passive reinforced participants. Thus, depending on whether learning is expressed in absolute terms or relative to initial performance, estimates of learning for passive reinforced participants range from 75% to 100% of the learning shown by subjects in the active reinforced condition.
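The arithmetic behind these percentages can be reproduced from the group means reported in the text. The sketch below is illustrative only: the dictionary layout and function names are hypothetical, and because the inputs are rounded means, the outputs match the published ranges only approximately.

```python
# Group means reported in the text (mm): baseline error and pre-to-post reduction.
baseline_mm = {
    "active": {"PDmaxv": 10.9, "PDend": 17.6},
    "passive": {"PDmaxv": 14.4, "PDend": 23.3},
}
reduction_mm = {
    "active": {"PDmaxv": 4.1, "PDend": 6.9},
    "passive": {"PDmaxv": 4.3, "PDend": 6.5},
}


def relative_improvement(group, measure):
    """Reduction in error as a percentage of that group's baseline error."""
    return 100.0 * reduction_mm[group][measure] / baseline_mm[group][measure]


def passive_to_active_ratio(measure, relative=True):
    """Passive learning expressed as a fraction of active learning."""
    if relative:
        return relative_improvement("passive", measure) / relative_improvement("active", measure)
    return reduction_mm["passive"][measure] / reduction_mm["active"][measure]


for measure in ("PDmaxv", "PDend"):
    print(
        measure,
        round(relative_improvement("active", measure), 1),
        round(relative_improvement("passive", measure), 1),
        round(passive_to_active_ratio(measure, relative=True), 2),
        round(passive_to_active_ratio(measure, relative=False), 2),
    )
```

With these inputs the relative improvement is close to 38%–39% for the active group and 28%–30% for the passive group. The passive-to-active ratio is roughly 0.7 to 0.8 in relative terms and close to 1 in absolute terms, in line with the 75% to 100% range quoted above.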
As is seen from the percentages reported above, learning is incomplete. Motor performance in the post-training tests is characterized by a persistence of error relative to the target for all groups, including the reinforced groups that showed significant learning. This is consistent with the observation that the average rate of reinforcement at the end of the training for the active reinforcement groups was ∼30% (Fig. 4a). These results suggest that the design has captured the early stages of learning a rather difficult motor skill, which, for most subjects, could not be mastered within a single session of training.
For subjects in the passive reinforcement group, a relationship was observed between their change in performance between pre-training and post-training movements and the accuracy of the passive movements that they experienced during training. Participants who were exposed to a sequence of passive training movements in which there was a greater reduction of error themselves showed a greater reduction of error in their active movement trials, which took place following the passive training (r = 0.29, p < 0.04). This relationship was not statistically significant in passive participants that did not receive reinforcement (r = −0.01, p > 0.9).
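The relationship described here is a Pearson correlation across passive participants between two per-subject quantities: the error reduction present in the replayed training trajectories and the subject's own pre-to-post error reduction. A minimal sketch of that statistic follows; the function name and the toy values are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(var_x * var_y)


# Toy values (mm), invented for illustration: one entry per passive subject.
experienced_reduction = [1.0, 2.0, 3.0, 4.0]  # error reduction in replayed movements
own_reduction = [1.0, 3.0, 2.0, 4.0]          # subject's own pre-to-post reduction
print(pearson_r(experienced_reduction, own_reduction))
```

A positive coefficient indicates that subjects who experienced more accurate passive movements tended to improve more in their own active movements.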
Participants showed a small (10 ms) but significant increase in movement duration following training (main effect of Time: F(1,190) = 4.89, p < 0.03, ηp2 = 0.03, power = 0.59), and movement duration was slightly longer (14 ms) for passive participants overall, compared with active participants (main effect of Training mode: F(1,190) = 4.02, p < 0.05, ηp2 = 0.021, power = 0.51). No other differences in movement duration were found from pre-training to post-training or between the groups (for all other main effects and interactions, p > 0.05). A small (8 mm/s) but significant decrease in movement peak velocity from pre-training to post-training was found for the participants that received reinforcement (Time × Reinforcement interaction: F(1,190) = 6.02, p < 0.02, ηp2 = 0.03, power = 0.68). Passive participants also had lower peak velocity than active participants (main effect of Training mode: F(1,190) = 5.47, p < 0.03, ηp2 = 0.03, power = 0.64), with no interaction with Time or Reinforcement (Time × Training mode interaction: F(1,190) = 1.55, p > 0.2; Time × Training mode × Reinforcement interaction: F(1,190) = 0.02, p > 0.8). Thus, participants who received reinforcement exhibited slightly lower peak velocities following training. The same patterns were observed for participants in the active and passive reinforced conditions, confirming the similarity in motor performance between these two groups.
The GAM analysis (Fig. 4c) shows that subjects in the active and passive reinforcement conditions have similar improvements in movement accuracy in post-training trials. Passive control subjects who did not receive reinforcement showed a transient improvement in movement in post-training tests. Active movement control subjects showed a progressive increase in error over the course of the post-training movements. The fits shown in Figure 4c accounted for 86.3% of the deviance (R2 = 0.82). This model showed that motor performance in the passive control group varied significantly over the course of the post-training trials according to a nonlinear smooth function (estimated degrees of freedom: 2.59, F = 4.1, p < 0.01). Inspection of the Bayesian CIs revealed that motor performance for this group was significantly improved compared with baseline for the first three trials immediately following the nonreinforced passive movements. Subsequently, motor performance returned to baseline levels, where it remained throughout the remainder of the post-training block. A significant nonlinear term was also found for the active control participants (estimated degrees of freedom: 1.75, F = 3.43, p < 0.05). These participants exhibited a progressive drift away from the target during the post-training trials. The slope of the drift was more pronounced during the first 5 trials and subsequently reached asymptote.
Motor learning is retained at a 1 week interval
A subset of participants was retested after 7 d to assess retention of learning. Difference scores were computed relative to the initial baseline values collected on the first day of the experiment. Figure 5 shows learning in the original training session and the 1 week retest data. Negative values indicate reduction in error relative to baseline movements. Evidence of retention is seen for participants in the active and the passive reinforcement conditions, but not for the passive controls. Retests were not performed for active control subjects.
Motor learning following active and passive reinforced training is retained at 1 week interval. No learning or retention is seen following nonreinforced passive movements. Bars represent the change in motor performance, relative to the initial baseline, measured immediately following training (yellow bars) and after a 1 week interval during which no further training occurred (purple bars). Negative numbers indicate improvement in motor performance relative to the baseline. a, Results for the perpendicular deviation at movement peak velocity. b, Deviation at movement endpoint. *p < 0.05.
The three retest groups differed significantly in their change in motor performance from baseline to the 1 week retest (MANOVA, PDmaxv: F(2,29) = 4.6, p < 0.02, ηp2 = 0.24, power = 0.73; PDend: F(2,29) = 4.91, p < 0.02, ηp2 = 0.25, power = 0.76). The active reinforcement group showed significantly greater retention than the passive control group (Bonferroni-Holm corrected post hoc: PDmaxv: p < 0.04; PDend: p < 0.03). The passive reinforcement group also showed significantly greater retention than the passive control group (PDmaxv: p < 0.04; PDend: p < 0.04). The active and passive reinforcement groups showed no difference in the amount of retention (PDmaxv: p > 0.9; PDend: p > 0.6). Retention in the active reinforcement group was significantly greater than zero when assessed at movement endpoint (PDend: t(9) = 2.42, p < 0.04), and marginally significant at movement peak velocity (PDmaxv: t(9) = 2.12, p = 0.06). Retention in the passive reinforcement group was significantly greater than zero at movement peak velocity (PDmaxv: t(11) = 2.26, p < 0.05; PDend: t(11) = 1.14, p > 0.2). The passive control group showed a nonsignificant tendency to increase movement error at 1 week compared with initial baseline performance (PDmaxv: t(9) = −1.53, p > 0.1; PDend: t(9) = −2.11, p > 0.06).
In summary, subjects in the passive reinforcement group showed improvement in movement following training comparable with that of subjects in the active reinforcement condition. In a retest 1 week after training, both reinforcement conditions continued to show reliably better movement performance than at baseline.
Visual control experiment
We conducted a control study to test whether the improvement in motor performance observed in the reinforcement conditions could be explained simply by knowing the position of the target. Figure 4d shows the change in motor performance relative to baseline for participants in the visual reinforcement group. These subjects did not show any evidence of learning from pre-training to post-training (dependent samples t test, PDmaxv: t(35) = 0.45, p > 0.6; PDend: t(35) = −0.80, p > 0.4). This indicates that merely knowing the position of the target is not enough to generate correct movements in our task.
Motor learning is associated with improvements in somatosensory classification accuracy
Perceptual classification performance was assessed before and after motor learning. Figure 6 shows changes in classification accuracy, for the two versions of the experiment separately (Fig. 6a,b) as well as for the entire dataset (Fig. 6c). It can be seen that, while there are differences between the two movement directions, both active and passive participants that received reinforcement improve, whereas control participants show no consistent pattern.
Participants who received reinforcement during training improved somatosensory classification accuracy compared with baseline. No systematic changes were seen for control participants who did not receive reinforcement. Changes in somatosensory classification accuracy are shown separately for the straight-ahead version of the experiment (a), for the version with the target at 135° (b), and for the combined dataset (c).
Statistical analysis indicated that participants who received reinforcement showed a small (∼2%) but statistically significant increase in somatosensory classification accuracy (Time × Reinforcement interaction: F(1,183) = 3.94, p < 0.05, ηp2 = 0.02, power = 0.51; post hoc: p < 0.02), whereas no changes were observed in participants that did not receive reinforcement (post hoc: p > 0.6). No differences were found between active and passive participants, either as a main effect (F(1,183) = 0.02, p > 0.8), in the classification pattern from pre-training to post-training (Training mode × Time interaction: F(1,183) = 1.22, p > 0.2), or in its dependence on reinforcement (Training mode × Time × Reinforcement interaction: F(1,183) = 0.44, p > 0.5).
We also found that, over the entire dataset, classification accuracy showed a moderate but statistically significant correlation with motor performance. Higher perceptual classification accuracy in the final test was correlated with a decrease in endpoint movement error from pre-training to post-training (r(187) = −0.208, p < 0.01).
Discussion
We have presented an experimental model for early motor learning. Subjects make movements to uncertain targets, such as occurs outside of the laboratory in learning to speak or in acquiring motor skills, such as tennis or golf. We investigated the extent to which motor learning under these circumstances is dependent on active movement versus somatosensory afferent information. Participants learned to perform reaching movements to an unseen target location in the absence of vision. They trained either with self-generated movements or with closely matched passive movements generated by a robotic arm. We found that learning was similar in the active and passive conditions. Passive participants showed from 75% to 100% of the learning exhibited by active participants, and learning following both conditions was retained at a 1 week interval. In both the active and passive conditions, the reinforcement of successful actions was crucial to learning. Training with reinforcement also resulted in improved somatosensory classification accuracy.
Somatosensory information may facilitate motor learning in various ways. First, somatosensory stimulation has been shown to improve motor performance in situations in which the stimulation is not directly matched to the movement that has to be learned. One example is an increased rate of learning in a fast thumb movement task following a discrimination task that involved low-amplitude thumb muscle vibrations (Rosenkranz and Rothwell, 2012). According to these authors, sensory attention in the context of somatosensory stimulation may transiently change how proprioceptive input is integrated in motor cortex. Other examples of nonspecific effects of somatosensory stimulation on motor performance have been described in the context of repetitive electric stimulation (Kalisch et al., 2010), coactivation-based long-term potentiation (Dinse et al., 2005), and stochastic resonance (Priplata et al., 2003). Somatosensory stimulation that is directly relevant to the motor task also facilitates learning (Darainy et al., 2013). The participants in this latter study underwent a perceptual training procedure in which they had to indicate whether a robotic arm moved their unseen arm to the right or to the left of the body midline, and feedback on their accuracy was provided. Following perceptual training, participants showed an increased rate and extent of motor adaptation to lateral perturbations of their reaching movements, and these improvements were correlated with changes in connectivity in frontal motor circuits (Vahdat et al., 2014). The beneficial effects of perceptual training in the latter two studies were found to be dependent on decision-making followed by reinforcement, whereas passive movements without feedback were less able to alter subsequent learning.
Although these studies show that motor performance may benefit from somatosensory stimulation, they do not investigate the extent to which somatosensory and motor factors contribute to learning. Indeed, in all of the above manipulations, the somatosensory stimulation received does not correspond to the movement kinematics or the somatosensory feedback that would be normally associated with the execution of that movement. Few studies have tried to tease apart the motor and somatosensory components of learning. Lotze et al. (2003) found that learning the timing of wrist flexion/extension through passive movements resulted in poorer outcomes than active training. Indeed, it is not clear whether passive movements yielded any improvement at all. Two other studies provided some evidence for motor learning following somatosensory training using passive movements (Beets et al., 2012; Wong et al., 2012). However, in the study by Beets et al. (2012), the complexity of the task made necessary the presence of continuous visual feedback for any learning to be observed; whereas in the study by Wong et al. (2012), passive movements were always intermingled with active movements.
A possible influence of somatosensory input on motor learning comes from the plasticity that occurs as a result of movement repetition, referred to as “use-dependent” or “experience-dependent” learning. It is known that repeating movements establishes a change in the cortical network representation of the muscles involved (Classen et al., 1998) and biases new actions toward a previously repeated action (Diedrichsen et al., 2010; Huang et al., 2011; Verstynen and Sabes, 2011). To our knowledge, passive movements have been investigated as a possible source of use-dependent learning in only one study. Diedrichsen et al. (2010) showed that passive movements toward a specific direction led to changes to subsequent active movements toward that same direction, although in this experiment the bias was only introduced in a task-redundant dimension. The results of the study by Diedrichsen et al. (2010) resemble those of the passive control participants. The time-series analyses we performed showed that the movements immediately following nonreinforced passive trials were effectively biased toward the repeated direction (for similar results following active movements, see also Huang et al., 2011; Verstynen and Sabes, 2011). However, in our experiment, this effect was transient and entirely washed out in the course of the first few active movements. Learning in the reinforced passive group instead was retained throughout the post-training session, and even at a 1 week interval, thus ruling out an explanation of our results exclusively based on experience-dependent plasticity.
The presence of binary reinforcement paired with somatosensory stimulation seemed to be necessary for long-term retention. This is consistent with the findings of several previous investigations that showed the added benefits of reinforcement over experience-dependent plasticity in motor adaptation. For example, Darainy et al. (2013) showed that perceptual training through passive movements with binary feedback was more effective than mere exposure to the same training trajectories in accelerating subsequent force-field learning. Huang et al. (2011) showed that repeating a movement that corresponds to the solution of a visuomotor rotation subsequently results in an increased learning rate only if the repetition was associated with successfully achieving the target. On the contrary, no benefits were observed if the repeated movement had not been previously associated with successful adaptation. Shmuelof et al. (2012a) showed that the retention of a visuomotor rotation was better in a group of participants that received additional training with binary feedback, than for those who received detailed error information in addition to the binary feedback. This was seen despite the fact that initial learning of the visuomotor rotation, and therefore the previous motor experience, was identical in the two groups.
A further consideration pertains to the role of cognitive factors, such as explicit reaching strategies. The visual reinforcement study we have run rules out the possibility that the learning observed in the active or passive reinforcement groups was solely due to knowledge of the target position. Indeed, there were no improvements in motor performance when participants were repeatedly shown the target position associated with reinforcement, but in the absence of somatosensory training. Participants in the visual reinforcement study showed a pattern qualitatively similar to the passive control participants, with a transitory improvement in the movements immediately following reinforced visual presentation of the target and a rapid washout. This may suggest a cognitive component for what has been so far referred to as experience-dependent learning, that could be investigated in future studies.
It has been suggested that initial learning is dependent on predictive feedforward processes, whereas repetition-based learning is prominent later on, when the asymptotic solution of motor adaptation results in repeated successful movements (Diedrichsen et al., 2010). This explanation fits with the results of adaptation studies. However, when learning a novel motor skill, it is not possible to use an adaptation-based strategy for correcting errors because there is not a clear sensory target relative to which error can be computed. When acquiring a novel motor skill, exploration, possibly repetition, and reinforcement could indeed play a dominant role in the early stages of learning, with error-based procedures becoming effective only at later stages. Accordingly, the choice of perturbation and adaptation as an experimental model of learning can influence conclusions regarding motor learning. This issue was already recognized and discussed in detail previously (Shmuelof et al., 2012b; see also Stanley and Krakauer, 2013).
There is increasing evidence showing that motor learning promotes somatosensory plasticity, in the form of perceptual recalibration (Haith et al., 2008; Cressman and Henriques, 2009; Ostry et al., 2010; Wilke et al., 2013) and changes to intrinsic brain connectivity (Vahdat et al., 2011). The majority of these studies have used motor adaptation; therefore, the extent to which somatosensory change accompanies the development of novel motor skills is yet to be explored. In the only study to our knowledge investigating changes in proprioceptive function following motor skill learning (Wong et al., 2011), the authors trained participants to make accurate and fast reaching movements, in the absence of external perturbations. They found an ∼11% increase in proprioceptive acuity following motor training with active movements and visual feedback, but not following passive movements without feedback. Our results confirm and extend these previous findings. We show an increase in perceptual acuity for reinforced participants who train with active movements, and no change in acuity for passive movements without feedback. A new finding is that, when passive movements are coupled with reinforcement, somatosensory perceptual acuity improves to an extent comparable with active participants. The increase in perceptual acuity in our study is small, on the order of 2%. The training duration and training breadth are possible explanations for this difference compared with the study by Wong et al. (2011), as both were greater in this previous investigation, providing greater opportunity for generalization (Berniker et al., 2014) and possibly for somatosensory plasticity as well. We also notice that our perceptual testing procedure mainly targeted proprioceptive function, but it would be interesting to further investigate the extent to which motor learning may influence tactile function, in addition to proprioception.
In conclusion, the present study showed that, in the early stages of human motor learning, somatosensory experience paired with reinforcement improves motor as well as somatosensory function to a degree comparable with that seen following training with active movements.
Footnotes
This work was supported by the National Institute of Child Health and Human Development R01 HD075740. We thank Floris Van Vugt for help with the GAM analysis.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. David J. Ostry, Department of Psychology, 1205 Dr. Penfield Avenue, Stewart Biology Building, Montreal, QC H3A 1B1, Canada. david.ostry{at}mcgill.ca