Abstract
Plasticity in the subcortical motor basal ganglia–thalamo–cerebellar network plays a key role in the acquisition and control of long-term memory for new procedural skills, from the formation of population trajectories controlling trained motor skills in the striatum to the adaptation of sensorimotor maps in the cerebellum. However, recent findings demonstrate the involvement of a wider cortical and subcortical brain network in the consolidation and control of well-trained actions, including a brain region traditionally associated with declarative memory—the hippocampus. Here, we probe which role these subcortical areas play in skilled motor sequence control, from sequence feature selection during planning to their integration during sequence execution. An fMRI dataset (N = 24; 14 females) collected after participants learnt to produce four finger press sequences entirely from memory with high movement and timing accuracy over several days was examined for both changes in BOLD activity and their informational content in subcortical regions of interest. Although there was a widespread activity increase in effector-related striatal, thalamic, and cerebellar regions, in particular during sequence execution, the associated activity did not contain information on the motor sequence identity. In contrast, hippocampal activity increased during planning and predicted the order of the upcoming sequence of movements. Our findings suggest that the hippocampus preorders movements for skilled action sequences, thus contributing to the higher-order control of skilled movements that require flexible retrieval. These findings challenge the traditional taxonomy of episodic and procedural memory and carry implications for the rehabilitation of individuals with neurodegenerative disorders.
Significance Statement
Skilled actions like handwriting, typing, and playing music involve recombining movements into various sequences and producing them from memory. This fMRI study suggests that the hippocampus, traditionally associated with episodic memory and spatial navigation, preorders well-trained movement sequences prior to their execution. These findings challenge the canonical separation between episodic and procedural memory systems and the neural basis of skilled motor behaviors.
Introduction
The neural control of skilled movements is distributed across cortico–striato–cerebellar circuits linked via interlocking loops in the thalamus (Bostan and Strick, 2018). As motor skills are learnt, the dominant assumption is that there is a transition from associative–cognitive to motor reference frames (Hikosaka et al., 2002; Dayan and Cohen, 2011; Verwey, 2023). This is implemented through a shift toward the involvement of cortical and subcortical areas with mono- and disynaptic projections to brainstem and spinal control centers which control the respective effectors, particularly the contralateral primary motor cortex, the putamen, and the ipsilateral anterior lobule of the cerebellum (CB; Kelly and Strick, 2003; Bostan et al., 2013; Graybiel and Grafton, 2015; Bostan and Strick, 2018). This shift is thought to underlie the formation of a motor “repertoire” enabling automatic rather than slow and effortful cognitive control of fine-grained kinematic trajectories and sensorimotor transformations from typing and handwriting to highly skilled musical and athletic performance (Wong and Krakauer, 2019; Harpaz et al., 2022).
Yet, a growing body of evidence from neurophysiology and neuroimaging experiments in both humans and animal models suggests that the network of brain regions engaged in later stages of skill learning and control continues to extend beyond the core motor regions (Diedrichsen and Kornysheva, 2015). This applies to skills even after prolonged training over weeks or months (Shima and Tanji, 1998; Averbeck et al., 2003; Mushiake et al., 2006; Berlot et al., 2020, 2021) and during sequence production entirely from memory without visual or other external guidance (Yewbrey et al., 2023). The associated cortical activity patterns have been linked to higher-order control of skilled motor sequences such as the serial order and timing of motor sequences (Kornysheva and Diedrichsen, 2014), to enable behavioral flexibility, a key feature of skilled motor control in humans. In contrast, activity within primary motor cortical regions shows a lack of tuning to skilled sequences of movements, even following extensive practice (Shima and Tanji, 2000; Yokoi and Diedrichsen, 2019; Zimnik and Churchland, 2021).
While the cortical control of motor skills has been characterized in detail over the last three decades, the complementary roles of striato–thalamo–cerebellar loop regions remains poorly delineated. Specifically, it remains unclear whether the encoding of skilled motor sequences, e.g., in striatal and cerebellar effector-related areas, is linked to high-order control of sequence chunking and action selection or to lower-order control of movement kinematics (Graybiel and Grafton, 2015; Dhawale et al., 2021; Wolff et al., 2022). Furthermore, the hippocampus has long been overlooked as a contributor to procedural motor control due to landmark reports of preserved procedural memory in amnesic patients with medial temporal lobe damage (Brooks and Baddeley, 1976; Cohen and Squire, 1980). More recently, the original taxonomy has been questioned by findings pointing toward its contribution to sequence learning and consolidation, both during sleep and short breaks (Albouy et al., 2008, 2013; Doyon et al., 2009; Lungu et al., 2014; Buch et al., 2021). Although there is now strong evidence for the hippocampus in procedural motor skill consolidation, its role in the online motor control of skilled movements remains unclear. It remains uncertain whether this contribution is related to the retrieval of task-context associations or directly tied to the control of specific features of the performed movement sequences.
Here, we set out to tease apart the roles of subcortical areas in skilled movement sequence planning and execution from memory. Participants produced movement sequences with particular order and timing entirely from memory in a delayed sequence production task following 2 d of training (Yewbrey et al., 2023). Using canonical and multivariate analysis of fMRI patterns in effector-related contralateral striato–thalamo–cerebellar and bilateral hippocampal areas, we probed the tuning of these regions to motor sequences and their ordinal and temporal features during planning and execution. We found that the hippocampus increased its activity and carried accurate information on the upcoming movement sequence—the movement order independently of the planned sequence timing. In contrast, the activity increase within effector-related striato–thalamo–cerebellar regions during planning and execution was not sequence-specific. This implies their involvement in the lower-order control of individual movement components reused across sequences.
Materials and Methods
Participants
Twenty-four participants (14 females and 10 males; mean age, 21.00 years; SD, 1.64 years) completed the 3 d experiment, having met all behavioral and imaging requirements. All participants had no history of neurological disorders. Twenty-three participants were right-handed with a mean Edinburgh Handedness Inventory [https://www.brainmapping.org/shared/Edinburgh.php; adapted from Oldfield (1971)] score of 75.22 (SD, 20.97; range, 25–100); one was left-handed with a score of −70. Although our preregistration (www.osf.io/g64hv) stated that we would exclude left-handed individuals, this participant's data did not differ qualitatively from the rest of the sample so were included. The sample comprised individuals without professional training in music, dance, or sports. On average, participants had only 2.3 years of musical or dance training (SD, 3.3), with 9 out of 24 having no prior experience with any additional training in music, dance, or sports during their lifetime beyond school, in line with numbers from the general population in the UK (Okely et al., 2021). One individual did not complete the questionnaire on previous training. We therefore believe that the sample is representative of individuals with no professional-level skill in music, dance, or sports.
Data collected from an additional 17 participants were excluded for the following reasons: one due to unforeseen technical difficulties with the apparatus, one because of a corrupted functional scan, and 15 further did not reach target performance after 2 d of training. Target performance required an error rate of <20% (group mean, 6.54%; SD, 6.03%) and statistically distinct sequence timing structures that were maintained across different sequence orders; the full details regarding the exclusion criteria have been described previously (Yewbrey et al., 2023). Participants were recruited either through a participation panel at Bangor University and awarded module credits or through social media and given monetary reward at a standard rate. Individuals with professional musical qualifications were excluded from recruitment. All participants provided informed consent, including consent to data analysis and publication, through an online questionnaire hosted by Qualtrics. The Bangor University School of Psychology Ethics Committee approved this experiment and its procedures (Ethics Approval Number 2019-16478).
Apparatus
Two custom-built force transducer keyboards captured presses from all 10 fingers on the right and left hands of participants as they produced movement sequences. Each key had a small groove where participants placed their finger and could be adjusted for comfort according to the size of the participant's hand. Below each key, a force transducer sampled force at 1,000 Hz (Honeywell FS Series, with a range of up to 15 N). The keys were not depressible. Force acquisition occurred in each trial from 500 ms before the sequence cue onset to the end of the production period in production trials and the end of the false production period in no-go trials. Traces from the right hand were baseline-corrected trial-by-trial using the first 500 ms of acquisition (500 ms before the sequence cue) and smoothed to a Gaussian window of 100 ms. Button presses were recognized when a channel's force exceeded a fixed threshold relative to the baseline (2.5 N for the first eight participants and 1 N for the subsequent 16 of 24 participants). The further apparatus used during the experiment has been described in a previous paper (Yewbrey et al., 2023).
Behavioral task
Participants were trained to produce four five-finger sequences with defined timing structures, defined by interpress intervals (IPIs), from memory in a delayed sequence production paradigm. Go trials began with an abstract fractal image (sequence cue), which was associated with a sequence. Following the sequence cue, a fixation cross was shown to allow participants to prepare the upcoming sequence. A black hand with a green background (go cue) then appeared to cue sequence production from memory. Succeeding the go cue, another fixation cross was presented. Feedback was then displayed to participants based on their performance during the preceding production period, finally followed by an intertrial interval (ITI) where a further fixation cross was displayed. During training, participants learned the sequences through repeated exposure to visually guided (instructed) go trials. These instructed go trials were functionally identical to from-memory go trials but featured a go cue with a gray background and a red dot on the tip of each finger on the hand image, which moved from finger to finger in the target production order and in-pace with the target timing structure. No-go trials had the same structure as go trials, but the go cue did not appear following the preparatory fixation cross. Instead, the fixation cross continued to show for an extended period. This was succeeded by a further fixation cross, feedback, and ITI, as in go trials.
The four five-finger target sequences consisted of permutations of two finger orders (Order 1 and 2) and two IPI orders (Timing 1 and 2) matched in finger occurrence and sequence duration. Sequence orders were generated randomly for each participant, making sure to avoid ascending and descending press triplets and any identical sequences. Furthermore, each participant's trained sequences began with the same finger press to avoid differences in the first press driving the decoding of sequence identity during preparation (Yokoi et al., 2018). Timing structures were the same across participants, which consisted of four target IPI sequences as follows: 1,200 ms - 810 ms - 350 ms - 650 ms (Timing 1) and 350 ms - 1,200 ms - 650 ms - 810 ms (Timing 2).
The trial-by-trial feedback used a point-based scale, ranging from 0 to 10. Points were awarded based on initiation reaction time and temporal deviation from target timing. Zero points were awarded if the executed press order was incorrect. If the executed press order was correct, participants were awarded their earned timing points. In no-go trials, five points were awarded if no press was made as instructed. Participants were presented with a feedback screen after each trial showing the number of points achieved in the current trial, as well as visual feedback on whether they pressed the correct finger at the correct time. A horizontal line was drawn across the center of the screen, with four symbols displayed equidistantly along the line representing each of the five finger presses. Correct presses were indicated by an “X” symbol, and incorrect presses were represented by a “–” symbol, for each respective sequence position. The vertical position of these symbols above (“too late”) or below (“too early”) the horizontal line was proportional to the participant's timing of the respective press relative to target IPI. Using these cues, participants were able to adjust their performance online to ensure maximum accuracy of sequence production. During the first 2 d of training, auditory feedback in the form of successive rising tones corresponding to the number of points (0–10) was played alongside the visual feedback. Auditory feedback was absent during the fMRI session, to prevent any auditory processing driving decoding accuracy. All aspects of the behavioral task for this experiment are described in greater detail in a previous study (Yewbrey et al., 2023).
Procedure
Training duration was consistent across participants and occurred across the first 2 d of the experiment over three distinct training stages. In the first training stage, 80% of all trials were instructed go trials, and the remaining 20% were no-go trials. During the second training stage, 40% of trials were instructed go trials, 40% were from-memory go trials, and 20% were no-go trials. In the third and final stage of training, 80% of trials were from-memory go trials, and 20% were no-go trials. The third day took place inside the MRI scanner and consisted of a short refresher stage, made up of the same proportion of trials as the second stage of training, during which T1 images were collected. Next, there was an fMRI stage consisting of six imaging runs, featuring 50% from-memory go trials and 50% no-go trials. In addition, before and after the last training stage, participants completed a synchronization task which has been described previously (Yewbrey et al., 2023).
MRI acquisition
Data were acquired using a 32-channel head coil in a Philips Ingenia Elition X 3 T MRI scanner. T1 anatomical scans at a 0.937 × 0.937 × 1 resolution were acquired using MPRAGE, encoded in the anterior–posterior dimension with an FOV of 240 × 240 × 175 (A-P, R-L, F-H). For the functional data, T2*-weighted scans were collected across six runs of 230 volumes at a 2 mm isotropic resolution, with 60 slices at a thickness of 2 mm. The functional images were acquired at a TR of 2 s, a TE of 35 ms, and a flip angle of 90°. These were obtained at a multiband factor of 2, in an interleaved odd–even EPI acquisition. To allow the stabilization of the magnetic field, four images were discarded at the beginning of each run. The whole brain of most participants was covered, except for the central prefrontal cortex, the anterior temporal lobe, and ventral parts of the CB. Jitters were used within each trial during preparation periods, postproduction fixation crosses, and ITIs, to give us a more accurate estimate of the Hemodynamic Response Function (HRF) by varying which part of the trial is sampled by each TR (Serences, 2004).
Preprocessing and first-level analysis
All fMRI preprocessing steps were completed using SPM12 (revision 7219) in MATLAB (MathWorks). We applied slice timing correction using the first slice as a reference to interpolate all other slices too, ensuring analysis occurred on slices which represent the same time point. Realignment and unwarping were conducted using a weighted least-square method correcting for head movements using a six-parameter motion algorithm. A mean EPI was produced using SPM's Imcalc function, wherein data acquired across all six runs were combined into a mean EPI image to be coregistered to the anatomical image. Mean EPIs were coregistered to anatomical images using SPM's coreg function, and their alignment was checked and adjusted by hand to improve the alignment, if necessary. All EPI runs were then coregistered to the mean EPI image. For the GLM, regressors were defined for each sequence separately for both preparation and production, resulting in eight regressors of interest per run. Preparation- and production-related BOLD responses were independently modeled from no-go and go trials, respectively, to tease out activity from these brief trial phases despite the hemodynamic response lag (Logothetis, 2003). The preparation regressor consisted of boxcar function starting at the onset of the sequence cue in no-go trials and lasting for the duration of the maximum possible preparation phase (2,500 ms). The production regressor consisted of a boxcar function starting at the onset of the first press with a fixed duration of 0 (constant impulse), to capture activity related to sequence initiation and extract sequence production-related activity from the first finger press that was matched across sequences within each participant. We aimed at capturing BOLD responses related to neuronal populations that become differentially active for different sequences (Tanji and Shima, 1994), for which a single estimate of sequence production has been used to successfully identify sequence representations in several previous fMRI studies (Wiestler and Diedrichsen, 2013; Kornysheva and Diedrichsen, 2014; Nambu et al., 2015; Yokoi et al., 2018; Berlot et al., 2020). Additionally, we included several regressors of no interest: (1) error trials (incorrect or premature presses during go trials and presses during no-go trials), which were modeled from the sequence cue onset to the end of the ITI; (2) the preparation period in go trials (1,000–2,500 ms from sequence cue); and (3) the temporal derivate of each regressor. The boxcar model was then convolved with the standard HRF. To remove the influence of movement-related artifacts, we used a weighted least-square approach (Diedrichsen and Shadmehr, 2005). This resulted in us obtaining beta weight images for each of the eight conditions per scanning run, for all six runs. We further calculated the percent signal change during preparation and production relative to rest and compared each with a baseline of zero using two-tailed one–sample t tests (Fig. 2a). These tests were Bonferroni-corrected two times, to account for phase (2) within each predefined ROI. Additionally, we identified significant clusters of activity constrained to each ROI using a random effect analysis with an uncorrected threshold of t(23) > 3.48, p < 0.001, and a cluster-wise p value for the cluster of that size (Worsley et al., 1996).
Subcortical and cerebellar regions of interest
FreeSurfer's automatic segmentation (Dale et al., 1999) was used to segment subcortical regions of interest (ROIs) consisting of the thalamus, caudate nucleus, putamen, and hippocampus, from each participant's T1 anatomical image. We then resliced each region's mask into the same resolution as the functional images (2 × 2 mm isotropic) and further masked them using functional activity. Only beta weights from voxels within these subcortical functional masks were extracted, constraining analyses to voxels belonging to subcortical ROIs. Furthermore, to assess the spatial organization of multivariate patterns, we defined 160-voxel volumetric searchlights with a maximum radius of 6 mm in native space within each subcortical ROI. Linear discriminant analysis (LDA) accuracy (see below, LDA) obtained was assigned to the center voxel of the active searchlight.
For the CB, we used the SUIT cerebellar toolbox (Diedrichsen, 2006) to segment the whole structure based on individual anatomical T1 images. We then resliced the lobular probabilistic cerebellar atlas (Diedrichsen et al., 2009) into each participant's native space to acquire masks for CB lobules IV and V and resliced these masks into functional resolution. Beta weights were extracted from these regions using these masks to constrain analyses to voxels of interest. Additionally, we ran a 160-voxel volumetric searchlight analysis across the whole CB using each participant's individual anatomy. LDA accuracy was assigned to the center voxel of the active searchlight in a similar manner to the subcortical analysis. Classification accuracy maps were subsequently resliced into SUIT space to display group results on a cerebellar surface flatmap (Diedrichsen and Zotow, 2015).
LDA
LDA was used to detect sequence-specific representations (Kornysheva and Diedrichsen, 2014; Kornysheva et al., 2019; Yewbrey et al., 2023), programmed in a custom-written MATLAB (MathWorks) code. We extracted mean patterns and common voxel-by-voxel covariance matrices for each class from the training dataset (five of the six imaging runs), and then a Gaussian linear discriminant classifier was used to distinguish between the same classes in the test dataset (the remaining imaging run). The factorized classification of finger order, timing, and integrated order and timing followed the previous approach (Kornysheva and Diedrichsen, 2014; Yewbrey et al., 2023) and was performed on betas estimated from the sequence preparation and production periods independently. For the decoding of sequence order, the classifier was trained to distinguish between two sequences with different orders but matching timing across five runs and was then tested on two sequences with the same orders paired with a different timing in the remaining run. This classification was then cross-validated across runs and across training/test sequences, for a total of 12 cross-validation folds. For the decoding of sequence timing, the classifier was trained to distinguish between two sequences with differing timings paired with the same order and tested on two sequences with the same two timings paired with a different order and underwent the same cross-validation procedure. This method of training and testing the linear discriminant classifier allowed for identification of sequence feature representations that were transferrable across conditions they are paired with and therefore independent. The integrated classifier was trained to distinguish between sequences on five runs and then tested on the remaining run. Here, the mean activity for each timing (collapsed across two orders) and finger order (collapsed across two timings) condition within each run was subtracted from the overall activity for each run, separately (Kornysheva and Diedrichsen, 2014; Yewbrey et al., 2023). This allowed for the measurement of residual activity patterns that were not explained by a linear combination of timing and order, rather spatiotemporal idiosyncrasies that would result in the generation of unique kinematics for the respective sequence that were not transferrable to others. We then trained our integrated classifier to distinguish between all four sequences based on these residual activity patterns across five imaging runs and then tested its accuracy on the remaining imaging run, cross-validated across runs.
We normalized classification accuracy by transforming to z scores, assuming a binomial distribution of the number of correct guesses (Kornysheva and Diedrichsen, 2014). We then tested these z scores against zero (chance level) in subcortical and cerebellar ROIs across participants for statistical analysis. Z score-transformed classification accuracy maps were subjected to a random effect analysis, with an uncorrected threshold of t(23) > 3.48, p < 0.001, and a cluster-wise p value for a cluster of that size (Worsley et al., 1996). Analyses that considered all voxels within respective ROIs were also subjected to one-tailed one–sample t tests relative to zero, Bonferroni-corrected six times to account for phase (2) × classifier (3), to identify decoding accuracy above the chance level. We also ran repeated-measure ANOVAs on regional distance measures with factors of phase, classifier, and region to assess interaction effects and ran post hoc pairwise comparisons to investigate significant interaction effects.
Multidimensional scaling of sequence patterns across preparation and production
To determine whether the sequence-specific voxel patterns undergo a qualitative change from planning to execution or are simply scaled up or scaled down versions of the same patterns, we calculated the multidimensional distances between the two phases. We firstly prewhitened beta weights prior to multidimensional scaling to increase the reliability of our distance measurements, using a regularized estimate of the overall noise–covariance matrix (Walther et al., 2016). From these beta weights, cross-validated Mahalanobis distances were computed for each of the sequences across preparation and production, resulting in representational dissimilarity matrices (RDMs; Fig. 4). We then identified the first three principal components (PCs) associated with the three largest eigenvalues in our prewhitened beta weights using classical multidimensional scaling of the variance–covariance matrix for visualization of distance across the two phases. This method plots all eight conditions in 3D space according to the first three eigenvectors (Fig. 4b).
Next, we set out to analyze the changes related to the neural representation of the sequences across preparation and production. Since overall activity levels varied greatly within regions across phases (Fig. 2), the largest eigenvalues are primarily associated with differences in activity levels across phase along PC1 (Fig. 4), which are of limited interest to sequence-related representations. Thus the Euclidean distances were obtained from PC2 and 3 only, similar to previous approaches (Ariani et al., 2022). To prevent that changes in representations are driven by a simple scaling up of the patterns from preparation to production, we normalized the cross-phase Euclidean distance relative to the grand mean of pairwise k(k − 1) / 2 Euclidean distance values between sequences within-phase, providing cross-phase distance values that represented a percentage of the scale of the overall representational structure. Finally, given a nonzero level of noise in the data, the Euclidean distance between the representations in preparation and production could still be positively biased. We therefore simulated fMRI activity patterns to establish a no-switch baseline of representation patterns from preparation and production (Fig. 4a, left panel) and a switch pattern (Fig. 4a, right panel) with a matched number of voxels, sample size, and signal-to-noise ratio relative to the mean across participants of each ROI collected in the empirical data. The simulated activity pattern (y) for the ith spatial and jth temporal sequence for the kth imaging run generated as yi,j,k = si + tj + ii,j + rk + ei,j,k with each pattern component was generated as a normal random vector over 160 voxels (Kornysheva and Diedrichsen, 2014). The variance of the spatial (s), temporal (t), and integrated (i) representations were fixed at the same level with the same distribution used for planning and execution phases in the baseline and the noise (e) matched to the empirical signal-to-noise ratio in the respective ROI (https://github.com/RYewbrey/prepProd2/blob/main/simulation/prepProdSimu_makedataPP.m). Additionally, these distances were scaled by the same factor as the signal-to-noise ratio from preparation to production in the respective ROI and normalized according to the average distances between conditions during preparation and production, respectively. Simulated sequence patterns during preparation and production were either generated using the same distribution (left panel; no switch) or different distributions (right panel; switch). The obtained simulated datasets were then submitted to the same multidimensional scaling analysis as the empirical data. Cross-phase within–sequence Euclidean distances were calculated to provide ROI-specific no–switch baselines for the one-sample t tests carried out in each region and Bonferroni-corrected for seven regions.
Results
Subcortical activity modulation during sequence planning and execution
Twenty-four participants were trained to produce four five-finger sequences from memory using their right hand. Behavioral training progressed from production under visual instruction to production from memory across 2 d. On the final third day, participants produced the sequences entirely from memory in an MRI scanner. The behavioral and cortical results are reported in detail in a previous paper (Yewbrey et al., 2023). To isolate fMRI patterns relating to movement production from those relating to movement preparation in a fast-paced paradigm resembling the naturalistic timescales of sequence planning and execution, we used two independent trial types: go trials (Fig. 1a), to sample production activity, and no-go trials (Fig. 1b), to sample preparatory activity. Go trials began by cueing the upcoming sequence with an abstract fractal image, followed by a fixation cross, then a black hand with a green background (go cue) indicated that the sequence should be executed (Fig. 1c). A further fixation cross, then performance feedback, followed. No-go trials were identical to go trials during movement preparation; however, they did not display a go cue. Instead, a fixation cross remained on the screen for an extended period. To encourage sequence planning regardless of trial type, participants were incentivized to initiate sequences as quickly as possible upon the presentation of the go cue (Mantziara et al., 2020). Half of the points in each go trial reflected the initiation RT, and “go” trials could reward up to double the points of “no-go” trials in the case of fast initiation and correct sequence execution. Indeed, the average initiation RT for the five-press sequences (M, 424 ms; SD, 67.05) was substantially shorter than the choice RT previously observed for trained four- and six-press sequences without the preparation period (∼500 ms; Verwey, 2004), suggesting that participants engaged in sequence planning prior to execution. Finally, the trials were pseudorandomized, reducing the expectation of a particular trial type. Therefore, the underlying process of sequence planning was identical in both trial types until the point where these trials differentiated feedback followed, and participants were rewarded for not producing any presses (see Materials and Methods).
Firstly, we examined whether the subcortical ROIs with connections to the active effector (right hand)—the left thalamus, caudate, putamen, and the right CB lobules IV and V—as well as the bilateral hippocampi, changed overall BOLD activity during the delayed sequence production task. We defined all ROI using each participant's individual anatomy and extracted the percent signal change during preparation (no-go trials) and production (go trials) relative to rest (see Materials and Methods). The extracted percentage signal change values from across all voxels belonging to each ROI were then averaged for each participant (Fig. 2a,b). All tests were Bonferroni-corrected twice to account for phase (preparation and production) in each predefined ROI. During preparation, we observed activity above the baseline in the caudate (t(23) = 3.28; p = 0.006; d = 0.67), putamen (t(23) = 2.67; p = 0.028; d = 0.55), and hippocampus on both the left (t(23) = 6.28; p < 0.001; d = 1.28) and right (t(23) = 4.63; p < 0.001; d = 0.94) sides. No significant difference relative to baseline was found in the thalamus nor in cerebellar lobules IV and V during preparation (p > 0.114; d < 0.41). During production, activity increases were observed in all subcortical area with connections to the active effector—contralateral thalamus (t(23) = 7.54; p < 0.001; d = 1.54), caudate (t(23) = 2.66; p = 0.028; d = 0.54), putamen (t(23) = 4.21; p < 0.001; d = 0.86), CB lobule IV (t(23) = 5.61; p < 0.001; d = 1.15), and CB lobule V (t(23) = 12.22; p < 0.001; d = 2.49). The hippocampus, however, showed activity decreases during production on the left (t(23) = 2.45; p = 0.044; d = 0.50) and right (t(23) = 3.19; p = 0.008; d = 0.65) sides. In sum, only the striatum and hippocampus were active during sequence preparation from memory. In contrast, the striato–thalamo–cerebellar regions with direct relevance to the active effector increased their activity during production, whereas hippocampal activity decreased.
We then localized the peak activations within each subcortical area and across the whole cerebellar volume by identifying significant clusters of activation using a random effect analysis (Fig. 2a,b). All results were corrected using SPM's small-volume correction and Bonferroni-corrected twice to account for phase. During preparation, we found significant clusters of activation in the caudate's ventral anterior head (t(23) = 6.02; pFWE = 0.008; extent = 101; MNI coordinates, x = −16; y = 22; z = −5), anterior body (t(23) = 5.56; pFWE = 0.020; extent = 83; x = −16; y = 22; z = 7), and dorsal anterior tail (t(23) = 5.55; pFWE = 0.008; extent = 106; MNI coordinates, x = −12; y = −6; z = 17), posterior putamen (t(23) = 5.71; pFWE < 0.001; extent = 228; MNI coordinates, x = −32; y = −14; z = 9), anterior left hippocampus (t(23) = 7.97; pFWE < 0.001; extent = 766; MNI coordinates, x = −26; y = −22; z = −15), anterior right hippocampus (t(23) = 6.33; pFWE < 0.001; extent = 471; MNI coordinates, x = 38; y = −20; z = −17), and CB right lobule V extending into lobule IV (t(23) = 5.09; pFWE = 0.002; extent = 205; MNI coordinates, x = 17; y = −45; z = 17). During production, we found large significant clusters of activation in the middle posterior thalamus (t(23) = 14.29; pFWE < 0.001; extent = 851; MNI coordinates, x = −16; y = −24; z = 5), the anterior tail of the caudate (t(23) = 5.24; pFWE < 0.001; extent = 208; MNI coordinates, x = −10; y = 0; z = 9), posterior putamen (t(23) = 7.06; pFWE < 0.001; extent = 783; MNI coordinates, x = −28; y = −18; z = 3), and CB lobules IV–VI (the peak of which was in lobule VI, outside of our predefined cerebellar ROIs; t(23) = 8.46; pFWE < 0.001; extent = 20,079; MNI coordinates, x = 7; y = −69; z = −16). The location of the clusters generally shifted or expanded from associative to motor subregions of the subcortical areas of interest upon execution.
Hippocampal activity during planning predicts the upcoming order of movements
To identify what content is represented in the underlying activity patterns during planning and production, we trained and tested an LDA classifier to distinguish between the neural patterns related to sequence identification, specifically of the planned and executed movement order, movement timing, and their unique combination (Kornysheva and Diedrichsen, 2014; Yewbrey et al., 2023). All decoders were cross-validated across six imaging runs and, for order and timing, across conditions, then converted into z scores (see Materials and Methods).
We first tested whether ROI showed above-chance decoding accuracy (Fig. 3a,b). To do so, we obtained accuracy values for each of our three classifiers during preparation and production when considering all voxels within each ROI. We then assessed accuracy values in each region using one-tailed one–sample t tests against a chance level of zero. All tests were Bonferroni-corrected six times to account for classifier (3) and phase (2) in each predefined subcortical ROI. During preparation, we found above-chance decoding of the upcoming sequence order in the left hippocampus (t(23) = 3.00; p = 0.018; d = 0.61). No evidence of significantly above-chance decoding accuracy could be found in the other ROIs (p > 0.264; d < 0.36). These results suggest that the left hippocampus is involved in the planning of the ordinal structure of movement sequences.
Though considering all present voxels contained within a region is informative as to the general inclination of a region's processing, it does not inform us as to which subregions may be driving decoding accuracy the most. This is especially true when it comes to subcortical regions such as the thalamus, which shows differing functional and anatomical connectivity throughout its various subregions and nuclei (Kumar et al., 2017). Therefore, we performed volumetric searchlight analyses using 160 voxel searchlights for each decoder, constrained to each subcortical ROI (Fig. 3a,b). The accuracy value for each given searchlight was assigned to the center voxel. We then identified significant clusters using a random effect analysis on the produced accuracy maps and applied SPM's small-volume correction in a similar manner to the contrast activity cluster analysis. We Bonferroni-corrected significance values for each cluster six times, to account for classifier (3) and phase (2). During preparation, we found a significant cluster for the order decoder in the anterior body of the left hippocampus (t(23) = 5.74; pFWE < 0.001; extent = 379; MNI coordinates, x = −32; y = −22; z = −11). No significant clusters were found during production, although prior to sixfold corrections, a cluster for integrated sequence decoding was found in the superior thalamus during production (t(23) = 4.76; pFWE = 0.045; extent = 47; MNI coordinates, x = −18; y = −26; z = 15). These findings reinforce that the hippocampus, particularly the anterior body, is involved in the planning of sequential movement order. Sequence timing and its nonlinear integration with sequence order, however, do not seem to be represented in effector-related subcortical regions.
Sequence-related activity patterns are not conserved across planning and execution
Given that our target subcortical regions show different activity levels alongside changes in informational content across preparation and production, we asked whether the sequence-related patterns are maintained prior to and during movement. One possibility is that the patterns present during preparation undergo linear scaling to exceed a set threshold and initiate movement, where unintended initiation is thought to be prevented by suppression in corticospinal circuits (Duque and Ivry, 2009). Alternatively, these patterns may alter their geometries, meaning that they undergo a significant state change. The presence of scaled cortical neural patterns has been evidenced in single-finger movements using fMRI (Ariani et al., 2022), whereas distinct neural patterns have been shown in multiunit recordings of cortical populations (Kaufman et al., 2014); however, subcortical pattern dynamics before and during sequence production are unknown. Accordingly, we assessed the within-sequence, across-phase Euclidean distance between conditions across the first three PCs which explained 9.8% (SD, 3.4%) of the fMRI patterns using multidimensional scaling of the RDM. Substantial activity changes were seen across all regions from preparation to production, captured by PC1 (8.7%; SD, 3.2%; Fig. 4b). This was confirmed by simulations where the activity offset was imposed (Fig. 4a; see Materials and Methods). Since representational distances related to this offset were not of interest, we quantified the Euclidean distance between each sequence's position during preparation and production in a 2D space based on PC2 and 3.
If one were to assume that activity patterns during production are simply upscaled versions of those during preparation, the hypothetical distance between patterns after removal of the first PC should be zero. However, given a nonzero level of noise in our data, these distances are unlikely to equal to zero which would bias our results away from indicating a maintained distribution. To control for this, we simulated fMRI datasets with the same conditions as our empirical data and generated preparation and production from either the same distribution (no switch) or different distributions (switch) while matching their signal-to-noise ratios to our ROIs (Fig. 4a). Additionally, we scaled this cross-phase distance by the average of distances within preparation and production (Fig. 4b). We then performed one-tailed, one-sample t tests against zero to identify significantly elevated Euclidean distance in each of our subcortical regions, Bonferroni-corrected seven times for region (Fig. 4c). We found significant elevation in the thalamus (t(23) = 6.06; p < 0.001; d = 1.24), caudate (t(23) = 3.20; p = 0.014; d = 0.65), putamen (t(23) = 3.12; p = 0.017; d = 0.64), and hippocampus in the left (t(23) = 3.05; p = 0.020; d = 0.62) and right (t(23) = 3.93; p = 0.002; d = 0.80) hemispheres. No significant differences were found in CB lobule IV (t(23) = 0.37; p = 1.00; d = 0.08) and CB lobule V (t(23) = 2.05; p = 0.182; d = 0.42). These results suggest that all subcortical regions apart from the CB underwent a substantial state change which alters sequence-specific neural representations from prior to and during movement.
Discussion
Acquiring a repertoire of skilled motor sequences has been linked to plasticity in subcortical motor-related brain areas—basal ganglia, the CB, and, more recently, the hippocampus (Khilkevich et al., 2018; Buch et al., 2021; Harpaz et al., 2022). However, the contribution of these areas to higher-order action selection and lower-order movement control in skilled movements is debated (Christian and Thompson, 2003; Frank, 2011; Gao et al., 2018; Mizes et al., 2023a). Here, we examined the contribution of effector-related subcortical regions and the hippocampus to motor sequence control in the perimovement phase (Fig. 5). Furthermore, we assessed whether the neural patterns during planning in those subcortical areas were simply a subthreshold versions of those during execution (Cisek and Kalaska, 2005; Duque and Ivry, 2009) or qualitatively distinct neural states (Kaufman et al., 2014). We found that both hippocampal and effector-related striatal regions showed increased activity during sequence planning. However, only the hippocampal activity was related to sequence-specific information, specifically predicting the sequence order shortly before each execution. In contrast, effector-related striatal, thalamic, and cerebellar regions increased activity during execution yet lacked sequence-specific tuning, suggesting their role may be related to the control of lower-level kinematics of individual movement elements rather than of a whole movement sequence. All regions, apart from the CB, showed fundamental shifts in the sequence-related activity patterns from planning to execution, indicating that the movement onset is associated with a fundamental state shift in subcortical areas rather than a scaling up of subthreshold patterns, similar to the previous results in the cortex (Kaufman et al., 2014; Yewbrey et al., 2023).
The hippocampus preorders upcoming movements
The hippocampus has traditionally been associated with episodic recall in the nonmotor domain and is key to the preservation of event order (Davachi and DuBrow, 2015; Jafarpour et al., 2023). Here we show that the hippocampus is part of online motor control of a procedural skill which was trained over several days and produced without external guidance. Specifically, it is key to setting up the serial order of movements in an upcoming sequence from long-term memory (Fig. 5). This is a high-order control function, as the same hippocampal patterns are retrieved regardless of the upcoming temporal structure (timing) of the sequence. These findings show that the hippocampal involvement goes beyond its contribution to the acquisition of motor skills, which has been reported in the context of sleep-related or wakeful consolidation during short rest periods interspersed with sequence learning (Albouy et al., 2013; Buch et al., 2021). Instead, it is participating in the rapid planning of a well-trained skill.
The hippocampus may retrieve the order of the upcoming motor sequence via competitive queueing of the upcoming movements (Averbeck et al., 2003; Kornysheva et al., 2019; Rhodes et al., 2019). This computational framework suggests that upcoming movements are preordered in parallel prior to their serial execution during production in the parallel planning layer (Houghton and Hartley, 1995) which has been confirmed experimentally (Averbeck et al., 2002; Kornysheva et al., 2019). Alternatively, the hippocampus may engage in serial order preplay (Dragoi and Tonegawa, 2011), which is temporally compressed ∼20-fold relative to the acquired skill and is associated with skill consolidation (Buch et al., 2021). How competitive queuing and preplay are related to each other is unclear. It is possible that they either measure the same underlying mechanism, e.g., preordered sequence elements being related to incomplete replay sweeps, or constitute coexisting modes of serial order control across domains.
Our previous findings in the cortex suggest that the superior parietal lobule (SPL) also plays a role in the definition of movement order during planning (Yewbrey et al., 2023) and represents effector nonspecific movement intentions (Henderson et al., 2022; Shushruth et al., 2022). In humans, the SPL and hippocampus share connections as recently demonstrated in diffusion tractography imaging (Huang et al., 2021) and interact closely during navigation tasks to convert world-centered into body-centered coordinates (Whitlock et al., 2008; Rolls, 2020). As such, interaction between the hippocampus and the SPL may serve to map the higher-level extrinsic spatial order retrieved by the hippocampus (Kornysheva et al., 2019) onto effector-related intrinsic reference frames (Wiestler and Diedrichsen, 2013) prior to execution.
Effector-relevant striatal, thalamic, and cerebellar activity lacks sequence-specific tuning
Despite robust activity increases, in particular during sequence execution, we found no evidence that effector-relevant contralateral striatal, thalamic, or ipsilateral cerebellar activity contained any sequence-specific information. These findings speak against both the hypothesized role of the striatum in higher-order action selection (Mink, 1996; Redgrave et al., 1999; Gurney et al., 2001; Frank, 2011) or the integration of sequence features for kinematic control of the whole sequence as one motor program (Graybiel, 1998; Wymbs et al., 2012; Friend and Kravitz, 2014). Human neuroimaging studies report movement concatenation-related activity in the striatum (Wymbs and Grafton, 2015); however, our data suggest that these control signals are not sequence-specific (Bednark et al., 2015) but might instead be related to the process of concatenation itself. Furthermore, multiunit recordings in the dorsomedial striatum (caudate nucleus in humans) showed ramping activity which scales with temporal intervals between movements (Emmons et al., 2017; Wang et al., 2018) and is sensitive to the temporal structure of sequential movements (Grahn and Brett, 2007). However, we find no evidence that the striatum encodes differences in upcoming or ongoing temporal interval structure of sequences that are matched in structural complexity—neither during planning nor during execution.
Recent results in rodent models showed that optogenetic lesions lead to kinematic deficits in overtrained sequence production (Mizes et al., 2023a). Interestingly, this mapping breaks down when highly trained sequences share elements with other sequences the animal has been exposed to, e.g., lever presses appearing in a different order, with the skill becoming once-more cortically dependent (Mizes et al., 2023b). This is typical for many motor skills in humans, such as typing, handwriting, and music production where elements are reused in different permutations, including highly overtrained sequences such as signing with one's name or typing “Best wishes.” It is reflected in the factorial design of our study, which requires a well-trained recombination of movements into new sequences and shows no evidence of striatal tuning to these sequences. Thus, kinematic control in the striatum is likely associated with fused movements that are not broken up for flexible recombination—here, individual muscular commands associated with the different finger presses on the force transducer.
Cerebellar motor regions have traditionally been associated with low-level sensorimotor learning and online motor control, as evidenced by cerebellar ataxia (Diener and Dichgans, 1992; Miall et al., 2007), adaptation to sensory perturbations (Donchin et al., 2012), and eyeblink conditioning (Christian and Thompson, 2003). However, recent findings demonstrated the cerebellar involvement beyond short timescales (Chabrol et al., 2019; Li and Mrsic-Flogel, 2020). Transient perturbations of the deep cerebellar nuclei during planning resulted in mice making fewer correct directional licks and impacted subsequent movement direction, while kinematics remained stable (Gao et al., 2018; Chabrol et al., 2019; Li and Mrsic-Flogel, 2020). The interconnected ventrolateral thalamus, which constitutes an interface between striatum, CB, and cortex, has been shown to help maintain preparatory activity (Guo et al., 2017) and control cortical dynamics from preparation to production for rapid and precise motor behavior (Inagaki et al., 2022). This extends the notion of the CB as an online motor controller and the thalamus as a sensorimotor relay (Sherman and Guillery, 2002; Sherman, 2007). However, we did not observe any activity increase or sequence-related patterns in ipsilateral effector-relevant cerebellar regions during motor sequence planning. This supports the classical findings that cerebellar motor regions contribute to the online control of individual movement elements during movement production rather than the planning and execution of a whole sequence, e.g., planning “on the fly” once the production of a sequence has started (Ariani and Diedrichsen, 2019; Ariani et al., 2021). Notably, sequence-related patterns could be found in cerebellar regions known to be interconnected with ipsilateral prefrontal areas (contralateral Crus 1). This was only observed during motor execution, again emphasizing the role of the CB in the online control of movements beyond effector-related regions.
Despite participants’ ability to fluently produce the target movement sequences from memory after 2 d of training, the movement kinematics for the upcoming sequence are not preplanned. Rather, the order and timing of sequences are recombined trial-by-trial likely through communication with the cortex, which has shown a mechanism whereby the order and timing are planned and subsequently integrated during execution in the cortex (Yewbrey et al., 2023). This maintenance of separation up until the go cue likely affords the individual the ability to adjust movement plans should task demands suddenly change.
Shift in sequence pattern activity across planning and execution
Motor planning in the cortex has been proposed to be a scaled down version of execution, held at bay by suppression in corticospinal circuits (Cisek and Kalaska, 2005; Duque and Ivry, 2009). More recently, however, planning activity in the motor cortex has been proposed to exist in an output-null dimension to execution, where increases and decreases in output from neurons projecting to the same source cause no net change in activity but prepare the system to shift into the output-potent dimension (Kaufman et al., 2014, 2016), including for sequences of movements (Yewbrey et al., 2023). Our current results suggest that this shift also takes place in multiple subcortical areas: striatum, thalamus, and hippocampus. Here, sequence-related patterns across planning and execution were significantly distinct across the perimovement phases. This could not be explained by a simple upscaling through changes in activity and matched by our no-change simulations of the same dimensions that considered empirical noise levels in those areas. This could be driven by separate populations becoming active at different times around movement execution or, alternatively, changes of the patterns of the same neuronal pool.
Implications for clinical disorders
Here we show that the hippocampus, but not the basal ganglia–cerebellar–thalamic loop is carrying sequence-related information during the planning of skilled movements produced from memory. Thus, breakdowns in movement sequence control in Parkinson's disease (Wilkinson et al., 2009) and cerebellar ataxia (Doyon et al., 1997) cannot be explained by deficits in the planning or execution of entire sequences. Instead, we hypothesize that they might be a result of the reduced capacity to concatenate movement elements or execute movement elements from the motor repertoire. In contrast, our results predict that Alzheimer's disease, which is associated with a degeneration of the hippocampal formation and the medial temporal lobe, affects everyday movement sequence production such as handwriting and ideomotor action organization (Förstl and Kurz, 1999) because of the inability to retrieve the correct order of the upcoming movement elements (Rhodes et al., 2019). We hypothesize that this applies particularly to contexts where movement sequences need to be retrieved from memory flexibly in difference permutations. In humans, this applies to everyday skilled actions which draw on the same elements from the motor repertoire across different ordinal sequences, including overtrained sequences of movements, e.g., in typing, handwriting, and tool use.
In summary, our results question traditional classifications of motor skill memory and control, along with their associated subcortical substrates. Research on skilled movement planning in diverse clinical populations is needed to inform appropriate rehabilitation interventions and support, e.g., by tapping into compensatory mechanisms that utilize preserved movement planning resources to improve performance.
Footnotes
K.K. was supported by Academy of Medical Sciences Springboard Award (SBF006/1052).
The authors declare no competing financial interests.
- Correspondence should be addressed to Katja Kornysheva at k.kornysheva{at}bham.ac.uk.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.