Abstract
How does the brain change during learning? In functional magnetic resonance imaging (fMRI) studies, both multivariate pattern analysis (MVPA) and repetition suppression (RS) have been used to detect changes in neuronal representations. In the context of motor sequence learning, the two techniques have provided discrepant findings: pattern analysis showed that only premotor and parietal regions, but not primary motor cortex (M1), develop a representation of trained sequences. In contrast, RS suggested trained sequence representations in all these regions. Here, we applied both analysis techniques to a five-week finger sequence training study, in which participants executed each sequence twice before switching to a different sequence. Both RS and pattern analysis indicated learning-related changes for parietal areas, but only RS showed a difference between trained and untrained sequences in M1. A more fine-grained analysis, however, revealed that the RS effect in M1 reflects a fundamentally different process than in parietal areas. On the first execution, M1 represents especially the first finger of each sequence, likely reflecting preparatory processes. This effect dramatically reduces during the second execution. In contrast, parietal areas represent the identity of a sequence, and this representation stays relatively stable on the second execution. These results suggest that the RS effect does not reflect a trained sequence representation in M1, but rather a preparatory signal for movement initiation. More generally, our study demonstrates that across regions RS can reflect different representational changes in the neuronal population code, emphasizing the importance of combining pattern analysis and RS techniques.
SIGNIFICANCE STATEMENT Previous studies using pattern analysis have suggested that primary motor cortex (M1) does not represent learnt sequential actions. However, a study using repetition suppression (RS) has reported M1 changes during motor sequence learning. Combining both techniques, we first replicate the discrepancy between them, with learning-related changes in M1 in RS, but not pattern dissimilarities. We further analyzed the representational changes with repetition, and found that the RS effects differ across regions. M1's activity represents the starting finger of the sequence, an effect that vanishes with repetition. In contrast, activity patterns in parietal areas exhibit sequence dependency, which persists with repetition. These results demonstrate the importance of combining RS and pattern analysis to understand the function of brain regions.
Introduction
The ability to learn and produce complex sequences of movements is essential for many everyday activities, from tying shoelaces to playing instruments. Searching for where these acquired skills are represented in the brain has been one of the central questions in motor neuroscience (Lashley, 1950). One prominent issue in this debate is whether skilled sequence execution relies on representations in premotor area and supplementary motor area (SMA), or whether the sequences are represented in the primary motor cortex (M1; for review, see Dayan and Cohen, 2011; Berlot et al., 2018). We recently conducted a systematic longitudinal five-week training study (Berlot et al., 2020) employing functional magnetic resonance imaging (fMRI) to assess brain changes with motor sequence learning. We observed no change in overall activity with learning in M1, and no changes in the sequence-specific activity patterns. In contrast, clear learning-related changes in both overall activity and fine-grained activity patterns were observed in premotor and parietal areas, suggesting that learning-related changes occur outside of M1. Consistent with this idea, activity patterns in M1 seem to reflect individual movement elements, but not the sequential context (Yokoi et al., 2018; Yokoi and Diedrichsen, 2019; Russo et al., 2020). This suggests that M1 does not represent learnt motor sequences, but must rely on inputs from other areas to select the next correct movement element.
Using the technique of repetition suppression (RS), however, Wymbs and Grafton (2015) provided evidence for learning-related changes during motor sequence learning in M1. RS refers to the observation that a stimulus repetition evokes reduced neuronal activity compared with its initial presentation (Gross et al., 1967). It is commonly used as a tool for investigating brain representation (Buckner et al., 1998; Henson et al., 2003; for review, see Segaert et al., 2013) following the logic that if regional activation reduces on repetition, the underlying neuronal population must represent some aspect of the stimulus that repeated (Grill-Spector et al., 2006). Wymbs and Grafton (2015) found learning-related changes in RS across several regions, including M1, where they reported a non-monotonic change in RS over weeks: early increase, followed by a decrease, and again an increase in RS, which they interpreted as skill-specific specialization in M1. Altogether, their results indicate that M1's activity patterns are malleable when learning motor sequences. This stands in stark contrast to the above-mentioned studies that used pattern dissimilarity analyses and found no evidence of sequential representation in M1.
We reasoned that this discrepancy between RS and pattern analysis may reflect the fact that different underlying components of activity patterns might bring about the suppression of activity observed on repetition, some of which may not be directly related to a sequence identity (Grill-Spector et al., 2006; Alink et al., 2018). To understand RS effects in more detail, we need to know what aspects of the underlying representations reduce from the first to the second repetition. We therefore designed a paradigm that allowed us to investigate changes in brain representation using both tools, RS and multivariate pattern analysis (MVPA). We trained healthy volunteers to produce motor sequences over five weeks and tested their performance during high-field (7 T) MRI scanning. Participants performed trained and untrained sequences, each sequence twice in a row, allowing us to conduct both pattern and RS analysis on the same data. Replicating previous results, we observed significant learning-related changes in M1 for RS, but not for pattern dissimilarities. In contrast, both metrics showed learning-related changes in premotor and parietal regions. Using pattern analysis, we then decomposed the activation patterns in the first and second repetition to determine which representational aspects underlie the RS effects in the different regions. Finally, we performed control analyses to test whether observed effects could be attributed to learning-related improvements in the execution speed.
Materials and Methods
Participants
A total of 27 participants took part in the experiment. Data of one participant were excluded because the field map was distorted in one of the four scans, resulting in 26 participants whose data were analyzed (17 females, 9 males). Their mean age was 22.2 years (SD = 3.3 years). Criteria for study inclusion were right-handedness and no prior history of psychiatric or neurologic disorders. They provided written informed consent to all procedures and data usage before the study started. The experimental procedures were approved by the Ethics Committee at Western University.
Apparatus
Finger sequences were performed using a right-hand MRI-compatible keyboard device (Fig. 1A). The keys of the device had a groove for each fingertip, with keys numbered 1–5 for thumb-little finger. The keys were not depressible, so participants performed isometric finger presses. The force of the presses was measured by the force transducers underneath each finger groove (FSG-15N1A, Sensing and Control, Honeywell; dynamic range 0–25 N; update rate 2 ms; sampling 200 Hz). For the key to be recognized as pressed, the applied force had to exceed 1 N.
Experimental paradigm. A, Experimental setup, finger sequences composed of nine digits were executed on a keyboard device. Participants received visual feedback on correctness of their presses, with digits turning green for correct presses, and red for incorrect presses. B, Group-averaged performance on trained sequences over the five-week behavioral training protocol. Red shade indicates the standard error of the group mean (SEM). C, Group-averaged performance during the scanning sessions. Trained sequences are in red, untrained in blue. Dark color indicates first execution, light second execution. White bars indicate the group mean performance. D, Experimental paradigm inside the scanner. Each sequence was presented twice in a row. Trials started with a 1-s preparation time in which the sequence was presented, followed by a 3.5-s period of main phase, when the sequence was also execution, followed by 0.5 s of intertrial interval (ITI). The plotted timeseries for an insert of the design is group-averaged evoked activation of M1. Shaded error bars indicate the SEM.
Experimental design, behavior
Participants were trained over a five-week time period to perform six nine-digit finger sequences (Fig. 1B). They were split into two groups, with trained sequences of one group being the untrained sequences of the second group, and vice versa (for all of the chosen sequences, see Fig. 4B). The chosen sequences for both groups were matched as closely as possible on several features: starting finger, number of repetitions per finger, and first-order finger transitions. The decision to split participants into two groups was made to ensure that none of the observed effects could be because of the specific set of sequences chosen.
On day 1 of the study, participants were acquainted with the apparatus and the task performed in the scanner. To ensure no sequence-specific learning would take place before scan 1, we used finger sequences different from the trained and untrained sets which participants did not encounter at any later stage of the experiment.
During the behavioral training sessions, participants were trained to perform the six sequences. They received visual feedback on the correctness of their presses online with each digit turning green for correct, and red for incorrect press (Fig. 1A). They were instructed to perform the sequences as fast as possible while keeping the overall accuracy >85%. The details of the training protocol, as well as a few other design features (which were not assessed for this paper) have been described elsewhere (Berlot et al., 2020).
Experimental design, imaging
Longitudinal studies assessing learning have to tackle the challenge that performance changes with learning, and that it is not clear whether brain changes reflect the acquisition of new skills, or are caused indirectly by the changed behavior (Poldrack, 2000). For motor learning, the higher speed of execution could lead to different brain activation, unrelated to learning. Pacing participants to perform at the same speed for trained and untrained sequences, and across sessions, presents a possible solution for this problem. On the other side, pacing participants at a slower speed might not tap into the same neural circuitry as skilled behavior. For this reason, we decided to include both approaches; sessions with paced performance and a session where participants performed at full speed.
Participants underwent a total of four MRI scanning sessions (Fig. 1C) while executing trained and untrained sequences. The first session served as a baseline before the start of the training protocol (in week 1), where the “trained” and “untrained” sequences were both untrained and seen for equivalent amounts of time. The second session was conducted in week 2, and the last two after training protocol was completed in week 5. In scanning sessions 1–3, participants' performance inside the scanner was paced with a metronome, whereas in session 4, they performed as quickly as possible. For the purpose of this paper, we analyzed data of scanning session 1 (before training, paced), 3 (after learning, paced) and 4 (after learning, unpaced; Fig. 1C), allowing us to examine both learning-related and performance-related changes. Session four allows for the closest comparison to the previous RS study (Wymbs and Grafton, 2015), which also employed a full-speed performance design.
Each scanning session consisted of eight functional runs with event-related design randomly intermixing trials across the 6 trained and the 6 untrained sequences (totaling 72 trials per functional run). Each sequence was executed for two trials in a row (Fig. 1D). In this way, our design did not differentiate between RS and expectation suppression (Summerfield et al., 2008; Kok et al., 2012). In contrast to perceptual studies, however, in motor studies the influence of the expectation of a repetition is likely much less important. After the informative cue, preparatory processes are executed in a full awareness of whether the sequence is repeated from last trial, no matter if that repetition was expected or not. Thus, repetition effects in motor control will always contain an element of expectation. For this reason, we chose repetition to be a predictable feature of our experimental design.
Each trial started with a 1-s preparation time with nine digits of the sequence presented on the screen (Fig. 1D). A “go” signal was presented afterward. In scans 1–3, a pink line appeared underneath the sequence and started expanding, indicating the pace at which participants were to press. In scan 4, participants executed the sequence as fast as possible after the go cue. After execution, they received feedback on their overall performance – three points for correct and zero for incorrect performance. Each trial lasted for 5 s total, with a 0.5-s intertrial interval (Fig. 1D). Five periods of 10-s rests were added throughout each functional run to provide a better estimate of baseline activation. These rests were added randomly, but never between the first and second execution of the same sequence. In total, each scanning session lasted for ∼75 min.
Image acquisition
Data were acquired on a 7-Tesla Siemens Magnetom MRI scanner with a 32-receive channel head coil (eight-channel parallel transmit). At the beginning of the first scan, we acquired anatomic T1-weighted scan for each participant. This was obtained using a magnetization-prepared rapid gradient echo sequence (MPRAGE) with voxel size of 0.75 × 0.75 × 0.75 mm isotropic [field of view = 208 × 157 × 110 mm (A-P; R-L; F-H), encoding direction coronal]. Data during functional runs were acquired using the following sequence parameters: GRAPPA 3, multiband acceleration factor 2, repetition time (TR) = 1.0 s, echo time (TE) = 20 ms, flip angle (FA) = 30°, slice number: 44, voxel size: 2 × 2 × 2 mm isotropic. To estimate magnetic field inhomogeneities, we acquired a gradient echo field map with the following parameters: transversal orientation, field of view: 210 × 210 × 160 mm, 64 slices, 2.5 mm thickness, TR = 475 ms, TE = 4.08 ms, FA = 35°. The dataset is publicly available on OpenNeuro (accession number ds002776).
Preprocessing and first level analysis
Data preprocessing was conducted using SPM12. Preprocessing of functional data included correcting for geometric distortions using the acquired field map data, and head motion correction (three translations: x, y, z; three rotations: pitch, roll yaw). The data across sessions were all aligned to the first run of the first session, and then co-registered to the anatomic scan.
Preprocessed data were analyzed using a general linear model (GLM; Friston et al., 1994). We defined a regressor for each of the performed 12 sequences (six trained, six untrained), separately for their first and second execution, resulting in a total of 24 regressors per run. The regressor was a boxcar function defined for each trial, and convolved with a two-γ canonical hemodynamic response function (time to peak: 5.5 s, time to undershoot: 12.5 s). All instances of sequence execution were included into estimating regressors, regardless of whether the execution was correct or erroneous. This analysis choice was also taken by Wymbs and Grafton (2015), thus allowing a more direct comparison of RS results. Even when the error trials were excluded (i.e., removing all error trials as well as second execution trials when the first execution was erroneous), our results remained unchanged.
As standard in SPM, the mean of all voxels (averaged over space and time) was set to 100, thus providing a common scale across runs and subjects. We did not apply global scaling in our first-level model. The temporal autocorrelation was modeled using the “FAST” option in SPM12, which offers a flexible basis function to model dependencies on longer timescales. High-pass filtering was achieved by temporally prewhitening the matrix using the obtained temporal autocorrelation estimate. Ultimately, the first level analysis resulted in activation images (β maps) for each of the 24 conditions per run, for each of the four scanning sessions.
Surface reconstruction and regions of interest (ROIs)
Individual subject's cortical surfaces were reconstructed using FreeSurfer (Dale et al., 1999). Individual surfaces were aligned and spherically registered to match a template atlas (Fischl et al., 1999). Subsequently surfaces were resampled to FreeSurfer's left-right symmetric template (fs_LR.164k.spec) in the connectome workbench distribution (v.1.3.2; Marcus et al., 2011).
These surfaces were used to define the ROIs, which were defined on the group surface template using aligned probabilistic cytoarchitectonic maps (Fischl et al., 2008) and then projected into the individual brains. Specifically, our ROIs included areas covering the M1 and secondary associative regions. The M1 was defined by including nodes with the highest probability of belonging to Brodmann area (BA)4, which in addition corresponded to the hand knob area (Yousry et al., 1997). The dorsal premotor cortex (PMd) was included as the lateral part of the middle frontal gyrus. The anterior part of the superior parietal lobule (SPLa) was defined to include anterior, medial and ventral intraparietal sulcus. The subsequent analyses carried in the space of the original functional data acquisition for each individual subjects by determining the voxel that lay between the individual pial and white matter surfaces.
Additionally to the ROI analysis, we also performed a continuous searchlight analysis (Oosterhof et al., 2011). A searchlight was defined for each surface node, encompassing a circular neighborhood region containing 120 voxels. The voxels for each searchlight were found in exactly the same way as for the ROI definition. As a slightly coarser alternative to searchlights, we also defined a regular tessellation of the cortical surface separated into small hexagons, and extracted the functional data in the same way.
Evoked activation and RS
We calculated the percent signal change for execution of each sequence relative to the baseline activation for each voxel, for each functional run and averaged across runs. The calculation was split between the first and second execution (Fig. 1D).
To calculate RS, the activation during the first execution was subtracted from the elicited activation during the second execution. Thus, negative values of this difference contrast represented relative suppression of activation on the second execution, i.e., RS. For most subsequent analyses, the obtained values of activation and RS were averaged separately for trained and the untrained sequences. For ROI analysis, the volume maps were averaged across the predefined regions (M1, PMd, SPLa) in the native volume space of each subject. Additionally, for visualization the volume maps were projected to the surface for each subject, and averaged across the group in Workbench space.
Dissimilarities between activity patterns for different sequences
To evaluate which regions displayed sequence-specific representation, we calculated crossnobis dissimilarities between the evoked β patterns of individual sequences. To do so, we first multivariately prewhitened the β values, i.e., we standardized them by voxels' residuals and weighted by the voxel noise covariance matrix. We used optimal shrinkage toward a diagonal noise matrix following the Ledoit and Wolf (2004) procedure. Such regularized prewhitening has been found to increase the reliability of dissimilarity estimates (Walther et al., 2016). Next, we calculated the crossvalidated Mahalanobis dissimilarities (i.e., the crossnobis dissimilarities) between evoked regional patterns of different pairs of sequences, resulting in a total of 66 dissimilarities. This was performed twice: once by combining the activation patterns across the two executions and second time by separately obtaining dissimilarities between evoked patterns split per execution. The obtained dissimilarities were then averaged overall, as well as separately within the pairs of trained sequences, and the untrained sequences. This analysis was conducted separately for each ROI and using a surface searchlight approach (Oosterhof et al., 2011). In the searchlight approach, dissimilarities were calculated among the voxels of each searchlight, with the resulting dissimilarities values assigned to the center of the searchlight.
Changes in dissimilarities with repetition
We then related the change in dissimilarities with repetition to the changes in overall activity. Activation pattern for each sequence can be characterized as a point in a high-dimensional space, with each axis referring to the activation of a voxel. As a measure of the overall activation, we used the length of the activity vector from the origin (rest), and as dissimilarities the lengths of the vectors between different conditions. Unbiased estimates of the length of activity vectors relative to rest were derived from the crossvalidated second-moment matrix. The square root of each diagonal element (variance of evoked pattern) indicates the length of the activity vector, relative to rest. The square root of crossnobis dissimilarity (variance–covariance between patterns) is the length of the vector between the two patterns. The crossnobis dissimilarities were averaged across the conditions before taking the square root transform, separately for each execution. Similarly, overall activity vector length was averaged across conditions to obtain an overall activity vector length for executions 1 and 2.
Using the obtained average activity vector length and dissimilarities per execution, we assessed whether RS simply scaled the entire activity pattern downward. To do so, we computed the ratio of activity vector length change:
Pattern component analyses: modeling representational components
To determine what specific features of the patterns might change across the two executions, we decomposed the pattern component modeling (PCM) toolbox (Diedrichsen et al., 2011, 2018). PCM models the covariance structure (second moment matrix) of regional activity patterns according to different representational hypotheses. In our experiment based on presented sequences, we defined five representational components.
First finger
Both trained and untrained sequences started with one of three possible fingers: thumb, middle, or little finger. The first-finger component predicts that activity pattern for sequences that start with the same finger are identical. For sequences starting with a different first finger, the prediction was based on the covariance of the natural statistics of hand movement (Ejaz et al., 2015).
All fingers
The sequences were slightly different in terms of which fingers were involved. The “all fingers” component simply characterized how often each finger occurred in each sequence. If two sequences consisted exactly of the same presses (just in a different order), they were predicted to be identical. The predicted covariance was again weighted by the natural statistics of hand movement (Ejaz et al., 2015).
Sequence type
This component split the performed sequences based on whether they were trained or untrained, predicting one regional activity patterns for all the trained and a different activity pattern for all the untrained sequences.
Trained sequence identity
This component modeled any differences between the six trained sequences.
Untrained sequence identity
Similar as the trained sequence identity, this component predicted a unique activity patterns for each untrained sequence.
The overall predicted second moment matrix (G) was then a convex combination of the component matrices (Gc), each weighted by a positive component weight
The construction of the model components was done separately for the two groups of participants, as different sequences constituted trained or untrained sequences for the two groups. The subsequent steps of model fitting and evaluation were carried together for all subjects.
We formulated a model family containing all possible combinations of the five chosen components (Yokoi and Diedrichsen, 2019). This resulted in 32 combinations, also containing the “null” model that predicted no differences among any of the sequence patterns. We evaluated all of the 32 models using a crossvalidated leave-one-subject-out scheme. The components weights were fitted to maximize the likelihood of the data the data of subject 1,…,N–1. We then evaluated the likelihood of the observed regional activity patterns of subject N under that model. The resultant cross-validated likelihoods were used as model evidence for each model (see Diedrichsen et al., 2018). The log model Bayes factor BFm, the difference between the crossvalidated log-likelihood of each model and the null model, characterizes the relative evidence for that model.
In addition to the model family of the chosen components, we also fit a “noise-ceiling” model to assess maximal logBFm that would be achievable for a group model (Nili et al., 2014; Diedrichsen et al., 2018). For each of the two groups, we predicted the second moment matrix of a left-out subject based on n–1 subjects in the same group. This metric of intersubject consistency was then combined across the subjects of the two groups.
To integrate the results across models, we used model averaging. Assuming a uniform prior probability across models, we first computed the posterior probability of each model (1–32) in each region directly from the log-Bayes factors:
In this expression, the denominator normalizes the log-Bayes factors across 32 models to ensure they sum to 1. The obtained posterior probability was used in calculation of two subsequent metrics: (1) component log-Bayes factor; and (2) variance accounted for by each component. The log-Bayes factor for each of the five components (first finger, all fingers, etc.) was calculated as the log of the ratio between the posterior probability for the models containing the component (c = 1) versus the models that did not (c = 0).
To determine the amount of pattern variance accounted for by each component (across the models), we normalized the trace of each model component to be 12 (number of conditions) before fitting. Thus, the fitted component weight
Important to note is that the estimated variance is always positive, such that this quantity cannot be used to test whether a component is present at all. On the other hand, the log-Bayes factor does not take into account the actual weighting of the component in explaining the activity patterns. In univariate models, the average variance accounted for is tightly related to the evidence for that component; however, this is not necessarily the case in the multivariate setting. While component c1 can be crucial to account for the covariance between the patterns, it may actually play a relative small role in predicting the activity patterns. Thus, both the component Bayes factor and the averaged explained variance provide informative, albeit slightly different, measures of the importance of a component.
Statistical analysis of RS and dissimilarities
We employed a within-subject design. For each subject's data, we calculated RS and dissimilarities, separately for trained and untrained sequences. This was done for each region and session. To statistically quantify how RS and dissimilarities changed with learning (across sessions for trained/untrained sequences), we performed a session × sequence type ANOVA on those metrics, in predefined ROIs. Afterwards, we used a two-sided paired t test to assess the effect of sequence type per session. We additionally performed a three-way session × region × sequence type ANOVA to examine whether the learning-related effects differed across regions. For the analysis of dissimilarities split by execution (execution 1 vs 2), we calculated, per subject, the expected crossnobis dissimilarities for execution 2 of the cortical surface regions. The observed dissimilarities on the second execution were contrasted with those by using a two-sided paired t test.
Statistical analysis of PCM
We report the component log-Bayes factors, averaged across subjects. Additionally, the log-Bayes factors were submitted to a one-sample t test against 0 (two-sided). To quantify the change in component variance across executions, we calculated, per subject, the percent reduction in component variance from execution 1 to 2. The relative reduction in variance with repetition was contrasted across components by using a two-sided paired t test.
Results
Changes in RS with learning
To examine learning-related changes in RS and pattern analysis, we calculated both metrics on fMRI activation patterns both pre- and post-learning (i.e., weeks 1 and 5). Relative to rest, sequence execution activated M1, primary somatosensory cortex (S1), PMd and ventral premotor cortex (PMv), SMA, and SPLa (Fig. 2A). In general, activity was higher for the first than for the second execution (Fig. 2B). RS was calculated as the difference in activity between the two executions of the same sequence (Exe2 - Exe1). Negative values indicate a relative reduction in activation with repetition, i.e., RS. Already in week 1, before learning, RS was observed in nearly all regions displaying task-evoked activation (Fig. 2C). Only in regions that showed de-activation during task performance (Fig. 2B, blue shades), did we observed positive difference values between the executions (Fig. 2C, areas in red shades). This indicates that both the amount of activation and the amount of deactivation reduced with repetition.
Changes in repetition suppression (RS) and dissimilarities with learning. A, Group-averaged evoked activation, measured as percent signal change over resting baseline in week 1, averaged across all sequences and projected to an inflated representation of the left hemisphere, i.e., hemisphere contralateral to the performing hand. B, Group-averaged activation for each execution (Exe1, Exe2), in the baseline session (session 1/week 1) and after training (session 4/week 5) represented on a flattened representation of the left hemisphere. CS stands for the central sulcus. C, The difference in evoked activation between the two executions. Blue represents relative suppression of activation on the second, relative to the first, execution. Regions of interest (ROIs): primary motor cortex (M1), dorsal premotor cortex (PMd), anterior superior parietal lobule (SPLa). D, RS in the predefined ROIs, separately for trained (red) and untrained (blue) sequences. Error bars reflect the standard error of the group mean (SEM). More negative values indicate more suppression during second execution, relative to the first; * signifies p < 0.05. E, Elicited activation measured in percent signal change over resting baseline for trained sequences on first (dark) and second (light) execution. RS is calculated as the difference between activation across executions, i.e., Exe2–Exe1. Error bars reflect SEM. F, Elicited activation split by execution for untrained sequences. G, Average dissimilarity between evoked patterns for all pairs of sequences, in week 1, averaged across the group. Pattern dissimilarity was computed using a searchlight approach, by calculating the average crossnobis dissimilarity of activation patterns between all sequence pairs in each searchlight. H, Average dissimilarity between activation patterns of different sequence pairs in weeks 1 and 4. I, Dissimilarities between trained (red) and untrained (blue) sequence patterns, across weeks 1 and 5. Error bars reflect the SEM; * signifies p < 0.05.
We statistically quantified how RS changed across weeks (specifically between sessions 1 and 4) for three predefined ROIs: SPLa, PMd, and M1 (Fig. 2D; see Fig. 2E,F for a breakdown of activation per execution). The increase in RS across sessions was higher for trained than untrained sequences in all regions, as confirmed by a significant session × sequence type interaction in each region (PMd: F(1,25) = 5.29, p = 0.030; SPLa: F(1,25) = 4.62; p = 0.041). The increase in RS was particularly strong in M1 (M1: F(1,25) = 24.74; p = 3.9e−5). Indeed, the three-way interaction of region × session × sequence type was significant (F(2,50) = 9.19, p = 3.9e−4). To summarize the RS results, all regions showed evidence of an increase of sequence-specific representation with learning, with a particularly strong effect in M1.
Changes in pattern dissimilarities with learning
As another measure of sequence-specific representations, we tested whether the regions that displayed RS also showed distinguishable fine-grained activity patterns for each sequence. As a measure of pattern dissimilarity, we calculated the average crossvalidated Mahalanobis dissimilarity (i.e., crossnobis dissimilarity) between activation patterns of all possible sequence pairs. Overall, regions with dissimilar activity patterns for the different sequences corresponded to regions which also exhibited RS effects (Fig. 2G,H). Additionally, both metrics (RS and pattern dissimilarities) increased from sessions 1 to 4, with the effect particularly pronounced in the parietal cortex (Fig. 2C,H). Thus, based on visual inspection, RS and pattern dissimilarity metrics seem to provide consistent evidence for the development of sequence-specific representations with learning in an overlapping set of regions.
However, when quantifying the change in pattern dissimilarities across weeks in predefined ROIs, we observed important differences from RS. In SPLa and PMd, pattern dissimilarities increased more for trained than untrained sequences across sessions (Fig. 2I), as quantified by a significant interaction in a session × sequence type ANOVA (SPLa: F(1,25) = 4.80; p = 0.038, PMd: F(1,25) = 5.29, p = 0.030). In contrast, the week by sequence type interaction was not significant in M1 (F(1,25) = 2.13, p = 0.16; Fig. 2I). This indicates that while PMd and SPLa show learning-related changes on the level of pattern dissimilarities, these are absent in M1. The three-way interaction (region × session × sequence type) on the observed dissimilarities was indeed significant (F(2,50) = 3.39, p = 0.041), confirming the difference between regions.
Pattern dissimilarities reduce with repetition
Within the same dataset, we observed learning-related changes in RS in M1, but no change in pattern dissimilarities with learning. While the increase in pattern dissimilarities, as well as direct evidence for pattern changes across weeks (Berlot et al., 2020), clearly argue that sequence-specific learning occurs in premotor and parietal areas and not in M1, RS provides evidence for the development of sequence-specific representations in all these regions. How can this discrepancy be explained? To resolve this question, we need to understand how the role that each area plays during skilled sequence performance changes from the first to the second execution. We first inspected pattern dissimilarities for each of the two executions separately (execution 1, execution 2) in the trained state (week 5/session 4). We observed that, on average, pattern dissimilarities in week 5 decreased with repetition in most cortical regions (Fig. 3A). This decrease was particularly pronounced around the central sulcus, including M1 (Fig. 3B).
Representational change with repetition of sequence execution. A, Dissimilarities between pairs of sequences in session 4, split by first and second executions. B, Difference in pattern dissimilarities between executions 1 and 2. Blue hues reflect relatively lower dissimilarities on the second execution. C, Difference between the observed dissimilarity during execution 2 and the predicted distance based on the reduction of activation with repetition. Blue hues indicate lower dissimilarities than predicted, red higher. The difference between the two was significant with p < 0.05 in tessels which are fully visible (i.e., not greyed out). D–F, Same as A–C but for the paced speed session, i.e., session 3. Same thresholds were applied to the visualizations as the respective figures from A–C.
Of course, some decrease in dissimilarities would be expected given the decrease of overall activity with repetition (Fig. 2B). We therefore compared the decrease in dissimilarities to what would be predicted if activation decreased proportionally for all sequences. First, we calculated the relative decrease in activity, i.e., the ratio of the length of the activity vector during the second execution over the activity vector during the first execution (for details, see Changes in dissimilarities with repetition for details). This ratio was applied to the observed dissimilarities on the first execution, yielding a prediction of what dissimilarities would be expected for the second execution, if representation scaled with activation. This calculation was applied to activity patterns to each of the parcels on a regularly tessellated cortical surface (Fig. 3C). Around the central sulcus, i.e., including M1, the observed dissimilarities on the second execution were significantly lower than what was predicted from the reduction in overall activity (Fig. 3C). In contrast, observed dissimilarities on the second execution in parietal areas were quite close to the prediction based on dissimilarities scaling proportionally with average activity.
To quantify whether the reduction in dissimilarities differed qualitatively across regions, we subjected the difference between the observed dissimilarities on second execution from those predicted under the scaling model to a one-way ANOVA with the main effect of region, which was significant (F(2,50) = 7.42, p = 1.5e−3). Post hoc t-tests revealed that this was driven by a significantly larger deviation from scaling in M1 as compared with SPLa (t(25) = 3.55, p = 1.56e−3). M1 and PMd did not differ from one another (t(25) = 1.25, p = 0.22). There was a significant difference between PMd and SPLa (t(25) = 2.65, p = 0.013), indicating a more “scaling-like” representation in SPLa. Altogether this indicates that representational change with repetition differed across regions: proportional scaling of representation in parietal regions, and violation of proportional scaling in M1, where a much more pronounced decrease of dissimilarities was observed.
Decomposing representations across executions 1 and 2
Analysis of average dissimilarities across executions revealed a compression of representation in M1, but not in parietal regions. This analysis, however, does not reveal which aspects of the representations are responsible for this regional difference. To investigate exactly how the representation changed, we decomposed the representations during each execution into several underlying representational components. Differences in the sequence patterns could reflect differences in various characteristics, or features (Fig. 4A). Specifically, based on previous results (Yokoi et al., 2018; Yokoi and Diedrichsen, 2019), we hypothesized that the covariance (or similarity) between activity patterns can be explained with the following five components (Fig. 4B; for details, see Materials and Methods): (1) first finger: a pattern component determined by the starting finger; (2) all fingers: a pattern component that simply adds the finger-specific patterns regardless of their sequence; (3) sequence type: trained and untrained sequences have different average patterns; (4) trained sequence identity: the trained sequences differ among each other; (5) untrained sequence identity: the untrained sequences differ among each other. Using PCM (Diedrichsen et al., 2018), we constructed a model family, which consisted of all possible combinations of those five components, totaling 25 = 32 models. These models were then fit to the observed regional covariance structure (second moment matrices; Fig. 4C), separately for executions 1 and 2. In all regions and across both executions, several models accounted for observed data well, with model fits as good as the noise ceiling model (M1: 21 models for Exe1, 24 for Exe2; PMd: 16 for Exe1 and Exe2, SPLa: 16 for Exe1 and Exe2), showing that, overall, these models accounted well for the observed data. To integrate the results across models, we used Bayesian model averaging to estimate which components were most important to explain the patterns.
Component decomposition of regional representation across executions 1 and 2. A, Executed nine-digit sequences. B, Candidate component models used to assess regional representations across first and second executions. Each row and column indicate a specific sequence, and values in the matrices reflect the correspondence across sequences on that component, with yellow indicating higher correspondence. C, Regional representations during the first execution of sequences, as assessed by the crossvalidated second moment matrix, averaged across subjects of group 1. Similar as for models, each row and column reflect an activation pattern for an individual sequence. Regions: primary motor cortex (M1) and anterior superior parietal lobule (SPLa). D, Variance explained by candidate model components on executions 1 (black) and 2 (gray) during the full speed session in M1, dorsal premotor cortex (PMd) and SPLa. Error bars reflect standard error of the group mean (SEM). E, Relative contribution of variance explained in D across the different components. The total variance explained across the different components (i.e., sum of the bars in D) was normalized across the two executions to display the relative shift of importance of different representational components. F, G, Same depiction as D, E for the results of activity patterns during the paced scanning session.
In M1, the regional representation on the first execution was accounted for by the individual movement elements (all fingers), with especially high weight on the first finger (Fig. 4D). This replicates the previous findings showing that M1's representation during sequence production tasks can be fully explained by the starting finger (Yokoi et al., 2018; Yokoi and Diedrichsen, 2019). In these two studies, the number of times each of the five fingers was pressed was held constant across all sequences. In the current study, we did not match this number. Thus, the subsequent finger presses, encoded in the all-finger component, also accounted for substantial variance, independent of the exact ordering of these movements.
To statistically quantify these effects, we calculated component Bayes factors for individual components. In M1, the Bayes factors were significant for both first-finger and all-finger factors (first finger: BF = 6.8, t(25) = 3.1, p = 4.8e−3; all fingers: BF = 9.6, t(25) = 4.4, p = 1.7e−4). In contrast, the component Bayes factors were not significant for any sequence-related feature – neither sequence type (BFc = 3.2, t = 1.9, p = 0.07), nor sequence identity: of trained sequences (BFc = 1.6, t(25) = 1.5, p = 0.16) or untrained sequences (BFc = 0, t(25) = −0.2, p = 0.85). Thus, the pattern analysis clearly shows that activity patterns during the first execution in M1 can be explained by a superposition of individual movements, without any evidence of a sequence representation.
In SPLa and PMd, the variance explained during the first execution was well accounted for by sequence type (SPLa: BFc = 16.3, t(25) = 6.0, p = 3.0e−6, PMd: BF = 15.5, t(25) = 5.94, p = 3.3e−4), and trained sequence identity (SPLa: BFc = 5.4, t(25) = 3.4, p = 2.5e−3; PMd: BFc = 4.6, t(25) = 2.8, p = 0.011). There was no significant evidence for representation of untrained sequence identity in either of the regions (SPLa: BFc = 0.8, PMd: BF = 0.1; t(25) ≤ 1.1, p ≥ 0.28). In comparison to M1, the variance related on individual movements, either the first finger or all fingers were weaker across PMd and M1. In PMd, the first finger still accounted for some variance (BFc = 4.1), but this was further reduced in SPLa (BFc = 0.5).
In M1, the pattern component related to the first finger drastically reduced by 93% with repetition (Fig. 4D). The reduction in variance explained by the first-finger component was larger than for the all-finger component, which reduced by 75% (paired t-test: t(25) = 9.03, p = 2.4e−9). This indicates that the drastic reduction of average dissimilarities in M1 with repetition is mostly due to a pronounced first-finger effect during the first execution that almost vanishes on the second execution. This was further confirmed with a significant correlation between the amount of first-finger suppression and reduction in dissimilarities (r(25) = 0.43, p = 0.027). In other words, participants who displayed further reduction of the first-finger effect, also showed stronger reduction in observed dissimilarities.
Large reductions of the first-finger effect were also observed in session 4 in PMd (by 81%) and SPLa (by 83%). In contrast, the representation of sequence type and trained sequence identity in these areas clearly reduced less (PMd: sequence type: 44%, trained sequence: 64%; SPLa: sequence type: 49%, trained sequence: 55%). To statistically quantify whether the first-finger effect reduced more than trained sequence component, we performed a paired t-test on the percentage reduction across the two components. The results of tests were indeed significant for both PMd (t(25) = 7.96, p = 2.6−8) and SPLa (t(25) = 12.8, p = 1.7e−12).
In summary, SPLa's regional activation patterns were better accounted for by components related to the sequence identity than to the first finger, which also reduced much less with repetition. This likely explains why the average dissimilarities did not compress with repetition in SPLa regions as much as in M1. With repetition, the proportion of different components to overall regional representation remained relatively stable in SPLa (Fig. 4E), but changed substantially in M1 in that the dominant first-finger representation on the first execution nearly disappeared on the second execution. This was affirmed by an execution × ROI (SPLa, M1) ANOVA comparing the relative amount of variance accounted for by the first-finger component. Both main effects were significant (execution: F(1,25) = 66.39, p = 1.68e−8, region: F(1,25) = 85.98, p = 1.44e−9), as well as the interaction between the two (F(1,25) = 42.33, p = 8.16e−7). Thus, the decrease in M1's first-finger representation on repetition was more pronounced than that of SPLa. PMd's representation was in-between those of M1 and SPLa, more variance was accounted for by the first finger than in SPLa, but less than in M1.
Effect of speed on repetition effects
It is important to note that the speed of execution differed between trained and untrained sequences in session 4 (Fig. 1C). This speed difference could explain for the observed differences between trained and untrained sequences in session 4. To control for this factor, we had designed the study to include an extra session, session 3, which was also performed after learning was completed, but with paced performance. Specifically, the movement speed in session three was matched between trained and untrained sequences, as well as to performance observed in session 1.
We have previously reported that after learning, crossnobis dissimilarities for trained sequences are affected by the speed of execution. Specifically, the dissimilarities between trained sequences were lower for paced session (session 3) than full speed session 4 in PMd and SPLa, but not in M1, where there was no distinction between trained and untrained dissimilarities in either session (Berlot et al., 2020; Fig. 2I, comparison sessions 3, 4). Similarly, RS in PMd and SPLa was also less pronounced in session 3. The RS did not differ significantly between trained and untrained sequences in session 3 (t(25) ≤ 1.22, p ≥ 0.23; Fig. 2D). In M1, the difference in RS was also strongly reduced, but remained just above the significance threshold (t(25) = 2.1, p = 0.046). Additionally, the change in RS from session 1 to session for trained and untrained sequences were non-significant. Thus, while the speed of performance clearly influenced the strength of RS across regions, it nevertheless appears that M1's RS cannot be fully explained by differences in speed between trained and untrained sequences.
Next, we compared whether the speed of execution affects the decrease in dissimilarities on repetition. As for the full speed performance, we observed that dissimilarities decreased on the second execution (Fig. 3D,E). Also as for the full speed performance, the reduction in dissimilarities was more pronounced in M1 as compared with SPLa (t(25) = 2.80, p = 9.6e−3).
Finally, we assessed whether the reduction in representational components on repetition (especially the finger effect in M1) is observed even during paced performance. Overall, our PCM modeling accounted for less variance during the paced performance compared with full speed performance (Fig. 4D,F). We have previously reported that the patterns of activity are much more distinguishable and have higher signal-to-noise ratio during the full speed session compared with paced performance (Berlot et al., 2020), which likely accounts for this difference.
Interestingly, the overall amount of the first-finger versus all-finger components varied with speed. During full speed performance, the first-finger component accounted for a larger part of the pattern variance than during paced performance (Fig. 4D–G). This was confirmed by an significant interaction of a session × component (first/all fingers) ANOVA in M1 (F(1,25) = 17.3, p = 3.3e−4). Nevertheless, a similar reduction of the first-finger effect in M1 was observed for the paced session as for the full speed session (first-finger reduction by 92%, all finger by 66%; t(25) = 3.12, p = 4.5e−3), suggesting that the decrease of the first-finger weight on repetition did not depend on the speed of execution. The reductions in the first-finger effect were larger than for trained sequence components also in PMd and SPLa (PMd: t(25) = 2.34, p = 0.02; SPLa: t(25) = 8.11, p = 1.8e−8). Altogether this confirms that the larger reduction of the first-finger effect with repetition does not depend on the speed of performance.
Discussion
Pattern analysis versus RS
There are two common ways of looking at brain representations, MVPA and RS. In MVPA types of analyses, differences in multivoxel activity patterns across conditions are interpreted to reflect that the region represents conditions as distinct. In RS, activity reduction on repetition is interpreted as the region representing the stimulus dimension along which the repetition occurred (Grill-Spector et al., 2006). For example, if a region shows less activity every time the color of a visual stimulus repeats (rather than the shape, texture, etc.), it would provide evidence for a role of the region in the analysis of color. The question on the relation between RS and pattern dissimilarities measures has been addressed before especially in visual neuroscience (Sapountzis et al., 2010; Epstein and Morgan, 2012; Hatfield et al., 2016; Mattar et al., 2018; Davis and Poldrack, 2013), but the two have not been directly compared before in the motor domain.
Learning-related changes of RS and pattern dissimilarities
We combined the two methods to investigate changes during motor sequence learning. Using pattern analysis, several fMRI studies have failed to provide evidence that M1 obtains a motor sequence representation with learning (Wiestler and Diedrichsen, 2013; Yokoi et al., 2018; Berlot et al., 2020). In contrast, a study using RS (Wymbs and Grafton, 2015) reported learning-related changes even for M1, which suggests a development of sequence-dependent representation. Here, we report that both techniques showed the development of sequence-specific representations in premotor and parietal cortices. In contrast, the two metrics provided discrepant insights into M1: we observed some evidence for learning-related changes using RS, but not pattern dissimilarities. Additional control analyses suggest that this difference was not completely driven by the difference in the speed of execution, although higher speed of execution increased RS across regions.
As Wymbs and Grafton (2015), we found changes in RS in M1 across learning sessions, as well as a difference between trained and untrained sequences in sessions posttraining. However, the specific evolution of the changes differed between the two studies. Wymbs and Grafton reported a complex increase-decrease-increase pattern of RS in M1 depending on the level of the training of the sequence. In contrast, we report higher RS for trained than untrained sequences after training. There are a number of important differences in the design of the two studies which could have contributed to the observed differences in results. For instance, their design only employed full speed performance, the probability of sequence repetition was lower, and the training was longer and had three groups of sequences (highly, medium, and lightly trained) rather than just two (trained and untrained). Further studies, directly manipulating these differences, are needed to reconcile the findings reported here relative to the previous report of Wymbs and Grafton (2015).
Representational changes with repetition
To assess in more detail what aspects of regional representation are sensitive to repetition, we decomposed regional representations into different underlying components (e.g., first finger, combination of all fingers, sequence identity, etc.), separately for the first and second executions. We observed that M1 mainly represents the first finger in a sequence. This component diminishes dramatically on repetition. In contrast, the representation of sequence type and identity, which accounted for most of the variance in parietal areas, remained more stable across the two executions. Activation patterns in PMd reflected a mixture of sequence-related representations (as in parietal regions), which remained stable with repetition, and representations related to single movements (as in M1), which diminished with repetition. Altogether, our results suggest that RS acts differently on different components of neuronal representations. Depending on the representational composition of each region, RS can therefore be more or less pronounced. Specifically, regions that represented more transient information about a sequence (first finger) shows particularly strong suppression of dissimilarities with repetition, while regions that represent more persistent information (sequence type and identity) show a more proportional reduction of representation with activity.
Interactions between cortical motor regions during sequence performance
Our findings can be summarized in the following, admittedly rather speculative, model of how parietal/premotor areas and M1 interact during skilled motor sequence performance. During the first execution, premotor and parietal regions contain information about the specific sequence that needs to be executed (Fig. 5). Premotor regions also reflect the starting finger of the sequence. These regions send signals to M1, preactivating the neural circuits for the movement of the first finger. This replicates a previous finding that the difference between M1's activation patterns is explained by the starting finger, rather than true sequence representation (Yokoi et al., 2018). The finding is also consistent with results from neurophysiology (Averbeck et al., 2002) and magneto-encephalography (MEG; Kornysheva et al., 2019) showing that the first action in a sequence is most highly activated in premotor and motor areas during the preparatory period.
Conceptual depiction of changes in representation across regions and with repetition. Different dots represent activation patterns for different finger sequences, with colours indicating the starting finger of the sequence. Regions: anterior superior parietal lobule (SPLa), dorsal premotor cortex (PMd) and primary motor cortex (M1). Activation levels of three hypothetical voxels are indicated across the three axes.
Upon repetition, activation reduces across all regions. The decomposition analysis indicates that the sequence identity component in premotor and parietal regions reduces only moderately, suggesting that the sequence representation is always necessary to successfully guide M1 through the correct sequences of actions. In contrast, the preactivation of the first finger reduced dramatically, possibly reflecting reduced planning needs on repetition (Ariani et al., 2020). Thus, the pronounced RS effect in M1 may be because of the fact that fMRI activity here is driven to a large degree by the initial input from other regions that prepares this region for the first execution of a sequence. On the second execution, the need for this preactivation may be substantially reduced.
Overall, our results suggest that M1 does not represent individual trained sequences with learning, despite increased RS. Instead, it appears to represent individual finger presses. Nonetheless, we did find some evidence that RS in M1 was stronger for trained than untrained sequences. The effect was statistically not particularly strong in session 3, and we were not able to conclusively show that it was not, at least partially, caused by the trained versus untrained differences in MT in session 4. Overall, however, our data are more in favour of the presence of a real effect than for its absence. If true, could the remaining effect be driven by a true learned sequence representation in M1? Since fMRI activity reflects a combination of the input to a cortical region and recurrent activity (Logothetis, 2002), but not the output spiking (Picard et al., 2013), we suggest that M1's sensitivity to sequence type reflects differences in the received input to M1, with more efficient communication from higher-order areas on repetition of trained sequences. Some support for this idea comes from a recent study demonstrating layer-specific effects in M1 (Persichetti et al., 2020). By measuring changes in cerebral blood volume across layers, the authors demonstrated that superficial M1 layers (which reflect M1's inputs) show RS, whereas deep layers' activation (which is more indicative of M1's outputs) is enhanced during repetition. Since the BOLD signal is biased toward the superficial vascular signals, our activation results more likely reflect inputs into M1.
Still, rather than input from other areas, increased RS in M1 could reflect sequence dependency at a subvoxel resolution (Grill-Spector and Malach, 2001; Grill-Spector et al., 2006), which cannot be detected by pattern analyses. A prior electrophysiology study provided some support for this, demonstrating differential M1's responses to trained relative to random sequences (Matsuzaka et al., 2007). However, this study did not show differential activation for different trained sequences, thus no sequence representation as defined here. Moreover, recent electrophysiological studies have also shown that M1 does not represent the sequential context (Russo et al., 2020; Zimnik and Churchland, 2021). Altogether, this makes it unlikely that the RS observed in M1 reflects sequence dependency.
Our proposed model makes a number of predictions that could be tested using a combination of techniques. For layer-specific fMRI studies, we would predict that the first-finger effect in M1 can be mostly found in the superficial layers, reflecting cortico-cortico communication. For MEG or intracranial EEG studies (Ghuman et al., 2008; Gilbert et al., 2010; Korzeniewska et al., 2020), we would predict that the difference between trained and untrained sequences would be mainly present at the start of the sequence, an effect that would strongly reduce on repetition. Addressing these questions will advance our understanding of motor sequence on neural circuitry underlying production of skilled actions.
In conclusion, we demonstrated that RS may not only reflect a suppression of a specific representation in a region, but that the role of the region, and hence the structure of the representation, can change qualitatively from the first to the second repetition. While the representation of the skilled motor sequences remained relatively stable in parietal and premotor regions, the M1's representation changed, with a strongly reduced activation related to the beginning of the sequence. These results emphasize that employing RS only using the average regional activation sometimes provides incomplete, and possibly misleading, insights into regional representation. Instead, the combination of RS with pattern analyses can illuminate how representations change with repetition, and may provide a deeper understanding of brain circuits and their function.
Footnotes
This work was supported by an Ontario Trillium Scholarship (E.B.), the Natural Sciences and Engineering Research Council of Canada Discovery Grant RGPIN-2016-04890 (to J.D.), and the Canada First Research Excellence Fund (BrainsCAN). We thank Giacomo Ariani for helpful comments on an earlier version of the manuscript.
The authors declare no competing financial interests.
- Correspondence should be addressed to Jörn Diedrichsen at jdiedric{at}uwo.ca