Abstract
Learning to play a musical instrument requires complex multimodal skills involving simultaneous perception of several sensory modalities: auditory, visual, somatosensory, as well as the motor system. Therefore, musical training provides a good and adequate neuroscientific model to study multimodal brain plasticity effects in humans. Here, we investigated the impact of short-term unimodal and multimodal musical training on brain plasticity. Two groups of nonmusicians were musically trained over the course of 2 weeks. One group [sensorimotor-auditory (SA)] learned to play a musical sequence on the piano, whereas the other group [auditory (A)] listened to and made judgments about the music that had been played by participants of the sensorimotor-auditory group. Training-induced cortical plasticity was assessed by recording the musically elicited mismatch negativity (MMNm) from magnetoencephalographic measurements before and after training. SA and A groups showed significantly different cortical responses after training. Specifically, the SA group showed significant enlargement of MMNm after training compared with the A group, reflecting greater enhancement of musical representations in auditory cortex after sensorimotor-auditory training compared with after mere auditory training. Thus, we have experimentally demonstrated that not only are sensorimotor and auditory systems connected, but also that sensorimotor-auditory training causes plastic reorganizational changes in the auditory cortex over and above changes introduced by auditory training alone.
Introduction
The results of musical training are reflected in induced functional and structural differences in the brains of musicians compared with nonmusicians. For example, violin players have an increased somatosensory cortical representation of the left hand (Elbert et al., 1995). Musicians also show more pronounced auditory cortical representations than nonmusicians for tones of the musical scale (Pantev et al., 1998; Hirata et al., 1999; Shahin et al., 2003, 2008) and for the timbre of the instrument on which they were trained (Pantev et al., 2001).
Structural differences between musicians and nonmusicians have been demonstrated as well. Gray matter volume in motor, auditory, and visual brain regions differs between professional keyboard players and nonmusicians (Gaser and Schlaug, 2003). The left planum temporale, which is important for the processing of complex sounds, is relatively larger than the right planum temporale in professional musicians, especially those with absolute pitch (Schlaug, 2001). In addition, high-resolution magnetic resonance images revealed an enlargement of Heschl's gyrus in musicians (Schneider et al., 2002).
Multisensory integration was defined by Meredith and Stein (1983) as an increase in neuronal response to a stimulus consisting of a combination of modalities compared with the sum of neuronal responses to each stimulus modality separately. The interaction, as well as the integration among different sensory modalities, is especially important when playing a musical instrument. Sensory modalities interact, functionally reorganize, and contribute to new qualities of perception that convey information not inherent in each single modality. Recently, such evidence for cross-modal (auditory/somatosensory) reorganization of cortical functions in musicians has been found by our group (Schulz et al., 2003) by comparing magnetoencephalography (MEG) responses in trumpet players, who have developed connections between sound and feeling in the lip, and in nonmusician controls, who have not.
Music performance also involves the close interaction of sensory processing with motor production (Zatorre et al., 2007). For example, in professional pianists, activity of the motor cortex has been recorded when they were listening to a well-learned piece of piano music (Haueisen and Knösche, 2001; Baumann et al., 2005; Bangert et al., 2006). The differences between musicians and nonmusicians described above may, however, not only be the result of life-long training. Becoming a musician may also be related to innately driven musical talent or different learning skills (Monaghan et al., 1998). Therefore, training musically naive subjects in a laboratory environment and comparing different kinds of training is a method that is better suited to directly evaluate the effects of multimodal musical training.
To date, the impact of sensorimotor training comprising auditory, somatosensory, and motor activity has not been compared with auditory training alone in a laboratory environment. This is the goal of the present study. Specifically, we hypothesize that sensorimotor-auditory training in the context of piano playing leads to greater plasticity in the human auditory cortex compared with a mere auditory training.
Materials and Methods
Musical mismatch negativity.
Musical training and musical expertise are reflected in the auditory system as better discrimination performance of tonal frequencies (Brattico et al., 2003). This ability can be verified electrophysiologically in humans by means of electroencephalographic (EEG) or MEG measurements of the mismatch negativity [MMN (in EEG), MMNm (in MEG)]. The MMN is a preattentive frontocentral negative component of the event-related potential or field, measured at latencies of 120–250 ms after stimulus onset, with brain sources within the primary and secondary auditory cortex (Näätänen and Alho, 1995). It can be elicited not only by changes in simple auditory features like frequency, intensity, or duration of a sound, but it can also reflect complex aspects of musical structure.
Subjects.
Twenty-three nonmusicians (13 females) between 24 and 38 years of age with no formal musical training, except for their compulsory school lessons, participated in the study. The data of three subjects had to be excluded because of insufficiently pronounced MMN before training. All subjects were right-handed as assessed by the Edinburgh Handedness Inventory (Oldfield, 1971). None of them had a history of otological or neurological disorders. Normal audiological status, defined as air conduction threshold of no more than 10 dB hearing level between 250 and 4000 Hz, was verified by pure tone audiometry. All subjects were completely informed about the nature of the study. Informed consent was obtained from all subjects as approved by the Research Ethics Board of the University of Münster. The assignment of the test subjects to either the sensorimotor-auditory (SA) or to the auditory (A) group of 10 subjects each was random. The SA group learned to play a musical sequence on the piano whereas the A group merely listened to the music that was played by the participants of the SA group and made judgments as to whether the sequences were correct or not. The auditory MMN responses from all participants were measured before and after training. Training-induced plasticity was evaluated by comparing the MMN differences before and after training between the SA and A groups.
Stimuli for MEG measurement.
For the MEG measurements before and after training, we used a three- and a six-tone piano sequence (Fig. 1a). The standard three-tone piano sequence was a G-major broken chord (i.e., the tones were played in sequence, not simultaneously) in first inversion: B (246.94 Hz)–D (293.66 Hz)–G (392.00 Hz). In the deviant stimulus, the first two tones of the sequence were the same, but the last tone was a minor third (three semitones) lower (E, 329.63 Hz) than in the standard. The six-tone sequence was an extension of the three-tone sequence composed of a C-major broken chord in root position followed by a G-major chord in first inversion: C (261.63 Hz)–E (329.63 Hz)–G (392.00 Hz)–B (246.94 Hz)–D (293.66 Hz)–G (392.00 Hz). Again, on deviant trials, the last tone was lowered by a minor third to an E (329.63 Hz). In addition to being shorter, the three-tone sequence was part of the tone sequence used for training (see below), but the succession of tones of the six-tone sequence was not exactly contained in the training sequence (Fig. 1, compare a, b). Therefore, the two different sequences allowed us to evaluate the effect of sensorimotor-auditory training on deviance detection with a trained stimulus and with a generalization of the trained stimulus.
Figure 1. a, Tone sequences for the standard and deviant stimuli that were used in the MEG measurements before and after training. b, Musical score of the I–IV–V–I chord progression in C-major in broken chords that was used as training sequence for SA and A training. c, Visual templates for the SA training for each broken chord of the training sequence. Numbers represent the fingers (thumb, 1; index finger, 2; etc.) with which the subjects were supposed to press the corresponding piano keys. On each template, the image of the piano keyboard was depicted and the finger placement was marked. For each chord, the notes were to be played in ascending order first, and then descending again (compare score in b).
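The tone frequencies quoted above are consistent with twelve-tone equal temperament. The following minimal sketch reproduces them, assuming a reference of A4 = 440 Hz and the MIDI note numbers shown (both are illustrative assumptions, not details stated for the stimulus software).

```python
# Equal-tempered frequencies for the test sequences.
# Assumption: A4 (MIDI note 69) is tuned to 440 Hz.
A4_MIDI = 69
A4_HZ = 440.0

# Assumed MIDI numbers for the tones named in the text (B3, C4, D4, E4, G4).
NOTES = {"B": 59, "C": 60, "D": 62, "E": 64, "G": 67}


def midi_to_hz(midi_note: int) -> float:
    """Frequency of a MIDI note in twelve-tone equal temperament."""
    return A4_HZ * 2.0 ** ((midi_note - A4_MIDI) / 12.0)


def sequence_hz(names):
    """Frequencies (Hz, rounded to two decimals) for a list of note names."""
    return [round(midi_to_hz(NOTES[name]), 2) for name in names]


standard_three = ["B", "D", "G"]
deviant_three = ["B", "D", "E"]  # last tone lowered by a minor third

if __name__ == "__main__":
    print(sequence_hz(standard_three))
    print(sequence_hz(deviant_three))
    # A minor third spans three semitones.
    print(NOTES["G"] - NOTES["E"])
```

The deviant differs from the standard only in its final tone, which is three semitones (a frequency ratio of 2^(3/12)) below the G of the standard.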
The stimuli for the MEG measurements were generated by means of a digital audio workstation in which an integrated on-screen virtual keyboard allowed generation of realistic piano tones on a synthesized piano. The duration of each tone in the stimulus sequences was 300 ms, resulting in a total melody length of 900 ms in the three-tone sequence and 1800 ms in the six-tone sequence. Successive sequences in the MEG recording session were separated by a silent interval of 900 ms. The three- and the six-tone sequences were presented in separate runs consisting of 400 trials (320 standards and 80 deviants) each. Both the three- and six-tone sequence runs were presented twice resulting in four runs and 1600 trials altogether. The deviant stimuli occurred randomly with the constraint that at least three standards were presented between two deviants. The four runs were separated by short breaks.
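One possible way to generate a run that satisfies the constraint described above (320 standards, 80 deviants, at least three standards between two deviants) is sketched below. The randomization procedure actually used is not specified in the text; the function name and the gap-filling strategy are assumptions made for illustration.

```python
import random


def oddball_sequence(n_standards=320, n_deviants=80, min_gap=3, seed=None):
    """Return a list of 'standard'/'deviant' labels in which any two
    deviants are separated by at least `min_gap` standards."""
    rng = random.Random(seed)
    # Slots for standards: before the first deviant, between consecutive
    # deviants, and after the last deviant.
    gaps = [0] + [min_gap] * (n_deviants - 1) + [0]
    spare = n_standards - min_gap * (n_deviants - 1)
    if spare < 0:
        raise ValueError("not enough standards to satisfy the constraint")
    # Distribute the remaining standards at random over all slots.
    for _ in range(spare):
        gaps[rng.randrange(len(gaps))] += 1
    sequence = []
    for i in range(n_deviants):
        sequence.extend(["standard"] * gaps[i])
        sequence.append("deviant")
    sequence.extend(["standard"] * gaps[-1])
    return sequence


if __name__ == "__main__":
    run = oddball_sequence(seed=0)
    print(len(run), run.count("standard"), run.count("deviant"))
```

Reserving the mandatory standards first and scattering the rest guarantees the constraint by construction, so no rejection sampling is needed.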
Training procedure.
The training stimulus consisted of four broken chords forming a I–IV–V–I sequence (Fig. 1b). This sequence is very common in Western tonal classical and popular music and clearly defines the key. The I chord is the tonic chord, built on the first note of the scale (in C major, this consists of the notes C–E–G–C). The IV chord is called the subdominant and is built on the fourth note of the scale (in C major, F–A–C–F). The V chord is called the dominant and is built on the fifth note of the scale (in C major, G–B–D–G). Participants played the first note of each chord with the left hand and the remaining three notes with the right hand (Fig. 1b, see the fingers used). The participants were never shown the musical notation of the tone sequences. Instead, to facilitate the training, visual templates of the finger placement for each broken chord (Fig. 1c) were presented. On each template, the image of the piano keyboard was depicted and the finger placement was marked. This helped prevent finger tangles, and a stable finger-key mapping facilitated the learning.
By comparison, the short test sequence used during the MEG measurement consisted only of the last three notes of the V chord, and the longer test sequence consisted of the last three notes of the I chord followed by the last three notes of the V chord. Thus, the three notes of the shorter test sequence occurred in that order during the training, but the six tones of the longer test sequence did not.
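As a rough illustration of how the I, IV, and V chords of the training sequence are built on the first, fourth, and fifth notes of the C-major scale, the sketch below stacks the third and fifth above each root and repeats the root an octave higher. The function name is hypothetical and octave information is deliberately omitted.

```python
C_MAJOR_SCALE = ["C", "D", "E", "F", "G", "A", "B"]


def broken_chord(degree, scale=C_MAJOR_SCALE):
    """Root, third, and fifth above the given scale degree (1-based),
    followed by the root repeated an octave higher."""
    root = degree - 1
    triad = [scale[(root + step) % len(scale)] for step in (0, 2, 4)]
    return triad + [triad[0]]


progression = {
    name: broken_chord(degree)
    for name, degree in (("I", 1), ("IV", 4), ("V", 5))
}

if __name__ == "__main__":
    for name, notes in progression.items():
        print(name, "-".join(notes))
```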
Before the training procedure, the participants were divided randomly into two groups. The SA group was trained to play the C-major chord progression on the piano. In the first training session, an instructor demonstrated the sequence. Training sessions were scheduled on 8 d within 2 weeks and lasted 25 min each. The training sessions were recorded via MIDI connection by a computer. A specifically developed computer program recorded each keystroke and compared the recorded data with a template of the correct sequence. Additionally, the computer program monitored the onset and offset time of each keystroke. Thus, we were able to quantitatively assess the correctness of the keystrokes, the tempo, and the smoothness of playing, and thus, to objectively evaluate the training progress on the behavioral level.
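The text does not give the exact definitions of the three performance measures, so the following is only a sketch under stated assumptions: correctness is taken as the fraction of template notes matched position by position, tempo as the mean inter-onset interval, and smoothness as the variability (standard deviation) of inter-onset intervals. The `Keystroke` type and the `evaluate` function are hypothetical names.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Keystroke:
    note: int      # MIDI note number
    onset: float   # key-down time in seconds
    offset: float  # key-up time in seconds


def evaluate(played, template):
    """Compare recorded keystrokes with a template of MIDI note numbers.

    Assumed definitions (not taken from the study):
      correctness: fraction of template positions played correctly
      mean_ioi:    mean inter-onset interval in seconds (tempo proxy)
      ioi_sd:      SD of inter-onset intervals (lower means smoother)
    """
    if len(played) < 2:
        raise ValueError("need at least two keystrokes")
    matches = sum(1 for k, t in zip(played, template) if k.note == t)
    iois = [b.onset - a.onset for a, b in zip(played, played[1:])]
    return {
        "correctness": matches / len(template),
        "mean_ioi": mean(iois),
        "ioi_sd": pstdev(iois),
    }


if __name__ == "__main__":
    template = [60, 64, 67, 72]  # C-E-G-C, ascending
    played = [
        Keystroke(60, 0.0, 0.4),
        Keystroke(64, 0.5, 0.9),
        Keystroke(67, 1.0, 1.4),
        Keystroke(71, 1.5, 1.9),  # wrong final note
    ]
    print(evaluate(played, template))
```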
For the A group, the MIDI data recorded from the SA group were used to ensure that the A group obtained exactly the same auditory information as the SA group. Each subject of the A group listened to all of the training sessions of one randomly assigned subject from the SA group. In the first training session of the A group, the sequence was also demonstrated by an instructor. As in the SA group, auditory training sessions of the A group were scheduled on 8 d within 2 weeks. Subjects were seated in front of the piano while listening to the recorded sequences of the SA group. Thus, they could see the piano but received no visual information as to which keys had been played. The task for the subjects of the A group was to press the right- or left-foot pedal of the piano after each sequence to indicate whether the sequence they heard was correct or not. This task was chosen to ensure that the subjects of the A group also participated actively in the experiment and listened carefully.
To evaluate the effect of the training on behavioral performance, all subjects participated in an auditory discrimination test before and after the two-week training. Thirty-five sequences of the I–IV–V–I chord progression in C-major that were used for training were played after being recorded from a trained musician with built-in mistakes in 13 sequences. The participants listened to these recorded sequences and responded by pressing the right-foot pedal of the piano whenever they heard a wrong note. The sensitivity index of signal detection theory d′ was computed as a performance measure for each subject individually (d′ = z(hit rate) − z(false alarm rate); function z is the inverse of the cumulative distribution function of a standard Gaussian probability distribution; values of 1 and 0 for which z is not defined were replaced by 0.999 and 0.001, respectively). The individual values of d′ were statistically evaluated in a mixed model 2 × 2 ANOVA with factors group and pretraining/posttraining. Data of the behavioral test of one subject in the A group were not recorded because of technical failure and this subject was consequently left out of the behavioral analysis.
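The d′ computation can be sketched as follows, taking z to be the inverse of the standard normal cumulative distribution function (the quantile function), as is usual in signal detection theory, and replacing rates of exactly 1 and 0 by 0.999 and 0.001 as described. The function names and the example response counts are illustrative assumptions.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, SD 1


def z(rate):
    """Inverse standard normal CDF, with rates of exactly 0 or 1
    replaced by 0.001 and 0.999 (where z would be undefined)."""
    if rate >= 1.0:
        rate = 0.999
    elif rate <= 0.0:
        rate = 0.001
    return _STANDARD_NORMAL.inv_cdf(rate)


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false alarm rate)."""
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    return z(hit_rate) - z(fa_rate)


if __name__ == "__main__":
    # Hypothetical subject: 13 sequences with mistakes, 22 without.
    print(d_prime(hits=10, misses=3, false_alarms=2, correct_rejections=20))
```

With the replacement rule, a subject who detects every mistake and produces no false alarms obtains the maximum possible value of 2 × z(0.999), about 6.18.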
MEG data acquisition.
Magnetic field responses were recorded with a 275-channel whole-cortex magnetometer system (OMEGA 275; CTF Systems) with interchannel spacing of 2.2 cm. The MEG pickup coils have a diameter of 2 cm and are configured as first-order axial SQUID gradiometers with a 5 cm baseline (Vrba and Robinson, 2001). The spectral density of the intrinsic noise of each magnetic channel was <7 fT/√Hz for frequencies >1 Hz. The MEG signals were low-pass filtered at 150 Hz and sampled at a rate of 600 Hz. The duration of a recording epoch was 1.8 s for the three-tone sequence and 2.7 s for the six-tone sequence, each including a 0.2 s prestimulus interval. The data recording was synchronized to the stimulus presentation in each trial. The total recording time was 60 min. The recordings were performed in a magnetically and acoustically shielded room. The subjects were in an upright position, seated as comfortably as possible while ensuring that they did not move during the measurement. The subject's head position was checked at the beginning and end of each recording block by means of three localization coils fixed to the nasion and the entrances of both ear canals. Subjects were instructed not to move, to stay in a relaxed waking state during the measurement, and not to pay attention to the sound stimuli. Alertness and compliance were verified by video monitoring. To control for confounding changes in attention and vigilance, subjects watched a soundless movie of their choice, which was projected on a screen placed in front of them.
MEG data analysis.
The recorded magnetic field data were averaged separately for the standard and deviant stimuli and the three-tone and six-tone stimulus sequences. Subtracting the standard data from the deviant data generated difference waveform data sets. Epochs contaminated by muscle or eye blink artifacts containing field amplitudes of >3 pT in any channel were automatically rejected from the averaging procedure.
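The rejection, averaging, and subtraction steps described above can be sketched in plain Python as follows. Epochs are represented as nested lists (channels × samples, in tesla); this data layout and the function names are assumptions for illustration, and the threshold is applied to the absolute field amplitude.

```python
REJECT_THRESHOLD_T = 3e-12  # 3 pT


def is_clean(epoch):
    """True if no sample in any channel exceeds the rejection threshold."""
    return all(abs(v) <= REJECT_THRESHOLD_T for channel in epoch for v in channel)


def average_epochs(epochs):
    """Average all artifact-free epochs, channel by channel and sample by sample."""
    clean = [e for e in epochs if is_clean(e)]
    if not clean:
        raise ValueError("all epochs were rejected")
    n_channels = len(clean[0])
    n_samples = len(clean[0][0])
    return [
        [sum(e[c][s] for e in clean) / len(clean) for s in range(n_samples)]
        for c in range(n_channels)
    ]


def difference_waveform(deviant_avg, standard_avg):
    """Deviant minus standard, for every channel and sample."""
    return [
        [d - s for d, s in zip(dev_ch, std_ch)]
        for dev_ch, std_ch in zip(deviant_avg, standard_avg)
    ]


if __name__ == "__main__":
    standards = [[[1e-13, 2e-13]], [[3e-13, 4e-13]], [[5e-12, 0.0]]]  # last one rejected
    deviants = [[[2e-13, 6e-13]]]
    std_avg = average_epochs(standards)
    dev_avg = average_epochs(deviants)
    print(difference_waveform(dev_avg, std_avg))
```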
For the analysis in sensor space, root mean square (RMS) values were calculated for each subject over all sensor channels for the averaged data sets of the standard and deviant conditions and for the difference data sets. The obtained RMS data were then averaged over the subjects of each group for the deviant, standard, and the difference waveform. For the subsequent MMN source analysis, the averaged field waveforms were 30 Hz low-pass filtered, and a baseline correction was performed based on the 100 ms time interval preceding the onset of the piano tone sequences. Then, the source analysis model of two equivalent current dipoles (ECD) (one in each hemisphere, latency ∼150–250 ms after stimulus onset) was applied. The two spatiotemporal dipoles, defined by their dipole moments, orientation, and spatial coordinates, were fitted simultaneously to the MMN based on the difference waveforms for both hemispheres, separately for the three- and six-tone sequences and for each recorded data set before and after training.
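A minimal sketch of the RMS computation over channels and of the baseline correction is given below. The nested-list data layout and the function names are assumptions; the 30 Hz low-pass filter and the dipole fit are not reproduced here.

```python
from math import sqrt


def rms_over_channels(data):
    """RMS across channels at each time sample (data: channels x samples)."""
    n_channels = len(data)
    n_samples = len(data[0])
    return [
        sqrt(sum(channel[s] ** 2 for channel in data) / n_channels)
        for s in range(n_samples)
    ]


def baseline_correct(waveform, sfreq_hz, onset_index, baseline_s=0.1):
    """Subtract the mean of the `baseline_s` seconds preceding stimulus onset."""
    n_base = round(baseline_s * sfreq_hz)
    start = onset_index - n_base
    if n_base < 1 or start < 0:
        raise ValueError("baseline interval does not fit before stimulus onset")
    baseline = sum(waveform[start:onset_index]) / n_base
    return [v - baseline for v in waveform]


if __name__ == "__main__":
    print(rms_over_channels([[3.0, 0.0], [4.0, 0.0]]))
    # With 600 Hz sampling, a 100 ms baseline corresponds to 60 samples.
    print(round(0.1 * 600))
```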
The estimated source location was determined in a head-based Cartesian coordinate system with the origin at the midpoint of the mediolateral axis (y-axis), which joined the center points of the entrances to the ear canals (positive toward the left ear). The posterior–anterior axis (x-axis) was oriented from the origin to the nasion (positive toward the nasion), and the inferior–superior axis (z-axis) was perpendicular to the x–y plane (positive toward the vertex). Further anatomical and statistical constraints were applied to the data. In general, only estimated source locations fulfilling the following anatomical considerations characterizing the human auditory cortex area were included for further analysis: anterior–posterior value (x) within ± 3 cm, medial–lateral value (y, distance from the midsagittal plane) >2 cm. Additionally, a statistical consideration of >75% goodness of fit for the dipolar source was imposed. Median values of x, y, and z coordinates of the ECDs as well as of the angles of the dipole orientation were calculated across the stimulus conditions of the three- and six-tone sequences before and after training. The median values of the source coordinates and orientations were then used as reference for the source-space projection method (Tesche et al., 1995). The source-space projection estimates the activity in a certain brain area by a linear combination of the measured field at the 275 sensor positions outside the head, and thus, it transfers the multichannel magnetic field data into two single time series of magnetic dipole moments for the left and right hemispheres. These time series reach a maximum only for a typical dipolar magnetic field pattern of a single current source in an a priori specified brain region, and therefore, this method is spatially sensitive.
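The inclusion criteria and the median reference location can be illustrated with the sketch below. The `DipoleFit` type is hypothetical, the boundary at exactly 3 cm is treated as included (an assumption), and in practice the medians would be computed separately for each hemisphere.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class DipoleFit:
    x_cm: float             # posterior-anterior, positive toward the nasion
    y_cm: float             # medial-lateral, positive toward the left ear
    z_cm: float             # inferior-superior, positive toward the vertex
    goodness_of_fit: float  # percent


def is_plausible(fit):
    """Anatomical and statistical criteria for an auditory cortex source."""
    return (
        abs(fit.x_cm) <= 3.0
        and abs(fit.y_cm) > 2.0  # distance from the midsagittal plane
        and fit.goodness_of_fit > 75.0
    )


def reference_location(fits):
    """Median x, y, z of all fits that pass the inclusion criteria."""
    kept = [f for f in fits if is_plausible(f)]
    if not kept:
        raise ValueError("no dipole fit passed the inclusion criteria")
    return (
        median(f.x_cm for f in kept),
        median(f.y_cm for f in kept),
        median(f.z_cm for f in kept),
    )


if __name__ == "__main__":
    left_fits = [
        DipoleFit(1.0, 5.0, 6.0, 92.0),
        DipoleFit(0.5, 4.5, 5.5, 88.0),
        DipoleFit(2.0, 4.0, 5.0, 80.0),
        DipoleFit(4.0, 5.0, 6.0, 95.0),  # rejected: x outside +/-3 cm
        DipoleFit(1.0, 1.5, 6.0, 90.0),  # rejected: too close to the midline
        DipoleFit(1.5, 5.5, 6.5, 60.0),  # rejected: goodness of fit too low
    ]
    print(reference_location(left_fits))
```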
The source-space projection allows calculating the grand averages of dipole moment time-series across different subjects and different conditions, thereby enhancing the signal-to-noise ratio with uncorrelated system noise canceling out. The method is maximally sensitive for brain activity from sources at selected origins and orientations. Unwanted activities from more distant sources or sources having different orientations are combined less optimally, and therefore the activity of these sources is reduced in the dipole moment waveforms. The dipole moment waveforms, over the whole stimulus-related epochs of the different conditions of the experiment, were calculated based on the source-space projection.
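One common way to express such a projection is as a least-squares fit of the measured field onto the field pattern of the reference dipole. In the sketch below, `lead_field` is a hypothetical input holding the field that a unit dipole at the reference location and orientation would produce at each sensor; computing it requires a head model that is not shown. This is an illustration of the general idea, not necessarily the exact formulation used in the study.

```python
def project_to_source(field, lead_field):
    """Project multichannel data onto a single source.

    field:      channels x samples, measured magnetic field
    lead_field: one value per channel, field per unit dipole moment
                (assumed to be known from a head model)
    Returns the estimated dipole moment at each time sample,
    q(t) = (l . b(t)) / (l . l).
    """
    norm = sum(w * w for w in lead_field)
    if norm == 0:
        raise ValueError("lead field must not be all zeros")
    n_samples = len(field[0])
    return [
        sum(w * channel[s] for w, channel in zip(lead_field, field)) / norm
        for s in range(n_samples)
    ]


if __name__ == "__main__":
    lead_field = [1.0, 2.0]
    # Sample 0: field pattern equal to twice the lead field.
    # Sample 1: field pattern orthogonal to the lead field.
    field = [[2.0, 2.0], [4.0, -1.0]]
    print(project_to_source(field, lead_field))
```

A field pattern that matches the reference dipole is recovered at full strength, whereas a pattern orthogonal to it contributes nothing, which is the sense in which the projection is spatially selective.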
Finally, grand average waveforms for RMS values were computed for pretraining and posttraining data, groups (SA and A), and stimulus sequence (three tone and six tone). Grand average waveforms for the dipole moments were computed for pretraining and posttraining data, groups (SA and A), stimulus sequence (three-tone and six-tone stimulus sequences), and hemisphere (left and right). To evaluate the MMN strength across participants, the MMN peaks of RMS values and dipole moments were determined from the corresponding waveforms of each individual participant and subjected to statistical analysis by means of repeated measures mixed-model ANOVA with factors group, pretraining/posttraining, stimulus sequence, and hemisphere. In all statistical tests, the α level was set at 0.05, and all tests were two-tailed unless otherwise stated.
Results
Behavioral data
The results of the behavioral test (Fig. 2) revealed that discrimination as indexed by d′ improved significantly in both groups after training compared with before. This observation is statistically supported by a main effect of pretraining/posttraining (F(1,17) = 14.42; p = 0.001) in a 2 × 2 mixed-model ANOVA with factors group and pretraining/posttraining. The interaction of group × pretraining/posttraining was not significant (F(1,17) = 2.72; p = 0.117), but there was a trend for more improvement in the SA group (pretraining: mean, 1.42; SD, 1.17; posttraining: mean, 3.24; SD, 1.19) than in the A group (pretraining: mean, 1.37; SD, 0.52; posttraining: mean, 2.09; SD, 1.09). The main effect of group was not significant (F(1,17) = 3.09; p = 0.097), indicating that there was no overall difference in performance between groups.
Figure 2. Group means of behavioral performance in the auditory discrimination test before and after training as measured by the sensitivity index d′. pre, Pretraining; post, posttraining. Error bars indicate SEM.
For the SA group, the three measures of motor performance in the piano training are displayed in Figure 3. The learning curves reveal that the SA group improved their playing continuously in the course of training in all three measures of performance: correctness of keystrokes, tempo, and smoothness of playing.
Figure 3. a–c, Progress of motor performance in the course of the training sessions in the SA group as measured by correctness (a), tempo (b), and smoothness of play (c). Error bars indicate SEM.
MEG data
Auditory evoked magnetic fields in response to each tone were obtained in all subjects and both stimulus conditions. Figure 4 shows typical individual RMS data of a participant after the SA training to standard and deviant six-tone sequences (a) and the difference waveform representing the MMN (b, area above noise level marked by thick line). The MMN can readily be seen also in the deviant condition response (a, thick line marked by arrow) within the time region between 150 and 200 ms after tone onset, if compared with the standard condition (a, thin line). The iso-contour plot (a, right corner inset), constructed from the MEG field data as recorded by all channels for the MMN maximum, illustrates a clear dipolar pattern justifying the application of the current dipolar model for the evaluation of the MEG experimental data in cortical source space.
Figure 4. a, Typical responses (RMS values) of an individual subject to standard (thin line) and deviant (thick line) six-tone sequences. The field distribution of the MMN component at the time point indicated by the arrow is shown in the right corner inset. The two arrows in the inset indicate schematically auditory cortex dipolar sources, and the triangle symbolizes the nose. Onsets of tone stimuli are indicated by triangles. b, Difference waveform obtained by subtracting the standard waveform from the deviant waveform. A clear MMN component (marked by a thicker line) after the onset of the deviant tone (black triangle) is discernible.
The individual RMS difference waveforms were averaged for each group and each stimulus condition, and the resulting grand averages are illustrated in Figure 5. In the SA group (top row), a distinct increase in the MMN amplitude from pretraining to posttraining is visible in the six-tone condition as well as in the three-tone condition, although slightly smaller in the latter. In contrast, in the A group, only a marginal increase in the six-tone condition and almost no increase in the three-tone condition are discernible. As an additional illustration of the training effects on the RMS values, Figure 6 depicts the average pretraining/posttraining differences of individual MMN peaks for both groups and stimulus conditions. A mixed-model ANOVA on the RMS values with factors group, pretraining/posttraining, and stimulus condition yielded no significant main effects or interactions. This is not very surprising, because the RMS data are not normalized and thus depend strongly on individual head size and geometry relative to the whole-head MEG sensor array. Therefore, interindividual variance is increased and comparability of data between subjects is limited. However, RMS data are unaffected by model assumptions, and thus, they provide valuable initial information.
Figure 5. Group averages of RMS difference waveforms for both groups and three-tone and six-tone stimulus conditions. The onset of the deviant occurs at time point zero. Thin lines indicate pretraining (pre) data and thick lines posttraining (post) data. seq., Sequence.
Figure 6. Group averages of pretraining/posttraining differences of the individual MMN peak amplitudes for both groups and stimulus conditions. seq., Sequence. Error bars indicate SEM.
The pronounced dipolarity of the individual MMN data justified the use of a single equivalent current dipole model in each hemisphere for the subsequent cortical source analysis of the data of the different groups and conditions. The group averages of the source waveforms obtained by source-space projection before and after training for the different groups and conditions are displayed in Figure 7, which represents the major result of this study. A clear MMN is detectable in all panels. MMN in the two groups appears quite similar in pretraining. However, whereas distinctly large increases of the MMN amplitude between pretraining and posttraining data are ascertainable in the SA group (thick vs thin lines), these effects are much smaller in the A group. In the SA group, the increase is clearly more pronounced for the six-tone sequence than for the three-tone sequence. In addition, it is larger in the right than in the left hemisphere for all conditions. In contrast, in the A group, no clear difference pattern for the cortical source strength can be observed; training appears to be associated with an increase in MMN in the right hemisphere for the three-tone sequence, but hardly any effect is seen in the left hemisphere or for the six-tone sequences in either hemisphere.
Figure 7. Group averages of the source waveforms obtained after performing source-space projection before and after training for both groups, stimulus conditions, and hemispheres. Data for the three-tone sequences are shown in the top four panels and data for the six-tone sequences in the bottom four panels. Within each set of four panels, SA group data are shown in the top row, and A group data are shown in the bottom row. Data from the left hemisphere (LH) are presented on the left and those of the right hemisphere (RH) on the right. Thin lines indicate pretraining (pre) data and thick lines posttraining (post) data.
The group averages of the pretraining/posttraining differences of the individual MMN source strength peak amplitudes are depicted in the bar plots of Figure 8 and tested for statistical significance in a mixed-model ANOVA with factors group, pretraining/posttraining, stimulus condition, and hemisphere. Significant main effects of pretraining/posttraining (F(1,18) = 11.20; p = 0.004) and hemisphere (F(1,18) = 9.56; p = 0.006) were qualified by several significant interactions. A significant group × pretraining/posttraining interaction (F(1,18) = 7.49; p = 0.014) indicated a stronger training effect in the SA group than in the A group. Neither the three-way interactions involving the factor group nor the four-way interaction were significant. However, the three-way group × pretraining/posttraining × stimulus condition interaction approached significance (F(1,18) = 3.17; p = 0.092), indicating a larger training effect for the SA group especially in the six-tone sequence. The overall training effects were stronger in the right hemisphere, as indicated by a significant pretraining/posttraining × hemisphere interaction (F(1,18) = 11.90; p = 0.003). The three-way pretraining/posttraining × hemisphere × stimulus condition interaction also attained significance (F(1,18) = 4.77; p = 0.043), indicating that the dominance of the training effect in the right hemisphere over the effect in the left hemisphere was more pronounced in the six-tone than in the three-tone sequence.
Figure 8. Group averages of the pretraining/posttraining differences of the individual MMN source waveform peak amplitudes from both groups, as well as different stimulus conditions and hemispheres. Left, Left hemisphere; right, right hemisphere; seq., sequence. Error bars indicate SEM.
Discussion
In the present study, we showed that multimodal sensorimotor-auditory training in nonmusicians results in greater plastic changes in auditory cortex than auditory-only training. We examined representations for melodic fragments that were based on a melody with which subjects received 2 weeks of training. After training, compared with before, the sensorimotor-auditory group showed a much larger enhancement of the MMN than did the auditory group. The behavioral results corroborated the electrophysiological ones, although with less significance because of ceiling effects. Because MMN is primarily generated in auditory cortex (Picton et al., 2000), the results indicate strong effects of sensory-motor practice on auditory representations. Few previous studies have examined cross-modal plasticity. However, in one study, it was found that professional trumpet players show enhanced interactions between auditory input and somatosensory input to the lip, presumably as a result of years of practicing their instrument (Schulz et al., 2003). In the present study, we manipulated experience in a well-controlled laboratory setting. We tightly controlled what both groups heard during training by taking the sequences produced by those in the sensorimotor-auditory group and randomly assigning each person in the auditory group to hear the sequences produced by one person in the sensorimotor-auditory group. We also made sure that both groups listened attentively to the auditory stimuli in that they had to make judgments about each sequence heard. Thus, this study demonstrates that, not only are auditory and sensorimotor representations for music connected, but musical representations in auditory cortex change more when the sensorimotor system is involved in training compared with when only the auditory system is involved in training. Certainly sensorimotor training is more demanding and motivating, causing more attentional resources to be spent on the perception of the tones. Thus, the increased level of attention during the sensorimotor-auditory training is a further factor leading to the increased neural activity in the auditory system.
The idea that music and movement are related has a long history (Cross, 2003). Synchronized movement to music is found in all cultures (Brown, 2003), and this relation may have its origins in the rhythmic movements of locomotion (Todd et al., 2007). Executing rhythmic motor movements involves a network of brain areas including basal ganglia, cerebellum, premotor cortex, and supplementary motor cortex (Zatorre et al., 2007). Functional magnetic resonance imaging studies have shown that these movement-related areas are also activated during auditory perceptual tasks (Janata and Grafton, 2003). In particular, the cerebellum (Petacchi et al., 2005) and the premotor cortex (Brown and Martinez, 2007) can show activation during auditory discrimination, and disruption of auditory feedback affects motor execution (Pfordresher and Palmer, 2006). When nonmusicians were trained to play a melody on a keyboard, motor areas were activated when they heard this melody, but not when they heard other melodies (Lahav et al., 2007).
Although it is generally accepted that hearing music makes people want to move, most relevant to the results of the present study is recent evidence suggesting that the relation is bidirectional. In other words, movement can affect auditory processing. Phillips-Silver and Trainor (2005, 2007) showed that for both infants and adults, bouncing on every second beat of an auditory metrically ambiguous rhythm pattern biased listeners to hear the ambiguous pattern as a march, whereas bouncing on every third beat of the same pattern biased them to hear it as a waltz. Recent physiological evidence also indicates strong bidirectional connections between auditory and movement-related areas (Zatorre et al., 2007). For example, auditory cortex is activated when musicians observe someone else play a keyboard (Haslinger et al., 2005). Furthermore, similar auditory and motor areas are activated when pianists play a piece without being able to hear it and when they listen to it without playing it (Baumann et al., 2005; Bangert et al., 2006). The results of the present study are consistent with these findings in showing that motor processing affects auditory areas, and extend them by showing that motor training causes changes in auditory cortex over and above changes introduced by auditory training alone.
It is of interest that a greater effect of brain plasticity occurred for the longer test stimulus than for the shorter test stimulus. There are several possible reasons for this. First, the longer stimulus might simply provide a better context from which to detect deviants. Second, the shorter test stimulus was an exact excerpt from the training stimulus, whereas the longer test stimulus contained patterns from the training stimulus but was not an exact excerpt. Thus, the generalization requirements of the second pattern may have afforded greater opportunity to observe plastic changes. Stronger training effects were also found in the right than in the left auditory cortex. This is consistent with studies showing preferential encoding of spectral information on the right (Zatorre and Samson, 1991; Zatorre and Halpern, 1993; Schönwiesner et al., 2005). However, in the present experiment, the test stimuli included patterns from the training stimulus that were predominantly played by the right hand and less by the left hand. One might then expect strong motor training for the test stimuli in the left hemisphere. However, not only was greater plasticity seen in the right than in the left hemisphere, but these effects were particularly strong in the right hemisphere for the longer test stimulus that required greater generalization of training effects. The influence of motor training on plasticity in auditory cortex, then, appears to occur at a level of representation for melody that does not depend on the particular hand used.
Many studies have shown differences in processing between musicians and nonmusicians (Elbert et al., 1995; Schlaug et al., 1995; Pantev et al., 1998, 2001; Schlaug, 2001; Münte et al., 2002; Schneider et al., 2002; Fujioka et al., 2004, 2005; Trainor, 2005), and many have attributed these differences to the specific experience that musicians acquire while practicing their instrument for hours a day over many years. However, in most studies, it is difficult to know whether intrinsic early differences led to the decision to train musically, or whether the differences observed are indeed attributable to the training itself. Nevertheless, some studies strongly suggest a role of experience in musician–nonmusician differences. For example, effects of training can be instrument-specific (Pantev et al., 2001; Schulz et al., 2003; Shahin et al., 2008), and the EEG responses of children taking music lessons have been shown to change differently over the course of a year compared with those of children not studying music (Fujioka et al., 2006). However, the gold standard for showing causal effects of musical training is random assignment and experimental control of experience. Both of these conditions were met in the present study. Thus, we are able to conclude that the more robust effects for sensorimotor-auditory training than for auditory training alone are attributable to the experience itself.
Footnotes
- This work was supported by Deutsche Forschungsgemeinschaft Grant PA392/12-1. We thank A. Wollbrink for technical help, K. Berning for supporting the data acquisition, and our test subjects for their diligent collaboration.
- Correspondence should be addressed to Dr. Christo Pantev, Institute for Biomagnetism and Biosignalanalysis, University of Münster, Malmedyweg 15, D-48149 Münster, Germany. pantev{at}uni-muenster.de