Auditory and written language comprehension in humans requires attention to the message of interest and suppression of interference from distracting sources. Investigating the brain areas associated with the control of interference is challenging because activation of the brain regions that control interference inevitably co-occurs with activation related to interference per se. To isolate the mechanisms that control verbal interference, we used a combination of structural and functional imaging techniques in Italian and German participants who spoke English as a second language. First, we searched structural MRI images of Italian participants for brain regions in which brain structure correlated with the ability to suppress interference from the unattended dominant language (Italian) while processing heard sentences in their weaker language (English). This revealed an area in the posterior paravermis of the right cerebellum in which gray matter density was higher in individuals who were better at controlling verbal interference. Second, we found functional activation in the same region when our German participants made semantic decisions on written English words in the presence of interference from unrelated words in their dominant language (German). This combination of structural and functional imaging therefore highlights the contribution of the right posterior paravermis to the control of verbal interference. We suggest that the importance of this region for language processing has previously been missed because most fMRI studies limit the field of view to increase sensitivity, with the lower part of the cerebellum being the region most likely to be excluded.
The ability to control interference from competing information is essential for efficient auditory and written speech processing. Interference can occur at the sensory level (e.g., from environmental noise), the comprehension level (e.g., when two people speak at the same time), and at the production level (e.g., when the same message can be expressed in more than one way). The current study used a combination of structural and functional imaging to investigate the brain regions involved in suppressing verbal interference during speech comprehension.
Dissociating processing related to control from processing related to interference is not easy in functional imaging paradigms because the two co-occur in time. One approach is to distinguish activation that occurs at the onset of interference from activation that occurs during sustained interference, with the assumption that processing related to control mechanisms will be sustained or increase over time, while processing related to interference will decrease over time (because it is being controlled). Using this rationale, we have previously associated activation in the head of the left caudate with the control of verbal interference during a Stroop task that taps single-word interference during color naming (Ali et al., 2010). However, it is much more challenging to design functional imaging paradigms that can dissociate interference from control during more complex tasks, such as sentence comprehension. Therefore, the current study used structural imaging to identify brain regions in which structure was positively correlated with the ability to control verbal interference, as measured outside the scanner using a task that assessed auditory sentence comprehension in the presence of interference from competing sentences. To validate the findings, we used functional imaging from a complementary experiment to confirm that the identified areas were activated when participants made semantic decisions on written words in the presence of strong versus weak interference from distracting words.
To maximize intersubject variance in the ability to control language interference, our participants were non-native users of English. Those with high English proficiency were expected to have the greatest expertise in controlling interference from their first (dominant) language. The wide intersubject variability in the control of verbal interference was then correlated with intersubject variability in brain structure. Based on previous functional imaging studies, we predicted that the control of verbal interference at the comprehension level would correlate with brain structure in left inferior or middle frontal regions (Rodriguez-Fornells et al., 2002, 2005) and/or the left head of caudate (Ali et al., 2010). Specifically, Rodriguez-Fornells et al. (2002, 2005) found that when bilinguals were making semantic decisions on written words in their native language, and words in their non-native language were presented, left inferior and middle frontal activation increased more than in monolinguals who did not speak the non-native language. However, these studies were not designed to distinguish activation related to “control functions” from activation related to interference per se. Therefore, in addition to looking in regions of interest, we also conducted a whole-brain analysis to search for regions that have not previously been associated with the control of verbal interference.
Materials and Methods
The study was approved by the local ethics committee. All participants gave written informed consent.
Structural imaging study
The structural imaging study included data from 26 right-handed Italian adults (16 females, mean age 32.9 years, SD = 7.1, range 21.3–41.4) who were late learners of English and resident in the United Kingdom (UK) at the time of testing. All participants completed a language history questionnaire adapted from Li et al. (2006). The sample of participants was selected to represent a wide range of English-language abilities while keeping other characteristics as constant as possible (see below).
The Bilingual Verbal Ability Tests
The Bilingual Verbal Ability Test (BVAT) (Muñoz-Sandoval et al., 1998) contains three standardized tests administered individually: (1) Picture Vocabulary; (2) Oral Vocabulary; and (3) Verbal Analogies. In the Picture Vocabulary test, participants are asked to name a total of 58 pictured objects, with the degree of difficulty gradually increasing. It is an expressive language task that involves word retrieval ability at the single-word level and measures word comprehension/knowledge. The Oral Vocabulary test is in two parts, one for Synonyms (20 items) and one for Antonyms (24 items). In the Synonyms subtest, the participant is asked to make a synonymous word association, with difficulty increasing gradually. In the Antonyms subtest, the participant is asked to make an opposite (antonymous) word association, with difficulty again increasing gradually. In the Verbal Analogies test, the participant is asked to recognize the analogous relationship between two words and to find the word that completes the analogy (e.g., “bird is to fly as fish is to … swim”). This task consists of 35 items measuring verbal reasoning in increasingly complex conceptual/logical steps.
The three BVAT tests were administered in English (second language; L2) first. Each item answered incorrectly in English was readministered in Italian (first language; L1), thereby resulting in two different scores: (1) an English Raw score; and (2) a Gain score for L1. All scoring was automated through the “Scoring and Reporting Program” software, which is a standard feature of the BVAT kit. The BVAT generates a measure to assess the cognitive-academic level of proficiency in English (CALP).
The BVAT categorized participants into 5 levels of cognitive-academic proficiency in English, subdivided in steps of 0.5 to give 9 degrees ranging from negligible (score = 1) to advanced (score = 5). The number of participants at each level of proficiency was as follows: N = 7, very limited (CALP score 2.0), N = 2, limited (CALP score 3.0), N = 8, limited to fluent (CALP score 3.5), N = 6, fluent (CALP score 4.0), N = 1, very fluent (CALP score 4.5), N = 2, advanced (CALP score 5.0). Thus, our participants had a wide range of second language proficiency that we predicted would result in a wide range in the ability to control verbal interference. It should be noted that the CALP is a refined index of proficiency provided by the BVAT. Therefore, a “Negligible” level of proficiency does not mean that the subject cannot speak English at a functional level.
Sentence interpretation task
We designed a variant of a sentence interpretation task that has previously been used in cross-linguistic research (MacWhinney and Bates, 1989; Bates et al., 2001; Dick et al., 2003), clinical research (Dick et al., 2001), and developmental research (Dick et al., 2004). In this task, which builds on that by Leech et al. (2007), participants must identify the agent in a series of sentences varying in structural complexity in the presence or absence of interfering sentences also presented simultaneously in both ears (diotic listening). The target language was either the first language (L1) or the second language (L2). Likewise, language interference could either be the same language as the target or a different language. This resulted in four different interference conditions: (1) target sentence in L1 with interference in L1, (2) target sentence in L2 with interference in L2, (3) target sentence in L1 with interference in L2, and (4) target sentence in L2 with interference in L1. There were also two conditions with no interference, where the target sentence was either in (5) L1 or (6) L2. Within each of these six conditions, the syntactic structure of the sentences was either canonical [Subject-Verb-Object (S-V-O)] or noncanonical [Object-Verb-Subject (O-V-S) or Object-Subject-Verb (O-S-V)]. Both Italian (L1) and English (L2) predominantly use canonical S-V-O word order (Bates et al., 1982). Thus, canonical sentences were taken to be easier and therefore imposing a lower cognitive load (Roland et al., 2007). Conversely, the noncanonical sentences were taken to be harder and more cognitively demanding (high-load processing).
Based on the results of previous studies (Dick et al., 2001; Leech et al., 2007), we anticipated that meaningful individual differences in language skill—and crucially, in the cognitive control of interference—would be revealed by the most challenging set of conditions. We expected interference to be highest when the target sentence had a noncanonical structure and was presented in L2 (the weaker language), and when interference was presented in L1 (the dominant language).
Participants were told that they would see two drawings of animals presented simultaneously on the left and right sides of a computer screen and that during this time they would also hear a sentence featuring the two animals, with one of them doing a “bad action” to the other. Participants were required to identify the animal doing the bad action by making the corresponding left or right key press. They were also told that in some conditions they would hear two people speaking simultaneously, one male voice and one female voice. Participants were instructed to focus on the voice with the gender indicated on the computer screen at the beginning of the task and ignore the other voice. An illustration of the experimental setup is displayed in Figure 1.
All participants were instructed in English and completed 16 practice trial sentences for each experimental condition. For a given sentence, the position of the agent animal (left or right) was counterbalanced across participants. Four pseudo-random condition orders were created, which were randomly allocated to the participants with 6 or 7 participants per order. For each order, half of the target sentences were spoken by a woman and half by a man. Each trial was presented immediately following the participant's response, allowing a maximum of 3 s, after which, if there was no response, the next trial was presented automatically. Trials were presented in short runs of variable length (4, 6, or 8 trials) in which the target language alternated to maximize interference and, therefore, the need for selective attention; i.e., a target run in L1 was always followed by a target run in L2 and vice versa. In the language interference condition, the L1 and L2 sentences used as interference were counterbalanced in such a way that participants would perform an equal number of trials in the same language (i.e., L1/L1, L2/L2) and opposing language (i.e., L2/L1, L1/L2).
In each trial, both visual and auditory stimuli were presented. The visual stimuli were drawings of familiar animals taken from several picture databases (Snodgrass and Vanderwart, 1980; Abbate and LaChappelle, 1984a,b). Single pictures were digitized black-and-white line drawings (7.0 × 5.0 cm) displayed in pairs in accordance with the auditory stimuli (the sentences featuring the animals). Each drawing was embedded in a solid gray rectangle surrounded by a white background (Fig. 1). The auditory stimuli were 192 sentences, 96 in English (L2) and 96 translation equivalents in Italian (L1). The easy canonical sentences (S-V-O) were (1) active and (2) subject-cleft syntactic structures. The difficult noncanonical sentences (O-V-S or O-S-V) were (3) object cleft and (4) passive syntactic structures. Table 1 shows examples of these sentence types.
Target and non-target sentences were created from a pool of animal nouns and action verbs using the following criteria: (1) each animal appeared twice as agent, and twice as patient; (2) each verb appeared twice; (3) no noun appeared with a verb more than once as an agent and no noun appeared with a verb more than once as a patient; (4) no two nouns were combined together twice; (5) the names of the animals were not cognates; (6) the verbs chosen were all high-frequency verbs, transitive, and with mildly negative meaning; (7) attended (i.e., target) and competing (i.e., interfering) sentences were always spoken by speakers with different genders and counterbalanced across languages; (8) attended and competing sentences were paired pseudo-randomly with the proviso that the same animals and syntactic structure would never be presented simultaneously in target and non-target sentences. Thus, the decision point for driving a response would rarely if ever be simultaneous in target and non-target sentences.
Sentences were recorded by native speakers (1 male and 1 female in each case) of British English (L2) or Italian (L1) onto digital audio tape (DAT) in an Industrial Acoustics 403-A audiometric chamber with a TASCAM DA-P1 DAT recorder and a Sennheiser ME65/K6 supercardioid microphone and pre-amp at gain levels between 6 and 12 dB. The recorded stimuli were then digitized via digital-to-digital sampling onto a Macintosh G4 computer via a Digidesign MBox using ProTools LE software at a sampling rate of 44.125 kHz with a 16 bit quantization. The waveform of each sentence and animal name was then edited, converted into a 16 bit 44.125 kHz mono sound file in Audacity 1.2.5 for Mac, and saved in .wav format. Each target and competing speech sentence was normalized to a root mean squared amplitude of 70 dB using Praat software (Boersma and Weenink, 2010), such that the average signal-to-noise ratio over the whole sentence was 0 dB.
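The level-matching step can be illustrated with a minimal sketch. It assumes linear sample values held in plain Python lists and an arbitrary target RMS of 0.05; the function names and the synthetic sine-wave "sentences" are hypothetical, and this is not the Praat procedure used in the study. The point is only that scaling target and competing signals to the same RMS amplitude yields an average signal-to-noise ratio of 0 dB by construction.

```python
import math


def rms(samples):
    """Root mean squared amplitude of a sequence of linear sample values."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def scale_to_rms(samples, target_rms):
    """Return a copy of the signal rescaled so that its RMS equals target_rms."""
    current = rms(samples)
    if current == 0:
        raise ValueError("cannot normalize a silent signal")
    gain = target_rms / current
    return [s * gain for s in samples]


def snr_db(target, masker):
    """Average signal-to-noise ratio in dB, computed from whole-signal RMS."""
    return 20.0 * math.log10(rms(target) / rms(masker))


# Synthetic stand-ins for a target and a competing sentence (assumption:
# one second of audio at 8 kHz, different levels before normalization).
target = [0.5 * math.sin(2 * math.pi * 220 * n / 8000) for n in range(8000)]
masker = [0.1 * math.sin(2 * math.pi * 330 * n / 8000) for n in range(8000)]

target_n = scale_to_rms(target, 0.05)
masker_n = scale_to_rms(masker, 0.05)
print(round(snr_db(target, masker), 2), round(snr_db(target_n, masker_n), 2))
```

Before normalization the two synthetic signals differ in level; after it, their whole-signal SNR is 0 dB up to floating-point error.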
The experiment was run under Matlab 7.7.0 (MathWorks Inc.) on a MacBook 13 inch laptop computer with the auditory stimuli presented through Sennheiser EH-150 headphones. Accuracy was recorded in Matlab from a USB Logitech Precision game-pad in which only two buttons were enabled, one on the right and one on the left.
Accuracy scores in the baseline condition (i.e., without language interference) were subtracted from those in the interference condition to obtain a task ability score. The score is on a negative scale because performance on non-interference (baseline) tasks was always better than that of interference conditions. Consequently, better ability to manage interference is indicated by higher (“less negative”) scores. As predicted by the wide range of proficiency in our sample, there was also a wide range of scores on the sentence task (Table 2), which ensured the necessary intersubject variability for the structural brain imaging analysis.
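As a minimal sketch of this scoring rule, the snippet below subtracts baseline accuracy from accuracy under interference and ranks participants. The participant labels and accuracy values are hypothetical and are not data from the study; accuracy is assumed to be expressed as a proportion correct.

```python
def interference_ability(acc_interference, acc_baseline):
    """Accuracy under interference minus baseline accuracy (proportions correct).

    Scores are typically negative; values closer to zero indicate that
    performance suffered less from the competing sentence.
    """
    return acc_interference - acc_baseline


# Hypothetical (baseline, interference) proportions correct per participant.
participants = {
    "P01": (1.0, 0.75),
    "P02": (1.0, 0.5),
    "P03": (0.875, 0.75),
}

scores = {
    pid: interference_ability(interf, base)
    for pid, (base, interf) in participants.items()
}

# Rank from best (least negative) to worst at managing interference.
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```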
Interference was highest when the task involved noncanonical sentences in the non-native language (L2) and interference was presented in the native language (Fig. 2). This was confirmed by a 2 × 2 × 3 within-subjects ANOVA on errors, crossing interference (present, absent) with sentence type (canonical, noncanonical) and language (L1, L2). The effect of interference (present vs absent) interacted with both language (F(2,52) = 3.672, p = 0.032, η2 = 0.124) and sentence type (F(2,52) = 5.041, p = 0.010, η2 = 0.162) with no significant three-way interaction (F(2,52) = 2.091, p = 0.134).
The Simon Task (Simon and Wolf, 1963; Lu and Proctor, 1995) tests the ability to resolve nonverbal stimulus–response conflict. By including this measure in the imaging analysis, we were able to focus on the control of verbal conflict in the sentence interpretation task described above, after factoring out the ability to control nonverbal conflict in the Simon Task. Both tasks involved the same right-hand motor response. Therefore, an effect that was greater for the control of verbal interference relative to the control of nonverbal conflict is unlikely to be due to the control of interference at the motor response level.
Participants were shown a red or a blue square appearing on either the left or the right side of the screen. They were asked to press one key in response to the red square and another key in response to the blue square. The control of nonverbal conflict is measured by comparison of performance on congruent and incongruent trials. On congruent trials, the color stimulus matches the side of the button (e.g., red square requiring left button response appearing on the left side of the screen). By contrast, on incongruent trials, the color stimulus does not match the side of the correct button press response (e.g., red square requiring left button response appearing on the right side of the screen), typically leading to slower reaction times (Bialystok et al., 2004). A computer-based version of the Simon Task was developed with Matlab and presented on the same MacBook laptop as the sentence task described above. A two-button keypad was connected to the computer. The task began with a fixation cross in the center of the screen that remained visible for 800 ms and was followed by a 250 ms blank interval. At the end of this interval, a red or blue square appeared either on the left or the right side of the screen and remained visible for 1000 ms if there was no response.
Participants were asked to respond according to one characteristic of the stimulus, i.e., the color red or blue, while ignoring an irrelevant characteristic of the same stimulus, i.e., its position on the screen (right or left). There were in total 28 sequential randomized test trials, 14 congruent and 14 incongruent (Bialystok et al., 2004; Morton and Harper, 2007). Participants were trained with 4 practice trials and the experiment automatically began after all practice trials were successfully passed. Between the practice and the experimental phase, all participants were reminded to press the buttons as quickly and accurately as they could. Only two participants needed >4 practice trials before carrying out the test. The task took ∼5 min to complete.
To best capture the variance, the effects of interference on the reaction times and error rates were combined in a composite task efficiency score. Median correct reaction time for congruent trials was subtracted from that for incongruent trials and the result divided by the difference in proportion correct for incongruent minus congruent trials. In this case, a negative efficiency score indicates better performance with congruent trials. Thus, the more positive (or less negative) the score, the more efficient the participant was on the most demanding incongruent trials.
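The composite score described above can be sketched as follows. The function name and the example values are hypothetical. The text does not say how a participant with identical accuracy on both trial types was handled, so returning `None` for a zero denominator is purely an assumption of this sketch.

```python
from statistics import median


def simon_efficiency(rt_congruent, rt_incongruent, acc_congruent, acc_incongruent):
    """Composite Simon score: RT cost divided by the accuracy difference.

    rt_* are correct-trial reaction times in ms; acc_* are proportions correct.
    Numerator:   median RT (incongruent) - median RT (congruent)
    Denominator: proportion correct (incongruent) - proportion correct (congruent)
    """
    rt_cost = median(rt_incongruent) - median(rt_congruent)
    acc_diff = acc_incongruent - acc_congruent
    if acc_diff == 0:
        # Assumption: the score is undefined when accuracy is identical.
        return None
    return rt_cost / acc_diff


# Hypothetical participant: 60 ms slower and 25% less accurate on incongruent trials.
score = simon_efficiency([400, 420, 440], [450, 480, 500], 1.0, 0.75)
print(score)
```

With these illustrative values the RT cost is 60 ms and the accuracy difference is -0.25, so the score is negative, in line with the sign convention described in the text.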
There were fewer incongruent trials in the Simon Task (14) than in the Sentence Interpretation task (48). However, this was sufficient to result in a significant effect of incongruency. A one-way repeated-measures ANOVA for trial type (congruent, incongruent) confirmed that participants were slower and less accurate on incongruent than congruent trials: reaction time, F(1,26) = 54.848, p < 0.001, η2 = 0.687; errors, F(1,26) = 6.045, p = 0.021, η2 = 0.195. We also note that the original study by Simon and Wolf (1963) in which the effect was discovered included only 16 trials per condition (Bialystok et al., 2004).
Matrices (Part of the British Ability Scale II)
The Matrices task from the British Ability Scale II (BAS-II) (Elliot et al., 1997) is a test of nonverbal reasoning. It was included in the imaging analysis to partially control for so-called performance IQ (Richardson et al., 2010). In this test, participants were shown an incomplete matrix of black and white abstract figures, with each matrix consisting of either four or nine cells. Participants were asked to complete the matrix by first selecting the most appropriate pattern from six potential tiles and then indicating their selection by pointing to or reading the number of the tile. Participants first completed four practice items and then began the test at an age-appropriate level, which is indicated on the test (previous items are administered should they fail on the first three test items). The test was discontinued if the participant made five failures out of six consecutive items. An ability score is obtained from a look-up table supplied with the test.
Structural image acquisition
Anatomical whole-brain images were acquired using a Siemens Sonata 1.5T MRI scanner; a T1-weighted modified driven equilibrium Fourier transform (MDEFT) sequence (Deichmann et al., 2004) was used to collect 176 sagittal slices with an image matrix of 256 × 224, yielding a final resolution of 1 mm³ (TR/TE/TI = 12.24 ms/3.56 ms/530 ms).
Structural image analysis
Scans were analyzed using SPM 8 (Wellcome Department of Imaging Neuroscience, http://www.fil.ion.ucl.ac.uk/spm). Structural images were processed using the Diffeomorphic Anatomical Registration Through Exponentiated Lie Algebra (DARTEL) toolbox available in SPM 8 (Ashburner, 2007, 2009). DARTEL uses a more sophisticated registration model than previous approaches implemented in the SPM software (Ashburner, 2009). Structural images were first segmented in native space into gray and white matter. A template brain was then created in DARTEL using default parameter settings. This process iteratively matches selected images to a template generated by their own mean. The resulting flow fields containing deformation information generated by this process were then used to spatially normalize gray matter images to Montreal Neurological Institute (MNI) space. Both modulated and unmodulated images were created. Unmodulated images preserve the concentration of gray matter, thereby representing gray matter density (Mechelli et al., 2005). Modulated images make a correction for local brain volume and therefore represent volume changes rather than density changes. Previous studies of second language learning and vocabulary acquisition (Mechelli et al., 2004; Lee et al., 2007; Grogan et al., 2009; Richardson et al., 2009) have found robust and replicable results using unmodulated images (i.e., gray matter density) rather than modulated images (i.e., gray matter volume). The normalized modulated and unmodulated images were smoothed using an isotropic kernel of 8 mm at full-width half-maximum (FWHM).
Statistical analyses of structural data
A multiple regression analysis was used to identify the main effect of verbal control while factoring out variance associated with the following cognitive skills: (1) second language proficiency (as measured by the BVAT), (2) the control of nonverbal information (as measured by the Simon Task), and (3) nonverbal reasoning (as measured by the Matrices task). To model linear effects of age, (4) age-in-years was also included as a regressor in all analyses. We performed two different second-level analyses, as follows.
Structural imaging analysis 1.
The regressor of interest was the ability scores for the most difficult language control condition when participants made decisions on noncanonical sentences in L2, with interference in L1. In addition, regressors (1) to (4) were included as regressors of no-interest.
Structural imaging analysis 2.
To demonstrate that verbal interference arose at a semantic rather than a perceptual level, we added three more regressors, which were the ability scores for processing L1 noncanonical sentences with L1 interference, L1 noncanonical sentences with L2 interference, and L2 noncanonical sentences with L2 interference. Together with the regressor from analysis 1 (L2 noncanonical sentences with L1 interference), this allowed us to investigate the main effect of target language (L1 vs L2), the main effect of interference language (L1 vs L2), and the interaction between these variables. We did not include the ability scores for controlling interference during the processing of canonical sentences because performance was near ceiling levels for most participants. There was therefore insufficient variance across participants to correlate brain structure with performance in the canonical target conditions.
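The factorial logic of this analysis can be sketched as contrast weights over the four noncanonical ability-score regressors. The regressor ordering, the sign conventions (L2 minus L1 for target language, L1 minus L2 for interference language), and the variable names are assumptions made for illustration; they are not taken from the authors' SPM design.

```python
# Assumed regressor order (target language / interference language).
CONDITIONS = [
    "L1 target / L1 interference",
    "L1 target / L2 interference",
    "L2 target / L1 interference",
    "L2 target / L2 interference",
]

# Main effect of target language (L2 minus L1).
target_language = [-1, -1, 1, 1]

# Main effect of interference language (L1 minus L2).
interference_language = [1, -1, 1, -1]

# Interaction: element-wise product of the two main-effect contrasts.
interaction = [t * i for t, i in zip(target_language, interference_language)]


def dot(a, b):
    """Dot product of two equal-length weight vectors."""
    return sum(x * y for x, y in zip(a, b))


for name, weights in [
    ("target language", target_language),
    ("interference language", interference_language),
    ("interaction", interaction),
]:
    print(f"{name:>22}: {weights}")
```

Each contrast sums to zero and the three are mutually orthogonal, so in this balanced 2 × 2 layout the two main effects and the interaction are tested independently of one another.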
The statistical threshold was set to p < 0.05 after familywise error (FWE) correction for multiple comparisons across the whole brain. We also used regions of interest based on prior fMRI studies of language interference and control. These included the inferior and middle frontal regions reported in the studies by Rodriguez-Fornells et al. (2002, 2005) and the left head of caudate reported by Ali et al. (2010) (see Table 3 for details). For our regions of interest, statistical correction was at p < 0.05 FWE corrected within a 10 mm search radius of the peak voxel listed in Table 3. For completeness, we also reduced the threshold to p < 0.05 uncorrected.
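The 10 mm search radius used for the regions of interest amounts to a spherical search volume centered on each previously reported peak. A minimal sketch of that membership test follows; the peak and candidate coordinates are hypothetical placeholders (they are not the Table 3 values), and the function name is illustrative only.

```python
import math


def within_search_radius(voxel_mm, peak_mm, radius_mm=10.0):
    """True if a voxel (MNI coordinates in mm) lies within radius_mm of the peak."""
    return math.dist(voxel_mm, peak_mm) <= radius_mm


# Hypothetical region-of-interest peak and candidate voxels, all in mm.
roi_peak = (-40.0, 20.0, 10.0)
candidates = [
    (-44.0, 24.0, 12.0),  # 6 mm from the peak
    (-52.0, 20.0, 10.0),  # 12 mm from the peak
]

inside = [v for v in candidates if within_search_radius(v, roi_peak)]
print(inside)
```

Only voxels inside the sphere would enter the corrected comparison for that region; everything else is left to the whole-brain threshold.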
Functional imaging study
To demonstrate that the brain areas associated with verbal control mechanisms in the structural imaging study were actively involved when interference was high, we also present the results of a functional imaging experiment in which a group of German participants who spoke English as a non-native language made semantic decisions under interference from unrelated words in L1 or L2. The data for this experiment were collected before those used in the structural imaging experiment, with different participants, tasks, stimuli, modality of presentation, and behavioral assessments and thus provide a completely independent yet meaningful comparison to the results of the structural imaging experiment.
The functional imaging study included data from 8 right-handed German participants (6 males, mean age 36.4 years, range 23–62), who were resident in the UK and had been speaking English for >4 years. Data from the same 8 participants were reported in a previously published study of semantic priming (Crinion et al., 2006). Three additional participants were included in the PET study reported by Crinion et al. (2006), but were excluded from the current study because their native language was not unequivocally German, and the purpose of the current study was to contrast interference from the dominant relative to the weaker language. Although 8 participants is small for an fMRI study, it is sufficient for a PET study because: (1) the signal is 5–10 times larger in PET than fMRI; (2) PET data come from multiple independent measurements from the same individual, as opposed to intercorrelated measurements in fMRI; and, (3) the false-positive rate does not increase notably for sample sizes >6 (Andreasen et al., 1996). To demonstrate the robustness of our results, we illustrate consistency at the individual subject level. Nevertheless, we acknowledge that small samples are susceptible to false negatives (Andreasen et al., 1996) and may not be representative of larger populations. This turned out not to be a concern for the present study because a significant positive activation was observed in a region of interest from the structural study. Therefore, the PET study served to confirm the findings of the structural study.
The Meara (1992) English vocabulary test was used to assess the subjects' general knowledge of words. On average, the participants knew 84% of the words (range 42–100%). The Graded naming task (McKenna and Warrington, 1992) was used to judge naming vocabulary and knowledge of low-frequency words in both English (L2) and German (L1). The results confirmed that naming was better in German (mean 72%, range 63–90) than in English (mean 62%, range 43–93). We expected that better knowledge of German would result in stronger L1 interference than L2 interference.
The in-scanner task involved semantic decisions on written words in L1 or L2 that were either primed with a word in L1 (i.e., L1-L1 or L1-L2) or L2 (i.e., L2-L1 or L2-L2). These four conditions are comparable to the four conditions included in the structural imaging paradigm, even though the precise task demands were different.
On each trial, a pair of written object or animal names was presented consecutively with a short (250 ms) interval between onset times. Participants were instructed to ignore the first word (the interference) and make a two-choice semantic decision on the second word (the target) with a key press response from either the first or middle finger on their right hand. There were a total of 120 targets and 120 primes. Each target word was associated with one of three possible verification questions that each focused on the perceptual properties of the object/animal concept: (1) long legs or short legs (e.g., HORSE vs DUCK); (2) multicolored or plain (e.g., WASP vs WORM); (3) open or closed handles (e.g., SPOON vs SUITCASE). The question was presented at the start of the scanning block and remained constant within the block. Over the experiment, correct finger responses were 50% first finger response and 50% middle finger response.
Functional activation images were acquired using a Siemens/CPS ECAT EXACT HR+ (model 962) PET scanner. The same experiment was also conducted using fMRI (Crinion et al., 2006). The reason for reporting PET data rather than fMRI data is that the PET scanner includes data from the whole brain simultaneously. In contrast, the fMRI data did not include the top and bottom of the brain, because fMRI data are acquired in a serial slice-by-slice procedure, and sensitivity can be enhanced by using a limited field of view. The top and bottom of the brain are therefore typically excluded from fMRI studies unless these areas are a priori regions of interest.
Each participant had 12 PET scans to measure regional cerebral blood flow using bolus infusion of radioactively labeled water (H₂¹⁵O). The dose received was 9 mCi per measurement, as approved by the UK Administration of Radioactive Substances Advisory Committee (ARSAC). Scans from each subject were realigned using the first as a reference, transformed into a standard MNI space (Ashburner and Friston, 2000), and smoothed with a Gaussian kernel of 12 mm FWHM. Structural MRI images were obtained for coregistration with the PET data.
Statistical analysis used standardized procedures. This involved ANCOVA with subject effects modeled and global activity included as a subject-specific covariate. The condition and subject effects were estimated according to the general linear model at each voxel (Friston et al., 1995). As in the structural imaging study, we conducted a factorial analysis to investigate the main effects of the language of verbal interference (L1 or L2), the language of the target stimulus (L1 or L2), and the interaction of these factors. The expectation was that the semantics associated with the prime would interfere with the semantics related to the target, and that this interference effect would be greater when the prime was presented in L1 (German) than in L2 (English). We therefore directly contrasted activation for L1 primes with activation for L2 primes and compared the location of this effect with the location of the area associated with the control of interference in the structural imaging study.
Structural imaging analysis 1 results
A whole-brain search identified one area in which there was a significant and positive correlation between the ability to control L1 interference during L2 sentence decisions and gray matter density in the unmodulated images. This was located in the posterior paravermis of the right cerebellum in the most medial part of lobule VIIIA (MNI coordinates: x = +12, y = −64, z = −38; Z = 5.3; p < 0.024 FWE corrected for multiple comparisons; 161 voxels, p < 0.001 uncorrected), as illustrated in Figure 3a. The significance of the correlation between brain structure and the ability to control L1 interference was higher when the data were unmodulated images (see details above) than when the data were modulated images (MNI coordinates: x = +14, y = −64, z = −36; Z = 2.0; with 85 voxels, p < 0.05 uncorrected). This is consistent with our previous studies (Mechelli et al., 2004; Lee et al., 2007; Grogan et al., 2009; Richardson and Price, 2009), which have shown that language ability correlates more strongly with gray matter density (in unmodulated images) than gray matter volume (in modulated images). The strong relationship between the ability to control verbal interference and gray matter density in the right posterior paravermis was observed even when the same analysis was repeated without the additional regressors (x = +11, y = −63, z = −38, Z score = 4.5; 163 voxels at p < 0.001) (Fig. 4a).
There were no other significant effects in gray or white matter images (modulated or unmodulated), even in our regions of interest (Table 3). When the statistical threshold was lowered to p < 0.05 uncorrected, we found 129 voxels in the left middle frontal cortex (x = −35, y = +39, z = +27; Z score = 2.3) and 45 voxels in the left inferior frontal gyrus (x = −39, y = +18, z = +9, Z score = 2.1; and x = −54, y = +12, z = +7, Z score = 2.2), but no voxels in the left head of caudate or anywhere in the anterior cingulate. Plausibly, the absence of any effects in the left head of caudate and anterior cingulate is a consequence of our focus on interference at the comprehension level rather than at the response level (cf. Ali et al., 2010; van Heuven et al., 2008), and because our behavioral measure does not index a switch in language (Crinion et al., 2006; Abutalebi et al., 2007, 2008).
Finally, we correlated gray matter density in the right posterior paravermis with a range of language scores and experience but did not observe any significant effects (Pearson correlation = 0.02–0.24; p = 0.1–0.45, one-tailed) for (1) Matrices score, (2) Simon Task, (3) BVAT-CALP, (4) age of second language onset, (5) years in the UK, (6) number of languages. This contrasts with the highly significant correlation with verbal interference from the dominant language on non-native language processing (Pearson correlation = 0.68; p = 0.001) identified in Structural imaging analysis 1. As far as these data indicate, the structural changes corresponding to control of interference are not crucially affected by other factors that differ between second language learners, such as age of acquisition.
Structural imaging analysis 2 results
To demonstrate that verbal interference arose at a semantic rather than a perceptual level, we tested for the main effect of target language (L2 vs L1), the main effect of interference language (L1 vs L2), and their interaction. In the right cerebellar region of interest, identified in Structural analysis 1, the most significant effect (Z = 3.7) was the interaction between target language and interference language, because gray matter density was most significantly correlated with the ability to control L1 interference when processing L2 (as reported in Structural analysis 1). There were only weak main effects for L1 interference versus L2 interference (Z = 2.9) and for L2 targets versus L1 targets (Z = 2.2), and no other significant effects in the gray or white matter images, even in the regions of interest.
Accuracy on the semantic decision task during scanning was higher when interference was in L2 (81% for L1 targets and 82% for L2 targets) than when interference was in L1 (78% for L1 targets and 71% for L2 targets). Thus, consistent with expectations and with results from the sentence interpretation experiment above, performance was lowest (71% accurate) when the target was in L2 and the interference was in L1. Response times did not vary for L1 interference (1239 ms) versus L2 interference (1239 ms) or L1 targets (1256 ms) versus L2 targets (1226 ms).
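The pattern in these four accuracy scores can be summarized as two main effects and an interaction (a difference of differences). The sketch below is purely descriptive arithmetic on the reported cell means; the function names are hypothetical, and it makes no claim about statistical significance, which would require the per-participant data.

```python
# Percent correct reported in the text, keyed by
# (interference language, target language).
accuracy = {
    ("L1", "L1"): 78.0,
    ("L1", "L2"): 71.0,
    ("L2", "L1"): 81.0,
    ("L2", "L2"): 82.0,
}


def interference_effect(acc):
    """Mean accuracy with L2 interference minus mean accuracy with L1 interference."""
    l2 = (acc[("L2", "L1")] + acc[("L2", "L2")]) / 2
    l1 = (acc[("L1", "L1")] + acc[("L1", "L2")]) / 2
    return l2 - l1


def target_effect(acc):
    """Mean accuracy for L1 targets minus mean accuracy for L2 targets."""
    t1 = (acc[("L1", "L1")] + acc[("L2", "L1")]) / 2
    t2 = (acc[("L1", "L2")] + acc[("L2", "L2")]) / 2
    return t1 - t2


def interaction(acc):
    """Cost of L1 interference for L2 targets minus the same cost for L1 targets."""
    cost_l2_targets = acc[("L2", "L2")] - acc[("L1", "L2")]
    cost_l1_targets = acc[("L2", "L1")] - acc[("L1", "L1")]
    return cost_l2_targets - cost_l1_targets


print(interference_effect(accuracy))  # 7.0 percentage points
print(target_effect(accuracy))        # 3.0 percentage points
print(interaction(accuracy))          # 8.0 percentage points
```

On these cell means, the cost of L1 interference is 11 percentage points for L2 targets but only 3 for L1 targets, which is the pattern described in the text.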
Functional imaging results
The most significant result (i.e., with the lowest p value) was that activation in the posterior paravermis (lobule VIIIA) of the right cerebellum was greater when interference was in the dominant relative to the weaker language (x = +24, y = −60, z = −44; Z = 3.8; 61 voxels at p < 0.001 uncorrected). As displayed in Figure 3b, this region is just lateral to the area associated with the control of language interference in the structural imaging study. When the peak MNI coordinates from the structural imaging study (x = +12, y = −64, z = −38) were used as the center of a spherical region of interest with 10 mm radius, the effect of dominant (native) versus weaker (non-native) language interference in the functional imaging study reached a corrected level of significance (x = +20, y = −60, z = −40, Z = 3.2; p < 0.03 corrected); this effect did not interact with the language of the target (Z = 1.2; p > 0.05). As illustrated in Figure 4b, the main effect of dominant versus weaker interference on right posterior paravermis activation in the PET study was replicated across all 8 participants.
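The spatial relationship between the reported peaks can be checked with simple arithmetic. The sketch below assumes only that MNI coordinates are in millimeters; the variable and function names are hypothetical. It computes the Euclidean distance from the structural peak to each functional peak and compares it with the 10 mm radius of the region of interest. It concerns peak locations only and says nothing about cluster extent.

```python
import math

# Peak MNI coordinates (x, y, z) in mm, as reported in the text.
structural_peak = (12, -64, -38)
functional_whole_brain_peak = (24, -60, -44)
functional_roi_peak = (20, -60, -40)

ROI_RADIUS_MM = 10.0  # radius of the spherical region of interest


def distance_mm(a, b):
    """Euclidean distance between two MNI coordinates, in mm."""
    return math.dist(a, b)


for label, peak in [
    ("whole-brain functional peak", functional_whole_brain_peak),
    ("functional peak within ROI", functional_roi_peak),
]:
    d = distance_mm(structural_peak, peak)
    inside = d <= ROI_RADIUS_MM
    print(f"{label}: {d:.1f} mm from structural peak, inside sphere: {inside}")
# whole-brain functional peak: 14.0 mm from structural peak, inside sphere: False
# functional peak within ROI: 9.2 mm from structural peak, inside sphere: True
```

The whole-brain functional peak therefore lies about 14 mm from the structural peak, outside the 10 mm sphere, whereas the peak found within the region of interest lies about 9 mm away.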
In summary, the functional imaging study confirmed that the area associated with the control of verbal interference in the structural imaging study was activated in the context of high versus low interference in the functional imaging study. Specifically, activation in the right posterior paravermis was higher in the context of interference from the dominant language (L1) relative to the weaker language (L2).
In this study, we investigated the brain areas that are involved in the control of verbal interference. Designing functional imaging experiments to identify the brain areas that control verbal interference is challenging because brain activity in the mechanisms that control interference co-occurs with brain activity related to the processing of conflicting information. We therefore used structural imaging to identify long-term markers of processing ability on brain structure. Our findings were then validated with functional imaging.
In both experiments, our participants were second-language users of English who varied in their ability to control interference from their dominant language (L1) while performing semantic decisions in their weaker language (L2). Proficiency in more than one language develops expertise in language control as illustrated by evidence from behavioral studies that both languages may be active in parallel (Dijkstra et al., 1998; Van Hell and Dijkstra, 2002; Von Studnitz and Green, 2002) and that the non-target language needs to be suppressed through inhibitory processes beyond the language system (Green, 1986, 1998; but also see La Heij, 2005; Costa, 2005, for a different view). Therefore, our choice of second-language English users, with a wide range of proficiency, capitalized on the opportunity to identify brain areas that vary with the ability to control language interference. However, we are not claiming that the observed effects are specific to bilinguals.
The results were surprising in two ways. First, we expected that the brain regions that would be most significantly associated with the control of verbal interference would be the left middle and inferior frontal areas, which have previously been identified in studies of bilinguals attending to one language while ignoring competing information from another language (Rodriguez-Fornells et al., 2002, 2005). Instead, both our structural and functional imaging analyses identified an area in the right posterior cerebellum, with only weak and statistically nonsignificant involvement of left frontal cortex, and no evidence for involvement of subcortical or anterior cingulate regions.
The second surprising finding was that the right posterior cerebellar area that we identified was in the paravermis rather than the lateral hemisphere. This is not consistent with the more lateral location of right cerebellar activation in previous functional imaging studies of semantic processing or speech production (Ackermann et al., 2007). To our knowledge, no previous study has reported a link between the posterior paravermis and the control of verbal interference. We suggest that this is because the area we have identified is in a relatively inferior part of the cerebellum that is typically excluded from fMRI studies using serial multislice acquisition to maximize sensitivity in other regions. Below, we discuss why the right posterior cerebellum might be important for the control of verbal interference and why the effect was in the paravermis rather than the lateral cerebellum.
Why is the right cerebellum involved in the control of verbal interference?
Both functional imaging and lesion studies have highlighted the importance of the right cerebellum for language processing. For example, Jansen et al. (2005) used functional imaging in healthy left- and right-handed individuals and found that the degree of left-lateralized activation in the cerebral hemisphere was positively correlated with the degree of right-lateralized activation in the cerebellum. Lesion studies have also shown that the effect of right cerebellar damage on language function mirrors that seen after left frontal lobe damage. For example, Schweizer et al. (2010) found that during a phonemic fluency task, patients with right cerebellar lesions produced significantly fewer words compared with patients with left cerebellar lesions or healthy controls. This deficit was not explained in terms of motor speech impairment but, rather, a reduction in switches between task strategies. Switching between strategies maximizes phonemic fluency. For example, participants might start by generating words that are synonymous (e.g., slender, slim) and then generate words that begin with the same letters (e.g., small, smart). The strategic control of these strategies is impaired in patients with damage to the right cerebellum (Schweizer et al., 2010) and left prefrontal cortex (Alexander et al., 2007).
Studies of developmental language impairments have also reported a correlation between left prefrontal and right cerebellar brain structures. For example, Hodge et al. (2010) found that in groups of participants who had unimpaired language, the inferior frontal gyrus (IFG) is larger in the left than in the right hemisphere, while lobule VIIIA in the cerebellum is larger in the right than in the left hemisphere. Conversely, in groups of participants with specific language impairment, IFG is larger in the right hemisphere, while lobule VIIIA is larger in the left hemisphere (Hodge et al., 2010). Likewise, using functional connectivity analyses, Krienen and Buckner (2009) found that activity in lobule VIIIA is more tightly correlated with the left prefrontal cortex than the left motor cortex. This link between left lateralization in the prefrontal cortex and right lateralization in lobule VIIIA of the cerebellum is particularly interesting, given that the current study associated the control of verbal interference with a right cerebellar region that lies in lobule VIIIA, with a weak trend in the left middle and inferior frontal cortex. The highly significant effect in the right cerebellum is consistent with the well-established view that the cerebellum is involved in the modulation rather than the generation of cognitive and motor functions (Schmahmann, 1996; Murdoch, 2010). The less significant correlation in the left frontal lobe might be explained by unknown confounding influences on frontal lobe gray matter that have not been investigated in the current study. This will require further investigation.
Why was the control of verbal interference associated with the paravermis rather than lateral cerebellum?
Lobule VIIIA is an inferior part of the posterior cerebellum that extends from the lateral surface to the vermis. The area we associate with the control of verbal interference is in the most medial part of lobule VIIIA and lies within the posterior paravermis. This area is typically associated with the control of motor movements (Stoodley and Schmahmann, 2009), with lesions to the posterior paravermis resulting in uncontrolled movements (Ye et al., 2010). Plausibly, the controlled use of two different languages calls upon the same mechanisms that are involved in the control of limb movements, but further investigation is required to specify what these mechanisms are.
In the language domain, the right posterior paravermis has also been reported (at MNI coordinates: x = +18, y = −64, z = −48; Z score = 5.3) for silently reading words with irregular versus regular spellings (Osipowicz et al., 2011). This could reflect the resolution of conflict because, for irregularly spelled words (e.g., “YACHT”), there is a mismatch between pronunciation at the whole word and sub-word levels (“Yot” versus “Yatched”). Failure to resolve this conflict may explain why reading errors increase in patients with damage to the posterior paravermis (Moretti et al., 2002). The finding that right posterior paravermis activation increases for reading words with irregular relative to regular spellings (Osipowicz et al., 2011) was observed in native speakers of English. It therefore serves to highlight that the posterior paravermis may play a role in language control that is not specific to second language processing, and perhaps not specific to comprehension processing. Plausibly, monolinguals will also have a significant relationship between their ability to control verbal interference and gray matter density in the posterior paravermis. However, the effect sizes are likely to be smaller in monolinguals because, relative to multilinguals, they have less intense experience in language control. For this reason, we would also expect gray matter density in the posterior paravermis to be higher in bilinguals than monolinguals, but such between-group comparisons require very large sample sizes (Mechelli et al., 2004).
To summarize, the posterior paravermis has been associated with the control of both motor and language functions. Future studies are now required to determine whether lesions to the posterior paravermis impair the control of verbal interference, particularly in bilingual patients. In healthy individuals, a longitudinal study of language learning would establish whether increased gray matter in the right posterior paravermis is caused by the skill acquisition process or reflects preexisting gray matter differences that underlie the ability to control verbal interference. It will also be important to investigate whether different parts of the posterior paravermis are involved in tasks that vary in the type of verbal processing (e.g., semantic or phonological), the type of motor response (speech or finger press), and the interaction between the verbal and motor processing. More generally, further investigation is needed to determine whether the posterior paravermis is involved in the control of nonlinguistic interference, and whether gray matter density in this region differs in monolinguals and bilinguals. Our study suggests that future fMRI studies of language control should include the posterior paravermis.
This study was funded by the Wellcome Trust. Our thanks to Janice Glensman, David Bradbury, and Suz Prejawa for their assistance with the structural imaging study; and Alice Grogan, Katherine Stockton, Jenny Crinion, and Uta Noppeney with the functional imaging study.
Correspondence should be addressed to Roberto Filippi, Centre for Brain and Cognitive Development, Department of Psychological Sciences, Birkbeck College, University of London, 32 Torrington Square, London WC1E 7JL, UK.