Journal of Neuroscience
Research Articles, Behavioral/Cognitive

Neural Representations of Non-native Speech Reflect Proficiency and Interference from Native Language Knowledge

Christian Brodbeck, Katerina Danae Kandylaki and Odette Scharenborg
Journal of Neuroscience 3 January 2024, 44 (1) e0666232023; https://doi.org/10.1523/JNEUROSCI.0666-23.2023
Christian Brodbeck
1Department of Psychological Sciences, University of Connecticut, Storrs, Connecticut 06269
Katerina Danae Kandylaki
2Department of Neuropsychology and Psychopharmacology, Maastricht University, 6200 MD, Maastricht, The Netherlands
Odette Scharenborg
3Multimedia Computing Group, Delft University of Technology, 2628 XE, Delft, The Netherlands

Abstract

Learning to process speech in a foreign language involves learning new representations for mapping the auditory signal to linguistic structure. Behavioral experiments suggest that even listeners who are highly proficient in a non-native language experience interference from representations of their native language. However, much of the evidence for such interference comes from tasks that may inadvertently increase the salience of native language competitors. Here we tested for neural evidence of proficiency and native language interference in a naturalistic story listening task. We studied electroencephalography responses of 39 native speakers of Dutch (14 male) to an English short story, spoken by a native speaker of either American English or Dutch. We modeled brain responses with multivariate temporal response functions, using acoustic and language models. We found evidence for activation of Dutch language statistics when listening to English, but only when it was spoken with a Dutch accent. This suggests that a naturalistic, monolingual setting decreases the interference from native language representations, whereas an accent in the listener's own native language may increase native language interference, by increasing the salience of the native language and activating native language phonetic and lexical representations. Brain responses suggest that such interference stems from words from the native language competing with the foreign language in a single word recognition system, rather than being activated in a parallel lexicon. We further found that secondary acoustic representations of speech (after 200 ms latency) decreased with increasing proficiency. This may reflect improved acoustic–phonetic models in more proficient listeners.

Significance Statement

Behavioral experiments suggest that native language knowledge interferes with foreign language listening, but such effects may be sensitive to task manipulations, as tasks that increase metalinguistic awareness may also increase native language interference. This highlights the need for studying non-native speech processing using naturalistic tasks. We measured neural responses unobtrusively while participants listened for comprehension and characterized the influence of proficiency at multiple levels of representation. We found that the salience of the native language, as manipulated through speaker accent, affected activation of native language representations: significant evidence for activation of native language (Dutch) categories was only obtained when the speaker had a Dutch accent, whereas no significant interference was found for a speaker with a native (American) accent.

  • accent
  • electroencephalography
  • linguistic knowledge
  • non-native listening
  • predictive coding
  • proficiency

Introduction

A plethora of behavioral studies has shown that non-native speech processing is slower and more error-prone than native speech processing, even in highly proficient listeners (Garcia Lecumberri et al., 2010; Scharenborg and van Os, 2019). One reason for this is the influence of the native language on non-native listening at different linguistic processing levels (Garcia Lecumberri et al., 2010; Cutler, 2012). Listeners’ knowledge of the sounds of their native language influences how they perceive non-native sounds, which increases the number of misperceived sounds in non-native compared with native listeners (Garcia Lecumberri et al., 2010). This problem percolates upward in the recognition process, leading to spurious activation of similar-sounding words from the non-native (target) language (Cutler et al., 2006; Scharenborg et al., 2018; Karaminis et al., 2022), as well as from the native language (Spivey and Marian, 1999; Marian and Spivey, 2003; Weber and Cutler, 2004; Hintz et al., 2022). These sources of interference slow down word recognition and decrease word recognition accuracy for non-native listeners (Broersma and Cutler, 2008, 2011; Drijvers et al., 2019; Perdomo and Kaan, 2021).

In addition to bottom-up recognition, listeners engage predictive language models during speech processing. In the native language, listeners employ predictive models at different linguistic levels in parallel, including the sublexical, word-form, and sentence levels (Brodbeck et al., 2022; Xie et al., 2023). We thus hypothesized that acquiring a new language involves developing such predictive models and that those models exhibit interference from the native language. Such interference would be evident if native language statistics influence perception of the non-native language. At the sublexical level, phoneme transition probabilities from the native language may influence what phoneme sequences are expected in the non-native language. In word recognition, we contrast two different possible mechanisms of native language interference (Fig. 1). The standard view is that native and non-native word forms directly compete for recognition in one shared lexicon (Fig. 1A; Brysbaert and Duyck, 2010; Dijkstra et al., 2019). Alternatively, words from the two languages could be activated in segregated lexical systems (Fig. 1B), and interference would then only occur at the level of behavioral output (e.g., eye movements in a visual world study).

Figure 1.

Alternative explanations for activation of native language (Dutch) lexical candidates when listening to a non-native language (English). A, Word forms from both languages compete in a single recognition system. B, The native language and the non-native language lexicons are independent systems that are both activated in parallel by acoustic input. Outputs of the two systems may still interact, for example, in guiding eye movements in visual world studies.

One cue for activating native language knowledge during non-native listening may be a speaker accent consistent with the listener's native language. However, the effect of such an accent is complex. For some non-native listeners, it facilitates recognition (Bent and Bradlow, 2003), but not for others (Hayes-Harb et al., 2008; Gordon-Salant et al., 2019), likely due to an interaction with proficiency: non-native listeners with lower proficiency in the target language tend to benefit from the accent of their own native language, whereas higher proficiency listeners show better accuracy for native accents of the target language (Pinet et al., 2011; Xie and Fowler, 2013).

Previous research on native language interference typically focused on behavioral experiments using carefully crafted stimuli. However, recent results suggest that tasks which increase meta-linguistic awareness also increase the influence of the native language on non-native speech perception (Freeman et al., 2021). This may have led to an overestimation of the effects of native language interference. Here we used a naturalistic listening paradigm and measured neural responses to speech unobtrusively with electroencephalography (EEG), while native speakers of Dutch listened to two versions of an English story, once spoken with an American accent and once with a Dutch accent. We investigated four related questions: (1) Is there evidence for parallel predictive language models in non-native listeners? (2) Do brain responses to non-native speech exhibit evidence for interference from native language statistics? (3) Do these effects depend on the accent of the speaker? (4) Do the effects change as a function of language proficiency, and is the effect of accent modulated by proficiency? That is, do highly proficient listeners benefit more from a native accent (American-accented English) and low proficiency listeners from an accent of their own native language (Dutch-accented English)?

Materials and Methods

In order to measure neural representations during naturalistic non-native story listening, we used the multivariate temporal response function (mTRF) framework (Lalor and Foxe, 2010; Brodbeck et al., 2021). Participants listened to an approximately 12-min-long English story twice, once spoken with an American English accent and once with a Dutch accent, with the order counterbalanced across participants. Using fivefold cross-validation, mTRFs were trained to predict the EEG responses to each story separately from multiple predictor variables, reflecting different acoustic and linguistic properties of the stimuli (Fig. 2 and below). Predictor variables for English closely followed previously reported research (Brodbeck et al., 2022). The influence of native language (Dutch) knowledge on neural representations was assessed by generating additional predictors from Dutch language statistics. To determine which neural representations change as a function of non-native proficiency, the predictive power of the different groups of predictors across listeners was correlated with behavioral tests measuring non-native language proficiency.

Figure 2.

Analysis design: predictors and groups of predictors used to test specific hypotheses. Each predictor was constructed as a continuous time series, aligned with the stimuli and corresponding EEG responses. Both auditory predictors were reduced to eight bands, equally spaced in equivalent rectangular bandwidth, to simplify the analysis computationally. Predictors were grouped into sets that reflect specific processes of interest, as indicated by brackets.

Participants

Forty-six Dutch non-native listeners of English from the Radboud University, Nijmegen, the Netherlands, subject pool participated in the experiment. All participants reported being monolingual native speakers of Dutch who had started to learn English around the age of 10 or 11. All were right-handed. Seven participants were excluded due to technical issues during data acquisition: part of the EEG recordings was missing (two); the sound level was initially too low, so part of the English story was presented twice (one); the event codes were missing (one); the connection with the laptop was lost (one); the battery failed during the experiment (one); and the behavioral data were missing (one). This left a sample of 39 participants [14 males; mean age, 21.6; standard deviation (SD), 2.7; range, 18–29]. The experiment consisted of two parts: a lexically guided perceptual learning experiment followed by listening to the two stories. The lexically guided perceptual learning experiment, which investigated the neural correlates underlying lexically guided perceptual learning, was reported in Scharenborg et al. (2019). The participants reported here are a superset of those reported in Scharenborg et al. (2019). All participants were paid for participation in the experiment. No participants reported hearing or learning problems.

Non-native proficiency: LexTale

General English proficiency of the Dutch non-native listeners of English was assessed using the standardized test of vocabulary knowledge, LexTale (Lemhöfer and Broersma, 2012). LexTale scores ranged from 46 (which corresponds to a “B1 and lower” level of proficiency according to the Common European Framework of Reference for Languages) to 92 (which indicates a C1 or C2 level of proficiency, i.e., an “upper & lower advanced/proficient user”; note that Lemhöfer and Broersma do not differentiate between the C1 and C2 levels). Overall, four participants were classified as “lower intermediate and lower” (LexTale < 60; B1 and lower), 25 as “upper intermediate” (60 ≤ LexTale < 80; B2), and 10 as “advanced/proficient user” (LexTale > 80; C1/C2). The mean score was 73.3 (SD = 11.0), which corresponds to “upper intermediate”/B2. All participants had been taught English in high school for at least 6 years.

Acoustic–phonetic aptitude: LLAMA_D

The LLAMA test (Meara et al., 2002) consists of five tests to assess aptitude for learning a foreign language and is based on Carroll and Sapon (1959). The five tests assess different foreign language learning competencies, including vocabulary learning, grammatical inferencing, sound–symbol associations, and phonetic memory. Here we used the LLAMA_D sub-test, which assesses the ability to recognize auditory patterns, a skill that is essential for sound learning and ultimately word learning. We therefore refer to the LLAMA_D score as acoustic–phonetic aptitude. We expected that higher acoustic–phonetic aptitude may be associated with more efficient accent processing and that acoustic–phonetic aptitude may thus modulate effects of speaker accent independently of effects of English proficiency (LexTale).

During the test, participants first heard a list of words; in the second part of the test, participants heard new and repeated words and were asked to indicate whether the stimulus was one of the initial target words. The words were synthesized using the AT&T Natural Voices (French) on the basis of names of flowers and natural objects in a Native American language of British Columbia, yielding sound sequences that are not recognizable as belonging to any major language family. Participants received feedback on the correctness of their answer after each trial. They scored points for correctly recognized target words and lost points for mistakes. This tested the ability to recognize repeated stretches of sound in an unknown phonology, which is an important skill for learning words in a foreign language (Service et al., 2022) and for distinguishing variants that may signal morphology (Rogers et al., 2017).

The LLAMA_D scores range from 0 to 100%, where 0–10 is considered a very poor score, 15–35 an average score (most people score within this range), 40–60 a good score, and 75–100 an outstandingly good score (few people manage to score in this range) (Meara, 2005). A previously reported average score is 29.3%, SD = 11.4 (Rogers et al., 2017).

Stimuli

The short story was the chapter “The Daily Special” from the book “Garlic and Sapphires: The Secret Life of a Critic in Disguise” by Ruth Reichl (2005). We aimed to select a story on a neutral topic, while avoiding books that our participants would be familiar with. At the same time, we wanted the story to be entertaining so that participants would stay engaged and want to continue listening.

The story was read by a female native speaker of American English and a female native speaker of Dutch, both students at Radboud University at the time of recording. Recordings were made in a sound-attenuated booth using a Sennheiser ME 64 microphone. Each speaker read the story twice, and the reading with the fewest mispronunciations was chosen for the experiment. Both recordings were ∼12 min long.

Procedure

Participants were tested individually in a sound-attenuated booth, comfortably seated in front of a computer screen. The two short stories were administered in a single session after the lexically guided perceptual learning experiment reported previously (Scharenborg et al., 2019). The intensity level of both stories was set at 60 dB SPL and was identical for all participants. The stories were played with Presentation 17.0 (Neurobehavioral Systems) and were presented binaurally through headphones.

Participants saw an instruction on the computer screen informing them that they would be listening to two short stories in English. To start the story, participants had to press a button. Once the story was finished, the participants were prompted to press another button to start the second story. The order of the presentation of the two stories was balanced across participants.

We recorded EEG activity continuously during the entire experiment from 32 active Ag/AgCl electrodes, placed according to the 10–10 system (actiCHamp, Brain Products GmbH). The left mastoid was used as the online reference. Eye movements were monitored with additional electrodes placed on the outer canthus of each eye and above and below the right eye. Impedances were generally kept below 5 kΩ. Data were sampled at 500 Hz after applying an online 0.016–125 Hz bandpass filter.

Experimental design and statistical analysis

Accent was a within-subject factor, as all participants listened to both the American- and the Dutch-accented story. The behavioral tests (LexTale and LLAMA_D) were between-subject measures (one measurement per subject).

Preprocessing

The EEG data were preprocessed with MNE-Python (Gramfort et al., 2014). Data were bandpass filtered between 1 and 20 Hz (zero-phase FIR filter with MNE-Python default settings), and biological artifacts were removed with Extended Infomax Independent Component Analysis (Bell and Sejnowski, 1995). Data were then re-referenced to the average of the two mastoid electrodes. Data segments corresponding to the timing and duration of the two stories were extracted and downsampled to 100 Hz.

Predictor variables

In order to measure the neural representations of speech at different levels of processing, multiple predictor variables were generated. Each predictor variable is a continuous time series, which is temporally aligned with the stimulus, and quantifies a specific feature, hypothesized to evoke a neural response (see Fig. 2 for an overview). The predictors for auditory and English linguistic processing closely followed previously used representations that were developed as measures of processing English as a native language (Brodbeck et al., 2022).

Auditory processing was assessed using an auditory spectrogram and acoustic onsets. Linguistic processing was assessed at the sublexical, word-form, and sentence level using information-theoretic models. These models are all predictive language models that predict upcoming speech phoneme-by-phoneme, but they differ by taking into account different amounts of context (for a detailed theoretical motivation, see Brodbeck et al., 2022). Previous research has shown that such models track speech comprehension more closely than acoustic models (Brodbeck et al., 2018; Verschueren et al., 2022). Sublexical processing was assessed using a context that consisted of a sublexical phoneme sequence, taking into account only the previous four phonemes. Word-form processing was assessed using a within-word context, taking into account only the phonemes in the current word. Sentence-level processing was assessed using a multi-word context consisting of the preceding four words. At all linguistic levels, the influence of context representations on brain responses was operationalized through phoneme surprisal (Eq. 1) and phoneme entropy (Eq. 2) measures:

$$I_i = -\log_2 p(ph_i \mid \text{context}) \quad (1)$$

$$H_i = -\sum_{ph \in \text{phonemes}} p(ph_{i+1} = ph \mid \text{context}) \times \log_2 p(ph_{i+1} = ph \mid \text{context}) \quad (2)$$

Phoneme surprisal at position $i$, $I_i$, reflects how surprising the phoneme at position $i$ is, given a certain context (e.g., sublexical phoneme surprisal quantifies how surprising the current phoneme is based on a prediction using the past four phonemes; sentence-level phoneme surprisal reflects how surprising the current phoneme is based on a prediction using the past four words and the current partial word). Phoneme entropy $H_i$ reflects how much uncertainty there is about the identity of the next phoneme. For lexical processing models, cohort entropy (Eq. 3) additionally reflects how much uncertainty there is about what the current word is:

$$H^{lex}_i = -\sum_{w \in \text{Lexicon}} p(\text{word} = w \mid \text{context}) \log_2 p(\text{word} = w \mid \text{context}) \quad (3)$$

This, again, depends on which context is used. For example, using only the word-form context, the partial word s… is much more uncertain than when using the sentence context, coffee with milk and s…. Significant brain responses related to these variables were taken as indicators of incremental linguistic processing of speech at these different levels. Finally, in addition to information-theoretic models, neural correlates of lexical segmentation were controlled for using a predictor for responses to word onsets (Brodbeck et al., 2018). A predictor with an equally scaled impulse at each phoneme onset was included to control for any phoneme-evoked response not modulated by the predictors of interest (analogous to the intercept term in a regression model).
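Concretely, the surprisal and entropy measures (Eqs. 1, 2) can be computed from any next-phoneme probability distribution. A minimal sketch, with a hypothetical toy distribution (function names are illustrative, not taken from the authors' code):

```python
import math

def surprisal(p_current: float) -> float:
    """Eq. 1: I_i = -log2 p(ph_i | context)."""
    return -math.log2(p_current)

def entropy(next_phoneme_dist: dict) -> float:
    """Eq. 2: H_i = -sum_ph p(ph) * log2 p(ph) over the next-phoneme distribution."""
    return -sum(p * math.log2(p) for p in next_phoneme_dist.values() if p > 0)

# Hypothetical next-phoneme distribution for illustration:
dist = {"t": 0.5, "s": 0.25, "k": 0.25}
print(surprisal(dist["t"]))  # 1.0 bit
print(entropy(dist))         # 1.5 bits
```

Each value would then be placed as a scaled impulse at the corresponding phoneme onset to form the predictor time series.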

To generate the sublexical and lexical predictors, word and phoneme locations are needed; these were determined in the auditory stimuli using forced alignment. To that end, an English pronunciation dictionary was defined by merging the Montreal Forced Aligner (McAuliffe et al., 2017) English dictionary with the Carnegie Mellon University Pronouncing Dictionary and manually adding five additional words that occurred in the short story. The time points of words and phonemes in the acoustic stimuli were then determined using the Montreal Forced Aligner. Below, the different predictors and how they were created are explained in detail.

Auditory processing

Two predictors were used to assess (and control for; Daube et al., 2019; Gillis et al., 2021) auditory representations of speech: an auditory spectrogram and an acoustic onset spectrogram. The auditory spectrogram reflects moment-by-moment acoustic power, using a transformation approximating peripheral auditory processing, and thus models sustained neural responses to the presence of sound. The onset spectrogram specifically contains acoustic onset edges and thus models transient responses to the onsets of acoustic features.

An auditory spectrogram with 128 bands ranging from 120 to 8,000 Hz in equivalent rectangular bandwidth (ERB) space was computed at 1,000 Hz resolution with the gammatone library (Heeris, 2018). The spectrogram was log-transformed to more closely reflect the auditory system's dynamic range. For use as the auditory spectrogram predictor variable, the number of bands was reduced to 8 by summing 16 consecutive bands.

The 128-band log spectrogram was transformed using a neurally inspired auditory edge detection algorithm (Fishbach et al., 2001) to generate the acoustic onset spectrogram (Brodbeck et al., 2020). For use as a predictor variable, the number of bands was likewise reduced to 8 by summing 16 consecutive bands.
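The band reduction described above (128 bands summed in groups of 16) can be sketched as follows; the random array is a stand-in for a real 128-band gammatone spectrogram, which in the actual pipeline comes from the gammatone library:

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder for a 128-band gammatone spectrogram (bands x time samples);
# in the paper's pipeline this is computed with the gammatone library.
spec = rng.random((128, 1000)) + 1e-3

# Log transform, approximating the auditory system's dynamic range:
log_spec = np.log(spec)

# Reduce 128 bands to 8 by summing each run of 16 consecutive bands:
reduced = log_spec.reshape(8, 16, -1).sum(axis=1)
print(reduced.shape)  # (8, 1000)
```

The same reshape-and-sum step applies to the onset spectrogram after edge detection.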

Sublexical English representations

Sublexical representations were assessed using a context consisting of phoneme sequences. To that end, first a probabilistic model of phoneme sequences in English without consideration of word boundaries was generated: all sentences of the SUBTLEX-US corpus (Brysbaert and New, 2009) were transcribed to phoneme sequences by substituting each word with its pronunciation from the pronunciation dictionary and concatenating these pronunciations across word boundaries. The resulting phoneme strings were used to train a 5-gram model using KenLM (Heafield, 2011). This 5-gram model was then used to estimate probability distributions for the next phoneme at each position in the story, $p(ph_{i+1} \mid ph_{i-3}\, ph_{i-2}\, ph_{i-1}\, ph_i)$, with $i$ indexing the current position in the story. These probability distributions were used to generate two predictors, phoneme surprisal $I_i$ and phoneme entropy $H_i$ (Eqs. 1, 2). Each of these predictors was constructed by placing an impulse at the onset of each phoneme, scaled by the respective surprisal or entropy value. These predictors were used to measure the use of sublexical phonotactic knowledge during speech processing.

Additionally, a phoneme onset predictor was included, with impulse size of one at each phoneme, to serve as an intercept for the sublexical predictors (i.e., capturing any response that occurs to phonemes but is not modulated by any of the quantities of interest).
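For illustration, a count-based phoneme 5-gram without smoothing can stand in for the KenLM model used here (KenLM additionally applies smoothing to handle unseen sequences); the corpus and names are hypothetical:

```python
from collections import Counter, defaultdict

def train_ngram(phonemes, n=5):
    """Count-based n-gram: tally next-phoneme counts per (n-1)-phoneme context.
    No smoothing, unlike the KenLM model used in the paper."""
    counts = defaultdict(Counter)
    for i in range(len(phonemes) - n + 1):
        context = tuple(phonemes[i:i + n - 1])
        counts[context][phonemes[i + n - 1]] += 1
    return counts

def next_dist(counts, context):
    """Probability distribution over the next phoneme, p(ph_i+1 | 4-phoneme context)."""
    c = counts[tuple(context)]
    total = sum(c.values())
    return {ph: k / total for ph, k in c.items()}

# Toy training "corpus" as a phoneme sequence (hypothetical transcription):
corpus = list("strstrstrk")
model = train_ngram(corpus, n=5)
print(next_dist(model, list("rstr")))  # {'s': 0.5, 'k': 0.5}
```

The resulting distribution feeds directly into the surprisal and entropy formulas (Eqs. 1, 2).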

English word-form representations

A word onset predictor was generated with equal-sized impulses at each word onset to assess lexical segmentation (Sanders et al., 2002; Brodbeck et al., 2018). Contrasted with the phoneme onset predictor, which measures responses related to phonetic processing without regard for lexical segmentation, this predictor indexes responses specific to word boundaries.

Word-form representations were assessed using a model of word recognition that takes into account word boundaries but disregards the preceding multi-word context. This model is based on the cohort model of word recognition (Marslen-Wilson, 1987). A lexicon was defined based on the pronunciation dictionary (also used for forced alignment), in which each unique grapheme sequence identifies a word and each word may have multiple pronunciations. At each word boundary, the cohort is initialized using the whole lexicon, with the prior likelihood for each word proportional to its frequency in the SUBTLEX-US corpus (Brysbaert and New, 2009). At each phoneme position, the cohort is pruned by removing all words whose pronunciations are incompatible with the new phoneme, and word likelihoods are renormalized. Thus, at the $j$th phoneme of the $k$th word, this cohort model tracks the probability distribution over what word the current phoneme sequence could convey as $p(\text{word}_k \mid ph_1 \ldots ph_j)$. Since each word is associated with a likelihood and also makes a prediction about what the next phoneme would be, this amounts to a predictive model for the next phoneme, $p(ph_{j+1} \mid ph_1 \ldots ph_j)$. These evolving probability distributions over the lexicon are in turn used to compute phoneme surprisal (i.e., how surprising the current phoneme is given what words were still in the cohort at the previous position), phoneme entropy (uncertainty about the next phoneme), and lexical entropy (uncertainty about what the current word is; Eqs. 1–3). These predictors were used to measure word-form processing independent of the wider sentence context. Thus, if sublexical surprisal is a significant predictor of brain activity, this suggests that listeners use sublexical phoneme sequences to make predictions about upcoming phonemes; if word-form surprisal is significant, this suggests that they also use information about what the current word could be.
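A minimal sketch of this cohort mechanism, with a hypothetical four-word lexicon and made-up frequencies (not the authors' implementation):

```python
import math

# Toy lexicon: word -> (pronunciation, relative frequency); entries are hypothetical.
lexicon = {
    "sun":  ("sVn", 4.0),
    "sung": ("sVN", 1.0),
    "sip":  ("sIp", 3.0),
    "tap":  ("t&p", 2.0),
}

def cohort_step(cohort, phoneme, position):
    """Prune words whose pronunciation is incompatible with `phoneme` at
    `position`, then renormalize the remaining word probabilities."""
    survivors = {w: p for w, p in cohort.items()
                 if position < len(lexicon[w][0]) and lexicon[w][0][position] == phoneme}
    total = sum(survivors.values())
    return {w: p / total for w, p in survivors.items()}

def lexical_entropy(cohort):
    """Eq. 3: uncertainty about which word is currently being heard."""
    return -sum(p * math.log2(p) for p in cohort.values() if p > 0)

# Word-form model: initialize the cohort with frequency-proportional priors.
total_freq = sum(f for _, f in lexicon.values())
cohort = {w: f / total_freq for w, (_, f) in lexicon.items()}

cohort = cohort_step(cohort, "s", 0)  # after /s/: sun, sung, sip remain
cohort = cohort_step(cohort, "V", 1)  # after /sV/: sun, sung remain
print(round(lexical_entropy(cohort), 3))  # 0.722
```

The sentence-level model differs only in that the word-initial priors come from a lexical 5-gram over the preceding four words instead of raw frequencies.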

Sentence-level representations

Sentence-level processing was assessed using a lexical model augmented by the preceding multi-word context. The model is identical to the English word-form model, except that now in the word-initial cohorts, prior probabilities for the words are not initialized based on their lexical frequency, but instead based on a case-insensitive, lexical 5-gram model (Heafield, 2011) trained on the word sequences in the SUBTLEX-US corpus (Brysbaert and New, 2009). Thus, instead of tracking the probability of word $k$ given the phonemes of word $k$ heard so far, $p(\text{word}_k \mid ph_1 \ldots ph_j)$, this model tracks the probability of word $k$ given the previous four words in addition to the phonemes of word $k$, $p(\text{word}_k \mid \text{word}_{k-4} \ldots \text{word}_{k-1},\, ph_1 \ldots ph_j)$. Predictors based on this language model were used to measure the use of the multi-word context during speech processing.

Sublexical Dutch representations

Interference from Dutch sublexical phonotactic knowledge was assessed with a model analogous to the English sublexical model but trained on Dutch language statistics. Phoneme sequences were extracted from version 2 of the Corpus Gesproken Nederlands (CGN; Oostdijk et al., 2002) and used to train a phoneme 5-gram model (Heafield, 2011). Since Dutch and English have different phoneme inventories, and the 5-gram model was trained on Dutch phonemes, each English phoneme of the stimulus story was transcribed to the closest Dutch phoneme. The resulting phoneme sequence, reflecting a transcription of the English story with the Dutch phoneme inventory, was then used to compute phoneme surprisal and phoneme entropy as for the sublexical English model using the phoneme 5-gram model trained on Dutch. The resulting predictors were used to measure brain responses that would indicate that listeners activated their knowledge of their native Dutch sublexical phonotactics when listening to the English story.

Word-level native language interference

To test for interference from native language word knowledge, we generated two alternative word-form models. These were built and used like the English word-form model, differing only in the set of lexical items that were included in the pronunciation lexicon. First, we built a Dutch word-form model (word-formD). This model contained only Dutch words and their pronunciations, taken from the CGN lexicon. In order to evaluate lexical cohorts in the (English) input phoneme inventory, the Dutch phonemes of those words were mapped to the closest available English equivalent (as for the sublexical Dutch model) or, in the absence of a close English phoneme, to a special out-of-inventory token (which always leads to exclusion from the cohort when encountered). Relative lexical frequencies were taken from the SUBTLEX-NL corpus (Keuleers et al., 2010) to closely match the way in which the English lexical frequencies were determined using SUBTLEX-US. Finally, we also built an English/Dutch combined lexicon, using the union of the two pronunciation dictionaries (word-formED).
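The mapping of Dutch pronunciations into the English phoneme inventory, with an out-of-inventory token for unmappable phonemes, might be sketched as follows (the mapping table and words are hypothetical, for illustration only):

```python
# Hypothetical Dutch -> English phoneme mapping. OOI marks a Dutch phoneme
# with no close English equivalent; a word containing it is pruned from
# the cohort as soon as that position is reached.
OOI = "<OOI>"
nl_to_en = {"s": "s", "t": "t", "a": "A", "x": OOI}  # e.g., Dutch /x/ lacks a close English match

def map_pronunciation(nl_phonemes):
    """Transcribe a Dutch pronunciation into the English phoneme inventory."""
    return [nl_to_en.get(ph, OOI) for ph in nl_phonemes]

# Toy Dutch lexicon entries (word -> Dutch phoneme sequence), hypothetical:
dutch_lexicon = {"tas": list("tas"), "acht": list("axt")}
word_form_d = {w: map_pronunciation(p) for w, p in dutch_lexicon.items()}
print(word_form_d["tas"])   # ['t', 'A', 's']
print(word_form_d["acht"])  # ['A', '<OOI>', 't']
```

The combined word-formED lexicon would simply take the union of this mapped Dutch dictionary and the English one.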

mTRF analysis

An mTRF is a linear mapping from a set of $n_x$ predictor time series, $x_{i,t}$, to a response time series $y_t$. The response at time $t$ is predicted by convolving the predictors with a kernel $h$, called the mTRF, at a range of delay values $\tau$:

$$\hat{y}_t = \sum_{i=1}^{n_x} \sum_{\tau=1}^{n_\tau} x_{i,t-\tau}\, h_{i,\tau}$$

The mTRFs were estimated using the boosting algorithm (David et al., 2007) implemented in Eelbrain (Brodbeck et al., 2021), separately for each story. Delay values ($\tau$) ranged from −100 to 850 ms. For fivefold cross-validation, predictors and EEG responses were split into five segments of equal length. To predict the EEG response to each segment, an mTRF was trained on the four remaining segments. This mTRF in turn was the average of four mTRFs, which were trained by iteratively using one of the four segments as validation data and the remaining three segments as training data. The mTRFs were trained using coordinate descent to minimize the ℓ2 error of the predicted response in the training data. If after any training step the ℓ2 error in the validation data increased, then this last step was undone, and the TRF corresponding to this predictor was frozen (i.e., excluded from further modification by the fitting algorithm). Fitting continued until all TRFs were frozen.
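The prediction step of the mTRF equation can be sketched directly in NumPy; this illustrates only the forward (delayed) convolution, not the boosting estimator:

```python
import numpy as np

def predict_trf(x, h):
    """Compute y_hat[t] = sum_i sum_tau x[i, t - tau] * h[i, tau].
    x: (n_x, n_times) predictor time series; h: (n_x, n_tau) TRF kernels,
    with columns corresponding to delays 0 .. n_tau-1 samples."""
    n_x, n_times = x.shape
    _, n_tau = h.shape
    y = np.zeros(n_times)
    for i in range(n_x):
        for tau in range(n_tau):
            # Shift the predictor by tau samples and weight by the kernel value.
            y[tau:] += h[i, tau] * x[i, :n_times - tau]
    return y

x = np.zeros((1, 6)); x[0, 1] = 1.0  # single-impulse predictor (hypothetical)
h = np.array([[0.5, 0.25]])          # kernel at delays of 0 and 1 sample
print(predict_trf(x, h))             # impulse response appears at t = 1 and t = 2
```

In the actual analysis, negative delays down to −100 ms are also included, and the kernels are fitted by boosting with cross-validation rather than set by hand.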

Predictive power

Evidence for specific neural representations was assessed by testing whether the corresponding predictors significantly contributed to predicting the held-out EEG data. In order to evaluate the predictive power of a specific predictor, or a group of predictors, two mTRFs were estimated: one for the full model (i.e., all predictors) and one for a baseline model, consisting of the full model minus the predictor(s) under investigation. The null hypothesis is that the two models predict the data equally well, whereas the alternative hypothesis is that adding the predictor(s) under investigation improves the model fit. Because the predictive power was measured on data that was held out during mTRF estimation, using fivefold cross-validation, the two models should predict the data equally well unless the predictors under investigation contain information about the neural responses not already contained in the baseline model.

Predictive power was quantified as the proportion of the variance explained in the EEG data, calculated as

\[1 - \sum_t (y_t - \hat{y}_t)^2 \,\big/\, \sum_t y_t^2\]

which is directly related to the ℓ2 loss minimized during mTRF estimation, \(\sum_t (y_t - \hat{y}_t)^2\). To test whether the predictive power of two models differed reliably across participants, we first compared the predictive power of the two models, averaged across all sensors, with a repeated-measures t test. We report Cohen's d effect sizes. When there was a significant difference, we then used mass univariate tests to find sensor regions that contributed to the effect. These mass univariate tests were cluster-based permutation tests (Maris and Oostenveld, 2007), using as cluster-forming threshold a t value corresponding to an uncorrected p = 0.05 and estimating corrected p values for each cluster's cluster-mass statistic (summed t values) on a null distribution estimated from 10,000 random permutations of condition labels.
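The explained-variance measure and the model comparison reduce to a few lines of NumPy; this is a minimal sketch, and the function names are our own, not from the analysis code:

```python
import numpy as np

def proportion_explained(y, y_hat):
    """1 - sum((y - y_hat)^2) / sum(y^2): proportion of the variance in the
    measured response y explained by the prediction y_hat. Directly related
    to the l2 loss minimized during mTRF estimation."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum(y ** 2)

def delta_v(y, y_hat_full, y_hat_baseline):
    """Unique predictive power of the predictor(s) under investigation:
    held-out explained variance of the full model minus that of the
    baseline model (full model minus those predictors)."""
    return proportion_explained(y, y_hat_full) - proportion_explained(y, y_hat_baseline)
```

Under the null hypothesis, `delta_v` on held-out data should scatter around zero across participants; a positive mean indicates that the predictors under investigation carry unique information about the response.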

In some comparisons where we are interested in the null hypothesis (e.g., whether there is evidence for native language interference), we also report Bayes factors (B; Rouder et al., 2009) estimated using the BayesFactor R library, version 0.9.12–4.4 (Morey et al., 2022). For directional contrasts (e.g., that predictive power is >0), we report the Bayes factor for evidence in favor of the value being >0 versus <0 (Morey and Rouder, 2011).

Correlations with language proficiency

To test whether language proficiency measures explained neural responses, we analyzed the predictive power of the different language models as a function of the LexTale and LLAMA_D test scores. As the dependent measure, we extracted the predictive power for a given set of predictors across all EEG sensors. This measure of predictive power is the difference in explained variance (Δv) between two models that differ only in the inclusion or exclusion of the predictors under investigation. We then analyzed the predictive power in R (R Core Team, 2021) using linear mixed-effects models as implemented in lme4 (Bates et al., 2015), with the following formula:

Δv ~ (LexTale + LLAMA) * accent * sensor + (1 | subject) (4)

Including a higher-level random-effect structure generally resulted in singular fits, with one exception: for the analysis of auditory responses, we were able to specify sensor as a random effect. We tested for significant effects using likelihood ratio tests. To minimize the number of comparisons, we first tested whether there was any effect of proficiency by comparing model Equation 4 with a model from which all terms including LexTale were removed (and analogously for aptitude/LLAMA). If the difference was significant, we then tested whether the effect of proficiency was modulated by speaker accent by comparing model Equation 4 to a model lacking only the terms including a LexTale:accent interaction. When significant interactions with accent were detected, we fit separate linear models for the English- and Dutch-accented conditions.
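This analysis was run in R with lme4. As a hedged illustration of the likelihood-ratio procedure only, the same logic can be sketched in Python with statsmodels, on hypothetical data and with a simplified formula that omits the sensor factor:

```python
# Hypothetical data; illustrates the likelihood-ratio test procedure only,
# not the actual lme4 analysis or its formula.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(0)
rows = []
for subj in range(20):
    lextale = rng.uniform(60, 100)          # between-subject proficiency score
    subj_intercept = rng.normal(scale=0.05)  # toy subject random effect
    for accent in ("american", "dutch"):
        slope = -0.01 if accent == "american" else -0.02  # toy effect
        rows.append({
            "subject": subj, "accent": accent, "lextale": lextale,
            "dv": subj_intercept + slope * lextale + rng.normal(scale=0.1),
        })
data = pd.DataFrame(rows)

# Fit the full model and a reduced model without any LexTale terms, both by
# maximum likelihood (reml=False) so their log-likelihoods are comparable.
full = smf.mixedlm("dv ~ lextale * accent", data,
                   groups=data["subject"]).fit(reml=False)
reduced = smf.mixedlm("dv ~ accent", data,
                      groups=data["subject"]).fit(reml=False)

# Likelihood-ratio test on the dropped fixed-effect terms
lr = 2 * (full.llf - reduced.llf)
df = len(full.fe_params) - len(reduced.fe_params)
p = stats.chi2.sf(lr, df)
```

Here the reduced model drops both the LexTale main effect and its interaction with accent, so the test has two degrees of freedom.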

When we detected significant effects involving LexTale or LLAMA, we performed further analyses to explore the topographic distribution of these effects across EEG sensors. For this, we fitted a multiple linear regression, independently at each sensor and for each accent condition:

Δv ~ LexTale + LLAMA (5)

We show topographic plots of the t statistic corresponding to the predictors of interest from this regression (LexTale/LLAMA). We further selected sensors at which t ≥ 2 to produce scatterplots illustrating the relationship and to analyze TRF magnitudes (next paragraph).

To gain more insight into the brain dynamics underlying the predictive power effects, we analyzed the TRFs corresponding to the predictors that were related to proficiency. If a predictor contributes to the predictive power of a model, it does so through the weights in its TRF; investigating these weights reveals the time course at which the predictor's features affect the brain response. For this, we upsampled the TRFs to 1,000 Hz and calculated TRF magnitude as a function of time (for each lag, the sum of the absolute values of the weights across sensors and, for acoustic predictors, frequency). We analyzed these time courses using a mass univariate multiple regression with the same model as in Equation 5, correcting for multiple comparisons across the time course (0–800 ms) with cluster-based permutation tests, using the same methods described for the analysis of predictive power.
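The TRF-magnitude computation can be sketched in a few lines; the 100 Hz input rate below is an assumed placeholder, not the study's actual TRF sampling rate:

```python
import numpy as np
from scipy.signal import resample

def trf_magnitude(trf, fs_in=100, fs_out=1000):
    """Time course of TRF magnitude: upsample each sensor's TRF to fs_out,
    then sum the absolute weights across sensors at each lag.

    trf : (n_sensors, n_lags) array of TRF weights
    """
    n_sensors, n_lags = trf.shape
    n_out = int(n_lags * fs_out / fs_in)
    upsampled = resample(trf, n_out, axis=1)  # FFT-based resampling
    return np.abs(upsampled).sum(axis=0)
```

For acoustic predictors estimated per spectral band, magnitudes would additionally be summed over the frequency dimension before this step.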

Results

We hypothesized that acquiring a new language involves learning new acoustic–phonetic representations, as well as developing predictive language models that use different contexts to anticipate upcoming speech. Here we looked for evidence of such representations in EEG responses to narrative speech. To address the research questions outlined in the Introduction, we proceeded in three steps: (1) we verified that the previously described predictive language models for English at the sublexical, word-form, and sentence levels (Brodbeck et al., 2022; Xie et al., 2023) are also significant predictors for EEG responses of non-native listeners; (2) we tested the influence of Dutch, the native language, on the processing of English by testing the predictive power of language models that incorporate Dutch language statistics; and (3) we determined to what extent these effects are modulated by English proficiency (LexTale) and acoustic–phonetic aptitude (LLAMA_D).

Proficiency and aptitude test results

Figure 3 shows that English proficiency (LexTale) and acoustic–phonetic aptitude (LLAMA_D) were uncorrelated (r(37) = −0.06; p = 0.700), consistent with the two tests measuring largely independent aspects of second language ability.

Figure 3.

LexTale and LLAMA_D measure independent aspects of language ability. Each dot represents scores from one participant. The line represents the linear fit, with a 95% confidence interval estimated from bootstrapping (Waskom, 2021). Because scores take discrete values, a slight jitter was applied to the data for visualization after fitting the regression.

Robust acoustic and linguistic representations of the non-native language

To test whether listeners formed a specific kind of representation, we tested whether a predictor designed to capture this representation had unique predictive power, that is, whether an mTRF model including this predictor predicted held-out EEG responses better than the same model without it. We started with a model containing predictors for auditory and linguistic representations established by research on native language processing (Brodbeck et al., 2022), illustrated in Figure 2:

EEG ~ auditory + sublexicalE + word-formE + sentenceE (6)

The auditory predictors consisted of an auditory spectrogram and an onset spectrogram. The English (E) linguistic predictors were based on three information-theoretic language models, all modeling incremental, phoneme-by-phoneme information processing: a sublexical phoneme sequence model, a word-form model, and a sentence model.

To determine whether the different components of model Equation 6 describe independent neural representations, we tested for each component whether it significantly contributed to the predictive power of the full model (Fig. 4, Table 1). We first tested the average predictive power in the two stories (American and Dutch, A&D; Fig. 4, first row), then tested for a difference between the two stories (American vs Dutch, AvD; data not shown in Fig. 4), and confirmed the effect separately in the American-accented (A) and Dutch-accented (D) stories (Fig. 4, second and third rows). Auditory predictors (Fig. 4A) and the three language levels (Fig. 4B) all made independent contributions to the overall predictive power, and none of them differed between stories (statistics in Table 1). The topographies of predictive power are comparable to known distributions reflecting auditory responses, suggesting contributions from bilateral auditory cortex (Lütkenhöner and Mosher, 2007), similar to native listeners’ responses (Brodbeck et al., 2022). Taken together, these results suggest that non-native Dutch listeners, as a group, use English sublexical transition probabilities (sublexical context), word-form statistics (word-form context), as well as multi-word transition probabilities (sentence context) to build incremental linguistic representations when listening to an English story.

Figure 4.

Auditory and linguistic neural representations in Dutch listeners when listening to an English story. Each swarm plot shows the change in predictive power for held-out EEG responses when removing a specific set of predictors (each dot represents the change in predictive power, averaged across sensors, for one participant). Predictive power is expressed in percent of the variance explained by the English model (Eq. 6) averaged across subjects. Stars indicate significance based on a one-tailed related measures t test. Topographic maps show corresponding sensor-specific data, with predictive power expressed as percent of model Equation 6 at the best sensor. The marked sensors form significant clusters in a cluster-based permutation test based on one-tailed t tests. A, Auditory predictors contribute a large proportion of the explained variance. The measure is based on the difference in predictive power between the English model Equation 6 and a model missing auditory predictors (acoustic onset and auditory spectrogram). B, All three linguistic models significantly contributed to the predictive power of the English model, in both stories. Note that predictive power can be negative, indicating that adding the given predictor made cross-validated predictions worse. C, A sublexical Dutch model, reflecting phoneme sequence statistics in Dutch (sublexicalD), significantly improved predictions even after controlling for English phoneme sequence statistics (sublexicalE), suggesting that Dutch listeners create expectations for phoneme sequences that would be appropriate in Dutch even when listening to English. The English sublexical model remained significant after adding the Dutch sublexical model. D, Addition of Dutch word forms suggests word recognition with competition from a combined lexicon: adding a word-form model using only Dutch pronunciations (word-formD) does not improve predictions (left column; comparison, model Eq. 8 > Eq. 7), suggesting that native language word recognition does not proceed in parallel. In contrast, replacing the English word-form model word-formE with a merged word-form model word-formED, which combines English and Dutch word forms, leads to improved predictions of EEG responses to Dutch-accented speech (right column; comparison, Eq. 9 > Eq. 7). *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001.

Table 1.

Statistics for the predictive power of English language models, averaged across all EEG sensors (corresponding to swarm plots in Fig. 4)

Previous results suggested that native English listeners activate sublexical, word-form, and sentence models in parallel, evidenced by simultaneous early peaks in their brain response to phoneme surprisal (Brodbeck et al., 2022). This contrasts with an alternative hypothesis of cascaded activation, which would predict that lower-level models are activated before higher-level models, that is, first the sublexical, then the word-form, and then the sentence model (Zwitserlood, 1989). Figure 5 shows TRFs for phoneme surprisal associated with the three language models in model Equation 6. Each language model is associated with an early peak at ∼60 ms latency (peaks might appear earlier than expected because the forced aligner does not account for coarticulation). This suggests that the different language models are activated in parallel in non-native listeners, as they are in native listeners.

Figure 5.

Simultaneous early peaks in TRFs suggest parallel processing. Each line represents the TRF magnitude (sum of the absolute values across sensors) for surprisal associated with a different language model. TRFs are from the mTRF estimated using the model in Equation 6 and are plotted at the normalized scale used for estimation.

Influence of the native language on non-native language processing

A Dutch sublexical phoneme sequence model is activated when listening to Dutch-accented English

Learning English as a non-native language entails acquiring knowledge of the statistics of English phoneme sequences, that is, a new sublexical context model. Given the relatively large overlap of the Dutch and English phonetic inventories, the native language Dutch sublexical model might still be activated when listening to English. To test whether this is indeed the case, we added predictors from a Dutch sublexical context model, sublexicalD, to model Equation 6. The Dutch sublexical model was analogous to the English sublexical model, containing phoneme surprisal and entropy based on Dutch phoneme sequence statistics. To test whether the Dutch sublexical model can explain EEG response components not accounted for by the English sublexical model, the predictive power of the model containing both (Eq. 7) was compared with a model without the Dutch sublexical predictors (Eq. 6), and vice versa:

EEG ~ auditory + sublexicalE + sublexicalD + word-formE + sentenceE (7)

When averaging predictive power across the two stories, both the Dutch and the English sublexical models contributed explanatory power (Fig. 4C; sublexicalD, t(38) = 1.72; p = 0.046; d = 0.28; sublexicalE, t(38) = 3.67; p < 0.001; d = 0.59). The explanatory power of the English sublexicalE model was robust across stories (A, t(38) = 2.30; p = 0.013; d = 0.37; D, t(38) = 2.87; p = 0.003; d = 0.46; AvD, t(38) = −0.67; p = 0.506; d = −0.11). This was not the case for the Dutch sublexicalD model: evidence for an effect in the Dutch-accented story was strong (t(38) = 2.32; p = 0.013; d = 0.37; B = 64.45), whereas evidence for interference in the American-accented story was weak, with only negligible evidence in favor of some interference (t(38) = 0.33; p = 0.373; d = 0.05; B = 1.66).
The difference between the two stories was not significant, so some caution is warranted, but the Bayes factor provided some evidence in favor of a stronger effect in the Dutch-accented story (t(38) = 1.02; p = 0.156, one-tailed; d = 0.16; B = 5.12). We conclude that interference was likely stronger in the Dutch-accented story, but some interference may have occurred in both stories.

Dutch and English word forms are activated together when listening to Dutch-accented English

Several previous studies suggest that Dutch word forms are activated alongside English word forms when listening to English (see Introduction). This could occur in two different ways (Fig. 1): Dutch word forms could be activated in a separate lexical system, without competing with English word forms. Alternatively, Dutch and English word forms could compete for recognition in a connected lexicon.

To test the first possibility (Fig. 1B), we tested whether a separate word-form model containing only Dutch word forms, word-formD, improved predictive power when added alongside the English word-form model:

EEG ~ auditory + sublexicalE + sublexicalD + word-formE + word-formD + sentenceE (8)

This implements the hypothesis that two independent brain systems track English and Dutch word forms independently; that is, at each phoneme the two systems encounter different amounts of surprisal and entropy according to their respective lexicons, and each system generates a neural response, with the two responses combining additively. Comparing model Equation 8 with Equation 7 tests for the existence of such a Dutch lexical model alongside the English model. The Dutch word-form model did not further improve predictions after controlling for the other predictors. Indeed, adding the Dutch word-form model made predictions worse, as might be expected in cross-validation from a predictor that adds noise (A&D, t(38) = −3.09; p = 0.998; AvD, t(38) = −0.20; p = 0.840).

To test the second possibility (Fig. 1A), we tested a merged lexicon, that is, a model analogous to the English word-form model but including both English and Dutch word forms: word-formED. This merged word-form model embodies the hypothesis that a single lexical system detects word forms of both languages; that is, at each phoneme there is only a single surprisal and entropy value, which depends on the expectation that the current word could be English as well as Dutch. Since the merged word-form model is hypothesized as an alternative to the English-only word-form model (word-formE), we tested the effect on predictive power of substituting the merged word-form model for the English word-form model (two-tailed test), that is, we compared model Equation 9 with Equation 7:

EEG ~ auditory + sublexicalE + sublexicalD + word-formED + sentenceE (9)

Overall, the merged word-form model improved predictions over the English word-form model (A&D, t(38) = 2.39; p = 0.022; two-tailed; d = 0.38; Fig. 4D). It is conceivable that, compared with the American accent, a Dutch accent, which better matches Dutch phonological categories, increases activation of Dutch competitors. Indeed, when analyzing the accents separately, the evidence in favor of the merged word-formED model was strong in the Dutch-accented story (t(38) = 2.98; p = 0.005; d = 0.48; B = 312.26) and negligible in the American-accented story (t(38) = 0.29; p = 0.771; d = 0.05; B = 1.57), and there was considerable evidence for a difference between speaker accents (AvD, t(38) = 1.77; p = 0.042; one-tailed; d = 0.28; B = 20.11).

Finally, the merged word-form model was significantly better than the parallel lexicon model, confirming that a lexicon with direct lexical competition between candidates from the two languages better accounts for the data than activation in two parallel lexica (model Eq. 9 vs Eq. 8; A&D, t(38) = 4.11; p < 0.001; A, t(38) = 2.98; p = 0.005; D, t(38) = 3.10; p = 0.004).

Modulation of non-native language processing by language proficiency

We next asked whether the acoustic and linguistic representations are modulated by non-native language proficiency. We used model Equation 9 as the basis for these analyses, because the results reported above suggested that Equation 9 was the best model. Thus, predictive power reported in the following section was always calculated by removing the relevant predictors from model Equation 9. We used linear mixed-effects models to determine whether a given representation is influenced by language proficiency (LexTale) or acoustic–phonetic aptitude (LLAMA_D), and if so, whether this relationship is modulated by speaker accent. Table 2 shows results for the LexTale score, and Table 3 shows corresponding results for the LLAMA_D score.

Table 2.

Influence of proficiency on the predictive power of different EEG model components, determined with linear mixed-effects models

Table 3.

Influence of acoustic–phonetic aptitude as measured by the LLAMA_D test on the predictive power of different EEG model components

Increased proficiency (LexTale) is associated with reduced late auditory responses

The predictive power of the auditory predictors was significantly modulated by proficiency as measured by LexTale (Table 2). Although this association differed between accents, it was independently significant for both American- and Dutch-accented speech (A, χ2(28) = 59.59; p < 0.001; D, χ2(28) = 98.25; p < 0.001). In both cases, individuals with higher proficiency had weaker auditory representations, and this modulation involved electrodes across the head (Fig. 6A,C). An analysis of the TRFs suggests that, in both accent conditions, lower proficiency was associated with larger sustained auditory responses at relatively late lags (A, 250–393 ms; p < 0.001; D, 220–287 ms; p = 0.008; and 348–407 ms; p = 0.015; Fig. 6B,D).

Figure 6.

Auditory responses are modulated by non-native language proficiency. A, The strength of the auditory responses to American-accented English decreases with increased language proficiency. The topographic map shows the multiple linear regression t statistic for the influence of LexTale scores on the predictive power of the auditory model. Sensors with t values exceeding 2 (positive or negative) are marked with yellow. The scatterplot shows the predictive power of the auditory model (y-axis, average of marked sensors) against LexTale scores (x-axis). Each dot represents one participant. The solid line is a direct regression of predictive power on the LexTale score; bands mark the 95% confidence interval determined by bootstrapping. B, TRFs suggest that less proficient listeners have stronger sustained auditory representations at later response latencies. The line plot shows the magnitude of the TRF across sensors as predicted by the multiple regression for small and large values of LexTale (60, 90), while keeping other regressors at their mean. Red bars at the bottom indicate a significant effect of LexTale (regression model Eq. 5). The rectangular image plot above shows the average TRF for each sensor, and the topographic maps show specific time points (marked by dashed black lines below) for participants with low and high LexTale scores (median split). While auditory TRFs were estimated as mTRFs for eight spectral bands in each representation, for easier visualization and analysis, the band-specific TRFs were summed across bands (after calculating magnitudes where applicable). C, The strength of auditory responses to Dutch-accented English also decreases with increased language proficiency. D, TRFs to Dutch-accented speech show a similar effect of proficiency on sustained representations as in American-accented speech.

Acoustic–phonetic aptitude (LLAMA_D) is associated only with processing of Dutch-accented speech

The predictive power of auditory responses was also modulated by acoustic–phonetic aptitude, and this effect was qualified by an interaction with speaker accent (Table 3). Figure 7A,C illustrate the pattern creating this interaction. Acoustic responses to the American-accented story were not modulated by aptitude (χ2(28) = 17.74; p = 0.932), but responses to the Dutch-accented story were (χ2(28) = 88.75; p < 0.001), with a broadly distributed topography (Fig. 7C). Consistent with this, TRF magnitudes were not related to phonetic ability in the American-accented story (Fig. 7B). In TRFs to the Dutch-accented story, increased aptitude was associated with decreased sustained responses to acoustic features at relatively late lags (224–277 ms; p = 0.048; Fig. 7D), similar to the effect of proficiency described above (compare Fig. 6).

Figure 7.

Auditory responses are modulated by acoustic–phonetic aptitude when listening to Dutch-accented speech only. Unless mentioned otherwise, details are as in Figure 6. A, Because predictive power at no sensor was meaningfully related to the LLAMA score (all t < 2), the scatterplot shows data for the average of all sensors. B, Consistent with results from predictive power, TRFs were not significantly modulated by phonetic ability. Line plots show predictions for LLAMA_D scores of 10 and 50. C, In brain responses to Dutch-accented speech, increased phonetic ability was associated with smaller predictive power of auditory predictors, that is, with weaker auditory responses. D, TRFs related to sustained auditory representations of Dutch-accented speech were modulated by aptitude at relatively late lags (224–277 ms).

Thus, Dutch listeners with higher acoustic–phonetic aptitude exhibited reduced acoustic responses when listening to English spoken with a Dutch accent. However, acoustic–phonetic aptitude did not affect acoustic responses when listening to English-accented speech.

English proficiency reduces sublexical representations of American-accented speech and enhances sublexical representations of Dutch-accented speech

The predictive power of the English sublexical model (sublexicalE) was significantly associated with language proficiency, and this effect was modulated by speaker accent (Table 2). Proficiency affected responses in both American- and Dutch-accented speech (A, χ2(28) = 49.62; p = 0.007; D, χ2(28) = 63.81; p < 0.001). The interaction is illustrated in Figure 8. When listening to the American-accented speaker, higher proficiency was associated with a decrease in predictive power, with large effects at frontal sensors bilaterally (Fig. 8A); in contrast, when listening to the Dutch-accented speaker, proficiency was associated with an increase in predictive power primarily at right frontal sensors (Fig. 8C). Thus, when listening to English spoken with an American accent, more proficient listeners show less activation of English sublexical statistics compared with listeners with low proficiency; on the other hand, when listening to a Dutch accent, more proficient listeners activate English sublexical statistics more strongly.

Figure 8.

Activation of the English sublexical language model is modulated by proficiency and speaker accent. Unless mentioned otherwise, details are as in Figure 6. A, For American-accented English, higher proficiency is associated with reduced sublexical responses. B, TRFs to the surprisal and entropy predictors based on the English sublexical language model. Surprisal is associated with a decreased response in more proficient listeners. C, For Dutch-accented English, higher proficiency is associated with stronger representation of the sublexical language model. D, TRFs do not show significant effects of proficiency.

To determine how brain responses lead to this modulation of predictive power, we analyzed the corresponding TRFs, shown in Figure 8B,D. Here, a TRF reflects the component of the brain response to phonemes that scales with the corresponding predictor's value, that is, surprisal or entropy. The TRF to sublexical surprisal in the American-accented story exhibits increased responses in listeners with low proficiency in the middle (160–226 ms; p = 0.003) as well as later parts of the response (558–609 ms; p = 0.007). This suggests that the stronger activation of the sublexical model in individuals with low proficiency is due to more extended cortical processing. On the other hand, the TRFs to the Dutch-accented story do not exhibit a significant effect of LexTale and thus do not provide a clear explanation for the higher predictive power in high-proficiency individuals.

No evidence for a decrease in native language interference with increasing proficiency

Even though effects of native language interference persist in highly proficient non-native listeners (Garcia Lecumberri et al., 2010), we hypothesized that the magnitude of the interference might decrease with increasing proficiency. However, the predictive power of the models of native language interference (the sublexicalD predictor and the word-formED>E contrast) was not significantly related to LexTale. Figure 9 shows plots of native language interference as a function of proficiency. The evidence for native language interference was averaged at 18 anterior sensors (manually selected, based on the observation that the predictive power of the relevant comparisons was strongest in this region; compare Fig. 4). Even though some of the regression lines seem to exhibit a negative trend, none of these associations were significant (Table 2). This suggests that, in the range of proficiency studied here, native language interference does not significantly decrease with increased proficiency.

Figure 9.

EEG responses that quantify the influence of the native language on non-native speech processing were not significantly related to proficiency. SublexicalD quantifies activation of the Dutch phoneme sequence model (i.e., comparison Eq. 7 > Eq. 6); E∪D > E quantifies the increase in predictive power due to including Dutch word forms (i.e., comparison Eq. 9 > Eq. 7). Data shown on the y-axis correspond to the average predictive power at anterior sensors (top left, pink sensors). Even though some regression plots seem to exhibit a negative trend, associations were not significant.

Discussion

EEG responses of native speakers of Dutch, listening to an English story, exhibited evidence for parallel activation of sublexical, word-form, and sentence-level language models. This matches previous findings from native speakers of English listening to their native language (Brodbeck et al., 2022; Xie et al., 2023).

Activation of the native language

We found evidence for two ways in which the native language (Dutch) influenced brain responses associated with non-native (English) speech processing. First, listening to English activated a predictive model of Dutch phoneme sequences, in addition to the appropriate English phoneme sequence model. This interference was only significant in Dutch-accented speech (although the evidence for a difference by speaker accent was weak). This suggests that listeners were not able to completely “turn off” statistical expectations based on phoneme sequence statistics in their native language, at least when listening to English spoken with a Dutch accent.

Second, brain responses to Dutch-accented English also exhibited evidence for activation of Dutch word forms. Our results suggest that, in advanced non-native listening, Dutch and English words are activated in a shared lexicon and compete for recognition, rather than being activated in independent parallel lexica. This provides a neural correlate for a phenomenon seen in behavioral studies, showing activation of words from the native language during non-native listening (Spivey and Marian, 1999; Marian and Spivey, 2003; Weber and Cutler, 2004; Hintz et al., 2022). However, in our results this effect was significant only for Dutch-accented speech and was not detectable for English-accented speech. Thus, in this more naturalistic listening scenario, the activation of words from the native language specifically depended on the accent. This may be because Dutch speech sounds are inherently linked to Dutch lexical items more strongly than the newly learned American sounds, or because the Dutch accent makes Dutch more salient in general and thus primes Dutch lexical competitors. Moreover, a Dutch-accented speaker may indeed sometimes use Dutch words, whereas a native accent signals a strictly monolingual setting, which may allow listeners to minimize cross-language interference (García et al., 2018). Concerning earlier behavioral results using native accents, we surmise that, compared to naturalistic listening, visual world studies may have exaggerated the interference effect, because native language competitors may have been primed due to their presence on the visual display.

Neither of the effects of native language interference was modulated by proficiency, suggesting that this interference does not disappear in more proficient listeners. This is consistent with previous behavioral results suggesting that native language interference persists even in advanced non-native listeners (Spivey and Marian, 1999; Weber and Cutler, 2004; Hintz et al., 2022). Together with our finding of increased native language interference in an accent from the listener's native language, this could explain why such an accent becomes relatively more challenging at higher proficiency (Pinet et al., 2011; Xie and Fowler, 2013; Gordon-Salant et al., 2019): At lower proficiency, the non-native accent bestows an advantage due to the familiar acoustic–phonetic structure. At higher proficiency, the acoustic–phonetic structure of the native accent becomes more familiar, thus reducing the initial advantage of the non-native accent. Now, the disadvantage due to the increased native language interference in the non-native accent becomes the dominant factor, making the non-native accent relatively more difficult than a native accent.

Note that Dutch and English are both West Germanic languages and share many properties. High lexical overlap between two languages may promote interference and competition, whereas such effects may be inherently lower for less closely related language pairs (Wei, 2009).

Acoustic representations are reduced by proficiency

More proficient listeners exhibited reduced amplitudes in brain responses to acoustic features. Our result replicates an earlier finding (Zinszer et al., 2022) and further suggests that this was primarily due to a reduction in late (>200 ms) responses. We broadly interpret this to indicate that in more proficient listeners, less neural work is being done with the acoustic signal at extended latencies. A potential explanation is that, when lower level signals can be explained from higher levels of representation, the bottom-up signals are inhibited (Rao and Ballard, 1999; Tezcan et al., 2023). Under these accounts, the observed result could reflect that more proficient listeners get better at explaining (and thus inhibiting) acoustic representations during speech listening. This would explain why the reduction is found primarily in late responses: Early responses reflect bottom-up processing of the auditory input and are similar across participants, but more proficient listeners have better acoustic–phonetic models that more quickly explain the bottom-up signal and thus inhibit the later responses.
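The explaining-away idea behind this interpretation can be sketched numerically. This is a minimal illustration, not a model of the actual neural computation; the `model_quality` scalar is a hypothetical stand-in for the fidelity of a listener's acoustic–phonetic model.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.standard_normal(1000)  # stand-in for the acoustic input

def residual_power(signal, model_quality):
    """Mean power of the prediction error left after top-down explanation.

    A listener with a better generative model predicts more of the
    bottom-up signal, so the residual that must still be represented
    (and would drive late responses) is smaller.
    """
    prediction = model_quality * signal  # fraction of signal explained
    error = signal - prediction
    return float(np.mean(error ** 2))

low_prof = residual_power(signal, 0.4)   # weaker acoustic-phonetic model
high_prof = residual_power(signal, 0.8)  # stronger acoustic-phonetic model
```

Under this sketch, the more proficient listener's smaller residual corresponds to the observed reduction in late acoustic response amplitudes.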

Acoustic representations of Dutch-accented English are reduced by acoustic–phonetic aptitude

Listeners who scored high on the LLAMA_D test of acoustic–phonetic aptitude also exhibited reduced auditory responses, but only in Dutch-accented English. As with proficiency, this primarily affected later response components (>220 ms), and the reduced responses may similarly indicate a reduction in neural work or better acoustic–phonetic models. The interaction with speaker accent would then indicate that acoustic–phonetic aptitude facilitates the recognition of English words in a Dutch accent and is less relevant for the American accent. While this might sound counterintuitive, Dutch people tend to be exposed more to native English accents than to Dutch-accented English (e.g., through subtitled movies). Consequently, the Dutch accent may be less naturally mapped to English word forms than the American accent.

Sublexical processing of the foreign language

Sublexical processing of English was modulated by proficiency in a complex manner, depending on the speaker's accent: when listening to the story spoken with an American accent, increased proficiency was associated with decreased activation of the English sublexical language model. This is consistent with a previous report on Chinese non-native listeners, where increased English proficiency was associated with smaller responses related to a phonotactic measure (Di Liberto et al., 2021). Our results replicate this effect in Dutch non-native listeners and tie it to sublexical (vs word-form) processing. However, our results also suggest that the effect depends on the speaker's accent: when listening to the story spoken with a Dutch accent, increased proficiency was associated with increased activation of the English sublexical model.

Interestingly, behavioral data indicate a similar interaction of proficiency with speaker accent: low proficiency listeners benefit from an accent corresponding to their own native language, whereas more proficient listeners benefit more from an accent native to the target language (Pinet et al., 2011; Xie and Fowler, 2013). Thus, as more proficient non-native listeners have tuned their phonetic perception more to a native accent (Eger and Reinisch, 2019; Di Liberto et al., 2021), phonetic cues in the non-native-accented speech may become relatively less reliable. This may be due to the mismatch of the acoustic cues with the stored acoustic representations but also due to the persistent native language interference (see above). This perceived reliability may influence the degree to which listeners rely on expectations from short-term transition probabilities between phonemes (i.e., the sublexical model) to provide a prior for interpreting the acoustic input: decreased activation of the sublexical language model when listening to a native speaker might indicate that more proficient listeners rely less on this lower level prior. In contrast, the increase in activation of the sublexical language model when listening to the non-native accent may indicate that more proficient listeners increasingly recruit the sublexical language model to provide a prior for the imperfect bottom-up signal.
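The sublexical prior discussed above rests on short-term transition probabilities between phonemes. A toy sketch shows how phoneme surprisal falls out of such a model; the bigram table below is hypothetical, whereas the study estimated full n-gram models from corpus data (e.g., with KenLM; Heafield, 2011).

```python
import math

# Hypothetical bigram phoneme model: P(next phoneme | previous phoneme).
bigram = {
    ("s", "t"): 0.30, ("s", "p"): 0.25, ("s", "a"): 0.45,
    ("t", "a"): 0.60, ("t", "r"): 0.40,
    ("a", "t"): 1.00,
}

def surprisal(prev, cur):
    """Surprisal (in bits) of `cur` given `prev`: -log2 P(cur | prev)."""
    return -math.log2(bigram[(prev, cur)])

# Surprisal at each transition in the phoneme string /s t a t/:
phonemes = ["s", "t", "a", "t"]
values = [surprisal(p, c) for p, c in zip(phonemes, phonemes[1:])]
```

In the neural analyses, such surprisal values serve as time-locked predictors: stronger reliance on the sublexical prior corresponds to larger responses tracking these values.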

Lack of modulation of sentence-level responses

We found no relationship between proficiency and responses related to the sentence-level language model. This suggests that listeners across our sample (intermediate to higher proficiency) comprehended and used the English multi-word context to predict upcoming speech. This may indicate that listeners develop predictive models early during non-native language learning (Sanders et al., 2002; Frost et al., 2013), especially when languages are structurally similar (Alemán Bañón and Martin, 2021). It may also reflect the language experience of our sample, as English is frequently encountered in the Netherlands.

Conclusions

We found relatively stable higher level neural language model activations (word-form and sentence level) from intermediate to high proficiency listeners, but reductions in the activation of auditory and sublexical representations with increased proficiency. This may indicate that listeners of intermediate proficiency are able to extract and use sentence-level information appropriately in the non-native language (at least in the context of listening to the relatively easy story) but keep refining computations related to lower level acoustic and sublexical representations.

We also found evidence for a continued influence of native language statistics during naturalistic non-native listening. However, our results suggest a significant influence only in Dutch-accented speech, where the Dutch speech sounds may increase activation of Dutch language representations. This selective interference may explain why a Dutch accent becomes relatively more challenging for highly proficient listeners. For native accents, behavioral research may have inadvertently increased native language interference by increasing meta-linguistic awareness (Freeman et al., 2021) or by priming native language distractors.

Footnotes

  • The data were collected as part of a VIDI grant from the Netherlands Organization for Scientific Research (Grant Number 276-89-003) awarded to O.S. C.B. was supported by National Science Foundation BCS-1754284, BCS-2043903, and IIS-2207770 and a University of Connecticut seed grant from the Institute of the Brain and Cognitive Sciences.

  • The authors declare no competing financial interests.

  • Correspondence should be addressed to Odette Scharenborg at o.e.scharenborg{at}tudelft.nl.

SfN exclusive license.

References

  1. Alemán Bañón J, Martin C (2021) The role of crosslinguistic differences in second language anticipatory processing: an event-related potentials study. Neuropsychologia 155:107797. doi:10.1016/j.neuropsychologia.2021.107797
  2. Bates DM, Mächler M, Bolker B, Walker S (2015) Fitting linear mixed-effects models using lme4. J Stat Softw 67:1–48. doi:10.18637/jss.v067.i01
  3. Bell AJ, Sejnowski TJ (1995) An information-maximization approach to blind separation and blind deconvolution. Neural Comput 7:1129–1159. doi:10.1162/neco.1995.7.6.1129
  4. Bent T, Bradlow AR (2003) The interlanguage speech intelligibility benefit. J Acoust Soc Am 114:1600–1610. doi:10.1121/1.1603234
  5. Brodbeck C, Bhattasali S, Cruz Heredia AA, Resnik P, Simon JZ, Lau E (2022) Parallel processing in speech perception with local and global representations of linguistic context. eLife 11:e72056. doi:10.7554/eLife.72056
  6. Brodbeck C, Das P, Kulasingham JP, Bhattasali S, Gaston P, Resnik P, Simon JZ (2021) Eelbrain: a Python toolkit for time-continuous analysis with temporal response functions. Available at: http://biorxiv.org/lookup/doi/10.1101/2021.08.01.454687. Retrieved October 13, 2021.
  7. Brodbeck C, Hong LE, Simon JZ (2018) Rapid transformation from auditory to linguistic representations of continuous speech. Curr Biol 28:3976–3983.e5. doi:10.1016/j.cub.2018.10.042
  8. Brodbeck C, Jiao A, Hong LE, Simon JZ (2020) Neural speech restoration at the cocktail party: auditory cortex recovers masked speech of both attended and ignored speakers. PLoS Biol 18:e3000883. doi:10.1371/journal.pbio.3000883
  9. Broersma M, Cutler A (2008) Phantom word activation in L2. System 36:22–34. doi:10.1016/j.system.2007.11.003
  10. Broersma M, Cutler A (2011) Competition dynamics of second-language listening. Q J Exp Psychol 64:74–95. doi:10.1080/17470218.2010.499174
  11. Brysbaert M, Duyck W (2010) Is it time to leave behind the revised hierarchical model of bilingual language processing after fifteen years of service? Biling Lang Cogn 13:359–371. doi:10.1017/S1366728909990344
  12. Brysbaert M, New B (2009) Moving beyond Kucera and Francis: a critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behav Res Methods 41:977–990. doi:10.3758/BRM.41.4.977
  13. Carroll JB, Sapon SM (1959) Modern language aptitude test. New York: Psychological Corporation.
  14. Cutler A (2012) Native listening: language experience and the recognition of spoken words. Cambridge, MA: The MIT Press.
  15. Cutler A, Weber A, Otake T (2006) Asymmetric mapping from phonetic to lexical representations in second-language listening. J Phon 34:269–284. doi:10.1016/j.wocn.2005.06.002
  16. Daube C, Ince RAA, Gross J (2019) Simple acoustic features can explain phoneme-based predictions of cortical responses to speech. Curr Biol 29:1924–1937.e9. doi:10.1016/j.cub.2019.04.067
  17. David SV, Mesgarani N, Shamma SA (2007) Estimating sparse spectro-temporal receptive fields with natural stimuli. Netw Comput Neural Syst 18:191–212. doi:10.1080/09548980701609235
  18. Dijkstra T, Wahl A, Buytenhuijs F, Van Halem N, Al-Jibouri Z, De Korte M, Rekké S (2019) Multilink: a computational model for bilingual word recognition and word translation. Biling Lang Cogn 22:657–679. doi:10.1017/S1366728918000287
  19. Di Liberto GM, Nie J, Yeaton J, Khalighinejad B, Shamma SA, Mesgarani N (2021) Neural representation of linguistic feature hierarchy reflects second-language proficiency. NeuroImage 227:117586. doi:10.1016/j.neuroimage.2020.117586
  20. Drijvers L, Vaitonytė J, Özyürek A (2019) Degree of language experience modulates visual attention to visible speech and iconic gestures during clear and degraded speech comprehension. Cogn Sci 43:e12789. doi:10.1111/cogs.12789
  21. Eger NA, Reinisch E (2019) The role of acoustic cues and listener proficiency in the perception of accent in nonnative sounds. Stud Second Lang Acquis 41:179–200. doi:10.1017/S0272263117000377
  22. Fishbach A, Nelken I, Yeshurun Y (2001) Auditory edge detection: a neural model for physiological and psychoacoustical responses to amplitude transients. J Neurophysiol 85:2303–2323. doi:10.1152/jn.2001.85.6.2303
  23. Freeman MR, Blumenfeld HK, Carlson MT, Marian V (2021) First-language influence on second language speech perception depends on task demands. Lang Speech 65:28–51. doi:10.1177/0023830920983368
  24. Frost R, Siegelman N, Narkiss A, Afek L (2013) What predicts successful literacy acquisition in a second language? Psychol Sci 24:1243–1252. doi:10.1177/0956797612472207
  25. García PB, Leibold L, Buss E, Calandruccio L, Rodriguez B (2018) Code-switching in highly proficient Spanish/English bilingual adults: impact on masked word recognition. J Speech Lang Hear Res 61:2353–2363. doi:10.1044/2018_JSLHR-H-17-0399
  26. Garcia Lecumberri ML, Cooke M, Cutler A (2010) Non-native speech perception in adverse conditions: a review. Speech Commun 52:864–886. doi:10.1016/j.specom.2010.08.014
  27. Gillis M, Vanthornhout J, Simon JZ, Francart T, Brodbeck C (2021) Neural markers of speech comprehension: measuring EEG tracking of linguistic speech representations, controlling the speech acoustics. J Neurosci 41:10316–10329. doi:10.1523/JNEUROSCI.0812-21.2021
  28. Gordon-Salant S, Yeni-Komshian GH, Bieber RE, Jara Ureta DA, Freund MS, Fitzgibbons PJ (2019) Effects of listener age and native language experience on recognition of accented and unaccented English words. J Speech Lang Hear Res 62:1131–1143. doi:10.1044/2018_JSLHR-H-ASCC7-18-0122
  29. Gramfort A, Luessi M, Larson E, Engemann DA, Strohmeier D, Brodbeck C, Parkkonen L, Hämäläinen MS (2014) MNE software for processing MEG and EEG data. NeuroImage 86:446–460. doi:10.1016/j.neuroimage.2013.10.027
  30. Hayes-Harb R, Smith BL, Bent T, Bradlow AR (2008) The interlanguage speech intelligibility benefit for native speakers of Mandarin: production and perception of English word-final voicing contrasts. J Phon 36:664–679. doi:10.1016/j.wocn.2008.04.002
  31. Heafield K (2011) KenLM: faster and smaller language model queries. In: Proceedings of the 6th workshop on statistical machine translation, pp 187–197. Edinburgh, Scotland, UK.
  32. Heeris J (2018) Gammatone filterbank toolkit. Available at: https://github.com/detly/gammatone
  33. Hintz F, Voeten CC, Scharenborg O (2022) Recognizing non-native spoken words in background noise increases interference from the native language. Psychon Bull Rev 30:1549–1563. doi:10.3758/s13423-022-02233-7
  34. Karaminis T, Hintz F, Scharenborg O (2022) The presence of background noise extends the competitor space in native and non-native spoken-word recognition: insights from computational modeling. Cogn Sci 46:e13110. doi:10.1111/cogs.13110
  35. Keuleers E, Brysbaert M, New B (2010) SUBTLEX-NL: a new measure for Dutch word frequency based on film subtitles. Behav Res Methods 42:643–650. doi:10.3758/BRM.42.3.643
  36. Lalor EC, Foxe JJ (2010) Neural responses to uninterrupted natural speech can be extracted with precise temporal resolution. Eur J Neurosci 31:189–193. doi:10.1111/j.1460-9568.2009.07055.x
  37. Lemhöfer K, Broersma M (2012) Introducing LexTALE: a quick and valid lexical test for advanced learners of English. Behav Res Methods 44:325–343. doi:10.3758/s13428-011-0146-0
  38. Lütkenhöner B, Mosher JC, Hall JW (2007) Source analysis of auditory evoked potentials and fields. In: New handbook for auditory evoked potentials (Hall JW, ed), pp 546–569. Boston: Pearson.
  39. Marian V, Spivey M (2003) Competing activation in bilingual language processing: within- and between-language competition. Biling Lang Cogn 6:97–115. doi:10.1017/S1366728903001068
  40. Maris E, Oostenveld R (2007) Nonparametric statistical testing of EEG- and MEG-data. J Neurosci Methods 164:177–190. doi:10.1016/j.jneumeth.2007.03.024
  41. Marslen-Wilson WD (1987) Functional parallelism in spoken word-recognition. Cognition 25:71–102. doi:10.1016/0010-0277(87)90005-9
  42. McAuliffe M, Socolof M, Mihuc S, Wagner M, Sonderegger M (2017) Montreal forced aligner: trainable text-speech alignment using Kaldi. In: Interspeech 2017, pp 498–502. ISCA. Available at: http://www.isca-speech.org/archive/Interspeech_2017/abstracts/1386.html. Retrieved September 18, 2020.
  43. Meara P (2005) LLAMA language aptitude tests: the manual.
  44. Meara P, Milton J, Lorenzo-Dus N (2002) Swansea language aptitude tests (LAT), v 2.0. Swansea: Lognostics.
  45. Morey RD, Rouder JN (2011) Bayes factor approaches for testing interval null hypotheses. Psychol Methods 16:406–419. doi:10.1037/a0024377
  46. Morey RD, Rouder JN, Jamil T, Urbanek S, Forner K, Ly A (2022) BayesFactor: computation of Bayes factors for common designs. Available at: https://CRAN.R-project.org/package=BayesFactor. Retrieved March 31, 2023.
  47. Oostdijk N, Goedertier W, van Eynde F, Boves L, Martens J-P, Moortgat M, Baayen H (2002) Experiences from the Spoken Dutch Corpus project. In: Proceedings of LREC 2002, pp 340–347. Las Palmas de Gran Canaria. Available at: http://lrec.elra.info/proceedings/lrec2002/pdf/98.pdf
  48. Perdomo M, Kaan E (2021) Prosodic cues in second-language speech processing: a visual world eye-tracking study. Second Lang Res 37:349–375. doi:10.1177/0267658319879196
  49. Pinet M, Iverson P, Huckvale M (2011) Second-language experience and speech-in-noise recognition: effects of talker–listener accent similarity. J Acoust Soc Am 130:1653–1662. doi:10.1121/1.3613698
  50. Rao RPN, Ballard DH (1999) Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat Neurosci 2:79–87. doi:10.1038/4580
  51. R Core Team (2021) R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Available at: https://www.R-project.org
  52. Reichl R (2005) Garlic and sapphires: the secret life of a critic in disguise. Penguin Press.
  53. Rogers V, Meara P, Barnett-Legh T, Curry C, Davie E (2017) Examining the LLAMA aptitude tests. J Eur Second Lang Assoc 1:49–60. doi:10.22599/jesla.24
  54. Rouder JN, Speckman PL, Sun D, Morey RD, Iverson G (2009) Bayesian t tests for accepting and rejecting the null hypothesis. Psychon Bull Rev 16:225–237. doi:10.3758/PBR.16.2.225
  55. Sanders LD, Newport EL, Neville HJ (2002) Segmenting nonsense: an event-related potential index of perceived onsets in continuous speech. Nat Neurosci 5:700–703. doi:10.1038/nn873
  56. Scharenborg O, Coumans JMJ, van Hout R (2018) The effect of background noise on the word activation process in nonnative spoken-word recognition. J Exp Psychol Learn Mem Cogn 44:233–249. doi:10.1037/xlm0000441
  57. Scharenborg O, Koemans J, Smith C, Hasegawa-Johnson MA, Federmeier KD (2019) The neural correlates underlying lexically-guided perceptual learning. In: Interspeech 2019, pp 1223–1227. ISCA. Available at: https://www.isca-speech.org/archive/interspeech_2019/scharenborg19_interspeech.html. Retrieved September 9, 2021.
  58. Scharenborg O, van Os M (2019) Why listening in background noise is harder in a non-native language than in a native language: a review. Speech Commun 108:53–64. doi:10.1016/j.specom.2019.03.001
  59. Service E, DeBorba E, Lopez-Cormier A, Horzum M, Pape D (2022) Short-term memory for auditory temporal patterns and meaningless sentences predicts learning of foreign word forms. Brain Sci 12:549. doi:10.3390/brainsci12050549
  60. Spivey MJ, Marian V (1999) Cross talk between native and second languages: partial activation of an irrelevant lexicon. Psychol Sci 10:281–284. doi:10.1111/1467-9280.00151
  61. Tezcan F, Weissbart H, Martin AE (2023) A tradeoff between acoustic and linguistic feature encoding in spoken language comprehension. eLife 12:e82386. doi:10.7554/eLife.82386
  62. Verschueren E, Gillis M, Decruy L, Vanthornhout J, Francart T (2022) Speech understanding oppositely affects acoustic and linguistic neural tracking in a speech rate manipulation paradigm. J Neurosci 42:7442–7453. doi:10.1523/JNEUROSCI.0259-22.2022
  63. Waskom M (2021) seaborn: statistical data visualization. Available at: https://zenodo.org/record/4645478. Retrieved July 16, 2021.
  64. Weber A, Cutler A (2004) Lexical competition in non-native spoken-word recognition. J Mem Lang 50:1–25. doi:10.1016/S0749-596X(03)00105-0
  65. Wei L (2009) Code-switching and the bilingual mental lexicon. In: Cambridge handbook of linguistic code-switching (Bullock BE, Toribio AJ, eds), pp 270–288. Cambridge, UK: Cambridge University Press.
  66. Xie Z, Brodbeck C, Chandrasekaran B (2023) Cortical tracking of continuous speech under bimodal divided attention. Neurobiol Lang 4:1–26. doi:10.1162/nol_a_00082
  67. Xie X, Fowler CA (2013) Listening with a foreign-accent: the interlanguage speech intelligibility benefit in Mandarin speakers of English. J Phon 41:369–378. doi:10.1016/j.wocn.2013.06.003
  68. Zinszer BD, Yuan Q, Zhang Z, Chandrasekaran B, Guo T (2022) Continuous speech tracking in bilinguals reflects adaptation to both language and noise. Brain Lang 230:105128. doi:10.1016/j.bandl.2022.105128
  69. Zwitserlood P (1989) The locus of the effects of sentential-semantic context in spoken-word processing. Cognition 32:25–64. doi:10.1016/0010-0277(89)90013-9
Keywords

  • accent
  • electroencephalography
  • linguistic knowledge
  • non-native listening
  • predictive coding
  • proficiency
