NeuroImage

Volume 133, June 2016, Pages 516-528
Delta, theta, beta, and gamma brain oscillations index levels of auditory sentence processing

https://doi.org/10.1016/j.neuroimage.2016.02.064

Highlights

  • We analyzed EEG oscillations during auditory sentence processing.

  • We considered δ (2–4 Hz), θ (4–8 Hz), β (13–30 Hz), and γ (30–50 Hz) oscillations.

  • We observed significant effects of EEG power and EEG-acoustic entrainment at δ and θ bands during phonological processing.

  • We observed significant β-related and γ-related effects, during phonological and semantic/syntactic processing, respectively.

  • The results echo previous evidence that phonological and higher-level linguistic processing engage distinct neural networks.

Abstract

A growing number of studies indicate that multiple ranges of brain oscillations, especially the delta (δ, < 4 Hz), theta (θ, 4–8 Hz), beta (β, 13–30 Hz), and gamma (γ, 30–50 Hz) bands, are engaged in speech and language processing. It is not clear, however, how these oscillations relate to functional processing at different linguistic hierarchical levels. Using scalp electroencephalography (EEG), the current study tested the hypothesis that phonological and higher-level linguistic (semantic/syntactic) organizations during auditory sentence processing are indexed by distinct EEG signatures derived from the δ, θ, β, and γ oscillations. We analyzed specific EEG signatures while subjects listened to Mandarin speech stimuli in three different conditions in order to dissociate phonological and semantic/syntactic processing: (1) sentences comprising valid disyllabic words assembled in a valid syntactic structure (real-word condition); (2) utterances with morphologically valid syllables, but not constituting valid disyllabic words (pseudo-word condition); and (3) backward versions of the real-word and pseudo-word conditions. We tested four signatures: band power, EEG–acoustic entrainment (EAE), cross-frequency coupling (CFC), and inter-electrode renormalized partial directed coherence (rPDC). The results show significant effects of band power and EAE of δ and θ oscillations for phonological, rather than semantic/syntactic, processing, indicating the importance of tracking δ- and θ-rate phonetic patterns during phonological analysis. We also found significant β-related effects, suggesting EEG tracking of the acoustic stimulus (high-β EAE), memory processing (θ–low-β CFC), and auditory-motor interactions (20-Hz rPDC) during phonological analysis. For semantic/syntactic processing, we obtained a significant effect of γ power, suggesting lexical memory retrieval or processing of grammatical word categories.
Based on these findings, we confirm that scalp EEG signatures relevant to δ, θ, β, and γ oscillations can index phonological and semantic/syntactic organizations separately in auditory sentence processing, compatible with the view that phonological and higher-level linguistic processing engage distinct neural networks.

Introduction

Cortical oscillatory activity plays a key role in conveying and controlling neural information across the brain, whereby various fundamental cognitive functions, such as attention, learning, memory, and decision-making, are realized (Ward, 2003, Siegel et al., 2012). Brain oscillations are conventionally divided into several frequency ranges: delta (δ, < 4 Hz), theta (θ, 4–8 Hz), alpha (α, 8–13 Hz), beta (β, 13–30 Hz), and gamma (γ, > 30 Hz) (Ward, 2003). Numerous studies have shown that certain cognitive functions are related to oscillations in multiple frequency ranges. For example, attention is related to changes in α and γ activities (Klimesch, 2012, Jensen et al., 2007), whereas working memory and long-term memory processes involve θ, β, and γ activities (Ward, 2003, Jensen et al., 2007, Fell and Axmacher, 2011). An important topic of human cognitive neuroscience in recent years considers how language is processed via coordination of brain oscillations. The current paper focuses on the auditory modality, and deals with how brain oscillations underpin auditory sentence processing. Previous studies have accumulated evidence that speech and auditory sentence processing are associated with multiple ranges of brain oscillations, including both low-frequency components, such as δ and θ oscillations, and high-frequency components, such as β and γ oscillations (see reviews: Giraud and Poeppel, 2012, Lewis et al., 2015).

For low-frequency components (i.e., δ and θ), recent findings showed that the phase information of the δ and θ oscillations is involved in speech perception. The δ and θ (i.e., 1–8 Hz) phase measured by magnetoencephalography (MEG) can be used to successfully classify different auditory sentences attended to by subjects (θ phase in Luo and Poeppel, 2007; δ and θ phase in Cogan and Poeppel, 2011). In an electroencephalography (EEG) study, phase information restricted to 2–9 Hz (which overlaps the δ and θ bands) was shown to successfully classify different American English consonants (Wang et al., 2012). In connection with such findings on the importance of δ/θ phase, two other recent neurophysiological studies have found that entrainment (i.e., phase-locking) of δ and θ brain oscillations to the speech envelope at the corresponding δ and θ amplitude-modulation rates may underpin speech intelligibility and serve as one of the neural mechanisms of speech processing (Peelle et al., 2013, Doelling et al., 2014). Peelle et al. (2013) found that the degree of θ (4–7 Hz) MEG-envelope entrainment was related to sentence intelligibility observed in the left auditory cortex and middle temporal gyrus. Doelling et al. (2014) artificially removed the δ- and θ-rate (2–9 Hz) envelopes of sentences in various acoustic spectral bands and consequently found that the δ and θ MEG-envelope entrainment was suppressed, accompanied by a reduction in sentence intelligibility. The correlation between brain–acoustic entrainment in the δ and θ range and speech intelligibility thus emphasizes the importance of δ and θ brain oscillations in auditory sentence processing (see review by Ding and Simon, 2014).
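Entrainment of this kind is typically quantified as the consistency of the phase difference between the band-limited EEG and the band-limited speech envelope. The following is a minimal sketch of one common formulation, the phase-locking value computed from Hilbert phases; it is not the pipeline of any study cited here, and the sampling rate, filter order, and synthetic signals are illustrative assumptions:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def bandpass(x, lo, hi, fs, order=4):
    """Zero-phase Butterworth band-pass filter."""
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def entrainment_plv(eeg, envelope, lo, hi, fs):
    """Phase-locking value between EEG and the speech envelope in [lo, hi] Hz.
    Returns a value in [0, 1]; 1 means a perfectly constant phase lag."""
    ph_eeg = np.angle(hilbert(bandpass(eeg, lo, hi, fs)))
    ph_env = np.angle(hilbert(bandpass(envelope, lo, hi, fs)))
    return float(np.abs(np.mean(np.exp(1j * (ph_eeg - ph_env)))))

# Toy check: an EEG trace that follows a 5-Hz envelope entrains in the theta band.
fs = 250
t = np.arange(0, 10, 1 / fs)
envelope = 1 + 0.5 * np.sin(2 * np.pi * 5 * t)
rng = np.random.default_rng(0)
eeg = np.sin(2 * np.pi * 5 * t) + 0.5 * rng.standard_normal(t.size)
plv = entrainment_plv(eeg, envelope, 4, 8, fs)  # near 1 for entrained signals
```

In practice the envelope would be extracted from the acoustic stimulus (e.g., by rectification and low-pass filtering), and the PLV would be compared against surrogate data to assess significance.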

Besides involvement in brain–acoustic entrainment, the power of low-frequency components was also found to be important for speech processing. For instance, Peña and Melloni (2012) used a cross-linguistic design to compare the EEG oscillations elicited from Italian and Spanish speakers while listening attentively to Italian, Spanish, and Japanese utterances played both forward and backward. This study found that, in both Italian and Spanish subjects, θ power was significantly higher when listening to forward than to backward utterances, regardless of whether the language was native. The finding that forward utterances elicit higher θ power than backward utterances, even for a non-native language, thus indicates that θ power may be involved in tracking syllable patterns (Peña and Melloni, 2012). A more recent MEG study (Ding et al., 2015) found similar results: when listening to Chinese sentences with a syllable rate of around 4 Hz, both native Chinese and native English listeners showed significantly higher 4-Hz MEG power for forward sentences than for the backward versions. Considering that backward utterances preserve properties that are closely matched to the acoustic complexity of speech utterances but cause serious phonological distortions (Binder et al., 2000, Saur et al., 2010, Gross et al., 2013), syllabic tracking in speech utterances may involve a higher degree of phonological analysis compared to backward utterances, even in a non-native language. Studies have found that θ oscillations are also involved in lexical–semantic retrieval (Bastiaansen et al., 2008) and in syntactic processing during sentence perception (Bastiaansen et al., 2002), the former involving retrieval of long-term semantic knowledge and the latter involving working memory processing.

For high-frequency components, such as β and γ oscillations, there is evidence that brain oscillations in this range are involved in different linguistic processes. A recent MEG study (Alho et al., 2014) investigated the inter-areal phase synchronies of high-β (β2, 20–30 Hz) and γ oscillations between the auditory and motor cortices during active and passive listening to phonologically valid but meaningless monosyllables in both clean and noisy environments. It showed that the left-hemispheric inter-areal β2 synchronies were significantly greater during syllable listening in noisy than in clean environments and that such synchronies were positively correlated with syllable identification accuracy. Furthermore, inter-areal γ synchronies were found to be greater during active than passive listening. This indicates the mediation of phonological categories in speech by inter-areal connectivity between auditory–sensory and motor regions via β2 and γ oscillations. For higher linguistic-level processing, β oscillations were reported to be involved in syntactic processing, showing higher EEG β power for syntactically correct sentences than for syntactically unstructured sentences and sentences with word-category violations (Bastiaansen et al., 2010; also reviews by Lewis and Bastiaansen, 2015, Lewis et al., 2015). In addition, γ oscillations were reported to be involved in lexico-semantic retrieval (Lutzenberger et al., 1994, Pulvermüller et al., 1996). These studies found significant increases in γ oscillations when subjects actively perceived real-word compared to pseudo-word stimuli in both visual (Lutzenberger et al., 1994) and auditory (Pulvermüller et al., 1996) modalities, which is consistent with the critical role of γ activity in long-term memory processing (Ward, 2003).

In addition to the respective roles of δ, θ, β, and γ oscillations, the hierarchical organization between the low-frequency and high-frequency oscillations, termed cross-frequency coupling (CFC), serves as another important parameter for speech processing (Fell and Axmacher, 2011, Lisman and Jensen, 2013). Here, we focus on phase-power CFC, in which the power of high-frequency oscillations is controlled by the phase patterns of low-frequency oscillations (Tort et al., 2008). It has been found that θ–β/γ CFC increased significantly across a range of human cortical regions during various cognitive tasks, including language-related tasks, such as active/passive listening to phonemes and words, word production, visual reading, and so on (Canolty et al., 2006). The phenomenon of θ–β/γ CFC increase has been interpreted in other studies as the neural mechanism for memory processing, including encoding and retrieval of long-term memory and working memory maintenance in both non-human mammals (Tort et al., 2008, Tort et al., 2009, Shirvalkar et al., 2010) and human beings (Mormann et al., 2005, Sauseng et al., 2009, Axmacher et al., 2010, Friese et al., 2013, Kӧster et al., 2014, Kaplan et al., 2014). It is likely, therefore, that θ–β/γ CFC is related to high-level linguistic processes like phonological working memory maintenance and retrieval of lexical–semantic information, or even sentence-level processes related to memory retrieval or encoding (e.g., contextual semantic integration and syntactic processing). Furthermore, it has recently been suggested that θ–β/γ CFC supports the hierarchical binding of both long-duration (such as syllables and long phonemes, e.g., long-vowels, at θ-scale) and short-duration (such as short phonemes, e.g., consonants and short-vowels, at β/γ-scale) phonological information during speech analysis (Giraud and Poeppel, 2012, Gross et al., 2013). Besides θ–β/γ CFC, the coupling between δ and θ oscillations (δ–θ CFC) may also be important. 
δ–θ CFC was found to be higher when listening to forward than to backward utterances, suggesting a possible role for hierarchical binding between even longer-duration information of prosody or phrases/words (at the δ scale) and θ-scale information in speech perception (Gross et al., 2013). One should nevertheless interpret δ–θ CFC effects with caution, because the adjacent frequency ranges of the δ and θ bands can produce spurious coupling effects for purely mathematical reasons.
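The phase-power CFC discussed here is commonly quantified with the modulation index of Tort et al. (2008): bin the high-frequency amplitude by low-frequency phase and measure how far the binned amplitude distribution deviates from uniformity. Below is a hedged sketch of that measure; the band edges, bin count, sampling rate, and synthetic signal are illustrative choices, not the analysis settings of any study cited above:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def bandpass(x, lo, hi, fs, order=4):
    """Zero-phase Butterworth band-pass filter."""
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def modulation_index(x, fs, phase_band=(4, 8), amp_band=(30, 50), n_bins=18):
    """Tort-style phase-amplitude modulation index (e.g., theta phase vs. gamma power).
    0 = amplitude independent of phase; larger values = stronger coupling."""
    phase = np.angle(hilbert(bandpass(x, *phase_band, fs)))
    amp = np.abs(hilbert(bandpass(x, *amp_band, fs)))
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    mean_amp = np.array([amp[(phase >= edges[k]) & (phase < edges[k + 1])].mean()
                         for k in range(n_bins)]) + 1e-12
    p = mean_amp / mean_amp.sum()      # amplitude distribution over phase bins
    # KL divergence from the uniform distribution, normalized to [0, 1]
    return float(np.sum(p * np.log(p * n_bins)) / np.log(n_bins))

# Toy signal: 40-Hz amplitude locked to a 6-Hz phase -> clearly nonzero MI.
fs = 500
t = np.arange(0, 20, 1 / fs)
theta = np.sin(2 * np.pi * 6 * t)
coupled = theta + 0.5 * (1 + theta) * np.sin(2 * np.pi * 40 * t)
mi = modulation_index(coupled, fs)   # well above zero for this coupled signal
```

Note how the caveat about δ–θ CFC arises in this framework: when the phase band and amplitude band are spectrally adjacent, filter leakage alone can make the binned amplitudes non-uniform, so surrogate-based controls are advisable.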

In spite of the abundant findings on brain oscillations to describe language processing as reviewed above, few studies have examined these oscillatory indices for different linguistic hierarchical levels simultaneously. How brain oscillations index and separate processes at these levels therefore remains obscure. The current study aims at revealing oscillatory EEG indices for phonological and higher-level linguistic (semantic/syntactic) processing during listening to auditory sentences in Mandarin. We used three types of continuous utterance stimuli in Mandarin in order to dissociate the effects caused by acoustics, phonology, and the higher linguistic levels: (1) sentences consisting of meaningful disyllabic words assembled with a valid syntactic structure (real-word condition); (2) utterances with morphologically valid syllables, but no valid disyllabic words (pseudo-word condition); and (3) backward versions of both the real-word and pseudo-word utterances (‘non-speech’ condition). In this design, real-word and pseudo-word utterances can be distinguished by their differences in semantic content. For example, in the real-word condition, the syllable pair, ‘喜’ and ‘欢’, constitutes a disyllabic word, ‘喜欢’ (‘enjoy’), while in the pseudo-word condition, the two successive syllables, ‘书’ and ‘实’, do not form a meaningful disyllabic word (i.e., a pseudo-word ‘书实’, see more detailed examples in Stimuli and tasks). Thus, the real-word condition involves semantic integration of two successive syllables into a meaningful word compared to the pseudo-word condition. Also, as real-word utterances have a valid syntactic structure that pseudo-word utterances do not have, the real-word condition also involves syntactic processing compared to the pseudo-word condition. Since both real-word and pseudo-word utterances are composed of morphologically valid syllables, we also designate them as ‘speech’ conditions. 
The backward utterances are closely matched in terms of acoustic complexity to their respective speech utterances, thereby providing a control condition by which to dissociate the psychoacoustic processing of speech-like physical properties from speech-specific processing (Binder et al., 2000, Londei et al., 2010, Saur et al., 2010, Peña and Melloni, 2012, Gross et al., 2013). We consider that EEG indices with statistically greater magnitude in the real-word condition than in the pseudo-word condition involve semantic/syntactic processing. Meanwhile, we consider indices with greater magnitude for speech than for non-speech, but with no statistical difference between real-word and pseudo-word, to be signatures relevant to phonological rather than semantic/syntactic processing. We thus focus on two types of comparisons: (1) speech (real-word plus pseudo-word) vs. non-speech (backward) condition; and (2) real-word vs. pseudo-word condition.

Based on suggestions in previous studies, that processing of phonology and higher linguistic levels engages different anatomical and functional neural networks (Saur et al., 2010) that can be segregated into different brain oscillations (McNab et al., 2012, with stimuli of visual nouns), we hypothesize that phonological and higher-level linguistic organization in auditory sentence processing can be separately indexed by EEG signatures relevant to δ (2–4 Hz), θ (4–8 Hz), β (13–30 Hz), and γ (30–50 Hz) oscillations. Specifically, we propose the following four EEG parameters, which have been studied in previous research related to speech and language processing, as candidate signatures for testing our current hypothesis. First, we propose to use power changes in brain oscillations, as has been common in previous studies of phonological and semantic/syntactic functions (e.g., Pulvermüller et al., 1996, Bastiaansen et al., 2008, Peña and Melloni, 2012). Second, we propose to use EEG-acoustic entrainment (EAE) as an index for auditory sentence processing. As reviewed above, δ to θ brain–acoustic entrainment has been shown to co-vary with speech intelligibility (Peelle et al., 2013, Doelling et al., 2014). However, changes in speech intelligibility alone do not reveal which linguistic hierarchical levels (phonological or semantic/syntactic) are involved, or how each level influences brain–acoustic entrainment. Third, we propose to use δ–θ and θ–β/γ cross-frequency couplings (CFC) as candidates for indexing phonological and semantic/syntactic computations and memory processes, such as working memory and long-term memory retrieval in phonological and semantic/syntactic processing, and hierarchical binding between speech components at different timescales (prosody/phrases/words, syllables, phonemes).
Finally, based on the findings of auditory-motor inter-areal connectivity embodied via β and γ synchronies during phonological processing (Alho et al., 2014) and large-scale connectivity brain networks for lexical–semantic retrieval (Patterson et al., 2007, Pulvermüller, 2013), we propose to use renormalized partial directed coherence (rPDC) in the β/γ range as an index to describe neural connectivity during phonological and/or semantic processing, where rPDC is a quantitative method for calculating the extent of directed connectivity between brain regions (Schelter et al., 2009).
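Of the four candidate signatures, the first (band power) is the simplest to compute: average the power spectral density within each band of interest. The following minimal sketch uses Welch's method with the band definitions given above; the sampling rate and the toy signal are illustrative assumptions, not the study's recording parameters:

```python
import numpy as np
from scipy.signal import welch

# Band definitions as used in this study.
BANDS = {"delta": (2, 4), "theta": (4, 8), "beta": (13, 30), "gamma": (30, 50)}

def band_powers(eeg, fs):
    """Mean Welch PSD within each oscillation band."""
    f, psd = welch(eeg, fs=fs, nperseg=2 * fs)   # 0.5-Hz frequency resolution
    return {name: float(psd[(f >= lo) & (f < hi)].mean())
            for name, (lo, hi) in BANDS.items()}

# Toy EEG: a strong 6-Hz (theta) oscillation buried in broadband noise.
fs = 250
t = np.arange(0, 30, 1 / fs)
rng = np.random.default_rng(1)
eeg = 2.0 * np.sin(2 * np.pi * 6 * t) + rng.standard_normal(t.size)
powers = band_powers(eeg, fs)   # theta power clearly dominates here
```

In a condition comparison like the one proposed here, such per-band powers would be computed per trial and electrode and then contrasted statistically across the speech, pseudo-word, and backward conditions.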

Section snippets

Subjects

Twenty-one subjects (8 males and 13 females, aged 19–25 years), all undergraduate or postgraduate students of The Chinese University of Hong Kong, consented to participate in this study. All subjects were native Mandarin speakers from mainland China with normal hearing, as confirmed by a monaural pre-test on both ears. Data for one subject (female) were excluded from further analysis because of an excessive trial-rejection rate (> 60% of target trials rejected) due to eye artifacts (while <

Results

After EEG artifact rejection, over 70% of the trials were retained for further analysis in all 20 subjects. The average numbers of retained trials were 56.4 (SE: 1.2), 56.0 (SE: 1.2), and 56.8 (SE: 1.2) for the real-word, pseudo-word, and non-speech conditions, respectively.

Attention control and working memory load across conditions

In the current experiment, subjects were instructed to pay attention to the target stimuli and to perform a sound-matching task. Response accuracies indicate that the task was easier for speech (> 95% in both real-word and pseudo-word) than for non-speech (< 75%). However, the observation that accuracies were significantly above chance level (50%) even for non-speech confirms that subjects complied with the instruction to pay active attention to the target stimuli in all

Summary

The present study investigated the δ, θ, β, and γ EEG oscillations during auditory sentence processing, testing the hypothesis that phonological and higher-level semantic/syntactic processing can be separately indexed. First, we observed significant effects of higher band power and EEG-acoustic entrainment of δ and θ oscillations elicited during phonological processing, but did not find such effects during semantic/syntactic processing. This may indicate the tracking of phonetic patterns by δ

Acknowledgments

This research was supported in part by grant 455911 from the Research Grants Council of Hong Kong SAR to Prof. William S-Y. Wang at The Chinese University of Hong Kong. We also thank the reviewers for their insightful comments and suggestions during revision.

References (87)

  • A.G. Lewis et al.

    Fast oscillatory dynamics during language comprehension: unification versus maintenance and prediction?

    Brain Lang.

    (2015)
  • J.E. Lisman et al.

    The theta-gamma neural code

    Neuron

    (2013)
  • H. Luo et al.

    Phase patterns of neuronal responses reliably discriminate speech in human auditory cortex

    Neuron

    (2007)
  • W. Lutzenberger et al.

    Words and pseudowords elicit distinct patterns of 30-Hz EEG responses in humans

    Neurosci. Lett.

    (1994)
  • J.X. Maier et al.

    Integration of bimodal looming signals through neuronal coherence in the temporal lobe

    Curr. Biol.

    (2008)
  • I.G. Meister et al.

    The essential role of premotor cortex in speech perception

    Curr. Biol.

    (2007)
  • L. Michels et al.

    Developmental changes of functional and directed resting-state connectivities associated with neuronal oscillations in EEG

    NeuroImage

    (2013)
  • R.C. Oldfield

    The assessment and analysis of handedness: the Edinburgh inventory

    Neuropsychologia

    (1971)
  • F. Pulvermüller

    How neurons make meaning: brain mechanisms for embodied and abstract symbolic semantics

    Trends Cogn. Sci.

    (2013)
  • F. Pulvermüller et al.

    High-frequency cortical responses reflect lexical processing: an MEG study

    Electroencephalogr. Clin. Neurophysiol.

    (1996)
  • D. Saur et al.

    Combining functional and anatomical connectivity reveals brain networks for auditory language comprehension

    NeuroImage

    (2010)
  • B. Schelter et al.

    Assessing the strength of directed influences among neural signals using renormalized partial directed coherence

    J. Neurosci. Methods

    (2009)
  • F. Strand et al.

    Phonological working memory with auditory presentation of pseudo-words — an event-related fMRI study

    Brain Res.

    (2008)
  • L.M. Ward

    Synchronous neural oscillations and cognitive processes

    Trends Cogn. Sci.

    (2003)
  • E. Ahissar et al.

    Speech comprehension is correlated with temporal response patterns recorded from auditory cortex

    Proc. Natl. Acad. Sci.

    (2001)
  • J. Alho et al.

    Enhanced neural synchrony between left auditory and premotor cortex is associated with successful phonetic categorization

    Front. Psychol.

    (2014)
  • T. Arai et al.

    Syllable intelligibility for temporally filtered LPC cepstral trajectories

    J. Acoust. Soc. Am.

    (1999)
  • T. Arai et al.

    Intelligibility of speech with filtered time trajectories of spectral envelopes

  • N. Axmacher et al.

    Cross-frequency coupling supports multi-item working memory in the human hippocampus

    Proc. Natl. Acad. Sci.

    (2010)
  • M.C.M. Bastiaansen et al.

    Syntactic unification operations are reflected in oscillatory dynamics during on-line sentence comprehension

    J. Cogn. Neurosci.

    (2010)
  • J.R. Binder et al.

    Human temporal lobe activation by speech and nonspeech sounds

    Cereb. Cortex

    (2000)
  • E.A. Buffalo et al.

    Laminar differences in gamma and alpha coherence in the ventral stream

    Proc. Natl. Acad. Sci.

    (2011)
  • R.T. Canolty et al.

    High gamma power is phase-locked to theta oscillations in human neocortex

    Science

    (2006)
  • R.T. Canolty et al.

    The functional role of cross-frequency coupling

    Trends Cogn. Sci.

    (2010)
  • C.A. Chapman et al.

    Intrinsic theta-frequency membrane potential oscillations in hippocampal CA1 interneurons of stratum lacunosum-moleculare

    J. Neurophysiol.

    (1999)
  • G.B. Cogan et al.

    A mutual information analysis of neural coding of speech by low-frequency MEG phase information

    J. Neurophysiol.

    (2011)
  • N. Ding et al.

    Cortical tracking of hierarchical linguistic structures in connected speech

    Nat. Neurosci.

    (2015)
  • N. Ding et al.

    Cortical entrainment to continuous speech: functional roles and interpretations

    Front. Hum. Neurosci.

    (2014)
  • L. Elshoff et al.

    Dynamic imaging of coherent sources reveals different network connectivity underlying the generation and perpetuation of epileptic seizures

    PLoS One

    (2013)
  • J. Fell et al.

    The role of phase synchronization in memory processes

    Nat. Rev. Neurosci.

    (2011)
  • L. Fontolan et al.

    The contribution of frequency-specific activity to hierarchical information processing in the human auditory cortex

    Nat. Commun.

    (2014)
  • A. Gazzaley et al.

    Top-down suppression deficit underlies working memory impairment in normal aging

    Nat. Neurosci.

    (2005)
  • A.-L. Giraud et al.

    Cortical oscillations and speech processing: emerging computational principles and operations

    Nat. Neurosci.

    (2012)