Abstract
Sentence comprehension involves the decoding of both semantic and grammatical information, a process fundamental to communication. As with other complex cognitive processes, language comprehension relies, in part, on long-term memory. However, the electrophysiological mechanisms underpinning the encoding and generalization of higher-order linguistic knowledge remain elusive, particularly from a sleep-based consolidation perspective. One candidate mechanism that may support the consolidation of higher-order language is the coordination of slow oscillations (SO) and sleep spindles during nonrapid eye movement sleep (NREM). To examine this hypothesis, we analyzed electroencephalographic (EEG) data recorded from 35 participants (Mage = 25.4; SD = 7.10; 16 males) during an artificial language learning task, contrasting performance between individuals who were given an 8 h nocturnal sleep period or an equivalent period of wake. We found that sleep relative to wake was associated with superior performance for sequence-based word order rules. Postsleep sequence-based word order processing was further associated with less task-related theta desynchronization, an electrophysiological signature of successful memory consolidation, as well as cognitive control and working memory. Frontal NREM SO–spindle coupling was also positively associated with behavioral sensitivity to sequence-based word order rules, as well as with task-related theta power. As such, theta activity during retrieval of previously learned information correlates with SO–spindle coupling, thus linking neural activity in the sleeping and waking brain. Taken together, this study presents converging behavioral and neurophysiological evidence for a role of NREM SO–spindle coupling and task-related theta activity as signatures of memory consolidation and retrieval in the context of higher-order language learning.
Significance Statement
The endogenous temporal coordination of neural oscillations supports information processing during both wake and sleep states. Here we demonstrate that slow oscillation–spindle coupling during nonrapid eye movement sleep predicts the consolidation of complex grammatical rules and modulates task-related oscillatory dynamics previously implicated in sentence processing. We show that increases in theta power predict enhanced sensitivity to grammatical violations after a period of sleep and strong slow oscillation–spindle coupling modulates subsequent task-related theta activity to influence behavior. Our findings reveal a complex interaction between both wake- and sleep-related oscillatory dynamics during the early stages of language learning beyond the single word level.
Introduction
The human brain is adept at extracting regularities from sensory input, a process pivotal for generating knowledge of one's physical and social environment (Santolin and Saffran, 2018). Notably, learning of such regularities plays a key role in the development of linguistic competencies, enabling the implicit acquisition of grammatical rules embedded in ambient speech (Romberg and Saffran, 2010; Cross et al., 2021; Isbilen et al., 2022). While this perspective of language learning has informed insights concerning the encoding of local dependencies, the acquisition of more complex linguistic structures remains less understood. Here, we address this gap from the perspective of sleep-based memory consolidation, a well-established mechanism governing the generalization of knowledge from sensory experience (Diekelmann et al., 2009; Xie et al., 2018; Brodt et al., 2023).
A plethora of evidence (for review, see Rasch and Born, 2013) demonstrates that sleep plays an active role in memory by consolidating and generalizing mnemonic information. This dynamic account of the sleeping brain is captured by the active system consolidation hypothesis (ASC; Born and Wilhelm, 2012; Klinzing et al., 2019). Core to ASC is that sleep facilitates repeated reactivation of encoded memory representations (Rasch and Born, 2013). This reactivation is dependent on cortical glutamatergic synapses, which weaken during prolonged wakefulness (Kavanau, 1997; Rasch and Born, 2013). The ASC is supported by electrophysiological evidence that learned sequences are replayed during nonrapid eye movement (NREM) sleep, potentially via sleep spindle and slow oscillatory (SO) activity. Sleep spindles are bursts of electrical activity occurring between 11 and 16 Hz, while SOs centered at 1 Hz reflect synchronized membrane potential fluctuations between depolarized upstates and hyperpolarized downstates of neocortical neurons (Crunelli and Hughes, 2010; Vyazovskiy and Harris, 2013). The precise coupling between SOs and spindles provides a temporal receptive window for the replay of hippocampal memory traces and their transfer to the cortex for long-term storage (Mikutta et al., 2019; Bastian et al., 2022). Critically, the transfer of newly encoded information from the hippocampus to cortex enables generalization of mnemonic information, allowing the cortex to learn the regularities of sensory input gradually, a process known to support language learning (Davis and Gaskell, 2009; Rasch, 2017; Cross et al., 2018).
Mechanisms of sleep-based memory consolidation have been associated with aspects of language learning, including novel-word learning (Bakker et al., 2015; Mirković and Gaskell, 2016; James et al., 2017) as well as the generalization of grammatical rules (Nieuwenhuis et al., 2013; Batterink et al., 2014). Positive associations have also been identified between rapid eye movement (REM) sleep percentages and language learning proficiency (De Koninck et al., 1989, 1990), supporting a link between REM sleep and language learning. To elucidate the mechanism of this relationship, Thompson et al. (2021) examined oscillatory dynamics during REM sleep and demonstrated that sleep spindles and theta power predicted language learning among individuals engaged in second language immersion programs. This effect was stronger when time locked to eye movements during REM sleep.
Together, extant work on sleep and language learning underscores the significance of both REM and NREM sleep, sleep spindles, and theta power in facilitating second language learning. However, work examining the association between sleep and language often involves only behavioral measures as proxies for memory consolidation (Nieuwenhuis et al., 2013; Mirković and Gaskell, 2016) or examines structure (e.g., grammar; Nieuwenhuis et al., 2013) and meaning (i.e., semantics; Bakker et al., 2015; Batterink et al., 2017; Batterink and Paller, 2017) in the language input separately (cf. Batterink et al., 2014). Markers of sleep-based memory consolidation are also often based on coarse experimental contrasts (i.e., sleep vs wake conditions) or macroarchitectural measures (i.e., percent time spent in a particular sleep stage), rather than neurophysiological events that can more directly test models of systems consolidation anchored in NREM sleep, such as SO–spindle coupling. Online EEG measures during language learning and comprehension and their relation to offline states, such as sleep, are also lacking.
From this perspective, neurobiological models of sleep, memory, and language processing would benefit from a direct investigation of the relation between sleep and higher-order language, such as sentences governed by differing grammatical rules (Rasch, 2017; Schreiner and Rasch, 2017; Cross et al., 2018), in conjunction with online measures of neural activity. This would extend our understanding of language learning beyond single words, clarifying how the generalization of newly acquired linguistic knowledge is supported by sleep (for review, see Cross et al., 2018), how the brain learns environmental regularities that span multiple scales of complexity, and how this information is organized across sleep and wake.
Here, we present data addressing the contribution of sleep-based memory consolidation to complex rule learning in language at the sentence level. We used the modified miniature language Mini Pinyin (Cross et al., 2021), which is modeled on Mandarin Chinese, to contrast rules that instantiate a fixed or flexible word order. Mandarin-naive, monolingual native English speakers completed a learning task where they were shown pictures of two-person events, followed by a sentence describing the event in the picture. During this task, participants learned varying word order rules without explicit instruction and then completed a baseline memory task prior to either 8 h of sleep or an equivalent period of wake (Fig. 1). Participants then completed a delayed memory task to assess changes in memory of the word order rules after the 8 h delay.
Figure 1. Illustration of stimulus presentation and experimental protocol. A, Schematic representation of a single trial of a grammatical sentence during the sentence learning task. B, Schematic representation of a single trial during the baseline sentence judgment task. This sentence is a violation of the verb position, whereby the verb chile is positioned in the middle of the sentence when it should be positioned at the end of the sentence. Here, the participant incorrectly categorized this sentence as grammatical and thus received feedback indicating that their response was incorrect. C, Schematic diagram of the vocabulary test, which required participants to translate the nouns (e.g., yegou) into English (e.g., dog) using a keyboard. D, Experimental protocol representing the time course of the conditions (sleep, wake) and testing sessions (sentence learning task, baseline, and delayed sentence judgment tasks). After completing the vocabulary test, participants were randomly assigned to either the sleep or wake conditions, with each participant only completing one of the two conditions. Time is represented along the x-axis, while each colored block corresponds to a different task during the experimental protocol.
We focused on theta oscillations (∼3–7 Hz), which were quantified using complex Morlet wavelets across sentence presentation during the memory tasks. Theta oscillations are implicated in relational binding and memory-based decision-making (Buzsáki, 2002; Jacobs et al., 2006; Backus et al., 2016). From this perspective, theta should track successful language learning and sleep-based consolidation (Cross et al., 2018). We further quantified whole-scalp NREM SO–spindle coupling by detecting spindle events and quantifying the percentage of spindle events that occurred during SO events. SO–spindle coupling and task-related theta power were used to independently predict language learning and to determine whether task-related theta is modulated by sleep-based memory consolidation.
Materials and Methods
Participants
We recruited 36 right-handed participants who were healthy, monolingual, native English speakers (16 male) aged 18–40 years (Mage = 25.4; SD = 7.0). Participants were randomly assigned to either a sleep (n = 18) or wake condition. All participants reported normal or corrected-to-normal vision, no history of psychiatric disorders, substance dependence, or intellectual impairment and were not taking medication that influenced sleep or neuropsychological measures. All participants provided informed consent and received a $120 honorarium. One participant from the sleep condition was removed from the analysis due to technical issues during the experimental tasks and sleep period, resulting in a total sample size of 35 (Mage = 25.4; SD = 7.10; 16 males; sleep n = 17). Ethics approval was granted by the University of South Australia's Human Research Ethics Committee (I.D: 0000032556).
Screening and control measures
The Flinders Handedness Survey (FLANDERS; Nicholls et al., 2013) was used to screen handedness, while the Pittsburgh Sleep Quality Index (PSQI; Buysse et al., 1989) screened for sleep quality. PSQI scores ranged from 1 to 5 (M = 2.9; SD = 1.33) out of a possible range of 0–21, with higher scores indicating worse sleep quality. Prospective participants with scores >5 were excluded from participation. As an additional control, the Stanford Sleepiness Scale (SSS) was administered at the beginning and end of the experiment to measure self-perceived sleepiness.
Electroencephalography
The electroencephalogram (EEG) was recorded during the learning and sentence judgment tasks and sleep opportunities using a 32-channel BrainCap with sintered Ag/AgCl electrodes (Brain Products) mounted according to the extended International 10–20 system. The reference was located at FCz, with EEG signals rereferenced to linked mastoids offline. The ground electrode was located at AFz. The electrooculogram (EOG) was recorded via electrodes located 1 cm from the outer canthus of each eye (horizontal EOG) and above and below participants’ left eye (vertical EOG). Submental electromyography (EMG) was added to facilitate accurate scoring of sleep periods. The EEG was amplified using a BrainAmp DC amplifier (Brain Products) with an initial bandpass filter of DC to 250 Hz and a sampling rate of 1,000 Hz.
Vocabulary and structure of Mini Pinyin
Stimuli consisted of sentences from a modified miniature language based on Mandarin Chinese (Cross et al., 2021). This language contained 32 transitive verbs, 25 nouns, two coverbs, and four classifiers. The nouns included 10 human entities, 10 animals, and five objects (e.g., apple). Each category of noun was associated with a specific classifier, which always preceded each of the two noun phrases in a sentence. As illustrated in Figure 2B, ge specifies a human noun, zhi specifies animals, and xi and da specify small and large objects, respectively. Overall, this stimulus set contained 576 unique sentences (288 grammatical, 288 ungrammatical), which were divided into two equivalent sets (see Cross et al., 2021, for a complete description of the stimuli; for the complete set of stimuli, visit https://tinyurl.com/3an438h2).
Figure 2. Example of images used in vocabulary and sentence learning phases. A, Portion of the 25 illustrations used in the vocabulary booklet, which included human, animal, and inanimate objects (i.e., bag, apple). B, Portion of the illustrations used in the sentence learning task, illustrating the interaction between two entities. Note that the entities used in the sentence learning task are the same as the illustrations used in the vocabulary booklet.
We focused on a subset of sentence conditions to investigate the mechanisms underlying the learning of different word order rules, which differ fundamentally between natural languages (for review, see Bates et al., 2001). Languages like English and Dutch rely primarily on word order, while languages like German and Turkish rely more on cues such as case marking and animacy (MacWhinney et al., 1984; Bornkessel and Schlesewsky, 2006; Bornkessel-Schlesewsky et al., 2015). From this perspective, Mini Pinyin enabled a comparison between sentences with differing word orders (Fig. 3A) and the influence sleep may have on the respective consolidation of fixed and flexible word order rules. The subset of stimuli in the current analysis contained 96 sentences in the sentence learning task and 144 sentences in the grammaticality judgment tasks. The remaining sentences were considered fillers. These filler sentences included sentences that violated classifier–noun pairs and thus were not suitable for testing predictions regarding fixed and flexible word order processing (for a full description of all sentence conditions present in this language, please see Cross et al., 2021).
Figure 3. Exemplar word order rules and vocabulary items of Mini Pinyin. A, Example of grammatical and ungrammatical fixed and flexible word order sentences. Classifiers and nouns are coded in blue, while verbs are red. The coverb ba is coded in green. For the ungrammatical sentences (right), the point of violation in the sentence is underlined. The direct English translation for each sentence construction is provided below (i.e., the bear eats the apple). B, A sample of the linguistic elements present in Mini Pinyin and their English translation. Note that ba does not have a specific meaning, but when present in a sentence, instantiates a strict actor–undergoer–verb word order.
As is apparent in Figure 3A, sentences that do not contain the coverb ba (i.e., actor–verb–undergoer, AVU; undergoer–verb–actor, UVA) yield a flexible word order, such that understanding who is doing what to whom is not dependent on the ordering of the noun phrases. Instead, determining who is doing what to whom is facilitated by animacy cues. For instance, in the UVA condition, the bear is interpreted as the actor despite the first noun phrase being the apple, since it is implausible for an apple to eat a bear. Therefore, both AVU and UVA are grammatical constructions. In contrast, sentences such as AbaUV yield a fixed word order, such that the inclusion of ba strictly renders the first noun phrase as the actor. Note that the positioning of the verb is critical in sentences with and without a coverb. With the inclusion of a coverb, the verb must be placed at the end of the sentence, while the verb must be positioned between the noun phrases in constructions without a coverb.
Experimental protocol
Participants received a paired picture–word vocabulary booklet containing the 25 nouns and were asked to maintain a minimum of 7 h sleep per night (see Fig. 2A for a portion of nouns from the vocabulary booklet). Participants were required to learn the 25 nouns to ensure that they had the basic vocabulary needed to successfully learn the 32 transitive verbs. They were asked to record periods of vocabulary learning in an activity log. Participants were instructed to study the booklet for at least 15 min per day and were informed that they would need to pass a vocabulary test before commencing the main experimental protocol. After ∼1 week, participants returned to complete the main experimental session, where EEG was recorded during a sentence learning task, baseline, and delayed sentence judgment tasks.
Vocabulary test
Participants completed a vocabulary test by translating the nouns from Mini Pinyin into English using a keyboard, as illustrated in Figure 1C. Each trial began with a 600 ms fixation cross, followed by the visual presentation of the noun for up to 20 s. Prospective participants who scored <90% were unable to complete the main experimental EEG session. As such, all 36 participants included in the current paper obtained over 90% correct on the vocabulary test. The number of individuals who did not pass the vocabulary test was small (approximately fewer than five cases); however, the exact number was not recorded.
Sentence learning
Sentence and picture stimuli were presented using OpenSesame (Mathôt et al., 2012). During sentence learning, pictures were used to depict events occurring between two entities. The pictures and entities shown during the learning task were combinations of the static pictures shown in the vocabulary booklet (for an example of booklet versus sentence learning picture stimuli, see Fig. 2A,B, respectively).
While participants were aware that they would complete sentence judgment tasks at a later point, no explicit description of or feedback regarding grammatical rules was provided during the learning task. Each picture corresponded to multiple sentence variations, similar to the grammatical conditions in Figure 3A. Picture–sentence pairs were presented to participants as correct language input. Participants were presented with a fixation cross for 1,000 ms, followed by the picture illustrating the event between two entities for 5,000 ms. A sentence describing the event in the picture was then presented on a word-by-word basis. Each word was presented for 700 ms followed by a 200 ms ISI. This pattern continued for the 96 reported combinations, until the end of the task, which took ∼40 min. The 96 sentences included in this analysis included the flexible (i.e., AVU, UVA) and fixed (i.e., AbaUV) sentence constructions. Sentences considered as fillers contained a coverb that was not ba and thus were not relevant to testing the predictions posited in the current analysis. During this task, participants were required to learn the structure of the sentences and the meaning of the verbs, classifiers, and the coverb ba. Stimuli were pseudorandomized, such that no stimuli of the same construction followed each other, and each sentence contained a different combination of nouns and verbs. This was done to encourage learning of the underlying grammatical rules rather than episodic events of individual sentences. Further, the two lists of sentences were counterbalanced across participants and testing session. Following the sentence learning task, participants completed the baseline judgment task.
Baseline and delayed judgment tasks
The baseline sentence judgment task taken immediately after learning provided a baseline to control for level of encoding, while the delayed judgment task took place ∼12 h after the learning and baseline judgment tasks. During both judgment tasks, 288 sentences without pictures (144 grammatical, 144 ungrammatical), 156 of which are reported here, were presented word-by-word with a presentation time of 600 ms and an ISI of 200 ms. The 156 included sentences included a combination of grammatical and ungrammatical flexible and fixed sentence constructions, while the 132 sentences that were considered fillers contained coverbs that were not ba, and classifier–noun pair violations, and thus were not relevant to testing the predictions of the current analysis. Participants received feedback on whether their response was correct or incorrect during the baseline but not the delayed judgment task. This was to ensure that participants were able to continue learning the language without explicit instruction. Figure 1, A and B, illustrates the sequence of events in the sentence learning and baseline judgment tasks, respectively.
Participants were instructed to read all sentences attentively and to judge their grammaticality via a button press. As a cue for judgment, a question mark appeared in the center of the monitor for 4,000 ms after the offset of the last word. Two lists of sentence stimuli were created, which were counterbalanced across participants and the baseline and delayed sentence judgment tasks. Half of the sentences were grammatical, with each of the grammatical constructions shown an equal number of times. The other half of the sentences were ungrammatical constructions. Stimuli were pseudorandomized, such that no stimuli of the same construction followed each other.
Main experimental procedure
For the wake condition, participants completed the vocabulary test and EEG setup at ∼08:00 h. The learning task was administered at ∼09:00 h, followed by the baseline judgment task, with EEG recorded during both the learning and judgment task. Participants then completed the behavioral control tasks and were free to leave the laboratory to go about their usual daily activities, before returning for EEG setup and the delayed judgment task at ∼21:00 h the same day. EEG was also recorded during the delayed judgment task.
Participants in the sleep condition arrived at ∼20:00 h to complete the vocabulary test and EEG setup before completing the learning task at ∼21:00 h, followed by the baseline judgment task, with EEG recorded during both the learning and judgment tasks. Participants were then given an 8 h sleep opportunity from 23:00–07:00 h. Polysomnography was continuously recorded and later scored. After waking, participants were disconnected from the head box and given a ∼1 h break to alleviate sleep inertia before completing the delayed judgment task and behavioral control tasks. During this time, participants sat in a quiet room and consumed a small meal. Resting-state EEG recordings were obtained during quiet sitting with eyes open and eyes closed for 2 min, respectively. See Figure 1D for a schematic of the experimental protocol.
Data analysis
Behavioral analysis
Two measures of behavioral performance were calculated. For the behavioral analysis, grammaticality ratings were calculated on a trial-by-trial basis, determined by whether participants correctly identified grammatical and ungrammatical sentences. For EEG analyses, memory performance was quantified using the sensitivity index (d′) from signal detection theory (Stanislaw and Todorov, 1999). Hit rate (HR) and false alarm rate (FA) were computed to derive d′, defined as the difference between the z-transformed probabilities of HR and FA (i.e., d′ = z[HR] − z[FA]), with extreme values (i.e., HR and FA values of 0 and 1) adjusted using the recommendations of Hautus (1995).
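The d′ computation can be illustrated with a minimal sketch. The function name `d_prime` and its count-based arguments are hypothetical, and the sketch assumes the log-linear correction (adding 0.5 to each count and 1 to each trial total) is applied to all counts, which is one common reading of the Hautus (1995) recommendation rather than a confirmed detail of the analysis.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(HR) - z(FA) with a log-linear correction."""
    # Assumption: the correction is applied to every cell, so rates of exactly
    # 0 or 1 never reach the inverse normal CDF (where z would be infinite).
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)
```

With this correction, a participant with a perfect hit rate and no false alarms still receives a finite d′, and chance-level responding yields a d′ of zero.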
EEG recording and preprocessing
Task-related EEG analyses during the baseline and delayed sentence judgment tasks were performed using MNE-Python (Gramfort et al., 2013). EEG data (C3, C4, CP1, CP2, CP5, CP6, Cz, F3, F4, F7, F8, FC1, FC2, FC5, FC6, Fp1, Fp2, Fz, O1, O2, P3, P4, P7, P8, Pz) were rereferenced offline to the average of both mastoids and filtered with a digital phase-true finite impulse response (FIR) bandpass filter from 0.1–40 Hz to remove slow signal drifts and high frequency activity. Data segments from −0.5–6.5 s relative to the onset of each sentence were extracted and corrected for ocular artefacts using independent component analysis (fastica; Hyvarinen, 1999). Epochs were dropped when they exceeded a 150 μV peak-to-peak amplitude criterion or were identified as containing recordings from flat channels (i.e., <5 μV).
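The epoch rejection rule can be sketched in plain Python as follows. The function name `keep_epoch` and the nested-list data layout are hypothetical; the sketch assumes an epoch is dropped if any single channel violates either criterion, which is not stated explicitly in the text.

```python
def keep_epoch(epoch, reject_uv=150.0, flat_uv=5.0):
    """Return True if every channel's peak-to-peak amplitude is within bounds.

    epoch: list of channels, each a list of samples in microvolts.
    """
    for channel in epoch:
        peak_to_peak = max(channel) - min(channel)
        # Too large suggests an artefact; too small suggests a flat channel.
        if peak_to_peak > reject_uv or peak_to_peak < flat_uv:
            return False
    return True
```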
Task-related time frequency analysis
To determine the individualized ranges used to define the theta frequency band, individual alpha frequency (IAF) was estimated from participants’ pre- and postexperiment resting-state EEG recording. IAFs were estimated from an occipital-parietal cluster (P3/P4/O1/O2/P7/P8/Pz/Oz) using philistine.mne.savgol_iaf (Corcoran et al., 2018) implemented in MNE (philistine.mne). IAF-adjusted frequency bandwidths were calculated according to the harmonic frequency architecture proposed by Klimesch (2012, 2013), in which the center frequency of each successive band constitutes a harmonic series scaled in relation to the IAF. This approach is in line with previous work (Doppelmayr et al., 1998; Corcoran et al., 2018; Sauppe et al., 2021; Cross et al., 2022).
We conducted task-related time–frequency analyses by convolving the preprocessed EEG with a family of complex Morlet wavelets using the MNE function tfr_morlet. Theta activity was analyzed using wavelet cycles, with the mother wavelet defined as the center frequency value divided by four. Relative power change values in the poststimulus interval were computed as a relative change from a baseline interval spanning −0.5 s to the onset of each sentence. As such, theta power during the sentence period reflects deviations from the baseline interval: positive values indicate an increase in power relative to baseline, while negative values indicate a decrease. Five hundred milliseconds were added to the beginning and end of each sentence epoch to avoid edge artefacts. From this, we derived power estimates from individually defined (i.e., based on participants’ IAF values) theta activity from the start to end of each sentence stimulus, electrode, and from the baseline and delayed testing sessions.
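As a rough illustration of the baseline normalization, the sketch below expresses power at each time point as a fractional change from the mean baseline power. The function name `relative_power_change` is hypothetical, and the formula (P − B) / B, which resembles the "percent" baseline mode in MNE, is an assumption; the text specifies only that a relative change was used.

```python
def relative_power_change(power, times, baseline=(-0.5, 0.0)):
    """Express power as fractional change from the mean of the baseline window."""
    start, stop = baseline
    # Collect samples from the prestimulus window (stimulus onset excluded).
    base_vals = [p for p, t in zip(power, times) if start <= t < stop]
    if not base_vals:
        raise ValueError("no samples fall inside the baseline window")
    base_mean = sum(base_vals) / len(base_vals)
    # Positive values mean more power than baseline, negative values less.
    return [(p - base_mean) / base_mean for p in power]
```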
Finally, in order to determine whether changes in neural activity between the sleep and wake conditions were truly oscillatory, we used the irregular-resampling auto-spectral analysis toolbox (IRASA v1.0; Wen and Liu, 2016) to estimate the 1/ƒ power law exponent characteristic of background spectral activity, which was used as a covariate in EEG-based statistical models.
Sleep parameters and sleep EEG analyses
Sleep data were scored by two sleep technicians (Z.R.C. and S.W.C.) according to standardized criteria (Berry et al., 2012) using Compumedics Profusion 3 software. The EEG was viewed with a high-pass filter of 0.3 Hz and a low-pass filter of 35 Hz. The following sleep parameters were calculated: total sleep time, sleep onset latency, wake after sleep onset, and time (minutes) and percent of time spent in each sleep stage (N1, N2, N3, and R). The EEG data were rereferenced to linked mastoids and filtered from 0.3 to 30 Hz using a digital phase-true FIR bandpass filter. Data were then epoched into 30 s bins and subjected to a multivariate covariance-based artifact rejection procedure. This approach estimates a reference covariance matrix for each sleep stage and rejects epochs that deviate too far from this reference, where deviation is established using Riemannian geometry (Barachant et al., 2013; Barthélemy et al., 2019). Slow oscillation–spindle coupling strength was extracted via the danalyzer toolbox implemented in MATLAB based on published algorithms (Denis et al., 2021).
Briefly, sleep spindles were automatically detected at every electrode during NREM sleep based on individual peak spindle frequencies between 12 and 16 Hz. The raw EEG time series was transformed to the frequency domain by estimating the power spectral density (PSD) of the time series using Welch's method with 5 s windows and 50% overlap. Note that the PSD was calculated on a derivative time series to remove the 1/f component and to make the peak spindles more prominent (Sleigh et al., 2001; Demanuele et al., 2007). For each participant at every channel, spindle peak frequencies were automatically detected. Sleep spindles were then automatically detected using a wavelet decomposition, with the Morlet wavelets generated using participants’ peak spindle frequencies. A thresholding algorithm was then applied to every channel to detect spindles in the narrowband data, with a detected spindle needing to exceed a threshold of six times the median amplitude for a minimum of 400 ms.
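The thresholding step can be sketched as follows. This is an illustrative reimplementation in Python, not the MATLAB toolbox code; the function name `detect_suprathreshold_runs` is hypothetical, and the input is assumed to be the amplitude envelope of the wavelet-filtered (narrowband) signal for one channel.

```python
from statistics import median


def detect_suprathreshold_runs(amplitude, sfreq, factor=6.0, min_dur=0.4):
    """Return (start, stop) sample indices of runs above factor * median amplitude."""
    threshold = factor * median(amplitude)
    min_samples = int(round(min_dur * sfreq))
    events, start = [], None
    for i, value in enumerate(amplitude):
        if value > threshold and start is None:
            start = i  # a suprathreshold run begins
        elif value <= threshold and start is not None:
            if i - start >= min_samples:  # keep only sufficiently long runs
                events.append((start, i))
            start = None
    # Handle a run that is still open when the recording ends.
    if start is not None and len(amplitude) - start >= min_samples:
        events.append((start, len(amplitude)))
    return events
```

Runs shorter than the minimum duration are discarded, so brief amplitude bursts are not counted as spindles.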
For SOs, continuous NREM EEG data were bandpass filtered between 0.5 and 4 Hz, with all positive-to-negative zero crossings identified based on published algorithms (Staresina et al., 2015; Helfrich et al., 2018). Potential SOs were flagged if two such positive-to-negative crossings occurred 0.5–2 s apart. Peak-to-peak amplitudes for all potential SOs were isolated, and oscillations in the top quartile (i.e., with the strongest amplitudes) at each channel were considered SOs (Staresina et al., 2015; Helfrich et al., 2018).
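A minimal sketch of this detection logic is given below, assuming the input has already been bandpass filtered. The function name `detect_slow_oscillations` is hypothetical, and the simple sort-based quartile cutoff is an assumption made for illustration rather than the exact percentile method of the published algorithms.

```python
def detect_slow_oscillations(signal, sfreq, min_dur=0.5, max_dur=2.0, top_fraction=0.25):
    """Return (start, stop, peak_to_peak) for the largest candidate SOs."""
    # Samples at which the signal passes from positive to zero or negative.
    crossings = [i for i in range(1, len(signal)) if signal[i - 1] > 0 >= signal[i]]
    candidates = []
    for start, stop in zip(crossings[:-1], crossings[1:]):
        duration = (stop - start) / sfreq
        if min_dur <= duration <= max_dur:
            segment = signal[start:stop]
            candidates.append((start, stop, max(segment) - min(segment)))
    if not candidates:
        return []
    # Keep candidates whose amplitude falls in the top fraction (default: top quartile).
    amplitudes = sorted(c[2] for c in candidates)
    cutoff = amplitudes[int(len(amplitudes) * (1 - top_fraction))]
    return [c for c in candidates if c[2] >= cutoff]
```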
Slow oscillation–spindle coupling was analyzed at each channel during NREM sleep. Specifically, for each identified spindle, we assessed whether it occurred during an identified SO event. These co-occurring events were deemed coupled, and we quantified the percentage of spindle events that were coupled for each channel. For each coupled event, the instantaneous phase of the SO at the time of the peak spindle amplitude was extracted. SO–spindle coupling was further quantified using the mean SO phase and vector length of coupled events for each channel. Finally, the Rayleigh test for circular nonuniformity with alpha set to 0.01 was used to evaluate phase preference regularity across participants.
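The coupling measures can be sketched as follows. The function name `coupling_summary` and its arguments are hypothetical; the sketch assumes a spindle counts as coupled when its amplitude peak falls inside an SO interval and that `so_phase` holds the instantaneous SO phase (in radians) at each sample, for example from a Hilbert transform.

```python
import cmath


def coupling_summary(spindle_peaks, so_events, so_phase):
    """Return (percent coupled, mean SO phase, mean vector length) for one channel."""
    coupled_phases = [
        so_phase[peak]
        for peak in spindle_peaks
        if any(start <= peak < stop for start, stop in so_events)
    ]
    percent = 100.0 * len(coupled_phases) / len(spindle_peaks) if spindle_peaks else 0.0
    if not coupled_phases:
        return percent, None, 0.0
    # Average unit vectors on the circle: the angle gives the preferred phase and
    # the length (0 to 1) gives the consistency of that phase across events.
    mean_vector = sum(cmath.exp(1j * p) for p in coupled_phases) / len(coupled_phases)
    return percent, cmath.phase(mean_vector), abs(mean_vector)
```

A vector length near 1 indicates that spindles peak at nearly the same SO phase on every coupled event, whereas a value near 0 indicates no consistent phase preference.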
Statistical analysis
Data were imported into R version 4.0.2 (R Core Team, 2020) and analyzed using (generalized) linear mixed-effects models fit by restricted maximum likelihood (REML) using lme4 (Bates, 2010). For the behavioral model, we used a logistic mixed-effects regression, modeling response choice (correct, incorrect) as a binary outcome variable. This model also factored in by-item and by-participant differences by specifying them as random effects on the intercept. The behavioral model took the following form:
Cluster-based permutation testing (Maris and Oostenveld, 2007) on task-related EEG data was performed in MATLAB R2022a (v9.12.0.1884302; The MathWorks) using the FieldTrip toolbox (v20220810; Oostenveld et al., 2011). Baseline-corrected power estimates for each channel and frequency band (theta, alpha, beta) were averaged over the grammaticality factor for both fixed and flexible sentence types. The difference in spectral estimates between fixed and flexible word orders was calculated for each channel and frequency band within-subjects. These difference scores were then contrasted between sleep and wake conditions (thereby testing the interaction between type and condition). Between-subject t statistics were computed using the ft_statfun_indepsamplesT function. Channels with t values that exceeded an alpha threshold of 0.10 were considered as candidates for cluster inclusion. The t values of resolved clusters were then summed and compared with the null distribution of t statistics obtained from 1,000 random partitions of the data. The cluster-level statistic was considered significant if it attained a p value < 0.05.
Following the identification of significant topographical differences in oscillatory power, the EEG models, which predicted behavior from task-related theta activity and did not include trial-based response accuracy, took the following structure:
For sleep-related analyses, we first constructed a linear mixed-effects model to predict judgment accuracy from the combination of SO–spindle coupling strength, sentence type, sagittality, and laterality, while controlling for baseline (i.e., presleep and prewake) judgment accuracy and sleep stage (N2, N3), with a random intercept of subject. A second linear mixed-effects model was constructed predicting delayed judgment accuracy from anterior task-related theta power, anterior SO–spindle coupling strength, and sentence type, while controlling for laterality and baseline judgment accuracy, with a random intercept of subject.
p values for all models were estimated using the summary function from the lmerTest package, which is based on Satterthwaite's degrees of freedom (Kuznetsova et al., 2017), while effects were plotted using the package effects (Fox and Hong, 2010) and ggplot2 (Wickham and Wickham, 2016). Post hoc comparisons for main effects were performed using the emmeans package (Lenth et al., 2019). The Holm–Bonferroni method (Holm, 1979) was used to correct for multiple comparisons, while outliers were isolated using Tukey's method, which identifies outliers as values falling more than 1.5 × the interquartile range below the first quartile or above the third quartile. Categorical factors were sum-to-zero contrast coded, such that factor level estimates were compared with the grand mean (Schad et al., 2020). Further, for modeled effects, an 83% confidence interval (CI) threshold was used given that this approach corresponds to the 5% significance level with nonoverlapping estimates (Austin and Hux, 2002; MacGregor-Fors and Payton, 2013). In the visualization of effects, nonoverlapping CIs indicate a significant difference at p < 0.05.
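Two of the procedures above can be made concrete with a short sketch: the Holm–Bonferroni step-down adjustment and the rationale for the 83% CI threshold. The function names are hypothetical, and the CI calculation assumes two independent estimates with equal standard errors and normally distributed sampling error, which may not hold exactly for the models reported here.

```python
import math
from statistics import NormalDist

def holm_adjust(pvals):
    """Holm-Bonferroni adjusted p values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on
        candidate = min(1.0, (m - rank) * pvals[idx])
        # Enforce monotonicity so adjusted p values never decrease with rank
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

def nonoverlap_p(ci_level):
    """Two-sided p value implied when two CIs of the given level just touch,
    assuming independent estimates with equal standard errors."""
    z_ci = NormalDist().inv_cdf(0.5 + ci_level / 2)
    # CIs just touch when the difference equals 2 * z_ci * SE,
    # while the standard error of the difference is sqrt(2) * SE
    z_diff = 2 * z_ci / math.sqrt(2)
    return 2 * (1 - NormalDist().cdf(z_diff))

adjusted = holm_adjust([0.01, 0.04, 0.03])
p_at_83 = nonoverlap_p(0.83)
p_at_95 = nonoverlap_p(0.95)
```

Under these assumptions, two 83% CIs that just fail to overlap correspond to a p value of roughly 0.05, whereas requiring nonoverlap of 95% CIs would be considerably more conservative.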
Results
Sleep supports the consolidation of fixed word order rules
Across testing sessions and grammaticality, participants showed a moderate degree of accuracy for fixed (M = 64.00; SD = 48.00) and flexible (M = 58.00; SD = 49.00) word orders, with performance accuracy ranging from 37.18 to 93.75%. As shown in Table 1, performance also varied by sentence type, condition, and grammaticality, with the sleep condition outperforming the wake condition for fixed word orders at delayed testing.
Percent correct and the sensitivity index d′ by condition (sleep, wake), sentence judgment task (baseline, delayed), grammaticality (grammatical, ungrammatical), and sentence type (fixed, flexible)
Generalized linear mixed-effects modeling of single trial response accuracy (controlling for baseline performance) revealed a significant grammaticality × sentence type × condition interaction (β = 0.13; se = 0.03; p < 0.001; Fig. 4). The Holm–Bonferroni adjusted post hoc comparisons revealed that response accuracy was higher for the sleep relative to wake condition for fixed grammatical (OR = 0.55; se = 0.12; z = −2.60; padj = 0.03) but not fixed ungrammatical (OR = 0.89; se = 0.19; z = −0.52; padj = 1.00) word orders.
Visualization of the behavioral results. Relationship between the probability of correct response (y-axis; higher values indicate a higher probability of a correct response), grammaticality (x-axis; grammatical, ungrammatical), sentence type (left column, flexible; right column, fixed), and condition (wake, salmon; sleep, purple). Bars represent the 83% confidence interval around group level expected marginal mean estimates. Dots represent individual data points per subject for aggregated data.
Response accuracy was also higher in the sleep condition for grammatical fixed relative to grammatical flexible word orders (OR = 0.58; se = 0.06; z = −4.63; padj < 0.001). Participants in the sleep condition were also more likely to judge flexible than fixed word order sentences as ungrammatical (OR = 1.59; se = 0.23; z = 3.10; padj = 0.01). These results indicate that sleep may benefit the consolidation of fixed (but not flexible) word order rules, although this pattern may be due to differing response strategies adopted between the sleep and wake conditions. To address this in subsequent analyses, we examine the sensitivity index d′ to account for potential response biases (see Table 1 for d′ values).
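For readers unfamiliar with the sensitivity index, a minimal sketch of a d′ calculation is given below. It assumes that a hit is a grammatical sentence judged grammatical and that a false alarm is an ungrammatical sentence judged grammatical. The log-linear correction for extreme rates is also an assumption, as the correction used in the reported analyses is not specified here. The function name is hypothetical.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false alarm rate).
    Adds 0.5 to each count (log-linear correction; an assumption)
    so that rates of 0 or 1 do not produce infinite z scores."""
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Hypothetical counts: 30 of 40 grammatical sentences accepted,
# 10 of 40 ungrammatical sentences accepted
sensitivity = d_prime(hits=30, misses=10, false_alarms=10, correct_rejections=30)
```

Because d′ subtracts the standardized false alarm rate from the standardized hit rate, a participant who simply accepts most sentences gains no advantage, which is why it is less affected by response bias than percent correct.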
Theta power after sleep is associated with increased memory for fixed but decreased memory for flexible word order rules
Based on the differences in behavioral performance between the sleep and wake conditions on fixed and flexible word orders, we asked whether task-evoked theta power predicts differences in behavior across sleep and wake. A nonparametric cluster-based permutation test (see Materials and Methods) contrasting Condition (sleep, wake) and Sentence Type (fixed, flexible) revealed a significant difference in baseline-corrected theta power during the delayed session (Monte Carlo p = 0.008; see Fig. 5A for topography and demarcation of the cluster). No significant clusters were identified for alpha- or beta-band estimates.
Theta power and judgment accuracy. A, Cluster-based permutation testing on the theta band contrasting differences between Condition (sleep, wake) and Sentence Type (fixed, flexible). Warmer colors denote a higher t statistic. Significant channels are indicated by white asterisks. B, Raincloud plots illustrating average theta power over significant channels between sentence type and condition. Positive values on the y-axis denote increased theta power relative to the prestimulus interval. C, Modeled effects of task-related theta power (x-axis; higher values indicate increased power) on judgment accuracy (y-axis; higher values indicate better performance) for the sleep and wake conditions (sleep, purple solid line; wake, dashed pink line) for flexible (left facet) and fixed (right facet) sentences. The black dashed line indicates chance-level performance, while the shaded regions indicate the 83% confidence interval. The x-axis reflects theta power estimates, with more negative values reflecting a decrease in power and positive values reflecting an increase in power from the prestimulus interval, respectively. Individual data points represent raw (single subject) values.
Given the significant theta band effects, we constructed a linear mixed-effects model with judgment accuracy (d′) as the outcome and task-related theta power (drawn from the significant cluster identified above) and Condition (sleep, wake) and Sentence Type (fixed, flexible) as predictors. This analysis revealed a significant theta × condition × sentence type interaction (β = −1.09; se = 0.34; p = 0.001). The Holm–Bonferroni adjusted post hoc comparisons revealed that for flexible word orders, greater theta synchronization was associated with poorer judgment accuracy for the sleep but not wake condition. However, the inverse was observed for fixed word order sentences, such that less theta desynchronization was associated with improved judgment accuracy for the sleep but not wake condition (β = −4.70; se = 1.10; padj < 0.001). Coupled with the behavioral model, the current analysis demonstrates that sleep preferentially consolidates fixed word order rules at the expense of flexible word order rules and that this is reflected in task-related theta power. For a visualization of these effects, see Figure 5C. For time–frequency and power spectral density plots for the sleep and wake conditions across fixed and flexible word orders, see Figures 6 and 7, respectively.
Differences in time–frequency activity between sleep and wake and fixed and flexible word orders. Time–frequency plots for the sleep (top) and wake (bottom) conditions for fixed (left column) and flexible (right column) word order sentences. Time is presented on the x-axis (dashed vertical bar represents sentence onset), while frequency is presented on the y-axis. Warmer colors denote an increase in power relative to the prestimulus period, while cooler colors represent a decrease in power. The z-scale is in arbitrary units.
Power spectral density plots for the sleep (blue) and wake (red) conditions for frontal, central, parietal, and occipital regions of interest. Fixed word order sentences are on the left, while flexible word orders are on the right. The solid red and blue lines represent the mean power spectral density for the wake and sleep conditions, respectively, while the dashed lines represent the aperiodic (1/f) power law. Individual lines represent individual participant power spectral densities.
SO–spindle coupling is predictive of memory for fixed but not flexible word order rules
Having observed differences between the sleep and wake conditions on the relationship between task-related theta activity and behavioral performance, we next tested whether behavioral performance for fixed word order rules is associated with SO–spindle coupling. Based on previous work (Helfrich et al., 2018; Mikutta et al., 2019), we focused on the coupling strength, measured as the mean vector length of the SO phase at peak spindle amplitude during coupled SO–spindle events (for a summary of typical sleep parameters and their correlation with d′, see Table 2). There was a significant nonuniform distribution for the precise SO phase during peak spindle activity (p ≤ 0.001; Rayleigh test). In predicting behavioral performance, mixed-effects modeling revealed a significant coupling strength × sentence type × sagittality interaction (β = 3.05; se = 0.97; p = 0.002). Pairwise contrasts further revealed that this effect was largest anteriorly for fixed sentences (β = 6.85; se = 2.01; padj < 0.001; Fig. 8B), but nonsignificant in central (β = −0.75; se = 2.62; padj = 0.77) and posterior regions (β = −3.90; se = 3.47; padj = 0.26). Also note that while stronger SO–spindle coupling predicted improved judgment accuracy for fixed word order sentences, the inverse relationship was present for flexible word order sentences. Figure 8 illustrates an exemplary full-night spectrogram, distribution of SO–spindle coupling strength across channels, as well as exemplar single subject and group level comodulagrams and preferred phase of SO–spindle coupling for NREM sleep. For a summary of sleep microarchitecture characteristics, see Table 3.
Sleep neurophysiology metrics and relationship between phase amplitude coupling and judgment accuracy. A, Hypnogram and full-night multitaper spectrogram for a single participant from channel Cz. B, Modeled effects from the linear mixed-effects regression of SO–spindle coupling strength (x-axis; higher values indicate stronger coupling) on judgment accuracy (y-axis; higher values indicate better performance) for fixed and flexible word order sentences (fixed, purple solid line; flexible, dashed pink line) across levels of anterior (left), central (middle), and posterior (right) regions. The black dashed line indicates chance-level performance, while the shaded regions indicate the 83% confidence interval. C, Scatterplot indicating the relationship between judgment accuracy (y-axis; higher values denote better memory performance) and SO–spindle coupling strength (x-axis; higher values denote stronger coupling) for flexible (left) and fixed (right) word order sentences across anterior channels. The topoplot visualizes the beta coefficient from the SO–spindle coupling strength × sentence type interaction, with higher values/warmer colors denoting a stronger interaction coefficient. D, Single-subject and group-level average time–frequency response of all SOs coupled to a spindle (−1,200 to 1,200 ms, centered on the trough of the SO), with the time-domain averaged SO overlaid. To the right is the preferred phase of SO–spindle coupling for NREM sleep. Note that 0 represents the peak of the SO. E, Ridge plot illustrating the distribution of SO–spindle coupling strength (x-axis; higher values indicate stronger coupling) across channels (y-axis).
Descriptive statistics for sleep parameters and correlations with the difference between d′ at delayed and baseline testing for fixed and flexible word order sentences
NREM slow oscillation–spindle coupling characteristics for frontal, central, and parietal channels
Frontal SO–spindle coupling and task-evoked theta power interact to predict judgment accuracy
Having shown that SO–spindle coupling is associated with improved judgment accuracy for fixed word orders, and that judgment accuracy is tracked by task-related theta power, we examined whether frontal theta power interacts with frontal SO–spindle coupling strength to predict judgment accuracy. A mixed-effects model regressing judgment accuracy onto SO–spindle coupling strength, task-based theta power, sagittality (anterior, central, posterior), and sentence type (fixed, flexible) revealed a significant three-way interaction between SO–spindle coupling strength, task-based theta power, and sentence type (β = −41.60; se = 16.70; p = 0.01). As illustrated in Figure 9, high anterior task-based theta power and stronger anterior SO–spindle coupling were positively associated with delayed judgment accuracy for fixed but not flexible word order sentences. This finding links frontal neural activity in the sleeping and waking brain to predict higher-order language learning.
The interaction between task-related theta power and SO–spindle coupling strength predicts judgment accuracy. Delayed judgment accuracy (y-axis; higher values denote higher accuracy), SO–spindle coupling strength (x-axis; higher values denote stronger coupling) and task-related theta power (facetted; low and high contrast for plotting purposes only) averaged across anterior channels. Fixed sentences are color coded in yellow, while flexible sentences are color coded in gray.
Discussion
Coordination between SOs and sleep spindles is hypothesized to provide an optimal temporal receptive window for hippocampal–cortical communication during sleep (Staresina et al., 2015; Helfrich et al., 2019) in the support of memory consolidation. Here, we show that the beneficial effect of SO–spindle coupling on memory extends to sentence-level regularities. Behaviorally, we demonstrated that a period of sleep compared with an equivalent period of wake benefits the consolidation of fixed relative to flexible word order rules and that this effect is modulated by the strength of coupling between spindles and SOs. Our results further reveal that SO–spindle coupling correlates with changes in task-evoked theta activity during sentence processing. Interestingly, participants in the sleep condition exhibited overall less theta power at delayed testing relative to the wake condition; however, less theta desynchronization was associated with improved judgment accuracy for fixed word orders in the sleep group. Lastly, we reveal that the interaction between frontal SO–spindle coupling and task-related frontal theta power predicts improved judgment accuracy for fixed but not flexible word order rules. In sum, our results establish converging behavioral and neurophysiological evidence for a role of NREM SO–spindle coupling and task-related theta activity as signatures of successful memory consolidation and retrieval in the context of higher-order language learning.
Beyond single word learning: a role for sleep in consolidating word order rules
Using a complex modified miniature language paradigm (Cross et al., 2021), we demonstrated that a period of sleep facilitates the extraction of fixed relative to flexible word order rules. Importantly, the key distinction between these word order permutations is that successful interpretation of fixed word order sentences relates to the sequential position of the noun phrases and verb (i.e., the first noun phrase is invariably the actor, and the sentence is verb-final). In contrast, successful interpretation of flexible word order sentences depends more heavily on the animacy of the nouns. As such, fixed word order sentences, requiring a more sequential order-based interpretation, are more compatible with an English word order-based processing strategy (MacWhinney et al., 1984; Bornkessel and Schlesewsky, 2006; Bornkessel-Schlesewsky et al., 2015). Critically, this sleep-based enhancement for fixed word order rules was predicted by stronger SO–spindle coupling (Fig. 8).
Sleep-related memory effects are proposed to be biased toward stimuli following temporal or sequence-based regularities compared with relational information (for review, see Lerner and Gluck 2019). This is posited to occur via the hippocampal complex encoding temporal occurrences of sensory input (Durrant et al., 2011), which are replayed during SWS, potentially via SO–spindle coupling (Navarrete et al., 2020; Solano et al., 2022). Here, we provide evidence supporting this account. Specifically, sleep-based consolidation of higher-order language may favor sequence-based regularities, with mechanisms of sleep-related memory consolidation generalizing fixed over flexible word order rules, indexed by task-related theta activity.
It is important to note, however, that our participants were native monolingual speakers and as such, may have preferentially consolidated fixed word order rules at the expense of flexible rules. While behavioral work demonstrates sentence-level preferences of grammatical rules that are analogous to learners’ native languages (Cross et al., 2021), less is known regarding the neural underpinnings of this phenomenon. We now turn to how the neurobiological processes underpinning the beneficial effect of SO–spindle coupling on memory consolidation extend to higher-order language learning.
Slow oscillation–spindle coupling as a marker of sleep-associated memory consolidation and higher-order language learning
Coupling between SOs and spindles predicts successful overnight memory consolidation (Helfrich et al., 2018; Mikutta et al., 2019; Hahn et al., 2020, 2022). However, these studies often use old–new paradigms with single words (Helfrich et al., 2018; Mikutta et al., 2019) or word–image pairs (Muehlroth et al., 2019). Here, we found that the generalization of sequence-based (or fixed word order) rules is facilitated by the strength of NREM SO–spindle coupling. Mechanistically, during SWS, the cortex is synchronized during the up state of the SO, allowing effective interregional communication, particularly between the prefrontal cortex and hippocampal complex (Helfrich et al., 2019). It is during this SO up state that spindles induce an influx of Ca2+ into excitatory neurons, enabling synaptic plasticity and the generalization and stabilization of memory traces (Niethard et al., 2018). Here we revealed that the interaction between these cardinal markers of sleep-related memory processing extends to sentence-level regularities. This finding also accords with previous work examining not only NREM sleep and language learning (Batterink et al., 2014; Mirković and Gaskell, 2016; Schreiner and Rasch, 2017), but also REM (De Koninck et al., 1989, 1990; Thompson et al., 2021). For example, the interaction between time spent in NREM and REM modulates the amplitude of language-related ERPs (N400, late positivity) during the processing of novel grammatical rules (Batterink et al., 2014), while percent of time spent in REM is predictive of French learning in a naturalistic multi-week program (De Koninck et al., 1989, 1990). By demonstrating sleep-related consolidation effects for linguistic stimuli of varying complexity, these findings have begun to establish a link between sleep-related memory consolidation and various aspects of language (Rasch, 2017).
Building on this foundational work, we have provided empirical evidence supporting a link between oscillatory-based models of hippocampo-cortical memory consolidation and sentence-level learning and how this effect manifests in on-task oscillatory theta activity. In the following, we discuss how SO–spindle coupling, as a marker of sleep-associated memory consolidation, modulates task-related oscillatory activity and how these interactions affect sentence processing.
Task-related theta oscillations index successful memory consolidation of complex linguistic rules
Theta is the dominant frequency in the hippocampal complex and surrounding structures during wake (Duff and Brown-Schmidt, 2012; Covington and Duff, 2016). Oscillations in this frequency range are critical for coordinating hippocampal–cortical interactions, having been related to associative memory formation (Tort et al., 2009), tracking sequential rules (Crivelli-Decker et al., 2018), and predicting words based on contextual linguistic information (Piai et al., 2016; Corcoran et al., 2023). In the sleep and memory literature, increased theta power has been reported for successfully remembered items, interpreted as reflecting a stronger memory trace induced by sleep-based consolidation. Here, we observed that less theta desynchronization relative to the prestimulus interval predicted higher sensitivity for fixed word order rules after a 12 h delay period and that the effect of theta on fixed word order processing was more pronounced in the sleep relative to wake condition. This finding accords with the general memory literature, possibly reflecting the binding of linguistic items in a sequence to generate a coherent sentential percept.
We also observed that frontal NREM SO–spindle coupling and task-related theta power interacted to predict improved delayed judgment accuracy for fixed but not flexible word order rules. In line with systems consolidation theory (Born and Wilhelm, 2012), NREM oscillatory activity contributes to the consolidation of newly encoded memory representations, which may manifest in stronger theta power during retrieval, indicating a stronger neocortical memory trace (Schreiner and Rasch, 2015), reflected in improved sensitivity to fixed word order rules.
Future directions and concluding remarks
Future studies may include groups in AM–PM (12 h Wake), PM–AM (12 h Sleep), PM–PM (24 h Sleep early), and AM–AM (24 h Sleep late), as recommended by Nemeth et al. (2024). We did, however, model participants’ sleepiness levels and the 1/ƒ exponent in our statistical analyses, which partially controlled for potential time-of-day effects. Further, the evidence presented here is correlational, and neuroanatomical inferences cannot be drawn from scalp-recorded EEG. However, this is the first study to relate sleep-based memory consolidation mechanisms (i.e., SO–spindle coupling) to online sentence-level oscillatory activity and as such, has set the foundation for future work using techniques with greater spatiotemporal resolution. For example, electrocorticography and stereoelectroencephalography would allow for a better characterization of task-evoked cortical dynamics and SO–spindle coupling between cortical regions and the hippocampal complex, respectively (Helfrich et al., 2018, 2019). This approach would be complemented by demonstrating a selective reinstatement of memory traces during SO–spindle coupling using representational similarity analysis (Zhang et al., 2018). Identifying stimulus-specific representations during the encoding of sentence-level regularities and tracking the replay of stimulus activity related to SO–spindle coupling events would further demonstrate the critical role of sleep-based oscillatory mechanisms in higher-order language learning. Comparisons between sleep-related consolidation effects on language-specific and nonlanguage but related tasks (e.g., statistical learning tasks) in the same group of participants would also further establish the role of sleep in higher-order language learning.
In addition to representational similarity analyses, we suggest that future research examine how different baselining approaches affect task-related differences in theta activity between sleep and wake conditions. Here, we adopted a conventional baselining approach of subtracting theta power in the prestimulus interval from theta power in the stimulus period. In doing so, we observed that the sleep group had greater theta desynchronization than the wake group but that less desynchronization was associated with improved recognition accuracy. From this perspective, it appears that more theta power is indeed associated with better memory, but future research should establish whether this effect is driven by a limiting of task-related desynchronization, as we observed, or if a different baselining procedure would reveal an increase in theta power.
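To illustrate what a comparison of baselining approaches might involve, the sketch below contrasts the subtractive correction described above with two common alternatives, percent change and decibel conversion. The function and method names are hypothetical and the power values are invented for illustration. This is not the analysis code used here.

```python
import math

def baseline_correct(power, baseline, method="subtract"):
    """Baseline-correct poststimulus power values against the mean of the
    prestimulus (baseline) power values.

    method:
      "subtract" - absolute difference from baseline (the approach described above)
      "percent"  - percent change relative to baseline
      "db"       - decibel change, 10 * log10(power / baseline)
    """
    base = sum(baseline) / len(baseline)
    if method == "subtract":
        return [p - base for p in power]
    if method == "percent":
        return [100.0 * (p - base) / base for p in power]
    if method == "db":
        return [10.0 * math.log10(p / base) for p in power]
    raise ValueError(f"unknown method: {method}")

# Hypothetical theta power values (arbitrary units)
prestimulus = [2.0, 2.0]
poststimulus = [4.0, 2.0, 1.0]
absolute = baseline_correct(poststimulus, prestimulus, "subtract")
percent = baseline_correct(poststimulus, prestimulus, "percent")
decibel = baseline_correct(poststimulus, prestimulus, "db")
```

The subtractive approach is sensitive to group differences in absolute prestimulus power, whereas the ratio-based alternatives express change relative to that power, which is one reason the direction or size of a sleep versus wake effect could differ across procedures.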
Taken together, our results demonstrate that the temporal coupling between NREM SOs and spindles supports the consolidation of complex sentence-level rules. We demonstrated that SO–spindle coupling promotes the consolidation of sequence-based rules and modulates task-evoked theta oscillations previously implicated in language learning (de Diego-Balaguer et al., 2011; Kepinska et al., 2017) and sentence processing (Vassileiou et al., 2018). Critically, these findings add to models of sleep-based memory consolidation (Lewis and Durrant, 2011; Born and Wilhelm, 2012) and help characterize how effects of sleep-related oscillatory dynamics on memory manifest in oscillatory activity during complex language-related operations.
Footnotes
We thank Alex Chatburn and Samantha Gray for helpful discussions and feedback on an earlier version of this manuscript. We also thank the research assistants at the Cognitive Neuroscience Laboratory. Particular thanks to Isabella Sharrad, Erica Wilkinson, Nicole Vass, and Angela Osborn for help with data collection and also the participants.
Preparation of this work was supported by Australian Commonwealth Government funding under the Research Training Program (RTP; number 212190) and a Maurice de Rohan International Scholarship awarded to Z.R.C. I.B.-S. was supported by an Australian Research Council Future Fellowship (FT160100437). A.W.C. and L.Z.-W. were supported by Australian Government RTP scholarships. R.T.K. is supported by a National Institutes of Health grant (RO1NS21135), while R.F.H. was supported by the Hertie Foundation (Excellence in Clinical Neuroscience) and the Jung Foundation for Science and Research (Ernst Jung Career Advancement Prize). This work was also supported by a UK ESRC grant (ES/N009924/1) awarded to G.M.G.
The authors declare no competing financial interests.
Correspondence should be addressed to Zachariah R. Cross at zachariah.cross{at}northwestern.edu.