Abstract
Sleep facilitates abstraction, but the exact mechanisms underpinning this are unknown. Here, we aimed to determine whether triggering reactivation in sleep could facilitate this process. We paired abstraction problems with sounds, then replayed these during either slow-wave sleep (SWS) or rapid eye movement (REM) sleep to trigger memory reactivation in 27 human participants (19 female). This revealed performance improvements on abstraction problems that were cued in REM, but not problems cued in SWS. Interestingly, the cue-related improvement was not significant until a follow-up retest 1 week after the manipulation, suggesting that REM may initiate a sequence of plasticity events that requires more time to be implemented. Furthermore, memory-linked trigger sounds evoked distinct neural responses in REM, but not SWS. Overall, our findings suggest that targeted memory reactivation in REM can facilitate visual rule abstraction, although this effect takes time to unfold.
SIGNIFICANCE STATEMENT The ability to abstract rules from a corpus of experiences is a building block of human reasoning. Sleep is known to facilitate rule abstraction, but it remains unclear whether we can manipulate this process actively and which stage of sleep is most important. Targeted memory reactivation (TMR) is a technique that uses re-exposure to learning-related sensory cues during sleep to enhance memory consolidation. Here, we show that TMR, when applied during REM sleep, can facilitate the complex recombining of information needed for rule abstraction. Furthermore, we show that this qualitative REM-related benefit emerges over the course of a week after learning, suggesting that memory integration may require a slower form of plasticity.
Introduction
Abstraction, or the process of formulating generalized ideas or concepts by extracting common qualities from specific examples, is a core component of fluid intelligence (Welling, 2007). Sleep has been suggested to play an active role in rule abstraction (for review, see Chatburn et al., 2014; Lerner and Gluck, 2019). For instance, some experimental paradigms that probe rule abstraction such as statistical learning of tone transition patterns have been shown to benefit from slow-wave sleep (SWS; Durrant et al., 2011, 2013, 2016), whereas others, like the weather prediction task, seem to benefit from rapid eye movement sleep (REM; Barsky et al., 2015). Rule-learning-related neural patterns have even been shown to reactivate in the rat medial prefrontal cortex during SWS (Peyrache et al., 2009). However, the mechanisms supporting abstraction in sleep are unknown. It is unclear if one specific sleep stage is more important, and whether the benefit stems from memory reactivation or other types of processing in sleep.
Targeted memory reactivation (TMR) is a method for explicitly controlling memory reactivation in the sleeping brain (Oudiette and Paller, 2013). In TMR, sounds that have been paired with recently learned material during wake are softly re-presented during subsequent sleep to trigger reactivation of the associated memories and boost consolidation. TMR is most commonly applied during non-REM (NREM) sleep, where it is known to strengthen memories (Rasch et al., 2007; Rudoy et al., 2009; Antony et al., 2012), but it has also been linked to qualitative changes, such as the emergence of explicit knowledge of formerly implicit memories (Cousins et al., 2014). There is currently a debate in the literature regarding whether memories can be reactivated during REM sleep using TMR, with some studies reporting null findings (Rasch et al., 2007; Hu et al., 2019) and others reporting significant effects (Sterpenich et al., 2014; Hutchison et al., 2021; Picard-Deland et al., 2021). The present study addresses this issue within the realm of rule abstraction, because it remains to be determined whether TMR can boost this skill in addition to memory consolidation. It is also unclear whether rule abstraction would benefit most from reactivation in SWS or in REM, given the proposed role of these sleep stages in memory restructuring (Landmann et al., 2015) and generalization (Lewis and Durrant, 2011; Sterpenich et al., 2014; Pereira and Lewis, 2020). One study did apply SWS TMR to an abstraction task and suggested a benefit, but the lack of a noncued control makes the results difficult to interpret (Batterink and Paller, 2017). Another study showed no effect of SWS TMR on generalization (Witkowski et al., 2021), whereas in a third study, such stimulation appeared to produce a deficit in abstraction (Hennies et al., 2017). Nonetheless, SWS has been linked to positive effects in numerous abstraction-related tasks (for review, see Lerner and Gluck, 2019).
In the current report, we address the above questions by using TMR to reactivate rule abstraction problems in SWS and REM, with different problems cued in each stage. We used a visual abstraction task called the Synthetic Visual Reasoning Task (SVRT; Fleuret et al., 2011), which requires participants to abstract rules that define families of abstract visual patterns through trial-and-error exposure. For example, in the problem depicted in Figure 1, the rule is that each image contains two identical shapes. In training, participants are shown a series of images and asked to categorize them as belonging to the family in question or not. They are given feedback on each correct/incorrect categorization. Each family of shapes is associated with a consistent reference image. At test, participants have to indicate whether a given sample image follows the same rule as the reference image for that particular problem. Because the impacts of TMR can last for up to a week (Hu et al., 2015), and may even amplify across this period (Groch et al., 2017), we retested our participants 1 week after the TMR manipulation.
Figure 1
Experimental design. A, Before sleep, participants learned to pair each image (a face or a landscape) with a synthetic visual reasoning task (SVRT) problem and its associated sound (Problem-Image Association Task). Next, they were trained and tested on the SVRT task, where they had to decide whether the test image followed the same rule as the reference image for any given problem (top). For example, in the problem shown here the rule is that each image contains two identical shapes (Fleuret et al., 2011). Extended Data Figure 1-1 contains another example. Immediately before sleep, participants were probed on their ability to recall which sound (speaker symbols) had been paired to which SVRT problem (Problem-Sound Association Task). B, TMR was applied to different problems during REM and SWS during the night. Finally, participants were retested on the SVRT both next morning (postsleep day 1) and a week later (postsleep day 7). Representative hypnogram depicting the TMR protocol. During TMR in the night, sounds associated with four problems were replayed in SWS, and sounds associated with four other problems were replayed in REM. Control sounds that had not been associated with any problems (new sounds) but instead served as controls for auditory responses were also replayed in both sleep stages. Cuing started with the first instance of SWS and REM and terminated once control and experimental sounds had been presented 28 times each (twice per loop, 14 loops).
Materials and Methods
Participants
Healthy young adults (mean age, 22 years old; range, 19–30 years) were recruited online and through advertisements on the university campus to take part in this study. Participants filled out an online screening form and were excluded if they had any diagnosed sleep, neurologic, or psychiatric disorders; were taking psychoactive medication; had traveled across more than two time zones; or had engaged in regular shift work in the 2 months before the experiment. Participants reported a regular sleep cycle over a 4 week period before the experiment and were instructed to abstain from alcohol (24 h), caffeine (12 h), and daytime napping before each visit to the laboratory. Data from 27 individuals (19 females) were collected and used for behavioral analyses. One participant was excluded from the ERP analyses because, owing to technical difficulties, no EEG triggers were recorded during TMR (leaving n = 26). All participants signed informed consent forms and received monetary compensation for their participation. This study was approved by the ethics committee of the School of Psychology of Cardiff University.
Experimental design
The experiment was conducted according to a within-subject design (Fig. 1). Participants arrived in the evening (between 6:00 and 8:00 PM) and were prepared for polysomnography recordings. Subsequently, participants performed a battery of presleep cognitive testing. First, they performed the Image Familiarization Task, where they passively saw all the images (either faces or landscapes) used in the SVRT. To ensure engagement, participants were instructed to press the space bar whenever a red dot appeared on the screen. After the Image Familiarization Task, participants performed the Problem-Image Association Task, where they learned to associate each SVRT problem with a particular image of either a face or a landscape. These images were used to group the SVRT problems into two categories (category 1, problems paired with faces; category 2, problems paired with landscapes). Next, participants performed the Synthetic Visual Reasoning Test (Fleuret et al., 2011), where they were required to categorize a series of samples from 16 problems as either in class (following the rule) or out of class (not following the rule; Extended Data Fig. 1-1). Each problem was always presented in combination with a specific image from one of the two possible categories (faces or landscapes) and with a 200 ms sound. During training, participants learned through feedback and trial and error until they were able to correctly categorize the samples to 70% accuracy on each problem. During testing, they did not receive any feedback. The last task before sleep was the Problem-Sound Association Task, where participants were trained to recognize which sound had been paired with which problem, until they reached 100% accuracy. This task was introduced to guarantee that the effectiveness of TMR would not be compromised by a weak association between the sounds and their respective problems.
Figure 1-1
SVRT stimuli examples. Sample images from problem 1 (top) and problem 2 (bottom), that either follow the rule (on the left) or break the rule (on the right; Fleuret et al., 2011). For problem 1 the rule is that each picture contains two identical shapes. The squiggly lines were introduced as distractors (not a part of the rule) to increase the difficulty level. For problem 2 the rule is each image contains two shapes of different sizes, the smaller one inside the larger one, roughly centered. The black filling of the smaller shape was added in some images as a distractor to increase the difficulty level. Other problems had rules relating, for example, to the number of identical shapes (pairs or triplets), their position (mirrored or translated, touching or not touching, inside or outside one another, aligned or not aligned, etc.) or their arrangement (odd shape in the middle, bigger shape at the edge, etc.).
Next, participants went to sleep while nonobtrusive brown noise was continuously played throughout the night. For targeted memory reactivation, each category (sets 1 and 2 of problems paired with faces or problems paired with landscapes) was assigned to a sleep stage (either SWS or REM). Assignment of categories to the sleep stages was counterbalanced across participants. Within each category, half of the problems were cued during sleep, and the other half served as a noncued control (subsets A and B). Assignment of sets 1A, 1B, 2A, and 2B to each sleep stage and cuing condition was counterbalanced across participants (see below the SVRT problems that were included in each set). The sounds paired with problems assigned to the cued condition were played at the onset of either SWS or REM, as well as new control sounds not previously presented to the participant. On awakening (day 1), participants performed the Image Familiarization Task again, were wired down, showered, and then were retested on the SVRT. A week later (day 7) participants returned to the lab and were retested once again on the SVRT. Performance on the SVRT was assessed by the accuracy at each time point and by the accuracy change (difference across time points).
All tasks were implemented in MATLAB R2017b using Psychtoolbox-3 and displayed on a 1920 × 1080 pixel computer monitor.
Tasks
Image Familiarization Task
This task consisted of 14 blocks of 8 trials (one per problem) for each of the 2 categories (i.e., 8 faces and 8 landscapes, for a total of 16 different images), amounting to 112 image presentations per category (224 in total). A variable intertrial interval was set between 1 and 2 s. Participants were asked to press the space bar whenever a red dot appeared on the screen. The red dot was set to appear randomly once every eight trials. The task was administered in the evening and again in the morning.
Problem-Image Association Task
This task was designed to help participants learn to associate each SVRT problem and its corresponding sound with a particular image (either a face or a landscape). It consists of two phases, learning and test. For each participant, the images and sounds were randomly assigned to the SVRT problems. During learning, participants performed 3 blocks of 16 trials (one per problem) where they passively viewed the reference representation of any given SVRT problem on the left-hand side of the screen and the image it was paired with (either a face or a landscape) in the center of the screen while the 2 s sound paired with that problem-image dyad was played. Participants were instructed to press the space bar if a red dot appeared on the screen. The red dot appeared randomly once per block. In the test phase, participants saw the reference representation in the center of the screen and heard the same sound that had been paired with it during learning but now trimmed to only 200 ms. Next, two images appeared on the screen, and the participant had to indicate which one had been paired with that particular problem-sound dyad. The test was repeated until participants reached 75% accuracy.
These two tasks, image familiarization and problem-image association, were added to the experimental design to facilitate use of machine-learning classification algorithms to detect replay. We performed extra checks to certify that image category was not influencing the SVRT task (see below, Results).
Synthetic Visual Reasoning Test
The Synthetic Visual Reasoning Test (SVRT) requires participants to indicate whether a given sample image follows the same rule as the reference image for that particular problem (both sample and reference images were displayed simultaneously). The rule governing each problem had to be discovered through trial and error during training. We measured accuracy as the ability to correctly categorize sample images according to whether they followed or broke the rule for that problem (Fig. 1). Feedback was given after each trial, informing participants whether their categorization of the sample image was correct. Extended Data Figure 1-1 has more examples of sample images and rules. Each problem was presented in conjunction with a picture of a face or a landscape to boost the chances of eliciting classifiable EEG patterns, as has been done for objects and scenes (Cairney et al., 2018) and for animals, tools, faces, and buildings (Shanahan et al., 2018). Participants were trained on 16 categorization problems, half of which were subsequently used to test the impact of TMR in SWS (four were cued in SWS and four were used as a control), and the other half (four cued and four control) were used to test the impact of TMR in REM.
The test phase consisted of five trials for each problem. Of a pool of 200 images per problem, (100 following the rule and 100 not following the rule), 5 images were randomly selected for each test (presleep, day 1 and day 7).
During both training and test phases, a time limit for each response was set to 6 s, after which the next trial would start. After each block (i.e., problem) there was a 15 s rest break. The order of problem presentation was randomized for each participant. Each trial began with the presentation of the reference representation of that problem on the left-hand side of the screen, the image it had been paired with (either a face or a landscape) in the center of the screen, and the 200 ms sound that these images were associated with. Then the image to be categorized was displayed on the right-hand side of the screen. Participants were required to press 1 if the image to be categorized was in class (satisfied the rule) or to press 9 if it was out of class (did not satisfy the rule). Performance on the SVRT was assessed by the change in accuracy overnight (postsleep day 1 minus presleep) and across the week (postsleep day 7 minus postsleep day 1). Performance was not affected by the category of the image paired with each problem, i.e., face or landscape (all t tests p > 0.4, uncorrected).
Problem-Sound Association Task
This task was designed to ensure that participants were able to correctly identify all sound-problem dyads introduced while performing the SVRT before sleep, because a weak association could otherwise compromise the effectiveness of TMR. Again, the reference representation was presented in combination with its corresponding face or landscape image. Next, two 200-ms-long sounds were played, and the participant indicated which one had been paired with that problem-image dyad. The test was repeated until participants reached 100% accuracy.
Stimuli
All sounds were obtained from an online repository (https://freesound.org). Initial sounds (2 s long, learning phase of the Sound-Problem Association Task) were trimmed into 200-ms-long sounds using the software Audacity. A pool of sounds was used for each category (faces/landscapes) from which sounds were randomly selected and assigned to a specific SVRT problem. For faces, generic object sounds were used, and for landscapes, generic nature sounds were used, such as a bird chirping or the wind blowing. For each category (faces or landscapes) a group of 12 similar but easily distinguishable sounds was selected and from this pool, 8 sounds were randomly paired with an image and used in the SVRT task while the remaining 4 sounds were used as controls during TMR. Sounds for faces and landscapes were matched in duration, and all were played at the same volume within each participant.
The images of faces were obtained from Karolinska Directed Emotional Faces (Lundqvist et al., 1998). Only faces of females with a neutral facial expression at a straight angle were chosen. The images of landscapes were obtained from an online repository (https://www.freeimages.com). All images were edited into grayscale and resized (faces, 325 × 435 pixels; landscapes, 435 × 325 pixels) using GIMP (GNU Image Manipulation Program) software.
TMR protocol
Audio cues were embedded in brown noise to decrease the likelihood that the TMR sounds would elicit an arousal. Brown noise was played throughout the entire night, whereas the cues were presented only when SWS or REM was identified online by the experimenter. Both stimuli (audio cues and brown noise) were played through loudspeakers placed behind the participant's bed. The sound volume was manually adjusted for each participant before sleep according to the participant's comfort level. Each cue (either experimental, i.e., paired with a learned rule, or control, i.e., with no rule associated) was played twice in a row before the next cue was played. Cues were played 4 s apart. One loop of cuing consisted of all eight cues (four control and four experimental), each played twice (16 sound presentations). The order of cue presentation was randomized at each iteration of the loop. A total of 14 loops was played in each sleep stage (corresponding to ∼15 min of cuing), adding up to 28 repetitions of each individual sound and 112 cuing events in each condition (control or experimental). Although SWS usually occupies a larger proportion of the night than REM (and would thus allow for an extended cuing time), we wanted to ensure that we would be able to deliver the same amount of cuing in both sleep stages, and therefore we opted for limiting cuing to ∼15 min. Cuing was initiated in the first episode of SWS and REM and was interrupted whenever an arousal or sleep stage transition was identified. In one participant, only 7 of the 14 loops of REM cuing were completed because of short sleep duration, and in another participant only 8 of the 14 loops of SWS cuing were completed because of light sleep throughout the night. These participants were not excluded from any analyses.
Note that the timing of cuing varied among participants, depending on whether they obtained ∼15 min of uninterrupted SWS and REM, so that for some, cuing was finished within the first NREM-REM cycle, whereas for others, additional cycles were needed. No significant correlations were found between number of cues delivered in SWS or REM and subsequent performance (all p values > 0.1). Following off-line sleep scoring, cuing accuracy (calculated as the percentage of cues delivered in the intended sleep stage) was determined as 94.44% for SWS and 93.72% for REM. Regarding continuity (i.e., whether TMR was completed within one sleep cycle), SWS TMR was continuous for 19 of 26 participants, and REM TMR was continuous for only 1 of 26 participants. This is to be expected as we initiated REM TMR at the onset of the first REM episode, which tends to be very short, and our entire cuing procedure required at least 15 min to complete if uninterrupted. Given this distribution of the data, it is not possible to estimate whether the TMR effect differed depending on whether cuing was continuous or discontinuous.
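The cuing arithmetic described above can be checked with a short sketch. The Python below is illustrative only and is not the authors' code (the study's tasks were implemented in MATLAB with Psychtoolbox-3); the cue labels and the function name `build_cue_schedule` are hypothetical, and an uninterrupted cuing period is assumed.

```python
import random

N_LOOPS = 14          # loops per sleep stage (from the protocol)
REPEATS_IN_A_ROW = 2  # each cue is played twice before the next cue
CUE_SPACING_S = 4     # seconds between successive sound onsets

# Hypothetical labels: four experimental (problem-paired) and four control sounds.
EXPERIMENTAL = [f"exp_{i}" for i in range(1, 5)]
CONTROL = [f"ctl_{i}" for i in range(1, 5)]


def build_cue_schedule(seed=0):
    """Return a list of (onset_s, cue_label) for one uninterrupted sleep stage."""
    rng = random.Random(seed)
    schedule = []
    onset = 0
    for _ in range(N_LOOPS):
        cues = EXPERIMENTAL + CONTROL
        rng.shuffle(cues)  # cue order is re-randomized on every loop
        for cue in cues:
            for _ in range(REPEATS_IN_A_ROW):
                schedule.append((onset, cue))
                onset += CUE_SPACING_S
    return schedule


schedule = build_cue_schedule()
n_presentations = len(schedule)  # 14 loops * 8 cues * 2 repeats
per_sound = sum(1 for _, cue in schedule if cue == "exp_1")
per_condition = sum(1 for _, cue in schedule if cue.startswith("exp_"))
total_minutes = n_presentations * CUE_SPACING_S / 60
```

Under these assumptions the schedule contains 224 presentations, 28 per sound and 112 per condition, and lasts just under 15 min, in line with the figures given in the protocol.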
EEG recordings and sleep analysis
EEG was recorded using BrainVision software during the Image Familiarization Task (in the presleep evening and morning of postsleep day 1) and during sleep. Recordings were made at 500 Hz from 22 scalp locations on the standard 10/20 layout (Fz, F3, F4, FC1, FC2, FC5, FC6, Cz, C3, C4, CP5, CP6, Pz, P3, P4, P7, P8, PO3, PO4, Oz, O1, and O2), referenced to the mastoids. Impedances were kept below 5 kΩ. Electrooculogram and electromyogram signals were also recorded from electrodes next to each eye and 2 electrodes on the chin, respectively. Sleep scoring was accomplished using the guidelines from the American Academy of Sleep Medicine (version 2.5), within a custom-made script implemented in MATLAB. Off-line scoring was performed by two independent raters, blind to when cuing occurred, achieving an 88% agreement rate. Discrepancies were resolved by one of the raters.
Spindles and slow oscillations (SOs) were detected from all channels using the SpiSOP toolbox, version 2.3.8.3 (https://www.spisop.org/), with the spindle detection algorithm based on Mölle et al. (2002). Center frequencies of fast and slow spindles were visually determined for each participant and were used to define the finite impulse response (FIR) filter (center frequency, 13.29 Hz, SD 0.69). The root mean square (RMS) of the filtered signal was computed using a 0.2 s time window and smoothed by a moving average of another 0.2 s window. Any event that surpassed the 1.5 SD of the RMS signal was considered a candidate spindle. To fit the spindle detection criteria, the candidate events had to last between 0.5 and 3 s. Because we had no a priori hypothesis about specific channels, all correlations were made with the average across channels.
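The thresholding logic described above can be sketched as follows. This is a minimal illustration in plain Python, not the SpiSOP implementation: it assumes the input has already been band-pass filtered around the participant's spindle center frequency, it takes the threshold to be 1.5 times the SD of that filtered signal (one reading of the description), and the names `moving_mean` and `detect_spindles` are hypothetical.

```python
import math


def moving_mean(x, win):
    """Centered moving average over roughly `win` samples (truncated at the edges)."""
    half = win // 2
    csum = [0.0]
    for v in x:
        csum.append(csum[-1] + v)
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(max(0.0, (csum[hi] - csum[lo]) / (hi - lo)))
    return out


def detect_spindles(filtered, fs, sd_factor=1.5, win_s=0.2, min_s=0.5, max_s=3.0):
    """Return (start_s, end_s) of candidate spindles in a band-passed signal."""
    win = max(1, int(round(win_s * fs)))
    # RMS in a 0.2 s window, then smoothing with another 0.2 s moving average.
    rms = [math.sqrt(v) for v in moving_mean([s * s for s in filtered], win)]
    envelope = moving_mean(rms, win)
    # Assumed threshold: 1.5 x SD of the filtered signal.
    mean = sum(filtered) / len(filtered)
    sd = math.sqrt(sum((s - mean) ** 2 for s in filtered) / (len(filtered) - 1))
    threshold = sd_factor * sd
    events, start = [], None
    for i, v in enumerate(envelope + [0.0]):  # trailing 0.0 closes an open run
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            duration = (i - start) / fs
            if min_s <= duration <= max_s:  # keep events lasting 0.5 to 3 s
                events.append((start / fs, i / fs))
            start = None
    return events


# Synthetic check: a 1 s, 13 Hz burst embedded in low-amplitude background.
fs = 100
signal = []
for i in range(20 * fs):
    t = i / fs
    amp = 1.0 if 5.0 <= t < 6.0 else 0.05
    signal.append(amp * math.sin(2 * math.pi * 13 * t))
events = detect_spindles(signal, fs)
```

On the synthetic signal, the single burst between 5 and 6 s should be returned as one candidate event, slightly widened by the two 0.2 s windows.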
Similarly, slow oscillation detection is based on Mölle et al. (2002) but also see (Ngo et al., 2013). Before the actual detection, the signal is high-pass filtered (infinite impulse response filter; IIR by default) then low-pass filtered (FIR) to contain frequency components observed in slow oscillations in a specified band (0.3–3.5 Hz). Then all the time intervals with consecutive positive-to-negative zero crossings are marked. Only intervals with durations corresponding to a minimum (set to 0.5 Hz) and maximum (set to 1.11 Hz) slow oscillation frequency are considered as putative slow oscillations. The threshold for negative peaks was set to 1.25, and for negative to positive peaks, amplitude was also set to 1.25 (default parameters).
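The zero-crossing and duration criteria can be illustrated with a short sketch. The Python below is a simplified illustration rather than the SpiSOP code: it assumes the signal has already been filtered to the 0.3–3.5 Hz band, it omits the amplitude thresholds, and the function names are hypothetical.

```python
import math


def pos_to_neg_crossings(x):
    """Indices where the signal goes from positive to zero or negative."""
    return [i for i in range(1, len(x)) if x[i - 1] > 0 and x[i] <= 0]


def candidate_slow_oscillations(filtered, fs, min_hz=0.5, max_hz=1.11):
    """Return (start_s, end_s) intervals between consecutive positive-to-negative
    zero crossings whose duration corresponds to a 0.5-1.11 Hz oscillation."""
    min_dur = 1.0 / max_hz  # about 0.90 s
    max_dur = 1.0 / min_hz  # 2.0 s
    crossings = pos_to_neg_crossings(filtered)
    candidates = []
    for a, b in zip(crossings, crossings[1:]):
        duration = (b - a) / fs
        if min_dur <= duration <= max_dur:
            candidates.append((a / fs, b / fs))
    return candidates


# Synthetic check: a 0.8 Hz wave (1.25 s period) qualifies, a 3 Hz wave does not.
fs = 100
slow = [math.sin(2 * math.pi * 0.8 * i / fs) for i in range(10 * fs)]
fast = [math.sin(2 * math.pi * 3.0 * i / fs) for i in range(10 * fs)]
slow_candidates = candidate_slow_oscillations(slow, fs)
fast_candidates = candidate_slow_oscillations(fast, fs)
```

In a fuller pipeline, the putative events returned here would then be screened against the negative-peak and peak-to-peak amplitude thresholds.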
EEG preprocessing
First, the data were high-pass filtered at 0.3 Hz and low-pass filtered at 35 Hz. Then the continuous EEG was epoched into trials from 1 s before to 3 s after sound cue onset (because the cues were 4 s apart). Noisy channels were repaired by interpolating data from neighboring electrodes, and trials containing arousals or movement artifacts (as determined during sleep scoring) were removed. Finally, any remaining noisy trials were manually removed following visual inspection. The number of trials included in the final analysis for each participant, sleep stage, and condition are presented in Extended Data Figure 2-3.
Baseline correction was performed on the single-trial level using the entire trial length (−1 to 3 s; Grandchamp and Delorme, 2011). Trials were then separated into conditions (control and experimental) and sleep stages (SWS and REM). One participant was excluded from all EEG analyses as there were no EEG triggers during TMR (final n = 26).
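The epoching and baseline steps can be sketched as follows. This Python illustration is not the authors' pipeline (the EEG analyses were run in MATLAB with FieldTrip); it handles a single channel, assumes that baseline correction here means subtracting the mean of the whole epoch from every sample, and uses hypothetical function names.

```python
def extract_epoch(signal, fs, onset_idx, pre_s=1.0, post_s=3.0):
    """Cut one epoch from 1 s before to 3 s after a cue onset.

    Returns None when the epoch would run past either end of the recording.
    """
    start = onset_idx - int(round(pre_s * fs))
    stop = onset_idx + int(round(post_s * fs))
    if start < 0 or stop > len(signal):
        return None
    return signal[start:stop]


def baseline_correct_full_trial(trial):
    """Subtract the mean of the entire trial (assumed full-epoch baseline)."""
    mean = sum(trial) / len(trial)
    return [v - mean for v in trial]


# Synthetic check at the recording rate of 500 Hz: a drifting signal with an offset.
fs = 500
signal = [5.0 + 0.001 * i for i in range(10 * fs)]
trial = extract_epoch(signal, fs, onset_idx=2 * fs)
corrected = baseline_correct_full_trial(trial)
too_early = extract_epoch(signal, fs, onset_idx=100)
```

At 500 Hz each 4 s epoch spans 2000 samples, which matches the 4 s spacing between cues so that consecutive epochs do not overlap.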
EEG analysis
Event-related potential (ERP) analyses were conducted in FieldTrip (Oostenveld et al., 2011; http://www.fieldtriptoolbox.org/). ERPs were calculated for each condition and sleep stage and compared within subjects and between conditions, across all channels, within a time window from 0 to 2000 ms (not averaged).
ERPs of control and experimental sounds were compared using Monte Carlo cluster permutation tests, corrected for multiple comparisons (Maris and Oostenveld, 2007). The cluster alpha was set to 0.05, and 150,000 randomizations were conducted for every test. Clusters were considered significant at p < 0.025 (two tailed). Similar parameters were set up for time-frequency analysis for each frequency band of interest, theta (4–8 Hz), spindles (9–15 Hz), and low beta (12.5–16 Hz). More specifically, the time-frequency cluster permutation analysis was calculated using the average across trials for each participant in the window of interest (0–2 s). The statistical analysis was performed for experimental versus control sounds in SWS, REM, and also for their interaction (SWS difference vs REM difference, where difference was calculated as experimental minus control sounds) for each frequency band. The minimum number of channels to form a cluster was set to 2, the number of randomizations set to 250,000, and the cluster alpha at p = 0.025 (two tailed).
To determine whether stimulation led to a change in spindles or slow oscillations, we calculated the number and duration of spindles and slow oscillations per condition (experimental and control sounds). We then compared these between conditions using a cluster permutation analysis. The cluster alpha was set to 0.05, and 250,000 randomizations were conducted for every test. Clusters were considered significant at p < 0.025 (two tailed).
Finally, we sought to detect memory reactivation after our TMR cues using an EEG classifier. ERP values were used as features for a linear support vector machine (SVM) classifier. To avoid overfitting, we used fivefold cross-validation repeated twice. As performance metrics we used classification accuracy and area under the curve. The classification was performed separately for SWS and REM stages for each participant. Statistics were performed at the group level to check for any above-chance time cluster. No significant cluster was found for either of the performance metrics or for either sleep stage.
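The validation scheme can be illustrated by the way trials are partitioned. The sketch below shows only the index splitting for fivefold cross-validation repeated twice; the classifier itself is omitted, the splitter name `repeated_kfold` is hypothetical, and the trial count in the example is arbitrary rather than taken from the data.

```python
import random


def repeated_kfold(n_trials, k=5, repeats=2, seed=0):
    """Yield (repetition, train_indices, test_indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(repeats):
        indices = list(range(n_trials))
        rng.shuffle(indices)  # a fresh random partition for each repetition
        folds = [indices[i::k] for i in range(k)]
        for held_out in range(k):
            test = sorted(folds[held_out])
            train = sorted(
                idx for f in range(k) if f != held_out for idx in folds[f]
            )
            yield rep, train, test


# Example with an arbitrary number of trials.
splits = list(repeated_kfold(n_trials=23))
```

Each trial serves as test data exactly once per repetition, so the classifier is always evaluated on trials it was not fitted to, and performance can be averaged over the ten resulting splits.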
Statistical analyses
Performance change on the SVRT was compared using a repeated-measures ANOVA with within-subject factors sleep stage (SWS/REM), cuing condition (cued/noncued), and session (overnight/across the week). We ran an outlier analysis using the ROUT method (Q = 1%) and identified two outliers in the SWS cued condition. On removal of these outliers, the results remained the same as those in Figure 2A, with no significant difference between overall performance change on SWS cued and noncued problems (t(24) = 1.132, p = 0.269).
Descriptive statistics (mean, SD, SEM, and confidence intervals) are presented in Extended Data Figure 2-4. The combined performance change was compared between noncued and cued conditions using paired t tests. Pearson's correlations were calculated between the combined performance change and the average number of slow oscillations and spindles in frontal, central, and parietal derivations. Data are presented as mean ± SEM, and we report eta squared (η2) and Cohen's d as effect size estimates for significant findings.
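For readers who wish to check the paired comparisons, the relationship between the paired t statistic and Cohen's d can be sketched as follows. This Python illustration is not the JASP computation itself: it assumes that d for paired data is the mean difference divided by the SD of the differences (often written d_z), so that d = t/√n, and the accuracy values in the example are hypothetical.

```python
import math


def paired_t_and_d(cued, noncued):
    """Return (t, degrees of freedom, Cohen's d_z) for paired observations."""
    diffs = [c - n for c, n in zip(cued, noncued)]
    n = len(diffs)
    mean = sum(diffs) / n
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))
    t = mean / (sd / math.sqrt(n))
    d_z = mean / sd  # equals t / sqrt(n)
    return t, n - 1, d_z


# Hypothetical accuracies for five participants (not data from the study).
cued = [0.8, 0.6, 0.9, 0.7, 1.0]
noncued = [0.6, 0.6, 0.7, 0.8, 0.8]
t, df, d_z = paired_t_and_d(cued, noncued)

# Consistency check using the values reported for REM at postsleep day 7.
d_from_reported_t = 3.357 / math.sqrt(27)
```

Under this definition, the reported t(26) = 3.357 with 27 participants corresponds to d ≈ 0.646, which matches the effect size given in the Results.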
Statistical analyses of the behavioral data were conducted using JASP 0.10.2.0 software, and statistical analyses of EEG data were conducted in MATLAB R2017b using the FieldTrip toolbox (version 20190904).
Data availability
The full dataset presented here, including demographics, behavioral and EEG data, as well as the MATLAB scripts used in the ERP analyses, is available at Zenodo, https://zenodo.org/record/7215812#.ZC-4N3ZBzIU.
Results
TMR in REM improves rule abstraction
We examined baseline performance (presleep) using an ANOVA with the factors cuing condition (cued/noncued) and sleep stage (SWS/REM). No differences or interaction were found (smallest p value = 0.666). Figure 2B and Extended Data Figure 2-1a contain full statistical details.
Figure 2
TMR in REM improves rule abstraction. A, SVRT accuracy change overnight (postsleep day 1 to presleep) and across the week (postsleep day 7 to postsleep day 1) is plotted for each sleep stage (SWS and REM) and cuing condition (noncued and cued). A repeated-measures ANOVA revealed a significant sleep stage * cuing condition interaction (p = 0.013) and a simple main-effects analysis showed better performance for problems cued in REM compared with problems cued in SWS (p = 0.044; Extended Data Fig. 2-1). B, Left, In SWS problems there was no difference between cued and noncued accuracy in any individual session (p > 0.3). Right, In REM problems there was no difference between cued and noncued conditions on day 1 (p = 0.550), but at day 7, accuracy was higher on cued compared with noncued problems (p = 0.002). Mean and SEM are depicted (Extended Data Figs. 1-1, 2-2). Statistical significance: *p < 0.05, **p < 0.01.
Figure 2-1
a, SVRT accuracy at baseline (presleep). ANOVA with Cuing (cued/noncued) and Sleep stage (REM/SWS) as factors. b, TMR benefit. Repeated measures ANOVA on retention interval (overnight/week) and Cuing (cued/noncued) and Sleep stages (SWS/REM). Shaded areas highlight significant results. Overnight benefit is calculated as the difference between Postsleep day 1 and presleep, and the week performance is calculated as the difference between both postsleep sessions (Day 7–Day 1). c, TMR benefit post hoc analysis. Paired t test for REM conditions to understand the differences between cued and noncued problems per session (Postsleep Day1 and Day 7) and also the cuing benefit overnight (difference between Postsleep Day1 and Presleep), a week after (Postsleep Day 7 vs Presleep) and also the difference between Day 7 and Presleep.
Figure 2-2
Accuracy on the SVRT per group and session.
To assess the impact of cuing on consolidation across a retention interval, we compared SVRT performance change (overnight accuracy change, postsleep day 1 – presleep; across a week, postsleep day 7 – postsleep day 1) using a repeated-measures ANOVA with factors sleep stage (SWS and REM), cuing condition (cued and noncued), and retention interval (overnight and across a week postsleep) as repeated measures. This showed a significant sleep stage * cuing condition interaction (F(1,26) = 6.091, p = 0.020, η2 = 0.013), with no other factor or interaction being significant (smallest p value = 0.128, Fig. 2A, Extended Data Fig. 2-1b). This indicates that cuing had different effects when applied in SWS and REM. To investigate this, we conducted a simple main effects test (sleep stages × cuing), which revealed better performance in the cued condition for REM than SWS (F(1,26) = 4.463, p = 0.044), with no differences between SWS and REM in the noncued control condition (F(1,26) = 0.774, p = 0.387; Fig. 2A). This result could suggest that cuing benefited rule abstraction when delivered during REM sleep but not SWS.
To better understand this pattern of results, and also to gain statistical power, we next analyzed each sleep stage separately using a two-way ANOVA with factors cuing condition (cued and noncued) and retention interval (overnight and across a week postsleep). For SVRT problems cued in SWS, there was no effect of cuing, session, or interaction between these (smallest p value = 0.198). For problems cued in REM sleep, however, we found a significant cuing effect (F(1,26) = 7.930, p = 0.009, η2 = 0.019), indicating that performance improvements were greater for cued problems than for noncued problems. There was no effect of session or cuing × session interaction (smallest p value = 0.231). To further understand the origin of the cuing effect in REM sleep, we performed paired t tests (cued vs noncued) on accuracy at each session (presleep, postsleep day 1, and postsleep day 7; Fig. 2B and Extended Data Fig. 2-3 contain full statistical results). Accuracy was higher for REM cued problems than for noncued problems only at postsleep day 7 (t(26) = 3.357, p = 0.002, Cohen's d = 0.646).
Figure 2-3
Number of trials used per participant and condition.
Overall, these findings suggest that reactivating problems during REM leads to a significant advantage in rule knowledge after seven days and nights.
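The effect size reported above is consistent with Cohen's d computed on the paired difference scores (often written d_z), for which d_z = t/√n: 3.357/√27 ≈ 0.646. The following Python sketch illustrates a paired comparison of cued versus noncued accuracy and this conversion. It is not the authors' analysis code; the function names and the toy accuracy values are hypothetical and serve only to show the arithmetic.

```python
import math
import statistics


def paired_t_and_dz(cued, noncued):
    """Return (t, d_z, df) for a paired comparison of two matched samples."""
    diffs = [c - n for c, n in zip(cued, noncued)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample SD (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))   # paired t statistic
    d_z = mean_diff / sd_diff                  # Cohen's d on difference scores
    return t, d_z, n - 1


def dz_from_t(t, n):
    """Convert a paired t statistic to d_z given the number of pairs."""
    return t / math.sqrt(n)


# Hypothetical per-participant accuracies at postsleep day 7 (toy data only).
cued_day7 = [0.80, 0.75, 0.90, 0.70, 0.85, 0.65]
noncued_day7 = [0.70, 0.72, 0.80, 0.71, 0.75, 0.60]

t, d_z, df = paired_t_and_dz(cued_day7, noncued_day7)
print(f"toy data: t({df}) = {t:.3f}, d_z = {d_z:.3f}")

# Reported values: t(26) = 3.357 with 27 participants.
print(f"d_z from reported t: {dz_from_t(3.357, 27):.3f}")  # 0.646
```

Because d_z depends only on the mean and SD of the within-participant differences, it describes the consistency of the cuing benefit across participants rather than its size in raw accuracy units.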
Event-related potentials in REM differ between control and experimental sounds
To examine neural processing associated with TMR cues, we plotted sound-evoked ERPs for each sleep stage of cuing (SWS and REM) and sound category (control and experimental) at Cz for illustration purposes (Fig. 3). Topographies showing the spatial distribution of significant channels over time are provided in Figure 4 for all EEG channels. We analyzed a large time window (0–2000 ms), which includes all known auditory event-related potentials (Winkler et al., 2013) and has previously been associated with processing auditory stimuli in both NREM and REM sleep (Campbell and Muller-Gass, 2011). To determine whether the response to control and experimental sounds differed in each sleep stage, we performed a cluster analysis on the ERP amplitudes (all channels, not averaged). This revealed a significant difference between experimental (familiar) and control (new) sounds in REM sleep (cluster corrected for multiple comparisons, p = 0.048) but not in SWS (all p values > 0.05). This negative cluster ranged from 228 to 400 ms. The larger ERP amplitude elicited by new sounds than by familiar sounds is consistent with an ability to detect novelty. Our observation of this response in REM but not SWS is in keeping with prior literature showing greater responsivity in REM compared with SWS (for review, see Ibáñez et al., 2009).
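To illustrate the logic of a cluster-based permutation test of this kind, the following Python sketch implements a deliberately simplified version: a single channel, clusters defined over adjacent time points only, cluster mass as the sum of t values, and random sign-flipping of each participant's experimental-minus-control difference wave. The reported analysis used all channels, so this is not the authors' pipeline; the threshold of 2.0, the number of permutations, and the toy data are assumptions chosen for illustration.

```python
import math
import random


def t_values(diffs):
    """One-sample t statistic per time point; diffs[s][t] = subject s, time t."""
    n = len(diffs)
    out = []
    for col in zip(*diffs):
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / (n - 1)
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out


def cluster_masses(tvals, threshold):
    """Return (start, end, mass) for runs of adjacent same-sign points with |t| > threshold."""
    clusters, start, mass, sign = [], None, 0.0, 0
    for i, t in enumerate(tvals + [0.0]):      # trailing 0.0 closes a final run
        s = 1 if t > threshold else -1 if t < -threshold else 0
        if s != sign:
            if sign != 0:
                clusters.append((start, i - 1, mass))
            start, mass, sign = i, 0.0, s
        if s != 0:
            mass += t
    return clusters


def cluster_permutation_test(diffs, threshold=2.0, n_perm=500, seed=0):
    """Compare each observed cluster mass with the null distribution of maximum masses."""
    rng = random.Random(seed)
    observed = cluster_masses(t_values(diffs), threshold)
    null_max = []
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            sign = rng.choice((-1, 1))         # swap condition labels within a subject
            flipped.append([sign * v for v in row])
        masses = [abs(m) for _, _, m in cluster_masses(t_values(flipped), threshold)]
        null_max.append(max(masses, default=0.0))
    results = []
    for start, end, mass in observed:
        p = (1 + sum(m >= abs(mass) for m in null_max)) / (n_perm + 1)
        results.append((start, end, mass, p))
    return results


# Toy difference waves: 12 subjects, 30 time points, negative effect at points 10-17.
data_rng = random.Random(1)
toy = [[data_rng.gauss(0, 1) + (-3.0 if 10 <= t <= 17 else 0.0) for t in range(30)]
       for _ in range(12)]

for start, end, mass, p in cluster_permutation_test(toy):
    print(f"cluster {start}-{end}: mass = {mass:.1f}, p = {p:.3f}")
```

Taking the maximum cluster mass in each permutation is what controls for multiple comparisons across time points: an observed cluster is judged against the largest cluster that arises anywhere in the window under the null hypothesis.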
Event-related potentials at Cz during Targeted Memory Reactivation. Cz ERPs in SWS (top, blue) and REM (bottom, red) elicited by control (new) and experimental (task related) sounds. The vertical dashed line at zero indicates cue onset (200 ms long). A cluster analysis revealed a significant difference between ERPs in response to control and experimental sounds in REM between 228 and 400 ms (cluster corrected *p = 0.048). Data are depicted as mean ± SEM (n = 26). Statistical significance: *p < 0.05, **p < 0.01.
Spatial distribution of channels with a statistically significant difference between experimental and control sounds during REM. Data are displayed as the averaged difference (n = 26) between experimental and control sound ERPs in 20 ms time bins. The asterisk (*) indicates the position of a significant channel. The time-frequency cluster permutation analysis for these data is shown in Extended Data Figure 4-1.
To probe the data further, we performed a time-frequency analysis per sleep stage in the same time window (0–2000 ms), choosing relevant frequency bands based on previous work: the theta band (4–8 Hz) and spindle band (9–15 Hz) for SWS, and the lower-beta band (13–16 Hz) for REM sleep. Cluster statistics revealed no significant clusters for any frequency band or sleep stage (smallest p value = 0.052). The full list of results is provided in Extended Data Figure 4-1.
Figure 4-1
Time-frequency cluster permutation analysis. When more than one cluster is present, the lowest p value was selected. A hyphen (-) indicates when no clusters are found. No statistically significant clusters were found.
Does cuing in each sleep stage interfere with consolidation of problems cued in the other?
Because we applied TMR in both SWS and REM (although stimulating different problems in each stage), we were interested to know whether TMR in REM might have obscured or interfered with the effects of TMR in SWS. In the case of direct interference, we might expect a negative correlation between the extent to which participants benefit from REM TMR and the extent to which they benefit from SWS TMR. To test for this, we looked for a relationship between performance on problems cued in SWS and REM in two different ways, using overnight gain and using TMR cuing benefit. First, we ran a correlation between overnight performance change (difference between postsleep and presleep) for problems cued in SWS and overnight performance change for problems cued in REM. This showed no significant correlation (r = −0.162, p = 0.420). Next, we calculated the cuing benefit (difference between performance on cued and noncued problems) for SWS-related problems and REM-related problems at each session and across sessions, to check whether TMR-related improvements in REM problems were obtained at the expense of cuing benefit in problems cued in SWS. This showed no significant relationships (p > 0.05, uncorrected; refer to Table 2). These results indicate that the extent of TMR-related consolidation in REM does not predict any specific deficit in the benefit accrued from equivalent cues in SWS.
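The interference check described above amounts to correlating two per-participant scores. The Python sketch below computes the cuing benefit (cued minus noncued) for each stage, the Pearson correlation between the two benefits, and the t statistic corresponding to a given r. The function names and the toy values are hypothetical, and the final line assumes n = 27 for the reported overnight correlation, which the text does not state explicitly for this test.

```python
import math


def cuing_benefit(cued, noncued):
    """Per-participant cuing benefit: cued minus noncued performance."""
    return [c - n for c, n in zip(cued, noncued)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, n):
    """t statistic (df = n - 2) for testing r against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Hypothetical overnight accuracy changes per participant (toy data only).
sws_cued = [0.05, -0.02, 0.10, 0.00, 0.03, -0.04]
sws_noncued = [0.02, 0.01, 0.04, 0.03, -0.01, 0.00]
rem_cued = [0.08, 0.04, -0.01, 0.06, 0.02, 0.07]
rem_noncued = [0.01, 0.03, 0.02, -0.02, 0.00, 0.04]

sws_benefit = cuing_benefit(sws_cued, sws_noncued)
rem_benefit = cuing_benefit(rem_cued, rem_noncued)
r = pearson_r(sws_benefit, rem_benefit)
print(f"toy data: r = {r:.3f}, t({len(sws_benefit) - 2}) = {r_to_t(r, len(sws_benefit)):.3f}")

# Reported overnight correlation, assuming n = 27 participants.
print(f"t for r = -0.162: {r_to_t(-0.162, 27):.3f}")  # -0.821
```

Under direct interference, a larger REM benefit would go with a smaller SWS benefit, giving a clearly negative r; a small t statistic like the one above does not support that pattern.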
There is no relationship between time spent in nonmanipulated REM sleep and performance on problems cued in SWS
It could be argued that successive TMR in SWS and REM might have curtailed the amount of nonmanipulated REM available to further advance any consolidation processes initiated by TMR in SWS, thus disrupting any potential benefits from this manipulation. We inspected sleep architecture in relation to TMR and found that 25 of 26 participants had a period of nonmanipulated REM sleep after REM cuing had terminated, lasting an average of 65.9 min (ranging from 24 to 117.5 min). Furthermore, the amount of nonmanipulated REM sleep in each participant was not correlated with performance on SWS cued problems on either postsleep day 1 (r = 0.284, p = 0.160) or postsleep day 7 (r = 0.166, p = 0.419).
Relation between rule abstraction and NREM graphoelements
Sleep architecture data from all 27 participants are presented in Table 1. Slow oscillations and sleep spindles are thought to mediate TMR-related benefits to memory consolidation (Schouten et al., 2017; Cairney et al., 2018; Göldi et al., 2019). To determine whether the same was true for rule abstraction, we counted the number of slow oscillations and sleep spindles in NREM sleep for each participant and checked for correlations between each of these and the SVRT performance change for problems cued in SWS and REM, as well as the control noncued problems for each sleep stage. In line with the observation that TMR in SWS did not improve rule abstraction, we found no correlation between performance on the SVRT task and either spindles or slow oscillations (all p values ≥ 0.1, uncorrected; see Table 4).
Sleep architecture (n = 27)
Correlations between cuing benefit* in REM and SWS
Spindles and slow oscillations identified in epochs after control and experimental sounds
Spindles and slow oscillations summary, averaged across participants and channels separately for control and experimental epochs
Next, we wanted to determine whether TMR cuing altered spindles or slow oscillations in a way that related to subsequent changes in performance on our task. We thus calculated the number and duration (samples) of spindles and slow oscillations (SOs) in the 3 s epoch following TMR stimulation for each condition (experimental and control). No significant results were found for spindles (smallest p value = 0.06; Fig. 5, topography). However, two significant clusters were found for the number of SOs, one in the left hemisphere (t = −9.08, p = 0.007) and one in the right hemisphere (t = −6.50, p = 0.012; Fig. 5). Both indicated a higher number of SOs after control than experimental sounds. We then correlated the mean number of SOs detected in each cluster with behavioral performance change for cued and noncued items (for both REM and SWS), both overnight and over the subsequent week. This revealed a significant positive relationship with performance change for both the right hemispheric cluster (Rho = 0.44, p = 0.03) and the left hemispheric cluster (Rho = 0.42, p = 0.04), uncorrected. Overall, these data appear to suggest that cuing with the experimental TMR tone leads to a reduction in SOs over these electrodes, and this seems to be associated with TMR benefit. However, because we had no a priori hypothesis to this effect, and the correlations do not survive correction for multiple comparisons, we feel this should be treated with caution.
Spindles and slow oscillations evoked by TMR. Top row, The average of differences in spindles following experimental and control TMR cues. Bottom row, The same for slow oscillations. Durations are shown on the left and counts are shown on the right. Blue colors indicate higher duration/count for control than experimental. Significant clusters are highlighted with a white star.
Image category did not affect SVRT performance
To determine whether being associated with the face/object sounds versus the landscape/nature sounds had any impact on behavior, we directly compared performance on problems associated with faces and landscapes, regardless of sleep stage or cuing condition. There were no differences in performance between the two. We conducted a two-way repeated-measures ANOVA on the raw accuracy values with the factors category (faces and landscapes) and session (presleep, postsleep day 1 and postsleep day 7). There was no effect of category (F(1,26) = 0.362, p = 0.553, η2 = 0.003) or session (F(1,26) = 2.054, p = 0.139, η2 = 0.007), and no interaction (F(1,26) = 0.253, p = 0.778, η2 = 0.001). The same analysis was conducted on the performance changes (overnight, over a week, and overall change), with the Greenhouse–Geisser sphericity correction. Similarly, no effect of category (F(1,26) = 0.365, p = 0.551, η2 = 0.004) or session (F(1,26) = 0.610, p = 0.480, η2 = 0.004) was found, and there was no interaction (F(1,26) = 0.165, p = 0.729, η2 = 0.002). We ran paired t tests between the same time points in each category (e.g., faces at presleep vs landscapes at presleep). No differences were found (all p values > 0.4, uncorrected).
Discussion
This study shows that rule abstraction, one of the building blocks of human reasoning, can be facilitated by applying targeted memory reactivation during sleep. Interestingly, when different problems were cued in SWS and REM within the same night, only the problems cued in REM benefitted from off-line rehearsal, shedding light on a possible role for previously detected reactivation during REM (Maquet et al., 2000; Louie and Wilson, 2001; Mainieri et al., 2019). Furthermore, we found that REM TMR–mediated facilitation of abstraction requires time to emerge, with cued problems showing a significant advantage over noncued problems only 1 week after the manipulation. This is important, because it joins a small but growing literature suggesting that some sleep-related memory benefits may require more than just one episode of sleep to emerge (Groch et al., 2017; Cairney et al., 2018).
Abstraction underpins the ability to categorize items and generalize rules to new, never before seen exemplars. This is a core component of fluid intelligence (Otero, 2017) and is particularly important when one is faced with a new problem that cannot be solved exclusively by prior knowledge. Our data appear to show a dissociation between REM and SWS, with TMR in the former but not the latter facilitating performance on a complex task requiring rule abstraction and pattern categorization. Unmanipulated SWS has been shown to be involved in both quantitative (Rasch and Born, 2013) and qualitative changes to recently encoded memories (Wagner et al., 2004; Lau et al., 2010; Durrant et al., 2011, 2013; Wilhelm et al., 2013; Kirov et al., 2015), whereas REM has been suggested to be more involved with qualitative changes, such as forming unexpected links between different memories or concepts (Lewis et al., 2018). This possibility is supported by studies showing that REM duration predicts visual abstraction (Lutz et al., 2017), category learning (Djonlagic et al., 2009), lexical integration (Tamminen et al., 2017), and grammar learning (Batterink and Paller, 2017), all of which are highly integrative forms of memory. Our finding with respect to REM is also in line with a previous review suggesting that abstraction of explicit rules based on prior knowledge is often linked to REM sleep (Lerner and Gluck, 2019) and extends these ideas by providing clues to the underlying mechanisms of REM-dependent rule abstraction. In addition, one study demonstrated that TMR in SWS can actually impair the abstraction of grammar-like transition statistics (Hennies et al., 2017), suggesting that promotion of memory for specific episodes through reactivation in SWS may disrupt the abstraction of generalized statistics. Together with this literature, our findings suggest that REM TMR may have the capacity to directly promote abstraction. 
Supporting this, studies using REM TMR to investigate qualitative changes, such as the affective tone of emotional memories (Rihm and Rasch, 2015; Lehmann et al., 2016) and the generalization/integration of pictures with emotional content (Sterpenich et al., 2014), typically do find a benefit from REM TMR, as did our current study. If abstraction-like processing turns out to be the main function of REM for memory, that could explain why most REM TMR studies have shown little or no benefit to memory consolidation (for a meta-analysis, see Hu et al., 2019): such studies typically assessed quantitative, rather than qualitative, changes and thus did not test abstraction.
In the current study, although TMR in REM facilitated rule abstraction, TMR in SWS did not. Given this result, it might be tempting to conclude that TMR in SWS does not facilitate this kind of abstraction. However, we cannot exclude the possibility that cuing problems in SWS triggered a consolidation process that would have facilitated abstraction but was disrupted by subsequent cuing in REM. We ran several analyses to investigate this possibility and found that there is no relationship between the extent to which SVRT performance benefitted from cuing in REM and cuing in SWS. We also found that the vast majority of participants had epochs of nonmanipulated REM sleep after REM cuing had ceased, which presumably provided an opportunity for items that had been cued in SWS to continue their consolidation in REM as needed. Nonetheless, we still cannot rule out some kind of interference and thus remain cautious in our interpretation. We therefore conclude only that REM TMR is sufficient to start a consolidation process that facilitates rule abstraction and cannot draw conclusions about the impacts of SWS TMR on this process based on the current data alone.
Regarding the timing of the TMR effects, our data suggest that the impact of TMR may continue to unfold for at least a week, with performance on cued and noncued problems only becoming significantly different after that temporal delay. Notably, we did not test performance between days 1 and 7, so we do not know how quickly this process unfolds. If qualitative changes in memory representations, such as abstraction, require longer periods of time to evolve (Sterpenich et al., 2014; Lutz et al., 2017), then they may escape detection by the commonly used 12 h test-retest paradigm. Prior studies have considered longer test periods and have shown that TMR-related benefits sometimes disappear over a week (Shanahan et al., 2018), but can also persist over this period (Hu et al., 2015; Groch et al., 2017; Simon et al., 2018). Our current study builds on these reports by showing that the benefit to abstraction, which was not significant at day 1 postsleep, became significant by day 7. This is in keeping with a study of emotional processing, which showed that the impact of NREM TMR on emotional content was amplified across a week (Groch et al., 2017) and also with our own work on the serial reaction time task, which shows that benefits from TMR can emerge after 10 d or more (Rakowska et al., 2021).
Building on a model of synaptic plasticity across brain states (Redondo and Morris, 2011; Seibt and Frank, 2019), we have recently proposed a series of plasticity-related events that take place in both NREM and REM that could explain why the effect of sleep on memory consolidation may require extended periods of time before it becomes detectable (Pereira and Lewis, 2020). According to a previous framework (Seibt and Frank, 2019), neuronal ensembles associated with the task are tagged during wakeful encoding. During subsequent NREM reactivation, mRNAs or other plasticity-related products (PRPs) are captured by these tagged synapses. Finally, in subsequent REM, these PRPs are translated into proteins that enable synapses to undergo intense remodeling. In light of our current results, we speculate that applying TMR in REM might bypass the need for PRP capture in NREM, instead promoting PRP capture and translation at task-related synapses. Given the time-consuming nature of these processes, multiple nights of sleep could be required before measurable behavioral effects emerge. Of course, this does not explain why TMR cuing in SWS, which might reasonably be expected to result in extra PRP capture by task-related synapses, did not result in a behavioral benefit. We can only speculate that such PRP capture is not sufficient in the case of our abstraction task. Alternatively, it is also possible that cuing in REM subsequent to SWS somehow interfered with consolidation such that PRPs captured during SWS cuing were not subsequently translated. More work will be needed to disentangle such effects.
Our ERP analysis complements our behavioral findings by revealing differential neural responses to experimental and control stimuli in REM but not SWS. These differential responses were found between 228 and 400 ms after cue onset, a time window during which auditory stimuli are known to be extensively processed in both NREM and REM sleep (Campbell and Muller-Gass, 2011) and which is also associated with the P300 component (Picton, 1992). The P300 is thought to reflect higher-order cognitive processing related to selective attention and resource allocation, with its amplitude proportional to the amount of attentional resource recruited for scrutiny of a given stimulus (Ibáñez et al., 2009). The P300 has also been detected during REM, with larger peak amplitudes occurring for rare sounds in the oddball paradigm (Cote and Campbell, 1999). Our data mirror this result by showing that new control sounds elicited greater P300 waves than familiar task-related sounds. Interestingly, the P300 has been found in response to hearing one's own name in REM sleep but not in response to hearing another name. This could indicate that some level of cognitive processing persists during REM (Bastuji et al., 2002). The fact that we observed a difference between P300 responses to familiar and unfamiliar sounds in REM but not in SWS is therefore in keeping with the literature. Other authors have interpreted such results as suggesting that stimuli are processed at a deeper, more cognitive level during REM (for review, see Ibáñez et al., 2009).
Conclusion
In sum, we found that TMR in REM is sufficient to benefit a visual reasoning task commonly used in the field of artificial intelligence (Fleuret et al., 2011; Ellis et al., 2015) but never before tested in a sleep study. Furthermore, ERPs suggested a deeper level of processing in REM than SWS, and behavioral findings suggest that the process started by TMR in REM requires more than one night of sleep to unfold. These findings open exciting new avenues for exploring TMR as a tool to enhance higher-order cognitive functions such as abstraction, a core component of fluid intelligence and creativity.
Footnotes
*S.I.R.P. and L.S. share senior authorship.
This work was supported by Engineering and Physical Sciences Research Council Grant EP/R030952/1 and European Research Council Grant SolutionSleep REP-SCI-681607-1. We thank Scott Lowe for the Synthetic Visual Reasoning Task stimuli, Miguel Navarrete for helping with the targeted memory reactivation scripts, Mahmoud Eid-Abdellahi for assisting in the EEG data analysis, Matthias Gruber for commenting on an earlier version of the manuscript, and Martyna Rakowska for helping with a revision.
The authors declare no competing financial interests.
Correspondence should be addressed to Penelope A. Lewis at LewisP8{at}cardiff.ac.uk