Abstract
Relational memory, the ability to make and remember associations between objects, is an essential component of mammalian reasoning. In relational memory tasks, it has been shown that periods of offline processing, such as sleep, are critical to making indirect associations. To understand biophysical mechanisms behind the role of sleep in improving relational memory, we developed a model of the thalamocortical network to test how slow-wave sleep affects performance on an unordered relational memory task. First, the model was trained in the awake state on a paired associate inference task, in which the model learned to recall direct associations. After a period of subsequent slow-wave sleep, the model developed the ability to recall indirect associations. We found that replay, during sleep, of memory patterns learned in the awake state increased synaptic connectivity between neurons representing the item that was overlapping between tasks and neurons representing the unlinked items of the different tasks; this forms an attractor that enables indirect memory recall. Our study predicts that overlapping items between indirectly associated tasks are essential for relational memory, and sleep can reactivate pathways to and from overlapping items to the unlinked objects to strengthen these pathways and form new relational memories.
SIGNIFICANCE STATEMENT Experimental studies have shown that some types of associative memory, such as transitive inference and relational memory, can improve after sleep. Still, it remains unknown what specific mechanisms are responsible for these sleep-related changes. In this new work, we addressed this problem by building a thalamocortical network model that can learn relational memory tasks and that can be simulated in awake or sleep states. We found that memory traces learned in the awake state were replayed during slow waves of NREM sleep and revealed that replay increased connections to and from overlapping memory items to form new relational memories. Our work discovered specific mechanisms behind the role of sleep in associative memory and made testable predictions about how sleep augments associative learning.
- learning and memory
- memory consolidation
- relational memory
- sleep
- synaptic plasticity
- transitive inference
Introduction
The ability to form indirect associations between learned items with overlapping elements highlights an important part of abstract problem solving. This type of learning, known as transitive inference, is a fundamental feature of relational memory (DeVito et al., 2010). For example, one may watch a movie (Object A) and experience a feeling of familiarity about a certain actor (Object B), giving rise to the question of what movie that actor has been in previously (Object C). This type of memory, where the premises that “A goes with B” and “B goes with C” are learned, represents a type of transitive inference where the indirect association (that “A goes with C”) is not inherently learned but is inferred by the subject. Despite the seeming complexity of the task, it has been shown that rats, primates, and humans are capable of performing transitive inference and relational memory tasks (Vasconcelos, 2008; DeVito et al., 2010). Importantly, depending on the type of task, the ability to connect indirect associations or inferences may not be explicitly acquired immediately after training (Walker et al., 2002; Ellenbogen et al., 2007).
Empirical studies suggest that offline processing, such as during sleep, is important in forming indirect associations (Ellenbogen et al., 2007; Werchan and Gómez, 2013). Sleep is a principal component behind many types of memory consolidation and plays an important role in learning (Maquet, 2001; Walker and Stickgold, 2004; Ji and Wilson, 2007; Klinzing et al., 2019). The role of non-rapid eye movement (NREM) sleep in learning and memory has been shown to be significant, aiding in consolidation of declarative memories and memories for complex motor learning tasks (Walker et al., 2003; Diekelmann and Born, 2010; Miyamoto et al., 2016). A central hypothesis for memory improvement during NREM sleep is that replay or reactivation of learned synaptic memory traces during sleep oscillations (spindles or slow waves) strengthens synaptic traces of these labile memories (Wei et al., 2016, 2018; González et al., 2020). Sleep has been shown to augment problem solving (Walker et al., 2002; Wagner et al., 2004; Lau et al., 2011; Nieuwenhuis et al., 2013; Lewis et al., 2018) and hypothesized to create cognitive schemata by replaying memories with overlapping elements, strengthening the connections between overlapping memories and leading to generalization of learned concepts (Lewis and Durrant, 2011; Lewis et al., 2018).
Accumulating evidence suggests that sleep may play a critical role in learning relational memory tasks (Ellenbogen et al., 2007; Lau et al., 2010, 2011; Werchan and Gómez, 2013; Chatburn et al., 2014; Studte et al., 2015). One study showed that duration of slow-wave sleep (SWS) is significantly correlated with learning indirect associations (Lau et al., 2010). Another study tested a subject's ability to relate abstract concepts through generalization, and found improvements after a daytime nap (Lau et al., 2011). It has also been shown that sleep can increase a subject's ability to perform hierarchical transitive inference, where A > B and B > C are learned premises and A > C is a tested abstraction (Ellenbogen et al., 2007).
Despite the progress made in understanding the role of sleep in increasing relational memory performance, it remains unknown what biophysical mechanisms account for this function. Here, using a biophysical model of the thalamocortical network, we tested the role of NREM sleep on the network's ability to perform a relational memory task. We found that the network can form indirect inferences, which were never trained directly, following periods of SWS. We further revealed that sleep replay increases connections to/from a shared conjunctive memory unit, giving rise to an increase in performance during relational memory tasks. Ultimately, a theoretical understanding of how sleep aids with relational memory would guide development of experiments, where these findings can be tested in vivo.
Materials and Methods
Thalamocortical network model
Network architecture
The base thalamocortical network used in this new study has been described in our other works (Krishnan et al., 2016; Wei et al., 2016, 2018; González et al., 2020). The network was composed of two connected populations of neurons: thalamic and cortical. Unlike in previous work, we constructed two layers (functional regions) for both the thalamic and cortical components of the network, and we did not rely on local connectivity but rather on random connectivity between neurons. The thalamic part of the network was broken down into two populations (layers) and contained a total of 60 excitatory thalamocortical relay neurons (TC cells) and 60 inhibitory reticular neurons (RE cells). Layer 1 contained 40 TC neurons and 40 RE neurons, whereas layer 2 contained 20 TC and 20 RE neurons. The cortical part of the network was also broken down into two layers, representing two functionally different cortical areas. In layer 1 (representing primary visual cortex), there were 200 excitatory pyramidal neurons (PY cells) and 40 inhibitory interneurons (INs). In layer 2 (representing associative cortex), there were 100 PY neurons and 20 IN cells. Connectivity was random; excitatory connections were mediated by NMDA and AMPA connections, while inhibitory connections were mediated by GABAA and GABAB connections. All connections are summarized in Table 1 and described as follows.
Table 1. Thalamocortical network connection architecture
To describe specific connections, starting in the thalamus, RE neurons received AMPA connections from TC neurons and GABAA connections from RE neurons as well as AMPA connections from PY neurons in the associated cortical layer. AMPA synapses between TC and RE cells had connection probability 10% in layer 1 and 20% in layer 2. RE cells were connected to each other through GABAA synapses within the same layer with probability 6.25% in layer 1 and 12.5% in layer 2. Finally, cortical PY neurons synapsed via AMPA connections onto RE cells with connection probability 10% and 20% in layer 1 and layer 2, respectively. TC cells received connections from RE cells through both GABAA and GABAB synapses, as well as AMPA connections from PY neurons in the associated cortical layer. Each TC cell received a connection from an RE cell with 10% and 20% probability in layer 1 and layer 2, respectively. Each TC cell also received an AMPA synapse from cortical PY neurons, with connection probability 12.5% and 25% in layer 1 and layer 2, respectively.
In the cortex, PY neurons received nonplastic AMPA connections from TC cells, plastic and nonplastic AMPA connections from other PY neurons, and GABAA connections from IN neurons in the same cortical layer. The TC cells in layer 1 connected to PY neurons in layer 1 of cortex with a connection probability of 10%. In layer 2, this connection probability was increased to 20%. Thus, considering size difference between layer 1 and layer 2, each PY neuron received about the same number of TC inputs. In layer 1 of cortex, each PY neuron received feedback connections from layer 2 PY neurons with a connection probability of 25%. In addition, each layer 1 PY neuron received two inhibitory GABAA connections from INs in layer 1 of cortex. In layer 2, each neuron received a feedforward plastic AMPA connection from layer 1 PY neurons with probability 20%, and a recurrent plastic AMPA connection from layer 2 PY neurons with probability 50%. Each plastic AMPA connection in cortex was also accompanied by a nonplastic NMDA excitatory synapse. In addition, layer 2 PY neurons received 13 GABAA connections from local INs.
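The statement that each PY neuron received about the same number of TC inputs in both layers can be checked with a short calculation. The sketch below assumes that each possible connection is drawn independently with the stated probability (the text says only that connectivity was random); the helper name `expected_inputs` is illustrative and not taken from the model code.

```python
# Expected in-degree under independent random connectivity (an assumption):
# expected inputs = number of presynaptic cells x connection probability.

def expected_inputs(n_presynaptic: int, probability: float) -> float:
    """Expected number of inputs a neuron receives from a presynaptic population."""
    return n_presynaptic * probability

# Population sizes and probabilities are taken from the text.
layer1_tc_inputs = expected_inputs(40, 0.10)  # layer 1: 40 TC cells, 10%
layer2_tc_inputs = expected_inputs(20, 0.20)  # layer 2: 20 TC cells, 20%

print(layer1_tc_inputs, layer2_tc_inputs)
```

Both layers give an expected four TC inputs per PY neuron, which is the balance the paragraph describes.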
Finally, each IN received nonplastic AMPA connections from TC cells in thalamus, with connection probability 3.75% in layer 1 and 7.5% in layer 2. In addition, all INs received nonplastic NMDA and AMPA synapses from PY neurons in both layer 1 and layer 2 of cortex. Layer 1 PY to layer 1 and layer 2 IN AMPA and NMDA connections occurred with a probability of 5% and 10%, respectively. Layer 2 PY to layer 1 and layer 2 IN AMPA and NMDA connections occurred with a probability of 5% and 50%, respectively. The latter 50% connections were much weaker than other connections.
Wake-sleep transition
The transition between wake and sleep was modeled after previous work which describes the role of neuromodulators [acetylcholine (ACh), histamine (HA), and GABA] during the sleep and waking state needed to observe sleep rhythms canonical of SWS (Krishnan et al., 2016). ACh modulated potassium leak currents in all neuron types and excitatory AMPA connections within cortex. HA modulated the strength of the hyperpolarization-activated mixed cation current in TC neurons and GABA modulated the strength of inhibitory GABAergic synapses in both thalamus and cortex. The levels of ACh and HA were reduced during Stage 3 (N3) SWS, while GABA levels were increased compared with the awake state. The exact levels of each neuromodulator were chosen by conducting a parameter sweep and observing which parameters resulted in the appearance of canonical slow waves. In addition, to simulate Stage 2 (N2) sleep characterized by spindles, neuromodulation parameters were determined by parameter sweep looking for the local field potential (LFP) power in the spindle frequency band (7-16 Hz in our study). Parameters for N2 sleep were intermediate between waking and N3 states.
Intrinsic currents
All neurons were modeled with Hodgkin-Huxley kinetics, and equations can be found in previous works (Wei et al., 2018; González et al., 2020). In cortex, PY and IN neurons possessed dendritic and axo-somatic compartments (Wei et al., 2018). Membrane potential dynamics were modeled by the following dynamical equations:
In the thalamus region of the model, TC and RE neurons were modeled by single compartment neurons with the following dynamical equation:
Our previous work gives a more detailed description of the individual currents (Krishnan et al., 2016; Wei et al., 2018).
Synaptic currents and spike-timing dependent plasticity (STDP)
Here, we describe the synaptic currents which were composed of AMPA, NMDA, GABAA, and GABAB synapses as well as the STDP rules (for more details on the specific synaptic currents, see Krishnan et al., 2016; Wei et al., 2018). The effect of ACh on AMPA and GABAA synaptic currents was described by the following equations:
STDP controlled long-term potentiation and depression of synaptic weights between PY neurons. The change in the synaptic strength (gAMPA) and amplitude of miniature EPSPs (AmEPSP) were described previously (Wei et al., 2018) as follows:
Heterosynaptic plasticity
Heterosynaptic plasticity was implemented in some simulations. To mimic heterosynaptic plasticity properties observed in vivo (Chistiakova et al., 2014; Volgushev et al., 2016), after each STDP event in which a synaptic weight was modified, we also modified the weights of remaining synapses into the same neuron to hold the total synaptic input per neuron constant. Thus, if
Memory training and testing
Training and testing of associative memories were modeled after behavioral works (Lau et al., 2010). After creating a two-layer cortical architecture, we selected the groups of neurons in each layer that correspond to each stimulus. Neuron IDs were mapped to a stimulus label as shown in Table 2. The first training phase was supervised learning. Here, an individual item was stimulated in layer 1, followed, with a 5 ms delay, by stimulation of that item in layer 2. This phase created a feedforward pathway through the network that represents an individual stimulus. Each feedforward pathway stimulation (e.g., A-A′) included 40 trials with a 500 ms gap between trials. The total length of supervised training was therefore 120 s for all 6 feedforward pathways.
Table 2. Neuron indices in cortical architecture
Following supervised training, we implemented an unsupervised associative training phase, where pairs of stimuli were presented simultaneously. This occurred by stimulating pairs of input items together (e.g., A + B, B + C, etc.) in layer 1. These pairs of items were stimulated sequentially every 500 ms with a 2 s gap between same-pair stimulations. The exact duration of associative training varied by experiment, but if associative training time was 135 s/pair, then each pair was stimulated 270 times.
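The timing of associative training can be illustrated with a small schedule generator. This is a sketch under stated assumptions: the four pairs are cycled in a fixed order (the text gives only "A + B, B + C, etc."), and "135 s/pair" is read as 270 presentation slots of 500 ms for each pair. The function name and pair labels are hypothetical.

```python
def associative_schedule(pairs, n_per_pair, interval_ms=500):
    """Return (onset_ms, pair) tuples, cycling through the pairs in order.

    One pair is presented every interval_ms, so with four pairs the same
    pair recurs every 4 * interval_ms = 2 s.
    """
    schedule = []
    for k in range(n_per_pair * len(pairs)):
        schedule.append((k * interval_ms, pairs[k % len(pairs)]))
    return schedule

pairs = ["A+B", "B+C", "X+Y", "Y+Z"]  # assumed presentation order
schedule = associative_schedule(pairs, n_per_pair=270)

ab_onsets = [t for t, p in schedule if p == "A+B"]
same_pair_gap_ms = ab_onsets[1] - ab_onsets[0]
seconds_per_pair = len(schedule) * 500 / 1000 / len(pairs)
print(len(ab_onsets), same_pair_gap_ms, seconds_per_pair)
```

Under these assumptions each pair is presented 270 times, with a 2 s gap between repeats of the same pair and 135 s of presentation slots per pair.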
Finally, there was a sleep phase. During sleep, the levels of neuromodulators were changed to induce spindles (N2) or slow oscillations (N3), and there was no external stimulation provided. Each sleep phase was followed by a testing phase, where each of the six groups was stimulated in layer 1, and the response of layer 2 neurons was measured. Stimulation was provided every 500 ms, and each group was stimulated 8 times. Performance was measured as the network's ability to recall both the direct and indirect associated item (e.g., on stimulation of A, can the network recall both B′ and C′?). In Figure 9B, we performed additional tests where Groups A, C, X, and Z were stimulated and neuron Groups B′ and Y′ were hyperpolarized to prevent activation. In another experiment, we hyperpolarized neurons from linking Groups B/B′ and Y/Y′ during sleep to simulate experiments with optogenetic inactivation.
Experimental design and statistical analyses
All analyses were performed with standard Python functions and libraries. Data are presented as mean ± SD unless otherwise stated. Each experiment was repeated with 10 network simulations from different network initializations and random seeds for purposes of statistical analyses, using standard two-sided or one-sided t tests.
Relational memory performance metrics
Here, we describe the association matrices shown in Figure 3 as well as the conversion from these matrices to an association score. To build an association matrix, individual neuronal groups were stimulated in layer 1 (e.g., item A was stimulated), and we measured the number of spikes in each of the six layer 2 groups (A′, B′, C′, X′, Y′, Z′). This number was averaged over the 10 different (initialization) network simulations and 8 testing trials within each network simulation. We only considered spikes that occurred within 150 ms of stimulation to the layer 1 groups. To compute an association score based on the association matrix, we built a binary 6 × 6 mask with 1's in the upper left and lower right 3 × 3 grids and −1's everywhere else. This mask depicts what an ideal associative matrix should look like, where activity in the top left and bottom right grids is acceptable and activity in the top right and bottom left grids is spurious. After element-wise multiplication of the mask and the associative matrix, the resultant matrix was summed up across both rows and columns. To normalize this final score, we divided the final sum by the maximum element in the association matrix multiplied by 18 (here, 18 is the number of elements that should be positive, i.e., 6 × 3, where 6 is the number of stimulated neuron groups (A, B, C, X, Y, Z) and 3 is the number of items in each triplet (A-B-C or X-Y-Z)). The final number was on a scale from −1 to 1, where a score of −1 occurs when the association matrix is the opposite of what it should be after successful learning (e.g., stimulating Group A activates X′, Y′, and Z′), an association score of 0 is true for a uniformly random matrix, and an association score of 1 indicates perfect performance on the task (e.g., stimulating Group A equally stimulates A′, B′, C′).
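The conversion from an association matrix to an association score can be sketched as follows. This is a minimal illustration of the procedure described above, not the analysis code used in the study; the function name, the row/column ordering (A, B, C, X, Y, Z), and the guard for an all-zero matrix are assumptions.

```python
def association_score(matrix):
    """Score a 6 x 6 association matrix on a scale from -1 to 1.

    matrix[i][j] is the mean spike count in layer 2 group j when layer 1
    group i is stimulated, with groups ordered A, B, C, X, Y, Z.
    """
    n = len(matrix)
    half = n // 2
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return 0.0  # assumption: a silent network scores 0
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Mask is +1 within a triplet (diagonal 3 x 3 blocks), -1 otherwise.
            same_triplet = (i < half) == (j < half)
            total += matrix[i][j] if same_triplet else -matrix[i][j]
    # Normalize by the maximum element times the 18 entries expected to be positive.
    return total / (peak * n * half)

ideal = [[1 if (i < 3) == (j < 3) else 0 for j in range(6)] for i in range(6)]
opposite = [[0 if (i < 3) == (j < 3) else 1 for j in range(6)] for i in range(6)]
uniform = [[1] * 6 for _ in range(6)]
print(association_score(ideal), association_score(opposite), association_score(uniform))
```

The three example matrices reproduce the end points described in the text: the ideal block-diagonal matrix scores 1, its complement scores −1, and a uniform matrix scores 0.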
Latency and rate analysis
In Figure 4, we show the spiking rates and latency of neurons in layer 2. To compute the latency of response, after applying a pulse of stimulation during testing, we analyzed the next 200 ms window of activity in layer 2. The latency, for each layer 2 neuron, was determined by taking the time of activation of a neuron in layer 2 and subtracting the time of stimulation in layer 1. If a neuron did not spike in the 200 ms time window, its latency was excluded from the computation. The firing rate was computed as the total number of spikes that occurred in the 200 ms window. We considered three different types of memories: direct memories (e.g., activation of neuron Group A′/C′ when B is stimulated), indirect memories (e.g., activation of neuron Group A′/C′ when C/A is stimulated, respectively), and incorrect memories (e.g., activation of neuron Group X′/Z′ when A is stimulated). For each type of memory, latencies and rates were averaged across all pairs of that type (e.g., direct memories = A-B′, B-A′, B-C′, indirect memories = A-C′, C-A′, incorrect memories = A-X′, A-Z′, C-X′, C-Z′ for the ABC triplet). We should note that this metric likely underestimated latency for the incorrect memories because a neuron that does not fire is excluded from the computation. Thus, for example, if only one incorrect neuron fired with a latency of <50 ms, then the average latency would indeed be <50 ms. This was rarely the case; nevertheless, the drop in latency of the incorrect memories was likely because of this phenomenon since the rate of firing (three spikes/stimulation) is quite low already.
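The latency and rate measures can be sketched for a single stimulation pulse as below. The function name and the input layout (one list of spike times per layer 2 neuron) are assumptions made for illustration; the 200 ms window and the exclusion of silent neurons follow the description above.

```python
def latency_and_rate(stim_time_ms, spike_times_by_neuron, window_ms=200.0):
    """Return (mean latency in ms, total spike count) for one stimulation pulse.

    Neurons with no spike inside the window do not contribute to the latency,
    which is why a sparse response can produce a misleadingly short mean latency.
    """
    latencies = []
    total_spikes = 0
    for spikes in spike_times_by_neuron:
        in_window = [t for t in spikes
                     if stim_time_ms <= t < stim_time_ms + window_ms]
        total_spikes += len(in_window)
        if in_window:
            latencies.append(min(in_window) - stim_time_ms)
    mean_latency = sum(latencies) / len(latencies) if latencies else None
    return mean_latency, total_spikes

# Three neurons: one responds, one is silent, one fires outside the window.
example_spikes = [[1040.0, 1100.0], [], [1500.0]]
mean_latency, total_spikes = latency_and_rate(1000.0, example_spikes)
print(mean_latency, total_spikes)
```

In this example only one of three neurons fires within the window, yet the mean latency is 40 ms, which illustrates the caveat about incorrect memories noted in the text.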
Weight analysis
In Figures 5 and 6A–D, we explored the synaptic connectivity matrices. Figure 5 was obtained by recording the synaptic weights between neurons for each type of connection (feedforward or recurrent). To evaluate the synaptic input to each neuron i, we computed the following equation:
Modularity analysis
A community detection algorithm was used to describe brain network changes during task learning (Alexander-Bloch et al., 2010; Mucha et al., 2010; Bassett et al., 2015). Modularity refers to the formation of cliques in a network, or series of intraconnected nodes with limited connections to other cliques (Alexander-Bloch et al., 2010). Time-dependent communities can be analyzed by measuring the structure of multislice networks, which can be thought of as a combination of individual networks that are composed of nodes that are linked in time to past and future versions of that network (Mucha et al., 2010). To perform community detection (Fig. 6E,F), we used an existing community detection algorithm (Jeub et al., 2020). First, the Leicht-Newman modularity matrix for ordered and directed layers was computed (Leicht and Newman, 2008). After this matrix was computed, the generalized Louvain method for community detection was applied (De Meo et al., 2011; Jeub et al., 2020). As a result of applying these algorithms, a network partition and community assignment graph was returned as a function of time. The algorithms aim to find a community assignment partition that maximizes the resulting modularity of the network. Two parameters were tuned to aid in this process: the coupling between temporal layers (ω = 1.0) and the intralayer resolution (γ = 1.75).
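To make the quantity being maximized concrete, the sketch below evaluates directed modularity with a resolution parameter γ for a single network slice and a given partition. It is a simplified illustration only: it does not include the interlayer coupling ω of the multislice formulation and does not perform the generalized Louvain optimization. The function name and the convention that `adj[i][j]` is the weight of the connection from node i to node j are assumptions.

```python
def directed_modularity(adj, communities, gamma=1.0):
    """Directed modularity of one slice for a given community assignment.

    Q = (1/m) * sum over same-community (i, j) of
        adj[i][j] - gamma * k_out[i] * k_in[j] / m,
    where m is the total edge weight.
    """
    n = len(adj)
    m = sum(sum(row) for row in adj)
    if m == 0:
        return 0.0
    k_out = [sum(adj[i]) for i in range(n)]
    k_in = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                q += adj[i][j] - gamma * k_out[i] * k_in[j] / m
    return q / m

# Toy network: two reciprocally connected pairs with no links between them.
adj = [[0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0]]
q_split = directed_modularity(adj, [0, 0, 1, 1])
q_merged = directed_modularity(adj, [0, 0, 0, 0])
q_split_high_res = directed_modularity(adj, [0, 0, 1, 1], gamma=1.75)
print(q_split, q_merged, q_split_high_res)
```

In the toy network, separating the two pairs yields higher modularity than merging them, and raising γ lowers the score of any given partition, which is how the resolution parameter biases the optimization toward smaller communities.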
Replay analysis
To analyze memory replay, we adopted a method from González et al. (2020). First, the LFP during sleep was computed by evaluating the average membrane potential across all pyramidal neurons in the cortex. A threshold for crossing from Up to Down state and vice versa of the slow oscillations was computed by taking the resting membrane potential (−63 mV) and subtracting the mean sleep membrane potential. After the threshold was computed, we filtered the LFP using a second-order Butterworth filter with a Nyquist frequency of 500 Hz and passband and stopband frequencies of 0 and 3 Hz, respectively. Next, we applied the threshold to find the Up to Down state and Down to Up state transition times. Activity above the threshold was denoted as an Up state.
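The final thresholding step can be sketched as follows. This illustration assumes the LFP has already been low-pass filtered and that the threshold has been computed as described; the function name, the sampling step `dt_ms`, and the example values are hypothetical.

```python
def up_state_intervals(lfp, threshold, dt_ms=1.0):
    """Return (start_ms, end_ms) for each run of samples above the threshold.

    Each start marks a Down-to-Up transition and each end marks an
    Up-to-Down transition.
    """
    intervals = []
    start = None
    for idx, value in enumerate(lfp):
        above = value > threshold
        if above and start is None:
            start = idx  # Down-to-Up transition
        elif not above and start is not None:
            intervals.append((start * dt_ms, idx * dt_ms))  # Up-to-Down
            start = None
    if start is not None:
        # The recording ended during an Up state.
        intervals.append((start * dt_ms, len(lfp) * dt_ms))
    return intervals

# Hypothetical filtered LFP samples (mV) and threshold.
lfp = [-70, -70, -60, -58, -60, -72, -71, -59, -59]
intervals = up_state_intervals(lfp, threshold=-65)
print(intervals)
```

Each returned interval corresponds to one Up state, within which replay events can then be counted.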
Once the Up and Down states were identified, we analyzed the activity within each individual Up state to calculate replay events. A spiking event was considered a replay event when a presynaptic and a postsynaptic neuron fired within a given time window (<200 ms). The order of firing (pre-post, or post-pre) was used to determine the direction of replay and to compute a directional graph between neurons, where each edge stores the number of replay events going in that direction (for details, see González et al., 2020).
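The replay criterion can be illustrated for one synaptically connected pair of neurons. The sketch assumes that the spike times passed in have already been restricted to a single Up state and that every pre/post spike pair within the window counts as one event; the exact counting procedure is given in González et al. (2020) and may differ. The function and key names are hypothetical.

```python
def count_replay_events(pre_spikes, post_spikes, window_ms=200.0):
    """Count ordered spike pairs between a presynaptic and a postsynaptic neuron.

    A pair counts as pre-to-post replay when the postsynaptic spike follows
    the presynaptic spike within window_ms, and as post-to-pre replay when
    the order is reversed.
    """
    counts = {"pre_to_post": 0, "post_to_pre": 0}
    for t_pre in pre_spikes:
        for t_post in post_spikes:
            lag = t_post - t_pre
            if 0 < lag < window_ms:
                counts["pre_to_post"] += 1
            elif -window_ms < lag < 0:
                counts["post_to_pre"] += 1
    return counts

# Hypothetical spike times (ms) within one Up state.
replay_counts = count_replay_events([10.0, 400.0], [50.0, 390.0])
print(replay_counts)
```

Applying this count to every connected pair gives the edge weights of the directional replay graph, with one count stored per direction.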
Results
Thalamocortical model of relational memory
In this work, we used a minimal thalamocortical network model to test the role of sleep in learning an unordered relational memory task (Fig. 1A,B). Cortex was modeled with a network consisting of two layers, each representing a distinct functional area of the cortex, and each including excitatory PY cells and inhibitory interneurons (INs). A two-layer cortical model was motivated by visual associative learning in the primate brain. Prior work suggests that associations are learned by recurrent synaptic connections in the parietal associative cortex (Fitzgerald et al., 2011, 2013; Aminoff and Tarr, 2015; Bjekić et al., 2019). This area of cortex receives input from primary visual cortex (Galletti et al., 2001), which shows a mostly stereotyped response on presentation of visual stimuli (Deitch et al., 2021). Thus, we constructed our model with two populations of cortical neurons (which we call layers here, when we refer to the model): the first representing visual cortex with a mostly stereotyped population response to specific stimuli, and the second representing associative cortex, with recurrent connectivity to promote associative memory learning.
Figure 1. Thalamocortical model of relational memory simulates transitions between awake and sleep states. A, Basic task setup. During associative training (left), pairs of items are presented simultaneously (A + B, B + C). The relational memory task (right) tests the ability of the network to retrieve direct (B) and indirect (C) items, when presented with item A. B, Basic network architecture: PY, excitatory pyramidal cells; IN, inhibitory interneurons; TC, thalamocortical neurons; RE, inhibitory thalamic reticular neurons. Excitatory connections terminate in a dot, whereas inhibitory connections terminate in a line. Arrows indicate the direction of connections. C, Baseline network dynamics of the 200 PY neurons and 100 INs during wake and SWS. Each row represents membrane potential over time of a single neuron. D, Zoom-in of baseline network dynamics in awake state before sleep (left), during sleep (middle; one Up state is shown), and in awake state after sleep (right). Network dynamics before and after sleep are shown for layer 2 neurons. During sleep, a canonical slow wave pattern is seen across both layers. E, Weight connectivity matrix for feedforward connections from layer 1 to layer 2 in cortex (left) and recurrent connections within layer 2 (right). Connection probability is 30% for feedforward connections and 50% for recurrent connections. White dot represents that a connection exists between two neurons. F, Two-layer cortical network architecture. There are plastic feedforward connections from layer 1 to layer 2 and plastic recurrent connections within layer 2. A subset of neurons in each layer is trained to represent individual items (e.g., neurons 10-29 [denoted neuron Group A in the text] in layer 1 represent item A, and neurons 210-219 [denoted neuron Group A′] represent item A in the second layer).
Thalamus was modeled by two populations of neurons, each including excitatory thalamocortical (TC) neurons and inhibitory reticular (RE) neurons, with bidirectional connections to their respective cortical areas (for details, see Materials and Methods). Indeed, neuroanatomical studies suggest that different subdivisions of thalamus project to different areas of cortex, with primary areas of thalamus, such as LGN, projecting bidirectionally to primary visual cortex (Briggs et al., 2007), and other subdivisions, such as the lateral posterior nucleus, connecting bidirectionally to parietal cortex (Lyamzin and Benucci, 2019). All neurons were simulated with Hodgkin-Huxley dynamics and are based on previous work (Krishnan et al., 2016; Wei et al., 2016, 2018).
Using this model, we were able to simulate three distinct states of the network: awake, Stage 2 (N2) sleep and Stage 3 (N3) sleep, by changing the level of neuromodulators (Vanini et al., 2012; Krishnan et al., 2016). Awake state was characterized by random asynchronous firing of cortical neurons, N2 sleep was characterized by spindles with occasional Down states, and N3 sleep (or SWS) was characterized by canonical slow oscillations between Up (active) and Down (silent) states (Blake and Gerard, 1937; Steriade et al., 1993; Steriade, 2006) (Fig. 1B,C; see also Fig. 8A). The thalamic component of the network primarily served the function of driving and modulating oscillations during sleep, specifically to increase synchrony of sleep slow oscillations in N3 (Lemieux et al., 2014) and to generate spindles in N2, while learning-related plasticity occurred in the cortical neuronal populations. Synaptic plasticity was implemented in AMPA receptors, occurring in feedforward connections between layer 1 and layer 2 cortical pyramidal cell populations, as well as recurrent connections between layer 2 pyramidal neurons (Fig. 1F; for details, see Materials and Methods).
To test relational memory in the model, we built two triplets of relational memory items (ABC, XYZ). During associative training, each of the four direct object pairs (A-B, B-C, X-Y, Y-Z) was presented to the network, as described below (Fig. 1A, left). During testing, a single item from each pair was presented (e.g., item A) and the ability of the network to recall each of the relevant associative items (items B and C) was measured (Fig. 1A, right). Each of the six distinct items (A, B, C, X, Y, Z; Fig. 1F) was represented by distinct groups of neurons in the first layer of the network.
Training and testing stimulation protocol
The network stimulation included three distinct phases: supervised training, associative training, and sleep (Fig. 2A). The first phase in training was to build connections between neurons representing item A in the first layer (neuron Group A) and “higher-level” neurons representing the item in the second layer (neuron Group A′) (Fig. 2B). Since all connections in the model were initially random, before training there were equal connections from neuron Group A to all the neuron groups in the second layer (A′-Z′). Thus, to create distinct pathways through the cortex that represent each of the six distinct items, we incorporated the supervised training phase. During supervised training, neurons in each group of layer 1 (e.g., Group A) were stimulated and then neurons in the corresponding layer 2 group (Group A′) were stimulated with a 5 ms time delay. Through STDP, this stimulation paradigm strengthened feedforward connections between A and A′ and led to the formation of a pathway through the network representing each of the six distinct items. After supervised training, there was a testing phase where each of the six neuron groups in layer 1 was stimulated and the activity of neurons in associative layer 2 was measured. During testing, plasticity was turned off so spiking activity did not lead to STDP events.
Figure 2. Training and testing protocol includes supervised and associative training in awake state and spontaneous activity during SWS. A, Overall network dynamics for the three phases: supervised training (purple), associative training (green), and sleep (cyan). Each phase is followed by a testing phase (T1, T2, and T3). B, During supervised training, neuron Groups A, B, C, X, Y, Z are stimulated in layer 1 and neuron Groups A′, B′, C′, X′, Y′, Z′, respectively, are stimulated in layer 2 with a 5 ms time delay. Left, Example stimulations of C and C′ and X and X′. During testing, a single neuron group in layer 1 is stimulated (e.g., neuron Group Z on the right), and the response of neurons in layer 2 is measured. Red bars are shown to accentuate neuron groups that are stimulated during training phase. C, During associative training, neuron Groups A + B, B + C, X + Y, Y + Z are stimulated simultaneously. Each pair is stimulated with a 500 ms delay after previous group stimulation. No stimulation is provided in layer 2. After associative training, another testing phase is performed. D, During sleep, neuromodulator levels are altered to simulate deep Stage 3 (N3) sleep activity characterized by spontaneous slow waves across cortex. After sleep, another testing phase is performed.
Following supervised training, we simulated the associative learning phase. Items A + B, B + C, X + Y, Y + Z were presented simultaneously to the network by stimulating Groups A and B together or B and C together, etc. (Fig. 2C). Because of the preceding supervised training, neurons in the second layer responded to the stimulation in the first layer, such that, for instance, when neuron Groups A and B were stimulated, neuron Groups A′ and B′ fired without any direct stimulation. After a period of associative training, there was another testing phase.
During the associative training phase, we also tested two plasticity schemes. In the first scheme, STDP was used as the sole learning rule to increase synaptic connectivity between neurons with correlated firing activity and decrease synaptic connectivity between those neurons with uncorrelated firing activity. In the second scheme, STDP was used along with heterosynaptic plasticity (Chen et al., 2013; Chistiakova et al., 2014). Heterosynaptic plasticity can induce plastic changes at synapses that are not active during the induction. Its role has been postulated since early theoretical studies, which used normalization to prevent runaway dynamics of synaptic weights and to introduce synaptic competition into model systems with Hebbian-type learning (von der Malsburg, 1973; Miller, 1996). Any synapse to a cell may express heterosynaptic changes after episodes of strong postsynaptic activity leading to a sufficient rise of intracellular calcium (for review, see Chistiakova et al., 2014, 2015). Thus, in the model including heterosynaptic plasticity, after each STDP event, individual weights connecting to a neuron were modified so that the total sum of synaptic inputs to the neuron remained constant. This served to balance excitation in the network and prevent runaway network dynamics by ensuring that the overall level of excitation remains constant during learning. Below we report results for each of these conditions, and we discuss later possible implications of heterosynaptic plasticity in associative learning.
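The constraint that total synaptic input stays constant after each STDP event can be illustrated with a simple compensation step. The sketch below assumes that the STDP-induced change is offset by an equal and opposite change shared evenly among the remaining synapses onto the same neuron; this particular rule, the function name, and the example weights are assumptions for illustration and may differ from the rule used in the model.

```python
def apply_stdp_with_normalization(weights, index, delta):
    """Change one incoming weight by delta and compensate on the others.

    weights: incoming synaptic weights of a single postsynaptic neuron.
    index:   synapse modified by the STDP event.
    delta:   STDP-induced weight change (positive or negative).
    The compensation is shared equally (an assumed rule) so that the
    summed input to the neuron is unchanged.
    """
    updated = list(weights)
    updated[index] += delta
    others = [i for i in range(len(updated)) if i != index]
    if others:
        share = delta / len(others)
        for i in others:
            updated[i] -= share
    return updated

before = [1.0, 1.0, 1.0, 1.0]
after = apply_stdp_with_normalization(before, index=0, delta=0.3)
print(after, sum(before), sum(after))
```

Potentiating one synapse therefore weakens the others slightly, which introduces the competition between inputs that the paragraph describes. A practical implementation would also need to keep weights within allowed bounds, which this sketch does not do.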
Finally, we simulated the sleep phase (Fig. 2D). Based on experimental data, the improvement of indirect relational memory following sleep is most correlated with SWS (Lau et al., 2010); thus, we primarily focused on testing the effect of SWS on relational memory (the differential role of spindles is discussed later in the paper). Note that we did not explicitly model the hippocampus and associated ripple events; instead, we assumed that coactivation of the cortical neurons (e.g., A + B) may be the result of direct sensory input or hippocampal input (as postulated by “indexing” theory) (Teyler and DiScenna, 1986). Following SWS, there was another testing phase. Overall, based on behavioral work, we tested the hypothesis that, following sleep, the presentation of item A in the first layer would lead to greater coactivation of neuron Groups A′ and C′ (i.e., an association between items A and C would form) compared with the same group activation before sleep.
Sleep improves associative memory performance both with and without heterosynaptic plasticity
In Figure 3A, the strength of response in the layer 2 neuronal subgroups (A′-Z′) is shown in response to stimulation of each of the six layer 1 neuronal subgroups (A-Z). After supervised training, stimulation of a single group in layer 1 (e.g., Group A) led to activity in its corresponding neuronal subgroup in layer 2 (Group A′). Spurious activity in other layer 2 groups was usually minimal and arose from the random connectivity matrix, in which some groups may be connected more strongly (based on number of connections) than other groups (Fig. 3A, left; for details on computing activity, see Materials and Methods).
Sleep improves associative memory performance. A, C, Responses of layer 2 neuron groups after stimulating a neuron group in layer 1 during testing after supervised training (left), associative training (middle), and sleep (right). A, Responses in the model without heterosynaptic plasticity (HSP). C, Responses in the model including heterosynaptic plasticity during associative training phase. B, D, Conversion of association matrices shown in A–C to a single association performance score. B, Without heterosynaptic plasticity. D, With heterosynaptic plasticity. E, F, Associative training duration versus sleep duration. E, The model without heterosynaptic plasticity. F, The model with heterosynaptic plasticity. The first number in each cell indicates the association score before sleep, and the second number indicates the association score after sleep. Color represents the % change in association score from before to after sleep. G, Improvement in association score as a function of number of slow waves (p = 2.45 × 10^−13, R^2 = 0.74) in the model including heterosynaptic plasticity. Each dot represents a different network trial. Network trials are computed for 100, 300, and 500 s of sleep as well as different durations of associative training.
After associative training, an increase in direct relational memory was observed. Here, stimulation of neuron Group A led to activity in neuron Groups A′ and B′, indicating that the network had learned to make direct associations between Objects A and B. Stimulation of the linking item (e.g., B or Y) led to activity in all three of the items in the corresponding triplet (A′, B′, C′ or X′, Y′, Z′). Most notably, however, stimulating A or C alone did not lead to a strong response in the indirect relational item, C′ or A′, respectively (Fig. 3A, middle). After the sleep phase, this indirect relational memory was significantly strengthened, as stimulation of A or C (X or Z) led to a stronger response in the indirect relational item, C′ or A′ (Z′ or X′), respectively (Fig. 3A, right).
To quantify the changes in the association matrices, we used a measure of how “diagonal” the matrix is with respect to its four main 3 × 3 blocks, which evaluated the extent to which the matrix shows strong responses in the upper left and lower right 3 × 3 blocks and weak responses in the top right and bottom left 3 × 3 blocks (see Materials and Methods). This measure would be 0 for a uniform matrix; 1 for a matrix in which the top left and bottom right 3 × 3 blocks all have the same values, with zero activity in the top right and bottom left 3 × 3 blocks; and −1 for the opposite case (activity only in the top right and bottom left blocks). We found that sleep led to a significant improvement in relational memory, based on simulating 10 different random network configurations (Fig. 3B, p = 0.0062, t(9) = 3.55, between relational memory after sleep and after associative training, based on a two-sided t test).
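The behavior of such a block-diagonality measure can be illustrated with a short sketch. The exact formula is given in Materials and Methods and is not reproduced here; the normalized contrast below (mean within-triplet response minus mean between-triplet response, divided by their sum) is an assumption, chosen only because it reproduces the three reference values stated above (0, 1, and −1) for non-negative responses. The function name `association_score` is hypothetical.

```python
def association_score(responses, block=3):
    """Block-diagonality contrast for a square response matrix (illustrative).

    responses[i][j]: response of layer 2 group j to stimulation of layer 1
    group i; values are assumed to be non-negative.
    Groups i and j belong to the same triplet when i // block == j // block.
    """
    within, between = [], []
    n = len(responses)
    for i in range(n):
        for j in range(n):
            if i // block == j // block:
                within.append(responses[i][j])
            else:
                between.append(responses[i][j])
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    denom = mean_within + mean_between
    # An all-zero matrix carries no information; report 0 by convention.
    return 0.0 if denom == 0 else (mean_within - mean_between) / denom


uniform = [[1.0] * 6 for _ in range(6)]
block_diag = [[1.0 if i // 3 == j // 3 else 0.0 for j in range(6)]
              for i in range(6)]
print(association_score(uniform), association_score(block_diag))
```

Under this assumed definition, spurious activity in the wrong triplet lowers the score even when the correct triplet responds strongly, which matches the use of the score as a measure of response specificity.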
The extent of improvement after sleep was determined by two factors: the length of associative training and the length of sleep. We observed that, if associative training was long, indirect associations could be learned without sleep (Fig. 3E, 50 s). However, when associative training was shorter, sleep had a beneficial impact on relational memory (Fig. 3E, 20, 35 s). Because this model had no homeostatic mechanisms built in to constrain synaptic weights, long training or long sleep periods could lead to runaway network dynamics, where stimulating a single neuronal group in layer 1 led to activity across many neurons of the second layer, thus lowering overall response specificity and performance.
Given the negative impact of the runaway network dynamics, we next explored the use of a biologically realistic heterosynaptic plasticity mechanism to constrain synaptic weights. Thus, during associative training, heterosynaptic plasticity was put in place, such that the total sum of synaptic inputs to any neuron was conserved over time. In this model, any event that leads to synaptic potentiation between neurons would also lead to a corresponding depotentiation of other connections to the same neuron to keep the net sum of all input weights constant (for details, see Materials and Methods). In the model with heterosynaptic plasticity, we observed less spurious activity after associative training (Fig. 3C, middle). In addition, activation of the indirect memory after associative training was almost nonexistent. Importantly, after sleep, the activity in the indirect memory items was strong, with very little activity in neurons representing nonassociated items (Fig. 3C, right). Here, improvement after sleep was strongly significant (Fig. 3D, p = 3.78 × 10^−7, t(9) = 13.04, based on a two-sided t test). This suggests that, for SWS to have a beneficial impact on the network's ability to recall indirectly associated items, the weights before sleep must be sufficiently separated but not too strong overall, as they were when heterosynaptic plasticity was applied during associative training. In general, the best performance was observed when sleep was incorporated into the network (Fig. 3F). Increasing the training time beyond a certain duration did not always increase the baseline performance; however, sleep applied even after long associative training could still further improve performance. We tested how associative memory performance depends on the total number of slow waves, and we found a significant positive correlation across a broad range of sleep durations (Fig. 3G).
This result is in agreement with previous experimental work that found a significant correlation between the SWS length and relational memory learning (Lau et al., 2010). Interestingly, very long sleep could have the opposite effect and reduce performance (see, e.g., Tsleep = 700 s), suggesting the existence of an optimal sleep duration that could also depend on the duration of preceding training sessions. For further analyses, we used the heterosynaptic plasticity condition with Tsleep = 300 s and Ttrain = 135 s.
Synaptic plasticity may also occur between cortical pyramidal cells and interneurons, as well as between thalamus and neocortex. Although we did not explicitly incorporate these types of plasticity in our model, we tested the effect of changes in the balance of excitation and inhibition on post-sleep memory performance. To do so, we modified the level of inhibition in the network by setting it 10% above or below the baseline value. We found no significant difference in the association score after sleep (t(10) = −0.8, p = 0.4, one-sided t test). After associative training, performance was relatively higher in the network with reduced baseline inhibition (t(10) = 2.4, p = 0.02, one-sided t test). In this case, there was still a significant post-sleep improvement (t(10) = −4.96, p = 0.0001). The network with increased inhibition showed slightly reduced performance right after associative training but a relatively higher gain after sleep.
Sleep increases amplitude and decreases latency of indirect memory response
Since sleep increases the association score, we next asked whether sleep can improve the latency of group activation by reducing the time delay between responses of stimulated and indirectly recalled groups. To test this, we analyzed the raw neuronal traces after supervised training, after associative training, and after sleep (Fig. 4A–C). As mentioned before, heterosynaptic plasticity was in place in all these simulations. During testing, each group (A-C, X-Z) was stimulated 8 times, once every 500 ms, in layer 1, and the response in layer 2 was measured. We next converted these firing patterns into an LFP for each of the six groups of neurons in the second layer and averaged across the eight stimulations. Results are shown for stimulation of X in the first layer (Fig. 4D). After supervised training, stimulating X led to a strong response in X′ (Fig. 4D, left). After associative training, the strength of the response of Y′ was increased and there was a small, sustained response in Z′ (Fig. 4D, middle). Finally, after sleep, the response profiles of Y′ and Z′ nearly overlapped, suggesting that the network had used its knowledge of an association between Z′ and Y′ to correctly infer the indirect association between Z′ and X′ (Fig. 4D, right).
Sleep increases amplitude and decreases latency of indirect memory response. A–C, Raw network response traces during testing phase of stimulating A, B, C, X, Y, Z (from left to right) after supervised training (A), associative training (B), and sleep (C). Note increase in response and decrease in latency after sleep. D, Averaged (across 8 trials) and smoothed, through a bandpass filter at 0.1 and 20 Hz, LFP computed separately for the three neuron groups in layer 2 (X′, Y′, Z′ are shown) in response to stimulation of a neuron group X in layer 1. LFPs are shown during testing phase after supervised training (left), associative training (middle), and sleep (right). E, Average response latency for direct memories (black, e.g., latency of neuron Group B′ when A is stimulated), indirect memories (pink, e.g., latency of neuron group C′ when A is stimulated), and incorrect memories (cyan, e.g., latency of neuron group X′ when A is stimulated). F, Average firing rate of neurons in layer 2 for each type of memory (direct, indirect, and incorrect) during testing phase.
We measured response latency as the time delay from layer 1 stimulation to the first action potential in each layer 2 neuronal group's response, and we measured response intensity as the total number of spikes per stimulation of each layer 2 neuronal group. After supervised training, the average latency of direct memories (A-B′, which have not been learned yet), indirect memories (A-C′), and incorrect memories (A-X′) were all similar at ∼200 ms (Fig. 4E, left group). In addition, the rate of response was very low and similar across all three types of memories (Fig. 4F, left group). After associative training, the latency of the direct memory recall was substantially reduced and the intensity of response was increased (Fig. 4E,F, middle group). The latency and the response amplitude of the indirect memory were also improved, but the latency was not significantly different from that of the response for incorrect memories, and the amplitude was not as strong as for direct memory. Importantly, after sleep, the latency of the indirect memory recall was significantly reduced compared with the incorrect one (Fig. 4E, right group, t(1,163) = 24.27, p = 3.039 × 10^−100, two-sided t test) and the intensity of response was significantly increased (Fig. 4F, right group, t(319) = −9.64, p = 2.41 × 10^−19). This behavioral change in the network response dynamics highlights the increase in strength of the indirect memory following SWS.
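The two response measures defined above can be written as a short sketch. The function names, the representation of a group's activity as a pooled list of spike times, and the use of the 500 ms inter-stimulus interval as the counting window are assumptions made for illustration; the paper's own analysis code may differ.

```python
def response_latency_ms(stim_time_ms, spike_times_ms, window_ms):
    """Delay from stimulation to the first spike of a layer 2 group.

    spike_times_ms: spike times pooled over all neurons of one group.
    Returns None when the group does not fire within the window.
    """
    delays = [t - stim_time_ms for t in spike_times_ms
              if 0.0 <= t - stim_time_ms < window_ms]
    return min(delays) if delays else None


def response_intensity(stim_time_ms, spike_times_ms, window_ms):
    """Total number of spikes of a group within the window after stimulation."""
    return sum(1 for t in spike_times_ms
               if 0.0 <= t - stim_time_ms < window_ms)


# Hypothetical spike times (ms) for one group; stimulation at 100 ms.
spikes = [20.0, 130.0, 180.0, 700.0]
print(response_latency_ms(100.0, spikes, window_ms=500.0))
print(response_intensity(100.0, spikes, window_ms=500.0))
```

Averaging these per-stimulation values over the repeated stimulations and over groups of the same memory type (direct, indirect, incorrect) would give summary values of the kind plotted in Figure 4E,F.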
Sleep increases modularity of each triplet in layer 2 recurrent connections
To determine which network changes were responsible for improving indirect relational memories after sleep, we analyzed the changes in synaptic weights. There were two types of plastic connections in the model: feedforward connections between layer 1 and layer 2, and recurrent connections within layer 2. In the feedforward connectivity matrices, we observed that sleep leads to a significant increase in the synaptic input coming from both indirect (Fig. 5A, right, e.g., connection A to C′, t = −6.98, p = 1.39 × 10^−95, two-sided t test) and direct neuronal groups (Fig. 5A, right, e.g., connection A to B′, t = −5.66, p = 5.29 × 10^−79, two-sided t test). Importantly, the incorrect memory weights (e.g., X to A′) were not significantly greater than their pretraining values (indeed, they were smaller than their pretraining values, p < 1 × 10^−100, one-sided t test), suggesting that sleep does not just increase all the connections but only connections related to associated memory items. In the recurrent weights (Fig. 5B), a similar effect was observed where synaptic input from direct and indirect memory groups was significantly increased to specific neurons after sleep (p = 6.18 × 10^−62 and p = 7.67 × 10^−88 for direct and indirect groups, respectively, two-sided t test). Interestingly (also see Discussion below), synaptic input from a neuronal group to its indirect triplet pair (e.g., A′ to C′) in the second layer became even larger than the synaptic input from an indirect group in the first layer (e.g., A to C′, p = 0.02, two-sided t test, average feedforward synaptic input = 2.08, average recurrent synaptic input = 2.75).
Synaptic weight dynamics explains improvements in relational memory after sleep. A, B, Left, Feedforward (A) and recurrent (B) synaptic weight matrices after supervised training, associative training, and sleep. Right, Synaptic input to the neurons of each memory type in layer 2 (the sum of all the weights connecting to those neurons) for self-memories (A-A′), direct memories (A-B′), indirect memories (A-C′), and incorrect memories (A-X′) after supervised training, associative training, and sleep.
To better quantify changes in the recurrent connections in layer 2, we built and analyzed a graph of 10 nodes, where each node represents a group of 10 neurons (i.e., Group A′ = 11-20, B′ = 21-30, …, Z′ = 71-80) (Fig. 6A–D). We created an edge between two groups if there were any strong enough weights (i.e., exceeding a threshold) between these groups (the weight threshold was set at 75% of the maximum weight value, so it was different at different time points, e.g., threshold before training = 0.0218, threshold after supervised training = 0.1295, threshold after associative training = 0.1857, threshold after sleep = 0.1913). On the graph, the thickness of the edge depicts how many such weights existed between the two nodes. After supervised training, recurrent weights within trained groups (e.g., between all the neurons from Group A′) increased, but weights between groups remained weak and the graph was essentially disconnected (Fig. 6B). After associative training, relatively weak connections were formed between the linking Group B′ (or Y′) and the other relevant groups, A′ and C′ (or X′ and Z′) (Fig. 6C). In addition, the self-connections (recurrent connections within a group) were magnified. Finally, after sleep, the overall connectivity between the group triplets was increased, with weak connections between direct memory pairs becoming stronger (e.g., X′-Y′) and new connections forming between indirect memories (e.g., X′-Z′) (Fig. 6D). Overall, these changes suggest that items in each triplet (e.g., X′-Y′-Z′) become strongly connected to the other items in that triplet so that activation of any one group can lead to activation of the other groups. Thus, after sleep, all the neurons in the second layer associated with the items belonging to the same relational memory triplet formed an attractor in synaptic weight space.
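The graph construction described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: the function name `group_graph`, the treatment of the graph as undirected, and the exclusion of a neuron's connection to itself are assumptions, and the threshold fraction is left as an explicit argument.

```python
def group_graph(weights, group_size, fraction):
    """Build a group-level graph from a neuron-level weight matrix.

    weights[i][j]: recurrent weight from neuron i to neuron j.
    A weight counts toward an edge when it exceeds fraction * max(weights),
    so the threshold is recomputed for every matrix (every time point).
    Returns (threshold, edges), where edges maps an unordered pair of group
    indices to the number of supra-threshold weights between those groups;
    that count corresponds to edge thickness in the figure.
    """
    threshold = fraction * max(max(row) for row in weights)
    edges = {}
    n = len(weights)
    for i in range(n):
        for j in range(n):
            if i != j and weights[i][j] > threshold:
                pair = tuple(sorted((i // group_size, j // group_size)))
                edges[pair] = edges.get(pair, 0) + 1
    return threshold, edges


# Toy example: 4 neurons in 2 groups of 2.
toy = [[0.0, 1.0, 0.9, 0.0],
       [0.2, 0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, 0.8],
       [0.0, 0.0, 0.0, 0.0]]
print(group_graph(toy, group_size=2, fraction=0.75))
```

Because the threshold scales with the largest weight in each matrix, a weak initial network can still show many edges, whereas after training only the strengthened pathways exceed the threshold.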
Sleep increases modularity for each triplet of items (A′B′C′ and X′Y′Z′) in layer 2 recurrent connections. A-D, Graphs of layer 2 connectivity matrices. Each dot represents a group of 10 neurons (red dots represent A′, B′, C′; blue dots represent X′, Y′, Z′). A line is drawn between two dots if there is a weight between groups that exceeds a given threshold (75% of the maximal weight). The thickness of the line represents the number of such connections: (A) before any training, (B) after supervised training, (C) after associative training, and (D) after sleep. Threshold is calculated for each state separately; so, for example, before training many connections exceed the threshold defined by initial weak connections. E, Community assignment for layer 2 neurons over time during each training/sleep phase: ST, supervised training; AT, associative training; and sleep. Neurons were assigned the same color (at any given time) if those neurons belonged to the same community. F, The number of communities over time. Data are averaged across 10 network trials. Error bars indicate SD across trials.
To further test this idea, we performed modularity analysis on the time-dependent recurrent weight matrix to determine how clusters of neurons change over the course of training and sleep (for details, see Materials and Methods). We used a time-dependent community detection algorithm to assign each of the 100 neurons in layer 2 to a community (where community assignment can change over time) based on the synaptic connectivity matrix (Leicht and Newman, 2008; Jeub et al., 2020). Figure 6E illustrates how the community assignment changed during supervised training, associative training, and sleep. During supervised training, each of the six subgroups was put into a community of its own, as the neurons within these groups became strongly interconnected. During associative training, there was some mixing between these six subgroups, as observed, for example, in the merging of communities representing Y′ and Z′ (Fig. 6E, orange group). Finally, during sleep, we observed merging of the three subgroups from each triplet into a larger community. We found that the number of communities in the network started out high but was further reduced mostly during sleep (Fig. 6F). Together, these results suggest that sleep altered the connectivity matrix to enable formation of a large community of related neurons that all shared similar stimulus-response profiles, leading to formation of indirect memories. Thus, sleep altered the community structure by building a strong attractor among members of each of the memory triplets.
Replay during sleep drives synaptic weight changes
Given that during sleep synaptic weights are restructured to support formation of indirect associative memory, the question remains as to what specifically about sleep leads to these changes. Based on our previous work (Wei et al., 2016, 2018; González et al., 2020), we hypothesized that replay during sleep of synaptic traces formed during training leads to a strengthening of these synaptic traces and thus an improvement in memory (Ji and Wilson, 2007; Lewis and Durrant, 2011). Importantly, in our model, indirect connections (e.g., from A to C′ or A′ to C′) are never explicitly activated during training; however, these pathways may become active during SWS, which could explain the weight changes illustrated above.
To detect possible replay events, we applied a procedure previously proposed by González et al. (2020). After detecting individual Up states (using LFP thresholding; see Materials and Methods; Fig. 7A), we identified, for each Up state, all spiking events that could lead to STDP changes. Thus, if Neuron I fired during an Up state and this was followed by Neuron II firing (within a 200 ms time window), then this pair was considered an STDP event and the direction of replay (from Neuron I to Neuron II) was recorded. We observed that the number of STDP events within the trained region of the network, both in feedforward and recurrent connections, was significantly greater than outside of the trained regions (Fig. 7B, p < 1e-4; for visualization purposes, only pairs with number of replay events above a threshold [top 75%] are shown). Importantly, we observed not just more STDP events randomly distributed across all the neuronal pairs in the trained region, but a higher number of events in specific neuronal pairs (Fig. 7B; note red dots in the ROIs), suggesting that those events reflect replay of the memory elements formed during associative training. In other words, during an Up state, there was a significantly higher chance that the neurons within the trained region would spike in a defined order compared with the neurons outside of the trained region, indicating that SWS does indeed reactivate synaptic memory traces learned during the associative phase.
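The pair-counting step of this procedure can be illustrated with a minimal sketch. It assumes that Up-state boundaries have already been detected, that both spikes of a pair must fall within the same Up state, and that every ordered pair within the time window is counted; the function name `count_replay_events` and the data layout are hypothetical and are not taken from González et al. (2020).

```python
def count_replay_events(spikes, up_states, window_ms):
    """Count ordered spike pairs (candidate STDP events) within Up states.

    spikes:    list of (time_ms, neuron_id) tuples.
    up_states: list of (start_ms, end_ms) intervals.
    window_ms: maximum delay between the first and second spike of a pair.
    Returns a dict mapping (first_neuron, second_neuron) to an event count,
    so the direction of each event is preserved.
    """
    counts = {}
    ordered = sorted(spikes)
    for start, end in up_states:
        in_up = [(t, n) for t, n in ordered if start <= t <= end]
        for k, (t_first, first) in enumerate(in_up):
            for t_second, second in in_up[k + 1:]:
                delay = t_second - t_first
                if delay > window_ms:
                    break  # spikes are time-sorted, later ones are even further away
                if delay > 0 and second != first:
                    key = (first, second)
                    counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical spikes: neuron 0 leads neuron 1 in the first Up state,
# and neuron 1 leads neuron 0 in the second.
spikes = [(10.0, 0), (15.0, 1), (300.0, 1), (305.0, 0), (900.0, 0), (905.0, 1)]
print(count_replay_events(spikes, [(0.0, 100.0), (250.0, 400.0)], window_ms=200.0))
```

Arranging these counts into a matrix indexed by first and second neuron gives a replay matrix of the kind shown in Figure 7B, which can then be compared between trained and untrained regions.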
Replay during sleep drives synaptic weight changes. A, LFP during SWS (left) and “zoom-in” examples of slow waves (right). Beginning/end times of Up and Down states are computed by setting a threshold for the transition from Down to Up state and vice versa. B, Number of replay events for feedforward (top) and recurrent (bottom) connections. Replay events are selected by identifying sequential ordered firing events, within a specified time window. Replay events occur significantly more in the areas of interest (black grids) than in other areas (p < 1e-4, based on shuffling replay matrix 10,000 times). C, Change in synaptic weights as a function of number of replay events between neurons for feedforward (top, R^2 = 0.61, p = 1 × 10^−12) and recurrent (bottom, R^2 = 0.41, p = 1 × 10^−10) connections. D, Number of replay events between self, direct, indirect, and incorrect neuron groups for feedforward (top) and recurrent (bottom) connections. For feedforward connections, there was a significantly higher number of replay events between self-connections than direct connections, direct connections than indirect connections, and indirect connections than incorrect connections. For recurrent connections, indirect connections revealed the most replay events (p = 0.006 versus incorrect connections, and p = 3.28 × 10^−36 versus direct connections).
We next measured the extent to which replay is correlated with synaptic connectivity changes. Thus, we plotted observed synaptic weight change against the total number of replay events per neuronal pair and discovered a significant correlation between the number of replay events for a given connection and the amplitude of the weight change in this connection (Fig. 7C). This was true for both feedforward and recurrent connections (R^2 = 0.62, p = 1 × 10^−12 for feedforward and R^2 = 0.41, p = 1 × 10^−10 for recurrent connections). These data suggest that sleep replay can restructure weights to build the communities underlying relational memory formation as reported in Figure 6. We next separated replay events based on the type of connection: self-connection (e.g., A-A′, or A′-A′), direct connection (e.g., A-B′, A′-B′), indirect connection (e.g., A-C′, A′-C′), or incorrect connection (e.g., A-X′, A′-X′). In feedforward connections, we observed that self-connections had the largest number of replay events, followed by direct, indirect, and incorrect connections, in that order (Fig. 7D, top; number of replay events is averaged across 10 trials and all the connections in each of the four categories). This suggests that, in feedforward connections, replay reflects the underlying strength of the synaptic weights (compare Fig. 7D, top, and Fig. 5A). Since self-connections were the strongest (Fig. 5A, t = −3.99, p = 6.72 × 10^−5, two-sided t test), these connections experienced the greatest number of replay events. However, in the recurrent connections, there was a greater number of replay events in the indirect connections (Fig. 7D, bottom, t = 2.72, p = 0.006, two-sided t test). This type of replay can lead to the formation of the communities (Fig. 6) responsible for formation of indirect associative memories.
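The relationship between replay counts and weight changes reported above is summarized by a coefficient of determination. As a minimal sketch, assuming a simple linear fit (for which the coefficient of determination equals the squared Pearson correlation), the value can be computed from paired per-connection values as follows; the function name `r_squared` and the example numbers are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    x: number of replay events per connection.
    y: synaptic weight change for the same connections.
    Returns 0.0 when either sequence has no variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-connection values, for illustration only.
replay_counts = [0, 1, 2, 3]
weight_changes = [1.0, 3.0, 5.0, 7.0]
print(r_squared(replay_counts, weight_changes))
```

A value near 1 would indicate that weight change is almost entirely predicted by replay count, whereas intermediate values leave room for other contributions to the weight dynamics.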
N3 sleep is uniquely responsible for post-sleep improvement, although spindle-slow-wave nesting may be important
Behavioral studies suggest that the duration of N3 sleep, but not N2 sleep, during a daytime nap is significantly correlated with associative memory performance (Lau et al., 2010). We tested the effect of N2 sleep by modifying the levels of neuromodulators in the model, which were set between their waking and N3 state levels (Krishnan et al., 2016). In this regimen, the network generated frequent spindle events interrupted by occasional slow waves (Fig. 8A). We compared four conditions: 300 s of N3 sleep alone (control, as in the above simulations), 300 s of N2 sleep alone, 600 s of N2 sleep alone, and 300 s of mixed sleep (200 s of N2 followed by 100 s of N3). We found that N2 sleep alone was not sufficient to significantly boost associative memory performance, for either 300 or 600 s of N2 sleep duration (t(9) = −1.56, p = 0.13, one-sided t test) (Fig. 8B, left). However, either 300 s of N3 sleep or mixed N2 + N3 sleep did result in a significant improvement (t(9) = −2.39, p = 0.028, one-sided t test) (Fig. 8B, right). These results are consistent with behavioral evidence showing a unique role for N3 sleep, as opposed to N2 sleep, in improving relational memory.
Stage 2 (N2) sleep has little effect on association score, although spindle/slow oscillation nesting during N3 sleep was significantly correlated with association score improvement. A, Network dynamics including both N2 and N3 sleep: supervised training (purple), associative training (green), and sleep, composed of N2 (lime) and N3 sleep (cyan). Bottom row represents zoom-in of N2 sleep (two spindles are shown) and N3 sleep (slow waves). B, Association scores following 300 s of N2 sleep (top left), 300 s of N3 sleep (top right), 600 s of N2 sleep (bottom left), and 300 s of mixed sleep (200 s N2 and 100 s N3, bottom right). C, Association score improvement as a function of spindle power near the Down-to-Up transition of N3 sleep suggests a significant correlation between spindle/slow oscillation nesting and association score. Spindle power is given in thousands of mV^2. D, Spindle power is significantly higher near the Down-to-Up transition than near the Up-to-Down transition or a random time selected during the Up state of a slow wave. Power was calculated based on 100 ms time windows.
Other studies suggested that phase locking between slow waves and spindles (frequency nesting) may be necessary for memory consolidation (Latchoumane et al., 2017; Kim et al., 2019). We tested this by measuring the power in the spindle frequency band (from 7 to 16 Hz) in three distinct phases of the N3 slow oscillations: Down-to-Up transition, Up-to-Down transition, and random time windows during the Up state. LFPs were computed, and the starts and ends of each Up state were identified as done previously (see Materials and Methods). We calculated the spindle power in 100 ms time windows centered on each of the three phases. We found significantly higher power in the spindle frequency band near the Down-to-Up transition compared with the two other phases tested (Fig. 8D). Additionally, we found that this spindle power was significantly correlated with associative memory improvement following sleep (Fig. 8C, R^2 = 0.5, p = 0.03). These results predict that phase locking between spindles and slow waves may be important in relational memory.
Discussion
How does sleep give rise to relational memory? Our study suggests the following conceptual model. First, for each “basic” memory, there exists a feedforward pathway through the network that is stable and robust, so a stimulus presentation, namely, pattern activation in a primary sensory area (e.g., neuron Group A, Fig. 9A, left), leads to a reliable and unique response in associative cortex (activation of neuron Group A′). These pathways can possibly form during development, can be strengthened during subsequent training, and need to be robust for associative learning to take place. These pathways represent sensory “primitives” that have been learned once and do not need to be changed in the adult brain. Second, during associative learning, events that have shared context are learned to be represented together. In the model, this occurred when inputs A and B were presented together, which led to an overlapping representation in associative cortex, where presentation of A or B alone led to firing and recollection of the other object (i.e., B′ or A′) (Fig. 9A, middle). If different associative memories include a common item (e.g., A-B and B-C), sleep aids in forming an indirect associative memory between nonoverlapping items A′ and C′ by strengthening the entire pathway A → C′ (or C → A′), both through an increase in feedforward connections from A to B′ and C′ and through community (or attractor) formation for the entire A′-B′-C′ group in associative cortex (Fig. 9A, right). As sleep replay takes place on a compressed timescale (Nádasdy et al., 1999), the entire group (A′-B′-C′) can be activated within a small enough window for connections to grow between A′ and C′, taking advantage of STDP-type mechanisms. Indeed, inhibiting the overlapping elements (B/B′ or Y/Y′) during sleep (or during memory recall) prevents post-sleep improvement on this associative memory task in our model (Fig. 9B), in line with in vivo work, which showed that associations between a visual stimulus and fear response could be blocked by optogenetic inhibition of neurons representing the visual stimulus during sleep (Clawson et al., 2021).
Proposed model of relational memory and main experimental predictions. A, Summary of the changes to the model at different time points. During supervised training, feedforward connections are formed between layers 1 and 2 to represent self-memories (e.g., A-A′). During associative training, the network learns to associate items presented together (e.g., A with B and B with C). However, these connections are weak, and no indirect associations are learned (e.g., A is not associated with C). After sleep, direct and indirect memory connections are strengthened and one attractor is formed for entire triplet of items (i.e., a community including A′, B′, and C′). B, Effect of inactivating different neuronal groups during either sleep or testing on association score. Blue bars represent performance after training. Orange bars represent performance after sleep. Silencing linking group in any one layer only (B′ or B, Y′ or Y) during sleep still leads to significant post-sleep improvement for associative memories (B′, Y′: t(10) = −4.91, p = 0.001; B, Y: t(10) = −2.03, p = 0.045, one-sided t test, FDR correction). However, silencing linking groups in both layers (B/B′, Y/Y′) during sleep prevents post-sleep improvement for these associative memory tasks (t(10) = −0.59, p = 0.28). Inactivating linking groups in layer 2 alone (B′, Y′) during testing was sufficient to significantly reduce associative memory performance.
Recent experiments suggest that learning rules may differ between anesthetized and awake states and are biased toward synaptic depression during Up states of Slow Oscillations (SOs) in urethane-anesthetized mice (González-Rueda et al., 2018). This result supports the synaptic homeostatic downscaling (SHY) hypothesis, suggesting that during sleep synapses are downscaled to free up synaptic resources for learning during the next wake state (Tononi and Cirelli, 2014). The other view is that synaptic potentiation (at least at selected subsets of synapses associated with recently learned memories) occurs during NREM sleep to enable memory consolidation (Timofeev and Chauvette, 2018; for review, see also Puentes-Mestril and Aton, 2017). In our new study, based on a large scope of existing experimental data, we used a symmetric STDP rule that is similar in both wake and sleep states, and we observed strengthening of synaptic connections to form new associative memories during sleep. This model may need to be extended based on prevailing biological views about plasticity rules in the waking and sleeping brain as new data are accumulated. In addition, plasticity mechanisms, such as heterosynaptic and homeostatic synaptic plasticity, may affect learning, and their effects differ between sleep and wake. For example, the effect of heterosynaptic plasticity depends on neuromodulators (Bannon et al., 2017) whose levels fluctuate during the sleep–wake cycle. In our new study, we explicitly tested the effect of heterosynaptic plasticity on associative memory and found that it helps to form associative memories. However, because of the complexity of the effects of neuromodulation, we considered a simplified model in which heterosynaptic scaling operates similarly during sleep and wake.
Our work expands on computational models of relational memory by providing a biophysically plausible account of learning during waking and consolidation during sleep. Previous models of relational memory include the temporal context model (TCM) and retrieval-based models (Kumaran, 2012; Kumaran and McClelland, 2012). Our model adds to this literature by (1) developing a biophysical account, based on STDP rules, that explores the role of sleep replay in relational memory tasks; and (2) suggesting a role for both the TCM and retrieval-based models, depending on the type of relational memory task. TCM and retrieval-based models have been successful at demonstrating performance on associative memory tasks (Kumaran and McClelland, 2012). However, these models were constructed using preset weights between different regions of the network, and sleep replay was implemented using artificial stimulation. In contrast, we show that STDP rules applied in realistic task settings can learn relational memories, and that the synaptic replay needed to form indirect relational memories occurs naturally during SWS and does not require any additional stimulation. We found that, during SWS, individual items were replayed spontaneously and in the correct order to form a new relational memory.
Our model, which more closely aligns with TCM, may be insufficient to explain generalization on ordered relational memory tasks (Ellenbogen et al., 2007; Werchan and Gómez, 2013). We showed that replay is equally likely to occur in the forward and backward directions (e.g., forward = A → B, backward = B → A). In this simplified task, memory consolidation during sleep occurs mainly in a recurrent layer, as neurons representing single units become wired together based on a shared context and form an attractor or community that enables indirect memory recall. However, in an ordered relational memory task, where the hierarchy of items needs to be learned, replay within a single attractor-based layer may be insufficient to correctly encode the order of the task, and big-loop recurrency may be necessary.
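The attractor-formation argument for the unordered task can be illustrated with a toy rate-based sketch. This is not the biophysical thalamocortical model; it assumes three abstract units standing for the item groups A′, B′, and C′, weak symmetric A-B and B-C weights after training, random seeding of one unit per replay event, and a symmetric Hebbian update on co-activity. All names and parameter values (propagate, sleep_replay, lr, n_events, the initial weight of 0.3) are hypothetical.

```python
import numpy as np

A, B, C = 0, 1, 2  # indices of the three item groups


def propagate(W, cue, steps=2):
    """Spread activity from a cued unit through the recurrent weights."""
    x = np.zeros(W.shape[0])
    x[cue] = 1.0
    for _ in range(steps):
        x = np.clip(x + W @ x, 0.0, 1.0)
    return x


def sleep_replay(W, n_events=200, lr=0.05, w_max=1.0, seed=0):
    """Strengthen weights between co-active units over many replay events.

    Each event seeds one randomly chosen unit (a stand-in for spontaneous
    reactivation), lets activity spread, and applies a symmetric Hebbian
    update that ignores the order of activation.
    """
    rng = np.random.default_rng(seed)
    W = W.copy()
    for _ in range(n_events):
        cue = rng.integers(W.shape[0])
        x = propagate(W, cue)
        dW = lr * np.outer(x, x)
        np.fill_diagonal(dW, 0.0)  # no self-connections
        W = np.clip(W + dW, 0.0, w_max)
    return W


# After associative training: weak direct links, no A-C link.
W_trained = np.zeros((3, 3))
W_trained[A, B] = W_trained[B, A] = 0.3
W_trained[B, C] = W_trained[C, B] = 0.3

W_slept = sleep_replay(W_trained)

# Direct A-to-C weight before and after the simulated sleep period.
ac_before = W_trained[C, A]
ac_after = W_slept[C, A]
```

In this sketch, the A-C weight grows only because activity seeded anywhere in the triplet passes through the shared item B and co-activates A and C. Because the update is symmetric, the resulting community carries no information about item order, consistent with the limitation for ordered tasks noted above.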
Many studies explored the effect of sleep on relational memory without analyzing the correlation between specific sleep stages and performance improvement (Lau et al., 2011; Huguet et al., 2019). Our work expands on these studies by suggesting a unique role for SWS in improving relational memory. We further predict that, while the nesting of spindles and slow waves may be important for consolidation of relational memories, spindles alone are not sufficient for consolidation. Our study predicts that the number of slow waves observed during sleep is significantly correlated with the subject's ability to perform relational memory tasks, in line with previous work that demonstrated a significant correlation between SWS duration and relational memory learning (Lau et al., 2010).
Our study also supports evidence that patients with mental health disorders in which SWS is disrupted, such as schizophrenia, may experience deficits in relational memory (Titone et al., 2004; Martin et al., 2005; Pritchett et al., 2012). Patients with schizophrenia have shown a marked decrease, compared with healthy controls, in their performance on transitive inference and relational memory tasks (Titone et al., 2004; Avery et al., 2021). One of the sleep deficits in subjects with schizophrenia is a significant decrease in the amount of SWS (Keshavan et al., 1990; Benca et al., 1992; Yang and Winkelman, 2006; Manoach and Stickgold, 2009). Our model suggests that, if disrupted SWS is responsible for deficits in learning transitive inference in schizophrenia, then methods focusing on recovering normal sleep patterns could lead to an improvement in the associated cognitive symptoms.
A limitation of our work is that it ignores the explicit impact of the hippocampus on memory consolidation and transitive inference. Previous studies have described the importance of the hippocampus in transitive inference tasks: hippocampal activation is increased during the performance of such tasks, and damage to the hippocampus decreases performance on them (Heckers et al., 2004; Zalesak and Heckers, 2009; DeVito et al., 2010; Wendelken and Bunge, 2010). Recent studies revealed a complex bidirectional interaction between hippocampal and cortical networks (Rothschild et al., 2017; Helfrich et al., 2019). Our recent modeling work (Sanda et al., 2021) found that hippocampal ripples can coordinate large-scale spatiotemporal dynamics of cortical slow waves. We address this limitation by noting the similarity of the second layer in our model to hippocampal regions, which rely on similar attractor dynamics (Colgin et al., 2010). Thus, the same mechanisms we propose here may explain relational memory improvement during sleep in the cortico-hippocampal system. Importantly, empirical and computational studies reported that hippocampal activation during SWS is preceded by cortical input and follows a cortical-hippocampal-cortical pathway (Rothschild et al., 2017; Navarrete et al., 2020; Sanda et al., 2021). In this scenario, the content of replay may be introduced by cortical networks (layer 1 in our model) and may determine the content of replay in hippocampal and other cortical networks (layer 2 in the model).
REM sleep is likely to be critical for memory and learning, but its specific role in the formation of relational memories is unknown. One study found that the fraction of time spent in REM sleep during a 60 min nap was correlated with improvement on A-C item pairs but also led to more forgetting of directly learned (A-B) relations (Alger and Payne, 2016). In that study, however, subjects who did not attain REM sleep during the 60 min period performed similarly to those who did. Thus, it remains an open question how REM and NREM sleep differentially contribute to relational memory and to memory consolidation in general (see, however, Wei et al., 2018). It is also likely that the cycling between REM and NREM sleep over the course of a typical night (i.e., multiphasic sleep with specific temporal structure) is important for sleep-dependent memory consolidation.
In conclusion, we built a model of the thalamocortical system that suggests specific biophysical mechanisms explaining the role of sleep in the formation of indirect associative memories. This model predicts that inhibition of neuronal groups representing the common items that link associated items may decrease performance on relational memory tasks (Clawson et al., 2021), while artificial stimulation of nonassociated items during sleep replay may lead to false memory formation (Diekelmann et al., 2010). Our model can be extended to describe transitive inference tasks in which there is an underlying hierarchy of items (e.g., A > B), which likely requires a third layer to account for the big-loop recurrency needed to perform ordered transitive inference.
Footnotes
This work was supported by Office of Naval Research MURI N00014-16-1-2829, National Science Foundation IIS-1724405, and National Institutes of Health 1RO1MH125557 and 1RO1MH117155.
The authors declare no competing financial interests.
Correspondence should be addressed to Maxim Bazhenov at mbazhenov@health.ucsd.edu