## Abstract

Numerous experimental data suggest that simultaneously or sequentially activated assemblies of neurons play a key role in the storage and computational use of long-term memory in the brain. However, a model that elucidates how these memory traces could emerge through spike-timing-dependent plasticity (STDP) has been missing. We show here that stimulus-specific assemblies of neurons emerge automatically through STDP in a simple cortical microcircuit model. The model that we consider is a randomly connected network of well known microcircuit motifs: pyramidal cells with lateral inhibition. We show that the emergent assembly codes for repeatedly occurring spatiotemporal input patterns tend to fire in some loose, sequential manner that is reminiscent of experimentally observed stereotypical trajectories of network states. We also show that the emergent assembly codes add an important computational capability to standard models for online computations in cortical microcircuits: the capability to integrate information from long-term memory with information from novel spike inputs.

## Introduction

Neural computations in the brain integrate information from sensory input streams with long-term memory in a seemingly effortless manner. However, we do not yet know what mechanisms and architectural features of networks of neurons are responsible for this astounding capability. It has been conjectured that coactive ensembles of neurons, often referred to as cell assemblies (Hebb, 1949), and stereotypical sequences of assemblies or network states play an important role in such computations (Buszáki, 2010). In fact, a fairly large number of experimental studies (Abeles, 1991; Jones et al., 2007; Luczak et al., 2007; Fujisawa et al., 2008; Pastalkova et al., 2008; Luczak et al., 2009; Bathellier et al., 2012; Harvey et al., 2012; Xu et al., 2012) suggest that stereotypical trajectories of network states play an important role in cortical computations. However, it is not clear how those assemblies and stereotypical trajectories of network states could emerge through spike-timing-dependent plasticity (STDP).

There exists already a model for neural computation with transient network states: the liquid state machine, also referred to as liquid computing model (Maass et al., 2002; Haeusler and Maass, 2007; Sussillo et al., 2007; Buonomano and Maass, 2009; Maass, 2010; Hoerzer et al., 2012). Building on preceding work (Buonomano and Merzenich, 1995), this model shows how important computations can be performed by experimentally found networks of neurons in the cortex consisting of diverse types of neurons and synapses (including diverse short-term plasticity of different types of synaptic connections) and specific connection probabilities between different populations of neurons (instead of a deterministically constructed circuit). The liquid computing model proposes that temporal integration of incoming information and generic nonlinear mixing of this information (to boost the expressive capability of linear readout neurons) are primary computational functions of a cortical microcircuit. A concrete prediction of the model is that transient (“liquid”) sequences of network states integrate information from incoming spike inputs over time spans on the order of a few 100 milliseconds. This prediction has been confirmed by several experimental studies (Nikolić et al., 2009; Bernacchia et al., 2011; Klampfl et al., 2012). However, the liquid computing model did not consider consequences of synaptic plasticity within the microcircuit and could not reproduce the emergence of long-term memory in the form of assemblies or stereotypical sequences of network states. We show here that long-term memory traces automatically emerge in this model if one adds three experimentally supported constraints: (1) pyramidal cells and inhibitory neurons tend to be organized into specific network motifs, (2) synapses between pyramidal cells are subject to STDP, and (3) neural responses are highly variable (“trial-to-trial variability”).

We show that in the resulting biologically more realistic model, both assembly codes and stereotypical trajectories of circuit states emerge through STDP for repeatedly occurring spike input patterns. We also provide a theoretical explanation for this and demonstrate additional computational capabilities of the resulting new model for online computations with long-term memory in cortical microcircuits.

## Materials and Methods

We first review the traditional liquid computing model and then describe the modified model that is examined in this article. Next, we describe the STDP rule that is applied in this model. Finally, we specify the benchmark tasks that are used to evaluate the capability of the model for online computation on new spike inputs shown in Figure 12 and provide all technical details of our computer simulations.

##### Traditional liquid computing model.

The traditional liquid computing model (more formally called liquid state machine; Maass et al., 2002) is a recurrent network of excitatory and inhibitory spiking neurons (Fig. 1*A*) with a distance-dependent connection probability between any pair of neurons. The temporal dynamics of different neurons and synapses in the model can be diverse and nonlinear, as reported in experimental data on cortical microcircuits. In fact, theoretical results suggest that this diversity supports generic computations in cortical microcircuits (Maass et al., 2002). Two such generic computations (XOR as a typical nonlinear operation, and memory for a preceding spike input) are described below and they are used in this article to evaluate the performance of our modified liquid computing model on standard liquid computing tasks (Fig. 12). The computational capability of a liquid computing model is commonly tested by training memoryless linear readout neurons (modeling downstream neurons) with the help of a teacher (supervised learning) to approximate the target output of a specific computational task. In previously considered versions of the liquid computing model, no specific local connection patterns (such as specific network motifs) were introduced in the network and no long-term synaptic plasticity was considered within the network. Only the weights to readout neurons were trained, typically by algorithms such as linear regression or the FORCE algorithm (Sussillo et al., 2007), which did not aim at modeling biological synaptic plasticity. Rather, these algorithms aimed at measuring how much information the network contained about the target output of a specific computation.

##### Modified liquid computing model.

Experimental data show that generic cortical microcircuits are not assembled in a random manner from excitatory and inhibitory neurons, as in the traditional liquid computing model. In particular, as emphasized in Douglas and Martin (2004), ensembles of pyramidal cells on superficial and deep layers tend to form through mutual synaptic connections with inhibitory neurons into (soft) winner-take-all (WTA) circuits in which at most one pyramidal cell in the ensemble usually fires (Fig. 1*B*). Therefore, the modified liquid computing model is a recurrent network of WTA circuits of different sizes (Fig. 1*C*). According to Markov et al. (2011), the probability that two neurons are monosynaptically connected drops exponentially with the distance *d* between their somata. To mimic this distance-dependent connection probability in our microcircuit model, we arranged its units, the WTA circuits, on a 2D grid (with each WTA circuit occupying one grid point). The grid size is 10 × 5 for all networks described here. Therefore, each network consists of 50 WTA circuits. We then created a synaptic connection from an excitatory neuron *z̃ _{k′}* to some excitatory neuron

*z*in two WTA circuits of distance

_{K}*d*with probability

*p*(

*d*) = λexp(−λ

*d*), according to the data-based rule of Markov et al. (2011) (Figure 1

*D*). In addition,

*z*receives synaptic inputs from external sources (Fig. 1

_{K}*C*). Therefore, the presynaptic neurons

*y*

_{1}, … ,

*y*of a generic excitatory neuron

_{n}*z*in some WTA circuits are neurons in various other WTA circuits, as well as external input neurons. All of these synaptic connections are subject to short-term dynamics and STDP, as described below.

_{K}Each WTA circuit in our model consists of some finite number *K* of stochastically spiking neurons (*z _{1}* to

*z*in Fig. 2

_{K}*A*) with lateral inhibition. The number

*K*is drawn independently for each WTA circuit from a uniform distribution between a predefined minimum and maximum number (here: 2–10; 10–50 in Fig. 9). Therefore, the resulting recurrent network of WTA circuits consists on average of ∼300 neurons (1500 in the case of Fig. 9). The larger WTA circuits were needed for Figure 9 for the emergence of assemblies with sufficiently long firing patterns.

For each of these neurons, *z _{K}* is viewed as a model for a pyramidal neuron that receives (apart from the lateral inhibition) synaptic inputs from some other pyramidal cells (

*y*

_{1}, . . ,

*y*; Figure 2

_{n}*A*). The membrane potential of neuron

*z*(without the impact of lateral inhibition) is given by: where

_{K}*y*is the current value of the EPSP from the

_{i}(t)*i*-th presynaptic neuron, which is weighted by the current weight,

*w*, of their synaptic connection and

_{ki}*w*

_{k0}is a bias or excitability parameter of neuron

*k.*Each EPSP is modeled as an α-shaped kernel with rise time constant τ

_{rise}= 2 ms and the decay time constant τ

_{decay}= 20 ms as follows: where the sum is over all presynaptic spike times

*t*.

_{p}Neuron *z _{K}* outputs a Poisson spike train with instantaneous firing rate as follows:
Therefore, it fires with a rate proportional to the exponent of its current membrane potential

*u*normalized by the current total activation of all

_{k}(t)*K*neurons in the WTA circuit, Σ

_{j = 1}

^{K}

*e*

^{uj(t)}. R_{max}is the maximum sum of the firing rates of all neurons

*z*

_{1},…,

*z*in a WTA circuit.

_{K}The lateral inhibition of the WTA circuit (Fig. 2*A*) is modeled in an abstract way as the denominator of (3; “divisive normalization”). One can rewrite Equation 3 as *r _{k}(t)* =

*R*

_{max}· exp[

*u*−

_{k}(t)*I(t)*], with

*I(t)*= log Σ

_{j = 1}

^{K}

*e*, a term that can in principle be approximated by the inhibitory circuit in Figure 2

^{uj(t)}*A.*We chose here the more abstract implementation of Nessler et al. (2013), where the ratio in Equation 3 can be viewed as a discrete probability distribution over {1,…,

*K*}, which at any time determines how the total activity

*R*

_{max}is spread among the output neurons of a WTA circuit. This abstract way of modeling lateral inhibition has the side effect that it normalizes the sum of firing rates in the WTA circuit to a fixed value (set to 100 Hz here), even in the absence of external input. We have demonstrated in Figure 6

*D*(where this value was reduced to 70 Hz during those phases where the external input consisted just of noise spikes at a low rate) that this simplification of the model does not affect the effects that we study (emergence of cell assemblies) in a negative manner. Furthermore in other work (S. Haeusler, W.M., manuscript in preparation), it was shown that the abstract normalization of Equation 3 can be approximated via connections to and from a pool of inhibitory neurons (as in Fig. 2

*A*) that reproduces the experimentally found tracking of excitation through inhibition in cortical microcircuits (Okun and Lampl, 2008).

##### Synaptic plasticity.

All synaptic connections from external inputs to WTA circuits and between WTA circuits are assumed to be subject to short-term and long-term plasticity. It is well known that biological synapses are not static, and have a complex inherent temporal dynamics (Markram et al., 1998; Gupta et al., 2000). More precisely, the amplitude of an EPSP caused by an incoming spike not only depends on the current synaptic weight, but also on the recent history of input spikes. We modeled these short-term dynamics of synapses between excitatory neurons in the network after the traditional liquid computing model of Maass and Markram (2002). This model for short-term plasticity is based on Markram et al. (1998) and predicts the amplitude *A*_{k} of the *k*-th input spike in a spike train with interspike intervals Δ_{1},Δ_{2}, …,Δ_{k−1} as follows:
with hidden dynamic variables *u*,*R*∈[0,1] and initial values *u*_{1} = *U*, *R*_{1} = 1. The parameters *U*, *D*, and *F* in our model were chosen independently for each synapse from Gaussian distributions with means of 0.5, 0.11, and 0.005, respectively; the SD was always one half of the mean, with *y _{i}(t)* defined according to Equation 2. These distributions are based on data for synaptic connection between pyramidal cells according to Markram et al. (1998), except that the time constants of depression (

*D*) and facilitation (

*F*) are divided by 10. Because

*D*≫

*F*, these synapses are depressing, i.e., the amplitudes of EPSPs caused by successive spikes are decreasing. The smaller the interspike interval, the stronger is this decrease. In addition, the same synaptic connections are also assumed to be subject to long-term plasticity (STDP). More precisely, the following weight update is applied to the weight

*w*of the synaptic input from the

_{ki}*i*-th presynaptic neuron

*y*to neuron

_{i}*z*whenever neuron

_{K}*z*fires as follows: with

_{K}*y*defined by Equation 2. The corresponding STDP curve is shown in Figure 2

_{i}(t)*B.*The constant

*c*is chosen so that the weights are kept within the range of positive values. For our simulations, we have chosen

*c*= 0.05 · exp(5) = 7.42. This value arises from a learning rate of 0.05 and an offset of 5 for moving all weights into the positive range. If there is a recent preceding spike, the weight change is positive and depends on the current EPSP

*y*and on the current weight

_{i}(t)*w*(in fact, the positive part of the STDP curve has the shape of an EPSP). Whenever there is a postsynaptic spike that is not accompanied by an immediately preceding presynaptic spike of neuron

_{ki}*i*(i.e., the value of the current EPSP

*y*is low), the weight is reduced by a constant amount of 1. In this case, the firing of neuron

_{i}(t)*z*is typically caused by preceding firing of other presynaptic neurons

_{K}*y*, leading to an increase of

_{j}*w*according to the positive part of STDP. The resulting reduction of

_{kj}*w*can therefore be viewed as a qualitative model for the impact of synaptic scaling (Turrigiano, 2008) that might, for example, arise from a competition for AMPA-receptors within neuron

_{ki}*z*and tends to keep the sums of all weights relatively constant. The excitability of neuron

_{K}*z*is changed by Δ

_{K}*w*

_{k0}=

*c*·

*e*

^{−wk0}− 1 whenever neuron

*z*fires, and by Δ

_{K}*w*

_{k0}= −1 in every simulation time step (dt = 1 ms) where neuron

*z*does not fire. Moreover, all of these weight changes are applied with an adaptive learning rate that follows a variance tracking principle (Nessler et al., 2013). Initial values for weights and excitabilities are chosen to be 0.

_{K}This form (Equation 5) of the STDP learning rule was chosen because it facilitates a theoretical understanding of resulting changes in weights and excitabilities (Nessler et al., 2013; Habenschuss et al., 2013). More specifically, it supports a theoretical understanding of why STDP drives competing neurons in a WTA circuit to specialize each on a different cluster of (spatial) spike patterns in its high-dimensional spike input stream. It is shown in Figure 4 of Nessler et al. (2013) that the common form of the negative part of the STDP curve appears with this rule (Equation 5) if one pairs presynaptic and postsynaptic spikes at a medium frequency of ∼20 Hz. Furthermore, it is shown in Figure 8 of Nessler et al. (2013) and the surrounding discussion that the dependence of weight updates on the current weight in Equation 5 is qualitatively consistent with published experimental data (see also the discussion in section 2.1. of Habenschuss et al., 2013). We refer the reader to section 2.3 of Nessler et al. (2013) for other forms of this dependence that would also be optimal from a theoretical perspective.

In Figure 6*A*, bottom, and Figure 6*C*, we report the results of a control experiment with a more traditional STDP rule, where the contribution of a pair of a presynaptic and postsynaptic spike with interval Δ*t* = *t _{post}* −

*t*to the weight change is given by: We have set here A

_{pre}_{+}(w) = e

^{−wki}, A

_{−}(w) = − 1, τ

_{+}= 20 ms, and τ

_{−}= 60

*ms*. The additional zero means that the noise was added to each weight update of magnitude

*M*with SD σ = 0.3

*M*+ 10

^{−4}.

##### Testing the liquid computing capabilities of the network.

We tested the capability of our network to perform nonlinear computations on novel inputs by investigating the performance of a linear readout (trained by linear regression) on two standard benchmark tasks (Fig. 12). The first one was an instance of the binding problem. Consider a network that receives two input streams (Fig. 3*A*). In one input stream, a random sequence of spike patterns A and A′ is presented. The other input stream consists of a random sequence of two other spike patterns, B and B′ (Fig. 3*B*). Both patterns are superimposed by noise spikes (Fig. 3*C*, black dots). The target output is 1 in response to a pattern combination A and B′ or A′ and B, and 0 in response to a pattern combination A and B or A′ and B′. This computation is equivalent to the logical exclusive-OR (XOR) function, which decides whether two input bits are different or not. The equivalence becomes clear when one identifies patterns A, B each with 0, and patterns A′, B′ each with 1. It is well known that the XOR function cannot be computed linearly and cannot even be approximated well by a linear function. This XOR function has been used previously to test the nonlinear computation capabilities of a laminar cortical microcircuit model (Haeusler and Maass, 2007). If one uses a linear readout from the network, the nonlinear part of the computation has to be performed within the network.

Here, the weights of recurrent and input connections were initialized randomly, forming a similar distribution as if having been trained with STDP. A linear readout (which received as input the low-pass-filtered spike trains of the neurons in the network) was then trained by linear regression every 50 ms to decide whether one of the pattern combinations *A*,*B*′ or *A*′,*B* had been present during the last 50 ms. This procedure is illustrated in Figure 3*C* (target “XOR”). The input that was used for training the linear readout consisted of a 50 s sequence of spike patterns, which amounts to 1000 training samples for the regression. The size of the test set was 120 samples drawn from a 6 s input sequence that was newly instantiated.

As a measure for the performance of the readout, we used the point-biserial correlation coefficient between the analog linear readout and the binary target variable. Figure 3*D* demonstrates that a linear readout neuron cannot perform the desired nonlinear computation if it receives directly the two input streams (low-pass filtered) as its inputs. If, however, it is trained on the spike responses of the network, it can achieve a performance above chance level (Fig. 3*D*, gray bars labeled “network”). Any correlation value significantly greater than zero indicates nonlinear transformations by the network itself (for a formal proof, see Nikolić et al., 2009).

We tested the temporal integration capability of the network with another standard benchmark task. We trained a further linear readout by linear regression to decode the identity of the pattern that had been presented in input stream 2 in the interval [−100 ms, −50 ms], i.e., 50 ms before the current point in time (Fig. 3*C*, target “memory”). The target output was 1 if this earlier input pattern was B′, else 0. In addition, in this task, the readout can perform significantly better on the network output than on the stimulus directly (Fig. 3*D*).

We used these two computational tasks to analyze in Figure 12 how computational performance of the network on novel inputs is degraded through long-term memory traces.

##### Computer simulations and data analysis.

All simulations, calculations, and data analysis methods were performed in Python using the NumPy and SciPy libraries. Figures were created using Python/Matplotlib and MATLAB. The time step of the simulations was chosen to be 1 ms.

##### Determining cell assemblies.

When adapting to an input with repeating patterns embedded within a continuous Poisson input stream, the network produced long-term memory traces in the form of transiently active cell assemblies (Fig. 4). If different patterns had been embedded into the input stream, different assemblies emerged simultaneously (Fig. 5). To determine which neurons belonged to an assembly, we used the peri-event time histogram (PETH) of all neurons of the network in response to input patterns. The PETH was estimated with bin size 1 ms from 100 successive presentations of a given input pattern. In addition, the PETH was smoothed with a 40 ms Hamming window for further data analysis. The resulting value, is an observable measure of the temporal activation trace defined by the instantaneous Poisson firing rate of neuron *i* (Equation 3). We defined a neuron *i* as belonging to a stimulus-induced assembly if *r _{i}^{PETH}(t)* reached a threshold value of 99 Hz (80 Hz in the experiment with the traditional STDP rule in Fig. 6) in at least one simulation step during the duration of the associated input pattern.

##### Analysis of the sequential activation of neurons.

Having determined which neurons belong to an assembly, we ordered them according to their mean activation time during the presentation of an input pattern to reveal the stereotypical sequential activation of the memory trace. This mean activation time was calculated as follows. For each neuron *i*, we considered the temporal activity profile during pattern presentations of a particular input pattern, *r _{i}^{PETH}(t)*, for

*t*= 1,…,

*T*, where

_{p}*T*is the length of the pattern in simulation time steps (e.g.,

_{p}*T*= 300 for a pattern duration of 300 ms and simulation time step 1 ms). We computed the center of mass

_{p}*t**

_{i}of this temporal activity profile as follows: In Equation 7, we interpret time as the angle in the complex plane and compute the angle of the average complex number, weighted by the activation

*r*. To analyze sequential activation of neurons we ordered them according to their values of

_{i}^{PETH}(t)*t**

*. This mean activation time is equivalent to the mean spike latency defined in Luczak et al. (2009).*

_{i}##### Similarity and stereotypy of cell assemblies.

We used several criteria to quantify the stability of the emerging cell assemblies. We use the term *similarity* to specify how reliably the order of activation is maintained across different replays of the same memory trace (in the following called “trials”). This is measured as Spearman's rank correlation, which evaluates the correlation between two rank variables. We followed here the procedure from Luczak et al. (2009) and calculated histograms of rank correlations between the mean activation times on single trials (defined by the mean spike latency, i.e., the average spike time, in this trial) and the mean activation times on the average response over multiple trials (estimated by the PETH and Equation 7). Rank correlation ranges from −1 to +1, values greater than zero indicate that the order of activation is mostly maintained across trials.

Conversely, we use the term “stereotypy” to assess how the exact shape of the temporal activation traces of individual neurons are maintained across different trials. This is measured simply as the correlation between the temporal activity trace of a single neuron (defined by its PETH) during two successive trials, averaged over all neurons participating in the assembly.

##### Determining the onset of spontaneous upstates.

In the absence of stimulation, the network produces spontaneous activity during which previously learned memory traces are replayed (Fig. 7). The onsets of these spontaneous replays can only be determined approximately. Once the neurons participating in the assembly are determined (see above), one can compute the summed activation of the assembly, *r(t)* = Σ* _{i}r_{i}^{PETH}(t)*. We filtered this summed trace with a rectangular window of the expected duration of the memory trace (which is given by the duration of the previously learned patterns

*T*). We then looked for maxima in this filtered trace that were above a certain threshold. If such a maximum is located at

_{p}*t′*, then an approximate interval for an upstate was reported as [

*t′*−

*T*/2;

_{p}*t′*+

*T*/2].

_{p}## Results

It is well known that cortical microcircuits are composed of generic network motifs: ensembles of excitatory neurons (pyramidal cells) that are subject to lateral inhibition. (Douglas and Martin, 2004). These network motifs are commonly called WTA circuits. The power and precision of lateral inhibition among pyramidal cells in superficial and deep layers of cortical columns has been confirmed through a large number of experimental studies (Okun and Lampl, 2008; Ecker et al., 2010; Gentet et al., 2010; Fino and Yuste, 2011; Isaacson and Scanziani, 2011; Packer and Yuste, 2011; Avermann et al., 2012). In this article, we investigated models of cortical microcircuits that are structured—in accordance with these data—as networks of WTA circuits (Fig. 1). We show that if one takes also the experimentally observed stochasticity of neural responses into account, then the application of STDP in this model has a rather clear and interesting impact on network dynamics and computations: long-term memory traces in the form of stereotypical trajectories of network states emerge and support online computations.

### Emergence of stereotypical trajectories of network states

We consider a generic randomly connected recurrent network of stochastic WTA circuits in which all synaptic connections between pyramidal cells are subject to STDP (Fig. 1 and see Materials and Methods). We injected into such network a high-dimensional stream of Poisson spike trains (randomly distributed to the neurons in the network). Initially, the neurons in the network responded in a “chaotic” asynchronous irregular manner without any visible structure (Fig. 4*A*, bottom left). This network response changed drastically when some spatiotemporal pattern (a “frozen” Poisson spike pattern of 300 ms length overlaid by random noise in the form of Poisson spikes) was repeatedly embedded into the spike input stream. This noise was actually quite substantial, making up 2/5 of the spikes during a pattern presentation. Nevertheless, after 100 s, the network started to respond to each occurrence of the embedded pattern with structured activity of a specific subset of neurons in the circuit (Fig. 4*A*, bottom right). In the terminology of Buszáki (2010), one can describe this phenomenon as the emergence of an assembly code for the repeated input pattern. Furthermore, the neurons in this assembly tended to fire in some loose order, creating a stereotypical trajectory of network states for this assembly. This stereotypical network response generalized to compressed and dilated variations of the embedded input pattern (Fig. 4*A*, right).

To evaluate quantitatively to what extent the observed sequential firing within the emergent assembly is maintained across different pattern presentations, we computed rank correlations between the mean activation times of responses during individual pattern presentations and those of the average response to the pattern. This is the method that had been proposed in Luczak et al. (2009) for evaluating to what extent a firing order is preserved. Figure 4*C* shows that these rank correlations are significantly larger than 0, indicating that the firing order is mostly preserved across pattern presentations. The distribution of rank correlations in the model is qualitatively similar to those found in cortical microcircuit *in vivo* in response to sensory stimuli (Luczak et al., 2009).

We then injected in a second experiment two different patterns of the same type (again embedded into noise) into the input stream for the network (Fig. 5, red and green spike patterns). Two different cell assemblies emerged after 50 s of simulated biological time, one for each of the two patterns (Fig. 5*A*, top two groups in bottom panel). The neurons from these assemblies tended to fire sequentially, and this emergent ordering was maintained across pattern presentations (compare Fig. 5*B*,*E*). The third group of neurons (Fig. 5*A*, bottom) fired irregularly, with less activity during the presentation of either pattern. Figure 5*D* shows the average spike response of both assemblies over 100 different pattern presentations. It can be seen that neurons that responded strongly during presentation of the red pattern tended to be silent during the green pattern and vice versa. Moreover, for both the red and green pattern, the sequential activation of neurons in the corresponding assembly was largely maintained during different presentations of the patterns, as shown by the mostly positive rank correlations in Figure 5*E.* To determine whether the emergence of the two input specific cell assemblies could be a result of a specific accidental network topology, we repeated the whole experiment 100 times, each time with a new randomly constructed recurrent network of WTA circuits. For each network, we measured the stereotypy of responses of neurons in the two emerging assemblies by the average correlation between the temporal activity trace of a single neuron during two successive presentations of the same pattern. Figure 5*F* shows that the stereotypy of each assembly increased with time for both the red and green pattern, reaching a plateau after ∼25 s of noise-embedded input pattern presentations. Conversely, the correlation of temporal activity traces between pairs of neurons from different emergent assemblies remained low (Fig. 5*F*, black trace).

These tests confirmed that cortical microcircuit models of the considered structure reliably form two different cell assemblies in response to two different input patterns. This holds despite the fact that each stereotypical spike input pattern is superimposed by noise spikes (making up 2/5 of the spikes during a pattern presentation). Therefore, each assembly is activated by a fairly large cluster of somewhat similar input patterns rather than by a single precisely repeated input pattern. The resulting stereotypical trajectories of network states in these microcircuit models are qualitatively similar to the recently recorded activity patterns in posterior parietal cortex during two types of memory-based behaviors (“left trials” and “right trials” in Harvey et al., 2012).

Control experiments (Fig. 6) show that depressing of short-term dynamics of synapses and STDP on recurrent connections are necessary for the emergence of stimulus-specific assemblies as in Figure 5, whereas STDP on synapses from input neurons and a lower level of network activity in the absence of external input (Fig. 6*D*) are not. The third panel of Figure 6*A*, *B* shows that if the recurrent network of WTA circuits is required by a single very large WTA circuit, stimulus specific assemblies emerge that fire in a more precise order than in experimental data from cortex (but reminiscent of recordings from area HVC in songbirds). The last panel of Figure 6*A*, *C* shows that stimulus specific assemblies also emerge with a traditional form of the STDP curve (see Equation 6 in Materials and Methods).

### Replay of stored memory traces during spontaneous activity

It has been shown that, in the absence of sensory input in their spontaneous activity, cortical circuits often produce firing patterns that are similar to those observed during sensory stimulation (Luczak et al., 2007; Luczak et al., 2009; Xu et al., 2012 and references therein). We investigated whether a similar effect could be observed in our model. We compared the spontaneous activity of the network shown in Figure 5*A–E* before and after learning. Figure 7*A* shows the spike trains of randomly selected neurons, which are drawn from the same groups as in Figure 5*A* during spontaneous activity. Neurons that belong to the two different cell assemblies that have emerged in response to the two input patterns (Fig. 7*A*, top two groups) were spontaneously activated and fired in approximately the same order as during stimulation with the corresponding input pattern (neurons are ordered according to their mean activation time during previous pattern presentations). Sometimes the sequential activation did not reach all neurons in the assembly. Apart from these randomly initiated firing sequences, the neurons in these two assemblies remained relatively silent. The third group of neurons (Fig. 7*A*, bottom group) fired irregularly and their rate decreased when one of the two assemblies was spontaneously activated.

Figure 7*B*, *C* analyzes the difference between spontaneous and stimulus-induced firing activity of the assemblies in more detail. Figure 7*B* shows the average firing pattern of one assembly over 100 different spontaneous activations. One can still see a stereotypical firing order in the assembly, which is, according to Figure 7*C*, right, similar to the generic firing order induced by external stimulation. However, the firing order was less precise across spontaneous activations and compressed in time (compare Fig. 7, Fig. 5*A*).

These emergent firing properties during spontaneous activity of the microcircuit model are qualitatively similar to the experimental data of Xu et al. (2012), who found that stimulus-induced sequential firing patterns in primary visual cortex of rat were also observed during spontaneous activity, although with less precision and at a higher speed. This occurred after 100 presentations of a visual stimulus (a moving dot), which is in the same range as the number of preceding presentations of each stimulus in our model (∼75 presentations).

Interestingly, the spontaneous replay of trajectories occurred in our model at a higher speed than during a stimulus-entrained replay (as in Fig. 5). This is a commonly observed feature of replay of experience-induced trajectories of network states of neurons in the hippocampus and neocortex (Euston et al., 2007; Ji and Wilson, 2007).

### Interlinking of cell assemblies

It has been proposed that cell assemblies are the basic tokens of the “neural syntax**”** (Buszáki, 2010) just as words are the basic tokens of the syntax of language. This theory raises the question of whether there are neural mechanisms that are able to concatenate assemblies that are frequently activated one after the other. Our model suggests that such a concatenation could emerge already through STDP without requiring additional mechanisms such as those discussed in Buszáki (2010). In a continuation of the experiment described in Figure 5, we exposed the network to a 100 s input stream in which the first type of pattern was always followed immediately by the second type of input pattern. In Figure 8, neurons are sorted according to their mean activation time during the red–green pattern combination in the middle. The resulting firing pattern can be viewed as a “neural sentence” that consists of two concatenated assemblies. Presenting the green pattern alone resulted in activation of the corresponding assembly alone. However, presenting the red pattern triggered the sequential activation of both assemblies (the “neural sentence”) regardless of whether the red pattern was followed by the green pattern or not. This demonstrates that both assemblies had been interlinked through STDP.

### Theoretical foundations of the model

The emergence of stereotypical trajectories of network states in the model can be understood on the basis of two theoretical principles. The first principle is the emergence of sparse neural codes for repeating spatial spike input patterns through STDP in WTA circuits. The WTA circuit induces not only a competition for firing in response to an input pattern among its neurons, but also a competition for becoming an “expert” neuron for a repeating input pattern that only fires for this pattern (and for variations of it). This long-term consequence of the competition for firing in the presence of STDP is obvious, because only those neurons that fire can adjust their synaptic weights via STDP. This effect has been known for a long time for competitive Hebbian learning in nonspiking artificial neural networks (Rumelhart and Zipser, 1985). Recently, it has been shown that this mechanism also works very well for STDP in a WTA circuit (Gupta and Long, 2009; Masquelier et al., 2009). In fact, under idealized conditions it approximates the expectation maximization process for fitting the weight vectors of the neurons to frequently occurring input pattern (more precisely, for fitting an implicit generative model to the distribution of spike inputs; Nessler et al., 2013).

The second theoretical principle concerns the effect of STDP in networks of WTA circuits. In this case, a generic neuron in a WTA circuit receives not only external input, but also synaptic input from neurons in other (or the same) WTA circuits. Therefore, it receives external input in conjunction with the spike response of other neurons in the network that creates a temporal context for the current external input. This effect has been described previously (Rao and Sejnowski, 2001) in a simpler context. In the case of a single WTA circuit with complete lateral excitatory connectivity (in addition to the lateral inhibition), this process can under idealized conditions be understood rigorously as the emergence of a hidden Markov model (HMM) in which each competing neuron becomes an expert for one hidden state of its spike input, so that the resulting sequence of hidden states can explain the temporal structure of the spike input stream (D. Kappel, B. Nessler, W.M., manuscript in preparation). In the case of a recurrent network of WTA circuits of different sizes with biologically realistic sparse connectivity, the network carries out a multiscale hidden state analysis for the external input stream, where larger WTA circuits provide a higher resolution for hidden state representations. Both of these theoretical principles require substantial noise (trial-to-trial variability) in the network, which is implemented in our model through the stochastic selection of “winners” in the WTA circuits. Without such variability, the described self-organization could not emerge, because the process would get stuck in the next local optimum of a fitness function.

STDP achieves in this model only an online approximation of the forward pass of the well known forward–backward algorithm for HMM learning (Bishop, 2006; D. Kappel, B. Nessler, W.M., manuscript in preparation). This implies that an input pattern that is presented in two different temporal contexts (Fig. 9*A*, blue pattern) may or may not be encoded in the network by a common assembly—that is, by a common guessed hidden state of the input source. Therefore, the underlying theory does not guarantee that the blue input pattern is represented by different assemblies depending on whether it follows the red or green input pattern. Our computer simulations (Fig. 9) show that separate assemblies emerge for the blue pattern in different contexts, but that this differential representation is somewhat unstable (it works only if the blue input pattern is not too long and if it is a rate pattern rather than a spike order pattern). We conclude that additional learning mechanisms (e.g., reward-modulated STDP) are needed to stabilize and extend differential assembly representations for very long common input patterns in different contexts. This hypothesis has an interesting relationship to a detail of the experimental setup and training procedure of Harvey et al. (2012; see in particular their supplemental Fig. 1). The mouse was trained there to run after two visually distinct initial parts of mazes through a visually neutral part. Different neural assembly responses for this neutral part of the maze (in dependency of the different initial parts of the mazes) emerged after a complex training procedure, where this neutral part of the maze was inserted (and extended) between the initial parts and reward locations in a stepwise process. The subsequent decision (turn left or turn right, depending on the color of the initial part of the maze) provides there an incentive to represent the neutral part of the maze by different assemblies, depending on the context (i.e., the initial part of the maze). From a theoretical perspective, such a reward-based learning protocol can introduce (through rejection sampling) the full power of HMM learning, where common parts of input sequences are definitely represented by different hidden states if this is beneficial for future decisions.

This experiment reveals an interesting difference from the self-organization property of the model considered in Liu and Buonomano (2009), in which two different brief initial stimuli produced on the basis of a suitable learning rule (termed presynaptic scaling) two different stereotypical firing responses that recruited in either case *all* neurons in the recurrent network. No separate assemblies of neurons emerged in that model for these two different input conditions, in contrast to our model and the experimental data of Harvey et al. (2012). With regard to experimental data on the dependence of assembly codes on the spatial context of a sensory stimulus, we refer the reader to Itskov et al. (2011). This is an interesting open question whether these data can also be reproduced by our model.

### Liquid computing with long-term memory

The liquid computing model (Maass et al., 2002; Buonomano and Maass, 2009) is an attempt to explain how rather stereotypical cortical microcircuits that consist of diverse types of neurons and (dynamic) synapses with many different time constants could support diverse computational processes in many parts of cortex. It proposes that these microcircuits carry out in particular two generic computational operations: integration of temporally dispersed information from its spike inputs and nonlinear preprocessing to enhance the computational capability of linear readout neurons. In preceding versions of the liquid computing model, long-term memory was restricted to the synaptic weights of readout neurons. In contrast, the model presented here allows that any synaptic connection between excitatory neurons in the network (or “liquid”) can acquire long-term memory via STDP. Another innovation is that this model takes experimentally found stereotypical network motifs (WTA circuits) into account. In this improved liquid computing model, repeating external spike inputs generate stereotypical trajectories of network states that also modify the structure of its spontaneous activity.

To investigate the resulting computational properties of the model, we repeated an experiment from Maass et al. (2002), in which spoken words were used to generate spike input patterns for the network (Fig. 10 *A*,*B*). We used speech samples from the well known TI46 dataset (also used by Hopfield and Brody, 2000), which consists of isolated spoken digits. We preprocessed the raw audio samples with a model of the cochlea (Lyon, 1982) and converted the resulting analog cochleagrams (Fig. 10*A*) into spike trains as in Schrauwen and Campenhout (2003) (Fig. 10*B*). We used 10 different utterances of digits “one” and “two” of a single speaker. Seven of these utterances were used for training; the remaining three were used for testing. We embedded noisy variations of these cochlear spike trains in random order into Poisson spike trains of constant rate. This constituted the spike input to our model, a recurrent network of WTA circuits with STDP as in the preceding experiments.

In contrast to the results in Maass et al. (2002), it is no longer necessary to train a readout neuron by a teacher for the classification of different spoken digits. The network learned to detect and discriminate between different digits by an emergent assembly coding: presentation of digits “one” and “two” activated after a short while two different assemblies of neurons. Moreover, this behavior generalized to unseen test utterances. In addition, there were neurons that were silent during the presentation of either digit and fired irregularly in the absence of any input pattern. The firing patterns of the two assemblies, which exhibit in this case hardly any specific firing order due to the less prominent temporal structure of the spike input patterns, are very easily separable (see their first principle components in Fig. 10*D*). Therefore, the neurons in a WTA circuit of external readout neurons learns autonomously during 100 s to separate and classify the spoken digits (see the firing of neurons 1 and 2 in the spike raster of the 4 neurons of this WTA-readout shown in Fig. 10*C*, third panel). In contrast to the classical liquid computing model, in this case, no teacher is needed to train a readout neuron to classify the spoken digits. Figure 10*F* shows that the performance of such readout without a teacher was very bad before STDP was applied to synapses within the current network (i.e., “training time” = 0), even worse than if applied directly to the input spike trains (Fig. 10*F*, red line). However, after applying STDP to synapses in the recurrent network, the performance of this readout without a teacher improved very fast, indicating an important functional property of the emergent stimulus-specific assembly codes.

One interesting feature of our model is that it also reproduces a characteristic error that had been observed in the experiments of Harvey et al. (2012), which might potentially be typical for trajectory-based computations. Harvey et al. (2012) observed that the trajectory that is characteristic for the current stimulus jumped occasionally to the trajectory that is characteristic for the other stimulus (Fig. 4*F* of Harvey et al., 2012). This effect also occurred in our model, as shown in Figure 10*E.* The black trajectory shown there (which occurred for the simulus “one”) jumped after a while to an area of the state space that is usually visited by the trajectory for the stimulus “two.”

Figure 11 illustrates two further computational properties of the modified liquid computing model. If parts of the presented spoken digits were omitted for a duration of 100 ms, the corresponding assembly continued to fire, thereby bridging this gap in external information. Furthermore, the network exhibited a new generalization capability. We created novel stimuli by generating spike trains from a pseudocochleagram that was obtained by taking the mean of the cochleagrams of two specific utterances, one of digit “one” and one of digit “two” (Fig. 11*B*). As seen in Figure 11*A*, right, the network responded to the spike patterns corresponding to these “averaged” digits on each trial with one of the two assemblies that had previously emerged in response to spoken words “one” and “two.” This response is qualitatively similar to the experimental data of Ramos et al. (1976), who had already shown very early that neural responses to novel interpolations between two overtrained stimuli tended to be very similar to the neural responses for one or the other of the two overtrained stimuli.

Finally, we explored the tradeoff between the computational benefits of an untrained network and a network in which the synapses had been subjected to STDP during recurring input presentations. An untrained network can carry out generic temporal integration (fading memory) and nonlinear preprocessing operations on arbitrary and, in particular, on completely novel input streams very well. Figure 12 shows that this capability is reduced through STDP, when the network develops assemblies that respond to repeating network inputs in a stereotypical manner. Therefore, our model suggests that it is essential that synaptic plasticity is limited, at least for cortical networks that need to retain the capability to process novel inputs in a flexible manner. Such limitation of synaptic plasticity could, for example, be implemented in the brain by allowing during any learning episode only a small subset of all synapses to be modified by STDP. Molecular mechanisms of this type have been proposed in Silva et al. (2009).

## Discussion

We have shown that STDP on synaptic connections between excitatory neurons causes the emergence of input-specific cell assemblies, provided that there is sufficiently much stochasticity (or trial-to-trial variability) in the network, and excitatory neurons are subject to local lateral inhibition. The resulting network activity in the model is strikingly similar to the experimentally observed stereotypical trajectories of network states in sensory cortices (Jones et al., 2007; Luczak et al., 2007; Luczak et al., 2009; Bathellier et al., 2012), hippocampus and prefrontal cortex (Fujisawa et al., 2008), and parietal cortex (Harvey et al., 2012). Furthermore, stereotypical trajectories of network states emerged in our model after approximately the same number of input repetitions (100 repeats) as in the experiments of Xu et al. (2012). Therefore, our model may help to narrow the gap between network activity of cortical microcircuit models and experimentally observed firing patterns of neurons in the cortex.

The experimentally observed stochasticity of neuronal firing (which is likely to result primarily from stochastic synaptic vesicle release; Isaacson and Walmsley, 1995; Tsodyks and Markram, 1997; Goldman et al., 2002; Branco and Staras, 2009) in combination with the experimentally observed primarily depressing dynamics of synaptic connections between excitatory neurons (Tsodyks and Markram, 1997; Goldman et al., 2002) are important ingredients of our model. They avoid the network getting entrained in an almost deterministic stereotypical activity pattern that becomes increasingly independent from external input and tends to reduce the capability of the network to carry out computations on synaptic inputs from neurons outside of the network. Izhikevich et al. (2004) showed previously that synaptic depression alone does not suffice for that. The work of Liu and Buonomano (2009) suggests that, in addition to stochasticity and depressing synapses, local lateral inhibition is also required for the emergence of input-specific assemblies through STDP. However, they showed that instead of local lateral inhibition, an additional rule for synaptic plasticity (which they called presynaptic-dependent synaptic scaling) generates stimulus-specific stereotypical firing patterns. These differed from the ones observed in our model insofar as they recruited every neuron in the network and each neuron fired exactly once during such firing pattern. In addition, the duration of assembly activations adjusted themselves in our model to the time course of external stimuli (Fig. 5*A*) and were able to cover larger time spans than with the mechanisms investigated in Liu and Buonomano (2009). The presynaptic scaling rule of Liu and Buonomano (2009) and the heterosynaptic learning rule considered in Fiete et al. (2010) induce an interesting difference in the self-organization of the network dynamics: they organize all neurons in the network into a linear firing order where a single neuron fires at any moment of time (rather than creating assemblies and sequences of assemblies). These long chains are not dependent on ongoing external inputs and are somewhat reminiscent of the stereotypical autonomous dynamics of the area HVC in songbirds (for details, see Fiete et al., 2010).

### Relationship to other models

Synfire chains are a special type of stereotypical trajectory of network states, the occurrence of which had been proposed in Abeles (1982, 1991). A synfire chain is conceptually rather close to an assembly sequence in the sense of Buszáki (2010). The main difference is that one assumes that the neurons within a single assembly of a synfire chain fire almost simultaneously. This assumption provides a nice explanation for precisely timed firing patterns observed in the prefrontal cortex of monkeys with electrode recordings from a few neurons (Abeles et al., 1993). However, it is possible that the less precisely repeating stereotypical trajectories that emerged in our model (Fig. 5) also provide a statistically valid explanation of these experimental data.

The possible role of synfire chains and related notions of cell assemblies in cognitive processes and memory had been discussed on a more abstract level previously (Wennekers and Palm, 2009; Wennekers and Palm, 2000). The impact of STDP in a recurrent network of spiking neurons had also been analyzed extensively previously (Morrison et al., 2007). In the absence of external input, the network was found to reach a stable distribution of synaptic weights. However, external input induced quite different firing regimes (“synfire explosion”) in their model. This difference from our model may result from the inclusion of synaptic dynamics and local lateral inhibition in our model. The important role of lateral inhibition in this context had already been highlighted by Lazar et al. (2009), although in a more idealized model with synchronized threshold gates instead of asynchronously firing spiking neurons. In that model, an STDP-like rule in combination with homeostasis was shown to generate input-specific trajectories of network states. A nice review of a variety of effects of STDP in recurrent networks of spiking neurons is given in Gilson et al. (2010).

It is an interesting open question whether STDP in our model can also reproduce the emergence of continuous attractors (Samsonovich and McNaughton, 1997) that represent continuous spatial memory.

### Experimentally testable predictions of our model

Monitoring the activity of many neurons in awake animals over several days and weeks while the animal learns the behavioral significance of specific sensory stimuli is now becoming possible through imaging of intracellular Ca^{2+} traces in identified neurons (Huber et al., 2012). Our model predicts that such learning processes cause the emergence of assemblies and assembly sequences for behaviorally relevant sensory stimuli. Through optogenetic stimulation (Stark et al., 2012), one can in principle also generate artificial spike inputs in a local microcircuit, and our model predicts that this will also cause the emergence of stimulus-specific assemblies and trajectories of network states. Our model also predicts that this effect is abolished through inactivation of inhibitory neurons.

In addition, our model predicts (Fig. 12) that an important tradeoff can be observed in the dynamics of local networks of neurons. At one end of this tradeoff curve, one will find networks that can differentially respond to novel stimuli in such a way that substantial information about these stimuli can be read off from their resulting firing activity (Nikolić et al., 2009; Klampfl et al., 2012). At the other end of the tradeoff curve, one will find network responses as shown in Figure 11, which are more stereotypical and contain primarily information on how similar the current stimulus is to some specific familiar stimulus. We propose that microcircuits in different parts of the brain operate at different positions on this tradeoff curve. Furthermore, development and extended learning periods are likely to move at least some of these microcircuits into the more stereotypical regime shown in Figure 12, right. A shift toward more stereotypical trajectories of network states as a result of learning had been described previously (Ohl et al., 2001).

### Summary

The model for learning in cortical microcircuits that we have presented here is a simple extension of the liquid computing model. Whereas the original version of the model (Maass et al., 2002) was only able to model online processing of novel stimuli in a “naive” network, the extension presented here also addresses the formation and computational use of long-term memory traces. Furthermore, in the original version, one had to rely on the contribution of readout neurons (that had to be trained by a teacher) for arriving at a result of a computation (e.g., a classification). The emergence of stimulus-specific assemblies in the extended model makes such external contributions unnecessary (Fig. 10*F*). Our model provides a platform for investigating the functional role of assemblies and assembly sequences as potential building blocks for “neural sentences” that may underlie higher brain functions such as reasoning (Buszáki, 2010). At the same time, our model is consistent with many experimental data on synaptic plasticity and the anatomy and physiology of cortical microcircuits.

## Footnotes

This work was supported in part by the European Union (project #FP7–248311, AMARSi, and project #FP7–269921, BrainScaleS).

The authors declare no competing financial interests.

- Correspondence should be addressed to Wolfgang Maass, Institute for Theoretical Computer Science, Graz University of Technology, Inffeldgasse 16b/I, A-8010 Graz, Austria. maass{at}igi.tugraz.at