Abstract
Triggered by recent experimental results, temporally asymmetric Hebbian (TAH) plasticity is considered as a candidate model for the biological implementation of competitive synaptic learning, a key concept for the experience-based development of cortical circuitry. However, because of the well-known positive feedback instability of correlation-based plasticity, the stability of the resulting learning process has remained a central problem. Plagued by either a runaway of the synaptic efficacies or a greatly reduced sensitivity to input correlations, the learning performance of current models is limited. Here we introduce a novel generalized nonlinear TAH learning rule that allows a balance between stability and sensitivity of learning. Using this rule, we study the capacity of the system to learn patterns of correlations between afferent spike trains. Specifically, we address the question of under which conditions learning induces spontaneous symmetry breaking and leads to inhomogeneous synaptic distributions that capture the structure of the input correlations. To study the efficiency of learning temporal relationships between afferent spike trains through TAH plasticity, we introduce a novel sensitivity measure that quantifies the amount of information about the correlation structure in the input that a learning rule is capable of storing in the synaptic weights. We demonstrate that by adjusting the weight dependence of the synaptic changes in TAH plasticity, it is possible to enhance the synaptic representation of temporal input correlations while maintaining the system in a stable learning regime. Indeed, for a given distribution of inputs, the learning efficiency can be optimized.
Key words: Hebbian learning; spike-timing-dependent plasticity; synaptic updating; symmetry breaking; unsupervised learning; infomax; activity-dependent development
Introduction
Correlation-based plasticity has long been proposed as a mechanism for unsupervised experience-based development of neuronal circuitry, particularly in the cortex. However, the specifics of a biologically plausible model of plasticity that can also account for the observed synaptic patterns have remained elusive. Two major issues are stability and competition (Miller and MacKay, 1994; Miller, 1996; Abbott and Nelson, 2000; Song et al., 2000; van Rossum et al., 2000; Rao and Sejnowski, 2001; van Ooyen, 2001). If maps, such as ocular dominance maps, emerge from initially random (but statistically homogeneous) synaptic configurations by a Hebbian mechanism (but see Crowley and Katz, 2000), this would imply that there is an inherent instability in the dynamics of synaptic learning that destabilizes an initially homogeneous synaptic pattern. However, this raises the question as to what mechanism prevents synapses from growing to unrealistic values when driven by unstable dynamics. The emergence of inhomogeneous synaptic patterns also requires a competition mechanism that makes some synapses decrease their efficacies as other synapses grow in strength. Such competition is absent in the most naive Hebb rule, which contains only a mechanism for synaptic enhancement. Recent experiments have led to an important refinement of correlation-based or Hebbian learning, by showing that activity-induced synaptic changes can be temporally asymmetric with respect to the timing of presynaptic and postsynaptic action potentials with a precision of down to tens of milliseconds. Causal temporal ordering of presynaptic and postsynaptic spikes induces synaptic potentiation, whereas the reverse ordering induces synaptic depression (Levy and Steward, 1983; Debanne et al., 1994, 1998; Magee and Johnston, 1997; Markram et al., 1997; Bi and Poo, 1998, 2001; Zhang et al., 1998; Feldman, 2000; Sjöström et al., 2001).
In this work, we address the question of whether temporally asymmetric Hebbian (TAH) plasticity rules provide an adequate mechanism for unsupervised learning of input correlations. Two models of TAH plasticity have been studied recently that differ in the way that they implement the weight dependence of the synaptic changes and the boundaries of the allowed range of synaptic efficacies. The additive model (Abbott and Blum, 1996; Gerstner et al., 1996; Eurich et al., 1999; Kempter et al., 1999, 2001; Roberts, 1999; Song et al., 2000; Levy et al., 2001; Câteau et al., 2002) assumes that changes in synaptic efficacies do not scale with synaptic strength, and the boundaries are imposed as hard constraints. This model retains inherently unstable dynamics while exhibiting strong competition between afferent synapses. Because this model yields binary synaptic distributions, its ability to generate graded representations of input features is restricted. Moreover, because of the strong competition, patterns in the synaptic distribution can emerge that do not reflect patterns of correlated activity in the input. On the other hand, the multiplicative model (Kistler and van Hemmen, 2000; van Rossum et al., 2000; Rubin et al., 2001) assumes linear attenuation of potentiating and depressing synaptic changes as the corresponding upper or lower boundary is approached. This model results in stable synaptic dynamics. However, because of reduced competition, all synapses are driven to a similar equilibrium value, even at moderately strong input correlations. Thus, neither the additive nor the multiplicative model provides a satisfactory scenario for a robust learning rule that implements a synaptic storage mechanism of temporal structures in the inputs. Here, we introduce the nonlinear TAH (NLTAH) model, a novel generalized updating rule that allows for continuous interpolation between the additive and multiplicative models.
We demonstrate that by appropriately scaling the weight dependence of the updating, it is possible to learn synaptic representations of input correlations while maintaining the system in a stable regime. Preliminary results have been published previously in abstract form (Aharonov et al., 2001; Gütig et al., 2001).
Materials and Methods
Temporally asymmetric Hebbian plasticity. We describe TAH plasticity as a change in the synaptic efficacy w between a pair of cells, where the range of w is normalized to [0, 1]. A single pair of presynaptic and postsynaptic action potentials with time difference Δt ≡ t_{post} − t_{pre} induces a change in synaptic efficacy Δw given by:

Δw = λ f_{+}(w) K(Δt) if Δt > 0, Δw = −λ f_{−}(w) K(Δt) if Δt ≤ 0, (1)

with the weight-dependent scaling functions and the temporal filter

f_{+}(w) = (1 − w)^{μ}, f_{−}(w) = αw^{μ}, K(Δt) = e^{−|Δt|/τ}, (2)

where λ is the learning rate, α scales the strength of depression relative to potentiation, τ sets the width of the learning window, and the updating parameter μ ∈ [0, 1] interpolates between the additive (μ = 0) and multiplicative (μ = 1) models.
Following previous work (Kempter et al., 1999; Song et al., 2000; Rubin et al., 2001; but see van Rossum et al., 2000), the plasticity effects of individual spike pairs are assumed to sum independently: given a postsynaptic spike, each synapse is potentiated according to Equations 1 and 2 by pairing the output spike with all preceding synaptic events. Conversely, a synapse is depressed when a presynaptic event occurs, using all pairs that the synaptic event forms with preceding output spikes.
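As a concrete illustration, the pair-based update described above can be sketched in a few lines, assuming the weight-dependent scaling functions f_{+}(w) = (1 − w)^{μ}, f_{−}(w) = αw^{μ} and an exponential learning window K(Δt) = e^{−|Δt|/τ}; all parameter values are illustrative, not taken from the paper.

```python
import numpy as np

def kernel(dt, tau=20.0):
    """Exponential learning window K(dt) = exp(-|dt|/tau) (dt, tau in ms)."""
    return np.exp(-abs(dt) / tau)

def nltah_update(w, dt, mu=0.02, alpha=1.05, lam=0.005, tau=20.0):
    """Weight change for one pre/post spike pair with dt = t_post - t_pre.

    Potentiation scales as (1 - w)**mu, depression as alpha * w**mu;
    mu = 0 recovers the additive model, mu = 1 the multiplicative one.
    The result is clipped to the normalized efficacy range [0, 1].
    """
    if dt > 0:   # causal pairing -> potentiation
        w = w + lam * (1.0 - w) ** mu * kernel(dt, tau)
    else:        # acausal pairing -> depression
        w = w - lam * alpha * w ** mu * kernel(dt, tau)
    return min(max(w, 0.0), 1.0)
```

In a full simulation this update would be accumulated over all spike pairs, as described above; the single-pair form makes the role of μ explicit.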
Mean synaptic dynamics. Because in general the spike times of the presynaptic and postsynaptic neurons are stochastic, the dynamics of synaptic changes are also a stochastic process. However, if the learning rate λ is small, the noise accumulated over an appreciable amount of time is small relative to the mean change in the synaptic efficacies, called the synaptic drift. This drift, denoted as ẇ, is the mean rate of change of the synaptic efficacy. Using Fokker–Planck mean-field theory, the synaptic drifts are described in terms of the correlations between the presynaptic and postsynaptic activity (Kempter et al., 1999, 2001; Kistler and van Hemmen, 2000; Rubin et al., 2001). We consider a pair of stationary presynaptic and postsynaptic processes described by the pulse trains ρ^{pre}(t) = ∑_{k}δ(t − t_{k}^{pre}) and ρ^{post}(t) = ∑_{n}δ(t − t_{n}^{post}), where t_{k}^{pre} and t_{n}^{post} denote the presynaptic and postsynaptic spike times, respectively.
Integrate-and-fire neuron. To study the implications of the above NLTAH plasticity model in a biologically motivated spiking neuron, we simulate a leaky integrate-and-fire neuron, with parameters similar to those of Song et al. (2000). The membrane potential of the neuron is described by:
Linear Poisson neuron. To investigate analytically the properties of the TAH learning rule, we consider in addition a linear Poisson neuron (Kempter et al., 2001). The spiking activity of this neuron, ρ^{post}(t), is a realization of a Poisson process with the underlying instantaneous rate function:

R^{post}(t) = (1/N) ∑_{i=1}^{N} w_{i} ρ_{i}^{pre}(t), (5)

i.e., each presynaptic spike arriving at synapse i contributes to the postsynaptic rate in proportion to w_{i}/N.
In Figure 2 A, we numerically simulate the linear Poisson neuron receiving uncorrelated Poisson input spike trains. Generating the spike arrival times in continuous time (down to machine precision), the postsynaptic process defined in Equation 5 is implemented by generating a postsynaptic spike with probability w_{i}/N whenever a presynaptic spike arrives at synapse i of the neuron.
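A minimal sketch of this spike-based implementation for static weights (the function name and the batch generation of spike times are ours, and the parameters are illustrative):

```python
import numpy as np

def simulate_linear_poisson(w, rate, duration, rng):
    """Drive a linear Poisson neuron with independent Poisson inputs.

    w        : array of N synaptic efficacies in [0, 1]
    rate     : common presynaptic rate (Hz)
    duration : simulation time (s)

    Each presynaptic spike at synapse i triggers a postsynaptic spike
    with probability w[i] / N, which realizes the linear rate function
    (Eq. 5) in expectation. Returns the sorted postsynaptic spike times.
    """
    n = len(w)
    post = []
    for i in range(n):
        n_spikes = rng.poisson(rate * duration)          # Poisson spike count
        times = rng.uniform(0.0, duration, size=n_spikes)  # uniform arrival times
        fires = rng.random(n_spikes) < w[i] / n           # thinning with prob w_i/N
        post.extend(times[fires])
    return sorted(post)
```

For homogeneous weights w̄, the expected output rate is w̄ times the input rate, consistent with the steady-state analysis below.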
Mean synaptic dynamics for the linear Poisson neuron. For the integrate-and-fire neuron, there is no simple exact expression relating the correlations between the presynaptic and postsynaptic spike trains to the system parameters such as the rates and input correlations. However, because of the linear summation of inputs in the linear Poisson neuron (Eq. 5), this model permits the expression of the input–output correlations Γ_{pre,post}(Δt) in closed form. Considering the case that all input spike trains have a common rate r, we obtain from Equations 3 and 5 that the correlation of the activity at synapse i with the output activity is:
Generating correlated inputs. We consider input spike trains with rate r and instantaneous correlations defined by:
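The defining correlation equations are not reproduced here, but for illustration, one common way to generate spike trains with a prescribed rate and pairwise instantaneous correlation is thinning from a shared "mother" train. This is a sketch of that standard construction, not necessarily the procedure used in the paper:

```python
import numpy as np

def correlated_poisson(n, rate, c, duration, rng):
    """Generate n Poisson spike trains of rate `rate` (Hz) with pairwise
    instantaneous correlation coefficient c over `duration` seconds.

    A common 'mother' Poisson train of rate rate/c is generated, and each
    mother spike is copied into each child train independently with
    probability c. Thinning preserves Poisson statistics, each child has
    rate c * (rate/c) = rate, and the shared coincident spikes give the
    child trains a pairwise count correlation of c.
    """
    if c <= 0.0:  # independent trains
        return [np.sort(rng.uniform(0.0, duration, rng.poisson(rate * duration)))
                for _ in range(n)]
    mother = rng.uniform(0.0, duration, rng.poisson(rate / c * duration))
    return [np.sort(mother[rng.random(mother.size) < c]) for _ in range(n)]
```

Because copied spikes are exactly coincident, the correlations are instantaneous in the sense used here; the empirical correlation of binned spike counts approaches c for any bin width.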
Measuring the performance of learning rules. A natural way to measure the performance of a learning rule is to quantify its ability to imprint the statistical features of the neuronal input onto the distribution of the learned synaptic weights. One measure of this ability is the mutual information between the neuronal inputs and the synaptic weights. However, direct calculation of the mutual information in cases in which the number of synaptic weights is large is computationally infeasible. Instead, we use here a related quantity that measures the effect of a small change in the statistics of the input on the learned synaptic weights. We denote the features of an ensemble of neuronal inputs by the vector Φ = (Φ_{1}, … , Φ_{R}), where the Φ_{i} parameterize specific input features (e.g., mean strength of the inputs or temporal correlations between different inputs). Given these features, we calculate the N × R susceptibility matrix χ, the elements of which are χ_{ij} = ∂⟨w_{i}⟩/∂Φ_{j}, the derivatives of the mean learned efficacies with respect to the input features.
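In practice, such a susceptibility matrix can be estimated by finite differences of the learned mean weights with respect to the input features. A minimal sketch, assuming a deterministic stand-in `learn` for the trial-averaged learning outcome (names and tolerances are illustrative):

```python
import numpy as np

def susceptibility(learn, phi0, eps=1e-4):
    """Finite-difference estimate of chi_ij = d<w_i> / d phi_j.

    learn : function mapping a feature vector phi to the vector of learned
            mean weights (a deterministic stand-in for the average over
            many learning trials)
    phi0  : feature vector at which the susceptibility is evaluated
    eps   : forward-difference step size
    """
    phi0 = np.asarray(phi0, dtype=float)
    w0 = np.asarray(learn(phi0))
    chi = np.zeros((len(w0), len(phi0)))
    for j in range(len(phi0)):
        phi = phi0.copy()
        phi[j] += eps                      # perturb one input feature
        chi[:, j] = (np.asarray(learn(phi)) - w0) / eps
    return chi
```

For a stochastic learning rule, `learn` would average the final weights over many realizations before differencing, at correspondingly larger `eps`.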
Results
To understand learning phenomena in biological nervous systems in terms of neural network function, it is crucial to bridge the gap between the microscopic mechanisms that implement experience-based changes in neuronal signaling pathways and the macroscopic properties of the learning system composed of these pathways. In this paper, we focus on two general goals of learning that can be defined at the network level and also investigate the importance of the updating parameter μ of the learning rule in these contexts. First, we consider the question of how a network can develop a functional connectivity architecture, as for example in ocular dominance columns. As noted in the Introduction, this type of learning task typically requires the synaptic learning dynamics to be competitive, to allow segregation between initially homogeneous synaptic populations. Moreover, it is important that the learning process is robust in the sense that the learned synaptic patterns faithfully reflect meaningful features in the neuronal input activity, rather than being dominated by contributions from random noise. Therefore, we study here how the interplay between competition and stability in TAH plasticity affects the learned synaptic distributions. In the second part of Results, we turn to the conceptually different learning task of imprinting information about the input activity of a neuron into the respective synaptic efficacies. In this context, the sensitivity of the learning dynamics to features in the neuronal input becomes crucial. Thus, using the sensitivity measure introduced in Materials and Methods, the second part of Results concentrates on a quantitative evaluation of the performance of different TAH learning rules.
The emergence of synaptic patterns by symmetry breaking in TAH learning
One of the basic requirements for the activity-driven formation of cortical maps is the ability of the learning to generate spatially inhomogeneous synaptic patterns from a population of synapses with statistically homogeneous inputs. The emergence of such symmetry breaking is an essential property of current cortical plasticity models (Miller, 1996). In this section, we study the conditions under which the TAH learning models introduced above exhibit symmetry breaking and, hence, qualify as candidate models for the development of functional maps. Moreover, because the learning dynamics may also lead to symmetry breaking that overrides the correlation structure of the afferent activity, it is important to ask what learning rules ensure a faithful representation of the input activity within the learned synaptic connections. We address these questions in three basic types of homogeneous afferent activities that differ with respect to the correlation structure of the input spike trains: uncorrelated inputs, uniformly correlated inputs, and uniformly correlated subpopulations without correlations between the subpopulations ("correlated subgroups"). Before treating these specific cases, we highlight the general features of the synaptic learning dynamics in a population of synapses with statistically homogeneous input activities. These results apply to all three cases of homogeneous populations of inputs.
Dynamics of a population of synapses with homogeneous inputs
To study the symmetry breaking in the synaptic patterns, we consider the learning dynamics in cases in which the input statistics are spatially homogeneous. This means that each input obeys the same spike statistics and has the same pattern of correlations with the other inputs. This assumption implies that the presynaptic rates r_{i} (where i denotes the index of the different afferents) are all equal. Likewise, the total sum of the correlations that each input has with the rest of the inputs is the same. In particular, the mean effective causal correlations, C_{0}:
To understand the implications of spatial homogeneity in the presynaptic inputs on the learning dynamics, it is useful to concentrate on the linear Poisson neuron model (Eqs. 5, 8). For convenience, we assume that all correlations between input spikes are instantaneous (see Materials and Methods, Eq. 10).
The important consequence of the spatial homogeneity across the presynaptic inputs is that the product of the effective correlation matrix C^{+} with a homogeneous vector of synaptic efficacies is itself homogeneous. As a result, the learning dynamics always admit a steady state in which all synaptic efficacies take a common value w*.
Although the homogeneous synaptic steady state always exists, it may be unstable with respect to small perturbations of the synaptic efficacies, driving them into inhomogeneous states. Because of the important functional consequences of this emergence of inhomogeneous synaptic patterns at the network level, it is important to understand the features of the learning dynamics that give rise to this phenomenon of symmetry breaking. Therefore, we analyze the effects of small deviations of the synaptic efficacies from the homogeneous synaptic steady state w*. For each synapse w_{i}, we denote a corresponding small deviation from the homogeneous solution by δw_{i} = w_{i} − w* and express its temporal evolution as a function of all deviations δw_{j}. As we show in the Appendix, this temporal evolution is determined by three separate contributions:
Finally, the last term is a cooperative term. Synapses that are positively correlated cooperate to elevate their weights. This cooperation is driven by the potentiating component of the TAH learning and depends on the pattern of correlations among the input channels. We emphasize that the cooperativity in the synaptic learning in general does not originate from a possible advantage of correlated synapses to drive a potentially nonlinear spike generator of the postsynaptic cell, but rather already occurs because of an inherently increased probability of correlated synapses to precede postsynaptic spikes, even when nonlinear cooperative effects in the spike generator are absent.
The stability of the homogeneous synaptic steady state results from the interplay between the stabilizing, the competitive, and the cooperative drifts in the learning dynamics. As we derive in the Appendix, perturbations of the steady state that slightly change all weights by the same amount δw (homogeneous perturbations) decay to zero with time and, hence, do not destabilize the learning of a homogeneous synaptic distribution. In contrast, inhomogeneous perturbations (i.e., perturbations in which the deviations of the synaptic efficacies from w* are not identical) can grow exponentially through the learning dynamics and drive the system into inhomogeneous synaptic states. In the Appendix, we specifically show that the homogeneous synaptic state becomes unstable if the largest real part of all inhomogeneous eigenvalues (eigenvalues corresponding to inhomogeneous eigenvectors) of the effective correlation matrix C^{+} is sufficiently large. Denoting this eigenvalue by NC_{1}, we find that when:
Although this analysis was performed using the plasticity equations of the linear Poisson neuron, it is qualitatively valid as well for other neuron models, as we show for specific cases. Below we study how the emergence of symmetry breaking (i.e., transitions from homogeneous to inhomogeneous synaptic distributions) depends on the nonlinearity of the TAH dynamics, namely the parameter μ, as well as on the asymmetry between depression and potentiation α, and on the size of the synaptic population N.
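The decomposition into homogeneous and inhomogeneous modes invoked above can be illustrated numerically. The sketch below uses a generic homogeneous matrix (diagonal a, off-diagonal b) as a stand-in for the effective correlation matrix C^{+}; for such a matrix the homogeneous eigenvalue is a + (N − 1)b and all inhomogeneous eigenvalues equal a − b:

```python
import numpy as np

def spectrum_split(C, tol=1e-8):
    """Split the eigenvalues of a symmetric matrix C into the eigenvalue
    of the homogeneous eigenvector (1, ..., 1)/sqrt(N) and the largest
    of the remaining (inhomogeneous) eigenvalues."""
    n = C.shape[0]
    vals, vecs = np.linalg.eigh(C)
    u = np.ones(n) / np.sqrt(n)          # normalized homogeneous direction
    overlaps = np.abs(vecs.T @ u)        # overlap of each eigenvector with u
    k = int(np.argmax(overlaps))
    assert overlaps[k] > 1.0 - tol, "C has no homogeneous eigenvector"
    return vals[k], np.delete(vals, k).max()

# A spatially homogeneous matrix: diagonal a, off-diagonal b
n, a, b = 8, 1.0, 0.25
C = b * np.ones((n, n)) + (a - b) * np.eye(n)
hom, inhom_max = spectrum_split(C)
```

Stability of the homogeneous state hinges on the inhomogeneous part of the spectrum (here a − b), not on the typically much larger homogeneous eigenvalue.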
Uncorrelated inputs: linear neuron
In this section, we investigate the synaptic distributions that result from the TAH learning process when the postsynaptic neuron is driven by independent Poisson spike trains of equal rate r. For this input regime, it has been found in an integrate-and-fire neuron that additive learning (μ = 0) breaks the symmetry of the statistically identical presynaptic inputs and leads to a bimodal weight distribution (Song et al., 2000; Rubin et al., 2001). However, it was shown by Rubin et al. (2001) that multiplicative learning (μ = 1) leads to a unimodal distribution of synapses. As shown in the preceding section, these qualitatively different learning behaviors originate in the stabilizing effect of the weight dependence of the synaptic changes on the homogeneous synaptic state. Here, we study the generalized nonlinear TAH rule with arbitrary μ ∈ [0, 1].
In the uncorrelated case, c_{ij} = δ_{ij}, and hence the only effective correlations are those of each input with itself, mediated by its own synaptic weight; the resulting mean effective causal correlation C_{0} scales as 1/(τrN) (Eq. 17).
By inserting the above expression for C_{0} into Equation 15, we obtain the steady-state efficacy w* of the synaptic population when the learned synaptic state is homogeneous (see Appendix, Eq. 19). In this case, the output rate of the linear neuron is given by this steady-state efficacy times the rate of the presynaptic inputs r (compare Eq. 5). Figure 2 A depicts the output rate of the postsynaptic neuron as a function of the presynaptic input rate r, for α = 1.05. We focus on this value of α here, because we want to compare the nonlinear rules with the additive rule. In the latter case, α must be close to 1; otherwise, practically all synapses will become zero (see Appendix). For μ = 1 (multiplicative TAH), the efficacy w* is fairly independent of r and, hence, the output rate grows linearly with the input rate. However, if μ is sufficiently small, w* decreases inversely with the input rate, resulting in the output rate being nearly constant.
To study the regime in which the synaptic learning dynamics break the symmetry of the uncorrelated input population, we substitute Equation 17 into Equations 15 and 16, computing the homogeneous solution w* (Eq. 19) and the regime of its stability. Figure 3 A depicts the critical contour lines according to the stability condition (Eq. 16). Each line traces the critical combination of the parameters μ and τrN for a fixed value of α, at which the growth rate of inhomogeneous perturbations vanishes. Outside the corresponding contour, where this growth rate is negative, the homogeneous synaptic state is stable, and thus learning generally results in all synapses having the same efficacy. In contrast, inside the contour line, the learning dynamics induce symmetry breaking.
Figure 3 A shows how the outcome of TAH learning depends on the effective size of the presynaptic population. For a sufficiently small τrN, the relative contribution of each input channel to the postsynaptic activity is large and, hence, the resulting strong positive feedback drives all synapses to a stable homogeneous state near the upper boundary (Fig. 3 B, squares). In contrast, as τrN is increased, the effect of a single synapse on the postsynaptic activity decreases. Therefore, for a sufficiently large τrN, the stabilizing force induced by the weight dependence of the synaptic changes dominates the learning dynamics for any nonzero μ, resulting in a stable homogeneous synaptic state (Fig. 3 B, triangles). In between the two extremes of small and large τrN, there is a regime of intermediate effective population sizes for which symmetry breaking may occur, with the synaptic population segregating into a strong and a weak group. Such a case is shown in Figure 3 B (circles).
Importantly, Figure 3 A demonstrates that as the number of afferents N increases, the regime of values of μ for which the homogeneous solution is unstable shrinks to zero. The inset of Figure 3 A shows the value of μ at the border between stability and instability of the homogeneous solution, as a function of the effective population size. It is apparent that this μ decreases linearly with 1/(τrN) when τrN is large (also see Appendix). Hence, for any sizable degree of weight dependence and large synaptic populations, symmetry breaking does not occur.
In the purely additive TAH model, synaptic changes do not scale at all with the efficacy of a synapse, and the weights have to be constrained by an additional clipping to prevent unrealistic synaptic growth. As a result, the additive learning dynamics do not possess stationary synaptic states in the above sense that the individual synaptic drifts become zero. Instead, synapses with positive drifts are held at the upper boundary, whereas synapses with negative drifts saturate at the minimum allowed efficacy. Our treatment of the additive model in the Appendix shows that the numbers of synapses gathering at the upper and lower boundaries critically depend on the ratio of depression and potentiation α, as well as on the effective population size τrN. As in the nonlinear TAH learning model, small effective synaptic populations (τrN < 1/(2(α − 1))) will lead to all synapses saturating at the upper boundary because of the strong positive feedback. However, as τrN increases beyond this critical value, the synaptic population breaks into two groups, one of which remains saturated at the upper boundary while the other, losing the competition, saturates at the lower boundary. The fraction of synapses saturating at the top boundary is n_{up} = 1/(2τrN(α − 1)) (Appendix). Because this fraction is inversely proportional to the input rate r, the output rate of the postsynaptic neuron becomes independent of the input rate, as shown in Figure 2 A.
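The saturation fraction and the resulting rate normalization are easy to check numerically (parameter values below are illustrative, not taken from the paper):

```python
def additive_up_fraction(tau, r, n, alpha):
    """Fraction of synapses saturating at the upper boundary in the
    additive model, n_up = 1 / (2 * tau * r * n * (alpha - 1)),
    clipped to 1 when the effective population tau*r*n is below the
    critical value 1 / (2 * (alpha - 1)).

    tau   : learning window time constant (s)
    r     : input rate (Hz)
    n     : number of synapses
    alpha : ratio of depression to potentiation (> 1)
    """
    return min(1.0, 1.0 / (2.0 * tau * r * n * (alpha - 1.0)))

# Mean weight ~ n_up, so the output rate ~ n_up * r is independent of r
# once n_up < 1:
tau, n, alpha = 0.02, 1000, 1.05
out = [additive_up_fraction(tau, r, n, alpha) * r for r in (10.0, 20.0, 40.0)]
```

With these illustrative numbers, n_up halves each time the input rate doubles, so the product n_up·r stays fixed at 1/(2τN(α − 1)), reproducing the rate normalization of Figure 2 A.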
Uncorrelated inputs: integrate-and-fire neuron
We now turn to the behavior of TAH learning in the integrate-and-fire neuron driven by uncorrelated inputs. Figure 2 B shows the output rate of this neuron model versus the input rate for different values of μ. As the figure demonstrates, the output-rate normalization quickly deteriorates as μ departs from the additive model and synaptic changes become dependent on the efficacy of the synapse. Figure 2 C demonstrates that the sensitivity of the output rate to the parameter α rapidly diminishes as μ increases. Comparing A and B of Figure 2 shows the qualitative similarity between the output rate responses of the linear Poisson and the integrate-and-fire model neurons. Note that we have not attempted to match the overall scale of the output rates in the two models. The output rate of the linear neuron can be arbitrarily changed by a gain factor without affecting any other results.
Figure 4 displays the histograms of the equilibrium distributions of learned synaptic efficacies as a function of the updating parameter μ. Recovering the behavior of additive (Song et al., 2000) and multiplicative (Rubin et al., 2001) updating models for μ = 0 and μ = 1, respectively, the plot reveals the transition between these models for intermediate values of μ. Specifically, it shows the emergence of symmetry breaking as μ approaches zero.
As expected from the analysis of the linear neuron, we find that in the integrate-and-fire neuron also, the critical value of μ at which the synaptic distribution becomes bimodal decreases as the effective population size τrN increases. Increasing the rate of the input processes from 10 Hz (Fig. 4 A) to 40 Hz (Fig. 4 B) lowers the first occurrence of a bimodal weight distribution from μ_{crit} = 0.023 to μ_{crit} = 0.017. The inset in each panel depicts the equilibrium weight distribution for the intermediate value of μ = 0.019, showing a clearly bimodal distribution for the 10 Hz input (Fig. 4 A) and a clearly unimodal distribution for the 40 Hz input (Fig. 4 B). Moreover, as expected from the equations describing the homogeneous steady state in the linear neuron (Eqs. 15 and 17), the synaptic efficacy of the homogeneous state at a given μ decreases when the input rate increases.
It is interesting to note the close similarity in the μ dependence of the learned synaptic distributions in the linear and the integrate-and-fire neurons. For example, in both cases, the critical μ for symmetry breaking is close to 0.023 for input rates of 10 Hz [compare Fig. 4 A with Fig. 3 B (circles)]. This is despite the fact that the two models have very different spike generators and different sizes of synaptic populations. The reason for this similarity is that the input–output correlations in the integrate-and-fire neuron with 1000 synapses turn out to match in magnitude the corresponding correlations of the linear neuron with 100 synapses (data not shown).
In summary, for uncorrelated inputs and biologically realistic sizes of the presynaptic population, N, on the order of thousands, and for rates on the order of ≥10 Hz, the regime in μ and α in which symmetry breaking between uncorrelated inputs as well as output rate normalization occur is extremely narrow. Thus, the learning behavior changes qualitatively as soon as synaptic plasticity becomes weight dependent.
Uniformly correlated inputs
We briefly discuss here the case in which the presynaptic inputs have positive uniform instantaneous correlations, namely that for all i ≠ j, c_{ij} (Eq. 9) are equal. This situation may, for instance, occur when the entire presynaptic pool of a neuron is driven by a common source. Treating the behavior of the linear Poisson neuron, we show in the Appendix that positive uniform correlation increases the value of the synaptic efficacy in the homogeneous synaptic steady state. Moreover, the uniform correlation does not alter the 1/(τrN) dependence of the destabilizing drifts. As a result, in nonadditive learning, when the effective synaptic population is sufficiently large, the homogeneous steady state remains stable for any positive uniform correlation strength. In fact, these correlations increase the stability of the homogeneous state (Appendix) and, hence, oppose the emergence of spontaneous symmetry breaking.
Correlated subgroups
We now consider afferent input activity to a neuron that is composed of M equally sized groups. These groups are defined by a uniform within-group correlation coefficient c_{ij} = c > 0 (compare Eq. 9) that is the same in all groups. For pairs of inputs belonging to different groups, the cross-correlation is zero. In this scenario, the M different presynaptic groups compete for control over firing of the postsynaptic neuron. We first treat the linear neuron and, for simplicity, focus on the case in which the overall number of presynaptic input channels N is large. In this limit, the homogeneous and largest inhomogeneous eigenvalues of C^{+} normalized by N are:
The nature of the synaptic pattern that emerges once the homogeneous synaptic state loses stability depends on the number of afferent subgroups. Here we focus on the example of two equally sized subgroups (i.e., M = 2). A similar scenario, which is motivated by the problem of the activity-driven development of ocular dominance maps, has been studied by Miller and MacKay (1994) and Song and Abbott (2001). The regimes of symmetry breaking in which the learned synaptic efficacies segregate according to the two correlated input groups are depicted in Figure 5 A (this figure is equivalent to Fig. 3 A, with c replacing 2/N). Thus, symmetry breaking between two correlated subgroups can occur in nonadditive TAH learning models even when the number of presynaptic inputs N is large. This is demonstrated in Figure 5 B (solid black line), which plots the learned synaptic efficacies as μ is varied, with c held fixed at 0.11. As is evident from Figure 5, for this level of correlation, symmetry breaking occurs below a fairly high value of μ ≈ 0.15. Note that in contrast to our treatment of the uncorrelated inputs, here we do not use α close to 1 but rather set it to a generic value of α = 1.5.
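The group eigenvalue structure underlying this segregation can be checked numerically. The sketch below builds the within-group correlation matrix itself (entries c within a group, zero across groups, unit diagonal) rather than the paper's full effective matrix C^{+}; its M group-uniform modes, one of which is the homogeneous mode, carry the eigenvalue 1 + c(N/M − 1) ≈ cN/M for large N, well separated from the bulk value 1 − c:

```python
import numpy as np

def subgroup_eigenvalues(n, m, c):
    """Ascending eigenvalues of the correlation matrix for m equal groups
    of n/m inputs each: off-diagonal entries c within a group, zero
    across groups, ones on the diagonal."""
    g = n // m                                  # group size
    block = c * np.ones((g, g)) + (1.0 - c) * np.eye(g)
    C = np.kron(np.eye(m), block)               # block-diagonal group structure
    return np.sort(np.linalg.eigvalsh(C))

# Two groups of 50 inputs with within-group correlation c = 0.1:
vals = subgroup_eigenvalues(100, 2, 0.1)
```

For n = 100, m = 2, c = 0.1, the two group-uniform eigenvalues are 1 + 0.1·49 = 5.9 and the remaining 98 eigenvalues are 0.9; the M − 1 group-contrast modes sharing the large eigenvalue are the inhomogeneous perturbations that grow when the stability condition is violated.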
Figure 5, C and D, describes the behavior of the system as the within-group correlation is gradually turned on. As expected from the analysis of the uncorrelated input scenario, the substantial weight dependence of the synaptic changes induced when μ = 0.15 (solid black lines) yields a stable homogeneous synaptic state if the within-group correlation is sufficiently weak. However, when the correlation reaches a critical value, the homogeneous state becomes unstable and the synaptic efficacies segregate into the two input groups, with the one winning the competition suppressing the other. As the correlation increases still further, another transition may occur at a higher value of c, above which the homogeneous synaptic state becomes stable again. The presence of this second transition (which is discontinuous) depends on the values of τr, the expected number of input spikes per synapse arriving within its learning time window, and the ratio of depression and potentiation α (Fig. 5, compare C and D). Importantly, for large values of μ, in particular in the multiplicative model (μ = 1), the stabilizing force is so strong that the homogeneous synaptic state remains stable for all positive correlation strengths (Fig. 5 C,D, dashed black lines; also see Appendix), and no segregation is possible.
The behavior described above for the linear neuron is reproduced qualitatively in simulations of the integrate-and-fire neuron, as shown in Figure 6 A. To address the question of whether symmetry breaking in the integrate-and-fire neuron can also occur at higher values of μ, we follow the linear Poisson neuron analysis shown in Figure 5 A, which suggests that increasing the value of α extends the μ range of bimodal synaptic distributions. Figure 6 B displays the learned synaptic distributions as a function of μ for a within-group correlation c = 0.05, with α = 1.5 and r = 10 Hz. Similar to the linear neuron findings shown in Figure 5 B, symmetry breaking occurs here in a wide regime of μ.
To emphasize important differences between symmetry breaking in nonlinear versus additive TAH learning, Figure 7 shows corresponding learned synaptic efficacies for selected cases of low, intermediate, and high within-group correlations. Figure 7, A–D, depicts learned weight distributions from Figure 6 A for which μ = 0.019. For each correlation, synaptic efficacies resulting from additive learning are depicted on the right (Fig. 7 E–H). Except in Figure 7, A and B, where c = 0 (i.e., no input subgroups are defined), the synaptic distribution of the subgroup with higher mean efficacy is depicted in light gray, whereas that of the subgroup with lower mean efficacy is displayed in dark gray.
Inspection of Figure 7, A and B versus E and F, shows that in the regime of low correlations, the learning behavior induced by the two types of plasticity is qualitatively different. Whereas in nonlinear TAH learning (Fig. 7A,B) the homogeneous synaptic state is stable and all synapses distribute around the same mean efficacy, unstable additive learning induces symmetry breaking (Fig. 7E,F). Importantly, this symmetry breaking in general does not reflect the correlation structure in the afferent input. As shown in Figure 7F, when the within-group correlation is 0.03, the 500 synapses of the group winning the competition (light gray) split into two fractions of 325 versus 175 synapses, of which the larger fraction tends to zero efficacy and mixes with the efficacies of the losing input group. In contrast, in the nonlinear TAH model, unfaithful splitting of the weights occurs only for extremely small values of μ, of the order of 1/τrN (Fig. 7, compare A and B with E and F). This is because symmetry breaking within a uniformly correlated group does not occur for μ > 1/τrN, and hence the weights within each subgroup remain the same.
For intermediate strengths of the within-group correlation, both learning rules induce symmetry breaking that faithfully reflects the structure of the input correlation, with the synaptic distributions of the two input groups well separated. This is shown in Figure 7, C and G, for a correlation of c = 0.1. Note, however, that whereas in additive learning the efficacies of both input groups reach the respective boundaries of the allowed range (i.e., are clipped to saturation), the weights resulting from nonlinear TAH learning do not saturate. As we show in the next section, this property of NLTAH plasticity enhances the sensitivity of the synaptic population to changes in the strength of the within-group correlation. Finally, when the within-group correlation is strong, all efficacies become large in both types of learning (Fig. 7D,H).
Clearly, the detailed quantitative properties of the learned synaptic patterns, as well as the parameter values at which symmetry breaking occurs, depend on the neuron model, and specifically on the spike-generating mechanism. Nevertheless, the striking qualitative similarity in the findings from the two neuron models investigated here suggests that the symmetry breaking induced by within-group correlations is a general property of the nonlinear TAH rule with small but nonzero μ, independent of the specifics of the spike generator.
Synaptic representation of input correlations
In the previous section, we studied the emergence of symmetry breaking in homogeneous synaptic populations for different types of instantaneously correlated input activity. In this section, we study the more general issue of how information about the spatiotemporal structure of the afferent input is imprinted into the learned synaptic efficacies by TAH plasticity. Specifically, we investigate how the weight dependence of the synaptic changes affects the sensitivity of the learning to features embedded in the input spike trains.
An example of the associated phenomena is shown in Figure 8. Here we study the effect of weight dependence on the steady-state synaptic efficacies of the integrate-and-fire neuron receiving 1000 Poisson inputs that comprise a small subgroup of 50 correlated synapses (c = 0.1), while all other input cross-correlations are zero. In this scenario, the subgroup is statistically distinct from the rest of the synaptic population. The coherence of spikes within the subgroup increases the causal correlation of the member synapses with the spiking activity of the postsynaptic neuron. Because of the ensuing cooperation between the correlated synapses, they grow stronger than those of the uncorrelated background. Figure 8 shows how the strength of the stabilizing drift induced by the weight dependence of the synaptic updating modulates the degree of separation between the two subpopulations. For decreasing values of μ, learning becomes increasingly affected by the correlation structure in the input, and the separation between the subgroup and the background is more pronounced. However, below a critical μ, the homogeneous state of the uncorrelated population loses stability and splits, resulting in a bimodal distribution of the background synapses. As a consequence, the representation of the afferent correlation structure in associated groups of synaptic efficacies is confounded by the mixing of the high-efficacy mode of the background with the subgroup of correlated synapses. This example raises the general problem of finding an optimal learning rule that, for a given type of input activity, best balances sensitivity and stability.
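A reduced simulation can make this scenario concrete. The sketch below is a minimal trace-based implementation of TAH plasticity with the power-law weight dependence f_+(w) = (1 − w)^μ and f_−(w) = αw^μ, driving a linear Poisson output. The network is scaled down (200 inputs with a correlated subgroup of 20), and the output gain, learning rate, and all other numerical values are illustrative choices of ours, not the parameters behind Figure 8.

```python
import numpy as np

rng = np.random.default_rng(1)

# Scaled-down version of the scenario: a small correlated subgroup embedded
# in an uncorrelated Poisson background (all values illustrative)
N, N_sub = 200, 20            # number of inputs, size of correlated subgroup
c = 0.1                       # within-group correlation of the subgroup
rate, dt = 10.0, 1e-3         # input rate (Hz), time bin (s)
steps = 50_000                # 50 s of simulated time
tau = 0.02                    # STDP window time constant (s)
lam, alpha, mu = 0.01, 1.05, 0.2   # learning rate, depression/potentiation
                                   # ratio, weight-dependence exponent
p = rate * dt                 # spike probability per bin
sq = np.sqrt(c)

w = np.full(N, 0.5)           # synaptic efficacies, bounded in [0, 1]
x_tr = np.zeros(N)            # presynaptic traces (exponential STDP kernel)
y_tr = 0.0                    # postsynaptic trace
decay = np.exp(-dt / tau)

for _ in range(steps):
    # subgroup members copy a hidden reference train with probability sqrt(c),
    # which yields pairwise correlation c between members (see Appendix)
    ref = rng.random() < p
    pre = rng.random(N) < p
    cond = p + sq * (1 - p) if ref else p * (1 - sq)
    pre[:N_sub] = rng.random(N_sub) < cond

    # linear Poisson output: spike probability proportional to summed drive
    post = rng.random() < min(1.0, 0.05 * (w @ pre))

    x_tr = x_tr * decay + pre
    y_tr = y_tr * decay + post

    if post:                                   # pre before (or with) post: potentiation
        w += lam * (1 - w) ** mu * x_tr
    w -= lam * alpha * w ** mu * (pre * y_tr)  # post before pre: depression
    np.clip(w, 0.0, 1.0, out=w)

sub_mean, bg_mean = w[:N_sub].mean(), w[N_sub:].mean()
```

For moderate μ, the subgroup mean is expected to drift above the background, in line with Figure 8; exact numbers depend on the seed and run length, and very small μ destabilizes the background population.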
To address this question, we need a quantitative measure for the performance of a learning rule in imprinting information about the input correlations onto the synaptic efficacies. Here we apply the sensitivity measure S (Eq. 13, Materials and Methods), which quantifies the sensitivity of the learned synaptic state to changes in features embedded in the input correlation structure. When S is high, small changes in the input features are picked up by learning and induce a large change in the learned synaptic efficacies. We emphasize that the goal of this performance measure is to quantify and compare general properties of different plasticity rules. It is therefore based only on the relationship between the afferent neuronal inputs and the learned synaptic efficacies. In particular, it avoids direct reference to the neuronal output activity.
We first illustrate the application of the sensitivity measure by considering a simple example in which the input feature to be represented by the learned synaptic efficacies is one-dimensional (i.e., a scalar quantity). Specifically, we apply S to the scenario discussed in the previous section, of two independent input groups with within-group correlation c. We investigate the behavior of the linear Poisson neuron and quantify how the sensitivity of the learned synaptic distribution to the strength of the within-group correlation is affected by the weight dependence of the synaptic changes. We consider the sensitivity of the learning as a function of μ for a fixed correlation of c = 0.11. As shown in Figure 5, B and C, this correlation represents an intermediate correlation strength in the linear Poisson neuron treatment. Using the steady-state synaptic efficacies from Figure 5B, we compute S for values of μ between 0 and 0.5 (see Materials and Methods). Figure 9 shows the resulting sensitivity curve. We note that each point quantifies the sensitivity of the learned synaptic weights to small changes in the correlation strength around c = 0.11.
As can be seen in Figure 5B, there are two qualitatively distinct regimes of synaptic distributions emerging from learning in this case. For high values of μ, no symmetry breaking takes place, and the correlation strength is represented by the common mean value of the synaptic efficacies. In this regime (μ ≳ 0.15), S decreases monotonically with increasing μ (Fig. 9), because the higher weight dependence strengthens the confinement of the homogeneous synaptic state to the central range of the synaptic efficacies. For lower values of μ, symmetry breaking occurs, and the correlation strength is represented by the mean efficacy values of the two resulting groups. In this regime, S is nonmonotonic in μ. For very low μ, the synaptic efficacies are close to saturation at the boundaries, and hence a change in the correlation strength cannot induce a large change in the efficacies; toward larger μ, the centralizing drift again reduces the sensitivity. Thus, S has a maximum at an intermediate μ (in the present case, around μ = 0.02). Finally, at the transition between the regions of homogeneous and bimodal synaptic distributions (μ ≈ 0.15), the sensitivity is large, because here a small change in c may cause an abrupt and large change in the synaptic efficacies, namely a bifurcation from a homogeneous to an inhomogeneous synaptic distribution. Note, however, that this transition region in μ is narrow.
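The nonmonotonic shape of the sensitivity curve can be reproduced in a deliberately reduced setting. The sketch below is our own one-dimensional caricature, not Eq. 8: the steady state of a single effective weight solves (1 − w)^μ(1 + φ) = αw^μ, with the factor (1 + φ) standing in for the extra potentiation contributed by a correlation of strength φ, and the susceptibility is obtained by central finite differences, as in the numerical evaluation of χ.

```python
import numpy as np

def steady_state(phi, mu, alpha=1.05):
    """Fixed point of the toy drift (1 - w)^mu * (1 + phi) - alpha * w^mu.

    A one-dimensional caricature of the drift equation: the (1 + phi)
    factor stands in for correlation-driven potentiation.
    """
    f = lambda w: (1 - w) ** mu * (1 + phi) - alpha * w ** mu
    lo, hi = 1e-12, 1.0 - 1e-12      # f(lo) > 0 > f(hi); f is monotone in w
    for _ in range(80):              # bisection to machine precision
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sensitivity(phi, mu, eps=1e-6):
    # scalar susceptibility chi = d w* / d phi via central differences
    return abs(steady_state(phi + eps, mu) - steady_state(phi - eps, mu)) / (2 * eps)

# saturation suppresses S at very small mu, the centralizing drift at large mu
S = {m: sensitivity(0.1, m) for m in (0.005, 0.03, 0.5)}
```

With these numbers the sensitivity peaks near μ ≈ 0.03 and falls off both toward μ → 0 (boundary saturation) and toward large μ, mirroring the qualitative shape of Figure 9.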
We now turn to a richer input scenario in which the afferent correlation structure is inhomogeneous and the input feature space to be represented by the learned synaptic efficacies is high-dimensional. Specifically, we consider presynaptic activity in which each synapse receives spike inputs with a specific relative latency with respect to the remaining synaptic population. Such latency or delay-line scenarios have been studied previously in the context of additive TAH learning (Gerstner et al., 1996; Song et al., 2000) and can, for instance, be motivated by their analogy to certain delay-line models in auditory processing (Jeffress, 1948).
We consider the input activity to consist of N time-shifted versions of one common Poisson spike train with rate r. Because the synaptic learning process depends on the relative timing of the input spikes, we fix one presynaptic input as reference and treat the remaining N − 1 delays Δ = (Δ_{1}, … , Δ_{N−1}) as an R = N − 1 dimensional vector of input features to be represented by the learned synaptic weights. Whereas the delays Δ fully specify the temporal correlation structure of the neuronal input activity, S measures the sensitivity of the learned synaptic efficacies to small independent changes in the individual delays. Because of the temporal sensitivity of TAH plasticity, it is intuitively clear that the learning dynamics will depend critically on the temporal scale of the relative delays. Although it is a natural choice to set this temporal scale through the SD of a Gaussian distribution from which the delays are drawn (Song et al., 2000; Aharonov et al., 2001; Gütig et al., 2001), we here apply the sensitivity measure to the simpler case in which we fix Δ such that the delays between the N inputs are uniformly spaced at a fixed interval ς/(N − 1) [i.e., Δ_{i} = iς/(N − 1) (i = 1, 2, … , N − 1)]. We have checked that the qualitative behavior of S in the case of a fixed delay spacing is similar to that of S_{avg} (see Materials and Methods) obtained from averaging over an ensemble of Gaussian delay vectors with SD ς (Aharonov et al., 2001; Gütig et al., 2001).
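Constructing this input ensemble is straightforward. The sketch below (our own illustration; the variable sigma plays the role of ς) draws one common Poisson train and returns N copies shifted by the equidistant delays Δ_{i} = iς/(N − 1):

```python
import numpy as np

rng = np.random.default_rng(2)

def delay_line_inputs(n_inputs, rate, duration, sigma):
    """N time-shifted copies of one common Poisson spike train.

    Input i is delayed by i * sigma / (n_inputs - 1): equidistant spacing
    with total spread sigma. All times are in seconds.
    """
    n_spikes = rng.poisson(rate * duration)
    ref = np.sort(rng.uniform(0.0, duration, n_spikes))   # common Poisson train
    delays = np.arange(n_inputs) * sigma / (n_inputs - 1)
    return [ref + d for d in delays], delays

trains, delays = delay_line_inputs(n_inputs=20, rate=10.0, duration=50.0, sigma=0.02)
```

Because every train is a shifted copy of the same reference, the cross-correlogram of inputs i and j is a single sharp peak at lag Δ_{i} − Δ_{j}, which is exactly the structure that enters the effective correlation of Eq. 7.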
We investigate here the behavior of the linear Poisson neuron. One important difference between the delay-line input scenario considered here and the input correlations treated above is that here nonzero cross-correlations between input spike trains also exist at negative time lags. Specifically, if the delays of the input activities of synapses i and j are given by Δ_{i} and Δ_{j}, respectively, and the additional delay of the postsynaptic neuron is ε (Eq. 5), the delay difference Δ_{i} − (Δ_{j} + ε) determines the temporal position of the sharp peak in the otherwise zero effective correlation between the two shifted Poisson inputs (Eq. 7). If this delay difference is negative, the output activity contributed by the jth synapse lags behind the input spikes at the ith synapse. Hence, the jth synapse contributes to the potentiation of synapse i, and the respective effective causal correlation C̃_{ij} is positive.
To calculate S for a given delay vector Δ, we numerically solve the drift equation of the synaptic learning (Eq. 8) for the synaptic steady state. Using the resulting learned synaptic distributions, we compute the susceptibility matrix χ (Eq. 12), giving S (Eq. 13). Figure 10A shows the sensitivity S as a function of μ for different values of the temporal delay spacing ς. The curves clearly show an optimal weight dependence of the synaptic changes for which the sensitivity peaks. For larger values of μ, the performance of the learning deteriorates because the increasing confinement of the synaptic weights to the central range of efficacies restricts the sensitivity of the learning to changes in the input correlation structure. Conversely, for lower values of μ, the sensitivity is impaired because the synaptic efficacies begin to saturate at the boundaries of the allowed range as bimodal efficacy distributions emerge. The value of μ that optimally adjusts the weight dependence of the synaptic changes depends both on the system parameters and on the input correlations determined by the relative time delays between the inputs. Increasing ς (i.e., increasing the relative delays) weakens the effective correlations between the presynaptic inputs because of the exponentially decaying temporal extent of the learning rule (Eq. 7). Hence, a lower weight dependence of the synaptic changes (corresponding to a lower value of μ) is needed to pick up the correlations and allow sufficient sensitivity of the learning to the input delays. The effect of this change in the temporal extent of synaptic interactions on the learned efficacies is shown in Figure 11, which for each μ depicts all N synaptic efficacies for ς = τ (Fig. 11A) and ς = 4τ (Fig. 11B).
Note that because of the equidistant delays, the relationship between the relative temporal position of a synapse within the presynaptic population and its steadystate efficacy is monotonic, with the leading synapse (Δ = 0) taking the largest weight. In the foreground, the corresponding sensitivity curves are shown. The plots clearly demonstrate that the saturation regime in which most synaptic weights accumulate at the boundaries of the allowed range (black and white) begins at higher values of μ when the temporal dispersion of the inputs is small (Fig. 11 A) (i.e., the synaptic interactions are strong). The plot also reveals that in both cases for low values of μ, only the leading synapse remains at a high value. Finally, it can be seen that the peaks in the sensitivity curves approximately coincide with those values of μ for which the synaptic weights smoothly cover a large range of efficacies, as shown by the gradual change from dark to light values in the corresponding vertical cross sections.
Finally, we ask how the learning sensitivity depends on the statistics of the input delays for a fixed value of μ. To answer this question, Figure 10B shows S as a function of the delay-line spacing ς, demonstrating that S does not vary monotonically with ς but rather has a maximum at an optimal temporal separation of the inputs. This is because tight spacing leads to strong effective correlations between the inputs, driving the synapses toward saturation. On the other hand, loose spacing reduces the effective correlations between the presynaptic inputs to the extent that the learning behaves essentially as if driven by an uncorrelated presynaptic population.
Discussion
The understanding of activity-dependent refinement of neural networks has long been one of the central interests of synaptic learning studies. In this context, most investigations of unsupervised learning using correlation-based plasticity rules have been conducted in the framework of additive plasticity models, which do not incorporate explicit weight dependence in the changes of synaptic efficacies. These simple models suffer from stability problems: either all synapses decay to zero or they grow without bound. An additional problem inherent to simple Hebbian models is the lack of robust competition. Indeed, it has been found that even the inclusion of synaptic depression mechanisms does not provide a robust source of synaptic competition, unless synaptic plasticity is fine-tuned to approximately balance the amount of potentiation and depression (Miller, 1996).
Recent studies of experimentally observed temporally asymmetric Hebbian learning rules have added two new ideas. One idea is that under these plasticity mechanisms, synapses compete against each other in controlling the time of firing of the target cell and, thus, engage in competition in the time domain. Although TAH learning rules are indeed inherently sensitive to temporal correlations between the afferent inputs, we have shown here that this sensitivity alone is not sufficient to resolve the problems associated with either stability or competition. In the additive model of TAH plasticity, hard constraints need to be imposed on the maximal and minimal synaptic efficacies to prevent the runaway of synaptic strength. In addition, as was shown here, in this model synaptic learning is competitive only when the ratio between depression and potentiation is fine-tuned, and even then the emergent synaptic patterns do not necessarily segregate the synaptic population according to the correlation structure in the neuronal input. The second idea is that TAH rules would exhibit novel behavior because of the role of the nonlinear spike-generation mechanism of the postsynaptic cell (Song et al., 2000). In fact, we have shown in this work that the qualitative features of TAH plasticity are strikingly insensitive to the nonlinear integration of inputs in the target cell (see also Kempter et al., 2001). For the parameter choices studied, the properties of the synaptic steady states in the integrate-and-fire neuron are qualitatively similar to those found in a linear input–output model for neuronal firing. Nevertheless, we note that there are substantial quantitative differences between the two models, particularly with respect to the parameters τrN and c, which effectively determine the correlations between the presynaptic and postsynaptic spike trains.
Although a quantitative analysis of these differences is beyond the scope of our work, such a study might reveal interesting insights into the quantitative effects of the details of the postsynaptic spike generator on the learned synaptic distributions. In addition, it is possible that the details of the spikegeneration mechanism will affect the transient phase (i.e., the dynamics) of the synaptic learning process.
From the present work, we conclude that some of the underlying difficulties in correlation-based learning are alleviated by nonlinear plasticity rules such as the NLTAH rule. The nonlinear weight dependence of the synaptic changes provides a natural mechanism to prevent runaway of synaptic strength. As in additive TAH learning, synaptic competition is provided by the mixture of depression and potentiation. However, in NLTAH plasticity, the balance between depression and potentiation is maintained dynamically by adjusting the steady-state value of the synaptic efficacies. Indeed, we have shown that this competition is sufficient to generate symmetry breaking between two independent groups of correlated presynaptic inputs. However, for this to occur, the stabilizing drift induced by the weight dependence of the synaptic changes should not be too strong. In particular, the simple linear weight dependence (Eq. 2, with μ = 1) assumed in the original multiplicative model is incapable of breaking the symmetry between competing input groups. In fact, we have shown that with μ = 1, the homogeneous synaptic state is stable for any pattern of homogeneous input correlations, provided there are no negative correlations in the afferent activity. The present power-law plasticity rule with 0 < μ < 1 provides a reasonable balance between the need for a stabilizing force and a potential for spontaneous emergence of synaptic patterns. Our study of symmetry breaking between two competing groups of correlated synapses is inspired by the activity-dependent development of ocular dominance selectivity. This scenario has also been studied recently by Song and Abbott (2001) using the additive version of TAH plasticity. In their model, achieving a faithful splitting between the two competing input groups with weak correlations requires relatively tight tuning of the depression to potentiation ratio, α.
One of the surprising results of our investigation is the possibility that when the correlation within input groups is made strong, the stability of the homogeneous synaptic state may be restored. We have shown that this apparently counterintuitive behavior, predicted by the analytical study of the mean synaptic dynamics of the linear Poisson neuron, is also seen in simulations of the full learning rule in the integrate-and-fire neuron. It would be interesting to explore possible experimental tests of this result, perhaps in the context of the development of ocular dominance. In this work, we have limited ourselves to correlated subpopulations of inputs with positive within-group correlations. However, in general, negative correlations are an additional potential source of competition (Miller and MacKay, 1994). Furthermore, we have not addressed the important issue of competition between synapses that target different cells. Lateral inhibitory connections between target neurons may provide a source of such competition.
The last part of the present work addresses situations with inhomogeneous input statistics, in which different inputs are distinguished by their temporal relationship to the rest of the input population. Here the issue is not whether a spatially modulated pattern of synaptic efficacies will form through TAH learning, but rather whether this pattern will efficiently imprint the information embedded in the input statistics. To quantify the imprinting efficiency of the learning rule, we introduced a new method for measuring learning rule sensitivity. In the present context, this measure quantifies the amount of information about the temporal structure in the inputs that a TAH rule can store. Using this method to study the novel class of NLTAH plasticity rules introduced here, we find that the optimal learning rule depends on the input statistics, in the present example on the characteristic time scale of the temporal correlations between the inputs. This finding suggests that biological systems may have acquired mechanisms for metaplasticity to adapt the learning rule to slow temporal changes in the input statistics. It should be pointed out that the sensitivity measure S focuses entirely on how the learned synaptic distribution changes as a result of changes in the correlation pattern among the input channels. It does not, however, address the problem of “readout,” namely how the resulting changes in the synaptic distribution affect the firing pattern of the output cell. A measure that takes the postsynaptic spike train into account will in general depend on the details of the spike-generating mechanism rather than only capture the properties of the learning rule. In general, however, any readout mechanism will depend on the information that is available in the learned synaptic state. Hence, if the learning itself is insensitive to changes in the input features, the synaptic efficacies will fail to represent these changes and no readout mechanism will be able to extract them.
The sensitivity measure S therefore provides an upper bound on the learning performance of the full neural system (including readout). In summary, although quantitative claims about the optimality of specific learning rules have to consider specific readout mechanisms, our study of the general properties of the investigated plasticity rules provides general insights into the mechanisms that enable unsupervised synaptic learning to remain sensitive to input features during learning.
Present experimental results (Bi and Poo, 1998) based on the averaging of individual efficacy changes in different synapses suggest the possibility that the ratio of depressing and potentiating synaptic changes indeed increases in a stabilizing manner as synapses grow stronger (cf. van Rossum et al., 2000). However, available data do not provide conclusive evidence regarding the details of the weight dependence of the efficacy changes. Our work clearly demonstrates the importance of the weight dependence of the TAH updating rule. Synaptic learning rules that implement a stabilizing weight dependence of the type introduced in this work have several advantageous properties for learning in neural networks. Specifically, our results predict that synaptic changes should be neither additive nor multiplicative, but rather should feature intermediate weight dependencies that could, for instance, result from a gradual saturation of the potentiating and depressing mechanisms. It will be interesting to see whether future experimental results confirm this prediction. In this context, it is also important to note that recent experiments and modeling studies reveal important nonlinearities in the accumulation of synaptic changes induced by different spike pairs (Castellani et al., 2001; Senn et al., 2001; Sjöström et al., 2001), as well as evidence for complex intrinsic synaptic dynamics that challenges the simple notion of a scalar synaptic efficacy (Markram and Tsodyks, 1996). The theoretical implications of these sources of nonlinearity and intrinsic dynamics remain to be explored.
Appendix
Generating correlated spike trains
We show here that two spike trains that are generated by conditioning their binwise spike probabilities on the activity of a common reference spike train X_{0}(T), as described in Materials and Methods, have a pairwise correlation coefficient c. For clarity, we denote X_{i}(T) simply by X_{i}. The pairwise correlation coefficient is defined by Cov(X_{i}, X_{j})/[Var(X_{i}) Var(X_{j})]^{1/2}.
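A standard construction consistent with this description couples each train to the reference X_{0} with strength √c; the exact conditional probabilities below are our reading of the scheme and should be checked against Materials and Methods:

```python
import numpy as np

rng = np.random.default_rng(3)

def correlated_trains(n_trains, c, p, n_bins):
    """Binwise Bernoulli spike trains with pairwise correlation coefficient c.

    Each train is coupled to a hidden reference X0 via
      P(spike | X0 = 1) = p + sqrt(c) * (1 - p)
      P(spike | X0 = 0) = p * (1 - sqrt(c)),
    which preserves the marginal rate p per bin, gives each train
    correlation sqrt(c) with X0, and (by conditional independence)
    correlation sqrt(c) * sqrt(c) = c between any two trains.
    """
    sq = np.sqrt(c)
    x0 = rng.random(n_bins) < p
    cond = np.where(x0, p + sq * (1 - p), p * (1 - sq))
    return rng.random((n_trains, n_bins)) < cond

X = correlated_trains(n_trains=2, c=0.1, p=0.01, n_bins=500_000)
c_est = np.corrcoef(X.astype(float))[0, 1]   # empirical estimate of c
```

A quick check of the construction: E[X_{i}] = p(p + √c(1 − p)) + (1 − p)p(1 − √c) = p, and Cov(X_{i}, X_{j}) = c·p(1 − p) = c·Var(X_{i}), so the pairwise correlation coefficient is c.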
Homogeneous synaptic steady state for a homogeneous population of synapses
We derive the homogeneous synaptic steady-state solution by setting ẇ_{i} = 0 and w_{i} = w* in Equation 8 with the effective correlation matrix C̃.
For uncorrelated input activity, the steady state follows from the balance between potentiation and depression.
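If, for uncorrelated inputs, the drift is assumed to reduce (up to a common positive prefactor) to the balance f_+(w) = f_−(w) of the power-law rule, a simplification of ours that drops rate-dependent factors, the homogeneous fixed point has the closed form w* = 1/(1 + α^{1/μ}):

```python
def f_plus(w, mu):           # potentiation weight dependence, (1 - w)^mu
    return (1.0 - w) ** mu

def f_minus(w, mu, alpha):   # depression weight dependence, alpha * w^mu
    return alpha * w ** mu

def w_star(mu, alpha):
    """Homogeneous fixed point of the balance f_plus(w) = f_minus(w).

    Simplified setting: rate- and correlation-dependent prefactors of the
    full drift equation are dropped, leaving w* = 1 / (1 + alpha^(1/mu)).
    """
    return 1.0 / (1.0 + alpha ** (1.0 / mu))

for mu in (1.0, 0.1, 0.02):
    w = w_star(mu, alpha=1.05)
    # potentiation and depression cancel exactly at w*
    assert abs(f_plus(w, mu) - f_minus(w, mu, 1.05)) < 1e-12
```

For α > 1 the fixed point moves away from the midpoint as μ decreases (w* ≈ 0.49, 0.38, and 0.08 for μ = 1, 0.1, and 0.02 here): a weaker weight dependence lets the excess depression push the homogeneous state toward low efficacies.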
Stability of the homogeneous synaptic steady state
We analyze the stability of the homogeneous synaptic steady state by deriving the time evolution of small perturbations δw_{i} = w_{i} − w* of the synaptic efficacies w_{i} around the steadystate value w*. If these perturbations decay to zero, the homogeneous steady state is stable. For small perturbations, the time evolution is given by:
For uncorrelated input activity, the above bound for μ_{crit} becomes:
Additive TAH in the linear neuron: uncorrelated inputs
The drift of the ith synapse of a neuron receiving uncorrelated inputs and implementing the additive model is given by setting μ = 0 in Equation 8, with the effective correlation C̃ as above.
Moreover, the firing rate of the linear neuron is given by:
Uniformly correlated inputs in the linear neuron
When the presynaptic inputs are uniformly correlated, namely c_{ij} = c ≥ 0 for all i ≠ j (c_{ii} = 1), the effective correlation matrix C
Computing the susceptibility matrix χ for the linear neuron
Here we compute the susceptibility matrix χ (Eq. 12, Materials and Methods) used in Results to evaluate the sensitivity measure S for the learning process in the linear Poisson neuron model. This matrix is obtained by the implicit function theorem. In the synaptic steady state w*, the synaptic drifts are zero by definition, and hence:
Two correlated input groups
For the case of two correlated subgroups, the sensitivity to the within-group correlation is measured. Hence, the input feature is Φ = c with R = 1. Using Equation 8 with C̃
Delayed Poisson inputs
For a neuron receiving time-shifted versions of a common Poisson spike train ρ
Footnotes

* R.G. and R.A. contributed equally to this work.

Partial funding from the Studienstiftung des deutschen Volkes, the Institut für Grenzgebiete der Psychologie, the Large Scale Facility Program of the European Commission, the Horowitz Foundation, the German–Israeli Foundation for Scientific Research and Development, the Volkswagen Foundation, the Israel Science Foundation (Center of Excellence 8006/00), and the USA–Israel Binational Science Foundation is gratefully acknowledged. We thank the staff of the Methods in Computational Neuroscience 2000 summer course at Woods Hole, Prof. E. Ruppin, and O. Shriki for useful discussions. We gratefully acknowledge the valuable discussions with Prof. L. Abbott that inspired our present investigation. We thank Prof. A. Aertsen and the anonymous referees for helpful comments and suggestions.

Correspondence should be addressed to Dr. Haim Sompolinsky, Racah Institute of Physics, Hebrew University of Jerusalem, Jerusalem 91904, Israel. Email: haim@fiz.huji.ac.il.

R. Gütig's present address: Institute for Theoretical Biology, Humboldt University, 10115 Berlin, Germany.