## Abstract

We explore a synaptic plasticity model that incorporates recent findings that potentiation and depression can be induced by precisely timed pairs of synaptic events and postsynaptic spikes. In addition we include the observation that strong synapses undergo relatively less potentiation than weak synapses, whereas depression is independent of synaptic strength. After random stimulation, the synaptic weights reach an equilibrium distribution which is stable, unimodal, and has positive skew. This weight distribution compares favorably to the distributions of quantal amplitudes and of receptor number observed experimentally in central neurons and contrasts to the distribution found in plasticity models without size-dependent potentiation. Also in contrast to those models, which show strong competition between the synapses, stable plasticity is achieved with little competition. Instead, competition can be introduced by including a separate mechanism that scales synaptic strengths multiplicatively as a function of postsynaptic activity. In this model, synaptic weights change in proportion to how correlated they are with other inputs onto the same postsynaptic neuron. These results indicate that stable correlation-based plasticity can be achieved without introducing competition, suggesting that plasticity and competition need not coexist in all circuits or at all developmental stages.

- Hebbian plasticity
- synaptic weights
- synaptic competition
- activity-dependent scaling
- temporal learning
- stochastic approaches

Changes in the synaptic connections between neurons are widely believed to contribute to memory storage and to the activity-dependent development of neuronal networks. These changes are thought to occur through correlation-based, or Hebbian, plasticity, but the precise plasticity rules remain unclear. In general, such learning rules should meet three requirements. First, they should allow synaptic inputs to change in strength, depending on their correlation with postsynaptic firing or with the activity of other inputs (Sejnowski, 1977). Second, they should generate a stable distribution of synaptic weights. Finally, to account for activity-dependent development, they should generate competition between the inputs of a neuron, so that strengthening some inputs weakens others (Shatz, 1990; Miller, 1996).

Unconstrained Hebbian plasticity does not generate a stable weight distribution, because once an input is strengthened its correlation with the postsynaptic activity increases. This leads to further potentiation, and the synaptic weights grow without bound. Analogously, once depressed, synapses decrease to zero. These problems are usually fixed by constraining the learning rules, for instance by keeping the sum of weights constant. Alternatively, the postsynaptic activity can be used to adjust the threshold for potentiation (Bienenstock et al., 1982; Kirkwood et al., 1996), to regulate neuronal excitability (Desai et al., 1999), or to scale all of a neuron's synaptic weights (Turrigiano et al., 1998). All these mechanisms stabilize postsynaptic activity and introduce competition, but the choice of constraint strongly influences the behavior of the model (Miller and MacKay, 1994).

Recently, plasticity known as spike timing-dependent plasticity (STDP) has been observed (Bell et al., 1997; Markram et al., 1997; Bi and Poo, 1998). If a synaptic event precedes the postsynaptic spike, the synapse is potentiated. If it follows the postsynaptic spike, the synapse is depressed. STDP rules have been implemented in several modeling studies (Blum and Abbott, 1996; Gerstner et al., 1996; Eurich et al., 1999; Kistler and van Hemmen, 2000). Interestingly, because of strong competition between inputs, the postsynaptic firing rate in these models is independent of synaptic input rate (Song et al., 2000). However, as commonly implemented, these learning rules are unstable and require hard bounds on the synaptic weights. The synaptic weights are driven to these bounds and obtain a bimodal distribution, which is unlikely to reflect the weight distribution in biological neurons.

Here we present an intrinsically stable STDP learning rule. It incorporates the experimental observation that potentiation is weaker for strong synapses (Debanne et al., 1996, 1999; Bi and Poo, 1998). This learning rule generates a stable, unimodal, positively skewed distribution of synaptic weights that closely resembles the distribution of quantal amplitudes measured from central neurons (Bekkers et al., 1990; Turrigiano et al., 1998). The weights depend on their correlation with other inputs, so that learning occurs through cooperation between inputs. Finally, competition is almost absent, and must be introduced independently by implementing activity-dependent synaptic scaling (Turrigiano et al., 1998; O'Brien et al., 1998; Turrigiano, 1999). The scaling does not change the shape or the stability of the weight distribution.

## MATERIALS AND METHODS

*Experimental foundation of the plasticity rules.* In this section we extract the parameters of the model from the experimental data presented in Bi and Poo (1998). Synapses can be potentiated and depressed by pairing a synaptic event with a postsynaptic spike. If the synaptic event occurs before the postsynaptic spike, the synapse will be potentiated; if the postsynaptic spike precedes the synaptic event the synapse will be depressed (Markram et al., 1997; Zhang et al., 1998; Bi and Poo, 1998). The amount of conductance change decreases approximately exponentially with the time difference between the synaptic event and the postsynaptic spike, δt. Such a synaptic modification window is illustrated in Figure 1. Although in Bi and Poo (1998) the time window was slightly asymmetric (the time constant for the exponential function was 34 ± 13 msec for depression and 17 ± 9 msec for potentiation), this asymmetry is not essential, and we use in our model the same time constant for depression and potentiation, τ_{STDP} = 20 msec.

There is experimental evidence that the amount of change also depends on the initial synaptic size (Debanne et al., 1996, 1999; Bi and Poo, 1998). The relative amount of depression is independent of synaptic weight, whereas the relative amount of potentiation decreases for stronger synapses (Figs. 1B, 2A). We describe this by assuming that the relative amount of potentiation is inversely proportional to the weight, so that the absolute amount of potentiation is independent of the weight (Kistler and van Hemmen, 2000).

In the data of Bi and Poo (1998), synaptic events and postsynaptic spikes were paired 60 times. To deduce how much the synapse changes after a single pairing, we divide the weight change by 60, implicitly assuming independence between the pairing events. Denoting with w the synaptic conductance in Siemens, we describe potentiation as w → w + w_{p}, and depression as w → w + w_{d}. The plasticity rules are:

$$w_{p} = c_{p}\, e^{-|\delta t|/\tau_{STDP}}, \qquad w_{d} = -c_{d}\, w\, e^{-|\delta t|/\tau_{STDP}}$$

where c_{d} is the average amount of relative depression after one pairing, c_{d} = 0.003; and c_{p} is the average amount of potentiation after one pairing, c_{p} = 7 pS.

It will turn out that fluctuations in the amount of depression and potentiation are important. That is, the amount of conductance change is a noisy quantity. To describe this, we first tried an additive noise model in which a random conductance value was added to the weights after every pairing. For this model, weak synapses show strong relative fluctuations, and strong synapses show small relative fluctuations (Fig. 2B). Because this does not seem to correspond to the data, we rejected this model. Rather, the fluctuations in the relative change seem roughly constant, which implies that the fluctuations are multiplicative (Fig. 2C). We thus arrive at the following plasticity rules:

$$w_{p} = (c_{p} + \nu w)\, e^{-|\delta t|/\tau_{STDP}} \tag{1}$$

$$w_{d} = (-c_{d} + \nu)\, w\, e^{-|\delta t|/\tau_{STDP}} \tag{2}$$

where ν is a Gaussian random variable with zero mean. Its SD, σ, is extracted from the data. Assuming that no other noise sources were present in the measurement of the conductance changes, we find σ = 0.015. Equations 1 and 2 are the basis for this study.
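The single-pairing update can be sketched in a few lines of Python. This is only an illustration under stated assumptions: potentiation is taken as (c_{p} + νw) and depression as (−c_{d} + ν)w, each multiplied by an exponential timing factor with time constant τ_{STDP}, and c_{p} is set to the 7 pS extracted from the data. The function names are hypothetical.

```python
import math
import random

TAU_STDP = 20.0   # ms, common time constant for potentiation and depression
C_P = 7.0         # pS, mean potentiation per pairing (value extracted from the data)
C_D = 0.003       # mean relative depression per pairing
SIGMA = 0.015     # SD of the multiplicative noise term nu


def window(dt_ms):
    """Exponential timing window (assumed form); dt_ms is the event/spike time difference."""
    return math.exp(-abs(dt_ms) / TAU_STDP)


def potentiation_step(w, dt_ms, rng=random, sigma=SIGMA):
    """Weight change w_p for one pairing in which the synaptic event precedes the spike."""
    nu = rng.gauss(0.0, sigma)          # fresh noise sample for every pairing
    return (C_P + nu * w) * window(dt_ms)


def depression_step(w, dt_ms, rng=random, sigma=SIGMA):
    """Weight change w_d for one pairing in which the spike precedes the synaptic event."""
    nu = rng.gauss(0.0, sigma)
    return (-C_D * w + nu * w) * window(dt_ms)
```

With the noise switched off (`sigma=0.0`), a 100 pS synapse gains 7 pS or loses 0.3 pS for coincident events, which makes the size dependence explicit: the relative potentiation falls as 1/w while the relative depression stays fixed.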

The experimental data on the size dependence of the potentiation can also be fitted with a straight line in Figure 2A, as was done in Bi and Poo (1998). That is, w_{p} = −c′_{p} w log(w/w_{max}), where w_{max} is the conductance above which a potentiation protocol actually leads to depression of the synapse. For this plasticity rule runaway learning toward infinite weights is surely impossible. The equilibrium distribution that results from these rules looks similar to the one in Figure 3A in the central region, but a smaller second peak appears in the distribution around w = 0. It is not clear whether this has a biological analog. On the one hand, it could be an artifact caused by our limited experimental knowledge of Equation 1 at small w. Alternatively, it might be a useful biological mechanism corresponding to silent synapses or weak synapses that can be pruned.
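As a rough consistency check on the logarithmic rule, one can ask where its mean potentiation balances the mean depression −c_{d} w. The sketch below assumes that potentiating and depressing pairings are equally frequent, that the timing window contributes the same factor to both, and that log denotes the natural logarithm; none of these is stated in the text, and w^{*} is a symbol introduced here for the balance point.

```latex
-c'_{p}\, w \log\frac{w}{w_{max}} - c_{d}\, w = 0
\quad\Longrightarrow\quad
w^{*} = w_{max}\, e^{-c_{d}/c'_{p}}
```

Under these assumptions the balance point lies below w_{max} for any positive c_{d} and c′_{p}, and the mean change is negative for all larger weights. No value of c′_{p} is given, so no number is attached to w^{*}.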

*Simulation details.* To analyze the consequences of the above plasticity rules we simulate a single cell receiving random inputs. We use a leaky integrate-and-fire neuron with 100 MΩ input resistance, 20 msec time constant, −60 mV resting potential, and −50 mV firing threshold. After firing, the membrane potential resets to the resting potential. The neuron receives input from 25 inhibitory and 100 excitatory synapses. The inhibitory synapses have a reversal potential of −70 mV and a time constant of 5 msec. The inhibitory synapses are not plastic but are fixed at a conductance of 2000 pS and are stimulated with Poisson trains of 20 Hz. The excitatory synapses have a reversal potential of 0 mV and a time constant of 5 msec. The excitatory synapses also receive Poisson input. In some cases correlation across synaptic inputs is introduced. Correlation is implemented by randomly distributing N Poisson trains among the inputs. Every time step the Poisson trains are redistributed (Destexhe and Paré, 1999). For every synaptic event there is a chance 1/N that it is shared by another synapse. This yields a cross-correlation coefficient between trains of C(Δt) = (1/N) δ(Δt).
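One possible reading of the correlation scheme is sketched below: N source trains are drawn in every time step, and each input is assigned to a randomly chosen source train, with the assignment redrawn at every step. Poisson firing is approximated by a Bernoulli draw per 1 msec step. The function names, the input rate, and the number of inputs in the demo are illustrative assumptions, not values from the text.

```python
import random


def correlated_poisson_step(n_inputs, n_trains, rate_hz, dt_s, rng):
    """Return a list of 0/1 synaptic events for one time step.

    Each source train fires with probability rate_hz * dt_s. Every input is
    assigned to a randomly chosen source train, redrawn on every call.
    """
    p = rate_hz * dt_s
    source_fired = [rng.random() < p for _ in range(n_trains)]
    return [1 if source_fired[rng.randrange(n_trains)] else 0
            for _ in range(n_inputs)]


def simulate(n_inputs=20, n_trains=4, rate_hz=20.0, dt_s=0.001,
             n_steps=20000, seed=1):
    """Estimate the mean input rate and the zero-lag correlation of inputs 0 and 1."""
    rng = random.Random(seed)
    counts = [0] * n_inputs
    both = 0  # coincident events on inputs 0 and 1
    for _ in range(n_steps):
        events = correlated_poisson_step(n_inputs, n_trains, rate_hz, dt_s, rng)
        for i, e in enumerate(events):
            counts[i] += e
        both += events[0] * events[1]
    mean_rate = sum(counts) / (n_inputs * n_steps * dt_s)
    p0 = counts[0] / n_steps
    p1 = counts[1] / n_steps
    cov = both / n_steps - p0 * p1
    corr = cov / ((p0 * (1 - p0) * p1 * (1 - p1)) ** 0.5)
    return mean_rate, corr


mean_rate, corr = simulate()
```

In this sketch two inputs share a source train with probability 1/N in any given step, so the estimated zero-lag correlation should scatter around 1/N (0.25 for the four trains used here), while each input keeps the rate of a single source train.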

Plasticity was implemented as follows: when a synaptic event occurs after a postsynaptic spike, the synapse is depressed according to Equation 2. We assume that all plasticity events are independent, but one also needs to specify the behavior when there are multiple synaptic events at the same input. We assume that at a given synapse, only the first synaptic event after a given spike depresses the synapse; subsequent synaptic events do not depress the synapse further before another postsynaptic spike occurs. Potentiation was implemented analogously: only the first postsynaptic spike after the synaptic event leads to potentiation of the synapse, according to Equation 1. In the simulations we use c_{p} = 1 pS.
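The pairing bookkeeping described above can be sketched as a small state machine per synapse. This is a noise-free illustration with hypothetical class and attribute names. It also assumes that when several synaptic events precede a spike, the most recent one sets the timing, a detail the text does not specify, and it uses the same exponential window as in the plasticity rules.

```python
import math

TAU_STDP = 20.0  # ms
C_P = 1.0        # pS, value used in the simulations
C_D = 0.003      # relative depression per pairing


class Synapse:
    """Noise-free pairing bookkeeping for one synapse (illustrative only)."""

    def __init__(self, w):
        self.w = w
        self.pending_pre = None        # latest synaptic event not yet used for potentiation
        self.last_post = None          # time of the latest postsynaptic spike
        self.depression_armed = False  # True until the first event after a spike

    def on_synaptic_event(self, t):
        # Only the first synaptic event after a spike depresses the synapse.
        if self.depression_armed:
            dt = t - self.last_post
            self.w += -C_D * self.w * math.exp(-dt / TAU_STDP)
            self.depression_armed = False
        self.pending_pre = t

    def on_postsynaptic_spike(self, t):
        # Only the first spike after a synaptic event potentiates the synapse.
        if self.pending_pre is not None:
            dt = t - self.pending_pre
            self.w += C_P * math.exp(-dt / TAU_STDP)
            self.pending_pre = None
        self.last_post = t
        self.depression_armed = True
```

For example, an event at 0 msec followed by spikes at 10 and 15 msec potentiates the synapse once, at the 10 msec spike; events at 20 and 25 msec then depress it once, at the 20 msec event.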

Alternatively, one can assume that all pairing events cause depression and potentiation, as was assumed in most other studies (Kempter et al., 1999; Song et al., 2000). For Poisson trains one can show that this only changes the rate of change; the equilibrium state is the same in both implementations. We choose our implementation because it is consistent with the calculations presented in the Appendix.

The plasticity rules are independent of the presynaptic frequency, which is likely to be an over-simplification (Markram et al., 1997). However, this assumption is not essential for our argument: frequency-dependent potentiation or depression would shift the mean synaptic weight depending on stimulation frequency, but stability and competition would not be altered.

The parameters used in the simulation reflect a typical cultured neuron; such neurons have few synapses with large conductances. In slice or in vivo the number of inputs runs into the thousands, and we expect that the plasticity rules will yield synapses with correspondingly smaller mean weights. Indeed, such scaling could be accomplished by activity-dependent scaling (see below) and does not qualitatively change our results.

*Activity-dependent scaling.* Here we show how activity-dependent scaling is incorporated in the model. As the precise mechanism behind activity-dependent scaling is not known, we present only a possible implementation. Activity-dependent scaling is a mechanism that adjusts the synaptic weights to regulate the postsynaptic activity. The postsynaptic activity is measured with a slowly varying sensor, a(t). It increases with every postsynaptic spike, and decays exponentially between spikes:

$$\tau \frac{da(t)}{dt} = -a(t) + \sum_{i} \delta(t - t_{i})$$

where t_{i} are the spike times. The biological time constant of the activity sensor, τ, is unknown but is expected to be slow; we use τ = 100 sec. Activity-dependent scaling scales the weights to prevent too low or too high activity levels. The scaling is thought to be multiplicative and independent of presynaptic activity (Turrigiano et al., 1998; Turrigiano, 1999). A simple implementation would be to update the weights every time step according to:

$$\frac{dw(t)}{dt} = \beta\, w(t)\, [a_{goal} - a(t)]$$

where a_{goal} is the desired postsynaptic activity, set to 20 Hz, and β is a constant determining the strength of the scaling. This mechanism scales the synapses towards the activity goal, but because the plasticity rules (Equations 1 and 2) also pull on the weights, a residual deviation between the actual activity and its goal value remains in the end. Such errors are prevented by using an “integral controller” (Riggs, 1970):

$$\frac{dw(t)}{dt} = \beta\, w(t)\, [a_{goal} - a(t)] + \gamma\, w(t) \int_{0}^{t} dt'\, [a_{goal} - a(t')] \tag{3}$$

where γ is another constant. As the second term accumulates the error, this term will in the long run be dominant if the goal value is not reached. As a result, the steady-state activity level will eventually become equal to its goal value.
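The difference between purely proportional scaling and scaling with an added integral term can be illustrated with a toy rate model. Everything beyond β, γ, τ, and a_{goal} is an assumption made for this sketch: the firing rate is taken to be proportional to a single weight (`k`), the sensor is driven by that rate rather than by individual spikes, and the pull of the STDP rules is replaced by a linear relaxation towards a hypothetical weight `w_stdp` with strength `lam`.

```python
def simulate_scaling(beta=4e-5, gamma=1e-7, tau=100.0, a_goal=20.0,
                     t_end=40000.0, dt=1.0):
    """Euler integration of the activity sensor and multiplicative scaling.

    Toy assumptions (not from the text): rate = k * w, a rate-driven sensor,
    and a linear pull towards w_stdp standing in for the STDP rules.
    """
    k = 0.1          # Hz per pS, hypothetical rate gain
    w_stdp = 300.0   # pS, hypothetical weight favoured by STDP alone
    lam = 1e-4       # 1/s, hypothetical strength of the STDP pull
    w = w_stdp
    a = k * w        # sensor starts at the current rate
    integral = 0.0   # accumulated error, in Hz * s
    for _ in range(int(t_end / dt)):
        rate = k * w
        error = a_goal - a
        # Proportional term + integral term + stand-in for the STDP pull.
        dw = beta * w * error + gamma * w * integral + lam * (w_stdp - w)
        a += dt * (rate - a) / tau
        integral += dt * error
        w += dt * dw
    return a, w


a_p, w_p = simulate_scaling(gamma=0.0)   # proportional term only
a_pi, w_pi = simulate_scaling()          # with the integral term
```

In this toy setting the proportional-only run settles at an activity about 1 Hz above the goal, because the stand-in STDP pull keeps opposing the scaling, whereas the run with the integral term ends at the goal value.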

As with any feedback system, oscillations readily occur. These oscillations can arise as follows: when the activity does not have the desired value, the weights are slowly adjusted, and the activity moves back to its desired value. However, as the activity sensor has a delay, the weights can be overcompensated and overshoot. This leads to oscillations in the activity. Because such oscillations are not known to occur, the parameters were adjusted to prevent them. Suitable parameter values can be calculated from control theory (Riggs, 1970). The parameters do not require sensitive adjustment. In our simulations we use β = 4 × 10^{−5}/sec/Hz and γ = 10^{−7}/sec^{2}/Hz. These values give slow scaling without oscillations. Our arguments would not change if oscillations did occur, because we are interested in the equilibrium.

The scaling can be incorporated into the weight evolution equation (see Appendix). The scaling shifts the point where potentiation and depression are balanced, thus adjusting the mean weight while approximately preserving the shape of the distribution, consistent with experimental observations (see Discussion).

## RESULTS

### Experimental foundation

We base our model on two experimental observations. The first is STDP: it has been observed that synapses can be potentiated and depressed by pairing a synaptic event with a postsynaptic spike. If the synaptic event occurs ∼50 msec or less before the postsynaptic spike, the synapse will be potentiated; if the spike precedes the synaptic event, the synapse will be depressed (Markram et al., 1997; Bi and Poo, 1998). The amount of weight change is approximately exponential in the time between the synaptic event and the postsynaptic spike. The resulting synaptic modification window is plotted in Figure 1.

The second essential ingredient is that the amount of synaptic change depends on the synaptic size. For STDP protocols it was observed that the relative amount of depression is independent of the initial synaptic size, whereas the relative potentiation is larger for weak synapses than for strong synapses. In Figure 2A we plot such data from experiments on cultured neurons (Bi and Poo, 1998). A similar observation was made in hippocampal slices (Debanne et al., 1996, 1999). Possibly closely related to this, it has been observed that the amount of potentiation and depression depends on the history of synaptic stimulation (Yang and Faber, 1991; Ngezahayo et al., 2000). Including the weight dependence in the plasticity rules has drastic consequences for the weight distribution, the stability of the plasticity, and synaptic competition.

### Synaptic weight distribution after prolonged random stimulation

We ask how the synaptic weights of a neuron evolve when they are subject to the plasticity rules sketched in Figure 1. We first study the case when the neuron receives random synaptic input. Consider the distribution of synaptic weights, P(w). A single bin in this distribution describes the probability that synapses have a weight w. The synapses in this bin are continuously potentiated and depressed because of ongoing coincidences of presynaptic and postsynaptic spikes. Because of the size dependence of the plasticity, strong synapses experience a net depression, whereas weak synapses experience a net potentiation. This confines the synaptic weights. After a while the distribution reaches an equilibrium, at which the individual synapses still change, but the distribution is stationary.

The above picture turns out to be correct in simulations. We simulate an integrate and fire neuron receiving random synaptic input. The presynaptic signal is provided by 100 excitatory synapses stimulated with independent Poisson trains. The synapses are subject to the plasticity rules sketched in Figure 1 and given in Equations 1 and 2. The parameters of the plasticity rules are based on physiological data (see Materials and Methods). As fluctuations in the amount of synaptic change induced by potentiation and depression are important for the shape of the resulting weight distribution, a multiplicative noise model is part of the plasticity rules.

The distribution of synaptic weights after prolonged stimulation is shown in Figure 3A. The analysis in the Appendix shows explicitly that this equilibrium distribution is independent of the initial distribution. The resulting distribution is very stable and does not require any fine tuning of parameters. The distribution is unimodal and has a positive skew. This is similar to the distribution found in quantal synaptic current measurements and synaptic staining studies (O'Brien et al., 1998; Turrigiano et al., 1998). For comparison we plot in Figure 3B a distribution of quantal amplitudes, as observed from a single cultured cortical pyramidal neuron held at −70 mV (Turrigiano et al., 1998). In the Discussion, we expand on their similarity.

In the equilibrium state, the synaptic weights continuously make small random jumps, but their movement is confined. The mean synaptic weight is approximately located where the confining force vanishes, and the weight experiences no net depression or potentiation. There are two contributions to the confining force. (1) The probability of causing a postsynaptic spike increases linearly with the weight of the synapse. Strong synapses therefore have a larger probability of being potentiated, whereas the probability of being depressed is independent of synaptic strength (see Appendix). Thus, once potentiated, strong synapses have an even higher chance of subsequent potentiation. This is a destabilizing force, and if this were the only force present, weights would run off to infinity. (2) However, stronger synapses experience a smaller relative conductance change when potentiated. This constitutes the second force. Because relative potentiation decreases with increasing weight, but relative depression does not, this force is stabilizing. When the two forces are combined, the stabilizing force wins, and the stable distribution shown in Figure 3A results.

The analytical treatment presented in the Appendix gives an accurate description of the distribution found in the simulations. It describes how the weights evolve by combining the weight dependence of potentiation and the probability that a synapse of given weight will be potentiated and depressed. This yields the weight distribution (Fig. 3A, *solid line*). The analysis also shows explicitly that the destabilizing force plays only a minor role. If this force is completely turned off, as is easily done in the analytical expressions, weak and strong synapses have an equal probability for potentiation. Yet, this hardly changes the shape of the distribution (Fig. 3A, *dashed line*), indicating the minor role of the destabilizing force in our model.
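The case with the destabilizing force turned off can be illustrated by a toy random walk in which every synapse is potentiated or depressed with equal probability at each step. This is a sketch rather than the analysis referred to above: it assumes potentiation of c_{p} + νw and depression of (−c_{d} + ν)w, ignores the timing window (every pairing acts at full amplitude), and uses c_{p} = 1 pS as in the simulations. The function names, the number of synapses, and the number of steps are arbitrary choices.

```python
import random

C_P = 1.0      # pS, potentiation per pairing used in the simulations
C_D = 0.003    # relative depression per pairing
SIGMA = 0.015  # SD of the multiplicative noise


def equilibrium_weights(n_syn=300, n_steps=4000, w0=100.0, seed=0):
    """Random walk of weights with equal potentiation/depression probability."""
    rng = random.Random(seed)
    weights = [w0] * n_syn
    for _ in range(n_steps):
        for i in range(n_syn):
            w = weights[i]
            nu = rng.gauss(0.0, SIGMA)
            if rng.random() < 0.5:
                w += C_P + nu * w        # potentiation at full window amplitude
            else:
                w += -C_D * w + nu * w   # depression at full window amplitude
            weights[i] = w
    return weights


def mean_and_skew(xs):
    """Sample mean and (population) skewness of a list of numbers."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / n
    skew = sum((x - m) ** 3 for x in xs) / n / var ** 1.5
    return m, skew


weights = equilibrium_weights()
mean_w, skew_w = mean_and_skew(weights)
```

Because the mean change per step, (c_{p} − c_{d} w)/2, is linear in w, the mean weight in this toy model relaxes towards c_{p}/c_{d} (about 333 pS) regardless of the starting value, and the multiplicative noise gives the distribution a tail towards large weights.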

### Stability: comparison to other models

How does this model compare to other STDP models? In most other models (but see Kistler and van Hemmen, 2000), potentiation and depression change the synaptic weight by a fixed amount, independent of the synaptic weight (Blum and Abbott, 1996; Gerstner et al., 1996; Amarasingham and Levy, 1998; Kempter et al., 1999; Song et al., 2000). The typical shape of the weight distribution for such a model is shown in Figure 3C. Note that, depending on the parameters, the synapses split into two groups of either weak or strong synapses, despite the absence of any structure in the input.

The behavior can be understood from the force terms introduced above, which determine the net potentiation and depression that a certain weight experiences. Because here potentiation and depression have an identical weight dependence, the stabilizing force vanishes. What remains is the destabilizing force. This small but important force causes strong synapses to get even larger, as they will have a higher probability of inducing a spike. This positive feedback will cause the weights to run off to infinite values. Therefore, a hard limit on the maximal weight is required for these models. Similarly, weights below a certain threshold will be depressed until they hit the lower bound. In the Appendix we show how the synaptic weight distribution can also be calculated for these models.
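The positive feedback described above can be caricatured with a random walk in which the step size is independent of the weight but the probability of potentiation rises linearly with it. This is a hypothetical toy, not one of the cited models: the step size, the bounds, and the parameters `p0` and `slope` are invented for illustration and place the balance point in the middle of the allowed range.

```python
import random


def additive_stdp_weights(n_syn=400, n_steps=5000, w_max=100.0,
                          step=1.0, p0=0.45, slope=0.001, seed=0):
    """Weight-independent steps with a weight-dependent potentiation probability.

    P(potentiation) = p0 + slope * w, so depression dominates below
    w = (0.5 - p0) / slope and potentiation dominates above it.
    Hard bounds at 0 and w_max keep the weights finite.
    """
    rng = random.Random(seed)
    weights = [rng.uniform(0.0, w_max) for _ in range(n_syn)]
    for _ in range(n_steps):
        for i in range(n_syn):
            w = weights[i]
            if rng.random() < p0 + slope * w:
                w += step
            else:
                w -= step
            weights[i] = min(w_max, max(0.0, w))  # hard bounds
    return weights


weights = additive_stdp_weights()
low = sum(w <= 20.0 for w in weights) / len(weights)    # fraction near the lower bound
high = sum(w >= 80.0 for w in weights) / len(weights)   # fraction near the upper bound
```

Starting from uniformly spread weights and unstructured input statistics, most weights in this toy end up close to one of the two bounds, which mirrors the split into weak and strong groups described for models without weight-dependent potentiation.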

In these models the weight distribution is sensitive to small perturbations of the parameters and the destabilizing force (see Appendix). In contrast, in our model the effect of the destabilizing force is small. The distribution is dominated by the differential weight dependence of potentiation and depression, overruling the positive feedback and stabilizing the distribution.

### Correlated input potentiates synapses

Above we determined the weight distribution reached after long random stimulation. From a functional point of view this is a rather dull situation: no memory is stored, and all inputs obtain on average the same weight. The question arises how memory patterns are impressed and stored in the model. In contrast to conventional forms of long-term potentiation (LTP) and long-term depression (LTD), in our STDP scheme synapses are not strengthened by increasing their input rates. A different stimulation frequency changes the rate at which depression and potentiation occur, but will not effectively change the synaptic weight.

An effective way to store memories is to introduce correlation among inputs (Oja, 1982; Song et al., 2000). When inputs are correlated, the probability for potentiation is larger because synaptic events will often occur simultaneously and induce postsynaptic spikes. Weak synapses piggyback on the strong ones (Zhang et al., 1998). However, the probability for depression is unaltered (see Appendix). This is illustrated in Figure 4, where four groups of inputs with varying amounts of correlation are presented to the neuron. Correlation shifts the distribution towards higher conductance values as the balance between potentiation and depression shifts. The shape of the distribution remains qualitatively the same, and the stability is maintained. The effect of the correlation is twofold: first, most of the postsynaptic spikes will be triggered by correlated inputs, and, second, these inputs will be potentiated. The mean weight is proportional to the correlation. In models without weight-dependent potentiation, inputs with correlations above a certain threshold obtain maximal weight, and inputs with less correlation obtain essentially zero weight. There is a sharp transition between the two.

### Lack of competition

In many models of constrained Hebbian plasticity there is strong competition between synapses: enhancement of one synapse leads to depression of the other synapses (Miller, 1996; Song et al., 2000). In our model there is practically no competition: the synaptic weights are insensitive to changes in the other inputs. To demonstrate this, we simulate the following situation (Fig. 5A): the postsynaptic neuron receives two groups of inputs. Initially both groups are uncorrelated, and as a result both groups have identical mean weights. Next, strong correlation within the first group is introduced, and this potentiates these synapses as described above. The mean conductance of this group and the postsynaptic firing rate increase. If there were competition present between the synapses, this stimulation should lead to a reduction of the weight of synapses in the other group. The weights of the other group of synapses are, however, hardly affected, as is illustrated in Figure 5A. Thus, there is little competition.

In STDP models in which potentiation and depression are independent of the synaptic weight, there is strong competition. The reason is as follows. In STDP the potentiation mainly occurs if the input has caused the spike; in other words, the inputs compete for the postsynaptic spike. When one input starts driving the postsynaptic spikes and its weight increases, the other inputs will become less correlated with the postsynaptic spikes, and these inputs will effectively be depressed (see Appendix).

The competition in these models is so strong that there is a limited regime in which increasing the input rate is counteracted by the reduction of the synaptic weights, causing the postsynaptic firing frequency to be almost independent of the input rate (Song et al., 2000). In our model, the potentiation and depression of synapses is limited. As a result it is not very sensitive to changes in the total input. Competition and output rate normalization are virtually absent, and the output rate follows the input rate.

### Activity-dependent scaling as a separate competition mechanism

The lack of competition in our model demonstrates that stable Hebbian learning is possible without competition. Nevertheless, competition is useful for developmental processes such as ocular dominance column plasticity, and output rate normalization is useful in situations when the input rate or the number of inputs undergoes large changes. Therefore we include activity-dependent scaling of synaptic weights in the model. Activity-dependent scaling is a homeostatic mechanism which, in reaction to changes in the postsynaptic activity, scales all synapses in an effort to keep the activity of the neuron within bounds. The scaling is multiplicative and does not seem to depend on presynaptic spike activity (Turrigiano et al., 1998). To implement activity-dependent scaling, we introduce a slow-varying sensor of activity. The weights are multiplicatively scaled if the readout of the activity sensor differs from some preset goal value (see Materials and Methods).

The scaling mechanism introduces competition between the synapses (Fig. 5B). This is expected: if one synapse is potentiated, the postsynaptic activity rises, and the activity-dependent scaling kicks in to reduce all synaptic weights. The scaling works on long time scales, and in the end the goal level of activity is maintained. The shape of the weight distribution and its stability are not affected by the scaling (Fig. 5C). The competition is thus separated from the STDP.

Note that this additional plasticity rule updates the weights independent of the presynaptic rate, in contrast to the STDP. Thus, if there are two sets of synaptic inputs, one with a low rate and one with a high rate, the weights of the low rate inputs will mainly be governed by the activity-dependent scaling, whereas the high rate inputs will be ruled by the STDP.

## DISCUSSION

Despite the importance of correlation-based plasticity in learning and development, the exact nature of the learning rules that operate in biological networks remains unclear. Here we have shown that a learning rule based closely on experimental data allows inputs to change in strength as a function of correlation, while generating and maintaining a stable distribution of synaptic weights. We use an STDP learning rule in which potentiation occurs when a postsynaptic spike follows a synaptic event, and depression occurs if a postsynaptic spike precedes a synaptic event. In addition, this rule incorporates the experimental observation that the relative amount of potentiation decreases as the synapse strengthens (Debanne et al., 1996, 1999; Bi and Poo, 1998). In this weight-dependent STDP, the synaptic weights evolve into a unimodal, positively skewed distribution that closely resembles experimentally measured distributions of quantal amplitudes (Turrigiano et al., 1998) and of receptor number (O'Brien et al., 1998). The introduction of correlations between inputs increases synaptic strengths, but does not affect the shape and stability of the weight distribution. Weight-dependent STDP is intrinsically stable without requiring artificial constraints upon synaptic strengths.

The cause for instability in rate-based plasticity models is a destabilizing mechanism similar to the one in the STDP models: if a synapse is potentiated, the larger synapse causes a higher postsynaptic activity, which in turn potentiates the synapse even further. Weight-dependent potentiation could probably solve the problem of runaway learning for conventional LTP and LTD as well.

The shape of the synaptic weight distribution can be characterized by its mean, SD, and skew. In our model, the mean synaptic weight is determined by the balance point between potentiation and depression. Our analysis shows that this balance point is itself determined by two competing forces, one stabilizing and the other destabilizing. Because stronger synapses are more likely to evoke a postsynaptic spike, they are also more likely to be potentiated than weak synapses. This generates a destabilizing force that pushes synaptic strengths towards higher values. In models without the weight dependence of potentiation, this destabilizing force will tend to push synapses all the way to their upper and lower bounds. In our model, this destabilizing force is balanced by a reduction of the potentiation as synaptic weights increase. Because the amount of depression stays constant, for strong synapses depression will be larger than potentiation. This provides a brake on synaptic strength, constraining the weights of the synapses at central values.

The width of the synaptic weight distribution is strongly influenced by variations in the amount of potentiation and depression for different pairings of synaptic events and postsynaptic spikes. To make the model as realistic as possible, the magnitude of these fluctuations was extracted from the experimental data of Bi and Poo (1998). The magnitude may be overestimated because we assumed that all the measured noise arose from trial-to-trial fluctuations in the amount of potentiation or depression. Without this noise the simulated synaptic weight distribution has the same shape and overall behavior but is considerably narrower. Other factors could also contribute to a widening of the distribution, such as the presence of groups of inputs with different correlation levels (Fig. 4). The noise widens the weight distribution to values similar to those measured for quantal amplitudes, but many other factors could contribute to the width of these measured distributions. For example, the distribution of quantal amplitudes can be widened because of cable filtering (Spruston et al., 1993; Forti et al., 1997) and fluctuations in the transmitter content of vesicles (Frerking et al., 1995; Liu et al., 1999).

Another approach to assess the synaptic conductance distribution in central neurons is by using immunohistochemical methods to quantify the staining intensity of synaptic receptors. Using this method, the observed distributions of receptor staining are also unimodal and positively skewed (Nusser et al., 1997; O'Brien et al., 1998). As in our model, the shape of the distributions of both quantal amplitudes (Turrigiano et al., 1998; O'Brien et al., 1998) and of receptor staining (O'Brien et al., 1998) is preserved when synaptic strengths are scaled up or down in response to changes in activity.

A feature of STDP learning rules, with or without a weight dependence, is that inputs are potentiated as a function of their correlation on short time scales. This is because spikes in the postsynaptic neuron are chiefly caused by inputs correlated on short time scales. This contrasts with conventional rate-based Hebbian models in which the stimulation frequency determines which synapses get potentiated and in which short time scale correlations are not essential. Here, however, when precise timing does matter, correlations on short time scales are essential (Gerstner et al., 1996; Zhang et al., 1998; Kistler and van Hemmen, 2000; Song et al., 2000) as has been suggested for learning and memory (von der Malsburg, 1981).

An important feature of activity-dependent development in some CNS regions is competition between inputs onto a postsynaptic neuron (Shatz, 1990; Miller, 1996). Such competition allows some inputs to be retained, whereas other inputs are lost. In rate-based Hebbian models in which plasticity depends on the firing rate, synaptic normalization schemes are necessary to stabilize synaptic weights, and these schemes invariably introduce competition between synapses (Miller and MacKay, 1994). This has led to the notion that competition is an inevitable consequence of stable Hebbian plasticity. STDP learning rules that do not include the weight dependence of potentiation also produce strong competition; for instance, correlations in some inputs push other inputs to zero (Song et al., 2000). In contrast, weight-dependent STDP generates stable Hebbian plasticity without introducing much competition.

Competition can be introduced into weight-dependent STDP through an independent mechanism such as activity-dependent scaling. It is important to note that activity-dependent scaling is not needed to prevent runaway learning, but instead keeps the activity of the postsynaptic neuron within bounds as the input undergoes strong changes. This allows the activity-dependent scaling to be much slower than the STDP, as is suggested by experimental observations (Turrigiano et al., 1998). Our implementation of the scaling as an integral controller is well suited for this task because it is both slow and strong. The scaling mechanism literally scales the entire weight distribution up or down without qualitatively changing its shape, as was also observed experimentally (O'Brien et al., 1998; Turrigiano et al., 1998). Activity-dependent scaling introduces competition between the synapses because if some synapses are potentiated and the postsynaptic activity increases, all the synaptic weights will be scaled down.

Whereas strong competition is clearly important for some processes such as ocular dominance plasticity, in which inputs from one eye are retained, and inputs from the other eye are largely lost, it may not be desirable under all conditions and during all periods of development. In adult animals or in central circuits that code a continuous variable, such as direction, it may be advantageous to allow synaptic weights to change while retaining weak inputs. Such inputs could then be potentiated again if circumstances were to change, allowing the circuit greater flexibility. Our results demonstrate that stable Hebbian plasticity and synaptic competition are separable entities and suggest that learning rules may vary by region or developmental period to generate more or less competition.

## Appendix

### Derivation of the weight distribution

Apart from simulations, we present calculations that show how the synaptic weight distribution follows from the plasticity rules. The advantage of the analytical calculations is that although some approximations have to be made, the role of the various parameters in the model becomes clear and can be studied systematically.

Consider a neuron receiving uncorrelated Poisson inputs. Because of random coincidences of presynaptic and postsynaptic spikes, its synapses continuously undergo weight modifications according to the plasticity rules. We denote the distribution of its synaptic weights by P(w, t), where t denotes the time. A single bin in this distribution describes the probability that a synapse has a weight w. At every time step the number of synapses in this bin can change because of potentiation and depression (Fig. 6). Collecting all terms that change the number of synapses in this bin, we have:
Equation 4
where ρ_{in} is the presynaptic rate, assumed identical at all synapses, p_{p} is the probability that the synapse is potentiated, p_{d} is the probability that the synapse is depressed, and w_{d} and w_{p} describe how much the weight changes with depression and potentiation. The first two right-hand side terms in Equation 4 are loss terms that decrease the number of synapses with weight w. The last two terms are gain terms that describe synapses with initially different weights acquiring the new weight w because of either potentiation or depression.

For now we neglect the precise timing dependence of the plasticity. Instead, we assume that if the synaptic event occurs within a narrow time window t_{w} after a spike, the synapse is depressed by an amount w_{d} = −c_{d}w + vw (see Equation 2 in Materials and Methods). Similarly, if the synaptic event occurs within the window before the postsynaptic spike, the synapse is potentiated by w_{p} = c_{p} + vw. In other words, the exponential window is replaced by a square window of width t_{w}. The justification is that when averaged over many pairings, only the average amount of change is important. (This approximation introduces a small, negligible error in Eq. 7.) In the simulations the exponential window is used.

By Taylor expanding P(w − w_{p}) and P(w + w_{d}), one obtains the Fokker–Planck equation (van Kampen, 1992):
Equation 5
with jump-moments A and B:
Equation 6
Equation 7
where ς^{2} is the variance of the noise term v. This derivation requires that the changes in w are small compared with the scale over which P(w, t) varies, which is indeed the case. To solve these equations, however, we first need to know the probability of inducing potentiation, p_{p}, and the probability of inducing depression, p_{d}.

### The probability that a synaptic event causes a spike

First, we calculate the probability that a synaptic event depresses the synapse. This requires that the presynaptic event follows a postsynaptic spike within a short window. We use a simplified model: a non-leaky integrate-and-fire neuron. The cell receives background input from other synapses, described by a constant background current I_{0}. This current causes the neuron to fire regularly with an interspike interval t_{isi} = V_{thr}C/I_{0}, where V_{thr} is the threshold voltage relative to the resting voltage, and C is the membrane capacitance. We assume that the presynaptic signal is uncorrelated with other inputs and that it is not affected by the spikes in the postsynaptic neuron (that is, there are no recurrent connections). The synaptic event arrives in the postsynaptic neuron at a random time. The postsynaptic spike, occurring earlier, is of course independent of this synaptic event. Therefore, the probability that the synaptic event occurs within a time window t_{w} after a spike is:
Equation 8
where it is implied that t_{w} < t_{isi}.

Next, we calculate the probability that a synaptic event potentiates the synapse, which requires that the postsynaptic spike comes after the synaptic event. Because the synaptic event can help to induce a spike, this is more complicated than the previous case (Kistler and van Hemmen, 2000). We calculate this as follows: the synaptic current is modeled with a brief square pulse of duration τ_{syn}, its amplitude is wV_{syn}, where w is the synaptic weight and V_{syn} is the synaptic drive (assumed constant). This synaptic current causes the membrane voltage to jump by an amount τ_{syn}wV_{syn}/C. If after the jump the voltage is still below threshold, the interspike interval is shortened to t′_{isi} = t_{isi} − τ_{syn}wV_{syn}/I_{0}. If, on the other hand, the neuron was already close to threshold it will spike (Fig. 7). One finds that the time between the synaptic event and the spike, δt, is distributed as,
This describes an enhanced probability for small intervals between the synaptic event and a postsynaptic spike. The reason is that postsynaptic spikes are likely to follow the synaptic input. With τ_{syn} ≪ t_{w}, the probability that the synapse is potentiated is
Equation 9
with W_{tot} = t_{w}I_{0}/(V_{syn}τ_{syn}). This probability is the sum of a constant term, which describes random coincidence of presynaptic and postsynaptic spikes, and a term linear in w, which describes the enhanced probability that the synapse induces a spike. For synapses with zero weight, p_{p} equals the probability of inducing depression, p_{d}. The reason is that the postsynaptic neuron is unaffected by a tiny input; in other words, only random coincidences cause potentiation. The linear term depends on W_{tot}, which is the average current of all other inputs expressed as an instantaneous conductance. If the input to the cell is purely from excitatory synapses, one has:
Equation 10
where N is the number of synapses, and 〈w〉 is their average weight. This dependence of W_{tot} on the average weight shows that W_{tot} is a competition parameter, describing competition among inputs for a postsynaptic spike. As the total input W_{tot} increases, p_{p} gets smaller. Finally, for large synaptic conductances p_{p} reaches its upper limit of one. In that case the presynaptic event always induces a spike, that is, the connection is suprathreshold. In physiologically relevant situations such synapses are rare, and this upper limit can be ignored.

In Figure 7D we present the results of a simulation showing the probability for depression and potentiation. Although the synaptic time course, leak conductance and noise in the cell give rise to small correction terms (M. C. W. van Rossum, unpublished observations), Equations 8 and 9 are still qualitatively valid: p_{d} is independent of the weight, and p_{p} increases linearly with the weight and equals p_{d} for zero weight. Experimental verification of this law would be desirable.
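The qualitative statements about p_{d} and p_{p} can be checked with a short numerical sketch of the non-leaky integrate-and-fire cell described above. The closed forms it is compared against, p_{d} = t_{w}/t_{isi} and p_{p} = p_{d}(1 + w/W_{tot}), are our reading of the surrounding text and are an assumption here, not a quotation of Equations 8 and 9. The function name, the phase grid, and the parameter values are illustrative.

```python
import numpy as np

def pairing_probabilities(w, t_w, t_isi, W_tot, n_phase=200_000):
    """Estimate (p_d, p_p) for a regularly firing, non-leaky integrate-and-fire cell.

    The synaptic event arrives at a uniformly distributed time since the last
    spike. A synapse of weight w shortens the time to the next spike by
    t_w * w / W_tot, which equals tau_syn * w * V_syn / I_0 when
    W_tot = t_w * I_0 / (V_syn * tau_syn).
    """
    # Arrival time of the synaptic event, measured from the previous spike
    # (midpoints of a uniform grid instead of random draws).
    t_since = (np.arange(n_phase) + 0.5) / n_phase * t_isi
    # Depression: the event falls within t_w after the previous spike.
    p_d = np.mean(t_since < t_w)
    # Time from the event to the next spike; zero if the jump crosses threshold.
    dt_jump = t_w * w / W_tot
    t_to_spike = np.maximum(t_isi - t_since - dt_jump, 0.0)
    # Potentiation: the next spike falls within t_w after the event.
    p_p = np.mean(t_to_spike < t_w)
    return p_d, p_p

# Illustrative values only (arbitrary units, not taken from the paper).
for weight in (0.0, 50.0, 100.0):
    p_d, p_p = pairing_probabilities(weight, t_w=20.0, t_isi=100.0, W_tot=1000.0)
    print(weight, p_d, p_p, p_d * (1.0 + weight / 1000.0))
```

In this sketch p_{d} does not depend on the weight, whereas p_{p} grows linearly with w and coincides with p_{d} at w = 0, as long as the synapse is far from suprathreshold.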

### The synaptic weight distribution

Using the above results for p_{p} and p_{d}, we have for the distribution of synaptic weights,
Equation 11
Equation 12
Equation 13
This describes the evolution of the synaptic weight distribution under random stimulation. Of most interest is the steady-state solution, which corresponds to the equilibrium distribution a neuron obtains with random stimulation. It is independent of the initial distribution.

The steady-state solution is found by imposing that ∂P/∂t = 0 and that the probability current J(w) = A(w)P(w) − 1/2 ∂/∂w [B(w)P(w)] vanishes. The resulting equation is easily solved numerically. An analytical solution is obtained if one assumes that the noise ς and W_{tot} are large, so that B(w) ≈ p_{d}(c_{p}^{2} + 2w^{2}ς^{2}). In this case the distribution reads:
Equation 14
where N normalizes the distribution such that ∫P(w)dw = 1. This distribution is plotted in Figure 3A (*solid line*). It closely matches the simulation results. The steady-state solution is unimodal, and the mean weight is roughly located where potentiation and depression are balanced (where A crosses zero). The distribution peaks at w = c_{p}/(c_{d} − c_{p}/W_{tot} + 2ς^{2}). Because W_{tot} depends on the average weight (Eq. 10), the distribution (Eq. 14) has to be solved self-consistently. In practice this poses no problem because the distribution depends only weakly on W_{tot}. Indeed, approximating W_{tot} → ∞ changes the distribution only slightly (Fig. 3, *dashed line*). This means that the enhanced probability of inducing a spike for strong synapses is a minor effect. Finally, note that the distribution does not vanish at zero conductance; this is hard to see in Figure 3, but is clear from Equation 14.
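The zero-current condition can also be integrated numerically, which gives a quick check of the peak location quoted above. The sketch below assumes a drift A(w) = p_{d}[c_{p}(1 + w/W_{tot}) − c_{d}w], which is our reconstruction from the definitions of w_{p}, w_{d}, p_{p}, and p_{d}, together with the approximation B(w) ≈ p_{d}(c_{p}^{2} + 2w^{2}ς^{2}) stated in the text. The common factor p_{d} cancels in the steady state. The function name and all parameter values are illustrative, not the ones used for Figure 3.

```python
import numpy as np

def steady_state(c_p, c_d, sigma, W_tot, w_max, n=4001):
    """Steady-state weight distribution from the zero-current condition.

    J = A P - (1/2) d(B P)/dw = 0 gives d(ln P)/dw = (2A - B') / B.
    A and B are expressed per presynaptic event and divided by p_d.
    """
    w = np.linspace(0.0, w_max, n)
    dw = w[1] - w[0]
    A = c_p * (1.0 + w / W_tot) - c_d * w   # assumed drift (see lead-in)
    B = c_p**2 + 2.0 * sigma**2 * w**2      # diffusion, large-noise approximation
    dB = 4.0 * sigma**2 * w                 # derivative of B
    f = (2.0 * A - dB) / B
    # Cumulative trapezoidal integral of f gives ln P up to a constant.
    log_p = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * dw)))
    p = np.exp(log_p - log_p.max())
    p /= p.sum() * dw                       # normalize so that the integral is 1
    return w, p

# Illustrative parameters only.
c_p, c_d, sigma, W_tot = 1.0, 0.003, 0.015, 2.0e4
w, p = steady_state(c_p, c_d, sigma, W_tot, w_max=2000.0)
dw = w[1] - w[0]
mode = w[np.argmax(p)]
mean = np.sum(w * p) * dw
skew = np.sum((w - mean) ** 3 * p) * dw / (np.sum((w - mean) ** 2 * p) * dw) ** 1.5
predicted_mode = c_p / (c_d - c_p / W_tot + 2.0 * sigma**2)
print(mode, predicted_mode, mean, skew)
```

Under these assumptions the numerical mode agrees with the peak formula to within the grid spacing, the mean lies above the mode, and the skew is positive.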

The evolution of the distribution can be compared to diffusion of particles (weights) in an external force field. In analogy with the diffusion equation, the A term is a force that the synapse experiences, its sign determines whether with the next event the weight will on the average increase or decrease (Fig. 6). The B term corresponds to a diffusion “constant” and determines the width of the distribution; it is determined by the amount of weight change and the noise. Although without the noise term, the distribution would still have a finite width and a positive skew, the noise broadens the distribution and due to its multiplicative character, the noise also enhances the positive skew of the distribution.
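The diffusion picture can be illustrated by simulating the underlying jump process directly. The following is a minimal sketch, assuming the square-window rule of this appendix: at each presynaptic event a synapse is depressed by −c_{d}w + vw with probability p_{d}, potentiated by c_{p} + vw with probability p_{p} = p_{d}(1 + w/W_{tot}), and left unchanged otherwise. The Gaussian form of v, the fixed value of W_{tot}, the function name, and all parameter values are assumptions made for illustration; this is not the simulation code behind the figures.

```python
import numpy as np

def simulate_weights(c_p, c_d, sigma, W_tot, p_d,
                     n_syn=2000, n_events=5000, seed=0):
    """Random-walk simulation of weights under the square-window plasticity rule."""
    rng = np.random.default_rng(seed)
    # Start every synapse at the balance point where the mean drift vanishes.
    w = np.full(n_syn, c_p / (c_d - c_p / W_tot))
    for _ in range(n_events):
        u = rng.random(n_syn)
        noise = sigma * rng.standard_normal(n_syn) * w   # multiplicative term v*w
        p_p = p_d * (1.0 + w / W_tot)                    # assumed form of p_p
        depress = u < p_d
        potentiate = (u >= p_d) & (u < p_d + p_p)
        dw = np.where(depress, -c_d * w + noise, 0.0)
        dw = np.where(potentiate, c_p + noise, dw)
        w = np.maximum(w + dw, 0.0)                      # weights stay non-negative
    return w

# Illustrative parameters only.
weights = simulate_weights(c_p=1.0, c_d=0.003, sigma=0.015, W_tot=2.0e4, p_d=0.2)
mean = weights.mean()
skew = np.mean((weights - mean) ** 3) / weights.std() ** 3
print(mean, weights.std(), skew)
```

Starting all synapses at the balance point, the drift term keeps the mean in place while the diffusion term spreads the weights into a unimodal, positively skewed distribution.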

### Application to other models

Other models of spike timing-dependent plasticity have not included the size dependence of the plasticity rules (Amarasingham and Levy, 1998; Kempter et al., 1999; Song et al., 2000). These models can also be analyzed with our method, and we show that they yield a dramatically different weight distribution. Following the notation of Song et al. (2000), we have, again neglecting the exponential timing dependence,
Equation 15
Equation 16
where A_{+} is the amount of potentiation, and A_{−} is the amount of depression. This plasticity scheme requires slightly more depression than potentiation, that is, A_{+} = (1 − ε)A_{−}, where ε is a small, positive number. For small ε and large W_{tot}, one has for this model A(w) = p_{d}(w/W_{tot} − ε)A_{−} and B(w) = 2p_{d}A_{−}^{2}. There is no steady-state weight distribution unless a hard limit on the maximal weight, w_{max}, is explicitly imposed. The resulting distribution is:
Equation 17
where N normalizes the distribution. This distribution matches the distributions observed in simulations (Song et al., 2000). It is plotted in Figure 3C with parameters w_{max} = 1, W_{tot} = 11, ε = 0.05, and A_{−} = 0.005. It is seen that some synaptic weights cluster around zero weight. The reason is that for weak synapses the drive A(w) is negative, pushing them toward smaller and smaller synaptic conductances. If one chooses w_{max} > εW_{tot}, the distribution is bimodal and has a second peak at the maximal weight. In that case A(w) becomes positive for large w, and weights for which A(w) is positive are pushed toward w_{max}. When input correlations are present, the distribution already becomes bimodal for lower values of w_{max}.

The parameters are usually chosen such that the distribution is bimodal. This requires a balance between the potentiation and depression ratio, ε, and the competition parameter, w/W_{tot}. The weight distribution in these models is sensitive to small perturbations in this balance. This is seen in Figure 3C: the weight distribution plotted there would be symmetric if W_{tot} were 10, but a 10% change (W_{tot} = 11) already causes a considerably asymmetric distribution. Because of this strong dependence, Equation 17 needs to be solved self-consistently: through W_{tot} the distribution depends on the weights of the inputs (Eq. 10), but the weights are in turn given by the distribution. Solving Equation 17 for a range of input rates shows that there is indeed a limited regime in which the postsynaptic firing frequency is almost independent of the input rate, as was seen in simulations (Song et al., 2000).
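The sensitivity to W_{tot} is easy to reproduce from the jump moments given above. With A(w) = p_{d}(w/W_{tot} − ε)A_{−} and B(w) = 2p_{d}A_{−}^{2}, the zero-current condition gives d ln P/dw = (w/W_{tot} − ε)/A_{−}, so that P(w) ∝ exp[(w^{2}/(2W_{tot}) − εw)/A_{−}] on 0 ≤ w ≤ w_{max}. This closed form is our reconstruction under those assumptions and should be read as a sketch of Equation 17, not a quotation of it. The function name is hypothetical; the parameter values are those quoted for Figure 3C.

```python
import numpy as np

def additive_stdp_distribution(w_max, W_tot, eps, A_minus, n=1001):
    """Steady-state distribution for weight-independent STDP with hard bounds."""
    w = np.linspace(0.0, w_max, n)
    log_p = (w**2 / (2.0 * W_tot) - eps * w) / A_minus
    p = np.exp(log_p - log_p.max())
    p /= p.sum() * (w[1] - w[0])   # normalize so that the integral is 1
    return w, p

# Parameters quoted in the text for Figure 3C, and the symmetric case W_tot = 10.
w, p_11 = additive_stdp_distribution(w_max=1.0, W_tot=11.0, eps=0.05, A_minus=0.005)
_, p_10 = additive_stdp_distribution(w_max=1.0, W_tot=10.0, eps=0.05, A_minus=0.005)

print("minimum at w =", w[np.argmin(p_11)])            # close to eps * W_tot
print("P(w_max)/P(0), W_tot = 11:", p_11[-1] / p_11[0])
print("P(w_max)/P(0), W_tot = 10:", p_10[-1] / p_10[0])
```

In this sketch the two peaks at w = 0 and w = w_{max} have equal height for W_{tot} = 10, whereas for W_{tot} = 11 the peak at zero is clearly the larger one.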

The dependence of A(w) on the weight is precisely the opposite of that in our model, where A(w) decreases with increasing weight and thereby stabilizes the weight distribution (Fig. 6). The stability of different learning rules can be analyzed with our method. If the potentiation depends on the weight as strongly as or more strongly than in our model, the weights will be stable. If the potentiation depends much more weakly on the weight than assumed here, the stability depends on parameters such as the threshold weight and the postsynaptic firing frequency. The behavior of any model is determined by how A(w) crosses zero, with positive or negative slope. In general we can distinguish two classes of learning rules. Assume that B(w) ≠ 0 for any w and that A(w) crosses zero only once: (1) if A(w) crosses zero with negative slope, the distribution is centered around the zero crossing, as in our model; (2) if A(w) crosses zero with positive slope, the weights are repelled from the zero crossing, as happens in the model above. This dichotomy does not seem to leave room for a stable model that interpolates between the two classes.
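The two classes can be told apart mechanically from the sign of the slope of A(w) at its zero crossing. The sketch below is a hypothetical helper, not part of the original analysis. It uses the drift of the weight-independent model as given above and, for the weight-dependent model, the assumed form A(w) ∝ c_{p}(1 + w/W_{tot}) − c_{d}w. The positive prefactor p_{d} is dropped, and the parameter values are illustrative.

```python
import numpy as np

def classify_drift(drift, w):
    """Locate the single zero crossing of drift(w) and classify its stability."""
    a = drift(w)
    positive = a > 0
    crossings = np.nonzero(positive[:-1] != positive[1:])[0]
    if crossings.size != 1:
        raise ValueError("expected exactly one zero crossing on the grid")
    i = crossings[0]
    # Linear interpolation between the two grid points around the crossing.
    w_zero = w[i] - a[i] * (w[i + 1] - w[i]) / (a[i + 1] - a[i])
    label = "stable" if a[i + 1] < a[i] else "unstable"
    return w_zero, label

# Weight-dependent rule (assumed drift, illustrative parameters).
c_p, c_d, W_tot = 1.0, 0.003, 2.0e4
ours = lambda w: c_p * (1.0 + w / W_tot) - c_d * w

# Weight-independent rule with the parameters quoted for Figure 3C.
eps, W_tot_add, A_minus = 0.05, 11.0, 0.005
additive = lambda w: (w / W_tot_add - eps) * A_minus

print(classify_drift(ours, np.linspace(0.0, 2000.0, 4001)))
print(classify_drift(additive, np.linspace(0.0, 1.0, 1001)))
```

A negative slope at the crossing marks a restoring drift (class 1), and a positive slope marks a repelling one (class 2).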

*Note added in proof.* After completion of this study, we found that a similar approach to calculate the weight distribution has been followed by Rubin et al. (2000).

## Footnotes

This work was supported by National Institutes of Health Grants R01 NS 36853 (G.G.T.), K02 NS01893 (G.G.T.), and National Research Service Award NS 10967 (G.B.). M.v.R. was supported by the Sloan foundation. G.G.T. is a Sloan Foundation Fellow. We gratefully acknowledge discussions with Larry Abbott, Sacha Nelson, and Sen Song, and G.B. gratefully acknowledges discussions with Mu-ming Poo.

Correspondence should be addressed to Mark C. W. van Rossum, Department of Biology, MS 008, Brandeis University, 415 South Street, Waltham, MA 02454-9110. E-mail: vrossum{at}brandeis.edu.