Abstract
In the current paper it is proposed that short-term plasticity and dynamic changes in the balance of excitatory–inhibitory interactions may underlie the decoding of temporal information, that is, the generation of temporally selective neurons. Our initial approach was to simulate excitatory–inhibitory disynaptic circuits. Such circuits were composed of a single excitatory and inhibitory neuron and incorporated short-term plasticity of EPSPs and IPSPs and slow IPSPs. We first showed that it is possible to tune cells to respond selectively to different intervals by changing the synaptic weights of different synapses in parallel. In other words, temporal tuning can rely on long-term changes in synaptic strength and does not require changes in the time constants of the temporal properties. When the units studied in disynaptic circuits were incorporated into a larger single-layer network, the units exhibited a broad range of temporal selectivity ranging from no interval tuning to interval-selective tuning. The variability in temporal tuning relied on the variability of synaptic strengths. The network as a whole contained a robust population code for a wide range of intervals. Importantly, the same network was able to discriminate simple temporal sequences. These results argue that neural circuits are intrinsically able to process temporal information on the time scale of tens to hundreds of milliseconds and that specialized mechanisms, such as delay lines or oscillators, may not be necessary.
- interval
- short-term plasticity
- paired-pulse facilitation
- paired-pulse depression
- timing
- temporal processing
Our perception of the world is based on the spatiotemporal patterns of neuronal activity produced at sensory layers. By decoding these patterns the brain determines what we see and hear. It is useful to distinguish between the spatial and temporal content of stimuli because fundamentally different mechanisms may underlie each form of processing. Spatial information refers to stimuli defined by the location of active sensory afferents. For instance, vertical and horizontal bars of light activate different retinal ganglion cells arranged in specific spatial patterns. Similarly, 1 and 4 kHz tones activate spatially distinct populations of cochlear hair cells. In both cases there is a place code at the earliest sensory stages. In its simplest form, the generation of neurons that respond selectively to spatial stimuli is a wiring problem: a neuron that responds to a vertical bar or a 1 kHz tone must directly or indirectly receive functional inputs from the appropriate sensory neurons in the retina or cochlea. Temporal information refers to stimuli defined by the temporal structure of active sensory neurons. If a bar of light is present for 50 or 100 msec, in both cases the same groups of retinal ganglion cells are active. Similarly, if two brief 1 kHz tones are separated by 50 or 100 msec the same population of hair cells will be active. Thus, for neurons to respond selectively to a 50 or 100 msec stimulus, an additional process must occur: at some stage a temporal to spatial transformation must transpire.
The above examples emphasize the need to decode information that directly reflects the temporal characteristics of external stimuli, or what can be considered extrinsic temporal information. In addition, a similar problem is posed by the presence of intrinsically generated temporal codes. Theoretical and experimental data suggest that temporal information is also present in the responses to both static and steady-state stimuli (Richmond et al., 1990; McClurkin et al., 1991; Middlebrooks et al., 1994; Mechler et al., 1998; Prut et al., 1998; Buonomano and Merzenich, 1999). For example, Richmond et al. (1990) have shown that, when the temporal structure of neuronal responses to static Walsh patterns is taken into account, there is more information about the stimuli than there is in the firing rate alone. More recently Mechler et al. (1998) have shown that there is considerable information relating to the contrast of transient stimuli in the temporal pattern of V1 neuron firing. If the brain uses these temporal codes, a critical issue is how they are decoded by the nervous system. Decoding intrinsically generated temporal codes poses the same problem as that of extrinsic temporal information.
The time scale of information processing by the nervous system ranges over many orders of magnitude: from a few microseconds, to milliseconds, to many seconds and above. Here we focus on the millisecond time scale. It is within this time range that much sensory processing, including interval discrimination (Wright et al., 1997) and speech perception (Tallal, 1994; Shannon et al., 1995), occurs, and in which some intrinsic temporal codes are hypothesized to operate (Mechler et al., 1998). Furthermore, experimental data have shown that some sensory neurons respond selectively to temporal features of stimuli on the time scale of tens to hundreds of milliseconds, including call-sensitive neurons in monkeys (Rauschecker et al., 1995; Wang et al., 1995), interval and duration-sensitive neurons (Riquimaroux, 1994; He et al., 1997), song-sensitive neurons in birds (Margoliash, 1983; Doupe and Konishi, 1991; Lewicki and Arthur, 1996), call-sensitive neurons in frogs (Alder and Rose, 1998), and word-selective neurons in humans (Creutzfeldt et al., 1989).
To date, little is known about the neural mechanisms underlying temporal selectivity in the millisecond range (see Discussion; for review, see Ivry, 1996; Gibbon et al., 1997). Various mechanisms have been proposed to account for the sensory side of temporal processing, including internal clocks (Creelman, 1962; Treisman, 1963), delay lines (Braitenberg, 1967; Tank and Hopfield, 1987), and oscillators (Miall, 1989; Ahissar et al., 1997). We have previously proposed that time-dependent neuronal properties may underlie temporal processing (Buonomano and Merzenich, 1995). Using a large multilayer network we showed that the network was able to discriminate among a wide range of temporal stimuli.
The goal of the current paper was to perform a computational analysis of simple circuits that incorporate experimentally defined time-dependent properties and to understand the minimal requirements necessary for temporal processing. Three time-dependent properties in particular were examined because they are experimentally well defined and likely to be critical in shaping the postsynaptic responses to time-varying stimuli: (1) paired-pulse facilitation (PPF) of monosynaptic EPSPs (Zucker, 1989; Zalutsky and Nicoll, 1990; Manabe et al., 1993; Stratford et al., 1996; Reyes and Sakmann, 1998); (2) paired-pulse depression (PPD) of fast IPSPs (Deisz and Prince, 1989; Davies et al., 1990; Nathan and Lambert, 1991; Fukuda et al., 1993; Lambert and Wilson, 1993; Buonomano and Merzenich, 1998); and (3) slow IPSPs (Newberry and Nicoll, 1984; Hablitz and Thalmann, 1987; Olpe et al., 1994). First, using simple disynaptic circuits we showed that interval tuning can arise from changes in synaptic strength, in the absence of changes in any time constants. This observation suggests that long-term synaptic plasticity could underlie the formation of not only spatial but also temporal receptive fields. The analysis of a larger single-layer network revealed that, in a randomly connected network, the distribution of temporal responses of the individual units is sufficiently broad to form a robust population code for a wide range of temporal intervals and sequences.
MATERIALS AND METHODS
Simulation of disynaptic circuits. All simulations were performed with NEURON (Hines and Moore, 1997) running on an SGI Octane workstation. Each unit was simulated as a single-compartment Hodgkin–Huxley unit. The Hodgkin–Huxley equations and parameters used are shown in Figure 1. Parameters were based on those used by Traub et al. (1992). In addition to the Na+, K+, and leak currents, a “noise” current was also present.
Fast EPSPs and fast IPSPs
Fast EPSPs and fast IPSPs were simulated using “kinetic synapse” equations (Destexhe et al., 1993; Golomb et al., 1994). Synaptic transmission occurs during a brief pulse of a fixed duration, where t_on and t_off represent the onset and offset of the pulse (t_off = t_on + 1 msec). During a pulse, receptor activation R(t), which is proportional to synaptic conductance, follows Equation 1; after a pulse, R(t) is governed by Equation 2, with the remaining terms defined in Equations 3 and 4. As shown in Figure 1, α, which contributes to the rising phase of the PSP, was set to 0.5 for both the excitatory and inhibitory synapses. β, which contributes to the decay phase of the PSP, was 0.25 and 0.167 for excitatory and inhibitory synapses, respectively.
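The kinetics described above can be illustrated with a minimal sketch. The equations themselves are not reproduced in this text, so the sketch assumes the first-order kinetic form commonly associated with Destexhe et al., dR/dt = α·T·(1 − R) − β·R, with the transmitter term T equal to 1 during the 1 msec pulse and 0 otherwise. The units of α and β (taken here as msec⁻¹), the Euler integration step, and the function name `simulate_receptor_activation` are assumptions made for illustration, not details confirmed by the source.

```python
def simulate_receptor_activation(pulse_onsets_ms, t_end_ms,
                                 alpha=0.5, beta=0.25,
                                 pulse_ms=1.0, dt_ms=0.01):
    """Euler-integrate dR/dt = alpha*T*(1 - R) - beta*R (assumed kinetic form).

    T is 1 while a transmitter pulse is on (t_on <= t < t_on + pulse_ms)
    and 0 otherwise. Returns R sampled every dt_ms, starting at t = 0.
    """
    n_steps = int(round(t_end_ms / dt_ms))
    r = 0.0
    trace = []
    for k in range(n_steps + 1):
        t = k * dt_ms
        trace.append(r)  # record R at time t before advancing one step
        pulse_on = any(on <= t < on + pulse_ms for on in pulse_onsets_ms)
        transmitter = 1.0 if pulse_on else 0.0
        r += dt_ms * (alpha * transmitter * (1.0 - r) - beta * r)
    return trace


# Excitatory-synapse values quoted in the text: alpha = 0.5, beta = 0.25.
excitatory_trace = simulate_receptor_activation([10.0], 50.0)
peak_time_ms = excitatory_trace.index(max(excitatory_trace)) * 0.01
```

Under these assumptions R rises only while the pulse is on, peaks at t_off, and then decays at a rate set by β, which matches the stated roles of α (rising phase) and β (decay phase).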
Three time-dependent properties were incorporated into the simulations: paired-pulse facilitation of EPSPs, paired-pulse depression of IPSPs, and slow IPSPs. Each of these is described below and shown in Figure 2.
Slow IPSPs
The slow IPSP was simulated using previously described equations (Golomb et al., 1994). The same equations used for fast synaptic transmission were used with the addition of a second component, G(t), representing G-protein activation (Equation 5), where G∞ is a sigmoid function of R (Equation 6) and R is as described in Equations 1–4. For the slow IPSP it is G(t), not R(t), that is proportional to the synaptic conductance.
PPD of the fast IPSP
PPD of fast IPSPs was simulated by modulating the amount of transmitter released using the same time course as the GABAB conductance. The degree of paired-pulse depression, PPD(t), was a function of G (Equation 7). PPD(t) modulates the strength of both the fast and slow IPSPs.
PPF of excitatory synapses
PPF was simulated using an α function, reinitiated at each spike occurrence (Buonomano and Merzenich, 1995; Equation 8), where t_i^spike represents the occurrence of the last spike in unit i. In the large network, simulations were also run with facilitation implemented using a more realistic model described by Markram et al. (1998), with similar results.
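As an illustration of an α-function facilitation profile that restarts at each spike, the sketch below uses the textbook form (Δt/τ)·exp(1 − Δt/τ), which peaks at Δt = τ. Equation 8 is not reproduced in this text, so this exact form is an assumption. The value τ = 50 msec is chosen only because the Results state that PPF peaks at 50 msec; the peak amplitude `max_facilitation` and the function name `ppf_gain` are hypothetical.

```python
import math


def ppf_gain(ms_since_last_spike, tau_ms=50.0, max_facilitation=0.5):
    """Multiplicative EPSP gain as an alpha function of time since the last spike.

    The gain is 1.0 immediately after a spike, peaks at 1 + max_facilitation
    when ms_since_last_spike == tau_ms, and relaxes back toward 1.0 afterwards.
    """
    if ms_since_last_spike < 0:
        raise ValueError("time since the last spike must be non-negative")
    x = ms_since_last_spike / tau_ms
    return 1.0 + max_facilitation * x * math.exp(1.0 - x)


# Gain seen by a second pulse arriving at several interpulse intervals.
gains = {interval: ppf_gain(interval) for interval in (25, 50, 100, 200, 400)}
```

Because the function depends only on the time since the most recent spike, each new spike resets the profile, which is one simple reading of "reinitiated at each spike occurrence".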
Synaptic delays were on average 1 and 2 msec for EPSPs onto inhibitory (Inh) and excitatory (Ex) units, respectively (Thomson et al., 1988; Markram et al., 1997). These delays were meant to capture time delays produced by axon and dendritic conduction times and synaptic delays.
Simulation of large networks. For simulations of a large network, the same units used above were incorporated into a single-layer network composed of 400 Ex units and 100 Inh units. A 4:1 ratio was used because it is the observed ratio of excitatory to inhibitory neurons in neocortical areas (Beaulieu et al., 1992). It is generally reported that a pyramidal neuron receives ∼4000 synapses, and the probability of a connection between local pyramidal cells is 2–8% (Thomson et al., 1988; Mason et al., 1991; Thomson and West, 1993). To simulate the absolute number of synapses and the correct probability would require 40,000–80,000 Ex units. We chose to simulate the correct probability (in part because of computational efficiency). We assumed that the probability of connectivity between cell types was ∼5% (resulting in a small number of synapses on each unit). The connectivity was also constrained by experimental data showing that ∼20% of the synapses onto a neuron are GABAergic (Beaulieu et al., 1992). Table 1 shows the synaptic convergence onto each unit and the average synaptic strength assigned to each synapse. The network was driven by two input pulses separated by a given interval. Each input pulse consisted of a burst of three spikes at 300 Hz.
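A minimal sketch of the random wiring scheme described above, assuming each possible synapse is included independently with probability 0.05 and that weights are drawn from a normal distribution. The weight mean and standard deviation, the clipping of negative weights to zero, the exclusion of self-connections, and the helper name `build_connectivity` are all hypothetical; the actual values are those of Table 1, which is not reproduced here.

```python
import random

N_EX, N_INH = 400, 100   # population sizes given in the text
P_CONNECT = 0.05         # approximate connection probability given in the text


def build_connectivity(n_pre, n_post, p, mean_w, sd_w, rng, same_population=False):
    """Return a list indexed by postsynaptic unit of {presynaptic index: weight}."""
    connections = []
    for post in range(n_post):
        inputs = {}
        for pre in range(n_pre):
            if same_population and pre == post:
                continue  # skip self-connections (an assumption of this sketch)
            if rng.random() < p:
                # Hypothetical weight statistics; negative draws are clipped to 0.
                inputs[pre] = max(0.0, rng.gauss(mean_w, sd_w))
        connections.append(inputs)
    return connections


rng = random.Random(0)
ex_to_ex = build_connectivity(N_EX, N_EX, P_CONNECT, 1.0, 0.2, rng, same_population=True)
ex_to_inh = build_connectivity(N_EX, N_INH, P_CONNECT, 1.0, 0.2, rng)

# Average number of Ex inputs converging on each Ex unit (about 400 * 0.05).
mean_convergence = sum(len(inputs) for inputs in ex_to_ex) / N_EX

# Presynaptic pool needed for ~4000 synapses per unit at the same probability.
units_for_4000_synapses = 4000 / P_CONNECT
```

With 400 Ex units and a 5% probability, each unit receives on the order of 20 Ex synapses, which illustrates the trade-off stated above: keeping the connection probability realistic in a small network means giving up the realistic absolute synapse count.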
Recognition network. To determine whether the population response of the large network contained sufficient information to permit discrimination of different stimuli, a layer of output units was used in conjunction with a supervised learning rule. The number of output units corresponded to the number of stimuli presented to the network, and each unit received inputs from all the Ex units. Synaptic strengths were adjusted using a supervised rule [backpropagation with no hidden units; Rumelhart and McClelland (1986)]. The strength of the connection from Ex unit i to output unit j was governed by Equation 9, where δ is the error value for the output unit j (0 or 1). Note that supervised learning rules are generally not used for continuous time models; thus, for training, the input to the network was the number of spikes from each unit in response to the whole stimulus or each pulse (N). By collapsing time we were able to train the recognition network using conventional algorithms. However, after training, the synaptic weights for the output units could be used to observe the network behavior in a continuous time manner (see Fig. 9). We emphasize that this discrimination network is used as a technique to analyze the information content of the network and is not necessarily meant to be a physiological representation of a read-out.
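Backpropagation without hidden units reduces to a delta rule on a single layer of weights. The sketch below shows one such rule under stated assumptions: a logistic output unit, an error term δ_j = target_j − output_j, and an update Δw_ij = η·δ_j·N_i, where N_i is the spike count of Ex unit i. Equation 9 is not reproduced in this text, so this form, the learning rate, the epoch count, and the toy spike counts are illustrative rather than the authors' values.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_readout(spike_counts, labels, n_outputs, lr=0.1, epochs=2000):
    """Delta-rule training of output weights on time-collapsed spike counts."""
    n_inputs = len(spike_counts[0])
    weights = [[0.0] * n_inputs for _ in range(n_outputs)]
    for _ in range(epochs):
        for counts, label in zip(spike_counts, labels):
            for j in range(n_outputs):
                y = sigmoid(sum(w * n for w, n in zip(weights[j], counts)))
                delta = (1.0 if j == label else 0.0) - y  # error of output unit j
                for i, n in enumerate(counts):
                    weights[j][i] += lr * delta * n
    return weights


def classify(weights, counts):
    """Index of the output unit with the largest summed weighted input."""
    drive = [sum(w * n for w, n in zip(row, counts)) for row in weights]
    return drive.index(max(drive))


# Toy data: spike counts of four hypothetical Ex units for three stimuli.
toy_counts = [[4, 0, 0, 1], [0, 4, 1, 0], [1, 0, 4, 0]]
toy_labels = [0, 1, 2]
toy_weights = train_readout(toy_counts, toy_labels, n_outputs=3)
predictions = [classify(toy_weights, counts) for counts in toy_counts]
```

Training on spike counts rather than spike times is what "collapsing time" amounts to here: each stimulus presentation becomes one fixed-length vector, so a conventional supervised rule can be applied.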
RESULTS
Our first goal was to understand how the synaptic strengths of multiple synapses in a disynaptic circuit composed of one Ex and one Inh unit shape the responses to simple temporal stimuli. We examined if orchestrated changes in synaptic weights at multiple loci can be used to generate temporally selective responses.
Analysis of order selectivity
We first examined the simplest form of temporal selectivity: the response preference of the Ex unit to the first or second of a pair of pulses presented 100 msec apart. Figure 3A shows a schematic of the disynaptic circuit with five different synapses (Input → Ex; Input → Inh; Inh_fast → Ex; Inh_slow → Ex; Inh_slow → Inh). In these simulations the Inh_fast → Inh strength was set to zero, because the fast IPSP decays before the occurrence of the second pulse. Simulation traces in red show an example of the response to paired-pulse stimulation at 100 msec, in which the Ex unit responds only to the second pulse: the first pulse generates a subthreshold EPSP, whereas the second input is suprathreshold because of PPF of the EPSP and PPD of the IPSP. By increasing two synaptic strengths (Input → Ex and Inh_slow → Ex) the suprathreshold response of the Ex unit changes from the second pulse to the first. As a result of the increase in the Input → Ex strength the first pulse is now suprathreshold. The second now generates a subthreshold EPSP despite the PPF and PPD, because of the increased strength of the slow IPSPs.
These simulations provide a straightforward and intuitive example of how a simple disynaptic circuit can exhibit two modes of order selectivity depending on the synaptic strength of two synapses. To understand the transition between different modes and to determine the robustness of each mode, a parametric analysis of “synapse space” was performed. In each of the subplots of Figure 3B the order sensitivity of the circuit was examined while varying the strengths of the Input → Ex and GABAB → Ex connections over a range of 25 different values. Each subplot reflects a different strength of the GABAA → Ex connection. In these simulations, noise was present in both units (rms of 1.4 and 1 mV in the Ex and Inh units, respectively). As a result of the noise, the behavior of the units varied from trial to trial, allowing the calculation of the response probability to the first and second pulse. The intensity of green and red is proportional to the probability of firing in response to first and second pulses, respectively. Cells that have a high probability of firing to both pulses are thus represented in yellow. The transition between first-pulse-selective and second-pulse-selective modes occurs at the red–green transition in each subplot. Transitions occur when the GABAB strength (vertical axis) becomes strong enough to prevent the second potentially suprathreshold EPSP from reaching threshold and when the Ex strength (horizontal axis) for the first pulse becomes suprathreshold. The transition point is not fixed, but is a function of the strength of the fast IPSP. As the strength of the fast IPSP increases the transition point shifts to the right. This occurs because even though the fast IPSP must flow through two synapses, it can still “cut off” the fast EPSP before it produces a suprathreshold response (see below). Thus, as GABAA increases in strength, there is also an increase in the EPSP strength necessary to generate a suprathreshold response to the first pulse.
Simulation of interval selectivity
We were next interested in determining whether the Ex unit in the same disynaptic circuit can exhibit interval selectivity depending on the synaptic weights of different synapses. Figure 4A shows traces from the excitatory and inhibitory units for three different sets of synaptic strengths. Surprisingly, parallel changes in the strength of the Input → Ex and Input → Inh connections produce Ex units that respond selectively to either 50, 100, or 200 msec intervals. Even though the time constants of all properties are unchanging, interval selectivity can occur as a result of the interplay between Ex and Inh unit activity. With relatively weak inputs to both the excitatory and inhibitory units (Fig. 4A, red traces), the first pulse generates a suprathreshold and subthreshold response in the Inh and Ex units, respectively. At 50 msec the second pulse is suprathreshold in the Ex unit (although it is riding a slow IPSP elicited by the first spike in the Inh unit), because of PPF, which peaks at 50 msec. The second pulse, at any interval, did not generate a fast IPSP, because the GABAB-mediated slow IPSP prevented the Inh unit from firing. If the strength of both inputs is increased (green traces), the Ex unit fires exclusively to the 100 msec pulse. It no longer fires to the 50 msec pulse because, as a result of the increased input, the Inh unit fires in response to the second pulse at 50 msec. Because of the faster flow of activity through the inhibitory part of the circuit, the fast IPSP can cut off the EPSP in the Ex unit, preventing it from firing. If we continue to increase the strength of both inputs (blue traces), through a similar mechanism the Ex unit fires exclusively to the 200 msec interpulse interval (IPI).
Note that the faster flow of activity through the inhibitory branch is observed experimentally and is likely attributable to (1) the faster membrane time constant of inhibitory neurons (Brown et al., 1981; McCormick et al., 1985; Lacaille et al., 1987); (2) the apparently lower threshold of inhibitory neurons compared with excitatory neurons; and (3) the tendency of inhibitory synapses to connect closer to the cell soma than excitatory synapses (Beaulieu et al., 1992).
Figure 4B represents a parametric analysis of the interval selectivity described above in synapse space. The strengths of the Input → Ex and Input → Inh connections were varied over a range of weights. The results are represented as a red–green–blue (RGB) plot, which permits visualization of the selectivity to the three intervals while varying two dimensions. Red, green, and dark blue represent regions in which the Ex unit fires exclusively to 50, 100, and 200 msec IPI, respectively (note that interval selectivity implies that the Ex unit responds only to the second pulse of a given interval). Responses to combinations of intervals are represented by the appropriate secondary colors; for example, yellow represents regions in which the Ex unit responds to both 50 and 100 msec intervals (see Fig. 4, legend). The plot illustrates that by varying two synaptic strengths, it is possible to generate selective responses to either the 50, 100, or 200 msec intervals or combinations of these intervals, and shows that these regions are fairly robust, operating over a significant range of synaptic strength. Furthermore, cells can respond to specific combinations of intervals (light blue, yellow, and white). We have also examined the difference between increases in the Input → Inh weights and the Inh → Ex weights. Both will tend to increase the degree of inhibition in the Ex unit. Is one more or less effective in controlling interval selectivity? As shown in Figure 4B, within a limited range interval tuning was approximately linearly related to Input → Ex and Input → Inh strength. In contrast, whereas parallel changes in the Inh → Ex and Input → Ex strengths also resulted in selective responses to each interval, selectivity occurred in a much smaller region of synapse space and was a more complex function of synaptic strength (data not shown).
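The color coding of the RGB plot can be sketched as follows, assuming a binary response (fires or does not fire on the second pulse) at each of the three intervals and standard additive color mixing. Reading "light blue" as the green plus blue combination (100 and 200 msec) is an inference from additive mixing, not something the text states, and the function and dictionary names are hypothetical.

```python
def selectivity_color(fires_50, fires_100, fires_200):
    """Map responses at 50, 100 and 200 msec to an (R, G, B) triple of 0/1 values."""
    return (int(bool(fires_50)), int(bool(fires_100)), int(bool(fires_200)))


COLOR_NAMES = {
    (0, 0, 0): "black (no response)",
    (1, 0, 0): "red (50 msec only)",
    (0, 1, 0): "green (100 msec only)",
    (0, 0, 1): "dark blue (200 msec only)",
    (1, 1, 0): "yellow (50 and 100 msec)",
    (0, 1, 1): "light blue (100 and 200 msec)",
    (1, 0, 1): "magenta (50 and 200 msec)",
    (1, 1, 1): "white (all three intervals)",
}

# One point in synapse space: the Ex unit responds at 50 and 100 msec.
example = COLOR_NAMES[selectivity_color(True, True, False)]
```

Using one color channel per interval lets a single two-dimensional image show which of the eight possible response combinations occurs at each pair of synaptic strengths.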
Interval discrimination in large networks
The above results show that simple disynaptic circuits can exhibit interval selectivity. However, this selectivity required fine tuning of multiple synaptic weights. It seems unlikely that there are learning rules that would allow the appropriate combinations of weights to emerge in a self-organizing manner. We next examined whether a large network with randomly assigned synaptic weights is able to discriminate a range of intervals. A network with 400 Ex and 100 Inh units was simulated; connectivity between units was randomly assigned with a uniform distribution (synapses between any two units are equiprobable). The weights of each synapse type were assigned from a normal distribution (see Materials and Methods). Figure 5 shows the raster plot of a sample of Ex and Inh units in response to five intervals. Although some units exhibited selective responses to a particular interval, the majority were either interval-sensitive (responded maximally to two or more intervals) or nonselective. Does the population of Ex units as a whole contain sufficient information to discriminate among a range of intervals? Note that if a population code is present, the network may discriminate among intervals even though no single unit is exclusively selective to each interval. To determine both the ability of the network to discriminate intervals and how it generalizes, we used an independent recognition network. The recognition network was composed of five output units and 400 input units (each representing the number of spikes in an Ex unit in response to the presentation of a given interval). The network was trained on 12 presentations of the five target intervals (50, 100, 150, 200, and 250 msec) and tested on another series of 12 simulations, covering 12 different intervals (25–300 msec at 25 msec increments) to analyze generalization.
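The training and test sets and the paired-pulse input can be written out explicitly. The sketch below assumes that the interval is measured from the onset of the first burst to the onset of the second; the text does not say how it is measured. The burst parameters (three spikes at 300 Hz) come from Materials and Methods, and the function names are hypothetical.

```python
TRAIN_INTERVALS_MS = [50, 100, 150, 200, 250]
TEST_INTERVALS_MS = list(range(25, 301, 25))  # 25-300 msec in 25 msec steps


def burst(onset_ms, n_spikes=3, rate_hz=300.0):
    """Spike times (msec) of one input pulse: a short burst at a fixed rate."""
    inter_spike_ms = 1000.0 / rate_hz
    return [onset_ms + k * inter_spike_ms for k in range(n_spikes)]


def paired_pulse_stimulus(interval_ms):
    """Two bursts whose onsets are separated by interval_ms (onset-to-onset assumed)."""
    return burst(0.0) + burst(float(interval_ms))


stimuli = {interval: paired_pulse_stimulus(interval) for interval in TEST_INTERVALS_MS}
untrained = [i for i in TEST_INTERVALS_MS if i not in TRAIN_INTERVALS_MS]
```

Every trained interval also appears in the test set, so the seven remaining test intervals are the ones that probe generalization.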
Figure 6A shows the response of the output units to the test stimuli. The results show that the population of Ex units as a whole codes well for a wide range of intervals. This population code can be easily read out by the set of five output neurons trained with a supervised learning rule. The output units responded well to their target intervals and not to the remaining trained intervals. Importantly, each output unit generalized in a Gaussian fashion to the untrained intervals. That is, the output unit trained at 150 msec responded maximally to the 150 msec interval of the test set, responded submaximally to 125 and 175 msec intervals, and did not respond at all to 75 and 225 msec intervals.
The results shown in Figure 6A were obtained in the absence of noise. Figure 6, B and C, shows the results of simulations in the presence of noise in all the Ex and Inh units. The rms of the resting membrane potential of Ex units was 1.4 and 4 mV in Figure 6, B and C, respectively. An rms of 1.4 mV had little overall effect on interval discrimination, whereas an rms of 4 mV produced a significant decrease in performance, particularly for the intermediate intervals. Other sources of noise, such as probability of release, were not examined. The effects of “synaptic noise” will be dependent on assumptions about the probability of release (p_r), the number of “release sites”, and the number of synapses. However, within a range, different sources of noise are likely to have similar effects because they are all ultimately expressed in the variability of the membrane potential from trial to trial.
Structure of the population code
The results shown above establish that the Ex units form a population code, which can be used by output units to discriminate intervals. The fine interval tuning of the output units could be attributable to either broad or fine tuning of the Ex units driving the output units. Figure 7 shows the synaptic weights of the Ex units onto the output units (Fig. 7A) and the corresponding interval tuning of the Ex units (Fig. 7B). As shown by comparing panels A and B, the output units use input from a large population of Ex cells with different tuning characteristics. At short intervals there is a significant number of interval-selective Ex units that drive the appropriate output unit. Fewer Ex units are selective for longer intervals, thus the 250 msec output unit is driven by excitatory input from a range of broadly tuned Ex units and inhibited by Ex units tuned for shorter intervals. Interestingly, despite the inputs to the output units consisting of a mixture of selective to nonselective cells, the tuning curves exhibit smooth generalization.
What accounts for the diversity of the temporal selectivity of the Ex units given that the time constants of the short-term plasticity and slow IPSPs are the same for all synapses? For the small disynaptic circuits it was shown that temporal selectivity was a function of the synaptic weights (Fig. 4B). In the large network it was the variability in the synaptic strengths that allowed for variations in the temporal tuning of the Ex units. If all the synaptic weights are assigned using a variance of zero, no interval discrimination occurs, because there is no symmetry breaking (data not shown). In other words, all Ex units will essentially exhibit the same temporal selectivity and behave much like the disynaptic circuit. Thus, the model is in many ways stochastic: it relies not on a specific set of synaptic strengths but on a distribution of different synaptic strengths that will result in a distribution of different types of temporal tuning.
It should be noted that because of the complexity of the large network, additional factors not present in the simple disynaptic circuit further enrich the temporal selectivity of the Ex units. First, in the simplified circuit with only one synapse of each type, the synaptic strength defined the “functional synaptic strength.” In a large network, the effective strength of each synapse class is determined not only by the weight of a synapse, but also by a complex interaction that depends on which synapses are active and when. Second, lateral connectivity in the form of Ex → Ex and Ex → Inh synapses, absent in the disynaptic circuits, was present in the large network, further enhancing the complexity and variability in the temporal selectivity of the Ex units.
Dependency of interval discrimination on different temporal properties
The simulations above indicate that neural networks that incorporate short-term forms of plasticity and slow IPSPs can generate a population code for a spectrum of intervals. What the simulations have not addressed is the relative contribution of the different properties. Analyzing the performance of the network by simply removing these properties can generate confounding results, because the overall level of activation can change dramatically. We thus chose to “flatten” the profile of short-term plasticity and the slow IPSPs. Under these conditions, PPF, PPD, and the slow IPSP are still present and thus, there are not dramatic changes in the overall activity of the network. However, rather than changing through time, and thus continuously altering the state of the network, these properties did not change 20 msec after their initial onset. Thus, PPF was present, but the degree of facilitation was the same whether the interval was 50 or 400 msec. These manipulations allowed us to directly determine the relative contribution of different time-dependent properties and confirm that it is the continuous change in short-term plasticity and slow IPSPs that underlies the ability of the network to discriminate temporal intervals.
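The "flattening" manipulation can be sketched as holding a time-dependent profile at the value it reaches 20 msec after onset, which is one reading of the description above. The α-function profile, its 50 msec time constant, and the function names are assumptions for illustration; only the 20 msec cutoff comes from the text.

```python
import math

FLATTEN_AFTER_MS = 20.0  # cutoff stated in the text


def alpha_profile(t_ms, tau_ms=50.0):
    """Hypothetical time-varying profile (peaks at tau_ms, then decays)."""
    x = t_ms / tau_ms
    return x * math.exp(1.0 - x)


def flattened(profile, t_ms, hold_ms=FLATTEN_AFTER_MS):
    """Follow the profile up to hold_ms after onset, then keep that value."""
    return profile(min(t_ms, hold_ms))


control = [alpha_profile(t) for t in (50, 100, 400)]
flat = [flattened(alpha_profile, t) for t in (50, 100, 400)]
```

In the control case the value differs across intervals, so the network state carries information about elapsed time. In the flattened case the property is still present with nonzero strength, but it is identical at 50, 100, and 400 msec and therefore cannot distinguish them.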
These simulations also examined a broader range of intervals: the target intervals were 50, 100, 200, 300, and 400 msec. The test intervals ranged from 25 to 450 msec at 25 msec steps. Figure 8 shows the response of the output units under four different conditions: control (A); flat GABAB-dependent properties (PPD of IPSPs and slow IPSPs) (B); flat PPF of EPSPs (C); and no PPF and no GABAB-dependent properties (D). As expected, each form of short-term plasticity contributed differentially to interval discrimination. In the absence of changing PPD and slow IPSPs, the network discriminated intervals up to 200 msec almost as well as the control condition. However, discrimination of longer intervals was not possible. Thus the time-dependent changes in network state produced by PPF alone were sufficient to affect the population response for short but not long intervals. Interestingly, flattening PPF still allowed a reasonable degree of interval-selective responses but tended to result in the emergence of bimodal responses centered around 150 msec. Note that the output unit trained at 100 msec could easily discriminate between 100 and 200 msec, but not as well between 100 and 250 msec (Fig. 8C). This behavior occurs because the magnitudes of the GABAB-dependent properties are similar at 100 and 250 msec during their rising and decaying phases, respectively. In other words, there is some symmetry around the peak: the state of the network is similar during the rising and decaying phases of the GABAB-dependent events. Figure 8D shows that in the absence of time-dependent properties, interval discrimination is essentially abolished. The membrane and synaptic time constants influence the state of the network at intervals up to 50 msec, allowing some discrimination between a 50 msec interval and longer intervals. However, note that the response to 25 msec was stronger than that to 50 msec, even though the latter was the target interval.
We also examined interval discrimination after removing PPF of the Input → Ex synapses or PPD of the inhibitory synapses or the slow IPSPs. Under each of these conditions the network still performed well (data not shown), but was not as robust in the presence of noise, nor were the peaks of the output responses as high.
Discrimination of simple sequences
We next examined the ability of the network to discriminate simple sequences. Sequence discrimination is an important test if a model is to be a general mechanism for temporal processing, because it requires sensitivity to higher-order temporal features. The network was presented with four stimuli defined by their interpulse intervals: 50–150, 100–100, and 150–50. Note that the first and third sequences contain the same intervals, but in a different order. Each stimulus was presented to the same network used above (with all the same parameters) 24 times. Activity patterns from 12 presentations were used for training the Output units, whereas 12 were used for testing. The recognition network was trained on the number of spikes of each Ex unit generated by the last pulse. Note, that in some sense a priori knowledge was used by telling the network which pulse was the last (however, training the total activity across all pulses generated similar results). Figure9A shows the average response of the three output units. The output units were able to discriminate among the three different sequences. No Ex units were strongly selective to any of the three stimuli (data not shown). Thus, the output units relied on the stimulus sensitivity of a large population of Ex units. To understand what population of Ex units are active in response to each pulse, we can plot the activity of each output unit during each sequence (Fig. 9B). The activity plotted is simply the activity of all Ex units multiplied by the weight of the Ex → Output connection. Thus, the magnitude of the response to the different pulses reflects the overlap between the Ex units driving the maximal response (last pulse of the target sequence). Note that some of the responses have an early excitatory peak followed by inhibition. In general the early responses (short-latency spikes) carry less information about the sequence. 
This is in part because the early responses (generally driven by functionally stronger connections) are less sensitive to the time-dependent changes in the excitatory–inhibitory balances. This may suggest that early responses carry spatial information, whereas late responses tend to carry more temporal information.
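The train/test procedure for the output units can be illustrated with a toy sketch. Everything numerical here is an assumption made for illustration: the Ex spike counts are synthetic (drawn at random, not produced by the network simulation), the number of Ex units and the noise level are hypothetical, and the readout is a simple nearest-class-mean linear rule standing in for whatever training rule the output units actually used. Only the three sequences and the 12/12 split come from the text.

```python
import random

N_EX = 40                                   # hypothetical number of Ex units
SEQUENCES = ["50-150", "100-100", "150-50"]  # stimuli named in the text
N_TRAIN, N_TEST = 12, 12                    # split described in the text

rng = random.Random(0)

# Synthetic stand-in for the network: each sequence evokes a characteristic
# mean spike count per Ex unit on the last pulse (invented, not simulated).
templates = {s: [rng.uniform(0.0, 3.0) for _ in range(N_EX)] for s in SEQUENCES}

def present(seq):
    """One noisy presentation: spike counts of all Ex units on the last pulse."""
    return [max(0.0, m + rng.gauss(0.0, 0.3)) for m in templates[seq]]

train = {s: [present(s) for _ in range(N_TRAIN)] for s in SEQUENCES}
test = {s: [present(s) for _ in range(N_TEST)] for s in SEQUENCES}

# One output unit per sequence. Its Ex -> Output weights are the mean training
# pattern; the bias makes the largest output pick the nearest class mean.
weights, bias = {}, {}
for s in SEQUENCES:
    w = [sum(col) / N_TRAIN for col in zip(*train[s])]
    weights[s] = w
    bias[s] = -0.5 * sum(v * v for v in w)

def output(unit, counts):
    """Ex activity multiplied by the Ex -> Output weights, plus the bias."""
    return sum(w * x for w, x in zip(weights[unit], counts)) + bias[unit]

def classify(counts):
    """Report the sequence whose output unit responds most strongly."""
    return max(SEQUENCES, key=lambda s: output(s, counts))

correct = sum(classify(x) == s for s in SEQUENCES for x in test[s])
accuracy = correct / (N_TEST * len(SEQUENCES))
print(f"test accuracy on held-out presentations: {accuracy:.2f}")
```

In this sketch no single synthetic unit needs to be selective for a sequence; the readout works from the pattern across all units, which is the sense in which the output units rely on a population code.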
Unlike most models based on delay lines or specific time constants, in this model sequence discrimination is a natural extension of interval discrimination. Interval discrimination is ultimately possible because of differences in the state of the network at the arrival of the first and second pulse. Because at no point is the network “reset”, changes are cumulative. Consider stimulus 2 of Figure 9A (100–100). The intervals between the first and second pulses and between the second and third pulses are the same; nevertheless, the second and third pulses still arrive in different network states. For example, the slow IPSP (onto both Ex and Inh units) produced by the second pulse will still sum with the slow IPSP from the first pulse. As the number of pulses increases a steady state should be reached, and the differences in the population response will eventually be too small to allow discrimination. We have not yet examined at what point this occurs, in part because relatively little psychophysical data are available on the interaction between sequence size and discrimination. Furthermore, it is clear that performance is highly dependent on the size and number of layers of the network.
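The cumulative-state argument can be made concrete with a deliberately reduced sketch. It assumes that the slow inhibition left behind by each pulse can be summarized as a single trace that is incremented by every pulse and decays exponentially; the time constant `tau_ms` and the unit increment are hypothetical values chosen for illustration, not parameters of the model.

```python
import math

def state_at_pulses(intervals_ms, tau_ms=200.0, increment=1.0):
    """Residual slow-inhibition trace that each pulse encounters on arrival.

    tau_ms and increment are illustrative assumptions, not fitted values.
    """
    trace = 0.0
    states = [trace]                      # first pulse meets a resting network
    for gap in intervals_ms:
        trace += increment                # the pulse that just arrived adds to the trace
        trace *= math.exp(-gap / tau_ms)  # decay over the interval, with no reset
        states.append(trace)
    return states

# 100-100 sequence: equal intervals, yet pulses 2 and 3 meet different states.
states = state_at_pulses([100, 100])
print([round(s, 3) for s in states])

# With many pulses the trace approaches a steady state, so successive pulses
# arrive in increasingly similar states.
long_run = state_at_pulses([100] * 20)
steps = [b - a for a, b in zip(long_run, long_run[1:])]
print(round(steps[0], 3), round(steps[-1], 6))
```

Under these assumptions the trace seen by the third pulse is larger than the one seen by the second, even though both follow a 100 msec interval, and the step between successive states shrinks geometrically as pulses are added.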
DISCUSSION
The results described here show that a large network of interconnected Ex and Inh units can perform both interval and sequence discrimination. This ability relies on the presence of time-dependent properties (short-term plasticity and slow IPSPs) and variability in the assigned synaptic weights. The model is stochastic in that a random distribution of synaptic strengths is sufficient to generate a range of temporal response characteristics for each unit. Together these units can establish a population code that allows discrimination over a wide range of intervals. By studying small disynaptic circuits, we were able to show how temporal tuning can be determined by synaptic strength. Specifically, the interaction between synaptic strength and time-dependent properties will shape the response characteristics of both the Ex and Inh units. The temporal tuning of the Ex unit is further controlled by the fast inhibition generated by the Inh unit tuning. Together these mechanisms can generate a range of different temporal filters in the Ex unit, ranging from selective to nonselective (Fig. 4B).
Interval versus sequence discrimination
Various models have been proposed to account for interval-selective neuronal responses. One of the first models of interval detection was the delay line model based on axonal conduction delays. This model accounts for the detection of interaural delays in the range of tens to hundreds of microseconds used for sound localization (Jeffress, 1948; Carr, 1993). However, despite early proposals that parallel fibers in the cerebellum may function as delay lines (Braitenberg, 1967), there are no experimental data supporting axonal delays in the millisecond range. Nonetheless, many of the more recent models follow similar principles, in that they are labeled line models. Selectivity generally relies on establishing a range of different time constants for some time-dependent mechanism. These could include neurons oscillating at different frequencies (Miall, 1989), a range of biochemical time constants (Fiala et al., 1996), or IPSPs of different durations (Sullivan, 1982; Olsen and Suga, 1991). There are experimental data supporting the latter mechanism in subcortical areas used for detecting pulse-echo intervals in the bat (Sullivan, 1982; Olsen and Suga, 1991; Saitoh and Suga, 1995). This mechanism is well suited to solve the temporal requirements for echolocation, which is a relatively specialized problem, in that the timing is always determined by two events (pulse and echo), separated by a few milliseconds.
It is fundamental to determine whether labeled line models generalize to more complex temporal patterns that are common in auditory stimuli such as speech and animal vocalizations. Generally speaking, most models based on a range of different time constants do not inherently account for discrimination of simple sequences. Consider interval or duration detection based on the duration of IPSPs (Sullivan, 1982; Olsen and Suga, 1991; Casseday et al., 1994; Saitoh and Suga, 1995). In such models the first event triggers a rapid excitatory potential and a slower inhibitory potential followed by an excitatory rebound (Fig. 10A). The excitatory potential by itself is not capable of eliciting a suprathreshold response, but when a second event generates an excitatory potential that adds with the offset of inhibition (excitatory rebound), a suprathreshold response occurs. Thus, the duration of the IPSP determines the preferred interval of the neuron, and by having a range of IPSP durations it is possible to cover a spectrum of different intervals. Such a system is not well suited to discriminate simple sequences such as those shown in Figure 9. Consider two sequences, 100–200 and 200–100: both will activate the 100 and 200 msec detectors, although in a different order (Fig. 10B). Thus, sequence discrimination would require a second step involving order discrimination, itself a type of temporal discrimination (the addition of a 300 msec detector can solve this problem). More important is the issue of biological implementation. The activation of both the 100 and 200 msec detectors assumes a “reset” mechanism. If the 200 msec interval detector receives a pulse after 100 msec, it must reset so that it can respond to the subsequent 200 msec interval. Such a reset mechanism is not physiologically plausible if it relies on IPSPs; thus, in reality it is unlikely that the second interval will activate the appropriate detector.
Although modifications can be made to this model to overcome these problems, it seems likely that such a system may have evolved specifically for the detection of intervals and durations under specific conditions, rather than the discrimination of arbitrary temporal patterns.
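The order ambiguity of a labeled line scheme can be shown with an idealized sketch. It assumes a bank of two detectors tuned to 100 and 200 msec and, generously, a perfect reset after every pulse (the very assumption questioned above); the function name and the exact-match tuning are hypothetical simplifications.

```python
def detectors_fired(intervals_ms, tuned_ms=(100, 200)):
    """Idealized labeled-line bank with perfect reset.

    Each interpulse interval activates the detector tuned to exactly that
    interval; the result lists the detectors in the order they fired.
    """
    return [gap for gap in intervals_ms if gap in tuned_ms]

a = detectors_fired([100, 200])
b = detectors_fired([200, 100])

same_detectors = set(a) == set(b)   # identical sets of active detectors
same_order = a == b                 # only the firing order differs
print(a, b, same_detectors, same_order)
```

Even in this best case, the two sequences are distinguishable only by the order in which the detectors fire, so a downstream stage would still have to solve a temporal (order) problem.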
For sequence discrimination the model described here relied on a population code. More so than for interval discrimination, population codes for sequences are desirable given the large number of potential sequences. In the large network the Ex units implemented a temporal to spatial transformation and represented a given temporal pattern in a population code, which in principle can be used downstream (in our case by the output units) like any other population code. To implement the temporal to spatial transformation, the network relies on state-dependent changes that result from its time-dependent properties, a mechanism that extends well from intervals to simple sequences. This is because each pulse induces cumulative changes in the state of the network, and thus in the population response; each pulse establishes a “context.” The disadvantage of this model is that it will have difficulty identifying specific intervals embedded in sequences. If the network is trained to identify a 100 msec interval, and then the 100 msec interval is inserted within a larger sequence (or simply preceded by another pulse), the network may not identify it. In contrast, some labeled line models will detect a 100 msec interval placed anywhere in a sequence but will not capture the overall pattern. Thus, a psychophysical prediction from the current model is that interval discrimination should be more impaired by the presence of a distractor (a stimulus that precedes the target stimulus) than a nontemporal task such as frequency or intensity discrimination.
Short-term synaptic plasticity
In the current model two forms of short-term synaptic plasticity were simulated: PPF of EPSPs (onto both excitatory and inhibitory units) and PPD of IPSPs. PPF of excitatory synapses is not seen in all synapses, but is dependent on various factors including the presynaptic and postsynaptic cell types and developmental stage. Short-term facilitation is generally observed in Ex → Inh connections (Thomson et al., 1993; Markram et al., 1998; Reyes et al., 1998). Both PPF and PPD are observed in Ex → Ex connections. In the hippocampus, short-term facilitation is seen both in the mossy fiber to CA3 synapses (Zalutzky and Nicoll, 1990; Salin et al., 1996) and the Schaffer collateral to CA1 synapses (Creager et al., 1980; Manabe et al., 1993; Buonomano and Merzenich, 1996). In neocortical synapses, both PPF (Ramoa and Sur, 1996; Stratford et al., 1996; Reyes and Sakmann, 1998) and PPD (Thomson and Deuchars, 1994; Markram et al., 1996; Stratford et al., 1996) are observed, although depression is more common. Stratford et al. (1996) have shown that different Ex → Ex synapses vary as to the type of short-term plasticity observed. Specifically, in the visual cortex, thalamocortical to L–IV synapses exhibit PPD; the L–VI → L–IV projection exhibits PPF, and L–IV → L–IV synapses exhibit little paired-pulse plasticity. Gil et al. (1997) also report paired-pulse plasticity differences between different synapses in the rat somatosensory cortex. Reyes and Sakmann (1998) have reported that synapses between L–II/III pyramidal neurons exhibit PPD early in development and PPF later in development. Additionally, the dependency of short-term plasticity on the synapse type suggests that it has multiple functional roles.
Indeed, in addition to the role of short-term forms of plasticity in temporal processing suggested here and previously (Buonomano and Merzenich, 1995; Buonomano et al., 1997), it has also been suggested that short-term plasticity may provide a mechanism for “on-line” modulation in certain types of behaviors (Fisher et al., 1997). Others have suggested that short-term depression between excitatory cortical neurons may play a role in gain control, by amplifying transient changes in firing rates (Abbott et al., 1997) and maintaining the stability of cortical circuits by keeping positive feedback in check (Galarreta and Hestrin, 1998).
The presence of facilitating excitatory synapses is an important component of the model described here. However, in the large network in the absence of facilitation onto Ex units, interval discrimination was still observed. Even in the presence of depressing EPSPs it is ultimately the net balance between short-term plasticity of EPSPs on Ex and Inh units and of IPSPs that will determine the ability of the network to process temporal information. Furthermore, in cortical areas where depression predominates, there seems to be a significant number of facilitating (low probability of release) synapses, because activity-dependent antagonists reveal that a subpopulation of synapses exhibits PPF (Gil et al., 1999).
Centralized versus distributed temporal processing
A fundamental question regarding the mechanisms underlying temporal processing on the millisecond time scale is whether timing is performed by some specialized central time-keeping system or distributed throughout different brain regions. The most common view of a centralized mechanism is the internal clock hypothesis (Creelman, 1962; Treisman, 1963). In such models a temporal problem in the visual or auditory modality, or even a timed motor behavior, would access the same “internal clock.” Studies of patients with cerebellar (Ivry and Keele, 1989), parietal cortex (Harrington et al., 1998a), and basal ganglia lesions (Harrington et al., 1998b) have all reported deficits in temporal processing, often in both sensory and motor tasks. These studies are generally interpreted to favor centralized timing mechanisms. Additionally, very specific effects on the timing of conditioned motor responses in rabbits have been reported to result from lesions to the cerebellar cortex (Perrett et al., 1993). Psychophysical studies of interval discrimination have provided some support for centralized mechanisms by showing cross-channel or cross-modality generalization of interval learning (Wright et al., 1997; Nagarajan et al., 1998). However, these studies were not designed to selectively engage channel-specific learning.
In contrast to centralized models, distributed models argue that temporal information is processed on an “as needed” basis, occurring in auditory, visual, association, or motor areas depending on the task. The mechanisms underlying temporal processing in either distributed or centralized systems could include delay lines (Braitenberg, 1967; Tank and Hopfield, 1987), oscillators (Miall, 1989; Ahissar et al., 1997), network dynamics (Buonomano and Mauk, 1994; Mauk and Donegan, 1997), or short-term synaptic plasticity (Buonomano and Merzenich, 1995). Given the pervasiveness of temporal information in external stimuli and the generality of the time-dependent mechanisms studied in the current paper, we favor distributed models of temporal processing on the scale of tens to hundreds of milliseconds.
Conclusions
In the current paper, it is suggested that networks of neurons are intrinsically capable of decoding temporal information as a result of time-dependent changes in network state produced by short-term forms of plasticity. Specifically, short-term plasticity and other time-dependent properties change the dynamic balance between excitation and inhibition in local circuits, producing neuronal response characteristics that are dependent on previous activity and thus on temporal stimulus history. The hypothesis presented predicts that manipulations that eliminate short-term forms of plasticity will produce deficits in temporal processing. The deficits should be specific to the time scale of the neuronal and synaptic mechanisms being manipulated.
Footnotes
This work was supported by Office of Naval Research Grant N00014-96-1-0206 and the Alfred P. Sloan Foundation. I thank Michael Merzenich for helpful discussions and advice and Allison Doupe, Randy Gallistel, Peter Latham, Uma Karmarkar, and Felix Schweizer for reading earlier versions of this manuscript.
Correspondence should be addressed to Dean V. Buonomano, Department of Neurobiology and Psychology, University of California–Los Angeles, Box 951763, Los Angeles, CA 90095. E-mail: dbuono@ucla.edu.