Abstract
Signal duration is important for identifying sound sources and determining signal meaning. Duration-tuned neurons (DTNs) respond preferentially to a range of stimulus durations and maximally to a best duration (BD). Duration-tuned neurons are found in the auditory midbrain of many vertebrates, although studied most extensively in bats. Studies of DTNs across vertebrates have identified cells with BDs and temporal response bandwidths that mirror the range of species-specific vocalizations. Neural tuning to stimulus duration appears to be universal among hearing vertebrates. Herein, we test the hypothesis that neural mechanisms underlying duration selectivity may be similar across vertebrates. We instantiated theoretical mechanisms of duration tuning in computational models to systematically explore the roles of excitatory and inhibitory receptor strengths, input latencies, and membrane time constant on duration tuning response profiles. We demonstrate that models of duration tuning with similar neural circuitry can be tuned with species-specific parameters to reproduce the responses of in vivo DTNs from the auditory midbrain. To relate and validate model output to in vivo responses, we collected electrophysiological data from the inferior colliculus of the awake big brown bat, Eptesicus fuscus, and present similar in vivo data from the published literature on DTNs in rats, mice, and frogs. Our results support the hypothesis that neural mechanisms of duration tuning may be shared across vertebrates despite species-specific differences in duration selectivity. Finally, we discuss how the underlying mechanisms of duration selectivity relate to other auditory feature detectors arising from the interaction of neural excitation and inhibition.
Introduction
The wide variation of sounds relevant to different species has led to specialized neural circuits for analyzing and interpreting species-specific acoustic signals. Neural specializations for representing spectral (frequency) and temporal (timing) components of sound are found from the auditory periphery to the auditory cortex. Mammals first determine sound frequencies in the cochlea where mechanical properties of the basilar membrane produce a spatial (tonotopic) map of frequency that is preserved throughout the auditory brainstem, midbrain, thalamus, and cortex (Brugge, 1992). Neurons within the central auditory pathway are often specialized for species-specific acoustic processing. For example, New World mustached bats (Pteronotus parnellii) emit multiharmonic echolocation calls composed of a long-duration, constant-frequency tone near 60 kHz followed by a short-duration, downward frequency-modulated (FM) sweep (Pollak and Bodenhamer, 1981). Mustached bats adjust the constant-frequency component of their vocalizations to compensate for flight-induced Doppler shifts in received echoes (Jen and Kamada, 1982). Doppler shift compensation behavior is facilitated by having an auditory fovea with mechanical and physiological specializations of the cochlea and an overrepresentation of narrowly tuned neurons in the peripheral (Suga et al., 1975; Kössl and Vater, 1985) and central auditory system (Suga and Jen, 1976; Pollak and Bodenhamer, 1981).
Temporal components of acoustic signals are extracted and represented throughout the central auditory pathway. Signal onset and offset are encoded in the periphery by the onset and offset of cochlear afferent spike trains. Higher order temporal features, such as gap and signal duration, are analyzed by upstream auditory nuclei. In particular, cells tuned to stimulus duration, known as duration-tuned neurons (DTNs), are found first in the auditory midbrain of amphibians (Narins and Capranica, 1980) and mammals (Jen and Schlegel, 1982; Casseday et al., 1994; Chen, 1998; Fuzessery and Hall, 1999; Brand et al., 2000; Pérez-González et al., 2006; Wang et al., 2006). Auditory DTNs exist across vertebrate species and taxa and vary widely in their preferred range of signal durations (for review, see Sayegh et al., 2011). The existence of visual DTNs in the mammalian cortex suggests that neural mechanisms of duration selectivity may be shared across vertebrates and sensory modalities (Duysens et al., 1996).
To address this hypothesis and explore the flexibility of neural mechanisms proposed to produce duration selectivity, we developed computational models of duration-tuned neural circuits. To validate our models and demonstrate how responses of DTNs in different vertebrates are reproduced by similar synaptic mechanisms tuned with species-specific model parameters, we analyzed in vivo electrophysiology data from the central nucleus of the inferior colliculus (ICc) of the big brown bat (Eptesicus fuscus) and present similar data from the published literature on rats (Pérez-González et al., 2006), mice (Brand et al., 2000), and frogs (Leary et al., 2008). Our results support the hypothesis that neural tuning for stimulus duration results from a complex interaction of excitation and inhibition, and that the proposed mechanisms are flexible enough to account for duration selectivity across vertebrates. We then discuss how duration tuning fits into a larger class of stimulus-evoked auditory responses attributable to complex temporal interactions of neural excitation and inhibition.
Materials and Methods
Computational modeling
We modeled DTNs with a single-compartment model in NEURON 7.2 (Carnevale and Hines, 2006) via the Python scripting interface (Hines et al., 2009) using simulation time steps of 0.05 ms. The model cell was a spherical neuron with a diameter of 13 μm that contained glutamate-activated AMPA and NMDA receptor-mediated depolarizing currents and GABA-activated GABAA receptor-mediated hyperpolarizing currents. Receptor kinetics were based on the simplified versions of postsynaptic currents from the study by Destexhe et al. (1998). Briefly, presynaptic spikes triggered a 1 ms release of 1 mM neurotransmitter that activated postsynaptic receptor currents with kinetics determined by the equations in Table 1. The time of a spike was defined as the time step in which the membrane potential of the model neuron crossed 0 mV. The rates of neurotransmitter binding (α) and unbinding (β) determined the rise and decay kinetics of each postsynaptic receptor conductance (gAMPA, gNMDA, gGABAA). Fitted parameter values for α and β were previously determined from whole-cell current recordings (Destexhe et al., 1998). NMDA receptors exhibited a voltage-dependent Mg2+ block characterized by the function B(V) defined in Table 1 (Jahr and Stevens, 1990). The membrane of the model cell also contained passive channels that conduct leak current (Ileak), and channels for fast Hodgkin–Huxley type sodium (Na) (INa) and potassium (K) (IK) currents based on the kinetics described by Traub and Miles (1991) and implemented by Destexhe et al. (1996) (Table 2). Voltage dynamics of the model cell membrane potential (
Excitatory inputs to the model DTN.
Presynaptic spikes that activated glutamatergic AMPA and NMDA receptors on the model DTN were generated by two single-compartment neurons: one providing excitation timed relative to stimulus onset (onset-evoked) and the other providing excitation timed relative to stimulus offset (offset-evoked). Neurons with transient, onset-evoked responses are found throughout the central auditory pathway, including the cochlear nucleus (Pfeiffer, 1966; Haplea et al., 1994) and (medial) superior olive (Guinan et al., 1972; Grothe et al., 1997, 2001). Offset responding neurons have also been observed in the (medial) superior olivary complex (Guinan et al., 1972; Grothe et al., 1997, 2001), where dense excitatory projections lead into the ICc (Oliver et al., 1995). GABAergic offset responding neurons in the superior paraolivary complex are not candidates for providing offset-evoked excitatory input to the inferior colliculus (IC) but may still contribute to duration tuning by suppressing excitatory offset responses at some durations (Kadner et al., 2006). Another possible source of offset-evoked excitation comes from within the IC or even a DTN itself via postinhibitory rebound following sustained inhibition (Casseday et al., 1994). Our model is not dependent on and does not presume a particular anatomical location for these inputs.
Presynaptic neurons were modeled with fast spiking kinetics (Table 3) such that a 1 ms, 0.1 nA injected current pulse produced exactly one spike in the neuron. Presynaptic neurons activated postsynaptic receptors on the model DTN. In our initial (default) model, the onset-responding presynaptic neuron was activated 10 ms after stimulus onset and the offset-responding presynaptic neuron was activated 6 ms after stimulus offset (see Fig. 3E). We also explored the effect of input latency on duration tuning by running simulations with a range of onset-evoked excitation latencies (see Results, Onset-evoked excitatory input latency).
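The relative timing of the two excitatory inputs in the default model can be illustrated with a short Python sketch. It uses only the activation latencies stated above (10 ms re stimulus onset and 6 ms re stimulus offset). The function names are hypothetical, and presynaptic spike generation and synaptic delays are ignored, so the values are activation times rather than exact arrival times at the DTN.

```python
ONSET_LATENCY_MS = 10.0   # default model: onset input activated 10 ms after stimulus onset
OFFSET_LATENCY_MS = 6.0   # default model: offset input activated 6 ms after stimulus offset


def activation_times(duration_ms):
    """Return (onset-evoked, offset-evoked) activation times in ms re stimulus onset."""
    return ONSET_LATENCY_MS, duration_ms + OFFSET_LATENCY_MS


def offset_minus_onset(duration_ms):
    """Signed gap in ms; negative when the offset-evoked input is activated first."""
    onset_t, offset_t = activation_times(duration_ms)
    return offset_t - onset_t


if __name__ == "__main__":
    for duration in range(1, 10):
        print(duration, activation_times(duration), offset_minus_onset(duration))
```

Under these assumptions the two inputs are activated simultaneously for a 4 ms stimulus; the offset-evoked input leads at shorter durations and lags at longer durations.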
Inhibitory inputs to the model DTN.
Previous work has demonstrated that neural inhibition is necessary for creating DTNs in the IC. Blocking receptors of inhibitory neurotransmitters via neuropharmacological agents abolishes or greatly diminishes duration-tuned responses (Fuzessery and Hall, 1999; Jen and Feng, 1999; Casseday et al., 2000; Jen and Wu, 2005; Yin et al., 2008). Intracellular (whole-cell patch-clamp) (Covey et al., 1996; Leary et al., 2008) and extracellular (single-unit) recordings (Faure et al., 2003) suggest that inhibitory inputs usually precede excitatory inputs to the DTN and persist for as long or longer than the duration of the stimulus. Potential sources of sustained inhibitory input to the IC are from the nuclei of the lateral lemniscus (Covey and Casseday, 1991; Nayagam et al., 2005) where both GABAergic and glycinergic neurons with sustained spiking exist (Vater et al., 1997). Furthermore, neurons in the columnar division of the ventral nucleus of the lateral lemniscus in the big brown bat produce transient, onset-evoked responses with a glycinergic input to the IC that could account for the presence of a strong, onset-evoked inhibition that mainly affects the early portion of a stimulus-evoked response (Covey and Casseday, 1991; Vater et al., 1997). Neuropharmacologically blocking either GABAA or glycine receptors produces a similar effect on the responses of in vivo DTNs (Casseday et al., 2000). Therefore, we chose to model all IPSPs with GABAA receptor kinetics to simplify both the model description and analysis (for a discussion of additional receptor types, see Discussion, Model limitations and future enhancements).
Inhibitory presynaptic inputs to the model DTN were generated by a population of 10 single-compartment neurons with fast spiking kinetics (Table 4) that activated GABAA receptors on the model DTN. Ten neurons were chosen so that the presynaptic input would approximate a continuous inhibitory conductance in the model DTN (Fig. 1A). The current injected to each presynaptic neuron was in the form of discrete 1 nA square pulse events. Each event lasted for the duration of a simulation time step (0.05 ms) and followed a Poisson distribution with a mean probability of 0.05 events per time step (i.e., on average, each presynaptic neuron received 1 nA of current for 0.05 ms per 20 simulation time steps). This resulted in a mean spiking rate of ∼250 Hz in each presynaptic input neuron. In the default model, the inhibitory presynaptic inputs had a latency of 9 ms (re stimulus onset) and their input lasted for the duration of the stimulus. This latency corresponds to that of the leading inhibition observed in the big brown bat (∼4–12 ms) (Covey et al., 1996; Faure et al., 2003). Due to the random timing of inhibitory presynaptic spikes, the precise time course of the inhibition acting on the DTN varied slightly for each stimulus presentation. Simulations were therefore repeated 20 times, and the mean ± SE results are reported. Dynamic model parameters were allowed to settle for 25 ms before stimulus presentation to ensure that each simulation began from a stable network state.
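The random drive to the inhibitory population can be sketched in plain Python as follows. This is a minimal illustration, not the NEURON implementation: it assumes that an independent draw with probability 0.05 on every 0.05 ms time step is an acceptable approximation of the Poisson process described above, it counts current events rather than presynaptic spikes, and all function names are hypothetical. The 200 ms window is an arbitrary choice for the example.

```python
import math
import random
import statistics

DT_MS = 0.05     # simulation time step (from the text)
P_EVENT = 0.05   # probability of a 1 nA current event per time step (from the text)
N_INPUTS = 10    # inhibitory presynaptic neurons (from the text)
N_TRIALS = 20    # repetitions per stimulus (from the text)


def event_train(n_steps, rng):
    """One Boolean per time step: True where a current event is injected."""
    return [rng.random() < P_EVENT for _ in range(n_steps)]


def population_event_rate_hz(duration_ms, rng):
    """Mean current-event rate (events/s) across the population for one trial."""
    n_steps = round(duration_ms / DT_MS)
    counts = [sum(event_train(n_steps, rng)) for _ in range(N_INPUTS)]
    return statistics.mean(counts) / (duration_ms / 1000.0)


def mean_and_se(values):
    """Mean and standard error of the mean across trials."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


rng = random.Random(1)  # fixed seed so the example is repeatable
trial_rates = [population_event_rate_hz(200.0, rng) for _ in range(N_TRIALS)]
mean_rate, se_rate = mean_and_se(trial_rates)
print(f"current events: {mean_rate:.0f} +/- {se_rate:.1f} per second")
```

The expected event rate is 0.05 events per 0.05 ms, or about 1000 events/s per neuron. The sketch does not model spike generation, so it does not reproduce the ∼250 Hz presynaptic spiking rate reported above.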
Default model response.
Figure 1 illustrates the simulated receptor conductances, inward currents, and membrane potential of the DTN in the default model. For comparison, we show responses of the model DTN for both a long (25 ms) duration stimulus that did not evoke spiking (Fig. 1, left column) and a short (5 ms) stimulus that reliably evoked one spike per trial (Fig. 1, right column). The conductance changes over time for the AMPA, NMDA, and GABAA receptors in the default model are shown in Figure 1, A and B. The two peaks observed in the AMPA and NMDA receptor conductances in response to the long duration stimulus correspond to the onset- and offset-evoked excitatory presynaptic inputs (see above, Excitatory inputs to the model DTN). These peaks are also present in response to the short-duration stimulus but are less evident because they temporally overlap, as evidenced by the twofold increase in AMPA-evoked conductance and by the even larger increase in NMDA conductance because of a significantly reduced Mg2+ block [B(V)] during the action potential of the DTN. Although the maximum NMDA receptor conductance was greater than the maximum AMPA receptor conductance (Table 5), the realized NMDA conductance, and thus NMDA-mediated current, was substantially smaller due to the Mg2+ block. Our maximum conductance values were chosen so that single, presynaptic, excitatory spikes evoked subthreshold depolarizations in the DTN when they were coincident with presynaptic inhibition, but they evoked suprathreshold depolarizations when they were temporally coincident with the other excitatory input (see Results, Coincidence detection mechanism). In the absence of inhibition, the excitatory input spikes were suprathreshold and produced spikes in the DTN (see Results, Onset-evoked excitatory input latency). 
The maximum conductance parameter values represent ∼4–10 synapses per input (Destexhe et al., 1998) and nicely reproduced the sharp, short-pass duration selectivity observed in the ICc of bats (Zhou and Jen, 2001; Fremouw et al., 2005). The GABAA receptor conductance (Table 5), and therefore inhibitory current, was noisy due to the random arrival times of the presynaptic inhibitory input spikes. Later, we explore the effect of varying AMPA, NMDA, and GABAA receptor conductances on duration selectivity (see Results, Maximum receptor conductance).
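The difference between maximum and realized NMDA conductance can be illustrated with a simplified two-state receptor scheme and a voltage-dependent Mg2+ block. The sketch below is an assumption-laden illustration, not a copy of Table 1: the rate constants are values commonly quoted for the simplified kinetics of Destexhe et al. (1998), the block function is the widely used form attributed to Jahr and Stevens (1990), the external Mg2+ concentration of 1 mM is assumed, and the function names are hypothetical. Table 1 remains the authoritative source for the model.

```python
import math

# Assumed rate constants (binding in 1/(mM*ms), unbinding in 1/ms); see Table 1 for model values.
AMPA_RATES = (1.1, 0.19)
NMDA_RATES = (0.072, 0.0066)


def mg_block(v_mv, mg_mm=1.0):
    """Fraction of NMDA conductance not blocked by Mg2+ at membrane potential v_mv."""
    return 1.0 / (1.0 + math.exp(-0.062 * v_mv) * mg_mm / 3.57)


def open_fraction(alpha, beta, t_end_ms, dt_ms=0.05, pulse_ms=1.0, conc_mm=1.0):
    """Forward-Euler solution of dr/dt = alpha*[T]*(1 - r) - beta*r for one transmitter pulse."""
    n_steps = round(t_end_ms / dt_ms)
    pulse_steps = round(pulse_ms / dt_ms)
    r, trace = 0.0, []
    for i in range(n_steps):
        transmitter = conc_mm if i < pulse_steps else 0.0
        r += dt_ms * (alpha * transmitter * (1.0 - r) - beta * r)
        trace.append(r)
    return trace


ampa = open_fraction(*AMPA_RATES, t_end_ms=10.0)
nmda = open_fraction(*NMDA_RATES, t_end_ms=10.0)
print("AMPA fraction remaining at 10 ms:", ampa[-1] / max(ampa))
print("NMDA fraction remaining at 10 ms:", nmda[-1] / max(nmda))
print("Mg2+ unblocked fraction at -65 mV and 0 mV:", mg_block(-65.0), mg_block(0.0))
```

With these assumed values, the NMDA open fraction decays far more slowly than the AMPA open fraction, and most of the NMDA conductance is blocked near rest and relieved during depolarization.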
The AMPA, NMDA, and GABAA receptor-mediated currents are shown in Figure 1, C and D. Inward current, responsible for depolarizing the membrane potential, is shown as negative (downward) values, and outward current, responsible for hyperpolarizing the membrane potential of the DTN, is shown as positive (upward) values. Because the NMDA receptor conductance decays slowly, the second (offset-evoked) excitatory input results in a facilitated conductance, and thus a larger receptor-mediated current, compared with the first (onset-evoked) excitatory input. When both excitatory inputs overlapped, the summated current produced an action potential ∼14 ms from stimulus onset (Fig. 1D). Notice also that the AMPA and NMDA currents reversed direction during the action potential (Fig. 1D, at 14 ms) when the membrane potential of the model DTN rose above the 0 mV reversal potential of the AMPA and NMDA receptors (Fig. 1F, at 14 ms).
Receptor currents are a complex function of channel activation kinetics, receptor activation kinetics, and passive model parameters. We measured the ratio of peak NMDA current to peak AMPA current in the default model for a single excitatory input spike to be 0.0732 (NMDA/AMPA; 5.00:68.28 pA). Ma et al. (2002b) found that late (NMDA) and early (AMPA) currents in the rat IC had an average ratio of 0.58 (late/early) with a range of ∼0.08–1.8. We predict that the NMDA-to-AMPA current ratio could be lower in bats, and thus more similar to our default model, due to the requirement for faster temporal processing. The resulting membrane potentials of the model DTN in response to both long- and short-duration stimuli are shown in Figure 1, E and F. When the excitatory inputs failed to overlap, the membrane potential depolarized slightly but remained subthreshold (Fig. 1E). When the excitatory inputs coincided, an action potential was produced (Fig. 1F).
Electrophysiology
Surgical procedures.
Electrophysiological recordings were obtained from the ICc of 21 awake big brown bats (E. fuscus) of either sex. Before recording, a stainless-steel post was affixed to the skull with cyanoacrylate superglue (Henkel Loctite Corporation) to ensure a fixed head position could be precisely replicated between sessions. Before the head-posting surgery, bats were given a subcutaneous injection of buprenorphine (0.03 ml; 0.045 mg/kg). Bats were then placed in a small (9.7 × 11.6 × 9.6 cm; length by width by height) anesthesia induction chamber where they inhaled a 1–5% isoflurane/oxygen mixture (1 L/min). Anesthetized bats were then placed in a foam-lined body restraint within a surgical stereotaxic alignment system with a custom mask for continuous gas inhalation (David Kopf Instruments; model 1900). The hair covering the skull was shaved, and the skin was disinfected with Betadine. Local anesthetic (0.2 ml of bupivacaine; 5 mg/ml) was injected subcutaneously before making a midline incision in the scalp. The temporal muscles were reflected to reveal the dorsal surface of the skull, which was then scraped clean and swabbed with 70% ethanol. After drying, the post was glued to the skull over the cortex using superglue hardened with liquid cyanoacrylate accelerator (Pacer Zip Kicker). A chlorided silver wire attached to the headpost was positioned under the right temporal muscle and served as the reference electrode.
Electrophysiological recordings.
Recordings began 1–3 d after surgery and continued for one to six sessions on separate days. Each session lasted 4–6 h and was terminated if the bat showed signs of discomfort. Before recording, bats were subcutaneously administered a neuroleptic (0.3 ml; 1:1 mixture of 0.025 mg/ml fentanyl citrate and 1.25 mg/ml droperidol; 9.6 mg/kg). Bats were then placed in a foam-lined body restraint that was suspended by springs within a small animal stereotaxic frame customized for bats (ASI Instruments). The entire apparatus was mounted atop an air vibration table (TMC Micro-G). The bat's head was immobilized by securing the headpost to a stainless-steel rod attached to a micromanipulator (ASI Instruments) mounted on the stereotaxic frame. A scalpel was used to cut a small hole in the skull and dura mater over the dorsal surface of the IC. Single-unit extracellular recordings were made with thin-wall borosilicate glass microelectrodes with a capillary filament (outer diameter, 1.2 mm; A-M Systems) filled with 3 M NaCl. Electrodes were made with a Flaming/Brown style micropipette puller (model P-97; Sutter Instrument). Typical electrode resistances ranged from 15 to 30 MΩ. Electrodes were visually positioned over the dorsal surface of the IC with manual manipulators (ASI Instruments) and advanced into the brain at 1 μm intervals with a stepping hydraulic micropositioner (Kopf model 2650). Action potentials were recorded with a neuroprobe amplifier (A-M Systems model 1600) whose 10× output was bandpass filtered and further amplified (500–1000×) by a Tucker-Davis Technologies spike preconditioner (TDT PC1; low-pass fc = 7 kHz; high-pass fc = 300 Hz). Spike times were logged to a computer by passing the PC1 output to a spike discriminator (TDT SD1) and an event timer (TDT ET1) synchronized to a timing generator (TDT TG6). Between recording sessions, the skull was covered with a piece of contact lens and Gelfoam coated in Polysporin.
Bats were individually housed in a temperature- and humidity-controlled room. All procedures were approved by the McMaster University Animal Research Ethics Board and were in accordance with the Canadian Council on Animal Care.
Stimulus generation and data collection.
Stimulus generation and on-line data collection were controlled with custom software that displayed spike times as dot rasters ordered by the acoustic parameter that was varied during stimulus presentation (Faure et al., 2003; Fremouw et al., 2005). Briefly, sound pulses were digitally generated with a two-channel array processor (TDT Apos II; 357 kHz sampling rate) optically interfaced to two digital-to-analog converters (TDT DA3-2) whose individual outputs were fed to a low-pass antialiasing filter (TDT FT6-2; fc = 120 kHz), two programmable attenuators (TDT PA5), and two signal mixers (TDT SM5) with equal weighting. The output of each mixer was fed to a manual attenuator (Leader LAT-45) before final amplification (Krohn-Hite model 7500). Stimuli were presented monaurally to the ear contralateral to the IC being recorded using a Brüel & Kjær 1/4 inch condenser microphone (type 4939; protective grid on) modified for use as a loudspeaker by a transmitting adaptor (B&K type UA-9020) to correct for nonlinearities in the transfer function (Frederiksen, 1977). The loudspeaker was positioned ∼1 mm in front of the external auditory meatus. The output of the speaker, recorded with a B&K type 4138 1/8 inch condenser microphone (90° incidence; grid off) connected to a measuring amplifier (B&K type 2606) and bandpass filter (K-H model 3500), was measured relative to a sound calibrator (B&K type 4231) and expressed in decibels sound pressure level (re 20 μPa) equivalent to the peak amplitude of continuous tones of the same frequency (Stapells et al., 1982). The loudspeaker transfer function was flat ±6 dB from 28 to 119 kHz, and there was at least 30 dB attenuation at the ear opposite the source (Ehrlich et al., 1997). All stimuli had rise/fall times of 0.4–0.5 ms shaped with a square cosine function and were presented at a rate of 3 Hz.
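For readers who wish to synthesize comparable stimuli, the following Python sketch generates a pure-tone pulse with squared-cosine onset and offset ramps at the 357 kHz sampling rate stated above. It is not the custom software used for the recordings. The function name is hypothetical, the 0.5 ms ramp is one value from the stated 0.4–0.5 ms range, and the sketch assumes that the ramps are included within the nominal stimulus duration, which the text does not specify.

```python
import math


def tone_pulse(freq_hz, duration_ms, ramp_ms=0.5, fs_hz=357_000):
    """Return samples of a tone pulse with raised-cosine (cos^2-shaped) rise and fall."""
    n = round(duration_ms * fs_hz / 1000.0)
    n_ramp = round(ramp_ms * fs_hz / 1000.0)
    samples = []
    for i in range(n):
        envelope = 1.0
        if i < n_ramp:
            # Rising ramp: 0 at the first sample, approaching 1 at the end of the ramp
            envelope = math.sin(0.5 * math.pi * i / n_ramp) ** 2
        elif i >= n - n_ramp:
            # Falling ramp: mirror image of the rising ramp, 0 at the last sample
            envelope = math.sin(0.5 * math.pi * (n - 1 - i) / n_ramp) ** 2
        samples.append(envelope * math.sin(2.0 * math.pi * freq_hz * i / fs_hz))
    return samples


pulse = tone_pulse(freq_hz=40_000, duration_ms=5.0)
print(len(pulse), "samples; peak amplitude", max(abs(s) for s in pulse))
```

The 40 kHz carrier and 5 ms duration in the usage line are arbitrary example values, not parameters taken from the recordings.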
Single units were found by searching with a paired stimulus consisting of a short-duration (3–4 ms) and long-duration (20–25 ms) pure tone of the same frequency separated by 110 ms. Upon isolation, the best excitatory frequency (BEF), duration selectivity profile, and acoustic threshold of a neuron were determined by presenting 10–20 repetitions of pure-tone stimuli randomly varied in frequency, duration, or amplitude, respectively. First, we defined BEF as the frequency (1 kHz resolution) that evoked the maximum spike count in the cell using a stimulus duration and amplitude that resulted in robust spiking. Next, the duration selectivity profile of the cell was measured by presenting a series of BEF tones (at the same amplitude as BEF testing) randomly varied in duration from 1 to 25 ms (1 ms resolution). Best duration (BD) was defined as the stimulus duration that evoked the maximum spike count in the cell (Casseday et al., 1994). Finally, acoustic threshold was determined by presenting a series of BEF, BD pulses that were randomly varied in amplitude (10 dB resolution) over a 70 dB range of attenuation.
Results
Characterizing duration-tuned neurons
Duration-tuned neurons can be classified into one of three response classes based on the temporal tuning profile of the cell when stimulated with variable duration pure tones at BEF. (1) Short-pass DTNs respond maximally to short-duration stimuli and eventually have a ≥50% reduction in spike count at durations longer than BD. (2) Bandpass DTNs respond maximally at BD and eventually have a ≥50% reduction in spike count at durations both shorter and longer than BD. (3) Long-pass DTNs differ from short-pass and bandpass DTNs by not having a BD; they respond only when the stimulus exceeds a minimum duration. Moreover, unlike typical sensory neurons that integrate stimulus energy, the minimum duration for evoking a response in a long-pass DTN does not decrease with increasing stimulus amplitude/energy (Faure et al., 2003; Sayegh et al., 2011). Some long-pass DTNs respond with fewer spikes at a longer first-spike latency (FSL) to increasing stimulus amplitude/energy (Faure et al., 2003), a phenomenon characterized as “paradoxical” latency shift (Covey et al., 1996; Galazyuk and Feng, 2001; Hechavarría et al., 2011). It is paradoxical because most sensory neurons produce more spikes and shorter FSLs with increasing stimulus amplitude/energy (Kiang, 1965). To summarize, the responses of DTNs are temporally selective and do not reflect the simple integration of stimulus energy.
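The classification criteria above can be expressed as a short Python sketch. It is a simplified illustration under stated assumptions: the function name and return labels are hypothetical, durations are assumed to be sorted in ascending order, ties for the maximum spike count are resolved in favor of the shortest duration, and the amplitude-independence criterion for long-pass DTNs cannot be checked from a single duration series, so such cells are only labeled as candidates.

```python
def classify_duration_tuning(durations_ms, spike_counts):
    """Return (response class, best duration or None) from mean spike counts per duration."""
    peak = max(spike_counts)
    if peak == 0:
        return "unresponsive", None
    bd_index = spike_counts.index(peak)          # shortest duration with the maximum count
    half = 0.5 * peak                            # the >=50% reduction criterion
    drop_shorter = any(c <= half for c in spike_counts[:bd_index])
    drop_longer = any(c <= half for c in spike_counts[bd_index + 1:])
    if drop_shorter and drop_longer:
        return "bandpass", durations_ms[bd_index]
    if drop_longer:
        return "short-pass", durations_ms[bd_index]
    if drop_shorter:
        # Responds only above a minimum duration; amplitude tests are still required
        return "long-pass candidate", None
    return "not duration-tuned", None


durations = [1, 2, 3, 4, 5]
print(classify_duration_tuning(durations, [2.0, 1.5, 0.8, 0.2, 0.0]))
print(classify_duration_tuning(durations, [0.1, 0.5, 2.0, 0.6, 0.0]))
print(classify_duration_tuning(durations, [0.0, 0.0, 0.5, 1.8, 2.0]))
```

The spike counts in the usage lines are made-up values chosen only to exercise each branch.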
Conceptual mechanisms of duration tuning
Various neural mechanisms have been proposed to explain the responses of DTNs. Some are circuit-based models that rely on the temporal interaction and integration of excitatory and inhibitory synaptic inputs at the DTN (Casseday et al., 1994, 2000; Fuzessery and Hall, 1999; Aubie et al., 2009). Another proposes that slowly changing ionic conductances, induced solely by sustained hyperpolarization (i.e., sustained inhibition), result in duration-dependent currents and duration-selective responses (Hooper et al., 2002). The slow conductance mechanism, while biologically possible, lacks empirical support because DTNs in the auditory midbrain continue to fire action potentials even when inhibition is neuropharmacologically blocked (see below). Here, we investigate how variations of two circuit-based mechanisms, using both excitatory and inhibitory synaptic inputs, can reproduce the short-pass and bandpass responses observed from in vivo recordings in the auditory midbrain of the bat, rat, mouse, and frog.
Coincidence detection mechanism
The coincidence detection mechanism of duration tuning was first suggested by Potter (1965), who hypothesized that neurons tuned for stimulus duration in the frog's torus semicircularis would spike only after the summation of excitations evoked by the onset and offset of a stimulus. This mechanism was further described by Narins and Capranica (1980), who postulated that an onset-evoked excitation would temporally coincide with an offset-evoked excitation and produce spikes in a DTN when the onset-evoked excitation had a delay (latency) equal to the preferred stimulus duration.
This model was later augmented to include inhibitory inputs when it was shown that blocking inhibition at the DTN severely reduced (or abolished) the temporal selectivity of the cell (Casseday et al., 1994, 2000; Jen and Feng, 1999; Jen and Wu, 2005; Yin et al., 2008). Therefore, the coincidence detection mechanism requires three inputs to the DTN: (1) onset-evoked excitation; (2) offset-evoked excitation; and (3) onset-evoked inhibition that is sustained for as long or longer than the duration of the stimulus. In the presence of inhibition, neither the onset-evoked nor the offset-evoked excitation is suprathreshold on its own, so neither can evoke spiking in the DTN by itself; however, when the onset- and offset-evoked excitations temporally coincide, or their effects sufficiently overlap, the summed excitation can overcome inhibition and evoke spiking in the DTN (Fig. 2A,B). Because the arrival time (latency) of the offset-evoked excitation varies with stimulus duration, the response of the DTN will be selective to specific durations. A symmetrical window of coincidence will naturally result in a bandpass duration-tuned response (Fig. 2A,B); however, the coincidence detection mechanism can also produce short-pass duration-tuned responses when the excitatory inputs maximally coincide at the shortest stimulus durations (Aubie et al., 2009).
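The logic of the coincidence detection mechanism can be captured in a deliberately abstract Python sketch. This is a toy threshold model, not the conductance-based NEURON model: each excitatory input is treated as an instantaneous, exponentially decaying depolarization, and the effect of sustained inhibition is folded into a fixed threshold that a single input cannot reach. The input latencies are borrowed from the default model described in Materials and Methods; the weight, decay constant, threshold, and function names are hypothetical values chosen only for illustration.

```python
import math

ONSET_LATENCY_MS = 10.0   # onset-evoked excitation, re stimulus onset (default model)
OFFSET_LATENCY_MS = 6.0   # offset-evoked excitation, re stimulus offset (default model)
EPSP_WEIGHT = 1.0         # hypothetical amplitude of one excitatory input
EPSP_TAU_MS = 3.0         # hypothetical decay time constant
THRESHOLD = 1.5           # hypothetical; one input alone (1.0) stays subthreshold


def peak_summed_excitation(duration_ms):
    """Peak of two decaying inputs, reached when the later input arrives."""
    gap_ms = abs((duration_ms + OFFSET_LATENCY_MS) - ONSET_LATENCY_MS)
    return EPSP_WEIGHT * (1.0 + math.exp(-gap_ms / EPSP_TAU_MS))


def responds(duration_ms):
    """True when the summed excitation exceeds the inhibition-raised threshold."""
    return peak_summed_excitation(duration_ms) > THRESHOLD


print([d for d in range(1, 11) if responds(d)])
```

With these hypothetical parameters the toy cell responds over a symmetrical band of durations centered on 4 ms, where the two inputs coincide exactly. Because the sketch ignores the time course of inhibition, it illustrates only the bandpass case.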
Anti-coincidence mechanism
Fuzessery and Hall (1999) noticed that many short-pass DTNs in the IC of the pallid bat had responses locked to stimulus onset rather than stimulus offset, suggesting these neurons lacked an offset-evoked excitatory input. They proposed an alternative mechanism that had only one onset-evoked excitatory input and predicted that a DTN would respond when excitation and inhibition failed to coincide. This anti-coincidence mechanism of duration tuning requires two inputs to the DTN: (1) onset-evoked excitation; and (2) onset-evoked inhibition that is sustained for as long or longer than the duration of the stimulus (Fuzessery and Hall, 1999). In the absence of inhibition, the onset-evoked excitation is suprathreshold and can evoke spikes in the DTN. When the onset-evoked excitation has a latency longer than the offset (end) of the inhibition, as may occur at short stimulus durations, it can evoke spiking in the DTN. At longer stimulus durations, the sustained inhibition will eventually overlap the onset-evoked excitation and spiking will be extinguished (Fig. 2C,D). Therefore, the anti-coincidence mechanism naturally results in short-pass duration-tuned responses; however, modified anti-coincidence mechanisms can also produce bandpass duration tuning when the excitatory input is weak or absent at the shortest stimulus durations (Aubie et al., 2009).
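The anti-coincidence mechanism can be sketched in the same abstract way. In this toy Python example, the cell fires only if the single onset-evoked excitation arrives outside the window of sustained inhibition. All latencies, the persistence value, and the function names are hypothetical and were chosen only to make the short-pass behavior visible; this is not the conductance-based model.

```python
EXCITATION_LATENCY_MS = 12.0      # hypothetical onset-evoked excitation latency
INHIBITION_LATENCY_MS = 9.0       # hypothetical onset-evoked inhibition latency
INHIBITION_PERSISTENCE_MS = 0.0   # hypothetical extra time inhibition outlasts the stimulus


def inhibition_window(duration_ms):
    """Start and end (ms re stimulus onset) of the sustained inhibition."""
    start = INHIBITION_LATENCY_MS
    end = INHIBITION_LATENCY_MS + duration_ms + INHIBITION_PERSISTENCE_MS
    return start, end


def responds(duration_ms):
    """True when excitation fails to coincide with the inhibition window."""
    start, end = inhibition_window(duration_ms)
    return not (start <= EXCITATION_LATENCY_MS <= end)


print([d for d in range(1, 11) if responds(d)])
```

With these values the toy cell responds only to the shortest stimuli, because at longer durations the inhibition window extends far enough to cover the excitation.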
Default model
Response profile
Our default computational model produced a short-pass DTN with a BD of 1 ms. Figure 3A illustrates the membrane potential of the DTN in the default model for single stimulus presentations from 1 to 9 ms in duration relative to the timing of spikes arriving at the DTN from the onset- and offset-responding excitatory input neurons. Figure 3B is a dot raster display of the default model over 20 trials, with the mean spike count and FSL (re stimulus onset) over all trials summarized in Figure 3, C and D, respectively. A network diagram of the default model inputs and their synaptic latencies is shown in Figure 3E. In this simulation, one or two spikes were evoked in the DTN for stimuli ≤4 ms, one or no spikes were evoked for stimuli between 5 and 8 ms, and no spikes were evoked at durations >8 ms (Fig. 3A–C). For stimulus durations ≤4 ms, the spike from the offset-responding excitatory input neuron arrived at the model DTN before (1–3 ms) or exactly coincident with (4 ms) the spike from the onset-responding excitatory input neuron, whereas for stimulus durations ≥5 ms the spike from the offset-responding input neuron arrived after the spike from the onset-responding input neuron. The timing of excitatory inputs to the DTN is also evident in the dot raster display (Fig. 3B). The FSL of the model DTN increased more quickly at stimulus durations ≥5 ms because spikes followed the latency of the offset-evoked excitatory input (Fig. 3D).
Because the two excitatory inputs were maximally coincident in response to the 4 ms stimulus, the coincidence detection mechanism of duration tuning would predict the default model DTN to have a BD of 4 ms; however, the observed BD was 1 ms. The discrepancy between the predicted and observed BD was caused by the presence and time course of the sustained inhibition. The effect of inhibition is particularly evident when we compare responses evoked by the 1 and 7 ms stimuli (Fig. 3A). At 1 and 7 ms, the difference in spike timing between the onset- and offset-evoked excitatory inputs was identical (3 ms), as were the magnitudes of the summed excitations even though the order of the inputs was reversed. The model DTN did not produce an equivalent response at 1 and 7 ms because the duration of the sustained inhibition differed. For the 1 ms stimulus, the sustained inhibition lasted for 1 ms and was easily overcome by the excitations even though they were not perfectly coincident, and this resulted in an effective excitation that typically evoked two spikes per stimulus (19 of 20 trials; Fig. 3B,C). For the 7 ms stimulus, the sustained inhibition lasted for 7 ms and decreased the effective strength of the summed excitations, resulting in one spike (16 of 20 trials) or no spikes per stimulus (4 of 20 trials; Fig. 3B,C). The interaction of inhibition with excitation caused the peak net excitation in the model, and thus the BD of the cell, to occur in response to 1 ms stimuli even though excitatory inputs to the DTN were maximally coincident for 4 ms stimuli.
First-spike latency
First-spike latencies in the default model of duration tuning followed stimulus offset but exhibited a nonlinear relationship with stimulus duration (Fig. 3D). At short durations (1–4 ms), the increase in FSL was smaller than the increase in stimulus duration. At intermediate durations (4–6 ms), FSL increased in proportion to stimulus duration, and at longer durations (>6 ms), the increase in FSL was greater than the increase in stimulus duration. For comparison, a dot raster display of a short-pass DTN from the ICc of the bat exhibiting a similar nonlinear increase in FSL is shown in Figure 4A. In the in vivo neuron, FSL was relatively stable at short stimulus durations (1–3 ms) but then increased and followed stimulus offset at longer durations. A nonlinear shift in FSL was previously reported by Fuzessery and Hall (1999) and Faure et al. (2003), who hypothesized that the offset-evoked excitatory input to a DTN had a minimum response latency that did not change for stimulus durations below some minimum duration, but that increased and thus mirrored the concomitant lengthening of the stimulus offset at longer durations.
We compared FSL changes in our model with those observed from in vivo recordings of DTNs from the ICc of the bat (Fig. 4B). To simplify the comparison, we shifted absolute FSLs (re stimulus onset) in both in vivo and model responses so that the shortest (i.e., minimum duration) stimulus to evoke a response had an FSL that fell on the line y = x − 1. Latencies were shifted according to the following equation: FSLshifted = FSLabsolute − FSLmindur + mindur − 1, where FSLshifted was the resulting shifted FSL, FSLabsolute was the observed FSL, FSLmindur was the observed FSL at the shortest duration to evoke spiking, and mindur was the minimum duration stimulus that evoked spiking. The slope of the shifted FSL function was defined as the change in FSL divided by the change in duration. In our default model DTN with a standard GABAA inhibitory conductance of 2.5 nS, the slope of the shifted FSL function was <1 at the shortest stimulus durations (1–5 ms) but then increased and eventually became >1 at longer stimulus durations (Fig. 4B, black circles). A similar change in FSL was observed in approximately one-half of the in vivo cells recorded from the bat, with examples shown in Figure 4B.
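As a minimal sketch of the latency shift and slope definition given above, the following code applies FSLshifted = FSLabsolute − FSLmindur + mindur − 1 to a list of latencies and computes the change in FSL divided by the change in duration between neighboring durations. The function names and the example latencies are hypothetical and are not data from the study.

```python
def shift_fsl(durations_ms, fsl_absolute_ms):
    """Shift FSLs so the shortest duration that evoked spikes lies on y = x - 1.

    Both arguments are parallel lists that include only durations that
    evoked spiking.
    """
    min_index = min(range(len(durations_ms)), key=lambda i: durations_ms[i])
    mindur = durations_ms[min_index]
    fsl_mindur = fsl_absolute_ms[min_index]
    return [fsl - fsl_mindur + mindur - 1 for fsl in fsl_absolute_ms]

def local_slopes(durations_ms, fsl_ms):
    """Change in FSL divided by change in duration between neighboring points."""
    pairs = sorted(zip(durations_ms, fsl_ms))
    return [(f2 - f1) / (d2 - d1) for (d1, f1), (d2, f2) in zip(pairs, pairs[1:])]

# Hypothetical latencies (ms re stimulus onset), for illustration only.
durations = [1, 2, 3, 4]
fsl = [12.0, 12.5, 13.0, 14.0]
shifted = shift_fsl(durations, fsl)
print(shifted)
print(local_slopes(durations, shifted))
```

The shift is a constant offset, so it changes where the function sits relative to y = x − 1 but leaves the slopes unchanged.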
To quantify the change in FSL observed in the in vivo recordings from the ICc of the bat, we computed separate linear regressions for the mean FSL data evoked at both short (1–3 ms) and longer stimulus durations (≥3 ms), and plotted the distribution of the slopes for the two separate stimulus ranges at each level above threshold (Fig. 5). A slope of 0 indicates no change in FSL with increasing stimulus duration and corresponds to a cell with an onset response and constant FSL re stimulus onset. A slope of 1 indicates an increase in FSL equal to the increase in stimulus duration and corresponds to an offset response with a constant FSL re stimulus offset. A slope of >1 indicates that FSL increases more quickly than the increase in stimulus duration and also corresponds to an offset response. Finally, a slope of <0 corresponds to a decrease in FSL with an increase in stimulus duration and corresponds to a cell with paradoxical latency shift (Covey et al., 1996; Galazyuk and Feng, 2001; Hechavarría et al., 2011). We also tested for the equality of variances and means between the distributions at each level above threshold using Levene's test and a t test, respectively.
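The slope interpretation described above can be sketched as follows. This is an illustrative ordinary least-squares fit written in plain Python, not the analysis code used in the study; the tolerance used to decide whether a slope is "equal" to 0 or 1 and the label for slopes between 0 and 1 are assumptions introduced only for the example.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through (x, y)."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def classify_slope(slope, tol=0.05):
    """Map an FSL slope to the interpretations given in the text.

    ASSUMPTION: tol is an arbitrary tolerance chosen for illustration.
    """
    if slope < -tol:
        return "paradoxical latency shift"
    if abs(slope) <= tol:
        return "onset response (constant FSL re onset)"
    if abs(slope - 1) <= tol:
        return "offset response (constant FSL re offset)"
    if slope > 1 + tol:
        return "offset response (FSL grows faster than duration)"
    return "intermediate (not assigned a category in the text)"

# Hypothetical mean FSLs, for illustration only.
short_slope = least_squares_slope([1, 2, 3], [10.0, 10.0, 10.0])
long_slope = least_squares_slope([3, 4, 5, 6], [9.0, 10.0, 11.0, 12.0])
print(short_slope, classify_slope(short_slope))
print(long_slope, classify_slope(long_slope))
```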
At +10 dB re threshold, 10 of 27 cells had negative FSL slope functions for responses evoked by short (1–3 ms) stimuli, and 25 of 28 cells had positive FSL slope functions for responses evoked by longer (≥3 ms) stimuli (Fig. 5, left column). Moreover, the two distributions differed significantly in both their means (t = −2.1159; p = 0.0391) and variances (W = 7.9177; p = 0.0069); however, the differences were not solely caused by having different neurons appear in each histogram as 22 cells yielded FSL slope data that appeared in both distributions; 6 neurons with a negative FSL slope at short durations exhibited a positive FSL slope at longer durations, 13 cells with a positive FSL slope at short durations remained positive at longer durations, 2 cells stayed negative, and 1 cell switched from positive to negative. At +20 dB re threshold, 8 of 20 cells had negative FSL slope functions for responses evoked by 1–3 ms stimuli and 14 of 21 cells had positive FSL slope functions for responses evoked by stimuli ≥3 ms, and there was a statistically significant difference between the variances (W = 6.1143; p = 0.0179) but not the means (t = −1.0387; p = 0.3053) of the two distributions (Fig. 5, middle column). Notably, 17 cells yielded FSL slope data that appeared in both distributions and 6 neurons with a negative FSL slope at short durations exhibited a positive FSL slope at longer durations. At +30 dB re threshold, 8 of 16 cells had negative FSL slopes between 1 and 3 ms, 12 of 14 cells had positive FSL slope functions for responses evoked by stimuli ≥3 ms, and again there was a statistically significant difference between the variances (W = 6.1296; p = 0.0196) but not the means (t = −1.3438; p = 0.1898) of the two distributions (Fig. 5, right column). Twelve neurons yielded FSL slope data that appeared in both distributions, and five neurons with a negative FSL slope at short durations exhibited a positive FSL slope at longer durations. 
In summary, nonlinear change in FSL is a common feature of DTNs in the ICc of the bat. Many cells showed constant or decreasing FSLs in response to very short stimulus durations, and the majority of cells exhibited increasing FSLs with clear offset-following responses to longer stimulus durations.
We hypothesized that nonlinear change in FSL results from the relative timings of the onset- and offset-evoked excitations in combination with the sustained inhibition. Onset-evoked, sustained inhibition is known to increase FSL in mammalian DTNs (Casseday et al., 2000; Fuzessery et al., 2003; Yin et al., 2008). At longer stimulus durations, the offset-evoked excitation occurs after the onset-evoked excitation (Fig. 3A,B), so the depolarized membrane potential of the DTN will have more time to decay toward its resting potential before the offset-evoked excitation arrives. This decay will be accelerated by the presence of sustained inhibition that pushes the membrane potential of the DTN more negative and toward the EGABAA = −80 mV; at longer durations, the membrane potential of the DTN will be further from and take longer to reach its spike threshold, which explains why the slope of the shifted FSL function was >1 in some in vivo DTNs. Leary et al. (2008) also attributed increasing FSLs at longer stimulus durations to the presence of encroaching onset-evoked inhibition because offset-evoked excitation was not observed in whole-cell patch-clamp recordings of frog DTNs.
To test the hypothesis that onset-evoked, sustained inhibition can increase FSLs in our computational model, we generated two versions of the model DTN: one using the default inhibition with ḡGABAA = 2.5 nS, and another using decreased inhibition with ḡGABAA = 1.5 nS. Despite having a 40% reduction in the maximum GABAA inhibitory conductance, the model using decreased GABAA inhibition was able to render excitatory inputs to the DTN subthreshold; hence a fundamental aspect of the coincidence detection mechanism of duration tuning remained unchanged. The model with decreased GABAA inhibition produced slightly shorter FSLs (≤1 ms difference), and a small increase in spike count and width of temporal tuning for stimulus durations between 1 and 6 ms, but otherwise resulted in a short-pass tuning profile (see below, Maximum receptor conductance) (Fig. 8E,F). Most notably, the model with decreased GABAA inhibition produced spikes that exhibited a more linear change in FSL that faithfully followed stimulus offset (FSL slope, ∼1) at longer durations (Fig. 4B, gray circles), thus mirroring the responses of many in vivo DTNs from the ICc of the bat.
Single-parameter modifications
We systematically varied the onset-evoked excitatory input latency, membrane time constant, and maximum receptor conductances of the default model to determine the relative contribution and importance of these parameters on duration tuning response profiles. The results were subsequently used to tune the default model to mimic the responses of DTNs from the bat, rat, mouse, and frog.
Onset-evoked excitatory input latency
We tested four onset-evoked excitatory input latencies to determine how the relative latencies of synaptic inputs can alter the duration tuning response profile of the default model; the latency of the offset-evoked excitation was held constant at 6 ms re stimulus offset. Two important findings resulted from these simulations. First, the bandwidth of duration tuning (i.e., the width of temporal selectivity) systematically increased as a function of the latency of the onset-evoked excitation (Fig. 6A). When the onset-evoked input latency was longer than the offset of the sustained inhibition, the excitation was suprathreshold and the model DTN produced spikes at a fixed latency from stimulus onset. Mean spike counts in the model cell systematically decreased when the duration of the stimulus, and thus the sustained inhibition, was close to the latency of the onset-evoked excitation. Together, these results reinforce the notion that the onset-evoked excitation needs to be coincident with the sustained inhibition to be rendered subthreshold.
Second, for a given onset-evoked excitation latency, FSLs of the model DTN remained relatively constant, except at the shortest and longest stimulus durations (Fig. 6B). At the shortest stimulus durations, the offset-evoked excitation had a small nonzero probability of evoking spikes in the model DTN before the arrival of the onset-evoked excitation because the sustained inhibition at this time had a minimal impact on the membrane potential of the DTN. At intermediate durations when the latency of the onset-evoked excitation was longer than the offset of the stimulus, the onset-evoked excitation produced a constant spike count at a fixed latency re stimulus onset in the model DTN; however, at longer stimulus durations near the latency of the onset-evoked excitation, the sustained inhibition now rendered the onset-evoked excitation subthreshold, so temporal coincidence between the offset-evoked and onset-evoked excitations was necessary to produce spikes in the model cell. Therefore, the mean FSL in the model DTN increased with stimulus duration when the offset-evoked excitation arrived after the onset-evoked excitation. First-spike latencies can also be delayed by encroaching sustained inhibition, as demonstrated in Figure 4B and described below (see Reproducing a frog short-pass DTN).
Few direct observations of excitatory synaptic input latencies to in vivo DTNs have been reported (e.g., whole-cell patch recording) (Covey et al., 1996; Leary et al., 2008). An alternative way to estimate these latencies is to characterize the FSL of neurons that exhibit purely onset response patterns. First-spike latencies of ICc neurons in the big brown bat typically range from ∼5 to ≥30 ms (Haplea et al., 1994). Figure 6C shows data from four onset responding neurons from the ICc of the big brown bat, with FSLs ranging from ∼7 to 23 ms. One cell (MU027.12) is shown twice, once at 0 dB (threshold) and once at +20 dB (re threshold) to illustrate its stable FSL and minimum duration threshold. Dot rasters for two neurons are shown in Figure 6D. Note how the longer latency neuron (gray background) requires a minimum stimulus duration of ∼7 ms to reliably evoke spikes. Later, we demonstrate that onset-evoked excitatory inputs with a minimum stimulus duration threshold could be involved in the mechanism that creates some bandpass DTNs (see below, Reproducing a bat bandpass DTN).
Membrane time constant
The membrane time constant (τ) of a neuron, defined as the product of the membrane resistance (rm) and the membrane capacitance (cm; τmembrane = rmcm), determines how quickly the membrane potential (voltage) will react to synaptic input currents. The membrane time constant of an in vivo neuron is a complex function of multiple ion currents (e.g., K+) (Rothman and Manis, 2003) and nonspecific, hyperpolarization-activated currents (e.g., Ih) (Kopp-Scheinpflug et al., 2011); however, our simplified model allowed us to easily set the passive membrane time constant by altering the maximum conductance of passive leak channels (ḡleak). In the default model with ḡleak = 0.25 mS/cm2 (i.e., r̄leak = 4000 Ωcm2) and cm = 1.0 μF/cm2, the membrane time constant was τ = 4 ms.
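The unit arithmetic behind the default time constant can be written out explicitly. The sketch below only restates τ = rm·cm with the values given above; the function name is hypothetical.

```python
def membrane_time_constant_ms(g_leak_mS_per_cm2, c_m_uF_per_cm2):
    """Passive membrane time constant (ms) from leak conductance and capacitance."""
    # 1 / (mS/cm^2) is kOhm*cm^2; multiply by 1000 to express it in Ohm*cm^2.
    r_m_ohm_cm2 = 1000.0 / g_leak_mS_per_cm2
    # Ohm * uF = microseconds.
    tau_us = r_m_ohm_cm2 * c_m_uF_per_cm2
    return tau_us / 1000.0

tau_default = membrane_time_constant_ms(0.25, 1.0)
print(tau_default)  # default model values from the text
```

Because τ is inversely proportional to the leak conductance at fixed capacitance, halving ḡleak doubles the time constant in this passive description.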
Longer time constants in the model DTN resulted in an increased number of spikes and a wider temporal response bandwidth (Fig. 7A,C; panels separated for clarity over different timescales). Response bandwidth, defined as the range of stimulus durations with ≥0.5 spikes per trial, grew exponentially with membrane time constant because the window of effective temporal coincidence between the onset- and offset-evoked excitations broadened (Fig. 7E). Specifically, because the depolarizing effects of the first excitatory input persisted longer at longer membrane time constants, the excitation was more likely to overlap with the second excitatory input arriving later. For all time constants tested, the resulting FSL in the model DTN remained relatively unchanged. At short stimulus durations (i.e., ≤5 ms), the increase in FSL as a function of stimulus duration followed that of the default model and the slope of change in FSL was <1 (Fig. 7B). At longer stimulus durations, the slope of change in FSL was ∼1 because the timing of spikes in the model followed the increase in latency of the offset-evoked excitatory input (Fig. 7D).
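The bandwidth criterion above can be sketched in a few lines. The code assumes that the responsive durations form one contiguous range, so that bandwidth can be taken as the difference between the longest and shortest durations meeting the ≥0.5 spikes per trial criterion; the function name and the spike counts are hypothetical.

```python
def response_bandwidth_ms(durations_ms, mean_spikes_per_trial, criterion=0.5):
    """Width (ms) of the range of durations with mean spike count >= criterion.

    ASSUMPTION: responsive durations are contiguous, so width = max - min.
    Returns 0.0 if no duration meets the criterion.
    """
    responsive = [d for d, s in zip(durations_ms, mean_spikes_per_trial)
                  if s >= criterion]
    if not responsive:
        return 0.0
    return float(max(responsive) - min(responsive))

# Hypothetical mean spike counts for a short-pass cell, for illustration only.
durations = [1, 2, 3, 4, 5, 6, 7, 8, 9]
spikes = [1.95, 1.6, 1.2, 1.0, 0.8, 0.6, 0.4, 0.1, 0.0]
bandwidth = response_bandwidth_ms(durations, spikes)
print(bandwidth)
```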
Maximum receptor conductance
The responses of IC neurons, including DTNs, are shaped by complex interactions of excitation and inhibition (Rose et al., 1966; Pollak and Park, 1993; McAlpine et al., 1998; Casseday et al., 2000; Gittelman and Pollak, 2011; Pollak et al., 2011). Here, we vary the maximum receptor conductances (ḡ) of AMPA, NMDA, and GABAA receptors to understand how these inputs contribute to duration selectivity (Fig. 8).
Modifying only the AMPA receptor maximum conductance provided two interesting results. First, removing the AMPA-mediated current (ḡAMPA = 0 nS) completely extinguished spiking in the model cell; without the AMPA-mediated depolarizing current, the NMDA-mediated current was stinted by the voltage-dependent Mg2+ block and the model neuron never reached spiking threshold (Fig. 8A; 0 nS). Second, as the maximum AMPA receptor conductance increased, the number of spikes and the bandwidth of duration tuning also increased (Fig. 8A). Interestingly, setting ḡAMPA = 6 nS permitted the model cell to have a nonzero spiking probability over a wider, but still limited, range of stimulus durations with the FSL following stimulus offset. This demonstrated that both the onset- and offset-evoked excitations were required to evoke spiking and that the temporal window of coincidence was considerably wider at higher maximum AMPA receptor conductances (Fig. 8B; 6 nS). When ḡAMPA = 8 nS, the model cell lost temporal selectivity and produced spikes at all stimulus durations. Moreover, the FSL of the model now followed stimulus onset because the onset-evoked excitatory input was suprathreshold even with inhibition and thus no longer required temporal coincidence with the offset-evoked excitation (Fig. 8B; 8 nS).
Increasing only the NMDA receptor maximum conductance resulted in more spikes and a wider duration tuning response profile (Fig. 8C). Unlike when the AMPA conductance was abolished, spikes were still evoked in the model cell even after the NMDA-mediated current was removed (ḡNMDA = 0 nS); however, the number of spikes and temporal bandwidth of tuning were greatly diminished. Even at a very high NMDA maximum conductance (ḡNMDA = 35 nS), temporal coincidence between the onset- and offset-evoked excitations was still required to evoke spiking. Moreover, the model neuron maintained an offset following response pattern (Fig. 8D) and did not switch to an onset following response pattern until ḡNMDA ≥ 125 nS (data not shown). This high NMDA maximum conductance was required to overcome the strong Mg2+ block. Because increased AMPA-mediated depolarizing currents would further release the Mg2+ block from NMDA channels, a higher maximum AMPA conductance would lower the minimum NMDA conductance required to switch the spiking pattern of a neuron from offset to onset following. For example, when we set the ḡAMPA to 6 nS, the model DTN switched from offset responding to onset responding when the ḡNMDA was only 55 nS (data not shown). This highlights how the spiking response pattern of a neuron can be mediated by a complex interaction of solely excitatory currents (Zhang and Kelly, 2001; Sanchez et al., 2007).
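The dependence of NMDA-mediated current on depolarization can be illustrated with a generic description of the voltage-dependent Mg2+ block. The equation used by the model is not given in this section, so the sketch below assumes a widely used sigmoidal form (in the style of Jahr and Stevens, 1990) with commonly quoted constants; neither the form nor the constants should be read as the model's actual parameters, and the function names are hypothetical.

```python
import math

def nmda_unblocked_fraction(v_mV, mg_mM=1.0):
    """Fraction of NMDA conductance free of Mg2+ block at membrane potential v_mV.

    ASSUMPTION: Jahr-Stevens-style expression; the constants 3.57 mM and
    0.062 per mV are commonly quoted values, not parameters from this paper.
    """
    return 1.0 / (1.0 + (mg_mM / 3.57) * math.exp(-0.062 * v_mV))

def nmda_conductance_nS(g_max_nS, v_mV, mg_mM=1.0):
    """Effective NMDA conductance after applying the assumed Mg2+ block."""
    return g_max_nS * nmda_unblocked_fraction(v_mV, mg_mM)

# Compare a potential near rest with progressively depolarized potentials.
for v in (-65.0, -40.0, -20.0):
    print(v, round(nmda_unblocked_fraction(v), 3))
```

Under this assumed form, only a small fraction of the maximum NMDA conductance is available near rest, which is consistent with the observation that a very large ḡNMDA, or additional AMPA-mediated depolarization, was needed before the NMDA-mediated current alone could drive onset-following spikes.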
Reducing the strength of the GABAA-mediated current also resulted in more spikes and a wider temporal response bandwidth in the model cell (Fig. 8E). When the GABAA maximum conductance was reduced to 1.5 nS, the model DTN responded to durations up to 20 ms and demonstrated a wider window of temporal coincidence. When the conductance was reduced to 1.0 nS, at least one spike was evoked at all stimulus durations and FSL followed stimulus offset (Fig. 8F). In this case, the weakened sustained inhibition was still strong enough to render the onset-evoked excitation subthreshold because it completely overlapped the excitation. In contrast, because the offset-evoked excitation only slightly overlapped the sustained inhibition, it was not rendered subthreshold, so offset-evoked spikes occurred at every duration. In the default model, the GABAA-mediated inhibition was stronger and persisted long enough to render the offset-evoked excitation subthreshold. When inhibition was removed completely, the model DTN responded at all stimulus durations (Fig. 8E; 0 nS) with a constant FSL (Fig. 8F; 0 nS). This was expected because, in the absence of inhibition, the onset-evoked excitatory input was suprathreshold on its own. Although offset-responding DTNs are common in mammals, a high proportion of onset-responding DTNs have been reported in the pallid bat (Fuzessery and Hall, 1999) and least horseshoe bat (Luo et al., 2008). Offset-responding DTNs with clear onset excitation breakthrough have also been reported from the ICc of the big brown bat and were hypothesized to occur when onset-evoked excitatory input(s) overpowered inhibitory input(s) (Ehrlich et al., 1997; Faure et al., 2003). A previous computational study demonstrated that reducing the strength of inhibition relative to excitation can cause a model DTN with offset spiking responses to exhibit an “onset excitation breakthrough” spiking pattern (Aubie et al., 2009).
Our present simulations reinforce the importance of balanced excitatory and inhibitory input for producing DTNs with offset responses and sharp temporal selectivity.
Multiple parameter modifications
In this section, we modify the default model of duration tuning with parameters tuned to reproduce the responses of in vivo DTNs from the bat, rat, mouse, and frog. These species were chosen to highlight how biologically plausible parameter modifications in our computational model were able to produce a range of temporal selectivities mirroring in vivo duration tuning response profiles. In the bat and the rat examples, we explore two modifications that we hypothesize could contribute to transforming the responses of short-pass DTNs into bandpass DTNs. In the mouse and the frog examples, we demonstrate how both the coincidence detection and anti-coincidence mechanisms can create a short-pass duration-tuned response profile. The models showcase that biologically feasible instantiations of duration selectivity could be created by similar neural mechanisms in different vertebrates. Our theoretical approach provides credible support for the existence of similar plausible instantiations of temporal selectivity in different species; however, alternative mechanisms likely exist and are not excluded by our investigation.
Reproducing a bat bandpass DTN
The coincidence detection mechanism is hypothesized to underlie some bandpass DTNs (Ehrlich et al., 1997; Brand et al., 2000; Faure et al., 2003; Pérez-González et al., 2006). If the onset-evoked excitatory input arrived before the offset-evoked excitatory input at short stimulus durations, but after the offset-evoked excitatory input at long stimulus durations, then theoretically such a mechanism would produce a bandpass DTN. The cell would be expected to show a symmetrical pattern of spiking for stimulus durations both shorter and longer than BD. Often, however, bandpass DTNs in bats have little to no response at the shortest stimulus duration(s), and a longer tail of diminishing spiking probability for stimuli above BD (Ehrlich et al., 1997; Fuzessery and Hall, 1999; Zhou and Jen, 2001). Such a bandpass response profile can be produced using the coincidence detection mechanism if the DTN receives no excitatory input at the shortest stimulus duration(s). This mechanism is biologically feasible if the onset- and/or offset-evoked excitations have a minimum stimulus duration required for producing synaptic input to the DTN.
Our observations suggest that some bat ICc neurons have a minimum duration threshold that persists over a wide range (≥20 dB) of stimulus amplitudes. The neurons, or their inputs, require a minimum integration time of stimulus energy to fire action potentials that is never or rarely met at very short stimulus durations (e.g., cell MU027.12 in Fig. 6C,D, gray shading). Therefore, removing excitatory input at the shortest stimulus duration is a plausible approach for producing a bandpass DTN. In some cases, a neuron that appears to have a minimum duration threshold may actually have a minimum energy threshold. In this case, increasing stimulus energy should recover the response of the neuron at shorter durations. For example, a 1 ms tone with 0.5 ms linear onset/offset ramps has 4.77 dB less energy than a 2 ms tone with the same ramps. If a neuron with a minimum energy threshold responds to tones ≥2 ms at a given amplitude, then increasing the amplitude of the tone by 4.77 dB should now cause the cell to spike to 1 ms signals. Such energy integrators are not likely candidates for producing bandpass duration tuning stable over a range of stimulus amplitudes.
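The 4.77 dB figure in the example above can be reproduced with a short calculation under two stated assumptions: that the quoted tone durations include the 0.5 ms onset and offset ramps, and that each ramp contributes half the energy of a plateau of equal length. Both assumptions, along with the function names, are introduced here for illustration; the result depends on how the ramps are weighted.

```python
import math

def relative_energy(total_ms, ramp_ms=0.5, ramp_weight=0.5):
    """Relative stimulus energy in plateau-equivalent milliseconds.

    ASSUMPTIONS: total_ms includes both ramps, and each ramp carries
    ramp_weight times the energy of a plateau of the same length.
    """
    plateau_ms = total_ms - 2 * ramp_ms
    if plateau_ms < 0:
        raise ValueError("total duration is shorter than the two ramps")
    return plateau_ms + 2 * ramp_ms * ramp_weight

def level_difference_db(short_ms, long_ms, ramp_ms=0.5, ramp_weight=0.5):
    """Energy of the longer tone relative to the shorter tone, in dB."""
    ratio = (relative_energy(long_ms, ramp_ms, ramp_weight)
             / relative_energy(short_ms, ramp_ms, ramp_weight))
    return 10 * math.log10(ratio)

difference_db = level_difference_db(1.0, 2.0)
print(round(difference_db, 2))
```

With these assumptions the 2 ms tone carries three times the energy of the 1 ms tone, so raising the level of the 1 ms tone by the computed amount would equate their energies, which is the test proposed above for distinguishing an energy threshold from a duration threshold.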
We implemented this strategy to reproduce a bandpass DTN from a big brown bat by removing excitatory input to the default model for 1 ms stimuli. Dot raster displays of the model cell (Fig. 9A) and in vivo bat DTN (Fig. 9B) both show offset-responding spiking patterns. The modified model reproduced both the mean spike counts (Fig. 9C) and FSLs (Fig. 9D) of the in vivo bat DTN. A network diagram of model inputs is shown in Figure 9E, and the model parameters are listed in Table 5. Variation in spike timing was lower for the model cell than the in vivo DTN (Fig. 9, compare A, B), but this was not surprising because the sole source of noise in the model was from the timing of presynaptic GABAergic spikes. Incorporating noise into the timing of excitatory inputs to the model would cause additional jitter in spike arrival times and spike probability that would lead to increased variation in FSL and spike counts per trial.
Reproducing a rat bandpass DTN
Rat DTNs differ from bat DTNs in their preference for longer duration acoustic signals. Pérez-González et al. (2006) reported rat DTNs with BDs up to 128 ms and presented, in detail, an in vivo DTN with a BD of 56 ms. The latter cell had a highly variable spiking probability with weak responses to 7.5 kHz tone durations <50 ms, a peak spike probability of six spikes per 10 trials in response to tones at BD, and a diminishing spiking probability for tone durations up to 159 ms.
On its own, the coincidence detection mechanism of duration tuning could produce a bandpass DTN with a BD of 56 ms; however, the duration-tuning response profile of the model would be expected to show symmetrical spike counts at durations both shorter and longer than BD. The in vivo data show that the rat DTN had an asymmetrical spike count surrounding BD (Pérez-González et al., 2006). To account for the asymmetry in our model, we hypothesized that the onset-evoked inhibition received by the model DTN was initially strong, but then decreased and plateaued (adapted) to a weaker level for the duration of the stimulus. Whole-cell patch-clamp studies in the bat (Covey et al., 1996) and frog (Leary et al., 2008), and extracellular recordings from the bat (Faure et al., 2003) suggest that inhibition in the DTN is strongest at stimulus onset, likely as a result of both GABAergic and glycinergic inputs (Casseday et al., 2000). To simulate strong transient inhibition in our model, we added a second, stronger GABAergic input with a 15 ms latency (re stimulus onset) that persisted for the initial 35 ms of the stimulus. This modification resulted in a model rat DTN that failed to respond to short duration stimuli because its offset-evoked excitation overlapped with the early and strong transient inhibition (Fig. 10A). At stimulus durations near BD (56 ms), the onset- and offset-evoked excitations sufficiently overlapped after the early inhibition had decayed, so spikes were evoked. At durations longer than BD, decreasing temporal coincidence between the onset- and offset-evoked excitatory inputs produced a decaying trail of spikes that followed stimulus offset, resulting in a bandpass neuron with an asymmetrical response probability that mirrored the duration tuning profile of the in vivo rat DTN (Fig. 10B). The in vivo response profile was especially difficult to reproduce because of its highly variable and weaker spiking (Fig. 10C). 
To account for this variability, we tuned the parameter values in the model to approximate the duration tuning response profile of the in vivo rat neuron when averaged over 1000 simulations. We then ran the simulation 20 times with different random seeds until variability in model output was similar to the in vivo rat DTN. A network diagram of the inputs to the model is shown in Figure 10E with model parameters listed in Table 5. The tuned model reproduced the mean spike counts and FSLs of the rat in vivo DTN (Fig. 10C,D). Unlike the model bat DTN, excitatory inputs to the model rat DTN were present at all stimulus durations.
Reproducing a mouse short-pass DTN
The response profiles of DTNs in mice share similarities with those of DTNs in bats and rats. Some mouse DTNs are tuned to short stimulus durations and have narrow temporal response bandwidths, while others have longer BDs and wider temporal bandwidths (Brand et al., 2000; Xia et al., 2000). To highlight how our computational model was able to reproduce the response of a narrowly tuned DTN from the mouse, we adjusted model parameters to produce a short-pass neuron with a BD of 1–2 ms and a sharp decrease in spiking for longer stimulus durations (Brand et al., 2000).
The model DTN nicely reproduced the sharp tuning of the in vivo short-pass DTN, although the model failed to produce spikes at longer stimulus durations as occasionally observed in the in vivo DTN of the mouse (Fig. 11A,B). These additional in vivo spikes could have been evoked spontaneously from a source of excitatory noise not present in our model. The FSL of both the in vivo mouse DTN and the model DTN had a tendency to increase slightly with stimulus duration (Fig. 11C). A network diagram of the inputs to the model is shown in Figure 11D and model parameters are listed in Table 5.
Reproducing a frog short-pass DTN
Although DTNs were originally discovered and first reported from the auditory midbrain (torus semicircularis) of frogs (Potter, 1965; Straughan, 1975; Narins and Capranica, 1980), these cells have received little attention in anurans. A recent in vivo whole-cell patch-clamp study by Leary et al. (2008) found that, of 22 DTNs recorded from the torus semicircularis, 11 were short-pass, 1 was bandpass, and 10 were long-pass. Intracellular recordings unveiled a short-latency, onset-evoked inhibition (hyperpolarization) and a longer latency, onset-evoked excitation (depolarization). Counter to the coincidence detection mechanism of duration tuning, offset-evoked depolarizations were not observed. Therefore, we expected that the mechanism underlying duration selectivity in anurans would be more similar to the anti-coincidence mechanism originally proposed for short-pass duration selectivity in bats (Fuzessery and Hall, 1999).
We removed the offset-evoked excitatory input in our default model to reproduce the short-pass response profile of a frog DTN reported by Leary et al. (2008), including the low but nonzero spiking probability at longer stimulus durations (Fig. 12A,B). The spiking responses of our model frog DTN also mirrored the increase in FSL at longer stimulus durations (Fig. 12C) routinely observed in in vivo recordings. Leary et al. (2008) hypothesized that this increase was a result of encroaching inhibition delaying spikes. Spikes evoked in this example in vivo cell showed a relatively weak increase in FSL compared with other frog DTNs (Leary et al., 2008). The delay of FSL in our model was caused by encroaching inhibition because only a single, fixed-latency, onset-evoked excitatory input was present in the network (see Fig. 12D with model parameters listed in Table 5).
Discussion
Duration-tuned neurons are found in the auditory midbrain of many vertebrates: bats (Ehrlich et al., 1997; Fuzessery and Hall, 1999; Casseday et al., 2000; Faure et al., 2003; Mora and Kössl, 2004), rats (Pérez-González et al., 2006), frogs (Potter, 1965; Straughan, 1975; Narins and Capranica, 1980; Gooler and Feng, 1992; Leary et al., 2008), mice (Brand et al., 2000; Xia et al., 2000), guinea pigs (Wang et al., 2006; Yin et al., 2008), chinchillas (Chen, 1998), and cats (He et al., 1997; Qin et al., 2009). Cortical DTNs have been studied in the bat (Galazyuk and Feng, 1997; Ma and Suga, 2001) and cat (He et al., 1997) auditory cortex and the cat visual cortex (Duysens et al., 1996). Invertebrate duration selectivity for vibratory signals has also been found in ascending interneurons of insects (Zorović, 2011). Our results support the hypothesis that neural mechanisms of duration selectivity may be shared across vertebrates. We created neurophysiologically plausible computational models of two theoretical mechanisms, the coincidence detection and anti-coincidence mechanisms of duration tuning, that accurately reproduced the in vivo responses of DTNs over a wide range of temporal response profiles. We did so by modifying synaptic input latencies, synaptic input strengths, and membrane time constants of model DTNs. Here, we review inherent limitations of our models and suggest future enhancements, some of which rely on yet to be obtained physiological data. We also highlight similarities between neural mechanisms of duration tuning and mechanisms underlying other stimulus feature detectors in the CNS.
Model limitations and future enhancements
Intracellular recordings from DTNs have been achieved only twice: once in bats (Covey et al., 1996) and once in frogs (Leary et al., 2008). Both studies revealed the relative timing of membrane voltage changes and showed that inhibition often preceded excitation; however, the duration and strength of individual inputs were indeterminable because the individual strengths of excitation and inhibition were inseparable from their summated effect on membrane potential. Membrane time constants were not measured either. In this study, we chose physiologically plausible membrane and ion channel parameters for the auditory midbrain but these parameters may be further constrained within the subclass of DTNs.
Our models incorporated only one inhibitory receptor (GABAA) and two excitatory receptors (AMPA and NMDA). We used two excitatory receptors for two reasons: the long time constant of NMDA receptors was important for producing long-duration membrane depolarizations from transient inputs, and NMDA receptors required AMPA receptor-mediated depolarization to relieve the Mg²⁺ block. Long depolarizations increased the window of temporal coincidence for multiple inputs, a feature we believe is critical for tuning to long stimulus durations (Fig. 8A,C). Additional receptors and neurotransmitters shape IC neuron responses and may also contribute to duration tuning. GABAB receptors act presynaptically to modulate GABAergic (Ma et al., 2002a) and glutamatergic (Sun et al., 2006) inputs and are found in moderate concentration in the rat IC (Charles et al., 2001) and low concentration in the bat IC (Fubara et al., 1996). Although GABAB receptors were not explicitly modeled, their net effect is represented by the magnitude of inputs to the model.
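The interplay between fast AMPA and slow, voltage-dependent NMDA conductances described above can be illustrated with a minimal sketch. It is not the implementation used in our models: the sigmoidal Mg²⁺ block uses the commonly quoted Jahr and Stevens style constants, and the rise and decay time constants in `AMPA` and `NMDA` are illustrative assumptions rather than the parameter values of the model DTNs.

```python
import math

def mg_unblocked_fraction(v_mv, mg_mm=1.0):
    """Fraction of NMDA conductance free of Mg2+ block at membrane potential v_mv (mV).

    Sigmoid with commonly quoted constants (assumed here, not taken from the paper).
    """
    return 1.0 / (1.0 + (mg_mm / 3.57) * math.exp(-0.062 * v_mv))

def normalized_dual_exp(t_ms, tau_rise_ms, tau_decay_ms):
    """Dual-exponential conductance time course, scaled so that its peak equals 1."""
    if t_ms <= 0.0:
        return 0.0
    # Time of the peak of exp(-t/tau_decay) - exp(-t/tau_rise)
    t_peak = (tau_rise_ms * tau_decay_ms / (tau_decay_ms - tau_rise_ms)) \
        * math.log(tau_decay_ms / tau_rise_ms)
    peak = math.exp(-t_peak / tau_decay_ms) - math.exp(-t_peak / tau_rise_ms)
    value = math.exp(-t_ms / tau_decay_ms) - math.exp(-t_ms / tau_rise_ms)
    return value / peak

# Hypothetical (rise, decay) time constants in ms, chosen only for illustration
AMPA = (0.5, 3.0)
NMDA = (5.0, 50.0)

for v in (-65.0, -30.0, 0.0):
    print(f"V = {v:6.1f} mV: unblocked NMDA fraction = {mg_unblocked_fraction(v):.3f}")

for t in (1.0, 5.0, 20.0, 50.0):
    print(f"t = {t:5.1f} ms: AMPA = {normalized_dual_exp(t, *AMPA):.3f}, "
          f"NMDA = {normalized_dual_exp(t, *NMDA):.3f}")
```

Under these assumptions, the NMDA conductance is largely blocked near rest and becomes available only once the cell is depolarized, and it remains elevated long after the AMPA conductance has decayed, which is the property that widens the window of temporal coincidence.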
Inhibitory glycine receptors are found in the IC of both rats (Pourcho et al., 1992) and bats (Vater et al., 1992; Fubara et al., 1996; Klug et al., 1999; Wenstrup and Leroy, 2001). Pharmacologically blocking glycine receptors increases spike counts and reduces temporal selectivity in DTNs in a manner similar to, albeit less pronounced than, blocking GABAA receptors (Casseday et al., 1994, 2000). Although glycine receptors have faster kinetics than GABAA receptors (Harty and Manis, 1998), the effect of a sustained glycinergic input to a DTN would be similar to that of a sustained GABAergic input. That said, it is possible that glycinergic inputs differentially affect DTNs by having distinct temporal response patterns over the duration of a stimulus. Blocking glycine receptors in the IC appears to preferentially affect responses near stimulus onset, suggesting that stimulus-evoked inhibition has an early glycine component (Casseday et al., 2000). The ventral nucleus of the lateral lemniscus in the bat is known to project transient, onset-evoked glycinergic inputs to the IC that could provide early inhibition to DTNs (Covey and Casseday, 1991; Vater et al., 1997). Some glycinergic synapses in the guinea pig ventral cochlear nucleus have fast desensitization times that could also contribute transient inhibition (Harty and Manis, 1998). We produced strong, early inhibition in our model rat DTN by incorporating additional GABAergic inputs. Additional glycine receptors would achieve a similar result.
Serotonin and dopamine have received less attention in the IC literature and were not implemented in our models; however, they likely contribute to duration selectivity. Serotonin levels in the IC are modulated by behavioral states (Hurley et al., 2002; Peruzzi and Dut, 2004), and activation of serotonin receptors can shift the FSL of IC neurons (Hurley and Pollak, 2005; Hurley, 2007) and modulate GABAergic inhibition (Hurley et al., 2008). Because both latency and inhibition are important parameters for determining the response profile of DTNs, this suggests that behavioral state may modulate the responses of DTNs. Dopamine receptors are present in the IC (Charuchinda et al., 1987), but their effects on DTNs and other IC neurons remain unclear.
Duration tuning and other auditory feature detectors
Within the central auditory system, DTNs are first observed in the IC and arise from the temporal integration of excitatory and inhibitory synaptic inputs. Our results support the hypothesis that both the timing (Fig. 6) and magnitude (Fig. 8) of inputs are important for producing the response characteristics of DTNs. Furthermore, we found that long membrane time constants produced stronger spiking and wider duration tuning bandwidths (Fig. 7), demonstrating that intrinsic membrane properties can provide additional response heterogeneity for cells with similar synaptic inputs.
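The influence of membrane time constant on temporal integration can be sketched with a passive leaky integrator. This is a toy illustration, not our conductance-based model: it assumes that the time constant is varied through membrane resistance at fixed capacitance, that each input is a brief rectangular current pulse, and that a fixed voltage threshold (`threshold`, in arbitrary units) stands in for spike generation. All names and values are hypothetical.

```python
def peak_depolarization(tau_ms, delay_ms, dt_ms=0.01, t_end_ms=60.0,
                        pulse_ms=1.0, drive=1.0):
    """Peak voltage of a leaky integrator receiving two brief input pulses.

    Forward-Euler integration of dV/dt = -V / tau + I(t), where I(t) is the
    summed drive of a pulse starting at 0 ms and a pulse starting at delay_ms.
    """
    n_steps = int(round(t_end_ms / dt_ms))
    v = 0.0
    peak = 0.0
    for i in range(n_steps):
        t = i * dt_ms
        inp = 0.0
        if 0.0 <= t < pulse_ms:
            inp += drive
        if delay_ms <= t < delay_ms + pulse_ms:
            inp += drive
        v += dt_ms * (-v / tau_ms + inp)
        peak = max(peak, v)
    return peak

def delays_reaching_threshold(tau_ms, threshold=1.2, max_delay_ms=20):
    """Input delays (ms, whole numbers) at which the summed response reaches threshold."""
    return [d for d in range(max_delay_ms + 1)
            if peak_depolarization(tau_ms, float(d)) >= threshold]

for tau in (2.0, 10.0):
    print(f"tau = {tau:4.1f} ms: suprathreshold delays = {delays_reaching_threshold(tau)}")
```

In this sketch, the longer time constant yields both a larger peak depolarization for a given delay and a broader range of delays over which two inputs still summate above threshold, in line with the stronger spiking and wider tuning bandwidths described above.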
Convergence of excitation and inhibition produces a host of response profiles in the central auditory system, including selectivity for stimulus frequency (Fuzessery and Hall, 1996; Wehr and Zador, 2003; Wu et al., 2008; Xie et al., 2008; Fukui et al., 2010), intensity (Wu et al., 2006), spatial position/sound localization (Rose et al., 1966; Fuzessery and Pollak, 1985; Zhang and Kelly, 2010; Tang et al., 2011), amplitude modulation (Grothe, 1994; Burger and Pollak, 1998), rhythm detection (Felix et al., 2011), FM sweep direction (Gittelman et al., 2009; Gittelman and Pollak, 2011), and pulse-echo delay (Galazyuk et al., 2005; Yavuzoglu et al., 2011). Inhibition shapes and sharpens neural responses by modulating the latency, strength, and time course of excitation. For example, we demonstrated in both the default model (Fig. 3) and the rat bandpass model (Fig. 10) that the maximum spiking response need not be evoked by the stimulus that creates the maximum temporal coincidence between excitatory inputs. In the default model DTN, weak inhibition at short stimulus durations allowed overlapping, but not perfectly coincident, excitatory inputs to maximally summate and evoke the highest spike count. In the rat bandpass model DTN, strong inhibition at short stimulus durations rendered temporally coincident excitatory inputs less effective in evoking spikes. These mechanisms share many attributes found in other auditory feature detectors.
Precisely timed excitatory and inhibitory inputs are important for improving spike timing in the auditory cortex, where inhibition can lag excitation by only a few milliseconds and confine spike times for decreased response jitter (Wehr and Zador, 2003). The temporal response bandwidth of an anti-coincidence DTN also depends on the relative timing of onset-evoked inhibition and onset-evoked excitation. As inhibition lengthens with stimulus duration, long-latency onset-evoked excitation is blocked, producing short-pass duration selectivity (Figs. 2, 12).
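The timing logic of the anti-coincidence mechanism can be reduced to a few lines. The sketch below is a deliberately simplified abstraction, not our model: it assumes all-or-none sustained inhibition that begins at a fixed latency and lasts as long as the stimulus, and a transient onset-evoked excitation that arrives at a longer fixed latency. The function name and latency values are hypothetical.

```python
def anti_coincidence_responds(duration_ms, exc_latency_ms=12.0, inh_latency_ms=8.0):
    """Return True if onset-evoked excitation escapes sustained inhibition.

    Inhibition is assumed to span [inh_latency, inh_latency + duration];
    excitation is assumed to be a transient event at exc_latency.
    """
    inhibition_end_ms = inh_latency_ms + duration_ms
    return exc_latency_ms > inhibition_end_ms

durations = range(1, 11)
passing = [d for d in durations if anti_coincidence_responds(float(d))]
print(passing)  # [1, 2, 3]
```

With these assumed latencies, the cell responds only when the stimulus is shorter than the 4 ms difference between the excitatory and inhibitory latencies, so changing either latency shifts the short-pass cutoff.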
Inhibition in the bat IC can also select for FM sweep direction. Upward and downward FM sweeps evoke suprathreshold EPSPs of different magnitudes in IC neurons; however, IPSPs raise the spiking threshold of the cell to just below the largest excitatory input magnitude. This selects for only the strongest excitatory input and thus the corresponding FM sweep direction (Gittelman et al., 2009). Inhibitory inputs to DTNs are responsible for decreasing spike counts and sharpening temporal response profiles by increasing the spiking threshold of a neuron (Fig. 8). For coincidence detection DTNs, inhibition sharpens the window of coincidence for stimulus-evoked excitatory inputs, hence sharpening temporal selectivity. Similarly, shunting inhibition in the avian nucleus laminaris (analog of mammalian medial superior olive) decreases spontaneous activity and sharpens excitatory coincidence detection important for sound localization (Tang et al., 2011).
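The sharpening effect of inhibition on a coincidence detection DTN can be illustrated in the same abstract way. In this sketch, an onset-evoked EPSP at a fixed latency and an offset-evoked EPSP following stimulus offset are each treated as unit events that decay exponentially, and inhibition is represented only as an increase in the spiking threshold. These simplifications, and all parameter names and values, are assumptions made for illustration and are not the parameters of our models.

```python
import math

def coincidence_passband(durations_ms, threshold, on_latency_ms=15.0,
                         off_latency_ms=5.0, tau_ms=3.0):
    """Durations at which onset- and offset-evoked EPSPs sum to threshold.

    The onset-evoked EPSP arrives at on_latency; the offset-evoked EPSP
    arrives at duration + off_latency. Each EPSP has unit amplitude and
    decays exponentially with time constant tau.
    """
    passing = []
    for d in durations_ms:
        gap_ms = abs(on_latency_ms - (d + off_latency_ms))
        summed_peak = 1.0 + math.exp(-gap_ms / tau_ms)
        if summed_peak >= threshold:
            passing.append(d)
    return passing

durations = list(range(1, 21))
print(coincidence_passband(durations, threshold=1.5))  # [8, 9, 10, 11, 12]
print(coincidence_passband(durations, threshold=1.8))  # [10]
```

Here the best duration equals the difference between the two excitatory latencies (10 ms), and raising the threshold, as a stand-in for stronger inhibition, narrows the range of durations that evoke a response without moving the best duration.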
Inhibitory frequency tuning curves for cortical auditory neurons are often broader than excitatory tuning curves. Therefore, inhibition sharpens frequency tuning by reducing net excitatory input for sounds further from the BEF of the cell (Wu et al., 2008; Xie et al., 2008). Similarly, we demonstrated that the shape of inhibitory inputs to a DTN can be adjusted to produce bandpass duration tuning by selectively preventing the DTN from producing spikes to short-duration stimuli (Fig. 10).
Complex interactions of excitation and inhibition are responsible for creating many types of feature detectors in the CNS. It is not surprising, then, that duration tuning arises from similar interactions and that the underlying neural mechanisms scale across vertebrates to produce species-specific variations in temporal selectivity. Verification of the underlying mechanisms in different species awaits additional in vivo physiological studies.
Footnotes
This work was supported by a Discovery Grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada (P.A.F.). B.A. was supported by an NSERC Canada Graduate Scholarship, and R.S. was supported by an Ontario Graduate Scholarship. Research in the McMaster Bat Laboratory was also supported by infrastructure grants from the Canada Foundation for Innovation and the Ontario Innovation Trust. We are grateful to Prof. Benedikt Grothe for providing the spike latency data for the mouse duration-tuned neuron.
Correspondence should be addressed to Dr. Paul A. Faure, Department of Psychology, Neuroscience & Behaviour, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4K1, Canada. paul4@mcmaster.ca