Abstract
Mammalian cochlear spiral ganglion neurons (SGNs) encode sound with microsecond precision. Spike triggering relies upon input from a single ribbon-type active zone of a presynaptic inner hair cell (IHC). Using patch-clamp recordings of rat SGN postsynaptic boutons innervating the modiolar face of IHCs from the cochlear apex, at room temperature, we studied how spike generation contributes to spike timing relative to synaptic input. SGNs were phasic, firing a single short-latency spike for sustained currents of sufficient onset slope. Almost every EPSP elicited a spike, but latency (300–1500 μs) varied with EPSP size and kinetics. When current-clamp stimuli approximated the mean physiological EPSC (≈300 pA), several times larger than threshold current (rheobase, ≈50 pA), spikes were triggered rapidly (latency, ≈500 μs) and precisely (SD, <50 μs). This demonstrated the significance of strong synaptic input. However, increasing EPSC size beyond the physiological mean resulted in less-potent reduction of latency and jitter. Differences in EPSC charge and SGN baseline potential influenced spike timing less as EPSC onset slope and peak amplitude increased. Moreover, the effect of baseline potential on relative threshold was small due to compensatory shift of absolute threshold potential. Experimental first-spike latencies in response to a broad range of stimuli were predicted by a two-compartment exponential integrate-and-fire model, with latency prediction error of <100 μs. In conclusion, the close anatomical coupling between a strong synapse and spike generator along with the phasic firing property lock SGN spikes to IHC exocytosis timing to generate the auditory temporal code with high fidelity.
Introduction
The auditory system uses a spike time code to represent the temporal structure of sound and calculate source location. The degree of temporal precision is demonstrated by behavioral sensitivity to impressively small interaural time differences, within tens of microseconds (Harnischfeger, 1980; Moiseff and Konishi, 1981; Carr, 1993; Oertel, 1999). Through the first stages of neuronal sensory processing, spike time precision is improved with respect to the sound pressure waveform (Fukui et al., 2006). In the brainstem, at the second afferent synapse, this improvement involves convergence of several auditory nerve fibers onto one postsynaptic cell (Trussell, 1997; Spirou et al., 2005; Cao and Oertel, 2010; Howard and Rubel, 2010). There, the endbulb-type presynaptic terminals of spiral ganglion neurons (SGNs) respond to an action potential with synchronous release of neurotransmitter across many active zones.
In contrast, in the cochlea, at the first afferent synapse each auditory nerve fiber is excited by a single inner hair cell (IHC) active zone (Liberman, 1980). Most if not all information about the acoustic scene is conveyed across these synapses, and then to the brain in spike trains of type I SGNs (comprising 90–95% of auditory nerve fibers) (Perkins and Morest, 1975). The spike generator presumably resides near to the compact postsynaptic bouton (Hossain et al., 2005), on the short neurite, which exits the organ of Corti through the basilar membrane before the myelinated axon begins (Fig. 1A,B). Synaptic excitation and spike generation in type I SGNs are thus unique in terms of anatomical configuration, and their elucidation is fundamental for understanding the auditory temporal code.
SGN activity depends upon Ca2+-dependent presynaptic exocytosis of glutamate from IHCs (Sewell, 1984; Platzer et al., 2000; Robertson and Paki, 2002; Brandt et al., 2003), which is triggered by stochastic Ca2+-channel gating, at rates proportional to the IHC receptor potential (Moser and Beutner, 2000; Brandt et al., 2005; Johnson et al., 2005; Goutman and Glowatzki, 2007, 2011). Thus, EPSC timings from each IHC to its several SGNs contain information about sound pressure at a particular frequency, filtered by mechanoelectrical transduction and ribbon synapse dynamics. Seemingly at odds with precise spike timing is the considerable variability of EPSC rise time (0.1–1.3 ms), duration (τdecay, 0.1–3 ms), and amplitude (20–800 pA) (Glowatzki and Fuchs, 2002). The effect of this variability on SGN spike generation, and its implications for hearing, are unknown (Trussell, 2002).
Here, we investigated how SGN intrinsic properties influence spike timing in response to EPSCs. Combining electrophysiology and computational modeling of type I SGNs, we characterized action potential generation in response to discrete synaptic transmission events or current-clamp injections. Although most EPSCs evoked a spike, the latency and jitter depended upon the EPSC waveform and SGN baseline potential. The average EPSC, however, was much larger than required to trigger a spike, facilitating small latency and jitter of spike generation relative to exocytosis timing. A large synaptic conductance, short integration time, low spike threshold, and phasic firing behavior contribute to the accuracy and precision of auditory nerve spike-timing information by suppressing the effects of variability in ribbon synapse neurotransmission.
Materials and Methods
Preparation.
The onset of hearing occurs at postnatal day 12 (P12) in the rat (Geal-Dor et al., 1993), where P0 is the day of birth. From the inner ears of P11–P19 Wistar rats of either sex, the apical one-half turn of the organ of Corti (tonotopic frequency, 4–10 kHz) was isolated in extracellular solution, taking care not to twist the radial fibers connecting each SGN soma in the cochlear ganglion to its bouton in the organ of Corti. The preparation was made flat by removing modiolar bone surrounding the cochlear ganglion. The tectorial membrane was removed, and the preparation was placed under a grid of fine nylon threads in the recording chamber. Experiments were illuminated with light through a green filter and visualized with a 40×, 0.8 NA water-immersion objective on a Zeiss Axioskop FS 2 with differential interference contrast optics. After 4× optical magnification, the image was acquired by a digital camera with gain and offset control (TILL Photonics), and then displayed on a grayscale video monitor. Suction pipettes maneuvered with a MP-285 (Sutter Instrument) were used to make a small hole in the layer of support cells overlying the basal part of an IHC. A small amount of tissue was removed, allowing access to postsynaptic boutons with an intracellular recording electrode. In these experiments, the intracellular recording electrode was advanced toward the IHCs from the side containing the cochlear ganglion (also called the neural or modiolar side), and then sealed to a bouton contacting the base of an IHC (for similar method, see Grant et al., 2011). Although it was not always possible to determine the region of the plasma membrane innervated by the recorded boutons, they were mainly below the IHC nucleus and on the modiolar side (see Fig. 1A,B). Experiments were performed at room temperature (22–24°C).
Electrophysiology.
The extracellular solution was as follows (in mM): 5.8 KCl, 155 NaCl, 1.3 CaCl2, 0.9 MgCl2, 0.7 NaH2PO4, 5.6 D-glucose, and 10 HEPES. Measured osmolarity was ∼300 mOsm, and pH was adjusted to 7.4 with NaOH. Electrodes were pulled with a P-97 (Sutter Instrument) from filamented, thin-wall, 1 mm outer diameter, borosilicate capillary glass (World Precision Instruments), and then shaped on a custom-built microforge with heat applied to the tip and ∼3 bar positive air pressure applied at the back (Goodman and Lockery, 2000). The intracellular solution was as follows (in mM): 135 KCl, 3.5 MgCl2, 0.1 CaCl2, 5 EGTA, 5 HEPES, 2.5 Na2ATP, and 1 Na2GTP. Measured osmolarity was ∼290 mOsm, and pH was adjusted to 7.2 with KOH. Electrode resistance was 6–10 MΩ. The liquid junction potential of 6 mV was compensated on-line. For bouton recording, the neurite was approached with variable positive pressure delivered through a 30 ml syringe connected to the electrode holder. A gigaohm seal could be obtained by removal of positive pressure or by suction at −80 mV in whole-cell voltage-clamp mode, which occasionally led to immediate or delayed patch rupture and successful intracellular recording.
Voltage and current signals were acquired with an EPC10 (HEKA Elektronik Dr. Schulze GmbH) at 20 μs intervals. Current-clamp stimuli (Iinj) were not filtered on the path to the cell and were displayed as recorded from the current monitor. Recorded current signals were low-pass filtered at 5 kHz (four-pole Bessel). The current-clamp circuit of the EPC10 is a “voltage follower,” which allowed for recording of the true voltage signal for fast action potentials (Magistretti et al., 1996, 1998). The voltage signal in current clamp was not subject to the 5 kHz Bessel filter. Rather, the recorded voltage signal was filtered by the recording pipette, with a time constant equal to the access resistance (Ra) times the uncompensated patch-pipette capacitance (Cfast,residual ≈ 1 pF). Pipette capacitance compensation (Cfast ≈ 5–7 pF; τ ≈ 1–2 μs) speeded the voltage response by injecting the charge predicted to change the potential of the pipette. Cfast was set in voltage clamp and reduced by 5% when switching to current clamp to ensure that it was not overcompensated. This prevented voltage oscillations and should have resulted in only small kinetic underestimates. No bridge balance or active series resistance compensation was used.
Input resistances ranged from 200 to 1000 MΩ between cells when measured with small current or voltage steps near the zero-current potential, which ranged from −70 to −80 mV between cells (mean electrophysiological parameters in Table 1). Data from 12 boutons (2 prehearing, 10 posthearing onset) were considered sufficient in quality and duration to be analyzed for this study. When present, spontaneously occurring activity was recorded in voltage-clamp and current-clamp modes before eliciting spikes with defined excitatory current waveforms (see Fig. 1C,D).
Table 1. Recording and spike parameters of the postsynaptic bouton plus cable
In addition to current-clamp experiments from the zero-current potential, we studied neural responses to excitatory waveforms after setting the membrane potential to relatively depolarized or hyperpolarized levels by applying steady holding currents. The membrane potential sometimes shifted by ±3 mV while holding at constant current over the duration of a 15–90 min recording. Data were acquired over successive 20–60 s periods centered on different mean baseline potentials in interleaved sequence (e.g., −93, −70, −82, and −102 mV with holding currents of −20, 7, −6, and −29 pA, respectively).
Data analysis.
IGOR Pro (Wavemetrics), MiniAnalysis (Synaptosoft), and Mathematica (Wolfram Research) software were used for analysis and plotting. All sweeps were inspected visually. Spikes could be unambiguously discriminated from EPSPs by their peak amplitude. When studying responses to current-clamp stimuli, we discarded sweeps with spontaneous activity within 20 ms before or during the stimulus. For analysis of spontaneous activity, we discarded 20 s segments with greater than ±2 mV deviations from the intended baseline potential. Except for the display in Figure 4A, all voltage traces were corrected by subtracting the voltage error due to series resistance (Verror) equal to the injected current (Iinj) multiplied by the access resistance (Ra). EPSPs and EPSCs were detected in MiniAnalysis (see Figs. 2, 5) by setting an amplitude threshold of ≈7× the root mean square (rms) noise. For example, in one voltage-clamp recording, the rms noise was 1.4 pA, and the amplitude threshold for EPSC detection was 10 pA. Because of the large event size relative to the noise, the counting of events was relatively insensitive to the detection threshold: for example, in a 60 s recording segment, changing amplitude threshold from 10 to 5 pA resulted in detection of four additional events (from 319 to 323), and going from 5 to 2.5 pA resulted in detection of six additional events.
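The two corrections described above (subtracting Verror = Iinj × Ra, and setting the event-detection threshold at about 7× the rms noise) can be illustrated with a minimal sketch. The function names and the 30 MΩ access resistance in the usage note are illustrative assumptions, not values or routines taken from the analysis software named above.

```python
def correct_series_resistance(v_mV, i_pA, ra_MOhm):
    """Subtract the series-resistance error V_error = I_inj * R_a from each sample.

    Units: pA * MOhm = 1e-6 V = 1e-3 mV, hence the factor 1e-3.
    """
    return [v - i * ra_MOhm * 1e-3 for v, i in zip(v_mV, i_pA)]


def detection_threshold(rms_noise, factor=7.0):
    """Amplitude threshold as a multiple of the rms noise (same units as the noise)."""
    return factor * rms_noise
```

For example, with an assumed Ra of 30 MΩ, a 300 pA injection produces a 9 mV error, so a recorded value of −60 mV would be corrected to −69 mV. With the 1.4 pA rms noise quoted above, `detection_threshold(1.4)` gives about 9.8 pA, in line with the 10 pA threshold used in that recording.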
To make accurate and precise estimates of spike times that were not limited by the sample interval, we first measured when Vm depolarized to 10–30 mV below the spike peak (depending on the mean spike height of each SGN from each baseline potential) using linear interpolation between adjacent sampling points. To make spike times more easily comparable across SGNs and baseline potentials, from each spike time measurement we subtracted a fixed period to estimate the spike onset time. This fixed period was unique for each SGN and baseline potential, and was between 0.08 and 0.15 ms, corresponding to less than one-quarter of the spike full width at half-maximum (FWHM). The fixed period was chosen so that the estimated spike onset time lay approximately at the inflection point of the spike upstroke. The spike onset latency was the duration from the stimulus onset to the spike onset time. In the case of spontaneous spikes (i.e., in response to a neurotransmitter release event from the IHC), stimulus onset was defined when the first of either criterion was met: the voltage slope exceeded 5 mV/ms, or the voltage value increased >2 mV above the baseline potential. These two criteria were necessary to reliably detect EPSPs with slow or fast onset.
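The sub-sample spike timing described above can be sketched as follows. This is a minimal illustration, assuming a uniformly sampled voltage trace. The function and parameter names are hypothetical, and the crossing level and fixed period must be supplied per cell and baseline potential as described in the text.

```python
def interpolated_crossing_time(v, dt, level):
    """Time of the first upward crossing of `level`, linearly interpolated
    between the two adjacent samples that bracket it. Returns None if the
    trace never crosses the level."""
    for k in range(1, len(v)):
        if v[k - 1] < level <= v[k]:
            frac = (level - v[k - 1]) / (v[k] - v[k - 1])
            return (k - 1 + frac) * dt
    return None


def spike_onset_latency(v, dt, level, fixed_period, stimulus_onset):
    """Spike onset latency = (crossing time - fixed period) - stimulus onset."""
    t_cross = interpolated_crossing_time(v, dt, level)
    if t_cross is None:
        return None
    return (t_cross - fixed_period) - stimulus_onset
```

With a 20 μs sample interval (dt = 0.02 ms), a level crossed halfway between the second and third samples yields a crossing time of 0.03 ms, i.e., a resolution finer than the sample interval.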
The maximum EPSP slope was determined as the first local maximum in the time derivative of the voltage (before the local maximum due to the spike upstroke). In cases when no local maximum existed, we measured the slope at the time of the global minimum of the second derivative before the spike upstroke, which corresponded to when the prepotential slope increased the least and occurred just before the spike onset. Discrete time derivatives of voltage were calculated using central differences.
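As an illustration of the slope measurement, the sketch below computes a central-difference derivative and returns its first local maximum. It is a simplified reading of the procedure. The names are hypothetical, the one-sided differences at the endpoints are an assumption, and the fallback based on the second derivative is not implemented.

```python
def central_difference(v, dt):
    """dV/dt by central differences at interior samples; one-sided
    differences are used at the two endpoints (an assumption).
    Requires at least two samples."""
    n = len(v)
    d = [0.0] * n
    for k in range(1, n - 1):
        d[k] = (v[k + 1] - v[k - 1]) / (2.0 * dt)
    d[0] = (v[1] - v[0]) / dt
    d[-1] = (v[-1] - v[-2]) / dt
    return d


def first_local_max(x):
    """Index and value of the first local maximum of x, or None if absent."""
    for k in range(1, len(x) - 1):
        if x[k - 1] < x[k] >= x[k + 1]:
            return k, x[k]
    return None
```

Applied to a trace in which an EPSP precedes the spike upstroke, `first_local_max(central_difference(v, dt))` returns the EPSP maximum slope rather than the larger slope of the upstroke, provided the search window ends before the spike peak.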
We studied the capacitance of the clamped membrane using multiple techniques, including curve fitting to voltage-clamp and current-clamp data, and whole-cell RC compensation in voltage-clamp mode. The average values obtained across these methods in each cell were used to calculate the grand mean across cells in Table 1. Because these methods assumed a single compartment, we also calculated membrane capacitances based on a two-compartment model (see below), which is the simplest model capable of producing the double-exponential responses we observed. We note that the relatively small size of the boutons together with the incomplete space clamp of recordings from a bouton and attached fiber may limit the certainty of our measurements of membrane capacitance. Data are reported as mean ± SD.
EPSC-like shapes.
EPSC-like current waveforms were constructed by specifying their charge, linear rise time, plateau duration, and exponential decay time constant. These four parameters uniquely defined the EPSC-like shapes and thus fixed their amplitude. EPSC-like shapes (see Figs. 7, 8) were systematically varied and presented with nested loops: the outer loop was for rise times 0.1–1.5 ms in steps of 0.1 ms; the next loop was for charges 100–700 fC in steps of 100 fC; the inner loop was for plateau durations 0–5 ms in steps of 0.5 ms. For the EPSC-like shapes 1–4 (see Fig. 6), stimuli were presented in order of increasing amplitude for each shape separately in the sequence 2, 3, 1, 4. Repetitions were looped after a full stimulus set was presented. The procedure was then repeated from different baseline potentials. Stimuli were delivered at 10 Hz.
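To see why the four parameters fix the amplitude, note that the charge of a waveform with a linear rise, a flat plateau, and an exponential decay is Q = A · (trise/2 + tplateau + τdecay), so A = Q / (trise/2 + tplateau + τdecay). The sketch below applies this relation and samples the waveform. The function names, the 20 μs sampling step, and the truncation of the decay after five time constants are assumptions made for illustration; the current is expressed as a positive magnitude.

```python
import math


def epsc_amplitude(charge_fC, rise_ms, plateau_ms, tau_decay_ms):
    """Peak amplitude in pA (fC / ms = pA) implied by the four shape parameters."""
    return charge_fC / (0.5 * rise_ms + plateau_ms + tau_decay_ms)


def epsc_waveform(charge_fC, rise_ms, plateau_ms, tau_decay_ms,
                  dt_ms=0.02, n_tau=5.0):
    """Sampled EPSC-like waveform: linear rise, plateau, exponential decay.
    Assumes rise_ms > 0; the decay is truncated after n_tau time constants."""
    amp = epsc_amplitude(charge_fC, rise_ms, plateau_ms, tau_decay_ms)
    duration = rise_ms + plateau_ms + n_tau * tau_decay_ms
    n = int(round(duration / dt_ms))
    wave = []
    for k in range(n + 1):
        t = k * dt_ms
        if t < rise_ms:
            wave.append(amp * t / rise_ms)
        elif t <= rise_ms + plateau_ms:
            wave.append(amp)
        else:
            wave.append(amp * math.exp(-(t - rise_ms - plateau_ms) / tau_decay_ms))
    return wave
```

For instance, a charge of 300 fC with a 0.2 ms rise, no plateau, and a 0.9 ms decay time constant corresponds to a peak of 300 pA; these particular values are chosen only to make the arithmetic transparent.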
Two-compartment model.
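We averaged the initial voltage responses to small subthreshold depolarizing current steps and fitted them to the peak (first 3–10 ms, which depended upon the baseline potential, Vbase) with a double-exponential curve as follows:

$$V(t) = V_0 - I \left[ (R_{fast} + R_a)\, e^{-t/\tau_{fast}} + R_{slow}\, e^{-t/\tau_{slow}} \right] \qquad (1)$$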
where I is the difference between the total injected current and the holding current, Ra is the access resistance measured in voltage clamp, and V0 is the voltage of the fit function at steady state (see Fig. 4E). This provided two time constants, τfast and τslow, with the corresponding resistances Rfast + Ra and Rslow, respectively. The values for τfast, τslow, Rfast, and Rslow ranged between 0.07 and 0.4 ms, 0.7 and 4.0 ms, 40 and 120 MΩ, and 220 and 470 MΩ, respectively, depending on the recording. Within those ranges, larger values were found when Vbase was more hyperpolarized. These values were used to construct an electrical circuit composed of 2 compartments, characterized by the resistances, R1, R2, and Raxial; and the capacitances, C1 and C2 (see Fig. 7B). Raxial is the resistance connecting the two compartments. The electrode is connected to compartment 1. The membrane potentials at the two compartments, V1(t) and V2(t), are described by the following system of ordinary differential equations:
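$$C_1 \frac{dV_1}{dt} = -\frac{V_1 - V_{base}}{R_1} - \frac{V_1 - V_2}{R_{axial}} + I(t), \qquad C_2 \frac{dV_2}{dt} = -\frac{V_2 - V_{base}}{R_2} + \frac{V_1 - V_2}{R_{axial}} \qquad (2)$$

where Vbase was the measured average over 1 ms before stimulus onset, and I(t) is the difference between the total injected current and the holding current at time t. Differential equations were solved with a ninth order explicit Runge–Kutta scheme implemented in the method NDSolve in Mathematica 7.0.1.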
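The passive two-compartment circuit can be sketched numerically as follows. This is an illustration under stated assumptions, not the implementation used in the study: it writes the circuit as C1 dV1/dt = −(V1 − Vbase)/R1 − (V1 − V2)/Raxial + I(t) and C2 dV2/dt = −(V2 − Vbase)/R2 + (V1 − V2)/Raxial, and it integrates with a simple forward-Euler step in place of the ninth-order Runge–Kutta scheme. The function name and the parameter values in the usage note are hypothetical.

```python
def simulate_two_compartment(i_pA, dt_ms, r1, r2, r_axial, c1, c2, v_base=-80.0):
    """Forward-Euler integration of a passive two-compartment circuit.

    Units: resistances in GOhm, capacitances in pF, current in pA,
    voltage in mV, time in ms (GOhm * pF = ms; pA * GOhm = mV).
    Current is injected into compartment 1. Returns a list of (V1, V2).
    """
    v1 = v2 = v_base
    trace = []
    for i_t in i_pA:
        dv1 = (-(v1 - v_base) / r1 - (v1 - v2) / r_axial + i_t) / c1
        dv2 = (-(v2 - v_base) / r2 + (v1 - v2) / r_axial) / c2
        v1 += dt_ms * dv1
        v2 += dt_ms * dv2
        trace.append((v1, v2))
    return trace
```

As a usage example with assumed values (R1 = 0.5 GΩ, R2 = 1 GΩ, Raxial = 0.1 GΩ, C1 = 1 pF, C2 = 3 pF), a 10 pA step settles at V1 − Vbase = I · R1(Raxial + R2)/(R1 + Raxial + R2) ≈ 3.44 mV and V2 − Vbase ≈ 3.13 mV, the steady-state values expected from the resistive divider.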
Since only four parameters can be derived from the double-exponential voltage response measured at the bouton, but five parameters are required to define the circuit, an infinite number of two-compartment models can reproduce the data. To estimate the range of possible membrane potentials at the second compartment, three scenarios were considered (see Fig. 9C). In the first, most conservative scenario (used for Figs. 7 and 8), we assumed that the specific membrane resistance rm and capacitance cm are the same in both compartments. Then, with S1 and S2 being the membrane surfaces of compartments 1 and 2, respectively, R1C1 = rm/S1 · cmS1 = rm cm = rm/S2 · cmS2 = R2C2 = τslow, and the amplitude of the slow component response is the same in both compartments. In the second scenario, we assumed R2 to be infinite (Mennerick et al., 1995). In this case, no current flows to the second compartment at steady state, which results in no voltage difference between the two compartments. In the third scenario, we assumed R1 to be infinity, which yielded the largest voltage difference between the two compartments at steady state. For these three scenarios, the compartment parameters were deduced analytically from the measured τfast, τslow, Rfast, and Rslow. In all three scenarios:
where A = Rslow · τfast + Rfast · τslow.
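As a consistency check on the role of A, one relation holds for this circuit irrespective of the scenario. It is derived here under the assumptions that the Ra contribution has been removed from the measured response, that the current step I begins at t = 0, and that both compartments start at Vbase. At t = 0+ no current yet flows through R1 or Raxial, so all of the injected current charges C1. This is a derivation from the circuit as described, and is not necessarily written in the form used by the authors.

```latex
\begin{align}
V_1(t) - V_{base} &= I\left[ R_{fast}\left(1 - e^{-t/\tau_{fast}}\right)
                     + R_{slow}\left(1 - e^{-t/\tau_{slow}}\right) \right] \\
\left.\frac{dV_1}{dt}\right|_{t=0^+}
  &= I\left( \frac{R_{fast}}{\tau_{fast}} + \frac{R_{slow}}{\tau_{slow}} \right)
   = \frac{I}{C_1} \\
C_1 &= \frac{\tau_{fast}\,\tau_{slow}}{R_{slow}\,\tau_{fast} + R_{fast}\,\tau_{slow}}
     = \frac{\tau_{fast}\,\tau_{slow}}{A}
\end{align}
```

The capacitance of the first compartment is therefore fixed by the four measured quantities alone, whereas the remaining circuit elements depend on the additional assumption made in each scenario.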
For the first scenario (R1C1 = R2C2):
where δτ = τslow − τfast.
For the second scenario (R2 → ∞):
where S = Rfast + Rslow and P = Rfast · Rslow.
For the third scenario (R1 → ∞):
where B = Rslow · τfast² + Rfast · τslow².
SGN voltage responses did not appear entirely linear, but the responses to 5–20 pA steps were most linear from Vbase = −95 ± 5 mV, so results in the text for τfast, τslow, Rfast, Rslow, and the solutions for the two-compartment circuit were based upon those responses, assuming the first scenario (above). Using voltage-clamp traces, the two-compartment model provided an estimate of Ra (Pandey and White, 2002). This estimate (34 ± 6 MΩ; n = 8) was not significantly smaller than the estimate based upon measurement using a one-compartment model of the data (Table 1).
Leaky integrate-and-fire models.
First, we considered the leaky integrate-and-fire (LIF) neuron model (Lapicque, 1907; Stein, 1967; Knight, 1972). The evolution of the voltage is passive as described in Equation 2. A spike is emitted with a fixed delay D after the membrane potential Vi(t) reached threshold VTh. To approximate effects of subthreshold voltage-gated Na+ channel activation, we also considered the exponential leaky integrate-and-fire (EIF) neuron model (Fourcaud-Trocmé et al., 2003). Here, the evolution of the membrane potential is given by adding a voltage-dependent intrinsic current ΔT · exp[(Vi(t) − VT)/ΔT] on the right-hand side of the differential equations (Eq. 2) for the active compartment(s) i. VT is the maximum steady-state voltage at which the active compartment can remain without spiking in the presence of a constant injected current, and ΔT is the spike slope factor, which characterizes the sharpness of spike initiation (i.e., the voltage range over which Na+ channels activate). Due to the supralinear dependence on Vi of the spike-generating intrinsic current, the membrane potential diverges to infinity in a finite time once enough current is injected. A spike occurs with a fixed delay D after the membrane potential Vi(t) reached VT + 10 · ΔT in the active compartment. For ΔT = 0 mV, the EIF model reduces to the LIF. For the LIF, the voltage at the first compartment is always higher than the voltage at the second, because the current is injected into the first compartment. Therefore, having the spike generator only in the first compartment is equivalent to having it in both, if one assumes the same threshold voltage VTh in both compartments. For the EIF, having the spike generator only in the first compartment would imply that only the first compartment is active, which is a physiologically unreasonable assumption. Therefore, we added the LIF or EIF mechanism to either compartment 2 or to both compartments.
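The spike criterion described above can be illustrated with a minimal sketch that places the EIF mechanism in compartment 2 of a two-compartment circuit. Several choices here are assumptions rather than details from the text: the exponential term is scaled as (ΔT/R2) · exp[(V2 − VT)/ΔT] so that it has units of current, integration uses forward Euler, and all names and parameter values are hypothetical. Setting ΔT = 0 removes the exponential term and leaves a LIF with threshold VT.

```python
import math


def eif_first_spike_latency(i_pA, dt_ms, r1, r2, r_axial, c1, c2,
                            v_base, v_t, delta_t, delay_ms=0.0):
    """First-spike latency (ms) for a two-compartment model with an EIF
    mechanism in compartment 2, or None if no spike occurs.

    Units: GOhm, pF, pA, mV, ms. A spike is declared when V2 reaches
    v_t + 10 * delta_t; the fixed delay `delay_ms` is then added.
    """
    v1 = v2 = v_base
    v_spike = v_t + 10.0 * delta_t
    for k, i_t in enumerate(i_pA):
        if delta_t > 0.0:
            i_exp = (delta_t / r2) * math.exp((v2 - v_t) / delta_t)
        else:
            i_exp = 0.0  # LIF limit: purely passive below threshold
        dv1 = (-(v1 - v_base) / r1 - (v1 - v2) / r_axial + i_t) / c1
        dv2 = (-(v2 - v_base) / r2 + (v1 - v2) / r_axial + i_exp) / c2
        v1 += dt_ms * dv1
        v2 += dt_ms * dv2
        if v2 >= v_spike:
            return (k + 1) * dt_ms + delay_ms
    return None
```

With illustrative parameters (R1 = 0.5 GΩ, R2 = 1 GΩ, Raxial = 0.1 GΩ, C1 = 1 pF, C2 = 3 pF, Vbase = −80 mV, VT = −60 mV, ΔT = 2 mV), a 10 pA step remains subthreshold, whereas 100 and 300 pA steps both evoke a spike, with the larger current giving the shorter latency.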
For the LIF neuron model to describe the measured voltage response from a particular baseline potential, only one free parameter needed to be assigned, the threshold VTh. The EIF model had two free parameters, VT and ΔT. We determined these parameters so that the models gave the best predictions of the data, in terms of both spike occurrence and latency. Using the voltage responses to hundreds of different EPSC-like stimuli, we calculated the “raw” model latencies (RLi) to threshold potential VTh (for the LIF) or to VT + 10 · ΔT (for the EIF). Then, for the n spikes predicted by the model and existing in the recording, we calculated the rms latency prediction error δL as follows:
$$\delta L = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( ML_i - PL_i \right)^2}$$

where MLi are the measured spike onset latencies (as described above) and PLi are the predicted latencies, defined as PLi = RLi + D (see Fig. 7C,D). The term D is the fixed delay we added to all the “raw” model latencies RLi to obtain the predicted latencies PLi so that the mean prediction error was zero:

$$D = \frac{1}{n} \sum_{i=1}^{n} \left( ML_i - RL_i \right)$$
With the spike generator in the second compartment, D ranged from 0.0 to 0.27 ms for the LIF and from −0.11 to 0.10 ms for the EIF. D was determined by minimizing δL for each combination of cell and Vbase. In some cases, a negative value was found for D because the measured spike onset occurred before the voltage in the EIF model reached VT + 10 · ΔT. We also calculated the fraction of correctly predicted spike occurrences, F = 1 − (E + M)/N, where E is the number of extra spikes predicted by the model, M is the number of missed spikes, and N is the total number of stimuli.
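The three summary quantities can be sketched in a few lines. The delay D that makes the mean prediction error zero is the mean difference between measured and raw latencies, which is also the value that minimizes the rms error; δL is then the rms of the residuals, and F follows directly from its definition. The function names are hypothetical.

```python
import math


def fit_delay_and_rms(measured, raw):
    """Return (D, delta_L): the fixed delay giving zero mean error, and the
    rms latency prediction error after adding D to the raw latencies."""
    n = len(measured)
    d = sum(m - r for m, r in zip(measured, raw)) / n
    rms = math.sqrt(sum((m - (r + d)) ** 2 for m, r in zip(measured, raw)) / n)
    return d, rms


def fraction_correct(extra, missed, total):
    """F = 1 - (E + M) / N."""
    return 1.0 - (extra + missed) / total
```

For example, measured latencies of 0.5, 0.6, and 0.9 ms with raw model latencies of 0.4, 0.55, and 0.75 ms give D = 0.1 ms and δL ≈ 0.041 ms; these numbers are made up solely to illustrate the calculation.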
To find the best VTh for the LIF model, we calculated F and δL at 0.1 mV steps for VTh between Vbase and Vbase + 50 mV. To find the best VT and ΔT for the EIF model, we calculated F and δL at 0.1 mV steps for VT between VTh − 5 mV and VTh + 5 mV, and for ΔT between 0 and 5 mV. Percentage improvement was calculated as 100 · [1 − δL(model1)/δL(model2)], and then compared against zero for statistical significance as assessed with the Z test.
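The parameter search and the improvement measure can be sketched as below. The grid and the percentage formula follow the description above. The rule for combining F and δL into a single choice (highest F first, then lowest δL) is an assumption introduced for illustration, as are the function names and the `score` callback.

```python
def threshold_grid(v_base, span_mV=50.0, step_mV=0.1):
    """Candidate thresholds from v_base to v_base + span in fixed steps."""
    n = int(round(span_mV / step_mV))
    return [v_base + k * step_mV for k in range(n + 1)]


def best_threshold(candidates, score):
    """Pick the candidate with the highest F, breaking ties by the lowest
    delta_L. `score(v)` must return the pair (F, delta_L)."""
    scored = [(score(v), v) for v in candidates]
    return min(scored, key=lambda sv: (-sv[0][0], sv[0][1]))[1]


def percent_improvement(rms_model1, rms_model2):
    """100 * [1 - delta_L(model1) / delta_L(model2)]."""
    return 100.0 * (1.0 - rms_model1 / rms_model2)
```

A value of 50, for instance, means that the rms latency error of model 1 is half that of model 2.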
Results
To characterize action potential generation at the first auditory synapse, we performed whole-cell patch-clamp recordings from the postsynaptic boutons of SGNs, where their peripheral neurites contact IHCs. Experiments were done in the acutely explanted cochlear apex of rats before and after the onset of hearing at room temperature (Fig. 1B). In the absence of an experimentally applied stimulus, we observed spontaneous action potentials in SGNs (Fig. 1C), which are thought to be initiated by Ca2+-dependent synaptic transmission from hair cells both in vitro [rat (Glowatzki and Fuchs, 2002); frog (Keen and Hudspeth, 2006); fish (Trapani and Nicolson, 2011)] and in vivo [cat (Liberman and Kiang, 1978; Sewell, 1984); chinchilla (Siegel and Relkin, 1987); guinea pig (Robertson and Paki, 2002)]. When action potentials originated from baseline potentials of approximately −80 mV, mean spike heights ranged from 30 to 90 mV and spike FWHM ranged from 0.42 to 1.5 ms between cells (Table 1).
Figure 1. Type I spiral ganglion neuron physiology in the acutely explanted organ of Corti with cochlear ganglion. A, Schematic cross-section of the cochlear explant, showing the organ of Corti (right) and the peripheral neurites of type I SGNs (one highlighted in black) connected to somata in the cochlear ganglion (left). Each type I SGN is nonbranched and receives input from one IHC presynapse via a single, postsynaptic bouton-type connection. The synaptic region at the base of the IHC is enlarged in B. B, Intracellular patch-clamp recordings were made from the postsynaptic boutons, connected to their nonmyelinated neurites ≈30 μm in length, which exit the organ of Corti at the foramina nervosa. Beyond this small perforation in the basilar membrane (not on the scheme) are the NaV1.6-positive heminodes and the beginnings of the myelinated segments, and then several nodes of Ranvier before the cochlear ganglion. Due to the approach vector of the electrode, recorded boutons most likely innervated the hemisphere of the IHC that faced the cochlear ganglion (i.e., the 4 SGNs highlighted in black). Drawings are approximately to scale. C, In current-clamp mode, excitatory stimuli (e.g., arrow) were injected into boutons to measure the time course of depolarization and spike generation (top trace, left). Ca2+-dependent synaptic transmission (e.g., arrowheads) from IHCs evoked EPSPs and so-called spontaneously occurring action potentials. D, In the same recording in voltage-clamp mode, neurotransmitter release events from the IHC were measured as EPSCs of variable size and shape.
The distribution of spike heights within an individual cell was unimodal (Fig. 2A,B), with a slight tail at lower amplitudes due to infrequent “second” spikes occurring within 10 ms after the preceding spike. Varying the baseline potential had a large effect on spike height and peak voltage (Fig. 2B, right). From a baseline potential of −65 mV, spikes were vanishingly small and difficult to detect because of the relatively large size and the rapid onset of the underlying EPSPs. These observations are consistent with steady-state inactivation of voltage-gated Na+ channels observed in voltage-clamp recordings of acutely dissociated somata of SGNs, in which a shift from −80 to −60 mV inactivated ≈80% of the Na+ current (Santos-Sacchi, 1993).
Figure 2. Nearly every neurotransmitter release event triggered a spike. A, Synaptic transmission from an IHC evoked spikes in a SGN under current clamp. The arrowhead marks an EPSP that failed to evoke a spike. Age, P19; 356 spikes and 7 failures in 60 s; holding current, −10 pA. B, Left, Distribution of spike heights, measured from a mean baseline potential of −83 mV. Right, Effect of baseline potential on spike height and peak voltage. C, Voltage-clamp recording from the same SGN. Ongoing EPSCs (enlarged in inset) had maximum amplitudes of ∼500 pA. EPSCs measured in voltage-clamp mode did not evoke action currents, which could be differentiated by their shape and larger amplitude. In contrast, a step to −65 mV from the holding potential of −80 mV evoked a stimulus artifact and an action current (arrowheads). D, Interval distributions for action potentials (black) and EPSCs (gray) measured in interleaved current-clamp and voltage-clamp recordings (3 × 20 s in each mode) demonstrate irregular timing; the distributions are similar and suggestive of a Poisson process. Bin size, 50 ms. Overlaid, cumulative probability density functions for spikes and EPSCs were well fit by single exponentials with τ = 151 ms and τ = 182 ms, respectively.
We found a mean resting potential of −72 ± 5 mV, comparable with literature values for somatic recordings (−54 to −77 mV) (for review, see Rusznák and Szucs, 2009). Relatively depolarized resting potentials were reported for SGN boutons from rats before hearing onset (Yi et al., 2010), which is consistent with the hyperpolarizing trend in resting potential observed over development in SGN somata recorded in cochlear slices from the rat (Jagger and Housley, 2002, 2003).
IHC exocytosis evokes SGN spikes with great success
It has been proposed that, at low rates, nearly every discrete neurotransmitter release event from the IHC is sufficient to evoke a spike in the SGN, initiating spontaneous auditory nerve activity in vivo (Siegel, 1992). To assess the efficacy of synaptic transmission for spike generation directly at the synapse, we quantified the success rate by counting the number of subthreshold (unsuccessful) EPSPs and comparing it to the total count of excitatory postsynaptic events (spikes plus subthreshold EPSPs) (Figs. 2A, 3A). Also, we compared spike rates in current clamp to the EPSC rates in voltage clamp from interleaved recording segments (Fig. 2C,D).
Figure 3. Ribbon synapse exocytosis evoked spikes with variable latency. A, Histogram of EPSP maximum slopes for IHC-evoked spikes (mean, 148 ± 75 mV/ms; n = 333). Successful EPSPs are color coded from slowest (blue) to fastest (red). Four slow EPSPs failed to evoke a spike (black). B, Histogram of spike onset latencies measured from EPSP onset (mean, 0.59 ± 0.3 ms; n = 333). C, Scatter plot of spike onset latency versus EPSP maximum slope for IHC-evoked spikes (crosses), color-coded as in A. The larger crosses correspond to the short-latency (red) and long-latency (blue) IHC-evoked spikes in D. The large circles are the CC-evoked spikes in D. D, Comparison of IHC-evoked spikes (solid lines) and CC-evoked spikes (dashed lines). Left, Similar IHC-evoked and CC-evoked spikes with long latencies (1.09 and 1.16 ms) and small EPSP maximum slopes (34 and 46 mV/ms). Right, Similar IHC-evoked and CC-evoked spikes with short latencies (0.36 and 0.34 ms) and large EPSP maximum slopes (230 and 228 mV/ms). Short-latency IHC-evoked spikes were smaller in height and thinner at the peak when overshooting 0 mV, consistent with the presence of greater synaptic conductance compared with long-latency IHC-evoked spikes. Bottom, The stimuli for CC-evoked spikes. E, Membrane potential slope versus membrane potential for IHC-evoked spikes (n = 161). The asterisks mark the onset of the spike segments for one fast and one slow EPSP. Each trace is color-coded according to its EPSP maximum slope (as in A and C). The bold traces correspond to the IHC-evoked spikes (solid) and CC-evoked spikes (dashed) in D. Smaller EPSP maximum slopes were associated with larger spike slopes and vice versa. Data are from one SGN aged P19.
In recordings from three boutons (P17–P19) that exhibited prolonged, stable, and abundant (3–7 Hz) spontaneous activity, the interevent intervals for EPSCs and spikes were Poisson distributed (Fig. 2D, one cell), as expected for SGNs after the onset of hearing (Kiang et al., 1965; Grant et al., 2010). In one P19 recording, 97% of EPSPs (1715 of 1767 in 585 s) evoked a spike. In that case the spike rate was 2.9 Hz and EPSCs occurred at 3.1 Hz (560 in 181 s). In another recording (P17), 81% of EPSPs (496 of 614 in 200 s) evoked a spike. There, the spike rate was 2.5 Hz and the EPSC rate 3.2 Hz (518 in 160 s). Such low failure rates indicated that most synaptic events were indeed sufficient to trigger an action potential from the zero-current potential in SGNs from hearing rats.
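The success rates and event rates quoted above follow directly from the reported counts and durations, as the short check below illustrates. The helper names are hypothetical; the counts are those given in the text.

```python
def success_rate_percent(spikes, epsps):
    """Percentage of EPSPs that evoked a spike."""
    return 100.0 * spikes / epsps


def rate_hz(count, duration_s):
    """Mean event rate in Hz."""
    return count / duration_s


# P19 recording: 1715 of 1767 EPSPs evoked a spike in 585 s; 560 EPSCs in 181 s.
p19 = (success_rate_percent(1715, 1767), rate_hz(1715, 585.0), rate_hz(560, 181.0))

# P17 recording: 496 of 614 EPSPs evoked a spike in 200 s; 518 EPSCs in 160 s.
p17 = (success_rate_percent(496, 614), rate_hz(496, 200.0), rate_hz(518, 160.0))
```

Rounded, these evaluate to 97%, 2.9 Hz, and 3.1 Hz for the P19 recording, and 81%, 2.5 Hz, and 3.2 Hz for the P17 recording.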
In contrast, two recordings from boutons before the onset of hearing showed lower EPSP rates (0.2 Hz or less) and generally higher spike failure rates, similar to the observations of Yi et al. (2010). In one of these boutons (P11), in total only 50% of EPSPs evoked a spike. However, over some periods the success rate increased up to 94% as a distinct EPSP burst pattern emerged, reminiscent of the periods of enhanced neurotransmitter release due to presynaptic spiking of the immature IHCs observed by Tritsch et al. (2007).
To summarize so far, with maturation (from P11 to P19), we observed an apparent increase in the reliability with which IHC neurotransmitter release triggered a spike in SGNs. While the low number of recordings from prehearing animals in the present study prevents statistical comparison, our results are consistent with the previously reported developmental upregulation of synaptic transmission around hearing onset (Grant et al., 2010). The responses of mature SGNs were homogeneous with regard to the high success rates for spikes in response to synaptic events. While this finding is consistent with the literature (Siegel, 1992) and with the experiments described below, its general significance will need to be tested by recordings of SGNs from different tonotopic locations and other positions of innervation around the IHC. Due to technical difficulties imposed by the modiolar bone and the structure of the organ of Corti, the current study may have overlooked heterogeneity because it was limited to recordings from the apical turn of the cochlea and primarily sampled SGNs innervating the modiolar surfaces of IHCs.
Influence of EPSP kinetics on action potential latency
Discrete events of neurotransmitter release from individual active zones of cochlear IHCs are known to exhibit substantial variability in their amplitude and kinetics (Glowatzki and Fuchs, 2002; Grant et al., 2010; Yi et al., 2010). However, the impact of this heterogeneity on spike generation in the first auditory neuron is unclear. To address this, we assessed the coupling between synaptic input and spike generation using the maximum slope of the spike prepotential (EPSP maximum slope) as a proxy for the underlying synaptic conductance. Relative to the timing of exocytosis, what is the latency to spike onset and how large is its variance?
We examined the spikes triggered by EPSPs from IHCs (called “IHC-evoked spikes”) and found that the voltage time courses preceding spikes were highly variable. First, the EPSP maximum slopes ranged from 20 to 320 mV/ms, with a somewhat bimodal distribution (Fig. 3A). Second, the intervals from EPSP onset to spike onset ranged from 0.3 to 3 ms (Fig. 3B), but >98% of them ranged from 0.3 to 1.5 ms. The mean latency was ∼0.6 ms and its SD was 0.3 ms [coefficient of variation (CV) ≈ 0.5]. Larger EPSP maximum slopes were associated with shorter latencies. The scatter plot of latency versus EPSP maximum slope (Fig. 3C) illustrates the strong negative correlation between latency and EPSP maximum slope (r = −0.94 in logarithmic scale) (see also Fig. 8G). Thus, EPSP kinetics had a large effect on the time course of action potential generation in SGNs.
From the scatter plot (Fig. 3C), we chose two IHC-evoked spikes as examples of short- and long-latency spikes, and, for comparison, we selected two spikes evoked by current-clamp injection (called “CC-evoked spikes”) that closely resembled the IHC-evoked spikes. Short-latency spikes were preceded by rapid initial depolarization followed by a short period of slowing before the spike onset (Fig. 3D, right). Long-latency spikes had much smaller EPSP slopes (Fig. 3D, left). Figure 3D, right, also demonstrates the effect on spike waveform of a large synaptic conductance compared with an extrinsic current (note the truncated peak of the IHC-evoked spike as the potential exceeds the reversal potential of the synaptic conductance, 0 mV) (Fatt and Katz, 1951). Phase plots provide a convenient way to visualize EPSP and action potential slopes as a function of the membrane potential (time derivative of Vm vs Vm). Figure 3E shows the phase plots for 161 IHC-evoked spikes and the two CC-evoked spikes from Figure 3D. The relatively large maximum slopes of the action potentials we observed (400–500 mV/ms) are comparable with maximum slopes of action potentials measured in pyramidal cell somata but are approximately one-half as large as action potential slopes measured at the axon initial segment (Kole et al., 2008).
The membrane potential measured at spike onset varied significantly (from −65 to −35 mV) with EPSP maximum slope. This is exemplified by the asterisks marking the beginning of two spike segments in the dVm/dt versus Vm traces in Figure 3E. We hypothesize that this resulted predominantly from a variable voltage drop along the axial resistance of the neurite between the synapse and the adjacent heminode before spike initiation, rather than from a variable spike threshold potential. The voltage drop is proportional to the amplitude of the axial current between the two locations. Therefore, the small synaptic conductance preceding long-latency spikes (Fig. 3, blue) would produce relatively little difference between the measured spike onset potential and the threshold potential at the spike generator. In contrast, for EPSPs with large maximum slope (Fig. 3, red), the difference would be larger. To test this idea, we used the data below and constructed a two-compartment model of the neuron (see Figs. 6–9).
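The proportionality between axial current and voltage drop is simply Ohm's law. As a rough illustration, the sketch below uses the mean axial resistance reported for the two-compartment fits (183 MΩ); the axial current values are hypothetical and were chosen only to show the scaling, not taken from the recordings.

```python
def axial_voltage_drop_mv(axial_current_pa, r_axial_mohm):
    """Ohm's law: pA * MOhm gives microvolts, so divide by 1000 for mV."""
    return axial_current_pa * r_axial_mohm / 1000.0


R_AXIAL_MOHM = 183.0  # mean Raxial from the two-compartment fits in the text

# Hypothetical axial currents (pA) standing in for small and large synaptic events.
for i_pa in (20.0, 100.0):
    drop = axial_voltage_drop_mv(i_pa, R_AXIAL_MOHM)
    print(f"{i_pa:.0f} pA across {R_AXIAL_MOHM:.0f} MOhm: {drop:.1f} mV")
```

Under these assumed currents the difference between the potential at the recording site and at the spike generator grows from a few millivolts to well over 10 mV, which is the direction of the effect proposed in the text.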
Figure 3 demonstrated that, although nearly every presynaptic release event was sufficient to evoke a spike in mature SGNs, response latencies were quite variable (from 0.3 to 3 ms). Because precise spike timing is important for representation of sound, we sought to better understand the relationship between postsynaptic excitation and spike onset latency. What are the SGN firing properties and what is required to trigger a spike? How might the resting conductance and baseline potential affect SGN excitability? How similar are the intrinsic response properties among SGNs innervating the modiolar face of IHCs in the cochlear apex?
Apical SGNs innervating the modiolar face exhibit phasic responses and have a low rheobase
Auditory nerve fibers are able to fire at hundreds of spikes per second in vivo and phase lock with submillisecond precision. Here, to investigate SGN intrinsic properties that might support an accurate and precise temporal code, we tested the excitability of SGNs in vitro by injecting square pulses of current into the postsynapse (Fig. 4A–E).
Spiral ganglion neurons respond as phasic high-pass filters with low rheobase. A1, Depolarizing current steps evoked one spike (and rarely a second, smaller spike) near stimulus onset (A2). Hyperpolarizing current steps evoked inward rectification, and rebound spikes at stimulus offset (A3). B, Top, V–I relationships: steady-state voltage (VSS) as a function of current amplitude for three SGNs (VSS estimated as mean Vm over last 20 ms of 200 ms current-clamp steps, as in A1). Bottom, Steady-state slope resistance (Rslope) as a function of current amplitude. Zero-current potentials (denoted by perpendicular gray dotted lines) were between −80 and −70 mV, and corresponded roughly to the peak in Rslope. C, Strength–duration functions for eight SGNs stimulated from baseline potentials (Vbase) of approximately −80 mV. Rheobase ranged from 35 to 75 pA. Current pulses were applied at 5–20 pA increments. The dotted lines in C and D show the 50 pA level, and insets show double-log scale. D, Strength–duration functions for one SGN from different Vbase, demonstrating similar rheobase but increased chronaxie with hyperpolarized Vbase. E, Averages of subthreshold responses to 20 pA current steps on absolute and relative scales exhibit a smaller membrane time constant with depolarization of Vbase, due to activation of hyperpolarizing membrane conductance. The dashed lines are double-exponential fits up to the response peak: τfast and τslow were 0.09 and 0.74 ms from −72 mV; 0.09 and 0.81 ms from −82 mV; 0.16 and 3.0 ms from −95 mV. F, Bottom, Ramp stimuli from −35 to +50 pA, for durations of 1–19 ms in 2 ms increments. Top, Ramps briefer than 9 ms failed to generate spikes at these small current levels, demonstrating dependence of spike generation on a minimum charge. Ramps >15 ms also failed, demonstrating dependence of spike generation on a minimum rate of depolarization. G, Ramp from −100 to +100 pA in 100 ms (bottom) demonstrates the increase in membrane conductance during subthreshold depolarization, due to K+ current activation before spike onset (top). The arrow labels a spontaneous spike. The dotted line marks the spike onset potential of −68 mV.
SGNs predominantly fired only one spike at stimulus onset in response to sustained stimuli of any strength or duration. Two of 12 SGNs (P11 and P14) fired a second, smaller spike within a few milliseconds of the first, submillisecond latency spike (Fig. 4A1,A2). This firing behavior has been termed class III excitability, phasic, or single spiking (Hodgkin, 1948; Izhikevich, 2007; Prescott et al., 2008). Here, we refer to it as phasic; it resembled the phasic or rapidly adapting class of isolated SGN somata (Mo and Davis, 1997a; Mo et al., 2002; Lv et al., 2010) and contrasted with the slowly adapting class seen in some somatic SGN recordings (Adamson et al., 2002). Hyperpolarizing current pulses evoked inward rectification (Fig. 4A1), indicative of the depolarizing current activated by hyperpolarization (Ih), which has been studied in SGN somata (Chen, 1997) and in their peripheral nonmyelinated neurites in the organ of Corti (Yi et al., 2010). After cessation of strong hyperpolarizing pulses, rebound action potentials were observed (Fig. 4A3).
To better understand the resting membrane conductance and potential (Table 1), we studied the steady-state membrane potential as a function of injected current (Fig. 4B, top). The derivative of this V–I relationship gave us the steady-state membrane slope resistance (Rslope). Rslope was greatest near the zero-current potential and became smaller upon hyperpolarization or depolarization (Fig. 4B, bottom). These effects of hyperpolarization or depolarization were due in part to activation of Ih or low-voltage activated K+ currents, respectively [KV1.1 (Mo et al., 2002); KV7.4 (Lv et al., 2010)]. Thus, the zero-current potential (i.e., the resting membrane potential) corresponded to a potential of low membrane conductance. From this potential, voltage activation of Ih or IK might stabilize the SGN baseline potential by moving it back toward the zero-current potential, keeping it within a narrow range. Since the resting potential for SGNs in vivo is unknown, in this study we injected identical stimuli from several baseline potentials (Vbase) to assess the effect of Vbase on spike generation within cells (see below).
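For readers who wish to reproduce this kind of analysis, the slope resistance can be estimated as a finite-difference derivative of the steady-state V–I curve. The sketch below is one simple way to do so; the V–I points are hypothetical values invented for illustration (they are not measured data), and the function name is likewise an assumption.

```python
def slope_resistance_mohm(currents_pa, voltages_mv):
    """Finite-difference dV/dI between neighboring V-I points.

    mV / pA equals GOhm, so the ratio is multiplied by 1000 to give MOhm.
    Returns (midpoint current in pA, slope resistance in MOhm) per interval.
    """
    result = []
    for k in range(len(currents_pa) - 1):
        d_i = currents_pa[k + 1] - currents_pa[k]
        d_v = voltages_mv[k + 1] - voltages_mv[k]
        midpoint = 0.5 * (currents_pa[k] + currents_pa[k + 1])
        result.append((midpoint, 1000.0 * d_v / d_i))
    return result


# Hypothetical steady-state V-I points, for illustration only.
currents = [-40.0, -20.0, 0.0, 20.0, 40.0]    # pA
voltages = [-92.0, -85.0, -75.0, -68.0, -65.0]  # mV

for midpoint, r_slope in slope_resistance_mohm(currents, voltages):
    print(f"I = {midpoint:+.0f} pA: Rslope = {r_slope:.0f} MOhm")
```

With these made-up points the slope resistance peaks near zero current and falls off in both directions, mirroring the qualitative shape described for Figure 4B.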
One way to characterize and compare neuronal excitability between individual cells is to measure the minimum current amplitudes required to elicit an action potential for different pulse durations. Two characteristics of the resulting strength–duration relationships (Shepherd et al., 2001) are rheobase (the current threshold as pulse duration approaches infinity) and chronaxie (the duration required to trigger a spike at a current level twice rheobase). The strength–duration relationships were similar for the eight cells tested, and rheobase ranged from 35 to 75 pA between them (Fig. 4C). While our rheobase estimates were similar to current thresholds determined in SGN somata, the spike latencies were much briefer than those from somatic recordings (Mo and Davis, 1997a; Mo et al., 2002) because less stimulus charge was required to evoke a spike. This observation suggests that spike initiation in our experiments did not require depolarization of the soma to action potential threshold.
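The definitions of rheobase and chronaxie can be made concrete with a standard strength–duration formula. The sketch below uses Weiss's law, I(d) = Irh · (1 + tch/d), purely as an illustrative assumption; the source does not state which function, if any, was fitted to the data. The rheobase of 46 pA is the mean reported in the text, whereas the chronaxie of 0.3 ms is a hypothetical placeholder.

```python
def weiss_threshold_pa(duration_ms, rheobase_pa, chronaxie_ms):
    """Threshold current for a square pulse of given duration (Weiss's law)."""
    return rheobase_pa * (1.0 + chronaxie_ms / duration_ms)


RHEOBASE_PA = 46.0    # mean rheobase reported in the text
CHRONAXIE_MS = 0.3    # hypothetical value, for illustration only

for d_ms in (0.1, 0.3, 1.0, 10.0, 100.0):
    i_th = weiss_threshold_pa(d_ms, RHEOBASE_PA, CHRONAXIE_MS)
    print(f"pulse {d_ms:>5.1f} ms: threshold {i_th:.1f} pA")
```

By construction, the threshold equals twice the rheobase when the pulse duration equals the chronaxie, and it approaches the rheobase for long pulses.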
The mean rheobase was 46 ± 11 pA (n = 8) when current was injected from a mean baseline potential (Vbase) of −80 ± 2 mV. We next estimated rheobase for a range of baseline potentials from −100 to −65 mV by superimposing current steps on steady holding currents. For most cells, less than ±30 pA of steady current was enough to offset Vbase over this range. Differences in rheobase were not apparent. However, chronaxie was smaller when Vbase was depolarized because less charge was required to reach threshold potential (Fig. 4D). Subthreshold voltage responses from more depolarized potentials exhibited rapidly activating (<2 ms) inhibitory currents, reflected in the decrease of the apparent membrane time constant (Fig. 4E). These currents were probably carried by K+ and contributed to the phasic excitability of the SGN.
Using ramp stimulation, we observed that spike generation required a minimum rate of depolarization (Fig. 4F). Brief current ramps with large slopes and small integrals failed to trigger a spike due to a lack of sufficient charge and not because the rate of rise was too fast (i.e., when a fast rise was followed by a brief plateau, a spike was evoked, as in Fig. 4A). Ramps that reached the same level with smaller slopes delivered more charge before stimulus offset and triggered spikes reliably. However, when the slope of the ramp was decreased further, it failed to elicit a spike despite its greater charge. The membrane potential began to level off or even decreased before reaching spike threshold, most likely because the hyperpolarizing K+ current matched or exceeded the amplitude of the depolarizing current. To illustrate this change in membrane conductance as the cell depolarized through a range of subthreshold potentials, we then injected ramps from −100 to 100 pA over a duration of 100 ms (Fig. 4G) to examine the instantaneous V–I relationship. From −120 mV to the zero-current potential of −75 mV, the depolarization was relatively fast; above −75 mV, it slowed due to rapid activation of K+ current. A spike was elicited with an onset at approximately −68 mV (Fig. 4G).
Together, SGNs innervating the modiolar face of IHCs from the cochlear apex of hearing rats exhibited an extremely phasic firing behavior, spiking only once per depolarization and permitting only short latencies. This property might prevent multiple spikes during long EPSCs, and thereby enhance the locking of spike times to neurotransmitter release events. SGNs exhibited a threshold not only in terms of current amplitude but also in terms of depolarization rate. This type of excitability is characteristic of class III neurons, which do not respond to slow stimuli and thereby act as high-pass filters (McGinley and Oertel, 2006; Gai et al., 2009). In addition to firing only once, SGNs had a low rheobase and fired with very brief latencies. For current steps of 300 pA, mean latencies ranged from 360 to 650 μs between cells (520 ± 105 μs; n = 5).
SGNs are known to have heterogeneous sound pressure level thresholds when stimulated by IHCs in response to sound (Kiang et al., 1965). SGNs with a particular rheobase could conceivably correspond to auditory units with low or high sound pressure level thresholds in vivo because testing the intrinsic response properties of SGNs does not reveal potential extrinsic contributions to auditory unit heterogeneity, such as those arising from the IHC presynapse or from efferent innervation. Although we might have overlooked SGNs that fire tonically or have significantly different rheobase and/or chronaxie, our observation that SGNs are generally similar could be an indication that the heterogeneity observed in vivo is not intrinsic to the SGN.
Effects of EPSC-like waveform size and kinetics on spike latency and jitter
Relatively low current threshold (Fig. 4) seems to readily explain the high success rate of IHC neurotransmitter release events (Fig. 2) for spike generation in SGNs. These IHC-evoked spikes had latencies that varied by hundreds of microseconds (Fig. 3). This could contribute to variance in spike timing within an individual SGN and across simultaneously active SGNs, hence impacting upon the temporal precision of sound encoding in the cochlea. Therefore, we further studied the influence of EPSC shape on spike timing. How much of the variance in spike timing is due to waveform heterogeneity, and how much is due to jitter inherent to the spike generation process in the SGN?
Within a single SGN postsynaptic bouton, we observed EPSCs with substantial variation in total charge (50–600 fC; Fig. 5B,C), amplitude (50–500 pA; Fig. 5A,B), rise time (0.1–2 ms; Fig. 5A), and FWHM (0.6–3 ms; Fig. 5C), as previously demonstrated extensively by Grant et al. (2010). To determine the jitter inherent to the postsynaptic mechanism of spike generation, we removed waveform heterogeneity by injecting repetitions of identical EPSC-like shapes similar to EPSCs recorded in voltage clamp (Fig. 5D,E). We then assessed spike jitter by calculating the SD of the measured spike onset latencies.
EPSCs of diverse size and waveform were used as stimulus templates. A, EPSC amplitude versus 10–90% rise time demonstrates heterogeneity in size and onset time course (n = 304 EPSCs in 60 s for A–C). B, EPSC amplitude versus charge shows high variability, reflecting heterogeneity in the shape of excitation. C, EPSC FWHM versus charge illustrates the variety of synaptic event durations. D, E, Two EPSCs with a large difference in amplitude, rise time, and FWHM. One is slower and smaller in amplitude than the other, but the two EPSCs have a similar charge (284 vs 245 fC). The small gray EPSC in D is the same as in E, displayed on a different scale. Such broad ranges of EPSC size and kinetics were displayed in individual SGNs from hearing rats (e.g., age P19) and referenced for construction of EPSC-like stimuli for Figures 6–8.
We chose four shapes having different kinetics but equal charge, and scaled their amplitudes to cover a range of charges. The shapes decreased in speed and current amplitude in ascending order (Fig. 6A, bottom). Shapes 1 and 2 mimicked “monophasic” EPSCs, having fast rise times, very brief plateaus, and fast decays. Shapes 3 and 4 had slow rise times, longer plateaus, and slower decays. Although not multipeaked, the slow time courses of shapes 3 and 4 were intended to approximate the longer FWHM of “multiphasic” waveforms (Grant et al., 2010).
Effect of stimulus shape and SGN baseline potential on spike latency and jitter depended upon stimulus size. A, Four stimulus shapes (bottom) were scaled to have the same charge. Shapes 1 and 2 (red and black) had linear rise times of 0.3 ms, plateaus of 0.1 ms, and differed only in decay (τ of 0.5 and 1 ms). Shapes 3 and 4 (gold and blue) had linear rise times of 0.8 ms, plateaus of 1 ms, and differed only in decay (τ of 1 and 2 ms). Shapes 1–3 evoked spikes with variable latency (top), but shape 4 failed. In this example, each stimulus delivered 125 fC. B, Each stimulus shape was scaled over a range of amplitudes, conserving charge between shapes. The smallest amplitudes for shapes 1–4 were 83.3, 50, 26, and 18.4 pA, respectively. Larger amplitudes were integer multiples of the smallest ones. Shown are the range of amplitudes for one series of shape 2 (50–700 pA). Selected stimulus–response pairs are in bold. Only the smallest (50 pA) stimulus failed to evoke a spike. C, Spike latency (+SD) versus charge for each stimulus shape (colored as in A) shows reduction of spike latency and jitter with increasing charge. Each data point is the mean of 5–10 repetitions. For small charges the relationship was similar to 1/x (bottom dashed line). For charges >400 fC, the relationship was closer to 1/√x (top dashed line), indicating reduced charge efficiency of spike generation. Note the double-log scale. Waveforms with faster kinetics evoked spikes with shorter latency and less jitter, especially when charge was small. D, Shape 2, Amplitude of 100 pA, delivered from three baseline potentials. Latency depended strongly on baseline potential for such small stimuli. E, Larger amplitude (shape 2, 300 pA) reduced the shift in spike latency associated with changing the baseline potential. F, Spike latency (±SD) versus stimulus amplitude for shape 2 delivered from three baseline potentials (−94 mV, ▿; −83 mV, ○; −72 mV, ▵). Note reduction of latency, jitter, and sensitivity to baseline potential as stimulus amplitude was increased. Inset, Jitter (SD) versus mean latency for 5–10 repetitions of shape 2 at each amplitude, from Vbase = −83 mV. Similar trends were obtained with shapes 1, 3, and 4.
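The relationship between shape kinetics, amplitude, and charge in these stimuli can be sketched as follows. The code assumes that each shape consists of a linear rise, a flat plateau, and a single-exponential decay integrated to infinity, so that charge = amplitude × (rise/2 + plateau + τ). This is an assumption about the construction (the stimulus duration is not given here), and the function names are hypothetical. The kinetic parameters are those listed in the caption.

```python
import math


def charge_per_pa(rise_ms, plateau_ms, tau_ms):
    """Charge (fC) per pA of peak amplitude: triangle + rectangle + exponential tail."""
    return 0.5 * rise_ms + plateau_ms + tau_ms


def amplitude_for_charge(charge_fc, rise_ms, plateau_ms, tau_ms):
    """Peak amplitude (pA) needed for the shape to deliver the requested charge."""
    return charge_fc / charge_per_pa(rise_ms, plateau_ms, tau_ms)


def epsc_like(t_ms, amp_pa, rise_ms, plateau_ms, tau_ms):
    """Stimulus current (pA) at time t: linear rise, plateau, exponential decay."""
    if t_ms < 0.0:
        return 0.0
    if t_ms < rise_ms:
        return amp_pa * t_ms / rise_ms
    if t_ms < rise_ms + plateau_ms:
        return amp_pa
    return amp_pa * math.exp(-(t_ms - rise_ms - plateau_ms) / tau_ms)


# (rise, plateau, decay tau) in ms for shapes 1-4, from the caption.
SHAPES = {1: (0.3, 0.1, 0.5), 2: (0.3, 0.1, 1.0), 3: (0.8, 1.0, 1.0), 4: (0.8, 1.0, 2.0)}

for shape, kinetics in SHAPES.items():
    amp = amplitude_for_charge(125.0, *kinetics)
    print(f"shape {shape}: {amp:.1f} pA delivers 125 fC")
```

Under this assumption, the amplitudes that deliver 125 fC are twice the smallest amplitudes listed in the caption (83.3, 50, 26, and 18.4 pA), consistent with larger amplitudes being integer multiples of the smallest ones.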
For the smallest charge tested (31.5 fC), all four shapes failed to evoke an action potential. In response to 125 fC, shapes 1, 2, and 3 elicited spikes with very different latencies (Fig. 6A, top). Figure 6B illustrates a decrease in latency with increasing EPSC size for shape 2. As stimulus size increased, the latencies decreased for all four shapes, first rapidly and then more slowly (Fig. 6C). For a given charge, the faster waveforms evoked shorter latencies and less jitter, but the differences in jitter between waveforms were <50 μs when stimulus charge exceeded 300 fC. The longest latencies were ∼3 ms in response to near threshold EPSC-like stimulation. The shortest latencies were ∼250 μs for the largest and fastest shapes tested, which also elicited the smallest jitter. For eight repetitions of shape 2 at an amplitude of 300 pA, the mean latency was 488 ± 18 μs (CV ≈ 0.04 for CC-evoked responses) compared with 590 ± 300 μs (CV ≈ 0.5) for 333 IHC-evoked responses (Fig. 3). This confirms that the variance in spike onset latency was dominated by synaptic input, not postsynaptic spike generation. Indeed, the mechanism of spike generation intrinsic to the SGN was precise to within tens of microseconds.
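The two coefficients of variation quoted above can be checked directly from the reported means and SDs. The short sketch below does only that arithmetic; the function name is hypothetical.

```python
def coefficient_of_variation(sd, mean):
    """CV = SD / mean (dimensionless)."""
    return sd / mean


cv_cc = coefficient_of_variation(18.0, 488.0)    # CC-evoked, shape 2 at 300 pA (values in us)
cv_ihc = coefficient_of_variation(300.0, 590.0)  # IHC-evoked, 333 responses (values in us)

print(f"CC-evoked CV  = {cv_cc:.3f}")
print(f"IHC-evoked CV = {cv_ihc:.3f}")
print(f"SD ratio (IHC / CC) = {300.0 / 18.0:.1f}")
```

The SD of IHC-evoked latencies is more than an order of magnitude larger than that of responses to identical current-clamp stimuli, which is the basis for attributing most of the latency variance to the synaptic input.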
The zero-current potential in whole-cell recordings in vitro may differ from the SGN resting potential in vivo. To assess the influence of SGN baseline potential on spike generation, we applied EPSC-like stimuli from several baseline potentials. Spike onset latency and jitter decreased as the holding potential was depolarized from −92 to −74 mV (Fig. 6F). However, this sensitivity of spike latency to the baseline potential almost vanished for stimulus amplitudes exceeding 300 pA (Fig. 6, compare D, E). The SDs of spike onset latencies were relatively large for repetitions of the smallest stimuli, but were <20 μs for EPSC-like stimuli exceeding 200–300 pA.
In summary, even small EPSCs triggered a spike, but EPSC heterogeneity produced variable spike latency and jitter. Increasing the size and/or speed of EPSC-like stimuli improved the speed and precision of spike timing. Therefore, we expect the developmental upregulation of EPSC size (287 pA for P19–P21 vs 134 pA for P8–P11) (Grant et al., 2010) and speed (greater proportion of fast monophasic waveforms) (Grant et al., 2010) to reduce the effects on spike latency and jitter that result from variable EPSC waveforms and shifting SGN baseline potentials. The physiological synaptic input seems appropriately sized for precision and, also, efficiency because increases of stimulus size above the physiological mean (greater than ≈300 pA) yielded proportionally less reduction of latency and jitter (Fig. 6F).
Modeling the mechanism of spike generation in the SGN
Unlike a cortical neuron that needs the superposition of many low-amplitude synaptic inputs to initiate an action potential, a single IHC active zone drives the SGN bouton and nearby spike generator with high-amplitude input. So far, we have shown that the properties of discrete synaptic events have large and immediate effects on spike latency and precision (Figs. 3, 5, 6). To further elucidate the spike generation mechanism, we combined experiments and modeling (Figs. 7–9). Our goal was to find the simplest neuron model that could predict SGN responses using a minimum number of parameters. SGNs are known to respond at high rates in vivo, with first-spike latencies that vary with sound stimulus parameters (Neubauer and Heil, 2008). To better define the final step in the pathway from sound to SGN spike, we wanted a model to predict the time course of spike generation in response to a broad range of individual EPSC-like stimuli (Fig. 7).
Two-compartment LIF and EIF neuron models predicted spike latency with low error for a broad range of stimuli. A, Hundreds of EPSC-like stimuli (gray) were injected into SGNs (charge from 100 to 700 fC in steps of 100 fC; rise time from 0.1 to 0.8 ms in steps of 0.1 ms; plateau durations from 0 to 5 ms in steps of 0.5 ms; decay τ = 1 ms; amplitudes calculated). The bold colored traces show the largest- and smallest-amplitude waveforms for each of the seven charge sets. Inset, Characteristics used to define shapes. B, Schematic of the two-compartment circuit. Compartment 1 is connected to the pipette and, via an axial resistance (Raxial), to compartment 2. C, Two-compartment LIF model-predicted and experimentally measured voltage responses for one current stimulus. Shown are the data (black line) and the predicted voltages at both compartments (dashed magenta lines). After the voltage crossed threshold (VTh) at compartment 2, a spike was predicted to occur at a fixed delay D (fixed for all stimuli). The gray dashed lines show the predicted voltage in both compartments for the case of purely passive membranes. Measured AP onset was defined at 0.15 ms before the voltage crossed 20 mV below AP peak. Prediction error (in milliseconds) = measured AP onset minus predicted AP onset. The stimulus, a 70 pA plateau with 0.4 ms linear rise time, started at 0 ms. D, Two-compartment EIF model. All same as in C, but here a spike is generated with a fixed delay D after the predicted voltage in compartment 2 (green) crosses VT + 10 · ΔT (see Materials and Methods). A long-latency spike is used for the example in C and D for clarity. E, Model-predicted spike onset latency versus measured spike onset latency for the LIF (magenta) and EIF model (green) demonstrates general accuracy of predictions for latencies from 0.3 to 5 ms. F, Top, Prediction errors versus measured spike onset latency for the LIF model in magenta and EIF model in green (501 responses). The rms latency errors δL: LIF, 104 μs; EIF, 83 μs. Fraction of correctly predicted spike occurrences F: LIF, 98.7% (8 extra or missing spikes in a total of 600 stimuli with 506 spikes triggered); EIF, 98.3% (10 extra or missing spikes). Bottom, SD of the prediction error versus measured spike onset latency (calculated using groups of 20 successive points). Model parameters for baseline potential of −82 mV were as follows: double-exponential fit: τfast = 0.07 ms, Rfast = 40 MΩ, τslow = 2.3 ms, Rslow = 450 MΩ. Two-compartment circuit: R1 = 1760 MΩ, C1 = 1.3 pF, R2 = 600 MΩ, C2 = 3.8 pF, and Raxial = 75 MΩ. LIF: VTh = −66.5 mV; fixed delay D = 0.23 ms. EIF: VT = −68.6 mV; ΔT = 1.3 mV; fixed delay D = 0.09 ms.
To construct the passive electrical circuit of the neuron model, we first analyzed subthreshold voltage responses to depolarizing current steps (Fig. 4E). Because they were better fit by double- than single-exponential functions (mean ± SD for four SGNs: τfast = 0.24 ± 0.1 ms, Rfast = 107 ± 13 MΩ, τslow = 3.3 ± 0.5 ms, Rslow = 382 ± 94 MΩ), we chose a two-compartment circuit (Fig. 7B), which is the simplest passive electrical circuit able to reproduce such voltage responses (Pandey and White, 2002). We thereby obtained the values of membrane resistance (R), capacitance (C), and axial resistance (Raxial) of the two-compartment circuit for four cells: R1 = 2.0 ± 0.6 GΩ, C1 = 1.8 ± 0.8 pF; R2 = 485 ± 149 MΩ, C2 = 7.7 ± 3.9 pF; Raxial = 183 ± 13 MΩ.
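The link between the two-compartment circuit parameters and the double-exponential step response can be illustrated numerically. The sketch below assumes the standard topology in which each compartment is a parallel RC element to ground, the two compartments are joined by Raxial, and current is injected into compartment 1; this is our reading of the schematic, and the function name is hypothetical. The parameter values are those given in the Figure 7 caption for the example cell.

```python
import math


def two_compartment_constants(r1, c1, r2, c2, r_axial):
    """Time constants (ms) and input resistance (GOhm) of a two-compartment RC circuit.

    Units: resistances in GOhm, capacitances in pF, so that GOhm * pF = ms.
    The circuit obeys dV/dt = A V + b; the time constants are -1/eigenvalue of A.
    """
    a11 = -(1.0 / r1 + 1.0 / r_axial) / c1
    a12 = (1.0 / r_axial) / c1
    a21 = (1.0 / r_axial) / c2
    a22 = -(1.0 / r2 + 1.0 / r_axial) / c2
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    root = math.sqrt(trace * trace - 4.0 * det)
    tau_fast = -2.0 / (trace - root)   # from the more negative eigenvalue
    tau_slow = -2.0 / (trace + root)   # from the less negative eigenvalue
    # Steady-state input resistance seen from compartment 1: R1 parallel to (Raxial + R2).
    r_in = r1 * (r_axial + r2) / (r1 + r_axial + r2)
    return tau_fast, tau_slow, r_in


# Example cell from the Figure 7 caption, converted from MOhm to GOhm.
tau_fast, tau_slow, r_in = two_compartment_constants(
    r1=1.76, c1=1.3, r2=0.60, c2=3.8, r_axial=0.075)

print(f"tau_fast = {tau_fast:.3f} ms, tau_slow = {tau_slow:.2f} ms, Rin = {1000 * r_in:.0f} MOhm")
```

With these values the circuit yields time constants close to the fitted τfast = 0.07 ms and τslow = 2.3 ms, and an input resistance close to Rfast + Rslow = 490 MΩ, which suggests that the assumed topology is consistent with the reported fit.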
We injected a generalized set of EPSC-like stimuli into SGNs to systematically cover the entire range of physiologically observed kinetics and amplitudes (Fig. 7A). To predict spike onset latencies, we considered two simple spike generation mechanisms (see Materials and Methods): the leaky integrate-and-fire (LIF) model and the exponential integrate-and-fire (EIF) model (Fourcaud-Trocmé et al., 2003). The LIF and EIF neuron models are similar in that they accumulate the stimulus charge on the membrane capacitance of the cell and allow charge to escape through the leaky membrane. The models differ only in the spike generation mechanism. For the LIF model (Fig. 7C), the voltage follows the predicted passive response and the neuron emits a spike at a fixed delay D after the voltage crosses the fixed threshold VTh. For the EIF model (Fig. 7D), the spike-generating current mediated by voltage-gated Na+ channels is approximated by ΔT · exp[(V(t) − VT)/ΔT], where V(t) stands for the voltage at time t, VT is the threshold voltage, and ΔT is the spike slope factor, characterizing the sharpness of spike initiation. With sufficiently large stimuli, the membrane potential diverges to infinity in a finite time. The EIF was defined to emit a spike at a fixed delay D after the membrane potential reached VT + 10 · ΔT (i.e., when it is already diverging toward infinity). The exponential term of the EIF is expected to reduce the error between the measured and predicted spike onset if EPSP kinetics directly affect the time course of INa activation.
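To make the EIF mechanism concrete, the following is a minimal forward-Euler sketch of a two-compartment EIF neuron with the spike generator in compartment 2. It is not the authors' implementation. It assumes that the exponential current takes the standard form (ΔT/R2) · exp[(V2 − VT)/ΔT] (i.e., scaled by the leak conductance of compartment 2), that the baseline potential acts as the resting potential of both compartments, and that the stimulus has the kinetics of shape 2. Parameter values are those listed in the Figure 7 caption for the example cell; all function names are hypothetical.

```python
import math

# Units: mV, ms, pA, pF, GOhm (pA * GOhm = mV; GOhm * pF = ms).
R1, C1 = 1.76, 1.3
R2, C2 = 0.60, 3.8
R_AXIAL = 0.075
V_BASE = -82.0                          # baseline potential, treated here as rest
V_T, DELTA_T, DELAY = -68.6, 1.3, 0.09  # EIF threshold, slope factor, fixed delay D


def epsc_like(t, amp, rise=0.3, plateau=0.1, tau=1.0):
    """EPSC-like current (pA): linear rise, plateau, exponential decay."""
    if t < 0.0:
        return 0.0
    if t < rise:
        return amp * t / rise
    if t < rise + plateau:
        return amp
    return amp * math.exp(-(t - rise - plateau) / tau)


def eif_spike_latency(amp_pa, t_max=10.0, dt=0.001):
    """Predicted spike onset latency (ms) after stimulus onset, or None if no spike."""
    v1 = v2 = V_BASE
    for step in range(int(t_max / dt)):
        t = step * dt
        i_axial = (v1 - v2) / R_AXIAL   # current from compartment 1 to 2
        # Assumed EIF spike-generating current in compartment 2 only.
        i_spike = (DELTA_T / R2) * math.exp((v2 - V_T) / DELTA_T)
        dv1 = (-(v1 - V_BASE) / R1 - i_axial + epsc_like(t, amp_pa)) / C1
        dv2 = (-(v2 - V_BASE) / R2 + i_axial + i_spike) / C2
        v1 += dt * dv1
        v2 += dt * dv2
        if v2 >= V_T + 10.0 * DELTA_T:  # voltage is diverging: spike after delay D
            return t + dt + DELAY
    return None


for amp in (20.0, 100.0, 300.0, 600.0):
    print(f"{amp:.0f} pA: latency = {eif_spike_latency(amp)}")
```

In this sketch, small stimuli leave the voltage in compartment 2 well below VT and produce no spike, whereas larger stimuli drive it into the range where the exponential term dominates and the crossing of VT + 10 · ΔT occurs earlier.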
To test the two spike-generating mechanisms, we added the LIF or EIF mechanism to cellular compartment 2, or to both compartments. We determined the optimum parameters (VTh for the LIF; VT and ΔT for the EIF) by minimizing the error between model predictions and electrophysiological data. Goodness of fit was assessed in terms of latency error δL and the fraction of correctly predicted spike occurrences F (see Materials and Methods). For both models, we found smaller δL and similar F when we placed the spike generator in compartment 2 only (LIF: 26 ± 17% smaller δL, p = 0.002, n = 6; EIF: 35 ± 23% smaller δL, p = 0.03, n = 4). Both models predicted the data with high accuracy. For the EIF, δL was 77 ± 25 μs and F was 99.0 ± 0.8%, n = 4. The LIF predicted latencies with a somewhat larger error (87 ± 76% larger δL on average compared with the EIF, p = 0.05, n = 4). VT of the EIF was not significantly different from VTh of the LIF (VTh − VT = 0.7 ± 1.2 mV; p = 0.15; n = 4). The spike slope factor ΔT of the EIF was 1.4 ± 0.5 mV (n = 4).
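The two goodness-of-fit measures can be sketched as follows. The exact definitions are given in Materials and Methods and are not reproduced here, so this code assumes that δL is the root-mean-square difference between measured and predicted spike onset times, and that F is the fraction of stimuli for which spike occurrence (spike or no spike) was predicted correctly. The function names and the two-point latency example are hypothetical.

```python
import math


def rms_latency_error_us(measured_ms, predicted_ms):
    """Root-mean-square latency error (us) over paired measured/predicted spikes."""
    errors_us = [(m - p) * 1000.0 for m, p in zip(measured_ms, predicted_ms)]
    return math.sqrt(sum(e * e for e in errors_us) / len(errors_us))


def fraction_correct(n_stimuli, n_extra_or_missing):
    """Fraction of stimuli with correctly predicted spike occurrence."""
    return 1.0 - n_extra_or_missing / n_stimuli


# Counts from the Figure 7 caption: 600 stimuli, 8 (LIF) or 10 (EIF) mismatches.
print(f"LIF: F = {100 * fraction_correct(600, 8):.1f}%")
print(f"EIF: F = {100 * fraction_correct(600, 10):.1f}%")

# Hypothetical latencies (ms), only to show how the rms error is formed.
print(f"rms error = {rms_latency_error_us([0.5, 1.0], [0.4, 1.1]):.0f} us")
```

Under these assumed definitions, the mismatch counts in the Figure 7 caption reproduce the reported F values for the example cell.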
Figure 7E illustrates the theoretically predicted versus measured spike onset latencies for responses to current-clamp injection in a representative SGN stimulated with hundreds of unique EPSC-like shapes. The relatively small error of the model predictions for single instances of each stimulus demonstrated both the high accuracy of the model and the deterministic nature of the SGN response. The prediction error of both models was very small for latencies ≤1 ms. Only for latencies above ≈1.8 ms did the errors greatly increase (Fig. 7F). The increase in prediction error for long latencies may be mainly explained by variability intrinsic to the neuron (Fig. 6F, inset: increase in jitter, SD, for long latencies).
In summary, although SGNs were phasic and therefore not entirely described as simple “integrators,” their first-spike latency in response to a wide range of EPSC-like shapes could nonetheless be accurately predicted by simple leaky integrate-and-fire models. The EIF had less systematic error than the LIF (Fig. 7F, top), as it incorporated a stimulus-dependent effect on INa activation around spike threshold. The spike generator was better placed in compartment 2, indicating that the spike generator is not centered directly at the bouton. However, the relatively small membrane capacitance of compartment 1 (C1, ≈1.8 pF) suggests that the spike-generating compartment 2 is near the synapse. Such simple models provided an easy way to predict the spike latencies for discrete EPSCs and to estimate the threshold potential at the site of spike initiation. However, more complex models will be required (Herz et al., 2006) to predict the responses of the neuron to high rates of EPSCs.
EPSC-like stimulation and comparison with synaptically evoked spikes
Browsing through the stimulus parameter space, we compared spike onset latencies between the experimental data and the two-compartment EIF model in response to individual EPSC-like stimuli such as in Figure 7A. Figure 8A–C illustrates measured and predicted spike onset latencies as contour plots through stimulus parameter space for each set of stimuli with a different charge. The agreement between predicted and measured latencies can be appreciated from the overlap of the green contour lines (model) with the black contour lines (data). Contour lines show the range of stimuli that evoked spikes with equal latency. When holding charge constant, spike latency was more sensitive to changes in amplitude than in rise time.
Effect of current stimulus waveform kinetics on latency: data and model predictions. A, Latency contours in 200 fC parameter space. The black points on the graph represent 88 stimuli of variable amplitude (31–190 pA, y-axis), rise time (0.1–0.8 ms, x-axis), and plateau (0–5 ms, isoplateau bands labeled on right), each with a total charge of 200 fC. Stimuli evoked spikes for all but the smallest waveforms (black X, failure; n = 9). Measured spike onset latencies were plotted as solid black contour lines (1–4 ms, labeled in black). Spike onset latencies predicted by the EIF model are overlaid as green dashed contour lines (green X, predicted failure; n = 1). B, In the 300 fC parameter space, every stimulus evoked a spike. Latencies (black contour lines) were accurately predicted by the EIF model (green dashed contours). C, Spike latency contours for the 500 and 700 fC parameter spaces illustrate reduction of spike latency for larger stimuli; however, the reduction in spike latency was smaller when stimuli were increased above 400 fC. D, Stimulus–response pairs for two subthreshold stimuli (100 fC). The stimulus (Iinj, bottom part of each panel) and the response of the cell (Vm) are shown in solid black. The passive response of the model circuit in compartment 1 is shown as a dashed gray line. The response of the EIF model in compartment 2 is shown as a solid green line, where the threshold of −67.5 mV is shown by the dotted green line. D1, The data and the passive response in compartment 1 were virtually indistinguishable. D2, Some near-threshold behavior was not well predicted by the passive response. Note the voltage drop between the site of current injection (compartment 1) and compartment 2. E, Two stimulus–response pairs (as in D) from the 200 fC parameter space, labeled in A. The box shows area enlarged in inset. E1, A failure of spike generation where the model predicted a spike. E2, Similar near-threshold stimulus triggered a long-latency spike.
F, Two stimulus–response pairs from the 300 fC parameter space, labeled in B. Each inset enlarges the area around spike threshold, where the response of the SGN and the EIF model deviated from the passive response. The EIF model predicted that spike onset occurred after a fixed delay D of 90 μs from when the membrane voltage diverged toward infinity (E, F, dotted green vertical lines). G, Comparing spike onset latency as a function of EPSP maximum slope for CC-evoked spikes (black) and IHC-evoked spikes (blue to red; replotted from Fig. 3C) revealed a very similar relationship.
With a total charge of 200 fC, only the slowest and smallest stimulus shapes failed to evoke a spike from holding potentials around −80 mV (Fig. 8A). The fastest and largest 200 fC stimuli (150–200 pA; 0.1–0.4 ms rise time) evoked spike onset latencies <1 ms, while the slowest successful waveforms evoked latencies of ≈4.5 ms. Most physiological EPSCs have charges of 150–350 fC. When comparing the ranges of latencies for different charge sets within cells, we observed relatively large latency reduction when increasing from the 200 fC to the 300 fC parameter space (Fig. 8A,B). In comparison, we observed less latency reduction when increasing to larger charge sets (Fig. 8C).
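The constant-charge parameter space can be illustrated with a short calculation. The sketch below assumes a waveform made of a linear rise, a flat plateau, and an exponential decay, so that the total charge is amplitude × (rise/2 + plateau + τ). It further assumes a decay time constant of 1 ms and a grid of 0.1 ms steps in rise time and 0.5 ms steps in plateau duration. The function names, the waveform shape, the grid, and the value of τ are all assumptions made for illustration. Under these assumptions the grid contains 88 stimuli whose amplitudes span roughly 31–190 pA at 200 fC, which is consistent with, but does not confirm, the ranges given in the figure legend.

```python
def amplitude_for_charge(charge_fc, rise_ms, plateau_ms, tau_ms=1.0):
    """Peak amplitude (pA) that yields the requested total charge (fC).

    Assumed waveform: linear rise, flat plateau, exponential decay.
    Charge = amp * (rise/2 + plateau + tau), because 1 pA * 1 ms = 1 fC.
    """
    return charge_fc / (rise_ms / 2.0 + plateau_ms + tau_ms)


def stimulus_family(charge_fc, tau_ms=1.0):
    """All (rise, plateau, amplitude) triples on the assumed grid."""
    rises = [0.1 * k for k in range(1, 9)]      # 0.1 to 0.8 ms, assumed steps
    plateaus = [0.5 * k for k in range(0, 11)]  # 0 to 5 ms, assumed steps
    return [
        (r, p, amplitude_for_charge(charge_fc, r, p, tau_ms))
        for r in rises
        for p in plateaus
    ]


family = stimulus_family(200.0)
amps = [a for _, _, a in family]
print(len(family))         # 88
print(f"{max(amps):.2f}")  # 190.48 (fastest rise, no plateau)
print(f"{min(amps):.2f}")  # 31.25 (slowest rise, longest plateau)
```

Because charge is fixed, lengthening the plateau or the rise time necessarily lowers the peak amplitude, which is why amplitude and kinetics cannot be varied independently within one charge set.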
Experimental stimulus–response pairs and model predictions are compared in Figure 8D–F. Figure 8D shows two subthreshold stimulus–response pairs and model predictions from the 100 fC parameter space. For 100 fC stimuli, only 2 and 6 of 88 stimuli evoked a spike in the two cells tested, respectively. These spikes were not predicted by the model. Figure 8E shows two stimulus–response pairs and model predictions from the 200 fC parameter space. In rare cases, the model predicted a spike when none occurred (Fig. 8E, left): this happened only in parameter regions where the responses of the neuron were less reliable or deterministic (e.g., 4.5 and 5 ms plateau in Fig. 8A). Figure 8F shows two stimulus–response pairs from the 300 fC parameter space, with the EIF model prediction in compartment 2 and the predicted passive response of the cell in compartment 1. As the model membrane potential crossed threshold at compartment 2, the recorded voltage trace and the EIF model began to deviate from the passive response of compartment 1. For EPSC-like stimuli in the 200–300 pA range, the model performed very well. Latencies were ∼500–1000 μs, and the SDs of the latency prediction errors of the model were only 10–35 μs.
To compare our EPSC-like stimulation with synaptic conductance excitation, we plotted the spike onset latency versus EPSP maximum slope for the CC-evoked and the IHC-evoked spikes (from Fig. 3C). The overlap between the CC-evoked and IHC-evoked data sets (Fig. 8G) confirmed that a range of our EPSC-like shapes were good approximations of synaptic excitation for the study of first-spike latency in SGNs. This allows one to deduce which EPSC-like stimuli produced prepotentials and spike onset latencies similar to those of physiological EPSPs (i.e., those shapes eliciting latencies <1.5 ms).
Spike onset and threshold potential as a function of baseline potential
Finally, to complement the study of spike latency, we investigated the sensitivity of spike onset potential and threshold potential to changes in Vbase. The sudden slope change or visible “kink” in the spike waveform (Fig. 9A,B, circles) is the spike onset potential at the recording site (Fig. 3E, asterisks). Threshold potential, in contrast, is the voltage required to result in the initiation of an all-or-none action potential in the spike-generating compartment. Here, threshold potential was defined as the optimum VTh from the two-compartment neuron model. Since the voltage evolution depended upon the solution for the two-compartment circuit (see Materials and Methods), the estimated voltage thresholds VTh ranged over 2 to 6 mV for each combination of cell and baseline potential we tested (Fig. 9C). Depending on the cell and Vbase, spike onsets were approximately −70 to −50 mV and were always depolarized compared with threshold. Spike thresholds were small, only +6 to +14 mV relative to Vbase when evoked from Vbase near the zero-current potentials of approximately −80 to −70 mV. We found that both the spike onset potential and absolute threshold shifted to more depolarized potentials with depolarization of Vbase. The shift of absolute threshold partially compensated for the change of Vbase, resulting in comparatively small shifts in the relative threshold. This effect should decrease the sensitivity of spike latency to changes in Vbase.
Absolute and relative spike threshold and onset potential covary with baseline potential. A, Individual responses to 50 pA current steps from three baseline potentials in one SGN. B, Phase plots for the action potentials in A. The open circles in A and B mark the spike onset potential, defined as the potential at which the slope reached 30 mV/ms. A and B reveal that the spike onset potential depolarized with the baseline potential. C, Absolute (Vm) and relative (Vm − Vbase) spike onset potentials, measured using minimum stimulation (ovals). Absolute and relative threshold potentials, predicted at the spike generator, assuming possible solutions for the two-compartment LIF model (bars). All values are plotted versus baseline potential for four SGNs (black, cell from 3 baseline potentials; white, cell from 2 baseline potentials; 2 shades of gray, 2 cells from different baseline potentials). As a function of baseline potential, changes in onset potential and threshold potential were smaller in relative value than absolute value. Such an effect could reduce the influence of baseline potential on spike onset latency.
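The slope criterion for spike onset can be sketched as a simple search through a sampled voltage trace. The example below is an illustration only: it uses a finite-difference slope between adjacent samples, a synthetic trace, and hypothetical function names, none of which are taken from the study. Real recordings would typically need filtering before such a criterion is applied.

```python
def spike_onset(times_ms, v_mv, slope_criterion=30.0):
    """Return (time, voltage) of the first sample whose forward slope
    (mV/ms) reaches the criterion, or None if it is never reached."""
    for k in range(1, len(v_mv)):
        slope = (v_mv[k] - v_mv[k - 1]) / (times_ms[k] - times_ms[k - 1])
        if slope >= slope_criterion:
            return times_ms[k - 1], v_mv[k - 1]
    return None


def relative_potential(v_abs_mv, v_base_mv):
    """Potential expressed relative to the baseline potential (mV)."""
    return v_abs_mv - v_base_mv


# Synthetic trace: slow 5 mV/ms prepotential from -80 mV for 2 ms,
# followed by a 200 mV/ms upstroke (illustrative values only).
dt = 0.01
times = [k * dt for k in range(221)]
trace = [
    -80.0 + 5.0 * times[k] if k <= 200 else -70.0 + 200.0 * (times[k] - 2.0)
    for k in range(221)
]

onset = spike_onset(times, trace)
print(onset)                                 # onset time (ms) and potential (mV)
print(relative_potential(onset[1], -80.0))   # onset relative to baseline (mV)
```

Expressing the onset potential relative to the baseline, as in the second function, corresponds to the relative values (Vm − Vbase) described in the legend.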
Discussion
Fast and precise SGN spike generation contributes to temporal code fidelity
The auditory spike code is generated in SGNs by large-amplitude EPSCs of unisynaptic excitation from IHCs, at rates that are modulated by sound pressure. We found that EPSC shape and size variability could strongly influence spike latency relative to the onset of neurotransmitter release. When SGN spikes were evoked spontaneously by discrete EPSCs from IHCs in vitro at room temperature, spike initiation had a mean latency of ∼600 μs from EPSC onset, with substantial jitter (SD of 300 μs; Fig. 3B–D). Upon repeated stimulation with current that approximated the average EPSC, much larger than rheobase, spike initiation had comparable latency but much smaller jitter (SD ≤ 23 μs; Figs. 5A, 6F). This showed that neuronal intrinsic jitter was small for large stimuli and suggests that precision in response to sound is great only insofar as synaptic input is much larger than rheobase. This demonstrates the potential physiological significance of large-amplitude, apparently multivesicular EPSCs (Glowatzki and Fuchs, 2002).
Cellular basis of fast and precise spike generation in the SGN
Initiation of short-latency spikes timed precisely to stimuli is supported by the following: (1) proximity of synapse and low-threshold spike generator, (2) short membrane time constant, (3) phasic excitability of the SGN peripheral neurite, and (4) large EPSCs relative to rheobase. Strong immunolabeling for NaV1.6 at the heminode adjacent to the bouton (Lacas-Gervais et al., 2004; Hossain et al., 2005; Lysakowski et al., 2011), the very rapid spike generation we observed, and the results from our two-compartment model support the view that spikes are initiated in the peripheral neurite. This unisynaptic configuration for aural spike generation, contrasting with that of most cortical neurons, is a highly specialized adaptation of structure and function that seems fundamental to the temporal code of the early auditory pathway.
The large-amplitude synaptic conductance and compact spike generator seem to enable accurate and precise encoding despite biological variability in EPSC charge and rise time (Fig. 6). Moreover, we observed a small latency reduction when increasing stimulus amplitude from 300 to 700 pA, compared with the larger reduction that accompanied a stimulus increase from 150 to 300 pA (Fig. 6F). Thus, while the auditory system allocates significant metabolic energy to latency and jitter reduction, the typical EPSC of ∼300 pA seems poised to do this rather efficiently. Faithful conversion of chemical neurotransmission to spike generation, along with fast and precise glutamate release (Goutman and Glowatzki, 2011), enables a rapidly modulated temporal code for acoustic waveforms. The large and rapid synaptic conductance may be crucial for evoking spikes at high rates, when short interspike intervals are impeded by neural refractoriness. Finally, the large synaptic input and compensatory threshold shift made responses relatively insensitive to changes of the SGN baseline potential (Figs. 6, 9).
Phasic excitability of the SGN
Phasic excitability is prominent in neurons of the auditory pathway (Schwarz and Puil, 1997; McGinley and Oertel, 2006; Prescott et al., 2008; Gai et al., 2009; Howard and Rubel, 2010) and is likely important for sound encoding in SGNs as well. If each EPSC generates one spike, then subsequent spikes or long-latency spikes, less well timed to sound stimuli and more dependent on cellular dynamics, would be precluded by the phasic excitability of the neuron (Brew and Forsythe, 1995). Latencies >3 ms were rarely observed in response to synaptic events or EPSC-like stimuli (Figs. 3, 6, 7, 8), presumably because low-voltage activated IK inhibited spiking in response to slow stimuli. In addition, as phasic SGNs would not encode a steady neurotransmitter concentration, high spike rates would require a rapid sequence of discrete EPSCs. Future work should use stimulus trains to further investigate the impact of phasic excitability on SGN spike encoding.
Phasic excitability was found also in recordings from SGN presynaptic terminals onto bushy cells in acute slices (Lin et al., 2011) and from SGN somata (Mo and Davis, 1997a). Spike accommodation has been ascribed in part to the activity of dendrotoxin-sensitive low-voltage-activated K+ channels (KV1.1) (Mo et al., 2002) (but see Szabó et al., 2002). In SGNs cultured from the mouse, somata isolated from the cochlear apex fired one spike and were termed rapidly adapting, while those from the cochlear base fired continuously and were termed slowly adapting (Lv et al., 2010) (but see Adamson et al., 2002). The phasic responses we observed in SGN peripheral neurites seem to correspond to those of rapidly adapting somata from the cochlear apex.
The intrinsic firing properties of SGNs and their exogenous regulation may differ tonotopically and/or by position around the IHC circumference (Merchan-Perez and Liberman, 1996; Lin, 1997; Mo and Davis, 1997b; Adamson et al., 2002; Flores-Otero et al., 2007; Liu and Davis, 2007; Lv et al., 2010). Additionally, the extent to which SGN properties depend upon the species, age, and preparation is not entirely clear. We recorded from the acutely explanted cochlear apex of rats after hearing onset (aged 2–3 weeks) and targeted boutons innervating the modiolar face of IHCs. Potential variations in intrinsic properties by tonotopic position and by synaptic location on the IHC are important subjects for future studies.
Possible deviations from in vivo conditions
Performing experiments below body temperature likely affects neurotransmission and firing properties. In general, raising the temperature increases channel-gating speed and might alter conductances. At physiological temperatures, the membrane time constant, input resistance, and spike amplitude are likely smaller; spike latency and duration are likely shorter; and the baseline and threshold potentials can be hyperpolarized or depolarized by several millivolts (Volgushev et al., 2000; Cao and Oertel, 2005; Graham et al., 2008). However, phasic excitability likely remains unchanged (Cao and Oertel, 2005; Graham et al., 2008).
Faster Na+ channel recovery from inactivation may promote repetitive, high-frequency firing. Interestingly, the principal Na+ channel isoform located at axon initial segments and nodes of Ranvier (NaV1.6), also present in SGNs, is relatively resistant to inactivation. It provides a resurgent current for repetitive firing in retinal ganglion cells at physiological temperature (Van Wart and Matthews, 2006) and in Purkinje cells even at room temperature (Raman et al., 1997). Thus, other factors (e.g., low-voltage-activated IK) probably influence the phasic property we observed. Although Ohlemiller and Siegel (1998) reported that the basic response properties of auditory units were conserved after cooling of the cochlea in vivo, the specific effects of temperature on SGN electrophysiology should be addressed with further studies in vitro.
Implications for SGN function in vivo
IHC synapses and their SGNs comprise single auditory units, known to exhibit functional heterogeneity in terms of spontaneous spike rate and sensitivity to sound (Kiang, 1965; Taberner and Liberman, 2005). This heterogeneity is thought to enhance auditory perception, and identification of its underlying mechanisms is a current topic of research.
Presynaptic mechanisms in the IHC are thought to underlie some response properties of auditory nerve fibers (Fuchs, 2005; Moser et al., 2006). Our observation that most physiological IHC neurotransmitter release events generated a spike in vitro supports the portrayal by Siegel (1992), in which almost every discrete EPSP was successful in triggering a spike in vivo. We cannot exclude that both studies overlooked SGNs having lower rates of success. However, differences in spontaneous spike rate among SGNs might not reflect differences in the success rate but rather the rate of release events. EPSC rate heterogeneity might arise from variation between presynaptic Ca2+ microdomains in IHCs (Frank et al., 2009). Our study also suggests, however, that SGNs receiving on average different EPSC waveforms (Grant et al., 2010) would exhibit different spike latencies. If high rates of release lead to EPSC superposition, then SGNs receiving different modes of EPSCs might generate different spike rates due to the phasic property of the neuron, neural refractoriness, and/or synaptic depression. Thus, heterogeneity of EPSC waveforms between synapses might contribute to the diversity of sensitivity, dynamic range, and maximum firing rate. Future work should focus on how presynaptic and postsynaptic mechanisms combine to produce the functional diversity observed in the auditory nerve in vivo.
Inconsistent response latency may limit temporal-code precision. When holding constant charge, rise time, and decay τ, changing only the plateau duration and amplitude (from 100 to 200 pA; Fig. 8B) resulted in a latency shift of ∼400 μs. So, natural EPSC waveform variability within a single SGN might affect response statistics in vivo, for example, first-spike precision at sound onset (Heil, 2004). Hypothetically, EPSC superposition at sound onset or during phase locking could reduce waveform-dependent variability in spike timing. The extent of EPSC superposition in vivo is unclear. If increasing sound amplitude leads to increased EPSC superposition, then louder sounds might produce smaller spike latencies. However, such an effect might be incongruent with observations of the relative invariance of action potential phase in SGNs as a function of sound pressure level during phase locking to tone cycles (Kiang, 1965; Rose et al., 1967). Additionally, differences in mean latency of >100 μs for a 50 pA change in EPSC amplitude might seem incongruent with signaling of microsecond interaural time differences, but such variance could be averaged out when inputs from several SGNs converge onto individual brainstem neurons.
Footnotes
This work was supported by a fellowship from the Alexander von Humboldt Foundation (M.A.R.) and grants (T.M.) from the Federal Ministry of Education and Research through the Bernstein Center for Computational Neuroscience Goettingen (01GQ1005A) and the Deutsche Forschungsgemeinschaft through the Collaborative Research Center 889 “Cellular Mechanisms of Sensory Processing.” We thank R. Davis, J. Goutman, R. Gütig, R. Nouvian, W. Roberts, F. Wolf, and E. Yamoah for comments on this manuscript; and A. Neef, E. Neher, W. Stühmer, H. Taschenberger, and the InnerEarLab for discussion.
Correspondence should be addressed to Mark A. Rutherford, InnerEarLab, University of Goettingen, Robert-Koch-Strasse 40, D-37075 Goettingen, Germany. rutherford.neuro{at}yahoo.com