Abstract
The auditory systems of birds and mammals use timing information from each ear to detect interaural time difference (ITD). To determine whether the Jeffress-type algorithms that underlie sensitivity to ITD in birds are an evolutionarily stable strategy, we recorded from the auditory nuclei of crocodilians, who are the sister group to the birds. In alligators, precisely timed spikes in the first-order nucleus magnocellularis (NM) encode the timing of sounds, and NM neurons project to neurons in the nucleus laminaris (NL) that detect interaural time differences. In vivo recordings from NL neurons show that the arrival time of phase-locked spikes differs between the ipsilateral and contralateral inputs. When this disparity is nullified by their best ITD, the neurons respond maximally. Thus NL neurons act as coincidence detectors. A biologically detailed model of NL with alligator parameters discriminated ITDs up to 1 kHz. The range of best ITDs represented in NL was much larger than in birds, however, and extended from 0 to 1000 μs contralateral, with a median ITD of 450 μs. Thus, crocodilians and birds employ similar algorithms for ITD detection, although crocodilians have larger heads.
Introduction
Interaural time difference (ITD) processing has been explained in terms of the Jeffress model (Jeffress, 1948; Joris et al., 1998; Fitzpatrick et al., 2002; Konishi, 2003; Palmer, 2004; Hyson, 2005; Joris and Yin, 2007). This model assumes arrays of coincidence-detector neurons that receive excitatory inputs from the two ears, and respond maximally when phase-locked inputs converge from each ear simultaneously. Different conduction delays from each ear provide a means by which coincidence detectors form a place map of sound location. For nucleus laminaris (NL), the ITD-processing structure in birds, the existence of such an arrangement has been confirmed in barn owls, emus, and chickens (Carr and Konishi, 1990; Overholt et al., 1992; Peña et al., 1996; Funabiki et al., 1998; MacLeod et al., 2006).
In the mammalian equivalent of NL, the medial superior olive (MSO), direct confirmation of a place map is lacking. Instead, recent work in guinea pigs and gerbils has revealed a tendency for the steepest region of the ITD tuning curve (its slope) to fall close to midline regardless of best frequency (McAlpine et al., 2001). McAlpine and colleagues proposed that sound source location might instead be encoded by activity in two broad, hemispheric spatial channels (McAlpine and Grothe, 2003; McAlpine, 2005), and put forward a potential solution to the differences in coding observed in birds and small mammals (Harper and McAlpine, 2004). Assuming that the ITDs naturally encountered should be coded with maximal accuracy, they suggested that ITD coding should depend on head size and frequency range. A place map, collectively covering the physiological ITD range of the animal, and consistent with the Jeffress model, would work best for high frequencies and/or large heads, while the two-channel model would provide optimum coding for small heads and lower best frequencies.
The prediction that animals with small heads and/or low-best-frequency hearing should not use a place code was not supported by recent data from the chicken. Chickens and gerbils have similar head sizes and ability to encode temporal information, but chickens display the major features of a place code of ITD (Köppl and Carr, 2008). Data from chickens and barn owls (Wagner et al., 2007) suggest instead that evolutionary history may constrain neural circuits. Furthermore, slope and place codes may not be mutually exclusive strategies. Barn owls appear to use both the map of ITD created by the place code, and the information in the slope of the ITD function (Takahashi et al., 2003; Bala et al., 2007).
To test the extent to which Jeffress-type algorithms are used for ITD processing in other vertebrates with well developed low-frequency hearing, we recorded responses of ITD-sensitive neurons in the alligator. Crocodilians are birds' closest living relatives, and use low-frequency sound for communication. Additionally, field observations support their ability to localize the contact calls made by their young (Hunt and Watanabe, 1982; Passek and Gillingham, 1999).
Materials and Methods
We recorded responses of ITD-sensitive neurons in the American alligator (Alligator mississippiensis). We also incorporated a large dataset of auditory nerve recordings from another member of the Alligatoridae, Caiman crocodilus, into our models and analysis. Experiments were performed in accordance with the guidelines approved by the Marine Biological Laboratory (Woods Hole, MA) and the University of Maryland Institutional Animal Care and Use Committees. This material has been presented in abstract form (Soares et al., 1999, 2003).
Brain slice preparation.
We used in vitro slice preparations of alligator embryo (∼E55 to E65) brainstem to investigate the physiology and morphology of NL. Thirty eggs from several clutches of newly laid eggs of the American alligator were collected from field nests in the Rockefeller Wildlife Refuge at Grand Chenier, LA. At ∼10 d of incubation [total 65 d incubation (Ferguson, 1985)], eggs were transported by air to the Department of Biology at the University of Maryland. Eggs were incubated in boxes containing vermiculite mixed with water at a 2:1 ratio and maintained at 30°C. It is not known when hearing onset in alligators occurs, although it is likely that it occurs before hatch. Hearing onset occurs in ovo in chickens (Saunders et al., 1973), while the calls of hatching crocodilians stimulate their mother to return to the nest (Pooley, 1977).
Eggs were taken from the incubation boxes at between 80 and 90% of the total incubation period. Embryos were rapidly decapitated and a segment of the caudal skull containing the brainstem was removed with a razor blade and quickly submerged in artificial CSF (aCSF). aCSF contained the following (in mM): 130 NaCl, 26 NaHCO3, 3 KCl, 2 CaCl2, 2 MgCl2, 1.25 NaH2PO4, and 10 dextrose. The aCSF was constantly gassed with 95% O2 and 5% CO2 and had a pH of 7.4. The brainstem segment was dissected out of the cranium and transferred to a vibrating blade tissue slicer (Campden), where it was mounted with cyanoacrylate glue, supported by a gel solution (30% gelatin in aCSF), and cut in aCSF. Transverse slices (150–300 μm) containing nucleus magnocellularis (NM) and NL were collected and placed in a holding chamber at room temperature (25–27°C) in oxygenated aCSF (95% O2–5% CO2). Slices were transferred as needed to a submersion-type recording chamber perfused (1 ml/min) with oxygenated aCSF.
Whole-cell patch clamp recordings were made from visually identified NM and NL neurons using infrared/differential interference contrast video microscopy. Initial pipette resistance was 3–7 MΩ, with an internal solution as follows (in mM): 120 K gluconate, 20 KCl, 0.1 EGTA, 2 MgCl2, 2 Na2ATP, and 10 HEPES, pH 7.3. Pipettes were backfilled with saline containing Lucifer yellow (Invitrogen) or sulforhodamine (Invitrogen). Recordings were made using an Axoclamp-2B (Molecular Devices) in bridge mode and digitized while connected to a PC running PClamp 8.0 (Molecular Devices) acquisition software. Analysis of electrophysiological data was performed using AxonGraph 4.5 software (Molecular Devices). Voltage–current (V–I) relationships were constructed from a series of hyperpolarizing and depolarizing current injections within the linear range of the membrane potential (approximately ±10 mV of resting potential). Linear regression analysis of V–I relationships was used to determine the input resistance of each cell.
In vivo recording in caiman auditory nerve.
Three young adult caimans, length ∼1 m, weight ∼4 kg, were anesthetized with sodium pentobarbital (20 mg/kg, i.p.). Animals were tracheotomized and artificially respirated with room air and kept at a cloacal temperature between 24 and 27°C. The head was fixed in a stereotaxic holder, and the auditory nerve exposed by an opening in the occipital bone. Experiments were performed in a sound-attenuating box (IAC). Glass microelectrodes (filled with 3 M KCl, 5–100 MΩ) were inserted in the auditory nerve and advanced using a remote-controlled stepper device (Nanostepper). Single-unit activity was amplified, displayed on an oscilloscope, and converted into unitary pulses using a level discriminator. Unitary pulses were stored on a personal computer (Macintosh II, equipped with National Instruments data acquisition interface boards) and an IBM compatible computer equipped with a CED 1401 laboratory interface. Stimulation and recording were controlled by custom-made software (Smolders and Klinke, 1986). A Sennheiser HD 414 headphone was placed at one ear for acoustic stimulation. Acoustic stimuli were generated from a Wavetek F34 waveform generator or from the computer interface boards and amplified through a conventional audio amplifier (Cyrus) and a computer-controlled homemade attenuator (1 dB steps, 0–119 dB). Sound pressure level at the eardrum was calibrated using a Bruel and Kjaer ½ inch condenser microphone (4165) equipped with a calibrated sound probe tube.
Spontaneous firing rate (SR, minimum 10 s), characteristic frequency (CF), and response threshold at CF (CF threshold) were determined either manually using the waveform generator and the attenuator, or from a computer-controlled recording of the response area as a function of the carrier frequency and the sound pressure level of tone burst stimuli (100 ms duration, 2 ms rise and fall time). To determine the strength of phase locking of single units to the acoustic input, continuous tones were presented at and near the CF, while increasing sound pressure level in 10 dB steps, starting 20 dB below CF threshold. At each stimulus presentation, ∼1000 action potentials were stored for off-line construction of period histograms. Vector strength of phase locking (VS) for a given stimulus frequency was calculated as the fundamental component of the Fourier transform of the normalized period histograms of at least 555 spikes in 64 bins per period of the sound stimulus [estimated SD of VS <0.06 (Smolders and Klinke, 1986)].
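The vector-strength calculation described above can be illustrated with a minimal sketch. It assumes spike times in seconds and a known stimulus frequency; the function names, the use of bin centers for the phase of each bin, and the sign convention for mean phase are illustrative choices, not details taken from the original analysis software.

```python
import cmath
import math

N_BINS = 64  # bins per stimulus period, as stated in the text


def period_histogram(spike_times_s, freq_hz, n_bins=N_BINS):
    """Bin spike times by their phase within the stimulus period."""
    counts = [0] * n_bins
    for t in spike_times_s:
        phase = (t * freq_hz) % 1.0  # phase in cycles, 0 <= phase < 1
        counts[min(int(phase * n_bins), n_bins - 1)] += 1
    return counts


def vector_strength(counts):
    """Return (VS, mean phase in cycles) from a period histogram.

    VS is the magnitude of the fundamental Fourier component of the
    histogram after normalizing by the total spike count.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("empty histogram")
    n_bins = len(counts)
    # Assumption: each bin is represented by the phase of its center.
    fundamental = sum(
        c * cmath.exp(-2j * math.pi * (k + 0.5) / n_bins)
        for k, c in enumerate(counts)
    ) / total
    vs = abs(fundamental)
    mean_phase = (-cmath.phase(fundamental) / (2.0 * math.pi)) % 1.0
    return vs, mean_phase
```

With perfectly phase-locked spikes all counts fall in one bin and VS is 1; a flat histogram gives a VS near 0.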
In vivo recording in alligators.
We used in vivo recording from sixteen 2- to 3-year-old juvenile alligators (70–90 cm long) to investigate the physiology and morphology of NM and NL. Head widths from members of this group, measured from earflap to earflap, were 32.8 ± 3.5 mm (n = 5). Anesthesia was induced by 4% isoflurane inhalation via a mask, followed by intubation. Body temperature was maintained at 32°C by a heating blanket wrapped around the animal and feedback-controlled by a cloacal temperature probe. EKG recordings from needle electrodes placed in the muscles of the right leg and left forelimb were constantly monitored. A constant gas flow of carbogen mixed with 2–3% isoflurane at 4 ml/min was connected via long loose fitting tube into the trachea. The head was held in a constant position by gluing a stainless-steel head post to the prefrontal bone. An opening in the frontal and parietal bones exposed the brain membranes overlying the cerebellum. In some cases a portion of the cerebellum was aspirated to expose the dorsal surface of the brainstem. Most data were obtained with tungsten microelectrodes, with impedances between 2 and 20 MΩ. We also used glass electrodes for extracellular injection of biotinylated dextran (4% DA, Invitrogen). Electrodes were positioned above the relevant brainstem area using landmarks and then advanced remotely. We continuously tested for auditory responses using a variety of monaural and binaural stimuli. Electrodes were coupled to a preamplifier and amplifier system (μA200, Walsh Electronics), and the amplified signal was filtered (high pass 300 Hz, low pass 10 kHz) and fed to an analog-to-digital converter (TDT DD1) with subsequent event counter (TDT ET1). Both the analog and the TTL signals were stored and processed by custom-written software (xdphys, Caltech). At selected recording sites, biotinylated dextran was deposited iontophoretically, usually by passing 1–2 μA positive direct current for 10–30 min. 
In other experiments, electrolytic lesions were made by passing 5–10 μA current for 10 s.
Alligators were placed in a sound-attenuating chamber for all measurements (IAC). Closed, custom-made sound systems were placed at the entrance of both ear canals, containing commercial miniature earphones and miniature microphones (Knowles EM 3068). After the sound systems were sealed into the ear canal using Gold Velvet II ear impression material, the sound systems were calibrated individually before the recordings. Acoustic stimuli (clicks and noises) were digitally generated by the same custom-written software as above, driving a signal-processing system (Tucker Davis Technology). Clicks had a rectangular form and duration of two samples (equivalent to 41.6 μs). Only condensation clicks were used. The standard click had 0 dB attenuation relative to 85 dB SPL. Stimuli were generated separately for the two ears by using a TDT AP2 signal processing board. Both channels were then fed to the earphones via D/A converters (TDT DD1), anti-aliasing filters (TDT FT6–2), and attenuators (TDT PA4). Stimuli were tone pips of 100 ms duration (including 5 ms linear ramps), presented at a rate of 5/s. Monaural frequency responses for both ipsilateral and contralateral stimulation were recorded. The mean of the best frequencies (BFs) determined from these was then taken as the frequency for measuring period histograms at 20 dB above threshold and for testing ITD selectivity with binaural stimuli. ITD was tested within ±1 stimulus period, in steps no larger than 1/10 of the period and stimulus durations of 100 ms. Stimulus levels were between 50 and 80 dB SPL and generally 10 stimulus repetitions were presented at each ITD.
Data analysis.
Analyses suggested an increase in neural thresholds over time with isoflurane anesthesia. Our preliminary observations using auditory brainstem responses (Higgs et al., 2002) had not shown an effect of anesthesia, but we observed an increase in threshold after ∼8 h. We therefore only report threshold measurements from the first 6 h of each experiment. Monaural measures of best frequency were derived by measuring changes in spike rate in response to changing tone pips at constant level. Monaural period histograms were generally measured at 20 dB above threshold, and were constructed from 100 repetitions of the tone pip. The timing of each spike relative to the zero-crossing of the stimulus was recorded with a temporal resolution of 2 μs. Period histograms, the mean phase with respect to the stimulus, and the vector strength were derived from these data (Goldberg and Brown, 1969). Rayleigh's test was applied to evaluate the statistical significance of the vector strength, using p ≤ 0.01 as the criterion value, which corresponds to a vector strength of 0.1 for 500 recorded spikes (Köppl, 1997b). For ITD curves, the mean spike rate was used for single units. ITD functions were measured at best frequency, and were fitted with a cosine function to determine best ITD, defined as the peak closest to zero ITD. In cases where the minimum of the ITD function fell close to zero ITD, click responses were used to resolve the laterality and define the best ITD. Responses to 128 clicks were stored (Wagner et al., 2005). For single-unit recordings, a peristimulus time histogram (PSTH) with a bin width of 0.02 ms was calculated, using the TTL signal. Latency was defined as the earlier of the first two consecutive bins exceeding the tallest bin in a 10 ms interval preceding the stimulus, as in Köppl and Carr (2008). The click-driven neurophonic activity contained an oscillatory response, which was analyzed using the methods in Wagner et al. (2005), modified by Köppl and Carr (2008), who adjusted the cutoff frequencies of the filter functions to a lower frequency range. Fitted waveforms of ipsilateral and contralateral responses were superimposed, and the click latency difference was calculated as the median difference between two and four consecutive maxima and minima, with positive values indicating contralateral leading.
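The cosine fit used to determine best ITD can be sketched as a linear least-squares problem, since a cosine of known frequency is a sum of a constant, a cosine, and a sine term. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the fit frequency is taken to be the stimulus frequency, and the solver (normal equations with Cramer's rule) is a convenience choice, not the procedure used in the original software.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_cosine_itd(itds_s, rates, freq_hz):
    """Fit rate = mean + b*cos(w*itd) + c*sin(w*itd) by least squares.

    Returns (best_itd_s, mean_rate, amplitude). Because atan2 returns a
    phase in (-pi, pi], best_itd_s is the fitted peak closest to zero ITD.
    """
    w = 2.0 * math.pi * freq_hz
    rows = [(1.0, math.cos(w * t), math.sin(w * t)) for t in itds_s]
    # Normal equations: (A^T A) x = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, rates)) for i in range(3)]
    det = _det3(ata)
    coeffs = []
    for col in range(3):
        m = [row[:] for row in ata]
        for i in range(3):
            m[i][col] = aty[i]
        coeffs.append(_det3(m) / det)  # Cramer's rule
    mean_rate, b, c = coeffs
    amplitude = math.hypot(b, c)
    best_itd_s = math.atan2(c, b) / w
    return best_itd_s, mean_rate, amplitude
```

A fit of this kind cannot by itself resolve laterality when the trough lies near zero ITD, which is why the click responses are needed in those cases.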
To calculate characteristic delay, ITDs were measured for at least five different frequencies. We developed an empirical criterion for the acceptance or rejection of data at particular frequencies for further analysis (see Köppl and Carr, 2008). For single units, the mean spike rate and SD were determined for each ITD and an index of modulation derived by calculating the difference between minimal and maximal mean rate and dividing it by the maximal SD observed. Data were discarded if this index was <1.5. Only 21 cases met this criterion; it is possible that longer duration stimuli and more repetitions would have yielded greater modulations over the frequency range required to evaluate characteristic delay. Characteristic phase (CP) was taken to lie between 0.25 and 0.75 cycles.
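The acceptance criterion for single-unit ITD curves can be written out as a short sketch. It assumes the data are organized as a mapping from ITD to a list of per-repetition spike rates and uses the sample SD; both the data layout and the function names are illustrative assumptions.

```python
import math
import statistics

MIN_MODULATION_INDEX = 1.5  # rejection threshold stated in the text


def modulation_index(trials_by_itd):
    """(max mean rate - min mean rate) / largest SD across ITDs.

    trials_by_itd maps each ITD to a list of spike rates, one per
    stimulus repetition (at least two repetitions per ITD).
    """
    means = [statistics.mean(v) for v in trials_by_itd.values()]
    sds = [statistics.stdev(v) for v in trials_by_itd.values()]
    max_sd = max(sds)
    if max_sd == 0:
        return math.inf  # no trial-to-trial variability at any ITD
    return (max(means) - min(means)) / max_sd


def accept_itd_curve(trials_by_itd):
    """True if the curve is modulated strongly enough to be analyzed."""
    return modulation_index(trials_by_itd) >= MIN_MODULATION_INDEX
```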
Histology.
After injection with pentobarbital sodium and phenytoin sodium (Euthasol; Virbac), alligators were fixed by cardiac perfusion with 4% paraformaldehyde, the brains were extracted and cross-sectioned on a freezing microtome. Biotinylated dextran was visualized using standard ABC (Vector Laboratories) and diaminobenzidine protocols on floating sections. Finally, the sections were serially mounted and counterstained with either cresyl violet (lesions) or neutral red (dextran).
Results
Since no previous studies had examined ITD coding in the Crocodilia, we designed experiments to address the major requirements for ITD coding. We measured the ability of the first-order NM neurons to encode stimulus phase, which we compared with data from Caiman auditory nerve (Smolders and Klinke, 1986). We then performed in vivo experiments in alligators to show that NL neurons acted as coincidence detectors and were sensitive to ITD. Finally, we used the alligator data to constrain and adapt models of ITD coding (Grau-Serrat et al., 2003) to determine whether the mechanisms of ITD detection in crocodilians could be similar to those of birds.
The auditory hindbrain nuclei are organized similarly in birds and crocodilians, consistent with their position within the archosaurs (Hedges and Poling, 1999; Carr and Code, 2000; Iwabe et al., 2005). NM contains large round cell bodies in a nucleus that extends from caudolateral to rostromedial in the dorsal medulla, and is located on either side of the midline, where it forms a compact mass of cells on the dorsal surface of the brainstem, medial to the auditory nerve root (Fig. 1A,B; see Fig. 7G). At the level of the entry of the auditory nerve root, NM expands laterally, while the rostral portion is located in a band of cells medial and dorsal to NL. NL is composed of a compact layer of bitufted neurons, in an oval-shaped nucleus except at its most caudal extent, where it forms a lateral bulge (Fig. 1B). NL is surrounded by fibers from NM, and its dorsal border is composed of afferents from the ipsilateral NM and a small number of afferent axons from the auditory portion of the eighth nerve. The ventral border of NL consists of fibers from the contralateral NM. NM and NL were tonotopically organized; rostromedial NL received high-best-frequency inputs (>1 kHz) and caudal NL received lower-best-frequency inputs (Manley, 1970).
Auditory nerve and NM neurons encode the phase of the auditory stimulus
We report on 288 caiman auditory nerve fibers with CFs between 0.2 and 3 kHz, and on 54 alligator NM units with BFs between 0.2 and 2.1 kHz (Fig. 2A,B). In caiman nerve, data points were selected from stimulation near the CF (within ±0.1 octave), with levels at least 20 dB above CF-rate threshold. Phase locking decreased with increasing best frequency but remained statistically significant throughout the range tested (0.2–2.1 kHz). In alligator NM, phase locking decreased with increasing best frequency, and similarly remained statistically significant throughout the range tested (Fig. 2B). Vector strengths at the lowest frequencies tested (0.2–0.4 kHz) were occasionally degraded because of double spiking evident as two evenly spaced peaks within the phase histogram (Fig. 2C, inset) (peak splitting). The highest vector strengths, ∼0.95, were observed between 200 and 500 Hz. Vector strength then declined fairly steeply between 500 Hz and ∼1.5 kHz, although two additional units with significant vector strengths were recorded at ∼2 kHz. Vector strength also showed a large variation that could not be attributed to peak splitting, and it is possible that units recorded in NM included presynaptic auditory nerve fibers. No tests were performed to differentiate between these two possibilities, although the locations of lesions showed that most recordings were within the cellular region of NM. Units were confirmed to be in NM either by the presence of lesions (n = 2) or by biotinylated dextran injections in NM (n = 5). In most experiments, lesions were made below NM, in the NL cell body layer (see below, also Fig. 7G), and locations within NM were inferred by measurements of recording depth. In those experiments in which the cerebellum was removed, NM was also identified by dorsal midline bulges on either side of the ventricle.
Alligator NM neurons were monaural and responded in a primary-like manner. The short intervals of the initial burst, and the high level of tonic firing, may be seen in a peristimulus time histogram (Fig. 2E). Spontaneous firing rates in NM were low, being 2.6 ± 3.8 spikes/s (n = 71), in contrast to those measured for avian NM [chicken 86 spikes/s (Salvi et al., 1992); barn owl 219.4 spikes/s (Köppl, 1997a)], although it should be noted that different anesthetic regimens were used for each study. Best frequencies in NM ranged from 0.2 to 2.2 kHz (Fig. 2D) and thresholds ranged widely, even within a given ear [responses above spontaneous rate elicited at 41 ± 14.6 dB (n = 31)] (Fig. 2F). These values were, however, consistent with auditory brainstem response thresholds of 36–38 dB between 1000 and 1500 Hz (Higgs et al., 2002).
Alligator NL neurons are sensitive to ITD in vivo
ITD-sensitive responses were recorded from 73 NL neurons with best frequencies between 200 and 1500 Hz (Fig. 3). Three recordings at ∼2000 Hz showed poor ITD tuning. Sixty-three recordings were single units, and 10 were neurophonic recordings (Köppl and Carr, 2008). All neurons were binaurally excited, and shown to be in NL by the presence of lesions either within or close to the cell body layer of NL (n = 14 lesions in 8 animals). In all cases we measured best frequency and determined best ITD. Where possible, we recorded each neuron's responses to three stimulus sets. For the first set, we recorded responses to ipsilateral and contralateral clicks to determine whether click delay differences predicted best ITD (55/73). We then recorded phase-locked responses to monaural and binaural tonal stimuli at or near BF to measure monaural phase locking and binaural sensitivity to ITD (46/73). Finally, ITD sensitivity was measured at several different frequencies to calculate characteristic delay (21/73).
Selectivity to ITD was found in all single-unit and neurophonic recordings in NL (Fig. 3). Examples in Figure 3 show the responses of seven neurons and one neurophonic recording with different best frequencies, and plot the number of spikes evoked as a function of the ITD at or close to the neuron's best frequency. The responses of NL neurons varied in a cyclic manner with the ITD of a sound stimulus, and the period of the ITD response function matched that of the stimulus tone. Neurons were sensitive both to tones around best frequency and to broadband noise (Fig. 3B,C,E). Most single-unit ITD curves were not as smooth as those observed in birds and mammals, while neurophonic ITD curves were generally smoother (Fig. 3D). The effect of increased numbers of stimulus repetitions was tested in two units; the resulting ITD curves became smoother although best ITD did not shift in these examples (supplemental Fig. 1, available at www.jneurosci.org as supplemental material). Responses to the least favorable delay dropped below the level of the monaural responses, although not below the generally low level of spontaneous activity (median 16 spikes/s).
Monaural phase sensitivity predicts best ITD in laminaris neurons
NL neurons phase locked to the auditory stimulus and could be driven by both monaural and binaural stimulation (Fig. 4A; see Fig. 8A,B). The phase of each monaural response reflected the delay between the ear and the recording point. When an equal but opposite interaural delay nullified the internal delay, the two phase-locked responses coincided to produce a large binaural response (Fig. 4A,B). A fundamental tenet of ITD detection by coincidence detection is that the phase difference between the monaural phase-locked responses should predict the best ITD. This prediction holds for the neurons in the alligator NL, which phase locked to stimulation of either contralateral or ipsilateral ears. Generally, inputs were not exactly matched in strength (see Fig. 8A–C). These differences did not, however, reflect a consistent difference between ipsilateral and contralateral inputs, since ipsilateral inputs evoked ∼54 ± 17% (n = 32) of the response. Comparison of the ipsilateral and contralateral period histograms showed that differences in mean phase predicted the best ITD for 46 neurons (Fig. 4D). The mean ITD predicted from these mean phase comparisons was large, being 504 ± 285 μs (n = 46), with a median interaural phase difference (IPD) of 0.29 cycles. Phase differences between the two ears were generally such that, to bring peaks into coincidence, the stimulus to the ipsilateral ear must be delayed with respect to the contralateral ear. This “predicted ITD” was compared with the “observed best ITD” obtained from the ITD curves. Observed and predicted peaks were well correlated (y = 0.90x − 45.5, r2 = 0.90), and generally lay at or contralateral to zero ITD, between 0 and 1200 μs, i.e., predominantly in the contralateral sound field. Vector strengths varied with ITD, and were highest for favorable ITDs (Fig. 4C).
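The conversion from monaural mean phases to a predicted ITD can be sketched as follows. The sketch assumes that mean phases are expressed as lags in cycles, so that a larger contralateral lag means the contralateral sound must lead; this sign convention and the function name are illustrative assumptions. The difference is wrapped to within half a cycle, which is why phase alone cannot identify best IPDs beyond 0.5 cycles.

```python
def predicted_itd_us(ipsi_phase_cycles, contra_phase_cycles, freq_hz):
    """Predicted best ITD in microseconds from monaural mean phases.

    Positive values indicate contralateral leading (assumed convention:
    phases are response lags in cycles). The phase difference is wrapped
    to [-0.5, 0.5) cycles before conversion to time.
    """
    ipd = contra_phase_cycles - ipsi_phase_cycles
    ipd = ((ipd + 0.5) % 1.0) - 0.5
    return ipd / freq_hz * 1e6
```

As an illustration, an IPD of 0.29 cycles (the median reported above) at a hypothetical stimulus frequency of 600 Hz corresponds to about 483 μs.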
Click responses predict best ITD
Click stimuli were used to measure the conduction time from the ear to the point of recording in NL. We used the arrival of the first wave front or signal front delay as a measure of latency, and either used the single units' responses or averaged waveforms to calculate click delay differences. We analyzed phase delay data to determine whether this measure of delay reflected the best ITD. Clicks were repeated 128 times, and evoked precisely timed action potentials in well isolated units (Fig. 5B; inset shows ITD from same neuron), or oscillatory responses from NL neurophonics or less well isolated units (Fig. 5C). We could typically distinguish several peaks, with the first response peak, often barely visible in individual traces, but reliably detected by a detection algorithm, taken as the signal-front delay. We determined the phase delay from the time of occurrence of the first large peak (Fig. 5B), and calculated click time difference by subtracting the phase delay of the leading click from the following click. These click delays were also useful in determining the leading ear for mean phase responses. Like the mean phase differences, interaural click delays were well correlated with best ITD (y = 0.97x − 18.73, r2 = 0.89) (Fig. 5A). The mean ITD predicted from these click difference comparisons was 425 ± 302 μs (n = 55).
Characteristic delays and best ITD
NL neurons are maximally excited when the interaural delay exactly compensates for the difference in the neural conduction times from the ears (Fig. 6). This difference in neural travel time is termed the characteristic delay (CD). We derived CDs by fitting ITD curves to a range of frequencies around best frequency to cosine functions in which peak positions could be precisely measured (Peña et al., 1996; Viete et al., 1997). Plotting the mean interaural phase against frequency yields a line whose slope is the frequency-independent ITD or CD (Rose et al., 1966; Yin and Kuwada, 1984). This procedure generally works well for ITD curves for low-frequency-sensitive neurons (Yin and Chan, 1990), although in approximately half the cases in the alligator, ITD curves were sufficiently irregular to preclude computation of CD. Figure 6 illustrates four examples, a single unit with BF of 1050 Hz (one of the highest frequencies recorded) and three other units with CFs near 700–900 Hz. Characteristic delay predicted best ITD (y = 0.97x − 142, r2 = 0.77) although the relationship was not as tight as that of mean phase and click differences (Fig. 6E). Mean ITD from the characteristic delay measures was 427 ± 308 μs (n = 21).
The tendency of each neuron to be maximally or minimally excited at the same ITD at different frequencies was quantified by measuring its CP. The CP was obtained from the intercept of the fitted line with the ordinate (Yin and Kuwada, 1983). A CP near 0 cycles (Fig. 6A,C,D) indicated that the neuron was of the peak type (the peaks of the ITD plots were more coincident than the troughs), whereas a CP of 0.5 indicated a trough type. Figure 6B shows an intermediate CP value of 0.28, termed a “sloper,” i.e., characterized by delay curves that did not align at either the peak or the trough, but rather at an intermediate point (Batra et al., 1997). To assess how well ITD-sensitive units could be divided into peak- and trough-type categories, we examined the distribution of CP (Fig. 6, insets). Characteristic phases were expressed on a scale from −0.5 to +0.5 cycles and covered nearly that whole range (−0.45 to +0.38, n = 21). Their distribution was not uniform, with all but three values falling within ±0.3. The median CP was −0.16. This means that in the majority of cases, the CD fell near a peak in the ITD responses, as for the examples shown in Figure 6. Nevertheless, only half (11 units) had CPs close (±0.1 cycles) to the peak of the ITD curve. These “peak-type” units discharged maximally near a particular ITD across frequencies, consistent with an excitatory mechanism, like that observed in avian NL and mammalian MSO. Units with CPs closer to 0.5 suggest the presence of additional inhibitory interactions, either at the periphery or in the projections to NL.
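The derivation of CD and CP from the phase-frequency plot can be sketched as an ordinary linear regression. The sketch assumes unwrapped best IPDs in cycles at each test frequency; the function names are hypothetical, the ±0.1 cycle tolerance for peak-type units follows the text, and applying the same tolerance to trough-type units is an assumption made here for illustration.

```python
def characteristic_delay(freqs_hz, best_ipds_cycles):
    """Regress best IPD (cycles) on frequency (Hz).

    Returns (cd_s, cp_cycles): the slope is the characteristic delay in
    seconds, and the intercept, wrapped to [-0.5, 0.5), is the
    characteristic phase.
    """
    n = len(freqs_hz)
    if n < 2 or n != len(best_ipds_cycles):
        raise ValueError("need matching lists with at least two points")
    mean_f = sum(freqs_hz) / n
    mean_p = sum(best_ipds_cycles) / n
    sxx = sum((f - mean_f) ** 2 for f in freqs_hz)
    sxy = sum((f - mean_f) * (p - mean_p)
              for f, p in zip(freqs_hz, best_ipds_cycles))
    cd_s = sxy / sxx
    intercept = mean_p - cd_s * mean_f
    cp_cycles = ((intercept + 0.5) % 1.0) - 0.5
    return cd_s, cp_cycles


def classify_cp(cp_cycles, tol=0.1):
    """Label a unit by CP: peak near 0, trough near +/-0.5, else intermediate."""
    if abs(cp_cycles) <= tol:
        return "peak"
    if abs(cp_cycles) >= 0.5 - tol:  # assumption: same tolerance as for peaks
        return "trough"
    return "intermediate"
```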
NL neurons encode locations in the contralateral hemifield; range of best IPDs and ITDs
In each NL, most neurons responded maximally to sounds leading in time at the contralateral ear (positive ITDs). On the right-hand side, one response was recorded with an ipsilateral leading best ITD near 0 μs. The larger spread in ITDs recorded on the right side may be because more neurons (n = 28 left, n = 45 right) were recorded over a greater expanse of NL (summary plots in Figs. 4D, 5A). In general, maximal responses were evoked by large ITDs, with a median best ITD of 450 μs, outside the presumed physiological range of the alligator studied here. Best ITDs ranged from −50 to +1250 μs (Fig. 7A), while best IPDs ranged from −0.05 to +0.77 cycles (Fig. 7B) (median +0.280; positive values indicating contralateral leading, negative values indicating ipsilateral leading). Note that best IPDs beyond ±0.5 cycles were identified by monaural click responses and/or the CD, both of which provide information about laterality (Köppl and Carr, 2008). Sixteen single units and one neurophonic recording had best IPDs beyond 0.5 cycles (Fig. 8D,E). Recordings from the alligator NL show hardly any representation in the ipsilateral hemifield, and a much wider distribution than the available biological range in the contralateral hemifield, unless the range is much larger than that predicted by the interaural distance. The distribution of best ITDs was strikingly different from that observed in the chicken or gerbil (Fig. 7D,E). In the chicken, the mean characteristic delay distribution is, like that of the gerbil, peaked near 90 μs (Pecka et al., 2008). Despite the broad distribution of best ITDs and IPDs in the alligator, lesion locations were consistent with a map of ITD, in that medial lesion locations were closer to 0 μs (median 200 μs) than lateral lesions (median 500 μs) (Fig. 7F).
NL neurons show no significant correlation of frequency difference and best ITD
The stereausis theory assumes that differences in propagation time along the basilar membrane can provide the necessary delays, if the coincidence detectors receive input from fibers innervating different loci on the left and right basilar membranes (Shamma et al., 1989; Peña et al., 2001). If this theory applies, differences in the frequency tuning of the left and right inputs to coincidence detectors should predict best ITD. The two inputs to alligator NL neurons were not always matched in frequency (Fig. 8A), but best ITDs were not correlated with the interaural frequency mismatches (Fig. 8C). Under stereausis, the largest best ITDs should accompany the largest frequency mismatches, but no significant correlation was observed between frequency difference and best ITD (y = −0.068x + 30.16, r² = 0.05). Best ITDs did show less scatter with increasing BF for all recordings, i.e., their range increased with decreasing BF (Fig. 8D), whereas best IPDs showed no systematic change with BF (Fig. 8E). Furthermore, some best ITD values lay beyond ±0.5 cycles, the “π-limit.”
In vitro physiology and models of coincidence detection
To determine whether the mechanisms of crocodilian ITD detection were similar to those of birds, we developed a model based on a simulation of ITD detection in chicken and barn owl (Grau-Serrat et al., 2003), but with specific crocodilian modifications (Soares et al., 1999). Phase-locked responses from caiman auditory nerve and alligator NM were used to construct plausible input spike sequences for the model (Fig. 2). Since measurements from caiman auditory nerve were substantially in agreement with the smaller dataset recorded from alligator NM, we used the NM dataset to constrain the vector strength of the inputs to the model of coincidence detection in the crocodilian NL. Inputs were modeled as an inhomogeneous Poisson process with vector strength VS = 0.925 − 0.3f (where f is frequency in kilohertz), which is the best linear fit to the NM data shown in Figure 2B (e.g., 0.82 at 350 Hz and 0.62 at 1 kHz).
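One way to generate such inputs is sketched below. The text specifies only an inhomogeneous Poisson process with the stated vector strength. The rest is an assumption made for illustration: the von Mises (circular normal) shape of the rate function, the thinning method used to sample it, and all function names. The mean rate of 100 spikes/s matches the single-fiber rate used in the model.

```python
import math
import random


def input_vector_strength(freq_khz):
    """Linear fit quoted in the text: VS = 0.925 - 0.3 f, with f in kHz."""
    return 0.925 - 0.3 * freq_khz


def bessel_i0_i1(kappa, terms=60):
    """Modified Bessel functions I0 and I1 from their power series."""
    half = kappa / 2.0
    i0 = sum(half ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))
    i1 = sum(half ** (2 * k + 1) / (math.factorial(k) * math.factorial(k + 1))
             for k in range(terms))
    return i0, i1


def kappa_for_vs(target_vs, lo=1e-6, hi=20.0):
    """Bisect for the concentration kappa giving I1/I0 == target_vs."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        i0, i1 = bessel_i0_i1(mid)
        if i1 / i0 < target_vs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phase_locked_spikes(freq_hz, mean_rate_hz, duration_s, rng):
    """Inhomogeneous Poisson spike times, sampled by thinning.

    Assumed rate: mean_rate * exp(kappa * cos(2*pi*f*t)) / I0(kappa).
    """
    kappa = kappa_for_vs(input_vector_strength(freq_hz / 1000.0))
    i0, _ = bessel_i0_i1(kappa)
    peak_rate = mean_rate_hz * math.exp(kappa) / i0
    t, spikes = 0.0, []
    while True:
        t += rng.expovariate(peak_rate)      # candidate from homogeneous process
        if t >= duration_s:
            return spikes
        phase = 2.0 * math.pi * freq_hz * t
        if rng.random() < math.exp(kappa * (math.cos(phase) - 1.0)):
            spikes.append(t)                 # accept with prob rate(t) / peak_rate


def vector_strength(spikes, freq_hz):
    """Length of the mean resultant vector of spike phases."""
    if not spikes:
        return 0.0
    c = sum(math.cos(2.0 * math.pi * freq_hz * t) for t in spikes)
    s = sum(math.sin(2.0 * math.pi * freq_hz * t) for t in spikes)
    return math.hypot(c, s) / len(spikes)
```

Calling `phase_locked_spikes(350.0, 100.0, 20.0, random.Random(0))` and passing the result to `vector_strength` should return a value close to the target of 0.82, up to sampling noise.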
Biophysical data were provided by in vitro studies of alligator embryos. Transverse slices revealed that NL neurons were arranged tonotopically along the longitudinal axis of the nucleus from the high-best-frequency rostromedial region to the low-frequency caudolateral region (Figs. 1, 9A). The dendrites of NL neurons were directed dorsally and ventrally and received excitatory inputs from the ipsilateral or contralateral NM, respectively (Fig. 9B). Dendritic length changed systematically along the tonotopic axis, with neurons in the low-frequency region having longer, more branched dendrites, while those in the high-frequency region had shorter, less branched dendrites (our unpublished data). In the slices, we recorded from neurons without particular consideration of their location. Using the KGlu intracellular medium, mean resting potentials of −60.8 ± 1.9 mV (n = 19) were attained immediately after starting whole-cell recording. When a suprathreshold depolarizing current was injected into an NL neuron, the cell generated a single action potential at the onset of membrane depolarization, as do similar neurons in birds and mammals (Reyes et al., 1996; Smith, 1995) (Fig. 9A). The steady-state voltage responses showed outward-going rectification, and a prominent sag at hyperpolarized membrane potentials (Reyes et al., 1996; Funabiki et al., 1998). Electrical stimulation of ipsilateral or contralateral projections from NM evoked subthreshold EPSPs (Fig. 9C). The shape of a single EPSC was modeled as an α function with a time constant of 0.25 ms and a peak conductance of 0.007 mS, consistent with these recorded EPSPs. For the model, the firing rate of a single NM fiber was set at ∼100 spikes/s (Fig. 2D,F). Based on a synapse count from exemplar low-, medium-, and high-best-frequency NL neurons, the estimated number of synapses per dendrite ranged from 40 for the highest CF model neuron up to 100 for the lowest CF model neuron (C. E. Carr and D. Soares, unpublished observations). The model assumed an axonal fan-out ratio of 4, meaning each excitatory NM axon makes four synapses on each NL cell it innervates; i.e., the number of independent NM fibers ranged correspondingly from 10 to 25. Leak conductance was set to 42 mS/cm² everywhere (except for the myelinated portion of the axon, which was set to 0.8 mS/cm²), consistent with the measured input resistance of 140 MΩ (Fig. 9, legend). The temperature was set to 25°C, except where noted below. Finally, the minimum inter-EPSP interval was set to 1.2 ms.
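The synaptic parameters above can be made concrete with a short sketch. The α function is written in one common normalization, chosen here so that the conductance peaks at exactly the stated value when t equals the time constant; the text does not say which form the model used. The function names are hypothetical, and the conductance is left in the units given in the text.

```python
import math


def alpha_conductance(t_ms, tau_ms=0.25, g_peak=0.007):
    """Alpha-function synaptic conductance.

    g(t) = g_peak * (t / tau) * exp(1 - t / tau) for t >= 0, else 0.
    Peaks at t == tau with value g_peak (same units as g_peak).
    """
    if t_ms < 0.0:
        return 0.0
    x = t_ms / tau_ms
    return g_peak * x * math.exp(1.0 - x)


def independent_fibers(synapses_per_dendrite, fan_out=4):
    """Number of distinct NM fibers when each fiber makes `fan_out` synapses."""
    return synapses_per_dendrite // fan_out


if __name__ == "__main__":
    # Conductance time course sampled every 0.25 ms over the first 1.5 ms.
    for step in range(7):
        t = 0.25 * step
        print(f"t = {t:.2f} ms  g = {alpha_conductance(t):.5f}")
    # Fiber counts for the highest- and lowest-CF model neurons.
    print(independent_fibers(40), independent_fibers(100))
```

With a fan-out of 4, the 40 to 100 synapses per dendrite correspond to the 10 to 25 independent fibers stated in the text.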
At low frequencies, the model NL neurons fire poorly when their ipsilateral and contralateral inputs are out of phase, gradually firing more at higher frequencies (Fig. 9, out-of-phase firing rate). NL neurons fire strongly when their ipsilateral and contralateral inputs are in phase, the hallmark of a good ITD-sensitive coincidence detector. At higher frequencies, model NL firing rates decrease, even when their ipsilateral and contralateral inputs are in phase, meaning that their ability to discriminate between ITDs worsens. Discrimination is good until ∼1 kHz, an octave lower than in the related chicken model (Grau-Serrat et al., 2003), consistent with both the smaller leak conductance and the longer EPSP duration observed in alligator in vitro recordings. Interestingly, the model could not discriminate ITDs (except at the very lowest frequencies) when the synapses on each dendrite arose from independent fibers, indicating that the synaptic redundancy is critical, and predicting that each NL neuron received more than one synapse from each NM neuron. This is consistent with the similar effect seen experimentally in chicken NL in vitro (Reyes et al., 1996). The model was also run at two additional temperatures, 15°C and 35°C, as representative of a plausible range of operating temperatures [Q10 values ranged from 2 to 3, as in Grau-Serrat et al. (2003); see supplemental Fig. 2, available at www.jneurosci.org as supplemental material]. Relative to 25°C, at 35°C the in-phase firing rates grow, giving overall improved ITD discrimination, especially at the higher frequencies, consistent with our in vivo results. As part of the same trend, relative to 25°C, at 15°C the in-phase firing rates shrink, giving overall poorer ITD discrimination, again especially at the higher frequencies. This implies that maintaining higher temperatures, when possible, may give an overall advantage for sound localization.
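The temperature dependence can be illustrated with the standard Q10 relation, in which a rate is multiplied by Q10 for every 10°C of warming. This is a generic sketch: the passage does not say which model quantities were scaled, so the function below is a hypothetical helper that applies to any rate constant, with 25°C as the reference temperature used in the text.

```python
def q10_scale(rate_at_ref, temp_c, ref_temp_c=25.0, q10=2.0):
    """Scale a rate from the reference temperature to temp_c using Q10."""
    return rate_at_ref * q10 ** ((temp_c - ref_temp_c) / 10.0)


if __name__ == "__main__":
    # Relative speed of a process at the three simulated temperatures,
    # for the two ends of the Q10 range quoted in the text.
    for q10 in (2.0, 3.0):
        factors = [q10_scale(1.0, temp, q10=q10) for temp in (15.0, 25.0, 35.0)]
        print(q10, [round(x, 3) for x in factors])
```

For Q10 between 2 and 3, a process runs two to three times faster at 35°C than at 25°C, and two to three times slower at 15°C.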
Discussion
Crocodilians are successful predators that hunt both on land and in water. Eyes located on top of their heads provide 25° of binocular vision to aid in localizing prey (Fleishman et al., 1988). They also have specialized epidermal sensory organs that detect small disruptions in the surface of the surrounding water (Soares, 2002). We show here that crocodilians have well-developed neural circuits for encoding ITD. The neurons in NL act as coincidence detectors, both in vivo and in vitro, and encode ITD. Behavioral evidence of sound localization is lacking, but most crocodilians are nocturnal hunters, and produce a loud roaring call (Todd, 2007). Furthermore, females can localize the contact calls made by their young (Hunt and Watanabe, 1982; Passek and Gillingham, 1999), and so we expect that sound source localization is behaviorally relevant to this group.
Sound localization is evolutionarily important, although not all animals localize sound well. In mammals, threshold acuities range from ∼1° for elephants and humans to >25° for gerbils and horses (Heffner and Heffner, 1992a). A similarly wide range has been found for birds, with best acuities of ∼3° in the barn owl (Bala et al., 2003). Heffner and Heffner (1992a,b) proposed that variation in acuity in mammals is best accounted for by the requirement for visual orientation to sound, in that species with broad fields of best vision require less accurate information than foveate species. This hypothesis has not been tested in birds or crocodilians, since fewer data are available for localization and gaze direction in birds (Klump, 2000; van der Willigen et al., 2002), and there are none for crocodilians.
Implementation of Jeffress model
Both avian and crocodilian auditory circuits appear to conform to the requirements of the Jeffress model (Jeffress, 1948; Joris et al., 1998; Grothe et al., 2005). The auditory nerve and NM phase lock to sound in birds and crocodilians (Köppl, 1997b), while NM's target neurons in NL act as coincidence detectors for both tones and noise. Internal delays, equal and opposite to interaural delays, characterize barn owls (Carr and Konishi, 1990; Peña et al., 2001), chickens (Overholt et al., 1992; Funabiki et al., 1998; Köppl and Carr, 2008), and alligators (this study). Best delays in NL are such that neurons respond maximally to sound sources in the contralateral hemifield. Similarly, contralateral click delays are longer than ipsilateral (Wagner et al., 2005; Köppl and Carr, 2008). Thus, the axonal delays from NM appear sufficient to account for the range of observed ITDs. Cochlear disparities or stereausis are an alternative to axonal delays (Shamma et al., 1989; Peña et al., 2001), but when frequency tuning of alligator NL neurons was compared for ipsilateral and contralateral stimulation, best ITDs were not correlated with the interaural frequency mismatches.
ITD sensitivity emerges from phase-locked excitatory inputs in birds (Carr and Konishi, 1990; Ashida et al., 2007; Köppl and Carr, 2008), and mammals (Goldberg and Brown, 1969; Yin and Chan, 1990; Spitzer and Semple, 1995; Batra et al., 1997; Batra and Yin, 2004; Svirskis et al., 2004; Zhou et al., 2005). Simultaneous arrival of the signals from the two ears produces maximal excitation so that peak-type neurons act as coincidence detectors. Most alligator NL neurons had CPs clustered near zero (median −0.16 cycles) (Fig. 6E), as would be expected from coincidence detection between excitatory inputs (Yin and Kuwada, 1983), and consistent with the results of our model of coincidence detection. There were, however, three neurons with CPs closer to 0.5, suggesting the presence of additional inhibitory interactions, either at the periphery or in the projections to NL. The role of inhibition in ITD processing is unknown in alligators, and differs between birds and mammals. In birds, GABAergic inhibition originates from the superior olive and adjusts the gain of the circuit to compensate for changes in loudness (Yang et al., 1999; Burger et al., 2005) (for review, see Hyson, 2005). Projections from the olive also act to suppress firing at bad ITDs in low-best-frequency neurons (Nishino et al., 2008). Mammals, in addition to GABAergic inhibition, have temporally precise glycinergic inhibition to the MSO that can shift ITD sensitivity (Brand et al., 2002; Grothe, 2003; Pecka et al., 2008).
An additional feature of the Jeffress model is the systematic representation of ITD, which creates a place code of azimuthal position. There is support for a place code in the barn owl in vivo (Carr and Konishi, 1990) and in the chicken, both in vitro (Overholt et al., 1992) and in vivo (Köppl and Carr, 2008). Data from the alligator also support a place code, in that the distribution of lesions exhibits a trend from medially located best ITDs near 0 to laterally located best ITDs in the contralateral hemifield. The range of best ITDs is large, however, with median values of ∼450 μs, compared with ∼90 μs in chicken (Köppl and Carr, 2008) and 173 μs in the gerbil (Pecka et al., 2008). Although the 2- to 3-year-old alligators had larger heads than chickens or gerbils, with ears ∼3 cm apart, many peaks of the ITD tuning curves were at very large ITDs. There are, however, no measures of the natural range of ITDs in crocodilians, and crocodilian head size increases throughout life to a head width of nearly 30 cm in alligators [estimated from maximum size data from Woodward et al. (1995)]. Furthermore, Owen (1850) has described a connection between the tympanum and palate in the Crocodilia, and writes “I forbear, with my present limited experience of the living habits and actions of the Crocodilian Reptiles, to offer any hypothesis as to the function of the complex canals which conduct the air and would convey its sonorous vibrations from the nose to the ear: but one peculiarity I may suggest, as being probably related to the structures in question, in which the Crocodiles and Gavials differ from all the Lizard-tribe, viz. that of habitually floating with the operculated meatus externus submerged, and only the eyes and the prominent nostril exposed above the surface of the water. 
Any noise in the air that might reach the floating reptile would, under such conditions, be conveyed to the tympanum by the canals conducting to that cavity from near the hinder opening of the long nasal passage; and it may also be remembered, that there is a peculiar valve in the Crocodiles which shuts off all communication between that passage and the mouth” [see also Colbert (1946)]. Thus, discussion of the effective interaural distances in crocodilians awaits evaluation of the role of these interaural passages in air and water, and recordings from a larger size range of animals. It is likely that crocodilian ears function as pressure difference receivers, as is also the case in birds (Calford and Piddington, 1988; Hyson et al., 1994; Klump, 2000; Larsen et al., 2006). In birds, cochlear microphonic recordings show that the interaural canal enhances interaural delay. In chicks, effective ITDs increased from ∼100 μs at 4 kHz to ∼180 μs at 800 Hz (Hyson et al., 1994), while Calford and Piddington (1988) studied six species with sounds as low as 500 Hz, and found effective ITDs increased by more than a factor of 3 at low frequencies. Thus the interaural canal increases the range of ITDs (and interaural level differences) at low frequencies in birds and may exert a similar effect in crocodilians. Physiological recordings from chicken NL also support “stretching” of the map of ITD at low frequencies, since the mapped range is larger in the low-best-frequency region of the nucleus (Köppl and Carr, 2008).
The presence or absence of a map of ITD is relevant, because data from gerbil MSO and guinea pig inferior colliculus have been used to support the hypothesis that MSO neurons encode locations in the azimuthal plane by a modulation of their output rate, a slope code, and not by a place code (McAlpine et al., 2001; Brand et al., 2002; McAlpine and Grothe, 2003; Palmer, 2004; Pecka et al., 2008). This dichotomous view may not be appropriate for the alligator or other archosaurs. In the barn owl, behavioral and physiological studies show that both the map of ITD created by the place code, and the information in the slope of the ITD function, may be used (Takahashi et al., 2003; Bala et al., 2007). In the chicken, the mean characteristic delay distribution is, like that of the gerbil, peaked near 90 μs, but the distribution emerges from a place code (Young and Rubel, 1983; Köppl and Carr, 2008). The distributions of best ITD in the alligator suggest that they too could use the information in both the slopes and the peaks of the ITD curve. Furthermore, an information theoretic analysis supports Takahashi et al.'s (2003) observation that place and slope codes are not mutually exclusive strategies. Instead, the relative importance of the two tuning curve features depends on variability in the neuronal response; best encoding may transition from high-slope to high-firing-rate regions of the tuning curve with increasing noise level (Butts and Goldman, 2006).
Evolution of binaural circuits
Archosaurs are thought to have developed binaural hearing in parallel to mammals (Grothe et al., 2005). This assumption rests on paleontological studies, which show that middle ears developed independently in the major tetrapod groups—the anurans, turtles, lepidosaurs, archosaurs, and mammals (Clack, 1997). The appearance of a tympanum and middle ear would have led to changes in the central auditory processing of both high-frequency sound and directional hearing (Christensen-Dalsgaard and Carr, 2008). Clack and Allin (2005) have argued that similarities between birds and crocodiles may be general archosaur synapomorphies, allowing us to hypothesize that binaural circuits sensitive to ITD also characterized the auditory systems of dinosaurs. It is likely that dinosaurs were both vocal and sensitive to low-frequency sound; regression analyses in living archosaurs of best audiogram frequency versus body mass suggest that hearing in large dinosaurs was restricted to low frequencies (<3 kHz) (Gleich et al., 2005), and perhaps similar to that in extant crocodilians.
Footnotes
- The alligator work was supported by National Institutes of Health (NIH) Grant DC00436 to C.E.C., by NIH Grant R03DC04382 to J.Z.S., and by NIH Grant P30 DC0466 to the University of Maryland Center for the Evolutionary Biology of Hearing. D.S. was supported by National Institute of Mental Health Grant T32MH2004801. The caiman work was supported by Deutsche Forschungsgemeinschaft Grant SFB 45, 269 to J.S. We acknowledge the assistance of the students and faculty of the Neural Systems and Behavior course in Woods Hole, MA, especially Dr. Carmen de Labra and Dr. Sandra Wohlgemuth, who performed the first in vitro recordings from alligator embryos. We thank Dr. Ruth Elsey of the Rockefeller Refuge for help with collecting alligators. We thank Dr. Peña for providing Matlab scripts for calculating characteristic delay, and Dr. Wagner for click analysis scripts. Dr. Pecka kindly provided his gerbil ITD data.
- Correspondence should be addressed to Catherine E. Carr, Department of Biology and Neuroscience and Cognitive Science Program, University of Maryland, College Park, MD 20742-4415. cecarr{at}umd.edu