Abstract
The optic tectum (OT) is an avian midbrain structure involved in the integration of visual and auditory stimuli. Studies in the barn owl, an auditory specialist, have shown that spatial auditory information is topographically represented in the OT. Little is known about how auditory space is represented in the midbrain of birds with generalist hearing, i.e., the majority of avian species, which lack peripheral adaptations such as facial ruffs or asymmetric ears. Thus, we conducted in vivo extracellular recordings of single neurons in the OT and in the external portion of the formatio reticularis lateralis (FRLx), a brain structure located between the inferior colliculus (IC) and the OT, in anaesthetized chickens of either sex. We found that most of the auditory spatial receptive fields (aSRFs) were spatially confined both in azimuth and elevation, divided into two main classes: round aSRFs, mainly present in the OT, and annular aSRFs, with a ring-like shape around the interaural axis, mainly present in the FRLx. Our data further indicate that interaural time difference (ITD) and interaural level difference (ILD) play a role in the formation of both aSRF classes. These results suggest that, unlike mammals and owls which have a congruent representation of visual and auditory space in the OT, generalist birds separate the computation of auditory space between two different midbrain structures. We hypothesize that the FRLx-annular aSRFs define the distance of a sound source from the axis of the lateral visual fovea, whereas the OT-round aSRFs are involved in multimodal integration of the stimulus around the lateral fovea.
SIGNIFICANCE STATEMENT Previous studies implied that auditory spatial receptive fields (aSRFs) in the midbrain of generalist birds are only confined along azimuth. Interestingly, we found aSRFs in the chicken to be confined along both azimuth and elevation. Moreover, the auditory receptive fields are arranged in a concentric manner around the overlapping interaural and visual axes. These data suggest that in generalist birds, which mainly rely on vision, the auditory system mainly serves to align auditory stimuli with the visual axis, while auditory specialized birds like the barn owl compute sound sources more precisely and integrate sound positions in the multimodal space map of the optic tectum (OT).
- auditory system
- avian midbrain
- FRLx
- optic tectum
Introduction
Sound localization is crucial for the survival and reproduction of animals, e.g., to avoid predators, catch prey and detect possible mates (Marler, 1955; Klump and Shalter, 1984). While the spatial location of a visual object is directly encoded in the retinal position, specific binaural cues are used by the auditory system in vertebrates to reconstruct the sound source locations (Schnupp and Carr, 2009; Grothe et al., 2010). However, most of the knowledge about auditory processing in birds comes from studies conducted in the barn owl, an auditory specialist with outstanding auditory localization performances (Payne, 1971; Konishi et al., 1988; Klump, 2000; Wagner et al., 2013; Krumm et al., 2019). It has been shown that for the barn owl the interaural time differences (ITDs) are crucial for localization along the horizontal axis (azimuth), while the interaural level differences (ILDs) are mainly used along the vertical axis (elevation) of the auditory space (Moiseff and Konishi, 1981; Moiseff, 1989). These elevation-dependent ILDs are generated by specific anatomic structures and specializations, namely the asymmetric ears and the facial feather ruff (Knudsen and Konishi, 1979; Coles and Guppy, 1988; Volman, 1994; von Campenhausen and Wagner, 2006).
ITDs and ILDs are processed separately starting from the early stages of the auditory pathway in the brainstem, i.e., in the nucleus magnocellularis and nucleus angularis, respectively (Takahashi et al., 1984). Then, ITD and ILD information converges in the lateral shell of the inferior colliculus (IC) and in the external nucleus of the IC (ICx), where the auditory spatial receptive fields (aSRFs) are formed; each neuron responds to sound coming from a precise location in space, and together they form a topographic map of the auditory space (Knudsen and Konishi, 1978; Knudsen, 1983; Fischer and Peña, 2011). In the barn owl, these neurons then project directly to the optic tectum (OT), where auditory and visual information converge, forming overlapping topographical maps of the two modalities (Knudsen, 1982). In the chicken, however, the ICx mainly projects to the external portion of the formatio reticularis lateralis (FRLx), which has been described as a relay between the IC and the OT (Niederleitner et al., 2017). So far, no study has investigated the neural response of this brain structure to auditory stimulation in vivo.
Most avian species, called generalist birds, have symmetrical ears and lack any auditory specialization for improving sound localization, especially in elevation. Electrophysiological studies investigating the auditory spatial representation in the midbrain of generalist birds show ambiguous results. As an example, the aSRFs in the ICx of hawks and owls with symmetrical ears show responses with no elevational features (Calford et al., 1985; Volman and Konishi, 1990); another study in the pigeon's OT, however, found auditory responses spatially confined along both azimuth and elevation (Lewald and Dörrscheidt, 1998). Moreover, there is evidence that the simple head shape of some generalist bird species, including the chicken, provides monaural cues which change systematically across elevation and which, therefore, could be used by the animal for elevational localization (Schnyder et al., 2014).
In our study, we used the chicken as an animal model for generalist birds and investigated the spatial tuning in two midbrain structures along the auditory pathway. One is the OT, which is involved in the integration of visual and auditory modalities; the second one is the FRLx.
We characterized the spatial, ITD and ILD tuning properties of single units to describe the aSRFs and the role played by ITDs and ILDs in the formation of such receptive fields. To do so, we conducted in vivo extracellular recordings while presenting binaural auditory stimuli via earphones to anaesthetized chickens.
Materials and Methods
Animals
We collected data from 28 chickens (Gallus gallus domesticus, White Leghorn, 12 males and 16 females) aged between 58 and 114 d posthatch. We used adult chickens because their auditory system is fully developed in contrast to young chickens (Smith, 1981; Manley et al., 1991) and head-related transfer function (HRTF) data for a virtual stimulus presentation were available (Schnyder et al., 2014). Fertilized eggs were provided by the Department of Reproductive Biology, Technical University of Munich (TUM) School of Life Sciences, incubated at 37°C and 70% humidity and, after hatching, reared at the animal facility of the Chair of Zoology, TUM School of Life Sciences. The animals were kept in groups in cages with access to sand, perches, water, and food ad libitum. The bird housing facilities were artificially illuminated with UV-balanced light in a 12/12 h light/dark cycle. All experiments were performed according to the principles regarding the care and use of animals adopted by the German Animal Welfare Law for the prevention of cruelty to animals. The study was approved by the Government of Upper Bavaria, Germany.
Surgery
All protocols and procedures were in accordance with the institutional guidelines of the authorities of Upper Bavaria, Germany (permit no. ROB-55-2-2532-Vet_02-18-154). Animals were anaesthetized with an initial dose of ketamine (40 mg/kg) and xylazine (12 mg/kg) via intramuscular injection in the breast muscle. Anesthesia was maintained by a constant injection of anesthetic (ketamine: 13 mg/kg/h; xylazine: 4 mg/kg/h) via a syringe pump (LA-30, Landgraf Laborsysteme HLL GmbH). The heart rate was constantly monitored with two EKG electrodes placed in a chest muscle and in the contralateral leg. Cloacal temperature was monitored and maintained above 39.5°C using a homoeothermic blanket system (Harvard 50-7137, Scientific & Research Instruments Ltd.). The head was held in a fixed position by means of two ear bars connected to a surgery stereotaxic setup. The scalp was anaesthetized with xylocaine (AstraZeneca GmbH), the feathers were removed with forceps and the scalp was opened along the midline. A craniotomy was performed above the left midbrain and the meninges were opened to expose the brain surface for electrode penetration. An aluminum head holder was attached to the skull with dental cement (Paladur, Kulzer GmbH) to fix the head in the setup for the recording. During anesthesia a negative pressure may build up in the middle ear cavity, preventing the tympanic membrane from vibrating naturally (Larsen et al., 2016). However, in our case, the surgical opening was extended up to the trabecular bone, caudal with respect to the OT, connected to the bullae and interbullae passage described previously (Larsen et al., 2016), thus ensuring middle-ear ventilation. To reconstruct the anatomic position of the recordings, the electrode was coated with a fluorescent dye (DiO or DiI) before the first penetration, following a procedure described previously (DiCarlo et al., 1996).
Briefly, a dye solution was prepared by dissolving dye crystals in ethanol (DiI: 50 mg/ml, DiO: 42 mg/ml). Then, the electrode tip was dipped into the solution, allowing it to dry in air for 5 s. The procedure was repeated 10 times. Afterwards, the electrode was placed on the surface of the brain by visual inspection and guided into the brain tissue by a microcontroller (Scientifica Ltd). At the end of the experiment the animals were euthanized with an intrapulmonary injection of sodium-pentobarbital (200 mg/kg, Narcoren) and decapitated with poultry scissors.
Stimulus generation and recording
The recordings were performed in a sound-attenuating chamber (IAC, Niederkrüchten). Stimuli were either pure tones or broadband (BB) noise (100–5000 Hz). Fresh noise stimuli were digitally generated for each single trial, so that the averaged neural response across trials cancels out the possible modulation of the neural activity in response to the amplitude envelope of the noise stimuli. The stimuli were generated by a custom-written script in MATLAB (MathWorks), converted to analog signals via an external sound card (Fireface 400, RME), and presented binaurally via earphones (ER4S, Etymotics) using the software AudioSpike (HörTech gGmbH). The earphones were not sealed to the animal's ear canals, hence preventing physical constraints in the air vibration in the inner ear. In order to obtain the aSRF, we presented sounds via earphones in a virtual auditory space (VAS) created by fresh BB noise filtered with HRTFs corresponding to specific locations in space. The HRTFs were not individualized; for all recordings we used the HRTF data obtained in a previous study (Schnyder et al., 2014). Since the HRTFs were measured via microphones placed at the position of the eardrums, all properties of the sound reaching the eardrum under natural conditions were fully captured. Furthermore, it has been previously shown that, in the barn owl, there is a good correspondence between the neural responses obtained by HRTF stimuli and free-field stimulation (Keller et al., 1998). Since the binaural cues in the HRTFs of different adult chickens do not change remarkably (see Results, VAS stimuli) and the head size of the chicken used for the chosen HRTF database (interaural distance: 31 mm) was compatible with the head size of the chickens of our study (average interaural distance from four subjects: 30 ± 2 mm), we used one HRTF dataset as stimuli for all the tested chickens.
The neural activity was recorded using a tungsten electrode (impedance: 2 MΩ, WPI GmbH), preamplified (ExAmp-20KB, Kation Scientific), passed through a HumBug (Quest Scientific Instruments Inc.) to eliminate traces of power line noise and converted to a digital signal (Fireface 400, RME, sampling rate: 44,100 Hz). We used the software AudioSpike (HörTech gGmbH) for bandpass filtering of the recorded signal (0.3–3 kHz) and visually monitoring the online signal.
While advancing the electrode into the brain, a contralateral or bilateral BB noise stimulus was presented to elicit an auditory response. When a responsive unit was detected, the protocol consisted of determining the threshold level, followed by different series of stimuli to measure the following properties: spatial tuning, ITD tuning, ILD tuning, and frequency tuning (comparable to Aralla et al., 2018). Each set of stimuli was presented in a pseudo-randomized order. Each stimulus had a duration of 150 ms, with an interstimulus interval (ISI) of 500 ms. A cosine ramp function of 20 ms was applied at the beginning and the end of the stimuli to avoid spectral splatter.
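As an illustration of the stimulus envelope described above, the following Python sketch builds a 150-ms noise burst with 20-ms raised-cosine onset and offset ramps. It is not the MATLAB script used in the study: the function names, the Gaussian noise source, and the use of 44,100 Hz as the playback rate are assumptions, and the 100–5000 Hz band-limiting step is omitted for brevity.

```python
import math
import random


def cosine_ramp_envelope(n_samples, n_ramp):
    """Raised-cosine onset and offset ramps, flat (gain 1.0) in between."""
    env = [1.0] * n_samples
    for i in range(n_ramp):
        gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        env[i] = gain                  # onset ramp
        env[n_samples - 1 - i] = gain  # mirrored offset ramp
    return env


def fresh_noise_burst(duration_s=0.150, ramp_s=0.020, fs=44100, rng=None):
    """Draw a new Gaussian noise token and apply the ramps.

    Assumption: fs is the playback rate; band-limiting is not applied here.
    """
    rng = rng or random.Random()
    n_samples = round(duration_s * fs)
    env = cosine_ramp_envelope(n_samples, round(ramp_s * fs))
    return [rng.gauss(0.0, 1.0) * gain for gain in env]
```

Calling `fresh_noise_burst()` once per trial draws a new noise token each time, in the spirit of the fresh-noise procedure.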
Stimuli presentation
The threshold was estimated by presenting binaural or monaural BB noise in a level range from 10 dB SPL up to 90 dB SPL. Each level was presented 10 times. The threshold was defined as the level at which the unit started responding to the stimuli. The threshold was estimated via visual inspection of the plot showing the mean spike rate as a function of the level. The VAS stimuli simulated 429 different positions. The azimuthal range was ±180°, the elevational range was ±67.5°. The average binaural level was 10–15 dB above threshold (the level refers to the RMS SPL of the stimulus at the frontal position: 0° azimuth, 0° elevation). Each spatial position was presented 10 times. The ITD stimuli were typically presented at 20 dB above threshold, within a range of ±500 µs and in steps of 50 µs. Each condition was tested 20 times. The ILD stimuli were presented with an average binaural level of 20 dB above threshold and an ILD range typically within ±40 dB, with 4-dB steps. Each stimulus was presented 20 times. The monaural stimuli consisted of BB noise presented at different sound pressure levels, in a range of 0–40 dB above threshold, with 2-dB steps and 20 repetitions for each level. For the frequency tuning we usually presented pure tones in a range between 0.1 and 4.85 kHz, at intervals of five steps per octave, at 20 dB above threshold. Each stimulus was presented 10 times.
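The stimulus grids described above can be enumerated with a few lines of Python. This is a minimal sketch under the stated ranges and step sizes; the helper names `octave_steps` and `symmetric_grid` are hypothetical and do not come from the study's software.

```python
import math


def octave_steps(f_start_khz, f_stop_khz, steps_per_octave):
    """Logarithmically spaced frequencies with a fixed number of steps per octave."""
    n_steps = round(math.log2(f_stop_khz / f_start_khz) * steps_per_octave)
    return [f_start_khz * 2 ** (k / steps_per_octave) for k in range(n_steps + 1)]


def symmetric_grid(limit, step):
    """Values from -limit to +limit in equal steps, including zero."""
    n = int(limit // step)
    return [k * step for k in range(-n, n + 1)]


tone_frequencies_khz = octave_steps(0.1, 4.85, 5)  # pure-tone series
itd_values_us = symmetric_grid(500, 50)            # ITD series
ild_values_db = symmetric_grid(40, 4)              # ILD series
```

With these settings the tone series contains 29 frequencies (28 steps of one fifth of an octave starting at 0.1 kHz, ending at approximately 4.85 kHz), and the ITD and ILD series contain 21 values each.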
Data analysis
Spike sorting and peristimulus time histogram (PSTH) time window selection
Evaluation of the unit type (multi or single) was conducted offline. A recording was labeled as single-unit if only one type of spike waveform was detected and if <10% of the interspike intervals had a duration ≤3 ms. In all the other cases, defined as multi-unit, it was possible to isolate the response of single units using spike-sorting. Both unit labeling and spike sorting were conducted using Wave_clus 3 (Chaure et al., 2018) in MATLAB.
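The interspike-interval part of the single-unit criterion can be expressed compactly. The sketch below is illustrative only and is not part of Wave_clus; the function names are hypothetical, and it assumes that spike times are given in milliseconds for one continuous stretch of recording (intervals spanning trial boundaries would need to be excluded beforehand).

```python
def short_isi_fraction(spike_times_ms, max_isi_ms=3.0):
    """Fraction of interspike intervals that are <= max_isi_ms."""
    times = sorted(spike_times_ms)
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    if not isis:
        return 0.0
    return sum(1 for isi in isis if isi <= max_isi_ms) / len(isis)


def passes_isi_criterion(spike_times_ms, max_isi_ms=3.0, max_fraction=0.10):
    """True if fewer than max_fraction of the intervals are short."""
    return short_isi_fraction(spike_times_ms, max_isi_ms) < max_fraction
```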
For each recording, a PSTH of the neuronal activity for all the repetitions was created. The PSTH contained a 50-ms prestimulus (Pre) window, a 150-ms stimulus presentation (Stim) window and a 300-ms poststimulus (Post) window. A response window was defined individually for each recording with the following procedure: a 20-ms window was moved in 1-ms steps over the Stim and Post recording windows, and the responses in this 20-ms window were compared with the first 20 ms of the Pre window, which included only spontaneous activity, to identify response windows that were significantly different from spontaneous activity. The first two consecutive windows with a significant difference in activity (Wilcoxon signed-rank test, p ≤ 0.01) were considered the start of the response window and the last two windows with significant activity were the end of the response window (window start: median = 53 ms, range: 50–213 ms; window end: median = 297 ms, range: 80–485 ms). If there was no significant response, we excluded the unit as unresponsive. The significant change in activity could indicate either an increase of activity (excitation) or a decrease (inhibition).
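The window search can be sketched as follows, taking as input one significance flag per 1-ms window position (the statistical test itself is not reproduced here). Several details are assumptions rather than statements from the text: a run of two consecutive significant windows marks both the start and the end, times are measured from the beginning of the PSTH, and the response window ends at the trailing edge of the last significant 20-ms window. The function name is hypothetical.

```python
def response_window(significant, first_start_ms=50, window_ms=20, run=2):
    """Locate the response window from per-position significance flags.

    significant[i] refers to the sliding window starting at
    first_start_ms + i (1-ms steps). Returns (start_ms, end_ms),
    or None if no run of `run` consecutive significant windows exists.
    """
    run_starts = [
        i for i in range(len(significant) - run + 1)
        if all(significant[i:i + run])
    ]
    if not run_starts:
        return None  # unit treated as unresponsive
    start_ms = first_start_ms + run_starts[0]
    # Assumption: the end is the trailing edge of the last window in the last run.
    end_ms = first_start_ms + run_starts[-1] + (run - 1) + window_ms
    return start_ms, end_ms
```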
Frequency tuning
We first tested whether a unit was frequency tuned (see below, Statistical analysis). If the unit was tuned, we fitted a spline curve to the frequency response data (MATLAB 'spline' function). The best frequency (BF) was defined as the frequency which elicited the strongest neural response, and it was calculated as the frequency at the peak of the spline fit.
Monaural response
Monaural responses were recorded for ipsilateral and contralateral stimulation to test whether a unit was responsive to purely binaural or also monaural stimulation. If a significant difference was present in at least three consecutive levels, the unit was classified as excitatory (E) or inhibitory (I) according to an increase or decrease of activity, respectively. In all the other situations, it was classified as nonresponsive (“0”).
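The classification rule above can be sketched in Python. The input encoding is an assumption: each level is represented as +1 (significant increase relative to spontaneous activity), -1 (significant decrease), or 0 (no significant difference), and the three consecutive levels are required to share the same sign. The two-letter code is written with the ipsilateral response first, which matches the way labels such as “0E” are used in this paper; the function names are hypothetical.

```python
def classify_ear(level_effects, min_run=3):
    """Return 'E', 'I', or '0' for one ear.

    level_effects: per level, +1 (significant increase),
    -1 (significant decrease), or 0 (not significant), ordered by level.
    """
    run_sign, run_len = 0, 0
    for effect in level_effects:
        if effect != 0 and effect == run_sign:
            run_len += 1
        else:
            run_sign = effect
            run_len = 1 if effect != 0 else 0
        if run_len >= min_run:
            return "E" if run_sign > 0 else "I"
    return "0"


def monaural_type(ipsi_effects, contra_effects):
    """Two-letter label: ipsilateral class followed by contralateral class."""
    return classify_ear(ipsi_effects) + classify_ear(contra_effects)
```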
Spatial tuning
Classification of the aSRF was based on its location in the VAS. An aSRF could be located mainly within the contralateral hemifield (called “contra”), the ipsilateral hemifield (called “ipsi”), or broadly distributed on both sides (called “bilateral”). This classification was based on the distribution of the aSRF area with a spike rate >50% of the maximum response. If >75% of the resulting area was located in one hemifield, the aSRF was labeled as “contra” or “ipsi,” according to the hemifield; otherwise it was labeled as “bilateral.” For the units with contralateral response, we also calculated the centroid.
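A minimal Python sketch of this hemifield rule is given below. It makes several assumptions that are not stated in the text: each sampled position contributes equally to the area, positive azimuths are contralateral, and positions on the midline (0° or ±180°) count toward neither hemifield. The function name is hypothetical.

```python
def classify_hemifield(samples, rate_fraction=0.5, side_fraction=0.75):
    """Label an aSRF as 'contra', 'ipsi', or 'bilateral'.

    samples: iterable of (azimuth_deg, elevation_deg, spike_rate).
    Assumptions: equal weight per position; positive azimuth = contralateral.
    """
    samples = list(samples)
    peak = max(rate for _, _, rate in samples)
    active = [az for az, _, rate in samples if rate > rate_fraction * peak]
    contra = sum(1 for az in active if 0 < az < 180) / len(active)
    ipsi = sum(1 for az in active if -180 < az < 0) / len(active)
    if contra > side_fraction:
        return "contra"
    if ipsi > side_fraction:
        return "ipsi"
    return "bilateral"
```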
In addition, the “contra” units were classified according to the aSRF shape. The two main aSRF shapes we determined were the “round” and the “annulus” type. An aSRF was classified as round if its peak was located near its centroid. Alternatively, a receptive field was classified as annulus if it had a ring shape, with a decrease of activity at the centroid. The aSRF type classification was assessed from the azimuthal cross-section at 0° elevation of the normalized spike activity array (MATLAB curve fit: 'csaps' function, smoothing parameter: 0.0075; peak detection: 'findpeaks' function, 'MinPeakProminence',0.05). If one main peak was detected around the centroid position, the aSRF was classified as “round.” If it presented two local maxima with a local minimum around the centroid, it was classified as “annulus.” aSRFs were defined as “broad” if more than two local maxima were detected. This category includes aSRFs with a “patchy” shape, not presenting any clear pattern and broadly spread mostly within the contralateral hemifield. Unit classification was automated using a custom MATLAB script, and only 6 units (9%) required manual classification editing after visual inspection.
For round and annulus aSRFs, some spatial features were measured for subsequent analysis. For both aSRF types, the “medial peak” was defined as the azimuthal position corresponding to the peak of neural activity close to the frontal location (i.e., 0° azimuth). The outer diameter was defined as the azimuthal distance between the two most distant points where the normalized spike rate is equal to 0.5. For the annular aSRFs only, the local-maxima-diameter was defined as the azimuthal distance between the two local maxima. These measurements were based on the detection of local maxima from the fitted curve of the spike rate cross-section at 0° elevation (MATLAB curve fit: 'csaps' function, smoothing parameter: 0.0075; peak detection: 'findpeaks' function, 'MinPeakProminence',0.05).
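The shape rule can be illustrated with the Python sketch below. It is a simplified stand-in for the MATLAB procedure, not a reimplementation: the prominence computation is a basic version of the usual definition, no smoothing spline is applied, and `tolerance_deg` (how close a single peak must lie to the centroid) is a hypothetical parameter that the text does not specify. The input is assumed to be the normalized cross-section at 0° elevation.

```python
def prominent_peaks(y, min_prominence=0.05):
    """Indices of local maxima whose prominence is at least min_prominence."""
    peaks = []
    for i in range(1, len(y) - 1):
        if not (y[i - 1] < y[i] >= y[i + 1]):
            continue
        # Lowest point on each side before a higher sample (or the edge).
        left_min, j = y[i], i - 1
        while j >= 0 and y[j] <= y[i]:
            left_min = min(left_min, y[j])
            j -= 1
        right_min, j = y[i], i + 1
        while j < len(y) and y[j] <= y[i]:
            right_min = min(right_min, y[j])
            j += 1
        if y[i] - max(left_min, right_min) >= min_prominence:
            peaks.append(i)
    return peaks


def classify_shape(azimuths, rates, centroid_az, tolerance_deg=20.0):
    """Classify a normalized azimuthal cross-section as round, annulus, or broad."""
    peak_az = [azimuths[i] for i in prominent_peaks(rates)]
    if len(peak_az) == 1 and abs(peak_az[0] - centroid_az) <= tolerance_deg:
        return "round"
    if len(peak_az) == 2 and peak_az[0] < centroid_az < peak_az[1]:
        return "annulus"
    if len(peak_az) > 2:
        return "broad"
    return "unclassified"  # left for visual inspection
```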
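The outer diameter can be computed from the half-maximum crossings of the cross-section. The following Python sketch uses linear interpolation between sampled points instead of the fitted smoothing spline, which is an assumption made for simplicity; the function names are hypothetical.

```python
def level_crossings(x, y, level=0.5):
    """Positions where the sampled curve y(x) crosses `level` (linear interpolation)."""
    crossings = []
    for (x0, y0), (x1, y1) in zip(zip(x, y), zip(x[1:], y[1:])):
        if (y0 - level) * (y1 - level) < 0:
            crossings.append(x0 + (level - y0) * (x1 - x0) / (y1 - y0))
        elif y0 == level:
            crossings.append(x0)
    if y[-1] == level:
        crossings.append(x[-1])
    return crossings


def outer_diameter(x, y, level=0.5):
    """Distance between the two most distant crossings, or None if fewer than two."""
    crossings = level_crossings(x, y, level)
    if len(crossings) < 2:
        return None
    return max(crossings) - min(crossings)
```

For an annular aSRF the same cross-section crosses the 0.5 level up to four times, and taking the two most distant crossings gives the outer border rather than the inner one.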
We organized the aSRF data in a two-dimensional azimuth-by-elevation-spike activity array. For ease of view the data were smoothed in azimuth and elevation with a standard MATLAB function (curve fit: 'csaps' function, smoothing parameter: 0.005). We plotted the data as contour plots (contour spacing of 11% of normalized spike rate).
ITD tuning
First, we tested whether a unit was ITD-sensitive or not (see below, Statistical analysis). If it was ITD-sensitive, a curve was fitted to the ITD response data (MATLAB 'spline' function). Based on the ITD tuning curve, we measured the best ITD (ITD at maximum value of spline) and the ITD response width (ITD range with activity >50% of maximum). Positive ITD values indicate contralateral leading and negative values ipsilateral leading sounds.
ILD tuning
As for the ITD stimuli, we tested whether a unit showed ILD sensitivity or not. If it was ILD-sensitive, we applied a cubic smoothing spline to the ILD response data (MATLAB 'csaps' function, smoothing parameter: 0.03). Since the ILD tuning curves were typically sigmoidal, we calculated the maximum ILD, the 50% cutoff value (the ILD value at 50% of the maximum spike rate) and the ILD response width (ILD range with activity >50% of maximum spike rate). Positive ILD values indicate a higher sound pressure level at the contralateral ear.
Histology and anatomic reconstruction
After the decapitation of the animal, the brain was removed from the skull and fixed in 4% PFA solution for at least 24 h at 6°C. Then, the brain was cryoprotected in 30% saccharose in phosphate buffer (PB; 0.1 M, pH 7.4) for at least 24 h and subsequently sectioned at 100 µm on a microtome (Microm HM440E, GMI) along the sagittal plane, and collected into PB. Slices were mounted onto object slides coated with chrom-alum gelatin, then covered with DAPI-Propyl gallate (to identify brain structures) and a cover slip. Subsequently, we identified the fluorescent electrode tracks using a fluorescence microscope (Olympus BX 63F). Since the electrode penetrations were mostly parallel to the sectioning plane (sagittal plane), the dye traces were easily detectable under the microscope.
During the recordings, we measured the position of the recordings along the rostro-caudal (RC) and medio-lateral (ML) axes, parallel to the stereotaxic arms (resolution = 0.1 mm), as well as along the dorso-ventral (DV) axis, parallel to the microcontroller (resolution = 0.001 mm). If the microcontroller was tilted to a certain angle from the standard axes orientation to provide a perpendicular entrance of the electrode through the brain surface, the angle was taken into account in the calculations of the final coordinates. We also measured the reference point, defined by the coordinates of the zero plane along each axis, according to the measurement system in the atlas by Puelles (2007). More specifically, the zero coronal and horizontal planes were aligned to the ear bars, while the zero sagittal plane was aligned to the midline. Moreover, during the recordings, from the skull opening it was possible to expose the most external point of the OT, thus we could measure the total ML extent of the midbrain from the midline. For each subject, a ratio between the total ML distances measured from the atlas and from the brain was used as a shrinkage/expansion factor to compensate for offsets because of differences in individual brain sizes, and it was applied to the stereotaxic coordinates along all three axes.
From these data, the position of the first penetration along the ML axis was defined as the compensated ML distance from the midline, and the assessment was supported by the dye track in the brain slides. Based on the stereotaxic information and the visual cue in the brain slides, it was possible to reconstruct the coordinates of the first penetration along the three axes, expressed as a distance from the reference point. The position of the other recording sites was estimated by the relative distance from the first one, applying the correction factor. Then we transferred the coordinates into the atlas coronal planes, where each recording site was assigned to the closest plane. This plane was <0.12 mm away from the recording site along the RC coordinate, since the interplane distance in the atlas is 0.24 mm. Since the FRLx is not reported in the atlas, the contour of the FRLx was defined by using the boundaries shown by Niederleitner et al. (2017; see their Fig. 4). Moreover, it was possible to estimate the OT layers from the depth of the electrode penetrations, typically orthogonal to the brain surface. We depicted the coordinates along the coronal views because it was necessary for projecting the OT recording sites onto the brain surface and because our brain area of interest was represented with higher resolution in the coronal view (i.e., smaller interplane distance) compared with the sagittal view.
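The coordinate correction and plane assignment can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes that atlas planes lie at integer multiples of the interplane distance from the zero plane; the actual plane positions should be taken from the atlas.

```python
def scale_to_atlas(coords_mm, atlas_ml_extent_mm, brain_ml_extent_mm):
    """Apply the shrinkage/expansion factor (atlas ML extent / measured ML extent)
    to stereotaxic coordinates along all three axes."""
    factor = atlas_ml_extent_mm / brain_ml_extent_mm
    return tuple(c * factor for c in coords_mm)


def nearest_plane(rc_mm, spacing_mm=0.24):
    """Closest atlas plane along the RC axis and the distance to it.

    Assumption: planes sit at integer multiples of spacing_mm.
    """
    index = round(rc_mm / spacing_mm)
    plane_rc = index * spacing_mm
    return plane_rc, abs(rc_mm - plane_rc)
```

Because each site is assigned to the closest plane, the residual distance cannot exceed half the interplane distance, i.e., 0.12 mm.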
In some cases, before the coronal representation, the recordings were also mapped in the sagittal planes, to visually check whether there was a match between the position estimated from the stereotaxic coordinates and the position of the electrode dye track in the brain slide. This was only applied to those recordings located within ±0.2 mm of a sagittal plane. In the other cases, the locations were not plotted onto the sagittal planes.
Projection onto the OT surface
Since previous studies showed that the avian OT has a topography of visual and/or auditory stimuli on the brain surface (e.g., in barn owl, Knudsen, 1982; in pigeon, Clarke and Whitteridge, 1976), we investigated whether a topography of the auditory space was also present along the chicken's OT surface. Therefore, we projected the location of the OT units onto the brain surface following a revised version of the procedure used in a previous study (Knudsen, 1982). From the atlas coronal view, we defined the projected location of each OT recording site as the surface point with the shortest distance from the unit coordinates. For each atlas slide, we measured the length of the OT surface and calculated the position of each projection along the OT surface, as well as the position of the most lateral point of the OT surface. Then, the OT surface sections were represented as a flattened line for the two-dimensional representation. The lateral point was used as reference point to align the flattened OT surfaces to each other along the same horizontal line in the RC axis.
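The projection step amounts to finding the closest point on a traced surface contour and measuring its position along the contour. The Python sketch below assumes that the OT surface in each coronal plane is approximated by a polyline of (x, y) vertices; this representation and the function name are assumptions made for illustration.

```python
import math


def project_onto_polyline(point, polyline):
    """Closest point on a polyline to `point`.

    Returns (arc_length, distance): the position of the projection measured
    along the polyline from its first vertex, and the distance from `point`
    to the projection.
    """
    best = None
    travelled = 0.0
    for (ax, ay), (bx, by) in zip(polyline, polyline[1:]):
        dx, dy = bx - ax, by - ay
        seg_len = math.hypot(dx, dy)
        if seg_len == 0:
            t = 0.0
        else:
            t = ((point[0] - ax) * dx + (point[1] - ay) * dy) / seg_len ** 2
            t = min(1.0, max(0.0, t))  # stay within the segment
        px, py = ax + t * dx, ay + t * dy
        dist = math.hypot(point[0] - px, point[1] - py)
        if best is None or dist < best[1]:
            best = (travelled + t * seg_len, dist)
        travelled += seg_len
    return best
```

Subtracting the arc length of the most lateral surface point from the arc length of each projection expresses all sites relative to the same reference, which corresponds to the alignment of the flattened surfaces.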
Statistical analysis
For characterizing the monaural responses, we used a Mann–Whitney U test (p ≤ 0.01) to test the spike rate difference, at each level, between the neural response and the prestimulus spontaneous activity. The frequency, ITD and ILD tuning were assessed using a Kruskal–Wallis test (p ≤ 0.01) for comparison of the neural response spike rates across the tested values of the parameter. To test the ITD/ILD sensitivity within the physiological range, the same statistical test was performed on a smaller range of tested stimuli (ITD: ±200 µs, ILD: ±8 dB). The discrepancy between the measured physiological range of ITDs and ILDs (±173 µs, ±5.9 dB, respectively; see Results) and the tested range is given by the step size of the ITD and ILD stimuli (50 µs and 4 dB steps, respectively). If we had tested only in the ±150 µs ITD range and ±4 dB ILD range, we would have disregarded the neural response at most external values. The significant difference in diameter and best ITD between annulus and round aSRFs was tested using a Wilcoxon rank-sum test, p ≤ 0.01. For the analysis of topographic maps, the correlation between specific tuning properties (i.e., aSRF medial peak location, best ITD) and the position of the corresponding recording sites along each anatomic axis was conducted using the Spearman's correlation coefficient.
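The choice of the reduced test ranges follows from rounding the measured physiological limits up to the next multiple of the stimulus step size. A short Python check of this arithmetic (the function name is hypothetical):

```python
import math


def covering_range(physiological_limit, step):
    """Smallest multiple of `step` that is not below the physiological limit."""
    return math.ceil(physiological_limit / step) * step


itd_test_limit_us = covering_range(173, 50)  # 4 steps of 50 µs
ild_test_limit_db = covering_range(5.9, 4)   # 2 steps of 4 dB
```

Rounding up gives ±200 µs and ±8 dB, whereas rounding down would give ±150 µs and ±4 dB and would leave out responses at the outermost physiological values.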
Results
VAS stimuli
For the VAS stimuli presentation, we chose one of the four available chicken HRTFs from the database recorded by Schnyder et al. (2014). The ITDs were concentrically aligned with the interaural axis (range = ±173 µs; Fig. 1A), and interestingly, the ILDs were modulated along the elevational axis (range = ±5.9 dB; Fig. 1B). The physiological range of ITDs was greater at low frequencies (0.1–2.5 kHz; range: ±182 µs) than at high frequencies (2.5–5 kHz; range: ±163 µs), whereas the ILDs had a wider range at high frequencies (low frequency range: ±3.1 dB; high frequency range: ±9.3 dB). These results are in line with the duplex theory, which suggests that ITDs can be used as localization cues at low frequencies, whereas ILDs can be used at high frequencies (Rayleigh, 1907). The averaged ranges of ITDs and ILDs among the available HRTF datasets were slightly larger than those from the HRTF used for the VAS stimuli (ITD range: ±196 µs, ILD range: ±7.0 dB). The ILD variance among HRTF datasets was larger than the ITD variance, but both were relatively small compared with the total range (ITD: SD/total range = 7%; ILD: SD/total range = 14%), suggesting that the ITDs and ILDs do not change remarkably across subjects and justifying the usage of one database of VAS stimuli for all subjects (Fig. 1C–F).
Basic properties
We recorded auditory neural responses from 69 single units from either the OT or the FRLx. A total of 42 units (61%) were located in the OT, while 27 units (39%) were in the FRLx. All 69 units responded to binaural BB noise (see example responses in Fig. 2A,B). The typical PSTH profile in FRLx and OT units had an onset peak followed by a sustained or build-up profile, and with a latency around 14 ms after stimulus onset (FRLx: median = 13 ms, range: 10–98 ms; OT: median = 15 ms, range: 10–174 ms; Fig. 2C,D). Only a minority of units were frequency tuned, i.e., responded to pure tones [OT: 4 units (10%); FRLx: 10 units (37%); Fig. 2E]. BFs ranged between 0.1 and 4.9 kHz. Moreover, we found that most of the BFs were in the low frequency range [≤2 kHz; OT: 3 units (75%); FRLx: 6 units (60%)].
Monaural responses
We measured the monaural response of 57 units (FRLx: 22 units; OT: 35 units) for both ipsilateral and contralateral BB noise. We found that, for both FRLx and OT units, the most frequently observed monaural response type was responsive only to contralateral stimulation [“0E”; OT: 17 units (48%); FRLx: 17 units (77%)]. A high percentage of OT units were not responsive to either ipsilateral or contralateral stimulation [“00”; OT: 15 (43%); FRLx: 4 units (18%)]. Only a minority showed ipsilateral excitation (“EE,” “E0”) or ipsilateral inhibition (“IE”; Tables 1, 2).
aSRFs
Most of the units had an annular aSRF (24 units, 35%; Fig. 3A) or round aSRF (27 units, 39%; Fig. 3B), whereas a minority presented a broad aSRF (8 units, 12%; Fig. 3C) or bilateral aSRF (10 units, 14%; Fig. 3D). Except for the bilateral units, all the remaining units responded to the contralateral hemifield of the VAS (59 units, 86%). Three units (4%) showed an inhibitory response and were excluded from subsequent analysis. The round and annular aSRFs had the centroid placed at 90 ± 15° in azimuth and 0 ± 8° in elevation. The mean position of the centroids was at 91° azimuth and −2° elevation [annulus: 94° azimuth, −1° elevation (Fig. 3E); round: 90° azimuth, −2° elevation (Fig. 3F)]. For 4 units with annular aSRF, the centroid could not be determined because of their irregular shape; these units were not included in the subsequent analysis. The azimuthal diameter of the outer border of the annular aSRFs ranged from 127° to 212° (mean 162°), whereas the diameter of the round aSRF ranged from 64° to 181° (mean 119°). The mean diameter of the outer border of the annular aSRF was significantly larger than the mean diameter of the round type (Wilcoxon rank-sum test, p < 0.001; Fig. 3G). Taken together, these data suggest that the round and annular aSRFs are placed in a concentric manner around the aural-visual axis, where the smaller round aSRFs are surrounded by the annular aSRFs.
ITD and ILD tuning properties
We investigated whether the ITD and ILD tuning could explain the shape and location of the round and annular aSRFs. The distribution of the ITDs in the VAS stimuli has two main features. First, there is a gradient along the azimuthal axis, ranging between positions at −90° and +90° in azimuth. Second, the ITDs are arranged concentrically around the interaural axis, forming a cone of confusion (Fig. 1A). These features already suggest that neurons with longer best ITDs might possess a round aSRF, whereas smaller best ITDs would generate an annulus shaped aSRF. On the other hand, although the ILDs lack such a concentric configuration, they present two main peaks at around +50° and −50° elevation and a notch at the interaural axis (Fig. 1B). These ILD cues are the result of the head-induced monaural gain cues (Schnyder et al., 2014), which might be used by the chicken for elevational localization. Thus, we investigated whether they also play a role in the formation of the aSRFs.
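The link between best ITD and aSRF shape can be made concrete with a simplified geometric approximation, sketched below in Python. This is not the measured HRTF data: it assumes that the ITD scales with the cosine of the angle between the sound direction and the interaural axis, that azimuth and elevation follow a latitude/longitude convention with the interaural axis at 90° azimuth and 0° elevation, and that the maximum ITD equals the ±173 µs range reported for the VAS stimuli. The function names are hypothetical.

```python
import math


def model_itd_us(az_deg, el_deg, itd_max_us=173.0):
    """Approximate ITD: itd_max times the cosine of the angle to the interaural axis."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    return itd_max_us * math.sin(az) * math.cos(el)


def iso_itd_ring_radius_deg(itd_us, itd_max_us=173.0):
    """Angular radius (around the interaural axis) of the ring of positions
    sharing the given ITD. Values beyond the maximum are clamped to the axis."""
    ratio = max(-1.0, min(1.0, itd_us / itd_max_us))
    return math.degrees(math.acos(ratio))
```

Under this approximation, a best ITD at or beyond the maximum maps onto the interaural axis itself (ring radius 0°), which corresponds to a round field, whereas a best ITD of half the maximum maps onto a ring with a radius of about 60° around the axis, which corresponds to an annulus.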
We tested the ITD tuning in 68 units. ITD tuning curves usually exhibited a sharp peak (Fig. 4A) and in 98% of the cases had contralaterally leading best ITDs. Half of the round aSRF units were tuned to ITD (15 units; 52%) and the majority of annular aSRF units (16 units; 71%) had best ITDs within the physiological range. No broad or bilateral aSRF unit was tuned to ITD (Fig. 4B). The best ITDs of the annulus units were significantly smaller than those of the round units (round: range from 105 to 287 µs, mean = 198 µs; annulus: range from −9 to 152 µs, mean = 86 µs; Wilcoxon rank-sum test: p < 0.001; Fig. 4C,D). Moreover, the local-maxima-diameter of annular aSRFs and the outer diameter of round aSRFs correlated with the best ITDs [round: Pearson's r = −0.74, p = 0.002 (Fig. 4E); annulus: Pearson's r = −0.81, p < 0.001 (Fig. 4F)]. These results suggest that ITDs are mainly used by units exhibiting well-defined contralateral aSRFs (i.e., annulus and round shape), in which the best ITD is informative in defining the aSRF diameter. Furthermore, the ITD tuning width was relatively narrow in both FRLx and OT units (FRLx: median = 112 µs, range = 66–417 µs; OT: median = 132 µs, range = 69–296 µs).
We tested the ILD tuning in 57 units. The ILD response curves typically had a sigmoidal shape (Fig. 5A). Most of the round and annulus units were tuned to ILD (round: 17 units, 74%; annulus: 15 units, 83%). However, only a smaller proportion of the round aSRF units was responsive within the physiological range (round: 8 units, 35%; annulus: 12 units, 67%; Fig. 5B). Of the ILD-tuned units, 94% responded to contralateral values (Fig. 5C). We investigated whether the ILD tuning also plays a role in the formation of aSRFs. When all the ILD-tuned units were considered, the 50% ILD cutoff values correlated with the diameter of the round and annular units [round: Pearson's r = −0.72, p = 0.002 (Fig. 5D); annulus: Pearson's r = −0.57, p = 0.027 (Fig. 5E)]. These results suggest that round and annular aSRFs might be created by converging ITD and ILD information. Indeed, a considerable percentage of annular and round units were selective to both ITD and ILD cues (annulus: 13 units, 72% of tested units; round: 11 units, 48% of tested units; broad and bilateral: 0 units), and they were responsive to ITD and ILD values related to the contralateral hemifield (i.e., contralateral leading sounds for ITDs, and louder contralateral sounds for ILDs). Furthermore, we investigated whether the properties of the ILD tuning curves vary between FRLx and OT. The median position of the 50% ILD cutoff was close to the physiological range for both FRLx and OT units, but the variability in the neural population was relatively high (FRLx: median = 7 dB, range = −35 to 14 dB; OT: median = 7 dB, range = −35 to 37 dB). In contrast to what was observed for the ITD tuning, this result shows that units in FRLx and OT were sensitive to a broader range of ILD values.
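The correlations between binaural tuning and aSRF diameter reported in this section are Pearson correlations. A minimal sketch of the computation is given below; the function names and the paired values are hypothetical placeholders chosen only to show a negative relationship, and they are not the recorded data.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for the null hypothesis r = 0."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical pairs (placeholders, NOT the recorded data):
# best ITD in microseconds and aSRF diameter in degrees
best_itd_us = [110, 150, 190, 230, 270]
diameter_deg = [170, 150, 120, 100, 70]

r = pearson_r(best_itd_us, diameter_deg)
print(f"r = {r:.3f}, t = {t_statistic(r, len(best_itd_us)):.2f}")
```

A negative r, as in the reported data, means that units with larger best ITDs (or ILD cutoffs) tend to have smaller aSRF diameters. Converting the t statistic into a p-value requires the Student's t distribution, which is left to a statistics library.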
The units in FRLx and OT show different aSRFs
Most of the round aSRF units were located in the deep layers (layers 13–14) of the OT (20 units, 71%) and most of the annulus aSRF units were located in the FRLx (20 units, 84%). All broad and bilateral aSRF units were located in the OT (Fig. 6A). It was not possible to determine the exact position of 8 units because of a lack of fluorescence tracing. However, the stereotaxic information showed that these units were located in the layers of the OT and not in other midbrain structures. The location of these units was not used in the calculation of the topographic maps (see next paragraph). In the OT, directions around the audiovisual axis were overrepresented, as seen by the normalized average of the aSRFs obtained from neurons in the OT (Fig. 6B). The same average for the FRLx neurons, on the other hand, revealed a uniform representation of locations surrounding, but excluding, 90° azimuth (Fig. 6C). These results show different spatial tuning properties of FRLx and OT in responding to specific areas of the auditory space.
Topographic map in FRLx and OT
Thanks to fluorescence tracing and stereotaxic reconstruction (Fig. 7A), it was possible to reconstruct the position of the recording sites in the atlas coronal view (Fig. 7B). Given the aSRF selectivity of FRLx and OT at the neural population level, we investigated the presence of a topographic map related to the aSRF for each brain area. Among the FRLx neurons, the aSRF peak (the medial peak of annular aSRFs and the central peak of round aSRFs) correlated significantly with the recording location along both the RC (Spearman's ρ = −0.51, p = 0.0067; Fig. 7C) and DV axes (Spearman's ρ = 0.57, p = 0.0020; Fig. 7D), but not along the ML axis (Fig. 7E). However, when analyzing individual aSRF types, we found that the medial peak of annular aSRFs correlates with the location along both RC (Spearman's ρ = −0.62, p = 0.0028) and DV (Spearman's ρ = 0.73, p = 0.0002) axes, while the round aSRFs are only significantly organized along the RC axis (RC: Spearman's ρ = −0.84, p = 0.0444; DV: Spearman's ρ = 0.41, p = 0.4333), probably because of the low number of round aSRF units. This result suggests that there is an aSRF map arranged in a concentric manner, where smaller annular aSRFs are computed at the dorsal and caudal edge, whereas the diameter progressively increases toward the ventral and rostral edge (Fig. 7F). In the same way, the best azimuth of round aSRFs shifts from locations at the back of the animal's head (azimuth >90°) toward locations in the frontal field (azimuth <90°).
To verify whether such an aSRF map reflects a possible ITD map in the FRLx, we compared the best ITD, calculated from the ITD tuning curve, of both annular and round aSRF units as a function of the recording site, finding a significant correlation along both the RC axis (Spearman's ρ = 0.50, p = 0.034; Fig. 7G) and the DV axis (Spearman's ρ = −0.49, p = 0.039; Fig. 7H), but not along the ML axis (Fig. 7I). The correlation was preserved when the best ITDs of only annular units were considered (RC: Spearman's ρ = −0.55, p = 0.0258; DV: Spearman's ρ = 0.58, p = 0.0191). Only 2 round aSRF units showed a best ITD in the FRLx; thus, the correlation coefficient was not calculated for those units alone. These results indicate the presence of an ITD map aligned with the aSRF map: smaller annular aSRFs and longer ITDs are computed at more caudal and dorsal positions in the FRLx, whereas bigger annular aSRFs and smaller ITDs are generated at more rostral and ventral locations (compare Fig. 7F,J).
We also investigated whether a topographic map of aSRFs is present in the deep layers of the OT. We did not find a significant correlation between the medial peak of both round and annular aSRFs and the projection of the units along the RC axis (Spearman's ρ = 0.17, p = 0.4682; Fig. 8A) or DV axis (Spearman's ρ = −0.30, p = 0.1979; Fig. 8B). No significant correlation was detected considering the medial peak of the round aSRFs alone (RC axis: Spearman's ρ = 0.16, p = 0.5330; DV axis: Spearman's ρ = −0.32, p = 0.1973), suggesting that the round aSRFs are not topographically represented in the OT (Fig. 8C).
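The topographic analyses above rely on Spearman's rank correlation between recording position and aSRF peak (or best ITD). A minimal sketch of that statistic, computed as the Pearson correlation of the ranks, is shown below. The function names and the position and peak values are hypothetical placeholders, not the recorded data, and the axis direction of the example positions is arbitrary.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values (placeholders, NOT the recorded data):
# position along one anatomical axis (um) and aSRF peak (degrees)
position_um = [0, 150, 300, 450, 600, 750]
peak_deg = [160, 150, 155, 120, 110, 90]

rho = spearman_rho(position_um, peak_deg)
print(f"Spearman's rho = {rho:.3f}")
```

Because the statistic depends only on ranks, it captures any monotonic relationship between position and tuning, not just a linear one, which suits a map whose spacing along the axis is not known in advance.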
Discussion
Our study aimed at characterizing the auditory spatial properties of neurons in two midbrain areas, the OT and the FRLx, and the role of ITD and ILD tuning in shaping the aSRFs. We showed that most of the aSRFs in the FRLx and in the OT of the chicken are confined in both azimuth and elevation; the FRLx units mainly displayed ring-shaped “annular” aSRFs, while the OT units showed round aSRFs (Fig. 9). The role of ITD/ILD tuning in aSRF formation, the implications for auditory processing in the avian midbrain and for multimodal integration in OT are discussed.
Role of ITD and ILD tuning in aSRF formation
Our data suggest that annular and round aSRFs in OT and FRLx are mainly generated through a unit's tuning to ITD and ILD, which could allow midbrain neurons of generalist birds to encode specific positions in elevation. This evidence stands in contrast with the hypothesis that generalist birds can only discriminate sound sources along azimuth (Klump, 2000). Electrophysiological studies showed that aSRFs in the ICx of birds with symmetrical ears are limited only in azimuth, not in elevation (Calford et al., 1985; Volman and Konishi, 1989, 1990). However, in our study the VAS range (azimuth: ±180°, elevation: ±67.5°) was considerably wider than in the other studies, where it was limited to the frontal area of the auditory space. This limitation might not have permitted reconstruction of the entire shape of round and annular aSRFs, which may have appeared as vertical stripes (Volman and Konishi, 1989; see their Figs. 3, 6), or other incomplete shapes (Calford et al., 1985; see their Fig. 2B,C).
In the barn owl, ITDs are exclusively informative for horizontal location of a sound source and ILDs mainly for vertical directions (Moiseff, 1989; Takahashi et al., 2003), and the multiplicative computation of these two cues in the ICx provides narrowly tuned aSRFs (Peña and Konishi, 2001; Fischer et al., 2007). In generalist birds the main cue for spatial tuning might be provided by the ITDs: the aSRF may be round for preferred ITDs close to those occurring at +90° in azimuth, or smaller preferred ITDs may lead to an increased response along a ring-shaped cone of confusion. However, the BB ILDs also seem to be informative for spatial sensitivity, especially for the round aSRFs. We could not investigate the role of spectral ILDs, or the role of low and high frequency ranges in binaural cue processing. Future studies are needed to investigate these aspects of ITD/ILD processing in FRLx and OT.
When compared with other chicken auditory nuclei, the ITD tuning responses in FRLx and OT are narrower than in more peripheral processing levels, such as the chicken nucleus laminaris (Köppl and Carr, 2008) and IC (Aralla et al., 2018). The ILD tuning responses were sigmoidal, as observed in the chicken lateral lemniscal nucleus (LLD; Sato et al., 2010); however, in FRLx and OT the steep part of the ILD curves spans a broader range than that observed in the LLD. The ILD tuning in FRLx and OT is mainly comparable to the “contra-dominated” type of the chicken's IC described previously (Aralla et al., 2018).
Auditory spatial tuning along the pathway ICx–FRLx–OT
Birds have internally coupled ears (ICE), which increase both the ITDs and the ILDs that reach the eardrums. This effect may increase spatial resolution and is particularly beneficial for small-sized birds, which otherwise would experience small ITDs and ILDs. In the chicken, the enhancement of the presented ITDs and the ITD-dependent amplitude modulation can reach a factor of up to 1.8 and 2.22, respectively (Köppl, 2019). Thus, according to our HRTF measurements, the “heard” ITD and ILD ranges might broaden up to ±353 µs and ±15.5 dB, respectively. These cues are then processed along the auditory pathway up to the midbrain nuclei, such as the IC, the FRLx and the OT. The FRLx receives input from ICx and projects to the OT, and it is discussed as the plesiomorphic connection between these two nuclei (Niederleitner et al., 2017). As far as we know, our study is the first in which aSRFs in the FRLx were recorded in vivo. Our main finding is a predominance of annular aSRFs which are arranged in a topographic map. In the barn owl, the ICx has a 2D auditory space map which is projected to the OT, preserving spatial tuning in ICx (Knudsen and Konishi, 1978; Knudsen and Knudsen, 1983). The ICx of symmetrically eared owls contains a topographic map of the best azimuth, where lateral locations are represented at the posterior part of the ICx, and increasingly frontal best azimuths are at more anterior positions (Volman and Konishi, 1989, 1990). Likewise, in our study the smaller annular aSRFs were located at the posterior FRLx area, and progressively larger aSRFs (responsive to more frontal azimuths) were present at more anterior locations. Spatial tuning and medial preferred directions might be projected from ICx onto FRLx.
The physiological function of the annular aSRFs in the FRLx might be to encode the offset of the auditory stimulus from the lateral axis, yielding an “error signal,” where the neural response along the aSRF ring defines the angular distance of the stimulus from the visual axis. This information might be transferred to the (pre) motor networks (Niederleitner et al., 2017) to provide a quick response to a stimulus, or to trigger saccadic head movements to align the stimulus to the visual axis.
Spatial tuning in the OT was dominated by round aSRFs, encoding lateral locations and not organized in a topographic map. Similar tuning was found in the pigeon OT (Lewald and Dörrscheidt, 1998), indicating that the auditory resolution in the OT of generalist birds is much lower than in the barn owl, where neurons are narrowly tuned to specific locations in space, creating a place code of the auditory space (Knudsen, 1982). The lack of ring-shaped aSRFs in the OT implies that FRLx neurons that project onto the OT may create a source for lateral inhibition and shape the tuning in OT. An alternative explanation is that the place coding of space in FRLx is converted to a rate code in OT. Such a mechanism has been observed in the barn owl, where ICx and OT, which have a map of auditory space based on place coding, project to the midbrain tegmentum, involved in eye gaze orientation, which operates as a rate code nucleus (Cazettes et al., 2018; Peña et al., 2019). However, a rate code would generate ambiguities in a 2D environment; thus, it would only be useful for coding sound directions along the azimuthal or elevational plane.
Implications for auditory localization and multimodal integration
The OT, like the superior colliculus, is an important computational hub for multimodal integration (Stein and Meredith, 1993; Knudsen and Brainard, 1995). Both visual and auditory stimuli are processed in the chicken OT (Cotter, 1976), and young chicks show audiovisual integration (Verhaal and Luksch, 2016). In contrast to the barn owl, which possesses auditory specializations and frontally oriented eyes as adaptations for nocturnal hunting (Harmening and Wagner, 2011; Wagner et al., 2013), generalist birds mainly rely on vision and have laterally oriented eyes to monitor the surroundings (Iwaniuk et al., 2008). Thus, the lateral visual axis strongly overlaps with the interaural axis (Schnyder et al., 2014). This implies that the overrepresentation of the auditory space around the lateral axis in the OT might correspond to the visual segment inspected by the area centralis. However, the sizes of the auditory and visual RFs are remarkably different: whereas the aSRFs have a mean diameter of 120°, the visual RFs are no larger than 25° (Verhaal and Luksch, 2013). This discrepancy might be explained by the law of inverse effectiveness, one of the neural computational rules underlying multimodal integration (Stanford and Stein, 2007). It states that multimodal integration is strongest for stimuli that, when presented alone, are minimally effective in eliciting a neuronal response. When sound and image are faint, even a receptive field as large as the aSRFs that we found can be beneficial, as it increases the likelihood that weak stimuli will reach the detection threshold.
From an evolutionary perspective, the sensory organs have evolved to ensure an animal's survival. If the sensory periphery, for physical reasons, does not allow the precise location of a stimulus, it is nevertheless beneficial to exploit the information by orienting another, more precise distance sense. Because of the diurnal lifestyle of early vertebrates, the visual system had been dominant in most taxa; however, mammalian (especially therian) evolution went through a nocturnal niche which strongly altered the sensory layout and tuned the auditory system toward high precision. Thus, our findings in the chicken might represent the ancestral situation for vertebrates. It will be interesting to see whether other amniotes such as lizards process auditory stimuli in a similar fashion.
Footnotes
- Received November 4, 2021.
- Revision received April 4, 2022.
- Accepted April 26, 2022.
This work was supported by the Deutsche Forschungsgemeinschaft (DFG) Grant Lu 622 13/1. We thank Yvonne Schwarz and Birgit Seibel for their help with the histology.
The authors declare no competing financial interests.
- Correspondence should be addressed to Harald Luksch at harald.luksch{at}tum.de
- Copyright © 2022 the authors