Abstract
The natural environment challenges the brain to prioritize the processing of salient stimuli. The barn owl, a sound localization specialist, exhibits a circuit called the midbrain stimulus selection network, dedicated to representing locations of the most salient stimulus in circumstances of concurrent stimuli. Previous competition studies using unimodal (visual) and bimodal (visual and auditory) stimuli have shown that relative strength is encoded in spike response rates. However, open questions remain concerning the effect of auditory–auditory competition on coding. To this end, we present diverse auditory competitors (concurrent flat noise and amplitude-modulated noise) and record neural responses of awake barn owls of both sexes in subsequent midbrain space maps, the external nucleus of the inferior colliculus (ICx) and optic tectum (OT). While both ICx and OT exhibit a topographic map of auditory space, OT also integrates visual input and is part of the global-inhibitory midbrain stimulus selection network. Through comparative investigation of these regions, we show that while increasing strength of a competitor sound decreases spike response rates of spatially distant neurons in both regions, relative strength determines spike train synchrony of nearby units only in the OT. Furthermore, changes in synchrony by sound competition in the OT are correlated with gamma range oscillations of local field potentials associated with input from the midbrain stimulus selection network. The results of this investigation suggest that modulations in spiking synchrony between units by gamma oscillations are an emergent coding scheme representing relative strength of concurrent stimuli, which may have relevant implications for downstream readout.
Significance Statement
While natural auditory scenes comprise many acoustic signals, the brain is capable of segregating sources and prioritizing representation of the most salient sound. This study demonstrates that midbrain space map neurons in the owl's optic tectum, the homolog of the superior colliculus, represent relative strength of concurrent auditory stimuli by modulating the interneuronal spike train synchrony. This synchrony correlates strongly with the low gamma range LFP, which reflects inputs from a midbrain stimulus selection network. These results may provide insight into similar processes in mammalian superior colliculus and human audition, where the most relevant auditory signals must be prioritized in complex noisy environments, and may inform optimization strategies for hearing aids.
Introduction
Natural auditory scenes comprise multiple sounds that coincide in time. Nevertheless, the brain is capable of localizing, segregating, and identifying auditory sources. In this study, we leverage the sound localization system of the barn owl to understand the neural basis of coding relative stimulus strength in multisound environments.
Evolutionary pressure to hunt prey has shaped the barn owl's efficient sound localization system (Payne, 1971). Barn owls localize sound sources using interaural time difference (ITD) and interaural level difference (ILD; Moiseff, 1989) to infer locations in azimuth and elevation, respectively (Knudsen and Konishi, 1980; Moiseff and Konishi, 1981). These cues are combined in the midbrain to form a topographic map of space (Knudsen and Konishi, 1978; Knudsen, 1982). This auditory map first emerges at the external nucleus of the inferior colliculus (ICx), which sends point-to-point connections to the optic tectum (OT), the avian homolog of the mammalian superior colliculus (Knudsen and Knudsen, 1983). The OT integrates auditory information from the ICx and visual information from the retina to form a multimodal map of space, and its population activity guides motor control for orienting behavior toward auditory and/or visual stimuli (du Lac and Knudsen, 1990; Masino and Knudsen, 1992; Netser et al., 2010; Cazettes et al., 2018).
Another key difference between the auditory space map in the ICx and the multimodal OT is the OT's role in sending and receiving reciprocal feedback to the midbrain stimulus selection network to determine and prioritize representation of the most salient stimulus in circumstances of many concurrent areas of activity across the map (Marín et al., 2007; Mysore et al., 2010, 2011). Coordinated activity between the OT and three tegmental nuclei—the nucleus isthmi pars parvocellularis (Ipc), pars semilunaris (SLu), and nucleus isthmi pars magnocellularis (Imc)—leads to global inhibition of spike response rates across the map for all nonsalient source locations (Mysore et al., 2010, 2011; Schryver et al., 2020; Schryver and Mysore, 2023), and focal synchronized activity of neurons responding to salient source locations boosts the bottom–up relay to downstream brain regions (Marín et al., 2012).
While previous studies investigated either competing unimodal (visual) stimuli or bimodal (visual–auditory) stimuli, the consequences of competing auditory stimuli on neural responses have not been fully investigated in the OT. This includes population coding of competing auditory stimuli with a momentary strength that varies over time, which is often the case for natural sounds that exhibit complex envelope structures. Previous work studying the representation of concurrent sounds by space-specific neurons in the ICx has shown that concurrent sounds will summate and interfere with each other at each ear (Keller and Takahashi, 2005). Reductions in ICx neuronal responses induced by concurrent stimuli have been attributed to binaural decorrelation (Keller and Takahashi, 2005) and surround suppression (Wang et al., 2012), with the effects of binaural decorrelation likely inherited from the nucleus laminaris (NL) early on in the ITD detection pathway (Albeck and Konishi, 1995). Current knowledge of multisource representations in the ICx raises the question of what additional computations in the OT contribute to the determination of relative strength for competing stimuli and ultimately facilitate readout by downstream regions involved in behavior and perception.
To this end, we conducted recordings in the ICx and OT in awake owls using multichannel electrodes while presenting diverse types of auditory competitors (flat noise and amplitude-modulated noise) through a free-field speaker array and assessed differences in coding of relative strength for concurrent stimuli between the ICx and OT. The results of this study show that while both the ICx and OT exhibit a reduction in spike responses when presented with competing concurrent stimuli, spike train synchrony of nearby units represents an additional coding scheme that reflects relative strength of concurrent stimuli in the OT, but not in the ICx. Gamma range brain oscillations induced by the midbrain stimulus selection network strongly modulated synchrony. We show two subpopulations of OT neurons with different response properties to amplitude-modulated stimuli and competition-dependent changes in synchrony.
Materials and Methods
Protocols utilized in this study were in compliance with guidelines set by the National Institutes of Health and Albert Einstein College of Medicine's Institute for Animal Studies. All procedures were approved by the Institutional Animal Care and Use Committee of the Albert Einstein College of Medicine.
Animal handling and surgery
Three adult North American barn owls (Tyto furcata) of both sexes (one female, two males) were used in this study. Prior to recordings, custom-built steel head plates were affixed to the skull with dental acrylic to head-fix animals for stereotaxic surgery and recordings. For implanting chronic drives for awake recordings, owls were food-deprived 12 h prior to surgery. On the day of surgery, owls were anesthetized with intramuscular injections of ketamine (Ketaset, 20 mg/kg) and xylazine (Anased, 2 mg/kg). Body temperature was maintained with a heating pad. Proper anesthetic level was assessed from pedal and eyelid reflexes. Prophylactic antibiotic (ampicillin, 20 mg/kg, i.m.) and lactated Ringer's solution (10 ml, s.c.) were administered to maintain adequate hydration throughout surgery. To maintain anesthesia during surgery, doses of ketamine and xylazine were administered every 1.5–2 h. Chronic drives were affixed to the skull using dental acrylic. After implant surgery, owls were given analgesic (Rimadyl, 3 mg/kg, i.m.) to minimize inflammation and pain. The animals remained in a crate that was placed in a warm, quiet environment during recovery and were returned to the main aviary once they were alert and capable of standing. The owls recovered for 5 d before awake recordings were initiated. Prior to implanting electrode drives, the owls were habituated to remain head fixed inside soundproof chambers over increasing amounts of time (up to 1.5 h) while undergoing sound stimulation, such that animals would remain calm during awake recordings. The owls were monitored with an infrared camera, and recordings were terminated if animals showed signs of distress.
Sound stimulation
All recordings were performed in a double-wall sound-attenuated chamber (Industrial Acoustics). The inner walls of the chamber were lined with acoustical foam to minimize echoes (Sonex). Tucker-Davis Technologies System 3 and custom-written Python software were used to synthesize and deliver acoustic stimuli. We utilized our free-field speaker array to deliver stimuli, as described in previous studies (Wang et al., 2012; Beckert et al., 2017, 2020). Sound stimuli were presented in free field through a custom-made hemispherical array of 144 speakers (Sennheiser, 3P127A) arranged to surround a stereotaxic device, used to maintain the owl's head fixed at the center facing 0° azimuth and 0° elevation. Speaker locations range ±100° azimuth and ±80° elevation. The angular separation between speakers varies from 10 to 30°, with the highest density of speakers located in the frontal space at the center of the array. Speakers were calibrated using a Brüel & Kjær microphone (model 4190) positioned at the center of the array.
To measure spatial receptive fields of recorded units, broadband noise stimuli were presented from each of the 144 speakers, five times each in random order (100 ms duration, 5 ms rise–fall times, 1.0 s interstimulus intervals). Preferred and nonpreferred locations were selected and further characterized by measuring responses to changing stimulus levels at these locations: either flat (not amplitude modulated) broadband noise or sinusoidally amplitude-modulated (modulation frequency: fAM 55 or 75 Hz) broadband noise stimuli were presented at levels ranging from 27 to 72 decibels sound pressure level (dB SPL) in increments of 5 dB (1.0 s duration, 5 ms rise–fall times, 2.5 s interstimulus intervals, 20 repetitions per level in random order).
For competing sound paradigms, stimuli consisted of two sounds (1.0 s duration, 5 ms rise–fall times, 2.5 s interstimulus intervals, 20 repetitions per condition in random order): either two de novo uncorrelated flat broadband noise stimuli or two sinusoidally amplitude-modulated uncorrelated broadband noise stimuli with different modulation frequencies (either fAM 55 or fAM 75 Hz). Competing stimuli were presented simultaneously from two speakers at different azimuths and the same elevation. For each recording session, the preferred azimuth and elevation of all units were determined from the spatial receptive fields. One stimulus, the driver, was presented from a speaker location within the receptive field of simultaneously recorded units (preferred location). The second stimulus, the competitor, was presented from a nonpreferred location at least 50° away and within the same hemisphere, contralateral to the recorded midbrain hemisphere. We chose to investigate competition along azimuth because midbrain space map neurons exhibit ellipsoidal tuning curves that are wider in elevation than azimuth, which predict twofold better discrimination in azimuth than elevation (Knudsen and Konishi, 1978; Bala et al., 2007). Thus, we could use a single elevation to evoke responses across simultaneously recorded units. We ensured that the distant competitor stimulus alone did not evoke significant responses by measuring spike response rates to increasing stimulus levels; units whose spike response rates exceeded the baseline activity by >2 SD in response to the competitor were excluded. Similarly, only units with spike response rates >2 SD above baseline for the driver were included. The modulation frequencies 55 and 75 Hz for amplitude-modulated stimuli were chosen to match a previous study investigating the representation of concurrent sound sources in the ICx (Keller and Takahashi, 2005).
Amplitude-modulated stimuli were constructed by multiplying broadband noise with a sinusoidal envelope (100% modulation depth, meaning the final amplitude ranged between 0 and the amplitude of the flat broadband noise). Driver stimuli were presented at a fixed level in each recording block (flat noise, either 47 or 62 dB SPL; amplitude-modulated noise, either 43 or 58 dB SPL). Different driver levels were chosen to mitigate the influence of overall loudness; because effects were independent of overall sound level, results were pooled. Competitor levels varied across trials, ranging from 15 dB below to 10 dB above the level of the driver (relative levels −15 to +10 dB, increments of 5 dB). Both driver and competitor stimuli were played at intensities above threshold to ensure stimuli evoked sufficient neuronal responses.
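The envelope construction described above can be illustrated with a minimal sketch. The sampling rate, the starting phase of the envelope, the random seeds, and all function and variable names are assumptions made for illustration only; rise–fall ramps and level calibration are omitted.

```python
import numpy as np

def sam_noise(duration_s=1.0, fs=48_000, f_am=55.0, depth=1.0, seed=None):
    """Sinusoidally amplitude-modulated broadband noise (illustrative sketch).

    fs and the envelope starting phase are assumed values, not taken from the study.
    """
    rng = np.random.default_rng(seed)
    n = int(round(duration_s * fs))
    t = np.arange(n) / fs
    carrier = rng.standard_normal(n)  # broadband (Gaussian) noise carrier
    # At depth = 1 the envelope spans 0..1, so peaks match the flat-noise amplitude
    envelope = 1.0 - 0.5 * depth * (1.0 - np.sin(2.0 * np.pi * f_am * t))
    return carrier * envelope, envelope

# Two uncorrelated stimuli with different modulation frequencies
driver, env_driver = sam_noise(f_am=55.0, seed=1)
competitor, env_competitor = sam_noise(f_am=75.0, seed=2)
```

Drawing each carrier from an independent random stream is one simple way to obtain mutually uncorrelated driver and competitor signals.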
Electrophysiology
Chronic NeuroNexus microdrives (dDrive) loaded with either a single tetrode (HQ4 probes) or a single linear 16-channel probe (H16) were implanted in the OT to record spike responses and local field potentials (LFPs). Recording sites along the 16-channel probe spanned 1.5 mm with 100 µm spacing between sites. In one bird, an additional implant of the H16 probe was placed in the ICx after recordings in the OT were concluded. Microdrives allowed electrodes to be advanced up to 2.5 mm along the dorsoventral axis from the initial site of implantation. Proper placement of electrodes in the OT and ICx was guided by stereotaxic coordinates and characteristic electrophysiological properties of neurons in this region: the OT neurons exhibited unambiguous tuning to binaural cues and were spontaneously bursty and multimodal, responding to both visual and auditory stimuli, whereas the ICx neurons were strictly auditory (Knudsen and Konishi, 1978; Knudsen, 1982, 1984). During awake recording sessions, electrodes were advanced in 300 µm increments in search of new units after each recording block that spanned over 3 d. Recording signals were amplified, digitized, and stored using Tucker-Davis Technologies System 3 and custom-written Python code. Spike times of multiunit activity were extracted offline by thresholding voltage traces of each channel semimanually per recording session, blinded to which spikes were stimulus driven or spontaneous.
Data analysis
For all sound paradigms, we calculated the spike response rate from the mean spike count evoked by each stimulus within the stimulus presentation window across either five repetitions for spatial tuning or 20 repetitions for each stimulus level.
Two-sound competition
Relative changes in the spike response rate to competing stimuli were calculated as the percent change in response to two concurrent stimuli relative to responses measured to a single stimulus at the position and level of the driver stimulus, similar to measures previously reported (Mysore et al., 2010, 2011; Mysore and Knudsen, 2011):
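As a hedged illustration of the percent-change measure described in the sentence above, a minimal sketch could look as follows. It assumes the conventional definition (difference between the two-stimulus response and the driver-alone response, divided by the driver-alone response, times 100); the function and argument names are hypothetical.

```python
import numpy as np

def percent_change(rate_competition, rate_driver_alone):
    """Percent change of the two-stimulus response relative to the driver alone.

    Assumed form: 100 * (R_competition - R_driver) / R_driver.
    """
    rate_competition = np.asarray(rate_competition, dtype=float)
    rate_driver_alone = np.asarray(rate_driver_alone, dtype=float)
    return 100.0 * (rate_competition - rate_driver_alone) / rate_driver_alone
```

Under this convention, negative values indicate suppression by the competitor and zero indicates no change relative to the driver presented alone.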
Correlation analysis
To examine the synchrony of temporal spiking patterns between nearby units, we computed cross-correlograms (CCGs) of spike trains for simultaneously recorded unit pairs (Bair et al., 2001; Kohn and Smith, 2005; Smith and Kohn, 2008; Beckert et al., 2017, 2020). Discrete spike trains were converted to binary sequences indicating the presence of spikes in time by constructing peristimulus time histograms (PSTHs) for each trial from 50 to 1,000 ms poststimulus onset in 1 ms bins. Responses from the first 50 ms of the stimulus were excluded to remove influences attributed to the onset response. Trial-averaged CCGs were smoothed using a rectangular 5 ms sliding window and normalized by the geometric mean spike response rate of neuron pairs within the analysis window. To minimize changes in CCG magnitudes related to slow fluctuations and stimulus-locked correlations, we also computed the shifted CCG for nonsimultaneous trials (Kohn and Smith, 2005). To obtain the shifted CCG, the spike train from one trial for a single neuron was cross-correlated with the spike train from the subsequent trial for the other neuron. The trial-averaged shifted CCG was then smoothed and subtracted from the original CCG to calculate the shift-corrected CCG. From this shift-corrected CCG, the noise level and standard deviation observed in each cross-correlogram were measured at time lags ±940 to ±950 ms. Spike train synchrony (STS) is defined as the maximal value observed at lags within ±15 ms around zero minus the noise level. Peaks that did not exceed five standard deviations of the noise level were excluded from further analysis.
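The shift-corrected CCG and the STS measure can be sketched as below. This is an illustrative reading of the steps above, not the authors' code: the input layout (trials × 1 ms bins, covering 50 to 1,000 ms), the use of rates in spikes/s for normalization, applying the normalization after the subtraction, and all names are assumptions.

```python
import numpy as np

def spike_train_synchrony(x, y, bin_s=0.001, smooth_bins=5,
                          peak_lag=15, noise_lags=(940, 950), n_sd=5.0):
    """x, y: (n_trials, n_bins) binary spike arrays of two simultaneously recorded units."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_bins = x.shape[1]
    lags = np.arange(-(n_bins - 1), n_bins)       # lag axis in bins (1 bin = 1 ms here)
    kernel = np.ones(smooth_bins) / smooth_bins   # rectangular sliding window

    def mean_ccg(a, b):
        # Trial-averaged cross-correlogram, then smoothed
        ccg = np.mean([np.correlate(ai, bi, mode="full") for ai, bi in zip(a, b)], axis=0)
        return np.convolve(ccg, kernel, mode="same")

    raw = mean_ccg(x, y)                 # same-trial pairs
    shifted = mean_ccg(x[:-1], y[1:])    # trial k of unit x with trial k + 1 of unit y
    # Geometric mean firing rate of the pair (spikes/s) within the analysis window
    norm = np.sqrt((x.mean() / bin_s) * (y.mean() / bin_s))
    corrected = (raw - shifted) / norm

    noise = corrected[(np.abs(lags) >= noise_lags[0]) & (np.abs(lags) <= noise_lags[1])]
    sts = corrected[np.abs(lags) <= peak_lag].max() - noise.mean()
    significant = sts > n_sd * noise.std()   # peak must exceed 5 SD of the noise
    return sts, significant, lags, corrected
```

Pairs for which `significant` is false would be excluded from further analysis under the criterion stated above.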
We quantified changes in spike train synchrony induced by competition relative to spike train synchrony observed for single stimuli, the competition synchrony index (CSI):
Estimating phase and vector strengths of spike timing for amplitude-modulated stimuli
To examine the influence of spike timing by amplitude-modulated stimuli, spike times were first converted modulo the period of the modulation frequency (Keller and Takahashi, 2005). Then, the magnitude (vector strengths) and phase of the mean vector were determined for all spike times collected across all repetitions. For competing stimuli, spike times were related to the driver modulation frequency, except in instances where we compared the relative contributions of either the driver or competitor stimuli.
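A minimal sketch of the vector strength and mean phase computation is given below, assuming spike times are expressed in seconds relative to the start of the modulation cycle; the function name is hypothetical.

```python
import numpy as np

def vector_strength(spike_times_s, f_am):
    """Return (vector strength, mean phase in radians) of spikes relative to f_am."""
    spike_times_s = np.asarray(spike_times_s, dtype=float)
    # Spike times modulo the modulation period, expressed as phase angles
    phases = 2.0 * np.pi * np.mod(spike_times_s * f_am, 1.0)
    mean_vector = np.mean(np.exp(1j * phases))
    return np.abs(mean_vector), np.angle(mean_vector) % (2.0 * np.pi)
```

A vector strength near 1 indicates that spikes cluster at one phase of the envelope, whereas values near 0 indicate firing that is independent of the modulation phase.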
Response latency
The response latency was determined from each unit's average PSTH as the interval between stimulus onset and the first time at which half the maximum peak response was observed. Average PSTHs were constructed from 20 repetitions to a single driver stimulus at a fixed intensity from data collected for rate level curves.
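The half-maximum latency criterion can be sketched as follows, assuming the PSTH starts at stimulus onset, uses 1 ms bins, and is not baseline-subtracted; these choices and the function name are illustrative assumptions.

```python
import numpy as np

def half_max_latency(psth, bin_s=0.001):
    """Time from stimulus onset to the first bin reaching half of the peak response."""
    psth = np.asarray(psth, dtype=float)
    half_max = 0.5 * psth.max()
    first_bin = np.argmax(psth >= half_max)  # index of the first bin at or above half max
    return first_bin * bin_s
```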
LFP analyses
The LFP was derived from the wideband signal by first removing 60 Hz electrical line noise and its third harmonic (180 Hz), using an IIR notch filter (scipy.signal.iirnotch). Bleed-through effects from spiking activity were controlled by the following procedure adopted from a previous study (Sridharan et al., 2011). Briefly, the denoised signal was bandpass-filtered between 300 Hz and 4.7 kHz (Butterworth filter, scipy.signal.butter, order 2, run forward–backward with scipy.signal.filtfilt) and subtracted from the original signal to get the remaining low-frequency signal. From the remaining low-frequency signal, 2 ms windows at each spike time were linearly interpolated. This spike-corrected remnant low-frequency signal was then added back to the original signal to arrive at the spike-removed signal. This signal was low-pass filtered at 200 Hz (Butterworth, as above) and resampled to 1 kHz (scipy.signal.resample_poly).
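A simplified sketch of this preprocessing chain, using the SciPy functions named above, is shown below. It is not the authors' implementation: the acquisition sampling rate (25 kHz), the notch quality factor, the zero-phase application of the notch filter, and the centering of the 2 ms interpolation window on each spike are assumptions, and the add-back step is left out so that the low-frequency remainder is passed directly to the low-pass stage.

```python
import numpy as np
from scipy import signal

def extract_lfp(wideband, spike_times_s, fs=25_000, notch_hz=(60.0, 180.0),
                q=30.0, fs_out=1_000):
    """Simplified LFP extraction; fs and q are assumed values."""
    x = np.asarray(wideband, dtype=float)
    # 1) Remove line noise with IIR notch filters
    for f0 in notch_hz:
        b, a = signal.iirnotch(f0, q, fs=fs)
        x = signal.filtfilt(b, a, x)
    # 2) Subtract the spike band (300 Hz to 4.7 kHz) to keep the low-frequency remainder
    b, a = signal.butter(2, [300.0, 4700.0], btype="bandpass", fs=fs)
    low = x - signal.filtfilt(b, a, x)
    # 3) Linearly interpolate across a 2 ms window around each spike time
    half = int(round(0.001 * fs))
    keep = np.ones(x.size, dtype=bool)
    for t in spike_times_s:
        i = int(round(t * fs))
        keep[max(i - half, 0): i + half + 1] = False
    idx = np.arange(x.size)
    low = np.interp(idx, idx[keep], low[keep])
    # 4) Low-pass at 200 Hz and resample to 1 kHz
    b, a = signal.butter(2, 200.0, btype="lowpass", fs=fs)
    lfp = signal.filtfilt(b, a, low)
    return signal.resample_poly(lfp, fs_out, fs)
```

Because `resample_poly` requires integer up- and down-sampling factors, a non-integer acquisition rate would first need to be expressed as a rational ratio.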
We calculated the induced response of the LFP by subtracting the mean evoked response from the LFP of each trial. The induced response was filtered between 20 and 50 Hz (Butterworth, order 1), which is the low gamma range that corresponds to the Ipc input (Asadollahi et al., 2010). Gamma power is the root mean square (RMS) magnitude of the analytic signal, calculated from the Hilbert transform of the bandpass-filtered signal. Power during stimulus presentation was measured 50 ms after stimulus onset to the end of stimulus duration, and spontaneous baseline power was measured for an equivalent time without stimulation. The stimulus-related change in gamma power was estimated as follows:
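The steps leading up to the gamma power estimate (induced response, band-pass filtering, and RMS of the analytic amplitude) can be sketched as below. The sketch covers only these steps and does not define the stimulus-related change itself; the zero-phase filtering, the input layout (trials × samples at 1 kHz, starting at stimulus onset), and all names are assumptions.

```python
import numpy as np
from scipy import signal

def induced_gamma_rms(lfp_trials, fs=1_000, band=(20.0, 50.0),
                      t_start_s=0.05, t_stop_s=1.0):
    """Per-trial RMS of the low-gamma analytic amplitude of the induced LFP."""
    lfp = np.asarray(lfp_trials, dtype=float)
    # Induced response: subtract the trial-averaged (evoked) response from each trial
    induced = lfp - lfp.mean(axis=0, keepdims=True)
    b, a = signal.butter(1, band, btype="bandpass", fs=fs)
    filtered = signal.filtfilt(b, a, induced, axis=-1)
    amplitude = np.abs(signal.hilbert(filtered, axis=-1))  # analytic signal magnitude
    i0, i1 = int(round(t_start_s * fs)), int(round(t_stop_s * fs))
    return np.sqrt(np.mean(amplitude[:, i0:i1] ** 2, axis=-1))
```

The same function could be applied to an equally long window without stimulation to obtain the spontaneous baseline.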
Statistical analyses
To compare various measures across single stimulus levels and competing relative levels, we conducted one-way ANOVA and subsequent post hoc Tukey’s tests to distinguish significant differences across groups. To test for significant phase locking of spikes to amplitude-modulated stimuli, we performed Rayleigh's test of uniformity on the distribution of spike phase values for single units across trials or the distribution of mean spike phases across units.
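For illustration, the one-way ANOVA and Rayleigh's test could be run as sketched below. The Rayleigh p value uses a common closed-form approximation, which is an assumption about the exact variant of the test; the helper names are hypothetical.

```python
import numpy as np
from scipy import stats

def one_way_anova(groups):
    """One-way ANOVA across groups (e.g., one array of values per relative level)."""
    f_stat, p_value = stats.f_oneway(*groups)
    return f_stat, p_value

def rayleigh_test(phases_rad):
    """Rayleigh test of circular uniformity; returns (z statistic, approximate p value)."""
    phases = np.asarray(phases_rad, dtype=float)
    n = phases.size
    resultant = n * np.abs(np.mean(np.exp(1j * phases)))  # resultant vector length R
    z = resultant ** 2 / n
    # Closed-form approximation of the p value (assumed variant of the test)
    p = np.exp(np.sqrt(1 + 4 * n + 4 * (n ** 2 - resultant ** 2)) - (1 + 2 * n))
    return z, min(float(p), 1.0)
```

Post hoc pairwise comparisons could follow a significant ANOVA, for example with `scipy.stats.tukey_hsd` in recent SciPy versions.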
Code accessibility
Python code used for data analysis is available at https://github.com/penalab/Bae-Ferger-et-al-2024 or available upon request.
Results
To assess changes in spike responses and synchrony of nearby neurons from auditory competition, we recorded neuronal responses in three awake owls of both sexes (two males, one female) using chronically implanted multichannel electrodes (either a single tetrode or a 16-channel linear probe). While single tetrodes were used for investigating responses of nearby OT neurons (two owls), in one owl, both the OT and ICx were independently recorded with a 16-channel linear probe to investigate differences between these midbrain regions. Because of the topographic mapping of the midbrain and electrode design, simultaneously recorded neurons were tuned to the same or similar preferred locations. Analyses in the OT for competition of flat broadband noise stimuli were based on 262 units from 52 recording sessions, and analyses for competition of amplitude-modulated stimuli contained 204 units from 42 recording sessions (fAM driver, 55 Hz; fAM competitor, 75 Hz) and 205 units from 42 recording sessions (fAM driver, 75 Hz; fAM competitor, 55 Hz). Analyses in the ICx included 104 units from seven recording sessions for competition of flat noise stimuli and 90 units from six recording sessions for competition of amplitude-modulated stimuli.
Competition of broadband noise stimuli–response rates and spike train synchrony in the OT
We first characterized the spatial receptive field of the units we recorded by presenting stimuli from our free-field array of 144 speakers (Fig. 1A). To determine the effect of competing auditory stimuli on spike responses, we presented two uncorrelated broadband noise stimuli in our free-field speaker array from separate speaker locations. One stimulus, the driver, was presented from a speaker location at the preferred location as determined by the spatial receptive field, and another stimulus, the competitor, was presented in the same hemifield as the driver but at a distant location (Fig. 1B). We verified that single stimuli evoked responses only from the location of the driver, but not from the location of the competitor, by measuring spike response rates to increasing stimulus levels, presented from either speaker alone, and constructing rate level curves.
We altered the relative strength of the driver and competitor by presenting the driver at a fixed intensity while the competitor intensity varied across trials. Overall, spike response rates in the OT significantly decreased as the relative level of the competitor increased (Fig. 1C; F(1,5) = 203.1; p = 4.07 × 10⁻¹⁶⁷; ANOVA). We observed suppressed responses when the level of the competitor was greater than the level of the driver (relative levels >0 dB); that is, spike response rates were lower than when the driver was presented alone at the same intensity used for competition trials. Spike responses for units that preferred the driver location were inhibited the most when the competitor level was loudest at +10 dB relative level (Fig. 1C; median reduction, −42.8%). This finding is consistent with similar competition studies for unimodal visual and bimodal visual–auditory stimuli (Mysore et al., 2010, 2011; Mysore and Knudsen, 2011), as well as studies that examined the representation of concurrent auditory stimuli in the ICx (Keller and Takahashi, 2005).
Additionally, we investigated changes in spike train synchrony between nearby OT neurons due to competing stimuli through multichannel electrode recordings. Raster plots for two relative level conditions showed a decrease in coincident spikes between simultaneously recorded units when the competitor was louder (Fig. 2A). Cross-correlograms of spike trains between simultaneously recorded neurons showed a single peak around zero lag (mean best lag: 0.0 ms, 4 ms SD). The peak height was used as an indicator of spike train synchrony as described in Materials and Methods. Spike train synchrony decreased as competitor levels increased (Fig. 2B). The CSI was used to quantify the change in spike train synchrony by competition relative to single stimulation (Fig. 2C; F(1,5) = 61.75; p = 2.58 × 10⁻⁶²; ANOVA). Synchrony decreased as the competitor became louder. A previous study using tetrode recordings in the OT demonstrated that spike patterning was more reproducible across trials and more synchronous between nearby neurons when stimuli were presented at their preferred location (Beckert et al., 2020), suggesting that spike patterning due to stimulus envelopes was the strongest when the stimulus resided inside the receptive field. Our data corroborate the results of this previous study and further extend these findings to the context of concurrent competing stimuli, indicating that changes in spike train synchrony may have implications for encoding relative strength under multiple sound conditions.
Competition of amplitude-modulated noise stimuli–response rates and spike train synchrony in the OT
To further test how the stimulus envelope shape influences the representation of stimulus competition in the OT, we assessed changes in spike response rates and spike train synchrony in response to competing amplitude-modulated stimuli. The use of periodic amplitude-modulated stimuli has some advantages: natural stimuli exhibit complex envelopes that vary in amplitude over time, and sinusoidal modulation is considered a valuable approximation to natural envelopes, compared with unmodulated broadband noise. Additionally, the periodicity of sinusoidal modulation enables analysis of how predictable envelope properties influence spike timing.
Similar to our data from competing flat noise, increasing levels of an amplitude-modulated competitor led to decreases in spike response rates (Fig. 3A,B). This held true regardless of whether the amplitude modulation frequency (fAM) of the competitor was higher (Fig. 3A; F(1,5) = 65.28; p = 1.79 × 10⁻⁶⁰; ANOVA) or lower (Fig. 3B; F(1,5) = 42.56; p = 1.64 × 10⁻⁴⁰; ANOVA) than that of the driver. Spike responses for units that preferred the driver location were inhibited the most when the competitor level was loudest (relative level +10 dB; median reduction −36.5 and −32.4% for higher and lower competitor fAM, respectively). This decrease in spike response rate with higher competitor levels, along with a decrease in coincident spikes, is exemplified in two raster plots of simultaneously recorded units (Fig. 3C,D). Across all unit pairs, spike train synchrony between neighboring units decreased as the competitor level increased (Fig. 3E,F). Again, relative strength determined synchrony independently from modulation frequencies: synchrony decreased regardless of whether the competitor fAM was higher (Fig. 3G; F(1,5) = 45.50; p = 1.87 × 10⁻⁴⁵; ANOVA) or lower (Fig. 3H; F(1,5) = 46.45; p = 2.02 × 10⁻⁴⁶; ANOVA). These results suggest that spike train synchrony may underlie coding of relative stimulus strength within the midbrain stimulus selection network and is unbiased by the envelope of the sound.
Cross-region comparison with the ICx
Neurons in the ICx have been shown to phase lock to the envelope of amplitude-modulated sounds when stimuli are presented within their receptive field (Keller and Takahashi, 2005). Through modeling, it was demonstrated that the suppression of spike responses for concurrent stimuli is largely attributed to binaural decorrelation (Keller and Takahashi, 2005) and is likely inherited from NL early on in the ITD detection pathway (Albeck and Konishi, 1995). As a downstream region, the OT would be affected by this as well. However, a critical difference between the ICx and OT is the role of the OT as part of the midbrain global inhibition network (Mysore et al., 2010, 2011; Mysore and Knudsen, 2011). While the OT inherits much of the influences of binaural decorrelation and lateral inhibition from the ICx, whether coding in multisound environments impacts the determination of the most salient sound source in the OT is an open question. Critically, changes in spike train synchrony between units in the presence of concurrent sounds have not been analyzed in the ICx before.
To address this open question, we conducted recordings in one owl using a 16-channel electrode in the ICx and measured changes in spike response rate and spike train synchrony with competition in this region. Independently recorded datasets using similar 16-channel probes from the same owl in the ICx and OT were compared to assess region-specific differences. Significant reductions in the ICx spike response rate were observed with increasing competitor levels for competing flat broadband noise stimuli (Fig. 4A; F(1,5) = 6.29; p = 1.04 × 10⁻⁵; ANOVA), and for amplitude-modulated stimuli, independent of whether the competitor fAM was higher (Fig. 4B; F(1,5) = 21.57; p = 1.1 × 10⁻¹⁹; ANOVA) or lower (Fig. 4C; F(1,5) = 4.32; p = 7.36 × 10⁻⁴; ANOVA) than the driver fAM. ICx spike response rates were maximally suppressed when the competitor was loudest at +10 dB for all stimulus conditions (flat noise Fig. 4A: median reduction, −26.0%; competitor fAM higher Fig. 4B: −16.4%; competitor fAM lower Fig. 4C: −17.8%).
We then further analyzed changes in synchrony with competition in the ICx. Regardless of whether the driver or competitor was stronger, raster plots from example units show that competition in the ICx did not change the number of coincident spikes between simultaneously recorded units for flat noise (Fig. 5A) or amplitude-modulated noise (Fig. 6A,B). Across the recorded population, spike train synchrony to a single sound in all unit pairs was lower in the ICx (flat noise Fig. 5B, horizontal line: 5.33 × 10⁻⁶ SD 4.14 × 10⁻⁶; competitor fAM higher Fig. 6A: 6.67 × 10⁻⁶ SD 4.09 × 10⁻⁶; competitor fAM lower Fig. 6B: 6.51 × 10⁻⁶ SD 4.08 × 10⁻⁶) than that in the OT (flat noise Fig. 2A: 4.08 × 10⁻⁵ SD 2.81 × 10⁻⁵; competitor fAM higher Fig. 3C: 7.17 × 10⁻⁵ SD 3.25 × 10⁻⁵; competitor fAM lower Fig. 3D: 7.12 × 10⁻⁵ SD 4.76 × 10⁻⁵) for all stimulus types. When presented with competing sounds, the ICx did not show the same reduction in spike train synchrony for flat noise (Fig. 5C; F(1,5) = 0.51; p = 0.77; ANOVA) and amplitude-modulated noise with the competitor fAM higher (Fig. 6E; F(1,5) = 0.38; p = 0.86; ANOVA) or lower (Fig. 6F; F(1,5) = 1.21; p = 0.30; ANOVA) than the driver fAM as was observed in the OT (Figs. 2C, 3G,H). We hypothesize that the changes in correlation patterns observed in the OT in response to competing sounds may be attributed to the modulatory input from the midbrain stimulus selection network. We assess this and other possible explanations for the observed data in the following sections.
Elevation tuning does not explain different changes in synchrony
We first assessed whether measures of synchrony for competing sounds across azimuth positions but constant elevation might be impacted by differences in elevation tuning across units recorded by our multichannel linear electrode. Because ILD tuning changes along the dorsoventral axis in both midbrain space maps, our linear probe might have sampled populations of neurons that share the same preferred azimuth but exhibit different preferred elevations (Knudsen and Konishi, 1978; Knudsen, 1982). To this end, we analyzed the spatial receptive fields for simultaneously recorded units in each recording session, characterized the elevation tuning at the preferred azimuth of each unit, and examined tuning similarity between simultaneously recorded unit pairs by computing signal correlation. Each unit’s best elevation and elevation tuning width were calculated by fitting a Gaussian function to the responses across elevations. The mean of the fitted function served as best elevation, while the elevation tuning width was defined as two times the standard deviation. Signal correlation was determined as Pearson’s correlation of the mean spike response rates of the elevation tuning curves for a pair of units (Bair et al., 2001; Beckert et al., 2017). The distribution of preferred elevations observed across all units showed that the electrode in the OT sampled units tuned overall to slightly more positive elevations than the units recorded from the ICx (Fig. 7A). This may be attributed to different initial depths of the implanted electrodes in the ICx and OT but is not expected to result in synchrony differences, because competition is observed across all elevations and azimuths (Mahajan and Mysore, 2018). On the other hand, the distributions of elevation tuning widths across all units (Fig. 7B) and signal correlation for simultaneously recorded unit pairs (Fig. 7C) were comparable in both regions, indicating similar compositions of the recorded populations with respect to spatial tuning properties. The intrasession standard deviations were used as a measure of the population diversity. There were no significant differences between our recordings in the OT and ICx for best elevation (Fig. 7D; p = 0.47; Wilcoxon), elevation width (Fig. 7E; p = 0.94; Wilcoxon), and signal correlation of simultaneously recorded units (Fig. 7F; p = 0.81; Wilcoxon). Altogether, these analyses indicate that sampling differences in elevation within our data are unlikely explanations of the observed differences in spike train synchrony.
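As an illustration of the signal correlation measure described above, the following minimal Python sketch computes Pearson's correlation between two elevation tuning curves. It is not the analysis code used in this study: the function names, the sampled elevations, and the Gaussian parameters of the three example units are hypothetical, and the tuning curves are generated from Gaussians rather than fitted to recorded responses.

```python
import math


def signal_correlation(rates_a, rates_b):
    """Pearson correlation between two tuning curves sampled at the same elevations."""
    if len(rates_a) != len(rates_b) or len(rates_a) < 2:
        raise ValueError("tuning curves must share the same elevations (>= 2 points)")
    n = len(rates_a)
    mean_a = sum(rates_a) / n
    mean_b = sum(rates_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(rates_a, rates_b))
    var_a = sum((a - mean_a) ** 2 for a in rates_a)
    var_b = sum((b - mean_b) ** 2 for b in rates_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a flat tuning curve has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


def gaussian_tuning(elevations, best, sd, peak):
    """Hypothetical Gaussian tuning curve (mean spike rate at each elevation)."""
    return [peak * math.exp(-0.5 * ((e - best) / sd) ** 2) for e in elevations]


# Hypothetical elevations (degrees) and unit parameters, for illustration only.
elevations = list(range(-40, 41, 10))
unit_a = gaussian_tuning(elevations, best=0.0, sd=15.0, peak=40.0)
unit_b = gaussian_tuning(elevations, best=5.0, sd=15.0, peak=25.0)
unit_c = gaussian_tuning(elevations, best=-30.0, sd=15.0, peak=40.0)

r_similar = signal_correlation(unit_a, unit_b)    # nearby best elevations
r_different = signal_correlation(unit_a, unit_c)  # distant best elevations
tuning_width = 2 * 15.0  # width defined as two times the standard deviation
```

Because Pearson's correlation is invariant to scaling, units with different peak rates but similar best elevations (unit_a and unit_b here) still yield a high signal correlation, whereas units with distant best elevations yield a lower value.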
Interplay of envelope locking and spike train synchrony
We attempted to determine the role of the acoustic stimuli in driving changes in spike train synchrony in the OT during competition by first computing the mean phase at which spikes occur relative to the sinusoidal modulations of the driver stimulus. In conditions with a single stimulus presented at the preferred location, increasing levels caused spikes to occur at earlier phases of the stimulus envelope (Fig. 8A,B, note that subpanels are ordered with levels decreasing from left to right). This is expected because spike latencies decrease with increasing stimulus intensities. Interestingly, we also observed similar phase shifts in the context of competition, with spikes shifting to later phases as the competitor level increased while the driver level was held constant (Fig. 8C,D). It is noteworthy that not all units in the OT exhibit phase locking to the presented amplitude modulations; some instead respond independently of the modulation phase, which is indicated by a low vector strength (Fig. 8C,D). Preferred phases of most units with significant phase locking to amplitude modulations were similar to each other, within the rising phase or peak of the amplitude modulation cycle. Additionally, the overall vector strength from the population was strongest when the driver stimulus was louder and weaker when the competitor was louder (Fig. 8C,D).
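The mean phase and vector strength referred to above can be sketched as follows. This is a minimal illustration of the standard circular-statistics definition (the length and angle of the mean resultant vector of spike phases), not the code used in this study. The 55 Hz modulation frequency and the spike times are hypothetical, and the envelope is assumed to start at phase zero at stimulus onset.

```python
import cmath
import math


def spike_phases(spike_times_s, mod_freq_hz):
    """Phase (radians, 0 to 2*pi) of the modulation cycle at each spike time."""
    return [(2 * math.pi * mod_freq_hz * t) % (2 * math.pi) for t in spike_times_s]


def vector_strength(phases):
    """Return (vector strength, mean phase) of a list of spike phases."""
    if not phases:
        raise ValueError("at least one spike is required")
    resultant = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(resultant), cmath.phase(resultant) % (2 * math.pi)


mod_freq_hz = 55.0  # hypothetical modulation frequency

# One spike per cycle, always a quarter cycle after the cycle start: perfect locking.
locked_spikes = [(k + 0.25) / mod_freq_hz for k in range(20)]
vs_locked, phase_locked = vector_strength(spike_phases(locked_spikes, mod_freq_hz))

# Spikes spread evenly over all phases of the cycle: no locking.
unlocked_spikes = [(k + k / 20) / mod_freq_hz for k in range(20)]
vs_unlocked, _ = vector_strength(spike_phases(unlocked_spikes, mod_freq_hz))
```

A vector strength near 1 indicates that spikes cluster at one phase of the envelope, while a value near 0 indicates firing that is independent of the modulation phase. A shift of the mean phase toward smaller values corresponds to spikes occurring earlier in the modulation cycle.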
We also investigated the relative contribution of driver and competitor stimuli to spike patterning by comparing vector strength in relation to amplitude modulation of either the driver or the competitor when both are present for each ICx and OT unit (Fig. 9). When the driver was louder, most units showed stronger phase locking to the driver and almost none to the competitor. As the competitor level increased to match and eventually surpass the driver, OT units exhibited equal phase locking to either stimulus (Fig. 9A). This effect on spike timing was also observed in the ICx (Fig. 9B), indicating that the influence of competing sounds on spike patterning is unlikely to be a unique effect underlying the reduction in synchrony specific to the OT. Phase locking in the ICx to the amplitude modulation of a competitor source arises when both the driver and competitor sounds share frequencies with common interaural phase differences that the neuron may be tuned to (Keller and Takahashi, 2005). Units that do not phase lock to the driver almost never phase lock to the competitor either, suggesting that this is a property of the neurons rather than the stimulus. Interestingly, only half of OT units phase locked to the stimulus compared with almost all units in the ICx (Fig. 9A,B). Overall, these results suggest that OT-specific changes in synchrony cannot be explained by the temporal dynamics of the stimuli alone.
The stimulus selection network remains a likely cause of spike train synchrony
Unlike the previously described sources of spike patterning, modulations from the midbrain stimulus selection network might be influencing the differential spike train synchrony specific to the OT. Specifically, spatial tuning in Ipc, a relevant nucleus within the midbrain stimulus selection network, which is also topographically organized, detects relative strength among simultaneous inputs across the map and supplies periodic gamma oscillation input (20–50 Hz) to regions in the OT that represent the most salient location (Asadollahi et al., 2010; Bryant et al., 2015). To determine whether Ipc projections may play a role in OT spike train synchrony, we compared power in the low gamma range of the LFP, associated with Ipc input, between the ICx and OT (Fig. 10A,B). Sound competition resulted in a decrease of gamma power in the OT, which correlates to changes in relative strength of the competing stimuli (Fig. 10A; F(1,5) = 89.76; p = 3.59 × 10−83; ANOVA). On the other hand, gamma power in the ICx does not change in a manner that represents relative strength (Fig. 10B, left; F(1,5) = 4.72; p = 3.09 × 10−4; ANOVA). We further analyzed the strength of the relationship between spike train synchrony and phase locking (vector strengths) of unit pairs to either gamma oscillations of the LFP or to the stimulus temporal dynamics for competing amplitude-modulated sounds. Nearby OT units exhibited a strong positive correlation between phase locking to the LFP, measured as vector strength, and spike train synchrony, suggesting that the two are closely linked (Fig. 10C). However, high vector strengths to stimulus amplitude modulations across neuron pairs were not correlated to spike train synchrony and pairs exhibited exclusively either strong spike train synchrony or high vector strengths to the stimulus, suggesting distinct units (Fig. 10D). 
Additionally, while small differences between the mean preferred phases of nearby units to LFP gamma oscillations were strongly associated with higher spike train synchrony (Fig. 10E), phase differences to the stimulus amplitude modulations were not correlated with spike train synchrony (Fig. 10F). In fact, unit pairs that exhibited high vector strengths to the stimulus temporal dynamics exhibited generally very low synchrony values (Fig. 10F). Altogether, these results suggest that the synchrony between units in the OT is more strongly linked to the gamma range LFP, which reflects inputs from the midbrain stimulus selection network, than to the stimulus temporal dynamics. This reveals spike train synchrony as a possible substrate for representing relative strength, in addition to spike rates, in the space map of the OT.
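To make the notion of power in the low gamma range (20–50 Hz) concrete, the sketch below estimates band power of a signal with a plain discrete Fourier transform. This is only one possible estimator under stated assumptions and is not the LFP pipeline of this study, which is not described in this section. The sampling rate, the synthetic test signal, and the function names are hypothetical.

```python
import cmath
import math


def band_power(samples, fs_hz, low_hz, high_hz):
    """One-sided DFT power of a mean-subtracted signal within [low_hz, high_hz]."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    power = 0.0
    for k in range(1, n // 2 + 1):
        freq_hz = k * fs_hz / n
        if not (low_hz <= freq_hz <= high_hz):
            continue
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        # Positive-frequency bins count twice (mirror bin), the Nyquist bin once.
        weight = 1.0 if (n % 2 == 0 and k == n // 2) else 2.0
        power += weight * abs(coeff) ** 2 / n ** 2
    return power


def variance(samples):
    """Total power of the mean-subtracted signal."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((x - mean) ** 2 for x in samples) / n


# Synthetic 1 s "LFP": a 30 Hz component (inside the band) plus a weaker 100 Hz one.
fs_hz = 1000.0
times = [i / fs_hz for i in range(1000)]
lfp = [math.sin(2 * math.pi * 30 * t) + 0.5 * math.sin(2 * math.pi * 100 * t)
       for t in times]

gamma_power = band_power(lfp, fs_hz, 20.0, 50.0)
gamma_fraction = gamma_power / variance(lfp)  # share of total power in 20-50 Hz
```

With recorded data, a windowed or averaged spectral estimate would usually be preferred over a single raw DFT. The plain transform is used here only to keep the example self-contained.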
Subpopulations of units in the OT are distinguishable by phase locking to sound amplitude modulations and onset latency
We further investigated the diversity of response types in the OT. The following figures and results reported below extend to amplitude-modulated stimuli with higher modulation frequencies and lower modulation frequency competitors (not shown due to space limitations).
Subpopulations of phase locking and nonphase locking units could be distinguished by their onset latency (Fig. 11A,B). Both unit types were audiovisual and exhibited latencies expected in the OT (short-latency median, 9.0 ms; long-latency median, 23.0 ms; separation of groups at 14 ms). However, short-latency units exhibited higher vector strengths to stimulus modulation (Fig. 11C; p = 1.01 × 10−18; Wilcoxon) and lower vector strengths to the gamma range LFP (Fig. 11D; p = 5.63 × 10−29; Wilcoxon) compared with long-latency units. The number of short- (n = 64, 31%) and long-latency (n = 140, 69%) units indicated that longer-latency, nonphase locking units were the larger group of OT neurons.
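The grouping by onset latency can be sketched as a simple threshold split at the 14 ms boundary reported above. The function name and the convention that a latency of exactly 14 ms counts as long latency are assumptions made for illustration. The latency lists below are placeholders that reproduce only the reported group sizes and medians, not the recorded distribution.

```python
def split_by_latency(latencies_ms, boundary_ms=14.0):
    """Split onset latencies into short (< boundary) and long (>= boundary) groups.

    Assigning a latency exactly at the boundary to the long group is an assumption.
    """
    short = [lat for lat in latencies_ms if lat < boundary_ms]
    long_ = [lat for lat in latencies_ms if lat >= boundary_ms]
    return short, long_


def percentage(part, total):
    """Rounded percentage of units in one group."""
    return round(100 * part / total)


# Placeholder latencies: 64 units at the short median, 140 at the long median.
latencies_ms = [9.0] * 64 + [23.0] * 140
short_units, long_units = split_by_latency(latencies_ms)

short_pct = percentage(len(short_units), len(latencies_ms))
long_pct = percentage(len(long_units), len(latencies_ms))
```

With 64 short-latency and 140 long-latency units out of 204, the rounded shares are 31% and 69%, which agrees with the counts given in the text.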
These subpopulations also responded differently to competition depending on the stimulus type (Figs. 12, 13). Long-latency units exhibited suppression of spike responses (gray line) as the competitor grew in intensity for both amplitude-modulated stimuli (Fig. 12A) and flat noise stimuli (Fig. 12B). While short-latency units exhibited comparable degrees of suppression for competing flat noise (Fig. 12B), suppression of spike response rates for competition with amplitude-modulated stimuli was significantly weaker than in long-latency units (Fig. 12A; unpaired t test between subpopulations).
We further analyzed how spike timing of the short- and long-latency subpopulations was impacted by competing amplitude-modulated stimuli by constructing phase difference histograms (Fig. 12C,D). Phase differences indicate whether the driver modulation phase is leading (positive values) or following the competitor (negative values) at each spike time. At phase difference zero, both modulations coincide. For short-latency units, there was a mild decrease in the number of spikes observed as the competitor increased in intensity, and spikes aggregated to specific phase differences (Fig. 12C). If binaural decorrelation were the only source of suppression, the lowest spike rates would be expected when driver and competitor are in phase (phase difference 0) and equally loud. However, the strongest suppression can be observed at negative phase differences of approximately −π/2, and the highest rates remain at a positive phase difference of approximately +π/2 for an equally loud competitor (Fig. 12C; relative level 0 dB), as well as for a louder competitor (Fig. 12C, right panel). Thus, binaural decorrelation is not the only source for suppressing spike responses of the OT units. Additionally, spike response rates of long-latency units, which show stronger suppression with competition, were independent of phase differences (Fig. 12D).
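A phase difference histogram of the kind described above can be sketched as follows. The sketch assumes sinusoidal envelopes that both start at phase zero at stimulus onset, and it uses hypothetical modulation frequencies (55 Hz driver, 75 Hz competitor), spike times, and bin count. It is an illustration of the sign convention (positive when the driver phase leads), not the analysis code of this study.

```python
import math


def wrap_to_pi(angle):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def phase_difference(t_s, driver_hz, competitor_hz):
    """Driver phase minus competitor phase at time t_s (positive: driver leads)."""
    driver_phase = 2 * math.pi * driver_hz * t_s
    competitor_phase = 2 * math.pi * competitor_hz * t_s
    return wrap_to_pi(driver_phase - competitor_phase)


def phase_difference_histogram(spike_times_s, driver_hz, competitor_hz, n_bins=4):
    """Count spikes per phase-difference bin; bins tile [-pi, pi) from left to right."""
    counts = [0] * n_bins
    width = 2 * math.pi / n_bins
    for t in spike_times_s:
        diff = phase_difference(t, driver_hz, competitor_hz)
        index = min(int((diff + math.pi) / width), n_bins - 1)
        counts[index] += 1
    return counts


driver_hz, competitor_hz = 55.0, 75.0      # hypothetical modulation frequencies
spike_times_s = [0.00625, 0.05625, 0.04375]  # hypothetical spike times (seconds)

counts = phase_difference_histogram(spike_times_s, driver_hz, competitor_hz)
lagging = phase_difference(0.0125, driver_hz, competitor_hz)  # driver follows
leading = phase_difference(0.0375, driver_hz, competitor_hz)  # driver leads
```

Because the two envelopes differ in modulation frequency, their phase difference cycles at the difference frequency (20 Hz in this example), so spikes across a trial sample all phase relationships between driver and competitor.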
We then explored whether competition-related changes in synchrony were different for the subpopulations (Fig. 13). Overall, synchrony was higher at all relative levels in long-latency units (gray) compared with short-latency units (pink, amplitude-modulated stimuli; blue, flat noise; Fig. 13A,B). Again, while changes in synchrony were comparable in short- and long-latency units to flat noise competition (Fig. 13D; unpaired t test between subpopulations), short-latency units exhibited smaller decreases in synchrony compared with long-latency units for amplitude-modulated competition (Fig. 13C; unpaired t test between subpopulations).
We also investigated whether the midbrain stimulus selection network may be modulating synchrony in these subpopulations differently. Synchrony of short-latency units is correlated with the VS to both gamma range LFP and stimulus modulations, but this correlation is limited by overall small range of synchrony values (Fig. 13E,F). On the other hand, long-latency units show larger synchrony values that are strongly correlated with VS to gamma range LFP, but independent of stimulus modulations (Fig. 13G,H).
Altogether, these results show that both subpopulations of units are impacted by competition. The differences we found suggest that only the short-latency units are particularly affected by temporal properties of the stimulus, while long-latency units are mostly sensitive to inputs from the midbrain stimulus selection network.
Discussion
This study examined a representation of relative strength between concurrent auditory stimuli across midbrain space maps and demonstrated an emerging role of spike train synchrony and brain oscillations in governing competition of multiple stimuli in the OT, which is the last midbrain region to exhibit a topographic map of space and whose population readout guides behavior and perception.
Impact of competition in the ICx and OT
Many principles concerning representation of concurrent sounds are shared between the ICx and OT. In both regions, spike response rates decreased with competition compared with when a single stimulus is presented, and both are impacted by binaural decorrelation and surround suppression (Figs. 1, 3, 4). Spatially tuned ICx and OT neurons phase lock to sounds that occur at their preferred location (Figs. 8, 9), as observed in other studies using frozen flat noise (Beckert et al., 2020) and further extended to amplitude-modulated noise in this study. However, temporal patterning of spikes in the ICx (Keller and Takahashi, 2005) and OT (Fig. 9) is influenced by the amplitude modulations of stimuli at nonpreferred locations when a competitor source's binaural level is equal to or greater than a driver source.
Possible mechanisms for an emerging coding scheme in the OT
The differences observed between the space maps in the ICx and OT may enable the OT to represent the relative strength of concurrent sounds more robustly. Firing rate suppression by a competitor sound is stronger in the OT than that in the ICx (Figs. 1, 3, 4), suggesting that global inhibition in the OT is further potentiating representation of the louder source by muting activity representing weaker stimulus locations and providing further inhibition beyond the suppression expected from binaural decorrelation. The difference in maximal suppression observed by a competing sound in the OT for flat noise is 16.8 percentage points greater than that in the ICx and represents the effects of global inhibition provided by the midbrain stimulus selection network.
Additionally, spiking synchrony of neurons tuned to the same azimuth is greater in the OT than that in the ICx (Figs. 5, 6). This may simply reflect a larger diversity of inputs that feed into the ICx compared with the OT, since the emergence of a space map in the ICx depends on the convergence of ITD information across frequencies (Albeck and Konishi, 1995), while the OT inherits direct point-to-point projections from the ICx (Knudsen and Knudsen, 1983). However, the decrease in spike train synchrony with auditory competition is observed only in the OT (Figs. 2, 3). The stability of synchrony across sound competition in the ICx suggests that nearby neurons in this region fire coherently because of similar stimulus-related inputs (Fig. 9A). Although synchrony based on random coincidence is expected to increase or decrease depending on higher or lower firing rates, reduced responses induced by competing sounds do not influence synchrony of nearby neurons in the ICx. Therefore, OT-specific changes in synchrony of nearby neurons appear to reflect relative strength of a stimulus at a preferred location in circumstances of concurrent sources of activity across the map.
Our data suggest that the OT integrates auditory spatial information from the ICx, with additional computations performed by the midbrain stimulus selection network. We show that changes in synchrony generalize across different types of auditory competitors, such as flat noise (Fig. 2) and amplitude-modulated noise (Fig. 3), and are independent from temporal properties of stimuli (Fig. 10). Within the midbrain's stimulus selection network, Ipc's bottlebrush projections span across several layers in the OT to provide bursts of periodic excitation that is reflected in the low gamma range of the LFP (Wang et al., 2006; Asadollahi et al., 2010; Bryant et al., 2015) and are hypothesized to have a role in synchronizing activity across these layers and modulating gain (Knudsen, 2018). Our analysis of LFPs in the OT indicates that changes in synchrony with competition are correlated to changes in gamma power (Fig. 10), consistent with this hypothesis (Wang et al., 2006; Asadollahi et al., 2010; Bryant et al., 2015).
While relative strength has been shown to be encoded in spike response rates (Mysore et al., 2010, 2011; Mysore and Knudsen, 2011), this study indicates that differential spike train synchrony in the OT may provide additional coding strength. More specifically, in the context of time-varying signals, such as sounds with complex envelopes, spike patterning of individual midbrain map neurons is prone to following the temporal dynamics of the strongest stimulus from nonpreferred locations (Fig. 9). However, the OT also represents relative strength through interneuronal synchrony in a distinct subpopulation of neurons that do not phase lock to stimulus envelopes (Figs. 11–13). These same neurons have longer response latencies, which is in line with the delay imposed by additional inputs from the midbrain stimulus selection network.
Implications for bottom-up and top-down salience
Simultaneous multiregion recordings have shown that synchronized activity between the OT and higher areas along the tectofugal pathway occurs for salient source locations (Marín et al., 2007, 2012) and is associated with behavioral correlates of perception (Netser et al., 2010). Representing relative strength through spiking synchrony may be functionally significant for bottom-up relay, especially when downstream regions do not exhibit topographic mapping of space (Reches and Gutfreund, 2009; Mysore and Knudsen, 2014; Beckert et al., 2017).
Because the OT also receives top-down forebrain projections, dual-region recordings between the OT and modulatory forebrain areas (Winkowski and Knudsen, 2006, 2007, 2008; Mysore and Knudsen, 2014) may provide insight into whether synchrony of midbrain responses is an important principle underlying both bottom-up and top-down salience determination. Although this study focused on bottom-up salience through manipulations of level, top-down input may define salience based on properties related to attention, memory, and behavioral relevance. We hypothesize that the OT's strategic connections within the midbrain stimulus selection network enable it to represent salience beyond what is encoded in the ICx. For example, while ICx neurons are capable of phase locking to amplitude modulations up to 200 Hz (Keller and Takahashi, 2000), the saliency map in the OT should not be biased toward whichever sound exhibits the fastest modulation, which would bear little ethological relevance. Similarly, while binaural level is an important cue, intensity alone should not determine salience, especially when relevant sounds are often quiet. In these circumstances, top-down modulation would be critical for defining salience unbound to the properties of the stimulus. Despite the limitations of this study, coding relative strength of concurrent stimuli through spike train synchrony may provide additional representations of salience and could be tested for other types of bottom-up and top-down salience.
Relevance to other species
The OT and the superior colliculus are evolutionarily conserved homologs that share functional commonalities as a saliency map in birds (Mysore and Knudsen, 2011; Knudsen, 2020) and mammals (Itti and Koch, 2001; Veale et al., 2017). Both structures exhibit a map of space and integrate across visual and auditory sensory modalities and top-down inputs to inform decision, and the strength of activation in a precise location guides attention and orienting behavior to that location. In primates, focal reversible inactivation in the superior colliculus leads to deficits in target selection for areas that correspond to those locations (McPeek and Keller, 2004). Additionally, these regions contain multiple layers where computations are performed by diverse cell populations with distinct input/output connections (Moschovakis et al., 1988; Sparks and Hartwich-Young, 1989; Wang et al., 2006; Gale and Murphy, 2014), and synchronization across layers may serve a functional role in grouping cell assemblies. In cats, synchrony between neurons is attributed to short bursts of spikes with millisecond precision (Pauluis et al., 2001), and the strength of synchrony is modulated as a function of their overlapping receptive fields (Brecht et al., 1999). Interestingly, this synchrony decreased when there were two independent visual stimuli as opposed to a single stimulus (Brecht et al., 1999). While competition was not directly investigated in these previous studies, their results are potentially consistent with the findings of this study and underscore the utility of synchronization in grouping cell assemblies. Additionally, there is a rich body of literature in the visual system supporting (Gray et al., 1989; Engel et al., 1991; Kreiter and Singer, 1996) and opposing (Lamme and Spekreijse, 1998; Shadlen and Movshon, 1999; Ray and Maunsell, 2011) the utility of synchrony in temporally binding assemblies of neurons that encode distinct features of the same perceptual object. 
In the auditory cortex, spike timing of neurons is modulated by the LFP which reflects visual information from V1 (Atilgan et al., 2018). When auditory and visual stimuli are temporally aligned, spike responses of A1 neurons follow and enhance the representation of the coherent auditory–visual stimuli (Atilgan et al., 2018). Gamma oscillations in the OT may play a role in binding coherent auditory and visual features of the same perceptual object.
In sum, this study demonstrates that the OT reinforces coding of relative strength among simultaneously active locations of the map by modulating synchrony between nearby neurons. This emergent coding scheme is correlated to input from the midbrain stimulus selection network as reflected in the gamma frequency range of the LFP. Modulating spike train synchrony may enable the OT to support a robust representation of salient sound locations that is less vulnerable to corruption from other sensory cues.
Footnotes
This work was funded by the National Institute on Deafness and Other Communication Disorders (R01DC007690 and F30 DC020109) and CRCNS-US-Israel R01NS135851 projects. We thank Brian Fischer and Yoram Gutfreund for their helpful comments on this manuscript and the current collaboration in funding sources.
*A.J.B. and R.F. contributed equally to this work.
The authors declare no competing financial interests.
Correspondence should be addressed to Andrea J. Bae at andrea.bae{at}einsteinmed.edu.