Abstract
Neural decoding is a tool for understanding how the activity of a population of neurons inside the brain relates to the outside world and for engineering applications such as brain–machine interfaces. However, neural decoding studies have mainly focused on different decoding algorithms rather than on different neuron types, which could use different coding strategies. In this study, we used two-photon calcium imaging to assess three auditory spatial decoders (space map, opponent channel, and population pattern) in excitatory and inhibitory neurons in the dorsal inferior colliculus of male and female mice. Our findings revealed a clustering of excitatory neurons that prefer similar interaural level differences (ILDs), the primary spatial cue in mice, while inhibitory neurons showed random local ILD organization. We found that inhibitory neurons displayed lower decoding variability under the opponent channel decoder, while excitatory neurons achieved higher decoding accuracy under the space map and population pattern decoders. Further analysis revealed that the inhibitory neurons' preference for ILDs off the midline and the excitatory neurons' heterogeneous ILD tuning account for their decoding differences. Additionally, we discovered sharper ILD tuning in the inhibitory neurons. Our computational model, linking this to an increased number of presynaptic inhibitory inputs, was corroborated using monaural and binaural stimuli. Overall, this study provides experimental and computational insight into how excitatory and inhibitory neurons uniquely contribute to the coding of sound location.
- auditory midbrain
- inferior colliculus
- neural decoding
- neuron types
- sound localization
- two-photon calcium imaging
Significance Statement
Over the decades, studies have proposed three sound source decoders: the space map decoder (topographically tuned to sound location), the opponent channel decoder (compares the averaged tuning between two groups of neurons), and the population pattern decoder (decodes locations by utilizing the diverse tunings across the population). This is the first study that (1) visualizes the local organization of spatial tuning and identifies clusters in a brain area that features an auditory spatial map, (2) tests the three decoders in a single brain area of the same species and discovers that distinct neuron types favor different decoders, and (3) reveals the differential spatial coding between excitatory and inhibitory neurons and elucidates this disparity through a computational model.
Introduction
Neural decoding aims to identify what sensory stimulus or motor output elicits a particular pattern of neural activity. From neural activity, one could decode visual motion, natural images, and faces (Jazayeri and Movshon, 2006; Kay et al., 2008; Chang and Tsao, 2017); animal vocalizations, phonetic features, spoken sentences, and language (Town et al., 2018; Anumanchipalli et al., 2019; Gwilliams et al., 2022; Liu and Wang, 2022; Tang et al., 2023); and arm, hand, handwriting, and tongue movements (Georgopoulos et al., 1986; Ofner et al., 2019; Willett et al., 2021; Laurence-Chasen et al., 2023); one could also control a robotic arm (Hochberg et al., 2006) or paralyzed muscles (Lorach et al., 2023). Numerous approaches have been proposed for decoding (Glaser et al., 2020), such as topographic maps (Kaas, 1997; Groh, 2014), population vectors (Georgopoulos et al., 1986), maximum likelihood estimation (MLE; Jazayeri and Movshon, 2006; Williams et al., 2023), linear regression (Chang and Tsao, 2017; Gwilliams et al., 2022), linear discriminant analysis (Ofner et al., 2019), and artificial neural networks (Anumanchipalli et al., 2019; Willett et al., 2021; Laurence-Chasen et al., 2023). However, only a few studies have examined different neuron types in decoding (Berry et al., 2019; Wang et al., 2020). Inhibitory neurons were either excluded (Liu and Wang, 2022) or shown to have decoding accuracy similar to that of excitatory neurons (Allen et al., 2017; Najafi et al., 2020).
Decoding sound location is a challenging task for the auditory system. Unlike the visual and somatosensory systems, where spatial information is topographically organized at the peripheral receptors and relayed to the brain, the auditory system maps sound frequency, not location, in the cochlea. Consequently, sound frequency is topographically organized at each station of the central auditory pathways (Fig. 1a, left). Although the auditory system must compute sound location internally rather than inheriting it from the sense organ, spatial maps have intriguingly been observed in both the superior colliculus (SC) and inferior colliculus (IC, Fig. 1a, right). The first auditory spatial map was discovered nearly half a century ago in the barn owl's mesencephalicus lateralis, pars dorsalis (MLd, a homolog of the mammalian IC; Knudsen and Konishi, 1978) and subsequently confirmed in the mammalian SC (see review by King, 2004) and IC (Aitkin et al., 1985; Binns et al., 1992, 1995; Schnupp and King, 1997). Additionally, spatial cues are also mapped onto the SC (Wise and Irvine, 1983, 1985; Hirsch et al., 1985; Ito et al., 2020) and IC (Aitkin et al., 1985; Wenstrup et al., 1985, 1986; Irvine and Gago, 1990). The fine-scale architecture of auditory space or spatial cue maps in the colliculi, however, remains unknown due to the limited spatial resolution of electrophysiological recordings (Fig. 1b).
Experimental paradigm and example tuning curves. a, Left panel, Tone frequency is topographically organized along the dorsal-to-ventral direction in the central inferior colliculus (IC; Schnupp et al., 2015, their Fig. 3) and along the medial-to-lateral direction in the cortical IC (Wong and Borst, 2019, their Fig. 6). Right panel, Cues for azimuth are topographically organized along the dorsal-to-ventral direction in the IC (Wenstrup et al., 1985, their Fig. 2, and Wenstrup et al., 1986, their Figs. 7, 8) and along the rostral-to-caudal direction in the cortical IC (Binns et al., 1992, 1995, and Schnupp and King, 1997, their Fig. 15), central IC (Irvine and Gago, 1990, their Fig. 13, and Aitkin et al., 1985), and deep layers of the superior colliculus (SC; Ito et al., 2020, their Fig. 2). b, Neighboring neurons may share (left) or not share (right) cue or azimuth preferences. Different shapes indicate different stimulus preferences. c, Left panel, Schematic of interaural level difference (ILD) stimuli in the awake mouse. Negative ILDs indicate that the sound level at the ipsilateral ear was higher than at the contralateral ear. Middle panel, The IC and SC are not covered by the cortex, and neurons in the superficial IC were imaged with a two-photon microscope. Right panel, Schematic of binaural stimuli. Pure tones were presented only to the left ear (ipsilateral), only to the right ear (contralateral), or to both ears simultaneously (diotic). d, Example field of view (FOV) from Vglut2-Cre (left) and VGAT-Cre (right) mice, in which only the excitatory or inhibitory neurons, respectively, expressed GCaMP6f. R, rostral; C, caudal; L, lateral; M, medial. Scale bar, 25 µm. e, Top panels, Normalized ΔF/F of one example excitatory and one example inhibitory neuron (colored circles in d). Bottom panels, ILD tuning curves of the same example neurons at nine different pure tone frequencies. Error bars represent the standard error of the mean. f, Top panels, Normalized ΔF/F of the same example neurons shown in e. Bottom panels, Corresponding diotic, contralateral, and ipsilateral tuning curves.
In mammalian colliculi, the auditory spatial map in the IC appears less organized than the multisensory audiovisual map in the SC (Schnupp and King, 1997). Notably, no spatial or spatial cue maps have been identified in the auditory cortex to date (see review by Middlebrooks, 2021). Given this context, how does the auditory system decode sound location based on a rough map or even in the absence of a map? Two main decoders have been proposed (van der Heijden et al., 2019). The opponent channel decoder averages responses from two channels of neurons, either from the opposite sides (contralateral and ipsilateral) or from the same side of the brain, that prefer either contralateral or ipsilateral sound locations and decodes sound locations based on the difference between the two channels’ averaged responses (McAlpine et al., 2001; Groh et al., 2003; Stecker et al., 2005; Lesica et al., 2010; Młynarski, 2015; Derey et al., 2016; Ortiz-Rios et al., 2017; Panniello et al., 2018). In contrast, the population pattern decoder predicts sound locations based on the tuning of each neuron (Miller and Recanzone, 2009; Day and Delgutte, 2013, 2016; Goodman et al., 2013; Belliveau et al., 2014; van der Heijden et al., 2018; Wood et al., 2019). However, the decoder performance for the excitatory and inhibitory neurons is still unknown in any brain area.
In this study, we began by using two-photon calcium imaging to probe the fine local organization of the primary spatial cue in mice, the interaural level difference (ILD), in the dorsal IC of awake animals. Notably, the dorsal IC is the only collicular region that both processes auditory spatial information and is accessible to optical imaging. Following this, we examined the spatial decoding strategies and encoding properties of both excitatory and inhibitory neurons. Subsequently, we constructed a model and varied the number and strength of presynaptic inhibitory inputs, aiming to explain the distinct encoding properties of inhibitory neurons. Lastly, we validated our computational model using monaural (contralateral and ipsilateral) and binaural (diotic) sound stimuli. To the best of our knowledge, this is the first study to compare multiple spatial decoding strategies—space map, opponent channel, and population pattern—in a single animal species or brain region, and it represents the first exploration of auditory spatial coding in inhibitory neurons within the colliculi or cortices.
Materials and Methods
Animals and virus injection
All experiments were approved by and conducted in conformity with the Tsinghua University Animal Care and Use Committee. For all experiments, 10 Vglut2-Cre (JAX: 016963) and 9 VGAT-Cre (JAX: 016962) adult (2–3 months old) mice of both sexes were used. The mice were group housed on a reversed 12 h light/dark cycle, and all awake experiments were performed during the dark period.
The mice were intraperitoneally anesthetized with 40 mg/kg pentobarbital. Three alternating washes of betadine and 70% (vol/vol) alcohol were applied to the head skin to prevent inflammation. The skin over the IC was cut with sterile scissors and forceps, after which the animals were mounted on a stereotaxic holder, and a region of the occipital bone ∼1 mm in diameter was thinned with a 0.5 mm drill bit. The center of the thinned skull was ∼0.5 mm from the midline. When the boundary of the IC could be clearly identified relative to the transverse sinus, cerebellum, and superior colliculus, we stopped drilling and used a 26-gauge needle to peel off the thinned skull at the desired location for virus injection. AAV2/1.Syn.Flex.GCaMP6f (UPenn) was diluted 1:1–5 in saline. We injected the virus only into the superficial layer of the IC cortex, using a microsyringe pump (Micro4, WPI) and a corresponding glass pipette. To penetrate the dura without significantly pushing down the brain, we pulled, cut, and ground the pipettes to a 30 µm inner diameter and a 45° inclination angle. After the virus injection, we sealed the incised skin with cyanoacrylate glue.
The virus expression area barely exceeded ∼500 × 500 µm, and its centroid was roughly 0.5 mm from the midline. To map the global organization of ILD uniformly, we tried two strains of GCaMP transgenic mice: Thy1-GCaMP3 (JAX: 017893) and Ai93 (JAX: 024103). Ai93 is Cre and tTA dependent; therefore, three mouse lines were needed to express the GCaMP6f indicator. We successfully bred five mice that expressed GCaMP3 in excitatory neurons (Thy1 promoter) and three mice that expressed GCaMP6f in excitatory (Vglut2-Cre) and inhibitory (VGAT-Cre) neurons. However, these mice showed a high level of GCaMP expression and strong fluorescence changes only in the cerebral cortex, with almost no expression in the IC and SC. Thus, we could only map the local instead of the global organization of cell-type–specific ILD and frequency tunings.
Headcap and window implantation
We implanted the headcap and window around 10 d after the virus injection surgery. The mouse was anesthetized, and eye ointment, betadine, and 70% alcohol were applied as previously described. A large area of skin covering the dorsal cortex and cerebellum was removed, and the exposed periosteum was air dried and removed by low-speed drilling. The exposed skull was covered with a thin layer of cyanoacrylate glue and dental cement. One tungsten head bar was fixed ∼2 mm rostral to the lambda point, and the other was fixed above the inclined slope of the cerebellum. We thinned and removed the occipital bone over the IC, namely, the medial 2/3 of the exposed IC. The dura above the IC and the caudal part of the SC was removed with a 25G needle. The dura around the transverse sinus was detached from the IC surface, and there was a large amount of cerebrospinal fluid below it; thus, this was the safest place to penetrate. After rupturing the dura, the remaining dura was easy to peel off with forceps. A small piece of cover glass was selected from fragments of a standard 24 × 40 mm cover glass (thickness, 0.15 mm). Finally, we applied a thin layer of cyanoacrylate glue to the area surrounding the cranial window. Mice recovered from anesthesia quickly and did not show any abnormal behavior.
Closed-field sound stimuli
We generated all acoustic stimuli with custom software (LabVIEW, version 8.6) that controlled a data acquisition card (NI PCIe-6321; analog output sampling rate, 900 kS/s; resolution, 16 bits). The generated signals were routed through a BNC desktop mount terminal block (BNC-2110) and fed to a speaker driver (ED1, Tucker-Davis Technologies). Two straight silicone tubes (1/16″ ID, 1/8″ OD, 1/32″ wall thickness; 5233K51, McMaster-Carr), 7 cm in length, were coupled to two electrostatic speakers (EC1, Tucker-Davis Technologies) along the interaural axis. We placed the silicone tubes at the entrance of the mouse's ears to deliver dichotic sound stimuli. The silicone tubes were glued to the head bar with cyanoacrylate glue to prevent them from falling off. We calibrated the tube-coupled speakers with a custom-made coupler, a 1/4″ prepolarized free-field microphone (40BE, GRAS), an amplifier (2610, Brüel & Kjær), and calibration software and processor (SigCalRP and RZ6, Tucker-Davis Technologies).
For the spatial sound stimuli, nine logarithmically spaced pure tones ranging from 3 to 48 kHz (0.5-octave steps) were presented at seven different ILDs in random order. We used the average binaural level (ABL) paradigm, i.e., the sound level changed in both ears, but the average level was held constant. ABLs of 55, 65, and 75 dB sound pressure level (SPL) were used, and most FOVs were tested at 65 dB SPL ABL. The sound level at the right/contralateral ear increased from 50 to 80 dB in 5 dB steps, while the sound level at the left/ipsilateral ear decreased from 80 to 50 dB in 5 dB steps. Thus, the ILD ranged from −30 to 30 dB in 10 dB steps. The sound duration was 50 ms. Each stimulus was repeated 10 times with a 1 s interstimulus interval (ISI). Each session had a fixed sound level and lasted 630 s (9 tones × 7 ILDs × 10 repeats × 1 s ISI). In addition to pure tone stimuli, broadband noise (3–48 kHz) was also used for the neural decoding analysis. The sound duration was again 50 ms, and the ABL was 65 dB. Each such session required only 70 s (1 noise × 7 ILDs × 10 repeats × 1 s ISI).
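As a concrete illustration of the ABL paradigm, the following sketch (hypothetical variable names; the actual stimuli were generated in LabVIEW) computes the per-ear sound levels for each ILD at a fixed ABL:

```matlab
% Per-ear levels under the ABL paradigm: the average level across the two
% ears stays fixed while the contralateral level rises and the ipsilateral
% level falls.
ABL  = 65;                    % average binaural level (dB SPL)
ILDs = -30:10:30;             % contralateral minus ipsilateral level (dB)
contraLevel = ABL + ILDs/2;   % 50:5:80 dB SPL at the right/contralateral ear
ipsiLevel   = ABL - ILDs/2;   % 80:-5:50 dB SPL at the left/ipsilateral ear
disp([ILDs; contraLevel; ipsiLevel]);
```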
For the binaural sound stimuli, 20 logarithmically spaced pure tones ranging from 3 to 48 kHz (0.21-octave steps) were presented to the contralateral ear, the ipsilateral ear, or both ears diotically (same SPL and frequency). The sound duration was 50 ms (5 ms onset and offset ramps), and the sound intensity was 30, 50, or 70 dB SPL. Each stimulus was repeated 10 times with a 1 s interstimulus interval (ISI). Each session had a fixed sound level and lasted 600 s (20 tones × 3 modes × 10 repeats × 1 s ISI).
Two-photon calcium imaging and acoustic noise
Images were acquired with a custom-built two-photon microscope controlled by the open-source ScanImage software (version 3.8). Excitation light at 920 nm for GCaMP6f imaging from a mode-locked Ti:Sapphire laser (Mai Tai eHP) was scanned by paired galvanometers (6215H, 3 mm silver-coated mirrors) and guided through a water immersion objective (XLPL25XWMP, 25×, 1.05 NA). The laser power was controlled with a combination of a half-wave plate and a Glan-Laser polarizer, which was frequently calibrated with a laser power meter (PM100D). Since the imaging depth was <100 µm, the laser power was typically below 30 mW. Emitted fluorescence was collected through a laser block filter, a dichroic mirror (FF665-Di02), and a bandpass filter (FF01-527/70) onto a photomultiplier tube (R9880). These detection parts, along with the preamplifiers, were enclosed within a commercial multiphoton detection module (2PIMS-2000-40-20). Data acquisition and mirror scanning were controlled by an NI PCI-6110 and a BNC-2090A. The mouse was head-fixed and placed on a 15 × 10 cm (diameter × width) cylindrical treadmill, which was modified from a wire-grid exercise wheel for pets. The treadmill was connected to its central rod by two smooth and silent ball bearings. Mouse locomotion was detected with a rotation encoder on the central rod of the treadmill.
There were three potential sources of background noise: the femtosecond laser, the scanners, and the treadmill. Ambient laser noise was kept low by keeping the laser's power supply and cooling unit in a separate room, enclosed within a sound-attenuation chamber. Compared with high-speed resonant scanners, which generate 8 kHz noise, the low-speed galvanometer scanners we used only generated low-frequency noise (Issa et al., 2014). When the mouse was running, no noise could be heard in the experimental room thanks to our specially designed treadmill. We used our speaker calibration microphone to measure the background noise when the laser was turned on, the largest FOV was applied (scanners become noisier as the driving voltage increases), and the mouse was running. Owing to its internal thermal noise, this 1/4″ microphone only works for sound levels above 30 dB SPL. We compared the noise level in this noisiest situation with the quietest situation, in which the laser and scanners were turned off and the mouse was removed. We found that the noise levels in the two situations were similar and could not be distinguished from the baseline noise level. Furthermore, we placed this free-field microphone above the mouse's ears rather than within the ear canal, which should be quieter during the experiment. The lowest sound level we used for both the binaural and spatial experiments was 30 dB SPL; therefore, background noise of no more than 30 dB SPL under free-field measurement was unlikely to affect our findings.
Image acquisition was triggered by the rising edge of the acoustic stimuli. The acquisition speed was 5 Hz, and the resolution was 256 × 200 pixels. The largest field of view (FOV) of the microscope was 320 × 320 µm. Although more neurons could be collected with a larger FOV, the number of pixels assigned to each neuron, and therefore the signal-to-noise ratio, was lower than with a smaller FOV. In this study, the FOV size was adjusted dynamically depending on the number of available neurons and the amount of blank area. We used a low-magnification air objective (PLN10X, 10×, 0.25 NA) to determine the position of the FOV relative to the midline, transverse sinus, and sigmoid sinus. We then moved the IC region with strong fluorescence to the center of the FOV. We used a 1 ml syringe to add warm saline before changing to the 25× objective. When the objective touched the small droplet, strong fluorescence and many neurons with clear morphology could be observed through the eyepieces.
Image data processing
For the calcium imaging data, lateral (x–y) motion induced by locomotion was corrected with TurboReg, an ImageJ (1.48) plugin, by registering the stack (rigid body mode) to frames without any motion. Elliptical regions of interest (ROIs) were drawn manually in MATLAB (MathWorks), and the fluorescence signals of somas were extracted using custom software modified from Feinberg and Meister (2015). ROIs were selected by visually inspecting the image stack based on neural morphology and intensity changes, and ROIs with filled nuclei were excluded. Only ROIs significantly driven by sound or tuned to specific stimuli were used for later data analysis. We also extracted the neuropil signals, which could potentially bias the auditory tunings of somas. The true fluorescence signal was f = R − r × n, where R was the raw fluorescence signal, n was the contamination signal (from a 10 µm ring around the soma), and r was the contamination factor. To determine the value of r, we identified horizontal blood vessels (i.e., where f = 0) and recorded the raw signal R and contamination signal n, so that r was equal to the ratio of R to n, which ranged between 0.5 and 0.7 across FOVs. The r values in the Vglut2-Cre and VGAT-Cre mice were similar. The baseline fluorescence f0 was estimated using the iterative procedure described by Issa et al. (2014): we estimated the mean and standard deviation of each ROI's trace, removed any data points that deviated by more than 1.5 standard deviations, and repeated this procedure until no further deviant points were found. The ROI traces were normalized by f0, namely, (f − f0)/f0. Lastly, a nonnegative deconvolution method (Vogelstein et al., 2010) was used to estimate spike events (arbitrary units).
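A minimal MATLAB sketch of the neuropil correction and baseline estimation described above (hypothetical variable names; the published analysis used the custom software cited in the text):

```matlab
% R: raw somatic fluorescence trace; n: neuropil trace from a 10 um ring
% around the soma; r: contamination factor (0.5-0.7 across FOVs).
r = 0.6;                           % assumed value within the reported range
f = R - r .* n;                    % neuropil-corrected fluorescence

% Iterative baseline estimate (after Issa et al., 2014): discard points more
% than 1.5 SD from the mean until no further outliers remain.
base = f;
while true
    keep = abs(base - mean(base)) <= 1.5 * std(base);
    if all(keep), break; end
    base = base(keep);
end
f0  = mean(base);                  % baseline fluorescence
dff = (f - f0) ./ f0;              % normalized trace, (f - f0)/f0
```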
Experimental design and statistical analysis
Our previous studies showed that the response properties of dorsal IC neurons to spectral and temporal sound stimuli were highly dependent on the locomotion of the animals. In this study, we also acquired locomotion in most sessions and pupil video in some sessions. We did not compare the ILD tunings under different behavioral or arousal states, partly because each stimulus had only 10 repeats instead of the 30 repeats used in previous studies (Chen and Song, 2019). In addition, our FOVs spanned various depths, but each FOV usually had a different horizontal position. Therefore, we did not examine whether neurons at different depths have different coding properties.
The image acquisition rate used in this study was five frames per second, and the y-axis resolution was 200 pixels, which corresponds to 200 ms per image and 1 ms per horizontal line. Thus, considering the delays caused by line-by-line scanning and the calcium indicator (>50 ms), sound-induced responses appeared in both the first and second frames after stimulus onset. Spike events of the first and second frames were therefore averaged as the sound-evoked response, whereas spike events of the third to fifth frames were averaged as the spontaneous response.
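In code, this frame averaging amounts to the following sketch (hypothetical variable names):

```matlab
% events: deconvolved spike events for one neuron, [nTrials x 5 frames],
% acquired at 5 Hz (200 ms per frame).
evoked      = mean(events(:, 1:2), 2);   % frames 1-2: sound-evoked response
spontaneous = mean(events(:, 3:5), 2);   % frames 3-5: spontaneous response
```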
In Figure 2, for the cluster analysis, ROIs were included if their ILD or frequency responses to the contralateral, diotic, or ipsilateral stimuli were significantly different from each other (one-way ANOVA, α = 0.01). For the ILD stimuli using pure tones, we chose the tone frequency that evoked the largest responses [i.e., the best frequency (BF)]; therefore, only 70 out of 630 stimuli were used. FOVs with <10 significant ROIs were excluded. In Figure 3, for the two decoders, ROIs were included if their ILD responses to the broadband noise or at the best frequency were significantly different from each other (one-way ANOVA, α = 0.01). This resulted in 268 excitatory and 211 inhibitory neurons at the best frequency at 65 dB ABL. In Figure 4, for the encoding analysis, we tested two α values (0.01 and 0.05) and obtained similar results. We only show the results with the larger α value, which yielded 460 excitatory and 312 inhibitory neurons at the best frequency at 65 dB ABL. In Figure 5, for comparing ILD and frequency selectivity, ROIs were included if their ILD and frequency responses (not the 20 frequencies in the binaural stimuli) were both significantly different from each other (one-way ANOVA, α = 0.05). This resulted in 307 excitatory and 194 inhibitory neurons at 65 dB ABL. In Figure 6, for the binaural gain analysis, our interest was in comparing the response amplitudes evoked by the contralateral, diotic, and ipsilateral stimuli rather than their frequency tunings. Therefore, ROIs were included if their sound-evoked responses were significantly different from the spontaneous responses (unpaired t test, α = 0.01), and the averaged sound-evoked responses had to be larger than the averaged spontaneous responses. ROIs with nonsignificant responses relative to the spontaneous responses, or that were inhibited instead of driven by the sound stimuli, were not included in this analysis. This resulted in 381 excitatory and 171 inhibitory neurons at 70 dB SPL. FOVs with <10 significant ROIs were excluded from the cluster analysis. The gains of diotic to contralateral and diotic to ipsilateral responses were calculated as the ratios of the corresponding maximum responses. The aural dominance index (ADI) was calculated as the maximum response of the contralateral tuning curve minus the maximum response of the ipsilateral tuning curve, divided by their sum (Xiong et al., 2013). The cluster analysis of binaural gain was the same as the ILD and frequency analysis and is detailed in the next section.
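A minimal sketch of the binaural gain and ADI calculations (hypothetical variable names):

```matlab
% dioTuning, contraTuning, ipsiTuning: mean responses of one neuron to the
% 20 tones under the diotic, contralateral, and ipsilateral conditions.
maxDio    = max(dioTuning);
maxContra = max(contraTuning);
maxIpsi   = max(ipsiTuning);

gainDioContra = maxDio / maxContra;                  % diotic-to-contralateral gain
gainDioIpsi   = maxDio / maxIpsi;                    % diotic-to-ipsilateral gain
ADI = (maxContra - maxIpsi) / (maxContra + maxIpsi); % aural dominance index
```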
Neighboring excitatory neurons have similar ILD preferences. a, Weighted ILD in one example field of view (FOV) for the excitatory (left, 256 × 256 µm) and inhibitory (right, 160 × 160 µm) neurons. The orientation of the FOVs is the same as in Figure 1d. b, Pairwise tuning correlation (y-axis) versus distance (x-axis) in the same example FOVs for excitatory (left) and inhibitory (right) neurons. The lines indicate the best linear fit to the data. c, Top panels, Best linear fits of tuning correlation versus distance in all FOVs for the excitatory (left) and inhibitory (right) neurons. Gray lines indicate FOVs in which pairwise tuning correlation and distance were not significantly correlated (α = 0.05). Bottom panels, Best linear fits of best ILD difference versus distance in all FOVs. d, Similar to c but for tone frequency instead of ILD. e, Proportion of FOVs with clusters at different pairwise distances for the excitatory (left) and inhibitory (right) neurons. For example, when the paired neuron distance is ≤125 µm, clustered and random ILD tunings of excitatory neurons were found in 11 and 7 FOVs, respectively (proportion, 61%). In contrast, ILD tunings of inhibitory neurons were randomly distributed in all nine FOVs (proportion, 0%). Note that FOVs with <10 significantly tuned neurons were excluded (see Materials and Methods). f, Similar to e but for tone frequency instead of ILD.
Three decoders have different performances for the responses of excitatory and inhibitory neurons. a–c, Space map decoder for one example FOV (the same as Fig. 2a) and accuracies for all FOVs. a, CoM (colored dots) of the 70 population vectors (seven ILDs with 10 repetitions) and mean CoM (colored diamonds) averaged over the 10 repetitions. b, Decoded ILDs (colored circles) encircling the ground truth ILDs (colored dots). In this FOV, the decoding performance is 30%. c, Gardner–Altman plot for the two-sample (excitatory vs inhibitory) effect size. FOVs with fewer than five ILD-tuned neurons are excluded. The p-value is 0.0019, the effect size is 6.26%, and the 95% confidence interval ranges from 2.44 to 10.07%. d, Schematic of the opponent channel decoder. The population of neurons was divided into two groups: preferring ipsilateral sound (blue circles) and preferring contralateral sound (red circles). In this example, five neurons are chosen randomly. During each trial, at least one neuron (black-filled dot) was chosen from each group or channel. For example, in the third trial, the responses of two neurons that preferred ipsilateral ILDs and three neurons that preferred contralateral ILDs were averaged separately (thick blue and red curves, respectively). The tuning curve difference (thick brown curve, contralateral minus ipsilateral responses) was used to predict the most likely ILD. e, Performance of the opponent channel decoder versus the number of neurons. Only neurons that were significantly tuned to the ILD stimuli (ANOVA, p < 0.01; orange, 268 excitatory neurons; green, 211 inhibitory neurons) were included. From the single-repetition responses of multiple neurons to the same stimulus, we calculated the probability that each stimulus could have evoked those responses using Bayes' rule (see Materials and Methods). If the stimulus with the highest probability was the real stimulus, then this stimulus was predicted correctly. The y-axis shows the percentage of correctly predicted stimuli (0/7, 1/7, 2/7, …, 7/7; the dashed line is the 1/7 chance level). For each population size, 200 different combinations were chosen from the neural population; the thick colored lines and error bars indicate the mean and standard error of performance. f, Schematic of the population pattern decoder. Compared with the opponent channel decoder, it is not required to choose at least one neuron from each of the two channels. Thus, the neural population was not grouped, and every neuron had the same probability of being chosen. Five responses from five different neurons (two thin blue and three thin red curves) were used to predict the most likely ILD. g, Performance of the population pattern decoder versus the number of neurons.
Excitatory neurons have heterogeneous ILD tuning and inhibitory neurons have reliable and selective ILD tuning. a, ILD tuning curves for significantly tuned (ANOVA, p < 0.05) excitatory (left) and inhibitory (right) neurons. Each row shows the ILD tuning curve of one neuron. All tuning curves were normalized to the same maximum and minimum for plotting. Neurons were sorted by best ILD. b, Left panel, Peak-normalized ILD tuning curve of an example neuron (thin lines, individual repetitions; thick line, average across 10 repetitions; gray patch, response area). The responses at the seven ILDs are 0.0000, 0.0368, 0.2387, 0.5725, 0.8091, 1.0000, and 0.8716; they sum to 3.5287, which corresponds to a 30 dB response area. Right panel, Fitted ILD tuning curve of an example neuron (dots, averaged responses at each ILD; solid curve, fitted ILD curve; dashed line, half-peak threshold). The fitted curve has a peak response slightly below 1 (0.88 in this example). The ILD half-peak value is the x-axis value at which the fitted curve crosses the half-peak threshold. c, Top, Tuning reliability. The median is 0.21 for excitatory (orange line) and 0.26 for inhibitory neurons (green line). The effect size is −0.061, and the 95% confidence interval is −0.088 to −0.034. Middle, Response area. The median is 26.5 dB for excitatory neurons and 23.7 dB for inhibitory neurons. The effect size is 0.054, and the 95% confidence interval is 0.039 to 0.068. Bottom, ILD half-peak value. The median is 11.78 dB for excitatory and 15.51 dB for inhibitory neurons. The effect size is −4.133, and the 95% confidence interval is −6.169 to −2.097. Because only neurons with a peak at 30 dB ILD were included in the bottom panel, the number of neurons is smaller than in the top and middle panels.
A computational model could explain the positively correlated tone frequency and ILD selectivity in inhibitory neurons by an increase in the number of presynaptic inhibitory inputs. a, Top panel, Normalized ILD (left) and tone frequency (bottom) tuning curves and combined ILD–tone frequency tuning (top right) of an example excitatory neuron. Bottom panel, An example inhibitory neuron. Note that the tone frequency response area (# repeats × 9 frequencies) was calculated from responses averaged across the seven ILDs, and the ILD response area (# repeats × 7 ILDs) was calculated from responses averaged across the nine tone frequencies. b, Pairwise ILD response area (y-axis) versus tone frequency response area (x-axis) of excitatory (left) and inhibitory (right) neurons. c, Pairwise ILD half-peak value (y-axis) versus tone frequency response area (x-axis). d, Top panel, Stronger inhibition could be due to stronger presynaptic inhibition strength (thick line) or a larger number of presynaptic inhibitory neurons. Bottom panel, The goodness-of-fit (R-squared) between ILD selectivity and tone frequency selectivity could be modulated by the number of their presynaptic neurons. When the ILD and tone frequency responses both have five presynaptic inhibitory neurons (jitter, 0%), the R-squared between ILD and tone frequency selectivity should be highest. Thus, zero jitter indicates that the postsynaptic neurons exhibiting ILD and tone frequency responses either have the same number of presynaptic inhibitory neurons or the same strength of presynaptic inhibition. e, Example ILD tuning curves with the best ILD at the center and at a far contralateral location. Individual dots represent neural responses at each ILD value. f, Stronger presynaptic inhibition strength (blue dots) and a larger number of presynaptic inhibitory neurons (red dots) both reduce neural responses. g, Top panel, ILD half-peak width decreases gradually with an increasing number of presynaptic inhibitory neurons but not with the strength of presynaptic inhibition. Bottom panel, ILD half-peak value increases gradually (shifts toward larger ILD values) with an increasing number of presynaptic inhibitory neurons. h, Top panel, p-values and correlation coefficients (r) from linear regression of ILD selectivity against tone frequency selectivity across six different jitter ranges (0%, 20%, …, 100%). Bottom panel, ILD half-peak value against tone frequency selectivity.
Inhibitory neurons have more homogeneous and stronger binaural inhibition. a, Diotic, contralateral, and ipsilateral tuning curves for one example neuron. The maximum activity of each tuning curve was used to compute three metrics: diotic to contralateral gain, diotic to ipsilateral gain, and aural dominance index (ADI). b, Three metrics in one example field of view (FOV) for the excitatory (top, 291 × 291 µm) and inhibitory (bottom, 291 × 291 µm) neurons. The orientation of the FOVs is the same as in Figure 1d. c, Proportion of FOVs with clusters at different pairwise distances for the three metrics (top, middle, and bottom). The dashed lines indicate median values for the excitatory (orange line) and inhibitory (green line) neurons. d, Top, Diotic to contralateral gain. The median is 1.07 for excitatory (orange line) and 0.89 for inhibitory neurons (green line). The effect size is 0.048, and the 95% confidence interval is 0.023 to 0.073. Middle, Diotic to ipsilateral gain. The median is 2.81 for excitatory and 1.96 for inhibitory neurons. The effect size is 0.091, and the 95% confidence interval is 0.046 to 0.136. Bottom, ADI. The median is 0.40 for excitatory and 0.39 for inhibitory neurons. The effect size is 0.018, and the 95% confidence interval is −0.036 to 0.072.
We used MATLAB 2022a for statistical analyses. We used the “meanEffectSize” function (default parameters, α = 0.05, NumBootstraps: 1000) to compute the effect size between excitatory and inhibitory neurons. We used the “gardnerAltmanPlot” function (Effect=“mediandiff”) to plot the effect size.
Spatial organization of ILD and frequency encodings
We tested whether neurons located close to each other had more similar ILD or frequency preferences than neurons located farther apart. To quantify the similarity of ILD or frequency tuning between two neurons, we used two metrics. One metric was the correlation coefficient between their mean responses at the seven ILDs or 20 frequencies (−1 to 1), and the other was the difference between their best ILDs (0–60 dB) or best frequencies (0–4 octaves). The weighted ILD, also known as the center of gravity (CoG) of the ILD tuning curve, was used only in Figure 2a; it is the sum of the seven ILD values multiplied by their corresponding responses, divided by the summed responses (Panniello et al., 2018). The Euclidean distance between paired neurons was calculated from the centroids of the two ROIs.
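A minimal sketch of these pairwise metrics (hypothetical variable names):

```matlab
% meanResp: [nNeurons x 7] mean responses at the seven ILDs;
% xy: [nNeurons x 2] ROI centroids in um; i, j: indices of one neuron pair.
ilds = -30:10:30;
weightedILD = (meanResp * ilds') ./ sum(meanResp, 2);   % center of gravity per neuron

tuningCorr = corr(meanResp(i,:)', meanResp(j,:)');      % metric 1: -1 to 1
[~, bi] = max(meanResp(i,:));
[~, bj] = max(meanResp(j,:));
bestILDdiff = abs(ilds(bi) - ilds(bj));                 % metric 2: 0 to 60 dB
pairDist = norm(xy(i,:) - xy(j,:));                     % Euclidean distance (um)
```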
To quantify the organization of tuning similarity, we fitted the pairwise ILD or frequency tuning correlation coefficients against the pairwise Euclidean distance and obtained the significance, the square root of the explained variance, and the slope of the line of best fit. To quantify the local organization of tuning similarity, we used a bootstrapping method to test whether neurons closer than a specific distance were more similar than neurons farther apart than that distance (Panniello et al., 2018). We used the correlation coefficients to quantify similarity and chose nine equally spaced distance boundaries (25, 50, …, 200, 225 µm). We calculated the ILD or frequency similarity for all pairs of neurons closer than each of the nine boundary distances. We then calculated the ILD or frequency similarity for the same number of randomly chosen pairs of neurons located farther apart than each boundary distance (e.g., >50 µm). The mean ILD or frequency similarity from each of 1,000 such randomly chosen samples of neuronal pairs served as our bootstrapped estimate of the similarity of distant neurons. If the average ILD or frequency similarity measure for the local pairs of neurons was below the fifth percentile (α = 0.05) of the bootstrapped values for distant pairs, then that FOV had significant spatial clustering of ILD or tone frequency.
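A minimal sketch of this bootstrap test for one FOV and one distance boundary (hypothetical variable names; for illustration the pairwise best ILD difference serves as the similarity measure, so smaller local values indicate clustering; the same logic applies to one minus the tuning correlation):

```matlab
% pairDist, pairDiff: vectors over all neuron pairs in the FOV, holding the
% pairwise distance (um) and the pairwise best ILD difference (dB).
boundary  = 125;                                  % distance boundary (um)
localIdx  = pairDist <= boundary;
localMean = mean(pairDiff(localIdx));             % similarity of nearby pairs
nLocal    = nnz(localIdx);

farDiff   = pairDiff(pairDist > boundary);
bootMeans = zeros(1000, 1);
for b = 1:1000                                    % bootstrap the distant pairs
    idx = randi(numel(farDiff), nLocal, 1);
    bootMeans(b) = mean(farDiff(idx));
end
isClustered = localMean < prctile(bootMeans, 5);  % alpha = 0.05, one-sided
```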
Three ILD decoders
In this study, we tested three main sound location decoders: space map, opponent channel (two-channel), and population pattern (distributed or labeled line). As mentioned earlier, there were responses to seven ILDs, each with 10 repetitions.
We implemented our space map decoder following a previous study that decoded the location of visual stimuli from neural responses (also calcium imaging) in the optic tectum (a homolog of the mammalian SC) of zebrafish (Avitan et al., 2016). This space map decoder includes five steps: (1) build a population vector from all neurons' XY locations and responses to one ILD within one field of view (FOV); (2) calculate the response-weighted center of mass (CoM) of each population vector; (3) divide the set of population vectors into training and test sets using leave-one-out cross-validation; (4) average the CoMs of each ILD in the training set; and (5) classify each test vector as the ILD whose averaged CoM is closest to the test vector's CoM.
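A minimal MATLAB sketch of these five steps for one FOV (hypothetical variable names; not the original implementation):

```matlab
% resp: [nNeurons x 7 ILDs x 10 repeats] responses; xy: [nNeurons x 2] ROI
% centroids (um). Each of the 70 population vectors is reduced to its
% response-weighted center of mass (CoM) and classified by leave-one-out.
[nN, nILD, nRep] = size(resp);
com = zeros(nILD, nRep, 2);
for s = 1:nILD
    for k = 1:nRep
        w = resp(:, s, k);                               % population vector
        com(s, k, :) = reshape((w' * xy) / sum(w), 1, 1, 2);  % weighted CoM
    end
end

nCorrect = 0;
for s = 1:nILD
    for k = 1:nRep                                       % held-out test vector
        meanCoM = zeros(nILD, 2);
        for t = 1:nILD                                   % per-ILD CoM templates
            reps = 1:nRep;
            if t == s, reps(k) = []; end                 % leave the test vector out
            meanCoM(t, :) = squeeze(mean(com(t, reps, :), 2))';
        end
        d = vecnorm(meanCoM - squeeze(com(s, k, :))', 2, 2);
        [~, decoded] = min(d);                           % closest mean CoM wins
        nCorrect = nCorrect + (decoded == s);
    end
end
accuracy = nCorrect / (nILD * nRep);                     % fraction decoded correctly
```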
Unlike the space map decoder, which requires distance information but does not need repetition-based responses, the opponent channel and population pattern decoders need diverse tunings estimated from each repetition. Each neuron had seven ILDs, and each ILD had 10 repetitions. We randomly selected one repetition (or five repetitions) and estimated the mean and standard deviation of the same neuron's response to the same stimulus from the remaining nine repetitions (or five repetitions). We assumed the distribution of responses was Gaussian and truncated the distribution at zero (Belliveau et al., 2014). During each decoding trial (at least 200 trials in total), a fixed number of neurons was randomly selected from the population. The number of selected neurons was increased gradually until it reached the size of the neural population.
For the opponent channel decoder, we (1) summed the responses of the three ipsilateral ILDs and the three contralateral ILDs for each neuron; (2) assigned each neuron to the ipsilateral or contralateral channel based on the difference between the two summed responses; (3) selected at least one neuron from the ipsilateral channel and at least one from the contralateral channel; (4) averaged the responses of the single or multiple neurons in each channel to obtain the mean response; and (5) decoded the stimulus (s) with the largest probability (p). For a single neuron, the probability p that a response r was evoked by an ILD stimulus s was given by Bayes' rule as follows:

p(s | r) = p(r | s) p(s) / Σ_{s′} p(r | s′) p(s′),

where p(r | s) is the truncated Gaussian likelihood estimated from the training repetitions and p(s) = 1/7 is the uniform prior over the seven ILDs.
For the population pattern decoder, the excitatory and inhibitory neurons were not divided into two channels, so occasionally the selected neurons might all prefer the contralateral stimuli. It was a simplified version of the opponent channel decoder: for N selected neurons with single-repetition responses r1, …, rN, the decoded stimulus was the one maximizing p(s | r1, …, rN) ∝ p(s) Π_i p(r_i | s), assuming independent responses across neurons.
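A minimal sketch of the Bayesian step shared by the two decoders, shown here for the population pattern decoder on one decoding trial (hypothetical variable names); the opponent channel decoder applies the same rule to the channel-averaged responses:

```matlab
% mu, sigma: [nNeurons x 7] mean and SD of each neuron's response to each ILD,
% estimated from the training repetitions; testResp: single-repetition
% responses of the selected neurons to the unknown test ILD; sel: indices of
% the randomly selected neurons.
logLik = zeros(1, 7);
for s = 1:7
    for i = sel(:)'                          % responses treated as independent
        m  = mu(i, s);
        sd = max(sigma(i, s), eps);
        % Gaussian likelihood truncated at zero (responses cannot be negative)
        p = normpdf(testResp(i), m, sd) / max(1 - normcdf(0, m, sd), eps);
        logLik(s) = logLik(s) + log(max(p, realmin));
    end
end
[~, decoded] = max(logLik);                  % flat prior p(s) = 1/7
```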
Computational model of tuning selectivity
We used a model neuron to explain the presynaptic mechanism of ILD and tone frequency selectivity and reproduce the positively correlated ILD and tone frequency selectivity in the inhibitory neurons. Codes are available at https://github.com/ccg1988/IC_ILD_Journal_of_Neuroscience_2024.
The membrane potential of the model neuron was simulated with a conductance-based leaky integrate-and-fire formulation driven by excitatory and inhibitory synaptic conductances. The temporal delay (2 ms) between excitatory and inhibitory inputs was fixed in our models. The membrane time constant τ was 5 ms, and conductance noise was included in the synaptic inputs. To modulate the number of inhibitory inputs, the inhibitory and excitatory inputs had the same strength, so the inhibitory-to-excitatory ratio r was equal to one, and the number of excitatory inputs was held constant while the number of inhibitory inputs was varied.
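The sketch below is a generic conductance-based LIF neuron with the stated 2 ms inhibitory delay and 5 ms membrane time constant; all other parameter values and the noise model are illustrative assumptions, not the study's implementation (which is available in the repository above):

```matlab
% Generic conductance-based leaky integrate-and-fire neuron (illustrative
% parameter values). Inhibition lags excitation by 2 ms; tau = 5 ms.
dt = 0.1; T = 0:dt:200;                    % time axis (ms)
tau = 5; Vrest = -65; Vth = -50; Vreset = -65; Ee = 0; Ei = -70;

gE = 0.05 * max(0, 1 + 0.5*randn(size(T)));          % noisy excitatory conductance
gI = 0.05 * max(0, 1 + 0.5*randn(size(T)));          % noisy inhibitory conductance
gI = [zeros(1, round(2/dt)), gI(1:end-round(2/dt))]; % 2 ms inhibitory delay

V = Vrest; nSpikes = 0;
for t = 2:numel(T)
    dV = (-(V - Vrest) + gE(t)*(Ee - V) + gI(t)*(Ei - V)) / tau;
    V  = V + dt * dV;                      % forward Euler integration
    if V >= Vth                            % threshold crossing -> spike
        nSpikes = nSpikes + 1;
        V = Vreset;
    end
end
```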
Results
Two-photon calcium imaging of spatial cues and binaural integration in the dorsal IC of awake mice
Sound source localization is based on the interaural time difference (ITD), the interaural level difference (ILD), and direction-dependent filtering by the trunk, head, and pinnae (spectral cues). Due to the mouse's small head size and lack of low-frequency hearing, ILD is the primary cue for localizing sounds in the horizontal plane, i.e., azimuthal localization (see reviews by Grothe et al., 2010, and Yin et al., 2019; but see Day et al., 2012; Ono et al., 2020; Ito et al., 2020). We presented pure tones or broadband noise with an ILD of −30, −20, or −10 dB favoring the ipsilateral ear, 0 dB (equal between the two ears), or 10, 20, or 30 dB favoring the contralateral ear (Fig. 1c, left). We injected the AAV-Flex-GCaMP6f virus into the nonlemniscal dorsal IC of Vglut2-Cre and VGAT-Cre transgenic mice to target the excitatory and inhibitory neurons, respectively (Fig. 1d). We imaged excitatory and inhibitory neurons within the superficial layer (depth, <100 µm) in awake, head-fixed mice that ran freely on a treadmill. Figure 1e shows the fluorescence traces (top) and ILD tunings (bottom) of one example excitatory (left) and one example inhibitory (right) neuron. We presented each stimulus 10 times at 55, 65, or 75 dB average binaural level (ABL). Since 55 and 75 dB ABL were not extensively tested, we only show the results using the 65 dB ABL stimuli. We found that pure tone stimuli evoked stronger and more reliable responses than broadband noise stimuli. Furthermore, most previous studies measured each neuron's ILD tuning at its best frequency (BF; Park et al., 2004; Orton et al., 2016; Benichoux et al., 2017; but see Panniello et al., 2018). Therefore, we quantified each neuron's ILD preference using its responses at BF. The example excitatory neuron has a BF of 4.24 kHz and a best ILD of −10 dB, favoring an ipsilateral sound location near the center. The example inhibitory neuron has a BF of 6 kHz and a best ILD of 30 dB, favoring a far contralateral sound location.
We also presented 20 different pure tones only to the left ear (ipsilateral), only to the right ear (contralateral), or to both ears simultaneously (diotic; Fig. 1c, right). Figure 1f shows the fluorescence traces (top) and binaural tunings (bottom) of the same example excitatory (left) and inhibitory (right) neurons. We presented each stimulus 10 times at 30, 50, or 70 dB sound pressure level (SPL). We only show the results at 70 dB SPL in this work because 30 and 50 dB SPL were not extensively tested and over 80% of the neurons were monotonic when tested with all three SPLs. The example excitatory neuron has almost the same frequency tuning curve in the three conditions. Its stronger responses to diotic than contralateral stimuli and its clear responses to ipsilateral stimuli are consistent with its ILD tuning (larger responses when the ILD is close to 0 dB). The example inhibitory neuron has more strongly suppressed responses to diotic than contralateral stimuli and almost negligible responses to ipsilateral stimuli, which is also consistent with its ILD tuning (smaller responses when the ILD was <10 dB).
In the following five sections, we present the (1) encoding strategies based on pairwise physical distance in each field of view (FOV; Fig. 2); (2) decoding strategies based on space map, opponent channel, and population pattern decoders (Fig. 3); (3) encoding properties (Fig. 4); (4) correlation of ILD against tone frequency selectivity and computational model (Fig. 5); and (5) binaural integration and its relationship with pairwise distance of neurons (Fig. 6).
Neighboring excitatory neurons have similar ILD tuning
We first leveraged the cellular resolution ability of two-photon calcium imaging and examined whether neighboring neurons have similar ILD preferences. Figure 2a shows the weighted ILDs in two example FOVs for the excitatory (left) and inhibitory (right) neurons. The excitatory neurons that prefer similar ILDs (green dots) were in general far away from neurons that prefer the other ILDs (blue dots). In contrast, the inhibitory neurons that prefer far ipsilateral ILD (violet dot) were close to two inhibitory neurons that prefer far contralateral ILDs (yellow dots). To quantify the similarity of ILD preference between two neurons, we used two metrics. One metric was based on the correlation coefficients between their averaged responses at seven ILDs, with 1 indicating fully correlated and −1 indicating fully anticorrelated. The other metric was based on the difference between the best ILD, with 0 dB indicating the same best ILDs and 60 dB indicating −30 and 30 dB ILDs. The pairwise ILD tuning correlation coefficients were negatively and significantly correlated with pairwise distance in the same example FOV of excitatory neurons (Fig. 2b, left). The inhibitory neurons’ ILD tuning correlation coefficients were not affected by the pairwise distance (Fig. 2b, right). For the examined FOVs, we observed negatively and significantly correlated pairwise tuning correlation coefficients versus distance in 7/19 FOVs for excitatory neurons but none for the inhibitory neurons (Fig. 2c, top). We also observed positively and significantly correlated best ILD difference versus distance in 7/18 FOVs for excitatory neurons but none for the inhibitory neurons (Fig. 2c, bottom). Together, our two metrics both showed that the neighboring excitatory neurons have more similar ILD preferences than the excitatory neurons that were far away.
Previous two-photon calcium imaging studies found that nearby neurons, whether excitatory, inhibitory, or unclassified, all have heterogeneous but in general more similar BFs than neurons that are far apart (Ito et al., 2014; Barnstedt et al., 2015; Wong and Borst, 2019; Ibrahim et al., 2023). If our two metrics indeed reflect the tuning similarity between two neurons, we should expect to observe a negative correlation for the first metric and a positive correlation for the second metric between the nine tone frequencies (instead of the seven ILDs) and distance, and these correlations should exist in both excitatory and inhibitory neurons. We indeed observed both trends in our data (Fig. 2d), suggesting that neighboring excitatory neurons have more similar ILD and tone frequency preferences than excitatory neurons that are far apart, whereas neighboring inhibitory neurons are only similar in tone frequency but not in ILD.
Although our previous analysis compared the tuning preferences of neurons that were nearby versus far apart, the boundary between "nearby" and "far apart" had not been defined. Therefore, we used a bootstrapping method to test whether neurons closer than a specific distance were more similar to each other than neurons farther apart than that distance. We used the first metric (tuning correlation coefficients) to quantify similarity and chose nine equally spaced distance boundaries (25–225 µm). We measured the proportion of FOVs with clusters (i.e., the number of FOVs with clusters divided by the total number of FOVs), where a cluster indicates more similar tuning preferences for pairwise distances inside the boundary than outside it. We found that excitatory neurons formed clusters when the pairwise distance was larger than 25 µm, with the highest proportion of clustered FOVs at a pairwise distance of 125 µm (Fig. 2e, left, 62% of FOVs). In contrast, we only observed a very low proportion of clustered FOVs for the inhibitory neurons, and only at larger distances (Fig. 2e, right). Compared with ILD clusters, over 60% of excitatory and 40% of inhibitory FOVs showed tone frequency clusters across all distance boundaries (Fig. 2f). Since a global tone frequency organization (i.e., tonotopy) exists in both excitatory and inhibitory neurons, it is expected that nearby excitatory and inhibitory neurons are always more similar in frequency tuning than neurons that are far apart.
In summary, our data reveal spatial clustering of excitatory, but not inhibitory, neurons that encode similar ILDs.
Three decoders have different performances for the responses of excitatory and inhibitory neurons
We wondered whether spatial information contained in the population neural responses contributes to the neural decoding of ILD. To the best of our knowledge, no previous imaging studies have leveraged spatial information for decoding ILD (Panniello et al., 2018) or sound locations (Derey et al., 2016; van der Heijden et al., 2018). In our space map decoder, we built population vectors from all neurons’ responses to each stimulus in the same FOV. We then used a leave-one-out cross-validation approach, dividing the set of population vectors into training and test sets. We calculated the center of mass (CoM) of each population vector and averaged the CoM of each ILD in the training sets (Fig. 3a). To test the decoder for a given population vector, we calculated its CoM and classified it according to the ILD having the closest average CoM (Fig. 3b). The accuracy of the space map decoder is higher in the excitatory than inhibitory neurons (26% vs 21%, Fig. 3c). This is consistent with the spatially clustered ILD tunings observed exclusively in excitatory neurons.
We next examined the other two decoding strategies, which are independent of pairwise distance: opponent channel and population pattern (Fig. 3d–g). In the opponent channel decoder, we divided the neurons into two channels that preferred either ipsilateral or contralateral ILD stimuli (Fig. 3d). During each decoding trial, we randomly chose at least one neuron from each channel. The number of successfully decoded stimuli divided by the total number of stimuli (i.e., seven ILDs) was the percentage of correctly decoded stimuli for a single trial, with a chance level of 14% (1/7). For each fixed number of neurons, we ran two hundred decoding trials. Consistent with previous studies (Belliveau et al., 2014; Wood et al., 2019), we found that the percentage of correctly decoded stimuli (i.e., accuracy) increased monotonically with increasing neuron number (Fig. 3e). We found that the inhibitory neurons had higher decoding accuracy (73% vs 60%) and lower decoding variability (7% vs 11% SD) than the excitatory neurons, suggesting that inhibitory neurons outperform excitatory neurons in the opponent channel decoder. In the population pattern decoder, there were no restrictions when choosing neurons from the population (Fig. 3f). Surprisingly, excitatory neurons outperformed the inhibitory neurons, with higher decoding accuracy (Fig. 3g, 86% vs 70%) and lower decoding variability (10% vs 17% SD).
A comparison of the two decoders revealed that the inhibitory neurons have a much lower decoding variability under the opponent channel decoder than the population pattern decoder (7% vs 17% SD). On the other hand, the excitatory neurons have a higher decoding accuracy under the population pattern decoder than the opponent channel decoder (86% vs 60%). Together, findings from excitatory neurons, which represent ∼80% of neurons in the IC (Beebe et al., 2016; Silveira et al., 2020), were consistent with previous studies that showed the best decoding performance in the population pattern decoder (Day and Delgutte, 2013; Goodman et al., 2013; Belliveau et al., 2014; Wood et al., 2019). However, the population pattern decoder was not the optimal decoder for the inhibitory neurons.
In summary, our data suggest that sound location could be better decoded from the excitatory neurons using either space map decoder or population pattern decoder and from inhibitory neurons using opponent channel decoder.
Excitatory neurons have heterogeneous ILD tuning and inhibitory neurons have reliable and selective ILD tuning
We sought to understand why excitatory and inhibitory neurons support a hybrid spatial decoding strategy. To this end, we compared their ILD encoding properties. Figure 4a shows the tuning curves of excitatory and inhibitory neurons across FOVs. More excitatory neurons were tuned to ILDs near the center (i.e., 0 dB ILD), whereas more inhibitory neurons preferred ILDs far from the center. For example, 38% of excitatory neurons but only 18% of inhibitory neurons preferred ILDs between −10 and 10 dB, whereas 56% of inhibitory neurons but only 38% of excitatory neurons preferred 30 dB ILD. Therefore, the higher proportion of inhibitory neurons preferring off-center ILDs could reduce ambiguity when selecting neurons from the ipsilateral and contralateral channels separately, which would in turn reduce their decoding variability under the opponent channel decoder. On the other hand, the larger number of excitatory neurons preferring ILDs near the center provides more distributed and holistic information about the ILD stimuli when neurons are chosen freely, which may increase decoding accuracy under the population pattern decoder. This is consistent with previous studies showing that heterogeneous ILD tunings contribute to the superior decoding accuracy of the population pattern decoder (Day and Delgutte, 2013; Belliveau et al., 2014).
In addition to the best ILD, we also measured two factors that contribute to sound decoding performance: tuning reliability and tuning selectivity. Tuning reliability was the average correlation coefficient among all pairs of single-repetition ILD tuning curves (45 pairs from 10 repetitions; Fig. 4b, left, gray curves). Tuning reliability close to 1 indicates a highly reliable ILD response. We used the response area under the peak-normalized ILD tuning curve (Fig. 4b, left) and the ILD half-peak value (Fig. 4b, right) to quantify tuning selectivity.
A response area close to 60 dB indicates low ILD selectivity. For an ILD tuning curve peaking at 30 dB, a larger ILD half-peak value (close to 25 dB) indicates higher ILD selectivity. Across all significantly tuned neurons, the inhibitory neurons had significantly higher tuning reliability, smaller response areas, and larger ILD half-peak values than the excitatory neurons (Fig. 4c, rank sum test). Reliable ILD tuning keeps a randomly chosen repetition's tuning closer to the neuron's averaged tuning, thus reducing decoding variability. Sharper ILD tuning makes the decoded ILD from a randomly chosen repetition more distinct from the remaining six ILDs, thus increasing decoding accuracy. These two properties may contribute to the superior decoding performance of inhibitory neurons in the opponent channel decoder.
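A minimal sketch of these two metrics for one neuron (hypothetical variable names; the response-area scaling shown is one plausible mapping consistent with the example in Fig. 4b):

```matlab
% repResp: [10 repeats x 7 ILDs] single-repetition ILD tuning curves.
C = corr(repResp');                          % correlations between repetitions
reliability = mean(C(triu(true(10), 1)));    % mean over the 45 unique pairs

meanTuning = mean(repResp, 1);               % trial-averaged tuning curve
normTuning = (meanTuning - min(meanTuning)) / range(meanTuning);  % peak-normalized
responseArea = sum(normTuning) / numel(normTuning) * 60;  % dB; 60 dB = unselective
```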
In summary, analysis of ILD encoding properties suggests that the heterogeneous ILD tunings contribute to the higher accuracy of excitatory neurons in the population pattern decoder. In contrast, the higher tuning reliability and selectivity contribute to the smaller variability and higher accuracy of inhibitory neurons in the opponent channel decoder.
A computational model could explain the positively correlated tone frequency and ILD selectivity in inhibitory neurons by an increase in the number of their presynaptic inhibitory inputs
The high ILD tuning selectivity of dorsal IC inhibitory neurons reminded us of a previous study that also showed that inhibitory neurons have slightly higher tone frequency selectivity (Chen and Song, 2019). We wondered whether the same inhibitory neurons that were selective for ILD were also selective for tone frequency. If yes, then it implies that the inhibitory neurons may share similar presynaptic inputs for processing ILD and tone frequency. Figure 5a shows the normalized ILD tuning, tone frequency tuning, and combined tunings of example excitatory (top) and inhibitory (bottom) neurons.
Similar to the ILD response area, here we used the tone frequency response area to measure tone frequency selectivity. Across all neurons significantly tuned to both ILD and tone frequency, ILD and tone frequency tuning selectivity were more strongly correlated in the inhibitory neurons than in the excitatory neurons (Fig. 5b). This also held true for neurons with a best ILD of 30 dB, for which ILD tuning selectivity was measured with the ILD half-peak value (Fig. 5c).
Our results suggest that the presynaptic inputs of inhibitory neurons for encoding ILD and tone frequency are similar. Two related questions follow from this hypothesis: is it the strength or the number of inputs that is similar (Fig. 5d, top)? And what extent of input similarity could explain the experimental findings (Fig. 5d, bottom)? To answer these questions, we used a simple leaky integrate-and-fire (LIF) neuron model and modulated the strength or the number of inhibitory inputs from 10% (weak inhibition) to 100% (strong inhibition). Figure 5e shows the tunings of two model neurons that preferred 0 and 30 dB ILD, generated by feeding Gaussian- and inverse log-logistic-distributed synaptic inputs, respectively, to an LIF neuron.
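The sketch below illustrates this type of simulation with a current-based LIF neuron: ILD-tuned Poisson excitation (a Gaussian profile for the 0 dB-preferring neuron and a monotonic stand-in for the inverse log-logistic profile of the 30 dB-preferring neuron) is opposed by inhibition whose strength or count is scaled from 10% to 100%. All parameter values and input profiles are illustrative and not those fitted in the study.

```python
import numpy as np

def lif_spike_count(exc_rate, inh_rate, n_inh, w_inh, n_exc=100, w_exc=0.5,
                    t_sim=0.5, dt=1e-4, seed=0):
    """Current-based LIF neuron driven by Poisson excitation and inhibition.
    Rates are per presynaptic input (Hz); weights are mV per input spike."""
    rng = np.random.default_rng(seed)
    tau, v_rest, v_thresh, v_reset = 0.02, -65.0, -50.0, -65.0
    v, spikes = v_rest, 0
    for _ in range(int(t_sim / dt)):
        e_in = rng.poisson(n_exc * exc_rate * dt) * w_exc   # excitatory drive (mV)
        i_in = rng.poisson(n_inh * inh_rate * dt) * w_inh   # inhibitory drive (mV)
        v += dt / tau * (v_rest - v) + e_in - i_in
        if v >= v_thresh:
            spikes, v = spikes + 1, v_reset
    return spikes

def ild_tuning(exc_drive, inh_drive, n_inh, w_inh):
    return np.array([lif_spike_count(e, i, n_inh, w_inh)
                     for e, i in zip(exc_drive, inh_drive)])

ilds = np.arange(-30, 31, 10)
# excitatory input rates across ILD for a 0 dB-preferring neuron (Gaussian) and a
# 30 dB-preferring neuron (monotonic stand-in for the inverse log-logistic shape)
exc_center = 20.0 * np.exp(-(ilds ** 2) / (2 * 15.0 ** 2))
exc_lateral = 20.0 / (1.0 + np.exp(-(ilds - 5.0) / 8.0))
# inhibitory input rates assumed strongest near the midline
inh_drive = 20.0 * np.exp(-(ilds ** 2) / (2 * 10.0 ** 2))

for frac in (0.1, 0.5, 1.0):  # 10% (weak) to 100% (strong) inhibition
    for label, exc in (("0 dB", exc_center), ("30 dB", exc_lateral)):
        by_strength = ild_tuning(exc, inh_drive, n_inh=100, w_inh=0.5 * frac)
        by_number = ild_tuning(exc, inh_drive, n_inh=int(100 * frac), w_inh=0.5)
        print(f"{label}-preferring, {frac:.0%} inhibition:",
              "strength-scaled", by_strength, "| number-scaled", by_number)
```

Tuning selectivity (e.g., a response area computed on the resulting spike-count tuning curves) can then be compared between the strength-scaled and number-scaled conditions.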
Increasing the strength or the number of inhibitory inputs reduced the neural activities gradually and similarly (Fig. 5f). However, our model revealed that it was the increase in the number of presynaptic inhibitory neurons, rather than in presynaptic inhibition strength, that contributed to the changes in tuning selectivity (Fig. 5g). This was consistent with previous rabies tracing studies showing that inhibitory neurons have many more retrogradely labeled presynaptic inhibitory neurons in the contralateral IC than do excitatory neurons (Chen et al., 2018).
Our modeling and previous anatomical studies both suggested that inhibitory neurons receive a larger number of relatively weaker inhibitory inputs rather than a smaller number of relatively stronger inhibitory inputs. We therefore tested how jittering the number of presynaptic inhibitory input neurons between the two tunings affects the correlation between ILD and tone frequency tuning selectivity (Fig. 5d, bottom). Our model showed that increasing the jitter in the number of presynaptic input neurons between the two tunings gradually reduced their correlation (Fig. 5h). We found that jitters of 60% and 70% of the presynaptic input neurons could explain the experimental results shown in Figure 5b and c.
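A toy version of this jitter analysis is sketched below, under the assumption (ours, for illustration only) that a model neuron’s selectivity along each stimulus dimension scales with the number of engaged presynaptic inhibitory inputs and that the stated jitter fraction of those inputs is drawn independently for ILD versus tone frequency.

```python
import numpy as np

def selectivity_correlation(jitter, n_neurons=200, n_inputs=100, seed=0):
    """Correlation between ILD and tone frequency selectivity when a fraction
    `jitter` of each model neuron's presynaptic inhibitory inputs is drawn
    independently for the two stimulus dimensions (illustrative toy model)."""
    rng = np.random.default_rng(seed)
    ild_inputs = rng.integers(10, n_inputs, size=n_neurons)   # inputs shaping ILD tuning
    shared = ild_inputs * (1.0 - jitter)                      # inputs reused for frequency
    independent = rng.integers(10, n_inputs, size=n_neurons) * jitter
    freq_inputs = shared + independent
    # selectivity is assumed to increase with the number of engaged inhibitory inputs
    ild_sel = ild_inputs + rng.normal(0.0, 5.0, n_neurons)
    freq_sel = freq_inputs + rng.normal(0.0, 5.0, n_neurons)
    return np.corrcoef(ild_sel, freq_sel)[0, 1]

for j in (0.0, 0.3, 0.6, 0.7, 1.0):
    print(f"jitter {j:.0%}: r = {selectivity_correlation(j):.2f}")
```

In this toy model, the correlation falls smoothly as the jitter grows, which matches the qualitative behavior described for Figure 5h.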
In summary, our models indicate that a larger, and largely shared, number of presynaptic inhibitory neurons contributes to the narrower and correlated ILD and tone frequency tunings observed in the inhibitory neurons.
Inhibitory neurons have a more clustered binaural integration effect and stronger binaural inhibition
Our models suggest that inhibitory neurons received inhibition from a large number of inhibitory neurons (Fig. 5), and our experimental data showed that inhibitory neurons preferred ILDs off the center (Fig. 4). Together, this evidence suggests that inhibitory neurons received the strongest inhibition when the ILDs were close to the center, i.e., when contralateral and ipsilateral sound levels were similar. To measure the sign of binaural integration, we presented contralateral-only, ipsilateral-only, and diotic (same tone at both ears) stimuli (Fig. 6a). We quantified binaural integration with two gain values: the ratio of diotic to contralateral responses and the ratio of diotic to ipsilateral responses. Gain values larger than 1 indicated that the diotic response was stronger than the contralateral or ipsilateral response. Furthermore, we quantified the monaural response strength using the aural dominance index (ADI; Xiong et al., 2013); an ADI close to 1 indicated near-zero responses to the ipsilateral stimuli. For the example neuron, the diotic and contralateral responses were similar and both were much stronger than the ipsilateral response; thus, the gain (dio/con) was close to 1, whereas the gain (dio/ipsi) was larger than 1.
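For concreteness, the two gain values and the ADI can be computed from trial-averaged responses as sketched below; the ADI formula shown follows the common (contra − ipsi)/(contra + ipsi) convention, which is consistent with the description that values near 1 indicate negligible ipsilateral responses, but the exact definition should be taken from Xiong et al. (2013).

```python
def binaural_metrics(r_contra, r_ipsi, r_diotic):
    """Gains and aural dominance index (ADI) from one neuron's mean responses
    to contralateral-only, ipsilateral-only, and diotic stimuli."""
    gain_dio_con = r_diotic / r_contra                  # >1: diotic exceeds contralateral
    gain_dio_ipsi = r_diotic / r_ipsi                   # >1: diotic exceeds ipsilateral
    adi = (r_contra - r_ipsi) / (r_contra + r_ipsi)     # assumed convention; ~1 means ipsi ~ 0
    return gain_dio_con, gain_dio_ipsi, adi

# a neuron of the kind described in the text: diotic ~ contralateral >> ipsilateral
print(binaural_metrics(r_contra=1.0, r_ipsi=0.1, r_diotic=1.05))
# -> gain(dio/con) ~ 1, gain(dio/ipsi) >> 1, ADI ~ 0.82
```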
Figure 6b shows the two gain values and the ADI in one example FOV of excitatory (top) and inhibitory (bottom) neurons. In contrast to the ILDs shown in Figure 2a, we found that neighboring inhibitory neurons had more similar gain values than neighboring excitatory neurons. Using the same bootstrapping method (Fig. 2e,f), we compared, between excitatory and inhibitory neurons, the proportion of FOVs containing clusters at different distance boundaries (Fig. 6c).
We found that inhibitory neurons were more likely than excitatory neurons to form clusters with similar gain values, regardless of the cluster distance threshold. This difference in clustering tendency held true only for the gains but not for the ADI, suggesting that only the strength of binaural integration was more clustered in the inhibitory neurons. We also found that the excitatory neurons had stronger binaural integration than the inhibitory neurons (Fig. 6d). The median gain (dio/con) of the inhibitory neurons was <1, indicating that a simultaneously presented ipsilateral stimulus suppressed the response relative to the contralateral-only stimulus (Fig. 6d, top).
Discussion
Our research revealed that when decoding the spatial cue—interaural level difference (ILD)—inhibitory neurons leverage the opponent channel decoder which has a lower variability, while excitatory neurons utilize the spatially organized ILD clusters and population pattern decoder which has a higher accuracy. This hybrid decoding strategy primarily stems from the contrasting ILD encoding properties—lateralized in inhibitory neurons and heterogeneous in excitatory ones. Furthermore, inhibitory neurons exhibited a positive correlation between ILD and tone frequency selectivity, an observation we modeled as indicative of a proportional increase in presynaptic inhibitory neurons. This model gained support from the more clustered and potent binaural inhibition received by inhibitory neurons. Taken together, these findings underscore that inhibitory and excitatory neurons in the dorsal IC perform distinct yet complementary roles in the encoding and decoding of sound location.
Until this study, the local organization of ILD in the colliculi had not been explored, although a previous study employed two-photon imaging to investigate the local organization of ILD in the auditory cortex (Panniello et al., 2018). Using the same bootstrapping method to identify clusters, we found that approximately 40% of the fields of view in the IC showed clustering of ILD preference among excitatory neurons, whereas no clusters were identified in the auditory cortex. This contrast reflects their global organization of ILD: a rough ILD map is present in the colliculi but absent in the cortex. Additionally, using the same bootstrapping method, IC excitatory neurons formed approximately twice as many clusters for tone frequency as for ILD, reflecting the better-organized map of tone frequency compared with ILD.
Clusters of neurons with similar stimulus preferences are a common characteristic of the colliculi. For example, excitatory neurons that share tuning sharpness and sweep sensitivity cluster in the central IC (also called “microdomain”; Ono et al., 2017; Ito, 2020), and there are periodicity clusters rather than a gradient map in the central IC (Schnupp et al., 2015). Other examples include multisensory patches in the nonlemniscal IC (Lesicko et al., 2016) and clusters in the SC formed by neurons that prefer the same orientation and movement direction (Ahmadlou and Heimel, 2015; Feinberg and Meister, 2015; de Malmazet et al., 2018; Li and Meister, 2023). While our study unveiled the local organization of ILD in the dorsal IC of awake, head-fixed mice, it does bear several limitations. Firstly, we relied on spatial cues, specifically ILD, as opposed to spatial location. Our primary rationale for utilizing ILD through sealed tubes was to alleviate the effect of the acoustic noise and reflection generated by two-photon microscopy. Future work could potentially circumvent this issue by employing virtual auditory space (Ito et al., 2020) or silent two-photon microscopy (Song et al., 2022) to map spatial locations. Secondly, despite numerous studies having identified a coarse spatial or spatial cue map via electrophysiological methods, we still lack a map that offers both large-scale coverage and cellular resolution. Future work employing GCaMP6 transgenic mice could potentially address this gap (Wong and Borst, 2019).
A topographic map is a fundamental organization of the sensory and motor systems across species (Kaas, 1997; Groh, 2014). Although it may seem straightforward, whether topographically organized sensory encoding contributes to sensory decoding remains unclear. If decoding accuracy is the metric, then the space map decoder is suboptimal. In both the SC and IC, although the space map decoder performs above chance level, it is still far worse than the other two decoders (Avitan et al., 2016, and Fig. 3c). However, it is unknown whether the method we implemented (minimal distance to the trial-averaged center of mass) is actually utilized by the colliculi. Furthermore, the current evaluation of decoding accuracy is biased against the space map decoder because it can only use neurons from a single FOV, whereas the other two decoders use many more neurons from multiple FOVs. If decoding speed is the metric, then the space map decoder is optimal. It decodes the stimulus in each trial simply by finding the center of mass of the population activity and comparing it against the fixed anchor of each stimulus. In contrast, the other two decoders rely on calculating the stimulus preferences and tuning curves of the selected neurons.
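For reference, a minimal sketch of such a center-of-mass readout (minimal distance to the trial-averaged center of mass) is shown below; the response weighting and other implementation details are assumptions for illustration and may differ from the decoder used in this study.

```python
import numpy as np

def space_map_decode(train_resp, test_resp, xy):
    """Decode each held-out ILD from the center of mass (CoM) of population activity.

    train_resp : (n_trials, n_ilds, n_neurons) responses used to build the anchors
    test_resp  : (n_ilds, n_neurons) responses of one held-out repetition
    xy         : (n_neurons, 2) neuron positions within the field of view (um)
    """
    def com(resp):                                    # response-weighted center of mass
        w = np.clip(resp, 0.0, None)
        return (w[:, None] * xy).sum(0) / w.sum()

    # fixed anchor per stimulus: the trial-averaged CoM from the training repetitions
    anchors = np.array([com(train_resp[:, i, :].mean(0))
                        for i in range(train_resp.shape[1])])
    decoded = []
    for i in range(test_resp.shape[0]):
        dist = np.linalg.norm(anchors - com(test_resp[i]), axis=1)
        decoded.append(int(np.argmin(dist)))          # index of the nearest anchor
    return np.array(decoded)
```

Because the anchors are precomputed, a single trial is decoded with one center-of-mass computation and a handful of distance comparisons, which is the sense in which this decoder is fast.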
In our study, we found that excitatory neurons in the dorsal IC encode ILD with heterogeneous tuning curves and that the population pattern decoder has superior decoding accuracy. This enhanced accuracy allows animals to distinguish between sound locations in close proximity, a common occurrence in complex acoustic environments. Comparable encoding and decoding properties have also been observed in neurons in the auditory cortex (Belliveau et al., 2014; Wood et al., 2019). Given that top-down projections primarily target excitatory neurons (Nakamoto et al., 2013; Chen et al., 2018; Oberle et al., 2023), we posit that the auditory cortex may directly drive the ILD response of excitatory neurons in the superficial layer of the dorsal IC. This hypothesis gains support from previous studies demonstrating that the auditory cortex modulates spatial processing in the IC (Zhou and Jen, 2005; Nakamoto et al., 2008; Bajo et al., 2010) and that the sensitivity of the IC neural population to ILD is drastically changed by the corticofugal pathway (Nakamoto et al., 2008). Furthermore, analogous to cortical neurons, the spectral and temporal coding of excitatory, but not inhibitory, neurons is also strongly influenced by brain state (Chen and Song, 2019). While our hypothesis seems to conflict with anatomical studies indicating that the dorsal IC predominantly receives bottom-up rather than top-down inputs (Chen et al., 2018), as well as two-photon imaging studies demonstrating large response variability in the cortical projection boutons within the IC (Barnstedt et al., 2015), it is important to consider certain limitations. For instance, the anatomical studies did not differentiate between the superficial layer and other layers of the IC, and the imaging studies were performed on anesthetized animals. Future research, employing simultaneous inactivation of auditory corticocollicular feedback along with imaging of excitatory neurons in the dorsal IC of awake animals, could help further evaluate and potentially validate this hypothesis (Bajo et al., 2019; Lesicko et al., 2022; Oberle et al., 2022). It is worth noting that excitatory and inhibitory neurons were imaged in different animals. Although it is unlikely, the Vglut2-Cre and VGAT-Cre lines may differ in their coding strategies. A more convincing approach would be to perform imaging in the same animal by expressing a calcium-insensitive fluorophore in only one population of neurons and a calcium-sensitive fluorophore nonselectively (Ibrahim et al., 2023).
Our findings demonstrate that inhibitory neurons in the dorsal IC favor lateralized ILDs and that, for these neurons, the opponent channel decoder exhibits reduced decoding variability. This diminished variability allows animals to reliably estimate sound locations, especially in life-threatening situations and in the presence of noisy sound stimuli. Similarly, neurons in the central IC also prefer lateralized spatial cues, and the accuracy of the opponent channel decoder is comparable to, or just slightly lower than, that of the population pattern decoder (Lesica et al., 2010; Belliveau et al., 2014; but see Day and Delgutte, 2013). These findings suggest that neurons in the central IC may directly influence the ILD tuning of inhibitory neurons in the dorsal IC. Additionally, we observed that inhibitory neurons exhibit narrower ILD tuning, more correlated ILD and frequency tunings (which we modeled as indicative of a larger number of inhibitory inputs), and stronger binaural inhibition compared with excitatory neurons. We hypothesize that the ILD tunings of dorsal IC inhibitory neurons are shaped by a combination of excitatory inputs from the central IC and inhibitory inputs from the contralateral dorsal IC. This is supported by evidence showing that IC neurons maintain a balance between excitatory and inhibitory synaptic inputs for ILD encoding (Ono and Oliver, 2014) and that dorsal IC inhibitory neurons receive three times more inhibitory inputs than excitatory inputs from the contralateral IC (Chen et al., 2018). Moreover, silencing the dorsal IC unilaterally leads to a decrease in the inhibitory synaptic currents of neurons in the contralateral dorsal IC (Liu et al., 2022). Taken together, these findings suggest that dorsal IC inhibitory neurons may not only modulate spatial coding in the forebrain through their feedforward projections to the auditory thalamus (Ito et al., 2009; Geis and Borst, 2013; Silveira et al., 2020) but also guide an animal away from locations with hazardous stimuli via their extensive feedforward projections to the superior colliculus and periaqueductal gray (Xiong et al., 2015).
It is worth noting that our study focused on three decoders. Excitatory and inhibitory neurons might also exhibit distinct roles using other decoders, such as population vector (Fitzpatrick et al., 1997; Fischer and Peña, 2011), linear classifier (Patel et al., 2018, 2022), and artificial neural network (Middlebrooks et al., 1994; Amaro et al., 2021). An arguably more significant question pertains to decoding an animal’s choices during a sound localization task, rather than interpreting presented stimuli in passive listening conditions (Town et al., 2018; Town and Bizley, 2022). Nonetheless, our exploration of three neural decoders, in combination with neuron type specificity and characterization and modeling of encoding properties, paves the way for future investigations into the function of the colliculi during sound localization tasks.
Footnotes
Author contributions: C.C. designed research; C.C. performed research; C.C. analyzed data; C.C. wrote the paper.
We thank Professor Bo Hong and the members of his laboratory, Yili Yan and Li Shen, for discussions, sound stimulus generation, and speaker calibration. We thank Professor Mitchell Day and Nicholas Lesica for sharing their neural decoding code, Professor Xiaowei Chen and Kexin Yuan for technical assistance, and two anonymous reviewers for their comprehensive, constructive, and inspiring comments. This work was supported by the National Natural Science Foundation of China Grant 61836004 (S.S.), Institute Guo Qiang (S.S.), Beijing Brain Science Special Project Grant No. Z181100001518006 (S.S.), Tsinghua University Initiative Scientific Research Program 20197010009 (S.S.), IDG/McGovern Institute for Brain Research at Tsinghua University (S.S.), and National Key Research and Development Program of China 2021ZD0200300 (S.S.).
The authors declare that they have no conflict of interest.
- Correspondence should be addressed to Chenggang Chen at ccg1988@yeah.net.