Abstract
Hearing is an active process in which listeners must detect and identify sounds, segregate and discriminate stimulus features, and extract their behavioral relevance. Adaptive changes in sound detection can emerge rapidly, during sudden shifts in acoustic or environmental context, or more slowly as a result of practice. Although we know that context- and learning-dependent changes in the sensitivity of auditory cortical (ACX) neurons support many aspects of perceptual plasticity, the contribution of subcortical auditory regions to this process is less understood. Here, we recorded single- and multiunit activity from the central nucleus of the inferior colliculus (ICC) and the ventral subdivision of the medial geniculate nucleus (MGV) of male and female Mongolian gerbils under two different behavioral contexts: as animals performed an amplitude modulation (AM) detection task and as they were passively exposed to AM sounds. Using a signal detection framework to estimate neurometric sensitivity, we found that neural thresholds in both regions improve during task performance, and this improvement is largely driven by changes in the firing rate rather than phase locking. We also found that ICC and MGV neurometric thresholds improve as animals learn to detect small AM depths during a multiday perceptual training paradigm. Finally, we revealed that in the MGV, but not the ICC, context-dependent enhancements in AM sensitivity grow stronger during perceptual training, mirroring prior observations in the ACX. Together, our results suggest that the auditory midbrain and thalamus contribute to changes in sound processing and perception over rapid and slow timescales.
Significance Statement
What a listener hears depends on several factors, such as whether the listener is attentive or distracted and whether the sound is meaningful or irrelevant. Practice can also shape hearing by improving the detection of particular sound features, as occurs during language or musical learning. Understanding how changes in sound perception arise in the brain is important for developing strategies to optimize healthy hearing and for treating disorders in which these processes go awry. We report that neurons in the auditory midbrain and thalamus exhibit rapid shifts in sound sensitivity that depend on the sound's behavioral relevance and slower improvements that emerge over several days of training. Our results suggest that subcortical areas make an important contribution to flexible hearing.
Introduction
Sound perception depends on the behavioral state and past experience of the listener. Enhanced detection or discrimination abilities can emerge rapidly due to sudden shifts in attention or expectations (Shinn-Cunningham, 2008; Jaramillo and Zador, 2010; Snyder et al., 2012; Hoskin et al., 2014; Lawrance et al., 2014; Kaya and Elhilali, 2017) and more slowly, over the course of perceptual learning (Gibson, 1953; Fitzgerald and Wright, 2011; Irvine, 2018). In both cases, perceptual improvements are accompanied by, and often correlate with, changes in the responsiveness (Miller et al., 1972; Bao et al., 2004; Niwa et al., 2012), tuning (Beitel et al., 2003; Fritz et al., 2003; Witte and Kipke, 2005; Lee and Middlebrooks, 2011; David et al., 2012; Carcea et al., 2017), or topographic organization (Recanzone et al., 1993; Rutkowski and Weinberger, 2005; Polley et al., 2006; Whitton et al., 2014) of auditory cortical (ACX) neurons.
Although the contribution of the ACX to perceptual plasticity is well established, anatomical and physiological findings suggest that subcortical auditory areas may also play an important role. Both the inferior colliculus (IC) and the medial geniculate nucleus (MGN) receive inputs from neuromodulatory (Fitzpatrick et al., 1989; Olazábal and Moore, 1989; Klepper and Herbert, 1991; Motts and Schofield, 2009, 2010; Nevue et al., 2016) and limbic structures (Marsh et al., 2002; Liu et al., 2021) that are well positioned to mediate context-dependent and/or learning-related plasticity. IC and MGN neurons also exhibit rapid changes in activity when animals perform sound-guided tasks (Ryan and Miller, 1977, 1978; Ryan et al., 1984; Metzger et al., 2006; von Kriegstein et al., 2008; Otazu et al., 2009; Slee and David, 2015; Rocchi and Ramachandran, 2020; Franceschi and Barkat, 2021; Shaheen et al., 2021) and longer-lasting changes induced by associative learning (Gilad et al., 2020; Taylor et al., 2021).
One limitation of the existing literature is the reliance on simple, suprathreshold sounds to assess plasticity in the IC and MGN. Few studies have used complex or near-threshold stimuli, where contextual cues or experience would provide the most benefit to the listener. Those studies that did use near-threshold sounds did not report neural measures of signal detectability or discriminability and instead focused exclusively on changes in the firing rate (FR; Ryan and Miller, 1977; Ryan et al., 1984). This approach is insufficient, as indiscriminate changes in response gain have little to no impact on neurometric performance (Shaheen et al., 2021). As a result, context- and learning-dependent changes in IC and MGN sound “sensitivity” remain largely unexplored.
Here, we recorded neural activity from the central nucleus of the IC (ICC) and the ventral subdivision of the MGN (MGV) during passive sound exposure and as animals trained on an amplitude modulation (AM) detection task. Using a signal detection framework to estimate neurometric sensitivity, we found that ICC and MGV neurons exhibit better AM depth thresholds during task performance compared with passive sound exposure sessions and that this context-dependent shift is driven largely by changes in FR rather than phase locking. We also show that neurometric thresholds in both the ICC and MGV improve as animals learn to detect smaller AM depths over several days of training. Finally, we reveal that in the MGV, but not the ICC, context-dependent enhancements in AM sensitivity grow stronger over the course of perceptual training. Together, these findings indicate that the IC and MGN support both context-dependent and learning-related perceptual plasticity, raising the possibility that similar changes in cortical sensitivity are inherited from the ascending auditory pathway.
Materials and Methods
Animals
Mongolian gerbils (Meriones unguiculatus) were obtained from Charles River Laboratories and bred in-house. Gerbils were group-housed on a 12 h light/dark cycle and given ad libitum access to chow (Purina Mills LabDiet 5001). During behavioral training, animals were placed on controlled water access and maintained at no less than 80% of their initial weight. To account for any effects of sex, we used approximately equal numbers of males and females. All procedures were conducted at the University of Maryland College Park and were approved by the University of Maryland Institutional Animal Care and Use Committee.
Electrophysiology components and assembly
Chronic 64-channel electrode arrays [Buzsaki64_5 × 12-H64LP_30mm (N = 9) and A4 × 16-Poly2-5mm-20s-150-160-H64LP_30mm (N = 2), NeuroNexus Technologies] were attached to the footplate of a custom-made microdrive using superglue. A small protective wall was built around the connector using dental cement (PALACOS). Prior to surgery, the assembled electrode was placed in an ultraviolet sterilization chamber (254 nm, UV Clave) for 15 min.
Surgical procedure
Electrode implantation was performed after animals had successfully completed “associative training” (see below, Behavioral training and analysis). One day prior to surgery, animals were given meloxicam (1.5 mg/kg, 1.5 mg/ml) via either oral suspension or subcutaneous injection as a preventative analgesic. On the day of surgery, animals were administered meloxicam and dexamethasone (0.35 mg/kg, 0.5 mg/ml) subcutaneously to prevent edema. Subjects were sedated in a small induction chamber in which isoflurane (5%) was continuously administered in O2 (2 L/min). After the animal was sedated, the fur on the animal's head was shaved. The animal was then transferred to a warming pad on a stereotaxic device (Kopf Instruments) and secured in place with ear bars and a bite bar. A nose cone delivered a steady flow of isoflurane (1.5–2.5%) and oxygen at a flow rate of 2 L/min. Once a surgical plane of anesthesia was achieved (as evidenced by slow and steady respiration, and the absence of a toe-pinch response), ophthalmic ointment was applied to the eyes, and the surgical area was treated with alternating applications of betadine and alcohol. The skin on the top of the surgical area was removed with a scalpel and scissors to accommodate a headcap. The skull was cleaned and dried with H2O2 and scored with a scalpel. The implantation site was marked on the skull at the appropriate medial–lateral and anterior–posterior coordinates (Table 1). A 7 mm drill bit was used to create holes in the frontal bone and right parietal bone, and four to five bone screws (0–80 thread, 3/32″) were inserted. The skull was coated with luting cement (C&B Metabond Quick!) except at the marked implantation site and lambda.
Table 1. Electrode implant coordinates, relative to lambda
A craniotomy was performed over the region of interest using a 5 mm drill bit. The assembled electrode was attached to the stereotaxic frame and fluorescent beads (0.2 µm FluoSpheres, Thermo Fisher Scientific) were gently painted on the shanks using a paintbrush (Simmons et al., 2020). After lowering the electrode near the skull surface, a small burr hole was drilled away from the craniotomy site, where the ground wire was inserted and kept in place using super glue. A durotomy was performed, and the electrode was lowered to the appropriate depth (Table 1). For ICC implants, the electrode was lowered at a 30° angle along the anterior–posterior plane. Any moveable parts of the electrode or microdrive that remained above the brain surface were coated with petroleum jelly. Dental cement was then applied in layers on top of the electrode and skull. Immediately after the surgery, subjects were given a subcutaneous injection of Normosol (1–2 ml) and allowed to recover on a warming pad.
An additional dose of meloxicam and dexamethasone was administered 24 h after surgery, and subjects were closely monitored after the surgical procedure. Animals were given a recovery period of at least 1 week before controlled water access began. At least 1 d prior to surgery and during the recovery period, subjects were treated with an antibiotic (minocycline, 1.5 mg/kg) in their drinking water to prevent infection and reduce inflammatory processes in the brain around the electrode (Rennaker et al., 2007).
Behavioral training and analysis
Behavioral performance on a Go/NoGo AM detection task was assessed as previously described (Caras and Sanes, 2015, 2017, 2019; Mowery et al., 2019; Macedo-Lima et al., 2024; Fig. 1A,B). Briefly, we trained the animals to drink from a metal spout during continuous unmodulated broadband noise (100 Hz–20 kHz; the NoGo stimulus) and withdraw when the noise smoothly transitioned to AM (5 Hz, 1 s duration; the Go stimulus). To encourage withdrawal during the AM stimulus, we paired the misses (drinking during AM noise) with a mild shock (0.5–1.0 mA, 300 ms; H13-15, Coulbourn Instruments).
Figure 1. Behavioral paradigm. A, Schematic of AM detection task. During “associative training,” animals learn to drink from a spout during unmodulated noise (black), which serves as the NoGo stimulus, and to cease drinking when the noise transitions to AM (5 Hz, 0 dB re: 100%, red), which serves as the Go stimulus. Three to five NoGo trials were randomly interspersed between each Go trial to prevent the animal from predicting the AM onset. B, During NoGo (unmodulated noise) trials, correctly maintaining spout contact is considered a “correct reject,” whereas incorrectly withdrawing from the spout is considered a “false alarm.” During Go (AM noise) trials, correctly withdrawing from the spout is considered a “hit,” whereas incorrectly maintaining spout contact is considered a “miss” and is punished with a mild shock. C, During “perceptual training,” subjects train with a wide range of AM depths. Depths gradually decrease over several days of training. Lower AM depths are harder to discriminate from unmodulated noise. D, A psychometric curve from a representative subject on the first day of perceptual training. Threshold is defined as the AM depth at which d′ = 1 (dashed line).
Sounds were presented from a free-field calibrated speaker (Peerless DX25TG59-04, Tymphany) centered 1 m above the test cage. Because AM depth discrimination likely operates on a logarithmic scale (Wakefield and Viemeister, 1990), depths were presented on a decibel (dB) scale relative to 100% depth. Thus, 0 dB (re: 100% depth) refers to fully modulated (100% depth) noise, and negative numbers refer to smaller depths. These values differ from dB SPL values, which indicate the root mean square sound level of the stimulus. When AM sounds were presented, the gain of the signal was adjusted to control for changes in average power across modulation depths (Viemeister, 1979; Wakefield and Viemeister, 1990).
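The depth scale and the power correction described above can be illustrated with a short sketch. It assumes the conventional definition of depth in dB as 20·log10(m), where m is the modulation index, and the scaling factor 1/sqrt(1 + m²/2) that equates the average power of sinusoidally modulated and unmodulated noise. The function names, the 1 kHz envelope sampling rate, and the sine starting phase are illustrative assumptions, not details taken from the stimulus-generation code used in the study.

```python
import math

def depth_db_to_fraction(depth_db):
    """Convert AM depth in dB re: 100% to a modulation index m (0 dB -> m = 1)."""
    return 10 ** (depth_db / 20.0)

def power_matching_gain(m):
    """Gain that keeps the mean power of (1 + m*sin) * carrier equal to the carrier's.

    The mean of (1 + m*sin(x))**2 over whole cycles is 1 + m**2 / 2.
    """
    return 1.0 / math.sqrt(1.0 + (m ** 2) / 2.0)

def am_envelope(depth_db, mod_rate_hz=5.0, duration_s=1.0, fs=1000):
    """Power-corrected sinusoidal envelope (assumed sine phase, assumed fs)."""
    m = depth_db_to_fraction(depth_db)
    gain = power_matching_gain(m)
    n_samples = int(round(duration_s * fs))
    return [
        gain * (1.0 + m * math.sin(2.0 * math.pi * mod_rate_hz * i / fs))
        for i in range(n_samples)
    ]

# Depths used on the first perceptual training day (0 to -12 dB in 3 dB steps).
for depth_db in (0, -3, -6, -9, -12):
    m = depth_db_to_fraction(depth_db)
    print(f"{depth_db:>4} dB re: 100%  ->  m = {m:.3f}, gain = {power_matching_gain(m):.3f}")
```

Multiplying a broadband noise carrier by this envelope would leave its mean square value unchanged across depths, so that only the modulation, and not the overall level, distinguishes Go from NoGo stimuli.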
Water delivery was triggered by infrared detection of the animal at the spout and controlled by a programmable syringe pump (NE-1000, New Era Instruments). All animals were tested in a custom cage located within a Double Deluxe wooden sound booth (Gretch-Ken Industries) and monitored remotely via webcam. The ePsych MATLAB toolbox (Stolzberg, 2023) and custom MATLAB 2014b scripts running on a Dell PC and an RZ6 signal processor (Tucker-Davis Technologies, TDT) were used to control sound output and collect behavioral data.
During “associative training,” subjects learned to detect a highly salient Go signal (0 dB re: 100% depth AM noise). Animals first learned to drink from the spout at a 2 ml/min flow rate during 60 dB SPL unmodulated noise (the NoGo stimulus). Over a period of several days, the flow rate and dB SPL of the noise were lowered, and the Go stimulus (0 dB re: 100% AM noise) and shock were introduced. By the end of associative training, the sound level was 45 dB SPL, and the flow rate was 0.2 ml/min. These values were maintained throughout “perceptual training” (see below).
Behavioral responses were scored by determining whether the animal withdrew from the spout for at least 50 ms during the final 100 ms of the behavioral trial. Spout withdrawals for >50 ms were defined as correct responses (“hits”) on Go trials and incorrect responses (“false alarms”) on NoGo trials. Hit and false alarm rates were used to calculate the signal detection metric d′ (Green, 1960):
d′ = z(hit rate) − z(false alarm rate)
where z denotes the inverse of the standard normal cumulative distribution function.
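As a minimal sketch of this computation, the behavioral d′ can be obtained from trial counts as the difference between the z-transformed hit and false alarm rates. The clipping of extreme rates to 1/(2N) is a common convention that is assumed here to avoid infinite values; it is not described in the text, and the function name is hypothetical.

```python
from statistics import NormalDist

def d_prime(n_hits, n_go, n_false_alarms, n_nogo):
    """Behavioral d' = z(hit rate) - z(false alarm rate)."""

    def clipped_rate(count, total):
        # Assumption: keep rates away from 0 and 1 (by 1/(2N)) so that the
        # z-transform stays finite. The study's own bounds are not stated.
        floor = 0.5 / total
        return min(max(count / total, floor), 1.0 - floor)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(clipped_rate(n_hits, n_go)) - z(clipped_rate(n_false_alarms, n_nogo))

# Example: 84 hits on 100 Go trials and 16 false alarms on 100 NoGo trials.
print(round(d_prime(84, 100, 16, 100), 2))
```

Under this definition, a listener whose hit rate equals the false alarm rate has d′ = 0, and larger values indicate better separation of Go from NoGo trials.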
During “perceptual training,” animals trained with a range of successively smaller AM depths for at least 7 d (Fig. 1C). Each day, five AM depths separated by increments of 3 dB re: 100% were presented in a descending order. The first perceptual training session began with AM depths ranging from 0 to −12 dB re: 100%. These values were chosen with the goal of bracketing the animal's likely naive AM depth threshold as estimated from previous studies (Caras and Sanes, 2015, 2017, 2019; Mowery et al., 2019). After animals completed eight trials at each depth within a session, behavioral performance was assessed online, and stimulus values were adjusted to maintain the threshold bracketing. An animal's final AM threshold for a session determined the five starting AM values for the next training day. In order to minimize variability in our neurophysiological recordings, we aimed to maximize the number of trials completed during each session and therefore let the animal drink until satiation. On average, animals completed 17.39 ± 0.40 (mean ± SEM) trials per AM depth during perceptual training sessions.
Because stimuli were selected each day with the goal of bracketing the animal's threshold, it was likely that the animal would fail to detect the smallest AM depth(s) presented. Punishing the animal by delivering a shock during these trials would likely cause the animal to either peck intermittently at the spout, resulting in an artificially high false alarm rate, or to stop approaching the spout altogether. To avoid these potential confounds, the shock was turned off for the lowest two depths presented each day. A previous study tested the validity of this approach and confirmed that animals do not become conditioned to the presence or absence of the shock (Buran et al., 2014).
Each day, hit rates were plotted as a function of AM depth and fit with a cumulative Gaussian using the maximum likelihood procedure of the open-source package Psignifit 4 for MATLAB (Schütt et al., 2016). In previous studies, we found that the default priors worked well for fitting similar data, so we chose to use the default priors again here (Caras and Sanes, 2015, 2017, 2019; Mowery et al., 2019). Fits were then transformed to d′ values and used to calculate each subject's threshold on a given day, defined as the AM depth at which d′ = 1 (Fig. 1D).
Electrophysiological recording and analysis
The goal of this study was to characterize context-dependent and learning-related changes in sound sensitivity in the auditory midbrain and thalamus. Here, we use the term “context” to refer to identical acoustic environments that differ in their behavioral demands. Therefore, each day, neural recordings were made from freely moving animals during two different contexts: perceptual training sessions and periods of passive sound exposure just before (“pre”) and just after (“post”) the behavioral task. During passive exposure sessions, the spout was removed from the test cage, but everything else, including the sound stimuli presented and position of the recording electrodes, remained identical to the task. Animals were presented with 15 trials per AM depth during passive sessions, which closely approximated the average number of trials per depth subjects completed during the task (see above). At the end of each training day, a subject's electrode was advanced if there were few AM-sensitive units or if the quality of the neural signal was poor.
For some subjects (N = 5), extracellular responses were acquired with a wireless headstage and receiver (W64, Triangle Biosystems), preamplified, digitized at a 24.414 kHz sampling rate (PZ5; TDT), and fed via a fiber optic link to an RZ2 BioAmp Processor (TDT) for online filtering and processing. Raw signals were streamed to an RS4 Data Streamer (TDT) for off-line analysis, and data acquisition was controlled by a Dell PC running TDT's Synapse Suite. For the remaining subjects (N = 7), responses were acquired at a 30 kHz sampling rate with a tethered RHD headstage and recording system (Intan Technologies) and a Dell PC running the Open Ephys GUI (Siegle et al., 2017).
Common average referencing and a high-pass filter (150 Hz) were applied to each channel in order to reduce noise. Spikes were extracted and sorted off-line using the open-source package Kilosort2 (Pachitariu et al., 2016; Fig. 2A). The threshold for spike extraction was set to 4–10 standard deviations outside of the background noise band, and an artifact threshold was set to 50 standard deviations above the noise. Sorted spike waveforms were then clustered using principal component analysis and manually curated using phy (Rossant et al., 2023). Units were then analyzed using the Allen Brain Institute's quality metrics (Unit Quality Metrics, 2023). Single units were defined as clusters with (1) a clear separation in principal component space; (2) an interspike interval (ISI) violation, which measures the rate of contaminating spikes relative to true spikes, <0.5 (Hill et al., 2011); and (3) a fraction missing value of <0.1. In the ICC, the ISI threshold was set to 0.7 ms (Garcia-Lazaro et al., 2013; Graña et al., 2017; Yang et al., 2020). In the MGV, the ISI threshold was set to 1.5 ms. Since some MGV neurons fire in bursts, we manually examined spikes after running quality metrics for units with ISI histograms skewed toward short ISIs and a sharp decline between 3 and 10 ms (Bartlett and Smith, 1999; Ramcharan et al., 2005). We classified any burst firing units which passed criteria (1) and (2) but did not pass (3) as single units, given that (3) assumes a normal amplitude histogram (Fee et al., 1996; Hill et al., 2011). Clusters that failed to meet the manual and/or quantitative criteria but appeared neural in origin were considered multiunits. Because waveform characteristics and the location of the spike recorded along the multisite probe can change over extended periods of time due to physical drift, we chose to treat neurons recorded each day as independent samples. 
Because this approach ignores the potential repeated-measures structure of our data, our estimates of effect sizes are conservative.
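To make criterion (2) more concrete, the sketch below computes an ISI-violation ratio in the general form attributed to Hill et al. (2011), as we understand it from open-source quality-metric implementations: the rate of refractory-period violations divided by the unit's overall firing rate. This is an approximation written for illustration. The function name, the handling of duplicate spikes (none), and the exact normalization are assumptions and may differ from the pipeline used in the study.

```python
def isi_violation_ratio(spike_times_s, duration_s, isi_threshold_s, min_isi_s=0.0):
    """Approximate rate of contaminating spikes relative to the overall firing rate.

    spike_times_s   : spike times in seconds
    duration_s      : length of the recording in seconds
    isi_threshold_s : refractory threshold (e.g. 0.0007 for ICC, 0.0015 for MGV)
    min_isi_s       : assumed minimum resolvable ISI (0 by default)
    """
    spikes = sorted(spike_times_s)
    n_spikes = len(spikes)
    if n_spikes < 2 or duration_s <= 0:
        return float("nan")
    isis = [later - earlier for earlier, later in zip(spikes, spikes[1:])]
    n_violations = sum(1 for isi in isis if isi < isi_threshold_s)
    # Total time around all spikes in which a violation could occur.
    violation_time = 2.0 * n_spikes * (isi_threshold_s - min_isi_s)
    total_rate = n_spikes / duration_s
    violation_rate = n_violations / violation_time
    return violation_rate / total_rate

# A regular 10 Hz train has no violations; one extra spike 0.3 ms after
# another spike produces a nonzero ratio at the 0.7 ms ICC threshold.
regular = [i * 0.1 for i in range(100)]
print(isi_violation_ratio(regular, 10.0, 0.0007))
print(isi_violation_ratio(regular + [0.1003], 10.0, 0.0007))
```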
Figure 2. Electrophysiology. A, Extracellular recordings were obtained using 64-channel electrodes inserted into either ICC or MGV. Filtered voltage traces from four adjacent contact sites in the ICC are shown. Kilosort2 was used to sort spikes from individual units. B, C, Low- and high-magnification images depicting electrode tracks (green) in the ICC (B) and the MGV (C). Scale bars, 500 μm.
Each unit's responses were quantified using two complementary metrics. First, each unit's FR (spikes/s) was calculated as the number of spikes during the entire 1 s trial window consisting of either unmodulated or AM noise. Second, because context-dependent or learning-related changes in phase locking can also contribute to enhanced sound sensitivity (Beitel et al., 2003; Bao et al., 2004; Niwa et al., 2012), we calculated each unit's vector strength on a cycle-by-cycle basis (VScc) as described by Yin et al. (2011). We chose to use VScc instead of other established spike timing metrics like Rcorr (Schreiber et al., 2003), the correlation index (Joris et al., 2006), or the shuffled autocorrelogram (SAC; Lee et al., 2016), for three reasons. First, both the correlation index and SAC are derived from the peak of the SAC in response to a periodic stimulus, which does not permit the calculation of trial-by-trial variance required for the neurometric analyses described below. Second, the calculation of both the correlation index and Rcorr depends on a user-defined bin size. The optimal bin size for the correlation index computation is thought to be 50 µs (Kessler et al., 2021), which requires a sampling rate of at least 100 kHz, well beyond the capabilities of our multichannel recording system. The optimal bin width for Rcorr can vary under different conditions, like hormonal status (Caras et al., 2015), raising the possibility that task performance or learning might alter the optimal width. This would make it difficult to make meaningful direct comparisons of Rcorr-based d′ values across behavioral contexts or training days, which were the primary goals of this study. Third, and finally, while Rcorr, correlation index, and SAC are well suited for evaluating spike timing reliability, none of them specifically assess phase locking. 
VScc was specifically designed to quantify phase-locked responses to sinusoidal AM stimuli while avoiding many of the pitfalls associated with the original VS metric published by Goldberg and Brown (1969), like inflated VS values at low FRs. Unlike the correlation index and Rcorr, VScc is also parameter-free and, most importantly, is calculated on a trial-by-trial basis which permits the neurometric computations that are critical for addressing our central hypotheses.
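For reference, the classic vector strength of Goldberg and Brown (1969) can be sketched in a few lines. This is deliberately the original metric and not VScc, whose cycle-by-cycle computation follows Yin et al. (2011) and is not reproduced here. The example also illustrates the low-FR pitfall mentioned above: a trial containing a single spike yields a vector strength of 1 regardless of when the spike occurs. The function name is hypothetical.

```python
import math

def vector_strength(spike_times_s, mod_rate_hz):
    """Classic vector strength (Goldberg and Brown, 1969) for one spike train.

    Each spike is mapped to a phase of the modulation cycle; the result is
    the length of the mean resultant vector (0 = no locking, 1 = perfect).
    """
    n_spikes = len(spike_times_s)
    if n_spikes == 0:
        return float("nan")
    phases = [2.0 * math.pi * mod_rate_hz * t for t in spike_times_s]
    cos_sum = sum(math.cos(p) for p in phases)
    sin_sum = sum(math.sin(p) for p in phases)
    return math.hypot(cos_sum, sin_sum) / n_spikes

# One spike per cycle at a fixed phase of 5 Hz AM: perfect phase locking.
print(vector_strength([0.05, 0.25, 0.45, 0.65, 0.85], 5.0))
# A single spike at an arbitrary time also gives 1.0 (inflation at low FR).
print(vector_strength([0.1234], 5.0))
# Two spikes half a cycle apart cancel out.
print(vector_strength([0.0, 0.1], 5.0))
```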
After each metric (FR and VScc) was calculated for a given unit, it was used to calculate a neurometric d′ value as previously described (Caras and Sanes, 2017):
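As a hedged illustration of how a trial-by-trial response metric can be converted into a neurometric d′, the sketch below uses one common formulation: the difference between the mean response to AM and to unmodulated noise, divided by the average of the two standard deviations, which is equivalent to 2(μAM − μUnmod)/(σAM + σUnmod). This form is an assumption made for illustration and may differ from the exact expression in Caras and Sanes (2017). The function name is hypothetical.

```python
from statistics import mean, stdev

def neurometric_dprime(am_values, unmod_values):
    """d' between trial-by-trial responses to AM and to unmodulated noise.

    Assumed form: (mean_AM - mean_unmod) / ((sd_AM + sd_unmod) / 2).
    Inputs are per-trial values of a single metric (e.g. FR or a
    trial-based vector strength) with at least two trials each.
    """
    spread = (stdev(am_values) + stdev(unmod_values)) / 2.0
    if spread == 0:
        return float("nan")  # undefined when responses never vary
    return (mean(am_values) - mean(unmod_values)) / spread

# Toy firing rates (spikes/s) on three AM trials and three unmodulated trials.
print(neurometric_dprime([12, 14, 16], [8, 10, 12]))
```

Computing d′ at each AM depth in this way yields a neurometric function, and a neural threshold can then be read off at d′ = 1 in the same manner as the behavioral threshold.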
Frequency tuning
Every day immediately after the post-passive session, subjects were passively exposed to a series of pure tones (500 Hz–32 kHz in octave steps, 10–80 dB SPL in 10 dB steps, 200 ms duration, 1 s interstimulus interval). Each frequency was played 10 times in a randomized order. We then constructed frequency tuning curves for each unit by calculating the average FR for each frequency and sound level combination. In one animal, the anatomical location of the recording sites could not be conclusively determined using histology. Therefore, we used tuning curve information to determine whether the units from that animal (n = 359) displayed V-shaped narrow-frequency tuning curves, characteristic of units in the MGV (Edeline et al., 1999). Units that did not meet this criterion were flagged and excluded from further analysis (n = 318).
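The tuning curve construction can be sketched as below. The stimulus grid follows the values given above (octave steps from 500 Hz to 32 kHz, 10 dB steps from 10 to 80 dB SPL). The trial format, the function name, and the choice to compute FR over the 200 ms tone duration are assumptions made for illustration.

```python
from collections import defaultdict

TONE_DURATION_S = 0.2                                  # 200 ms tones
FREQUENCIES_HZ = [500 * 2 ** k for k in range(7)]      # 500 Hz to 32 kHz, octave steps
LEVELS_DB_SPL = list(range(10, 90, 10))                # 10 to 80 dB SPL, 10 dB steps

def tuning_curve(trials):
    """Average FR (spikes/s) for each frequency and sound level combination.

    trials: iterable of (frequency_hz, level_db_spl, spike_count) tuples,
            one per tone presentation (hypothetical data layout).
    """
    rates = defaultdict(list)
    for frequency_hz, level_db_spl, spike_count in trials:
        # Assumption: FR is computed over the tone duration only.
        rates[(frequency_hz, level_db_spl)].append(spike_count / TONE_DURATION_S)
    return {key: sum(values) / len(values) for key, values in rates.items()}

# Two presentations of a 1 kHz, 40 dB SPL tone with 2 and 4 spikes.
print(tuning_curve([(1000, 40, 2), (1000, 40, 4)]))
```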
Criteria for unit inclusion
Over the course of 7 d, we recorded from a total of 329 units (217 single-units, 112 multi-units) from the ICC and 823 units (606 single-units, 217 multi-units) from the MGV. (For a breakdown of the number and type of units recorded from each region on each day, see Table 2.) We then applied the selection criteria outlined below to determine which units would be included in each of our statistical analyses. These criteria were applied identically in the ICC and the MGV, allowing for direct comparisons across brain regions.
Table 2. The number and type of units recorded in each region on each day of perceptual training
First, for each analysis that focused exclusively on FR-based measurements, we pooled single- and multi-units together to maximize our statistical power. In contrast, for each analysis that involved VScc-based measurements, we restricted the analysis to single-units only, because it is difficult to interpret what the VScc of a multi-unit actually reflects.
We then asked whether and when the remaining units exhibited at least one measurable AM depth threshold. For analyses that examined the effect of behavioral context, either alone or in combination with perceptual training, we restricted the analyses to units that exhibited a measurable AM depth threshold during at least one behavioral context (e.g., task performance or during pre- or post-passive sound exposure sessions). The fraction of units that met this criterion, broken down by day and by region, can be found in Table 3. Measurable FR-based thresholds were observed in 41/329 single- and multi-units in the ICC and 117/823 single- and multi-units in the MGV during at least one behavioral context. These units were included in the analyses in Figures 4–6 and 13, A and B. Measurable VScc-based thresholds were observed in 97/217 single-units in the ICC and 91/606 single-units in the MGV during at least one behavioral context. These units were included in the analyses in Figures 7, 8, and 13, D and E.
Table 3. Fraction of units that exhibited a measurable threshold during at least one behavioral context, broken down by day, region, and measurement type
When examining the effect of perceptual training on neural thresholds alone, we restricted the analyses to units that exhibited a measurable threshold during task performance. For FR-based thresholds, 28/329 single- and multi-units in the ICC and 115/823 single- and multi-units in the MGV met this criterion and were included in Figure 11. For VScc-based thresholds, 58/217 single-units in the ICC and 55/606 single-units in the MGV met this criterion and were included in Figure 12.
Finally, we performed two analyses that examined the relationship between FR-based and VScc-based thresholds. For these analyses, only single-units were included (because VScc-based measurements were involved). In one of these analyses, we asked how FR-based and VScc-based thresholds changed in individual neurons as animals transitioned from a state of passive listening to task performance. We therefore included single-units that exhibited a measurable FR-based or VScc-based threshold, either during the task or during the pre-passive sound exposure session. In the ICC, 92/217 single-units met this criterion and were included in Figure 9A. In the MGV, 144/606 single-units met this criterion and were included in Figure 9B. In the other analysis, we asked whether neurons that exhibited a measurable FR-based threshold during task performance also exhibited a measurable VScc-based threshold during passive sound exposure. Therefore, we restricted this analysis to single-units that exhibited a measurable FR-based threshold during task performance. We found 20/217 single-units in the ICC and 84/606 single-units in the MGV that met this criterion. These units were included in Figure 9, C and D.
Once the appropriate sample of units was selected, the final step was to determine how to treat instances when we could not measure a threshold. In most cases (including the analyses for Figs. 4–8, 9C,D, 11, 12) unmeasurable thresholds were treated as NaNs (i.e., not a number). In the remaining analyses, we set these values to 0 dB re: 100%. This decision was made either because we were interested in measuring how the thresholds of individual units changed across contexts (Fig. 9A,B) or across contexts and training in combination (Fig. 13) and needed a real value to quantify or model the magnitude and direction of the change. We chose 0 dB as our replacement value because it is the highest possible AM depth.
Histology
After all training was completed, subjects were briefly placed under light isoflurane anesthesia during which electrolytic lesions were made by passing a small current (10–15 µA) through electrode channels at the tips of the shanks. Twenty-four hours later, animals were anesthetized with an intraperitoneal injection of ketamine (150 mg/kg, 25 mg/ml) and xylazine (6 mg/kg, 1 mg/ml) in saline. After the subject was determined to be insensitive to a toe pinch and deeply sedated, the chest cavity was opened, and the subject was perfused transcardially using 1× phosphate buffered saline followed by 4% paraformaldehyde.
Tissue processing and microscopy
Extracted brains were postfixed in 4% paraformaldehyde at 4°C for at least 1 d, then embedded in 6% agar, and sliced on a vibratome (Leica VT1000 S) at 70 μm thickness. Slices were mounted on gelatin-subbed slides and dried overnight before coverslipping with ProLong Gold with DAPI. Slices were imaged on an upright fluorescent microscope (Leica DM750). To verify that recordings were made in the ICC or MGV, we confirmed that fluorescent beads, electrode tracks, and/or electrolytic lesions were in our area of interest (Fig. 2B,C).
ACX data
A previously published study (Caras and Sanes, 2017) that used an identical behavioral paradigm reported context- and learning-dependent changes in the FR-based AM sensitivity of gerbil ACX neurons. To gain a better understanding of whether the influence of context or learning differs across the auditory hierarchy, we reanalyzed the publicly available dataset associated with that paper, allowing us to quantitatively compare FR-based results from the ICC, MGV, and ACX. These data are included in Figures 6 and 13.
Statistical analysis
We constructed generalized linear mixed models (GLMMs) in R using the glmmTMB package (Brooks et al., 2017). Normal distribution of residuals was tested after running the GLMM using DHARMa (Hartig and Lohse, 2022); if the residuals were not normally distributed, either a corrective transformation was applied to the data or a bootstrap analysis was used to verify that the model coefficients were significantly different from zero. An analysis of variance (ANOVA) was used to assess the statistical significance of each model. Post hoc tests consisted of pairwise comparisons with t and z tests using the emmeans package (Lenth et al., 2024). Because perceptual improvement generally follows an exponential decay function, we log-transformed Day whenever Day was included as an effect. Results were considered statistically significant when p < 0.05. All p values from multiple comparisons were adjusted with a Bonferroni correction.
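Two of the simpler steps above, the log transformation of Day and the Bonferroni adjustment, can be written out as a short worked example. The analyses in the study were run in R; this Python sketch only illustrates the arithmetic. The use of the natural logarithm is an assumption, since the base is not stated, and the function names are hypothetical.

```python
import math

def log_transform_day(day):
    """Log-transform a training day index (assumes natural log and day >= 1)."""
    return math.log(day)

def bonferroni_adjust(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    n_comparisons = len(p_values)
    return [min(1.0, p * n_comparisons) for p in p_values]

# Three pairwise comparisons (e.g. pre vs task, task vs post, pre vs post).
raw_p = [0.01, 0.04, 0.5]
for raw, adjusted in zip(raw_p, bonferroni_adjust(raw_p)):
    print(f"raw p = {raw:.2f}, adjusted p = {adjusted:.2f}, significant = {adjusted < 0.05}")
```

In this toy example only the first comparison remains significant after adjustment, which shows why a raw p value just below 0.05 does not survive correction across three comparisons.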
To examine the effect of behavioral context on neurometric thresholds (Figs. 4D, 5D, 7J,K, 8J,K), we created linear mixed effects models (LMMs) in which Context (i.e., pre, task, or post) was a fixed effect and Unit nested within Subject was a random intercept. Because most units did not exhibit a measurable FR-based threshold during pre- and post-passive sessions, FR thresholds were also analyzed with a GLMM that included the same fixed and random effects described above but used a binomial distribution. This analysis asked whether behavioral context affected the presence or absence of a measurable threshold rather than the threshold values themselves.
To examine the effect of behavioral context on FR ratios and coefficients of variation (Figs. 4E–G, 5E–G), we first created LMMs in which Context was a fixed effect and Unit nested within Subject was a random intercept. Some of these models revealed a significant (p < 0.05) result, despite the fact that the effect size (as measured by Cohen's d) was relatively small. In those cases, we created bootstrapped datasets for each measure by sampling with replacement (while preserving the repeated measures aspect of our data) and then ran our GLM analysis on the bootstrapped dataset. We repeated this process 999 times, generating a distribution of χ2 values. We then asked whether the lower bound of the 95% confidence interval (CI) of this distribution exceeded the χ2 critical value for two degrees of freedom and an alpha level of 0.05. When it did, we interpreted it to mean that the original result was robust. These analyses were implemented in R using the boot package (Canty et al., 2024).
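The robustness check described above can be sketched in a few lines. The sketch below is written in Python rather than R and is only an illustration under stated assumptions: it substitutes a Friedman rank statistic (which is also referred to a χ2 distribution with two degrees of freedom for three contexts) for the mixed-model χ2 used in the actual analysis, it takes the lower CI bound as the 2.5th percentile of the bootstrap distribution, and the function names are hypothetical. What it is meant to show is the resampling of whole units, so that each unit's pre, task, and post values stay together, and the comparison against the critical value.

```python
import math
import random

def chi2_critical_df2(alpha=0.05):
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2),
    # so the critical value has the closed form -2*ln(alpha) (about 5.99).
    return -2.0 * math.log(alpha)

def friedman_chi2(rows):
    # rows: one (pre, task, post) tuple per unit.
    # Stand-in statistic, NOT the mixed-model chi-square from the paper.
    # Assumes no tied values within a unit.
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

def bootstrap_lower_bound(rows, n_boot=999, seed=0):
    # Resample whole units with replacement so that the repeated-measures
    # structure (three contexts per unit) is preserved in every replicate.
    rng = random.Random(seed)
    stats = sorted(
        friedman_chi2([rng.choice(rows) for _ in rows]) for _ in range(n_boot)
    )
    # Assumed percentile interval: lower bound near the 2.5th percentile.
    return stats[int(0.025 * n_boot)]
```

A result would be read as robust when `bootstrap_lower_bound(rows)` exceeds `chi2_critical_df2()`.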
To determine whether the impact of behavioral context on the likelihood that a measurable threshold would be observed differed across brain regions (Fig. 6A), we created a GLM in which the Context × Region interaction was a fixed effect and Unit nested within Subject was a random intercept, using a binomial model. We also were interested in whether, when measurable, neural thresholds differed across brain regions. Because most units did not exhibit a measurable FR-based threshold during pre- and post-passive sessions, we focused this analysis on neural thresholds obtained during task performance (Fig. 6B). To do so, we created an LMM in which Region was a fixed effect and Unit nested within Subject was a random intercept.
To quantify how the neurometric sensitivity of individual units changed when animals transitioned from passive sound exposure to task performance (Fig. 9A,B), we used the Rayleigh test of uniformity from the circular statistics package (Lund et al., 2023).
To determine whether perceptual training affected the likelihood that a measurable neural threshold would be observed, we created a GLM in which Day was a fixed effect and Subject was a random intercept, using a binomial model. In this analysis, the total number of units recorded from each subject on each day was included as a vector of prior weights.
To determine the effect of perceptual training on behavioral thresholds (Fig. 10), we created an LMM in which Day was a fixed effect and Subject was a random intercept. The effect of perceptual training on neurometric thresholds obtained during task performance (Figs. 11A,C, 12A,C) was modeled with an LMM with Day as a fixed effect and Unit nested within Subject as a random intercept. To determine the correlation between behavior and neural thresholds (Figs. 11B,D, 12B,D), we used Spearman's rank correlation coefficient.
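For readers unfamiliar with Spearman's rank correlation, the following minimal Python sketch computes it as the Pearson correlation of the ranks. It is not the code used for the analyses reported here; the function names and the threshold values in the usage note are hypothetical.

```python
def average_ranks(values):
    # Rank from 1..n; tied values receive the mean of the ranks they span.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    # Pearson correlation computed on the ranks of x and y.
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

With invented daily thresholds (dB re: 100%) of `[-6, -8, -9, -11, -12]` for behavior and `[-3, -5, -4, -8, -9]` for a neural population, `spearman_rho` returns 0.9, because only one adjacent pair of days is out of order.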
Finally, to investigate whether the effect of behavioral context on neurometric thresholds changed across perceptual training (Fig. 13), we created an LMM in which the Context × Day interaction was a fixed effect and Unit nested within Subject was a random intercept. We calculated effect sizes using Cohen's d.
Results
To characterize rapid, context-dependent shifts in sound sensitivity in the auditory midbrain and thalamus, we trained Mongolian gerbils on an aversive Go/NoGo AM detection task (Caras and Sanes, 2015, 2017, 2019; Mowery et al., 2019; Macedo-Lima et al., 2024). All animals (N = 11 across all experiments) learned the task quickly, reaching our predetermined criterion level of performance (d′ ≥ 2) within 7 d (Fig. 3). High levels of performance were maintained or quickly restored following surgical implantation of an electrode array into the central nucleus of the inferior colliculus (ICC; Fig. 3A) or the MGV (Fig. 3B).
All animals successfully learned to report the presence of 0 dB re: 100% depth AM noise. A, B, All animals reached our predetermined criterion level of performance (d′ ≥ 2) within 7 d of associative training. High performance levels were maintained or quickly restored after electrode implantation into the ICC (A) or MGV (B). Within each panel, each subject is represented by a different color.
We then recorded neural responses from a total of 329 units (217 single-units, 112 multi-units) from the ICC and 823 units (606 single-units, 217 multi-units) from the MGV as animals trained with a range of successively smaller AM depths over the course of several days. (For a breakdown of the number and type of units recorded from each region on each day, see Table 2.) Each day, recordings were made while animals performed the task and during passive sound exposure sessions just before (pre) and just after (post) the task. FRs and VScc values were transformed into d′ values and fit to generate neurometric functions. Neurometric thresholds were estimated from the fits (see Materials and Methods for full details). Units were considered “sensitive” to AM depth when they exhibited a measurable threshold and “insensitive” to AM depth when they did not. For a complete description of which units were included in each of the following analyses and the rationale for their inclusion, please see Materials and Methods, Criteria for unit inclusion.
Rapid shifts in spike rate, but not phase locking, mediate context-dependent improvements in AM sensitivity
Task performance improves the FR-based sensitivity of ICC and MGV neurons
Previous studies revealed that sound-evoked FRs in both the ICC and the MGV depend on behavioral context (Ryan and Miller, 1977; Slee and David, 2015; Franceschi and Barkat, 2021; Saderi et al., 2021). While this work has yielded important insights, we do not fully understand whether or how context-dependent changes in FR translate into changes in the ability of subcortical neurons to detect or discriminate behaviorally relevant sounds. To answer this question, we first examined how behavioral performance on an AM detection task affects the FR-based AM sensitivity of ICC single and multiunits. Although the majority of these units (288/329, 88%) were always insensitive to AM, a small population (41/329, 12%) exhibited a measurable threshold during at least one behavioral context. (The fraction of units that met this criterion each day is provided in Table 3.) Data from one of these units are depicted in Figure 4A–C. During passive sound exposure sessions, the AM sensitivity of this unit was extremely poor: all FR-based d′ values fell below 1, resulting in an unmeasurable neurometric threshold (Fig. 4C). In contrast, AM sensitivity was better during task performance: d′ values increased, and a neurometric threshold was measurable. This improvement was driven by a broad increase in AM-evoked firing during task performance, as evident in the raster plots in Figure 4A and the quantified FRs in Figure 4B. This neuron was therefore considered to be “insensitive” to AM depth during passive sound exposure and “sensitive” to AM depth during task performance.
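The distinction between a measurable and an unmeasurable threshold can be made concrete with a short Python sketch. It assumes, based on the description above, that threshold is the AM depth at which d′ reaches 1. Unlike the fitted neurometric functions described in Materials and Methods, it uses simple linear interpolation between tested depths. The function name and all depth and d′ values are hypothetical.

```python
import math

def neurometric_threshold(depths_db, dprimes, criterion=1.0):
    # depths_db: AM depths in dB re: 100%, ordered from shallowest (most
    # negative) to deepest (0 dB). dprimes: the d' value at each depth.
    pairs = list(zip(depths_db, dprimes))
    if pairs[0][1] >= criterion:
        return pairs[0][0]  # criterion already met at the shallowest depth
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        if y0 < criterion <= y1:
            # Interpolate linearly to the depth where d' crosses the criterion.
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return math.nan  # d' never reaches the criterion: threshold unmeasurable

# Hypothetical values for a unit like the one described above.
depths = [-18, -15, -12, -9, -6, -3, 0]
passive = [0.0, 0.1, 0.1, 0.3, 0.4, 0.6, 0.8]  # all below 1
task = [0.1, 0.2, 0.5, 0.8, 1.2, 1.6, 2.0]     # crosses 1 between -9 and -6 dB
```

Here `neurometric_threshold(depths, passive)` is NaN (the unit is "insensitive"), whereas `neurometric_threshold(depths, task)` is about −7.5 dB re: 100% (the unit is "sensitive").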
Task performance improves the FR-based sensitivity of ICC neurons. A, Peristimulus time histograms and raster plots for one representative ICC neuron under different behavioral contexts. B, Mean ± SEM FRs across AM depths and during unmodulated noise for the same neuron as depicted in A. C, FR-based neurometric functions for the same neuron depicted in panels A, B. This neuron exhibited a measurable FR-based threshold only during task performance. D, FR-based thresholds for all ICC units that exhibited a measurable threshold during at least one behavioral context. Each unmeasurable threshold is NaN (see Materials and Methods). Representative neuron depicted in panels A–C is highlighted in purple. Dashed lines connect thresholds from the same unit. Thick black lines connect the mean of measurable thresholds across contexts. Asterisks denote significant effects of context on measurable thresholds. E–G, FR ratio (E), CV during AM (Go) trials (F), and CV during non-AM (NoGo) trials (G) for all ICC units that exhibited a measurable FR-based threshold during at least one behavioral context. Plot conventions are the same as panel D. *p < 0.05; **p < 0.01.
To determine whether this pattern was consistent across the ICC population, we first asked whether AM-sensitive units were more likely to exhibit a measurable threshold during a specific behavioral context. We found a significant effect of context on measurability (χ2(2) = 10.68; p = 0.0048; Fig. 4D). AM-sensitive units were more likely to exhibit measurable FR-based thresholds during task performance (28/41, 68%) than during the pre- (16/41, 39%; z = 2.78; p = 0.0163) or post- (14/41, 34%; z = 3.29; p = 0.0030) passive sound exposure sessions.
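The adjusted p values above can be checked by hand. The Python sketch below assumes that each z statistic is referred to a standard normal distribution, that the test is two-sided, and that the Bonferroni correction covers three pairwise comparisons (pre vs task, task vs post, pre vs post). The number of comparisons is inferred from the three contexts and the function names are hypothetical.

```python
import math

def two_sided_p_from_z(z):
    # Two-sided p value under a standard normal reference distribution:
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2.0))

def bonferroni(p, n_comparisons):
    # Multiply by the number of comparisons and cap the result at 1.
    return min(1.0, p * n_comparisons)

for z in (2.78, 3.29):
    print(z, round(bonferroni(two_sided_p_from_z(z), 3), 4))
```

Under these assumptions, z = 2.78 and z = 3.29 give adjusted p values of approximately 0.0163 and 0.0030, matching the values reported above. The cap at 1 explains why some nonsignificant comparisons in this section are reported as p = 1.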
We then asked whether, when measurable, threshold values differed across behavioral contexts. Again, we found a significant effect (χ2(2) = 13.94; p = 0.0009; Fig. 4D). AM-sensitive ICC units exhibited lower (better) FR-based thresholds during the task (t(52) = −3.51; p = 0.0028) and post-passive session (t(52) = −2.93; p = 0.0150) than during the pre-passive exposure session. There was no significant difference between task and post-passive sessions (t(52) = −0.66; p = 1). For a detailed list of the magnitude of the threshold change across contexts and corresponding effect sizes, see Table 4.
The magnitude of context-dependent effects on FR-based metrics, broken down by region
What drives these context-dependent changes in FR-based sensitivity? One explanation is that task performance increases the separation of AM- and non-AM–evoked FR distributions, leading to higher d′ values and lower thresholds. To explore this possibility, we calculated the ratio of average FRs (AM/non-AM) for each ICC unit that exhibited a measurable AM threshold under at least one behavioral context. Because the strongest task-dependent effects on d′ values tended to occur for suprathreshold AM depths (Fig. 4C), we focused on responses at the highest AM depth presented to each neuron. As shown in Figure 4E, FR ratios were significantly affected by behavioral context (χ2(2) = 17.06; p = 0.0002). FR ratios during task performance were significantly higher than those during the pre- (t(117) = 3.21; p = 0.0051) and post- (t(117) = 3.75; p = 0.0008) passive sound exposure sessions. Because these effect sizes were moderate (all comparisons across contexts revealed Cohen's d ≤ 0.6; Table 4), we created a bootstrapped dataset by sampling with replacement (while preserving the repeated measures aspect of our data), then ran our GLM analysis on the bootstrapped dataset. We repeated this process 999 times, generating a distribution of χ2 values. We found that the lower bound of the 95% CI of this distribution (χ2(2) = 6.16) exceeded the χ2 critical value for two degrees of freedom and an alpha level of 0.05 (χ2(2) = 5.99), suggesting that ICC FR ratios are indeed significantly affected by behavioral context.
Another possible driver of increased sensitivity during task performance is a reduction in FR variability. We therefore calculated coefficient of variation (CV) values, again using responses to the highest AM depth presented to each neuron. Although behavioral context appeared to affect AM-evoked (χ2(2) = 18.29; p = 0.0001; Fig. 4F) and (much more weakly) non-AM–evoked CV values (χ2(2) = 6.40; p = 0.0408; Fig. 4G), the effect sizes were small or moderate (all Cohen's d ≤ 0.6; Table 4), and bootstrap analyses revealed that the influence of context was not robust (lower bound of 95% CI; AM, χ2(2) = 5.17; non-AM, χ2(2) = 0.69). Together, these results suggest that in the ICC, task performance enhances the detectability of the target AM sound by increasing the separation of the AM and non-AM FR distributions, without affecting trial-by-trial FR variance. Similar results were observed if we restricted our analysis to just single-units (see Table 5 for a comparison of results when multi-units were included vs excluded from the dataset).
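The two metrics used in this comparison are simple to compute. The Python sketch below assumes that the FR ratio is the mean AM-evoked rate divided by the mean non-AM rate, as described above, and that CV is the sample standard deviation of trial-by-trial rates divided by their mean. The function names and the example rates are hypothetical.

```python
import statistics

def fr_ratio(am_rates, non_am_rates):
    # Ratio of mean firing rates (AM / non-AM). Values above 1 mean the AM
    # stimulus evoked more spikes on average than unmodulated noise.
    return statistics.mean(am_rates) / statistics.mean(non_am_rates)

def coefficient_of_variation(rates):
    # Trial-by-trial variability (sample SD) normalized by the mean rate.
    return statistics.stdev(rates) / statistics.mean(rates)

# Invented trial-by-trial rates (spikes/s) at the highest AM depth.
am = [20, 24, 22, 26, 28]
non_am = [10, 12, 14, 12, 12]
print(fr_ratio(am, non_am), coefficient_of_variation(am))
```

A context that raises the FR ratio without changing either CV, as reported here for the ICC, separates the two rate distributions by moving their means apart rather than by narrowing them.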
A comparison of all significant effects of behavioral context on FR-based metrics when multiunits were included versus excluded from the dataset
We next examined how task performance affects the FR-based AM sensitivity of MGV units, again pooling data across 7 d. In general, we found the same pattern of results as reported for the ICC. Of the 823 total recorded MGV units, 117 (∼14%) exhibited a measurable FR-based threshold during at least one behavioral context. Data from one of these MGV neurons are shown in Figure 5A–C. Task performance affected this neuron's firing in three subtle but important ways. First, as evident in the rasters (Fig. 5A) and quantified FRs (Fig. 5B), the discharge rate during unmodulated noise was lower during task performance than during passive sound exposure. Second, the firing during unmodulated noise was less variable on a trial-by-trial basis during the task. This observation is more difficult to see by eye in the raster plots but is reflected in the smaller error bars for the unmodulated FR during the task in Figure 5B. Finally, the 0 dB-evoked FR is higher during the task than during the passive conditions. Because FR-based d′ values depend on both the magnitude and variability of firing, these relatively subtle changes combine to generate notable improvements in AM sensitivity during the task (Fig. 5C).
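To illustrate why modest changes in both rate and variability can combine into a large change in sensitivity, the Python sketch below uses one common rate-based d′ definition: the difference in mean rate divided by the average of the two standard deviations. This form is an assumption made for illustration and may differ from the transformation described in Materials and Methods. The function name and all firing rates are invented.

```python
import statistics

def rate_dprime(am_rates, non_am_rates):
    # Assumed form: difference in mean rate divided by the average of the
    # AM and non-AM standard deviations.
    mu_diff = statistics.mean(am_rates) - statistics.mean(non_am_rates)
    sd_avg = (statistics.stdev(am_rates) + statistics.stdev(non_am_rates)) / 2.0
    return mu_diff / sd_avg

# Hypothetical trial-by-trial rates (spikes/s), invented for illustration.
passive_am, passive_noise = [18, 22, 20, 24, 16], [14, 22, 18, 26, 10]
# During the task: slightly higher AM rate, lower and less variable noise rate.
task_am, task_noise = [22, 26, 24, 28, 20], [14, 18, 16, 20, 12]

print(rate_dprime(passive_am, passive_noise), rate_dprime(task_am, task_noise))
```

In this toy example the mean AM rate rises by 4 spikes/s, the mean noise rate falls by 2 spikes/s, and the noise SD is halved. Together these changes move d′ from well below 1 to well above 1.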
Task performance improves the FR-based sensitivity of MGV neurons. A, Peristimulus time histograms and raster plots for one representative MGV neuron under different behavioral contexts. B, Mean ± SEM FRs across AM depths and during unmodulated noise for the same neuron as depicted in A. C, FR-based neurometric functions for the same neuron depicted in A and B. This neuron only exhibited a measurable FR-based threshold during task performance. D, FR-based thresholds for all MGV units that exhibited a measurable threshold during at least one behavioral context. Each unmeasurable threshold is NaN (see Materials and Methods). The representative neuron depicted in panels A–C is highlighted in purple. Dashed lines connect thresholds from the same unit. Thick black lines connect the mean of measurable thresholds across contexts. Asterisks denote significant effects of context on measurable thresholds. E–G, FR ratio (E), CV during AM (Go) trials (F), and CV during non-AM (NoGo) trials (G) for all MGV units that exhibited a measurable FR-based threshold during at least one behavioral context. **p < 0.01; ***p < 0.0001.
To determine whether this pattern was consistent across the MGV population, we again asked whether AM-sensitive units were more likely to exhibit a measurable threshold under a specific behavioral context. As illustrated in Figure 5D, we found a significant effect of context (χ2(2) = 58.79; p < 0.0001), such that MGV units were far more likely to exhibit measurable thresholds during task performance (115/117, 98%) than during the pre- (12/117, 10%; z = 314.58; p < 0.0001) or post- (11/117, 9%; z = 359.91; p < 0.0001) passive sound exposure sessions. We also found a significant effect of context on measurable threshold values (χ2(2) = 42.64; p < 0.0001; Fig. 5D), with thresholds significantly lower (better) during task performance than during the pre- (t(132) = −4.90; p < 0.0001) or post- (t(132) = −5.64; p < 0.0001) passive sound exposure sessions. See Table 4 for a detailed list of the magnitude of the threshold change across contexts and corresponding effect sizes.
To examine if the enhanced AM sensitivity during task performance was driven by an increase in the separation between AM- and non-AM FR distributions, we calculated the ratio of average FRs (AM/non-AM) for each individual MGV unit that exhibited at least one measurable AM threshold. As above, we restricted this analysis to the highest AM depth presented to each neuron. We found a significant effect of behavioral context (χ2(2) = 471.70; p < 0.0001; Fig. 5E). FR ratios during task performance were significantly higher than those during the pre- (t(345) = 15.42; p < 0.0001) and post- (t(345) = 15.08; p < 0.0001) passive sound exposure sessions. Large effect sizes (Cohen's d > 1; Table 4) and a bootstrap analysis (lower bound of 95% CI, χ2(2) = 395.37) confirmed that these results were robust.
We then analyzed CV values to determine whether changes in FR variability also contribute to context-dependent shifts in MGV sensitivity. A GLM/ANOVA suggested a significant effect of context on AM-evoked CV values (χ2(2) = 88.75; p < 0.0001; Fig. 5F), such that FR variability was lower during the task than during pre- (t(345) = −5.86; p < 0.0001) or post- (t(345) = −5.93; p < 0.0001) sessions. Despite moderate effect sizes (all Cohen's d < 0.7; Table 4), a bootstrap analysis revealed that the results were robust (lower bound of 95% CI, χ2(2) = 5.91). Finally, behavioral context did not significantly affect CV values during non-AM noise (χ2(2) = 2.82; p = 0.2445; Fig. 5G). Together, these results suggest that task performance modulates both the magnitude and trial-by-trial variability of MGV firing, leading to significantly better AM detectability. Similar results were observed if we restricted our analysis to just single units (Table 5).
A previous study (Caras and Sanes, 2017) that used an identical behavioral paradigm to the one implemented here reported that FR-based AM thresholds of gerbil ACX neurons are better during task performance, similar to what we observed in the ICC and MGV. To gain a better understanding of whether the influence of context differs across the auditory hierarchy, we reanalyzed the publicly available dataset associated with that paper. We first asked whether the effect of behavioral context on FR-based threshold measurability differs across the three brain regions. We found a significant interaction of context and region (χ2(4) = 58.25; p < 0.0001; Fig. 6A). While thresholds were more likely to be measurable during task performance in all three brain areas, the effect was more pronounced in the MGV and ACX than in the ICC.
The effect of behavioral context on FR-based AM sensitivity grows stronger across the ascending auditory hierarchy. A, Proportion of units with measurable FR-based thresholds across behavioral contexts. The effect of context is more pronounced in MGV and the ACX than in the ICC. B, During task performance, FR-based thresholds are lower in the ACX than in the ICC or MGV. Each individual point depicts the FR-based threshold for one unit. Large black circles indicate the mean. *p < 0.05; **p < 0.01. ACX data were adapted from Caras and Sanes (2017), with permission.
We then asked whether measurable threshold values differed across brain regions. Because most units were insensitive to AM depth during passive sound exposure, we restricted this analysis to thresholds obtained during task performance. As shown in Figure 6B, AM thresholds differed across brain regions (χ2(2) = 12.26; p = 0.0022). Thresholds were significantly lower in the ACX than in the MGV (t(366) = −3.19; p = 0.0046) and ICC (t(366) = −2.49; p = 0.0396). There was no significant difference between thresholds in ICC and MGV (t(366) = −0.058; p = 1). Together, these results suggest that the influence of behavioral context grows stronger across the ascending auditory hierarchy.
Task performance does not drive a systematic shift in the VScc-based AM sensitivity of ICC or MGV neurons
In addition to FR, context-dependent changes in phase locking can also contribute to enhanced sound sensitivity (Niwa et al., 2012). We therefore asked whether task performance affects the AM sensitivity of ICC or MGV neurons when sensitivity is estimated from neural phase locking. To do so, we calculated the VScc for each stimulus and each unit, converted the VScc values into d′ values, fit the data, and estimated neurometric thresholds from the fits. For these analyses, we only included single-units, pooled across days to maximize our statistical power.
When we examined VScc-based sensitivity in the ICC, over half of the single-units (120/217, 55%) never exhibited a measurable AM depth threshold. The remainder (n = 97/217; 45%) were sensitive to AM depth during at least one behavioral context. In general, these neurons followed one of three distinct response patterns.
The smallest group of neurons (12/97, ∼12%) was insensitive to AM during passive sound exposure but was sensitive during the task. As illustrated for one representative neuron in Figure 7A–C, this task-dependent improvement was typically very weak, often driven by a subtle drop in trial-by-trial VScc variability at the highest AM depth presented (e.g., note the smaller error bars at 0 dB during task performance in Fig. 7B, leading to a higher VScc-based d′ value in Fig. 7C).
Task performance does not drive a systematic shift in the VScc-based AM sensitivity of ICC neurons. A, Peristimulus time histograms and raster plots for one representative ICC neuron that was only sensitive to AM depth during the task. B, Mean ± SEM VScc values across AM depths and during unmodulated noise for the same neuron as depicted in A. C, VScc-based neurometric functions for the same neuron as depicted in panels A and B. D, PSTHs and raster plots for one representative ICC neuron sensitive to AM depth only during passive sound exposure sessions. E, Mean ± SEM VScc values across AM depths and during unmodulated noise for the same neuron as depicted in D. F, VScc-based neurometric functions for the same neuron as depicted in panels D and E. G, PSTHs and raster plots for one representative ICC neuron sensitive to AM depth during the task and during passive sound exposure. H, Mean ± SEM VScc values across AM depths and during unmodulated noise for the same neuron as depicted in G. I, VScc-based neurometric functions for the same neuron as depicted in panels G and H. J, VScc-based thresholds for all ICC neurons sensitive to AM depth during task performance and at least one passive sound exposure session. Each unmeasurable threshold is NaN (see Materials and Methods). Dashed lines connect thresholds from the same unit. Thick black lines connect the mean of measurable thresholds across contexts. Representative neuron depicted in G–I is highlighted in yellow. K, VScc-based thresholds for all ICC neurons that were sensitive to AM during at least one behavioral context. Plot conventions as in panel J. Representative neurons are highlighted in the same colors as in previous panels.
A second group of neurons was AM-sensitive during passive sound exposure but insensitive during the task. Although this pattern was evident in a larger share of the ICC population (39/97 neurons, ∼40%), the task-dependent change was again often relatively subtle, as depicted for a representative neuron in Figure 7D–F.
The final group of neurons, which made up the largest share of AM-responsive units (46/97, ∼47%), was sensitive during the task and during at least one passive sound exposure session. As illustrated for a single neuron in Figure 7G, and quantified in Figure 7, H and I, these neurons tended to exhibit robust changes in phase locking across AM depths but seemed to be largely unaffected by task performance. As a result, their thresholds remained constant across behavioral contexts (χ2(2) = 3.77; p = 0.1521; Fig. 7J).
As a result of this heterogeneity (and the relatively weak effect of context, when present), the average VScc-based thresholds across the entire ICC population did not significantly differ across contexts (χ2(2) = 3.02; p = 0.2206; Fig. 7K).
We next looked at how task performance affects the VScc-based AM sensitivity of MGV neurons. A minority of MGV neurons (n = 91/606; 15%) were AM sensitive under at least one behavioral context, and as in the ICC, three distinct response patterns were observed.
One group of neurons was insensitive to AM during passive sessions but sensitive during the task. Similar to the ICC, this group was the smallest, consisting of 18/91 neurons (20% of the population), and typically exhibited relatively weak improvements during the task, often manifesting only when fully modulated noise was presented. This point is illustrated for a single neuron in Figure 8A–C.
Task performance does not drive a systematic shift in the VScc-based AM sensitivity of MGV neurons. A, Peristimulus time histograms and raster plots for one representative MGV neuron that was only sensitive to AM depth during the task. B, Mean ± SEM VScc values across AM depths and during unmodulated noise for the same neuron as depicted in A. C, VScc-based neurometric functions for the same neuron as depicted in panels A and B. D, PSTHs and raster plots for one representative MGV neuron sensitive to AM depth only during passive sound exposure sessions. E, Mean ± SEM VScc values across AM depths and during unmodulated noise for the same neuron as depicted in D. F, VScc-based neurometric functions for the same neuron as depicted in panels D and E. G, PSTHs and raster plots for one representative MGV neuron sensitive to AM depth during the task and passive sound exposure. H, Mean ± SEM VScc values across AM depths and during unmodulated noise for the same neuron as depicted in G. I, VScc-based neurometric functions for the same neuron as depicted in panels G and H. J, VScc-based thresholds for all MGV neurons sensitive to AM depth during task performance and at least one passive session. Each unmeasurable threshold is NaN (see Materials and Methods). Dashed lines connect thresholds from the same unit. Thick black lines connect the mean of measurable thresholds across contexts. Representative neuron depicted in G–I is highlighted in yellow. K, VScc-based thresholds for all MGV neurons that were sensitive to AM during at least one behavioral context. Plot conventions as in panel J. Representative neurons are highlighted in the same colors as in previous panels.
A second group of neurons was AM sensitive during at least one passive sound exposure session but insensitive during the task. As in the ICC, these neurons made up a larger share of the population (36/91 neurons, 40%) and tended to exhibit relatively small context-dependent changes, as can be seen for the representative neuron depicted in Figure 8D–F.
A final group of MGV neurons (37/91, 41%) was sensitive during at least one passive session and during task performance. As shown for a single neuron in Figure 8G, and quantified in Figure 8, H and I, these neurons typically exhibited stronger phase locking as the AM depth increased but were insensitive to behavioral context. As a result, their VScc-based thresholds did not significantly change during task performance (χ2(2) = 4.10; p = 0.1285; Fig. 8J).
The heterogeneity across the entire MGV population, coupled with the relatively weak effect of task performance (when it was present at all), meant that on average, VScc-based thresholds in the MGV did not significantly differ across contexts (χ2(2) = 3.10; p = 0.2120; Fig. 8K), similar to what we found in the ICC.
Taken together, these findings suggest that in both ICC and MGV neurons, task performance does not drive a systematic shift in phase-locked–based AM detection.
Context-dependent shifts in FR-based and VScc-based AM sensitivity largely occur independently
Despite the lack of a systematic effect of behavioral context on VScc-based thresholds, some individual neurons did exhibit task-dependent improvement (Figs. 7A–C, 8A–C) or worsening (Figs. 7D–F, 8D–F). This observation motivated us to gain a better understanding of the relationship between VScc-based and FR-based context-dependent shifts in sensitivity within individual neurons. To do so, for each metric, we calculated the difference between thresholds obtained during the task and thresholds obtained during the pre-passive sound exposure session. Because we were interested in quantifying the magnitude and direction of change across contexts, we replaced unmeasurable thresholds with 0 dB re: 100%, the highest possible AM depth presented. We then created population vector plots for each brain region (Fig. 9A,B). In each plot, each vector (gray arrows) represents the direction and magnitude of the FR-based and VScc-based threshold shift for an individual neuron. We restricted this analysis to single units that exhibited a measurable AM threshold during the pre-passive or task context. In the ICC, 92/217 single units met this criterion and were included in the analysis. In the MGV, 144/606 single units were included.
Context-dependent shifts in FR-based and VScc-based thresholds are largely independent. A, Each vector (gray arrow) represents the direction and magnitude of the VScc-based and FR-based threshold shifts for an individual ICC neuron. Shifts were calculated as the difference between thresholds obtained during the task and thresholds obtained during the pre-passive sound exposure session. Negative shifts reflect instances when the threshold was lower (better) during task performance. Black line represents vector mean. B, Same as panel A, but for MGV neurons. C, FR-based thresholds from ICC neurons obtained during task performance are plotted against VScc-based thresholds obtained during the pre-passive sound exposure session. Each circle represents data from one single unit. Each unmeasurable threshold is considered NaN and plotted separately to the right of the x axis. Neurons with a measurable VScc task threshold are outlined in black. D, Same as panel C, but for MGV neurons. *p < 0.05; ***p < 0.0001.
In the ICC, the majority of neurons (76/92, ∼83%) shifted their AM sensitivity along a single dimension, exhibiting a context-dependent change in their FR-based or VScc-based threshold, but not both. These neurons fall along the x and y axes in Figure 9A. A smaller subset, 16/92 (∼17%), changed along both dimensions. Overall, there was no systematic relationship between the direction of the VScc-based and FR-based shifts. While most of these neurons improved their FR-based threshold from pre-passive to task sessions (Rayleigh test statistic, 0.1311; p = 0.0376), they were equally likely to improve or worsen along the VScc-based dimension (Rayleigh test statistic, −0.0977; p = 0.9047).
Similar results were observed in the MGV. Most MGV neurons (131/144, 91%) shifted their AM sensitivity along a single dimension, and the remainder (13/144, 9%) shifted along both (Fig. 9B). Nearly all neurons that shifted their FR-based sensitivity improved during the task (Rayleigh test statistic, 0.5604; p < 0.0001). VScc-based AM sensitivity was also slightly more likely to improve (Rayleigh test statistic, 0.1320; p = 0.0125). Together, these data suggest that the mechanisms that drive context-dependent shifts in rate-based and phase-locked–based sensitivity operate largely independently.
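The Rayleigh test statistics reported above can take negative values, which is consistent with the form of the Rayleigh test that specifies a hypothesized mean direction (often called the V test). The Python sketch below implements that form with a large-sample normal approximation. It is an illustration under assumptions, not the R routine used for the analysis: the choice of hypothesized direction, the approximation, and the function names are all assumptions, and n is taken to be the number of single units included from each region.

```python
import math

def v_test_p(r0_bar, n):
    # Under uniformity, sqrt(2n) * r0_bar is approximately standard normal.
    # The one-sided p value is the upper tail, 1 - Phi(z).
    z = math.sqrt(2.0 * n) * r0_bar
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def v_test(angles, mu):
    # angles: direction (radians) of each neuron's threshold-shift vector.
    # mu: hypothesized direction, e.g., the direction of FR-based improvement.
    # r0_bar is the mean projection of unit vectors onto mu, so it is negative
    # when vectors point away from mu on average.
    n = len(angles)
    r0_bar = sum(math.cos(a - mu) for a in angles) / n
    return r0_bar, v_test_p(r0_bar, n)

print(round(v_test_p(0.1320, 144), 4))  # MGV, VScc-based shifts
print(round(v_test_p(0.1311, 92), 4))   # ICC, FR-based shifts
```

Under these assumptions, the statistics 0.1320 (n = 144) and 0.1311 (n = 92) correspond to p values of roughly 0.0125 and 0.038, close to the reported values. The small-sample correction terms that statistical packages may apply are omitted here.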
In a subset of neurons, AM detection is supported by changes in both FR and phase locking during task performance
Many ICC and MGV neurons exhibited a measurable FR-based threshold exclusively during task performance (Figs. 4D, 5D). This observation raised two possibilities. First, these neurons may be largely insensitive to AM during passive exposure and only respond to small AM depths when they gain behavioral relevance. Alternatively, these neurons may in fact detect small AM depths during passive sound exposure but do so by phase locking rather than increasing their FR. This latter idea was inspired by neurons like the one depicted in Figure 5A–C (and again in Fig. 8D–F). For this MGV neuron, FR-based sensitivity was notably better during the task than during passive sound exposure (Fig. 5C); however, its VScc-based sensitivity was better during passive sound exposure than during the task (Fig. 8F).
To explore this idea more fully, we compared FR-based thresholds obtained during task performance with VScc-based thresholds obtained during the pre-passive sound exposure period. For this analysis, we only included single units that exhibited a measurable FR-based threshold during the task, pooled across days to maximize our statistical power.
As illustrated in Figure 9C, just over half of the ICC neurons that exhibited a measurable FR-based threshold during the task (13/20) exhibited a measurable VScc-based threshold during the pre-passive sound exposure session. Of these neurons, the majority (10/13, ∼77%) also had measurable VScc-based thresholds during task performance (indicated by the outlined circles in Fig. 9C). In the MGV, a much smaller proportion of neurons that exhibited a measurable FR-based threshold during the task exhibited a measurable VScc-based threshold during pre-passive sound exposure (11/84, ∼13%, Fig. 9D). Of those that did, we found that a majority (8/11, ∼73%) also exhibited measurable VScc-based thresholds during task performance. Together, these findings suggest that while some neurons, particularly those in the MGV, are only sensitive to AM depth when AM sounds are behaviorally relevant, others are always sensitive, but switch from relying solely on phase locking during passive sound exposure to using a combination of phase locking and overall spike rate when performing a task. Moreover, these data indicate that the reliance on a rate-based AM detection strategy increases as information moves up the ascending auditory pathway, consistent with previous literature (Joris et al., 2004; Bartlett and Wang, 2007; Wang et al., 2008).
Subcortical sensitivity to sound gradually improves across perceptual learning
During perceptual training, subjects improve their ability to detect or discriminate increasingly difficult stimulus cues over several days or weeks. These perceptual enhancements are accompanied by training-induced changes in stimulus representations within the ACX (Recanzone et al., 1993; Witte and Kipke, 2005; Polley et al., 2006; Alain et al., 2007; van Wassenhove and Nagarajan, 2007; Caras and Sanes, 2017, 2019). Whether neurons in subcortical auditory structures undergo similar changes remains uncertain. To address this issue, we estimated the AM sensitivity of ICC and MGV neurons as animals learned to detect smaller and smaller AM depths.
Perceptual training improves behavioral AM depth sensitivity
We first assessed behavioral AM detection thresholds across 7 d of perceptual training with a range of AM depths. Figure 10, A and B, depicts data from a representative animal, illustrating improved psychometric performance across training. As expected from prior work (Caras and Sanes, 2015, 2017, 2019; Mowery et al., 2019), behavioral thresholds significantly decreased across days. This result was true both for animals used for ICC recordings (χ2(1) = 21.81; p < 0.0001; Fig. 10C) and for MGV recordings (χ2(1) = 14.99; p = 0.0001; Fig. 10D). These data confirm that animals underwent perceptual learning on our AM depth detection task.
Perceptual training improves behavioral AM depth thresholds. A, Psychometric fits from one representative animal across 7 d of perceptual training. B, Behavioral thresholds from the same animal depicted in A. C, D, Mean ± SEM behavioral thresholds for all animals implanted with electrodes in the ICC (N = 6, C) or MGV (N = 5, D).
Perceptual training reduces the likelihood that units are AM-sensitive in the MGV, but not the ICC
We then asked whether perceptual training affects the likelihood that a given unit is sensitive to AM depth. As shown in Table 3, the proportion of ICC units exhibiting measurable FR-based and VScc-based thresholds stayed relatively constant across training (FR, χ2(1) = 1.33; p = 0.2495; VScc, χ2(1) = 2.09; p = 0.1486). However, in the MGV, there was a significant effect of training. The proportion of units exhibiting measurable FR-based thresholds and VScc-based thresholds steadily decreased over the course of several days (FR, χ2(1) = 17.51; p < 0.0001; VScc, χ2(1) = 24.42; p < 0.0001). This observation suggests that the AM-sensitive network within the MGV may become sparser during perceptual training. This finding is reminiscent of a prior report that associative learning enhances sparse population coding in the sensory cortex (Gdalyahu et al., 2012).
Perceptual training improves the FR-based AM sensitivity of ICC and MGV neurons
We next asked whether perceptual training had any effect on FR-based thresholds measured during task performance. As shown in Figure 11A, ICC thresholds significantly decreased (improved) over the course of perceptual training (χ2(1) = 6.85; p = 0.0089). Neural and behavioral improvement occurred in tandem, with the average sensitivity of ICC neurons closely tracking average perceptual sensitivity across training days. To explore this relationship further, we asked whether average FR-based thresholds predicted behavioral thresholds within individual animals by calculating the Spearman rank correlation coefficient for each subject. Although this analysis did not reveal any significant correlations, the data were likely underpowered, as we typically did not have data from every day for every animal. We therefore asked whether a correlation was present when we included data from all subjects. In Figure 11B, each datapoint represents a subject's behavioral threshold and simultaneously collected average FR-based threshold for a given training day. Datapoints of the same color were collected from the same animal. As shown, a moderate correlation between ICC FR-based thresholds and behavioral thresholds was observed (r = 0.65; p = 0.0029).
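The per-animal and pooled correlation analyses described above can be sketched as follows. This is a minimal illustration rather than the analysis code used for the study: the threshold values are hypothetical, the function names are illustrative, and a Spearman rank correlation is assumed for both the per-animal and the pooled case (the text names Spearman only for the per-animal analysis).

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical per-day thresholds (dB re: 100% depth); not data from the study.
sessions = {
    "animal_1": {"neural": [-6.0, -8.5, -9.0, -12.0],
                 "behavior": [-7.0, -9.0, -11.5, -13.0]},
    "animal_2": {"neural": [-5.0, -9.5, -8.0],
                 "behavior": [-8.0, -10.0, -12.0]},
}

# One coefficient per animal (few points each, hence low power).
per_animal = {name: spearman_rho(d["neural"], d["behavior"])
              for name, d in sessions.items()}

# Pooled across animals: every (animal, day) pair is one data point.
pooled_neural = [v for d in sessions.values() for v in d["neural"]]
pooled_behavior = [v for d in sessions.values() for v in d["behavior"]]
pooled_rho = spearman_rho(pooled_neural, pooled_behavior)
```

With only a handful of sessions per animal, a per-animal coefficient rests on very few points, which is why pooling across subjects increases statistical power.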
FR-based thresholds of ICC and MGV neurons improve during perceptual training. A, Mean ± SEM FR-based ICC thresholds decrease across perceptual training. Thresholds were estimated from AM-sensitive units during task performance (n = 28 units; 2–7/day). Behavioral thresholds from Figure 10 are replotted here for comparison with neural data. B, FR-based neural thresholds in the ICC significantly correlate with behavioral thresholds. Each data point represents the average FR-based threshold obtained from a single animal on a single day during task performance, plotted against the animal's behavioral threshold from that same session. Each color represents data from an individual subject. C, Mean ± SEM FR-based MGV thresholds decrease across perceptual training. Thresholds were estimated from AM-sensitive units during task performance (n = 115 units; 2–31/day). Behavioral thresholds from Figure 10 are replotted here for comparison with neural data. D, FR-based neural thresholds in the MGV significantly correlate with behavioral thresholds. Plot conventions are as in panel B.
Next, we examined whether perceptual training affects the FR-based AM sensitivity of MGV neurons. In general, we found the same pattern of results as reported for the ICC, where FR-based thresholds of MGV neurons improved over the course of training (χ2(1) = 65.86; p < 0.0001; Fig. 11C). The rate of improvement [−6.57 dB/log(day)] was similar to that observed in the ICC [−5.29 dB/log(day)]. We then again asked whether average FR-based thresholds predicted behavioral thresholds in individual animals. Spearman rank correlation coefficients were only significant in two of five animals. However, when we included data from all subjects (for the same reasons described above), we found that MGV FR-based thresholds and behavioral thresholds were significantly correlated (r = 0.68; p = 0.0001; Fig. 11D). Collectively, these results indicate that rate-based representations of AM stimuli in the ICC and MGV significantly improve during perceptual training and are associated with corresponding improvements in AM depth perception.
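A rate of improvement expressed in dB/log(day) is the slope of threshold regressed on the logarithm of training day. The sketch below illustrates that calculation with an ordinary least-squares fit. It makes several assumptions not stated in this section: a base-10 logarithm, a simple fit to day-wise values rather than the statistical model used for the reported tests, and hypothetical thresholds constructed to lie exactly on a line.

```python
import math

def slope_per_log_day(days, thresholds_db):
    """Least-squares slope and intercept of threshold (dB) vs log10(day)."""
    x = [math.log10(d) for d in days]
    n = len(x)
    mx = sum(x) / n
    my = sum(thresholds_db) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, thresholds_db))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

days = [1, 2, 3, 4, 5, 6, 7]
# Hypothetical thresholds that improve by exactly 6 dB per log10(day),
# starting from -5 dB on Day 1; not data from the study.
thresholds = [-5.0 - 6.0 * math.log10(d) for d in days]

slope, intercept = slope_per_log_day(days, thresholds)
print(f"slope = {slope:.2f} dB/log(day), Day 1 threshold = {intercept:.2f} dB")
```

A negative slope indicates that thresholds decrease (improve) with training, and the log scale reflects that most of the improvement occurs on the early days.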
Perceptual training improves the VScc-based AM sensitivity of MGV, but not ICC neurons
Improvements in neural phase locking to auditory stimuli may also underlie longer-term learning (Bao et al., 2004; Batterink, 2020), particularly in subcortical regions, where temporal representations of sound stimuli are more prevalent (Krishna and Semple, 2000; Joris et al., 2004; Bartlett and Wang, 2007; Wang et al., 2008). We therefore asked whether perceptual training had any effect on VScc-based thresholds measured during task performance.
In the ICC, average VScc-based thresholds appeared to improve with training [−3.48 dB/log(day)], but the effect was not significant (χ2(1) = 3.66; p = 0.0556; Fig. 12A). We then asked whether average VScc-based thresholds predicted behavioral threshold within individual animals. We found a significant correlation in only one of five subjects. When we included data from all subjects to increase our statistical power, we found a significant, but weak, correlation between VScc-based and behavioral thresholds (r = 0.44; p = 0.0203; Fig. 12B).
VScc-based thresholds of ICC and MGV neurons improve during perceptual training. A, Mean ± SEM VScc-based ICC thresholds decrease across perceptual training. Thresholds were estimated from AM-sensitive units during task performance (n = 58 units; 3–14/day). Behavioral thresholds from Figure 10 are replotted here for comparison with neural data. B, VScc-based neural thresholds in the ICC significantly correlate with behavioral thresholds. Each data point represents the average VScc-based threshold obtained from a single animal on a single day during task performance, plotted against the animal's behavioral threshold from that same session. Each color represents data from an individual subject. C, Mean ± SEM VScc-based MGV thresholds decrease across perceptual training. Thresholds were estimated from AM-sensitive units during task performance (n = 55 units; 1–19/day). Behavioral thresholds from Figure 10 are replotted here for comparison with neural data. D, VScc-based neural thresholds in the MGV significantly correlate with behavioral thresholds. Plot conventions are as in panel B.
In contrast to the ICC, VScc-based thresholds in the MGV significantly improved across training (−7.97 dB/log(day), χ2(1) = 22.51; p < 0.001; Fig. 12C). Despite this result, VScc-based MGV thresholds did not correlate with behavioral thresholds among any of the individual subjects. When we again included data from all subjects, however, we observed a moderate correlation (r = 0.68; p = 0.0025; Fig. 12D).
These data indicate that phase-locked representations of AM stimuli (particularly those in the MGV) improve during perceptual training. However, VScc-based neural thresholds are not as strongly correlated with behavioral thresholds as FR-based thresholds and thus offer less explanatory power.
Task-dependent improvements in neural sensitivity increase during perceptual training
Perceptual training strengthens the influence of behavioral context on FR-based AM sensitivity in the MGV and ACX, but not the ICC
ACX neurons are more sensitive to AM stimuli when subjects perform an AM detection task than during passive sound exposure, and the magnitude of this enhancement increases over the course of perceptual training (Caras and Sanes, 2017). We found that ICC and MGV units exhibit context-dependent enhancements in AM sensitivity that mirror those observed in the ACX (Figs. 4–6). Whether the magnitude of these context-dependent enhancements similarly increases over the course of perceptual training remains unknown.
To address this issue, we first asked whether there was a significant interaction between behavioral context and perceptual training day on FR-based thresholds in the ICC or the MGV. As reported earlier, the majority of units only exhibited measurable FR-based thresholds during task performance (Fig. 6A). This observation left us with the question of how to treat unmeasurable (NaN) thresholds. Excluding these units from our analysis entirely would cause us to grossly underestimate the effect of behavioral context; an unmeasurable threshold is still meaningful, as it indicates an instance when a unit was not sensitive to AM depth. We therefore set NaN values to 0 dB, the highest possible AM depth. By doing so, we were able to generate a conservative estimate of whether and how the effect of behavioral context changes over the course of perceptual training.
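The effect of this substitution can be illustrated with a short sketch. The values and function names below are hypothetical. The sketch assumes that units are retained if they have at least one measurable threshold, and that unmeasurable thresholds are set to 0 dB before averaging. It is a simplified illustration, not the statistical model used for the reported tests.

```python
import math

CEILING_DB = 0.0  # 0 dB corresponds to the highest possible AM depth

def fill_unmeasurable(thresholds_db, ceiling=CEILING_DB):
    """Replace NaN (unmeasurable) thresholds with the ceiling value."""
    return [ceiling if math.isnan(t) else t for t in thresholds_db]

def mean(values):
    return sum(values) / len(values)

def context_effect_filled(pre_passive, task):
    """Mean task-minus-passive threshold difference after NaN -> 0 dB.

    Units with no measurable threshold in either context are dropped.
    """
    kept = [(p, t) for p, t in zip(pre_passive, task)
            if not (math.isnan(p) and math.isnan(t))]
    pre = fill_unmeasurable([p for p, _ in kept])
    tsk = fill_unmeasurable([t for _, t in kept])
    return mean(tsk) - mean(pre)

def context_effect_complete_cases(pre_passive, task):
    """Same difference, but using only units measurable in both contexts."""
    kept = [(p, t) for p, t in zip(pre_passive, task)
            if not (math.isnan(p) or math.isnan(t))]
    return mean([t for _, t in kept]) - mean([p for p, _ in kept])

nan = float("nan")
# Hypothetical thresholds (dB re: 100% depth) for four units; not study data.
pre_passive = [nan, -4.0, nan, nan]
task = [-10.0, -8.0, -12.0, nan]

filled = context_effect_filled(pre_passive, task)            # about -8.67 dB
complete = context_effect_complete_cases(pre_passive, task)  # -4.0 dB
```

In this toy example, restricting the comparison to units that are measurable in both contexts discards the two units that became AM-sensitive only during the task and roughly halves the apparent effect of context.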
As illustrated in Figure 13A, and as expected from the analysis presented in Figure 4D, FR-based thresholds in the ICC were modulated by behavioral context (χ2(2) = 16.35; p = 0.0003). This effect remained relatively constant across time, however, such that there was no significant interaction between context and perceptual training day (χ2(2) = 1.42; p = 0.4910).
Task-dependent improvements in FR-based sensitivity increase with training in the MGV and ACX. A, FR-based ICC thresholds are lower (better) during task performance than passive sound exposure, but the magnitude of this context-dependent enhancement does not change across perceptual training. Data are from 41 units; 2–10/day (Table 3). B, FR-based MGV thresholds are lower (better) during task performance than passive sound exposure, and the magnitude of this context-dependent enhancement increases across perceptual training. Data are from 117 units; 2–32/day (Table 3). C, FR-based ACX thresholds are lower (better) during task performance than passive sound exposure, and the magnitude of this context-dependent enhancement increases across perceptual training. Data are from 232 units; 29–28/day, adapted from Caras and Sanes (2017). D, VScc-based ICC thresholds decrease across perceptual training but exhibit no effect of behavioral context, and no interaction between context and day. Data are from 97 units; 9–24/day (Table 3). E, VScc-based MGV thresholds decrease across perceptual training but exhibit no effect of behavioral context and no interaction between context and day. Data are from 91 units; 1–30/day (Table 3). All data are depicted as means ± SEMs. Thresholds were estimated from units that had at least one measurable AM threshold. Unmeasurable thresholds were replaced with a value of 0 for statistical analysis (see Materials and Methods). ***p < 0.0001.
In the MGV, we also found a strong effect of behavioral context (χ2(2) = 1166.65; p < 0.0001). As expected from the data presented in Figure 5D, thresholds were lower (better) during task performance. However, this effect grew stronger over time, resulting in a significant interaction between behavioral context and perceptual training day (χ2(2) = 63.55; p < 0.0001; Fig. 13B). By comparing the difference between task and pre-passive thresholds, we found that the effect of context was larger on Day 7 (Cohen's d = 3.17) than on Day 1 (Cohen's d = 2.10).
The original Caras and Sanes (2017) paper reported a significant interaction between behavioral context and perceptual training on FR-based thresholds, but did not replace unmeasurable (NaN) thresholds with 0, as we did here. To facilitate a more direct comparison across the auditory hierarchy, we reanalyzed that dataset (n = 219 multi- and 13 single units), replacing NaN values with 0. As expected, we found a significant effect of context on FR-based thresholds (χ2(2) = 2679.54; p < 0.0001). As in the MGV, this effect grew larger over the course of training (χ2(2) = 134.43; p < 0.0001; Fig. 13C), such that the difference between task and pre-passive thresholds was larger on Day 7 (Cohen's d = 3.70) than on Day 1 (Cohen's d = 1.43). As a result, the act of task performance had a much larger effect on both MGV and ACX thresholds at the end of perceptual training than at the beginning.
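For readers unfamiliar with the effect size used above, Cohen's d expresses the difference between two group means in units of their pooled standard deviation. The sketch below shows one common formulation. The particular variant used for the reported values is not specified in this section, so the pooled-SD form, the sign convention (pre-passive minus task), and the example thresholds are all assumptions.

```python
import math

def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled SD."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    # Unbiased sample variances.
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / pooled_sd

# Hypothetical thresholds (dB re: 100% depth); not data from the study.
pre_passive = [0.0, -2.0, -1.0, -3.0]
task = [-6.0, -8.0, -7.0, -9.0]

# Positive d means thresholds were lower (better) during the task.
d = cohens_d(pre_passive, task)
print(f"Cohen's d = {d:.2f}")
```

Because d is standardized, it allows the size of the context effect to be compared across training days and across brain regions that differ in threshold variability.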
Perceptual training does not affect the influence of behavioral context on VScc-based AM sensitivity in the ICC or MGV
Our prior analyses revealed that behavioral context does not have a systematic effect on VScc-based thresholds in the ICC and MGV (Figs. 7, 8). These analyses were performed by pooling data across days, which would obscure any effect of training. We therefore asked whether there was a significant interaction between behavioral context and perceptual training day on VScc-based thresholds in the ICC. We found that neither day (χ2(1) = 3.36; p = 0.0666) nor context (χ2(2) = 3.31; p = 0.1911) affected ICC thresholds and found no interaction between day and context (χ2(2) = 1.92; p = 0.3826; Fig. 13D).
We next examined VScc-based thresholds in the MGV. The results are similar to those obtained from the ICC. While VScc-based MGV thresholds significantly improved across perceptual training days (χ2(1) = 11.35; p = 0.0008), neither the effect of context (χ2(2) = 1.94; p = 0.3780) nor the interaction between day and context (χ2(2) = 0.30; p = 0.8608) were significant (Fig. 13E).
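The p values that accompany the χ2 statistics above can be checked from the chi-square survival function, which has a closed form for one and two degrees of freedom: erfc(√(x/2)) for df = 1 and exp(−x/2) for df = 2. The sketch below applies these identities to statistics quoted in the text. Small discrepancies in the last digit are expected because the statistics are reported to two decimal places.

```python
import math

def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square statistic for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("closed form implemented only for df = 1 or 2")

# Statistics quoted above for VScc-based ICC thresholds.
p_day = chi2_sf(3.36, 1)          # reported p = 0.0666
p_context = chi2_sf(3.31, 2)      # reported p = 0.1911
p_interaction = chi2_sf(1.92, 2)  # reported p = 0.3826

for label, p in [("day", p_day), ("context", p_context),
                 ("day x context", p_interaction)]:
    print(f"{label}: p = {p:.4f}")
```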
Discussion
Adaptive changes in the sensitivity of ACX neurons underlie many aspects of perceptual plasticity, but the contribution of subcortical auditory stations to this process is less understood. Here we found that in the ICC and MGV, neuronal sensitivity to AM, a temporal cue that supports speech perception, is better when subjects perform an AM detection task compared with when subjects are passively exposed to AM sounds. This context-dependent effect is largely mediated by changes in the spike rate rather than stimulus phase locking and grows stronger over the course of perceptual training, particularly in the MGV, mirroring results in the ACX (Niwa et al., 2012; Caras and Sanes, 2017). These findings raise the possibility that aspects of flexible sound processing observed in the ACX may actually be inherited from subcortical auditory regions.
Effects of behavioral context on subcortical sound processing
Prior studies examining the effect of behavioral context on subcortical sound processing have yielded conflicting results. While some studies found that behavioral relevance affects IC and MGN sound-evoked responses (Ryan and Miller, 1977; Ryan et al., 1984; Franceschi and Barkat, 2021; Shaheen et al., 2021), others reported the opposite (Otazu et al., 2009; Slee and David, 2015; Williamson et al., 2015; Rocchi and Ramachandran, 2020). This confusion likely stems from two factors. First, the majority of prior studies did not specify which IC and MGN subdivisions were sampled (Ryan et al., 1984; Metzger et al., 2006; Otazu et al., 2009; McGinley et al., 2015b; Franceschi and Barkat, 2021; Saderi et al., 2021). Second, few studies have examined the effect of behavioral context on IC or MGN responses to near-threshold stimuli and those that did stopped short of quantifying signal detectability or discriminability (Ryan and Miller, 1977; Ryan et al., 1984). Here, we targeted first-order midbrain and thalamic subdivisions and used a signal detection framework to explore how behavioral context impacts neurometric sensitivity to a stimulus cue with real-world relevance: AM depth. Our results demonstrate an unambiguous effect of context: the ICC and MGV exhibit rapid improvements in FR-based thresholds during task performance compared with periods of passive sound exposure.
Although we did not observe a similar effect of context on phase-locked–based thresholds, we cannot rule out the possibility that other aspects of the neuronal response, like spike timing reliability, are context-dependent. This possibility is particularly intriguing given that the temporal resolution required for optimal amplitude encoding is sensitive to certain conditions, like seasonal and hormonal status (Caras et al., 2015). Additional experiments are needed to explore this possibility more fully and to determine whether our current results generalize to other stimuli or to reward-based tasks (David et al., 2012).
Temporal dynamics of task-dependent changes in sound processing
Two previous studies reported that in a subset of ICC neurons, rapid task-dependent changes in activity persist for several minutes after task performance ends (Slee and David, 2015; Saderi et al., 2021). Similar results have been documented in MGN neurons, with altered firing and receptive fields persisting for at least an hour after a bout of classical conditioning (Edeline and Weinberger, 1991, 1992; McEchron et al., 1996). Here, we found no evidence of such persistence in the ICC or MGV; task-dependent improvements in FR-based sensitivity generally reverted back to baseline levels immediately after task performance. This apparent discrepancy may be explained by differences in species, behavioral paradigm, sound stimuli, and measured outcome variable.
Regardless, previous work using the same species, paradigm, and stimuli as this study found that in the ACX, task-dependent improvements in AM sensitivity do persist after the task is over (Caras and Sanes, 2017). This observation, together with our findings, suggests that separate mechanisms with distinct temporal dynamics may operate simultaneously to shape context-dependent changes in AM sensitivity in auditory subcortical and cortical regions. Moreover, the fact that ACX persistence is often incomplete, with neural sensitivity reverting slightly, but not completely, back to pre-task levels during the post-passive period raises the possibility that context-dependent shifts in ACX thresholds may reflect a combination of rapid transient mechanisms operating subcortically and local mechanisms operating within the ACX that decay with a longer time constant. Additional studies that examine the influence of behavioral context on cortical and subcortical neurometric thresholds in other species and with different stimuli and task parameters are needed to test the generalizability of this idea.
Models for perceptual learning
Several models have been put forth to explain perceptual learning, including internal noise reduction (Dosher and Lu, 1998, 1999; Jones et al., 2013), sensory reweighting to improve decision-making (Law and Gold, 2008), and various reward- and attention-based theories (Seitz and Dinse, 2007; Roelfsema et al., 2010; Kim et al., 2015). One long-standing model is reverse hierarchy theory, which suggests that plasticity first occurs within higher-order sensory brain regions, and changes in lower levels of the sensory pathway only emerge later in training (Ahissar and Hochstein, 2004). Despite this model's prominence, it remains difficult to determine how well it describes the existing data, particularly in the auditory system. Part of this difficulty is due to the fact that relatively few studies have examined the contribution of subcortical regions to perceptual learning. Of the studies that have, all used tools that lack temporal or spatial precision, and most have restricted their analyses to time points before and after training (Carcagno and Plack, 2011; Lau et al., 2017; Reetzke et al., 2018; MacLean et al., 2024). This approach obscures the temporal relationship between training-induced changes in neural function and perception and is problematic given that the neuroplasticity that accompanies learning is often transient, renormalizing once behavioral performance stabilizes (Muellbacher et al., 2001; Floyer-Lea et al., 2006; Yotsumoto et al., 2008; Sampaio-Baptista et al., 2015; Sarro et al., 2015; Frank et al., 2018).
We recorded from the ICC and MGV while subjects actively trained on a multiday perceptual learning task. Three notable findings emerged. First, neuronal sensitivity to the target AM sound improved over the course of training in both regions. Second, the rate of improvement was relatively similar in the ICC and MGV and largely comparable with the rate of improvement previously observed in the ACX (Caras and Sanes, 2017). Finally, neural and behavioral sensitivity improved in tandem, emerging over similar time courses and to similar degrees. Taken together, our results do not support the reverse hierarchy theory.
Another key finding from our experiments was an interaction between context-dependent and learning-dependent plasticity, such that task performance has a greater effect on rate-based AM sensitivity at the end of training than at the beginning. This effect was absent in the ICC but was robust in the MGV. Similar observations in the ACX previously led us to propose that non-sensory processes drive context-dependent changes in the ACX, and training facilitates perceptual learning by increasing the strength of these modulations (Caras and Sanes, 2017). Our current data suggest that our observations in the ACX may partially result from modulatory processes acting on subcortical regions in the ascending auditory pathway.
Neural mechanisms driving subcortical plasticity
While we demonstrate that ICC and MGV neurons exhibit context-dependent and learning-related plasticity, the neural mechanisms underlying these changes remain uncertain. Context-dependent changes in activity have been noted as early as the auditory nerve (Delano et al., 2007; Gehmacher et al., 2022) and cochlear nuclei (Oatman, 1971; Ryan et al., 1984), and population-level activity from cochlear nucleus neurons can predict behavioral detection thresholds (Mackey et al., 2023). Plasticity in these early regions is likely mediated by descending inputs from the superior olivary complex; indeed, medial olivocochlear bundle activity increases during perceptual training on a phoneme-in-noise discrimination task (de Boer and Thornton, 2008). Additional experiments in animals with silenced olivocochlear efferents will be needed to determine the extent to which our results reflect inherited changes from the auditory periphery or early brainstem nuclei.
Our data show that a behavioral shift from passive sound exposure to task engagement is accompanied by a neural shift to improved rate-based sensitivity within individual ICC and MGV neurons. While this effect is relatively small in the ICC, it increases in both significance and magnitude in the MGV, suggesting a hierarchical emergence of context-dependent plasticity which parallels the emergence of the rate code along the ascending auditory pathway. The mechanism driving the temporal-to-rate transformation is currently unknown, although recent work implicates GluN2C/D containing NMDA receptors in the IC (Drotos et al., 2023).
Other possible drivers of subcortical plasticity could involve inputs from various neuromodulatory sources (Moore et al., 1978; Fitzpatrick et al., 1989; Klepper and Herbert, 1991; Motts and Schofield, 2009, 2010; Nevue et al., 2016; Chen et al., 2018). In the ICC, for example, serotonin levels fluctuate with context (Hurley and Hall, 2011) and could mediate short-term synaptic plasticity and changes in gain (Bohorquez and Hurley, 2009; Miko and Sanes, 2009). Thus, serotonergic inputs may contribute to the rapid rate-based ICC plasticity reported here. Similarly, the arousal level has a pronounced effect on IC and MGN activity, independent of task engagement (McGinley et al., 2015a; Saderi et al., 2021). These effects are likely due to rapid fluctuations in cholinergic and/or noradrenergic transmission (Reimer et al., 2016). Thus, state-dependent neuromodulatory shifts may contribute to the rapid context-dependent changes in subcortical firing reported here.
A final, though not mutually exclusive, possibility is that ICC and MGV plasticity is mediated by descending inputs from the ACX. Layer 5 corticofugal neurons project to higher-order IC and MGN subdivisions (Bajo and Moore, 2005; Williamson and Polley, 2019; Asilador and Llano, 2021) and transmit sensory and non-sensory signals to downstream targets (Oberle et al., 2022; Ford et al., 2024). Ablation or inactivation of these neurons can impair some forms of auditory learning (Bajo et al., 2010; Ford et al., 2024; Krall et al., 2024). Coupling ICC and MGV recordings with L5 corticofugal loss-of-function manipulations will be necessary to establish whether this descending pathway makes a causal contribution to perceptual learning and its associated subcortical plasticity.
Limitations of the current study
Our neural recordings were obtained from freely moving animals. We chose this approach because head-fixation can induce stress (Juczewski et al., 2020) and stress can impact both behavior and neural function. One limitation of the freely moving approach, however, is that the animal's motor actions and position in the test cage are uncontrolled. If these factors differed systematically across behavioral contexts, they could partially explain our results. Although we did not track movement or body position during our experiments, anecdotal observations suggest that the subjects generally moved around the test cage more during passive sound exposure sessions than during task performance, where they spent most of their time standing still to drink from the spout. Increased movement during passive sound exposure could theoretically impact neuronal responses in two different ways. First, the acoustic signals reaching the ears could differ depending on the animal's position in the cage. We believe this is unlikely, as the speaker was centered directly above the cage. The size of the cage and the gerbil's typical body posture would preclude any significant changes in the sound level or arrival time at the two ears.
Another possibility is that reafferent signals associated with locomotion were stronger during passive sound exposure and had a greater impact on ICC and MGV firing. These signals can suppress activity in both the IC (Yang et al., 2020) and the MGN (McGinley et al., 2015a; Williamson et al., 2015). If locomotion suppressed AM-evoked responses more than non-AM responses, it could partially explain the fact that AM/non-AM FR ratios were smaller (and thresholds were poorer) during passive sound exposure than during task performance. We cannot conclusively rule this possibility out; however, two aspects of our data argue against this scenario. First, context-dependent changes in FR-based AM sensitivity grew stronger in the MGV over the course of perceptual training without any obvious day-to-day changes in animal movement. Second, even if there were a daily increase in movement, we would expect this to impact ICC and MGV activity similarly. Instead, we found that perceptual training strengthened context-dependent changes in AM sensitivity in the MGV, but not the ICC. Together, these observations suggest that our findings are instead due to context-dependent changes in arousal, attention, expectations, or some combination thereof.
Footnotes
This work was supported by National Institutes of Health (NIH) Grants T32DC00046 and F31DC021355 to R.Y. Purchase of the Zeiss LSM 980 Airyscan 2 was supported by Award Number 1S10OD025223-01A1 from the NIH. We thank Drs. Matheus Macedo-Lima and Stefanie Kuchinsky (University of Maryland) for their help with statistics and all members of the M.L.C. laboratory for their constructive criticism and support. All data files associated with this manuscript can be found at http://hdl.handle.net/1903/33156. All associated analysis code can be found at https://github.com/caraslab/Ying_JNeuro2024 and https://github.com/caraslab/epa.
The authors declare no competing financial interests.
Correspondence should be addressed to Rose Ying at roseying{at}umd.edu or Melissa L. Caras at mcaras{at}umd.edu.