Research Articles, Systems/Circuits

Auditory Training Alters the Cortical Representation of Complex Sounds

Huriye Atilgan, Kerry M. Walker, Andrew J. King, Jan W. Schnupp and Jennifer K. Bizley
Journal of Neuroscience 30 April 2025, 45 (18) e0989242025; https://doi.org/10.1523/JNEUROSCI.0989-24.2025

Author affiliations: Huriye Atilgan (1,2), Kerry M. Walker (2), Andrew J. King (2), Jan W. Schnupp (2,3), and Jennifer K. Bizley (1,2).

1 The Ear Institute, University College London, London WC1X 8EE, United Kingdom
2 Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3PT, United Kingdom
3 Gerald Choa Neuroscience Institute, The Chinese University of Hong Kong, Sha Tin, Hong Kong

Abstract

Auditory learning is supported by long-term changes in the neural processing of sound. We examined these task-dependent changes in the auditory cortex by mapping neural sensitivity to timbre, pitch, and location cues in trained (n = 5) and untrained control female ferrets (n = 5). Trained animals either identified vowels in a two-alternative forced choice task (n = 3) or discriminated when a repeating vowel changed in identity or pitch (n = 2). Neural responses were recorded under anesthesia in two primary auditory cortical fields and two tonotopically organized nonprimary fields. In trained animals, the overall sensitivity to sound timbre was reduced across three cortical fields compared with control animals, but maintained in a nonprimary field (the posterior pseudosylvian field). While training did not increase sensitivity to timbre across the auditory cortex, it did change the way in which neurons integrated spectral information, with neural responses in trained animals increasing their sensitivity to first and second formant frequencies, whereas in control animals cortical sensitivity to spectral timbre depended mostly on the second formant. Animals trained on timbre identification were required to generalize across pitch when discriminating timbre, and their neurons became less modulated by fundamental frequency relative to control animals. Finally, both trained groups showed increased spatial sensitivity and an enhanced response to sound source locations close to the midline, where the loudspeaker was located in the training chamber. These results demonstrate that training elicited widespread alterations in the cortical representation of complex sounds.

  • auditory
  • decoding
  • ferret
  • learning
  • plasticity
  • timbre

Significance Statement

Learning a task can elicit widespread changes in the brain. Here, we trained animals to discriminate sound timbre using synthetic vowel sounds. Somewhat surprisingly, we observed that in three out of four of the brain regions studied, neural responses became less sensitive to timbre, while in the fourth area, sensitivity was maintained. This suggests that training does not simply rewire more neurons to represent learned stimuli. Neurons also changed the way in which they processed stimuli, becoming more sensitive to the formant cues that determine vowel identity and preferentially tuned to the region of space in which sounds were presented during training. Together, these results suggest that learning results in complex changes in how and whether neurons represent learned sounds.

Introduction

Sensory discrimination tasks are known to drive cortical plasticity, and increases in the representational area of the stimulus, such as tonotopic map expansion, have been proposed as providing the structural substrate for learning in the auditory cortex (Rutkowski and Weinberger, 2005; Polley et al., 2006; Engineer et al., 2014; Schreiner and Polley, 2014). However, other studies have noted that such changes only occur for simple stimulus features, such as frequency or level, and questioned the functional role of this form of representational plasticity; plasticity may be a temporary phenomenon associated with learning that does not persist once a task is well learned (Reed et al., 2011), and frequency discrimination can occur in the absence of map plasticity (Brown, 2004).

Learning to discriminate behaviorally meaningful sounds, such as when mother mice learn to recognize pup vocalizations, also elicits changes in auditory cortical neurons, altering inhibition (Galindo-Leon et al., 2009) and leading to gain enhancement (Shepard et al., 2016) independently of any changes in tonotopic representation. Training animals to dynamically locate a target area based on the properties of a sound also leads to adaptive changes within the primary auditory cortex (A1; Bao et al., 2004; Whitton et al., 2014). For example, locating a weak tone in the presence of background noise led to increased noise tolerance and an increase in the number of nonmonotonic rate-level functions (Whitton et al., 2014), while navigating based on the repetition rate of brief sound bursts led to changes in temporal response properties (Bao et al., 2004).

Studies investigating training-induced changes after learning to make discriminations based on complex sound features (i.e., for sounds composed of multiple sound frequencies) have reported changes in the way that neurons optimize the integration of both spectral and temporal features. For example, training cats to discriminate changes in the position of spectral peaks in a harmonic sound complex (akin to single-formant vowels) led to neurons sharpening their frequency tuning and shifting their spectrotemporal preference toward trained sounds, suggestive of altered spectral integration (Keeling et al., 2008). Training ferrets to discriminate forward from reversed vocalizations increased the information conveyed in auditory cortical temporal pattern codes in trained relative to naive animals (Schnupp et al., 2006). In nonhuman primates trained to detect increases in the rate of sound amplitude modulation, auditory cortical neurons showed narrower spectral tuning and a shift in preference toward the trained modulation rates (Beitel et al., 2020).

While most studies of representational plasticity have focused on A1, studies investigating short-term plasticity during active behavior reveal receptive field plasticity that is more marked in higher cortical areas compared with the primary auditory cortex (Atiani et al., 2014; Elgueda et al., 2019). Moreover, higher cortical areas are more likely to show selectivity to spectrally complex sounds such as speech (Mohn et al., 2024). This raises the question of whether longer-term changes in the way in which auditory cortical neurons represent trained stimuli might also vary across the auditory cortical hierarchy.

In this study, we recorded from the auditory cortex of adult ferrets (n = 5) trained to discriminate the timbre of spectrally overlapping artificial vowels, and compared neural responses to the same sounds recorded in naive animals (n = 5). Animals were trained to identify vowel timbre in the context of a two-alternative forced choice task (n = 3; Bizley et al., 2013; Town et al., 2018) or discriminate timbre or pitch in a go/no-go task (n = 2; Walker et al., 2017). After behavioral training was complete, electrophysiological recordings were made from four tonotopic auditory cortical fields. Previous work shows that the perceptual features of complex sounds, such as their location in space or spectral timbre, are distributed across auditory cortical fields (Bizley et al., 2009; Walker et al., 2011; Allen et al., 2017). We sought to determine: (1) how training altered single-neuron response sensitivity to both learned and passively exposed sound features; and (2) whether training-induced effects differed across the auditory hierarchy. Our data demonstrate that training alters the way in which neurons integrate spectral information, and the neural representations of trained and passively exposed features are affected in different ways. Specifically, we observed that timbre sensitivity was maintained in a nonprimary field (the posterior pseudosylvian field, PPF) in trained animals, while neurons in other fields became markedly less sensitive to timbre. In contrast, sensitivity to sound location, which was an untrained but task-relevant feature, was enhanced in all fields of trained animals.

Materials and Methods

Animals

All animal procedures were approved by the Committee on Animal Welfare and Ethical Review at the University of Oxford and performed under license from the UK Home Office in accordance with the Animals (Scientific Procedures) Act 1986. Ten adult, female, pigmented ferrets (Mustela putorius furo) were used in this study. Five of these were naive control animals, from whom some data were previously published (Bizley et al., 2009; Walker et al., 2011). The other five were trained animals. Three of the trained animals experienced 1–2 years of training on a two-alternative forced choice timbre identification task, which required them to report the identity of an artificial vowel (Fig. 1A, T-Id). For extended details of behavioral training, see "Stimuli" (below) and Figure 1 (Walker et al., 2009). The other two trained animals were trained to perform a go/no-go task, in which they were presented with a sequence of artificial vowels and had to discriminate either changes in vowel identity or, in different sessions, changes in vowel F0 (Fig. 1B, TP-Disc). For more details on this training paradigm, see "Stimuli" (below) and Figure 1 (Walker et al., 2011).

Figure 1.

Tonotopic organization of the ferret auditory cortex and behavioral paradigms. A, B, Schematic illustration of the timbre two-alternative forced choice paradigm (A, T-Id) and the timbre/pitch discrimination paradigm (B, TP-Disc). C, Location of ferret auditory cortical fields and their tonotopic organization [adapted from Nelken et al. (2004)]. Recordings in this study targeted the primary auditory cortex (A1), anterior auditory field (AAF), posterior pseudosylvian field (PPF), and posterior suprasylvian field (PSF). Field boundaries are marked with dotted lines, and the pseudosylvian sulcus (PSS) and suprasylvian sulcus (SSS) are drawn as solid lines. D, Voronoi tessellation map showing the CFs of all unit recordings made in control animals (615 units, 5 animals) and trained animals (783 units, 5 animals). Each tile represents a recording site and is colored according to the characteristic frequency (CF) of the unit recorded there. E, Histograms of the characteristic frequency distribution across subregions of the auditory cortex: A1, AAF, PPF, and PSF. Data from control animals are on the left, and data from all trained animals are on the right. The histograms represent the number of units responsive to specific frequency ranges (Hz) within each cortical area.

Ferrets were housed in groups of either two or three, with free access to high-protein food pellets and water bottles. On the day before behavioral training, water bottles were removed from the home cages and were replaced on the last afternoon of a training run. Training lasted for 5 d or less, with at least 2 d between each run. On training days, ferrets received drinking water as positive reinforcement while performing a sound discrimination task. Water consumption during training was measured and supplemented as wet food in home cages at the end of the day to ensure that each ferret received at least 60 ml of water per kilogram of body weight daily. Once behavioral training was complete, electrophysiological recordings were made under nonrecovery anesthesia (details below). Recording under anesthesia was necessary for the large-scale mapping of neurons across cortical fields and in order to directly compare the resulting responses with data from control (untrained) animals previously collected under the same anesthetic regime.

Stimuli

Acoustic stimuli for both behavioral testing and electrophysiology were artificial vowel sounds. For electrophysiological testing, sounds were all possible combinations of four F0 values (F0 = 200, 336, 565, and 951 Hz), four spectral timbres (/a/: F1–F4 at 936, 1,551, 2,815, and 4,290 Hz; /ε/: 730, 2,058, 2,979, and 4,294 Hz; /u/: 460, 1,105, 2,735, and 4,115 Hz; and /i/: 437, 2,761, 3,372, and 4,352 Hz), and four locations presented in virtual acoustic space (−45, −15, 15, and 45° azimuth, all at 0° elevation). This gave a total of 64 sounds, each of which was 150 ms in duration. Although the animals were trained using different stimulus protocols, recordings were conducted for all stimulus combinations in both the T-Id and TP-Disc groups, which included all vowels used in both training regimens. Additionally, noise bursts and pure tones were used to characterize individual units and to determine tonotopic gradients in order to confirm the cortical field in which any given recording was made (Bizley et al., 2005).
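
As a concrete illustration of this factorial design, the stimulus set can be enumerated as below. This is a minimal sketch in R: the parameter values come from the text above, but the code and variable names (e.g., `stimuli`, `vowel`) are purely illustrative and not part of the original methods.

```r
## Enumerate the 64-stimulus grid described in the text:
## 4 F0s x 4 vowel timbres x 4 virtual locations (each 150 ms long).
stimuli <- expand.grid(
  F0_Hz   = c(200, 336, 565, 951),
  vowel   = c("a", "E", "u", "i"),   # "E" stands in for the vowel /ε/
  azimuth = c(-45, -15, 15, 45)      # degrees azimuth, 0 degrees elevation
)
nrow(stimuli)                        # 64 combinations
```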

Behavioral testing

Full details of the training apparatus and procedure for shaping animals are provided in Walker et al. (2011) and Bizley et al. (2013). Briefly, in the timbre identification (T-Id) task, water-restricted ferrets were positively conditioned in a two-alternative forced choice task to report the identity of a vowel sound, which could be either /u/ (F1–F4 at 460, 1,105, 2,735, and 4,155 Hz; animals conditioned to respond at the left spout) or /ε/ (F1–F4 at 730, 2,058, 2,979, and 4,294 Hz; animals conditioned to respond at the right spout). Animals were trained initially with an F0 of 200 Hz, but were then tested across a range of F0 values from 150 to 500 Hz.

In the timbre/pitch discrimination (TP-Disc) task, water-restricted ferrets were trained to report a change in the pitch or timbre of a repeating artificial vowel in a go/no-go task. The reference sound was the vowel /a/ (F1–F4: 936, 1,551, 2,815, 4,290 Hz) with an F0 of 200 Hz, and F0 targets were the vowel /a/ with an F0 of 336, 565, and 951 Hz, while timbre targets were the vowels /i/, /u/, and /ε/, presented at an F0 of 200 Hz.

In both tasks, the animal initiated each trial by inserting its nose into a nose poke hole situated at the center of the sound-isolated testing chamber. For the T-Id task, this resulted in the presentation of two identical repetitions of one of the vowel sounds, and animals were rewarded for correctly responding at the side nose poke that was associated with that vowel. If they nose poked at the incorrect side, they were presented with a noise burst and a 10–14 s time out. In the TP-Disc task, ferrets heard a sequence of artificial vowels, which could change in identity or pitch at the third to the seventh vowel in the sequence, and if ferrets withdrew from the nose poke hole during the presentation of such a deviant, they were rewarded with water. Failures to withdraw in response to a deviant (within a 550 ms time window following deviant onset) resulted in a noise burst presentation and a 12 s time out. In both tasks, sounds were presented from a loudspeaker located 5 cm above the central "go" spout at the animals' midline (0° azimuth).

Electrophysiological recordings

Experimental methods were identical to those reported in Bizley et al. (2009), which comprises the control dataset for this study. Anesthesia was induced by a single dose of a mixture of medetomidine (Domitor; 0.022 mg/kg/h; Pfizer) and ketamine (Ketaset; 5 mg/kg/h; Fort Dodge Animal Health). The radial vein was cannulated, and a continuous infusion (5 ml/h) of a mixture of medetomidine and ketamine in physiological saline containing 5% glucose was provided throughout the experiment. The ferrets also received a single, subcutaneous dose of 0.06 mg/kg/h atropine sulfate (C-Vet Veterinary Products) to reduce bronchial secretions and, every 12 h, subcutaneous doses of 0.5 mg/kg dexamethasone (Dexadreson; Intervet) to reduce cerebral edema. The ferret was intubated, placed on a ventilator (7025 respirator; Ugo Basile), and supplemented with oxygen. Body temperature, end-tidal CO2, and the electrocardiogram (ECG) were monitored throughout the experiment. Electrophysiology experiments typically lasted between 36 and 60 h.

The animal was placed in a stereotaxic frame, and the temporal muscles on both sides were retracted to expose the dorsal and lateral parts of the skull. A metal bar was cemented and screwed into the right side of the skull, holding the head without further need for a stereotaxic frame. On the left side, the temporal muscle was largely removed, and the suprasylvian and pseudosylvian sulci were exposed by a craniotomy, exposing the auditory cortex. The dura was retracted, and the cortex was covered with silicon oil or agar. The animal was then transferred to a small table in an anechoic chamber (IAC, UK). Sounds were presented through customized Panasonic RPHV297 headphone drivers. Closed-field calibrations were performed using a one-eighth-inch condenser microphone (Brüel and Kjær), placed at the end of a model ferret ear canal, to create an inverse filter that ensured the driver produced a flat (less than ±5 dB) output.

Recordings targeted primary and nonprimary tonotopic areas: the primary auditory cortex (A1) and the anterior auditory field (AAF) on the middle ectosylvian gyrus, and the posterior pseudosylvian field (PPF) and posterior suprasylvian field (PSF) on the posterior ectosylvian gyrus (Fig. 1C). Recordings were made with silicon probe electrodes (NeuroNexus Technologies) either in a 16 × 2 configuration (16 active sites spaced at 100 μm intervals on each of two shanks), a 32 × 1 configuration (a single shank with 50 μm spacing between sites), or, in one animal, an 8 × 4 configuration (100 μm spacing in depth, 200 μm between shanks). All probes had active sites with an area of 177 μm². Voltage signals were bandpass filtered (500–5,000 Hz), amplified, and digitized at 25 kHz. Data acquisition and stimulus generation were performed using BrainWare (Tucker-Davis Technologies). Spike sorting was performed via WavClus (Quiroga et al., 2004) to isolate single units (based on waveform shape and refractory period) and multiunit clusters; we refer to "units" to include both unless otherwise specified. A minimum of 80 units were recorded in each field in each group (mean = 118 ± 28.32 units, Table 1).

Table 1.

Total number of recordings (probe positions and units) in each field for five control animals and five trained animals

Data were combined across animals within each group (control or trained) to make composite tonotopic maps (Fig. 1D). The characteristic frequency (CF; the frequency to which the unit responded at its threshold sound level) of each unit was determined based on its response to pure tones. CF information, response latencies, and photographs of the electrode penetration positions on the brain were used to make a composite tonotopic map based on the known tonotopic organization of the ferret auditory cortex (Bizley et al., 2005; Bimbard et al., 2018).
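
The definition of CF given above (the frequency driving a response at the unit's threshold sound level) could be implemented from a pure-tone frequency–response area along the following lines. This is a hedged sketch, not the authors' code: the data layout, the response criterion, and the function name `estimate_cf` are all assumptions.

```r
## `fra`: matrix of mean evoked spike rates, rows = sound levels (ordered
## low to high), columns = tone frequencies; `freqs_hz`: the tested
## frequencies; `spont`: spontaneous rate. The 2x-spontaneous criterion
## is illustrative only.
estimate_cf <- function(fra, freqs_hz, spont) {
  driven <- fra > 2 * spont
  for (lev in seq_len(nrow(fra))) {            # lowest level first = threshold level
    if (any(driven[lev, ])) {
      return(freqs_hz[which.max(fra[lev, ])])  # frequency driving the unit at threshold
    }
  }
  NA_real_                                     # no tone-evoked response found
}
```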

Neural data analysis

Cortical responses were analyzed using a variance decomposition approach developed in Bizley et al. (2009). We first calculated spike counts for each of the 64 stimuli, averaged over repeated presentations of the same sound, and binned with 20 ms resolution over the 300 ms immediately following stimulus onset. We then performed a four-way ANOVA on the spike counts, with the three stimulus parameters (timbre, pitch, and location) plus the time bin as factors. To quantify the relative strength with which each of the three stimulus dimensions influenced the firing of a particular unit, we calculated the proportion of variance explained by timbre, pitch, or location, Var_stim, as follows:

Var_stim = (SS_stim×bin − SS_error · df_stim×bin) / (SS_total − SS_bin),

where "stim" refers to the stimulus parameter of interest (timbre, pitch, or location), SS_stim×bin is the sum of squares for the interaction of the stimulus parameter and time bin, SS_error is the sum of squares of the error term, df_stim×bin is the degrees of freedom for the stimulus–time bin interaction, SS_total is the total sum of squares, and SS_bin is the sum of squares for the time bin factor. A large SS_bin reflects the fact that the response rate was not flat over the duration of the 300 ms response window. By examining the stimulus–time bin interactions, we were able to test the statistical significance of the influence a given stimulus parameter had on the temporal discharge pattern of the response. Subtracting the SS_error · df_stim×bin term from SS_stim×bin allows us to calculate the proportion of response variance attributable to each stimulus dimension while taking into account the additional variance explained simply by adding extra parameters to the model. As in our previous work (Bizley et al., 2009), we considered a main effect or interaction term in the ANOVA to be statistically significant if p < 0.001. We chose a 300 ms window as this captures both offset responses and late responses, which our previous work showed carried stimulus information.
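
This decomposition can be computed directly from a standard ANOVA table. The sketch below follows the formula as written in the text; the data frame `resp` and its column names are hypothetical, and the code is illustrative rather than the authors' analysis script.

```r
## One row per stimulus presentation x 20 ms time bin, with a spike count
## and factors for timbre, pitch (F0), location, and time bin.
resp[c("timbre", "pitch", "location", "bin")] <-
  lapply(resp[c("timbre", "pitch", "location", "bin")], factor)

## Four-way ANOVA: stimulus main effects, time bin, and stimulus x bin terms.
fit <- aov(spikes ~ (timbre + pitch + location) * bin, data = resp)
tab <- as.data.frame(anova(fit))               # "Sum Sq" and "Df" per term

ss  <- setNames(tab[["Sum Sq"]], rownames(tab))
dfs <- setNames(tab[["Df"]],     rownames(tab))

## Proportion of response variance attributable to one stimulus dimension,
## following the formula given in the text.
var_stim <- function(stim) {
  term <- paste0(stim, ":bin")
  (ss[term] - ss["Residuals"] * dfs[term]) / (sum(ss) - ss["bin"])
}

sapply(c("timbre", "pitch", "location"), var_stim)
```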

Statistical analysis

For statistical comparison of neural sensitivity measures between groups and cortical fields (derived using the variance decomposition approach described above), we performed generalized linear mixed model (GLMM) regression, with the specific analysis for each test reported in the Results section. We utilized the lmerTest package in R, defining the model formula as "Value ∼ Training Group * Field + (1|Penetration)." Here, "Value" represented the dependent variable, while "Training Group" and "Field" were the independent variables (fixed effects). "Penetration" was included as a random effect in the model to take into account that simultaneous recordings from a single electrode are unlikely to be fully statistically independent. The model was fit using the lmer function from the lmerTest package. To evaluate how robustly our model accounted for the data, we used fivefold cross-validation, calculating the root mean square error, which indicates the model's prediction error, for each test set.
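
A minimal sketch of this model fit and its cross-validation is shown below. lmerTest is the package named in the text; the data frame `units` and its column names (`Value`, `Group`, `Field`, `Penetration`) are placeholders, and the code is illustrative only.

```r
library(lmerTest)

## One row per neural unit: sensitivity value, training group, cortical
## field, and the electrode penetration on which it was recorded.
m <- lmer(Value ~ Group * Field + (1 | Penetration), data = units)
summary(m)   # fixed-effect coefficients with Satterthwaite p values

## Fivefold cross-validation: refit on 4/5 of the data and compute the
## root mean square error (RMSE) on each held-out fold.
set.seed(1)
folds <- sample(rep(1:5, length.out = nrow(units)))
rmse <- sapply(1:5, function(k) {
  fit  <- lmer(Value ~ Group * Field + (1 | Penetration),
               data = units[folds != k, ])
  pred <- predict(fit, newdata = units[folds == k, ],
                  allow.new.levels = TRUE)    # penetrations unseen in training
  sqrt(mean((units$Value[folds == k] - pred)^2))
})
mean(rmse)
```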

To analyze how acoustic features shaped sensitivity to timbre across groups and cortical fields, we extended our mixed-effects model to incorporate additional fixed effects: the difference in first formant frequencies between the vowels in a pair (denoted ΔF1), the difference in second formant frequencies (denoted ΔF2), and their interaction terms. The updated model formula was defined as "Value ∼ Field + Training Group + ΔF1 + ΔF2 + Training Group: ΔF1 + Training Group: ΔF2 + Field: ΔF1 + Field: ΔF2 + ΔF1: ΔF2 + (1|Unit) + (1|Penetration)." In this expanded model, "ΔF1" and "ΔF2" represent additional fixed effects, and interactions between "Training Group," "Field," "ΔF1," and "ΔF2" were included to examine potential synergistic or antagonistic effects among these variables. We included both "Unit" (reflecting the multiple observations per unit) and "Penetration" as random effects, and fivefold cross-validation was used to assess the goodness of fit of the model.

To generate predictions for arbitrary ΔF1 and ΔF2 values, we used the model to predict responses across a dense grid of formant frequency space. Specifically, a range of ΔF1 values was defined by 200 equally spaced points between the minimum and maximum observed ΔF1 values in the dataset, while the range of ΔF2 values comprised 400 equally spaced points within its observed range. This grid spanned the complete acoustic space represented in our dataset, allowing for a comprehensive prediction of the model's response to changes in formant frequencies. For each unique combination of cortical field and training group, the model predictions were computed across the entire grid of ΔF1 and ΔF2 values, resulting in a detailed landscape of predicted timbre sensitivity. The predict function was used to estimate the timbre sensitivity for each observation in the generated dataset, based on the fitted mixed-effects model. This prediction dataset formed the basis for subsequent visualization and interpretation, aiming to elucidate the influence of auditory training on the neural encoding of vowel sounds.
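
The extended model and prediction grid might be sketched as follows. The data frame `pairs_df` and the columns `dF1`, `dF2`, `Group`, `Field`, `Unit`, and `Penetration` are stand-ins for the quantities described above; the 200 x 400 grid sizes follow the text, and the code is illustrative rather than the original analysis.

```r
## Extended mixed model with formant-difference terms and interactions.
m2 <- lmer(Value ~ Field + Group + dF1 + dF2 +
             Group:dF1 + Group:dF2 + Field:dF1 + Field:dF2 + dF1:dF2 +
             (1 | Unit) + (1 | Penetration), data = pairs_df)

## Dense grid spanning the observed formant-difference space for every
## combination of training group and cortical field.
grid <- expand.grid(
  dF1   = seq(min(pairs_df$dF1), max(pairs_df$dF1), length.out = 200),
  dF2   = seq(min(pairs_df$dF2), max(pairs_df$dF2), length.out = 400),
  Group = levels(pairs_df$Group),
  Field = levels(pairs_df$Field)
)

## Population-level predictions (random effects dropped via re.form = NA).
grid$pred <- predict(m2, newdata = grid, re.form = NA)
```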

To examine differences in spatial tuning that emerged during training, we used a mixed-effects model that considered the effect of (sound) “Location,” “Training Group,” and “Field” on spike rate. The model formula was defined as “Spike Rate ∼ Location * Training Group * Field + (1|Unit) + (1|Penetration),” where “Unit” and “Penetration” were again included as random effects. In this model, “Location” was treated as a factor variable with four levels: “−45,” “−15,” “15,” and “45,” in which “−45” was explicitly used as reference. This final model allowed us to closely examine how changes in “Location” interacted with “Training Group” and “Field” to affect spike rate, while controlling for the random effects of “Unit” and “Penetration.”
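
A corresponding sketch of the spatial model, with location coded as a factor and −45° as the reference level, is given below (again with hypothetical column names; illustrative only).

```r
## One row per unit x location: mean evoked spike rate plus grouping factors.
rates$Location <- relevel(factor(rates$Location,
                                 levels = c("-45", "-15", "15", "45")),
                          ref = "-45")

m_loc <- lmer(SpikeRate ~ Location * Group * Field +
                (1 | Unit) + (1 | Penetration), data = rates)
summary(m_loc)   # location coefficients are contrasts against -45 degrees
```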

Results

Three ferrets were trained in a two-alternative forced choice timbre identification (T-Id) task to discriminate /u/ from /ε/ across a range of F0s (Fig. 1A). Two additional ferrets were trained in a go/no-go task to discriminate changes in timbre or F0 (TP-Disc) within a repeating reference vowel (/a/, F0 = 200 Hz; Fig. 1B). Once behavioral training and testing were complete, we recorded neural activity under medetomidine/ketamine anesthesia, which allowed us to map neural responses across the surface of multiple auditory cortical fields in each animal. These responses were directly compared with those obtained in naive animals in a previous study, which constitutes the control data for this investigation. The activity of 783 units (459 single neurons, 324 multiunits) that were responsive to vowels (paired t test on sound-evoked and spontaneous firing rates, p < 0.05) was recorded from four tonotopic auditory cortical fields (Table 1; Fig. 1C,D) and compared with 538 driven units from control animals. The CF distribution was not significantly different between trained and naive animals [Fig. 1E, GLM to predict CF with factors field (A1, AAF, PPF, PSF) and group (control/TP-Disc/T-Id) and their interactions, all p > 0.27].

As in the control data presented in Bizley et al. (2009), single units displayed diverse time-varying responses to the complete stimulus set (Fig. 2, Extended Data Fig. S1). For instance, Unit A in Figure 2A displayed a very selective response to the 200 Hz fundamental frequency combined with a high first formant (F1 of 936 or 730 Hz). In contrast, another unit (Fig. 2A, Unit B) exhibited a distinctive response pattern characterized by an enhanced firing rate at the onset of vowels with a high first formant (greatest for 936 Hz, followed by 730 Hz) and an otherwise reduced firing rate during the stimulus compared with spontaneous activity.

Figure 2.

Training drives changes in sensitivity to stimulus features. A, Raster plots from two example units. B–F, Z-scored spike rates (mean ± SEM) for all units recorded in the control animals (gray), animals trained in the timbre identification task (T-Id; magenta), and animals trained in the pitch/timbre discrimination task (TP-Disc; blue), plotted across F0 (B), first formant frequency (F1; C; order of vowels from low to high F1: /i/, /u/, /ε/, /a/), second formant frequency (F2; D; order of vowels from low to high F2: /u/, /a/, /ε/, /i/), F1–F2 spectral integration (E; order of vowels from low to high: /u/, /a/, /ε/, /i/), and F0–F1 combinations (F; vowels separated and plotted across F0). See also Extended Data Figure S1.

While single units appeared to show diverse responses, we first performed some basic analyses to look for any population-wide effects of training. To do so, we calculated the normalized (z-scored) spike rate over the first 200 ms after stimulus onset and examined it across different stimulus parameters (Fig. 2B–F). Both groups of trained animals most frequently heard stimuli with an F0 of 200 Hz during behavioral testing. We observed larger changes in normalized spike rates across F0 in both training groups than in the control group (Fig. 2B), with units from both trained groups showing a higher firing rate to sounds with an F0 of 200 Hz than to other F0 values. We examined population-level tuning to first formant frequency and second formant frequency (Fig. 2C,D) and their integration (Fig. 2E, sum of F1 and F2). The T-Id group showed a much greater modulation of spike rate by timbre, with higher F1 frequencies eliciting greater responses than lower values. The two trained vowels /u/ (F1 = 460 Hz, F2 = 1,105 Hz, trained to go left) and /ε/ (F1 = 730 Hz, F2 = 2,058 Hz, trained to go right) showed an enhanced firing rate difference, with neurons appearing to respond more strongly to the contralateral conditioned stimulus and having suppressed responses for the ipsilateral trained stimulus. Examination of firing rates across combinations of F0 and timbre (Fig. 2F) was suggestive of F0–timbre interactions in all three groups, in particular the T-Id group.
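
The population summary used here (z-scoring each unit's spike rate over the first 200 ms after onset, then averaging across units for each stimulus value) could be computed along the following lines; the data frame `spikes_200ms` and its columns are illustrative, not the authors' code.

```r
library(dplyr)

pop_f0 <- spikes_200ms %>%                        # one row per unit x stimulus
  group_by(unit) %>%
  mutate(z = (rate - mean(rate)) / sd(rate)) %>%  # z-score within each unit
  group_by(F0_Hz) %>%                             # then summarize per F0 value
  summarise(mean_z = mean(z), sem_z = sd(z) / sqrt(n()))
```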

Response modulation by timbre, pitch, and location

Having examined neural tuning at the population level, we turned to unit-level analysis using the variance decomposition approach that we have previously developed (Bizley et al., 2009), as this allows us to quantify to what extent different perceptual dimensions modulate spiking responses over time. We first asked to what extent neurons were sensitive to each of the three sound features in the trained group as compared with the control group. We determined the proportion of units whose responses were significantly modulated by variation in stimulus location, pitch (determined by fundamental frequency, F0), or timbre. All trained animals learned to discriminate the timbre of the artificial vowels (Fig. 3A,B). Therefore, one might expect a greater number of auditory cortical neurons to convey timbre information after training, as well as fewer neurons that are sensitive to untrained stimulus features (viz., pitch in the T-Id animals and location in both the T-Id and TP-Disc animals).

Figure 3.

Timbre sensitivity of auditory cortical neurons is reduced in animals trained on timbre discrimination tasks. A, Performance of three ferrets identifying timbre across changes in F0 (T-Id, chance = 50%). B, Performance of two animals discriminating changes in timbre and F0 (TP-Disc, chance = 25%). C, Cortical distribution of sensitivity to timbre measured using the proportion of variance explained metric (see Materials and Methods). Each tile represents an electrode penetration, with values averaged across all neural units from the same penetration. D, Swarm plots showing the distribution of timbre sensitivity across cortical fields and training groups. Each data point is one neural unit. See also Extended Data Figures S1 and S2.

Contrary to these predictions, the number of units that showed timbre sensitivity was equivalent between trained and control groups (391/783 units and 293/538 units, respectively; χ2 = 0.005, p = 0.95). In both groups, units with joint stimulus sensitivity outnumbered those with sensitivity to only a single parameter, but the distribution of sensitivity to zero, one, two, or three stimulus parameters was significantly different between the groups (χ2 = 86.9, p < 0.001). In the trained dataset, many more units showed significant modulation by all three stimulus parameters (timbre, pitch and location), or by none of the stimulus parameters, than in the control dataset where most units were modulated by one or two stimulus dimensions (Table 2; Fig. 2A, example cells; Extended Data Fig. S1). Taken together, this suggests that training resulted in a decrease in the likelihood of units being sensitive to modulations of timbre, pitch, or location on their own but an increase in units sensitive to combinations of all three features. We now consider sensitivity to each feature in turn, asking how the responses of units, and their cortical distribution of sensitivity, varied between trained and control animals.

Table 2.

Percentages of units whose responses were significantly modulated by stimulus dimensions timbre, F0, or location, for naive, T-Id trained, and TP-Disc trained animals

Neural sensitivity to trained stimulus features: timbre

For each unit in our neural dataset, we determined what proportion of the response variance was attributable to each of the three stimulus dimensions and their combinations. First, we considered sensitivity to timbre, as both behavioral tasks required animals to make judgments based on this feature (Fig. 3A, T-Id; Fig. 3B, TP-Disc).

Figure 3C shows the distribution of sensitivity to timbre (i.e., the proportion of variance explained by timbre) mapped across the four cortical fields examined. Although the chi-squared analysis above indicated that the proportion of units with significant timbre sensitivity is equivalent across trained and control groups, the maps in Figure 3C illustrate that significantly less response variance is explained by sound timbre in the trained animals than in the control animals.

We applied a generalized linear mixed model (GLMM) to assess the impact of training group and cortical field on timbre sensitivity (Table 3), with penetration as a random effect (in order to account for the shared variability in simultaneous recordings). There was a significant main effect of training, with both the TP-Disc (β = −14.855, p < 0.001) and T-Id (β = −10.816, p < 0.001) training groups showing significant negative effects compared with the control group. The magnitude of these coefficients suggests a marked decrease in the proportion of variance explained by timbre after training. The main effect of cortical field predominantly reflected decreased timbre sensitivity in field PSF relative to A1 (Table 3).

Table 3.

The generalized linear mixed model results for the proportion of neural response variance explained by timbre

The GLMM analysis showed several significant interaction terms between group and the secondary fields (Table 3). Therefore, to directly address whether training differentially affected timbre sensitivity in each cortical field, we performed pairwise post hoc comparisons (Tukey's HSD, corrected for multiple comparisons) across groups, separately for each field. These revealed significant decreases in timbre sensitivity in trained animals compared with controls for all cortical fields (p < 0.05) except PPF, which was not significantly different. Overall, these results suggest that long-term training in a spectral timbre discrimination task leads to a field-specific alteration in timbre sensitivity; with the exception of field PPF, trained animals display a decrease in timbre sensitivity throughout the tonotopic auditory cortex compared with controls. Example PPF units with high timbre sensitivity can be seen in the extended data (Extended Data Fig. S2: units 327, 550, 677).

While the distribution of CFs was equivalent across fields and training groups, we also tested whether the training effects we observed could be explained by differences in the CF distribution, anticipating that low-frequency-preferring units would convey more timbre information. We therefore reran the GLMM with CF included in the model as an additional factor. As expected, CF was negatively associated with timbre sensitivity (high-frequency units are less sensitive to timbre than low-frequency units), but all of the statistical effects held when CF was included in the model, including the observation that timbre sensitivity in T-Id animals was preserved in field PPF but decreased elsewhere (Table 4).

Table 4.

The generalized linear mixed model results for the proportion of neural response variance explained by timbre based on training group, cortical field, and best frequency

Formant cues are reweighted in timbre-trained animals

Contrary to our hypothesis, long-term training on a spectral timbre discrimination task led to an overall decrease in neural sensitivity to timbre. However, the stimuli with which we recorded neural responses allowed us to both contrast effects in trained and untrained vowels and look more generally at tuning to first and second formant frequencies, which our preliminary analysis had highlighted as an area of interest (Fig. 2C,D). To contrast responses to trained and untrained vowels, we repeated the variance decomposition analysis using the neural responses to pairs of vowels (i.e., subsets of 50% of the data comprising the responses to stimulus combinations of two vowels presented at each of four F0s and four locations, yielding 32 stimulus conditions in total). The proportion of neural response variance explained by timbre in each of these analyses provided an estimate of how well neuronal responses differentiate a given pair of vowels across variation in F0 and location. We hypothesized that, if training led to enhanced sensitivity to the target vowels, we would observe higher timbre sensitivity (% variance explained) for the trained vowel pairs compared with untrained vowel pairs. Specifically, for the T-Id animals, /u/ versus /ε/ should yield the greatest timbre sensitivity. For the TP-Disc animals, where /a/ was the reference vowel, we expect the /a/–/i/, /a/–/ε/, and /a/–/u/ contrasts to yield higher sensitivity scores than the pairs of vowels that did not include /a/.

We visualized timbre sensitivity across the three groups for all six pairs of vowels (Fig. 4C), ordering the vowels according to the difference in second formant frequency (ΔF2, Fig. 4A–C). The second formant has been shown to strongly influence the perception of trained ferrets when performing vowel discrimination tasks (Town et al., 2015). While the control group showed a clear relationship between timbre sensitivity and ΔF2, such a relationship was not apparent for the trained animals (Fig. 4C, top panel). Neither was it the case that timbre sensitivity was enhanced only for trained vowels over untrained ones or that neurons became exclusively F1 sensitive (Fig. 4C, bottom panel). To better understand how training group, cortical field, and change in first (ΔF1) and second (ΔF2) formant frequencies influenced the proportion of the neural response variance attributable to timbre, we ran a GLMM that predicted the neural response variance explained by timbre, with factors cortical field, training group, ΔF1 and ΔF2, and with unit ID and penetration as nested random effects. Here, ΔF1 and ΔF2 were calculated as the difference in first and second formant frequencies, respectively, for the relevant vowel pair.

Figure 4.

Training reweights sensitivity to first and second formant frequencies. A, First and second formant frequencies for the four vowels used in this study. Vowels /ε/ and /u/ (magenta) were used in the T-Id task, and /a/ (blue) was used as the reference vowel for the TP-Disc task. B, Each pair of vowels can be defined according to the difference in their first (ΔF1) and second (ΔF2) formant frequencies. C, The proportion of variance explained by timbre for the six possible vowel combinations, organized according to the magnitude of the difference in second formant frequency (ΔF2, ranked from lowest to highest, top) or first formant frequency (ΔF1, bottom). Error bars are standard error of the mean. D, E, Model predictions for the change in sensitivity to timbre according to arbitrary differences in F1 (D) and F2 (E). F, The proportion of variance explained by timbre for the six possible vowel combinations (as in C) ordered by ΔF2 and broken down by cortical field. See also Extended Data Figure S3.

The results of the GLMM are shown in Table 5 and are consistent with training altering the way in which neurons integrate spectral information. Here, positive β coefficients indicate increased sensitivity (i.e., the model predicts a higher amount of neural response variance explained by timbre). The model showed significant main effects of training group and cortical field (replicating the previous analyses in Tables 3 and 4), as well as significant effects of ΔF1 and ΔF2 on sensitivity to timbre, where ΔF1 decreased sensitivity to timbre and ΔF2 increased sensitivity to timbre. In addition to these main effects, there were multiple significant two-way interactions, all of which were consistent with the trained animals increasing their sensitivity to changes in first formant frequency (e.g., a significant ΔF1–T-Id interaction with a large positive β coefficient) and decreasing their reliance on differences in second formant frequency (significant ΔF2–T-Id and ΔF2–TP-Disc interactions, both with negative β coefficients). The three-way interactions were not significant.

Table 5.

GLMM results for the proportion of variance explained by timbre sensitivity based on training group, cortical field, and differences in first and second formant frequency

To further understand timbre sensitivity in trained animals, we used the fitted GLMM coefficients to predict how the timbre-related response variance depends on arbitrary changes in F1 and F2 for both training groups. By systematically varying the ΔF1 and ΔF2 values, we simulated a comprehensive sampling of formant space and graphically depicted the resulting predictions for timbre sensitivity (Fig. 4D, ΔF1; Fig. 4E, ΔF2). The simulated results illustrate the dependence of the control group neurons on ΔF2 and the increased reliance on ΔF1 and decreased sensitivity to ΔF2 in the T-Id group (Extended Data Fig. S3, units 312, 245, 538, 276). In the TP-Disc group, sensitivity to timbre was lower overall, but our limited sample size in these animals prevents clear conclusions from being drawn about the relative dependence on ΔF1 and ΔF2. When dissecting the data by cortical field (Fig. 4F), we confirmed that these training-induced shifts in formant frequency weighting were evident across all cortical fields but were most pronounced in field PPF. We also repeated the GLMM with CF included as a factor, and the statistical results did not differ (data not shown).

Sensitivity to fundamental frequency in trained animals

Basic spike rate measures showed that neurons in trained animals fired more when the F0 was 200 Hz (Fig. 2B). Since T-Id animals were able to identify trained vowels across a range of randomly varying F0 values (Fig. 5A), we predicted that, despite this, we might see decreased sensitivity (i.e., increased tolerance) to F0 in the cortical responses of these animals. In contrast, the TP-Disc animals were not required to generalize their discriminations across a second stimulus feature but were instead trained to detect changes in either F0 (Fig. 5B) or spectral timbre (Fig. 3B). Therefore, we expected to potentially see greater F0 sensitivity in the cortical responses of these animals, compared with the T-Id and control groups. While some units were clearly tuned to F0 (e.g., Extended Data Fig. S4, units 92 and 276, both tuned to F0 = 336 Hz) and others were driven most strongly by 200 Hz F0 stimuli (e.g., Extended Data Fig. S5, units 362, 309, 363), when sensitivity to F0 is plotted across the cortical surface (Fig. 5C) or broken down by cortical field (Fig. 5D), it is apparent that both trained groups show decreased F0 sensitivity compared with the control group.

Figure 5.

Cortical sensitivity to F0 is decreased in trained animals. A, Behavioral performance for discriminating timbre across fundamental frequency (F0) on the timbre identification task (T-Id). B, F0 discrimination performance for two animals (TP-Disc). Symbols show individual animals. C, Voronoi tessellation maps plotting the proportion of variance explained by F0 for control (left) and trained (right) animals. Conventions as in Figure 3C. D, Swarm plots showing the distribution of F0 sensitivity across neural units in different cortical fields and training groups. See also Extended Data Figure S4.

We employed a GLMM to assess the influence of auditory training and cortical field on neural sensitivity to F0 (Table 6). The results indicated that both the T-Id and TP-Disc groups exhibited a significant reduction in F0 sensitivity compared with the control group, with respective coefficients of β = −11.246 and β = −6.330 (both p < 0.001). In terms of cortical fields, a marked decrease in sensitivity was observed in AAF (β = −3.585, p < 0.001) relative to A1, while PSF and PPF were not significantly different from A1. There were significant interaction effects between training groups and cortical fields. These findings run counter to our hypothesis that F0 sensitivity in the auditory cortex would be enhanced by F0-specific training (i.e., in the TP-Disc group) and decreased in animals trained to ignore F0 changes (i.e., the T-Id group). Instead, we observed an overall reduction in neural sensitivity to F0 in trained animals.

Table 6.

The GLMM results for the proportion of variance explained by F0 sensitivity

Sensitivity to an untrained feature: location

Finally, we examined how training impacted cortical sensitivity to sound source location cues. During behavioral testing, all sounds were presented from a speaker in front of the central nose-poke sensor. In our control dataset, we previously observed that the spatial tuning of neurons in response to these vowel stimuli presented in virtual acoustic space was modest and, as expected, predominantly contralateral (Bizley et al., 2009). However, when the same neurons were tested with spatially modulated broadband noise (also in virtual acoustic space), we observed considerably greater spatial modulation, leading us to conclude that the low spatial sensitivity was a likely product of the limited bandwidth of the artificial vowel stimuli. Here, we examined whether exposure to the vowels in the behavioral task altered the spatial sensitivity measured in the auditory cortex using these same stimuli. As Figure 6 shows, spatial sensitivity was indeed higher in trained animals, which can be visualized across the auditory cortical surface (Fig. 6A) or between cortical fields (Fig. 6B). Specifically, in the nonprimary cortical areas, spatial sensitivity appeared to be higher in the T-Id animals than in controls.

Figure 6.

Cortical sensitivity to location, an untrained feature, is higher in trained animals. A, Voronoi tessellation maps plotting the proportion of variance explained by location for control and trained animals. B, Swarm plots showing the proportion of variance explained by location across fields and training groups. C–E, Population spatial tuning functions (mean ± SEM for all responsive units) for control, T-Id animals, and TP-Disc animals. F, β coefficients for the impact of GLMM parameters on spike rate; parameters showing significant effects (p < 0.05) are emphasized with gray bars. See also Extended Data Figure S5.

To examine whether training altered the neural sensitivity to sound source location, we ran a GLMM using the training group, cortical field, and their interactions as factors (Table 7). There was no significant overall effect of group on spatial sensitivity, but there were significant effects of cortical field, with all areas showing lowered spatial sensitivity relative to A1 (see Table 7 for details). There were also significant interactions (with positive β coefficients) for the training group and fields PSF and PPF for the T-Id animals, supporting the observation that azimuth sensitivity was higher in the nonprimary fields in trained animals.

Table 7.

The GLMM results for the proportion of variance explained by sound source location sensitivity

We next asked whether the increased spatial sensitivity in trained animals reflected altered spatial tuning. To assess this, we constructed spatial response functions for each neuron, taken as the mean sound-evoked spike rate at each location across all pitch and timbre combinations. To derive a population spatial receptive field, we averaged these functions across all recorded units (Fig. 6C–E). As expected, in the control animals (Fig. 6C), this yielded a monotonically increasing function with the most contralateral (+45°) location eliciting the strongest firing rates. In contrast, in both trained groups of animals, spatial response profiles were nonmonotonic and showed a peak at +15° azimuth. This midline tuning was also clearly evident in the raw data (Extended Data Fig. S5: units 185, 82, 309).

The GLMM in Table 7 showed significant group–field interactions, suggesting that training effects on spatial sensitivity differ across cortical fields. To quantify these effects, we ran a GLMM to model spike rates with group (T-Id or TP-Disc relative to the reference category control group), cortical field (reference category A1), and spatial position (categorical predictor, relative to −45°) as factors, with penetration and unit as random effects. This confirmed significant main effects of spatial position (i.e., responses at −45 and +45° azimuth were significantly different, reflecting the dominant contralateral tuning of the units) and significant field–group–position interactions for PPF and PSF in both trained groups (Fig. 6F). The coefficients for these effects were negative for +45° and positive for ±15°, confirming the changes observed in the azimuth response plots (Fig. 6C–E). Thus, although sound source location was not varied during behavioral testing, and was not relevant to the task, the representation of the region from which the sounds were presented (i.e., the midline) was enhanced in the trained animals.

Discussion

In this study, we trained two groups of ferrets to discriminate perceptual attributes of artificial vowels. One group (T-Id animals) categorized vowels according to their identity (i.e., spectral timbre) while generalizing across changes in pitch (i.e., F0). The other group was trained to detect changes in either the F0 or timbre in a sequence of ongoing vowel sounds (TP-Disc animals). We predicted that the T-Id animals would show enhanced neural sensitivity for timbre and increased tolerance (i.e., decreased sensitivity) to other sound features. In fact, what we observed was more complex: sensitivity to timbre was lower in three auditory cortical areas of trained animals compared with controls, but was maintained in PPF (a secondary tonotopic region). Moreover, cortical sensitivity to vowel identity was less contingent on changes in the frequency of the second formant in trained animals and instead was dependent on changes in both the first and second formant frequency. Sensitivity to F0, which the T-Id animals were trained to generalize across, decreased over all fields with some evidence for a field-specific increase in PSF. In contrast, sensitivity to sound source location, which did not vary during the tasks, was enhanced in nonprimary fields of trained animals. Furthermore, spatial receptive fields were shifted toward the midline, from where the target sounds originated. Overall, animals trained to discriminate vowels in both tasks showed an unexpected decrease in cortical sensitivity to timbre and F0 relative to controls and enhanced spatial sensitivity.

Previous studies investigating the impact of training on neural tuning in the auditory cortex have focused primarily on map plasticity in A1 (Irvine, 2018). However, neurons in higher auditory cortical fields are thought to become increasingly specialized for processing spatial or nonspatial stimulus attributes (Rauschecker and Tian, 2000; Bizley and Cohen, 2013; Elgueda et al., 2019) and show enhanced attention-related changes in activity during behavior (Mesgarani and Chang, 2012; Atiani et al., 2014; Elgueda et al., 2019). While most studies of the cortical substrates for perceptual learning have focused on A1, our data suggest that the areas that show larger attention-related changes may also be those in which receptive fields are optimized through learning to process task-relevant stimuli.

The observation of a marked decrease in sensitivity to task-relevant features in the primary auditory cortical fields A1 and AAF is a potentially surprising finding. However, a number of studies report that engagement in a behavioral task causes suppression of neural responses in the auditory cortex (Otazu et al., 2009; Town et al., 2018), and training to discriminate natural sounds has been observed to result in sparser auditory cortical representations of these sounds in A1 of mice (Maor et al., 2019). The reduced sensitivity we see, along with the increase in unresponsive neurons, may reflect a sparsening of representations. Additionally, it may result from the integration of diverse nonsensory inputs into the auditory cortex that ultimately underlie the emergence of choice- or motor-related activity. Reversibly inactivating A1 in ferrets does not impair vowel discrimination in a task analogous to the one used here, with behavioral deficits only evident when vowels are presented in simultaneous noise (Town et al., 2023). Our present neural data raise the testable prediction that inactivation of PPF may cause a timbre identification deficit, whereas inactivating A1, AAF, or PSF would have a more modest effect on performance in these tasks.

Training ferrets to identify vowels altered the sensitivity of cortical neurons to the cues that underlie spectral timbre. While the complex sounds used here may allow animals to use a variety of acoustical criteria to solve the tasks, individual animals use remarkably consistent strategies to identify vowel timbre, even across different cohorts, laboratories, and stimulus conditions (Bizley et al., 2013; Town et al., 2015). In the cortical responses of naive control animals, the most discriminable stimuli were those in which there was a large difference in F2. This finding is mirrored behaviorally; when first and second formant cues are placed in conflict, ferrets and human listeners tend to weight the position of the second formant over the first in their decisions, with their behavior being best predicted by either F2 position or the position of the spectral centroid (Town et al., 2015). Nonetheless, the T-Id animals described here were tested behaviorally with single-formant stimuli, and they were able to accurately classify F1 for /u/ and F2 for /ε/ (Bizley et al., 2013). The finding that neural responses in the trained animals are explained by changes in both the first and second formants for trained and novel vowels mirrors electrophysiological recordings in humans (Oganian et al., 2023), and suggests that learning may be associated with enhanced integration of the cues that define the spectral envelope and an increased sensitivity to low-frequency spectral peaks. While the highest levels of sensitivity to timbre were observed in the PPF of trained animals, the pattern of cue-weighting was preserved across fields, suggesting that it is widespread within the auditory cortex. This finding is consistent with previous studies in A1 (Keeling et al., 2008; Beitel et al., 2020). At the population level, control animals showed a reasonably flat firing rate distribution across vowels. In contrast, the T-Id animals showed much greater modulation with a reduced firing rate for one trained vowel (/u/, conditioned as “go left”) than the other (/ε/, conditioned as “go right”). Given our recordings were in the left auditory cortex, this pattern is consistent with recent reports that in animals trained in 2AFC experiments tuning within a hemisphere of A1 is optimized for the contralateral stimulus (Znamenskiy and Zador, 2013; Chang et al., 2022), but confirmation of this would require bilateral recordings.

Artificial vowels are low-frequency stimuli (<4 kHz). Our sampling yielded balanced samples of neurons across the frequency axis in both trained and control animals (Fig. 1). Although our experiments were not designed for high-resolution within-field mapping, the data show no indication of an increased proportion of low-frequency recording sites that would be suggestive of tonotopic reorganization after training. We also considered the impact of training on the neural representation of two other sound features. In both groups of trained animals, there was a decrease in F0 sensitivity, despite the fact that the T-Id animals were required to generalize timbre decisions across F0 and the TP-Disc animals actively discriminated both pitch and timbre features. Our sampling approach ensured good coverage of each cortical field, thereby facilitating comparisons between them. By opting to sample multiple cortical areas, however, we may not have had sufficient resolution to detect possible hotspots of pitch sensitivity, such as the "pitch area" proposed by Bendor and Wang (2005) on the basis of marmoset data, which has not yet been corroborated in other mammalian species.

In contrast to the decreased sensitivity to nonspatial sound features, spatial sensitivity changed in trained animals despite the tasks having no spatial component. These changes occurred principally as an increase in spatial sensitivity in the nonprimary fields PPF and PSF, and shifted population tuning from monotonically increasing with contralateral eccentricity to peaking 15° contralateral to the midline. Our spatial receptive fields were coarsely measured, with stimuli at ±15 and ±45°, but if the change in tuning observed in the recorded hemisphere were mirrored on the other side of the brain, it would result in an expanded representation of the midline, where stimuli were presented during behavioral testing. Engaging in a sound discrimination task has been shown to refine spatial tuning in cat A1, with changes occurring for both spatial and, more modestly, nonspatial tasks (Lee and Middlebrooks, 2011). While previous studies suggest that active discrimination of the trained feature is not necessary to drive changes in neural responses (Keeling et al., 2008), this is, to our knowledge, the first report of enhanced location coding after repeated exposure to behaviorally relevant stimuli in a nonspatial task. Furthermore, these changes were observed primarily in the higher-order auditory cortex (PPF and PSF), rather than A1.
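As an illustration of the distinction drawn here between monotonic and midline-peaked spatial tuning, the minimal sketch below classifies a unit's tuning across the four tested azimuths; the firing rates are invented placeholders rather than data from this study, and the classification rule is only one simple way such a distinction could be operationalized.

```python
# Hypothetical sketch: classify spatial tuning over the four tested azimuths
# (+45 and +15 deg contralateral, -15 and -45 deg ipsilateral) as either a
# monotonic contralateral preference or tuning that peaks 15 deg contralateral
# to the midline. Firing rates are placeholders, not data from the study.
import numpy as np

azimuths = np.array([45, 15, -15, -45])    # degrees; positive = contralateral
rates = np.array([18.0, 24.0, 14.0, 9.0])  # placeholder mean spike rates (Hz)

order = np.argsort(azimuths)               # sort locations from ipsi- to contralateral
sorted_rates = rates[order]

if np.all(np.diff(sorted_rates) >= 0):
    tuning = "monotonic contralateral preference"
elif azimuths[np.argmax(rates)] == 15:
    tuning = "peaked 15 deg contralateral to the midline"
else:
    tuning = "other"
print(tuning)
```

Under a rule of this kind, the population shift reported above would correspond to units moving from the first category toward the second after training.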

A limitation of our study is the small sample size in the TP-Disc group. However, despite the limited statistical power and differences in task design, the data from these animals support the main conclusions from the T-Id group: diminished timbre sensitivity overall, increased sensitivity to combinations of features with a down-weighting of sensitivity to ΔF2, and a shift in spatial tuning toward midline locations. A further caveat is that the recordings were performed under anesthesia, which may lead us to underestimate training-related neuronal changes (Chang et al., 2022). The use of anesthesia was nevertheless essential for mapping unit activity across multiple neighboring cortical fields in the same animals; methods for recording in awake animals currently allow only high-density sampling from a small area of cortex or sparse sampling across multiple fields. Recording under anesthesia also has the advantage of allowing us to separate sensitivity to stimulus features from attentional effects, and to measure the receptive field properties to which attentional and task-related modulation is likely added during active listening. Tuning for higher-order perceptual features, such as pitch and timbre, may also be modulated by attention during task engagement, so examining this form of short-term plasticity in future studies would help us better interpret the relevance of the long-term plasticity investigated here.

In summary, training ferrets to discriminate or detect specific features present in artificial vowels has diverse effects on the stimulus sensitivity of neurons in the auditory cortex. Sensitivity to task-relevant stimulus features, which is broadly distributed in naive animals, is modulated by training in a manner that differs across auditory cortical fields. Furthermore, in contrast to control animals, in which responses of cortical neurons are strongly weighted by the second formant frequency of artificial vowels, the neural units recorded in trained animals integrate information about both the first and second formant frequency. Finally, neural sensitivity to a task-orthogonal feature—here auditory space—is enhanced when training stimuli are repeatedly presented from a single location. Since learning triggers widespread changes in gene expression in the auditory cortex (Graham et al., 2023), future work can seek to unpick the specific molecular mechanisms that support changes in auditory cortical function and auditory memory, and consider regions beyond the auditory cortex that might support or drive such changes (Jia et al., 2024).

Data Availability

All code is available on GitHub: https://github.com/huriyeatg/trainingInducedPlasticity. Data are available via https://gin.g-node.org/huriyeatg/trainingInducedPlasticity/ or in the following repository: https://rdr.ucl.ac.uk/account/articles/28659329?file=53231957.
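For convenience, both git-hosted repositories can be retrieved with standard git tooling; the sketch below is only a suggested starting point. It assumes git is installed on the system, and large data files hosted on GIN may additionally require the gin client or git-annex to download in full.

```python
# Minimal sketch for fetching the code and data repositories listed above.
# Assumes git is available on the system PATH; GIN repositories are ordinary
# git repositories, but files stored via git-annex may need the gin client
# (https://gin.g-node.org) to be fully downloaded.
import subprocess

repos = {
    "trainingInducedPlasticity-code": "https://github.com/huriyeatg/trainingInducedPlasticity",
    "trainingInducedPlasticity-data": "https://gin.g-node.org/huriyeatg/trainingInducedPlasticity",
}

for dest, url in repos.items():
    subprocess.run(["git", "clone", url, dest], check=True)
```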

Footnotes

  • This work was supported by a Royal National Institute for Deaf People (RNID; formerly Action on Hearing Loss) PhD studentship to H.A.; Biotechnology and Biological Sciences Research Council grants BB/D009758/1 (to J.W.S., J.K.B., and A.J.K.), BB/M010929/1 (to K.M.W.), and BB/H016813/1 (to J.K.B.); a Royal Society Dorothy Hodgkin Fellowship (J.K.B.); a Sir Henry Dale Fellowship (J.K.B., 098418/Z/12/Z); a European Research Council Consolidator award (SOUND SCENE) to J.K.B.; and a Wellcome Principal Research Fellowship (A.J.K., WT108369/Z/2015/Z). This work was supported in whole, or in part, by the Wellcome Trust. For the purpose of Open Access, the author has applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.

  • The authors declare no competing financial interests.

  • Correspondence should be addressed to Jennifer K. Bizley at j.bizley{at}ucl.ac.uk.

SfN exclusive license.

References

  1. Allen EJ, Burton PC, Olman CA, Oxenham AJ (2017) Representations of pitch and timbre variation in human auditory cortex. J Neurosci 37:1284–1293. https://doi.org/10.1523/JNEUROSCI.2336-16.2016 pmid:28025255
  2. Atiani S, David SV, Elgueda D, Locastro M, Radtke-Schuller S, Shamma SA, Fritz JB (2014) Emergent selectivity for task-relevant stimuli in higher-order auditory cortex. Neuron 82:486–499. https://doi.org/10.1016/j.neuron.2014.02.029 pmid:24742467
  3. Bao S, Chang EF, Woods J, Merzenich MM (2004) Temporal plasticity in the primary auditory cortex induced by operant perceptual learning. Nat Neurosci 7:974–981. https://doi.org/10.1038/nn1293
  4. Beitel RE, Schreiner CE, Vollmer M (2020) Spectral plasticity in monkey primary auditory cortex limits performance generalization in a temporal discrimination task. J Neurophysiol 124:1798–1814. https://doi.org/10.1152/jn.00278.2020 pmid:32997564
  5. Bendor D, Wang X (2005) The neuronal representation of pitch in primate auditory cortex. Nature 436:1161–1165. https://doi.org/10.1038/nature03867 pmid:16121182
  6. Bimbard C, Demene C, Girard C, Radtke-Schuller S, Shamma S, Tanter M, Boubenec Y (2018) Multi-scale mapping along the auditory hierarchy using high-resolution functional ultrasound in the awake ferret. eLife 7:e35028. https://doi.org/10.7554/eLife.35028 pmid:29952750
  7. Bizley JK, Cohen YE (2013) The what, where and how of auditory-object perception. Nat Rev Neurosci 14:693–707. https://doi.org/10.1038/nrn3565 pmid:24052177
  8. Bizley JK, Nodal FR, Nelken I, King AJ (2005) Functional organization of ferret auditory cortex. Cereb Cortex 15:1637–1653. https://doi.org/10.1093/cercor/bhi042
  9. Bizley JK, Walker KM, King AJ, Schnupp JW (2013) Spectral timbre perception in ferrets: discrimination of artificial vowels under different listening conditions. J Acoust Soc Am 133:365–376. https://doi.org/10.1121/1.4768798 pmid:23297909
  10. Bizley JK, Walker KM, Silverman BW, King AJ, Schnupp JW (2009) Interdependent encoding of pitch, timbre, and spatial location in auditory cortex. J Neurosci 29:2064–2075. https://doi.org/10.1523/JNEUROSCI.4755-08.2009 pmid:19228960
  11. Brown M (2004) Perceptual learning on an auditory frequency discrimination task by cats: association with changes in primary auditory cortex. Cereb Cortex 14:952–965. https://doi.org/10.1093/cercor/bhh056
  12. Chang S, Xu J, Zheng M, Keniston L, Zhou X, Zhang J, Yu L (2022) Integrating visual information into the auditory cortex promotes sound discrimination through choice-related multisensory integration. J Neurosci 42:8556–8568. https://doi.org/10.1523/JNEUROSCI.0793-22.2022 pmid:36150889
  13. Elgueda D, Duque D, Radtke-Schuller S, Yin P, David SV, Shamma SA, Fritz JB (2019) State-dependent encoding of sound and behavioral meaning in a tertiary region of the ferret auditory cortex. Nat Neurosci 22:447–459.
  14. Engineer CT, Perez CA, Carraway RS, Chang KQ, Roland JL, Kilgard MP (2014) Speech training alters tone frequency tuning in rat primary auditory cortex. Behav Brain Res 258:166–178. https://doi.org/10.1016/j.bbr.2013.10.021 pmid:24344364
  15. Galindo-Leon EE, Lin FG, Liu RC (2009) Inhibitory plasticity in a lateral band improves cortical detection of natural vocalizations. Neuron 62:705–716. https://doi.org/10.1016/j.neuron.2009.05.001 pmid:19524529
  16. Graham G, Chimenti MS, Knudtson KL, Grenard DN, Co L, Sumner M, Tchou T, Bieszczad KM (2023) Learning induces unique transcriptional landscapes in the auditory cortex. Hear Res 438:108878. https://doi.org/10.1016/j.heares.2023.108878 pmid:37659220
  17. Irvine DRF (2018) Auditory perceptual learning and changes in the conceptualization of auditory cortex. Hear Res 366:3–16. https://doi.org/10.1016/j.heares.2018.03.011
  18. Jia G, et al. (2024) Auditory training remodels hippocampus-related memory in adult rats. Cereb Cortex 34:bhae045. https://doi.org/10.1093/cercor/bhae045
  19. Keeling MD, Calhoun BM, Krüger K, Polley DB, Schreiner CE (2008) Spectral integration plasticity in cat auditory cortex induced by perceptual training. Exp Brain Res 184:493–509. https://doi.org/10.1007/s00221-007-1115-9 pmid:17896103
  20. Lee CC, Middlebrooks JC (2011) Auditory cortex spatial sensitivity sharpens during task performance. Nat Neurosci 14:108–114. https://doi.org/10.1038/nn.2713 pmid:21151120
  21. Maor I, Shwartz-Ziv R, Feigin L, Elyada Y, Sompolinsky H, Mizrahi A (2019) Neural correlates of learning pure tones or natural sounds in the auditory cortex. Front Neural Circuits 13:82. https://doi.org/10.3389/fncir.2019.00082 pmid:32047424
  22. Mesgarani N, Chang EF (2012) Selective cortical representation of attended speaker in multi-talker speech perception. Nature 485:233–236. https://doi.org/10.1038/nature11020 pmid:22522927
  23. Mohn JL, Baese-Berk MM, Jaramillo S (2024) Selectivity to acoustic features of human speech in the auditory cortex of the mouse. Hear Res 441:108920. https://doi.org/10.1016/j.heares.2023.108920 pmid:38029503
  24. Nelken I, Bizley JK, Nodal FR, Ahmed B, Schnupp JW, King AJ (2004) Large-scale organization of ferret auditory cortex revealed using continuous acquisition of intrinsic optical signals. J Neurophysiol 92:2574–2588. https://doi.org/10.1152/jn.00276.2004
  25. Oganian Y, Bhaya-Grossman I, Johnson K, Chang EF (2023) Vowel and formant representation in the human auditory speech cortex. Neuron 111:2105–2118.e4. https://doi.org/10.1016/j.neuron.2023.04.004 pmid:37105171
  26. Otazu GH, Tai LH, Yang Y, Zador AM (2009) Engaging in an auditory task suppresses responses in auditory cortex. Nat Neurosci 12:646–654. https://doi.org/10.1038/nn.2306 pmid:19363491
  27. Polley DB, Steinberg EE, Merzenich MM (2006) Perceptual learning directs auditory cortical map reorganization through top-down influences. J Neurosci 26:4970–4982. https://doi.org/10.1523/JNEUROSCI.3771-05.2006 pmid:16672673
  28. Quiroga RQ, Nadasdy Z, Ben-Shaul Y (2004) Unsupervised spike detection and sorting with wavelets and superparamagnetic clustering. Neural Comput 16:1661–1687. https://doi.org/10.1162/089976604774201631
  29. Rauschecker JP, Tian B (2000) Mechanisms and streams for processing of “what” and “where” in auditory cortex. Proc Natl Acad Sci U S A 97:11800–11806. https://doi.org/10.1073/pnas.97.22.11800 pmid:11050212
  30. Reed A, Riley J, Carraway R, Carrasco A, Perez C, Jakkamsetti V, Kilgard MP (2011) Cortical map plasticity improves learning but is not necessary for improved performance. Neuron 70:121–131. https://doi.org/10.1016/j.neuron.2011.02.038
  31. Rutkowski RG, Weinberger NM (2005) Encoding of learned importance of sound by magnitude of representational area in primary auditory cortex. Proc Natl Acad Sci U S A 102:13664–13669. https://doi.org/10.1073/pnas.0506838102 pmid:16174754
  32. Schnupp JW, Hall TM, Kokelaar RF, Ahmed B (2006) Plasticity of temporal pattern codes for vocalization stimuli in primary auditory cortex. J Neurosci 26:4785–4795. https://doi.org/10.1523/JNEUROSCI.4330-05.2006 pmid:16672651
  33. Schreiner CE, Polley DB (2014) Auditory map plasticity: diversity in causes and consequences. Curr Opin Neurobiol 24:143–156. https://doi.org/10.1016/j.conb.2013.11.009 pmid:24492090
  34. Shepard KN, Chong KK, Liu RC (2016) Contrast enhancement without transient map expansion for species-specific vocalizations in core auditory cortex during learning. eNeuro 3:ENEURO.0318-16.2016. https://doi.org/10.1523/ENEURO.0318-16.2016 pmid:27957529
  35. Town SM, Atilgan H, Wood KC, Bizley JK (2015) The role of spectral cues in timbre discrimination by ferrets and humans. J Acoust Soc Am 137:2870–2883. https://doi.org/10.1121/1.4916690 pmid:25994714
  36. Town SM, Poole KC, Wood KC, Bizley JK (2023) Reversible inactivation of ferret auditory cortex impairs spatial and nonspatial hearing. J Neurosci 43:749–763. https://doi.org/10.1523/JNEUROSCI.1426-22.2022 pmid:36604168
  37. Town SM, Wood KC, Bizley JK (2018) Sound identity is represented robustly in auditory cortex during perceptual constancy. Nat Commun 9:4786. https://doi.org/10.1038/s41467-018-07237-3 pmid:30429465
  38. Walker KM, Bizley JK, King AJ, Schnupp JW (2011) Multiplexed and robust representations of sound features in auditory cortex. J Neurosci 31:14565–14576. https://doi.org/10.1523/JNEUROSCI.2074-11.2011 pmid:21994373
  39. Walker KM, Davies A, Bizley JK, Schnupp JW, King AJ (2017) Pitch discrimination performance of ferrets and humans on a go/no-go task. bioRxiv 165852. https://doi.org/10.1101/165852
  40. Walker KM, Schnupp JW, Hart-Schnupp SM, King AJ, Bizley JK (2009) Pitch discrimination by ferrets for simple and complex sounds. J Acoust Soc Am 126:1321–1335. https://doi.org/10.1121/1.3179676 pmid:19739746
  41. Whitton JP, Hancock KE, Polley DB (2014) Immersive audiomotor game play enhances neural and perceptual salience of weak signals in noise. Proc Natl Acad Sci U S A 111:E2606–E2615. https://doi.org/10.1073/pnas.1322184111 pmid:24927596
  42. Znamenskiy P, Zador AM (2013) Corticostriatal neurons in auditory cortex drive decisions during auditory discrimination. Nature 497:482–485. https://doi.org/10.1038/nature12077 pmid:23636333
Keywords

  • auditory
  • decoding
  • ferret
  • learning
  • plasticity
  • timbre
