Abstract
Sound discrimination is essential in many species for communicating and foraging. Bats, for example, use sounds for echolocation and communication. In the bat auditory cortex there are neurons that process both sound categories, but how these neurons respond to acoustic transitions, that is, echolocation streams followed by a communication sound, remains unknown. Here, we show that the acoustic context, a leading sound sequence followed by a target sound, changes neuronal discriminability of echolocation versus communication calls in the cortex of awake bats of both sexes. Nonselective neurons that fire equally well to both echolocation and communication calls in the absence of context become category selective when leading context is present. Conversely, neurons that prefer communication sounds in the absence of context turn into nonselective ones when context is added. The presence of context leads to an overall response suppression, but the strength of this suppression is stimulus specific. Suppression is strongest when context and target sounds belong to the same category, e.g., echolocation followed by echolocation. A neuron model of stimulus-specific adaptation replicated our results in silico. The model predicts selectivity to communication and echolocation sounds in the inputs arriving at the auditory cortex, as well as two forms of adaptation, presynaptic frequency-specific adaptation acting on cortical inputs and stimulus-unspecific postsynaptic adaptation. In addition, the model predicted that context effects can last up to 1.5 s after context offset and that synaptic inputs tuned to low-frequency sounds (communication signals) have the shortest decay constant of presynaptic adaptation.
SIGNIFICANCE STATEMENT We studied cortical responses to isolated calls and call mixtures in awake bats and show that (1) two neuronal populations coexist in the bat cortex, including neurons that discriminate social from echolocation sounds well and neurons that are equally driven by these two ethologically different sound types; (2) acoustic context (i.e., other natural sounds preceding the target sound) affects natural sound selectivity in a manner that could not be predicted based on responses to isolated sounds; and (3) a computational model similar to those used for explaining stimulus-specific adaptation in rodents can account for the responses observed in the bat cortex to natural sounds. This model depends on segregated feedforward inputs, synaptic depression, and postsynaptic neuronal adaptation.
Introduction
Studying how vocalizations are processed in the brain of listeners is key to understanding the neurobiology of acoustic communication. Bats are highly vocal social animals. They live in complex auditory environments filled with mixtures of their own calls and vocalizations emitted by conspecifics. Bat vocalizations can be classified into two categories, namely, echolocation and communication. Echolocation is primarily used for navigation (Neuweiler, 1989; Thomas et al., 2005), and communication calls convey a variety of messages to conspecifics (Gillam and Fenton, 2016; Chaverri et al., 2018).
Previous studies in the mustached bat described that neurons in specialized areas of the auditory cortex (AC) not only exhibit combination sensitivity for echolocation pulses (O'Neill and Suga, 1979; Fitzpatrick et al., 1993) but also to communication calls (Ohlemiller et al., 1996; Kanwal, 1999). Although several studies clearly indicated that within the AC both echolocation and communication signals can be processed by the same neurons (Esser et al., 1997; Medvedev and Kanwal, 2004, 2008; Kanwal, 2012), how these multifunctional neurons process combinations of these two signals is still unknown. The latter constitutes a relevant question for neuroethology because acoustic transitions, that is, communication sounds following echolocation sequences or vice versa, are common in the natural environments of bats (Fig. 1).
The main aim of the present study is to characterize how AC neurons respond to the acoustic transitions mentioned above. The study was performed in the bat species Carollia perspicillata. The echolocation and communication calls of C. perspicillata differ markedly in frequency composition, with energy peaking between 12 and 40 kHz in communication sounds and above 60 kHz in echolocation signals (Porter, 2010; Knörnschild et al., 2014; Hechavarría et al., 2016). Neurons in nontonotopic areas (high-frequency fields) of the AC of C. perspicillata exhibit multipeaked frequency tuning curves peaking at frequencies close to 20 and 60 kHz, which closely match the peak frequency of communication and echolocation sounds, respectively (Hagemann et al., 2010, 2011).
We tested how neurons in nontonotopic areas of the AC respond to combinations of natural echolocation and communication sounds. As an example of communication calls, we chose distress vocalizations, a type of communication sound emitted under duress, which carries strong communicative power and has been described extensively (Knörnschild et al., 2014; Hechavarría et al., 2016, 2020; González-Palomares et al., 2021). Our goal was to test how AC neurons with multipeaked frequency tuning, which could potentially respond to both echolocation and communication signals, process sound mixtures in which a leading sequence of natural sounds (termed context sequence) is followed by a probe sound. Context and probe could match (e.g., echolocation following echolocation) or mismatch each other (e.g., communication following echolocation).
Single-neuron recordings revealed that leading echolocation and communication sequences modulate cortical responses to lagging sounds. Context-triggered suppression was stimulus specific, with suppression being strongest when both context and probe sounds belonged to the same acoustic category (matched treatment). Context effects suppressed responses to probe sounds even 400 ms after context offset. An integrate-and-fire neuron model, similar to synaptic models designed to explain stimulus-specific adaptation to artificial sounds in rodents (Taaseh et al., 2011; Hershenhoren et al., 2014), could replicate our empirical results. The model predicted high selectivity to communication and echolocation sounds in the inputs arriving in AC neurons of bats, as well as two forms of adaptation, presynaptic frequency-specific adaptation acting in cortical inputs and stimulus-unspecific postsynaptic adaptation. The model also predicted the temporal course of synaptic depression, which differed between the two types of acoustic contexts tested, with faster adaptation in response to communication than to echolocation sounds.
Our findings have two main implications. (1) Context-driven suppression can change the abilities of neurons to discriminate between ethologically relevant natural sounds in awake, passive listening animals, and (2) models of stimulus-specific adaptation developed for explaining responses to artificial stimuli in classic laboratory animals (i.e., rodents) also can explain responses to natural ethologically relevant sounds in bats.
Materials and Methods
Animal preparation
Six bats (two female, Carollia perspicillata) were used in this study. They were taken from a breeding colony in the Institute for Cell Biology and Neuroscience at the Goethe University in Frankfurt am Main, Germany. All experiments were conducted in accordance with the Declaration of Helsinki and local regulations in the state of Hessen (Experimental permit FU1126, Regierungspräsidium Darmstadt).
The bats were anesthetized with a mixture of ketamine (100 mg/ml Ketavet; Pfizer) and xylazine hydrochloride (23.32 mg/ml Rompun; Bayer). Under deep anesthesia, the dorsal surface of the skull was exposed. The underlying muscles were retracted with an incision along the midline. A custom-made metal rod was glued to the skull using dental cement to fix the head during electrophysiological recordings. After the surgery, the animals recovered for 2 d before participating in the experiments.
On the first day of recordings, a craniotomy was performed with a scalpel blade over the left auditory region of the cortex. In particular, the caudoventral region of the auditory cortex was exposed, spanning the primary and secondary auditory cortices (AI and AII, respectively), the dorsoposterior field, and the high-frequency (HF) fields. These areas were located by following blood vessel patterns and the position of the pseudocentral sulcus (Esser and Eiermann, 1999; Hagemann et al., 2010).
Neuronal recordings
In all six bats, recordings were performed over a maximum of 14 d. Experiments were conducted in awake animals. Bats were head fixed and positioned in a custom-made holder over a warming pad set to a temperature of 27°C. Local anesthesia (Ropivacaine 1%, AstraZeneca) was administered topically over the skull before each session. Each recording session lasted a maximum of 4 h.
All experiments were performed in an electrically isolated and soundproof chamber. For neural recordings, carbon-tip electrodes (impedance ranged from 0.4 to 1.2 MΩ) were attached to an electrode holder and connected via a preamplifier to a Dagan four-channel amplifier (Dagan EX4-400 Quad Differential Amplifier, gain = 50, filter low cut = 0.03 Hz, high cut = 3 kHz). Analog-to-digital conversion was achieved using a sound card (RME ADI-2 Pro, sample rate = 192 kHz). Electrodes were driven into the cortex with the aid of a Piezo manipulator (PM 10/1, Science Products). Single-unit auditory responses were located at depths of 308 ± 79 µm (mean ± SD) using a broadband search stimulus (a downward frequency-modulated communication sound of the same bat species) that triggered activity in both low- and high-frequency-tuned neurons. A similar paradigm has been used in previous studies to locate neuronal responses (Martin et al., 2017).
Acoustic stimulation
We used natural sounds to trigger neural activity during the recordings. The natural sounds were obtained from the same species in previous studies from our lab (Beetz et al., 2016; Hechavarría et al., 2016). Acoustic signals were generated with an RME ADI-2 Pro sound card and amplified by a custom-made amplifier. Sounds were then produced by a calibrated speaker (NeoCD1.0 Ribbon Tweeter; Fountek Electronics), which was placed 15 cm in front of the right ear of the bat. The calibration curve of the speaker was calculated with a microphone (model #4135, Brüel & Kjaer).
Once an auditory neuron was located, we determined the isolevel frequency tuning of the unit with 20 ms pure tones (0.5 ms rise/fall time) presented randomly in the range of frequencies from 10 to 90 kHz (5 kHz steps, 20 trials) at a fixed sound pressure level of 80 dB SPL. This was done only in a subset of the neurons recorded (55 of a total of 74). Our previous studies indicate that 80 dB SPL represents a good compromise, as it is strong enough to drive activity in most AC neurons and allows differentiation between the single-peaked and double-peaked tuning curves typical of the AC of this bat (López-Jury et al., 2020). The repetition rate for stimulus presentation was 2 Hz.
We studied context-dependent auditory responses using two types of acoustic contexts, sequences of echolocation pulses and sequences of communication calls (distress calls). We refer to these two sequences of natural vocalizations as context because they preceded the presentation of probe sounds that were either a single echolocation pulse or a single distress syllable. The properties of the contexts used for stimulation are shown in Figure 2A,B. In a nutshell, the echolocation sequence was composed of high-frequency vocalizations and echoes of the vocalizations (carrier frequencies > 50 kHz) repeated at intervals of ∼30 ms (Fig. 2A,B, left). The echolocation sequence was recorded from a bat swung in a pendulum following procedures described previously (Beetz et al., 2016). The distress sequence, on the other hand, was composed of individual syllables with peak frequencies of ∼23 kHz (Fig. 2A,B, right). Distress syllables occurred in groups (so-called bouts) repeated at intervals of ∼60 ms, and within the bouts, syllables occurred at intervals of ∼15 ms (Hechavarría et al., 2016). We chose to study these two acoustic contexts because they rely on fundamentally different acoustic parameters and are linked to distinct behaviors, that is, echolocation and calling under duress (distress).
Probe sounds (sounds that followed the context) were a single echolocation pulse and a single distress syllable, each obtained from the respective context sequences (Fig. 2C). The sound level of both probes was fixed to 73 dB SPL. These levels resemble a bat vocalizing 1.5 m away from the listener. Distress calls at similar intensities have been recorded in our previous work with a microphone 1.5 m away from the bats (median of 70 dB SPL; Hechavarría et al., 2016). On the other hand, the emitted call intensity of echolocation pulses in this bat species has been reported as source levels (referenced to 10 cm away from the mouth of the bat) averaging 99 dB SPL (Brinklov et al., 2011). Considering spherical spreading (6 dB/doubling of distance) and atmospheric attenuation at a peak frequency of 90 kHz (∼2.2 dB/m; Lawrence and Simmons, 1982), 73 dB SPL would correspond to a bat echolocating 1.6 m away.
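For orientation, this level arithmetic can be reproduced in a few lines of Python. This is an illustrative sketch (the helper name and defaults are ours, not part of the original analysis), assuming 6 dB of spherical spreading per doubling of distance and ∼2.2 dB/m of atmospheric attenuation:

```python
import math

def received_level(source_db_spl, distance_m, ref_distance_m=0.1, atten_db_per_m=2.2):
    """Received level after spherical spreading (6 dB per distance doubling)
    and linear atmospheric attenuation, relative to the reference distance."""
    spreading_loss = 20 * math.log10(distance_m / ref_distance_m)
    atmospheric_loss = atten_db_per_m * (distance_m - ref_distance_m)
    return source_db_spl - spreading_loss - atmospheric_loss

# A 99 dB SPL echolocation pulse (referenced to 10 cm) heard from ~1.6 m away:
print(round(received_level(99, 1.6), 1))  # ~71.6 dB SPL, close to the 73 dB SPL probe level
```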
We tested two temporal separations (gaps) between the context offset and probe onset. A previous study in the same region of the AC of C. perspicillata measured neuronal responses to a masker-probe stimulation paradigm using systematically spaced delays. The neurons started showing responses to the probes only 50 ms after the masker offset; at shorter delays it was difficult to disentangle responses to maskers (i.e., the last elements of the context sequence) from those evoked by the probe sounds (Martin et al., 2017). We therefore used gaps above 50 ms, either in the same order of magnitude (60 ms) or much longer (416 ms), to measure the context effect in our paradigm. Especially in bats, short delays could lead to response facilitation, which, although interesting, was outside the main scope of this study and has been characterized before in the context of echolocation, that is, delay tuning (Wenstrup and Portfors, 2011; Suga, 2015), and communication (Esser et al., 1997). Therefore, a total of eight context stimuli (2 contexts × 2 probes × 2 gaps) were randomly presented and repeated 20 times to awake bats during electrophysiological recordings—echolocation context followed by echolocation probe, echolocation context followed by distress probe, distress context followed by echolocation probe, and distress context followed by distress probe. In addition, we presented each probe after 3.5 s of silence (no context) and repeated it 20 times.
Data analysis
Spike clustering.
All the recording analyses, including spike sorting, were conducted using custom-written MATLAB scripts (R2018b, MathWorks). The raw signal was filtered between 300 Hz and 3 kHz using a bandpass Butterworth filter (third order). To extract spikes from the filtered signal, we detected negative peaks whose amplitude exceeded 3 SDs of the recording baseline; the waveform spanning 1 ms before the peak and 2 ms after it was considered one spike. The spike waveforms were sorted using an automatic clustering algorithm, KlustaKwik, which uses results from principal component analysis to create spike clusters (Harris et al., 2000). For each recording, we considered only the spike cluster with the highest number of spikes.
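A minimal Python sketch of this detection step is given below; the original pipeline used custom MATLAB scripts, so the function name and defaults here are illustrative only, and the KlustaKwik clustering step is omitted:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def detect_spikes(raw, fs, pre_ms=1.0, post_ms=2.0, n_sd=3.0):
    """Band-pass filter the raw trace and cut out candidate spike waveforms
    around negative peaks exceeding n_sd standard deviations."""
    sos = butter(3, [300, 3000], btype='bandpass', fs=fs, output='sos')
    filtered = sosfiltfilt(sos, raw)
    threshold = -n_sd * np.std(filtered)                  # negative-peak criterion
    pre, post = int(pre_ms * fs / 1000), int(post_ms * fs / 1000)
    times, waveforms = [], []
    for i in np.flatnonzero(filtered < threshold):
        # keep only samples that are the local minimum of their own window
        if pre <= i < len(filtered) - post and filtered[i] == filtered[i - pre:i + post].min():
            times.append(i / fs)
            waveforms.append(filtered[i - pre:i + post])  # 1 ms before, 2 ms after
    return np.array(times), np.array(waveforms)
```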
Neuronal classification.
A total of 74 units were considered responsive to at least one of the probe sounds tested. A unit was considered responsive if the number of spikes fired in response to the sound in question was above the 95% confidence level calculated for spontaneous firing of the same unit (calculated over 200 ms before the start of each trial). Evoked firing had to surpass this threshold for at least 8 ms after probe onset for a unit to be considered responsive. To test for natural sound preferences in each unit (i.e., whether units preferred echolocation vs distress sounds or vice versa), we used the responses to the probe sounds when they were preceded by silence (no context). The spike counts during 50 ms after each probe onset were compared using a nonparametric effect size metric, Cliff's delta. Cliff's delta quantifies effect size by calculating how often values in one distribution are larger than values in a second distribution. Distribution here refers to the number of spikes fired in presentations of echolocation and communication probe trials. The statistic gives values from −1 to 1, with identical groups rendering values of zero. Following previous studies (Romano et al., 2006), if the effect size between the two distributions was negligible or small [abs(Cliff's delta) ≤ 0.3], the unit was classified as equally responsive to both probes. On the other hand, if abs(Cliff's delta) > 0.3, the unit was classified as either preference for echolocation or preference for communication. The preference was assigned to the probe that evoked the highest number of spikes across trials. If a unit was responsive to only one of the probes, it was classified as responsive to only echolocation or only communication. We checked the classification criterion by using nonparametric Wilcoxon rank-sum tests. The rank-sum test failed to reject the null hypothesis when comparing spike counts of the responses to both probes in all the units classified as equally responsive based on the Cliff's delta metric. In contrast, in all 16 units considered as preference for communication, the null hypothesis was rejected, that is, they showed significant differences in spike counts across trials (p < 0.05, Wilcoxon rank-sum test).
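The classification logic can be condensed into a few lines; the sketch below is a hypothetical Python reimplementation (the original analysis used MATLAB), with function names of our own choosing:

```python
import numpy as np

def cliffs_delta(x, y):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs; ranges from -1 to 1,
    with identical distributions giving values near zero."""
    x, y = np.asarray(x), np.asarray(y)
    greater = np.sum(x[:, None] > y[None, :])
    smaller = np.sum(x[:, None] < y[None, :])
    return (greater - smaller) / (x.size * y.size)

def classify_preference(echolocation_counts, communication_counts, cutoff=0.3):
    """Label a unit from its per-trial spike counts to the two probes (no context)."""
    d = cliffs_delta(echolocation_counts, communication_counts)
    if abs(d) <= cutoff:
        return 'equally responsive'
    return 'preference for echolocation' if d > 0 else 'preference for communication'
```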
The isolevel tuning curves were obtained from the average of spike counts across trials and normalized to the maximum response of each unit. Units were classified as low-frequency tuned if the normalized response was higher than or equal to 0.6 at any frequency lower than 50 kHz and lower than 0.6 for all frequencies higher than or equal to 50 kHz. High-frequency-tuned units were, conversely, those whose normalized response reached 0.6 only at frequencies of 50 kHz or higher. Units whose normalized response reached 0.6 at frequencies both below and above 50 kHz were classified as multipeaked.
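For clarity, this criterion can be stated compactly in code (a hypothetical helper, assuming responses normalized to the unit's peak response):

```python
import numpy as np

def classify_tuning(freqs_khz, responses, thr=0.6, split_khz=50):
    """Classify an isolevel tuning curve (80 dB SPL) as low-frequency tuned,
    high-frequency tuned, or multipeaked, per the 0.6 normalized-response rule."""
    f = np.asarray(freqs_khz)
    r = np.asarray(responses, dtype=float)
    r = r / r.max()                        # normalize to the unit's peak response
    low = np.any(r[f < split_khz] >= thr)
    high = np.any(r[f >= split_khz] >= thr)
    if low and high:
        return 'multipeaked'
    return 'low-frequency tuned' if low else 'high-frequency tuned'
```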
Quantifying the effects of leading acoustic context.
To test for effects of leading acoustic context on probe-triggered responses, we tested (using a nonparametric Wilcoxon rank-sum test) whether the response to the probe, in terms of number of spikes across trials occurring after the context, was significantly different from the responses to the same sound presented without any context. To quantify the magnitude of the effect, we used the following equation:

Context effect = (Rcontext − Rsilence) / Rsilence,

where Rcontext and Rsilence are the mean spike counts evoked by the probe when preceded by context and by silence, respectively. Negative values indicate suppression (e.g., a context effect of −0.48 corresponds to 48% suppression).
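Under the reconstruction above, the index can be computed as follows (an illustrative sketch, not the original analysis code):

```python
import numpy as np

def context_effect(spikes_context, spikes_silence):
    """Normalized change in probe-evoked spiking relative to the silence
    condition; negative values indicate suppression (e.g., -0.48 = 48%)."""
    r_context = np.mean(spikes_context)   # mean spike count across context trials
    r_silence = np.mean(spikes_silence)   # mean spike count across silence trials
    return (r_context - r_silence) / r_silence
```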
Call-evoked responses in equally responsive neurons
To determine the latency and duration of single-unit responses to the probes in the no-context condition, we convolved the peristimulus time histograms (PSTHs) with a Gaussian kernel with a bin size of 1 ms and a bandwidth of 5 ms to obtain a spike density function. The latency of the response corresponded to the time at which the spike density function exceeded 50% of the peak response. The response ended when the density function decreased below the 50% threshold. Response duration was calculated as the difference between the end and the latency of the response. Together with the spiking data, local field potentials (LFPs) were obtained from the same electrode with a bandpass filter (0.1–300 Hz). Call-evoked LFPs were averaged and z normalized across trials. To estimate the energy of the LFPs, we calculated the absolute value of the Hilbert transform of the signal within a time window of 0–150 ms. To quantify aspects of the temporal structure of the call-evoked LFPs, we calculated the times of the negative and positive peaks, and the time at which the potential crossed zero between them. Paired statistics were used to compare these measurements between echolocation and communication probes per unit (Wilcoxon signed-rank test).
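The sketch below illustrates these measurements with scipy in place of the original MATLAB routines; helper names and argument defaults are ours:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import hilbert

def response_latency_duration(psth, bin_ms=1.0, bw_ms=5.0):
    """Latency and duration from a PSTH via a Gaussian-smoothed spike density
    function and a 50%-of-peak criterion."""
    sdf = gaussian_filter1d(np.asarray(psth, float), sigma=bw_ms / bin_ms)
    above = np.flatnonzero(sdf >= 0.5 * sdf.max())
    latency = above[0] * bin_ms                 # first crossing of the 50% threshold
    duration = (above[-1] - above[0]) * bin_ms  # end of response minus latency
    return latency, duration

def lfp_energy(lfp, fs, window_ms=150):
    """LFP energy as the mean magnitude of the Hilbert transform over 0-150 ms."""
    n = int(window_ms * fs / 1000)
    return np.mean(np.abs(hilbert(lfp[:n])))
```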
Modeling possible synaptic origins of contextual neural response modulation
We modeled a broadly tuned cortical neuron that reproduces the behavior of equally responsive neurons observed in our experiments, using an integrate-and-fire neuron model. Our model (described in detail below) receives input from two narrow frequency channels and includes synaptic adaptation (i.e., activity-dependent depression) in the cortical input synapses. Frequency-specific adaptation in multiple synaptic channels has been proposed to mediate stimulus-specific adaptation in auditory cortex, including during speech processing (Taaseh et al., 2011; May and Tiitinen, 2013). All simulations were performed using the Python package Brian2, version 2.3 (Stimberg et al., 2019).
In the model proposed, the membrane potential (Vm) of the neuron evolves according to the following differential equation:

τm dVm/dt = (Erest − Vm) + ge(Ee − Vm)/gL,

where Erest is the resting potential, gL the leak conductance, Ee the reversal potential of the excitatory synapses, and ge the excitatory synaptic conductance.
The neuron fires when Vm reaches the membrane potential threshold, after which Vm is reset to Erest. The threshold itself is adaptive; it increases by a fixed amount with every spike fired by the neuron and decays back to its resting value with a fixed time constant, implementing spike-dependent (postsynaptic) adaptation.
The model includes conductance-based excitatory synapses; every synapse creates a fluctuation of conductance ge that decays in time with time constant τe (dge/dt = −ge/τe).
When an action potential arrives at a synapse, the excitatory conductance ge increases in the postsynaptic neuron according to ge → ge + we·A, where we is the maximum synaptic weight of the synapse and A (between 0 and 1) is its current synaptic strength, which adapts with presynaptic activity (see below, short-term synaptic adaptation).
The neuron model has two excitatory synapses that receive inputs, which differ in their preference for natural sound stimulation. The input of synapse j corresponds to a spike train that follows an inhomogeneous Poisson process with a rate proportional to the envelope of the natural sound presented, scaled by the selectivity of that input for the corresponding sound category (see below).
Model parameters and data fitting
All the parameters used in the model are indicated in Table 1. The parameters associated with intrinsic properties of the neuron model were set to qualitatively mimic the extracellular activity of cortical neurons. The parameters that determine the dynamics of the adaptive threshold (the per-spike threshold increment and its recovery time constant) were fitted to reproduce the postsynaptic adaptation observed in the recorded responses.
In our model the reversal potential Ee of excitatory synapses was set to a standard value used for modeling AMPA synapses (Clopath et al., 2010). The time constant τe of the excitatory conductance was likewise set to a value typical of AMPA-mediated currents.
The maximum synaptic weight we of each synapse was set to fit experimental data. We ran 100 simulations, systematically changing the maximum synaptic weight of each synapse independently from 1 to 20 nS, in steps of 2 nS. For each simulation, we statistically compared (using a Wilcoxon rank-sum test) 50 neuron models with the 45 units that were classified as equally responsive, with regard to the Cliff's delta values between probe responses. In addition, the chosen synaptic weights were constrained to approximately fit the number of spikes evoked by each probe observed empirically.
We assumed that the spiking of the inputs was tuned to the spectral content of natural sounds: high-frequency tuned (≥45 kHz) and low-frequency tuned (<45 kHz). Considering the spectral distribution of the vocalizations used as context (Fig. 2B), the high-frequency-tuned input was set to be responsive to echolocation as well as to communication signals, and the low-frequency-tuned input to communication only. The input selectivity to natural sounds is given by the parameter that scales the firing rate of each input in response to each sound category (Table 2).
Our model includes short-term synaptic adaptation (i.e., activity-dependent depression) that depends essentially on two parameters, the synaptic decrement applied at each presynaptic spike and the time constant with which the synaptic strength recovers. Both parameters were fitted so that the model replicated the discriminability and context effect indexes measured in vivo at the two gaps tested (60 and 416 ms).
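A minimal Brian2 sketch of this architecture is shown below. Parameter values are placeholders rather than the fitted values of Table 1, the two inputs are reduced to constant-rate Poisson processes instead of envelope-following ones, and the variable names (x for synaptic strength, u for the synaptic decrement) are our own:

```python
from brian2 import *

# Constants (illustrative values only, not the fitted parameters of Table 1)
tau_m, tau_e = 10*ms, 5*ms
E_rest, E_e = -70*mV, 0*mV
g_L = 10*nS
v_th0, tau_th, th_inc = -50*mV, 300*ms, 2*mV   # adaptive-threshold dynamics
tau_rec, u = 800*ms, 0.2                       # depression recovery and decrement

# Conductance-based LIF with an adaptive threshold (postsynaptic adaptation)
eqs = '''
dv/dt = ((E_rest - v) + g_e*(E_e - v)/g_L) / tau_m : volt
dg_e/dt = -g_e / tau_e : siemens
dv_th/dt = (v_th0 - v_th) / tau_th : volt
'''
neuron = NeuronGroup(1, eqs, threshold='v > v_th',
                     reset='v = E_rest; v_th += th_inc', method='euler')
neuron.v, neuron.v_th = E_rest, v_th0

# Two Poisson inputs stand in for the low- and high-frequency-tuned channels;
# in the full model their rates follow the envelopes of the natural sounds.
inputs = PoissonGroup(2, rates=[80, 80]*Hz)

# Depressing synapses: strength x decrements at each presynaptic spike and
# recovers toward 1 with tau_rec (presynaptic, frequency-specific adaptation)
syn = Synapses(inputs, neuron,
               model='''w : siemens
                        dx/dt = (1 - x) / tau_rec : 1 (event-driven)''',
               on_pre='''g_e_post += w * x
                         x = x * (1 - u)''')
syn.connect()
syn.w = 8*nS
syn.x = 1

spikes = SpikeMonitor(neuron)
run(500*ms)
print(spikes.count[0], 'spikes in 500 ms')
```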
Finally, we modified our model to reproduce experimental data obtained in units classified as preference for communication. We increased the input selectivity for communication sounds (Table 2) and ran several simulations changing the maximum synaptic weight we of each synapse. A higher synaptic weight in the low-frequency synapse and a lower weight in the high-frequency synapse turned our neuron model into a preference for communication unit with firing rates similar to those observed in the experiments. Without changing any other parameter, we observed that the model reproduces the behavior of these neurons after context.
Results
We investigated how auditory cortex neurons of the bat C. perspicillata process two types of natural vocalizations, echolocation pulses and distress calls (a type of communication call, hereafter referred to as communication calls). The goal was to assess how a leading context of behaviorally relevant sounds—echolocation and communication vocalizations—affects cortical responses to lagging sounds (called probes throughout the text; Fig. 2). Probe sounds could either match the preceding sequence (e.g., echolocation occurring after echolocation; Fig. 2C) or mismatch it (e.g., communication following echolocation). In C. perspicillata, the main energy of echolocation and communication calls occurs at different frequencies, ∼66 kHz (Fig. 2B, second histogram) and ∼23 kHz (Fig. 2B, fourth histogram), respectively. In addition to frequency composition, the sequences used in this study also differ in temporal structure (Fig. 2B, first and third histograms). Probes were presented at two lags after context offset, 60 and 416 ms. These two delay values were chosen because they allow a clear separation between responses to the last elements of the context sequence and responses to lagging probe sounds. A systematic mapping of forward suppression covering multiple delays between leading and lagging sounds was presented previously, albeit in anesthetized animals and using only pairs of communication calls (Martin et al., 2017). Studying the effects of multiple delays between context and probe sounds is done here by means of modeling (see below, Timescale of synaptic depression predicted by the model).
Cortical responses to echolocation and communication syllables after silence
The extracellular activity of 74 units was recorded in the AC of awake bats. Neurons were classified into five groups regarding their responses to the probes presented in isolation, that is, preceded by 3.5 s of silence (see above, Materials and Methods). The majority of units (91%, 67 of 74) responded to both sounds (Fig. 3A, gray bars), and the remaining 9% (7/74 units) responded exclusively to echolocation or communication (Fig. 3A, black bars). Note that we selectively targeted high-frequency regions of the auditory cortex that respond to both low and high frequencies (Hagemann et al., 2010, 2011) and thus this result was expected.
In each unit, the numbers of spikes evoked by the two probes, that is, a single echolocation pulse and a communication syllable, were statistically compared using the Cliff's delta metric. Units with negligible or small effect size [abs(Cliff's delta) ≤ 0.3] were defined as equally responsive to both sounds (45/74 units). Otherwise, units with a larger effect size were classified as preference for communication (16/74 units) or preference for echolocation (6/74 units) by comparing mean spike counts (Fig. 3B). Figure 3C shows the spike count of all units split into the three groups mentioned.
Next, we tested whether the responsiveness of the units could be explained based on the overlap between the frequency spectrum of call components and the isolevel frequency tuning curves measured at 80 dB SPL. We chose this SPL value because it provides a readout that approximates the maximum broadness of the frequency-receptive fields. The level of natural vocalizations used as probe sounds was also fixed to a root mean square value of 73 dB SPL, resembling a situation in which a conspecific is vocalizing ∼1.5 m away from the listener (see above, Materials and Methods).
Isolevel frequency tuning curves were calculated using pure tones covering frequencies from 10 to 90 kHz (5 kHz steps) for a subset of neurons (n = 54 of 74 units). The classification of the units according to the shape of their isolevel tuning curve is shown in Figure 3D (see above, Materials and Methods). Sixty-seven percent of the units (36/54 units) presented multipeaked frequency tuning curves (Fig. 3E, top, example unit). This feature has been associated with nontonotopically arranged areas in the AC (high-frequency fields and dorsoposterior area) in C. perspicillata (Hagemann et al., 2010; López-Jury et al., 2020). The remaining units studied (18/54) exhibited single-peaked tuning, associated with neurons in tonotopic areas (primary and secondary AC and anterior auditory field). Of those, 17 were low-frequency tuned (best frequency <50 kHz; Fig. 3E, example unit, bottom) and only one had a best frequency ≥50 kHz. The mean isolevel tuning curves of all multipeaked and low-frequency-tuned units are shown in red in Figure 3E.
As expected, the majority of neurons with multipeaked tuning curves (67%) belonged to the equally responsive group (Fig. 3E, top). In addition, 53% of the neurons with a low-frequency isolevel tuning curve presented preferential or exclusive responses to communication sounds (Fig. 3E, bottom). These results indicate that, although simple, isolevel tuning curves partially predict neuronal responsiveness to natural sounds presented without context, that is, preceded by silence. The percentages of responsivity to natural sounds for each tuning curve group are shown in Figure 3F.
Natural acoustic context suppresses cortical responses to lagging probe sounds
The effect of acoustic context, that is, echolocation or communication sound sequences preceding the presentation of probe sounds, was first quantified for the 45 units that responded equally well to both probes when presented after silence (Fig. 3C, left). Figure 4A shows the PSTHs of an example equally responsive unit in response to echolocation and communication probe sounds preceded either by silence (top), by an echolocation context (middle), or by a communication context (bottom), 60 ms after the offset of the context. In this unit, the response to both probes was reduced after the context sequences. In fact, the response to the echolocation probe was completely abolished after the echolocation context (Fig. 4A, middle, left). Overall, the presence of a leading context decreased responsivity in most cortical neurons regardless of the type of probe sound that followed the context sequences (between 29 and 36 of 45 units, depending on the context-probe combination; p < 0.05, Wilcoxon rank-sum test). Note that all black lines (significant changes) point downward in Figure 4B1. However, a subset of neurons did not present significant variations in the response to the probe after context (p > 0.05, Wilcoxon rank-sum test); their spike counts are plotted with gray lines in Figure 4B1 (between 9 and 16 of 45 units).
Thirty-two units showed significant reduction of responses in at least three context-probe combinations and were considered context-sensitive neurons. The rest of the neurons (n = 13) were classified as context insensitive (Fig. 4B2). To study if these two classes of neurons differed from each other on their intrinsic properties, we compared best frequency, spontaneous firing rate, and spike waveform. Only the shape of the spikes was significantly different between the two groups (p = 0.01, Wilcoxon rank-sum test, Fig. 4B3). Context-sensitive neurons showed higher values of area under the spike waveform than context-insensitive ones, suggesting that context-sensitive neurons could be classified as putative broad-spiking pyramidal cells, whereas context-insensitive neurons correspond to putative narrow-spiking interneurons (Tsunada et al., 2012).
Context increases cortical discrimination of natural sounds
Although the example in Figure 4A shows reduction of probe-evoked responses by leading context, the magnitude of the suppression differed between probes. The matching probes in both context categories were more suppressed than the mismatching probes, relative to the responses after silence (see context effect suppression values at the top of each PSTH). The context-dependent effect illustrated in Figure 4A was not unique to this example neuron. In fact, stronger reduction of responses to matching probes was common among the neurons studied (Fig. 4C), regardless of the type of context that preceded the probes (p = 6.4e-4 for echolocation and p = 1.7e-4 for communication, Wilcoxon signed-rank test). Relative to the probes preceded by silence, there was a median suppression of 48% for matching sounds versus 30% for mismatching sounds after the echolocation context, and of 37% versus 28%, respectively, after the communication context. As expected, the amount of suppression driven by the context sequence decreased when the interval between context offset and probe was set to 416 ms (compare Fig. 4D, 416 ms delay, with data obtained with 60 ms delay in Fig. 4C). A decrease in context effect was apparent in all context-probe combinations [p = 6.8e-8 for echolocation (ech)-ech, p = 2.0e-5 for ech-communication (com), p = 0.0013 for com-com, and p = 3.9e-5 for com-ech; Wilcoxon signed-rank test comparing data obtained with 60 ms delay vs data obtained with 416 ms delay]. Interestingly, the context suppression was still stimulus specific in the 416 ms gap treatment (p = 0.007 for echolocation and p = 0.02 for communication, Wilcoxon signed-rank test), indicating long-lasting effects of natural context on upcoming sounds.
So far, we have shown that nonselective neurons for natural sounds of different behavioral categories, echolocation and communication, exhibit suppression by leading acoustic context that is stimulus specific. We reasoned that such a context-specific suppression could actually render cortical neurons more selective to natural sounds than what could be predicted by presenting single sounds preceded by silence. To quantify whether this was the case, we compared the Cliff's delta values (used here as a discriminability index) obtained when comparing spike counts between the isolated probes versus the value obtained when there was leading context (Fig. 4E). After both contexts, the discriminability between the probes significantly increased (values closer to −1 or 1) compared with the discriminability exhibited in silence (p = 4.4e-5 after echolocation and p = 9.8e-5 after communication, Wilcoxon signed-rank test). The results showed that under the echolocation context, Cliff's delta values became more negative from a median of −0.045 (in silence) to −0.38, which is indicative of a higher number of spikes in response to the communication probe. On the other hand, the presence of the communication context shifted Cliff's delta values closer to 1 (from a median of −0.045 to 0.11), which corresponds to a higher number of spikes in response to the echolocation probe. Such an increment in the discriminability index after context was observed as well in the 416 ms treatment (Fig. 4F; p = 0.0017 after echolocation and p = 0.0059 after communication, Wilcoxon signed-rank test), suggesting again that context effects are long lasting. Overall, these results show that responses to communication or echolocation syllables presented alone are more difficult to discern than when the syllables are presented after a behaviorally relevant acoustic sequence. In other words, acoustic contexts decreased the ambiguity of the response of nonselective cortical neurons by increasing the sensitivity to temporal transitions between these two sound categories.
Effects of acoustic context on neurons with preference for communication sounds
We also examined the effects of context sequences on the responses of neurons classified as preference for communication. These neurons formed the second largest neuronal group observed when presenting the sounds without context (22% of the neurons studied; Fig. 3C, right). A typical example neuron that preferred communication probes can be observed in Figure 5A (top). Note that although this neuron responded more strongly to the communication probe, it still responded well to the echolocation probe presented without any acoustic context. In this example, as in the previous neuronal group (equally responsive neurons; Fig. 4), the leading context suppressed the response to both probes, and the suppression was stronger for matching sounds (Fig. 5A, second and third rows). The responses to the probes after context were analyzed for all the preference for communication units (n = 16), and significant suppression after context was found in the majority of the neurons (Fig. 5B). Moreover, stimulus-specific suppression was also found at the population level (Fig. 5C), following the same trend exhibited by the example unit in Figure 5 and by the equally responsive group (Fig. 4C). Stimulus-specific suppression was also observed in the 416 ms treatment (p = 0.04 after echolocation and p = 0.0013 after communication, Wilcoxon signed-rank test; Fig. 5D). Comparing the amount of suppression (context effect) in the 60 and 416 ms conditions resulted in significant differences in three of the four comparisons conducted (p = 0.002 for ech-ech, p = 0.6 for ech-com, p = 7.8e-4 for com-com, and p = 4.4e-4 for com-ech, Wilcoxon signed-rank test), indicating again that suppression decreases as the delay between context offset and probe occurrence lengthens.
The clearest difference between selective and nonselective neurons occurred when the probe sounds were preceded by the communication context. To illustrate, in the example in Figure 5A (bottom), when preceded by a communication context, the responses to communication and echolocation probes became similar, although they were notably different when the sounds were presented without context (Fig. 5A, top). This result stems from the unbalanced suppression on probe-evoked responses (−0.46 on matching vs −0.33 on mismatching) together with the intrinsic neuronal preference for communication sounds, which brought spike outputs in response to each probe to the same level when they were presented after the communication sequence.
Equally strong responses to the probes after the communication context were also present at the population level. To quantify the difference between probe-evoked responses, we calculated the discriminability index after silence and after context (Fig. 5E). In agreement with the example neuron shown in Figure 5A, at the population level, when the context was communication, Cliff's delta values went from very negative values (preference for communication when preceded by silence) to higher values (p = 5.3e-4, Wilcoxon signed-rank test), closer to zero, that is, similar probe-evoked responses (Fig. 5E, right). This effect on probe discriminability was also observed when the delay was longer (416 ms; Fig. 5F, right). In addition, when the context was the echolocation sequence, Cliff's delta values became even more negative in the 60 ms gap treatment (p = 0.008, Wilcoxon signed-rank test), indicating better discrimination and higher responsivity to mismatching (communication) sounds (Fig. 5E, left). However, this effect was not visible after a 416 ms delay (Fig. 5F, left). To summarize, our results indicate that good neuronal echolocation versus communication classifiers, according to spike counts measured with single sounds presented in silence, became in fact poor category discriminators when tested in a context of communication sounds.
A computational model with frequency-specific adaptation explains context-dependent suppression observed in vivo
Although single echolocation pulses and communication syllables evoked similar responses in most cortical neurons, after the acoustic context the responses shifted toward the syllable that corresponded to the mismatching sound. To identify which mechanisms can account for these context-dependent effects, we modeled an equally responsive cortical neuron using a leaky integrate-and-fire neuron model (Abbott, 1999).
First, we modeled a single cortical neuron, set to be equally responsive to both sound categories when presented without context, with two synaptic inputs whose firing is highly selective for either communication or echolocation signals (Fig. 6A1). Each synapse receives spike trains whose firing rate is proportional to the degree of selectivity that each input was assumed to have for each sound category. The temporal pattern of the spike trains was assumed to follow the envelope of the respective natural sound used in each simulation. We also assumed that the degree of input selectivity is correlated with the spectral components of the natural sounds used in our experiments. Thus, we distinguished two input classes, high-frequency tuned (>45 kHz) and low-frequency tuned (<45 kHz). The rates of the inputs in response to each probe are shown in Figure 6A1. Note that the firing rate of both inputs increases in response to the communication syllable, albeit to a higher degree in the low-frequency-tuned input. The latter is a consequence of the spectral broadness of distress calls, which carry energy in both low- and high-frequency portions of the spectrum, as opposed to echolocation signals, which in this bat species are limited to frequencies above 50 kHz (Fig. 2, natural sounds).
In the neuron model, each presynaptic spike induces an increment of excitatory conductance. The amplitude of this conductance change was adjusted so that the neuron responded equally well to both probe sounds. Similar spiking responses to both probe sounds in one trial simulation are illustrated in Figure 6A2. We ran 20 trial simulations per probe stimulation, and the resultant PSTHs are shown in the respective insets in Figure 6A2.
Simulated responses to combinations of context followed by probe are shown in Figure 6, B and C. The spiking of the low-frequency-tuned input (red), high-frequency-tuned input (blue), and that of the cortical neuron (black) are shown as raster plots (top, middle, and bottom, respectively). Simulations of echolocation context followed by matching and mismatching probes are shown in Figure 6B; the spiking of the low-frequency input (top) across 20 trials corresponds only to spontaneous activity of the input, and the spiking of the high-frequency channel (middle) tightly follows the envelope of the echolocation sequence. The spiking of the cortical neuron (bottom) also exhibited a response to the echolocation sequence because it integrates both low- and high-frequency inputs. In contrast, simulations of communication context evoked spiking responses mostly in the low-frequency input (Fig. 6C) and in the cortical neuron.
To replicate in vivo results in silico, we added activity-dependent adaptation to the model. A likely mechanism of adaptation is synaptic depression, which has been proposed to underlie context-dependent processing (Ulanovsky et al., 2004; Wehr and Zador, 2005). The temporal course of the synaptic strength associated with the spiking of the first simulation trial of each input is indicated by solid lines at the top of the corresponding raster plots (Fig. 6B,C). Because synaptic depression is activity dependent, the more activity a synapse receives from an input, the more adapted it is (lower values of synaptic strength). Therefore, during the echolocation context, the high-frequency-tuned input synapse is more adapted than the low-frequency-tuned input synapse (Fig. 6B), and the opposite holds during the communication context (Fig. 6C). The synaptic strength is proportional to the amplitude of the excitatory postsynaptic potential that each presynaptic spike produces in the cortical neuron; low values indicate a lower probability of spiking, and high values the opposite. The recovery time of the implemented adaptation was long enough to affect output responses to forthcoming probe sounds 60 ms after the offset of the context sequence (and 416 ms; see below). Consequently, the matching probes generated fewer spikes in the cortical neuron than the mismatching probes (Fig. 6B,C, PSTHs, right of each raster plot).
Requirements of the model to reproduce data
Our model suggests that stimulus-specific suppression arises from (1) segregated inputs selective for each sound category and (2) synaptic adaptation. To illustrate that these conditions are necessary to reproduce our empirical results, we ran different simulations modifying synaptic properties and quantified the same indexes obtained in vivo, that is, the context effect and the discriminability index.
First, we varied the degree of input selectivity and compared the discriminability index obtained after silence versus after context (Fig. 7A). The simulations showed that when the inputs have no preference for either of the signals (both inputs respond to both), the discriminability index did not change significantly after context (p > 0.05, Wilcoxon rank-sum test; Fig. 7A, none, gray box plots). These results show that input selectivity is necessary to obtain stimulus-specific context effects. On the other hand, if the inputs were selective exclusively to one sound category (either echolocation or communication), the discriminability index was significantly and drastically different after context compared with silence (p < 0.001, Wilcoxon rank-sum test; Fig. 7A, high, green box plots), following the trend found empirically. However, only simulations with the echolocation context showed no statistical differences with our data (p = 0.3 echolocation; p = 6.0e-11 communication, Wilcoxon rank-sum test). A more biologically plausible model would assume inputs responding to the spectral features of sounds and not to sound categories, probably a middle point between no selectivity and high selectivity. Indeed, the latter model rendered significant changes after context (p < 0.001, Wilcoxon rank-sum test; Fig. 7A, low, orange box plots) but a less dramatic effect than in the previous model with high selectivity. That is, when cortical inputs showed an intermediate level of selectivity, simulations with both contexts showed no differences with the empirical data (p = 0.1 echolocation; p = 0.7 communication, Wilcoxon rank-sum test). Note that in this case, the input tuned to high frequencies would respond at least to some extent to the high-frequency components found in distress signals.
Second, we tested how different forms of adaptation influence the discriminability index (Fig. 7B) and the context effect index (Fig. 7C) in the simulations. As expected, a model that lacks any type of adaptation showed that the discriminability index was zero in all conditions tested (Fig. 7B, none, gray box plots). Similarly, a model that includes neuronal adaptation, dependent on postsynaptic (post) spiking activity, did not show significant differences between silence and context (Fig. 7B, post, green box plots). This indicates that adaptation at the cortical neuron level allows context suppression but not stimulus specificity. Alternatively, if the adaptation depends on presynaptic (pre) activity, the context effect becomes stimulus specific, and the discriminability index increased after context (Fig. 7B, pre, orange box plots). The same effect was observed when the model includes adaptation both at the level of the cortical neuron and at the presynaptic level (Fig. 7B, post + pre, blue box plots).
It is worth mentioning that, although in both models (post and post + pre) the discriminability index increased after context, the model that included both types of adaptations exhibited significant suppression of the mismatching probes (p < 0.05, Wilcoxon one-sample signed-rank test; Fig. 7C, left), which agrees with our experimental data (Fig. 7C, right). In other words, synaptic adaptation is necessary to create stimulus-specific suppression but not enough to explain a reduction of the responses to mismatching probes.
To test whether our computational model could also explain the decrease of the discriminability index observed after the communication context in preference for communication neurons, we modified the maximum synaptic weight (we) of both synapses to obtain a neuron model that possesses selectivity for communication sounds (Fig. 7D). After increasing the we of the low-frequency-tuned synapse and decreasing the we of the high-frequency-tuned synapse, our neuron model presented stronger responses to communication stimulation than to echolocation after silence (Fig. 7D, PSTHs). Without changing any other parameters in the model, we calculated the discriminability index between probes after silence and after context. The results are shown in the Figure 7D (left, box plots). Comparable to our data (Fig. 7D, right, box plots), the neuron model increased its discriminability for the probes after the echolocation context (more negative Cliff's delta values in comparison with silence) but decreased the discriminability after the communication context (Cliff's delta values close to zero). The results suggest that although these two classes of neurons (equally responsive and preference for communication) possess different degrees of selectivity for single isolated natural sounds, the effects of a leading context on their responses to forthcoming sounds can be explained by a common canonical mechanism.
Time course of context-driven suppression and neuronal discriminability changes
Our empirical data indicate that context-dependent effects were also present on the probe responses occurring 416 ms after context offset (Figs. 4D,F, 5D,F). To reproduce this result, we fitted the parameters that describe the time course of the adaptation implemented in the model to replicate discriminability and context effect indexes obtained after the time intervals tested (i.e., 60 and 416 ms).
We reasoned that the fitted model could be used to systematically test the effect of multiple gap values. Several simulations were run changing the temporal separation (gap) between context offset and probe occurrence. Figure 8A shows the temporal course of the discriminability index after silence (middle) and after the two types of context tested (echolocation in blue and communication in red) obtained from the simulations. The average across 100 simulated units tightly matched the empirical average (open circles), and the shaded area indicates the SD across the simulations, which also replicated relatively well the variability observed in vivo (vertical solid bars indicate SD). The discriminability index after silence remained constant (top graph) at values ∼0, and it was compared statistically with the values obtained after each context (bottom graph); the respective p values are plotted at the top. The model predicted that the increment in discriminability, represented by values closer to 1 or −1, disappears ∼650 ms after the communication context and ∼1550 ms after the echolocation context (the last gaps at which p < 0.05, Wilcoxon rank-sum test between the respective context and silence). In addition to the discriminability index, we examined the suppression strength as a function of gap length. Figure 8B shows the time course of the recovery of the context effect obtained from the same simulations. The model showed that at gaps at which there was no longer an effect in terms of the discriminability index (1550 ms for echolocation and 650 ms for communication context), suppression was still significant (Wilcoxon one-sample signed-rank test; Fig. 8A, top). However, the strength of the suppression was not sufficient to create significant differences in spike count between probe responses.
Timescale of synaptic depression predicted by the model
To visualize the time course of adaptation during and after context resulting from the fitting procedure of model parameters (see above, Materials and Methods), we simulated the presentation of the context sequences followed by 5 s of silence and plotted the average spiking threshold (postsynaptic adaptation) and the average synaptic strength (presynaptic adaptation) across 20 trials (Fig. 9A). The spiking threshold increased during both contexts and recovered following the same time constant regardless of the context identity, consistent with stimulus-unspecific postsynaptic adaptation. The synaptic strength, in contrast, recovered with different time constants in the two inputs: presynaptic adaptation decayed faster in the low-frequency-tuned synapse than in the high-frequency-tuned one. The model therefore predicts that synaptic inputs tuned to low-frequency sounds (communication signals) have the shortest decay constant of presynaptic adaptation.
To test this prediction in our data, we examined the low-frequency components of the neural recordings (LFPs) in response to the probes presented after silence. It has been speculated that LFPs could reflect differences in synaptic input properties (Haider et al., 2016; Arroyo et al., 2018), and they were thus used here as a readout of network synaptic activity to test the predictions of our model. The average evoked potentials across equally responsive units that presented a significant context effect (n = 32; Fig. 4B2) are shown in Figure 9B. Although there were no significant differences in the energy of the evoked potential during the 150 ms after probe onset (p = 0.06, Wilcoxon signed-rank test), we did find differences in the temporal course of the LFPs (Fig. 9C,D). LFPs in response to the echolocation probe recovered more slowly than those in response to the communication probe. To quantify this, we calculated the time at which the LFP first crossed zero and the time at which the positive peak occurred. Both measures showed significant differences between the probes, namely, higher values for echolocation (p = 0.005 and p = 0.04, respectively, Wilcoxon signed-rank test). To test whether this difference could be attributed to differences in spiking, we calculated the duration and latency of the spiking responses and found no significant differences (p = 0.8 for latency and p = 0.7 for duration, Wilcoxon signed-rank test; Fig. 9E,F), suggesting that the observed differences in the temporal pattern of the LFPs arise from different properties of the synaptic inputs.
Discussion
It is known that in the bat AC there are multifunctional neurons specialized to process sounds linked to echolocation and communication. How these neurons respond to transitions between these two sound types remains unknown. This is especially relevant for communication in complex auditory environments, where calls occur within the context of other natural sounds. This study demonstrates that the presence of leading acoustic context increases neuronal discriminability in neurons that are nonselective when tested in a silent condition (i.e., no context) and has the opposite effect on selective neurons. A neuron model explains our results through input selectivity based on the frequency composition of sounds, combined with presynaptic and postsynaptic adaptation. Furthermore, our model predicts the temporal course of the context dependence and proposes that the time constant of adaptation is shortest for low-frequency sounds, that is, communication signals.
Cortical neurons process both echolocation and communication signals
Previous studies have shown that neurons in the AC of mustached bats that are highly specialized for processing echolocation pulses also present specializations for communication call processing (Ohlemiller et al., 1996; Esser et al., 1997; Kanwal, 1999, 2012). Here, we provide strong evidence that this is also the case for a frugivorous bat, C. perspicillata. The majority of the neurons tested in this study (91%) presented responses to conspecific echolocation pulses and to a particular type of communication call, distress calls.
In the mustached bat, it has been well studied how these multifunctional neurons use combination sensitivity as a basic mechanism for call selectivity (Kanwal, 1999, 2006). Comparable physiological responses have been described in C. perspicillata (Kossl et al., 2014). Cortical delay tuning in the form of FM-FM combination sensitivity neurons is robust in dorsal HF areas of the AC (Hagemann et al., 2010, 2011), where we conducted our neural recordings. Therefore, the responses described here could be specific for these neurons, and their underlying neural mechanisms could be similar to those well studied in the mustached bat.
A large proportion of the neurons studied here exhibited multipeaked frequency tuning curves, which is consistent with the fact that we recorded mostly from HF fields of the brain of C. perspicillata (Hagemann et al., 2010, 2011). In agreement with several studies, in a subset of neurons, probe-evoked responses could not be predicted from their response profiles to pure tones (Suga et al., 1978; O'Neill and Suga, 1979; Machens et al., 2004; Sadagopan and Wang, 2009; Laudanski et al., 2012; Feng and Wang, 2017).
In our data, the majority of the selective multifunctional neurons fired more to communication than to echolocation signals. Cortical processing of social calls is known to be lateralized to the left side (Kanwal, 2012; Washington and Kanwal, 2012). All recordings included in this article were, in fact, conducted in the left hemisphere. Accordingly, one could speculate that recordings in the right hemisphere could show a different picture with a higher proportion of neurons preferring high-frequency echolocation sounds, but this remains to be tested.
Acoustic context modulates natural sound discrimination in auditory cortex neurons
The data presented in this article advance our thinking on how context modulates natural sound selectivity by showing that the presence of leading context turns nonselective multifunctional neurons into selective ones and has the opposite effect on selective neurons. Moreover, such modulation appears to be related to neuronal type (i.e., putative pyramidal cell vs interneuron; Fig. 4B). These findings could have bearing on our interpretation of how the auditory cortex operates in natural conditions. Note that here selectivity refers only to differences in spike count obtained when presenting echolocation and communication sounds. Future studies are needed to test whether similar results are obtained when using different types of communication calls.
We propose that nonselective neurons are important for coding violations to regularities in the sound stream, that is, transitions from echolocation to communication (or vice versa). Meanwhile, neurons that appear to be selective to specific natural sound categories in the absence of context would be worse detectors of acoustic transitions but probably key for sound discrimination in silent environments. Accordingly, it has been claimed that the primary function of the AC is not to map parameters or features, such as frequency, but to integrate sounds on different time scales (Ulanovsky et al., 2003, 2004; Kanwal, 2012; Washington and Kanwal, 2012). Our findings are probably the outcome of sound processing at several levels both below and above the AC. Cortical areas upstream of the AC, such as the frontal cortex, are involved in the production of both vocal and behavioral responses (Eiermann and Esser, 2000; Kanwal et al., 2000; Garcia-Rosales et al., 2020; Weineck et al., 2020) and could potentially modulate how novel sounds are processed in the AC.
Mechanisms underlying context-specific response modulation
We show that in context-sensitive neurons, the presence of a leading context always reduced responses to forthcoming sounds independently of the neuronal tuning. We propose that there is a common mechanism that underlies such a context effect across all cortical neurons of nontonotopic areas. Interestingly, a significant reduction of the probe response was observed both 60 and 416 ms after the context end. Whole-cell recordings have shown that such a long-lasting context dependence (>100 ms) in the AC depends on synaptic depression, whereas at shorter intervals the effects are attributed to postsynaptic (GABAergic) inhibition (Wehr and Zador, 2005; Asari and Zador, 2009).
We found that the magnitude of the suppression driven by context sequences on responses to probe sounds was stimulus specific; there was stronger suppression of responses to probes that matched the context than to those that mismatched it. Similar results were found in studies using artificial sounds in the framework of stimulus-specific adaptation (SSA) and predictive coding (Calford and Semple, 1995; Auksztulewicz and Friston, 2016; Carbajal and Malmierca, 2018). It has been proposed that SSA underlies enhanced responses to unexpected events occurring as deviant stimuli against a background of repetitive signals (Ayala and Malmierca, 2012). Our auditory stimulation paradigm shares similarities with classic SSA paradigms, especially if one considers the context sequences as a form of standard and the mismatched probes as deviants. We propose that highly specialized neurons in the bat AC follow general neuronal principles that explain auditory deviance detection. It should be noted that the combination sensitivity present in these neurons involves nonlinear interactions at intersound delays covering time scales between 1 and 30 ms (Hagemann et al., 2010). This time scale is outside the range of delays studied here (60 and 416 ms in the in vivo experiments, up to 2 s in silico).
Our model predicts that context modulation in neuronal responses to natural sounds results from two mechanisms, adaptation dependent on the postsynaptic activity and adaptation dependent on the presynaptic activity. The model assumes that the processing of natural sounds is segregated in different frequency channels that converge into a cortical neuron that exhibits context sensitivity. Thus, hearing the context unbalances the input-dependent adaptation and allows context-specific effects. However, the suppression of mismatching sounds cannot be explained by frequency-specific adaptation alone. We predict that postsynaptic-dependent adaptation, which attenuates every input arriving in the cortical neurons, is needed to explain the totality of our results.
We should point out that in our model, synaptic depression is assumed to occur at excitatory synapses onto cortical cells. However, depressing inhibitory synapses could also reproduce our predictions. Indeed, the role of GABAergic neurotransmission has been demonstrated in similar context-dependent situations (Perez-Gonzalez et al., 2012). It remains a challenge for future work to identify the biological correlates of these synapses. Whether the synaptic depression depends on the activity of thalamic or cortical neurons remains unresolved. With our data we also cannot establish whether the effects observed originate at the cortical level or are simply inherited from subcortical neurons. What we do propose is that, regardless of the input origin, the selectivity to natural sounds at the presynaptic sites is tuned to the spectral components of natural sounds rather than to sound categories (either echolocation or communication signals). Our results showed that preference for, rather than exclusive responsiveness to, one sound category is sufficient to reproduce, and indeed better fit, the observed context-specific modulatory effects. The latter is in accordance with several reports in lower areas of the auditory pathway and in cortical areas, where neurons exhibit high selectivity to natural sounds (Klug et al., 2002; Mayko et al., 2012; Salles et al., 2020).
Our neuron model also predicts that the increment of neuronal discriminability with acoustic context lasts longer for the echolocation than for the communication context (∼1550 and ∼650 ms, respectively). Similar time scales have been described in the AC as important for processing natural auditory scenes that require integration over seconds (Ulanovsky et al., 2004; Asari and Zador, 2009). In addition, the model suggests different time scales of synaptic depression in the low- and high-frequency inputs of cortical neurons, slowest in the high-frequency inputs. This last prediction was corroborated by studying LFP responses evoked by the communication and echolocation sounds. Predictions arising from this study could be used as guiding hypotheses in future work using direct measurements of membrane potentials in cortical cells.
Footnotes
This work was supported by Deutsche Forschungsgemeinschaft Project No. 428645493 (J.C.H.). We thank Michael Pasek for assistance during the writing of this manuscript.
The authors declare no competing financial interests.
Correspondence should be addressed to Luciana Lopez-Jury at lopezjury@bio.uni-frankfurt.de or Julio C. Hechavarria at hechavarria@bio.uni-frankfurt.de