Abstract
Communication is fundamental for our understanding of behavior. In the acoustic modality, natural scenes for communication in humans and animals are often very noisy, decreasing the chances for signal detection and discrimination. We investigated the mechanisms enabling selective hearing under natural noisy conditions for auditory receptors and interneurons of an insect. In the studied katydid Mecopoda elongata, species-specific calling songs (chirps) are strongly masked by signals of another species, both communicating in sympatry. The spectral properties of the two signals are similar and differ only in a small frequency band at 2 kHz present in the chirping species. Receptors sharply tuned to 2 kHz are completely unaffected by the masking signal of the other species, whereas receptors tuned to higher audio and ultrasonic frequencies show complete masking. Intracellular recordings of identified interneurons revealed two mechanisms providing response selectivity to the chirp. (1) Selective tuning: several identified interneurons exhibit remarkably selective responses to the chirps, even at signal-to-noise ratios of −21 dB, because they are sharply tuned to 2 kHz. Their dendritic arborizations indicate selective connectivity with low-frequency receptors tuned to 2 kHz. (2) Novelty detection: a second group of interneurons is broadly tuned but, because of strong stimulus-specific adaptation to the masker spectrum and “novelty detection” of the 2 kHz band present only in the conspecific signal, these interneurons start to respond selectively to the chirp shortly after the onset of the continuous masker. Both mechanisms provide the sensory basis for hearing at unfavorable signal-to-noise ratios.
SIGNIFICANCE STATEMENT Animal and human acoustic communication may suffer from the same “cocktail party problem,” when communication happens in noisy social groups. We address solutions for this problem in a model system of two katydids, where one species produces an extremely noisy sound, yet the second species still detects its own song. Using intracellular recording techniques we identified two neural mechanisms underlying the surprising behavioral signal detection at the level of single identified interneurons. These neural mechanisms for signal detection are likely to be important for other sensory modalities as well, where noise in the communication channel creates similar problems. Also, they may be used for the development of algorithms for the filtering of specific signals in technical microphones or hearing aids.
Introduction
Communication is fundamental for our understanding of animal and human behavior. It requires the encoding of information in a signal by a sender, which is transmitted via a transmission channel and detected by a receiver (Bradbury and Vehrencamp, 2011). However, the acoustic environment of many vocalizing species is complex, as it contains a number of unknown sound sources of the same and different species. Signaling in such groups poses problems for receivers to detect and classify signals (Hulse, 2002; Brumm and Slabbekoorn, 2005; Wiley, 2006, 2013). For humans the “cocktail party problem” (Cherry, 1953) refers to the difficulty of understanding speech in noisy social settings (Yost, 1997; Bronkhorst, 2000). Despite this difficulty, humans and other species can successfully listen and orient to individual sound sources, using a process termed auditory scene analysis (Bregman, 1990). Animal acoustic communication may suffer from similar problems (Bee and Micheyl, 2008), because background noise in choruses of calling individuals can be dramatic. In species-rich tropical rainforests, >50 species of insects and frogs create an acoustic background with vocalizations at high sound pressure levels (Römer et al., 2010; Schwartz and Bee, 2013). Research in the past decade has demonstrated that the selection pressure from the masking of background noise has shaped evolutionary adaptations that allow both senders and receivers to cope with noise (Brumm, 2013). These include individual adjustments of signal properties such as increased loudness (Lombard effect), signal duration and/or redundancy, sound frequency separation, signal timing, or the use of multimodal signals (Higham and Hebets, 2013). Other adaptations may involve the reception and neural processing of sound by the peripheral or central nervous systems, such as narrow frequency filtering, spatial release from masking, or gain control mechanisms (Römer, 2013).
Although behavioral studies in various taxa indicate reliable communication at remarkably low signal-to-noise ratios (SNRs), the underlying neuronal mechanisms are often unclear.
Here we studied these mechanisms in two closely related species of acoustic insects that live in sympatry in the tropical rainforest. Both species of katydid belong to the Mecopoda elongata complex. The calling song of the “chirper” consists of broadband chirps that are repeated every 2 s. Males of the competing species produce continuous trills of >105 dB SPL (Krobath, 2013). The song of the “triller” represents a strong external noise that should mask the neural representation of the chirp in receivers. However, contrary to the expectation from theory and from empirical cases with other species using broadband signals (Greenfield, 1988; Römer et al., 1988), acoustic communication of the chirper appears to be almost unaffected by the continuous and loud song of the trilling species despite similar broadband power spectra of both signals (Siegert et al., 2013). A narrow 2 kHz bandwidth in the spectrum of the chirp provides the essential cue for unmasking. The sound energy of this 2 kHz bandwidth is ∼25 dB higher in the conspecific chirp compared with the masking trill. Behavioral studies revealed that males of the chirper synchronize their songs even at SNRs of −8 dB. However, the behavioral response breaks down if the 2 kHz component in the chirps is equalized to the level of the masker (Siegert et al., 2013).
Here, we present two neural mechanisms for selective responses to the song provided by different identified auditory interneurons. These mechanisms are shaped by exploiting the activity of receptors that are tuned to the 2 kHz component of the chirp: (1) selective tuning where some interneurons are selectively tuned to this low-frequency component and respond only to conspecific chirps and (2) novelty detection where broadly tuned interneurons use a mechanism called “novelty detection” (Schul et al., 2012) in which strong adaptation in response to the continuous trill ensures a selective response to the frequency “novelty” in the chirp. The responses of these specific interneurons sufficiently explain the reliable signal detection observed in behavior, even at very low SNRs.
Materials and Methods
Animals.
Males and females of M. elongata (Ensifera, Tettigoniidae, and Mecopodini) were used for neurophysiology. Chirps of the chirper are identical to those of “species S” (Sismondo, 1990), and songs of the trilling species are identical to those of “Mecopoda sp. 2” (Korsunovskaya, 2008). There was no difference observed in the structure and response properties of auditory interneurons between males and females. Insects were taken from a colony at the University of Graz that was originally established with individuals collected in Malaysia close to the field station Ulu Gombak near Kuala Lumpur. The insects were reared in a 12 h light/dark cycle at a temperature of 27°C and 70% relative humidity. They were fed with fresh lettuce, oat flakes, and fish food ad libitum.
Acoustic stimulation.
Calling songs of both species were recorded from isolated males at a distance of 15 cm to the signaling individual using a calibrated, free-field condenser microphone (type 40AC; G.R.A.S. Sound & Vibration A/S) with a flat frequency response between 10 Hz and 40 kHz. The microphone output was amplified using a preamplifier (type 26AM; G.R.A.S. Sound & Vibration A/S) and a power module (type 12AK; G.R.A.S. Sound & Vibration A/S). A/D conversion was performed via an external audio interface (Edirol FA-101; Roland) operating at a sampling rate of 96 kHz.
The chirp used for playback consisted of 15 syllables (chirp duration 285 ms, syllable period 20 ms) with gradually increasing amplitude and was repeated at a rate of 0.4 Hz. The trill of the competing Mecopoda species consisted of a soft syllable followed by two syllables of high amplitude, a stereotypical pattern that was repeated with a period duration of 30 ms. Playback of sound signals was controlled in Cool Edit Pro 2.0 driving an Edirol A/D audio interface operating at a sampling rate of 96 kHz. Sound signals were attenuated (PA-5; Tucker-Davis Technologies) and amplified using an amplifier with a flat frequency response up to 100 kHz (NAD 214; NAD Electronics). Signals were broadcast by two ultrasonic magnetic speakers (MF1-S) with flat frequency characteristics from 1 to >40 kHz (Tucker-Davis Technologies). The spectral composition of both the chirp and the trill at the position of the preparation is shown in Figure 1D. Because of the reduced sampling rate of the Edirol A/D audio interface, the signal playback resulted in a strong attenuation of frequencies >40 kHz. We thus simulated the frequency spectrum of a signal as being perceived at medium sender–receiver distances, where ultrasonic frequencies are strongly attenuated (Keuper et al., 1986; Römer and Lewald, 1992).
Pure tone sound pulses were generated with Cool Edit Pro 2.0 at a sampling rate of 192 kHz. The speakers were positioned next to each other at one side of the longitudinal body axis of the preparation at a distance of 15 cm. Sound pressure levels at the preparation were calibrated at the position of the ears using a ½ inch microphone (type 2540; Larson Davis) connected to a sound level meter (CEL 414; Casella CEL). The sound calibration was performed by presenting a continuous loop of the three last syllables in the chirp.
The chirps and masking trill were broadcast either separately or simultaneously at different SPLs. The variation of SNRs usually started with an SNR of 0 dB with 75 dB SPL for both signals, and was decreased in 3 dB steps by attenuating the intensity of the chirp. The tuning of acoustic neurons was studied with iso-intensity response profiles. In contrast to conventional tuning curves using threshold values at various frequencies, such iso-intensity response profiles provide tuning information at suprathreshold or more relevant intensity values. Tuning was studied using sound pulses (duration 50 ms) of different frequency (1, 2, 5, 10, 15, 20, 30, 40, and 55 kHz) broadcast at 80 dB SPL. To study the impact of the low-frequency components of the chirp on signal discrimination under the masking trill we also used modified chirps as acoustic stimuli. For the experimental approach, the frequency band at 2 kHz was attenuated to the same level as in the spectrum of the trill, using the FFT filter function provided by Cool Edit Pro 2.0.
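The SNR series described above can be written out explicitly. The following is a minimal sketch, assuming that the trill is held at 75 dB SPL while the chirp is attenuated in 3 dB steps; the function name and the lower limit of −21 dB (the lowest SNR mentioned in this report) are illustrative choices and not part of the original protocol.

```python
MASKER_LEVEL_DB = 75.0  # trill level in dB SPL (from the text)
STEP_DB = 3.0           # attenuation step of the chirp (from the text)


def chirp_levels(lowest_snr_db=-21.0):
    """Return (SNR in dB, chirp level in dB SPL, chirp/trill pressure ratio).

    lowest_snr_db is an assumed stopping point for illustration only.
    """
    rows = []
    snr = 0.0
    while snr >= lowest_snr_db:
        chirp_level = MASKER_LEVEL_DB + snr   # chirp is attenuated, trill is fixed
        pressure_ratio = 10 ** (snr / 20.0)   # dB difference -> sound pressure ratio
        rows.append((snr, chirp_level, pressure_ratio))
        snr -= STEP_DB
    return rows


for snr, level, ratio in chirp_levels():
    print(f"SNR {snr:6.1f} dB  chirp {level:5.1f} dB SPL  pressure ratio {ratio:.3f}")
```

Under these assumptions, an SNR of −21 dB corresponds to a chirp of 54 dB SPL, i.e., a sound pressure of roughly 9% of that of the trill.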
Intracellular recordings.
Intracellular recordings were performed from auditory interneurons within the prothoracic ganglion. Animals were anesthetized with chlorethyl and the antennae and middle and hind legs were removed. The animals were fixed ventral side up on a holder by using dental wax. The head was slightly tilted backward and fixed on the holder while the tarsi of the forelegs were waxed onto thin wires. An incision between abdomen and thorax reduced hemolymph pressure and tissue movements from breathing. The prosternum was removed to expose the prothoracic ganglion and the perineurium was carefully removed in the vicinity of the auditory neuropil to allow a smooth penetration of the microelectrode. The prothoracic ganglion was covered with insect saline. The preparation was then placed at a distance of 15 cm to the speakers. The ganglion was stabilized between a small metal platform at its dorsal side and a metal ring at its ventral side with the platform serving as a reference electrode for intracellular recordings. Microelectrodes were filled with 5% Lucifer yellow CH (Sigma-Aldrich) dissolved in aqua destillata or with 0.8% Alexa 555 and 568 hydrazide (Molecular Probes) dissolved in 0.2 M lithium chloride for intracellular staining using conventional protocols.
Intracellular recordings were mainly performed within the anterior part of the prothoracic ganglion at the area of the auditory neuropil (Römer, 1983; Stolting and Stumpner, 1998). Neuronal responses were amplified by a BA-01X amplifier (npi electronic) in bridge mode. Fluorescent dyes were iontophoretically injected into the neurons by hyperpolarizing current injection (0.5–5 nA) for 5–30 min. After histological processing and clearing in methyl salicylate, a Zeiss Axioplan epifluorescence microscope (Carl Zeiss) was used to visualize the morphology of the neurons. Images were taken with an Axiocam ERc 5s (Carl Zeiss). Neurons were reconstructed manually from image stacks using ImageJ (NIH) and Photoshop CS6 (Adobe Systems) software. Neurons were identified according to their morphology and response patterns.
Terminology.
The terminology of neurons was based on their frequency selectivity and axonal projections. The letters LF indicate low-frequency neurons being sharply tuned to the 2 kHz component of the song. The letters BF indicate broadband neurons selective over a bandwidth of >10 kHz. The third letters indicate axonal projections ascending toward the brain (A), descending toward the mesothoracic ganglion (D), and a T-shaped structure of axons both ascending and descending (T). Numbers were added to discriminate between neurons with similar anatomy and tuning properties but differing in other physiological properties.
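The naming scheme can be made explicit with a small parser. This is only an illustrative sketch of the convention described above; the function and class names are hypothetical and were not used in the study.

```python
import re
from typing import NamedTuple

# Letter codes as defined in the terminology paragraph.
TUNING = {
    "LF": "low-frequency (sharply tuned to 2 kHz)",
    "BF": "broadband (bandwidth > 10 kHz)",
}
PROJECTION = {
    "A": "ascending",
    "D": "descending",
    "T": "T-shaped (ascending and descending)",
}


class NeuronName(NamedTuple):
    tuning: str
    projection: str
    index: int


def parse_neuron_name(name: str) -> NeuronName:
    """Split a label such as 'LFD-1' into tuning, projection, and number."""
    match = re.fullmatch(r"(LF|BF)([ADT])-(\d+)", name)
    if match is None:
        raise ValueError(f"not a valid neuron label: {name!r}")
    tuning, projection, index = match.groups()
    return NeuronName(TUNING[tuning], PROJECTION[projection], int(index))


print(parse_neuron_name("LFD-1"))
print(parse_neuron_name("BFT-1"))
```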
Data recordings and analysis.
All recording channels were digitized at 20.8 kHz and 16-bit amplitude resolution with 0.153 mV per increment using a CED 1401 micro3 data acquisition interface. Data were recorded to the hard disk of a PC using Spike 2 software (Cambridge Electronic Design). Neural recordings were analyzed off-line using Neuro-Lab software (Knepper and Hedwig, 1997). Responses were analyzed using peristimulus time histograms and averages of instantaneous spike frequency or postsynaptic membrane potentials.
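For readers unfamiliar with these measures, the sketch below shows one common way to compute a peristimulus time histogram and the instantaneous spike frequency from spike times. It is not the Neuro-Lab implementation; the function names, the bin width, and the example spike times are assumptions chosen for illustration.

```python
def psth(trials, bin_width, duration):
    """Mean spike rate (spikes/s) per time bin.

    trials: list of trials, each a list of spike times in seconds
            relative to stimulus onset.
    """
    n_bins = int(round(duration / bin_width))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            index = int(t // bin_width)
            if 0 <= index < n_bins:        # ignore spikes outside the window
                counts[index] += 1
    return [c / (len(trials) * bin_width) for c in counts]


def instantaneous_frequency(spikes):
    """Reciprocal of each interspike interval (Hz) for one spike train."""
    return [1.0 / (b - a) for a, b in zip(spikes, spikes[1:])]


# Hypothetical spike times (s) from two stimulus presentations.
example_trials = [[0.012, 0.015, 0.031], [0.014]]
print(psth(example_trials, bin_width=0.01, duration=0.05))
print(instantaneous_frequency([0.0, 0.5, 0.75]))
```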
To determine the amount of masking of the chirp response by the trill, the neural responses to the trill were averaged within a time window of 300 ms before the chirp and subtracted from the responses to the chirps presented simultaneously with the trill. This difference indicated the degree of neural representation of the masked chirp. In a similar way the neural response to unmasked chirps was calculated by subtracting spontaneous activity (again over 300 ms before the chirps) from the activity elicited by the chirp. The response to chirps (for both unmasked and masked conditions) was also calculated within a time window of 300 ms after stimulus onset, including 10 ms for the latency of receptor responses. The relative values of spike activity to the chirp during the trill (with the response to the chirp only set as 100%) provide a quantitative measure for the response selectivity of all types of neurons.
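The calculation can be summarized in a short sketch. It assumes that the response window is shifted by the 10 ms latency (one possible reading of the description above) and uses spike counts per window; the function names and the example spike times are hypothetical. The final lines treat the mean AP counts reported for the LFR and BFR in the Results as the two response values, which is a simplification.

```python
def spike_count(spike_times, start, stop):
    """Number of spikes with start <= t < stop (times in seconds)."""
    return sum(1 for t in spike_times if start <= t < stop)


def chirp_response(spike_times, chirp_onset, window=0.300, latency=0.010):
    """Baseline-corrected spike count for a single chirp presentation.

    Baseline: activity in the 300 ms before chirp onset (trill-driven
    activity in the masked condition, spontaneous activity otherwise).
    Response: activity in a 300 ms window after onset, shifted by the
    assumed 10 ms latency.
    """
    baseline = spike_count(spike_times, chirp_onset - window, chirp_onset)
    evoked = spike_count(spike_times, chirp_onset + latency,
                         chirp_onset + latency + window)
    return evoked - baseline


def relative_response(masked, unmasked):
    """Masked chirp response as a percentage of the unmasked response."""
    return 100.0 * masked / unmasked


# Hypothetical spike times (s) with a chirp starting at 0.3 s.
print(chirp_response([0.05, 0.25, 0.32, 0.35, 0.40, 0.55], chirp_onset=0.3))

# Mean AP counts reported in the Results (unmasked, masked).
print(f"LFR: {relative_response(16.1, 17.7):.1f}%")
print(f"BFR: {relative_response(1.0, 24.4):.1f}%")
```

With these inputs, the LFR retains about 91% of its unmasked response and the BFR about 4%.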
Statistical evaluation was performed by using SigmaPlot 12.5 software (Systat). t tests were used for normally distributed data and a Mann–Whitney rank sum test for data not normally distributed.
Results
Response selectivity of auditory receptors
Approximately 40 sound receptors are arranged in a linear fashion in the crista acustica in the ears of the forelegs (Strauß et al., 2012). They are individually tuned to different frequencies from ∼2 to 60 kHz, similar to other katydids (Römer, 1983; Kalmring et al., 1990; Stumpner, 1996; Stolting and Stumpner, 1998). To understand the neural mechanisms underlying signal detection and recognition under masking conditions, we first analyzed the tuning and neural responses to chirps and trills in various auditory receptors. Figure 1A shows the tuning of two auditory receptors as iso-intensity response profiles. The low-frequency receptor (LFR) responded highly selectively to 2 kHz, with almost no response to lower or higher frequencies, even though sound pulses were broadcast at 80 dB SPL. In contrast, the broadband frequency receptor (BFR) generated a strong response from 5 to 30 kHz. The spectral composition of the chirp and trill and the consequences of the different tuning of receptors for their selectivity are shown in Figure 1. The LFR responded only to the chirp; even when both chirp and trill were presented simultaneously at 75 dB SPL its response was completely unaffected by the trill (Fig. 1C). In contrast, the BFR showed a strong, continuous response to the trill that completely masked the responses to the chirp, leading to a dramatic decrease of the neural representation of the chirp under masking (Fig. 1C). The quantitative data for the responses to the chirps and the chirps masked by the trill are 17.7 and 16.1 action potentials (APs), respectively, for the LFR and 24.4 and 1 APs, respectively, for the BFR. Thus, there is a strong reduction of response under masking conditions for the latter. The axonal arborizations of the LFR receptor are in the most anterior areas of the auditory neuropil (Fig. 2A) similar to those reported previously in other katydids (Römer, 1983; Stumpner, 1996; Stolting and Stumpner, 1998) and thus represent auditory receptors from the crista acustica and not the intermediate organ. Figure 2B shows iso-intensity response functions of different receptors. The LFRs tuned to 2 kHz generated on average 11.9 ± 3.8 APs per stimulus at the best frequency, while the high-frequency receptors (HFRs) tuned to 30–40 kHz responded with 17.3 ± 4.3 APs per stimulus at the best frequency. These differences are statistically significant (p = 0.034; Mann–Whitney rank sum test). Additionally, the iso-intensity response functions of LFRs tend to be more selective than those of HFRs. Receptors tuned to 2 kHz revealed a high sensitivity only within the narrow frequency range of 1–2 kHz, while high-frequency afferents tuned to 30–40 kHz showed suprathreshold responses from 15 to 55 kHz (and probably higher). This difference can be partly explained by the higher absolute sensitivity of HFRs; tuning curves based on hearing thresholds would probably reveal more similar frequency bandwidths in LFRs and HFRs as shown for other katydids (Römer, 1983; Stumpner, 1996; Stolting and Stumpner, 1998).
Figure 2C shows averaged responses of differently tuned receptors to chirps with and without the masking trill. The magnitude of responses to chirps without the masker is generally stronger in receptors tuned to frequencies higher than 2 kHz (31.0 ± 9.6 APs per chirp in 30 kHz receptors as compared with 16.6 ± 8.4 APs per chirp in 2 kHz receptors at 75 dB SPL). However, in the presence of the trill the responses to the chirp are unaffected in receptors tuned to 2 kHz, but decrease to only 1.5 ± 2.8 APs per chirp in 5 kHz receptors and nearly zero in 10 and 30 kHz receptors.
Figure 2D illustrates this finding using the relative values of the responses to the chirp under both conditions. The sharply tuned 2 kHz receptor was unaffected by the higher frequencies of the trill and revealed an activity of 100% when stimulated with chirps under the masker. The more broadly tuned 2 kHz LFR, however, showed a significant reduction by 50% and the 5 kHz receptor responded with only 4.8% to the chirp under noise. Receptors tuned to 10 kHz or 30 kHz revealed no chirp-specific response at all when the chirp was broadcast simultaneously with the trill.
Two mechanisms in interneurons for signal detection under masking trill conditions
Based on the data of auditory receptors, we can conclude that sharply tuned LFRs respond selectively to the 2 kHz component of the chirp, a characteristic that can be exploited at higher levels of auditory processing.
Selective tuning
Similar to the sharply tuned LFRs, some interneurons exhibit response selectivity for the chirp under masking conditions by being selectively tuned to the low-frequency component of the chirp at 2 kHz. We identified three auditory interneurons that revealed a tuning similar to sharply tuned LFRs. Their EPSP delay of 1–2 ms relative to that of auditory afferents, and the overlap of dendritic arborizations with axonal projections of LFRs (Fig. 2A), indicate that they receive direct input from auditory receptors.
Figure 3A shows reconstructions of three Alexa 555-labeled interneurons tuned to low frequencies. LFT-1 has bilateral symmetrical dendritic arborizations in the auditory neuropil, a T-shaped structure with an ascending axon toward the subesophageal ganglion and a descending axon toward the mesothoracic ganglion. Its cell body is not located within the prothoracic ganglion. Two other interneurons (LFD-1 and LFD-2) have axons descending to the mesothoracic ganglion; they show similar branching patterns, but differed considerably in threshold and response magnitude. We therefore decided to separate these neurons, mainly because the difference in the excitation level was very constant, with no intermediate level, which one would expect if there were some unknown reasons modifying the sensitivity between preparations. Their cell body is located near the anterior auditory neuropil; the dendrites branched contralateral to the cell body within the anterior part of the auditory neuropil. Typical features were also dendritic branches toward the leg nerve and a semicircular projection to the ipsilateral side of the cell body within the posterior part of the neuropil.
The three interneurons were all sharply tuned to 2 kHz (Fig. 3B); almost no response occurred at 3.0 kHz and higher. They differed, however, in sensitivity, so that at 75 dB SPL LFD-1 showed a strong response even for the low-amplitude syllables of the chirp (approximately −20 dB), whereas LFT-1 and LFD-2 responded reliably only to the last syllables. Similar to the 2 kHz receptors, the response of all three neurons to the chirp was unaffected by the masking trill (Fig. 3C,D). Note that LFD-1 showed even a selective response to the chirp at an SNR of −21 dB (the trill being 21 dB more intense than the chirp). Quantitative average values for all three neurons and their responses with and without the masking trill are given in Figure 4. LFT-1 and LFD-2 responded with 9.2 and 8.1 APs to the unmasked chirp, respectively, and the more sensitive LFD-1 neuron generated 28 APs per chirp. There was no significant difference in the responses under masked and unmasked conditions.
Novelty detection
A second mechanism for selective responses to chirps exists in interneurons that are tuned to a broader range of frequencies. We identified three interneurons where the response to the continuous trill exhibits strong adaptation. In contrast, the response to the chirp is unaffected by the adaptation because of the 2 kHz component of the chirp not present in the trill (novelty detection; as suggested by Schul and Sheridan, 2006 and Schul et al., 2012).
The structures of these three interneurons in the prothoracic ganglion are shown in Figure 5A. The cell body of BFD-1 is located near the anterior border of the auditory neuropil and has a contralateral axon descending to the mesothoracic ganglion. Although its structure is similar to LFD-1 and LFD-2, its dendrites on both sides of the ganglion reveal consistent differences. BFA-1 has its cell body close to the dorsal surface of the anterior lobes and dendrites contralateral to the cell body and projects with an ascending axon to the subesophageal ganglion. BFT-1 has a descending and ascending axon and dendrites within the auditory neuropil. Its cell body is not located within the prothoracic ganglion.
These neurons respond to a broad range of frequencies, as evident in their iso-intensity function (Fig. 5B). A characteristic feature of all BFD-1 neurons is a peak of activity at 2 kHz, in addition to a broad frequency range at higher sonic and ultrasonic frequencies. In contrast, BFA-1 and BFT-1 showed a broadband tuning with strongest responses toward higher frequencies, but also including sensitivity to 2 kHz.
A common property of these interneurons is their strong adaptation to the onset of the trill (Fig. 5C). The time course of adaptation varies strongly between these neurons from several seconds in BFD-1 and BFA-1 to <100 ms in BFT-1. As a consequence, although the response to the chirp may be completely masked at the onset of the masking trill, it recovers due to adaptation. In contrast to the interneurons tuned exclusively to the 2 kHz component of the chirp, these responses under the masking trill after some seconds of adaptation are weaker compared with the unmasked condition (Fig. 5C,D). The comparison of response strength between the unmasked and masked condition shows differences between the three interneurons (8.3 and 4.4 APs per chirp in BFD-1, 34.8 and 15.4 APs per chirp in BFA-1, and 11.3 and 4.3 APs per chirp in BFT-1). However, even at an SNR of −15 dB, a chirp elicited a burst of 10 APs in BFA-1.
A neural mechanism for a stimulus selectivity based on novelty detection requires that only a specific stimulus component (like the 2 kHz component in the chirp) will elicit a response despite strong adaptations to other components in the continuous background. We tested this hypothesis by reducing the amplitude of the 2 kHz component in the chirp to the same level as in the trill (Siegert et al., 2013). Figure 6A shows responses of BFD-1 to chirps with and without the 2 kHz frequency component. Unmasked chirps with the reduced 2 kHz component elicited a strong response in this broadband neuron because of the high frequencies in the chirp, but under the masking condition of the trill there was no response to the chirp when the 2 kHz component was reduced (Fig. 6A, right).
The averaged results from such experiments in three BFD-1 neurons are shown in Figure 6B. In the masking condition of the trill, chirps elicited a reliable burst of APs, although reduced compared with the unmasked condition (7.3 ± 3.9 APs per chirp compared with 20.8 ± 10.3 APs per chirp). However, when the 2 kHz component in the chirp was equalized to the spectrum of the trill, the masked response in BFD-1 was almost absent (0.6 ± 0.6 APs per chirp compared with 17.3 ± 3.2 APs per chirp). These data reveal that the difference in frequency composition at 2 kHz between the conspecific signal and the heterospecific masker is essential for the mechanism of novelty detection. An important physiological characteristic in neurons that use a novelty detection mechanism is that the response to the signal is reduced due to adaptation to the continuous masker, but the masker evokes subthreshold EPSPs (see EPSPs in bottom of Fig. 6A). These EPSPs match precisely with the double syllables of the trill (Fig. 6C).
Quantitative values for all broadband neurons regarding tuning properties and chirp selectivity under noise are given in Figure 7. Similar to the representative neurons of Figure 5, the averages over more neurons revealed a bimodal activity for BFD-1 and a broadband tuning for BFA-1 and BFT-1 (Fig. 7A). These tuning properties have a significant impact on chirp selectivity, both in the unmasked and masked condition (Fig. 7B). Although BFT-1 revealed the lowest sensitivity it also showed the strongest adaptation to the trill leading to a better neural representation of the chirp under noise.
Table 1 emphasizes the degree of chirp selectivity for all identified interneurons by comparing responses to noise with responses to chirps under noise. The table includes all neurons that use either the mechanisms of frequency selectivity or novelty detection and revealed a statistically significant difference between the responses to the masking trill and the responses to the masked chirp.
As mentioned before, the selective tuning to the 2 kHz component in the chirp and the novelty detection mechanism provide selective responses in interneurons to the chirp at remarkably low SNRs. In LFD-1 and BFA-1 we found two different dependencies of masked responses when systematically varying the SNR from 0 dB to the masked threshold (Fig. 8). The masked response in LFD-1 at 0 dB SNR is rather strong with 28.8 ± 2.6 APs per chirp, and it gradually declines with decreasing SNRs, so that there is still a reduced, but reliable, response of 3.9 ± 3.0 APs per chirp at −18 dB SNR (Fig. 8B). In contrast, the masked responses in BFA-1 vary much less with decreasing SNR values, but the response at −18 dB SNR is still reliable with 3.0 ± 0.8 APs per chirp. To get an impression of the afferent information available for the insect at a reduced SNR of −15 dB via all identified interneurons, we show their representative responses in Figure 8C.
Discussion
The acoustic environment of many species with vocal communication can be characterized by what is commonly known as the cocktail-party problem, where receivers have to extract acoustic objects from a mixture of sound waves arriving at the ears from various sound sources in space. We have chosen to study solutions for a cocktail party-like problem in an insect, where signal detection is extremely difficult, because the signal (the chirp) shares most of the frequency spectrum with the long-lasting, intense masking trill of a competing species in the same habitat, in addition to the general high acoustic background in the nocturnal rainforest (Lang et al., 2005). We would expect to find robust solutions to the masking problem in the afferent processing of sensory information in insects with their limited number of neurons available (Schmidt and Römer, 2011; Hildebrandt et al., 2015). Such processing would free the CNS from the complicated task of distinguishing between afferent activity resulting from background noise and relevant signals, which is also the core of the matched-filter hypothesis (Capranica and Moffat, 1983; Wehner, 1989).
The solutions we describe are based on a small difference in the spectral composition between the chirp and the trill at 2 kHz, a component that is more intense in the chirp (Fig. 1D). The two neural mechanisms of selective tuning and novelty detection are both based on this small difference between the signal and the masker. They may represent the neuronal basis for the behavioral ability of males to detect and synchronize their calling songs even under the masking trills of the competing species (Siegert et al., 2013). At the receptor level, some receptors responded selectively to chirps masked by the trill due to their narrow tuning for 2 kHz. These responses were almost identical when chirps were presented with or without the masker, as high-frequency components of the trill failed to elicit any response. In contrast, receptors sensitive at higher frequencies generated strong responses to the trill, and the neural representation of the chirp was completely masked by the activity to the trill. This sheds new light on the ability of katydids for peripheral frequency discrimination, where recent studies have demonstrated the sophisticated biophysical mechanisms underlying the tonotopic representation of frequencies in their ear (Palghat Udayashankar et al., 2012; Montealegre-Z et al., 2012), but most of the detailed frequency representation appears to be lost in afferent interneurons due to strong neuronal convergence from receptors onto interneurons. This convergence would completely abolish any detailed frequency information, so that differences in the frequency content of two signals, or signal and masker, could not be used any more for discrimination.
However, our present findings clearly indicate that selective tuning to a particular component of an otherwise broad frequency range in a conspecific signal may provide the basis for successful listening under strong masking conditions. At the level of auditory interneurons we identified two neural mechanisms underlying masked signal detection. The first is based on selective tuning to the low-frequency components of the chirp. A similar mechanism was described by Schul (1999) for song discrimination in Tettigonia viridissima. Consistent with the tonotopy of the auditory neuropil (Römer, 1983; Stolting and Stumpner, 1998), the dendrites of these neurons overlap with the axonal arborizations of the low-frequency receptors. Apparently, these neurons receive only inputs from sharply tuned low-frequency receptors, and inhibitory mechanisms to sharpen the tuning of a neuron for selective coding of relevant stimuli are not necessary (Stumpner, 1998).
The second mechanism for signal detection under noise is based on novelty detection as described previously (Schul and Sheridan, 2006; Schul et al., 2012; Siegert et al., 2013). The mechanism has two characteristics: broadband interneurons are excited with bursts of APs at the onset of the trill, but show subsequent strong adaptation. In the adapted state they respond only to the frequency component not present in the trill, i.e., the 2 kHz “novelty” of the chirp. Thus, when the 2 kHz component was equalized to the level of the trill, all components of the chirp were completely masked by the trill and no “novel” signal trait could elicit a response (Fig. 6). However, these interneurons exploit the entire frequency range for responses to unmasked chirps.
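The logic of this mechanism can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not the analysis used in this study: a single broadband unit sums two frequency bands, each band adapts independently toward its current drive, and the unit is counted as active whenever the summed unadapted drive exceeds a threshold. All names and numerical values (band levels, thresholds, adaptation time constant) are hypothetical and chosen only for illustration.

```python
import math

BANDS = ("low_2kHz", "high")

# Hypothetical parameters, not measured values.
RECEPTOR_THRESHOLD_DB = 30.0  # level below which a band provides no drive
SPIKE_THRESHOLD = 5.0         # summed unadapted drive needed for a response
TAU_STEPS = 20.0              # adaptation time constant, in time steps

# Hypothetical band levels in dB; the chirp is much weaker than the trill
# in the high band but stronger in the 2 kHz band.
TRILL = {"low_2kHz": 35.0, "high": 60.0}
CHIRP = {"low_2kHz": 50.0, "high": 40.0}


def combined_level_db(*levels_db):
    """Power-sum sound levels given in dB; None marks an absent component."""
    power = sum(10 ** (lv / 10) for lv in levels_db if lv is not None)
    return 10 * math.log10(power) if power > 0 else None


def drive(level_db):
    """Excitatory drive of one band: dB above the receptor threshold."""
    if level_db is None:
        return 0.0
    return max(0.0, level_db - RECEPTOR_THRESHOLD_DB)


def simulate(trill, chirp, chirp_onsets, n_steps=600, chirp_len=5):
    """Return the time steps at which the model unit is above threshold.

    The trill (if any) is on from step 0; chirps are added at chirp_onsets.
    """
    adapted = {band: 0.0 for band in BANDS}
    active = []
    for t in range(n_steps):
        chirp_on = any(on <= t < on + chirp_len for on in chirp_onsets)
        total = 0.0
        for band in BANDS:
            level = combined_level_db(trill[band],
                                      chirp[band] if chirp_on else None)
            d = drive(level)
            # Only the part of the drive not yet adapted away excites the unit.
            total += max(0.0, d - adapted[band])
            # Band-specific adaptation relaxes toward the current drive.
            adapted[band] += (d - adapted[band]) / TAU_STEPS
        if total > SPIKE_THRESHOLD:
            active.append(t)
    return active


if __name__ == "__main__":
    masked = simulate(TRILL, CHIRP, [300])
    equalized_chirp = dict(CHIRP, low_2kHz=TRILL["low_2kHz"])
    equalized = simulate(TRILL, equalized_chirp, [300])
    print("masked chirp detected:", 300 in masked)
    print("equalized chirp detected:", 300 in equalized)
```

In this sketch the unit responds at trill onset, falls silent as both bands adapt, and responds again to the chirp only because the 2 kHz band receives a level increase that the adapted state does not cancel. Lowering the chirp's 2 kHz component to the trill's level removes that increase, which corresponds qualitatively to the equalization experiment described above.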
Schul and Sheridan (2006) described a similar mechanism for the detection of a bat echolocation signal in the presence of a continuous conspecific call, with reliable responses to the bat call at SNRs of −18 dB at the masked threshold. Triblehorn and Schul (2013) proposed a model in which stimulus-specific adaptation occurs in the dendrites for one signal without preventing a response to the other signal. Prerequisites for this model are tonotopic projections of auditory receptors within the auditory neuropil and their overlap with dendritic branches of interneurons (Römer, 1983; Römer et al., 1988; Ebendt et al., 1994). Such a model would fit quite well with our current results, since the interneurons receive low-frequency input at dendritic areas that are separated from those receiving high-frequency input, allowing them to generate suprathreshold responses during adaptation.
Our data indicate that LFRs are less sensitive than HFRs, as the excitation at best frequency is lower (Fig. 2). As the sensitivity for the 2 kHz component is crucial for the response selectivity to chirps (Fig. 6), we would expect a breakdown of selectivity when the 2 kHz component falls below threshold at greater sender–receiver distances. However, in the natural habitat low frequencies suffer much less from excess attenuation than higher frequencies (Römer and Lewald, 1992). It is thus likely that at greater communication distances the 2 kHz component can still activate LFRs, due to the low-frequency advantage in sound transmission, although their absolute sensitivity is reduced compared with HFRs.
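The trade-off between lower receptor sensitivity and lower excess attenuation can be made concrete with a simple calculation. The sketch below assumes spherical spreading (a loss of 20·log10 of the distance ratio) plus an excess attenuation that grows linearly with distance, which is a common simplification and not a model fitted to this habitat. The function names, source level, thresholds, and attenuation rates are all hypothetical.

```python
import math


def level_at_distance(source_db, distance_m, excess_db_per_m, ref_m=1.0):
    """Sound level at distance_m, given the level source_db at ref_m.

    Assumes spherical spreading plus a linear excess attenuation
    (excess_db_per_m); both are simplifying assumptions.
    """
    spreading = 20 * math.log10(distance_m / ref_m)
    return source_db - spreading - excess_db_per_m * (distance_m - ref_m)


def max_detection_distance(source_db, threshold_db, excess_db_per_m,
                           step_m=0.1, limit_m=500.0):
    """Largest scanned distance at which the level still reaches threshold."""
    best = 0.0
    n_steps = int(round((limit_m - 1.0) / step_m))
    for i in range(n_steps + 1):
        d = 1.0 + i * step_m
        if level_at_distance(source_db, d, excess_db_per_m) < threshold_db:
            break  # level decreases monotonically with distance
        best = d
    return best


# Hypothetical values: the low-frequency receptor is 10 dB less sensitive,
# but its frequency band suffers far less excess attenuation.
low_range = max_detection_distance(source_db=80.0, threshold_db=45.0,
                                   excess_db_per_m=0.05)
high_range = max_detection_distance(source_db=80.0, threshold_db=35.0,
                                    excess_db_per_m=1.0)

if __name__ == "__main__":
    print(f"low-frequency component detectable up to ~{low_range:.1f} m")
    print(f"high-frequency component detectable up to ~{high_range:.1f} m")
```

With these illustrative numbers the less sensitive low-frequency channel still reaches threshold at a greater distance than the more sensitive high-frequency channel. Whether this holds in the field depends on the actual source levels, thresholds, and habitat-specific attenuation.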
Apparently, both mechanisms, selective low-frequency tuning and novelty detection, are highly adaptive under the specific ecological conditions of acoustic communication for the Mecopoda chirper, because they provide a reliable representation of the conspecific signal at unfavorable SNRs (Fig. 8). A comparison with homologous neurons in other katydid and cricket species may indicate whether and how existing properties of auditory receptors and interneurons in Mecopoda have evolved as a result of the ecological constraints of strong masking.
For example, interneurons with a morphology similar to that of LFD-1 or other LFD interneurons have been described in other katydids and crickets (Wohlers and Huber, 1982; Römer, 1987). The DN1 neuron in crickets, for instance, is tuned to low frequencies between 1 and 3 kHz, but with a broader tuning than LFD neurons. It is thus possible that the Mecopoda chirper recruited already existing interneurons that serve other functions in other species: by sharply tuning the input to 2 kHz, the representation of masked chirps is greatly enhanced. BFA-1 interneurons may represent a similar case. They may be homologous to ascending neurons AN1–AN3 described by Stumpner (1997) and Stumpner and Molina (2006) in Ancistrura nigrovittata; however, these neurons were tuned to higher frequencies compared with BFA-1 in Mecopoda. Again, these tuning properties of broadband interneurons may easily result from modifications of synaptic input from auditory receptors to the dendrites over evolutionary time for the specific task of detecting the 2 kHz component in the masked signal. One of the most obvious cases of a small but functionally important evolutionary modification is BFD-1, with its second peak of sensitivity sharply restricted to 2 kHz. This low-frequency peak in sensitivity strongly suggests an evolutionary innovation providing one of the two requirements for novelty detection: given that the dendritic mechanisms contributing to stimulus-specific adaptation had already been established (Triblehorn and Schul, 2013), the addition of a sharply tuned 2 kHz input to an existing broad frequency tuning would make such a neuron an ideal candidate for detecting the chirp under the masking trill of the competing species.
In this context, an obvious series of experiments would require a comparison of homologous interneurons in the two competing species, the Mecopoda chirper and the trilling species, since such comparison may provide significant insights into how neural mechanisms of signal detection under noise are shaped by evolution.
An important further goal is to reveal the projection areas of the described ascending and descending interneurons and link them to pattern recognition and/or motor control areas. The projections of descending interneurons may be linked to motoneurons of flight muscles and/or motor control areas for phonotaxis in females. Accordingly, the two mechanisms may be used for different tasks depending on their projection areas. Interneurons that exploit a broader frequency range and generate responses with robust representations of the temporal signal pattern over distance (BFA-1) may be used for pattern recognition, whereas low-frequency interneurons (LFIs) such as LFD-1, with their very fast response to the onset of chirps, may be involved in more direct pathways for motor control. A parallel, faster and more direct pathway for motor control has been proposed for crickets (Hedwig and Poulet, 2004, 2005), which is activated once the conspecific pattern has been recognized.
Footnotes
Research was funded by the “Austrian Science Foundation” (FWF), Projects P21808-B09 and P23896-B24. We thank Manfred Hartbauer for experimental support and fruitful discussions and Berthold Hedwig for his critical comments on an earlier draft of this manuscript.
The authors declare no competing financial interests.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.
Correspondence should be addressed to Konstantinos Kostarakos, Institute of Zoology, Karl-Franzens-University, Universitaetsplatz 2, 8010 Graz, Austria. konstantinos.kostarakos@uni-graz.at
This article is freely available online through the J Neurosci Author Open Choice option.