As opposed to visual imaging, biosonar imaging of spatial object properties represents a challenge for the auditory system because its sensory epithelium is not arranged along space axes. For echolocating bats, object width is encoded by the amplitude of its echo (echo intensity) but also by the naturally covarying spread of angles of incidence from which the echoes impinge on the bat's ears (sonar aperture). It is unclear whether bats use the echo intensity and/or the sonar aperture to estimate an object's width. We addressed this question in a combined psychophysical and electrophysiological approach. In three virtual-object playback experiments, bats of the species Phyllostomus discolor had to discriminate simple reflections of their own echolocation calls differing in echo intensity, sonar aperture, or both. Discrimination performance for objects with physically correct covariation of sonar aperture and echo intensity (“object width”) did not differ from discrimination performances when only the sonar aperture was varied. Thus, the bats were able to detect changes in object width in the absence of intensity cues. The psychophysical results are reflected in the responses of a population of units in the auditory midbrain and cortex that responded strongest to echoes from objects with a specific sonar aperture, regardless of variations in echo intensity. Neurometric functions obtained from cortical units encoding the sonar aperture are sufficient to explain the behavioral performance of the bats. These current data show that the sonar aperture is a behaviorally relevant and reliably encoded cue for object size in bat sonar.
The neural encoding of object size is an important function of sensory systems. In the visual system, the spatial extent of an object, i.e., its visual aperture, is explicitly encoded in terms of the extent of the image on the retina. In the auditory system, however, frequency instead of space is explicitly encoded and auditory space information must be computed in the central auditory system. This problem becomes especially relevant for a bat, which recruits its auditory system to image its surroundings. Through echolocation, bats derive a sensory image not only about the position of an object but also about its spatial extent and 3D shape.
The ability of bats to classify complex 3D objects based on physical properties like shape, orientation, surface structure, or object size has been investigated in both psychophysical and neurophysiological studies (Habersetzer and Vogler, 1983; Schmidt, 1988, 1992; Von Helversen and Von Helversen, 1999; Sanderson and Simmons, 2002; Grunwald et al., 2004; von Helversen, 2004; Firzlaff et al., 2006; Borina et al., 2008; Falk et al., 2011). There are few studies addressing the acoustic cues that underlie size perception in the echo imaging of bats (Simmons and Vernon, 1971; Simon et al., 2006; Firzlaff and Schuller, 2007; Firzlaff et al., 2007), but it is unclear by which echo-acoustic parameters the width of an ensonified object is encoded and how these parameters are processed in the bat's auditory system.
A wider object creates a louder echo because the surface area reflecting the bat's call increases (Simmons and Vernon, 1971). Additionally, the “spread of angles of incidence” from which the echoes impinge on the bat's ears increases with increasing object width. In the present study, this spread of angles of incidence is called the sonar aperture of an object. Using real objects, as in earlier approaches, precludes an experimental isolation of these echo-acoustic cues. This problem can be overcome by the use of virtual objects allowing systematic manipulation and analysis of well isolated object properties (Schmidt, 1988; Weissenbacher and Wiegrebe, 2003). The aim of the present study is to quantify the relevance of echo intensity and sonar aperture for the perceptual evaluation of object width by bats and to find a possible neural correlate for the bat's behavioral performance.
The psychophysical results show that, to discriminate objects of different width, Phyllostomus discolor predominantly uses sonar aperture even if echo-intensity information is also available. A control experiment shows that in the absence of sonar aperture information, the bats are quite sensitive to changes in echo intensity. The perceptual salience of the sonar aperture for object-width discrimination is supported by the electrophysiological results: both in the inferior colliculus (IC) and auditory cortex (AC), a population of units is found that responds most strongly to echoes of an object with a certain sonar aperture independent of echo intensity. This neural population may represent the basis for the bat's ability to deduce an object's width only through the sonar aperture of its echoes.
Materials and Methods
The bat species used in this study was the neotropical phyllostomid bat, P. discolor. The bats came from a breeding colony in the Department of Biology II of the Ludwig-Maximilians-University in Munich. These bats emit short (<3 ms) broadband, downward frequency-modulated, multiharmonic echolocation calls in the frequency range between 45 and 100 kHz (Rother and Schmidt, 1982). P. discolor feeds mainly on fruit, pollen, and insects (Nowak, 1994).
The psychophysical experiments were implemented as active-acoustic, virtual-object playback experiments in which bats had to discriminate echoes of their own echolocation calls that were presented as reflections from a virtual object extending along the azimuth.
Five adult males and one female P. discolor with a body weight ranging between 30 and 50 g participated in the psychophysical experiments. On training days, the individuals were kept in a cage (80 × 60 × 80 cm). After training sessions, the animals could fly freely in a room of 12 m² until the next morning. All individuals had access to water ad libitum. Training took place in daily sessions of 20 min on 5 d per week, followed by a 2 d break. The bats were fed fruit pulp as a reward during training sessions. On days without training, the animals had access to fruit and mealworms (larvae of Tenebrio molitor) ad libitum.
All psychophysical experiments were conducted in complete darkness inside an echo-attenuated chamber (2.1 × 1.8 × 2.1 m). In this chamber, a Y-shaped maze (Y-maze) placed in a semicircular wire mesh cage (radius = 55 cm) was inversely mounted on a metal post at an angle of 45° (Fig. 1). The top end of the Y-maze held a starting perch, whereas a feeder was located at the end of each leg. The angle between the legs measured 90°. Two ultrasonic microphones (CO 100K, Sanken) were installed above the feeders of the maze pointing toward the bat's perch. The cage was plane-parallel arranged towards a semicircular loudspeaker array (radius = 71 cm) that consisted of 34 ultrasonic ribbon loudspeakers (NeoCD1.0, Fountek). The speakers' front plates were covered with plane acoustic foam except for each speaker's membrane (8.0 × 38.0 mm). The speaker array was subdivided into right and left hemispheres, each consisting of 17 speakers. The spatial separation between adjacent speakers in each hemisphere was 5.6°. Each of the 34 speakers was calibrated against a 1/8 inch microphone (without protective grid; Type 4138, Brüel & Kjaer) positioned at the bat's starting perch and oriented perpendicular to the speaker axes. The measured impulse response of each speaker was divided by the impulse response of an ideal bandpass filter (47th-order finite impulse response, cutoff frequencies of 15 and 94 kHz) to generate a compensatory impulse response. Every echo presented over one of the speakers was convolved in real time with this speaker's compensatory impulse response. Thus, it was ensured that all 34 speakers provided a linear frequency response between 15 and 94 kHz and a linear phase at the starting perch of the bat.
Each echolocation call emitted by a bat in the setup was picked up by the microphones and amplified (QuadMic, RME) and digitized [HD 192, MOTU; three devices with 12 analog-to-digital (AD) and digital-to-analog (DA) channels each and a 424 PCI board, MOTU] at a rate of 192 kHz. After determining the required echo level (see below, Stimuli), the calls were convolved with the compensatory impulse response of the particular speaker, DA converted, and amplified (AVR 347, Harman Kardon; five devices with seven channels each) before being sent to the speaker. The input–output delay of the system, together with the physical propagation delay from the bat to the microphones and from the speakers to the bat, added up to 6.7 ms, which corresponds to a fixed distance of the virtual object of 1.12 m.
Residual physical echoes from the experimental setup arrived much earlier: the distance between the perch and the speakers as the source of the latest physical echoes was 0.71 m. The distance difference between the physical and virtual echoes of 41 cm was much too large to create a spectral interference pattern between the physical and virtual echoes.
For acoustic monitoring during the experiments, the digitized signals from a third (central) microphone were multiplied with a 45 kHz pure tone. The resulting difference frequency was in the audible range and sent via an additional DA channel of the MOTU HD 192 and the remaining channel of one of the amplifiers to headphones (K 240 DF, AKG).
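The monitoring scheme rests on the product-to-sum identity for two tones: multiplying a signal component by a 45 kHz tone yields one component at the difference frequency and one at the sum frequency. The following minimal sketch illustrates this for a single spectral component; the function names are hypothetical and the unit amplitudes are an assumption made for illustration only.

```python
import math

CARRIER_HZ = 45_000.0  # pure-tone frequency stated in the text


def heterodyne_components(f_signal_hz, f_carrier_hz=CARRIER_HZ):
    """Return the (difference, sum) frequencies created by multiplying two tones."""
    return abs(f_signal_hz - f_carrier_hz), f_signal_hz + f_carrier_hz


def product_of_tones(t_s, f_signal_hz, f_carrier_hz=CARRIER_HZ):
    """One sample of the product of two unit-amplitude cosines at time t_s (s)."""
    return (math.cos(2 * math.pi * f_signal_hz * t_s)
            * math.cos(2 * math.pi * f_carrier_hz * t_s))


def sum_of_components(t_s, f_signal_hz, f_carrier_hz=CARRIER_HZ):
    """The same sample written as half the sum of the two heterodyne components."""
    f_diff, f_sum = heterodyne_components(f_signal_hz, f_carrier_hz)
    return 0.5 * (math.cos(2 * math.pi * f_diff * t_s)
                  + math.cos(2 * math.pi * f_sum * t_s))
```

For example, a 50 kHz call component maps to a difference frequency of 5 kHz, which lies in the audible range, whereas the sum component at 95 kHz does not.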
The experimenter was seated outside the chamber, observing and controlling the experimental procedure via infrared camera and computer interface. Experimental control, data acquisition, and analysis were implemented in MATLAB 7.5 (MathWorks). For the control of the MOTU system, SoundMexPro software (HörTech) was used.
Each microphone recorded the animal's ultrasonic calls emitted toward its corresponding hemisphere. The virtual objects were implemented as simple reflectors. Echo intensity was manipulated by setting the attenuation of the echo before the DA conversion in each channel. Sonar aperture was manipulated by changing the number of adjacent speakers presenting an echo (see Fig. 1). As a result, 2D echo patterns differing in spatial and intensity information could be presented from both hemispheres of the speaker array.
When the sonar aperture was increased, the number of adjacent speakers presenting the echo of the call picked up by a microphone increased. The number of adjacent speakers was always increased symmetrically around the central speakers of each hemisphere such that the spatial “center of gravity” remained unchanged. Complex spatial interference patterns were generated because the speakers in each hemisphere emitted the same, fully coherent sound and were all at the same distance from the bat's perch. Note, however, that the same would be true for the reflections of a real surface when it is equidistant, i.e., bent around the bat's perch: if one imagined the surface as consisting of a number of point reflectors, the lateral reflections from these point reflectors would interfere in the same way as the sounds from the speaker membranes in the current setup. The net effect of the interference pattern is that at the bat's starting position echoes from the speakers (and from a surface bent in the same way as the speaker array) add up constructively to create a strong overall echo. Moving out of this “focal point” decreases echo intensity dramatically due to destructive interference. This destructive interference, however, would occur for a real equidistant object just as it does for the virtual object used here. It is clear that such an equidistant surface is an unnatural object for a bat; however, it is the only reasonable object to use when trying to isolate the sonar aperture and echo intensity as the parameters of interest. All other objects would introduce space-dependent changes in echo delay, a confounding parameter to which bats are very sensitive (Simmons, 1971, 1973).
The bats were trained on three experiments. Each experiment followed a two-alternative, forced-choice paradigm with food reward. The animals were trained to discriminate a rewarded virtual object (RO) from an unrewarded virtual object (UO). A bat indicated its decision by crawling toward one of the two feeders, and only correct decisions were rewarded. When a bat's performance exceeded 80% correct choices on 5 consecutive training days, data acquisition was started. Here, trials were arranged according to a staircase procedure: acquisition started with a block of three to five trials with easily discriminable virtual objects. For each subsequent block, the task difficulty was increased until the bat's performance approached chance level. Psychometric functions are based on 50 trials per UO. Significance was set at p < 0.01 based on a binomial test. The rewarded hemisphere was selected pseudorandomly (Gellermann, 1933).
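The binomial criterion can be made concrete with a short calculation. The sketch below assumes a one-sided test against a chance level of 0.5 in the two-alternative task; the text does not state whether the test was one- or two-sided, and the function names are hypothetical.

```python
from math import comb


def binomial_tail(n_trials, k_correct, p_chance=0.5):
    """Probability of k_correct or more correct choices under pure guessing."""
    return sum(
        comb(n_trials, k) * p_chance ** k * (1.0 - p_chance) ** (n_trials - k)
        for k in range(k_correct, n_trials + 1)
    )


def min_correct_for_significance(n_trials=50, alpha=0.01, p_chance=0.5):
    """Smallest number of correct choices whose one-sided tail falls below alpha."""
    for k in range(n_trials + 1):
        if binomial_tail(n_trials, k, p_chance) < alpha:
            return k
    return None  # criterion not reachable with this few trials
```

Under these assumptions, 50 trials per UO give a criterion of 34 correct choices (68% correct).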
In the Width experiment, discrimination performance for object width, represented by the physically correct covariation of echo intensity and sonar aperture, was tested. Bats had to discriminate echoes transmitted by a single speaker (RO) in one hemisphere from echoes transmitted by three or more speakers (UO) in the other hemisphere. Every single speaker of the UO provided the same echo intensity as the RO; thus, the sonar aperture covaried with echo intensity. The width of the UOs measured between 11° (the angular separation of three adjacent speaker membranes) and 90° (17 adjacent speaker membranes); the corresponding echo-intensity differences, measured at the starting perch, were between 9.5 and 24.6 dB.
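The stated widths and intensity differences follow from the array geometry and from coherent summation at the perch. As a minimal sketch, assuming that the width is the angular span between the outermost active speakers (5.6° spacing) and that N identical, in-phase echoes sum to N times the amplitude of one echo, the values can be reproduced as follows; the function names are hypothetical.

```python
import math

SPEAKER_SPACING_DEG = 5.6  # angular separation of adjacent speakers (from the text)


def object_width_deg(n_speakers, spacing_deg=SPEAKER_SPACING_DEG):
    """Angular span between the outermost of n adjacent speakers."""
    return (n_speakers - 1) * spacing_deg


def coherent_level_gain_db(n_speakers):
    """Level increase when n identical, in-phase echoes add up (amplitude x n)."""
    return 20.0 * math.log10(n_speakers)


# Print width and level gain for every UO from 3 to 17 speakers.
for n in range(3, 18, 2):
    print(n, round(object_width_deg(n), 1), round(coherent_level_gain_db(n), 1))
```

Three speakers give about 11° and 9.5 dB, and 17 speakers give about 90° and 24.6 dB, in agreement with the values reported above.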
“Sonar aperture” experiment.
This experiment was identical to the Width experiment with the following exceptions. First, the echo intensity of each speaker transmitting the UO was reduced such that the intensity of the waveforms, summed up across all speakers constituting the UO, was equal to that of the RO. Second, echo intensity was roved (±6 dB) between the RO and the UO and over trials to preclude the bat's use of residual intensity differences to solve the task.
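The two manipulations can be sketched as follows. The sketch assumes coherent, in-phase summation at the perch, so that each of N speakers must be attenuated by 20·log10(N) dB for the sum to match a single speaker. It also assumes that the rove was drawn from a uniform distribution, independently for each object and trial; the text gives only the ±6 dB range. All function names are hypothetical.

```python
import math
import random


def equalizing_attenuation_db(n_speakers):
    """Per-speaker attenuation that makes n in-phase echoes sum to one echo."""
    return 20.0 * math.log10(n_speakers)


def summed_amplitude(n_speakers, attenuation_db):
    """Relative amplitude of n in-phase echoes, each attenuated by attenuation_db."""
    return n_speakers * 10.0 ** (-attenuation_db / 20.0)


def roved_level_db(nominal_level_db, rove_db=6.0, rng=random):
    """Nominal level plus a random offset within +/- rove_db (uniform: assumption)."""
    return nominal_level_db + rng.uniform(-rove_db, rove_db)
```

Drawing a separate rove for the RO and the UO in every trial makes the summed intensity an unreliable cue, which is the stated purpose of the manipulation.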
In the Intensity experiment, the perceptual threshold of P. discolor for differences in echo intensity was tested. Virtual objects were only presented by the center speakers (45° position) in each hemisphere. Echo-intensity differences were presented in steps of 1 dB; the maximal difference was 10 dB. In each trial, bats had to decide which virtual object provided the lower echo intensity.
For the electrophysiological experiments four P. discolor were used. Recording sessions lasted 4 h and were performed 4 d per week for up to 8 weeks. After experiments and on experiment-free days the bats had access to food and water ad libitum.
All experiments complied with the principles of laboratory animal care and were conducted under the regulations of the current version of the German Law on Animal Protection (approval 55.2-1-54-2531-128–08, Regulation Oberbayern). The animals were initially anesthetized by subcutaneous injection of a mixture of 0.4 μg of medetomidine (Domitor, Novartis), 4 μg of midazolam (Midazolam-ratiopharm, ratiopharm GmbH), and 0.04 μg of fentanyl (Fentanyl-Janssen, Janssen-Cilag) per 1 g of body weight of the animal. The surgery was previously described in detail by Schuller et al. (1991). In short, the skin overlying the cranium was cut along the midline. The cranium was freed from remaining tissue, and a small metal tube was attached to the rostral part of the cranium using a microglass composite (GLUMA Comfort Bond, Heraeus Kulzer). To avoid inflammation, the antibiotic enrofloxacin (Baytril, 0.5 μg/g body weight; Bayer AG) was injected subcutaneously. After surgery, the anesthesia was antagonized with a mixture of atipamezole hydrochloride (Antisedan, Novartis), flumazenil (Anexate, Hoffmann-La Roche), and naloxone (DeltaSelect GmbH), which was applied subcutaneously (2.5, 0.5, and 1.2 μg/g body weight, respectively). To reduce postoperative pain, the analgesic meloxicam (Metacam, 0.2 mg/kg body weight; Boehringer-Ingelheim) was applied subcutaneously.
Stereotaxic fitting procedure.
After surgery, stereotaxic fitting according to Schuller et al. (1986) was performed to guide the access to the subsequent recording positions.
For verification of the recording sites a tracer (wheat germ agglutinin conjugated to horseradish peroxidase; Sigma) was injected at a defined position in the brain. After histological processing of the brain, recording sites were reconstructed in brain atlas coordinates (B. Schwellnuss, T. Fenzl, A. Nixdorf, unpublished data).
Acoustic stimuli and data acquisition.
All stimuli were computer generated (MATLAB; MathWorks), DA converted (RX6, sampling rate 260 kHz; Tucker-Davis Technologies), and fed into a programmable attenuator (PA5, Tucker-Davis Technologies). The signal was amplified (custom-made amplifiers) and then presented to the animal via ultrasonic earphones (custom made; Schuller, 1997). The frequency response of the ultrasonic earphones was flat (±3 dB) between 20 and 100 kHz.
For measuring the frequency response area (FRA) of neurons, pure tone stimuli with different frequency–intensity combinations were used. The procedure was described in detail previously by Hoffmann et al. (2008).
Spatial receptive fields of neurons were measured using a standard echolocation call of P. discolor that was convolved with the head-related impulse responses (HRIRs) for the left and right ear of the corresponding position in space (Firzlaff and Schuller, 2003). The receptive field was measured in steps of 15° in the frontal hemisphere, covering ±82.5° in azimuth and elevation. Stimuli were presented in a randomized order and repeated 10 times. The intensity of the loudest echo was adjusted such that the receptive field covered at least 60° in azimuth. Usually the intensity was around 40 dB above the neuron's pure tone threshold.
To generate echoes of objects with a specific width, the echolocation call was first convolved with the HRIRs corresponding to several adjacent horizontal positions in space (see below in this section), and the resulting echoes were summed up across the positions to generate the stimuli as they would add up on the bat's eardrums. The virtual object was centered in the spatial receptive field measured earlier. This was done to ensure that echoes over the whole range of object widths were within the excitatory region of the spatial receptive field of the unit under study. In addition, it was shown that the position of the pinnae influences the position of the spatial receptive field (Sun and Jen, 1987). It is reasonable to assume that in the psychophysical experiments the bats moved their pinnae to focus on the virtual objects. Thus, the procedure of centering the virtual object in the receptive field of a unit resembles the psychophysical paradigm. As in the psychophysical experiments, wider objects were generated by adding echoes symmetrically around the center of the receptive field. Adjacent positions were separated by 7.5° resulting in virtual objects with a width of 15, 30, 45, and 60° for 3, 5, 7, and 9 adjacent echo positions (Fig. 2). Echoes from a single position in the frontal hemisphere are referred to as having a sonar aperture of 0°.
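The stimulus construction for one ear can be sketched as follows: the call is convolved with the HRIR of each position, and the resulting echoes are summed sample by sample. This is a minimal illustration only. The dictionary `hrirs_by_azimuth`, the function names, and the requirements that all HRIRs have equal length and that the number of positions is odd are assumptions of the sketch.

```python
POSITION_SPACING_DEG = 7.5  # separation of adjacent echo positions (from the text)


def convolve(signal, kernel):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def object_positions_deg(center_deg, n_positions, spacing_deg=POSITION_SPACING_DEG):
    """Azimuths of n_positions (odd) echoes placed symmetrically around the center."""
    half = (n_positions - 1) // 2
    return [center_deg + i * spacing_deg for i in range(-half, half + 1)]


def virtual_object_echo(call, hrirs_by_azimuth, center_deg, n_positions):
    """Sum of the call convolved with the HRIR of each position (one ear)."""
    positions = object_positions_deg(center_deg, n_positions)
    echoes = [convolve(call, hrirs_by_azimuth[az]) for az in positions]
    return [sum(samples) for samples in zip(*echoes)]
```

The same routine would be run once with the left-ear and once with the right-ear HRIR set. Three positions span 15° and nine positions span 60°, as in the stimulus set described above.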
The summing of echoes generated with different HRIR sets results in complex interference patterns. Note, however, that these are the interference patterns as they would occur in the pinnae of our experimental animal as defined by the HRIRs. The application of HRIRs thus makes the bat a unique directional stereo receiver of the echoes as opposed to a single omnidirectional microphone. On such a microphone, the echoes from different horizontal directions would add up in phase resulting in a 6 dB increase in echo intensity per doubling of the number of echoes (Fig. 2H). After using the bat's HRIRs, the intensity does not increase monotonically with increasing object width (Fig. 2H) due to the destructive interference in the bat's pinnae.
The echoes were presented at different overall intensities. For one of the four experimental animals, intensities covered a range of 24 dB in steps of 6 dB; for the other three animals, intensities covered a range of 12 dB in steps of 3 dB. The lowest intensity was adjusted to the intensity used for the receptive-field measurements. The resulting five-by-five stimulus matrix was presented randomized with 40 repetitions. The repetition rate was ∼2 Hz. As a monaural control, the five-by-five stimulus matrix was presented only to the contralateral ear to determine the degree to which neural responses depend on the binaural stimulation.
All experiments were conducted in an anechoic, electrically shielded, and heated (∼36°C) chamber. Earphones were inserted into the animal's ear canals. Extracellular recordings were made with glass-insulated tungsten microelectrodes (2 MΩ impedance; Alpha-Omega GmbH).
The electrode signal was recorded for 450 ms starting 50 ms before stimulus presentation. The electrode signal was amplified (ExAmp-20KB, M2100; Kation Scientific), bandpass filtered (300–3000 Hz, PC1; Tucker-Davis Technologies), AD converted (RP2.1, sampling rate 25 kHz; Tucker-Davis Technologies), and finally stored on a PC using Brainware (Tucker-Davis Technologies).
Since it was not always possible to analyze the responses of a single neuron, the term “unit” is used in this text for responses derived from one to three neurons. The number of neurons contributing to a unit was estimated from the number of different spike waveforms that could typically be discriminated visually on the oscilloscope screen during recording, e.g., in terms of positive/negative amplitudes and/or spike duration.
Data were analyzed with MATLAB (MathWorks). Spikes evoked by all stimuli were displayed as peristimulus time histograms. As a large variety of response patterns across the units was observed, especially in the auditory cortex, spike responses were analyzed using a sliding window to determine the individual response duration of a unit (Schlack et al., 2005). This analysis window was set automatically by moving a 10 ms window in 1 ms steps over the time course of recorded activity. The first point at which two successive windows led to significant differences (Wilcoxon signed rank test, p < 0.01) in neuronal activity compared with the first 10 ms window (spontaneous activity recorded before the stimulation) was taken as the start of the analysis window. The end of the analysis window was set to the last position of two successive windows that differed significantly from spontaneous activity. For each stimulus, all spikes occurring in this analysis window were summed up.
Best frequency (BF) and threshold of a unit were determined from the FRA. The frequency where a significant response could be elicited at the pure-tone threshold was defined as the BF of a unit. Responses to different frequency-level combinations were considered to be significant if the spike count exceeded a threshold of 20% of the maximum response.
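A minimal sketch of this criterion is given below. It assumes that the FRA is stored as a mapping from sound level to a mapping from frequency to spike count, and that when several frequencies are significant at threshold, the one with the highest spike count is taken as the BF. The data layout, the tie rule, and the function name are assumptions of the sketch and are not taken from the text.

```python
def best_frequency(fra, criterion=0.2):
    """Return (BF, threshold level) from an FRA, or None if nothing is significant.

    fra: dict mapping level (dB) -> dict mapping frequency (kHz) -> spike count.
    A response is significant if it exceeds criterion x the maximum spike count.
    """
    max_count = max(count for row in fra.values() for count in row.values())
    cutoff = criterion * max_count
    for level in sorted(fra):  # search upward from the lowest level
        significant = {f: c for f, c in fra[level].items() if c > cutoff}
        if significant:
            # Assumption: with several significant frequencies, take the strongest.
            return max(significant, key=significant.get), level
    return None
```

For an FRA whose maximum is 20 spikes, responses must exceed 4 spikes to count, and the lowest level containing such a response defines the threshold.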
To analyze the responses to virtual objects, the number of spikes for each width–intensity combination was arranged in a five-by-five matrix. In this response matrix, the object width increases along the abscissa, whereas the echo intensity increases along the ordinate (see Fig. 4). When a Kolmogorov–Smirnov test (MATLAB Statistics Toolbox; MathWorks) revealed a significant difference (p < 0.05) in the response strength in one or more rows compared with all other rows in the response matrix, the unit was categorized as an “Intensity” unit. If one or more columns were found to differ significantly, the unit was categorized as a “Sonar aperture” unit. When a unit showed significant response differences along both dimensions, it was categorized as an “Ambiguous” unit. No significant difference in any tested dimension of the matrix was the criterion for “Insensitive” units. To compare the response matrix of a Sonar aperture unit derived by normal (binaural) stimulation to the response matrix with monaural contralateral stimulation, a 2D correlation coefficient (MATLAB Image Processing Toolbox; MathWorks) was calculated (Keller et al., 1998).
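The decision logic of the four categories can be summarized in a short sketch. It takes as input one p value per row and per column, as they would result from the Kolmogorov–Smirnov comparisons; how exactly the rows and columns were compared is not reproduced here, and the function name and input format are hypothetical.

```python
def categorize_unit(row_p_values, column_p_values, alpha=0.05):
    """Assign a unit category from per-row and per-column test results.

    row_p_values: p values for the intensity rows of the response matrix.
    column_p_values: p values for the object-width columns.
    """
    intensity_effect = any(p < alpha for p in row_p_values)
    aperture_effect = any(p < alpha for p in column_p_values)
    if intensity_effect and aperture_effect:
        return "Ambiguous"
    if intensity_effect:
        return "Intensity"
    if aperture_effect:
        return "Sonar aperture"
    return "Insensitive"
```

A unit with at least one significant column and no significant row is thus a Sonar aperture unit, regardless of how many columns differ.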
To directly relate the bat's psychophysical performance in the Intensity experiment to the neural sensitivity exhibited in the extracellular recordings, a receiver operating characteristic (ROC) analysis was applied to generate neurometric functions according to Britten et al. (1992), Skottun et al. (2001), and Firzlaff et al. (2006). The neurometric function reflects the probability that an ideal observer could accurately discriminate echo-intensity differences basing his judgments on responses like those recorded from the units under study. The ROC analysis was performed by generating a so-called ROC curve for the comparison of each signal condition (reference intensity plus intensity difference) and the standard condition (reference intensity). The ROC curve shows the probability that both the rate response in a signal condition and the response in the standard condition exceed a certain threshold, e.g., one spike per stimulus. This probability was plotted as a function of the height of the threshold. From there, the (neural) percentage correct discrimination for each signal condition was generated by calculating the area under the ROC curve. When pooling across units, the spike counts across a number of randomly drawn Intensity units were aggregated to form a small population response (Britten et al., 1992).
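The area under the ROC curve can be computed directly from the spike counts of the two conditions. The following minimal sketch sweeps the criterion over all observed spike counts, collects the proportion of trials exceeding it in the signal and in the standard condition, and integrates with the trapezoidal rule. The function name and the choice of integration rule are assumptions of the sketch.

```python
def roc_area(signal_counts, standard_counts):
    """Area under the ROC curve for two lists of per-trial spike counts."""
    criteria = sorted(set(signal_counts) | set(standard_counts))
    # A criterion below every observed count is exceeded on all trials.
    points = [(1.0, 1.0)]
    for criterion in criteria:
        hit = sum(c > criterion for c in signal_counts) / len(signal_counts)
        fa = sum(c > criterion for c in standard_counts) / len(standard_counts)
        points.append((fa, hit))
    points.reverse()  # order by increasing false-alarm proportion
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoidal rule
    return area
```

An area of 0.5 corresponds to chance performance and an area of 1.0 to perfect discrimination, so the value can be read as the proportion correct of an ideal observer.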
For the comparison of the psychophysics and physiology concerning the sonar aperture, we calculated d′ from the psychometric functions or from the response-strength differences in Sonar aperture units. The latter was achieved, according to Rosen et al. (2010), by extracting the hit rate and the false-alarm rate across repetitions. This analysis is based on the assumption that a unit's response increases when the object width is increased from the psychophysical reference width (0° corresponding to a single speaker). The rates were transformed to z-scores (using the MATLAB “norminv” function), and the z-score of the false-alarm rate was subtracted from the z-score of the hit rate to get the value of d′.
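The final step of this calculation can be sketched with the inverse normal distribution function from the Python standard library, which plays the role of “norminv”. The optional clipping of rates of exactly 0 or 1, for which the z-score is undefined, is an assumption of the sketch; the text does not describe how such cases were handled.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate, n_repetitions=None):
    """d' = z(hit rate) - z(false-alarm rate).

    If n_repetitions is given, rates are clipped to [1/(2n), 1 - 1/(2n)] so that
    the z-transform stays finite (an assumption, not taken from the text).
    """
    if n_repetitions:
        lo, hi = 1.0 / (2 * n_repetitions), 1.0 - 1.0 / (2 * n_repetitions)
        hit_rate = min(max(hit_rate, lo), hi)
        false_alarm_rate = min(max(false_alarm_rate, lo), hi)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)
```

Equal hit and false-alarm rates give d′ = 0, and a hit rate lower than the false-alarm rate gives a negative d′, as occurs when a unit's response decreases with increasing width.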
Behavioral results of the Width and Sonar aperture experiments are based on a total of 2400 trials per experiment; results of the Intensity experiment are based on a total of 3000 trials. The mean performance of six bats for discriminating virtual objects of different width is shown as the solid line in Figure 3A. The average data show that the bats can reliably discriminate between echoes presented by a single speaker and echoes presented by seven speakers (34° object width). Data for the Sonar aperture experiment, where the spatial cues are the same as in the Width experiment but the echo-intensity cues have been removed, are shown in the same format in Figure 3B. Although the overall above-threshold performance of the bats is slightly inferior compared with the Width experiment, the bats solve this task with similar success.
The Width and Sonar aperture experiments were repeated while randomizing the position of the RO in one hemisphere within the spatial range of the UO presented in the opposing hemisphere. This control experiment was performed to verify that the bats attended to the difference of the object's sonar aperture as opposed to differences in the absolute azimuthal positions of the edges of the virtual objects. The RO was never presented by the two speakers next to the midline between the hemispheres to clearly separate the RO and the UO in azimuth. The control experiment showed that the RO randomization did not impair the performance in the discrimination of sonar aperture (Fig. 3, compare C, D).
Data for the Intensity experiment, where spatial cues have been removed and only echo-intensity differences are provided, are shown in the same format in Figure 3E. These data show that the bats require an echo-intensity difference of ∼5 dB to reliably choose the fainter of two echoes. A direct comparison of the bat's performance in the Width and Sonar aperture experiments is shown in Figure 3F. Apart from the data for a sonar aperture of 56° (11 adjacent speakers), the performance in the Sonar aperture experiment is not significantly lower than the performance in the Width experiment (Wilcoxon rank sum test, p > 0.05), where intensity cues are provided together with the spatial cues.
In the AC, recordings were taken from three bats (one female and two males); in the IC, recordings were taken from two bats (one female, one male). BF tone response could be obtained from 101 and 161 units in the IC and AC, respectively. The best frequency of units ranged from 20 up to 90 kHz (IC: median 53 kHz; interquartiles, 18 kHz; AC: median 62 kHz; interquartiles, 26.5 kHz) corresponding to the power spectrum of the echolocation calls.
Responses to virtual objects could be obtained from 74 and 84 units in the IC and AC, respectively. Subsequent analyses are restricted to these units. Examples of an Intensity unit (Fig. 4, left column), a Sonar aperture unit (Fig. 4, central column), and an Ambiguous unit (Fig. 4, right column) are shown in terms of the raster plots (top) and normalized response strength (bottom). The response strength of the Intensity unit increases with increasing intensity, but it is independent of object width. The Sonar aperture unit, however, shows a very different selectivity: this unit responds to objects of a certain width regardless of overall intensity.
Additional examples of response matrices from Intensity and Sonar aperture units in the AC and IC are shown in Figure 5. Most Intensity units (Fig. 5A,C) increase their response strength monotonically with increasing echo intensity. In contrast, the Sonar aperture units in the AC and IC typically do not vary monotonically in response strength along the object width axis (Fig. 5B,D). Instead, these units show robust response-strength changes along the horizontal (object width) axis, whereas responses vary little along the vertical (intensity) axis for both the 12 and the 24 dB intensity axes.
The distribution and the numbers of units among categories are shown in Figure 6. These data show that the Sonar aperture units are almost as strongly represented in the AC as the Intensity units. In contrast, Intensity units strongly dominate the other response categories in the IC.
The most interesting and behaviorally relevant units are in the Sonar aperture category. To assess the degree to which this conspicuous response pattern depends on binaural stimulation, the response matrices were also obtained with monaural contralateral stimulation. The effect of switching off the stimulation of the ipsilateral ear is shown in Figure 7. In the AC, switching off the ipsilateral input dramatically decreases the number of units in the Sonar aperture category, whereas this effect is smaller in the IC (Fig. 7A). Thus, it appears that binaural inputs contribute substantially to the large number of Sonar aperture units in the AC.
This population effect is also seen to some degree at the level of individual units. The 2D cross-correlation coefficient between the response matrices of the same Sonar aperture units with and without ipsilateral stimulation is shown for the AC and IC in Figure 7, B and C, respectively. The medians of cross-correlation coefficients are 0.42 for the AC and 0.60 for the IC, indicating that the IC is less influenced by the ipsilateral stimulation than the AC. Note, however, that this difference is not statistically significant.
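The 2D correlation coefficient used for this comparison is the Pearson correlation computed over all corresponding elements of two matrices. A minimal sketch, written here as a stand-in for the toolbox function and with a hypothetical function name, is:

```python
import math


def corr2(matrix_a, matrix_b):
    """Pearson correlation over all corresponding elements of two equal-size matrices."""
    a = [x for row in matrix_a for x in row]
    b = [x for row in matrix_b for x in row]
    if len(a) != len(b):
        raise ValueError("matrices must have the same number of elements")
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    if den == 0.0:
        raise ValueError("correlation is undefined for a constant matrix")
    return num / den
```

A coefficient near 1 indicates that removing the ipsilateral input leaves the pattern of the response matrix largely unchanged, whereas lower values indicate a stronger binaural contribution.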
A direct comparison between the psychometric function for echo-intensity discrimination (compare Figs. 3E, 8) and neurometric functions, based on populations of cortical Intensity units (compare Figs. 4, left column, 8), is shown in Figure 8. The neurometric sensitivity improves monotonically with the number of Intensity units included in the population for the ROC analysis (see Materials and Methods). The analysis shows that populations of 4–8 Intensity units are sufficient to explain the psychophysical performance, whereas psychophysical performance can be exceeded by pooling across populations of 16 units.
A direct comparison between the psychophysical performance in the Width and Sonar aperture experiments and the detectability of object-width changes in cortical Sonar aperture units is shown in Figure 9. Specifically, we calculated d′ from the psychometric functions and, in the physiological experiments, from the rate differences referenced against the response at object width 0° (see Materials and Methods). As already evident from the Sonar aperture units in Figures 4 and 5, these units do not encode the sonar aperture in a rate code, i.e., as a monotonic increase in response strength with increasing sonar aperture, but more in a “labeled line” code. In the current analysis, this is reflected in the fact that there are some units that encode the width change from 0 to 15° reliably (d′ ≥ 1 or ≤ −1), whereas other units encode the change from 0 to 30, 45, or 60° reliably (Fig. 9, fine black lines). The direct comparison between the psychophysical and physiological performance shows that for each object width there are some units that at least reach or even exceed the psychophysical performance.
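The two d′ measures can be sketched as follows. Both formulas are assumptions chosen for illustration: the neural d′ uses the difference of mean rates divided by the root of the mean variance, and the behavioral d′ uses the standard two-alternative forced-choice conversion; the definitions actually used are those in the study's methods. All spike rates and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def d_prime_rates(reference, test):
    """Neural d': mean rate difference relative to the pooled standard deviation."""
    pooled_sd = sqrt(0.5 * (variance(reference) + variance(test)))
    if pooled_sd == 0:
        raise ValueError("d' is undefined when both response sets are constant")
    return (mean(test) - mean(reference)) / pooled_sd


def d_prime_2afc(proportion_correct):
    """Behavioral d' from proportion correct, assuming a two-alternative task."""
    return sqrt(2.0) * NormalDist().inv_cdf(proportion_correct)


# Hypothetical spike rates of one unit; the 0 degree width is the reference.
reference = [10, 12, 11, 9, 13]
responses = {
    15: [14, 16, 15, 13, 17],
    30: [10, 11, 12, 9, 13],
    45: [7, 9, 8, 6, 10],
}

d_by_width = {w: d_prime_rates(reference, r) for w, r in responses.items()}
# Widths encoded "reliably" by the criterion d' >= 1 or d' <= -1.
reliable = [w for w, d in d_by_width.items() if abs(d) >= 1.0]
```

A unit of this kind would signal the 15° and 45° widths reliably but not the 30° width, which illustrates why a non-monotonic code can still support discrimination.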
We suggest that, in the current experiments on the sonar sensitivity to the aperture of an object, the psychophysical performance reflects the bats attending to the most informative units for each specific comparison in the forced-choice experiment. This would be in accordance with the lower envelope principle, which states that animals can perceptually rely on the most sensitive neurons with no interference from the less sensitive ones (Parker and Newsome, 1998).
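The lower envelope idea can be expressed compactly: for each object width, take the most sensitive unit and compare it with the behavioral sensitivity. The sketch below assumes that sensitivity is expressed as |d′|; all numbers are made up for illustration and are not data from the study.

```python
def lower_envelope(unit_dprimes):
    """For each object width, return the largest |d'| found across units."""
    widths = sorted({w for unit in unit_dprimes for w in unit})
    return {
        w: max(abs(unit[w]) for unit in unit_dprimes if w in unit)
        for w in widths
    }


# Hypothetical d' values per unit, keyed by object width in degrees.
units = [
    {15: 1.4, 30: -0.2, 45: 0.3, 60: 0.1},
    {15: 0.1, 30: -1.8, 45: 0.6, 60: 0.4},
    {15: 0.3, 30: 0.5, 45: 1.1, 60: 2.0},
]
# Hypothetical behavioral d' values for the same widths.
behavior = {15: 0.6, 30: 1.2, 45: 1.0, 60: 1.9}

envelope = lower_envelope(units)
explained = {w: envelope[w] >= behavior[w] for w in behavior}
```

Under this reading, the behavior is "explained" at a given width whenever at least one unit reaches the behavioral sensitivity, regardless of how the remaining units respond.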
As evident from Figure 9, we recorded from one unit that responded significantly more strongly to an object width of 15° and significantly more weakly to an object width of 30°. The neurometric performance for this unit is thus better than the average psychometric performance of the animals. Whereas this singular result is at variance with the “lower envelope principle,” individual results in the psychophysical experiments also indicate that some bats could reliably discriminate an 11° object width (Fig. 3C, Bat2 in the “Control width” experiment).
The current experiments were designed to investigate the perceptual strategy and neural representation of the sonar exploration of object width in echolocating bats. The behavioral experiments showed that while the bats were well able to discriminate differences in echo intensity, these intensity differences were not required to discriminate the width of an ensonified virtual object. Instead, the bats relied on the sonar aperture, i.e., the horizontal spread of angles of incidence of the echoes generated by the virtual objects. The psychophysical performance is reflected in the responses of a population of central-auditory units that encode changes in object width independent of echo intensity. Due to this independence, these units reflect the psychophysical performance in the behavioral Width experiment and in the Sonar aperture experiment, where intensity cues were removed.
Earlier work has addressed the acoustic parameter “echo intensity,” which was considered an important cue for object classification or discrimination in echolocating bats. Simmons and Vernon (1971) postulated that bats used differences in echo intensity to discriminate differently scaled triangles. Processing of echo intensity is also reviewed in Yovel et al. (2011). In the present study we show that intensity is not the only cue that can be used for object-width discrimination. These current data provide psychophysical and electrophysiological evidence that bats recruit the directional characteristics of their outer ears to evaluate the sonar aperture of ensonified objects. These cues can be either monaural spectral cues or binaural echo disparities, as hypothesized by Holderied and von Helversen (2006).
It is clear that the sonar aperture cannot serve as a perceptual cue for the discrimination of the size of very small objects: for objects whose absolute sonar aperture is very small (Sümer et al., 2009), the limitations in auditory spatial directionality preclude the use of sonar aperture. For such small objects, echo intensity (“target strength”) is the only available cue for object-size discrimination. The results from the Width and Sonar aperture experiments indicate that sonar aperture cues are useful for object widths larger than ∼30° (58 cm at a distance of 1 m). For such large objects, echo-intensity cues become unreliable: unlike at an omnidirectional microphone, echoes from such a large sonar aperture arrive at the bat's ears by quite different paths. Thus, while echoes add up coherently at an omnidirectional microphone, complex constructive but also destructive interference occurs at the bat's eardrums. This is seen in Figure 2H: although echoes are added up across the azimuth, the resulting amplitude at the bat's eardrum does not increase monotonically, because the echoes were generated with different HRIR sets corresponding to the different azimuths. Unlike echo intensity measured with an omnidirectional microphone, echo intensity measured at the bat's eardrums is therefore not a good predictor of object size. Note that the P. discolor sonar beam is wide enough to fully ensonify spatially extended objects as presented here: both simulations of sonar emission patterns based on the 3D geometry of the emitting system (Vanderelst et al., 2010) and experimental data (C. Geberl et al., unpublished data) show that the −3 dB sonar beam width is ∼75° at the second harmonic (40 kHz) and 30° at the fourth harmonic (80 kHz). In flight, however, sonar-beam widths may be narrower (Brinkløv et al., 2011).
In the current psychophysical experiments, the information of the one ultrasonic microphone in each hemisphere is relayed to up to 17 adjacent speakers. Thus, the frequency content of the echo from each active speaker in that hemisphere is the same. We accepted this limitation to be able to compare the psychophysical performance with the neural performance, for which the stimulus generation followed exactly the same rules. The current experiments, however, clearly show that, even with these limitations, the spatial information provided overrules the echo-intensity information when the bats are required to estimate the size of an object.
The electrophysiology shows that the sonar aperture is reliably encoded in the auditory midbrain and cortex. The existence of this neural correlate suggests that bats may gain the information about an object's sonar aperture from the analysis of the echo of a single call. In the behavioral experiments, however, bats could emit series of calls and change their position and that of their pinnae across the series. These dynamic cues, which are no doubt used by the bats in more natural situations (Ghose and Moss, 2003; Surlykke et al., 2009), could only serve to further strengthen the spatial cues for object size.
Although we cannot exclude that the bats in the behavioral experiments sequentially scanned the virtual objects by virtue of, e.g., pinna movements across sonar sequences, the electrophysiological experiments indicate that ample information may already be gathered from the neural processing of the echoes from just one call.
The nature of the information represented by the Sonar aperture units found here is clearly not the sonar aperture per se. In the current physiological experiments, the sonar aperture was encoded in the AC mainly by units that received binaural input, making the exclusive use of monaural spectral cues unlikely. The following binaural cues may be used to encode the sonar aperture of objects.
In the frequency domain, interaural intensity differences (IIDs) change with object width. Width-dependent IID changes for the current virtual objects are shown in Figure 10A. IIDs provide important binaural cues in echolocating bats and are reliably encoded in the bat ascending auditory system (Park et al., 1997).
In the time domain, the interaural correlation of the echo envelope changes with object width (Shackleton et al., 2005; Aaronson and Hartmann, 2010). Width-dependent correlation changes are shown in Figure 10B. Such binaural echo-envelope features are reliably encoded in the IC of P. discolor (Borina et al., 2011).
Interestingly, although changes in the binaural envelope cross-correlation (Fig. 10B) are generally small, changes are nonmonotonic along the width axis, qualitatively similar to the nonmonotonic response behavior of the Sonar aperture units along the object-width axis (compare Figs. 5, 9). Thus, we suggest that the neural code of the sonar aperture is based on the binaural analysis of envelope correlations and/or IIDs.
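The two binaural cues discussed above can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not taken from the study: the IID is computed broadband rather than per frequency band, the envelope is a crude rectify-and-smooth estimate, and the interaural correlation is evaluated at zero lag only. The ear signals are synthetic and all function names are hypothetical.

```python
from math import log10, pi, sin, sqrt


def rms(signal):
    """Root-mean-square amplitude of a sampled signal."""
    return sqrt(sum(s * s for s in signal) / len(signal))


def iid_db(left, right):
    """Broadband interaural intensity difference in dB (positive: left louder)."""
    return 20.0 * log10(rms(left) / rms(right))


def envelope(signal, window):
    """Crude envelope: full-wave rectification followed by a moving average."""
    rectified = [abs(s) for s in signal]
    return [sum(rectified[i:i + window]) / window
            for i in range(len(rectified) - window + 1)]


def correlation(x, y):
    """Zero-lag Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


# Synthetic ear signals: an amplitude-modulated carrier that reaches the
# right ear attenuated by a factor of two but otherwise unchanged.
left = [(1.0 + 0.5 * sin(2 * pi * i / 64)) * sin(2 * pi * i / 8)
        for i in range(256)]
right = [0.5 * s for s in left]

iid = iid_db(left, right)
env_corr = correlation(envelope(left, 8), envelope(right, 8))
```

For this pure attenuation the envelopes at the two ears remain perfectly correlated while the IID is about 6 dB. Echoes arriving from a spread of azimuths would be expected to alter both measures, since each ear then receives a differently filtered sum of the echoes.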
Note that when the object is centered at 0° azimuth, none of these binaural parameters can encode changes in object width. In this case, only spectral cues generated by the bat's HRIR can be exploited (Fig. 10C). Perceptually, these self-generated spectral cues result in width-dependent changes of echo timbre rather than echo intensity. Monaurally, the addition of echoes from different positions in azimuth produces a complex interference pattern (Fig. 2, compare B, E), which may encode the sonar aperture in a way that is functionally similar to the encoding of the elevation of a sound source in the human auditory system. In previous experiments, units in the AC of P. discolor were shown to encode spectral echo patterns independent of echo amplitude (Firzlaff and Schuller, 2007). In addition, time-variant binaural disparities introduced by ear movements may facilitate the sonar evaluation of object width.
The aperture of an object increases with decreasing distance to an object. In the visual system the covariation of these object parameters is crucial. It has been shown that the size of an object's retinal image is not directly perceived. Instead, the perceived size of the object strongly depends on its perceived distance from the viewer (Gogel, 1969). The psychophysical findings are supported by an imaging study (Murray et al., 2006) that shows that the retinotopic representation of an object in primary visual cortex changes in accordance with its perceived size, which in turn depends on the perceived distance. This change of representation at early stages of the visual system is supposed to be behaviorally important as it may allow for visual scale invariance and size constancy (Richards, 1967; Murray et al., 2006). The covariation of retinal image size and object distance often necessitates specific neurocomputational mechanisms to extract size-independent information, e.g., in terms of the time-to-contact of a looming object (Sun and Frost, 1998).
In such a scenario, biosonar has a principal advantage because, through the neural analysis of call-echo delay, object distance is readily and unambiguously encoded. The current data show that the sonar aperture is also readily perceivable and neurally represented in bat biosonar. Together with the sensitivity to call-echo delay, bats may be able to implement size constancy as the physically correct covariation of sonar aperture and echo delay. This hypothesis remains to be tested experimentally.
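The geometric relation behind this size-constancy hypothesis can be sketched as follows. The sketch assumes a flat object centered on the bat's midline and perpendicular to the line of sight, and a nominal speed of sound of 343 m/s; these assumptions and all function names are illustrative and not taken from the study.

```python
from math import atan, degrees, radians, tan

SPEED_OF_SOUND = 343.0  # m/s, assumed nominal value in air


def distance_from_delay(delay_s):
    """Object distance in meters from the two-way call-echo delay in seconds."""
    return SPEED_OF_SOUND * delay_s / 2.0


def aperture_deg(width_m, distance_m):
    """Angle in degrees subtended by a centered, frontoparallel object."""
    return degrees(2.0 * atan(width_m / (2.0 * distance_m)))


def width_from_aperture(aperture, distance_m):
    """Physical width in meters recovered from aperture (degrees) and distance."""
    return 2.0 * distance_m * tan(radians(aperture) / 2.0)


# The same hypothetical 0.5 m wide object at two distances.
true_width = 0.5
near = distance_from_delay(2.0 / SPEED_OF_SOUND)   # 1 m
far = distance_from_delay(4.0 / SPEED_OF_SOUND)    # 2 m

aperture_near = aperture_deg(true_width, near)
aperture_far = aperture_deg(true_width, far)

# Combining aperture with delay-derived distance recovers the same width.
width_near = width_from_aperture(aperture_near, near)
width_far = width_from_aperture(aperture_far, far)
```

The aperture shrinks as the object moves away, but the width estimate obtained from aperture and delay together stays constant, which is the covariation that a size-constant percept would have to exploit.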
In summary, the current data show that bats perceive and behaviorally exploit the sonar aperture of an ensonified object. A neural correlate of this percept is found in a population of midbrain and cortical units that encode the sonar aperture independent of echo intensity. These current data thus highlight the fact that based on fundamentally different peripheral representations of an object across the senses of vision and echolocation, the CNS aims to find modality-independent representations of object features. We argue that the sonar aperture, as the echo-acoustic counterpart of the visual aperture of an object, is one of these object features.
This study was funded by the Volkswagenstiftung Grant I/83838 to U.F. and L.W., by Deutsche Forschungsgemeinschaft Grant FI 1546/1 to U.F., and by the Human Frontier Science Program Grant 0062/2009 to L.W. We thank Harald Luksch for comments on an earlier draft of this paper and acknowledge the workshop crew, especially Michael Fellner, of the Biozentrum Martinsried for valuable assistance in technical issues. We also thank the two anonymous reviewers whose comments and suggestions significantly improved both clarity and precision of the paper.
Correspondence should be addressed to Dr. Lutz Wiegrebe, Division of Neurobiology, Department Biology II, Ludwig-Maximilians-Universität München, Grosshaderner Strasse 2, D-82152 Planegg-Martinsried, Germany.