Abstract
Monaurally deaf people lack the binaural acoustic difference cues in sound level and timing that are needed to encode sound location in the horizontal plane (azimuth). It has been proposed that these people therefore rely on the spectral pinna cues of their normal ear to localize sounds. However, the acoustic head-shadow effect (HSE) might also serve as an azimuth cue, despite its ambiguity when absolute sound levels are unknown. Here, we assess the contribution of either cue to two-dimensional (2D) sound localization in the monaurally deaf. In a localization test with randomly interleaved sound levels, we show that all monaurally deaf listeners relied heavily on the HSE, whereas binaural control listeners ignored this cue. However, some monaural listeners responded partly to actual sound-source azimuth, regardless of sound level. We show that these listeners extracted azimuth information from their pinna cues. The better monaural listeners were able to localize azimuth on the basis of spectral cues, the better their ability to also localize sound-source elevation. In a subsequent localization experiment with one fixed sound level, monaural listeners rapidly adopted a strategy on the basis of the HSE. We conclude that monaural spectral cues are not sufficient for adequate 2D sound localization under unfamiliar acoustic conditions. Thus, monaural listeners strongly rely on the ambiguous HSE, which may help them to cope with familiar acoustic environments.
Introduction
The auditory system relies on implicit acoustic cues to encode sound location. Interaural level and time differences relate to the left-right angle of sounds (azimuth). The up-down and front-back angles (elevation) are determined by spectral pinna cues (Shaw, 1966; Musicant and Butler, 1984; Wightman and Kistler, 1989; Blauert, 1997). Independent neural pathways in the brainstem process the different acoustic cues (Irvine, 1986; Yin, 2002; Young and Davis, 2002).
The independent encoding of azimuth and elevation has interesting implications. For example, altering the spectral cues with pinna molds abolishes elevation localization but leaves azimuth performance intact (Oldfield and Parker, 1984; Hofman et al., 1998). Moreover, human listeners can adapt to modified spectral cues and relearn to localize elevation without changing their azimuth performance (Hofman et al., 1998). Narrow-band sounds distort elevation localization but not azimuth (Middlebrooks, 1992; Goossens and Van Opstal, 1999). Conversely, reversing binaural inputs with hearing aids reverses the azimuth responses while leaving elevation unaffected (Hofman et al., 2002).
However, spectral cues may also determine sound-source azimuth. After plugging one ear, young ferrets relearn to localize sounds, presumably by using spectral cues of the intact ear (King et al., 2000). It has been argued, however, that because plugging perturbs binaural cues in a frequency-dependent way, listeners could in principle maintain a binaural strategy by relying on low-frequency information (Wightman and Kistler, 1997).
A binaural strategy for azimuth localization is impossible for unilaterally deaf listeners. Slattery and Middlebrooks (1994) proposed that some of their monaural listeners had learned to use spectral cues of their intact ear. Yet, in a real monaural situation, perceived intensity of the sound source also relates to its azimuth because of the head-shadow effect (HSE). Although the HSE is ambiguous for unknown intensities, monaural listeners might have adopted the HSE to cope with familiar acoustic environments in daily life. So far, localization studies with monaural participants only used a small range in sound intensities (Humes et al., 1980; Newton and Hickson, 1981; Newton, 1983; Slattery and Middlebrooks, 1994), leaving it unclear to what extent the HSE may have played a role.
Here, we study to what degree unilaterally deaf listeners rely on intensity and spectral cues to localize sounds. Listeners made rapid head movements to sounds with varying intensities and locations within the frontal hemifield. By modifying the pinna geometry of the intact ear, we determined the contribution of spectral cues. We also assessed the listeners' ability to localize azimuth in a simple single-intensity condition.
Our data show that all monaural listeners strongly relied on the HSE, whereas this cue is entirely ignored by binaural control listeners. Multiple linear regression indicated that some monaural listeners did extract azimuth information, regardless of sound level. These listeners based their responses partly on spectral cues. Moreover, the stronger the contribution of spectral cues to azimuth, the better monaural listeners could also localize elevation. We conclude that, despite its ambiguity, the HSE dominates monaural sound localization.
Materials and Methods
Monaural and binaural listeners
Nine listeners with chronic unilateral hearing loss (18-50 years of age) participated in the free-field localization experiments (Table 1). They were given a short practice session before the start of the actual experiment under open-loop conditions (i.e., no feedback was given about the listeners' actual performance). The monaural listeners' good ear had normal hearing [within 20 dBA sound pressure level (SPL) of audiometric zero] as determined by an audiogram obtained with a standard staircase procedure (10 tone pips, 0.5 octave separation, between 500 Hz and 11.3 kHz), but thresholds were ∼60 dBA SPL higher in the impaired ear (Fig. 1). Eight monaural listeners were diagnosed with a unilateral hearing loss at a young age (CD, GK, IE, JP, RH, and SB: younger than 4 years of age; BN and LD: diagnosed at age 12), presumably because of a congenitally underdeveloped cochlea, except for participant BN, who had his left cochlea removed, and participant GK, who lost her hearing as a result of meningitis. Participant PO (50 years of age) had a sudden hearing loss of unknown origin at the age of 48. None of the monaural listeners had any known uncorrected visual disorder.
Six binaural listeners (21-40 years of age) also participated in these experiments and acted as a control reference. None had any auditory or uncorrected visual disorder. Three control listeners (JV, HV, MW) had previous experience with sound localization studies; participant MW is an author of this study. The other binaural control and monaural listeners were inexperienced and were kept naive about the purpose of this study.
Apparatus
During the experiments, the listener was seated comfortably in a chair in the center of a completely dark, sound-attenuated room (height × width × length = 2.45 × 2.45 × 3.5 m). The walls, ceiling, floor, and every large object present were covered with acoustic foam that eliminated echoes of sound frequencies >500 Hz. The room had an ambient background noise level of 20 dBA SPL.
A total of 58 light-emitting diodes (LEDs), each attached to the center of a small broad-range loudspeaker (MSP-30; Monacor International GmbH, Bremen, Germany), were mounted on a thin wooden frame that formed a hemispheric surface 100 cm in front of the listener. Stimulus coordinates ranged from -75 to +75° in both azimuth and elevation, as defined in a double-pole coordinate system (Knudsen and Konishi, 1979). In this system, azimuth (α) is defined as the angle between the sound source or response location, the center of the head, and the midsagittal plane, and elevation (ϵ) is defined as the angle between the sound source, the center of the head, and the horizontal plane (Hofman and Van Opstal, 1998). The origin of the (α, ϵ) coordinate system corresponds to the straight-ahead speaker location. Head movements were recorded with the magnetic search-coil induction technique. The listener wore a lightweight (150 g) “helmet” consisting of two perpendicular 4 cm wide straps that could be adjusted to fit around the listener's head without interfering with the ears. A small coil was attached to the top of this helmet. From the left side of the helmet, a 40 cm long, thin, aluminum rod protruded forward with a dim (0.15 cd/m²) red LED attached to its end, which could be positioned in front of the listener's eyes. Two orthogonal pairs of 2.45 × 2.45 m coils were attached to the edges of the room to generate the horizontal (60 kHz) and vertical (80 kHz) magnetic fields. The head-coil signal was amplified and demodulated (Remmel Labs, Ashland, MA), after which it was low-pass filtered at 150 Hz (model 3343; Krohn-Hite, Brockton, MA) before being stored on hard disk at a sampling rate of 500 Hz per channel for off-line analysis.
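The double-pole definition of azimuth and elevation given above can be illustrated with a short sketch. The Cartesian axis convention (x to the right, y straight ahead, z upward) and the function name are assumptions made for this illustration only; they are not taken from the original methods.

```python
import math

def double_pole(x, y, z):
    """Return (azimuth, elevation) in degrees for a head-centred direction.

    Assumed axes (illustration only): x to the right, y straight ahead,
    z upward. Azimuth is the angle with the midsagittal (y-z) plane and
    elevation is the angle with the horizontal (x-y) plane.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0:
        raise ValueError("direction is undefined at the origin")
    azimuth = math.degrees(math.asin(x / r))
    elevation = math.degrees(math.asin(z / r))
    return azimuth, elevation

# Straight-ahead speaker: the origin of the (azimuth, elevation) system.
straight_ahead = double_pole(0.0, 1.0, 0.0)

# A source 30 degrees to the right on the horizontal meridian.
right_30 = double_pole(math.sin(math.radians(30)), math.cos(math.radians(30)), 0.0)
```

Because both angles are measured against planes rather than along a single rotation axis, the two coordinates are not independent at the edges of the field: in this system the sum of the absolute azimuth and elevation cannot exceed 90°.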
Auditory stimuli
Acoustic stimuli were digitally generated using Tucker-Davis System II hardware (Tucker-Davis Technologies, Gainesville, FL) with a 16 bit digital-to-analog converter (TDT, model DA1; 50 kHz sampling rate). A programmable attenuator (TDT, model PA4) controlled sound level, after which the stimuli were passed to a buffer (TDT, model HB6) and finally to one of the speakers in the experimental room. All acoustic stimuli consisted of Gaussian white noise and had 0.5 msec sine-squared onset and offset ramps.
The auditory stimuli were either broadband (BB; flat broadband characteristic between 1 and 20 kHz) or high-pass (HP; high-pass filtered at 3 kHz) stimuli (see below) with a duration of 150 msec. Sound intensities ranged from 30 to 60 dBA SPL (see below). Absolute free-field sound levels were measured at the position of the listener's head with a calibrated sound amplifier and microphone (BK2610/BK4144; Brüel and Kjær, Norcross, GA).
Paradigms
Calibration experiment. Head-position data for the calibration procedure were obtained by instructing the listener to make an accurate head movement while redirecting the dim rod LED from the central fixation LED to one of the 57 peripheral LEDs that was illuminated as soon as the fixation point extinguished. Each experimental session started with a calibration run.
Auditory localization. The listener started a trial by fixating the central LED. After a pseudorandom period of 1.5-2.0 sec, this fixation LED disappeared, and an auditory stimulus was presented 400 msec later. The listener was asked to redirect the head by pointing the dim rod LED as accurately and as fast as possible to the perceived location of the sound stimulus. Because the response reaction times typically exceeded 200 msec, all responses were made under complete open-loop conditions. To investigate the role of different auditory and nonauditory cues in monaural sound localization, the localization experiments were run according to the three paradigms described below. The paradigms were run on separate days. Monaural listeners participated in the various paradigms as indicated in Table 1.
Intensity paradigm. This paradigm was used to investigate the role of stimulus intensity on binaural and monaural sound localization. Both monaural listeners and binaural controls participated in this experiment. HP stimuli were presented at 57 locations (excluding the straight-ahead speaker) and seven intensities (30, 35, 40, 45, 50, 55, and 60 dBA SPL), making a total of 399 stimuli.
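As a check on the stimulus count, the intensity paradigm can be written out as a full crossing of locations and levels. This is a minimal sketch: the speaker indices are placeholders, and the seeded shuffle merely stands in for the random interleaving, whose exact procedure is not described in the text.

```python
import random

def intensity_paradigm_trials(locations, levels, seed=0):
    """Cross every location with every level and interleave the trials."""
    rng = random.Random(seed)  # fixed seed: an assumption, for reproducibility
    trials = [(location, level) for location in locations for level in levels]
    rng.shuffle(trials)
    return trials

# Placeholder indices for the 57 peripheral speakers (hypothetical labels).
locations = list(range(1, 58))
levels = [30, 35, 40, 45, 50, 55, 60]  # dBA SPL, from the text

trials = intensity_paradigm_trials(locations, levels)
```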
Spectral paradigm. The stimuli in this paradigm were HP, presented at 57 locations and four different intensities (30, 40, 50, and 60 dBA SPL; total of 228 stimuli) and were run with the monaural listeners only. To verify whether monaural listeners used monaural spectral cues, this paradigm was run twice in the same session. In the first run, participants performed the standard localization task. Before the second run, the concha of their intact ear was filled with wax to perturb the spectral cues of that ear without occluding the ear canal.
Training paradigm. In the third paradigm, monaural listeners were trained to localize a single-intensity stimulus. During training, a BB stimulus of 60 dBA SPL was presented on the horizontal meridian at 1 of 10 locations [α ∈ (-75, -60, ..., 60, 75)°; ϵ = 0°, excluding the fixation target at α = 0°]. Listeners were explicitly told that the stimulus had one fixed intensity. Each location was presented five times in pseudorandom order (n = 50 stimuli). A similar block of stimuli then followed this first block of stimuli, but this time the LED at the speaker location was also illuminated to provide the subject with visual feedback. This auditory-visual block was then followed by another pure-auditory block and so on, until three blocks of each type were presented.
Acoustic head-shadow measurements. To quantify the HSE, a silicone tube attached to a miniature microphone (EA1842; Knowles, Itasca, IL) was placed near the entrance of the listener's ear canal to record the acoustic signal. The head was restrained to face the center speaker while the sound stimuli (BB, 60 dBA SPL) were presented from all 58 speaker locations. Signals were stored on a hard disk at a sampling frequency of 50 kHz for off-line analysis. The recordings were made for two binaural listeners (JV and MW) and two monaural listeners (JP and RH).
Data analysis
Data calibration. The 58 fixation points obtained from the calibration experiment were used to train two three-layer back-propagation neural networks that served to calibrate the head-movement data. Both networks received the raw horizontal and vertical head-position signals as inputs and yielded the desired azimuth and elevation angles (in degrees), respectively, as their output. The trained networks were subsequently used to map the raw data to calibrated two-dimensional head positions with an absolute accuracy within 4% over the entire response range (for details, see Goossens and Van Opstal, 1997). Response coordinates were defined in the same double-pole azimuth-elevation coordinates as the stimuli (see above). For binaural listeners, a positive azimuth angle refers to targets and responses located on the right-hand side. For ease of comparison between the monaural listeners, we defined azimuth as positive when targets and responses were located on the side of the good ear.
Head movement detection. Saccadic head movements were detected from the calibrated head-movement signals by setting thresholds to the head velocity for onset and offset, respectively, using a custom-made program (onset velocity = 20 °/sec; offset velocity = 15 °/sec). Detection markings from the program were visually checked by the experimenter and could be adjusted manually when deemed necessary. Head movements with reaction times <80 or >1000 msec were discarded, because responses with extremely short latencies may be regarded as anticipatory and responses with excessive latencies are usually the result of inattentiveness of the listener.
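The threshold-based detection and the reaction-time criterion described above can be sketched as follows. The thresholds, sampling rate, and latency window are taken from the text; the two-point velocity estimate, the function names, and the synthetic trace are assumptions of this sketch and do not reproduce the custom-made program.

```python
import math

FS = 500.0        # sampling rate (Hz)
ONSET_V = 20.0    # onset velocity threshold (deg/s)
OFFSET_V = 15.0   # offset velocity threshold (deg/s)

def head_speed(az, el, fs=FS):
    """Vectorial head speed (deg/s) from successive calibrated samples."""
    return [math.hypot(az[i + 1] - az[i], el[i + 1] - el[i]) * fs
            for i in range(len(az) - 1)]

def detect_movement(az, el, fs=FS):
    """Return (onset_index, offset_index) of the first movement, or None."""
    speed = head_speed(az, el, fs)
    onset = next((i for i, v in enumerate(speed) if v > ONSET_V), None)
    if onset is None:
        return None
    offset = next((i for i in range(onset + 1, len(speed))
                   if speed[i] < OFFSET_V), len(speed))
    return onset, offset

def valid_reaction_time(onset_index, stimulus_index, fs=FS):
    """Keep responses with latencies between 80 and 1000 msec."""
    rt_ms = (onset_index - stimulus_index) * 1000.0 / fs
    return 80.0 <= rt_ms <= 1000.0

# Synthetic, noise-free trace: rest, a 100 deg/s ramp to 20 deg, then rest.
az = [0.0] * 100 + [0.2 * i for i in range(1, 101)] + [20.0] * 100
el = [0.0] * len(az)
movement = detect_movement(az, el)
```

Real head-position signals are noisy, so a practical detector would smooth the velocity trace first; as in the original procedure, the automatic markings would still need visual inspection.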
Statistics. All responses were analyzed separately for each listener by determining the optimal linear fit for the following stimulus-response relationships:
$$\alpha_R = a + b\,\alpha_T \qquad \text{and} \qquad \epsilon_R = c + d\,\epsilon_T \tag{1}$$
for the azimuth and the elevation components, respectively, by minimizing the least-squares error. In Eq. 1, αR and ϵR are the azimuth and elevation response components, and αT and ϵT are the actual azimuth and elevation coordinates of the stimulus. Fit parameters, a and c, are the biases (offsets; in degrees), whereas b and d are the gains (slopes, dimensionless) of the azimuth and elevation responses, respectively. Note that an ideal listener should yield gains of 1.0 and offsets of 0.0°. Also, Pearson's linear correlation coefficient, the residual error (SD around the fitted line), and the mean absolute localization error were calculated.
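The quantities obtained from the linear stimulus-response fit (bias, gain, correlation, residual error, and mean absolute error) can be computed as in the following sketch for one response component. The function name and the use of n - 2 degrees of freedom for the residual SD are assumptions; the text does not specify the normalization.

```python
import math

def linear_fit(target, response):
    """Least-squares fit of response = bias + gain * target."""
    n = len(target)
    mt = sum(target) / n
    mr = sum(response) / n
    sxx = sum((t - mt) ** 2 for t in target)
    syy = sum((r - mr) ** 2 for r in response)
    sxy = sum((t - mt) * (r - mr) for t, r in zip(target, response))
    gain = sxy / sxx
    bias = mr - gain * mt
    residuals = [r - (bias + gain * t) for t, r in zip(target, response)]
    return {
        "bias": bias,                                   # degrees
        "gain": gain,                                   # dimensionless
        "r": sxy / math.sqrt(sxx * syy),                # Pearson correlation
        "residual_sd": math.sqrt(sum(e * e for e in residuals) / (n - 2)),
        "mean_abs_error": sum(abs(r - t) for t, r in zip(target, response)) / n,
    }

# Hypothetical listener who undershoots (gain 0.5) with a 5 degree bias.
target = [-60, -30, 0, 30, 60]
response = [5 + 0.5 * t for t in target]
fit = linear_fit(target, response)
```

In this noise-free example the correlation is perfect even though the mean absolute error is large, which illustrates why the gain, bias, and error measures are reported alongside the correlation coefficient.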
As described in the Introduction, we hypothesized that the HSE might underlie the localization behavior of the monaural listeners. To quantify the acoustic effect of the head on sound level, proximal to the ear, as a function of sound azimuth, we described the HSE with a four-parameter model (Eq. 2) in which target azimuth, αT (in degrees), is the independent variable. Parameters e and h (in dBA SPL), f (in degrees⁻¹), and g (dimensionless) were found by minimizing the mean-squared error (Gauss-Newton method). Because the differences between various listeners were small, measured HSE data were pooled across four listeners to determine the optimal fit parameters of Eq. 2 (see Fig. 5A).
Azimuth. To evaluate the potential role of both stimulus azimuth and sound level in determining the subject's responses, the data were analyzed by applying multiple linear regression. However, because azimuth and sound level are measured in different units, a direct regression does not allow for a quantitative comparison of the relative contributions of these two stimulus factors. To deal with this problem, we normalized the relevant variables and performed a standardized multiple linear regression analysis by fitting the following relationship:
$$\hat{\alpha}_R = k\,\hat{\alpha}_T + m\,\hat{I}_p \tag{3}$$
where α̂R, α̂T, and Îp are now dimensionless variables [where x̂ = (x - μx)/σx is the so-called z-score of variable x, with μx as the mean and σx as the standard deviation of variable x], k and m are the (dimensionless) partial correlation coefficients that result from the fit, and Ip is the stimulus intensity at the good ear, which in this study will be termed “proximal stimulus intensity.” The latter was determined from the free-field level and the HSE according to Eq. 4, in which If is the free-field (absolute) stimulus level (in dBA SPL).
The partial correlation parameters k and m provide a measure for relative importance of the associated variable (azimuth and intensity, respectively) to explain the subject's responses. By definition, k and m are constrained to values between -1 and 1. When k = 1 and m = 0, the subject's responses are entirely described by changes in stimulus azimuth and are insensitive to changes in sound level. Conversely, when k = 0 and m = 1, the azimuth responses of the subject are entirely determined by changes in sound level, regardless of the actual stimulus azimuth. The squared values of k and m quantify how much of the variance in the data is explained by the respective variable.
Because azimuth is defined as positive for monaural listeners when locations are on their hearing side, a positive value of m indicates that monaural listeners orient their responses to the side of the good ear when the proximal sound level Ip is high and toward the deaf ear when these intensities are low. Although there is no a priori reason for a linear effect of Ip on perceived azimuth, the results show that this first-order approximation is quite reasonable.
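The standardized regression of azimuth responses on target azimuth and proximal intensity can be sketched with the closed-form solution for two predictors, which is equivalent to a least-squares fit on z-scored variables. The function names and the synthetic data are assumptions; the example uses a fully crossed design, in which the two predictors are uncorrelated, purely for clarity, whereas in the experiment proximal intensity covaries with azimuth through the HSE.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def standardized_weights(response, azimuth, intensity):
    """Return (k, m): standardized weights for azimuth and intensity."""
    r_ya = pearson(response, azimuth)
    r_yi = pearson(response, intensity)
    r_ai = pearson(azimuth, intensity)
    det = 1.0 - r_ai ** 2          # zero if the predictors are collinear
    k = (r_ya - r_yi * r_ai) / det
    m = (r_yi - r_ya * r_ai) / det
    return k, m

# Hypothetical crossed design of target azimuths and proximal levels.
design = [(a, l) for a in (-60, -30, 0, 30, 60) for l in (30, 40, 50, 60)]
target_az = [a for a, _ in design]
proximal = [l for _, l in design]

# A listener whose responses follow sound level only, and one who is veridical.
k_level, m_level = standardized_weights([2.0 * (l - 45) for l in proximal],
                                        target_az, proximal)
k_true, m_true = standardized_weights(target_az, target_az, proximal)
```

The first synthetic listener yields k = 0 and m = 1, the pattern described above for responses determined entirely by sound level; the second yields k = 1 and m = 0.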
Elevation. Previous research has indicated that each ear contributes to elevation localization on the contralateral side in an azimuth-dependent way (Morimoto, 2001; Hofman and Van Opstal, 2003). Because the results in this study show that unilaterally deaf listeners rely on both target azimuth and proximal intensity for their azimuth responses, these two stimulus parameters (effectively determining the listener's perceived azimuth) were included in the analysis of the elevation responses.
To that end, elevation localization behavior was quantified by fitting the following standardized multiple linear regression:
$$\hat{\epsilon}_R = n\,\hat{\epsilon}_T + o\,\hat{\alpha}_T + q\,\hat{I}_p \tag{5}$$
where the normalized elevation response, ϵ̂R, may depend not only on target elevation, ϵ̂T, but also on target azimuth, α̂T, and proximal intensity, Îp, with n, o, and q as the corresponding (dimensionless) partial correlation coefficients. Proximal intensity, Ip, was replaced by the free-field intensity, If, in Eqs. 3 and 5 when these regressions were performed on the responses from binaural control listeners.
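With three predictors, the standardized weights for elevation, azimuth, and intensity can be obtained by solving the normal equations on the predictor correlation matrix. The sketch below is one way to do this under stated assumptions: the helper names, the small Gaussian-elimination solver, and the synthetic listener are illustrative and are not the original analysis code.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))) / a[i][i]
    return x

def standardized_weights(response, predictors):
    """Standardized regression weights, one per predictor."""
    corr = [[pearson(p, q) for q in predictors] for p in predictors]
    rhs = [pearson(p, response) for p in predictors]
    return solve(corr, rhs)

# Hypothetical crossed design: elevation x azimuth x proximal level.
design = [(e, a, l) for e in (-30, 0, 30) for a in (-30, 0, 30)
          for l in (40, 50, 60)]
elev = [e for e, _, _ in design]
azim = [a for _, a, _ in design]
level = [l for _, _, l in design]

# Synthetic listener driven mostly by elevation and slightly by level.
resp = [0.8 * e + 0.1 * l for e, l in zip(elev, level)]
n_w, o_w, q_w = standardized_weights(resp, [elev, azim, level])
```

Because the synthetic predictors are uncorrelated and the responses are noise-free, the squared weights sum to one in this example; with real data the remainder reflects unexplained response variance.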
The bootstrap method was applied to obtain confidence limits for the optimal fit parameters in the regression analyses of Eqs. 1, 2, 3, and 5. To that end, 100 data sets were generated by randomly selecting (with replacement) data points from the original data. Bootstrapping thus yielded a set of 100 different fit parameters. The SDs in these parameters were taken as estimates for the confidence levels of the parameter values obtained in the original data set (Press et al., 1992).
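The resampling procedure can be sketched for a single fit parameter, here the gain of a linear stimulus-response fit. The 100 resamples with replacement follow the text; the estimator, the seeds, and the simulated responses are assumptions of this illustration.

```python
import math
import random

def gain(pairs):
    """Least-squares slope of response against target."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy / sxx

def bootstrap_sd(pairs, estimator, n_boot=100, seed=0):
    """SD of an estimator over n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(pairs)
    estimates = []
    for _ in range(n_boot):
        resample = [pairs[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    mean = sum(estimates) / n_boot
    return math.sqrt(sum((e - mean) ** 2 for e in estimates) / (n_boot - 1))

# Simulated responses: gain 0.5, bias 5 deg, Gaussian scatter (hypothetical).
noise = random.Random(1)
targets = [-75 + 5 * i for i in range(31)]
pairs = [(t, 5.0 + 0.5 * t + noise.gauss(0.0, 8.0)) for t in targets]
gain_sd = bootstrap_sd(pairs, gain)
```

The same wrapper applies to any of the fitted parameters, because the estimator is passed in as a function of the resampled data set.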
Results
The influence of intensity on monaural sound-source azimuth localization
Figure 2 exemplifies the azimuth and elevation responses of a typical binaural listener (Fig. 2A,B) and a typical monaural participant (Fig. 2C,D) to one of the stimulus types in the intensity paradigm (45 dBA SPL; high-pass noise). The binaural listener was quite precise in localizing these stimuli as demonstrated by the near-optimal regression lines and the small amount of scatter for both sound-source azimuth (Fig. 2A) and elevation (Fig. 2B). Although the scatter is clearly larger for the monaural listener, localization of the 45 dBA SPL stimulus appeared to be remarkably good despite the absence of binaural cues (Fig. 2C,D). Data such as these are therefore in line with a previous report (Slattery and Middlebrooks, 1994).
Figure 3 shows the responses of the same two listeners to all stimuli in the intensity paradigm (pooled intensities from 30-60 dBA SPL; see Materials and Methods). Note that the responses for the binaural listener were insensitive to the large range in sound levels. The regression lines through the pooled data were indistinguishable from the regression on the 45 dBA SPL data in Figure 2. Also, the scatter around the regression lines was quite modest. In contrast, the monaural listener appeared to be unable to localize the sound source, because her responses hardly correlated with the actual stimulus coordinates. Note, however, that this listener still appeared to perceive a large range of sound-source azimuths and elevations, which was comparable with that of the binaural listener. The responses of the monaural listener clearly were not directed merely to the side of the intact ear as reported in acute monaural studies (Slattery and Middlebrooks, 1994; Wightman and Kistler, 1997). Yet, the responses did not seem to be driven by spatial information either.
Localization capabilities of the other unilaterally deaf listeners in the intensity paradigm were also poor (supplemental Table 2, available at www.jneurosci.org), as evidenced by the poor correlation between stimulus and response azimuth (r² < 0.3), the low response gains, b (0.42 ± 0.18, median ± SD), and the high biases, a (11 ± 11°, median ± SD), in the linear regression analysis (Eq. 1) and the large mean unsigned errors, which were typically >30°.
To further illustrate how the azimuth responses of the two listeners were influenced by absolute sound level, we collected each listener's azimuth responses into small target azimuth-stimulus intensity bins (3.75° × 3 dBA SPL). Response angles were averaged if a bin contained more than one response. Subsequently, the average response azimuth was gray coded and plotted in the corresponding stimulus bin [with dark gray indicating responses into the far left (binaural) or the deaf (monaural) hemifield, and light gray into the right or hearing hemifield]. If listeners responded spatially accurately, regardless of sound level, a uniform pattern of vertical iso-gray bands should emerge. The overall performance of the binaural listener in Figure 4A approached this ideal situation quite well.
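The binning step can be sketched as follows. The bin sizes are those given in the text; the alignment of the bin edges to multiples of the bin size, the function name, and the example trials are assumptions of this sketch.

```python
import math
from collections import defaultdict

AZ_BIN = 3.75     # bin width in target azimuth (deg)
LEVEL_BIN = 3.0   # bin width in stimulus level (dB)

def bin_responses(trials):
    """Average response azimuth per (azimuth bin, level bin).

    trials: iterable of (target_azimuth, level, response_azimuth).
    Bins without any response are simply absent from the result.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for target_az, level, response_az in trials:
        key = (math.floor(target_az / AZ_BIN), math.floor(level / LEVEL_BIN))
        sums[key][0] += response_az
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

# Two hypothetical trials that share a bin, and one that does not.
example = [(10.0, 45.0, 12.0), (10.5, 45.5, 18.0), (-30.0, 30.0, -50.0)]
binned = bin_responses(example)
```

Leaving empty bins out of the result keeps them distinguishable from bins with a mean response of zero when the map is gray coded.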
In contrast, performance of the unilaterally deaf listener IE across the entire range of free-field sound intensities was quite different (Fig. 4B). The data clearly show that free-field stimulus level dominates her localization behavior. In particular, note that the loudest stimuli were located exclusively in the unaffected hemifield (Fig. 4B, light shading), whereas the weakest stimuli all appeared to arrive from the deaf side (dark shading). Monaural listeners often did have trouble hearing the weakest stimuli on their far-deaf side, so that a response was sometimes not made at all (Fig. 4B, indicated by the white bins). Stimuli with midrange intensities typically elicited a larger range of azimuth responses, so that listeners seemed to make near-normal localization responses (Fig. 2C,D).
However, because of the HSE, the free-field sound is filtered and attenuated in an azimuth-dependent way by the acoustic properties of the head. To incorporate this acoustic effect in our analysis, we first measured the average sound level at each ear as a function of sound-source azimuth. The result is shown in Figure 5A (data pooled for four listeners). Across the range of target azimuths (from -75 to 75°), sound level at each ear varied over a range of ∼20 dBA SPL. The analysis presented thus far has not accounted for this potential localization cue. In Figure 5B, the data are shown as a function of proximal sound level in the same format as in Figure 4B after incorporating the HSE (Eq. 4) (see Materials and Methods). Note the clear influence of the HSE on azimuth localization responses of the monaural listener: low proximal stimulus levels are localized on the deaf side (i.e., negative azimuths), whereas high proximal stimulus levels are localized on the hearing side (i.e., positive azimuths).
To quantify the relative contributions of actual target azimuth and either proximal stimulus level (for monaural listeners) or absolute free-field intensity (for binaural listeners) to the localization of azimuth, a standardized multiple linear regression analysis was performed (Eqs. 3, 4) (Fig. 6; supplemental Table 3, available at www.jneurosci.org). Plotting the resulting partial correlation coefficients for stimulus azimuth (k) against that for absolute sound level (m), Figure 6 shows that for all binaural listeners, the azimuth coefficient, k, was close to one, whereas the intensity coefficient, m, did not differ significantly from zero (p > 0.05). This analysis therefore shows that the main predictor for the orienting response of all binaural listeners is actual sound-source azimuth, whereas the role of free-field sound level is negligible.
However, the monaural listeners clearly deviate from this picture. For three monaural listeners (IE, JP, and PO), the azimuth coefficient k was insignificant (p > 0.05), whereas the coefficient for proximal sound level was high (m ≈ 0.75). These monaural listeners therefore made no distinction between the proximal intensity of a sound and its location in the horizontal plane. Interestingly, for five other monaural listeners, the partial correlation coefficients for stimulus azimuth were significant (p < 0.05), although proximal intensity also dominated the responses of these listeners (m > k). Only for monaural listener CD were both coefficients approximately equal, but still low (∼0.35). As seen in this figure, the more monaural listeners relied on actual target azimuth, the lower the partial correlation for intensity (r = -0.74; p < 0.05).
The data therefore indicate that azimuth localization is nearly impossible for monaural listeners when they are subjected to an acoustic environment with a large range of potential stimulus intensities. All monaural listeners relied heavily on the proximal sound intensity cue provided by the HSE, even though this cue is ambiguous for sound localization. Therefore, actual performance of these listeners in such an environment was quite poor, because their stimulus-response relationships nearly completely broke down (Fig. 3).
The influence of spectral cues on monaural sound-source azimuth localization
Despite the poor localization behavior and strong contribution of proximal sound level, six monaural listeners were able to extract a significant amount of information about the veridical azimuth location (k > 0). Because the monaural spectral cues of the good ear were unaffected, we wondered whether these monaural listeners also learned to use this more complex cue when localizing sound azimuth.
To test for this possibility, we subjected seven of our monaural listeners to the spectral paradigm (see Materials and Methods) in which they localized sounds of variable intensities with and without a wax mold applied to the pinna of their good ear. This wax mold perturbed the spectral cues while leaving the HSE unaffected. As expected, the mold produced a severe impairment in the elevation localization behavior of all monaural listeners (data not shown). The azimuth data from this experiment were analyzed in the same way as in the intensity experiment. The effects of the wax mold on the partial correlation coefficients for both proximal intensity and azimuth are plotted in Figure 7. Note that when the azimuth partial correlation, k, obtained from responses during the free-ear condition was non-zero (p < 0.05; monaural listeners BN, CD, LD, and RH), it was significantly lower in the mold condition. For monaural listeners GK, JP, and SB, the azimuth coefficients in both conditions did not differ from zero in this experiment. Intensity coefficients increased significantly for three of the monaural listeners in the mold condition (CD, LD, and RH) but did not change for the other monaural listeners.
These data support the possibility that monaural listeners can use monaural spectral cues to extract information about sound-source azimuth. However, only half of the monaural listeners did so and even then performance was severely hampered when compared with normal binaural localization.
We next investigated whether the spectral cues of the hearing ear could also contribute to localization of sound-source azimuth on the deaf side. To that end, the multiple linear regression analysis was repeated on the data from the intensity paradigm (Eqs. 3 and 4) (see Materials and Methods) but now separately for the hearing (αT > 5°) and deaf sides (αT < -5°) (Fig. 8). The results indicate that the partial correlations for azimuth on the deaf side were much lower than those on the hearing side. Furthermore, the intensity coefficients did not differ between hearing and deaf sides (p > 0.05; data not shown).
In summary, our data suggest that (1) monaural spectral cues are only used by approximately half of the monaural listeners to localize azimuth, and (2) these cues cannot be used to extract azimuth information over the entire azimuth range.
Monaural sound-elevation localization
As shown in Figures 3 and 4, the binaural listener was quite accurate at localizing sound elevation for the different sound intensities, whereas performance of the monaural listener was clearly compromised.
Previous studies with binaural listeners have shown that disruption of the spectral cues of one ear systematically affects elevation localization as a function of sound-source azimuth on the disrupted side (Musicant and Butler, 1984; Morimoto, 2001; Hofman and Van Opstal, 2003). Our data thus far show that the monaural listeners localize azimuth on the basis of both spectral cues and proximal sound level (HSE) cues with different relative contributions (Fig. 6). We therefore hypothesized that the percept of sound-source elevation in these monaural listeners would not only rely on actual target elevation (spectral cues) but also on perceived sound-source azimuth, which was shown to be a function of target azimuth and proximal sound level (Fig. 6).
As exemplified for one of our binaural control listeners in Figure 9, A and B, the binaural localization responses solely depended on the actual sound elevation, regardless of either target azimuth or free-field intensity. However, quite different response patterns were obtained for the monaural listeners. For example, the elevation responses of monaural listener IE depended slightly on target azimuth, which is apparent from the systematic gradient in the azimuth direction in Figure 9C. This listener's elevation responses also depended on proximal sound level. At low proximal intensities, responses were confined to a small range of low elevations; however, at higher proximal intensities, her responses were directed toward more veridical elevation angles (Fig. 9D). In contrast, monaural listener JP responded almost exclusively to proximal intensity, because the responses did not systematically depend on either target elevation or target azimuth (Fig. 9E,F). Low proximal intensities elicited low response elevations, whereas high proximal sound levels yielded high response elevations in this participant.
To quantify the influence of target elevation, target azimuth, and proximal sound intensity for all unilaterally deaf listeners and binaural control listeners, we performed multiple linear regression on the elevation response data (Eq. 5) (supplemental Table 4, available at www.jneurosci.org). As expected, all binaural listeners responded exclusively to actual target elevation (median value of n ± SD, 0.90 ± 0.09), regardless of the changes in target azimuth (o ≈ 0) and free-field intensity (q ≈ 0). However, the results for the monaural listeners were more idiosyncratic. For most monaural listeners, actual target elevation played a much smaller role than in the binaural controls (median value of n ± SD, 0.56 ± 0.24), with the notable exception of monaural listener RH, whose performance (n = 0.83) equaled that of the naive binaural listeners (MT, SW, and TE). For monaural listeners GK, RH, and SB, responses depended exclusively on sound-source elevation. For the other six monaural listeners, responses also depended on proximal sound level and target azimuth (supplemental Table 4, available at www.jneurosci.org). Interestingly, monaural listeners with a high partial correlation coefficient for elevation, n, also had a high partial correlation coefficient for azimuth in their azimuth responses, k (Fig. 10A). These two coefficients were highly correlated (r = 0.79; p < 0.05), which indicates that unilaterally deaf listeners who were able to make use of their spectral cues to localize sound-source azimuth were also successful at using the spectral cues to localize elevation. Some of the monaural listeners could also localize elevation on their deaf side (Fig. 10B).
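The regression analysis described above can be illustrated with a minimal sketch. Equation 5 is not reproduced in this section, so the form used here is an assumption: the z-scored elevation response is fitted as a weighted sum of z-scored target elevation (weight n), target azimuth (weight o), and sound level (weight q). The function names, the stimulus ranges, and the synthetic "elevation-only" listener are hypothetical and serve only to show how the three weights separate the contributions; the sketch returns standardized regression weights, which may differ from the normalization used for the reported coefficients.

```python
import random
import statistics


def zscore(values):
    """Rescale values to zero mean and unit (population) SD; values must not be constant."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * size
    for r in reversed(range(size)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, size))
        x[r] = (m[r][size] - known) / m[r][r]
    return x


def elevation_regression(target_el, target_az, sound_level, response_el):
    """Least-squares fit of z-scored response elevation; returns the weights (n, o, q)."""
    predictors = [zscore(target_el), zscore(target_az), zscore(sound_level)]
    y = zscore(response_el)
    # Normal equations (X^T X) beta = X^T y; z-scoring removes the need for an intercept.
    xtx = [[sum(u * v for u, v in zip(p1, p2)) for p2 in predictors] for p1 in predictors]
    xty = [sum(u * v for u, v in zip(p, y)) for p in predictors]
    return tuple(solve(xtx, xty))


# Hypothetical listener who responds to target elevation only (arbitrary ranges and noise).
random.seed(0)
el = [random.uniform(-30, 30) for _ in range(500)]
az = [random.uniform(-30, 30) for _ in range(500)]
level = [random.choice([30.0, 40.0, 50.0, 60.0]) for _ in range(500)]
response = [0.9 * e + random.gauss(0, 3) for e in el]

n, o, q = elevation_regression(el, az, level, response)
print(f"n = {n:.2f}, o = {o:.2f}, q = {q:.2f}")
```

For such a listener the fit should yield a weight n close to 1 and weights o and q close to 0, which is the pattern reported for the binaural controls; a listener driven by sound level would instead show a large q.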
Training to localize azimuth monaurally
Most monaural listeners responded with head movements covering a range of azimuths that varied with the applied stimulus intensities (Fig. 4B). Low-intensity stimuli typically elicited responses far into the deaf hemifield, whereas loud stimuli tended to elicit responses far into the hearing side. Apparent near-normal localization capabilities were seen for intermediate intensities. This finding suggests that these monaural listeners may have learned to use the ambiguous HSE cue to cope with the acoustic environment encountered in daily life.
We therefore wondered whether monaural listeners could rapidly learn to improve their azimuth localization performance for a novel stimulus of one fixed intensity. To that end, five monaural listeners (CD, GK, JP, RH, and SB) were trained to localize loud BB noise bursts of 60 dBA SPL under open-loop (auditory) and closed-loop (auditory-visual) stimulus conditions. Six blocks of stimuli were presented, alternating between auditory and auditory-visual blocks, with the first block being purely auditory (see Materials and Methods). Monaural listener JP (Fig. 11A) reduced her mean-squared localization error by 18° already within the first open-loop block. In the subsequent two open-loop blocks, these errors were reduced even further. The other monaural listeners started out with smaller initial errors (Fig. 11B, dark circles), but performance improved for all monaural listeners tested, reaching an average mean unsigned localization error of ∼18° (Fig. 11B, white circles). This level of performance was already achieved in the first open-loop block (Fig. 11B, gray circles), so the subsequent visual feedback blocks and open-loop blocks added little further improvement. Thus, these data suggest that the HSE can be easily and rapidly remapped, even in the absence of visual feedback, for a simple fixed-intensity sound.
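The error measure used to track training can be sketched as follows. This is a minimal illustration of a mean unsigned localization error computed per block; the function name and the example target and response azimuths are hypothetical and are not data from the experiment.

```python
from statistics import fmean


def mean_unsigned_error(targets_deg, responses_deg):
    """Mean absolute difference (in degrees) between response and target azimuth."""
    return fmean(abs(r - t) for t, r in zip(targets_deg, responses_deg))


# Hypothetical blocks of (target azimuths, response azimuths), in degrees.
blocks = {
    "early": ([-20.0, 0.0, 20.0], [10.0, 25.0, 45.0]),
    "late": ([-20.0, 0.0, 20.0], [-15.0, 5.0, 10.0]),
}
for name, (targets, responses) in blocks.items():
    print(name, mean_unsigned_error(targets, responses))
```

Because the measure discards the sign of each error, a constant response bias and random scatter both raise it; comparing it across blocks shows only whether responses moved closer to the targets, not how.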
Discussion
This study investigated the role of monaural spectral pinna cues and the HSE in the two-dimensional sound localization performance of listeners with a complete unilateral hearing loss and compared their response behavior with that of normal-hearing binaural control listeners. We found that none of the binaural controls relied on stimulus intensity as a potential cue for sound location (Figs. 3, 4, 6).
In contrast, all monaural listeners depended heavily on the HSE to localize sound-source azimuth (Figs. 4B, 5B, 6). This is quite remarkable, given that the HSE is in principle an ambiguous signal for azimuth location. Indeed, because monaural listeners heavily relied on this cue, azimuth localization performance in the randomized intensity paradigm was quite poor (Fig. 3C,D).
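The ambiguity of the HSE can be made concrete with a toy model. The sketch below assumes, purely for illustration, that the level at the hearing ear grows linearly with azimuth toward the hearing side and that the listener inverts this relation while taking one free-field level for granted. The slope, the assumed level, and the function names are hypothetical and are not measured values from this study.

```python
import math

SHADOW_DB_PER_DEG = 0.1  # hypothetical head-shadow slope (dB per degree), not measured
ASSUMED_LEVEL_DB = 50.0  # free-field level the toy listener takes for granted


def proximal_level(free_field_db, azimuth_deg):
    """Level at the hearing ear; positive azimuth points toward the hearing side."""
    return free_field_db + SHADOW_DB_PER_DEG * azimuth_deg


def hse_azimuth_estimate(proximal_db):
    """Invert the head-shadow relation under the assumed free-field level."""
    return (proximal_db - ASSUMED_LEVEL_DB) / SHADOW_DB_PER_DEG


# Two different sources produce the same proximal level, so the cue cannot tell them apart.
same = math.isclose(proximal_level(44.0, 30.0), proximal_level(50.0, -30.0))
print("ambiguous:", same)

# A source straight ahead is judged far to one side when its level departs from the assumption.
for level_db in (40.0, 50.0, 60.0):
    print(level_db, hse_azimuth_estimate(proximal_level(level_db, 0.0)))
```

In this toy model, sounds quieter than the assumed level are pushed toward the deaf side and louder sounds toward the hearing side, whereas a sound at the assumed level is judged correctly. This mirrors the qualitative pattern of the randomized-intensity data without claiming to reproduce it.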
Previous free-field localization studies have also shown substantial localization deficits for listeners with a unilateral hearing loss (Angel and Fite, 1901; Jongkees and Van der Veer, 1957; Viehweg and Campbell, 1960; Gatehouse and Cox, 1972; Gatehouse, 1976; Humes et al., 1980; Newton and Hickson, 1981; Newton, 1983; Slattery and Middlebrooks, 1994; Bosman et al., 2003). Some of these studies, however, reported near-normal localization performance in these monaural listeners (Newton, 1983; Slattery and Middlebrooks, 1994), a finding that is not supported by our results (Fig. 3C,D). The discrepancy is likely attributable to the limited range of intensities used in these previous studies when compared with our study. For sound levels in the midrange (40–50 dBA SPL), adequate response behavior can often be observed (Fig. 2C,D), even though the responses have to be entirely explained by the HSE (Fig. 5B).
Measurements of the head-related transfer functions indicate that the monaural spectral cues can provide unique information about sound-source location for both the azimuth and elevation directions (data not shown) so that, at least in principle, monaural listeners might have regained adequate localization performance by relying on the spectral cues of their normal ear. Note that the normal-hearing control listeners do not appear to use these more subtle spectral cues for azimuth localization. After application of a unilateral or binaural mold, their azimuth localization responses are not affected (data not shown in this study, but see Oldfield and Parker, 1984; Hofman and Van Opstal, 2003), whereas manipulation of spectral cues in a virtual set-up has little influence on lateral angle judgments (Macpherson and Middlebrooks, 2002). Applying a monaural plug to normal-hearing listeners leads to an immediate and long-lasting shift of the azimuth percept toward the side of the free ear both in humans (Oldfield and Parker, 1986; Butler et al., 1990; Slattery and Middlebrooks, 1994) and in experimental animals (Knudsen et al., 1982; King et al., 2000).
A majority of the monaural listeners in our study showed evidence of using spectral cues to localize sound-source azimuth. This is in line with previous suggestions made by Newton (1983) and Slattery and Middlebrooks (1994). The present study extends these findings by quantifying how this contribution varied from listener to listener. Furthermore, the use of spectral cues was restricted to the hearing hemifield for the majority of monaural participants (Fig. 8). Interestingly, the strength of the spectral-cue contribution to a listener's azimuth percept was also a good predictor of that listener's ability to localize sound-source elevation (Fig. 10A). However, some monaural listeners did not use the spectral cues at all. Their azimuth localization responses were entirely dominated by proximal stimulus intensity.
The unilateral deaf adapted to their loss of binaural information by incorporating monaural cues into their sound-azimuth localization behavior. In human plasticity studies, the evidence for an adaptive shift in sound-azimuth response behavior after a modification of binaural information is sparse. After introducing a monaural plug in binaural listeners, Slattery and Middlebrooks (1994) did not see a change in azimuth localization behavior within 24 hr. Listeners exposed to reversed binaural cues did not adapt even over a period of up to 19 d (Hofman et al., 2002). Of course, our unilateral deaf had been exposed to their binaural loss for a much longer time (typically >18 years). Auditory-evoked potentials obtained from a late-onset unilaterally deaf listener suggest that the changes in cortical activity, evidenced by an increased inter-hemispheric symmetry, occur gradually and may continue for at least 2 years (Ponton et al., 2001).
This long-term adaptive shift is likely to be very different from the quick changes in response behavior during the training paradigm. Although the incorporation of monaural cues for azimuth localization appears to be very difficult, if not impossible, learning the relationship between proximal sound intensity and sound location may be quite simple for familiar acoustic situations. Under such conditions, listeners may adopt a strategy to respond to the proximal intensity by mapping the midrange intensities to central locations. The training paradigm showed that the unilateral deaf could apply this strategy within a single block of trials when told that the stimulus is of a fixed intensity.
Our results show a clear degradation of both azimuth and elevation response performance when compared with binaural listeners. Also, the unilateral deaf in the study by Slattery and Middlebrooks (1994) had problems localizing sound elevation, especially on the deaf side. In binaural listeners, spectral disruption of one ear degrades elevation localization on the side ipsilateral to the mold (Morimoto, 2001; Hofman and Van Opstal, 2003). Our results extend these findings by showing that degradation in azimuth performance also induces a similar degradation in elevation performance (Fig. 10A). Therefore, incorporation of the spectral cues to guide azimuth localization behavior is linked to the ability to use spectral cues from the good ear for elevation localization on the deaf side.
Our observation that the unilateral deaf may have residual localization abilities that could be attributed to their use of spectral cues may seem to disagree with results from monaural sound localization with a virtual acoustic set-up in normal-hearing listeners (Wightman and Kistler, 1997). The difference between the two studies is probably explained by the much longer exposure to a binaural information loss in the unilateral deaf, in contrast with the immediate monauralization of the normal-hearing listeners. Thus, when listeners have not learned to use the monaural spectral cues, their ability to monaurally localize sound-source azimuth as well as elevation is abolished. Indeed, some of our monaural listeners were entirely unable to localize sounds in both azimuth and elevation (Fig. 10).
In conclusion, the apparent conflict in results from monaural listeners across different studies in the literature is probably attributable to two factors. First, a significant fraction of the listeners has not learned to incorporate spectral cues to extract azimuth location. Second, most studies did not use sufficient variation of stimulus intensities to enable a dissociation of the different contributions of the HSE and spectral cues.
Note that the complex spectral cues, although veridical, will contribute almost exclusively to localization on the hearing side and at most a small amount on the deaf side (Fig. 8). The signal-to-noise ratio in these cues for low-intensity sounds will deteriorate rapidly for stimuli on the deaf side. Thus, these complex cues can only be applied successfully for a limited class of sound sources that are loud enough and have a relatively flat broadband spectrum.
In contrast, intensity cues can easily be learned in simple acoustic environments for a variety of sounds (compare Fig. 11). Thus, the unilateral deaf might have adopted a pragmatic strategy by incorporating the relatively straightforward monaural intensity cue to localize sounds while neglecting the veridical but limited spectral cues. However, monaural listeners who neglect the spectral cues altogether also lack the ability to localize sounds in elevation. A recent study with normal-hearing listeners showed that adult listeners could relearn new spectral cues within a period of only a few weeks (Hofman et al., 1998). Given the importance of adequate sound-localization performance in the highly dynamic and complex acoustic environments of everyday life, it would be worthwhile to explore the possibility of training monaural listeners to use their spectral cues and thus to radically improve their overall localization behavior.
Footnotes
This research was supported by the University of Nijmegen (A.J.V.O., M.M.V.W.) and the Human Frontiers Science Program Grant RG 0174-1998/B (M.M.V.W.). We thank G. Van Lingen, H. Kleijnen, G. Windau, and T. Van Dreumel for technical assistance. We also express our gratitude to the nine monaural and five binaural volunteers for repeatedly participating in our experiments.
Correspondence should be addressed to John Van Opstal, Department of Medical Physics and Biophysics, University of Nijmegen, Geert Grooteplein 21, 6525 EZ Nijmegen, The Netherlands. E-mail: johnvo@mbfys.kun.nl.
Copyright © 2004 Society for Neuroscience 0270-6474/04/244163-09$15.00/0