Synthesizing spatially complex sound in virtual space: an accurate offline algorithm
Introduction
The psychophysical study of auditory motion processing is still underdeveloped compared to the study of stationary spatial processing (Wightman and Kistler, 1993, Brown, 1994, Blauert, 1997). The physiological study of auditory motion processing is also sparse (Sovijärvi and Hyvärinen, 1974, Ahissar et al., 1992, Wagner et al., 1994, Jiang et al., 2000), and most of the physiological studies of spatial processing in the auditory system focus on sound localization of stationary objects (e.g. Middlebrooks and Knudsen, 1987, Imig et al., 1990, Rajan et al., 1990, Brugge et al., 1996 in cats; Moiseff and Konishi, 1983 in barn owls).
To study auditory spatial processing, investigators usually use one of two sound delivery methods: free-field presentation using speakers, or headphone presentation of virtual space stimuli. Free-field presentation requires special mechanical setups and limits the number of spatial configurations that can be used within one experiment (Perrott and Tucker, 1988, Perrott and Marlborough, 1989, Saberi and Perrott, 1990, Grantham, 1997). Furthermore, complex spatial configurations, such as motion along curved trajectories or sources whose velocity changes randomly over time, are difficult to achieve in free field. As a result, compromises must be made, such as presenting apparent motion by activating an array of speakers in a volley (e.g. Wagner et al., 1994), simulating motion in free field (Grantham, 1986), or using artificial sounds that mimic some aspects of auditory motion (Stumpf et al., 1992, Griffiths et al., 1998, Baumgart et al., 1999, Jiang et al., 2000).
Virtual space methods employ earphone presentation to simulate sound sources at different positions in space (e.g. Wightman and Kistler, 1989a, Wightman and Kistler, 1989b). The pinna and body introduce frequency-dependent level and phase distortions in sounds reaching the eardrum. These effects are quantified by a position-dependent function called the Head Related Transfer Function (HRTF). Virtual space stimuli are generated by modifying the spectrum of a sound source using HRTFs. When subjects are presented over earphones with stimuli modified by their own HRTFs, they usually report that the sound appears externalized (Hartmann and Wittenberg, 1996). The HRTFs of several animal species have also been measured (e.g. Musicant et al., 1990, Rice et al., 1992 in cats; Keller et al., 1998 in barn owls; Spezio et al., 2000 in Rhesus monkeys), although no behavioral correlate of the subjective sensation of externalization is known in animals. In principle, knowing the HRTFs for all positions in space should enable simulation of sound generated by sources moving along arbitrary trajectories.
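For a stationary virtual source, the procedure described above amounts to filtering the source signal with the left- and right-ear head-related impulse responses (HRIRs) measured at the desired position. A minimal sketch (the HRIR arrays here are arbitrary stand-ins, not measured data, and the function name is ours):

```python
import numpy as np

def render_static_source(signal, hrir_left, hrir_right):
    """Render a monaural signal at a fixed virtual position by
    convolving it with the left- and right-ear HRIRs measured
    for that position."""
    left = np.convolve(signal, hrir_left)
    right = np.convolve(signal, hrir_right)
    return left, right

# Toy check: an impulse as the "source" reproduces the HRIRs themselves.
src = np.zeros(256)
src[0] = 1.0
hl = np.array([0.0, 1.0, 0.5])   # stand-in HRIRs; real ones are measured
hr = np.array([0.8, 0.2, 0.0])
L, R = render_static_source(src, hl, hr)
```

In practice the convolution is done per ear with the subject's own measured HRIRs, which is what produces the externalized percept reported above.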
In this paper, we describe an algorithm for synthesizing auditory motion stimuli in virtual space. The algorithm makes direct use of measured HRTFs from any source (human, animal, artificial ears). The HRTFs are assumed to be sampled densely enough in space so that interpolation of HRTFs at non-measured directions will be valid. No other assumptions are made about the properties of the HRTFs. The algorithm described in this paper is physically validated by comparing the waveforms it generates with waveforms measured during actual motion.
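The moving-source case can be sketched in discrete time as a convolution whose kernel is the HRIR interpolated at the source's current position at each output sample. This is one simple realization under that assumption, not necessarily the authors' exact scheme, and the array layout is hypothetical:

```python
import numpy as np

def render_moving_source(signal, hrirs_along_path):
    """Time-varying convolution: y[n] = sum_k h_n[k] * s[n - k], where
    h_n is the HRIR (interpolated from the measured set) at the source
    position at output sample n.
    `hrirs_along_path` has shape (n_samples, kernel_len)."""
    n_samples, kernel_len = hrirs_along_path.shape
    out = np.zeros(n_samples)
    for n in range(n_samples):
        for k in range(kernel_len):
            if 0 <= n - k < len(signal):
                out[n] += hrirs_along_path[n, k] * signal[n - k]
    return out

# If the kernel is identical at every sample, this reduces to ordinary
# convolution; a kernel that changes along the path simulates motion.
sig = np.array([1.0, 2.0, 3.0, 4.0])
fixed = np.tile([1.0, 0.0], (4, 1))          # unit impulse as "HRIR"
moving = np.array([[2.0, 0.0], [0.5, 0.0],   # gain varies along the path
                   [2.0, 0.0], [0.5, 0.0]])
```

The inner loop makes the per-sample kernel explicit; a practical implementation would vectorize it and render the two ears separately.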
Two applications of the algorithm are presented to demonstrate its capabilities. In the first, human subjects perform a motion-direction discrimination task; the resulting thresholds are consistent with those previously reported in the literature. In the second, barn owls are trained to turn their heads in response to virtual motion stimuli.
The convolution equation
The propagation of sound from any position in space to the eardrum is described by a linear transfer function, the HRTF. The HRTF contains information about both the delay and attenuation due to the propagation, and the spectral distortion due to the angular position of the sound source with respect to the head and torso. The time-domain counterpart of the HRTF is the head-related impulse response (HRIR), which will be denoted by h_x(t).
When a sound source travels along a trajectory x(t) in
Analysis of the convolution equation
As an illustration of the properties of Eq. (1), it is now shown that it implicitly contains the Doppler effect. To demonstrate this, let h be reduced to a delta function depending on the radial distance from the center of the head alone, i.e.

h_{x(t)}(τ) = δ(τ − r_t/c),

where r_t is the distance of x(t) from the center of the head and c is the speed of sound. δ(τ) possesses the following property:

δ(τ − r_{t−τ}/c) = δ(τ − τ_0) / |1 + ṙ_{t−τ_0}/c|,   (6)

where τ_0 solves τ − r_{t−τ}/c = 0. Substituting Eq. (6) in Eq. (1) yields an eardrum signal s(t − τ_0)/|1 + ṙ_{t−τ_0}/c|, in which the time-varying delay τ_0 produces the Doppler frequency shift. Consider a sound source
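The Doppler consequence of this point-source reduction can be checked numerically. Assuming the eardrum signal is the source delayed by τ_0(t), then for a source receding radially at constant speed v from range r_0, τ_0(t) = (r_0 + vt)/(c + v), and a tone of frequency f is received at f/(1 + v/c). A sketch under those assumptions (all numeric values are illustrative):

```python
import numpy as np

# Source receding radially at constant speed v from initial range r0.
# tau0 solves tau = r(t - tau)/c, giving tau0(t) = (r0 + v*t)/(c + v).
c, v, r0 = 343.0, 34.3, 10.0      # speed of sound, source speed, range (m)
f, fs, T = 1000.0, 96000, 1.0     # tone frequency (Hz), sample rate, duration (s)
t = np.arange(int(fs * T)) / fs
tau0 = (r0 + v * t) / (c + v)
y = np.sin(2 * np.pi * f * (t - tau0)) / (1 + v / c)  # delay + amplitude factor

# Estimate the received frequency from the zero-crossing rate.
crossings = np.sum(np.abs(np.diff(np.signbit(y).astype(np.int8))))
f_received = crossings / (2 * T)
f_expected = f / (1 + v / c)      # Doppler prediction for a receding source
```

With v = c/10 the received tone sits near 909 Hz, i.e. shifted down by the expected factor 1/(1 + v/c).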
Discussion
We presented an accurate algorithm for synthesizing virtual space sound. While virtual space sound algorithms based on measured HRIR sets have been described previously (Jenison et al., 1998), our algorithm attempts to achieve accurate reconstruction of the eardrum waveform without extraneous assumptions. The validity of this equation is tested directly by comparing waveforms recorded during real motion with those estimated by the algorithm. The applicability of the algorithm for auditory
Acknowledgements
The authors thank Professor Hermann Wagner and Nachum Ulanovsky for comments on the manuscript, and Yehoshua Yehuda for help with programming the motor. This work was supported by a grant from the German–Israeli Foundation (GIF).
References (48)
- Jiang et al. Responses of cells to stationary and moving sound stimuli in the anterior ectosylvian cortex of cats. Hearing Res. (2000)
- Keller et al. Head-related transfer functions of the barn owl: measurement and neural responses. Hearing Res. (1998)
- Rice et al. Pinna-based spectral cues for sound localization in cat. Hearing Res. (1992)
- Sovijärvi and Hyvärinen. Auditory cortical neurons in the cat sensitive to the direction of sound source movement. Brain Res. (1974)
- Spezio et al. Head-related transfer functions of the Rhesus monkey. Hearing Res. (2000)
- Eye movements of the owl. Vision Res. (1973)
- Ahissar et al. Encoding of sound-source location and movement: activity of single neurons and interactions between adjacent neurons in the monkey auditory cortex. J. Neurophysiol. (1992)
- Baumgart et al. A movement-sensitive area in auditory cortex. Nature (1999)
- Blauert. Spatial Hearing: The Psychophysics of Human Sound Localization (1997)
- Sound localization
- Localization of noise bands by old world monkeys. J. Acoust. Soc. Am.
- Brugge et al. The structure of spatial receptive fields of neurons in primary auditory cortex of the cat. J. Neurosci. (1996)
- Grantham. Detection and discrimination of simulated motion of auditory targets in the horizontal plane. J. Acoust. Soc. Am. (1986)
- Grantham. Auditory motion perception: snapshots revisited (1997)
- Griffiths et al. Right parietal cortex is involved in the perception of sound movement in humans. Nat. Neurosci. (1998)
- Hartmann and Wittenberg. On the externalization of sound images. J. Acoust. Soc. Am. (1996)
- Imig et al. Single-unit selectivity to azimuthal direction and sound pressure level of noise bursts in cat high-frequency primary auditory cortex. J. Neurophysiol. (1990)
- Jenison et al. A spherical basis function neural network for modeling auditory space. Neural Comp. (1998)
- Synthesis of virtual motion in 3D auditory space. IEEE Eng. Med. Biol.
- A model of head-related transfer functions based on principal components analysis and minimum-phase reconstruction. J. Acoust. Soc. Am.
- Sound localization by the barn owl (Tyto alba) measured with the search coil technique. J. Comp. Phys.