Synthesizing spatially complex sound in virtual space: an accurate offline algorithm
Introduction
The psychophysical study of auditory motion processing is still underdeveloped compared to the study of stationary spatial processing (Wightman and Kistler, 1993, Brown, 1994, Blauert, 1997). The physiological study of auditory motion processing is also sparse (Sovijärvi and Hyvärinen, 1974, Ahissar et al., 1992, Wagner et al., 1994, Jiang et al., 2000), and most of the physiological studies of spatial processing in the auditory system focus on sound localization of stationary objects (e.g. Middlebrooks and Knudsen, 1987, Imig et al., 1990, Rajan et al., 1990, Brugge et al., 1996 in cats; Moiseff and Konishi, 1983 in barn owls).
To study auditory spatial processing, investigators usually use one of two sound delivery methods: free-field presentation using speakers, or headphone presentation of virtual space stimuli. Free-field presentation requires special mechanical setups and limits the number of spatial configurations that can be used within one experiment (Perrott and Tucker, 1988, Perrott and Marlborough, 1989, Saberi and Perrott, 1990, Grantham, 1997). Furthermore, complex spatial configurations, such as motion along curved trajectories or sources whose velocity changes randomly over time, are difficult to achieve in free field. As a result, compromises must be made, such as presenting apparent motion by activating an array of speakers in a volley (e.g. Wagner et al., 1994), simulating motion in free field (Grantham, 1986), or using artificial sounds that mimic some aspects of auditory motion (Stumpf et al., 1992, Griffiths et al., 1998, Baumgart et al., 1999, Jiang et al., 2000).
Virtual space methods employ earphone presentation to simulate sound sources at different positions in space (e.g. Wightman and Kistler, 1989a, Wightman and Kistler, 1989b). The pinna and body introduce frequency-dependent level and phase distortions in sounds reaching the eardrum. These effects are quantified by a position-dependent function called the Head Related Transfer Function (HRTF). Virtual space stimuli are generated by modifying the spectrum of a sound source using HRTFs. When subjects are presented over earphones with stimuli modified by their own HRTFs, they usually report that the sound appears externalized (Hartmann and Wittenberg, 1996). The HRTFs of several animal species have also been measured (e.g. Musicant et al., 1990, Rice et al., 1992 in cats; Keller et al., 1998 in barn owls; Spezio et al., 2000 in Rhesus monkeys), although no behavioral correlate of the subjective sensation of externalization is known in animals. In principle, knowing the HRTFs for all positions in space should enable simulation of sound generated by sources moving along arbitrary trajectories.
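For a stationary virtual source, the procedure described above amounts to filtering the source signal with the left- and right-ear head-related impulse responses (HRIRs) measured at the desired position. A minimal sketch (the HRIR arrays here are arbitrary stand-ins, not measured data, and the function name is ours):

```python
import numpy as np

def render_static_source(signal, hrir_left, hrir_right):
    """Render a monaural signal at a fixed virtual position by
    convolving it with the left- and right-ear HRIRs measured
    for that position."""
    left = np.convolve(signal, hrir_left)
    right = np.convolve(signal, hrir_right)
    return left, right

# Toy check: an impulse as the "source" reproduces the HRIRs themselves.
src = np.zeros(256)
src[0] = 1.0
hl = np.array([0.0, 1.0, 0.5])   # stand-in HRIRs; real ones are measured
hr = np.array([0.8, 0.2, 0.0])
L, R = render_static_source(src, hl, hr)
```

In practice the convolution is done per ear with the subject's own measured HRIRs, which is what produces the externalized percept reported above.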
In this paper, we describe an algorithm for synthesizing auditory motion stimuli in virtual space. The algorithm makes direct use of measured HRTFs from any source (human, animal, artificial ears). The HRTFs are assumed to be sampled densely enough in space so that interpolation of HRTFs at non-measured directions will be valid. No other assumptions are made about the properties of the HRTFs. The algorithm described in this paper is physically validated by comparing the waveforms it generates with waveforms measured during actual motion.
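The moving-source case can be sketched in discrete time as a convolution whose kernel is the HRIR interpolated at the source's current position at each output sample. This is one simple realization under that assumption, not necessarily the authors' exact scheme, and the array layout is hypothetical:

```python
import numpy as np

def render_moving_source(signal, hrirs_along_path):
    """Time-varying convolution: y[n] = sum_k h_n[k] * s[n - k], where
    h_n is the HRIR (interpolated from the measured set) at the source
    position at output sample n.
    `hrirs_along_path` has shape (n_samples, kernel_len)."""
    n_samples, kernel_len = hrirs_along_path.shape
    out = np.zeros(n_samples)
    for n in range(n_samples):
        for k in range(kernel_len):
            if 0 <= n - k < len(signal):
                out[n] += hrirs_along_path[n, k] * signal[n - k]
    return out

# If the kernel is identical at every sample, this reduces to ordinary
# convolution; a kernel that changes along the path simulates motion.
sig = np.array([1.0, 2.0, 3.0, 4.0])
fixed = np.tile([1.0, 0.0], (4, 1))          # unit impulse as "HRIR"
moving = np.array([[2.0, 0.0], [0.5, 0.0],   # gain varies along the path
                   [2.0, 0.0], [0.5, 0.0]])
```

The inner loop makes the per-sample kernel explicit; a practical implementation would vectorize it and render the two ears separately.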
Two applications of the algorithm are presented to demonstrate its capabilities. In the first, human subjects perform a motion-direction discrimination task; the resulting thresholds are consistent with those previously reported in the literature. In the second, barn owls are trained to turn their heads in response to virtual motion stimuli.
The convolution equation
The propagation of sound from any position in space to the eardrum is described by a linear transfer function, the HRTF. The HRTF contains information about both the delay and attenuation due to the propagation, and the spectral distortion due to the angular position of the sound source with respect to the head and torso. The time-domain counterpart of the HRTF is the head-related impulse response (HRIR), which will be denoted by h_x(t).
When a sound source travels along a trajectory x(t) in
Analysis of the convolution equation
As an illustration of the properties of Eq. (1), it is now shown that it implicitly contains the Doppler effect. To demonstrate this, let h be reduced to a delta function depending on the radial distance from the center of the head alone, i.e.

h_{x(t)}(τ) = δ(τ − r_t/c),

where r_t is the distance of x(t) from the center of the head and c is the speed of sound. δ(τ) possesses the following property:

δ(τ − r_{t−τ}/c) = δ(τ − τ_0) / |1 + ṙ_{t−τ_0}/c|,   (6)

where τ_0 solves τ − r_{t−τ}/c = 0. Substituting Eq. (6) in Eq. (1) yields an eardrum signal s(t − τ_0)/|1 + ṙ_{t−τ_0}/c|, in which the time-varying delay τ_0 produces the Doppler frequency shift. Consider a sound source
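The Doppler consequence of this point-source reduction can be checked numerically. Assuming the eardrum signal is the source delayed by τ_0(t), then for a source receding radially at constant speed v from range r_0, τ_0(t) = (r_0 + vt)/(c + v), and a tone of frequency f is received at f/(1 + v/c). A sketch under those assumptions (all numeric values are illustrative):

```python
import numpy as np

# Source receding radially at constant speed v from initial range r0.
# tau0 solves tau = r(t - tau)/c, giving tau0(t) = (r0 + v*t)/(c + v).
c, v, r0 = 343.0, 34.3, 10.0      # speed of sound, source speed, range (m)
f, fs, T = 1000.0, 96000, 1.0     # tone frequency (Hz), sample rate, duration (s)
t = np.arange(int(fs * T)) / fs
tau0 = (r0 + v * t) / (c + v)
y = np.sin(2 * np.pi * f * (t - tau0)) / (1 + v / c)  # delay + amplitude factor

# Estimate the received frequency from the zero-crossing rate.
crossings = np.sum(np.abs(np.diff(np.signbit(y).astype(np.int8))))
f_received = crossings / (2 * T)
f_expected = f / (1 + v / c)      # Doppler prediction for a receding source
```

With v = c/10 the received tone sits near 909 Hz, i.e. shifted down by the expected factor 1/(1 + v/c).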
Discussion
We presented an accurate algorithm for synthesizing virtual space sound. While virtual space sound algorithms based on measured HRIR sets have been described previously (Jenison et al., 1998), our algorithm attempts to achieve accurate reconstruction of the eardrum waveform without extraneous assumptions. The validity of this equation is tested directly by comparing waveforms recorded during real motion with those estimated by the algorithm. The applicability of the algorithm for auditory
Acknowledgements
The authors thank Professor Hermann Wagner and Nachum Ulanovsky for comments on the manuscript, and Yehoshua Yehuda for help with programming the motor. This work was supported by a grant from the German–Israeli Foundation (GIF).
References (48)
- Jiang et al. Responses of cells to stationary and moving sound stimuli in the anterior ectosylvian cortex of cats. Hearing Res. (2000)
- Keller et al. Head-related transfer functions of the barn owl: measurement and neural responses. Hearing Res. (1998)
- Rice et al. Pinna-based spectral cues for sound localization in cat. Hearing Res. (1992)
- Sovijärvi and Hyvärinen. Auditory cortical neurons in the cat sensitive to the direction of sound source movement. Brain Res. (1974)
- Spezio et al. Head-related transfer functions of the Rhesus monkey. Hearing Res. (2000)
- Eye movements of the owl. Vision Res. (1973)
- Ahissar et al. Encoding of sound-source location and movement: activity of single neurons and interactions between adjacent neurons in the monkey auditory cortex. J. Neurophysiol. (1992)
- Baumgart et al. A movement-sensitive area in auditory cortex. Nature (1999)
- Blauert. Spatial Hearing: The Psychophysics of Human Sound Localization (1997)
- Sound localization
- Localization of noise bands by old world monkeys. J. Acoust. Soc. Am.
- Brugge et al. The structure of spatial receptive fields of neurons in primary auditory cortex of the cat. J. Neurosci. (1996)
- Grantham. Detection and discrimination of simulated motion of auditory targets in the horizontal plane. J. Acoust. Soc. Am. (1986)
- Grantham. Auditory motion perception: snapshots revisited (1997)
- Griffiths et al. Right parietal cortex is involved in the perception of sound movement in humans. Nat. Neurosci. (1998)
- Hartmann and Wittenberg. On the externalization of sound images. J. Acoust. Soc. Am. (1996)
- Imig et al. Single-unit selectivity to azimuthal direction and sound pressure level of noise bursts in cat high-frequency primary auditory cortex. J. Neurophysiol. (1990)
- Jenison et al. A spherical basis function neural network for modeling auditory space. Neural Comp. (1998)
- Synthesis of virtual motion in 3D auditory space. IEEE Eng. Med. Biol.
- A model of head-related transfer functions based on principal components analysis and minimum-phase reconstruction. J. Acoust. Soc. Am.
- Sound localization by the barn owl (Tyto alba) measured with the search coil technique. J. Comp. Phys.