Abstract
A major cue to infer sound direction is the difference in arrival time of the sound at the left and right ears, called interaural time difference (ITD). The neural coding of ITD and its similarity across species have been strongly debated. In the barn owl, an auditory specialist relying on sound localization to capture prey, ITDs within the physiological range determined by the head width are topographically represented at each frequency. The topographic representation suggests that sound direction may be inferred from the location of maximal neural activity within the map. Such topographical representation of ITD, however, is not evident in mammals. Instead, the preferred ITD of neurons in the mammalian brainstem often lies outside the physiological range and depends on the neuron's best frequency. Because of these disparities, it has been assumed that how spatial hearing is achieved in birds and mammals is fundamentally different. However, recent studies reveal ITD responses in the owl's forebrain and midbrain premotor area that are consistent with coding schemes proposed in mammals. Particularly, sound location in owls could be decoded from the relative firing rates of two broadly and inversely ITD-tuned channels. This evidence suggests that, at downstream stages, the code for ITD may not be qualitatively different across species. Thus, while experimental evidence continues to support the notion of differences in ITD representation across species and brain regions, the latest results indicate notable commonalities, suggesting that codes driving orienting behavior in mammals and birds may be comparable.
Introduction
A sound originating at a specific location horizontally away from the midline of the head will reach the two ears at different times. The timing primarily depends on both the sound location relative to the two ears and the distance between the ears. The binaural difference in the arrival time of the sound, referred to as interaural time difference (ITD), is a major cue for sound localization. Since the original proposal by Jeffress (1948) of a neural circuit to account for human sound localization, the neural representation of ITD has been the source of much debate (Schnupp and Carr, 2009; Grothe et al., 2010; Ashida and Carr, 2011). Jeffress (1948) proposed that the auditory system contains a circuit consisting of delay lines and coincidence detector neurons that creates a map of ITD across the physiological range, which is the range of ITD naturally encountered given the spacing of the ears. A topographical arrangement of ITD-sensitive neurons consistent with the Jeffress model has been observed in the avian brainstem (Sullivan and Konishi, 1986; Carr and Konishi, 1990; Peña et al., 2001; Köppl and Carr, 2008; Carr et al., 2015). However, such a topographic arrangement of ITD-sensitive neurons has not been found in mammals. This discrepancy has led to a prevalent view that the ITD code supporting sound localization is fundamentally different in birds and mammals (Schnupp and Carr, 2009; Grothe et al., 2010; Ashida and Carr, 2011; Palanca-Castan and Köppl, 2015).
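As a rough illustration of how ITD depends on source direction and ear spacing, the sketch below uses a far-field, straight-path approximation, ITD = (d / c) sin(azimuth). This simplified geometry, the function names, and the speed-of-sound value are assumptions introduced here for illustration. They ignore diffraction around the head and are not taken from the studies cited above.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C


def itd_seconds(azimuth_deg, ear_distance_m):
    """ITD under a far-field, straight-path approximation.

    azimuth_deg: source angle from the midline (0 = straight ahead).
    ear_distance_m: distance between the two ears in meters.
    """
    return (ear_distance_m / SPEED_OF_SOUND_M_S) * math.sin(math.radians(azimuth_deg))


def physiological_range_seconds(ear_distance_m):
    """Largest |ITD| in this model, reached with the source at +/-90 degrees."""
    return ear_distance_m / SPEED_OF_SOUND_M_S


if __name__ == "__main__":
    ear_distance = 0.05  # hypothetical ear spacing in meters
    for azimuth in (0, 30, 60, 90):
        itd_us = itd_seconds(azimuth, ear_distance) * 1e6
        print(azimuth, "deg ->", round(itd_us, 1), "microseconds")
```

Under this approximation the physiological range scales linearly with ear spacing, so a smaller head yields a proportionally narrower range of naturally occurring ITDs.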
In both birds and mammals, the timing of acoustic signals is sensed by receptor cells of the cochlea and transmitted to the cochlear nuclei through the auditory nerve fibers. ITD is first detected by coincidence detector neurons receiving bilateral projections from the cochlear nuclei, located in the nucleus laminaris in birds and alligators (Carr and Konishi, 1990; Köppl and Carr, 2008; Kettler and Carr, 2019) and the medial superior olive (MSO) in mammals (Goldberg and Brown, 1969; Yin and Chan, 1990; Spitzer and Semple, 1995). Information about ITD is conveyed to the inferior colliculus (IC) across vertebrate species.
Qualitative differences between mammals and birds have been reported in central regions of the IC and in brainstem ITD-detector neurons (McAlpine et al., 2001; Brand et al., 2002; Joris et al., 2006). Neurons in the central regions of the IC and brainstem ITD-detector neurons are narrowly tuned to frequency in both birds and mammals, but ITD tuning differs, as shown in Figure 1. In birds, IC (Wagner et al., 2007) and brainstem ITD-detector cells (Carr and Konishi, 1990; Köppl and Carr, 2008) are largely tuned to ITDs across the physiological range for all sound frequencies, suggesting a labeled line code where sound location is inferred from the location in the brain where ITD-selective cells are active (i.e., a place code) (Fig. 1A). In contrast, ITD-tuned neurons in the mammalian IC and MSO, especially those recorded in smaller rodents and tuned to low frequencies, are tuned to ITDs outside the physiological range (McAlpine et al., 2001; Brand et al., 2002; Grothe et al., 2010), and ITD tuning in the MSO and IC of mammals is markedly frequency-dependent: neurons tuned to lower frequencies respond best to larger ITDs than neurons tuned to higher frequencies (McAlpine et al., 2001; Hancock and Delgutte, 2004; Day and Semple, 2011; Bremen and Joris, 2013). These differences across species have led to the proposition that there are divergent codes for converting ITD to spatial position (Schnupp and Carr, 2009; Grothe et al., 2010). Instead of a place code, it has been hypothesized that mammals might use an opponent-channel code for detecting the location of sounds across the horizontal plane. In the opponent-channel code, neurons respond best to sounds in the contralateral hemifield and have ITD tuning curves with peaks outside the physiological range. Additionally, their responses change most rapidly for sounds arising near the midline, leading to a region of highest sensitivity in the front. This type of tuning to ITD is referred to as hemispheric tuning. 
It has been proposed that sound location can be decoded from the relative firing rates or the pattern of firing rates in two opposed hemispheric tuned channels created by populations of ITD-sensitive neurons in the two brain hemispheres (Fig. 1B) (McAlpine et al., 2001; Harper and McAlpine, 2004; Grothe et al., 2010; Lüling et al., 2011).
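The opponent-channel readout described above can be sketched with two mirrored, broadly tuned channels whose relative firing rate is mapped back to ITD. The logistic shape of the tuning curves, the slope parameter, the sign convention, and all function names are hypothetical choices made for this sketch, not a model taken from the cited studies.

```python
import math


def channel_rate(itd_us, side, slope_us=100.0):
    """Normalized rate (0..1) of one hemispheric channel.

    side=+1: rate grows with positive ITD; side=-1: mirrored channel.
    The rate keeps rising beyond any finite ITD, so the peak lies outside
    whatever physiological range is assumed.
    """
    return 1.0 / (1.0 + math.exp(-side * itd_us / slope_us))


def opponent_signal(itd_us, slope_us=100.0):
    """Relative firing rate of the two channels, in (-1, 1)."""
    a = channel_rate(itd_us, +1, slope_us)
    b = channel_rate(itd_us, -1, slope_us)
    return (a - b) / (a + b)


def decode_itd(signal, slope_us=100.0):
    """Invert the opponent signal.

    For mirrored logistic channels, a + b = 1 and
    a - b = tanh(itd / (2 * slope)), so the inverse is an atanh.
    """
    return 2.0 * slope_us * math.atanh(signal)


if __name__ == "__main__":
    for itd in (-200.0, -50.0, 0.0, 50.0, 200.0):
        s = opponent_signal(itd)
        print(itd, round(s, 3), round(decode_itd(s), 1))
```

Because the two curves cross at zero ITD, where their slopes are steepest, equal ITD steps change the opponent signal most near the midline, which corresponds to the region of highest sensitivity described in the text.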
Postulated ITD coding schemes in birds and mammals. A, Top, In the owl's midbrain space map, binaural neurons are sharply tuned to different ITDs (indicated by different colors) within the physiological range given by the head size (dashed lines). Bottom, This organization suggests a labeled-line code for sound localization, where the position of maximal activity within the map represents the source location. B, Top, ITD tuning of two hypothetical hemispheric populations tuned to ITDs within the contralateral space in the mammalian brain. The steepest slopes of their broad ITD tuning curves are positioned within the physiological range (dashed lines), but peak responses may be positioned beyond that range. Bottom, The relative activation of the two brain hemispheres provides an opponent-channel code for sound source location (McAlpine and Grothe, 2003). C, Population vector readout of the owl's midbrain space map. Each neuron in the space map is represented by a vector pointing toward its preferred direction (colored arrows). The overrepresentation of frontal space (dashed line) drives the resultant vector slightly to the front of the actual location, matching the underestimation of sound sources observed across species. Data from Fischer and Peña (2011).
As we discuss later, different explanations have been proposed for the frequency dependence of ITD tuning in mammals. In the opponent-channel code scheme, the dependency between ITD and frequency tuning observed in the mammalian MSO and IC may be fundamental for achieving high spatial acuity near the midline by positioning the steepest slopes of ITD-response functions over this region across neurons tuned to different frequencies (McAlpine et al., 2001; Harper and McAlpine, 2004; Schnupp and Carr, 2009). Complementary to the issue of decoding sound direction, recent theoretical work based on spike-timing-dependent plasticity underlying the emergence of ITD selectivity in detector neurons suggests that Hebbian learning may explain the observed relationship between ITD and frequency tuning (Fontaine and Brette, 2011). The coding principles underlying the frequency dependence of ITD tuning in mammals remain an open question.
In this review, we first provide a panoramic view of the literature describing spatial tuning of midbrain and forebrain neurons in mammalian and avian species. Next, we discuss recent findings in the avian auditory system that reveal similarities with coding schemes proposed for mammals. We then discuss how this coding emerges from a midbrain representation historically believed to be qualitatively different between birds and mammals. Finally, we conclude by providing a view that highlights similarities yet considers differences across species.
Midbrain space maps for gaze orientation
In barn owls, a map of auditory space emerges in the external nucleus of the IC (ICx), which is directly downstream of the central IC. The owl's midbrain map of auditory space and the neural processing leading to it have influenced the field for decades, due to its computational elegance and profound implications for neural coding (Moiseff and Konishi, 1981, 1983; Takahashi et al., 1984; Takahashi and Konishi, 1986; Moiseff, 1989; Carr and Konishi, 1990; Peña and Konishi, 2001; Fischer and Peña, 2011). The emergence of spatial tuning in neurons of the owl's space map relies on the combination selectivity to ITD and an additional binaural cue, the interaural level difference (ILD) (Moiseff and Konishi, 1983; Peña and Konishi, 2001). The combination selectivity to ITD and ILD results from a multiplicative integration of the tunings to ITD and ILD (Peña and Konishi, 2001). The emergence of spatial tuning in ICx also depends on convergence across frequency, which resolves the ambiguity in responses of ITD-tuned neurons at high frequencies (Takahashi and Konishi, 1986; Mazer, 1998; Peña and Konishi, 2000). The optic tectum (OT) receives input from ICx and has a similar map of auditory space (Knudsen and Konishi, 1978; Knudsen, 1982). The OT, called the superior colliculus (SC) in mammals, represents the initial hub of the midbrain selection network (Knudsen, 2018). This laminar structure of space-specific neurons sends output to the forebrain (Wurtz et al., 2005; Knudsen, 2018) and to the midbrain premotor nuclei that mediate ballistic-orienting movements (du Lac and Knudsen, 1990; Masino, 1992). The topographic organization of spatial tuning observed in the owl's ICx and OT has historically suggested a local labeled-line coding, where spatial positions are inferred from the location of the peak activity of single or few ITD-sensitive neurons (Fig. 1A) (Konishi, 2003; Harper and McAlpine, 2004; Köppl and Carr, 2008; Schnupp and Carr, 2009).
In mammals, tuning to multiple sound localization cues that are associated with spatial locations emerges in the brachium of the IC (Slee and Young, 2014) where there is a coarse map of azimuth (i.e., horizontal coordinate) (Schnupp and King, 1997). There is some similarity with ICx in the barn owl in that neurons in the brachium of the IC are selective for multiple sound localization cues across frequency (Slee and Young, 2013, 2014). However, neurons in the brachium of the IC have additive, not multiplicative, tuning to ITD and ILD (Slee and Young, 2014). Additionally, their responses are dominated by ILD tuning, with ITD playing a weak role in spatial tuning (Slee and Young, 2013). An auditory space map of varied coarseness across species has been reported in the SC (Palmer and King, 1982; Middlebrooks and Knudsen, 1984; King, 1993; King et al., 1996; Gaese and Johnen, 2000; Lee and Groh, 2014). The SC receives multimodal sensory inputs (Meredith and Stein, 1983), including a space-selective one from the brachium of the IC, but shows a clearer topographic representation of space, suggesting further refinement of spatial tuning in this region (Schnupp and King, 1997). While there are differences in the characteristics and synthesis of the midbrain space maps between barn owls and mammals, the combination of multiple sound localization cues to support direction of gaze is a common underlying mechanism.
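The difference between multiplicative combination of ITD and ILD tuning (reported for the owl's ICx) and additive combination (reported for the mammalian brachium of the IC) can be made concrete with a toy example. The Gaussian tuning shapes, the tuning widths, and the function names below are hypothetical and serve only to contrast the two operations.

```python
import math


def gaussian(x, center, width):
    """Unit-height Gaussian tuning curve."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def multiplicative_response(itd_us, ild_db, best_itd_us=0.0, best_ild_db=0.0):
    """Product of ITD and ILD tuning: responds only if both cues match."""
    return gaussian(itd_us, best_itd_us, 30.0) * gaussian(ild_db, best_ild_db, 5.0)


def additive_response(itd_us, ild_db, best_itd_us=0.0, best_ild_db=0.0):
    """Average of ITD and ILD tuning: responds if either cue matches."""
    return 0.5 * (gaussian(itd_us, best_itd_us, 30.0) + gaussian(ild_db, best_ild_db, 5.0))


if __name__ == "__main__":
    # Both cues at the preferred values, then ILD moved far from its best value.
    for itd, ild in ((0.0, 0.0), (0.0, 30.0)):
        print(itd, ild,
              round(multiplicative_response(itd, ild), 4),
              round(additive_response(itd, ild), 4))
```

In this sketch the multiplicative unit behaves like a logical AND over the two cues, which restricts its response to a single combination of ITD and ILD, whereas the additive unit still responds at half strength when only one cue matches.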
Spatial tuning in the auditory cortex
Differences between avian and mammalian ITD coding have focused on the topographical representation of ITD in birds and the opponent-channel hemispheric representation in mammals. However, studies of spatial tuning in the mammalian auditory cortex have described a range of representations of sound location, not always consistent with the opponent-channel theory. While there is evidence in cats and primates of coarse spatial selectivity to peripheral locations in the contralateral space (Middlebrooks and Pettigrew, 1981; Middlebrooks et al., 1994, 1998; Brugge et al., 1996; Woods et al., 2006) that is consistent with the opponent-channel code, there are also reports of more heterogeneous cortical representations, with neurons tuned not only to peripheral, but also to frontal auditory space (Middlebrooks and Pettigrew, 1981; Stecker et al., 2003; Woods et al., 2006; Lee and Middlebrooks, 2013; Remington and Wang, 2019). Studies in humans, both physiological (Magezi and Krumbholz, 2010; Salminen et al., 2010; Briley et al., 2013) and psychophysical (Boehnke and Phillips, 1999; Phillips and Hall, 2005; Vigneault-MacLean et al., 2007), also support the idea of the opponent-channel model with two opposing broadly tuned sets of neurons encoding azimuthal space, consistent with studies in nonhuman mammals (McAlpine et al., 2001; Stecker et al., 2005; Werner-Reiss and Groh, 2008). However, mammalian physiological studies have reported a substantial number of neurons in the auditory cortex displaying sensitivity to sound location near the midline at lower sound levels (Zhang et al., 2004; Higgins et al., 2010; Razak and Fuzessery, 2010; Zhou and Wang, 2012; Lee and Middlebrooks, 2013; Belliveau et al., 2014). Similarly, EEG studies in humans have provided evidence of a midline spatial channel in the auditory cortex (Briley et al., 2016), suggesting that the human brain may contain a representation dedicated specifically to azimuth positions near the midline. 
Furthermore, studies in several mammalian species have shown that unilateral lesions and cooling of cortical and subcortical areas induce localization deficits restricted to the contralateral space (Jenkins and Masterton, 1982; Thompson and Cortez, 1983; Kavanagh and Kelly, 1987; Malhotra et al., 2008; Wood et al., 2017). These findings indicate that one side of the brain is necessary and sufficient to localize sound in the opposite side of space. Although not inconsistent with data showing left and right broadly tuned channels, they are inconsistent with the hypothesis that localization depends on the comparison of activity between the two sides of the brain, and have therefore motivated sound localization models that depart from the opponent-channel theory, including a third channel representing midline locations in models of azimuth localization (Dingle et al., 2010) or a labeled line code (Belliveau et al., 2014).
Like studies that investigated spatial tuning of cortical neurons along the azimuth, the examination of ITD tuning in the auditory cortex has produced conflicting results. While some studies in mammals, including humans, reported preferred ITDs corresponding to sounds arising far within the contralateral hemifield (Reale and Brugge, 1990; Magezi and Krumbholz, 2010; Salminen et al., 2010), others reported preferred ITDs largely covering the physiological range (Fitzpatrick et al., 2000; Scott et al., 2009; Belliveau et al., 2014). The nature of the ITD code supporting sound localization in the cortex may still be more complex than the two-channel, three-channel, or distributed coding schemes, in light of reported changes in temporal spiking pattern with sound location (Mickey and Middlebrooks, 2003) and task dependency of spatial tuning (Lee and Middlebrooks, 2013). The effects of anesthesia, stimulation paradigms, behavioral state, and hierarchy within cortical auditory pathways are also important considerations for understanding how spatial information is represented and read out in the mammalian auditory cortex.
New evidence on the coding of ITD in the owl's midbrain and forebrain
The overlap of new experimental evidence in the owl's auditory system with both observed response properties and proposed coding theories for mammals brings into question whether coding schemes downstream from sites of ITD detection are fundamentally different across mammalian and avian species. The dominant theory of sound localization in owls has been that a place code is used to infer sound location, where sound location is estimated from the place of highest activity in the map of space (Jeffress, 1948; Konishi, 2003; Schnupp and Carr, 2009). This theory has been challenged, however, for the midbrain map of space by recent evidence that it does not predict the systematic errors that owls make when locating sound at eccentric directions (Fischer and Peña, 2011; Cazettes et al., 2016). These studies have proposed that the owl's sound-localizing behavior could indeed be explained by a population vector decoding of the midbrain map, where the larger number of neurons tuned to frontal locations drives the vector to the front of the actual location, explaining these errors (Fig. 1C). According to this theory, each neuron in the space map is associated with a vector pointing toward its preferred direction, corresponding to the peak of the spatial tuning curve. The length of this vector is scaled to the size of the neuron's response at each direction. Averaging over the population results in a vector pointing toward the decoded direction. Population vector decoding, originally developed for predicting arm movements (Georgopoulos et al., 1986), has been motivated by its algorithmic simplicity. 
For decades, studies have reported that the firing rate of neurons can be correlated with vector quantities in many neural systems, such as the premotor and motor cortices (Georgopoulos et al., 1986; Caminiti et al., 1991) and cerebellum (Fortier et al., 1989) for monkey arm movement, the cercal system of the cricket for wind detection (Theunissen and Miller, 1991), the mammalian visual system for motion perception (Steinmetz et al., 1987; Gilbert and Wiesel, 1990; Churchland and Lisberger, 2001), or in the SC of vertebrates for saccade direction (Van Gisbergen et al., 1987; Lee et al., 1988). Recently, a population-vector readout of a hemispheric ITD representation has also been invoked to explain the repulsive effect of adaptation on history-dependent azimuth estimation in human listeners (Lingner et al., 2018). Yet, a limited number of studies have proposed theories for where and how the brain could implement such vector quantities (McNaughton et al., 1994; Salinas and Abbott, 1994; Burgess and O'Keefe, 1996), and even fewer studies have attempted to investigate these questions experimentally (Gonzalez-Bellido et al., 2013).
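The population vector readout described above can be sketched in a few lines. Each model neuron contributes a unit vector along its preferred direction, scaled by its response, and the decoded direction is the angle of the summed vector. The way frontal space is overrepresented here, the Gaussian tuning with a single fixed width, and all names and parameter values are illustrative assumptions, not the model fitted in the cited studies.

```python
import math


def preferred_directions(half_count=100, max_azimuth_deg=90.0):
    """Preferred azimuths in degrees, packed more densely near the front (0)."""
    directions = []
    for i in range(-half_count, half_count + 1):
        u = i / half_count  # evenly spaced in [-1, 1]
        directions.append(max_azimuth_deg * u * abs(u))  # compresses values toward 0
    return directions


def tuning_response(source_deg, preferred_deg, width_deg=15.0):
    """Gaussian spatial tuning curve (hypothetical shape and width)."""
    return math.exp(-0.5 * ((source_deg - preferred_deg) / width_deg) ** 2)


def population_vector_estimate(source_deg, directions):
    """Angle of the response-weighted sum of preferred-direction vectors."""
    x = 0.0
    y = 0.0
    for preferred in directions:
        rate = tuning_response(source_deg, preferred)
        angle = math.radians(preferred)
        x += rate * math.cos(angle)  # frontal component
        y += rate * math.sin(angle)  # lateral component
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    neurons = preferred_directions()
    for source in (0.0, 30.0, 60.0):
        print(source, round(population_vector_estimate(source, neurons), 2))
```

Because more model neurons prefer frontal directions, the estimate for an eccentric source falls short of the true azimuth, which is the kind of systematic underestimation the population vector account is meant to capture.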
Other findings challenge the notion that the selectivity of neurons in the owl's midbrain map of auditory space is purely defined by their tuning to interaural time and level differences (Konishi, 2003). Because convergence across frequency is a primary mechanism underlying spatial tuning of space-specific neurons (Takahashi and Konishi, 1986; Mazer, 1998), an implicit notion is that the emergence of midbrain maps of auditory space correlates with a fading of the tonotopic organization widely observed in upstream areas of the owl's auditory system (Konishi, 2003). However, recent studies suggest a less simplistic view. Since the discovery of the space map, it has been known that the spatial tuning of neurons in the front of the space map is sharper than in the periphery (Knudsen and Konishi, 1978). Recent work shows that the basis of this nonuniform tuning is the varying frequency selectivity of neurons (Cazettes et al., 2016). Correlation between ITD and frequency tuning has been reported in the owl's space map (Knudsen, 1984; Cazettes et al., 2014). These findings indicate that the classic map of auditory space in the owl can still be viewed as a tonotopic representation, as high frequencies are represented in the front and low frequencies in the periphery of the map (Fig. 2A). Indeed, the best ITD increases as the preferred frequency decreases, in a manner that, albeit different in range, is reminiscent of that observed in mammals (Fig. 2B) (McAlpine et al., 2001; Hancock and Delgutte, 2004; Day and Semple, 2011; Bremen and Joris, 2013). This relationship between best ITD and preferred frequency leads to predictable changes in the owl's sound localization behavior, such as better accuracy localizing low-frequency than high-frequency narrowband sounds in the periphery (Cazettes et al., 2018).
Frequency tuning of ITD-selective cells in birds and mammals. A, Best ITD (absolute value) is correlated with best frequency in neurons of the barn owl's map of auditory space, where neurons selective to ITDs corresponding to sounds located in frontal positions respond to higher frequencies. Adapted from Cazettes et al. (2014). B, Best ITD (absolute value) also varies as a function of best frequency in neurons of the mammalian auditory pathway. Adapted from McAlpine et al. (2001). Neurons that prefer small ITDs are also tuned to higher frequencies.
Additionally, recent evidence suggests different coding schemes of sound location downstream of the midbrain maps. Studies in the owl's forebrain have reported evidence of a transformation in the neural code of ITD, where convergence across ITD channels leads to quasi-hemispherical ITD tuning (Vonderschen and Wagner, 2009, 2012). Furthermore, recent comparative population recordings in the owl's midbrain and forebrain showed differing correlation structure of nearby cells, where the hemispherical tuning of forebrain cells was associated with a significant drop in correlated variability of nearby cells, eliminating the potentially detrimental effect of noise correlation on information in a putative forebrain opponent-channel code (Beckert et al., 2017).
Emergence of hemispheric tuning from a brain map of auditory space
A critical question remains of how seemingly different ITD coding schemes observed in different parts of the brain of mammals and birds are functionally connected. Recent studies investigated the processing that links the sensory representation of ITD in the barn owl's space map to the motor command guiding head-orienting responses. The spatial tuning of neurons of a nucleus downstream from the space map, the midbrain tegmentum, which commands the owl's gaze via projections to the spinal cord (Masino, 1992; Masino and Knudsen, 1993), was measured for the first time (Cazettes et al., 2018). Consistent with previous reports (Masino, 1992; Masino and Knudsen, 1993), a topographic organization of ITD tuning was not observed in the midbrain tegmentum, suggesting that the map of auditory space disappears in this nucleus. Surprisingly, however, these premotor neurons displayed hemispheric ITD tuning, as predicted by the opponent-channel theory. Unlike neurons in the space map, which are sharply tuned to ITD, and therefore space (Fig. 3A,B), midbrain tegmentum neurons responded to the whole contralateral hemifield and displayed sigmoid-like ITD tuning curves (Fig. 3C,D). The peak response of their ITD tuning curves lay on the contralateral side, away from the midline (Fig. 3D). Additionally, the premotor neurons efficiently accounted for the owl's behavior with responses that are consistent with a population-vector readout of the midbrain map (Cazettes et al., 2018), as predicted by Fischer and Peña (2011).
Spatial and ITD tuning in the owl's midbrain map and midbrain tegmentum. A, Spatial tuning of an example neuron in the midbrain map. B, Example ITD tuning curves of three neurons in the midbrain map selective for different ITDs. C, Spatial tuning of an example midbrain tegmentum cell. D, Example ITD tuning curves of three midbrain tegmentum neurons. Firing rate is normalized from maximum to minimum in all plots. Different colors represent different neurons. Modified from Cazettes et al. (2018).
Average midbrain tegmentum responses in each brain hemisphere increased for ITDs corresponding to the contralateral side of space (Fig. 4A). These tuning curves are consistent with hypothesized hemispheric population responses to ITD underlying the inference of sound location in mammals (McAlpine and Grothe, 2003) (Fig. 1B). Additionally, the ITD tuning of midbrain tegmentum neurons varied with frequency (Fig. 4B) in a manner that matches the dependency between frequency and ITD tuning in the midbrain map, from which it receives projections (Knudsen, 1984; Cazettes et al., 2014, 2018). When neurons were stimulated with high-frequency tones, the ITD tuning peaked at small ITDs; whereas when neurons were stimulated with low-frequency tones, the ITD tuning peaked at larger ITDs (Fig. 4B) (Cazettes et al., 2018). This pattern is consistent with responses of ITD-sensitive neurons in the mammalian IC (Fig. 4C) (McAlpine et al., 2001).
ITD tuning in midbrain tegmentum: emergence of neural responses reminiscent of the opponent-channel code hypothesized in mammals. A, Averaged ITD tuning curves (mean ± SE) of midbrain tegmentum neurons in left (red) and right (blue) hemispheres. The curves cross at the midline and the steepest slopes are positioned within the physiological range (black dashed lines). Note the similarity with postulated population responses in Figure 1B. Modified from Cazettes et al. (2018). B, Example ITD tuning curves of a single midbrain tegmentum cell, stimulated with tones of different frequencies. Adapted from Cazettes et al. (2018). C, Average rate-ITD curves of sets of inferior colliculus cells with best frequencies within frequency bands centered on 250 Hz (purple), 325 Hz (blue), 500 Hz (green), 700 Hz (yellow), 1.0 kHz (orange), and 1.4 kHz (red). Adapted from McAlpine et al. (2001). D, Convergence of coding schemes. The opponent-channel coding scheme proposed for mammals (left) emerges downstream of the map of auditory space in the owl (right).
Notably, midbrain tegmentum responses could be explained by a weighted convergence of midbrain map responses (Cazettes et al., 2018), consistent with the population-vector readout previously predicted (Fischer and Peña, 2011; Cazettes et al., 2018) (Fig. 1C). In sum, the ITD tuning of midbrain tegmentum premotor neurons shows striking resemblance with proposed neural responses supporting the opponent-channel sound localization theory in mammals. However, in barn owls, this code emerges from a map of auditory space generated upstream (Fig. 4D).
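How a weighted convergence of sharply tuned map neurons can yield broad, hemispheric ITD tuning is illustrated by the following sketch. The weights (proportional to each neuron's best ITD and restricted to one side), the evenly spaced best ITDs, the tuning width, and the assumed physiological range are hypothetical choices for illustration and are not the weights estimated by Cazettes et al. (2018).

```python
import math

MAX_ITD_US = 250.0  # assumed physiological range in microseconds (illustrative)


def map_neuron_rate(itd_us, best_itd_us, width_us=40.0):
    """Sharply tuned space-map neuron (Gaussian ITD tuning)."""
    return math.exp(-0.5 * ((itd_us - best_itd_us) / width_us) ** 2)


def premotor_rate(itd_us, side, step_us=10.0):
    """Weighted sum over map neurons with best ITDs tiling the assumed range.

    side=+1 pools neurons with positive best ITDs, side=-1 the mirror set.
    Each weight grows with the laterality of the neuron's best ITD.
    """
    n_steps = int(MAX_ITD_US / step_us)
    total = 0.0
    for k in range(-n_steps, n_steps + 1):
        best_itd = k * step_us
        weight = max(0.0, side * best_itd / MAX_ITD_US)
        total += weight * map_neuron_rate(itd_us, best_itd)
    return total


if __name__ == "__main__":
    for itd in (-200.0, -100.0, 0.0, 100.0, 200.0):
        print(itd, round(premotor_rate(itd, +1), 3), round(premotor_rate(itd, -1), 3))
```

In this toy model each pooled unit responds weakly to ITDs on one side, increasingly to ITDs on the other side, and the two mirrored units cross at zero ITD, which qualitatively resembles the sigmoid-like tuning described for the midbrain tegmentum.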
Functional role of frequency-dependent ITD tuning: integrating natural statistics into the space map readout
Contextual factors of acoustic environments can significantly affect the information about sound azimuth carried by the ITD cue. For example, for a given location, the squeaks of a mouse in a barn full of noisy livestock will be distorted differently compared with the sound emitted by the same mouse in a calm open meadow. This effect was quantified in owls by measuring the variability of the interaural phase difference (IPD) as concurrent sounds were played from different locations. This showed that concurrent sounds are expected to corrupt the IPD of a target signal in a location- and frequency-dependent manner (Cazettes et al., 2014). The variance of IPD, a statistical quantity of how much this critical spatial cue of a sound's azimuthal position varies across directions of secondary concurrent sounds, is a measure of sensory cue reliability. If a sound frequency from a given location carries an IPD that does not vary much across contexts, then this auditory cue can be relied upon. Consistent with this, Cazettes et al. (2014) showed that neurons in the owl's map of auditory space are selective to the frequencies that carry the most reliable IPDs at the azimuthal location that each neuron prefers. Notably, by virtue of the frequency-convergence mechanism underlying the emergence of spatial selectivity, the relationship between frequency tuning and cue reliability results in a correlation between the sharpness of spatial selectivity and sensory reliability. In particular, the frontal locations, which are associated with more reliable ITDs, are represented by cells with sharper spatial tuning (Cazettes et al., 2016). Furthermore, the coincidence detection mechanism underlying ITD detection (Carr and Konishi, 1990; Fischer et al., 2008) results in a relationship between the width of spatial tuning and signal-to-noise ratio, which allows the tuning width to be a correlate of ongoing sensory reliability (Cazettes et al., 2016).
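The idea of using IPD variability across contexts as a measure of cue reliability can be sketched as follows. Because phase is a circular quantity, this example uses circular variance (one minus the mean resultant length), which is an assumption made here; the cited study may have quantified variability differently. The sample values and band labels are made up for illustration.

```python
import math


def circular_variance(phases_rad):
    """Circular variance of phases in radians: 0 = identical, 1 = fully dispersed."""
    n = len(phases_rad)
    mean_cos = sum(math.cos(p) for p in phases_rad) / n
    mean_sin = sum(math.sin(p) for p in phases_rad) / n
    return 1.0 - math.hypot(mean_cos, mean_sin)


def rank_by_reliability(ipd_samples_by_band):
    """Band labels sorted from most reliable (lowest IPD variance) to least."""
    return sorted(ipd_samples_by_band,
                  key=lambda band: circular_variance(ipd_samples_by_band[band]))


if __name__ == "__main__":
    # Hypothetical IPDs (radians) of a target measured across several contexts.
    samples = {
        "band_a": [0.10, 0.12, 0.09, 0.11],
        "band_b": [0.10, 1.50, -1.20, 0.60],
    }
    for band in rank_by_reliability(samples):
        print(band, round(circular_variance(samples[band]), 3))
```

A frequency band whose IPD barely changes across contexts receives a variance near zero and is ranked as the more reliable cue for that location.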
Theoretical studies have shown that the nonuniform spatial acuity of neurons across the map and the overrepresentation of frontal space may be used by a population vector decoder to perform optimal inference of sound location (Fischer and Peña, 2011; Cazettes et al., 2016). In this framework, the overrepresentation of frontal space acts as a prior distribution that emphasizes frontal directions, and the pattern of activity across the population represents a likelihood function for direction, where narrow tuning reflects high reliability of sensory information and wide tuning reflects low reliability of sensory information. As a result, the population vector combines information about a prior distribution for direction and the reliability of the sensory evidence to perform Bayesian inference. Within this context, the multiplicative combination selectivity to spatial cues of neurons in the map (Peña and Konishi, 2001) is optimal for integrating multiple spatial cues. As these cues can be assumed statistically independent, their multiplicative integration results in sensory information carried by each cue having greater influence on the estimate of sound direction (Fischer and Peña, 2017). The above response properties result in a critical implication for coding: the nonuniform frequency and spatial selectivity of space map neurons can represent the statistics of spatial cues (Fischer and Peña, 2011; Rich et al., 2015; Cazettes et al., 2016). Interestingly, decoding a population of neurons with a population vector can also explain optimal inference in human visual orientation perception (Girshick et al., 2011).
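The interplay between a frontal prior and sensory reliability can be illustrated with the simplest textbook case, in which both the prior and the likelihood are Gaussian. This sketch computes the posterior mean directly rather than through a population vector, and the Gaussian forms, the prior width, and the function name are assumptions introduced here, not the formulation used in the cited studies.

```python
def posterior_mean_deg(observed_deg, likelihood_sd_deg, prior_sd_deg=30.0):
    """Posterior mean for a Gaussian prior centered on the front (0 degrees).

    observed_deg: direction indicated by the sensory evidence.
    likelihood_sd_deg: width of the likelihood (larger = less reliable cue).
    prior_sd_deg: width of the frontal prior (hypothetical value).
    """
    prior_var = prior_sd_deg ** 2
    likelihood_var = likelihood_sd_deg ** 2
    # Standard Gaussian result: the estimate shrinks toward the prior mean (0)
    # by a factor that depends on the relative variances.
    return observed_deg * prior_var / (prior_var + likelihood_var)


if __name__ == "__main__":
    for sd in (5.0, 15.0, 30.0):
        print(sd, round(posterior_mean_deg(60.0, sd), 2))
```

With reliable evidence (narrow likelihood) the estimate stays close to the observed direction, whereas with unreliable evidence it is drawn toward the front, mirroring the roles that tuning width and frontal overrepresentation play in the framework described above.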
A fundamental question has been whether and how statistical information is conveyed from the sensory midbrain map into premotor responses. Recent work has found that the firing rate of neurons in the owl's midbrain tegmentum captures sensory cue statistics through a weighted sum of responses of neurons in the space map (Cazettes et al., 2018). Thus, information on the statistics of sensory cues contained in population responses of the space map is represented by the firing rate of midbrain tegmentum premotor neurons, providing experimental evidence that optimal behavioral commands can be generated from the decoding of a sensory representation. The notion that sound-localizing behavior follows rules of optimality is not limited to owls, having been invoked in mammals, including humans (Harper and McAlpine, 2004; Dosso and Wilmut, 2012; Parise et al., 2014; Reijniers et al., 2014; Mlynarski, 2015; Pavao et al., 2018).
Several pieces of evidence suggest a common role of sensory statistics in driving neural coding of sound localization across species. The finding that the owl's midbrain space map reflects optimal cue combination (Fischer and Peña, 2017) may also apply to mammals, even though the way that ITD, ILD, and spectral sound localization cues contribute to the map formation may differ in birds and mammals. Indeed, the predominant use of ITD in the fine structure of sounds at low frequencies, and ILD and envelope ITD at higher frequencies as the cues for azimuth (Lord Rayleigh, 1876; Stevens and Newman, 1936; Henning, 1974) is consistent with the reliability of those cues as computed by MSO and lateral superior olive neurons. In addition, the multiplicative combination of ITD and ILD cues demonstrated in the barn owl's midbrain neurons (Peña and Konishi, 2001), an optimal operation for integration of statistically independent cues (Fischer and Peña, 2017), was first proposed for human binaural cue integration (Stern and Colburn, 1978). Furthermore, recent human studies indicate that ILD cue statistics explain high-frequency spatial acuity in humans (Brown et al., 2018). While the population vector may not describe behavior in all species, other implementations of optimal inference may be used (Day and Delgutte, 2013; Goodman et al., 2013). Future work will elucidate the fundamental question of how sensory statistics are integrated into behavioral commands across species and sensory modalities.
Conclusions
Experimental evidence shows that auditory responses of premotor tegmentum neurons in the owl's midbrain meet predictions of an opponent-channel coding scheme supporting the owl's orienting behavior (Cazettes et al., 2018). Computing this opponent-channel code requires a weighted sum over the space map, therefore departing from the classical place code model of sound localization in the barn owl (Fischer and Peña, 2011; Cazettes et al., 2016, 2018). A transformation from a place code to an opponent-channel code has previously been proposed for midbrain control of eye movements (Groh, 2001). Thus, the observation of a similar conversion in the midbrain tegmentum, a premotor area, supports the validity of this hypothesis. This conversion also reveals similarities between the ITD tuning in the avian brain and the opponent-channel code proposed for the mammalian brain. Even though the opponent-channel code is not universally accepted across mammalian species and brain regions, these results indicate points of overlap between avian and mammalian species, suggesting unifying principles in the representation of ITD (Fig. 4D). A recent study also suggests that the basis for auditory spatial adaptation may be comparable in ferrets and owls (Keating et al., 2015). These findings add to growing evidence connecting neural coding of auditory spatial cues across species.
That responses consistent with the opponent-channel theory of sound localization emerge in premotor neurons of the owl's midbrain suggests that codes driving orienting behavior in mammals and birds may be comparable. A critical factor in barn owl studies is that an adaptive behavioral command emerges from the direct readout of a sensory representation. This evidence is consistent with predictions of how the retinotopic and auditory maps in the SC could be read out to generate an opponent-channel code guiding orienting behavior (Groh, 2001; Lee and Groh, 2014). More generally, the transformation of an ITD-map in the OT into a representation where ITD tuning is quasi-monotonic, reaching maxima at peripheral locations, in premotor neurons of the barn owl provides further evidence that generating sound-driven movements requires converting a place code into an opponent-channel code. To the extent that animal behavior is used as a metric for both perceptual and motor functions, how such transformation is accomplished by the brain may be significant for understanding both motor coordination and sensory perception.
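The conversion from a place code to an opponent-channel code described above can be sketched in a few lines of Python. This is a toy illustration, not the circuit reported in the cited work: the map layout (`PREFS`, in microseconds), the Gaussian tuning width, the rectified-linear readout weights, and the normalization by total map activity are all assumptions introduced for this example.

```python
import math

# Hypothetical ITD map: preferred ITDs from -300 to 300 microseconds, 10-us steps.
PREFS = [-300.0 + 10.0 * i for i in range(61)]

def map_response(stim_itd, tuning_sd=30.0):
    """Place code: assumed Gaussian tuning, so activity peaks at the stimulus ITD."""
    return [math.exp(-0.5 * ((stim_itd - p) / tuning_sd) ** 2) for p in PREFS]

def channel_rates(stim_itd):
    """Two opponent channels, each a weighted sum over the map.

    Weights grow with the eccentricity of the preferred ITD (assumed
    rectified-linear), one channel per side.
    """
    rates = map_response(stim_itd)
    right = sum(max(p, 0.0) * r for p, r in zip(PREFS, rates))
    left = sum(max(-p, 0.0) * r for p, r in zip(PREFS, rates))
    return left, right

def decode(stim_itd):
    """Estimate ITD from the channel difference, normalized by total map activity."""
    left, right = channel_rates(stim_itd)
    total = sum(map_response(stim_itd))
    return (right - left) / total

for itd in (-150.0, 0.0, 150.0):
    left, right = channel_rates(itd)
    print(itd, left, right, decode(itd))
```

With these assumptions, each channel's rate changes monotonically with stimulus ITD over the tested range and is largest at the periphery of its preferred side, while the normalized difference between the two channels recovers the stimulus ITD. The peaked tuning of individual map neurons is therefore replaced by two broadly and inversely tuned channels.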
Owls are auditory specialists that have evolved a high capability for capturing prey using acoustic cues. If mammals and owls share coding strategies supporting sound azimuth localization, a question arises regarding the functional benefit of a space map in the owl's midbrain. The existence of a space map that recapitulates the Jeffress model is uncontested in the barn owl. Seminal work by Harper and McAlpine (2004) provided an explanation of how topographic ITD representations could maximize information, based on the species' head size and the frequency range over which ITD is encoded. Although this theory may not apply to all species (Kettler and Carr, 2019), it does apply to barn owls, given their physiological range of ITD and the unusually broad frequency range over which their brain can encode ITD (Carr and Konishi, 1990; Köppl, 1997). Recent findings showing that this space map is also a frequency map that excludes frequency ranges carrying unreliable spatial information (Cazettes et al., 2014) and can represent sensory reliability (Cazettes et al., 2016) further illuminate the evolutionary advantage of this specialization.
Footnotes
This work was supported by National Institutes of Health Grants DC012949, DC007690, and NS104911. We thank Catherine Carr and Dan Tollin for feedback; and Roland Ferger and Keanu Shadron for comments on the manuscript.
The authors declare no competing financial interests.
Correspondence should be addressed to Jose L. Peña at jose.pena{at}einstein.yu.edu