Research report
Pitch vs. spectral encoding of harmonic complex tones in primary auditory cortex of the awake monkey
Introduction
Pitch is a fundamental feature of auditory perception. It underlies recognition of gender and intonation in speech and forms a basis for our appreciation of music. On a more primitive level, the pitch percept can be viewed as a product of auditory scene analysis, the process by which the auditory system integrates or segregates overlapping frequencies generated by multiple sources in order to construct accurate representations of sound sources in the environment [5]. The pitch evoked by a pure sinusoidal tone correlates with its frequency. However, natural sounds almost always are composed of many simultaneously occurring frequencies. In addition, these component frequencies often are harmonically related, i.e., they are consecutive integer multiples, or harmonics, of a common fundamental frequency (f0). Our auditory experience generally consists not of the pitches of the individual harmonics of complex sounds (spectral pitch), but rather of a unified global pitch equal to the f0. Most importantly, the global pitch of the harmonic complex persists even when spectral energy at the f0 is completely absent from the stimulus.
This phenomenon, known as the pitch of the missing fundamental (also as residue pitch, periodicity pitch and virtual pitch), serves as the foundation for hypotheses regarding the extraction of pitch information from complex sounds [16, 27, 44, 45, 59, 64]. Current theories of pitch perception emerge from two different, although not necessarily incompatible, perspectives on auditory system function. One conceptualizes the pitch mechanism as involving an initial spectral analysis of complex sounds based on the place of maximal auditory nerve fiber activation along the basilar membrane. The perceived pitch is subsequently derived by matching the resolved spectral information to a central harmonic template corresponding to the f0 of which the components are integer multiples [16, 19, 59, 64]. In contrast to spectral pattern-recognition models, temporal models of pitch representation are based on the auditory system's ability to follow the periodicity of the composite waveform of a harmonic complex tone. Since the repetition rate of the amplitude-modulated envelope arising from the summation of component harmonics is equal to the f0, even in the absence of spectral energy at that frequency, pitch information is conveyed by the temporal firing pattern of neurons phase-locked to the f0 [7, 8, 20, 21, 27, 28, 47]. While the debate between proponents of spectral pattern-recognition models and temporal models of pitch encoding has assumed an either–or character, ample psychoacoustic evidence has accumulated supporting the coexistence of two mechanisms, one operating in the frequency domain for stimuli containing aurally resolved harmonics and the other operating in the time domain for stimuli consisting of upper harmonics that the peripheral auditory system fails to resolve [16, 19, 20, 52].
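The two classes of model can be illustrated with a toy signal. The sketch below is not the method of this study or of any cited model. It assumes a hypothetical complex made of harmonics 3 to 5 of a 200 Hz f0, sampled at 8 kHz, and all function names are invented for illustration. It checks that the waveform carries no spectral energy at the f0, then recovers the f0 in two ways: from the waveform periodicity (a crude autocorrelation stand-in for temporal models) and from the component frequencies (a crude stand-in for harmonic template matching).

```python
import math

def harmonic_complex(f0, harmonics, fs, n_samples):
    """Sum equal-amplitude cosine harmonics of f0 (hypothetical stimulus)."""
    return [sum(math.cos(2 * math.pi * h * f0 * t / fs) for h in harmonics)
            for t in range(n_samples)]

def dft_magnitude(signal, freq, fs):
    """Normalized magnitude of the Fourier component at one frequency (Hz)."""
    re = sum(x * math.cos(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    return math.hypot(re, im) / len(signal)

def autocorr_pitch(signal, fs, fmin, fmax):
    """Temporal estimate: fs divided by the lag with the largest autocorrelation.

    The unnormalized sum slightly favors shorter lags, so the first
    waveform period wins over its integer multiples.
    """
    shortest, longest = int(fs / fmax), int(fs / fmin)
    def ac(lag):
        return sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))
    best_lag = max(range(shortest, longest + 1), key=ac)
    return fs / best_lag

def template_pitch(components, candidates, tol=0.01):
    """Spectral estimate: highest candidate f0 of which every component
    is (within tol) an integer multiple. Returns None if nothing fits."""
    for f0 in sorted(candidates, reverse=True):
        ratios = [c / f0 for c in components]
        if all(round(r) >= 1 and abs(r - round(r)) <= tol for r in ratios):
            return f0
    return None

fs, f0 = 8000, 200                               # assumed values, illustration only
tone = harmonic_complex(f0, [3, 4, 5], fs, 800)  # components at 600, 800, 1000 Hz

energy_at_f0 = dft_magnitude(tone, f0, fs)       # essentially zero
energy_at_h3 = dft_magnitude(tone, 3 * f0, fs)   # clearly nonzero
temporal_f0 = autocorr_pitch(tone, fs, 100, 500)
spectral_f0 = template_pitch([600, 800, 1000], range(100, 501))
```

Both estimates land on the f0 even though the stimulus contains no component at that frequency, which is the property the missing-fundamental paradigm exploits. Real models add peripheral filtering and harmonic resolvability limits that this sketch omits.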
Given that natural sources typically produce sounds containing harmonically related components, it seems reasonable that neuronal populations might exist that respond to the harmonic features of sound, utilizing either or both of these strategies.
A tonotopic organization of primary auditory cortex (A1) has been demonstrated in a number of mammals (e.g., cat [31, 41], monkey [30, 33]). Studies based on positron emission tomography, auditory evoked potentials and auditory evoked magnetic fields have suggested a similar topographic representation in humans, with activity on the superior temporal plane occurring more laterally for tones of low frequency and more medially for tones of higher frequency [3, 26, 36]. Such a tonotopic organization has led to the proposal that the pitch of harmonic complex tones, with or without the f0, is mapped onto essentially the same regions as pure tone frequency in auditory cortex [29]. In a human neuromagnetic study designed to test this hypothesis, Pantev et al. [37] reported that the depth of the equivalent current dipole corresponding to the M100 component did not differ significantly in response to a 250 Hz pure tone and to a missing fundamental complex tone of equivalent pitch, whereas significant differences in depth were observed in response to those pure tones that were contained in the complex tone. These results were interpreted as demonstrating an organization in A1 based on pitch rather than frequency. An additional study replicated these results and included the finding that the corresponding virtual pitch representation was activated when the harmonics of the complex tone were distributed between ears, as predicted by psychoacoustic studies demonstrating the central formation of virtual pitch from dichotically presented harmonics [19, 35]. Because the M100 component is thought to be generated by activity in A1 and because auditory cortex lesions that include A1 disrupt missing fundamental perception in cats and humans, it is possible that A1 plays a key role in pitch perception for complex tones [36, 63, 66]. However, single-unit studies in awake macaques failed to find neurons that responded to the missing fundamental and, therefore, to pitch [51].
Instead, single-unit responses to missing fundamental stimuli were determined by the relationship between the spectral content of the stimulus and the pure tone neuronal tuning curves. The latter finding is consistent with the classical topographic organization of A1 based on frequency. Given the high degree of similarity in auditory cortical anatomy between macaques and humans, it seems unlikely that their primary auditory areas would exhibit functionally disparate organizations of representation [15]. Thus, the basis for pitch encoding of complex sounds in A1 remains controversial.
It is possible that the respective limitations of the techniques employed in these studies are responsible for the disparity of the results. While single-unit studies provide detailed information regarding the firing properties of individual cells, non-invasive studies monitor the synchronous synaptic activity of large neuronal populations, albeit with more limited spatial resolution. Thus, the supposition that the M100 magnetic response is uniquely generated in A1 without contributions from additional adjacent auditory areas is problematic. Consequently, the assumption of a single dipole generator within the superior temporal gyrus that is capable of adequately representing the auditory topographic organization for pitch may not be justified. On the other hand, it is possible that pitch encoding relies on the pattern of activity across large ensembles of neurons (i.e., population encoding), a dimension of analysis that is largely inaccessible using single-unit techniques. Synchronized responses of neuronal ensembles have been shown to represent the functional organization of auditory cortex more reliably than the single-unit firing rate, suggesting that concerted activity of neuronal populations may underlie encoding of perceptual features such as pitch [12].
The present study attempts to clarify whether A1 is characterized by topographic organization based on frequency or on pitch by bridging the single-unit level of analysis with that explored using non-invasive techniques measuring synchronous activity of neuronal populations. Simultaneous intracranial recordings of auditory evoked potentials (AEPs), multiunit activity (MUA) and the laminar distribution of current source density (CSD) derived from the AEP profiles were used to identify the location and magnitude of neuronal ensemble activation in A1 of the macaque monkey, as they relate to the encoding of harmonic complex tones missing the f0. The advantage of these techniques is that they measure both firing patterns and synchronized synaptic activity of neuronal populations with higher spatial resolution than in non-invasive studies. In addition, these techniques allow dissociation of cortical activity into its temporally discrete components and thereby permit analysis of individual phases of the response. As suggested by the reported uniqueness of the human M100 in correlating with pitch, these distinct response components may reflect activity in specific processing streams encoding specific attributes of environmental sounds [37]. Macaques share many features of auditory cortical anatomy and physiology with humans and also are capable of perceiving the missing fundamental of harmonic complex tones, making them a suitable animal model for exploring intracortical mechanisms underlying pitch encoding in the auditory system [15, 57, 60].
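The text excerpted here does not give the formula used to derive CSD from the laminar AEP profiles. A common one-dimensional approximation is the negative second spatial difference of the potential across equally spaced contacts, and the sketch below assumes that approach. The function name, the sign convention (negative values read as current sinks), the omission of tissue conductivity and the units are all assumptions made for illustration.

```python
def csd_second_difference(aep_by_depth, spacing_mm):
    """Approximate one-dimensional CSD from a laminar AEP profile.

    aep_by_depth: list of channels ordered by depth, each a list of
                  potentials over time (all channels the same length).
    spacing_mm:   distance between adjacent contacts (assumed constant).

    Returns one CSD trace per interior contact, i.e. two fewer channels
    than the input, because the edge contacts lack a neighbor on one side.
    Conductivity is taken as 1, so the output is in arbitrary units.
    """
    if len(aep_by_depth) < 3:
        raise ValueError("need at least three contacts")
    h2 = spacing_mm ** 2
    csd = []
    for k in range(1, len(aep_by_depth) - 1):
        above, here, below = aep_by_depth[k - 1], aep_by_depth[k], aep_by_depth[k + 1]
        # Negative second spatial difference at each time point.
        csd.append([-(a - 2 * c + b) / h2 for a, c, b in zip(above, here, below)])
    return csd

# Hypothetical profile: four contacts, two time points.
profile = [[0.0, 0.0], [1.0, 1.0], [4.0, 2.0], [9.0, 3.0]]
csd = csd_second_difference(profile, spacing_mm=1.0)
```

In this toy profile the first time point is quadratic in depth and yields a constant nonzero CSD, while the second is linear in depth and yields zero. This shows why CSD emphasizes local current flow over volume-conducted gradients.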
Materials and methods
Four adult male monkeys (Macaca fascicularis) were studied using methods previously reported [54]. All animals were housed in our AAALAC-accredited Animal Institute under daily supervision by veterinary staff. Briefly, under general anesthesia and using aseptic techniques, small holes were made in the exposed skull to accommodate epidural matrices of adjacently placed 18-gauge stainless steel tubes. Matrices were stereotaxically positioned to target A1 and were oriented at an angle of 30° from
Results
Results are based on a total of 16 electrode penetrations into A1 of four monkeys with BFs between 300 and 2000 Hz. Although not included in the statistical analysis, results from eight additional electrode penetrations with BFs outside the range of f0s encompassed by the compound stimuli are discussed at the end of the results section.
In all of the cortical sites examined, MUA elicited by harmonic complexes clearly reflects the spectral content and not the missing f0 of the stimulus,
Discussion
Previous physiological studies have provided conflicting conceptions of the cortical mechanisms involved in encoding the pitch of complex tones. Single-unit studies in awake macaques demonstrate that responses to harmonic complex tones missing the f0 are determined by the relationship between the BF of the area and the spectral content of the stimulus [51]. Conversely, non-invasive neuromagnetic studies in humans support a topographic organization at the level of A1 which is based on the pitch
Acknowledgements
We are sincerely grateful to Dr. Steven Walkley and May Huang for providing histological facilities and assistance, Dr. Charles Schroeder for his assistance with the surgical procedures, Shirley Seto for constructing the electrodes and Susana Chan for technical help. This research was supported by grants DC00657 and MH06723 and the Institute for the Study of Music and Neurologic Function of Beth Abraham Hospital. Submitted in partial fulfillment of the requirements for the degree of Doctor of
References (66)
- et al. A new multicontact array for the simultaneous recording of field potentials and unit activity. Electroencephalogr. Clin. Neurophysiol. (1981)
- Temporal modulation transfer functions for AM and FM stimuli in cat auditory cortex. Effects of carrier type, modulating waveform and intensity. Hear. Res. (1994)
- Periodicity coding in the auditory system. Hear. Res. (1992)
- et al. Tonotopic organization in human auditory cortex revealed by positron emission tomography. Hear. Res. (1985)
- et al. Representation of the cochlear partition on the superior temporal plane of the macaque monkey. Brain Res. (1973)
- et al. Population responses to multifrequency sounds in the cat auditory cortex: one- and two-parameter families of sounds. Hear. Res. (1994)
- et al. Binaural fusion and the representation of virtual pitch in the human auditory cortex. Hear. Res. (1996)
- et al. Tonotopic organization of the human auditory cortex revealed by transient auditory evoked magnetic fields. Electroencephalogr. Clin. Neurophysiol. (1988)
- et al. Source location of a 50 msec latency auditory evoked field component. Electroencephalogr. Clin. Neurophysiol. (1988)
- et al. Phase-locked cortical responses to human speech sound and low frequency tones in the monkey. Brain Res. (1980)