Pitch perception is critical for the perception of speech and music, for object identification, and for auditory scene analysis, whereby representations are derived for each sounding object in the environment from the complex sound wave that reaches the ears. The perceived pitch of a complex sound corresponds to its fundamental frequency. However, removal of energy at the fundamental does not alter the pitch because adults use the harmonics to derive the pitch (Bendor and Wang, 2005; Trainor, 2008). Although sound frequency is represented subcortically, the integration of harmonics into a representation of pitch does not occur until auditory cortex (Bendor and Wang, 2005). Given that auditory cortex is immature in young infants, we examined the development of cortical representations for pitch by measuring electrophysiological (EEG) responses to pitch changes that required processing the pitch of the missing fundamental. Adults and infants 4 months and older showed a mismatch negativity response to these pitch changes, but 3-month-old infants did not. Thus, cortical representations of the pitch of the missing fundamental emerge between 3 and 4 months of age, indicating that there is a profound change in auditory perception for pitch in early infancy.
Pitch provides information about object identity, including body size (Duellman and Trueb, 1986), age (Harnsberger et al., 2008), and gender (Childers and Wu, 1991), and pitch contours carry emotional information in speech and music (Trainor et al., 2000). Pitch analysis is also critical for perceptually separating and identifying simultaneous sound sources. Physical sounds that give rise to the sensation of pitch typically have energy at integer multiples of a fundamental frequency, f0, such that the repetition rate of the complex waveform is unchanged when the fundamental frequency is missing. The ability to extract the missing fundamental allows identification of the pitch of objects in environments in which low frequencies are attenuated or missing. It enables people, for example, to converse over the telephone (which does not transmit all frequencies in the f0 range of the human voice), to perceive the lowest notes of organs whose pipes are too short to physically produce these fundamentals, and to hear the bass notes in music through loudspeakers that cannot transmit their fundamentals. Species as diverse as birds, fish, and mammals perceive the pitch of the missing fundamental, attesting to its importance for survival (Shofner, 2005).
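The claim that the repetition rate is unchanged when the fundamental is removed can be checked numerically. The following is a minimal illustrative sketch, not part of the study: the 200 Hz fundamental, the choice of harmonics, and the function names are all hypothetical. It shows that a complex made only of harmonics 2 through 5 still repeats every 1/f0 seconds, and that f0 can be recovered as the greatest common divisor of the component frequencies.

```python
import math
from functools import reduce


def implied_fundamental(component_freqs_hz):
    """Greatest common divisor of integer component frequencies (in Hz)."""
    return reduce(math.gcd, component_freqs_hz)


def complex_tone(component_freqs_hz, t):
    """Value at time t (seconds) of an equal-amplitude sum of sine components."""
    return sum(math.sin(2 * math.pi * f * t) for f in component_freqs_hz)


f0 = 200  # hypothetical fundamental frequency in Hz
with_fundamental = [f0 * n for n in range(1, 6)]     # 200, 400, ..., 1000 Hz
missing_fundamental = [f0 * n for n in range(2, 6)]  # 400, 600, 800, 1000 Hz

period = 1 / f0  # repetition period implied by the fundamental

# Both complexes share the same implied fundamental ...
print(implied_fundamental(with_fundamental), implied_fundamental(missing_fundamental))

# ... and the missing-fundamental waveform still repeats every 1/f0 seconds.
for t in (0.0003, 0.0011, 0.0027):
    difference = complex_tone(missing_fundamental, t + period) - complex_tone(missing_fundamental, t)
    print(f"t = {t:.4f} s, x(t + 1/f0) - x(t) = {difference:.2e}")
```

The greatest common divisor is only a convenient stand-in for the waveform's repetition rate when the components are exact integer multiples; it is not offered as a model of how the auditory system derives pitch.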
Behavioral studies indicate that 7-month-old infants can perceive the pitch of the missing fundamental (Clarkson and Clifton, 1985; Clarkson and Rogers, 1995; Montgomery and Clarkson, 1997), but these techniques are difficult to apply to younger infants. We used event-related potentials (ERPs) derived from electroencephalogram (EEG) recordings to test the sensitivity of adults and 3-, 4-, and 7-month-old infants to the pitch of the missing fundamental. When exemplars from a sound category are presented repeatedly, occasional changes in the sound category elicit a mismatch negativity (MMN) in adults between 120 and 220 ms after the onset of the unexpected deviant sound (Winkler et al., 1995, 1997; Picton et al., 2000; Näätänen et al., 2007).
A number of ERP studies show that infants respond to changes in pitch from the newborn period [e.g., Leppänen et al. (1997), Čeponiené et al. (2002), Kushnerenko et al. (2002), Draganova et al. (2005), and Novitski et al. (2007)]. A simple pitch change elicits a frontally negative MMN-like response in 3- and 4-month-old infants ∼210 ms after stimulus onset, but a slow frontal positive wave in younger infants (He et al., 2007, 2009a). Despite the different responses at different ages, the important point is that cortical correlates of pitch discrimination are apparent at all ages. Furthermore, young infants are also able to respond to changes in pitch patterns, including changes in the order of pitches by 2 months (He et al., 2009b), pitch contour in newborns (Carral et al., 2005), and pitch interval in newborns (Stefanics et al., 2009).
We presented standard trials consisting of two complex tones with fundamentals, such that the pitch always increased from the first to the second tone. On occasional deviant trials, the harmonics of the second tone were all integer multiples of a low-pitched missing fundamental. Thus, only if the missing fundamental was perceived should deviant trials elicit MMN.
Materials and Methods
We tested 29 3-month-old infants (mean age = 110 d; range = 97–118 d; 16 female), 15 4-month-old infants (mean age = 142 d; range = 131–149 d; 9 female), 15 7-month-old infants (mean age = 228 d; range = 219–236 d; 8 female), and 10 adults (mean age = 20 years; range = 18–26 years; 6 female) with normal hearing. Data from an additional 34 infants were excluded because they fell asleep (6, 7, and 7 infants at 3, 4, and 7 months, respectively) or became fussy and failed to produce the minimum of 100 artifact-free deviant trials (2, 4, and 8 infants at 3, 4, and 7 months, respectively).
Tones were synthesized using Adobe Audition 1.0 (Adobe Software) and played using E-prime 1.1 (Psychology Software Tools) on a Dell OptiPlex280 computer with an Audigy 2 platinum sound card (Creative Labs) through a custom-built WestSun loudspeaker with flat spectrum response. Each tone in the standard stimulus tone pairs was synthesized by adding together 10 sine-wave tones in random phase that were at integer multiples of the fundamental frequency (selected from the first 15 harmonics), such that intensity decreased by 4 dB/octave. Each resulting complex tone was given a small frequency modulation (1% of the fundamental frequency) at a rate of 5 Hz to increase the perceived fusion of the sine-wave components (Fig. 1).
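The synthesis procedure described above can be sketched in code. This is an illustrative approximation under stated assumptions, not the Adobe Audition procedure actually used: the function name, the 44.1 kHz sample rate, the reading of the 1% modulation as a proportional sinusoidal shift applied to every component, the application of the 4 dB/octave roll-off relative to harmonic number, and the peak normalization are all assumptions. The 150 ms duration and 15 ms linear ramps are taken from the stimulus description that follows.

```python
import math
import random


def synthesize_complex_tone(f0_hz, harmonics, duration_s=0.150, ramp_s=0.015,
                            rolloff_db_per_octave=4.0, fm_depth=0.01,
                            fm_rate_hz=5.0, sample_rate_hz=44100, seed=None):
    """Sum of random-phase harmonics with spectral roll-off, slow FM, and linear ramps.

    `harmonics` lists harmonic numbers relative to f0_hz; omitting 1 yields a
    missing-fundamental complex. All defaults other than duration, ramp length,
    roll-off, and modulation rate/depth are assumptions of this sketch.
    """
    rng = random.Random(seed)
    harmonics = list(harmonics)
    n_samples = int(round(duration_s * sample_rate_hz))
    n_ramp = int(round(ramp_s * sample_rate_hz))

    # Random starting phase and roll-off amplitude for each component.
    phases = {n: rng.uniform(0.0, 2.0 * math.pi) for n in harmonics}
    amps = {n: 10 ** (-rolloff_db_per_octave * math.log2(n) / 20.0) for n in harmonics}

    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        # Time integral of the modulation factor 1 + fm_depth * sin(2*pi*fm_rate*t),
        # so every component's instantaneous frequency varies by the same proportion.
        warped_t = t - fm_depth * math.cos(2.0 * math.pi * fm_rate_hz * t) / (2.0 * math.pi * fm_rate_hz)
        value = sum(amps[n] * math.sin(2.0 * math.pi * n * f0_hz * warped_t + phases[n])
                    for n in harmonics)
        # Linear onset and offset ramps.
        gain = min(1.0, i / n_ramp, (n_samples - 1 - i) / n_ramp)
        samples.append(gain * value)

    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples]


# Hypothetical examples: a tone containing its fundamental, and one without it.
standard_like = synthesize_complex_tone(250.0, range(1, 11), seed=0)
deviant_like = synthesize_complex_tone(150.0, range(3, 13), seed=0)
```

Because the modulation scales all components by the same factor, the components remain integer multiples of the (modulated) fundamental throughout the tone, which is the property that encourages them to be heard as one sound.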
The fundamental frequency was always present in both tones of each of the six standard pairs. The pitch always increased from the first to the second tone in each of the standard tone pairs (with no musical relation), and each sine-wave component also increased from the first to the second tone. Pitch increases were chosen to be clearly perceptible to infants but not to consist of familiar Western musical intervals. The pitch of the first tone varied between 209 and 314 Hz, and the pitch increase from first to second tone varied between 82 and 142 Hz (Fig. 1). Each tone of each standard tone-pair stimulus was 150 ms, including linear onset and offset ramps of 15 ms, and the two tones were separated by 50 ms. Standard tone pairs were presented on six-sevenths of trials, with each of the six standard tone pairs presented equally often (one-seventh of the total trials). Tone pairs were separated by 400 ms. On deviant trials (one-seventh of the total trials), the first tone was constructed as in the standard trials and contained the fundamental frequency. Each sine-wave component of the first tone increased in frequency between the first and second tones of the pair, as with the standard tone pairs. However, the components of the second tone in deviant stimuli were created so as to form a missing fundamental that decreased in pitch. The sine-wave components of the deviant and standard tones occurring on the second tone of each pair were matched to have similar spectral spacing (Table 1). Trials were presented in quasirandom order with the constraints that the same standard pair was never played twice in a row and that at least two standard trials preceded each deviant trial. Stimuli were presented in a sound-treated room at ∼68 dB(A) at the position of the participant's head over a noise floor of 29 dB(A). The lower sine-wave components of the presented sounds were measured at 58 dB in the sound field. 
According to Hall (1972) and Goldstein (1973), any difference tones created by nonlinearities in the cochlea that might arise would be at least 25 dB below this (<33 dB). Given that infant thresholds at 3–4 months of age are ∼30 dB for these frequencies (Olsho et al., 1988), this suggests that there should be no influence of distortion products on infants' perception of the missing fundamental in the present study. On the other hand, Pressnitzer and Patterson (2001) found that when harmonics are all in cosine phase, difference tones can be as high as only 10–15 dB below the level of the harmonics. However, when different harmonics have different phases, the level of the difference tones is much lower. Given that phase was random in the stimuli of the present study, it is likely that difference tones were substantially more than 15 dB below the harmonics. We cannot be entirely sure that these tones were inaudible to infants, but they would almost certainly have been very quiet if audible at all, and unlikely to generate any significant ERP responses.
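The trial-ordering constraints stated in the stimulus description (deviants on one-seventh of trials, each of the six standard pairs on one-seventh of trials, no standard pair twice in a row, and at least two standards before each deviant) can be satisfied in several ways. The sketch below shows one hypothetical construction; the E-prime procedure actually used is not specified in the text, and the block-shuffling scheme, function name, and trial labels are assumptions.

```python
import random


def make_trial_sequence(n_deviants, n_standards=6, min_standards_before_deviant=2, seed=None):
    """Return a list of trial labels: 'S1'..'S6' for standard pairs, 'D' for deviants.

    Assumed construction: the standards form a stream of shuffled blocks (each
    pair once per block, no pair repeated across a block boundary), and deviants
    are inserted so that each follows a run of at least two standards.
    """
    rng = random.Random(seed)

    # Stream of standards with equal counts and no immediate repeats.
    stream = []
    for _ in range(n_deviants):
        block = list(range(n_standards))
        rng.shuffle(block)
        while stream and block[0] == stream[-1]:
            rng.shuffle(block)
        stream.extend(block)

    # Run lengths: one run before each deviant (minimum length enforced),
    # plus an optional trailing run after the last deviant.
    runs = [min_standards_before_deviant] * n_deviants + [0]
    extra = len(stream) - min_standards_before_deviant * n_deviants
    for _ in range(extra):
        runs[rng.randrange(len(runs))] += 1

    trials = []
    position = 0
    for i, run in enumerate(runs):
        trials.extend(f"S{s + 1}" for s in stream[position:position + run])
        position += run
        if i < n_deviants:
            trials.append("D")
    return trials


sequence = make_trial_sequence(n_deviants=50, seed=1)
print(len(sequence), sequence[:14])
```

With six standard pairs, each label (including the deviant) occurs exactly `n_deviants` times, giving the one-seventh proportions described in the text. This construction is slightly stricter than required, since it also prevents the same standard from appearing on either side of a deviant.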
EEG was recorded from 124 locations on the scalp (128 for adults) through Geodesic Sensor Nets (Electrical Geodesics) and digitized at 1000 Hz with a vertex reference and bandpass filter of 0.1–400 Hz, while impedance was maintained at <50 kΩ. Off-line, the data were filtered between 0.5 and 20 Hz with a roll-off of 24 dB/octave and segmented into 1050 ms epochs starting 200 ms before the onset of the first tone in each trial, and trials with eye artifact were removed. For each age group and each electrode site, standard and deviant trials were averaged separately relative to the 200 ms baseline before onset of the first tone. Subsequently, groups of electrodes were averaged together in left and right frontal, central, occipital, and parietal regions (supplemental Fig. 1, available at www.jneurosci.org as supplemental material). Difference waves were created for each participant by subtracting the standard average from the deviant average. Points at which the waveforms were significantly different were determined by t tests. For each 4-month-old infant and each 7-month-old infant, MMN was defined as the largest frontally negative peak between 100 and 250 ms (100 and 200 ms for adults) after the onset of the second tone. A peak could not be identified in waveforms of 3-month-old infants. Peak amplitude and latency were analyzed with linear mixed-effects model ANOVAs, with age group (4 months, 7 months, adults), scalp region (frontal, central, parietal, occipital), and hemisphere (left, right) as variables. Bonferroni corrections were applied to multiple comparisons.
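The averaging, baseline correction, difference-wave, and peak-picking steps described above can be sketched for a single channel as follows. This is a simplified illustration, not the analysis software used in the study: the function names are hypothetical, filtering and artifact rejection are omitted, and the "largest negative peak" is approximated by the most negative sample in the search window. The second tone is assumed to begin 200 ms after the first (150 ms tone plus 50 ms gap), that is, 400 ms into each 1050 ms epoch.

```python
import math

SAMPLE_RATE_HZ = 1000        # digitization rate stated in the text
BASELINE_MS = 200            # epoch starts 200 ms before first-tone onset
SECOND_TONE_ONSET_MS = 200   # assumed: 150 ms first tone + 50 ms gap


def ms_to_samples(ms):
    return round(ms * SAMPLE_RATE_HZ / 1000)


def average_trials(trials):
    """Point-by-point mean across equal-length single-channel epochs."""
    n = len(trials)
    return [sum(column) / n for column in zip(*trials)]


def baseline_correct(wave):
    """Subtract the mean of the pre-stimulus baseline from every sample."""
    n_baseline = ms_to_samples(BASELINE_MS)
    baseline_mean = sum(wave[:n_baseline]) / n_baseline
    return [v - baseline_mean for v in wave]


def difference_wave(standard_trials, deviant_trials):
    """Deviant average minus standard average, each baseline corrected."""
    standard = baseline_correct(average_trials(standard_trials))
    deviant = baseline_correct(average_trials(deviant_trials))
    return [d - s for d, s in zip(deviant, standard)]


def mmn_peak(diff, window_ms=(100, 250)):
    """Latency (ms after second-tone onset) and amplitude of the most negative sample."""
    onset = ms_to_samples(BASELINE_MS + SECOND_TONE_ONSET_MS)
    start = onset + ms_to_samples(window_ms[0])
    stop = onset + ms_to_samples(window_ms[1])
    index = min(range(start, stop + 1), key=lambda i: diff[i])
    latency_ms = (index - onset) * 1000 / SAMPLE_RATE_HZ
    return latency_ms, diff[index]


# Synthetic check: flat standards, deviants with a negativity 190 ms after the second tone.
n_samples = 1050
standards = [[0.0] * n_samples for _ in range(3)]
deviants = [[-2.0 * math.exp(-((t - 590) / 30.0) ** 2) for t in range(n_samples)]
            for _ in range(3)]
latency, amplitude = mmn_peak(difference_wave(standards, deviants))
print(latency, round(amplitude, 3))
```

For adults, the same function would be called with `window_ms=(100, 200)`, matching the narrower search window given in the text.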
Results
In adults, a typical MMN was seen in response to missing fundamental deviants (significant by t tests across time), indicating that adults represent the pitch of the missing fundamental in auditory cortex, as expected. MMN was also significant in 4- and 7-month-old infants, but no significant response, positive or negative, was present in 3-month-old infants. All data were subsequently filtered between 1.5 and 20 Hz to reduce any slow-wave components and confirm that a negative MMN response was not hidden by slow-wave activity. Consistent with the original analysis using a 0.5–20 Hz filter, the new analysis confirmed the presence of MMN in adults (Fig. 2) and in 4- and 7-month-old infants (Fig. 3), but not in 3-month-old infants (Fig. 3). This suggests that a cortical representation for the pitch of the missing fundamental emerges after 3 months and before 4 months of age.
Analyses of 4-month, 7-month, and adult waveforms indicated that the latency of the MMN peak decreased with age (F(2,34.5) = 6.09, p < 0.001), with the peak in 4-month-old infants [192 ± 3 ms (SEM) after second tone onset] significantly later (p < 0.001) than in 7-month-old infants [166 ± 3 ms (SEM)], and the peak in 7-month-old infants significantly later (p < 0.001) than in adults [135 ± 4 ms (SEM)]. The electrical field distribution over the scalp at the time of the MMN peak showed a bipolar pattern in 4-month-old infants, 7-month-old infants, and adults, with an anterior negativity accompanied by a posterior positivity (Fig. 4), consistent with primary generators of the activity in the left and right auditory cortices. At the same time, the scalp distribution also varied somewhat across age as shown by a significant interaction between age and region (F(6,110) = 3.45, p = 0.004). In particular, in adults the auditory cortical neurons involved in generating the response appear to be oriented in a more central direction compared with those of the younger infants. Finally, across all age groups, MMN was slightly but significantly earlier in the left [162 ± 2 ms (SEM)] than in the right [167 ± 2 ms (SEM)] hemisphere (F(1,237) = 4.71, p = 0.03).
Discussion
These results indicate that infants begin to exhibit cortical responses to the pitch of the missing fundamental after 3 months and before 4 months of age, reflecting integration of the harmonic frequency components into a pitch percept by 4 months.
Although frequency is encoded at the level of the basilar membrane in the inner ear and in temporal firing patterns in the auditory nerve, and although there are tonotopic frequency maps in many subcortical nuclei and in primary auditory cortex, a number of lines of evidence indicate that in humans, pitch is not represented until beyond primary auditory cortex, in a region in lateral Heschl's gyrus. These include functional magnetic resonance imaging studies (Patterson et al., 2002; Penagos et al., 2004; Schneider et al., 2005), lesion studies (Warrier and Zatorre, 2004), and a case study using depth electrodes (Schönwiesner and Zatorre, 2008). Single-cell recordings in marmoset monkeys also confirm a pitch-sensitive region that is localized to an area adjacent to primary auditory cortex (Bendor and Wang, 2005), and studies of primary auditory cortex in macaque monkeys have failed to find even a population representation of the missing fundamental in primary auditory cortex (Fishman et al., 1998).
In humans, the auditory brainstem, which supports frequency representation, is quite mature at birth. However, auditory cortex is immature at birth in terms of the number (Huttenlocher and Dabholkar, 1997) and efficiency (Moore and Guan, 2001) of connections between neurons, and undergoes rapid development during the next few months. Neurofilament expression, which reflects the ability of neurons to communicate efficiently, can be seen only in cortical layer I before 4 months, at which time expression begins in deeper cortical layers IV, V, and VI (Moore and Guan, 2001). If auditory cortex is necessary for the integration of frequency components into a representation of a single auditory object with a particular pitch, then very young infants would not be expected to be able to do this.
Although the present results show that 3-month-old infants do not appear to integrate harmonics into a pitch percept, infants younger than 3 months do respond to changes in the pitch of complex tones if the fundamental frequency is present (He et al., 2007). It is difficult to measure phenomenologically what infants experience at this age when presented with complex sounds such as phonating voices or musical instruments, but the present results suggest that these early responses to pitch are based on representations of frequency rather than pitch. For example, if young infants processed only the fundamental frequency or any harmonic of complex sounds that give rise to pitch sensations in adults, they would appear to be able to process pitch. However, their ability to analyze the number of, and identity of, sounding objects in an environment would be impaired. What is clear is that between 3 and 4 months of age there is a major shift in how pitch is represented in the cortex, such that by 4 months components that stand in harmonic relations fuse into a single percept whose pitch corresponds to the fundamental, whether or not it is actually present in the stimulus.
This research was supported by a grant to L.J.T. from the Canadian Institutes of Health Research. We thank Lisa Hotson and Dorcas Yung for help in testing the infants and Steven Brown, Daphne Maurer, and Terri Lewis for comments on an earlier draft.
Correspondence should be addressed to Laurel J. Trainor, Department of Psychology, Neuroscience & Behaviour, McMaster University, Hamilton, ON L8S 4B2, Canada.