Abstract
A major task across infancy is the creation and tuning of the acoustic maps that allow efficient native language processing. This process crucially depends on ongoing neural plasticity and keen sensitivity to environmental cues. Development of sensory mapping has been widely studied in animal models, demonstrating that cortical representations of the sensory environment are continuously modified by experience. One critical period for optimizing human language mapping is early in the first year; however, the neural processes involved and the influence of passive compared with active experience are as yet incompletely understood. Here we demonstrate that, while both active and passive acoustic experience from 4 to 7 months of age, using temporally modulated nonspeech stimuli, impacts acoustic mapping, active experience confers a significant advantage. Using event-related potentials (ERPs), we show that active experience increases perceptual vigilance/attention to environmental acoustic stimuli (e.g., larger and faster P2 peaks) when compared with passive experience or maturation alone. Faster latencies are also seen for the change discrimination peak (N2*) that has been shown to be a robust infant predictor of later language through age 4 years. Sharpening is evident for both trained and untrained stimuli over and above that seen for maturation alone. Effects were also seen on ERP morphology for the active experience group with development of more complex waveforms more often seen in typically developing 12- to 24-month-old children. The promise of selectively “fine-tuning” acoustic mapping as it emerges has far-reaching implications for the amelioration and/or prevention of developmental language disorders.
Introduction
The foundations of language are established in infancy and can be observed well before spoken language emerges. Specifically, fine-grained analyses in the tens-of-milliseconds range appear to be critical to decoding the speech stream (Eilers et al., 1981; Aslin, 1989; Werker and Tees, 2005). To facilitate decoding, the developing brain constructs acoustic maps of the sounds of its native language, thus allowing the child to respond in a fast automatic way to incoming language (Kuhl et al., 1992; Werker and Tees, 2005). This process crucially depends on ongoing neural plasticity and the exquisite sensitivity to environmental cues that characterize early brain development (deVillers-Sidani et al., 2007; Kuhl, 2010; Froemke and Jones, 2011).
The processes involved in exposure-based plasticity are complex and incompletely understood in human infants, although animal studies demonstrate that developing cortex is highly plastic with experience (Kilgard et al., 2001; Zhang et al., 2001; Ranasinghe et al., 2012). In developing rats, “patterned auditory inputs play a crucial role in shaping neuronal processing/decoding circuits in primary auditory cortex” (Zhang et al., 2002); moreover, response latency of cortical neurons changes as a function of sensory experience (Moucha et al., 2005). Threlkeld et al. (2009) demonstrated in rats that for both simple tones and complex auditory discriminations, prior early experience significantly improved acoustic discrimination as adults. For human infants, it is probable that emergent selective representation of the phonemic structure of the infant's native language is a manifestation of this powerful, sound exposure-based plasticity (Kuhl et al., 1992), and thus facilitates accurate construction of early acoustic maps (Guenther and Gjaja, 1996).
A number of studies have manipulated infants' sound environments, introducing novel speech (Kuhl et al., 2003) or contrasting vowel sounds (Cheour et al., 2002), and then assessed discrimination. One study manipulated auditory exposure more systematically and observed specific effects on acoustic mapping. Four-month-old children received a week of daily passive experience (PEx) with differing musical timbres; plastic changes in electroencephalograms (EEGs)/event-related potentials (ERPs) were differentially induced in acoustic representations (Trainor et al., 2011). The authors concluded that “exposure to a particular timbre in infancy enhances representations, leading to more precise pitch processing for that timbre.” However, to date, no infant study has determined whether interactive experience, using nonspeech stimuli containing linguistically relevant cues, would impact and perhaps optimize the cortical representations that will ultimately support later language. Specifically, we ask whether “active engagement” that recruits attention to salient properties of the acoustic environment differs from passive exposure to the same sounds.
We used EEGs/ERPs to examine neural correlates of acoustic mapping before and after a 6 week interactive or passive acoustic experience delivered from 4 to 7 months. We hypothesized, based on the extant literature, that, in addition to expected maturational differences in latency and amplitude, a targeted, interactive acoustic experience would do the following: (1) induce more precise and efficient acoustic mapping; (2) show a differential rate effect such that stimuli with a faster rate would drive acoustic plasticity more than slower rates; (3) improve both processing speed and discrimination of trained stimuli; and (4) generalize to untrained stimuli as a consequence of more sharply defined acoustic maps.
Materials and Methods
Participants
Forty-nine infants were recruited and then randomly assigned to one of three groups. The study used a mixed longitudinal and cross-sectional design that ensured both within-age and within-subject controls (Fig. 1a). The active experience (AEx; N = 18; 10 males) and PEx (N = 17; 9 males) groups were followed longitudinally from 4 to 7 months of age. For both groups, the mean (SD) age at pretest was 4.4 months (0.18 months) and the mean age at post-test was 7.2 months (0.16 months). A naive control (NC) group (N = 14; 7 males) was recruited at 7 months of age (mean age, 7.1 months; SD, 0.16 months) and served as the cross-sectional maturational control. Infants were recruited from urban and suburban communities in a large metropolitan area. All infants came from monolingual English-speaking families and were of middle to upper middle socioeconomic class. Parents reported uneventful prenatal and perinatal circumstances, and all infants were born healthy, at full term, and of normal birth weight. Exclusionary criteria included reports of family history of language-learning impairments, psychiatric disorders, and/or autism, and infant history of hearing loss, repeated episodes of otitis media, or other medical or neurological disorders. Parents were compensated for their time, and infants received a toy after the visit. The study was conducted in accordance with the Declaration of Helsinki, and informed consent, approved by the Institutional Review Board of our university, was obtained from all participants before study participation.
Measures
Electroencephalography.
Dense-array EEGs/ERPs to auditory stimuli were collected from all infants at 4 and/or 7 months of age.
EEG/ERP stimuli.
Stimuli were 70-ms-long (5 ms rise time/5 ms fall time) complex tones with 15 harmonics and a 6 dB roll-off per octave. They were presented at an intensity of 75 dB SPL free-field via loudspeakers to the left and right of the participant. At both pretest (4 months of age) and post-test (7 months of age), a single-deviant oddball paradigm was used. Standard (STD) stimuli, 800–800 Hz complex tone pairs (STD, 708 tokens; 85%), were interspersed with deviant (DEV) stimuli, 800–1200 Hz complex tone pairs (125 tokens; 15%). The stimuli were presented in a passive auditory oddball paradigm using a blocked design with interstimulus intervals (ISIs) of either 70 or 300 ms. In all cases, the 70 ms ISI tone pairs (the temporally modulated condition) were presented first followed by a second block of 300 ms ISI stimuli (control condition). The onset-to-onset intertrial intervals (ITIs) were 915 and 1140 ms, and the offset-to-onset ITIs were 705 and 700 ms, for 70 and 300 ms ISI conditions, respectively. Stimuli were presented in a pseudorandomized order where 3–12 STDs were presented before each DEV pair.
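As a rough illustration of the stimulus description above, the following sketch synthesizes one complex tone and checks the reported ITI arithmetic. The sampling rate, the linear ramp shape, the choice to count the fundamental among the 15 harmonics, the reading of the roll-off as an amplitude drop of 6 dB per octave above the fundamental, and all function names are assumptions made for illustration; none of them are stated in the text.

```python
import math

SAMPLE_RATE = 44100  # Hz; assumed, not given in the text


def complex_tone(f0, dur_ms=70, ramp_ms=5, n_harmonics=15, rolloff_db_per_octave=6.0):
    """Return normalized samples of a harmonic complex tone with linear on/off ramps."""
    n = int(SAMPLE_RATE * dur_ms / 1000)
    ramp = int(SAMPLE_RATE * ramp_ms / 1000)
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        value = 0.0
        for h in range(1, n_harmonics + 1):
            # Harmonic h lies log2(h) octaves above the fundamental.
            gain = 10 ** (-rolloff_db_per_octave * math.log2(h) / 20)
            value += gain * math.sin(2 * math.pi * f0 * h * t)
        # Linear rise at the start and linear fall at the end (assumed shape).
        envelope = min(1.0, (i + 1) / ramp, (n - i) / ramp)
        samples.append(value * envelope)
    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples]


def offset_to_onset_iti(onset_to_onset_ms, isi_ms, tone_ms=70):
    """Offset-to-onset ITI = onset-to-onset ITI minus the duration of one tone pair."""
    pair_ms = 2 * tone_ms + isi_ms
    return onset_to_onset_ms - pair_ms
```

With these definitions, a 70 ms ISI pair lasts 210 ms and a 300 ms ISI pair lasts 440 ms, so the onset-to-onset ITIs of 915 and 1140 ms correspond to the reported offset-to-onset ITIs of 705 and 700 ms.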
At post-test (7 months), in addition to the single-deviant oddball paradigm, a multiple deviant oddball paradigm was presented. Here, the standard stimulus was an 800 Hz single complex tone (1455 tokens) interspersed with the following deviant stimuli (120 tokens for each stimulus): (1) a 1200 Hz single complex tone (frequency deviant); (2) an 800 Hz single complex tone that was shorter in duration, 30 ms instead of 70 ms (duration deviant); (3) an 800 Hz single complex tone with a 20 ms silent gap inserted in the middle of a 70 ms total stimulus length (gap deviant); and (4) a single complex sinusoidal up-sweep with linear modulation from 800 to 1200 Hz (sweep deviant). Due to poor signal-to-noise ratio (i.e., a smaller percentage of usable deviant trials), the sweep deviant was not analyzed and is not included in the results presented here. The onset-to-onset ITI was 930 ms for all conditions. Stimuli were presented in a pseudorandomized order where 3–12 STD stimuli were presented before each DEV stimulus.
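The pseudorandomization constraint (3–12 standards before each deviant) can be sketched as follows. This is a minimal illustration, not the presentation script used in the study: the uniform draw of run lengths, the seed, and the function name are assumptions, and a uniform draw will not reproduce the exact token counts reported above.

```python
import random


def oddball_sequence(n_per_deviant, deviant_types, min_std=3, max_std=12, seed=0):
    """Build a stimulus list in which every deviant follows a run of 3-12 standards."""
    rng = random.Random(seed)
    # One entry per deviant token, shuffled so deviant types are intermixed.
    deviants = [d for d in deviant_types for _ in range(n_per_deviant)]
    rng.shuffle(deviants)
    sequence = []
    for dev in deviants:
        # Run length drawn uniformly; the actual distribution is not specified in the text.
        sequence.extend(["STD"] * rng.randint(min_std, max_std))
        sequence.append(dev)
    return sequence


multi = oddball_sequence(120, ["frequency", "duration", "gap", "sweep"])
```

The same function covers the single-deviant paradigm by passing one deviant type, for example `oddball_sequence(125, ["DEV"])`.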
EEG/ERP acquisition.
EEG/ERP data were acquired across several sessions. All stimuli were presented using E-Prime software (Psychology Software Tools). Sounds were presented free field to infants via left and right speakers attached to opposite walls of a sound-attenuated and electrically shielded sound booth (Industrial Acoustics Company). Infants were seated on their caregiver's lap in a comfortable chair equidistant from each speaker. An experimenter engaged the infant with a silent puppet show or toys. Age-appropriate movies without sound were also presented via video monitor. EEG data were recorded from a 128-channel geodesic sensor net using an EGI (Electric Geodesics) recording system. The vertex electrode was used as the on-line reference electrode. EEG was sampled at 250 Hz and bandpass filtered on-line at 0.1–100 Hz. Impedances were maintained at <50 kΩ.
EEG/ERP data processing.
EEG was filtered off-line with a bandpass of 1–15 Hz; trials containing signals higher than ±200 μV were discarded. Eye movements were estimated from EEG data at the electrodes slightly above and lateral to both eyes. Remaining artifact-free trials were averaged by stimulus type (deviant or predeviant standard) for each block. Segment length was the same as each onset-to-onset ITI (as detailed above). In addition, a 100 ms prestimulus segment was included for baseline correction, and 0 was taken as the time of onset of the first tone (single oddball paradigm) in the tone pair or the first stimulus (multiple oddball paradigm). An average of 92 artifact-free predeviant standard and deviant EEG segments were used in each block for averaging ERPs (300 ms ISI condition: predeviant standard average, 92 segments; deviant average, 91 segments; 70 ms ISI condition: predeviant standard average, 93 segments; deviant average, 93 segments). For the multiple deviant oddball condition, an average of 353 artifact-free predeviant standards and 88 deviant EEG segments were used.
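The rejection, baseline-correction, and averaging steps described above might look roughly like this for a single channel. It is a simplified sketch: each trial is assumed to be a list of microvolt samples that begins with the 100 ms prestimulus segment, rejection is applied before baseline correction (the order is not stated in the text), and the function name is hypothetical.

```python
def average_erp(trials, sample_rate_hz=250, baseline_ms=100, reject_uv=200.0):
    """Average artifact-free trials after subtracting each trial's prestimulus mean.

    Returns (averaged waveform, number of trials kept).
    """
    n_base = int(sample_rate_hz * baseline_ms / 1000)  # 25 samples at 250 Hz
    kept = []
    for trial in trials:
        # Discard any trial whose signal exceeds +/-200 microvolts.
        if max(abs(v) for v in trial) > reject_uv:
            continue
        baseline = sum(trial[:n_base]) / n_base
        kept.append([v - baseline for v in trial])
    if not kept:
        raise ValueError("no artifact-free trials")
    n = len(kept)
    # Average sample by sample across the kept trials.
    return [sum(column) / n for column in zip(*kept)], n
```

In practice this would be run separately for each stimulus type (deviant or predeviant standard), block, and channel.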
ERP peak extraction.
Based on the channel reduction strategy described by Choudhury and Benasich (2011), nine channels were selected from the 128-channel EGI sensor array. Maximum peak amplitude and latency were extracted at frontal (F3, Fz, and F4), frontocentral (Fc3, Fcz, and Fc4), and central (C3, Cz, and C4) channels. Peaks reported here were identified as a positive or negative deflection from baseline and were labeled according to their order of appearance (e.g., P1, N1, P2, and N2). The change discrimination peak (N2*) is defined as the latency of the negative peak for the deviant wave that indicates the beginning of the discrimination response (for further discussion, see Choudhury and Benasich, 2011). Latencies reported are absolute values (0 was taken as time of onset from the first tone). Please refer to Table 1 for time windows.
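Peak extraction within a time window can be sketched as below. The window boundaries in any call are placeholders (the actual windows are in Table 1, which is not reproduced here), the waveform is assumed to start 100 ms before stimulus onset, and the function name is hypothetical.

```python
def extract_peak(erp, t_start_ms, t_end_ms, polarity, sample_rate_hz=250, baseline_ms=100):
    """Return (amplitude, latency_ms) of the extreme deflection inside a time window.

    Latency is measured from onset of the first tone (time 0), so the
    prestimulus baseline samples are accounted for when converting to indices.
    """
    def to_index(t_ms):
        return int(round((t_ms + baseline_ms) * sample_rate_hz / 1000))

    lo, hi = to_index(t_start_ms), to_index(t_end_ms)
    window = erp[lo:hi + 1]
    pick = max if polarity == "positive" else min
    amplitude = pick(window)
    index = lo + window.index(amplitude)
    latency_ms = index * 1000 / sample_rate_hz - baseline_ms
    return amplitude, latency_ms
```

At a 250 Hz sampling rate the latency resolution of such a procedure is 4 ms per sample.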
Behavioral assessments
AEx group–go/no-go operant conditioning protocol.
Before AEx, a go/no-go (G/N-G) looking task designed to assess each infant's ability to learn an association between an auditory stimulus and the onset of a video reward (Nawyn et al., 2007) was administered. Successful learning of the contingency was demonstrated when the infant directed his/her gaze to a “reward” video presented in a confined display area after the presentation of the conditioned auditory stimulus and before the onset of the reward.
Auditory stimuli.
The auditory stimuli consisted of two pure tones (70 ms duration) with fundamental frequencies of either 800 or 1200 Hz. Tones were presented in pairs of either 800–800 Hz (standard) or 800–1200 Hz (deviant/target) via free-field speakers in a quiet testing room. The within-pair ISI varied from 300 to 40 ms, depending on the phase of the G/N-G session (as described below). Infants were taught to discriminate target from standard on these simple tone pairs (STD, 800–800 Hz; DEV, 800–1200 Hz).
A 4 s video reward stimulus was selected from age-appropriate DVD stimuli and appeared in a defined screen area representing 25 × 15° of visual angle.
Procedure.
There were the following three phases: familiarization, training, and criterion. During all phases, standard stimuli were repeatedly presented, interspersed with experimenter-initiated target stimuli paired with video reward presentations. Target (go) trials were initiated when infant receptivity was judged to be optimal by the experimenter. For familiarization, infants were noncontingently rewarded for up to 10 correct responses/turns to presentations of the target stimulus to condition the association between the novel stimulus and the reward. During the training phase, infants were conditioned to direct their gaze to a specified reward region on a computer screen in response to a go trial (i.e., a series of three successive targets). The reward video was initiated automatically when the infant looked toward the reward area at any point over the go trial window.
During familiarization and training, the within-pair ISI was 300 ms with a stimulus onset asynchrony of 1500 ms. If the child did not respond during the go trial, the reward video was initiated to maintain the stimulus/reward contingency. The training phase ended when the child responded correctly to 3 of 5 successive go trials or when a total of 10 target trials were presented. The child then proceeded to the criterion phase. The criterion phase used the same go stimuli (800–1200 Hz). However, 10 no-go trials (i.e., a standard tone pair; 800–800 Hz) were interspersed among the 10 go trials. Successful completion of the criterion phase required the infant to show four of five correct responses, including at least two go and two no-go trials, within a block of five successive trials. All 18 AEx infants (100%) were able to learn the task at a 300 ms ISI and demonstrate contingency learning during the criterion phase of the go/no-go procedure, thus moving on to the active experience sessions.
Active experience protocol.
The AEx group entered the active training period 1 week after completion of pretest sessions. Infants visited the laboratory once a week (∼20 min) for 6 consecutive weeks and were taught to discriminate target from standard on three different types of acoustic stimuli. The training aim was to support and optimize acoustic mapping and auditory discrimination of brief, successive spectrotemporal cues during a time period when the developing brain is maximally sensitive to environmental sensory experience. The task was designed to focus attention on salient information in these nonlinguistic stimuli that had relevance for subsequent linguistic mapping and to gradually entrain widening subsets of auditory neurons.
Stimuli for active experience protocol.
Infants were trained to discriminate target from standard on three different types of acoustic stimuli, as follows: weeks 1 and 2, complex tones (STD, 800–800 Hz; DEV, 800–1200 Hz); weeks 3 and 4, bandpass noise (STD, 400–1900 Hz and 400–1900 Hz; DEV, 400–1900 Hz and 800–1900 Hz); and weeks 5 and 6, simple sweeps (STD, 1600–1200 Hz and 1600–1200 Hz; DEV, 1600–1200 Hz and 1200–1600 Hz). These stimuli were presented at varying ISIs using an up-down staircase procedure (Trehub et al., 1986). A go/no-go operantly conditioned paradigm assessed whether the infants continued to respond contingently to the stimulus/reward association and apply it as the target became progressively more difficult to resolve.
Procedure for active experience.
There were three phases that included familiarization, training, and baseline. Phases 1 and 2 were the same as the familiarization and training described above for the go/no-go procedure.
Phase 3, baseline, was similar to phase 2, except stimuli were presented in blocks of 10 trials with 5 no-go trials (i.e., a standard tone pair) interspersed among 5 go trials, and the between-tone ISI increased or decreased according to infant performance. The criterion for decreasing the ISI was achieved if the infant had four of five correct responses, including two correct go and two correct no-go trial responses. If the infant failed the criterion, the computer algorithm increased the ISI. The task continued in this fashion until the child fatigued (∼7–9 min).
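The adaptive rule described above can be sketched as a simple up-down staircase. The pass criterion follows the text (four of five correct, including at least two go and two no-go trials). The intermediate ISI steps are hypothetical: only the 300 and 40 ms endpoints are given in the text, and the function and variable names are illustrative.

```python
# Hypothetical ISI ladder; only the 300 ms and 40 ms endpoints come from the text.
ISI_LADDER_MS = (300, 200, 150, 100, 70, 40)


def block_passes(responses):
    """responses: five (trial_type, correct) tuples, trial_type is "go" or "no-go"."""
    n_correct = sum(1 for _, ok in responses if ok)
    go_ok = sum(1 for kind, ok in responses if kind == "go" and ok)
    nogo_ok = sum(1 for kind, ok in responses if kind == "no-go" and ok)
    return n_correct >= 4 and go_ok >= 2 and nogo_ok >= 2


def next_isi(isi_ms, passed):
    """Step to a shorter (harder) ISI after a pass, a longer (easier) one after a fail."""
    i = ISI_LADDER_MS.index(isi_ms)
    if passed:
        i = min(i + 1, len(ISI_LADDER_MS) - 1)
    else:
        i = max(i - 1, 0)
    return ISI_LADDER_MS[i]
```

Under this sketch, an infant who answers only go trials correctly cannot advance, which is the purpose of requiring correct no-go responses in the criterion.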
During the baseline phase, 13 of 18 infants achieved success at ≤70 ms ISI on at least one type of stimulus; 9 of 13 were successful at ≤40 ms ISI on at least one type of stimulus.
PEx group–passive experience protocol.
Sound exposure for the PEx group began 1 week after completion of the 4 month pretest session. Infants visited the laboratory once a week (∼20 min) for 6 consecutive weeks.
Stimuli for passive experience protocol.
At each 20 min session, infants were passively exposed to exactly the same temporally modulated auditory sequences as used in the active condition, as follows: weeks 1 and 2: complex tones (STD, 800–800 Hz; DEV, 800–1200 Hz); weeks 3 and 4: bandpass noise (STD, 400–1900 Hz and 400–1900 Hz; DEV, 400–1900 Hz and 800–1900 Hz); and weeks 5 and 6: simple sweeps (STD, 1600–1200 Hz and 1600–1200 Hz; DEV, 1600–1200 Hz and 1200–1600 Hz).
Procedure for passive experience.
The infant sat comfortably in an infant seat placed on a chair equidistant between left and right speakers in a sound-attenuated and electrically shielded sound booth (Industrial Acoustics Company). The stimuli were presented free field while the infant was silently entertained with puppets/silent toys to maintain alertness. Two blocks of stimuli were presented in random order at each session, 10 min at 40 ms ISI and 10 min at 70 ms ISI. This condition was designed to increase spectrotemporal processing efficiency through controlled background exposure.
Analytic strategy
The focus of this study was to assess the efficacy of using an active experience protocol to induce changes in neural representation of auditory information and in auditory processing between 4 and 7 months of age. The analyses reported here focus on changes in the cortical electrophysiology of the infant, over and above typical maturation or passive experience alone, across the age period of interest. Independent-samples t tests were conducted to assess typical maturational changes from 4 to 7 months of age. Preliminary analyses of the 4 month data were conducted to ensure that there were no systematic group differences between AEx and PEx groups at the initial visit. Figure 1b shows the grand average waveforms for the 4-month-old children who were randomized into either the AEx or PEx groups as well as the grand average waveform for all of the 4-month-old children combined. As expected, t tests comparing the two 4-month-old groups revealed no significant differences on any of the indices assessed in this study (t value range, 2.0 to −2.0; p value range, 0.06–0.99), and thus the full 4 month sample was used for the cross-sectional maturation analyses (i.e., comparison of 4 and 7 month EEG/ERPs).
To measure the effects of auditory exposure over and above that of maturation, a series of ANOVA models was used to assess differences in ERP amplitude and latency by age (4 vs 7 months of age), group (AEx vs PEx vs NC), and stimulus type (STD vs DEV) for each condition (300 ms ISI, 70 ms ISI, and multideviant oddball). Bonferroni and Tukey–Kramer corrections were conducted for multiple comparisons.
Results
Cross-sectional analysis between 4- and 7-month-old infants: maturational effects on morphology, latency, and amplitude
The morphology, latency, and amplitude of the 4-month-old group (all naive at this first pretraining visit) were compared with those of the 7-month-old NC group (Fig. 1c, grand average waveforms for the 4- and 7-month-old NC groups). Results from independent-samples t tests revealed maturational effects for both morphology and latency. For the 300 ms ISI tone pairs, at 4 months we see a four-peak response to the tone pair (P1, N1, and P2 followed by a large negativity between 400 and 500 ms). At 7 months, a more clearly defined and well developed eight-peak response to the tone pair (tone 1: P1, N1, P2, and N2; tone 2: P1T2, N1T2, P2T2, and N2T2) was observed on both standard and deviant waves at all sites examined (Fig. 1c; see Table 1 for specific time windows). Independent-samples t tests of the positive peak responses to tone 1 (P1, P2) revealed significantly faster latencies for the 7-month-old group compared with the younger infants (STD: t(37) range, 2.0–3.7; DEV: t(37) range, 2.0–4.3, p < 0.05) and showed significantly larger amplitudes on the STD and DEV waves (STD: t(37) range, 2.1–2.2; DEV: t(37) range, 2.1–4.3, p < 0.05).
For the 70 ms ISI tone pairs, there was a three-peak response at both 4 and 7 months on both standard waves (P1, N1, and P2) and deviant waves (P1, N2*, and P2; Fig. 1b). t Tests showed that the positive peaks on standard (P1) and deviant (P1, P2) waves were significantly faster at 7 months (STD: t(44) range, −4.3 to 5.5; DEV: t(44) range, 2.1–4.5, p < 0.04) and smaller in amplitude (STD: t(44) range, 2.1–5.0; DEV: t(44) range, 2.4–3.2, p < 0.05) than in 4-month-old children. Negative peak amplitudes increased from 4 to 7 months for both STD and DEV waves (STD: t(44) range, 3.1–4.6; DEV: t(44) range, 2.3–3.2, p < 0.03).
Group effects of auditory exposure on latency and amplitude beyond maturation
We then examined differences among all three 7-month-old groups (i.e., AEx, PEx, and NC groups) to measure the effects of auditory exposure over and above maturational effects. Waveforms were inspected across the entire time period with a specific focus on the P2 and N2* components. Given the active versus passive experience contrast, we hypothesized that there might be a differential effect among groups on components that have been implicated in modulation of attention or vigilance. The P2 component is thought to reflect transient sensory-processing abilities (Ceponiene et al., 2001), specifically processes linked to stimulus awareness and perceptual salience (Ceponiene et al., 2005), and also has been suggested as an index of auditory recognition memory (deRegnier, 2007; Mai et al., 2012). The N2* component on the deviant wave is also of prime interest as it marks the beginning of the discrimination response and has been shown to be a robust infant predictor of later language outcomes (Benasich et al., 2006; Choudhury and Benasich, 2011).
The impact of interactive acoustic experience was found primarily for performance on the fast-rate stimuli (70 ms ISI), although significant effects of auditory exposure in favor of the AEx group were observed for the amplitude of the P2 component for both 300 and 70 ms ISI conditions. In the fast-rate condition (70 ms ISI), the P2 for the deviant wave also demonstrated significant latency effects. More importantly, significant latency and amplitude effects as a function of auditory exposure were shown for the N2* on the deviant wave (Fig. 2a depicts the overlaid waveforms for each of the three 7-month-old groups by condition).
P2 amplitude at 300 ms ISI
To assess group differences in P2 peak amplitude responses to the 800 Hz tones, a total of four P2 peaks (two peaks from the standard pair, one from the predeviant standard and the single peak from the target 1200 Hz tone) were examined with a 3 (group) × 4 (P2 peak amplitude) factorial ANOVA. While a main effect for the 1200 Hz tone revealed that all three groups had significantly larger P2 responses for the deviant tone (range, F(3,105) = 4.4–17.8; p < 0.006), a significant interaction for frontal (range, F(6,105) = 2.3–2.7; p < 0.04), frontocentral (range, F(6,105) = 2.2–2.7; p < 0.05), and central (range, F(6,105) = 2.2–2.4; p < 0.04) channels was also found. The AEx group had significantly higher amplitude for the oft-repeated 800 Hz tones when compared with both the PEx and NC groups (Fig. 2b).
P2 amplitude at 70 ms ISI
To examine this same effect in the 70 ms ISI condition, a 3 (group) × 2 (P2 peak amplitude) ANOVA was run, as only two P2 peaks were identified for this temporally modulated condition. Results showed a significant group difference for frontocentral channels (F(2,46) = 3.7, p < 0.03) and central channels (F(2,46) range = 3.5–6.5; p < 0.04). The NC group had significantly smaller amplitudes for both the 800 Hz (STD) and 1200 Hz (DEV) tones compared with those for the AEx and PEx groups (Fig. 2c).
P2 latency and N2* latency/amplitude at 70 ms ISI
Further analysis using a one-way ANOVA revealed significant enhancement in the speed of acoustic processing. Specifically, the AEx group achieved significantly faster latencies (∼40–50 ms) for the P2 peak on the deviant wave (range, F(2,46) = 3.4–8.9, p < 0.04) compared with both PEx and NC groups (Fig. 3a). For the N2* on the deviant wave, the AEx and PEx groups showed significantly faster (∼20–40 ms) latencies (range, F(2,46) = 3.6–4.2, p < 0.04) than the NC group, and the AEx group showed larger amplitudes (range F(2,46) = 3.4–6.0 to 12, p < 0.04) compared with those for both the PEx and NC groups (Fig. 3b).
Group effects of auditory exposure on morphology beyond maturation
A prominent effect was also seen on ERP morphology for the AEx group. Specifically, additional negative and positive peaks emerged at 400 and 511 ms, respectively, for the 70 ms ISI stimuli, creating a double-peaked waveform not observed consistently for either the PEx or NC groups (Fig. 4a, star). To investigate this further among all participants, we used a group-blind protocol; two independent raters visually inspected each individual's deviant waveform within a 250–550 ms window to verify the presence or absence of these additional peaks. A majority of infants in the AEx group (78%) showed the presence of these double peaks, while they were present for only 41% of the PEx group and 28% of the NC group (χ2 = 8.6, p = 0.01). Emergence of new peaks precedes development of more complex waveforms (Choudhury and Benasich, 2011). Thus, these emerging peaks at 7 months of age suggest accelerated development of more complex waveforms resembling those seen, without training, in typically developing 12- and 24-month-old infants (Choudhury and Benasich, 2011; see examples in Fig. 4b). Please note that although the emergent peaks observed in 7-month-old children had similar topography to those in older children, these peaks were of the longer latency and larger amplitude typical of younger infants (Shafer et al., 2000; Kushnerenko et al., 2002; Sussman et al., 2008; Choudhury and Benasich, 2011).
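As a plausibility check on the reported statistic, the χ2 test of independence can be recomputed from counts inferred from the percentages and group sizes. The counts below (14 of 18 AEx, 7 of 17 PEx, 4 of 14 NC) are an inference from the rounded percentages, not figures reported in the text, and the function name is illustrative.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Columns: AEx, PEx, NC. Counts are inferred from the reported percentages.
counts = [
    [14, 7, 4],   # double peak present
    [4, 10, 10],  # double peak absent
]
chi2 = chi_square_independence(counts)
# With 2 degrees of freedom the chi-square survival function is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)
```

With these inferred counts the statistic comes out at roughly 8.6 with p close to 0.01, in line with the values reported above.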
Generalization to nonexposed stimuli in multideviant paradigm
These results were not attributable to a specific and isolated “practice” effect. Significantly, these enhanced processing profiles generalized to new, different stimuli that the infants had not previously experienced, but were similar in acoustic structure, incorporating rapid spectrotemporal change as described in Materials and Methods. Thus, for this multideviant generalization condition, P1 and N1 peak amplitude and latency responses to the standard and three generalization stimuli (gap, duration, and frequency) were examined. A four-peak response was seen for the standard and each of the generalization deviants, and the peaks were labeled according to the following order of appearance: P1, N1, P2, and N2 (for analysis time windows, see Table 1). The N1 within this multideviant paradigm may or may not be the same discrimination-related response (N2*) as seen for tone-paired stimuli. However, note that the N1 induced in the multideviant paradigm occurs in both standard and deviant waves at ∼200 ms. Further, the discrimination response on the deviant waveform follows the N1 peak and at this point the groups differ significantly for the generalization deviants.
For the standard wave, a one-way ANOVA revealed that the AEx group had significantly faster latencies for the P1 peak component (range, F(2,34) = 4.4–7.7, p < 0.05) and smaller, more mature amplitudes for the P1 and N1 peaks (P1: range, F(2,34) = 3.5–6.3; N1: range, F(2,34) = 3.2–6.5, p < 0.05). Moreover, the morphology for the standard waveform differed markedly between groups at every time point across the waveform. Figure 5a shows the waveforms by group, and the bar graphs (Fig. 5b) illustrate the significant latency and amplitude differences for the P1 and N1 components.
For the deviant waves, a 3 (group) × 3 (P1 peak to each deviant) factorial ANOVA revealed significant group differences for all frontal (range, F(2,34) = 4.2–5.7, p < 0.01), frontocentral (range, F(2,34) = 5.3–6.2, p < 0.01), and central channels (range, F(2,34) = 3.8–6.9, p < 0.01) for P1 latency only (Fig. 6a). That is, the AEx group had significantly faster latencies for all three deviants on the P1 peak compared with both the PEx and NC groups. Similar to the P1, analysis of the N1 component showed significant interactions for latency at frontal (range, F(4,68) = 2.8, p < 0.03), frontocentral (range, F(4,68) = 3.0; p < 0.03), and central channels (range, F(4,68) = 3.0–3.4, p < 0.05). Overall, while the mean latency of the N1 for all three deviants for the AEx group was faster than that for the PEx and NC groups, the interaction is a result of the NC group being notably slower when processing the frequency deviant (Fig. 6b). No significant amplitude differences were observed for P1 or N1 of the deviant waveform.
Responses to the gap deviant
As a specific example, we include here a more detailed analysis of the grand average waveform by group for the “gap” generalization deviant. This deviant stimulus in the multideviant paradigm required infants to make a challenging acoustic discrimination (detecting a 20 ms gap within a 70 ms tone) and also differed the most from the training stimulus set.
Gap separation analysis
As described above, the AEx infants achieved significantly faster P1 latencies for all generalization stimuli, including the gap deviant compared with both PEx and NC groups (Fig. 6a). Further examination of the gap deviant grand average waveforms by group also suggested that the AEx group was processing the deviant quite differently, as evidenced by the distribution of the P2–N2 complex (Fig. 7). These marked differences in morphology were of particular interest given that the 250 to 550 ms window of interest for this P2–N2 complex maps onto the time window in which the mismatch response (MMR) on the difference wave can be identified in this age group (i.e., ∼250–500 ms) in 70 ms ISI conditions (Choudhury and Benasich, 2011) and thus seems to index deviant stimulus discrimination.
To investigate this further, using a group-blind protocol, each individual infant's deviant waveform was visually inspected by two independent raters and then sorted into two subgroups based on the P2–N2 complex distribution on the difference wave (presence/absence of a P2–N2 complex in the 250–550 ms window; Fig. 7a). A majority of infants in the AEx group (71%) showed this P2–N2 complex as indexed by a peak in the difference wave at ∼300–400 ms. In contrast, only 41% of the PEx and 30% of the NC infants generated similar topographies.
A series of χ2 analyses was computed to investigate how passive and active experience compared with no exposure influenced P2–N2 topography. Results revealed that the AEx group significantly differed from the NC group (χ2 = 4.0, p < 0.05), while there was no difference between the PEx and NC groups (χ2 = 1.8, p = 0.18). To follow up on these significant χ2 results, a 2 (group) × 2 (subgroup) ANOVA for N2 peak amplitude and latency was used to examine AEx versus NC group differences (Fig. 7b). Significant interactions were seen for all nine sites examined (range, F(1,20) = 5.2–10.8, p < 0.03). In all cases, infants in the AEx group who showed the P2–N2 complex achieved significantly faster latencies compared with their naive peers. No significant amplitude differences were found.
Similarly, a 2 × 2 ANOVA for the MMR from the difference wave was used to examine latency and amplitude differences between the two subgroups (Fig. 7b). Results revealed that infants who showed the P2–N2 complex had significantly larger amplitudes for the MMR compared with those infants who did not show the P2–N2 complex (range, F(1,20) = 5.2–16, p < 0.03). All significant results are summarized in Table 2.
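The χ2 statistics reported above for the subgroup comparisons can be checked against their p values. The sketch below assumes that each comparison is a 2 × 2 table (group × presence/absence of the P2–N2 complex) and therefore has 1 degree of freedom, which the text does not state explicitly. Under that assumption, the upper-tail probability has the closed form erfc(√(χ2/2)). The function name is hypothetical.

```python
import math


def chi2_p_value_df1(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom.

    For df = 1 the statistic is a squared standard normal variable, so the
    tail probability equals erfc(sqrt(chi2 / 2)).
    """
    if chi2 < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(chi2 / 2.0))


# Statistics as reported in the text; df = 1 is an assumption (2 x 2 tables).
reported = {"AEx vs NC": 4.0, "PEx vs NC": 1.8}
for label, stat in reported.items():
    print(f"{label}: chi2 = {stat}, p = {chi2_p_value_df1(stat):.3f}")
```

Under the 1 degree of freedom assumption, χ2 = 4.0 corresponds to p ≈ 0.046 and χ2 = 1.8 to p ≈ 0.18, in line with the reported p < 0.05 and p = 0.18.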
Discussion
In this study, we demonstrate that both active and passive acoustic experience from 4 to 7 months of age, using temporally modulated nonspeech stimuli, impacts acoustic processing. However, progressive active experience that recruits attention confers a significant advantage, producing specific and measurable changes in the morphology, amplitude, and latency of brain waves that are surrogates for prelinguistic acoustic mapping. The data show qualitative as well as quantitative changes. In particular, we observed significant effects of active acoustic experience on the accuracy and speed of discrimination of key prelinguistic acoustic cues. These effects are over and above those seen for passive exposure or maturation alone. Response to fast-rate stimuli (with tens-of-milliseconds “speech-like” timing) more effectively impacted EEG surrogates for acoustic mapping. Alterations in waveform morphology for the multideviant standard stimulus in the generalization paradigm were seen most prominently in the AEx group, suggesting a more clearly defined and accessible stimulus representation (Näätänen and Winkler, 1999; Polley et al., 2006; Schreiner and Winer, 2007). Finally, infants who had acoustic exposure generalized an enhanced processing profile to multiple novel temporally modulated stimuli. The AEx group demonstrated significantly faster P1 latencies for all three generalization stimuli compared with the PEx and NC groups. Moreover, examination of the P2–N2 complex on the challenging gap deviant suggests that the AEx group is generating a larger-amplitude MMR, an ERP marker for the construction of stimulus–feature representations in auditory sensory memory (Näätänen, 2001).
We suggest, given the similarity to findings in the animal literature, that both active and passive training induce sharpening of automatic acoustic processing, perhaps bootstrapping and fine-tuning the creation of well-defined acoustic maps (Kilgard et al., 2001; Schreiner and Winer, 2007; Woods et al., 2009). Another candidate and probably convergent mechanism for the AEx group may be improved allocation of attentional resources (Kauramäki et al., 2007; Schreiner and Winer, 2007) during mapping. The significant amplitude differences seen for the AEx group on the P2 peak might be interpreted as an early attention/vigilance effect that signals involuntary allocation of attention toward an environmental stimulus (Escera et al., 2000), reflecting increased levels of “perceptual vigilance” (Ortiz-Mantilla et al., 2010). Although some facilitation of acoustic mapping was seen in the PEx group, a greater effect was observed in the AEx infants. Attention/vigilance, even at this early age, may confer a substantial advantage. It has also been shown in adults that focusing attention on salient information drives neural dynamics and that selective attention increases both gain and feature selectivity of the human auditory cortex (Kauramäki et al., 2007; Woods et al., 2009).
Given the many animal studies that demonstrate continuous modification of developing cortical representations by the sensory environment (Kilgard et al., 2001; Woods et al., 2009; Froemke and Jones, 2011), we believe that the changes seen here may be a manifestation of sound exposure-based plasticity that is typical of this early developmental period. The human neocortex is particularly attentive to statistically distributed acoustic cues during early critical periods, when cortical networks are organizing around salient features of the sensory environment (Saffran et al., 1996; Maye et al., 2002). Precisely targeted nonlinguistic acoustic experience that focuses the infant's attention on linguistically relevant environmental cues may facilitate neural plasticity in this early developmental period (Kauramäki et al., 2007; Kuhl, 2010; Reed et al., 2011; Ranasinghe et al., 2012). Thus, we used nonlinguistic acoustic cues to provide a universal, less language-specific input and to facilitate generalization, while minimizing perceptual-narrowing effects.
For the AEx infants, significantly faster and more precise responses to both familiar and novel stimuli, as measured by ERPs, were elicited compared with naive age-matched control subjects. The PEx group showed significant enhancement in processing speed in several analyses compared with the NC group. However, the PEx group never outperformed the AEx infants and often did not differ from the NC group. Morphology of the waveforms is also seen to reflect a more mature topography, with the emergence of waveform complexity in the AEx infants. Although these morphological profiles look similar to those generated by older children, they are also age appropriate in important ways, including larger amplitude and slower latencies typically seen when comparing younger to older children.
Comparison of before and after profiles illustrates a more “optimal processing profile” for the AEx infants, based on published results of longitudinal predictive studies and maturational profiles of EEG/ERP (Guttorm et al., 2005; Benasich et al., 2006; Choudhury and Benasich, 2011). Components most favorably altered include the P1, P2, and N2*. The N2* has been the best predictor of linguistic outcome from the infancy period through 4 years of age (Benasich et al., 2006; Choudhury and Benasich, 2011). Across previous studies, children who demonstrated infant electrophysiological profiles similar to those shown by the AEx group were most likely to later show superior standardized language outcomes, whereas those with longer N2* latencies and more immature topography had significantly poorer language outcomes (Benasich et al., 2006; Choudhury and Benasich, 2011).
As noted above and shown in Figure 5, a and b, the morphology of the multideviant standard also differed between the age-matched 7-month-old groups at every point across the waveform, including the P1 and N1 peaks. Repetition of the standard stimulus induces a neural representation of the stimulus reiterations, extracted from the acoustic signal, which is then maintained in auditory memory, providing a basis for continuous comparisons with subsequent acoustic input. Formation of this memory trace or “sensory map” is critical as it allows efficient discrimination of subsequent deviant stimuli. Our finding that the AEx infants can more quickly and efficiently process the multideviant standard implies that active experience generates a “more accessible” representation and further suggests that interactive auditory experience induced experience-dependent changes in sensory memory (Näätänen and Winkler, 1999; Benasich et al., 2006; Polley et al., 2006).
The ability of the brain to decode incoming speech is dependent upon the accurate perception of rapid acoustic changes (Eimas et al., 1971; Jusczyk et al., 1980; Kraus et al., 1996; Benasich and Tallal, 2002). Both the precision of biological encoding of sound and the “stability” of the neural representation throughout the auditory system are important contributors to accurate acoustic mapping (Banai et al., 2009; Hornickel et al., 2009; Hornickel and Kraus, 2013; Centanni et al., 2014). Difficulties in biological encoding can result in distorted neural boundaries, unstable representation of sound, and degraded categorical and phonemic representations (Werker and Tees, 1987; Tallal, 2000; Hornickel et al., 2012; Centanni et al., 2014). Such unstable and perhaps overinclusive representations lead to difficulty in decoding and identifying speech as well as nonspeech transitions in the domain of tens of milliseconds that is highly associated with language impairments in older children and adults (Kraus et al., 1996; Tallal, 2004; McArthur and Bishop, 2005; Oram Cardy et al., 2005; Hornickel et al., 2009; Hornickel et al., 2012).
In previous studies, we have shown that the ability to accurately process such rapid acoustic change can be reliably assessed in infancy, and that infant nonlinguistic acoustic processing abilities robustly predict later language (Benasich and Tallal, 2002; Benasich et al., 2006; Choudhury and Benasich, 2011). Thus, we believe that the results reported here have significant implications for typical as well as atypical language development. As hypothesized, only faster-rate stimuli elicited widespread and significant changes in ERPs compared with the naive control group. This may reflect plasticity in response mechanisms for phonemic characteristics that share the same temporal resolution, such as consonant-vowel transitions. However, a number of animal studies have also shown a rate-specific effect as a result of acoustic exposure and experience (Threlkeld et al., 2009; de Villers-Sidani et al., 2010), so these results need further exploration.
This study was tightly focused on the impact of experience on acoustic processing/mapping; however, there is much still to be accomplished in this domain. Although the effects seen here are quite robust, limitations included a relatively small sample and infants that were screened to exclude those at familial risk for language learning impairment (LLI). Thus, an important next step will be to recruit a much larger sample of typically developing children and to include those at higher risk for LLI. As we follow this sample longitudinally, we also hope to investigate the ongoing impact of active auditory experience on EEG/ERP as well as subsequent language and cognitive outcomes. Moreover, it will be important to examine additional longitudinal manipulations using other modalities (e.g., visual, auditory-visual) as well as the impact of individual differences on response to auditory experience.
In summary, active exposure during early infancy to nonspeech stimuli containing linguistically relevant acoustic cues appears to confer a significant acoustic processing advantage when compared with passive exposure or maturation alone. Specifically, such experience appears to facilitate neural plasticity and more efficient sensory processing during the developmental period when infants are constructing their sensory maps. Further exploration of experience-dependent neural mechanisms underlying acoustic mapping will provide the opportunity to identify and characterize the earliest precursors and biological markers of normative and disordered language processing. Precisely targeted support of emergent plasticity in acoustic mapping for infants at higher risk for language disorders as a function of a positive family history may provide a new and powerful avenue for amelioration and perhaps prevention of such disorders.
Footnotes
This research was supported by the Elizabeth H. Solomon Center for Neurodevelopmental Research with additional funding from a Rutgers University Board of Trustees Excellence in Research Award to A.A.B.
The authors declare no competing financial interests.
Correspondence should be addressed to April A. Benasich, Center for Molecular and Behavioral Neuroscience, Rutgers, The State University of New Jersey–Newark, Newark, NJ 07102. benasich@andromeda.rutgers.edu