Neural Correlates of Perceptual Learning in the Auditory Brainstem: Efferent Activity Predicts and Reflects Improvement at a Speech-in-Noise Discrimination Task

An extensive corticofugal system extends from the auditory cortex toward subcortical nuclei along the auditory pathway. Corticofugal influences reach even into the inner ear via the efferents of the olivocochlear bundle, the medial branch of which modulates preneural sound amplification gain. This corticofugal system is thought to contribute to neuroplasticity underlying auditory perceptual learning. In the present study, we investigated the involvement of the medial olivocochlear bundle (MOCB) in perceptual learning as a result of auditory training. MOCB activity was monitored in normal-hearing adult listeners during a 5 d training regimen on a consonant–vowel phoneme-in-noise discrimination task. The results show significant group learning, with great inter-individual variability in initial performance and improvement. As observed in previous auditory training studies, poor initial performers tended to show greater learning. Strikingly, MOCB activity measured on the first training day strongly predicted the subsequent amount of improvement, such that weaker initial MOCB activity was associated with greater improvement. Moreover, in listeners that improved significantly, an increase in MOCB activity was observed after training. Thus, as discrimination thresholds of listeners converged over the course of training, differences in MOCB activity between listeners decreased. Additional analysis showed that MOCB activity did not explain variation in performance between listeners on any training day but rather reflected an individual listener's performance relative to their personal optimal range. The findings suggest an MOCB-mediated listening strategy that facilitates speech-in-noise perception. The operation of this strategy is flexible and susceptible to training, presumably because of task-related adaptation of descending control from the cortex.


Introduction
There is increasing evidence that the adult auditory cortex (AC) is a dynamic and adaptive processing center, in which descending "top-down" influences play as important a role as ascending "bottom-up" activations in shaping neural sound representations (Gilbert and Sigman, 2007; Scheich et al., 2007). This has been demonstrated, in particular, in studies of auditory perceptual learning, in which long-term neural changes have been observed in the adult AC of both animals (Bao et al., 2004; Polley et al., 2006) and humans (Alain et al., 2007; van Wassenhove and Nagarajan, 2007) after intensive auditory training. Current models propose that perceptual learning in adults depends strongly on top-down influences such as attention, reward, and task relevance (Gilbert and Sigman, 2007; Keuroghlian and Knudsen, 2007).
Top-down influences do not terminate at the AC but also extend from the AC toward subcortical nuclei (Winer, 2005; Palmer et al., 2007) via the extensive corticofugal system (Zhang and Suga, 2000; Zhou and Jen, 2000). Corticofugal influences reach even into the inner ear (Perrot et al., 2006) via the efferents of the medial olivocochlear bundle (MOCB) (Guinan, 2006), which originate from the brainstem and terminate inside the cochlea, where they modulate preneural amplification gain. It is thought that the corticofugal system contributes to learning-related plasticity by forming feedback circuits that initiate and reinforce altered neural sound presentations along the central auditory pathway. Banai et al. (2005) have described a link between phonological processing in the brainstem and measures of speech understanding and literacy in children with language-based learning problems, who show particular difficulties with speech-in-noise perception. In this population, intensive auditory training has been shown to reduce noise degradation of neural responses to speech sounds in the brainstem (Russo et al., 2005) and to increase MOCB activity (Veuillet et al., 2007), concomitant with improvement in speech perception. In adult listeners, however, there is as yet no direct evidence of learning-related auditory brainstem plasticity, although differences in brainstem encoding of linguistic pitch (Wong et al., 2007) and musical and speech tokens (Musacchia et al., 2007), as well as in MOCB activity (Perrot et al., 1999), have been observed between professionally trained musicians and normal controls. The MOCB has been implicated in speech-in-noise perception not only in children (Kumar and Vanaja, 2004) but also in adults (Giraud et al., 1997). This raises the question whether the MOCB could also play a role in training-induced improvement in speech-in-noise perception in adult listeners.
It is thought that, in adult perceptual learning, top-down modulation becomes particularly important when the stimulus input is degraded by noise, with the locus of learning-related plasticity moving downstream to enhance signal-to-noise ratios (SNRs) at the input level (Ahissar and Hochstein, 2004). In the present experiment, we investigated the role of top-down influences on MOCB function in perceptual learning of speech-in-noise processing in adult listeners. Normal-hearing adults were trained for 5 d on a monaural consonant-vowel (CV) phoneme-in-noise discrimination task, and MOCB activity was measured daily.

Participants
Sixteen healthy female volunteers (mean ± SD age, 22 ± 4 years) participated in the main study and underwent 5 d of training and testing. Eight additional volunteers (female, mean ± SD age, 21 ± 1.9 years) participated as an untrained control group and were tested only on days 1 and 5.
Before testing on the first day, all participants were screened for normal audiological status using otoscopy, tympanometry, and acoustic reflexes (both using GSI-33 Middle Ear Analyzer; Grason-Stadler, Milford, NH) and pure tone audiograms (Kamplex KC50; Interacoustics, Assens, Denmark). Normal hearing was defined as hearing thresholds better than 20 dB normal hearing level (HL), bilaterally, at 0.5, 1, 2, and 4 kHz. Only participants with ear canal pressure values between −100 and +50 daPa and middle-ear compliance values between 0.2 and 2.5 ml were included. Additional screens were applied using questionnaires to ensure participants spoke English as a first language and showed dominant right-handedness according to the Edinburgh inventory (Oldfield, 1971).
During all audiometric testing and experimental sessions, participants were seated upright in a comfortable chair inside a dimly lit sound-attenuating booth (Industrial Acoustics Company, Winchester, UK), facing a monitor placed directly behind the booth window. The booth window was blacked out, and participants were monitored throughout the experiment using a webcam. Sensation thresholds for all stimuli were obtained on the first day using a 10 dB-down, 5 dB-up procedure, in which two consecutive ascending correct detections were taken to indicate threshold.
Written and informed consent was obtained from the participants, who were paid for taking part. All recordings were performed in accordance with the guidelines of the Declaration of Helsinki and were approved by the South and West Local Research Ethics Committee.

Training
Stimuli. The training task used a continuum of speech sound stimuli between two naturally spoken CV syllables (/bee/ and /dee/, 335 ms duration), taken from a speech discrimination training package. The CV syllables were spoken by an adult male and formed the endpoints of a 96-step continuum of stimuli, which was generated using linear interpolation coding of the consonant portion of the syllable (/b/ to /d/), with 11 free parameters in the spectral and amplitude domains. The continuum was designed to make speech sounds increasingly hard to discriminate as their mutual distance within the continuum decreased. The stimuli were embedded in a monaural continuous broadband noise generated using Adobe Audition (version 1.5; Adobe Systems, San Jose, CA) and presented to the right ear via the personal computer soundcard through TDH 39 headphones. The broadband noise was presented at 40 dB sensation level (SL). The CV stimuli were embedded in the broadband noise at an SNR of 10 dB, based on the root-mean-square amplitude.
Task. The participants performed a three-interval, two-alternative forced-choice AXB discrimination task. In each trial, three stimuli from the CV continuum were presented at intervals of 500 ms. Participants were asked to indicate whether X was identical to A or to B by pressing either a left or a right button on a gamepad (Nostromo n50 SpeedPad; Belkin, Compton, CA) using the thumb of the left and right hand, respectively. Participants were instructed to fixate visually on a central box displayed on the monitor throughout the task. Directly above the box, three black circles appeared during each trial to indicate stimulus timing. Once all three stimuli had been presented, all three circles were displayed, indicating to the participant that a response should be given. Inside the central box, immediate visual feedback was given after the response to indicate whether it was correct (green tick) or incorrect (red cross). Feedback was displayed for 1 s. New trials were initiated 2 s after a response was given or 10 s after the last stimulus if no response was given, although the latter occurred very rarely (<1% of trials).
Threshold procedure. The A and B stimuli used in the AXB task were chosen from symmetrically opposite points along the continuum. A three-down, one-up staircase procedure was used to estimate the 80% correct discrimination threshold, expressed in distance between the A and B sounds within the continuum. The staircase was terminated after 10 reversals or 60 trials. In the first trial, A and B were taken from the two endpoints of the /b/ to /d/ continuum, corresponding to sound numbers 1 and 96. After three consecutive correct responses, the distance between A and B was decreased by 20 sound numbers by increasing and decreasing by 10 the sound number of A and B, respectively. This step size was used for the first three reversals. For the subsequent three reversals, the step size was halved to 10, and, for the last four reversals, the step size was 6. This protocol was based on pilot experiments that were performed to establish reliable and convergent staircases. On average, a single staircase track lasted ~5 min.
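The staircase rule described above can be sketched in a few lines of Python. This is an illustrative reading of the procedure, not the software used in the study: the `respond` callback, the function names, the clamping of A and B at the ends of the continuum, and the choice to switch step size immediately after the third and sixth reversals are all assumptions introduced here.

```python
N_SOUNDS = 96  # continuum runs from sound number 1 (/b/) to 96 (/d/)


def step_size(n_reversals):
    """Change in A-B distance (in sound numbers) after n_reversals reversals."""
    if n_reversals < 3:
        return 20
    if n_reversals < 6:
        return 10
    return 6


def run_staircase(respond, max_reversals=10, max_trials=60):
    """Run one three-down, one-up track and return the reversal distances.

    respond(a, b) is a hypothetical callback that returns True when the
    listener answers the AXB trial with endpoints a and b correctly.
    """
    a, b = 1, N_SOUNDS
    correct_run = 0
    last_direction = None
    reversals = []
    trials = 0
    while trials < max_trials and len(reversals) < max_reversals:
        trials += 1
        distance = b - a
        if respond(a, b):
            correct_run += 1
            if correct_run < 3:
                continue  # need three consecutive correct responses
            direction = "down"  # make the task harder
        else:
            direction = "up"  # one error makes the task easier
        correct_run = 0
        if last_direction is not None and direction != last_direction:
            reversals.append(distance)
        last_direction = direction
        half = step_size(len(reversals)) // 2  # A and B each move half a step
        if direction == "down":
            a, b = a + half, b - half
        else:
            a, b = a - half, b + half
        # Assumption: clamp to the continuum and never let A pass B.
        a, b = max(1, a), min(N_SOUNDS, b)
        if a > b:
            a = b = (a + b) // 2
    return reversals


# Simulated listener who is correct whenever the A-B distance is at least 40.
track = run_staircase(lambda a, b: (b - a) >= 40)
```

With this idealized listener the track stops after 10 reversals, and the last five reversal distances average 42.6 sound numbers, close to the simulated limit of 40.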
Contralateral noise condition. A direct link between speech-in-noise processing and MOCB activity has so far been shown only when monaural task performance was found to improve during contralateral noise stimulation (Giraud et al., 1997; Kumar and Vanaja, 2004). This improvement was found to be correlated with measures of MOCB activation in the ipsilateral ear by contralateral noise. It may be that the MOCB plays an active role in speech-in-noise perception only under task conditions in which contralateral or binaural noise is present. Furthermore, the physiological measure of MOCB activity in this experiment and in the studies mentioned above uses contralateral noise to activate the MOCB (see below). It is possible that this measure will reflect learning-related effects on MOCB activity only when learning occurs in the presence of contralateral noise.
To evaluate whether the interaction between our measure of MOCB activity and behavioral improvement depends on contralateral noise stimulation during the task, 8 of the 16 training participants received broadband noise to the contralateral ear continuously throughout the training sessions. It was ensured that the ipsilateral and contralateral noises were not correlated, because this would have evoked central unmasking effects unrelated to MOCB activation. Contralateral noise was taken from the masking channel of a Kamplex KC50 audiometer and was presented through the left headphone at 40 dB SL.
Training regimen. Participants were trained on 5 consecutive days (Monday to Friday) at fixed times of the day (seven participants attended morning sessions, eight attended afternoon sessions, and one attended in the early evening). Each training session took ~1 h, including short breaks.
During each session, participants performed three blocks of staircases, with 5 min rests between blocks. Each block comprised three tracks, which took ~5 min each, with 30 s rests between tracks. Before the very first training session, a fixed set of instructions was given to each participant to explain the task, and a practice run of 10 trials was performed by the participant in which the difference between the sounds to be discriminated was maximal. This practice run was performed in the contralateral noise condition (on/off) that the participant had been assigned to beforehand. Participants were not informed beforehand what sounds they would be hearing, nor that these would be speech syllables.
Training analysis. For each staircase, the discrimination threshold was estimated as the arithmetic mean of the last five reversals, which were expressed as the percentage distance between two sounds within the continuum compared with the maximal distance, that is, between the two endpoints. Thus, if A and B were at the endpoints (1 and 96), the distance was 100%, and if A and B were both sound number 48, the distance was 0%. Nine thresholds were obtained on each day (three blocks of three tracks each day), thus yielding 45 thresholds over the entire training period. ANOVA was used to assess training effects and their interaction with independent variables. Huynh–Feldt-corrected values of significance are reported when appropriate. Regression lines were fitted to individual learning curves of discrimination threshold as a function of track number from the fourth track to the end of the entire training period (track 45). The first three track thresholds were excluded from the regression analysis because their arithmetic mean was used as a measure of the participants' initial performance. An individual was considered to show significant learning if the regression fit showed a p value <0.05 and the slope of the regression line was negative.
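The threshold and learning-slope calculations described above can be illustrated with a short Python sketch. The function names are hypothetical, the percentage scale assumes that the endpoint distance of 95 sound numbers (96 minus 1) corresponds to 100%, and the significance test on the regression slope is omitted; this is a sketch of the arithmetic, not the analysis code used in the study.

```python
from statistics import mean

MAX_DISTANCE = 95  # sound number 96 minus sound number 1


def percent_distance(distance):
    """Express an A-B distance (in sound numbers) as % of the maximum."""
    return 100.0 * distance / MAX_DISTANCE


def track_threshold(reversal_distances, n_last=5):
    """Threshold of one track: mean of the last n_last reversals, in %."""
    return mean(percent_distance(d) for d in reversal_distances[-n_last:])


def initial_performance(thresholds):
    """Mean of the first three track thresholds."""
    return mean(thresholds[:3])


def learning_slope(thresholds, first_track=4):
    """Least-squares slope of threshold against track number.

    Track numbers start at 1; tracks before first_track are excluded.
    """
    xs = list(range(first_track, len(thresholds) + 1))
    ys = thresholds[first_track - 1:]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical listener whose threshold falls by 0.5% per track over 45 tracks.
example_thresholds = [80.0 - 0.5 * t for t in range(1, 46)]
example_slope = learning_slope(example_thresholds)
```

A negative slope indicates improvement, because thresholds are expressed as the smallest discriminable distance along the continuum.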

MOCB measurements
Contralateral suppression of evoked otoacoustic emissions. MOCB activity in the right ear was evaluated immediately after each training session by measuring the suppression of evoked otoacoustic emissions (EOAEs) by contralateral noise (for in-depth review, see Guinan, 2006). In brief, EOAEs are small acoustic byproducts of the cochlear amplification mechanism, which can be detected in the ear canal in response to sound stimulation (Kemp, 1978). Activation of the MOCB by contralateral noise reduces the gain of cochlear amplification, which leads to a suppression of EOAE amplitude (Collet et al., 1990). This amplitude suppression decreases with increasing EOAE eliciting stimulus level, such that MOCB activation produces a linearization of the characteristic compressive nonlinear EOAE input/output (I/O) function (Veuillet et al., 1996).
In this experiment, we recorded EOAEs using clicks, presented at two stimulus levels (see below), in both the presence and absence of contralateral broadband noise at 40 dB SL. The reduction of click-evoked otoacoustic emissions (CEOAE) amplitude ("amplitude suppression") and the increase in I/O slope ("I/O suppression") in the presence of contralateral noise were used as indices of MOCB activity. The two measures were used in a complementary manner to identify the characteristic level-dependent contralateral suppression of EOAEs mediated by the MOCB.
CEOAE recordings. CEOAEs were recorded at a rate of 50/s at 30 and 40 dB SL. Recordings were made using an in-house EOAE system (Thornton et al., 1994), consisting of an external digital signal processing (DSP) board, controlled by a personal computer using customized software written in Visual Basic, and connected to a general purpose ILO Otodynamics (Hatfield, UK) OAE probe. Clicks of 100 μs duration were generated on the DSP board and fed into the loudspeaker of the probe at 30,000 samples/s via a sigma-delta digital-to-analog converter. The EOAE response picked up by the probe microphone was fed back into the DSP board and digitized at 30,000 samples/s. A rejection level of 5000 μPa was used over the time window between 6 and 16 ms after the click. Rejection rates varied between participants but were kept below 30% for all recordings. CEOAEs were averaged over 800 clicks per replicate.
CEOAE recordings were interleaved with EOAE recordings, which used maximum length sequences (MLS) of clicks to evoke responses (Thornton, 1993; Hine et al., 2001). The resulting MLS OAEs are similar to CEOAEs but are obtained using a very high click rate, which produces an essentially noise-like stimulus. The reason for including MLS OAEs was to investigate the effect of temporal excitation patterns on MOCB interaction with the cochlear amplification mechanism. Because the MLS OAE results did not have any direct bearing on the findings presented here, these recordings will not be discussed further, except to note that no learning effects were observed on MOCB measures based on MLS OAE recordings.
Contralateral noise. At each CEOAE stimulus level, four replicate waveforms were recorded consecutively, with contralateral noise presented at 40 dB SL to the left ear during the second and the fourth replicate, using the same equipment and TDH 39 headphones as used in the training task, with the right ear headphone silent and placed comfortably behind the ear.
Analysis of contralateral suppression of EOAEs. All data analysis was performed in Matlab (version 6.1; MathWorks, Natick, MA) and SPSS (version 12.0; SPSS, Chicago, IL). The CEOAE waveforms were filtered between 250 and 6000 Hz and analyzed over the 6–16 ms time window. Two replicates were obtained at each combination of level and contralateral noise condition (on/off). From these, the following parameters were calculated. Response amplitude was estimated as the real part of the cross spectrum obtained between the two average waveforms. Amplitude suppression was defined as the change in response amplitude (in decibels) in the presence of contralateral noise and was calculated for each level separately. I/O slope was estimated as the growth (decibels/decibels) in response amplitude between the two stimulus levels, divided by the level difference. I/O suppression was defined as the change in the I/O slope in the presence of contralateral noise.
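As a worked illustration of the two suppression indices, the Python sketch below starts from response amplitudes that are already expressed in decibels. The function names, the sign convention (suppression reported as a positive reduction in amplitude), and the example amplitudes are assumptions made for illustration only; the cross-spectrum step that produces the amplitudes is not reproduced here.

```python
def amplitude_suppression(amp_quiet_db, amp_noise_db):
    """Reduction in CEOAE amplitude (dB) caused by contralateral noise."""
    return amp_quiet_db - amp_noise_db


def io_slope(amp_low_db, amp_high_db, level_low_db=30.0, level_high_db=40.0):
    """Growth in response amplitude per dB of stimulus level (dB/dB)."""
    return (amp_high_db - amp_low_db) / (level_high_db - level_low_db)


def io_suppression(quiet, noise):
    """Increase in I/O slope with contralateral noise.

    quiet and noise are (amplitude at 30 dB SL, amplitude at 40 dB SL)
    pairs, in dB, without and with contralateral noise.
    """
    return io_slope(*noise) - io_slope(*quiet)


# Hypothetical amplitudes in dB, not data from the study.
quiet = (10.0, 14.0)  # without contralateral noise
noise = (8.5, 13.0)   # with contralateral noise

supp_30 = amplitude_suppression(quiet[0], noise[0])  # 1.5 dB
supp_40 = amplitude_suppression(quiet[1], noise[1])  # 1.0 dB
io_supp = io_suppression(quiet, noise)               # about 0.05 dB/dB
```

Because suppression is larger at the lower stimulus level, the I/O function is steeper with contralateral noise, which is the linearization that the I/O suppression index is meant to capture.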
For two participants, the CEOAE recordings on one particular day were missing because of equipment failure, in which case the suppression values for that day were replaced by the average of the values on the other 4 d. Additional analysis excluding these two participants altogether, or excluding all data from that day, produced highly equivalent data trends with similar significance levels.
Middle-ear muscle reflex. An important consideration in the measurement of contralateral suppression of EOAEs to index MOCB activity is activation of the contralateral acoustic reflex, which can also produce suppression of EOAEs. This reflex causes contraction of the middle-ear muscle (MEM) in response to high-level contralateral noise stimulation. MEM contraction produces a change in middle-ear impedance, which suppresses all sound pressure changes in the ear canal and affects both forward and backward transmission of sound between the ear canal and the cochlea.
To avoid contribution of the MEM reflex to the contralateral suppression of EOAEs in our experiment, we ensured that the contralateral noise level was at least 15 dB below the MEM reflex threshold for each participant. Such levels have been shown not to induce a MEM contribution to contralateral suppression (Giraud et al., 1995; Sun, 2008). The MEM reflex threshold was measured on the first training day using standard clinical procedures on a GSI 33 tympanometer. A 226 Hz probe tone was used to monitor middle-ear impedance, and contralateral noise was presented through an insert earphone for 1.5 s. The reflex threshold was defined as the lowest contralateral noise level, in 5 dB steps, that produced a perceptible change in middle-ear impedance, as shown on the tympanometer display and judged by an experienced observer.
On average, the MEM reflex threshold in our participants was 77.5 ± 9 dB normal HL (mean ± SD) and ranged between 60 and 95 dB normal HL. The difference between the contralateral noise level, presented at 40 dB SL, and MEM reflex threshold estimated for each individual participant was 25 ± 8 dB (mean ± SD) on average and ranged from 15 to 40 dB. Based on these measurements, we ruled out the possibility of MEM reflex activation during contralateral suppression measurements. This line of reasoning is supported by findings from a previous study in our laboratory, which used identical procedures and equipment to compare contralateral suppression of CEOAEs in the healthy and operated ears of vestibular neurectomy patients, in whom the OCB had been severed on one side (Hine et al., 1997). The study found that contralateral suppression was absent in the operated ear but preserved in the healthy ear, which confirms that the contralateral suppression measured using this methodology was solely mediated by the MOCB. Moreover, the absolute contralateral noise level used in our study was 52.5 ± 8 dB sound pressure level (SPL) (mean ± SD). Previous work, which directly investigated MEM reflex contribution to contralateral suppression, has shown that, for noise levels up to 60 dB SPL, the MEM reflex does not contribute to contralateral suppression of EOAEs in normal ears (Buki et al., 2000).

Control experiment
The control group was tested only on 2 d, corresponding to days 1 and 5 of the training regimen. These participants performed only one block of three tracks of the discrimination task on each day. Contralateral suppression was measured immediately after task performance on both days. Both behavioral and contralateral suppression measurements followed identical procedures to those described above for the training group. There was no difference between the control group and the training group in sensation threshold for the ipsilateral or contralateral noise (p > 0.1), in the MEM reflex threshold (p > 0.4), or in difference between MEM threshold and contralateral noise at 40 dB SL (p > 0.4). Equally, sensation thresholds for the CEOAE stimuli were the same (p > 0.1). The results of the control group are not included in the main analyses but are presented in a separate analysis to validate the effect of training on both behavioral and physiological measures.

Behavioral results
Auditory training significantly improved group performance at the phoneme-in-noise discrimination task, as is illustrated in Figure 1A, which shows the average discrimination threshold as a function of training day. ANOVA with training day (5), block (3), and track (3) as within-subject factors and training condition (contralateral noise on/off) as between-subjects factor confirmed that the improvement with training day was significant (F = 11.2; p < 0.001). The greatest improvement occurred over the first 3 training days. Bonferroni-corrected comparisons showed a significant difference (Δ) in threshold between day 1 versus day 3 (Δ = 9%; p = 0.008), day 1 versus day 4 (Δ = 11%; p = 0.007), and day 1 versus day 5 (Δ = 12%; p = 0.009). On average, performance worsened slightly for successive tracks within a block (F = 4.2; p = 0.029; data not shown), with Bonferroni-corrected comparisons showing a small but significant increase in threshold from track 1 to track 3 (Δ = 3%; p = 0.009). Within-session improvement occurred only for the first track of each block and was significant only on day 1 (F = 3.6; p = 0.04) and day 2 (F = 3.4; p = 0.05) (data not shown). The presence of contralateral noise during training did not affect the amount of improvement (F < 1; p > 0.4). Contralateral noise did, however, lead to higher average thresholds throughout (data not shown). Independent t tests found this difference to be nonsignificant except for the first (of 45) track threshold obtained (Δ = 21%; p = 0.021).
In the individual learning curves, a large amount of variability was observed in both initial performance and subsequent amount of learning. Linear regression fits produced a significant (p < 0.05) negative slope in 8 of the 16 participants, indicating robust learning. These participants were classified as "learners" (for an example, see Fig. 1B). The remaining eight participants were classified as "nonlearners." Of these, seven participants showed learning curves with a small and nonsignificant (p > 0.3 for n = 5; p > 0.1 for n = 2) negative slope (Fig. 1C), and one participant produced a significant (p = 0.01) positive slope (0.14), indicating significant worsening of performance over the course of training. Figure 1D shows the distribution of learning slopes for the learner (mean ± SD, −0.24 ± 0.08) and nonlearner (−0.02 ± 0.07) groups. Contralateral noise during training did not affect individual learning slopes: no significant difference was observed in average learning slope between participants trained with or without contralateral noise (data not shown; p > 0.8). The ratio of learners versus nonlearners was 5:3 and 3:5 in training groups with and without contralateral noise, respectively.
A strong relationship between initial performance and subsequent amount of learning was observed, such that better initial performance was associated with less improvement. This is illustrated in Figure 1E, which plots initial performance against the fitted learning slope, showing a highly significant negative correlation (r = −0.82; p < 0.001). In this plot, the one participant who showed significant worsening was considered an outlier. This participant was not an outlier on any measure related to the audiological screen or questionnaire, apart from being the oldest of the participants (28 years), but did report tiredness and lack of concentration on the first 2 training days because of sleep deprivation. Based on her unusual learning pattern, we excluded this participant from additional analysis.
Although learners thus initially performed worse than nonlearners, over the course of training, the performance of the two groups converged, as the learners improved. This is illustrated in Figure 1F, which shows the average discrimination threshold as a function of training day for the two groups. Although ANOVA confirmed a difference in the average learning curve of the two groups (F = 6.7; p < 0.001), as expected, independent t tests found that the difference in threshold between the learner and nonlearner groups was significant only on the first day (Δ = 19.3%; t = 2.5; p = 0.026). It should be noted that learners thus did not at any stage perform better than nonlearners, but rather that they reached the common average level of performance only after training.
We compared improvement in performance from day 1 to day 5 between the trained participants (n = 16) and the untrained control group (n = 8) using ANOVA. In this analysis, the average threshold achieved in the first and last block performed by the trained group was compared with that in the two blocks performed by the untrained control group on the equivalent of day 1 and day 5. There was a significantly (F = 4.9; p = 0.037) greater improvement in the trained versus the untrained group. Separate paired t tests showed a significant improvement in the trained group (Δ = 18.4%; p < 0.001) and a small, near-significant improvement in the untrained control group (Δ = 5.7%; p = 0.052). This improvement in the untrained group was more significant, but smaller in size, than that observed in the nonlearner group (Δ = 10.3%; p = 0.095) and much smaller than that in the learner group (Δ = 25.6%; p = 0.001). It is not possible to know what the proportion of potential learners and nonlearners was in the control group. However, initial performance was clearly predictive of subsequent learning. There was no significant difference (p = 0.9) in average initial performance (first block) between the trained (mean ± SEM, 61.7 ± 4.4%) and the untrained (62.6 ± 2.9%) groups. Moreover, the average initial performance of the untrained group lay approximately halfway between that of the learners (72.4 ± 5.1%) and the nonlearners (49.6 ± 3.8%). From this, we tentatively infer that the untrained group of control participants contained a mixture of learners and nonlearners and thus that the substantially smaller improvement observed in the untrained group is attributable to the lack of intervening training.

Physiological results
Contralateral noise produced a significant reduction in amplitude of CEOAEs in the trained group on each of the 5 training days at both stimulus levels, as confirmed by paired t tests (all p < 0.001). The size of the contralateral suppression (mean ± SEM, 30 dB SL, 1.38 ± 0.14 dB; 40 dB SL, 1.05 ± 0.16 dB) was in line with that reported in previous work on MOCB-mediated suppression of CEOAEs (Veuillet et al., 1991; Giraud et al., 1995; Maison et al., 2001). Moreover, the suppression showed the characteristic and significant (F = 17.8; p = 0.001) decrease with increasing stimulus level (Veuillet et al., 1996), associated with a significant increase in I/O slope (p < 0.05 on each training day), corresponding to an I/O suppression of 0.041 ± 0.007 dB/dB (mean ± SEM), when averaged over all training days.
When amplitude suppression was examined separately at the two levels, a positive trend with training day was observed at 30 dB SL (Fig. 2A, filled symbols) but not 40 dB SL (data not shown). Moreover, a similar positive trend was observed for the I/O suppression (Fig. 2B, filled symbols). However, ANOVA showed this increase to be nonsignificant for both these suppression measures (F < 1; p > 0.4), suggesting a degree of variability within the data that may have reduced the significance. To examine possible sources of variability, we examined the effect of training condition (contralateral noise on/off) and learning outcome on the CEOAE suppression as a function of training day. No difference in suppression measures was observed on any training day between participants trained with or without contralateral noise (independent t tests, p > 0.3). Conversely, a notable difference was observed between participants that showed robust learning (learners) and those that showed no significant improvement (nonlearners). This difference is illustrated in Figure 2, C and D, where it may be observed that the learners showed much weaker suppression on average than nonlearners on the first training day. Independent t tests showed this difference to be significant for both amplitude suppression at 30 dB SL (Δ = 0.8 dB; t = 2.6; p = 0.022) and I/O suppression (Δ = 0.06 dB/dB; t = 4.0; p = 0.001). Moreover, in the learner group, both suppression measures increased steadily with training day, whereas nonlearners showed no change. This difference in suppression increase was confirmed by ANOVA, which revealed a significant interaction between learner group and training day for both amplitude (F = 2.5; p = 0.05) and I/O suppression (F = 3.7; p = 0.01). Follow-up ANOVAs on the two groups separately revealed a significant increase in the learner group in both amplitude (F = 3.4; p = 0.023) and I/O suppression (F = 5.3; p = 0.003) but not in the nonlearner group (p > 0.5 for both).
The greatest increase occurred over the first 2 training days, but the increase did not become significant until the fourth and third day for amplitude and I/O suppression, respectively. Planned comparison with day 1 revealed a significantly greater amplitude suppression on day 4 (Δ = 0.63 dB; F = 7.2; p = 0.032) and day 5 (Δ = 0.78 dB; F = 8.7; p = 0.022) in the learners. Similarly, I/O suppression was significantly stronger on day 3 (Δ = 0.058 dB/dB; F = 17.8; p = 0.004), day 4 (Δ = 0.054 dB/dB; F = 12.2; p = 0.01), and day 5 (Δ = 0.085 dB/dB; F = 13.0; p = 0.009) compared with day 1.
These findings suggest an interaction between MOCB activity and training-induced improvement. We performed a number of checks to rule out other possible explanations. First, we evaluated the possibility that the difference observed between the learners and nonlearners on the first day might arise from experimental factors relating to the contralateral noise. No difference was found between the two groups in sensation thresholds for the eliciting click (p > 0.5) or for contralateral noise (p > 0.9). Moreover, there was no difference between the two groups in MEM reflex threshold (p > 0.4) or in the level difference between the contralateral noise at 40 dB SL and the MEM threshold (p > 0.4), which was 25 dB on average. These numbers imply that the MEM threshold would have had to drop by ~20 dB over the first training session in the nonlearner group to explain the first-day difference between the groups and by an equal amount over the course of training in the learner group to explain the observed changes in contralateral suppression. Such a dramatic change in MEM reflex threshold is both unlikely and unprecedented. Conversely, the size of both the initial differences, and of the subsequent changes, in contralateral suppression are highly comparable with those reported for MOCB activity in previous work (Perrot et al., 1999; Maison et al., 2001; Veuillet et al., 2007).
An additional check on MEM contributions to the contralateral suppression was performed by examining the effect of the contralateral noise on the CEOAE click artifact, which reflects the sound pressure change in the ear canal attributable to the eliciting click. Guinan et al. (2003) proposed that MEM reflex contributions can be identified through its effect on the eliciting stimulus. ANOVA found no main effect of contralateral noise (p > 0.2) on the amplitude of the click artifact, nor was there any second-order effect on the click attributable to interaction between contralateral noise and training day (p > 0.4). Moreover, these factors did not show any interaction with learner group (p > 0.3).
Last, we compared the contralateral suppression measured in the untrained control group on the equivalent of day 1 and day 5. No change was observed in the amplitude suppression at 30 dB SL (Δ = 0.05; p = 0.79) or in I/O suppression (Δ < 0.001; p = 0.96). There was no significant difference between the trained and untrained group on the first day in amplitude suppression (p > 0.2) or in I/O suppression (p > 0.9). These findings confirm that changes in MOCB activity are not observed in the absence of intervening training.
We conclude that the observed interaction between learning and contralateral suppression reflects changes in MOCB activity attributable to the auditory training regimen. The interaction is significant for amplitude suppression only at the lower CEOAE stimulus level and more significant for the I/O suppression measure. This level dependence of the interaction of the contralateral suppression measure is very similar to that observed in an earlier study in our laboratory (de Boer and Thornton, 2007) and is likely to reflect the characteristic compressive nonlinearity of the cochlear amplification mechanism. The stronger effect found for the I/O slope may result from the fact that this measure is more directly and exclusively dependent on the gain of this amplification mechanism, which underlies EOAE generation, than the CEOAE amplitude.

Correlation between physiological and behavioral parameters
Because the main difference in MOCB activity between learners and nonlearners was observed for the contralateral suppression measures on the first training day, the relationship between this measure and learning parameters was explored further. A highly significant positive correlation was found between learning slope and first-day contralateral suppression measures, as illustrated in Figure 3, A and B, for amplitude suppression (r = 0.74; p = 0.002) and I/O suppression (r = 0.79; p < 0.001), respectively. Both these suppression measures also showed a significant negative correlation with initial discrimination threshold (amplitude, r = -0.52, p = 0.048; I/O slope, r = -0.66, p = 0.007), as shown in Figure 3, C and D. As was illustrated in Figure 1E, a strong negative correlation occurred between initial performance and subsequent learning slope. This implies a three-way correlation between learning slope, initial performance, and first-day CEOAE suppression. The question arises whether each of these correlations reflects a direct relationship, or whether some may be mediated indirectly via this three-way correlation. In particular, it is of interest to determine whether initial MOCB activity, as indexed by CEOAE suppression, is directly predictive of initial performance or of learning slope, or both. This may be tested by using linear regression fits that model the first-day CEOAE suppression measures with the two behavioral parameters as independent variables (Baron and Kenny, 1986). As expected, when evaluated for each behavioral parameter separately, both the learning slope (t = 3.93; p = 0.002) and the initial performance (t = -2.18; p = 0.048) produced a good linear model for the first-day CEOAE amplitude suppression. However, when both parameters were combined as independent variables in the model, the learning slope maintained a significant contribution to the fit (p = 0.016), whereas initial performance did not (t = 0.65; p = 0.53).
Similarly, as single independent variables, both learning slope (t = 4.6; p < 0.001) and initial performance (t = -3.2; p = 0.007) significantly predicted first-day CEOAE I/O suppression, but combining the two parameters in the model abolished the contribution of initial performance (t = -0.29; p = 0.78), whereas learning slope remained significant (t = 2.4; p = 0.033). These results imply that first-day CEOAE suppression showed a direct, predictive relationship with learning slope but not with initial performance. The lack of direct association between CEOAE suppression and performance was further confirmed by a lack of correlation between performance and suppression on any subsequent training day.
A similar analysis was performed to evaluate whether initial performance independently contributed to subsequent learning slope. This analysis showed that initial performance significantly explained variation in learning slope when combined with both first-day amplitude suppression (t = -3.7; p = 0.003) and first-day I/O suppression (t = -2.67; p = 0.021). Thus, both initial performance and first-day suppression contributed independently to prediction of subsequent learning.
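The logic of the regression comparison described above can be illustrated with a minimal Python sketch. It fits first-day suppression against each behavioral parameter separately and then against both together, and reports the t statistic of each coefficient. The data are synthetic and generated only for illustration; the sample size, the variable names (`learning_slope`, `initial_perf`, `suppression`), the helper `ols_t_stats`, and the generating coefficients are all hypothetical and do not reproduce the study's measurements or the software used for the original analysis.

```python
import numpy as np

def ols_t_stats(y, predictors):
    """Ordinary least squares with an intercept.

    Returns (coefficients, t statistics); index 0 is the intercept,
    followed by one entry per predictor in the order given.
    """
    y = np.asarray(y, dtype=float)
    columns = [np.ones(len(y))] + [np.asarray(p, dtype=float) for p in predictors]
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]               # residual degrees of freedom
    sigma2 = resid @ resid / dof            # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)   # covariance of the coefficients
    t = beta / np.sqrt(np.diag(cov))
    return beta, t

# Synthetic, purely illustrative data (NOT the study's measurements).
rng = np.random.default_rng(0)
n = 15                                      # arbitrary sample size
learning_slope = rng.normal(size=n)
# Initial performance is made to covary negatively with learning slope,
# mimicking the three-way correlation discussed in the text.
initial_perf = -0.8 * learning_slope + 0.6 * rng.normal(size=n)
# In this toy model, suppression depends on learning slope only.
suppression = 0.7 * learning_slope + 0.5 * rng.normal(size=n)

_, t_slope_alone = ols_t_stats(suppression, [learning_slope])
_, t_perf_alone = ols_t_stats(suppression, [initial_perf])
_, t_joint = ols_t_stats(suppression, [learning_slope, initial_perf])

print("slope alone:       t =", round(float(t_slope_alone[1]), 2))
print("performance alone: t =", round(float(t_perf_alone[1]), 2))
print("joint model:       t(slope) =", round(float(t_joint[1]), 2),
      " t(performance) =", round(float(t_joint[2]), 2))
```

If a predictor is significant on its own but loses its contribution once the other predictor enters the model, its association with the outcome is likely to be indirect, which is the pattern reported here for initial performance.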

Discussion
Our results show that auditory training can improve perceptual discrimination of speech sounds in noise in normal-hearing adult listeners, as has been shown previously for a variety of auditory perceptual tasks (Wright and Fitzgerald, 2001; Delhommeau et al., 2002; Karmarkar and Buonomano, 2003). As observed in previous studies (Amitay et al., 2005), the amount of learning showed great inter-individual variability and was strongly dependent on initial performance, such that poor performers showed greater improvement than good performers. The striking new finding of this study is that learning outcome was both predicted and reflected by the activity of the MOCB. Listeners who improved significantly in performance over the training period (learners) initially showed weaker MOCB activity than listeners who showed no or little improvement (nonlearners). Moreover, learners showed a significant increase in MOCB activity over the course of training, whereas no changes were observed in nonlearners. Thus, as the thresholds of learners improved and approached those of nonlearners, the difference in MOCB activity between the two groups decreased.
Despite an apparent correlation between initial performance and first-day MOCB activity, additional analysis showed that MOCB activity did not directly explain variation in performance between listeners on any training day. Instead, first-day MOCB activity reflected listeners' initial performance relative to their personal maximal level and thus indicated the extent of improvement that individuals could subsequently achieve within their own range. These findings suggest an MOCB-mediated sound processing scheme or listening strategy that facilitates speech-in-noise perception and that is flexible and susceptible to training.
No difference was found between listeners trained with or without contralateral noise present in either auditory learning or MOCB changes. This suggests that the observed changes are not specific to the MOCB activation pattern during training but reflect altered top-down control arising from changes at higher levels.

MOCB and antimasking
Based on physiological data, an "antimasking" model was developed, in which MOCB activation reduces cochlear responses to continuous noise, allowing greater responsiveness to rapidly changing acoustic signals embedded in the noise (Liberman and Guinan, 1998). This antimasking mechanism has been suggested to explain the observed link between the MOCB and speech-in-noise perception. It is plausible that the MOCB-mediated listening strategy suggested by our findings corresponds to a form of antimasking. In this interpretation, learners initially perform worse because of reduced antimasking, reflected by weaker MOCB activity. Subsequent improvement arises as antimasking is enhanced by increased MOCB activity. Nonlearners, conversely, already use maximal antimasking mechanisms from the start, as a result of stronger initial MOCB activity, and thus have a reduced range for improvement.
In a study on children with learning problems, concomitant with speech-in-noise deficits, Russo et al. (2005) reported reduced noise degradation of brainstem representations of CV phonemes after intensive auditory training, concomitant with improved performance. Notably, this was observed only in a subset of children that initially showed excessive noise degradation compared with normal controls. Their findings show striking similarities to our results and are compatible with enhanced neural representations of speech sounds in noise attributable to increased MOCB-mediated antimasking after training.

MOCB and attention
A number of studies have reported effects of attention on MOCB activity (Froehlich et al., 1993; Giard et al., 1994; Meric and Collet, 1994). In particular, contralateral suppression of EOAEs was enhanced when attention was directed to the contralateral ear (Maison et al., 2001) and decreased when attention was directed toward the ipsilateral ear (de Boer and Thornton, 2007). It is possible that, in our results, MOCB activity reflected the extent or focus of attention paid by listeners to the ipsilateral sound stimuli. Increased attentional effort is generally associated with greater task difficulty. When speech sounds are degraded by noise, cortical areas related to attentional processing show increased activation (Binder et al., 2004; Scott et al., 2004), reflecting an increase in the relative importance of top-down influences in speech perception (Davis and Johnsrude, 2003; Zekveld et al., 2006). In this context, the changes in MOCB activity observed in learners may be attributable to decreased attentional effort in these listeners as their performance improved. An argument against this interpretation is that MOCB activity was measured in a passive condition after the training session, which would rule out real-time modulation by training-related attentional mechanisms during MOCB recording. However, it has been shown that corticofugal modulation of subcortical sound processing can persist for up to 3 h after the evoking cortical activation has ceased (Gao and Suga, 1998; Zhang and Suga, 2000). This suggests that corticofugal effects of attention during task performance may have persisted for some time after the training session ceased and could have affected MOCB recordings performed immediately after.
This interpretation fits in with current models of perceptual plasticity, which propose that short-term plasticity, such as related to attention, is strongly related to and precedes long-term changes in neural representations. Consolidation of the former into the latter occurs when the right conditions are met, which depend on interaction with reward and association centers and involve release of acetylcholine from the nucleus basalis (Jääskeläinen et al., 2007; Weinberger, 2007).

MOCB and categorical perception
A recent study reported increased MOCB activity in dyslexic children, as measured by changes in EOAE suppression, after intensive binaural audiovisual training on phoneme identification (Veuillet et al., 2007). Specifically, categorical perception shifted from abnormal toward the normal range after training, accompanied by increased MOCB asymmetry in favor of the right ear, as seen in typical children. Perception of speech sounds is strongly categorical, affecting phoneme discrimination judgments. Because the phonemes in our task were degraded by noise, they may not have initially been perceived as speech-like by all listeners, who were not informed what type of sounds they would be presented with. It is possible that the improvement in our good learners arose from a change from noncategorical to categorical perception, as they learned to recognize the stimuli as speech sounds. Such a change could be associated with altered cortical activation patterns, including changes in top-down influences, as reflected in MOCB activity. Although this interpretation is highly speculative, it merits additional investigation, given the strong similarities between our findings and those of Veuillet and coworkers and the suggested link between MOCB asymmetry and cerebral speech lateralization (Philibert et al., 1998; Morand-Villeneuve et al., 2005).

Long-term changes to MOCB activity
Sustained changes in MOCB activity have been reported after passive long-term conditioning with high-level binaural noise (Brown et al., 1998). It is unlikely that noise exposure is responsible for the MOCB changes in our results, because both duration and level of the noise in our experiment were much smaller than used in that study. Moreover, in our findings, changes in MOCB activity did not depend on binaural stimulation but showed a strong interaction with training outcome.
Although training-related changes in MOCB activity (Veuillet et al., 2007) and in auditory brainstem responses (Russo et al., 2005) have been reported in children with learning problems, such changes have not been shown previously in normal adult listeners. However, population differences have been reported in brainstem representations of speech tokens and musical notes (Musacchia et al., 2007; Wong et al., 2007), as well as in MOCB activity (Perrot et al., 1999), between musicians and nonmusicians, which suggest long-term effects of musical training on brainstem processing.
The MOCB changes in our study were observed using passively recorded responses to stimuli unrelated to the training task. It could be argued that nonspecific, long-term changes to low-level processing, attributable to relatively short-term and specific training, are neither likely nor useful given the changeable demands of everyday listening situations. A possible alternative is that the observed MOCB changes reflect altered corticofugal activity rather than sustained changes in afferent activation in the brainstem. Sustained changes in descending control may be specifically evoked by task performance and produce effects on MOCB activity that persist for a limited period after the training session. In this interpretation, the long-term "memory" associated with learning resides in the cortex. This explanation is compatible with current models of learning-related plasticity (Weinberger, 2007) and is specific to task and stimulus, in line with recent findings on cortical plasticity (Ohl and Scheich, 2005; Gilbert and Sigman, 2007).

Conclusions
We present evidence of sustained changes in sound-evoked brainstem activity attributable to perceptual learning in adult listeners. These results support the emerging view of the central auditory pathway as a flexible processing structure, in which descending feedback pathways play an important role in both short-term and long-term adaptive plasticity. Our findings further confirm a role for the MOCB in speech-in-noise perception and are in line with ongoing work on subcortical dysfunction in children with language-based learning problems, who show specific speech-in-noise deficits (Banai et al., 2005). Moreover, our results suggest the possibility of training-based remediation of brainstem deficits in speech processing in adult listeners. Additional studies are required to investigate how training-induced changes in MOCB activity affect speech-in-noise processing at higher stages along the central auditory pathway.