Abstract
We used magnetoencephalography to elucidate the cortical activation associated with the segmentation of spoken words in nonreading-impaired and dyslexic adults. The subjects listened to binaurally presented sentences where the sentence-ending words were either semantically appropriate or inappropriate to the preceding sentence context. Half of the inappropriate final words shared two or three initial phonemes with the highly expected semantically appropriate words. Two temporally and functionally distinct response patterns were detected in the superior temporal lobe. The first response peaked at ∼100 msec in the supratemporal plane and showed no sensitivity to the semantic appropriateness of the final word. This presemantic N100m response was abnormally strong in the left hemisphere of dyslexic individuals. After the N100m response, the semantically inappropriate sentence-ending words evoked stronger activation than the expected endings in the superior temporal cortex in the vicinity of the auditory cortex. This N400m response was delayed for words starting with the same first two or three phonemes as the expected words, but only until the first evidence of acoustic–phonetic dissimilarity emerged. This subtle delay supports the notion of initial lexical access being based on phonemes or acoustic features. In dyslexic participants, this qualitative aspect of word processing appeared to be normal. However, for all words alike, the ascending slope of the semantic activation in the left hemisphere was delayed by ∼50 msec as compared with control subjects. The delay in the auditory N400m response in dyslexic subjects is likely to result from presemantic–phonological deficits possibly reflected in the abnormal N100m response.
Children with developmental dyslexia have difficulties in reading acquisition and reaching a level of reading fluency that could be expected on the basis of their age or intelligence. Beginning readers must learn that spoken words are composed of speech sounds, phonemes, which can be represented by corresponding letters, graphemes, in written language. Phonological skills at preschool age predict later success in reading (Lundberg et al., 1980; Bradley and Bryant, 1983). Accordingly, children who have impaired phonological skills are likely to experience difficulties in reading acquisition (Bradley and Bryant, 1983; Scarborough, 1990).
Behavioral studies have shown that the discrimination of the syllables /ba/ and /da/ is impaired in dyslexic individuals. In addition, reading-impaired children are less consistent than their nonreading-impaired peers in labeling syllables on the synthetic continuum from /ba/ to /da/ (Reed, 1989). Thus, it seems that in dyslexic children the phonological categories for these speech sounds, which begin with rapid formant transitions, are broader and less sharply defined. Furthermore, the mismatch response evoked by rarely presented speech sounds /ba/ in a sequence of more often presented speech sounds /da/ has been reported to be abnormally small in dyslexic children 300–600 msec after speech sound onset (Schulte-Körne et al., 1998). Recent behavioral studies have also suggested that the perception of vowels can be impaired in dyslexia (Adlard and Hazan, 1998).
The analysis of spoken words comprises various subprocesses, such as acoustic, phonetic, phonological, semantic, and syntactic analysis, the temporal involvement of which can only be followed with time-sensitive methods like electroencephalography (EEG) and magnetoencephalography (MEG). However, the correspondence between auditory event-related potentials (ERPs) peaking in distinct time windows and the different operations involved in speech processing has still not been clearly determined. ERP studies have used the N400 paradigm in both the visual and auditory domains to tap semantically sensitive activation (for review, see Osterhout and Holcomb, 1995; Kutas and Federmeier, 2000). In their seminal study, Kutas and Hillyard (1980) used sentences with either expected or semantically inappropriate final words. The inappropriate endings evoked an N400 response, a negative deflection peaking at ∼400 msec. With conventional scalp-recorded ERPs, localizing the N400 and accurately describing its time behavior have proven difficult.
We recently used MEG to clarify the spatial and temporal pattern of semantic activation during reading (Helenius et al., 1998). Semantically inappropriate sentence-ending words evoked stronger activation than expected endings most consistently in the left superior temporal cortex. In dyslexic individuals the onset of semantic activation was delayed (Helenius et al., 1999a). Furthermore, unlike in control subjects, in dyslexic individuals the N400m response was weaker to inappropriate words that began with the expected letters. This suggests that visual word recognition may occur in atypically small sublexical units in dyslexic readers.
In the current study we elucidated the cortical location, timing, and rules in auditory word recognition. We studied both nonreading-impaired and dyslexic adults to determine whether phonological deficits associated with reading problems manifest as differences in the cortical responses elicited by naturally spoken words.
MATERIALS AND METHODS
Subjects. A total of nine nonreading-impaired adults (five females and four males) and 10 adults with a history of developmental dyslexia (five females and five males) took part in the study. The dyslexic adults were recruited from the population of the Jyväskylä Longitudinal Study of Dyslexia (JLD) (Lyytinen, 1997). The inclusion criteria used in the JLD study for selecting dyslexic individuals are: self-reported childhood and present reading and/or writing difficulties, below-normal reading and/or spelling test performance, intelligence quotient >80 (Raven et al., 1992), and dyslexia among close relatives (for details, see Leinonen et al., 2001). Individuals with a medical history of sensory or neurological abnormalities were excluded. The nonreading-impaired individuals were either spouses of the dyslexic individuals or age-matched control subjects who had no history of reading difficulties and a present reading performance within norms.
The dyslexic individuals had been tested with the standard behavioral test battery used in the JLD study within a few years of the MEG measurement (Leinonen et al., 2001). Control subjects were also tested for IQ and reading and spelling performance before the MEG recording. The subject groups did not differ in nonverbal IQ (Raven et al., 1992), but compared with control subjects the dyslexic participants were significantly slower and more error prone in reading aloud text passages, and made more errors in spelling aloud words and pseudowords presented aurally (Table 1). Compared with a normative sample of 100 nonreading-impaired adults (Leinonen et al., 2001), the dyslexic adults of the present study were also impaired in phonological awareness tasks. Dyslexic subjects succeeded in deleting a phoneme from a word on average on 7.2 of 16 trials (SD 4.4), whereas the mean of the normative sample was 13.0 (SD 3.5) (t(108) = 3.9; p < 0.0005). In a syllable reversal task the dyslexic subjects succeeded on average on 4.0 of 20 trials (SD 3.3), whereas the mean of the normative sample was 15.3 (SD 4.5) (t(106) = 8.5; p < 0.0001). The oral reading speed of every dyslexic individual was at least 2 SDs below the mean of the normative sample, and in at least one of the phonological awareness tasks 70% of the dyslexic individuals scored more than 2 SDs below the mean of the normative sample (Leinonen et al., 2001).
The behavioral profiles of control and dyslexic subjects
Materials. We used Finnish sentences with four types of final words, graded with respect to their appropriateness to the preceding sentence context (Helenius et al., 1998, 1999a). Some of the sentences were modified from the English versions used by Connolly and Phillips (1994) and Connolly et al. (1995). In the expected condition, the last word of a sentence was semantically appropriate and highly probable in that sentence context (e.g., “The piano was out of tune”). Alternatively, the expected ending could be replaced by an improbable final word, i.e., a word that was semantically appropriate but of low probability with respect to the preceding sentence context (e.g., “The crying baby woke up her sitter”). In the phonological condition, the expected word was replaced by a semantically inappropriate final word beginning with the same two or three phonemes as the most probable word (e.g., “The gambler had a streak of bad luggage”). In the anomalous condition the final word was both semantically and phonologically totally inappropriate to the preceding sentence context (e.g., “The traffic lights changed from red to sunny”). The total number of sentences was 400 (100 sentences per condition). Presentation order of the sentences was randomized.
Sentences were recorded using a male voice on a DAT tape in an anechoic chamber (Acoustics Laboratory, Helsinki University of Technology). The sentences were presented to the speaker for reading on a computer screen one word at a time, at a rate of approximately one word per second; thus, across-word coarticulatory and prosodic cues were minimal. The sentences were edited so that a constant 750 msec silent gap always preceded the last word of the sentence. The length of the final word was on average 490 msec (SD 80 msec). Each new sentence was preceded by a mean gap of 3250 msec. The MEG recording was performed in six blocks, each lasting ∼10 min. The blocks were interleaved with 2–3 min breaks. During the recording the sentences were presented binaurally, and subjects were instructed to concentrate on the meaning of the sentences.
MEG recording and data analysis. The recordings were conducted in a magnetically shielded room using the Neuromag Vectorview whole head system (Neuromag Ltd., Helsinki, Finland). The device contains 102 triple sensor elements composed of two orthogonal planar gradiometers and one magnetometer. The measured data were stored for off-line analysis. Signals were bandpass filtered to 0.03–100 Hz and sampled at 0.3 kHz. Separately for each type of sentence-ending word the signals were averaged from 200 msec before to 1000 msec after the presentation of the word. We also averaged signals time locked to the presentation of all the first words of the sentences. Both horizontal and vertical eye movements were recorded (bandpass 0.03–100 Hz), and epochs contaminated by eye or lid movements were rejected. The mean number of artifact-free responses accepted for the averages was 84–87 for the four types of sentence-ending words and 319 for the first words of the sentences across all four conditions.
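The epoching and averaging procedure described above (a 200 msec prestimulus baseline, a 1000 msec poststimulus window, 0.3 kHz sampling, and rejection of epochs contaminated by eye or lid movements) can be sketched in NumPy. This is an illustrative reconstruction, not the authors' actual analysis pipeline; the function name, the peak-to-peak EOG rejection criterion, and its threshold value are our assumptions.

```python
import numpy as np

def average_epochs(data, onsets, sfreq, eog, tmin=-0.2, tmax=1.0, eog_thresh=150e-6):
    """Average stimulus-locked epochs, rejecting those with large EOG deflections.

    data   : (n_channels, n_times) continuous MEG recording
    onsets : sample indices of word onsets
    sfreq  : sampling frequency in Hz (0.3 kHz in the study)
    eog    : (n_times,) EOG trace used for artifact rejection
    """
    pre = int(round(-tmin * sfreq))    # samples before word onset (200 msec)
    post = int(round(tmax * sfreq))    # samples after word onset (1000 msec)
    kept = []
    for s in onsets:
        seg = data[:, s - pre:s + post]
        eog_seg = eog[s - pre:s + post]
        # reject epochs contaminated by eye or lid movements
        # (assumed criterion: peak-to-peak EOG amplitude below threshold)
        if np.ptp(eog_seg) < eog_thresh:
            kept.append(seg)
    return np.mean(kept, axis=0), len(kept)
```

In practice such averages would be computed separately for each of the four sentence-ending conditions and for the sentence-initial words.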
We analyzed the data in two ways. The areal mean signals (Hari et al., 1997) were calculated to get a rather crude but quick impression of the major features of the data over the whole head. The signals of each planar gradiometer were first squared, and then the signals of each sensor pair were summed together. Then, the square root of the signal was calculated. The channels were then grouped into 10 sections. Within each of the 10 sections the mean signals across all sensor pairs were averaged together for each individual. Group averages were calculated for the nonreading-impaired subjects and for the dyslexic individuals. A difference between waveforms was considered to be statistically significant at the 0.05, 0.01, and 0.001 probability levels when it exceeded 1.96, 2.58, and 3.29 times, respectively, the mean strength of the activation during the prestimulus period (from −100 msec to stimulus onset).
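The areal mean signal computation just described (square the two orthogonal gradiometer signals at each sensor location, sum, take the square root, then average across the sensor pairs of a section) and the 1.96/2.58/3.29 × baseline significance thresholds can be sketched as follows; function and variable names are our own:

```python
import numpy as np

def areal_mean(grad_pairs, prestim_samples):
    """Collapse the planar-gradiometer pairs of one channel section into an
    areal mean signal and derive amplitude thresholds for significance testing.

    grad_pairs      : (n_pairs, 2, n_times) signals of the two orthogonal
                      gradiometers at each sensor location
    prestim_samples : number of samples in the prestimulus baseline
                      (-100 msec to stimulus onset)
    """
    # vector sum at each sensor location: sqrt(g1^2 + g2^2)
    pair_amp = np.sqrt(np.sum(grad_pairs ** 2, axis=1))
    # mean across all sensor pairs in the section
    mean_sig = pair_amp.mean(axis=0)
    # mean strength of activation during the prestimulus period
    baseline = mean_sig[:prestim_samples].mean()
    # waveform-difference thresholds at p < 0.05, 0.01, and 0.001
    thresholds = {0.05: 1.96 * baseline,
                  0.01: 2.58 * baseline,
                  0.001: 3.29 * baseline}
    return mean_sig, thresholds
```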
Equivalent current dipole (ECD) analysis (Hämäläinen et al., 1993) was used to reduce the neuromagnetic signals detected by the planar gradiometers into the time behavior of distinct cortical areas. An ECD represents the orientation, strength, and center of the underlying electric current. Dipoles were localized individually for each subject using a subset of channels that ideally covered the distinct magnetic field patterns. After the dipoles had been localized they were included in a multidipole model and, keeping their orientation fixed, their amplitudes were allowed to vary to achieve maximum explanation of the measured whole-head dataset. The results gathered using dipole modeling were analyzed statistically using ANOVA models including both between- and within-subjects variables.
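The amplitude-fitting step of such a multidipole model can be illustrated with a minimal least-squares sketch: with dipole locations and orientations held fixed, each dipole contributes a fixed field pattern (a lead-field column), and the source amplitudes at each time point are the coefficients that best explain the measured field. This shows the general principle only, not the Neuromag implementation; the lead-field values below are arbitrary.

```python
import numpy as np

def fit_dipole_amplitudes(leadfield, meas):
    """Fit source amplitudes for a multidipole model with fixed dipole
    locations and orientations (least-squares sketch of the approach).

    leadfield : (n_sensors, n_dipoles) field pattern of each unit dipole
    meas      : (n_sensors, n_times) measured signals
    Returns (n_dipoles, n_times) amplitude waveforms and the goodness of fit.
    """
    # amplitudes minimizing ||meas - leadfield @ q||^2 at every time point
    q, *_ = np.linalg.lstsq(leadfield, meas, rcond=None)
    pred = leadfield @ q
    # fraction of measured field variance explained by the model
    gof = 1.0 - np.sum((meas - pred) ** 2) / np.sum(meas ** 2)
    return q, gof
```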
The location of sources was defined in head coordinates that were set by the nasion and two reference points anterior to the ear canals: the x-axis was directed from the left (negative) to the right (positive) preauricular point, the y-axis toward the nasion, and the z-axis toward the vertex. At the beginning of the recording, the locations of four head position indicator coils were determined with respect to the sensors. The locations of these coils with respect to anatomical landmarks (nasion and ear canals) were measured with a three-dimensional digitizer. Because none of the subjects had magnetic resonance images available, the locations of the ECDs were presented on an average brain (see for further details on visualization).
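The head coordinate frame defined above can be constructed from the three digitized landmarks. The following is a generic geometric sketch; the function name and the origin convention (the point on the preauricular axis closest to the nasion) are our assumptions, not a specification of the Neuromag convention.

```python
import numpy as np

def head_frame(nasion, lpa, rpa):
    """Build a head coordinate system from the nasion and the left/right
    preauricular points (3-D digitizer coordinates).

    x: from the left to the right preauricular point;
    y: toward the nasion, orthogonalized against x;
    z: toward the vertex (x cross y). All returned as unit vectors.
    """
    x = rpa - lpa
    x = x / np.linalg.norm(x)
    # assumed origin: the point on the preauricular axis closest to the nasion
    origin = lpa + np.dot(nasion - lpa, x) * x
    y = nasion - origin
    y = y / np.linalg.norm(y)
    z = np.cross(x, y)
    return origin, x, y, z
```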
RESULTS
Areal mean signals
Figure 1 illustrates the areal mean signals calculated for 10 channel sections in control and dyslexic subjects for the first words of the sentences (Fig. 1a) and for the expected and anomalous sentence-ending words (Fig. 1b). Both the very first words and the final words of the sentences elicited prominent activation over the left and right temporal channels. A similar signal could be seen over the anterior temporal–inferior frontal channel sections as well. The similarity of the signals on temporal and anterior temporal–inferior frontal channels suggests that the signal detected with these two channel sections is likely to have the same origin, possibly in the middle temporal region. Over other channel sections the activation was more modest and variable.
The areal mean signals. The signals for 10 channel sections are shown in control subjects (left) and in dyslexic subjects (right) both for the first words (a) and for the expected and anomalous last words of the sentences (b). Expected sentence-ending words are indicated with a gray line and the anomalous words with a black line.
The activation in the temporal channel sections had two prominent peaks. The first peak was detected ∼100 msec bilaterally in the temporal channels. The N100m responses to expected and anomalous sentence-ending words were equally strong. However, because the first words of the sentences were preceded by a longer silence, the N100m responses were stronger to the first than to the final words of the sentences (p < 0.05 in control and p < 0.01 in dyslexic subjects in the left hemisphere channels). In dyslexic individuals the left hemisphere N100m response was stronger than in nonreading-impaired individuals (p < 0.001 for the first words of the sentences and p < 0.075 for the sentence-ending words).
After the N100m response, the first words of the sentences and the anomalous sentence-ending words evoked prominent activation peaking ∼400 msec, whereas the activation evoked by the expected endings was weaker. This N400m response was statistically significantly stronger to anomalous than to expected sentence-ending words in the left temporal channels in both subject groups (p < 0.05). In the nonreading-impaired individuals the semantically sensitive activation in the left temporal channels peaked at 360 msec and in the dyslexic individuals at 420 msec after word onset for the anomalous sentence-ending words. The possible differences in the timing of the N400m response between the subject groups were quantified by reducing the signals detected by the MEG sensors into the time behavior of distinct cortical areas.
Localization of neural populations underlying the N100m and N400m responses
The location of the neural population generating the N100m response was determined at the peak of the response elicited by the first words of the sentences. The orientation of the current flow at the peak of the activation was perpendicular to the Sylvian fissure, toward the base of the brain. In all but one subject the magnetic field pattern was easily visible at the peak of the response and not obscured by simultaneous activation in nearby areas. In the one subject with a more complex right-hemisphere field pattern the interfering activation was removed using signal-space projection (Uusitalo and Ilmoniemi, 1997).
For the sustained activation peaking at ∼400 msec, the orientation of the current flow was also downward perpendicular to the Sylvian fissure. The location of the neural population generating this N400m response was computed at a time point when the field pattern was most clearly visible either in the data evoked by the first words or, in a few subjects, in the data gathered during the presentation of the anomalous sentence-ending words. The sustained downward oriented current flow was missing or the field pattern was too obscure to allow reliable source localization in the left hemisphere in one subject and in the right hemisphere in three subjects. In addition, in one subject neither the left nor the right-hemisphere response could be localized. In the remaining 14 subjects with bilateral localizable N400m and N100m responses the mean distance between these two sources was 4.0 mm in the lateral–medial direction, 3.5 mm in the anterior–posterior direction, and 5.5 mm in the inferior–superior direction. The distances were, however, in opposite directions in the two hemispheres (lateral–medial and anterior–posterior directions) or very subtle (superior–inferior direction), and thus the difference between the N100m and N400m source coordinates did not reach statistical significance in a 2 (response type) × 3 (coordinate) × 2 (hemisphere) × 2 (subject group) ANOVA (F(1,12) = 4.1; p < 0.07). No statistically significant differences were detected between the two subject groups in the N100m or N400m source locations.
A statistically significant difference was detected in the N100m and N400m source orientations (F(1,12) = 8.6; p < 0.01); the N100m sources formed on average a 70° angle with respect to the horizontal y-axis, whereas for the N400m sources the angle was 82°. Thus, based on source locations and orientations, the N100m and N400m responses seem to be generated by nonidentical but spatially adjacent neural populations.
The strength and time behavior of semantic activation in the left hemisphere
Because of the close proximity of the N100m and N400m sources, we included only the N400m sources in a multidipole model to account for the temporal activation over the entire analysis interval. In those two subjects in whom reliable source localization could not be achieved between 200 and 600 msec in the left hemisphere, the N100m source was used, instead. In the right hemisphere the N100m source was used for four subjects. Sources generated in other cortical areas were included provided that they did not interfere with the detection of the time behavior of the N400m sources. In the left hemisphere these additional sources were generated either in anterior perisylvian areas, peaking ∼200 msec (the P200m response was found in 6 of the 18 subjects) or in posterior perisylvian areas with a variable peak latency (in seven subjects). In the right hemisphere P200m activation was detected in nine subjects and posterior perisylvian activation in four subjects. The functional role of the P200m response is elusive, but it seems to be elicited especially reliably by noise bursts (Hari et al., 1987).
In the left hemisphere, anomalous sentence-ending words evoked statistically significant activation, i.e., around the peak the response strength exceeded 1.96 times the SD in the prestimulus period, for at least 100 msec between 200 and 600 msec in 18 subjects. In these 18 subjects the response for the anomalous sentence-ending words was during the same time period statistically significantly stronger than for the expected endings at least for 50 msec, i.e., the difference exceeded 1.96 times the SD in the prestimulus period. Figure 2a depicts the spatial distribution of the N400m responses in those eight nonreading-impaired (left) and nine dyslexic subjects (right) that had both semantically sensitive and localizable activation in the left hemisphere between 200 and 600 msec. The individual sources are shown in reference to the center of activation of the N100m response (for additional information on source visualization, see ).
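The significance criterion used here, a waveform or difference waveform exceeding 1.96 × the prestimulus SD continuously for a minimum duration (100 msec for the response itself, 50 msec for the condition difference), amounts to a simple run-length check; the sketch below uses names of our own choosing:

```python
import numpy as np

def sustained_exceedance(diff, thresh, sfreq, min_dur):
    """Return True if `diff` exceeds `thresh` continuously for at least
    `min_dur` seconds (e.g., 1.96 x prestimulus SD for 0.05 sec).

    diff    : (n_times,) waveform or condition-difference waveform
    thresh  : amplitude threshold derived from the prestimulus baseline
    sfreq   : sampling frequency in Hz
    min_dur : required duration of continuous exceedance in seconds
    """
    min_samples = int(round(min_dur * sfreq))
    run = 0
    for above in diff > thresh:
        run = run + 1 if above else 0
        if run >= min_samples:
            return True
    return False
```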
The N400m response locations and mean time behavior in the left hemisphere. a, The semantically sensitive and localizable N400m responses (black spheres) in eight nonreading-impaired (left) and nine dyslexic subjects (right) in the left hemisphere. The N100m response is shown as a white sphere. The mean time behavior of activation in the left temporal region for the first words (b) and for the four types of sentence-ending words (c) across eight control and 10 dyslexic subjects with semantically sensitive activation.
The mean time behavior of activation in the left temporal region to the first words of the sentences and to the anomalous and expected sentence-ending words across those 8 control and 10 dyslexic subjects with semantically sensitive activation is shown in Figure 2, b and c. The N100m response did not differ between the four types of sentence-ending words. However, after the N100m response, ∼170 msec after word onset in control subjects, the anomalous sentence-ending words started to differ from the activation evoked by the expected endings. In each individual subject we measured the peak strength of the N400m source between 200 and 600 msec to each sentence type. In a 4 (sentence type) × 2 (subject group) mixed ANOVA, a significant main effect of sentence type was detected (F(3,48) = 36.4; p < 0.0001). The anomalous sentence-ending words elicited a statistically significantly stronger N400m response than the improbable sentence-ending words (F(1,16) = 12.0; p < 0.003) and the improbable evoked stronger activation than the expected endings (F(1,16) = 28.6; p < 0.0001). Thus, the N400m response strength was modulated by the semantic appropriateness of the sentence-ending word to the preceding sentence context. For the phonological sentence-ending words the activation was even stronger than for the anomalous endings (F(1,16) = 7.6; p < 0.01).
The timing of the broad N400m response was characterized by measuring the onset, the point in time when the activation had reached 50% of the maximum, and the peak latency of the N400m response for the three types of unexpected sentence-ending words. A 3 (sentence type) × 3 (time point) × 2 (subject group) mixed ANOVA revealed a significant main effect of sentence type (F(2,32) = 42.0; p < 0.0001). The timing of the semantic activation did not differ between the anomalous and improbable endings, but for the phonological endings the semantic activation was delayed as compared with anomalous endings (F(1,16) = 34.3; p ≤ 0.0001). The difference in peak latency for the anomalous and phonological sentence-ending words was on average 95 msec.
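The three latency measures, onset, 50%-of-maximum time, and peak latency, can be read off a source waveform as sketched below. The paper does not spell out the exact onset criterion, so a simple threshold crossing is assumed here; all names are illustrative.

```python
import numpy as np

def n400m_timing(wave, times, onset_thresh):
    """Extract three latency measures from a source amplitude waveform.

    wave         : (n_times,) source strength, restricted to the analysis window
    times        : (n_times,) corresponding latencies
    onset_thresh : amplitude defining response onset (assumed: a
                   baseline-derived threshold crossing)
    Returns (onset latency, 50%-of-maximum latency, peak latency).
    """
    peak_idx = int(np.argmax(wave))
    # first sample exceeding the onset threshold
    onset_idx = int(np.argmax(wave > onset_thresh))
    # first sample reaching half of the maximum amplitude
    half_idx = int(np.argmax(wave >= 0.5 * wave[peak_idx]))
    return times[onset_idx], times[half_idx], times[peak_idx]
```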
The strength and latency of the N100m and N400m responses in dyslexic and nonreading-impaired individuals in the left hemisphere
The waveforms depicting the mean time behavior of the left temporal activation in dyslexic and nonreading-impaired subjects are overlaid in Figure 3. The top row shows the responses to the first words of the sentences and the bottom row to the anomalous sentence-ending words. The mean strength and latency of the N100m and N400m responses are plotted on the right side of Figure 3. The peak strength and timing of the N100m response was measured from source waveforms that were generated by including only the N100m sources in the multidipole model.
The strength and latency of N100m and N400m responses in the left hemisphere. The mean time behavior of the left temporal activation for the first words of the sentences (top row) and for the anomalous sentence-ending words (bottom row) in dyslexic (solid line) and nonreading-impaired subjects (dotted line) is shown on the left. The mean (±SEM) strength and latency of the N100m and N400m responses are shown for control (white bars) and dyslexic subjects (black bars) on the right. Asterisks denote statistically significant differences between the subject groups at p < 0.05.
Significant main effects of subject group were detected on the strength of the N100m response both to the first words (F(1,17) = 11.0; p < 0.004) and to the last words of the sentences (F(1,17) = 5.8; p < 0.03). The N100m responses were ∼40% stronger in dyslexic than nonreading-impaired individuals. The latency of the N100m response was identical in the two subject groups.
The N400m source strengths did not differ between the two subject groups. The main effect of subject group on the timing of the N400m response was significant both in the analysis of the responses to the first words of the sentences (F(1,16) = 5.5; p < 0.03) and to the unexpected sentence-ending words (F(1,16) = 10.3; p < 0.005). In nonreading-impaired subjects the N400m response evoked by the anomalous sentence-ending words started ∼170 msec (SEM ± 15 msec) and peaked ∼325 msec (SEM ± 20 msec) after stimulus onset. In dyslexic subjects the response started at 205 msec (SEM ± 10 msec) and peaked at 395 msec (SEM ± 25 msec). On average the N400m response peaked ∼60 msec later in the dyslexic subjects than in the nonreading-impaired individuals, when calculated across all unexpected sentence-ending words and the first words of the sentences.
The strength and time behavior of semantic activation in the right hemisphere
Anomalous sentence-ending words evoked statistically significant activation for at least 100 msec between 200 and 600 msec in the right hemisphere in 17 subjects. In 15 subjects (seven control and eight dyslexic subjects) during the same time period the response for the anomalous endings was statistically significantly stronger than for the expected endings. Figure 4 depicts the sources of the semantically sensitive N400m response in those seven nonreading-impaired (left) and six dyslexic subjects (right) in whom N400m sources could be successfully localized (see for details of visualization). The mean time behavior of activation in the right temporal region to the first words of the sentences and to the anomalous and expected sentence-ending words across all 15 subjects with semantically sensitive activation is shown below.
The N400m response locations and mean time behavior in the right hemisphere. a, The semantically sensitive and localizable N400m responses (black spheres) in seven nonreading-impaired (left) and six dyslexic subjects (right) in the right hemisphere. The N100m response is shown as a white sphere. The mean time behavior of activation in the right temporal region to first words (b) and to four types of sentence-ending words (c) across seven control and eight dyslexic subjects with semantically sensitive activation.
The N400m peak amplitudes revealed a significant difference between the four types of sentence-ending words in a 4 (sentence type) × 2 (subject group) ANOVA (F(3,39) = 24.5; p < 0.0001). The anomalous sentence-ending words elicited a statistically significantly stronger N400m response than the improbable endings (F(1,13) = 10.0; p < 0.007), and the improbable endings evoked stronger activation than the expected endings (F(1,13) = 24.6; p < 0.0003). For the anomalous and phonological sentence-ending words the activation was equally strong. In a 3 (sentence type) × 3 (time point) × 2 (subject group) mixed ANOVA, the main effect of sentence type on latency also reached statistical significance in the right hemisphere (F(2,26) = 10.5; p < 0.0005). The timing of the semantic activation was similar for the anomalous and improbable endings, but for the phonological endings the semantic activation was delayed in comparison with the anomalous endings (F(1,13) = 10.5; p < 0.006). The difference in peak latency for the anomalous and phonological sentence-ending words was on average 70 msec.
The effect of subject group on the N400m response strength or latency in the right hemisphere was nonsignificant for both the last words of the sentences and for the first words of the sentences.
Comparisons of strength and timing of semantic activation in the left and right hemispheres
In those 15 subjects with semantically sensitive activation in both the left and right hemispheres we compared the strength and time behavior of the activation. The timing of the N400m responses did not differ between the two hemispheres. However, for the N400m response strength a 2 (hemisphere) × 4 (sentence type) × 2 (subject group) mixed ANOVA revealed a significant main effect of hemisphere (F(1,13) = 6.5; p < 0.02). As indicated by the significant hemisphere by sentence type interaction (F(3,39) = 5.7; p < 0.003), the anomalous (F(1,13) = 6.0; p < 0.03), phonological (F(1,13) = 8.5; p < 0.01), and improbable sentence-ending words (F(1,13) = 7.9; p < 0.01) evoked a stronger N400m response in the left than in the right hemisphere, whereas the expected words evoked equally strong activation in the two hemispheres.
DISCUSSION
Naturally spoken words evoked two temporally and functionally distinct response patterns in the superior temporal lobe in nonreading-impaired and dyslexic adults. The activation peaking ∼100 msec, the N100m response, was found to reflect presemantic processing, and the activation ∼400 msec, the N400m response, semantic processing. Both of these processing stages differed between dyslexic and nonreading-impaired adults.
Activation peaking at ∼400 msec in the superior temporal cortex, in close proximity to the supratemporal plane, was modulated by the semantic appropriateness of the sentence-ending words. The activation was stronger to semantically inappropriate sentence-ending words than to semantically appropriate, but unexpected, endings and weakest to semantically appropriate, expected endings. This semantic activation was clearly bilateral, although more robust and slightly more reliably detected in each individual in the left than in the right hemisphere. The N400m response was also evoked by the very first words of the sentences. Thus, as previous ERP studies of reading have indicated, the N400m response is elicited by most words within a sentence (Kutas et al., 1988), and the strength of the N400 response reflects the extent to which a word is semantically primed (Kutas and Hillyard, 1984).
Based on ERP data, it has been suggested that the N400 response evoked by spoken words is preceded and partly overlapped by an earlier negativity, the phonological mismatch negativity (PMN), peaking between 200 and 300 msec (Connolly and Phillips, 1994; Hagoort and Brown, 2000; van den Brink et al., 2001) (but see an opposing view by van Petten et al., 1999). This earlier negativity seems to be evoked by phonologically unprimed words and has been tentatively associated with a left anterior generator (Connolly et al., 2001). In the current MEG study we could not detect any separable component preceding the N400m response. Instead, the N400m activation in the bilateral temporal lobes started at ∼170 msec, covering the time periods of both the suggested early negativity and the N400. Naturally, the results must be considered cautiously because the ERP and MEG responses are likely to have at least partly divergent generators. Deep sources, in particular, contribute far less to the measured magnetic field than to the electric potential. On the other hand, because the skull and the scalp distort the electric potential, the signals in each EEG channel receive contributions from various cortical areas, whereas in the current MEG study, with the help of source modeling, the time behavior of the left temporal lobe activation could be studied without interference from other cortical regions.
The onset of the N400m response occurred when only approximately one-third of the semantically anomalous word had been presented. For phonological sentence-ending words that shared their first two or three phonemes with the expected words, the N400m response was delayed by ∼100 msec. Taking into account the effect of coarticulation, 100 msec is likely to be very close to the true point of uniqueness between the expected and phonological sentence-ending words. Thus, semantic processing seemed to be delayed only until the first evidence of acoustic–phonetic dissimilarity emerged.
Psycholinguistic models differ in the ways they assume the speech signal is segmented so that it can make contact with the distributed representations of word forms and meanings in the lexicon (Gaskell and Marslen-Wilson, 1997, 2001). Most current models assume that lexical access is based on phonemes (McClelland and Elman, 1986; Norris et al., 2000) or acoustic features (Marslen-Wilson and Warren, 1994) rather than on syllables (Segui et al., 1990). Our data, revealing only a 100 msec delay in the phonological condition, agree with phoneme- or acoustic feature-based access to the lexicon. This finding is also in line with the ERP study by Connolly and Phillips (1994), who used similar stimuli in English. Based on behavioral data, the initial access to the lexicon is likely to take place ∼200 msec after word onset (Marslen-Wilson and Tyler, 1980). This estimate corresponds nicely to the observed onset of the N400m response. From the very beginning of this activation, the responses evoked by expected and anomalous sentence-ending words started to diverge, indicating that semantic priming has an effect at the point at which lexical representations are accessed but not at prelexical stages (e.g., during the N100m response).
The finding that lexicosemantic neural populations in the left superior temporal lobe are accessed almost online, while phonetic information is still being presented (Marslen-Wilson and Warren, 1994; Norris et al., 2000), has clear implications for imaging studies. In PET and fMRI studies, auditory semantic activation is often tapped by contrasting speech with reversed speech or pseudoword listening (Howard et al., 1992; Price et al., 1996; Binder et al., 2000; Burton et al., 2001). Reversed speech is mostly incomprehensible but is readily identified as speech, because over 70% of the constituent phonemes can be correctly identified (Binder et al., 2000). Because lexical access is based on subsyllabic speech units, both pseudowords and reversed speech are likely to evoke lexicosemantic activation. Contrasting speech with reversed speech or pseudowords could even cancel out most of the semantic activation elicited by any speech-like stimulus. However, when speech is contrasted with an acoustically complex nonlinguistic stimulus, lexicosemantic activation, in addition to phonetic/phonological activation, is likely to be revealed. Binder et al. (2000) recently compared the peak activation loci across four PET and fMRI studies (Démonet et al., 1992; Zatorre et al., 1992; Binder et al., 1997; Binder et al., 2000) in which activation to different types of speech and nonlinguistic stimuli was contrasted. In these studies, speech-specific activation was found in the areas of the superior temporal sulcus and superior temporal gyrus surrounding the auditory cortex. This is exactly the spatial distribution found in the current study. In addition, our study clearly shows that the activation in the supratemporal plane, peaking at ∼100 msec, is prelexical. Semantically sensitive neural populations are distributed in the surrounding superior temporal cortex, and the activation of these neurons peaks 300–400 msec after word onset.
In the only fMRI study contrasting aurally presented, semantically anomalous sentences with semantically appropriate sentences (Ni et al., 2000), activation was found in the left superior temporal cortex, only slightly posterior to the center of activation found in the current study.
The semantic activation in the left hemisphere was delayed in the dyslexic individuals in the present study. This agrees with our previous findings in reading (Helenius et al., 1999a). Thus, it appears that dyslexic individuals also have delayed access to the semantic properties of words during the processing of natural spoken language. However, the qualitative aspects of spoken-word segmentation appeared similar in the two subject groups (initial lexical access being based on acoustic–phonetic features). In addition to the semantic delay, our earlier studies of reading in dyslexic individuals have indicated abnormal presemantic processing in the left inferior occipitotemporal cortex ∼150 msec after word onset (Salmelin et al., 1996; Helenius et al., 1999b). In the current experiment as well, differences were detected in a time window preceding semantic activation, already in the N100m response.
Auditory input reaches the auditory cortex within 10–15 msec after stimulus onset (Celesia, 1976; Liégeois-Chauvel et al., 1994), and thus the N100m response belongs in the category of long-latency auditory responses. The activation contributing to the N100m response is likely to originate predominantly in the planum temporale (Liégeois-Chauvel et al., 1994; Lütkenhöner and Steinstrater, 1998). Recent MEG studies have suggested that ∼150–200 msec after stimulus onset the phonological categories have already been accessed (Phillips et al., 2000; Vihla et al., 2000). It is thus plausible that the neural populations underlying the N100m response could be involved in phonetic–phonological processing.
In dyslexic individuals, the N100m response was abnormally large in the left hemisphere. One highly speculative interpretation of the aberrant auditory N100m response is that the neural populations in the posterior supratemporal plane have failed to specialize adequately for speech processing. Thus, speech sounds evoke activity in a large, unspecialized neural population, resulting in an atypically strong N100m response. If the abnormally strong N100m response in dyslexic individuals reflects their phonological difficulties, the N100m response should be normal for simple or complex nonspeech sounds. This is indeed what we recently found in the same subjects who participated in the present study (Helenius et al., 2002). However, many dyslexic individuals have difficulties in processing brief or rapidly successive nonspeech stimuli as well (Tallal, 1980; Hari and Kiesilä, 1996; Ahissar et al., 2000). Brief, rapidly successive nonspeech stimuli have been reported to elicit abnormal auditory responses between 100 and 200 msec in these individuals (Nagarajan et al., 1999). Thus, future studies are clearly needed to clarify the functional role and development of the N100m response and its relation to speech and nonspeech processing difficulties in dyslexia.
To summarize, auditory–phonological deficits associated with reading problems are manifested as differences in the cortical activation elicited by naturally spoken words. Although access to the meaning of words occurred in subsyllabic units in both nonreading-impaired and dyslexic individuals, semantic activation was delayed in dyslexia. This delay is likely to have resulted from difficulties in presemantic auditory processing, possibly reflected in the abnormal N100m response.
Appendix
According to a sizable literature on the location of the N100m response to simple tones, the center of activation lies just posterior to Heschl's gyrus in the planum temporale (Hari, 1990; Lütkenhöner and Steinstrater, 1998). Because individual MRIs were not available, sources of the N100m responses evoked by 1 kHz, 50 msec tones in a separate short recording session provided reference points in the left and right auditory cortex. The main experiment on speech processing lasted for ∼1 hr. Comparison of the location of the N100m response evoked by the first words across the entire measurement and during only the first third of the measurement showed that the subjects' heads had slipped downward in the helmet by ∼4 mm during the long session. After correction for this head movement, the mean location of the N100m response to the first words of the sentences in both hemispheres was within a few millimeters of the supratemporal auditory cortex, as indicated by comparison with the sources of the N100m response to 1 kHz tones, and further supported by the average source locations projected on an average brain (created using elastic transformation; see Schormann et al., 1996). The N400m sources of each individual were projected onto the average brain with reference to the speech N100m response, the center of which was aligned in each individual to the supratemporal auditory cortex.
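The head-movement correction described above amounts to estimating a rigid translation from repeated source fits and removing it from the fitted coordinates. The following minimal sketch illustrates the arithmetic only; the coordinate values, variable names, and the assumption of a purely vertical, translational slip are hypothetical, not taken from the actual analysis:

```python
import numpy as np

# Hypothetical N100m source coordinates (head coordinate system, mm),
# fitted from the first third of the session and from the full session.
loc_first_third = np.array([-52.0, 18.0, 62.0])
loc_full_session = np.array([-52.0, 18.0, 58.0])  # head slipped ~4 mm down

# Estimate the drift as the displacement between the two fits,
# assuming a rigid, purely translational slip of the head in the helmet.
drift = loc_full_session - loc_first_third  # [0, 0, -4] mm

# Remove the drift from another source fitted over the full session,
# e.g., a hypothetical N400m source location.
n400m_raw = np.array([-55.0, 10.0, 55.0])
n400m_corrected = n400m_raw - drift  # shifted back up by 4 mm
```

In practice such a correction is only a first-order approximation, since slow head movement during the session blurs, rather than merely displaces, the fitted source locations.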
Footnotes
This work was supported by the Academy of Finland Grants 32731, 1365981, and 39253, Human Frontier Science Program Grant RG82/1997-B, Finnish Cultural Foundation, and Wihuri Foundation. We thank K. Eklund and K. Müller for assistance in recruiting the subjects and gathering the behavioral data, R. Service for reading the sentences to the tape, P. Antsalo for valuable help in stimulus recordings, A. Tarkiainen for providing the program for the areal mean signal calculations, and M. Seppä for providing the average brain.
Correspondence should be addressed to Dr. Päivi Helenius, Brain Research Unit, Low Temperature Laboratory, Helsinki University of Technology, P.O. Box 2200, FIN-02015 HUT, Espoo, Finland. E-mail: paivi@neuro.hut.fi.