Abstract
Speech processing requires the temporal parsing of syllable order. Individuals suffering from posterior left hemisphere brain injury often exhibit temporal processing deficits as well as language deficits. Although the right posterior inferior parietal lobe has been implicated in temporal order judgments (TOJs) of visual information, there is limited evidence to support the role of the left inferior parietal lobe (IPL) in processing syllable order. The purpose of this study was to examine whether the left inferior parietal lobe is recruited during temporal order judgments of speech stimuli. Functional magnetic resonance imaging data were collected on 14 normal participants while they completed the following forced-choice tasks: (1) syllable order of multisyllabic pseudowords, (2) syllable identification of single syllables, and (3) gender identification of both multisyllabic and monosyllabic speech stimuli. Results revealed increased neural recruitment in the left inferior parietal lobe when participants made judgments about syllable order compared with both syllable identification and gender identification. These findings suggest that the left inferior parietal lobe plays an important role in processing syllable order and support the hypothesized role of this region as an interface between auditory speech and the articulatory code. Furthermore, a breakdown in this interface may explain some components of the speech deficits observed after posterior damage to the left hemisphere.
Introduction
Parsing the temporal order of syllables is a crucial process for speech production and perception. In general, temporal order judgment (TOJ) refers to the ability to distinguish the order of onset among two or more events. Recent studies suggest that the inferior parietal lobe (IPL) supports TOJ; however, hemispheric lateralization may differ by domain (Husain and Rorden, 2003). Specifically, TOJ may rely on the right IPL for visual information, but on the left and/or bilateral IPL for auditory information (Wittmann et al., 2004; Battelli et al., 2007).
Numerous studies report an association between temporal processing and language abilities, and therefore, it has been hypothesized that impaired temporal processing negatively affects language processing. Previous findings include significantly longer TOJ thresholds for clicks and tones in patients with aphasia compared with controls (Fink et al., 2006) and slowed temporal rate for self-paced finger tapping in patients with left hemisphere lesions (Wittmann et al., 2001). Two current models of speech processing provide a tentative framework for understanding the role of the IPL in auditory TOJ. The Directions into Velocities of Articulators (DIVA) model proposes a highly interactive network among articulatory maps in the inferior frontal lobe, auditory maps in the superior temporal lobe, and somatosensory maps in the IPL (Guenther, 2006). This model makes strong predictions regarding the role of the left temporal and frontal areas in phonological processing and motor speech, respectively, but is less descriptive about the nature of the interface that may be provided by the IPL. In contrast, the Dual Stream model suggests that the IPL supports the translation of auditory speech into the articulatory code, including the temporal binding of syllables and articulation maps (Hickok and Poeppel, 2004). Based on this model, damage to the IPL could lead to phonological errors consistent with conduction aphasia. The diagnosis of conduction aphasia often results from damage to the left IPL and underlying white matter, and is typically characterized by impaired repetition and phonemic paraphasias in the form of phoneme substitution, deletion, and transposition (Monoi et al., 1983; Canter et al., 1985).
In a study of brain-injured patients with lateralized cortical infarctions, the only group that performed significantly worse than controls on auditory TOJ was the group of patients with posterior lesions in the left hemisphere (von Steinbüchel et al., 1999). However, it remains unclear whether these posterior regions are associated with processing the temporal order of syllables. In a review paper, Shalom and Poeppel (2008) argue that the language network may be functionally organized based on the type of processing needed rather than the type of material being processed. They suggest that memorizing is dominated by the temporal lobe, analyzing by the parietal lobe, and synthesizing by the frontal lobe. Thus, the purpose of this study was to examine whether the left IPL is recruited during TOJ of speech stimuli in normal participants. If the left IPL is important for temporal processing of syllables, increased neural activity in this area should occur when participants make TOJs about syllable order.
Materials and Methods
Participants.
Fourteen right-handed females with a mean age of 21.4 years (range 19–23) participated in this study. All participants were native English speakers with no history of neurological/psychiatric disorders or speech/hearing/visual impairment. Participants were excluded if they reported contraindications for MRI scanning (e.g., implanted metal, seizures, pregnancy). This experiment was conducted with approval from the institutional ethics committee, and all participants provided written informed consent before inclusion.
Stimuli.
Stimuli were digitally recorded with an audio interface in a sound booth. Six native English speakers (three male and three female) produced 14 monosyllables and 12 polysyllabic pseudowords (phonotactically plausible nonwords). Monosyllables included the two highly contrastive target syllables (/pa/ and /shei/), as well as six rhyming pairs (/tu/ and /zu/, /dai/ and /fai/, /ko/ and /so/, /gʌ/ and /vʌ/, /mi/ and /li/, /nε/ and /rε/). The four-syllable pseudowords included planned combinations of the two target syllables, /pa/ and /shei/, in varying syllable positions with two other syllables from the list of rhyming pairs. In half of the pseudowords, /pa/ came before /shei/. In the other half, /shei/ came before /pa/, occupying the counterbalanced syllable positions and paired with the same filler syllables (e.g., pa-shei-mi-ko and shei-pa-mi-ko; rε-pa-gʌ-shei and rε-shei-gʌ-pa). Thus, the syllables /pa/ and /shei/ each occurred three times in each syllable position (i.e., initial, second, third, and final). Speakers produced the pseudowords at a moderate rate (∼4 syllables per second). Stimuli were normalized to the same loudness level across all speakers.
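As an illustration of this counterbalancing scheme, the short Python sketch below builds a matched pair of pseudowords from a four-slot template. The helper name, ASCII syllable spellings, and example template are our own placeholders; the actual stimulus lists were constructed as described above.

```python
# Illustrative sketch of the target-order counterbalancing described above.
# ASCII strings ("pa", "shei", "mi", "ko") stand in for the phonetic syllables;
# the template and helper name are placeholders, not study materials.

TARGETS = ("pa", "shei")

def counterbalanced_pair(template):
    """Given a four-slot template with two None entries marking the target
    positions, return the /pa/-before-/shei/ and /shei/-before-/pa/ versions,
    which share the same filler syllables."""
    slots = [i for i, s in enumerate(template) if s is None]
    assert len(slots) == 2, "exactly two target positions expected"
    forward, reverse = list(template), list(template)
    forward[slots[0]], forward[slots[1]] = TARGETS
    reverse[slots[0]], reverse[slots[1]] = TARGETS[::-1]
    return "-".join(forward), "-".join(reverse)

# e.g., ('pa-shei-mi-ko', 'shei-pa-mi-ko')
print(counterbalanced_pair([None, None, "mi", "ko"]))
```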
Procedure.
Participants completed a forced-choice task while undergoing functional magnetic resonance imaging (fMRI). Audio speech was presented via noise-attenuating MRI-compatible headphones (Resonance Technology). Because syllables provide a fundamental unit for parsing the speech signal (Hickok and Poeppel, 2007), our experimental task required judgments at the syllable level. More specifically, participants were asked to make judgments about the stimuli based on either (1) syllable content (experimental condition) or (2) gender of the speaker (control condition for comparison). These two judgment conditions were presented in 30 s blocks that were randomly ordered and counterbalanced across participants. The stimuli were pseudo-randomly divided into nine blocks with eight stimuli per block. Stimuli for the two conditions were identical, but were differentiated by the simultaneous visual presentation of an assigned color. More specifically, response criteria (syllable vs gender) were indicated by a central blue or green circle presented on a back-projected computer screen visible through a mirror attached to the scanner's head coil. Thus, half of the participants made syllable-based responses during the presentation of a blue circle, whereas the other half did so during the presentation of the green circle. Participants responded after each trial by pressing with the thumb or index finger of the left hand using an MRI-compatible response glove (Psychology Software Tools).
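For illustration, a minimal Python sketch of this block structure is given below; the helper name, its arguments, and the shuffling logic are placeholders of ours rather than the actual presentation script.

```python
import random

# Illustrative sketch of the block structure described above: 30 s blocks of
# eight stimuli each, with the active judgment (syllable vs gender) signaled
# by a cue color that is counterbalanced across participants. Block ordering
# and stimulus assignment here are placeholders, not the actual run files.

def build_blocks(stimuli, conditions, cue_colors, per_block=8, block_dur_s=30.0):
    """conditions: one label per block, e.g. ["syllable", "gender", ...];
    cue_colors: this participant's mapping, e.g. {"syllable": "blue",
    "gender": "green"} (reversed for the other half of participants)."""
    stimuli = list(stimuli)
    random.shuffle(stimuli)  # pseudo-random division of stimuli into blocks
    return [
        {"condition": cond,
         "cue": cue_colors[cond],
         "duration_s": block_dur_s,
         "stimuli": stimuli[i * per_block:(i + 1) * per_block]}
        for i, cond in enumerate(conditions)
    ]
```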
One potential confound between the experimental and control conditions was that these tasks differed not only in whether temporal processing was required, but also in whether phonological processing was required. That is, phonological processing of the syllable content (i.e., syllable identification) is presumably obligatory for syllable order judgments but not for gender distinctions. Thus, to aid in the interpretation of our data, participants were asked to complete a separate phonologically based paradigm that did not require temporal judgments, but maintained the same control task. Instead of four-syllable pseudowords, single syllables were presented and syllable identification was required. Thus, participants completed two paradigms (i.e., single-syllable identification and multisyllabic temporal order judgments) in a single scanning session, and the order of completion was counterbalanced across participants. During both paradigms, participants were instructed to indicate gender (the control task) by pressing their thumb if the speaker was male and their index finger if the speaker was female. However, the task instructions differed slightly between the two paradigms for the phonologically based decisions. During the single-syllable paradigm, participants were instructed to identify syllables by pressing their thumb if they heard either /pa/ or /shei/ and their index finger if they heard any other syllable. During the multisyllabic paradigm, participants indicated temporal order by pressing their thumb if /pa/ came before /shei/ and their index finger if /shei/ came before /pa/. Other than the differences mentioned above, these two paradigms were identical. Each paradigm took 9 min to complete, for a total of 18 min of fMRI scanning.
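The three response rules reduce to simple decision functions, sketched below for illustration; the ASCII spellings stand in for the phonetic targets and the function names are ours.

```python
# Illustrative sketch of the forced-choice response rules described above.
# "thumb" and "index" refer to the left-hand response-glove buttons.

TARGETS = ("pa", "shei")

def gender_response(speaker_gender):
    # Control task in both paradigms: thumb = male, index = female.
    return "thumb" if speaker_gender == "male" else "index"

def identification_response(syllable):
    # Single-syllable paradigm: thumb if the syllable is /pa/ or /shei/.
    return "thumb" if syllable in TARGETS else "index"

def order_response(syllables):
    # Multisyllabic paradigm: thumb if /pa/ precedes /shei/, index otherwise.
    return "thumb" if syllables.index("pa") < syllables.index("shei") else "index"

# e.g., "pa-shei-mi-ko" -> "thumb"; "shei-pa-mi-ko" -> "index"
print(order_response("pa-shei-mi-ko".split("-")))
print(order_response("shei-pa-mi-ko".split("-")))
```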
Imaging data.
MRI data were collected on a Siemens 3T Trio scanner with a 12-channel radiofrequency head coil. Functional data were acquired using a continuous echo planar imaging (EPI) sequence (repetition time, 2200 ms; echo time, 30 ms; flip angle, 90°; matrix, 64 × 64; field of view, 192 × 192 mm; 36 slices; slice thickness, 3 mm with a 0.6 mm gap). In addition, a gradient echo field map (with spatial dimensions and alignment identical to those of the fMRI sequence) and a T1-weighted high-resolution anatomical image (1 mm isotropic voxels) were collected for each participant to aid in normalization.
fMRI data processing was performed using FSL [the FMRIB (Functional Magnetic Resonance Imaging of the Brain) Software Library; FMRIB Analysis Group, Oxford University, Oxford, UK] tools (Smith et al., 2004). Standard prestatistics processing included motion correction, non-brain removal, spatial smoothing using a Gaussian kernel with a 6.0 mm full-width at half-maximum, mean intensity normalization, high-pass temporal filtering (Gaussian-weighted least-squares straight line fitting, with σ = 50.0 s), and field map-based EPI unwarping to correct for the spatial distortion commonly associated with fMRI data collection near the orbital frontal lobes. Time-series statistical analysis was performed using a general linear model with local autocorrelation correction. Higher-level analysis was performed using FLAME (FMRIB's Local Analysis of Mixed Effects). Z-statistic (Gaussianized T) maps were thresholded using clusters determined by Z > 2.3 and a corrected cluster significance threshold of p = 0.05 for multiple comparisons.
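The preprocessing steps listed above map onto standard FSL tools. The sketch below uses the Nipype wrappers for those tools purely for illustration; it is not the authors' actual FEAT configuration (field-map unwarping and the FLAME group statistics are omitted), and the file names and SUSAN brightness threshold are placeholders.

```python
# Illustrative preprocessing chain using Nipype's FSL wrappers; file names and
# the SUSAN brightness threshold are placeholders, and this is not the authors'
# actual FEAT setup (field-map unwarping and FLAME are omitted here).
from nipype.interfaces import fsl

TR = 2.2  # s, from the EPI sequence described above

# Motion correction (MCFLIRT)
mc = fsl.MCFLIRT(in_file="func.nii.gz", out_file="func_mc.nii.gz")

# Non-brain removal on the T1-weighted image (BET)
bet = fsl.BET(in_file="t1.nii.gz", out_file="t1_brain.nii.gz")

# Spatial smoothing, 6.0 mm FWHM (SUSAN); brightness threshold is a placeholder
smooth = fsl.SUSAN(in_file="func_mc.nii.gz", fwhm=6.0, brightness_threshold=1000.0)

# High-pass temporal filtering: sigma of 50 s expressed in volumes for fslmaths -bptf
hp = fsl.ImageMaths(in_file="func_smooth.nii.gz", out_file="func_hp.nii.gz",
                    op_string="-bptf %.2f -1" % (50.0 / TR))

# Linear registration to MNI space (FLIRT)
reg = fsl.FLIRT(in_file="func_hp.nii.gz",
                reference="MNI152_T1_2mm_brain.nii.gz",
                out_file="func_mni.nii.gz",
                out_matrix_file="func2mni.mat")

for step in (mc, bet, smooth, hp, reg):
    step.run()  # each call shells out to the corresponding FSL command
```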
In the first level of analysis, data from each participant were individually analyzed for blood-oxygen-level-dependent (BOLD) signal changes associated with syllable > gender decisions in each paradigm (i.e., single-syllable identification and multisyllabic TOJs). Data were analyzed in each participant's native space and registered to Montreal Neurological Institute space using FLIRT (FMRIB's Linear Image Registration Tool). In the higher-level group analysis, a t test was used to compare activity across participants using the outputs generated for each individual at the first level of analysis.
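To make the first-level model concrete, the following is a minimal sketch of a block-design GLM for the syllable > gender contrast, using placeholder block onsets and run length and a simplified double-gamma HRF; FSL's FILM prewhitening (the local autocorrelation correction) and the FLAME mixed-effects group step are not reproduced.

```python
# Minimal block-design GLM sketch for the syllable > gender contrast.
# Onsets and run length are placeholders; FILM prewhitening and FLAME are omitted.
import numpy as np
from scipy.stats import gamma

TR, BLOCK_S = 2.2, 30.0   # repetition time and block duration from the Methods
N_VOLS = 240              # placeholder run length in volumes

def hrf(t):
    """Simplified canonical double-gamma haemodynamic response function."""
    return gamma.pdf(t, 6) - 0.35 * gamma.pdf(t, 12)

def block_regressor(onsets_s):
    """Boxcar over 30 s blocks, convolved with the HRF and sampled at the TR."""
    frame_times = np.arange(N_VOLS) * TR
    boxcar = np.zeros(N_VOLS)
    for onset in onsets_s:
        boxcar[(frame_times >= onset) & (frame_times < onset + BLOCK_S)] = 1.0
    return np.convolve(boxcar, hrf(np.arange(0.0, 30.0, TR)))[:N_VOLS]

# Placeholder block onsets (seconds) for the two judgment conditions
syllable = block_regressor([0.0, 60.0, 120.0, 180.0])
gender   = block_regressor([30.0, 90.0, 150.0, 210.0])

X = np.column_stack([syllable, gender, np.ones(N_VOLS)])  # design matrix
contrast = np.array([1.0, -1.0, 0.0])                     # syllable > gender

# For a single voxel time series y, the effect of interest would be:
#   beta = np.linalg.lstsq(X, y, rcond=None)[0]
#   effect = contrast @ beta
```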
Results
All participants were able to complete the tasks with relatively high accuracy. The group mean percentage correct for each condition was as follows: multisyllabic syllable order (84%), multisyllabic gender identification (90%), monosyllabic syllable identification (92%), and monosyllabic gender identification (92%). Because the study question focused on the potential role of the IPL in the temporal processing of speech stimuli, the experimental task of interest was the syllable order judgment in the multisyllabic condition. For BOLD signal comparisons, the control task was gender identification of the speaker for the same multisyllabic targets. When these two conditions were contrasted in the group analysis, distinct statistical maps of brain activity emerged (Fig. 1; Tables 1 and 2). Areas of greater cortical activation during gender identification compared with syllable order judgments were found primarily in the right hemisphere, including the right inferior temporal/occipital gyrus [Brodmann area (BA) 37/18] and right precentral/postcentral gyrus (BA 6/3), as well as the bilateral superior frontal gyrus (BA 8/9). In contrast, the syllable order condition resulted in greater activity in the left middle frontal gyrus (BA 9), middle/superior temporal gyrus (BA 37/39), and supramarginal gyrus (BA 40), as well as the right inferior frontal gyrus (IFG) (BA 47) and bilateral precuneus (BA 7), compared with the gender identification condition.
To account for the potential confound that the activity observed in the syllable order condition reflected phonological processing more generally, a phonologically based syllable identification condition was also included in our study. Thus, participants completed two conditions that required phonological processing at the syllable level, but only the multisyllabic condition required TOJs. The control task of gender identification was the same across the two types of stimuli, making it feasible to compare the two syllable conditions (i.e., identification vs order) after contrasting each with its complementary baseline (i.e., gender identification). Figure 2 displays the statistical activation maps for regions of greater activity during syllable order judgment than during syllable identification. Greater activity was revealed in the bilateral inferior/posterior frontal lobe (BA 44/6) and left posterior IPL (BA 40) during syllable order judgments (over baseline) than during syllable identification (over baseline). The opposite contrast was not significant. That is, no areas showed greater activity during syllable identification than during syllable order judgments (Fig. 2; Table 3).
Discussion
As predicted, the group analysis revealed greater neural activation in the left posterior IPL, specifically the supramarginal gyrus (Fig. 1), during syllable order judgments compared with the control condition (gender identification of the speaker) using the same stimuli. These results support our hypothesis that the left posterior IPL is recruited during TOJs of auditory information such as speech. This is consistent with the role of this region in the Dual Stream model of speech proposed by Hickok and Poeppel (2007). They propose that the IPL serves as a translator between auditory speech and articulatory maps in the dorsal stream, including information such as temporal binding. For example, in a multisyllabic word, the translation of the phonological form to the articulatory form would include details regarding not only which syllables were involved, but also in what order they should occur. However, it is important to note that Hickok and Poeppel (2007) propose that integration of information across longer temporal windows (e.g., suprasegmentals) shows greater neural representation in the right hemisphere, whereas integration within shorter temporal windows (e.g., phonemes) shows more bilateral representation. This would suggest that TOJ tasks, such as the one used in the current study, should result in greater right hemisphere recruitment as opposed to left hemisphere activation. However, they also postulate that the left hemisphere may be preferentially recruited for acoustic material. We argue that our left-lateralized finding for TOJ of syllables could reflect the strong reliance of auditory-motor integration on the left hemisphere during speech acquisition. However, the sequential processing performed by the left hemisphere does not seem to be strictly limited to acoustic material. For example, visual TOJ appears to lead to left hemisphere activation even when controlling for stimulus properties (Smith et al., 2003) and the duration of salient information (Davis et al., 2009). In addition, patients with damage to the posterior left hemisphere often have difficulty both with sequencing motor movements and with detecting sequential errors in observed pantomime action (Weiss et al., 2008). This suggests that the left hemisphere may play a dominant role in serial order processing, regardless of modality.
In addition to temporal order processing, syllable order judgment must also rely on other basic cognitive processes, such as attention, response selection, auditory processing of frequency information, and syllable recognition/discrimination. Much of the activity observed during the syllable order task compared with baseline can be plausibly attributed to these obligatory processes. For example, there are substantial data to support the role of the left superior and middle temporal gyrus in speech perception (Fridriksson and Morrow, 2005; Price et al., 2005). Likewise, a growing body of evidence suggests that posterior inferior frontal regions (e.g., the inferior frontal gyrus and motor cortex) are recruited not only during speech production but also during speech perception (Watkins and Paus, 2004; Fridriksson et al., 2009).
Previous studies have observed posterior IPL activation during phonological processing (for review see Vigneau et al., 2006). We argue that our statistical contrast reveals that this region also plays a special role in auditory temporal order identification. Specifically, if the activity we observed was merely related to general phonological processing, this activity should also have been present in the task of syllable identification, in which no syllable order judgment is required. The results from our second group analysis demonstrate that the posterior IPL is important for TOJ beyond any basic role in phonological processing, at least at the syllable level, as greater activity was observed during the syllable-order condition compared with the syllable identification condition. Based on the conceptualization of the neural language network proposed by Shalom and Poeppel (2008), parietal and frontal areas have similar roles in processing (i.e., analyzing and synthesizing, respectively) compared with the temporal lobe (i.e., memorizing/retrieval). That is, both analyzing and synthesizing require consideration of the internal parts of the representations regardless of whether processing is occurring at a phonological, lexical, or syntactic level. The distinction would be that whereas the parietal regions separate the pieces, the frontal regions combine the pieces together. In support of this hypothesis, our findings implicate the IPL and the IFG in syllable order judgments. Thus, we speculate that the IPL provided temporal analysis of syllable order, whereas the IFG provided subvocal rehearsal in order for this analysis to occur. This is consistent with a number of studies that report neural recruitment of the IFG during phonological working memory tasks.
Alternative explanations are feasible and, therefore, warrant mentioning. It could be argued that, apart from TOJs, the processing of the multisyllabic stimuli requires greater demands on phonological processing (simply because of the number of syllables in the stimuli) and/or greater demands on working memory than the single-syllable stimuli. We attempted to control for the fundamental difference in phonological processing by factoring out activity that was associated with hearing the same stimuli, but responding based on gender (a nonphonological task). Furthermore, the temporal unfolding of the acoustic signal for any given syllable was the same across the two phonologically based tasks (i.e., single-syllable identification and multisyllabic temporal order). Thus, neural activity related to phonological processing at the syllable and segmental level should be equivalent, particularly when compared with the respective baselines. However, the multisyllabic condition also required processing over a longer temporal window to determine syllable order (i.e., which syllable came first?). Therefore, the primary distinction in the higher-level contrast should be the presence or absence of TOJs. However, at the lower-level contrast, it is important to consider that gender identification would not necessitate the use of verbal working memory. Likewise, working memory demands would be minimal for single-syllable identification. According to Baddeley's theory of working memory (Baddeley, 1992, 2003), the "phonological loop," which serves as a short-term memory store for auditory information, may be supported by the left IPL and superior temporal lobe. However, more recent work suggests that the temporal lobe, rather than the parietal lobe, is the neural hub for phonological processing (Rimol et al., 2005; DeLeon et al., 2007; Graves et al., 2007). In the current study, left temporal lobe activity was observed when syllable order was compared with baseline, but this activity did not survive the statistical threshold when compared with the syllable-identification condition. Although the absence of significant activation cannot be interpreted as proof that this particular region is not active, it is worth noting that left IPL and IFG activity was revealed for this contrast in the absence of left temporal lobe activity. Thus, the difference in working memory load was likely reflected in the activation of the left IFG. Our findings suggest that phonological processing necessary for identification at the syllable level is supported primarily by the temporal lobe, whereas analysis and construction of syllable order are supported by the left IPL and IFG, respectively.
The fact that the posterior IPL showed significant activation for the "syllable order > syllable identification" contrast supports our hypothesis regarding its involvement in temporal order processing and is consistent with patient data (Wittmann et al., 2004). It seems reasonable to conclude that the posterior IPL plays an important role in processing the temporal order of syllables. Of particular interest, conduction aphasia has been associated with lesions in the IPL (Bartha and Benke, 2003; Quigg et al., 2006; Geldmacher et al., 2007). Most individuals with conduction aphasia present with frequent phonological errors (Ardila, 1992). Typically, these errors do not violate the phonotactic constraints of the person's native language, suggesting that the phonological form of a word is assembled at the syllable level. Thus, even if a person were able to retrieve the correct lexical item, deficits in the temporal ordering of syllables could contribute to language difficulties by interfering with the translation of auditory speech into articulatory maps. This type of deficit could quite reasonably lead to phonological errors and repetition impairment, such as those seen in conduction aphasia.
In conclusion, we suggest that the left posterior IPL contributes to TOJs of syllables. Whereas the right posterior IPL seems to be associated with TOJs in the visual domain, the same region in the left hemisphere seems to be more important for TOJs in the auditory domain, as suggested by others (Wittmann et al., 2004; Battelli et al., 2007). This may reflect a left-lateralized preference for auditory-motor integration in general, or for speech processing more specifically. Although our study used only normal participants, the results broadly support the notion that damage to the left posterior IPL could directly affect speech and language abilities in a manner similar to that seen in conduction aphasia, by interfering with the temporal analysis of phonological representations to be sent to frontal regions for the assembly of motor plans. Although our study specifically focused on TOJs, we contend that this is only one of the functions supported by the left IPL for the translation between auditory speech and the articulatory code used for motor speech planning.
Footnotes
- This work was supported by the National Institute on Deafness and Other Communication Disorders Grant R01-DC008355.
- Correspondence should be addressed to Dr. Dana Moser, Center for Clinical Neurosciences, University of Texas Health Science Center at Houston, 1333 Moursund Street, Suite H114, Houston, TX 77030. dmoser@bcm.tmc.edu