Overlapping Networks Engaged during Spoken Language Production and Its Cognitive Control

Spoken language production is a complex brain function that relies on large-scale networks. These include domain-specific networks that mediate language-specific processes, as well as domain-general networks mediating top-down and bottom-up attentional control. Language control is thought to involve a left-lateralized fronto-temporal-parietal (FTP) system. However, these regions do not always activate for language tasks and similar regions have been implicated in nonlinguistic cognitive processes. These inconsistent findings suggest that either the left FTP is involved in multidomain cognitive control or that there are multiple spatially overlapping FTP systems. We present evidence from an fMRI study using multivariate analysis to identify spatiotemporal networks involved in spoken language production in humans. We compared spoken language production (Speech) with multiple baselines, counting (Count), nonverbal decision (Decision), and “rest,” to pull apart the multiple partially overlapping networks that are involved in speech production. A left-lateralized FTP network was activated during Speech and deactivated during Count and nonverbal Decision trials, implicating it in cognitive control specific to sentential spoken language production. A mirror right-lateralized FTP network was activated in the Count and Decision trials, but not Speech. Importantly, a second overlapping left FTP network showed relative deactivation in Speech. These three networks, with distinct time courses, overlapped in the left parietal lobe. Contrary to the standard model of the left FTP as being dominant for speech, we revealed a more complex pattern within the left FTP, including at least two left FTP networks with competing functional roles, only one of which was activated in speech production.


Introduction
Spoken language production is a complex brain function involving many linguistic processes (Indefrey and Levelt, 2004). It also engages motor-sensory systems (Bohland and Guenther, 2006; Guenther et al., 2006), controlled access to semantic representations (Whitney et al., 2012; Noonan et al., 2013), and domain-general cognitive systems (Raboyeau et al., 2008; Eckert et al., 2009; Brownsett et al., 2014; Vaden et al., 2013). In addition to these "task-positive" systems, a network normally considered to be "task negative," known as the default mode network (DMN), is engaged in the absence of externally driven tasks and may be involved in accessing episodic and semantic memories, without which speech is nonpropositional (Raichle et al., 2001; McKiernan et al., 2006; Binder et al., 2009). Some of these brain networks spatially overlap and interact with each other (de Pasquale et al., 2012; Seghier and Price, 2012; Braga et al., 2013; Simmonds et al., 2014).
Although lesion studies have localized language functions predominantly to the left hemisphere, functional neuroimaging has introduced inconsistent results about the localization of the systems supporting speech production. For example, the involvement of the parietal lobe in spoken language has not been apparent in many neuroimaging studies of language (Brownsett and Wise, 2010). Equally, nonlinguistic tasks can activate accepted language-specific regions (Fedorenko et al., 2012). Similarly, domain-general systems can activate bilateral frontal-temporal-parietal (FTP) cortices that overlap with putative language networks (Dosenbach et al., 2008; Vincent et al., 2008; Duncan, 2010; Fedorenko et al., 2013). This ambiguity in the neuroimaging studies of spoken language may arise from the interplay between domain-specific and domain-general networks and the spatial overlap of these networks within the left hemisphere.
Multivariate techniques may better reveal activity in networks that are spatially overlapping than univariate subtractive techniques (Leech et al., 2011; Geranmayeh et al., 2012). We have shown previously, using independent component analysis, that a left-lateralized FTP network is involved in speech production (Geranmayeh et al., 2012). This network included inferolateral temporal cortex, left dorsolateral prefrontal cortex (dlPFC), and the left inferior parietal lobe (IPL). However, the univariate analyses in that study revealed a nonsignificant deactivation with speech production in the IPL. The described ambiguity in the literature and the differences between our previous multivariate and univariate results may be explained by the presence of functionally independent or anticorrelated networks that spatially overlap within the left hemisphere.
We performed an fMRI study of speech production to test for the presence of these functionally independent but partially spatially overlapping spatiotemporal systems within the left hemisphere. We incorporated a nonverbal cognitive task and a nonpropositional speech task (counting), to disambiguate domain-specific (linguistic) processes from domain-general (nonlinguistic) cognitive control during propositional speech production. Based on the findings of Geranmayeh et al. (2012), we hypothesized that the multivariate analysis would identify a left-lateralized FTP network specifically activated during propositional speech production. In addition, we hypothesized that this network would spatially overlap with other networks that deactivate during propositional speech production and are therefore functionally different from the speech-related left FTP network.

Materials and Methods
Participants and fMRI procedure. Twenty-six right-handed fluent English-speaking participants without neurological illness were scanned. The data from two participants were excluded because of excessive movement (>4.5 mm relative motion) and, in one case, additional atrophy. Therefore, 24 subjects were included in the final analysis (7 male, average age: 57 years, range: 37-78 years). Approval for the study was provided by the National Research Ethics Service Committee of West London.
MRI data were obtained on a Siemens Magnetom Trio 3 tesla scanner using a T2*-weighted, gradient-echo, dual-echo echoplanar, parallel imaging sequence with whole-brain coverage. Thirty-six contiguous axial slices with a thickness of 3 mm were acquired in an interleaved order (resolution: 3.5 × 3.5 × 3.0 mm; field of view: 225 × 225 × 108 mm). Repetition time, 10 s; acquisition time, 2 s; first echo time (TE1), 13 ms; second echo time (TE2), 31 ms; flip angle, 90°. Quadratic shim gradients were used to correct for magnetic field inhomogeneities within the brain. In addition, a high-resolution 1 mm³ T1-weighted whole-brain structural image and field maps were obtained for each subject. Despite measures taken in this study to reduce field inhomogeneities, magnetic susceptibility differences, particularly within the anterior temporal lobes (Devlin et al., 2000), mean that the signal from this region, and hence its role in the tasks, is likely to be underestimated. This study analyzed the data from TE2 (31 ms). If the hypothesis had been focused on regions that are significantly affected by susceptibility artifacts, it would have been prudent to investigate ways of combining images acquired at the early and late echo times to allow a fuller understanding of the role of these regions in the present study (Halai et al., 2014).
fMRI paradigm design. A "sparse" fMRI design (Hall et al., 1999) was used to minimize movement- and respiratory-related artifacts associated with spoken language production (Gracco et al., 2005; Mehta et al., 2006; Geranmayeh et al., 2012). Tasks were performed in response to specific visual stimuli during an epoch of 7 seconds (s). After this, a fixation cross was displayed, which was the cue for the subjects to discontinue the task. One second later, whole-brain functional imaging data were acquired over 2 s. The cycle was then repeated. This technique acquires data after task-related head movement and immediate speech-related respiration have temporarily ceased and the delayed hemodynamic response function is at its peak.
During each scan, the subjects performed three runs of task fMRI. Each run consisted of four conditions that were pseudorandomly grouped in blocks of two or four trials. Each run consisted of 20 spoken language production (Speech), 16 counting (Count), 16 silent rest (Rest), 16 nonverbal decision response conditions (Decision), and 4 TRs during which subjects read an instruction page that preceded the Decision trials. Therefore, each run contained 72 repetitions. Twenty subjects were scanned again after an average interval of 98 d (range 64-173 d). A paired t test of the univariate analysis showed no significant biological whole-brain differences between the two scans. Therefore, all three runs for the two scans for these subjects were combined in a fixed-effects analysis and were included as inputs to higher-level analysis.
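The trial counts and timing described above can be checked with a little arithmetic. The following is a minimal sketch and not part of the study's analysis code; the dictionary and variable names are illustrative, and the 10 s cycle is taken from the repetition time reported in the acquisition parameters.

```python
# Trials per run, as reported in the text (names are illustrative).
trials_per_run = {
    "Speech": 20,
    "Count": 16,
    "Rest": 16,
    "Decision": 16,
    "Instruction": 4,  # instruction pages preceding the Decision blocks
}

# One volume per cycle: 7 s task + 1 s fixation + 2 s acquisition.
TASK_S, FIXATION_S, ACQUISITION_S = 7, 1, 2
cycle_s = TASK_S + FIXATION_S + ACQUISITION_S

volumes_per_run = sum(trials_per_run.values())
run_duration_s = volumes_per_run * cycle_s

print(cycle_s)          # 10
print(volumes_per_run)  # 72
print(run_duration_s)   # 720
```

Under these assumptions, each run lasts 720 s (12 min) and yields 72 volumes.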
During the Speech trials, subjects were required to define colored pictures of noun objects selected from a standardized picture set (Snodgrass and Vanderwart, 1980; Rossion and Pourtois, 2004). A total of 120 pictures representing monosyllabic nouns were selected from this picture set. The nouns were matched across each of the runs and scanning sessions with respect to imageability, concreteness, familiarity, and Kucera-Francis frequency based on measures derived from the Medical Research Council psycholinguistic database (Wilson, 1988). A different picture was used for each trial. The pictures were displayed at the center of a screen inside the bore of the scanner. The subjects were instructed to speak for the entire 7 s when the picture was displayed and to generate as much verbal information pertaining to the given object as possible. After 7 s, the picture was replaced by a fixation cross for 1 s, a signal for the subject to cease speaking before image data acquisition over 2 s. An example of a subject's response to the picture of a spoon was "it's a spoon, it's got a long handle you hold, it could be a teaspoon." The participants received training before the scan.
The cycle was the same for the Count trials, except that the subjects saw a sign "1..." printed in large black font for 7 s, during which time they were required to count up from 1 at a rate of 1/s. During the Rest trials, the subjects saw the fixation cross for the entire 8 s before data acquisition. The Decision trials were presented in blocks of four consecutive trials, preceded by a trial containing an instruction page with a simple written and pictorial instruction reminding the subjects of the task. The task itself required no explicit verbal or linguistic processing; audio recording during the Decision task showed that <0.5% of the decision trials across all subjects involved any overt verbalization. The subjects were instructed to press a button placed in the left hand every time they saw a blue square and to ignore orange circles. During the 7 s, either a blue square or an orange circle was presented at the center of the screen in a random order, each displayed up to a maximum of 1.5 s. The next stimulus followed with a gap of 0.5 s either after 1.5 s had elapsed or if the subject made a response. The percentage of correct responses was calculated.
Behavioral analysis. Speech output was recorded using an MR-compatible microphone (FOMRI-III noise-cancelling microphone; Optoacoustics). The recordings of Speech trials were phonetically transcribed and suprasegmental aspects were not included in the transcription. The transcription was then analyzed to calculate both the number of appropriate information-carrying words (AICW) as defined by the Comprehensive Aphasia Test battery (Swinburn et al., 2005) and the total number of syllables (function words included) produced per trial. Inevitably, in some trials, there were occasional morphological, syntactic, and/or semantic errors in the utterance, with intermittent repetition of a word. These errors of production were excluded in the breakdown of AICWs, but they were included in syllable rates. Fillers ("um," "er," etc.) were not included in the count. A Pearson correlation analysis was used to identify the relationship of AICWs to syllables in both runs. These measures were correlated with activations derived from the fMRI.
Univariate whole-brain analysis. Univariate analysis was performed with a general linear model (GLM) using the FMRI Expert Analysis Tool Version 6.00, part of FMRIB's Software Library (FSL; www.fmrib.ox.ac.uk/fsl). TE2 EPIs were extracted. The slices were acquired in an interleaved manner with a TR of 10 s that does not allow for T1 stabilization of spins. Therefore, before the standard preprocessing steps were carried out, signal intensity normalization of alternate slices was performed. Images were preprocessed in the following manner: realignment for motion correction using MCFLIRT, nonbrain voxel removal with the Brain Extraction Tool (Smith, 2002), spatial smoothing using an 8 mm full-width half-maximum Gaussian kernel, grand-mean intensity normalization of the entire 4D dataset by a single multiplicative factor, and high-pass temporal filtering (Gaussian-weighted least-squares straight line fitting, with σ = 50 s) to correct for baseline drifts in the signal. Registration of EPI images to high-resolution structural images was performed by Boundary-Based Registration (Greve and Fischl, 2009) and field-map-based distortion correction. The high-resolution structural image was registered to the Montreal Neurological Institute standard space images (MNI 152) using FMRIB's Linear Image Registration Tool (FLIRT). Time-series statistical analysis was performed using FMRIB's Improved Linear Modeling (FILM) with local autocorrelation correction. The combination of different runs at the individual subject level was analyzed using a fixed-effects model. The design matrix modeled the different task conditions, the four Decision instruction trials at the beginning of each Decision block, and EPI volumes recognized as motion outliers. Contrast images were produced from these individual analyses and used in the higher-level analysis. Higher-level between-subject analysis was performed using a mixed-effects analysis. Final statistical images were corrected for multiple comparisons using Gaussian random-field-based cluster inference with a height threshold of z > 2.3 and a cluster significance threshold of p < 0.05.
Multivariate analysis: independent component analysis. Independent component analysis (ICA) is a multivariate analysis technique that can extract important information from the data that is not always apparent from a subtractive univariate analysis (Geranmayeh et al., 2012). ICA takes advantage of fluctuations in the fMRI data to separate the signal into maximally independent spatial maps or components, each explaining unique variance of the 4D fMRI data. The total variance in the data is separated among the different components. Each component has a time course that may relate to coherent neural signaling associated with a specific task, artifact, or both. ICA may find artifacts that persist over the entire duration of the data acquisition, such as artifacts associated with blood flow in venous sinuses or noise components related to a specific condition; for example, movement-related artifact in spoken language production paradigms that is not removed by the initial image preprocessing. ICA was performed using group concatenation Probabilistic Independent Component Analysis (Beckmann and Smith, 2004) as implemented in Multivariate Exploratory Linear Decomposition into Independent Components (MELODIC) Version 3.10, part of FSL. The following data preprocessing was applied to the input data: masking of nonbrain voxels, voxelwise demeaning, normalization of the voxelwise variance, and registration to standard space. Preprocessed data were whitened and projected into a 55-dimensional subspace using Principal Component Analysis. The whitened observations were decomposed into sets of vectors which describe signal variation across the temporal domain (time courses), the session/subject domain, and the spatial domain (maps) by optimizing for non-Gaussian spatial source distributions using a fixed-point iteration technique (Hyvärinen, 1999). Estimated component maps were divided by the SD of the residual noise and thresholded by fitting a mixture model to the histogram of intensity values (Beckmann and Smith, 2004).
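Two of the preprocessing steps listed above, voxelwise demeaning and normalization of the voxelwise variance, can be illustrated as follows. This is a simplified stand-in written for this passage, not MELODIC's implementation, whose variance normalization may differ in detail; the function name and the toy values are hypothetical.

```python
from statistics import mean, pstdev


def demean_and_normalize(time_series):
    """Remove the mean of one voxel's time series and scale it to unit variance.

    A constant time series (zero variance) is returned as all zeros.
    """
    m = mean(time_series)
    sd = pstdev(time_series)
    if sd == 0:
        return [0.0 for _ in time_series]
    return [(x - m) / sd for x in time_series]


# Hypothetical toy data: each row is one voxel, each column one time point.
data = [
    [100.0, 102.0, 98.0, 100.0],
    [50.0, 50.0, 50.0, 50.0],
]
normalized = [demean_and_normalize(voxel) for voxel in data]
```

After this step every non-constant voxel has zero mean and unit variance, so voxels with large raw signal amplitude do not dominate the subsequent decomposition.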
Of the 55 components, 44 were clearly related to residual movement artifact (characterized by the majority of the signal being distributed around the edge of the brain or within the CSF spaces), variation in head size, or vascular blood flow. The high proportion of components classified as artifactual is comparable to previous fMRI studies investigating speech production, which attributed approximately two-thirds of the derived components to artifact (Geranmayeh et al., 2012; Simmonds et al., 2014). For comparison, in an ICA of fMRI data from resting state, which arguably has less motion-related artifact than that from speech production studies, the investigators attributed 25 of 70 components to artifact (Smith et al., 2009). One advantage of ICA over univariate approaches is that it explicitly models these structured noise sources and allows for identification of components related to non-neural noise.
Of the 55 components, 11 had correlated signal distributed within the brain parenchyma. Herein, we refer to these non-noise components as "networks" and, where appropriate, label them with regional descriptors (e.g., left-lateralized FTP). To demonstrate how the 11 components spatially differ from each other, a pairwise spatial correlation was performed between each pair of components.
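The pairwise spatial comparison can be sketched as below. This is an illustration only: it assumes that each component map has been flattened to a list of voxel values and that the similarity measure is a Pearson correlation, which the text does not specify; the function names and the toy maps are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of voxel values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def max_pairwise_correlation(maps):
    """Return the pair of map names with the largest |r|, and that r.

    `maps` is a dict of name -> flattened spatial map.
    """
    best_pair, best_r = None, 0.0
    for (name_a, a), (name_b, b) in combinations(maps.items(), 2):
        r = pearson(a, b)
        if best_pair is None or abs(r) > abs(best_r):
            best_pair, best_r = (name_a, name_b), r
    return best_pair, best_r


# Hypothetical toy maps with four voxels each.
toy_maps = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [1.0, -1.0, 1.0, -1.0],
}
pair, r = max_pairwise_correlation(toy_maps)
```

With 11 components this loop visits 55 pairs; a small maximum value indicates that no two maps are close spatial copies of one another.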
For each of the 11 components, we established whether that component was significantly functionally involved in any task condition by applying a GLM to the component's time course: the associated run-specific time courses for each subject (dependent variable) were regressed against the design matrix for the tasks (independent variable) and tested for significance to identify components where activity was greater during Speech, Count, or Decision against Rest or other contrast combination. Given that ICA is a data-driven, model-free approach, it is only appropriate to correct for multiple comparisons at this stage when applying the GLM. To this end, we Bonferroni corrected for multiple comparisons with the 11 components and 9 contrasts, resulting in a significance threshold of p < 0.0005.
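The corrected threshold follows directly from the number of tests. The sketch below assumes a family-wise alpha of 0.05, which is not stated explicitly but is consistent with the reported threshold; the constant and function names are illustrative.

```python
ALPHA = 0.05        # assumed family-wise error rate
N_COMPONENTS = 11   # non-artifactual components
N_CONTRASTS = 9     # contrasts tested per component

n_tests = N_COMPONENTS * N_CONTRASTS
threshold = ALPHA / n_tests


def is_significant(p_value, threshold=threshold):
    """True if a p value survives the Bonferroni-corrected threshold."""
    return p_value < threshold


print(n_tests)              # 99
print(round(threshold, 4))  # 0.0005
```

Dividing 0.05 by 99 tests gives approximately 0.000505, which rounds to the reported threshold of 0.0005.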
ICA divides the total variance in the 4D data among the different components (as detailed in Table 2). However, the magnitude of this share of variance associated with each component is not the key issue in interpretation of the ICA results. Rather, it is the degree to which the time course associated with each component is predicted by the task time course in the GLM that is more informative about the behavior of that component in relation to task.
It is important to note that ICA can reveal multiple, separable spatiotemporal networks within any one brain region, each with different functional roles. This overlap could be the result of the presence of spatially adjacent but functionally different neurons in that region or of neurons that are flexibly involved in different functional networks. The present dataset cannot pull apart these two alternative explanations. Therefore, describing a spatiotemporal network that encompasses a brain region as "inactive" or "deactivated" during a specific task does not, by itself, mean that the region is not involved in the task. Instead, it simply implies that the specific spatiotemporal signal attributed to that network is deactivated in that task.
To test the robustness of the left FTP network, the ICA was repeated by decomposing the data into a varying number of components: 20, 25, 30, 35, 40, 45, 50, 55, and 60. The dimensionality of the ICA is usually driven by previous published work and is chosen somewhat arbitrarily (Smith et al., 2009). Fifty-five was initially chosen as representing a good balance between richness and interpretability of the derived components. Networks defined at lower dimensionalities have, in some cases, been shown to split at higher dimensionality into subnetworks (Smith et al., 2009), whereas higher-dimensional ICA models noise more accurately by extracting variations in the data as additional components (Braga et al., 2013). In the present study, higher dimensionalities were predicted to be better able to separate out temporally independent but spatially overlapping networks that we hypothesized to exist within the parietal lobe, in particular the left FTP.
Confirmatory functional connectivity analysis. To confirm that the left FTP network is indeed language related, we performed the first stage of a dual regression analysis with an unbiased left FTP network derived by Geranmayeh et al. (2012) in a different independent experiment as a "seed" region (Filippini et al., 2009; Zuo et al., 2010; Leech et al., 2011). This involved back projecting or spatially regressing the spatial map of the left FTP language network (Geranmayeh et al., 2012; see their Fig. 3, component 24) into each run's 4D dataset for each participant (total 138 runs) to give a set of time courses for this spatial component. This approach is based purely on the GLM and is similar to a seed-based connectivity analysis (except that the seed region is a weighted map taken from a prior analysis rather than a single voxel or average time course from across a mask). This time course was then regressed against the design matrix to calculate a β coefficient that is the estimate of BOLD signal evoked for the different task conditions. The β coefficients for each individual were then tested at the group level.
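The two regressions described above can be sketched on toy data. This is a minimal illustration under stated assumptions, not the FSL dual regression implementation: it uses a single spatial map, demeans the map and each volume spatially, and fits a single task regressor rather than the full design matrix. All names and values are hypothetical.

```python
def spatial_regression(spatial_map, volumes):
    """Least-squares weight of `spatial_map` in each volume.

    Returns one value per time point: the time course of the map.
    """
    n = len(spatial_map)
    map_mean = sum(spatial_map) / n
    m = [v - map_mean for v in spatial_map]
    denom = sum(v * v for v in m)
    time_course = []
    for vol in volumes:
        vol_mean = sum(vol) / n
        weight = sum(a * (b - vol_mean) for a, b in zip(m, vol)) / denom
        time_course.append(weight)
    return time_course


def task_beta(time_course, regressor):
    """OLS slope of the time course on one task regressor (with intercept)."""
    n = len(regressor)
    mx = sum(regressor) / n
    my = sum(time_course) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(regressor, time_course))
    sxx = sum((x - mx) ** 2 for x in regressor)
    return sxy / sxx


# Hypothetical toy data: 4 voxels, 4 volumes.
seed_map = [1.0, 0.0, -1.0, 0.0]
volumes = [
    [2.0, 0.0, -2.0, 0.0],  # task volume: map expressed with weight 2
    [0.0, 0.0, 0.0, 0.0],   # rest volume
    [2.0, 0.0, -2.0, 0.0],  # task volume
    [0.0, 0.0, 0.0, 0.0],   # rest volume
]
task = [1.0, 0.0, 1.0, 0.0]

tc = spatial_regression(seed_map, volumes)
beta = task_beta(tc, task)
```

In this toy case the recovered time course is 2 during task volumes and 0 at rest, so the task β is 2; in the study, one such β per condition and run would then be carried forward to the group-level test.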

Behavioral results
Before scanning, the subjects were tested on the following: (1) the shortened Raven's Matrices, (2) a composite cognitive score based on the Comprehensive Aphasia Test (recognition memory, semantic memory, arithmetic, motor apraxia), and (3) a composite score for phonetic and semantic fluency (the number of words beginning with "S" plus the number of animal names, respectively, generated in 1 min). The mean scores were as follows: Raven's Matrices 96% (range 83-100%), cognitive score 95% (71-100%), and total fluency score 43 (20-62). Across each scanning run, participants spoke 7.42 appropriate information-carrying words (SD 1.54) per 7 s trial and 2.64 syllables per second (SD 0.59). The two measures significantly correlated with each other (r = 0.81, p < 0.0001). There was no significant difference between the two scanning sessions with respect to the syllable rate and appropriate information-carrying words (paired t test, p > 0.05).
In the Decision task, during fMRI data acquisition, subjects correctly identified the blue square target on 98.8% of trials and correctly inhibited a response to the orange circles on 99% of trials (suggesting a ceiling effect with respect to the level of task difficulty).

Univariate analysis
Before performing multivariate analyses, which might have revealed multiple distinct and partially overlapping patterns of activation, baseline standard whole-brain univariate analyses were performed on the time courses from each voxel. Univariate analysis of Count against Rest (Fig. 1, Table 1) revealed patterns of activation that have been previously reported during speaking: the bilateral motor-sensory cortices, superior temporal gyri (STG) (the primary and association auditory cortices, including the plana temporale and adjacent parietal opercula), and supplementary motor areas (SMAs), merging with activity in the dorsal anterior cingulate gyri (dACC) and the adjacent superior frontal gyri (SFG). Right-lateralized activity was evident in the inferior frontal gyrus (IFG) and IPL. The ventral frontoparietal asymmetry for counting was consistent with a previous study showing a reversal of interhemispheric asymmetry in BA 45 between sentential speech and counting, with counting lateralizing to the right and sentential speech lateralizing to the left hemisphere (Dhanjal et al., 2008).
The contrast of the Speech with the Rest condition (Fig. 1) revealed similar areas of activation to that shown in the study of Geranmayeh et al. (2012). Activity overlapped with that observed in the contrast of the Count with the Rest condition in bilateral motor-sensory cortex, the STG, the paravermal cerebellum, and SMA/dACC/SFG. The contrast of the Speech with the Count condition (Fig. 1) revealed additional activity in left dlPFC, including the pars triangularis and opercularis, the middle and inferior frontal gyri, the middle temporal gyrus (MTG), and the inferior temporal gyrus (more on the left than right). There was also greater activity in the SFG/SMA/dACC in the Speech condition compared with Count extending more anteriorly, which is consistent with previous studies (Blank et al., 2002; Awad et al., 2007; Dhanjal et al., 2008). There was also additional symmetrical activation in visual cortices, the posterior fusiform gyri, and regions associated with the dorsal visual attention network (lateral occipital cortices extending to the superior parietal lobules and the bilateral precentral gyri). These activations related to the visual nature of the picture description task. Importantly, as with the previous study by Geranmayeh et al. (2012), there was no left parietal activity observed in the univariate analysis for the Speech condition.

Multivariate analysis
The initial spatial ICA was constrained to calculate 55 components. Of the 55 components generated, 44 were judged to be related to movement and other sources of artifact (see methods). These were excluded from further consideration. The remaining 11 components showed patterns of temporally coherent signal confined to the brain parenchyma. For convenience, we subsequently refer to these components as brain networks and label them according to their spatial location (e.g., left FTP) or previously well described labels (e.g., DMN). Figure 2 and Table 2 show the spatial distribution of each of these 11 components. Figure 3 displays the result of the regression analysis on each component, showing how the different networks are modulated by the three task conditions. To demonstrate how the 11 components differed spatially from each other, a pairwise spatial correlation was performed between each pair of components (Fig. 4). The maximum pairwise correlation ratio was 0.17, suggesting that, although some components load the different tasks in the same manner (e.g., C9 and C19 or C3, C28, and C33), they are clearly spatially distinct.
4.0*10⁻⁹) and suppressed in the Count and Decision trials. This is consistent with the results of Geranmayeh et al. (2012), which showed that activity in a spatially similar left-lateralized FTP network was significantly greater in spoken language production than silent movements of the tongue. To confirm this similarity with the previous study, we used the left FTP network described previously (Geranmayeh et al., 2012) in the first stage of a dual regression analysis (i.e., calculating subject-specific time courses for the left FTP network from the previous study). These time courses were then regressed against the task design matrix, revealing the same speech-specific pattern of activation as observed with C4 [i.e., left FTP significantly correlated with Speech (t(137) = 3.97, p = 0.0001) and significantly anticorrelated with Count (t(137) = −3.19, p = 0.001)].
In addition to C4, several other components showed Speech-related activity. Consistent with the univariate analysis, ICA revealed two motor-sensory systems (C2 and C19) associated with Speech and Count (C2, Speech t(137) = 8.5, p = 1.3*10⁻¹³; C2, Count t(137) = 8.6, p = 1.0*10⁻¹³; C19, Speech t(137) = 12.4, p = 3.0*10⁻²⁰; C19, Count t(137) = 11.2, p = 4.3*10⁻¹⁸). C19 was also significantly activated to a lesser degree during the Decision trials that required the subjects to respond to a visual stimulus by pressing a button (t(137) = 3.5, p = 3.0*10⁻⁵). These two components revealed many of the regions previously described as being involved in the motor-sensory control of spoken language production (Riecker et al., 2005; Bohland and Guenther, 2006; Guenther et al., 2006). C1 showed activity related to the visual cortices and dorsal attention system engaged in goal-directed, top-down selection of stimuli and shift of spatial attention and eye movement (Corbetta and Shulman, 2002; Vincent et al., 2008). This component was specific to the Speech trials (t(137) = 5.0, p = 2.2*10⁻⁷), which, unlike the other conditions, required the subjects to describe colored pictures of objects, which was expected to evoke top-down visual attentional control and higher-level visual processing.
C30 showed significant Speech-related activity (t(137) = 12.1, p = 8.0*10⁻²⁰) in the anterior temporal lobes (left more than right). This activation was significantly more than that seen in Count and Decision trials. It may be related to semantic retrieval processes engaged by the Speech task because this task is impaired in patients with semantic dementia, in whom the site of major atrophy is the anterior temporal lobes (left more than right; Acosta-Cabronero et al., 2011). A number of other components either failed to activate for speech or showed relative deactivations. C3 showed significant deactivation in all three tasks with significantly more deactivation in the Speech condition than the Count (t(137) = 7.1, p = 6.0*10⁻¹¹) or Decision (t(137) = 5.5, p = 3.1*10⁻⁸) trials. C3 is spatially similar to the classic DMN, which has consistently been shown to activate more during task-negative, "interoceptive" cognition and to deactivate during task-positive responses to external stimuli (Gusnard and Raichle, 2001; Raichle et al., 2001; Esposito et al., 2006; Buckner et al., 2008).
Table 1. Coordinates for each local maximum within significant clusters of activation (cluster-corrected p < 0.05) are given for contrasts of Count > Rest, Speech > Rest, and Speech > Count (see Figure 1).
A right-lateralized FTP network (C5) that mirrored the spatial distribution of the left-lateralized FTP network (C4) was active in Count (t(137) = 10.0, p = 3.2*10⁻¹⁶) and Decision (t(137) = 8.5, p = 1.5*10⁻¹³) but deactivated in Speech (t(137) = −3.7, p = 2.2*10⁻⁵). This network may be involved in the top-down attention processes that are required for the Count and nonverbal Decision trials. Its relative deactivation in Speech trials would be consistent with the attention and control of language production being lateralized to the left, whereas for the other domains, it is lateralized mainly to the right. These mirror FTP systems overlapped in the left parietal and frontal lobes, suggesting that there are overlapping subregions within these cortical regions that support separate functional neural networks (Fig. 5A).
Component 7 included bilateral frontoparietal and cingulo-opercular networks and bilateral precentral gyri. This component was significantly activated in the Decision trials (t(137) = 7.3, p = 2.3*10⁻¹¹) and deactivated in the Speech trials (t(137) = −11.6, p = 6.5*10⁻¹⁹). This component also included activity in the territory of the so-called salience network (bilateral frontal operculum/anterior insular cortex and dACC). Activity in the bilateral frontoparietal and cingulo-opercular networks may be related to the executive demands of the Decision trials, with identification of the salient stimuli (blue squares) while inhibiting any response to the orange circles (Dosenbach et al., 2007; Vincent et al., 2008; Duncan, 2010; Menon and Uddin, 2010; Spreng et al., 2010).
Interestingly, although in some ways spatially similar to the left-lateralized Speech component (C4), C28 was strongly deactivated in both the Speech (t(137) = −21.6, p = 1*10⁻²⁰) and Count (t(137) = −9.0, p = 2.2*10⁻¹⁴) trials. It is a left-lateralized FTP network that overlaps with the left FTP language network (C4) in all three lobes. The areas of overlap include the left IFG, left superior parietal lobule, and left MTG (Fig. 5B). C33 was significantly deactivated during all tasks (Speech > Count > Decision), with significantly more deactivation in the Speech condition than the Count (t(137) = 7.8, p = 3.4*10⁻¹²) or Decision (t(137) = 5.5, p = 4.3*10⁻¹⁸) trials. This component corresponds to the rostral medial occipital lobes, extending up into the cuneus. It corresponds to a resting state network identified previously, but its precise role is not yet clear. Negative alongside positive BOLD signal has been previously identified in visual cortex during visual tasks (Wade and Rowland, 2010; Goense et al., 2012), but the activity observed in C33 seems to be located rostral to this previously observed negative signal.
Figure 2. Spatial distribution of 11 biologically plausible independent task-related components of the total of 55 components derived from the group-concatenated ICA analysis. Red-yellow represents correlated activity and blue represents anticorrelated activity. This figure shows every 8th axial slice in 2 mm MNI 152 standard space, starting with the lowest slice at z = −16 mm and the highest at z = 46 mm. Statistical threshold was set at z > 2.3. See Table 2 for coordinates.
Table 2. Coordinates for each local maximum within significant clusters of activity (cluster-corrected p < 0.05) are given for the components shown in Figure 2. PCC, Posterior cingulate cortex; SMG, supramarginal gyrus. For each component, the percentage of the total variance in the data explained by that component is given in parentheses.
C9 showed activation during all three tasks. It is a noisy component with activation in the bifrontal cortices, mixed with movement-related noise at the edge of the brain and CSF-parenchyma interface. Due to the large contribution of noise to this component, it will not be discussed further.
[…] Figure 2. Dark gray, light gray, and blank bars represent Speech, Count, and Decision trials, respectively. Significance for each condition against rest is shown with error bars and significant differences between conditions are shown with horizontal bars. Bonferroni-corrected p < 0.0005.
[…] Figure 2. The maximum correlation ratio between components was r = 0.17, suggesting that, although some components load the different tasks in the same manner (e.g., see C9 and C19 or C3, C28, and C33 in Fig. 3), they are predominantly spatially distinct. The bar chart refers to the correlation coefficient. Blue colors have a low correlation coefficient; red colors have a high correlation coefficient.

Robustness of the left FTP network at different ICA dimensionalities
A higher dimensionality in ICA may be better able to decompose larger networks into more discrete subnetworks (Smith et al., 2009). To test the stability of the Speech-specific left FTP network C4 (and how the analysis separated it from C28, the other spatially partially overlapping left FTP network), we performed the ICA at different dimensionalities. At higher dimensionalities (60, 50, and to some extent 45), a component with high spatial correlation with C4 (r > 0.5) was identified and its time course remained strongly associated with the Speech trials. Similarly, component 28 (with the opposite pattern of deactivation with Speech) was consistently found at these higher dimensionalities. However, at lower dimensionalities (20, 25, 30, 35, and 40), there was no single component with a strong spatial correlation with C4. Instead, multiple components with moderate or weak correlations were identified that either did not significantly activate for Speech or showed a relative deactivation during Speech. Therefore, the higher decompositions were better able to differentiate between overlapping components in the left frontoparietal regions. At lower dimensionalities, the two networks with opposite patterns of functional activation (C4 and C28) could not be differentiated (Fig. 6).
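The matching step described above can be illustrated with a minimal sketch. It assumes that each component is available as a flattened voxel-wise map and that "spatial correlation" means the Pearson correlation across voxels; the function names, the synthetic maps, and the use of NumPy are illustrative assumptions and not the pipeline used in the study. Only the r > 0.5 criterion is taken from the text.

```python
import numpy as np

def spatial_correlation(map_a, map_b):
    """Pearson correlation across voxels between two spatial maps."""
    a = np.asarray(map_a, dtype=float).ravel()
    b = np.asarray(map_b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a @ a) * (b @ b))
    # A constant map has no spatial variance; treat it as uncorrelated.
    return float(a @ b / denom) if denom > 0 else 0.0

def best_match(reference_map, candidate_maps, threshold=0.5):
    """Find the candidate component most similar to the reference map.

    Returns (index, r, passes), where passes is True when r exceeds
    the threshold (0.5 here, following the criterion in the text).
    """
    rs = [spatial_correlation(reference_map, c) for c in candidate_maps]
    idx = int(np.argmax(rs))
    return idx, rs[idx], rs[idx] > threshold

# Synthetic stand-ins for voxel maps (hypothetical data, not study results).
rng = np.random.default_rng(0)
reference = rng.normal(size=1000)                     # plays the role of C4
candidates = [
    rng.normal(size=1000),                            # unrelated component
    reference + 0.5 * rng.normal(size=1000),          # noisy copy of the reference
    -reference,                                       # sign-flipped map
]
idx, r, passes = best_match(reference, candidates)
print(f"best match: component {idx}, r = {r:.2f}, passes threshold: {passes}")
```

Repeating `best_match` against the component set from each dimensionality would give one best-matching r per decomposition, which is the kind of summary plotted in Figure 6.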

Conjunction of networks modulated by the task conditions
The multivariate analysis breaks up the fMRI data into multiple components with independent sources of variance. Although these components are spatially dissimilar, they can be functionally modulated by tasks in a similar manner. Figure 7 recombines networks whose spatiotemporal signals were (1) uniquely activated for propositional speech (Speech), (2) activated during general speaking (propositional speech and counting), and (3) deactivated during propositional speech. These conjunction maps show that, across all components, there is a generally left-lateralized system activated for propositional speech, a more bilateral sensorimotor system involved in general speech tasks (Count and Speech), and a more right-lateralized deactivation during Speech.
[…] The y-axis shows the correlation coefficient for each of the components. Top, Lower-dimensional ICA extracted multiple networks, each with lower spatial correlations with C4. In contrast, higher ICA decompositions extracted components, one of which showed a strong spatial correlation with C4. In this dataset, higher dimensionalities were better able to extract the Speech-specific left FTP network. Bottom, C28 was not well extracted at low dimensionalities (20 and 25), but was uniquely and strongly extracted with ICAs at lower dimensionalities compared with ICAs that were able to identify C4.
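As a rough illustration of how components sharing a task profile can be recombined into one map, the sketch below thresholds each selected component's z-map and counts, per voxel, how many components survive. The function name, the dictionary layout, and the toy values are assumptions made for illustration; the z threshold of 4 and the component labels C1, C30, and C4 follow the Figure 7 caption, but this is not the authors' procedure.

```python
import numpy as np

def conjunction_mask(z_maps, component_ids, z_threshold=4.0):
    """Combine thresholded component maps.

    z_maps: dict mapping a component label to its voxel-wise z-map.
    Returns (union, overlap_count): a boolean map of voxels where at
    least one selected component exceeds the threshold, and the number
    of selected components exceeding it at each voxel.
    """
    above = np.stack(
        [np.asarray(z_maps[c], dtype=float) > z_threshold for c in component_ids]
    )
    overlap_count = above.sum(axis=0)
    return overlap_count > 0, overlap_count

# Toy four-voxel z-maps (hypothetical values, not study data).
toy_maps = {
    "C1":  np.array([5.0, 0.0, 0.0, 4.5]),
    "C30": np.array([0.0, 6.0, 0.0, 4.2]),
    "C4":  np.array([0.0, 0.0, 1.0, 5.0]),
}
union, count = conjunction_mask(toy_maps, ["C1", "C30", "C4"])
print(union)   # voxels covered by at least one Speech-activated component
print(count)   # number of components overlapping at each voxel
```

Keeping the per-voxel count alongside the union makes regions of overlap between components explicit, which is the feature of interest in the left parietal lobe.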

Discussion
Using a data-driven approach with ICA, we have demonstrated that there were overlapping spatiotemporal networks in the left FTP cortex, particularly the parietal lobe. Some networks were active and others simultaneously deactivated during spoken language production (Speech). A specific left-lateralized FTP network was activated during propositional spoken language production, but was relatively deactivated during the nonpropositional speech task (Count) or the nonverbal domain-general cognitive control task (Decision). Instead, these two tasks were associated with activity in a right-lateralized FTP network (C5). This network was suppressed during propositional spoken language production and overlapped with the Speech-specific left FTP (C4) in the left superior parietal lobe. We ran ICAs at multiple dimensionalities to show that the key component of interest, the left FTP network associated with Speech, was highly spatially and functionally consistent at higher ICA decompositions.
The left FTP network specific to Speech has a similar distribution to the network that was shown by Geranmayeh et al. (2012) to be active during spoken language production, but not in silent movements of the articulators. Similarly, this FTP network is spatially comparable to the resting state network associated with language tasks in the BrainMap meta-analysis (Smith et al., 2009). We were able to demonstrate a more specific functional role played by this network, namely that its role in spoken language extends beyond low-level motor-sensory aspects shared with counting and, equally, that its deactivation during the Decision trials suggests it is at least relatively less engaged by tasks requiring domain-general cognitive control. Instead, it may be engaged in cognitive control processes that are heavily involved in propositional speech. For example, the left FTP shares a similar distribution to the networks engaged in controlled semantic processing (Binder et al., 2009;Noonan et al., 2013). Noonan et al. (2013) revealed that left and right PFC, left MTG, and left dorsal angular gyrus (AG)/IPS, regions overlapping with the left FTP network in this study, respond to the executive demands of semantic tasks. These left-lateralized areas also overlap with the most common areas of damage in patients with poststroke "semantic aphasia." These patients have relatively preserved semantic representations, but are impaired at accessing these representations in the context of specific word or concept retrieval tasks (Noonan et al., 2010). Transient inhibition of the left PFC, MTG, or IPS/AG using transcranial magnetic stimulation techniques in healthy participants compromises semantic control (Whitney et al., 2011, 2012).
The left FTP network also incorporates extensive left inferior frontal activity, extending ventrally into classic Broca's area, a region usually associated with language-specific processing (Indefrey and Levelt, 2004;Vigneau et al., 2006;Binder et al., 2009). However, there is emerging evidence that parts of this region are implicated in domain-general executive processing (Fedorenko et al., 2012) and controlled access to semantics (Thompson-Schill et al., 1997;Wagner et al., 2001;Whitney et al., 2012;Noonan et al., 2013) and phonology (Noonan et al., 2013). More specifically, its relationship to language may in part be management of conflicts arising when selecting among a range of possible semantic and lexical representations during spoken language production (Thompson-Schill et al., 1999;Novick et al., 2009;Schnur et al., 2009).
There was no correlation between speech production measures (syllable rate and appropriate information-carrying words) and BOLD signal change for the contrast of Speech > Rest in this Speech-specific component (C4). Control networks, as opposed to linguistic systems, are likely to operate independently of actual linguistic output. Therefore, a picture that a subject finds difficult to describe, particularly in the time-limited manner that the trials required, may have generated greater cognitive control even when the actual speech output for that trial was relatively low. Interindividual variability in fluency for individual pictures, and particularly when that variability was quite low as reflected in the size of the SD in the group-level behavioral analyses, would account for an absence of a correlation between "effort" and "output." This is consistent with the study by Geranmayeh et al. (2012), in which there was a similar lack of correlation between activity in the left FTP network and actual speech output.
Figure 7. Conjunction of networks modulated by the different task conditions. A, Components specifically activated in propositional speech condition (Speech). Turquoise, C1; yellow, C30; magenta, C4. B, Components that are activated more in the Speech and Count trials compared with the Rest baseline or the Decision trials. Green, C2; red, C19; blue, C9. The latter predominantly reflects non-neural noise intermixed with bilateral frontal parenchymal signal. C, Components that either showed a deactivation only in the Speech condition or showed most deactivation in the Speech condition compared with the other conditions. Yellow, C3; turquoise, C5; red, C7; blue, C28; tan, C33. The components are overlaid on slices from a 2 mm MNI 152 standard brain marked with coordinates in the z direction. Statistical threshold was set at z = 4.
C3 and C5 correspond to two broad brain networks, the DMN and a right FTP network, respectively. The DMN is recognized as a task-negative network where activity decreases during task performance (Raichle et al., 2001;McKiernan et al., 2006). In the present study, the DMN, with its characteristic anatomical distribution, was deactivated in all three tasks, with the greatest deactivation during Speech. Subjectively, selecting things to say about a picture under a strict time constraint is a demanding task. The least deactivation observed was during the Count trials, an overlearned task performed almost effortlessly. Our results suggest that activity in the DMN is related to general task demands rather than specific linguistic processes. This is consistent with findings that activity in the DMN is anticorrelated with a network that is spatially very similar to C4 in our study. Different regions within the DMN may deactivate to different extents in response to specific language tasks (Seghier and Price, 2012). In this study, the C4 network overlapped with the DMN (C3) in the left AG and supramarginal gyrus. Similarly, systems that process semantics (Seghier et al., 2010) or support speech production (Geranmayeh et al., 2012) have been shown to overlap spatially with the DMN in the left AG. In this study, we could not find evidence for the classic DMN being engaged in semantic processing, as has been suggested previously (Binder et al., 1999, 2009). The right FTP network in this study was significantly active during Count and Decision, but deactivated during Speech. Counting is highly overlearned, relatively devoid of linguistic processing, and relies much less, if at all, on semantic memory retrieval. Therefore, attention and cognitive control during counting will be very different from that during propositional speech.
The right lateralization of this component is consistent with the observation of preserved "automatic" speech in aphasic patients with word production difficulties after a left hemisphere stroke (Van Lancker and Cummings, 1999;Bookheimer et al., 2000;Vanlancker-Sidtis et al., 2003). In addition, other studies have suggested that "automatic" speech (including counting) either shows a right-lateralized distribution in healthy controls (Vanlancker-Sidtis et al., 2003) or reduced left lateralization compared with that seen in naming or sentential speech production (Bookheimer et al., 2000;Petrovich Brennan et al., 2007). Counting is, however, a monotonous task that requires sustained attention to execute. The Decision task required both sustained vigilance and "bottom-up" capture initiating a response by the salient stimulus. Both forms of attention have been shown to depend on a right-lateralized FTP system (Corbetta et al., 2008;Singh-Curry and Husain, 2009), but this study did not separate them into two overlapping systems.
Bilateral frontoparietal networks have been identified previously in the context of goal-directed tasks that require controlled processing of information and domain-general control (Dosenbach et al., 2007;Seeley et al., 2007;Vincent et al., 2008;Spreng et al., 2010) and tasks requiring spatial attention (Corbetta and Shulman, 2002;Braga et al., 2013). These "task-positive" and "task-negative" networks often operate in an anticorrelated manner (Fox, 2005;Chen et al., 2013), and the balance between them has been studied in healthy participants (Kelly et al., 2008;Leech et al., 2011) and patients (Bonnelle et al., 2011).

Spatially overlapping networks
C28 was a second left FTP network that overlapped substantially with the Speech-activated FTP network (C4). However, unlike C4, this component was strongly deactivated in the Count and Speech trials and to a lesser degree deactivated in the Decision trials. The regions that overlap between the two left-lateralized FTP networks were the left IFG, left superior parietal lobule, and left MTG. The functional role of C28 is unclear; however, recent evidence suggests that there are similarly distributed left-lateralized networks that have high connectivity to posterior midline structures of the DMN and deactivate on externally focused tasks (Braga et al., 2013). One possible functional role of such networks may be in maintaining a broad vigilant state (Leech and Sharp, 2014) that is distinct from the narrowly focused attentional state that occurs during most fMRI tasks.
We were only able to reveal the existence of these overlapping networks through the use of a multivariate analysis, which, unlike the univariate analysis, showed multiple overlapping networks in the left frontal and parietal lobes. This overlap of functionally distinct networks is the result of structural and functional heterogeneity in the frontal (Amunts and Zilles, 2012;Fedorenko et al., 2012) and parietal lobes (Caspers et al., 2008;Mars et al., 2011;Cabeza et al., 2012). A meta-analysis of 393 neuroimaging studies has shown that the IPL is involved in multiple linguistic and nonlinguistic cognitive domains (M. Lambon Ralph, personal communication).
The distributed spatial organization and the close proximity and overlap of brain networks may reflect an underlying neurobiological organization that is necessary for efficient information processing and rapid integration of information from multiple sources (van den Heuvel and Sporns, 2011;de Pasquale et al., 2012;Braga et al., 2013;Hellyer et al., 2014). Spoken language control is likely to require the ability to rapidly deactivate competing networks that would otherwise interfere with its production while activating the left-lateralized FTP network that may both incorporate and control phonological, semantic, and syntactic expression.