Abstract
Despite myriads of studies on a parallel organization of cortico-striatal-thalamo-cortical loops, direct evidence of this has been lacking for the healthy human brain. Here, we scrutinize the functional specificity of the cortico-subcortical loops depending on varying levels of cognitive hierarchy as well as their structural connectivity with high-resolution fMRI and diffusion-weighted MRI (dMRI) at 7 tesla. Three levels of cognitive hierarchy were implemented in two domains: second language and nonlanguage. In fMRI, for the higher level, activations were found in the ventroanterior portion of the prefrontal cortex (PFC), the head of the caudate nucleus (CN), and the ventral anterior nucleus (VA) in the thalamus. Conversely, for the lower level, activations were located in the posterior region of the PFC, the body of the CN, and the medial dorsal nucleus (MD) in the thalamus. This gradient pattern of activations was furthermore shown to be tenable by the parallel connectivity in dMRI tractography connecting the anterior regions of the PFC with the head of the CN and the VA in the thalamus, whereas the posterior activations of the PFC were linked to the body of the CN and the MD in the thalamus. This is the first human in vivo study combining fMRI and dMRI showing that the functional specificity is mirrored within the cortico-subcortical loop substantiated by parallel networks.
- cognitive hierarchy
- functional specificity of the cortico-subcortical loop
- high-resolution fMRI and dMRI
Introduction
The prefrontal cortex (PFC) has been known to be systematically organized along its anterior-to-posterior axis depending on varying levels of cognitive hierarchies, with the anterior region being involved in the higher level and the posterior region in the lower level (Badre and D'Esposito, 2007; Koechlin and Summerfield, 2007; Jeon and Friederici, 2013). This pattern of activations has also been observed in subcortical structures such as the caudate nucleus (CN) (Badre and Frank, 2012; Mestres-Missé et al., 2012). Similarly, the head and the body of the CN have been known to be interconnected with the corresponding regions of the PFC such that fibers originating from the anterior or posterior portion of the PFC terminated in the head or the body of the CN, respectively (Yeterian and Pandya, 1991; Verstynen et al., 2012). Various thalamic nuclei are also known to receive inputs from several separate but functionally related cortical areas forming multiple parallel and segregated loops (Zhang et al., 2010). Together, both the cortical and the subcortical structures revealed a rostro-caudal functional dissociation and preferential connections to the cortical areas.
Animal and clinical studies have shown that cortico-striatal-thalamo-cortical loops are topographically organized and functionally segregated in each loop (Grahn et al., 2008; Krause et al., 2012). Alexander and colleagues (Alexander et al., 1986; Alexander and Crutcher, 1990) proposed five parallel segregated loops with selected cortical areas in the frontal lobe. Two of them are involved in motor function with skeletomotor and oculomotor areas, and the remaining three loops are related to nonmotor functions with the dorsolateral PFC, the lateral orbitofrontal cortex, and the anterior cingulate/medial orbitofrontal cortices. Of these five loops, we are particularly interested in the dorsolateral prefrontal loop, which passes information from the PFC, via the CN, the pallidum/substantia nigra, and the thalamus, back to the PFC. This loop is known to be recruited in cognitive aspects, including working memory, planning, rule-based learning, and sequence learning (Koziol and Budding, 2009; Goldman-Rakic, 2011).
Based on these findings, we aimed to scrutinize large-scale functional specificity and structural connectivity pertinent to different levels of cognitive hierarchies in the PFC, the CN, and the thalamus. Despite a considerable number of animal and clinical studies, functional organization and structural connections of these brain areas have not yet been fully understood in healthy participants because previous studies were confined to invasive axonal tract-tracing, cell-recording, or postmortem studies, which are not applicable to humans (Zhang et al., 2010). In addition, the functional description of the subcortical structures was not precise enough to discern subareas, given their established functional heterogeneity (Llano, 2013). Here, we addressed these issues by using high-spatial-resolution MR imaging, at 7 tesla (7T), for detailed and noninvasive investigation. We hypothesized that the anterior–posterior functional dissociation of the PFC would also be reflected in the CN and the thalamus, mirroring functional specificity across cortical and subcortical areas. In addition, functionally compartmental areas in the PFC, the CN, and the thalamus would be interconnected by parallel fascicles showing a high correspondence between functional mapping and structural connectivity.
Materials and Methods
Participants.
Nineteen participants (9 males, 10 females; age 20–28 years, mean 22.5 ± 3.7 years) took part in the experiments. They were all native speakers of German with no history of neurological disease, and all were right-handed [mean LQ (laterality quotient) = 92.7 ± 6.6; Oldfield, 1971]. They had normal or corrected-to-normal vision and gave written, informed consent to participate in the study. None of the participants had previously studied Korean or any other Asian languages. The study was approved by the Research Ethics Committee of the University of Leipzig.
Learning procedure.
We designed a cross-domain study (language and nonlanguage) based on the view that a common neural system participates in aspects of processing linguistic sequences and abstract nonlinguistic sequences within the cortico-striatal-thalamo-cortical loop (Dominey et al., 2003; Dominey, 2005). Moreover, this allowed us to confirm the domain generality of the finding. For the language domain, second language (L2) was used instead of first language because the processing of L2 and nonlanguage (NL) is similar in task difficulty, with both engaging controlled processes (Abutalebi, 2008; Jeon and Friederici, 2013).
In the L2 study, participants learned a miniature version of Korean containing simplified grammar, invented vocabulary, and easy pronunciation, which was known to guarantee a successful learning performance in a short period of time (Jeon and Friederici, 2013). The experimental setups met all the requirements for successful L2 learning following the linguistic theories by enabling the participants to learn in an interactive and formal situation (Carroll, 1973; Krashen et al., 1978). Learning comprised three stages: syllable, vocabulary, and grammar. For syllable learning, participants studied seven consonants () and five vowels () that were presented with their pronunciation, thus allowing participants to additionally learn phonological knowledge in Korean. For vocabulary learning, participants studied different types of words: (a) noun: (mother), (father), (sister), (flower), (book); (b) verb: (sell), (buy), (see), (like); (c) temporal adverb: (yesterday), (today), (tomorrow); and (d) numeral noun: (one), (two), (three), along with their pronunciation and pictures. For grammar learning, function words were introduced: (a) particles (subject, object), (b) a verb final ending (), (c) a prefinal ending for subject honorification (), (d) prefinal endings for tense (past, future), (e) a complementizer (), and (f) numeral classifiers for flower () and for book (). Along with the function words, participants learned simplified rules of Korean grammar: (a) subject–verb agreement, (b) applying particles attached to a word to specify the role of the word as a subject or an object, (c) constructing a verb phrase with a prefinal ending for honorification or tense before the verb final ending, (d) using numeral classifiers to count a noun, and (e) constructing an embedded sentence by attaching a complementizer to the end of a verb phrase.
In the NL study, we used Korean vowels and consonants which were not used in the L2 study, renamed them as Nero symbols and Fero symbols, respectively, and devised a color sequence rule. In other words, here, participants processed Korean vowels and consonants not as linguistic stimuli but as nonlinguistic ones. The learning process comprised three stages corresponding to those of L2 learning: Nero/Fero symbol identification, Nero color rule acquisition, and Fero sequence rule application. For symbol identification, participants learned Nero and Fero symbols and classified them into three groups: N1 (), N2 (), N3 () for the Nero group, and F1 (), F2 (), F3 () for the Fero group. For Nero color rule acquisition, we introduced a “chunk” as a corresponding structure for a phrase in L2 and participants learned rules of color sequence in a chunk depending on the first Nero symbol within the chunk. For example, if the first Nero symbol was from N1 (i.e., “” in “”), then the color sequence of the chunk was green (G)–red (R)–yellow (Y)–blue (B) (i.e., ). Similarly, if a chunk started with the Nero symbol from N2 or N3, then the color sequences of the chunks were B–Y–R–G or R–B–G–Y, respectively. When the number of symbols in a chunk was more than four, the color sequence was repeated (i.e., ). For the final learning stage of the Fero sequence rule application, participants studied how to apply the Nero color rule to series of other chunks according to the Fero sequence rule, as follows. Rule 1: If the first Fero symbol in a chunk belongs to F1, the Nero color sequence of the chunk is applied to the current and the next chunk; Rule 2: If the first Fero symbol of a chunk belongs to F2, the Nero color rule of the chunk applies to the current chunk and an upcoming chunk only when it starts with a symbol from F2; Rule 3: The Nero color rule applies to only the current chunk if the chunk starts with a symbol from F3.
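The two NL rules can be illustrated with a minimal sketch. It encodes the three Nero color sequences exactly as stated above and repeats them for chunks longer than four symbols. The function names are hypothetical, and the carry-over check assumes that the "upcoming chunk" in Rule 2 is the immediately following chunk, which the text does not state explicitly.

```python
from itertools import cycle, islice

# Color sequences as stated in the text, keyed by the group of the first Nero symbol.
NERO_COLOR_RULE = {
    "N1": ["G", "R", "Y", "B"],
    "N2": ["B", "Y", "R", "G"],
    "N3": ["R", "B", "G", "Y"],
}


def chunk_colors(first_nero_group, n_symbols):
    """Color of each symbol in a chunk; the 4-color sequence repeats for longer chunks."""
    base = NERO_COLOR_RULE[first_nero_group]
    return list(islice(cycle(base), n_symbols))


def carries_over(current_fero_group, next_fero_group):
    """Whether the current chunk's Nero color rule also applies to the next chunk.

    Assumption: 'upcoming chunk' in Rule 2 means the immediately following chunk.
    """
    if current_fero_group == "F1":   # Rule 1: current and next chunk
        return True
    if current_fero_group == "F2":   # Rule 2: next chunk only if it starts with F2
        return next_fero_group == "F2"
    return False                     # Rule 3: current chunk only


print(chunk_colors("N1", 4))     # ['G', 'R', 'Y', 'B']
print(chunk_colors("N2", 6))     # ['B', 'Y', 'R', 'G', 'B', 'Y']
print(carries_over("F2", "F1"))  # False
```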
The learning session for each domain took 2 days. The L2 and NL learning sessions were conducted 1 month apart, and the order of learning was counterbalanced across all the participants who each participated in both L2 and NL studies. Participants learned the stimulus material using a Microsoft PowerPoint slide show through which they could progress at their own speed by navigating freely through the slides. They were instructed to study the stimulus material until they became confident about their learned knowledge. They progressed to the next level if they scored >90% accuracy; otherwise, they repeated the learning and testing procedures until they met the criterion to move on to the next level. Participants could terminate the learning session and progress to the fMRI session after they achieved >90% accuracy in the final learning tests (i.e., grammar learning in L2 and Fero sequence rule learning in NL). All the tests across the three levels consisted of 80 questions with four multiple-choice answers, programmed with MATLAB 7 (Mathworks Inc.). Responses were made by pressing number keys on a keyboard.
fMRI procedure.
Event-related fMRI scanning was administered the day after termination of the learning session. On the day of the fMRI experiment, participants had 30 min of training before the actual scanning to revise what they had learned previously. The three levels of hierarchies were constructed following the cascade model proposed by Koechlin and Summerfield (2007). This model describes the processing of a given stimulus in relation to cross-temporal contingencies (Fig. 1). The lowest level is represented by the contextual condition, which corresponds to the situation where participants can respond to a stimulus without referring to previous or upcoming stimuli. The medium level is represented by the episodic condition, where participants must consult a discrete preceding stimulus, presented shortly before, to respond to the current stimulus. The highest level is represented by the branching condition, which is similar to the episodic condition in that participants refer to the discrete preceding stimulus to select the response that is appropriate to the current stimulus. The distinctive feature of the branching condition compared with the episodic condition is that the ongoing stimulus is maintained in a pending state while another stimulus is being processed, and is reactivated upon completion of the intervening one. For example, the structure of branching control can be depicted as A–B–A′, where an ongoing task A is suspended by an intervening task B and then reactivated subsequently as task A′ after completion of task B (E. Koechlin, personal communication). The three conditions were randomly positioned in the sentences/sequences. For a baseline condition (BASE), participants simply pushed the response button when several Xs were presented on the screen before sentences/sequences started. Six phrases/chunks in a sentence/sequence were visually presented one by one. Each stimulus was presented for 3 s, jittered by 0, 0.5, 1, 1.5, or 2 s.
Participants were asked to judge the grammaticality of each phrase in L2 and the Fero sequence rule of each chunk in NL via button press (index finger—correct, middle finger—incorrect) within the 3 s that the stimulus was displayed on screen. Across the domains, experiments consisted of two sessions each comprising 52 sentences in L2 or 52 sequences in NL. Sentences/sequences consisted of six phrases in L2 and six chunks in NL having two contextual conditions, one episodic condition and one branching condition, which yielded 104 trials in the baseline, episodic, and branching conditions and 208 trials in the contextual condition. Mean sentence/sequence asynchrony was 28 s, and one session lasted for 1456 s, resulting in ∼50 min for two sessions. Stimuli were projected onto the back of a screen via an LCD projector. Participants viewed the images on the screen above their heads through a mirror attached to the head coil.
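The trial counts and session timing reported above follow from simple arithmetic, sketched here as a consistency check. The variable names are illustrative; the numbers are those given in the text, with one baseline trial per sentence/sequence inferred from the reported 104 baseline trials.

```python
SESSIONS = 2
ITEMS_PER_SESSION = 52      # sentences (L2) or sequences (NL) per session
MEAN_ASYNCHRONY_S = 28      # mean sentence/sequence asynchrony in seconds

# Trials of each condition per sentence/sequence.
PER_ITEM = {"baseline": 1, "contextual": 2, "episodic": 1, "branching": 1}

n_items = SESSIONS * ITEMS_PER_SESSION
trials = {cond: n * n_items for cond, n in PER_ITEM.items()}

session_s = ITEMS_PER_SESSION * MEAN_ASYNCHRONY_S
total_min = SESSIONS * session_s / 60

print(trials)                # {'baseline': 104, 'contextual': 208, 'episodic': 104, 'branching': 104}
print(session_s)             # 1456
print(round(total_min, 1))   # 48.5
```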
fMRI data acquisition.
Functional images were acquired on a human whole-body 7T MRI scanner (Magnetom 7T, Siemens Healthcare Sector) with a 24-channel NOVA head coil (NOVA Medical). 7T fMRI offers the advantage of high signal-to-noise ratio subsequently leading to a better contrast-to-noise ratio. This in turn produces the beneficial effect of higher sensitivity of fMRI for depicting the detailed structure of subcortical areas (Olman and Yacoub, 2011; Heidemann et al., 2012b). High-resolution 3D anatomical T1-weighted scans for the whole brain were acquired with MP2RAGE (Marques et al., 2010) using the following parameter set: TR = 5 s, TE = 2.45 ms, TI1 = 900 ms, TI2 = 2750 ms, flip angle1 = 5°, flip angle2 = 3°, isotropic voxel size = 0.7 mm, GRAPPA acceleration (iPAT = 2). For the functional scans, the T1-weighted images were used to position the slice package, and, accordingly, 49 axial T2*-weighted gradient-echo echo-planar images (GE-EPIs) were acquired with the following parameters: TR = 2 s, TE = 18 ms, flip angle = 80°, isotropic voxel size = 1.5 mm, GRAPPA acceleration (iPAT = 3), no gap. A field map was acquired to correct the images for distortions caused by the static field inhomogeneity. Additionally, 3D T1-weighted structural scans were collected from a 3T MRI scanner (Siemens Trio) on a different day (MP-RAGE sequence, nonselective inversion pulse, TI = 650 ms, TR = 1.3 s, TE = 3.93 ms, flip angle = 10°, bandwidth = 67 kHz/pixel, matrix = 256 × 240, 128 sagittal slices, spatial resolution = 1 × 1 × 1.5 mm³, 2 acquisitions, reconstructed to an isotropic voxel size of 1 mm).
Diffusion MRI data acquisition.
We acquired diffusion-weighted MRI (dMRI) data from 9 healthy young participants (22.5 ± 3 years) randomly selected from the group of participants of the fMRI experiment. Approximately 1 month elapsed between fMRI and dMRI sessions. Imaging was performed using the same 7T MR hardware as for the fMRI acquisition. With particular advantage for diffusion imaging, the scanner was equipped with a high-performance gradient system achieving a maximum gradient amplitude of 70 mT/m and a maximum slew rate of 200 T/m/s (SC72; Siemens Healthcare). High-angular-resolution diffusion-weighted images were acquired with a single-refocused spin-echo EPI sequence (Heidemann et al., 2012a), which allowed high-resolution, high-quality acquisition with an isotropic voxel size of 1 mm, and full brain coverage (imaging parameters: TE = 67 ms, acquisition matrix = 204 × 204, 100 axial slices, 1 mm isotropic resolution, no gap, TR = 11.3 s, GRAPPA acceleration iPAT = 3, bandwidth 1066 Hz/pixel, 6/8 partial Fourier). Phase-encoding was chosen in the anterior–posterior direction. The dMRIs with a b-value of 1000 s/mm² were acquired along 60 diffusion-encoding gradient directions distributed isotropically on one hemisphere. Additionally, seven reference images with minimal diffusion weighting (b-value = 50 s/mm²) were acquired at the beginning of the sequence and after each block of 10 dMRIs. They were used as reference images with high signal intensity for offline motion correction. Images with a b-value of 50 (instead of b = 0) were chosen to partially suppress the signal from the cerebrospinal fluid and reduce partial volume effects. The seven diffusion directions were isotropically distributed and averaged in the preprocessing step to exclude a directional bias. The sequence lasted 13.5 min and was repeated 4 times to optimize the image quality. Additionally, field maps with an isotropic resolution of 2 mm and a high-resolution anatomical T1-weighted image were acquired.
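The interleaving of reference and diffusion-weighted images described above can be written out as a short sketch. It only reproduces the ordering stated in the text (one reference image at the start, then one after each block of 10 diffusion-weighted images); the labels and variable names are illustrative.

```python
N_DIRECTIONS = 60   # diffusion-encoding directions at b = 1000 s/mm^2
BLOCK = 10          # a reference image follows each block of 10 dMRIs

schedule = ["b50"]  # one reference image at the beginning of the sequence
for i in range(N_DIRECTIONS):
    schedule.append("b1000")
    if (i + 1) % BLOCK == 0:
        schedule.append("b50")

n_ref = schedule.count("b50")
print(n_ref, len(schedule))  # 7 67
```

The resulting 7 reference images and 67 volumes per acquisition agree with the counts used in the motion-correction step of the dMRI analysis.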
Behavioral data analysis.
In the learning session, mean percentage accuracy in the final learning test (Korean grammar test in L2 and Fero sequence rule test in NL) was calculated across the domains. In the fMRI session, mean percentage accuracy and response time (RT) were calculated across the domains with a two-way within-subject ANOVA with factors Condition (contextual, episodic, and branching) and Domain (L2, NL).
fMRI data analysis.
Analysis and visualization were performed using SPM8 software (http://www.fil.ion.ucl.ac.uk/spm/). The first five functional volumes were excluded to allow for magnetic saturation effects, leading to a total of 732 volumes per scanning session. The fMRI images were slice-time corrected, realigned to the first image, and corrected for geometric distortions using the individual field map. For coregistration, high-resolution T1-weighted images from 7T MRI and from 3T MRI were skull-stripped using CBS Tools (http://www.cbs.mpg.de/institute/software/cbs-hrt; Lucas et al., 2010; Bazin et al., 2012; Landman et al., 2013), and the stripped 7T structural image was coregistered onto the stripped 3T structural image and then once more onto the SPM8 T1 template image (Montreal Neurological Institute). These coregistered 7T structural images were obtained to confirm correct registration between the two experiment sessions and the absence of spatial distortion, which is known to increase with higher magnetic field strength (Dammann et al., 2011). The fMRI images were normalized to the standard SPM8 T1 template image and smoothed with an isotropic 3 mm full-width half-maximum (FWHM) Gaussian kernel.
Whole-brain analysis.
Participants' hemodynamic responses were estimated based on the general linear model of SPM8 (http://www.fil.ion.ucl.ac.uk/spm/) for the stimulus duration from the onset of each condition. Six regressors were created: BASE, contextual condition, preceding/subsequent episodic conditions (e.g., ➃/➄ in Fig. 1A and ➁/➂ in Fig. 1B), and preceding/subsequent branching conditions (e.g., ➀/➅ in Fig. 1A,B). The preceding/subsequent stimuli were modeled separately because dissociation between the episodic and branching conditions became obvious when the subsequent stimulus was processed, whereby only the subsequent regressors were considered for the episodic- and branching-specific contrasts. Additionally, motion parameters and incorrect responses were incorporated into the model as nuisance regressors. Overall patterns of activations were first obtained by a fixed-effect analysis on data pooled over all participants. Condition-specific effects involved creating contrast images of each condition (contextual, episodic, and branching) with a comparison to the baseline condition (BASE) for each participant. These contrast images were then entered into a second-level random effect analysis. Based on the hypothesis, we looked for activations that showed gradual increase as the level of hierarchy became higher in L2 and NL separately, leading to the contrasts of each condition: [contextual > BASE] for the contextual condition, [episodic > contextual] for the episodic condition, [branching > (episodic ∪ contextual)] for the branching condition. Contrasts were initially thresholded at p < 0.001 uncorrected, and only activations that survived p < 0.05 FDR (false discovery rate) at cluster level were reported. 
For percentage BOLD (blood oxygenation level-dependent) signal change, we extracted β values of each condition within a sphere-shaped region (radius 2 mm) centered upon the peak voxel from each participant using the MarsBaR software (http://marsbar.sourceforge.net/) and averaged them over participants.
Volume of interest analysis.
Given the hypothesis about the mirrored pattern of activations in the cortical area (PFC) as well as the subcortical area along with the dorsolateral PFC loop, a volume of interest (VOI) analysis over the CN and the thalamus was performed separately. Originally, the dorsolateral PFC loop consisted of the PFC, the CN, the pallidum/substantia nigra, and the thalamus (Alexander et al., 1986). However, we excluded the pallidum/substantia nigra and focused on the CN and the thalamus as VOI. The reason for selecting the CN and the thalamus as VOI was that the detectability of BOLD signal is known to be maximized in the CN and the thalamus. BOLD fMRI using GE-EPIs measures T2*-weighted signals whose detectability is affected by iron deposition in the tissues; the more the iron is accumulated, the less the T2*-weighted signal is detected. In particular, a previous study about quantitative MR imaging of iron concentration in the brain demonstrated that the amount of iron deposition within the basal ganglia was highest in the globus pallidus (GP), moderate in the putamen, and lowest in the CN and the thalamus (Langkammer et al., 2010), which limits the detectability of functional activations in the GP and the putamen. For the VOI analysis, we set up the same contrasts as in the whole-brain analysis. The statistical inferences for activation were drawn with the search volume confined to the bilateral CN and thalamus defined by automated masks from the Wake Forest University Pickatlas (http://fmri.wfubmc.edu/software/PickAtlas) at p < 0.001 uncorrected at voxel level and p < 0.05 FDR corrected for VOI at cluster level (Maldjian et al., 2003).
Singular value decomposition analysis.
To quantify the distribution of activation foci, individual xyz-coordinates from the MNI space were extracted from the peak activations in the episodic and branching conditions in L2/NL domains and were transformed into an optimal coordinate system using singular value decomposition (SVD; Watkins, 2004). SVD allows the transformation of a high-dimensional dataset into a lower-dimensional one while preserving most of the variance. In our study, we transformed the three-dimensional fMRI coordinates of the peak voxels in the two experimental conditions (episodic and branching) into a one-dimensional coordinate (along the dominant direction) so that we could estimate orientation of the plane in which the fMRI coordinates were stretched with the highest variance.
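As an illustration of this dimensionality reduction, the following sketch projects a set of 3D peak coordinates onto their direction of greatest variance using NumPy's SVD. It is not the analysis code used in the study: the function name is hypothetical, mean-centering before the decomposition is an assumption, and the example coordinates are invented for demonstration.

```python
import numpy as np


def project_on_dominant_axis(coords):
    """Project N x 3 coordinates onto the direction of greatest variance."""
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)   # assumption: center before SVD
    # Rows of vt are orthonormal directions ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]                               # dominant direction (unit vector)
    scores = centered @ axis                   # one-dimensional coordinates
    explained = s[0] ** 2 / np.sum(s ** 2)     # share of variance on that axis
    return scores, axis, explained


# Hypothetical peak coordinates in mm (not data from the study).
peaks = [[-48, 10, 20], [-46, 18, 14], [-44, 30, 4], [-42, 38, -2]]
scores, axis, explained = project_on_dominant_axis(peaks)
print(scores.shape, explained > 0.9)
```

The sign of the dominant axis returned by an SVD is arbitrary, so only the ordering and spacing of the one-dimensional scores are meaningful, not their sign.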
dMRI data analysis.
The dMRI data were corrected for participant motion using rigid-body transformation (Jenkinson et al., 2002) computed from the 7 reference images. The transformations were interpolated to the 67 volumes of each acquisition and applied as an initial motion-correction step to the data. After motion correction, small linear eddy-current distortions remained in the phase-encoding direction (anterior–posterior direction). They were corrected by linear registration of all dMRIs to the mean dMRI (b = 1000) with restriction of the optimization to the anterior–posterior direction. Additionally, geometrical distortions in the dMRIs caused by magnetic susceptibility artifacts were corrected using the field map. This map was converted into a voxel displacement map and scaled to the resolution of the dMRI. A final transformation was used to align the dMRIs with the anatomical images. To this end, the brain was segmented from T1-weighted structural scans, and then rotated to the standard coordinate system defined by the midsagittal plane and the anterior and posterior commissure. The distortion-corrected mean reference dMRI was registered on this structural image using a mutual information cost function, and the registration parameters were combined with the motion and eddy-current correction parameters as well as the voxel displacement map. All steps were combined into one transformation image for each acquisition and applied to the dMRIs after an initial two-stage hybrid image restoration (Lohmann et al., 2010). The gradient directions for each volume were corrected using the rotation parameters. In this way, the registered images were interpolated only once to the anatomical space and the 1 mm isotropic resolution was preserved with minimized image interpolation. Finally, the four corresponding acquisitions and the gradient directions were averaged to increase the signal-to-noise ratio (SNR).
Averaging several acquisitions was of particular importance in areas of lower signal intensity, such as the inferior frontal lobe, to compensate for the inhomogeneous sensitivity of the MR acquisition coil. For comparison, we estimated the SNR within a white-matter region in the parietal lobe (high signal) and within the inferior frontal lobe. The SNR was computed as the mean signal of two consecutive acquisitions divided by the SD of the difference image. In the parietal lobe it was 24 ± 1 (SD) for a single image with a low b-value and 25 ± 2 for the averaged b = 1000 image of one gradient direction in a typical participant (corrected for the 4 repetitions). In the inferior frontal lobe, the SNR was 15 ± 1 and 16 ± 2, respectively.
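The SNR estimate described above can be sketched as follows, taking the definition literally: the mean signal of two consecutive acquisitions within a region, divided by the SD of their difference image. The function name and the synthetic data are illustrative assumptions. Some conventions additionally divide the difference SD by √2; the text does not mention such a factor, so none is applied here.

```python
import numpy as np


def snr_two_acquisitions(img_a, img_b, mask):
    """Mean signal of two acquisitions divided by the SD of their difference, within a mask."""
    a = np.asarray(img_a, dtype=float)[mask]
    b = np.asarray(img_b, dtype=float)[mask]
    mean_signal = np.mean((a + b) / 2.0)
    noise_sd = np.std(a - b, ddof=1)   # SD of the difference image (no sqrt(2) factor)
    return mean_signal / noise_sd


# Synthetic example (not study data): constant signal of 100 with Gaussian noise of SD 4.
rng = np.random.default_rng(0)
shape = (20, 20, 25)
acq1 = 100 + rng.normal(0, 4, shape)
acq2 = 100 + rng.normal(0, 4, shape)
roi = np.ones(shape, dtype=bool)
snr = snr_two_acquisitions(acq1, acq2, roi)
print(round(snr, 1))
```

For this synthetic case the expected value is about 100 / (4·√2) ≈ 17.7, because the difference of two independent noisy images has √2 times the noise SD of a single image.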
To estimate the 3D fiber pathway between the cortical and subcortical fMRI activation maxima, as well as the strength of the different connections in the pathway, we performed probabilistic tractography using a crossing fiber model (up to 3 directions per voxel) using FSL (www.fmrib.ox.ac.uk/fsl; Behrens et al., 2007). This method allows a robust estimation of the white-matter fiber pathway between two masked brain areas and a precise localization of the connecting tract. The robust group activation maxima in the PFC, CN, and thalamus were morphed to the brain of each participant. For highest-precision tractography, the cortical locations were projected to the closest white-matter voxel, and we examined the locations of the subcortical seed points individually and corrected their locations in the CN and the thalamus. Probabilistic tractography was seeded with 5000 streamlines per voxel in a local neighborhood with a radius of 4 mm around the activation maximum (257 voxels). Differentiation of a probabilistic connection from the chance level is still an open issue (Morris et al., 2008). Based on the high SNR value in the data used in this study, we are confident that the structural pathways were robustly reconstructed. To detect only anatomically correct connections, we chose a conservative threshold of 100 computed pathways connecting both regions. This threshold was determined empirically, based on the data properties and the correspondence of the reconstructed pathway across participants and conditions. All connections were individually checked for consistency with the underlying high-resolution anatomy. Additionally, the median connection strengths and their variability were reported for all connections. Random connections (binarization threshold at 10% of the maximum visitation value) were removed from the final tractography image (also called visitation map), and the image was normalized to the template brain in MNI standard. 
The group overlap of the connection pathways was slightly smoothed (Gaussian filter with 1 mm FWHM) for visualization purposes.
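The size of the seed neighborhood can be checked with a short sketch that counts the voxels of a 1 mm isotropic grid lying within 4 mm of the activation maximum. The function name is hypothetical; the radius, voxel size, and number of streamlines per voxel are taken from the text.

```python
from itertools import product


def sphere_offsets(radius_vox):
    """Integer voxel offsets whose Euclidean distance from the center is <= radius_vox."""
    r = int(radius_vox)
    return [
        (x, y, z)
        for x, y, z in product(range(-r, r + 1), repeat=3)
        if x * x + y * y + z * z <= radius_vox ** 2
    ]


offsets = sphere_offsets(4)          # 4 mm radius on a 1 mm isotropic grid
streamlines = len(offsets) * 5000    # 5000 streamlines seeded per voxel
print(len(offsets))   # 257
print(streamlines)    # 1285000
```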
Results
Behavioral results
For the learning sessions, the mean percentage accuracy and the SEM for the final grammar tests were 89.89% (5.49) for L2 and 90.68% (3.28) for NL. An independent sample t test showed no significant differences between the groups (t(36) = −1.38, p = 0.176). For the fMRI sessions, the mean percentage accuracy and the mean RTs are provided in Figure 2. Within-subjects ANOVAs on accuracy and RTs were conducted with the factors Domain (L2 and NL) and Hierarchy (contextual, episodic, branching). In accuracy data, no main effect was observed in Domain (F(1,18) = 41.97, p = 0.32) and in Hierarchy (F(2,36) = 0.97, p = 0.38). The RTs showed a main effect in Hierarchy (F(2,36) = 341.474, p < 0.001) but not in Domain (F(1,18) = 0.001, p = 0.975). An interaction was found only in RT (F(2,36) = 6.86, p = 0.003) but not in accuracy (F(2,36) = 0.967, p = 0.39).
Functional MRI data results
Whole-brain analysis
The activations pertinent to the varying levels of cognitive hierarchies across the L2 and NL domains were superimposed in the lateral PFC with mean percentage BOLD signal changes (Fig. 3; see Table 1 for their coordinates). In L2, the lowest level of hierarchy (the contextual condition) showed the most posterior activation in the left precentral gyrus (BA 4; not displayed in the figure; see Table 1 for its xyz-coordinate). For the highest level of hierarchy (the branching condition), the activation was found in the most anterior region (the left middle orbital gyrus, BA 47). The activation in the episodic condition, relevant to the medium level of hierarchy, was observed in the posterior pars triangularis (BA 45 bordering BA 44), located between the other two areas.
By also applying the gradient-wise comparison to the NL domain, results similar to those observed for L2 were revealed. The anterior portion of the ventrolateral PFC (BA 45 bordering BA 47) was activated for the highest level of cognitive hierarchy (branching condition). The medium level (episodic condition) activated a more posterior area, namely, the left ventral precentral gyrus (BA 4). Additional right hemisphere activation was observed in the precentral gyrus in the branching condition (not displayed in the figure, see Table 1 for its xyz-coordinates). No significant activation was observed in the contextual condition in NL. The mean percentage BOLD signal changes represented significant differences between the episodic and branching conditions.
Volume of interest analysis
To test whether the gradient pattern of activations in the lateral PFC was replicated in the subcortical structure, we analyzed the VOI in the CN and the thalamus. As expected, the same gradient was depicted with the CN being activated in the central region (body of the CN) in the episodic condition, whereas the ventroanterior region (head of the CN) was activated in the branching condition across the two domains (Fig. 4; see Table 1 for their coordinates). No significant activation was found in the CN for the lowest level of hierarchy (the contextual condition) in either domain. The mean percentage BOLD signal changes also suggested that each of the activation foci represented the dissociation well in terms of neural activities generated from the different levels of hierarchies.
As in the PFC and the CN, various thalamic nuclei also showed a gradient pattern of activations not only in L2, but also in NL (Fig. 5). In the branching condition, the most anterior activation was found in the left ventral anterior nucleus (VA) and, interestingly, an additional posterior region (the left MD) was observed in this condition. In the episodic condition, activation was found in the more posterior region (MD). The activation related to the contextual condition and some activations in the right hemisphere were observed as well (not displayed in Fig. 5; see Table 1 for their coordinates). The mean percentage BOLD signal changes showed significant differences between the two conditions (episodic vs branching) only in the VA, but not in the MD. In the PFC, the CN, and the thalamus, the lowest level of hierarchy (contextual condition) did not display systematic activations across domains and was therefore excluded from further analyses.
Singular value decomposition analysis
A 2 × 2 ANOVA with the factors Hierarchy (episodic vs branching) and Domain (L2 vs NL) was performed over the transformed xyz-coordinates in the left hemisphere. Only the coordinates from the episodic and branching conditions were taken for SVD analysis, and those from the contextual condition were excluded because they were missing in NL in the PFC and in L2/NL in the CN. As expected, within the PFC, there was a main effect of Hierarchy (F(1,18) = 89.913, p < 0.001), but not a main effect of Domain (F(1,18) = 0.389, p = 0.541). An interaction was found between the two factors (F(1,18) = 24.21, p < 0.001). A post hoc test was conducted on the levels of Hierarchy. With Bonferroni correction, the comparison between the two conditions (episodic vs branching) showed significant differences in terms of their peak activation areas (p < 0.001). Crucially, the coordinates in the CN from the VOI analysis revealed the same results as in the PFC. We found a main effect of Hierarchy (F(1,18) = 135.38, p < 0.001) and an interaction (F(1,18) = 86.26, p < 0.001). No significant main effect was found for the factor Domain (F(1,18) = 6.65, p = 0.091). The post hoc tests with Bonferroni correction also showed that the episodic and branching conditions were significantly differentiated regarding their peak activation areas (p < 0.001). Finally, in the thalamus, where we considered only the VA from the branching condition and the MD from the episodic condition for analysis, we found a main effect of Hierarchy (F(1,18) = 25.55, p < 0.001) but no main effect of Domain (F(1,18) = 2.74, p = 0.115) and no interaction (F(1,18) = 2.62, p = 0.123). Post hoc tests with Bonferroni correction between the episodic and branching conditions also showed a significant difference between the peak activation areas (p < 0.001).
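As a rough illustration of how F values with degrees of freedom (1, n − 1) can arise for such a 2 × 2 design, the sketch below assumes a fully within-participant layout, which this section does not state explicitly. Under that assumption, each effect reduces to a one-sample t test on a per-participant contrast, and F equals t squared. The function name, the cell order, and the toy scores are hypothetical and are not the study's data; each score stands for one already-transformed coordinate value per participant and cell.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical cell order for each participant's row:
# (episodic_L2, episodic_NL, branching_L2, branching_NL)
CONTRASTS = {
    "Hierarchy": (-1, -1, 1, 1),     # branching minus episodic
    "Domain": (-1, 1, -1, 1),        # NL minus L2
    "Interaction": (1, -1, -1, 1),   # difference of differences
}


def within_anova_2x2(scores):
    """Return {effect: (F, (df1, df2))} for a 2 x 2 within-participant design.

    Assumption: every participant contributes one scalar score per cell.
    Each one-df effect is tested as a one-sample t test on the
    per-participant contrast, so F = t**2 with df = (1, n - 1).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two participants")
    results = {}
    for effect, weights in CONTRASTS.items():
        contrast = [sum(w * x for w, x in zip(weights, row)) for row in scores]
        sd = stdev(contrast)
        if sd == 0:
            raise ValueError(f"contrast for {effect} has zero variance")
        t = mean(contrast) / (sd / sqrt(n))
        results[effect] = (t * t, (1, n - 1))
    return results


# Toy scores for four hypothetical participants (not data from the study).
toy_scores = [(1, 2, 5, 7), (2, 2, 6, 6), (1, 3, 4, 8), (2, 1, 7, 5)]

for effect, (f_value, df) in within_anova_2x2(toy_scores).items():
    print(f"{effect}: F{df} = {f_value:.2f}")
```

With 19 participants this layout would give F(1,18) for each effect; whether the reported tests were computed this way is an assumption of the sketch.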
dMRI data results
High-resolution dMRI tractography at 7 T was able to determine the white-matter pathways between the activation maxima in the PFC and the respective areas in the CN and the thalamus, which were derived from the group analysis. The connection strength, computed as the relative number of connecting pathways, was above threshold for all the connections in all the participants. The connection between each pair of cortical and subcortical activations (i.e., PFC–CN and PFC–thalamus) showed parallel, mainly nonoverlapping pathways in the white matter.
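The thresholding step can be sketched as follows. This is a minimal illustration only: it assumes that "relative number" means the count of streamlines reaching the target divided by the total number of streamlines started from the seed, and the function names, counts, and threshold values are hypothetical rather than taken from the study.

```python
def connection_strength(n_connecting, n_total):
    """Relative number of connecting pathways (assumed: connecting / total seeded)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_connecting <= n_total:
        raise ValueError("n_connecting must lie between 0 and n_total")
    return n_connecting / n_total


def all_above_threshold(counts, n_total, threshold):
    """True if every connection's relative strength exceeds the threshold."""
    return all(
        connection_strength(n, n_total) > threshold for n in counts.values()
    )


# Hypothetical streamline counts for one participant (not data from the study).
example_counts = {"PFC-CN": 120, "PFC-thalamus": 45}
print(all_above_threshold(example_counts, n_total=1000, threshold=0.01))
```

Applying the same check to each participant in turn would correspond to the statement that all connections were above threshold in all participants.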
Pathway between the PFC and the CN
In the branching condition of L2 and NL, the pathway starting from the activation maxima in the PFC followed the fibers that are parallel to the anterior thalamic radiation (ATR) and curved sharply to reach the ventral head of the CN. The pathways for L2 and NL ran parallel, with NL being positioned dorsal to L2. In the episodic condition, the PFC and the body of the CN were interconnected through a lateral–medial pathway that crossed the superior longitudinal fascicle (SLF) and the internal capsule in both domains. The connections for L2 and NL showed parallel pathways until they reached the CN and then followed a pathway within the CN to reach their activation peaks (Fig. 6A).
Pathway between the PFC and the thalamus
In the branching condition, the PFC was interconnected with the anterior thalamic nucleus (i.e., VA) through the ATR in L2 and NL. The pathways for L2 and NL ran parallel, with NL being positioned dorsal to L2. The pathway for the episodic condition in L2 also followed the ATR between the PFC and the thalamus and aligned with the internal lamina to connect with the posterior thalamic nucleus (i.e., MD). In the episodic condition in NL, the pathway between the PFC and the MD ran in medial–lateral orientation, crossed the SLF, and connected with the thalamus at a location similar to the L2 episodic pathway (Fig. 6B).
Discussion
In the present study, the areas preferentially activated at varying levels of cognitive hierarchy formed an anterior–posterior gradient across the PFC, the CN, and the thalamus not only in L2, but also in NL. These functional specificities were further supported by parallel networks within the human cortico-striatal-thalamo-cortical system.
The PFC
The present high-resolution functional data confirmed that the anterior–posterior gradient within the PFC selectively contributed to different levels of hierarchical processing (Koechlin and Summerfield, 2007; Jeon and Friederici, 2013). To illustrate this, we compared the peaks of activations in L2 from the present study with those from Koechlin et al. (1999, 2003) and Badre and D'Esposito (2007) by mapping activations related to the hierarchical levels (Fig. 7). Notably, the cascade of activations in the present experiment mostly overlapped with those in the other experiments even though the cognitive hierarchies were ranked based on different frameworks across the studies. It should be noted that several theories have been suggested to explain the possible framework for generating different levels of cognitive hierarchies, including the temporal organization of behavior, abstract representational hierarchy, domain generality in working memory, and relational complexity, and one should be very cautious about drawing a final dissociation between these theories (for review, see Badre, 2008). Among the available theories, we suggest that the concept of cross-temporal contingencies (Koechlin and Summerfield, 2007) was most suitable for generating the levels of cognitive hierarchies in both L2 and NL in the present study.
The CN
Previous fMRI studies have provided evidence of the involvement of the CN in complicated processes such as ambiguity resolution (Ketteler et al., 2008), processing of complex sentences (Mestres-Missé et al., 2012), or word-stem completion tasks with many possible candidates (Desmond et al., 1998). In neuropsychological studies, patients with CN lesions were impaired in controlled processing in which an additional top-down process had to intervene (Copland et al., 2000). In line with these findings, the CN in the present study was activated only for the higher levels of cognitive hierarchy (branching and episodic) but not for the lower level (contextual) because the stimulus–response association for the episodic and branching conditions, unlike the contextual condition, requires a demanding process such as grasping the temporal structure of events involved in the tasks (Kouneiher et al., 2009).
More specifically, attempts have been made to dissociate the roles of subregions within the CN. The segmentation of the structure into head and body components has revealed a consistent observation that the head area is involved in more complex cognitive processing than the body area (Grahn et al., 2008). Patients with Huntington's disease having lesions near the head of the CN showed abnormalities in more demanding language tasks (Chenery et al., 2002). A recent fMRI study also showed more activations in the head of the CN in a language task with a high level of complexity and in the body of the CN in a relatively less complex task (Mestres-Missé et al., 2012). This functional specificity within the CN is also supported by recent meta-analytic work revealing parallel functional and structural connections between the PFC and the CN along the anterior-to-posterior axis (Desrochers and Badre, 2012; Robinson et al., 2012). All these previous findings and the CN results of the present study provide converging evidence that subregions of the CN, as part of a cortico-striatal-thalamo-cortical network, subserve processing at different levels of cognitive hierarchies.
The thalamus
A similar anterior–posterior gradient was observed with VA being activated only in the branching condition, whereas MD was activated in the branching and episodic conditions across the domains. These distinct thalamic nuclei activations support the early proposal of a “cognitive” prefrontal-subcortical loop as formulated by Alexander et al. (1986) by demonstrating the function of the cortical areas with which the nuclei are connected. A characteristic feature of anterior thalamic neurons is that they fire rhythmically with the so-called “theta rhythm” (Buzsáki, 2002), which increases in power during mnemonic functions (Kirk and Mackay, 2003). In line with this, the activation in VA in the current study can be explained by the fact that participants engaged in memory processing mostly in the branching condition, for which the information must be temporarily stored and reactivated later when its corresponding stimulus is encountered. The VA is also recruited for higher-level cognitive tasks in association with other thalamic nuclei such as the MD, ventral lateral nucleus, or intralaminar nucleus (Zikopoulos and Barbas, 2007), which fits well with our data showing the coactivation of VA and MD.
The thalamus contains several nuclei that are highly specific in their connections with distinct subregions of the cerebral cortex. In particular, the MD, known to have interconnections predominantly with the PFC, is recognized as an intermediary relay station mediating cognitive functions in association with other structures such as the VA, internal medullary lamina, and mamillo-thalamic tract (Masterman and Cummings, 1997; Van der Werf et al., 2003; Negyessy and Goldman-Rakic, 2005; Izquierdo and Murray, 2010; Klein et al., 2010). Therefore, the involvement of the MD across all the conditions in the present study can be explained by its role as a primary relay nucleus of the thalamus, linking the basal ganglia and the PFC (McFarland and Haber, 2002).
The similar pattern of activations not only in L2, but also in NL, across the PFC, the CN, and the thalamus deserves some discussion in terms of the domain-generality of the present findings. Dominey and colleagues (Dominey et al., 2003, 2009) suggested the domain-general involvement of the cortico-striatal-thalamo-cortical loop in the processing of linguistic sequences and abstract nonlinguistic sequences. In their line of research, they have searched for a common link between nonlinguistic sequence processing and language processing in the interaction between cortical (BA 44/6, 45, 47, superior/middle temporal gyrus) and cortico-striatal networks (CN, substantia nigra pars reticulata, and thalamus; Dominey et al., 2009). A detailed discussion of their model is beyond the scope of the present study; however, our study may add empirical evidence that the PFC, CN, and thalamus actively accommodate all the levels of cognitive processes in both linguistic and nonlinguistic domains.
The network
Previous MR tractography and invasive tract-tracing studies showed parallels between topographic specificity in the cortico-thalamic connections such that the lateral PFC had a high probability of interconnection with more lateral MD and a dorsal PFC with a dorsal MD (Yeterian and Pandya, 1991; Klein et al., 2010; Kotz et al., 2013). Another connectivity study (Draganski et al., 2008) also observed a “rostro-caudal gradient” in the PFC, CN, and thalamus. In accord with these findings, the present study, with the help of high-resolution dMRI tractography, allowed us to separate specific and parallel pathways between the functionally differentiated and mirrored cortico-subcortical areas. More importantly, we provide, for the first time, precise insight into the 3D course of the parallel cortico-caudal and cortico-thalamic pathways, in particular by following these pathways even in areas of crossing fibers. In addition to the tracking within the white matter, the pathways could also be reconstructed within the subcortical gray matter areas to exactly reach the activation maxima.
Conclusion
In conclusion, this is the first in vivo human study combining fMRI and dMRI to demonstrate the mirroring of functional specificity depending on the cognitive hierarchies within the cortico-subcortical loop substantiated by parallel networks. Future work should investigate functional connectivity among the spatially distributed activations to scrutinize a correlation in resting state time-series of the PFC, the CN, and the thalamus.
Footnotes
We thank Robert Trampel, Pierre-Louis Bazin, and Andreas Schaefer for help with the analysis; Elisabeth Wladimirow and Domenica Wilfling for data acquisition; Etienne Koechlin for helpful comments and suggestions; Jan Leppin, Julia Becker, Marianne Schell, and Merle von der Nahmer for assistance in the behavioral training; Kerstin Flake for graphics; and Elizabeth Kelly for proofreading.
The authors declare no competing financial interests.
- Correspondence should be addressed to Hyeon-Ae Jeon, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103, Leipzig, Germany. jeon{at}cbs.mpg.de