Abstract
Cognitive control relies on neural representations that are inherently high-dimensional and distributed across multiple subregions in the prefrontal cortex (PFC). Traditional approaches tackle prefrontal representation by reducing it into a unidimensional measure (univariate amplitude) or using it to distinguish a limited number of alternatives (pattern classification). In contrast, representational similarity analysis (RSA) enables flexibly formulating various hypotheses about informational contents underlying the neural codes, explicitly comparing hypotheses, and examining the representational alignment between brain regions. Here, we used a multifaceted paradigm wherein the difficulty of cognitive control was manipulated separately for five cognitive tasks. We used RSA to unveil representational contents, measure the representational alignment between regions, and quantify representational generality versus specificity. We found a graded transition in the lateral PFC: The dorsocaudal PFC was tuned to task difficulty (indexed by reaction times), preferentially connected with the parietal cortex, and representationally generalizable across domains. The ventrorostral PFC was tuned to the abstract structure of tasks, preferentially connected with the temporal cortex, and representationally specific. The middle PFC (interposed between the dorsocaudal and ventrorostral PFC) was tuned to individual task sets and ranked in the middle in terms of connectivity and generalizability. Furthermore, whether a region was dimensionally rich or sparse covaried with its functional profile: Low dimensionality (only gist) in the dorsocaudal PFC dovetailed with better generality, whereas high dimensionality (gist plus details) in the ventrorostral PFC corresponded with better ability to encode subtleties. Our findings, collectively, demonstrate how cognitive control is decomposed into distinct facets that transition steadily along prefrontal subregions.
- cognitive control
- connectivity
- dimensionality
- multiple-demand network
- prefrontal cortex
- semantic cognition
Significance Statement
Cognitive control is known to be a high-dimensional construct, implemented along the dorsocaudal–ventrorostral subregions of the PFC. However, it remains unclear how prefrontal representations could be dissected in a multivariate fashion to reveal (1) what information is encoded in each subregion, (2) whether information systematically transforms across contiguous PFC subregions as a gradient, and (3) how this transformation is affected by functional connectivity. Here, we shed light on these issues by using RSA to decode informational composition in the PFC while using participant-specific localizers to facilitate individually tailored precision. Our findings elucidate the functional organization of the PFC by revealing how a trade-off between dimensionality and generalizability unfolds in the PFC and highlight the strength of RSA in deciphering the coding of cognitive control.
Introduction
The prefrontal cortex (PFC) is an anatomically complex and functionally heterogeneous brain region. Anatomically, the PFC contains multiple subregions, each receiving input and sending output to a unique set of other areas (Thiebaut de Schotten et al., 2017; Jung et al., 2022). Perhaps as a result of varied connectomic fingerprints, the PFC is functionally manifold—some subregions are ubiquitously involved across diverse situations regardless of stimuli and operations (Duncan, 2010) while others exhibit proclivity for certain types of stimuli/tasks (Goldman-Rakic, 1996). For decades, scientists have endeavored to understand how PFC subregions are systematically organized to identify an organizational axis that best ascribes functions to structures. One of the prominent findings is that functional profiles gradually vary along the dorsocaudal–ventrorostral axis of the PFC (for review, see Badre and Nee, 2018; Constantinidis and Qi, 2018; Abdallah et al., 2022). On the one end of the axis, the dorsocaudal PFC preferentially encodes concrete/perceptible representations to prepare for action (e.g., mapping a visible stimulus location to a finger reaction). On the other end, it has been proposed that the ventrorostral PFC preferentially encodes abstract representations (entities not directly perceivable, such as semantics or abstract relationships among stimuli). Critically, regions situated between the dorsocaudal and ventrorostral PFC vary in a stepwise manner, with those midway between the extremes (e.g., the middle frontal gyrus) preferentially representing the current focus of attention or the goal of behavior (Woolgar et al., 2011). These middle frontal regions might integrate information from dorsocaudal and ventrorostral sources (Nee and D'Esposito, 2016).
While the dorsocaudal–ventrorostral axis has offered an anchoring framework for mapping out the relationship between functions and subregions, it is nevertheless incomplete. Below we discuss the insufficiency of existing research and how the present study was performed to address these problems. First, a lack of crosstalk between literatures: Traditionally, research on executive functions tends to employ highly abstract rules and stimuli (such as arbitrarily mapping visual shapes to motor outputs) and to treat semantic meaning as “contamination.” However, a substantial swath of the PFC is also involved in semantic tasks, especially during goal-directed retrieval/manipulation of semantics (Chiou et al., 2018; Gao et al., 2021). To fully understand the correspondence between PFC structures and functions, it is necessary to incorporate both semantic and nonsemantic components in a single experimental design. Second, the limitations of univariate analysis: As discussed by Freund et al. (2021a), the majority of neuroimaging studies on cognitive control have employed univariate contrasts, which inherently condense potentially multifaceted processes into a unidimensional measure (activity amplitude) and therefore lose information. In contrast to univariate analysis, representational similarity analysis (RSA) can decompose the informational contents distributed across a region's neuronal population by revealing the dimensions that define its representation. Using RSA, we can relate a region's functionality (domain-general vs domain-specific) to its dimensionality (limited vs multifaceted). Third, a lack of precision when defining regions of interest (ROIs): As discussed by Fedorenko (2021), the topography of task-induced activation can be idiosyncratic across individuals. As a result, an analysis relying on anatomical templates or group-level clusters loses sensitivity and can yield inaccurate estimates for an individual.
To combat this imprecision issue, it has been suggested that using a functional localizer to individually identify ROIs greatly improves accuracy (Shashidhara et al., 2020). In the present study, we tackled these issues by manipulating various semantic and nonsemantic variables to create a design conducive to conducting RSA; a localizer task was employed to precisely demarcate individual-specific prefrontal ROIs.
Our RSA decoding reveals an informational gradient in the PFC, transitioning from representing cognitive demand, via the individuality of task sets, to semantic meaning. Different subregions along the gradient preferentially exchange information with distinct regions outside the PFC and have different representational dimensionalities that dovetail with their functional properties (general vs specific; concrete/motoric vs abstract/conceptual). Together, these results reveal an organizational principle that specifies “how” and “where” information transforms within and beyond the PFC.
Materials and Methods
Participants
Twenty-five volunteers gave informed consent before the scanning session began. The sample consisted of 15 females and 10 males, with an average age of 28 years (SD = 6). All volunteers were native English speakers and right-handed. All participants completed a screening questionnaire about magnetic resonance imaging safety before the functional MRI experiment started. None reported any previous brain injury or neurological or psychiatric condition. This study was reviewed and approved by the Northwest Research Ethics Committee.
Design and stimuli
After the structural scan, volunteers completed 11 runs of echo-planar imaging in a single session. Run 1 was the localizer experiment, and its associated results have been previously published (Chiou et al., 2023). This localizer was designed to probe cognitive control for semantic and perceptual processes, and its individual-level contrast results were used to localize control-related ROIs for each participant. Runs 2–11 were the RSA decoding experiment. As the primary goal of our study was to compare the similarities and differences among prefrontal subareas, decoding analysis on the data of Runs 2–11 was primarily focused on the individual-specific prefrontal ROIs defined by the localizer contrasts.
Details about the localizer task are comprehensively described in a previous study (Chiou et al., 2023). Below we report the key information: We used a two-by-two factorial design in which different types of processing (semantic processing on lexical stimuli vs visuospatial processing on pictorial stimuli) and the extent of difficulty required to achieve a correct response (easy vs hard) were orthogonally manipulated. There were four items displayed in each trial (either four English words or four geometric patterns), and one of the four items served as an “oddball,” either semantically incongruous with other words (belonging to a separate semantic category from the rest) or perceptually incongruous with the others (having a distinct visual configuration). Difficulty level was manipulated by varying the closeness of semantic categories or the similarity of visual patterns. Participants had to indicate the oddball with a button response. The localizer experiment had a block design, with each of the four conditions having 12 blocks; each block consisted of three trials; each trial started with a fixation cross (0.1 s), followed by a quadruplet of stimuli (3.9 s). Together with interblock intervals, the localizer task had a total duration of 647.5 s. The experiment was programmed using E-Prime 2.0 (2013, Psychology Software Tools). When performing the task, participants responded by pressing one of the four designated buttons on an MR-compatible response pad using their right fingers. All text stimuli were in white color and displayed on a black background; the text was in Arial typeface, with 24 points of font size. Stimuli were displayed using an MRI-specialized LCD screen (1,920 × 1,080; NordicNeuroLab) and projected onto a coil-mounted device for viewing stimuli.
The RSA decoding experiment contained 10 runs of scanning (Runs 2–11). We employed a five-by-two factorial design that fully crossed different types of task processing (five tasks: semantic association, synonym decision, color knowledge, numerical comparison, and shape matching) with the difficulty with which to attain a correct answer (easy vs hard). The overall structure of the experiment is illustrated in Figure 1A. As the figure shows, in each trial, there was a reference item on the top (a word, a number, or a shape) and three options beneath. In different tasks, participants were asked to engage in different types of cognitive operations and choose an answer in relation to the reference. The specifics of task requirements were as follows: (1) For the semantic association task, participants evaluated the relationship between the reference noun word and three options and chose the option (target) semantically most associated with the reference. We manipulated the strength of association between the easy and hard conditions of this task such that the semantic relationship was obvious and straightforward in the easy condition (e.g., “salt” and “pepper”), but ambiguous and indirect in the hard condition (e.g., “salt” and “grain”). To ensure the adequacy of stimuli, eight additional volunteers (none took part in the MRI study) were recruited to rate semantic association between the reference and options on a five-point scale, and the data showed that, as intended, the targets were significantly more strongly associated with the references in the easy context (mean ± SD, 4.56 ± 0.27) than in the hard condition (3.39 ± 0.65); the difference was found at the individual level (by-item analysis: p < 10−9 for all raters) and the group level (by-subject analysis, p < 0.001), and the nontarget options (foils) were significantly less associated with the references than targets (ps < 0.001). 
We also controlled for word length (by letter count) and lexical frequency (using the corpus of Van Heuven et al., 2014) for all words used in the task so that “easy versus hard” did not differ on these linguistic variables. (2) For the synonym decision task, participants compared the conceptual similarity between the reference noun and the three options and chose the option most synonymous with the reference. We manipulated the ease of accessing lexicosemantics by using concrete nouns (tangible/perceptible concepts, such as “box” or “kettle”) in the easy condition and abstract nouns (e.g., “onus” or “deceit”) in the hard condition, based on the well-documented finding that lexical access to abstract words is cognitively more taxing than to concrete words (Hoffman et al., 2015). We adopted the concreteness ratings from Brysbaert et al. (2014) to ensure that the concrete words used in the easy condition were rated as more concrete than the abstract words in the hard condition (p < 10−25) while there was no difference in word length and lexical frequency (both ps > 0.45); we recruited eight additional participants to rate the synonymy of words and ensured that the targets and references were judged as more synonymous with each other compared with foils (ps < 0.001 for both concrete and abstract words). (3) For the color knowledge task, participants needed to retrieve knowledge about the canonical color of objects; the references were nouns associated with a typical color attribute (e.g., “banana” is typically yellow), and the task was to choose the option that shared a similar color with the reference.
The difficulty manipulation here concerned the relationship between the references and foils—in the hard condition, the foils were semantically highly related to the references, which made them highly distracting as participants must actively suppress a tendency to respond to those semantically related (yet task-irrelevant) foils and concentrate on identifying the semantically unrelated targets using the arbitrary task rule (i.e., pairing words using color similarity; for the precedent of this task design, see Badre et al., 2005). In the easy condition, in contrast, the targets and foils were all semantically unrelated to the references (despite the targets and references sharing similar colors), hence reducing the interference of foils. We collected rating data from eight additional volunteers to ascertain that the foils were significantly more related to the references in the hard than easy condition (ps < 0.0001) while the targets and references were minimally related despite sharing similar colors. Word length and lexical frequency were controlled to ensure these factors did not confound the key manipulation (both ps > 0.27). (4) For the numerical comparison task, volunteers compared the numerical distance between the reference number and three options and identified the option numerically nearest to the reference. The numbers were three-digit numerals, ranging from 200 to 999. In the easy condition, the target had an obviously closer distance to the reference compared with foils, whereas in the hard condition, the distance was not obvious, which necessitated careful comparison to identify the target. (5) For the shape matching task, volunteers viewed geometric shapes and were asked to compare their perceptual sizes. 
In the easy condition, it was easy to identify the target because the sizes of the foils obviously differed from the reference and target, whereas in the hard condition, careful visual inspection was necessary to select the target because all options’ sizes were perceptually comparable to the reference. For the three tasks in which lexical stimuli were used (i.e., semantic association, synonym decision, and color knowledge), we carefully selected 1,200 words as the references, targets, and foils, ensuring that each word appeared only once in the entire experiment; this prevented residual memory from previous trials from interfering across conditions/tasks. In all tasks, the targets were equally likely to appear in each of the three locations and thus equally likely to be responded to with any of the three buttons.
A, The design of the RSA decoding experiment contained 10 task conditions. Specifically, there were five tasks (1, semantic association; 2, synonym decision; 3, color knowledge; 4, numerical comparison; 5, shape matching), and for each task, we manipulated the level of difficulty to achieve a correct response (easy vs hard), yielding 10 conditions in total. B, Selection of voxels for defining individual-specific ROIs was based on the following procedure: First, functionally defined group-level clusters from Fedorenko et al. (2013) and Jackson (2021) were used to constrain the scope of selection and ensure consistency with prior literature. Second, based on the functional clusters’ alignment with gyri and sulci structures, the clusters were divided into four sections: frontal eye field (FEF; cyan), middle frontal gyrus (MFG; blue), inferior frontal sulcus (IFS; magenta), and inferior frontal gyrus (IFG; red). Third, using the results of the localizer experiment, for each individual participant, we selected the top 200 voxels, which were mostly contiguous and maximally responsive to the ROI-defining contrast within each of the four PFC divisions. Examples of results from one participant are illustrated as the orange patches in the inset boxes.
Figure 1-1
Each run of scanning consisted of 20 blocks of trials; prior to the onset of each block, an instruction was presented for 3 s to indicate the upcoming task (i.e., semantic association, synonym decision, color knowledge, numerical comparison, or shape matching). Each block consisted of five trials; each trial began with a fixation dot (0.4 s), followed by a quadruplet of stimuli (one reference and three options) displayed for 3.5 s, yielding a 19.5 s duration for each block of trials, and each block was succeeded by a 2 s interval before the onset of the next instruction. We counterbalanced the order in which the 10 conditions appeared across runs and participants so that each condition was equally likely to appear in any slot of the sequences and equally likely to be preceded and succeeded by any other condition, with each condition having 20 blocks of trials in total and each run lasting 489 s. The order of stimuli within each condition was randomized. Participants were instructed to respond as quickly and accurately as possible within a trial's time limit. All stimuli (words, numbers, and shapes) were presented in white against a black background. Devices and parameters for stimulus display and response collection were identical to those of the localizer experiment. It is worth noting that the fully counterbalanced factorial structure of our design ensured that all conditions were equally likely to appear in each run, with each condition having the same number of blocks and being evenly distributed across the timeslots of the 10 runs, which prevented confounds that an unbalanced design might introduce into RSA (for discussion, see Freund et al., 2021b).
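The trial and block arithmetic above can be reconstructed in a few lines (an illustrative sketch using only the durations stated in the text):

```python
# Reconstruct the block timing of the RSA decoding experiment from the
# durations stated in the text (illustrative arithmetic only).
FIXATION_S = 0.4        # fixation dot at trial onset
STIMULI_S = 3.5         # reference plus three options
TRIALS_PER_BLOCK = 5

trial_s = FIXATION_S + STIMULI_S          # 3.9 s per trial
block_s = TRIALS_PER_BLOCK * trial_s      # 19.5 s per block, as reported
```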
Regions of interest
A common practice in multivoxel pattern analysis (MVPA) is to investigate distributed patterns of activity within an ROI defined using anatomical templates (Woolgar et al., 2011) or atlas parcellations (Freund et al., 2021b). Although this method offers high consistency across individuals (i.e., exactly the same voxels are selected for all participants), it ignores the heterogeneity of brain activity across individuals. As demonstrated in recent reports (Shashidhara et al., 2020; Fedorenko, 2021), the loci of functional responses are somewhat idiosyncratic to each person; the conventional method of atlas parcellation provides a ballpark estimate but is not sensitive enough to capture subtle idiosyncrasies, which has been suggested to be particularly problematic for decoding prefrontal representations (Bhandari et al., 2018). To tackle this issue, when defining the ROIs in the present study, we employed the “group-constrained yet individual-specific” method advocated by Fedorenko (2021), which has been demonstrated to successfully capture individual variation while ensuring that all individual loci are situated within a predefined target realm (Fedorenko et al., 2013; Blank and Fedorenko, 2017; Mineroff et al., 2018). The steps for defining ROIs were as follows: (1) First, we specified the scope for selecting voxels using the functional masks from Fedorenko et al. (2013) and Jackson (2021). Fedorenko et al. (2013) focused on “domain-general” executive control and identified a group of prefrontal regions, mostly in the dorsolateral prefrontal cortex (DLPFC), that encompasses the frontal eye field (FEF) and middle frontal gyrus (MFG).
Jackson (2021) conducted a meta-analysis surveying the literature on “domain-specific” semantic control (executive operations specifically on semantic representations) and identified several regions in the ventrolateral prefrontal cortex (VLPFC) that encompassed the entirety of the inferior frontal gyrus (IFG) and the inferior frontal sulcus (IFS, a fissure just superior to the IFG). It is worth noting that the two studies we used to define the boundaries of voxel selection are highly representative of the literature, as they drew on a wide variety of paradigms (i.e., seven tasks probing domain-general control in 197 subjects and 87 fMRI studies on diverse forms of semantic control). The masks are openly downloadable from the original studies. (2) Next, using the macroanatomy of the AAL atlas (Tzourio-Mazoyer et al., 2002), the clusters of domain-general control (Fedorenko et al., 2013) and domain-specific control (Jackson, 2021) were parcellated into four divisions based on their alignment with gyral/sulcal structures (for illustration, see Fig. 1B): (1) The FEF parcel contains exclusively domain-general voxels and is most dorsocaudal; (2) the MFG parcel contains exclusively domain-general voxels and is ventral/rostral to the FEF; (3) the IFS is the conjunction zone between the domain-general and domain-specific clusters and is located ventral/rostral to the MFG; (4) the IFG contains exclusively domain-specific voxels and is most ventrorostral. Note that the FEF parcel covered the conventionally agreed-upon location of the FEF plus a nearby patch corresponding to Area i6-8 of the Connectome Atlas, but for simplicity, we call this whole region the FEF. (3) Finally, for each of the four prefrontal parcels and for every participant, the individual localizer result was used to identify the top 200 voxels showing the largest contrast estimates (see Fig. 1B inset boxes for examples).
Specifically, we used the localizer's main-effect contrast of “hard > easy” (aggregating across the semantic and visuospatial conditions) to identify the 200 voxels showing the largest response to the overall effect of difficulty. The selected voxels from each parcel were used to construct individual-specific ROIs for subsequent analysis. Note that the localizer data for ROI construction were entirely separate from the main experiment's data for multivoxel decoding. In Extended Data Figure 1-1, we report the individual results of ROI creation, using the selection of IFG voxels as an example, to explain why the “group-constrained/individual-specific” approach is able to maintain consistency at the group level while accommodating individual variation. It is worth stressing that the number of voxels was predefined before we conducted any analysis; the number (200 voxels) was based on prior studies showing that increasing the size of an ROI up to 200 voxels robustly improved decoding before performance reached a plateau as more voxels were added (Shashidhara et al., 2020). With this protocol for defining ROIs, we were able to accommodate the heterogeneous distribution of individual results while respecting prefrontal anatomy and ensuring consistency with prior literature. Moreover, given that an unequal number of voxels is a known confounding factor in multivariate decoding (Todd et al., 2013), our approach prevents this problem by ensuring an equal number of voxels in every ROI, thus creating a level playing field for comparisons between ROIs. With this procedure, we ensured consistency at multiple levels: First, task-based clusters and macroanatomical templates were used as constraints to ensure the loci/extent of “search spaces” were aligned across participants (“spatial consistency”). Second, localizer contrasts were used to select only those voxels tuned to cognitive control (“functional consistency”).
Third, using the clusters of previous large-scale studies to define the bounds of search spaces ensured “literature consistency” with prior research. In Figure 2, we report the heatmaps showing the proportion of convergence among the selected voxels across participants for each of the four prefrontal zones.
In the figure, the proportion of overlap between each participant's selected voxels is color-coded with the XYZ-coordinate of highest inter-individual consistency pinpointed, and the scope for voxel selection (the blue cluster) is rendered on a volumetric MNI template. The figure also shows the coordinate at which the highest proportion of overlap is located, as well as the range of min/max overlap (proportion ranging from 0 to 1) in each of the four zones. These maps of overlap reveal two important messages: (1) While there is individual variation in the precise locations of functional brain responses, there is still sufficient consistency (∼40%) in the approximate locations where reliable responses are observed across participants. (2) Outside the highly overlapped regions, there is a great deal of individual variation. This echoes the suggestions by Fedorenko (2021) that an individual-tailored and group-constrained approach (the method we adopt for the present study) is crucial for ensuring that the definition of an ROI is sensitive to individual variation while the scope of voxel selection is constrained by anatomical division and previous results in the literature.
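The two quantitative steps just described, per-participant selection of the top 200 voxels within a parcel and the across-participant overlap maps, can be sketched as follows (a simplified illustration with synthetic data; variable names are hypothetical, and the real analysis operated on registered NIfTI volumes):

```python
import numpy as np

def select_top_voxels(contrast_map, parcel_mask, n_voxels=200):
    """Boolean ROI mask of the n_voxels voxels with the largest
    'hard > easy' contrast values inside a prefrontal parcel."""
    vals = np.where(parcel_mask, contrast_map, -np.inf)    # ignore voxels outside the parcel
    top = np.argsort(vals, axis=None)[-n_voxels:]          # flat indices of the top-n voxels
    roi = np.zeros(contrast_map.shape, dtype=bool)
    roi[np.unravel_index(top, contrast_map.shape)] = True
    return roi

rng = np.random.default_rng(0)
parcel = np.zeros((10, 10, 10), dtype=bool)
parcel.flat[:500] = True                                   # a toy 500-voxel parcel

# One ROI per "participant", each from that participant's own contrast map
rois = np.stack([select_top_voxels(rng.standard_normal(parcel.shape), parcel)
                 for _ in range(25)])

# Proportion of participants whose ROI includes each voxel (the overlap heatmap)
overlap = rois.mean(axis=0)
peak = np.unravel_index(overlap.argmax(), overlap.shape)   # voxel of highest consistency
```

Because voxels outside the parcel are set to −inf before ranking, every selected voxel is guaranteed to fall within the group-constrained search space, mirroring the "group-constrained yet individual-specific" logic.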
Figure 1-2
For each individual, we additionally defined two ROIs in the posterior/superior parietal cortex (PPC) and the posterior middle temporal gyrus (pMTG) for the “representational alignment” analysis to investigate how the four prefrontal ROIs share representational structures with remote areas and whether there is any systematic change along the dorsocaudal–ventrorostral axis. Whereas the PPC is part of the dorsal attention network and heavily involved in visuospatial tasks (Yeo et al., 2011; Mengotti et al., 2020), the pMTG is part of the semantic control system and frequently involved when semantic operations become more difficult (Davey et al., 2016; Jackson, 2021; Chiou et al., 2023). Our previous report of the localizer data (Chiou et al., 2023) also showed that the PPC and pMTG are, respectively, the regional peaks for visuospatial control (MNI: −28, −58, 52) and semantic control (−62, −42, −8). Given that the PPC and pMTG have distinct functional profiles, we used different localizer contrasts to define the ROIs: (1) For the PPC, confined within the cortical mask of the superior parietal lobule (Harvard Oxford Atlas: Desikan et al., 2006), we selected the top 200 voxels showing largest responses to the localizer contrast of “visuospatial-hard > visuospatial-easy” for each participant to construct individually tailored ROIs. (2) For the pMTG, confined in the posterior/temporo-occipital divisions of the middle temporal gyrus, we individually selected 200 voxels showing the largest responses to the localizer contrast of “semantic-hard > semantic-easy.” Akin to the way we defined prefrontal ROIs, our approach accommodated individual variation while ensuring the equivalent number of voxels in each ROI, and all chosen voxels were near the group-level average and situated in a predefined target territory.
In Extended Data Figure 1-2, we report a univariate analysis of the localizer results, comparing the four prefrontal ROIs’ preferences for semantic versus visuospatial processing. Results corroborate the previous literature on prefrontal functions (Nee and D'Esposito, 2016)—the dorsolateral PFC preferentially responds to visuospatial processing, while the ventrolateral PFC preferentially responds to semantic processing. This justified our labeling (in subsequent figures of our study) of the FEF (most dorsocaudal) as “visual-sensitive,” the IFG (most ventrorostral) as “semantic-sensitive,” and the MFG and IFS (situated in the intermediate territory) as “domain-general” (owing to these two regions responding equally strongly to visuospatial and semantic processing).
fMRI acquisition and preprocessing
The MRI protocol and preprocessing parameters were identical to those reported in Chiou et al. (2023). A Siemens 3 Tesla PRISMA system was used to acquire all of the MRI data. Anatomical images were attained with a T1-weighted magnetization-prepared rapid gradient echo (MPRAGE) sequence [repetition time (TR), 2,250 ms; echo time (TE), 3.02 ms; inversion time, 900 ms; 230 Hz per pixel; flip angle, 9°; field of view (FOV), 256 × 256 × 192 mm; GRAPPA acceleration factor, 2]. Task-related data were obtained with a multiecho multiband (MEMB) blood oxygenation level-dependent (BOLD)-weighted echo-planar imaging (EPI) pulse sequence. This MEMB sequence has two strengths: it acquires four functional volumes per TR (multiecho, capturing signals that peak at early and late echo times and are often overlooked by conventional protocols) and records two slices simultaneously during the acquisition of each volume (multiband, expediting the acquisition rate). The parameters included the following: TR, 1,792 ms; TE1, 13 ms; TE2, 23.89 ms; TE3, 34.78 ms; TE4, 45.67 ms; flip angle, 75°; FOV, 192 × 192 mm; MB factor, 2; in-plane acceleration, 3. Each EPI volume consisted of 46 axial slices, acquired from the top and middle slices in descending order, with a voxel size of 3 × 3 × 3 mm and FOV of 240 × 240 × 138 mm. A series of 362 and 273 functional volumes per run were acquired for the localizer experiment and the RSA decoding experiment, respectively.
All raw DICOM data were converted to NIfTI format using dcm2niix. The T1 structural images were processed using the standard processing pipeline of the FSL package's “fsl_anat” function (Version 5.0.11; https://fsl.fmrib.ox.ac.uk/). This procedure involves the following six steps: (1) reorienting images to standard MNI space (“fslreorient2std”), (2) automatically cropping the image to remove the neck (“robustfov”), (3) bias-field correction to fix field inhomogeneity (“fast”), (4) registration to MNI space (“flirt” and “fnirt”), (5) brain extraction (using the “fnirt” nonlinear method), and (6) tissue-type segmentation to separate white/gray matter and other structures (“fast”). Each T1 image was individually inspected for accuracy after being normalized into MNI space. The functional EPI data were preprocessed using a combination of tools in FSL, AFNI (Version 18.3.03; https://afni.nimh.nih.gov/), and a specialized Python package for TE-dependent analysis (Kundu et al., 2012, 2013, 2017). Despiked (“3dDespike”), slice-time corrected (“3dTshift,” matched to the middle slice), and realigned (“3dvolreg”) images were submitted to the “tedana” toolbox, which took the time series data from all four TEs and decomposed them into BOLD signals and non-BOLD noise. Specifically, decomposition was based on whether a signal series depended on the four echo times: with the strength of multiple echo times, the algorithm was able to tell apart noise that fluctuated randomly and independently of the timing of the four TEs (e.g., scanner drift, cardiac/respiratory noise, and head motion) from signals that systematically varied with the timing of TEs (i.e., genuine BOLD signals). Data from the four echo times were optimally combined, weighted by the intensity of the T2* signal in each voxel, and separated from the TE-independent/non-BOLD noise (Kundu et al., 2017).
Finally, the optimally combined images were coregistered to the individual's T1 structural scan (using FSL's “flirt”), normalized to the MNI space (using FSL's “fnirt” warps and “flirt” transform), and smoothed with a 6 mm FWHM Gaussian kernel.
Multivoxel pattern analysis (RSA and classification)
Prior to multivariate decoding, we used SPM12 to construct general linear models (GLM) to fit the data of the localizer experiment and the RSA experiment, convolving each participant's design matrix with a canonical hemodynamic response function. For the localizer experiment, four task regressors were created in the GLM (semantic-easy, semantic-hard, visuospatial-easy, visuospatial-hard), which enabled the relevant contrasts for defining ROIs. For the RSA experiment, 10 task regressors were created to enable constructing a 10 × 10 representational distance matrix in RSA (association-easy, association-hard, synonym-easy, synonym-hard, color-easy, color-hard, number-easy, number-hard, shape-easy, shape-hard). Based on the established methods (Kriegeskorte et al., 2008; Nili et al., 2014), for the off-diagonal elements of the matrix, we calculated the neural similarity between each pair of experimental conditions as the Pearson’s correlation of the two conditions’ vectorized patterns of neural activities; the similarity results were converted into a distance measure (1 – Pearson's r) to represent how distant the neural patterns are from one another. In addition to constructing matrices using the neural data, we also constructed various theoretical matrices based on different assumptions about the relationship between task conditions and used them to account for the neural results. Details of the theoretical models are presented later in the Results section. Based on the standard practice of RSA (Nili et al., 2014), we used a rank-based correlation index Kendall's τ (τ-a) to measure the similarity between different matrices. We also computed Spearman's ρ and found that it yielded entirely consistent results with τ-a; however, we chose to report τ-a as Nili et al. (2014) recommend that, compared with Spearman's ρ, τ-a offers a more conservative and detailed prediction.
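As a concrete illustration of the pipeline just described, the following sketch builds a neural representational distance matrix (1 − Pearson's r) and compares its off-diagonal elements with a model matrix using Kendall's τ-a. The condition patterns and the model matrix here are simulated placeholders, not the study's data; τ-a is implemented directly because common library routines default to the tie-corrected τ-b.

```python
import numpy as np

# Simulated condition patterns: 10 conditions x 200 voxels (a stand-in for
# the GLM beta estimates; one such matrix per participant and per ROI).
rng = np.random.default_rng(0)
patterns = rng.normal(size=(10, 200))

# Neural RDM: 1 - Pearson's r for every pair of condition patterns.
neural_rdm = 1.0 - np.corrcoef(patterns)

def kendall_tau_a(x, y):
    """Kendall's tau-a (no tie correction), as recommended by Nili et al. (2014)."""
    n = len(x)
    concordance = sum(np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
                      for i in range(n) for j in range(i + 1, n))
    return concordance / (n * (n - 1) / 2)

# Compare off-diagonal elements of the neural RDM with a (placeholder) model RDM.
iu = np.triu_indices(10, k=1)
model_rdm = rng.random((10, 10))
model_rdm = (model_rdm + model_rdm.T) / 2
tau = kendall_tau_a(neural_rdm[iu], model_rdm[iu])
```

In practice one such τ-a value would be computed per participant, per ROI, and per theoretical model before group-level statistics.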
While simple correlation approaches (e.g., τ-a or Spearman's ρ) have been widely used to evaluate how well a hypothesis explains a neural representation, more statistical control is needed to assess the unique variance explained by a hypothesis under consideration while the explanatory influences from all other sources are removed. This concern is especially crucial in one of our analyses in which we compared the prefrontal subregions to investigate whether their neural coding was preferentially loaded with certain types of information. In this analysis, the neural matrices served as the candidates for adjudication, and the theoretical matrices were the targets of prediction—e.g., among the four regions, which carried the most information about reaction times, and which the least? In this regard, an important consideration was that we needed to control for the confounding influences from all other subregions when gauging the unique variance that each brain region could explain. Given that our four ROIs are spatially contiguous, it is critical to statistically rule out the contamination from other regions to obtain a clean estimate for each region. To this end, we used two approaches that have been widely adopted to verify the relationship between a target matrix and multiple candidate matrices—partial correlation (Fernandino et al., 2022) and multiple regression (Freund et al., 2021b). While both methods are commonly used to assess the unique relationship between two matrices while controlling for the effects of other variables, they differ in mathematical definition and degree of strictness (i.e., partial correlation is a “purer” index in which covariance with confounding variables is removed from both of the matrices under assessment; in contrast, multiple regression removes covariance from only one of the two matrices, typically the predictor/candidate matrix). By employing both approaches, we also aimed to check whether they produced consistent conclusions.
Specifically, (1) we computed partial correlation (Spearman's ρ), which focused on the unique relationship between the neural and theoretical matrices (e.g., the similarity between the IFG and a hypothesis matrix) while excluding the covariance they shared with all other prefrontal areas (i.e., the FEF, MFG, and IFS). (2) We computed multiple regression, following the example of Freund et al. (2021a). In this analysis, the off-diagonal elements of the four candidate matrices (i.e., the neural matrices of four prefrontal ROIs) simultaneously entered the regression as the predictors. The neural data were vectorized, rank-transformed, and normalized into z-scores, while one of the theoretical matrices served as the dependent/predicted variable. With this approach, the regression coefficient β of each regressor represented the explanatory power unique to each regressor while the effects from all other regressors were excluded. It is worth noting that RSA was performed at the individual level to ascertain that fine-grained, voxel-wise representations were preserved for every person, and group-level statistics (e.g., comparing the sizes of partial correlations or regression coefficients) were performed only after representational distance matrices were computed for each participant and for each ROI.
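The two statistical controls just described can be sketched as follows. This is a minimal illustration, assuming the inputs are already-vectorized off-diagonal RDM elements; variable names and the exact rank-transform conventions are assumptions, not the study's code.

```python
import numpy as np
from scipy.stats import rankdata

def _zscore(v):
    return (v - v.mean()) / v.std()

def _residualize(v, controls):
    """Regress the (ranked) control vectors out of the (ranked) vector v."""
    R = np.column_stack([np.ones(len(v))] + [rankdata(c) for c in controls])
    beta, *_ = np.linalg.lstsq(R, rankdata(v), rcond=None)
    return rankdata(v) - R @ beta

def partial_spearman(x, y, controls):
    """Spearman correlation of x and y with the controls removed from BOTH
    variables -- the 'purer' index described above."""
    return np.corrcoef(_residualize(x, controls), _residualize(y, controls))[0, 1]

def regression_betas(target, candidates):
    """Simultaneous regression: each candidate's unique beta with the other
    candidates held constant; candidates are rank-transformed and z-scored."""
    X = np.column_stack([np.ones(len(target))] +
                        [_zscore(rankdata(c)) for c in candidates])
    beta, *_ = np.linalg.lstsq(X, _zscore(rankdata(target)), rcond=None)
    return beta[1:]
```

For example, the unique IFG–hypothesis relationship would be something like `partial_spearman(ifg_vec, model_vec, [fef_vec, mfg_vec, ifs_vec])`, with one value per participant entering the group-level comparison.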
Finally, in addition to relating neural matrices to theoretically constructed matrices (which is a more common practice in RSA), we also performed “representational alignment” analyses, which quantified the extent to which brain regions’ representational structures aligned with one another (cf. Ito and Murray, 2023). When computing representational alignment between two brain regions, we extracted the off-diagonal elements of the neural matrices and calculated the rank-based correlation between regions. A high correlation means the two regions were representationally similar, which has been suggested to be driven by intensified informational exchange and tighter coupling of time series between the two regions (see Anzellotti and Coutanche, 2018, for discussion about “representational connectivity”). As a complementary analysis of representational alignment, we also extracted each region's fluctuation of BOLD time series and computed the correlation between brain regions as per the conventional measure of covariation-based functional connectivity.
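A minimal sketch of these two connectivity measures, under the assumption that each region is summarized by its RDM (for alignment) and its mean BOLD time series (for conventional connectivity):

```python
import numpy as np
from scipy.stats import spearmanr

def representational_alignment(rdm_a, rdm_b):
    """Rank-based correlation of two regions' off-diagonal RDM elements;
    higher values indicate more similar representational geometry."""
    iu = np.triu_indices(rdm_a.shape[0], k=1)
    rho, _ = spearmanr(rdm_a[iu], rdm_b[iu])
    return rho

def functional_connectivity(ts_a, ts_b):
    """Conventional covariation-based connectivity: Pearson correlation
    between two regions' BOLD time series."""
    return np.corrcoef(ts_a, ts_b)[0, 1]
```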
Following the approach of Ito and Murray (2023), we computed the representational dimensionality for each participant and each brain region. It was calculated as the “participation ratio” that quantified the degree to which all of the dimensions contributed evenly to a region's neural code. If a region has low dimensionality, the first few dimensions can explain a large proportion of the variance while the remaining dimensions make only a negligible contribution. That is, the first few dimensions are enough to capture this region's overall representation, thus a low participation ratio (only a small proportion of the dimensions have explanatory power). In contrast, if a region has high dimensionality, most of the dimensions are indispensable, and each of them contributes to defining different aspects of the representation. Using the method of Ito and Murray (2023), the participation ratio was calculated as follows: PR = (Σᵢλᵢ)² / Σᵢλᵢ², where λᵢ is the eigenvalue of (i.e., the variance explained by) the i-th principal component of the region's multivoxel activity pattern.
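The participation ratio can be computed directly from the eigenvalues of the pattern covariance. The sketch below uses simulated rank-1 ("gist only") and full-rank patterns to show the expected behavior; the shapes and simulation are illustrative assumptions.

```python
import numpy as np

def participation_ratio(patterns):
    """PR = (sum_i lambda_i)^2 / sum_i lambda_i^2, where lambda_i are the
    eigenvalues of the condition-by-condition covariance of the (mean-centered)
    conditions x voxels activity matrix."""
    centered = patterns - patterns.mean(axis=0)
    cov = centered @ centered.T / patterns.shape[1]
    lam = np.clip(np.linalg.eigvalsh(cov), 0.0, None)  # guard tiny negative eigenvalues
    return lam.sum() ** 2 / (lam ** 2).sum()

rng = np.random.default_rng(3)
low_dim = np.outer(rng.normal(size=10), rng.normal(size=200))   # rank-1 code: PR near 1
high_dim = rng.normal(size=(10, 200))                           # full-rank code: PR near 8-9
```

A region whose variance is concentrated on one dimension yields PR ≈ 1, whereas a region whose dimensions contribute evenly yields a PR approaching the number of dimensions.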
We also performed classification analyses on the multivoxel patterns as a complementary approach to RSA. Prior to the classification analyses, we remodeled the data of the RSA experiments in which each block of a task condition was modeled as an individual regressor so as to increase the number of samples for training the classifier in the k-fold validation procedure. We performed classification with The Decoding Toolbox (TDT; Hebart et al., 2015), which used a linear support vector machine (SVM) for classification with a “C = 1” cost parameter (Chang and Lin, 2011). For each classification, a leave-one-run-out 10-fold splitter was used whereby the algorithm learned to distinguish between relevant categories using the data from nine of the ten runs; its ability to correctly predict was tested using the unseen data from the remaining “held-out” run. This procedure was iterated over all possible combinations of runs used for training and testing. By partitioning the datasets based on different runs of scanning, there was no contamination of information leaking from the training sets to the testing sets. The accuracy scores were then averaged across folds to produce a mean accuracy score for further statistical analysis. This was done separately for each participant and for each region. Particularly, we used classification to evaluate whether the multivoxel pattern of brain regions had the ability to cross-classify between two disparate domains (e.g., training on the lexicosemantic domain and then testing on a visual–perceptual or numerical domain, and vice versa).
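The leave-one-run-out scheme can be sketched with scikit-learn's linear SVM (C = 1) standing in for the MATLAB-based TDT; the block-wise patterns below are simulated, and the run/block counts merely mirror the design described above.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

# Simulated block-wise patterns: 10 runs x 4 blocks (2 easy, 2 hard) x 50 voxels.
rng = np.random.default_rng(4)
n_runs, n_voxels = 10, 50
labels = np.tile([0, 0, 1, 1], n_runs)         # easy vs hard, per block
runs = np.repeat(np.arange(n_runs), 4)         # run membership of each block
X = rng.normal(size=(len(labels), n_voxels)) + np.where(labels[:, None] == 1, 0.8, -0.8)

# Leave-one-run-out: train on 9 runs, test on the held-out run, iterate over runs.
clf = SVC(kernel="linear", C=1.0)
scores = cross_val_score(clf, X, labels, cv=LeaveOneGroupOut(), groups=runs)
mean_accuracy = scores.mean()                  # fold-averaged accuracy per region
```

Because folds are partitioned by run, no information leaks from training to testing sets.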
Results
We begin with a roadmap of the analyses conducted: First, we qualitatively assessed the representational structure of each ROI to identify the resemblances and divergences between the four prefrontal regions and whether their differences varied in any systematic way along the dorsocaudal–ventrorostral axis. Second, correlation/regression analyses were performed to formally quantify the extent to which the four brain areas’ neural matrices related to the theoretical matrices. In particular, we focused on the comparison among brain regions to explore how the four prefrontal areas differed in their information loading. Comparison among hypotheses was also performed to assess how well each theory characterized the neural patterns. Third, we investigated the “representational alignment” and “functional connectivity” of the four ROIs to understand whether they preferentially communicate with different brain regions beyond the PFC (cf. representational alignment; Ito and Murray, 2023). Fourth, we employed machine-learning classification to evaluate how the four ROIs differed in their ability to (1) generalize information from one domain to another and (2) differentiate the intricacies of task conditions. We related the multivoxel classification data to the representational dimensionality of each region to understand whether the state of being high- or low-dimensional systematically varied with the ability to cross-classify or discern details (cf. Badre et al., 2021, for the links between dimensionality and generalizability).
Qualitative evaluation of the representational distance matrix
We began by visually inspecting the representational structure of the prefrontal areas’ neural matrices. As evident in Figure 3A, the most obvious information was the gradual emergence of the separation between semantic conditions and nonsemantic conditions along the dorsocaudal–ventrorostral axis. Specifically, in the most dorsolateral FEF, there was only mild clustering that differentiated the semantic conditions (i.e., the six conditions in which participants processed semantic meaning) from nonsemantic conditions (i.e., the four conditions where participants compared numerical sizes and matched visual shapes). However, the separation strengthened in the MFG, further manifested in the IFS, and finally culminated in two distinctly separated clusters in the IFG that contrasted the semantic cluster with the nonsemantic one. This bipartite split suggests that, while we manipulated multiple variables of difficulty (easy vs hard), stimuli (words, numbers, shapes), and five task sets, “semantics versus nonsemantics” was the most dominant factor that configured neural representations. This segregation was increasingly clear toward the ventrorostral PFC, hinting at a graded transformation of informational contents couched in the macroscale structure of the PFC.
A, The representational distance matrix of the four ROIs. B, Hierarchical clustering was used to analyze the relationship among the neural representations, and dendrograms were used to visualize the relationships. Cophenetic distance (arbitrary unit) was used to quantify the distance between data points. It is evident in the dendrograms that the distance between the semantic cluster (the green circle) and the nonsemantic cluster (the blue circle) was shortest in the FEF representation, gradually increased in the MFG and IFS, and was longest in the IFG. Also evident in the figure is that the interposed regions (the MFG and IFS) had more clear-cut representations that individuated the task sets (the dash-line ovals that enclosed each task) in both semantic and nonsemantic domains (the solid-line ovals enclosing each domain), particularly the IFS, which individuated all five tasks. In contrast, the FEF and IFG individuated the tasks only in their preferred domains (FEF, nonsemantic; IFG, semantic) while conflating the tasks in their nonpreferred domains.
We also employed hierarchical clustering to further investigate their representational proximity and visualized the results as dendrograms. As shown in Figure 3B, “semantic versus nonsemantic” was the factor driving the primary bifurcation of neural representations, splitting the six semantic conditions from the four nonsemantic conditions. While in all four prefrontal regions, this split was seen, closer inspection revealed that the four regions differed in their representational distances between semantic clusters and nonsemantic clusters. As Figure 3B shows, this distance (quantified using the cophenetic coefficient) was shortest in the FEF, gradually lengthened along the MFG and IFS, and was longest in the IFG. This corroborates the qualitative characterizations above and quantifies the degree of segregation. Furthermore, closer scrutiny of the dendrogram branches revealed that the four regions differed in the extent to which the five different tasks were individuated. Specifically, the most dorsocaudal FEF was able to distinguish the two nonsemantic tasks (number vs shape) while confusing the three semantic conditions (association vs synonym vs color knowledge).
In contrast, the ventrorostral IFG showed the reverse pattern: It mixed up the two nonsemantic tasks, while it was able to clearly discern the three semantic tasks. Crucially, the interposed regions, the MFG and IFS, possessed the ability to tell apart both semantic and nonsemantic tasks. Particularly, the IFS encoded each of the five tasks as an individual entity (i.e., the easy and hard conditions of the same task were correctly grouped together, while different tasks were independent of one another). Together, these qualitative assessments allowed us to intuitively grasp the overarching organization: (1) the separation of “semantic versus nonsemantic” steadily arose along the dorsocaudal–ventrorostral axis, and (2) regions located in the intermediate territory (the MFG and IFS) represented the five task sets as individual entities (i.e., for each task, its easy and hard levels were lumped together, due to their sharing the same rules/goals, while segregated from all other tasks). In contrast, regions located in the flanks (the FEF and IFG) could clearly discern tasks of their preferred domain (nonsemantic for FEF and semantic for IFG, respectively) while representing tasks in their nonpreferred domain in a fuzzy manner.
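The clustering analysis above can be sketched with SciPy's hierarchical-clustering tools. The patterns below are simulated with a built-in domain split (conditions 0–5 share a "semantic" component, 6–9 a "nonsemantic" one) purely to illustrate the dendrogram and cophenetic-distance computations.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(5)
base = rng.normal(size=(10, 50))
sem, nonsem = rng.normal(size=50), rng.normal(size=50)
base[:6] += 2.0 * sem          # semantic conditions share a component
base[6:] += 2.0 * nonsem       # nonsemantic conditions share another
rdm = 1.0 - np.corrcoef(base)  # neural RDM, as in the RSA pipeline

condensed = squareform(rdm, checks=False)       # off-diagonal distance vector
Z = linkage(condensed, method="average")        # hierarchical clustering (dendrogram)
coph_corr, coph_dists = cophenet(Z, condensed)  # cophenetic distances between points

clusters = fcluster(Z, t=2, criterion="maxclust")  # cut the tree into two clusters
```

Cutting the tree at two clusters recovers the primary bifurcation; the cophenetic distance between the two top-level clusters is the quantity compared across ROIs in Figure 3B.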
Comparing the four prefrontal regions with the theoretically constructed models
We took two approaches to RSA: The first compared the prefrontal subregions and aimed to clarify how they differed in representational content. Thus, the neural matrices were treated as candidates to test against a specific hypothetical matrix. The second treated different hypotheses as candidates to test against the neural matrix of a particular prefrontal region. The two approaches revealed distinct yet complementary information—the first focused on the neural space (the four prefrontal subregions) to compare their goodness of fit with a particular theory while the second focused on the task space (various relationships among task conditions) to adjudicate which best described the neural pattern.
Prior to conducting RSA, we constructed various hypothetical matrices about the commonality or differences among the task conditions. As illustrated in Figure 4A, four hypothetical matrices were constructed with each reflecting a unique assumption about the tasks: The “bipartite” matrix depicted the distinction between semantic and nonsemantic domains, stressing the binary segregation between domains while glossing over the differences within each domain. The “quintipartite” matrix (which means five-way) only represented the individuality of the five task sets, stressing the uniqueness of each task while ignoring the difference within a task (i.e., its easy and hard levels). The “complex structure” matrix represented all graded differences inside the task space—“easy versus hard” was nested within each task while the five task sets were nested within the bipartite split of “semantic versus nonsemantic,” yielding a graded structure. The “reaction time” matrix characterized the pairwise difference of reaction times (RT) between two conditions—this model was constructed using the behavioral data, taking the absolute value of two conditions’ difference in RTs while ignoring all other aspects of the experimental design. Given the fact that RT has been extensively used in behavioral psychophysics as an index of task difficulty (i.e., difficult tasks typically demand more cognitive effort and result in longer RT), a small pairwise difference in RT could be interpreted as the two conditions requiring comparable extent of cognitive effort whereas a large difference suggests the two conditions differed greatly in how much effort they taxed. Note that RT was the behavioral outcome of a mixture of various cognitive factors associated with task difficulty (e.g., heightened level of attention, the need to suppress a prepotent response, and more saccades to revisit options). 
We keep the interpretation of RT neutral as simply an index of cognitive demand without ascribing it to any particular process.
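Three of the theoretical matrices just described can be sketched as follows. The condition ordering (easy/hard nested within the five tasks, semantic tasks first) and the RT values are illustrative assumptions, not the study's data.

```python
import numpy as np

# Assumed condition order: assoc-easy/hard, syn-easy/hard, color-easy/hard,
# num-easy/hard, shape-easy/hard.
tasks = ["assoc", "assoc", "syn", "syn", "color", "color",
         "num", "num", "shape", "shape"]
domains = ["sem"] * 6 + ["nonsem"] * 4

# Bipartite: distance 0 within a domain, 1 across domains.
bipartite = np.array([[0.0 if a == b else 1.0 for b in domains] for a in domains])

# Quintipartite: distance 0 within a task set, 1 across task sets
# (ignoring easy vs hard within a task).
quintipartite = np.array([[0.0 if a == b else 1.0 for b in tasks] for a in tasks])

# Reaction-time model: absolute pairwise difference of condition-mean RTs.
# These RT values are placeholders for illustration only.
rts = np.array([0.9, 1.3, 1.0, 1.5, 1.1, 1.6, 0.8, 1.2, 0.9, 1.4])
rt_model = np.abs(rts[:, None] - rts[None, :])
```

Each matrix is then vectorized (off-diagonal elements) and correlated with the neural RDMs as described above.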
A, The theoretical models set up as the reference matrices of RSA. B, For each model, correlations were computed to associate the four prefrontal regions’ neural matrices with the reference. Post hoc comparisons: *p < 0.05; ***p < 0.001. Error bars are standard error of the mean (SEM).
After setting up the four hypothetical matrices, we began comparing how much the prefrontal regions (as candidates) accorded with each hypothesis. As illustrated in Figure 4B, it is obvious that areas along the dorsocaudal–ventrorostral axis systematically differed in sensitivity to distinct information. Specifically, information about semantic versus nonsemantic (viz., the “bipartite” matrix) was preferentially encoded in the IFG, which steadily decreased dorsally along the axis, and was least encoded in the FEF, yielding robust differences between the regions (F(3,72) = 30.39, p < 0.001). Information about cognitive effort (the “reaction time” matrix) showed the opposite pattern—it was most encoded in the FEF, gradually decreased ventrally along the axis, and least encoded in the IFG (F(3,72) = 10.95, p = 0.003). Next, information about the individuality of the five task sets (i.e., the “quintipartite” matrix) was most encoded in the two intermediate regions (the MFG and the IFS) but dwindled in the dorsal FEF and the ventral IFG (F(3,72) = 23.07, p < 0.001). Finally, information about the “complex structure” of tasks (which considered the nested structure of difficulty levels, task sets, and semantic vs nonsemantic) was encoded by the IFG, IFS, and MFG to comparable extents, but significantly least in the FEF (F(3,72) = 33.32, p < 0.001). To summarize the results, we found that regions along the dorsocaudal–ventrorostral axis systematically varied in the information ingredients they represented: The dorsocaudal PFC was more sensitive to information about behavioral effort (as indexed by reaction time). The intermediate zone was sensitive to task sets, consistent with the role of MFG/IFS in sustaining task-relevant rules during a task. The ventrorostral PFC was more sensitive to the discrepancy of semantic versus nonsemantic, echoing the involvement of IFG in many psycholinguistic processes.
While the results revealed an informational transition in the PFC, control analyses were necessary to remove the covariance between regions (collinearity, especially between contiguous regions) when estimating the correlative strength unique to each area. Thus, we computed partial correlation (Fig. 5) and multiple regression (Fig. 6) to assess the unique relationship between each region and each hypothetical matrix; the correlation/regression coefficients were computed individually for each hypothesis, followed by statistical comparisons among the regions. We found that partial correlation and multiple regression both replicated the pattern reported above, even after the confounding/covariance factors were excluded. Specifically, in partial correlation, we found that the ventrorostral PFC contained more information about semantic versus nonsemantic, which gradually decreased dorsally (F(3,72) = 12.84, p = 0.002); the MFG carried more information about task sets, which declined in other areas (F(3,72) = 4.96, p = 0.03); the ventrorostral PFC (particularly the IFG) had more information about the “complex structure” matrix, which decreased dorsally (F(3,72) = 10.19, p = 0.004); the dorsocaudal PFC (particularly the FEF) possessed more information about behavioral effort, which reduced ventrally (F(3,72) = 13.78, p = 0.001).
A largely consistent pattern was observed in the data of regression analysis: the IFG's neural pattern best fitted the matrix of “semantic versus nonsemantic,” with the β coefficients gradually decreasing in dorsal regions (F(3,72) = 9.60, p = 0.005); the IFG pattern also best fitted the matrix of “complex structure,” with the β coefficients steadily reducing in dorsal areas (F(3,72) = 8.72, p = 0.007); the dorsocaudal PFC (particularly the FEF) best fitted the matrix of “reaction time,” with the β values gradually waning in ventral regions (F(3,72) = 14.29, p = 0.001); finally, we only observed a nonsignificant weak trend in the regression model for the hypothesis of task identity (F(3,72) = 3.04, p = 0.09)—the middle PFC (the MFG and IFS) had numerically higher β values than the dorsal FEF and ventral IFG. Taken together, the results of partial correlation and multiple regression replicated the overall pattern found in the unadjusted correlation—there is a graded difference between regions, with behavioral effort preferentially encoded in the dorsocaudal PFC, task sets in the middle PFC, and abstract structures (e.g., semantic vs nonsemantic) in the ventrorostral PFC.
A, This example illustrates partial correlation in a Venn diagram. The unique relationship between the IFG and the “complex structure” hypothesis (which is what partial correlation identifies) is shaded in yellow while the covariance driven by other areas is excluded. B, For each hypothesis, we computed its partial correlations with each of the four prefrontal areas, followed by statistical tests comparing the four regions. Post hoc comparisons: **p < 0.005; *p < 0.05. Error bars indicate SEM.
A, In this example of regression analysis, the four prefrontal regions’ neural patterns are included as regressors to predict the matrix of the complex structure hypothesis. Regression fit (β weight) indicates the unique explanatory power of a region when the effects of other regions are held constant. B, For each hypothesis, the β weight associated with each regressor was computed, followed by statistical comparisons among the regions. Post hoc comparisons: **p < 0.005; *p < 0.05. Error bars indicate SEM.
In addition to comparing the prefrontal regions to know which best fitted a certain hypothesis, we also carried out the inverse test—comparing different hypotheses to know which best described a region's neural pattern. For this exploratory analysis, we expanded the space of hypotheses by adding the “tripartite” matrix (three-way division in the task space) and “quadripartite” matrix (four-way division) but excluded the “complex structure” matrix (owing to its multicollinearity with other matrices), yielding five hypotheses as candidates (Fig. 7A). Partial correlation was calculated to quantify the unique relationship between a region and a target hypothesis while excluding the influences from all other hypotheses. As shown in Figure 7B, for each of the four regions, the five hypotheses differed significantly in the strength of association with the neural data (FEF, F(4,96) = 41.49, p < 0.001; MFG, F(4,96) = 53.91, p < 0.001; IFS, F(4,96) = 61.18, p < 0.001; IFG, F(4,96) = 125.07, p < 0.001). We found that the “bipartite” matrix (semantic vs nonsemantic) was generally the best candidate to account for the neural data, significantly outperforming all other hypotheses (with the exception that, in the FEF, “reaction time” was on a par with the “bipartite” matrix). These results statistically support our interpretation of Figure 3 that “semantic versus nonsemantic” dominantly determined how the clusters were configured.
A, Five theoretical matrices were constructed as the candidates to assess their relationship with each region's neural pattern. B, The partial correlation between the theoretical matrices and each of the four regions. Error bars indicate SEM. Post hoc comparisons were based on significant F-test results: ***p < 0.001; **p < 0.005; *p < 0.05. The “bipartite matrix” represents a two-way split between semantic tasks and nonsemantic tasks. The “tripartite matrix” was a further elaboration of the two-way split to specifically differentiate between semantic processing that was naturalistic (i.e., both the association and synonym tasks involved pairing words using their natural meaning) versus semantic processing that was arbitrary (the color task, which required arbitrary/deliberate pairing using color knowledge), hence resulting in a three-way split (naturalistic semantic, arbitrary semantic, and nonsemantic). The “quadripartite matrix” was a further elaboration of the three-way model to further differentiate the two nonsemantic conditions into the number task and the shape task, resulting in a four-way split. The “quintipartite matrix” represents the individualization of each of the five task sets, hence a five-way split. Finally, the “reaction time” matrix used behavioral RT as a proxy to represent the level of cognitive effort for each task condition as RT is robustly correlated with the amount of effort devoted to each task.
Taken together, the results revealed that, for all of the prefrontal regions, “semantic versus nonsemantic” was the predominant factor that configured neural representations relative to every other hypothesis. However, when specifically comparing the four regions to investigate how they fitted each hypothesis, we found robust differences among regions such that distinct types of information were preferentially registered in different subparts along the dorsocaudal–ventrorostral axis of the PFC.
Representational alignment with regions beyond the prefrontal cortex
It has been established in studies of both monkey cytoarchitecture (Saleem et al., 2014) and human neuroimaging (Jung et al., 2022) that different subdivisions in the PFC tend to connect with separate regions outside the PFC. The dorsal PFC, especially the FEF, is preferentially connected with the posterior parietal cortex (PPC) and jointly forms a circuit heavily involved in spatial attention (Yeo et al., 2011; Xu, 2018), whereas the ventral PFC, especially the IFG, is preferentially connected with the posterior middle temporal gyrus (pMTG), and they jointly form a circuit heavily involved in retrieving semantic memory (Davey et al., 2016; Chiou et al., 2018; Jackson, 2021). We speculated that connectomic fingerprints (i.e., preferential linkage to different areas; cf. Mars et al., 2018) might be the mechanism that drives distinct information to be preferentially registered in different parts of the PFC (in line with the broader hypothesis that local processing is shaped in part by the pattern of its connectivity with the rest of the brain that constrains this area's input/output). To investigate this, we employed representational alignment analysis (cf. Ito and Murray, 2023) and compared the results of representational alignment with seed-based functional connectivity. A high correlation between two regions’ representational matrices indicates that information encoded by the two regions is “configurationally aligned” and is interpreted as greater informational exchange between the two areas (greater connectivity strength; cf. Anzellotti and Coutanche, 2018).
As shown in Figure 8A, the neural pattern of pMTG was most aligned with that of the IFG (i.e., highest resemblance), followed by the IFS and MFG, and was least aligned with the FEF (F(3,72) = 74.62, p < 0.001). This pattern was mirrored in the result of functional connectivity—the time series of pMTG, as the seed, was most synchronized with that of the IFG, followed by the IFS and MFG, and least with the FEF (F(3,72) = 51.32, p < 0.001). This gradient (which peaked in the ventral PFC and gradually weakened dorsally) was flipped when we examined the four regions’ relationship with the PPC (Fig. 8B): The PPC's neural pattern was representationally most similar to the FEF, followed by the MFG and IFS, and least similar to the IFG (F(3,72) = 8.62, p = 0.007); also, the PPC's time series was most in synchrony with the FEF, followed by the MFG and IFS, and least with the IFG (F(3,72) = 87.32, p < 0.001). We also report a supplementary analysis in Extended Data Figure 8-1 in which the posterior ROIs were defined using conventional atlas parcels; the results were consistent with those based on voxel selection. Together, we replicated the functional and structural divide of the dorsal versus ventral PFC, echoing the classic study of Goldman-Rakic (1984), and revealed a parallel between representational similarity and functional connectivity.
A, Representational alignment: comparing the similarity of neural patterns and the strength of functional connectivity between the pMTG (a region heavily involved in semantic control) and the four prefrontal regions. B, Functional connectivity: comparing the similarity of neural patterns and the strength of functional connectivity between the PPC (a region heavily involved in visuospatial attention) and each of the four prefrontal regions. Error bars indicate SEM. Post hoc comparisons were based on significant F-test results: ***p < 0.001; **p < 0.005; *p < 0.05.
Dimensionality, generalizability, and expressivity of neural representations
Dimensionality reduction techniques, when applied to multivariate neural data, enable researchers to dissect informational content and extract major dimensions (Rigotti et al., 2013; Brincat et al., 2018; Bartolo et al., 2020). A key approach is to quantify the dimensionality of neural representations (Badre et al., 2021; Ito and Murray, 2023). In technical terms, the dimensionality of a brain region's representation refers to the number of variables needed to delineate the essence of its representation. Research has shown that the high dimensionality of neural codes is associated with high expressivity (i.e., the region can represent subtle differences and is therefore highly expressive), but it comes with the cost of low generalizability (the neural code captures too much peculiarity about each stimulus, sacrificing its ability to generalize across contexts, akin to overfitting). In contrast, low dimensionality is associated with low expressivity (unable to tell apart subtle differences) but high generalizability (i.e., the neural representation captures the broad-stroke information, which can be extracted from one context and applied to another; see Badre et al., 2021 for the trade-off among dimensionality, expressivity, and generalizability).
We investigated how representational dimensionality varied across the four prefrontal regions and related dimensionality to generalizability and expressivity. To begin with, we examined whether the neural representations of the four regions differed in their generalizability across contexts. This was achieved with multivariate cross-classification: The algorithm was trained to classify easy versus hard on the data of a semantic task and tested on the unseen data of a nonsemantic task (and vice versa). Successful decoding means that the neural code of a region contains information about difficulty that generalizes across domains. As shown in Figure 9, generalizability varied systematically along the dorsocaudal–ventrorostral axis such that the accuracy of cross-domain decoding of difficulty was highest in the FEF, declined in the MFG and IFS, and became nonexistent in the IFG. This was found in the cross-classification between the association task and the nonsemantic tasks (F(3,72) = 15.66, p < 0.001) and between the synonym task and the nonsemantic tasks (F(3,72) = 12.98, p < 0.001), unaffected by whether the nonsemantic task was number or shape (both Fs < 1). A similar albeit nonsignificant trend (highest in the FEF and gradually dropping in ventral regions) was seen in the results of the color knowledge task (F(3,72) = 2.91, p = 0.11). Next, we computed the dimensionality of each of the four prefrontal regions using the methods of Ito and Murray (2023); see Materials and Methods for the mathematical details. As shown in Figure 10A, representational dimensionality was highest in the IFG, followed by the IFS and MFG, and dropped to its lowest in the FEF (F(3,72) = 24.64, p < 0.001). Crucially, dimensionality and generalizability showed an inverse relationship: a trade-off between the two features.
As seen in Figure 10B (the association task) and Figure 10C (the synonym task), the IFG showed the highest dimensionality yet the lowest generalizability, which can be understood as the highly dimensional neural pattern of the IFG encoding too many idiosyncratic task features and therefore failing to generalize across contexts. In contrast, the FEF exhibited the lowest dimensionality but the highest generalizability, suggesting that this region used “stereotyped” patterns to represent easy versus hard (patterns shared across the five tasks, thus enabling cross-domain classification).
In the multivoxel cross-classification, support vector machines were trained to classify easy versus hard on a semantic task (either the data of the association task or those of the synonym task) and were tested on a nonsemantic task (either the number task or the shape task), and vice versa (trained on nonsemantic, tested on semantic). The chance level was 50% in this binary classification. Because the outcome was unaffected by whether the cross-classification was against the number task or the shape task (Fs < 1), the averaged results are shown here to highlight the main effect of ROI: cross-classification accuracy was high in the dorsal PFC and dropped steadily toward the ventral PFC. Error bars indicate SEM. Post hoc comparisons are based on significant F-test results: ***p < 0.001; **p < 0.005.
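As a concrete illustration, the cross-domain scheme amounts to fitting a classifier on one domain's easy/hard labels and scoring it on the other domain. Below is a minimal sketch with scikit-learn and synthetic data; the shared "difficulty axis" is a hypothetical stand-in for whatever domain-general signal a region might carry, not a claim about the study's actual data.

```python
import numpy as np
from sklearn.svm import SVC

def cross_domain_accuracy(X_train, y_train, X_test, y_test):
    """Train easy-vs-hard on one domain; score on the unseen other domain."""
    clf = SVC(kernel="linear")
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)

# Toy data: 40 "trials" x 50 "voxels" per domain. Both domains encode
# difficulty along the same (hypothetical) axis, so the decoder generalizes.
rng = np.random.default_rng(0)
axis = rng.standard_normal(50)
y = np.repeat([0, 1], 20)  # 0 = easy, 1 = hard
semantic = rng.standard_normal((40, 50)) + 2 * np.outer(y, axis)
nonsemantic = rng.standard_normal((40, 50)) + 2 * np.outer(y, axis)
acc = cross_domain_accuracy(semantic, y, nonsemantic, y)  # well above 50% chance
```

If the two domains instead encoded difficulty along unrelated axes (as the results suggest for the IFG), the same procedure would hover near the 50% chance level.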
A, Dimensionality index, computed as the participation ratio of the eigenvalues of a region's representational matrix, for each of the four prefrontal regions. Dimensionality had an inverse relationship with cross-domain classification accuracy (a proxy measure of generalizability, indicating how well the neural pattern generalizes information from one domain to another). This was found in the results of the association task (B) and the synonym task (C): highly dimensional regions tended to exhibit lower generalizability. Post hoc comparisons: ***p < 0.001; **p < 0.005. Error bars indicate SEM.
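The participation ratio in panel A can be sketched in a few lines: it is the sum of the covariance eigenvalues squared, divided by the sum of the squared eigenvalues. A minimal numpy sketch, assuming condition-by-voxel pattern matrices as input (the exact preprocessing of Ito and Murray, 2023, may differ):

```python
import numpy as np

def participation_ratio(patterns):
    """Dimensionality index of a representation: (sum of eigenvalues)^2 /
    (sum of squared eigenvalues) of the condition-by-condition covariance.
    Input: condition x voxel activity patterns."""
    centered = patterns - patterns.mean(axis=0)
    cov = np.cov(centered, rowvar=True)            # condition covariance
    lam = np.clip(np.linalg.eigvalsh(cov), 0, None)  # guard tiny negatives
    return lam.sum() ** 2 / (lam ** 2).sum()
```

A rank-1 pattern matrix yields a ratio of 1 (all variance on a single dimension), whereas perfectly isotropic patterns approach the number of conditions, matching the intuition that low values index "gist-only" codes and high values index "gist-plus-details" codes.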
We also analyzed the relationship between dimensionality and expressivity. According to Badre et al. (2021), highly expressive neural representations encode intricate differences between task conditions and enable fine-grained differentiation. In contrast, representations with low expressivity fail to distinguish small differences and therefore enable only coarse-grained differentiation. Based on this definition, we performed a 10-way classification for each of the four prefrontal regions to measure expressivity, testing the ability to distinguish details across the entire task space and predict the one correct label from 10 task conditions. Thus, decoding accuracy served as a proxy measure of expressivity, assessing how much nuance was discernible from the representations. As Figure 11A shows, decoding accuracy was highest in the IFG, gradually declined dorsally, and was lowest in the FEF (F(3,72) = 56.69, p < 0.001). As shown in Figure 11B, decoding accuracy had a positive relationship with dimensionality such that the most expressive region (the IFG) had the highest dimensionality while the least expressive region (the FEF) had the lowest. Taken together, our investigation into the relationship among dimensionality, generalizability, and expressivity revealed a trade-off between these features, concurring with the proposal of Badre et al. (2021). Moreover, all these features varied gradually along the dorsocaudal–ventrorostral axis of the PFC, with representations being most generic (low-dimensional) in the dorsocaudal PFC and most specific (high-dimensional) in the ventrorostral PFC.
A, In the 10-way cross-validated classification, support vector machines were trained to predict the correct condition label from 10 possible choices. This 10-way classification required more detail about the differences between conditions than, for example, the binary classification of semantic versus nonsemantic, and served as a proxy quantifying a region's representational expressivity. The chance level was 10% in this 10-way classification. The accuracy of such fine-grained classification was higher in the ventral PFC and lower in the dorsal PFC. B, There was a positive relationship between dimensionality and expressivity such that highly dimensional regions (e.g., the IFG) tended to show higher expressivity. Post hoc comparisons: ***p < 0.001. Error bars indicate SEM.
Discussion
In the present study, we designed a multifaceted paradigm in which participants performed three types of semantic tasks and two types of nonsemantic tasks, with difficulty level manipulated for each task. This design merged multiple variables into a composite structure and enabled us to leverage RSA to identify the abstract dimensions that defined neural representations during task performance and verify whether distinct information was preferentially encoded in different subparts of the PFC. Results showed a graded transition along the dorsocaudal–ventrorostral axis: The FEF was sensitive to the reaction times of different conditions, was preferentially connected with the superior parietal lobule, and encoded task difficulty in a representationally generalizable yet dimensionally impoverished fashion. In contrast, the IFG was sensitive to the complex/abstract structure of our task paradigm (particularly to the principal division of semantic vs nonsemantic), was preferentially connected with the left MTG, was representationally expressive enough to distinguish subtleties, and was dimensionally the richest among the four areas. Finally, the areas interposed in the middle zone (the MFG and IFS) were sensitive to the individuality of task sets and ranked in the middle in terms of connectivity, dimensionality, generalizability, and expressivity. Together, by leveraging subject-specific localization to capture the idiosyncrasy of individual activation (Fedorenko, 2021) and RSA to infer informational contents (Freund et al., 2021b), we showed how highly dimensional prefrontal representations can be decomposed and compared with various hypothetical constructs about cognitive control, revealing the gradual transformation of task-relevant information across PFC subparts.
Below we expound on four issues: (1) how the present findings relate to previous literature on semantic versus nonsemantic processing in the PFC; (2) how dimensionality can explain the functional repertoires of PFC subregions; (3) how functional connectivity relates to the emergent prefrontal gradient; and (4) how different approaches to studying the organization of cognitive control in the PFC compare.
Semantic control system versus multiple-demand system in the PFC
Primarily based on univariate analysis, a dorsoventral divide has been established in the literature: A large part of the ventrolateral PFC is involved in goal-directed operations on semantic meaning (Thompson-Schill et al., 1997; Badre et al., 2005; Chiou et al., 2018, 2023; Gao et al., 2021; Zhang et al., 2021) and joins forces with the pMTG as the “semantic control” system (Davey et al., 2016; Alam et al., 2019; Jackson, 2021). Beyond semantic control, the ventrolateral PFC is also engaged in various tasks requiring the retrieval/manipulation of memories (e.g., episodic memory, Vatansever et al., 2021; autobiographical memory, Chiou et al., 2020; language comprehension, Fedorenko et al., 2011), and it tends to be suppressed by perceptual tasks (e.g., visual search, Chiou et al., 2020). However, instead of simply preferring memory retrieval to perceptual processes, the ventrolateral PFC is also sensitive to executive demand, as its activation increases with the difficulty of semantic processes (but not the difficulty of perceptual processes; Humphreys and Lambon Ralph, 2017; Chiou et al., 2023). This selective and combinatorial profile (e.g., preferring semantic tasks especially when they are difficult) reflects a fusion between semantics and executive control (Davey et al., 2016; Wang et al., 2020). In contrast, the “multiple-demand” system is sensitive to executive demand on a wide range of goal-directed tasks (Duncan, 2010) and occupies a set of broadly distributed patches in the prefrontal and parietal lobes (Assem et al., 2020, 2021). In the lateral PFC, the “semantic control” system is located in subregions ventral and rostral to the “multiple-demand” system. Advancing from univariate data, our multivariate decoding lends support to this dorsal-ventral divide and further reveals the informational contents. 
For example, when transitioning from the dorsocaudal to the ventrorostral PFC, the preferred contents vary from the speed of task performance, via task sets, to semantic meaning, consistent with a progression from “multiple-demand” regions to “semantic control” regions. Furthermore, our dimensionality analysis showed that IFG representations had higher dimensionality yet lower generalizability than dorsal regions. The higher dimensionality in ventral areas is consistent with a nonlinear profile of responses (e.g., the IFG cannot be defined simply by either difficulty or semantics, as it prefers “both” easy mind-wandering and difficult semantic tasks; Chiou et al., 2023). The lower dimensionality in dorsal areas fits with their domain-generality across a wide variety of task situations: task difficulty universally engages “multiple-demand” regions, irrespective of task rules and stimulus modalities, as if dimensionality is condensed in these regions such that the neural codes gloss over superficial differences between tasks and preserve only generic information applicable across situations.
Dimensionality in different prefrontal regions
It has been argued that to precisely encode the world and efficiently respond to novel challenges, the brain needs to balance precision (truthfully representing details) against efficiency (swiftly reusing learned skills to tackle similar problems). Hence, high- and low-dimensional representations are both necessary to enable such adaptive behavior (Fusi et al., 2016; MacDowell et al., 2022). To represent complex thoughts and actions, high-dimensional representations are needed to capture all minutiae; to efficiently adapt to new situations, low-dimensional representations are needed to encode a “sketch” of the novel challenge, identify its similarity with former situations, and generalize previous solutions.
As suggested by Badre et al. (2021), there is a computational trade-off as the brain balances low against high dimensionality. This fits with our findings: The ventrorostral PFC was high-dimensional, which enabled it to distinguish subtleties yet compromised its generalizability across domains, whereas the dorsocaudal PFC was low-dimensional, which allowed extrapolating from one domain to predict another (dovetailing with its domain-general profile) but made its neural coding less elaborate. It remains to be empirically tested whether this trade-off holds beyond the five tasks employed in the present study. Critically, dimensionality varied steadily along the prefrontal gradient, which may be driven by graded changes in connectivity with posterior regions (which we discuss next).
Connectivity and functional heterogeneity in the PFC
It has been established that the dorsal parietal and ventral temporal pathways of the visual system are preferentially connected with, respectively, the dorsolateral and ventrolateral PFC. This was initially identified by the work of Goldman-Rakic using monkey electrophysiology (Goldman-Rakic, 1984) and was later replicated in human neuroimaging (Shmuelof and Zohary, 2005), as well as in the results of representational alignment and seed-based connectivity that we report in the present study. Recent evidence further shows that this difference extends beyond the visual system (Haber et al., 2022; Jung et al., 2022), with the dorsolateral PFC more connected with regions for attentional control and the ventrolateral PFC more connected with regions for semantic memory. As argued by Mars et al. (2018), each region's unique pattern of input and output connections shapes its functional repertoire, driving functional heterogeneity. Connectivity has been shown to shape functional diversity within the temporal cortex (Jackson et al., 2016; Bajada et al., 2017), parietal cortex (Humphreys et al., 2022), and occipital cortex (Konkle and Caramazza, 2017). Taken together, robust evidence suggests that connectivity is one of the major factors constraining how functions arise from brain structures.
Different approaches to study the organization of cognitive control in the PFC
Different approaches used to study the functional organization of cognitive control in the PFC often yield different conclusions. One approach is parcellating the PFC into a patchwork of cortical parcels (Glasser et al., 2016) and identifying the parcels preferring harder to easier tasks (Assem et al., 2020, 2021, 2024). This approach reveals multiple patches in the PFC, which topographically form a blotchy arrangement juxtaposed alongside the patches of other functional networks, dispersed across the PFC (see Assem et al., 2024, for a demonstration of such a mosaic-like pattern in the PFC comprising the patches of different networks). Another approach assumes that cognitive control is hierarchically organized along the dorsocaudal–ventrorostral axis of the PFC (for review, see Badre and Nee, 2018; Badre and Desrochers, 2019). This approach typically manipulates the contingency between stimuli and responses in a hierarchical fashion, such that higher-order contexts stipulate different responses to the same stimuli (viz., as the superordinate/abstract contexts vary, the mapping between stimuli and responses also varies; Badre and D'Esposito, 2007; Nee and D'Esposito, 2016; Nee, 2021). With such hierarchically nested structures in the design, graded changes in the PFC have been found: the dorsocaudal division is involved in executing actions, the middle division in maintaining task sets, and the ventrorostral division in representing a superordinate schema of task contexts (Nee and D'Esposito, 2016). It is, however, important to note that arbitrating among the various approaches is difficult given the drastic differences in experimental designs, ways of parcellating the PFC, and analytical methods.
Functional dissociations in the PFC can vary depending on a wide range of factors—the choices of tasks being compared (e.g., hierarchical or “hard vs easy” contrast), how ROIs are defined (coordinate-based, parcel-based, or using a localizer to select voxels within a scope), and the analysis performed (univariate, multivariate, connectivity, etc.). Thus, it is unsurprising that they have yielded disparate conclusions.
Limitations of the present study and future directions
This study did not examine dynamic changes in connectivity or function that occur over time or across different states (Wang et al., 2024). Moreover, given the differences in task design, loci of the ROIs, and analytical methods, it is difficult to establish the correspondence between the PFC gradient we observed and those reported previously. Ostensibly, we found a gradient steadily transitioning from cognitive demand (as indexed by RT), via task sets, to semantics, which is morphologically consistent with the “concrete-to-abstract” transition reported previously (Badre and Nee, 2018). However, future research is needed to closely match experimental parameters, potentially adopting a hierarchically nested design that incorporates both semantic and nonsemantic elements and using an individual localizer to select voxels within parcels of the fine-grained parcellation of the Human Connectome Project. Incorporating all these parameters holds promise for reconciling discrepant findings and clarifying how cognitive control is implemented in the PFC.
Footnotes
This work was funded by an MRC Program grant and intramural funding to M.A.L.R. (MR/R023883/1; MC_UU_00005/18) and a Sir Henry Wellcome Fellowship (201381/Z/16/Z) to R.C. We thank the three anonymous reviewers for their insightful comments.
The authors declare no competing financial interests.
- Correspondence should be addressed to Rocco Chiou at rocco.chiou{at}ndcn.ox.ac.uk.