Domain-General and Domain-Specific Patterns of Activity Supporting Metacognition in Human Prefrontal Cortex

Metacognition is the capacity to evaluate the success of one's own cognitive processes in various domains; for example, memory and perception. It remains controversial whether metacognition relies on a domain-general resource that is applied to different tasks or if self-evaluative processes are domain specific. Here, we investigated this issue directly by examining the neural substrates engaged when metacognitive judgments were made by human participants of both sexes during perceptual and memory tasks matched for stimulus and performance characteristics. By comparing patterns of fMRI activity while subjects evaluated their performance, we revealed both domain-specific and domain-general metacognitive representations. Multivoxel activity patterns in anterior prefrontal cortex predicted levels of confidence in a domain-specific fashion, whereas domain-general signals predicting confidence and accuracy were found in a widespread network in the frontal and posterior midline. The demonstration of domain-specific metacognitive representations suggests the presence of a content-rich mechanism available to introspection and cognitive control.

SIGNIFICANCE STATEMENT We used human neuroimaging to investigate processes supporting memory and perceptual metacognition. It remains controversial whether metacognition relies on a global resource that is applied to different tasks or if self-evaluative processes are specific to particular tasks. Using multivariate decoding methods, we provide evidence that perceptual- and memory-specific metacognitive representations coexist with generic confidence signals. Our findings reconcile previously conflicting results on the domain specificity/generality of metacognition and lay the groundwork for a mechanistic understanding of metacognitive judgments.


Introduction
Metacognition is the capacity to evaluate the success of one's cognitive processes in various domains; for example, perception or memory (Flavell, 1979; Nelson and Narens, 1990; Metcalfe and Shimamura, 1994; Fleming et al., 2012a). Metacognitive ability can be assessed in the laboratory by quantifying the trial-by-trial correspondence between objective performance and subjective confidence (Galvin et al., 2003; Maniscalco and Lau, 2012; Overgaard and Sandberg, 2012; Fleming and Lau, 2014). Anatomical (Fleming et al., 2012b; McCurdy et al., 2013), functional (Fleck et al., 2006; Yokoyama et al., 2010; Baird et al., 2013; Hilgenstock et al., 2014), and neuropsychological (Shimamura and Squire, 1986; Schnyer et al., 2004) evidence indicates that specific neural substrates (especially in frontolateral, frontomedial, and parietal regions) contribute to metacognition across a range of task domains, including perception and memory. However, the neurocognitive architecture supporting metacognition remains controversial. Does metacognition rely on a common, domain-general resource that is recruited to evaluate performance on a variety of tasks? Or is metacognition supported by domain-specific components?
Current computational perspectives (Pouget et al., 2016; Fleming and Daw, 2017) suggest that both domain-general and domain-specific representations may be important for guiding behavior. One needs to be able to compare confidence estimates in a "common currency" across a range of arbitrary decision scenarios (de Gardelle and Mamassian, 2014). One solution to this problem is to maintain a global resource with access to arbitrary sensorimotor mappings (Holroyd et al., 2005; Heekeren et al., 2006; Cole et al., 2013). Candidate neural substrates for a domain-general resource are the frontoparietal and cingulo-opercular networks, known to be involved in arbitrary control operations (Cole et al., 2013). In particular, the posterior medial prefrontal cortex (PFC) (encompassing the paracingulate cortex and presupplementary motor area) has been implicated in representing confidence, monitoring conflict, and detecting errors across a range of tasks (Gehring et al., 1993; Botvinick et al., 2004; Ridderinkhof et al., 2004; Fleming et al., 2012b). Conversely, if the system only had access to generic confidence signals, then appropriate switching between particular tasks or strategies on the basis of their expected success would be compromised. Functional imaging evidence implicates human anterior PFC in tracking the reliability of specific alternative strategies during decision making (Donoso et al., 2014) and such regions may also support domain-specific representations of confidence.
Current behavioral evidence of a shared resource for metacognition is ambiguous, in part due to the difficulty of distilling metacognitive processes from those supporting primary task performance (Galvin et al., 2003; Maniscalco and Lau, 2012; Fleming and Lau, 2014). Some studies have found that efficient metacognition in one task predicts good metacognition in another (McCurdy et al., 2013; Ais et al., 2016; Ruby et al., 2017; Samaha and Postle, 2017; Faivre et al., 2018), whereas others indicate the independence of metacognitive abilities (Kelemen et al., 2000; Baird et al., 2013; Vo et al., 2014). Recent studies using bias-free measures of metacognition have identified differences in the neural correlates of memory and perceptual metacognition in both healthy subjects (Baird et al., 2013; McCurdy et al., 2013) and neuropsychological patients. However, the study of behavioral individual differences provides only indirect evidence of the neural and computational architecture supporting metacognition.
Here, we investigated this ontology directly by examining neural substrates engaged when metacognitive judgments are made during perceptual and memory tasks matched for stimulus and performance characteristics. We used a combination of univariate and multivariate analyses of fMRI data to identify domain-specific and domain-general neural substrates engaged during metacognitive judgments. We also distinguished activations engaged by a metacognitive judgment from neural activity that tracks confidence level. Together, our findings reveal the coexistence of generic and specific confidence representations, consistent with a computational hierarchy underpinning effective metacognition.

Participants
Thirty healthy subjects (ages 18–33 years, mean 24.97; SD = 4.44; 14 males) with normal or corrected-to-normal vision were monetarily compensated and gave written informed consent to participate in the study at the Center for Neural Science at New York University. The study protocols were approved by the local institutional review board. The number of participants was determined a priori at n = 30, which is consistent with recent guidelines on neuroimaging sample sizes (Poldrack et al., 2017). Due to behavioral and in-scanner motion cutoff criteria, six subjects were excluded from further analysis (details below). We present the results of the 24 subjects whose data were fully analyzed.

Experimental and task design
The experiment had a 2 × 2 × 2 design: condition (confidence/follow) × task domain (perception/memory) × stimulus type (shapes/words). It consisted of six scanner runs, each with eight nine-trial miniblocks (72 trials per run, 432 trials in total). Perceptual and memory two-alternative forced-choice (2-AFC) tasks were presented in separate, interleaved runs (three runs per task; order counterbalanced across subjects). In each run, there were four pairs of miniblocks from the confidence and follow conditions. To avoid stimulus confounds, two different types of stimulus were used throughout the experiment. In each run, two pairs of confidence/follow miniblocks used words and the remaining two pairs used abstract shapes (interleaved and order counterbalanced across runs).
In the perceptual task, subjects were asked to report the brighter of two stimuli on each trial. In the memory task, subjects began each miniblock by learning a set of nine consecutively presented stimuli. A stimulus from this set was then presented on each subsequent trial (in randomized order) alongside a new stimulus. The subjects' task was to identify the studied stimulus. In miniblocks from the confidence condition, subjects had to rate their confidence in their performance in each trial by selecting a number from a scale of 1 to 4. In miniblocks from the follow condition, subjects had to "follow the computer" in each trial by pressing the button corresponding to the highlighted number regardless of their confidence. The highlighted number was yoked to their ratings in the previous confidence miniblock (randomized presentation order) to ensure similar low-level visuomotor characteristics in both conditions for any given pair of miniblocks.
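The yoking of the follow condition to the preceding confidence miniblock can be sketched as follows. This is a minimal Python illustration of the design logic described above; the function name and use of Python's `random` module are our own (the original experiment was programmed in MATLAB):

```python
import random

def follow_block_numbers(previous_confidence_ratings, seed=None):
    """Highlighted numbers for a follow miniblock: the subject's own
    ratings from the paired confidence miniblock, in randomized order,
    so both conditions share the same visuomotor statistics."""
    rng = random.Random(seed)
    numbers = list(previous_confidence_ratings)
    rng.shuffle(numbers)  # randomized presentation order
    return numbers
```

Because the follow block replays the same set of button presses as its paired confidence block, any difference in activity between the conditions cannot be attributed to low-level visuomotor demands.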
Subjects were reminded at the beginning of each miniblock of the condition, task, and stimulus type that would follow. They used two fingers of their right hand to respond on an MRI-compatible button box: left stimulus (index) and right stimulus (middle). For confidence ratings, they used four fingers: 1 (index), 2 (middle), 3 (ring), and 4 (little). If subjects failed to provide either type of response within the allotted time (see Fig. 1A for details), the trial was missed and an exclamation mark was displayed for the remainder of the trial. Failing to press the highlighted number counted as a missed trial.
Before entering the scanner, participants were familiarized with the tasks and the confidence rating scale. After independent brightness thresholds were estimated for words and abstract shapes, subjects practiced one miniblock of each type (i.e., eight miniblocks). Instructions emphasized that confidence ratings should reflect relative confidence and participants were encouraged to use all ratings. The whole experiment lasted ~1.5 h.

Stimuli
The experiment was programmed in MATLAB (The MathWorks) and stimuli were presented using Psychtoolbox (Brainard, 1997). Abstract 22- or 28-line shapes were created randomly by specifying an (invisible) grid of 6 × 6 squares subtending 4 degrees of visual angle, in which lines could connect two vertices horizontally, vertically, or diagonally. The first line always stemmed from the central vertex of the invisible grid, connecting it to one of the eight surrounding vertices at random, to ensure shape centrality within the grid. The remaining lines were drawn sequentially, ensuring that all lines were connected. Orientations and originating vertices were selected randomly.
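The generative procedure for the abstract shapes can be sketched as follows. This is a hedged Python illustration, not the original MATLAB code; the 7 × 7 vertex lattice (implied by a 6 × 6 grid of squares) and all names are our assumptions:

```python
import random

def generate_shape(n_lines=22, grid=7, seed=None):
    """Randomly draw connected line segments on a (grid x grid) vertex
    lattice. The first line stems from the central vertex; each later
    line starts at a vertex already in the shape, stepping one vertex
    horizontally, vertically, or diagonally."""
    rng = random.Random(seed)
    steps = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
             if (dx, dy) != (0, 0)]
    center = (grid // 2, grid // 2)
    visited = {center}          # vertices already part of the shape
    lines = []
    while len(lines) < n_lines:
        # Keep the figure connected: start at an existing vertex.
        start = center if not lines else rng.choice(sorted(visited))
        dx, dy = rng.choice(steps)
        end = (start[0] + dx, start[1] + dy)
        if not (0 <= end[0] < grid and 0 <= end[1] < grid):
            continue            # stay inside the invisible grid
        if (start, end) in lines or (end, start) in lines:
            continue            # avoid duplicate segments
        lines.append((start, end))
        visited.add(end)
    return lines
```

Each returned pair of vertices is one drawn line; rendering them at the stated visual angle is left to the presentation software.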
All words were nouns of six to 12 letters with one to four syllables obtained from the Medical Research Council Psycholinguistic Database (Wilson, 1988). In the perceptual task, words had high familiarity, concreteness, and imageability ratings (400–700). In the memory task, words had low ratings (100–400) to increase task difficulty. Each word and each shape was presented once throughout the experiment (across perceptual and memory blocks, including practice trials). All subjects were tested on the same words and shapes (counterbalanced across confidence and follow conditions across subjects). Words and rating scales were presented in DS-Digital font (40 points) to make their visual features similar to the abstract shapes.
To obtain stimulus sets of similar difficulty for shapes and words, we ran a series of pilot studies in which participants rated abstract shapes' distinctiveness and then performed the memory task [15 miniblocks per subject; 171 Amazon Mechanical Turk participants (73 for shapes; 98 for words) and six subjects in the laboratory who performed a complete version of the experiment]. Based on these results, we expected a mean performance in the memory task of ~71% correct responses when 22- and 28-line distinctive shapes were used in the same block and ~83% correct when long words (6–12 letters) with low concreteness, imageability, and familiarity ratings (100–400) were used. To further increase difficulty, we created pairs of old and new words split between the confidence and follow conditions (counterbalanced across subjects), blocked by similar semantic category (e.g., finance, argumentation, character traits), such that each new word within a block was freely associated with one old word (and when possible, vice versa) according to the University of South Florida free association normed database (Nelson et al., 2004).
In the perceptual task, the difference in brightness (Δb) between the two stimuli was calibrated for each subject and independently for each stimulus type. The brightness of a randomly located reference stimulus was fixed (middle gray). The brightness of the nonreference stimulus was titrated using a staircase procedure similar to previous experiments (Fleming et al., 2012b). During practice, we used a two-down/one-up procedure with a fixed, large step size until subjects reached 15 reversals or 90 trials. The step sizes followed recommended ratios to match the expected performance in memory blocks (García-Pérez, 1998). The experiment began with a Δb value determined by the average of the Δb values at each reversal, excluding the first one. Throughout the experiment, we kept a small-step-size staircase running to account for learning or tiredness.
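The calibration logic can be sketched as follows. This is an illustrative Python translation of a fixed-step two-down/one-up staircase; the starting value, step size, and the `respond` callback are placeholders, not the experiment's actual parameters:

```python
def staircase(respond, delta0=0.5, step=0.05, max_reversals=15, max_trials=90):
    """Fixed-step two-down/one-up staircase on the brightness difference
    delta-b. `respond(delta)` should return True for a correct response.
    Returns trial-by-trial deltas, reversal values, and the starting
    value for the main experiment (mean of reversals excluding the first)."""
    delta = delta0
    correct_streak = 0
    direction = 0            # +1 = last change was up, -1 = down
    reversals, deltas = [], []
    trials = 0
    while len(reversals) < max_reversals and trials < max_trials:
        deltas.append(delta)
        trials += 1
        if respond(delta):
            correct_streak += 1
            if correct_streak == 2:          # two correct -> harder (smaller delta)
                correct_streak = 0
                if direction == +1:          # change of direction = reversal
                    reversals.append(delta)
                direction = -1
                delta = max(delta - step, step)
        else:                                # one error -> easier (larger delta)
            correct_streak = 0
            if direction == -1:
                reversals.append(delta)
            direction = +1
            delta += step
    threshold = sum(reversals[1:]) / max(len(reversals) - 1, 1)
    return deltas, reversals, threshold
```

In the experiment, `respond` would be the subject's actual discrimination response; here any callable returning a boolean can be used to simulate the procedure.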
A middle gray fixation cross subtending 0.3 degrees of visual angle was presented between the two stimuli on a black background. The reference stimulus in the perceptual task and all stimuli in the memory task were middle gray. All stimuli were surrounded by an isoluminant blue bounding box separated from the stimulus by a gap of at least 0.15 degrees of visual angle.

Behavioral data analysis
Data analysis was performed in MATLAB and statistical analysis in RStudio (R Studio Team, 2015). We estimated metacognitive efficiency by computing log(meta-d'/d'), where d' is a signal detection theoretic measure of type I sensitivity and meta-d' is a measure of type II sensitivity (i.e., the degree to which a subject discriminates correct from incorrect responses) expressed in the same units as type I sensitivity (d') (Maniscalco and Lau, 2012; Fleming and Lau, 2014). Meta-d' indicates the d' that would have been predicted to give rise to the observed confidence rating data assuming a signal detection theoretic ideal observer. Meta-d' = d' indicates optimal type II behavior for the observed type I behavior. Meta-d' greater or less than d' indicates metacognition that is, respectively, better or worse than expected given task performance, as may occur, for instance, if first-order decisions and confidence are supported by partly parallel processing streams (Fleming and Daw, 2017). We used hierarchical Bayesian estimation to incorporate subject-level uncertainty in group-level parameter estimates (Fleming, 2017). Certainty on this parameter was determined by computing the 95% high-density interval (HDI) from the posterior samples (Kruschke, 2010). For correlation and individual differences analyses, we used single-subject Bayesian model fits. Two subjects were discarded for missing >10% of the trials (i.e., >1 SD above the average rate of missed trials, which was 5%). Missed trials were not analyzed.
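For illustration, the type I sensitivity underlying these measures can be computed directly from hit and false-alarm rates. This Python sketch shows only the simple pieces; estimating meta-d' itself requires fitting the full confidence-rating model (Maniscalco and Lau, 2012), which is not shown, so `metacognitive_efficiency` simply takes a meta-d' estimate as input:

```python
from math import log
from statistics import NormalDist

def dprime(hit_rate, fa_rate):
    """Type I sensitivity d' = z(HR) - z(FAR) under equal-variance SDT."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

def metacognitive_efficiency(meta_d, d):
    """log(meta-d'/d'): 0 = optimal type II behavior for the observed
    type I behavior; negative = worse, positive = better than expected."""
    return log(meta_d / d)
```

For example, a subject with meta-d' below d' (e.g., the perceptual task here) yields a negative efficiency score, whereas meta-d' above d' (the memory task) yields a positive one.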

fMRI data acquisition
Brain images were acquired using a 3T Allegra scanner (Siemens). BOLD-sensitive functional images were acquired using a T2*-weighted gradient-echo echo-planar imaging sequence (42 transverse slices, interleaved acquisition; TR, 2.34 s; TE, 30 ms; matrix size: 64 × 64; 3 × 3 mm in-plane resolution; slice thickness: 3 mm; flip angle: 90°; FOV: 126 mm). The main experiment consisted of three runs of 210 volumes and three runs of 296 volumes for the perceptual and memory tasks, respectively. We collected a T1-weighted MPRAGE anatomical scan (1 × 1 × 1 mm voxels; 176 slices) and local field maps for each subject.

fMRI data preprocessing
Imaging analysis was performed using SPM12 (Statistical Parametric Mapping; www.fil.ion.ucl.ac.uk/spm). The first five volumes of each run were discarded to allow for T1 stabilization. Functional images were realigned and unwarped using local field maps (Andersson et al., 2001) and then slice-time corrected (Sladky et al., 2011). Each participant's structural image was segmented into gray matter, white matter, CSF, bone, soft tissue, and air/background images using a nonlinear deformation field to map it onto template tissue probability maps (Ashburner and Friston, 2005). This mapping was applied to both structural and functional images to normalize them to Montreal Neurological Institute (MNI) space. Normalized images were spatially smoothed using a Gaussian kernel (8 mm FWHM). We set a within-run 1 mm rotation and 4 mm affine motion cutoff criterion, which led to the exclusion of four subjects, leaving a total of 24 subjects whose functional and behavioral data were fully analyzed.

Univariate analysis
All of our general linear models (GLMs) focus on the "rating period" of each trial by specifying boxcar regressors beginning at the subjects' type I response and ending at their type II response (i.e., either confidence rating or number press). Motion correction parameters were entered as covariates of no interest along with a constant term per run. Regressors were convolved with a canonical hemodynamic response function. Low-frequency drifts were excluded with a 1/128 Hz high-pass filter. Missed trials were not modeled. For judgment-related (JR) analyses, we created a GLM with two regressors of interest per run to estimate BOLD response amplitudes in each voxel during the rating period in each trial of the confidence and follow blocks. For the confidence-level-related (CLR) parametric modulation analysis, a GLM was used to estimate BOLD responses in the confidence blocks. There were two regressors of interest in each run, one modeling the confidence rating period and another that encoded a parametric modulation by the four available confidence ratings (1-4).
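The construction of a parametric confidence regressor can be sketched as follows. This is an illustrative Python version; the double-gamma HRF parameters and the sampling grid are simplified stand-ins for SPM's canonical HRF, not its exact implementation, and all function names are our own:

```python
import numpy as np
from math import gamma as gamma_fn

def hrf(t, p1=6.0, p2=16.0, ratio=6.0):
    """Illustrative double-gamma hemodynamic response function
    (peak ~6 s, undershoot ~16 s); not SPM's exact parameterization."""
    g = lambda t, a: (t ** (a - 1) * np.exp(-t)) / gamma_fn(a)
    return g(t, p1) - g(t, p2) / ratio

def build_regressors(onsets, durations, ratings, tr=2.34, n_scans=210):
    """Boxcar for the rating period plus a mean-centered parametric
    modulation by confidence rating (1-4), both convolved with the HRF
    and downsampled to one value per TR."""
    dt = 0.1                                    # high-resolution grid (s)
    t = np.arange(0, n_scans * tr, dt)
    box = np.zeros_like(t)
    pmod = np.zeros_like(t)
    centered = np.asarray(ratings, dtype=float) - np.mean(ratings)
    for on, dur, c in zip(onsets, durations, centered):
        idx = (t >= on) & (t < on + dur)
        box[idx] = 1.0                          # rating-period boxcar
        pmod[idx] = c                           # parametric modulator
    kernel = hrf(np.arange(0, 32, dt))
    conv = lambda x: np.convolve(x, kernel)[: len(t)]
    pick = (np.arange(n_scans) * tr / dt).astype(int)
    return conv(box)[pick], conv(pmod)[pick]
```

Mean-centering the modulator (as SPM does by orthogonalizing it against the main regressor) separates the average rating-period response from trial-by-trial variation with confidence.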
Statistical inference. For the JR analysis, single-subject contrast images of the confidence and follow regressors were entered into a second-level random-effects analysis using one-sample t tests against zero to assess group-level significance. For the CLR parametric modulation analysis, single-subject contrast images of the parametric modulator were entered into a similar second-level random-effects analysis. For conjunction analyses of activations common to both domains, second-level maps thresholded at p < 0.001 (uncorrected) were intersected to reveal regions of shared statistically significant JR and CLR activity. Activations were visualized using MRIcro (http://www.mccauslandcenter.sc.edu/crnl/mricro). All second-level unthresholded statistical images were uploaded to Neurovault (Gorgolewski et al., 2015) (https://neurovault.org/collections/3232/).

ROI analysis. To define regions of interest (ROIs), 12 mm spheres were centered at MNI coordinates identified from previous literature (see Fig. 3C). ROIs in left rostrolateral PFC (L rlPFC) [−33, 44, 28], right rlPFC (R rlPFC) [27, 53, 25], and dorsal anterior cingulate cortex/presupplementary motor area (dACC/pre-SMA) [0, 17, 46] were based on Fleming et al. (2012b). The mask for precuneus (PCUN) [0, −64, 24] was based on McCurdy et al. (2013). The MNI x-coordinates for the dACC/pre-SMA and PCUN masks were set to 0 to ensure bilaterality. Beta values were extracted from subjects' contrast images for the JR and CLR univariate analyses, respectively.

Multivoxel pattern analysis
Multivoxel pattern analysis (MVPA) was performed in MATLAB using the Decoding Toolbox (Hebart et al., 2014). We classified runwise beta images from GLMs modeling JR and CLR activity patterns in ROI and whole-brain searchlight analyses. ROI MVPAs were performed on normalized, smoothed images using the ROI spheres as masks. Previous work has shown that these preprocessing steps have minimal impact on support vector machine (SVM) classification accuracy while allowing meaningful comparison across subject-specific differences in anatomy, as in standard fMRI analyses (Kamitani and Sawahata, 2010; Op de Beeck, 2010). A single accuracy value per subject, per condition, and per ROI was extracted and used for group analysis and statistical testing. As a control, we added a 6-mm-radius sphere centered at the ventricles [0, 2, 15].
Whole-brain searchlight analyses used 12-mm-radius spheres centered around a given voxel for all voxels on spatially realigned and slice-time-corrected images from each subject to create whole-brain accuracy maps. For group-level analyses, these individual searchlight maps were spatially normalized and smoothed using a Gaussian kernel (8 mm FWHM) and entered into one-sample t tests against chance accuracy (Hebart et al., 2014, 2016). Whole-brain cluster inference was performed in the same manner as in the univariate analysis. Visualizations were made with Surf Ice (https://www.nitrc.org/projects/surfice/). Before decoding, for JR activity pattern classification, we modeled two regressors of interest per run focused on the rating periods in the confidence and follow conditions. For classification of CLR activity patterns, we collapsed ratings 1 and 2 into a low-confidence regressor and ratings 3 and 4 into a high-confidence regressor to allow binary classification. The remaining parameters of no interest were specified as in the univariate case. For the CLR searchlight analysis, we used an exclusive mask of activity patterns associated with usage of the confidence scale, obtained from the successful cross-classification of button presses (1-2 vs 3-4) between the confidence and follow conditions, to eliminate low-level visuomotor confounds (see Fig. 4D).
In independent across-domain classifications, we used the runwise beta images reflecting JR and CLR activity as pattern vectors in a linear support vector classification model (as implemented in LIBSVM). We assigned each vector from each domain a label corresponding to the classes confidence (1) and follow (−1) in the JR analysis and low confidence (−1) and high confidence (1) in the CLR analysis. We trained an SVM with the vectors from one domain (three per class, six in total) and tested the decoder on the six vectors from the other domain (and vice versa) (see Fig. 4A, left), obtaining a mean classification accuracy value for each of these two-way cross-classifications.
For within-domain classifications, we ran independent leave-one-run-out cross-validations for each domain on JR activity patterns (confidence vs follow) and CLR activity patterns (low vs high confidence). Pattern vectors from two of the three runs in each domain were used to train an SVM to predict the same classes in the vectors from the left-out run. We compared the true labels of the left-out run with the labels predicted by the model and iterated this process for the other two runs to calculate a mean cross-validated accuracy independently for each domain (see Fig. 4A, right).
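The leave-one-run-out scheme can be sketched as follows. This Python illustration uses a simple nearest-centroid classifier as a stand-in for the linear SVM (the cross-validation logic, not the classifier, is the point of the example), and all names are our own:

```python
import numpy as np

def nearest_centroid_predict(train_X, train_y, test_X):
    """Stand-in linear classifier: assign each test pattern to the class
    whose training-set mean pattern (centroid) is nearest."""
    centroids = {c: train_X[train_y == c].mean(axis=0)
                 for c in np.unique(train_y)}
    classes = np.array(sorted(centroids))
    d = np.stack([np.linalg.norm(test_X - centroids[c], axis=1)
                  for c in classes])
    return classes[np.argmin(d, axis=0)]

def leave_one_run_out(X, y, runs):
    """Train on all runs but one, test on the held-out run, and average
    accuracy over folds (one fold per run)."""
    accs = []
    for run in np.unique(runs):
        train, test = runs != run, runs == run
        pred = nearest_centroid_predict(X[train], y[train], X[test])
        accs.append(np.mean(pred == y[test]))
    return float(np.mean(accs))
```

Holding out whole runs (rather than single trials) respects the temporal structure of fMRI data: beta images within a run share noise, so run-wise folds avoid optimistic accuracy estimates.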
We also tested the ability of confidence-related activity patterns to predict objective performance in the absence of confidence reports. We used a GLM that modeled low versus high confidence trials with a regressor focused on the rating period and incorrect versus correct follow trials with a regressor focused on the decision period (i.e., from stimulus onset to subjects' type I response). We performed a cross-classification analysis in which a decoder trained on confidence trials (low vs high confidence) was tested on pattern vectors from follow trials (incorrect vs correct) and vice versa (collapsed across domain). This confidence-objective performance generalization score was compared with a leave-one-run-out cross-validation analysis decoding low versus high confidence on confidence trials only (collapsed across domain). Together, these scores characterize whether a particular set of patterns is specific to confidence or also generalizes to predict objective performance (Cortese et al., 2016) (see Fig. 5).

Individual differences
Metacognitive efficiency scores (log meta-d'/d') for each subject were estimated independently for the perceptual and memory tasks, together with a single score collapsed across domains. These scores were inserted as covariates in second-level analyses of within-perception, within-memory, and across-domain classifications of confidence-level-related activity, respectively, to assess the parametric relationship between metacognitive efficiency and decoding success.

Results
We analyzed the data from 24 subjects who underwent hemodynamic neuroimaging while performing two-alternative forced-choice discrimination tasks in perceptual and memory domains (Fig. 1A). In the perceptual task, subjects were asked to indicate the brighter of two stimuli (words or abstract shapes). In the memory task, subjects were asked to memorize exemplars of the same stimulus types and then select the previously learned stimulus from two stimuli presented on each trial. In half of the trials ("confidence" condition), subjects performed a metacognitive evaluation after the discrimination task by rating their confidence in the correctness of their response by selecting a number on a scale of 1 to 4 (1 = not confident; 4 = very confident). To differentiate metacognitive-related activity from visuomotor activity engaged by use of the confidence scale, in the other half of trials ("follow" condition), subjects were asked to respond according to a highlighted number without evaluating confidence in their response. To avoid stimulus-type confounds, two different types of stimuli, words and abstract shapes, were used in both tasks.

Behavior
We first compared task performance, measured by percentage of correct responses, across condition, task, and stimulus type. A 2 × 2 × 2 repeated-measures ANOVA (confidence/follow × perception/memory × shapes/words) showed that performance was well matched across conditions (confidence vs follow) (F(1,23) = 3.036, p = 0.095). None of the four paired t tests (domain × stimulus) comparing performance between the confidence and follow conditions returned a significant difference (p > 0.05). In the remainder of the behavioral analyses, we focused on the confidence condition.

Figure 1. Task design and performance results. A, Subjects performed two-alternative forced-choice discrimination tasks about perception and memory. In perception blocks, subjects selected the brighter of two stimuli. Memory blocks started with an encoding period, after which subjects indicated on each trial which of two stimuli had appeared during the encoding period. Abstract shapes and words were used as stimuli in both tasks. In confidence blocks, subjects rated their confidence and, in follow blocks, they pressed the highlighted number. B, Percentage of correct responses per block type in the confidence condition. Each marker represents a subject. C, Mean percentage of correct responses by domain, averaged over subjects and stimulus types. Dotted lines indicate chance performance. Bars indicate SEM. n.s., not significant; P, perception; M, memory.

Matching performance across stimulus type was more challenging because subjects' memory for words was expected to be considerably higher than that for abstract shapes based on pilot data (see Materials and Methods for details). Instead, we aimed to match subjects' performance independently for each stimulus type across task domains by titrating the difficulty of the perceptual task to approximate the performance expected for the corresponding stimulus type in the memory task (shapes: perceptual M = 73%, memory M = 67%; words: perceptual M = 81%, memory M = 89%; Fig. 1B). Critically, this ensured that performance was matched across task domains when averaging over stimulus types across participants (perceptual: M = 77%, memory: M = 78%; paired t test t(23) = 0.38, p = 0.70; Fig. 1C). A 2 × 2 repeated-measures ANOVA of performance in the confidence condition (perception/memory × shapes/words) confirmed there was no main effect of domain (F(1,23) = 0.15, p = 0.702). However, we observed a main effect of stimulus type due to greater overall performance on words (F(1,23) = 75.69, p = 9.87 × 10⁻⁹) and a domain × stimulus interaction due to a greater difference in performance between shapes and words in the memory compared with the perception task (F(1,23) = 16.74, p = 0.00045). Subjects were faster providing type I responses in perceptual trials (M = 636 ms) than in memory trials (M = 1222 ms). There was also a small difference in reaction times between shape (M = 967 ms) and word (M = 892 ms) trials. A 2 × 2 repeated-measures ANOVA confirmed a main effect of domain (F(1,23) = 367, p = 1.23 × 10⁻¹⁵) driven by slower reaction times in the memory task. There was also a main effect of stimulus type on response time (F(1,23) = 8.95, p = 0.006), as well as a significant domain × stimulus interaction due to a greater difference in reaction times between shapes and words in the memory compared with the perception task (F(1,23) = 5.82, p = 0.024).
As expected, subjects gave higher confidence ratings after correct decisions than after incorrect decisions (Fig. 2A) and mean confidence ratings were similar across task domains (perceptual M = 2.62, memory M = 2.47; paired t test t(23) = 1.26, p = 0.22). Reaction times for confidence ratings did not differ between domains (perceptual M = 518 ms, memory M = 516 ms; paired t test t(23) = 0.16, p = 0.87). We next estimated log(meta-d'/d'), a metacognitive efficiency measure derived from signal detection theory that assays the degree to which confidence ratings distinguish between correct and incorrect trials (Maniscalco and Lau, 2012; Fleming and Lau, 2014; Fleming, 2017). We used hierarchical Bayesian estimation to incorporate subject-level uncertainty in group-level parameter estimates (Fleming, 2017). Metacognitive efficiency in the perceptual task was significantly lower than in the memory task (posterior probability of a difference ≈ 1; for details, see Fig. 2B and the Materials and Methods), consistent with previous findings. Metacognitive efficiency above optimality (meta-d' = d') in memory trials suggests that subjects had better metacognition than expected given their task performance, whereas the suboptimal metacognitive efficiency in perceptual trials suggests that subjects had worse metacognition than expected given their task performance (assuming an ideal observer in both cases). We did not find a correlation between subjects' individual metacognitive efficiency scores in the perceptual and memory domains (r(22) = −0.076; p = 0.72; Fig. 2C). We also evaluated the correlation coefficient within a hierarchical model of meta-d', which takes into account uncertainty in subject-level model fits (Fleming, 2017). The 95% highest-density interval of the posterior correlation coefficient overlapped zero in this analysis (ρ = 0.205; HDI = [−0.358, 0.826]), also indicating a dissociation between domains.
We next estimated metacognitive efficiency separately for each stimulus type (Fig. 2D). A 2 × 2 repeated-measures ANOVA (perception/memory × shapes/words) indicated that metacognitive efficiency was greater for memory than perception (F(1,23) = 22.44, p = 8.97 × 10⁻⁵). Importantly, there was no main effect of stimulus type (F(1,23) = 0.015, p = 0.902) and no interaction between domain and stimulus type (F(1,23) = 2.835, p = 0.106). To further assess potential covariation between metacognitive abilities in each domain, we calculated for each subject a domain-generality index (DGI) that quantifies the similarity between the metacognitive efficiency scores in the two domains. Lower DGI scores indicate more similar metacognitive efficiencies between domains (DGI = 0 indicates identical scores). Mean DGIs for shapes (1.42), words (0.66), and collapsed by stimulus type (0.95) were all greater than zero (Fig. 2D). Metacognition for words was behaviorally more stable across domains than for shapes (smaller DGI; paired t test: t(23) = 2.86; p = 0.009). Together, these results suggest domain-specific constraints on metacognitive ability.
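The DGI can be computed per subject as follows. This minimal sketch assumes the index is the absolute difference between the two domains' metacognitive efficiency scores, which is consistent with the description above (DGI = 0 for identical scores, lower values for more similar scores) but is our reconstruction rather than a formula stated in this text:

```python
def domain_generality_index(eff_perception, eff_memory):
    """Assumed DGI: absolute difference between per-domain metacognitive
    efficiency scores (e.g., log(meta-d'/d') per domain).
    0 = identical efficiencies across domains; larger values indicate
    more domain-specific metacognition."""
    return abs(eff_perception - eff_memory)
```

A paired comparison of per-subject DGIs (e.g., shapes vs words, as above) then tests whether metacognition generalizes more readily for one stimulus type than the other.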

fMRI analyses
We next turned to our fMRI data to assess the overlap between the neural substrates engaged when metacognitive judgments are made during perceptual and memory tasks. A full understanding of the neural substrates of metacognition requires independently examining the process of engaging in a metacognitive task and the level of confidence expressed by the subject (Chua et al., 2014). To this end, in both univariate and multivariate analyses, we focused on two distinct features of metacognition-related activity. First, we assessed brain regions engaged in judgment-related (JR) activity (i.e., the difference between confidence trials requiring a metacognitive judgment and the follow condition). Second, we assessed brain regions engaged in confidence-level-related (CLR) activity. In univariate CLR analyses, we focused on the parametric relationship between confidence ratings (1-4) and neural activity. In multivariate CLR analyses, we collapsed ratings 1 and 2 into a low-confidence category and ratings 3 and 4 into a high-confidence category to allow binary classification of activity patterns.
Univariate results

JR activity. In standard univariate analyses, we found elevated activity in dACC/pre-SMA, bilateral insulae, and superior and middle frontal gyri when contrasting the confidence condition against the follow condition (collapsed across domains), consistent with previous findings (Fleming et al., 2012b) (Fig. 3A). There were no significant clusters of activity in the reverse contrast (follow > confidence). Splitting the data by domain (Table 1), an interaction contrast, (memory confidence > memory follow) > (perception confidence > perception follow), revealed significant clusters of activity in middle cingulate gyrus, left insula, PCUN, left hippocampus, and cerebellum (Fig. 3B, blue). No significant clusters were found in the reverse interaction contrast. In a conjunction analysis, elevated activity for the confidence > follow contrast was observed across both perception and memory trials in anterior cingulate cortex and right insula (Fig. 3B, green).
CLR activity. We next investigated the parametric relationship between confidence level and neural activity. Collapsing across domains, we found that activity in the left precentral and postcentral gyri, the posterior midline, ventral striatum, and ventromedial PFC (vmPFC) correlated positively with confidence ratings (Fig. 3E, hot colors). We also replicated negative correlations between confidence and activation in dACC/pre-SMA, parietal cortex, and bilateral PFC that have been reported in several previous studies. When testing for differences between these parametric regressors by domain (Table 2), a memory > perception contrast revealed a significant cluster of activity in right parietal cortex (Fig. 3F, blue), whereas there was no significant activity in the perception > memory contrast. Shared positive correlations between confidence and activity in perception and memory trials were found in ventral striatum and in left precentral and postcentral gyri, the latter consistent with use of the right hand to provide confidence ratings (conjunction analysis; Fig. 3F, green). Shared negative correlations with confidence were found in regions of right dorsolateral PFC and medial PFC, overlapping with pre-SMA (Fig. 3F, yellow).

Figure 3. fMRI univariate analysis results. A-D, JR activity. A, Whole-brain analysis of significant activation in the confidence > follow contrast (collapsed by domain); there were no significant clusters in the follow > confidence contrast. B, (Memory confidence > memory follow) > (perception confidence > perception follow) interaction contrast (blue). There were no significant clusters in the reverse contrast. The conjunction of the memory confidence > memory follow and perception confidence > perception follow contrasts is indicated in green. C, Spherical binary masks of four a priori ROIs (1 = dACC/pre-SMA, 2 = L rlPFC, 3 = R rlPFC, 4 = PCUN) and an ROI in the ventricles (5) used as a control region in multivariate analyses (Fig. 4). D, Estimated mean beta values for JR activity by domain in the main four ROIs displayed in C. E-G, CLR activity. E, Whole-brain analysis of activity parametrically modulated by level of confidence (collapsed by domain); hot colors indicate a positive correlation with confidence and cool colors a negative correlation. F, Memory > perception contrast (blue) testing for differences in the parametric effect of confidence by domain; there were no significant clusters in the perception > memory contrast. A conjunction analysis revealed shared activity that was positively (green) and negatively (yellow) correlated with confidence level in both domains. G, Estimated mean beta values of CLR activity in the main four ROIs displayed in C. All displayed whole-brain activations are significant at a cluster-defining threshold of p < 0.001, corrected for multiple comparisons at p_FWE < 0.05, except for conjunction analyses, in which we computed the intersection of two independent maps thresholded at p < 0.001, uncorrected. Images are displayed at p < 0.001. Graded color bars reflect T-statistics. Error bars indicate SEM. ***p < 0.001; **p < 0.01; *p < 0.05. L, Left; R, right; P, perception; M, memory.
Complementing the ROI analysis of JR activity, we performed an ROI analysis of CLR activity that recapitulated the whole-brain results. We observed negative relationships between confidence and activity in dACC/pre-SMA and positive relationships in PCUN. Importantly, no significant differences in the parametric effect of confidence were found between domains in any of our a priori ROIs (Fig. 3G; paired t tests: dACC/pre-SMA, t(23) = −0.47, p = 0.643; left rlPFC, t(23) = 0.23, p = 0.820; right rlPFC, t(23) = 1.62, p = 0.119; PCUN, t(23) = 0.56, p = 0.583). Together with the lack of marked domain-specific differences in confidence-related activity at the whole-brain level, these results suggest an absence of domain specificity in confidence-related activity. However, a lack of difference between univariate activation profiles is not necessarily conclusive. For instance, differences in confidence level may be encoded in fine-grained spatial patterns of activity even when overall BOLD activity is evenly distributed across confidence levels (Cortese et al., 2016). Similarly, whereas metacognition-related activity may show similar overall levels of activation across tasks, distributed activity patterns in frontal and parietal areas may carry distinct task-specific information (Hebart et al., 2016; Cole et al., 2016). We next turned to multivariate analysis methods, which are sensitive to differences in spatial activity patterns, to test this hypothesis.

Multivariate results
We performed a series of MVPAs (Fig. 4A) focused on both JR activity patterns and CLR activity patterns.
ROI analysis of JR activity patterns. If metacognitive judgments are based on domain-general processes (i.e., shared across perceptual and memory tasks), then a decoder trained to classify JR activity patterns in perceptual trials should accurately discriminate JR activity patterns when tested on memory trials (and vice versa). Alternatively, domain-specific activity profiles would be indicated by significant within-domain classification of JR activity patterns in the absence of across-domain transfer. To adjudicate between these hypotheses, we performed a support vector machine (SVM) decoding analysis using as input vectors the runwise beta images pertaining to confidence and follow trials obtained from a GLM (12 input vectors in total). For within-domain classification, we used standard leave-one-run-out cross-validation independently for each domain, and we tested for across-domain generalization using a cross-classification analysis (see Materials and Methods for details). Chance classification in both analyses was 50%.
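The within-domain versus across-domain decoding logic can be sketched with scikit-learn. Synthetic Gaussian patterns stand in for the runwise beta images, and all sizes (runs, voxels) and the simulated effect size are illustrative assumptions, not the study's actual dimensions or pipeline.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
n_runs, n_vox = 6, 50  # illustrative sizes only

def simulate_domain(effect=0.8):
    """Synthetic runwise beta patterns: one follow (label 0) and one
    confidence (label 1) pattern per run, offset by `effect` per voxel."""
    X, y, runs = [], [], []
    for run in range(n_runs):
        for label in (0, 1):
            X.append(rng.normal(label * effect, 1.0, n_vox))
            y.append(label)
            runs.append(run)
    return np.array(X), np.array(y), np.array(runs)

Xp, yp, runs_p = simulate_domain()  # "perception"
Xm, ym, runs_m = simulate_domain()  # "memory"
clf = SVC(kernel="linear")

# Within-domain: leave-one-run-out cross-validation
within_p = cross_val_score(clf, Xp, yp, groups=runs_p,
                           cv=LeaveOneGroupOut()).mean()

# Across-domain: train on one domain, test on the other, and average
across = (clf.fit(Xp, yp).score(Xm, ym) +
          clf.fit(Xm, ym).score(Xp, yp)) / 2
```

Because the simulated confidence-versus-follow effect is shared across the two "domains" here, both within- and across-domain accuracy land well above the 50% chance level; a domain-specific pattern would instead show high within-domain but near-chance across-domain accuracy.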
As a control, classification accuracy in the ventricles was not different from chance (across-domain: t(23) = 0.66, p = 0.52; within-domain: t(23) = 1.04, p = 0.31). In contrast, within-domain classification accuracy differed significantly from across-domain classification accuracy in dACC/pre-SMA (t(23) = 2.88, p = 0.008) and right rlPFC (t(23) = 2.24, p = 0.035), suggesting that the patterns of activity that distinguish metacognitive judgments from the visuomotor control condition in one domain are distinct from the analogous patterns in the other domain. These results are consistent with the hypothesis that metacognitive judgments recruit domain-specific patterns of cortical activity in PFC.

Searchlight analysis of JR activity patterns. We ran a similar decoding analysis using an exploratory whole-brain searchlight, obtaining a classification accuracy value per voxel (Hebart et al., 2014). Consistent with our ROI results, we observed significant within-domain classification in large swathes of bilateral PFC for both perception (red) and memory (blue) (Fig. 4C, Table 3). Within-perception classification was also successful in parietal regions, the PCUN in particular, and within-memory activity patterns were classified accurately in occipital regions. We also identified clusters showing significant across-domain generalization (yellow) in dACC, pre-SMA, SFG (BA9), supramarginal gyrus (BA40), and bilateral IFG/insula, consistent with univariate results (Fig. 3A).
ROI analysis of CLR activity patterns. We next investigated whether confidence is encoded in a domain-general or domain-specific fashion by applying a similar approach to discriminate low versus high confidence trials. In this case, ROI univariate analyses did not reveal any differences in confidence-related activity between domains (Fig. 3G). We hypothesized that if confidence level is encoded by domain-general neural activity patterns, then it should be possible to train a decoder to discriminate low (1-2) from high (3-4) confidence rating patterns in the perceptual task and then accurately classify confidence patterns in the memory task (and vice versa). In the absence of across-domain classification, significant within-domain classification is indicative of domain-specific CLR activity patterns. ROI cross-classifications and cross-validations were performed as above (Fig. 4A). Two subjects did not provide ratings for one of the classes in at least one run and were excluded from the main analysis to avoid entering unbalanced training data into the classifier; including these subjects did not change the main result.

Figure 4. A, Left, Across-domain cross-classification design. Pattern vectors (runwise beta images) from one domain were used to train an SVM decoder on two classes, which was then tested on the same two classes using vectors from the other domain (and vice versa). Classification of low (L) and high (H) confidence levels is illustrated. Right, Within-domain classification design. Pattern vectors of two classes (e.g., low and high confidence) from one domain were used to train a decoder in a leave-one-run-out design that was then tested on the left-out pair; the process was iterated three times to test pairs from every run. An identical, independent cross-validation was performed on vectors from the other domain. B, C, JR activity patterns. B, ROI results for across-domain (yellow) and mean within-domain (red-blue stripe) classification accuracy of confidence versus follow trials. C, Searchlight analysis for the same classifications as in B. D, Low-level visuomotor mask used in F (see main text and Materials and Methods for details). E, F, CLR activity patterns. E, Low versus high confidence classification accuracy results. F, Searchlight analysis for the same classifications as in E, exclusively masked for visuomotor-related activity patterns. Bars in B and E indicate means and error bars indicate SEM. Dashed lines indicate chance classification (50%). Diamonds and circles indicate mean independent classification in perception and memory trials, respectively. White diamonds/circles indicate classification significantly different from chance, Bonferroni corrected. All clusters in C and F are significant at a cluster-defining threshold of p < 0.001, corrected for multiple comparisons at p_FWE < 0.05. Image is displayed at p < 0.001. Color bars indicate T-scores. A, anterior; P, posterior. ***p ≤ 0.001; **p ≤ 0.01; *p < 0.05; all one-sample t tests are Bonferroni corrected.
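The collapsing of the 4-point scale into two classes, and the exclusion rule for subjects who lack a class in some run, can be made concrete with small helpers; the function names are ours, not from the paper's analysis code.

```python
import numpy as np

def binarize_confidence(ratings):
    """Collapse 4-point ratings into low (1-2) -> 0 and high (3-4) -> 1."""
    return (np.asarray(ratings) >= 3).astype(int)

def usable_for_decoding(ratings_by_run):
    """True only if every run contains both low and high ratings, so the
    classifier never receives single-class (unbalanced) training data."""
    return all(len(set(binarize_confidence(run))) == 2
               for run in ratings_by_run)

labels = binarize_confidence([1, 2, 3, 4])        # -> [0, 0, 1, 1]
ok = usable_for_decoding([[1, 4, 2], [3, 2, 4]])  # both classes in each run
bad = usable_for_decoding([[1, 2, 2], [3, 4, 4]]) # each run lacks a class
```

Excluding rather than resampling such subjects keeps the training sets balanced without introducing duplicated patterns, at the cost of a slightly smaller sample.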
Searchlight analysis of CLR activity patterns. We ran a similar decoding analysis of confidence level using an exploratory whole-brain searchlight, obtaining a classification accuracy value per voxel. Here, we leveraged the follow trials as a control for low-level visuomotor confounds by exclusively masking out activity patterns associated with use of the confidence scale (Fig. 4D). The remaining activity patterns can therefore be ascribed to CLR signals that do not encode visual or motor features of the rating (Fig. 4F, Table 4). We found widespread across-domain classification of confidence (yellow) in a predominantly midline network, including a large cluster encompassing dACC/pre-SMA, vmPFC, and ventral striatum. Domain-specific CLR activity patterns were successfully decoded from right PFC (insula, IFG, BA9, BA46) in memory trials (blue) and were also independently decoded in both domains from dACC/pre-SMA.
Generalization of CLR activity to objective performance. To further address how confidence judgments relate to activity patterns, we examined the relationship between objective task accuracy and confidence. Previous work suggested that the neural bases (and associated activation patterns) of confidence and performance may be partly distinct (Rounis et al., 2010; Cortese et al., 2016). Specifically, we tested whether a decoder trained on CLR activity patterns could classify objective performance-related activity patterns (correct/incorrect) on follow trials (and vice versa) in a cross-classification analysis (collapsed across domains). This analysis confirmed that activity patterns in dACC/pre-SMA (t(21) = 2.38, p = 0.027) and right rlPFC (t(21) = 2.64, p = 0.015) could predict objective accuracy levels in follow trials above chance (Fig. 5, light gray; uncorrected), but not in left rlPFC (t(21) = 1.49, p = 0.15) or PCUN (t(21) = −0.46, p = 0.65). We then compared these decoding scores with a leave-one-run-out cross-validation decoding analysis of low versus high confidence on confidence trials only (collapsed by domain; Fig. 5, dark gray; uncorrected). Consistent with the analyses reported in Figure 4E (yellow), this decoder was unable to classify domain-general confidence patterns of activity in right rlPFC (t(21) = 1.00, p = 0.33), but was above chance in dACC/pre-SMA (t(21) = 3.80, p = 0.001), left rlPFC (t(21) = 2.26, p = 0.034), and PCUN (t(21) = 2.56, p = 0.018). Critically, in PCUN, confidence classification was significantly greater than confidence-performance generalization, which was at chance (paired t test, t(21) = 2.16, p = 0.043). This result indicates that confidence-related patterns in PCUN do not generalize to predict objective performance, consistent with partly distinct coding of information relevant to task performance and confidence.
In contrast, in dACC/pre-SMA, general confidence level and performance could be predicted from common patterns of activation.
Metacognitive efficiency and CLR activity classification. Finally, we reasoned that, if confidence-related patterns of activation contribute to metacognitive judgments, then they may also track individual differences in metacognitive efficiency. To test for such a relation, we investigated whether individual metacognitive efficiency scores, collapsed across domains and independently in each domain, predicted searchlight classification accuracy of confidence level. We did not find any significant clusters after whole-brain correction for multiple comparisons in the domain-general, within-perception, or within-memory analyses. However, memory metacognitive efficiency predicted memory confidence classification accuracy in a cluster in right PCUN and left precentral gyrus (p < 0.001, uncorrected), whereas perceptual metacognitive efficiency predicted perceptual confidence classification accuracy in left middle frontal gyrus, right vmPFC, bilateral temporal gyri, and cerebellum (p < 0.001, uncorrected). Previous studies (McCurdy et al., 2013) have reported similar relations between perceptual and memory metacognitive efficiency and individual differences in the structure of prefrontal and parietal cortex, respectively. Although we do not interpret these findings further here, for completeness, second-level unthresholded statistical images of these analyses were uploaded to Neurovault to inform future work (https://neurovault.org/collections/3232/).

Discussion
Confidence estimates allow performance on a cognitive task to be compared across a range of different scenarios (de Gardelle and Mamassian, 2014). Such estimates must also carry information about the task context if they are to be used in decision making. Here, we investigated the domain generality and domain specificity of the representations that support metacognition of perception and memory.
Unlike in previous studies (McCurdy et al., 2013), subjects' performance was matched between domains for two different types of stimulus, thereby eliminating potential performance and stimulus confounds. Subjects' confidence ratings were also matched between domains and followed the expected pattern of higher ratings after correct decisions than after incorrect decisions. Metacognitive efficiency scores were not correlated between tasks, and metacognitive efficiency in the memory task was superior to that in the perceptual task. Using univariate and multivariate analyses, we demonstrated the existence of both domain-specific and domain-general metacognition-related activity during perceptual and memory tasks. We report four main findings and discuss each in turn.
First, we obtained convergent evidence from both univariate and multivariate analyses that a cingulo-opercular network centered on dACC/pre-SMA encodes a generic signal predictive of confidence level and objective accuracy across memory and perceptual tasks. Previous studies of metacognition have implicated the cingulo-opercular network in tracking confidence level (Fleck et al., 2006; Fleming et al., 2012b; Hilgenstock et al., 2014; Hebart et al., 2016). However, we go beyond these previous studies by providing evidence that these signals generalize to predict confidence across two distinct cognitive domains. This finding is consistent with posterior medial frontal cortex acting as a nexus for monitoring the fidelity of generic sensorimotor mappings, building on previous findings that error-related event-related potentials originating from this region are sensitive to variation in subjective certainty (Scheffers and Coles, 2000; Boldt and Yeung, 2015). Activity in dACC/pre-SMA was also consistently elevated by the requirement for a metacognitive judgment (Fleming et al., 2012b). However, the results regarding whether the pattern of these increases generalizes across tasks were inconclusive. Whole-brain searchlight analysis revealed successful cross-classification of these activity patterns in dACC and insular regions, consistent with the univariate results; these patterns, however, did not generalize across tasks in a predefined region of interest centered on dACC/pre-SMA. Although both dACC/pre-SMA and PCUN showed significant domain-general decoding of confidence, in PCUN these patterns did not generalize to also predict changes in objective accuracy. Whereas performance and subjective confidence may both depend on similar decision variables (Kiani and Shadlen, 2009; Fleming and Daw, 2017), behavioral dissociations between these quantities are also consistent with distinct internal states contributing to decisions and confidence ratings (Busey et al., 2000; Fleming and Daw, 2017). For instance, hierarchical models of confidence formation suggest that a downstream network "reads out" decision-related information in a distinct neural population (Insabato et al., 2010). The observed lack of cross-classification in PCUN (Fig. 5) is consistent with the recent observation of distinct neural patterns of activity pertaining to confidence and first-order performance revealed through multivoxel neurofeedback in frontal and parietal regions (Cortese et al., 2016).

Figure 5. Generalization of confidence-related activity to objective accuracy. Light gray bars denote mean cross-classification accuracy obtained from training a decoder on CLR activity and testing it on objective accuracy (correct/incorrect) activity patterns in the follow condition (and vice versa). Dark gray bars denote decoding accuracy for a leave-one-out cross-validation of low versus high confidence on confidence trials only (collapsed by domain). Bars indicate group means and SEM. Dotted line indicates chance level. *p < 0.05; ***p < 0.001.
Second, in lateral anterior frontal cortex, we found activity patterns that tracked both the requirement for a metacognitive judgment and the level of confidence. Large swathes of lateral PFC distinguished activity patterns pertaining to metacognitive judgments that were specific to each domain. Critically, however, confidence-related activity patterns were selective for domain in right rlPFC (Fig. 4E): they differed according to whether the subject was rating confidence about perception or memory. Such signals may support the "tagging" of confidence with contextual information, thereby facilitating the use of confidence for behavioral control (Donoso et al., 2014; Purcell and Kiani, 2016). The identity of perceptual and memory tasks can be reliably decoded from activity in right PFC neural populations (Mendoza-Halliday and Martinez-Trujillo, 2017), consistent with the possibility that this contextual information is recruited during confidence rating. It is possible that anterior prefrontal regions combine generic confidence signals with domain-specific information to fine-tune decision making and action selection in situations in which subjects need to switch regularly between tasks or strategies on the basis of their reliability (Donoso et al., 2014). An alternative hypothesis, also compatible with our data, is that lateral PFC first estimates the confidence level specific to the current task, which is then relayed to medial areas to recruit the appropriate resources for cognitive control in a task-independent manner. Processing dynamics may also unfold simultaneously in both areas. These possibilities echo a longstanding debate in the cognitive control literature on the relative primacy of medial and lateral PFC in the hierarchy of control (Kerns et al., 2004; Tang et al., 2016). Further inquiry, including the development of computational models of the hierarchical or parallel functional coupling of these networks in metacognition, is needed.
Third, we obtained convergent evidence that PCUN plays a specific role in metamemory judgments. In univariate fMRI analyses, we found that the requirement for a metacognitive judgment recruited our preestablished region of interest centered on PCUN on memory, but not perceptual, trials (Fig. 3D). Individual metacognitive efficiency scores in memory trials predicted classification accuracy in a more dorsal precuneal region, whereas individual differences in metacognitive efficiency scores in perceptual trials predicted classification accuracy in vmPFC (albeit at uncorrected thresholds). These findings are consistent with medial parietal cortex making a disproportionate contribution to memory metacognition (Simons et al., 2010; Baird et al., 2013; McCurdy et al., 2013) and offer a potential explanation for the decrease in perceptual, but not memory, metacognitive efficiency seen in patients with frontal lesions. However, we do not conclude that PCUN involvement is specific to metamemory. We note that univariate negative correlations with confidence were also found on perceptual trials, and that multivariate classification results in PCUN indicated the presence of both perceptual- and memory-related signals. This dual involvement of PCUN in perception and memory metacognition is consistent with previous studies suggesting a relationship between PCUN structure and visual perceptual metacognition (McCurdy et al., 2013).
Fourth, we found in both univariate and multivariate whole-brain analyses that domain-general signals in the ventral striatum and vmPFC (including subgenual ACC) were modulated by confidence level. These results are compatible with previous reports of activity in the ventral striatum correlating positively with confidence (Daniel and Pollmann, 2012; Hebart et al., 2016; Guggenmos et al., 2016). Evidence of vmPFC encoding of confidence signals has been reported in connection with decision making and value judgments (De Martino et al., 2013; Lebreton et al., 2015). Our experimental design, however, does not allow us to disentangle whether the signals found in these regions pertain uniquely to confidence or whether they are entangled with implicit value and reward signals (e.g., the expected value of being correct). Future experiments that explicitly decouple reward from confidence are needed to resolve this issue.
In our experimental design, perception and memory blocks were interleaved across runs, which raises the question of whether the domain-specific neural substrates that we found would persist if subjects had to switch between tasks more often. Because of intertask "leaks" in confidence (Rahnev et al., 2015), whereby confidence in one task influences confidence ratings on the following task (or even the following trial), interleaving blocks of different tasks might favor the observation of more domain-general confidence-related patterns.
Our experimental design assumes that visual perception and memory are distinct domains. We acknowledge that distinguishing between cognitive domains or individuating perceptual modalities is not straightforward (Macpherson, 2011). For instance, different modalities (e.g., vision, audition, touch), different aspects within a single modality (e.g., motion and color within vision), or closely related modalities (e.g., visual perception vs visual short-term memory) could be part of a unified perceptual domain for metacognitive purposes. Recent findings suggest that metacognitive efficiency in one perceptual modality predicts metacognitive efficiency in others and that these modalities share electrophysiological markers (Faivre et al., 2018). Metacognitive efficiency is also correlated across vision and visual short-term memory, especially for features such as orientation (Samaha and Postle, 2017), and dACC and insula regions similar to those identified here show univariate confidence signals across both color and motion tasks in the visual domain (Heereman et al., 2015). However, it remains an open question whether more fine-grained, modality-specific patterns of metacognitive activity could be decoded using multivariate approaches. More research is needed on the neural architecture of metacognition in other cognitive domains and on whether this architecture changes in a graded or discrete fashion as a function of task or stimulus.
In summary, our results provide evidence for the coexistence of content-rich metacognitive representations in anterior PFC and generic confidence-related signals in frontoparietal and cingulo-opercular regions. Such an architecture may be appropriate for "tagging" lower-level feelings of confidence with higher-order contextual information to allow effective behavioral control. Previous studies have tended to draw conclusions about either domain-specific or domain-general aspects of metacognition. Here, we reconcile these perspectives by demonstrating that both domain-specific and domain-general signals coexist in the human brain, thus laying the groundwork for a mechanistic understanding of reflective judgments of cognition.