Abstract
It is clear that prior expectations shape perceptual decision-making, yet their contribution to the construction of subjective decision confidence remains largely unexplored. We recorded fMRI data while participants made perceptual decisions and confidence judgments, manipulating perceptual prior expectations while controlling for potential confounds of attention. Results show that subjective confidence increases as expectations increasingly support the decision, and that this relationship is associated with BOLD activity in right inferior frontal gyrus (rIFG). Specifically, rIFG is sensitive to the discrepancy between expectation and decision (mismatch), and higher mismatch responses are associated with lower decision confidence. Connectivity analyses revealed expectancy information to be represented in bilateral orbitofrontal cortex and sensory signals to be represented in intracalcarine sulcus. Together, our results indicate that predictive information is integrated into subjective confidence in rIFG, and reveal an occipital-frontal network that constructs confidence from top-down and bottom-up signals. This interpretation was further supported by exploratory findings that the white matter density of right orbitofrontal cortex negatively predicted its respective contribution to the construction of confidence. Our findings advance our understanding of the neural basis of subjective perceptual processes by revealing an occipitofrontal functional network that integrates prior beliefs into the construction of confidence.
SIGNIFICANCE STATEMENT Perceptual decision-making is typically conceived as an integration of bottom-up and top-down influences. However, perceptual decisions are accompanied by a sense of confidence. Confidence is an important facet of perceptual consciousness yet remains poorly understood. Here we implicate right inferior frontal gyrus in constructing confidence from the discrepancy between perceptual judgment and its prior probability. Furthermore, we place right inferior frontal gyrus within an occipitofrontal network, consisting of orbitofrontal cortex and intracalcarine sulcus, which represents and communicates relevant top-down and bottom-up signals. Together, our data reveal a role of frontal regions in the top-down processes enabling perceptual decisions to become available for conscious report.
Introduction
Perception is increasingly being seen as an active process, in which current or future sensory states are inferred from predictive information (Engel et al., 2001; Lee, 2002; Bar, 2007; Beck and Kastner, 2009; Fiser et al., 2010; Gilbert and Li, 2013). These predictions can be modeled in Bayesian terms as prior beliefs, which bias perceptual inference toward solutions that are a priori more likely in a given context (Bülthoff et al., 1998; Seriès and Seitz, 2013; Trapp and Bar, 2015). Predictions, or priors, can have striking effects on perception, especially under high sensory uncertainty. For example, ambiguous rotational motion can be subjectively disambiguated by prior exposure to rotation direction, resulting in the perception of a rotation direction despite none existing in the physical stimulus (Maloney et al., 2005). In laboratory conditions, these behavioral effects of prediction are typically accompanied by increases in BOLD amplitude, ERP amplitude and evoked gamma power, both over sensory (Egner et al., 2010; Wacongne et al., 2011; Jiang et al., 2013; Bauer et al., 2014; Kouider et al., 2015) and decision-related (Bubic et al., 2009) brain regions. These neural responses are typically referred to as a “prediction error” response profile and taken to represent the discrepancy between internal templates and perceptual content.
The perceptual content that forms the basis of our visual experience is accompanied by a degree of subjective confidence. Confidence reflects the estimated accuracy of a perceptual choice and can be seen as a gate for postperceptual processes, such as learning and belief-updating (Nassar et al., 2010; Yeung and Summerfield, 2012). However, while subjective confidence is an integral part of perceptual experience that is easy to probe in human subjects (Seth et al., 2008; Sandberg et al., 2010; Overgaard and Sandberg, 2012; Fleming and Lau, 2014; Wierzchoń et al., 2014), its construction remains poorly understood.
It is clear that confidence increases with evidence in support of the decision (Fetsch et al., 2014b; Gherman and Philiastides, 2015; Hebart et al., 2016). Decision and subjective confidence are thought to evolve together until the first-order, objective decision has been made (Ratcliff and Starns, 2009; Kepecs and Mainen, 2012). Accordingly, there exists strong evidence for a common sensory signal underlying both types of report (Kiani and Shadlen, 2009; Fetsch et al., 2014a; Kiani et al., 2014). However, influences on objective decisions are not always reflected in subjective aspects of decision-making. For example, evidence for the unchosen perceptual inference is underweighted in confidence judgments (Zylberberg et al., 2012; Maniscalco et al., 2016).
Although many studies have investigated the role of top-down prior expectations on objective decision-making, surprisingly little research has investigated the role of such priors on confidence. Converging behavioral evidence indicates that subjective confidence increases with prior evidence in favor of the associated choice (Meyniel et al., 2015a; Sherman et al., 2015). This suggests that the construction of confidence may involve a comparison process between decision and prior, yet the neural substrates accompanying this comparison remain unexplored.
Here we aimed to identify brain regions in which prior perceptual expectations are integrated into confidence judgments. Based on aforementioned previous work, we reasoned that confidence should be high when decisions are supported by prior knowledge (i.e., when the discrepancy between expectation and perceptual decision is low). We therefore sought to identify brain regions that: (1) are sensitive to both “prediction error” and confidence, and (2) in which prediction error and confidence are negatively correlated. In such a region, confidence would be associated with the mismatch between internal templates and perceptual contents.
We further hypothesized that regions found to integrate prior expectations into confidence judgments should be functionally connected with two information sources: one representing the decision evidence or sensory information and one that represents the prior expectation. As confidence increasingly depends on prior expectations, functional connectivity (FC) with the source of the priors should increase. Similarly, when confidence is less dependent on priors, FC with sensory regions should increase.
Materials and Methods
Participants.
The study was approved by the Brighton and Sussex Medical School Research Governance and Ethics Committee. Twenty-four healthy, English-speaking and right-handed subjects were tested (age 19–34 years, mean 25 years; 13 females). Data from 5 participants were excluded: 2 for whom thresholding failed (Gabor hit rate = 2%, visual search d′ = −0.1); one who revealed abnormal vision only after scanning (whose estimated contrast thresholds were accordingly >2 SD from the mean); one for excessive head movement in the scanner, such that their T1 scan was unusable; and one for failing to respond on 33% of trials (relative to a mean of 3%). This left 19 participants with normal or corrected-to-normal vision for analysis. All participants gave informed, written consent and were reimbursed £50 for their time.
Procedure.
The experiment was conducted over three sessions, at least 2 h apart (no participant completed all three on a single day). In Session 1, informed consent was obtained. Participants were trained on all tasks before scanning. This consisted of on-screen instructions, followed by a minimum of 10 practice trials of each task. Participants were encouraged to continue training until the task was well understood and response mappings learned. Next, three separate staircase procedures were run to equate detection accuracy across levels of attention and across participants. These were completed in the scanner without acquiring EPIs. Finally, two 17 min runs of experimental trials were completed while EPI scans were acquired.
Session 2 did not include a training component but was otherwise identical to Session 1. Session 3 consisted of 10 min for T1 acquisition, 15 min of retinotopy (data from which are not used in this paper), and, time permitting, one more experimental run (8 participants completed this). Two participants were only able to complete three runs in total, and two participants completed a total of six runs. This variation in number of runs was a result of scanner availability.
Once the three sessions had been completed, participants were compensated for their time and debriefed.
Experimental design.
The paradigm used in the present study was adapted from a previously reported design (Sherman et al., 2015). The visual display was identical in all sections of the sessions (training, staircase, and experimental). It consisted of a central visual search array and the presence or absence of a to-be-detected Gabor patch in the periphery (Fig. 1; see Trial sequence).
In experimental trials, the principal task was Gabor detection. Two factors were orthogonally manipulated: prior probability of Gabor presence and attention. Expectations were manipulated blockwise by manipulating the percentage of trials on which a Gabor was present within the block and informing participants of this probability: a 25% condition induced an expectation of Gabor absence; a 75% condition induced an expectation of Gabor presence; and a 50% condition acted as a control (flat prior). Attention was manipulated by instructing participants to either perform or ignore a visual search task, presented concurrently with the Gabor target. This task consisted of detecting target ‘T’s among an array of distracter ‘L’s. Performing both tasks concurrently diverted attention from the Gabor detection task, allowing us to separate effects of expectation from those of attention.
There were 12 trials in each condition and each condition occurred once per scanning run, in fully counterbalanced order. Participants were informed of both the expectation and attention condition before each experimental block began via the presentation of an instruction screen presented for 10 s (Fig. 1). Participants were instructed to always maintain fixation at a central cross.
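The block structure described above (two attention conditions crossed with three expectation conditions, 12 trials per block) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the assumption that the stated probability was realized exactly within each block (3, 6, or 9 Gabor-present trials of 12) is ours, and a random shuffle stands in for the full counterbalancing scheme, which is not specified in the text.

```python
import itertools
import random

ATTENTION = ("full", "diverted")
EXPECTATION = (0.25, 0.50, 0.75)   # stated probability of Gabor presence
TRIALS_PER_BLOCK = 12

def make_block(attention, p_present, rng):
    """Build one block of 12 trials for a given attention/expectation condition.

    Assumption: the stated probability is realized exactly within the block.
    """
    n_present = round(p_present * TRIALS_PER_BLOCK)   # 3, 6 or 9 of 12
    stimulus = ["present"] * n_present + ["absent"] * (TRIALS_PER_BLOCK - n_present)
    rng.shuffle(stimulus)
    return [{"attention": attention, "p_present": p_present, "gabor": s}
            for s in stimulus]

def make_run(rng):
    """Build one run: each of the 6 conditions appears once.

    Assumption: a random order stands in for the counterbalanced order.
    """
    conditions = list(itertools.product(ATTENTION, EXPECTATION))
    rng.shuffle(conditions)
    return [make_block(a, p, rng) for a, p in conditions]

run = make_run(random.Random(0))
```

Under these assumptions, each run contains 6 blocks and 72 trials.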
Trial sequence.
The trial sequence was identical for training, staircasing, and experimental trials and is shown in Figure 1. These sections differed only in task instructions and response prompts (see Experimental design). Trials began with a white fixation cross of random duration between 2.5 and 5 s. Next, a visual search array appeared, which consisted of seven letters (1.6° × 2.4°): all white, capital ‘L’s (50% chance), or a white, capital ‘T’ replacing an ‘L’ (50% chance). All letters were equidistant from fixation (eccentricity 4.3°) and took an independently random orientation. These were subsequently masked by a matching array of ‘F’s to increase task difficulty. In total, the visual search array was present for 1.1 s. The stimulus onset asynchrony (SOA) between target and masking arrays was titrated for each participant to achieve 79% accuracy (see Staircases).
On some trials, a near-threshold (∼4% contrast under full attention and ∼6% under diverted attention) peripheral (eccentricity 8°) Gabor patch (orientation = 135°; phase = 45° on 50% of trials and 225° on 50% of trials; spatial frequency = 3 cycles/°; Gaussian SD = 0.45°) was also presented. On these trials, the Gabor target appeared simultaneously with visual search array onset. To minimize attentional capture, the Gabor was presented over 0.6 s in a Gaussian envelope over time so that it had a gradual onset and offset. Stimulus contrast was titrated to equate performance across levels of attention and participants at 79% accuracy (see Staircases).
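To make the stimulus description concrete, the following sketch evaluates the contrast of such a Gabor at a single point in space and time. The spatial parameters are taken from the text. The temporal SD (`sigma_t_s`), the convention that the orientation angle describes the axis of luminance modulation, and the function name are assumptions made for illustration; the text does not specify them.

```python
import math

def gabor_contrast(x_deg, y_deg, t_s, contrast=0.04, orientation_deg=135.0,
                   phase_deg=45.0, sf_cpd=3.0, sigma_deg=0.45,
                   duration_s=0.6, sigma_t_s=0.1):
    """Signed contrast at (x, y) degrees from the Gabor centre, t seconds after onset.

    Assumptions: sigma_t_s (temporal SD) is not given in the text, and the
    orientation is taken to define the axis along which the grating varies.
    """
    if not 0.0 <= t_s <= duration_s:
        return 0.0
    theta = math.radians(orientation_deg)
    # Position along the axis of sinusoidal modulation
    u = x_deg * math.cos(theta) + y_deg * math.sin(theta)
    carrier = math.cos(2 * math.pi * sf_cpd * u + math.radians(phase_deg))
    # Gaussian window in space (SD = 0.45 deg) and in time (peak at mid-presentation)
    spatial = math.exp(-(x_deg ** 2 + y_deg ** 2) / (2 * sigma_deg ** 2))
    temporal = math.exp(-((t_s - duration_s / 2) ** 2) / (2 * sigma_t_s ** 2))
    return contrast * carrier * spatial * temporal
```

The temporal Gaussian is what gives the stimulus its gradual onset and offset: contrast is near zero at the start and end of the 0.6 s window and peaks at its midpoint.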
The interval between offset of the masking array and onset of response prompts was jittered during experimental trials. The aim here was to minimize motor cortex activity reflecting response anticipation. Jitter was randomly selected from the discrete values between 1.3 s and 3.1 s in steps of 0.3 s (i.e., 1.3, 1.6, 1.9, 2.2, 2.5, 2.8, or 3.1 s).
Response prompts were presented at the end of the trial. The first prompt asked whether the Gabor had been presented. “Absent”/“Present” responses were recorded by pressing the outer left/right key. This prompt was presented on all trials, except those of the visual search staircase procedure (only visual search performed). The second prompt asked whether participants guessed (inner left) or were confident (inner right) in their Gabor detection response (not presented during staircases). The third prompt was only presented on trials where participants performed the visual search task. This asked whether the visual search target ‘T’ was absent (outer left) or present (outer right). Response prompts remained onscreen for 2 s, and responses were coded as missed trials if no response was given within the allowed time.
Staircases.
Before each experimental session, three separate adaptive 1-up-3-down psychophysical staircase procedures (nine reversals) were completed in the scanner. The aim of these staircases was (1) to equate Gabor detection and visual search performance across subjects and (2) to equate Gabor detection performance across levels of attention. We did not equate Gabor detection performance across levels of expectation because expectation does not tend to affect detection sensitivity (Kok et al., 2012a; Morales et al., 2015; Sherman et al., 2015).
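A 1-up-3-down rule makes the task harder after three consecutive correct responses and easier after any error, so it converges on the stimulus level at which p³ = 0.5, that is, p ≈ 0.794, which corresponds to the 79% target accuracy. A minimal sketch of such a rule is given below. The class name, the fixed step size, and the bookkeeping of reversals are illustrative assumptions; the text does not describe the step rule actually used.

```python
class Staircase:
    """Sketch of a 1-up-3-down staircase over a stimulus level (contrast or SOA).

    Assumption: a fixed additive step; the actual step rule is not given in the text.
    """

    def __init__(self, start, step, n_reversals=9):
        self.level = start
        self.step = step
        self.n_reversals = n_reversals
        self.correct_run = 0
        self.last_direction = 0   # +1 = level last went up, -1 = last went down
        self.reversals = []       # levels at which the direction changed

    @property
    def done(self):
        return len(self.reversals) >= self.n_reversals

    def update(self, correct):
        """Register one response and return the level for the next trial."""
        direction = 0
        if correct:
            self.correct_run += 1
            if self.correct_run == 3:      # three in a row: make the task harder
                direction = -1
                self.correct_run = 0
        else:                              # any error: make the task easier
            direction = +1
            self.correct_run = 0
        if direction:
            if self.last_direction and direction != self.last_direction:
                self.reversals.append(self.level)
            self.level = max(0.0, self.level + direction * self.step)
            self.last_direction = direction
        return self.level
```

In use, `update` would be called once per trial until `done` is true (nine reversals in the procedure described here).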
Staircase trials were identical to experimental trials (see Trial sequence), except: there was no manipulation of attention or expectation; the Gabor was always present, but randomly oriented either 45° to the left or to the right; the Gabor task was left/right orientation discrimination instead of yes/no target detection; and confidence ratings were not requested.
We used orientation discrimination instead of Gabor detection so that the Gabor would be present on every trial, enabling effective titration. Although this procedure might not lead to precisely 79% correct on the Gabor detection task, performance on this task should still be equated across levels of attention and across subjects, which was the main goal of the staircasing procedure.
Staircase 1 titrated Gabor contrast to achieve 79% accuracy under full attention. Initial contrast was 1.5%. The visual search array was masked after 0.5 s. Participants were instructed to ignore the visual search array but still fixate centrally.
Staircase 2 titrated the SOA between the visual search array and masking array to set performance at 79% (in the visual search task). Initial SOA was 500 ms. Participants ignored the Gabor orientation task and only performed the visual search task. The (ignored) Gabor was presented at the contrast acquired in Staircase 1.
Staircase 3 titrated Gabor contrast to achieve 79% accuracy in Gabor orientation discrimination under diverted attention (i.e., under dual-task conditions). Here, participants performed both the Gabor and the visual search tasks. Initial contrast was set at that obtained in Staircase 1 (under single-task conditions), and visual search SOA was set at the value obtained in Staircase 2. Gabor contrast was titrated over the course of the staircase to obtain contrast thresholds under diverted attention.
Statistical analyses.
Gabor detection sensitivity and decision threshold were quantified by computing signal detection theoretic measures d′ and c, respectively. These are computed by classifying trials as hits (h), misses (m), false alarms (fa), or correct rejections (cr). Hit rate (HR) and false alarm rate (FAR) are defined as follows:
$$\mathrm{HR} = \frac{h}{h + m}, \qquad \mathrm{FAR} = \frac{fa}{fa + cr}$$
Where Z is the inverse cdf of the standard normal distribution, detection sensitivity d′ and decision threshold c are defined as follows:
$$d' = Z(\mathrm{HR}) - Z(\mathrm{FAR}), \qquad c = -\frac{Z(\mathrm{HR}) + Z(\mathrm{FAR})}{2}$$
Confidence was computed by calculating the proportion of trials on which each subject reported “confident.” We did not use the Type 2 signal detection theory measure of confidence threshold Type 2 C because it is an unprincipled measure (Galvin et al., 2003).
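For illustration, the standard equal-variance signal detection measures can be computed from trial counts as in the sketch below. The function name is hypothetical, and no correction is applied for hit or false alarm rates of exactly 0 or 1 (for which Z is undefined); how such cases were handled in the analysis is not stated in the text.

```python
from statistics import NormalDist

Z = NormalDist().inv_cdf   # inverse cdf of the standard normal distribution

def sdt_measures(h, m, fa, cr):
    """Return (d_prime, c) from counts of hits, misses, false alarms, correct rejections.

    Assumption: rates of exactly 0 or 1 are not corrected and will raise an error.
    """
    hr = h / (h + m)          # hit rate
    far = fa / (fa + cr)      # false alarm rate
    d_prime = Z(hr) - Z(far)
    c = -0.5 * (Z(hr) + Z(far))
    return d_prime, c
```

With this sign convention, a more liberal observer (one who says “yes” more often) has a lower value of c.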
Behavioral and follow-up statistical tests were run in JASP (Love et al., 2015). When the null hypothesis was predicted, Bayesian t tests and repeated-measures ANOVAs used the JASP default Cauchy prior (scale = 0.7), centered on zero. All results presented were robust to reasonable adjustments of this value. Bayes factors >1/3/10/100 are, respectively, interpreted as showing insensitive/moderate/strong/very strong evidence for the alternative hypothesis (Kass and Raftery, 1995). Bayes factors less than the reciprocal of these values are given the same labels but refer to the null hypothesis.
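The labeling scheme for Bayes factors can be written out explicitly, as in the sketch below. The function name is hypothetical, and we assume the Bayes factor is expressed as evidence for the alternative over the null, so that values below 1 favor the null.

```python
def interpret_bf(bf10):
    """Label a Bayes factor using the 1/3/10/100 cutoffs described in the text.

    Assumption: bf10 > 0 quantifies evidence for the alternative over the null.
    """
    favoured = "alternative" if bf10 >= 1 else "null"
    strength = bf10 if bf10 >= 1 else 1.0 / bf10   # fold values below 1 onto the same scale
    if strength > 100:
        label = "very strong"
    elif strength > 10:
        label = "strong"
    elif strength > 3:
        label = "moderate"
    else:
        label = "insensitive"
    return label, favoured
```

For example, a Bayes factor of 0.240 lies below 1/3 but above 1/10 and is therefore labeled as moderate evidence for the null.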
Unless otherwise stated, all repeated-measures ANOVA results met the assumption of sphericity. When sphericity was violated, corrected degrees of freedom and p values were used. The Greenhouse-Geisser correction was used for large violations (ε < 0.75) and the Huynh-Feldt correction for small violations (ε > 0.75).
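The choice of correction and its effect on the degrees of freedom can be sketched as follows. Both functions are hypothetical helpers: we treat ε as a single estimate, and assigning ε = 0.75 exactly to Huynh-Feldt is an arbitrary choice, since the text does not say how the boundary case was handled.

```python
def choose_correction(epsilon, cutoff=0.75):
    """Pick the sphericity correction from the estimated epsilon.

    Assumption: epsilon exactly at the cutoff is assigned to Huynh-Feldt.
    """
    return "Greenhouse-Geisser" if epsilon < cutoff else "Huynh-Feldt"

def corrected_df(epsilon, df_effect, df_error):
    """Both degrees of freedom are multiplied by epsilon; the F value is unchanged."""
    return epsilon * df_effect, epsilon * df_error
```

As an example, an effect with uncorrected degrees of freedom (2, 36) and ε ≈ 0.825 would be reported with the Huynh-Feldt correction and degrees of freedom of approximately (1.65, 29.7).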
MRI acquisition and preprocessing.
Functional T2*-sensitive EPIs were acquired on a Siemens Avanto 1.5T scanner. Axial slices were tilted to minimize signal dropout from frontal and occipital cortices. Thirty-four 2 mm slices with 1 mm gaps were acquired (TR = 2863 ms, TE = 50 ms, FOV = 192 mm × 192 mm, matrix = 64 × 64, flip angle = 90°). Full brain T1-weighted structural scans were acquired on the same scanner and were composed of 176 1-mm-thick sagittal slices (TR = 2730 ms, TE = 3.57 ms, FOV = 224 mm × 256 mm, matrix = 224 × 256, flip angle = 7°) using the MPRAGE protocol.
Each functional run lasted 17 min. The first four functional volumes of each run were treated as dummy scans and discarded. Images were processed using SPM8 software (http://www.fil.ion.ucl.ac.uk/spm/software/spm8/). Images were preprocessed using standard procedures: anatomical and functional images were reoriented to the anterior commissure; images were slice-time corrected with the middle slice used as the reference; and EPIs were aligned to each other and coregistered to the structural scan by minimizing normalized mutual information. Next, EPIs were spatially normalized to MNI space using parameters obtained from the segmentation of T1 images into gray matter (GM) and white matter (WM). Finally, spatially normalized images were smoothed with a Gaussian smoothing kernel of 8 mm FWHM.
fMRI statistical analysis.
At the participant level, BOLD responses were time-locked to the onset of the visual search array (which appeared at the same time as the Gabor, if present), enabling us to examine BOLD responses to both target present and target absent trials. BOLD responses were modeled in a GLM with regressors and their corresponding temporal derivatives for each combination of the following factors: Attention (full, diverted), Expectation (25%, 50%, and 75%), Stimulus (Gabor present, Gabor absent), Report (yes, no), and Confidence (confident, guess). If a certain combination of factors had no associated trials for a particular participant, that regressor was removed from the participant's first-level model and contrast weights were rescaled.
The reliability of the regression weights was maximized by entering data from all runs and sessions together, increasing the trial count per regressor. To avoid smearing artifacts, no bandpass filter was applied. Instead, low-frequency drifts were regressed out by entering WM drift (averaged over the brain) as a nuisance regressor (Law et al., 2005). Nuisance regressors representing the experimental run and six head motion parameters were also included in the first-level models.
Comparisons of interest were tested by running one-sample t tests against zero at the participant level, then running group-level paired t tests on those one-sample contrast images. Unless otherwise stated, all contrasts at the group level were run with peak thresholds of p < 0.001 (uncorrected) and corrected for multiple comparisons at the cluster level using the FDR method.
We wanted to control for two possible confounds: the correlation between reaction speed and confidence (Petrusic and Baranski, 2003; see, e.g., Grinband et al., 2006), and the correlation between confidence and individual or condition-wise differences in Gabor contrast (Rahnev et al., 2011). To do this, a control GLM was computed. Here, each regressor was parametrically modulated by both Gabor contrast (linear modulation) and reaction time (quadratic modulation). By design, this model controlled for (1) reaction time-confidence associations and (2) across-subject and across-condition differences in Gabor contrast. The Results section reports analyses on our main model (the model without parametric modulators) because the control model had a fourfold increase in number of regressors, reducing statistical power. Nonetheless, all GLM analyses were replicated under our control model when using a peak threshold of p < 0.005. Crucially, all results in right inferior frontal gyrus (rIFG) were also replicated when using a peak threshold of p < 0.001.
Functional ROIs were extracted using the MarsBaR 0.42 toolbox (http://marsbar.sourceforge.net/download.html). Anatomical areas were labeled using the SPM Anatomy toolbox (Eickhoff et al., 2005). Brodmann areas were identified using MRIcro (Rorden and Brett, 2000). Results of whole-brain analyses were plotted onto glass brains using MATcro (now called MRIcroS, https://www.nitrc.org/plugins/mwiki/index.php/mricros:MainPage). Other figures were made using the Canlab Coretools toolbox (https://github.com/canlab/CanlabCore) and custom code.
Psychophysiological interaction (PPI) analysis.
The PPI analysis was performed using the CONN FC toolbox (http://web.mit.edu/swg/software.htm). The GLM comprised regressors for attention condition (full/diverted), confidence (confident/guess), and expectation-response congruence (congruent/neutral/incongruent). Nuisance regressors were identical to those used in the GLM on BOLD, with the addition of scrubbing parameters, which exclude outlier volumes that bias connectivity estimates. Again, the signal was not bandpass filtered, but instead the mean WM drift was entered as a nuisance regressor. Otherwise, default preprocessing steps were taken: the data were denoised by regressing out signal from WM, from CSF and from each individual condition, plus signal associated with all nuisance regressors. The PPI was run on univariate regression weights to identify effective connectivity between a functionally defined seed (rIFG) and remaining voxels. These weights were examined in a second-level model, which used an uncorrected peak threshold of p < 0.005 and FDR cluster corrected threshold of p < 0.05.
Voxel-based morphometry.
T1-weighted structural scans were reoriented to the anterior commissure and segmented into GM, WM, and CSF. These were normalized to MNI space using DARTEL with SPM defaults and a Gaussian smoothing kernel of 8 mm FWHM (Ashburner and Friston, 2000). Modulated WM and GM images were separately compared across participants in a multiple regression with age and total intracranial volume (GM + WM + CSF) as nuisance regressors. Gender was not included because this resulted in multicollinearity between regressors (older participants were more likely to be male). Clusters reported as significantly correlating with behavior survived voxelwise FWE correction.
Results
Expectations liberalize decision thresholds and attention increases contrast sensitivity
Our first analyses confirmed the efficacy of our paradigm. To equate difficulty across attention conditions and participants, adaptive psychophysical staircases identified the stimulus contrast required for 79% accuracy on the Gabor detection task (see Staircases). Comparing the acquired contrasts in the full (4.34 ± 3.50%, mean ± SD) and diverted (5.69 ± 3.79%) attention conditions revealed that contrast thresholds were significantly lower under full than diverted attention (t(18) = 2.95, p = 0.014, 95% CI [0.50%, 2.31%], dz = 0.70), indicating that our paradigm successfully manipulated attention.
To ensure that our staircase procedure successfully equated detection sensitivity d′ across conditions, we ran a within-subjects Attention (full, diverted) × Expectation (25%, 50%, 75%) ANOVA. This revealed no significant difference between d′ under full (1.06 ± 0.14) and diverted (1.21 ± 0.20) attention conditions (F(1,18) = 0.34, p = 0.569, ηp2 = 0.02) (Fig. 2A), and was corroborated by a Bayesian repeated-measures ANOVA of the same design that revealed moderate evidence for the null hypothesis (BF = 0.240). There was no significant effect of Expectation on d′ (F(2,36) = 0.70, p = 0.505, ηp2 = 0.04, BF = 0.07; strong evidence for the null) and no significant interaction term (F(2,36) = 0.76, p = 0.476, ηp2 = 0.04, BF = 0.016; strong evidence for the null). Our staircases therefore successfully equated d′.
To determine whether we had successfully manipulated priors, we compared signal detection theoretic decision thresholds (c, see Materials and Methods) across levels of expectation (de Lange et al., 2013; Morales et al., 2015; Sherman et al., 2015). As the expectation of Gabor presence over absence increases, decision threshold should become increasingly biased toward “yes” responses (i.e., liberalized, indicated by lower values of c). This was confirmed in a within-subjects Attention (full, diverted) × Expectation (25%, 50%, 75%) ANOVA (F(1.65,29.72) = 18.10, p < 0.001, ηp2 = 0.50). LSD post hoc tests revealed a greater bias toward reporting “yes” in the 50% (neutral) than the 25% (expect absent) condition (p = 0.010, dz = 1.15), and in the 75% (expect present) than the 50% (neutral) condition (p < 0.001, dz = 1.39) (Fig. 2B). We found no evidence for attentional effects on decision threshold (F(1,18) = 3.38, p = 0.083, ηp2 = 0.16) and no Expectation × Attention interaction (F(2,36) = 0.37, p = 0.693, ηp2 = 0.020).
In summary, our design successfully manipulated attention and expectation independently, while keeping detection sensitivity constant across conditions.
Expectations increase confidence
We have previously shown that subjective confidence increases when perceptual decisions are congruent with prior expectations (Sherman et al., 2015), and on this basis, hypothesized that confidence would relate to prediction error signals. To determine whether we had replicated this behavioral result, we compared confidence for expectation-congruent and expectation-incongruent perceptual decisions. Congruent responses are “yes” reports in the 75% (expect present) condition and “no” reports in the 25% (expect absent) condition. The reverse applies for incongruent responses. An effect of congruence is therefore demonstrated by an interaction between expectation and report, whereby when participants report “yes,” confidence increases with increasing prior probability of target presence, and when participants report “no” confidence increases with decreasing probability of target presence.
The percentage of high-confidence trials was subjected to an Attention (full, diverted) × Accuracy (correct, incorrect) × Expectation (25%, 50%, 75%) × Report (yes, no) repeated-measures ANOVA. Participants appropriately showed higher confidence in correct than in incorrect detection judgments (F(1,18) = 54.583, p < 0.001, η2 = 0.752). Confidence was subject to several other effects. Overall, confidence was higher when reporting absence. This held for incorrect but not correct judgments, and under full but not diverted attention. Attention also interacted with expectation such that under full attention confidence was higher when the prior was informative (the 25% and 75% conditions), but under diverted attention confidence was higher in the absence of an informative prior (50% condition). Statistics are presented in Table 1.
Crucially, congruent reports were associated with higher confidence than incongruent reports, as shown by a significant Expectation × Report interaction (F(2,36) = 15.535, p < 0.001, η2 = 0.463) (Fig. 3). Here, confidence for “yes” responses was higher in the 75% (congruent) than the 25% (incongruent) condition (t(18) = 2.51, p = 0.021), whereas confidence for “no” responses was higher in the 25% (congruent) than the 75% (incongruent) condition (t(18) = 3.83, p = 0.001). The three-way interaction with Attention was not significant (p = 0.938). Confidence therefore increased when the reported percept was congruent with prior expectations, independently of attentional resource. In other words, perceptual decisions that were consistent with prior beliefs were associated with higher confidence. This crucial behavioral result motivated our investigation into the neural correlates of predictability effects on subjective confidence.
Two forms of congruency
To unravel the neural correlates of predictive influences on confidence, we first needed to identify brain regions sensitive to perceptual expectations. We predicted, based on previous work, that areas sensitive to perceptual expectations would exhibit an increased BOLD amplitude for trials on which expectations were violated (Egner et al., 2010; Kok et al., 2012a; Jiang et al., 2013; Kouider et al., 2015; St. John-Saaltink et al., 2015).
The experimental design used near-threshold stimuli, leading to potential dissociations between stimulus presentation and perceptual contents. Accordingly, expectations can be violated in two ways. First, stimulus presentation can be unexpected. These violations are under the control of the experimenter. We refer to the neural correlate of this expectancy violation as PESTIMULUS. Second, the perceptual report (i.e., perceptual content) can be incongruent with that expected. These violations are not under the control of the experimenter. Rather, they are a function of the participant's sensory representation. We refer to the neural correlate of this expectancy violation as PEREPORT. PESTIMULUS is most often observed at lower levels of the perceptual hierarchy (Kok et al., 2012a; Chennu et al., 2013; Jiang et al., 2013), whereas the decision-related PEREPORT signals have been reported in both visual cortex (Pajani et al., 2015) and higher-level, decision-related areas (Bubic et al., 2009).
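The distinction between the two forms of expectancy violation can be made concrete by coding each trial twice: once by what was physically presented and once by what the participant reported. The sketch below is illustrative only. The function names are hypothetical, and labeling the 50% condition as “neutral” in the stimulus-based coding is our assumption.

```python
def stimulus_expectedness(p_present, gabor_present):
    """Stimulus-based coding: was the physical stimulus expected given the block prior?

    Assumption: the 50% (flat prior) condition is labeled "neutral".
    """
    if p_present == 0.5:
        return "neutral"
    expected_present = p_present > 0.5
    return "expected" if gabor_present == expected_present else "unexpected"

def report_congruence(p_present, reported_yes):
    """Report-based coding: did the perceptual report match the block prior?"""
    if p_present == 0.5:
        return "neutral"
    expected_present = p_present > 0.5
    return "congruent" if reported_yes == expected_present else "incongruent"

# A miss in the 75% condition: the stimulus was expected, but the "no" report
# is incongruent with the prior, so the two codings dissociate.
miss_in_75 = (stimulus_expectedness(0.75, True), report_congruence(0.75, False))
```

Because stimuli are near threshold, misses and false alarms are common enough that the two codings differ on a substantial number of trials, which is what allows the two signals to be examined separately.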
Representation of PESTIMULUS in visual cortex
In our first analysis, we searched across the whole brain for regions that are sensitive to discrepancies between expectation and stimulus presentation. To do this, we computed the contrast unexpected stimulus presentation > expected stimulus presentation. Gabor presence is expected in the 75% condition but unexpected in the 25% condition, whereas Gabor absence is expected in the 25% condition but unexpected in the 75% condition. Our analysis identified one PESTIMULUS-sensitive area in contralateral occipital cortex (V1-V3, BA18, peak MNI x = −12, y = −80, z = 22, Zpeak = 4.09, 0.66 cm3, cluster pFDR = 0.350, puncorr = 0.023) and one on the ipsilateral side (V1-V3, BA18, peak MNI x = 8, y = −80, z = 18, Zpeak = 3.99, 1.01 cm3, cluster pFDR = 0.205, puncorr = 0.007). Neither of these clusters survived cluster-level correction, so they will not be considered beyond this point. They are presented simply to show consistency with previous studies, in which statistical power was improved by constraining the analysis with functional localizers (Smith and Muckli, 2010; Kok et al., 2012a, b; Larsson and Smith, 2012; Jiang et al., 2013).
The whole-brain contrast PESTIMULUS, attended > PESTIMULUS, unattended yielded no significant or marginally significant clusters, indicating no evidence for a PESTIMULUS × attention interaction.
Using a peak threshold of p < 0.005, both of these analyses were replicated under our control model, which included reaction speed and Gabor contrast as parametric modulators (unexpected > expected, contralateral: pFDR = 0.446, puncorr = 0.014, ipsilateral: pFDR = 0.446, puncorr = 0.011).
Regions representing PEREPORT
Next, we searched for regions whose BOLD response reflects the discrepancy between expectation and perceptual report. Expectation-congruent reports are “yes” responses in the 75% (expect present) condition and “no” responses in the 25% (expect absent) condition. The reverse applies for expectation-incongruent reports. These definitions differ from those in the previous analysis because they consider perceptual report instead of stimulus presence or absence. In turn, this analysis searches for regions sensitive to unexpected perceptual content.
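For illustration, the two trial labelings (unexpected stimulus for PESTIMULUS, incongruent report for PEREPORT) can be sketched in Python as follows. This is a minimal sketch, not the analysis code used in the study: the function names and the choice to leave the 50% condition unlabeled (returning None) are assumptions made for the example.

```python
from typing import Optional

# Expectation conditions, coded as the stated probability of Gabor presence.
EXPECT_ABSENT, NEUTRAL, EXPECT_PRESENT = 0.25, 0.50, 0.75


def stimulus_unexpected(p_present: float, gabor_present: bool) -> Optional[bool]:
    """PE_STIMULUS labeling: is the physical stimulus unexpected?"""
    if p_present == NEUTRAL:
        return None  # assumption: a flat prior is neither expected nor unexpected
    expected_present = p_present > NEUTRAL
    return gabor_present != expected_present


def report_incongruent(p_present: float, said_yes: bool) -> Optional[bool]:
    """PE_REPORT labeling: does the yes/no report go against the expectation?"""
    if p_present == NEUTRAL:
        return None  # assumption: same convention as above
    expected_present = p_present > NEUTRAL
    return said_yes != expected_present


# A miss in the 75% condition: the Gabor was shown (stimulus expected),
# but the participant reported "no" (report incongruent).
miss_labels = (
    stimulus_unexpected(EXPECT_PRESENT, gabor_present=True),
    report_incongruent(EXPECT_PRESENT, said_yes=False),
)
```

The miss trial in the example shows why the two labelings can dissociate with near-threshold stimuli: the same trial is expected with respect to the stimulus but incongruent with respect to the report.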
The contrast expectation-incongruent report > expectation-congruent report was computed over the whole brain. This revealed eight significant clusters distributed throughout the cortex (Fig. 4A; Table 2). We found no significant clusters for the reverse contrast, even with a more liberal peak threshold of p < 0.005 uncorrected.
Regions exhibiting a PEREPORT pattern should show a larger BOLD response to incongruent than congruent responses, regardless of whether the response was “yes” or “no” (Kok et al., 2012a). We checked whether this held in these putative PEREPORT regions by extracting median regression coefficients as a function of attention, expectation and report, and subjecting them to separate repeated-measures ANOVAs.
Results are depicted in Figure 4B, and statistics are presented in Table 3. All regions exhibited a significant PEREPORT response for both “yes” and “no” judgments, except middle orbital gyrus and left inferior frontal gyrus. Accordingly, these clusters were excluded from our set of PEREPORT regions.
Under our control model (in which regressors were parametrically modulated by Gabor contrast and reaction speed, see Materials and Methods), all significant results here were replicated, at least at marginal significance (rIFG, our critical region, pFDR = 0.044). Results were fully replicated when using a peak threshold of p < 0.005. This means that PEREPORT responses do not reflect reaction speed or Gabor contrast/detection sensitivity.
In summary, we have identified six regions signaling PEREPORT as follows: right middle temporal gyrus, right superior medial gyrus, rIFG, right angular gyrus, and bilateral inferior parietal lobule. These regions are sensitive to the discrepancy between perceptual expectations and perceptual choice (i.e., to unexpected perceptual inferences).
High confidence is associated with an attenuated PEREPORT response in rIFG
Our key hypothesis was that high confidence would be associated with low PEREPORT. However, confidence can also be influenced by attention (Rahnev et al., 2011) and tracks accuracy (Dienes, 2008; Pleskac and Busemeyer, 2010). To test whether any PEREPORT region represented confidence after controlling for these potential confounds, median regression weights from each PEREPORT region were extracted as a function of confidence, attention, and decision accuracy. These regression coefficients were then subjected to separate Bayesian repeated-measures ANOVAs. We were looking for regions whose BOLD response (in these regions, representing PEREPORT) differs with confidence. We could not test for a PEREPORT × Confidence interaction: when the participant reports low confidence, they have signaled their yes/no decision as unreliable; that is, their perception of Gabor presence or absence does not necessarily correspond to their report. Because PEREPORT is a function of both Expectation and Report, this variable will also be unreliable when the participant has reported low confidence.
Only one region exhibited a BOLD response (i.e., PEREPORT amplitude) that differed as a function of subjective confidence: rIFG. Here, supporting our hypothesis, BOLD amplitude was higher for guess than confident reports (Fig. 5A). Crucially, the analysis revealed substantially more evidence for modeling rIFG BOLD as a function of confidence alone (BF = 13.620) than as a function of just accuracy (BF = 0.877), just attention (BF = 0.711), or as a combination of confidence and any other factors (BF = 0.003–2.069; for summary of results from all ROIs, see Table 4). A frequentist ANOVA gave the same result: a significantly higher BOLD amplitude for guess than confident responses (F(1,18) = 6.04, p = 0.024, η2 = 0.251, 95% CI [0.10, 1.28]). These results are shown in Figure 5B.
Does rIFG sensitivity to confidence indeed reflect an effect of expectation? If so, the effect of expectation on confidence (as shown in Fig. 3) should be correlated with rIFG PEREPORT amplitude. We tested this with an across-subject brain-behavior correlation. Our behavioral variable was the effect of expectations on confidence, denoted ΔConfidence = Confidencecongruent − Confidenceincongruent. Because PEREPORT cannot be meaningfully computed for guess responses (see above), it was computed from confident responses only.
Correlating these two variables revealed a significant negative correlation (ρ = −0.512, p = 0.027) (Fig. 5C): smaller PEREPORT amplitude in rIFG was associated with larger increases in confidence for expectation-congruent perceptual decisions. In turn, this result confirms our finding that high confidence is associated with low PEREPORT in rIFG, and implicates rIFG in the mechanism by which prior expectations increase confidence.
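The brain-behavior correlation described above can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions: the five "participants" and all of their values are synthetic, the function names are hypothetical, and Spearman's ρ is computed here as the Pearson correlation of average ranks; none of this is taken from the study's data or code.

```python
from math import sqrt
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg_rank
        i = j + 1
    return out


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def delta_confidence(conf_congruent, conf_incongruent):
    """Per-participant effect of expectation: congruent minus incongruent."""
    return [c - i for c, i in zip(conf_congruent, conf_incongruent)]


# Synthetic values for five hypothetical participants (not real data).
conf_congruent = [0.70, 0.65, 0.80, 0.60, 0.75]    # proportion "confident"
conf_incongruent = [0.60, 0.62, 0.60, 0.59, 0.70]
pe_report = [0.2, 0.6, 0.4, 0.9, 0.1]              # PE_REPORT amplitude, arbitrary units

rho = spearman_rho(delta_confidence(conf_congruent, conf_incongruent), pe_report)
```

With these made-up numbers ρ is negative, mirroring the direction of the reported effect; the magnitude carries no meaning.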
To ensure that these differences were not driven by differences in reaction speed or Gabor contrast, we extracted data from the rIFG cluster revealed by our control GLM. This revealed that, even after controlling for these possible confounds, rIFG BOLD was significantly higher for guess than confident responses (t(18) = 2.21, p = 0.041, dz = 0.44). The significant brain-behavior correlation was also replicated (ρ = −0.510, p = 0.027).
Together, these analyses reveal that subjective confidence is reliably associated with PEREPORT in right IFG even after controlling for attention, Gabor contrast, decision accuracy, and reaction speed.
Sources of priors and sensory signals for confidence
Thus far, we have shown that lower PEREPORT amplitude in rIFG, reflecting the extent to which perceptual inferences are unexpected, is associated with higher confidence. Assuming a model in which decision confidence is a weighted function of top-down expectations and “bottom-up” sensory signals (or decision evidence), we asked whether we could identify sources of these variables. To do this, we ran a seed-to-voxel psychophysiological interaction analysis (PPI), with rIFG as a functionally defined seed.
We were interested in regions communicating predictive information and therefore searched for clusters exhibiting different degrees of FC with rIFG for congruent and incongruent reports. We reasoned that, although confidence should be a function of both sensory signals and expectations, there would be individual differences in how each component would be weighted, reflecting, for example, how reliable the expectation information is thought to be. Capitalizing on these individual differences, we reasoned that rIFG would show stronger FC with the expectation region in participants whose confidence was weighted more by expectation. By contrast, rIFG would show stronger FC with the source of sensory signals in participants whose confidence was only weakly shaped by expectation.
To test this hypothesis, we used a behavioral covariate of interest: the influence of expectations on confidence. This behavioral variable was defined as ΔConfidence = ConfidenceCongruent − ConfidenceIncongruent, and is the same as the behavioral variable in Figure 5C. Higher values signify that expectations exerted a stronger effect on confidence.
Sources of predictive information for confidence were identified by computing the contrast incongruent ≠ congruent, with ΔConfidence as a between-subjects covariate of interest. That is, we searched for brain connectivity-behavior relationships.
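The core of a PPI model is an interaction regressor formed from the seed time course and the task contrast. The sketch below shows only that step, in simplified form: it multiplies the mean-centered seed signal by a mean-centered congruent (+1) versus incongruent (−1) vector. It is an assumption-laden illustration, not the pipeline used here; standard implementations typically form the interaction at the estimated neural level and reconvolve with a hemodynamic response function, and the function names and toy data are hypothetical.

```python
def mean_center(values):
    """Subtract the mean so the interaction is not confounded with main effects."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def ppi_regressor(seed_ts, psych):
    """Element-wise product of the centered seed signal and task contrast."""
    if len(seed_ts) != len(psych):
        raise ValueError("seed_ts and psych must have the same length")
    return [s * p for s, p in zip(mean_center(seed_ts), mean_center(psych))]


# Toy data: four time points, two congruent (+1) then two incongruent (-1).
seed = [1.0, 2.0, 3.0, 4.0]   # hypothetical rIFG signal, arbitrary units
contrast = [1, 1, -1, -1]
ppi = ppi_regressor(seed, contrast)
```

At the group level, each participant's estimate for this regressor would then be related to ΔConfidence as a between-subjects covariate, which is the connectivity-behavior relationship described above.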
As shown in Figure 6A, the PPI analysis revealed three significant clusters. The more expectations shaped confidence (higher ΔConfidence), the more that congruence was associated with FC between rIFG and two clusters: one in left orbitofrontal cortex (lOFC) (ΔConfidence × congruent > incongruent; peak MNI x = −36, y = 38, z = −18, 3.40 cm3, cluster pFDR = 0.008) and one in right OFC (ΔConfidence × congruent > incongruent; peak MNI x = 10, y = 26, z = −18, 2.50 cm3, cluster pFDR = 0.024).
By contrast, the less expectations shaped confidence (lower ΔConfidence), the more that congruence was associated with FC between rIFG and intracalcarine sulcus (ΔConfidence × congruent > incongruent: peak MNI x = 6, y = −58, z = 12, 3.86 cm3, cluster pFDR = 0.004). Thus, intracalcarine sulcus and bilateral OFCs exhibited a push-pull relationship, with the dominant region predicted by ΔConfidence.
Although the balance of FC between these regions was determined by ΔConfidence, FC between rIFG and these regions was present independently of ΔConfidence. Specifically, FC between rIFG and OFC was greater than zero on congruent trials (lOFC, p < 0.001; rOFC, marginally, p = 0.053), whereas intracalcarine sulcus-rIFG FC was significantly greater than zero on incongruent trials (p = 0.022).
Because OFC was primarily associated with congruent responses, we reasoned that FC with these regions might reflect the communication of perceptual priors. Consistent with this, we found a main effect of expectation condition on lOFC BOLD (F(1.64,29.59) = 3.61, p = 0.047, ηp2 = 0.167; Fig. 6B). Here, BOLD exhibited a ‘U’-shaped relationship with expectation (p = 0.036), consistent with the representation of prior information: BOLD was higher when there was an informative prior (the 25% and 75% conditions) than when the prior was flat (the 50% condition). In rOFC, this pattern was exhibited under full (F(2,36) = 3.80, p = 0.032, ηp2 = 0.174) but not diverted (F(2,36) = 0.87, p = 0.426, ηp2 = 0.046) attention (interaction p = 0.030, Fig. 6C). Interestingly, in rOFC, there was also a significant attention × confidence interaction (F(1,18) = 7.84, p = 0.012, ηp2 = 0.303) (Fig. 6D), such that attention reversed the BOLD response to confident versus guess responses.
These results are consistent with the interpretation of bilateral OFC communicating prior information. Whereas lOFC represented prior information independently of attention, rOFC did this only under full attention. Moreover, the attention by confidence interaction in rOFC BOLD suggests that this region may additionally represent the degree of (reverse) uncertainty associated with attentional state.
Next, we asked whether intracalcarine sulcus represented prediction error signals. PEREPORT is demonstrated in an expectation by report interaction, whereas PESTIMULUS is demonstrated in an expectation by stimulus interaction. However, neither was found (all p > 0.288). Rather, the BOLD response here was marginally higher for stimulus present than absent trials, F(1,18) = 3.81, p = 0.067, ηp2 = 0.175 (Fig. 6E).
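The partial eta-squared values reported for the OFC and intracalcarine ANOVAs follow directly from each F statistic and its degrees of freedom, via ηp2 = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies this standard identity to the statistics quoted above; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error) as quoted in the text for each effect.
reported = {
    "lOFC expectation": (3.61, 1.64, 29.59),
    "rOFC expectation, full attention": (3.80, 2, 36),
    "rOFC expectation, diverted attention": (0.87, 2, 36),
    "rOFC attention x confidence": (7.84, 1, 18),
    "intracalcarine stimulus presence": (3.81, 1, 18),
}

effect_sizes = {
    name: round(partial_eta_squared(*stats), 3) for name, stats in reported.items()
}
```

Each recomputed value matches the effect size quoted alongside the corresponding F statistic.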
One might wonder whether bilateral OFC directly signals priors to occipital lobe, or vice versa for sensory signals. This was not the case. Rerunning the PPI analysis in the same way, but with either OFC cluster as our seed, revealed no significant or marginally significant connectivity with intracalcarine sulcus. Similarly, running the analysis setting intracalcarine sulcus as the seed revealed no significant or marginally significant connectivity with either OFC cluster.
Together, these results show that the integration of expectations into confidence judgments in rIFG recruits an occipitofrontal network that represents top-down influences of attention and expectation in OFC, and bottom-up sensory signals in intracalcarine sulcus.
The contribution of OFC to confidence is predicted by WM density
Our connectivity analyses revealed that the integration of expectations into confidence judgments recruits OFC, representing top-down signals, and intracalcarine sulcus, representing bottom-up signals. The extent to which each region was recruited was predicted by individual differences in the extent to which expectations shaped behavioral confidence. The presence of these individual differences motivated an exploratory follow-up analysis that asked whether they are reflected in brain structure. More specifically, we considered whether the weighting of top-down predictions was a function of WM or GM density of the source region.
The BOLD response of OFC reflected an effect of perceptual expectations on objective decision. The behavioral correlate of this is therefore Δc = c25% − c75%, the extent to which perceptual expectations bias the (yes/no) decision. We performed a whole-brain multiple regression analysis on WM density, with total intracranial volume and participant age as nuisance covariates, and with Δc as the regressor of interest. This analysis revealed that propensity to incorporate low-level priors into decision-making, as measured by Δc, was negatively correlated with rOFC WM density (Fig. 7; peak MNI x = 23, y = 30, z = −14, 11.51 cm3, ppeak-FWE = 0.030, Z = 5.08). The same analysis for GM yielded no significant results.
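As an illustration of the behavioral measure, Δc can be computed from hit and false alarm rates in the 25% and 75% conditions. The sketch assumes the standard equal-variance signal detection definition of the criterion, c = −0.5 × (z(hit rate) + z(false alarm rate)); the rates below are invented for the example and the function names are hypothetical.

```python
from statistics import NormalDist


def criterion(hit_rate, fa_rate):
    """Signal detection criterion c (assumed equal-variance definition).

    Positive c is a conservative bias (toward "no"); negative c is liberal.
    Rates must lie strictly between 0 and 1, so rates of exactly 0 or 1
    would need a correction before being passed in.
    """
    z = NormalDist().inv_cdf
    return -0.5 * (z(hit_rate) + z(fa_rate))


def delta_c(c_25, c_75):
    """Shift in criterion between the expect-absent and expect-present conditions."""
    return c_25 - c_75


# Invented rates for one hypothetical participant (not real data).
c_25 = criterion(hit_rate=0.40, fa_rate=0.10)  # expect absent: conservative
c_75 = criterion(hit_rate=0.80, fa_rate=0.30)  # expect present: liberal
shift = delta_c(c_25, c_75)
```

A larger positive shift indicates that the yes/no decision moved more strongly with the expectation condition.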
This result suggests that the dependence of confidence on FC with the expectation source regions is reflected in anatomical indications of that connectivity: WM density in OFC was negatively predicted by its functional correlate.
Discussion
In the present paper, we have shown that perceptual confidence increases when decisions are supported by (or congruent with) prior expectations. Crucially, we show that the process of integrating this predictive information recruits rIFG.
Our data reveal that unexpected perceptual content is associated with heightened PEREPORT (a mismatch response to expectation-incongruent perceptual decisions) in a distributed set of frontal, parietal, and temporal decision-related regions. Interestingly, this PEREPORT-sensitive set resembles those implicated in other forms of “top-down” processing, such as modality-independent sensory change detection (Downar et al., 2000), response inhibition (Verbruggen and Logan, 2008; Criaud and Boulinguez, 2013), and detection of behavioral salience (Downar et al., 2002).
Crucially, the contribution of top-down expectations to subjective confidence judgments was reflected in rIFG BOLD activity. Here, high confidence was associated with a lower PEREPORT response. Furthermore, stronger behavioral effects of expectation on confidence were associated with greater PEREPORT attenuation in this region. Control analyses ruled out explanations in terms of attention, stimulus contrast, decision accuracy, or reaction speed. Our results therefore indicate a central role for rIFG in subjective perceptual decision making, in which the “mismatch” between internal templates and perceptual content is integrated into subjective confidence judgments. That is, these data indicate that perceptual decision confidence is shaped by a neural mismatch response that is perceptual in nature.
One might interpret PEREPORT as arising from the preparation or execution of a response that is a priori likely to be incorrect. Recent evidence has shown that perceptual confidence is increased by response-effector congruency (Fleming et al., 2015). However, we find such an explanation for the present results unlikely for two reasons: (1) our paradigm did not experimentally induce response conflict; and (2) effects of confidence on rIFG BOLD were independent of decision accuracy. Rather, we suggest that PEREPORT reflects perceptual “prediction error,” arising from the degree to which the reported percept is surprising.
Our data suggest that expectations shape the decision signal on which confidence judgments are based (although we cannot and do not rule out an additional effect of expectations at the metacognitive level). Accordingly, our results are consistent with many models of choice confidence that permit expectations to shape decision evidence (for a thorough review, see Summerfield and de Lange, 2014). For example, in recent work, we have shown that the signal detection model can be reformulated in Bayesian terms. Here, the distributions of internal responses to stimulus presence and absence are represented as posterior probability distributions, whereas decision and confidence thresholds are fixed according to a particular posterior odds ratio (Fig. 8A) (Sherman et al., 2015). Increasing the prior probability of target presence will change the distribution of internal responses, driving decision and confidence thresholds. The result of this is a greater propensity to report “yes” and to report “confident” when target presence is expected. Crucially, these are not “undesirable” biases that ought to be eliminated. Rather, they reflect the underlying statistics of the task, even influenced by periodic fluctuations in occipital cortex (Sherman et al., 2016).
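The logic of a posterior-odds decision rule can be illustrated with a toy model. The sketch below is not the model of Sherman et al. (2015); it assumes equal-variance Gaussian internal responses (mean 0 for absence, mean d′ for presence, unit SD), reports "yes" when the posterior favors presence, and reports "confident" when the posterior odds for the chosen option reach a fixed value. The d′ of 1, the confidence odds of 3, and all names are hypothetical choices for the example.

```python
from statistics import NormalDist


def posterior_present(x, prior_present, d_prime=1.0):
    """P(target present | internal response x) under equal-variance Gaussians."""
    like_present = NormalDist(mu=d_prime, sigma=1.0).pdf(x)
    like_absent = NormalDist(mu=0.0, sigma=1.0).pdf(x)
    weighted_present = prior_present * like_present
    return weighted_present / (weighted_present + (1.0 - prior_present) * like_absent)


def report(x, prior_present, d_prime=1.0, confidence_odds=3.0):
    """Return (decision, confident) from fixed posterior-odds thresholds."""
    p = posterior_present(x, prior_present, d_prime)
    decision = "yes" if p > 0.5 else "no"
    p_chosen = max(p, 1.0 - p)
    confident = p_chosen / (1.0 - p_chosen) >= confidence_odds
    return decision, confident


# The same internal response (x = 1.0) under three expectation conditions.
reports = {prior: report(1.0, prior) for prior in (0.25, 0.50, 0.75)}
```

In this toy example an identical internal response yields a confident "yes" only when presence is expected, a guessed "yes" under the flat prior, and a guessed "no" when absence is expected.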
In the present study, one could argue that rIFG serves to compare this internal response to a confidence threshold (Fig. 8B). However, such an account would not be consistent with previous work implicating left IFG in adjusting perceptual criteria, and right IFG in estimating task difficulty and uncertainty (Tops and Boksem, 2011; White et al., 2012; Reckless et al., 2014). Instead, this literature would support an account by which rIFG integrates prior beliefs into the internal response (Fig. 8C), forming the signal upon which confidence is constructed.
We found that the process of integrating predictive information into confidence judgments recruited both intracalcarine sulcus and bilateral OFCs. Intracalcarine sulcus exhibited a marginally significant BOLD response to the stimulus, so we interpret connectivity with this region as reflecting the communication of sensory signals. We assume this effect was weak because it was localized to a large cluster while reflecting the neural response to a small stimulus in retinotopically organized space.
We further found that both right and left OFC represented top-down prior information, consistent with previous work (Schoenbaum and Roesch, 2005; Wallis, 2007; Trapp and Bar, 2015). For left OFC, this was independent of attention, but right OFC was sensitive to attentional state: representation of the prior required attention. Furthermore, the effect of confidence on rOFC BOLD reversed under diverted attention, possibly indicating that rOFC represents the uncertainty associated with inattention.
Together, our results suggest that the construction of confidence recruits intracalcarine sulcus, representing stimulus-driven signals, OFC, representing top-down signals, and rIFG, which maps the discrepancy between expectation and perceptual choice to subjective confidence. OFC has repeatedly been shown to reflect reward expectations and beliefs (Kepecs et al., 2008; Kim et al., 2011; De Martino et al., 2013; Lebreton et al., 2015); however, here we place OFC belief representations within a larger hierarchical structure for perceptual processing, perhaps generating predictions (Stalnaker et al., 2015; Trapp and Bar, 2015) that constrain perceptual confidence.
Importantly, our PPI analysis cannot determine the directionality of functional connections in this network. While the functional data are consistent with a model in which rIFG receives top-down and bottom-up inputs, on an alternative view bottom-up signals may be passed to rIFG, which passes a transformation of PEREPORT into confidence to rOFC. Under this account, rIFG would construct the initial confidence representation, whereas OFC would transform the confidence estimate represented in rIFG into a reportable judgment, based on the mismatch between the estimate, expectations, and potentially, attentional state (Lebreton et al., 2015). A third possibility is that rIFG is recruited for the estimation of decision uncertainty, not confidence, and top-down processes from bilateral OFC are recruited more when this uncertainty is high. Therefore, top-down signals from rOFC would not shape rIFG representations of confidence but rather result from uncertain (i.e., variable or noisy) representations. Further studies will be needed to sequence the involvement of these regions and disambiguate these possibilities.
Our results are readily interpretable from Bayesian brain perspectives (Lee, 2002; Yuille and Kersten, 2006; Friston, 2009; Clark, 2013). These propose that perceptual inference is a weighted integration of sensory evidence and prior beliefs about the cause of the sensation, such that the perceptual report corresponds to the belief with greatest posterior probability. The posterior probability increases as the correspondence between prior and sensory signal increases. Therefore, inference is deemed “successful,” and so should be associated with high confidence, when “prediction error” is low (Meyniel et al., 2015b), as we saw here. Neuronal representations of prediction errors are well established in the reward domain (Nakahara et al., 2004; Bayer and Glimcher, 2005), but in the perceptual domain evidence remains restricted to BOLD correlates, such as PEREPORT. From Bayesian perspectives, the finding that PEREPORT amplitude in rIFG was lower for confident responses would point to the representation or construction of the posterior belief in this region; and indeed, rIFG has been found to encode the decision variable, both in Bayesian form (the posterior) (d'Acremont et al., 2013) and as decision evidence (Hebart et al., 2016).
Previous work has implicated rIFG in the representation of expectation violation in a range of modalities, from speech perception (Clos et al., 2014) to the auditory (Garrido et al., 2009), visual (Bubic et al., 2009), and tactile (Allen et al., 2016) domains. It has also been shown that rIFG represents subjective uncertainty (Fleck et al., 2006; Fleming and Dolan, 2012). However, to our knowledge, these functions of rIFG have not been related to each other before. rIFG is also frequently implicated in the seemingly disparate task of detecting or resolving response conflict (Casey et al., 2000; Hampshire et al., 2010), and is a key component of the response inhibition network (Verbruggen and Logan, 2008; Criaud and Boulinguez, 2013). This raises the intriguing possibility of a functional overlap between resolution of response conflict and the formation of confidence.
These roles could be unified by considering rIFG as the region in which the posterior belief on sensory causes is computed (at least for perceptual tasks) because the posterior affords a hypothesis space for adaptive, plausible actions (Mansouri et al., 2009). Consistent with this view, rIFG is recruited when appropriately acting on perceptual choices (Suzuki and Gottlieb, 2013), computing behavioral significance (Sakagami and Pan, 2007), and computing action-outcome likelihoods (Morris et al., 2014). Anatomical considerations also support such a view because the rIFG is directly connected with regions relevant for both cognitive and motor control (Petrides and Pandya, 2002). Right IFG may even compute high-level, abstracted posteriors that integrate high-level beliefs about decision outcomes because the rIFG BOLD response to erroneous decisions is associated with both the valence of the decision outcome, and the optimism of the participant (“self-belief”) (Sharot et al., 2011). We leave open for future research the question of whether and how rIFG relates perceptual confidence to action outcomes.
In conclusion, we have shown that top-down expectations are integrated into decision confidence, and we have shown that this occurs in a functional network consisting of rIFG, OFC, and intracalcarine sulcus. Here, top-down perceptual expectations and bottom-up sensory inputs are integrated into a subjective sense of perceptual confidence. Together, our data reveal a crucial role of top-down influences in the mechanism by which perceptual decisions become available for conscious report.
Footnotes
This work was supported by the Dr. Mortimer and Theresa Sackler Foundation (which supports the Sackler Centre for Consciousness Science) to M.T.S. and A.K.S., the Japan Science and Technology Agency to R.K., and the School of Psychology, University of Sussex to M.T.S.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Maxine T. Sherman, School of Psychology, Pevensey Building, University of Sussex, Falmer BN1 9RH, UK. maxinesherman{at}gmail.com