Abstract
Forrest Gump or The Matrix? Preference-based decisions are subjective and entail self-reflection. However, these self-related features are unaccounted for by known neural mechanisms of valuation and choice. Self-related processes have been linked to a basic interoceptive biological mechanism, the neural monitoring of heartbeats, in particular in ventromedial prefrontal cortex (vmPFC), a region also involved in value encoding. We thus hypothesized a functional coupling between the neural monitoring of heartbeats and the precision of value encoding in vmPFC. Human participants of both sexes were presented with pairs of movie titles. They indicated either which movie they preferred or performed a control objective visual discrimination that did not require self-reflection. Using magnetoencephalography, we measured heartbeat-evoked responses (HERs) before option presentation and confirmed that HERs in vmPFC were larger when preparing for the subjective, self-related task. We retrieved the expected cortical value network during choice with time-resolved statistical modeling. Crucially, we show that larger HERs before option presentation are followed by stronger value encoding during choice in vmPFC. This effect is independent of overall vmPFC baseline activity. The neural interaction between HERs and value encoding predicted preference-based choice consistency over time, accounting for both interindividual differences and trial-to-trial fluctuations within individuals. Neither cardiac activity nor arousal fluctuations could account for any of the effects. HERs did not interact with the encoding of perceptual evidence in the discrimination task. Our results show that the self-reflection underlying preference-based decisions involves HERs, and that HER integration to subjective value encoding in vmPFC contributes to preference stability.
SIGNIFICANCE STATEMENT Deciding whether you prefer Forrest Gump or The Matrix is based on subjective values, which only you, the decision-maker, can estimate and compare, by asking yourself. Yet, how self-reflection is biologically implemented and its contribution to subjective valuation are not known. We show that in ventromedial prefrontal cortex, the neural response to heartbeats, an interoceptive self-related process, influences the cortical representation of subjective value. The neural interaction between the cortical monitoring of heartbeats and value encoding predicts choice consistency (i.e., whether you consistently prefer Forrest Gump over The Matrix over time). Our results pave the way for the quantification of self-related processes in decision-making and may shed new light on the relationship between maladaptive decisions and impaired interoception.
Introduction
Do you prefer Forrest Gump or The Matrix? The decision is subjective: only you know which movie you like best. The subjective values used in preference-based decision-making are internally generated and intrinsically private, and entail self-reflection. In other words, estimating a subjective value requires a reflection about how an item affects you. In contrast, the evidence required to decide which of the two words “listen” and “look” has more characters is publicly and objectively available to any reader of this article. While the neural underpinnings of valuation and choice have been well studied, the biological mechanism supporting the self-reflection intrinsic to subjective decisions remains unspecified. It might derive from the simplest biological implementation of self-reflection (i.e., the monitoring of one's current physiological state; Craig, 2002; Blanke and Metzinger, 2009; Damasio, 2010; Park and Tallon-Baudry, 2014; Azzalini et al., 2019) required to select the most appropriate behavior to restore homeostatic balance, thus ensuring the integrity of the living organism. It follows that the organism needs to track its internal state to assign a value to a given option (Keramati and Gutkin, 2014; Juechems and Summerfield, 2019), and that an imprecise representation of the internal state may lead to suboptimal choice (Paulus, 2007).
The monitoring of current physiological state is notably indexed by the transient neural response automatically elicited by each heartbeat, also known as the heartbeat-evoked response (HER; Montoya et al., 1993; Kern et al., 2013). HERs have been linked to subjective, self-related cognitive processes in ventromedial prefrontal cortex (vmPFC; Park et al., 2014; Babo-Rebelo et al., 2016a,b). A separate stream of studies repeatedly showed that vmPFC encodes subjective values (Lebreton et al., 2009; Bartra et al., 2013; Grueschow et al., 2015). We thus hypothesized (1) that HERs in vmPFC would signal the recruitment of self-reflective processes in preparation for a subjective decision, but not in preparation for an objective decision, and (2) that HER fluctuations would affect valuation in subjective preference-based decisions, but not in decisions based on objective evidence publicly available in the outside world, such as perceptual discriminations. We tested these hypotheses in a paradigm where participants performed either a subjective, preference-based choice, or a control, objective perceptual discrimination, between two visually presented movie titles (Fig. 1), while their neural and cardiac activity were measured with magnetoencephalography (MEG) and electrocardiography (ECG), respectively. Each trial began with an instruction period, with a symbol indicating which type of decision to perform, during which we measured HERs. Once options were displayed, participants selected the title of the movie they preferred in subjective preference trials, and the title written with the highest contrast in objective perceptual discrimination trials. We found that (1) HERs during the instruction period were larger when preparing for preference-based decisions than for discrimination ones, and we demonstrated that (2) HER amplitude interacted with the neural encoding of subjective value in vmPFC during choice.
The neural interaction between HER and value encoding was associated with more consistent subjective choices. This functional coupling was specific to subjective decisions: HER did not interact with the encoding of perceptual evidence in objective visual discrimination trials.
Materials and Methods
Experimental design
Participants.
Twenty-four right-handed volunteers with normal or corrected-to-normal vision took part in the study after having given written informed consent. They received monetary compensation for their participation. The ethics committee Comités de Protection des Personnes Ile de France III approved all experimental procedures. Three subjects were excluded from further analysis: one for low overall performance (74%, 2 SDs below the mean of 87.6%), one for an excessive number of artifacts (29.7% of trials, 2 SDs above the mean of 7%), and one because the independent component analysis (ICA) correction of the cardiac artifact was not successful.
Twenty-one subjects were thus retained for all subsequent analyses (9 males; mean ± SD age, 23.57 ± 2.4 years).
Tasks and procedure.
Participants came on 2 consecutive days to the laboratory (mean elapsed time between the two sessions, 22.28 ± 3.55 h) to complete two experimental sessions. The first session was a likability rating on movies (behavior only), from which we drew the stimuli used in the second experimental session, during which brain activity was recorded with MEG.
Rating session.
We selected 540 popular movies from the Web (https://www.allocine.fr/) having a maximal title length of 16 characters (spaces included). DVD covers and titles of the preselected movies were displayed one by one on a computer screen, and subjects had to indicate whether they had previously watched the currently displayed movie by pressing a “yes” or “no” key on a computer keyboard, without any time constraint. Participants were then presented with the list of movie titles they had previously watched and were asked to name the two movies they liked the most and the two they liked the least. Participants were explicitly instructed to use these four movies as reference points (the extremes of the rating scale) to rate all other movies. Last, the titles and the covers of the movies belonging to the list were displayed one by one at the center of the computer screen in random order. Participants assigned to each movie a likability rating by displacing (with arrow keys) a cursor on a 21 point Likert scale and validated their choice with an additional button press. Likability ratings were self-paced, and the starting position of the cursor was randomized at every trial.
Stimuli.
Experimental stimuli consisted of 256 pairs of written movie titles drawn from the list of movies that each participant had rated on the first day. Each movie title was characterized along the following two experimental dimensions: its likability rating (as provided by the participant), and its contrast. The mean contrast was obtained by averaging the RGB value (between 40 and 100; gray background at 190, no unit) randomly assigned to each character of the title. We manipulated trial difficulty by pairing movie titles so that the differences between the two items along the two dimensions (i.e., likability and contrast) were parametric and orthogonal. Additionally, we controlled that the sum of ratings and the sum of contrast within each difficulty level were independent of their difference and evenly distributed. Each pair of stimuli was presented twice in the experiment: once per decision type. A given movie title could appear in up to 10 different pairs. The position of the movie titles on the screen was pseudorandomly assigned so that the position of the correct option (higher likability rating or higher contrast) was fully counterbalanced.
Two-alternative forced-choice task.
On the second day, subjects performed a two-alternative forced-choice task while brain activity was recorded with MEG. At each trial, participants were instructed to perform one of the two decision types on the pair of movie titles (Fig. 1A): either a preference decision, in which they had to indicate the item they liked the most, or a perceptual discrimination, in which they had to indicate the title written with the higher contrast. Each trial began with a fixation period of variable duration (uniformly distributed between 0.8 and 1.2 s in steps of 0.05 s) indicated by a black fixation dot surrounded by a black ring (internal dot, 0.20° of visual angle; external black ring, 0.40° of visual angle), starting from which participants were required not to blink anymore. Next, the outer ring of the fixation turned either into a square or a diamond (0.40° and 0.56° visual angle, respectively), indicating which type of decision participants were to perform (preference-based or perceptual, counterbalanced across participants), for 1.5 s. Then, the outer shape turned again into a ring, and two movie titles appeared above and below it (visual angle, 1.09°). Options remained on screen until a response was provided (via button press with the right hand) or until 3 s had elapsed. After response delivery, movie titles disappeared and the black fixation dot surrounded by the black circle remained on screen for 1 more second. The central dot turned green and stayed on screen for a variable time (uniformly distributed between 2.5 and 3 s in steps of 0.05 s), indicating to participants that they were allowed to blink before the beginning of next trial. Each recording session consisted of eight blocks of 64 trials each.
Before the recording session, participants familiarized themselves with the experimental task by carrying out three training blocks. The first two blocks (10 trials each) comprised trials of one type only, and hence preceded by the same cue symbol. The last block contained interleaved trials (n = 20), as in the actual recording. The movie pairs used during training were not presented again during the recording session.
Heartbeat counting task.
After performing the eight experimental blocks, we assessed participants' interoceptive abilities by asking them to count their heartbeats by focusing on their bodily sensations, while fixating the screen (Schandry, 1981). Subjects performed six blocks of different durations (30, 45, 60, 80, 100, and 120 s) in randomized order. No feedback on performance was provided. Since the acquisition of our dataset, this widely used paradigm has been criticized in several respects (Ring et al., 2015; Desmedt et al., 2018; Zamariola et al., 2018).
Questionnaires.
Once subjects left the MEG room, they filled out the following four questionnaires in French: Beck's Depression Inventory (BDI; Beck et al., 1961); the Peters et al. Delusions Inventory (PDI; Peters et al., 2004); the State-Trait Anxiety Inventory (STAI; Spielberger et al., 1983); and the Obsessive-Compulsive Inventory (OCI; Foa et al., 2002).
Recordings
Neural activity was continuously recorded using a MEG system with 102 magnetometers and 204 planar gradiometers (sampling rate, 1000 Hz; online low-pass filter, 330 Hz; Neuromag TRIUX, ELEKTA). Cardiac activity was simultaneously recorded (sampling frequency, 1000 Hz; online filter, 0.05–35 Hz; BIOPAC Systems). The electrocardiogram was obtained from four electrodes (two placed over the left and right clavicles, two over the left and right supraspinatus muscles; Gray et al., 2007) and referenced to another electrode on the left iliac region of the abdomen, corresponding to four vertical derivations. The four horizontal derivations were computed offline by subtracting the activity of two adjacent electrodes. Additionally, we measured beat-to-beat changes in cardiac impedance, to compute the beat-by-beat stroke volume (i.e., the volume of blood ejected by the heart at each heartbeat; Kubicek et al., 1970). Impedance cardiography is a noninvasive technique based on the impedance changes in the thorax due to the changes in fluid volume (blood). A very low-intensity (400 μA rms), high-frequency (12.5 kHz) electric current was injected via two source electrodes: the first one was placed on the left side of the neck, and the second 30 cm below it (approximately on the sixth rib). Two other monitoring electrodes (placed 4 cm apart from the source electrodes, below the source electrode on the neck and above the source electrode on the rib cage) measured the voltage across the tissue. To determine left ventricular ejection time, aortic valve activity was recorded by placing a nonmagnetic homemade microphone (online bandpass filter, 0.05–300 Hz) on the chest of the subject.
Pupil diameter and eye movements were tracked using an eye-tracker device (EyeLink 1000, SR Research) and four electrodes (two electrodes placed on the left and right temples, and two electrodes placed above and below the participant's dominant eye).
Cardiac events and parameters
Cardiac events were detected on the right clavicle–left abdomen ECG derivation in all participants. We computed a template of the cardiac cycle by averaging a subset of cardiac cycles, which was then convolved with the ECG time series. R-peaks were identified as peaks of the result of the convolution, normalized between 0 and 1, exceeding 0.6. All other cardiac waves were defined with respect to R-peak. In particular, T-waves were identified as the maximum amplitude occurring within 420 ms after the Q-wave. R-peak and T-wave automatic detection was visually verified for each participant.
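The R-peak detection described above can be illustrated with a minimal sketch. It assumes a sliding dot product (cross-correlation) in place of a true convolution, a caller-supplied template, and a simple local-maximum rule; the function name `detect_r_peaks` and the synthetic data are hypothetical and are not taken from the authors' code.

```python
def detect_r_peaks(ecg, template, threshold=0.6):
    """Return sample indices of candidate R-peaks.

    The template is slid over the ECG, the resulting score is rescaled to
    the 0-1 range, and local maxima above `threshold` are kept.
    """
    n, m = len(ecg), len(template)
    # Sliding dot product of the template with the ECG (valid positions only).
    score = [sum(ecg[i + k] * template[k] for k in range(m))
             for i in range(n - m + 1)]
    lo, hi = min(score), max(score)
    if hi == lo:
        return []
    norm = [(s - lo) / (hi - lo) for s in score]
    # Position of the template's own peak, used to map a window start
    # back to the sample where the R-peak sits.
    offset = max(range(m), key=lambda k: template[k])
    peaks = []
    for i in range(1, len(norm) - 1):
        is_local_max = norm[i] >= norm[i - 1] and norm[i] > norm[i + 1]
        if norm[i] > threshold and is_local_max:
            peaks.append(i + offset)
    return peaks


# Synthetic ECG: three triangular "beats" centred on samples 10, 30 and 50.
template = [0.0, 0.5, 1.0, 0.5, 0.0]
ecg = [0.0] * 60
for centre in (10, 30, 50):
    ecg[centre - 1], ecg[centre], ecg[centre + 1] = 0.5, 1.0, 0.5
r_peaks = detect_r_peaks(ecg, template)
```

On real recordings the template would be the average of a subset of cardiac cycles, and the automatic detection would still need the visual check mentioned above.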
Interbeat intervals (IBIs) were defined for each phase of the trial as the intervals between two consecutive R-peaks. More specifically, we considered for “fixation,” “instruction period,” and “response” phases the two R-peaks around their occurrence. IBIs during 'choice' were based on the two R-peaks preceding response delivery. Interbeat variability was defined as the SD across trials of IBIs in a given trial phase.
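As an illustration of these definitions, the sketch below computes IBIs from R-peak times and the IBI bracketing a trial event. Reading "the two R-peaks around their occurrence" as the last R-peak before and the first R-peak after the event is an assumption, as is the use of the sample SD; all function names are hypothetical.

```python
import statistics


def interbeat_intervals(r_peaks_s):
    """Intervals (s) between consecutive R-peaks."""
    return [b - a for a, b in zip(r_peaks_s, r_peaks_s[1:])]


def ibi_around(event_s, r_peaks_s):
    """IBI between the last R-peak before and the first R-peak after an event.

    Returns None when the event is not bracketed by two R-peaks.
    """
    before = [t for t in r_peaks_s if t <= event_s]
    after = [t for t in r_peaks_s if t > event_s]
    if not before or not after:
        return None
    return after[0] - before[-1]


def ibi_variability(ibis_across_trials):
    """SD across trials of the IBIs measured in one trial phase (sample SD assumed)."""
    return statistics.stdev(ibis_across_trials)


r_peaks_s = [0.0, 0.8, 1.7, 2.5]
cue_ibi = ibi_around(1.0, r_peaks_s)  # interval between the peaks at 0.8 and 1.7 s
```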
Stroke volume was computed according to the following formula (Kubicek et al., 1970; Sherwood et al., 1990):
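The Kubicek equation from the cited references is commonly written as SV = ρ · (L / Z0)² · LVET · (dZ/dt)max, where ρ is the resistivity of blood, L the distance between the measuring electrodes, Z0 the base thoracic impedance, LVET the left ventricular ejection time, and (dZ/dt)max the peak rate of impedance change. The sketch below implements this standard form; whether the authors used exactly this parameterization is an assumption, and the numerical values are illustrative only.

```python
def kubicek_stroke_volume(rho_ohm_cm, length_cm, z0_ohm, lvet_s, dzdt_max_ohm_per_s):
    """Stroke volume (mL) from the standard Kubicek equation.

    Units: ohm*cm * (cm / ohm)**2 * s * (ohm / s) = cm**3 = mL.
    """
    return rho_ohm_cm * (length_cm / z0_ohm) ** 2 * lvet_s * dzdt_max_ohm_per_s


# Illustrative values only (not taken from the study).
sv_ml = kubicek_stroke_volume(
    rho_ohm_cm=135.0,
    length_cm=30.0,
    z0_ohm=25.0,
    lvet_s=0.3,
    dzdt_max_ohm_per_s=1.2,
)
```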
MEG data preprocessing
External noise was removed from the continuous data using MaxFilter algorithm. Continuous data were then high-pass filtered at 0.5 Hz (fourth-order Butterworth filter). Trials (defined as epochs ranging from fixation period to 1 s after response) contaminated by muscle and movement artifacts were manually identified and discarded from further analyses (average, 6% of trials; range, 0–15%).
ICA (Delorme and Makeig, 2004), as implemented in FieldTrip Toolbox (Oostenveld et al., 2011), was used to attenuate the cardiac artifact on MEG data. ICA was computed on MEG data epoched ±200 ms around the R-peak of the ECG in data segments that were free of artifacts, blinks, and saccades >3°. The number of independent components to compute was set to be equal to the rank of the MEG data. Mean pairwise phase consistency (PPC) was estimated for each independent component (Vinck et al., 2010) with the right clavicle–left abdomen ECG derivation signal in the frequency band 0–25 Hz. Components (up to 3) that exceeded 3 SDs from mean PPC were then removed from the continuous data.
To correct for blinks, 2 s segments of data were used to estimate blink and eye-movement components. Mean PPC was then computed with respect to vertical electro-oculogram signal, and components exceeding the mean PPC plus 3 SDs were removed from continuous data. The procedure was iterated until no component was beyond 3 SDs or until three components in total were removed. Stereotypical blink components were manually selected in two participants as the automated procedure failed to identify them.
ICA-corrected data were then low-pass filtered at 25 Hz (sixth-order Butterworth filter).
Trials selection
Trials had to meet the following criteria to be included in all subsequent analyses: no movement artifacts, sum of blinking periods <20% of total trial time, at least one T-peak during instruction period (see HERs section), and reaction time (RT) neither too short (at least 250 ms) nor too long. To identify exceedingly long RTs, we binned the trials of each task in four difficulty levels based on the difference of the two options (i.e., difference in ratings in preference-based choice and difference in contrast for the perceptual ones). Within each difficulty level, for correct and error trials separately, we excluded the trials with reaction times exceeding the participant's mean RT plus 2 SDs.
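The RT criterion can be sketched as follows for the trials of one participant and one task. The sketch assumes that the mean and SD are computed within each difficulty-by-outcome cell and before applying the 250 ms lower bound; the dictionary keys and the function name `keep_by_rt` are hypothetical.

```python
import statistics


def keep_by_rt(trials, min_rt=0.250, n_sd=2.0):
    """Return one boolean per trial: True if the trial passes the RT criteria.

    `trials` is a list of dicts with keys 'rt' (s), 'difficulty' (bin label)
    and 'correct' (bool). The upper bound is mean + n_sd * SD of the RTs in
    the same difficulty bin and outcome (correct vs. error).
    """
    cells = {}
    for idx, trial in enumerate(trials):
        cells.setdefault((trial["difficulty"], trial["correct"]), []).append(idx)

    keep = [False] * len(trials)
    for indices in cells.values():
        rts = [trials[i]["rt"] for i in indices]
        mean_rt = statistics.mean(rts)
        sd_rt = statistics.stdev(rts) if len(rts) > 1 else 0.0
        upper = mean_rt + n_sd * sd_rt
        for i in indices:
            keep[i] = min_rt <= trials[i]["rt"] <= upper
    return keep


# Nine typical trials, one very slow trial, and one anticipatory response.
trials = [{"rt": 0.5, "difficulty": 1, "correct": True} for _ in range(9)]
trials.append({"rt": 3.0, "difficulty": 1, "correct": True})
trials.append({"rt": 0.2, "difficulty": 2, "correct": True})
kept = keep_by_rt(trials)
```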
The mean ± SD number of trials retained per participant was 421.67 ± 43.36.
Heartbeat-evoked responses
Heartbeat-evoked responses were computed on MEG data time locked to T-wave occurring during the instruction period. T-waves had to be at minimum 400 ms distance from the subsequent R-peak. To avoid contamination by transient visual responses or by preparation for the subsequent visual presentation, we only retained T-waves that occurred at least 300 ms after the onset of the instruction cue and 350 ms before the onset of options presentation. If more than one T-wave occurred in this period, HERs for that trial were averaged. HERs were analyzed from T-wave plus 50 ms to minimize contamination by the residual cardiac artifact (Dirlich et al., 1997) after ICA correction.
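The timing constraints on T-waves can be summarized in a short sketch. Times are in seconds; the function name `select_t_waves` is hypothetical, and keeping a T-wave when no later R-peak is available is an assumption of this example.

```python
def select_t_waves(t_waves, r_peaks, cue_onset, options_onset,
                   min_after_cue=0.300, min_before_options=0.350,
                   min_to_next_r=0.400):
    """Keep T-waves that satisfy the three timing constraints.

    A T-wave is kept if it occurs at least `min_after_cue` after the cue,
    at least `min_before_options` before option onset, and at least
    `min_to_next_r` before the following R-peak.
    """
    selected = []
    for t in t_waves:
        if t < cue_onset + min_after_cue:
            continue
        if t > options_onset - min_before_options:
            continue
        later_r = [r for r in r_peaks if r > t]
        if later_r and later_r[0] - t < min_to_next_r:
            continue
        selected.append(t)
    return selected


# Instruction period from 0 to 1.5 s.
valid = select_t_waves(
    t_waves=[0.2, 0.6, 0.9, 1.3],
    r_peaks=[-0.1, 0.3, 1.2],
    cue_onset=0.0,
    options_onset=1.5,
)
```

In this example only the T-wave at 0.6 s survives: 0.2 s is too close to the cue, 0.9 s is too close to the next R-peak, and 1.3 s is too close to option onset.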
We verified that differences in HERs between the two types of decision were truly locked to heartbeats, and that a difference of similar magnitude could not arise by locking the data to any time point of the instruction period. To this end, we created surrogate timings for heartbeats (within the instruction period), to break the temporal relationship between neural data and heartbeats, and computed surrogate HERs. We created 500 surrogate heartbeat datasets by permuting the timings of the real T-waves between trials belonging to the same decision type (i.e., the timing of the T-wave at trial i was randomly assigned to trial j). We then searched for surrogate HER differences between trial types using a cluster-based permutation test (Maris and Oostenveld, 2007; see below). For each of the 500 iterations, we retained the value of the largest cluster statistic [sum(t)] to estimate the distribution of the largest difference that could be obtained by randomly sampling ongoing neural activity during the same instruction period. To assess statistical significance, we compared the cluster statistic from the original data against the distribution of surrogate statistics.
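The surrogate procedure can be sketched in two steps: shuffling T-wave latencies among trials of the same decision type, and comparing the observed cluster statistic with the surrogate distribution. The sketch assumes latencies are expressed relative to cue onset and that the p value is the proportion of surrogate statistics at least as large as the observed one; function names are hypothetical.

```python
import random


def permute_t_wave_timings(latencies, decision_types, rng):
    """Reassign T-wave latencies at random among trials of the same decision type."""
    surrogate = list(latencies)
    by_type = {}
    for idx, decision in enumerate(decision_types):
        by_type.setdefault(decision, []).append(idx)
    for indices in by_type.values():
        shuffled = [latencies[i] for i in indices]
        rng.shuffle(shuffled)
        for i, latency in zip(indices, shuffled):
            surrogate[i] = latency
    return surrogate


def monte_carlo_p(observed_stat, surrogate_stats):
    """Proportion of surrogate statistics at least as large as the observed one."""
    exceed = sum(1 for s in surrogate_stats if s >= observed_stat)
    return exceed / len(surrogate_stats)


rng = random.Random(0)
latencies = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
decision_types = ["pref", "perc", "pref", "perc", "pref", "perc"]
surrogate = permute_t_wave_timings(latencies, decision_types, rng)
```

Each surrogate set of latencies would then be used to recompute HERs and their largest cluster statistic, yielding one value of the surrogate distribution per iteration.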
Statistical analysis
Nonparametric statistical testing of MEG data.
The difference in HERs between preference-based and perceptual trials during instruction presentation was tested for statistical significance using a cluster-based permutation two-tailed t test (Maris and Oostenveld, 2007) as implemented in FieldTrip toolbox (Oostenveld et al., 2011), on magnetometer activity in the time window of 50–300 ms after T-wave. This method defines candidate clusters of neural activity based on spatiotemporal adjacency exceeding a statistical threshold (p < 0.05) for a given number of neighboring sensors (n = 3). Each candidate cluster is assigned cluster-level test statistics corresponding to the sum of t values of all samples belonging to the given cluster. The null distribution is obtained nonparametrically by randomly shuffling conditions labels 10,000 times, computing at each iteration the cluster statistics and saving the largest positive and negative t sum. The Monte Carlo p value corresponds to the proportion of cluster statistics under the null distribution that exceeds the original cluster-level test statistics. Because the largest chance values are retained to construct the null distribution, this method intrinsically corrects for multiple comparisons across time and space. Control analyses involving the clustering procedure were performed with the same parameters.
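The logic of the cluster-based permutation test can be illustrated with a deliberately simplified sketch restricted to the time dimension of a single sensor (no spatial neighbors). It is not the FieldTrip implementation: for this paired design, shuffling condition labels is implemented as random sign flips of each participant's condition difference, the null distribution keeps the largest absolute cluster sum, and the critical t value is supplied by the caller. All names are hypothetical.

```python
import math
import random
import statistics


def t_values(diffs):
    """One-sample t value at each time point; `diffs` is participants x time."""
    n = len(diffs)
    out = []
    for samples in zip(*diffs):
        sd = statistics.stdev(samples)
        out.append(statistics.mean(samples) / (sd / math.sqrt(n)) if sd > 0 else 0.0)
    return out


def max_cluster_stat(tvals, t_crit):
    """Largest |sum(t)| over contiguous, same-sign, supra-threshold time points."""
    best, run, sign = 0.0, 0.0, 0
    for t in tvals:
        s = (t > t_crit) - (t < -t_crit)  # +1, -1, or 0 if below threshold
        if s != 0 and s == sign:
            run += t
        elif s != 0:
            run, sign = t, s
        else:
            run, sign = 0.0, 0
        best = max(best, abs(run))
    return best


def cluster_permutation_p(diffs, t_crit, n_perm=1000, seed=0):
    """Return (observed cluster statistic, Monte Carlo p value)."""
    rng = random.Random(seed)
    observed = max_cluster_stat(t_values(diffs), t_crit)
    count = 0
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            flip = rng.choice((-1, 1))
            flipped.append([flip * x for x in row])
        if max_cluster_stat(t_values(flipped), t_crit) >= observed:
            count += 1
    return observed, count / n_perm


# Eight synthetic participants, ten time points, effect at samples 3 to 5.
diffs = [[(1.0 + 0.1 * s) if 3 <= t <= 5 else 0.0 for t in range(10)]
         for s in range(8)]
observed, p_value = cluster_permutation_p(diffs, t_crit=2.365, n_perm=500)
```

The value 2.365 is the two-tailed critical t for p < 0.05 with 7 degrees of freedom, matching the eight synthetic participants; with 21 participants the corresponding value would be about 2.086.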
The significance of β time series obtained from general linear model (GLM) analyses at the sensor level was obtained using a cluster-based permutation two-tailed t test against zero.
Bayes factor.
We used Bayes factors (BFs) to quantify the evidence in support of the null hypothesis (H0 = no difference between two measures). To this aim, we computed the maximum log-likelihood of a Gaussian model in favor of the alternative hypothesis and for the model favoring the null adjusting the effect size to correspond to a p = 0.05 for our sample size (n = 21 for all analyses except for pupil for which n = 16, and for three ECG derivations for which n = 20). Finally, we computed Bayesian information criterion and the corresponding Bayes factor. As a summary indication, BF < 0.33 provides substantial evidence in favor of the null hypothesis; BF between 0.33 and 3 does not provide enough evidence for or against the null (Kass and Raftery, 1995).
For regression analyses, the Bayes factor was computed using the online calculator tool (http://pcl.missouri.edu/bf-reg) based on the study by Liang et al. (2008).
General linear model on response-locked single trials.
To analyze how task-related variables are encoded in neural activity during a decision, we ran a GLM on baseline-corrected (−500 to −200 ms before instruction presentation) single-trial MEG data time locked to button press. We predicted z-scored MEG activity at each time point and channel using task-relevant experimental variables. For preference-based decisions, we modeled MEG activity as follows:
Similarly, for perceptual decisions we used the following:
This procedure provided us with a time series of β values at each channel that could be tested against zero for significance using spatiotemporal clustering (Maris and Oostenveld, 2007). Once significant clusters encoding task-related variables were identified at the sensor level, we reconstructed the cortical sources corresponding to the sensor-level activity averaged within the significant time window. We modeled source-reconstructed neural activity with the same GLMs to identify the cortical areas contributing the most to the significant sensor-level effect.
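How a β time series arises from such a model can be shown with a reduced sketch for one channel. It assumes a single hypothetical regressor plus an intercept, with both the regressor and the neural activity z-scored across trials, so that the least-squares slope has a closed form; the models used in the study contain several regressors and would be fit by ordinary least squares in the same mass-univariate fashion.

```python
import statistics


def zscore(values):
    """Z-score a sequence using the population SD."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def beta_time_series(activity, predictor):
    """Least-squares slope of z-scored activity on a z-scored predictor, per time point.

    `activity` is trials x time for one channel; `predictor` has one value per trial.
    """
    x = zscore(predictor)
    sxx = sum(v * v for v in x)
    betas = []
    for samples in zip(*activity):
        y = zscore(samples)
        betas.append(sum(a * b for a, b in zip(x, y)) / sxx)
    return betas


# Four trials, three time points: positive, negative, and absent encoding.
predictor = [1.0, 2.0, 3.0, 4.0]
activity = [
    [3.0, -1.0, 1.0],
    [5.0, -2.0, -1.0],
    [7.0, -3.0, -1.0],
    [9.0, -4.0, 1.0],
]
betas = beta_time_series(activity, predictor)
```

Repeating this fit at every channel gives the channel-by-time β maps that are then tested against zero across participants.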
General linear model on posterior right vmPFC.
To quantify the influence of HER in anterior right vmPFC (r-vmPFC) during instructions on subjective value encoding during choice, we modeled the activity of posterior r-vmPFC, encoding subjective value with the following GLM:
To verify that the interaction between subjective value encoding and HER amplitude was specifically time locked to heartbeats and not a general influence of baseline activity in anterior r-vmPFC, we ran the following alternative model to explain posterior r-vmPFC activity:
Anatomical MR acquisition and preprocessing
An anatomic T1 scan was acquired for each participant on a 3 tesla Siemens TRIO (n = 2), a Siemens PRISMA (n = 20), or a Siemens VERIO (n = 2) scanner. Cortical segmentation was obtained by using an automated procedure as implemented in the FreeSurfer software package (Fischl et al., 2004). The results were visually inspected and used for minimum-norm estimation.
Source reconstruction
Cortical localization of neural activity was performed with BrainStorm toolbox (Tadel et al., 2011). After coregistration of individual anatomy and MEG sensors, 15,003 current dipoles were estimated using a linear inverse solution from time series of magnetometers and planar gradiometers (weighted minimum-norm; signal-to-noise ratio, 3; whitening PCA; depth weighting, 0.5) using an overlapping-spheres head model. Current dipoles were constrained to be normally oriented to cortical surface, based on individual anatomy. Source activity was obtained by averaging sensor-level time series in the time windows showing significant effects (difference between HERs and β values different from zero), was spatially smoothed (FWHM, 6 mm), and was projected onto the brain of one template participant (15,003 vertices). Note that sources in subcortical regions cannot be retrieved with the reconstruction method used here.
To assess which cortical areas contributed the most to the effects observed at the sensor level, we ran a parametric two-tailed t test and reported all clusters of activity spatially extending >20 vertices with individual t values corresponding to p < 0.005 (uncorrected for multiple comparisons). We reported the coordinates of vertices with the maximal t value and their anatomic labels according to an Automated Anatomical Labeling atlas (Tzourio-Mazoyer et al., 2002). For clusters falling into prefrontal cortices, we reported the corresponding areas according to the connectivity-based parcellation developed by Neubert et al. (2015).
Pupil data analysis
Pupil data that contained blinks (automatically detected with EyeLink software and extended before and after by 150 ms), saccades beyond 2° and segments in which pupil size changed abruptly (signal temporal derivative exceeding 0.3 a.u.) were linearly interpolated. All interpolated portions of the data that exceeded 1 s were removed from further analyses. Continuous pupil data from each experimental block were then bandpass filtered between 0.01 and 10 Hz (second-order Butterworth filter) and z-scored. Sixteen subjects were retained for pupil analysis; 5 subjects were excluded because of data of too low quality. Pupil analysis was performed in the following two ways: (1) averaged pupil diameter in the same time period used for HER computation (i.e., 300 ms after instruction presentation until 350 ms before options display); and (2) averaged pupil diameter in the time window spanning 1 s before button press until its execution.
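The interpolation step can be sketched as follows. The example handles blink samples only, extends them by 150 ms on each side, interpolates linearly between the neighboring valid samples, and marks gaps longer than 1 s (or gaps touching the edges of the recording) as missing; the function name and the use of `None` for removed samples are choices made for this illustration.

```python
def interpolate_blinks(signal, blink, sfreq, pad_s=0.150, max_gap_s=1.0):
    """Linearly interpolate padded blink segments; long gaps become None."""
    n = len(signal)
    pad = int(round(pad_s * sfreq))
    # Extend every blink sample by `pad` samples on both sides.
    mask = [False] * n
    for i, is_blink in enumerate(blink):
        if is_blink:
            for j in range(max(0, i - pad), min(n, i + pad + 1)):
                mask[j] = True

    out = list(signal)
    i = 0
    while i < n:
        if not mask[i]:
            i += 1
            continue
        start = i
        while i < n and mask[i]:
            i += 1
        end = i  # index of the first valid sample after the gap (or n)
        too_long = (end - start) / sfreq > max_gap_s
        if too_long or start == 0 or end == n:
            for j in range(start, end):
                out[j] = None
        else:
            left, right = out[start - 1], out[end]
            span = end - (start - 1)
            for j in range(start, end):
                out[j] = left + (right - left) * (j - (start - 1)) / span
    return out


# Ten samples at 10 Hz on a linear ramp, with a blink artifact at sample 4.
signal = [float(v) for v in range(10)]
signal[4] = 100.0
blink = [False] * 10
blink[4] = True
clean = interpolate_blinks(signal, blink, sfreq=10, pad_s=0.1)
dropped = interpolate_blinks(signal, blink, sfreq=10, pad_s=0.1, max_gap_s=0.2)
```

Saccade and abrupt-change segments could be merged into the same mask without padding before interpolation.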
Data availability
The custom code and the source data used for the main analyses in this article can be accessed online at https://github.com/DamianoAzzalini/HER-preferences. Participants did not give any formal agreement to publicly share the MEG and physiological data, hence, the raw data supporting the findings are available from the corresponding authors on reasonable request.
Results
Behavioral results
Participants were asked to choose between two simultaneously presented movie titles according either to their subjective preferences or to the visual contrast of movie titles, as indicated by trial-by-trial instructions presented before the alternatives (Fig. 1A). Decision difficulty, operationalized as the difference between the two options (in the preference task: difference between likability ratings measured 1 d before the MEG session (see Materials and Methods); in the perceptual task: difference between contrasts), had the expected impact on behavior in both tasks. Both preference consistency and discrimination accuracy increased, and reaction times decreased for easier decisions (Fig. 1C; preference task, one-way repeated-measures ANOVA, main effect of difficulty: accuracy, F(3,60) = 99.25, p < 10⁻¹⁵; RT, F(3,60) = 41.14, p < 10⁻¹³; perceptual task, main effect of difficulty: accuracy, F(3,60) = 280.2, p < 10⁻¹⁵; RT, F(3,60) = 87.67, p < 10⁻¹⁵). Preference and perceptual decisions were matched in accuracy (two-way repeated-measures ANOVA; main effect of task on accuracy: F(1,20) = 0.38, p = 0.55; interaction task × difficulty, F(3,60) = 2.53, p = 0.07), but preference-based decisions were generally slower (two-way repeated-measures ANOVA; main effect of task: RT, F(1,20) = 57.64, p < 10⁻⁶) and reaction times decreased less rapidly for easier decisions (interaction task × difficulty: F(3,60) = 4.08, p = 0.01). Participants only used task-relevant information (subjective value or objective contrast) to decide, since nonrelevant information could not predict choice (Fig. 1D, Table 1).
Figure 1. Experimental design and behavioral results. A, Trial time course. After a fixation period of variable duration, a symbol (square or diamond) instructed participants on the type of decision to perform on the upcoming movie titles: either a subjective preference-based choice or an objective perceptual discrimination task. Decision type varied on a trial-by-trial basis. The two movie titles appeared above and below fixation and remained on screen until response or until 3 s had elapsed. B, Rationale for data analysis. Left, Interoceptive self-related processes were indexed by HERs computed during the instruction period, before option presentation, by averaging MEG activity locked to the T-waves of the electrocardiogram. Right, Response-locked MEG activity during choice was modeled on a trial-by-trial basis with a GLM to isolate the spatiotemporal patterns of neural activity encoding value. The central question is whether HERs before option presentation and value encoding interact. C, Behavioral results. In both tasks, performance (choice consistency or discrimination accuracy) increased (F(3,60) = 99.25, p < 10⁻¹⁵), and response times decreased (F(3,60) = 41.14, p < 10⁻¹³) for easier choices (i.e., larger difference in subjective value for preference-based decisions, or difference in contrast for perceptual ones). D, Only task-relevant information significantly contributed to choice in both preference-based and perceptual decisions as estimated by logistic regression (two-tailed t test against zero). Each dot represents one participant. **p < 0.01.
Table 1. Logistic regression against choice, for task-relevant and task-irrelevant stimulus information
Neural responses to heartbeats are larger when preparing for preference-based decisions
HERs in vmPFC have been previously shown to index self-related processes (Park et al., 2014; Babo-Rebelo et al., 2016a,b). We thus predicted that HERs would be larger when preparing for subjective, preference-based decisions relying on self-reflection than when preparing for objective, perceptual ones. Importantly, we devised our experimental design to clearly isolate two distinct phases in the trial: the preparatory phase in which participants prepared for the upcoming decision, and in which we analyzed HERs, and the evaluation phase, beginning when options were revealed. This temporal separation permits the analysis of HERs in the absence of concomitant processes, such as evaluation, comparison, and motor preparation, which could hinder the interpretability of the results. Using a nonparametric clustering procedure that corrects for multiple comparisons across time and space (Maris and Oostenveld, 2007), we found that HERs during the instruction period, before option presentation, were indeed larger when participants prepared for preference-based decisions than for perceptual ones (Fig. 2A,B; nonparametric clustering, 201–262 ms after T-wave; sum(t) = 1789; Monte Carlo cluster level, p = 0.037). Averaging cluster activity separately in the two conditions results in an effect size Cohen's d of 1.28. Although the accuracy for the two decision types was matched across participants, we explored whether interindividual differences in accuracy between the two tasks are related to the magnitude of HER difference. Correlating HER amplitude differences in the significant cluster (Fig. 2A,B) with the difference in mean accuracy between the two decision types across subjects revealed that subjects with larger HER differences were also more accurate in the preference-based decisions relative to perceptual ones (robust regression, β = 0.47; t(19) = 2.11, p = 0.05, R2 = 0.19). The cortical regions that mostly contributed to the HER difference (Fig. 2C) were localized as expected in right and left anterior vmPFC (areas 11m and 14 bilaterally; cluster peak at MNI coordinates [1, 57, −21] and [−3, 47, −6]; t = 4.68 and t = 3.76, respectively), but also in the right post-central complex ([32, −22, 56]; t = 5.57) and right supramarginal gyrus ([41, −33, 43]; t = 3.98).
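The logic of the cluster-based permutation test cited above (Maris and Oostenveld, 2007) can be illustrated with a minimal one-dimensional sketch. It is not the pipeline used for the reported analysis: it clusters over time only (no sensor adjacency), it uses random sign flips of the per-participant condition difference as the permutation scheme, and the function names, the threshold of 2.0, the number of permutations, and the simulated data are all hypothetical choices made for illustration.

```python
import math
import random

def t_against_zero(values):
    """One-sample t statistic of per-participant differences against zero."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(var / n) if var > 0 else 0.0

def find_clusters(tvals, thresh):
    """Return (first, last, sum_t) for each run of contiguous same-sign samples with |t| > thresh."""
    found, start, sign = [], 0, 0
    for i, t in enumerate(tvals + [0.0]):  # trailing 0.0 closes an open run
        s = (t > thresh) - (t < -thresh)
        if s != sign:
            if sign != 0:
                found.append((start, i - 1, sum(tvals[start:i])))
            start, sign = i, s
    return found

def largest_cluster(diff, thresh):
    """Cluster with the largest |sum(t)| for diff[participant][sample], or None."""
    n_time = len(diff[0])
    tvals = [t_against_zero([row[t] for row in diff]) for t in range(n_time)]
    return max(find_clusters(tvals, thresh), key=lambda c: abs(c[2]), default=None)

def cluster_permutation(diff, thresh=2.0, n_perm=500, seed=0):
    """Monte Carlo p value of the largest observed cluster under random sign flips."""
    rng = random.Random(seed)
    observed = largest_cluster(diff, thresh)
    if observed is None:
        return None, 1.0
    exceed = 0
    for _ in range(n_perm):
        flipped = []
        for row in diff:
            s = rng.choice((-1.0, 1.0))  # flip the whole participant time course
            flipped.append([s * v for v in row])
        best = largest_cluster(flipped, thresh)
        if best is not None and abs(best[2]) >= abs(observed[2]):
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

# Simulated data (hypothetical): 12 participants, 30 samples, true effect in samples 10-19.
sim = random.Random(1)
diff = [[(1.0 if 10 <= t < 20 else 0.0) + sim.gauss(0.0, 0.3) for t in range(30)]
        for _ in range(12)]
(first, last, sum_t), p_value = cluster_permutation(diff)
print(first, last, round(sum_t, 1), p_value)
```

The statistic compared against the permutation distribution is the summed t value of the largest cluster, which is what sum(t) denotes in the results reported here.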
Figure 2. HERs and subjective value encoding. A, Topography of the significant HER difference between preference-based and perceptual decisions during the instruction period (201–262 ms after cardiac T-wave, Monte Carlo cluster level, p = 0.037). B, Time course of HER (±SEM) for preference-based and perceptual decisions in the cluster highlighted in white in A. The portion of the signal (50 ms after T-wave) still potentially contaminated by the cardiac artifact appears in lighter color. The black bar indicates the significant time window, as established by nonparametric clustering procedure. C, Brain regions mostly contributing to the HER difference between preference-based and perceptual decisions (at least 20 contiguous vertices with uncorrected p < 0.005). D, Topography of the significant encoding of the chosen option subjective value (−580 to −197 ms before motor response) during choice in preference-based trials. E, Time course (±SEM) of the GLM parameter estimate for the chosen option subjective value in the cluster highlighted in white in D. Black bars indicate significant time windows, as established by nonparametric clustering procedure. F, Brain regions mostly contributing to the encoding of the subjective value of the chosen option (at least 20 contiguous vertices with uncorrected p < 0.005). *p < 0.05, **p < 0.01.
The HER difference between subjective preference-based trials and objective perceptual discrimination trials was not accompanied by any difference in ECG activity (paired t test on four ECG vertical derivations: all p ≥ 0.89; all BFs ≤ 0.24; paired t test on four ECG horizontal derivations: all p ≥ 0.34, all BFs ≤ 0.47), in cardiac parameters (interbeat intervals, interbeat interval variability, stroke volume, T-wave mean latency, and variability), or in arousal indices (alpha power and pupil diameter) measured during the instruction period (Table 2). The HER difference is thus due neither to differences in cardiac inputs nor to overall changes in brain states. Note that if the HER difference were driven by differences in task difficulty, as differences in reaction times between perceptual and preference trials might suggest, one would also expect brain states to differ in preparation for easier (perceptual) versus more difficult (preference-based) decisions. However, our control analyses on arousal measures rule out this possibility. In addition, there is no evidence that reaction time differences between the two tasks contribute to HER differences: they account for an extremely low percentage (0.6%) of this variance (robust regression across participants, β = 0.08, t(19) = 0.35, p = 0.73, R2 = 0.006, BF = 0.41). Importantly, the HER difference was time locked to heartbeats and thus did not reflect a baseline difference between conditions (Monte Carlo cluster level, p = 0.026; for details, see Materials and Methods).
Table 2. Cardiac parameters and arousal measures during instructions do not differ between preference-based and perceptual decisions
The subjective value of the chosen option is encoded in medial prefrontal cortices in preference-based decisions
We then identified when and where the subjective value was encoded during preference-based choice. First, we modeled single-trial response-locked neural activity at the sensor level using a GLM (GLM1a; see Materials and Methods), using as regressors the subjective values of the chosen (ChosenSV) and unchosen (UnchosenSV) options, as well as the response button used. Neural activity over frontal sensors encoded the subjective value of the chosen option in two neighboring time windows (βChosenSV; first cluster: −580 to −370 ms before response, sum(t) = −7613; Monte Carlo cluster level, p = 0.004; second cluster: −336 to −197 ms before response, sum(t) = −4405; Monte Carlo cluster level, p = 0.033; Fig. 2D,E). No cluster of neural activity significantly encoded the subjective value of the unchosen option. Motor preparation was encoded later in time in two posterior–parietal clusters of opposite polarities (βButton Press; negative cluster: −287 to −28 ms before response; sum(t) = −10,918; Monte Carlo cluster level, p = 0.003; positive cluster: −373 to −196 ms before response; sum(t) = 5848; Monte Carlo cluster level, p = 0.02).
To identify the cortical regions contributing to the encoding of subjective value at sensor level, we used the same model (GLM1a) to predict source-reconstructed activity averaged in the time window identified at sensor level (−580 to −197 ms before response). The subjective value of the chosen option was encoded as expected in medial prefrontal regions (right posterior vmPFC areas 32 and 24: cluster peak at MNI coordinates [7, 40, 0]; t = 4.52; right dorsomedial PFC (dmPFC) area 8m: cluster peak at MNI coordinates [5, 30, 40]; t = 3.73), as well as in bilateral occipital poles (MNI coordinates [6, −77, 11] and [−1, −85, 16]; t = 4.17 and t = 3.85, respectively) and in mid-posterior left insula (MNI coordinates [−34, −27, 17]; t = 4.48; Fig. 2F).
HER amplitude during instruction interacts with subjective value encoding in right vmPFC during choice on a trial-by-trial basis
We thus show that two different subregions of vmPFC were involved at different moments in a trial: during the instruction period, HERs were larger when participants prepared for preference-based versus perceptual decisions in left and right anterior vmPFC, and, during the choice period, subjective value was encoded in right posterior vmPFC. We then addressed our main question (Fig. 1B): does the amplitude of neural responses to heartbeats during the instruction period interact with the encoding of subjective value during choice in vmPFC?
We tested whether subjective value encoding in right posterior vmPFC was affected by HER amplitude measured in either left or right anterior vmPFC in a two-by-two ANOVA with HER amplitude (high or low, median split across trials) and hemisphere as factors. The ANOVA revealed a significant interaction between HER amplitude and hemisphere (Fig. 3A; F(1,40) = 5.07, p = 0.036; no main effect of HER amplitude: F(1,20) = 2.69, p = 0.12; no main effect of hemisphere: F(1,20) = 0.19, p = 0.67). This interaction corresponded to a significantly stronger subjective value encoding in trials where HERs in right vmPFC were larger during instructions (two-tailed paired t test on βChosenSV in large versus small HER values in right vmPFC: t(20) = 2.52, p = 0.02, Cohen's d = 0.55).
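The median-split comparison described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the reported analysis: value encoding is estimated with a single-regressor least-squares slope rather than the full GLM1a, trials exactly at the median are assigned to the low-HER bin, and all function names and toy numbers are hypothetical.

```python
import math
import statistics

def ols_slope(x, y):
    """Least-squares slope of y on x (model with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def median_split_betas(her, chosen_sv, activity):
    """Value-encoding slope in low-HER and high-HER trials of one participant."""
    cut = statistics.median(her)
    low = [i for i, h in enumerate(her) if h <= cut]
    high = [i for i, h in enumerate(her) if h > cut]

    def beta(idx):
        return ols_slope([chosen_sv[i] for i in idx], [activity[i] for i in idx])

    return beta(low), beta(high)

def paired_t(low_betas, high_betas):
    """Paired t statistic (high minus low) across participants, with its df."""
    d = [h - l for l, h in zip(low_betas, high_betas)]
    n = len(d)
    t = statistics.fmean(d) / (statistics.stdev(d) / math.sqrt(n))
    return t, n - 1

# Toy participant (hypothetical): encoding gain doubles on high-HER trials.
her = [0, 1, 2, 3, 4, 5, 6, 7]
chosen_sv = [1, 2, 3, 4, 1, 2, 3, 4]
activity = [v * (2.0 if h > 3.5 else 1.0) for v, h in zip(chosen_sv, her)]
beta_low, beta_high = median_split_betas(her, chosen_sv, activity)
print(beta_low, beta_high)  # prints 1.0 2.0
```

In a group analysis, `median_split_betas` would be applied to each participant and the two resulting lists of slopes passed to `paired_t`.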
Figure 3. The interaction between HER and value encoding accounts for interindividual variability in choice consistency and for intraindividual trial-by-trial fluctuations of choice precision. A, Parameter estimates for the encoding of the chosen option value in posterior r-vmPFC during choice, in trials where HERs during task preparation were small (yellow) or large (orange) in left anterior vmPFC (left) or in right anterior vmPFC (right). Each dot represents a participant, the horizontal line indicates the mean. The difference in value encoding between large and small HERs (gray bars; error bars indicate SEM) was significant in right vmPFC (ANOVA on value encoding; significant interaction of HER amplitude × hemisphere: F(1,40) = 5.07, p = 0.036; two-tailed paired t test comparing encoding strength for trials with large or small HERs in right vmPFC: t(20) = 2.52, p = 0.02). B, Activity in posterior r-vmPFC is by design explained by the chosen option subjective value, but it is also explained by HER amplitude in anterior r-vmPFC during instructions and by the interaction between HER and value encoding. C, Robust regression shows that the magnitude of the interaction between HER and value encoding positively predicts interindividual variability in choice consistency. D, Behavioral data (dots) and fitted psychometric function (lines) for one representative participant, in trials with a large or small interaction between HER and value encoding. Error bars represent SEM. E, Criterion and slope of the psychometric function in all participants, revealing a significantly steeper slope for trials with large interaction between HER and value-related vmPFC activity (p = 0.037). The decision criterion is unaffected. Black lines represent the parameter estimates of the participant displayed in D. *p < 0.05, **p < 0.01. ns, not significant.
The influence of HER amplitude was specific to value encoding: the amplitude of visual responses evoked by option presentation was unrelated to HER amplitude (Table 3). HER amplitude is thus not a mere index of cortical responsiveness that would interact with any other brain response. HER amplitude in r-vmPFC did not vary with pupil diameter or with alpha power, either during choice or during value encoding (Table 3), indicating that HER fluctuations are not driven by an overall change in brain state. Last, to definitively rule out an influence of attention/arousal, we tested whether the strength of value encoding was modulated by fluctuations in alpha power or pupil diameter measured during instructions. We median-split trials based on either alpha power or pupil diameter, but found no difference in value encoding (alpha power, paired t test on βChosenSV: t(20) = 0.19, p = 0.85, BF = 0.25; pupil diameter, paired t test: t(20) = −0.23, p = 0.82, BF = 0.24).
Table 3. Arousal states and physiological parameters do not differ between low and high HER amplitude in preference trials
The interaction between HER amplitude and subjective value encoding in right vmPFC was further tested using a full parametric approach. Here (GLM2), we predicted the activity of posterior r-vmPFC during choice from the subjective value of the chosen option, the HER amplitude in anterior r-vmPFC during the instruction period, and the interaction between these two terms (Fig. 3B). Since the posterior vmPFC region of interest was defined based on its encoding of the chosen value, the parameter estimate for chosen value was, as expected, large (βChosenSV = −0.06 ± 0.02; two-tailed t test against 0: t(20) = −3.37, p = 0.003). Activity in posterior vmPFC was also predicted by the amplitude of HERs occurring ∼1.5 s earlier, during the instruction period, independently from the chosen value (βHER = 0.04 ± 0.02; two-tailed t test against 0: t(20) = 2.13, p = 0.046), and, importantly, by the interaction between HERs and chosen value (βHER * ChosenSV = −0.05 ± 0.02; two-tailed t test against 0: t(20) = −2.41, p = 0.025). Both the median-split analysis and the parametric model thus reveal a significant interaction between the amplitude of HERs during the instruction period and the neural encoding of subjective value during choice.
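The structure of GLM2 (chosen value, HER amplitude, and their product as regressors of posterior r-vmPFC activity) can be written as a short least-squares sketch. Several points are assumptions rather than details taken from the text: regressors are z-scored before the product is formed, the model is solved with plain normal equations, and the function names and the noise-free toy data are hypothetical. The coefficients used to generate the toy data simply borrow the magnitudes reported above as placeholders.

```python
import statistics

def zscore(x):
    """Standardize a list to zero mean and unit (population) standard deviation."""
    m, s = statistics.fmean(x), statistics.pstdev(x)
    return [(v - m) / s for v in x]

def ols(columns, y):
    """Coefficients of y ~ 1 + columns, solved via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * v for u, v in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]

def fit_glm2(chosen_sv, her, activity):
    """Fit activity ~ ChosenSV + HER + HER x ChosenSV on z-scored regressors."""
    sv, h = zscore(chosen_sv), zscore(her)
    interaction = [s * v for s, v in zip(sv, h)]
    b0, b_sv, b_her, b_int = ols([sv, h, interaction], activity)
    return {"intercept": b0, "ChosenSV": b_sv, "HER": b_her, "HERxChosenSV": b_int}

# Noise-free toy trials (hypothetical) generated from known coefficients.
chosen_sv = [float(i % 5) for i in range(20)]
her = [float((7 * i) % 11) for i in range(20)]
sv_z, her_z = zscore(chosen_sv), zscore(her)
activity = [0.5 - 0.06 * s + 0.04 * h - 0.05 * s * h for s, h in zip(sv_z, her_z)]
betas = fit_glm2(chosen_sv, her, activity)
print({name: round(b, 3) for name, b in betas.items()})
```

A nonzero interaction coefficient means that the slope relating activity to chosen value changes with HER amplitude, which is the multiplicative effect tested here.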
We then verified that the effect on the neural encoding of subjective value was specific to HER amplitude, and not due to an overall baseline shift in anterior r-vmPFC during the instruction period. We ran an alternative model (GLM3) in which the activity in posterior r-vmPFC was predicted from the subjective value of the chosen option, the activity in anterior r-vmPFC averaged during the whole instruction period (i.e., activity was not time locked to heartbeats) and the interaction between the two terms. This analysis revealed that while the subjective value of the chosen option still significantly predicted the activity of posterior r-vmPFC (βChosenSV = −0.05 ± 0.02; two-tailed t test against 0: t(20) = −3.27, p = 0.004), the other two terms did not (activity in anterior r-vmPFC averaged during instruction period: βBL vmPFC = 0.006 ± 0.03; two-tailed t test against 0: t(20) = 0.22, p = 0.83, BF = 0.25; interaction: βBL vmPFC * ChosenSV = −0.03 ± 0.02; two-tailed t test against 0: t(20) = −1.55, p = 0.14, BF = 1.14). The encoding of subjective value is thus specifically modulated by HER amplitude in anterior r-vmPFC and not by an overall baseline shift unrelated to heartbeats in the same region.
The functional coupling between HERs and subjective value encoding was also region specific: HER amplitude in anterior r-vmPFC was unrelated to the strength of value encoding in any other value-related regions (two-tailed paired t test on βChosenSV in large vs small HER values in right dmPFC: t(20) = −0.89, p = 0.38, BF = 0.43; right occipital pole: t(20) = −0.86, p = 0.40, BF = 0.41; left occipital pole: t(20) = −1.60, p = 0.13, BF = 0.81; left posterior insula: t(20)= 1.00, p = 0.33, BF = 0.49). Conversely, HERs outside anterior r-vmPFC did not significantly interact with value encoding in posterior r-vmPFC. Splitting trials based on the amplitude in the two other cortical regions showing differential heartbeat-evoked responses (Fig. 2C) showed no significant modulation of value encoding in right posterior vmPFC (post-central complex: two-tailed paired t test on βChosenSV: t(20)= −1.41, p = 0.17, BF = 0.90; right supramarginal gyrus, two-tailed paired t test: t(20)= −1.96, p = 0.06, BF = 2.41).
We thus show that HER fluctuations in right anterior vmPFC predict the strength of value encoding in right posterior vmPFC. To gain a deeper insight into which factors may drive HER fluctuations, we tested whether single-trial HER amplitude in preference-based trials could be explained by past trial experience. To this end, we modeled single-trial HER amplitude in right anterior vmPFC as a function of characteristics of the previous two trials (switches between tasks, decision difficulty, the overall likeability of items), and as a function of the average value of the chosen option up to the current trial. Our analysis showed that none of these experiment-related variables could account for a significant portion of HER amplitude (no parameter estimate was different from zero, t test vs 0: all t(20) ≤ 1.48, all p ≥ 0.15, all BFs ≤ 1.01). To rule out the possibility that these null results were driven by the large number of predictors used in the same model, we performed the same analysis by modeling HER amplitude as a function of one variable at a time, resulting in 7 different models. None of these 7 models could predict a significant portion of HER amplitude (maximal explained variance across models, 0.75%). We can thus conclude that fluctuations in HER amplitude are not driven by task alternation, past decision difficulties, the likeability of items presented in the two preceding trials, or the average subjective value of the chosen option.
The interaction between HER and value encoding predicts choice consistency
To what extent does the interaction between HER and value encoding in vmPFC predict behavior? We first tested whether the interaction between HER and value encoding relates to interindividual differences in choice consistency (i.e., whether participants selected the movie to which they had attributed the greatest likeability rating the day before). Given the overall high consistency in preference-based decisions, which may reduce our ability to detect significant relationships, we computed mean choice consistency using the top 50% most difficult trials (i.e., trials above median difficulty in each participant). We regressed the model parameter of the interaction between HER and value encoding (βHER * ChosenSV obtained from GLM2) against mean choice consistency across participants. The larger the interaction between HER and value encoding, the more consistent were participants in their choices (βrobust = 0.41, robust regression R2 = 0.22, t(19) = 2.29, p = 0.03; Fig. 3C). In other words, 22% of interindividual difference in behavioral consistency is explained by the magnitude of the interaction between HER and value encoding.
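To illustrate why a robust regression is preferable to ordinary least squares for across-participant relationships of this kind, the sketch below fits a line by iteratively reweighted least squares. The specific estimator is an assumption: this section does not state which weighting function was used, so Huber weights with a median-absolute-residual scale estimate are used here purely as an example, and the function names and toy data (nine well-behaved points and one gross outlier) are hypothetical.

```python
import statistics

def wls_line(x, y, w):
    """Weighted least-squares intercept and slope of y on x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def huber_line(x, y, k=1.345, n_iter=50):
    """Robust line fit by iteratively reweighted least squares with Huber weights."""
    w = [1.0] * len(x)
    for _ in range(n_iter):
        intercept, slope = wls_line(x, y, w)
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, rescaled for normal errors.
        scale = statistics.median(abs(r) for r in resid) / 0.6745
        if scale == 0:
            break
        # Full weight for small residuals, down-weighting beyond k * scale.
        w = [1.0 if abs(r) <= k * scale else k * scale / abs(r) for r in resid]
    return intercept, slope

# Toy data (hypothetical): y = 2x + 1 with small alternating errors and one outlier.
x = [float(i) for i in range(10)]
y = [2.0 * xi + 1.0 + (0.1 if i % 2 == 0 else -0.1) for i, xi in enumerate(x)]
y[9] = 100.0  # gross outlier
_, ols_slope = wls_line(x, y, [1.0] * len(x))
_, robust_slope = huber_line(x, y)
print(round(ols_slope, 2), round(robust_slope, 2))
```

The ordinary least-squares slope is pulled far from 2 by the single outlier, whereas the reweighted fit stays close to it; with 21 participants, one atypical participant could otherwise dominate the estimate.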
The correlation between neural activity and behavior was specific to the interaction parameter: interindividual differences in choice consistency could not be predicted from the model parameter estimate of HER (βHER from GLM2: βrobust = 0.02, R2 = 0.0004, t(19) = 0.09, p = 0.93, BF = 0.39), or from the parameter estimate of value (βChosenSV from GLM2: βrobust = −0.19, R2 = 0.04, t(19) = −0.88, p = 0.39, BF = 0.52). The interaction between HER and subjective value encoding did not covary significantly with either personality traits, assessed through self-reported questionnaires (robust regressions on BDI scores: t(19) = −0.17, p = 0.87, BF = 0.40; STAI scores: t(19) = 0.90, p = 0.38, BF = 0.52; OCI scores: t(19) = −1.23, p = 0.22, BF = 0.70; PDI scores: t(19) = −0.32, p = 0.75, BF = 0.60), or interoceptive ability, assessed with the heartbeat counting task (t(19) = −0.73, p = 0.48, BF = 0.48).
So far, results are based on parameter estimates computed across trials for a given participant. To assess how within-participant trial-by-trial fluctuations in behavior relate to the interaction between HERs and subjective value encoding, we computed the z-scored product of the HER amplitude in anterior r-vmPFC during the instruction period and the value-related activity in posterior r-vmPFC during choice. We then median split the trials according to this product and modeled participants' choices separately for trials with a small versus large interaction (Fig. 3D). When the interaction was large, psychometric curves featured a steeper slope, corresponding to an increased choice precision (two-tailed paired t test: t(20) = −2.24, p = 0.037, Cohen's d = −0.49; after removal of the unique outlier with a slope estimate exceeding 3 SDs from population mean: t(19) = −3.30, p = 0.003, Cohen's d = −0.74; Fig. 3E), while decision criterion was not affected (two-tailed paired t test: t(20) = −1.20, p = 0.25, BF = 0.64; after outlier removal: t(19) = −0.96, p = 0.35, BF = 0.46; Fig. 3E).
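The two steps of this trial-level analysis (splitting trials on the HER-by-value product, then estimating slope and criterion of a psychometric function) can be sketched as below. The functional form is an assumption: a logistic function P = 1 / (1 + exp(−slope × (x − criterion))) of the value difference between options is used for illustration, fitted by maximum likelihood with Newton's method and without safeguards against perfectly separable data. Function names and toy data are hypothetical.

```python
import math
import statistics

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_psychometric(dv, choice, n_iter=25):
    """Maximum-likelihood slope and criterion of P = logistic(slope * (dv - criterion)).

    `choice` holds 0/1 responses or choice proportions for each value difference `dv`.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(dv, choice):
            p = logistic(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p           # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w              # negative Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det   # Newton update
        b1 += (h00 * g1 - h01 * g0) / det
    return b1, -b0 / b1

def split_by_interaction(her, value_activity):
    """True for trials whose z-scored HER x value-activity product exceeds the median."""
    prod = [h * v for h, v in zip(her, value_activity)]
    m, s = statistics.fmean(prod), statistics.pstdev(prod)
    z = [(p - m) / s for p in prod]
    cut = statistics.median(z)
    return [zi > cut for zi in z]

# Toy check (hypothetical): noise-free choice proportions from known parameters.
dv = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
choice = [logistic(1.5 * (d - 0.5)) for d in dv]
slope, criterion = fit_psychometric(dv, choice)
print(round(slope, 3), round(criterion, 3))  # prints 1.5 0.5
```

Fitting the function separately to the trials flagged `True` and `False` by `split_by_interaction` yields one slope and one criterion per bin and participant, which can then be compared across participants with paired tests.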
To control for the specificity of the interaction, we estimated the psychometric function on trials median-split on HER amplitude alone but found no difference in choice precision (two-tailed paired t test: t(20) = 0.41, p = 0.69, BF = 0.27) or in criterion (two-tailed paired t test: t(20) = 0.52, p = 0.61, BF = 0.29). Similarly, median-splitting trials on value-related posterior r-vmPFC activity alone revealed no difference in the psychometric curve (two-tailed paired t test; slope: t(20) = 0.21, p = 0.84, BF = 0.25; criterion: t(20) = −0.57, p = 0.58, BF = 0.30). Therefore, our results indicate that trial-by-trial choice precision is specifically related to the interaction between HERs in anterior r-vmPFC and value-related activity in posterior r-vmPFC.
Altogether, these results indicate that the interaction between HER amplitude and subjective value encoding accounts both for within-subject intertrial variability and for interindividual differences in preference-based choice consistency.
HER effects are specific to preference-based choices
Finally, we tested whether the effect of HER was specific to subjective value or whether it is a more general mechanism interacting with all types of decision-relevant evidence. To this aim, we analyzed perceptual discrimination trials using the same approach as for preference-based trials. First, we modeled the single-trial response-locked MEG sensor-level data using a GLM (GLM1b) with the parameters accounting for choice in the perceptual task (i.e., contrast of the chosen option, ChosenCtrs, and the contrast of the unchosen option, UnchosenCtrs), as well as the response button. The nonparametric clustering procedure revealed the presence of a frontocentral cluster encoding the contrast of the chosen option (βChosenCtrs, −257 to −25 ms before response; sum(t) = 8121; Monte Carlo cluster level, p = 0.005; Fig. 4A,B). We also found two clusters of opposite polarities encoding the contrast of the unchosen option (βUnchosenCtrs: positive cluster, −250 to −79 ms before response; sum(t) = 6182; Monte Carlo cluster level, p = 0.008; negative cluster, −211 to −88 ms before response; sum(t) = −4127; Monte Carlo cluster level, p = 0.04) and two clusters encoding motor preparation (βButton Press; positive cluster: −193 to 0 ms before response; sum(t) = 11,103; Monte Carlo cluster level, p = 0.0004; negative cluster: −222 to 0 ms before response; sum(t) = −10,395; Monte Carlo cluster level, p = 0.0004). The same model (GLM1b) applied to source-reconstructed activity averaged in the time window identified at sensor level (−257 to −25 ms before response) revealed the following four cortical areas encoding the contrast of the chosen option (Fig. 4C): left midcingulate area (peak at MNI coordinates, [11, 27, 43], t = 5.14), left superior frontal gyrus ([15, 7, 71], t = 4.70), bilateral inferior parietal lobule (IPL, right: [47, 55, 53], t = 4.25; left: [44, 42, 49], t = 5.51). Finally, we median split perceptual trials according to the amplitude of HERs in anterior r-vmPFC. 
The encoding strength of the contrast of the chosen option did not interact with heartbeat-evoked response amplitudes in any of the contrast-encoding regions (all p values ≥ 0.26, BF ≤ 0.62; Table 4). The results thus indicate that HER amplitude in r-vmPFC is specifically linked to the cortical encoding of subjective value.
Table 4. Encoding strength for contrast during perceptual decisions does not depend on HER amplitude in anterior r-vmPFC
Figure 4. Neural encoding of perceptual evidence. A, Topography of the significant encoding of the chosen option contrast (−257 to −25 ms before motor response) during choice in objective visual discrimination trials. B, Time course (±SEM) of the GLM parameter estimate for the chosen option contrast in the cluster highlighted in white in A. The black bar indicates the significant time window, as established by nonparametric clustering procedure. C, Brain regions mostly contributing to the encoding of the contrast of the chosen option (at least 20 contiguous vertices with uncorrected p < 0.005). **p < 0.01.
Discussion
We show that preparing for subjective preference-based decisions led to larger responses to heartbeats in vmPFC, as expected from previous studies relating self and HERs (Park et al., 2014; Babo-Rebelo et al., 2016a, b), and in post-central gyrus, a region known to respond to heartbeats (Kern et al., 2013; Azzalini et al., 2019; Al et al., 2020). We further reveal that HERs before option presentation interact specifically with subjective value encoding during choice in vmPFC. The interaction between HERs and value encoding predicted preference-based choices: it was associated with more precise decisions at the single-trial level and predicted interindividual variability in choice consistency over time. No interaction between HER and the encoding of perceptual evidence could be found in the control objective task. Neither the HER difference between the two tasks, nor the interaction between HER and value in the subjective preference task, could be trivially explained by changes in cardiac parameters (heart rate, heart rate variability, ECG, stroke volume) or by changes in arousal state (pupil diameter, alpha power). Altogether, our results reveal how the self-reflection intrinsic to preference-based decisions involves the neural readout of a physiological variable and its integration into the subjective valuation process.
In line with previous studies relating the self with HERs in vmPFC (Park et al., 2014; Babo-Rebelo et al., 2016a,b, 2019), we find that HERs are more pronounced when preparing for the subjective preference task than when preparing for the objective discrimination task. A number of self-related processes might take place specifically when preparing for the subjective preference-based task, such as turning attention inward or preactivating autobiographical memory circuits. Note that HERs cannot be influenced by processes triggered by the movie titles themselves, such as retrieving movie-specific information from memory, since options are not yet available in the instruction period where HERs are measured. We thus interpret the HER difference between the tasks as indexing the degree of self-reflection engaged in the two tasks: necessary to evaluate how an option fits with one's taste and absent, or most certainly reduced, when comparing visual contrasts in perceptual discrimination. This would also account for why HER fluctuations were not associated with any change in the neural encoding of objective perceptual evidence. This result might seem to be at odds with previously reported effects of HERs on sensory detection at threshold (Park et al., 2014; Al et al., 2020). However, as opposed to the perceptual discrimination task used here, sensory detection at threshold is intrinsically subjective, since participants are asked to introspect and report their fluctuating subjective experience in response to physically and objectively constant stimuli (Ress and Heeger, 2003; Campana and Tallon-Baudry, 2013). Last, there was no difference in cardiac parameters between the two tasks, indicating that HER fluctuations relevant for valuation and behavior correspond to changes in the quality of the neural monitoring of cardiac inputs, rather than to changes in cardiac activity.
We successfully retrieved with MEG the cortical valuation network described in the fMRI literature (Levy and Glimcher, 2012; Bartra et al., 2013; Clithero and Rangel, 2014), including dmPFC and vmPFC, during the choice period. These findings are interesting per se, as data on the temporal course of value-based choices in the human prefrontal cortex remain scarce (Hunt et al., 2012, 2015; Polanía et al., 2014; Lopez-Persem et al., 2020). Here, we find that chosen value is robustly encoded in the valuation network from 600 to 200 ms before motor response, with a temporal (but not spatial) overlap with motor preparation. Note that we did not find a robust encoding of the unchosen value, which is in line with electrophysiological recordings in the monkey orbitofrontal cortex where the encoding of the chosen value dominates (Padoa-Schioppa and Assad, 2006; Strait et al., 2014; Hunt et al., 2015; Rich and Wallis, 2016). In the objective discrimination task, relevant perceptual evidence was encoded in, among other regions, posterior parietal cortex, which is consistent with the monkey electrophysiology literature (Shadlen and Newsome, 2001; Heekeren et al., 2008).
To date, vmPFC has been investigated by two separate streams of studies concerning valuation (Fellows, 2006; Delgado et al., 2016; Vaidya et al., 2018) and self (Qin and Northoff, 2011; Andrews-Hanna et al., 2014). Here, we show a functional coupling between these two seemingly separate processes, namely through the interaction between HERs and subjective value representation. These results are in keeping with the proposed integrative role of vmPFC in value-based decision-making (Vaidya and Fellows, 2020) and subjective appraisal (Dixon et al., 2017), but they substantially advance our understanding of the functional role of vmPFC by identifying the mechanism through which this integration occurs. More generally, our results reveal the centrality of the self-reflective process for subjective evaluation and its stability over time.
Because fluctuations in HERs occurred during task preparation, before option presentation, their influence on value encoding might generally pertain to interactions between ongoing activity (during task preparation) and stimulus-evoked activity (in response to option presentation). HERs constitute a very specific subset of such interactions: HERs interacted only with value, but not with visual evoked responses, for instance, and only in vmPFC in the subjective task. Our results are thus in keeping with our initial hypothesis, based on recent findings (Park et al., 2014; Babo-Rebelo et al., 2016a,b, 2019) that HERs are specifically linked to self-related processes, and further reveal the functional relationship with subjective evaluation. However, alternative interpretations of these results should be considered. First, HER fluctuations may reflect changes in vmPFC prestimulus activity, which has been shown to have an additive effect on pleasantness ratings (Abitbol et al., 2015; Lopez-Persem et al., 2020). Here, the effect of HERs on neural encoding of subjective value and on choice consistency is multiplicative. In addition, the link between prestimulus activity and value that we identified is specific to HERs: the prestimulus activity not locked to heartbeats does not interact with value. A second alternative interpretation may be that trial history affects baseline activity (Bouret and Richmond, 2010), thus driving HER fluctuations. However, our control analysis rules out this interpretation: HER amplitude fluctuations are not explained by the average value of the options chosen so far, by the difficulty or overall likeability of previous decisions, or by the recent switches between tasks. Finally, our results cannot easily be interpreted in the framework of the somatic marker hypothesis (Bechara and Damasio, 2005), where changes in peripheral bodily signals are related to performance (Bechara et al., 1997). 
In the present data, fluctuations in HERs affecting behavior are not accompanied by changes in peripheral states. Altogether, HER fluctuations appear related to trial-to-trial fluctuations in self-reflective processes occurring during subjective decision preparation. This interpretation also accounts for the HER difference between the self-reflective valuation task, and the non-self-related perception task. Self-reflection might entail internally oriented attention, but it is not accompanied by changes in arousal, as probed by pupil diameter and alpha power during instructions. To conclude, our results identify that part of the unspecified “neural noise” driving fluctuations in choice consistency (Padoa-Schioppa, 2013; Kurtz-David et al., 2019; Webb et al., 2019) comes from the interaction between interoceptive self-related processes, indexed by neural monitoring of cardiac signals, and the neural encoding of subjective value.
A more detailed mechanistic account of how responses to heartbeats during task preparation influence the subjective valuation process taking place ∼1.5 s later remains to be established. One possibility is that neural responses to heartbeats during decision preparation reflect the precision with which subjects are able to sample internally generated evidence via self-reflection. Computationally, this may translate into a better retrieval of cached subjective values. Our experimental design does not allow us to address this question, requiring further studies to specifically test this mechanism. Future research should also investigate how HERs contribute to subjective valuation during the decision phase, an analysis that, given the very few suitable heartbeats occurring in the decision phase, was precluded in this study. Still, our results pave the way for the quantification of self-related processes in decision-making, an aspect mostly absent from computational models of decision-making despite its relevance to understanding maladaptive decisions in psychiatric disorders (Paulus, 2007; Moeller and Goldstein, 2014; Sui and Gu, 2017).
Decisions on primary goods such as food integrate information about internal bodily states to select options that preserve the integrity of the organism that needs to be fed and protected—the simplest notion of self. We show here that the subjective valuation of cultural goods, which relies on the same cortical valuation network as used for primary goods (Chib et al., 2009; Lebreton et al., 2009; Levy and Glimcher, 2011; McNamee et al., 2013; Sescousse et al., 2013), has inherited a functional link with the central monitoring of physiological variables. Even when choosing between cultural goods that do not fulfill any immediate basic need, the neural monitoring of heartbeats supports self-reflection underlying evaluation, contributing to the precision of subjective decisions and fostering the stable expression of long-lasting preferences that define, at least in part, our identity.
Footnotes
This work was supported by funding from the European Research Council under the European Union's Horizon 2020 research and innovation program (Grant Agreement 670325, Advanced Grant BRAVIUS) and a senior fellowship of the Canadian Institute for Advanced Research program in Brain, Mind and Consciousness to C.T.-B.; by a doctoral fellowship from the Ecole des Neurosciences de Paris Ile de France to D.A.; and by Agence Nationale de la Recherche Grant ANR-17-EURE-0017. We thank Clémence Alméras, Maximilien Chaumon, and Christophe Gitton for assistance in data acquisition.
The authors declare no competing financial interests.
Correspondence should be addressed to Damiano Azzalini at damiano.azzalini@gmail.com or Catherine Tallon-Baudry at catherine.tallon-baudry@ens.psl.eu.