Abstract
The ability to resist current temptations in favor of long-term benefits is a critical human capacity. Despite extensive study of the neural mechanisms of intertemporal choice, how the subjective value of immediate and delayed rewards is represented and compared in the brain remains to be elucidated. The present fMRI study addressed this question by simultaneously and independently manipulating the magnitude of immediate and delayed rewards in an intertemporal decision task, combined with univariate analysis and multiple voxel pattern analysis. We found that activities in the posterior portion of the dorsal medial prefrontal cortex (DmPFC) were modulated by the value of immediate options, whereas activities in the adjacent anterior DmPFC were modulated by the subjective value of delayed options. Brain signal change in the ventral mPFC was positively correlated with the “relative value” (the absolute difference of subjective value between two intertemporal alternatives). In contrast, the dorsal anterior cingulate cortex activity was negatively correlated with the relative value. These results suggest that immediate and delayed rewards are separately represented in the dorsal mPFC and compared in the ventral mPFC to guide decisions. The functional dissociation of posterior and anterior DmPFC in representing immediate and delayed reward is consistent with the general structural and functional architecture of the prefrontal cortex and may provide a neural basis for humans' unique capacity to delay gratification.
Introduction
Many daily-life decisions involve trade-offs between short- and long-term consequences. The ability to delay gratification (i.e., choose later-larger rewards over sooner-smaller rewards) has often been studied using intertemporal choice paradigms (Ainslie, 1975; Loewenstein, 1988). Cumulative evidence suggests that several mechanisms may affect intertemporal decisions (Peters and Büchel, 2011), including valuation of immediate and delayed options (Hariri et al., 2006; Jimura et al., 2013), cognitive control during decisions (Luo et al., 2009; Figner et al., 2010), and future-oriented thinking (Peters and Büchel, 2010; Cooper et al., 2013).
Of particular interest is how different types of rewards are represented and compared to guide decisions. Considerable evidence has implicated the orbitofrontal and medial prefrontal cortices in representing the value of a wide range of rewards, such as taste, olfactory, oral texture, somatosensory, visual, social, and monetary stimuli (Kringelbach, 2005; Grabenhorst and Rolls, 2011). In particular, sensory-specific values are initially represented in a distributed manner (Sescousse et al., 2010; McNamee et al., 2013) and then assigned into a “common currency” using a common neural scale to guide decisions (Montague and Berns, 2002; Rangel et al., 2008).
Based on the phylogenetics and ontogenesis of the prefrontal cortex (PFC) (Semendeferi et al., 2001; Wise, 2008), it is posited that reward stimuli that are more complex and abstract depend on the more anterior mPFC, whereas reward stimuli that are more basic or tangible depend on the more posterior mPFC (Kringelbach and Rolls, 2004; Sescousse et al., 2013). Consistently, preliminary evidence has revealed an increased engagement of the anterior mPFC for decisions relying on more distant outcomes (Koritzky et al., 2013). During intertemporal decisions, however, the neural representations of immediate and delayed reward are still obscure. For example, although some studies have found that the mPFC, the posterior cingulate cortex (PCC), and the ventral striatum (VS) showed greater activity when an immediate option was present than when both options were delayed (McClure et al., 2004, 2007), other studies found that this system also tracked the subjective value of the delayed option when the value of the immediate option was fixed (Kable and Glimcher, 2007, 2010; Peters and Büchel, 2009). A later study suggested these regions might represent the differences in subjective value of the two options (i.e., relative value) (Sripada et al., 2011).
In these studies, immediate value, delayed value, and relative value were not independently manipulated; thus, these different value signals in the brain were not clearly dissociated. Furthermore, the univariate approach used in these studies is not well suited for the detection of fine spatial separation of value representation. In our current project, we sought to build on past work, addressing both these limitations. We 1) parametrically varied the magnitude of immediate and delayed rewards with full orthogonalization and 2) used multiple voxel pattern analysis (MVPA), which provides greater sensitivity for fine spatial separation (Clithero et al., 2009; Jimura and Poldrack, 2012; McNamee et al., 2013) and is thus better powered to detect dissociations of value representation in neighboring tissue.
Materials and Methods
Subjects.
Twenty-eight subjects (9 females; age, 22.07 ± 1.9 years) participated in the fMRI study. Three additional volunteers were recruited but excluded from analyses because of either misunderstanding of task instructions (one subject) or high discounting rate that caused collinearity of regressors (two subjects, see details below). None of the subjects had neurological or psychiatric history. Informed written consent was obtained from subjects before experiments. This study was approved by the institutional review board of the National Key Laboratory of Cognitive Neuroscience and Learning at Beijing Normal University in China.
Intertemporal choice task.
Figure 1 depicts the stimuli and the experimental design. On each trial, subjects were asked to choose between an immediate reward and a future reward (always 120 d later). To allow for separate estimates of neural responses to immediate and delayed rewards, the sizes of the immediate and delayed rewards were manipulated independently, with the immediate reward ranging from RMB 25 to 100 (∼4–16 US dollars; 16 levels in RMB 5 increments), and the delayed reward ranging from RMB 38 to 150 (16 levels in RMB 7 or 8 increments). These ranges were chosen based on a pilot study conducted on 11 additional subjects (8 men; age, 22.12 ± 1.72 years), matched to those in the fMRI study in terms of age and gender. The pilot test used the standard Monetary Choice Questionnaire (Kirby et al., 1999), followed by a staircase approach to identify the discounting ratio (f) at the delay of 120 d. This test suggested that the delayed reward delivered in 120 d should be ∼1.5 times the immediate reward in order for the median participant to similarly value the two alternatives. Thus, we expected that, for most participants, this range of rewards would elicit a wide range of preferences, from strong acceptance to strong rejection of immediate over delayed alternatives.
An illustration of the event-related experimental design. During each trial, an immediate reward and a delayed reward to be paid in 120 d were simultaneously presented for 3 s on either side of the screen. Participants were asked to choose one of the options based on their preference. The chosen value turned to yellow after their choice. The value of the immediate and delayed rewards for each trial were sampled from the immediate/delayed matrix, shown here as one sample trial. A decision from each cell in this 16 × 16 matrix was presented during scanning, but the data were collapsed into a 4 × 4 matrix for analysis. The interstimulus interval (ISI) is jittered to optimize design efficiency.
There were a total of 256 trials with all the possible combinations of each immediate and delayed reward, which were divided into three runs. An event-related design was used in this fMRI study, and the timing and order of stimulus presentation were optimized for estimation efficiency using optseq2 (Dale, 1999). For each trial, the immediate and delayed options were presented on either side of the screen, randomized across trials. Subjects were instructed to respond as quickly as possible within the 3 s trial duration. If no response was made within this window, the task continued, but those trials were modeled as a separate regressor of no interest in the analyses. To encourage participants to reflect on the subjective attractiveness of each decision rather than revert to a fixed decision rule, we asked them to indicate one of four responses to each decision (strongly choose the immediate option, weakly choose the immediate option, weakly choose the delayed option, strongly choose the delayed option) using a four-button response box. The chosen option turned yellow after the subjects' response.
At the end of the experiment, one trial was randomly chosen as bonus, and the payment was realized. If the immediate option was randomly selected, subjects were paid immediately; otherwise, subjects were paid 120 d later. For both types of outcome, the payment was directly deposited into the subjects' cell phone account, and they would be notified by a short message when the money was paid. In addition, all participants received noncontingent compensation of RMB 50.
Functional imaging procedure.
Imaging data were acquired on a 3T Siemens MAGNETOM, a TIM Trio system, in the MRI Center at Beijing Normal University. Subjects lay supine on the scanner bed and viewed visual stimuli back-projected onto a screen through a mirror attached to the head coil. Foam pads were used to minimize head motion. Stimulus presentation and timing of all stimuli and responses were achieved using MATLAB (MathWorks) and Psychtoolbox (www.psychtoolbox.org) on a PC. Participants' responses were collected online using an MRI-compatible button box.
Functional scanning used a z-shim gradient echo EPI sequence with PACE (prospective acquisition correction). This specific sequence is designed to reduce signal loss in the prefrontal and orbitofrontal areas. The PACE option can help reduce the impact of head motion during data acquisition. The following parameters were used: TR = 2000 ms; TE = 25 ms; flip angle = 90°; 64 × 64 matrix size with a resolution of 3 × 3 mm². Thirty-one 4 mm axial slices were used to cover the whole cerebrum and most of the cerebellum with no gap. The slices were tilted ∼30° clockwise from the AC–PC plane to obtain better signals in the orbitofrontal cortex. The anatomical T1-weighted structural scan was acquired using an MPRAGE sequence (TI = 800 ms; TR = 2530 ms; TE = 3.39 ms; flip angle = 10°; 208 sagittal slices; 256 × 256 matrix size with a spatial resolution of 1.3 × 1 × 1.3 mm³).
Behavioral data analysis.
Statistical analyses of the behavioral data were performed using MATLAB (MathWorks). Logistic regression was performed on the behavioral data after collapsing strong/weak responses into immediate and delayed categories, with the sizes of the immediate and delayed rewards as independent variables and choice of the immediate or delayed option as the dependent variable. This analysis was performed separately for each participant, collapsing over scanning runs. The temporal discounting factor (f) was computed as follows: f = βdelayed/βimmediate, where βimmediate and βdelayed are the unstandardized regression coefficients for the immediate and delayed variables, respectively. We assumed a simple hyperbolic discounting function, V = A/(1 + k × D), where V is the time-discounted value of a delayed reward, A is the amount, D is the delay, and k is a fit parameter often referred to as the discounting rate. Because the discounted value at the 120 d delay is V = f × A, k can be calculated using the following equation: k = (1/f − 1)/120, with larger k indicating steeper temporal discounting. It should be noted that, as the delay was fixed to 120 d, the selection of the temporal discounting function (e.g., hyperbolic vs exponential) did not affect the subjective value calculation, which is determined by the discounting factor f. However, interactions between magnitude and discounting are not captured by our model (Green et al., 1997). In addition, because only one delay was used, the immediate and delayed “amount” regressors can equally be considered as immediate and delayed “value” regressors.
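As an illustration of the relationship between f and k at a single fixed delay, the following minimal sketch converts a discounting factor into the implied hyperbolic rate and back into a discounted value. The function names are hypothetical and are not taken from the study's MATLAB code; the example value f = 0.78 is the group median reported in the Results.

```python
def discount_rate_from_factor(f, delay_days=120.0):
    """Hyperbolic rate k implied by discounting factor f at one delay.

    Solves f = 1 / (1 + k * D) for k, i.e. k = (1/f - 1) / D.
    """
    if f <= 0:
        raise ValueError("f must be positive")
    return (1.0 / f - 1.0) / delay_days


def discounted_value(amount, k, delay_days=120.0):
    """Hyperbolically discounted value V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay_days)


k_median = discount_rate_from_factor(0.78)        # roughly 0.00235 per day
value_of_100 = discounted_value(100.0, k_median)  # recovers f * A, about 78
```

Because only one delay enters the calculation, any other one-parameter discounting function would yield the same discounted value at 120 d for a given f; only the reported rate parameter would differ.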
fMRI data analysis.
Image preprocessing and statistical analyses were performed by using the FMRI Expert Analysis Tool (version 5.98; part of the FSL package; http://www.fmrib.ox.ac.uk/fsl). The first four volumes before the task were automatically discarded by the scanner to allow for T1 equilibrium. The remaining images were then realigned to correct for head movements. Data were spatially smoothed by using a 5 mm full width at half maximum Gaussian kernel and filtered in the temporal domain using a nonlinear high-pass filter with a 90 s cutoff. EPI images were first registered to the MPRAGE structural images and then into standard (MNI) space, using affine transformations (Jenkinson and Smith, 2001). Registration from MPRAGE structural images to standard space was further refined using FNIRT nonlinear registration (Andersson et al., 2007). Statistical analyses were performed in the native image space, with the statistical maps normalized to the standard space before higher-level analysis.
The data were modeled at the first level using a general linear model within FSL's FILM module. Five parametric regressors were included during the decision-making period starting from presentation of intertemporal alternatives and ending when subjects responded: (1) the overall task regressor (1 for each trial); (2) the size of the immediate reward; (3) the size of the delayed reward; (4) the relative value, calculated using this formula: relative value = abs(immediate − delayed × f) (FitzGerald et al., 2009; Lim et al., 2011); and (5) reaction time (RT) (Sripada et al., 2011). The RT variable was included to separate relative value from the time-on-task effect, as behavioral evidence suggests that decisions take longer on trials with lower relative value. For all the models, each regressor (except for the task regressor) was first demeaned and normalized to the same range (−1 to 1) and then convolved with the double-gamma canonical hemodynamic response function. Trials with no valid response were modeled as a separate regressor of no interest.
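The construction of the relative value regressor can be sketched as follows. This is an illustrative example only: the trial amounts and the value of f are hypothetical, and the rescaling shown (dividing the demeaned values by their largest absolute deviation) is one assumed way to obtain a demeaned regressor bounded by −1 and 1, since the exact normalization is not specified here. Convolution with the hemodynamic response function is omitted.

```python
def relative_value(immediate, delayed, f):
    """|immediate - delayed * f| for each trial."""
    return [abs(i - d * f) for i, d in zip(immediate, delayed)]


def demean_and_rescale(values):
    """Center on zero, then divide by the largest absolute deviation.

    Assumption: one possible normalization to the range [-1, 1]; the
    procedure used in the study may differ.
    """
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    peak = max(abs(c) for c in centered)
    if peak == 0:
        return [0.0 for _ in centered]  # constant regressor carries no signal
    return [c / peak for c in centered]


# Three hypothetical trials (amounts in RMB) and a hypothetical f.
immediate = [25.0, 50.0, 100.0]
delayed = [150.0, 75.0, 38.0]
rv = relative_value(immediate, delayed, f=0.78)
rv_regressor = demean_and_rescale(rv)
```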
It is worth noting that, whereas immediate and delayed amount regressors were orthogonal, the relative value regressor was not orthogonal to the individual reward regressors. The correlations between immediate value and relative value were especially high for subjects with steep temporal discounting. Because of this concern, we removed two subjects with an f smaller than one-third (for them, the correlation was 0.98 and 0.88, respectively). As expected, there were also high correlations between relative value and RT (−0.41 ± 0.11). We deliberately did not orthogonalize the regressors, so that each parameter estimate reflected only the variance unique to that regressor. This provided a conservative estimate of each parameter's association with the MRI signal.
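The collinearity concern can be illustrated with a small simulation of the 16 × 16 design. This is a sketch under stated assumptions, not a reanalysis: the immediate levels follow the task description, but the individual delayed levels are not listed in the text, so they are assumed here to be 1.5 times the immediate levels rounded half up (which yields 16 levels from RMB 38 to 150 in steps of 7 or 8). All function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# 16 immediate levels: RMB 25 to 100 in steps of 5 (from the task description).
immediate_levels = list(range(25, 101, 5))
# Assumption: delayed levels are 1.5 x immediate, rounded half up.
delayed_levels = [math.floor(1.5 * v + 0.5) for v in immediate_levels]

# Full factorial design: every immediate level paired with every delayed level.
trials = [(i, d) for i in immediate_levels for d in delayed_levels]
immediate = [i for i, _ in trials]
delayed = [d for _, d in trials]


def immediate_vs_relative_value(f):
    """Correlation between immediate amount and |immediate - delayed * f|."""
    relative = [abs(i - d * f) for i, d in trials]
    return pearson(immediate, relative)


r_design = pearson(immediate, delayed)        # zero by construction of the design
r_steep = immediate_vs_relative_value(0.1)    # steep discounter
r_median = immediate_vs_relative_value(0.78)  # median discounter in this sample
```

Under these assumptions, a steep discounter's relative value is almost entirely determined by the immediate amount, so the two regressors are close to collinear, whereas for a discounter near the sample median the correlation is much weaker.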
Furthermore, we applied several simple models with only the task parameter and one of the value or RT regressors mentioned above, to examine the effect of each individual regressor. Because previous studies have suggested that the ventromedial prefrontal cortex (VmPFC) might encode the summed value (FitzGerald et al., 2009; Sripada et al., 2011) and/or chosen value (Kable and Glimcher, 2009), two additional models to examine the neural responses to summed value and chosen value were also included. This final regressor, value of chosen alternative, is very similar to a regressor set to the larger alternative (Sripada et al., 2011).
A second-level analysis was performed using a fixed-effect model in which all three functional runs were combined within individual participants. These contrast results were then entered into a random-effect model for group analysis using an ordinary least square model. Group images were thresholded using cluster detection statistics, with a height threshold of z > 2.3 and a cluster probability of p < 0.05, corrected for whole-brain multiple comparisons using Gaussian Random Field Theory.
Estimating signal change for each level of immediate and delayed reward.
We fitted two additional models to estimate brain activation for each of the 16 levels of immediate and delayed rewards, respectively. For each model, all trials with the same immediate (or delayed) reward were grouped as separate regressors (16 in total), and the delayed (or immediate) value, the relative value, and the RT were included as covariates of no interest. In one version of the model, we used the smoothed data, and the results were used to fit the value function. In another version, the unsmoothed data were used, and the results were used for the MVPA.
It should be noted that, to accurately estimate the neural response of single trials, previous studies have generally used a slow event-related design with a long intertrial interval (Hampton and O'Doherty, 2007; McNamee et al., 2013). Moreover, different types of value were separately presented, and the subjective value was estimated using procedures, such as “willingness to pay” (McNamee et al., 2013). However, given evidence that valuation in a choice context differs from valuation of single rewards (Luo et al., 2009), we opted for a parametric design that allowed us to identify neural correlates of immediate and delayed rewards that were always presented as alternatives. Moreover, by using a fast event-related design, we could present all possible pairs of immediate and delayed reward at relatively wide ranges and small steps, which we anticipated would improve the resolution and accuracy of reward representation. A similar method has been used to examine the neural representation of gain and loss in risky decision-making (Jimura and Poldrack, 2012).
One potential limitation of this design is that the short interstimulus interval might not allow us to completely separate visual stimulation and valuation. With the use of textual stimulation and a multivariate regression model that specifically identified spatial maps within which variance predicted the amounts of each reward, but not the category of immediate or delayed reward (see below), we believe the probability that visual stimulation contributed to our findings is low.
Support vector regression (SVR) analysis.
High-dimensional regression MVPA was performed using a searchlight procedure with a 3-voxel radius. Epsilon-insensitive SVR (Drucker et al., 1997) with a linear kernel, as implemented in PyMVPA (http://www.pymvpa.org) (Hanke et al., 2009), was used to estimate the target reward amount (Kahnt et al., 2011, 2014; Jimura and Poldrack, 2012; He et al., 2013). This method allowed us to decode continuous variables; thus, there was no need to divide reward amount into high versus low groups (Kahnt et al., 2010). Threefold cross-run validation was used. For each level of the immediate or delayed reward, test and training data were normalized (i.e., mean subtracted out and then divided by SD) across voxels within each region of interest (i.e., searchlight) (Misaki et al., 2010). This procedure allowed evaluation of the pattern of activity across voxels without contamination from the mean signal differences within the searchlight. Based on a previous study (Jimura and Poldrack, 2012), the SVR cost parameter was set to 0.01 and the ε parameter was set to 0.1.
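Two of the steps described above, the across-voxel normalization of each activity pattern and the cross-run validation scheme, can be sketched in plain Python. This is not PyMVPA code and the SVR fit itself is omitted; the function names are hypothetical, and the use of the population SD is an assumption, as the text does not specify which SD estimator was used.

```python
import math


def zscore_across_voxels(pattern):
    """Normalize one searchlight pattern: subtract its mean, divide by its SD.

    Assumption: population SD (divide by n); removes the mean signal of the
    searchlight so that only the spatial pattern remains.
    """
    n = len(pattern)
    mean = sum(pattern) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in pattern) / n)
    if sd == 0:
        return [0.0] * n  # flat pattern carries no spatial information
    return [(v - mean) / sd for v in pattern]


def leave_one_run_out(run_labels):
    """Yield (train_indices, test_indices), holding out each run once."""
    for held_out in sorted(set(run_labels)):
        train = [i for i, r in enumerate(run_labels) if r != held_out]
        test = [i for i, r in enumerate(run_labels) if r == held_out]
        yield train, test


# Hypothetical example: six patterns from three runs, three voxels each.
patterns = [[1.0, 2.0, 3.0], [2.0, 2.0, 5.0], [0.0, 1.0, 2.0],
            [4.0, 1.0, 1.0], [3.0, 3.0, 6.0], [1.0, 0.0, 2.0]]
runs = [1, 1, 2, 2, 3, 3]
normalized = [zscore_across_voxels(p) for p in patterns]
folds = list(leave_one_run_out(runs))
```

With three runs this yields three folds, each training on two runs and testing on the third.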
Voxelwise accuracy of SVR prediction was then calculated, defined as the z-transformed Pearson's correlation coefficient between actual and predicted amount of the immediate or delayed reward for the left-out BOLD period. Group analysis used ordinary least square models, to facilitate the comparison with the univariate analyses. Group images were thresholded using cluster detection statistics, with a height threshold of z > 2.3 and a cluster probability of p < 0.05, corrected for whole-brain multiple comparisons using Gaussian Random Field Theory.
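The accuracy measure described above can be written compactly. The following is a minimal sketch with hypothetical function names and example values; clipping the correlation slightly inside (−1, 1) and returning zero when either input has no variance are assumptions added here to keep the transform finite, not steps reported in the text.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: treat a constant prediction as uninformative
    return sxy / math.sqrt(sxx * syy)


def prediction_accuracy(actual, predicted, eps=1e-12):
    """Fisher z-transformed Pearson correlation of actual vs predicted."""
    r = pearson(actual, predicted)
    r = max(-1.0 + eps, min(1.0 - eps, r))  # keep atanh finite at r = +/-1
    return math.atanh(r)


# Hypothetical held-out reward levels and their predictions (r = 0.5).
actual = [1.0, 2.0, 3.0]
predicted = [1.0, 3.0, 2.0]
z = prediction_accuracy(actual, predicted)
```

The Fisher transform makes the per-subject accuracy values approximately normally distributed, which suits the ordinary least square group model.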
ROI analyses.
ROIs were created by extracting the clusters showing significant modulation of immediate value (pDmPFC), delayed value (aDmPFC, frontal pole), and relative value (VmPFC, PCC, and ventral striatum). ROI analyses were performed by extracting parameter estimates (betas) of each event type from the fitted model and averaging across all voxels in each significant cluster for each subject. Percentage signal changes were calculated using the following formula: [contrast image/(mean of run)] × ppheight × 100%, where ppheight is the peak height of the hemodynamic response versus the baseline level of activity (Mumford, 2007).
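The percentage signal change formula can be expressed as a short function. This is an illustrative sketch with hypothetical names and values; converting each voxel before averaging across the ROI is an assumption, as the text does not state the order of the two steps.

```python
def percent_signal_change(contrast, run_mean, ppheight):
    """[contrast / (mean of run)] x ppheight x 100."""
    return contrast / run_mean * ppheight * 100.0


def roi_percent_signal_change(voxel_contrasts, voxel_run_means, ppheight):
    """Average percentage signal change over the voxels of one ROI.

    Assumption: each voxel is converted first and the results are averaged.
    """
    changes = [percent_signal_change(c, m, ppheight)
               for c, m in zip(voxel_contrasts, voxel_run_means)]
    return sum(changes) / len(changes)


# Hypothetical two-voxel ROI.
roi_change = roi_percent_signal_change([2.0, 4.0], [1000.0, 1000.0], ppheight=0.5)
```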
Results
Behavioral results
We first analyzed the RT data and the probability of choosing delayed rewards. As expected, the probability of choosing delayed rewards increased with the amount of delayed rewards (Fig. 2A). Subjects responded faster when the value differences of the two options (i.e., relative value) increased (Fig. 2B).
A, Color-coded heatmap of probability of choosing delayed rewards at each level of immediate/delayed rewards combination. Red represents higher willingness to accept the delayed rewards; blue represents lower willingness to accept delayed rewards. B, Color-coded heatmap of reaction times. Red represents slower reaction times; blue represents faster reaction times.
We then assessed behavioral sensitivity to immediate rewards and delayed rewards by fitting a logistic regression to each participant's choices collected in the scanner, using the amount of the immediate and delayed rewards as independent variables. The accuracy of the model was determined by using the following equation: y = 1/(1 + e^(−f(x))), where f(x) represents the regression function (not to be confused with the discounting factor f) and y is the model prediction. The mean accuracy of prediction was 92.23% (SD 3.83%). The discounting factor (f) from this analysis had a median of 0.78 and ranged from 0.42 to 0.96 (the two excluded subjects had f values of 0.32 and 0.1, respectively); the corresponding median k was 0.0023, ranging from 0.0003 to 0.0115.
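The prediction step of the logistic model can be sketched as follows. The coefficients and trials are hypothetical, and classifying a trial as a delayed choice when the predicted probability reaches 0.5 is an assumption, as the text does not state how predictions were converted into an accuracy score.

```python
import math


def probability_delayed(immediate, delayed, b0, b_imm, b_del):
    """Logistic prediction y = 1 / (1 + exp(-(b0 + b_imm*I + b_del*D)))."""
    z = b0 + b_imm * immediate + b_del * delayed
    return 1.0 / (1.0 + math.exp(-z))


def classification_accuracy(trials, chose_delayed, coefs, threshold=0.5):
    """Fraction of trials on which the thresholded prediction matches choice."""
    hits = 0
    for (imm, dly), choice in zip(trials, chose_delayed):
        predicted = probability_delayed(imm, dly, *coefs) >= threshold
        hits += int(predicted == bool(choice))
    return hits / len(trials)


# Hypothetical coefficients (intercept, immediate, delayed) and four trials.
coefs = (0.0, -0.10, 0.078)
trials = [(100, 100), (25, 150), (60, 75), (50, 38)]
chose_delayed = [0, 1, 1, 0]
accuracy = classification_accuracy(trials, chose_delayed, coefs)
```

In this toy example the third trial lies close to the indifference point and is misclassified, which is where prediction errors of a well-fitting model would be expected to concentrate.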
Imaging results
Brain regions representing the immediate rewards
The present study aimed to examine neural representations of immediate and delayed rewards, using both univariate analysis and MVPA. Using MVPA, we found that activity in the posterior portion of the dorsomedial prefrontal cortex (pDmPFC) (x = 6, y = 34, z = 30 in MNI coordinates, Z = 4.42) correlated with the amount of immediate rewards (Fig. 3A). Univariate analysis, however, failed to detect any activity in the mPFC, even after lowering the threshold to z > 1.96, uncorrected. The brain region showing significant univariate modulation by the amount of immediate rewards was the left superior temporal gyrus (STG) (xyz: −58, −48, 20, Z = 4.21).
Brain regions modulated by immediate rewards. A, Brain region showed sensitivity to the magnitude of immediate rewards in multivariate analysis. B, Group-level t values of each voxel in the pDmPFC cluster were plotted for MVPA against univariate analysis. Each dot indicates one voxel. Black slanted lines indicate where the two t values were equivalent. U_IM, Univariate statistics of immediate values; M_IM, multivariate statistics of immediate values.
To directly compare the univariate and MVPA results, we plotted the t statistics of each voxel in the pDmPFC area. As shown in Figure 3B, there was a significant correlation between the t statistics of the two analyses (r = 0.125, p = 0.01), although the t value for the univariate analysis was overall lower.
Brain regions representing delayed rewards
Next, we examined brain regions whose activities were correlated with the magnitude of delayed rewards. Univariate analysis revealed that the activities in the anterior portion of the dorsomedial prefrontal cortex (aDmPFC) (xyz: −6, 50, 22, Z = 3.31) (Fig. 4A) and the right lateral frontal pole cortex (LFPC) (xyz: 20, 66, 18, Z = 3.46) (Fig. 4C) were negatively correlated with the amount of delayed rewards. Other brain regions showing similarly negative correlation included the right superior frontal gyrus, left STG, right middle temporal gyrus (MTG), right supplementary motor area, and the left cerebellum (Table 1). Focusing on the aDmPFC and LFPC, ROI analysis confirmed the linear decreases in BOLD response as the amount of delayed rewards increased (Fig. 4B,D).
The brain regions modulated by delayed rewards. The aDmPFC (A) and the LFPC (C) showed sensitivity to the delayed rewards in the univariate analysis. B, D, Scatter plots of percentage signal change as a function of delayed rewards in aDmPFC and LFPC, respectively. A linear function was fitted to the data. The aDmPFC (E) and LFPC (G) also showed sensitivity to delayed rewards in the multivariate analysis. F, H, Scatter plots of the group-level t value of multivariate against univariate analysis for delayed rewards in the mPFC and the PFC, respectively. U_DL, t statistics of delayed reward condition by univariate analysis; M_DL, t statistics of delayed reward condition by multivariate analysis.
Brain regions showing significant effect in univariate analysis
MVPA results also revealed that activities in the aDmPFC (xyz: −4, 40, 32, Z = 3.39) and the LFPC (xyz: 50, 42, 10, Z = 4.23) could predict the amount of delayed reward (Fig. 4E,G). Other regions included the right inferior frontal gyrus, right precuneus, left supramarginal gyrus, and the cerebellum (Table 2). Direct comparison of the results between MVPA and univariate analysis showed a significant correlation between the t statistics of the two analyses in the aDmPFC (r = −0.41, p < 0.0001) (Fig. 4F) and the LFPC (r = −0.24, p < 0.0001) (Fig. 4H). Again, the effect size was lower for univariate analysis compared with MVPA.
Brain regions that showed significant effect in multivariate analysis
Posterior–anterior gradient in representing immediate and delayed value
The above analyses suggested a spatially graded sensitivity of value representation of immediate and delayed rewards along the posterior versus the anterior mPFC (Fig. 5A,B). To visualize the posterior–anterior gradient in immediate and delayed value representation, we took the MVPA second-level t-scores as an indication of the strength of the distributed value representation and plotted them against voxel position along the y-axis. We observed that the strength of immediate reward representations decreased (r = −0.58) along the posterior–anterior axis, whereas the strength of delayed reward representations increased (r = 0.52) along the same axis (Fig. 5C). This result suggests a posterior–anterior gradient in representing immediate and delayed values.
The posterior–anterior gradient for the representation of immediate and delayed rewards in the DmPFC. Brain regions sensitive to the amplitude of immediate (red) and delayed values (blue) in multivariate analysis were overlaid on axial (A) and sagittal (B) slice of the group mean structural image. C, Plot of MVPA t statistics against y-axis location. Red dots represent voxel sensitivity to immediate rewards; blue dots represent voxel sensitivity to delayed rewards.
Brain regions involved in relative value and RT
Several regions were positively modulated by the relative value (the value difference between the options offered to the subject during intertemporal choices), including the VmPFC (xyz: −10, 40, −14, Z = 4.60), the PCC (xyz: −6, −54, 22, Z = 4.51), and the right nucleus accumbens (NAcc) (xyz: 4, 14, −8, Z = 3.45). In contrast, the dorsal anterior cingulate cortex (dACC) (xyz: 8, 20, 42, Z = 4.05) showed a negative modulation (Fig. 6A). Other regions showing negative correlations included the left middle frontal gyrus, right STG, left precuneus cortex, right temporal pole, right precentral gyrus, right angular gyrus, and left cerebellum (Table 1). No region showed positive correlations with delayed value.
Brain regions sensitive to relative value and RT. A, Regions showing significant (p < 0.05 FWE whole-brain corrected) positive (red) and negative (blue) correlation with relative value are rendered onto a population-averaged surface atlas using multifiducial mapping (Van Essen, 2005). B, Regions showing significant (p < 0.05 FWE whole-brain corrected) positive (red) and negative (blue) correlation with RT.
Activity in the dACC (xyz: 2, 16, 50, Z = 7.3) was stronger when the RT was longer (Fig. 6B). In contrast, the rostral anterior cingulate cortex (rACC) (xyz: 8, 32, −4, Z = 5.64) and the left precuneus (xyz: −22, −52, 12, Z = 3.63) showed decreased activation with increased RT.
Distributed value representations in the medial prefrontal cortex
The above analysis investigated neural correlates of the immediate reward, delayed reward, and relative value, as well as the RT. As summarized in Figure 7A, various types of reward were separately represented in the mPFC. Furthermore, we found that the VmPFC, PCC, and NAcc were strongly modulated by relative value, weakly by the chosen value, but minimally by the summed value or RT.
Distributed value representations during intertemporal choices. A, Schematic illustration of the major findings from the present study, showing that different regions are sensitive for immediate value, delayed value, relative value, and RT. Percentage signal change for different regressors in the (B) VmPFC, (C) PCC, and (D) NAcc. It is evident that these regions are particularly sensitive to relative value, but not to other value signals. IM, Immediate reward; DL, delayed reward; RV, relative value; RT, reaction time; SV, summed value; CV, chosen value.
To further show that these separate value representations did not result from the model selected, we created four additional simple models, each including only the task regressor and one of the three reward/value regressors (i.e., immediate reward, delayed reward, and relative value) or the RT regressor. The results were overall similar to those obtained with the overall model, with a few exceptions. In particular, we found that the VmPFC showed similar modulation by relative value in the simple and overall models (0.18% vs 0.17%) but stronger negative modulation by RT in the simple model compared with the overall model (−0.13% vs −0.05%), suggesting that this region is mainly modulated by relative value. In contrast, we found that the rACC showed stronger modulation by the relative value in the simple model relative to the overall model (0.13% vs 0.056%), although similar modulation by RT (−0.19% vs −0.15%), suggesting it was mainly modulated by RT.
Discussion
The present study investigated the neural correlates of the immediate and delayed rewards during intertemporal choices, by simultaneously and independently manipulating the magnitude of immediate and delayed rewards. We discovered that brain activity in the posterior DmPFC was modulated by the amount of immediate options, whereas the activity in the adjacent anterior DmPFC, together with the LFPC, was modulated by the amount of delayed options. In addition, activities in the VmPFC and the rACC were modulated by the relative value and RT, respectively. These results provide important insights into the role of mPFC in intertemporal choices in particular and decision-making in general.
Univariate analysis revealed a significant negative correlation with the amount of delayed rewards in the anterior DmPFC. One hypothesis regarding the functional significance of this is that there was increased demand in valuation when the amplitude of delayed value was low. Consistent with this view, previous studies have implicated the anterior DmPFC in higher-level control (Ramnani and Owen, 2004; Venkatraman et al., 2009), and subjects who were more sensitive to the temporal delay exhibited greater activity in frontopolar cortex in the delay condition than in the immediate condition (Luhmann et al., 2008). Activity in DmPFC during intertemporal choices was also correlated with individuals' impulsivity (Sripada et al., 2011). Consistent with the importance of future perspective in intertemporal discounting (Peters and Büchel, 2010), activity in the anterior DmPFC during episodic prospection predicted subsequent selection of patient alternatives (Benoit et al., 2011). With this hypothesis in mind, it would be informative to examine whether the observed anterior DmPFC finding was dependent directly on the associated delay of the reward or was dependent on the presence of an alternative more immediate option.
Importantly, using the MVPA, we found that DmPFC was functionally organized along a posterior–anterior axis with respect to the immediate and delayed rewards. This finding is consistent with the observation that processing recent time information engages posterior areas of prefrontal cortex whereas processing distant time information engages anterior areas of prefrontal cortex in decision-making (Koritzky et al., 2013).
This functional dissociation also corresponds very well with the general structural and functional architecture of the prefrontal cortex (Kringelbach and Rolls, 2004; Wise, 2008). Paralleling the immediate–future dissociation in the medial PFC, the lateral orbital cortex shows a primary–secondary dissociation along the posterior–anterior axis: whereas the anterior lateral OFC processes secondary reward (i.e., monetary gains), the posterior lateral OFC processes more primary rewards (i.e., erotic stimuli) (Sescousse et al., 2010). In the lateral prefrontal cortex, cognitive control involving temporally proximate and concrete action representations is supported by posterior lateral prefrontal regions, and that involving temporally extended and abstract representations is supported by more anterior lateral prefrontal regions, such as the frontopolar cortex (Koechlin and Summerfield, 2007; Badre, 2008; Dreher et al., 2008). Together, these lines of evidence are consistent with the proposal that information conveying high immediacy, high certainty, or high tangibility engages the more posterior PFC, whereas information conveying delay in the future, low certainty, or less tangibility engages the more anterior PFC (i.e., frontal pole) (Bechara and Damasio, 2005).
In addition to the separate representation of delayed and immediate values, we found that the VmPFC, the PCC, and the NAcc were involved in the representation of relative value, the value differences between possible options that can be maximized by certain decisions. Many economic models assume that individuals assign values to stimuli relative to a reference point given by the expected level of consumption reward. During the two-option tasks, the reference point could be well approximated by the averaged value of the two options (e.g., the mean of V_immediate and V_delay) (Kőszegi and Rabin, 2006), and the relative value is thus compatible with the reference-dependent value signal (Lim et al., 2011). The finding that VmPFC represents reference-dependent value signal is consistent with previous observations (Knutson et al., 2007; Plassmann et al., 2007; Hare et al., 2008; FitzGerald et al., 2009; Lim et al., 2011; Sripada et al., 2011). Moreover, the VmPFC also encodes the relative value regardless of the categories of goods presented or the specific types of comparison being performed (Chib et al., 2009; FitzGerald et al., 2009; McNamee et al., 2013). Importantly, this result is also in line with previous findings that the VmPFC and PCC coded subjective value of delayed rewards (Kable and Glimcher, 2007; Ballard and Knutson, 2009; Peters and Büchel, 2009). In these studies, the immediate value was fixed and the relative value and delayed value should show similar modulation in these regions (Kable and Glimcher, 2009).
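The link between the relative value and a reference-dependent signal can be made explicit with a short derivation. It assumes that the reference point is exactly the mean of the two option values; the symbol r is introduced here for illustration only and does not appear in the analyses above.

```latex
r = \frac{V_{\text{immediate}} + V_{\text{delay}}}{2},
\qquad
\left| V_{\text{immediate}} - r \right|
  = \left| V_{\text{delay}} - r \right|
  = \frac{\left| V_{\text{immediate}} - V_{\text{delay}} \right|}{2}
```

Under this assumption, the distance of either option from the reference point is half the absolute difference between the two subjective values, so a signal that scales with the relative value also scales with a reference-dependent value.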
A greater relative value often requires shorter time in making a decision. Consistent with this view, we observed a moderate correlation (r = 0.4) between relative value and RT. To dissociate them, we included two correlated regressors (i.e., relative value and RT) in the same model and compared the results with those from the simple models. We found that, although the rACC was relatively more affected by RT, adding the RT regressor did not significantly alter the strength of the association between relative value and VmPFC signal. Although this suggests that VmPFC signal tracked relative value rather than an alternative component of decision difficulty, the RT regressor certainly does not capture all variance associated with decision difficulty. Definitive resolution of this issue requires an experimental dissociation of relative value and decision difficulty.
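The logic of comparing simple and combined models can be illustrated with a minimal simulation. This is a sketch on synthetic data, not the analysis pipeline of the study: the trial count, effect size, noise level, the negative sign of the correlation, and the helper `ols_betas` are all assumptions made for illustration. The simulated signal depends on relative value only, so any RT effect in the RT-only model is inherited through the correlation between the two regressors.

```python
import numpy as np


def ols_betas(y, *regressors):
    """Ordinary least squares with an intercept; returns the slopes only."""
    X = np.column_stack([np.ones_like(y), *regressors])
    betas, *_ = np.linalg.lstsq(X, y, rcond=None)
    return betas[1:]


rng = np.random.default_rng(0)
n_trials = 5000
rho = -0.4  # assumed correlation between relative value and RT

# Standardized, correlated trial-wise regressors (synthetic).
relative_value = rng.standard_normal(n_trials)
rt = rho * relative_value + np.sqrt(1 - rho**2) * rng.standard_normal(n_trials)

# Hypothetical region whose signal is driven by relative value only.
signal = 0.2 * relative_value + 0.5 * rng.standard_normal(n_trials)

# Simple models: one regressor of interest at a time.
beta_rv_simple = ols_betas(signal, relative_value)[0]
beta_rt_simple = ols_betas(signal, rt)[0]

# Combined model: both correlated regressors compete for variance.
beta_rv_full, beta_rt_full = ols_betas(signal, relative_value, rt)

print(f"relative value: simple {beta_rv_simple:.3f}, full {beta_rv_full:.3f}")
print(f"RT:             simple {beta_rt_simple:.3f}, full {beta_rt_full:.3f}")
```

In this toy case, the relative value estimate is nearly unchanged when RT is added, whereas the RT estimate shrinks toward zero, which is the qualitative pattern one would expect for a region that tracks relative value rather than RT.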
The dACC, on the other hand, was modulated by both processes, which is consistent with studies showing that dACC activity was correlated with the degree of dissonance between value and time (Pine et al., 2009) and also with the relative value of the two options (FitzGerald et al., 2009). As the dACC and the adjacent presupplementary motor area (pre-SMA) are important regions in response control and action selection, the overlapping representation of relative value and RT in dACC is consistent with its role in converting value differences into motor responses (Rushworth et al., 2004, 2007). Additional regions modulated by RT included the lateral prefrontal cortex and parietal lobe, also consistent with previous observations in intertemporal choices (McClure et al., 2004).
Together, our results add to growing evidence in decision neuroscience that, when deciding between a small closed set of alternatives, values of alternatives are separately represented according to the features and attributes in terms of the primacy, modality, concreteness, and immediacy (Bechara and Damasio, 2005; Kringelbach, 2005; Sescousse et al., 2013). It has been further hypothesized that value comparison is facilitated by conversion of distinct value signals into a common neural currency (Shizgal and Conover, 1996; Montague and Berns, 2002), primarily represented in the ventral striatum and VmPFC (Kable and Glimcher, 2007; Knutson and Bossaerts, 2007; Tom et al., 2007; Chib et al., 2009; Xue et al., 2009).
The MVPA has been increasingly used to investigate the neural substrates of decision-making (Carter et al., 2012; Jimura and Poldrack, 2012; McNamee et al., 2013). Compared with conventional univariate analysis, MVPA has been shown to be more sensitive and more capable of detecting finer, distributed neural dissociations (McNamee et al., 2013). Critically, these two approaches might tap into qualitatively distinct features of functional neuroimaging data according to distinct mechanisms. In particular, the univariate approach evaluates changes in voxelwise intensity and is thus more sensitive to global engagement in ongoing tasks. In contrast, MVPA examines patterns of BOLD fMRI signal across voxels and is thus more sensitive to distributed coding of information (Jimura and Poldrack, 2012). Our results provide support for the above claims. In general, we observed that the univariate and MVPA signals were correlated, and the MVPA signal was relatively stronger, suggesting that the value signal might be represented in a distributed manner and thus better captured by the MVPA approach (McNamee et al., 2013).
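The distinction between the two approaches can be illustrated with a toy simulation. This is a sketch under strong assumptions, not the MVPA procedure used in the present study: the data are synthetic, the region-average analysis stands in for the univariate approach, and a nearest-centroid classifier with a split-half scheme stands in for the multivariate one. The two conditions are built so that half of the voxels respond more to one condition and half to the other, leaving the region-average response identical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_voxels, n_trials = 40, 200  # trials per condition (synthetic)

# Distributed code: half the voxels go up and half go down, so the mean
# response across voxels is the same for both conditions.
pattern = np.concatenate(
    [np.full(n_voxels // 2, 0.5), np.full(n_voxels // 2, -0.5)]
)


def simulate(sign):
    """Trials x voxels matrix: signed pattern plus unit-variance noise."""
    return sign * pattern + rng.standard_normal((n_trials, n_voxels))


cond_a, cond_b = simulate(+1), simulate(-1)

# Region-average (univariate-style) contrast between conditions.
univariate_diff = cond_a.mean() - cond_b.mean()

# Multivariate view: estimate centroids on the first half of the trials,
# then classify the held-out second half by the nearest centroid.
half = n_trials // 2
centroid_a = cond_a[:half].mean(axis=0)
centroid_b = cond_b[:half].mean(axis=0)


def predict_is_a(trials):
    """True where a trial pattern lies closer to centroid A than to B."""
    dist_a = np.linalg.norm(trials - centroid_a, axis=1)
    dist_b = np.linalg.norm(trials - centroid_b, axis=1)
    return dist_a < dist_b


accuracy = 0.5 * (
    predict_is_a(cond_a[half:]).mean() + (~predict_is_a(cond_b[half:])).mean()
)

print(f"region-average difference: {univariate_diff:.3f}")
print(f"held-out decoding accuracy: {accuracy:.2f}")
```

Here the averaged contrast stays near zero while the pattern classifier separates the conditions well above chance, which is the kind of case in which a distributed representation is visible to a pattern-based analysis but not to a region-average contrast.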
In the present design, a single delay (i.e., 120 d) was chosen to avoid dependency on the particular discount function used in modeling valuation of delayed rewards. This is important because of the individual differences in the functional form of discounting (Coller et al., 2012). Nevertheless, real-life intertemporal decisions span from seconds to many years. One interesting hypothesis that needs to be tested is whether the posterior–anterior gradient of reward representation in the DmPFC aligns with the relative or absolute distance of the future. Furthermore, future studies should also examine whether and how episodic future thinking (Peters and Büchel, 2010) and cognitive control (Hare et al., 2009; Figner et al., 2010; Monterosso and Luo, 2010) could modulate these reward representations to influence decisions.
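Why a single delay removes the dependence on the discount function can be shown with a brief numerical sketch. The hyperbolic and exponential forms below are the standard textbook expressions; the rate parameters `k` and `c` and the reward amounts are hypothetical values chosen for illustration, not estimates from the present data.

```python
import math

DELAY_DAYS = 120  # the single delay used in the present design


def hyperbolic(amount, delay, k):
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1 + k * delay)


def exponential(amount, delay, c):
    """Exponential discounting: V = A * exp(-c * D)."""
    return amount * math.exp(-c * delay)


k = 0.01  # hypothetical hyperbolic discount rate (per day)

# At one fixed delay, either function reduces to a single discount factor,
# so an exponential rate can always be found that matches the hyperbolic one.
discount_factor = 1 / (1 + k * DELAY_DAYS)
c = -math.log(discount_factor) / DELAY_DAYS

amounts = [20, 40, 60]  # hypothetical delayed reward magnitudes
hyper_values = [hyperbolic(a, DELAY_DAYS, k) for a in amounts]
expo_values = [exponential(a, DELAY_DAYS, c) for a in amounts]

# The two forms agree at 120 d but diverge at any other delay.
hyper_30 = hyperbolic(40, 30, k)
expo_30 = exponential(40, 30, c)

print(hyper_values, expo_values)
print(hyper_30, expo_30)
```

With only one delay, the subjective value of a delayed reward is its magnitude scaled by one subject-specific factor, regardless of which functional form generated that factor; the forms become distinguishable only when several delays are sampled.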
In conclusion, the present study suggests multiple, distributed value signals during intertemporal choices. The representation of immediate and delayed options in the DmPFC follows a posterior–anterior gradient, which is consistent with the structural and functional architecture of the prefrontal cortex. Moreover, our results provide support for a central role of the VmPFC in value comparison, where the presented option values were converted into a common neural scale and relative value differences were computed and conveyed to the dACC to guide decisions. Together, these findings provide further understanding of the neural processes related to valuation instantiated in the brain.
Footnotes
This work was supported in part by the National Natural Science Foundation of China (31130025), the 973 Program (2014CB846102), the National Natural Science Foundation of China (31221003, 31170990, 81100992), and the 111 Project (B07008). We thank Koji Jimura and Russell Poldrack for help with the multivariate analysis.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Gui Xue, National Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China. guixue@gmail.com