Abstract
Marginal utility theory prescribes the relationship between the objective property of the magnitude of rewards and their subjective value. Despite its pervasive influence, however, there is remarkably little direct empirical evidence for such a theory of value, let alone of its neurobiological basis. We show that human preferences in an intertemporal choice task are best described by a model that integrates marginally diminishing utility with temporal discounting. Using functional magnetic resonance imaging, we show that activity in the dorsal striatum encodes both the marginal utility of rewards, over and above that which can be described by their magnitude alone, and the discounting associated with increasing time. In addition, our data show that dorsal striatum may be involved in integrating subjective valuation systems inherent to time and magnitude, thereby providing an overall metric of value used to guide choice behavior. Furthermore, during choice, we show that anterior cingulate activity correlates with the degree of difficulty associated with dissonance between value and time. Our data support an integrative architecture for decision making, revealing the neural representation of distinct subcomponents of value that may contribute to impulsivity and decisiveness.
Introduction
Adam Smith (1776) provided an engaging account of the paradox of value by considering the disparity between the price of water and diamonds. A subjective theory of value addressed this perplexity by positing that the value of a good is not determined by its maximal utility, but rather by the increase in utility obtained by consuming one extra unit of that good—its marginal utility. A salient feature of marginal utility is that it diminishes as the quantity of a good increases—hence the utility provided by a fixed amount of £10 is greater when added to an option worth £50 than when added to one worth £500. Since water is more plentiful than diamonds, its marginal utility is consequently smaller. This "law" remains integral to economic theory, most notably in relation to the microeconomic concept of indifference curves and modern analyses of decision under risk and uncertainty (von Neumann and Morgenstern, 1947; Kahneman and Tversky, 1979). However, risk aversion does not provide incontrovertible evidence for diminishing marginal utility, because the link between the two exists only in theory.
One domain where the effect of diminishing marginal utility may be observed is intertemporal choice—choice between smaller–sooner and larger–later rewards or punishments. Humans frequently make important intertemporal decisions, for example, when deciding whether to spend or save. Since future rewards are less valuable than current rewards, their value is discounted in accordance with their delay—a process known as temporal discounting. Previous research (Mazur, 1987; Green and Myerson, 2004) suggests that humans and animals discount future rewards in a hyperbolic manner, where the steepness of the hyperbolic curve is determined by a free parameter, termed the discount rate. Those with greater discount rates devalue future rewards more quickly and show a greater preference for smaller–sooner rewards. However, choice outcome should also be determined by the rate at which the chooser's marginal utility diminishes, since this affects the perceived increase in utility of the larger option relative to the smaller one (independently of temporal discounting). Consequently, a unified model that integrates temporal discounting and marginal utility should account better for choice behavior. Such a model might shed light on the determinants of impulsivity, of which preference in intertemporal choice is a key measure (Evenden, 1999). Moreover, the integration of subjective valuation schemes relating to magnitude and delay should be represented neuronally, to give a representation of the overall subjective value of a delayed reward.
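For reference, the standard hyperbolic discount function (Mazur, 1987) can be written as

$$V = \frac{M}{1 + Kd},$$

where M is the reward magnitude, d the delay, and K the subject-specific discount rate; larger values of K produce steeper devaluation of delayed rewards.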
Here, we developed a paradigm based on intertemporal choice over varying monetary rewards, to measure changes in marginal utility directly. Incorporating a utility function into existing models of temporal discounting enabled us to separate temporal effects from those mediated by marginal utility. We tested whether this model accounted for subjects' choices better than standard models. Furthermore, having established the effects of diminishing marginal utility on choice behavior, we could study the neural implementation of temporal discounting (free from the utility confound), of marginal utility, and of their combination. Specifically, we asked how these distinct, subjective components of value are integrated, both behaviorally and neuronally.
Materials and Methods
We used functional magnetic resonance imaging (fMRI) while subjects chose between two serially presented options of differing magnitude (from £1 to £100) and delay (from 1 week to 1 year) (Fig. 1). These choices were often smaller–sooner versus larger–later in nature and were presented serially to separate decision-making and option-valuation processes. Two of the subjects' choices were selected at random at the end of the experiment (one from each experimental session) and paid out for real, by way of prepaid credit cards with a timed activation date (see supplemental Methods, available at www.jneurosci.org as supplemental material). We used subjects' choices to assess the extent of discounting for both magnitude and time. We assessed a model that combined a utility function (converting magnitude to utility) with a standard hyperbolic discounting function. In simple terms, the discounted utility (subjective value) of a delayed reward, V, is equal to D × U, where D is a discount factor between 0 and 1 and U is the undiscounted utility. D is a function of the delay to the reward and includes the individual's discount-rate parameter, whereas U is a function of the magnitude of the reward and includes a subject-specific parameter determining the concavity (or degree of diminishing marginal utility) of the utility function (see supplemental Methods, available at www.jneurosci.org as supplemental material).
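As a concrete illustration of this valuation scheme, the sketch below implements V = D × U in Python, assuming the standard hyperbolic form for D and, purely for illustration, an exponential (concave) form for U; the exact functional forms, units, and fitting details are those given in the supplemental Methods, and the parameter values below are illustrative only.

```python
import numpy as np

def discount_factor(delay_weeks, K):
    """Standard hyperbolic discount factor, a value between 0 and 1."""
    return 1.0 / (1.0 + K * delay_weeks)

def utility(magnitude, r):
    """Concave (diminishing marginal) utility; r > 0 sets the concavity.
    Exponential form shown for illustration; as r -> 0 it approaches linearity."""
    return (1.0 - np.exp(-r * magnitude)) / r

def discounted_utility(magnitude, delay_weeks, K, r):
    """Subjective value V = D * U of a delayed reward."""
    return discount_factor(delay_weeks, K) * utility(magnitude, r)

# Example: 20 GBP in 2 weeks vs 50 GBP in 26 weeks, with illustrative parameters
K, r = 0.05, 0.009
print(discounted_utility(20, 2, K, r), discounted_utility(50, 26, K, r))
```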
Figure 1. Experimental task. The display outlines the sequence of stimuli in a single trial. Subjects were presented with two options—a smaller–sooner and a larger–later amount of money (range, £1–100; 1 week to 1 year). Subjects chose the option they preferred and received their chosen option for 2 of the 220 trials, as determined by a lottery and paid using prepaid credit cards activated at the specified time (see supplemental Methods, available at www.jneurosci.org as supplemental material).
Participants.
Twenty-four right-handed, healthy volunteers were included in the experiment (12 male, 12 female; mean age, 23; range, 19–28). Subjects were preassessed to exclude those with a previous history of neurological or psychiatric illness. All subjects gave informed consent, and the study was approved by the UCL ethics committee.
Procedure and task description.
The choice task was as described above and in Figure 1 (see supplemental Methods, available at www.jneurosci.org as supplemental material, for further details).
On arrival, subjects were given an instruction sheet to read (see supplemental Appendix 1, available at www.jneurosci.org as supplemental material), explaining the task and details of the payment. They were also shown the credit cards and the lottery machine to reassure them that the payment system was genuine. After a short practice of six trials, they were taken into the scanner where they performed two sessions of 110 trials each.
Payment was implemented by way of a manual lottery after completion of all testing. The lottery contained 110 numbered balls, each representing a trial from the first session of testing. The ball selected corresponded to the rewarded trial for that session. The magnitude and delay of the option the subject had chosen on that trial were then honored using a prepaid credit card: the chosen amount was loaded onto the card, which was given to the subject, while the card's activation code was removed and e-mailed to the subject after the delay specified by the chosen option. This lottery was then repeated to determine a reward for the second session of testing, and a second card was issued. Both lotteries took place after all testing had been completed. Thus, the payment each subject received was determined by a combination of the lottery and the choices that they made—a manipulation that ensured subjects treated all choices as real. The payment system was designed so that, on average, each subject would receive £100. No other payment was awarded for mere participation in the experiment. Since only two choices were paid out, and these were selected after testing was completed, any influence of changing reference points (as a result of increasing wealth) was unlikely to be significant.
Imaging procedure.
Functional imaging was conducted by using a 3 Tesla Siemens Allegra head-only MRI scanner to acquire gradient echo T2*-weighted echo-planar images (EPIs) with blood oxygenation level-dependent (BOLD) contrast. We used a sequence designed to optimize functional sensitivity in the orbitofrontal cortex (OFC) (Deichmann et al., 2003). This consisted of tilted acquisition in an oblique orientation at 30° to the anterior cingulate–posterior cingulate line, as well as application of a preparation pulse with a duration of 1 ms and amplitude of −2 mT/m in the slice selection direction. The sequence enabled 36 axial slices of 3 mm thickness and 3 mm in-plane resolution to be acquired with a repetition time (TR) of 2.34 s. Subjects were placed in a light head restraint within the scanner to limit head movement during acquisition. Functional imaging data were acquired in two separate 610 volume sessions. A T1-weighted structural image and fieldmaps were also acquired for each subject after the testing sessions.
Behavioral analysis.
We used maximum likelihood estimation under a softmax decision rule, which assigns a choice probability to each option. This was implemented within the context of our valuation model to calculate the best-fitting parameter estimates for the discount rate (K) and utility concavity (r), as well as the likelihood of the model. We repeated this procedure for a number of other influential valuation models. To test for an effect of temporal discounting and of concave utility, we compared subject-specific parameter estimates to zero using one-sample t tests. Model comparison was performed using the Akaike information criterion (AIC) (see supplemental Methods, available at www.jneurosci.org as supplemental material, for a detailed description of the models and fitting procedures).
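A minimal sketch of this fitting step, assuming a two-option logistic (softmax) choice rule with an inverse-temperature parameter beta and reusing the illustrative discounted-utility form from the sketch above; the exact parameterization and the alternative models fitted are described in the supplemental Methods.

```python
import numpy as np
from scipy.optimize import minimize

def discounted_utility(M, d, K, r):
    """V = D * U (hyperbolic D, illustrative exponential utility; see earlier sketch)."""
    return (1.0 / (1.0 + K * d)) * (1.0 - np.exp(-r * M)) / r

def neg_log_likelihood(params, choices, options):
    """choices: 1 if the second option was chosen, 0 otherwise.
    options: per-trial array of (M1, d1, M2, d2)."""
    K, r, beta = params
    V1 = discounted_utility(options[:, 0], options[:, 1], K, r)
    V2 = discounted_utility(options[:, 2], options[:, 3], K, r)
    p2 = 1.0 / (1.0 + np.exp(-beta * (V2 - V1)))     # softmax over the two options
    p_chosen = np.where(choices == 1, p2, 1.0 - p2)
    return -np.sum(np.log(p_chosen + 1e-12))

def fit_subject(choices, options):
    """Maximum likelihood estimates of K (discount rate), r (concavity), and beta.
    In practice the parameters would be constrained to be positive."""
    result = minimize(neg_log_likelihood, x0=[0.05, 0.01, 1.0],
                      args=(choices, options), method="Nelder-Mead")
    return result.x, result.fun   # parameter estimates, minimized -log likelihood
```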
Imaging analysis.
Image analysis was performed using SPM5 (www.fil.ion.ucl.ac.uk/spm). For each session, the first five images were discarded to account for T1 equilibration effects. The remaining images were realigned to the sixth volume (to correct for head movements), unwarped using fieldmaps, spatially normalized to the Montreal Neurological Institute (MNI) standard brain template and smoothed spatially with a three-dimensional Gaussian kernel of 4 mm full-width at half-maximum (FWHM) (and resampled, resulting in 3 × 3 × 3 mm voxels). Low-frequency artifacts were removed using a 1/128 Hz high-pass filter and temporal autocorrelation intrinsic to the fMRI time series was corrected by prewhitening using an AR(1) process.
Single-subject contrast maps were generated using parametric modulation in the context of the general linear model. We performed three analyses, examining variance in regional BOLD response attributable to different regressors of interest: absolute U, D, and V for all options (analysis 1); absolute M, U, and D for all options (analysis 2); absolute difference in V, D, and U (ΔV, ΔD, and ΔU) between the two options on each trial (analysis 3). In these analyses, the interaction term V was calculated from the mean corrected values of D and U (this was not necessary in the nonorthogonalized regression analysis). Analysis 1 allowed us to identify regions implicated in the evaluation and integration of different reward-related information. Analysis 2 allowed us to identify regions showing response to the (diminishing marginal) utility of rewards, as opposed to their absolute magnitude. Analysis 3 allowed us to identify regions where activation covaried with the difficulty of each choice. The 4 mm smoothed images allowed us to perform high-resolution, single-subject analyses (see supplemental Results, available at www.jneurosci.org as supplemental material, for examples).
For analysis 1, U, D, and V for each option (two per trial) were calculated using the canonical parameter estimates (K and r) in the context of the second model (see supplemental Material, Eq.4, available at www.jneurosci.org as supplemental material) and convolved with the canonical hemodynamic response function (HRF) at the onset of each option. Analysis 2 was performed in a similar manner for M, U, and D. We also looked for subject-by-subject covariation between U in analysis 2 and the estimated parameter r, but this did not yield significant results at our threshold. For analysis 3, ΔV, ΔD, and ΔU were convolved with the canonical HRF at the onset of the choice phase. All onsets were modeled as stick functions, and all regressors in the same model were detrended and orthogonalized (in the orders stated above) before analysis by SPM5. To correct for motion artifacts, the six realignment parameters were modeled as regressors of no interest in each analysis.
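The serial orthogonalization itself is performed internally by SPM5; the numpy sketch below, run on synthetic option values, is only meant to illustrate the principle of mean-centering each regressor and removing, in the stated order, the variance shared with earlier regressors, with V built as the interaction of mean-corrected U and D.

```python
import numpy as np

def serially_orthogonalize(regressors):
    """Mean-center each regressor, then remove from it the components explained
    by all regressors earlier in the list (Gram-Schmidt style, no normalization)."""
    out = []
    for x in regressors:
        x = x - x.mean()
        for prev in out:
            x = x - (x @ prev) / (prev @ prev) * prev
        out.append(x)
    return out

# Synthetic per-option values, for illustration only
rng = np.random.default_rng(0)
M = rng.uniform(1, 100, size=440)            # magnitudes (two options x 220 trials)
d = rng.uniform(1, 52, size=440)             # delays in weeks
U = (1 - np.exp(-0.009 * M)) / 0.009         # illustrative concave utility
D = 1 / (1 + 0.05 * d)                       # illustrative hyperbolic discount factor
V = (U - U.mean()) * (D - D.mean())          # interaction from mean-corrected values

U_o, D_o, V_o = serially_orthogonalize([U, D, V])   # order as in analysis 1
```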
At the second level (group analysis), regions showing significant modulation by each of the regressors specified at the first level were identified through random effects analysis of the β images from the single-subject contrast maps. The contrast maps were smoothed before analysis with a three-dimensional Gaussian kernel of 7 mm FWHM (this achieved an effective smoothing of 8 mm FWHM at the second level). We included the betas from the single-subject reaction time analyses (see supplemental Material, available at www.jneurosci.org as supplemental material) as covariates in analysis 3. We report results for regions where the peak voxel-level t value corresponded to p < 0.001 (uncorrected), with minimum cluster size of five. Results which were corrected for multiple comparisons (family-wise error corrected p < 0.05) at the whole-brain level or with small volume corrections are indicated in the supplemental Results Tables and Figures, available at www.jneurosci.org as supplemental material. We additionally report uncorrected results but caution that these should be considered exploratory findings, which await additional confirmation by further studies. Coordinates were transformed from the MNI array to the stereotaxic array of Talairach and Tournoux (1988) (http://imaging.mrc-cbu.cam.ac.uk/imaging/MniTalairach).
To identify regions where modulation by each regressor in each analysis overlapped (as in analyses 1 and 3), we constructed explicit inclusive masks using a threshold of p < 0.001 and constrained subsequent analyses using this mask. For example, in analysis 1, to observe regions significantly correlating with U, D, and V, we first identified regions where U modulated responses at p < 0.001 and created a mask from this image. We then identified regions where D modulated responses within this mask, at p < 0.001 and created a second mask from this contrast. Finally, we identified regions where V modulated responses within this second mask. In analysis 2, a mask was created from regions modulated by presentation of options at p < 0.001.
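Ignoring cluster-extent criteria, this successive masking is equivalent to intersecting the suprathreshold maps for each contrast; the schematic numpy sketch below (on synthetic p maps, with an illustrative volume size) shows the logic, although in practice the step was carried out within SPM5.

```python
import numpy as np

rng = np.random.default_rng(1)
shape = (53, 63, 46)                                          # illustrative volume
p_U, p_D, p_V = (rng.uniform(size=shape) for _ in range(3))   # synthetic p maps

mask_U   = p_U < 0.001                     # regions modulated by U
mask_UD  = mask_U & (p_D < 0.001)          # D assessed only within the U mask
mask_UDV = mask_UD & (p_V < 0.001)         # V assessed only within the U-and-D mask
```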
The structural T1 images were coregistered to the mean functional EPI images for each subject and normalized using the parameters derived from the EPI images. Anatomical localization was performed by overlaying the t-maps on a normalized structural image averaged across subjects and with reference to the anatomical atlas of Mai et al. (2003).
Results
Model comparison
Maximizing the likelihood of the choices made by the subjects (on a subject-by-subject basis), under the assumptions of the model, enabled us to estimate the individual parameters determining discount rate and utility concavity. These estimates revealed (see supplemental Results, available at www.jneurosci.org as supplemental material) that subjects discounted the value of future rewards (null hypothesis of no discounting: p < 0.00025) and also that they exhibited diminishing marginal utility (null hypothesis of a linear utility function: p < 0.05). Evidence-based model comparison revealed that, compared with a number of other influential valuation models (Table 1; see supplemental Methods and Results, available at www.jneurosci.org as supplemental material), our discounted utility model was the most likely given the data (Akaike weight = 0.99) and significantly better at describing the choices than a standard hyperbolic model which assumes linear utility (difference in AIC = 34.5), as well as the other models.
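For reference, a minimal sketch of the comparison metric: per-model AIC computed from the maximized likelihood and the number of free parameters, and Akaike weights derived from the AIC differences. The numbers below are illustrative, not the study's values.

```python
import numpy as np

def aic(neg_log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params + 2 * neg_log_likelihood

def akaike_weights(aic_values):
    """Relative likelihood of each model, normalized to sum to 1."""
    a = np.asarray(aic_values, dtype=float)
    delta = a - a.min()
    w = np.exp(-0.5 * delta)
    return w / w.sum()

# Illustrative: hyperbolic model with linear utility vs discounted-utility model
print(akaike_weights([aic(480.0, 2), aic(460.0, 3)]))   # weights favor the latter
```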
fMRI
We next analyzed brain activity acquired with fMRI during task performance, constructing parametric regressors to explore the representation of three key quantities. The first two quantities were undiscounted utility (which incorporates the nonlinear utility function but ignores time) and the discount factor (the proportion by which utility is reduced relative to an immediate payoff, i.e., a value between zero and one). The third quantity was discounted utility—the product of the first two, which in statistical terms represents an interaction between utility and discounting. These regressors were generated from the behavioral parameter estimates and orthogonalized with respect to each other. We used our discounted utility model to create the regressors (see supplemental Methods, available at www.jneurosci.org as supplemental material).
Statistical parametric maps (SPMs) (Fig. 2) revealed distinct patterns of brain activity associated with each component process of valuation. Undiscounted utility (U) correlated with activity in the striatum, ventral tegmental area (VTA), and anterior cingulate cortex (ACC), consistent with previous findings implicating these regions in the anticipation and receipt of reward (Breiter et al., 2001; Knutson et al., 2001; Yacubian et al., 2007). The discount factor (D) correlated with activity in the striatum, insula, posterior and pregenual cingulate cortex, ventromedial orbitofrontal cortex, VTA, and inferior frontal gyrus, consistent with, and supplementing, previous results from studies of temporal discounting in both animals (Mobini et al., 2002; Cardinal et al., 2004; Roesch et al., 2007a,b; Kim et al., 2008; Kobayashi and Schultz, 2008) and humans (McClure et al., 2004; Tanaka et al., 2004; Kable and Glimcher, 2007).
Figure 2. Regions involved in the subjective valuation and integration of objective reward properties (a parametric analysis). a, Correlates of undiscounted utility (U) of each option, a concave function of its magnitude. b, Correlates of the discount factor (D) of each option, a hyperbolic function of the delay to receipt of the option. c, Interaction of U and D affording the (orthogonalized) discounted utility or value (V) of the option, used to guide choice. d, Dorsal striatum (MNI coordinates and statistical z score: 15, 3, 18; z = 3.26*) significantly correlated with U, D, and V. These SPMs have been thresholded at p < 0.001 (uncorrected) (for comprehensive results, see supplemental Results, available at www.jneurosci.org as supplemental material). *Corrected for multiple comparisons (family-wise error p < 0.05).
Our key analysis, testing for an interaction (i.e., discounted utility, V = D × U, orthogonalized with respect to D and U), found significant correlates in the dorsal striatum and pregenual cingulate cortex (Fig. 2). Critically, this activation in the dorsal striatum incorporated the same anatomical zone that correlated independently with both undiscounted utility and temporal discounting, and each of these colocalized effects cannot be explained by the other two. This implicates the dorsal striatum both in the encoding and in the possible integration of undiscounted utility and temporal discounting, furnishing a discounted utility that may play an important role in subsequent choice.
One potential caveat with respect to these results relates to the orthogonalization of the regressors. Because U, D, and V have shared variance components, we orthogonalized V with respect to D and U. This orthogonalization means that shared variance is assigned to U (and D), a choice motivated by the fact that V is constructed from, and therefore depends on, U and D. However, to ensure that the U and D regressors were not modeling any variance attributable to variations in V (or D), we reversed the orthogonalization order in a second analysis. Importantly, activity in the dorsal striatum still correlated with all three regressors, although the strengths of the V- and U-related effects were somewhat swapped compared with the first analysis (V activity in the striatum resembling the striatal pattern seen for U in the first model). In a further, more conservative analysis, we removed the orthogonalization step entirely (thus removing any shared variance components) from our regression model. The results of this model revealed that responses in the striatum still correlated with unique components of U, D, and V (see supplemental Results, available at www.jneurosci.org as supplemental material). Thus, these analyses strongly suggest that the striatal responses have three separable variance components that can be predicted by variations unique to U, D, and their interaction V. Another caveat with respect to the region of overlap concerns the theoretical possibility that colocalized activations corresponding to all three (mean-corrected) regressors may be consistent with encoding of non-mean-corrected discounted utility (V). Nevertheless, the fact that all three regressors are encoded (separately) in the striatum is consistent with the hypothesis that integration of distinct components is reflected by activity within this structure.
Existing neurobiological evidence of nonlinear utility is limited to a previous study (Tobler et al., 2007), which found that learning-related neural activity in the striatum correlated with subjects' wealth. However, this evidence is based on a fusion of learning theory and marginal utility theory and leaves open the question of whether diminishing marginal utility can be detected on a subject-by-subject basis rather than at a population level. To investigate more directly the representation of basic utility at an individual level, we assessed whether the neural representation of utility in the striatum was better correlated with a concave utility function or simply with magnitude. Consequently, we included actual magnitude (M) as an extra regressor in our original linear model and orthogonalized the utility regressor (U) with respect to M. Within this model, the representation of utility (U) still correlated with activity in the dorsal striatum (Fig. 3). This finding suggests that the dorsal striatum specifically encodes the utility of a good over and above that which can be described by its objective magnitude, thereby offering direct neural evidence for the nonlinearity (concavity) of subjective utility.
Figure 3. The neural encoding of marginal utility and its diminishing nature (statistical parametric maps and example of regressors). a, Activity in the dorsal striatum correlated with the undiscounted utility of rewards (U) over and above its correlation with their objective magnitude (M; i.e., a linear effect of magnitude). A peak was found in the right dorsal caudate (MNI coordinates and statistical z score: 19, 15, 13; z = 3.29). U was orthogonalized with respect to M in the regression to isolate the nonlinear or concave aspects of the predictor variable. b, Example of the regressors used in a. Black dots (M) show the subjective value of rewards ranging from £1 to £100 under the assumption of a linear utility function, whereas red dots indicate the utility (U) of the same magnitudes, calculated using a nonlinear utility function and a canonical estimate of subjects' concavity parameter r (0.009).
An important aspect of our model is that it makes clear predictions regarding choice difficulty. Under the assumption that difficult choices—those with a small difference in discounted utility (ΔV) between the two options—take longer, our model predicts which choices should induce greater reaction times and more neuronal activity. Consequently, we tested for these effects at a behavioral and a neural level. Such an effect was evident from choice latencies, where reaction times were significantly longer when ΔV was small (p < 0.00005), i.e., a negative correlation was observed. Furthermore, we conjectured that, in addition to differences in discounted utility (ΔV), greater difficulty would be incurred when options were separated more widely in time. Consistent with such a "dissonance effect," we found that reaction times were also slower when the difference in discount factor (ΔD) was large (p < 0.05), independent of (i.e., orthogonal to) ΔV (see supplemental Methods, available at www.jneurosci.org as supplemental material). We tested for brain regions that correlated with both difficulty indices (ΔV and orthogonalized ΔD) at the time of choice. This revealed correlates in the anterior cingulate cortex (Fig. 4), suggesting a distinct role for this region in intertemporal choice and response selection. This is important in light of a previous finding that ACC lesions in rodents had no effect on an analogous intertemporal choice task (Cardinal et al., 2001). Furthermore, activity in the ACC covaried with the degree to which choice latency was affected by ΔV, whereby subjects whose latencies were more affected by difficulty also showed greater activity in the ACC with increasing difficulty. Drawing on previous insights into the function of this region in decision making (Botvinick et al., 2004; Cohen et al., 2005; Kennerley et al., 2006; Botvinick, 2007; Pochon et al., 2008) and on anatomical studies of its connectivity, we suggest that it adopts a regulatory or monitoring role with respect to the integrative function of the dorsal striatum. However, the impact of this function on actual choice behavior remains to be determined.
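A brief sketch of how the two trialwise difficulty indices can be constructed from the fitted model, with ΔD mean-centered and orthogonalized with respect to ΔV as in the analyses above; the per-trial values below are synthetic and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
V1, V2 = rng.uniform(0, 50, (2, 220))   # per-trial discounted utilities (synthetic)
D1, D2 = rng.uniform(0, 1, (2, 220))    # per-trial discount factors (synthetic)

dV = np.abs(V1 - V2)                    # small dV -> similar options -> harder choice
dD = np.abs(D1 - D2)                    # large dD -> options widely separated in time

dV_c = dV - dV.mean()
dD_c = dD - dD.mean()
dD_orth = dD_c - (dD_c @ dV_c) / (dV_c @ dV_c) * dV_c   # dD orthogonal to dV
```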
Figure 4. Choice difficulty: the intertemporal dissonance effect. Regions that correlated significantly with choice difficulty as measured by closeness in discounted utility between the options (ΔV) and also with difficulty as measured by the difference in discount factor (and thus delay) between options (ΔD). Activity here increases as ΔV gets smaller and as ΔD increases. The peak activation (MNI coordinates and statistical z score) is in the ACC (3, 33, 30; z = 3.64*). Activity in the ACC also covaried with the degree to which behavior (choice latency) was affected by difficulty as measured by ΔV (12, 39, 33; z = 3.64) (see supplemental Results, available at www.jneurosci.org as supplemental material, for regions correlating with ΔV and ΔD separately, as well as regions covarying with reaction time parameters). *Corrected for multiple comparisons (family-wise error p < 0.05).
Discussion
In summary, our data provide both direct behavioral and neurobiological support for marginal utility theory in the context of a choice model that incorporates temporal discounting. This is of particular interest since a neural basis for the concavity of the utility function, a key concept in economics, has not previously been demonstrated. Furthermore, our results suggest that the dorsal striatum may act as a site of convergence of these two systems, constructing the discounted utility that plays an important role in subsequent choice.
The striatum has been identified in previous studies of temporal discounting in both animals (Cardinal et al., 2001, 2004; Roesch et al., 2007a; Kobayashi and Schultz, 2008) and humans (McClure et al., 2004; Tanaka et al., 2004, 2007; Kable and Glimcher, 2007; Wittmann et al., 2007; Luhmann et al., 2008; Ballard and Knutson, 2009) and, less directly, in marginal utility (Tobler et al., 2007). In humans, striatal activity has been shown to correlate with preferences for immediate options (McClure et al., 2004) and with discounted magnitude across both immediate and delayed options over both short (Tanaka et al., 2004) and long timescales (Kable and Glimcher, 2007). However, the exact nature of this signal has been unclear, including whether it merely reports on value calculations, or their prediction errors, performed elsewhere (Tanaka et al., 2004, 2007; Luhmann et al., 2008). For instance, the well recognized role of this region in reinforcement learning (Robbins et al., 1989; O'Doherty et al., 2004; Seymour et al., 2004; Haruno and Kawato, 2006) does not necessarily speak to a role in constructing value and choice. However, previous data are consistent with distinct roles in encoding delay (Tanaka et al., 2007) and marginal utility (Tobler et al., 2007). The data presented here advance these insights and support a broader and more sophisticated role for this region than previously thought, wherein choices are determined by an integration of distinct determinants of value.
The exact nature of the representation of temporal discounting remains unclear (Wittmann and Paulus, 2008). Superficially, the diminished utility associated with increasing time has strong parallels to probability discounting, and indeed some theoretical accounts of temporal discounting propose just this: that uncertainty, for instance through unexpected occurrences that might interfere with reward delivery, accumulates with time (Stevenson, 1986). However, recent neurophysiological evidence suggests that uncertainty and temporal discount factors may be, at least in part, distinct (Luhmann et al., 2008). Furthermore, the fact that BOLD activity correlates with a single parametric regressor does not in itself imply that it is driven by a single neural determinant, since distinct psychological processes (such as the utility of anticipation, or anxiety) (Loewenstein, 1987; Wu, 1999) and neurochemical processes (such as 5-HT and dopamine) (Roesch et al., 2007b; Tanaka, 2007) may make independent contributions.
From a behavioral and economic perspective, neglecting nonlinear utility has the potential to confound inferences about discounting, since any model could overestimate the discount rate to account for marginal utility effects (Andersen et al., 2008). A similar argument could apply to the neurophysiological data. This has particular relevance for understanding personality characteristics such as impulsivity. The term impulsive is a general description of a diverse group of behaviors with distinct features (likely dependent on distinct neural processes), encompassed by a general theme of behavior in the absence of adequate foresight (Evenden, 1999; Winstanley et al., 2006). These include motor/behavioral impulsiveness (the inability to withhold a prepotent behavioral response) and reflection impulsiveness (a failure to slow down, or "hold your horses" (Frank et al., 2007), in response to decision conflict and to properly consider the options). Another feature, choice/temporal impulsiveness, is often defined as the propensity to choose small short-term gains in preference to larger delayed gains (or larger delayed losses in preference to smaller immediate losses) (Herrnstein, 1981; Mazur, 1987; Logue, 1988; Evenden, 1999; Ainslie, 2001; Cardinal et al., 2003; Green and Myerson, 2004). Traditionally, the psychological basis of impulsive choice has rested on the discount rate parameter, such that those with a higher rate are described as impulsive and those with a lower rate as self-controlled (Herrnstein, 1981; Mazur, 1987; Logue, 1988; Evenden, 1999; Ainslie, 2001; Cardinal et al., 2003; Green and Myerson, 2004). However, the data presented here illustrate that impulsivity and self-control are also determined by the concavity of an individual's utility function and that these two parameters are independent of one another. Specifically, the more concave the function, the faster marginal utility diminishes and the more impulsive the individual, because a concave utility function diminishes the value of the larger reward relative to the smaller reward, making it less attractive. A corollary of this is that subjects who are more impulsive (as a result of a more concave utility function) may also be more risk averse, since the concavity of the utility function is also a key determinant of choice under uncertainty (von Neumann and Morgenstern, 1947; Kahneman and Tversky, 1979; Pindyck and Rubinfeld, 2004).
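A brief numerical illustration of this point, using the same illustrative functional forms and parameter values as in the sketches above: holding the discount rate fixed, making the utility function more concave shrinks the utility advantage of the larger-later option and can reverse the preference.

```python
import numpy as np

def utility(M, r):
    return (1 - np.exp(-r * M)) / r        # illustrative concave utility

def discount(d, K):
    return 1 / (1 + K * d)                 # hyperbolic discounting

K = 0.05                                    # same discount rate for both agents
for r in (0.001, 0.03):                     # near-linear vs strongly concave utility
    v_soon = discount(1, K) * utility(20, r)    # 20 GBP in 1 week
    v_late = discount(26, K) * utility(80, r)   # 80 GBP in 26 weeks
    print(f"r={r}: chooses {'later' if v_late > v_soon else 'sooner'}")
```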
We conclude that impulsivity in choice (temporal impulsiveness) should not be defined solely by K. Moreover, K and r should, in our view, be kept separate, as there is no theoretical reason why the discounting of time and the scaling of magnitude (two different features of preferences) should influence each other. Although it has been suggested that such a correlation may exist (Anderhub et al., 2001), we did not observe it in our results, and previous attempts to find a correlation by simultaneously administering risk-preference tasks (to estimate r) and intertemporal choice tasks (to estimate K) have yielded mixed results (Ahlbrecht and Weber, 1997; Anderhub et al., 2001). In our view, it is perfectly possible for a person with a high discount rate but a nearly linear utility function to be as behaviorally impulsive as a counterpart with a low discount rate but a more concave function—although each parameter will, individually, correlate with impulsiveness. Future studies of impulsive choice should therefore consider these determinants and others—including top-down, inhibitory control mechanisms—when hypothesizing about the underlying cause of a change in intertemporal choice behavior across experimental conditions. These considerations have an important bearing on studies of psychopathologies in which impulsive choice is a quintessential clinical feature, such as drug addiction (Cardinal et al., 2003; Bickel et al., 2007) and attention-deficit hyperactivity disorder (Sagvolden and Sergeant, 1998; Winstanley et al., 2006), particularly since dysfunction of the striatum is implicated in both conditions.
One caveat relating to the fMRI data is that, although model comparison showed hyperbolic discounting of utility to be the best model for describing the behavioral data, constraints in the design of the study meant we were not able to use the fMRI data to make comparable inferences about the relative accuracy of the different models. The regressors used to analyze the imaging data were created only from the model we proposed, which was also the model selected by the AIC analysis, and so we caution that these results may not be independent of the model used (e.g., exponential vs hyperbolic). We anticipate further studies that aim to assess the validity of these models using fMRI data.
One of the useful aspects of the model is the ability to calculate utility functions from intertemporal choices. Previous methods for constructing utility functions have mostly used risk-preference tasks, such as simple gambles. These studies suggest that the average utility function derived from risk-preference tasks is magnitude raised to the power of 0.88 (Tversky and Kahneman, 1992), a slightly more concave utility function than that observed in our task. This discrepancy may have arisen from natural variance in the population or from the range of magnitudes used to characterize the function (£1–100 in our study vs the larger hypothetical, or smaller real, amounts used in other studies). The realistic nature of our study (real amounts paid with real delays) may also lead to differences from previous estimates, which for the most part were derived from hypothetical choices. Finally, further studies should address whether utility estimates derived from intertemporal choices differ from those derived from gambles.
Finally, our results bear relevance to a related but distinct personality trait—decisiveness. When people have to make choices between similarly valued options, decision conflict can occur. Decision conflict often leads to a slowing of responses and an increase in activity of conflict-related areas such as the ACC (Botvinick et al., 2004; Cohen et al., 2005; Kennerley et al., 2006; Botvinick, 2007; Pochon et al., 2008). Although this phenomenon is relatively well studied in lower-level perceptual and motor decision-making tasks, it is less well characterized in higher-level tasks (Pochon et al., 2008). We show that decision conflict occurs in intertemporal choice and that it can be caused not only by choosing between similarly valued options but also by choosing between options that are far apart in time (independent of the difference in value). Furthermore, we demonstrated that ACC activity in response to conflict correlated with the degree to which individual subjects were slowed by choice difficulty. This suggests that the psychological trait of decisiveness may be predicted by, or related to, an individual's degree of ACC activity.
Footnotes
- Received March 8, 2009.
- Revision received May 7, 2009.
- Accepted June 1, 2009.
This work was funded by a Wellcome Trust Programme grant to R.J.D. A.P. was supported by a Medical Research Council studentship. We thank J. Winston and E. Korenfeld for insightful discussions.
- Correspondence should be addressed to Alex Pine, Wellcome Trust Centre for Neuroimaging, 12 Queen Square, London WC1N 3BG, UK. a.pine{at}ucl.ac.uk
- Copyright © 2009 Society for Neuroscience 0270-6474/09/299575-07$15.00/0