Previous research, involving monetary rewards, found that limbic reward-related areas show greater activity when an intertemporal choice includes an immediate reward than when the options include only delayed rewards. In contrast, the lateral prefrontal and parietal cortex (areas commonly associated with deliberative cognitive processes, including future planning) respond to intertemporal choices in general but do not exhibit sensitivity to immediacy (McClure et al., 2004). The current experiments extend these findings to primary rewards (fruit juice or water) and time delays of minutes instead of weeks. Thirsty subjects choose between small volumes of drinks delivered at precise times during the experiment (e.g., 2 ml now vs 3 ml in 5 min). Consistent with previous findings, limbic activation was greater for choices between an immediate reward and a delayed reward than for choices between two delayed rewards, whereas the lateral prefrontal cortex and posterior parietal cortex responded similarly whether choices were between an immediate and a delayed reward or between two delayed rewards. Moreover, relative activation of the two sets of brain regions predicts actual choice behavior. A second experiment finds that when the delivery of all rewards is offset by 10 min (so that the earliest available juice reward in any choice is 10 min), no differential activity is observed in limbic reward-related areas for choices involving the earliest versus only more delayed rewards. We discuss implications of this finding for differences between primary and secondary rewards.
Empirical studies of intertemporal choice find that value declines rapidly over the short run but at a slower rate over the long run (Ainslie, 1975; Thaler, 1981; Loewenstein and Prelec, 1992; Kirby, 1997). A convenient functional form for representing such “time preferences” is the quasihyperbolic discount function (Laibson, 1997), according to which the present value of a stream of consumptions (c1, c2, …) is as follows:

V = u(c1) + β Σ_{t≥2} δ^(t−1) u(ct),   (1)

where u is the utility function, and discount parameters β and δ are bounded between 0 and 1.
A multiplicatively scaled transformation of V can be decomposed into two distinct processes:

(1/β) V = [(1 − β)/β] u(c1) + Σ_{t≥1} δ^(t−1) u(ct).   (2)

The second term, labeled the “δ system,” exhibits exponential discounting with discount factor δ. The first term, labeled the “β system,” captures the extra weight given to immediate rewards.
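As a numerical illustration, the β–δ value of a utility stream and its two-system decomposition can be sketched as follows; the utility stream and the parameter values here are purely illustrative, not taken from the data:

```python
import math

def quasi_hyperbolic_value(utils, beta, delta):
    """Present value of a utility stream (u1, u2, ...) under beta-delta
    discounting: V = u1 + beta * sum_{t>=2} delta**(t-1) * u_t."""
    return utils[0] + beta * sum(delta ** t * u for t, u in enumerate(utils[1:], start=1))

def decomposed_value(utils, beta, delta):
    """(1/beta) * V split into a 'beta system' term, which loads only on the
    immediate reward, and a 'delta system' term, which discounts
    exponentially at rate delta."""
    beta_system = (1 - beta) / beta * utils[0]
    delta_system = sum(delta ** t * u for t, u in enumerate(utils))
    return beta_system, delta_system

utils = [1.0, 0.0, 2.0, 0.5]   # illustrative utility stream
beta, delta = 0.5, 0.99        # illustrative parameters in (0, 1)
V = quasi_hyperbolic_value(utils, beta, delta)
b_term, d_term = decomposed_value(utils, beta, delta)
# Identity behind the decomposition: V / beta == beta term + delta term
assert math.isclose(V / beta, b_term + d_term)
```

The identity holds because (1/β)·u1 = [(1 − β)/β]·u1 + u1, so the undiscounted first period splits cleanly between the two systems.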
Based on previous research (McClure et al., 2004; Tanaka et al., 2004), we expected the impatient β system to be related to the mesolimbic dopamine system and associated structures (henceforth, “limbic reward areas”). Humans share this limbic reward system with many other mammalian species, none of which respond to costs and benefits delayed more than a few minutes, except in a rigid, preprogrammed manner. Moreover, recent research finds that people with greater activation in these limbic reward regions in response to gaining or losing money also place greater weight on immediate rewards relative to delayed rewards (Hariri et al., 2006).
Additionally, we anticipated that the δ component would be associated with prefrontal and parietal cortex. These regions have been implicated in planning and deliberation. Damage to these areas in humans produces a pattern of short-sighted, nondeliberative behavior termed “environmental dependency syndrome” (Lhermitte, 1986).
Previously, we studied intertemporal choices between gift certificates (McClure et al., 2004). We found that activity in limbic reward areas (ventral striatum, medial prefrontal cortex, and orbitofrontal cortex) is associated with choices involving immediate rewards; in contrast, prefrontal and parietal regions are active in all intertemporal choices (not just the choices with an immediate option).
One purpose of the current study was to test whether these results replicate with primary rewards and shorter delays. Thirsty subjects chose between volumes of juice or water available at different times within the experiment. Unlike gift certificates, drink rewards allow us to control both when reward is delivered and when it is consumed. Such control is absent from most previous studies of time discounting in people (Ainslie and Monterosso, 2004), which has generated debate over the validity of time discounting studies (Cubitt and Read, 2005).
We extend our previous statistical analysis by estimating a more flexible functional form for discounting. We generalize the idea that discounting can be decomposed into present-oriented and more patient processes. Let D(τ) represent an aggregate discount function, where τ is the delay until consumption, and D(τ) represents the current value of one unit of delayed utility. We hypothesize that D(τ) is composed of weighted subcomponents associated with different brain systems, with one component, Dβ(τ), that discounts at a faster rate than another, Dδ(τ). The overall utility function for a reward R delivered at delay τ is as follows:

U = [w Dβ(τ) + (1 − w) Dδ(τ)] u(R),   (3)

where the weight w is bounded between 0 and 1. We estimate this function for choice behavior and neural activity.
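Under the double-exponential specialization used later in the paper (Dβ(τ) = λβ^τ, Dδ(τ) = λδ^τ), the aggregate discount function can be sketched as below. The mixing weight w is a hypothetical illustration, the λ values are merely in the neighborhood of the behavioral estimates reported in Results, and u(R) = R is assumed for simplicity:

```python
def discount(tau, w=0.5, lam_beta=0.46, lam_delta=0.99):
    """Aggregate discount D(tau): weighted mix of a fast component
    D_beta(tau) = lam_beta**tau and a slow component D_delta(tau) =
    lam_delta**tau. All parameter values are illustrative."""
    return w * lam_beta ** tau + (1 - w) * lam_delta ** tau

def choice_value(reward_ml, delay_min, **kw):
    """Discounted utility of a reward, taking u(R) = R for simplicity."""
    return discount(delay_min, **kw) * reward_ml

# An immediate small reward can beat a larger reward 5 min away...
assert choice_value(2, 0) > choice_value(3, 5)
# ...yet with both options delayed by 10 min, the larger reward wins.
assert choice_value(2, 10) < choice_value(3, 15)
```

The two assertions exhibit the preference reversal characteristic of nonexponential discounting: the fast component dominates near τ = 0 and becomes negligible at longer delays.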
Materials and Methods
Subjects made a series of binary choices of the following form: choose either “R squirts of fluid delivered at delay D min,” or “R′ squirts of fluid delivered at delay D′ min.” Some choices were between early and late delivery of juice, and other choices were between early and late delivery of water. The use of juice and water allowed us to double the number of choices subjects responded to and reduce the risk that subjects would experience the task as repetitive and/or adopt a fixed, rote decision strategy. No trials were mixed; in other words, the choices between early delivery and late delivery were always for the same type of fluid. Hence, each choice can be summarized by a quintuple (R, D; R′, D′; fluid type), where the fluid type was either juice or water.
We scanned a total of 37 (18 female) Princeton graduate and undergraduate students with a 3 Tesla Siemens (Munich, Germany) Allegra scanner, 23 (12 female) for experiment 1, and 14 (6 female) for experiment 2. Subjects were prescreened to exclude history of psychiatric disorder or drug use. Subjects were instructed to abstain from drinking any fluids for 3 h before participating.
Three subjects were excluded from data analysis (all from experiment 1), one for failing to abstain from drinking before the experiment, one because of excessive head movement, and a third for selecting the later, larger option on every choice (which seemed to indicate either lack of engagement with the task or inappropriate calibration of the task for this subject's intertemporal preferences).
The timing structure of both experiments is shown in Figure 1. Each 30 s period was broken into a decision period and a fluid delivery period. During each decision period, subjects were shown a choice (example display shown in Fig. 1B) and given up to 10 s to indicate their preference via a button box. During the 15 s fluid delivery period, a total of four squirts of fluid could be delivered, depending on the subjects' choices in previous decision periods. The experiment was separated into four 7 min blocks with a 1 min rest period between blocks. The ordering of the choices is discussed in section 1 of the supplemental material (available at www.jneurosci.org).
Experiment 1 choices.
Subjects answered one control question (one squirt of juice at 0 min delay or one squirt of juice at 30 min delay) at the beginning of the experiment to acquaint them with the task. This was followed by an intermixed set of 36 experimental choices and 19 control trials (to be described below).
The intertemporal characteristics of the experimental choices can be divided into six delay categories. We represent each delay category as a pair of alternatives, D versus D′, where D is the delay to the early reward and D′ is the delay to the alternative later reward. In a given trial, the subject chose one of these two options: early or late. The six delay categories follow: 0 versus 1 min, 0 versus 5 min, 10 versus 11 min, 10 versus 15 min, 20 versus 21 min, and 20 versus 25 min. Given this structure, it follows that the time difference between the late reward and the early reward, D′ − D, was either 1 or 5 min, and the time of the earliest reward, D, was 0, 10, or 20 min.
The reward magnitudes can be divided into three magnitude categories. We represent each magnitude category as a pair of alternatives, R and R′, where R is the magnitude of the early reward (in 1 ml “squirts” of fluid), and R′ is the magnitude of the mutually exclusive later reward (in squirts of fluid). The three magnitude categories were 1 versus 2, 1 versus 3, and 2 versus 3.
Over the duration of the experiment, subjects were presented with each possible combination of (R, D) versus (R′, D′) twice: once with juice and once with water. Hence, the experimental choices had a 6 × 3 × 2 structure (six delay categories by three magnitude categories by two fluid categories), or 36 experimental choices in total. The type of fluid (juice or water) was indicated visually as depicted in Figure 1B; red circles indicated juice, and blue circles indicated water.
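The 6 × 3 × 2 design can be enumerated directly; this sketch simply reproduces the choice structure described above:

```python
from itertools import product

# Six delay categories (D, D') in minutes, three magnitude categories
# (R, R') in 1 ml squirts, and two fluid types, as described in the text.
delay_pairs = [(0, 1), (0, 5), (10, 11), (10, 15), (20, 21), (20, 25)]
magnitude_pairs = [(1, 2), (1, 3), (2, 3)]
fluids = ["juice", "water"]

# Each choice is a quintuple (R, D; R', D'; fluid type).
choices = [
    (R, D, Rp, Dp, fluid)
    for (D, Dp), (R, Rp), fluid in product(delay_pairs, magnitude_pairs, fluids)
]
assert len(choices) == 36  # 6 delay x 3 magnitude x 2 fluid categories
```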
The control trials were introduced to control for the motor and visual requirements associated with task performance. On these trials, subjects were simply required to press a button to receive a specified number of squirts after a given delay. Perhaps in part because of low power as a result of fitting only 19 events, we were unable to interpret the pattern of brain activity during control events and do not discuss these choices further.
Experiment 2 choices.
Subjects responded to a total of 36 intertemporal choices in experiment 2. Delays to reward delivery corresponded to one of three values: 10 min (labeled as “early”), 20 min (“middle”), and 30 min (“late”). Delay categories were early versus middle, middle versus late, and early versus late. Subjects were told the meaning of the different delay period labels before entering the scanner. Reward values were as in experiment 1, with R and R′ corresponding to 1 versus 2, 1 versus 3, and 2 versus 3 squirts of juice or water. Subjects were given two choices for each combination of delay category, reward magnitude, and fluid type (juice or water), yielding 36 choices in total.
Juice and water delivery.
Squirts of juice and water were delivered by a computer-controlled syringe pump and customized software. Individual squirts were delivered at a rate of 1 ml/s for 1 s, giving a squirt size of 1 ml. Before receiving a squirt, subjects were required to press a button. This ensured that subjects would have control over, and not be surprised by, the fluid delivery. This, plus the small squirt size, was sufficient to keep swallowing-related head movement to a minimum (all subjects included in the data analysis had total head movement <3 mm in any direction during the experiment).
Functional magnetic resonance imaging data analysis.
Functional magnetic resonance imaging (fMRI) data analysis was performed using AFNI (Cox, 1996), SPM2 (Wellcome Department of Cognitive Neurology, London, UK), and custom-written programs in Matlab. Whole-brain, blood oxygenation level-dependent (BOLD)-weighted echo-planar images were acquired parallel to the anterior commissure–posterior commissure line with a repetition time of 2 s and echo time of 30 ms. Functional images were first aligned to correct for head movement during the scan. Slice timing correction was then performed using sinc interpolation. Images were subsequently normalized to Montreal Neurological Institute coordinates and resampled at 4 × 4 × 4 mm3 resolution. A Gaussian smoothing kernel of 8 mm full-width at half-maximum (FWHM) was applied to improve comparison across subjects. Head movement estimates derived from the first realignment step were included as regressors in all analyses to help diminish the impact of any movement-related effects on the results.
Analysis of a general linear model (GLM) was performed using SPM2. In our previous study (McClure et al., 2004), we conducted such an analysis with a regressor that was constructed by weighting decision periods with an immediate reward in the choice set by 1.0, weighting decision periods with only delayed options in the choice set by −0.5, and using a zero weighting for periods in which no decision is made. We called this the β regressor. In contrast, the β regressor used in our current analysis selectively weights decision periods with an immediate option in the choice set by 1.0 and all other periods by 0. This new β regressor ensures that areas showing increased activity only for decisions involving an immediate reward will only load on the new β regressor. The differences produced by the use of the new β regressor on the data from our previous study are subtle and do not qualitatively change any of the effects originally reported. In comparing the results between studies (see Fig. 5), we have reanalyzed our previous data with the new β regressor structure.
In our previous study, we also used a regressor that was constructed by weighting all decision periods by 1.0 and weighting all other periods by 0. We called this the δ regressor, and we use the same regressor in the current study.
In addition to the β and δ regressors, regressors were included for head movement (six regressors for movements along the x-, y-, and z-axes as well as rotations around the x-, y-, and z-axes). Baseline drift in the magnetic resonance signal was accounted for with three regressors to estimate a quadratic dependence between signal amplitude and time. Finally, additional regressors were included to account for response time and time on task (i.e., linear dependence on choice number). All regressors were fit simultaneously in a single GLM analysis.
GLMs were fit to each subject individually. Between-subjects t tests were then performed to identify voxels that loaded significantly on the β and/or δ regressors. The AFNI program AlphaSim was used to determine our significance criterion. The smoothness of our data was estimated using 3dFWHM. Using these tools, we determined that a corrected (familywise) p < 0.05 is achieved with a minimum of nine contiguous voxels each significant at p < 0.001.
Individual trial estimates of brain response.
We estimated the amplitude of brain response for each identified brain region separately for each trial to correlate with choices and to fit discount functions. To do so, BOLD data for each voxel identified in the initial GLM were normalized by the mean across the experiment to give percentage signal change relative to baseline and then averaged across voxels in each identified brain region. We then fit gamma functions during each choice epoch with the following functional form:

g(t) = a [(t − τ)/(bc)]^b e^(b − (t − τ)/c)  for t > τ, and 0 otherwise,

where a is the response amplitude and τ is a temporal offset. This is a normalized version of the gamma function developed by Cohen (1997), with peak amplitude a at t = τ + bc; values for b and c were taken from Cohen (1997) (8.6 and 0.55, respectively).
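A minimal implementation of this normalized gamma function (b and c from Cohen, 1997; the normalization makes the peak value equal to a):

```python
import math

B, C = 8.6, 0.55  # shape parameters from Cohen (1997)

def hrf(t, a=1.0, tau=0.0, b=B, c=C):
    """Normalized gamma response: amplitude a, temporal offset tau.
    g(t) = a * ((t - tau)/(b*c))**b * exp(b - (t - tau)/c) for t > tau,
    and 0 otherwise. The peak value a occurs at t = tau + b*c."""
    dt = t - tau
    if dt <= 0:
        return 0.0
    return a * (dt / (b * c)) ** b * math.exp(b - dt / c)

# With b = 8.6 and c = 0.55, the response peaks about 4.7 s after onset.
t_peak = B * C
assert math.isclose(hrf(t_peak), 1.0)
```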
Our fitting procedure was developed to overcome two main challenges. First, we did not know at which exact time choice-related brain responses occurred. This is because subjects had 10 s to make their choice on each trial and took several seconds on average to render their choice. We therefore included the temporal offset parameter τ, which was allowed to vary to give the best fit to the BOLD signal (where τ is at most the subject's response time).
The second challenge arose in estimating the baseline from which the brain response occurred. The quadratic fit to the BOLD signal drift during the experiment was adequate for the full GLM analysis (above) but did not account for higher-frequency signal drift that was apparent during the period of each choice. We therefore fit the baseline signal individually for each choice by computing a weighted average of the BOLD signal amplitude around the time at which the choice was visually displayed. We gave maximal weight to the BOLD signal at the time when the choice was first displayed with linearly decreasing weight for the subsequent 2 s (to a weight of 0 at 2 s) because changes in BOLD response are known to lag neural events by at least 2 s. A linearly decreasing weight was also given to the BOLD signal for the 4 s before choice presentation (0 weight at 4 s). We did not weight BOLD signals beyond 4 s because of the proximity of preceding juice and water delivery events.
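The baseline weighting described above amounts to an asymmetric triangular window around choice onset; a sketch:

```python
def baseline_weight(t):
    """Weight applied to the BOLD signal at time t (seconds, relative to
    choice onset) when estimating the per-trial baseline: maximal at onset,
    falling linearly to 0 at +2 s (because the BOLD response lags neural
    events by at least 2 s) and to 0 at -4 s (to avoid contamination from
    the preceding juice/water delivery)."""
    if 0 <= t <= 2:
        return 1.0 - t / 2.0
    if -4 <= t < 0:
        return 1.0 + t / 4.0
    return 0.0

assert baseline_weight(0) == 1.0
assert baseline_weight(2) == 0.0 and baseline_weight(-4) == 0.0
```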
After correcting for baseline BOLD amplitude, a simplex fitting procedure was performed in Matlab to find values of a and τ that minimized the sum-squared error from actual BOLD signal measures.
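The paper uses a simplex search for a and τ; as a self-contained stand-in, the sketch below grid-searches the onset τ and solves for the best-fitting amplitude a in closed form at each candidate (the gamma form and its b, c values are restated from above; the synthetic data are illustrative):

```python
import math

def gamma_response(t, a, tau, b=8.6, c=0.55):
    # Normalized gamma (Cohen, 1997): peak amplitude a at t = tau + b*c.
    dt = t - tau
    return 0.0 if dt <= 0 else a * (dt / (b * c)) ** b * math.exp(b - dt / c)

def fit_trial(times, bold, max_tau, tau_step=0.1):
    """Least-squares fit of amplitude a and onset tau to baseline-corrected
    BOLD samples. For fixed tau the model is linear in a, so a has a
    closed-form least-squares solution at each grid point."""
    best = (float("inf"), 0.0, 0.0)  # (sse, a, tau)
    tau = 0.0
    while tau <= max_tau:
        basis = [gamma_response(t, 1.0, tau) for t in times]
        denom = sum(x * x for x in basis)
        a = sum(x * y for x, y in zip(basis, bold)) / denom if denom else 0.0
        sse = sum((y - a * x) ** 2 for x, y in zip(basis, bold))
        if sse < best[0]:
            best = (sse, a, tau)
        tau += tau_step
    return best[1], best[2]  # fitted a, tau

# Recover known parameters from noiseless synthetic data.
times = [i * 0.5 for i in range(40)]            # 20 s sampled every 0.5 s
bold = [gamma_response(t, 0.8, 3.0) for t in times]
a_hat, tau_hat = fit_trial(times, bold, max_tau=10.0)
```

On noiseless data the grid recovers the generating parameters to within the grid resolution; on real data, the bound on τ (the subject's response time, as in the text) keeps the onset inside the decision period.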
Brain activity and behavior.
To estimate the relationship between brain activity and behavior (including fits of discount functions to brain activity), we first mean normalized the BOLD response amplitudes by dividing by the mean amplitude across all voxels in each brain region for each subject and across all trials. We then averaged these mean normalized responses over all β and δ areas, respectively.
In fitting to behavior, probit models were used to predict, from brain activity, the probability of choosing the earlier of the two rewards in a given choice. In addition to the normed brain activity variables, we entered the following set of basic control variables: indicator variables for each subject, choice number, and the number of juice squirts the subject received in the delivery period before the choice. Where noted, we also control for specific choice type (i.e., an indicator variable for each combination of reward size and delay). Details for the procedure used to fit discount functions to brain activity are given in section 2 of the supplemental material (available at www.jneurosci.org).
Each subject was presented with 36 binary intertemporal choices (the timing and sequence of events in the experiment are depicted in Fig. 1). For each choice, subjects decided between a reward of size R at delay D and a second reward of size R′ at delay D′ (where R < R′ and D < D′). We varied the size of the two rewards (R and R′), the time to the earliest reward (D), and the lag between rewards (D′ − D) as detailed above. Each choice was repeated for juice squirts and water squirts.
Subjects showed clear evidence of nonexponential discounting in their behavioral preferences. Figure 2 provides a nonparametric test of this property. In Figure 2, we bin the data by the value of D, the time delay to the early reward in a binary choice; in our experiment, D is 0, 10, or 20 min.
The probability of selecting the earlier reward for the data labeled D = 0 min in Figure 2A is calculated as the percentage of trials that subjects select the early option from the six sets of trials listed in Table 1 (each row represents a trial, which is a choice between an early option and a later option). Likewise, the probability of selecting the early reward for the data labeled D = 10 min is calculated as the percentage of trials that subjects select the early option from the trials detailed in Table 2 (again each row represents a choice/trial). Analogous calculations are conducted for the data labeled D = 20 min, for which the trials in Table 3 are used.
Figure 2A reveals a greater tendency to select the earlier reward option in the D = 0 min trials compared with either D = 10 min trials (paired t test; p < 0.001) or the D = 20 min trials (p < 0.001). Additionally, by comparing the plots for the 1 and 5 min differences between D′ and D, it can be seen that the immediacy effect is driven primarily by the trials with 5 min differences (Fig. 2B).
We next fit subjects' choice data to both the β–δ discounting model (Eq. 1), which assumes that the β system conforms to a discrete (1–0) function, and the more flexible formulation represented by Equation 3. We first estimated a probit model that predicts choice based on the β–δ discounting model. This yields parameter estimates of β = 0.52 (SE, 0.045) and δ = 0.98 (SE, 0.015). We cannot reject the hypothesis that the δ system does not discount at all [H0: δ = 1; p = 0.25, likelihood ratio (LR) test]. However, we can reject the hypothesis that the discounting is purely exponential (H0: β = 1; p < 0.0001, LR test).
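The β–δ probit can be sketched as follows. The noise scale is a hypothetical free parameter, u(R) = R is assumed, and β and δ are set to the point estimates reported above rather than refit:

```python
import math

def Phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bd_value(R, D, beta=0.52, delta=0.98):
    """Beta-delta value of R squirts at delay D min: an immediate reward
    is undiscounted; delayed rewards carry the extra beta weight."""
    return R if D == 0 else beta * delta ** D * R

def p_choose_early(R, D, Rp, Dp, noise=1.0):
    """Probit probability of choosing the earlier option; the noise scale
    is an illustrative assumption."""
    return Phi((bd_value(R, D) - bd_value(Rp, Dp)) / noise)

# With beta = 0.52, 2 ml now is preferred to 3 ml in 5 min...
assert p_choose_early(2, 0, 3, 5) > 0.5
# ...but shifted to a 10 vs 15 min choice, the larger reward is preferred.
assert p_choose_early(2, 10, 3, 15) < 0.5
```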
Next, we fit choices with a “double-exponential” implementation of Equation 3, where Dβ(τ) = λβ^τ and Dδ(τ) = λδ^τ. This is a continuous-time generalization of the β–δ model (Harris and Laibson, 2004). Using a probit model, we estimate that λβ = 0.46 (SE, 0.107) and λδ = 1.02 (SE, 0.019). Once again, we cannot reject the hypothesis that the more patient system does not discount at all (H0: λδ = 1; p = 0.18, Wald test). If the two discounting parameters λβ and λδ are identical, then this model is equivalent to the classic model of single-exponential discounting. We can reject the hypothesis that the two discounting parameters are the same in the pooled data (H0: λβ = λδ; p < 0.0001, Wald test), suggesting that a two-system model is consistent with the behavioral data.
Brain response to juice and water
We identified brain regions that responded to juice and water delivery as those that correlate significantly with a regressor generated by convolving delivery times with a hemodynamic response kernel. This analysis reveals three brain regions (Fig. 3A). The first is bilateral primary taste cortex in the anterior insula. The second is a region in the premotor area (PMA), most likely resulting from the button press required to initiate fluid delivery. Finally, we find regions bilaterally in the ventral striatum that are also activated by fluid delivery. The ventral striatum has repeatedly been implicated in reward processing involving both money (Knutson et al., 2001) and primary rewards such as juice (McClure et al., 2003; O'Doherty et al., 2003). This is also a main projection site of midbrain dopamine neurons that are known to convey reward information (Schultz et al., 1997).
To assess whether reward-related brain responses to juice and water are significantly different, we extract mean event-related activity changes from the ventral striatum bilaterally (Fig. 3B). No differences in the amplitudes of brain response to juice and water are found in this analysis (cf. Berns et al., 2001).
Identification of β and δ brain systems
The main purpose of our study was to determine whether distinct brain regions would show qualitatively different patterns of time discounting and, if so, whether the division of regions would resemble that revealed by our previous research. To avoid imposing our hypotheses on the data, we do not use a region-of-interest analysis to identify potential β and δ brain areas but instead conduct a whole-brain analysis. We regress voxel-level neural activity (i.e., the BOLD signal measured in each voxel) on control variables (time in experiment and number of juice/water squirts in previous delivery period) and on dummy variables that represent different types of intertemporal choices. As described previously, one dummy variable identifies intertemporal choices that involve a decision between an immediate reward and a delayed reward (β dummy). A second dummy variable identifies all intertemporal choices (in other words, all decision epochs) (δ dummy). Note that the β dummy identifies a subset of the decision epochs identified by the δ dummy and that brain regions load on the β regressor to the degree that they respond preferentially to intertemporal choices involving immediate rewards. Brain areas that load only on the δ regressor do not respond preferentially to the presence of an option for immediate reward.
We identify several brain regions that load on the β regressor (Fig. 4A). These regions include the nucleus accumbens (NAc), subgenual cingulate cortex (SGC), medial orbitofrontal cortex (mOFC), posterior cingulate cortex (PCC), and precuneus. As demonstrated below, these are very similar to the areas identified by a similar analysis in our previous study (McClure et al., 2004). All of these regions are within the limbic system and paralimbic cortex and have been directly implicated in reward processing in previous studies (Breiter and Rosen, 1999; Knutson et al., 2001). The β dummy also identifies a region in the dorsal anterior cingulate cortex (ACC). The ACC is frequently associated with the presence of response conflict (Barch et al., 2001; Botvinick et al., 2001). Finding ACC activity in this context may imply that choices involving an immediate reward option are associated with greater conflict. Such conflict is consistent with a two-system model of decision making.
As in our previous study, regions that correlate significantly with the δ dummy included both visual and motor areas, as well as regions commonly associated with higher-level cognitive functions. It is likely that areas of primary visual and premotor cortex (including PMA and supplementary motor area) were activated because task performance requires subjects to look at a visual display and press a button to indicate preference. The remaining areas we find to correlate with the δ dummy include a region in the PCC, bilateral areas in the posterior parietal cortex (PPar), bilateral areas in the anterior insula (Ant Ins), and several regions in the dorsolateral prefrontal cortex (DLPFC) (Brodmann areas 9, 44, 46, and 10) (Fig. 4B). Activity in DLPFC and PPar is observed commonly in tasks involving cognitive processes such as working memory, abstract problem solving, and exertion of control in favor of long-term goals (Miller and Cohen, 2001). The region in the PCC lies slightly rostral to the region that correlated with the β regressor, with no shared voxels.
Consistency with results from intertemporal choice for money
The areas associated with the β and δ regressors, henceforth referred to as β and δ areas, are close to, or overlap with, the areas identified in our previous study of intertemporal choice for money (McClure et al., 2004). To directly compare the responses in this and our previous report, we performed a conjunction analysis to identify overlapping voxels (Fig. 5). Among the δ areas, every region of activity replicates our previous findings, with the exception of the additional finding of a rostral area of the PCC. There is substantial overlap of the voxels among the δ areas at p < 0.001 and even greater overlap with a threshold of p < 0.01.
Among the β areas, almost all regions of activity replicate across the two studies, but, unlike the δ regions, this region-level replication does not translate into voxel-level replication. At p < 0.001, only seven voxels in the medial prefrontal cortex are consistent across the two studies. Two of these voxels are in the SGC, and the other five are in the dorsal ACC. The number of overlapping voxels increases somewhat at p < 0.01 to include one voxel in the PCC and one voxel in the NAc (Fig. 5B).
A possible interpretation of these results is that cognitive (δ) brain regions are domain general and hence are consistent across tasks, a finding that has been reported by others (Shallice, 1982; Duncan, 1986; Miller and Cohen, 2001). In contrast, limbic reward-related (β) areas may be more stimulus or task specific. The same general brain regions may be involved in a variety of functions, but the specific subregions involved may be dependent on reward modality, time scale, or other details of the particular task circumstance.
Our analysis has shown that subjects' behavioral choices are described by a two-system discounting model. Our analysis of the brain-imaging data has also identified two different neural systems, which appear to be associated with each of the two discounting systems. To test this relationship more directly, we fit discount functions to the identified β and δ brain areas, respectively. Under the hypothesis that the β and δ brain areas generate the two components of the discount function that we estimated from the behavioral data, we may expect that there will be a direct correspondence between fits to behavior and fits to brain activity. As will be seen, the fits correspond very well for the δ system but not for the β system. We discuss possible reasons for this in the Discussion.
We estimated an exponential decay function for the β and δ areas. We created a multiplicatively normed measure of activity for each brain region by dividing the observed amplitude in a given trial by that subject's average amplitude over all trials for the brain region (see supplemental materials section 3, Discounting in individual brain areas, available at www.jneurosci.org). We then averaged these normalized measures over the β and δ areas and fit these activation levels with exponentially discounted value functions: λi^D R + λi^D′ R′, where i indexes the identified brain areas (β or δ). We use the sum of the discounted values because we assume that the value of both options is appraised within each trial. Table 4 presents the estimated discount factors. Note that we find similar results when we estimate the discount factors without reward magnitude (i.e., using λi^D + λi^D′ instead of λi^D R + λi^D′ R′), as well as when we include task controls (juice squirts, choice number, and response time).
Although both discount factors appear to be numerically similar (i.e., both are >0.9), in fact the β areas demonstrate a substantially greater amount of discounting. Because time units are measured in minutes, the discount factor of 0.963 implies that a 25 min delay is discounted by 0.963^25 ≈ 0.38. This represents a 62% decay in value relative to an immediate reward. The δ area discount factor is significantly higher than that of the β area (p < 0.001) and implies substantially less discounting: a 25 min delay is discounted by only 0.990^25 ≈ 0.78. For both areas, we can reject the null hypothesis that the discount factor is equal to unity (p < 0.01). However, the dependence on time delay in δ areas seems to be determined by only one of the areas identified in the brain by the δ regressor (left anterior insula). When the discount factors estimated for each of the δ regions are examined individually, the null hypothesis that the discount factor is unity can be rejected only for the left insula at p < 0.01 (see supplemental material, available at www.jneurosci.org); that is, all of the others fail to exhibit significant discounting. In contrast, all of the β areas exhibit significant discounting.
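The decay arithmetic can be checked directly from the per-minute discount factors:

```python
beta_factor, delta_factor = 0.963, 0.990   # per-minute discount factors
beta_25 = beta_factor ** 25                # fraction of value retained by beta areas
delta_25 = delta_factor ** 25              # fraction retained by delta areas
# Over a 25 min delay, beta areas retain well under half of the value,
# while delta areas retain roughly three-quarters of it.
assert beta_25 < 0.5 < delta_25
```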
Brain activity and behavior
Finally, we analyze the relationship between brain activity and behavior, both controlling for and not controlling for the specific parameters of choice (i.e., reward amounts and time delays) that subjects faced in a particular trial. We use the average of the multiplicatively normed activity over the β and δ areas, as done previously for the neural discounting reported in Table 4, and estimate a probit model that predicts the probability that a subject chooses the early option in a given trial.
We first estimate a model in which choice is predicted by average (across identified brain regions) β activity Aβ and average δ activity Aδ, with basic controls for subject, choice number (1, 2, … 36), and recent juice squirts. In this regression specification only, we exclude choices in which (D, D′) = (0, 1). This is because our imaging analysis suggests that the β system responds to rewards at both of these delays, and the behavioral data indicate that the immediacy effect is primarily found when choosing between delays of 0 and 5 min.
We find that average activity in both β and δ brain areas is associated with choice, such that greater activity in β areas and lesser activity in δ areas predict a greater likelihood of choosing the sooner, lesser option (Table 5). The coefficient (coeff.) on Aβ is 0.288 (p = 0.02), indicating that higher activity in β areas increases the probability of choosing the earlier reward. In addition, activity in the δ areas has a marginally significant negative effect on choosing early (coeff. = −0.195; p = 0.15). However, average β and δ area activity do not significantly predict choice once we control for the characteristics of the choice: with the addition of controls for reward size and delay, the coefficient on Aβ also becomes insignificant (coeff. = 0.094; p = 0.46), and the coefficient on Aδ is now far from the significance threshold (coeff. = −0.081; p = 0.60). This indicates that although average β area activity is associated with choosing the earlier reward, it does not independently predict which reward the subject will choose once the characteristics of that choice are taken into account.
Experiment 2: framing experiment to test nature of immediacy
The brain areas identified by the β and δ regressors are similar to those identified in our previous study (McClure et al., 2004). This is surprising given not only the different modality of rewards in the two studies, but also the different time scales over which rewards were discounted. In the current experiment, the latest a reward was delivered was at a 25 min delay, and these delayed rewards did not elicit a significant response in the β system. In contrast, in our previous study, the earliest a reward could be received was after the experimental session had ended and the subject had a chance to access their computer (gift certificates were sent by e-mail). Nevertheless, this was adequate to evoke activity in β regions. Thus, despite the fact that the latest juice/water reward (in the present study) was available sooner than the earliest monetary reward (in the previous study), the former did not elicit activity in β areas, whereas the latter did.
This pattern of results suggests the possibility that the β system may respond not to the absolute delay until reward delivery but rather to the earliest reward available in the overall choice set. We tested this hypothesis in a second experiment in which the delays were shifted by 10 min (D = 10, 20, or 30), so that the shortest delay to reward delivery was 10 min. To further encourage subjects to consider the 10 min delay as the most “immediate” option, we assigned a label to each delay: “early” for 10 min options and “middle” and “late” for the 20 and 30 min options, respectively. All of these labels were explained to subjects before the experiment.
Contrary to our hypothesis, subjects did not treat rewards delayed by 10 min as if they were immediate. In fact, they were equally likely to select the lesser reward option when it was available early (i.e., 10 vs 20 min) as when it was available after a medium delay (i.e., 20 vs 30 min; p = 0.53) (data not shown). Correspondingly, we failed to observe any brain areas that showed a greater propensity to respond to early versus middle choices than to middle versus late choices. However, we did observe the same set of δ brain areas responding to all choices as in experiment 1 (data not shown).
Discussion
We use functional neuroimaging to study the brain areas involved in making intertemporal decisions for primary rewards. Delays of just a few minutes lead thirsty subjects to substantially discount the value of juice and water. Estimating a discount function from subjects' choices, we find that subjects discount drink rewards available 5 min from the present by >50% relative to rewards available immediately. In contrast, subjects display little or no discounting when both rewards under consideration are delayed (e.g., 20 vs 25 min).
Compared with our previous report (McClure et al., 2004), the current study uses a different modality of reward (fluid vs money) and a substantially different time scale of discounting (minutes vs weeks). Nevertheless, the two studies identify a consistent array of brain areas that are involved in discounting. Activity in the (limbic reward) β system decays rapidly as opportunities for reward are delayed, whereas the (frontal/parietal) δ system is much less sensitive to the timing of available rewards.
In the current study, we estimated a model in which each of the two neural systems has a separate continuous exponential discount function (this is a generalization of the quasihyperbolic model). In this generalization, the integrated discount function is an additive combination of the two exponentials, D(t) = ωβ^t + (1 − ω)δ^t, where β < δ and ω is the relative weight of the β system. This function has the important property that short-run discount rates are higher than long-run aggregate discount rates, because the relative magnitude of the (impatient) β system is greater in the short run than in the long run.
There is a long history of research in psychology suggesting that decision making under many different circumstances reflects the interaction of qualitatively different systems. Some researchers have compared emotional (or affective) processes and deliberative (or analytic) processes (Loewenstein, 1996; Kahneman, 2003), whereas others have contrasted automatic and controlled processes (Posner and Snyder, 1975; Shiffrin and Schneider, 1977; Cohen et al., 1990). Both distinctions have received support from research using diverse methods and subjects. Such distinctions have also begun to receive support from neuroscientific evidence (Sanfey et al., 2006). For example, LeDoux (1996) finds that activation of the amygdala in response to fear stimuli arises through two pathways: one fast, stereotyped, and driven by direct sensory projections, and the other slower, more goal-directed, and adaptive. Similar findings have been observed in human decision making using functional brain imaging (Sanfey et al., 2003; Cohen, 2005).
Discounting of primary and secondary rewards
The consistency in the pattern of brain activity across the McClure et al. (2004) study and the current study is intriguing given the change in reward horizons. The money study and the primary reward study both identify a neural/behavioral “immediacy” effect. However, the money study showed this effect for financial rewards that were received, at the earliest, >1 h later. Moreover, the subjects in the money study were given gift certificates for Amazon.com, so the receipt of goods took at least 1 d.
In contrast, our current experiment shows that choices that include an option for an immediate fluid squirt generate a limbic response, but choices in which all squirts are delayed by at least 5 min fail to do so. Why should the β system respond to gift certificates delivered after more than 1 h when it does not respond to squirts of juice or water delayed by 5 or 10 min?
Our results are not consistent with the conjecture, which motivated experiment 2, that immediacy is a relative concept framed by the circumstances of the task. If that were the case, the β system would have responded to the earliest available reward in the framing experiment, despite the 10 min time delay to the earliest reward. The findings from experiment 2 suggest that the β system responds to absolute rather than relative delays, at least for rewards of juice and water.
Perhaps the reason that the β system was unresponsive to even modest delays of juice and water is that these are “primary reinforcers” (rewards that directly satisfy evolved appetitive mechanisms). In contrast, gift certificates are “secondary reinforcers” that may evoke limbic activity only indirectly, mediated by more abstract symbolic and/or associative processes that may be more susceptible to contextual framing effects. That is, limbic activity in response to primary reinforcers may follow a rigid discounting function that has adapted under evolutionary pressures to physiological needs as well as specific features of the environment, whereas the discounting function for secondary reinforcers may be more context dependent. Primary rewards are likely to have reasonably well defined and stable temporal characteristics (e.g., the rate of body water loss or the deterioration of food goods) and therefore are amenable to discount rates expressed in absolute time (although these may vary according to the type of reward, changes in internal states such as satiety, and known features of the environment such as temperature). In contrast, the value and timing characteristics of secondary reinforcers may be more variable and dependent on context and therefore better served by a more flexible system for temporal discounting that is sensitive to relative value. These possibilities suggest a number of predictions (for instance, that money rewards should be subject to framing effects in a way that primary rewards are not) that remain to be tested in future experiments.
Difference in fit of discount function to behavior and brain activity
Equation 3, above, suggests the simple hypothesis that the β system and the δ system combine additively to determine the value of a delayed reward. The double-exponential discount function would result from this additive combination if β areas and δ areas are each characterized by exponential discount functions with different discount factors.
In our statistical analysis, we fit the double-exponential discount function to behavior and single-exponential discount functions to brain activity in individual brain areas. Because the behavioral and neural data sets are distinct, the resulting estimates present a new test of our model: do the behavioral estimates of the double-exponential discount function match the neural activation estimates of the exponential discount functions in the β and δ regions?
In the case of the δ system, there is a good correspondence. In fitting the behavioral data with the double-exponential discount function, we were unable to reject the null hypothesis that there is no discounting in the δ system (i.e., the larger discount factor in the double-exponential function is statistically indistinguishable from unity). Likewise, in fitting the neural activation of individual δ brain areas, only one area (the anterior insula) showed a discount factor significantly different from unity.
However, the discount factors extracted for the myopic β system did not quantitatively match across the behavioral and neuroimaging analyses. Fitting the behavioral data with a double-exponential discount function, we estimate a discount factor in the β system equal to 0.46 (i.e., the smaller discount factor in the double-exponential function is 0.46), whereas the fit to neural activity in β brain areas produced a discount factor of 0.96. Nevertheless, a discount factor of 0.96/min still implies a high rate of discounting: because 0.96^25 ≈ 0.39, value declines by 61% if the delay is only 25 min.
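A quick numeric check conveys the magnitudes involved. Using the rounded point estimates quoted above (variable names are ours), even the slower neural estimate discounts heavily over 25 min, while the behavioral estimate implies that the β system's contribution is essentially gone within minutes.

```python
# Value of a reward 25 min away under exponential discounting, using the
# rounded per-minute estimates reported in the text.
neural = 0.96 ** 25      # beta-area (neural) estimate: substantial decline
behavioral = 0.46 ** 25  # behavioral estimate: effectively zero by 25 min
```

The gap between the two numbers restates the quantitative mismatch discussed above: both imply steep short-horizon discounting, but on very different scales.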
Although a direct correspondence between discounting behavior and discounting of brain activation would be elegant, there is at least one simple reason why it might not be observed. Our identified brain regions likely perform functions other than valuing the rewards in our experiment. Averaging other functions with valuation will tend to bias estimates of β region discounting toward 1 (i.e., toward being uniformly responsive to all trials).
Intertemporal choice and conflict monitoring
We have observed that choices between lesser immediate and greater delayed rewards elicit activity in distinct neural systems that appear to favor different choice outcomes. That is, intertemporal choice under these conditions elicits decisional conflict. A growing body of evidence suggests that a dorsocaudal region of the ACC responds to conflicts in processing (Carter et al., 1998; Botvinick et al., 2004; Yeung et al., 2004). This is consistent with findings from the current study in which we observed activity in a similar area of the ACC that was greater for decisions involving choices between immediate and delayed rewards than for choices between only delayed rewards. Such findings have been taken as evidence for a conflict-monitoring function of ACC, which serves to detect conditions requiring the recruitment of cognitive control mechanisms subserved by prefrontal cortex and associated structures (Botvinick et al., 2001; Kerns et al., 2004).
Our observations of an association of ACC activity with choices between an immediate and a delayed reward (consistent with its role in conflict monitoring) and of an association of PFC activity with choices favoring delayed rewards (consistent with its role in executing cognitive control) accord with existing theory regarding the functions of these neural systems. This, in turn, provides a framework within which to generate quantitative predictions about the dynamics of the neural mechanisms underlying intertemporal choice. Such studies (joining economic theory with hypotheses about, and measurements of, mechanism from neuroscience) remain a priority for future work.
This work was supported by National Institutes of Health Grants P30 AG024361 (J.D.C.), T32 MH065214 (J.D.C.), F32 MH072141 (S.M.M.), 3 P30 AG012810 (D.I.L.), and P01 AG005842 (D.I.L.). K. D'Ardenne contributed significantly to the task design and was critical in data collection. P. Dayan provided valuable comments on this manuscript.
- Correspondence should be addressed to Samuel M. McClure, Center for the Study of Brain, Mind, and Behavior, Princeton University, Princeton, NJ 08540.