Abstract
Hypothetical reports of intended behavior are commonly used to draw conclusions about real choices. A fundamental question in decision neuroscience is whether the same type of valuation and choice computations are performed in hypothetical and real decisions. We investigated this question using functional magnetic resonance imaging while human subjects made real and hypothetical choices about purchases of consumer goods. We found that activity in common areas of the orbitofrontal cortex and the ventral striatum correlated with behavioral measures of the stimulus value of the goods in both types of decision. Furthermore, we found that activity in these regions was stronger in response to the stimulus value signals in the real choice condition. The findings suggest that the difference between real and hypothetical choice is primarily attributable to variations in the value computations of the medial orbitofrontal cortex and the ventral striatum, and not attributable to the use of different valuation systems, or to the computation of stronger stimulus value signals in the hypothetical condition.
Introduction
Real choices are binding consequential commitments to a course of action, like buying a house, accepting a job, or casting a vote. However, scientists and forecasters interested in these real choices often settle for measuring hypothetical statements about likely or future choices instead. Hypothetical choices are common in psychology and neuroscience experiments when implementing real choice is impractical or unethical. The relationship between the two types of choices is also conceptually important because complex real choices usually have hypothetical future plans embedded in them. For example, a student might make a real binding choice of a university to attend, planning to major in electrical engineering, but the planned major is itself a hypothetical future choice, at the time of the real university choice.
The reliance on hypothetical choice data presumes that hypothetical choices are usually good forecasts of actual choices. But saying “We should get married!” is not the same as saying “I do,” which is a legally binding real choice. Furthermore, many studies have found a “hypothetical bias”: people overstate hypothetical valuations (Cummings et al., 1995; Johannesson et al., 1998; List and Gallet, 2001; Little and Berrens, 2004; Murphy et al., 2005; Blumenschein et al., 2007) and plans (Ariely and Wertenbroch, 2002; O'Donoghue and Rabin, 2008; Tanner and Carlson, 2009) compared with carefully matched real choices.
To explore the neural bases of this phenomenon, we used a simple functional magnetic resonance imaging (fMRI) task in which subjects make real and hypothetical purchase decisions for consumer goods. The experiment was designed to test two hypotheses regarding two potential explanations of hypothetical choice bias. Hypothesis 1 states that the hypothetical choice bias would be attributable to the deployment of different valuation systems (with potentially different properties) during real and hypothetical choice. Indeed, there are conceivable differences between real and hypothetical choices: real choices are typically precise, immediate, have higher stakes, and are often more emotionally charged, whereas hypothetical choices, which have no consequence, might be rapid and mindless, requiring fewer cognitive resources.
However, the alternative hypothesis (that the same valuation systems are recruited for both types of decision) cannot be dismissed a priori because previous studies have found that certain areas of the brain, including the orbitofrontal cortex, encode stimulus values during simple real economic choice (Knutson et al., 2007; Plassmann et al., 2007, 2010; Hare et al., 2008) and during preference rating tasks involving neither commitment nor actual decisions (Erk et al., 2002; Paulus and Frank, 2003). This alternative hypothesis leads to hypothesis 2, which states that neural activity in the same valuation systems is stronger in hypothetical choice than in real choice. This hypothesis attributes the hypothetical choice bias to excess responsivity of value areas to the appetitiveness of stimuli during hypothetical trials.
Materials and Methods
Participants.
Twenty-four healthy right-handed male subjects participated in the experiment (mean age, 20.9 ± 6.1 years; age range, 17–47). Seven additional subjects were excluded for the following reasons. Four subjects were excluded because (during a debriefing) they reported a misunderstanding of the instructions for the second part (the real choice block); we considered this to compromise the internal validity of the treatment. One subject was omitted since his median willingness to pay (WTP) was zero. Two subjects were discarded since their choice data were not reliably related to decision values, and hence their θ values could not be estimated—when we estimated a logistic regression model for the frequency of purchase decision as a function of decision value in each of real and hypothetical conditions (see Fig. 2B, top), these subjects showed a very low pseudo-R2 in both of the conditions (pseudo-R2 is a goodness-of-fit measure for logistic models; their pseudo-R2 values were outside 2 SDs of the mean pseudo-R2). Subjects had no history of psychiatric, neurological, or metabolic illnesses; had normal or corrected-to-normal vision; and were not taking medications that interfere with the performance of fMRI. Subjects were informed of the experiment and gave written consent on arrival at the laboratory. The institutional review board of Caltech approved the study.
Stimuli.
Two hundred consumer products (e.g., DVDs, electronics) (for a complete list, see supplemental Table 1, available at www.jneurosci.org as supplemental material) were presented to the subjects using color pictures (72 dpi). Stimulus presentation and response recording were controlled by E-prime (Psychology Software Tools).
Procedure.
Subjects were told that they would earn $60 for completing the experiment. The experiment consisted of three parts—prescanning, scanning, and postscanning (Fig. 1A). The scanning part had two decision-making tasks—one hypothetical and the other real. Initially, subjects were informed that there were three decision-making parts, but detailed instructions for each part were not given until each part began.
In the prescanning part, subjects were shown images of the 200 consumer products, one at a time and in random order. They were asked to state a maximum hypothetical WTP for each item. In each trial, subjects entered an amount between $0 and $50 using a sliding scale in $1 increments (Fig. 1B).
Based on each subject's responses in the prescanning task, 100 products were selected for that subject for the scanning task—50 for hypothetical trials and 50 different products for real trials. The details of the selections are as follows. On completing the prescanning part, the computer ranked products in a descending order of the subject's reported WTPs and then paired up each two adjacent products (i.e., {1st, 2nd}, {3rd, 4th}, … , {199th, 200th}). Then, among 100 pairs it selected 50 pairs (the 17 pairs with the highest WTP, the 16 pairs with the medium WTP, and the 17 pairs with the lowest WTP) and randomly chose one product of each pair and assigned it to the hypothetical trials and the other to the real trials. This procedure ensured that the distributions of WTP in both blocks were comparable (supplemental Fig. 1, available at www.jneurosci.org as supplemental material). The median WTP of the 200 items that were initially valued in the prescanning phase was used as the constant price for the rest of the experiment (mean price, $10.42; SD, 5.93).
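The selection procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `select_products`, the dictionary input format, the tie-breaking rule, and the reading of "medium WTP" as the 16 pairs centred on the middle of the ranking are all assumptions, and the WTP values are made up.

```python
import random
import statistics

def select_products(wtp, n_high=17, n_mid=16, n_low=17, seed=0):
    """Split products into matched hypothetical/real sets from prescan WTPs.

    wtp: dict mapping product id -> stated WTP (hypothetical input format).
    Returns (hypothetical_ids, real_ids, price).
    """
    rng = random.Random(seed)
    # Rank by WTP, highest first; ties keep input order (an assumption).
    ranked = sorted(wtp, key=lambda pid: wtp[pid], reverse=True)
    # Pair adjacent ranks: {1st, 2nd}, {3rd, 4th}, ...
    pairs = [(ranked[i], ranked[i + 1]) for i in range(0, len(ranked) - 1, 2)]
    # Assumption: "medium" means the pairs centred on the middle of the ranking.
    mid_start = (len(pairs) - n_mid) // 2
    chosen = (pairs[:n_high]
              + pairs[mid_start:mid_start + n_mid]
              + pairs[len(pairs) - n_low:])
    hypothetical, real = [], []
    for first, second in chosen:
        # Randomly assign one member of each pair to each condition.
        if rng.random() < 0.5:
            first, second = second, first
        hypothetical.append(first)
        real.append(second)
    # The subject-specific constant price is the median of all 200 WTPs.
    price = statistics.median(wtp.values())
    return hypothetical, real, price

# Made-up WTPs for 200 products, for illustration only.
demo_wtp = {pid: (pid * 7) % 51 for pid in range(200)}
hyp_ids, real_ids, price = select_products(demo_wtp)
print(len(hyp_ids), len(real_ids), price)
```

Because the two members of each pair are adjacent in the ranking, the two sets end up with nearly identical WTP distributions regardless of how the coin flips fall.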
The scanning part had two blocks of purchase decision-making tasks. Both blocks were identical except that the first was hypothetical and the second was real. In each block, subjects were shown the 50 items associated with that type of task, one at a time, in random order (Fig. 1C). For each item, they decided whether they wanted to buy the product shown at a fixed price (the subject-specific constant price described above).
In the hypothetical trials, the decisions were hypothetical and did not count. In the real trials, subjects learned that one of the 50 trials would be chosen at random at the end of the experiment, and whatever decision they had made in the chosen trial would be implemented as real, whether it be to “buy” or to “not buy.” Note that since only one trial counted as real, subjects did not have to worry about spreading a budget over the different items, so they could treat each trial as if it were the only one.
Afterward, subjects did a final postscanner task. The procedure was the same as in the scanning part except for shortened durations of the intertrial screens (500 ms each). The same 50 items that had been presented in the scanning hypothetical trials were shown again in this postscanning phase. This time, however, subjects made a real decision on these items. The subjects were informed that exactly 1 trial of the 100 real trials, including 50 from the real trials in the scanner and 50 from this “surprise” real phase, would be randomly selected and implemented, based on their decision made in the selected trial. The purpose of this postscanning part was to measure switches from hypothetical to real decisions for a matched set of items presented once in each condition.
Note that the hypothetical-then-real order was deliberately not counterbalanced for the following reasons. First, it is possible that there might be an ordering effect in which thinking about real choices first would spill over to affect valuation in hypothetical tasks: identifying such an effect would require counterbalancing and substantially more subjects. Furthermore, hypothetical trials contaminated by such an ordering effect would not be very informative about the neural mechanisms solely dedicated to hypothetical decision making. In contrast, the spillover effect is expected to be minimal, if any, in the hypothetical-then-real order since in the real condition subjects have a strong incentive to change or adjust any behavior carried over from the previous hypothetical block (i.e., it is in their best interest to make a decision according to their true preference in the real trials). In addition (despite our concerns), previous studies that used a within-subject design found no evidence for ordering effects (Cummings et al., 1995; Johannesson et al., 1998). Second, we also estimated the primary general linear model (GLM 1) reported below with trial number included as an additional regressor and repeated the analysis; there was little difference in the results. Any plausible model of ordering effects should show a within-block effect of trials (because of practice effects, for example), so any such effects are controlled for in this general linear model (GLM) with a trial regressor. Third, hypothetical decision followed by real decision is a natural order for forecasting purposes. In most applications, hypothetical decision data are gathered in advance of real decisions (e.g., polls are only useful before elections). Hence, this particular order of events (hypothetical followed by real decision) is of major interest. Finally, in a separate behavioral study the opposite “real-then-hypothetical” order was presented (see the section below for details). This behavioral test for the potential effects of treatment order shows there is little effect; the hypothetical bias is comparable (supplemental Fig. 2, available at www.jneurosci.org as supplemental material), although the real-hypothetical gap is slightly smaller, which might be attributable to session variability.
Behavioral study of ordering effects.
The experimental procedure, task, and stimuli were identical to those of the fMRI experiment except that the order of the hypothetical and real blocks in the second part was reversed; that is, real, hypothetical, and then surprise real. Unlike the fMRI experiment, all parts of the behavioral experiment were conducted outside the scanner and subjects were run in batches (multiple subjects at a time). All participants were Caltech students (N = 15; all male; mean age, 20.5 ± 2.97 years; age range, 17–27). Four additional subjects initially participated in the experiment, but on completion of the first part their price (median WTP) was $0, so they were not asked to continue.
The subjects in this study also showed hypothetical bias. However, the overall purchase percentage in this study is ∼10% lower for all types of trials than the fMRI study (supplemental Fig. 2, available at www.jneurosci.org as supplemental material). This overall decline might be attributable to session variability such as the following: (1) the statistical variance of price was larger for the fMRI subjects (35.21 compared with 12.11 of the behavioral subjects); (2) the behavioral subjects might not have been from the same subject pool as the fMRI subjects—the subjects in this experiment might have behavioral characteristics that are different from the fMRI subjects (such as a risk attitude—some subjects only participate in behavioral experiments because of a fear of the fMRI technique); or (3) unlike the fMRI study, multiple subjects (∼10) were run simultaneously.
Behavioral experiment for hypothetical and real WTP.
Eleven Caltech male students participated in the experiment (mean age, 20.55 ± 3.42 years; age range, 18–29). Subjects were paid $60 for participation. The experiment consisted of two parts of 200 trials each. In the first part, subjects were presented with the same 200 consumer products as in the prescanning part of the fMRI study and asked to report a hypothetical willingness to pay, ranging from $0 to $50, for each item. The second part was identical to the first part except that they were unexpectedly asked to report real WTP for the same 200 items. Subjects were informed at the beginning of the second part that one of these trials would count as real. To elicit real WTP, we used a Becker–DeGroot–Marschak auction mechanism (Becker et al., 1964; Plassmann et al., 2007; Hare et al., 2008). The mechanism worked as follows: Subjects reported WTP for each of the 200 products. At the end of the experiment, one of the 200 trials from the real WTP part would be randomly selected. The computer generated a random integer between 0 and 50 (each integer over the interval was equally likely) and set it as a price for the item in the chosen trial. If the subject's reported WTP was greater than the randomly generated price, say $X, then she paid $X and got the item. Otherwise, she did not get the product and paid nothing. The optimal strategy for a subject is to report exactly what one is willing to pay for the item presented (Becker et al., 1964).
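The incentive property of the auction can be checked numerically. The sketch below implements the payment rule as described (the item is bought at the random price only when the reported WTP exceeds it) and computes the expected surplus of every possible report for a subject whose true value is $20. The function names and the true value of 20 are illustrative assumptions, not part of the study.

```python
def bdm_outcome(reported_wtp, price):
    """Return (gets_item, payment) under the rule described in the text."""
    if reported_wtp > price:
        return True, price
    return False, 0

def expected_surplus(true_value, reported_wtp, prices=range(0, 51)):
    """Average of (true value - payment) over equally likely integer prices."""
    total = 0
    for price in prices:
        gets_item, payment = bdm_outcome(reported_wtp, price)
        if gets_item:
            total += true_value - payment
    return total / len(prices)

# Hypothetical subject who truly values the item at $20.
true_value = 20
surplus_by_report = {r: expected_surplus(true_value, r) for r in range(0, 51)}
best = max(surplus_by_report.values())
print(surplus_by_report[true_value] == best)
```

Under-reporting forgoes purchases at prices below the true value, and over-reporting forces purchases at prices above it, so no report does better than the truthful one. With integer prices and a strict inequality, a report one dollar above the true value ties with it, since the extra purchase occurs exactly at the true value and yields zero surplus.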
Data acquisition and preprocessing.
fMRI data were acquired on a Siemens 3T Trio MRI scanner (repetition time, 2000 ms; echo time, 30 ms; field of view, 192 mm; 32 axial slices; 3 × 3 × 3 mm3 resolution) in two separate sessions of ∼14 min each. Blood oxygenation level-dependent (BOLD) contrast was measured with gradient echo T2*-weighted echo-planar images (EPIs). To optimize functional sensitivity in the orbitofrontal cortex (OFC), we used a tilted acquisition sequence at 30° to the anterior commissure–posterior commissure line (Deichmann et al., 2003) and an eight-channel phased array coil that yields a 40% signal increase in this area compared with a standard coil. Slices were collected in an interleaved ascending manner. The first three volumes in each session were discarded to permit T1 equilibration. A high-resolution T1-weighted structural scan (1 × 1 × 1 mm3) was acquired from each subject to facilitate localization and coregistration of functional data.
fMRI data analysis was performed by using SPM5 (Wellcome Department of Imaging Neuroscience, London, UK). Images were corrected for slice acquisition time within each volume, motion corrected with alignment to the first volume, spatially normalized to the standard Montreal Neurological Institute EPI template, and spatially smoothed using a Gaussian kernel with full width at half-maximum of 8 mm. Intensity normalization and high-pass temporal filtering (filter width, 128 s) were also applied to the data. The structural T1 images were coregistered to the mean functional EPI images for each subject and normalized using parameters derived from the EPI images. All regression models included six regressors indexing residual motion and two regressors for session baseline as regressors of no interest.
GLM 1.
We estimated two different GLMs of BOLD activity to test the two hypotheses. The first GLM assumed first-order autoregression and included the following regressors that capture the main events in our experiment: H1, an indicator function denoting product image presentation in the hypothetical trial; H2, H1 modulated by modified decision value (mDV) (for a definition of mDV, see Results); H3, H1 modulated by an indicator function denoting yes decision in the given trial; H4, an indicator function denoting a first button press during product image presentation in the hypothetical trials; H5, an indicator function denoting response phase (between the onset and the offset of the response screen) in the hypothetical trials; R1, an indicator function denoting product image presentation in the real trial; R2, R1 modulated by mDV; R3, R1 modulated by an indicator function denoting yes decision in the given trial; R4, an indicator function denoting a first button press during product image presentation in the real trials; R5, an indicator function denoting response phase in the real trials.
The regressors H1–H3, H5, and R1–R3, R5 were modeled using boxcar functions with subjects' response time as a duration. The regressors H4 and R4 were modeled using stick functions. We orthogonalized H4 and R4 with respect to H1 and R1, respectively. Each of the regressors was convolved with a canonical hemodynamic response function.
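As a rough illustration of how such regressors are built, the sketch below constructs an unmodulated boxcar regressor (analogous to H1) and a parametrically modulated one (analogous to H2) and convolves each with a double-gamma response function. It is not the SPM5 code used in the study: the double-gamma shape parameters, the 0.1 s construction grid, the mean-centering of the modulator, and the toy onsets, response times, and mDV values are all assumptions made for this example.

```python
import math

def hrf(t):
    """Double-gamma response at time t in seconds (shape parameters assumed)."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0

def build_regressor(onsets, durations, weights, n_scans, tr=2.0, dt=0.1):
    """Weighted boxcars (sticks when duration is 0), convolved and sampled per TR."""
    n_fine = int(round(n_scans * tr / dt))
    signal = [0.0] * n_fine
    for onset, duration, weight in zip(onsets, durations, weights):
        start = int(round(onset / dt))
        stop = max(start + 1, int(round((onset + duration) / dt)))
        for i in range(start, min(stop, n_fine)):
            signal[i] += weight
    kernel = [hrf(k * dt) * dt for k in range(int(round(32 / dt)))]
    convolved = [0.0] * n_fine
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for k, h in enumerate(kernel):
            if i + k >= n_fine:
                break
            convolved[i + k] += s * h
    return convolved[::int(round(tr / dt))]

# Toy data: three trials with made-up onsets, response times, and mDV values.
onsets = [10.0, 30.0, 50.0]
response_times = [2.0, 1.5, 3.0]
mdv = [4.0, -2.0, 1.0]
mean_mdv = sum(mdv) / len(mdv)
centered_mdv = [m - mean_mdv for m in mdv]  # assumption: modulator is mean-centered

h1 = build_regressor(onsets, response_times, [1.0] * 3, n_scans=40)
h2 = build_regressor(onsets, response_times, centered_mdv, n_scans=40)
print(len(h1), len(h2))
```

The modulated regressor carries only the trial-to-trial variation in mDV around its mean, so it captures activity that scales with value over and above the average response to image presentation.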
We then calculated the following first-level single-subject contrasts: (1) the real versus the hypothetical trial during image presentation (R1-H1), (2) image presentation in the hypothetical trials modulated by mDV (H2, or hypothetical*mDV hereafter), (3) image presentation in the real trials modulated by mDV (R2, or real*mDV), and (4) the real versus the hypothetical trials during image presentation modulated by mDV [R2-H2, or (real*mDV − hypothetical*mDV)].
Finally, we calculated second-level group contrasts using a one-sample t test. All statistical inferences regarding the fMRI data were performed at a level of p < 0.001, uncorrected, with an extent threshold of 5 voxels unless noted otherwise. Anatomical localizations were then performed by overlaying the t maps on a normalized structural image averaged across subjects, and with reference to an anatomical atlas (Duvernoy, 1999).
GLM 2.
The second GLM was identical to the first one except that activity at the time of real and hypothetical decisions was modulated by DV instead of mDV (see Results for a precise definition).
Psychophysiological interactions model.
The goal of psychophysiological interaction (PPI) analysis was to investigate whether functional connectivity between areas of the anterior cingulate cortex (ACC) and the medial orbitofrontal cortex (mOFC) identified in GLM 1 differed between real and hypothetical trials (Friston et al., 1997). The analyses proceeded in three steps.
First, we extracted individual average time-series of BOLD activity within a region of interest (ROI). The ROI was defined as a 4 mm sphere surrounding each individual's peak activation voxel within the functional mask of the ACC shown in Figure 4C (in orange, right panel). Individual subject peaks within the ACC mask were identified based on the areas having the highest Z values in the (real*mDV − hypothetical*mDV) contrast. Variance associated with the motion regressors was removed from the extracted time series. The time course was then deconvolved, using the canonical hemodynamic response (HDR), to estimate the underlying neuronal activity in the ACC (Gitelman et al., 2003).
Second, we estimated a GLM with the following regressors: PPI-R1, an interaction between neural activity in the seed region and an indicator function for real trials (real trials coded as 1; hypothetical trials as −1); PPI-R2, an indicator function for the real trials; PPI-R3, the original BOLD eigenvariate (within the 4 mm sphere).
These regressors were convolved with a canonical HDR. The model also included motion parameters as regressors of no interest. Note that the first regressor identifies areas that exhibit task-related functional connectivity with the ACC; specifically, it identifies areas in which the correlation in BOLD activity with the ACC increases differentially during real trials (compared with hypothetical trials).
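The three PPI regressors can be illustrated with a toy example. The sketch below forms the interaction term as the pointwise product of the estimated neural signal in the seed and the +1/−1 condition code, before any convolution. The function name, the use of 0 for time points outside trials, and the numbers are assumptions for illustration; convolution with the hemodynamic response and the motion regressors are omitted.

```python
def ppi_columns(seed_neural, condition, seed_bold):
    """Build unconvolved PPI design columns, one row per time point.

    seed_neural: deconvolved seed activity (estimated neural signal).
    condition:   +1 for real trials, -1 for hypothetical trials,
                 0 outside trials (the 0 code is an assumption).
    seed_bold:   original BOLD eigenvariate of the seed region.
    """
    interaction = [n * c for n, c in zip(seed_neural, condition)]   # PPI-R1
    real_indicator = [1.0 if c > 0 else 0.0 for c in condition]     # PPI-R2
    return list(zip(interaction, real_indicator, seed_bold))        # PPI-R3 last

# Made-up values for five time points.
seed_neural = [0.5, -0.2, 0.0, 0.8, -0.4]
condition = [1, 1, 0, -1, -1]
seed_bold = [0.3, 0.1, 0.0, 0.6, -0.1]
design = ppi_columns(seed_neural, condition, seed_bold)
print(design[0], design[3])
```

Because the condition code flips sign between trial types, a positive coefficient on the first column indicates that coupling with the seed is stronger in real trials than in hypothetical trials.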
Third, a second-level analysis was performed by calculating a one-sample t test on the single-subject contrast coefficients.
Results
Behavioral differences between real and hypothetical decisions
The distribution of the WTPs for presented products was matched across both conditions (supplemental Fig. 1, available at www.jneurosci.org as supplemental material). As a result, in the absence of a hypothetical bias, subjects should have purchased at the same rate in both conditions. However, as was found in previous studies, subjects exhibited a significant pro-purchase bias during hypothetical decisions: they indicated an intention to purchase in 53% (SEM, 1.74%) of the hypothetical trials, but only in 38% (SEM, 3.10%) and 40% (SEM, 3.15%) of the real and surprise real trials (p < 0.0001 and p < 0.0008, respectively, within-subject paired-sample t tests) (Fig. 2A).
Figure 2B provides additional evidence for the hypothetical bias. The top panel plots the estimated mixed-effects logistic fit for the frequency of purchase as a function of the decision value (DV), which equals the subject's hypothetical WTP for the item minus the constant price at which it was sold (i.e., DV = WTP − price). In the hypothetical case, the fitted curve crosses the 50% purchase probability line approximately at the zero DV point, indicating that at this DV subjects were indifferent between buying and not buying, as implied by the definition of DV and rational stochastic choice. In contrast, in the real case the curve crosses at $6.25 (p < 0.0001) (supplemental Table 2, available at www.jneurosci.org as supplemental material). That is, for goods with stated values $6.25 above the price, subjects in the real condition are equally torn between purchasing and not, which suggests that the hypothetical valuations were overstated by approximately $6.25 (compared with implicit decision values revealed by real choices).
This gap suggests that the WTP, which was initially stated before scanning, is not a perfect measure of the ultimate stimulus value that was used to make the real choices during scanning, which is sometimes called decision utility (Kahneman et al., 1997). The reason for this inference is that the probability of buying an item with a DV of zero (based on the initial stated WTP) should be 50% in the real condition, if the initial WTP is being used to compute decision utility, but choice probabilities are actually <50% during real choice. Furthermore, we conducted a separate behavioral study that directly compared hypothetical and real WTP for the same product (see Materials and Methods). Subjects in that study tended to report hypothetical WTPs that were then reduced by >50% when real WTP was elicited (supplemental Fig. 3, Table 3, available at www.jneurosci.org as supplemental material).
For some of the fMRI analyses performed below, it is useful to construct a behavioral measure of stimulus values that is consistent with the actual choices observed in each condition. We constructed such a measure by using observed decisions to infer how the WTPs need to be adjusted in the hypothetical and real conditions to generate a good measure of trial-by-trial decision values; that is, a common metric independent of condition. To do so, a discount factor, θR, is estimated for each subject in the real condition, which creates a modified DV (mDV) of θR · WTP − price. The value of θR was estimated for each subject by imposing the requirement that the fitted probability of purchase at the estimated mDV of zero be 50% in real trials. For comparability, discount factors in the hypothetical condition, θH, were also estimated using the same procedure. That is, subject-specific θH was estimated with the requirement that the fitted purchase probability at the estimated mDV (= θH · WTP − price) of zero be 50% in hypothetical trials. More specifically, the algorithm used to estimate θ values is as follows: (1) Let P(x) be the fitted probability of a “yes” purchase decision at x. (2) Then P(mDV) = 1/(1 + e^{−(α + β · mDV)}), where mDV = θ · WTP − price; α and β are estimated coefficients of a logistic regression model whose dependent variable is the purchase decision (yes = 1; no = 0) and whose independent variable is mDV (as in Fig. 2B). (3) Find θR where P(0) = 0.5 (or α = 0) in the real trials; find θH where P(0) = 0.5 (or α = 0) in the hypothetical trials.
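One way to carry out step (3) is sketched below. The text does not specify how θ was searched for, so this is an assumption rather than the authors' procedure. Because the price is constant within a subject, α + β(θ · WTP − price) can be rewritten as (α − β · price) + (βθ) · WTP. Fitting an ordinary logistic regression on raw WTP with intercept a and slope b therefore gives α = a + b · price/θ, which is zero when θ = price/w*, where w* = −a/b is the WTP at which the fitted purchase probability is 50%. The Newton-Raphson fitter and the toy choice data are illustrative only.

```python
import math

def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Maximum-likelihood fit of P(y=1) = 1/(1 + exp(-(a + b*x))) by Newton's method."""
    a = b = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += yi - p             # gradient wrt intercept
            g1 += (yi - p) * xi      # gradient wrt slope
            h00 += w                 # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

def estimate_theta(wtp, bought, price):
    """Theta such that the fitted purchase probability at mDV = 0 equals 50%."""
    a, b = fit_logistic(wtp, bought)
    indifference_wtp = -a / b        # WTP where the fitted probability is 0.5
    return price / indifference_wtp

# Made-up choices for one subject at a constant price of $10.
wtp = [4, 8, 12, 16, 18, 22, 24, 28, 32, 36]
bought = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
price = 10.0
theta = estimate_theta(wtp, bought, price)

# Check: refitting on mDV = theta * WTP - price should give an intercept near 0.
alpha, beta = fit_logistic([theta * w - price for w in wtp], bought)
print(round(theta, 3), abs(alpha) < 1e-6)
```

In this toy data set the choices are symmetric around a WTP of $20, so the subject behaves as if each stated WTP were worth half as much when the purchase is real. Applying the same shortcut to the group-level figures reported above (indifference at a DV of $6.25 and a mean price of $10.42) gives a ratio of about 0.63, in line with the median θR reported below.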
Figure 2B, bottom, shows the smoothed choice probability curves using the mDV. Median θH (=1.03) for hypothetical trials is not significantly different from 1, but the median θR (=0.60) for real trials is significantly less than one (signed-rank test, p > 0.5 and p < 0.0015, respectively) (Fig. 2C). The median difference between θH and θR is also significantly different from zero (Fig. 2C).
This weighting pattern suggests that people act as if they are approximately using the originally hypothetically stated WTP when later making hypothetical purchase decisions, but that the values that they placed on objects are ∼40% lower when making real purchase decisions. This numerical adjustment is a measure of the degree of the purchase bias, and it creates condition-specific measures of mDV that are comparable in their decision implications across the hypothetical and real conditions.
Note that another conceivable way to modify the DV is to keep WTP fixed, but adjust prices by multiplying price by a corrective factor θ (reflecting the possibility that subjects weight prices more highly during the real condition). However, given the additional behavioral evidence discussed above about how WTP decreases between hypothetical and real conditions, it is unlikely that all of the hypothetical bias can be attributed to an underweighting of the prices.
Test of hypothesis 1: Are there different areas involved in the computation of stimulus values in hypothetical and real choice trials?
We tested hypothesis 1 using two different types of analyses.
The first analysis, based on GLM 1, looks for areas that correlated with the mDV measure in each of the two types of trials. The basic idea of this analysis is to identify areas that correlate with a measure of stimulus value that is consistent with observed choices in both conditions, which are candidates for putative valuation systems, and to test whether the same or different areas are active in real and hypothetical conditions. Note that since the price of the items is constant across trials, the correlation with the mDV variable is only driven by differences in adjusted item values. Thus, any correlation with mDV is driven by the modified WTP variable. We found that similar regions, including the mOFC and the ventral striatum (VStr), correlated with the value of the goods that is consistent with observed choices in both hypothetical and real trials (Fig. 3A; supplemental Tables 4, 5, available at www.jneurosci.org as supplemental material). We tested this further using a conjunction analysis (Nichols et al., 2005), which confirmed that the mOFC and the VStr are jointly active in both real and hypothetical trials in response to mDV (Fig. 3B; supplemental Table 6, available at www.jneurosci.org as supplemental material).
The second analysis, based on GLM 2, looks for areas that correlated with the DV measure in each of the two types of trials. The basic idea of this analysis is to identify areas that correlate with a measure of stimulus values that is exactly the same across conditions. This alternative analysis is motivated by the possibility that the difference between real and hypothetical choices might be driven by differences in how the prices are weighted, or the net decision values compared with making a choice, instead of being driven by how the stimulus values are computed. Our results are similar to the results of the first analysis using mDV. We found that similar regions of the mOFC and the VStr correlated with the value of the goods in both types of trials (Fig. 3C; supplemental Tables 7, 8, available at www.jneurosci.org as supplemental material). A conjunction analysis again confirmed that the mOFC and the VStr are jointly active in both real and hypothetical trials in response to DV (Fig. 3D; supplemental Table 9, available at www.jneurosci.org as supplemental material).
In addition, to rule out any potential confound, we repeated all the fMRI analysis using the difference between the market retail price and the price offered to the subject, to see whether valuation regions were sensitive to the “deal” subjects were getting. However, this price differential regressor did not correlate with activity in any of the regions, including OFC, VStr, and ACC, which respond to the mDV regressor.
Test of hypothesis 2: Are the value signals computed in common valuation areas stronger in the hypothetical case?
We tested hypothesis 2 using three different types of analyses.
First, we looked for differences in average activation between the real and hypothetical conditions. Note that this question can be addressed equivalently by looking at the regressors indicating decision phase (modulated by neither the mDV of GLM 1 nor the DV of GLM 2). We found stronger activation in the decision phase for real than for hypothetical choice in ventromedial prefrontal regions, including the mOFC and subgenual cingulate (Fig. 4A; supplemental Table 10, available at www.jneurosci.org as supplemental material). In contrast, no areas exhibited the opposite pattern of activation (stronger activity for hypothetical than for real) at the omnibus threshold.
Second, we looked for areas exhibiting differential sensitivity to the mDV regressor in real and hypothetical conditions using GLM 1. The mOFC, ACC, caudate, and inferior frontal gyrus were more responsive to mDV in real than in hypothetical trials (Fig. 4B; supplemental Table 11, available at www.jneurosci.org as supplemental material). No areas were more sensitive to mDV in hypothetical trials at our omnibus threshold.
To further examine the difference in the strength of the response to mDV, we conducted an independent ROI analysis in the OFC. Figure 4C shows parameter estimates (β values) for the real*mDV and hypothetical*mDV regressors that were extracted within the area of the mOFC activated in both conditions, averaged across all voxels in that region for each subject, and then averaged across subjects. To guard against overfitting, mOFC ROIs for each subject were defined based on the second-level contrasts, real*mDV and hypothetical*mDV, which were independently generated using data from the rest of the subjects (N − 1 subjects). The plot shows that the common area of mOFC is more responsive to the adjusted value of items during real decisions.
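The leave-one-subject-out logic can be sketched as follows. For each held-out subject, the ROI is defined from one-sample t statistics computed over the remaining N − 1 subjects, and the held-out subject's parameter estimates are then averaged inside that ROI. The threshold value, the requirement that a voxel pass in both contrasts, the function names, and the toy β values (four subjects, three voxels) are assumptions for illustration and do not reproduce the actual masks or statistics.

```python
import math
import statistics

def t_stat(values):
    """One-sample t statistic against zero."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))

def loso_roi_means(real_betas, hyp_betas, t_threshold):
    """For each subject, average betas in an ROI defined from the other subjects.

    real_betas[s][v] and hyp_betas[s][v] hold the beta for subject s, voxel v.
    Returns a list of (mean real beta, mean hypothetical beta), or None when
    no voxel passes the threshold for that fold.
    """
    n_subjects = len(real_betas)
    n_voxels = len(real_betas[0])
    results = []
    for held_out in range(n_subjects):
        others = [s for s in range(n_subjects) if s != held_out]
        roi = [v for v in range(n_voxels)
               if t_stat([real_betas[s][v] for s in others]) > t_threshold
               and t_stat([hyp_betas[s][v] for s in others]) > t_threshold]
        if not roi:
            results.append(None)
            continue
        real_mean = sum(real_betas[held_out][v] for v in roi) / len(roi)
        hyp_mean = sum(hyp_betas[held_out][v] for v in roi) / len(roi)
        results.append((real_mean, hyp_mean))
    return results

# Made-up betas: only voxel 0 responds to mDV in both conditions.
real_betas = [[2.0, 0.3, 1.5], [2.2, -0.2, 1.7], [1.8, 0.1, 1.4], [2.1, -0.4, 1.6]]
hyp_betas = [[1.0, -0.1, 0.1], [1.1, 0.2, -0.2], [0.9, -0.3, 0.2], [1.2, 0.1, -0.1]]
roi_means = loso_roi_means(real_betas, hyp_betas, t_threshold=3.0)
print(roi_means)
```

Because each subject's data play no part in selecting the voxels from which that subject's estimates are drawn, the comparison of real and hypothetical β values is not biased by the selection step.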
Third, we looked for areas exhibiting differential sensitivity to the DV regressor in real and hypothetical conditions using GLM 2. As shown in Figure 5, A and B (and supplemental Table 12, available at www.jneurosci.org as supplemental material), we found similar, albeit weaker, results with this regressor.
Task-dependent functional connectivity between the dorsal ACC and the mOFC
The comparisons reported so far show substantial overlap in valuation areas during hypothetical and real choice, and stronger activity during real choice. This raises the question of which areas might be involved in generating the hypothetical bias. We used a PPI analysis to carry out a post hoc test of the role of the dorsal ACC. We focused on this area because, although it has not previously been implicated in the valuation process, it correlated with mDV in real trials but not in hypothetical trials. This suggested that this area might be involved either in adjusting the value signals differently in the two conditions, or in implementing a value comparison process that is more active in real trials. Consistent with this hypothesis, the dorsal ACC area exhibited stronger functional connectivity with mOFC in real trials than in hypothetical trials (supplemental Fig. 4, Table 13, available at www.jneurosci.org as supplemental material).
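The core idea of a PPI model can be illustrated with a simplified regression. The sketch below is an assumption-laden toy version and not the analysis reported here: it forms the interaction term directly from the measured seed time course, omitting the deconvolution and hemodynamic convolution steps that fMRI packages typically apply, and the function names and the +1/−1 coding of the real and hypothetical conditions are hypothetical.

```python
import numpy as np

def ppi_design(seed_ts, is_real):
    """Build a simplified PPI design matrix.

    seed_ts: 1-D time course from the seed region (e.g., dorsal ACC).
    is_real: boolean array, True for time points in real-choice trials.
    Columns: intercept, seed (physiological), task (psychological),
    and their product (the psychophysiological interaction).
    """
    seed = np.asarray(seed_ts, dtype=float)
    seed = seed - seed.mean()
    psych = np.where(np.asarray(is_real, dtype=bool), 1.0, -1.0)
    psych = psych - psych.mean()
    ppi = seed * psych
    return np.column_stack([np.ones_like(seed), seed, psych, ppi])

def fit_ppi(target_ts, seed_ts, is_real):
    """Ordinary least-squares fit of the target time course (e.g., mOFC)."""
    X = ppi_design(seed_ts, is_real)
    beta, *_ = np.linalg.lstsq(X, np.asarray(target_ts, dtype=float), rcond=None)
    return dict(zip(["intercept", "seed", "psych", "ppi"], beta))
```

Under this coding, a positive coefficient on the `ppi` column indicates that the target region tracks the seed more closely in real trials than in hypothetical trials, which is the pattern described above for the dorsal ACC and the mOFC.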
Discussion
The goal of the current study was to use fMRI to test the validity of two potential explanations of the hypothetical bias effect. The first class of theories attributes the effect to the use of substantially different valuation circuitries, with potentially different properties, in real and hypothetical choice. The second class of theories attributes the effect to the computation of stronger value signals during hypothetical choice in a common set of valuation circuits.
Contrary to the first hypothesis, we found that a common set of areas encompassing mOFC and VStr were active in both types of trials and correlated with the decision value of goods at the time of purchase decisions. This finding is particularly interesting because several studies of real choice have found that the activity in the area of mOFC identified here correlates with the value of receiving food, pleasant smells, attractive faces, and abstract rewards such as money or avoiding an aversive outcome (O'Doherty et al., 2001, 2003; Small et al., 2001; Anderson et al., 2003; Gottfried et al., 2003; Kim et al., 2006), and also with the value of stimuli during real simple choices (Plassmann et al., 2007; Hare et al., 2008).
Contrary to the second hypothesis, we found that activity in valuation and cognitive control areas (mOFC, ACC, caudate, inferior frontal gyrus) was more responsive to various behavioral measures of stimulus values in real than in hypothetical choice. This result is interesting because it rules out a natural explanation of the hypothetical bias, namely, that the apparent overvaluation of stimuli during hypothetical choice is attributable to increased activity in the valuation circuitry of the brain.
Since the analysis shows that there is evidence for stronger neural activity during real choice, we can speculate about the behavioral effects of that enhanced activity. ACC activity is more strongly associated with decision value in real choices than in hypothetical choices and is functionally connected (as shown by PPI) to mOFC more strongly in real choice. One possibility is that the ACC is involved in switching context-dependent valuation from one context (hypothetical choice, which is closer to the prescanner product valuation task) to a different and therefore more cognitively demanding context (real choice). Another possible explanation is that the ACC activity in real choice, and stronger activity in mOFC and VStr during real choice, reflect implementation of a more careful comparison process between products and prices in real choice than in hypothetical choice. This latter hypothesis of enhanced comparison during real choice is also consistent with two other behavioral facts: (1) decision times were substantially slower for real choices than for hypothetical choices when mDV was positive (Fig. 2D), and (2) choice probabilities were more sensitive to mDV for real choices than for hypothetical choices (i.e., the psychometric decision curves in Fig. 2B are more inflected) (supplemental Table 14, available at www.jneurosci.org as supplemental material). However, additional research is certainly necessary to distinguish these two hypotheses by more precisely investigating the role of these regions.
Although not explored in the current study, there is another possible explanation for the observed neural and behavioral difference between real and hypothetical choice. Attention might have been allocated differently to the prices during the two types of decision making. We cannot test, and hence cannot rule out, this idea since the prices were not manipulated in the current study. However, this would be an interesting extension of the present study and should be addressed in future work with varying prices and more direct measures of attention.
There are two potential practical implications of these results. First, our study has a reassuring methodological conclusion for scientific inference. In many experiments in psychology and neuroscience, it is common to elicit hypothetical choices or ask hypothetical questions that cannot actually be implemented for practical reasons. Typical examples include experiments involving very high stakes, payments with long delays, unusual highly controlled social events, or morally charged consequences (Greene et al., 2001, 2004; Delgado et al., 2005; Hariri et al., 2006; Monterosso et al., 2007; Takahashi et al., 2009). Generalizing claims about neural processing in these hypothetical choice tasks to real choice rests heavily on the assumption that neural processes engaged in the two kinds of decisions are highly overlapping. If this assumption is incorrect, the conclusions about neural mechanisms in virtually all studies with hypothetical tasks are suspect.
Fortunately, our study shows that this overlap is mostly present in the domain of consumer goods purchase. Thus, an important methodological conclusion of our study is an optimistic one: conclusions about neural circuitry drawn from hypothetical choice could generalize to real choice (at least in cases like ours). This is welcome news since collecting hypothetical choice data is all we can do for many phenomena in natural and social sciences, even though the goal of collecting those data is to understand and predict real choices.
The second implication is more speculative. As noted briefly in the Introduction, there is a substantial amount of data suggesting systematic biases between what people say they would do, hypothetically, and what they actually do. One example is voting: polls typically overestimate intention to vote (as do recollections) and sometimes misforecast the direction of voting (Crespi, 1989; Keeter and Samaranayake, 2007; Hopkins, 2009). In both commercial and academic marketing research, self-reported intentions to buy goods are widely used to plan the launch of new products and forecast sales (Silk and Urban, 1978; Urban et al., 1983; Infosino, 1986; Jamieson and Bass, 1989; Chandon et al., 2004), but these intentions are often upward-biased. Surveys are also routinely used to measure the value of nontraded public goods (e.g., for cost–benefit analyses as inputs to environmental protection, or to assess legal damages). These survey responses are often thought to reflect imprecision and sometimes an upward bias (Diamond and Hausman, 1994; Carson et al., 1996; Mortimer and Segal, 2008). Since there is substantial overlap in neural activity between hypothetical and real choices, but there are also some apparent differences (e.g., the ACC activity, differential responsivity in the valuation areas), future studies could potentially use neural data collected during hypothetical choice to improve the forecasting of real choice from hypothetical data with techniques such as neural decoding.
Footnotes
This work was supported by Human Frontiers of Science Program (C.F.C.), The Betty and Gordon Moore Foundation (C.F.C., A.R.), and The Lipper Family Foundation Fellowship in Neuroeconomics (M.J.K.). Author contributions were as follows: design (M.J.K., A.R., C.F.C.), fMRI collection (M.J.K., M.C.), fMRI analysis (M.J.K., A.R., C.F.C.), other data (M.J.K.), and writing (M.J.K., C.F.C., A.R.).
- Correspondence should be addressed to Colin F. Camerer, 1200 East California Boulevard, MC 228-77, California Institute of Technology, Pasadena, CA 91125. camerer{at}hss.caltech.edu