Abstract
To make economic choices between goods, the brain needs to compute representations of their values. A great deal of research has been performed to determine the neural correlates of value representations in the human brain. However, it is still unknown whether there exists a region of the brain that commonly encodes decision values for different types of goods, or if, in contrast, the values of different types of goods are represented in distinct brain regions. We addressed this question by scanning subjects with functional magnetic resonance imaging while they made real purchasing decisions among different categories of goods (food, nonfood consumables, and monetary gambles). We found activity in a key brain region previously implicated in encoding goal-values: the ventromedial prefrontal cortex (vmPFC) was correlated with the subjects' value for each category of good. Moreover, we found a single area in vmPFC to be correlated with the subjects' valuations for all categories of goods. Our results provide evidence that the brain encodes a “common currency” that allows for a shared valuation for different categories of goods.
Introduction
If you won $500, would you choose to spend it on a vacation in Hawaii, on a new home theater system, or on a lavish evening meal? Humans, like other animals, must make decisions about how to spend scarce resources to perform actions and obtain the rewards necessary for survival. Often, different types of rewards might be available at a particular point in time, presenting a dilemma as to which should be pursued and which should be passed up. This leads to a fundamental question in decision neuroscience: How is the brain capable of making a choice between different types of rewards?
A long-held view in economics, which has also recently been proposed in decision neuroscience and neuroeconomics, is that the brain solves this problem by assigning values to the different goods using an abstract signal that is encoded in common units or currency (Bernoulli, 1738; Georgescu-Roegen, 1968; Glimcher and Rustichini, 2004). This conversion to common units could allow for the worth of different classes of items to be compared in a tractable framework (Montague and Berns, 2002; O'Doherty, 2007).
Recently, many different types of value signal have been found to be encoded in brain structures such as the orbitofrontal cortex (OFC) and adjacent medial prefrontal cortex, amygdala, ventral striatum, and elsewhere (Buchel et al., 1998; Gallagher et al., 1999; O'Doherty et al., 2000; Knutson et al., 2001; Delgado et al., 2003; Barraclough et al., 2004; Rangel et al., 2008). These value signals include outcome values, which represent the value of a specific reinforcer as it is consumed by the individual (such as responses to a food taste delivered to the mouth, or to receipt of monetary quantity), and anticipatory value, which is the value of an outcome that an individual is expecting. Another type of value signal is decision value (DV), which is the assessed value of an item available at the point of decision. It is important to emphasize that the DV is an input to the decision process, whereas outcome and anticipatory values reflect the outcome of a decision process.
Despite the identification of different types of value signals, it is still unknown whether the brain recruits distinct or overlapping regions to represent the DVs of different goods. In this study, we tested the hypothesis that an area of ventromedial prefrontal cortex (vmPFC) plays such a role (Montague and Berns, 2002; Rangel et al., 2008). In particular, we focused on medial OFC (mOFC) and adjacent medial prefrontal cortex (designated collectively as vmPFC). These regions have previously been reported to contain a representation of DV for food rewards in both human and animal studies (Padoa-Schioppa and Assad, 2006; Plassmann et al., 2007; Hare et al., 2008), as well as for monetary gambles and delayed monetary payoffs (Kable and Glimcher, 2007; Tom et al., 2007). However, the extent to which overlapping regions of vmPFC are responsible for encoding decision utility for different types of reinforcers has not been addressed.
Materials and Methods
Experiment 1
Subjects.
Nineteen right-handed healthy subjects participated in the first experiment (mean age, 23; range, 19–31), of whom four were female. One additional subject completed the experiment but was excluded from the analysis because he did not understand the instructions. The subjects were prescreened to exclude those with a prior history of neurological or psychiatric illness. Subjects had no history of eating disorders and were screened for liking or occasionally eating the type of foods used during the experiment. The California Institute of Technology Institutional Review Board approved this study, and all subjects gave informed consent.
Stimuli.
Subjects made decisions regarding three classes of goods: amounts of money; nonfood items, termed “trinkets” (e.g., Caltech memorabilia and DVDs); and sweet and salty snack foods (e.g., candy bars and chips). To reduce uncertainty in subjects' valuation of the items, we selected trinkets and snacks that were highly familiar and available at campus and local stores. Each class of goods contained 20 items. Money amounts ranged from $0.20 to $4.00, in 20 cent increments. All items were presented to subjects using high-resolution color pictures (72 dpi). The stimulus presentation and response recording was controlled by MATLAB (Mathworks) using Psychophysics Toolbox extensions (Brainard, 1997).
In this experiment, subjects made valuation and choice decisions for an 80% chance of getting the different items. The probability was represented as a pie chart above each item. This uncertainty was attached to the stimuli to ensure that the subjects' valuations of monetary amounts were not trivial. Had subjects simply been asked to value different amounts of money without any uncertainty, they might not have robustly engaged the neural structures of the DV system. To standardize task presentation, all classes of items were displayed with the same 80% probability.
Experiment protocol.
Subjects were instructed to refrain from eating or drinking any liquids, besides water, for 4 h before the experiment. Subjects were also instructed that they would have to remain in the laboratory for 30 min following the experiment, during which time the only thing they would be able to eat was the food purchased in the experiment. This mildly food-deprived state was meant to increase the value that subjects placed on the food items (mean hunger rating ± SD, 3.3 ± 1.1, on a scale from 0 [not at all hungry] to 5 [very hungry]). The entire experiment had three phases: a prescanning, a scanning, and a postscanning phase.
The subjects' value for each item was measured using a Becker-DeGroot-Marschak (BDM) auction (Becker et al., 1964). This auction mechanism is commonly used in economics to obtain precise measures of the subjects' willingness to pay (WTP) for items. The rules of the auction are as follows. Let b denote the bid made by the subject for a particular item. After the bid is made, a random number n is drawn from a known distribution (in our case, $0, $1, $2, $3, and $4 were chosen with equal probability). If b ≥ n, the subject received the item and paid a price equal to n. In contrast, if b < n, the subject did not get the object but also did not have to pay anything. The rules of the BDM auction create a situation in which the optimal strategy for the subject is to bid exactly what he is willing to pay for a given item. This amount is termed the WTP for an item. The WTP of an item serves as a measure of an individual's underlying DV of that item. Before the experiment began, the optimal strategy for the BDM was described to subjects.
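The auction rules above can be sketched in a few lines of Python. This is a minimal illustration, not the MATLAB code used in the experiment; the function names `bdm_outcome` and `expected_payoff` and the example valuation of $2.50 are hypothetical. The sketch resolves a single auction draw and checks numerically that, with the five equally likely prices, bidding one's true value is never worse than any other bid on a $0.25 grid.

```python
from fractions import Fraction

PRICES = [0, 1, 2, 3, 4]  # possible values of n, each drawn with probability 1/5

def bdm_outcome(bid, price):
    """Return (item_received, amount_paid) for one BDM draw."""
    if bid >= price:
        return True, price   # subject wins and pays the drawn price, not the bid
    return False, 0          # subject gets nothing and pays nothing

def expected_payoff(bid, value):
    """Expected net gain (value minus price paid) of a bid, averaged over prices."""
    total = Fraction(0)
    for n in PRICES:
        won, paid = bdm_outcome(bid, n)
        if won:
            total += Fraction(value) - paid
    return total / len(PRICES)

# Hypothetical subject who values the lottery at $2.50 (an assumed number).
value = Fraction(5, 2)
bids = [Fraction(k, 4) for k in range(0, 17)]   # candidate bids $0.00 to $4.00
truthful = expected_payoff(value, value)
best = max(expected_payoff(b, value) for b in bids)
print(truthful, best)
```

Because the prices here are discrete, any bid from $2 up to just below $3 ties with the truthful bid in this example, so truthful bidding is weakly rather than strictly optimal; overbidding or underbidding past a price step lowers the expected payoff.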
During the prescanning phase, subjects were endowed with $12 to make bids on three different types of goods (in separate blocks of trials): 20 snack food items, 20 nonfood items, and 20 monetary amounts ranging from $0 to $4. The specific goods used for each item category were preselected from a larger subset of items within each category to ensure an approximately equal distribution of DV for the items within each category. For each item, subjects were asked to place a bid using a continuous scale ranging from $0 to $4, for the opportunity to play a lottery in which they stood an 80% chance of obtaining the given item. Items were presented in random order, and bids were placed for the current item by clicking the mouse on a continuous scale from $0 to $4. Each item presentation trial ended as soon as a bid was placed, and the subsequent trial immediately followed. The bid represented the maximum amount of money that would be deducted from the subjects' endowment to purchase the lottery opportunity for that item, if the item was subsequently selected in the BDM auction process (subjects were informed this could occur in the postscanning phase). To make WTP decisions for monetary rewards nontrivial, we used a lottery mechanism for each item instead of offering the item with 100% certainty. If we had not used the lottery, WTP measures would always equal the exact value of the monetary reward available, resulting in a trivial valuation for subjects. We used the 80% lottery for all item types to ensure consistency in the type of decision involved across all item categories.
The second phase (scanning phase) of the experiment began with the selection of a reference price for each subject. This price was set equal to the median bid over all items made in the prescanning session. During scanning, subjects were asked to make purchase decisions for all of the stimuli that they saw in the prescanning phase. The stimuli were presented in a fully randomized order between and across categories. The positions of the stimuli were randomly assigned to the left and right of a fixation cross. In every trial, subjects were presented with a choice between an amount of money equal to the reference price and the opportunity to play a lottery in which they stood an 80% chance of obtaining a given item. Subjects made a choice by pressing one of two buttons on a response pad: the left button selected the leftmost stimulus and the right button selected the rightmost stimulus. Subjects had 2 s in which to make a selection, after which a blank screen was presented for a pseudorandomly jittered interval (∼1–10 s). Trials in which subjects did not make a selection in the allotted time were labeled as “missed responses” in the functional magnetic resonance imaging (fMRI) analysis. If they chose the item, they would pay the reference price (out of their endowment) and subsequently receive the item if it were selected in the postscanning phase lottery. Alternatively, if they chose the reference price, they would not get to play the lottery and would also not have to pay anything.
At the end of the scanning phase, the computer selected three of the trials at random to be implemented (one for each category of good). These trials could be drawn either from the prescanner BDM auction phase or from the within-scanner choice phase. The outcomes of the selected trials, and only those trials, were implemented. In this way, subjects did not have to worry about spreading their $4 budget over the different items. Instead, they were able to treat each trial as if it were the only decision that counted.
Experiment 2
To provide an independent replication of the results from experiment 1, and to rule out the possibility that the common DV area observed in experiment 1 emerged because subjects were constantly evaluating each item against a fixed monetary bid, we performed a second follow-up experiment. This second experiment was performed in a separate group of subjects (n = 13; three female). It was identical to the first experiment in almost all respects, except that on each scanner trial, instead of choosing between an item and a fixed monetary amount, subjects chose between the item and a fixed snack item whose value approximated that of the median bid. Note that choosing the item entailed forgoing the fixed snack item, which thus can be thought of as the “cost” of getting the item (with 80% probability).
fMRI data acquisition
Functional imaging was performed with a 3 T Siemens Trio scanner. Forty-two contiguous interleaved transversal slices of echo-planar T2*-weighted images were acquired in each volume, with a slice thickness of 3 mm and no gap (repetition time, 2500 ms; echo time, 30 ms; flip angle, 80°; field of view, 192 mm²; matrix, 64 × 64). Slice orientation was tilted −30° from a line connecting the anterior and posterior commissure. This slice tilt alleviates the signal drop in the OFC (Deichmann et al., 2003). We discarded the first three images before data processing and statistical analysis, to compensate for the T1 saturation effects.
Image processing
Image processing and statistical analyses were performed using SPM5 (http://www.fil.ion.ucl.ac.uk/spm). All volumes from all sessions were corrected for differences in slice acquisition, realigned to the first volume, spatially normalized to a standard echo-planar imaging template included in the SPM software package (Friston et al., 1995) using fourth-degree B-spline interpolation, and finally smoothed with an isotropic 9 mm full-width-at-half-maximum Gaussian filter to account for anatomical differences between subjects and to allow for statistical inference at the group level.
fMRI statistical analysis
We estimated subject-specific (first-level) general linear models that included three types of conditions corresponding to the appearance of the choice screen for the three types of goods (money, trinkets, and snacks). In every case, the event was modeled with a duration of 2 s. For each such event, we also introduced two parametric modulators: (1) the WTP (obtained from prescanning trials) of the presented item and (2) the subjects' choice between the fixed bid and the presented item. Trials with missing responses were modeled as separate nuisance regressors. In addition, regressors modeling the head motion as derived from the affine part of the realignment procedure were included in the model. Serial autocorrelation was modeled as a first-order autoregressive model, and the data were high-pass filtered at a cutoff of 120 s. For the second-level analysis, we constructed a 3 × 1 factorial design and included the images of the parameter estimates (betas) from the WTP parametric modulations for each condition. This model allowed us to test several contrasts: (1) areas in which activity is correlated with the WTP for money; (2) areas in which activity is correlated with the WTP for trinkets; (3) areas in which activity is correlated with the WTP for foods; and (4) areas that exhibit a differential effect for each condition (i.e., areas that are significantly more correlated with WTP in one category than in each of the other two categories in direct comparisons). In addition, we were able to test for areas that show overlapping correlations with WTP across all three classes of goods (i.e., conjunction null of WTP effects for food, money, and trinkets) (Nichols et al., 2005). For display purposes, we present all our statistical maps (SPMs) at a threshold of p < 0.001, uncorrected; however, all statistics reported in the text were small-volume false discovery rate (FDR) corrected.
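The conjunction-null test cited above (Nichols et al., 2005) declares a voxel jointly significant only if the smallest of the per-condition statistics exceeds the usual single-contrast threshold. A minimal Python sketch of this minimum-statistic logic follows. The voxel labels, t values, and threshold are invented for illustration, and `conjunction_min_stat` is a hypothetical helper, not an SPM function.

```python
def conjunction_min_stat(t_maps, threshold):
    """Return voxels whose minimum statistic across all maps exceeds threshold.

    t_maps: dict mapping condition name -> dict mapping voxel label -> t value.
    """
    conditions = list(t_maps)
    voxels = set(t_maps[conditions[0]])
    for c in conditions[1:]:
        voxels &= set(t_maps[c])          # keep voxels present in every map
    surviving = {}
    for v in sorted(voxels):
        t_min = min(t_maps[c][v] for c in conditions)
        if t_min > threshold:
            surviving[v] = t_min          # report the limiting (smallest) statistic
    return surviving

# Invented t values for the WTP effect at three placeholder voxels.
t_maps = {
    "money":    {"voxel_a": 3.6, "voxel_b": 1.2, "voxel_c": 4.0},
    "trinkets": {"voxel_a": 5.1, "voxel_b": 3.9, "voxel_c": 0.4},
    "snacks":   {"voxel_a": 5.0, "voxel_b": 3.5, "voxel_c": 2.2},
}
result = conjunction_min_stat(t_maps, threshold=3.1)  # threshold is an assumed value
print(result)
```

In this toy example only `voxel_a` survives, because the other two voxels each fall below threshold in at least one category. The statistic reported for a surviving voxel is that of its weakest condition.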
For the first experiment, the small-volume correction was performed using a sphere 20 mm in radius, centered at [x = 4, y = 30, z = −18]. These coordinates were chosen because a previous study found activity at this location to be positively correlated with WTP for food items (Plassmann et al., 2007).
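A small-volume correction of this kind can be thought of in two steps: restrict the test to voxels inside the sphere, then control the false discovery rate among those voxels only. The sketch below assumes the Benjamini-Hochberg step-up procedure for the FDR step and uses invented voxel coordinates and p values; only the sphere center and radius come from the text. It is an illustration in Python, not the SPM5 implementation.

```python
import math

def in_sphere(coord, center, radius_mm):
    """True if an (x, y, z) coordinate in mm lies within radius_mm of center."""
    return math.dist(coord, center) <= radius_mm

def fdr_bh(p_values, q=0.05):
    """Benjamini-Hochberg: return the set of indices declared significant at rate q."""
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    m = len(p_values)
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff_rank = rank             # largest rank satisfying the BH bound
    return set(order[:cutoff_rank])

center = (4, 30, -18)   # sphere center reported in the text
radius = 20             # sphere radius in mm reported in the text

# Invented voxels: (coordinate in mm, uncorrected p value).
voxels = [
    ((0, 30, -18), 0.0004),
    ((10, 40, -10), 0.0100),
    ((4, 48, -18), 0.0300),
    ((4, 30, 0), 0.6000),
    ((40, -60, 30), 0.0001),   # far outside the sphere, so never tested
]
inside = [(c, p) for c, p in voxels if in_sphere(c, center, radius)]
significant = fdr_bh([p for _, p in inside], q=0.05)
survivors = [inside[i][0] for i in sorted(significant)]
print(survivors)
```

Restricting the search volume reduces the number of tests, which is why a voxel with a modest uncorrected p value can survive inside the sphere, while a voxel outside it is not considered at all regardless of its p value.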
The analysis of the second experiment was identical to the first. For the application of a statistical threshold, we again performed a small-volume correction. To implement this correction, we used the activation map from the conjunction identified in the first experiment (i.e., the activation map found to commonly encode all categories of goods when making choices against a fixed median monetary bid) [x = −3, y = 42, z = −6] and performed small-volume correction within that image mask to test for voxels showing significant effects in vmPFC in the second experiment.
Results
Behavioral results
Figure 1B shows the distributions of bids during the prescanning trials. The average bid was $1.61 (SD, 1.07) for money goods, $1.61 (SD, 1.50) for trinket goods, and $1.26 (SD, 1.08) for snack goods. There was no significant difference between mean bids among the three classes of goods (p > 0.05), indicating that these goods are well matched in terms of their overall utility to the subjects. For each class of goods, the majority of bids were greater than zero (money, 91%; trinkets, 71%; snacks, 84%), and the average WTP was significantly greater than zero (p < 0.001). This suggests that most items were rewarding for most participants and that a similar distribution of bids was placed for each category of goods.
Figure 1. Experimental design and behavioral results. A, Before entering the scanner, subjects judged their WTP for an 80% probability of receiving each item. Inside the scanner, subjects made choices between a reference monetary price (equal to the median WTP of all items) and an 80% chance of receiving each item. The order of item presentation was randomized. B, Group distributions of WTP for each category of item. C, Group psychometric functions for each category of item. Psychometric functions were generated from subjects' prescanning WTP measures and choices inside the scanner.
To ensure the participants' decisions inside the scanner were consistent with the measures of DV taken in the prescanning phase, we fitted psychometric functions of the scanner choices as a function of the subjects' WTP. Figure 1C shows group psychometric functions for the money, trinket, and snack item categories. It can be seen that as the WTP for an item increased, participants had a higher likelihood of selecting that item over the fixed bid. This relationship illustrates that participants make choices in the scanner that are consistent with the prescanning WTP valuations, and that the scanner choices are dependent on the underlying WTPs.
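One common way to fit such a psychometric function is logistic regression of the binary choice on WTP. The text does not state the functional form used, so the logistic form, the Newton-Raphson routine `fit_logistic`, and the toy choice counts below are all assumptions made for illustration in Python.

```python
import math

def fit_logistic(x, y, n_iter=25):
    """Fit P(choose item) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                  # entries of the negative Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system (negative Hessian) * delta = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Invented data: at each WTP level, how many of 10 trials chose the item.
levels = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
chose_item = [1, 2, 5, 8, 9, 10]
wtp, choice = [], []
for x, k in zip(levels, chose_item):
    wtp += [x] * 10
    choice += [1] * k + [0] * (10 - k)

b0, b1 = fit_logistic(wtp, choice)
indifference = -b0 / b1   # WTP at which the item is chosen half the time
print(b1, indifference)
```

A positive slope `b1` corresponds to the pattern described above, in which higher WTP makes choosing the item more likely. The indifference point would be expected to lie near the reference price if scanner choices are consistent with the prescanning bids.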
fMRI results
Common representation of DV for different categories of reinforcers
We performed a whole-brain analysis to identify areas responsible for the computation of WTP (our measure of DV) for different categories of goods. These activation maps (Fig. 2) illustrate brain regions that encode the WTP for money ([x = −3, y = 42, z = −6]; p < 0.05, FDR corrected), trinkets ([x = 6, y = 39, z = 12]; p < 0.001, FDR corrected), and snacks ([x = 3, y = 30, z = 12]; p < 0.001, FDR corrected). Consistent with previous studies, for all classes of items, we found activity in vmPFC that was positively correlated with the subjects' WTP. Furthermore, a conjunction analysis found a region of vmPFC that was commonly active for all classes of items ([x = −3, y = 42, z = −6]; p < 0.05, FDR corrected). No other area of the brain showed significant correlations in this conjunction analysis at p < 0.005 uncorrected. This provides evidence for a common representation of DV in vmPFC that was not found anywhere else in the brain.
Figure 2. mOFC commonly encodes the DV of multiple classes of goods. A–C, The results from experiment 1 for areas correlating with our goal valuation measure separately for the money, trinket, and food conditions. Columns show areas exhibiting voxel-wise correlations with WTP (color codes depict the statistical thresholds used to display the activation). The peaks of activation are as follows: money condition: [x = −3, y = 42, z = −6], Z = 3.42; trinket condition: [x = 6, y = 39, z = 12], Z = 5.06; snack condition: [x = 3, y = 30, z = 12], Z = 5.00. D, An area of mOFC surviving a conjunction analysis testing for correlations with valuation common to all of the goods in the first experiment (for choices against a fixed monetary bid). The locus of the peak voxel from the conjunction analysis was [x = −3, y = 42, z = −6]; Z = 3.42. The right column shows the average percentage signal change in a 5 mm sphere centered on the peak coordinates from the study by Plassmann et al. (2007), ensuring the independence of this plot from the contrasts used in A–D. The average percentage signal change is shown for the lower ($0–$1) and upper ($3–$4) bounds of WTP. E, Results for the conjunction analysis for the second replication experiment in which choices were made against a fixed snack item. The peak conjunction response in this experiment was [x = −9, y = 39, z = −6]; Z = 3.57. The right column shows the average percentage signal change as a function of WTP, from a 5 mm sphere centered on the peak coordinate derived from experiment 1, ensuring the independence of this plot from the contrasts used to generate the conjunction.
We next investigated whether there were brain areas exhibiting sensitivity only for the DV of particular classes of items. Differential effects were examined using linear contrasts between parameter estimates of WTP regressors for each condition. A whole-brain analysis of these differential contrasts resulted in no areas of unique activity for any class of item (at a level of p < 0.005 uncorrected).
Plots of peak percentage signal change reveal similar blood oxygenation level-dependent (BOLD) responses for all categories of items, both in the item-specific activation maps and in the conjunction (Fig. 2). In all cases, activation was lower on trials with low WTP and relatively higher on trials with high WTP. These results illustrate a common pattern of encoding of value signal in vmPFC that informs decisions regarding all categories of items. A plot of BOLD signal from the peak of the activated area against the full range of DVs (using an independent leave-one-out analysis) is shown in supplemental Figure 1A (available at www.jneurosci.org as supplemental material).
Representations of common decision utility are independent of the reinforcer category of the reference item
One possible concern about the findings reported thus far is that an overlapping response in vmPFC during decisions between dissimilar goods could arise because subjects always chose between the good on offer and the fixed monetary amount used as the reference price. Subjects may therefore have solved the choice problem by always converting the item's DV into a monetary currency, which was then compared with the reference price. According to this possibility, overlapping activity between the conditions could merely reflect the common use of a monetary currency to make decisions across conditions. To rule out this strategy, we performed a second closely related experiment in a separate group of 13 subjects. This experiment was almost identical to the first, except that subjects chose between the item on offer and a fixed reference snack item whose value approximated that of the median bid. They would forgo this fixed snack item if they opted to play the lottery for the nonfixed item, whether monetary, food, or trinket.
We again found an area of common activation within the same region of vmPFC (p < 0.05, FDR corrected), and, once again, no other brain area showed evidence of such conjunction activity at p < 0.001 uncorrected or even at p < 0.005 uncorrected. Also, a whole-brain analysis of differential contrasts again resulted in no areas of unique activity for any class of item (at p < 0.001 uncorrected or even at p < 0.005 uncorrected). A plot of BOLD signal from the peak of the activated area against the full range of DVs (using an independent leave-one-out analysis) is shown in supplemental Figure 1B (available at www.jneurosci.org as supplemental material).
Discussion
In this study, we present direct evidence that a specific region of vmPFC encodes the DVs for at least three distinct types of goods: monetary prizes, trinkets, and food items. Activity in this region was found to correlate with the subjects' DV for each of the items while performing a binary choice between a lottery involving the given item and a fixed reference good (a monetary prize in experiment 1 and a snack in experiment 2). This finding supports the hypothesis that a specific region of vmPFC holds a representation of value regardless of the categories of goods presented, and regardless of the specific type of comparison being performed. Our findings provide some hints about the mechanism the brain might use to make comparisons between different categories of reinforcers at the time of choice. Our study supports the possibility that the brain uses a common currency mechanism by encoding a representation of DVs for every stimulus, regardless of its category, in a common region of vmPFC. One possibility is that this region contains neurons that have similar properties and thus can encode signals that are easily compared.
Because of limitations in the spatial resolution of fMRI, our data provide no information about the extent to which different neurons within vmPFC encode the value of different stimuli, or whether there are neurons that encode DVs for multiple categories of stimuli. Nevertheless, our findings suggest that the valuation processes underlying basic decisions appear to be mediated, at least in part, by the same region of frontal cortex. This encoding holds regardless of the nature of the items being valued and is not mediated by discrete, spatially nonoverlapping cortical circuits for each category of item.
The present study builds on a rapidly growing literature showing that subregions of vmPFC correlate with a number of different value signals for a variety of reinforcers. Regions of medial prefrontal cortex and mOFC have been found to correlate with the value of monetary outcomes in a number of previous studies (Knutson et al., 2001; O'Doherty et al., 2001; Kim et al., 2006). Anticipated value signals have been observed in vmPFC during tasks involving choices between stimuli associated with the subsequent delivery of monetary outcomes (Daw et al., 2006; Hampton et al., 2006; Kim et al., 2006). Activity in medial prefrontal cortex has also been found to correlate with the DV that subjects assign to monetary gambles, delayed versus immediate monetary prizes, and food items (Kable and Glimcher, 2007; Plassmann et al., 2007; Tom et al., 2007). An important difference between our study and previous studies is that none of the previous studies used a design that made it possible to test whether DV signals for different reinforcer types are encoded within the same brain region. The closest study to ours compared responses at the time of outcome to the receipt of monetary outcomes and social reinforcement, identifying an area showing overlapping activity within the striatum (Izuma et al., 2008). However, by focusing exclusively on signals related to receipt of reward outcomes, this study did not address how the brain assigns values to different categories of items as inputs to the choice process at the time of decision making.
It should be noted that in this experiment, subjects were making decisions regarding 80% gambles for the different types of goods. Subjects were presented with these lotteries to ensure that value judgments for the monetary reward were nontrivial and recruited the DV system (without the gamble component, the value of the monetary reward would have simply corresponded to the amount offered). To ensure equivalence across conditions, gambles were used for each of the different types of items. Given that subjects always made decisions under a level of moderate risk, it is possible that our finding of a common valuation area within vmPFC applies only in the case of risky decisions. However, a number of previous studies have found activity in similar regions of vmPFC correlated with DVs for items without the gamble context used here (Plassmann et al., 2007; Hare et al., 2008). Therefore, when taking the present results together with those previous findings, it seems unlikely that the results we obtain here are specific only to the probability-discounted case.
Another feature of the present study is that, by design, DVs for each item are correlated with subjects' choice of a particular item on offer, or the reference item. This raises the possibility that vmPFC activity could, in the present study, reflect a binary choice and not valuation per se. However, this explanation is unlikely, given that we find vmPFC activity scales across the full range of DVs (as shown in supplemental Fig. 1, available at www.jneurosci.org as supplemental material), rather than merely reflecting a binary choice outcome. Furthermore, in the previous study by Plassmann et al. (2007), the subjects provided trial-by-trial WTP estimates as opposed to making binary choices, and activity in vmPFC was also found to be correlated with the subjects' WTP. That study found a region of vmPFC correlating with WTP for food rewards that overlaps the regions we found here for DVs across all three reinforcer types.
To conclude, we provide the first direct evidence that activity in a discrete region of vmPFC is correlated with the subjects' DVs for distinct types of reinforcers. These findings are compatible with the concept that the brain uses a common currency framework when making decisions between fundamentally different classes of rewards.
Footnotes
This work was supported by National Science Foundation Grant 0617174 and a Searle Scholarship (J.P.O.), by the Gordon and Betty Moore Foundation (J.P.O., A.R.), and by the Japan Science and Technology Agency, Exploratory Research for Advanced Technology (S.S.).
- Correspondence should be addressed to Vikram S. Chib, 1200 East California Boulevard, M/C 139-74, Pasadena, CA 91125. vchib{at}caltech.edu