Abstract
The ability of human subjects to choose between disparate kinds of rewards suggests that the neural circuits for valuing different reward types must converge. Economic theory suggests that these convergence points represent the subjective values (SVs) of different reward types on a common scale for comparison. To examine these hypotheses and to map the neural circuits for reward valuation we had food- and water-deprived subjects make risky choices for money, food, and water both in and out of a brain scanner. We found that risk preferences across reward types were highly correlated; the level of risk aversion an individual showed when choosing among monetary lotteries predicted their risk aversion toward food and water. We also found that partially distinct neural networks represent the SVs of monetary and food rewards and that these distinct networks showed specific convergence points. The hypothalamic region mainly represented the SV for food, and the posterior cingulate cortex mainly represented the SV for money. In both the ventromedial prefrontal cortex (vmPFC) and striatum there was a common area representing the SV of both reward types, but only the vmPFC significantly represented the SVs of money and food on a common scale appropriate for choice in our data set. A correlation analysis demonstrated interactions across money and food valuation areas and the common areas in the vmPFC and striatum. This may suggest that partially distinct valuation networks for different reward types converge on a unified valuation network, which enables a direct comparison between different reward types and hence guides valuation and choice.
Introduction
It is often said that one cannot choose between apples and oranges, but consumers make this choice every day. Classical economic proofs (Samuelson, 1947) demonstrate that whenever such choices are logically consistent, choosers are behaving as if the values of apples and oranges are mapped to a single common scale, and choosers select the option with the highest value on that scale.
A subject choosing between a 50% chance of winning an apple and a 25% chance of winning an orange in a consistent manner behaves as if they represent the expected utility (EU) of each option separately and then transform those EUs onto a common scale for comparison. Thus, for any choice situation with any reward types, the simplest possible explanation for consistent choice would be the assumption that there is a neural valuation system that represents the EU of all options and then allows comparison on a single common scale.
If this simplest possible explanation corresponded to the mechanism for choice, then where in the brain are the values of different types of rewards represented, and how are these representations resolved to a single common scale for comparison? A growing body of evidence suggests that the ventromedial prefrontal cortex (vmPFC), orbitofrontal cortex (OFC), striatum, anterior (ACC) and posterior cingulate cortex (PCC), posterior parietal cortex, and lateral intraparietal cortex (LIP) may represent different aspects of valuation (Glimcher, 2011). However, most of these studies relied on a single reward type. While this allowed an examination of valuation processes, it precluded the identification of independent systems for valuing different kinds of rewards or the resolution of those systems to a single common scale for comparison.
Klein et al. (2008) made headway on this issue in monkeys, demonstrating that neurons in LIP encode social and fluid rewards on a single common scale that predicts choice. But that left the question of how independent representations converge. Previous studies have also demonstrated that the human vmPFC (Chib et al., 2009; Kim et al., 2010) and striatum (FitzGerald et al., 2009) carry spatially overlapping value-related signals for food, money, and incommensurable goods. The existence of these overlapping areas is compatible with the hypothesis that distinct representations for different kinds of rewards occur in the brain, and that these valuations converge. However, none of these studies has demonstrated either that activity in these convergent areas encodes value on a common scale for direct comparison or that valuation areas exist which are reward-type specific. In this study we hoped to advance these insights by searching for value areas specialized for different reward types and by explicitly testing the hypothesis that these distinct neural networks converge to a single common-scale value representation in the vmPFC and/or in the striatum.
We also hoped to determine how monetary valuation, and thus risk aversion, relates to valuation and risk aversion for primary rewards. The primate brain evolved to make choices between primary rewards (e.g., food, water). Most modern studies, however, have examined the valuation of money, a mechanism that must have arisen quite recently in human history. This naturally raises the question of whether money engages valuation systems associated with primary rewards or some more abstract set of processes associated with common-scale valuation.
Materials and Methods
Subjects
A total of 66 subjects (31 women) were successfully enrolled in the study presented here, and all of those subjects completed at least one behavioral session. All participants gave informed consent, and all procedures were in compliance with the safety guidelines for MRI research and were approved by the University Committee on Activities Involving Human Subjects of New York University. Of the 66 subjects, we could not estimate risk parameters (described below) for 1 subject, and that subject was discarded from all further analyses. We could not estimate scaling factors (described below) for an additional 18 subjects. Those subjects were discarded from scaling factor-related analyses. (These 19 subjects, who showed no behavioral variation in their choices during one or more of the behaviorally measured conditions, were not asked to return for a second testing session.)
Of the 47 remaining subjects, 18 subjects did not show up for the second behavioral session. The remaining 29 subjects (15 women) both agreed to return for a second behavioral testing session and yielded analyzable data in the first session. A subject's data were considered stable if the difference between the fitted risk aversion parameter for either money or food across the two behavioral sessions was ≤0.25. Of the 29 subjects, 6 subjects failed to meet this criterion. In addition, 4 subjects refused to participate in the fMRI session. The remaining 19 subjects (nine women) participated in the fMRI scanning session that completed the study.
General procedure
In preparation for the behavioral sessions, participants were asked to refrain from eating and drinking for 4 h before coming to the lab. Money and two primary rewards (food and water) were offered to the subjects during experimental trials. Before the first experimental session began, subjects were offered a choice between two food rewards: small chocolate candies (M&M's; Mars Nutrition) or small salted crackers (Mini-Ritz; Kraft Foods). The food reward that they selected then served as the target of all future food choices for that subject. Of the initial 66 subjects, 37 selected chocolate candies. Of the 19 subjects studied in the scanner, 7 selected chocolate candies. Water offers were for a fixed number of milliliters of spring water. Monetary offers were in units of U.S. dollars.
Before testing began in each session (both behavioral and fMRI), subjects were asked to report their current hunger and thirst levels by replying to the question “How hungry/thirsty are you right now?” using a visual analog scale (VAS).
Behavioral sessions
During the two behavioral sessions, subjects were asked to perform 900 same-type trials (300 choices over each of the three reward types: money, food, and water) and 600 mixed-type trials (300 choices over money–food lotteries and 300 choices over money–water lotteries) in 12 blocks, as described below. All trials were randomly interleaved. Subjects received $40 for completing each of the behavioral sessions; each session lasted ∼1 h and comprised a total of 750 choices. Subjects were informed in advance that after testing they would be asked to remain in the laboratory for 2 h, during which the only food and water to which they would have access was the food and water realized from their trials.
On each trial presented during behavioral testing, two options were presented to the subject on a computer screen for 2 s (Fig. 1C, right). This presentation was followed by the appearance of a yellow cross in the middle of the screen, which signaled to the subjects that they had a maximum of 1.5 s to indicate which option they preferred by pressing one of two buttons on a computer mouse. Thereafter, a feedback screen indicating the subject's choice was presented for 0.5 s plus the difference between 1.5 s and the reaction time to make sure that the total time of choice plus feedback was 2 s. This was followed immediately by the next trial. Failing to make a choice within the given time resulted in an error signal during the feedback interval. Missed trials were not repeated. Of the 750 trials in a session, subjects missed 10 trials on average (range, 0–55).
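The timing rule above can be illustrated with a short sketch. The function and parameter names are hypothetical and chosen only for illustration; the durations are the ones stated in the text (1.5 s response window, 0.5 s base feedback).

```python
def feedback_duration(reaction_time_s, window_s=1.5, base_feedback_s=0.5):
    """Feedback time chosen so that response time plus feedback totals 2 s."""
    if not 0.0 <= reaction_time_s <= window_s:
        raise ValueError("reaction time must fall inside the response window")
    # Unused part of the response window is added to the base feedback time.
    return base_feedback_s + (window_s - reaction_time_s)


rt = 0.9                                  # hypothetical reaction time in seconds
choice_plus_feedback = rt + feedback_duration(rt)
trial_length = 2.0 + choice_plus_feedback  # 2 s option display + 2 s choice/feedback
```

Under this rule every completed trial lasts 4 s regardless of how quickly the subject responds.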
Same-type trials
In same-type trials (Fig. 1A), subjects were asked to choose between a certain small reward (the reference option) and a stated probability of either winning a larger amount of the same reward (money, food, or water) or getting nothing (the lottery option). The value of the reference option during same-type trials was fixed throughout the experiment ($2, five chocolate candies or two salty crackers, and 60 ml of water). There were five different values for the lottery option for each reward type ($2, $4.50, $10, $22.50, or $50; 5, 10, 20, 40, or 80 candies; 2, 5, 10, 20, or 40 crackers; and 60, 125, 250, 500, or 1000 ml of water). Five different winning probabilities (13, 22, 38, 50, and 75%) were fully crossed with these five reward magnitudes, yielding 25 unique lottery options for each of the three reward types. A subject might thus be asked to choose between a sure win of five M&M's and a 38% chance of winning 20 M&M's (and a 62% chance of winning nothing). The 75 unique lottery options across all three reward types made up one block of the session. Each lottery was presented against the same-type reference option six times in six separate blocks in each behavioral session in a randomized order, resulting in 150 choices per reward type (450 total) per session. The side of the reference option was fixed throughout a block to prevent subjects from making mistakes and was switched to the other side for the following block.
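As a rough illustration of the design described above, the following sketch builds the fully crossed lottery set and six shuffled blocks for one session. It assumes a subject who chose chocolate candies as the food reward; the variable names and the random seed are hypothetical.

```python
from itertools import product
import random

PROBS = [0.13, 0.22, 0.38, 0.50, 0.75]
MAGNITUDES = {
    "money": [2, 4.5, 10, 22.5, 50],        # U.S. dollars
    "candies": [5, 10, 20, 40, 80],          # food reward for this subject
    "water_ml": [60, 125, 250, 500, 1000],
}


def build_block(rng):
    """One block: all 5 x 5 lotteries for each reward type, in random order."""
    lotteries = [
        (reward, x, p)
        for reward, amounts in MAGNITUDES.items()
        for x, p in product(amounts, PROBS)
    ]
    rng.shuffle(lotteries)
    return lotteries


rng = random.Random(0)                       # arbitrary seed for reproducibility
session = [build_block(rng) for _ in range(6)]
n_choices = sum(len(block) for block in session)
```

Each block holds 75 lotteries, so six blocks give the 450 same-type choices per session stated in the text.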
Mixed-type trials
In the mixed-type trials (Fig. 1B), subjects were asked to choose between a sure win of a small amount of money ($0.50) and a stated probability of either winning a fixed amount of food or water or getting nothing. Five amounts (10, 20, 30, 50, or 80 candies; 5, 10, 15, 25, or 40 crackers; and 125, 250, 400, 600, or 1000 ml of water) in the same range as in the same-type trials with the same five winning probabilities as in the same-type trials were used, resulting in 25 unique lotteries each for food and water. The 50 unique lottery options for food and water made up one block of the session. Each lottery was presented against the mixed-type money reference option six times in six separate blocks in each behavioral session in a randomized order, resulting in 150 choices per reward type (300 total) per session.
Realization of choices
At the conclusion of each behavioral session, one completed trial of each type (for a total of five trials) was randomly selected and played for real money and/or real primary rewards.
If on a selected trial the subject had chosen the reference option, they then received that amount of food, money, or water. If on a selected trial the subject had chosen the lottery option, then a random number generator determined whether or not the subject had won the specified amount of food, money, or water. Subjects were then given their consumable rewards and asked to stay in the lab for an additional 2 h. During this period, the only food and drink they were allowed to consume was what they had realized from the experiment. We imposed this 2 h delay for two reasons: First, it ensured that the choices made by the subjects over consumable rewards had an impact on their physiological state over an extended period. Second, it ensured that subjects could not effectively maximize their food and water intake on mixed trials by always selecting the monetary reward and then leaving the lab to purchase candy or crackers at market prices. Our observation that subjects typically valued the food and water rewards at 2–3× their market value (described in Results) suggests that this manipulation was successful. All studied subjects remained in the lab for this additional 2 h period.
Scanning session
During the scanning session, subjects were asked to perform 90 same-type food trials and 90 same-type monetary trials. Same-type water trials and all mixed-type trials were omitted from the scanning session in the interest of maximizing the number of same-type food and monetary trials that could be completed. Subjects received $50 for completing the scanning session, which lasted ∼1.25 h. At the completion of testing, one trial of each type (for a total of two) was selected for realization. Subjects were informed in advance that they would be asked to remain in the laboratory for 2 h after completing the experiment.
The same-type trials conducted inside the fMRI scanner were similar to the trials presented in the behavioral sessions. However, instead of the 25 unique lotteries used in the same-type money and food trials examined behaviorally, a subset of 15 lotteries of each reward type was sampled. We used this subset to maximize our statistical power with the limited number of trials available in the scanner. We sampled lotteries with low, medium, and high EUs, but we made sure to sample each reward magnitude and each winning probability three times. In the scanner, as in the behavioral sessions, each trial lasted 4 s, but in the scanner, each trial was followed by a randomly selected 6–22 s intertrial interval.
Each session was composed of six 436 s scan runs. There was a white fixation cross during the first 8 s of each scan, and this time was discarded from analysis to exclude T1 magnetic saturation effects. Thirty trials were presented during each scan, which resulted in a total of 90 choices per reward type (money and food).
Imaging protocol
Imaging data were collected with a Siemens Allegra 3T head-only scanner equipped with a head coil from Nova Medical. To measure blood oxygen level-dependent (BOLD) changes in brain activity, a T2*-weighted functional multiecho EPI pulse sequence was used (TR, 2 s; TE, 30 ms; flip angle, 82°; 33 axial 3 mm slices with no interslice gap were acquired in ascending interleaved order; 3 × 3 × 3 mm; 64 × 64 matrix in a 192 mm field of view). The slices were tilted −30° from a line connecting the anterior and the posterior commissure (AC–PC) to reduce the signal dropout in the OFC (Deichmann et al., 2003). Each scan consisted of 218 images resulting in 436-s-long scans.
High-resolution T1-weighted anatomical images were also collected using an MP-RAGE sequence (TR, 2.5 s; TE, 3.93 ms; TI, 900 ms; flip angle, 8°; 144 sagittal slices; 1 × 1 × 1 mm; 256 × 256 matrix in a 256 mm field of view) and used for volume-based statistical analysis.
Stimuli were projected onto a screen at the back of the scanner, and subjects viewed them through a mirror attached to the head coil. Subjects responded by pressing one of two buttons placed beneath the first and second digits of the right hand.
Data analysis
Estimating expected utilities/expected subjective values.
We used random utility theory to derive the subject-specific EU for each reward type using the data from the same-type trials. Our aim was to map objective values to subjective values using winning probabilities as our ruler and choices for each of the three different reward types as our data. We therefore pooled all same-type data from all sessions (one to three) completed by any one subject and segregated the data by reward type. We modeled the utility functions for each reward type (reference or lottery) separately as power functions having the form

\[ EU_i = p \times X^{\alpha_i} \]

where p is the stated probability that an option will yield a reward (p = 1.0 in reference options or the stated probability in lottery options), X is the objective value of the offered reward, and αi is the free parameter representing the subject-by-subject level of risk aversion for specific reward type i. We note that with this type of utility function, a value of α = 1 represents a risk-neutral agent, a value of α < 1 represents a risk-averse agent with a concave function, and a value of α > 1 represents a risk-seeking agent with a convex function. We selected this particular equation for our utility function fits because it is a simple functional form with few assumptions, with only one free parameter, and it successfully predicts choices under these behavioral conditions.
Using maximum likelihood estimation, the choice data for each reward type from all sessions (behavioral and fMRI) of the same-type trials for each subject were simultaneously fit to a single logistic function of the following form:

\[ P_L = \frac{1}{1 + e^{-\beta_i \left(EU_L - EU_R\right)}} \]

where PL is the probability that the subject chose the lottery option, EUL and EUR are the expected utility for the lottery and reference options, respectively, and βi is the slope of the logistic function, which is an additional subject-specific parameter. Thus, all same-type data were fit simultaneously with six parameters, three α terms and three β terms. This analysis produced a fitted risk aversion parameter (αi) and a slope parameter (βi) for each reward type and thus specified a utility function (or equivalently, a subjective value function) for each reward type for each subject that could account for the trade-offs between risk and reward that we observed in our subjects.
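The fitting procedure can be sketched for a single reward type as follows. This is a minimal illustration, assuming a power utility EU = p·X^α and a logistic choice rule on the EU difference; the coarse grid search is a simple stand-in for whatever optimizer was actually used, and the simulated chooser, seed, and grid values are hypothetical.

```python
import math
import random


def expected_utility(p, x, alpha):
    """Power-utility expected utility: EU = p * x**alpha."""
    return p * x ** alpha


def p_choose_lottery(eu_lottery, eu_reference, beta):
    """Logistic choice rule on the EU difference (numerically stable form)."""
    z = beta * (eu_lottery - eu_reference)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def neg_log_likelihood(alpha, beta, trials, x_ref):
    """Negative log-likelihood of observed choices (p, x, chose_lottery)."""
    eu_ref = expected_utility(1.0, x_ref, alpha)
    nll = 0.0
    for p, x, chose_lottery in trials:
        pl = p_choose_lottery(expected_utility(p, x, alpha), eu_ref, beta)
        pl = min(max(pl, 1e-12), 1.0 - 1e-12)   # guard against log(0)
        nll -= math.log(pl if chose_lottery else 1.0 - pl)
    return nll


def fit_by_grid(trials, x_ref, alphas, betas):
    """Return the (alpha, beta) grid point with the highest likelihood."""
    return min(((a, b) for a in alphas for b in betas),
               key=lambda ab: neg_log_likelihood(ab[0], ab[1], trials, x_ref))


# Simulate a hypothetical risk-averse chooser on the money lotteries.
rng = random.Random(1)
TRUE_ALPHA, TRUE_BETA, X_REF = 0.6, 2.0, 2.0
lotteries = [(p, x) for p in (0.13, 0.22, 0.38, 0.50, 0.75)
             for x in (2, 4.5, 10, 22.5, 50)]
eu_ref_true = expected_utility(1.0, X_REF, TRUE_ALPHA)
trials = []
for _ in range(12):                              # 12 repeats of each lottery
    for p, x in lotteries:
        pl = p_choose_lottery(expected_utility(p, x, TRUE_ALPHA),
                              eu_ref_true, TRUE_BETA)
        trials.append((p, x, rng.random() < pl))

alphas = [round(0.05 * k, 2) for k in range(4, 31)]   # 0.20 to 1.50
betas = [0.5, 1.0, 2.0, 4.0, 8.0]
alpha_hat, beta_hat = fit_by_grid(trials, X_REF, alphas, betas)
```

A gradient-based or simplex optimizer would normally replace the grid; the grid is used here only to keep the sketch free of external dependencies.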
For the purposes of computing the significance values of the correlations between the different risk aversion parameters (αi) and between the different slope parameters (βi), we used a nonparametric Spearman rank test. To ascertain whether the correlations between risk aversion estimates made with the power utility functions were robust to other functional forms with continuous first and second derivatives (see Results), we also fit all choice data with an exponential utility function of the form \(1 - e^{-\alpha x}\), while using the same maximum likelihood and logistic function approach. We estimated how well our model fit the data with a pseudo-R² value computed as the ratio between the log-likelihoods of the fits obtained to the observed choices and the log-likelihood of the fit that would have been obtained from a completely random chooser.
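A goodness-of-fit measure of this kind can be sketched as below. The text describes the measure as a ratio of log-likelihoods; the specific convention used here, 1 minus the ratio of the model log-likelihood to that of a random chooser, is an assumption, as are the function names and the example numbers.

```python
import math


def random_chooser_log_likelihood(n_trials):
    """Log-likelihood of a chooser who picks each option with probability 0.5."""
    return n_trials * math.log(0.5)


def pseudo_r2(model_log_likelihood, n_trials):
    """Assumed convention: 1 - LL_model / LL_random (0 = chance, 1 = perfect)."""
    return 1.0 - model_log_likelihood / random_chooser_log_likelihood(n_trials)


# Hypothetical example: a fit with log-likelihood -60 over 150 choices.
example = pseudo_r2(-60.0, 150)
```

With this convention a model that predicts no better than chance scores 0, and a model that assigns probability 1 to every observed choice scores 1.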
Estimating the behavioral scaling factor.
Using the fitted parameters from the same-type trials and the choice data from the mixed-type trials, we estimated the relative pricing between money and food and water for each subject. We again used a power utility function, but we introduced an additional linear factor that scaled the expected utility of food and water to that of money in a manner that predicted choice. We thus effectively searched for families of indifference points, where

\[ EU_{R\$} = S_f \times EU_{Lf} \]

and

\[ EU_{R\$} = S_w \times EU_{Lw} \]

and where EUR$ is the expected utility of the reference option in monetary subjective value units, and EULf and EULw are the expected utilities of the lottery options in subjective value units of food and water, respectively. Sf and Sw are the scaling factors for food and water lotteries, respectively; they were free parameters fit using maximum likelihood estimation. The choice data for each reward type (food and water) from all behavioral sessions of the mixed-type trials for each subject were simultaneously fit to a single logistic function of the following form:

\[ P_L = \frac{1}{1 + e^{-\beta_i \left(S_i \times EU_{Li} - EU_{R\$}\right)}} \]

where PL is the probability that the subject chose the lottery option and i denotes the lottery reward type (food or water).
We fit the mixed-type choice data for each chooser with the αi term fixed at the values determined for the larger same-type trial data set and added two free parameters: a noise parameter (βi) and a free parameter that linearly scaled the subjective value (or utility) of the primary reward relative to the monetary reward (separately for food and water). In terms of standard economic theory, this effectively assumes that all rewards can be formally described as perfect substitutes in units of utility, that is, that utility functions are additively separable. (Note that EUL and EUR were not fit here. They were determined using the fitted risk aversion parameter from the same-reward type lotteries described above.)
Thus, all mixed-type data were fit simultaneously with four parameters, two S terms and two β terms. This analysis produced a fitted scaling factor parameter (Si) and a slope parameter (βi) for each mixed reward type, and thus specified the relative pricing between money and food and between money and water. This allowed us to plot the utility functions of money and food or water on a common scale on the same graph for each subject.
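To make the idea of a common scale concrete, the sketch below converts food amounts into monetary subjective value units and solves for the food amount a subject would treat as equivalent to a sure amount of money. It assumes power utilities and a linear scaling factor as described in the text; the function names and parameter values are hypothetical.

```python
def money_utility(dollars, alpha_money):
    """Subjective value of a sure monetary amount (power utility)."""
    return dollars ** alpha_money


def scaled_food_utility(amount, alpha_food, s_food):
    """Food utility expressed in monetary subjective value units."""
    return s_food * amount ** alpha_food


def food_equivalent(dollars, alpha_money, alpha_food, s_food):
    """Sure food amount whose scaled utility equals that of `dollars`.

    Solves s_food * amount**alpha_food == dollars**alpha_money for amount.
    """
    return (money_utility(dollars, alpha_money) / s_food) ** (1.0 / alpha_food)


# Hypothetical subject: alpha = 0.5 for both reward types, scaling factor 0.5.
candies = food_equivalent(4.0, alpha_money=0.5, alpha_food=0.5, s_food=0.5)
```

For this hypothetical subject, 16 candies and $4 sit at the same point on the common scale, which is how both utility functions can be drawn on one graph.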
Stability of risk parameters.
To formally test the stability of our parameter estimates for the 19 subjects who participated in every aspect of this study, we estimated each subject's risk aversion parameters separately for each session and each reward type. We then performed a repeated-measures ANOVA on the risk aversion parameters estimated for each of the three sessions (behavior 1, behavior 2, and fMRI) for money and for food and found no significant session effects (money, F(2,18) = 0.860, p = 0.43; food, F(2,18) = 0.723, p = 0.49).
To further test the stability of our parameter estimates at a subject-by-subject level, we allowed the estimated risk parameter to systematically vary as a function of session. We introduced a dummy variable for each session, for each reward type, and examined whether the dummy coefficients were significantly different from zero. In all of the 19 subjects participating in three sessions, we failed to reject the null hypothesis. In our estimates, risk parameters across sessions were not significantly different either in the money or food domain.
Assessing model structure.
We also performed a comparison between the model described above that has five independent noise parameters (one for each of the trial types; 5-β model) and a more restricted model that has only one noise parameter (1-β model), fit either sequentially as above or simultaneously. We found that the 5-β model fit the data better than the 1-β model as measured by log likelihood values, Bayesian information criterion values, or pseudo-R² (against a random chooser) analysis. For this reason, the 5-β model was used.
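A comparison by Bayesian information criterion can be sketched as follows, using the standard definition BIC = k·ln(n) − 2·LL. The log-likelihoods, parameter counts, and number of observations below are placeholders for illustration only and are not values from this study.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a better model."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Placeholder numbers (hypothetical, not results from the study).
n_obs = 1500
bic_five_beta = bic(log_likelihood=-700.0, n_params=10, n_obs=n_obs)
bic_one_beta = bic(log_likelihood=-760.0, n_params=6, n_obs=n_obs)
preferred = "5-beta" if bic_five_beta < bic_one_beta else "1-beta"
```

The criterion penalizes the extra noise parameters, so the richer model is preferred only when its gain in log-likelihood outweighs that penalty.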
General linear model for fMRI.
Functional imaging data were analyzed using BrainVoyager QX (Brain Innovation) with additional analyses performed in Matlab (MathWorks). Functional images were sinc-interpolated in time to adjust for staggered slice acquisition, corrected for any head movement by realigning all volumes to the first volume of the last scanning session using six-parameter rigid body transformations, and detrended and high-pass filtered (cutoff of five cycles per scan) to remove low-frequency drift in the fMRI signal. Data were also spatially smoothed with a Gaussian kernel of 8 mm (full-width at half-maximum). Images were then coregistered with each subject's high-resolution anatomical scan, rotated into the AC–PC plane, and normalized into Talairach space (Talairach and Tournoux, 1988). All spatial transformations of the functional data used trilinear interpolation.
Initial statistical analysis was based on a general linear model (GLM). The time course of activity of each voxel was modeled as a sustained response during each trial, convolved with a standard estimate of the hemodynamic impulse response function. The sustained response interval for the regression was two TRs long, which included the experimental interval during which the lottery options appeared on the screen, the choice period, and the feedback interval. We note that, as in many previous studies, our “event-related design” placed signal related to both action and feedback in the interval that was averaged across trials to produce our baseline measurement.
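The construction of such a predictor can be sketched as a boxcar convolved with a hemodynamic response function. The double-gamma shape and its parameters below are a commonly used convention assumed here for illustration, not the specific function used in the analysis software; the trial onsets are hypothetical.

```python
import math

TR = 2.0          # seconds per volume
N_VOLUMES = 218   # volumes per scan


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def hrf(t, peak_shape=6.0, undershoot_shape=16.0, undershoot_ratio=1.0 / 6.0):
    """Assumed double-gamma response: early peak minus a late undershoot."""
    return gamma_pdf(t, peak_shape) - undershoot_ratio * gamma_pdf(t, undershoot_shape)


def convolve(signal, kernel):
    """Causal discrete convolution truncated to the length of `signal`."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out


# Boxcar: each trial is a sustained response lasting two TRs (4 s).
onsets_s = [8, 24, 42]                    # hypothetical trial onsets
boxcar = [0.0] * N_VOLUMES
for onset in onsets_s:
    start = int(onset // TR)
    for v in range(start, start + 2):
        boxcar[v] = 1.0

kernel = [hrf(k * TR) for k in range(16)]  # 0 to 30 s sampled at the TR
predictor = convolve(boxcar, kernel)
```

In this sketch the modeled response to the first trial peaks about 6 s after trial onset, which reflects the assumed kernel rather than any property of the data.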
The main model consisted of two dummy predictors for mean activation in money and food trials, two parametric predictors for the expected subjective value (ESV; the neuronal correlate of behavioral expected utility for each subject estimated as described above) of money and of food, and six predictors modeling head motion as derived from the affine realignment. We refer to the parametric behavioral regressors as ESVs to respect the fact that neural activations cannot, for technical reasons related to cardinality, be referred to as utilities. We define ESVs as brain activations linearly correlated with behaviorally measured EUs.
The predicted ESV of each trial was calculated using the individual subject-specific and reward-specific αi and βi values, which were obtained from the model fit. Because the reference option was always the same, we used only the ESV of the lottery option as a predictor in the regression

\[ \mathrm{BOLD} = \beta_0 + \beta_1 \left(p \times X^{\alpha_\$}\right) + \beta_2 \left(p \times X^{\alpha_f}\right) \]

where β0, β1, and β2 are the regression coefficients; p and X are the stated probability and objective value of the lottery option, respectively; and α$ and αf are the fitted risk aversion parameters for money and food, respectively. The parametric predictors were normalized to a range of 0–1. Activation during the intertrial intervals served as baseline. The activity time course of each voxel in each scan was converted to percentage signal change (PSC), and the model was independently fit to each voxel's PSC, yielding nine coefficients for each subject, including a money ESV (β1) and a food ESV (β2). These results were used in a group random-effects analysis testing whether the mean effect at each voxel is significantly different from zero across subjects. We used a whole-brain analysis and considered any activation significant if it met a threshold of p < 0.0005 at the individual voxel level with a minimum cluster size of 20 functional voxels, which was determined using a Monte Carlo simulation in BrainVoyager. We used this significance threshold to define regions of interest (ROIs) for further analysis. We identified two types of ROIs: ROIs that significantly tracked ESV for money (money ROI) and ROIs that significantly tracked ESV for food (food ROI). We could then extract the average regression coefficient (β) for each subject in each of the significant ROIs.
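The parametric predictor values can be illustrated with the sketch below. It assumes ESV = p·X^α and min–max rescaling to the 0–1 range (the text states only the target range, so the rescaling method is an assumption). The 15-lottery subset is a hypothetical one constructed so that each magnitude and each probability appears three times; it is not the subset used in the study.

```python
PROBS = [0.13, 0.22, 0.38, 0.50, 0.75]
DOLLARS = [2, 4.5, 10, 22.5, 50]


def esv(p, x, alpha):
    """Expected subjective value of a lottery under a power utility."""
    return p * x ** alpha


def normalize_01(values):
    """Min-max rescaling so the smallest value is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant predictor")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical balanced subset: 15 of the 25 lotteries.
subset = [(PROBS[(i + k) % 5], DOLLARS[i]) for i in range(5) for k in range(3)]

alpha_money = 0.6                          # hypothetical fitted parameter
raw = [esv(p, x, alpha_money) for p, x in subset]
predictor_values = normalize_01(raw)
```

Each trial's predictor amplitude is then the rescaled ESV of the lottery shown on that trial.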
To determine whether an ROI was uniquely coding the food or money ESV, we used a leave-one-out cross-validation method. The aim of the method was to extract β values from a significant ROI identified in the main GLM in an unbiased way. For this, we computed the identified ROI using all subjects except one. From the center of mass of the identified ROI, we extracted the β values for money and food for the subject that was left out. We repeated this computation 19 times, each time leaving a different subject out, and computed the unbiased β values for all of our subjects. Then we conducted a t test to examine whether there was a significant difference between money and food β values for the ESV.
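The logic of the leave-one-out procedure can be sketched with toy data. Here each subject's "map" is just a short list of per-voxel β values and the ROI is reduced to the single voxel with the highest group mean; both simplifications, and all names and numbers, are hypothetical stand-ins for the full whole-brain GLM.

```python
import math
from statistics import mean, stdev


def peak_voxel(maps):
    """Index of the voxel with the largest mean beta across the given maps."""
    n_voxels = len(maps[0])
    group_mean = [mean(m[v] for m in maps) for v in range(n_voxels)]
    return max(range(n_voxels), key=group_mean.__getitem__)


def leave_one_out_betas(selection_maps, money_maps, food_maps):
    """For each subject, locate the ROI without them, then read their betas."""
    pairs = []
    for i in range(len(selection_maps)):
        training = selection_maps[:i] + selection_maps[i + 1:]
        v = peak_voxel(training)             # ROI defined without subject i
        pairs.append((money_maps[i][v], food_maps[i][v]))
    return pairs


def paired_t(pairs):
    """Paired t statistic on money-minus-food beta differences."""
    diffs = [m - f for m, f in pairs]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Toy data: 3 subjects x 3 voxels.
money_maps = [[0.1, 0.9, 0.2], [0.0, 0.8, 0.1], [0.2, 1.0, 0.3]]
food_maps = [[0.1, 0.3, 0.2], [0.0, 0.1, 0.1], [0.1, 0.5, 0.2]]
pairs = leave_one_out_betas(money_maps, money_maps, food_maps)
t_stat = paired_t(pairs)
```

Because the left-out subject never contributes to locating the ROI, the extracted β values are not biased by the selection step.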
Areas encoding relative pricing and ESV.
Areas encoding both the ESVs of food and money were identified by conducting a conjunction analysis on the money and food ROIs; that is, we asked which voxels significantly track the ESV for money as well as (conjointly) the ESV for food. We used a significance level of p < 0.0005 per voxel (uncorrected) for each reward type, and then we examined the conjunction null of ESV effects for money and food with a significance threshold of p < 0.002 (conjoint probability, uncorrected).
To determine whether the overlapping areas found by the conjunction analysis described above represent the relative ESVs of money and food on a common scale, we sought to determine whether at indifference the BOLD level for a given monetary option was similar to the scaled BOLD level (using the scaling factor) for the amount of food the subject found to be of equal subjective value to that amount of money.
At behavioral indifference points (where a given amount of food and money were of equal subjective value to that subject), the BOLD level for the money option (ESV for money) should be similar (less noise and unspecified nonlinearities) to the scaled BOLD level for the food option. To demonstrate this, we calculated for each subject in each of the overlapping areas the estimated BOLD level for $0.50 (using all regression coefficients and the full GLM model) and the estimated BOLD level for the relevant food amount at indifference. We then scaled the estimated BOLD level of the food at indifference using the scaling factor. We repeated the above analysis five more times using all five different amounts of money used in this experiment ($2, $4.50, $10, $22.50, $50) and their inferred indifferent quantity of food (using the scaling factor). For each subject we then averaged, across all indifference points, the BOLD money levels and averaged the scaled BOLD food levels.
We then conducted two measurements across subjects. First, we correlated the average money BOLD level with the average scaled food BOLD level, and second, we computed the difference between the average money BOLD level and the average scaled food BOLD level. For comparison, we repeated these calculations using the nonscaled food BOLD level. We used Spearman's rank test to examine the significance of the correlations, and we used a one-sample t test to examine whether the mean of the distribution of the BOLD differences was significantly different from zero.
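The two across-subject measurements can be sketched as follows: a Spearman rank correlation (Pearson correlation of average ranks) and a one-sample t statistic on the differences. The BOLD values are invented for illustration, and the significance lookup is omitted.

```python
import math
from statistics import mean, stdev


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def one_sample_t(values):
    """t statistic for the null hypothesis that the mean is zero."""
    return mean(values) / (stdev(values) / math.sqrt(len(values)))


# Hypothetical per-subject average BOLD levels (percentage signal change).
money_bold = [0.21, 0.35, 0.10, 0.42, 0.27]
scaled_food_bold = [0.25, 0.31, 0.12, 0.40, 0.22]
rho = spearman_rho(money_bold, scaled_food_bold)
t_diff = one_sample_t([m - f for m, f in zip(money_bold, scaled_food_bold)])
```

In a common-scale area one would expect a high correlation together with a mean difference close to zero.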
Below are the equations for the GLM model used to calculate the BOLD levels for money and food at each indifference point. Note that Di is a dummy variable representing the money or food option, with regression coefficient βDi. The nonscaled food BOLD levels were calculated in a similar manner, but the scaling factor (Sf) was omitted:

\[ \mathrm{BOLD}_\$ = \beta_0 + \beta_{D_\$} D_\$ + \beta_1 \left(p \times X_\$^{\alpha_\$}\right) \]

and

\[ \mathrm{BOLD}_f^{\mathrm{scaled}} = S_f \times \left[\beta_0 + \beta_{D_f} D_f + \beta_2 \left(p \times X_f^{\alpha_f}\right)\right] \]

To further test our conclusions about coding in a common neural currency across reward types, we conducted an additional analysis more directly rooted in economic theory. In this second parallel analysis, we sought to determine whether the ratio of BOLD percentage signal change between money and food in these areas was correlated with the ratio of the relative rate of change in money and food EUs. This effectively allowed us to compare the ratio of the marginal BOLD signal to the ratio of the marginal utilities as assessed behaviorally, a traditional economic approach to examining exchange rates as a function of changing wealth. We hypothesized that if there was a significant correlation between the ratio of the behaviorally measured marginal utilities for food and money and the ratio of the BOLD signals to food and money, then we would have evidence of a single common neural representation of ESV because the marginal increase in BOLD signal and expected utility must therefore be linearly related. In essence, what we were hypothesizing was that because the measured BOLD signal (PSC) encodes changes in brain activity as a function of changes in ESV of the offered options (marginal increase in BOLD signal), and marginal utilities from behavior encode changes in EU as a function of the offered options, then these two marginal measures should correlate. For demonstrating common currency, we need to use the ratio between money and scaled (using the scaling factor) food marginal measures. In a common currency area, the ratio of the marginal increase in BOLD signal should correlate with the ratio of the scaled marginal utilities from behavior.
For each subject, we therefore calculated the scaled change in EU that results from a change in the lottery option. Importantly, this change in the EU reflected the relative pricing of food and money with the scaling factor extracted behaviorally during the mixed-type trials.
Formally, we achieved this by calculating the marginal utility for food and money in each of the 15 lotteries the subject faced inside the fMRI scanner and then rescaling food to money using the fitted scaling factor. We then computed the relative ratio of marginal expected utilities (MEUs) between money and food weighted by the scaling factor:

[MEU equations not recoverable from the source text]

where MEU is the marginal expected utility for each lottery option, Sf is the fitted scaling factor, αi is the fitted risk parameter, X is the reward magnitude of the lottery option, and p is the stated probability. In this way we were able to use the scaling parameter to model the change in BOLD signal that would be expected in an area that represented, in the level of aggregate neural activity, the expected subjective values of food and monetary rewards on a single common scale.
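The marginal quantities defined above can be written out explicitly under one assumption: that EU takes the power form used for the behavioral fits. The expressions below are a sketch under that assumption and are not necessarily the authors' exact equations. In particular, placing Sf in the denominator and averaging over the 15 lotteries (the overbars) are illustrative choices.

```latex
\begin{align}
EU(X, p) &= p\,X^{\alpha_i} \\
MEU(X, p) &= \frac{\partial EU}{\partial X} = \alpha_i\,p\,X^{\alpha_i - 1} \\
\text{ratio}_{\text{behavior}} &=
  \frac{\overline{MEU}_{\text{money}}}{S_f\,\overline{MEU}_{\text{food}}}
\end{align}
```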
To perform the neural comparison, for each of the overlapping ROIs, for each subject, we extracted the averaged BOLD PSC for each lottery option. (Note that, to a first approximation, the averaged BOLD PSC represents the average change in ESV when encountering each lottery.) We then averaged the 15 different BOLD levels (one for each offered lottery) for money and for food, resulting in a subject-by-subject rate of change in BOLD signal for the money ESV and a subject-by-subject rate of change in BOLD signal for the food ESV. Thereafter, we computed the ratio between the money and food ESV rates of change, which is the neural counterpart of the ratio of the marginal expected utilities. We performed this analysis in a time window located 8–10 s after presentation onset of the choice options, a time at which a standard hemodynamic response function to lottery option onset would peak.
We then determined whether there was a significant correlation, using p < 0.05, between the neuronal ratio and the scaled behavioral ratio. We also verified the significance level using a second robust regression analysis, which underweights outliers. Activity in a common ROI displaying a significant correlation may be considered as encoding a relative ESV of money and food on a common scale.
Correlation analysis.
On a subject-by-subject level, for each of the ROIs identified in the main GLM, we extracted the subject-specific model residuals after projecting out the variance explained by the subjective value regressor; that is, we projected out the BOLD activity that was correlated with subjective value and used only the regression residuals that remained. We then projected out from the (mean subtracted) residuals a global mean brain activation regressor (after subtracting out its mean signal as well) that was derived from the activity of all the gray matter in the brain of that subject. We then asked whether the six ROIs identified in the main GLM and the two overlapping regions in the vmPFC and striatum show correlated activity on a subject-by-subject level. Finally, we averaged the correlation coefficients and p values across subjects. The critical p value was corrected for multiple comparisons (28 comparisons, i.e., all pairwise combinations of the eight ROIs). The corrected threshold was therefore p < 0.0017857 (0.05 divided by 28). [The interested reader is referred to the paper by Fox and Raichle (2007) for more details on this fairly standard approach.]
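The residual-correlation procedure described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the projection is a simple one-regressor least-squares fit rather than the full GLM, and the use of Pearson correlation on the residuals is an assumption (the text does not name the correlation measure).

```python
from itertools import combinations
from math import sqrt


def center(v):
    """Subtract the mean from a sequence of numbers."""
    m = sum(v) / len(v)
    return [x - m for x in v]


def project_out(signal, regressor):
    """Return the mean-centered signal with the regressor's contribution removed."""
    s, r = center(signal), center(regressor)
    rr = sum(x * x for x in r)
    if rr == 0:
        return s
    b = sum(a * c for a, c in zip(s, r)) / rr  # least-squares slope
    return [a - b * c for a, c in zip(s, r)]


def pearson(a, b):
    """Pearson correlation (assumed measure; not specified in the text)."""
    a, b = center(a), center(b)
    num = sum(x * y for x, y in zip(a, b))
    den = sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den


def residual_correlations(roi_signals, sv_regressor, global_mean):
    """Pairwise correlations between ROI residuals for one subject.

    Follows the order given in the text: first remove the subjective value
    regressor, then remove the global gray-matter mean from what remains.
    """
    cleaned = {
        name: project_out(project_out(ts, sv_regressor), global_mean)
        for name, ts in roi_signals.items()
    }
    return {
        (a, b): pearson(cleaned[a], cleaned[b])
        for a, b in combinations(sorted(cleaned), 2)
    }


# Eight ROIs (six from the main GLM plus two overlap regions) give 28 pairs.
n_rois = 8
n_pairs = n_rois * (n_rois - 1) // 2
alpha_corrected = 0.05 / n_pairs  # Bonferroni-corrected threshold
```

With eight ROIs the number of unordered pairs is 28, which is where the corrected threshold of 0.05/28 comes from.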
Results
Behavioral task
We analyzed data from two behavioral sessions (n = 65 for the first session, n = 29 for two sessions) and one fMRI session (n = 19). Subjective hunger and thirst levels were assessed before each session using a VAS. A within-subjects analysis (repeated-measures ANOVA) of the VAS ratings across sessions revealed no significant effect of session (behavior, n = 29; food, F(1,28) = 0.09, p > 0.7; water, F(1,28) = 2.43, p > 0.12; behavior plus fMRI, n = 19; food, F(2,36) = 0.31, p > 0.7; water, F(2,36) = 1.12, p > 0.3).
Psychometric results: same-type trials
To both assess the subjective values of rewards for each subject and quantify the consistency of each subject's choices, we separated the data for each subject first by reward type and then by the probability of the lotteries. This allowed us to plot choice curves. We found that the likelihood a subject would select the lottery over the reference varied as a lawful function of reward magnitude. To take one example, the green dots in Figure 2A plot the probability that subject 181 would select a 22% chance of winning X dollars over having $2 for certain. As the circled point indicates, she selected a 22% chance of $10 over the sure $2 only 20% of the time, but as the magnitude of reward offered by the lottery increased, the likelihood she would select it also increased. When the probability of winning the lottery was increased to 50%, she switched to the lottery at lower overall reward magnitudes. In contrast, when the probability of winning the lottery was decreased, she switched to the lottery only at higher reward magnitudes. Figure 2, B and C, plots similar data for lotteries over food and water. All choice curves show a logit-like distribution and shift to the right as probability of reward decreases.
To characterize both the degree of risk aversion of each subject and the internal representation of value that would be required to account for each subject's degree of risk aversion, we used random EU theory (McFadden and Richter, 1990), which states that one can derive a utility function for choosers like ours, and fit it with a power utility function. Inasmuch as the choice behavior of our subjects was consistent, we know that they behaved exactly as if an internal subjective representation of value of this subject-specific type occurs (Von Neumann and Morgenstern, 1944).
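A choice model of the kind described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the utility is the power form p·X^α, the choice rule is a logistic function of the EU difference with slope β, and the $2 sure reference is taken from the monetary example above. The function names and the exact parameterization are hypothetical and may differ from the fitted model.

```python
from math import exp


def expected_utility(x, p, alpha):
    """Power-utility expected utility: probability times magnitude**alpha."""
    return p * x ** alpha


def p_choose_lottery(x, p, alpha, beta, reference=2.0):
    """Probability of picking the lottery over a sure reference amount.

    Assumed logistic rule: the larger the EU advantage of the lottery,
    the more often it is chosen; beta sets the steepness of the curve.
    """
    diff = expected_utility(x, p, alpha) - expected_utility(reference, 1.0, alpha)
    return 1.0 / (1.0 + exp(-beta * diff))


# Illustrative parameter values only (alpha near the example subject's 0.75).
low = p_choose_lottery(10, 0.22, alpha=0.75, beta=3.0)
high = p_choose_lottery(40, 0.22, alpha=0.75, beta=3.0)
```

Under these assumptions the model reproduces the qualitative pattern in the choice curves: the lottery is chosen more often as its magnitude grows, and lowering the win probability shifts the curve toward larger magnitudes.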
Figure 2A–C shows fits for our example subject, who shows a mild risk aversion (α ∼ 0.75; utility functions used to construct the fit choice curves are also shown; Fig. 2D–F). Figure 3 plots the distribution of risk aversion parameters (αi values from the power utility functions) obtained across our population. For most subjects, the model fit the data very well. The average pseudo-R² across all subjects was 0.57 ± 0.01 SEM. Figure 3A, which presents this parameter for monetary choices, agrees well with prior studies in the literature (Wu and Gonzalez, 1996; Holt and Laury, 2002, 2005). Interestingly, Figure 3, B and C, reveals a similar distribution of risk aversion for food and water, although no prior studies are available for comparison. These distributions represent idiosyncratic preferences across individuals for food and water. To our knowledge, this is the first formal estimation of risk attitudes (through the measurement of utility functions in consistent choosers) for primary rewards.
To assess within-subject variation in the degree of risk aversion across reward types, we plotted the risk aversion parameters (αi) for each subject against one another. Figure 3D plots risk aversion for monetary lotteries against risk aversion for food lotteries for each subject. These two measurements were highly correlated within subject (Spearman's rank test; n = 65; p < 0.0001), although there was also tremendous variation. As Figure 3, E and F, reveals, this was also true for the relationship between money and water (n = 65; p < 0.0001) and for the relationship between food and water (n = 65; p < 0.0001). While previous studies have suggested that risk attitudes are both idiosyncratic and domain specific (Weber et al., 2002), these data suggest that when a single common procedure is used to elicit risk attitudes across reward domains, humans remain idiosyncratic in their preferences, but preferences across reward types are highly correlated within subject; that is, a subject that is more risk averse in a given reward type will likely be more risk averse in another reward type. The measured risk aversion parameter for one reward type can thus be used to predict the degree of risk aversion that subject will show for a different reward type relative to other choosers. To our knowledge, this is the first demonstration of correlated risk attitudes across reward types.
To assess the level of randomness in our subjects' choices (indexed by the slope of the logistic functions) as a function of reward type, we computed the correlation between the slopes of the logit functions (the noise parameter, β) within each subject. As can be seen in Figure 3G–I, the β values across reward types, like the α values described above, are highly correlated (p < 0.0001 for all three pairwise comparisons).
A note about risk-aversion correlations
The power utility function described above was chosen because it has the property of constant relative risk aversion; the risk level is a constant percentage as a function of different ranges of magnitude of a good (yielding a constant Arrow–Pratt index in units of percent X). However, for this class of utility functions, different ranges of magnitude of a good might well result in different levels of assessed risk aversion with regard to the absolute magnitude of that good; that is, the local curvature of the specific type of utility function we used (in units of absolute X) changes as one moves out along that curve. Measurements of “risk aversion” can thus be seen as changing (in absolute units) as one moves along that curve in units of X (Pratt, 1964; Arrow, 1965).
To further test our hypothesis that risk attitudes within an individual are correlated across reward types, we repeated our estimation of each subject's risk attitude using an exponential utility function. This class of utility functions has the property of constant absolute risk aversion; the risk level is constant (in absolute terms) as a function of different ranges of magnitude of a good (Arrow–Pratt over absolute X). This measure allowed us to examine correlations in risk attitude (correlations in the free parameters of these functions across reward types) regardless of the magnitude range of a good. When performing this second analysis, we found that risk attitudes across subjects were still significantly correlated (Spearman's rank test; n = 65; p < 0.0001, for all three pairwise comparisons); that is, our previous conclusions held when risk aversion was measured in absolute terms (Arrow–Pratt over absolute X).
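The distinction between the two utility classes can be made concrete with the standard Arrow–Pratt measures. The derivation below is a sketch: the power form follows the text, but the specific exponential form shown (with free parameter a) is an assumed textbook parameterization, since the exact function the authors fit is not given here.

```latex
\begin{align}
A(x) &= -\frac{u''(x)}{u'(x)}, \qquad R(x) = x\,A(x) \\
u(x) = x^{\alpha}: \quad A(x) &= \frac{1-\alpha}{x}, \qquad R(x) = 1-\alpha \\
u(x) = 1 - e^{-a x}: \quad A(x) &= a, \qquad R(x) = a\,x
\end{align}
```

For the power function the relative measure R is constant while the absolute measure A falls with x, which is why the assessed absolute risk aversion depends on the magnitude range. For the exponential function the absolute measure is constant instead.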
Neurometric results: same-type trials
Nineteen of the subjects who completed two behavioral sessions participated in a third fMRI session. These were subjects for whom risk parameter estimates were stable across previous behavioral sessions and whose parameter estimates from the fMRI session remained consistent with the previous behavioral sessions (Table 1; for details, see Materials and Methods). The task conducted inside the fMRI scanner was similar to the same-type behavioral task (Fig. 1C). However, due to time limitations, water lotteries were omitted, and only 15 different choice options for money and food rewards were examined. No mixed-type lottery trials were conducted in the scanner.
The primary goal of scanning was to identify brain areas that represent the ESV (the neural correlate of EU) of the different reward types for each subject. Because we know that under these conditions our subjects behaved exactly as if the magnitude of any offered reward gave rise to a subjective reward magnitude (captured by αi), which is multiplied by probability and then used as the decision variable, we searched for the neural correlates of this well characterized decision variable. To this end, we computed, independently for each subject, the EU of each choice option she faced in the fMRI scanner. We then searched for brain areas in which the BOLD signal tracked the subject-specific EU for monetary offers (money ESV) and the EU for food offers (food ESV).
Figure 4A shows the brain areas significantly correlated with subject-specific measures of ESV for money across our population (p < 0.0005 per voxel; p < 0.05 corrected for cluster size). The top row of Figure 4A shows a subregion of the vmPFC that significantly tracks the money ESV. The middle and bottom rows show subregions of the striatum and PCC that track the money ESV as well. This finding replicates previous closely related work (Knutson et al., 2003; Daw et al., 2006; Huettel et al., 2006; Kable and Glimcher, 2007; De Martino et al., 2009; Levy et al., 2010).
Figure 4B shows the brain areas significantly correlated with food ESV. The top and middle rows of Figure 4B show subregions of the vmPFC and striatum. Importantly, the bottom row shows a hypothalamic region that also tracks the food ESV. Note that as in previous studies, we identified a region in the occipital cortex (OC) that tracks the ESV for money and food (data not shown), but this effect was mainly driven by reward magnitude. We note that when the ROIs were defined by using reward magnitude as a regressor (in a separate GLM), the only significant β values observed for different predictors applied (ESV, EV, reward magnitude, and probability) were in the OC (data not shown).
Activation in the hypothalamic region was significant only for food rewards, and activation in the PCC was significant only for monetary rewards. The hypothalamic area is known to be important for the regulation of gut hormones, satiety, and eating behavior (McMinn et al., 2000; Schwartz et al., 2000; Harrold, 2004), but its involvement has not been observed previously in studies of reward valuation in humans. The PCC area is one that has been widely identified in previous monetary studies (Knutson et al., 2003; Daw et al., 2006; Huettel et al., 2006; Kable and Glimcher, 2007; De Martino et al., 2009; Levy et al., 2010).
To determine whether the BOLD signals in our ROIs uniquely track one reward type, we performed a secondary analysis. We used a leave-one-out cross-validation method to extract β values in an unbiased way from the ROIs identified in our main GLM (for details, see Materials and Methods). There were no significant differences between the β values in the vmPFC or in the striatum (although there was a significant trend in the money vmPFC and in both striatal ROIs). Importantly, we found that the food β value was significantly higher than the money β value in the hypothalamic region (Fig. 4E), whereas in the PCC we found the opposite result: the money β value was significantly higher than the food β value (Fig. 4E). These data thus suggest the existence of a novel valuation region in the hypothalamic region that responds mainly to food, and the data also suggest that the PCC responds mainly to money. This constitutes evidence, explored below, that valuation networks segregated by reward type occur in the human brain.
Common areas coding ESV independent of reward type
Although the preceding analysis suggests the existence of a specific brain network for representing the value of food rewards, it cannot tell us whether there are brain areas in which activation encodes the ESVs of different reward types on a single common scale, that is, areas that can literally compare apples and oranges, or in our case, food and money. Both economic (Von Neumann and Morgenstern, 1944; Samuelson, 1947) and neurobiological (Glimcher et al., 2005; Glimcher, 2011) theory suggest that such representations must exist as part of a core common valuation system.
To search for such an area, we looked for voxels in the money and food areas identified above that significantly tracked the ESV for both rewards [following an approach pioneered by Chib et al. (2009) with a conjunction analysis]. Figure 5 shows the voxels that significantly tracked both (conjointly) the ESVs for money and food (p < 0.0005 per voxel for each reward type, uncorrected; for details, see Materials and Methods): subregions of the vmPFC (Fig. 5A) and striatal (Fig. 5B) clusters described above. To verify that this group-level overlap reflects overlap at the single-subject level, we also searched for overlapping areas on a subject-by-subject level. In 15 of our 19 subjects, there was an overlap region in the vmPFC, and in 10 subjects there was an overlap region in the striatum (Table 2). Overlapping money and food ESV regions thus appear to be a feature of our subjects.
Our demonstration of overlapping money and food ESV regions replicates previous studies (Chib et al., 2009; FitzGerald et al., 2009; Kim et al., 2010) and might suggest that the overlapping regions represent ESVs of money and food on a common scale for comparison. However, this is only a necessary but not a sufficient condition for a common-scale representation. From this we can only state that the overlapping voxels significantly track the ESVs for money and for food. It does not mean that they do so on a common scale. All three previous papers showed that an overlap region in the vmPFC or striatum shows linear correlations with two or more independent classes of rewards. These are two sets of independent linear correlations. It is important to note, however, that these independent correlations cannot suggest that the overlapping area codes all rewards with regard to a single common metric. To establish that conclusion, one would have to be able to relate the relative values of the two rewards to the relative levels of activation in these two independent linear regressions; that is, the relative levels of activity in these areas to food and money rewards must also reflect the relative values of food and money rewards to that chooser and predict trade-off choices between these two classes of rewards.
Psychometric results: mixed-type trials
To investigate whether we can identify brain areas that represent ESVs on a single common scale, we must first derive, for each subject, the relative values of the two different reward types we examined. We therefore had each subject during the two behavioral sessions make choices on mixed-type lotteries. In these trials, subjects had to choose between a sure $0.50 (reference) and a stated probability of either winning a stated (and relatively large) amount of food (or water) or getting nothing (lottery). Using the fitted risk aversion parameters from the larger same-type lottery data set, we estimated the relative pricing between money and food (and water) for each subject at indifference. We determined, in essence, what amount of food (and water) was equal (the subject was indifferent) in value to a sure gain of $0.50 across the range of food and water reward probabilities and magnitudes that we examined above. Deriving this scaling factor thus allowed us to plot the utility functions for money and food/water on a common scale on the same graph for each subject.
Note that the scaling factor is the relative price in subjective value space (not value space) between money and food/water only at indifference, only for those pairs of monetary and consumable rewards subjects viewed as equally desirable; that is, the scaling factor equates the subjective value of X units of food (for which the subject was indifferent between that amount and $0.50, using his choices in behavior) to the subjective value of $0.50.
Figure 6A–D plots the mixed-type choice data for the subject shown in Figure 2. Figure 6, A and B, plots the probability the subject will choose the food and water lotteries as a function of reward magnitude and probability. In addition, it plots the choice curves using the risk parameter (αi) derived from the same-type lotteries (for this same subject) and the newly estimated scaling factor parameter fit to the mixed-type choice data (and the newly fit logit slope, β). As the data reveal, we were able to estimate the relative pricing between money and food/water for this subject. This allowed us to plot the utility functions for money and food (Fig. 6C) and for money and water (Fig. 6D) on a common scale. More formally, this subject behaved in the mixed-type lotteries as if food, money, and water gave rise to the subjective valuations on a single common scale indicated in Figure 6, C and D. For this subject, the scaling factor between money and food was 0.15, and the risk parameters for money and food were 0.77 and 0.72, respectively. A pack of M&M's (∼40 pieces) was equivalent in value to approximately $2.6; this 4 h food-deprived subject valued M&M's at approximately two and a half times the market value.
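The money equivalent quoted above can be approximated from the reported parameters. The sketch below assumes that the scaled food utility is Sf·X^α_food, that the money utility is M^α_money, that both amounts are sure (p = 1), and that X is counted in individual M&M pieces. The units of X and the function name are assumptions made for illustration.

```python
def money_equivalent(food_units, scaling, alpha_money, alpha_food):
    """Dollar amount whose subjective value matches a sure amount of food.

    Solves M**alpha_money = scaling * food_units**alpha_food for M.
    """
    sv_food = scaling * food_units ** alpha_food
    return sv_food ** (1.0 / alpha_money)


# Reported values for the example subject: Sf = 0.15, alpha_money = 0.77,
# alpha_food = 0.72; a pack is taken to be about 40 pieces.
pack_value = money_equivalent(40, scaling=0.15, alpha_money=0.77, alpha_food=0.72)
```

With these rounded parameters the result is roughly $2.7, in line with the approximate figure given in the text; the small gap is expected given rounding of the reported parameters.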
Figure 6, E and F, plots the distribution of scaling factor parameters obtained across our population for food and water. Importantly, most subjects valued food and water in the laboratory above market prices. This is important because it suggests that subjects instructed to forgo food and water showed an increased valuation for food and water rewards.
Neurometric results: mixed-type trials
To claim that activity in one or more of the conjunction areas identified above represents a common measure of ESV independent of reward type, one must demonstrate that at any behaviorally identified indifference point between a given amount of money and food, the BOLD level representing the money ESV is similar to the BOLD level representing the scaled food ESV (using the scaling factor identified from the mixed-type trials in behavior outside the scanner).
To this end, we first computed for each subject the money BOLD level and the scaled food BOLD level at six indifference points and averaged them. We then correlated the average money BOLD level with the average scaled food BOLD level across subjects. Second, for each subject, we computed the difference between the average money BOLD level and the average scaled food BOLD level. For comparison, we repeated this calculation using the nonscaled food BOLD level.
If activity in a given area encodes both reward types on a common scale appropriate for direct comparison, the BOLD levels should be correlated, and the difference between BOLD levels should be zero when the food BOLD level is scaled by the behaviorally derived scaling factor. However, if the nonscaled version of the BOLD signal is used in this comparison, then the mean difference between the two classes of BOLD signal should not be zero. Only an area that shows (across subjects) both a significant correlation and a mean difference in BOLD levels of zero (when scaled by the behaviorally derived scaling factor) could represent the subjective value of money and food on a common scale appropriate for comparison in choice.
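The two across-subject criteria just described (a rank correlation and a mean difference of zero) can be sketched with standard-library Python. This is an illustration only: the function names are hypothetical, the inputs are assumed to be one averaged BOLD level per subject with the food level already scaled, and only the test statistics are computed (p values would come from a statistics package).

```python
from math import sqrt
from statistics import mean, stdev


def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den


def spearman_rho(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def one_sample_t(diffs):
    """t statistic for the hypothesis that the mean difference is zero."""
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def common_scale_stats(money_bold, scaled_food_bold):
    """Return (Spearman rho, t statistic of money minus scaled food)."""
    diffs = [m - f for m, f in zip(money_bold, scaled_food_bold)]
    return spearman_rho(money_bold, scaled_food_bold), one_sample_t(diffs)
```

An area consistent with a common scale would show a high rho together with a t statistic near zero for the scaled comparison, and a t statistic far from zero when the nonscaled food levels are substituted.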
As can be seen in Figure 7, there is a significant correlation in the vmPFC between the average money BOLD level and the average scaled food BOLD level at indifference. More importantly, the mean of the distribution of the difference between the average money BOLD level and the average scaled food BOLD level is very close to (not significantly different from) zero. Note for comparison that when we use the nonscaled food BOLD level, the mean of the distribution of the difference in BOLD level is significantly different from zero.
We saw such a significant relationship only in the vmPFC, not in the striatum or the OC. In the striatum, the mean of the distribution of the differences (when using the scaled food BOLD level) is also close to zero, but the correlation of the BOLD levels is not significant. In the OC, the mean of the distribution of the differences is not zero, and the correlation is not significant. Hence, from this we can conclude that the strongest evidence for a common representation of subjective value of food and money in this data set is in the vmPFC.
To strengthen the conclusion that activity represents different classes of rewards in a convergent common currency, we conducted an additional analysis more directly rooted in economic theory. In this analysis we tested, for each subject in each of the common areas described above and shown in Figure 8, whether the ratio of the averaged BOLD PSC between money and food across all choices was correlated with the ratio of the averaged rate of change in money and food EU measured behaviorally. This essentially provides a comparison of the ratios of the marginals of the BOLD signal and the ratio of the marginals of the utilities. We measured the rate of change in EU because the BOLD PSC is a representation, to a first approximation, of the rate of change in ESV. Note also that while our definition of these regions of interest was predicated on a representation of ESV, the procedure makes no assumption about the value of this ratio we hoped to measure.
Figure 8 plots the subject-by-subject relationship between the neuronal relative activation of money and food ESVs and the behavioral relative rate of change in money and food EUs. Figure 8A–C represents this within the common areas found in the vmPFC, striatum, and OC, respectively (as shown in Fig. 5). Only in the vmPFC was there a significant correlation (n = 19; p < 0.05). Using this more economics-oriented analysis, we thus show results similar to those presented in the previous analysis. BOLD activity in a subregion of the vmPFC appears to represent the ESV of money and food on a single common scale in a manner that could, in principle, support choices between food and money, a neural mechanism that would allow comparisons of apples and oranges, so to speak.
Valuation networks: correlation analysis
Our observation that BOLD activity in the hypothalamic region and PCC correlates with the ESV of a specific reward type suggests that independent networks of brain areas place values on different classes of rewards. Our observation that a subregion of the vmPFC encodes the relative ESVs of food and money in a way that matches the relative EUs of food and money suggests (as did previous work) that these independent networks must converge. To further examine the notions that (1) separable neuronal networks compute the ESVs for money and food and (2) that these networks converge on a core common neuronal valuation system, we asked whether the identified six ROIs from the main GLM and the two overlapping regions in the vmPFC and striatum show correlated activity. To conduct this analysis without committing a statistical error, we conducted the correlation analysis in a space orthogonal to subjective value in each area. We orthogonalized the BOLD signal by projecting out from each ROI the subjective value vector and then conducted the correlation on the remaining signal. In other words, we used only the residuals of the regression after projecting out all the variance explained by the subjective value regressor.
Figure 9 plots the significant correlation coefficients (averaged across all subjects) between all ROIs. A higher correlation coefficient is represented with a thicker line.
A few interesting observations emerge from the correlation analysis. First, a functional connection between money and food vmPFC and between both of these areas and a common area within the vmPFC was observed. A similar correlational structure can be found in the striatal areas. This may suggest a principal loop connecting food and money valuation networks and the conversion to a unified value representation. Second, a correlation is observed between money vmPFC and money striatum, presumably representing part of the specific money network. Similarly, there is a connection between the food striatum and food hypothalamic region, presumably representing part of the specific food network. Finally, while our GLM-based analysis clearly indicates the existence of a specific monetary value-related representation in the PCC, our correlation analysis may suggest that the PCC is not a part of the core valuation system because it is not functionally well connected to the other key valuation areas. This is a finding that may be relevant to the hypothesized role of the PCC in reward-related attention and choice (Hayden et al., 2008). We stress, however, that this is not a general statement about the anatomical connections of the PCC with the other areas.
It is also important to note that when we used a less restrictive significance threshold or used more permissive regressions, we identified additional connections between the areas that might have been expected on a priori grounds. What Figure 9 shows are the most important correlations in our data set using a very strict significance threshold. There is every reason to believe that the actual valuation network is richer than the one we describe in this way.
Previous anatomical and physiological work clearly indicates that subareas of the vmPFC and striatum not identified in this analysis are highly interconnected. Indeed, with a less restrictive analysis, we also see evidence of these connections (data not shown). The aim of the analysis, however, was to identify the most significant correlations that arise when subjects perform this particular task set. Our results from this analysis should thus not be taken as a representation of the underlying anatomical connections between these areas.
Discussion
Perhaps surprisingly, we found that when a single procedure is used to elicit risk attitudes, individuals have idiosyncratic preferences, but for each subject, preferences across reward types are highly correlated. Our data suggest that distinct brain networks track the ESVs for money and food, but that these distinct representations converge on a common representation in the vmPFC that is appropriate for choice. Our correlational analysis strengthens this conclusion that distinct valuation networks converge in the vmPFC, and perhaps the ventral striatum, where a common utility-like representation arises.
Our finding that the fitted risk parameters for different classes of rewards are correlated is novel, and we believe that it has important behavioral and neural implications. It suggests that the measured risk aversion parameter for one reward type can be used to predict the degree of risk aversion that a subject will show for a different reward type relative to other choosers. This implies that we can recover subjects' risk preferences for unmeasured reward types based on choices subjects make in other domains. The neural implication of this pan-domain level of risk aversion may be that humans use a unified brain mechanism for risky choices of this kind.
Our behavioral results may appear to conflict with a previous study (Weber et al., 2002), which hypothesized that risk attitude is highly domain specific. That study, however, used very different techniques to estimate risk attitudes in different domains. We used a single technique (and one that explicitly relied on symbolically communicated probabilities) and examined fewer, and much more similar, kinds of decisions. Our data suggest commonalities in behavioral preferences that may derive from commonalities in neural mechanism used under these highly convergent conditions. The data of Weber et al. (2002), in contrast, suggest significant independence of the mechanisms for preference generation. Both conclusions can be viewed as supported by our neural observations, which suggest the existence of independent mechanisms for preference generation that converge to a common point late in the process of valuation.
We found that the vmPFC, striatum, and PCC represent the ESV for monetary gains, an observation compatible with a large body of existing evidence. For example, in choice tasks, these areas have demonstrated increased activity for increasing gains or decreasing losses during choices between small and large monetary wins and losses (Elliott et al., 2000; Kuhnen and Knutson, 2005; Liu et al., 2007; Tom et al., 2007; Venkatraman et al., 2009). The brain activity in these areas has also been shown to track choice probability (Hsu et al., 2005; Huettel et al., 2005; Preuschoff et al., 2006), the EV of the choice options (Hsu et al., 2005; Knutson et al., 2005; Daw et al., 2006; Preuschoff et al., 2006; Tobler et al., 2007; Tom et al., 2007; Luhmann et al., 2008), the ESV of delayed monetary rewards (McClure et al., 2004; Kable and Glimcher, 2007), the ESV of risky and ambiguous choice tasks (Hsu et al., 2005; Huettel et al., 2006; Levy et al., 2010), and changes in marginal utility (Pine et al., 2009), and to code value in a reference-dependent manner (Tom et al., 2007; De Martino et al., 2009). All of these decision variables are tightly correlated with ESV (Glimcher, 2011).
In nonchoice tasks, food rewards produce increased activity in the vmPFC (Kringelbach et al., 2003). Consumption of palatable foods results in a greater activation of the vmPFC (O'Doherty et al., 2002). Administration of pleasant tastes activates the vmPFC (O'Doherty et al., 2001; Zald et al., 2002), and meal consumption has been shown to be associated with increased neuronal activity in the vmPFC (Del Parigi et al., 2002). The striatum has also been shown previously to respond to the anticipation of primary rewards (O'Doherty et al., 2002), and activity here is correlated with juice preferences (O'Doherty et al., 2006), meal pleasantness ratings (Small et al., 2003), subjective preferences of goods (Knutson et al., 2007), and food craving (Pelchat et al., 2004).
Even the sight of food cues has been shown to activate the vmPFC and striatum (Killgore et al., 2003; Simmons et al., 2005; Goldstone et al., 2009; Siep et al., 2009). Studies involving choice tasks also demonstrate the involvement of the vmPFC and striatum in primary reward value coding. Using food items as rewards, the activity in the vmPFC was correlated with subjects' willingness to pay (Plassmann et al., 2007; Hare et al., 2009) both for appetitive and aversive values (Plassmann et al., 2010), reported experienced pleasantness (Plassmann et al., 2008), decision values (Hare et al., 2008), and the subjective value of delayed juice rewards (McClure et al., 2007), while the striatum was correlated with food reward prediction errors (Hare et al., 2008). Together, these data strongly suggest that activation in the vmPFC and the ventral striatum is correlated with preferences as measured behaviorally. Goods, events, and even images that subjects prefer are associated with elevated activations in these areas. Existing economic theory (Von Neumann and Morgenstern, 1944; Savage, 1954; McFadden and Richter, 1990; Glimcher, 2011) tells us what the precise underlying representation of these preferences would have to look like if these representations were causally responsible for choice itself. These activations would have to be linearly correlated with EU when choice behavior is coherent, and correlated with ESV at all times. The data thus suggest that what these areas actually encode is the ESV signals that guide choice.
The neural structures for valuing different kinds of goods must, however, be at least partially distinct (Glimcher et al., 2005; Rangel et al., 2008). The neural structures for valuing water rewards, for example, must include circuits sensitive to blood osmolality, which are unlikely to participate in the valuations of monetary rewards. Circuits that guide food choices must be sensitive to energy balance. Our neural examination of circuits correlated with the ESVs of food and monetary rewards indicates the existence of just such distinct neural circuits.
The hypothalamus has long been recognized as critical for the maintenance of homeostasis, energy balance, inhibition of hunger, food craving, and feeding (Schwartz et al., 2000; Williams et al., 2001; Morton et al., 2006). Administration of intravenous glucose to hungry humans, or liquid food administered orally, has been shown to cause a decrease in hypothalamic activity (Tataranni et al., 1999; Gautier et al., 2000; Liu et al., 2000) in a dose-dependent manner (Smeets et al., 2005a). Using high-resolution imaging, Smeets et al. (2005a,b) even demonstrated that this decrease in activity is observed only in the dorsal hypothalamus.
To our knowledge, this study is the first demonstration that there are distinct valuation areas for different reward types. We observed that activity in the hypothalamic region is tightly correlated with the ESV of food rewards. Activity in this area, however, is not significantly related to the ESV of monetary rewards in our data set. In a similar way, we observed that activity in the PCC is tightly correlated with the ESV of money rewards but not significantly related to the ESV of food rewards. Although perhaps unsurprising, this is a fairly unambiguous indication that previous scholars have been correct in hypothesizing distinct neural circuits for valuing different types of rewards. The significance of this finding is thus twofold. First, this involvement of the hypothalamic region in choice tasks suggests that primary-reward value computations important for decision making take place in the hypothalamus. Second, our evidence suggests the existence of a neural valuation network specific to the domain of food rewards and a neural valuation network specific to the domain of money rewards.
We have shown here that although there is some differentiation between the valuation systems for money and food, there are also overlapping regions that track the ESV of money as well as of food in a single common currency. Our data thus suggest, in line with previous theoretical and empirical work (Glimcher et al., 2005; Chib et al., 2009; FitzGerald et al., 2009; Kim et al., 2010; Glimcher, 2011) that there is a core valuation system, likely fed by multiple domain-specific valuation subnetworks. This core system represents subjective value irrespective of reward type.
The first demonstration of a core common network for valuing multiple reward domains in a single firing rate-based currency came from work in rhesus macaques. In that species, activity in parietal area LIP was shown to reflect the aggregated subjective value of expected social rewards, threats, and fluid rewards (Klein et al., 2008). In a related study in humans, it was shown that an area in the striatum encodes subjects' social reputation and monetary rewards (Izuma et al., 2008). Additional studies demonstrated a representation in the vmPFC or in the striatum of the value of different types of goods (Chib et al., 2009; FitzGerald et al., 2009; Kim et al., 2010; Smith et al., 2010). Finally, overlapping representation of prediction errors for different classes of goods has been demonstrated in the dorsal striatum (Valentin and O'Doherty, 2009). All these studies suggest the existence of a core valuation system that represents the values of many different kinds of rewards in the striatum and vmPFC.
Like these previous studies, this study also demonstrates that subregions of the vmPFC and striatum represent the ESVs of both food and monetary rewards. But we believe that we advance these previous findings by adding a crucial step. Overlap between the areas representing the ESVs of food and money is a necessary property of a common value representation area that actually supports domain-independent choice behavior, but demonstrating such overlap is not sufficient. It is also necessary to demonstrate that these neural activations lie on a common scale that can account for observed choices between different kinds of rewards and that reflects the relative values of food and money rewards to a given chooser. The data presented here show, for the first time, that a specific subregion of the vmPFC tracks the ESV of money and food on a common scale in humans, a requisite for comparing options of different reward types and choosing the option with the highest ESV.
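The distinction between overlapping representations and a common scale can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the analysis used in this study: it assumes that activation in an area is linear in value, and that a chooser's behavior yields an exchange rate between food and money. The function names, parameters, tolerance, and numbers are all hypothetical.

```python
# Illustrative sketch only. All names and numbers are hypothetical and are
# not data or methods from this study.

def represents_both(slope_money: float, slope_food: float) -> bool:
    """Overlap: activation increases with value for both reward types."""
    return slope_money > 0 and slope_food > 0


def on_common_scale(slope_money: float, slope_food: float,
                    exchange_rate: float, rel_tol: float = 0.1) -> bool:
    """Common scale: equally valued options evoke (about) equal activation.

    slope_money   -- assumed activation change per dollar
    slope_food    -- assumed activation change per unit of food
    exchange_rate -- dollars the chooser treats as equal in value to one
                     unit of food, as inferred from choice behavior
    rel_tol       -- arbitrary relative tolerance chosen for illustration
    """
    if exchange_rate <= 0:
        raise ValueError("exchange_rate must be positive")
    if not represents_both(slope_money, slope_food):
        return False
    activation_food = slope_food * 1.0             # one unit of food
    activation_money = slope_money * exchange_rate  # its money equivalent
    return abs(activation_food - activation_money) <= rel_tol * activation_money


# Hypothetical area A: overlapping and on a common scale.
area_a = on_common_scale(slope_money=0.2, slope_food=0.5, exchange_rate=2.5)
# Hypothetical area B: overlapping, but the scales do not match behavior.
area_b = on_common_scale(slope_money=0.2, slope_food=0.1, exchange_rate=2.5)
```

In this sketch both hypothetical areas pass the overlap check, but only area A also satisfies the common-scale condition, which is the additional property needed to compare options across reward types.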
We demonstrated that our identified valuation areas are correlated in a specific way. We found evidence in these correlations for a specific money network and for a specific food network. Furthermore, we have shown that these areas with reward-specific correlations are themselves correlated with the “common areas.” We suggest that this is additional evidence that distinct domain-specific valuation networks converge to support overall choice behavior.
One final observation suggested by the correlation analysis involves value-related representations in the PCC. Numerous studies in humans (McClure et al., 2004; Kable and Glimcher, 2007; Levy et al., 2010) and animals (McCoy and Platt, 2005a; Hayden et al., 2008; Pearson et al., 2009) have identified the PCC as an area that encodes an ESV-like signal. We have demonstrated that in humans the PCC primarily encodes the ESV for money. The correlation analysis suggests that this signal in the PCC is not functionally well connected with the other valuation areas, an observation compatible with the hypothesis that activity in the PCC is involved in attention to choices and postchoice evaluation but not in encoding expected subjective values for choice (McCoy and Platt, 2005b). However, we stress that this is only a hypothesis derived from our data and not proof of the actual role of the PCC in decision making or of its general anatomical connections with other brain areas.
So how do we make choices between “apples and oranges?” The available data suggest that each individual assigns different subjective values to “apples and oranges” and to the relative value between them, but that each individual uses a common strategy for making choices. When facing choices between different reward types like money and food, humans appear to compute the subjective value of each reward type using domain-specific (but partially overlapping) valuation networks. Thereafter, it seems likely that subjective value information from each reward domain is incorporated into a single common network, which represents all subjective values on a common neuronal scale in a subregion of the vmPFC (and probably in other areas as well). It is this final common neural scale, then, that can serve as the substrate for choice.
Footnotes
- This work was supported by National Institutes of Health Grant R01-NS054775 (P.W.G.). We thank D. Burghart and R. Rutledge for their valuable comments on this study.
- The authors declare no competing financial interests.
- Correspondence should be addressed to Dino J. Levy, Center for Neural Science, New York University, 4 Washington Place, New York, NY 10003. dino.levy{at}nyu.edu