Abstract
A widely observed phenomenon in decision making under risk is the apparent overweighting of unlikely events and the underweighting of nearly certain events. This violates standard assumptions in expected utility theory, which requires that expected utility be linear (objective) in probabilities. Models such as prospect theory have relaxed this assumption and introduced the notion of a “probability weighting function,” which captures the key properties found in experimental data. This study reports functional magnetic resonance imaging (fMRI) data showing that the neural response to expected reward is nonlinear in probabilities. Specifically, we found that activity in the striatum during valuation of monetary gambles is nonlinear in probabilities in the pattern predicted by prospect theory, suggesting that probability distortion is reflected at the level of the reward encoding process. The degree of nonlinearity reflected in individual subjects' decisions is also correlated with striatal activity across subjects. Our results shed light on the neural mechanisms of reward processing, and have implications for future neuroscientific studies of decision making involving extreme tails of the distribution, where probability weighting provides an explanation for commonly observed behavioral anomalies.
Introduction
A central question in the social and biological sciences is how the likelihood and value of outcomes are combined to make risky choices. For humans, these choices range widely, from voting, gambling, and buying stocks and insurance to searching for jobs and mates. The standard approach, expected utility theory (EU), assumes that outcomes x are valued nonlinearly by a utility function u(x), but are weighted by their objective probabilities (Savage, 1954).
There is substantial evidence in economics and decision-making research, however, that the hypothesis of expected utility being linear in probabilities is systematically wrong, in a reliable direction. This stylized fact was originally established by Maurice Allais via the “common ratio effect” (Allais, 1953). Consider a decision maker who prefers a sure (p = 1) gain of $100,000 over a coin toss (p = 0.5) for $300,000 but who also rejects a 0.02 chance of $100,000 for a 0.01 chance of $300,000. If these choices are consistent with choosing the gamble with the highest expected utility, the first choice implies u($100,000) > 0.5u($300,000), while the second choice implies 0.02u($100,000) < 0.01u($300,000). The two inequalities are clearly at odds, since both sides of the second are the same as those in the first multiplied by 0.02 (Prelec, 1998).
The common ratio pattern can be reconciled by the plausible assumption that people apply nonlinear “decision weights” π(p) to objective probabilities p, so that the ratio π(0.02)/π(0.01) is much smaller than π(1)/π(0.5) (cf. Rubinstein, 1988). An inverse S-shaped nonlinear function was first suggested experimentally (Preston and Baratta, 1948), is a central feature of prospect theory (Kahneman and Tversky, 1979), and has been replicated in subsequent experimental and field studies including ours (see Fig. 1A,B and supplemental material, available at www.jneurosci.org). Small probabilities are typically overweighted while high probabilities are underweighted, with a crossover point p*, at which a probability is subjectively weighted at its objective value, π(p*) = p*, around 1/3 (Kahneman and Tversky, 1979; Kagel et al., 1995; Prelec, 1998; Starmer, 2000).
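The reconciliation can be checked numerically. The sketch below is illustrative only: it assumes the one-parameter Prelec form and the group estimates reported later in the paper (α = 0.77, ρ = 0.57), and the helper names (`prelec`, `utility`, `prefers_first`) are hypothetical. It compares the two Allais choices under linear weights and under Prelec weights.

```python
import math

def prelec(p, alpha=0.77):
    """One-parameter Prelec weight: exp(-(ln(1/p))**alpha)."""
    return math.exp(-(math.log(1.0 / p) ** alpha))

def utility(x, rho=0.57):
    """Power utility x**rho (rho taken from the paper's group estimate)."""
    return x ** rho

def prefers_first(gamble_a, gamble_b, weight):
    """True if gamble_a has the higher weighted utility under `weight`."""
    (p_a, x_a), (p_b, x_b) = gamble_a, gamble_b
    return weight(p_a) * utility(x_a) > weight(p_b) * utility(x_b)

def linear(p):
    """Objective (expected utility) weighting."""
    return p

high_pair = ((1.0, 100_000), (0.5, 300_000))    # sure thing vs coin toss
low_pair = ((0.02, 100_000), (0.01, 300_000))   # same pair scaled by 0.02

# Under linear weights the two choices must agree with each other.
eu_choices = (prefers_first(*high_pair, linear), prefers_first(*low_pair, linear))
# Under Prelec weights the preference can reverse between the two pairs.
pt_choices = (prefers_first(*high_pair, prelec), prefers_first(*low_pair, prelec))

ratio_high = prelec(1.0) / prelec(0.5)
ratio_low = prelec(0.02) / prelec(0.01)
print(eu_choices, pt_choices, round(ratio_high, 2), round(ratio_low, 2))
```

With these assumed parameters, the weight ratio at low probabilities falls below the ratio at high probabilities, which is what allows the preference reversal.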
Despite its clear relevance in a wide variety of settings, few studies have directly studied nonlinear weighting of probabilities, especially compared with the number of studies on risk and reward as a whole. There is much evidence that a number of brain regions are sensitive to expected reward (or “utility”). Arguably the most well established are dopaminergic regions such as the striatum and midbrain structures (Knutson et al., 2001; Abler et al., 2006; Tobler et al., 2007). However, most studies do not sample sufficiently near the probability endpoints to detect the theoretical nonlinearity. The two existing studies on probability weighting also have not shown neural responses to probabilities resembling the smoothly increasing functions that typically fit behavior well. Paulus and Frank (2006) focused on between-subjects measures and found that activity in anterior cingulate correlated with degree of nonlinearity across subjects. Berns et al. (2008) used probabilities of shock and found a number of regions exhibiting flat responses to probability, but were not able to statistically reject the null linear hypothesis. In this study, we used a parametric design that varied the probabilities and outcomes of two gambles in a binary choice task. Our statistical analysis focused on separating the weighted form of expected utility into two components: a portion linear in probabilities, and one that is nonlinear in probabilities. The null hypothesis of EU predicts significant responses only to the linear portion. Under the alternative hypothesis of nonlinear weighting, we expect to find regions correlated with both the linear and nonlinear portions in a manner predicted by existing models of probability weighting.
Materials and Methods
The experimental sample was 21 subjects (11 female). Their mean (SD) age was 29.6 (7.5) years. Informed consent was given through a form approved by the Institutional Review Board at Caltech. Subjects were recruited from an online bulletin board (see supplemental material, available at www.jneurosci.org).
Experimental paradigm
The experiment consisted of 120 self-paced trials (Fig. 2). In each trial, subjects chose between two simple gambles, (p_{1}, x_{1}) and (p_{2}, x_{2}). A gamble pays off $x with probability p and $0 otherwise. Subjects were first presented a screen containing only (p_{1}, x_{1}). This served to isolate the brain response evaluating (p_{1}, x_{1}) without confounding it with evaluation of the second gamble (p_{2}, x_{2}) or the process of choice. The values of p_{1} were {0.01, 0.1, 0.3, 0.5, 0.8, 0.95}, the values of $x_{1} were {10, 20, 50, 100}, and they were combined factorially to form 24 pairs, each of which was presented five times.
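As a quick check on the design arithmetic, the factorial schedule of first gambles can be sketched as follows. The probability and payoff levels come from the text; the shuffling, the seed, and the function name `build_first_gamble_schedule` are assumptions made for illustration, since the trial ordering procedure is not described here.

```python
import itertools
import random

PROBABILITIES = [0.01, 0.1, 0.3, 0.5, 0.8, 0.95]
AMOUNTS = [10, 20, 50, 100]   # dollars
REPETITIONS = 5

def build_first_gamble_schedule(seed=0):
    """Return a shuffled list of (p1, x1) first gambles.

    Six probabilities crossed with four payoffs give 24 pairs; five
    repetitions of each give the 120 trials reported in the text.
    The random ordering is an assumption, not the authors' procedure.
    """
    pairs = list(itertools.product(PROBABILITIES, AMOUNTS))
    trials = pairs * REPETITIONS
    random.Random(seed).shuffle(trials)
    return trials

trials = build_first_gamble_schedule()
print(len(trials), len(set(trials)))
```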
On 12 randomly chosen trials (10% of the trials), subjects were asked after the first-gamble presentation to indicate whether p_{1} > 0.4. This was done to ensure that subjects were paying attention to the first-gamble stimulus. The final screen presented both the first gamble and a second gamble, and subjects chose one of the gambles. The probability and reward levels of the second gamble, (p_{2}, x_{2}), were varied across trials, and were chosen so that its weighted expected utility (see model below) would be close to that of the first gamble, which facilitated powerful behavioral estimation of the relevant parameters of the experiment. At the end of the experiment, one choice round was randomly selected and the gamble chosen in that round was resolved to determine the subject's payment. Average earnings for subjects were $21.48 ± $5.49 plus the participation fee of $20 (for more detail, see supplemental Methods, available at www.jneurosci.org as supplemental material).
Behavioral analysis
A stochastic choice model was used to infer the probability weighting function from behavior and correlate its parametric expression with brain activity (see supplemental Methods, available at www.jneurosci.org as supplemental material). As in many studies on this topic, precision is gained by assuming specific functional forms for the utility and probability weighting functions. The utility function is assumed to be a power function u(x; ρ) = x^{ρ}, and the probability weighting function is the one-parameter Prelec weighting function π(p; α) = 1/exp{[ln(1/p)]^{α}} (Prelec, 1998), derived axiomatically from psychophysical principles [it is implied if ln(1/π(p)) is a power function of ln(1/p)]. Decision weights and utilities are assumed to be combined linearly, U(x, p) = π(p, α)u(x, ρ) (where the zero payoff has zero utility). We chose the one-parameter Prelec function because it fits as well as or better than other one- and two-parameter functions that were estimated from subjects' choices (Table 1), and having one parameter simplifies the cross-subject analysis below.
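A minimal sketch of these functional forms is given below. The utility and weighting functions follow the definitions in the text. The logistic choice rule with sensitivity `lam` is an assumption made for illustration (the exact stochastic choice specification is given only in the supplemental Methods), and all function names are hypothetical.

```python
import math

def utility(x, rho):
    """Power utility u(x; rho) = x**rho."""
    return x ** rho

def prelec_weight(p, alpha):
    """One-parameter Prelec weight pi(p; alpha) = 1/exp{[ln(1/p)]**alpha}."""
    if p <= 0.0:
        return 0.0
    return math.exp(-(math.log(1.0 / p) ** alpha))

def weighted_utility(p, x, alpha, rho):
    """U(x, p) = pi(p, alpha) * u(x, rho); the zero payoff contributes nothing."""
    return prelec_weight(p, alpha) * utility(x, rho)

def choice_probability(gamble_1, gamble_2, alpha, rho, lam):
    """Probability of choosing gamble_1 under an ASSUMED logistic rule.

    lam is a response-sensitivity parameter; larger values make choices
    track the weighted-utility difference more deterministically.
    """
    diff = weighted_utility(*gamble_1, alpha, rho) - weighted_utility(*gamble_2, alpha, rho)
    return 1.0 / (1.0 + math.exp(-lam * diff))

# Example with illustrative parameter values.
print(choice_probability((0.1, 50), (0.5, 10), alpha=0.77, rho=0.57, lam=1.0))
```

Setting `alpha = 1` recovers linear weighting, so the expected utility model is nested as a special case.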
fMRI acquisition
Scans were acquired using the 3 Tesla Siemens Trio scanner at Caltech's Broad Imaging Center. Anatomical images (high-resolution, T1-weighted) were acquired first. Functional (T2*-weighted) images were then acquired using the following parameters: TR = 2000 ms, TE = 40 ms, slice thickness = 4 mm, 32 slices. Horizontal slices were acquired ∼30° clockwise of the anterior–posterior commissure (AC–PC) axis to minimize signal dropout of the orbitofrontal cortex. The total time duration of the experiment varied because each round was self-paced.
fMRI analysis
Imaging data were preprocessed using SPM2, including, in order, slice time correction, motion correction, coregistration, normalization to the MNI template, and smoothing of the functional data with an 8 mm kernel (Friston et al., 1995). Random effects analyses were done in SPM2 (Friston et al., 1995) by specifying a separate general linear model (GLM) for each subject and pooling at the second level. First, all images were high-pass filtered in the temporal domain (filter width 128 s), and autocorrelation of the hemodynamic responses was modeled as an AR(1) process. In the GLM, all visual stimuli and motor responses were entered as separate regressors that were constructed by convolving a hemodynamic response function (hrf) with a comb of Dirac functions at the onset of each visual stimulus or motor response. Parametric modulators were added to the main regressors as interaction terms.
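The construction of an event regressor and its parametric modulator can be sketched as follows. This is not the SPM2 implementation: the double-gamma response shape, its parameter values, the 0.1 s time grid, and the mean-centering of the modulator are all assumptions chosen to resemble common practice, and the function names are hypothetical.

```python
import math

def gamma_pdf(t, shape):
    """Gamma density with unit rate, evaluated at t seconds."""
    if t <= 0.0:
        return 0.0
    return t ** (shape - 1.0) * math.exp(-t) / math.gamma(shape)

def hrf(t, peak_shape=6.0, undershoot_shape=16.0, undershoot_ratio=1.0 / 6.0):
    """Double-gamma response shape (assumed parameters, illustrative only)."""
    return gamma_pdf(t, peak_shape) - undershoot_ratio * gamma_pdf(t, undershoot_shape)

def mean_center(values):
    """Center modulator values so they are orthogonal to the main effect."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def convolved_regressor(onsets, amplitudes, n_scans, tr=2.0, dt=0.1, kernel_length=32.0):
    """Convolve a comb of weighted delta functions with the hrf.

    onsets are in seconds; amplitudes are 1.0 for a main regressor or
    (mean-centered) modulator values for a parametric regressor.
    The result is sampled once per TR.
    """
    n_fine = int(round(n_scans * tr / dt))
    sticks = [0.0] * n_fine
    for onset, amplitude in zip(onsets, amplitudes):
        index = int(round(onset / dt))
        if 0 <= index < n_fine:
            sticks[index] += amplitude
    kernel = [hrf(k * dt) for k in range(int(round(kernel_length / dt)))]
    signal = [0.0] * n_fine
    for i, stick in enumerate(sticks):
        if stick == 0.0:
            continue
        for k, h in enumerate(kernel):
            if i + k >= n_fine:
                break
            signal[i + k] += stick * h
    step = int(round(tr / dt))
    return signal[::step]

# Main effect of two events, plus a modulator carrying hypothetical values.
onsets = [10.0, 30.0]
main = convolved_regressor(onsets, [1.0, 1.0], n_scans=40)
modulated = convolved_regressor(onsets, mean_center([0.2, 0.8]), n_scans=40)
```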
Parametric model.
An event-related analysis focused on brain activity during presentation of the first gamble. Because no information is present regarding the second gamble, we assume brain regions correlated with decision variables reflect reward anticipation with respect to the first gamble, rather than choice. We further make the assumption that neural activity is approximately a linear function of the behaviorally derived utility function (that is, we search for brain activity that closely resembles the functions in Fig. 1A,B). A GLM is used that separates the weighting function into two components: (1) the component that is linear in p and (2) the component that is the nonlinear deviation term Δ(p, α_{i}) = π(p, α_{i}) − p (Fig. 3A). Specifically, we are looking for a prospect-theoretic expected value function that is nonlinear in p; that is, π(p, α)u(x) = p · u(x) + Δ(p, α) · u(x). We assume the function u(x) is a power function x^{ρ}, where the value of ρ is taken from the individual behavioral estimate, and Δ(p, α) = π(p, α) − p, where the mean group α = 0.771 is used.
The BOLD signal during the first gamble presentation is regressed against p · u(x) and Δ(p, α) · u(x). If the expected utility (EU) null hypothesis is an accurate approximation of valuation of risky choices, there should be no reward-related brain regions that respond to the deviation term Δ(p, α) · u(x). If the nonlinear weighting hypothesis is an accurate approximation, there should be reward-related brain regions that respond equally strongly to the linear component p · u(x) and to the nonlinear component Δ(p, α) · u(x).
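The two parametric modulators can be computed per trial as in the sketch below, which assumes the one-parameter Prelec form with the group mean α = 0.771 stated above. The function names and the example value of ρ are illustrative; in the analysis ρ is each subject's own behavioral estimate.

```python
import math

ALPHA_BAR = 0.771  # group mean inflection parameter reported in the text

def prelec(p, alpha):
    """One-parameter Prelec weight exp(-(ln(1/p))**alpha)."""
    return math.exp(-(math.log(1.0 / p) ** alpha))

def parametric_modulators(p, x, rho, alpha=ALPHA_BAR):
    """Return (linear, deviation) modulators for a first gamble (p, x).

    linear    = p * u(x)
    deviation = (pi(p, alpha) - p) * u(x)
    Their sum is the weighted utility pi(p, alpha) * u(x).
    """
    u = x ** rho
    linear = p * u
    deviation = (prelec(p, alpha) - p) * u
    return linear, deviation

# Example: a 1% chance of $100 with an illustrative rho of 0.57.
linear, deviation = parametric_modulators(0.01, 100, rho=0.57)
print(linear, deviation)
```

Under the EU null the coefficient on the deviation modulator should be zero; under nonlinear weighting the two coefficients should be equal, because the two terms enter the weighted utility with the same unit weight.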
Nonparametric model.
To see how closely activity in brain regions correlated with the weighting function corresponds to the behaviorally derived stylized empirical weighting function, a nonparametric method was used. Each level of probability p was given a separate dummy variable I(p). The relation y = α + Σ_{i} β_{i} · I_{i}(p) · u(x) + ε was estimated, where y is the BOLD response upon presentation of the first gamble, α here denotes the regression intercept, I_{i} is an indicator function for the ith level of p, and u(x) is the power function x^{ρ} used above. Each β_{i} for the six levels of probability was then rescaled by dividing it by the estimated slope for the response to the linear probability term in the parametric regression of activity against the linear and deviation terms. This takes into account the dimensionless nature of BOLD responses and provides a robustness check on the concordance between the parametric and nonparametric models.
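The dummy coding and the rescaling step can be sketched as follows. The six probability levels are those of the experimental design; the function names and the numerical values in the usage lines are hypothetical and carry no empirical meaning.

```python
import math

LEVELS = [0.01, 0.1, 0.3, 0.5, 0.8, 0.95]  # first-gamble probabilities

def dummy_modulators(p, x, rho):
    """Return the six modulators I_i(p) * u(x) for one trial.

    Exactly one entry is nonzero: the one whose level matches p.
    """
    u = x ** rho
    return [u if math.isclose(p, level) else 0.0 for level in LEVELS]

def rescale_betas(betas, linear_slope):
    """Divide each level-specific beta by the slope on the linear term
    from the parametric model, putting the betas on a probability-like scale."""
    return [beta / linear_slope for beta in betas]

# Illustrative usage with made-up numbers.
row = dummy_modulators(0.3, 50, rho=0.57)
scaled = rescale_betas([0.4, 0.9, 1.1, 1.2, 1.5, 2.4], linear_slope=2.5)
```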
Between-subject correlation.
Next we test whether cross-subject variation in the inflection of nonlinear weighting inferred from choices is consistent with cross-subject differences in neural activity. Intuitively, more highly nonlinear functions will be approximated by a combination of the linear term p and the nonlinear term Δ(p, α_{i}) = π(p, α_{i}) − p (shown in Fig. 3A, right) that puts more weight on the nonlinear term. Less nonlinear functions will put less weight on the nonlinear term. A linear-weighting subject, for example, will put no weight on the nonlinear deviation Δ(p, α_{i}) = π(p, α_{i}) − p. This analysis exploits the helpful fact that in the one-parameter Prelec form, all functions π(p, α_{i}) pass through a common point, π(1/e) = 1/e. Weighting functions with α > 0.77 (less inflected, more linear) will therefore be well approximated by p plus a dampened form of the deviation curve in Figure 3A (right). Weighting functions with α < 0.77 (more inflected) will be approximated by p plus an amplified form of the deviation curve in Figure 3A.
Denote the true weighting function for subject i by π(p, α_{i}), and the deviations from linear weighting by Δ(p, α_{i}) = π(p, α_{i}) − p. A brain region that represents π(p, α_{i}) will be significantly correlated with both Δ(p, α_{i}) and p. Using the mean inflection parameter ᾱ = 0.77, the theoretical relation is π(p, α) = a + b_{1} · p + b_{2} · Δ(p, ᾱ) + e [that is, every weight α gives a weighting function that, when its values are regressed against p and Δ(p, ᾱ), gives weights b_{1} and b_{2}]. Fig. 4A shows the value of b_{2} that is theoretically estimated for the various values of α inferred from subjects' choices. The value of b_{2} is the inflection sensitivity (compared with the benchmark group mean α = 0.77) for particular values of subject-specific α. Note that when α = ᾱ (a person's α equals the group mean), b_{2} = 1, and when α = 1 (a linear weighting function), b_{2} = 0 because the nonlinear deviation term receives no weight. The graph shows that in theory, lower individual values of α, corresponding to more inflection of the weighting function, should lead to higher values of the b_{2} coefficient. That is, more inflected functions are best approximated by a combination of the linear term p and nonlinear deviation term (calculated using the average ᾱ) with a higher weight on the nonlinear deviation term.
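The theoretical relation between α and b_{2} can be reproduced with an ordinary least-squares fit, as sketched below. The probability grid (0.01 to 0.99 in steps of 0.01) is an assumption, since the grid behind Fig. 4A is not stated here, and the function names are hypothetical. The normal equations are solved directly to keep the example free of external libraries.

```python
import math

def prelec(p, alpha):
    """One-parameter Prelec weight exp(-(ln(1/p))**alpha)."""
    return math.exp(-(math.log(1.0 / p) ** alpha))

def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution

def inflection_sensitivity(alpha, alpha_bar=0.77, grid=None):
    """Regress pi(p, alpha) on [1, p, Delta(p, alpha_bar)]; return (a, b1, b2)."""
    if grid is None:
        grid = [i / 100.0 for i in range(1, 100)]  # assumed grid
    design = [[1.0, p, prelec(p, alpha_bar) - p] for p in grid]
    target = [prelec(p, alpha) for p in grid]
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)] for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(design, target)) for i in range(3)]
    return tuple(solve_linear_system(xtx, xty))

for alpha in (0.6, 0.77, 0.9, 1.0):
    a, b1, b2 = inflection_sensitivity(alpha)
    print(alpha, round(b2, 3))
```

The fit is exact at α = ᾱ and at α = 1, which is why b_{2} equals 1 and 0 in those two cases regardless of the grid chosen.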
Results
Behavioral results
Table 1 contains the pooled (grouped-subject) estimates of the various weighting function parameters as well as the utility function power parameter ρ and the stochastic choice response sensitivity λ. Five subjects always chose the gamble with the higher probability, such that their choices did not permit identification of the relevant parameters. These subjects were therefore excluded from this and all subsequent fMRI analyses that depended on these parameter estimates. Figure 1C plots the estimated functional forms for all five functions and shows that they look similar (except for more pronounced overweighting of low probabilities in the Prelec two-parameter form). The two Prelec functions fit slightly better than the others as judged by lower negative log likelihood LL. We focused on the one-parameter version of the Prelec function, as it fits only slightly worse than the two-parameter version, and permits simpler cross-subject comparison (since each subject's curvature is expressed by one parameter rather than two).
Supplemental Table S4 (available at www.jneurosci.org as supplemental material) presents individual parameter estimates for the stochastic choice model. These parameter estimates are similar to those found in the literature, indicating nonlinear weighting of probability (mean α = 0.77 ± 0.08) and concavity of utility for money (ρ = 0.57 ± 0.04) (see supplemental material, available at www.jneurosci.org). These estimates also correspond closely to pooled estimates shown in Table 1.
fMRI results
We first identify regions that are significantly correlated with the linear term of the parametric model (Table 2). Consistent with previous literature, we found a number of regions including striatum (Knutson et al., 2001; Abler et al., 2006; Preuschoff et al., 2006; Tobler et al., 2007), anterior cingulate cortex (ACC) (Knutson et al., 2001), and cerebellum (Knutson et al., 2001), as well as frontal regions such as motor areas and medial prefrontal cortex (Knutson et al., 2001; Yacubian et al., 2006; Tobler et al., 2007). Regions that are significantly correlated with the nonlinear term include the striatum, cingulate gyrus, motor cortex, and cerebellum (Table 3). Significant activation in the striatum and cingulate gyrus is in particular consistent with findings from two previous papers on probability weighting (Paulus and Frank, 2006; Berns et al., 2008). In contrast, there are no regions where activity was negatively correlated with the nonlinear term at traditional significance levels (p > 0.1) (for details, see Table S6, available at www.jneurosci.org as supplemental material).
Next we search for regions that are significantly and equally activated by both the linear term and the nonlinear term. These regions include the striatum (Fig. 3B; supplemental Figs. S2, S3, available at www.jneurosci.org as supplemental material), motor cortex, and cerebellum. We focus on the striatum for two reasons. First, it is well established to be involved in reward processing and specifically in reward anticipation (Knutson et al., 2000; Schultz, 2000; O'Doherty et al., 2004). In particular it has been suggested that the striatum combines reward magnitude and probability multiplicatively into a general expected reward signal (Tobler et al., 2007). Second, in our data, the striatum is correlated with both the linear term and the nonlinear term (Fig. 3B). We cannot reject the hypothesis of equal response to both terms, even at highly liberal p values (i.e., we find no evidence that the activity levels differ). Furthermore, this result is robust to two variations of the statistical model that we used (see supplemental Figs. S3, S4, available at www.jneurosci.org as supplemental material).
Figure 3C contains results from the nonparametric model, where the neural β values, normalized by the group mean β, for each level of probability used in the experiment are plotted against the true probabilities, together with a plot of the behaviorally inferred weighting function using the group α = 0.77. To adjust the raw probability weight values (which range from 0 to 1) to a scale comparable to the neural β, the behavioral weights were regressed against the neural β for each probability. The behavioral weights were then multiplied by the regression slope, and the regression constant was added, to create the adjusted values plotted in Figure 3C. There is a clear concavity in low-probability activity and convexity in high-probability activity, and relatively equal activity for probabilities from 0.10 to 0.80, which are also the key features of the psychometric curves derived from behavior. The neurometrically derived BOLD signal curve looks like a more inflected expression of the psychometric curve. In particular, the I(p) neural estimates depart from the objective probabilities more so than the behavioral weighting function. Finally, a nonnested model test (Cox test) rejects the linear model in favor of the nonlinear π(p) (p < 0.064) and does not reject the π(p) model in favor of the linear alternative (p < 0.302) (see supplemental Methods, available at www.jneurosci.org as supplemental material). That is, the π(p) model contains additional explanatory power relative to the linear model, whereas the reverse is not true. This, however, cannot rule out other models that we have not considered, nor potential confounds that may influence the BOLD response to anticipated reward. Indeed, inspection of Figure 3C shows that there remains much variation between the neurometric and psychometric curves that is not explained by our model; the highly imperfect fit could surely be improved by other designs and models. Future studies will therefore be needed to explore this important issue.
Cross-subject correlation
Figure 3B presents the regions of striatum with activity during the first gamble presentation that is significantly correlated with the linear component p and the nonlinear component Δ(p, α) and in which the empirical coefficient on the nonlinear component (b_{2}) is negatively correlated with subject α. The value of the nonlinear component weight b_{2} for each subject is plotted against that subject's behaviorally estimated α in Figure 4B. Across subjects, those with more inflected decision weights as revealed by the behavioral stochastic choice function (lower α) do have higher values of b_{2} estimated from neural activity [r = −0.35, bootstrapped 95% confidence interval (−0.60, −0.05)]. The scaling of the coefficients is arbitrary (since it is derived from BOLD signal) but the sign of the coefficients is consistent with the theoretical relationship in Figure 4A. Subjects with positive b_{2} coefficients are generally those with inverse S-shaped weighting functions (α < 1), who overestimate small probabilities and underestimate large ones. The two subjects with negative b_{2} coefficients have α > 1 and underestimate small probabilities and overestimate large ones [i.e., they act as if they have negative rather than positive weight on the nonlinear deviation term shown in Fig. 3A (right)]. Excluding the two subjects with α > 1 from the Figure 4B analysis reduces the correlation coefficient only slightly, to r = −0.34 [bootstrapped 95% confidence interval (−0.92, 0.17)].
Discussion
The hypothesis that organisms weight probabilities objectively (i.e., linearly) when evaluating risks has been a useful benchmark in many areas of social and biological sciences and is a reasonable approximation for many risks (Kagel et al., 1995). However, much of the appeal of the linear-weighting EU model comes from its empirical superiority to the simpler expected value model [where u(x) = x] and the intuitive appeal of its logical axioms.
There is much behavioral evidence, however, that linearity appears to break down for very high and low probabilities in a systematic manner (Allais, 1953; Tversky and Kahneman, 1992). Indeed, Oskar Morgenstern, one of the founders of expected utility theory, speculated that linearity is a plausible assumption within the intermediate range of probabilities because “a normal individual would have some intuition of what 50:50 or 25:75 means,” but that linearity at extreme probabilities was unlikely because “probabilities used must be within certain plausible ranges and not go to 0.01 or even less to 0.001” (Morgenstern, 1979, p. 178). Allowing for such nonlinearity elegantly reconciles the common phenomenon of simultaneously purchasing insurance against rare disasters (a risk-averse choice) and buying lottery tickets (a risk-seeking choice) because both are consistent with overweighting low probabilities.
The linearity hypothesis has also been adopted, often implicitly, in most studies of decision making and learning in neuroscience (Yacubian et al., 2006). For example, in standard models of reinforcement learning, reward prediction is assumed to result from an unbiased representation of rewards accumulated from experience (Montague et al., 2004). Therefore, reward predictions in stochastic environments are expected to be accurate in the long run. Most existing studies have not sampled closely enough to probability endpoints to be sensitive to the full pattern of predicted nonlinearity. For example, in the study by Abler et al. (2006), the probabilities sampled were {0, 0.25, 0.5, 0.75, 1}, for money rewards. Berns et al. (2008) used probabilities {1/6, 2/6, 4/6, 5/6, 1} for electric shocks. Given existing evidence that the inflection point is ∼1/3, one would expect to find a closely linear approximation across the range of probabilities in the former study, rather than the full reverse S-shaped function. Similarly, one would expect a convex function in the study by Berns et al. (2008) with disproportionate brain responses to the lowest probability of 1/6. This is consistent with their results, which found a U-shaped response in the caudate/subgenual anterior cingulate cortex. This region of activation is further anterior compared with our striatal activation, and is negatively correlated with the magnitude of shock. This potentially reflects differences between encoding of rewards in the gain and loss domain.
In addition, in many neuroscientific studies of decision making, probabilities are estimated, either from ratios of bars and pie areas (Huettel et al., 2006; Berns et al., 2008) or from past experience in studies of learning (O'Doherty et al., 2004; Haruno and Kawato, 2006). This methodology differs from those in the behavioral literature on probability weighting, where the probabilities are usually given explicitly in numerical format (and sometimes graphically as well) (Camerer and Ho, 1994; Wu and Gonzalez, 1996; Kilka, 2001). Using devices such as pies or bars to depict probabilities without numerical representations can potentially induce difficulties in interpretation of the source of probability weighting. There is substantial evidence that individuals make systematic errors in proportion estimation, especially for proportions close to zero or one (Varey et al., 1990; Hollands and Dyre, 2000). Therefore it becomes unclear whether the behavioral patterns come from errors in estimating probabilities from graphical displays, or from nonlinearity in the valuation process.
In this study, therefore, we closely followed the representation of decisions under risk used in the behavioral economics literature, while at the same time taking into account the limitations of neuroimaging measures by separating presentation of the gambles in time. This allowed us to dissociate neural contributions of the evaluation of individual gambles from activity due to comparison and choice (Fig. 2). The need for temporal separation of evaluation and choice does not arise in behavioral studies in which the only observable variable is choice. Existing studies do not always distinguish between perception of value and the decision process (Paulus and Frank, 2006). This distinction, however, is of potential importance when using neuroimaging data. The modeling of probability weighting assumes that the value function itself is “distorted” (nonlinear) in probabilities, whereas the decision process is unbiased. The temporal separation therefore allows us to focus on reward perception, rather than simultaneous perception and choice (Preuschoff et al., 2006). Our fMRI results suggest that striatum activity in evaluation of risks is nonlinear in probabilities in isolation of choice, consistent with standard interpretations of the weighting function. Activity in the striatum is also found by Tom et al. (2007) in response to gamble monetary values, and can be used to link the neural aversion to loss (compared with equal-sized gain) to behavioral loss aversion.
More generally, this type of study represents an empirical competition between models of risky choices which are rooted in logical axioms (chiefly the independence axiom stated earlier) and models rooted in psychophysics which can be further grounded in evolutionary adaptation. This study completes the exploratory studies of the key elements of prospect theory, the others being loss aversion (Tom et al., 2007) and framing/reference point (De Martino et al., 2006) (for review, see Fox and Poldrack, 2008). Our intuition is that brain activity during valuation of risks is more likely to correspond to the cognitive components of prospect theory than to EU, and it will also be easier to construct an adaptationist account of how evolution would have shaped brains to follow prospect theory rather than EU (Robson, 2002), since prospect theory follows from psychophysics and EU from normative logic. Establishing a neural and evolutionary basis of prospect theory could provide an illustrative example of how the foundation for principles guiding social science might be usefully shifted from relying largely on logic, to respecting biological implementation (which might, of course, include convergence to logical principles as a result of learning or higher-order cognition).
There are a number of more general implications for neuroscientific studies of reward and decision making. The maturation of decision neuroscience and neuroeconomics will likely lead to increasing emphasis on problems involving extreme tails of the distribution (e.g., public fear about rare catastrophic risks, pathological gambling, broad participation in long-tailed lotteries, and preferences for financial assets with positive skewness) (Barberis and Huang, 2008). The numerical ratio of overweighting of low probabilities is dramatic for the functions we estimate: probabilities of 10^{−2}, 10^{−6}, and 10^{−9} are overweighted by factors of 4, 520, and 33,000, respectively.
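These overweighting factors can be approximately recovered from the one-parameter Prelec function with the group mean α = 0.77. The sketch below is illustrative and the function names are hypothetical; it computes the ratio π(p)/p at the three probabilities mentioned.

```python
import math

def prelec(p, alpha=0.77):
    """One-parameter Prelec weight exp(-(ln(1/p))**alpha)."""
    return math.exp(-(math.log(1.0 / p) ** alpha))

def overweighting_factor(p, alpha=0.77):
    """Ratio of the decision weight to the objective probability."""
    return prelec(p, alpha) / p

for p in (1e-2, 1e-6, 1e-9):
    # Expect values on the order of 4, 520, and 33,000 respectively.
    print(f"p = {p:g}: factor = {overweighting_factor(p):,.1f}")
```

Because α < 1, the factor grows without bound as p approaches zero, which is why the tails dominate in applications such as lotteries and catastrophic risks.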
In addition, some studies suggest that probabilities learned through experience do not exhibit the same type of patterns of behavior as those represented abstractly (Hertwig et al., 2004; Fox and Hadar, 2006). Furthermore, because the learning process is still under investigation, it is unknown how the brain updates probabilities conditional on past events. More studies of the neural basis of response to probabilities represented abstractly, and those learned from experience, are therefore needed to provide a unified framework to understand and reconcile these results.
Footnotes

This work was supported by a Gordon and Betty Moore Foundation grant and a Human Frontier Science Program grant to C.F.C. We thank P. Bossaerts, K. Friston, A. Healy, and R. Poldrack for comments.
 Correspondence should be addressed to Colin F. Camerer, Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA 91125. camerer{at}hss.caltech.edu